Re: [Mesa-dev] Make Jordan an Owner of the mesa project?

2018-12-05 Thread Jordan Justen
On 2018-12-04 19:39:05, Jason Ekstrand wrote:
> Given that everyone else has firmly ACKed, I'm going to click the button.
> Congratulations, Jordan, you're now a mesa Owner!

Thanks all!
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 108925] vkCmdCopyQueryPoolResults(VK_QUERY_RESULT_WAIT_BIT) for timestamps with large query count hangs

2018-12-05 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108925

Alex Smith  changed:

   What|Removed |Added

 Attachment #142700|0   |1
is obsolete||

--- Comment #5 from Alex Smith  ---
Created attachment 142732
  --> https://bugs.freedesktop.org/attachment.cgi?id=142732=edit
New test case

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH 23/59] intel/compiler: Extended Math is limited to SIMD8 on half-float

2018-12-05 Thread Pohjolainen, Topi
On Tue, Dec 04, 2018 at 08:16:47AM +0100, Iago Toral Quiroga wrote:
> From the Skylake PRM, Extended Math Function:
> 
>   "The execution size must be no more than 8 when half-floats
>are used in source or destination operand."
> 
> Earlier generations do not support Extended Math with half-float.

Reviewed-by: Topi Pohjolainen 

> ---
>  src/intel/compiler/brw_fs.cpp | 30 +++---
>  1 file changed, 23 insertions(+), 7 deletions(-)
> 
> diff --git a/src/intel/compiler/brw_fs.cpp b/src/intel/compiler/brw_fs.cpp
> index 43b920ae33d..509c6febf38 100644
> --- a/src/intel/compiler/brw_fs.cpp
> +++ b/src/intel/compiler/brw_fs.cpp
> @@ -5386,18 +5386,34 @@ get_lowered_simd_width(const struct gen_device_info 
> *devinfo,
> case SHADER_OPCODE_EXP2:
> case SHADER_OPCODE_LOG2:
> case SHADER_OPCODE_SIN:
> -   case SHADER_OPCODE_COS:
> +   case SHADER_OPCODE_COS: {
>/* Unary extended math instructions are limited to SIMD8 on Gen4 and
> * Gen6.
> */
> -  return (devinfo->gen >= 7 ? MIN2(16, inst->exec_size) :
> -  devinfo->gen == 5 || devinfo->is_g4x ? MIN2(16, 
> inst->exec_size) :
> -  MIN2(8, inst->exec_size));
> +  unsigned max_width =
> + (devinfo->gen >= 7 ? MIN2(16, inst->exec_size) :
> +  devinfo->gen == 5 || devinfo->is_g4x ? MIN2(16, inst->exec_size) :
> +  MIN2(8, inst->exec_size));
>  
> -   case SHADER_OPCODE_POW:
> +  /* Extended Math Function is limited to SIMD8 with half-float */
> +  if (inst->dst.type == BRW_REGISTER_TYPE_HF)
> + max_width = MIN2(max_width, 8);
> +
> +  return max_width;
> +   }
> +
> +   case SHADER_OPCODE_POW: {
>/* SIMD16 is only allowed on Gen7+. */
> -  return (devinfo->gen >= 7 ? MIN2(16, inst->exec_size) :
> -  MIN2(8, inst->exec_size));
> +  unsigned max_width =
> +  (devinfo->gen >= 7 ? MIN2(16, inst->exec_size) :
> +   MIN2(8, inst->exec_size));
> +
> +  /* Extended Math Function is limited to SIMD8 with half-float */
> +  if (inst->dst.type == BRW_REGISTER_TYPE_HF)
> + max_width = MIN2(max_width, 8);
> +
> +  return max_width;
> +   }
>  
> case SHADER_OPCODE_INT_QUOTIENT:
> case SHADER_OPCODE_INT_REMAINDER:
> -- 
> 2.17.1
> 
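For readers skimming the diff, the width selection for the unary extended math case can be sketched as a standalone C function. This is only an illustration of the logic in the patch, not the driver code: struct device_sketch, min_u() and unary_math_simd_width() are made-up names, and like the patch it tests only the destination type, although the PRM text quoted above mentions source or destination operands.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the fields of gen_device_info used here. */
struct device_sketch {
    int gen;      /* hardware generation, e.g. 9 for Skylake */
    bool is_g4x;
};

static unsigned min_u(unsigned a, unsigned b)
{
    return a < b ? a : b;
}

/* Widest SIMD width allowed for a unary extended math instruction,
 * mirroring the SHADER_OPCODE_COS case in the patch. */
static unsigned
unary_math_simd_width(const struct device_sketch *dev, unsigned exec_size,
                      bool dst_is_half_float)
{
    unsigned max_width;

    /* SIMD16 on Gen7+, Gen5 and G4x; SIMD8 on Gen4 and Gen6. */
    if (dev->gen >= 7 || dev->gen == 5 || dev->is_g4x)
        max_width = min_u(16, exec_size);
    else
        max_width = min_u(8, exec_size);

    /* Extended math is limited to SIMD8 with half-float. */
    if (dst_is_half_float)
        max_width = min_u(max_width, 8);

    return max_width;
}
```

With a Gen9 device and exec_size 16, the function returns 16 for a 32-bit float destination and 8 for a half-float one.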


Re: [Mesa-dev] [PATCH 7/8] gm107/ir: add lowering of atomic f32 add on shared memory

2018-12-05 Thread Karol Herbst
On Wed, Dec 5, 2018 at 6:30 AM Ilia Mirkin  wrote:
>
> Signed-off-by: Ilia Mirkin 
> ---
>  .../nouveau/codegen/nv50_ir_lowering_nvc0.cpp | 49 +++
>  .../nouveau/codegen/nv50_ir_lowering_nvc0.h   |  1 +
>  2 files changed, 50 insertions(+)
>
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp 
> b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
> index 295497be2f9..44c62820342 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
> @@ -1347,6 +1347,53 @@ NVC0LoweringPass::handleBUFQ(Instruction *bufq)
> return true;
>  }
>
> +void
> +NVC0LoweringPass::handleSharedATOMGM107(Instruction *atom)
> +{
> +   if (atom->dType != TYPE_F32)
> +  return;
> +
> +   assert(atom->subOp == NV50_IR_SUBOP_ATOM_ADD);
> +   assert(atom->src(0).getFile() == FILE_MEMORY_SHARED);
> +
> +   BasicBlock *currBB = atom->bb;
> +   BasicBlock *addAndCasBB = atom->bb->splitBefore(atom, false);
> +   BasicBlock *joinBB = atom->bb->splitAfter(atom);
> +
> +   bld.setPosition(currBB, true);
> +
> +   Value *load = atom->getDef(0), *newval = bld.getSSA();
> +   // TODO: Use "U" subop?
> +   bld.mkLoad(TYPE_U32, load, atom->getSrc(0)->asSym(), atom->getIndirect(0, 
> 0));
> +   assert(!currBB->joinAt);
> +   currBB->joinAt = bld.mkFlow(OP_JOINAT, joinBB, CC_ALWAYS, NULL);
> +
> +   bld.mkFlow(OP_BRA, addAndCasBB, CC_ALWAYS, NULL);
> +   currBB->cfg.attach(>cfg, Graph::Edge::TREE);
> +
> +   bld.setPosition(addAndCasBB, true);
> +   bld.remove(atom);
> +
> +   bld.mkOp2(OP_ADD, TYPE_F32, newval, load, atom->getSrc(1));
> +
> +   // Try to do a compare-and-swap. If the old value doesn't match the loaded
> +   // value, repeat.
> +   Value *old = bld.getSSA();
> +   Instruction *cas =
> +  bld.mkOp3(OP_ATOM, TYPE_U32, old, atom->getSrc(0), load, newval);
> +   cas->setIndirect(0, 0, atom->getIndirect(0, 0));
> +   cas->subOp = NV50_IR_SUBOP_ATOM_CAS;
> +   Value *pred = bld.getSSA(1, FILE_PREDICATE);
> +   bld.mkCmp(OP_SET, CC_EQ, TYPE_U32, pred, TYPE_U32, old, load);
> +   bld.mkMov(load, old);
> +   bld.mkFlow(OP_BRA, addAndCasBB, CC_NOT_P, pred);
> +   bld.mkFlow(OP_BRA, joinBB, CC_ALWAYS, NULL);
> +   addAndCasBB->cfg.attach(>cfg, Graph::Edge::BACK);
> +
> +   bld.setPosition(joinBB, false);
> +   bld.mkFlow(OP_JOIN, NULL, CC_ALWAYS, NULL)->fixed = 1;
> +}
> +
>  void
>  NVC0LoweringPass::handleSharedATOMNVE4(Instruction *atom)
>  {
> @@ -1559,6 +1606,8 @@ NVC0LoweringPass::handleATOM(Instruction *atom)
>   handleSharedATOM(atom);
>else if (targ->getChipset() < NVISA_GM107_CHIPSET)
>   handleSharedATOMNVE4(atom);
> +  else
> + handleSharedATOMGM107(atom);

but doesn't this make all shared ATOM operations get lowered now?

>return true;
> default:
>assert(atom->src(0).getFile() == FILE_MEMORY_BUFFER);
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.h 
> b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.h
> index e0f50ab0904..2d77e918358 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.h
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.h
> @@ -143,6 +143,7 @@ protected:
> void handleSurfaceOpNVC0(TexInstruction *);
> void handleSharedATOM(Instruction *);
> void handleSharedATOMNVE4(Instruction *);
> +   void handleSharedATOMGM107(Instruction *);
> void handleLDST(Instruction *);
> bool handleBUFQ(Instruction *);
> void handlePIXLD(Instruction *);
> --
> 2.18.1
>
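The lowering in the patch is a load followed by a compare-and-swap retry loop: add to the loaded value, try to swap the result in, and repeat if another thread changed the word in between. A minimal host-side sketch of the same idea in C11 follows. It assumes <stdatomic.h> is available, and atomic_add_f32() is a name invented for this example; it is not nouveau code.

```c
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

/* Atomically add 'operand' to the float stored as raw bits in *addr.
 * Returns the value that was in memory before the add. */
static float atomic_add_f32(_Atomic uint32_t *addr, float operand)
{
    /* Initial load, like the mkLoad() before the loop in the patch. */
    uint32_t old_bits = atomic_load(addr);

    for (;;) {
        float old_val, new_val;
        uint32_t new_bits;

        /* Reinterpret the bits as a float and do the add. */
        memcpy(&old_val, &old_bits, sizeof old_val);
        new_val = old_val + operand;
        memcpy(&new_bits, &new_val, sizeof new_bits);

        /* Compare-and-swap on the 32-bit pattern. On failure old_bits is
         * refreshed with the current memory contents and we retry. */
        if (atomic_compare_exchange_weak(addr, &old_bits, new_bits))
            return old_val;
    }
}
```

The comparison is done on the integer bit pattern rather than on the float value, which matches the TYPE_U32 compare in the patch.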


Re: [Mesa-dev] [PATCH] radv: Flush before vkCmdWriteTimestamp() if needed

2018-12-05 Thread Samuel Pitoiset



On 12/5/18 11:15 AM, Alex Smith wrote:
Thanks. Though this fixes the 100% repro hang, I think your first patch 
is still needed as well to handle getting 0x in the low 32 bits.


Yeah, it's still needed. Though I think it should be enough to wait on 
the high 32 bits, as suggested by Bas.




On Wed, 5 Dec 2018 at 10:04, Samuel Pitoiset wrote:


Yes, this is correct, indeed.

The issue wasn't present because we used EOP events before removing the
availability bit.

Btw, just noticed that we should reset pending_reset_query directly in
si_emit_cache_flush() to reduce the number of stalls. I will send a
patch.

Also note that fill CP DMA operations are currently always sync'ed,
while CP DMA copies are not. I plan to change this at some point.

Reviewed-by: Samuel Pitoiset

On 12/5/18 10:52 AM, Alex Smith wrote:
 > As done for vkCmdBeginQuery() already. Prevents timestamps from being
 > overwritten by previous vkCmdResetQueryPool() calls if the shader
path
 > was used to do the reset.
 >
 > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108925
 > Fixes: a41e2e9cf5 ("radv: allow to use a compute shader for
resetting the query pool")
 > Signed-off-by: Alex Smith
 > ---
 >   src/amd/vulkan/radv_query.c | 30 +++---
 >   1 file changed, 19 insertions(+), 11 deletions(-)
 >
 > diff --git a/src/amd/vulkan/radv_query.c
b/src/amd/vulkan/radv_query.c
 > index 550abe307a..e226bcef6a 100644
 > --- a/src/amd/vulkan/radv_query.c
 > +++ b/src/amd/vulkan/radv_query.c
 > @@ -1436,6 +1436,22 @@ static unsigned
event_type_for_stream(unsigned stream)
 >       }
 >   }
 >
 > +static void emit_query_flush(struct radv_cmd_buffer *cmd_buffer,
 > +                          struct radv_query_pool *pool)
 > +{
 > +     if (cmd_buffer->pending_reset_query) {
 > +             if (pool->size >= RADV_BUFFER_OPS_CS_THRESHOLD) {
 > +                     /* Only need to flush caches if the query
pool size is
 > +                      * large enough to be resetted using the
compute shader
 > +                      * path. Small pools don't need any cache
flushes
 > +                      * because we use a CP dma clear.
 > +                      */
 > +                     si_emit_cache_flush(cmd_buffer);
 > +                     cmd_buffer->pending_reset_query = false;
 > +             }
 > +     }
 > +}
 > +
 >   static void emit_begin_query(struct radv_cmd_buffer *cmd_buffer,
 >                            uint64_t va,
 >                            VkQueryType query_type,
 > @@ -1582,17 +1598,7 @@ void radv_CmdBeginQueryIndexedEXT(
 >
 >       radv_cs_add_buffer(cmd_buffer->device->ws, cs, pool->bo);
 >
 > -     if (cmd_buffer->pending_reset_query) {
 > -             if (pool->size >= RADV_BUFFER_OPS_CS_THRESHOLD) {
 > -                     /* Only need to flush caches if the query
pool size is
 > -                      * large enough to be resetted using the
compute shader
 > -                      * path. Small pools don't need any cache
flushes
 > -                      * because we use a CP dma clear.
 > -                      */
 > -                     si_emit_cache_flush(cmd_buffer);
 > -                     cmd_buffer->pending_reset_query = false;
 > -             }
 > -     }
 > +     emit_query_flush(cmd_buffer, pool);
 >
 >       va += pool->stride * query;
 >
 > @@ -1669,6 +1675,8 @@ void radv_CmdWriteTimestamp(
 >
 >       radv_cs_add_buffer(cmd_buffer->device->ws, cs, pool->bo);
 >
 > +     emit_query_flush(cmd_buffer, pool);
 > +
 >       int num_queries = 1;
 >       if (cmd_buffer->state.subpass &&
cmd_buffer->state.subpass->view_mask)
 >               num_queries =
util_bitcount(cmd_buffer->state.subpass->view_mask);
 >




Re: [Mesa-dev] [PATCH 10/59] intel/compiler: implement conversions from 16-bit float to 64-bit

2018-12-05 Thread Iago Toral
On Tue, 2018-12-04 at 18:10 +0200, Pohjolainen, Topi wrote:
> On Tue, Dec 04, 2018 at 02:33:25PM +0200, Pohjolainen, Topi wrote:
> > On Tue, Dec 04, 2018 at 08:16:34AM +0100, Iago Toral Quiroga wrote:
> > > Signed-off-by: Samuel Iglesias Gonsálvez 
> > > ---
> > >  src/intel/compiler/brw_fs_nir.cpp | 41
> > > +++
> > >  1 file changed, 41 insertions(+)
> > > 
> > > diff --git a/src/intel/compiler/brw_fs_nir.cpp
> > > b/src/intel/compiler/brw_fs_nir.cpp
> > > index 6eb68794f58..7294f49ddc0 100644
> > > --- a/src/intel/compiler/brw_fs_nir.cpp
> > > +++ b/src/intel/compiler/brw_fs_nir.cpp
> > > @@ -796,6 +796,47 @@ fs_visitor::nir_emit_alu(const fs_builder
> > > , nir_alu_instr *instr)
> > > case nir_op_f2f64:
> > > case nir_op_f2i64:
> > > case nir_op_f2u64:
> > > +  /* BDW PRM, vol02, Command Reference Instructions, mov -
> > > MOVE:
> > > +   *
> > > +   *   "There is no direct conversion from HF to DF or DF to
> > > HF.
> > > +   *Use two instructions and F (Float) as an
> > > intermediate type.
> > > +   *
> > > +   *There is no direct conversion from HF to Q/UQ or
> > > Q/UQ to HF.
> > > +   *Use two instructions and F (Float) or a word integer
> > > type
> > > +   *or a DWord integer type as an intermediate type."
> > > +   */
> > > +  if (nir_src_bit_size(instr->src[0].src) == 16) {
> > > + fs_reg tmp = bld.vgrf(BRW_REGISTER_TYPE_F, 1);
> > > + inst = bld.MOV(tmp, op[0]);
> > > + inst->saturate = instr->dest.saturate;
> > > + op[0] = tmp;
> > > +  }
> > > +
> > > +  /* CHV PRM, vol07, 3D Media GPGPU Engine, Register Region
> > > Restrictions:
> > > +   *
> > > +   *"When source or destination is 64b (...), regioning
> > > in Align1
> > > +   * must follow these rules:
> > > +   *
> > > +   * 1. Source and destination horizontal stride must be
> > > aligned to
> > > +   *the same qword.
> > > +   * (...)"
> > > +   *
> > > +   * This means that conversions from bit-sizes smaller than
> > > 64-bit to
> > > +   * 64-bit need to have the source data elements aligned to
> > > 64-bit.
> > > +   * This restriction does not apply to BDW and later.
> > > +   */
> > > +  if (type_sz(result.type) == 8 && type_sz(op[0].type) < 8
> > > &&
> > > +  (devinfo->is_cherryview ||
> > > gen_device_info_is_9lp(devinfo))) {
> > > + fs_reg tmp = bld.vgrf(result.type, 1);
> > > + tmp = subscript(tmp, op[0].type, 0);
> > > + inst = bld.MOV(tmp, op[0]);
> > > + op[0] = tmp;
> > > +  }
> > 
> > For this second part we seem to have similar logic further down
> > after
> > "nir_op_u2u64" (not visible here) in master? Would it be possible
> > to fallthru
> > from here and re-use that?
> 
> And after reading it more carefully myself, it looks like this is actually
> cleaner.
> 
> I noticed that in the nir_op_u2u64 case the destination and source
> sizes are
> checked using:
> 
>if (nir_dest_bit_size(instr->dest.dest) == 64 &&
>nir_src_bit_size(instr->src[0].src) < 64 &&
>...
> 
> Should we use the same here for consistency?

Right above this we can rewrite op[0] with a temporary that would be
different from instr->src[0].src so we can't check the nir sources any
more. Also, by the end of the series, when we incorporate 8-bit
conversions there will be more cases like this that we need to account
for and we end up rewriting the u2u64 case to this style as well.

Iago

> > 
> > > +
> > > +  inst = bld.MOV(result, op[0]);
> > > +  inst->saturate = instr->dest.saturate;
> > > +  break;
> > > +
> > > case nir_op_i2f64:
> > > case nir_op_i2i64:
> > > case nir_op_u2f64:
> > > -- 
> > > 2.17.1
> > > 
> 
> 



[Mesa-dev] [Bug 106958] Mass Effect Andromeda renders correctly on RX480 POLARIS but BAD ON RX VEGA 64 on wine 3.10 stagingf with DXVK

2018-12-05 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=106958

Samuel Pitoiset  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|NEW |RESOLVED

--- Comment #27 from Samuel Pitoiset  ---
Yes, I can confirm this too.

The issue has probably been fixed somewhere in LLVM 8 (master).

Thanks for being so responsive.

Closing.



Re: [Mesa-dev] [PATCH 11/59] intel/compiler: Implement float64/int64 to float16 conversion

2018-12-05 Thread Pohjolainen, Topi
On Wed, Dec 05, 2018 at 09:49:29AM +0100, Iago Toral wrote:
> On Tue, 2018-12-04 at 14:57 +0200, Pohjolainen, Topi wrote:
> > On Tue, Dec 04, 2018 at 08:16:35AM +0100, Iago Toral Quiroga wrote:
> > > From: Samuel Iglesias Gonsálvez 
> > > 
> > > It is not supported directly in the HW, we need to convert to a 32-
> > > bit
> > > type first as intermediate step.
> > > 
> > > v2 (Iago): handle conversions from 64-bit integers as well
> > > 
> > > Signed-off-by: Samuel Iglesias Gonsálvez 
> > > ---
> > >  src/intel/compiler/brw_fs_nir.cpp | 42
> > > ---
> > >  1 file changed, 39 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/src/intel/compiler/brw_fs_nir.cpp
> > > b/src/intel/compiler/brw_fs_nir.cpp
> > > index 7294f49ddc0..9f3d3bf9762 100644
> > > --- a/src/intel/compiler/brw_fs_nir.cpp
> > > +++ b/src/intel/compiler/brw_fs_nir.cpp
> > > @@ -784,6 +784,19 @@ fs_visitor::nir_emit_alu(const fs_builder
> > > , nir_alu_instr *instr)
> > > */
> > >  
> > > case nir_op_f2f16:
> > > +  /* BDW PRM, vol02, Command Reference Instructions, mov -
> > > MOVE:
> > > +   *
> > > +   *   "There is no direct conversion from HF to DF or DF to
> > > HF.
> > > +   *Use two instructions and F (Float) as an intermediate
> > > type.
> > > +   */
> > > +  if (nir_src_bit_size(instr->src[0].src) == 64) {
> > > + fs_reg tmp = bld.vgrf(BRW_REGISTER_TYPE_F, 1);
> > > + inst = bld.MOV(tmp, op[0]);
> > > + inst->saturate = instr->dest.saturate;
> > > + inst = bld.MOV(result, tmp);
> > > + inst->saturate = instr->dest.saturate;
> > > + break;
> > > +  }
> > >inst = bld.MOV(result, op[0]);
> > >inst->saturate = instr->dest.saturate;
> > >break;
> > > @@ -864,7 +877,32 @@ fs_visitor::nir_emit_alu(const fs_builder
> > > , nir_alu_instr *instr)
> > >   inst->saturate = instr->dest.saturate;
> > >   break;
> > >}
> > > -  /* fallthrough */
> > 
> > This is more or less nit-picking, but I thought I'd ask anyway. The fallthru
> > comment now also gets dropped for cases other than "i2f16" and "u2f16". And if
> > we added the logic for the nir_op_i2f16/nir_op_u2f16 cases just after the f2f16
> > case, that would yield a diff without the following three copy-paste lines as
> > well. Or am I missing something?
> 
> Yes, I think you're right and if you look at this patch standalone I
> think it would make sense to do that. The thing is that later on in the
> series we have to change this further to incorporate more restrictions
> for conversions to/from integer and half-float for atom platforms, so
> having the f2f16 case separated from the {i,u}2f16 cases will make more
> sense. That would be patch 46 in the series, which comes later because
> that is when we addressed integer conversions from 8-bit and noticed
> this whole thing on atom.
> 
> I can still make the change you suggest in this patch and then do the
> split later on if you think that helps though. I could also try to move
> the fix for atom earlier in the series; that will lead to conflicts and
> I'd need to slightly rewrite other patches in the series to accommodate
> that, but it is certainly doable if you think that makes the commit
> history better.

I'm not sure I understood your answer correctly, but I didn't suggest
merging the f2f16 case with the {i,u}2f16 cases. I thought that having:

  case nir_op_f2f16:
  ...
  break;

  case nir_op_i2f16:
  case nir_op_u2f16:
  ...
  break;

  case nir_op_b2i:
  ...


would have yielded smaller diff than:

  case nir_op_f2f16:
  ...
  break;

  case nir_op_b2i:
  ...

  case nir_op_u2u64:
  ...
  /* fallthrough */

  case nir_op_i2f16:
  case nir_op_u2f16:
  ...
  break;

  case nir_op_f2f32:
  ...
  break;

> 
> Iago
> 
> > > +  inst = bld.MOV(result, op[0]);
> > > +  inst->saturate = instr->dest.saturate;
> > > +  break;
> > > +
> > > +   case nir_op_i2f16:
> > > +   case nir_op_u2f16:
> > > +  /* BDW PRM, vol02, Command Reference Instructions, mov -
> > > MOVE:
> > > +   *
> > > +   *"There is no direct conversion from HF to Q/UQ or Q/UQ
> > > to HF.
> > > +   * Use two instructions and F (Float) or a word integer
> > > type or a
> > > +   * DWord integer type as an intermediate type."
> > > +   */
> > > +  if (nir_src_bit_size(instr->src[0].src) == 64) {
> > > + brw_reg_type reg_type = instr->op == nir_op_i2f16 ?
> > > +BRW_REGISTER_TYPE_D : BRW_REGISTER_TYPE_UD;
> > > + fs_reg tmp = bld.vgrf(reg_type, 1);
> > > + inst = bld.MOV(tmp, op[0]);
> > > + inst->saturate = instr->dest.saturate;
> > > + inst = bld.MOV(result, tmp);
> > > + inst->saturate = 

Re: [Mesa-dev] [PATCH] radv: Flush before vkCmdWriteTimestamp() if needed

2018-12-05 Thread Samuel Pitoiset

Yes, this is correct, indeed.

The issue wasn't present because we used EOP events before removing the 
availability bit.


Btw, just noticed that we should reset pending_reset_query directly in 
si_emit_cache_flush() to reduce the number of stalls. I will send a patch.


Also note that fill CP DMA operations are currently always sync'ed, 
while CP DMA copies are not. I plan to change this at some point.


Reviewed-by: Samuel Pitoiset 

On 12/5/18 10:52 AM, Alex Smith wrote:

As done for vkCmdBeginQuery() already. Prevents timestamps from being
overwritten by previous vkCmdResetQueryPool() calls if the shader path
was used to do the reset.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108925
Fixes: a41e2e9cf5 ("radv: allow to use a compute shader for resetting the query 
pool")
Signed-off-by: Alex Smith 
---
  src/amd/vulkan/radv_query.c | 30 +++---
  1 file changed, 19 insertions(+), 11 deletions(-)

diff --git a/src/amd/vulkan/radv_query.c b/src/amd/vulkan/radv_query.c
index 550abe307a..e226bcef6a 100644
--- a/src/amd/vulkan/radv_query.c
+++ b/src/amd/vulkan/radv_query.c
@@ -1436,6 +1436,22 @@ static unsigned event_type_for_stream(unsigned stream)
}
  }
  
+static void emit_query_flush(struct radv_cmd_buffer *cmd_buffer,
+struct radv_query_pool *pool)
+{
+   if (cmd_buffer->pending_reset_query) {
+   if (pool->size >= RADV_BUFFER_OPS_CS_THRESHOLD) {
+   /* Only need to flush caches if the query pool size is
+* large enough to be resetted using the compute shader
+* path. Small pools don't need any cache flushes
+* because we use a CP dma clear.
+*/
+   si_emit_cache_flush(cmd_buffer);
+   cmd_buffer->pending_reset_query = false;
+   }
+   }
+}
+
  static void emit_begin_query(struct radv_cmd_buffer *cmd_buffer,
 uint64_t va,
 VkQueryType query_type,
@@ -1582,17 +1598,7 @@ void radv_CmdBeginQueryIndexedEXT(
  
  	radv_cs_add_buffer(cmd_buffer->device->ws, cs, pool->bo);
  
-	if (cmd_buffer->pending_reset_query) {
-   if (pool->size >= RADV_BUFFER_OPS_CS_THRESHOLD) {
-   /* Only need to flush caches if the query pool size is
-* large enough to be resetted using the compute shader
-* path. Small pools don't need any cache flushes
-* because we use a CP dma clear.
-*/
-   si_emit_cache_flush(cmd_buffer);
-   cmd_buffer->pending_reset_query = false;
-   }
-   }
+   emit_query_flush(cmd_buffer, pool);
  
  	va += pool->stride * query;
  
@@ -1669,6 +1675,8 @@ void radv_CmdWriteTimestamp(
  
  	radv_cs_add_buffer(cmd_buffer->device->ws, cs, pool->bo);
  
+	emit_query_flush(cmd_buffer, pool);
+
int num_queries = 1;
if (cmd_buffer->state.subpass && cmd_buffer->state.subpass->view_mask)
num_queries = 
util_bitcount(cmd_buffer->state.subpass->view_mask);




[Mesa-dev] [PATCH] radv: reset pending_reset_query when flushing caches

2018-12-05 Thread Samuel Pitoiset
If the driver used a compute shader for resetting a query pool,
it should be completed when caches are flushed.

This might reduce the number of stalls if operations are done
between vkCmdResetQueryPool() and vkCmdBeginQuery()
(or vkCmdWriteTimestamp()).

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_query.c| 1 -
 src/amd/vulkan/si_cmd_buffer.c | 5 +
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/src/amd/vulkan/radv_query.c b/src/amd/vulkan/radv_query.c
index e226bcef6a9..276cc1c42d7 100644
--- a/src/amd/vulkan/radv_query.c
+++ b/src/amd/vulkan/radv_query.c
@@ -1447,7 +1447,6 @@ static void emit_query_flush(struct radv_cmd_buffer 
*cmd_buffer,
 * because we use a CP dma clear.
 */
si_emit_cache_flush(cmd_buffer);
-   cmd_buffer->pending_reset_query = false;
}
}
 }
diff --git a/src/amd/vulkan/si_cmd_buffer.c b/src/amd/vulkan/si_cmd_buffer.c
index a9f25725415..2f57584bf82 100644
--- a/src/amd/vulkan/si_cmd_buffer.c
+++ b/src/amd/vulkan/si_cmd_buffer.c
@@ -992,6 +992,11 @@ si_emit_cache_flush(struct radv_cmd_buffer *cmd_buffer)
radv_cmd_buffer_trace_emit(cmd_buffer);
 
cmd_buffer->state.flush_bits = 0;
+
+   /* If the driver used a compute shader for resetting a query pool, it
+* should be finished at this point.
+*/
+   cmd_buffer->pending_reset_query = false;
 }
 
 /* sets the CP predication state using a boolean stored at va */
-- 
2.19.2
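To illustrate why clearing the flag inside the flush function can save a stall, here is a small self-contained model in C. It is only a sketch under assumptions: the struct, the function names and the CS_THRESHOLD_SKETCH value are placeholders invented for this example and do not come from radv.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder threshold, not the driver's real RADV_BUFFER_OPS_CS_THRESHOLD. */
#define CS_THRESHOLD_SKETCH 4096u

/* Hypothetical model of the command buffer state involved. */
struct cmd_buffer_sketch {
    bool pending_reset_query; /* a compute-shader pool reset is in flight */
    unsigned flush_count;     /* number of cache flushes emitted so far */
};

/* Models si_emit_cache_flush() after the patch: any flush completes the
 * pending reset, so the flag is cleared here. */
static void cache_flush_sketch(struct cmd_buffer_sketch *cmd)
{
    cmd->flush_count++;
    cmd->pending_reset_query = false;
}

/* Models emit_query_flush(): flush only if a shader-path reset is pending
 * and the pool is large enough to have used that path. */
static void query_flush_sketch(struct cmd_buffer_sketch *cmd, uint64_t pool_size)
{
    if (cmd->pending_reset_query && pool_size >= CS_THRESHOLD_SKETCH)
        cache_flush_sketch(cmd);
}
```

If some unrelated operation calls cache_flush_sketch() between the reset and the query, the later query_flush_sketch() call finds the flag already cleared and emits no second flush.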



Re: [Mesa-dev] [PATCH] anv/android: Do not reject storage images.

2018-12-05 Thread Tapani Pälli



On 12/5/18 12:34 PM, Bas Nieuwenhuizen wrote:

We do the ImageFormatProperties check already, and rejecting a usage
flag when both ImageFormatProperties and the WSI (which is Android)
support it is not allowed.

Intel does support storage for some of the supported WSI formats, such
as R8G8B8A8_UNORM, and looking at the ISL_SURF_USAGE_DISABLE_AUX_BIT,
the imported images do not have any form of compression that would
prevent this fix.


Bas, FYI, we use this one internally:

https://patchwork.freedesktop.org/patch/247681/



Fixes: 053d4c328fa "anv: Implement VK_ANDROID_native_buffer (v9)"
CC: Jason Ekstrand 
---
  src/intel/vulkan/anv_android.c | 7 ---
  1 file changed, 7 deletions(-)

diff --git a/src/intel/vulkan/anv_android.c b/src/intel/vulkan/anv_android.c
index a3bab8087b4..92c3787b49b 100644
--- a/src/intel/vulkan/anv_android.c
+++ b/src/intel/vulkan/anv_android.c
@@ -268,13 +268,6 @@ VkResult anv_GetSwapchainGrallocUsageANDROID(
 "inside %s", __func__);
 }
  
-   /* Reject STORAGE here to avoid complexity elsewhere. */
-   if (imageUsage & VK_IMAGE_USAGE_STORAGE_BIT) {
-  return vk_errorf(device->instance, device, VK_ERROR_FORMAT_NOT_SUPPORTED,
-   "VK_IMAGE_USAGE_STORAGE_BIT unsupported for gralloc "
-   "swapchain");
-   }
-
 if (unmask32(, VK_IMAGE_USAGE_TRANSFER_DST_BIT |
   VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT))
*grallocUsage |= GRALLOC_USAGE_HW_RENDER;




Re: [Mesa-dev] [PATCH 22/59] compiler/nir: add lowering for 16-bit ldexp

2018-12-05 Thread Iago Toral
On Wed, 2018-12-05 at 11:39 +0200, Pohjolainen, Topi wrote:
> I remember people preferring to order things 16, 32, 64 before.
> Should
> we follow that here as well?

Yes, it makes sense. I'll change that.

> On Tue, Dec 04, 2018 at 08:16:46AM +0100, Iago Toral Quiroga wrote:
> > ---
> >  src/compiler/nir/nir_opt_algebraic.py | 5 +
> >  1 file changed, 5 insertions(+)
> > 
> > diff --git a/src/compiler/nir/nir_opt_algebraic.py
> > b/src/compiler/nir/nir_opt_algebraic.py
> > index 6c3b77c9b6e..747f1751086 100644
> > --- a/src/compiler/nir/nir_opt_algebraic.py
> > +++ b/src/compiler/nir/nir_opt_algebraic.py
> > @@ -778,6 +778,8 @@ def fexp2i(exp, bits):
> >return ('ishl', ('iadd', exp, 127), 23)
> > elif bits == 64:
> >return ('pack_64_2x32_split', 0, ('ishl', ('iadd', exp,
> > 1023), 20))
> > +   elif bits == 16:
> > +  return ('i2i16', ('ishl', ('iadd', exp, 15), 10))
> > else:
> >assert False
> >  
> > @@ -796,6 +798,8 @@ def ldexp(f, exp, bits):
> >exp = ('imin', ('imax', exp, -252), 254)
> > elif bits == 64:
> >exp = ('imin', ('imax', exp, -2044), 2046)
> > +   elif bits == 16:
> > +  exp = ('imin', ('imax', exp, -30), 30)
> 
> I expected this to be:
> 
>  exp = ('imin', ('imax', exp, -29), 30)

Actually, I think this should be -28, since the minimum exponent value
is -14.
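As a sanity check of that bound, here is a small C sketch. It rests on an assumption not shown in this hunk: that the lowering splits the clamped exponent into two halves (floor(exp / 2) and the remainder) and multiplies by two powers of two built with fexp2i, which is what the 32-bit and 64-bit bounds (-252 = 2 * -126, 254 = 2 * 127, and so on) suggest. Denormals are ignored. All function names are made up for the example.

```c
#include <stdint.h>

/* Floor division by two, written out because >> on negative ints is
 * implementation-defined in C. */
static int floor_half(int exp)
{
    return exp >= 0 ? exp / 2 : -((-exp + 1) / 2);
}

/* Bit pattern of 2^exp as a 16-bit float, same formula as the patch:
 * (exp + 15) << 10. Only valid for normal exponents, -14 <= exp <= 15. */
static uint16_t half_pow2_bits(int exp)
{
    return (uint16_t)((exp + 15) << 10);
}

/* Returns 1 if both halves of exp are normal half-float exponents. */
static int halves_in_range(int exp)
{
    int lo = floor_half(exp);
    int hi = exp - lo;

    return lo >= -14 && lo <= 15 && hi >= -14 && hi <= 15;
}
```

Under that assumption, -28 and 30 are the widest values for which both halves stay within [-14, 15]; with -29 the lower half becomes -15, which the formula can no longer encode as a normal number.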

> > else:
> >assert False
> >  
> > @@ -814,6 +818,7 @@ def ldexp(f, exp, bits):
> >  optimizations += [
> > (('ldexp@32', 'x', 'exp'), ldexp('x', 'exp', 32), 'options-
> > >lower_ldexp'),
> > (('ldexp@64', 'x', 'exp'), ldexp('x', 'exp', 64), 'options-
> > >lower_ldexp'),
> > +   (('ldexp@16', 'x', 'exp'), ldexp('x', 'exp', 16), 'options-
> > >lower_ldexp'),
> >  ]
> >  
> >  # Unreal Engine 4 demo applications open-codes bitfieldReverse()
> > -- 
> > 2.17.1
> > 
> 
> 



Re: [Mesa-dev] [PATCH] anv/android: handle storage images in vkGetSwapchainGrallocUsageANDROID

2018-12-05 Thread Bas Nieuwenhuizen
On Fri, Sep 7, 2018 at 12:54 AM Kevin Strasser  wrote:
>
> Android P and earlier expect that the surface supports storage images, and
> so many of the tests fail when the framework checks for that support. The
> framework also includes various image format and usage combinations that are
> invalid for the hardware.
>
> Drop the STORAGE restriction from the HAL and whitelist a pair of
> formats so that existing versions of Android can pass these tests.
>
> Fixes:
>dEQP-VK.wsi.android.*
>
> Signed-off-by: Kevin Strasser 
> ---
>  src/intel/vulkan/anv_android.c | 23 ++-
>  1 file changed, 14 insertions(+), 9 deletions(-)
>
> diff --git a/src/intel/vulkan/anv_android.c b/src/intel/vulkan/anv_android.c
> index 46c41d5..e2640b8 100644
> --- a/src/intel/vulkan/anv_android.c
> +++ b/src/intel/vulkan/anv_android.c
> @@ -234,7 +234,7 @@ VkResult anv_GetSwapchainGrallocUsageANDROID(
> *grallocUsage = 0;
> intel_logd("%s: format=%d, usage=0x%x", __func__, format, imageUsage);
>
> -   /* WARNING: Android Nougat's libvulkan.so hardcodes the VkImageUsageFlags
> +   /* WARNING: Android's libvulkan.so hardcodes the VkImageUsageFlags
>  * returned to applications via 
> VkSurfaceCapabilitiesKHR::supportedUsageFlags.
>  * The relevant code in libvulkan/swapchain.cpp contains this fun comment:
>  *
> @@ -247,7 +247,7 @@ VkResult anv_GetSwapchainGrallocUsageANDROID(
>  * dEQP-VK.wsi.android.swapchain.*.image_usage to fail.
>  */
>
> -   const VkPhysicalDeviceImageFormatInfo2KHR image_format_info = {
> +   VkPhysicalDeviceImageFormatInfo2KHR image_format_info = {

Why remove the const here?

>.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_IMAGE_FORMAT_INFO_2_KHR,
>.format = format,
>.type = VK_IMAGE_TYPE_2D,
> @@ -255,6 +255,17 @@ VkResult anv_GetSwapchainGrallocUsageANDROID(
>.usage = imageUsage,
> };
>
> +   /* Android P and earlier doesn't check if the physical device supports a
> +* given format and usage combination before calling this function. Omit 
> the
> +* storage requirement to make the tests pass.
> +*/
> +#if ANDROID_API_LEVEL <= 28
> +   if (format == VK_FORMAT_R8G8B8A8_SRGB ||
> +   format == VK_FORMAT_R5G6B5_UNORM_PACK16) {
> +  image_format_info.usage &= ~VK_IMAGE_USAGE_STORAGE_BIT;
> +   }
> +#endif

I don't think you need this. Per the Vulkan spec you can only use a
format + usage combination for a swapchain if it is supported per
ImageFormatProperties, using essentially the same check happening
above. I know CTS has been bad at this, but Vulkan CTS should have
been fixed for a bit now. (I don't think all the fixes are in Android
CTS 9.0_r4 yet, maybe the next release?)

(Also silently removing the usage bit is bad, because the app could
try actually using image stores with the image ...)

> +
> VkImageFormatProperties2KHR image_format_props = {
>.sType = VK_STRUCTURE_TYPE_IMAGE_FORMAT_PROPERTIES_2_KHR,
> };
> @@ -268,19 +279,13 @@ VkResult anv_GetSwapchainGrallocUsageANDROID(
> "inside %s", __func__);
> }
>
> -   /* Reject STORAGE here to avoid complexity elsewhere. */
> -   if (imageUsage & VK_IMAGE_USAGE_STORAGE_BIT) {
> -  return vk_errorf(device->instance, device, 
> VK_ERROR_FORMAT_NOT_SUPPORTED,
> -   "VK_IMAGE_USAGE_STORAGE_BIT unsupported for gralloc "
> -   "swapchain");
> -   }
> -
> if (unmask32(, VK_IMAGE_USAGE_TRANSFER_DST_BIT |
>   VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT))
>*grallocUsage |= GRALLOC_USAGE_HW_RENDER;
>
> if (unmask32(, VK_IMAGE_USAGE_TRANSFER_SRC_BIT |
>   VK_IMAGE_USAGE_SAMPLED_BIT |
> + VK_IMAGE_USAGE_STORAGE_BIT |
>   VK_IMAGE_USAGE_INPUT_ATTACHMENT_BIT))
>*grallocUsage |= GRALLOC_USAGE_HW_TEXTURE;
>
> --
> 2.7.4
>
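For anyone following the last hunk, here is a minimal standalone sketch of the
unmask-and-map pattern it touches. Everything named MODEL_* and
model_usage_to_gralloc() is a hypothetical stand-in written for this note, not
the anv or Android definitions, and the body of unmask32() is only a guess at
the helper's behaviour based on how the hunk calls it. The point is that each
usage bit is consumed as it is mapped, so any bit left over can be reported as
unsupported instead of being dropped silently.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for image usage bits and gralloc usage bits. */
enum {
   MODEL_USAGE_TRANSFER_SRC     = 0x01,
   MODEL_USAGE_TRANSFER_DST     = 0x02,
   MODEL_USAGE_SAMPLED          = 0x04,
   MODEL_USAGE_STORAGE          = 0x08,
   MODEL_USAGE_COLOR_ATTACHMENT = 0x10,
   MODEL_USAGE_DEPTH_STENCIL    = 0x20,
   MODEL_USAGE_INPUT_ATTACHMENT = 0x80,
};

enum {
   MODEL_GRALLOC_HW_TEXTURE = 0x100,
   MODEL_GRALLOC_HW_RENDER  = 0x200,
};

/* Clear test_mask from *inout_mask and report whether any bit was cleared. */
static bool
unmask32(uint32_t *inout_mask, uint32_t test_mask)
{
   uint32_t orig_mask = *inout_mask;
   *inout_mask &= ~test_mask;
   return *inout_mask != orig_mask;
}

/* Returns 0 on success, -1 if some usage bit has no gralloc mapping. */
static int
model_usage_to_gralloc(uint32_t image_usage, uint32_t *gralloc_usage)
{
   *gralloc_usage = 0;

   if (unmask32(&image_usage, MODEL_USAGE_TRANSFER_DST |
                              MODEL_USAGE_COLOR_ATTACHMENT))
      *gralloc_usage |= MODEL_GRALLOC_HW_RENDER;

   /* STORAGE is grouped with the texture-like usages, as in the hunk. */
   if (unmask32(&image_usage, MODEL_USAGE_TRANSFER_SRC |
                              MODEL_USAGE_SAMPLED |
                              MODEL_USAGE_STORAGE |
                              MODEL_USAGE_INPUT_ATTACHMENT))
      *gralloc_usage |= MODEL_GRALLOC_HW_TEXTURE;

   /* Anything still set was not understood: fail rather than ignore it. */
   return image_usage == 0 ? 0 : -1;
}
```

Calling model_usage_to_gralloc(MODEL_USAGE_STORAGE, &g) succeeds and sets the
texture bit, while a depth/stencil usage is rejected because nothing consumes
that bit.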


Re: [Mesa-dev] [PATCH 11/59] intel/compiler: Implement float64/int64 to float16 conversion

2018-12-05 Thread Iago Toral
On Tue, 2018-12-04 at 14:57 +0200, Pohjolainen, Topi wrote:
> On Tue, Dec 04, 2018 at 08:16:35AM +0100, Iago Toral Quiroga wrote:
> > From: Samuel Iglesias Gonsálvez 
> > 
> > It is not supported directly in the HW, we need to convert to a 32-
> > bit
> > type first as intermediate step.
> > 
> > v2 (Iago): handle conversions from 64-bit integers as well
> > 
> > Signed-off-by: Samuel Iglesias Gonsálvez 
> > ---
> >  src/intel/compiler/brw_fs_nir.cpp | 42
> > ---
> >  1 file changed, 39 insertions(+), 3 deletions(-)
> > 
> > diff --git a/src/intel/compiler/brw_fs_nir.cpp
> > b/src/intel/compiler/brw_fs_nir.cpp
> > index 7294f49ddc0..9f3d3bf9762 100644
> > --- a/src/intel/compiler/brw_fs_nir.cpp
> > +++ b/src/intel/compiler/brw_fs_nir.cpp
> > @@ -784,6 +784,19 @@ fs_visitor::nir_emit_alu(const fs_builder
> > , nir_alu_instr *instr)
> > */
> >  
> > case nir_op_f2f16:
> > +  /* BDW PRM, vol02, Command Reference Instructions, mov -
> > MOVE:
> > +   *
> > +   *   "There is no direct conversion from HF to DF or DF to
> > HF.
> > +   *Use two instructions and F (Float) as an intermediate
> > type.
> > +   */
> > +  if (nir_src_bit_size(instr->src[0].src) == 64) {
> > + fs_reg tmp = bld.vgrf(BRW_REGISTER_TYPE_F, 1);
> > + inst = bld.MOV(tmp, op[0]);
> > + inst->saturate = instr->dest.saturate;
> > + inst = bld.MOV(result, tmp);
> > + inst->saturate = instr->dest.saturate;
> > + break;
> > +  }
> >inst = bld.MOV(result, op[0]);
> >inst->saturate = instr->dest.saturate;
> >break;
> > @@ -864,7 +877,32 @@ fs_visitor::nir_emit_alu(const fs_builder
> > , nir_alu_instr *instr)
> >   inst->saturate = instr->dest.saturate;
> >   break;
> >}
> > -  /* fallthrough */
> 
> This is more or less nit-picking but I thought I'd ask anyway. The
> fallthru
> comment gets now dropped also for other cases than "i2f16" and
> "u2f16". And if
> we added the logic for nir_op_i2f16/nir_op_u2f16 cases just after the
> f2f16
> case that would yield a diff without the following three copy-paste
> lines as
> well. Or am I missing something?

Yes, I think you're right and if you look at this patch standalone I
think it would make sense to do that. The thing is that later on in the
series we have to change this further to incorporate more restrictions
for conversions to/from integer and half-float for atom platforms, so
having the f2f16 case separated from the {i,u}2f16 cases will make more
sense. That would be patch 46 in the series, which comes later because
that is when we addressed integer conversions from 8-bit and noticed
this whole thing on atom.

I can still make the change you suggest in this patch and then do the
split later on if you think that helps, though. I could also try to move
the fix for atom earlier in the series. That will lead to conflicts and
I'd need to slightly rewrite other patches in the series to accommodate
that, but it is certainly doable if you think that makes the commit
history better.

Iago

> > +  inst = bld.MOV(result, op[0]);
> > +  inst->saturate = instr->dest.saturate;
> > +  break;
> > +
> > +   case nir_op_i2f16:
> > +   case nir_op_u2f16:
> > +  /* BDW PRM, vol02, Command Reference Instructions, mov -
> > MOVE:
> > +   *
> > +   *"There is no direct conversion from HF to Q/UQ or Q/UQ
> > to HF.
> > +   * Use two instructions and F (Float) or a word integer
> > type or a
> > +   * DWord integer type as an intermediate type."
> > +   */
> > +  if (nir_src_bit_size(instr->src[0].src) == 64) {
> > + brw_reg_type reg_type = instr->op == nir_op_i2f16 ?
> > +BRW_REGISTER_TYPE_D : BRW_REGISTER_TYPE_UD;
> > + fs_reg tmp = bld.vgrf(reg_type, 1);
> > + inst = bld.MOV(tmp, op[0]);
> > + inst->saturate = instr->dest.saturate;
> > + inst = bld.MOV(result, tmp);
> > + inst->saturate = instr->dest.saturate;
> > + break;
> > +  }
> > +  inst = bld.MOV(result, op[0]);
> > +  inst->saturate = instr->dest.saturate;
> > +  break;
> > +
> > case nir_op_f2f32:
> > case nir_op_f2i32:
> > case nir_op_f2u32:
> > @@ -874,8 +912,6 @@ fs_visitor::nir_emit_alu(const fs_builder ,
> > nir_alu_instr *instr)
> > case nir_op_u2u32:
> > case nir_op_i2i16:
> > case nir_op_u2u16:
> > -   case nir_op_i2f16:
> > -   case nir_op_u2f16:
> > case nir_op_i2i8:
> > case nir_op_u2u8:
> >inst = bld.MOV(result, op[0]);
> > -- 
> > 2.17.1
> > 
> 
> 
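The two-step conversion described in the commit message can be illustrated on
the CPU. This is only a sketch under stated assumptions, not the driver's code:
float_to_half() is a hypothetical software binary16 encoder written for this
note (round-to-nearest-even), and double_to_half_two_step() mirrors the
DF -> F -> HF sequence of two MOVs. Note that rounding twice can, in rare
cases, differ from a single correctly rounded DF -> HF conversion.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical encoder: IEEE binary32 value -> binary16 bit pattern. */
static uint16_t
float_to_half(float f)
{
   uint32_t x;
   memcpy(&x, &f, sizeof(x));

   uint16_t sign = (uint16_t)((x >> 16) & 0x8000);
   uint32_t mag = x & 0x7fffffff;

   if (mag > 0x7f800000)               /* NaN: return a quiet NaN */
      return (uint16_t)(sign | 0x7e00);
   if (mag >= 0x477ff000)              /* >= 65520, including inf: infinity */
      return (uint16_t)(sign | 0x7c00);

   if (mag < 0x38800000) {             /* below 2^-14: zero or subnormal */
      if (mag < 0x33000000)            /* below 2^-25: rounds to zero */
         return sign;
      uint32_t mant = (mag & 0x007fffff) | 0x00800000;
      unsigned shift = 126 - (mag >> 23);          /* 14..24 */
      uint32_t q = mant >> shift;
      uint32_t rem = mant & ((UINT32_C(1) << shift) - 1);
      uint32_t halfway = UINT32_C(1) << (shift - 1);
      if (rem > halfway || (rem == halfway && (q & 1)))
         q++;                          /* round to nearest, ties to even */
      return (uint16_t)(sign | q);
   }

   /* Normal range: rebias the exponent (127 -> 15) and round the mantissa. */
   uint32_t rebased = mag - 0x38000000;
   uint32_t rounded = rebased + 0x0fff + ((rebased >> 13) & 1);
   return (uint16_t)(sign | (rounded >> 13));
}

/* Mirror of the two MOVs: 64-bit float -> 32-bit float -> 16-bit float. */
static uint16_t
double_to_half_two_step(double d)
{
   float tmp = (float)d;               /* first MOV: DF -> F */
   return float_to_half(tmp);          /* second MOV: F -> HF */
}
```

For example, double_to_half_two_step(1.0) yields the bit pattern 0x3c00 and
values far above the half-float range, such as 1e10, come out as 0x7c00
(infinity).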



Re: [Mesa-dev] [PATCH] radv: fix vkCmdCopyQueryoolResults() for timestamp queries

2018-12-05 Thread Alex Smith
On Tue, 4 Dec 2018 at 21:57, Bas Nieuwenhuizen 
wrote:

> On Tue, Dec 4, 2018 at 4:52 PM Samuel Pitoiset
>  wrote:
> >
> > Because WAIT_REG_MEM can only wait for a 32-bit value, it's not
> > safe to use it for timestamp queries. If we only wait on the low
> > 32 bits of a timestamp query we could be unlucky and the GPU
> > might hang.
> >
> > One possible fix is to emit a full end of pipe event and wait
> > on a 32-bit value which is actually an availability bit. This
> > bit is allocated at creation time and always cleared before
> > emitting the EOP event.
> >
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108925
> > Fixes: 5d6a560a29 ("radv: do not use the availability bit for timestamp
> queries")
> > Signed-off-by: Samuel Pitoiset 
> > ---
> >  src/amd/vulkan/radv_query.c | 49 +++--
> >  1 file changed, 41 insertions(+), 8 deletions(-)
> >
> > diff --git a/src/amd/vulkan/radv_query.c b/src/amd/vulkan/radv_query.c
> > index 550abe307a1..9bb6b660add 100644
> > --- a/src/amd/vulkan/radv_query.c
> > +++ b/src/amd/vulkan/radv_query.c
> > @@ -1056,8 +1056,15 @@ VkResult radv_CreateQueryPool(
> > pool->pipeline_stats_mask = pCreateInfo->pipelineStatistics;
> > pool->availability_offset = pool->stride *
> pCreateInfo->queryCount;
> > pool->size = pool->availability_offset;
> > -   if (pCreateInfo->queryType == VK_QUERY_TYPE_PIPELINE_STATISTICS)
> > +   if (pCreateInfo->queryType == VK_QUERY_TYPE_PIPELINE_STATISTICS)
> {
> > pool->size += 4 * pCreateInfo->queryCount;
> > +   } else if (pCreateInfo->queryType == VK_QUERY_TYPE_TIMESTAMP) {
> > +   /* Allocate one DWORD for the availability bit which is
> needed
> > +* for vkCmdCopyQueryPoolResults() because we can't
> perform a
> > +* WAIT_REG_MEM on a 64-bit value.
> > +*/
> > +   pool->size += 4;
> > +   }
> >
> > pool->bo = device->ws->buffer_create(device->ws, pool->size,
> >  64, RADEON_DOMAIN_GTT,
> RADEON_FLAG_NO_INTERPROCESS_SHARING);
> > @@ -1328,19 +1335,45 @@ void radv_CmdCopyQueryPoolResults(
> >   pool->availability_offset + 4 *
> firstQuery);
> > break;
> > case VK_QUERY_TYPE_TIMESTAMP:
> > +   if (flags & VK_QUERY_RESULT_WAIT_BIT) {
> > +   /* Emit a full end of pipe event because we can't
> > +* perform a WAIT_REG_MEM on a 64-bit value. If
> we only
> > +* do a WAIT_REG_MEM on the low 32 bits of a
> timestamp
> > +* query we could be unlucky and the GPU might
> hang.
> > +*/
> > +   enum chip_class chip =
> cmd_buffer->device->physical_device->rad_info.chip_class;
> > +   bool is_mec =
> radv_cmd_buffer_uses_mec(cmd_buffer);
> > +   uint64_t avail_va = va +
> pool->availability_offset;
> > +
> > +   /* Clear the availability bit before waiting on
> the end
> > +* of pipe event.
> > +*/
> > +   radeon_emit(cs, PKT3(PKT3_WRITE_DATA, 3, 0));
> > +   radeon_emit(cs, S_370_DST_SEL(V_370_MEM_ASYNC) |
> > +   S_370_WR_CONFIRM(1) |
> > +   S_370_ENGINE_SEL(V_370_ME));
> > +   radeon_emit(cs, avail_va);
> > +   radeon_emit(cs, avail_va >> 32);
> > +   radeon_emit(cs, 0xdeadbeef);
> > +
> > +   /* Wait for all prior GPU work. */
> > +   si_cs_emit_write_event_eop(cs, chip, is_mec,
> > +
> V_028A90_BOTTOM_OF_PIPE_TS, 0,
> > +
> EOP_DATA_SEL_VALUE_32BIT,
> > +  avail_va, 0, 1,
> > +
> cmd_buffer->gfx9_eop_bug_va);
> > +
> > +   /* Wait on the timestamp value. */
> > +   radv_cp_wait_mem(cs, WAIT_REG_MEM_EQUAL,
> avail_va,
> > +1, 0x);
> > +   }
> > +
>
> Can we put this in a separate function? Also, you'll want to allocate
> the availability bit in the upload buffer, in case there are multiple
> concurrent command buffers using the same query pool.
>
> Alternative solution: look at the upper 32 bits, those definitely
> should not be 0xfff until a far away point in the future.
>

I just looked into this a bit more, since if the cause of the hang is that
the low 32 bits on a valid timestamp are 0x, it seemed a bit
suspicious that it's 100% repro.

What's actually happening is that some of the timestamps are being written
before vkCmdResetQueryPool completes, so the reset ends up overwriting them
back to TIMESTAMP_NOT_READY. I've updated the test case on the bug to map

[Mesa-dev] [PATCH] etnaviv: fix resource usage tracking across different pipe_context's

2018-12-05 Thread Marek Vasut
From: Christian Gmeiner 

A pipe_resource can be shared by all the pipe_context's hanging off the
same pipe_screen.

Signed-off-by: Christian Gmeiner 
---
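A toy model of the idea, for readers skimming the diff below. The patch keeps
a set of pending contexts per resource (rsc->pending_ctx) and a set of used
resources on the screen. The sketch here deliberately swaps in a plain bitmask
for those hash sets, and every name in it is hypothetical. It only shows why a
single "pending context" pointer is not enough once two contexts touch the
same resource.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: one bit per context that still has unsubmitted work
 * referencing this resource. Context ids must be below 32. */
struct resource_model {
   uint32_t pending_ctx_mask;
};

/* A context records that it used the resource in its command stream. */
static void
resource_used(struct resource_model *rsc, unsigned ctx_id)
{
   rsc->pending_ctx_mask |= UINT32_C(1) << ctx_id;
}

/* A context's command stream was reset: drop only that context's claim. */
static void
context_flushed(struct resource_model *rsc, unsigned ctx_id)
{
   rsc->pending_ctx_mask &= ~(UINT32_C(1) << ctx_id);
}

/* The resource stays busy until every context that used it has flushed. */
static bool
resource_busy(const struct resource_model *rsc)
{
   return rsc->pending_ctx_mask != 0;
}
```

With two contexts using the same resource, flushing the first one leaves
resource_busy() true until the second one flushes as well.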
 src/gallium/drivers/etnaviv/etnaviv_context.c | 21 -
 src/gallium/drivers/etnaviv/etnaviv_context.h |  3 --
 .../drivers/etnaviv/etnaviv_resource.c| 44 ++-
 .../drivers/etnaviv/etnaviv_resource.h|  7 ++-
 src/gallium/drivers/etnaviv/etnaviv_screen.c  |  8 
 src/gallium/drivers/etnaviv/etnaviv_screen.h  |  4 ++
 6 files changed, 58 insertions(+), 29 deletions(-)

diff --git a/src/gallium/drivers/etnaviv/etnaviv_context.c 
b/src/gallium/drivers/etnaviv/etnaviv_context.c
index 3038d210e10..28c6b8fab84 100644
--- a/src/gallium/drivers/etnaviv/etnaviv_context.c
+++ b/src/gallium/drivers/etnaviv/etnaviv_context.c
@@ -36,6 +36,7 @@
 #include "etnaviv_query.h"
 #include "etnaviv_query_hw.h"
 #include "etnaviv_rasterizer.h"
+#include "etnaviv_resource.h"
 #include "etnaviv_screen.h"
 #include "etnaviv_shader.h"
 #include "etnaviv_state.h"
@@ -329,7 +330,8 @@ static void
 etna_cmd_stream_reset_notify(struct etna_cmd_stream *stream, void *priv)
 {
struct etna_context *ctx = priv;
-   struct etna_resource *rsc, *rsc_tmp;
+   struct etna_screen *screen = ctx->screen;
+   struct set_entry *entry;
 
etna_set_state(stream, VIVS_GL_API_MODE, VIVS_GL_API_MODE_OPENGL);
etna_set_state(stream, VIVS_GL_VERTEX_ELEMENT_CONFIG, 0x0001);
@@ -384,16 +386,13 @@ etna_cmd_stream_reset_notify(struct etna_cmd_stream 
*stream, void *priv)
ctx->dirty = ~0L;
ctx->dirty_sampler_views = ~0L;
 
-   /* go through all the used resources and clear their status flag */
-   LIST_FOR_EACH_ENTRY_SAFE(rsc, rsc_tmp, >used_resources, list)
-   {
-  debug_assert(rsc->status != 0);
-  rsc->status = 0;
-  rsc->pending_ctx = NULL;
-  list_delinit(>list);
-   }
+   /* go through all the used context resources and clear their status flag */
+   set_foreach(screen->used_resources, entry) {
+  struct etna_resource *rsc = (struct etna_resource *)entry->key;
 
-   assert(LIST_IS_EMPTY(>used_resources));
+  _mesa_set_remove_key(rsc->pending_ctx, ctx);
+  _mesa_set_remove(screen->used_resources, entry);
+   }
 }
 
 static void
@@ -437,8 +436,6 @@ etna_context_create(struct pipe_screen *pscreen, void 
*priv, unsigned flags)
/* need some sane default in case state tracker doesn't set some state: */
ctx->sample_mask = 0x;
 
-   list_inithead(>used_resources);
-
/*  Set sensible defaults for state */
etna_cmd_stream_reset_notify(ctx->stream, ctx);
 
diff --git a/src/gallium/drivers/etnaviv/etnaviv_context.h 
b/src/gallium/drivers/etnaviv/etnaviv_context.h
index 584caa77080..eff0a2378c7 100644
--- a/src/gallium/drivers/etnaviv/etnaviv_context.h
+++ b/src/gallium/drivers/etnaviv/etnaviv_context.h
@@ -136,9 +136,6 @@ struct etna_context {
uint32_t prim_hwsupport;
struct primconvert_context *primconvert;
 
-   /* list of resources used by currently-unsubmitted renders */
-   struct list_head used_resources;
-
struct slab_child_pool transfer_pool;
struct blitter_context *blitter;
 
diff --git a/src/gallium/drivers/etnaviv/etnaviv_resource.c 
b/src/gallium/drivers/etnaviv/etnaviv_resource.c
index 7fd374ae23d..166f2a4f71d 100644
--- a/src/gallium/drivers/etnaviv/etnaviv_resource.c
+++ b/src/gallium/drivers/etnaviv/etnaviv_resource.c
@@ -33,6 +33,7 @@
 #include "etnaviv_screen.h"
 #include "etnaviv_translate.h"
 
+#include "util/hash_table.h"
 #include "util/u_inlines.h"
 #include "util/u_memory.h"
 
@@ -275,7 +276,6 @@ etna_resource_alloc(struct pipe_screen *pscreen, unsigned 
layout,
rsc->halign = halign;
 
pipe_reference_init(>base.reference, 1);
-   list_inithead(>list);
 
size = setup_miptree(rsc, paddingX, paddingY, msaa_xscale, msaa_yscale);
 
@@ -296,6 +296,11 @@ etna_resource_alloc(struct pipe_screen *pscreen, unsigned 
layout,
   memset(map, 0, size);
}
 
+   rsc->pending_ctx = _mesa_set_create(NULL, _mesa_hash_pointer,
+  _mesa_key_pointer_equal);
+   if (!rsc->pending_ctx)
+  goto free_rsc;
+
return >base;
 
 free_rsc:
@@ -457,6 +462,8 @@ etna_resource_destroy(struct pipe_screen *pscreen, struct 
pipe_resource *prsc)
 {
struct etna_resource *rsc = etna_resource(prsc);
 
+   _mesa_set_destroy(rsc->pending_ctx, NULL);
+
if (rsc->bo)
   etna_bo_del(rsc->bo);
 
@@ -466,8 +473,6 @@ etna_resource_destroy(struct pipe_screen *pscreen, struct 
pipe_resource *prsc)
if (rsc->scanout)
   renderonly_scanout_destroy(rsc->scanout, etna_screen(pscreen)->ro);
 
-   list_delinit(>list);
-
pipe_resource_reference(>texture, NULL);
pipe_resource_reference(>external, NULL);
 
@@ -501,7 +506,6 @@ etna_resource_from_handle(struct pipe_screen *pscreen,
*prsc = *tmpl;
 
pipe_reference_init(>reference, 1);
-   list_inithead(>list);
prsc->screen = pscreen;
 
rsc->bo = etna_screen_bo_from_handle(pscreen, 

Re: [Mesa-dev] [PATCH] radv: Flush before vkCmdWriteTimestamp() if needed

2018-12-05 Thread Alex Smith
Thanks. Though this fixes the 100% repro hang, I think your first patch is
still needed as well to handle getting 0x in the low 32 bits.

On Wed, 5 Dec 2018 at 10:04, Samuel Pitoiset 
wrote:

> Yes, this is correct, indeed.
>
> The issue wasn't present because we used EOP events before removing the
> availability bit.
>
> Btw, just noticed that we should reset pending_reset_query directly in
> si_emit_cache_flush() to reduce the number of stalls. I will send a patch.
>
> Also note that fill CP DMA operations are currently always sync'ed,
> while CP DMA copies are not. I plan to change this at some point.
>
> Reviewed-by: Samuel Pitoiset 
>
> On 12/5/18 10:52 AM, Alex Smith wrote:
> > As done for vkCmdBeginQuery() already. Prevents timestamps from being
> > overwritten by previous vkCmdResetQueryPool() calls if the shader path
> > was used to do the reset.
> >
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108925
> > Fixes: a41e2e9cf5 ("radv: allow to use a compute shader for resetting
> the query pool")
> > Signed-off-by: Alex Smith 
> > ---
> >   src/amd/vulkan/radv_query.c | 30 +++---
> >   1 file changed, 19 insertions(+), 11 deletions(-)
> >
> > diff --git a/src/amd/vulkan/radv_query.c b/src/amd/vulkan/radv_query.c
> > index 550abe307a..e226bcef6a 100644
> > --- a/src/amd/vulkan/radv_query.c
> > +++ b/src/amd/vulkan/radv_query.c
> > @@ -1436,6 +1436,22 @@ static unsigned event_type_for_stream(unsigned
> stream)
> >   }
> >   }
> >
> > +static void emit_query_flush(struct radv_cmd_buffer *cmd_buffer,
> > +  struct radv_query_pool *pool)
> > +{
> > + if (cmd_buffer->pending_reset_query) {
> > + if (pool->size >= RADV_BUFFER_OPS_CS_THRESHOLD) {
> > + /* Only need to flush caches if the query pool
> size is
> > +  * large enough to be resetted using the compute
> shader
> > +  * path. Small pools don't need any cache flushes
> > +  * because we use a CP dma clear.
> > +  */
> > + si_emit_cache_flush(cmd_buffer);
> > + cmd_buffer->pending_reset_query = false;
> > + }
> > + }
> > +}
> > +
> >   static void emit_begin_query(struct radv_cmd_buffer *cmd_buffer,
> >uint64_t va,
> >VkQueryType query_type,
> > @@ -1582,17 +1598,7 @@ void radv_CmdBeginQueryIndexedEXT(
> >
> >   radv_cs_add_buffer(cmd_buffer->device->ws, cs, pool->bo);
> >
> > - if (cmd_buffer->pending_reset_query) {
> > - if (pool->size >= RADV_BUFFER_OPS_CS_THRESHOLD) {
> > - /* Only need to flush caches if the query pool
> size is
> > -  * large enough to be resetted using the compute
> shader
> > -  * path. Small pools don't need any cache flushes
> > -  * because we use a CP dma clear.
> > -  */
> > - si_emit_cache_flush(cmd_buffer);
> > - cmd_buffer->pending_reset_query = false;
> > - }
> > - }
> > + emit_query_flush(cmd_buffer, pool);
> >
> >   va += pool->stride * query;
> >
> > @@ -1669,6 +1675,8 @@ void radv_CmdWriteTimestamp(
> >
> >   radv_cs_add_buffer(cmd_buffer->device->ws, cs, pool->bo);
> >
> > + emit_query_flush(cmd_buffer, pool);
> > +
> >   int num_queries = 1;
> >   if (cmd_buffer->state.subpass &&
> cmd_buffer->state.subpass->view_mask)
> >   num_queries =
> util_bitcount(cmd_buffer->state.subpass->view_mask);
> >
>
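The helper factored out above can be modelled in a few lines, which may help
when reasoning about when the flush actually fires. This is a sketch with
hypothetical stand-ins: MODEL_CS_THRESHOLD takes the place of
RADV_BUFFER_OPS_CS_THRESHOLD (the value here is an assumption), and
incrementing a counter takes the place of si_emit_cache_flush().

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed value; stands in for RADV_BUFFER_OPS_CS_THRESHOLD. */
#define MODEL_CS_THRESHOLD 4096

struct cmd_buffer_model {
   bool pending_reset_query;   /* a pool reset was recorded earlier */
   unsigned flushes;           /* counts stand-in cache flushes */
};

struct query_pool_model {
   uint64_t size;
};

/* Flush only when a reset is pending and the pool is large enough to have
 * been reset through the compute shader path. */
static void
model_query_flush(struct cmd_buffer_model *cmd,
                  const struct query_pool_model *pool)
{
   if (cmd->pending_reset_query && pool->size >= MODEL_CS_THRESHOLD) {
      cmd->flushes++;          /* stand-in for si_emit_cache_flush() */
      cmd->pending_reset_query = false;
   }
}
```

Calling the helper from both the begin-query and the write-timestamp paths
means whichever runs first after a large reset pays for the flush, and later
calls see the flag already cleared. For a small pool the flag is left set, as
in the original code.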


[Mesa-dev] [Bug 108925] vkCmdCopyQueryPoolResults(VK_QUERY_RESULT_WAIT_BIT) for timestamps with large query count hangs

2018-12-05 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108925

Samuel Pitoiset changed:

        What|Removed |Added
      Status|NEW     |RESOLVED
  Resolution|---     |FIXED

--- Comment #6 from Samuel Pitoiset  ---
Should be fixed with

https://cgit.freedesktop.org/mesa/mesa/commit/?id=c1b6cb068c4dfe49c309624610e8610b3f0b27c3

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


[Mesa-dev] [Bug 108914] blocky shadow artifacts in The Forest with DXVK, RADV_DEBUG=nohiz fixes this

2018-12-05 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108914

Samuel Pitoiset changed:

        What|Removed |Added
      Status|NEW     |RESOLVED
  Resolution|---     |FIXED

--- Comment #15 from Samuel Pitoiset  ---
Should be fixed with

https://cgit.freedesktop.org/mesa/mesa/commit/?id=824cfc1ee5e0aba15b676b9363ff32046d96eb42

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


[Mesa-dev] [Bug 108578] RADV reports wrong hardcoded Vulkan API Version

2018-12-05 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108578

Samuel Pitoiset changed:

        What|Removed |Added
      Status|NEW     |RESOLVED
  Resolution|---     |NOTABUG

--- Comment #4 from Samuel Pitoiset  ---
This is not a bug. We should be able to bump the patch version, but that
requires looking at the changelog since 1.1.70.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH] radv: wait on the high 32 bits of timestamp queries

2018-12-05 Thread Bas Nieuwenhuizen
Reviewed-by: Bas Nieuwenhuizen 
On Wed, Dec 5, 2018 at 11:43 AM Samuel Pitoiset
 wrote:
>
> In case we are unlucky if the low part is 0x.
>
> Fixes: 5d6a560a29 ("radv: do not use the availability bit for timestamp 
> queries")
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_query.c | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/src/amd/vulkan/radv_query.c b/src/amd/vulkan/radv_query.c
> index 550abe307a1..8c64c98ffaa 100644
> --- a/src/amd/vulkan/radv_query.c
> +++ b/src/amd/vulkan/radv_query.c
> @@ -1336,8 +1336,11 @@ void radv_CmdCopyQueryPoolResults(
>
>
> if (flags & VK_QUERY_RESULT_WAIT_BIT) {
> +   /* Wait on the high 32 bits of the timestamp 
> in
> +* case the low part is 0x.
> +*/
> radv_cp_wait_mem(cs, WAIT_REG_MEM_NOT_EQUAL,
> -local_src_va,
> +local_src_va + 4,
>  TIMESTAMP_NOT_READY >> 32,
>  0x);
> }
> --
> 2.19.2
>
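To make the "unlucky" case concrete, here is a small CPU-side model of the two
wait conditions. It is a sketch only: MODEL_TIMESTAMP_NOT_READY assumes the
not-ready sentinel is a 64-bit value with all bits set, and the two functions
are hypothetical stand-ins for a 32-bit "not equal" wait on the low or the
high dword of a query slot.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed sentinel: all bits set marks a slot that has not been written. */
#define MODEL_TIMESTAMP_NOT_READY UINT64_MAX

/* Wait condition that looks only at the low 32 bits of the slot. */
static bool
ready_by_low_dword(uint64_t slot)
{
   return (uint32_t)slot != (uint32_t)MODEL_TIMESTAMP_NOT_READY;
}

/* Wait condition that looks only at the high 32 bits of the slot. */
static bool
ready_by_high_dword(uint64_t slot)
{
   return (uint32_t)(slot >> 32) !=
          (uint32_t)(MODEL_TIMESTAMP_NOT_READY >> 32);
}
```

A valid timestamp such as 0x00000001ffffffff never satisfies the low-dword
condition, so a wait on it would not finish, while the high-dword condition
accepts it. The high-dword check can only misfire once the counter itself
reaches the top of its 64-bit range.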


Re: [Mesa-dev] Make Jordan an Owner of the mesa project?

2018-12-05 Thread Matt Turner
On Tue, Dec 4, 2018 at 7:39 PM Jason Ekstrand  wrote:
>
> It's been 24 hours and the only owner who hasn't replied yet is Matt.  Given 
> that everyone else has firmly ACKed, I'm going to click the button.  
> Congratulations, Jordan, you're now a mesa Owner!

That's certainly no reflection on my opinion of Jordan :)

(I've just been sick)

Ack!


Re: [Mesa-dev] [PATCH 10/59] intel/compiler: implement conversions from 16-bit float to 64-bit

2018-12-05 Thread Pohjolainen, Topi
On Wed, Dec 05, 2018 at 09:20:57AM +0100, Iago Toral wrote:
> On Tue, 2018-12-04 at 18:10 +0200, Pohjolainen, Topi wrote:
> > On Tue, Dec 04, 2018 at 02:33:25PM +0200, Pohjolainen, Topi wrote:
> > > On Tue, Dec 04, 2018 at 08:16:34AM +0100, Iago Toral Quiroga wrote:
> > > > Signed-off-by: Samuel Iglesias Gonsálvez 
> > > > ---
> > > >  src/intel/compiler/brw_fs_nir.cpp | 41
> > > > +++
> > > >  1 file changed, 41 insertions(+)
> > > > 
> > > > diff --git a/src/intel/compiler/brw_fs_nir.cpp
> > > > b/src/intel/compiler/brw_fs_nir.cpp
> > > > index 6eb68794f58..7294f49ddc0 100644
> > > > --- a/src/intel/compiler/brw_fs_nir.cpp
> > > > +++ b/src/intel/compiler/brw_fs_nir.cpp
> > > > @@ -796,6 +796,47 @@ fs_visitor::nir_emit_alu(const fs_builder
> > > > , nir_alu_instr *instr)
> > > > case nir_op_f2f64:
> > > > case nir_op_f2i64:
> > > > case nir_op_f2u64:
> > > > +  /* BDW PRM, vol02, Command Reference Instructions, mov -
> > > > MOVE:
> > > > +   *
> > > > +   *   "There is no direct conversion from HF to DF or DF to
> > > > HF.
> > > > +   *Use two instructions and F (Float) as an
> > > > intermediate type.
> > > > +   *
> > > > +   *There is no direct conversion from HF to Q/UQ or
> > > > Q/UQ to HF.
> > > > +   *Use two instructions and F (Float) or a word integer
> > > > type
> > > > +   *or a DWord integer type as an intermediate type."
> > > > +   */
> > > > +  if (nir_src_bit_size(instr->src[0].src) == 16) {
> > > > + fs_reg tmp = bld.vgrf(BRW_REGISTER_TYPE_F, 1);
> > > > + inst = bld.MOV(tmp, op[0]);
> > > > + inst->saturate = instr->dest.saturate;
> > > > + op[0] = tmp;
> > > > +  }
> > > > +
> > > > +  /* CHV PRM, vol07, 3D Media GPGPU Engine, Register Region
> > > > Restrictions:
> > > > +   *
> > > > +   *"When source or destination is 64b (...), regioning
> > > > in Align1
> > > > +   * must follow these rules:
> > > > +   *
> > > > +   * 1. Source and destination horizontal stride must be
> > > > aligned to
> > > > +   *the same qword.
> > > > +   * (...)"
> > > > +   *
> > > > +   * This means that conversions from bit-sizes smaller than
> > > > 64-bit to
> > > > +   * 64-bit need to have the source data elements aligned to
> > > > 64-bit.
> > > > +   * This restriction does not apply to BDW and later.
> > > > +   */
> > > > +  if (type_sz(result.type) == 8 && type_sz(op[0].type) < 8
> > > > &&
> > > > +  (devinfo->is_cherryview ||
> > > > gen_device_info_is_9lp(devinfo))) {
> > > > + fs_reg tmp = bld.vgrf(result.type, 1);
> > > > + tmp = subscript(tmp, op[0].type, 0);
> > > > + inst = bld.MOV(tmp, op[0]);
> > > > + op[0] = tmp;
> > > > +  }
> > > 
> > > For this second part we seem to have similar logic further down
> > > after
> > > "nir_op_u2u64" (not visible here) in master? Would it be possible
> > > to fallthru
> > > from here and re-use that?
> > 
> > And after reading it more carefully myself it looks that this is
> > actually
> > cleaner.
> > 
> > I noticed that in the nir_op_u2u64 case the destination and source
> > sizes are
> > checked using:
> > 
> >if (nir_dest_bit_size(instr->dest.dest) == 64 &&
> >nir_src_bit_size(instr->src[0].src) < 64 &&
> >...
> > 
> > Should we use the same here for consistency?
> 
> Right above this we can rewrite op[0] with a temporary that would be
> different from instr->src[0].src so we can't check the nir sources any
> more. Also, by the end of the series, when we incorporate 8-bit
> conversions there will be more cases like this that we need to account
> for and we end up rewriting the u2u64 case to this style as well.

Ok, thanks for the explanation! This patch is:

Reviewed-by: Topi Pohjolainen 


Re: [Mesa-dev] [PATCH 22/59] compiler/nir: add lowering for 16-bit ldexp

2018-12-05 Thread Pohjolainen, Topi

I remember people preferring to order things 16, 32, 64 before. Should
we follow that here as well?

On Tue, Dec 04, 2018 at 08:16:46AM +0100, Iago Toral Quiroga wrote:
> ---
>  src/compiler/nir/nir_opt_algebraic.py | 5 +
>  1 file changed, 5 insertions(+)
> 
> diff --git a/src/compiler/nir/nir_opt_algebraic.py 
> b/src/compiler/nir/nir_opt_algebraic.py
> index 6c3b77c9b6e..747f1751086 100644
> --- a/src/compiler/nir/nir_opt_algebraic.py
> +++ b/src/compiler/nir/nir_opt_algebraic.py
> @@ -778,6 +778,8 @@ def fexp2i(exp, bits):
>return ('ishl', ('iadd', exp, 127), 23)
> elif bits == 64:
>return ('pack_64_2x32_split', 0, ('ishl', ('iadd', exp, 1023), 20))
> +   elif bits == 16:
> +  return ('i2i16', ('ishl', ('iadd', exp, 15), 10))
> else:
>assert False
>  
> @@ -796,6 +798,8 @@ def ldexp(f, exp, bits):
>exp = ('imin', ('imax', exp, -252), 254)
> elif bits == 64:
>exp = ('imin', ('imax', exp, -2044), 2046)
> +   elif bits == 16:
> +  exp = ('imin', ('imax', exp, -30), 30)

I expected this to be:

 exp = ('imin', ('imax', exp, -29), 30)

> else:
>assert False
>  
> @@ -814,6 +818,7 @@ def ldexp(f, exp, bits):
>  optimizations += [
> (('ldexp@32', 'x', 'exp'), ldexp('x', 'exp', 32), 'options->lower_ldexp'),
> (('ldexp@64', 'x', 'exp'), ldexp('x', 'exp', 64), 'options->lower_ldexp'),
> +   (('ldexp@16', 'x', 'exp'), ldexp('x', 'exp', 16), 'options->lower_ldexp'),
>  ]
>  
>  # Unreal Engine 4 demo applications open-codes bitfieldReverse()
> -- 
> 2.17.1
> 
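As a sanity check on the new fexp2i() branch, the same bit construction can be
written out in C. This is a sketch for illustration, not code from the series:
the function names are made up, and the half-float version is only meaningful
for exponents that give a normal number (-14..15), where exp + 15 lands in the
5-bit exponent field above the 10 mantissa bits.

```c
#include <stdint.h>
#include <string.h>

/* Bit pattern of 2^exp as a 16-bit float; valid for exp in -14..15. */
static uint16_t
fexp2i_half_bits(int exp)
{
   return (uint16_t)((exp + 15) << 10);
}

/* Bit pattern of 2^exp as a 32-bit float; valid for exp in -126..127. */
static uint32_t
fexp2i_float_bits(int exp)
{
   return (uint32_t)(exp + 127) << 23;
}

/* Reinterpret a 32-bit pattern as a float without aliasing problems. */
static float
float_from_bits(uint32_t bits)
{
   float f;
   memcpy(&f, &bits, sizeof(f));
   return f;
}
```

For instance, fexp2i_half_bits(0) gives 0x3c00, the half-float encoding of
1.0, and float_from_bits(fexp2i_float_bits(3)) gives 8.0f.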


Re: [Mesa-dev] [PATCH 11/59] intel/compiler: Implement float64/int64 to float16 conversion

2018-12-05 Thread Iago Toral
On Wed, 2018-12-05 at 11:08 +0200, Pohjolainen, Topi wrote:
> On Wed, Dec 05, 2018 at 09:49:29AM +0100, Iago Toral wrote:
> > On Tue, 2018-12-04 at 14:57 +0200, Pohjolainen, Topi wrote:
> > > On Tue, Dec 04, 2018 at 08:16:35AM +0100, Iago Toral Quiroga
> > > wrote:
> > > > From: Samuel Iglesias Gonsálvez 
> > > > 
> > > > It is not supported directly in the HW, we need to convert to a
> > > > 32-
> > > > bit
> > > > type first as intermediate step.
> > > > 
> > > > v2 (Iago): handle conversions from 64-bit integers as well
> > > > 
> > > > Signed-off-by: Samuel Iglesias Gonsálvez 
> > > > ---
> > > >  src/intel/compiler/brw_fs_nir.cpp | 42
> > > > ---
> > > >  1 file changed, 39 insertions(+), 3 deletions(-)
> > > > 
> > > > diff --git a/src/intel/compiler/brw_fs_nir.cpp
> > > > b/src/intel/compiler/brw_fs_nir.cpp
> > > > index 7294f49ddc0..9f3d3bf9762 100644
> > > > --- a/src/intel/compiler/brw_fs_nir.cpp
> > > > +++ b/src/intel/compiler/brw_fs_nir.cpp
> > > > @@ -784,6 +784,19 @@ fs_visitor::nir_emit_alu(const fs_builder
> > > > , nir_alu_instr *instr)
> > > > */
> > > >  
> > > > case nir_op_f2f16:
> > > > +  /* BDW PRM, vol02, Command Reference Instructions, mov -
> > > > MOVE:
> > > > +   *
> > > > +   *   "There is no direct conversion from HF to DF or DF
> > > > to
> > > > HF.
> > > > +   *Use two instructions and F (Float) as an
> > > > intermediate
> > > > type.
> > > > +   */
> > > > +  if (nir_src_bit_size(instr->src[0].src) == 64) {
> > > > + fs_reg tmp = bld.vgrf(BRW_REGISTER_TYPE_F, 1);
> > > > + inst = bld.MOV(tmp, op[0]);
> > > > + inst->saturate = instr->dest.saturate;
> > > > + inst = bld.MOV(result, tmp);
> > > > + inst->saturate = instr->dest.saturate;
> > > > + break;
> > > > +  }
> > > >inst = bld.MOV(result, op[0]);
> > > >inst->saturate = instr->dest.saturate;
> > > >break;
> > > > @@ -864,7 +877,32 @@ fs_visitor::nir_emit_alu(const fs_builder
> > > > , nir_alu_instr *instr)
> > > >   inst->saturate = instr->dest.saturate;
> > > >   break;
> > > >}
> > > > -  /* fallthrough */
> > > 
> > > This is more or less nit-picking but I thought I'd ask anyway. The
> > > fallthru
> > > comment gets now dropped also for other cases than "i2f16" and
> > > "u2f16". And if
> > > we added the logic for nir_op_i2f16/nir_op_u2f16 cases just after
> > > the
> > > f2f16
> > > case that would yield a diff without the following three copy-
> > > paste
> > > lines as
> > > well. Or am I missing something?
> > 
> > Yes, I think you're right and if you look at this patch standalone
> > I
> > think it would make sense to do that. The thing is that later on in
> > the
> > series we have to change this further to incorporate more
> > restrictions
> > for conversions to/from integer and half-float for atom platforms,
> > so
> > having the f2f16 case separated from the {i,u}2f16 cases will make
> > more
> > sense. That would be patch 46 in the series, which comes later
> > because
> > that is when we addressed integer conversions from 8-bit and
> > noticed
> > this whole thing on atom.
> > 
> > I can still make the change you suggest in this patch and then do
> > the
> > split later on if you think that helps though. I could also try to
> > move
> > the fix for atom earlier in the series, that will lead to conflicts
> > and
> > I'd need to slightly rewrite other patches in the series to
> > accommodate
> > that, but it is certainly doable if you think that makes the commit
> > history better.
> 
> I'm not sure I understood your answer correctly, but I didn't suggest
> merging the f2f16 case with the {i,u}2f16 cases. I thought that having:
> 
>   case nir_op_f2f16:
>   ...
>   break;
> 
>   case nir_op_i2f16:
>   case nir_op_u2f16:
>   ...
>   break;
> 
>   case nir_op_b2i:
>   ...
> 
> 
> would have yielded a smaller diff than:
> 
>   case nir_op_f2f16:
>   ...
>   break;
> 
>   case nir_op_b2i:
>   ...
> 
>   case nir_op_u2u64:
>   ...
>   /* fallthrough */
> 
>   case nir_op_i2f16:
>   case nir_op_u2f16:
>   ...
>   break;
> 
>   case nir_op_f2f32
>   ...
>   break;

Ah, yes, I see what you mean now. I guess this is very subjective in
the end, but in general my preference was to separate the cases and try
to avoid too many fallthroughs, especially for large blocks of
opcodes, at least for the cases where the main benefit was to reuse
that 3-line block which is basically the MOV instruction that is going
to be there for all conversion cases. I found that as we added more
types (there are also 8-bit conversions coming up later in the series)
and some of these conversions come with additional restrictions for
specific platforms or source/destination types, 

Re: [Mesa-dev] [PATCH 12/59] intel/compiler: handle b2i/b2f with other integer conversion opcodes

2018-12-05 Thread Iago Toral
On Tue, 2018-12-04 at 18:16 +0200, Pohjolainen, Topi wrote:
> On Tue, Dec 04, 2018 at 08:16:36AM +0100, Iago Toral Quiroga wrote:
> > Since we handle booleans as integers this makes more sense.
> 
> If this is applied before patch 10, can we merge 10 and 13?

We can't apply this before patch 10 because patch 10 is the one that
splits the f264 and {i,u}264 opcodes. However, we could merge this and
patch 13 into patch 10 if that looks better to you.

Iago

> > ---
> >  src/intel/compiler/brw_fs_nir.cpp | 10 +-
> >  1 file changed, 5 insertions(+), 5 deletions(-)
> > 
> > diff --git a/src/intel/compiler/brw_fs_nir.cpp
> > b/src/intel/compiler/brw_fs_nir.cpp
> > index 9f3d3bf9762..6c765fc2661 100644
> > --- a/src/intel/compiler/brw_fs_nir.cpp
> > +++ b/src/intel/compiler/brw_fs_nir.cpp
> > @@ -801,11 +801,6 @@ fs_visitor::nir_emit_alu(const fs_builder
> > , nir_alu_instr *instr)
> >inst->saturate = instr->dest.saturate;
> >break;
> >  
> > -   case nir_op_b2i:
> > -   case nir_op_b2f:
> > -  op[0].type = BRW_REGISTER_TYPE_D;
> > -  op[0].negate = !op[0].negate;
> > -  /* fallthrough */
> > case nir_op_f2f64:
> > case nir_op_f2i64:
> > case nir_op_f2u64:
> > @@ -850,6 +845,11 @@ fs_visitor::nir_emit_alu(const fs_builder
> > , nir_alu_instr *instr)
> >inst->saturate = instr->dest.saturate;
> >break;
> >  
> > +   case nir_op_b2i:
> > +   case nir_op_b2f:
> > +  op[0].type = BRW_REGISTER_TYPE_D;
> > +  op[0].negate = !op[0].negate;
> > +  /* fallthrough */
> > case nir_op_i2f64:
> > case nir_op_i2i64:
> > case nir_op_u2f64:
> > -- 
> > 2.17.1
> > 
> > ___
> > mesa-dev mailing list
> > mesa-dev@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> 
> 



[Mesa-dev] [PATCH] radv: wait on the high 32 bits of timestamp queries

2018-12-05 Thread Samuel Pitoiset
In case we are unlucky if the low part is 0x.

Fixes: 5d6a560a29 ("radv: do not use the availability bit for timestamp 
queries")
Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_query.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/src/amd/vulkan/radv_query.c b/src/amd/vulkan/radv_query.c
index 550abe307a1..8c64c98ffaa 100644
--- a/src/amd/vulkan/radv_query.c
+++ b/src/amd/vulkan/radv_query.c
@@ -1336,8 +1336,11 @@ void radv_CmdCopyQueryPoolResults(
 
 
if (flags & VK_QUERY_RESULT_WAIT_BIT) {
+   /* Wait on the high 32 bits of the timestamp in
+* case the low part is 0x.
+*/
radv_cp_wait_mem(cs, WAIT_REG_MEM_NOT_EQUAL,
-local_src_va,
+local_src_va + 4,
 TIMESTAMP_NOT_READY >> 32,
 0x);
}
-- 
2.19.2
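A minimal sketch of why the wait moves to the upper dword. It assumes the
not-ready sentinel is an all-ones 64-bit value (the literal is garbled in the
archived text above, so this is an assumption), that the CP wait compares a
single 32-bit dword, and that the slot is stored little-endian so the high
dword sits at byte offset 4. The names are hypothetical and are not radv code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed sentinel stored in a timestamp slot when the pool is reset. */
#define SKETCH_TIMESTAMP_NOT_READY UINT64_MAX

/* Model of a 32-bit "wait until not equal" test on the low dword. */
bool sketch_low_dword_looks_ready(uint64_t slot)
{
    uint32_t low = (uint32_t)slot;
    return low != (uint32_t)SKETCH_TIMESTAMP_NOT_READY;
}

/* The same test on the high dword, i.e. the dword at byte offset 4. */
bool sketch_high_dword_looks_ready(uint64_t slot)
{
    uint32_t high = (uint32_t)(slot >> 32);
    return high != (uint32_t)(SKETCH_TIMESTAMP_NOT_READY >> 32);
}
```

Under these assumptions a written timestamp such as 0x00000001ffffffff still
looks "not ready" to the low-dword test, while the high-dword test accepts it.
The high-dword test can only be fooled by a timestamp whose upper 32 bits are
all ones.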



Re: [Mesa-dev] [PATCH] radv: reset pending_reset_query when flushing caches

2018-12-05 Thread Alex Smith
Reviewed-by: Alex Smith 

On Wed, 5 Dec 2018 at 10:32, Samuel Pitoiset 
wrote:

> If the driver used a compute shader for resetting a query pool,
> it should be completed when caches are flushed.
>
> This might reduce the number of stalls if operations are done
> between vkCmdResetQueryPool() and vkCmdBeginQuery()
> (or vkCmdWriteTimestamp()).
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_query.c| 1 -
>  src/amd/vulkan/si_cmd_buffer.c | 5 +
>  2 files changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/src/amd/vulkan/radv_query.c b/src/amd/vulkan/radv_query.c
> index e226bcef6a9..276cc1c42d7 100644
> --- a/src/amd/vulkan/radv_query.c
> +++ b/src/amd/vulkan/radv_query.c
> @@ -1447,7 +1447,6 @@ static void emit_query_flush(struct radv_cmd_buffer
> *cmd_buffer,
>  * because we use a CP dma clear.
>  */
> si_emit_cache_flush(cmd_buffer);
> -   cmd_buffer->pending_reset_query = false;
> }
> }
>  }
> diff --git a/src/amd/vulkan/si_cmd_buffer.c
> b/src/amd/vulkan/si_cmd_buffer.c
> index a9f25725415..2f57584bf82 100644
> --- a/src/amd/vulkan/si_cmd_buffer.c
> +++ b/src/amd/vulkan/si_cmd_buffer.c
> @@ -992,6 +992,11 @@ si_emit_cache_flush(struct radv_cmd_buffer
> *cmd_buffer)
> radv_cmd_buffer_trace_emit(cmd_buffer);
>
> cmd_buffer->state.flush_bits = 0;
> +
> +   /* If the driver used a compute shader for resetting a query pool,
> it
> +* should be finished at this point.
> +*/
> +   cmd_buffer->pending_reset_query = false;
>  }
>
>  /* sets the CP predication state using a boolean stored at va */
> --
> 2.19.2
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>


[Mesa-dev] [PATCH] radv: Flush before vkCmdWriteTimestamp() if needed

2018-12-05 Thread Alex Smith
As done for vkCmdBeginQuery() already. Prevents timestamps from being
overwritten by previous vkCmdResetQueryPool() calls if the shader path
was used to do the reset.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108925
Fixes: a41e2e9cf5 ("radv: allow to use a compute shader for resetting the query 
pool")
Signed-off-by: Alex Smith 
---
 src/amd/vulkan/radv_query.c | 30 +++---
 1 file changed, 19 insertions(+), 11 deletions(-)

diff --git a/src/amd/vulkan/radv_query.c b/src/amd/vulkan/radv_query.c
index 550abe307a..e226bcef6a 100644
--- a/src/amd/vulkan/radv_query.c
+++ b/src/amd/vulkan/radv_query.c
@@ -1436,6 +1436,22 @@ static unsigned event_type_for_stream(unsigned stream)
}
 }
 
+static void emit_query_flush(struct radv_cmd_buffer *cmd_buffer,
+struct radv_query_pool *pool)
+{
+   if (cmd_buffer->pending_reset_query) {
+   if (pool->size >= RADV_BUFFER_OPS_CS_THRESHOLD) {
+   /* Only need to flush caches if the query pool size is
+* large enough to be resetted using the compute shader
+* path. Small pools don't need any cache flushes
+* because we use a CP dma clear.
+*/
+   si_emit_cache_flush(cmd_buffer);
+   cmd_buffer->pending_reset_query = false;
+   }
+   }
+}
+
 static void emit_begin_query(struct radv_cmd_buffer *cmd_buffer,
 uint64_t va,
 VkQueryType query_type,
@@ -1582,17 +1598,7 @@ void radv_CmdBeginQueryIndexedEXT(
 
radv_cs_add_buffer(cmd_buffer->device->ws, cs, pool->bo);
 
-   if (cmd_buffer->pending_reset_query) {
-   if (pool->size >= RADV_BUFFER_OPS_CS_THRESHOLD) {
-   /* Only need to flush caches if the query pool size is
-* large enough to be resetted using the compute shader
-* path. Small pools don't need any cache flushes
-* because we use a CP dma clear.
-*/
-   si_emit_cache_flush(cmd_buffer);
-   cmd_buffer->pending_reset_query = false;
-   }
-   }
+   emit_query_flush(cmd_buffer, pool);
 
va += pool->stride * query;
 
@@ -1669,6 +1675,8 @@ void radv_CmdWriteTimestamp(
 
radv_cs_add_buffer(cmd_buffer->device->ws, cs, pool->bo);
 
+   emit_query_flush(cmd_buffer, pool);
+
int num_queries = 1;
if (cmd_buffer->state.subpass && cmd_buffer->state.subpass->view_mask)
num_queries = 
util_bitcount(cmd_buffer->state.subpass->view_mask);
-- 
2.19.1
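The control flow of the shared helper can be modelled in isolation. This is a
sketch only: the struct, the function names and the threshold value are
hypothetical (the patch does not show the value of
RADV_BUFFER_OPS_CS_THRESHOLD), and a counter stands in for the real cache
flush.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical threshold; the real value is not given in the patch. */
#define SKETCH_CS_THRESHOLD 4096u

struct sketch_cmd_buffer {
    bool pending_reset_query; /* set when a pool reset was recorded */
    unsigned flush_count;     /* counts emitted cache flushes */
};

void sketch_reset_pool(struct sketch_cmd_buffer *cmd)
{
    cmd->pending_reset_query = true;
}

/* Shared helper: flush only when a reset is pending and the pool is large
 * enough to have been reset through the compute shader path. */
void sketch_emit_query_flush(struct sketch_cmd_buffer *cmd, uint32_t pool_size)
{
    if (cmd->pending_reset_query && pool_size >= SKETCH_CS_THRESHOLD) {
        cmd->flush_count++;
        cmd->pending_reset_query = false;
    }
}

/* Both entry points call the same helper before touching the pool. */
void sketch_begin_query(struct sketch_cmd_buffer *cmd, uint32_t pool_size)
{
    sketch_emit_query_flush(cmd, pool_size);
}

void sketch_write_timestamp(struct sketch_cmd_buffer *cmd, uint32_t pool_size)
{
    sketch_emit_query_flush(cmd, pool_size);
}
```

In this model a timestamp write that follows a reset of a large pool triggers
exactly one flush, and a later begin-query on the same command buffer does not
flush again. A small pool leaves the flag set, matching the behaviour of the
code that was moved into the helper.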



[Mesa-dev] [PATCH] anv/android: Do not reject storage images.

2018-12-05 Thread Bas Nieuwenhuizen
We do the ImageFormatProperties check already, and rejecting a usage
flag when both ImageFormatProperties and the WSI (which is Android)
support it is not allowed.

Intel does support storage for some of the supported WSI formats, such
as R8G8B8A8_UNORM, and looking at the ISL_SURF_USAGE_DISABLE_AUX_BIT,
the imported images do not have any form of compression that would
prevent this fix.

Fixes: 053d4c328fa "anv: Implement VK_ANDROID_native_buffer (v9)"
CC: Jason Ekstrand 
---
 src/intel/vulkan/anv_android.c | 7 ---
 1 file changed, 7 deletions(-)

diff --git a/src/intel/vulkan/anv_android.c b/src/intel/vulkan/anv_android.c
index a3bab8087b4..92c3787b49b 100644
--- a/src/intel/vulkan/anv_android.c
+++ b/src/intel/vulkan/anv_android.c
@@ -268,13 +268,6 @@ VkResult anv_GetSwapchainGrallocUsageANDROID(
"inside %s", __func__);
}
 
-   /* Reject STORAGE here to avoid complexity elsewhere. */
-   if (imageUsage & VK_IMAGE_USAGE_STORAGE_BIT) {
-  return vk_errorf(device->instance, device, VK_ERROR_FORMAT_NOT_SUPPORTED,
-   "VK_IMAGE_USAGE_STORAGE_BIT unsupported for gralloc "
-   "swapchain");
-   }
-
if (unmask32(, VK_IMAGE_USAGE_TRANSFER_DST_BIT |
  VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT))
   *grallocUsage |= GRALLOC_USAGE_HW_RENDER;
-- 
2.20.0.rc1.387.gf8505762e3-goog
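For readers unfamiliar with the unmask32() pattern visible in the context
lines, here is a rough sketch of how such a helper is assumed to behave. The
first argument is missing from the archived text; the sketch assumes it is a
pointer to the usage mask. All names and bit values below are hypothetical
stand-ins, not the real Vulkan or gralloc constants.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit values, chosen only for this sketch. */
#define SKETCH_IMG_TRANSFER_DST     0x02u
#define SKETCH_IMG_STORAGE          0x08u
#define SKETCH_IMG_COLOR_ATTACHMENT 0x10u
#define SKETCH_GRALLOC_HW_RENDER    0x200u

/* Clear test_mask from *inout_mask and report whether any bit was cleared. */
bool sketch_unmask32(uint32_t *inout_mask, uint32_t test_mask)
{
    uint32_t before = *inout_mask;
    *inout_mask &= ~test_mask;
    return *inout_mask != before;
}

/* Translate the render-related usage bits and return whatever is left over.
 * Bits that remain set have not been mapped to a gralloc flag here. */
uint32_t sketch_map_render_usage(uint32_t image_usage, uint32_t *gralloc_usage)
{
    *gralloc_usage = 0;
    if (sketch_unmask32(&image_usage, SKETCH_IMG_TRANSFER_DST |
                                      SKETCH_IMG_COLOR_ATTACHMENT))
        *gralloc_usage |= SKETCH_GRALLOC_HW_RENDER;
    return image_usage;
}
```

With this shape, each group of usage bits is consumed as it is translated, so
any bit still set at the end is one the function has no mapping for.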



Re: [Mesa-dev] [PATCH 22/59] compiler/nir: add lowering for 16-bit ldexp

2018-12-05 Thread Pohjolainen, Topi
On Wed, Dec 05, 2018 at 11:53:44AM +0100, Iago Toral wrote:
> On Wed, 2018-12-05 at 11:39 +0200, Pohjolainen, Topi wrote:
> > I remember people preferring to order things 16, 32, 64 before. Should
> > we follow that here as well?
> 
> Yes, it makes sense. I'll change that.
> 
> > On Tue, Dec 04, 2018 at 08:16:46AM +0100, Iago Toral Quiroga wrote:
> > > ---
> > >  src/compiler/nir/nir_opt_algebraic.py | 5 +
> > >  1 file changed, 5 insertions(+)
> > > 
> > > diff --git a/src/compiler/nir/nir_opt_algebraic.py
> > > b/src/compiler/nir/nir_opt_algebraic.py
> > > index 6c3b77c9b6e..747f1751086 100644
> > > --- a/src/compiler/nir/nir_opt_algebraic.py
> > > +++ b/src/compiler/nir/nir_opt_algebraic.py
> > > @@ -778,6 +778,8 @@ def fexp2i(exp, bits):
> > >return ('ishl', ('iadd', exp, 127), 23)
> > > elif bits == 64:
> > >return ('pack_64_2x32_split', 0, ('ishl', ('iadd', exp,
> > > 1023), 20))
> > > +   elif bits == 16:
> > > +  return ('i2i16', ('ishl', ('iadd', exp, 15), 10))
> > > else:
> > >assert False
> > >  
> > > @@ -796,6 +798,8 @@ def ldexp(f, exp, bits):
> > >exp = ('imin', ('imax', exp, -252), 254)
> > > elif bits == 64:
> > >exp = ('imin', ('imax', exp, -2044), 2046)
> > > +   elif bits == 16:
> > > +  exp = ('imin', ('imax', exp, -30), 30)
> > 
> > I expected this to be:
> > 
> >  exp = ('imin', ('imax', exp, -29), 30)
> 
> Actually, I think this should be -28, since the minimum exponent value
> is -14.

I kept wondering about that. The offset is 15, and -14 - 15 yields -29. But -28
in turn would be more in line with the 32- and 64-bit cases.
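The arithmetic behind the candidate bounds can be sketched as follows. This
assumes the clamp is meant to be twice the normalized exponent range of the
format, which is an assumption on my side, but it does reproduce the 32-bit and
64-bit constants in the quoted patch. The function names are made up for
illustration.

```c
/* Exponent bias of an IEEE 754 binary format with exp_bits exponent bits. */
int sketch_exponent_bias(int exp_bits)
{
    return (1 << (exp_bits - 1)) - 1;
}

/* Smallest exponent of a normalized value: 1 - bias. */
int sketch_min_normal_exponent(int exp_bits)
{
    return 1 - sketch_exponent_bias(exp_bits);
}

/* Largest exponent of a finite value, which equals the bias. */
int sketch_max_exponent(int exp_bits)
{
    return sketch_exponent_bias(exp_bits);
}

/* Assumed clamp bounds: twice the normalized exponent range. */
int sketch_ldexp_clamp_min(int exp_bits)
{
    return 2 * sketch_min_normal_exponent(exp_bits);
}

int sketch_ldexp_clamp_max(int exp_bits)
{
    return 2 * sketch_max_exponent(exp_bits);
}
```

For 8 exponent bits this gives -252 and 254, for 11 bits -2044 and 2046, and
for the 5 exponent bits of half-float -28 and 30.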

> 
> > > else:
> > >assert False
> > >  
> > > @@ -814,6 +818,7 @@ def ldexp(f, exp, bits):
> > >  optimizations += [
> > > (('ldexp@32', 'x', 'exp'), ldexp('x', 'exp', 32), 'options-
> > > >lower_ldexp'),
> > > (('ldexp@64', 'x', 'exp'), ldexp('x', 'exp', 64), 'options-
> > > >lower_ldexp'),
> > > +   (('ldexp@16', 'x', 'exp'), ldexp('x', 'exp', 16), 'options-
> > > >lower_ldexp'),
> > >  ]
> > >  
> > >  # Unreal Engine 4 demo applications open-codes bitfieldReverse()
> > > -- 
> > > 2.17.1
> > > 
> > > ___
> > > mesa-dev mailing list
> > > mesa-dev@lists.freedesktop.org
> > > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> > 
> > 
> 


[Mesa-dev] [Bug 108949] RADV: Subgroup codegen is sub-optimal

2018-12-05 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108949

--- Comment #2 from mais...@archlinux.us ---
Interesting. No, haven't tried with an LLVM that recent. I'll post when I have
results.



Re: [Mesa-dev] [PATCH] anv/android: handle storage images in vkGetSwapchainGrallocUsageANDROID

2018-12-05 Thread Tapani Pälli



On 12/5/18 1:44 PM, Bas Nieuwenhuizen wrote:

On Wed, Dec 5, 2018 at 12:37 PM Tapani Pälli  wrote:




On 12/5/18 1:22 PM, Bas Nieuwenhuizen wrote:

On Wed, Dec 5, 2018 at 12:15 PM Tapani Pälli  wrote:




On 12/5/18 1:01 PM, Bas Nieuwenhuizen wrote:

On Fri, Sep 7, 2018 at 12:54 AM Kevin Strasser  wrote:


Android P and earlier expect that the surface supports storage images, and
so many of the tests fail when the framework checks for that support. The
framework also includes various image format and usage combinations that are
invalid for the hardware.

Drop the STORAGE restriction from the HAL and whitelist a pair of
formats so that existing versions of Android can pass these tests.

Fixes:
  dEQP-VK.wsi.android.*

Signed-off-by: Kevin Strasser 
---
src/intel/vulkan/anv_android.c | 23 ++-
1 file changed, 14 insertions(+), 9 deletions(-)

diff --git a/src/intel/vulkan/anv_android.c b/src/intel/vulkan/anv_android.c
index 46c41d5..e2640b8 100644
--- a/src/intel/vulkan/anv_android.c
+++ b/src/intel/vulkan/anv_android.c
@@ -234,7 +234,7 @@ VkResult anv_GetSwapchainGrallocUsageANDROID(
   *grallocUsage = 0;
   intel_logd("%s: format=%d, usage=0x%x", __func__, format, imageUsage);

-   /* WARNING: Android Nougat's libvulkan.so hardcodes the VkImageUsageFlags
+   /* WARNING: Android's libvulkan.so hardcodes the VkImageUsageFlags
* returned to applications via 
VkSurfaceCapabilitiesKHR::supportedUsageFlags.
* The relevant code in libvulkan/swapchain.cpp contains this fun 
comment:
*
@@ -247,7 +247,7 @@ VkResult anv_GetSwapchainGrallocUsageANDROID(
* dEQP-VK.wsi.android.swapchain.*.image_usage to fail.
*/

-   const VkPhysicalDeviceImageFormatInfo2KHR image_format_info = {
+   VkPhysicalDeviceImageFormatInfo2KHR image_format_info = {


Why remove the const here?


  .sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_IMAGE_FORMAT_INFO_2_KHR,
  .format = format,
  .type = VK_IMAGE_TYPE_2D,
@@ -255,6 +255,17 @@ VkResult anv_GetSwapchainGrallocUsageANDROID(
  .usage = imageUsage,
   };

+   /* Android P and earlier doesn't check if the physical device supports a
+* given format and usage combination before calling this function. Omit the
+* storage requirement to make the tests pass.
+*/
+#if ANDROID_API_LEVEL <= 28
+   if (format == VK_FORMAT_R8G8B8A8_SRGB ||
+   format == VK_FORMAT_R5G6B5_UNORM_PACK16) {
+  image_format_info.usage &= ~VK_IMAGE_USAGE_STORAGE_BIT;
+   }
+#endif


I don't think you need this. Per the Vulkan spec you can only use a
format + usage combination for a swapchain if it is supported per
ImageFormatProperties, using essentially the same check happening
above. I know CTS has been bad at this, but Vulkan CTS should have
been fixed for a bit now. (I don't think all the fixes are in Android
CTS 9.0_r4 yet, maybe the next release?)


AFAIK the problem here is not about CTS. It's the swapchain
implementation that always requires storage support.


Actually swapchain creation has the following valid usage rule:

"The implied image creation parameters of the swapchain must be
supported as reported by vkGetPhysicalDeviceImageFormatProperties"

So since those formats don't support the STORAGE usage bit, that test
fails and you are not allowed to create a swapchain with those formats
and storage, even if the surface capabilities expose the STORAGE
usage bit in general.


Right ... this stuff was done because the comment in the swapchain code that
sets the bits suggests it may not have been thought through:

// TODO(jessehall): I think these are right, but haven't thought hard about
// it. Do we need to query the driver for support of any of these?


That was from before the spec was changed to add that rule.


OK, if I understand correctly, should we instead try to fix those
tests so that they skip rather than fail?







(Also silently removing the usage bit is bad, because the app could
try actually using image stores with the image ...)


True, it is not nice ..



+
   VkImageFormatProperties2KHR image_format_props = {
  .sType = VK_STRUCTURE_TYPE_IMAGE_FORMAT_PROPERTIES_2_KHR,
   };
@@ -268,19 +279,13 @@ VkResult anv_GetSwapchainGrallocUsageANDROID(
   "inside %s", __func__);
   }

-   /* Reject STORAGE here to avoid complexity elsewhere. */
-   if (imageUsage & VK_IMAGE_USAGE_STORAGE_BIT) {
-  return vk_errorf(device->instance, device, VK_ERROR_FORMAT_NOT_SUPPORTED,
-   "VK_IMAGE_USAGE_STORAGE_BIT unsupported for gralloc "
-   "swapchain");
-   }
-
   if (unmask32(, VK_IMAGE_USAGE_TRANSFER_DST_BIT |
 VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT))
  *grallocUsage |= GRALLOC_USAGE_HW_RENDER;

   if (unmask32(, VK_IMAGE_USAGE_TRANSFER_SRC_BIT |
 VK_IMAGE_USAGE_SAMPLED_BIT |
+ VK_IMAGE_USAGE_STORAGE_BIT 

Re: [Mesa-dev] [PATCH 0/5] Fixueps for ppc64 and gnu hurd

2018-12-05 Thread Timo Aaltonen
On 4.12.2018 23.52, Dylan Baker wrote:
> This little series is aimed at fixing problems reported by fedora and debian
> when using meson, there's a couple of patches in here for fixing ppc64 
> detection
> (tested without llvm), and a couple for gnu hurd (not tested).
> 
> Dylan Baker (5):
>   meson: remove duplicate definition
>   meson: Fix ppc64 little endian detection
>   meson: Override C++ standard to gnu++11 when building with altivec on
> ppc64le
>   meson: Add support for gnu hurd
>   meson: Add toggle for glx-direct
> 
>  meson.build   | 33 ---
>  meson_options.txt |  6 
>  src/gallium/state_trackers/clover/meson.build |  3 ++
>  3 files changed, 31 insertions(+), 11 deletions(-)
> 

Thanks, I'll give these a try soon.

-- 
t


Re: [Mesa-dev] [PATCH 28/59] intel/compiler: set correct precision fields for 3-source float instructions

2018-12-05 Thread Pohjolainen, Topi
On Tue, Dec 04, 2018 at 08:16:52AM +0100, Iago Toral Quiroga wrote:
> Source0 and Destination extract the floating-point precision automatically
> from the SrcType and DstType instruction fields respectively when they are
> set to types :F or :HF. For Source1 and Source2 operands, we use the new
> 1-bit fields Src1Type and Src2Type, where 0 means normal precision and 1
> means half-precision. Since we always use the type of the destination for
> all operands when we emit 3-source instructions, we only need set Src1Type
> and Src2Type to 1 when we are emitting a half-precision instruction.
> ---
>  src/intel/compiler/brw_eu_emit.c | 5 +
>  1 file changed, 5 insertions(+)
> 
> diff --git a/src/intel/compiler/brw_eu_emit.c 
> b/src/intel/compiler/brw_eu_emit.c
> index 2c9fc9a5c7c..66edfb43baf 100644
> --- a/src/intel/compiler/brw_eu_emit.c
> +++ b/src/intel/compiler/brw_eu_emit.c
> @@ -801,6 +801,11 @@ brw_alu3(struct brw_codegen *p, unsigned opcode, struct 
> brw_reg dest,
>*/
>   brw_inst_set_3src_a16_src_type(devinfo, inst, dest.type);
>   brw_inst_set_3src_a16_dst_type(devinfo, inst, dest.type);
> +
> + if (devinfo->gen >= 8 && dest.type == BRW_REGISTER_TYPE_HF) {
> +brw_inst_set_3src_a16_src1_type(devinfo, inst, 1);
> +brw_inst_set_3src_a16_src2_type(devinfo, inst, 1);
> + }

I had a similar patch which prepares for mixed mode (useful for linterp with
32-bit input varyings):

 /* From the Bspec: Instruction types
  *
  * Three source instructions can use operands with mixed-mode
  * precision. When SrcType field is set to :f or :hf it defines
  * precision for source 0 only, and fields Src1Type and Src2Type
  * define precision for other source operands:
  *
  *   0b = :f. Single precision Float (32-bit).
  *   1b = :hf. Half precision Float (16-bit).
  */
 if (src1.type == BRW_REGISTER_TYPE_HF)
brw_inst_set_3src_src1_type(devinfo, inst, 1);

 if (src2.type == BRW_REGISTER_TYPE_HF)
brw_inst_set_3src_src2_type(devinfo, inst, 1);

How would you feel about that? (This is a direct cut-and-paste, so the helpers
have different names.)
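To make the difference between the two variants concrete, here is a small
stand-alone model. The enum, struct and function names are hypothetical
stand-ins for the brw types and setters, and the gen >= 8 check is left out.

```c
/* Hypothetical stand-ins for BRW_REGISTER_TYPE_F and BRW_REGISTER_TYPE_HF. */
enum sketch_reg_type { SKETCH_TYPE_F, SKETCH_TYPE_HF };

/* The two 1-bit fields: 0 means :f (32-bit), 1 means :hf (16-bit). */
struct sketch_3src_bits {
    unsigned src1_type;
    unsigned src2_type;
};

/* Variant from the patch: both bits follow the destination type. */
struct sketch_3src_bits sketch_bits_from_dest(enum sketch_reg_type dest)
{
    struct sketch_3src_bits bits;
    bits.src1_type = (dest == SKETCH_TYPE_HF) ? 1u : 0u;
    bits.src2_type = bits.src1_type;
    return bits;
}

/* Variant from this reply: each bit follows the type of its own source. */
struct sketch_3src_bits sketch_bits_per_source(enum sketch_reg_type src1,
                                               enum sketch_reg_type src2)
{
    struct sketch_3src_bits bits;
    bits.src1_type = (src1 == SKETCH_TYPE_HF) ? 1u : 0u;
    bits.src2_type = (src2 == SKETCH_TYPE_HF) ? 1u : 0u;
    return bits;
}
```

The two variants agree whenever all operands share the destination type. Only
the per-source variant can describe a mixed-mode instruction, for example a
32-bit source 1 combined with a half-float source 2.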

>}
> }
>  
> -- 
> 2.17.1
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 27/59] intel/compiler: allow half-float on 3-source instructions since gen8

2018-12-05 Thread Pohjolainen, Topi

Reviewed-by: Topi Pohjolainen 

On Tue, Dec 04, 2018 at 08:16:51AM +0100, Iago Toral Quiroga wrote:
> ---
>  src/intel/compiler/brw_eu_emit.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/src/intel/compiler/brw_eu_emit.c 
> b/src/intel/compiler/brw_eu_emit.c
> index 5f066d17a1f..2c9fc9a5c7c 100644
> --- a/src/intel/compiler/brw_eu_emit.c
> +++ b/src/intel/compiler/brw_eu_emit.c
> @@ -755,7 +755,8 @@ brw_alu3(struct brw_codegen *p, unsigned opcode, struct 
> brw_reg dest,
>assert(dest.type == BRW_REGISTER_TYPE_F  ||
>   dest.type == BRW_REGISTER_TYPE_DF ||
>   dest.type == BRW_REGISTER_TYPE_D  ||
> - dest.type == BRW_REGISTER_TYPE_UD);
> + dest.type == BRW_REGISTER_TYPE_UD ||
> + (dest.type == BRW_REGISTER_TYPE_HF && devinfo->gen >= 8));
>if (devinfo->gen == 6) {
>   brw_inst_set_3src_a16_dst_reg_file(devinfo, inst,
>  dest.file == 
> BRW_MESSAGE_REGISTER_FILE);
> -- 
> 2.17.1
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] spirv: add SpvCapabilityInt64Atomics

2018-12-05 Thread Samuel Pitoiset
Required for VK_KHR_shader_atomic_int64.

Signed-off-by: Samuel Pitoiset 
---
 src/compiler/shader_info.h| 1 +
 src/compiler/spirv/spirv_to_nir.c | 5 -
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/src/compiler/shader_info.h b/src/compiler/shader_info.h
index 65bc0588d67..b95cc310fd6 100644
--- a/src/compiler/shader_info.h
+++ b/src/compiler/shader_info.h
@@ -62,6 +62,7 @@ struct spirv_supported_capabilities {
bool post_depth_coverage;
bool transform_feedback;
bool geometry_streams;
+   bool int64_atomics;
 };
 
 typedef struct shader_info {
diff --git a/src/compiler/spirv/spirv_to_nir.c 
b/src/compiler/spirv/spirv_to_nir.c
index a05c4d236ca..22efaa276d9 100644
--- a/src/compiler/spirv/spirv_to_nir.c
+++ b/src/compiler/spirv/spirv_to_nir.c
@@ -3416,7 +3416,6 @@ vtn_handle_preamble_instruction(struct vtn_builder *b, 
SpvOp opcode,
   case SpvCapabilityVector16:
   case SpvCapabilityFloat16Buffer:
   case SpvCapabilityFloat16:
-  case SpvCapabilityInt64Atomics:
   case SpvCapabilityStorageImageMultisample:
   case SpvCapabilityInt8:
   case SpvCapabilitySparseResidency:
@@ -3447,6 +3446,10 @@ vtn_handle_preamble_instruction(struct vtn_builder *b, 
SpvOp opcode,
  spv_check_supported(geometry_streams, cap);
  break;
 
+  case SpvCapabilityInt64Atomics:
+ spv_check_supported(int64_atomics, cap);
+ break;
+
   case SpvCapabilityAddresses:
   case SpvCapabilityKernel:
   case SpvCapabilityImageBasic:
-- 
2.19.2



Re: [Mesa-dev] [PATCH] anv/android: handle storage images in vkGetSwapchainGrallocUsageANDROID

2018-12-05 Thread Bas Nieuwenhuizen
On Wed, Dec 5, 2018 at 12:15 PM Tapani Pälli  wrote:
>
>
>
> On 12/5/18 1:01 PM, Bas Nieuwenhuizen wrote:
> > On Fri, Sep 7, 2018 at 12:54 AM Kevin Strasser  
> > wrote:
> >>
> >> Android P and earlier expect that the surface supports storage images, and
> >> so many of the tests fail when the framework checks for that support. The
> >> framework also includes various image format and usage combinations that 
> >> are
> >> invalid for the hardware.
> >>
> >> Drop the STORAGE restriction from the HAL and whitelist a pair of
> >> formats so that existing versions of Android can pass these tests.
> >>
> >> Fixes:
> >> dEQP-VK.wsi.android.*
> >>
> >> Signed-off-by: Kevin Strasser 
> >> ---
> >>   src/intel/vulkan/anv_android.c | 23 ++-
> >>   1 file changed, 14 insertions(+), 9 deletions(-)
> >>
> >> diff --git a/src/intel/vulkan/anv_android.c 
> >> b/src/intel/vulkan/anv_android.c
> >> index 46c41d5..e2640b8 100644
> >> --- a/src/intel/vulkan/anv_android.c
> >> +++ b/src/intel/vulkan/anv_android.c
> >> @@ -234,7 +234,7 @@ VkResult anv_GetSwapchainGrallocUsageANDROID(
> >>  *grallocUsage = 0;
> >>  intel_logd("%s: format=%d, usage=0x%x", __func__, format, imageUsage);
> >>
> >> -   /* WARNING: Android Nougat's libvulkan.so hardcodes the 
> >> VkImageUsageFlags
> >> +   /* WARNING: Android's libvulkan.so hardcodes the VkImageUsageFlags
> >>   * returned to applications via 
> >> VkSurfaceCapabilitiesKHR::supportedUsageFlags.
> >>   * The relevant code in libvulkan/swapchain.cpp contains this fun 
> >> comment:
> >>   *
> >> @@ -247,7 +247,7 @@ VkResult anv_GetSwapchainGrallocUsageANDROID(
> >>   * dEQP-VK.wsi.android.swapchain.*.image_usage to fail.
> >>   */
> >>
> >> -   const VkPhysicalDeviceImageFormatInfo2KHR image_format_info = {
> >> +   VkPhysicalDeviceImageFormatInfo2KHR image_format_info = {
> >
> > Why remove the const here?
> >
> >> .sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_IMAGE_FORMAT_INFO_2_KHR,
> >> .format = format,
> >> .type = VK_IMAGE_TYPE_2D,
> >> @@ -255,6 +255,17 @@ VkResult anv_GetSwapchainGrallocUsageANDROID(
> >> .usage = imageUsage,
> >>  };
> >>
> >> +   /* Android P and earlier doesn't check if the physical device supports 
> >> a
> >> +* given format and usage combination before calling this function. 
> >> Omit the
> >> +* storage requirement to make the tests pass.
> >> +*/
> >> +#if ANDROID_API_LEVEL <= 28
> >> +   if (format == VK_FORMAT_R8G8B8A8_SRGB ||
> >> +   format == VK_FORMAT_R5G6B5_UNORM_PACK16) {
> >> +  image_format_info.usage &= ~VK_IMAGE_USAGE_STORAGE_BIT;
> >> +   }
> >> +#endif
> >
> > I don't think you need this. Per the Vulkan spec you can only use a
> > format + usage combination for a swapchain if it is supported per
> > ImageFormatProperties, using essentially the same check happening
> > above. I know CTS has been bad at this, but Vulkan CTS should have
> > been fixed for a bit now. (I don't think all the fixes are in Android
> > CTS 9.0_r4 yet, maybe the next release?)
>
> AFAIK the problem here is not about CTS. It's the swapchain
> implementation that always requires storage support.

Actually swapchain creation has the following valid usage rule:

"The implied image creation parameters of the swapchain must be
supported as reported by vkGetPhysicalDeviceImageFormatProperties"

So since those formats don't support the STORAGE usage bit, that test
fails and you are not allowed to create a swapchain with those formats
and storage, even if the surface capabilities expose the STORAGE
usage bit in general.
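Put as a bitmask check, the rule reads roughly like this. It is a sketch under
the assumption that "supported" means every requested usage bit is present in
both the surface capabilities and the per-format properties. The constants and
the function name are hypothetical and are not the real Vulkan values.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical usage bits; not the real VkImageUsageFlagBits values. */
#define SKETCH_USAGE_COLOR_ATTACHMENT 0x1u
#define SKETCH_USAGE_SAMPLED          0x2u
#define SKETCH_USAGE_STORAGE          0x4u

/* A request is acceptable only if every requested bit is advertised by the
 * surface and is also supported for the chosen format. */
bool sketch_swapchain_usage_valid(uint32_t requested,
                                  uint32_t surface_usage,
                                  uint32_t format_usage)
{
    return (requested & ~(surface_usage & format_usage)) == 0;
}
```

So a surface that advertises STORAGE in general does not make STORAGE valid for
a format whose own properties lack it; the application is expected to query
the format first and drop the bit itself.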

>
> > (Also silently removing the usage bit is bad, because the app could
> > try actually using image stores with the image ...)
>
> True, it is not nice ..
>
>
> >> +
> >>  VkImageFormatProperties2KHR image_format_props = {
> >> .sType = VK_STRUCTURE_TYPE_IMAGE_FORMAT_PROPERTIES_2_KHR,
> >>  };
> >> @@ -268,19 +279,13 @@ VkResult anv_GetSwapchainGrallocUsageANDROID(
> >>  "inside %s", __func__);
> >>  }
> >>
> >> -   /* Reject STORAGE here to avoid complexity elsewhere. */
> >> -   if (imageUsage & VK_IMAGE_USAGE_STORAGE_BIT) {
> >> -  return vk_errorf(device->instance, device, 
> >> VK_ERROR_FORMAT_NOT_SUPPORTED,
> >> -   "VK_IMAGE_USAGE_STORAGE_BIT unsupported for 
> >> gralloc "
> >> -   "swapchain");
> >> -   }
> >> -
> >>  if (unmask32(, VK_IMAGE_USAGE_TRANSFER_DST_BIT |
> >>VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT))
> >> *grallocUsage |= GRALLOC_USAGE_HW_RENDER;
> >>
> >>  if (unmask32(, VK_IMAGE_USAGE_TRANSFER_SRC_BIT |
> >>VK_IMAGE_USAGE_SAMPLED_BIT |
> >> + VK_IMAGE_USAGE_STORAGE_BIT |
> >>VK_IMAGE_USAGE_INPUT_ATTACHMENT_BIT))
> >> *grallocUsage |= GRALLOC_USAGE_HW_TEXTURE;
> >>
> >> --
> >> 2.7.4
> >>
> >> 

Re: [Mesa-dev] [PATCH 3/5] meson: Override C++ standard to gnu++11 when building with altivec on ppc64le

2018-12-05 Thread Eric Engestrom
On Tuesday, 2018-12-04 13:52:19 -0800, Dylan Baker wrote:
> Otherwise there will be symbol collisions for the vector name.
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108943
> Fixes: 34bbb24ce7702658cdc4e9d34a650e169716c39e
>("meson: Add support for ppc assembly/optimizations")
> ---
>  meson.build   | 12 
>  src/gallium/state_trackers/clover/meson.build |  3 +++
>  2 files changed, 15 insertions(+)
> 
> diff --git a/meson.build b/meson.build
> index 3d07c88364a..0785609c4b0 100644
> --- a/meson.build
> +++ b/meson.build
> @@ -624,6 +624,7 @@ if with_gallium_st_nine
>endif
>  endif
>  
> +clover_cpp_std = []
>  if get_option('power8') != 'false'
># on old versions of meson the cpu family would return as ppc64le on little
># endian power8, this was changed in 0.48 such that the family would always
> @@ -631,6 +632,7 @@ if get_option('power8') != 'false'
># should be checked. Since we support versions < 0.48 we need to use
># startswith.
>if host_machine.cpu_family().startswith('ppc64') and host_machine.endian() 
> == 'little'
> +_test_args = []
>  if cc.get_id() == 'gcc' and cc.version().version_compare('< 4.8')
>error('Altivec is not supported with gcc version < 4.8.')
>  endif
> @@ -645,9 +647,19 @@ if get_option('power8') != 'false'
>  args : '-mpower8-vector',
>  name : 'POWER8 intrinsics')
>pre_args += ['-D_ARCH_PWR8', '-mpower8-vector']
> +  _test_args += ['-D_ARCH_PWR8', '-mpower8-vector']
>  elif get_option('power8') == 'true'
>error('POWER8 intrinsic support required but not found.')
>  endif
> +
> +if cpp.compiles('''
> +#if !defined(__VEC__) || !defined(__ALTIVEC__)
> +#error "AltiVec not enabled"
> +#endif''',
> +args : _test_args,
> +name : 'Altivec')
> +  clover_cpp_std += ['cpp_std=gnu++11']
> +endif

This doesn't look quite right, but I don't trust my brain right now;
I'll have a look at it again later.

In the meantime, the rest of this series is:
Reviewed-by: Eric Engestrom 

>endif
>  endif
>  
> diff --git a/src/gallium/state_trackers/clover/meson.build 
> b/src/gallium/state_trackers/clover/meson.build
> index 1a09d8f2ca9..a6729af2fb8 100644
> --- a/src/gallium/state_trackers/clover/meson.build
> +++ b/src/gallium/state_trackers/clover/meson.build
> @@ -30,6 +30,7 @@ libcltgsi = static_library(
>files('tgsi/compiler.cpp', 'tgsi/invocation.hpp'),
>include_directories : clover_incs,
>cpp_args : [cpp_vis_args],
> +  override_options : clover_cpp_std,
>  )
>  
>  libclllvm = static_library(
> @@ -56,6 +57,7 @@ libclllvm = static_library(
>  )),
>],
>dependencies : [dep_llvm, dep_elf],
> +  override_options : clover_cpp_std,
>  )
>  
>  clover_files = files(
> @@ -119,4 +121,5 @@ libclover = static_library(
>include_directories : clover_incs,
>cpp_args : [clover_cpp_args, cpp_vis_args],
>link_with : [libcltgsi, libclllvm],
> +  override_options : clover_cpp_std,
>  )
> -- 
> 2.19.2
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] anv/android: handle storage images in vkGetSwapchainGrallocUsageANDROID

2018-12-05 Thread Tapani Pälli



On 12/5/18 1:22 PM, Bas Nieuwenhuizen wrote:

On Wed, Dec 5, 2018 at 12:15 PM Tapani Pälli  wrote:




On 12/5/18 1:01 PM, Bas Nieuwenhuizen wrote:

On Fri, Sep 7, 2018 at 12:54 AM Kevin Strasser  wrote:


Android P and earlier expect that the surface supports storage images, and
so many of the tests fail when the framework checks for that support. The
framework also includes various image format and usage combinations that are
invalid for the hardware.

Drop the STORAGE restriction from the HAL and whitelist a pair of
formats so that existing versions of Android can pass these tests.

Fixes:
 dEQP-VK.wsi.android.*

Signed-off-by: Kevin Strasser 
---
   src/intel/vulkan/anv_android.c | 23 ++-
   1 file changed, 14 insertions(+), 9 deletions(-)

diff --git a/src/intel/vulkan/anv_android.c b/src/intel/vulkan/anv_android.c
index 46c41d5..e2640b8 100644
--- a/src/intel/vulkan/anv_android.c
+++ b/src/intel/vulkan/anv_android.c
@@ -234,7 +234,7 @@ VkResult anv_GetSwapchainGrallocUsageANDROID(
  *grallocUsage = 0;
  intel_logd("%s: format=%d, usage=0x%x", __func__, format, imageUsage);

-   /* WARNING: Android Nougat's libvulkan.so hardcodes the VkImageUsageFlags
+   /* WARNING: Android's libvulkan.so hardcodes the VkImageUsageFlags
   * returned to applications via 
VkSurfaceCapabilitiesKHR::supportedUsageFlags.
   * The relevant code in libvulkan/swapchain.cpp contains this fun comment:
   *
@@ -247,7 +247,7 @@ VkResult anv_GetSwapchainGrallocUsageANDROID(
   * dEQP-VK.wsi.android.swapchain.*.image_usage to fail.
   */

-   const VkPhysicalDeviceImageFormatInfo2KHR image_format_info = {
+   VkPhysicalDeviceImageFormatInfo2KHR image_format_info = {


Why remove the const here?


 .sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_IMAGE_FORMAT_INFO_2_KHR,
 .format = format,
 .type = VK_IMAGE_TYPE_2D,
@@ -255,6 +255,17 @@ VkResult anv_GetSwapchainGrallocUsageANDROID(
 .usage = imageUsage,
  };

+   /* Android P and earlier doesn't check if the physical device supports a
+* given format and usage combination before calling this function. Omit the
+* storage requirement to make the tests pass.
+*/
+#if ANDROID_API_LEVEL <= 28
+   if (format == VK_FORMAT_R8G8B8A8_SRGB ||
+   format == VK_FORMAT_R5G6B5_UNORM_PACK16) {
+  image_format_info.usage &= ~VK_IMAGE_USAGE_STORAGE_BIT;
+   }
+#endif


I don't think you need this. Per the Vulkan spec you can only use a
format + usage combination for a swapchain if it is supported per
ImageFormatProperties, using essentially the same check happening
above. I know CTS has been bad at this, but Vulkan CTS should have
been fixed for a bit now. (I don't think all the fixes are in Android
CTS 9.0_r4 yet, maybe the next release?)


AFAIK the problem here is not about CTS. It's the swapchain
implementation that always requires storage support.


Actually swapchain creation has the following valid usage rule:

"The implied image creation parameters of the swapchain must be
supported as reported by vkGetPhysicalDeviceImageFormatProperties"

So since those formats don't support the STORAGE usage bit, that test
fails and you are not allowed to create a swapchain with those formats
and storage, even if the surface capabilities expose the STORAGE
usage bit in general.


Right ... this was done because the comment in the swapchain code that
sets the bits suggests it may not have been thought through:


// TODO(jessehall): I think these are right, but haven't thought hard about
// it. Do we need to query the driver for support of any of these?




(Also silently removing the usage bit is bad, because the app could
try actually using image stores with the image ...)


True, it is not nice ..



+
  VkImageFormatProperties2KHR image_format_props = {
 .sType = VK_STRUCTURE_TYPE_IMAGE_FORMAT_PROPERTIES_2_KHR,
  };
@@ -268,19 +279,13 @@ VkResult anv_GetSwapchainGrallocUsageANDROID(
  "inside %s", __func__);
  }

-   /* Reject STORAGE here to avoid complexity elsewhere. */
-   if (imageUsage & VK_IMAGE_USAGE_STORAGE_BIT) {
-  return vk_errorf(device->instance, device, VK_ERROR_FORMAT_NOT_SUPPORTED,
-   "VK_IMAGE_USAGE_STORAGE_BIT unsupported for gralloc "
-   "swapchain");
-   }
-
  if (unmask32(&imageUsage, VK_IMAGE_USAGE_TRANSFER_DST_BIT |
VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT))
 *grallocUsage |= GRALLOC_USAGE_HW_RENDER;

  if (unmask32(&imageUsage, VK_IMAGE_USAGE_TRANSFER_SRC_BIT |
VK_IMAGE_USAGE_SAMPLED_BIT |
+ VK_IMAGE_USAGE_STORAGE_BIT |
VK_IMAGE_USAGE_INPUT_ATTACHMENT_BIT))
 *grallocUsage |= GRALLOC_USAGE_HW_TEXTURE;

--
2.7.4
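
For readers following the thread, the usage translation touched by the last hunk can be sketched in Python. This is an illustration only, not the driver code: the unmask helper is re-implemented from how unmask32 is used in the hunk, and the numeric flag values are assumed from the Vulkan and Android gralloc headers. It mirrors the patched behaviour, where STORAGE maps to GRALLOC_USAGE_HW_TEXTURE instead of being rejected.

```python
# Vulkan image usage bits (values assumed from vulkan_core.h).
VK_IMAGE_USAGE_TRANSFER_SRC_BIT = 0x01
VK_IMAGE_USAGE_TRANSFER_DST_BIT = 0x02
VK_IMAGE_USAGE_SAMPLED_BIT = 0x04
VK_IMAGE_USAGE_STORAGE_BIT = 0x08
VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT = 0x10
VK_IMAGE_USAGE_INPUT_ATTACHMENT_BIT = 0x80

# gralloc usage bits (values assumed from Android's gralloc.h).
GRALLOC_USAGE_HW_TEXTURE = 0x100
GRALLOC_USAGE_HW_RENDER = 0x200


def unmask(mask, clear_mask):
    """Return (hit, remaining): whether any bit matched, and mask minus those bits."""
    if mask & clear_mask:
        return True, mask & ~clear_mask
    return False, mask


def gralloc_usage_for(image_usage):
    """Translate Vulkan usage flags to gralloc flags.

    Returns (gralloc_usage, leftover), where leftover holds any Vulkan
    bits that were not consumed by the mapping.
    """
    gralloc = 0

    hit, image_usage = unmask(image_usage,
                              VK_IMAGE_USAGE_TRANSFER_DST_BIT |
                              VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT)
    if hit:
        gralloc |= GRALLOC_USAGE_HW_RENDER

    hit, image_usage = unmask(image_usage,
                              VK_IMAGE_USAGE_TRANSFER_SRC_BIT |
                              VK_IMAGE_USAGE_SAMPLED_BIT |
                              VK_IMAGE_USAGE_STORAGE_BIT |
                              VK_IMAGE_USAGE_INPUT_ATTACHMENT_BIT)
    if hit:
        gralloc |= GRALLOC_USAGE_HW_TEXTURE

    return gralloc, image_usage
```

A non-zero leftover means the caller asked for a usage bit the mapping does not know about, which is the case where an error return would be appropriate.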



Re: [Mesa-dev] [PATCH] anv/android: handle storage images in vkGetSwapchainGrallocUsageANDROID

2018-12-05 Thread Bas Nieuwenhuizen
On Wed, Dec 5, 2018 at 12:37 PM Tapani Pälli  wrote:
>
>
>
> On 12/5/18 1:22 PM, Bas Nieuwenhuizen wrote:
> > On Wed, Dec 5, 2018 at 12:15 PM Tapani Pälli  wrote:
> >>
> >>
> >>
> >> On 12/5/18 1:01 PM, Bas Nieuwenhuizen wrote:
> >>> On Fri, Sep 7, 2018 at 12:54 AM Kevin Strasser  
> >>> wrote:
> 
>  Android P and earlier expect that the surface supports storage images, 
>  and
>  so many of the tests fail when the framework checks for that support. The
>  framework also includes various image format and usage combinations that 
>  are
>  invalid for the hardware.
> 
>  Drop the STORAGE restriction from the HAL and whitelist a pair of
>  formats so that existing versions of Android can pass these tests.
> 
>  Fixes:
>   dEQP-VK.wsi.android.*
> 
>  Signed-off-by: Kevin Strasser 
>  ---
> src/intel/vulkan/anv_android.c | 23 ++-
> 1 file changed, 14 insertions(+), 9 deletions(-)
> 
>  diff --git a/src/intel/vulkan/anv_android.c 
>  b/src/intel/vulkan/anv_android.c
>  index 46c41d5..e2640b8 100644
>  --- a/src/intel/vulkan/anv_android.c
>  +++ b/src/intel/vulkan/anv_android.c
>  @@ -234,7 +234,7 @@ VkResult anv_GetSwapchainGrallocUsageANDROID(
>    *grallocUsage = 0;
>    intel_logd("%s: format=%d, usage=0x%x", __func__, format, 
>  imageUsage);
> 
>  -   /* WARNING: Android Nougat's libvulkan.so hardcodes the 
>  VkImageUsageFlags
>  +   /* WARNING: Android's libvulkan.so hardcodes the VkImageUsageFlags
> * returned to applications via 
>  VkSurfaceCapabilitiesKHR::supportedUsageFlags.
> * The relevant code in libvulkan/swapchain.cpp contains this fun 
>  comment:
> *
>  @@ -247,7 +247,7 @@ VkResult anv_GetSwapchainGrallocUsageANDROID(
> * dEQP-VK.wsi.android.swapchain.*.image_usage to fail.
> */
> 
>  -   const VkPhysicalDeviceImageFormatInfo2KHR image_format_info = {
>  +   VkPhysicalDeviceImageFormatInfo2KHR image_format_info = {
> >>>
> >>> Why remove the const here?
> >>>
>   .sType = 
>  VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_IMAGE_FORMAT_INFO_2_KHR,
>   .format = format,
>   .type = VK_IMAGE_TYPE_2D,
>  @@ -255,6 +255,17 @@ VkResult anv_GetSwapchainGrallocUsageANDROID(
>   .usage = imageUsage,
>    };
> 
>  +   /* Android P and earlier doesn't check if the physical device 
>  supports a
>  +* given format and usage combination before calling this function. 
>  Omit the
>  +* storage requirement to make the tests pass.
>  +*/
>  +#if ANDROID_API_LEVEL <= 28
>  +   if (format == VK_FORMAT_R8G8B8A8_SRGB ||
>  +   format == VK_FORMAT_R5G6B5_UNORM_PACK16) {
>  +  image_format_info.usage &= ~VK_IMAGE_USAGE_STORAGE_BIT;
>  +   }
>  +#endif
> >>>
> >>> I don't think you need this. Per the Vulkan spec you can only use a
> >>> format + usage combination for a swapchain if it is supported per
> >>> ImageFormatProperties, using essentially the same check happening
> >>> above. I know CTS has been bad at this, but Vulkan CTS should have
> >>> been fixed for a bit now. (I don't think all the fixes are in Android
> >>> CTS 9.0_r4 yet, maybe the next release?)
> >>
> >> AFAIK the problem here is not about CTS. It's the swapchain
> >> implementation that always requires storage support.
> >
> > Actually swapchain creation has the following valid usage rule:
> >
> > "The implied image creation parameters of the swapchain must be
> > supported as reported by vkGetPhysicalDeviceImageFormatProperties"
> >
> > So since those formats don't support the STORAGE usage bit, that test
> > fails and you are not allowed to create a swapchain with those formats
> > and storage, even if the surface capabilities expose the STORAGE
> > usage bit in general.
>
> Right ... this was done because the comment in the swapchain code that
> sets the bits suggests it may not have been thought through:
>
> // TODO(jessehall): I think these are right, but haven't thought hard about
> // it. Do we need to query the driver for support of any of these?

That was from before the spec was changed to add that rule.

>
> >>
> >>> (Also silently removing the usage bit is bad, because the app could
> >>> try actually using image stores with the image ...)
> >>
> >> True, it is not nice ..
> >>
> >>
>  +
>    VkImageFormatProperties2KHR image_format_props = {
>   .sType = VK_STRUCTURE_TYPE_IMAGE_FORMAT_PROPERTIES_2_KHR,
>    };
>  @@ -268,19 +279,13 @@ VkResult anv_GetSwapchainGrallocUsageANDROID(
>    "inside %s", __func__);
>    }
> 
>  -   /* Reject STORAGE here to avoid complexity elsewhere. */
>  -   if (imageUsage & VK_IMAGE_USAGE_STORAGE_BIT) {
>  -  return vk_errorf(device->instance, 

Re: [Mesa-dev] [PATCH] anv/android: handle storage images in vkGetSwapchainGrallocUsageANDROID

2018-12-05 Thread Bas Nieuwenhuizen
On Wed, Dec 5, 2018 at 12:51 PM Tapani Pälli  wrote:
>
>
>
> On 12/5/18 1:44 PM, Bas Nieuwenhuizen wrote:
> > On Wed, Dec 5, 2018 at 12:37 PM Tapani Pälli  wrote:
> >>
> >>
> >>
> >> On 12/5/18 1:22 PM, Bas Nieuwenhuizen wrote:
> >>> On Wed, Dec 5, 2018 at 12:15 PM Tapani Pälli  
> >>> wrote:
> 
> 
> 
>  On 12/5/18 1:01 PM, Bas Nieuwenhuizen wrote:
> > On Fri, Sep 7, 2018 at 12:54 AM Kevin Strasser 
> >  wrote:
> >>
> >> Android P and earlier expect that the surface supports storage images, 
> >> and
> >> so many of the tests fail when the framework checks for that support. 
> >> The
> >> framework also includes various image format and usage combinations 
> >> that are
> >> invalid for the hardware.
> >>
> >> Drop the STORAGE restriction from the HAL and whitelist a pair of
> >> formats so that existing versions of Android can pass these tests.
> >>
> >> Fixes:
> >>   dEQP-VK.wsi.android.*
> >>
> >> Signed-off-by: Kevin Strasser 
> >> ---
> >> src/intel/vulkan/anv_android.c | 23 ++-
> >> 1 file changed, 14 insertions(+), 9 deletions(-)
> >>
> >> diff --git a/src/intel/vulkan/anv_android.c 
> >> b/src/intel/vulkan/anv_android.c
> >> index 46c41d5..e2640b8 100644
> >> --- a/src/intel/vulkan/anv_android.c
> >> +++ b/src/intel/vulkan/anv_android.c
> >> @@ -234,7 +234,7 @@ VkResult anv_GetSwapchainGrallocUsageANDROID(
> >>*grallocUsage = 0;
> >>intel_logd("%s: format=%d, usage=0x%x", __func__, format, 
> >> imageUsage);
> >>
> >> -   /* WARNING: Android Nougat's libvulkan.so hardcodes the 
> >> VkImageUsageFlags
> >> +   /* WARNING: Android's libvulkan.so hardcodes the VkImageUsageFlags
> >> * returned to applications via 
> >> VkSurfaceCapabilitiesKHR::supportedUsageFlags.
> >> * The relevant code in libvulkan/swapchain.cpp contains this 
> >> fun comment:
> >> *
> >> @@ -247,7 +247,7 @@ VkResult anv_GetSwapchainGrallocUsageANDROID(
> >> * dEQP-VK.wsi.android.swapchain.*.image_usage to fail.
> >> */
> >>
> >> -   const VkPhysicalDeviceImageFormatInfo2KHR image_format_info = {
> >> +   VkPhysicalDeviceImageFormatInfo2KHR image_format_info = {
> >
> > Why remove the const here?
> >
> >>   .sType = 
> >> VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_IMAGE_FORMAT_INFO_2_KHR,
> >>   .format = format,
> >>   .type = VK_IMAGE_TYPE_2D,
> >> @@ -255,6 +255,17 @@ VkResult anv_GetSwapchainGrallocUsageANDROID(
> >>   .usage = imageUsage,
> >>};
> >>
> >> +   /* Android P and earlier doesn't check if the physical device 
> >> supports a
> >> +* given format and usage combination before calling this 
> >> function. Omit the
> >> +* storage requirement to make the tests pass.
> >> +*/
> >> +#if ANDROID_API_LEVEL <= 28
> >> +   if (format == VK_FORMAT_R8G8B8A8_SRGB ||
> >> +   format == VK_FORMAT_R5G6B5_UNORM_PACK16) {
> >> +  image_format_info.usage &= ~VK_IMAGE_USAGE_STORAGE_BIT;
> >> +   }
> >> +#endif
> >
> > > I don't think you need this. Per the Vulkan spec you can only use a
> > > format + usage combination for a swapchain if it is supported per
> > > ImageFormatProperties, using essentially the same check happening
> > > above. I know CTS has been bad at this, but Vulkan CTS should have
> > > been fixed for a bit now. (I don't think all the fixes are in Android
> > > CTS 9.0_r4 yet, maybe the next release?)
> 
>  AFAIK the problem here is not about CTS. It's the swapchain
>  implementation that always requires storage support.
> >>>
> >>> Actually swapchain creation has the following valid usage rule:
> >>>
> >>> "The implied image creation parameters of the swapchain must be
> >>> supported as reported by vkGetPhysicalDeviceImageFormatProperties"
> >>>
> >>> So since those formats don't support the STORAGE usage bit, that test
> >>> fails and you are not allowed to create a swapchain with those formats
> >>> and storage, even if the surface capabilities expose the STORAGE
> >>> usage bit in general.
> >>
> >> Right ... this was done because the comment in the swapchain code that
> >> sets the bits suggests it may not have been thought through:
> >>
> >> // TODO(jessehall): I think these are right, but haven't thought hard about
> >> // it. Do we need to query the driver for support of any of these?
> >
> > That was from before the spec was changed to add that rule.
>
> OK, if I understand correctly, should we instead fix those tests to
> skip rather than fail?

They should be fixed with:
https://github.com/KhronosGroup/VK-GL-CTS/commit/49eab80e4a8b3af1790b9ac88b096aa9bffd193f#diff-8369d6640a2c6ad0c0fc1d85b113faeb

Re: [Mesa-dev] [PATCH] radv: expose VK_EXT_scalar_block_layout

2018-12-05 Thread Bas Nieuwenhuizen
On Wed, Dec 5, 2018 at 2:14 PM Samuel Pitoiset
 wrote:
>
> Nothing to do, the compiler already handles that.
>
> All new dEQP.VK.ubo.* and dEQP.VK.ssbo.* pass, except some
> 16-bit tests that are quite related to fdo bug #108114.
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_device.c  | 6 ++
>  src/amd/vulkan/radv_extensions.py | 1 +
>  2 files changed, 7 insertions(+)
>
> diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
> index ad057a87509..d39e00eebe2 100644
> --- a/src/amd/vulkan/radv_device.c
> +++ b/src/amd/vulkan/radv_device.c
> @@ -848,6 +848,12 @@ void radv_GetPhysicalDeviceFeatures2(
> features->geometryStreams = true;
> break;
> }
> +   case 
> VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SCALAR_BLOCK_LAYOUT_FEATURES_EXT: {
> +   VkPhysicalDeviceScalarBlockLayoutFeaturesEXT 
> *features =
> +   (VkPhysicalDeviceScalarBlockLayoutFeaturesEXT 
> *)ext;
> +   features->scalarBlockLayout = true;

Last I talked to Nicolai, it looked like we may only be able to
support this on CI+.

> +   break;
> +   }
> default:
> break;
> }
> diff --git a/src/amd/vulkan/radv_extensions.py 
> b/src/amd/vulkan/radv_extensions.py
> index 6bdf988d117..7d726d6f5e8 100644
> --- a/src/amd/vulkan/radv_extensions.py
> +++ b/src/amd/vulkan/radv_extensions.py
> @@ -107,6 +107,7 @@ EXTENSIONS = [
>  Extension('VK_EXT_global_priority',   1, 
> 'device->rad_info.has_ctx_priority'),
>  Extension('VK_EXT_pci_bus_info',  1, True),
>  Extension('VK_EXT_sampler_filter_minmax', 1, 
> 'device->rad_info.chip_class >= CIK'),
> +Extension('VK_EXT_scalar_block_layout',   1, True),
>  Extension('VK_EXT_shader_viewport_index_layer',   1, True),
>  Extension('VK_EXT_shader_stencil_export', 1, True),
>  Extension('VK_EXT_transform_feedback',1, True),
> --
> 2.19.2
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] spirv: add SpvCapabilityInt64Atomics

2018-12-05 Thread Jason Ekstrand

Rb


On December 5, 2018 07:26:22 Samuel Pitoiset  wrote:


Required for VK_KHR_shader_atomic_int64.

Signed-off-by: Samuel Pitoiset 
---
src/compiler/shader_info.h| 1 +
src/compiler/spirv/spirv_to_nir.c | 5 -
2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/src/compiler/shader_info.h b/src/compiler/shader_info.h
index 65bc0588d67..b95cc310fd6 100644
--- a/src/compiler/shader_info.h
+++ b/src/compiler/shader_info.h
@@ -62,6 +62,7 @@ struct spirv_supported_capabilities {
   bool post_depth_coverage;
   bool transform_feedback;
   bool geometry_streams;
+   bool int64_atomics;
};

typedef struct shader_info {
diff --git a/src/compiler/spirv/spirv_to_nir.c 
b/src/compiler/spirv/spirv_to_nir.c

index a05c4d236ca..22efaa276d9 100644
--- a/src/compiler/spirv/spirv_to_nir.c
+++ b/src/compiler/spirv/spirv_to_nir.c
@@ -3416,7 +3416,6 @@ vtn_handle_preamble_instruction(struct vtn_builder 
*b, SpvOp opcode,

  case SpvCapabilityVector16:
  case SpvCapabilityFloat16Buffer:
  case SpvCapabilityFloat16:
-  case SpvCapabilityInt64Atomics:
  case SpvCapabilityStorageImageMultisample:
  case SpvCapabilityInt8:
  case SpvCapabilitySparseResidency:
@@ -3447,6 +3446,10 @@ vtn_handle_preamble_instruction(struct vtn_builder 
*b, SpvOp opcode,

 spv_check_supported(geometry_streams, cap);
 break;

+  case SpvCapabilityInt64Atomics:
+ spv_check_supported(int64_atomics, cap);
+ break;
+
  case SpvCapabilityAddresses:
  case SpvCapabilityKernel:
  case SpvCapabilityImageBasic:
--
2.19.2



[Mesa-dev] [Bug 108952] mesa-git broke cinnamon, temporary downgrade fix

2018-12-05 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108952

Bug ID: 108952
   Summary: mesa-git broke cinnamon, temporary downgrade fix
   Product: Mesa
   Version: git
  Hardware: x86-64 (AMD64)
OS: Linux (All)
Status: NEW
  Severity: major
  Priority: medium
 Component: Mesa core
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: walter.bis...@protonmail.com
QA Contact: mesa-dev@lists.freedesktop.org

For a few weeks I have been using Lord Heavys
https://wiki.archlinux.org/index.php/unofficial_user_repositories#mesa-git
repo, to be able to play Dying Light on Arch.

As of November 30th, the repo broke my DE (Cinnamon). SDDM worked, although the
user icon was missing. After login cinnamon is stuck in a restart loop.
Downgrading mesa-git, llvm-libs-svn and llvm-svn fixed the problem.

Affected versions on my side are

mesa-git-105999.89b4798c06-1 mesa-git-106052.95d62baac5-1

llvm-libs-svn-347971-1 llvm-libs-svn-348278-1

llvm-svn-347971-1-x86_64 llvm-svn-348278-1

cinnamon-session[2916]: WARNING: t+30,89171s: Application 'cinnamon.desktop'
failed to register before timeout
cinnamon-session[2916]: CRITICAL: t+30,89191s: We failed, but the fail whale is
dead. Sorry
dbus-daemon[623]: [system] Activating via systemd: service
name='org.freedesktop.Accounts' unit='accounts-daemon.service' requested by
':1.101' (uid=1000 pid=3071
comm="/usr/lib/cinnamon-settings-daemon/csd-background ")
/cinnamon-killer-daemon[4359]: Bound Cinnamon restart to Escape.

This is my first bug report for mesa, so I don't know what other information
you might need. The few lines before are from the journal, that I thought might
be of importance. I'm not even sure whether I chose the right driver. (It might
also be radeonsi? Thought that was deprecated.) Let me know if you need some
more information.

Linux 4.19.4
AMD 8320e / 990X
RX 480
Cinnamon 4.0.3
SDDM 0.18.0

Machine:
  Type: Desktop System: Gigabyte product: N/A v: N/A serial:
  Mobo: Gigabyte model: 990XA-UD3 v: x.x serial:
  UEFI: American Megatrends v: FD date: 02/04/2013

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH 14/59] intel/compiler: lower some 16-bit float operations to 32-bit

2018-12-05 Thread Pohjolainen, Topi
On Tue, Dec 04, 2018 at 08:16:38AM +0100, Iago Toral Quiroga wrote:
> The hardware doesn't support half-float for these.

Reviewed-by: Topi Pohjolainen 
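
As a rough illustration of what the hunk below amounts to, here is a Python sketch of a bit-size lowering callback and of what "lower to 32-bit" means for one of the listed opcodes. The opcode names come from the patch; the 16-bit check, the helper names and the use of Python floats to stand in for the wider type are assumptions made for the example, not the driver's code.

```python
import math
import struct

# Opcodes added by the patch; per the commit message the hardware has
# no half-float support for them.
LOWER_TO_32 = {"fceil", "ffloor", "ffract", "fround_even", "ftrunc"}


def lower_bit_size(op, bit_size):
    """Return the bit size an op should be lowered to, or 0 to leave it alone.

    Assumption: only 16-bit operations are candidates for lowering.
    """
    if bit_size == 16 and op in LOWER_TO_32:
        return 32
    return 0


def to_half(x):
    """Round a Python float to the nearest half-float value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def ffloor16(x):
    """ffloor on a half: widen the operand, apply the op, narrow the result."""
    wide = to_half(x)              # operand as stored in 16 bits
    result = float(math.floor(wide))  # operation done at higher precision
    return to_half(result)         # convert back to 16 bits
```

For example, lower_bit_size("ffloor", 16) asks for a 32-bit operation, while the same opcode at 32 bits is left untouched.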

> ---
>  src/intel/compiler/brw_nir.c | 5 +
>  1 file changed, 5 insertions(+)
> 
> diff --git a/src/intel/compiler/brw_nir.c b/src/intel/compiler/brw_nir.c
> index aa6788b9fe5..e0027f5179c 100644
> --- a/src/intel/compiler/brw_nir.c
> +++ b/src/intel/compiler/brw_nir.c
> @@ -620,6 +620,11 @@ lower_bit_size_callback(const nir_alu_instr *alu, UNUSED 
> void *data)
> case nir_op_irem:
> case nir_op_udiv:
> case nir_op_umod:
> +   case nir_op_fceil:
> +   case nir_op_ffloor:
> +   case nir_op_ffract:
> +   case nir_op_fround_even:
> +   case nir_op_ftrunc:
>return 32;
> default:
>return 0;
> -- 
> 2.17.1
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 24/59] intel/compiler: add instruction setters for Src1Type and Src2Type.

2018-12-05 Thread Pohjolainen, Topi
On Tue, Dec 04, 2018 at 08:16:48AM +0100, Iago Toral Quiroga wrote:
> The original SrcType is a 3-bit field that takes a subset of the types
> supported for the hardware for 3-source instructions. Since gen8,
> when the half-float type was added, 3-source floating point operations
> can use mixed precision mode, where not all the operands have the
> same floating-point precision. While the precision for the first operand
> is taken from the type in SrcType, the bits in Src1Type (bit 36) and
> Src2Type (bit 35) define the precision for the other operands
> (0: normal precision, 1: half precision).

Reviewed-by: Topi Pohjolainen 
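
To make the bit layout in the commit message concrete, here is a small Python sketch of single-bit field setters over an instruction word. The bit positions (36 for Src1Type, 35 for Src2Type) and the 0/1 meaning are from the commit message; modelling the instruction as a plain Python integer and the helper names set_bits/get_bits are assumptions for illustration, not the F8() macro machinery in brw_inst.h.

```python
def set_bits(inst, high, low, value):
    """Return inst with bits [high:low] replaced by value."""
    width = high - low + 1
    if not 0 <= value < (1 << width):
        raise ValueError("value does not fit in field")
    mask = ((1 << width) - 1) << low
    return (inst & ~mask) | (value << low)


def get_bits(inst, high, low):
    """Extract bits [high:low] from inst."""
    return (inst >> low) & ((1 << (high - low + 1)) - 1)


# Field positions from the commit message (gen8+ only).
SRC1_TYPE = (36, 36)
SRC2_TYPE = (35, 35)

# Field values from the commit message.
NORMAL_PRECISION = 0
HALF_PRECISION = 1

# Mixed-precision example: src1 is half precision, src2 is normal precision.
inst = 0
inst = set_bits(inst, *SRC1_TYPE, HALF_PRECISION)
inst = set_bits(inst, *SRC2_TYPE, NORMAL_PRECISION)
```

The precision of the first operand still comes from SrcType, so only the two extra bits need to be written for the remaining sources.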

> ---
>  src/intel/compiler/brw_inst.h | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/src/intel/compiler/brw_inst.h b/src/intel/compiler/brw_inst.h
> index ce89bbba72f..c45697eaa3a 100644
> --- a/src/intel/compiler/brw_inst.h
> +++ b/src/intel/compiler/brw_inst.h
> @@ -222,6 +222,8 @@ F8(3src_src1_negate,39, 39, 40, 40)
>  F8(3src_src1_abs,   38, 38, 39, 39)
>  F8(3src_src0_negate,37, 37, 38, 38)
>  F8(3src_src0_abs,   36, 36, 37, 37)
> +F8(3src_a16_src1_type,  -1, -1, 36, 36)
> +F8(3src_a16_src2_type,  -1, -1, 35, 35)
>  F8(3src_a16_flag_reg_nr,34, 34, 33, 33)
>  F8(3src_a16_flag_subreg_nr, 33, 33, 32, 32)
>  FF(3src_a16_dst_reg_file,
> -- 
> 2.17.1
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] anv/android: handle storage images in vkGetSwapchainGrallocUsageANDROID

2018-12-05 Thread Tapani Pälli



On 12/5/18 1:01 PM, Bas Nieuwenhuizen wrote:

On Fri, Sep 7, 2018 at 12:54 AM Kevin Strasser  wrote:


Android P and earlier expect that the surface supports storage images, and
so many of the tests fail when the framework checks for that support. The
framework also includes various image format and usage combinations that are
invalid for the hardware.

Drop the STORAGE restriction from the HAL and whitelist a pair of
formats so that existing versions of Android can pass these tests.

Fixes:
dEQP-VK.wsi.android.*

Signed-off-by: Kevin Strasser 
---
  src/intel/vulkan/anv_android.c | 23 ++-
  1 file changed, 14 insertions(+), 9 deletions(-)

diff --git a/src/intel/vulkan/anv_android.c b/src/intel/vulkan/anv_android.c
index 46c41d5..e2640b8 100644
--- a/src/intel/vulkan/anv_android.c
+++ b/src/intel/vulkan/anv_android.c
@@ -234,7 +234,7 @@ VkResult anv_GetSwapchainGrallocUsageANDROID(
 *grallocUsage = 0;
 intel_logd("%s: format=%d, usage=0x%x", __func__, format, imageUsage);

-   /* WARNING: Android Nougat's libvulkan.so hardcodes the VkImageUsageFlags
+   /* WARNING: Android's libvulkan.so hardcodes the VkImageUsageFlags
  * returned to applications via 
VkSurfaceCapabilitiesKHR::supportedUsageFlags.
  * The relevant code in libvulkan/swapchain.cpp contains this fun comment:
  *
@@ -247,7 +247,7 @@ VkResult anv_GetSwapchainGrallocUsageANDROID(
  * dEQP-VK.wsi.android.swapchain.*.image_usage to fail.
  */

-   const VkPhysicalDeviceImageFormatInfo2KHR image_format_info = {
+   VkPhysicalDeviceImageFormatInfo2KHR image_format_info = {


Why remove the const here?


.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_IMAGE_FORMAT_INFO_2_KHR,
.format = format,
.type = VK_IMAGE_TYPE_2D,
@@ -255,6 +255,17 @@ VkResult anv_GetSwapchainGrallocUsageANDROID(
.usage = imageUsage,
 };

+   /* Android P and earlier doesn't check if the physical device supports a
+* given format and usage combination before calling this function. Omit the
+* storage requirement to make the tests pass.
+*/
+#if ANDROID_API_LEVEL <= 28
+   if (format == VK_FORMAT_R8G8B8A8_SRGB ||
+   format == VK_FORMAT_R5G6B5_UNORM_PACK16) {
+  image_format_info.usage &= ~VK_IMAGE_USAGE_STORAGE_BIT;
+   }
+#endif


I don't think you need this. Per the Vulkan spec you can only use a
format + usage combination for a swapchain if it is supported per
ImageFormatProperties, using essentially the same check happening
above. I know CTS has been bad at this, but Vulkan CTS should have
been fixed for a bit now. (I don't think all the fixes are in Android
CTS 9.0_r4 yet, maybe the next release?)


AFAIK the problem here is not about CTS. It's the swapchain 
implementation that always requires storage support.



(Also silently removing the usage bit is bad, because the app could
try actually using image stores with the image ...)


True, it is not nice ..



+
 VkImageFormatProperties2KHR image_format_props = {
.sType = VK_STRUCTURE_TYPE_IMAGE_FORMAT_PROPERTIES_2_KHR,
 };
@@ -268,19 +279,13 @@ VkResult anv_GetSwapchainGrallocUsageANDROID(
 "inside %s", __func__);
 }

-   /* Reject STORAGE here to avoid complexity elsewhere. */
-   if (imageUsage & VK_IMAGE_USAGE_STORAGE_BIT) {
-  return vk_errorf(device->instance, device, VK_ERROR_FORMAT_NOT_SUPPORTED,
-   "VK_IMAGE_USAGE_STORAGE_BIT unsupported for gralloc "
-   "swapchain");
-   }
-
 if (unmask32(&imageUsage, VK_IMAGE_USAGE_TRANSFER_DST_BIT |
   VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT))
*grallocUsage |= GRALLOC_USAGE_HW_RENDER;

 if (unmask32(&imageUsage, VK_IMAGE_USAGE_TRANSFER_SRC_BIT |
   VK_IMAGE_USAGE_SAMPLED_BIT |
+ VK_IMAGE_USAGE_STORAGE_BIT |
   VK_IMAGE_USAGE_INPUT_ATTACHMENT_BIT))
*grallocUsage |= GRALLOC_USAGE_HW_TEXTURE;

--
2.7.4



Re: [Mesa-dev] [PATCH 12/59] intel/compiler: handle b2i/b2f with other integer conversion opcodes

2018-12-05 Thread Pohjolainen, Topi
On Wed, Dec 05, 2018 at 11:23:06AM +0100, Iago Toral wrote:
> On Tue, 2018-12-04 at 18:16 +0200, Pohjolainen, Topi wrote:
> > On Tue, Dec 04, 2018 at 08:16:36AM +0100, Iago Toral Quiroga wrote:
> > > Since we handle booleans as integers this makes more sense.
> > 
> > If this is applied before patch 10, can we merge 10 and 13?
> 
> We can't apply this before patch 10 because patch 10 is the one that
> splits the f264 and {i,u}264 opcodes. However, we could merge this and
> patch 13 into patch 10 if that looks better to you.

What you have is just fine. I just didn't see all the corners involved.
Patches 11-13:

Reviewed-by: Topi Pohjolainen 

> 
> Iago
> 
> > > ---
> > >  src/intel/compiler/brw_fs_nir.cpp | 10 +-
> > >  1 file changed, 5 insertions(+), 5 deletions(-)
> > > 
> > > diff --git a/src/intel/compiler/brw_fs_nir.cpp
> > > b/src/intel/compiler/brw_fs_nir.cpp
> > > index 9f3d3bf9762..6c765fc2661 100644
> > > --- a/src/intel/compiler/brw_fs_nir.cpp
> > > +++ b/src/intel/compiler/brw_fs_nir.cpp
> > > @@ -801,11 +801,6 @@ fs_visitor::nir_emit_alu(const fs_builder
> > > &bld, nir_alu_instr *instr)
> > >inst->saturate = instr->dest.saturate;
> > >break;
> > >  
> > > -   case nir_op_b2i:
> > > -   case nir_op_b2f:
> > > -  op[0].type = BRW_REGISTER_TYPE_D;
> > > -  op[0].negate = !op[0].negate;
> > > -  /* fallthrough */
> > > case nir_op_f2f64:
> > > case nir_op_f2i64:
> > > case nir_op_f2u64:
> > > @@ -850,6 +845,11 @@ fs_visitor::nir_emit_alu(const fs_builder
> > > &bld, nir_alu_instr *instr)
> > >inst->saturate = instr->dest.saturate;
> > >break;
> > >  
> > > +   case nir_op_b2i:
> > > +   case nir_op_b2f:
> > > +  op[0].type = BRW_REGISTER_TYPE_D;
> > > +  op[0].negate = !op[0].negate;
> > > +  /* fallthrough */
> > > case nir_op_i2f64:
> > > case nir_op_i2i64:
> > > case nir_op_u2f64:
> > > -- 
> > > 2.17.1
> > > 
> > > ___
> > > mesa-dev mailing list
> > > mesa-dev@lists.freedesktop.org
> > > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> > 
> > 
> 


[Mesa-dev] [Bug 108949] RADV: Subgroup codegen is sub-optimal

2018-12-05 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108949

--- Comment #1 from Connor Abbott  ---
This should be fixed by
https://github.com/llvm-mirror/llvm/commit/e3924b1c15606bb5bf98392e0c20e731b4965311
which was just committed 5 days ago. You'll need to build LLVM and Mesa master
to try it out.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH 22/59] compiler/nir: add lowering for 16-bit ldexp

2018-12-05 Thread Pohjolainen, Topi
On Wed, Dec 05, 2018 at 12:26:06PM +0100, Iago Toral wrote:
> On Wed, 2018-12-05 at 13:20 +0200, Pohjolainen, Topi wrote:
> > On Wed, Dec 05, 2018 at 11:53:44AM +0100, Iago Toral wrote:
> > > On Wed, 2018-12-05 at 11:39 +0200, Pohjolainen, Topi wrote:
> > > > I remember people preferring to order things 16, 32, 64 before.
> > > > Should
> > > > we follow that here as well?
> > > 
> > > Yes, it makes sense. I'll change that.
> > > 
> > > > On Tue, Dec 04, 2018 at 08:16:46AM +0100, Iago Toral Quiroga
> > > > wrote:
> > > > > ---
> > > > >  src/compiler/nir/nir_opt_algebraic.py | 5 +
> > > > >  1 file changed, 5 insertions(+)
> > > > > 
> > > > > diff --git a/src/compiler/nir/nir_opt_algebraic.py
> > > > > b/src/compiler/nir/nir_opt_algebraic.py
> > > > > index 6c3b77c9b6e..747f1751086 100644
> > > > > --- a/src/compiler/nir/nir_opt_algebraic.py
> > > > > +++ b/src/compiler/nir/nir_opt_algebraic.py
> > > > > @@ -778,6 +778,8 @@ def fexp2i(exp, bits):
> > > > >return ('ishl', ('iadd', exp, 127), 23)
> > > > > elif bits == 64:
> > > > >return ('pack_64_2x32_split', 0, ('ishl', ('iadd', exp,
> > > > > 1023), 20))
> > > > > +   elif bits == 16:
> > > > > +  return ('i2i16', ('ishl', ('iadd', exp, 15), 10))
> > > > > else:
> > > > >assert False
> > > > >  
> > > > > @@ -796,6 +798,8 @@ def ldexp(f, exp, bits):
> > > > >exp = ('imin', ('imax', exp, -252), 254)
> > > > > elif bits == 64:
> > > > >exp = ('imin', ('imax', exp, -2044), 2046)
> > > > > +   elif bits == 16:
> > > > > +  exp = ('imin', ('imax', exp, -30), 30)
> > > > 
> > > > I expected this to be:
> > > > 
> > > >  exp = ('imin', ('imax', exp, -29), 30)
> > > 
> > > Actually, I think this should be -28, since the minimum exponent
> > > value
> > > is -14.
> > 
> > I kept wondering about that. The offset is 15 and -14 - 15 yields -29. But
> > -28
> > in turn would be more in line with the 32- and 64-bit cases.
> 
> I think the idea is to have this be 2x the minimum (and maximum)
> exponents we can represent, since below we are dividing it by two and
> emitting two exponentials, each with half that exponent. That way we
> ensure that when we divide the exponent by 2 we still produce a
> representable exponent for the bit-size.

Ah, right. I should have checked the context, -28 makes sense now.

Reviewed-by: Topi Pohjolainen 
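
The reasoning about the clamp can be checked with a small Python sketch. It models the 16-bit case only: fexp2i_16 builds a power of two by writing the biased exponent field (bias 15, 10 mantissa bits, as in the patch), and ldexp16 splits the exponent into two halves as described above. The function names are made up for the example, the split via an arithmetic shift is my reading of the surrounding lowering code, and the final multiplications are done on Python floats rather than real half-floats, so this is a sanity check of the exponent range, not the NIR lowering itself.

```python
import struct


def fexp2i_16(exp):
    """Build 2**exp as a half float by filling in the biased exponent field."""
    bits = (exp + 15) << 10
    return struct.unpack("<e", struct.pack("<H", bits))[0]


def ldexp16(f, exp):
    """Compute f * 2**exp using two power-of-two factors.

    The exponent is clamped to [-28, 30], twice the smallest and largest
    normal half-float exponents (-14 and 15), so that each half of the
    split exponent is still representable.
    """
    exp = min(max(exp, -28), 30)
    half_exp = exp >> 1  # arithmetic shift, rounds towards minus infinity
    pow2_1 = fexp2i_16(half_exp)
    pow2_2 = fexp2i_16(exp - half_exp)
    return f * pow2_1 * pow2_2
```

With a lower bound of -29 or -30 the first factor would be built from exponent -15, whose biased field is zero, so fexp2i_16 would return 0.0 instead of a power of two. That is why -28 is the right limit here.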


Re: [Mesa-dev] [PATCH] anv/android: handle storage images in vkGetSwapchainGrallocUsageANDROID

2018-12-05 Thread Tapani Pälli



On 12/5/18 2:00 PM, Bas Nieuwenhuizen wrote:
On Wed, Dec 5, 2018 at 12:51 PM Tapani Pälli wrote:
On 12/5/18 1:44 PM, Bas Nieuwenhuizen wrote:
On Wed, Dec 5, 2018 at 12:37 PM Tapani Pälli wrote:
On 12/5/18 1:22 PM, Bas Nieuwenhuizen wrote:
On Wed, Dec 5, 2018 at 12:15 PM Tapani Pälli wrote:
On 12/5/18 1:01 PM, Bas Nieuwenhuizen wrote:
On Fri, Sep 7, 2018 at 12:54 AM Kevin Strasser wrote:

Android P and earlier expect that the surface supports storage images, and
so many of the tests fail when the framework checks for that support. The
framework also includes various image format and usage combinations that are
invalid for the hardware.

Drop the STORAGE restriction from the HAL and whitelist a pair of
formats so that existing versions of Android can pass these tests.

Fixes:
   dEQP-VK.wsi.android.*

Signed-off-by: Kevin Strasser 
---
 src/intel/vulkan/anv_android.c | 23 ++-
 1 file changed, 14 insertions(+), 9 deletions(-)

diff --git a/src/intel/vulkan/anv_android.c b/src/intel/vulkan/anv_android.c
index 46c41d5..e2640b8 100644
--- a/src/intel/vulkan/anv_android.c
+++ b/src/intel/vulkan/anv_android.c
@@ -234,7 +234,7 @@ VkResult anv_GetSwapchainGrallocUsageANDROID(
*grallocUsage = 0;
intel_logd("%s: format=%d, usage=0x%x", __func__, format, imageUsage);

-   /* WARNING: Android Nougat's libvulkan.so hardcodes the VkImageUsageFlags
+   /* WARNING: Android's libvulkan.so hardcodes the VkImageUsageFlags
 * returned to applications via VkSurfaceCapabilitiesKHR::supportedUsageFlags.
 * The relevant code in libvulkan/swapchain.cpp contains this fun comment:
 *
@@ -247,7 +247,7 @@ VkResult anv_GetSwapchainGrallocUsageANDROID(
 * dEQP-VK.wsi.android.swapchain.*.image_usage to fail.
 */

-   const VkPhysicalDeviceImageFormatInfo2KHR image_format_info = {
+   VkPhysicalDeviceImageFormatInfo2KHR image_format_info = {


Why remove the const here?


   .sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_IMAGE_FORMAT_INFO_2_KHR,
   .format = format,
   .type = VK_IMAGE_TYPE_2D,
@@ -255,6 +255,17 @@ VkResult anv_GetSwapchainGrallocUsageANDROID(
   .usage = imageUsage,
};

+   /* Android P and earlier doesn't check if the physical device supports a
+* given format and usage combination before calling this function. Omit the
+* storage requirement to make the tests pass.
+*/
+#if ANDROID_API_LEVEL <= 28
+   if (format == VK_FORMAT_R8G8B8A8_SRGB ||
+   format == VK_FORMAT_R5G6B5_UNORM_PACK16) {
+  image_format_info.usage &= ~VK_IMAGE_USAGE_STORAGE_BIT;
+   }
+#endif


I don't think you need this. Per the Vulkan spec you can only use a
format + usage combination for a swapchain if it is supported per
ImageFormatProperties, using essentially the same check happening
above. I know CTS has been bad at this, but Vulkan CTS should have
been fixed for a bit now. (I don't think all the fixes are in Android
CTS 9.0_r4 yet, maybe the next release?)


AFAIK the problem here is not about CTS. It's the swapchain
implementation that always requires storage support.


Actually swapchain creation has the following valid usage rule:

"The implied image creation parameters of the swapchain must be
supported as reported by vkGetPhysicalDeviceImageFormatProperties"

So since those formats don't support the STORAGE usage bit, that test
fails and you are not allowed to create a swapchain with those formats
and storage, even if the surface capabilities expose the STORAGE
usage bit in general.


Right ... this stuff was done because the comment in the swapchain code
that sets the bits suggests it may not have been thought through:

// TODO(jessehall): I think these are right, but haven't thought hard about
// it. Do we need to query the driver for support of any of these?


That was from before the spec was changed to add that rule.


OK, if I understand correctly, should we instead try to fix those
tests to skip rather than fail?


They should be fixed with:
https://github.com/KhronosGroup/VK-GL-CTS/commit/49eab80e4a8b3af1790b9ac88b096aa9bffd193f#diff-8369d6640a2c6ad0c0fc1d85b113faeb
https://github.com/KhronosGroup/VK-GL-CTS/commit/858f5396a4f63223fcf31f717d23b4b552e10182#diff-8369d6640a2c6ad0c0fc1d85b113faeb


Thanks, will try with these!

(Also silently removing the usage bit is bad, because the app could
try actually using image stores with the image ...)


True, it is not nice ..



+
VkImageFormatProperties2KHR image_format_props = {
   .sType = VK_STRUCTURE_TYPE_IMAGE_FORMAT_PROPERTIES_2_KHR,
};
@@ -268,19 +279,13 @@ VkResult anv_GetSwapchainGrallocUsageANDROID(
"inside %s", __func__);
}

-   /* Reject STORAGE here to avoid complexity elsewhere. */
-   if (imageUsage & VK_IMAGE_USAGE_STORAGE_BIT) {
-  return vk_errorf(device->instance, device, VK_ERROR_FORMAT_NOT_SUPPORTED,
-   

[Mesa-dev] [PATCH v3] nir/algebraic: Rewrite bit-size inference

2018-12-05 Thread Connor Abbott
Before this commit, there were two copies of the algorithm: one in C,
that we would use to figure out what bit-size to give the replacement
expression, and one in Python, that emulated the C one and tried to
prove that the C algorithm would never fail to correctly assign
bit-sizes. That seemed pretty fragile, and likely to fall over if we
make any changes. Furthermore, the C code was really just recomputing
more-or-less the same thing as the Python code every time. Instead, we
can just store the results of the Python algorithm in the C
datastructure, and consult it to compute the bitsize of each value,
moving the "brains" entirely into Python. Since the Python algorithm no
longer has to match C, it's also a lot easier to change it to something
more closely approximating an actual type-inference algorithm. The
algorithm used is based on Hindley-Milner, although deliberately
weakened a little. It's a few more lines than the old one, judging by
the diffstat, but I think it's easier to verify that it's correct while
being as general as possible.

We could split this up into two changes, first making the C code use the
results of the Python code and then rewriting the Python algorithm, but
since the old algorithm never tracked which variable each equivalence
class corresponds to, it would mean we'd have to add some non-trivial
code which would then get thrown away. I think it's better to see the
final state all at once, although I could also try splitting it up.
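
For readers skimming the patch, here is a minimal standalone sketch of the union-find idea that get_bit_size()/set_bit_size() implement. The Node class, its field names, and the conflict handling are assumptions made for illustration; the sketch does not reproduce the variable-preference rule and is not the code in nir_algebraic.py.

```python
class Node:
    """Toy stand-in for a search value (hypothetical, not the real Value class).

    _bit_size is None when the node is its own representative, another Node
    when it has been merged into that node's class, or an int once a concrete
    bit-size is known.
    """

    def __init__(self, name):
        self.name = name
        self._bit_size = None

    def get_bit_size(self):
        # "find": follow links until we hit a representative Node or an int.
        rep = self
        while isinstance(rep, Node):
            if rep._bit_size is None:
                break
            rep = rep._bit_size
        # Path compression: remember the answer for the next lookup.
        if rep is not self:
            self._bit_size = rep
        return rep

    def set_bit_size(self, other):
        # "union": merge this node's class with other's class (or an int).
        mine = self.get_bit_size()
        theirs = other if isinstance(other, int) else other.get_bit_size()
        if mine is theirs or mine == theirs:
            return
        if isinstance(mine, int):
            if isinstance(theirs, int):
                # Assumption: this sketch reports conflicts with an exception.
                raise ValueError("conflicting bit-sizes %d and %d" % (mine, theirs))
            theirs._bit_size = mine
        else:
            mine._bit_size = theirs


a, b, c = Node("a"), Node("b"), Node("c")
a.set_bit_size(b)      # a and b must have the same size
b.set_bit_size(32)     # ... which turns out to be 32
c.set_bit_size(a)      # c joins the same class
print(a.get_bit_size(), c.get_bit_size())
```

Storing either a link or a concrete integer in the same field is what lets a single lookup answer both "which class is this in" and "what size was chosen", which is the property the commit message relies on.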

v2:
- Replace instances of "== None" and "!= None" with "is None" and
"is not None".
- Rename first_src to first_unsized_src
- Only merge the destination with the first unsized source, since the
sources have already been merged.
- Add a comment explaining what nir_search_value::bit_size now means.
v3:
- Fix one last instance to use "is not" instead of !=
- Don't try to be so clever when choosing which error message to print
based on whether we're in the search or replace expression.
- Fix trailing whitespace.
---
 src/compiler/nir/nir_algebraic.py | 520 --
 src/compiler/nir/nir_search.c | 146 +
 src/compiler/nir/nir_search.h |  17 +-
 3 files changed, 317 insertions(+), 366 deletions(-)

diff --git a/src/compiler/nir/nir_algebraic.py b/src/compiler/nir/nir_algebraic.py
index 728196136ab..efd6e52cdb9 100644
--- a/src/compiler/nir/nir_algebraic.py
+++ b/src/compiler/nir/nir_algebraic.py
@@ -88,7 +88,7 @@ class Value(object):
 
__template = mako.template.Template("""
 static const ${val.c_type} ${val.name} = {
-   { ${val.type_enum}, ${val.bit_size} },
+   { ${val.type_enum}, ${val.c_bit_size} },
 % if isinstance(val, Constant):
${val.type()}, { ${val.hex()} /* ${val.value} */ },
 % elif isinstance(val, Variable):
@@ -112,6 +112,40 @@ static const ${val.c_type} ${val.name} = {
def __str__(self):
   return self.in_val
 
+   def get_bit_size(self):
+  """Get the physical bit-size that has been chosen for this value, or if
+  there is none, the canonical value which currently represents this
+  bit-size class. Variables will be preferred, i.e. if there are any
+  variables in the equivalence class, the canonical value will be a
+  variable. We do this since we'll need to know which variable each value
+  is equivalent to when constructing the replacement expression. This is
+  the "find" part of the union-find algorithm.
+  """
+  bit_size = self
+
+  while isinstance(bit_size, Value):
+ if bit_size._bit_size is None:
+break
+ bit_size = bit_size._bit_size
+
+  if bit_size is not self:
+ self._bit_size = bit_size
+  return bit_size
+
+   def set_bit_size(self, other):
+  """Make self.get_bit_size() return what other.get_bit_size() returned
+  before calling this, or just "other" if it's a concrete bit-size. This is
+  the "union" part of the union-find algorithm.
+  """
+
+  self_bit_size = self.get_bit_size()
+  other_bit_size = other if isinstance(other, int) else other.get_bit_size()
+
+  if self_bit_size == other_bit_size:
+ return
+
+  self_bit_size._bit_size = other_bit_size
+
@property
def type_enum(self):
   return "nir_search_value_" + self.type_str
@@ -124,6 +158,21 @@ static const ${val.c_type} ${val.name} = {
def c_ptr(self):
   return "&{0}.value".format(self.name)
 
+   @property
+   def c_bit_size(self):
+  bit_size = self.get_bit_size()
+  if isinstance(bit_size, int):
+ return bit_size
+  elif isinstance(bit_size, Variable):
+ return -bit_size.index - 1
+  else:
+ # If the bit-size class is neither a variable, nor an actual bit-size, then
+ # - If it's in the search expression, we don't need to check anything
+ # - If it's in the replace expression, either it's ambiguous (in which
+ # case we'd reject it), or it equals the bit-size of the search value
+ # We represent these cases 

Re: [Mesa-dev] [PATCH 28/59] intel/compiler: set correct precision fields for 3-source float instructions

2018-12-05 Thread Iago Toral
On Wed, 2018-12-05 at 14:58 +0200, Pohjolainen, Topi wrote:
> On Tue, Dec 04, 2018 at 08:16:52AM +0100, Iago Toral Quiroga wrote:
> > Source0 and Destination extract the floating-point precision
> > automatically
> > from the SrcType and DstType instruction fields respectively when
> > they are
> > set to types :F or :HF. For Source1 and Source2 operands, we use
> > the new
> > 1-bit fields Src1Type and Src2Type, where 0 means normal precision
> > and 1
> > means half-precision. Since we always use the type of the
> > destination for
> > all operands when we emit 3-source instructions, we only need set
> > Src1Type
> > and Src2Type to 1 when we are emitting a half-precision
> > instruction.
> > ---
> >  src/intel/compiler/brw_eu_emit.c | 5 +
> >  1 file changed, 5 insertions(+)
> > 
> > diff --git a/src/intel/compiler/brw_eu_emit.c
> > b/src/intel/compiler/brw_eu_emit.c
> > index 2c9fc9a5c7c..66edfb43baf 100644
> > --- a/src/intel/compiler/brw_eu_emit.c
> > +++ b/src/intel/compiler/brw_eu_emit.c
> > @@ -801,6 +801,11 @@ brw_alu3(struct brw_codegen *p, unsigned
> > opcode, struct brw_reg dest,
> >*/
> >   brw_inst_set_3src_a16_src_type(devinfo, inst, dest.type);
> >   brw_inst_set_3src_a16_dst_type(devinfo, inst, dest.type);
> > +
> > + if (devinfo->gen >= 8 && dest.type ==
> > BRW_REGISTER_TYPE_HF) {
> > +brw_inst_set_3src_a16_src1_type(devinfo, inst, 1);
> > +brw_inst_set_3src_a16_src2_type(devinfo, inst, 1);
> > + }
> 
> I had similar patch which prepares for mixed mode (useful for linterp
> with
> 32-bit input varyings):
> 
>  /* From the Bspec: Instruction types
>   *
>   * Three source instructions can use operands with mixed-
> mode
>   * precision. When SrcType field is set to :f or :hf it
> defines
>   * precision for source 0 only, and fields Src1Type and
> Src2Type
>   * define precision for other source operands:
>   *
>   *   0b = :f. Single precision Float (32-bit).
>   *   1b = :hf. Half precision Float (16-bit).
>   */
>  if (src1.type == BRW_REGISTER_TYPE_HF)
> brw_inst_set_3src_src1_type(devinfo, inst, 1);
> 
>  if (src2.type == BRW_REGISTER_TYPE_HF)
> brw_inst_set_3src_src2_type(devinfo, inst, 1);
> 
> How would you feel about that? (Direct cut-paste and the helpers have
> different name).

Sure, if we are planning to use mixed mode in the future this makes
more sense. Thanks!

> >}
> > }
> >  
> > -- 
> > 2.17.1
> > 
> > ___
> > mesa-dev mailing list
> > mesa-dev@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> 
> 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] radv: expose VK_EXT_scalar_block_layout

2018-12-05 Thread Samuel Pitoiset
Nothing to do, the compiler already handles that.

All new dEQP.VK.ubo.* and dEQP.VK.ssbo.* pass, except some
16-bit tests that are quite related to fdo bug #108114.

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_device.c  | 6 ++
 src/amd/vulkan/radv_extensions.py | 1 +
 2 files changed, 7 insertions(+)

diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
index ad057a87509..d39e00eebe2 100644
--- a/src/amd/vulkan/radv_device.c
+++ b/src/amd/vulkan/radv_device.c
@@ -848,6 +848,12 @@ void radv_GetPhysicalDeviceFeatures2(
features->geometryStreams = true;
break;
}
+   case VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SCALAR_BLOCK_LAYOUT_FEATURES_EXT: {
+   VkPhysicalDeviceScalarBlockLayoutFeaturesEXT *features =
+   (VkPhysicalDeviceScalarBlockLayoutFeaturesEXT *)ext;
+   features->scalarBlockLayout = true;
+   break;
+   }
default:
break;
}
diff --git a/src/amd/vulkan/radv_extensions.py b/src/amd/vulkan/radv_extensions.py
index 6bdf988d117..7d726d6f5e8 100644
--- a/src/amd/vulkan/radv_extensions.py
+++ b/src/amd/vulkan/radv_extensions.py
@@ -107,6 +107,7 @@ EXTENSIONS = [
 Extension('VK_EXT_global_priority',   1, 'device->rad_info.has_ctx_priority'),
 Extension('VK_EXT_pci_bus_info',  1, True),
 Extension('VK_EXT_sampler_filter_minmax', 1, 'device->rad_info.chip_class >= CIK'),
+Extension('VK_EXT_scalar_block_layout',   1, True),
 Extension('VK_EXT_shader_viewport_index_layer',   1, True),
 Extension('VK_EXT_shader_stencil_export', 1, True),
 Extension('VK_EXT_transform_feedback',1, True),
-- 
2.19.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 28/59] intel/compiler: set correct precision fields for 3-source float instructions

2018-12-05 Thread Pohjolainen, Topi
On Wed, Dec 05, 2018 at 02:04:16PM +0100, Iago Toral wrote:
> On Wed, 2018-12-05 at 14:58 +0200, Pohjolainen, Topi wrote:
> > On Tue, Dec 04, 2018 at 08:16:52AM +0100, Iago Toral Quiroga wrote:
> > > Source0 and Destination extract the floating-point precision
> > > automatically
> > > from the SrcType and DstType instruction fields respectively when
> > > they are
> > > set to types :F or :HF. For Source1 and Source2 operands, we use
> > > the new
> > > 1-bit fields Src1Type and Src2Type, where 0 means normal precision
> > > and 1
> > > means half-precision. Since we always use the type of the
> > > destination for
> > > all operands when we emit 3-source instructions, we only need set
> > > Src1Type
> > > and Src2Type to 1 when we are emitting a half-precision
> > > instruction.
> > > ---
> > >  src/intel/compiler/brw_eu_emit.c | 5 +
> > >  1 file changed, 5 insertions(+)
> > > 
> > > diff --git a/src/intel/compiler/brw_eu_emit.c
> > > b/src/intel/compiler/brw_eu_emit.c
> > > index 2c9fc9a5c7c..66edfb43baf 100644
> > > --- a/src/intel/compiler/brw_eu_emit.c
> > > +++ b/src/intel/compiler/brw_eu_emit.c
> > > @@ -801,6 +801,11 @@ brw_alu3(struct brw_codegen *p, unsigned
> > > opcode, struct brw_reg dest,
> > >*/
> > >   brw_inst_set_3src_a16_src_type(devinfo, inst, dest.type);
> > >   brw_inst_set_3src_a16_dst_type(devinfo, inst, dest.type);
> > > +
> > > + if (devinfo->gen >= 8 && dest.type ==
> > > BRW_REGISTER_TYPE_HF) {
> > > +brw_inst_set_3src_a16_src1_type(devinfo, inst, 1);
> > > +brw_inst_set_3src_a16_src2_type(devinfo, inst, 1);
> > > + }
> > 
> > I had similar patch which prepares for mixed mode (useful for linterp
> > with
> > 32-bit input varyings):
> > 
> >  /* From the Bspec: Instruction types
> >   *
> >   * Three source instructions can use operands with mixed-
> > mode
> >   * precision. When SrcType field is set to :f or :hf it
> > defines
> >   * precision for source 0 only, and fields Src1Type and
> > Src2Type
> >   * define precision for other source operands:
> >   *
> >   *   0b = :f. Single precision Float (32-bit).
> >   *   1b = :hf. Half precision Float (16-bit).
> >   */
> >  if (src1.type == BRW_REGISTER_TYPE_HF)
> > brw_inst_set_3src_src1_type(devinfo, inst, 1);
> > 
> >  if (src2.type == BRW_REGISTER_TYPE_HF)
> > brw_inst_set_3src_src2_type(devinfo, inst, 1);
> > 
> > How would you feel about that? (Direct cut-paste and the helpers have
> > different name).
> 
> Sure, if we are planning to use mixed mode in the future this makes
> more sense. Thanks!

Nice!

Reviewed-by: Topi Pohjolainen 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 108949] RADV: Subgroup codegen is sub-optimal

2018-12-05 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108949

            Bug ID: 108949
           Summary: RADV: Subgroup codegen is sub-optimal
           Product: Mesa
           Version: 18.2
          Hardware: Other
                OS: All
            Status: NEW
          Severity: normal
          Priority: medium
         Component: Drivers/Vulkan/radeon
          Assignee: mesa-dev@lists.freedesktop.org
          Reporter: mais...@archlinux.us
        QA Contact: mesa-dev@lists.freedesktop.org

I have some code using subgroups which generates suboptimal code where I expect
more use of SGPRs, but I see a lot of VGPRs/vector loads being used instead.
The code-gen is worse than AMDVLK and much worse than AMD's Windows driver as a
result. I filed a similar issue here:
https://github.com/GPUOpen-Drivers/AMDVLK/issues/68.
On a more useful, complicated test, I get 0% uplift from subgroup on RADV, 5%
on AMDVLK and 15% on Windows. GPU is RX 470 (Polaris). Mesa version is 18.2.5.

With a trivial test:
https://github.com/Themaister/Granite/blob/master/tests/assets/shaders/subgroup.comp
I expect the subgroupBroadcastFirst(subgroupOr) to trigger all scalar loads,
but I get in the loop:

BB629_1:
    s_load_dwordx4 s[8:11], s[0:1], 0x0                ; C00A0200
    s_ff1_i32_b32 s3, s2                               ; BE831002
    v_mul_u32_u24_e64 v7, s3, 48                       ; D1080007 00016003
    v_or_b32_e32 v5, 4, v7                             ; 280A0E84
    v_mad_u32_u24 v10, s3, 48, 20                      ; D1C3000A 02516003
    v_mad_u32_u24 v8, s3, 48, 16                       ; D1C30008 02416003
    s_waitcnt lgkmcnt(0)                               ; BF8C007F
*   buffer_load_dwordx2 v[5:6], v5, s[8:11], 0 offen   ; E0541000 80020505
*   buffer_load_dword v10, v10, s[8:11], 0 offen       ; E0501000 80020A0A
*   buffer_load_dword v14, v7, s[8:11], 0 offen        ; E0501000 80020E07
*   buffer_load_dword v8, v8, s[8:11], 0 offen         ; E0501000 80020808
    v_mad_u32_u24 v11, s3, 48, 24                      ; D1C3000B 02616003
    v_or_b32_e32 v7, 12, v7                            ; 280E0E8C
    v_mad_u32_u24 v12, s3, 48, 28                      ; D1C3000C 02716003
*   buffer_load_dword v7, v7, s[8:11], 0 offen         ; E0501000 80020707
    v_mad_u32_u24 v9, s3, 48, 32                       ; D1C30009 02816003
    buffer_load_dword v11, v11, s[8:11], 0 offen       ; E0501000 80020B0B
    v_mad_u32_u24 v13, s3, 48, 36                      ; D1C3000D 02916003
*   buffer_load_dword v12, v12, s[8:11], 0 offen       ; E0501000 80020C0C
*   buffer_load_dword v9, v9, s[8:11], 0 offen         ; E0501000 80020909
...

where Windows codegen is:

label_0028:
  s_cmp_eq_i32  s0, 0                             // 00A0: BF008000
  s_cbranch_scc1  label_0052                      // 00A4: BF850028
  s_and_b32 s1, s3, 0x                            // 00A8: 8601FF03
  s_ff1_i32_b32  s4, s0                           // 00B0: BE841000
  s_andn2_b32   s1, s1, 0x3fff                    // 00B4: 8901FF01 3FFF
  s_mul_i32 s5, s4, 48                            // 00BC: 9205B004
  s_mov_b32 s12, s2                               // 00C0: BE8C0002
  s_mov_b32 s13, s1                               // 00C4: BE8D0001
  s_movk_i32 s14, 0x                              // 00C8: B00E
  s_mov_b32 s15, 0x00024fac                       // 00CC: BE8F00FF 00024FAC
  s_buffer_load_dwordx8  s[16:23], s[12:15], s5   // 00D4: C02C0406 0005
  s_add_u32 s1, s5, 32                            // 00DC: 8001A005
  s_buffer_load_dwordx4  s[12:15], s[12:15], s1   // 00E0: C0280306 0001
  s_lshl_b32 s1, 1, s4                            // 00E8: 8E010481
  s_xor_b32 s0, s0, s1                            // 00EC: 88000100
  s_waitcnt vmcnt(0) & lgkmcnt(0)                 // 00F0: BF8C0070
...
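
As a reading aid, the scalar loop in the Windows listing appears to pick the lowest set bit of a mask (s_ff1_i32_b32), scale it by 48 to get a buffer offset (s_mul_i32), and then clear that bit (s_lshl_b32 plus s_xor_b32). A rough Python sketch of that loop shape follows; the function names and the interpretation of the mask are assumptions for illustration, not driver code.

```python
def ff1(mask):
    """Index of the lowest set bit, or -1 if mask is zero (like s_ff1_i32_b32)."""
    if mask == 0:
        return -1
    return (mask & -mask).bit_length() - 1


def walk_mask(mask, stride=48):
    """Visit each set bit once and return the byte offset computed for it.

    stride=48 matches the multiply seen in both listings; what the mask
    actually encodes in the shader is an assumption here.
    """
    offsets = []
    while mask != 0:
        index = ff1(mask)
        offsets.append(index * stride)   # scalar offset, uniform across lanes
        mask ^= 1 << index               # clear the bit we just handled
    return offsets


print(walk_mask(0b1010))
```

Because the index comes from a scalar register, every offset derived from it is uniform across the wave, which is why scalar buffer loads would be expected here.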

The subgroupOr is implemented strangely, getting similar code as AMDVLK, i.e.
this:

v_mov_b32_dpp v7, v7  quad_perm:[1,0,3,2] row_mask:0xf bank_mask:0xf ; 7E0E02FA FF00B107
v_or_b32_e32 v5, v5, v7                                              ; 280A0F05
v_mov_b32_e32 v7, v5                                                 ; 7E0E0305
s_nop 1                                                              ; BF81
v_mov_b32_dpp v7, v7  

Re: [Mesa-dev] [PATCH 22/59] compiler/nir: add lowering for 16-bit ldexp

2018-12-05 Thread Iago Toral
On Wed, 2018-12-05 at 13:20 +0200, Pohjolainen, Topi wrote:
> On Wed, Dec 05, 2018 at 11:53:44AM +0100, Iago Toral wrote:
> > On Wed, 2018-12-05 at 11:39 +0200, Pohjolainen, Topi wrote:
> > > I remember people preferring to order things 16, 32, 64 before.
> > > Should
> > > we follow that here as well?
> > 
> > Yes, it makes sense. I'll change that.
> > 
> > > On Tue, Dec 04, 2018 at 08:16:46AM +0100, Iago Toral Quiroga
> > > wrote:
> > > > ---
> > > >  src/compiler/nir/nir_opt_algebraic.py | 5 +
> > > >  1 file changed, 5 insertions(+)
> > > > 
> > > > diff --git a/src/compiler/nir/nir_opt_algebraic.py
> > > > b/src/compiler/nir/nir_opt_algebraic.py
> > > > index 6c3b77c9b6e..747f1751086 100644
> > > > --- a/src/compiler/nir/nir_opt_algebraic.py
> > > > +++ b/src/compiler/nir/nir_opt_algebraic.py
> > > > @@ -778,6 +778,8 @@ def fexp2i(exp, bits):
> > > >return ('ishl', ('iadd', exp, 127), 23)
> > > > elif bits == 64:
> > > >return ('pack_64_2x32_split', 0, ('ishl', ('iadd', exp,
> > > > 1023), 20))
> > > > +   elif bits == 16:
> > > > +  return ('i2i16', ('ishl', ('iadd', exp, 15), 10))
> > > > else:
> > > >assert False
> > > >  
> > > > @@ -796,6 +798,8 @@ def ldexp(f, exp, bits):
> > > >exp = ('imin', ('imax', exp, -252), 254)
> > > > elif bits == 64:
> > > >exp = ('imin', ('imax', exp, -2044), 2046)
> > > > +   elif bits == 16:
> > > > +  exp = ('imin', ('imax', exp, -30), 30)
> > > 
> > > I expected this to be:
> > > 
> > >  exp = ('imin', ('imax', exp, -29), 30)
> > 
> > Actually, I think this should be -28, since the minimum exponent
> > value
> > is -14.
> 
> I kept wondering about that. The offset is 15 and -14 - 15 yields -29. But
> -28 in turn would be more in line with the 32- and 64-bit cases.

I think the idea is to have this be 2x the minimum (and maximum)
exponents we can represent, since below we are dividing it by two and
emitting two exponentials, each with half that exponent. That way we
ensure that when we divide the exponent by 2 we still produce a
representable exponent for the bit-size.
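
To make the arithmetic concrete, here is a small Python sketch of the argument for 16-bit floats. The helper names are made up for this example, and the split into exp >> 1 and the remainder is an assumption about how the two exponentials are formed; only the clamp bounds and the fexp2i expression come from the patch.

```python
import struct

# IEEE 754 binary16: bias 15, normal exponents -14..15, 10 mantissa bits.
BIAS, MIN_EXP, MAX_EXP, MANTISSA_BITS = 15, -14, 15, 10


def fexp2i_16(exp):
    """Bit pattern of 2**exp as a half float, mirroring
    ('ishl', ('iadd', exp, 15), 10). Only meaningful for normal exponents."""
    return ((exp + BIAS) << MANTISSA_BITS) & 0xFFFF


def half_from_bits(bits):
    """Reinterpret a 16-bit pattern as a half float."""
    return struct.unpack("<e", struct.pack("<H", bits))[0]


def split_exponent(exp, lo, hi):
    """Clamp exp to [lo, hi], then split it into two halves that sum to it."""
    exp = min(max(exp, lo), hi)
    first = exp >> 1           # arithmetic shift, rounds toward -infinity
    second = exp - first
    return first, second


def representable(exp):
    return MIN_EXP <= exp <= MAX_EXP


# With the clamp at 2 * MIN_EXP = -28, both halves stay in range.
print(split_exponent(-100, -28, 30))
# With the clamp at -30, each half is -15, which has biased exponent 0.
print(split_exponent(-100, -30, 30), half_from_bits(fexp2i_16(-15)))
```

With a lower bound of -30 each half would be -15, and the constructed bit pattern decodes to 0.0 instead of 2**-15, so the product collapses to zero earlier than it has to.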

Iago

> > 
> > > > else:
> > > >assert False
> > > >  
> > > > @@ -814,6 +818,7 @@ def ldexp(f, exp, bits):
> > > >  optimizations += [
> > > > > (('ldexp@32', 'x', 'exp'), ldexp('x', 'exp', 32), 'options->lower_ldexp'),
> > > > > (('ldexp@64', 'x', 'exp'), ldexp('x', 'exp', 64), 'options->lower_ldexp'),
> > > > > +   (('ldexp@16', 'x', 'exp'), ldexp('x', 'exp', 16), 'options->lower_ldexp'),
> > > >  ]
> > > >  
> > > >  # Unreal Engine 4 demo applications open-codes
> > > > bitfieldReverse()
> > > > -- 
> > > > 2.17.1
> > > > 
> > > > ___
> > > > mesa-dev mailing list
> > > > mesa-dev@lists.freedesktop.org
> > > > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> > > 
> > > 
> 
> 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 7/8] gm107/ir: add lowering of atomic f32 add on shared memory

2018-12-05 Thread Karol Herbst
nvm, I somehow didn't notice that "if (atom->dType != TYPE_F32)" check...
On Wed, Dec 5, 2018 at 3:43 PM Karol Herbst  wrote:
>
> but uhm, how would that work if you assert(atom->subOp ==
> NV50_IR_SUBOP_ATOM_ADD); inside handleSharedATOMGM107? I thought
> that's only needed for fadd, not for all atoms
>
> On Wed, Dec 5, 2018 at 3:17 PM Ilia Mirkin  wrote:
> >
> > On Wed, Dec 5, 2018 at 4:59 AM Karol Herbst  wrote:
> > >
> > > On Wed, Dec 5, 2018 at 6:30 AM Ilia Mirkin  wrote:
> > > >
> > > > Signed-off-by: Ilia Mirkin 
> > > > ---
> > > >  .../nouveau/codegen/nv50_ir_lowering_nvc0.cpp | 49 +++
> > > >  .../nouveau/codegen/nv50_ir_lowering_nvc0.h   |  1 +
> > > >  2 files changed, 50 insertions(+)
> > > >
> > > > diff --git 
> > > > a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp 
> > > > b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
> > > > index 295497be2f9..44c62820342 100644
> > > > --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
> > > > +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
> > > > @@ -1347,6 +1347,53 @@ NVC0LoweringPass::handleBUFQ(Instruction *bufq)
> > > > return true;
> > > >  }
> > > >
> > > > +void
> > > > +NVC0LoweringPass::handleSharedATOMGM107(Instruction *atom)
> > > > +{
> > > > +   if (atom->dType != TYPE_F32)
> > > > +  return;
> > > > +
> > > > +   assert(atom->subOp == NV50_IR_SUBOP_ATOM_ADD);
> > > > +   assert(atom->src(0).getFile() == FILE_MEMORY_SHARED);
> > > > +
> > > > +   BasicBlock *currBB = atom->bb;
> > > > +   BasicBlock *addAndCasBB = atom->bb->splitBefore(atom, false);
> > > > +   BasicBlock *joinBB = atom->bb->splitAfter(atom);
> > > > +
> > > > +   bld.setPosition(currBB, true);
> > > > +
> > > > +   Value *load = atom->getDef(0), *newval = bld.getSSA();
> > > > +   // TODO: Use "U" subop?
> > > > +   bld.mkLoad(TYPE_U32, load, atom->getSrc(0)->asSym(), 
> > > > atom->getIndirect(0, 0));
> > > > +   assert(!currBB->joinAt);
> > > > +   currBB->joinAt = bld.mkFlow(OP_JOINAT, joinBB, CC_ALWAYS, NULL);
> > > > +
> > > > +   bld.mkFlow(OP_BRA, addAndCasBB, CC_ALWAYS, NULL);
> > > > > +   currBB->cfg.attach(&addAndCasBB->cfg, Graph::Edge::TREE);
> > > > +
> > > > +   bld.setPosition(addAndCasBB, true);
> > > > +   bld.remove(atom);
> > > > +
> > > > +   bld.mkOp2(OP_ADD, TYPE_F32, newval, load, atom->getSrc(1));
> > > > +
> > > > +   // Try to do a compare-and-swap. If the old value doesn't match the 
> > > > loaded
> > > > +   // value, repeat.
> > > > +   Value *old = bld.getSSA();
> > > > +   Instruction *cas =
> > > > +  bld.mkOp3(OP_ATOM, TYPE_U32, old, atom->getSrc(0), load, newval);
> > > > +   cas->setIndirect(0, 0, atom->getIndirect(0, 0));
> > > > +   cas->subOp = NV50_IR_SUBOP_ATOM_CAS;
> > > > +   Value *pred = bld.getSSA(1, FILE_PREDICATE);
> > > > +   bld.mkCmp(OP_SET, CC_EQ, TYPE_U32, pred, TYPE_U32, old, load);
> > > > +   bld.mkMov(load, old);
> > > > +   bld.mkFlow(OP_BRA, addAndCasBB, CC_NOT_P, pred);
> > > > +   bld.mkFlow(OP_BRA, joinBB, CC_ALWAYS, NULL);
> > > > > +   addAndCasBB->cfg.attach(&addAndCasBB->cfg, Graph::Edge::BACK);
> > > > +
> > > > +   bld.setPosition(joinBB, false);
> > > > +   bld.mkFlow(OP_JOIN, NULL, CC_ALWAYS, NULL)->fixed = 1;
> > > > +}
> > > > +
> > > >  void
> > > >  NVC0LoweringPass::handleSharedATOMNVE4(Instruction *atom)
> > > >  {
> > > > @@ -1559,6 +1606,8 @@ NVC0LoweringPass::handleATOM(Instruction *atom)
> > > >   handleSharedATOM(atom);
> > > >else if (targ->getChipset() < NVISA_GM107_CHIPSET)
> > > >   handleSharedATOMNVE4(atom);
> > > > +  else
> > > > + handleSharedATOMGM107(atom);
> > >
> > > but doesn't this makes all shared ATOM operations get lowered now?
> >
> > All shared ATOM operations (gm107+) call this function, yes.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 7/8] gm107/ir: add lowering of atomic f32 add on shared memory

2018-12-05 Thread Ilia Mirkin
On Wed, Dec 5, 2018 at 4:59 AM Karol Herbst  wrote:
>
> On Wed, Dec 5, 2018 at 6:30 AM Ilia Mirkin  wrote:
> >
> > Signed-off-by: Ilia Mirkin 
> > ---
> >  .../nouveau/codegen/nv50_ir_lowering_nvc0.cpp | 49 +++
> >  .../nouveau/codegen/nv50_ir_lowering_nvc0.h   |  1 +
> >  2 files changed, 50 insertions(+)
> >
> > diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp 
> > b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
> > index 295497be2f9..44c62820342 100644
> > --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
> > +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
> > @@ -1347,6 +1347,53 @@ NVC0LoweringPass::handleBUFQ(Instruction *bufq)
> > return true;
> >  }
> >
> > +void
> > +NVC0LoweringPass::handleSharedATOMGM107(Instruction *atom)
> > +{
> > +   if (atom->dType != TYPE_F32)
> > +  return;
> > +
> > +   assert(atom->subOp == NV50_IR_SUBOP_ATOM_ADD);
> > +   assert(atom->src(0).getFile() == FILE_MEMORY_SHARED);
> > +
> > +   BasicBlock *currBB = atom->bb;
> > +   BasicBlock *addAndCasBB = atom->bb->splitBefore(atom, false);
> > +   BasicBlock *joinBB = atom->bb->splitAfter(atom);
> > +
> > +   bld.setPosition(currBB, true);
> > +
> > +   Value *load = atom->getDef(0), *newval = bld.getSSA();
> > +   // TODO: Use "U" subop?
> > +   bld.mkLoad(TYPE_U32, load, atom->getSrc(0)->asSym(), 
> > atom->getIndirect(0, 0));
> > +   assert(!currBB->joinAt);
> > +   currBB->joinAt = bld.mkFlow(OP_JOINAT, joinBB, CC_ALWAYS, NULL);
> > +
> > +   bld.mkFlow(OP_BRA, addAndCasBB, CC_ALWAYS, NULL);
> > +   currBB->cfg.attach(&addAndCasBB->cfg, Graph::Edge::TREE);
> > +
> > +   bld.setPosition(addAndCasBB, true);
> > +   bld.remove(atom);
> > +
> > +   bld.mkOp2(OP_ADD, TYPE_F32, newval, load, atom->getSrc(1));
> > +
> > +   // Try to do a compare-and-swap. If the old value doesn't match the 
> > loaded
> > +   // value, repeat.
> > +   Value *old = bld.getSSA();
> > +   Instruction *cas =
> > +  bld.mkOp3(OP_ATOM, TYPE_U32, old, atom->getSrc(0), load, newval);
> > +   cas->setIndirect(0, 0, atom->getIndirect(0, 0));
> > +   cas->subOp = NV50_IR_SUBOP_ATOM_CAS;
> > +   Value *pred = bld.getSSA(1, FILE_PREDICATE);
> > +   bld.mkCmp(OP_SET, CC_EQ, TYPE_U32, pred, TYPE_U32, old, load);
> > +   bld.mkMov(load, old);
> > +   bld.mkFlow(OP_BRA, addAndCasBB, CC_NOT_P, pred);
> > +   bld.mkFlow(OP_BRA, joinBB, CC_ALWAYS, NULL);
> > +   addAndCasBB->cfg.attach(&addAndCasBB->cfg, Graph::Edge::BACK);
> > +
> > +   bld.setPosition(joinBB, false);
> > +   bld.mkFlow(OP_JOIN, NULL, CC_ALWAYS, NULL)->fixed = 1;
> > +}
> > +
> >  void
> >  NVC0LoweringPass::handleSharedATOMNVE4(Instruction *atom)
> >  {
> > @@ -1559,6 +1606,8 @@ NVC0LoweringPass::handleATOM(Instruction *atom)
> >   handleSharedATOM(atom);
> >else if (targ->getChipset() < NVISA_GM107_CHIPSET)
> >   handleSharedATOMNVE4(atom);
> > +  else
> > + handleSharedATOMGM107(atom);
>
> but doesn't this makes all shared ATOM operations get lowered now?

All shared ATOM operations (gm107+) call this function, yes.
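
For anyone following along, the lowering in the patch is the usual load / add / compare-and-swap retry loop. Below is a single-threaded Python sketch of that loop; the SharedWord class and the interfere hook are invented for illustration and only simulate a competing writer, they are not part of the nv50 IR.

```python
import struct


def f32_to_bits(x):
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits_to_f32(b):
    return struct.unpack("<f", struct.pack("<I", b))[0]


class SharedWord:
    """Toy 32-bit shared-memory word with a compare-and-swap (hypothetical)."""

    def __init__(self, bits):
        self.bits = bits

    def load(self):
        return self.bits

    def compare_and_swap(self, expected, new):
        old = self.bits
        if old == expected:
            self.bits = new
        return old               # always returns the previous contents


def atomic_fadd(word, addend, interfere=None):
    """Emulate an atomic f32 add with a CAS loop; returns the old value."""
    loaded = word.load()                                     # plain u32 load
    while True:
        newval = f32_to_bits(bits_to_f32(loaded) + addend)   # f32 add
        if interfere is not None:
            interfere(word)                                  # simulated other thread
        old = word.compare_and_swap(loaded, newval)          # u32 CAS
        if old == loaded:
            return bits_to_f32(old)                          # swap happened
        loaded = old                                         # retry with the fresh value


word = SharedWord(f32_to_bits(1.5))
print(atomic_fadd(word, 2.25), bits_to_f32(word.load()))
```

The comparison is done on the raw 32-bit pattern rather than on the float value, matching the TYPE_U32 compare and CAS in the patch.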
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 7/8] gm107/ir: add lowering of atomic f32 add on shared memory

2018-12-05 Thread Karol Herbst
but uhm, how would that work if you assert(atom->subOp ==
NV50_IR_SUBOP_ATOM_ADD); inside handleSharedATOMGM107? I thought
that's only needed for fadd, not for all atoms

On Wed, Dec 5, 2018 at 3:17 PM Ilia Mirkin  wrote:
>
> On Wed, Dec 5, 2018 at 4:59 AM Karol Herbst  wrote:
> >
> > On Wed, Dec 5, 2018 at 6:30 AM Ilia Mirkin  wrote:
> > >
> > > Signed-off-by: Ilia Mirkin 
> > > ---
> > >  .../nouveau/codegen/nv50_ir_lowering_nvc0.cpp | 49 +++
> > >  .../nouveau/codegen/nv50_ir_lowering_nvc0.h   |  1 +
> > >  2 files changed, 50 insertions(+)
> > >
> > > diff --git 
> > > a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp 
> > > b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
> > > index 295497be2f9..44c62820342 100644
> > > --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
> > > +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
> > > @@ -1347,6 +1347,53 @@ NVC0LoweringPass::handleBUFQ(Instruction *bufq)
> > > return true;
> > >  }
> > >
> > > +void
> > > +NVC0LoweringPass::handleSharedATOMGM107(Instruction *atom)
> > > +{
> > > +   if (atom->dType != TYPE_F32)
> > > +  return;
> > > +
> > > +   assert(atom->subOp == NV50_IR_SUBOP_ATOM_ADD);
> > > +   assert(atom->src(0).getFile() == FILE_MEMORY_SHARED);
> > > +
> > > +   BasicBlock *currBB = atom->bb;
> > > +   BasicBlock *addAndCasBB = atom->bb->splitBefore(atom, false);
> > > +   BasicBlock *joinBB = atom->bb->splitAfter(atom);
> > > +
> > > +   bld.setPosition(currBB, true);
> > > +
> > > +   Value *load = atom->getDef(0), *newval = bld.getSSA();
> > > +   // TODO: Use "U" subop?
> > > > +   bld.mkLoad(TYPE_U32, load, atom->getSrc(0)->asSym(), atom->getIndirect(0, 0));
> > > +   assert(!currBB->joinAt);
> > > +   currBB->joinAt = bld.mkFlow(OP_JOINAT, joinBB, CC_ALWAYS, NULL);
> > > +
> > > +   bld.mkFlow(OP_BRA, addAndCasBB, CC_ALWAYS, NULL);
> > > > +   currBB->cfg.attach(&addAndCasBB->cfg, Graph::Edge::TREE);
> > > +
> > > +   bld.setPosition(addAndCasBB, true);
> > > +   bld.remove(atom);
> > > +
> > > +   bld.mkOp2(OP_ADD, TYPE_F32, newval, load, atom->getSrc(1));
> > > +
> > > > +   // Try to do a compare-and-swap. If the old value doesn't match the loaded
> > > +   // value, repeat.
> > > +   Value *old = bld.getSSA();
> > > +   Instruction *cas =
> > > +  bld.mkOp3(OP_ATOM, TYPE_U32, old, atom->getSrc(0), load, newval);
> > > +   cas->setIndirect(0, 0, atom->getIndirect(0, 0));
> > > +   cas->subOp = NV50_IR_SUBOP_ATOM_CAS;
> > > +   Value *pred = bld.getSSA(1, FILE_PREDICATE);
> > > +   bld.mkCmp(OP_SET, CC_EQ, TYPE_U32, pred, TYPE_U32, old, load);
> > > +   bld.mkMov(load, old);
> > > +   bld.mkFlow(OP_BRA, addAndCasBB, CC_NOT_P, pred);
> > > +   bld.mkFlow(OP_BRA, joinBB, CC_ALWAYS, NULL);
> > > > +   addAndCasBB->cfg.attach(&addAndCasBB->cfg, Graph::Edge::BACK);
> > > +
> > > +   bld.setPosition(joinBB, false);
> > > +   bld.mkFlow(OP_JOIN, NULL, CC_ALWAYS, NULL)->fixed = 1;
> > > +}
> > > +
> > >  void
> > >  NVC0LoweringPass::handleSharedATOMNVE4(Instruction *atom)
> > >  {
> > > @@ -1559,6 +1606,8 @@ NVC0LoweringPass::handleATOM(Instruction *atom)
> > >   handleSharedATOM(atom);
> > >else if (targ->getChipset() < NVISA_GM107_CHIPSET)
> > >   handleSharedATOMNVE4(atom);
> > > +  else
> > > + handleSharedATOMGM107(atom);
> >
> > but doesn't this makes all shared ATOM operations get lowered now?
>
> All shared ATOM operations (gm107+) call this function, yes.
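
For anyone following the thread, the load + compare-and-swap retry loop that the quoted lowering builds can be modelled in a few lines of Python. This is only a sketch under assumptions: `SharedWord`, `atomic_fadd` and the bit-cast helpers are hypothetical names, and the shared-memory word is simulated by a plain object rather than real hardware.

```python
import struct


def f32_to_bits(x):
    """Return the 32-bit pattern of x rounded to float32."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits_to_f32(bits):
    """Reinterpret a 32-bit pattern as a float32 value."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


class SharedWord:
    """Hypothetical stand-in for one 32-bit word of shared memory."""

    def __init__(self, bits=0):
        self.bits = bits

    def cas(self, expected, new):
        """Store `new` only if the word equals `expected`; return the old bits."""
        old = self.bits
        if old == expected:
            self.bits = new
        return old


def atomic_fadd(word, addend):
    """Emulate an atomic f32 add; returns the value seen before the add."""
    load = word.bits                      # plain 32-bit load
    while True:
        newval = f32_to_bits(bits_to_f32(load) + addend)
        old = word.cas(load, newval)      # comparison is on raw bits (U32)
        if old == load:                   # nobody modified the word in between
            return bits_to_f32(old)
        load = old                        # retry with the value we observed
```

The comparison is done on the integer bit pattern, as in the patch (TYPE_U32), so NaN payloads and signed zeros do not confuse the retry test.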


Re: [Mesa-dev] [PATCH] radv: wait on the high 32 bits of timestamp queries

2018-12-05 Thread Emil Velikov
Hi guys

On Wed, 5 Dec 2018 at 10:49, Bas Nieuwenhuizen  wrote:
>
> Reviewed-by: Bas Nieuwenhuizen 
> On Wed, Dec 5, 2018 at 11:43 AM Samuel Pitoiset
>  wrote:
> >
> > In case we are unlucky if the low part is 0x.
> >
> > Fixes: 5d6a560a29 ("radv: do not use the availability bit for timestamp queries")
> > Signed-off-by: Samuel Pitoiset 

There was a trivial conflict when applying the patch to 18.3.
I've resolved it here [1]; please double-check that it looks good.

Thanks
Emil

[1] https://cgit.freedesktop.org/mesa/mesa/commit/?h=staging/18.3&id=d92bbe54eaf8406d2b3ceb8b6b7eba6c69681b76


Re: [Mesa-dev] [PATCH v3] nir/algebraic: Rewrite bit-size inference

2018-12-05 Thread Jason Ekstrand

Rb me.  Now you can review my comparison patches. 

On December 5, 2018 06:20:49 Connor Abbott  wrote:


Before this commit, there were two copies of the algorithm: one in C,
that we would use to figure out what bit-size to give the replacement
expression, and one in Python, that emulated the C one and tried to
prove that the C algorithm would never fail to correctly assign
bit-sizes. That seemed pretty fragile, and likely to fall over if we
make any changes. Furthermore, the C code was really just recomputing
more-or-less the same thing as the Python code every time. Instead, we
can just store the results of the Python algorithm in the C
datastructure, and consult it to compute the bitsize of each value,
moving the "brains" entirely into Python. Since the Python algorithm no
longer has to match C, it's also a lot easier to change it to something
more closely approximating an actual type-inference algorithm. The
algorithm used is based on Hindley-Milner, although deliberately
weakened a little. It's a few more lines than the old one, judging by
the diffstat, but I think it's easier to verify that it's correct while
being as general as possible.

We could split this up into two changes, first making the C code use the
results of the Python code and then rewriting the Python algorithm, but
since the old algorithm never tracked which variable each equivalence
class corresponds to, it would mean we'd have to add some non-trivial code which would
then get thrown away. I think it's better to see the final state all at
once, although I could also try splitting it up.
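
To make the union-find idea described above concrete, here is a minimal Python sketch. It is not the patch's code: `Node`, `find`, `union` and `c_bit_size` are hypothetical names, and the case where the class is neither a variable nor a concrete size is left unspecified because the patch text is cut off there.

```python
class Node:
    """Hypothetical value in a search or replace expression."""

    def __init__(self, name, is_variable=False):
        self.name = name
        self.is_variable = is_variable
        self._bit_size = None  # None, a concrete int, or another Node

    def find(self):
        """Return a concrete bit-size, or the representative of the class."""
        rep = self
        while isinstance(rep, Node) and rep._bit_size is not None:
            rep = rep._bit_size
        if rep is not self:
            self._bit_size = rep  # path compression
        return rep

    def union(self, other):
        """Merge this value's bit-size class with `other` (a Node or an int)."""
        a = self.find()
        b = other if isinstance(other, int) else other.find()
        if a is b or a == b:
            return
        if isinstance(a, int):
            if isinstance(b, int):
                raise ValueError("conflicting bit-sizes %d and %d" % (a, b))
            a, b = b, a  # the Node must point at the concrete size
        elif isinstance(b, Node) and a.is_variable and not b.is_variable:
            a, b = b, a  # prefer a variable as the representative
        a._bit_size = b


def c_bit_size(node, var_index):
    """Encode the result: a positive size, or -index - 1 for a variable."""
    rep = node.find()
    if isinstance(rep, int):
        return rep
    if rep.is_variable:
        return -var_index[rep.name] - 1
    return None  # unconstrained case: not covered by this sketch
```

Merging an expression with a variable first and a concrete size later gives every member of the class the same answer, which is the property the C side relies on when it only consults the stored result.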

v2:
- Replace instances of "== None" and "!= None" with "is None" and
"is not None".
- Rename first_src to first_unsized_src
- Only merge the destination with the first unsized source, since the
sources have already been merged.
- Add a comment explaining what nir_search_value::bit_size now means.
v3:
- Fix one last instance to use "is not" instead of !=
- Don't try to be so clever when choosing which error message to print
based on whether we're in the search or replace expression.
- Fix trailing whitespace.
---
src/compiler/nir/nir_algebraic.py | 520 --
src/compiler/nir/nir_search.c | 146 +
src/compiler/nir/nir_search.h |  17 +-
3 files changed, 317 insertions(+), 366 deletions(-)

diff --git a/src/compiler/nir/nir_algebraic.py b/src/compiler/nir/nir_algebraic.py
index 728196136ab..efd6e52cdb9 100644
--- a/src/compiler/nir/nir_algebraic.py
+++ b/src/compiler/nir/nir_algebraic.py
@@ -88,7 +88,7 @@ class Value(object):

   __template = mako.template.Template("""
static const ${val.c_type} ${val.name} = {
-   { ${val.type_enum}, ${val.bit_size} },
+   { ${val.type_enum}, ${val.c_bit_size} },
% if isinstance(val, Constant):
   ${val.type()}, { ${val.hex()} /* ${val.value} */ },
% elif isinstance(val, Variable):
@@ -112,6 +112,40 @@ static const ${val.c_type} ${val.name} = {
   def __str__(self):
  return self.in_val

+   def get_bit_size(self):
+  """Get the physical bit-size that has been chosen for this value, or if
+  there is none, the canonical value which currently represents this
+  bit-size class. Variables will be preferred, i.e. if there are any
+  variables in the equivalence class, the canonical value will be a
+  variable. We do this since we'll need to know which variable each value
+  is equivalent to when constructing the replacement expression. This is
+  the "find" part of the union-find algorithm.
+  """
+  bit_size = self
+
+  while isinstance(bit_size, Value):
+ if bit_size._bit_size is None:
+break
+ bit_size = bit_size._bit_size
+
+  if bit_size is not self:
+ self._bit_size = bit_size
+  return bit_size
+
+   def set_bit_size(self, other):
+  """Make self.get_bit_size() return what other.get_bit_size() return
+  before calling this, or just "other" if it's a concrete bit-size. This is
+  the "union" part of the union-find algorithm.
+  """
+
+  self_bit_size = self.get_bit_size()
+  other_bit_size = other if isinstance(other, int) else other.get_bit_size()
+
+  if self_bit_size == other_bit_size:
+ return
+
+  self_bit_size._bit_size = other_bit_size
+
   @property
   def type_enum(self):
  return "nir_search_value_" + self.type_str
@@ -124,6 +158,21 @@ static const ${val.c_type} ${val.name} = {
   def c_ptr(self):
  return "&{0}.value".format(self.name)

+   @property
+   def c_bit_size(self):
+  bit_size = self.get_bit_size()
+  if isinstance(bit_size, int):
+ return bit_size
+  elif isinstance(bit_size, Variable):
+ return -bit_size.index - 1
+  else:
+ # If the bit-size class is neither a variable, nor an actual bit-size, then
+ # - If it's in the search expression, we don't need to check anything
+ # - If it's in the replace expression, either it's ambiguous (in which
+ # case we'd 

Re: [Mesa-dev] [PATCH] gallium: Android build fixes

2018-12-05 Thread Emil Velikov
On Tue, 4 Dec 2018 at 18:51, Kristian H. Kristensen  wrote:
>
> A couple of simple fixes for building on Android with autotools.

Reviewed-by: Emil Velikov 

-Emil


[Mesa-dev] [PATCH mesa] radv: drop unused variable

2018-12-05 Thread Eric Engestrom
Added in 824cfc1ee5e0aba15b676 "radv: rework the TC-compat HTILE
hardware bug with COND_EXEC", but it is unused.

Signed-off-by: Eric Engestrom 
---
 src/amd/vulkan/radv_cmd_buffer.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index 23909a0f7dda537bf9a1..945442d7b97974f780b2 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -1278,7 +1278,6 @@ radv_update_tc_compat_zrange_metadata(struct radv_cmd_buffer *cmd_buffer,
  struct radv_image *image,
  VkClearDepthStencilValue ds_clear_value)
 {
-   struct radeon_cmdbuf *cs = cmd_buffer->cs;
uint64_t va = radv_buffer_get_va(image->bo);
va += image->offset + image->tc_compat_zrange_offset;
uint32_t cond_val;
-- 
Cheers,
  Eric



[Mesa-dev] [PATCH 27/28] anv: enable support for SPV_KHR_shader_float_controls capabilities

2018-12-05 Thread Samuel Iglesias Gonsálvez
Signed-off-by: Samuel Iglesias Gonsálvez 
---
 src/intel/vulkan/anv_pipeline.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/intel/vulkan/anv_pipeline.c b/src/intel/vulkan/anv_pipeline.c
index d55e51adcbb..cadf9288ad9 100644
--- a/src/intel/vulkan/anv_pipeline.c
+++ b/src/intel/vulkan/anv_pipeline.c
@@ -144,6 +144,7 @@ anv_shader_compile_to_nir(struct anv_pipeline *pipeline,
  .image_write_without_format = true,
  .multiview = true,
  .variable_pointers = true,
+ .shader_float_controls = device->instance->physicalDevice.info.gen >= 8,
  .storage_16bit = device->instance->physicalDevice.info.gen >= 8,
  .int16 = device->instance->physicalDevice.info.gen >= 8,
  .float16 = device->instance->physicalDevice.info.gen >= 8,
-- 
2.19.1



[Mesa-dev] [PATCH 28/28] anv: enable VK_KHR_shader_float_controls extension

2018-12-05 Thread Samuel Iglesias Gonsálvez
Signed-off-by: Samuel Iglesias Gonsálvez 
---
 src/intel/vulkan/anv_extensions.py | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/intel/vulkan/anv_extensions.py b/src/intel/vulkan/anv_extensions.py
index 9ca42d998ef..d572df3c342 100644
--- a/src/intel/vulkan/anv_extensions.py
+++ b/src/intel/vulkan/anv_extensions.py
@@ -105,6 +105,7 @@ EXTENSIONS = [
 Extension('VK_KHR_sampler_ycbcr_conversion',  1, True),
 Extension('VK_KHR_shader_draw_parameters',1, True),
 Extension('VK_KHR_shader_float16_int8',   1, 'device->info.gen >= 8'),
+Extension('VK_KHR_shader_float_controls', 1, 'device->info.gen >= 8'),
 Extension('VK_KHR_storage_buffer_storage_class',  1, True),
 Extension('VK_KHR_surface',  25, 'ANV_HAS_SURFACE'),
 Extension('VK_KHR_swapchain',68, 'ANV_HAS_SURFACE'),
-- 
2.19.1



[Mesa-dev] [PATCH 06/28] spirv/glsl450: fix atan2(0,0) lowering

2018-12-05 Thread Samuel Iglesias Gonsálvez
We were returning 3*pi/4 when we should return 0.0 according to IEEE 754.

Signed-off-by: Samuel Iglesias Gonsálvez 
---
 src/compiler/spirv/vtn_glsl450.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/src/compiler/spirv/vtn_glsl450.c b/src/compiler/spirv/vtn_glsl450.c
index 64a1431ae14..ab4eb7da1e3 100644
--- a/src/compiler/spirv/vtn_glsl450.c
+++ b/src/compiler/spirv/vtn_glsl450.c
@@ -402,8 +402,12 @@ build_atan2(nir_builder *b, nir_ssa_def *y, nir_ssa_def *x)
 * continuous along the whole positive y = 0 half-line, so it won't affect
 * the result significantly.
 */
-   return nir_bcsel(b, nir_flt(b, nir_fmin(b, y, rcp_scaled_t), zero),
-nir_fneg(b, arc), arc);
+   nir_ssa_def *result = nir_bcsel(b, nir_flt(b, nir_fmin(b, y, rcp_scaled_t), zero),
+   nir_fneg(b, arc), arc);
+   nir_ssa_def *is_xy_zero = nir_iand(b,
+ nir_feq(b, x, zero),
+ nir_feq(b, y, zero));
+   return nir_bcsel(b, is_xy_zero, zero, result);
 }
 
 static nir_ssa_def *
-- 
2.19.1
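
A small scalar model of the select added by this patch, for readers who want to see the behaviour outside NIR. This is a sketch with hypothetical names (`bcsel`, `atan2_with_zero_fix`); the `general` argument stands in for the existing approximation, which is not reproduced here.

```python
import math


def bcsel(cond, a, b):
    """Scalar stand-in for nir_bcsel: a when cond is true, otherwise b."""
    return a if cond else b


def atan2_with_zero_fix(y, x, general):
    """Return 0.0 at the origin, otherwise defer to `general(y, x)`."""
    is_xy_zero = (x == 0.0) and (y == 0.0)
    # Both operands are evaluated before the select, as in the NIR version.
    return bcsel(is_xy_zero, 0.0, general(y, x))
```

Note that -0.0 compares equal to 0.0, so all four signed-zero combinations map to 0.0 in this model, whereas C's atan2() distinguishes them (for example atan2(+0, -0) is pi).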



[Mesa-dev] [PATCH 18/28] intel/nir: call nir_opt_constant_folding before brw_nir_apply_trig_workarounds

2018-12-05 Thread Samuel Iglesias Gonsálvez
If fsin or fcos operations have constant values as inputs,
brw_nir_apply_trig_workarounds multiplies their result by 0.99997,
making the result wrong. Running nir_opt_constant_folding first computes
the correct result for these trigonometric ops.

Signed-off-by: Samuel Iglesias Gonsálvez 
---
 src/intel/compiler/brw_nir.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/intel/compiler/brw_nir.c b/src/intel/compiler/brw_nir.c
index 600f7a97df9..41e27054595 100644
--- a/src/intel/compiler/brw_nir.c
+++ b/src/intel/compiler/brw_nir.c
@@ -664,8 +664,10 @@ brw_preprocess_nir(const struct brw_compiler *compiler, nir_shader *nir)
 
/* See also brw_nir_trig_workarounds.py */
if (compiler->precise_trig &&
-   !(devinfo->gen >= 10 || devinfo->is_kabylake))
+   !(devinfo->gen >= 10 || devinfo->is_kabylake)) {
+  OPT(nir_opt_constant_folding);
   OPT(brw_nir_apply_trig_workarounds);
+   }
 
static const nir_lower_tex_options tex_options = {
   .lower_txp = ~0,
-- 
2.19.1
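
The ordering issue can be illustrated with a toy expression tree. This is a sketch only: the tuple-based IR, `fold_constants`, `apply_trig_workaround` and `evaluate` are hypothetical, and `WORKAROUND_SCALE` is an assumed value for the scaling factor rather than something taken from this patch.

```python
import math

WORKAROUND_SCALE = 0.99997  # assumed scaling factor, for illustration only
TRIG = {"fsin": math.sin, "fcos": math.cos}


def fold_constants(expr):
    """Evaluate fsin/fcos nodes whose operand is a literal constant."""
    if isinstance(expr, tuple):
        op, arg = expr[0], fold_constants(expr[1])
        if op in TRIG and isinstance(arg, float):
            return TRIG[op](arg)
        return (op, arg)
    return expr


def apply_trig_workaround(expr):
    """Wrap every remaining fsin/fcos node in a multiply by the scale."""
    if isinstance(expr, tuple):
        op, arg = expr[0], apply_trig_workaround(expr[1])
        if op in TRIG:
            return ("fmul", (op, arg), WORKAROUND_SCALE)
        return (op, arg)
    return expr


def evaluate(expr, env):
    """Evaluate a tree; strings are variable names looked up in env."""
    if isinstance(expr, float):
        return expr
    if isinstance(expr, str):
        return env[expr]
    if expr[0] == "fmul":
        return evaluate(expr[1], env) * evaluate(expr[2], env)
    return TRIG[expr[0]](evaluate(expr[1], env))
```

Folding first leaves nothing for the workaround to scale when the operand is constant, while non-constant operands still get the multiply.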



[Mesa-dev] [PATCH 16/28] nir: fix fmin/fmax support for doubles

2018-12-05 Thread Samuel Iglesias Gonsálvez
Until now, fmin and fmax used the single-precision C functions (fminf,
fmaxf) instead of the double-precision ones (fmin, fmax).

Signed-off-by: Samuel Iglesias Gonsálvez 
---
 src/compiler/nir/nir_opcodes.py | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/compiler/nir/nir_opcodes.py b/src/compiler/nir/nir_opcodes.py
index 191025f6932..38f1fde72b6 100644
--- a/src/compiler/nir/nir_opcodes.py
+++ b/src/compiler/nir/nir_opcodes.py
@@ -569,10 +569,10 @@ opcode("fdph", 1, tfloat, [3, 4], [tfloat, tfloat], "",
 opcode("fdph_replicated", 4, tfloat, [3, 4], [tfloat, tfloat], "",
"src0.x * src1.x + src0.y * src1.y + src0.z * src1.z + src1.w")
 
-binop("fmin", tfloat, "", "fminf(src0, src1)")
+binop("fmin", tfloat, "", "fmin(src0, src1)")
 binop("imin", tint, commutative + associative, "src1 > src0 ? src0 : src1")
 binop("umin", tuint, commutative + associative, "src1 > src0 ? src0 : src1")
-binop("fmax", tfloat, "", "fmaxf(src0, src1)")
+binop("fmax", tfloat, "", "fmax(src0, src1)")
 binop("imax", tint, commutative + associative, "src1 > src0 ? src1 : src0")
 binop("umax", tuint, commutative + associative, "src1 > src0 ? src1 : src0")
 
-- 
2.19.1
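
To see why the single-precision variant matters for doubles, here is a short Python model. It is a sketch under assumptions: `to_f32`, `fminf_like` and `fmin_like` are hypothetical helpers, and Python's min() is used in place of the C functions, so NaN handling is deliberately ignored.

```python
import struct


def to_f32(x):
    """Round a Python float (a double) to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def fminf_like(a, b):
    """Model of fminf() on doubles: operands are narrowed to float32 first."""
    return min(to_f32(a), to_f32(b))


def fmin_like(a, b):
    """Model of fmin(): operands keep full double precision."""
    return min(a, b)


a = 1.0 + 2.0 ** -40   # differs from 1.0 only below float32 precision
b = 1.0 + 2.0 ** -30
print(fmin_like(a, b) == a)     # the double version keeps the low bits
print(fminf_like(a, b) == 1.0)  # the float version collapses both to 1.0
```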



[Mesa-dev] [PATCH 03/28] spirv/nir: keep track of SPV_KHR_shader_float_controls execution modes

2018-12-05 Thread Samuel Iglesias Gonsálvez
Signed-off-by: Samuel Iglesias Gonsálvez 
---
 src/compiler/shader_enums.h   | 14 ++
 src/compiler/shader_info.h|  3 +++
 src/compiler/spirv/spirv_to_nir.c | 26 ++
 3 files changed, 43 insertions(+)

diff --git a/src/compiler/shader_enums.h b/src/compiler/shader_enums.h
index f023b48cbb3..15caf753efb 100644
--- a/src/compiler/shader_enums.h
+++ b/src/compiler/shader_enums.h
@@ -750,6 +750,20 @@ enum compare_func
COMPARE_FUNC_ALWAYS,
 };
 
+enum shader_float_controls
+{
+   SHADER_DEFAULT_FLOAT_CONTROL_MODE  = 0x0000,
+   SHADER_DENORM_PRESERVE_FP16= 0x0001,
+   SHADER_DENORM_PRESERVE_FP32= 0x0002,
+   SHADER_DENORM_PRESERVE_FP64= 0x0004,
+   SHADER_DENORM_FLUSH_TO_ZERO_FP16   = 0x0008,
+   SHADER_DENORM_FLUSH_TO_ZERO_FP32   = 0x0010,
+   SHADER_DENORM_FLUSH_TO_ZERO_FP64   = 0x0020,
+   SHADER_SIGNED_ZERO_INF_NAN_PRESERVE= 0x0040,
+   SHADER_ROUNDING_MODE_RTE   = 0x0080,
+   SHADER_ROUNDING_MODE_RTZ   = 0x0100,
+};
+
 #ifdef __cplusplus
 } /* extern "C" */
 #endif
diff --git a/src/compiler/shader_info.h b/src/compiler/shader_info.h
index 21c3d371a63..0383058522e 100644
--- a/src/compiler/shader_info.h
+++ b/src/compiler/shader_info.h
@@ -133,6 +133,9 @@ typedef struct shader_info {
/** Was this shader linked with any transform feedback varyings? */
bool has_transform_feedback_varyings;
 
+   /* SPV_KHR_shader_float_controls: execution mode for floating point ops */
+   unsigned shader_float_controls_execution_mode;
+
union {
   struct {
  /* Which inputs are doubles */
diff --git a/src/compiler/spirv/spirv_to_nir.c b/src/compiler/spirv/spirv_to_nir.c
index 100fcd8e298..3f71644ce34 100644
--- a/src/compiler/spirv/spirv_to_nir.c
+++ b/src/compiler/spirv/spirv_to_nir.c
@@ -3758,6 +3758,32 @@ vtn_handle_execution_mode(struct vtn_builder *b, struct vtn_value *entry_point,
   vtn_assert(b->shader->info.stage == MESA_SHADER_FRAGMENT);
   break;
 
+   case SpvExecutionModeDenormPreserve:
+  switch (mode->literals[0]) {
+  case 16: b->shader->info.shader_float_controls_execution_mode |= SHADER_DENORM_PRESERVE_FP16; break;
+  case 32: b->shader->info.shader_float_controls_execution_mode |= SHADER_DENORM_PRESERVE_FP32; break;
+  case 64: b->shader->info.shader_float_controls_execution_mode |= SHADER_DENORM_PRESERVE_FP64; break;
+  default: vtn_fail("Floating point type not supported");
+  }
+  break;
+   case SpvExecutionModeDenormFlushToZero:
+  switch (mode->literals[0]) {
+  case 16: b->shader->info.shader_float_controls_execution_mode |= SHADER_DENORM_FLUSH_TO_ZERO_FP16; break;
+  case 32: b->shader->info.shader_float_controls_execution_mode |= SHADER_DENORM_FLUSH_TO_ZERO_FP32; break;
+  case 64: b->shader->info.shader_float_controls_execution_mode |= SHADER_DENORM_FLUSH_TO_ZERO_FP64; break;
+  default: vtn_fail("Floating point type not supported");
+  }
+   break;
+   case SpvExecutionModeSignedZeroInfNanPreserve:
+  b->shader->info.shader_float_controls_execution_mode |= SHADER_SIGNED_ZERO_INF_NAN_PRESERVE;
+  break;
+   case SpvExecutionModeRoundingModeRTE:
+  b->shader->info.shader_float_controls_execution_mode |= SHADER_ROUNDING_MODE_RTE;
+  break;
+   case SpvExecutionModeRoundingModeRTZ:
+  b->shader->info.shader_float_controls_execution_mode |= SHADER_ROUNDING_MODE_RTZ;
+  break;
+
default:
   vtn_fail("Unhandled execution mode");
}
-- 
2.19.1
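
The flag accumulation done by this patch can be modelled with a Python IntFlag. This is only a sketch: `FloatControls` and `handle_execution_mode` are hypothetical names, the string mode names stand in for the SpvExecutionMode* enumerators, and the zero value for the default mode is an assumption based on the other enumerators being single bits.

```python
from enum import IntFlag


class FloatControls(IntFlag):
    """Mirror of the shader_float_controls bit flags from the patch."""
    DEFAULT = 0x0000  # assumed value
    DENORM_PRESERVE_FP16 = 0x0001
    DENORM_PRESERVE_FP32 = 0x0002
    DENORM_PRESERVE_FP64 = 0x0004
    DENORM_FLUSH_TO_ZERO_FP16 = 0x0008
    DENORM_FLUSH_TO_ZERO_FP32 = 0x0010
    DENORM_FLUSH_TO_ZERO_FP64 = 0x0020
    SIGNED_ZERO_INF_NAN_PRESERVE = 0x0040
    ROUNDING_MODE_RTE = 0x0080
    ROUNDING_MODE_RTZ = 0x0100


# Modes whose flag depends on the floating-point bit width operand.
_PER_WIDTH = {
    "DenormPreserve": {
        16: FloatControls.DENORM_PRESERVE_FP16,
        32: FloatControls.DENORM_PRESERVE_FP32,
        64: FloatControls.DENORM_PRESERVE_FP64,
    },
    "DenormFlushToZero": {
        16: FloatControls.DENORM_FLUSH_TO_ZERO_FP16,
        32: FloatControls.DENORM_FLUSH_TO_ZERO_FP32,
        64: FloatControls.DENORM_FLUSH_TO_ZERO_FP64,
    },
}

# Modes that map to a single flag regardless of width, as in the patch.
_SIMPLE = {
    "SignedZeroInfNanPreserve": FloatControls.SIGNED_ZERO_INF_NAN_PRESERVE,
    "RoundingModeRTE": FloatControls.ROUNDING_MODE_RTE,
    "RoundingModeRTZ": FloatControls.ROUNDING_MODE_RTZ,
}


def handle_execution_mode(current, mode, bit_width=None):
    """Return `current` with the flag for one execution mode OR-ed in."""
    if mode in _PER_WIDTH:
        flags = _PER_WIDTH[mode]
        if bit_width not in flags:
            raise ValueError("Floating point type not supported")
        return current | flags[bit_width]
    if mode in _SIMPLE:
        return current | _SIMPLE[mode]
    raise ValueError("Unhandled execution mode")
```

A shader declaring DenormPreserve for 32-bit floats plus RoundingModeRTZ would end up with the value 0x0102 in this model.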



[Mesa-dev] [PATCH 07/28] spirv/glsl450: fix atan2(x, x) case

2018-12-05 Thread Samuel Iglesias Gonsálvez
If x < 0 -> atan2(x, x) = -3*pi/4.
If x > 0 -> atan2(x, x) = pi/4.

Signed-off-by: Samuel Iglesias Gonsálvez 
---
 src/compiler/spirv/vtn_glsl450.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/src/compiler/spirv/vtn_glsl450.c b/src/compiler/spirv/vtn_glsl450.c
index ab4eb7da1e3..0115648cbb0 100644
--- a/src/compiler/spirv/vtn_glsl450.c
+++ b/src/compiler/spirv/vtn_glsl450.c
@@ -35,6 +35,7 @@
 #define M_PIf   ((float) M_PI)
 #define M_PI_2f ((float) M_PI_2)
 #define M_PI_4f ((float) M_PI_4)
+#define M_minus_3PI_4f ((float) -3*M_PI_4)
 
 static nir_ssa_def *
 build_mat2_det(nir_builder *b, nir_ssa_def *col[2])
@@ -402,12 +403,16 @@ build_atan2(nir_builder *b, nir_ssa_def *y, nir_ssa_def *x)
 * continuous along the whole positive y = 0 half-line, so it won't affect
 * the result significantly.
 */
-   nir_ssa_def *result = nir_bcsel(b, nir_flt(b, nir_fmin(b, y, rcp_scaled_t), zero),
+   nir_ssa_def *atan2 = nir_bcsel(b, nir_flt(b, nir_fmin(b, y, rcp_scaled_t), zero),
nir_fneg(b, arc), arc);
nir_ssa_def *is_xy_zero = nir_iand(b,
  nir_feq(b, x, zero),
  nir_feq(b, y, zero));
-   return nir_bcsel(b, is_xy_zero, zero, result);
+   nir_ssa_def *res_equal = nir_bcsel(b, nir_feq(b, x, y),
+  nir_bcsel(b, nir_flt(b, x, zero), nir_imm_floatN_t(b, M_minus_3PI_4f, bit_size), nir_imm_floatN_t(b, M_PI_4f, bit_size)),
+  atan2);
+   nir_ssa_def *res =  nir_bcsel(b, is_xy_zero, zero, res_equal);
+   return res;
 }
 
 static nir_ssa_def *
-- 
2.19.1
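
The resulting chain of selects (origin first, then the diagonal, then the general approximation) can be checked against a reference atan2 with a scalar model. This is a sketch with hypothetical names; `general` stands in for the existing approximation and is not the driver's code.

```python
import math

M_PI_4 = math.pi / 4
M_MINUS_3PI_4 = -3.0 * math.pi / 4


def atan2_select(y, x, general):
    """Scalar model of the select chain at the end of build_atan2."""
    is_xy_zero = (x == 0.0) and (y == 0.0)
    diagonal = M_MINUS_3PI_4 if x < 0.0 else M_PI_4
    res_equal = diagonal if x == y else general(y, x)
    return 0.0 if is_xy_zero else res_equal


# Even a useless general path gives the right answer on the diagonal.
useless = lambda y, x: float("nan")
print(atan2_select(2.0, 2.0, useless))
print(atan2_select(-2.0, -2.0, useless))
```

The origin test has to win over the diagonal test, since (0, 0) also satisfies x == y; the sketch keeps the same priority as the patch.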



[Mesa-dev] [PATCH 24/28] i965/fs: remove brw_rounding_mode() and use brw_float_controls_mode() instead

2018-12-05 Thread Samuel Iglesias Gonsálvez
Signed-off-by: Samuel Iglesias Gonsálvez 
---
 src/intel/compiler/brw_eu.h |  4 ---
 src/intel/compiler/brw_eu_emit.c| 36 -
 src/intel/compiler/brw_fs_generator.cpp | 13 +++--
 src/intel/compiler/brw_fs_nir.cpp   | 18 +++--
 4 files changed, 27 insertions(+), 44 deletions(-)

diff --git a/src/intel/compiler/brw_eu.h b/src/intel/compiler/brw_eu.h
index 2309d3b10d8..ae068964936 100644
--- a/src/intel/compiler/brw_eu.h
+++ b/src/intel/compiler/brw_eu.h
@@ -673,10 +673,6 @@ brw_broadcast(struct brw_codegen *p,
   struct brw_reg src,
   struct brw_reg idx);
 
-void
-brw_rounding_mode(struct brw_codegen *p,
-  enum brw_rnd_mode mode);
-
 void
 brw_float_controls_mode(struct brw_codegen *p,
 unsigned mode, unsigned mask);
diff --git a/src/intel/compiler/brw_eu_emit.c b/src/intel/compiler/brw_eu_emit.c
index f0193712a9f..cda6f3ea70f 100644
--- a/src/intel/compiler/brw_eu_emit.c
+++ b/src/intel/compiler/brw_eu_emit.c
@@ -3634,42 +3634,6 @@ brw_WAIT(struct brw_codegen *p)
brw_inst_set_mask_control(devinfo, insn, BRW_MASK_DISABLE);
 }
 
-/**
- * Changes the floating point rounding mode updating the control register
- * field defined at cr0.0[5-6] bits. This function supports the changes to
- * RTNE (00), RU (01), RD (10) and RTZ (11) rounding using bitwise operations.
- * Only RTNE and RTZ rounding are enabled at nir.
- */
-void
-brw_rounding_mode(struct brw_codegen *p,
-  enum brw_rnd_mode mode)
-{
-   const unsigned bits = mode << BRW_CR0_RND_MODE_SHIFT;
-
-   if (bits != BRW_CR0_RND_MODE_MASK) {
-  brw_inst *inst = brw_AND(p, brw_cr0_reg(0), brw_cr0_reg(0),
-   brw_imm_ud(~BRW_CR0_RND_MODE_MASK));
-  brw_inst_set_exec_size(p->devinfo, inst, BRW_EXECUTE_1);
-
-  /* From the Skylake PRM, Volume 7, page 760:
-   *  "Implementation Restriction on Register Access: When the control
-   *   register is used as an explicit source and/or destination, hardware
-   *   does not ensure execution pipeline coherency. Software must set the
-   *   thread control field to ‘switch’ for an instruction that uses
-   *   control register as an explicit operand."
-   */
-  brw_inst_set_thread_control(p->devinfo, inst, BRW_THREAD_SWITCH);
-}
-
-   if (bits) {
-  brw_inst *inst = brw_OR(p, brw_cr0_reg(0), brw_cr0_reg(0),
-  brw_imm_ud(bits));
-  brw_inst_set_exec_size(p->devinfo, inst, BRW_EXECUTE_1);
-  brw_inst_set_thread_control(p->devinfo, inst, BRW_THREAD_SWITCH);
-   }
-}
-
-/* TODO: Refactor brw_rounding_mode() to use this. */
 void
 brw_float_controls_mode(struct brw_codegen *p,
 unsigned mode, unsigned mask)
diff --git a/src/intel/compiler/brw_fs_generator.cpp b/src/intel/compiler/brw_fs_generator.cpp
index 7ae42a639f8..d92ee063893 100644
--- a/src/intel/compiler/brw_fs_generator.cpp
+++ b/src/intel/compiler/brw_fs_generator.cpp
@@ -2423,9 +2423,18 @@ fs_generator::generate_code(const cfg_t *cfg, int dispatch_width)
  brw_DIM(p, dst, retype(src[0], BRW_REGISTER_TYPE_F));
  break;
 
-  case SHADER_OPCODE_RND_MODE:
+  case SHADER_OPCODE_RND_MODE: {
  assert(src[0].file == BRW_IMMEDIATE_VALUE);
- brw_rounding_mode(p, (enum brw_rnd_mode) src[0].d);
+ /*
+  * Changes the floating point rounding mode updating the control register
+  * field defined at cr0.0[5-6] bits. This function supports the changes to
+  * RTNE (00), RU (01), RD (10) and RTZ (11) rounding using bitwise operations.
+  * Only RTNE and RTZ rounding are enabled at nir.
+  */
+ enum brw_rnd_mode mode =
+(enum brw_rnd_mode) (src[0].d << BRW_CR0_RND_MODE_SHIFT);
+ brw_float_controls_mode(p, mode, BRW_CR0_RND_MODE_MASK);
+  }
  break;
 
   case SHADER_OPCODE_FLOAT_CONTROL_MODE:
diff --git a/src/intel/compiler/brw_fs_nir.cpp b/src/intel/compiler/brw_fs_nir.cpp
index de24a322e68..28f38949f73 100644
--- a/src/intel/compiler/brw_fs_nir.cpp
+++ b/src/intel/compiler/brw_fs_nir.cpp
@@ -653,10 +653,15 @@ emit_find_msb_using_lzd(const fs_builder ,
 }
 
 static brw_rnd_mode
-brw_rnd_mode_from_nir_op (const nir_op op) {
+brw_rnd_mode_from_nir_op (const nir_op op)
+{
switch (op) {
+   case nir_op_f2f64_rtz:
+   case nir_op_f2f32_rtz:
case nir_op_f2f16_rtz:
   return BRW_RND_MODE_RTZ;
+   case nir_op_f2f64_rtne:
+   case nir_op_f2f32_rtne:
case nir_op_f2f16_rtne:
   return BRW_RND_MODE_RTNE;
default:
@@ -803,6 +808,9 @@ fs_visitor::nir_emit_alu(const fs_builder &bld, nir_alu_instr *instr)
 
case nir_op_f2f64_rtne:
case nir_op_f2f64_rtz:
+  bld.emit(SHADER_OPCODE_RND_MODE, bld.null_reg_ud(),
+   brw_imm_d(brw_rnd_mode_from_nir_op(instr->op)));
+  /* fallthrough */
case nir_op_f2f64:
case nir_op_f2i64:
case 

[Mesa-dev] [PATCH 26/28] anv: add support for VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_FLOAT_CONTROLS_PROPERTIES_KHR

2018-12-05 Thread Samuel Iglesias Gonsálvez
Signed-off-by: Samuel Iglesias Gonsálvez 
---
 src/intel/vulkan/anv_device.c | 31 +++
 1 file changed, 31 insertions(+)

diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index 17b73c115cd..af07c7c831e 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -1245,6 +1245,37 @@ void anv_GetPhysicalDeviceProperties2(
  properties->quadOperationsInAllStages = VK_TRUE;
  break;
   }
+  case VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_FLOAT_CONTROLS_PROPERTIES_KHR : {
+ VkPhysicalDeviceFloatControlsPropertiesKHR *properties = (void *)ext;
+ properties->separateDenormSettings = true;
+ properties->separateRoundingModeSettings = false;
+
+ /* Broadwell does not support HF denorms and there are restrictions
+  * on other gens. According to Kabylake's PRM:
+  *
+  * "math - Extended Math Function
+  * [...]
+  * Restriction : Half-float denorms are always retained."
+  */
+ properties->shaderDenormFlushToZeroFloat16 = false;
+ properties->shaderDenormPreserveFloat16 = pdevice->info.gen > 8;
+ properties->shaderRoundingModeRTEFloat16 = true;
+ properties->shaderRoundingModeRTZFloat16 = true;
+ properties->shaderSignedZeroInfNanPreserveFloat16 = false;
+
+ properties->shaderDenormFlushToZeroFloat32 = true;
+ properties->shaderDenormPreserveFloat32 = true;
+ properties->shaderRoundingModeRTEFloat32 = true;
+ properties->shaderRoundingModeRTZFloat32 = true;
+ properties->shaderSignedZeroInfNanPreserveFloat32 = false;
+
+ properties->shaderDenormFlushToZeroFloat64 = true;
+ properties->shaderDenormPreserveFloat64 = true;
+ properties->shaderRoundingModeRTEFloat64 = true;
+ properties->shaderRoundingModeRTZFloat64 = true;
+ properties->shaderSignedZeroInfNanPreserveFloat64 = false;
+ break;
+  }
 
   case VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_VERTEX_ATTRIBUTE_DIVISOR_PROPERTIES_EXT: {
  VkPhysicalDeviceVertexAttributeDivisorPropertiesEXT *props =
-- 
2.19.1



[Mesa-dev] [PATCH 21/28] i965/fs/generator: add support to set floating points modes in control register

2018-12-05 Thread Samuel Iglesias Gonsálvez
Signed-off-by: Samuel Iglesias Gonsálvez 
---
 src/intel/compiler/brw_eu.h |  4 
 src/intel/compiler/brw_eu_defines.h | 10 ++
 src/intel/compiler/brw_eu_emit.c| 26 +
 src/intel/compiler/brw_fs_generator.cpp |  8 +++-
 src/intel/compiler/brw_shader.cpp   |  3 +++
 5 files changed, 50 insertions(+), 1 deletion(-)

diff --git a/src/intel/compiler/brw_eu.h b/src/intel/compiler/brw_eu.h
index 9f1ca769bd3..2309d3b10d8 100644
--- a/src/intel/compiler/brw_eu.h
+++ b/src/intel/compiler/brw_eu.h
@@ -677,6 +677,10 @@ void
 brw_rounding_mode(struct brw_codegen *p,
   enum brw_rnd_mode mode);
 
+void
+brw_float_controls_mode(struct brw_codegen *p,
+unsigned mode, unsigned mask);
+
 /***
  * brw_eu_util.c:
  */
diff --git a/src/intel/compiler/brw_eu_defines.h b/src/intel/compiler/brw_eu_defines.h
index 52957882b10..e72645ba2c9 100644
--- a/src/intel/compiler/brw_eu_defines.h
+++ b/src/intel/compiler/brw_eu_defines.h
@@ -412,6 +412,7 @@ enum opcode {
SHADER_OPCODE_TYPED_SURFACE_WRITE_LOGICAL,
 
SHADER_OPCODE_RND_MODE,
+   SHADER_OPCODE_FLOAT_CONTROL_MODE,
 
/**
 * Byte scattered write/read opcodes.
@@ -1312,6 +1313,15 @@ enum PACKED brw_rnd_mode {
BRW_RND_MODE_UNSPECIFIED,  /* Unspecified rounding mode */
 };
 
+#define BRW_CR0_FP64_DENORM_PRESERVE (1 << 6)
+#define BRW_CR0_FP32_DENORM_PRESERVE (1 << 7)
+#define BRW_CR0_FP16_DENORM_PRESERVE (1 << 10)
+
+#define BRW_CR0_FP_MODE_MASK (BRW_CR0_FP64_DENORM_PRESERVE | \
+  BRW_CR0_FP32_DENORM_PRESERVE | \
+  BRW_CR0_FP16_DENORM_PRESERVE | \
+  BRW_CR0_RND_MODE_MASK << BRW_CR0_RND_MODE_SHIFT)
+
 /* MDC_DS - Data Size Message Descriptor Control Field
  * Skylake PRM, Volume 2d, page 129
  *
diff --git a/src/intel/compiler/brw_eu_emit.c b/src/intel/compiler/brw_eu_emit.c
index eef36705c7b..f0193712a9f 100644
--- a/src/intel/compiler/brw_eu_emit.c
+++ b/src/intel/compiler/brw_eu_emit.c
@@ -3668,3 +3668,29 @@ brw_rounding_mode(struct brw_codegen *p,
   brw_inst_set_thread_control(p->devinfo, inst, BRW_THREAD_SWITCH);
}
 }
+
+/* TODO: Refactor brw_rounding_mode() to use this. */
+void
+brw_float_controls_mode(struct brw_codegen *p,
+unsigned mode, unsigned mask)
+{
+   brw_inst *inst = brw_AND(p, brw_cr0_reg(0), brw_cr0_reg(0),
+brw_imm_ud(~mask));
+   brw_inst_set_exec_size(p->devinfo, inst, BRW_EXECUTE_1);
+
+   /* From the Skylake PRM, Volume 7, page 760:
+*  "Implementation Restriction on Register Access: When the control
+*   register is used as an explicit source and/or destination, hardware
+*   does not ensure execution pipeline coherency. Software must set the
+*   thread control field to ‘switch’ for an instruction that uses
+*   control register as an explicit operand."
+*/
+   brw_inst_set_thread_control(p->devinfo, inst, BRW_THREAD_SWITCH);
+
+   if (mode) {
+  brw_inst *inst_or = brw_OR(p, brw_cr0_reg(0), brw_cr0_reg(0),
+ brw_imm_ud(mode));
+  brw_inst_set_exec_size(p->devinfo, inst_or, BRW_EXECUTE_1);
+  brw_inst_set_thread_control(p->devinfo, inst_or, BRW_THREAD_SWITCH);
+   }
+}
diff --git a/src/intel/compiler/brw_fs_generator.cpp b/src/intel/compiler/brw_fs_generator.cpp
index ba7ed07e692..7ae42a639f8 100644
--- a/src/intel/compiler/brw_fs_generator.cpp
+++ b/src/intel/compiler/brw_fs_generator.cpp
@@ -2425,7 +2425,13 @@ fs_generator::generate_code(const cfg_t *cfg, int dispatch_width)
 
   case SHADER_OPCODE_RND_MODE:
  assert(src[0].file == BRW_IMMEDIATE_VALUE);
- brw_rounding_mode(p, (brw_rnd_mode) src[0].d);
+ brw_rounding_mode(p, (enum brw_rnd_mode) src[0].d);
+ break;
+
+  case SHADER_OPCODE_FLOAT_CONTROL_MODE:
+ assert(src[0].file == BRW_IMMEDIATE_VALUE);
+ assert(src[1].file == BRW_IMMEDIATE_VALUE);
+ brw_float_controls_mode(p, src[0].d, src[1].d);
  break;
 
   default:
diff --git a/src/intel/compiler/brw_shader.cpp b/src/intel/compiler/brw_shader.cpp
index adbb52f..8a7b3a2b1be 100644
--- a/src/intel/compiler/brw_shader.cpp
+++ b/src/intel/compiler/brw_shader.cpp
@@ -509,6 +509,8 @@ brw_instruction_name(const struct gen_device_info *devinfo, enum opcode op)
 
case SHADER_OPCODE_RND_MODE:
   return "rnd_mode";
+   case SHADER_OPCODE_FLOAT_CONTROL_MODE:
+  return "float_control_mode";
}
 
unreachable("not reached");
@@ -1047,6 +1049,7 @@ backend_instruction::has_side_effects() const
case TCS_OPCODE_URB_WRITE:
case TCS_OPCODE_RELEASE_INPUT:
case SHADER_OPCODE_RND_MODE:
+   case SHADER_OPCODE_FLOAT_CONTROL_MODE:
   return true;
default:
   return eot;
-- 
2.19.1
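
The clear-then-set sequence emitted by brw_float_controls_mode() (an AND with ~mask followed by an optional OR with mode) can be modelled on a plain integer. This is a sketch under assumptions: `float_controls_mode` here is a hypothetical Python function operating on a simulated 32-bit cr0 value, and only the denorm bit positions defined in this patch are used.

```python
# Bit positions as defined in brw_eu_defines.h by this patch.
FP64_DENORM_PRESERVE = 1 << 6
FP32_DENORM_PRESERVE = 1 << 7
FP16_DENORM_PRESERVE = 1 << 10


def float_controls_mode(cr0, mode, mask):
    """Clear the bits in `mask`, then set the bits in `mode`."""
    cr0 &= ~mask & 0xFFFFFFFF   # AND cr0, cr0, ~mask
    if mode:                    # the OR is skipped when there is nothing to set
        cr0 |= mode             # OR cr0, cr0, mode
    return cr0
```

Passing a bit in `mask` but not in `mode` clears it, which is how a caller can request flush-to-zero for one width while leaving unrelated bits of the register untouched.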


[Mesa-dev] [PATCH 22/28] i965/fs: define emit_shader_float_controls_execution_mode() and aux functions

2018-12-05 Thread Samuel Iglesias Gonsálvez
We need this function to emit code that later sets up the control register
with the execution mode defined for the shader.

Signed-off-by: Samuel Iglesias Gonsálvez 
---
 src/intel/compiler/brw_fs.h   |  1 +
 src/intel/compiler/brw_fs_visitor.cpp | 52 +++
 2 files changed, 53 insertions(+)

diff --git a/src/intel/compiler/brw_fs.h b/src/intel/compiler/brw_fs.h
index f79f8554fb9..3f38faf94bb 100644
--- a/src/intel/compiler/brw_fs.h
+++ b/src/intel/compiler/brw_fs.h
@@ -187,6 +187,7 @@ public:
void emit_gen6_gather_wa(uint8_t wa, fs_reg dst);
fs_reg resolve_source_modifiers(const fs_reg &);
void emit_discard_jump();
+   void emit_shader_float_controls_execution_mode();
bool opt_peephole_sel();
bool opt_peephole_csel();
bool opt_peephole_predicated_break();
diff --git a/src/intel/compiler/brw_fs_visitor.cpp 
b/src/intel/compiler/brw_fs_visitor.cpp
index 51a0ca2374a..2da06cf78d0 100644
--- a/src/intel/compiler/brw_fs_visitor.cpp
+++ b/src/intel/compiler/brw_fs_visitor.cpp
@@ -198,6 +198,58 @@ fs_visitor::emit_interpolation_setup_gen4()
abld.emit(SHADER_OPCODE_RCP, this->pixel_w, wpos_w);
 }
 
+static unsigned
+brw_rnd_mode_from_nir(unsigned mode, unsigned *mask)
+{
+   unsigned brw_mode = 0;
+   *mask = 0;
+
+   if (mode & SHADER_ROUNDING_MODE_RTZ) {
+  brw_mode |= BRW_RND_MODE_RTZ << BRW_CR0_RND_MODE_SHIFT;
+  *mask |= BRW_CR0_RND_MODE_MASK;
+   }
+   if (mode & SHADER_ROUNDING_MODE_RTE) {
+  brw_mode |= BRW_RND_MODE_RTNE << BRW_CR0_RND_MODE_SHIFT;
+  *mask |= BRW_CR0_RND_MODE_MASK;
+   }
+   if (mode & SHADER_DENORM_PRESERVE_FP16) {
+  brw_mode |= BRW_CR0_FP16_DENORM_PRESERVE;
+  *mask |= BRW_CR0_FP16_DENORM_PRESERVE;
+   }
+   if (mode & SHADER_DENORM_PRESERVE_FP32) {
+  brw_mode |= BRW_CR0_FP32_DENORM_PRESERVE;
+  *mask |= BRW_CR0_FP32_DENORM_PRESERVE;
+   }
+   if (mode & SHADER_DENORM_PRESERVE_FP64) {
+  brw_mode |= BRW_CR0_FP64_DENORM_PRESERVE;
+  *mask |= BRW_CR0_FP64_DENORM_PRESERVE;
+   }
+   if (mode & SHADER_DENORM_FLUSH_TO_ZERO_FP16)
+  *mask |= BRW_CR0_FP16_DENORM_PRESERVE;
+   if (mode & SHADER_DENORM_FLUSH_TO_ZERO_FP32)
+  *mask |= BRW_CR0_FP32_DENORM_PRESERVE;
+   if (mode & SHADER_DENORM_FLUSH_TO_ZERO_FP64)
+  *mask |= BRW_CR0_FP64_DENORM_PRESERVE;
+   if (mode == SHADER_DEFAULT_FLOAT_CONTROL_MODE)
+  *mask |= BRW_CR0_RND_MODE_MASK;
+
+   return brw_mode;
+}
+
+void
+fs_visitor::emit_shader_float_controls_execution_mode()
+{
+   unsigned execution_mode = 
this->nir->info.shader_float_controls_execution_mode;
+   if (execution_mode == SHADER_DEFAULT_FLOAT_CONTROL_MODE)
+  return;
+
+   fs_builder abld = bld.annotate("shader floats control execution mode");
+   unsigned mask = 0;
+   unsigned mode = brw_rnd_mode_from_nir(execution_mode, &mask);
+   abld.emit(SHADER_OPCODE_FLOAT_CONTROL_MODE, bld.null_reg_ud(),
+ brw_imm_d(mode), brw_imm_d(mask));
+}
+
 /** Emits the interpolation for the varying inputs. */
 void
 fs_visitor::emit_interpolation_setup_gen6()
-- 
2.19.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
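
To illustrate the mode/mask pairing that brw_rnd_mode_from_nir() builds in the patch above, here is a minimal Python sketch. Every flag value and bit position below is a placeholder chosen for the example, not the real SHADER_* or BRW_CR0_* definition, and only the FP32 denorm case is modelled. apply_cr0() is an assumption about how such a pair could be consumed (clear the masked bits, then OR in the mode); the patch itself does not show that step.

```python
# Placeholder flag values; the real SHADER_* and BRW_CR0_* constants may differ.
ROUNDING_MODE_RTE = 1 << 0
ROUNDING_MODE_RTZ = 1 << 1
DENORM_PRESERVE_FP32 = 1 << 2
DENORM_FLUSH_TO_ZERO_FP32 = 1 << 3

CR0_RND_MODE_SHIFT = 4
CR0_RND_MODE_MASK = 0x3 << CR0_RND_MODE_SHIFT
CR0_FP32_DENORM_PRESERVE = 1 << 7
RND_MODE_RTNE = 0
RND_MODE_RTZ = 3


def mode_and_mask(execution_mode):
    """Return (mode, mask): mask selects the bits to rewrite, mode holds their new values."""
    mode = 0
    mask = 0
    if execution_mode & ROUNDING_MODE_RTZ:
        mode |= RND_MODE_RTZ << CR0_RND_MODE_SHIFT
        mask |= CR0_RND_MODE_MASK
    if execution_mode & ROUNDING_MODE_RTE:
        mode |= RND_MODE_RTNE << CR0_RND_MODE_SHIFT
        mask |= CR0_RND_MODE_MASK
    if execution_mode & DENORM_PRESERVE_FP32:
        mode |= CR0_FP32_DENORM_PRESERVE
        mask |= CR0_FP32_DENORM_PRESERVE
    if execution_mode & DENORM_FLUSH_TO_ZERO_FP32:
        # Flushing only clears the preserve bit: it appears in the mask, not in the mode.
        mask |= CR0_FP32_DENORM_PRESERVE
    return mode, mask


def apply_cr0(cr0, mode, mask):
    """Assumed register update: clear the masked bits, then set the requested ones."""
    return (cr0 & ~mask) | mode
```

The point of returning a separate mask is that flush-to-zero has no bit of its own in this model: it is expressed as "touch the preserve bit, but leave it cleared".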


[Mesa-dev] [PATCH 25/28] i965/fs: add support for shader float control to remove_extra_rounding_modes()

2018-12-05 Thread Samuel Iglesias Gonsálvez
The remove_extra_rounding_modes() optimization will remove duplicated
rounding mode changes.

Signed-off-by: Samuel Iglesias Gonsálvez 
---
 src/intel/compiler/brw_fs.cpp | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/src/intel/compiler/brw_fs.cpp b/src/intel/compiler/brw_fs.cpp
index 18dcd92219c..eb253679930 100644
--- a/src/intel/compiler/brw_fs.cpp
+++ b/src/intel/compiler/brw_fs.cpp
@@ -3457,10 +3457,15 @@ bool
 fs_visitor::remove_extra_rounding_modes()
 {
bool progress = false;
+   unsigned execution_mode = 
this->nir->info.shader_float_controls_execution_mode;
 
-   foreach_block (block, cfg) {
-  brw_rnd_mode prev_mode = BRW_RND_MODE_UNSPECIFIED;
+   brw_rnd_mode prev_mode = BRW_RND_MODE_UNSPECIFIED;
+   if (execution_mode & SHADER_ROUNDING_MODE_RTE)
+  prev_mode = BRW_RND_MODE_RTNE;
+   if (execution_mode & SHADER_ROUNDING_MODE_RTZ)
+  prev_mode = BRW_RND_MODE_RTZ;
 
+   foreach_block (block, cfg) {
   foreach_inst_in_block_safe (fs_inst, inst, block) {
  if (inst->opcode == SHADER_OPCODE_RND_MODE) {
 assert(inst->src[0].file == BRW_IMMEDIATE_VALUE);
-- 
2.19.1
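
For readers who want the idea without the surrounding compiler code, the pass can be sketched in Python as below. The instruction representation (tuples whose first element is an opcode name) and the function signature are invented for this example. Like the patch, the sketch seeds the "previous mode" from the shader-wide execution mode and carries it across blocks; whether that is safe for arbitrary control flow is not something the sketch tries to answer.

```python
UNSPECIFIED = None


def remove_extra_rounding_modes(blocks, shader_default=UNSPECIFIED):
    """Drop ("rnd_mode", m) instructions that re-set the mode already in effect.

    blocks is a list of instruction lists; every other instruction is opaque.
    Returns (new_blocks, progress).
    """
    progress = False
    prev_mode = shader_default  # seeded once, before walking any block
    new_blocks = []
    for block in blocks:
        kept = []
        for inst in block:
            if inst[0] == "rnd_mode":
                if inst[1] == prev_mode:
                    progress = True  # redundant: this mode is already active
                    continue
                prev_mode = inst[1]
            kept.append(inst)
        new_blocks.append(kept)
    return new_blocks, progress
```

With a shader default of "rtz", a leading ("rnd_mode", "rtz") is dropped, while a later switch to "rtne" is kept.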



[Mesa-dev] [PATCH] loader: free error state, when checking the drawable type

2018-12-05 Thread Emil Velikov
From: Kirill Burtsev 

Currently we distinguish whether the drawable is a window or a pixmap by
checking whether xcb_present_select_input throws an error.

Yet, we don't always free the error state returned by xcb.

Cc: Kirill Burtsev 
Cc: Boyan Ding 
Fixes: 6bd9ba7d074 ("loader: Add dri3 helper")
Reviewed-by: Emil Velikov 
[Emil: add commit message, fixes tag]
Signed-off-by: Emil Velikov 
---
Kirill thanks for the patch. I've taken the liberty of doing some minor
polish, hope that's ok with you.

In general patches must land in master first before being considered
for stable. If our docs [1] are unclear please send us a patch to
improve them - the website lives in $mesa/docs

Emil

[1] https://www.mesa3d.org/submittingpatches.html#criteria
---
 src/loader/loader_dri3_helper.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/loader/loader_dri3_helper.c b/src/loader/loader_dri3_helper.c
index 1981b5f0515..7cd6b1e8ab6 100644
--- a/src/loader/loader_dri3_helper.c
+++ b/src/loader/loader_dri3_helper.c
@@ -1509,6 +1509,7 @@ dri3_update_drawable(struct loader_dri3_drawable *draw)
 mtx_unlock(&draw->mtx);
 return false;
  }
+ free(error);
  draw->is_pixmap = true;
  xcb_unregister_for_special_event(draw->conn, draw->special_event);
  draw->special_event = NULL;
-- 
2.19.1



[Mesa-dev] [Bug 108742] Battlefield 4 in Wine Freezes when joining games since ~mesa-18.2.3

2018-12-05 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108742

--- Comment #2 from Samuel Pitoiset  ---
Did you bisect?

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH v3] nir/algebraic: Rewrite bit-size inference

2018-12-05 Thread Dylan Baker
For the python bits:
Reviewed-by: Dylan Baker 

Quoting Connor Abbott (2018-12-05 04:20:30)
> Before this commit, there were two copies of the algorithm: one in C,
> that we would use to figure out what bit-size to give the replacement
> expression, and one in Python, that emulated the C one and tried to
> prove that the C algorithm would never fail to correctly assign
> bit-sizes. That seemed pretty fragile, and likely to fall over if we
> make any changes. Furthermore, the C code was really just recomputing
> more-or-less the same thing as the Python code every time. Instead, we
> can just store the results of the Python algorithm in the C
> datastructure, and consult it to compute the bitsize of each value,
> moving the "brains" entirely into Python. Since the Python algorithm no
> longer has to match C, it's also a lot easier to change it to something
> more closely approximating an actual type-inference algorithm. The
> algorithm used is based on Hindley-Milner, although deliberately
> weakened a little. It's a few more lines than the old one, judging by
> the diffstat, but I think it's easier to verify that it's correct while
> being as general as possible.
> 
> We could split this up into two changes, first making the C code use the
> results of the Python code and then rewriting the Python algorithm, but
> since the old algorithm never tracked which variable each equivalence
> class corresponds to, it would mean we'd have to add some non-trivial code which would
> then get thrown away. I think it's better to see the final state all at
> once, although I could also try splitting it up.
> 
> v2:
> - Replace instances of "== None" and "!= None" with "is None" and
> "is not None".
> - Rename first_src to first_unsized_src
> - Only merge the destination with the first unsized source, since the
> sources have already been merged.
> - Add a comment explaining what nir_search_value::bit_size now means.
> v3:
> - Fix one last instance to use "is not" instead of !=
> - Don't try to be so clever when choosing which error message to print
> based on whether we're in the search or replace expression.
> - Fix trailing whitespace.
> ---
>  src/compiler/nir/nir_algebraic.py | 520 --
>  src/compiler/nir/nir_search.c | 146 +
>  src/compiler/nir/nir_search.h |  17 +-
>  3 files changed, 317 insertions(+), 366 deletions(-)
> 
> diff --git a/src/compiler/nir/nir_algebraic.py 
> b/src/compiler/nir/nir_algebraic.py
> index 728196136ab..efd6e52cdb9 100644
> --- a/src/compiler/nir/nir_algebraic.py
> +++ b/src/compiler/nir/nir_algebraic.py
> @@ -88,7 +88,7 @@ class Value(object):
>  
> __template = mako.template.Template("""
>  static const ${val.c_type} ${val.name} = {
> -   { ${val.type_enum}, ${val.bit_size} },
> +   { ${val.type_enum}, ${val.c_bit_size} },
>  % if isinstance(val, Constant):
> ${val.type()}, { ${val.hex()} /* ${val.value} */ },
>  % elif isinstance(val, Variable):
> @@ -112,6 +112,40 @@ static const ${val.c_type} ${val.name} = {
> def __str__(self):
>return self.in_val
>  
> +   def get_bit_size(self):
> +  """Get the physical bit-size that has been chosen for this value, or if
> +  there is none, the canonical value which currently represents this
> +  bit-size class. Variables will be preferred, i.e. if there are any
> +  variables in the equivalence class, the canonical value will be a
> +  variable. We do this since we'll need to know which variable each value
> +  is equivalent to when constructing the replacement expression. This is
> +  the "find" part of the union-find algorithm.
> +  """
> +  bit_size = self
> +
> +  while isinstance(bit_size, Value):
> + if bit_size._bit_size is None:
> +break
> + bit_size = bit_size._bit_size
> +
> +  if bit_size is not self:
> + self._bit_size = bit_size
> +  return bit_size
> +
> +   def set_bit_size(self, other):
> +  """Make self.get_bit_size() return what other.get_bit_size() return
> +  before calling this, or just "other" if it's a concrete bit-size. This 
> is
> +  the "union" part of the union-find algorithm.
> +  """
> +
> +  self_bit_size = self.get_bit_size()
> +  other_bit_size = other if isinstance(other, int) else 
> other.get_bit_size()
> +
> +  if self_bit_size == other_bit_size:
> + return
> +
> +  self_bit_size._bit_size = other_bit_size
> +
> @property
> def type_enum(self):
>return "nir_search_value_" + self.type_str
> @@ -124,6 +158,21 @@ static const ${val.c_type} ${val.name} = {
> def c_ptr(self):
>return "&{0}.value".format(self.name)
>  
> +   @property
> +   def c_bit_size(self):
> +  bit_size = self.get_bit_size()
> +  if isinstance(bit_size, int):
> + return bit_size
> +  elif isinstance(bit_size, Variable):
> + return -bit_size.index - 1
> +  else:
> + # If the bit-size class is neither a 
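
The union-find scheme described in the quoted get_bit_size()/set_bit_size() docstrings can be tried out in isolation with the toy class below. It is a sketch, not the nir_algebraic.py code: the class is a stand-in, and the handling of "concrete size meets unresolved class" and of two conflicting concrete sizes is an addition made for this example.

```python
class Value:
    """Toy stand-in for a search/replace value."""

    def __init__(self, name):
        self.name = name
        self._bit_size = None  # None, an int, or another Value

    def get_bit_size(self):
        # "find": follow links until reaching an int or a Value with no parent
        root = self
        while isinstance(root, Value) and root._bit_size is not None:
            root = root._bit_size
        if root is not self:
            self._bit_size = root  # path compression
        return root

    def set_bit_size(self, other):
        # "union": merge this value's class with the other class or concrete size
        mine = self.get_bit_size()
        theirs = other if isinstance(other, int) else other.get_bit_size()
        if mine == theirs:
            return
        if isinstance(mine, int):
            if isinstance(theirs, int):
                raise ValueError("conflicting bit sizes: %d vs %d" % (mine, theirs))
            theirs._bit_size = mine  # attach the unresolved class to the concrete size
        else:
            mine._bit_size = theirs
```

Once any member of a class is given a concrete size, every other member resolves to that size on its next get_bit_size() call.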

[Mesa-dev] [PATCH 00/28] Add VK_KHR_shader_float_controls support to anv

2018-12-05 Thread Samuel Iglesias Gonsálvez
Hello,

This patch series implements the support for
VK_KHR_shader_float_controls for Intel platforms (Broadwell and
later).

This extension enables efficient use of floating-point computations
through the ability to query and override the implementation's default
behavior for rounding modes, denormals, signed zero, and infinity.

This patch series depends on Iago's patch series implementing
VK_KHR_shader_float16_int8 support on ANV [0] for the float16 support.

If you want to test this patch series, you can clone this branch (it
already includes Iago's patch series):

$ git clone -b siglesias/VK_KHR_shader_float_controls \
https://github.com/Igalia/mesa.git

Thanks!

Sam

[0] https://lists.freedesktop.org/archives/mesa-dev/2018-December/210896.html


Samuel Iglesias Gonsálvez (28):
  spirv: Update SPIR-V json and headers to Khronos master
  spirv: check support for SPV_KHR_shader_float_controls capabilities
  spirv/nir: keep track of SPV_KHR_shader_float_controls execution modes
  nir: add support for flushing to zero denorm constants
  Revert "spirv: Don’t check for NaN for most OpFOrd* comparisons"
  spirv/glsl450: fix atan2(0,0) lowering
  spirv/glsl450: fix atan2(x, x) case
  spirv/glsl450: fix reflect(denorm, denorm) FTZ = 0.0 case
  nir/algebraic: fix (inf - inf) = NaN case
  nir: create new conversion opcodes with floating point rounding modes
  util: added float to float16 conversions with RTZ and RTNE
  util: add fp64 -> fp32 conversion support for RTNE and RTZ rounding
modes
  nir: take into account rounding modes in conversions
  nir: fix denorms in unpack_half_1x16()
  nir: support for denorm flush-to-zero in nir_lower_double_ops
  nir: fix fmin/fmax support for doubles
  intel/nir: call nir_opt_constant_folding before nir_opt_algebraic is
executed
  intel/nir: call nir_opt_constant_folding before
brw_nir_apply_trig_workarounds
  i965/fs: add nir_op_f2f*_{rtne,rtz}
  i965/fs/nir: add nir_op_unpack_half_2x16_split_*_flush_to_zero
  i965/fs/generator: add support to set floating points modes in control
register
  i965/fs: define emit_shader_float_controls_execution_mode() and aux
functions
  i965/fs: emit shader float controls execution modes as first
instruction of shaders
  i965/fs: remove brw_rounding_mode() and use brw_float_controls_mode()
instead
  i965/fs: add support for shader float control to
remove_extra_rounding_modes()
  anv: add support for
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_FLOAT_CONTROLS_PROPERTIES_KHR
  anv: enable support for SPV_KHR_shader_float_controls capabilities
  anv: enable VK_KHR_shader_float_controls extension

 src/compiler/nir/nir.h   |  15 +
 src/compiler/nir/nir_constant_expressions.py |  59 +++-
 src/compiler/nir/nir_lower_alu_to_scalar.c   |  10 +-
 src/compiler/nir/nir_lower_double_ops.c  |  12 +
 src/compiler/nir/nir_opcodes.py  |  11 +-
 src/compiler/nir/nir_opcodes_c.py|   4 +-
 src/compiler/nir/nir_opt_algebraic.py|   2 -
 src/compiler/nir/nir_opt_constant_folding.c  |  74 -
 src/compiler/shader_enums.h  |  14 +
 src/compiler/shader_info.h   |   4 +
 src/compiler/spirv/spirv.core.grammar.json   | 316 ++-
 src/compiler/spirv/spirv.h   |  84 +++--
 src/compiler/spirv/spirv_to_nir.c|  33 ++
 src/compiler/spirv/vtn_alu.c |  33 +-
 src/compiler/spirv/vtn_glsl450.c |  26 +-
 src/intel/compiler/brw_eu.h  |   4 +-
 src/intel/compiler/brw_eu_defines.h  |  10 +
 src/intel/compiler/brw_eu_emit.c |  52 ++-
 src/intel/compiler/brw_fs.cpp|  20 +-
 src/intel/compiler/brw_fs.h  |   1 +
 src/intel/compiler/brw_fs_generator.cpp  |  19 +-
 src/intel/compiler/brw_fs_nir.cpp|  40 ++-
 src/intel/compiler/brw_fs_visitor.cpp|  52 +++
 src/intel/compiler/brw_nir.c |   6 +-
 src/intel/compiler/brw_shader.cpp|   3 +
 src/intel/vulkan/anv_device.c|  31 ++
 src/intel/vulkan/anv_extensions.py   |   1 +
 src/intel/vulkan/anv_pipeline.c  |   1 +
 src/util/Makefile.sources|   2 +
 src/util/double.c| 197 
 src/util/double.h|  46 +++
 src/util/half_float.c|  74 +
 src/util/half_float.h|   7 +
 src/util/meson.build |   2 +
 34 files changed, 1072 insertions(+), 193 deletions(-)
 create mode 100644 src/util/double.c
 create mode 100644 src/util/double.h

-- 
2.19.1



[Mesa-dev] [PATCH 02/28] spirv: check support for SPV_KHR_shader_float_controls capabilities

2018-12-05 Thread Samuel Iglesias Gonsálvez
Signed-off-by: Samuel Iglesias Gonsálvez 
---
 src/compiler/shader_info.h| 1 +
 src/compiler/spirv/spirv_to_nir.c | 7 +++
 2 files changed, 8 insertions(+)

diff --git a/src/compiler/shader_info.h b/src/compiler/shader_info.h
index e745cc15fc5..21c3d371a63 100644
--- a/src/compiler/shader_info.h
+++ b/src/compiler/shader_info.h
@@ -47,6 +47,7 @@ struct spirv_supported_capabilities {
bool int16;
bool float16;
bool int8;
+   bool shader_float_controls;
bool shader_viewport_index_layer;
bool subgroup_arithmetic;
bool subgroup_ballot;
diff --git a/src/compiler/spirv/spirv_to_nir.c 
b/src/compiler/spirv/spirv_to_nir.c
index 47b11b6ddc3..100fcd8e298 100644
--- a/src/compiler/spirv/spirv_to_nir.c
+++ b/src/compiler/spirv/spirv_to_nir.c
@@ -3559,6 +3559,13 @@ vtn_handle_preamble_instruction(struct vtn_builder *b, 
SpvOp opcode,
 
   case SpvCapabilitySampleMaskPostDepthCoverage:
  spv_check_supported(post_depth_coverage, cap);
+
+  case SpvCapabilityDenormFlushToZero:
+  case SpvCapabilityDenormPreserve:
+  case SpvCapabilitySignedZeroInfNanPreserve:
+  case SpvCapabilityRoundingModeRTE:
+  case SpvCapabilityRoundingModeRTZ:
+ spv_check_supported(shader_float_controls, cap);
  break;
 
   default:
-- 
2.19.1



[Mesa-dev] [Bug 108914] blocky shadow artifacts in The Forest with DXVK, RADV_DEBUG=nohiz fixes this

2018-12-05 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108914

--- Comment #16 from tempel.jul...@gmail.com ---
Looking fine now with mesa-git, thanks again!

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


Re: [Mesa-dev] [PATCH] radv: wait on the high 32 bits of timestamp queries

2018-12-05 Thread Samuel Pitoiset

Hi Emil,

Yeah, that looks correct, thanks!

On 12/5/18 4:22 PM, Emil Velikov wrote:

Hi guys

On Wed, 5 Dec 2018 at 10:49, Bas Nieuwenhuizen  wrote:


Reviewed-by: Bas Nieuwenhuizen 
On Wed, Dec 5, 2018 at 11:43 AM Samuel Pitoiset
 wrote:


In case we are unlucky and the low part is 0xffffffff.

Fixes: 5d6a560a29 ("radv: do not use the availability bit for timestamp 
queries")
Signed-off-by: Samuel Pitoiset 


There was a trivial conflict when applying the patch to 18.3
I've resolved it here [1], please double-check that it looks good.

Thanks
Emil

[1] 
https://cgit.freedesktop.org/mesa/mesa/commit/?h=staging/18.3&id=d92bbe54eaf8406d2b3ceb8b6b7eba6c69681b76




Re: [Mesa-dev] [PATCH mesa] radv: drop unused variable

2018-12-05 Thread Samuel Pitoiset
So my compiler doesn't want to show me warnings? That's something I 
would need to fix up. Anyways,


Reviewed-by: Samuel Pitoiset 

On 12/5/18 4:44 PM, Eric Engestrom wrote:

Added in 824cfc1ee5e0aba15b676 "radv: rework the TC-compat HTILE
hardware bug with COND_EXEC", but it is unused.

Signed-off-by: Eric Engestrom 
---
  src/amd/vulkan/radv_cmd_buffer.c | 1 -
  1 file changed, 1 deletion(-)

diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index 23909a0f7dda537bf9a1..945442d7b97974f780b2 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -1278,7 +1278,6 @@ radv_update_tc_compat_zrange_metadata(struct 
radv_cmd_buffer *cmd_buffer,
  struct radv_image *image,
  VkClearDepthStencilValue ds_clear_value)
  {
-   struct radeon_cmdbuf *cs = cmd_buffer->cs;
uint64_t va = radv_buffer_get_va(image->bo);
va += image->offset + image->tc_compat_zrange_offset;
uint32_t cond_val;




[Mesa-dev] [PATCH 01/28] spirv: Update SPIR-V json and headers to Khronos master

2018-12-05 Thread Samuel Iglesias Gonsálvez
This corresponds to commit 17da9f8231f78cf519b4958c2229463a63ead9e2 on GitHub.

Signed-off-by: Samuel Iglesias Gonsálvez 
---
 src/compiler/spirv/spirv.core.grammar.json | 316 +++--
 src/compiler/spirv/spirv.h |  84 +++---
 2 files changed, 281 insertions(+), 119 deletions(-)

diff --git a/src/compiler/spirv/spirv.core.grammar.json 
b/src/compiler/spirv/spirv.core.grammar.json
index 034e3ab4446..cd178596594 100644
--- a/src/compiler/spirv/spirv.core.grammar.json
+++ b/src/compiler/spirv/spirv.core.grammar.json
@@ -3825,7 +3825,7 @@
   "version" : "None"
 },
 {
-  "opname" : "OpReportIntersectionNVX",
+  "opname" : "OpReportIntersectionNV",
   "opcode" : 5334,
   "operands" : [
 { "kind" : "IdResultType" },
@@ -3833,25 +3833,25 @@
 { "kind" : "IdRef", "name" : "'Hit'" },
 { "kind" : "IdRef", "name" : "'HitKind'" }
   ],
-  "capabilities" : [ "RaytracingNVX" ],
-  "extensions" : [ "SPV_NVX_raytracing" ]
+  "capabilities" : [ "RayTracingNV" ],
+  "extensions" : [ "SPV_NV_ray_tracing" ]
 },
 {
-  "opname" : "OpIgnoreIntersectionNVX",
+  "opname" : "OpIgnoreIntersectionNV",
   "opcode" : 5335,
 
-  "capabilities" : [ "RaytracingNVX" ],
-  "extensions" : [ "SPV_NVX_raytracing" ]
+  "capabilities" : [ "RayTracingNV" ],
+  "extensions" : [ "SPV_NV_ray_tracing" ]
 },
 {
-  "opname" : "OpTerminateRayNVX",
+  "opname" : "OpTerminateRayNV",
   "opcode" : 5336,
 
-  "capabilities" : [ "RaytracingNVX" ],
-  "extensions" : [ "SPV_NVX_raytracing" ]
+  "capabilities" : [ "RayTracingNV" ],
+  "extensions" : [ "SPV_NV_ray_tracing" ]
 },
 {
-  "opname" : "OpTraceNVX",
+  "opname" : "OpTraceNV",
   "opcode" : 5337,
   "operands" : [
 
@@ -3867,17 +3867,28 @@
 { "kind" : "IdRef", "name" : "'Ray Tmax'" },
 { "kind" : "IdRef", "name" : "'PayloadId'" }
   ],
-  "capabilities" : [ "RaytracingNVX" ],
-  "extensions" : [ "SPV_NVX_raytracing" ]
+  "capabilities" : [ "RayTracingNV" ],
+  "extensions" : [ "SPV_NV_ray_tracing" ]
 },
 {
-  "opname" : "OpTypeAccelerationStructureNVX",
+  "opname" : "OpTypeAccelerationStructureNV",
   "opcode" : 5341,
   "operands" : [
 { "kind" : "IdResult" }
   ],
-  "capabilities" : [ "RaytracingNVX" ],
-  "extensions" : [ "SPV_NVX_raytracing" ]
+  "capabilities" : [ "RayTracingNV" ],
+  "extensions" : [ "SPV_NV_ray_tracing" ]
+},
+{
+  "opname" : "OpExecuteCallableNV",
+  "opcode" : 5344,
+  "operands" : [
+
+{ "kind" : "IdRef", "name" : "'SBT Index'" },
+{ "kind" : "IdRef", "name" : "'Callable DataId'" }
+  ],
+  "capabilities" : [ "RayTracingNV" ],
+  "extensions" : [ "SPV_NV_ray_tracing" ]
 },
 {
   "opname" : "OpSubgroupShuffleINTEL",
@@ -4443,34 +4454,34 @@
   "capabilities" : [ "MeshShadingNV" ]
 },
 {
-  "enumerant" : "RayGenerationNVX",
+  "enumerant" : "RayGenerationNV",
   "value" : 5313,
-  "capabilities" : [ "RaytracingNVX" ]
+  "capabilities" : [ "RayTracingNV" ]
 },
 {
-  "enumerant" : "IntersectionNVX",
+  "enumerant" : "IntersectionNV",
   "value" : 5314,
-  "capabilities" : [ "RaytracingNVX" ]
+  "capabilities" : [ "RayTracingNV" ]
 },
 {
-  "enumerant" : "AnyHitNVX",
+  "enumerant" : "AnyHitNV",
   "value" : 5315,
-  "capabilities" : [ "RaytracingNVX" ]
+  "capabilities" : [ "RayTracingNV" ]
 },
 {
-  "enumerant" : "ClosestHitNVX",
+  "enumerant" : "ClosestHitNV",
   "value" : 5316,
-  "capabilities" : [ "RaytracingNVX" ]
+  "capabilities" : [ "RayTracingNV" ]
 },
 {
-  "enumerant" : "MissNVX",
+  "enumerant" : "MissNV",
   "value" : 5317,
-  "capabilities" : [ "RaytracingNVX" ]
+  "capabilities" : [ "RayTracingNV" ]
 },
 {
-  "enumerant" : "CallableNVX",
+  "enumerant" : "CallableNV",
   "value" : 5318,
-  "capabilities" : [ "RaytracingNVX" ]
+  "capabilities" : [ "RayTracingNV" ]
 }
   ]
 },
@@ -4762,6 +4773,56 @@
   "extensions" : [ "SPV_KHR_post_depth_coverage" ],
   "version" : "None"
 },
+{
+  "enumerant" : "DenormPreserve",
+  "value" : 4459,
+  "capabilities" : [ "DenormPreserve"],
+  "extensions" : [ "SPV_KHR_float_controls" ],
+  "parameters" : [
+{ "kind" : "LiteralInteger", "name" : "'Target Width'" }
+  ],
+  "version" : "None"
+},
+{
+  "enumerant" : "DenormFlushToZero",
+  "value" : 4460,
+  "capabilities" : [ "DenormFlushToZero"],
+  

[Mesa-dev] [PATCH 05/28] Revert "spirv: Don’t check for NaN for most OpFOrd* comparisons"

2018-12-05 Thread Samuel Iglesias Gonsálvez
This reverts commit c4ab1bdcc9710e3c7cc7115d3be9c69b7e7712ef. We need
to check the arguments looking for NaNs, because they can introduce
failures in tests for FOrd*, especially when running
VK_KHR_shader_float_controls tests in CTS.

Signed-off-by: Samuel Iglesias Gonsálvez 
---
 src/compiler/spirv/vtn_alu.c | 17 +++--
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/src/compiler/spirv/vtn_alu.c b/src/compiler/spirv/vtn_alu.c
index dc6fedc9129..629b57560ca 100644
--- a/src/compiler/spirv/vtn_alu.c
+++ b/src/compiler/spirv/vtn_alu.c
@@ -535,18 +535,23 @@ vtn_handle_alu(struct vtn_builder *b, SpvOp opcode,
   break;
}
 
-   case SpvOpFOrdNotEqual: {
-  /* For all the SpvOpFOrd* comparisons apart from NotEqual, the value
-   * from the ALU will probably already be false if the operands are not
-   * ordered so we don’t need to handle it specially.
-   */
+   case SpvOpFOrdEqual:
+   case SpvOpFOrdNotEqual:
+   case SpvOpFOrdLessThan:
+   case SpvOpFOrdGreaterThan:
+   case SpvOpFOrdLessThanEqual:
+   case SpvOpFOrdGreaterThanEqual: {
   bool swap;
   unsigned src_bit_size = glsl_get_bit_size(vtn_src[0]->type);
   unsigned dst_bit_size = glsl_get_bit_size(type);
   nir_op op = vtn_nir_alu_op_for_spirv_opcode(b, opcode, &swap,
   src_bit_size, dst_bit_size);
 
-  assert(!swap);
+  if (swap) {
+ nir_ssa_def *tmp = src[0];
+ src[0] = src[1];
+ src[1] = tmp;
+  }
 
   val->ssa->def =
  nir_iand(&b->nb,
-- 
2.19.1
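
As a small illustration of what "checking the arguments for NaNs" means for the ordered comparisons, here is a Python sketch. The function names are made up for the example and this is not the NIR lowering itself. Python floats follow IEEE 754 comparison rules, so the sketch shows why NotEqual in particular needs the explicit check; it does not reproduce the CTS failures mentioned above.

```python
def is_ordered(a, b):
    """True when neither operand is NaN (a NaN never compares equal to itself)."""
    return a == a and b == b


def ford_not_equal(a, b):
    # A bare a != b is True when either side is NaN, so the explicit check matters here.
    return is_ordered(a, b) and a != b


def ford_less_than(a, b):
    return is_ordered(a, b) and a < b


def ford_greater_than(a, b):
    # Illustrates the "swap" path: less-than with the operands exchanged.
    return ford_less_than(b, a)
```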



[Mesa-dev] [PATCH 14/28] nir: fix denorms in unpack_half_1x16()

2018-12-05 Thread Samuel Iglesias Gonsálvez
According to VK_KHR_shader_float_controls:

"Denormalized values obtained via unpacking an integer into a vector
 of values with smaller bit width and interpreting those values as
 floating-point numbers must: be flushed to zero, unless the entry point
 is declared with the code:DenormPreserve execution mode."

Signed-off-by: Samuel Iglesias Gonsálvez 
---
 src/compiler/nir/nir_constant_expressions.py | 13 +
 src/compiler/nir/nir_lower_alu_to_scalar.c   | 10 --
 src/compiler/nir/nir_opcodes.py  |  5 +
 3 files changed, 26 insertions(+), 2 deletions(-)

diff --git a/src/compiler/nir/nir_constant_expressions.py 
b/src/compiler/nir/nir_constant_expressions.py
index a9af1bd233d..bc60a08da28 100644
--- a/src/compiler/nir/nir_constant_expressions.py
+++ b/src/compiler/nir/nir_constant_expressions.py
@@ -245,6 +245,19 @@ pack_half_1x16(float x)
return _mesa_float_to_half(x);
 }
 
+/**
+ * Evaluate one component of unpackHalf2x16.
+ */
+static float
+unpack_half_1x16_flush_to_zero(uint16_t u)
+{
+   if (u < 0x0400)
+  u = 0;
+   if (u & 0x8000 && !(u & 0x7c00))
+  u = 0x8000;
+   return _mesa_half_to_float(u);
+}
+
 /**
  * Evaluate one component of unpackHalf2x16.
  */
diff --git a/src/compiler/nir/nir_lower_alu_to_scalar.c 
b/src/compiler/nir/nir_lower_alu_to_scalar.c
index 7ef032cd164..d80cf2504c7 100644
--- a/src/compiler/nir/nir_lower_alu_to_scalar.c
+++ b/src/compiler/nir/nir_lower_alu_to_scalar.c
@@ -133,8 +133,14 @@ lower_alu_instr_scalar(nir_alu_instr *instr, nir_builder 
*b)
   nir_ssa_def *packed = nir_ssa_for_alu_src(b, instr, 0);
 
   nir_ssa_def *comps[2];
-  comps[0] = nir_unpack_half_2x16_split_x(b, packed);
-  comps[1] = nir_unpack_half_2x16_split_y(b, packed);
+
+  if (b->shader->info.shader_float_controls_execution_mode & 
SHADER_DENORM_FLUSH_TO_ZERO_FP16) {
+ comps[0] = nir_unpack_half_2x16_split_x_flush_to_zero(b, packed);
+ comps[1] = nir_unpack_half_2x16_split_y_flush_to_zero(b, packed);
+  } else {
+ comps[0] = nir_unpack_half_2x16_split_x(b, packed);
+ comps[1] = nir_unpack_half_2x16_split_y(b, packed);
+  }
   nir_ssa_def *vec = nir_vec(b, comps, 2);
 
   nir_ssa_def_rewrite_uses(&instr->dest.dest.ssa, nir_src_for_ssa(vec));
diff --git a/src/compiler/nir/nir_opcodes.py b/src/compiler/nir/nir_opcodes.py
index eb554a66b44..191025f6932 100644
--- a/src/compiler/nir/nir_opcodes.py
+++ b/src/compiler/nir/nir_opcodes.py
@@ -309,6 +309,11 @@ unop_convert("unpack_half_2x16_split_x", tfloat32, tuint32,
 unop_convert("unpack_half_2x16_split_y", tfloat32, tuint32,
  "unpack_half_1x16((uint16_t)(src0 >> 16))")
 
+unop_convert("unpack_half_2x16_split_x_flush_to_zero", tfloat32, tuint32,
+ "unpack_half_1x16_flush_to_zero((uint16_t)(src0 & 0xffff))")
+unop_convert("unpack_half_2x16_split_y_flush_to_zero", tfloat32, tuint32,
+ "unpack_half_1x16_flush_to_zero((uint16_t)(src0 >> 16))")
+
 unop_convert("unpack_32_2x16_split_x", tuint16, tuint32, "src0")
 unop_convert("unpack_32_2x16_split_y", tuint16, tuint32, "src0 >> 16")
 
-- 
2.19.1
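
The flush-to-zero behaviour quoted from the extension text can be checked on the host with a few lines of Python. This is a sketch written for illustration, not a transcription of the C helper in the patch: it treats any half value whose exponent field is zero as a denormal and keeps only its sign bit. The function names mirror the patch for readability but are defined here from scratch.

```python
import math
import struct


def unpack_half_1x16(u):
    """Interpret a 16-bit pattern as an IEEE 754 binary16 value."""
    return struct.unpack("<e", struct.pack("<H", u & 0xFFFF))[0]


def unpack_half_1x16_flush_to_zero(u):
    """Same, but a denormal input (zero exponent field) becomes a zero of the same sign."""
    u &= 0xFFFF
    if (u & 0x7C00) == 0:
        u &= 0x8000  # keep only the sign bit
    return unpack_half_1x16(u)


def unpack_half_2x16_flush_to_zero(packed):
    """Split a 32-bit word into its low (x) and high (y) halves and unpack both."""
    return (unpack_half_1x16_flush_to_zero(packed & 0xFFFF),
            unpack_half_1x16_flush_to_zero(packed >> 16))
```

For example, the pattern 0x0001 unpacks to 2**-24 without flushing and to 0.0 with it, while the smallest normal value 0x0400 is left alone.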



[Mesa-dev] [PATCH 12/28] util: add fp64 -> fp32 conversion support for RTNE and RTZ rounding modes

2018-12-05 Thread Samuel Iglesias Gonsálvez
Signed-off-by: Samuel Iglesias Gonsálvez 
---
 src/util/Makefile.sources |   2 +
 src/util/double.c | 197 ++
 src/util/double.h |  46 +
 src/util/meson.build  |   2 +
 4 files changed, 247 insertions(+)
 create mode 100644 src/util/double.c
 create mode 100644 src/util/double.h

diff --git a/src/util/Makefile.sources b/src/util/Makefile.sources
index f09b89b3be5..1b998bf26c4 100644
--- a/src/util/Makefile.sources
+++ b/src/util/Makefile.sources
@@ -11,6 +11,8 @@ MESA_UTIL_FILES := \
debug.h \
disk_cache.c \
disk_cache.h \
+   double.c \
+   double.h \
fast_idiv_by_const.c \
fast_idiv_by_const.h \
format_r11g11b10f.h \
diff --git a/src/util/double.c b/src/util/double.c
new file mode 100644
index 00000000000..61f23e3b57b
--- /dev/null
+++ b/src/util/double.c
@@ -0,0 +1,197 @@
+/*
+ * Mesa 3-D graphics library
+ *
+ * Copyright (C) 2018 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included
+ * in all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
+ * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ */
+
+#include <assert.h>
+#include <math.h>
+#include <stdint.h>
+#include <string.h>
+
+#include "rounding.h"
+#include "double.h"
+
+
+
+
+typedef union { double f; int64_t i; uint64_t u; } fi_type;
+
+/**
+ * Convert a 8-byte double to a 4-byte float.
+ *
+ * Not all float64 values can be represented exactly as a float32 value. We
+ * round such intermediate float64 values to the nearest float32. When the
+ * float64 lies exactly between two float32 values, we round to the one with
+ * an even mantissa.
+ */
+
+float
+_mesa_double_to_float(double val)
+{
+   const fi_type fi = {val};
+   const int64_t flt_m = fi.i & 0x0fffffffffffff;
+   const int64_t flt_e = (fi.i >> 52) & 0x7ff;
+   const int64_t flt_s = (fi.i >> 63) & 0x1;
+   int s, e, m = 0;
+   float result;
+
+   /* sign bit */
+   s = flt_s;
+
+   /* handle special cases */
+   if ((flt_e == 0) && (flt_m == 0)) {
+  /* zero */
+  /* m = 0; - already set */
+  e = 0;
+   }
+   else if ((flt_e == 0) && (flt_m != 0)) {
+  /* denorm -- denorm float64 maps to 0 */
+  /* m = 0; - already set */
+  e = 0;
+   }
+   else if ((flt_e == 0x7ff) && (flt_m == 0)) {
+  /* infinity */
+  /* m = 0; - already set */
+  e = 255;
+   }
+   else if ((flt_e == 0x7ff) && (flt_m != 0)) {
+  /* NaN */
+  m = 1;
+  e = 255;
+   }
+   else {
+  /* regular number */
+  const int new_exp = flt_e - 1023;
+  if (new_exp < -126) {
+ /* The float64 lies in the range (0.0, min_normal32) and is rounded
+  * to a nearby float32 value. The result will be either zero, subnormal,
+  * or normal.
+  */
+ e = 0;
+ m = _mesa_lroundeven(((double)((uint64_t)1 << 54)) * fabs(fi.f));
+  }
+  else if (new_exp > 127) {
+ /* map this value to infinity */
+ /* m = 0; - already set */
+ e = 255;
+  }
+  else {
+ /* The float64 lies in the range
+  *   [min_normal32, max_normal32 + max_step32)
+  * and is rounded to a nearby float32 value. The result will be
+  * either normal or infinite.
+  */
+ e = new_exp + 127;
+ m = _mesa_lroundeven((double)flt_m / (double) (1 << 29));
+  }
+   }
+
+   assert(0 <= m && m <= (1 << 23));
+   if (m == (1 << 23)) {
+  /* The float64 was rounded upwards into the range of the next exponent,
+   * so bump the exponent. This correctly handles the case where f64
+   * should be rounded up to float32 infinity.
+   */
+  ++e;
+  m = 0;
+   }
+
+   unsigned result_int = (s << 31) | (e << 23) | m;
+   memcpy(&result, &result_int, sizeof(float));
+   return result;
+}
+
+float
+_mesa_double_to_float_rtz(double val)
+{
+   const fi_type fi = {val};
+   const int64_t flt_m = fi.i & 0x0fffffffffffff;
+   const int64_t flt_e = (fi.i >> 52) & 0x7ff;
+   const int64_t flt_s = (fi.i >> 63) & 0x1;
+   int s, e, m = 0;
+   float result;
+
+   /* sign bit */
+  

[Mesa-dev] [PATCH 11/28] util: added float to float16 conversions with RTZ and RTNE

2018-12-05 Thread Samuel Iglesias Gonsálvez
Signed-off-by: Samuel Iglesias Gonsálvez 
---
 src/util/half_float.c | 74 +++
 src/util/half_float.h |  7 
 2 files changed, 81 insertions(+)

diff --git a/src/util/half_float.c b/src/util/half_float.c
index 63aec5c5c14..5fdcb20045b 100644
--- a/src/util/half_float.c
+++ b/src/util/half_float.c
@@ -125,6 +125,80 @@ _mesa_float_to_half(float val)
return result;
 }
 
+uint16_t
+_mesa_float_to_float16_rtz(float val)
+{
+   const fi_type fi = {val};
+   const int flt_m = fi.i & 0x7fffff;
+   const int flt_e = (fi.i >> 23) & 0xff;
+   const int flt_s = (fi.i >> 31) & 0x1;
+   int s, e, m = 0;
+   uint16_t result;
+
+   /* sign bit */
+   s = flt_s;
+
+   /* handle special cases */
+   if ((flt_e == 0) && (flt_m == 0)) {
+  /* zero */
+  /* m = 0; - already set */
+  e = 0;
+   }
+   else if ((flt_e == 0) && (flt_m != 0)) {
+  /* denorm -- denorm float maps to 0 half */
+  /* m = 0; - already set */
+  e = 0;
+   }
+   else if ((flt_e == 0xff) && (flt_m == 0)) {
+  /* infinity */
+  /* m = 0; - already set */
+  e = 31;
+   }
+   else if ((flt_e == 0xff) && (flt_m != 0)) {
+  /* NaN */
+  m = 1;
+  e = 31;
+   }
+   else {
+  /* regular number */
+  const int new_exp = flt_e - 127;
+  if (new_exp < -14) {
+ /* The float32 lies in the range (0.0, min_normal16) and is rounded
+  * to a nearby float16 value. The result will be either zero, subnormal,
+  * or normal.
+  */
+ e = 0;
+ m = truncf((1 << 24) * fabsf(fi.f));
+  }
+  else if (new_exp > 15) {
+ /* map this value to infinity */
+ /* m = 0; - already set */
+ e = 31;
+  }
+  else {
+ /* The float32 lies in the range
+  *   [min_normal16, max_normal16 + max_step16)
+  * and is rounded to a nearby float16 value. The result will be
+  * either normal or infinite.
+  */
+ e = new_exp + 15;
+ m = truncf(flt_m / (float) (1 << 13));
+  }
+   }
+
+   assert(0 <= m && m <= 1024);
+   if (m == 1024) {
+  /* The float32 was rounded upwards into the range of the next exponent,
+   * so bump the exponent. This correctly handles the case where f32
+   * should be rounded up to float16 infinity.
+   */
+  ++e;
+  m = 0;
+   }
+
+   result = (s << 15) | (e << 10) | m;
+   return result;
+}
 
 /**
  * Convert a 2-byte half float to a 4-byte float.
diff --git a/src/util/half_float.h b/src/util/half_float.h
index 01557424735..df90802bf34 100644
--- a/src/util/half_float.h
+++ b/src/util/half_float.h
@@ -39,6 +39,13 @@ uint16_t _mesa_float_to_half(float val);
 float _mesa_half_to_float(uint16_t val);
 uint8_t _mesa_half_to_unorm8(uint16_t v);
 uint16_t _mesa_uint16_div_64k_to_half(uint16_t v);
+uint16_t _mesa_float_to_float16_rtz(float val);
+
+static inline uint16_t
+_mesa_float_to_float16_rtne(float val)
+{
+   return _mesa_float_to_half(val);
+}
 
 static inline bool
 _mesa_half_is_negative(uint16_t h)
-- 
2.19.1



[Mesa-dev] [PATCH 17/28] intel/nir: call nir_opt_constant_folding before nir_opt_algebraic is executed

2018-12-05 Thread Samuel Iglesias Gonsálvez
This performs constant folding, and also flushes denorm operands to zero,
before nir_opt_algebraic is executed.
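
As a rough illustration of what flushing a denorm operand to zero means at
the bit level, here is a minimal sketch for a 32-bit float. The helper name
flush_denorm_f32_bits is hypothetical and not part of this series; the only
assumption is the IEEE 754 binary32 layout (1 sign bit, 8 exponent bits,
23 mantissa bits).

```c
#include <stdint.h>

/* Hypothetical helper, not taken from the patch series: flush a binary32
 * bit pattern to a zero of the same sign if it encodes a denormal.
 */
static uint32_t
flush_denorm_f32_bits(uint32_t bits)
{
   const uint32_t sign_mask = 0x80000000u;
   const uint32_t exp_mask  = 0x7f800000u;
   const uint32_t mant_mask = 0x007fffffu;

   /* A denormal has a zero exponent field and a non-zero mantissa. */
   if ((bits & exp_mask) == 0 && (bits & mant_mask) != 0)
      return bits & sign_mask;

   /* Zeros, normals, infinities and NaNs pass through unchanged. */
   return bits;
}
```

Running this on every floating-point constant before algebraic rewrites see
it means those rewrites only ever match against already-flushed values.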

Signed-off-by: Samuel Iglesias Gonsálvez 
---
 src/intel/compiler/brw_nir.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/intel/compiler/brw_nir.c b/src/intel/compiler/brw_nir.c
index 0a5aa35c700..600f7a97df9 100644
--- a/src/intel/compiler/brw_nir.c
+++ b/src/intel/compiler/brw_nir.c
@@ -570,8 +570,8 @@ brw_nir_optimize(nir_shader *nir, const struct brw_compiler *compiler,
   OPT(nir_opt_cse);
   OPT(nir_opt_peephole_select, 0);
   OPT(nir_opt_intrinsics);
-  OPT(nir_opt_algebraic);
   OPT(nir_opt_constant_folding);
+  OPT(nir_opt_algebraic);
   OPT(nir_opt_dead_cf);
   if (OPT(nir_opt_trivial_continues)) {
  /* If nir_opt_trivial_continues makes progress, then we need to clean
-- 
2.19.1



[Mesa-dev] [PATCH 19/28] i965/fs: add nir_op_f2f*_{rtne,rtz}

2018-12-05 Thread Samuel Iglesias Gonsálvez
This way, support for them can be implemented later if SPIR-V adds it.
Right now, the RTZ and RTNE rounding modes of FPRoundingMode in SPIR-V
only apply to f2f16 conversions.

Signed-off-by: Samuel Iglesias Gonsálvez 
---
 src/intel/compiler/brw_fs_nir.cpp | 22 +-
 1 file changed, 21 insertions(+), 1 deletion(-)

diff --git a/src/intel/compiler/brw_fs_nir.cpp b/src/intel/compiler/brw_fs_nir.cpp
index 7a4594a24ac..5f2f7ec419e 100644
--- a/src/intel/compiler/brw_fs_nir.cpp
+++ b/src/intel/compiler/brw_fs_nir.cpp
@@ -801,6 +801,8 @@ fs_visitor::nir_emit_alu(const fs_builder &bld, nir_alu_instr *instr)
   inst->saturate = instr->dest.saturate;
   break;
 
+   case nir_op_f2f64_rtne:
+   case nir_op_f2f64_rtz:
case nir_op_f2f64:
case nir_op_f2i64:
case nir_op_f2u64:
@@ -814,7 +816,23 @@ fs_visitor::nir_emit_alu(const fs_builder &bld, nir_alu_instr *instr)
*or a DWord integer type as an intermediate type."
*/
   if (nir_src_bit_size(instr->src[0].src) == 16) {
- fs_reg tmp = bld.vgrf(BRW_REGISTER_TYPE_F, 1);
+ brw_reg_type type;
+ switch (instr->op) {
+ case nir_op_f2f64:
+ case nir_op_f2f64_rtne:
+ case nir_op_f2f64_rtz:
+type = BRW_REGISTER_TYPE_F;
+break;
+ case nir_op_f2i64:
+type = BRW_REGISTER_TYPE_D;
+break;
+ case nir_op_f2u64:
+type = BRW_REGISTER_TYPE_UD;
+break;
+ default:
+unreachable("Not supported");
+ }
+ fs_reg tmp = bld.vgrf(type, 1);
  inst = bld.MOV(tmp, op[0]);
  inst->saturate = instr->dest.saturate;
  op[0] = tmp;
@@ -978,6 +996,8 @@ fs_visitor::nir_emit_alu(const fs_builder &bld, nir_alu_instr *instr)
   }
   /* Fallthrough */
 
+   case nir_op_f2f32_rtne:
+   case nir_op_f2f32_rtz:
case nir_op_f2f32:
case nir_op_f2i32:
case nir_op_f2u32:
-- 
2.19.1



[Mesa-dev] [PATCH 23/28] i965/fs: emit shader float controls execution modes as first instruction of shaders

2018-12-05 Thread Samuel Iglesias Gonsálvez
Signed-off-by: Samuel Iglesias Gonsálvez 
---
 src/intel/compiler/brw_fs.cpp | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/src/intel/compiler/brw_fs.cpp b/src/intel/compiler/brw_fs.cpp
index 32e0817ce02..18dcd92219c 100644
--- a/src/intel/compiler/brw_fs.cpp
+++ b/src/intel/compiler/brw_fs.cpp
@@ -6671,6 +6671,8 @@ fs_visitor::run_vs()
if (shader_time_index >= 0)
   emit_shader_time_begin();
 
+   emit_shader_float_controls_execution_mode();
+
emit_nir_code();
 
if (failed)
@@ -6741,6 +6743,7 @@ fs_visitor::run_tcs_single_patch()
   brw_imm_ud(nir->info.tess.tcs_vertices_out), BRW_CONDITIONAL_L);
   bld.IF(BRW_PREDICATE_NORMAL);
}
+   emit_shader_float_controls_execution_mode();
 
emit_nir_code();
 
@@ -6793,6 +6796,8 @@ fs_visitor::run_tes()
if (shader_time_index >= 0)
   emit_shader_time_begin();
 
+   emit_shader_float_controls_execution_mode();
+
emit_nir_code();
 
if (failed)
@@ -6843,6 +6848,8 @@ fs_visitor::run_gs()
if (shader_time_index >= 0)
   emit_shader_time_begin();
 
+   emit_shader_float_controls_execution_mode();
+
emit_nir_code();
 
emit_gs_thread_end();
@@ -6934,6 +6941,8 @@ fs_visitor::run_fs(bool allow_spilling, bool do_rep_send)
  retype(dispatch_mask, BRW_REGISTER_TYPE_UW));
   }
 
+  emit_shader_float_controls_execution_mode();
+
   emit_nir_code();
 
   if (failed)
@@ -6990,6 +6999,8 @@ fs_visitor::run_cs(unsigned min_dispatch_width)
suboffset(retype(brw_vec1_grf(0, 0), BRW_REGISTER_TYPE_UW), 1));
}
 
+   emit_shader_float_controls_execution_mode();
+
emit_nir_code();
 
if (failed)
-- 
2.19.1



[Mesa-dev] [PATCH 20/28] i965/fs/nir: add nir_op_unpack_half_2x16_split_*_flush_to_zero

2018-12-05 Thread Samuel Iglesias Gonsálvez
The denorm mode is set in the control register, so nothing else needs to be done.

Signed-off-by: Samuel Iglesias Gonsálvez 
---
 src/intel/compiler/brw_fs_nir.cpp | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/intel/compiler/brw_fs_nir.cpp b/src/intel/compiler/brw_fs_nir.cpp
index 5f2f7ec419e..de24a322e68 100644
--- a/src/intel/compiler/brw_fs_nir.cpp
+++ b/src/intel/compiler/brw_fs_nir.cpp
@@ -1533,10 +1533,12 @@ fs_visitor::nir_emit_alu(const fs_builder &bld, nir_alu_instr *instr)
   unreachable("not reached: should be handled by lower_packing_builtins");
 
case nir_op_unpack_half_2x16_split_x:
+   case nir_op_unpack_half_2x16_split_x_flush_to_zero:
   inst = bld.emit(FS_OPCODE_UNPACK_HALF_2x16_SPLIT_X, result, op[0]);
   inst->saturate = instr->dest.saturate;
   break;
case nir_op_unpack_half_2x16_split_y:
+   case nir_op_unpack_half_2x16_split_y_flush_to_zero:
   inst = bld.emit(FS_OPCODE_UNPACK_HALF_2x16_SPLIT_Y, result, op[0]);
   inst->saturate = instr->dest.saturate;
   break;
-- 
2.19.1



Re: [Mesa-dev] [PATCH mesa] radv: drop unused variable

2018-12-05 Thread Bas Nieuwenhuizen
Reviewed-by: Bas Nieuwenhuizen 
On Wed, Dec 5, 2018 at 4:44 PM Eric Engestrom  wrote:
>
> Added in 824cfc1ee5e0aba15b676 "radv: rework the TC-compat HTILE
> hardware bug with COND_EXEC", but it is unused.
>
> Signed-off-by: Eric Engestrom 
> ---
>  src/amd/vulkan/radv_cmd_buffer.c | 1 -
>  1 file changed, 1 deletion(-)
>
> diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
> index 23909a0f7dda537bf9a1..945442d7b97974f780b2 100644
> --- a/src/amd/vulkan/radv_cmd_buffer.c
> +++ b/src/amd/vulkan/radv_cmd_buffer.c
> @@ -1278,7 +1278,6 @@ radv_update_tc_compat_zrange_metadata(struct radv_cmd_buffer *cmd_buffer,
>   struct radv_image *image,
>   VkClearDepthStencilValue ds_clear_value)
>  {
> -   struct radeon_cmdbuf *cs = cmd_buffer->cs;
> uint64_t va = radv_buffer_get_va(image->bo);
> va += image->offset + image->tc_compat_zrange_offset;
> uint32_t cond_val;
> --
> Cheers,
>   Eric
>


[Mesa-dev] [PATCH 10/28] nir: create new conversion opcodes with floating point rounding modes

2018-12-05 Thread Samuel Iglesias Gonsálvez
It adds round-towards-zero and round-to-nearest-even opcodes for
floating point conversions.
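
To make the difference between the two rounding modes concrete, here is a
minimal sketch of a float32 to float16 conversion that supports both. It is
not the code from this series: the names round_mode and f32_to_f16_normal are
hypothetical, and the sketch assumes the input is a normal number whose
unbiased exponent lies in [-14, 15], so zeros, denorms, infinities and NaNs
are deliberately left out.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical names, for illustration only. */
enum round_mode { ROUND_RTZ, ROUND_RTNE };

/* Convert a float32 to float16 bits. Assumes val is a normal number with
 * an unbiased exponent in [-14, 15]; other inputs are not handled.
 */
static uint16_t
f32_to_f16_normal(float val, enum round_mode mode)
{
   uint32_t bits;
   memcpy(&bits, &val, sizeof(bits));

   const uint32_t s = bits >> 31;
   const uint32_t e = ((bits >> 23) & 0xff) - 127 + 15; /* re-bias exponent */
   const uint32_t m = bits & 0x7fffff;

   uint32_t kept = m >> 13;            /* top 10 mantissa bits */
   const uint32_t dropped = m & 0x1fff; /* low 13 bits that do not fit */

   /* RTZ simply discards the dropped bits. RTNE rounds up when the dropped
    * part is above one half, or exactly one half with an odd kept mantissa.
    */
   if (mode == ROUND_RTNE &&
       (dropped > 0x1000 || (dropped == 0x1000 && (kept & 1))))
      kept++;

   /* Adding (rather than OR-ing) lets a mantissa overflow carry into the
    * exponent, which also turns the largest values into infinity.
    */
   return (uint16_t)((s << 15) + (e << 10) + kept);
}
```

For example, 65520.0f lies exactly halfway between the largest finite half
value (0x7bff) and infinity (0x7c00): RTZ yields 0x7bff, while RTNE rounds to
the even mantissa and yields 0x7c00.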

Signed-off-by: Samuel Iglesias Gonsálvez 
---
 src/compiler/nir/nir_opcodes.py   | 2 +-
 src/compiler/nir/nir_opcodes_c.py | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/compiler/nir/nir_opcodes.py b/src/compiler/nir/nir_opcodes.py
index 4ef4ecc6f22..eb554a66b44 100644
--- a/src/compiler/nir/nir_opcodes.py
+++ b/src/compiler/nir/nir_opcodes.py
@@ -180,7 +180,7 @@ for src_t in [tint, tuint, tfloat]:
   else:
  bit_sizes = [8, 16, 32, 64]
   for bit_size in bit_sizes:
-  if bit_size == 16 and dst_t == tfloat and src_t == tfloat:
+  if src_t == tfloat and dst_t == tfloat:
   rnd_modes = ['_rtne', '_rtz', '']
   for rnd_mode in rnd_modes:
   unop_convert("{0}2{1}{2}{3}".format(src_t[0], dst_t[0],
diff --git a/src/compiler/nir/nir_opcodes_c.py b/src/compiler/nir/nir_opcodes_c.py
index 8bfcda6d719..bd8afe75148 100644
--- a/src/compiler/nir/nir_opcodes_c.py
+++ b/src/compiler/nir/nir_opcodes_c.py
@@ -71,7 +71,7 @@ nir_type_conversion_op(nir_alu_type src, nir_alu_type dst, nir_rounding_mode rnd
 % endif
 % for dst_bits in bit_sizes:
   case ${dst_bits}:
-%if src_t == 'float' and dst_t == 'float' and dst_bits == 16:
+%if src_t == 'float' and dst_t == 'float':
  switch(rnd) {
 %   for rnd_t in [('rtne', '_rtne'), ('rtz', '_rtz'), ('undef', '')]:
 case nir_rounding_mode_${rnd_t[0]}:
@@ -79,7 +79,7 @@ nir_type_conversion_op(nir_alu_type src, nir_alu_type dst, nir_rounding_mode rnd
dst_bits, rnd_t[1])};
 %   endfor
 default:
-   unreachable("Invalid 16-bit nir rounding mode");
+   unreachable("Invalid ${dst_bits}-bit nir rounding mode");
  }
 %else:
  assert(rnd == nir_rounding_mode_undef);
-- 
2.19.1



[Mesa-dev] [PATCH 04/28] nir: add support for flushing to zero denorm constants

2018-12-05 Thread Samuel Iglesias Gonsálvez
Signed-off-by: Samuel Iglesias Gonsálvez 
---
 src/compiler/nir/nir_opt_constant_folding.c | 74 +++--
 1 file changed, 68 insertions(+), 6 deletions(-)

diff --git a/src/compiler/nir/nir_opt_constant_folding.c b/src/compiler/nir/nir_opt_constant_folding.c
index 1fca530af24..a6df8284e17 100644
--- a/src/compiler/nir/nir_opt_constant_folding.c
+++ b/src/compiler/nir/nir_opt_constant_folding.c
@@ -39,7 +39,7 @@ struct constant_fold_state {
 };
 
 static bool
-constant_fold_alu_instr(nir_alu_instr *instr, void *mem_ctx)
+constant_fold_alu_instr(nir_alu_instr *instr, void *mem_ctx, unsigned execution_mode)
 {
nir_const_value src[NIR_MAX_VEC_COMPONENTS];
 
@@ -77,12 +77,39 @@ constant_fold_alu_instr(nir_alu_instr *instr, void *mem_ctx)
  switch(load_const->def.bit_size) {
  case 64:
 src[i].u64[j] = load_const->value.u64[instr->src[i].swizzle[j]];
+if (execution_mode & SHADER_DENORM_FLUSH_TO_ZERO_FP64 &&
+(nir_op_infos[instr->op].input_types[i] == nir_type_float ||
+ nir_op_infos[instr->op].input_types[i] == nir_type_float64)) {
+   if (src[i].u64[j] < 0x0010000000000000)
+  src[i].u64[j] = 0;
+   if (src[i].u64[j] & 0x8000000000000000 &&
+   !(src[i].u64[j] & 0x7ff0000000000000))
+  src[i].u64[j] = 0x8000000000000000;
+}
 break;
  case 32:
 src[i].u32[j] = load_const->value.u32[instr->src[i].swizzle[j]];
+if (execution_mode & SHADER_DENORM_FLUSH_TO_ZERO_FP32 &&
+(nir_op_infos[instr->op].input_types[i] == nir_type_float ||
+ nir_op_infos[instr->op].input_types[i] == nir_type_float32)) {
+   if (src[i].u32[j] < 0x00800000)
+  src[i].u32[j] = 0;
+   if (src[i].u32[j] & 0x80000000 &&
+   !(src[i].u32[j] & 0x7f800000))
+  src[i].u32[j] = 0x80000000;
+}
 break;
  case 16:
 src[i].u16[j] = load_const->value.u16[instr->src[i].swizzle[j]];
+if (execution_mode & SHADER_DENORM_FLUSH_TO_ZERO_FP16 &&
+(nir_op_infos[instr->op].input_types[i] == nir_type_float ||
+ nir_op_infos[instr->op].input_types[i] == nir_type_float16)) {
+   if (src[i].u16[j] < 0x0400)
+  src[i].u16[j] = 0;
+   if (src[i].u16[j] & 0x8000 &&
+   !(src[i].u16[j] & 0x7c00))
+  src[i].u16[j] = 0x8000;
+}
 break;
  case 8:
 src[i].u8[j] = load_const->value.u8[instr->src[i].swizzle[j]];
@@ -106,6 +133,40 @@ constant_fold_alu_instr(nir_alu_instr *instr, void *mem_ctx)
   nir_eval_const_opcode(instr->op, instr->dest.dest.ssa.num_components,
 bit_size, src);
 
+   for (unsigned j = 0; j < instr->dest.dest.ssa.num_components; j++) {
+  if (execution_mode & SHADER_DENORM_FLUSH_TO_ZERO_FP64 &&
+  bit_size == 64 &&
+  (nir_op_infos[instr->op].output_type == nir_type_float ||
+   nir_op_infos[instr->op].output_type == nir_type_float64)) {
+ if (dest.u64[j] < 0x0010000000000000)
+dest.u64[j] = 0;
+ if (dest.u64[j] & 0x8000000000000000 &&
+ !(dest.u64[j] & 0x7ff0000000000000))
+dest.u64[j] = 0x8000000000000000;
+  }
+  if (execution_mode & SHADER_DENORM_FLUSH_TO_ZERO_FP32 &&
+  bit_size == 32 &&
+  (nir_op_infos[instr->op].output_type == nir_type_float ||
+   nir_op_infos[instr->op].output_type == nir_type_float32)) {
+ if (dest.u32[j] < 0x00800000)
+dest.u32[j] = 0;
+ if (dest.u32[j] & 0x80000000 &&
+ !(dest.u32[j] & 0x7f800000))
+dest.u32[j] = 0x80000000;
+  }
+
+  if (execution_mode & SHADER_DENORM_FLUSH_TO_ZERO_FP16 &&
+  bit_size == 16 &&
+  (nir_op_infos[instr->op].output_type == nir_type_float ||
+   nir_op_infos[instr->op].output_type == nir_type_float16)) {
+ if (dest.u16[j] < 0x0400)
+dest.u16[j] = 0;
+ if (dest.u16[j] & 0x8000 &&
+ !(dest.u16[j] & 0x7c00))
+dest.u16[j] = 0x8000;
+  }
+   }
+
nir_load_const_instr *new_instr =
   nir_load_const_instr_create(mem_ctx,
   instr->dest.dest.ssa.num_components,
@@ -157,14 +218,14 @@ constant_fold_intrinsic_instr(nir_intrinsic_instr *instr)
 }
 
 static bool
-constant_fold_block(nir_block *block, void *mem_ctx)
+constant_fold_block(nir_block *block, void *mem_ctx, unsigned execution_mode)
 {
bool progress = false;
 
nir_foreach_instr_safe(instr, block) {
   switch (instr->type) {
   case nir_instr_type_alu:
- progress |= constant_fold_alu_instr(nir_instr_as_alu(instr), mem_ctx);
+ progress |= 

[Mesa-dev] [PATCH 08/28] spirv/glsl450: fix reflect(denorm, denorm) FTZ = 0.0 case

2018-12-05 Thread Samuel Iglesias Gonsálvez
Signed-off-by: Samuel Iglesias Gonsálvez 
---
 src/compiler/spirv/vtn_glsl450.c | 13 +
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/src/compiler/spirv/vtn_glsl450.c b/src/compiler/spirv/vtn_glsl450.c
index 0115648cbb0..69588f56968 100644
--- a/src/compiler/spirv/vtn_glsl450.c
+++ b/src/compiler/spirv/vtn_glsl450.c
@@ -692,13 +692,18 @@ handle_glsl450_alu(struct vtn_builder *b, enum GLSLstd450 entrypoint,
src[0], nir_fneg(nb, src[0]));
   return;
 
-   case GLSLstd450Reflect:
+   case GLSLstd450Reflect: {
   /* I - 2 * dot(N, I) * N */
-  val->ssa->def =
+  nir_ssa_def *reflect =
  nir_fsub(nb, src[0], nir_fmul(nb, NIR_IMM_FP(nb, 2.0),
-  nir_fmul(nb, nir_fdot(nb, src[0], src[1]),
-   src[1])));
+   nir_fmul(nb, nir_fdot(nb, src[0], src[1]),
+src[1])));
+  nir_ssa_def *zero = NIR_IMM_FP(nb, 0.0);
+  val->ssa->def = nir_bcsel(nb, nir_iand(nb,
+nir_feq(nb, src[0], zero),
+nir_feq(nb, src[1], zero)), zero, reflect);
   return;
+   }
 
case GLSLstd450Refract: {
   nir_ssa_def *I = src[0];
-- 
2.19.1



[Mesa-dev] [PATCH 09/28] nir/algebraic: fix (inf - inf) = NaN case

2018-12-05 Thread Samuel Iglesias Gonsálvez
If we have (inf - inf) we should return NaN, not 0.0. The same applies to
the (NaN - NaN) case.

Fixes tests in the Vulkan CTS that produce this kind of subtraction.
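
A small sketch of why the (-a) + a -> 0.0 rewrite is not value-preserving
under IEEE 754 arithmetic. The helper name fneg_fadd_is_zero is hypothetical,
and the sketch assumes the compiler is not run with fast-math style options.

```c
#include <math.h>
#include <stdbool.h>

/* Hypothetical helper: evaluate (-a) + a at run time and report whether
 * the result compares equal to 0.0, which is what the removed rule assumed.
 */
static bool
fneg_fadd_is_zero(float a)
{
   /* volatile keeps the compiler from simplifying the expression itself. */
   volatile float va = a;
   const float r = -va + va;
   return r == 0.0f;
}
```

The check holds for any finite input, but fails for infinity and NaN, where
the sum is NaN.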

Signed-off-by: Samuel Iglesias Gonsálvez 
---
 src/compiler/nir/nir_opt_algebraic.py | 2 --
 1 file changed, 2 deletions(-)

diff --git a/src/compiler/nir/nir_opt_algebraic.py b/src/compiler/nir/nir_opt_algebraic.py
index 747f1751086..e4f77e7b952 100644
--- a/src/compiler/nir/nir_opt_algebraic.py
+++ b/src/compiler/nir/nir_opt_algebraic.py
@@ -91,7 +91,6 @@ optimizations = [
(('usadd_4x8', a, ~0), ~0),
(('~fadd', ('fmul', a, b), ('fmul', a, c)), ('fmul', a, ('fadd', b, c))),
(('iadd', ('imul', a, b), ('imul', a, c)), ('imul', a, ('iadd', b, c))),
-   (('~fadd', ('fneg', a), a), 0.0),
(('iadd', ('ineg', a), a), 0),
(('iadd', ('ineg', a), ('iadd', a, b)), b),
(('iadd', a, ('iadd', ('ineg', a), b)), b),
@@ -891,7 +890,6 @@ before_ffma_optimizations = [
 
(('~fadd', ('fmul', a, b), ('fmul', a, c)), ('fmul', a, ('fadd', b, c))),
(('iadd', ('imul', a, b), ('imul', a, c)), ('imul', a, ('iadd', b, c))),
-   (('~fadd', ('fneg', a), a), 0.0),
(('iadd', ('ineg', a), a), 0),
(('iadd', ('ineg', a), ('iadd', a, b)), b),
(('iadd', a, ('iadd', ('ineg', a), b)), b),
-- 
2.19.1



[Mesa-dev] [PATCH 15/28] nir: support for denorm flush-to-zero in nir_lower_double_ops

2018-12-05 Thread Samuel Iglesias Gonsálvez
Signed-off-by: Samuel Iglesias Gonsálvez 
---
 src/compiler/nir/nir_lower_double_ops.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/src/compiler/nir/nir_lower_double_ops.c b/src/compiler/nir/nir_lower_double_ops.c
index b3543bc6963..97b825d2fdb 100644
--- a/src/compiler/nir/nir_lower_double_ops.c
+++ b/src/compiler/nir/nir_lower_double_ops.c
@@ -558,6 +558,18 @@ lower_doubles_instr(nir_alu_instr *instr, nir_lower_doubles_options options)
   unreachable("unhandled opcode");
}
 
+   bool denorm_flush_to_zero =
+  bld.shader->info.shader_float_controls_execution_mode & SHADER_DENORM_FLUSH_TO_ZERO_FP64;
+   if (denorm_flush_to_zero) {
+  /* TODO: add support for flushing negative denorms to -0.0 */
+  /* Flush to zero if the result value is a denorm */
+  result = nir_bcsel(&bld,
+  nir_flt(&bld, nir_fabs(&bld, result),
+  nir_imm_double(&bld, 2.22507385850720138309023271733e-308)),
+  nir_imm_double(&bld, 0.0),
+  result);
+  result);
+   }
+
nir_ssa_def_rewrite_uses(&instr->dest.dest.ssa, nir_src_for_ssa(result));
nir_instr_remove(&instr->instr);
return true;
-- 
2.19.1


