date:20170611

On Sun, Jun 11, 2017 at 8:25 PM, Henri Verbeet  wrote:
> On 7 June 2017 at 21:54, Marek Olšák  wrote:
>> On Wed, Jun 7, 2017 at 2:07 AM, Marek Olšák  wrote:
>>> On Wed, Jun 7, 2017 at 12:21 AM, Samuel Li  wrote:
>>>> @@ -790,6 +790,15 @@ static const char* r600_get_device_vendor(struct
>>>> pipe_screen* pscreen)
>>>>
>>>>  static const char* r600_get_chip_name(struct r600_common_screen *rscreen)
>>>>  {
>>>> +   const char *mname;
>>>> +
>>>> +   if (rscreen->ws->get_chip_name) {
>>>> +   mname = rscreen->ws->get_chip_name(rscreen->ws);
>>>> +   if (mname != NULL)
>>>> +   return mname;
>>>> +   }
>>>> +
>>>> +   /* fall back to family names*/
>>>> switch (rscreen->info.family) {
>>>> case CHIP_R600: return "AMD R600";
>>>> case CHIP_RV610: return "AMD RV610";
>
> As someone downstream of this, I have to say I find the "family" names
> much more informative than whatever marketing came up with. More
> importantly however, this commit changes the GL_RENDERER string
> reported to applications, like Wine, for existing GPUs in an
> incompatible way. Since I suspect displaying the "marketing" name is
> important to at least some people at AMD, could I request please
> including the family name as well, as is done by for example lspci?

Yes, if you write the patch with the codename in the existing parentheses. :)

Marek
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [RFC 8/9] nv50/ir: disable mul+add to mad for precise instructions

Fixes misrendering in Tomb Raider.

Signed-off-by: Karol Herbst 
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
index 4c92a1efb5..85f3f44832 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
@@ -1669,6 +1669,10 @@ AlgebraicOpt::handleABS(Instruction *abs)
 bool
 AlgebraicOpt::handleADD(Instruction *add)
 {
+   // we can't optimize to SAD/MAD if the instruction is tagged as precise
+   if (add->precise)
+  return false;
+
Value *src0 = add->getSrc(0);
Value *src1 = add->getSrc(1);
 
@@ -1712,7 +1716,7 @@ AlgebraicOpt::tryADDToMADOrSAD(Instruction *add, 
operation toOp)
   return false;
 
if (src->getInsn()->saturate || src->getInsn()->postFactor ||
-   src->getInsn()->dnz)
+   src->getInsn()->dnz || src->getInsn()->precise)
   return false;
 
if (toOp == OP_SAD) {
-- 
2.13.1

[Mesa-dev] [RFC 9/9] nv50/ir/tgsi: split mad to mul+add

fixes
KHR-GL44.gpu_shader5.precise_qualifier
KHR-GL45.gpu_shader5.precise_qualifier

Signed-off-by: Karol Herbst 
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
index c633185893..cd45e82426 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
@@ -3184,6 +3184,20 @@ Converter::handleInstruction(const struct 
tgsi_full_instruction *insn)
   break;
case TGSI_OPCODE_MAD:
case TGSI_OPCODE_UMAD:
+  FOR_EACH_DST_ENABLED_CHANNEL(0, c, tgsi) {
+ val0 = getSSA();
+ src0 = fetchSrc(0, c);
+ src1 = fetchSrc(1, c);
+ src2 = fetchSrc(2, c);
+ geni = mkOp2(OP_MUL, dstTy, val0, src0, src1);
+ if (dstTy == TYPE_F32)
+geni->dnz = info->io.mul_zero_wins;
+ geni->precise = insn->Instruction.Precise;
+
+ geni = mkOp2(OP_ADD, dstTy, dst0[c], val0, src2);
+ geni->precise = insn->Instruction.Precise;
+  }
+  break;
case TGSI_OPCODE_SAD:
case TGSI_OPCODE_FMA:
   FOR_EACH_DST_ENABLED_CHANNEL(0, c, tgsi) {
-- 
2.13.1

[Mesa-dev] [RFC 6/9] nv50/ir: add precise field to Instruction

Signed-off-by: Karol Herbst 
---
 src/gallium/drivers/nouveau/codegen/nv50_ir.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir.h 
b/src/gallium/drivers/nouveau/codegen/nv50_ir.h
index 5c09fed05c..6835c4fa8c 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir.h
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir.h
@@ -884,6 +884,7 @@ public:
unsigned perPatch   : 1;
unsigned exit   : 1; // terminate program after insn
unsigned mask   : 4; // for vector ops
+   unsigned precise: 1; // prevent algebraic optimisations like mul+add to mad
 
int8_t postFactor; // MUL/DIV(if < 0) by 1 << postFactor
 
-- 
2.13.1

[Mesa-dev] [RFC 7/9] nv50/ir/tgsi: handle precise for most ALU instructions

Signed-off-by: Karol Herbst 
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
index 1264dd4834..c633185893 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
@@ -3179,6 +3179,7 @@ Converter::handleInstruction(const struct 
tgsi_full_instruction *insn)
  geni->subOp = tgsi::opcodeToSubOp(tgsi.getOpcode());
  if (op == OP_MUL && dstTy == TYPE_F32)
 geni->dnz = info->io.mul_zero_wins;
+ geni->precise = insn->Instruction.Precise;
   }
   break;
case TGSI_OPCODE_MAD:
@@ -3192,6 +3193,7 @@ Converter::handleInstruction(const struct 
tgsi_full_instruction *insn)
  geni = mkOp3(op, dstTy, dst0[c], src0, src1, src2);
  if (dstTy == TYPE_F32)
 geni->dnz = info->io.mul_zero_wins;
+ geni->precise = insn->Instruction.Precise;
   }
   break;
case TGSI_OPCODE_MOV:
-- 
2.13.1

[Mesa-dev] [RFC 2/9] tgsi/dump: print _PRECISE modifier on Instructions

Signed-off-by: Karol Herbst 
---
 src/gallium/auxiliary/tgsi/tgsi_dump.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_dump.c 
b/src/gallium/auxiliary/tgsi/tgsi_dump.c
index f6eba7424b..b58e64511c 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_dump.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_dump.c
@@ -584,6 +584,10 @@ iter_instruction(
   TXT( "_SAT" );
}
 
+   if (inst->Instruction.Precise) {
+  TXT( "_PRECISE" );
+   }
+
for (i = 0; i < inst->Instruction.NumDstRegs; i++) {
   const struct tgsi_full_dst_register *dst = &inst->Dst[i];
 
-- 
2.13.1

[Mesa-dev] [RFC 4/9] tgsi: populate precise

Only implemented for glsl->tgsi. Other converters just set precise to 0.

Signed-off-by: Karol Herbst 
---
 src/gallium/auxiliary/tgsi/tgsi_build.c   |  3 +++
 src/gallium/auxiliary/tgsi/tgsi_ureg.c| 14 +++---
 src/gallium/auxiliary/tgsi/tgsi_ureg.h| 20 +++---
 src/gallium/auxiliary/util/u_simple_shaders.c |  2 +-
 src/gallium/state_trackers/nine/nine_shader.c |  6 ++---
 src/mesa/state_tracker/st_atifs_to_tgsi.c | 38 +--
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp| 12 -
 src/mesa/state_tracker/st_mesa_to_tgsi.c  |  8 +++---
 src/mesa/state_tracker/st_pbo.c   |  2 +-
 9 files changed, 65 insertions(+), 40 deletions(-)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_build.c 
b/src/gallium/auxiliary/tgsi/tgsi_build.c
index 55e4d064ed..144a017768 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_build.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_build.c
@@ -651,6 +651,7 @@ tgsi_default_instruction( void )
 static struct tgsi_instruction
 tgsi_build_instruction(unsigned opcode,
unsigned saturate,
+   unsigned precise,
unsigned num_dst_regs,
unsigned num_src_regs,
struct tgsi_header *header)
@@ -665,6 +666,7 @@ tgsi_build_instruction(unsigned opcode,
instruction = tgsi_default_instruction();
instruction.Opcode = opcode;
instruction.Saturate = saturate;
+   instruction.Precise = precise;
instruction.NumDstRegs = num_dst_regs;
instruction.NumSrcRegs = num_src_regs;
 
@@ -1061,6 +1063,7 @@ tgsi_build_full_instruction(
 
*instruction = tgsi_build_instruction(full_inst->Instruction.Opcode,
  full_inst->Instruction.Saturate,
+ full_inst->Instruction.Precise,
  full_inst->Instruction.NumDstRegs,
  full_inst->Instruction.NumSrcRegs,
  header);
diff --git a/src/gallium/auxiliary/tgsi/tgsi_ureg.c 
b/src/gallium/auxiliary/tgsi/tgsi_ureg.c
index 5bd779728a..56db2252c5 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_ureg.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_ureg.c
@@ -1213,6 +1213,7 @@ struct ureg_emit_insn_result
 ureg_emit_insn(struct ureg_program *ureg,
unsigned opcode,
boolean saturate,
+   unsigned precise,
unsigned num_dst,
unsigned num_src)
 {
@@ -1226,6 +1227,7 @@ ureg_emit_insn(struct ureg_program *ureg,
out[0].insn = tgsi_default_instruction();
out[0].insn.Opcode = opcode;
out[0].insn.Saturate = saturate;
+   out[0].insn.Precise = precise;
out[0].insn.NumDstRegs = num_dst;
out[0].insn.NumSrcRegs = num_src;
 
@@ -1354,7 +1356,8 @@ ureg_insn(struct ureg_program *ureg,
   const struct ureg_dst *dst,
   unsigned nr_dst,
   const struct ureg_src *src,
-  unsigned nr_src )
+  unsigned nr_src,
+  unsigned precise )
 {
struct ureg_emit_insn_result insn;
unsigned i;
@@ -1369,6 +1372,7 @@ ureg_insn(struct ureg_program *ureg,
insn = ureg_emit_insn(ureg,
  opcode,
  saturate,
+ precise,
  nr_dst,
  nr_src);
 
@@ -1391,7 +1395,8 @@ ureg_tex_insn(struct ureg_program *ureg,
   const struct tgsi_texture_offset *texoffsets,
   unsigned nr_offset,
   const struct ureg_src *src,
-  unsigned nr_src )
+  unsigned nr_src,
+  unsigned precise )
 {
struct ureg_emit_insn_result insn;
unsigned i;
@@ -1406,6 +1411,7 @@ ureg_tex_insn(struct ureg_program *ureg,
insn = ureg_emit_insn(ureg,
  opcode,
  saturate,
+ precise,
  nr_dst,
  nr_src);
 
@@ -1434,7 +1440,8 @@ ureg_memory_insn(struct ureg_program *ureg,
  unsigned nr_src,
  unsigned qualifier,
  unsigned texture,
- unsigned format)
+ unsigned format,
+ unsigned precise)
 {
struct ureg_emit_insn_result insn;
unsigned i;
@@ -1442,6 +1449,7 @@ ureg_memory_insn(struct ureg_program *ureg,
insn = ureg_emit_insn(ureg,
  opcode,
  FALSE,
+ precise,
  nr_dst,
  nr_src);
 
diff --git a/src/gallium/auxiliary/tgsi/tgsi_ureg.h 
b/src/gallium/auxiliary/tgsi/tgsi_ureg.h
index 54f95ba565..105c85abd5 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_ureg.h
+++ b/src/gallium/auxiliary/tgsi/tgsi_ureg.h
@@ -546,7 +546,8 @@ ureg_insn(struct ureg_program *ureg,
   const struct ureg_dst *dst,
   u

[Mesa-dev] [RFC 1/9] tgsi: add precise flag to tgsi_instruction

Signed-off-by: Karol Herbst 
---
 src/gallium/auxiliary/tgsi/tgsi_build.c| 1 +
 src/gallium/include/pipe/p_shader_tokens.h | 3 ++-
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_build.c 
b/src/gallium/auxiliary/tgsi/tgsi_build.c
index 00843241f8..55e4d064ed 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_build.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_build.c
@@ -642,6 +642,7 @@ tgsi_default_instruction( void )
instruction.Label = 0;
instruction.Texture = 0;
instruction.Memory = 0;
+   instruction.Precise = 0;
instruction.Padding = 0;
 
return instruction;
diff --git a/src/gallium/include/pipe/p_shader_tokens.h 
b/src/gallium/include/pipe/p_shader_tokens.h
index 1e08d97329..aa0fb3e3b3 100644
--- a/src/gallium/include/pipe/p_shader_tokens.h
+++ b/src/gallium/include/pipe/p_shader_tokens.h
@@ -638,7 +638,8 @@ struct tgsi_instruction
unsigned Label  : 1;
unsigned Texture: 1;
unsigned Memory : 1;
-   unsigned Padding: 2;
+   unsigned Precise: 1;
+   unsigned Padding: 1;
 };
 
 /*
-- 
2.13.1

[Mesa-dev] [RFC 3/9] st/glsl_to_tgsi: handle precise modifier

All subexpressions inside an ir_assignment need to be tagged as precise.

Signed-off-by: Karol Herbst 
---
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 80 --
 1 file changed, 65 insertions(+), 15 deletions(-)

diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index c5d2e0fcd2..19f90f21fe 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -87,6 +87,13 @@ static int swizzle_for_type(const glsl_type *type, int 
component = 0)
return swizzle;
 }
 
+static unsigned is_precise(const ir_variable *ir)
+{
+   if (!ir)
+  return 0;
+   return ir->data.precise || ir->data.invariant;
+}
+
 /**
  * This struct is a corresponding struct to TGSI ureg_src.
  */
@@ -296,6 +303,7 @@ public:
ir_instruction *ir;
 
unsigned op:8; /**< TGSI opcode */
+   unsigned precise:1;
unsigned saturate:1;
unsigned is_64bit_expanded:1;
unsigned sampler_base:5;
@@ -435,6 +443,7 @@ public:
bool have_fma;
bool use_shared_memory;
bool has_tex_txf_lz;
+   unsigned precise;
 
variable_storage *find_variable_storage(ir_variable *var);
 
@@ -505,13 +514,29 @@ public:
   st_src_reg src0 = undef_src,
   st_src_reg src1 = undef_src,
   st_src_reg src2 = undef_src,
-  st_src_reg src3 = undef_src);
+  st_src_reg src3 = undef_src,
+  unsigned precise = 0);
 
glsl_to_tgsi_instruction *emit_asm(ir_instruction *ir, unsigned op,
   st_dst_reg dst, st_dst_reg dst1,
   st_src_reg src0 = undef_src,
   st_src_reg src1 = undef_src,
   st_src_reg src2 = undef_src,
+  st_src_reg src3 = undef_src,
+  unsigned precise = 0);
+
+   glsl_to_tgsi_instruction *emit_asm(ir_expression *ir, unsigned op,
+  st_dst_reg dst = undef_dst,
+  st_src_reg src0 = undef_src,
+  st_src_reg src1 = undef_src,
+  st_src_reg src2 = undef_src,
+  st_src_reg src3 = undef_src);
+
+   glsl_to_tgsi_instruction *emit_asm(ir_expression *ir, unsigned op,
+  st_dst_reg dst, st_dst_reg dst1,
+  st_src_reg src0 = undef_src,
+  st_src_reg src1 = undef_src,
+  st_src_reg src2 = undef_src,
   st_src_reg src3 = undef_src);
 
unsigned get_opcode(unsigned op,
@@ -650,7 +675,8 @@ glsl_to_tgsi_instruction *
 glsl_to_tgsi_visitor::emit_asm(ir_instruction *ir, unsigned op,
st_dst_reg dst, st_dst_reg dst1,
st_src_reg src0, st_src_reg src1,
-   st_src_reg src2, st_src_reg src3)
+   st_src_reg src2, st_src_reg src3,
+   unsigned precise)
 {
glsl_to_tgsi_instruction *inst = new(mem_ctx) glsl_to_tgsi_instruction();
int num_reladdr = 0, i, j;
@@ -691,6 +717,7 @@ glsl_to_tgsi_visitor::emit_asm(ir_instruction *ir, unsigned 
op,
STATIC_ASSERT(TGSI_OPCODE_LAST <= 255);
 
inst->op = op;
+   inst->precise = precise;
inst->info = tgsi_get_opcode_info(op);
inst->dst[0] = dst;
inst->dst[1] = dst1;
@@ -881,9 +908,28 @@ glsl_to_tgsi_instruction *
 glsl_to_tgsi_visitor::emit_asm(ir_instruction *ir, unsigned op,
st_dst_reg dst,
st_src_reg src0, st_src_reg src1,
+   st_src_reg src2, st_src_reg src3,
+   unsigned precise)
+{
+   return emit_asm(ir, op, dst, undef_dst, src0, src1, src2, src3, precise);
+}
+
+glsl_to_tgsi_instruction *
+glsl_to_tgsi_visitor::emit_asm(ir_expression *ir, unsigned op,
+   st_dst_reg dst,
+   st_src_reg src0, st_src_reg src1,
+   st_src_reg src2, st_src_reg src3)
+{
+   return emit_asm(ir, op, dst, undef_dst, src0, src1, src2, src3, this->precise);
+}
+
+glsl_to_tgsi_instruction *
+glsl_to_tgsi_visitor::emit_asm(ir_expression *ir, unsigned op,
+   st_dst_reg dst, st_dst_reg dst1,
+   st_src_reg src0, st_src_reg src1,
st_src_reg src2, st_src_reg src3)
 {
-   return emit_asm(ir, op, dst, undef_dst, src0, src1, src2, src3);
+   return emit_asm(ir, op, dst, dst1, src0, src1, src2, sr

[Mesa-dev] [RFC 5/9] tgsi/text: parse _PRECISE modifier

Signed-off-by: Karol Herbst 
---
 src/gallium/auxiliary/tgsi/tgsi_text.c | 15 +--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_text.c 
b/src/gallium/auxiliary/tgsi/tgsi_text.c
index 93a05568f4..c5fcb3283d 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_text.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_text.c
@@ -999,6 +999,7 @@ parse_texoffset_operand(
 static boolean
 match_inst(const char **pcur,
unsigned *saturate,
+   unsigned *precise,
const struct tgsi_opcode_info *info)
 {
const char *cur = *pcur;
@@ -1007,6 +1008,7 @@ match_inst(const char **pcur,
if (str_match_nocase_whole(&cur, info->mnemonic)) {
   *pcur = cur;
   *saturate = 0;
+  *precise = 0;
   return TRUE;
}
 
@@ -1015,8 +1017,15 @@ match_inst(const char **pcur,
   if (str_match_nocase_whole(&cur, "_SAT")) {
  *pcur = cur;
  *saturate = 1;
- return TRUE;
   }
+
+  if (str_match_nocase_whole(&cur, "_PRECISE")) {
+ *pcur = cur;
+ *precise = 1;
+  }
+
+  if (*precise || *saturate)
+ return TRUE;
}
 
return FALSE;
@@ -1029,6 +1038,7 @@ parse_instruction(
 {
uint i;
uint saturate = 0;
+   uint precise = 0;
const struct tgsi_opcode_info *info;
struct tgsi_full_instruction inst;
const char *cur;
@@ -1043,7 +1053,7 @@ parse_instruction(
   cur = ctx->cur;
 
   info = tgsi_get_opcode_info( i );
-  if (match_inst(&cur, &saturate, info)) {
+  if (match_inst(&cur, &saturate, &precise, info)) {
  if (info->num_dst + info->num_src + info->is_tex == 0) {
 ctx->cur = cur;
 break;
@@ -1064,6 +1074,7 @@ parse_instruction(
 
inst.Instruction.Opcode = i;
inst.Instruction.Saturate = saturate;
+   inst.Instruction.Precise = precise;
inst.Instruction.NumDstRegs = info->num_dst;
inst.Instruction.NumSrcRegs = info->num_src;
 
-- 
2.13.1

[Mesa-dev] [RFC 0/9] Add precise/invariant semantics to TGSI

Running Tomb Raider on Nouveau, I found flickering caused by Nouveau ignoring
the precise modifiers on variables.

This series adds precise/invariant handling to TGSI, which drivers can then use
to disable certain unsafe optimisations that would otherwise alter calculations
whose results must be identical across shaders.

This series fixes this bug in Tomb Raider and one CTS test for 4.4 and 4.5

Note on Patch 3: I really dislike how I tell glsl_to_tgsi_visitor to apply the
precise flag on instructions emitted in ir_assignment->rhs->accept(), but I found
no other easy way to handle this. Maybe one of you has a better idea?

Karol Herbst (9):
  tgsi: add precise flag to tgsi_instruction
  tgsi/dump: print _PRECISE modifier on Instructions
  st/glsl_to_tgsi: handle precise modifier
  tgsi: populate precise
  tgsi/text: parse _PRECISE modifier
  nv50/ir: add precise field to Instruction
  nv50/ir/tgsi: handle precise for most ALU instructions
  nv50/ir: disable mul+add to mad for precise instructions
  nv50/ir/tgsi: split mad to mul+add

 src/gallium/auxiliary/tgsi/tgsi_build.c|  4 +
 src/gallium/auxiliary/tgsi/tgsi_dump.c |  4 +
 src/gallium/auxiliary/tgsi/tgsi_text.c | 15 +++-
 src/gallium/auxiliary/tgsi/tgsi_ureg.c | 14 +++-
 src/gallium/auxiliary/tgsi/tgsi_ureg.h | 20 -
 src/gallium/auxiliary/util/u_simple_shaders.c  |  2 +-
 src/gallium/drivers/nouveau/codegen/nv50_ir.h  |  1 +
 .../drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp  | 16 
 .../drivers/nouveau/codegen/nv50_ir_peephole.cpp   |  6 +-
 src/gallium/include/pipe/p_shader_tokens.h |  3 +-
 src/gallium/state_trackers/nine/nine_shader.c  |  6 +-
 src/mesa/state_tracker/st_atifs_to_tgsi.c  | 38 -
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 92 +-
 src/mesa/state_tracker/st_mesa_to_tgsi.c   |  8 +-
 src/mesa/state_tracker/st_pbo.c|  2 +-
 15 files changed, 172 insertions(+), 59 deletions(-)

-- 
2.13.1

Re: [Mesa-dev] [PATCH v2 1/1] radeonsi: Use libdrm to get chipset name

2017-06-11 Thread Henri Verbeet

On 7 June 2017 at 21:54, Marek Olšák  wrote:
> On Wed, Jun 7, 2017 at 2:07 AM, Marek Olšák  wrote:
>> On Wed, Jun 7, 2017 at 12:21 AM, Samuel Li  wrote:
>>> @@ -790,6 +790,15 @@ static const char* r600_get_device_vendor(struct 
>>> pipe_screen* pscreen)
>>>
>>>  static const char* r600_get_chip_name(struct r600_common_screen *rscreen)
>>>  {
>>> +   const char *mname;
>>> +
>>> +   if (rscreen->ws->get_chip_name) {
>>> +   mname = rscreen->ws->get_chip_name(rscreen->ws);
>>> +   if (mname != NULL)
>>> +   return mname;
>>> +   }
>>> +
>>> +   /* fall back to family names*/
>>> switch (rscreen->info.family) {
>>> case CHIP_R600: return "AMD R600";
>>> case CHIP_RV610: return "AMD RV610";

As someone downstream of this, I have to say I find the "family" names
much more informative than whatever marketing came up with. More
importantly however, this commit changes the GL_RENDERER string
reported to applications, like Wine, for existing GPUs in an
incompatible way. Since I suspect displaying the "marketing" name is
important to at least some people at AMD, could I request please
including the family name as well, as is done by for example lspci?

Re: [Mesa-dev] [PATCH 0/3] [RFC] mesa/st: glsl_to_tgsi: improved temp-reg lifetime estimation

2017-06-11 Thread Gert Wollny

Hello Marek, 

thanks for chiming in. 

On Sunday, 11.06.2017 at 16:15 +0200, Marek Olšák wrote:
> Also, I don't know if people will like that it uses STL. I personally
> have no issue with that as long as it doesn't break apps (e.g. the
> STL shipped with apps should be the same as the STL shipped with the
> distribution).

Well, on Linux I would take it for granted that the STL used to run the
code is the same as the one the code was compiled with, and there are
already quite a few places in the mesa code where STL constructs are
used (if that hadn't been the case, I would have tried to avoid
the STL). I am actually more concerned that propagating the C++11
requirement to the whole of src/mesa might not be welcome (although
everything compiles and runs fine).


> On Sun, Jun 11, 2017 at 4:12 PM, Marek Olšák 
> wrote:
> > Hi Gert,
> > 
> > Have you measured the CPU overhead of the new code?

Not so far. I guess one would do that with shader-db to get
reasonably complex shaders, but I only have an r600-based card, so I'm
not sure whether I can run it. In any case, I will look into this
tomorrow.

Best, 
Gert 

> > 
> > Marek
> > 
> > On Sat, Jun 10, 2017 at 1:15 AM, Gert Wollny 
> > wrote:
> > > Dear all,
> > > 
> > > as I wrote before, I was looking into the temporary register
> > > renaming.
> > > 
> > > This series of patches implements a new approach that achieves a
> > > tighter estimation of the lifetime of the temporaries, and as a
> > > result the Piano and Volplosion benchmarks implemented in
> > > gputest [1] now work. Before, they failed with
> > > "r600_pipe_shader_create - translation from TGSI failed!"
> > > 
> > > Piglit shows 7 fixes and 6 regressions compared to git 8fac894f,
> > > but they don't seem to be related to shaders. I've also tested
> > > other programs like the unigine-* benchmarks and they didn't show
> > > regressions.
> > > 
> > > I think that the patch will need a few more iterations to remove
> > > code duplication and generally adhere to the mesa style, but I
> > > think it is at the point where I could use a bit of feedback to
> > > get it into shape to be acceptable. I'd also like to mention that
> > > since I'm new to mesa, I have no commit rights.
> > > 
> > > many thanks,
> > > Gert
> > > 
> > > [1] http://www.geeks3d.com/gputest/
> > > 
> > > Gert Wollny (3):
> > >   mesa/st: glsl_to_tgsi move some helper classes to extra files
> > >   mesa/st: glsl_to_tgsi Implement a new lifetime tracker for
> > > temporaries
> > >   mesa/st: glsl_to_tgsi: tie in the new register renaming
> > > approach
> > > 
> > >  configure.ac   |   1 +
> > >  src/mesa/Makefile.am   |   4 +-
> > >  src/mesa/Makefile.sources  |   4 +
> > >  src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 302 +
> > > ---
> > >  src/mesa/state_tracker/st_glsl_to_tgsi_private.cpp | 241 +++
> > >  src/mesa/state_tracker/st_glsl_to_tgsi_private.h   | 135 
> > >  .../state_tracker/st_glsl_to_tgsi_temprename.cpp   | 551
> > > ++
> > >  .../state_tracker/st_glsl_to_tgsi_temprename.h | 114 +++
> > >  src/mesa/state_tracker/tests/Makefile.am   |  40 ++
> > >  src/mesa/state_tracker/tests/st-renumerate-test| 210 ++
> > >  .../tests/test_glsl_to_tgsi_lifetime.cpp   | 789
> > > +
> > >  11 files changed, 2104 insertions(+), 287 deletions(-)
> > >  create mode 100644
> > > src/mesa/state_tracker/st_glsl_to_tgsi_private.cpp
> > >  create mode 100644
> > > src/mesa/state_tracker/st_glsl_to_tgsi_private.h
> > >  create mode 100644
> > > src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
> > >  create mode 100644
> > > src/mesa/state_tracker/st_glsl_to_tgsi_temprename.h
> > >  create mode 100644 src/mesa/state_tracker/tests/Makefile.am
> > >  create mode 100755 src/mesa/state_tracker/tests/st-renumerate-
> > > test
> > >  create mode 100644
> > > src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp
> > > 
> > > --
> > > 2.13.0
> > > 

[Mesa-dev] [PATCH v3 0/3] Fix missing initializer errors in generated tables

This series aims to fix hundreds of missing initializer warnings in generated
header files when compiling with -Wextra.

V1: Fix the old-fashioned way by adding 0s where needed
V2: Switch to designated initializers (Emil), didn't send
V3: Add some layout so it's easier to read, and create a new version for vk that
just uses {0} instead of {0, 0, 0, 0, 0} (same thing, fewer zeros)

---

Because this generated such unexpected controversy, and out of curiosity, I wrote
a little test program to show the problem. Compile with -Wextra.
No init code is generated for any variant on any compiler I tested.

gcc version 5.4.1 20170304 (Ubuntu 5.4.1-8ubuntu1)
gcc version 6.3.0 20170406 (Ubuntu 6.3.0-12ubuntu2)
gcc version 7.0.1 20170407 (experimental) [trunk revision 246759] (Ubuntu 7-20170407-0ubuntu2)
clang version 4.0.0-1ubuntu1 (tags/RELEASE_400/rc1)
clang version 5.0.0-svn305158-0~z~padoka0 (trunk)
MSVC Compiler Version 19.00.24210

---

struct s {
   int a;
   int b;
   int c;
   int d;
   int e;
};

/* gcc 5/6/7 and clang 4/5 accept this without warning; MSVC2013 doesn't
 * compile it. Not standard, will generate a warning with -Wpedantic, but
 * much prettier. */
static const struct s str1 = {};
/* clang 4/5 generate a warning here (although ANSI standard) */
static const struct s str2 = {0};
/* clang 4/5 generate a warning here (although ANSI standard) */
static const struct s str3 = {0,};
/* gcc 5/6/7 and clang 4/5 generate a warning here */
static const struct s str4 = {1,2,3};
/* gcc 5/6/7 and clang 4/5 generate a warning here */
static const struct s str5 = {1,2,3,};
/* this is fine with all compilers */
static const struct s str6 = {1,2,3,0,0};
/* this might not compile on MSVC <2013, but I couldn't test that */
static const struct s str7 = {.a = 1, .b = 2, .c = 3};

int main(void) {
   return 0;
}

---

This is what Rust does:

---

#![allow(unused_variables)]
#![allow(dead_code)]

#[derive(Default)]
struct Test {
   a: i32,
   b: i32,
   c: i32,
   d: i32,
   e: i32
}

fn main() {
//all of these won't work
//let t1 = Test {};
//let t2 = Test {0};
//let t3 = Test {0,};
//let t3b= Test {..}; would be very cool
//let t4 = Test {1,2,3};
//let t5 = Test {1,2,3,};
//let t6 = Test {1,2,3,0,0};
//let t7 = Test {a: 1, b: 1};
//let t8 = Test {a: 1, b: 1, ..}; would be cool

//only this is legal in rust
let t9: Test = Default::default();
let t10 = Test {..Default::default()};
let t11 = Test {a: 1, b: 1, ..Default::default()};
let t12 = Test {a: 1, b: 1, c: 1, d: 1, e: 1};

println!("Hello, world!");
}

---

So in the end I followed Emil's suggestion of designated initializers for
partial initialization, to make it explicit and get rid of the warnings.

Please kindly review and push.

Thanks,
Benedikt

[Mesa-dev] [PATCH v3 2/3] Fix missing initializer warning in egd_tables.h by adding appropriate default fields in egd_tables.py

Fix missing initializer warning in egd_tables.h by adding appropriate 
designated initializers in egd_tables.py

---
 src/gallium/drivers/r600/egd_tables.py | 15 ++-
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/src/gallium/drivers/r600/egd_tables.py 
b/src/gallium/drivers/r600/egd_tables.py
index 4c606025ba..289981ae18 100644
--- a/src/gallium/drivers/r600/egd_tables.py
+++ b/src/gallium/drivers/r600/egd_tables.py
@@ -266,11 +266,13 @@ struct eg_packet3 {
 while value[1] >= len(values_offsets):
 values_offsets.append(-1)
 values_offsets[value[1]] = 
strings.add(strip_prefix(value[0]))
-print '\t{%s, %s(~0u), %s, %s},' % (
-strings.add(field.name), field.s_name,
+print '\t{.name_offset\t= %s,\r\n\t .mask\t\t\t= %s(~0u), \
+   \r\n\t .num_values\t= %s,\r\n\t .values_offset\t= 
%s},' \
+   % (strings.add(field.name), field.s_name,
 len(values_offsets), 
strings_offsets.add(values_offsets))
 else:
-print '\t{%s, %s(~0u)},' % (strings.add(field.name), 
field.s_name)
+print '\t{.name_offset\t= %s,\r\n\t .mask\t\t\t= 
%s(~0u)},' \
+   % (strings.add(field.name), field.s_name)
 fields_idx += 1

 print '};'
@@ -279,10 +281,13 @@ struct eg_packet3 {
 print 'static const struct eg_reg egd_reg_table[] = {'
 for reg in regs:
 if len(reg.fields):
-print '\t{%s, %s, %s, %s},' % (strings.add(reg.name), reg.r_name,
+print '\t{.name_offset\t= %s,\r\n\t .offset\t\t= %s, \
+   \r\n\t .num_fields\t= %s,\r\n\t .fields_offset\t= %s},' \
+   % (strings.add(reg.name), reg.r_name,
 len(reg.fields), reg.fields_idx if reg.own_fields else 
reg.fields_owner.fields_idx)
 else:
-print '\t{%s, %s},' % (strings.add(reg.name), reg.r_name)
+print '\t{.name_offset\t= %s,\r\n\t .offset\t\t= %s},' \
+   % (strings.add(reg.name), reg.r_name)
 print '};'
 print

-- 
2.11.0


[Mesa-dev] [PATCH v3 3/3] Fix missing initializer warning in vk_format_table.h by adding appropriate default fields in vk_format_table.py

Fix missing initializer warning in vk_format_table.h by changing to a default
initializer in vk_format_table.py,
and correct the "autogenerated by" message

---
 src/amd/vulkan/vk_format_table.py | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/amd/vulkan/vk_format_table.py 
b/src/amd/vulkan/vk_format_table.py
index 36352b108d..139bb9544c 100644
--- a/src/amd/vulkan/vk_format_table.py
+++ b/src/amd/vulkan/vk_format_table.py
@@ -86,7 +86,7 @@ def print_channels(format, func):
 print '#endif'

 def write_format_table(formats):
-print '/* This file is autogenerated by u_format_table.py from 
u_format.csv. Do not edit directly. */'
+print '/* This file is autogenerated by vk_format_table.py from 
vk_format_layout.csv. Do not edit directly. */'
 print
 # This will print the copyright message on the top of this file
 print CopyRight.strip()
@@ -106,7 +106,7 @@ def write_format_table(formats):
 if channel.size:
 print "  {%s, %s, %s, %s, %u, %u}%s\t/* %s = %s */" % 
(type_map[channel.type],
bool_map(channel.norm), bool_map(channel.pure), bool_map(channel.scaled), 
channel.size, channel.shift, sep, "xyzw"[i],
channel.name)
 else:
-print "  {0, 0, 0, 0, 0}%s" % (sep,)
+print "  {0}%s" % (sep,)
 print "   },"

 def do_swizzle_array(channels, swizzles):
-- 
2.11.0

[Mesa-dev] [PATCH v3 1/3] Fix missing initializer warning in sid_tables.h by adding appropriate default fields in sid_tables.py

Fix missing initializer warning in sid_tables.h by adding appropriate 
designated initializers in sid_tables.py

---
 src/amd/common/sid_tables.py | 15 ++-
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/src/amd/common/sid_tables.py b/src/amd/common/sid_tables.py
index fd88d3c9d5..691d766b08 100644
--- a/src/amd/common/sid_tables.py
+++ b/src/amd/common/sid_tables.py
@@ -266,11 +266,13 @@ struct si_packet3 {
 while value[1] >= len(values_offsets):
 values_offsets.append(-1)
 values_offsets[value[1]] = 
strings.add(strip_prefix(value[0]))
-print '\t{%s, %s(~0u), %s, %s},' % (
-strings.add(field.name), field.s_name,
+print '\t{.name_offset\t= %s,\r\n\t .mask\t\t\t= %s(~0u), \
+   \r\n\t .num_values\t= %s,\r\n\t .values_offset\t= 
%s},' \
+   % (strings.add(field.name), field.s_name,
 len(values_offsets), 
strings_offsets.add(values_offsets))
 else:
-print '\t{%s, %s(~0u)},' % (strings.add(field.name), 
field.s_name)
+print '\t{.name_offset\t= %s,\r\n\t .mask\t\t\t= 
%s(~0u)},' \
+   % (strings.add(field.name), field.s_name)
 fields_idx += 1

 print '};'
@@ -279,10 +281,13 @@ struct si_packet3 {
 print 'static const struct si_reg sid_reg_table[] = {'
 for reg in regs:
 if len(reg.fields):
-print '\t{%s, %s, %s, %s},' % (strings.add(reg.name), reg.r_name,
+print '\t{.name_offset\t= %s,\r\n\t .offset\t\t= %s, \
+   \r\n\t .num_fields\t= %s,\r\n\t .fields_offset\t= %s},' \
+   % (strings.add(reg.name), reg.r_name,
 len(reg.fields), reg.fields_idx if reg.own_fields else 
reg.fields_owner.fields_idx)
 else:
-print '\t{%s, %s},' % (strings.add(reg.name), reg.r_name)
+print '\t{.name_offset\t= %s,\r\n\t .offset\t\t= %s},' \
+   % (strings.add(reg.name), reg.r_name)
 print '};'
 print

-- 
2.11.0
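
The effect of the designated initializers emitted by this patch can be sketched in C as follows. This is an illustration under assumptions: `reg_entry`, the table contents, and the helper are hypothetical, and only the member names are taken from the generated output in the diff. Naming each member makes the intent explicit, and GCC's -Wmissing-field-initializers does not warn about designated initializers, so the short two-member form no longer triggers the warning.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical register-table entry; member names follow the patch output. */
struct reg_entry {
   unsigned name_offset;
   unsigned offset;
   unsigned num_fields;
   unsigned fields_offset;
};

static const struct reg_entry reg_table[] = {
   /* Register with fields: every member is named explicitly. */
   {.name_offset = 0, .offset = 0x8000, .num_fields = 2, .fields_offset = 0},
   /* Register without fields: the omitted members are zero-initialized. */
   {.name_offset = 12, .offset = 0x8004},
};

static const size_t reg_table_len = sizeof(reg_table) / sizeof(reg_table[0]);

/* A register has fields when its field count is non-zero. */
static int reg_has_fields(const struct reg_entry *reg)
{
   return reg->num_fields != 0;
}
```

The second entry relies on the C rule that members without an initializer are set to zero, which is what the old positional `{%s, %s}` form relied on implicitly.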



[Mesa-dev] [Bug 101334] Any vulkan app seems to freeze the system

https://bugs.freedesktop.org/show_bug.cgi?id=101334

--- Comment #15 from John  ---
Created attachment 131879
  --> https://bugs.freedesktop.org/attachment.cgi?id=131879&action=edit
trace

ooops

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.

[Mesa-dev] [Bug 101334] Any vulkan app seems to freeze the system

https://bugs.freedesktop.org/show_bug.cgi?id=101334

--- Comment #14 from Grazvydas Ignotas  ---
Looks like you attached the wrong file.


[Mesa-dev] [Bug 101334] Any vulkan app seems to freeze the system

https://bugs.freedesktop.org/show_bug.cgi?id=101334

--- Comment #13 from John  ---
Created attachment 131878
  --> https://bugs.freedesktop.org/attachment.cgi?id=131878&action=edit
trace

The ML patch worked!

Here's the trace.

Thank you!


Re: [Mesa-dev] [RFC PATCH] st/mesa: skip texture validation logic when nothing has changed

If it's copied from i965, it must be correct, right? ;) It probably is.

Reviewed-by: Marek Olšák 

Marek

On Sat, Jun 10, 2017 at 6:52 AM, Timothy Arceri  wrote:
> Based on the same logic in the i965 driver 2f225f61451abd51 and
> 16060c5adcd4.
>
> perf reports st_finalize_texture() going from 0.60% -> 0.16% with
> this change when running the Xonotic benchmark from PTS.
> ---
>
>  A full run of piglit on radeonsi produced no regressions. No other drivers
>  have been tested.
>
>  src/mesa/state_tracker/st_cb_texture.c | 28 
>  src/mesa/state_tracker/st_manager.c|  2 ++
>  src/mesa/state_tracker/st_texture.h|  9 +
>  3 files changed, 39 insertions(+)
>
> diff --git a/src/mesa/state_tracker/st_cb_texture.c 
> b/src/mesa/state_tracker/st_cb_texture.c
> index 99c59f7..443bb7b 100644
> --- a/src/mesa/state_tracker/st_cb_texture.c
> +++ b/src/mesa/state_tracker/st_cb_texture.c
> @@ -147,20 +147,22 @@ st_DeleteTextureImage(struct gl_context * ctx, struct 
> gl_texture_image *img)
>
>  /** called via ctx->Driver.NewTextureObject() */
>  static struct gl_texture_object *
>  st_NewTextureObject(struct gl_context * ctx, GLuint name, GLenum target)
>  {
> struct st_texture_object *obj = ST_CALLOC_STRUCT(st_texture_object);
>
> DBG("%s\n", __func__);
> _mesa_initialize_texture_object(ctx, &obj->base, name, target);
>
> +   obj->needs_validation = true;
> +
> return &obj->base;
>  }
>
>  /** called via ctx->Driver.DeleteTextureObject() */
>  static void
>  st_DeleteTextureObject(struct gl_context *ctx,
> struct gl_texture_object *texObj)
>  {
> struct st_context *st = st_context(ctx);
> struct st_texture_object *stObj = st_texture_object(texObj);
> @@ -599,20 +601,22 @@ st_AllocTextureImageBuffer(struct gl_context *ctx,
> struct st_texture_object *stObj = st_texture_object(texImage->TexObject);
> const GLuint level = texImage->Level;
> GLuint width = texImage->Width;
> GLuint height = texImage->Height;
> GLuint depth = texImage->Depth;
>
> DBG("%s\n", __func__);
>
> assert(!stImage->pt); /* xxx this might be wrong */
>
> +   stObj->needs_validation = true;
> +
> etc_fallback_allocate(st, stImage);
>
> /* Look if the parent texture object has space for this image */
> if (stObj->pt &&
> level <= stObj->pt->last_level &&
> st_texture_match_image(st, stObj->pt, texImage)) {
>/* this image will fit in the existing texture object's memory */
>pipe_resource_reference(&stImage->pt, stObj->pt);
>return GL_TRUE;
> }
> @@ -2478,20 +2482,30 @@ st_finalize_texture(struct gl_context *ctx,
>   pipe_resource_reference(&stObj->pt, st_obj->buffer);
>   st_texture_release_all_sampler_views(st, stObj);
>}
>return GL_TRUE;
>
> }
>
> firstImage = 
> st_texture_image_const(stObj->base.Image[cubeMapFace][stObj->base.BaseLevel]);
> assert(firstImage);
>
> +   /* Skip the loop over images in the common case of no images having
> +* changed.  But if the GL_BASE_LEVEL or GL_MAX_LEVEL change to something 
> we
> +* haven't looked at, then we do need to look at those new images.
> +*/
> +   if (!stObj->needs_validation &&
> +   stObj->base.BaseLevel >= stObj->validated_first_level &&
> +   stObj->lastLevel <= stObj->validated_last_level) {
> +  return GL_TRUE;
> +   }
> +
> /* If both firstImage and stObj point to a texture which can contain
>  * all active images, favour firstImage.  Note that because of the
>  * completeness requirement, we know that the image dimensions
>  * will match.
>  */
> if (firstImage->pt &&
> firstImage->pt != stObj->pt &&
> (!stObj->pt || firstImage->pt->last_level >= stObj->pt->last_level)) {
>pipe_resource_reference(&stObj->pt, firstImage->pt);
>st_texture_release_all_sampler_views(st, stObj);
> @@ -2624,20 +2638,24 @@ st_finalize_texture(struct gl_context *ctx,
>  (stImage->base.Width == u_minify(ptWidth, level) &&
>   stImage->base.Height == height &&
>   stImage->base.Depth == depth)) {
> /* src image fits expected dest mipmap level size */
> copy_image_data_to_texture(st, stObj, level, stImage);
>  }
>   }
>}
> }
>
> +   stObj->validated_first_level = stObj->base.BaseLevel;
> +   stObj->validated_last_level = stObj->lastLevel;
> +   stObj->needs_validation = false;
> +
> return GL_TRUE;
>  }
>
>
>  /**
>   * Called via ctx->Driver.AllocTextureStorage() to allocate texture memory
>   * for a whole mipmap stack.
>   */
>  static GLboolean
>  st_AllocTextureStorage(struct gl_context *ctx,
> @@ -2705,20 +2723,25 @@ st_AllocTextureStorage(struct gl_context *ctx,
>GLuint face;
>for (face = 0; face < numFaces; face++) {
>   struct st_texture_image *stImage =
>  st_texture_ima
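
The caching scheme in the quoted patch can be sketched in isolation as below. This is a simplified model, not the state tracker code: `tex_state`, `finalize()` and the `full_validations` counter are hypothetical, and the expensive loop over images is replaced by a counter increment. The names `needs_validation`, `validated_first_level` and `validated_last_level` mirror the members the patch adds.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, reduced model of the per-texture validation state. */
struct tex_state {
   bool needs_validation;
   unsigned base_level;
   unsigned last_level;
   unsigned validated_first_level;
   unsigned validated_last_level;
   unsigned full_validations; /* counts how often the slow path ran */
};

static bool finalize(struct tex_state *t)
{
   /* Fast path: nothing changed and the requested level range lies
    * inside the range that was validated last time. */
   if (!t->needs_validation &&
       t->base_level >= t->validated_first_level &&
       t->last_level <= t->validated_last_level)
      return true;

   /* Slow path: stand-in for the loop over all images. */
   t->full_validations++;

   /* Remember what was validated so the next call can skip the work. */
   t->validated_first_level = t->base_level;
   t->validated_last_level = t->last_level;
   t->needs_validation = false;
   return true;
}
```

In this model, a second call with unchanged state skips the slow path, narrowing the level range still hits the fast path, and widening it or setting `needs_validation` forces a full pass again.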

Re: [Mesa-dev] [PATCH] ac: Use mov_dpp for derivatives.

Hi Bas,

Have you tested piglit on radeonsi with this?

Marek

On Sat, Jun 10, 2017 at 10:05 PM, Bas Nieuwenhuizen
 wrote:
> Slightly faster than bpermute, and seems supported since at least
> LLVM 3.9.
>
> v2: Since this supersedes bpermute, remove the bpermute code.
> Signed-off-by: Bas Nieuwenhuizen 
> ---
>  src/amd/common/ac_llvm_build.c   | 47 
> 
>  src/amd/common/ac_llvm_build.h   |  2 +-
>  src/amd/common/ac_nir_to_llvm.c  |  8 +++---
>  src/gallium/drivers/radeonsi/si_pipe.c   |  2 +-
>  src/gallium/drivers/radeonsi/si_pipe.h   |  2 +-
>  src/gallium/drivers/radeonsi/si_shader.c |  4 +--
>  6 files changed, 38 insertions(+), 27 deletions(-)
>
> diff --git a/src/amd/common/ac_llvm_build.c b/src/amd/common/ac_llvm_build.c
> index 237e9291d41..99d41bf52d6 100644
> --- a/src/amd/common/ac_llvm_build.c
> +++ b/src/amd/common/ac_llvm_build.c
> @@ -783,41 +783,52 @@ ac_get_thread_id(struct ac_llvm_context *ctx)
>   */
>  LLVMValueRef
>  ac_build_ddxy(struct ac_llvm_context *ctx,
> - bool has_ds_bpermute,
> + bool has_mov_dpp,
>   uint32_t mask,
>   int idx,
>   LLVMValueRef lds,
>   LLVMValueRef val)
>  {
> -   LLVMValueRef thread_id, tl, trbl, tl_tid, trbl_tid, args[2];
> +   LLVMValueRef thread_id, tl, trbl, args[5];
> LLVMValueRef result;
>
> -   thread_id = ac_get_thread_id(ctx);
> +   if (has_mov_dpp) {
> +   uint32_t tl_ctrl = 0, trbl_ctrl = 0;
>
> -   tl_tid = LLVMBuildAnd(ctx->builder, thread_id,
> - LLVMConstInt(ctx->i32, mask, false), "");
> -
> -   trbl_tid = LLVMBuildAdd(ctx->builder, tl_tid,
> -   LLVMConstInt(ctx->i32, idx, false), "");
> +   for (unsigned i = 0; i < 4; ++i) {
> +   tl_ctrl |= (i & mask) << (2 * i);
> +   trbl_ctrl |= ((i & mask) + idx) << (2 * i);
> +   }
>
> -   if (has_ds_bpermute) {
> -   args[0] = LLVMBuildMul(ctx->builder, tl_tid,
> -  LLVMConstInt(ctx->i32, 4, false), "");
> -   args[1] = val;
> +   args[0] = val;
> +   args[1] = LLVMConstInt(ctx->i32, tl_ctrl, false);
> +   args[2] = LLVMConstInt(ctx->i32, 0xf, false);
> +   args[3] = LLVMConstInt(ctx->i32, 0xf, false);
> +   args[4] = LLVMConstInt(ctx->i1, 1, false);
> tl = ac_build_intrinsic(ctx,
> -   "llvm.amdgcn.ds.bpermute", ctx->i32,
> -   args, 2,
> +   "llvm.amdgcn.mov.dpp.i32", ctx->i32,
> +   args, 5,
> AC_FUNC_ATTR_READNONE |
> AC_FUNC_ATTR_CONVERGENT);
>
> -   args[0] = LLVMBuildMul(ctx->builder, trbl_tid,
> -  LLVMConstInt(ctx->i32, 4, false), "");
> +   args[1] = LLVMConstInt(ctx->i32, trbl_ctrl, false);
> trbl = ac_build_intrinsic(ctx,
> - "llvm.amdgcn.ds.bpermute", ctx->i32,
> - args, 2,
> + "llvm.amdgcn.mov.dpp.i32", ctx->i32,
> + args, 5,
>   AC_FUNC_ATTR_READNONE |
>   AC_FUNC_ATTR_CONVERGENT);
> } else {
> +   LLVMValueRef tl_tid, trbl_tid;
> +
> +   thread_id = ac_get_thread_id(ctx);
> +
> +   tl_tid = LLVMBuildAnd(ctx->builder, thread_id,
> +   LLVMConstInt(ctx->i32, mask, false), "");
> +
> +   trbl_tid = LLVMBuildAdd(ctx->builder, tl_tid,
> +   LLVMConstInt(ctx->i32, idx, false), 
> "");
> +
> +
> LLVMValueRef store_ptr, load_ptr0, load_ptr1;
>
> store_ptr = ac_build_gep0(ctx, lds, thread_id);
> diff --git a/src/amd/common/ac_llvm_build.h b/src/amd/common/ac_llvm_build.h
> index ebb78fbd79b..14260b05018 100644
> --- a/src/amd/common/ac_llvm_build.h
> +++ b/src/amd/common/ac_llvm_build.h
> @@ -161,7 +161,7 @@ ac_get_thread_id(struct ac_llvm_context *ctx);
>
>  LLVMValueRef
>  ac_build_ddxy(struct ac_llvm_context *ctx,
> - bool has_ds_bpermute,
> + bool has_mov_dpp,
>   uint32_t mask,
>   int idx,
>   LLVMValueRef lds,
> diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
> index 49117d21bd2..2385c60d316 100644
> --- a/src/amd/common/ac_nir_to_llvm.c
> +++ b/src/amd/common/ac_nir_to_llvm.c
> @@ -164,7 +164,7 @@ struct nir_to_llvm_context {
> uint8_t num_output_clips;
> uint8_t num_output_c
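
The control-word loop in the quoted ac_build_ddxy() change can be tried out on its own with the sketch below. It assumes, as the loop suggests, that the control value packs one 2-bit source-lane selector per lane of a 2x2 quad. The helper name `quad_ctrl` and the example mask and idx values (~1 with 1, ~2 with 2) are assumptions for illustration; the patch excerpt only shows `mask` and `idx` as parameters.

```c
#include <assert.h>
#include <stdint.h>

/* Pack four 2-bit source-lane selectors, one per lane i of a quad.
 * Lane i reads from lane (i & mask) + idx, as in the quoted loop. */
static uint32_t quad_ctrl(uint32_t mask, uint32_t idx)
{
   uint32_t ctrl = 0;

   for (unsigned i = 0; i < 4; ++i)
      ctrl |= ((i & mask) + idx) << (2 * i);

   return ctrl;
}
```

Calling `quad_ctrl(mask, 0)` corresponds to `tl_ctrl` and `quad_ctrl(mask, idx)` to `trbl_ctrl`. With the assumed mask ~1 and idx 1 this gives 0xA0 and 0xF5; with mask ~2 and idx 2 it gives 0x44 and 0xEE.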

Re: [Mesa-dev] [PATCH] radv: fix trace dumping for !use_ib_bos

2017-06-11 Thread Bas Nieuwenhuizen

Reviewed-by: Bas Nieuwenhuizen 

When use_ib_bos is false we shouldn't chain, and we embed secondary
command buffers directly in the primary buffer as well, so no handling
of chaining is needed.

On Sun, Jun 11, 2017 at 4:03 PM, Grazvydas Ignotas  wrote:
> Fixes trace dumping crash for SI or when RADV_DEBUG=noibs is set.
>
> Fixes: 97dfff5410 "radv: Dump command buffer on hang."
> Signed-off-by: Grazvydas Ignotas 
> ---
> Not sure if chained buffer dumping can be done for !use_ib_bos,
> returning NULL in _get_cpu_addr() just skips that.
>
>  src/amd/vulkan/winsys/amdgpu/radv_amdgpu_cs.c | 14 +++---
>  1 file changed, 11 insertions(+), 3 deletions(-)
>
> diff --git a/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_cs.c 
> b/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_cs.c
> index 7b74970..ffc7566 100644
> --- a/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_cs.c
> +++ b/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_cs.c
> @@ -950,10 +950,13 @@ static int radv_amdgpu_winsys_cs_submit(struct 
> radeon_winsys_ctx *_ctx,
>
>  static void *radv_amdgpu_winsys_get_cpu_addr(void *_cs, uint64_t addr)
>  {
> struct radv_amdgpu_cs *cs = (struct radv_amdgpu_cs *)_cs;
> void *ret = NULL;
> +
> +   if (!cs->ib_buffer)
> +   return NULL;
> for (unsigned i = 0; i <= cs->num_old_ib_buffers; ++i) {
> struct radv_amdgpu_winsys_bo *bo;
>
> bo = (struct radv_amdgpu_winsys_bo*)
>(i == cs->num_old_ib_buffers ? cs->ib_buffer : 
> cs->old_ib_buffers[i]);
> @@ -968,14 +971,19 @@ static void *radv_amdgpu_winsys_get_cpu_addr(void *_cs, 
> uint64_t addr)
>  static void radv_amdgpu_winsys_cs_dump(struct radeon_winsys_cs *_cs,
> FILE* file,
> uint32_t trace_id)
>  {
> struct radv_amdgpu_cs *cs = (struct radv_amdgpu_cs *)_cs;
> +   void *ib = cs->base.buf;
> +   int num_dw = cs->base.cdw;
>
> -   ac_parse_ib(file,
> -   radv_amdgpu_winsys_get_cpu_addr(cs, cs->ib.ib_mc_address),
> -   cs->ib.size, trace_id,  "main IB", 
> cs->ws->info.chip_class,
> +   if (cs->ws->use_ib_bos) {
> +   ib = radv_amdgpu_winsys_get_cpu_addr(cs, 
> cs->ib.ib_mc_address);
> +   num_dw = cs->ib.size;
> +   }
> +   assert(ib);
> +   ac_parse_ib(file, ib, num_dw, trace_id, "main IB", 
> cs->ws->info.chip_class,
> radv_amdgpu_winsys_get_cpu_addr, cs);
>  }
>
>  static struct radeon_winsys_ctx *radv_amdgpu_ctx_create(struct radeon_winsys 
> *_ws)
>  {
> --
> 2.7.4
>

Re: [Mesa-dev] [PATCH 0/3] [RFC] mesa/st: glsl_to_tgsi: improved temp-reg lifetime estimation

Also, I don't know if people will like that it uses STL. I personally
have no issue with that as long as it doesn't break apps (e.g. the STL
shipped with apps should be the same as the STL shipped with the
distribution).

Marek

On Sun, Jun 11, 2017 at 4:12 PM, Marek Olšák  wrote:
> Hi Gert,
>
> Have you measured the CPU overhead of the new code?
>
> Marek
>
> On Sat, Jun 10, 2017 at 1:15 AM, Gert Wollny  wrote:
>> Dear all,
>>
>> as I wrote before, I was looking into the temporary register renaming.
>>
>> This series of patches implements a new approach that achieves a tighter
>> estimation of the lifetime of the temporaries, and as a result the Piano
>> and Volplosion benchmarks implemented in gputest [1] now work. Before
>> they failed with "r600_pipe_shader_create - translation from TGSI failed!"
>>
>> Piglit shows 7 fixes and 6 regressions compared to git 8fac894f, but they
>> don't seem to be related to shaders. I've also tested other programs like
>> the unigine-* benchmarks and they didn't show regressions.
>>
>> I think that the patch will need a few more iterations to remove code
>> duplication and generally adhere to the mesa style, but I think it is at
>> the point where I could use a bit of feedback to get it into shape to be
>> acceptable. I'd also like to mention that since I'm new to mesa, I have
>> no commit rights.
>>
>> many thanks,
>> Gert
>>
>> [1] http://www.geeks3d.com/gputest/
>>
>> Gert Wollny (3):
>>   mesa/st: glsl_to_tgsi move some helper classes to extra files
>>   mesa/st: glsl_to_tgsi Implement a new lifetime tracker for temporaries
>>   mesa/st: glsl_to_tgsi: tie in the new register renaming approach
>>
>>  configure.ac   |   1 +
>>  src/mesa/Makefile.am   |   4 +-
>>  src/mesa/Makefile.sources  |   4 +
>>  src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 302 +---
>>  src/mesa/state_tracker/st_glsl_to_tgsi_private.cpp | 241 +++
>>  src/mesa/state_tracker/st_glsl_to_tgsi_private.h   | 135 
>>  .../state_tracker/st_glsl_to_tgsi_temprename.cpp   | 551 ++
>>  .../state_tracker/st_glsl_to_tgsi_temprename.h | 114 +++
>>  src/mesa/state_tracker/tests/Makefile.am   |  40 ++
>>  src/mesa/state_tracker/tests/st-renumerate-test| 210 ++
>>  .../tests/test_glsl_to_tgsi_lifetime.cpp   | 789 
>> +
>>  11 files changed, 2104 insertions(+), 287 deletions(-)
>>  create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_private.cpp
>>  create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_private.h
>>  create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
>>  create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_temprename.h
>>  create mode 100644 src/mesa/state_tracker/tests/Makefile.am
>>  create mode 100755 src/mesa/state_tracker/tests/st-renumerate-test
>>  create mode 100644 
>> src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp
>>
>> --
>> 2.13.0
>>

[Mesa-dev] [Bug 101334] Any vulkan app seems to freeze the system

https://bugs.freedesktop.org/show_bug.cgi?id=101334

--- Comment #12 from Grazvydas Ignotas  ---
Created attachment 131877
  --> https://bugs.freedesktop.org/attachment.cgi?id=131877&action=edit
trace everything

I've sent a patch that should fix trace dumping for SI:
https://lists.freedesktop.org/archives/mesa-dev/2017-June/158739.html

If you want to trace everything, use the attached patch.


Re: [Mesa-dev] [PATCH 0/3] [RFC] mesa/st: glsl_to_tgsi: improved temp-reg lifetime estimation

Hi Gert,

Have you measured the CPU overhead of the new code?

Marek

On Sat, Jun 10, 2017 at 1:15 AM, Gert Wollny  wrote:
> Dear all,
>
> as I wrote before, I was looking into the temporary register renaming.
>
> This series of patches implements a new approach that achieves a tighter
> estimation of the lifetime of the temporaries, and as a result the Piano
> and Volplosion benchmarks implemented in gputest [1] now work. Before
> they failed with "r600_pipe_shader_create - translation from TGSI failed!"
>
> Piglit shows 7 fixes and 6 regressions compared to git 8fac894f, but they
> don't seem to be related to shaders. I've also tested other programs like
> the unigine-* benchmarks and they didn't show regressions.
>
> I think that the patch will need a few more iterations to remove code
> duplication and generally adhere to the mesa style, but I think it is at
> the point where I could use a bit of feedback to get it into shape to be
> acceptable. I'd also like to mention that since I'm new to mesa, I have
> no commit rights.
>
> many thanks,
> Gert
>
> [1] http://www.geeks3d.com/gputest/
>
> Gert Wollny (3):
>   mesa/st: glsl_to_tgsi move some helper classes to extra files
>   mesa/st: glsl_to_tgsi Implement a new lifetime tracker for temporaries
>   mesa/st: glsl_to_tgsi: tie in the new register renaming approach
>
>  configure.ac   |   1 +
>  src/mesa/Makefile.am   |   4 +-
>  src/mesa/Makefile.sources  |   4 +
>  src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 302 +---
>  src/mesa/state_tracker/st_glsl_to_tgsi_private.cpp | 241 +++
>  src/mesa/state_tracker/st_glsl_to_tgsi_private.h   | 135 
>  .../state_tracker/st_glsl_to_tgsi_temprename.cpp   | 551 ++
>  .../state_tracker/st_glsl_to_tgsi_temprename.h | 114 +++
>  src/mesa/state_tracker/tests/Makefile.am   |  40 ++
>  src/mesa/state_tracker/tests/st-renumerate-test| 210 ++
>  .../tests/test_glsl_to_tgsi_lifetime.cpp   | 789 
> +
>  11 files changed, 2104 insertions(+), 287 deletions(-)
>  create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_private.cpp
>  create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_private.h
>  create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
>  create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_temprename.h
>  create mode 100644 src/mesa/state_tracker/tests/Makefile.am
>  create mode 100755 src/mesa/state_tracker/tests/st-renumerate-test
>  create mode 100644 
> src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp
>
> --
> 2.13.0
>

[Mesa-dev] [Bug 90264] [Regression, bisected] Tooltip corruption in Chrome

https://bugs.freedesktop.org/show_bug.cgi?id=90264

--- Comment #76 from omerfarukdoga...@hotmail.com ---
This problem happens when a tooltip was previously shown with multi-line
content and the current tooltip has a smaller number of lines than the previous
one. Check this topic for a detailed explanation:
https://www.kubuntuforums.net/showthread.php?t=71878

It's like the placement of the drawn element is calculated according to the
previous tooltip size (aligned to the bottom).


[Mesa-dev] [PATCH] radv: fix trace dumping for !use_ib_bos

2017-06-11 Thread Grazvydas Ignotas

Fixes trace dumping crash for SI or when RADV_DEBUG=noibs is set.

Fixes: 97dfff5410 "radv: Dump command buffer on hang."
Signed-off-by: Grazvydas Ignotas 
---
Not sure if chained buffer dumping can be done for !use_ib_bos,
returning NULL in _get_cpu_addr() just skips that.

 src/amd/vulkan/winsys/amdgpu/radv_amdgpu_cs.c | 14 +++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_cs.c 
b/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_cs.c
index 7b74970..ffc7566 100644
--- a/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_cs.c
+++ b/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_cs.c
@@ -950,10 +950,13 @@ static int radv_amdgpu_winsys_cs_submit(struct 
radeon_winsys_ctx *_ctx,
 
 static void *radv_amdgpu_winsys_get_cpu_addr(void *_cs, uint64_t addr)
 {
struct radv_amdgpu_cs *cs = (struct radv_amdgpu_cs *)_cs;
void *ret = NULL;
+
+   if (!cs->ib_buffer)
+   return NULL;
for (unsigned i = 0; i <= cs->num_old_ib_buffers; ++i) {
struct radv_amdgpu_winsys_bo *bo;
 
bo = (struct radv_amdgpu_winsys_bo*)
   (i == cs->num_old_ib_buffers ? cs->ib_buffer : 
cs->old_ib_buffers[i]);
@@ -968,14 +971,19 @@ static void *radv_amdgpu_winsys_get_cpu_addr(void *_cs, 
uint64_t addr)
 static void radv_amdgpu_winsys_cs_dump(struct radeon_winsys_cs *_cs,
FILE* file,
uint32_t trace_id)
 {
struct radv_amdgpu_cs *cs = (struct radv_amdgpu_cs *)_cs;
+   void *ib = cs->base.buf;
+   int num_dw = cs->base.cdw;
 
-   ac_parse_ib(file,
-   radv_amdgpu_winsys_get_cpu_addr(cs, cs->ib.ib_mc_address),
-   cs->ib.size, trace_id,  "main IB", cs->ws->info.chip_class,
+   if (cs->ws->use_ib_bos) {
+   ib = radv_amdgpu_winsys_get_cpu_addr(cs, cs->ib.ib_mc_address);
+   num_dw = cs->ib.size;
+   }
+   assert(ib);
+   ac_parse_ib(file, ib, num_dw, trace_id, "main IB", 
cs->ws->info.chip_class,
radv_amdgpu_winsys_get_cpu_addr, cs);
 }
 
 static struct radeon_winsys_ctx *radv_amdgpu_ctx_create(struct radeon_winsys 
*_ws)
 {
-- 
2.7.4


Re: [Mesa-dev] [PATCH] st/mesa: unmap the stream_uploader buffer before drawing

On Sat, Jun 10, 2017 at 5:27 AM, Brian Paul  wrote:
> Some drivers require that the vertex buffers be unmapped prior to
> drawing.  This change unmaps the stream_uploader buffer after we've
> uploaded the zero-stride attributes (unless the driver supports
> rendering with mapped buffers).
>
> This fixes a regression in the VMware driver since 17f776c27be266f2.
> Some Mesa demos such as mandelbrot and brick would display black
> quads instead of the expected rendering.
>
> --
>
> Marek: can you please verify that this is the right place for this
> call (and not in st_draw_vbo())?

Yes, this is the right place.

Reviewed-by: Marek Olšák 

Marek

[Mesa-dev] [RFC 1/2] hud: Handle query values according to their type

2017-06-11 Thread Boyan Ding

Signed-off-by: Boyan Ding 
---
 src/gallium/auxiliary/hud/hud_driver_query.c | 22 +++---
 1 file changed, 19 insertions(+), 3 deletions(-)

diff --git a/src/gallium/auxiliary/hud/hud_driver_query.c 
b/src/gallium/auxiliary/hud/hud_driver_query.c
index 76104b5b49..7a469bd1bd 100644
--- a/src/gallium/auxiliary/hud/hud_driver_query.c
+++ b/src/gallium/auxiliary/hud/hud_driver_query.c
@@ -202,6 +202,7 @@ struct query_info {
unsigned query_type;
unsigned result_index; /* unit depends on query_type */
enum pipe_driver_query_result_type result_type;
+   enum pipe_driver_query_type type;
 
/* Ring of queries. If a query is busy, we use another slot. */
struct pipe_query *query[NUM_QUERIES];
@@ -229,6 +230,19 @@ query_new_value_batch(struct query_info *info)
}
 }
 
+static uint64_t
+query_result_to_u64(union pipe_query_result result,
+enum pipe_driver_query_type type)
+{
+   switch (type) {
+   case PIPE_DRIVER_QUERY_TYPE_PERCENTAGE:
+   case PIPE_DRIVER_QUERY_TYPE_FLOAT:
+  return result.f;
+   default:
+  return result.u64;
+   }
+}
+
 static void
 query_new_value_normal(struct query_info *info)
 {
@@ -242,10 +256,11 @@ query_new_value_normal(struct query_info *info)
   while (1) {
  struct pipe_query *query = info->query[info->tail];
  union pipe_query_result result;
- uint64_t *res64 = (uint64_t *)&result;
+ union pipe_query_result *presult = &result;
 
- if (query && pipe->get_query_result(pipe, query, FALSE, &result)) {
-info->results_cumulative += res64[info->result_index];
+ if (query && pipe->get_query_result(pipe, query, FALSE, presult)) {
+info->results_cumulative +=
+   query_result_to_u64(presult[info->result_index], info->type);
 info->num_results++;
 
 if (info->tail == info->head)
@@ -383,6 +398,7 @@ hud_pipe_query_install(struct hud_batch_query_context **pbq,
info = gr->query_data;
info->pipe = pipe;
info->result_type = result_type;
+   info->type = type;
 
if (flags & PIPE_DRIVER_QUERY_FLAG_BATCH) {
   if (!batch_query_add(pbq, pipe, query_type, &info->result_index))
-- 
2.13.1


Re: [Mesa-dev] [PATCH] gallium/util: whitespace, formatting fixes in u_upload_mgr.c

Reviewed-by: Marek Olšák 

Marek

On Sat, Jun 10, 2017 at 5:27 AM, Brian Paul  wrote:
> ---
>  src/gallium/auxiliary/util/u_upload_mgr.c | 54 
> +--
>  1 file changed, 29 insertions(+), 25 deletions(-)
>
> diff --git a/src/gallium/auxiliary/util/u_upload_mgr.c 
> b/src/gallium/auxiliary/util/u_upload_mgr.c
> index 9528495..4bb14d6 100644
> --- a/src/gallium/auxiliary/util/u_upload_mgr.c
> +++ b/src/gallium/auxiliary/util/u_upload_mgr.c
> @@ -1,8 +1,8 @@
>  /**
> - *
> + *
>   * Copyright 2009 VMware, Inc.
>   * All Rights Reserved.
> - *
> + *
>   * Permission is hereby granted, free of charge, to any person obtaining a
>   * copy of this software and associated documentation files (the
>   * "Software"), to deal in the Software without restriction, including
> @@ -10,11 +10,11 @@
>   * distribute, sub license, and/or sell copies of the Software, and to
>   * permit persons to whom the Software is furnished to do so, subject to
>   * the following conditions:
> - *
> + *
>   * The above copyright notice and this permission notice (including the
>   * next paragraph) shall be included in all copies or substantial portions
>   * of the Software.
> - *
> + *
>   * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
>   * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
>   * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT.
> @@ -22,7 +22,7 @@
>   * ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
>   * TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
>   * SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
> - *
> + *
>   **/
>
>  /* Helper utility for uploading user buffers & other data, and
> @@ -59,7 +59,7 @@ struct u_upload_mgr *
>  u_upload_create(struct pipe_context *pipe, unsigned default_size,
>  unsigned bind, enum pipe_resource_usage usage)
>  {
> -   struct u_upload_mgr *upload = CALLOC_STRUCT( u_upload_mgr );
> +   struct u_upload_mgr *upload = CALLOC_STRUCT(u_upload_mgr);
> if (!upload)
>return NULL;
>
> @@ -104,7 +104,8 @@ u_upload_clone(struct pipe_context *pipe, struct 
> u_upload_mgr *upload)
>upload->usage);
>  }
>
> -static void upload_unmap_internal(struct u_upload_mgr *upload, boolean 
> destroying)
> +static void
> +upload_unmap_internal(struct u_upload_mgr *upload, boolean destroying)
>  {
> if (!destroying && upload->map_persistent)
>return;
> @@ -124,30 +125,32 @@ static void upload_unmap_internal(struct u_upload_mgr 
> *upload, boolean destroyin
>  }
>
>
> -void u_upload_unmap( struct u_upload_mgr *upload )
> +void
> +u_upload_unmap(struct u_upload_mgr *upload)
>  {
> upload_unmap_internal(upload, FALSE);
>  }
>
>
> -static void u_upload_release_buffer(struct u_upload_mgr *upload)
> +static void
> +u_upload_release_buffer(struct u_upload_mgr *upload)
>  {
> /* Unmap and unreference the upload buffer. */
> upload_unmap_internal(upload, TRUE);
> -   pipe_resource_reference( &upload->buffer, NULL );
> +   pipe_resource_reference(&upload->buffer, NULL);
>  }
>
>
> -void u_upload_destroy( struct u_upload_mgr *upload )
> +void
> +u_upload_destroy(struct u_upload_mgr *upload)
>  {
> -   u_upload_release_buffer( upload );
> -   FREE( upload );
> +   u_upload_release_buffer(upload);
> +   FREE(upload);
>  }
>
>
>  static void
> -u_upload_alloc_buffer(struct u_upload_mgr *upload,
> -  unsigned min_size)
> +u_upload_alloc_buffer(struct u_upload_mgr *upload, unsigned min_size)
>  {
> struct pipe_screen *screen = upload->pipe->screen;
> struct pipe_resource buffer;
> @@ -155,9 +158,9 @@ u_upload_alloc_buffer(struct u_upload_mgr *upload,
>
> /* Release the old buffer, if present:
>  */
> -   u_upload_release_buffer( upload );
> +   u_upload_release_buffer(upload);
>
> -   /* Allocate a new one:
> +   /* Allocate a new one:
>  */
> size = align(MAX2(upload->default_size, min_size), 4096);
>
> @@ -232,7 +235,7 @@ u_upload_alloc(struct u_upload_mgr *upload,
>offset,
>buffer_size - offset,
>upload->map_flags,
> - &upload->transfer);
> +  &upload->transfer);
>if (unlikely(!upload->map)) {
>   upload->transfer = NULL;
>   *out_offset = ~0;
> @@ -256,13 +259,14 @@ u_upload_alloc(struct u_upload_mgr *upload,
> upload->offset = offset + size;
>  }
>
> -void u_upload_data(struct u_upload_mgr *upload,
> -   unsigned min_out_offset,
> -   unsigned size,
> -   unsigned alignment,
> -   const void *data,
> -

[Mesa-dev] [RFC 0/2] nvc0: Fix non-integer counters in AMD_performance_monitor

2017-06-11 Thread Boyan Ding

Some performance counters in nouveau use non-integer types in AMD_perfmon,
but they are currently returning int values. One reason behind this is
that gallium hud, which is one of the users of the counters, only supports
integers. This series tries to fix the problem in both parts: making nouveau
return values of appropriate types and teaching hud to be aware of types.
Although hud is still not clever enough to handle floating point values, it
is becoming no worse.

Note that this series is highly RFC. I'm posting this to solicit ideas,
both on whether this approach is appropriate and details in handling --
the current code looks somewhat ugly. There are also further problems
that I look to solve, namely some performance counters, such as ipc,
which should be floats instead of ints. But I want to get basics correct
first.
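
To make the idea concrete, here is a minimal sketch of carrying a typed value
through a uint64_t return, as the calc_result functions in patch 2/2 do. The
union, enum and function names below are hypothetical stand-ins for
illustration, not the gallium definitions, and the sketch assumes the consumer
knows the result type from elsewhere.

```c
#include <stdint.h>

/* Hypothetical stand-in for a typed query result; NOT the gallium union. */
union query_result {
   uint64_t u64;
   float f;
};

/* Hypothetical tag telling the consumer how to interpret the raw bits. */
enum result_type { RESULT_U64, RESULT_FLOAT };

/* Producer: compute a percentage as a float and hand it back as raw bits. */
static uint64_t
calc_percentage(uint64_t num, uint64_t den)
{
   union query_result r;

   r.u64 = 0;
   if (den)
      r.f = (float)((num / (double)den) * 100);
   return r.u64;
}

/* Consumer: reinterpret the raw bits according to the advertised type. */
static double
result_as_double(uint64_t raw, enum result_type type)
{
   union query_result r;

   r.u64 = raw;
   return type == RESULT_FLOAT ? (double)r.f : (double)r.u64;
}
```

With this, result_as_double(calc_percentage(1, 4), RESULT_FLOAT) yields 25.0,
whereas reading the same bits as RESULT_U64 would give a meaningless integer,
which is why the consumer has to be aware of the type.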

Boyan Ding (2):
  hud: Handle query values according to their type
  nvc0: Return value of appropriate type instead of u64

 src/gallium/auxiliary/hud/hud_driver_query.c   | 22 ++-
 .../drivers/nouveau/nvc0/nvc0_query_hw_metric.c| 70 +-
 2 files changed, 61 insertions(+), 31 deletions(-)

-- 
2.13.1


[Mesa-dev] [RFC 2/2] nvc0: Return value of appropriate type instead of u64

2017-06-11 Thread Boyan Ding

Signed-off-by: Boyan Ding 
---
 .../drivers/nouveau/nvc0/nvc0_query_hw_metric.c| 70 +-
 1 file changed, 42 insertions(+), 28 deletions(-)

diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_metric.c b/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_metric.c
index 089af61820..6d4deaf2ba 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_metric.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_metric.c
@@ -498,53 +498,59 @@ nvc0_hw_metric_end_query(struct nvc0_context *nvc0, struct nvc0_hw_query *hq)
 static uint64_t
 sm20_hw_metric_calc_result(struct nvc0_hw_query *hq, uint64_t res64[8])
 {
+   union pipe_query_result result;
+
+   result.u64 = 0;
switch (hq->base.type - NVC0_HW_METRIC_QUERY(0)) {
case NVC0_HW_METRIC_QUERY_ACHIEVED_OCCUPANCY:
   /* ((active_warps / active_cycles) / max. number of warps on a MP) * 100 */
   if (res64[1])
- return ((res64[0] / (double)res64[1]) / 48) * 100;
+ result.f = ((res64[0] / (double)res64[1]) / 48) * 100;
   break;
case NVC0_HW_METRIC_QUERY_BRANCH_EFFICIENCY:
   /* (branch / (branch + divergent_branch)) * 100 */
   if (res64[0] + res64[1])
- return (res64[0] / (double)(res64[0] + res64[1])) * 100;
+ result.f = (res64[0] / (double)(res64[0] + res64[1])) * 100;
   break;
case NVC0_HW_METRIC_QUERY_INST_PER_WRAP:
   /* inst_executed / warps_launched */
   if (res64[1])
- return res64[0] / (double)res64[1];
+ result.u64 = res64[0] / (double)res64[1];
   break;
case NVC0_HW_METRIC_QUERY_INST_REPLAY_OVERHEAD:
   /* (inst_issued - inst_executed) / inst_executed */
   if (res64[1])
- return (res64[0] - res64[1]) / (double)res64[1];
+ result.u64 = (res64[0] - res64[1]) / (double)res64[1];
   break;
case NVC0_HW_METRIC_QUERY_ISSUED_IPC:
   /* inst_issued / active_cycles */
   if (res64[1])
- return res64[0] / (double)res64[1];
+ result.u64 = res64[0] / (double)res64[1];
   break;
case NVC0_HW_METRIC_QUERY_ISSUE_SLOT_UTILIZATION:
   /* ((inst_issued / 2) / active_cycles) * 100 */
   if (res64[1])
- return ((res64[0] / 2) / (double)res64[1]) * 100;
+ result.f = ((res64[0] / 2) / (double)res64[1]) * 100;
   break;
case NVC0_HW_METRIC_QUERY_IPC:
   /* inst_executed / active_cycles */
   if (res64[1])
- return res64[0] / (double)res64[1];
+ result.u64 = res64[0] / (double)res64[1];
   break;
default:
   debug_printf("invalid metric type: %d\n",
hq->base.type - NVC0_HW_METRIC_QUERY(0));
   break;
}
-   return 0;
+   return result.u64;
 }
 
 static uint64_t
 sm21_hw_metric_calc_result(struct nvc0_hw_query *hq, uint64_t res64[8])
 {
+   union pipe_query_result result;
+
+   result.u64 = 0;
switch (hq->base.type - NVC0_HW_METRIC_QUERY(0)) {
case NVC0_HW_METRIC_QUERY_ACHIEVED_OCCUPANCY:
   return sm20_hw_metric_calc_result(hq, res64);
@@ -552,31 +558,31 @@ sm21_hw_metric_calc_result(struct nvc0_hw_query *hq, uint64_t res64[8])
   return sm20_hw_metric_calc_result(hq, res64);
case NVC0_HW_METRIC_QUERY_INST_ISSUED:
   /* issued1_0 + issued1_1 + (issued2_0 + issued2_1) * 2 */
-  return res64[0] + res64[1] + (res64[2] + res64[3]) * 2;
+  result.u64 = res64[0] + res64[1] + (res64[2] + res64[3]) * 2;
   break;
case NVC0_HW_METRIC_QUERY_INST_PER_WRAP:
   return sm20_hw_metric_calc_result(hq, res64);
case NVC0_HW_METRIC_QUERY_INST_REPLAY_OVERHEAD:
   /* (metric-inst_issued - inst_executed) / inst_executed */
   if (res64[4])
- return (((res64[0] + res64[1] + (res64[2] + res64[3]) * 2) -
-   res64[4]) / (double)res64[4]);
+ result.u64 = (((res64[0] + res64[1] + (res64[2] + res64[3]) * 2) -
+ res64[4]) / (double)res64[4]);
   break;
case NVC0_HW_METRIC_QUERY_ISSUED_IPC:
   /* metric-inst_issued / active_cycles */
   if (res64[4])
- return (res64[0] + res64[1] + (res64[2] + res64[3]) * 2) /
-(double)res64[4];
+ result.u64 = (res64[0] + res64[1] + (res64[2] + res64[3]) * 2) /
+  (double)res64[4];
   break;
case NVC0_HW_METRIC_QUERY_ISSUE_SLOTS:
   /* issued1_0 + issued1_1 + issued2_0 + issued2_1 */
-  return res64[0] + res64[1] + res64[2] + res64[3];
+  result.u64 = res64[0] + res64[1] + res64[2] + res64[3];
   break;
case NVC0_HW_METRIC_QUERY_ISSUE_SLOT_UTILIZATION:
   /* ((metric-issue_slots / 2) / active_cycles) * 100 */
   if (res64[4])
- return (((res64[0] + res64[1] + res64[2] + res64[3]) / 2) /
- (double)res64[4]) * 100;
+ result.f =  (((res64[0] + res64[1] + res64[2] + res64[3]) / 2) /
+  (double)res64[4]) * 100;
   break;
case NVC0_HW_METRIC_QUERY_IPC:
   return sm20_hw_metric_calc_result(hq, res64);
@@ -585,

[Mesa-dev] [Bug 101334] Any vulkan app seems to freeze the system

https://bugs.freedesktop.org/show_bug.cgi?id=101334

--- Comment #11 from John  ---
Created attachment 131874
  --> https://bugs.freedesktop.org/attachment.cgi?id=131874&action=edit
gdb backtrace

Well, I've been able to get a backtrace thanks to screen.

That looks more interesting already.


[Mesa-dev] [Bug 101334] Any vulkan app seems to freeze the system

https://bugs.freedesktop.org/show_bug.cgi?id=101334

--- Comment #10 from John  ---
I'm not sure if it's thanks to debug, but now I get something in dmesg, not
that helpful I'm afraid:

[  141.325269] raytracing[2417]: segfault at 8 ip 7fd0b21e74d2 sp 7ffc604d5520 error 4 in libvulkan_radeon.so[7fd0b217+1b3000]


The trace file has been empty the various times I've tried. Is there a way to
get a full trace of everything it's doing? Maybe that would allow the last line
or so to be useful.

As for gdb, it gets stuck on "attaching to process" and the process command in
ps is displayed in square brackets.


[Mesa-dev] [Bug 101334] Any vulkan app seems to freeze the system

https://bugs.freedesktop.org/show_bug.cgi?id=101334

--- Comment #9 from Grazvydas Ignotas  ---
Is the process still alive when you ssh to the system with a hung GPU? If it
is, you could attach gdb and try to get a backtrace of a hung thread.

You can try at least a few other things:
* compile mesa with --enable-debug if you aren't already; it will enable
asserts that might detect something bad
* set a RADV_TRACE_FILE=/path/to/file environment variable; it will then try to
write out a trace of GPU commands to that file if/when it detects a hang.

The trace file sometimes takes a few tries to produce successfully, but if you
can get it, it might help to find the cause of the hang.


[Mesa-dev] [Bug 101378] interpolateAtSample check for input parameter is too strict

https://bugs.freedesktop.org/show_bug.cgi?id=101378

--- Comment #1 from freedesk...@ca.sh13.net ---
Sorry, copy/paste error, the first two lines are from another unrelated error.
The correct error message is just this part:

0:18(33): error: parameter `interpolant` must be a shader input
0:18(6): error: no matching function for call to `length(error)';
candidates are:
0:18(6): error:float length(float)
0:18(6): error:float length(vec2)
0:18(6): error:float length(vec3)
0:18(6): error:float length(vec4)
0:18(6): error:double length(double)
0:18(6): error:double length(dvec2)
0:18(6): error:double length(dvec3)
0:18(6): error:double length(dvec4)
0:18(6): error: operands to relational operators must be scalar and numeric
0:18(6): error: if-statement condition must be scalar boolean


[Mesa-dev] [Bug 101378] interpolateAtSample check for input parameter is too strict

https://bugs.freedesktop.org/show_bug.cgi?id=101378

Bug ID: 101378
   Summary: interpolateAtSample check for input parameter is too
strict
   Product: Mesa
   Version: 17.0
  Hardware: x86-64 (AMD64)
OS: Linux (All)
Status: NEW
  Severity: normal
  Priority: medium
 Component: glsl-compiler
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: freedesk...@ca.sh13.net
QA Contact: intel-3d-b...@lists.freedesktop.org

The following code snippet fails on Mesa 17.0.3 with the following error:

in GeometryToPixel
{
    vec3 color;
    vec2 uv;
} gs2ps;

out vec4 Color;

void main ()
{
    Color = vec4(gs2ps.color, 1);

    if (length(interpolateAtSample(gs2ps.uv, gl_SampleID))>1) {
        discard;
    }
}

This is the error:

0:10(1): error: if a fragment input is (or contains) an integer, then it
must be qualified with 'flat'
0:10(8): error: `gl_SampleID' redeclared
0:18(33): error: parameter `interpolant` must be a shader input
0:18(6): error: no matching function for call to `length(error)';
candidates are:
0:18(6): error:float length(float)
0:18(6): error:float length(vec2)
0:18(6): error:float length(vec3)
0:18(6): error:float length(vec4)
0:18(6): error:double length(double)
0:18(6): error:double length(dvec2)
0:18(6): error:double length(dvec3)
0:18(6): error:double length(dvec4)
0:18(6): error: operands to relational operators must be scalar and numeric
0:18(6): error: if-statement condition must be scalar boolean

Changing the shader to use:

in  vec3 icolor;
in  vec2 iuv;

fixes the problem. This seems to be related to
https://patchwork.freedesktop.org/patch/15298/, but the input variable check is
too strict. It should work with an input block as well, which is for instance
used by the G-Truc samples:
https://github.com/g-truc/ogl-samples/blob/69499c23b9566ac432cc2af33cde6646271c/data/gl-400/fbo-multisample-interpolate.frag

Code in question works fine on AMD desktop drivers.


Re: [Mesa-dev] software implementation of vulkan for gsoc/evoc

2017-06-11 Thread Jose Fonseca

On 11/06/17 07:59, Jacob Lifshay wrote:
On Sat, Jun 10, 2017 at 3:25 PM Jose Fonseca wrote:

I don't see how to effectively tack triangle setup into the vertex
shader: the vertex shader applies to vertices, whereas triangle setup and
binning apply to primitives. Usually, each vertex gets transformed
only once with llvmpipe, no matter how many triangles refer to that vertex.
The only way to tack triangle setup into vertex shading would be if
you processed vertices a primitive at a time. Of course one could put
an if-statement to skip reprocessing a vertex that already was
processed, but then you have race conditions, and no benefit of
inlining.

I was mostly thinking of non-indexed vertices.

And I'm afraid that tacking rasterization on too is one of those things that
sound great on paper but are quite bad in practice. And I speak from
experience: in fact llvmpipe had the last step of rasterization bolted
on the fragment shaders for some time. But we took it out because it
was _slower_.

The issue is that if you bolt it on to the shader body, you either:

- inline in the shader body the code for the maximum number of planes
(which is 7: 3 sides of the triangle, plus 4 sides of a scissor rect), and
waste CPU cycles going through all of those tests, even when most of the
time many of those tests aren't needed

- or you generate if/for blocks for each plane, so you only do the
needed tests, but then you have branch prediction issues...

Whereas if you keep rasterization _outside_ the shader you can have
specialized functions to do the rasterization based on the primitive
itself (if the triangle is fully inside the scissor, you need 3 planes; if
the stamp is fully inside the triangle, you need zero). Essentially you
can "compose" by coupling two function calls: you call a rasterization
function that's specialized for the primitive, then a shading function
that's specialized for the state (but does not depend on the primitive).

It makes sense: rasterization needs to be specialized for the primitive,
not the graphics state, whereas the shader needs to be specialized for
the state.
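
As a rough illustration of that composition (not llvmpipe code; every type and
function name here is hypothetical), the rasterizer can be picked per
primitive from the number of planes that still need testing, while the shading
function is picked per state and passed in as a pointer:

```c
#include <stddef.h>

/* Hypothetical edge equation: a pixel is inside when a*x + b*y + c >= 0. */
struct plane { int a, b, c; };

/* Shading function, specialized for the state only. */
typedef void (*shade_fn)(int x, int y, void *state);

/* Rasterization function, specialized for the primitive only. */
typedef unsigned (*raster_fn)(const struct plane *planes, unsigned nr_planes,
                              int x0, int y0, int size,
                              shade_fn shade, void *state);

/* Zero planes: the block is fully inside, so shade every pixel untested. */
static unsigned
raster_full_block(const struct plane *planes, unsigned nr_planes,
                  int x0, int y0, int size, shade_fn shade, void *state)
{
   (void)planes;
   (void)nr_planes;
   for (int y = y0; y < y0 + size; y++)
      for (int x = x0; x < x0 + size; x++)
         shade(x, y, state);
   return (unsigned)(size * size);
}

/* General case: test each pixel against only the planes that are needed. */
static unsigned
raster_with_planes(const struct plane *planes, unsigned nr_planes,
                   int x0, int y0, int size, shade_fn shade, void *state)
{
   unsigned covered = 0;

   for (int y = y0; y < y0 + size; y++) {
      for (int x = x0; x < x0 + size; x++) {
         unsigned i;

         for (i = 0; i < nr_planes; i++)
            if (planes[i].a * x + planes[i].b * y + planes[i].c < 0)
               break;
         if (i == nr_planes) {
            shade(x, y, state);
            covered++;
         }
      }
   }
   return covered;
}

/* Chosen once per primitive, independent of the shader variant. */
static raster_fn
pick_rasterizer(unsigned nr_planes)
{
   return nr_planes == 0 ? raster_full_block : raster_with_planes;
}

/* Trivial stand-in "shader" that just counts the pixels it is asked to shade. */
static void
count_pixels(int x, int y, void *state)
{
   (void)x;
   (void)y;
   ++*(unsigned *)state;
}
```

For a 4x4 block, pick_rasterizer(0) shades all 16 pixels without a single
test, while a single plane such as {1, 0, -2} (x >= 2) leaves 8 of them
covered. The shader variant is the same in both cases, so no per-primitive
shader recompilation is involved.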

I am planning on generating a function for each primitive type and state
combination, or I can convert all primitives into triangles and just
have a function for each state. The state includes stuff like if a
particular clipping/scissor equation needs to be checked. I did it that
way in my proof-of-concept code by using c++ templates to do the code
duplication:
https://github.com/programmerjake/tiled-renderer/blob/47e09f5d711803b8e899c3669fbeae3e62c9e32c/main.cpp#L366

I'm not sure the benefits of inlining will be enough to compensate for the
time spent compiling 2**7 variants of each shader to cope with all
possible incoming triangles.

And this is just one of those non-intuitive things that's not obvious
until one actually does a lot of profiling and a lot of experimentation.
And trust me, a lot of time was spent fine-tuning this for llvmpipe (not
by me -- most of rasterization was done by Keith Whitwell). And by
throwing llvmpipe out of the window and starting a new software
renderer from scratch you'd be just subscribing to do it all over
again.

Whereas if instead of starting from scratch, you take llvmpipe, and you
rewrite/replace one component at a time, you can reach exactly the same
destination you want to reach, however you'll have something working
every step of the way, so when you take a bad step, you can measure
performance impact, and readjust. Plus if you run out of time, you have
something useful -- not yet another half finished project, which quickly
will rot away.

In the case that the project is not finished this summer, I'm still
planning on working on it, just at a reduced rate. If all else fails, we
will at least have an up-to-date spir-v to llvm converter that handles
the glsl spir-v extensions.

Regarding generating spir-v -> scalar llvm and then doing whole-function
vectorization, I don't think it's a bad idea per se. If I were writing
llvmpipe from scratch today I'd do something like that. Especially
because (scalar) LLVM IR is so pervasive in the graphics ecosystem
anyway.

It was only after I had tgsi -> llvm ir all done that I stumbled into
http://compilers.cs.uni-saarland.de/projects/wfv/ .

I think the important thing here is that, once you've vectorized the
shader, and you've converted your "texture_sample" to
"texture_sample.vector8", and your "output_merger" intrinsics to
"output_merger.vector8", or your log2/exp2, you then slot in the fine-tuned
llvmpipe code for texture sampling and blending and math, as that's where
your bottlenecks tend to be. Because if you plan to write all texture
sampling from scratch then you need a time/clone machine to

[Mesa-dev] [Bug 101374] Worms Clan Wars hangs on loading screen