Re: [Mesa-dev] [PATCH] intel/compiler: update validator to account for half-float exec type promotion

2019-02-03 Thread Iago Toral
On Fri, 2019-02-01 at 11:23 -0800, Francisco Jerez wrote:
> Iago Toral  writes:
> 
> > On Fri, 2019-01-25 at 12:54 -0800, Francisco Jerez wrote:
> > > Iago Toral  writes:
> > > 
> > > > On Thu, 2019-01-24 at 11:45 -0800, Francisco Jerez wrote:
> > > > > Iago Toral  writes:
> > > > > 
> > > > > > On Wed, 2019-01-23 at 06:03 -0800, Francisco Jerez wrote:
> > > > > > > Iago Toral Quiroga  writes:
> > > > > > > 
> > > > > > > > Commit c84ec70b3a72 implemented execution type promotion to
> > > > > > > > 32-bit for conversions involving half-float registers, which
> > > > > > > > empirical testing suggested was required, but it did not
> > > > > > > > incorporate this change into the assembly validator logic.
> > > > > > > > This commit adds that, preventing validation errors like
> > > > > > > > this:
> > > > > > > > 
> > > > > > > 
> > > > > > > I don't think we should be validating empirical assumptions
> > > > > > > in the EU validator.
> > > > > > 
> > > > > > I am not sure I get your point, isn't c84ec70b3a72 also based on
> > > > > > empirical testing after all?
> > > > > > 
> > > > > 
> > > > > To some extent, but it doesn't attempt to enforce ISA restrictions
> > > > > based on information obtained empirically.
> > > > > 
> > > > > > 
> > > > > > > > mov(16)  g9<4>B   g3<16,8,2>HF { align1 1H };
> > > > > > > > ERROR: Destination stride must be equal to the ratio of the
> > > > > > > >        sizes of the execution data type to the destination type
> > > > > > > > 
> > > > > > > > Fixes: c84ec70b3a72 "intel/fs: Promote execution type to
> > > > > > > > 32-bit when any half-float conversion is needed."
> > > > > > > 
> > > > > > > I don't think this "fixes" anything that ever worked.
> > > > > > 
> > > > > > It is true that the code in that trace above is not something we
> > > > > > can produce right now, because it is a conversion from HF to B
> > > > > > and that should only happen within the context of
> > > > > > VK_KHR_shader_float16_int8. However, this is a consequence of
> > > > > > the fact that since c84ec70b3a72 there is an inconsistency
> > > > > > between what we do at the IR level regarding execution size of
> > > > > > HF conversions and what the EU validator is doing. From that
> > > > > > perspective this is really fixing an inconsistency that didn't
> > > > > > exist before, and I thought we would want to address that sooner
> > > > > > rather than later and track it down to the original change that
> > > > > > introduced that inconsistency so we know where this is coming
> > > > > > from.
> > > > > > 
> > > > > 
> > > > > The "inconsistency" between the IR's get_exec_type() and the EU
> > > > > validator's execution_type() has existed ever since
> > > > > a05b6f25bf4bfad7 removed the HF assert from get_exec_type()
> > > > > without actually implementing the code required to handle HF
> > > > > operands (which is what my commit c84ec70b3a72 did).
> > > > 
> > > > I agree with the fact that since a05b6f25bf4bfad7 the validator could
> > > > reject valid code and that had nothing to do with your patch,
> > > 
> > > The validator has rejected the same valid HF code since it was written;
> > > that had nothing to do with either a05b6f25bf4bfad7 or my patch, and
> > > it is the real problem this patch was working around.
> > > 
> > > > but the inconsistency I am talking about here, that this patch fixes,
> > > > is the one about get_exec_type() in the IR and execution_type() in
> > > > the validator doing different things for HF instructions, which only
> > > > exists since your patch and which you discuss below.
> > > > 
> > > 
> > > The "inconsistency" exists ever since get_exec_type() was introduced
> > > without correct handling of HF types (even though execution_type()
> > > already attempted to handle it).  And I disagree that it's a real
> > > inconsistency except due to the fact that the validator is incorrectly
> > > attempting to validate the alignment of the destination region
> > > according to a rule that doesn't apply to HF types.
> > > 
> > > > > > Anyway, that was my rationale for the Fixes tag, but if you think
> > > > > > this is not useful I am happy to drop this patch and just include
> > > > > > it as part of my series without the tag.
> > > > > > 

Re: [Mesa-dev] [PATCH 1/2] panfrost: Initial stub for Panfrost driver

2019-02-03 Thread Alyssa Rosenzweig
> You should just land it and start doing in-tree development!

I don't have push access, you know :P
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] nvc0: fix 3d images on kepler

2019-02-03 Thread Ilia Mirkin
Looks like SUBFM.3D and SUEAU are perfectly capable of dealing with 3d
tiling, they just need the correct inputs. Supply them.

We also have to deal with the case where a 2d "layer" of a 3d image is
bound. In this case, we supply the z coordinate separately to the
shader, which has to optionally treat every 2d case as if it could be a
slice of a 3d texture.

Signed-off-by: Ilia Mirkin 
---
 .../nouveau/codegen/nv50_ir_lowering_nvc0.cpp | 45 +++
 src/gallium/drivers/nouveau/nvc0/nvc0_tex.c   | 23 --
 2 files changed, 34 insertions(+), 34 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
index 9e87c97b0f4..291677cfbbb 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
@@ -1920,7 +1920,8 @@ NVC0LoweringPass::processSurfaceCoordsNVE4(TexInstruction 
*su)
   su->op == OP_SULDB || su->op == OP_SUSTB || su->op == OP_SUREDB;
const int slot = su->tex.r;
const int dim = su->tex.target.getDim();
-   const int arg = dim + (su->tex.target.isArray() || su->tex.target.isCube());
+   const bool array = su->tex.target.isArray() || su->tex.target.isCube();
+   const int arg = dim + array;
int c;
Value *zero = bld.mkImm(0);
Value *p1 = NULL;
@@ -1929,6 +1930,7 @@ NVC0LoweringPass::processSurfaceCoordsNVE4(TexInstruction 
*su)
Value *bf, *eau, *off;
Value *addr, *pred;
Value *ind = su->getIndirectR();
+   Value *y, *z;
 
off = bld.getScratch(4);
bf = bld.getScratch(4);
@@ -1959,34 +1961,43 @@ 
NVC0LoweringPass::processSurfaceCoordsNVE4(TexInstruction *su)
for (; c < 3; ++c)
   src[c] = zero;
 
+   if (dim == 2 && !array) {
+  v = loadSuInfo32(ind, slot, NVC0_SU_INFO_UNK1C, su->tex.bindless);
+  src[2] = bld.mkOp2v(OP_SHR, TYPE_U32, bld.getSSA(),
+  v, bld.loadImm(NULL, 16));
+
+  v = loadSuInfo32(ind, slot, NVC0_SU_INFO_DIM(2), su->tex.bindless);
+  bld.mkOp3(OP_SUCLAMP, TYPE_S32, src[2], src[2], v, zero)
+ ->subOp = NV50_IR_SUBOP_SUCLAMP_SD(0, 2);
+   }
+
// set predicate output
if (su->tex.target == TEX_TARGET_BUFFER) {
   src[0]->getInsn()->setFlagsDef(1, pred);
} else
-   if (su->tex.target.isArray() || su->tex.target.isCube()) {
+   if (array) {
   p1 = bld.getSSA(1, FILE_PREDICATE);
   src[dim]->getInsn()->setFlagsDef(1, p1);
}
 
// calculate pixel offset
if (dim == 1) {
+  y = z = zero;
   if (su->tex.target != TEX_TARGET_BUFFER)
  bld.mkOp2(OP_AND, TYPE_U32, off, src[0], bld.loadImm(NULL, 0x));
-   } else
-   if (dim == 3) {
+   } else {
+  y = src[1];
+  z = src[2];
+
   v = loadSuInfo32(ind, slot, NVC0_SU_INFO_UNK1C, su->tex.bindless);
+  bld.mkOp2(OP_AND, TYPE_U32, v, v, bld.loadImm(NULL, 0x));
   bld.mkOp3(OP_MADSP, TYPE_U32, off, src[2], v, src[1])
  ->subOp = NV50_IR_SUBOP_MADSP(4,2,8); // u16l u16l u16l
 
   v = loadSuInfo32(ind, slot, NVC0_SU_INFO_PITCH, su->tex.bindless);
   bld.mkOp3(OP_MADSP, TYPE_U32, off, off, v, src[0])
- ->subOp = NV50_IR_SUBOP_MADSP(0,2,8); // u32 u16l u16l
-   } else {
-  assert(dim == 2);
-  v = loadSuInfo32(ind, slot, NVC0_SU_INFO_PITCH, su->tex.bindless);
-  bld.mkOp3(OP_MADSP, TYPE_U32, off, src[1], v, src[0])
- ->subOp = (su->tex.target.isArray() || su->tex.target.isCube()) ?
- NV50_IR_SUBOP_MADSP_SD : NV50_IR_SUBOP_MADSP(4,2,8); // u16l u16l u16l
+ ->subOp = array ?
+ NV50_IR_SUBOP_MADSP_SD : NV50_IR_SUBOP_MADSP(0,2,8); // u32 u16l u16l
}
 
// calculate effective address part 1
@@ -1999,19 +2010,15 @@ 
NVC0LoweringPass::processSurfaceCoordsNVE4(TexInstruction *su)
 ->subOp = NV50_IR_SUBOP_V1(7,6,8|2);
   }
} else {
-  Value *y = src[1];
-  Value *z = src[2];
   uint16_t subOp = 0;
 
   switch (dim) {
   case 1:
- y = zero;
- z = zero;
  break;
   case 2:
- z = off;
- if (!su->tex.target.isArray() && !su->tex.target.isCube()) {
-z = loadSuInfo32(ind, slot, NVC0_SU_INFO_UNK1C, su->tex.bindless);
+ if (array) {
+z = off;
+ } else {
 subOp = NV50_IR_SUBOP_SUBFM_3D;
  }
  break;
@@ -2034,7 +2041,7 @@ NVC0LoweringPass::processSurfaceCoordsNVE4(TexInstruction 
*su)
   eau = bld.mkOp3v(OP_SUEAU, TYPE_U32, bld.getScratch(4), off, bf, v);
}
// add array layer offset
-   if (su->tex.target.isArray() || su->tex.target.isCube()) {
+   if (array) {
   v = loadSuInfo32(ind, slot, NVC0_SU_INFO_ARRAY, su->tex.bindless);
   if (dim == 1)
  bld.mkOp3(OP_MADSP, TYPE_U32, eau, src[1], v, eau)
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_tex.c 
b/src/gallium/drivers/nouveau/nvc0/nvc0_tex.c
index 1bf4f56de83..a301dc5d610 100644
--- 

Re: [Mesa-dev] [PATCH 1/2] panfrost: Initial stub for Panfrost driver

2019-02-03 Thread Jason Ekstrand
On Sun, Feb 3, 2019 at 6:18 PM Alyssa Rosenzweig 
wrote:

> > Small comment, you should plan on single build for all supported
> > generations.. I'm not entirely sure if this same header is eventually
> > planned to be #include'd from different C code w/ different defines
> > for gpu gen (afaict you just currently hard-code it at the top of this
> > header)..  But distro's will be unhappy if it comes to different mesa
> > builds for 8xx vs 6xx ;-)
>
> Yeah, that's a moderately high-prio item, only blocking on other things
> being more interesting ;)
>
> > Also, I guess for your sanity at some point you'll want to autogen
> > cmdstream encoding and decoding from a single source.  I get the
> > impression that envytools isn't the right thing for the bitpacked
> > format for mali cmdstream.  Maybe the intel thing is better?  But I
> > didn't get very far w/ a2xx r/e before I realized that keeping hand
> > coded decoding and encoding in sync sucked.
>
> Sure, but.. autogen from.. what? As you note, Mali's "cmdstream" is
> wacky and doesn't line up with the model assumed by most of these tools.
>
> Also, if I have to read/write XML, I might lose my sanity faster ;P
>
> > Anyways, totally fine w/ those details getting worked out in-tree,
> > after merging.
> >
> > Acked-by: Rob Clark 
>
> Thank you!
>
> (Is there any particular ack we're waiting for for pushing?)
>

I don't think so.  You've gotten enough buy-in from the community so far
that, if your build system and core (if any) changes aren't going to break
anything, you should just land it and start doing in-tree development!
Right after the 19.0 branch point (which was Wednesday) is a great time to
do it too because any small breakages won't cause any release problems.

--Jason


Re: [Mesa-dev] [PATCH 2/2] panfrost: Implement Midgard shader toolchain

2019-02-03 Thread Alyssa Rosenzweig
> Looks like you leak the constants?  You could pass ctx->ssa_constants
> instead of NULL and the allocation would be automatically freed.

Hm, alright. Is there documentation anywhere on how memctx works in
general?

> Instead of hardcoding 4096, use impl->ssa_index?

Good catch, thank you.

> Looks like instead of encoding components here you could use
> nir_op_info->num_inputs?

Components at this point is a misnomer; it's really "encoding type". The
correct solution, now that I have the infrastructure for it, is to use a
combination of nir_op_info and instruction quirks, and get rid of the
magic numbers here. Bumping up priority list for next time I dive into
the compiler.

> > +nir_foreach_variable(var, >uniforms) {
> > +if (glsl_get_base_type(var->type) == GLSL_TYPE_SAMPLER) 
> > continue;
> > +
> > +unsigned length = glsl_get_aoa_size(var->type);
> > +
> > +if (!length) {
> > +length = glsl_get_length(var->type);
> > +}
> > +
> > +if (!length) {
> > +length = glsl_get_matrix_columns(var->type);
> > +}
> 
> This seems suspicious -- I don't have anything like this for my uniforms.

Suspicious indeed... what is the correct way to map, then, without
allocating a uniform for samplers and other not-real-uniform-uniforms?
The hardware just wants a vec4 index; NIR mirrors the GLSL; poof?

I think I had troubles there, but I can't recall exactly.

> Using info.outputs_written might be nicer here.

Mayhaps... I have to transform order anyway, or establish a generic
interface for communicating order back to the cmdstream bits and resolve
it dynamically there. OTOH, maybe that's the right way to go anyway;
a lot of this code grew "organically" and the details of varying
descriptors were only understood recently, long after the first batch of
that was written... I suppose this could be a good refactor.

> I'm skeptical that this many lower_var_copies() is needed :)

^_^

I gotta make sure they're _really_ lowered! ;)

> I need to steal your isign.

Bon appétit.
> > +(('fge', a, b), ('flt', b, a)),
> > +
> > +# XXX: We have hw ops for this, just unknown atm..
> > +#(('fsign@32', a), ('i2f32@32', ('isign', ('f2i32@32', ('fmul', a, 
> > 0x4380)
> > +#(('fsign', a), ('fcsel', ('fge', a, 0), 1.0, ('fcsel', ('flt', a, 
> > 0.0), -1.0, 0.0)))
> > +(('fsign', a), ('bcsel', ('fge', a, 0), 1.0, -1.0)),
> 
> Looks like your fsign never returns 0.0 like it should?

Indeed it does not. I should maybe figure out what "hw ops" I was
referring to; less risk of bugs that way, I suppose.
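
To make the zero case concrete, here is a small C sketch comparing a
three-way sign function with the two-way select from the quoted rule. Both
function names are made up for illustration, and NaN inputs are not
considered.

```c
/* Three-way sign: -1.0, 0.0 or 1.0 (NaN handling is not considered). */
static float
fsign_reference(float a)
{
   if (a > 0.0f)
      return 1.0f;
   if (a < 0.0f)
      return -1.0f;
   return 0.0f;
}

/* Mirrors ('bcsel', ('fge', a, 0), 1.0, -1.0): only two possible results. */
static float
fsign_bcsel(float a)
{
   return a >= 0.0f ? 1.0f : -1.0f;
}
```

The two agree for every non-zero input, but fsign_bcsel(0.0f) yields 1.0
where fsign_reference(0.0f) yields 0.0.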

> All of this is suggestions for future work.  I'm mostly glad to see the
> driver coming into the tree at last.  Both patches are:
> 
> Acked-by: Eric Anholt 

Thank you! As I mentioned in the other email (to Rob), is there anything
particular blocking a push into master?



Re: [Mesa-dev] [PATCH 1/2] panfrost: Initial stub for Panfrost driver

2019-02-03 Thread Alyssa Rosenzweig
> Small comment, you should plan on single build for all supported
> generations.. I'm not entirely sure if this same header is eventually
> planned to be #include'd from different C code w/ different defines
> for gpu gen (afaict you just currently hard-code it at the top of this
> header)..  But distro's will be unhappy if it comes to different mesa
> builds for 8xx vs 6xx ;-)

Yeah, that's a moderately high-prio item, only blocking on other things
being more interesting ;)

> Also, I guess for your sanity at some point you'll want to autogen
> cmdstream encoding and decoding from a single source.  I get the
> impression that envytools isn't the right thing for the bitpacked
> format for mali cmdstream.  Maybe the intel thing is better?  But I
> didn't get very far w/ a2xx r/e before I realized that keeping hand
> coded decoding and encoding in sync sucked.

Sure, but.. autogen from.. what? As you note, Mali's "cmdstream" is
wacky and doesn't line up with the model assumed by most of these tools.

Also, if I have to read/write XML, I might lose my sanity faster ;P

> Anyways, totally fine w/ those details getting worked out in-tree,
> after merging.
> 
> Acked-by: Rob Clark 

Thank you!

(Is there any particular ack we're waiting for for pushing?)


Re: [Mesa-dev] [PATCH] nvc0/ir: always use CG mode for loads from atomic-only buffers

2019-02-03 Thread Karol Herbst
Reviewed-by: Karol Herbst 

On Sun, Feb 3, 2019 at 4:10 PM Ilia Mirkin  wrote:
>
> Atomic operations don't update the local cache, which means that we
> would have to issue CCTL operations in order to get the updated values.
> When we know that a buffer is primarily used for atomic operations, it's
> easier to just avoid the caching at that level entirely.
>
> The same issue persists for non-atomic buffers, which will have to be
> fixed separately.
>
> Fixes the failing dEQP-GLES31.functional.atomic_counter.* tests.
>
> Signed-off-by: Ilia Mirkin 
> ---
>  .../drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp  | 14 --
>  1 file changed, 12 insertions(+), 2 deletions(-)
>
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp 
> b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
> index afd7916a321..335e708c5cb 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
> @@ -1087,6 +1087,8 @@ public:
> };
> std::vector<MemoryFile> memoryFiles;
>
> +   std::vector<bool> bufferAtomics;
> +
>  private:
> int inferSysValDirection(unsigned sn) const;
> bool scanDeclaration(const struct tgsi_full_declaration *);
> @@ -1137,6 +1139,7 @@ bool Source::scanSource()
> //resources.resize(scan.file_max[TGSI_FILE_RESOURCE] + 1);
> tempArrayId.resize(scan.file_max[TGSI_FILE_TEMPORARY] + 1);
> memoryFiles.resize(scan.file_max[TGSI_FILE_MEMORY] + 1);
> +   bufferAtomics.resize(scan.file_max[TGSI_FILE_BUFFER] + 1);
>
> info->immd.bufSize = 0;
>
> @@ -1483,11 +1486,14 @@ bool Source::scanDeclaration(const struct 
> tgsi_full_declaration *decl)
>   tempArrayInfo.insert(std::make_pair(arrayId, std::make_pair(
> first, last - first + 
> 1)));
>break;
> +   case TGSI_FILE_BUFFER:
> +  for (i = first; i <= last; ++i)
> + bufferAtomics[i] = decl->Declaration.Atomic;
> +  break;
> case TGSI_FILE_ADDRESS:
> case TGSI_FILE_CONSTANT:
> case TGSI_FILE_IMMEDIATE:
> case TGSI_FILE_SAMPLER:
> -   case TGSI_FILE_BUFFER:
> case TGSI_FILE_IMAGE:
>break;
> default:
> @@ -2720,7 +2726,11 @@ Converter::handleLOAD(Value *dst0[4])
>   }
>
>   Instruction *ld = mkLoad(TYPE_U32, dst0[c], sym, off);
> - ld->cache = tgsi.getCacheMode();
> + if (tgsi.getSrc(0).getFile() == TGSI_FILE_BUFFER &&
> + code->bufferAtomics[r])
> +ld->cache = nv50_ir::CACHE_CG;
> + else
> +ld->cache = tgsi.getCacheMode();
>   if (ind)
>  ld->setIndirect(0, 1, ind);
>}
> --
> 2.19.2
>


[Mesa-dev] [Bug 109391] LTO Build fails

2019-02-03 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=109391

--- Comment #6 from Rudolf Kastl  ---
(In reply to Hi-Angel from comment #5)
> Created attachment 143280 [details] [review]
> Fix LTO build with GCC

(In reply to Hi-Angel from comment #3)
> (In reply to Eric Engestrom from comment #2)
> > That file is generated by src/mapi/mapi_abi.py
> > The exact command line used to generate src/glapi/gen/glapi_mapi_tmp.h is:
> > $ python3 src/mapi/mapi_abi.py --printer glapi
> > src/mapi/glapi/gen/gl_and_es_API.xml > build/src/glapi/gen/glapi_mapi_tmp.h
> > 
> > I'm afraid I can't help with any assembly issue though.
> > 
> > As for LTO, it never worked for me :/
> > It's been on my "to look at eventually" list, but I haven't yet.
> 
> Oh, thank you very much! For some reason I didn't get a notification about
> reply, it could've saved me some hours :(
> 
> --
> 
> To give some update, I reduced it to a minimal testcase. The problem turns
> out to be that gcc with flto removes functions implemented in asm. I reported
> a bug on that: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89147 FWIW, with
> clang it works correctly.

Thank you for taking the time to look into that. I will do a scratch build with
the patch tomorrow and some testing! Awesome!

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


[Mesa-dev] [Bug 109540] gen_builder_meta.hpp:51:117: error: no matching function for call to ‘cast(llvm::FunctionCallee)’

2019-02-03 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=109540

Bug ID: 109540
   Summary: gen_builder_meta.hpp:51:117: error: no matching
function for call to ‘cast(llvm::FunctionCallee)’
   Product: Mesa
   Version: git
  Hardware: x86-64 (AMD64)
OS: All
Status: NEW
  Severity: normal
  Priority: medium
 Component: Drivers/Gallium/swr
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: v...@freedesktop.org
QA Contact: mesa-dev@lists.freedesktop.org

Build error with LLVM 9.0.

In file included from ./rasterizer/jitter/builder.h:158:0,
 from swr_shader.cpp:35:
./rasterizer/jitter/gen_builder_meta.hpp: In member function ‘llvm::Value*
SwrJit::Builder::VGATHERPD(llvm::Value*, llvm::Value*, llvm::Value*,
llvm::Value*, llvm::Value*, const llvm:
:Twine&)’:
./rasterizer/jitter/gen_builder_meta.hpp:51:117: error: no matching function
for call to ‘cast(llvm::FunctionCallee)’
 Function* pFunc =
cast(JM()->mpCurrentModule->getOrInsertFunction("meta.intrinsic.VGATHERPD",
pFuncTy));
   
 ^



[Mesa-dev] [Bug 109535] [Tracker] Mesa 19.0 release tracker

2019-02-03 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=109535

Vinson Lee  changed:

   What|Removed |Added

 Depends on||109131


Referenced Bugs:

https://bugs.freedesktop.org/show_bug.cgi?id=109131
[Bug 109131] cc1plus: error: unrecognized command line option "-std=c++11"


[Mesa-dev] [Bug 109131] cc1plus: error: unrecognized command line option "-std=c++11"

2019-02-03 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=109131

Vinson Lee  changed:

   What|Removed |Added

 Blocks||109535


Referenced Bugs:

https://bugs.freedesktop.org/show_bug.cgi?id=109535
[Bug 109535] [Tracker] Mesa 19.0 release tracker


[Mesa-dev] [Bug 109535] [Tracker] Mesa 19.0 release tracker

2019-02-03 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=109535

Francisco Jerez  changed:

   What|Removed |Added

 Depends on||109328


Referenced Bugs:

https://bugs.freedesktop.org/show_bug.cgi?id=109328
[Bug 109328] [BSW BXT GLK] dEQP-VK.subgroups.arithmetic.subgroup regressions


Re: [Mesa-dev] [PATCH] nir/deref: Drop zero ptr_as_array derefs

2019-02-03 Thread Lionel Landwerlin

On 03/02/2019 16:04, Jason Ekstrand wrote:

They are effectively ()[0] or * which does nothing.



Reviewed-by: Lionel Landwerlin 



---
  src/compiler/nir/nir_deref.c | 21 +
  1 file changed, 21 insertions(+)

diff --git a/src/compiler/nir/nir_deref.c b/src/compiler/nir/nir_deref.c
index 2f5fda643ca..13aa10c7532 100644
--- a/src/compiler/nir/nir_deref.c
+++ b/src/compiler/nir/nir_deref.c
@@ -670,6 +670,27 @@ opt_deref_ptr_as_array(nir_builder *b, nir_deref_instr 
*deref)
 assert(deref->deref_type == nir_deref_type_ptr_as_array);
  
 nir_deref_instr *parent = nir_deref_instr_parent(deref);

+
+   if (nir_src_is_const(deref->arr.index) &&
+   nir_src_as_int(deref->arr.index) == 0) {
+  /* If it's a ptr_as_array deref with an index of 0, it does nothing
+   * and we can just replace its uses with its parent.
+   *
+   * The source of a ptr_as_array deref always has a deref_type of
+   * nir_deref_type_array or nir_deref_type_cast.  If it's a cast, it
+   * may be trivial and we may be able to get rid of that too.  Any
+   * trivial cast of trivial cast cases should be handled already by
+   * opt_deref_cast() above.
+   */
+  if (parent->deref_type == nir_deref_type_cast &&
+  is_trivial_deref_cast(parent))
+ parent = nir_deref_instr_parent(parent);
+  nir_ssa_def_rewrite_uses(&deref->dest.ssa,
+   nir_src_for_ssa(&parent->dest.ssa));
+  nir_instr_remove(&deref->instr);
+  return true;
+   }
+
 if (parent->deref_type != nir_deref_type_array &&
 parent->deref_type != nir_deref_type_ptr_as_array)
return false;





[Mesa-dev] [Bug 109391] LTO Build fails

2019-02-03 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=109391

--- Comment #5 from Hi-Angel  ---
Created attachment 143280
  --> https://bugs.freedesktop.org/attachment.cgi?id=143280=edit
Fix LTO build with GCC



[Mesa-dev] [Bug 109391] LTO Build fails

2019-02-03 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=109391

--- Comment #4 from Hi-Angel  ---
Please test the following patch, it should resolve the build problem.

What it does is disable flto for specific files with assembly-defined
functions. That's okay since, from a cursory look, there's not much code
there except the assembly stuff.

I should mention however, for some reason flto-optimized r600g works
incorrectly for me. But this is irrelevant to the current problem, and in fact
I've stopped using LTO build some months ago for that reason (yeah, LTO build
with GCC worked for me too, I don't know why it broke recently). I wanted to
bisect that back then, but screwed bisection up, and later I just didn't have
motivation or time.



[Mesa-dev] [PATCH] nir/deref: Drop zero ptr_as_array derefs

2019-02-03 Thread Jason Ekstrand
They are effectively ()[0] or * which does nothing.
---
 src/compiler/nir/nir_deref.c | 21 +
 1 file changed, 21 insertions(+)

diff --git a/src/compiler/nir/nir_deref.c b/src/compiler/nir/nir_deref.c
index 2f5fda643ca..13aa10c7532 100644
--- a/src/compiler/nir/nir_deref.c
+++ b/src/compiler/nir/nir_deref.c
@@ -670,6 +670,27 @@ opt_deref_ptr_as_array(nir_builder *b, nir_deref_instr 
*deref)
assert(deref->deref_type == nir_deref_type_ptr_as_array);
 
nir_deref_instr *parent = nir_deref_instr_parent(deref);
+
+   if (nir_src_is_const(deref->arr.index) &&
+   nir_src_as_int(deref->arr.index) == 0) {
+  /* If it's a ptr_as_array deref with an index of 0, it does nothing
+   * and we can just replace its uses with its parent.
+   *
+   * The source of a ptr_as_array deref always has a deref_type of
+   * nir_deref_type_array or nir_deref_type_cast.  If it's a cast, it
+   * may be trivial and we may be able to get rid of that too.  Any
+   * trivial cast of trivial cast cases should be handled already by
+   * opt_deref_cast() above.
+   */
+  if (parent->deref_type == nir_deref_type_cast &&
+  is_trivial_deref_cast(parent))
+ parent = nir_deref_instr_parent(parent);
+  nir_ssa_def_rewrite_uses(&deref->dest.ssa,
+   nir_src_for_ssa(&parent->dest.ssa));
+  nir_instr_remove(&deref->instr);
+  return true;
+   }
+
if (parent->deref_type != nir_deref_type_array &&
parent->deref_type != nir_deref_type_ptr_as_array)
   return false;
-- 
2.20.1
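
The commit message's point can be illustrated in plain C rather than NIR: a
zero-index array access on a pointer designates the same object as the pointer
itself. The function name below is made up for the illustration and has no
counterpart in nir_deref.c.

```c
#include <stdbool.h>

/* Returns true when indexing a pointer with 0, or dereferencing it
 * directly, designates the same object as the original variable.
 */
static bool
zero_index_is_noop(void)
{
   int x = 42;
   int *p = &x;

   return &p[0] == &x && &*p == &x && p[0] == x;
}
```

This is the same reasoning the patch applies to a ptr_as_array deref with a
constant index of 0: its uses can be redirected to its parent.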



[Mesa-dev] [PATCH] nir/deref: Drop zero ptr_as_array derefs

2019-02-03 Thread Jason Ekstrand
They are effectively ()[0] or * which does nothing.
---
 src/compiler/nir/nir_deref.c | 21 +
 1 file changed, 21 insertions(+)

diff --git a/src/compiler/nir/nir_deref.c b/src/compiler/nir/nir_deref.c
index 2f5fda643ca..0af26b80e77 100644
--- a/src/compiler/nir/nir_deref.c
+++ b/src/compiler/nir/nir_deref.c
@@ -670,6 +670,27 @@ opt_deref_ptr_as_array(nir_builder *b, nir_deref_instr 
*deref)
assert(deref->deref_type == nir_deref_type_ptr_as_array);
 
nir_deref_instr *parent = nir_deref_instr_parent(deref);
+
+   if (nir_src_is_const(deref->arr.index) &&
+   nir_src_as_int(deref->arr.index) == 0) {
+  /* If it's a ptr_as_array deref with an index of 0, it does nothing
+   * and we can just replace its uses with its parent.
+   *
+   * The source of a ptr_as_array deref always has a deref_type of
+   * nir_deref_type_array or nir_deref_type_cast.  If it's a cast, it
+   * may be trivial and we may be able to get rid of that too.  Any
+   * trivial cast of trivial cast cases should be handled already by
+   * opt_deref_cast() above.
+   */
+  if (parent->deref_type == nir_deref_type_cast &&
+  is_trivial_deref_cast(parent))
+ parent = nir_deref_instr_parent(parent);
+  nir_ssa_def_rewrite_uses(&deref->dest.ssa,
+   nir_src_for_ssa(&parent->instr));
+  nir_instr_remove(&deref->instr);
+  return true;
+   }
+
if (parent->deref_type != nir_deref_type_array &&
parent->deref_type != nir_deref_type_ptr_as_array)
   return false;
-- 
2.20.1



[Mesa-dev] [PATCH] nvc0/ir: always use CG mode for loads from atomic-only buffers

2019-02-03 Thread Ilia Mirkin
Atomic operations don't update the local cache, which means that we
would have to issue CCTL operations in order to get the updated values.
When we know that a buffer is primarily used for atomic operations, it's
easier to just avoid the caching at that level entirely.

The same issue persists for non-atomic buffers, which will have to be
fixed separately.

Fixes the failing dEQP-GLES31.functional.atomic_counter.* tests.

Signed-off-by: Ilia Mirkin 
---
 .../drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp  | 14 --
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
index afd7916a321..335e708c5cb 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
@@ -1087,6 +1087,8 @@ public:
};
std::vector<MemoryFile> memoryFiles;
 
+   std::vector<bool> bufferAtomics;
+
 private:
int inferSysValDirection(unsigned sn) const;
bool scanDeclaration(const struct tgsi_full_declaration *);
@@ -1137,6 +1139,7 @@ bool Source::scanSource()
//resources.resize(scan.file_max[TGSI_FILE_RESOURCE] + 1);
tempArrayId.resize(scan.file_max[TGSI_FILE_TEMPORARY] + 1);
memoryFiles.resize(scan.file_max[TGSI_FILE_MEMORY] + 1);
+   bufferAtomics.resize(scan.file_max[TGSI_FILE_BUFFER] + 1);
 
info->immd.bufSize = 0;
 
@@ -1483,11 +1486,14 @@ bool Source::scanDeclaration(const struct tgsi_full_declaration *decl)
  tempArrayInfo.insert(std::make_pair(arrayId, std::make_pair(
first, last - first + 1)));
   break;
+   case TGSI_FILE_BUFFER:
+  for (i = first; i <= last; ++i)
+ bufferAtomics[i] = decl->Declaration.Atomic;
+  break;
case TGSI_FILE_ADDRESS:
case TGSI_FILE_CONSTANT:
case TGSI_FILE_IMMEDIATE:
case TGSI_FILE_SAMPLER:
-   case TGSI_FILE_BUFFER:
case TGSI_FILE_IMAGE:
   break;
default:
@@ -2720,7 +2726,11 @@ Converter::handleLOAD(Value *dst0[4])
  }
 
  Instruction *ld = mkLoad(TYPE_U32, dst0[c], sym, off);
- ld->cache = tgsi.getCacheMode();
+ if (tgsi.getSrc(0).getFile() == TGSI_FILE_BUFFER &&
+ code->bufferAtomics[r])
+ld->cache = nv50_ir::CACHE_CG;
+ else
+ld->cache = tgsi.getCacheMode();
  if (ind)
 ld->setIndirect(0, 1, ind);
   }
-- 
2.19.2
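
The load-path change in this patch boils down to a per-buffer flag lookup. A minimal C sketch of that decision follows. The names `cache_mode`, `CACHE_DEFAULT` and `load_cache_mode` are hypothetical and not part of nouveau; only `CACHE_CG` and the idea of one atomic flag per declared buffer come from the patch itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the codegen cache modes. Only CACHE_CG is
 * named in the patch; CACHE_DEFAULT is an assumption for illustration. */
enum cache_mode {
   CACHE_DEFAULT, /* whatever the instruction's own qualifiers request */
   CACHE_CG       /* the mode the patch forces for atomic-only buffers */
};

/* Pick the cache mode for a load from buffer `index`. `is_atomic` holds
 * one flag per declared buffer, mirroring bufferAtomics in the patch. */
static enum cache_mode
load_cache_mode(const bool *is_atomic, size_t num_buffers, size_t index,
                enum cache_mode requested)
{
   assert(index < num_buffers);
   return is_atomic[index] ? CACHE_CG : requested;
}
```

The flag table is filled once while scanning declarations, so the per-load cost is a single array read.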



[Mesa-dev] [Bug 109532] ir_variable has maximum access out of bounds -- but it's not out of bounds

2019-02-03 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=109532

--- Comment #4 from Ilia Mirkin  ---
(In reply to Mark Janes from comment #3)
> This test was broken in the dEQP suite recently for m32 i965 platforms by:

FWIW I don't think this is a test bug. The shaders used appear perfectly
reasonable. It seems like the commit in question would have changed the
generated shader a bit, which could be triggering the different behavior (seems
likely, in fact). But ultimately this is a mesa issue, not a dEQP issue.

-- 
You are receiving this mail because:
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH 4/8] i965: Update the shadow miptree from the main to fake the ETC2 compression

2019-02-03 Thread Eleni Maria Stea
On Fri, 18 Jan 2019 17:09:03 -0800
Nanley Chery  wrote:

> On Mon, Nov 19, 2018 at 10:54:08AM +0200, Eleni Maria Stea wrote:
[...]
> > +   int img_d = smt->surf.logical_level0_px.depth;  
> 
> I don't think 3D ETC textures are possible. From the GL4.6 spec:
> 
>   An INVALID_OPERATION error is generated by
> CompressedTexImage3D if internalformat is one of the EAC, ETC2, or
> RGTC formats and either border is non-zero, or target is not
> TEXTURE_2D_ARRAY.

Hi Nanley,

Thanks for pointing this out. I've made the change in my new series
of patches, but after giving it a second thought I'd rather put the
depth back in the calculation of num_slices:

As I understand the spec, 3D images should be supported when the border
is zero. Mesa already checks the border value in
compressed_texture_error_check() in src/mesa/main/teximage.c, which has
the comment:

/* No compressed formats support borders at this time */

so only borderless ETC/EAC compressed images will reach the update
function, and we should support them.

Also, I see that some CTS tests call CompressedTexImage3D for ETC/EAC
formats with a border value of 0, so I suppose 3D images of these
formats are expected to work.

What do you think?

Thank you in advance,
Eleni
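
To make the num_slices question concrete, here is a minimal C sketch of how the depth could enter the per-level slice count. `level_num_slices` and its parameters are hypothetical names, not the driver's; the sketch assumes, as intel_update_r8stencil does for its own loop, that a 3D texture's depth is minified per level while an array texture keeps its layer count at every level.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: number of slices in one mip level.
 * For a 3D texture the depth halves with each level (never below 1);
 * for an array texture every level has the same number of layers. */
static unsigned
level_num_slices(bool is_3d, unsigned depth_or_layers, unsigned level)
{
   assert(level < 32); /* keep the shift below well defined */

   if (!is_3d)
      return depth_or_layers;

   unsigned d = depth_or_layers >> level;
   return d > 0 ? d : 1;
}
```

Dropping the depth term amounts to treating every image as the array case, which is only correct if 3D ETC images can never reach the update function.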


[Mesa-dev] [PATCH v2 3/5] i965: Faking the ETC2 compression on Gen < 8 GPUs using two miptrees.

2019-02-03 Thread Eleni Maria Stea
Gen < 8 GPUs cannot sample ETC2 formats. So far, the driver converted the
compressed EAC/ETC2 images to non-compressed RGBA images. When the
GetCompressed* functions were called, the pixels were returned in this
RGBA format and not in the compressed format that was expected.

To fix this problem, we use a secondary shadow miptree to store the
decompressed data for rendering and the main miptree to store the
compressed data so that the Get functions work. Each time the main miptree
is written with compressed data, we decompress it to RGB and update the
shadow. Then we use the shadow for rendering.
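
The main/shadow scheme described above can be sketched in a few lines of C. Everything here is hypothetical (`fake_tree`, `TEX_BYTES`, the byte-inverting "decode"); it only illustrates the dirty-flag pattern, not the real intel_mipmap_tree code or ETC2 decoding.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TEX_BYTES 16 /* hypothetical fixed payload size */

struct fake_tree {
   unsigned char main_data[TEX_BYTES];   /* stands in for compressed data */
   unsigned char shadow_data[TEX_BYTES]; /* stands in for decompressed data */
   bool shadow_needs_update;             /* set on write, cleared on refresh */
   unsigned decode_count;                /* how often the shadow was rebuilt */
};

/* Writing always targets the main tree and marks the shadow stale. */
static void
fake_tree_write(struct fake_tree *t, const unsigned char *src)
{
   assert(t && src);
   memcpy(t->main_data, src, TEX_BYTES);
   t->shadow_needs_update = true;
}

/* GetCompressed*-style reads see the untouched compressed bytes. */
static const unsigned char *
fake_tree_read_compressed(const struct fake_tree *t)
{
   return t->main_data;
}

/* Sampling goes through the shadow, refreshing it only when stale. */
static const unsigned char *
fake_tree_sample(struct fake_tree *t)
{
   if (t->shadow_needs_update) {
      /* Placeholder "decode": a real driver would unpack ETC2 blocks. */
      for (size_t i = 0; i < TEX_BYTES; i++)
         t->shadow_data[i] = (unsigned char)~t->main_data[i];
      t->shadow_needs_update = false;
      t->decode_count++;
   }
   return t->shadow_data;
}
```

Because the flag is cleared after a refresh, repeated sampling without an intervening write does not decode again.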

v2:
   - Fixes in the commit message (Nanley Chery)
   - Reversed the changes in brw_get_texture_swizzle and swapped the b, g
   values at the time that we decompress the data in the function:
   intel_miptree_update_etc_shadow of intel_mipmap_tree.c (Nanley Chery)
   - Simplified the format checks in the miptree_create function of the
   intel_mipmap_tree.c and reserved the call of the
   intel_lower_compressed_format for the case that we are faking the ETC
   support (Nanley Chery)
   - Removed the check for the auxiliary usage for the shadow miptree at
   creation (miptree_create of intel_mipmap_tree.c) as we won't use
   auxiliary buffers with these types of trees (Nanley Chery)
   - Set the etc_format of the non-ETC miptrees to MESA_FORMAT_NONE and
   removed the unnecessary checks (Nanley Chery)
   - Fixed an unrelated indentation change (Nanley Chery)
   - Modified the function intel_miptree_finish_write to set the
   mt->shadow_needs_update to true to catch all the cases when we need to
   update the miptree (Nanley Chery)
   - In order to update the shadow miptree during the unmap of the
   main and always map the main (Nanley Chery), the following change was
   necessary: split the previous update function, which updated all the
   mipmap levels, into two functions: one that updates one level and one
   that updates all of them. The first is used during unmap and the
   second before rendering.
   - Removed the BRW_MAP_ETC_BIT flag and the mechanism to decide which
   miptree should be mapped each time and reversed all the changes in the
   higher level texture functions that upload data to textures as they
   aren't needed anymore.
   - Replaced the boolean needs_fake_etc with an inline function that
   checks when we need to fake the ETC compression (Nanley Chery)
   - Removed the initialization of the strides in the update function as
   the values will be overwritten by the intel_miptree_map call (Nanley
   Chery)
   - Used minify instead of division in the new update function
   intel_miptree_update_etc_shadow_levels in intel_mipmap_tree.c (Nanley
   Chery)
   - Removed the depth from the calculation of the number of slices in
   the new update function (intel_miptree_update_etc_shadow_levels of
   intel_mipmap_tree.c) as we don't need to support 3D ETC images.
   (Nanley Chery)
---
 .../drivers/dri/i965/brw_wm_surface_state.c   |   5 +-
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 133 --
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h |  22 +++
 3 files changed, 150 insertions(+), 10 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index 618e2ab35bc..c2cf34aee71 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -521,7 +521,7 @@ static void brw_update_texture_surface(struct gl_context *ctx,
   */
  mesa_fmt = mt->format;
   } else if (mt->etc_format != MESA_FORMAT_NONE) {
- mesa_fmt = mt->format;
+ mesa_fmt = mt->shadow_mt->format;
   } else if (plane > 0) {
  mesa_fmt = mt->format;
   } else {
@@ -581,6 +581,9 @@ static void brw_update_texture_surface(struct gl_context *ctx,
  assert(mt->shadow_mt && !mt->shadow_needs_update);
  mt = mt->shadow_mt;
  format = ISL_FORMAT_R8_UINT;
+  } else if (intel_miptree_needs_fake_etc(brw, mt)) {
+ assert(mt->shadow_mt);
+ mt = mt->shadow_mt;
   }
 
   const int surf_index = surf_offset - &brw->wm.base.surf_offset[0];
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index 0a25dfd0161..3ff36b84a5a 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -57,6 +57,11 @@ static void *intel_miptree_map_raw(struct brw_context *brw,
GLbitfield mode);
 
 static void intel_miptree_unmap_raw(struct intel_mipmap_tree *mt);
+static void intel_miptree_update_etc_shadow(struct brw_context *brw,
+struct intel_mipmap_tree *mt,
+unsigned int level,
+unsigned int slice,
+int 

[Mesa-dev] [PATCH v2 2/5] i965: Removed assertions from intel_miptree_map_etc

2019-02-03 Thread Eleni Maria Stea
The assertions in intel_miptree_map_etc that GL_MAP_WRITE_BIT and
GL_MAP_INVALIDATE_RANGE_BIT are set will fail when the ETC miptree is
mapped for reading. As we are about to fix the GetCompressed* functions in
the following patches and allow reading from ETC miptrees, we have to
remove them.

Fixes the crash of the test
KHR-GL45.direct_state_access.textures_compressed_subimage on Gen 7 GPUs.
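
A small C sketch of why a read mapping trips the removed assertions. The `MAP_*` constants are local copies of the GL_MAP_* bit values from the OpenGL headers, and `old_asserts_hold` is a hypothetical helper written only for this illustration.

```c
#include <stdbool.h>

/* Local copies of the GL_MAP_* access bits from the OpenGL headers. */
enum {
   MAP_READ_BIT             = 0x0001,
   MAP_WRITE_BIT            = 0x0002,
   MAP_INVALIDATE_RANGE_BIT = 0x0004
};

/* Hypothetical helper: true when both removed assertions would pass
 * for the given map mode. */
static bool
old_asserts_hold(unsigned mode)
{
   return (mode & MAP_WRITE_BIT) != 0 &&
          (mode & MAP_INVALIDATE_RANGE_BIT) != 0;
}
```

A mode of MAP_READ_BIT alone has neither bit set, so the first assertion already fails for a read-only mapping.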
---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index 479188fd1c8..0a25dfd0161 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -3497,9 +3497,6 @@ intel_miptree_map_etc(struct brw_context *brw,
   assert(mt->format == MESA_FORMAT_R8G8B8X8_UNORM);
}
 
-   assert(map->mode & GL_MAP_WRITE_BIT);
-   assert(map->mode & GL_MAP_INVALIDATE_RANGE_BIT);
-
intel_miptree_access_raw(brw, mt, level, slice, true);
 
map->stride = _mesa_format_row_stride(mt->etc_format, map->w);
-- 
2.20.1



[Mesa-dev] [PATCH v2 5/5] i965: Enabled the OES_copy_image extension on Gen 7 GPUs

2019-02-03 Thread Eleni Maria Stea
The OES_copy_image extension was disabled on Gen7 due to the lack of
support for ETC2 images. Re-enable it. (Kenneth Graunke)
---
 src/mesa/drivers/dri/i965/intel_extensions.c | 18 ++
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c b/src/mesa/drivers/dri/i965/intel_extensions.c
index 3a95be58a63..d2e232f3ff1 100644
--- a/src/mesa/drivers/dri/i965/intel_extensions.c
+++ b/src/mesa/drivers/dri/i965/intel_extensions.c
@@ -287,14 +287,24 @@ intelInitExtensions(struct gl_context *ctx)
}
 
if (devinfo->gen >= 8 || devinfo->is_baytrail) {
-  /* For now, we only enable OES_copy_image on platforms that support
-   * ETC2 natively in hardware.  We would need more hacks to support it
-   * elsewhere. Same with OES_texture_view.
+  /*
+   * For now, we can't enable OES_texture_view on Gen 7 because of
+   * some piglit failures coming from
+   * piglit/tests/spec/arb_texture_view/rendering-formats.c that need
+   * investigation.
*/
-  ctx->Extensions.OES_copy_image = true;
   ctx->Extensions.OES_texture_view = true;
}
 
+   if (devinfo->gen >= 7) {
+  /*
+   * We can safely enable OES_copy_image on Gen 7, since we emulate
+   * the ETC2 support using the shadow_miptree to store the
+   * compressed data.
+   */
+  ctx->Extensions.OES_copy_image = true;
+   }
+
if (devinfo->gen >= 8) {
   ctx->Extensions.ARB_gpu_shader_int64 = true;
   /* requires ARB_gpu_shader_int64 */
-- 
2.20.1



[Mesa-dev] [PATCH v2 4/5] i965: Fixed the CopyImageSubData for ETC2 on Gen < 8

2019-02-03 Thread Eleni Maria Stea
For CopyImageSubData to copy the data during the 1st draw call, we need
to update the shadow tree right before the rendering.
---
 src/mesa/drivers/dri/i965/brw_draw.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_draw.c b/src/mesa/drivers/dri/i965/brw_draw.c
index ec4fe0b096f..d00e0a726b1 100644
--- a/src/mesa/drivers/dri/i965/brw_draw.c
+++ b/src/mesa/drivers/dri/i965/brw_draw.c
@@ -559,6 +559,11 @@ brw_predraw_resolve_inputs(struct brw_context *brw, bool rendering,
   tex_obj->mt->format == MESA_FORMAT_S_UINT8) {
  intel_update_r8stencil(brw, tex_obj->mt);
   }
+
+  if (intel_miptree_has_etc_shadow(brw, tex_obj->mt) &&
+  tex_obj->mt->shadow_needs_update) {
+ intel_miptree_update_etc_shadow_levels(brw, tex_obj->mt);
+  }
}
 
/* Resolve color for each active shader image. */
-- 
2.20.1



[Mesa-dev] [PATCH v2 1/5] i965: Rename intel_mipmap_tree::r8stencil_* -> ::shadow_*

2019-02-03 Thread Eleni Maria Stea
From: Nanley Chery 

Use more generic field names. We'll reuse these fields for a workaround
with ASTC miptrees.

Reviewed-by: Eleni Maria Stea 
---
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c |  8 
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c| 16 
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h| 14 +++---
 3 files changed, 19 insertions(+), 19 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index b067a174056..618e2ab35bc 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -571,15 +571,15 @@ static void brw_update_texture_surface(struct gl_context *ctx,
 
   if (obj->StencilSampling && firstImage->_BaseFormat == GL_DEPTH_STENCIL) {
  if (devinfo->gen <= 7) {
-assert(mt->r8stencil_mt && !mt->stencil_mt->r8stencil_needs_update);
-mt = mt->r8stencil_mt;
+assert(mt->shadow_mt && !mt->stencil_mt->shadow_needs_update);
+mt = mt->shadow_mt;
  } else {
 mt = mt->stencil_mt;
  }
  format = ISL_FORMAT_R8_UINT;
   } else if (devinfo->gen <= 7 && mt->format == MESA_FORMAT_S_UINT8) {
- assert(mt->r8stencil_mt && !mt->r8stencil_needs_update);
- mt = mt->r8stencil_mt;
+ assert(mt->shadow_mt && !mt->shadow_needs_update);
+ mt = mt->shadow_mt;
  format = ISL_FORMAT_R8_UINT;
   }
 
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index b4e3524aa51..479188fd1c8 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -1214,7 +1214,7 @@ intel_miptree_release(struct intel_mipmap_tree **mt)
 
   brw_bo_unreference((*mt)->bo);
   intel_miptree_release(&(*mt)->stencil_mt);
-  intel_miptree_release(&(*mt)->r8stencil_mt);
+  intel_miptree_release(&(*mt)->shadow_mt);
   intel_miptree_aux_buffer_free((*mt)->aux_buf);
   free_aux_state_map((*mt)->aux_state);
 
@@ -2427,7 +2427,7 @@ intel_miptree_finish_write(struct brw_context *brw,
switch (mt->aux_usage) {
case ISL_AUX_USAGE_NONE:
   if (mt->format == MESA_FORMAT_S_UINT8 && devinfo->gen <= 7)
- mt->r8stencil_needs_update = true;
+ mt->shadow_needs_update = true;
   break;
 
case ISL_AUX_USAGE_MCS:
@@ -2933,9 +2933,9 @@ intel_update_r8stencil(struct brw_context *brw,
 
assert(src->surf.size_B > 0);
 
-   if (!mt->r8stencil_mt) {
+   if (!mt->shadow_mt) {
   assert(devinfo->gen > 6); /* Handle MIPTREE_LAYOUT_GEN6_HIZ_STENCIL */
-  mt->r8stencil_mt = make_surface(
+  mt->shadow_mt = make_surface(
 brw,
 src->target,
 MESA_FORMAT_R_UINT8,
@@ -2949,13 +2949,13 @@ intel_update_r8stencil(struct brw_context *brw,
 ISL_TILING_Y0_BIT,
 ISL_SURF_USAGE_TEXTURE_BIT,
 BO_ALLOC_BUSY, 0, NULL);
-  assert(mt->r8stencil_mt);
+  assert(mt->shadow_mt);
}
 
-   if (src->r8stencil_needs_update == false)
+   if (src->shadow_needs_update == false)
   return;
 
-   struct intel_mipmap_tree *dst = mt->r8stencil_mt;
+   struct intel_mipmap_tree *dst = mt->shadow_mt;
 
for (int level = src->first_level; level <= src->last_level; level++) {
   const unsigned depth = src->surf.dim == ISL_SURF_DIM_3D ?
@@ -2975,7 +2975,7 @@ intel_update_r8stencil(struct brw_context *brw,
}
 
brw_cache_flush_for_read(brw, dst->bo);
-   src->r8stencil_needs_update = false;
+   src->shadow_needs_update = false;
 }
 
 static void *
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
index 17668944adc..1a7507023a1 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
@@ -294,16 +294,16 @@ struct intel_mipmap_tree
struct intel_mipmap_tree *stencil_mt;
 
/**
-* \brief Stencil texturing miptree for sampling from a stencil texture
+* \brief Shadow miptree for sampling when the main isn't supported by HW.
 *
-* Some hardware doesn't support sampling from the stencil texture as
-* required by the GL_ARB_stencil_texturing extenion. To workaround this we
-* blit the texture into a new texture that can be sampled.
+* To workaround various sampler bugs and limitations, we blit the main
+* texture into a new texture that can be sampled.
 *
-* \see intel_update_r8stencil()
+* This miptree may be used for:
+* - Stencil texturing (pre-BDW) as required by GL_ARB_stencil_texturing.
 */
-   struct intel_mipmap_tree *r8stencil_mt;
-   bool r8stencil_needs_update;
+   struct intel_mipmap_tree *shadow_mt;
+   bool shadow_needs_update;
 
/**
 * \brief CCS, 

[Mesa-dev] [PATCH v2 0/5] improved the support for ETC2 formats on Gen 7

2019-02-03 Thread Eleni Maria Stea
Intel Gen7 GPUs don't support the ETC2 formats natively, so in order to
show the pixels properly we convert them to RGBA and create RGBA miptrees.
The problem is that the GetCompressed* functions, which should return the
compressed pixel values, return the RGBA values instead.

These patches attempt to solve this problem by using 2 miptrees: the main
one stores the ETC values and the generic shadow (mt->shadow_mt) stores
the RGBA values. Each time the main miptree is unmapped, we unpack the ETC
data to RGBA and update the shadow. Similarly, we update all the mipmap
levels of the image (if necessary) before drawing, for CopyImageSubData to
work.

Also, the OES_copy_image extension, which couldn't work on Gen 7 due to the
lack of ETC support, is now re-enabled.

Eleni Maria Stea (4):
  i965: Removed assertions from intel_miptree_map_etc
  i965: Faking the ETC2 compression on Gen < 8 GPUs using two miptrees.
  i965: Fixed the CopyImageSubData for ETC2 on Gen < 8
  i965: Enabled the OES_copy_image extension on Gen 7 GPUs

Nanley Chery (1):
  i965: Rename intel_mipmap_tree::r8stencil_* -> ::shadow_*

 src/mesa/drivers/dri/i965/brw_draw.c  |   5 +
 .../drivers/dri/i965/brw_wm_surface_state.c   |  13 +-
 src/mesa/drivers/dri/i965/intel_extensions.c  |  18 ++-
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 152 +++---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h |  36 -
 5 files changed, 188 insertions(+), 36 deletions(-)

-- 
2.20.1



Re: [Mesa-dev] [PATCH] meson: drop the xcb-xrandr version requirement

2019-02-03 Thread Erik Faye-Lund
On Sat, 2019-02-02 at 12:58 -0500, Marek Olšák wrote:
> 
> 
> On Sat, Feb 2, 2019, 12:41 PM Eric Engestrom <
> eric.engest...@intel.com wrote:
> > On Saturday, 2019-02-02 10:32:15 -0500, Marek Olšák wrote:
> > > On Sat, Feb 2, 2019, 7:17 AM Eric Engestrom <
> > eric.engest...@intel.com wrote:
> > > 
> > > > On Friday, 2019-02-01 15:42:17 -0500, Marek Olšák wrote:
> > > > > If there is no feedback soon, I'll push this.
> > > >
> > > > Have you tested that xcb-randr < 1.12 works?
> > > > Probably shouldn't remove a restriction unless you're sure it
> > isn't
> > > > needed :)
> > > >
> > > 
> > > Is this a joke? I'm just mirroring autotools. Supporting the same
> > linux
> > > distributions as autotools is a requirement for meson's general
> > acceptance.
> > 
> > No, I'm being serious: just because a restriction didn't exist on
> > autotools doesn't mean that code path was exercised by people
> > running
> > an old xcb-randr, hence the need to test it :)
> > 
> > I didn't mean to offend you, I was just asking the question to make
> > sure
> > this was tested before we claim to support xcb-randr < 1.12, as it
> > might
> > be that autotools was simply missing the version check.
> 
> Ok. I use old xcb-xrandr on some of my systems, one of them used to
> be my main system. Not being able to use meson on those systems
> without this patch is a big deal for me.

This sounds like you have indeed tested on xcb-randr < 1.12, so I
suppose the answer to the question is "yes"? If so, I think it's all
good, no?

Anyway, I think this seems like the right move, and since Keith hasn't
responded, feel free to add:

Reviewed-by: Erik Faye-Lund 
