Re: [Mesa-dev] [PATCH 3/3] i965/fs: Don't consider the stencil output to be a color output.

2016-08-23 Thread Kenneth Graunke
On Monday, August 22, 2016 6:59:45 PM PDT Francisco Jerez wrote:
> This would cause gl_FragStencilRef to be counted as a color output
> incorrectly during the precompile phase, which leads to unnecessary
> recompilation on master and could trigger an assertion failure in
> fs_visitor::emit_fb_writes() on my i965-fb-fetch branch.
> ---
>  src/mesa/drivers/dri/i965/brw_wm.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_wm.c 
> b/src/mesa/drivers/dri/i965/brw_wm.c
> index 7b1b839..d725a16 100644
> --- a/src/mesa/drivers/dri/i965/brw_wm.c
> +++ b/src/mesa/drivers/dri/i965/brw_wm.c
> @@ -608,7 +608,8 @@ brw_fs_precompile(struct gl_context *ctx,
>  
> key.nr_color_regions = _mesa_bitcount_64(fp->Base.OutputsWritten &
>   ~(BITFIELD64_BIT(FRAG_RESULT_DEPTH) |
> - BITFIELD64_BIT(FRAG_RESULT_SAMPLE_MASK)));
> +   BITFIELD64_BIT(FRAG_RESULT_STENCIL) |
> +   BITFIELD64_BIT(FRAG_RESULT_SAMPLE_MASK)));
>  
> key.program_string_id = bfp->id;
>  
> 

Patch 3 is:
Reviewed-by: Kenneth Graunke 


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/3] intel/isl: Pass the dim_layout into choose_alignment_el

2016-08-23 Thread Jason Ekstrand
Signed-off-by: Jason Ekstrand 
Cc: Chad Versace 
---
 src/intel/isl/isl.c  | 23 ---
 src/intel/isl/isl_gen4.c |  1 +
 src/intel/isl/isl_gen4.h |  1 +
 src/intel/isl/isl_gen6.c |  1 +
 src/intel/isl/isl_gen6.h |  1 +
 src/intel/isl/isl_gen7.c |  1 +
 src/intel/isl/isl_gen7.h |  1 +
 src/intel/isl/isl_gen8.c |  1 +
 src/intel/isl/isl_gen8.h |  1 +
 src/intel/isl/isl_gen9.c |  5 +++--
 src/intel/isl/isl_gen9.h |  1 +
 11 files changed, 24 insertions(+), 13 deletions(-)

diff --git a/src/intel/isl/isl.c b/src/intel/isl/isl.c
index b191df3..6e32302 100644
--- a/src/intel/isl/isl.c
+++ b/src/intel/isl/isl.c
@@ -421,6 +421,7 @@ static void
 isl_choose_image_alignment_el(const struct isl_device *dev,
   const struct isl_surf_init_info *restrict info,
   enum isl_tiling tiling,
+  enum isl_dim_layout dim_layout,
   enum isl_msaa_layout msaa_layout,
   struct isl_extent3d *image_align_el)
 {
@@ -434,20 +435,20 @@ isl_choose_image_alignment_el(const struct isl_device 
*dev,
}
 
if (ISL_DEV_GEN(dev) >= 9) {
-  gen9_choose_image_alignment_el(dev, info, tiling, msaa_layout,
- image_align_el);
+  gen9_choose_image_alignment_el(dev, info, tiling, dim_layout,
+ msaa_layout, image_align_el);
} else if (ISL_DEV_GEN(dev) >= 8) {
-  gen8_choose_image_alignment_el(dev, info, tiling, msaa_layout,
- image_align_el);
+  gen8_choose_image_alignment_el(dev, info, tiling, dim_layout,
+ msaa_layout, image_align_el);
} else if (ISL_DEV_GEN(dev) >= 7) {
-  gen7_choose_image_alignment_el(dev, info, tiling, msaa_layout,
- image_align_el);
+  gen7_choose_image_alignment_el(dev, info, tiling, dim_layout,
+ msaa_layout, image_align_el);
} else if (ISL_DEV_GEN(dev) >= 6) {
-  gen6_choose_image_alignment_el(dev, info, tiling, msaa_layout,
- image_align_el);
+  gen6_choose_image_alignment_el(dev, info, tiling, dim_layout,
+ msaa_layout, image_align_el);
} else {
-  gen4_choose_image_alignment_el(dev, info, tiling, msaa_layout,
- image_align_el);
+  gen4_choose_image_alignment_el(dev, info, tiling, dim_layout,
+ msaa_layout, image_align_el);
}
 }
 
@@ -1146,7 +1147,7 @@ isl_surf_init_s(const struct isl_device *dev,
return false;
 
struct isl_extent3d image_align_el;
-   isl_choose_image_alignment_el(dev, info, tiling, msaa_layout,
+   isl_choose_image_alignment_el(dev, info, tiling, dim_layout, msaa_layout,
  _align_el);
 
struct isl_extent3d image_align_sa =
diff --git a/src/intel/isl/isl_gen4.c b/src/intel/isl/isl_gen4.c
index 52aa565..1d584fc 100644
--- a/src/intel/isl/isl_gen4.c
+++ b/src/intel/isl/isl_gen4.c
@@ -41,6 +41,7 @@ void
 gen4_choose_image_alignment_el(const struct isl_device *dev,
const struct isl_surf_init_info *restrict info,
enum isl_tiling tiling,
+   enum isl_dim_layout dim_layout,
enum isl_msaa_layout msaa_layout,
struct isl_extent3d *image_align_el)
 {
diff --git a/src/intel/isl/isl_gen4.h b/src/intel/isl/isl_gen4.h
index 06cd70b..48175ca 100644
--- a/src/intel/isl/isl_gen4.h
+++ b/src/intel/isl/isl_gen4.h
@@ -39,6 +39,7 @@ void
 gen4_choose_image_alignment_el(const struct isl_device *dev,
const struct isl_surf_init_info *restrict info,
enum isl_tiling tiling,
+   enum isl_dim_layout dim_layout,
enum isl_msaa_layout msaa_layout,
struct isl_extent3d *image_align_el);
 
diff --git a/src/intel/isl/isl_gen6.c b/src/intel/isl/isl_gen6.c
index de95a8f..2c52e38 100644
--- a/src/intel/isl/isl_gen6.c
+++ b/src/intel/isl/isl_gen6.c
@@ -86,6 +86,7 @@ void
 gen6_choose_image_alignment_el(const struct isl_device *dev,
const struct isl_surf_init_info *restrict info,
enum isl_tiling tiling,
+   enum isl_dim_layout dim_layout,
enum isl_msaa_layout msaa_layout,
struct isl_extent3d *image_align_el)
 {
diff --git a/src/intel/isl/isl_gen6.h b/src/intel/isl/isl_gen6.h
index 0779c67..04414c7 100644
--- a/src/intel/isl/isl_gen6.h
+++ b/src/intel/isl/isl_gen6.h
@@ -39,6 +39,7 @@ void
 

[Mesa-dev] [PATCH 3/3] intel/isl/gen9: Only use the magic 1D alignment for GEN9_2D surfaces

2016-08-23 Thread Jason Ekstrand
If the surface has a layout of GEN4_2D then we need to compute a normal 2D
alignment and not use the bogus 1D one.

Signed-off-by: Jason Ekstrand 
Cc: Chad Versace 
---
 src/intel/isl/isl_gen9.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/intel/isl/isl_gen9.c b/src/intel/isl/isl_gen9.c
index ca6..da8c749 100644
--- a/src/intel/isl/isl_gen9.c
+++ b/src/intel/isl/isl_gen9.c
@@ -174,7 +174,7 @@ gen9_choose_image_alignment_el(const struct isl_device *dev,
   return;
}
 
-   if (info->dim == ISL_SURF_DIM_1D) {
+   if (dim_layout == ISL_DIM_LAYOUT_GEN9_1D) {
   /* See the Skylake BSpec > Memory Views > Common Surface Formats > 
Surface
* Layout and Tiling > 1D Surfaces > 1D Alignment Requirements.
*/
-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/3] intel/isl: Use DIM_LAYOUT_GEN4_2D for tiled 1-D surfaces on SKL

2016-08-23 Thread Jason Ekstrand
The Sky Lake 1D layout is only used if the surface is linear.  For tiled
surfaces such as depth and stenil the old gen4 2D layout is used.

Signed-off-by: Jason Ekstrand 
Cc: Chad Versace 
---
 src/intel/isl/isl.c | 28 +++-
 1 file changed, 23 insertions(+), 5 deletions(-)

diff --git a/src/intel/isl/isl.c b/src/intel/isl/isl.c
index b6f7d5b..b191df3 100644
--- a/src/intel/isl/isl.c
+++ b/src/intel/isl/isl.c
@@ -453,12 +453,30 @@ isl_choose_image_alignment_el(const struct isl_device 
*dev,
 
 static enum isl_dim_layout
 isl_surf_choose_dim_layout(const struct isl_device *dev,
-   enum isl_surf_dim logical_dim)
+   enum isl_surf_dim logical_dim,
+   enum isl_tiling tiling)
 {
if (ISL_DEV_GEN(dev) >= 9) {
   switch (logical_dim) {
   case ISL_SURF_DIM_1D:
- return ISL_DIM_LAYOUT_GEN9_1D;
+ /* From the Sky Lake PRM Vol. 5, "1D Surfaces":
+  *
+  *One-dimensional surfaces use a tiling mode of linear.
+  *Technically, they are not tiled resources, but the Tiled
+  *Resource Mode field in RENDER_SURFACE_STATE is still used to
+  *indicate the alignment requirements for this linear surface
+  *(See 1D Alignment requirements for how 4K and 64KB Tiled
+  *Resource Modes impact alignment). Alternatively, a 1D surface
+  *can be defined as a 2D tiled surface (e.g. TileY or TileX) with
+  *a height of 0.
+  *
+  * In other words, ISL_DIM_LAYOUT_GEN9_1D is only used for linear
+  * surfaces and, for tiled surfaces, ISL_DIM_LAYOUT_GEN4_2D is used.
+  */
+ if (tiling == ISL_TILING_LINEAR)
+return ISL_DIM_LAYOUT_GEN9_1D;
+ else
+return ISL_DIM_LAYOUT_GEN4_2D;
   case ISL_SURF_DIM_2D:
   case ISL_SURF_DIM_3D:
  return ISL_DIM_LAYOUT_GEN4_2D;
@@ -1112,9 +1130,6 @@ isl_surf_init_s(const struct isl_device *dev,
   .a = info->array_len,
};
 
-   enum isl_dim_layout dim_layout =
-  isl_surf_choose_dim_layout(dev, info->dim);
-
enum isl_tiling tiling;
if (!isl_surf_choose_tiling(dev, info, ))
   return false;
@@ -1123,6 +1138,9 @@ isl_surf_init_s(const struct isl_device *dev,
if (!isl_tiling_get_info(dev, tiling, fmtl->bpb, _info))
   return false;
 
+   enum isl_dim_layout dim_layout =
+  isl_surf_choose_dim_layout(dev, info->dim, tiling);
+
enum isl_msaa_layout msaa_layout;
if (!isl_choose_msaa_layout(dev, info, tiling, _layout))
return false;
-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 0/2] *** Aubinator tool for Intel Gen platforms ***

2016-08-23 Thread Kenneth Graunke
On Friday, August 19, 2016 12:13:23 PM PDT Sirisha Gandikota wrote:
> This is a patch series for adding the aubinator tool to the codebase.

Hi Sirisha,

I've pushed your patches to import the tool.  I also pushed a third
commit that fixes some minor style nitpicks - I figured it would be
easier to just fix it up rather than to point out every instance of
whitespace changes in a big long email.  Here's the patch, if you
want to see what I did:

https://cgit.freedesktop.org/mesa/mesa/commit/?id=e7530bfcd6acdc8f8984820445c4b41602952298

While reviewing the code, I did find some things I'd like to see
changed:

1. Fix a compiler warning:

   decoder.c:460:12: warning: assignment discards ‘const’ qualifier from
   pointer target type [-Wdiscarded-qualifiers] 

   You can just mark the gen_field_iterator::p pointer as const.

2. Simplify gen_disasm_create()'s devinfo handling.

   You can just do gd->devinfo = *brw_get_device_info(pciid) to
   copy the whole struct, rather than just a couple fields.

3. Simplify print_dword_val().

   You don't need the float/dword union - iter->p[f->start / 32] 
   is already a uint32_t, which is what you want for the %08x
   printf formatter.  You may as well use it directly.

4. Make gen_disasm_disassemble handle split sends.

   Skylake adds new SENDS and SENDSC opcodes, which you should
   handle in the send-with-EOT check.  Maybe make an is_send()
   helper that checks if the opcode is SEND/SENDC/SENDS/SENDSC?

5. Bogus "end" parameter in gen_disasm_disassemble()?

   The loop pretends to loop over instructions from "start" to "end",
   but the callers always pass 8192 for end, which is some huge bogus
   value.  The real loop termination condition is send-with-EOT or 0.
   Maybe just drop the bogus "end" parameter entirely?

6. Fix decoding of values that span two DWords.

   64-bit fields (such as most pointers on Gen8+) aren't currently
   decoded correctly.  gen_field_iterator_next seems to walk one
   DWord at a time, sets v.dw, and then passes it to field().

   So, even though field() takes a uint64_t, we're passing it a
   uint32_t (which gets promoted, so the top 32 bits will always
   be zero).  This seems pretty bogus...

   If a field's (end - start) > 32, you probably want to read a
   whole uint64_t and pass that to field().  Might need to % 64
   instead of % 32.  I'll let you figure it out :)

Could you send a new patch series to address these?  All but the last
should be trivial to fix.  Now that the tool has landed, it'd be nice
to see each change as a separate commit - let's not squash things from
now on.

Thanks!
--Ken


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] gallium/ttn: Remove duplicated TGSI_OPCODE_DP2A initialization

2016-08-23 Thread Rhys Kidd
Duplicate line is currently on 1535.

Identified by Clang, when run through Eric Anholt's Travis harness.

Signed-off-by: Rhys Kidd 
---
 src/gallium/auxiliary/nir/tgsi_to_nir.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/src/gallium/auxiliary/nir/tgsi_to_nir.c 
b/src/gallium/auxiliary/nir/tgsi_to_nir.c
index 906c643..7f5774d 100644
--- a/src/gallium/auxiliary/nir/tgsi_to_nir.c
+++ b/src/gallium/auxiliary/nir/tgsi_to_nir.c
@@ -1575,7 +1575,6 @@ static const nir_op op_trans[TGSI_OPCODE_LAST] = {
[TGSI_OPCODE_TXB] = 0,
[TGSI_OPCODE_DIV] = nir_op_fdiv,
[TGSI_OPCODE_DP2] = 0,
-   [TGSI_OPCODE_DP2A] = 0,
[TGSI_OPCODE_TXL] = 0,
 
[TGSI_OPCODE_BRK] = 0,
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 97456] Detect wrong driver on AMD r4 graphic gpu

2016-08-23 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=97456

--- Comment #1 from Michel Dänzer  ---
Please attach the full output containing the information you think is
incorrect.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 97455] Detect wrong driver for AMD r3 graphic

2016-08-23 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=97455

BO-YI TSAI  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |INVALID

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 97456] Detect wrong driver on AMD r4 graphic gpu

2016-08-23 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=97456

BO-YI TSAI  changed:

   What|Removed |Added

   Hardware|Other   |x86-64 (AMD64)
 OS|All |Linux (All)

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 97456] Detect wrong driver on AMD r4 graphic gpu

2016-08-23 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=97456

Bug ID: 97456
   Summary: Detect wrong driver on AMD r4 graphic gpu
   Product: Mesa
   Version: unspecified
  Hardware: Other
OS: All
Status: NEW
  Severity: normal
  Priority: medium
 Component: Other
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: a8581...@gmail.com
QA Contact: mesa-dev@lists.freedesktop.org

I'm using a AMD tablet with a6-micro 6500t apu, which has r4 graphic GPU. It's
ID is "VEN_1002_9854_06", though the Mesa driver regard it as "R3E
graphic" . could any one help to fix it? thanks :)

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 97455] Detect wrong driver for AMD r3 graphic

2016-08-23 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=97455

Bug ID: 97455
   Summary: Detect wrong driver for AMD r3 graphic
   Product: Mesa
   Version: unspecified
  Hardware: Other
OS: All
Status: NEW
  Severity: normal
  Priority: medium
 Component: Other
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: a8581...@gmail.com
QA Contact: mesa-dev@lists.freedesktop.org

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] isl: round format alignment to nearest power of 2

2016-08-23 Thread Jason Ekstrand
On Tue, Aug 23, 2016 at 5:41 PM, Ilia Mirkin  wrote:

> On Fri, Aug 19, 2016 at 7:44 PM, Lionel Landwerlin
>  wrote:
> > A few inline asserts in anv assume alignments are power of 2, but with
> > formats like R8G8B8 we have odd alignments.
> >
> > Signed-off-by: Lionel Landwerlin 
> > ---
> >  src/intel/isl/isl.c  | 1 +
> >  src/intel/isl/isl_priv.h | 6 ++
> >  2 files changed, 7 insertions(+)
> >
> > diff --git a/src/intel/isl/isl.c b/src/intel/isl/isl.c
> > index 18e95e2..dfe0fc1 100644
> > --- a/src/intel/isl/isl.c
> > +++ b/src/intel/isl/isl.c
> > @@ -1182,6 +1182,7 @@ isl_surf_init_s(const struct isl_device *dev,
> >  base_alignment = MAX(base_alignment, fmtl->bpb / 8);
> >   }
> >}
> > +  base_alignment = isl_round_to_power_of_two(base_alignment);
> > } else {
> >assert(phys_slice0_sa.w % fmtl->bw == 0);
> >const uint32_t total_w_el = phys_slice0_sa.width / fmtl->bw;
> > diff --git a/src/intel/isl/isl_priv.h b/src/intel/isl/isl_priv.h
> > index 3a7af1a..cc9991c 100644
> > --- a/src/intel/isl/isl_priv.h
> > +++ b/src/intel/isl/isl_priv.h
> > @@ -99,6 +99,12 @@ isl_log2u(uint32_t n)
> >  }
> >
> >  static inline uint32_t
> > +isl_round_to_power_of_two(uint32_t value)
> > +{
> > +   return 1 << isl_log2u(value);
>
> This will round down. Is that desired? (Don't know, just asking... I
> kind of assumed you'd want to round up to the nearest power of two,
> but perhaps not.)
>

You're right!  R-B recinded.  If you rounded up to a power of two, that
would work fine.


> > +}
> > +
> > +static inline uint32_t
> >  isl_minify(uint32_t n, uint32_t levels)
> >  {
> > if (unlikely(n == 0))
> > --
> > 2.9.3
> >
> > ___
> > mesa-dev mailing list
> > mesa-dev@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 97444] mesa git crashes in libxshmfence

2016-08-23 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=97444

--- Comment #3 from Michel Dänzer  ---
Assuming you can reproduce the problem with Mesa Git + LLVM 3.9/3.8, you can
stick to the latter for bisecting. If you run into any clang related issues,
you can build Mesa with --disable-opencl for the bisection.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 97260] [bisected] R9 290 low performance in Linux 4.7

2016-08-23 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=97260

--- Comment #27 from Michel Dänzer  ---
(In reply to Kai from comment #26)
> (In reply to Alex Deucher from comment #25)
> > Is anyone still having issues with vblank_mode=0?
> 
> With or without Michel's patch
> (), [...]

My patch doesn't have any effect with vblank_mode=0, which is why Alex asked.
However, if there are still issues with that, I'd like them to be tracked in a
separate report, ideally with an independent bisection confirming they started
with the same kernel change.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] isl: round format alignment to nearest power of 2

2016-08-23 Thread Ilia Mirkin
On Fri, Aug 19, 2016 at 7:44 PM, Lionel Landwerlin
 wrote:
> A few inline asserts in anv assume alignments are power of 2, but with
> formats like R8G8B8 we have odd alignments.
>
> Signed-off-by: Lionel Landwerlin 
> ---
>  src/intel/isl/isl.c  | 1 +
>  src/intel/isl/isl_priv.h | 6 ++
>  2 files changed, 7 insertions(+)
>
> diff --git a/src/intel/isl/isl.c b/src/intel/isl/isl.c
> index 18e95e2..dfe0fc1 100644
> --- a/src/intel/isl/isl.c
> +++ b/src/intel/isl/isl.c
> @@ -1182,6 +1182,7 @@ isl_surf_init_s(const struct isl_device *dev,
>  base_alignment = MAX(base_alignment, fmtl->bpb / 8);
>   }
>}
> +  base_alignment = isl_round_to_power_of_two(base_alignment);
> } else {
>assert(phys_slice0_sa.w % fmtl->bw == 0);
>const uint32_t total_w_el = phys_slice0_sa.width / fmtl->bw;
> diff --git a/src/intel/isl/isl_priv.h b/src/intel/isl/isl_priv.h
> index 3a7af1a..cc9991c 100644
> --- a/src/intel/isl/isl_priv.h
> +++ b/src/intel/isl/isl_priv.h
> @@ -99,6 +99,12 @@ isl_log2u(uint32_t n)
>  }
>
>  static inline uint32_t
> +isl_round_to_power_of_two(uint32_t value)
> +{
> +   return 1 << isl_log2u(value);

This will round down. Is that desired? (Don't know, just asking... I
kind of assumed you'd want to round up to the nearest power of two,
but perhaps not.)

> +}
> +
> +static inline uint32_t
>  isl_minify(uint32_t n, uint32_t levels)
>  {
> if (unlikely(n == 0))
> --
> 2.9.3
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] isl: round format alignment to nearest power of 2

2016-08-23 Thread Jason Ekstrand
This seems perfectly reasonable.  I don't think rounding up to a power of
two can hurt.

Reviewed-by: Jason Ekstrand 

On Mon, Aug 22, 2016 at 10:46 AM, Lionel Landwerlin <
lionel.g.landwer...@intel.com> wrote:

> The one I actually wanted to Cc you on :)
>
>
> On 20/08/16 00:44, Lionel Landwerlin wrote:
>
>> A few inline asserts in anv assume alignments are power of 2, but with
>> formats like R8G8B8 we have odd alignments.
>>
>> Signed-off-by: Lionel Landwerlin 
>> ---
>>   src/intel/isl/isl.c  | 1 +
>>   src/intel/isl/isl_priv.h | 6 ++
>>   2 files changed, 7 insertions(+)
>>
>> diff --git a/src/intel/isl/isl.c b/src/intel/isl/isl.c
>> index 18e95e2..dfe0fc1 100644
>> --- a/src/intel/isl/isl.c
>> +++ b/src/intel/isl/isl.c
>> @@ -1182,6 +1182,7 @@ isl_surf_init_s(const struct isl_device *dev,
>>   base_alignment = MAX(base_alignment, fmtl->bpb / 8);
>>}
>> }
>> +  base_alignment = isl_round_to_power_of_two(base_alignment);
>>  } else {
>> assert(phys_slice0_sa.w % fmtl->bw == 0);
>> const uint32_t total_w_el = phys_slice0_sa.width / fmtl->bw;
>> diff --git a/src/intel/isl/isl_priv.h b/src/intel/isl/isl_priv.h
>> index 3a7af1a..cc9991c 100644
>> --- a/src/intel/isl/isl_priv.h
>> +++ b/src/intel/isl/isl_priv.h
>> @@ -99,6 +99,12 @@ isl_log2u(uint32_t n)
>>   }
>> static inline uint32_t
>> +isl_round_to_power_of_two(uint32_t value)
>> +{
>> +   return 1 << isl_log2u(value);
>> +}
>> +
>> +static inline uint32_t
>>   isl_minify(uint32_t n, uint32_t levels)
>>   {
>>  if (unlikely(n == 0))
>>
>
>
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/3] gallium: add explicit return type to texture opcodes

2016-08-23 Thread Roland Scheidegger
Am 22.08.2016 um 23:13 schrieb Marek Olšák:
> On Mon, Aug 22, 2016 at 10:18 PM, Dave Airlie  wrote:
>> On 23 August 2016 at 06:14, Marek Olšák  wrote:
>>> On Mon, Aug 22, 2016 at 7:05 PM, Brian Paul  wrote:
 On 08/22/2016 08:38 AM, Marek Olšák wrote:
>
> From: Marek Olšák 
>
> Sampler view declarations have return types, but that doesn't work with
> variable indexing (e.g. SAMP[-1+i]).
>
> Adding the return type to the instruction is simpler.
>
> All sampler view declaration flags might have to be removed since variable
> indexing makes them inaccessible.


 Do you want to get rid of sampler view declarations entirely?
>>>
>>> No, they have their place and I think the sampler view declarations
>>> should stay, but the flags that are attached to them are basically
>>> inaccessible with variable indexing. Since I have no reason to add
>>> array support for sampler view declarations (unlike IN,OUT,TEMP, where
>>> we needed array support due to other reasons), I prefer moving the
>>> flags to instructions.
>>
>> It should be illegal in all APIs I thought to do mixed arrays.
>>
>> So if you have 1+x, then the bits on 1 should be the same as the
>> bits on all the other sampler views in the array (even though there is
>> no array).
>>
>> Is there a real world problem being solved?
> 
> No, it's just for TGSI robustness (not to be confused with GL robustness).
> 

I guess I replied to the wrong thread...

In any case, I'd much prefer array support for sampler view dcls
instead. Mixed arrays are illegal everywhere, and it's nice to see that
in the translated shaders. And, even if you might not care on your hw,
someone might benefit from separate arrays (I'm near certain a proper
llvmpipe implementation of variable indexing would).

Roland


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/3] gallium: add a pipe_context parameter to resource_get_handle

2016-08-23 Thread Roland Scheidegger
I admit I'm not familiar with the interopability stuff.
But how is the coherency stuff handled after you get the handle?
Does this also work if you've got multiple users of resource_get_handle?

Roland

Am 22.08.2016 um 16:06 schrieb Marek Olšák:
> From: Marek Olšák 
> 
> radeonsi needs to do some operations (DCC decompression) for OpenGL-OpenCL
> interop and this is the only way to make it coherent with the current
> context. It can optionally be set to NULL.
> ---
>  src/gallium/auxiliary/util/u_transfer.c   |  1 +
>  src/gallium/auxiliary/util/u_transfer.h   |  1 +
>  src/gallium/auxiliary/vl/vl_winsys_dri3.c |  2 +-
>  src/gallium/drivers/ddebug/dd_screen.c|  4 +++-
>  src/gallium/drivers/ilo/ilo_resource.c|  1 +
>  src/gallium/drivers/llvmpipe/lp_texture.c |  1 +
>  src/gallium/drivers/noop/noop_pipe.c  |  1 +
>  src/gallium/drivers/r300/r300_texture.c   |  1 +
>  src/gallium/drivers/r300/r300_texture.h   |  1 +
>  src/gallium/drivers/radeon/r600_texture.c |  1 +
>  src/gallium/drivers/rbug/rbug_screen.c|  5 -
>  src/gallium/drivers/softpipe/sp_texture.c |  1 +
>  src/gallium/drivers/trace/tr_screen.c |  5 -
>  src/gallium/include/pipe/p_screen.h   |  7 +++
>  src/gallium/state_trackers/dri/dri2.c | 13 +++--
>  src/gallium/state_trackers/nine/swapchain9.c  |  3 ++-
>  src/gallium/state_trackers/va/buffer.c|  3 ++-
>  src/gallium/state_trackers/vdpau/output.c |  3 ++-
>  src/gallium/state_trackers/vdpau/surface.c|  3 ++-
>  src/gallium/state_trackers/xa/xa_tracker.c|  3 ++-
>  src/gallium/winsys/sw/wrapper/wrapper_sw_winsys.c |  2 +-
>  21 files changed, 46 insertions(+), 16 deletions(-)
> 
> diff --git a/src/gallium/auxiliary/util/u_transfer.c 
> b/src/gallium/auxiliary/util/u_transfer.c
> index 82cf68d..ba4b9dc 100644
> --- a/src/gallium/auxiliary/util/u_transfer.c
> +++ b/src/gallium/auxiliary/util/u_transfer.c
> @@ -104,20 +104,21 @@ void u_default_transfer_unmap( struct pipe_context 
> *pipe,
>  }
>  
>  
>  static inline struct u_resource *
>  u_resource( struct pipe_resource *res )
>  {
> return (struct u_resource *)res;
>  }
>  
>  boolean u_resource_get_handle_vtbl(struct pipe_screen *screen,
> +   struct pipe_context *ctx,
> struct pipe_resource *resource,
> struct winsys_handle *handle,
> unsigned usage)
>  {
> struct u_resource *ur = u_resource(resource);
> return ur->vtbl->resource_get_handle(screen, resource, handle);
>  }
>  
>  void u_resource_destroy_vtbl(struct pipe_screen *screen,
>   struct pipe_resource *resource)
> diff --git a/src/gallium/auxiliary/util/u_transfer.h 
> b/src/gallium/auxiliary/util/u_transfer.h
> index 7f680bc..ab787ab 100644
> --- a/src/gallium/auxiliary/util/u_transfer.h
> +++ b/src/gallium/auxiliary/util/u_transfer.h
> @@ -66,20 +66,21 @@ struct u_resource_vtbl {
>  };
>  
>  
>  struct u_resource {
> struct pipe_resource b;
> const struct u_resource_vtbl *vtbl;
>  };
>  
>  
>  boolean u_resource_get_handle_vtbl(struct pipe_screen *screen,
> +   struct pipe_context *ctx,
> struct pipe_resource *resource,
> struct winsys_handle *handle,
> unsigned usage);
>  
>  void u_resource_destroy_vtbl(struct pipe_screen *screen,
>   struct pipe_resource *resource);
>  
>  void *u_transfer_map_vtbl(struct pipe_context *context,
>struct pipe_resource *resource,
>unsigned level,
> diff --git a/src/gallium/auxiliary/vl/vl_winsys_dri3.c 
> b/src/gallium/auxiliary/vl/vl_winsys_dri3.c
> index 61d6205..3d596a6 100644
> --- a/src/gallium/auxiliary/vl/vl_winsys_dri3.c
> +++ b/src/gallium/auxiliary/vl/vl_winsys_dri3.c
> @@ -236,21 +236,21 @@ dri3_alloc_back_buffer(struct vl_dri3_screen *scrn)
> templ.depth0 = 1;
> templ.array_size = 1;
> buffer->texture = scrn->base.pscreen->resource_create(scrn->base.pscreen,
>   );
> if (!buffer->texture)
>goto unmap_shm;
>  
> memset(, 0, sizeof(whandle));
> whandle.type= DRM_API_HANDLE_TYPE_FD;
> usage = PIPE_HANDLE_USAGE_EXPLICIT_FLUSH | PIPE_HANDLE_USAGE_READ;
> -   scrn->base.pscreen->resource_get_handle(scrn->base.pscreen,
> +   scrn->base.pscreen->resource_get_handle(scrn->base.pscreen, NULL,
> buffer->texture, ,
> usage);
> buffer_fd = whandle.handle;
> buffer->pitch = whandle.stride;
> 

Re: [Mesa-dev] [PATCH 1/3] gallium: add explicit return type to texture opcodes

2016-08-23 Thread Roland Scheidegger
I don't really like this - I'd like to keep d3d10 semantics unless
there's a good reason to change it.
And I don't see such a reason - even with variable indexing you cannot
use mixed int/uint/float textures since texturing results with using the
wrong type of sampler are undefined anyway. Hence you cannot have a
single array of samplers both with int and float types, it has to be
multiple arrays in glsl.
If tgsi can't have multiple sampler arrays, that should be fixed instead
imho.

Roland


Am 22.08.2016 um 16:06 schrieb Marek Olšák:
> From: Marek Olšák 
> 
> Sampler view declarations have return types, but that doesn't work with
> variable indexing (e.g. SAMP[-1+i]).
> 
> Adding the return type to the instruction is simpler.
> 
> All sampler view declaration flags might have to be removed since variable
> indexing makes them inaccessible.
> ---
>  src/gallium/auxiliary/tgsi/tgsi_ureg.c|  7 ++-
>  src/gallium/auxiliary/tgsi/tgsi_ureg.h| 21 ++---
>  src/gallium/auxiliary/util/u_simple_shaders.c | 32 +++---
>  src/gallium/auxiliary/vl/vl_bicubic_filter.c  |  3 +-
>  src/gallium/auxiliary/vl/vl_compositor.c  | 16 ---
>  src/gallium/auxiliary/vl/vl_deint_filter.c| 39 +++--
>  src/gallium/auxiliary/vl/vl_idct.c|  6 ++-
>  src/gallium/auxiliary/vl/vl_matrix_filter.c   |  3 +-
>  src/gallium/auxiliary/vl/vl_mc.c  |  3 +-
>  src/gallium/auxiliary/vl/vl_median_filter.c   |  3 +-
>  src/gallium/auxiliary/vl/vl_mpeg12_decoder.c  |  3 +-
>  src/gallium/auxiliary/vl/vl_zscan.c   |  9 ++--
>  src/gallium/drivers/freedreno/freedreno_program.c |  4 +-
>  src/gallium/drivers/nouveau/nv50/nv50_surface.c   |  4 +-
>  src/gallium/include/pipe/p_shader_tokens.h|  3 +-
>  src/gallium/state_trackers/nine/nine_ff.c |  6 +--
>  src/gallium/state_trackers/nine/nine_shader.c | 53 
> +++
>  src/gallium/state_trackers/xa/xa_tgsi.c   | 21 +
>  src/mesa/state_tracker/st_atifs_to_tgsi.c |  1 +
>  src/mesa/state_tracker/st_cb_drawpixels.c |  6 ++-
>  src/mesa/state_tracker/st_glsl_to_tgsi.cpp| 35 +--
>  src/mesa/state_tracker/st_mesa_to_tgsi.c  |  1 +
>  src/mesa/state_tracker/st_pbo.c   |  8 +++-
>  23 files changed, 180 insertions(+), 107 deletions(-)
> 
> diff --git a/src/gallium/auxiliary/tgsi/tgsi_ureg.c 
> b/src/gallium/auxiliary/tgsi/tgsi_ureg.c
> index b67c383..b82a2ea 100644
> --- a/src/gallium/auxiliary/tgsi/tgsi_ureg.c
> +++ b/src/gallium/auxiliary/tgsi/tgsi_ureg.c
> @@ -1207,32 +1207,33 @@ ureg_fixup_label(struct ureg_program *ureg,
>  {
> union tgsi_any_token *out = retrieve_token( ureg, DOMAIN_INSN, 
> label_token );
>  
> out->insn_label.Label = instruction_number;
>  }
>  
>  
>  void
>  ureg_emit_texture(struct ureg_program *ureg,
>unsigned extended_token,
> -  unsigned target, unsigned num_offsets)
> +  unsigned target, unsigned num_offsets, unsigned 
> return_type)
>  {
> union tgsi_any_token *out, *insn;
>  
> out = get_tokens( ureg, DOMAIN_INSN, 1 );
> insn = retrieve_token( ureg, DOMAIN_INSN, extended_token );
>  
> insn->insn.Texture = 1;
>  
> out[0].value = 0;
> out[0].insn_texture.Texture = target;
> out[0].insn_texture.NumOffsets = num_offsets;
> +   out[0].insn_texture.ReturnType = return_type;
>  }
>  
>  void
>  ureg_emit_texture_offset(struct ureg_program *ureg,
>   const struct tgsi_texture_offset *offset)
>  {
> union tgsi_any_token *out;
>  
> out = get_tokens( ureg, DOMAIN_INSN, 1);
>  
> @@ -1321,20 +1322,21 @@ ureg_insn(struct ureg_program *ureg,
>  
> ureg_fixup_insn_size( ureg, insn.insn_token );
>  }
>  
>  void
>  ureg_tex_insn(struct ureg_program *ureg,
>unsigned opcode,
>const struct ureg_dst *dst,
>unsigned nr_dst,
>unsigned target,
> +  unsigned return_type,
>const struct tgsi_texture_offset *texoffsets,
>unsigned nr_offset,
>const struct ureg_src *src,
>unsigned nr_src )
>  {
> struct ureg_emit_insn_result insn;
> unsigned i;
> boolean saturate;
> boolean predicate;
> boolean negate = FALSE;
> @@ -1359,21 +1361,22 @@ ureg_tex_insn(struct ureg_program *ureg,
>   saturate,
>   predicate,
>   negate,
>   swizzle[0],
>   swizzle[1],
>   swizzle[2],
>   swizzle[3],
>   nr_dst,
>   nr_src);
>  
> -   ureg_emit_texture( ureg, insn.extended_token, target, nr_offset );
> +   ureg_emit_texture(ureg, insn.extended_token, target, nr_offset,
> + 

Re: [Mesa-dev] [PATCH] isl/formats: Integer formats are not filterable

2016-08-23 Thread Jason Ekstrand
On Tue, Aug 23, 2016 at 2:18 PM, Nanley Chery  wrote:

> On Tue, Aug 23, 2016 at 02:13:26PM -0700, Jason Ekstrand wrote:
> > In ca2a8e56285, we updated the format table to add more formats (most of
> > which are new on SKL) but accidentally marked some integer formats as
> > filterable.  You can't filter an integer format.
> >
> > Signed-off-by: Jason Ekstrand 
>
> This patch is
>
> Reviewed-by: Nanley Chery 
>

Thanks for catching my mistake and for the quick review!

--Jason


> > ---
> >  src/intel/isl/isl_format.c | 16 
> >  1 file changed, 8 insertions(+), 8 deletions(-)
> >
> > diff --git a/src/intel/isl/isl_format.c b/src/intel/isl/isl_format.c
> > index bced4ff..8507cc5 100644
> > --- a/src/intel/isl/isl_format.c
> > +++ b/src/intel/isl/isl_format.c
> > @@ -237,10 +237,10 @@ static const struct surface_format_info
> format_info[] = {
> > SF(45, 45,  x,  x,  x,  x,  x,  x,  x,x,   P4A4_UNORM_PALETTE1)
> > SF(45, 45,  x,  x,  x,  x,  x,  x,  x,x,   A4P4_UNORM_PALETTE1)
> > SF( x,  x,  x,  x,  x,  x,  x,  x,  x,x,   Y8_UNORM)
> > -   SF(90, 90,  x,  x,  x,  x,  x,  x,  x,x,   L8_UINT)
> > -   SF(90, 90,  x,  x,  x,  x,  x,  x,  x,x,   L8_SINT)
> > -   SF(90, 90,  x,  x,  x,  x,  x,  x,  x,x,   I8_UINT)
> > -   SF(90, 90,  x,  x,  x,  x,  x,  x,  x,x,   I8_SINT)
> > +   SF(90,  x,  x,  x,  x,  x,  x,  x,  x,x,   L8_UINT)
> > +   SF(90,  x,  x,  x,  x,  x,  x,  x,  x,x,   L8_SINT)
> > +   SF(90,  x,  x,  x,  x,  x,  x,  x,  x,x,   I8_UINT)
> > +   SF(90,  x,  x,  x,  x,  x,  x,  x,  x,x,   I8_SINT)
> > SF(45, 45,  x,  x,  x,  x,  x,  x,  x,x,   DXT1_RGB_SRGB)
> > SF( Y,  Y,  x,  x,  x,  x,  x,  x,  x,x,   R1_UNORM)
> > SF( Y,  Y,  x,  Y,  Y,  x,  x,  x, 60,x,   YCRCB_NORMAL)
> > @@ -287,8 +287,8 @@ static const struct surface_format_info
> format_info[] = {
> > SF(80, 80,  x,  x,  x,  x,  x,  x,  x,x,   EAC_SIGNED_R11)
> > SF(80, 80,  x,  x,  x,  x,  x,  x,  x,x,   EAC_SIGNED_RG11)
> > SF(80, 80,  x,  x,  x,  x,  x,  x,  x,x,   ETC2_SRGB8)
> > -   SF(90, 90,  x,  x,  x,  x, 75,  x,  x,x,   R16G16B16_UINT)
> > -   SF(90, 90,  x,  x,  x,  x, 75,  x,  x,x,   R16G16B16_SINT)
> > +   SF(90,  x,  x,  x,  x,  x, 75,  x,  x,x,   R16G16B16_UINT)
> > +   SF(90,  x,  x,  x,  x,  x, 75,  x,  x,x,   R16G16B16_SINT)
> > SF( x,  x,  x,  x,  x,  x, 75,  x,  x,x,   R32_SFIXED)
> > SF( x,  x,  x,  x,  x,  x, 75,  x,  x,x,   R10G10B10A2_SNORM)
> > SF( x,  x,  x,  x,  x,  x, 75,  x,  x,x,   R10G10B10A2_USCALED)
> > @@ -305,8 +305,8 @@ static const struct surface_format_info
> format_info[] = {
> > SF(80, 80,  x,  x,  x,  x,  x,  x,  x,x,   ETC2_SRGB8_PTA)
> > SF(80, 80,  x,  x,  x,  x,  x,  x,  x,x,   ETC2_EAC_RGBA8)
> > SF(80, 80,  x,  x,  x,  x,  x,  x,  x,x,   ETC2_EAC_SRGB8_A8)
> > -   SF(90, 90,  x,  x,  x,  x, 75,  x,  x,x,   R8G8B8_UINT)
> > -   SF(90, 90,  x,  x,  x,  x, 75,  x,  x,x,   R8G8B8_SINT)
> > +   SF(90,  x,  x,  x,  x,  x, 75,  x,  x,x,   R8G8B8_UINT)
> > +   SF(90,  x,  x,  x,  x,  x, 75,  x,  x,x,   R8G8B8_SINT)
> > SF(80, 80,  x,  x,  x,  x,  x,  x,  x,x,   ASTC_LDR_2D_4X4_FLT16)
> > SF(80, 80,  x,  x,  x,  x,  x,  x,  x,x,   ASTC_LDR_2D_5X4_FLT16)
> > SF(80, 80,  x,  x,  x,  x,  x,  x,  x,x,   ASTC_LDR_2D_5X5_FLT16)
> > --
> > 2.5.0.400.gff86faf
> >
> > ___
> > mesa-dev mailing list
> > mesa-dev@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 17/31] i965/blorp: Move the guts of brw_blorp_exec into genX_blorp_exec.c

2016-08-23 Thread Jason Ekstrand
On Tue, Aug 23, 2016 at 8:50 AM, Pohjolainen, Topi <
topi.pohjolai...@gmail.com> wrote:

> On Fri, Aug 19, 2016 at 09:55:54AM -0700, Jason Ekstrand wrote:
> > ---
> >  src/mesa/drivers/dri/i965/blorp.c   | 66
> -
> >  src/mesa/drivers/dri/i965/genX_blorp_exec.c | 66
> +
> >  src/mesa/drivers/dri/i965/genX_blorp_exec.h |  8 ++--
> >  3 files changed, 70 insertions(+), 70 deletions(-)
> >
> > diff --git a/src/mesa/drivers/dri/i965/blorp.c
> b/src/mesa/drivers/dri/i965/blorp.c
> > index dba3441..0688f6b 100644
> > --- a/src/mesa/drivers/dri/i965/blorp.c
> > +++ b/src/mesa/drivers/dri/i965/blorp.c
> > @@ -347,28 +347,6 @@ brw_blorp_compile_nir_shader(struct brw_context
> *brw, struct nir_shader *nir,
> >  void
> >  brw_blorp_exec(struct brw_context *brw, const struct brw_blorp_params
> *params)
> >  {
> > -   struct gl_context *ctx = >ctx;
> > -   const uint32_t estimated_max_batch_usage = brw->gen >= 8 ? 1800 :
> 1500;
> > -   bool check_aperture_failed_once = false;
> > -
> > -   /* Flush the sampler and render caches.  We definitely need to flush
> the
> > -* sampler cache so that we get updated contents from the render
> cache for
> > -* the glBlitFramebuffer() source.  Also, we are sometimes warned in
> the
> > -* docs to flush the cache between reinterpretations of the same
> surface
> > -* data with different formats, which blorp does for stencil and
> depth
> > -* data.
> > -*/
> > -   brw_emit_mi_flush(brw);
> > -
> > -   brw_select_pipeline(brw, BRW_RENDER_PIPELINE);
> > -
> > -retry:
> > -   intel_batchbuffer_require_space(brw, estimated_max_batch_usage,
> RENDER_RING);
> > -   intel_batchbuffer_save_state(brw);
> > -   drm_intel_bo *saved_bo = brw->batch.bo;
> > -   uint32_t saved_used = USED_BATCH(brw->batch);
> > -   uint32_t saved_state_batch_offset = brw->batch.state_batch_offset;
> > -
> > switch (brw->gen) {
> > case 6:
> >gen6_blorp_exec(brw, params);
> > @@ -389,50 +367,6 @@ retry:
> >/* BLORP is not supported before Gen6. */
> >unreachable("not reached");
> > }
> > -
> > -   /* Make sure we didn't wrap the batch unintentionally, and make sure
> we
> > -* reserved enough space that a wrap will never happen.
> > -*/
> > -   assert(brw->batch.bo == saved_bo);
> > -   assert((USED_BATCH(brw->batch) - saved_used) * 4 +
> > -  (saved_state_batch_offset - brw->batch.state_batch_offset) <
> > -  estimated_max_batch_usage);
> > -   /* Shut up compiler warnings on release build */
> > -   (void)saved_bo;
> > -   (void)saved_used;
> > -   (void)saved_state_batch_offset;
> > -
> > -   /* Check if the blorp op we just did would make our batch likely to
> fail to
> > -* map all the BOs into the GPU at batch exec time later.  If so,
> flush the
> > -* batch and try again with nothing else in the batch.
> > -*/
> > -   if (dri_bufmgr_check_aperture_space(>batch.bo, 1)) {
> > -  if (!check_aperture_failed_once) {
> > - check_aperture_failed_once = true;
> > - intel_batchbuffer_reset_to_saved(brw);
> > - intel_batchbuffer_flush(brw);
> > - goto retry;
> > -  } else {
> > - int ret = intel_batchbuffer_flush(brw);
> > - WARN_ONCE(ret == -ENOSPC,
> > -   "i965: blorp emit exceeded available aperture
> space\n");
> > -  }
> > -   }
> > -
> > -   if (unlikely(brw->always_flush_batch))
> > -  intel_batchbuffer_flush(brw);
> > -
> > -   /* We've smashed all state compared to what the normal 3D pipeline
> > -* rendering tracks for GL.
> > -*/
> > -   brw->ctx.NewDriverState |= BRW_NEW_BLORP;
> > -   brw->no_depth_or_stencil = false;
> > -   brw->ib.type = -1;
> > -
> > -   /* Flush the sampler cache so any texturing from the destination is
> > -* coherent.
> > -*/
> > -   brw_emit_mi_flush(brw);
> >  }
> >
> >  void
> > diff --git a/src/mesa/drivers/dri/i965/genX_blorp_exec.c
> b/src/mesa/drivers/dri/i965/genX_blorp_exec.c
> > index e07fa0a..9ba1f8a 100644
> > --- a/src/mesa/drivers/dri/i965/genX_blorp_exec.c
> > +++ b/src/mesa/drivers/dri/i965/genX_blorp_exec.c
> > @@ -170,6 +170,28 @@ void
> >  genX(blorp_exec)(struct brw_context *brw,
> >   const struct brw_blorp_params *params)
> >  {
> > +   struct gl_context *ctx = >ctx;
> > +   const uint32_t estimated_max_batch_usage = GEN_GEN >= 8 ? 1800 :
> 1500;
> > +   bool check_aperture_failed_once = false;
> > +
> > +   /* Flush the sampler and render caches.  We definitely need to flush
> the
> > +* sampler cache so that we get updated contents from the render
> cache for
> > +* the glBlitFramebuffer() source.  Also, we are sometimes warned in
> the
> > +* docs to flush the cache between reinterpretations of the same
> surface
> > +* data with different formats, which blorp does for stencil and
> depth
> > +* data.
> > +*/
> > +   brw_emit_mi_flush(brw);
> > +
> > +   

Re: [Mesa-dev] [PATCH 11/31] i965/blorp/genX: Add a blorp_surface_reloc helper

2016-08-23 Thread Jason Ekstrand
On Tue, Aug 23, 2016 at 7:06 AM, Pohjolainen, Topi <
topi.pohjolai...@gmail.com> wrote:

> On Fri, Aug 19, 2016 at 09:55:48AM -0700, Jason Ekstrand wrote:
> > ---
> >  src/mesa/drivers/dri/i965/genX_blorp_exec.c | 38
> -
> >  1 file changed, 21 insertions(+), 17 deletions(-)
> >
> > diff --git a/src/mesa/drivers/dri/i965/genX_blorp_exec.c
> b/src/mesa/drivers/dri/i965/genX_blorp_exec.c
> > index 32a7445..f226255 100644
> > --- a/src/mesa/drivers/dri/i965/genX_blorp_exec.c
> > +++ b/src/mesa/drivers/dri/i965/genX_blorp_exec.c
> > @@ -61,6 +61,23 @@ blorp_emit_reloc(struct brw_context *brw, void
> *location,
> > }
> >  }
> >
> > +static void
> > +blorp_surface_reloc(struct brw_context *brw, uint32_t ss_offset,
> > +struct blorp_address address, uint32_t delta)
> > +{
> > +   drm_intel_bo_emit_reloc(brw->batch.bo, ss_offset,
> > +   address.buffer, address.offset + delta,
> > +   address.read_domains, address.write_domain);
> > +
> > +   uint64_t reloc_val = address.buffer->offset64 + address.offset +
> delta;
> > +   void *reloc_ptr = (void *)brw->batch.map + ss_offset;
> > +#if GEN_GEN >= 8
> > +   *(uint64_t *)reloc_ptr = reloc_val;
> > +#else
> > +   *(uint32_t *)reloc_ptr = reloc_val;
> > +#endif
> > +}
> > +
> >  static void *
> >  blorp_alloc_dynamic_state(struct blorp_context *blorp,
> >enum aub_state_struct_type type,
> > @@ -951,24 +968,15 @@ blorp_emit_surface_state(struct brw_context *brw,
> >
> > const uint32_t mocs =
> >is_render_target ? brw->blorp.mocs.rb : brw->blorp.mocs.tex;
> > -   uint64_t aux_bo_offset =
> > -  surface->aux_addr.buffer ? surface->aux_addr.buffer->offset64 :
> 0;
> >
> > isl_surf_fill_state(>isl_dev, dw, .surf = , .view =
> >view,
> > -   .address = surface->addr.buffer->offset64 +
> surface->addr.offset,
> > .aux_surf = >aux_surf, .aux_usage =
> aux_usage,
> > -   .aux_address = aux_bo_offset +
> surface->aux_addr.offset,
>
> Should you have dropped ".address" and ".aux_address" already in the
> previous
> patch?
> Otherwise the patch makes sense.
>

No.  One of the big changes here is that the blorp_surface_reloc helper
writes the relocated value into the buffer.  Previously, we were depending
on isl to write the value for us and drm_intel_bo_emit_reloc only added the
relocation to the list.

Maybe that needs to go in the commit message?

--Jason


> > .mocs = mocs, .clear_color =
> surface->clear_color,
> > .x_offset_sa = surface->tile_x_sa,
> > .y_offset_sa = surface->tile_y_sa);
> >
> > -   /* Emit relocation to surface contents */
> > -   drm_intel_bo_emit_reloc(brw->batch.bo,
> > -   surf_offset + ss_info.reloc_dw * 4,
> > -   surface->addr.buffer,
> > -   dw[ss_info.reloc_dw] - surface->addr.buffer->
> offset64,
> > -   surface->addr.read_domains,
> > -   surface->addr.write_domain);
> > +   blorp_surface_reloc(brw, surf_offset + ss_info.reloc_dw * 4,
> > +   surface->addr, 0);
> >
> > if (aux_usage != ISL_AUX_USAGE_NONE) {
> >/* On gen7 and prior, the bottom 12 bits of the MCS base address
> are
> > @@ -976,12 +984,8 @@ blorp_emit_surface_state(struct brw_context *brw,
> > * surface buffer addresses are always 4K page alinged.
> > */
> >assert((surface->aux_addr.offset & 0xfff) == 0);
> > -  drm_intel_bo_emit_reloc(brw->batch.bo,
> > -  surf_offset + ss_info.aux_reloc_dw * 4,
> > -  surface->aux_addr.buffer,
> > -  dw[ss_info.aux_reloc_dw] & 0xfff,
> > -  surface->aux_addr.read_domains,
> > -  surface->aux_addr.write_domain);
> > +  blorp_surface_reloc(brw, surf_offset + ss_info.aux_reloc_dw * 4,
> > +  surface->aux_addr, dw[ss_info.aux_reloc_dw]);
> > }
> >
> > return surf_offset;
> > --
> > 2.5.0.400.gff86faf
> >
> > ___
> > mesa-dev mailing list
> > mesa-dev@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 18/31] i965/blorp: Add an "exec" function pointer to blorp_context

2016-08-23 Thread Jason Ekstrand
On Tue, Aug 23, 2016 at 9:00 AM, Pohjolainen, Topi <
topi.pohjolai...@gmail.com> wrote:

> On Fri, Aug 19, 2016 at 09:55:55AM -0700, Jason Ekstrand wrote:
> > ---
> >  src/mesa/drivers/dri/i965/blorp.c   | 27
> +--
> >  src/mesa/drivers/dri/i965/blorp.h   |  4 
> >  src/mesa/drivers/dri/i965/blorp_blit.c  |  2 +-
> >  src/mesa/drivers/dri/i965/blorp_clear.c |  6 +++---
> >  src/mesa/drivers/dri/i965/blorp_priv.h  | 21 -
> >  src/mesa/drivers/dri/i965/brw_blorp.c   | 19 +++
> >  src/mesa/drivers/dri/i965/genX_blorp_exec.c |  9 +++--
> >  7 files changed, 35 insertions(+), 53 deletions(-)
> >
> > diff --git a/src/mesa/drivers/dri/i965/blorp.c
> b/src/mesa/drivers/dri/i965/blorp.c
> > index 0688f6b..df10e50 100644
> > --- a/src/mesa/drivers/dri/i965/blorp.c
> > +++ b/src/mesa/drivers/dri/i965/blorp.c
> > @@ -345,31 +345,6 @@ brw_blorp_compile_nir_shader(struct brw_context
> *brw, struct nir_shader *nir,
> >  }
> >
> >  void
> > -brw_blorp_exec(struct brw_context *brw, const struct brw_blorp_params
> *params)
> > -{
> > -   switch (brw->gen) {
> > -   case 6:
> > -  gen6_blorp_exec(brw, params);
> > -  break;
> > -   case 7:
> > -  if (brw->is_haswell)
> > - gen75_blorp_exec(brw, params);
> > -  else
> > - gen7_blorp_exec(brw, params);
> > -  break;
> > -   case 8:
> > -  gen8_blorp_exec(brw, params);
> > -  break;
> > -   case 9:
> > -  gen9_blorp_exec(brw, params);
> > -  break;
> > -   default:
> > -  /* BLORP is not supported before Gen6. */
> > -  unreachable("not reached");
> > -   }
> > -}
> > -
> > -void
> >  blorp_gen6_hiz_op(struct brw_context *brw, struct brw_blorp_surf *surf,
> >unsigned level, unsigned layer, enum gen6_hiz_op op)
> >  {
> > @@ -436,5 +411,5 @@ blorp_gen6_hiz_op(struct brw_context *brw, struct
> brw_blorp_surf *surf,
> >unreachable("not reached");
> > }
> >
> > -   brw_blorp_exec(brw, );
> > +   brw->blorp.exec(>blorp, brw, );
> >  }
> > diff --git a/src/mesa/drivers/dri/i965/blorp.h
> b/src/mesa/drivers/dri/i965/blorp.h
> > index a9ef754..671731e 100644
> > --- a/src/mesa/drivers/dri/i965/blorp.h
> > +++ b/src/mesa/drivers/dri/i965/blorp.h
> > @@ -39,6 +39,8 @@ struct hash_table;
> >  extern "C" {
> >  #endif
> >
> > +struct brw_blorp_params;
> > +
> >  struct blorp_context {
> > void *driver_ctx;
> >
> > @@ -56,6 +58,8 @@ struct blorp_context {
> >
> > uint32_t (*upload_shader)(struct blorp_context *,
> >   const void *data, uint32_t size);
> > +   void (*exec)(struct blorp_context *blorp, void *batch,
> > +const struct brw_blorp_params *params);
> >  };
> >
> >  void blorp_init(struct blorp_context *blorp, void *driver_ctx,
> > diff --git a/src/mesa/drivers/dri/i965/blorp_blit.c
> b/src/mesa/drivers/dri/i965/blorp_blit.c
> > index d01dfff..449e09d 100644
> > --- a/src/mesa/drivers/dri/i965/blorp_blit.c
> > +++ b/src/mesa/drivers/dri/i965/blorp_blit.c
> > @@ -1647,5 +1647,5 @@ brw_blorp_blit(struct brw_context *brw,
> >   swizzle_to_scs(GET_SWZ(src_swizzle, i));
> > }
> >
> > -   brw_blorp_exec(brw, );
> > +   brw->blorp.exec(>blorp, brw, );
> >  }
> > diff --git a/src/mesa/drivers/dri/i965/blorp_clear.c
> b/src/mesa/drivers/dri/i965/blorp_clear.c
> > index fb4d050..3b6d6d7 100644
> > --- a/src/mesa/drivers/dri/i965/blorp_clear.c
> > +++ b/src/mesa/drivers/dri/i965/blorp_clear.c
> > @@ -110,7 +110,7 @@ blorp_fast_clear(struct brw_context *brw, const
> struct brw_blorp_surf *surf,
> > brw_blorp_surface_info_init(brw, , surf, level, layer,
> > surf->surf->format, true);
> >
> > -   brw_blorp_exec(brw, );
> > +   brw->blorp.exec(>blorp, brw, );
> >  }
> >
> >
> > @@ -156,7 +156,7 @@ blorp_clear(struct brw_context *brw, const struct
> brw_blorp_surf *surf,
> > brw_blorp_surface_info_init(brw, , surf, level, layer,
> > format, true);
> >
> > -   brw_blorp_exec(brw, );
> > +   brw->blorp.exec(>blorp, brw, );
> >  }
> >
> >  void
> > @@ -186,5 +186,5 @@ brw_blorp_ccs_resolve(struct brw_context *brw,
> struct brw_blorp_surf *surf,
> >
> > brw_blorp_params_get_clear_kernel(brw, , true);
> >
> > -   brw_blorp_exec(brw, );
> > +   brw->blorp.exec(>blorp, brw, );
> >  }
> > diff --git a/src/mesa/drivers/dri/i965/blorp_priv.h
> b/src/mesa/drivers/dri/i965/blorp_priv.h
> > index 9b987a8..c7a2a03 100644
> > --- a/src/mesa/drivers/dri/i965/blorp_priv.h
> > +++ b/src/mesa/drivers/dri/i965/blorp_priv.h
> > @@ -183,27 +183,6 @@ struct brw_blorp_params
> >  void
> >  brw_blorp_params_init(struct brw_blorp_params *params);
> >
> > -void
> > -brw_blorp_exec(struct brw_context *brw, const struct brw_blorp_params
> *params);
> > -
> > -void
> > -gen6_blorp_exec(struct brw_context *brw,
> > -const struct brw_blorp_params *params);
> > -
> > -void
> > 

Re: [Mesa-dev] [PATCH v2] i965/vec4: make offset() operate in terms of channels instead of full registers

2016-08-23 Thread Michael Schellenberger Costa
Hi Iago,

given that the idea here was to unify vec4 and fs you might want to
adopt the names/function types accordingly.

In brw_ir_fs.h there is byte_offset that returns a fs_reg while you have
void add_byte_offset.

--Michael

Am 23.08.2016 um 10:24 schrieb Iago Toral Quiroga:
> This will make it more consistent with the FS implementation of the same
> helper and will provide more flexibility that will come in handy, for
> example, when we add a SIMD lowering pass in the vec4 backend.
> 
> v2:
>  - Move the switch statement to add_byte_offset (Iago)
>  - Remove the assert on the register file, it is redundant with the switch
>(Michael Schellenberger)
>  - Fix use of '=' instead of '==' (Michael Schellenberger,
>Francesco Ansanelli)
> ---
>  src/mesa/drivers/dri/i965/brw_ir_vec4.h | 36 
> +
>  1 file changed, 28 insertions(+), 8 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_ir_vec4.h 
> b/src/mesa/drivers/dri/i965/brw_ir_vec4.h
> index 81b6a13..d55b522 100644
> --- a/src/mesa/drivers/dri/i965/brw_ir_vec4.h
> +++ b/src/mesa/drivers/dri/i965/brw_ir_vec4.h
> @@ -60,12 +60,33 @@ retype(src_reg reg, enum brw_reg_type type)
> return reg;
>  }
>  
> +static inline void
> +add_byte_offset(backend_reg *reg, unsigned delta)
> +{
> +   switch (reg->file) {
> +   case BAD_FILE:
> +  break;
> +   case MRF:
> +   case VGRF:
> +   case ATTR:
> +   case UNIFORM: {
> +  const unsigned suboffset = reg->subreg_offset + delta;
> +  reg->reg_offset += suboffset / REG_SIZE;
> +  reg->subreg_offset += suboffset % REG_SIZE;
> +  /* Align16 requires that register accesses are 16-byte aligned */
> +  assert(reg->subreg_offset % 16 == 0);
> +  break;
> +   }
> +   default:
> +  assert(delta == 0);
> +   }
> +}
> +
>  static inline src_reg
> -offset(src_reg reg, unsigned delta)
> +offset(src_reg reg, unsigned width, unsigned delta)
>  {
> -   assert(delta == 0 ||
> -  (reg.file != ARF && reg.file != FIXED_GRF && reg.file != IMM));
> -   reg.reg_offset += delta;
> +   unsigned byte_offset = delta * width * type_sz(reg.type);
> +   add_byte_offset(, byte_offset);
> return reg;
>  }
>  
> @@ -130,11 +151,10 @@ retype(dst_reg reg, enum brw_reg_type type)
>  }
>  
>  static inline dst_reg
> -offset(dst_reg reg, unsigned delta)
> +offset(dst_reg reg, unsigned width, unsigned delta)
>  {
> -   assert(delta == 0 ||
> -  (reg.file != ARF && reg.file != FIXED_GRF && reg.file != IMM));
> -   reg.reg_offset += delta;
> +   unsigned byte_offset = delta * width * type_sz(reg.type);
> +   add_byte_offset(, byte_offset);
> return reg;
>  }
>  
> 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 15/31] i965/blorp/exec: Refactor to use blorp_context and a void *batch

2016-08-23 Thread Jason Ekstrand
On Tue, Aug 23, 2016 at 8:33 AM, Pohjolainen, Topi <
topi.pohjolai...@gmail.com> wrote:

> On Fri, Aug 19, 2016 at 09:55:52AM -0700, Jason Ekstrand wrote:
> > This gets rid of brw_context throughout the core of the state setup code.
> > ---
> >  src/mesa/drivers/dri/i965/genX_blorp_exec.c | 332
> +++-
> >  1 file changed, 182 insertions(+), 150 deletions(-)
> >
> > diff --git a/src/mesa/drivers/dri/i965/genX_blorp_exec.c
> b/src/mesa/drivers/dri/i965/genX_blorp_exec.c
> > index 288e384..8c15b16 100644
> > --- a/src/mesa/drivers/dri/i965/genX_blorp_exec.c
> > +++ b/src/mesa/drivers/dri/i965/genX_blorp_exec.c
> > @@ -34,8 +34,11 @@
> >  #include "genxml/gen_macros.h"
> >
> >  static void *
> > -blorp_emit_dwords(struct brw_context *brw, unsigned n)
> > +blorp_emit_dwords(struct blorp_context *blorp, void *batch, unsigned n)
> >  {
> > +   assert(blorp->driver_ctx == batch);
> > +   struct brw_context *brw = batch;
> > +
> > intel_batchbuffer_begin(brw, n, RENDER_RING);
> > uint32_t *map = brw->batch.map_next;
> > brw->batch.map_next += n;
> > @@ -44,9 +47,12 @@ blorp_emit_dwords(struct brw_context *brw, unsigned n)
> >  }
> >
> >  static uint64_t
> > -blorp_emit_reloc(struct brw_context *brw, void *location,
> > - struct blorp_address address, uint32_t delta)
> > +blorp_emit_reloc(struct blorp_context *blorp, void *batch,
> > + void *location, struct blorp_address address, uint32_t
> delta)
> >  {
> > +   assert(blorp->driver_ctx == batch);
> > +   struct brw_context *brw = batch;
> > +
> > uint32_t offset = (char *)location - (char *)brw->batch.map;
> > if (brw->gen >= 8) {
> >return intel_batchbuffer_reloc64(brw, address.buffer, offset,
> > @@ -62,9 +68,11 @@ blorp_emit_reloc(struct brw_context *brw, void
> *location,
> >  }
> >
> >  static void
> > -blorp_surface_reloc(struct brw_context *brw, uint32_t ss_offset,
> > +blorp_surface_reloc(struct blorp_context *blorp, uint32_t ss_offset,
> >  struct blorp_address address, uint32_t delta)
> >  {
> > +   struct brw_context *brw = blorp->driver_ctx;
> > +
> > drm_intel_bo_emit_reloc(brw->batch.bo, ss_offset,
> > address.buffer, address.offset + delta,
> > address.read_domains, address.write_domain);
> > @@ -129,8 +137,12 @@ blorp_alloc_vertex_buffer(struct blorp_context
> *blorp, uint32_t size,
> >  }
> >
> >  static void
> > -blorp_emit_urb_config(struct brw_context *brw, unsigned vs_entry_size)
> > +blorp_emit_urb_config(struct blorp_context *blorp, void *batch,
> > +  unsigned vs_entry_size)
> >  {
> > +   assert(blorp->driver_ctx == batch);
> > +   struct brw_context *brw = batch;
> > +
> >  #if GEN_GEN >= 7
> > if (!(brw->ctx.NewDriverState & (BRW_NEW_CONTEXT |
> BRW_NEW_URB_SIZE)) &&
> > brw->urb.vsize >= vs_entry_size)
> > @@ -145,26 +157,34 @@ blorp_emit_urb_config(struct brw_context *brw,
> unsigned vs_entry_size)
> >  }
> >
> >  static void
> > -blorp_emit_3dstate_multisample(struct brw_context *brw, unsigned
> samples)
> > +blorp_emit_3dstate_multisample(struct blorp_context *blorp, void
> *batch,
> > +   unsigned samples)
> >  {
> > +   assert(blorp->driver_ctx == batch);
> >  #if GEN_GEN >= 8
> > -   gen8_emit_3dstate_multisample(brw, samples);
> > +   gen8_emit_3dstate_multisample(batch, samples);
> >  #else
> > -   gen6_emit_3dstate_multisample(brw, samples);
> > +   gen6_emit_3dstate_multisample(batch, samples);
> >  #endif
> >  }
> >
> > +struct blorp_batch {
> > +   struct blorp_context *blorp;
> > +   void *batch;
> > +};
> > +
> >  #define __gen_address_type struct blorp_address
> > -#define __gen_user_data struct brw_context
> > +#define __gen_user_data struct blorp_batch
> >
> >  static uint64_t
> > -__gen_combine_address(struct brw_context *brw, void *location,
> > +__gen_combine_address(struct blorp_batch *batch, void *location,
> >struct blorp_address address, uint32_t delta)
> >  {
> > if (address.buffer == NULL) {
> >return address.offset + delta;
> > } else {
> > -  return blorp_emit_reloc(brw, location, address, delta);
> > +  return blorp_emit_reloc(batch->blorp, batch->batch,
> > +  location, address, delta);
> > }
> >  }
> >
> > @@ -175,21 +195,22 @@ __gen_combine_address(struct brw_context *brw,
> void *location,
> >  #define _blorp_cmd_header(cmd) cmd ## _header
> >  #define _blorp_cmd_pack(cmd) cmd ## _pack
> >
> > -#define blorp_emit(brw, cmd, name)\
> > -   for (struct cmd name = { _blorp_cmd_header(cmd) }, \
> > -*_dst = blorp_emit_dwords(brw, _blorp_cmd_length(cmd));   \
> > -__builtin_expect(_dst != NULL, 1);\
> > -_blorp_cmd_pack(cmd)(brw, (void *)_dst, ),   \
> > +#define blorp_emit(batch, cmd, name)   

Re: [Mesa-dev] [PATCH v1 00/13] Implement sw_sync test

2016-08-23 Thread Eric Engestrom
On Tue, Aug 23, 2016 at 01:56:02PM -0400, robert.f...@collabora.com wrote:
> From: Robert Foss 
> 
> This series implements the sw_sync test and the lib/sw_sync helper functions
> for said test.
> 
> Gustavo Padovans sw_sync series was just de-staged in
> gregkh-staging/staging-next [1], and this test is targeted at verifying the
> functionality implemented in that series.
> 
> The sw_sync subtests range from very basic tests of the sw_sync functionality,
> to stress testing and randomized tests.
> 
> [1] http://git.kernel.org/cgit/linux/kernel/git/gregkh/staging.git/
> 
> Robert Foss (13):
>   lib/sw_sync: Add helper functions for managing synchronization
> primitives
>   tests/sw_sync: Add sw_sync test
>   tests/sw_sync: Add subtest test_alloc_fence
>   tests/sw_sync: Add subtest test_alloc_fence_invalid_timeline
>   tests/sw_sync: Add subtest test_alloc_merge_fence
>   tests/sw_sync: Add subtest test_sync_wait
>   tests/sw_sync: Add subtest test_sync_merge
>   tests/sw_sync: Add subtest test_sync_merge_same
>   tests/sw_sync: Add subtest test_sync_multi_consumer
>   tests/sw_sync: Add subtest test_sync_multi_consumer_producer
>   tests/sw_sync: Add subtest test_sync_random_merge
>   tests/sw_sync: Add subtest test_sync_multi_timeline_wait
>   tests/sw_sync: Add subtest test_sync_multi_producer_single_consumer
> 
>  lib/Makefile.sources   |   2 +
>  lib/sw_sync.c  | 237 +
>  lib/sw_sync.h  |  49 
>  tests/Makefile.sources |   1 +
>  tests/sw_sync.c| 693 
> +
>  5 files changed, 982 insertions(+)
>  create mode 100644 lib/sw_sync.c
>  create mode 100644 lib/sw_sync.h
>  create mode 100644 tests/sw_sync.c
> 
> -- 
> 2.7.4

Thanks for your work!

I sent some specific comments directly to a few patches, but everything
else looks good to me.

With the issues raised in patches 1 & 2 fixed (and with or without my
suggestions), the whole series is:
Reviewed-by: Eric Engestrom 

Cheers,
  Eric
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] anv: Give the installed intel_icd.json file an absolute path

2016-08-23 Thread Emil Velikov
On 23 August 2016 at 22:56, Nicholas Miell  wrote:
> On 08/23/2016 02:45 PM, Emil Velikov wrote:
>>
>> On 23 August 2016 at 19:05, Nicholas Miell  wrote:
>>>
>>> On 08/23/2016 02:07 AM, Emil Velikov wrote:


 Skimmed through the discussion and I'm not sure the above will be
 enough.

 Since the user is free to place json files in $HOME/.local ... this
 implies that they may _not_ have access to /usr or /etc. Thus as they
 install the file (to say $HOME/foo/lib) the Vulkan loader will not be
 able to pick it up.
>>>
>>>
>>>
>>> Why not?
>>>
>>> Stick the 32-bit library in $HOME/foo/lib, the 64-bit library in
>>> $HOME/foo/lib64, set library_path to "/home/emil/foo/$LIB/lib_mygpu.so"
>>> in
>>> the $HOME/.local/share/vulkan/icd.d/mygpu.json file, and it should work,
>>> right?
>>>
>> Few reasons:
>>  - If you use $LIB you also need to provide for all arches supported
>> by your setup. otherwise the loader will for missing files (small
>> nitpick but still).
>
>
> The loader already handles this case.
>
Even so, lying/misleading to the loader isn't the nicest thing to do.

>>  - conflict of who provides the json file - is it the x86-64 or i386
>> package, both, neither etc.
>
>
> Both, assuming your package manager supports it (rpm) or stick it in an
> architecture-neutral package that the architecture-specific packages require
> (deb).
>
As previously below you _cannot_ assume that a package manager (if
any) can handle with this.

>>  - the vulkan loader is based on the presumption that can be multiple
>> drivers on the system. Each one clearly distinguished and described in
>> it's own json. Thus as ABI changes (gets updated) we must mandate that
>> distributions and users update both 64 and 32bit packages at the same
>> time. Otherwise there will be a miss-match between the described ABI
>> in json and the actual one of the binary/ies.
>
>
> That's how distributions and package managers work already.
>
Almost, but not quite:
 - Many distros give the user and option to select which package to
update. Thus the user can shoot themselves in the foot because they're
not familiar with this unique Vulkan specifics.
 - At least two distros have ~24h or more period between 64bit and the
multilib package are rolled out.

 Json update:
  - the same file _cannot_ be provided by multiple packages
>>>
>>>
>>>
>>> RPM certainly allows this, as long as the packages that provide the file
>>> have the same name & version, different architectures, and the files are
>>> byte-identical. Sticking $LIB in the path instead of "lib" or "lib64"
>>> makes
>>> the files byte-identical.
>>>
>>>
>> Not everyone uses RPM, or even a package manager all together. In any
>> distro (mostly for source based ones) having one package overwrite
>> files of another package, is a serious no go. It may be allowed by
>> that's an exception of the general rule, requiring it's own list with
>> "ifs and buts".
>>
>> Real world example: your distro ships the 64bit ICD, but not a 32bit
>> one. You build and install the latter yourself and as
>>   - you remove it (because of reasons) you end up breaking your 64bit
>> setup
>>  -  or, you update the 64bit package and the contents of the json
>> change in a way that they don't reflect/describe your 32bit ICD.
>
>
> That isn't a real world example, or at the very least its an example of a
> bad distribution that you shouldn't use combined with user error.
>
Again, almost.
Archlinux does not (last checked a few minutes ago) does not ship
multilib vulkan. While it has a fair few drawbacks it is in no means
bad one imho. If I am to build and install a multilib one - be that a
proper package or not, I will exhibit the issues described above.

And yes I wish things were as simple as using $LIB. Yet, realistically
one cannot expect that every distribution (let's ignore the LFS alike
ones for the time being) follows the approaches used by the distro you
have in mind.

Regards,
Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v1 11/13] tests/sw_sync: Add subtest test_sync_random_merge

2016-08-23 Thread Eric Engestrom
On Tue, Aug 23, 2016 at 01:56:13PM -0400, robert.f...@collabora.com wrote:
> From: Robert Foss 
> 
> This subtest verifies that creating many timelines and merging random fences
> from each timeline with eachother results in merged fences that are fully
> functional.
> 
> Signed-off-by: Robert Foss 
> ---
>  tests/sw_sync.c | 73 
> +
>  1 file changed, 73 insertions(+)
> 
> diff --git a/tests/sw_sync.c b/tests/sw_sync.c
> index 0e67ad5..8e5a7c9 100644
> --- a/tests/sw_sync.c
> +++ b/tests/sw_sync.c
> @@ -383,6 +383,76 @@ static void test_sync_multi_consumer_producer(void)
>   igt_assert_f(thread_ret == 0, "A sync thread reported failure.\n");
>  }
>  
> +static void test_sync_random_merge(void)
> +{
> + int i, size, ret;
> + const int nbr_timeline = 32;
> + const int nbr_merge = 1024;
> + int fence_map[nbr_timeline];
> + int timeline_arr[nbr_timeline];
> + int fence, tmpfence, merged;
> + int timeline, timeline_offset, sync_pt;
> +
> + srand(time(NULL));
> +
> + for (i = 0; i < nbr_timeline; i++)
> + timeline_arr[i] = sw_sync_timeline_create();
> +
> + memset(fence_map, -1, sizeof(fence_map));

This sets each byte to -1, which happens to be the same as an int -1,
but I don't really like it: this is only true for two special values, -1
and 0, and would become a bug if it was ever changed.

I'd much prefer if you simply set:
fence_map[i] = -1;
in the loop you already have just above.

BTW, seeing how many times you set and test for -1 as the "invalid fd
sentinel", how about a:
#define INVALID_FD (-1)
and use that token throughout the code? I think it would improve
readability as well

> +
> + sync_pt = rand();
> + fence = sw_sync_fence_create(timeline_arr[0], sync_pt);
> +
> + fence_map[0] = sync_pt;
> +
> + /* Randomly create syncpoints out of a fixed set of timelines,
> +  * and merge them together.
> +  */
> + for (i = 0; i < nbr_merge; i++) {
> + /* Generate syncpoint. */
> + timeline_offset = rand() % nbr_timeline;
> + timeline = timeline_arr[timeline_offset];
> + sync_pt = rand();
> +
> + /* Keep track of the latest sync_pt in each timeline. */
> + if (fence_map[timeline_offset] == -1)
> + fence_map[timeline_offset] = sync_pt;
> + else if (fence_map[timeline_offset] < sync_pt)
> + fence_map[timeline_offset] = sync_pt;
> +
> + /* Merge. */
> + tmpfence = sw_sync_fence_create(timeline, sync_pt);
> + merged = sw_sync_merge(tmpfence, fence);
> + sw_sync_fence_destroy(tmpfence);
> + sw_sync_fence_destroy(fence);
> + fence = merged;
> + }
> +
> + size = 0;
> + for (i = 0; i < nbr_timeline; i++)
> + if (fence_map[i] != -1)
> + size++;
> +
> + /* Trigger the merged fence. */
> + for (i = 0; i < nbr_timeline; i++) {
> + if (fence_map[i] != -1) {
> + ret = sw_sync_wait(fence, 0);
> + igt_assert_f(ret == 0,
> + "Failure waiting on fence until timeout\n");
> + /* Increment the timeline to the last sync_pt */
> + sw_sync_timeline_inc(timeline_arr[i], fence_map[i]);
> + }
> + }
> +
> + /* Check that the fence is triggered. */
> + ret = sw_sync_wait(fence, 0);
> + igt_assert_f(ret > 0, "Failure triggering fence\n");
> +
> + sw_sync_fence_destroy(fence);
> + for (i = 0; i < nbr_timeline; i++)
> + sw_sync_timeline_destroy(timeline_arr[i]);
> +}
> +
>  igt_main
>  {
>   igt_subtest("alloc_timeline")
> @@ -411,5 +481,8 @@ igt_main
>  
>   igt_subtest("sync_multi_consumer_producer")
>   test_sync_multi_consumer_producer();
> +
> + igt_subtest("sync_random_merge")
> + test_sync_random_merge();
>  }
>  
> -- 
> 2.7.4
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965: report bound buffer size not underlying buffer size for image size (v2)

2016-08-23 Thread Dave Airlie
On 23 August 2016 at 19:28, Antía Puentes  wrote:
> Hi Dave,
>
> The "GL44-CTS.shader_image_size.advanced-nonMS-fs-int" test fails for
> me with current master (which contains this patch), I have tested it
> both in a Broadwell and a Skylake machine.
>
> I recall that you sent to the mailing list a second patch "i965: don't
> fail to shift height images for levels." related to the test. Is this
> patch or a new version of it still needed?. I remember that as it is it
> regressed GL44-CTS.texture_view.gettexparameter, raising the assertion:
> "glcts: intel_mipmap_tree.c:368: intel_miptree_create_layout: Assertion
> `height0 = 1' failed."

Yes still needed.

>
> On a side note, as we are taking about CTS, just remind that Andres has
> reviewed the "mesa/subroutines: start adding per-context subroutine
> index support" series that you sent, which is not pushed yet.

I cleaned up and pushed it yesterday.

Dave.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v1 02/13] tests/sw_sync: Add sw_sync test

2016-08-23 Thread Eric Engestrom
On Tue, Aug 23, 2016 at 01:56:04PM -0400, robert.f...@collabora.com wrote:
> From: Robert Foss 
> 
> Add initial tests for sw_sync.
> 
> Signed-off-by: Gustavo Padovan 
> Signed-off-by: Robert Foss 
> ---
>  lib/sw_sync.c  | 13 ++---
>  lib/sw_sync.h  |  2 +-
>  tests/Makefile.sources |  1 +
>  tests/sw_sync.c| 51 
> ++
>  4 files changed, 59 insertions(+), 8 deletions(-)
>  create mode 100644 tests/sw_sync.c
> 
> diff --git a/lib/sw_sync.c b/lib/sw_sync.c
> index c4e7d07..e3d5f85 100644
> --- a/lib/sw_sync.c
> +++ b/lib/sw_sync.c
> @@ -150,8 +150,6 @@ int sw_sync_wait(int fence, int timeout)
>  
>   ret = poll(, 1, timeout);
>  
> - sw_sync_fd_close(fence);
> -
>   return ret;
>  }
>  
> @@ -179,9 +177,10 @@ static struct sync_file_info *sync_file_info(int fd)
>   info->num_fences = num_fences;
>  
>   fence_info = calloc(num_fences, sizeof(struct sync_fence_info));
> - if (!fence_info)
> + if (!fence_info) {
>   free(info);
>   return NULL;
> + }

Oh, I see you fixed it here.
I would fold all these lib/ hunks from this patch into the previous
patch though, they don't belong in "Add sw_sync test" IMO, and having
them as a separate patch between the two would mean knowingly
introducing bugs in one commit and fixing them in the next.

>  
>   info->sync_fence_info = (uint64_t)(unsigned long) (fence_info);
>  
> @@ -217,18 +216,18 @@ int sw_sync_fence_size(int fd)
>   return count;
>  }
>  
> -int sw_sync_fence_count_with_status(int fd, int status)
> +int sw_sync_fence_count_status(int fd, int status)
>  {
>   int i, count = 0;
> - struct sync_fence_info *fenceInfo = NULL;
> + struct sync_fence_info *fence_info = NULL;
>   struct sync_file_info *info = sync_file_info(fd);
>  
>   if (!info)
>   return -1;
>  
> - fenceInfo = (struct sync_fence_info *)(uintptr_t)info->sync_fence_info;
> + fence_info = (struct sync_fence_info *)(uintptr_t)info->sync_fence_info;
>   for (i = 0 ; i < info->num_fences ; i++) {
> - if (fenceInfo[i].status == status)
> + if (fence_info[i].status == status)
>   count++;
>   }
>  
> diff --git a/lib/sw_sync.h b/lib/sw_sync.h
> index b179adf..1092608 100644
> --- a/lib/sw_sync.h
> +++ b/lib/sw_sync.h
> @@ -43,7 +43,7 @@ void sw_sync_timeline_inc(int fd, uint32_t count);
>  int sw_sync_merge(int fd1, int fd2);
>  int sw_sync_wait(int fence, int timeout);
>  int sw_sync_fence_size(int fd);
> -int sw_sync_fence_count_with_status(int fd, int status);
> +int sw_sync_fence_count_status(int fd, int status);
>  
>  #endif
>  
> diff --git a/tests/Makefile.sources b/tests/Makefile.sources
> index 72a58ad..0ba769f 100644
> --- a/tests/Makefile.sources
> +++ b/tests/Makefile.sources
> @@ -125,6 +125,7 @@ TESTS_progs_M = \
>   prime_mmap_kms \
>   prime_self_import \
>   prime_vgem \
> + sw_sync \
>   template \
>   vgem_basic \
>   vgem_slow \
> diff --git a/tests/sw_sync.c b/tests/sw_sync.c
> new file mode 100644
> index 000..d2d4c42
> --- /dev/null
> +++ b/tests/sw_sync.c
> @@ -0,0 +1,51 @@
> +/*
> + * Copyright 2012 Google, Inc
> + * Copyright © 2016 Collabora, Ltd.
> + *
> + * Based on the implementation from the Android Open Source Project
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the next
> + * paragraph) shall be included in all copies or substantial portions of the
> + * Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 
> DEALINGS
> + * IN THE SOFTWARE.
> + *
> + * Authors:
> + *Robert Foss 
> + */
> +
> +#include 
> +
> +#include "sw_sync.h"
> +#include "igt.h"
> +#include "igt_aux.h"
> +
> +IGT_TEST_DESCRIPTION("Test SW Sync Framework");
> +
> +static void test_alloc_timeline(void)
> +{
> + int timeline;
> +

Re: [Mesa-dev] [PATCH v1 01/13] lib/sw_sync: Add helper functions for managing synchronization primitives

2016-08-23 Thread Eric Engestrom
On Tue, Aug 23, 2016 at 01:56:03PM -0400, robert.f...@collabora.com wrote:
> From: Robert Foss 
> 
> Base functions to help testing the Sync File Framework (explicit fencing
> mechanism ported from Android).
> These functions allow you to create, use and destroy timelines and fences.
> 
> Signed-off-by: Gustavo Padovan 
> Signed-off-by: Robert Foss 
> ---
>  lib/Makefile.sources |   2 +
>  lib/sw_sync.c| 238 
> +++
>  lib/sw_sync.h|  49 +++
>  3 files changed, 289 insertions(+)
>  create mode 100644 lib/sw_sync.c
>  create mode 100644 lib/sw_sync.h
> 

[snip]

> +int sw_sync_fd_is_valid(int fd)
> +{
> + int status;
> +
> + if (fd == -1)

`-1` seems too specific. While open() will return -1 on error, any
negative fd is invalid, so I'd test for `<0` here instead.

> + return 0;
> +
> + status = fcntl(fd, F_GETFD, 0);
> + return status >= 0;
> +}
> +
> +static
> +void sw_sync_fd_close(int fd)
> +{
> + if (fd == -1)
> + return;
> +
> + if (fcntl(fd, F_GETFD, 0) < 0)
> + return;

Why not replace these two tests with a simple
if (sw_sync_fd_is_valid(fd))

> +
> + close(fd);
> +}
> +
> +int sw_sync_timeline_create(void)
> +{
> + int fd = open("/dev/sw_sync", O_RDWR);
> +
> + if (!sw_sync_fd_is_valid(fd))
> + fd = open("/sys/kernel/debug/sync/sw_sync", O_RDWR);

I tend to prefer for hard-coded paths to be in a #define at the top, but
I don't know what the policy for that is in IGT.

> +
> + igt_assert(sw_sync_fd_is_valid(fd));
> +
> + return fd;
> +}
> +

[snip]

> +static struct sync_file_info *sync_file_info(int fd)
> +{
> + struct sync_file_info *info;
> + struct sync_fence_info *fence_info;
> + int err, num_fences;
> +
> + info = malloc(sizeof(*info));
> + if (info == NULL)
> + return NULL;
> +
> + memset(info, 0, sizeof(*info));

You could replace malloc() + memset(0) with calloc(), as you're doing
a few lines below.

> + err = ioctl(fd, SYNC_IOC_FILE_INFO, info);
> + if (err < 0) {
> + free(info);
> + return NULL;
> + }
> +
> + num_fences = info->num_fences;
> +
> + if (num_fences) {
> + info->flags = 0;
> + info->num_fences = num_fences;
> +
> + fence_info = calloc(num_fences, sizeof(struct sync_fence_info));

sizeof(*fence_info)

> + if (!fence_info)
> + free(info);
> + return NULL;

Missing braces

> +
> + info->sync_fence_info = (uint64_t)(unsigned long) (fence_info);
> +
> + err = ioctl(fd, SYNC_IOC_FILE_INFO, info);
> + if (err < 0) {
> + free(fence_info);
> + free(info);
> + return NULL;
> + }
> + }
> +
> + return info;
> +}

[snip]

Cheers,
  Eric
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] anv: Give the installed intel_icd.json file an absolute path

2016-08-23 Thread Nicholas Miell

On 08/23/2016 02:45 PM, Emil Velikov wrote:

On 23 August 2016 at 19:05, Nicholas Miell  wrote:

On 08/23/2016 02:07 AM, Emil Velikov wrote:


Skimmed through the discussion and I'm not sure the above will be enough.

Since the user is free to place json files in $HOME/.local ... this
implies that they may _not_ have access to /usr or /etc. Thus as they
install the file (to say $HOME/foo/lib) the Vulkan loader will not be
able to pick it up.



Why not?

Stick the 32-bit library in $HOME/foo/lib, the 64-bit library in
$HOME/foo/lib64, set library_path to "/home/emil/foo/$LIB/lib_mygpu.so" in
the $HOME/.local/share/vulkan/icd.d/mygpu.json file, and it should work,
right?


Few reasons:
 - If you use $LIB you also need to provide for all arches supported
by your setup. otherwise the loader will for missing files (small
nitpick but still).


The loader already handles this case.


 - conflict of who provides the json file - is it the x86-64 or i386
package, both, neither etc.


Both, assuming your package manager supports it (rpm) or stick it in an 
architecture-neutral package that the architecture-specific packages 
require (deb).



 - the vulkan loader is based on the presumption that can be multiple
drivers on the system. Each one clearly distinguished and described in
it's own json. Thus as ABI changes (gets updated) we must mandate that
distributions and users update both 64 and 32bit packages at the same
time. Otherwise there will be a miss-match between the described ABI
in json and the actual one of the binary/ies.


That's how distributions and package managers work already.


Json update:
 - the same file _cannot_ be provided by multiple packages



RPM certainly allows this, as long as the packages that provide the file
have the same name & version, different architectures, and the files are
byte-identical. Sticking $LIB in the path instead of "lib" or "lib64" makes
the files byte-identical.



Not everyone uses RPM, or even a package manager all together. In any
distro (mostly for source based ones) having one package overwrite
files of another package, is a serious no go. It may be allowed by
that's an exception of the general rule, requiring it's own list with
"ifs and buts".

Real world example: your distro ships the 64bit ICD, but not a 32bit
one. You build and install the latter yourself and as
  - you remove it (because of reasons) you end up breaking your 64bit setup
 -  or, you update the 64bit package and the contents of the json
change in a way that they don't reflect/describe your 32bit ICD.


That isn't a real world example, or at the very least its an example of 
a bad distribution that you shouldn't use combined with user error.




So all in all, without separate json files each describing a separate
ICD, things are extremely easy to break.

-Emil



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] anv: Give the installed intel_icd.json file an absolute path

2016-08-23 Thread Emil Velikov
On 23 August 2016 at 19:05, Nicholas Miell  wrote:
> On 08/23/2016 02:07 AM, Emil Velikov wrote:
>>
>> Skimmed through the discussion and I'm not sure the above will be enough.
>>
>> Since the user is free to place json files in $HOME/.local ... this
>> implies that they may _not_ have access to /usr or /etc. Thus as they
>> install the file (to say $HOME/foo/lib) the Vulkan loader will not be
>> able to pick it up.
>
>
> Why not?
>
> Stick the 32-bit library in $HOME/foo/lib, the 64-bit library in
> $HOME/foo/lib64, set library_path to "/home/emil/foo/$LIB/lib_mygpu.so" in
> the $HOME/.local/share/vulkan/icd.d/mygpu.json file, and it should work,
> right?
>
Few reasons:
 - If you use $LIB you also need to provide for all arches supported
by your setup. otherwise the loader will for missing files (small
nitpick but still).
 - conflict of who provides the json file - is it the x86-64 or i386
package, both, neither etc.
 - the vulkan loader is based on the presumption that can be multiple
drivers on the system. Each one clearly distinguished and described in
it's own json. Thus as ABI changes (gets updated) we must mandate that
distributions and users update both 64 and 32bit packages at the same
time. Otherwise there will be a miss-match between the described ABI
in json and the actual one of the binary/ies.

>> In theory one should be able to have a varying .json (one with and one
>> without the path) although the heuristics will be fragile due to the
>> varying $LIB (like the case of Debian and derivatives) and $DESTDIR.
>>
>> So I see the following options:
>>  - new configure option - have to agree with Dave, please don't.
>> Furthermore we already have more than 50! of them.
>>  - fold the "w or w/o full path" under and existing option
>> (--enable-debug ?) - somewhat picky/fragile as well. Kind of OK for
>> short term solution.
>>  - use LD_LIBRARY_PATH - slightly annoying yet add once and forget
>> about it. OK-ish short term solution.
>>  - json update - the better long term solution imho.
>>
>> Json update:
>>  - the same file _cannot_ be provided by multiple packages
>
>
> RPM certainly allows this, as long as the packages that provide the file
> have the same name & version, different architectures, and the files are
> byte-identical. Sticking $LIB in the path instead of "lib" or "lib64" makes
> the files byte-identical.
>
>
Not everyone uses RPM, or even a package manager all together. In any
distro (mostly for source based ones) having one package overwrite
files of another package, is a serious no go. It may be allowed by
that's an exception of the general rule, requiring it's own list with
"ifs and buts".

Real world example: your distro ships the 64bit ICD, but not a 32bit
one. You build and install the latter yourself and as
  - you remove it (because of reasons) you end up breaking your 64bit setup
 -  or, you update the 64bit package and the contents of the json
change in a way that they don't reflect/describe your 32bit ICD.

So all in all, without separate json files each describing a separate
ICD, things are extremely easy to break.

-Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] loader/dri3: Overhaul dri3_update_num_back

2016-08-23 Thread Eric Anholt
Michel Dänzer  writes:

> On 20/08/16 04:42 AM, Eric Anholt wrote:
>> Michel Dänzer  writes:
>> 
>>> From: Michel Dänzer 
>>>
>>> Always use 3 buffers when flipping. With only 2 buffers, we have to wait
>>> for a flip to complete (which takes non-0 time even with asynchronous
>>> flips) before we can start working on the next frame. We were previously
>>> only using 2 buffers for flipping if the X server supports asynchronous
>>> flips, even when we're not using asynchronous flips. This could result
>>> in bad performance (the referenced bug report is an extreme case, where
>>> the inter-frame stalls were preventing the GPU from reaching its maximum
>>> clocks).
>>>
>>> I couldn't measure any performance boost using 4 buffers with flipping.
>>> Performance actually seemed to go down slightly, but that might have
>>> been just noise.
>>>
>>> Without flipping, a single back buffer is enough for swap interval 0,
>>> but we need to use 2 back buffers when the swap interval is non-0,
>>> otherwise we have to wait for the swap interval to pass before we can
>>> start working on the next frame. This condition was previously reversed.
>>>
>>> Cc: "12.0 11.2" 
>>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97260
>>> Signed-off-by: Michel Dänzer 
>> 
>> Reviewed-by: Eric Anholt 
>
> Thanks.
>
> Note that four piglit EGL tests fail with this change, but AFAICT all of
> those tests are broken. I submitted fixes for three of them:
>
> https://patchwork.freedesktop.org/patch/106809/
> https://patchwork.freedesktop.org/patch/106807/
> https://patchwork.freedesktop.org/patch/106808/
>
> The last one is egl_chromium_sync_control. It calls eglSwapBuffers twice
> in a row and expects that the SBC is higher than before immediately
> after that. I.e. it expects that there is only one back buffer, so that
> at least the second eglSwapBuffers has to block until the first swap
> finishes. Which is no longer true with this DRI3 change. Not sure what
> to do about this one.

From a quick glance, change the test to wait on the SBC before querying
the new one, maybe?  It does seem questionable to assume that a swap
waits on the previous swap.


signature.asc
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] isl/formats: Integer formats are not filterable

2016-08-23 Thread Nanley Chery
On Tue, Aug 23, 2016 at 02:13:26PM -0700, Jason Ekstrand wrote:
> In ca2a8e56285, we updated the format table to add more formats (most of
> which are new on SKL) but accidentally marked some integer formats as
> filterable.  You can't filter an integer format.
> 
> Signed-off-by: Jason Ekstrand 

This patch is

Reviewed-by: Nanley Chery 

> ---
>  src/intel/isl/isl_format.c | 16 
>  1 file changed, 8 insertions(+), 8 deletions(-)
> 
> diff --git a/src/intel/isl/isl_format.c b/src/intel/isl/isl_format.c
> index bced4ff..8507cc5 100644
> --- a/src/intel/isl/isl_format.c
> +++ b/src/intel/isl/isl_format.c
> @@ -237,10 +237,10 @@ static const struct surface_format_info format_info[] = 
> {
> SF(45, 45,  x,  x,  x,  x,  x,  x,  x,x,   P4A4_UNORM_PALETTE1)
> SF(45, 45,  x,  x,  x,  x,  x,  x,  x,x,   A4P4_UNORM_PALETTE1)
> SF( x,  x,  x,  x,  x,  x,  x,  x,  x,x,   Y8_UNORM)
> -   SF(90, 90,  x,  x,  x,  x,  x,  x,  x,x,   L8_UINT)
> -   SF(90, 90,  x,  x,  x,  x,  x,  x,  x,x,   L8_SINT)
> -   SF(90, 90,  x,  x,  x,  x,  x,  x,  x,x,   I8_UINT)
> -   SF(90, 90,  x,  x,  x,  x,  x,  x,  x,x,   I8_SINT)
> +   SF(90,  x,  x,  x,  x,  x,  x,  x,  x,x,   L8_UINT)
> +   SF(90,  x,  x,  x,  x,  x,  x,  x,  x,x,   L8_SINT)
> +   SF(90,  x,  x,  x,  x,  x,  x,  x,  x,x,   I8_UINT)
> +   SF(90,  x,  x,  x,  x,  x,  x,  x,  x,x,   I8_SINT)
> SF(45, 45,  x,  x,  x,  x,  x,  x,  x,x,   DXT1_RGB_SRGB)
> SF( Y,  Y,  x,  x,  x,  x,  x,  x,  x,x,   R1_UNORM)
> SF( Y,  Y,  x,  Y,  Y,  x,  x,  x, 60,x,   YCRCB_NORMAL)
> @@ -287,8 +287,8 @@ static const struct surface_format_info format_info[] = {
> SF(80, 80,  x,  x,  x,  x,  x,  x,  x,x,   EAC_SIGNED_R11)
> SF(80, 80,  x,  x,  x,  x,  x,  x,  x,x,   EAC_SIGNED_RG11)
> SF(80, 80,  x,  x,  x,  x,  x,  x,  x,x,   ETC2_SRGB8)
> -   SF(90, 90,  x,  x,  x,  x, 75,  x,  x,x,   R16G16B16_UINT)
> -   SF(90, 90,  x,  x,  x,  x, 75,  x,  x,x,   R16G16B16_SINT)
> +   SF(90,  x,  x,  x,  x,  x, 75,  x,  x,x,   R16G16B16_UINT)
> +   SF(90,  x,  x,  x,  x,  x, 75,  x,  x,x,   R16G16B16_SINT)
> SF( x,  x,  x,  x,  x,  x, 75,  x,  x,x,   R32_SFIXED)
> SF( x,  x,  x,  x,  x,  x, 75,  x,  x,x,   R10G10B10A2_SNORM)
> SF( x,  x,  x,  x,  x,  x, 75,  x,  x,x,   R10G10B10A2_USCALED)
> @@ -305,8 +305,8 @@ static const struct surface_format_info format_info[] = {
> SF(80, 80,  x,  x,  x,  x,  x,  x,  x,x,   ETC2_SRGB8_PTA)
> SF(80, 80,  x,  x,  x,  x,  x,  x,  x,x,   ETC2_EAC_RGBA8)
> SF(80, 80,  x,  x,  x,  x,  x,  x,  x,x,   ETC2_EAC_SRGB8_A8)
> -   SF(90, 90,  x,  x,  x,  x, 75,  x,  x,x,   R8G8B8_UINT)
> -   SF(90, 90,  x,  x,  x,  x, 75,  x,  x,x,   R8G8B8_SINT)
> +   SF(90,  x,  x,  x,  x,  x, 75,  x,  x,x,   R8G8B8_UINT)
> +   SF(90,  x,  x,  x,  x,  x, 75,  x,  x,x,   R8G8B8_SINT)
> SF(80, 80,  x,  x,  x,  x,  x,  x,  x,x,   ASTC_LDR_2D_4X4_FLT16)
> SF(80, 80,  x,  x,  x,  x,  x,  x,  x,x,   ASTC_LDR_2D_5X4_FLT16)
> SF(80, 80,  x,  x,  x,  x,  x,  x,  x,x,   ASTC_LDR_2D_5X5_FLT16)
> -- 
> 2.5.0.400.gff86faf
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] isl/formats: Integer formats are not filterable

2016-08-23 Thread Jason Ekstrand
In ca2a8e56285, we updated the format table to add more formats (most of
which are new on SKL) but accidentally marked some integer formats as
filterable.  You can't filter an integer format.

Signed-off-by: Jason Ekstrand 
---
 src/intel/isl/isl_format.c | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/src/intel/isl/isl_format.c b/src/intel/isl/isl_format.c
index bced4ff..8507cc5 100644
--- a/src/intel/isl/isl_format.c
+++ b/src/intel/isl/isl_format.c
@@ -237,10 +237,10 @@ static const struct surface_format_info format_info[] = {
SF(45, 45,  x,  x,  x,  x,  x,  x,  x,x,   P4A4_UNORM_PALETTE1)
SF(45, 45,  x,  x,  x,  x,  x,  x,  x,x,   A4P4_UNORM_PALETTE1)
SF( x,  x,  x,  x,  x,  x,  x,  x,  x,x,   Y8_UNORM)
-   SF(90, 90,  x,  x,  x,  x,  x,  x,  x,x,   L8_UINT)
-   SF(90, 90,  x,  x,  x,  x,  x,  x,  x,x,   L8_SINT)
-   SF(90, 90,  x,  x,  x,  x,  x,  x,  x,x,   I8_UINT)
-   SF(90, 90,  x,  x,  x,  x,  x,  x,  x,x,   I8_SINT)
+   SF(90,  x,  x,  x,  x,  x,  x,  x,  x,x,   L8_UINT)
+   SF(90,  x,  x,  x,  x,  x,  x,  x,  x,x,   L8_SINT)
+   SF(90,  x,  x,  x,  x,  x,  x,  x,  x,x,   I8_UINT)
+   SF(90,  x,  x,  x,  x,  x,  x,  x,  x,x,   I8_SINT)
SF(45, 45,  x,  x,  x,  x,  x,  x,  x,x,   DXT1_RGB_SRGB)
SF( Y,  Y,  x,  x,  x,  x,  x,  x,  x,x,   R1_UNORM)
SF( Y,  Y,  x,  Y,  Y,  x,  x,  x, 60,x,   YCRCB_NORMAL)
@@ -287,8 +287,8 @@ static const struct surface_format_info format_info[] = {
SF(80, 80,  x,  x,  x,  x,  x,  x,  x,x,   EAC_SIGNED_R11)
SF(80, 80,  x,  x,  x,  x,  x,  x,  x,x,   EAC_SIGNED_RG11)
SF(80, 80,  x,  x,  x,  x,  x,  x,  x,x,   ETC2_SRGB8)
-   SF(90, 90,  x,  x,  x,  x, 75,  x,  x,x,   R16G16B16_UINT)
-   SF(90, 90,  x,  x,  x,  x, 75,  x,  x,x,   R16G16B16_SINT)
+   SF(90,  x,  x,  x,  x,  x, 75,  x,  x,x,   R16G16B16_UINT)
+   SF(90,  x,  x,  x,  x,  x, 75,  x,  x,x,   R16G16B16_SINT)
SF( x,  x,  x,  x,  x,  x, 75,  x,  x,x,   R32_SFIXED)
SF( x,  x,  x,  x,  x,  x, 75,  x,  x,x,   R10G10B10A2_SNORM)
SF( x,  x,  x,  x,  x,  x, 75,  x,  x,x,   R10G10B10A2_USCALED)
@@ -305,8 +305,8 @@ static const struct surface_format_info format_info[] = {
SF(80, 80,  x,  x,  x,  x,  x,  x,  x,x,   ETC2_SRGB8_PTA)
SF(80, 80,  x,  x,  x,  x,  x,  x,  x,x,   ETC2_EAC_RGBA8)
SF(80, 80,  x,  x,  x,  x,  x,  x,  x,x,   ETC2_EAC_SRGB8_A8)
-   SF(90, 90,  x,  x,  x,  x, 75,  x,  x,x,   R8G8B8_UINT)
-   SF(90, 90,  x,  x,  x,  x, 75,  x,  x,x,   R8G8B8_SINT)
+   SF(90,  x,  x,  x,  x,  x, 75,  x,  x,x,   R8G8B8_UINT)
+   SF(90,  x,  x,  x,  x,  x, 75,  x,  x,x,   R8G8B8_SINT)
SF(80, 80,  x,  x,  x,  x,  x,  x,  x,x,   ASTC_LDR_2D_4X4_FLT16)
SF(80, 80,  x,  x,  x,  x,  x,  x,  x,x,   ASTC_LDR_2D_5X4_FLT16)
SF(80, 80,  x,  x,  x,  x,  x,  x,  x,x,   ASTC_LDR_2D_5X5_FLT16)
-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/2] nir/algebraic: Optimize common array indexing sequence

2016-08-23 Thread Thomas Helland
12. aug. 2016 04.02 skrev "Timothy Arceri" :
>
> On Thu, 2016-08-11 at 20:54 +0200, Thomas Helland wrote:
> > 2016-08-11 18:19 GMT+02:00 Ian Romanick :
> > >
> > > ping
> > >
> > > On 07/19/2016 10:37 PM, Ian Romanick wrote:
> > > >
> > > > From: Ian Romanick 
> > > >
> > > > Some shaders include code that looks like:
> > > >
> > > >uniform int i;
> > > >uniform vec4 bones[...];
> > > >
> > > >foo(bones[i * 3], bones[i * 3 + 1], bones[i * 3 + 2]);
> > > >
> > > > CSE would do some work on this:
> > > >
> > > >x = i * 3
> > > >foo(bones[x], bones[x + 1], bones[x + 2]);
> > > >
> > > > The compiler may then add '<< 4 + base' to the index
> > > > calculations.
> > > > This results in expressions like
> > > >
> > > >x = i * 3
> > > >foo(bones[x << 4], bones[(x + 1) << 4], bones[(x + 2) << 4]);
> > > >
> >
> > This may just be me being dense, but why is the compiler adding the
> > shift?
> > It seems like this would be "un-optimizing" and so it shouldn't be
> > added?
> >
> > This patch is also a good argument for adding a constant
> > reassociation pass.
> > At least the (a * #b) << #c--->a * (#b << #c) would be caught
> > by that.
> >
> > I was confused whether the optimization was preserving signed-ness
> > correctly
> > but after a couple rounds with myself I believe it is correct.
> >
> > The change looks good and is:
> >
> > Reviewed-by: Thomas Helland 
> >
> > (Oh, and yes, I'm back to hobby mesa development after finally
> > securing a new job and a new apartment)
>
> Congrats and welcome back.
>
> I did send you an email but just in case you missed it I've picked up
> on your loop unrolling work.

Thanks!

Yeah, I noticed that one. At least I believe I did. I think I might have
also replied with some hints and suggestions? At least I meant to. My Gmail
doesn't seem perfectly stable; somehow this thread didn't get pushed up in
my inbox when it got more replies, so I might have missed something. If so,
make sure to ping me! I should probably find out what's up with that mail
thing though.

>
> >
> > >
> > > >
> > > > Just rearranging the math to produce (i * 48) + 16 saves an
> > > > instruction, and it allows CSE to do more work.
> > > >
> > > >x = i * 48;
> > > >foo(bones[x], bones[x + 16], bones[x + 32]);
> > > >
> > > > So, ~6 instructions becomes ~3.
> > > >
> > > > Some individual shader-db results look pretty bad.  However, I
> > > > have a
> > > > really, really hard time believing the change in estimated cycles
> > > > in,
> > > > for example, 3dmmes-taiji/51.shader_test after looking that
> > > > change in
> > > > the generated code.
> > > >
> > > > G45
> > > > total instructions in shared programs: 4020840 -> 4010070 (-
> > > > 0.27%)
> > > > instructions in affected programs: 177460 -> 166690 (-6.07%)
> > > > helped: 894
> > > > HURT: 0
> > > >
> > > > total cycles in shared programs: 98829000 -> 98784990 (-0.04%)
> > > > cycles in affected programs: 3936648 -> 3892638 (-1.12%)
> > > > helped: 894
> > > > HURT: 0
> > > >
> > > > Ironlake
> > > > total instructions in shared programs: 6418887 -> 6408117 (-
> > > > 0.17%)
> > > > instructions in affected programs: 177460 -> 166690 (-6.07%)
> > > > helped: 894
> > > > HURT: 0
> > > >
> > > > total cycles in shared programs: 143504542 -> 143460532 (-0.03%)
> > > > cycles in affected programs: 3936648 -> 3892638 (-1.12%)
> > > > helped: 894
> > > > HURT: 0
> > > >
> > > > Sandy Bridge
> > > > total instructions in shared programs: 8357887 -> 8339251 (-
> > > > 0.22%)
> > > > instructions in affected programs: 432715 -> 414079 (-4.31%)
> > > > helped: 2795
> > > > HURT: 0
> > > >
> > > > total cycles in shared programs: 118284184 -> 118207412 (-0.06%)
> > > > cycles in affected programs: 6114626 -> 6037854 (-1.26%)
> > > > helped: 2478
> > > > HURT: 317
> > > >
> > > > Ivy Bridge
> > > > total instructions in shared programs: 7669390 -> 7653822 (-
> > > > 0.20%)
> > > > instructions in affected programs: 388234 -> 372666 (-4.01%)
> > > > helped: 2795
> > > > HURT: 0
> > > >
> > > > total cycles in shared programs: 68381982 -> 68263684 (-0.17%)
> > > > cycles in affected programs: 1972658 -> 1854360 (-6.00%)
> > > > helped: 2458
> > > > HURT: 307
> > > >
> > > > Haswell
> > > > total instructions in shared programs: 7082636 -> 7067068 (-
> > > > 0.22%)
> > > > instructions in affected programs: 388234 -> 372666 (-4.01%)
> > > > helped: 2795
> > > > HURT: 0
> > > >
> > > > total cycles in shared programs: 68282020 -> 68164158 (-0.17%)
> > > > cycles in affected programs: 1891820 -> 1773958 (-6.23%)
> > > > helped: 2459
> > > > HURT: 261
> > > >
> > > > Broadwell
> > > > total instructions in shared programs: 9002466 -> 8985875 (-
> > > > 0.18%)
> > > > instructions in affected programs: 658784 -> 642193 (-2.52%)
> > > > helped: 2795
> > > > HURT: 5
> > > >
> > > > total cycles in shared programs: 

Re: [Mesa-dev] [PATCH] glsl: Mark tessellation qualifier maps static const.

2016-08-23 Thread Ilia Mirkin
Reviewed-by: Ilia Mirkin 

Ah so that's why they weren't static const in my patch... coz I copied
it :) Nice sleuth work.

On Tue, Aug 23, 2016 at 4:39 PM, Kenneth Graunke  wrote:
> Signed-off-by: Kenneth Graunke 
> ---
>  src/compiler/glsl/glsl_parser.yy | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/src/compiler/glsl/glsl_parser.yy 
> b/src/compiler/glsl/glsl_parser.yy
> index 4f4a83c..5b65861 100644
> --- a/src/compiler/glsl/glsl_parser.yy
> +++ b/src/compiler/glsl/glsl_parser.yy
> @@ -1373,7 +1373,7 @@ layout_qualifier_id:
>
>/* Layout qualifiers for tessellation evaluation shaders. */
>if (!$$.flags.i) {
> - struct {
> + static const struct {
>  const char *s;
>  GLenum e;
>   } map[] = {
> @@ -1396,7 +1396,7 @@ layout_qualifier_id:
>   }
>}
>if (!$$.flags.i) {
> - struct {
> + static const struct {
>  const char *s;
>  GLenum e;
>   } map[] = {
> --
> 2.9.3
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] glsl: Mark tessellation qualifier maps static const.

2016-08-23 Thread Kenneth Graunke
Signed-off-by: Kenneth Graunke 
---
 src/compiler/glsl/glsl_parser.yy | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/compiler/glsl/glsl_parser.yy b/src/compiler/glsl/glsl_parser.yy
index 4f4a83c..5b65861 100644
--- a/src/compiler/glsl/glsl_parser.yy
+++ b/src/compiler/glsl/glsl_parser.yy
@@ -1373,7 +1373,7 @@ layout_qualifier_id:
 
   /* Layout qualifiers for tessellation evaluation shaders. */
   if (!$$.flags.i) {
- struct {
+ static const struct {
 const char *s;
 GLenum e;
  } map[] = {
@@ -1396,7 +1396,7 @@ layout_qualifier_id:
  }
   }
   if (!$$.flags.i) {
- struct {
+ static const struct {
 const char *s;
 GLenum e;
  } map[] = {
-- 
2.9.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/4] i965: move subreg_offset to backend_reg

2016-08-23 Thread Francisco Jerez
Iago Toral Quiroga  writes:

> So we can access it in the vec4 backend to handle byte offsets into
> registers.

This change has deep implications in the meaning of the vec4 register
objects because the representation of register offsets is now split
between 'reg_offset' and 'subreg_offset', and there are a *lot* of
places that directly rely on the current representation of register
offsets, just grep the VEC4 back-end for 'reg_offset' -- Every
occurrence is now suspect of being inaccurate because the offset of a
register is no longer guaranteed to be 32B-aligned, so comparing
reg_offsets alone to find out whether two registers are equivalent or
whether they overlap is no longer sufficient.

Do you *really* need to make this representation change for FP64?  Or is
there some way around it?  ISTR that in the big VEC4 FP64 series you
ended up using backend_reg::subnr for this (which is just an undercover
version of subreg_offset so the same caveat applies :P) when you had to
address both halves of a single-precision register during SIMD lowering
(Can you remind me what the exact use-case was for that?).  To address
the ZW components of a double-precision vector this seems less of a
requirement because you can just use the logical (i.e. 64-bit-based)
swizzles at the IR level and only translate to physical
swizzles+offsets+strides during codegen time.

If the answer is that you definitely need this change, I think I'm going
to help you out with this because the change is going to be massive and
involve auditing the whole VEC4 back-end.  I think there are two lessons
to learn from the FS back-end:

 - Having the register offset split between two variables has been an
   endless source of bugs, because it's just too tempting to only take
   one of them into account and ignore the other.

 - It wouldn't be substantially more difficult to change reg_offset to
   be expressed in byte units instead, even if it involves going through
   every occurrence of reg_offset in the back-end, because adding a
   separate subreg_offset field invalidates every ocurrence of
   reg_offset in the back-end anyway.

> ---
>  src/mesa/drivers/dri/i965/brw_ir_fs.h  | 6 --
>  src/mesa/drivers/dri/i965/brw_shader.h | 6 ++
>  2 files changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_ir_fs.h 
> b/src/mesa/drivers/dri/i965/brw_ir_fs.h
> index f214483..00fbace 100644
> --- a/src/mesa/drivers/dri/i965/brw_ir_fs.h
> +++ b/src/mesa/drivers/dri/i965/brw_ir_fs.h
> @@ -52,12 +52,6 @@ public:
> /** Smear a channel of the reg to all channels. */
> fs_reg _smear(unsigned subreg);
>  
> -   /**
> -* Offset in bytes from the start of the register.  Values up to a
> -* backend_reg::reg_offset unit are valid.
> -*/
> -   int subreg_offset;
> -
> /** Register region horizontal stride */
> uint8_t stride;
>  };
> diff --git a/src/mesa/drivers/dri/i965/brw_shader.h 
> b/src/mesa/drivers/dri/i965/brw_shader.h
> index e61c080..ae23830 100644
> --- a/src/mesa/drivers/dri/i965/brw_shader.h
> +++ b/src/mesa/drivers/dri/i965/brw_shader.h
> @@ -75,6 +75,12 @@ struct backend_reg : private brw_reg
>  */
> uint16_t reg_offset;
>  
> +   /**
> +* Offset in bytes from the start of the register.  Values up to a
> +* backend_reg::reg_offset unit are valid.
> +*/
> +   uint16_t subreg_offset;
> +
> using brw_reg::type;
> using brw_reg::file;
> using brw_reg::negate;
> -- 
> 2.7.4


signature.asc
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 95190] Tomb Raider with PostProcessing enable and Depth of Field set to Ultra has white stuff in the foreground

2016-08-23 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=95190

Karol Herbst  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #11 from Karol Herbst  ---
except the bad perf, everything looks fine :)

-- 
You are receiving this mail because:
You are the QA Contact for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] vl/rbsp: fix another three byte not detected

2016-08-23 Thread Christian König

Am 22.08.2016 um 18:27 schrieb Leo Liu:

This happens when three byte "00 00 03" is partly loaded to
vlc->buffer, and left bytes like "00 03" or "03" in the data,
so that it will not be detected by three byte emulation check.

Signed-off-by: Leo Liu 
I currently don't have the time to double check that, but it sounds 
reasonable to me.


So patch is Acked-by: Christian König .

Cheers,
Christian.


---
  src/gallium/auxiliary/vl/vl_rbsp.h | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/auxiliary/vl/vl_rbsp.h 
b/src/gallium/auxiliary/vl/vl_rbsp.h
index c8bebff..160b2f8 100644
--- a/src/gallium/auxiliary/vl/vl_rbsp.h
+++ b/src/gallium/auxiliary/vl/vl_rbsp.h
@@ -56,7 +56,7 @@ static inline void vl_rbsp_init(struct vl_rbsp *rbsp, struct 
vl_vlc *nal, unsign
 /* copy the position */
 rbsp->nal = *nal;
  
-   rbsp->escaped = 0;

+   rbsp->escaped = 16;
  
 /* search for the end of the NAL unit */

 while (vl_vlc_search_byte(nal, num_bits, 0x00)) {



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/3] glsl: Fix incorrect hard-coded location of the gl_SecondaryFragColorEXT built-in.

2016-08-23 Thread Francisco Jerez
Francisco Jerez  writes:

> Ilia Mirkin  writes:
>
>> On Tue, Aug 23, 2016 at 12:18 AM, Francisco Jerez  
>> wrote:
>>> Ilia Mirkin  writes:
>>>
 On Tue, Aug 23, 2016 at 12:05 AM, Francisco Jerez  
 wrote:
> Ilia Mirkin  writes:
>
>> On Mon, Aug 22, 2016 at 10:55 PM, Francisco Jerez 
>>  wrote:
>>> Ilia Mirkin  writes:
>>>
 On Mon, Aug 22, 2016 at 9:59 PM, Francisco Jerez 
  wrote:
> gl_SecondaryFragColorEXT should have the same location as gl_FragColor
> for the secondary fragment color to be replicated to all fragment
> outputs.  The incorrect location of gl_SecondaryFragColorEXT would
> cause the linker to mark both FRAG_RESULT_COLOR and FRAG_RESULT_DATA0
> as being written to, which isn't allowed by the spec and would
> ultimately lead to an assertion failure in
> fs_visitor::emit_fb_writes() on my i965-fb-fetch branch.

 My recollection was that it didn't work with COLOR for "stupid"
 reasons. Can you confirm that
 bin/arb_blend_func_extended-fbo-extended-blend-pattern_gles2 -auto
 passes with this patch?

>>> Yes, it does, in fact
>>> arb_blend_func_extended-fbo-extended-blend-pattern_gles2 hits the i965
>>> assertion failure I mentioned above unless this patch is applied.
>>
>> This causes the test in question to fail on nouveau... the TGSI shader
>> generated starts with
>>
>> FRAG
>> PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1
>> DCL IN[0], POSITION, LINEAR
>> DCL OUT[0], SAMPLEMASK
>
> Heh, this smells a lot like the bug fixed in PATCH 1, but somewhere in
> the mesa state tracker.  st_glsl_to_tgsi.cpp:2422 does:
>
> | entry = new(mem_ctx) variable_storage(var,
> |   PROGRAM_OUTPUT,
> |   var->data.location
> |   + var->data.index);
>
> which is obviously bogus, e.g. for var->data.location ==
> FRAG_RESULT_COLOR and var->data.index == 1 you get
> FRAG_RESULT_SAMPLE_MASK which explains the sample mask declaration
> above.

 Right, because having FRAG_RESULT_COLOR and index != 0 was never
 possible prior to this. That might be why Ryan stuck it into
 FRAG_RESULT_DATA0 [I may have been the one to suggest that].
>>>
>>> Heh, so I guess that's the "stupid" reason you were referring to,
>>> working around this mesa state tracker bug in the GLSL front-end.
>>
>> Right. Or another way of looking at it, FRAG_RESULT_COLOR + index != 0
>> is illegal,
>
> That would be an acceptable limitation if it were taken into account
> consistently (which would imply among other things binding gl_FragColor
> to FRAG_RESULT_DATA0 if gl_SecondaryFragColor is written).  Otherwise
> you will be telling the linker and the back-end that both
> FRAG_RESULT_COLOR and FRAG_RESULT_DATA0 are being written, which is
> illegal.  The i965 has been misbehaving since forever because of this,
> but we just didn't notice until I added that assertion because the only
> symptom is increased shader recompiles.
>
>> (as it would, among other things, imply multi-rt support for
>> dual-source blending)
>
> FRAG_RESULT_COLOR + index != 0 doesn't imply multi-rt support, it's a
> requirement for multi-rt support, which the GLSL spec makes room for and
> some drivers in this tree could potentially support if the GLSL IR
> representation of dual-source blending wasn't deliberately inconsistent.
>
>> and the former code was making the GLSL fe not emit the illegal
>> combination.
>>
> I started digging into this and found out that that's not the only way
> the GLSL-to-TGSI pass relies on a GLSL bug: The output semantic array
> calculation in st_translate_fragment_program() relies on there being
> multiple locations marked as written in the shader's OutputsWritten set
> in the dual-source blending case (IOW you're relying on the bug fixed by
> the previous patch too).  GLSL is not TGSI, in GLSL the secondary and
> primary colors of a dual-source-blended output have the same location,
> so you cannot expect there will be multiple elements present in a bitset
> of locations.
>
>> Whichever way you look at it, breaking st/mesa isn't a great option.
>
> Then you may want to take a look at the attached patches in addition.
> They are untested and I have close to zero experience with the
> GLSL-to-TGSI pass, so they may break st/mesa after all.
>

Oops, I attached the wrong 0001-glsl.*.patch file, these are the right
patches.

>>   -ilia
>

From 7503e974d08636ae83d7248af8cc5991bd0034e5 Mon Sep 17 00:00:00 2001
From: Francisco Jerez 
Date: Tue, 23 Aug 2016 11:15:57 -0700

Re: [Mesa-dev] [PATCH 2/3] glsl: Fix incorrect hard-coded location of the gl_SecondaryFragColorEXT built-in.

2016-08-23 Thread Francisco Jerez
Ilia Mirkin  writes:

> On Tue, Aug 23, 2016 at 12:18 AM, Francisco Jerez  
> wrote:
>> Ilia Mirkin  writes:
>>
>>> On Tue, Aug 23, 2016 at 12:05 AM, Francisco Jerez  
>>> wrote:
 Ilia Mirkin  writes:

> On Mon, Aug 22, 2016 at 10:55 PM, Francisco Jerez  
> wrote:
>> Ilia Mirkin  writes:
>>
>>> On Mon, Aug 22, 2016 at 9:59 PM, Francisco Jerez 
>>>  wrote:
 gl_SecondaryFragColorEXT should have the same location as gl_FragColor
 for the secondary fragment color to be replicated to all fragment
 outputs.  The incorrect location of gl_SecondaryFragColorEXT would
 cause the linker to mark both FRAG_RESULT_COLOR and FRAG_RESULT_DATA0
 as being written to, which isn't allowed by the spec and would
 ultimately lead to an assertion failure in
 fs_visitor::emit_fb_writes() on my i965-fb-fetch branch.
>>>
>>> My recollection was that it didn't work with COLOR for "stupid"
>>> reasons. Can you confirm that
>>> bin/arb_blend_func_extended-fbo-extended-blend-pattern_gles2 -auto
>>> passes with this patch?
>>>
>> Yes, it does, in fact
>> arb_blend_func_extended-fbo-extended-blend-pattern_gles2 hits the i965
>> assertion failure I mentioned above unless this patch is applied.
>
> This causes the test in question to fail on nouveau... the TGSI shader
> generated starts with
>
> FRAG
> PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1
> DCL IN[0], POSITION, LINEAR
> DCL OUT[0], SAMPLEMASK

 Heh, this smells a lot like the bug fixed in PATCH 1, but somewhere in
 the mesa state tracker.  st_glsl_to_tgsi.cpp:2422 does:

 | entry = new(mem_ctx) variable_storage(var,
 |   PROGRAM_OUTPUT,
 |   var->data.location
 |   + var->data.index);

 which is obviously bogus, e.g. for var->data.location ==
 FRAG_RESULT_COLOR and var->data.index == 1 you get
 FRAG_RESULT_SAMPLE_MASK which explains the sample mask declaration
 above.
>>>
>>> Right, because having FRAG_RESULT_COLOR and index != 0 was never
>>> possible prior to this. That might be why Ryan stuck it into
>>> FRAG_RESULT_DATA0 [I may have been the one to suggest that].
>>
>> Heh, so I guess that's the "stupid" reason you were referring to,
>> working around this mesa state tracker bug in the GLSL front-end.
>
> Right. Or another way of looking at it, FRAG_RESULT_COLOR + index != 0
> is illegal,

That would be an acceptable limitation if it were taken into account
consistently (which would imply among other things binding gl_FragColor
to FRAG_RESULT_DATA0 if gl_SecondaryFragColor is written).  Otherwise
you will be telling the linker and the back-end that both
FRAG_RESULT_COLOR and FRAG_RESULT_DATA0 are being written, which is
illegal.  The i965 has been misbehaving since forever because of this,
but we just didn't notice until I added that assertion because the only
symptom is increased shader recompiles.

> (as it would, among other things, imply multi-rt support for
> dual-source blending)

FRAG_RESULT_COLOR + index != 0 doesn't imply multi-rt support, it's a
requirement for multi-rt support, which the GLSL spec makes room for and
some drivers in this tree could potentially support if the GLSL IR
representation of dual-source blending wasn't deliberately inconsistent.

> and the former code was making the GLSL fe not emit the illegal
> combination.
>
I started digging into this and found out that that's not the only way
the GLSL-to-TGSI pass relies on a GLSL bug: The output semantic array
calculation in st_translate_fragment_program() relies on there being
multiple locations marked as written in the shader's OutputsWritten set
in the dual-source blending case (IOW you're relying on the bug fixed by
the previous patch too).  GLSL is not TGSI, in GLSL the secondary and
primary colors of a dual-source-blended output have the same location,
so you cannot expect there will be multiple elements present in a bitset
of locations.

> Whichever way you look at it, breaking st/mesa isn't a great option.

Then you may want to take a look at the attached patches in addition.
They are untested and I have close to zero experience with the
GLSL-to-TGSI pass, so they may break st/mesa after all.

>   -ilia

From ccd4f17436018688c77a23ea0728316a7f5141b3 Mon Sep 17 00:00:00 2001
From: Francisco Jerez 
Date: Sat, 20 Aug 2016 14:55:19 -0700
Subject: [PATCH] glsl: Fix gl_program::OutputsWritten computation for
 dual-source blending.

---
 src/compiler/glsl/ir_set_program_inouts.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/compiler/glsl/ir_set_program_inouts.cpp 

[Mesa-dev] [PATCH] i965/blorp: Add a format parameter to blorp_fast_clear

2016-08-23 Thread Jason Ekstrand
We were inferring the format from the surface but that doesn't always work
as the surface format is the texture format and we want the render format
(which may be different on i965).

Signed-off-by: Jason Ekstrand 
Cc: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/blorp.h   | 2 +-
 src/mesa/drivers/dri/i965/blorp_clear.c | 4 ++--
 src/mesa/drivers/dri/i965/brw_blorp.c   | 4 +++-
 3 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/blorp.h 
b/src/mesa/drivers/dri/i965/blorp.h
index c20e2be..22b5760 100644
--- a/src/mesa/drivers/dri/i965/blorp.h
+++ b/src/mesa/drivers/dri/i965/blorp.h
@@ -68,7 +68,7 @@ brw_blorp_blit(struct brw_context *brw,
 void
 blorp_fast_clear(struct brw_context *brw,
  const struct brw_blorp_surf *surf,
- uint32_t level, uint32_t layer,
+ uint32_t level, uint32_t layer, enum isl_format format,
  uint32_t x0, uint32_t y0, uint32_t x1, uint32_t y1);
 
 void
diff --git a/src/mesa/drivers/dri/i965/blorp_clear.c 
b/src/mesa/drivers/dri/i965/blorp_clear.c
index 2da08f8..9e6b2a9 100644
--- a/src/mesa/drivers/dri/i965/blorp_clear.c
+++ b/src/mesa/drivers/dri/i965/blorp_clear.c
@@ -99,7 +99,7 @@ brw_blorp_params_get_clear_kernel(struct brw_context *brw,
 
 void
 blorp_fast_clear(struct brw_context *brw, const struct brw_blorp_surf *surf,
- uint32_t level, uint32_t layer,
+ uint32_t level, uint32_t layer, enum isl_format format,
  uint32_t x0, uint32_t y0, uint32_t x1, uint32_t y1)
 {
struct brw_blorp_params params;
@@ -119,7 +119,7 @@ blorp_fast_clear(struct brw_context *brw, const struct 
brw_blorp_surf *surf,
brw_blorp_params_get_clear_kernel(brw, , true);
 
brw_blorp_surface_info_init(brw, , surf, level, layer,
-   surf->surf->format, true);
+   format, true);
 
brw_blorp_exec(brw, );
 }
diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c 
b/src/mesa/drivers/dri/i965/brw_blorp.c
index f4c2740..b504861 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp.c
+++ b/src/mesa/drivers/dri/i965/brw_blorp.c
@@ -647,7 +647,9 @@ do_single_blorp_clear(struct brw_context *brw, struct 
gl_framebuffer *fb,
   DBG("%s (fast) to mt %p level %d layer %d\n", __FUNCTION__,
   irb->mt, irb->mt_level, irb->mt_layer);
 
-  blorp_fast_clear(brw, , level, layer, x0, y0, x1, y1);
+  blorp_fast_clear(brw, , level, layer,
+   brw->render_target_format[format],
+   x0, y0, x1, y1);
 
   /* Now that the fast clear has occurred, put the buffer in
* INTEL_FAST_CLEAR_STATE_CLEAR so that we won't waste time doing
-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] anv: Give the installed intel_icd.json file an absolute path

2016-08-23 Thread Nicholas Miell

On 08/23/2016 02:07 AM, Emil Velikov wrote:

Skimmed through the discussion and I'm not sure the above will be enough.

Since the user is free to place json files in $HOME/.local ... this
implies that they may _not_ have access to /usr or /etc. Thus as they
install the file (to say $HOME/foo/lib) the Vulkan loader will not be
able to pick it up.


Why not?

Stick the 32-bit library in $HOME/foo/lib, the 64-bit library in 
$HOME/foo/lib64, set library_path to "/home/emil/foo/$LIB/lib_mygpu.so" 
in the $HOME/.local/share/vulkan/icd.d/mygpu.json file, and it should 
work, right?



In theory one should be able to have a varying .json (one with and one
without the path) although the heuristics will be fragile due to the
varying $LIB (like the case of Debian and derivatives) and $DESTDIR.

So I see the following options:
 - new configure option - have to agree with Dave, please don't.
Furthermore we already have more than 50! of them.
 - fold the "w or w/o full path" under and existing option
(--enable-debug ?) - somewhat picky/fragile as well. Kind of OK for
short term solution.
 - use LD_LIBRARY_PATH - slightly annoying yet add once and forget
about it. OK-ish short term solution.
 - json update - the better long term solution imho.

Json update:
 - the same file _cannot_ be provided by multiple packages


RPM certainly allows this, as long as the packages that provide the file 
have the same name & version, different architectures, and the files are 
byte-identical. Sticking $LIB in the path instead of "lib" or "lib64" 
makes the files byte-identical.



 - thus we should use differently named files, but we should not rely
on the filename to determine arch/triple.
There has been no such recommendations/restrictions about the name so
starting now feels wrong.
 - so let's bump "file_format_version" - 1.0.1 or 1.1.0 ?
 - add ICD::arch tag and teach the loader to honour it. For json files
missing it we'll assume a "can be used on this platform" behaviour.
 - should we bother with the other parts of the triple or add it only
as need arise ?

With the above in place, one can have full path + filename for the
mutlilib setups. Old loaders will just get an error upon dlopen of
incompatible/wrong arch ICDs.

How do the above short/long term solutions sound ?

Thanks
Emil



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v1 04/13] tests/sw_sync: Add subtest test_alloc_fence_invalid_timeline

2016-08-23 Thread robert . foss
From: Robert Foss 

This subtests tests that creating fences on negative timelines fail.

Signed-off-by: Robert Foss 
---
 tests/sw_sync.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/tests/sw_sync.c b/tests/sw_sync.c
index a8c8ca4..102647d 100644
--- a/tests/sw_sync.c
+++ b/tests/sw_sync.c
@@ -55,6 +55,10 @@ static void test_alloc_fence(void)
sw_sync_timeline_destroy(timeline);
 }
 
+static void test_alloc_fence_invalid_timeline(void)
+{
+   sw_sync_fence_create(-1, 0);
+}
 
 igt_main
 {
@@ -63,5 +67,8 @@ igt_main
 
igt_subtest("alloc_fence")
test_alloc_fence();
+
+   igt_subtest("alloc_fence_invalid_timeline")
+   test_alloc_fence_invalid_timeline();
 }
 
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v1 03/13] tests/sw_sync: Add subtest test_alloc_fence

2016-08-23 Thread robert . foss
From: Robert Foss 

Add subtest alloc_fence that verifies that it's possible to allocate a fence
on a timeline.

Signed-off-by: Robert Foss 
---
 tests/sw_sync.c | 16 
 1 file changed, 16 insertions(+)

diff --git a/tests/sw_sync.c b/tests/sw_sync.c
index d2d4c42..a8c8ca4 100644
--- a/tests/sw_sync.c
+++ b/tests/sw_sync.c
@@ -43,9 +43,25 @@ static void test_alloc_timeline(void)
sw_sync_timeline_destroy(timeline);
 }
 
+static void test_alloc_fence(void)
+{
+   int in_fence;
+   int timeline;
+
+   timeline = sw_sync_timeline_create();
+   in_fence = sw_sync_fence_create(timeline, 0);
+
+   sw_sync_fence_destroy(in_fence);
+   sw_sync_timeline_destroy(timeline);
+}
+
+
 igt_main
 {
igt_subtest("alloc_timeline")
test_alloc_timeline();
+
+   igt_subtest("alloc_fence")
+   test_alloc_fence();
 }
 
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v1 13/13] tests/sw_sync: Add subtest test_sync_multi_producer_single_consumer

2016-08-23 Thread robert . foss
From: Robert Foss 

This subtest runs a single consumer thread and multiple producer thread that
are synchronized using multiple timelines.

Signed-off-by: Robert Foss 
---
 tests/sw_sync.c | 139 
 1 file changed, 139 insertions(+)

diff --git a/tests/sw_sync.c b/tests/sw_sync.c
index 9ff5088..2accdee 100644
--- a/tests/sw_sync.c
+++ b/tests/sw_sync.c
@@ -446,6 +446,142 @@ static void test_sync_multi_consumer_producer(void)
igt_assert_f(thread_ret == 0, "A sync thread reported failure.\n");
 }
 
+static int test_mspc_wait_on_fence(int fence)
+{
+   int error, active;
+
+   do {
+   error = sw_sync_fence_count_status(fence,
+  SW_SYNC_FENCE_STATUS_ERROR);
+   igt_assert_f(error == 0, "Error occurred on fence\n");
+   active = sw_sync_fence_count_status(fence,
+   
SW_SYNC_FENCE_STATUS_ACTIVE);
+   } while (active);
+
+   return 0;
+}
+
+static struct {
+   int iterations;
+   int threads;
+   int counter;
+   int cons_timeline;
+   int *prod_timeline;
+   pthread_mutex_t lock;
+} test_mpsc_data;
+
+static int mpsc_producer_thread(void *d)
+{
+   int id = (long)d;
+   int fence, i;
+   int *prod_timeline = test_mpsc_data.prod_timeline;
+   int cons_timeline = test_mpsc_data.cons_timeline;
+   int iterations = test_mpsc_data.iterations;
+
+   for (i = 0; i < iterations; i++) {
+   fence = sw_sync_fence_create(cons_timeline, i);
+
+   /* Wait for the consumer to finish. Use alternate
+* means of waiting on the fence
+*/
+   if ((iterations + id) % 8 != 0) {
+   igt_assert_f(sw_sync_wait(fence, -1) > 0,
+"Failure waiting on fence\n");
+   } else {
+   igt_assert_f(test_mspc_wait_on_fence(fence) == 0,
+"Failure waiting on fence\n");
+   }
+
+   /* Every producer increments the counter, the consumer
+* checks and erases it
+*/
+   pthread_mutex_lock(_mpsc_data.lock);
+   test_mpsc_data.counter++;
+   pthread_mutex_unlock(_mpsc_data.lock);
+
+   sw_sync_timeline_inc(prod_timeline[id], 1);
+   sw_sync_fence_destroy(fence);
+   }
+
+   return 0;
+}
+
+static int mpsc_consumer_thread(void)
+{
+   int fence, merged, tmp, it, i;
+   int *prod_timeline = test_mpsc_data.prod_timeline;
+   int cons_timeline = test_mpsc_data.cons_timeline;
+   int iterations = test_mpsc_data.iterations;
+   int n = test_mpsc_data.threads;
+
+   for (it = 1; it <= iterations; it++) {
+   fence = sw_sync_fence_create(prod_timeline[0], it);
+   for (i = 1; i < n; i++) {
+   tmp = sw_sync_fence_create(prod_timeline[i], it);
+   merged = sw_sync_merge(tmp, fence);
+   sw_sync_fence_destroy(tmp);
+   sw_sync_fence_destroy(fence);
+   fence = merged;
+   }
+
+   /* Make sure we see an increment from every producer thread.
+* Vary the means by which we wait.
+*/
+   if (iterations % 8 != 0) {
+   igt_assert_f(sw_sync_wait(fence, -1) == 0,
+   "Producers did not increment as 
expected\n");
+   } else {
+   igt_assert_f(test_mspc_wait_on_fence(fence) == 0,
+"Failure waiting on fence\n");
+   }
+
+   igt_assert_f(test_mpsc_data.counter == n * it,
+"Counter value mismatch\n");
+
+   /* Release the producer threads */
+   sw_sync_timeline_inc(cons_timeline, 1);
+   sw_sync_fence_destroy(fence);
+   }
+
+   return 0;
+}
+
+/* IMPORTANT NOTE: if you see this test failing on your system, it may be
+ * due to a shortage of file descriptors. Please ensure your system has
+ * a sensible limit for this test to finish correctly.
+ */
+static void test_sync_multi_producer_single_consumer(void)
+{
+   int iterations = 1 << 12;
+   int n = 5;
+   int prod_timeline[n];
+   int cons_timeline;
+   pthread_t threads[n];
+   long i;
+
+   cons_timeline = sw_sync_timeline_create();
+   for (i = 0; i < n; i++)
+   prod_timeline[i] = sw_sync_timeline_create();
+
+   test_mpsc_data.prod_timeline = prod_timeline;
+   test_mpsc_data.cons_timeline = cons_timeline;
+   test_mpsc_data.iterations = iterations;
+   test_mpsc_data.threads = n;
+   test_mpsc_data.counter = 0;

[Mesa-dev] [PATCH v1 11/13] tests/sw_sync: Add subtest test_sync_random_merge

2016-08-23 Thread robert . foss
From: Robert Foss 

This subtest verifies that creating many timelines and merging random fences
from each timeline with eachother results in merged fences that are fully
functional.

Signed-off-by: Robert Foss 
---
 tests/sw_sync.c | 73 +
 1 file changed, 73 insertions(+)

diff --git a/tests/sw_sync.c b/tests/sw_sync.c
index 0e67ad5..8e5a7c9 100644
--- a/tests/sw_sync.c
+++ b/tests/sw_sync.c
@@ -383,6 +383,76 @@ static void test_sync_multi_consumer_producer(void)
igt_assert_f(thread_ret == 0, "A sync thread reported failure.\n");
 }
 
+static void test_sync_random_merge(void)
+{
+   int i, size, ret;
+   const int nbr_timeline = 32;
+   const int nbr_merge = 1024;
+   int fence_map[nbr_timeline];
+   int timeline_arr[nbr_timeline];
+   int fence, tmpfence, merged;
+   int timeline, timeline_offset, sync_pt;
+
+   srand(time(NULL));
+
+   for (i = 0; i < nbr_timeline; i++)
+   timeline_arr[i] = sw_sync_timeline_create();
+
+   memset(fence_map, -1, sizeof(fence_map));
+
+   sync_pt = rand();
+   fence = sw_sync_fence_create(timeline_arr[0], sync_pt);
+
+   fence_map[0] = sync_pt;
+
+   /* Randomly create syncpoints out of a fixed set of timelines,
+* and merge them together.
+*/
+   for (i = 0; i < nbr_merge; i++) {
+   /* Generate syncpoint. */
+   timeline_offset = rand() % nbr_timeline;
+   timeline = timeline_arr[timeline_offset];
+   sync_pt = rand();
+
+   /* Keep track of the latest sync_pt in each timeline. */
+   if (fence_map[timeline_offset] == -1)
+   fence_map[timeline_offset] = sync_pt;
+   else if (fence_map[timeline_offset] < sync_pt)
+   fence_map[timeline_offset] = sync_pt;
+
+   /* Merge. */
+   tmpfence = sw_sync_fence_create(timeline, sync_pt);
+   merged = sw_sync_merge(tmpfence, fence);
+   sw_sync_fence_destroy(tmpfence);
+   sw_sync_fence_destroy(fence);
+   fence = merged;
+   }
+
+   size = 0;
+   for (i = 0; i < nbr_timeline; i++)
+   if (fence_map[i] != -1)
+   size++;
+
+   /* Trigger the merged fence. */
+   for (i = 0; i < nbr_timeline; i++) {
+   if (fence_map[i] != -1) {
+   ret = sw_sync_wait(fence, 0);
+   igt_assert_f(ret == 0,
+   "Failure waiting on fence until timeout\n");
+   /* Increment the timeline to the last sync_pt */
+   sw_sync_timeline_inc(timeline_arr[i], fence_map[i]);
+   }
+   }
+
+   /* Check that the fence is triggered. */
+   ret = sw_sync_wait(fence, 0);
+   igt_assert_f(ret > 0, "Failure triggering fence\n");
+
+   sw_sync_fence_destroy(fence);
+   for (i = 0; i < nbr_timeline; i++)
+   sw_sync_timeline_destroy(timeline_arr[i]);
+}
+
 igt_main
 {
igt_subtest("alloc_timeline")
@@ -411,5 +481,8 @@ igt_main
 
igt_subtest("sync_multi_consumer_producer")
test_sync_multi_consumer_producer();
+
+   igt_subtest("sync_random_merge")
+   test_sync_random_merge();
 }
 
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v1 07/13] tests/sw_sync: Add subtest test_sync_merge

2016-08-23 Thread robert . foss
From: Robert Foss 

Add subtest test_sync_merge that tests merging fences and the validity of the
resulting merged fence.

Signed-off-by: Robert Foss 
---
 tests/sw_sync.c | 67 +
 1 file changed, 67 insertions(+)

diff --git a/tests/sw_sync.c b/tests/sw_sync.c
index 07e9638..00ac44b 100644
--- a/tests/sw_sync.c
+++ b/tests/sw_sync.c
@@ -115,6 +115,70 @@ static void test_sync_wait(void)
sw_sync_timeline_destroy(timeline);
 }
 
+static void test_sync_merge(void)
+{
+   int in_fence[3];
+   int fence_merge;
+   int timeline;
+   int active, signaled;
+
+   timeline = sw_sync_timeline_create();
+   in_fence[0] = sw_sync_fence_create(timeline, 1);
+   in_fence[1] = sw_sync_fence_create(timeline, 2);
+   in_fence[2] = sw_sync_fence_create(timeline, 3);
+
+   fence_merge = sw_sync_merge(in_fence[0], in_fence[1]);
+   fence_merge = sw_sync_merge(in_fence[2], fence_merge);
+
+   /* confirm all fences have one active point (even d) */
+   active = sw_sync_fence_count_status(in_fence[0],
+   SW_SYNC_FENCE_STATUS_ACTIVE);
+   igt_assert_f(active == 1, "in_fence[0] has too many active fences\n");
+   active = sw_sync_fence_count_status(in_fence[1],
+   SW_SYNC_FENCE_STATUS_ACTIVE);
+   igt_assert_f(active == 1, "in_fence[1] has too many active fences\n");
+   active = sw_sync_fence_count_status(in_fence[2],
+   SW_SYNC_FENCE_STATUS_ACTIVE);
+   igt_assert_f(active == 1, "in_fence[2] has too many active fences\n");
+   active = sw_sync_fence_count_status(fence_merge,
+   SW_SYNC_FENCE_STATUS_ACTIVE);
+   igt_assert_f(active == 1, "fence_merge has too many active fences\n");
+
+   /* confirm that fence_merge is not signaled until the max of fence 
0,1,2 */
+   sw_sync_timeline_inc(timeline, 1);
+   signaled = sw_sync_fence_count_status(in_fence[0],
+ SW_SYNC_FENCE_STATUS_SIGNALED);
+   active = sw_sync_fence_count_status(fence_merge,
+   SW_SYNC_FENCE_STATUS_ACTIVE);
+   igt_assert_f(signaled == 1, "in_fence[0] did not signal\n");
+   igt_assert_f(active == 1, "fence_merge signaled too early\n");
+
+   sw_sync_timeline_inc(timeline, 1);
+   signaled = sw_sync_fence_count_status(in_fence[1],
+ SW_SYNC_FENCE_STATUS_SIGNALED);
+   active = sw_sync_fence_count_status(fence_merge,
+   SW_SYNC_FENCE_STATUS_ACTIVE);
+   igt_assert_f(signaled == 1, "in_fence[1] did not signal\n");
+   igt_assert_f(active == 1, "fence_merge signaled too early\n");
+
+   sw_sync_timeline_inc(timeline, 1);
+   signaled = sw_sync_fence_count_status(in_fence[2],
+ SW_SYNC_FENCE_STATUS_SIGNALED);
+   igt_assert_f(signaled == 1, "in_fence[2] did not signal\n");
+   signaled = sw_sync_fence_count_status(fence_merge,
+  SW_SYNC_FENCE_STATUS_SIGNALED);
+   active = sw_sync_fence_count_status(fence_merge,
+   SW_SYNC_FENCE_STATUS_ACTIVE);
+   igt_assert_f(active == 0 && signaled == 1,
+"fence_merge did not signal\n");
+
+   sw_sync_fence_destroy(in_fence[0]);
+   sw_sync_fence_destroy(in_fence[1]);
+   sw_sync_fence_destroy(in_fence[2]);
+   sw_sync_fence_destroy(fence_merge);
+   sw_sync_timeline_destroy(timeline);
+}
+
 igt_main
 {
igt_subtest("alloc_timeline")
@@ -131,5 +195,8 @@ igt_main
 
igt_subtest("sync_wait")
test_sync_wait();
+
+   igt_subtest("sync_merge")
+   test_sync_merge();
 }
 
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v1 09/13] tests/sw_sync: Add subtest test_sync_multi_consumer

2016-08-23 Thread robert . foss
From: Robert Foss 

This subtest verifies the access ordering of multiple consumer threads.

Signed-off-by: Robert Foss 
---
 tests/sw_sync.c | 103 
 1 file changed, 103 insertions(+)

diff --git a/tests/sw_sync.c b/tests/sw_sync.c
index db03f48..0c9c923 100644
--- a/tests/sw_sync.c
+++ b/tests/sw_sync.c
@@ -27,6 +27,8 @@
  *Robert Foss 
  */
 
+#include 
+#include 
 #include 
 
 #include "sw_sync.h"
@@ -35,6 +37,15 @@
 
 IGT_TEST_DESCRIPTION("Test SW Sync Framework");
 
+typedef struct {
+   int timeline;
+   uint32_t thread_id;
+   uint32_t nbr_threads;
+   uint32_t nbr_iterations;
+   volatile uint32_t * volatile counter;
+   sem_t *sem;
+} data_t;
+
 static void test_alloc_timeline(void)
 {
int timeline;
@@ -203,6 +214,95 @@ static void test_sync_merge_same(void)
sw_sync_timeline_destroy(timeline);
 }
 
+static void * test_sync_multi_consumer_thread(void *arg)
+{
+   data_t *data = arg;
+   int thread_id = data->thread_id;
+   int nbr_threads = data->nbr_threads;
+   int timeline = data->timeline;
+   int iterations = data->nbr_iterations;
+   int ret, i;
+
+   for (i = 0; i < iterations; i++) {
+   int next_point = i * nbr_threads + thread_id;
+   int fence = sw_sync_fence_create(timeline, next_point);
+
+   ret = sw_sync_wait(fence, 1000);
+   if (ret <= 0)
+   {
+   return (void *) 1;
+   }
+
+   if (*(data->counter) != next_point)
+   {
+   return (void *) 1;
+   }
+
+   sem_post(data->sem);
+   sw_sync_fence_destroy(fence);
+   }
+   return NULL;
+}
+
+static void test_sync_multi_consumer(void)
+{
+   const uint32_t nbr_threads = 8;
+   const uint32_t nbr_iterations = 1 << 14;
+   data_t data_arr[nbr_threads];
+   pthread_t thread_arr[nbr_threads];
+   sem_t sem;
+   int timeline;
+   volatile uint32_t counter = 0;
+   uintptr_t thread_ret = 0;
+   data_t data;
+   int i, ret;
+
+   sem_init(, 0, 0);
+   timeline = sw_sync_timeline_create();
+
+   data.nbr_iterations = nbr_iterations;
+   data.nbr_threads = nbr_threads;
+   data.counter = 
+   data.timeline = timeline;
+   data.sem = 
+
+   /* Start sync threads. */
+   for (i = 0; i < nbr_threads; i++)
+   {
+   data_arr[i] = data;
+   data_arr[i].thread_id = i;
+   ret = pthread_create(_arr[i], NULL,
+test_sync_multi_consumer_thread,
+(void *) &(data_arr[i]));
+   igt_assert_eq(ret, 0);
+   }
+
+   /* Produce 'content'. */
+   for (i = 0; i < nbr_threads * nbr_iterations; i++)
+   {
+   sem_wait();
+
+   counter++;
+   sw_sync_timeline_inc(timeline, 1);
+   }
+
+   /* Wait for threads to complete. */
+   for (i = 0; i < nbr_threads; i++)
+   {
+   uintptr_t local_thread_ret;
+   pthread_join(thread_arr[i], (void **)_thread_ret);
+   thread_ret |= local_thread_ret;
+   }
+
+   sw_sync_timeline_destroy(timeline);
+   sem_destroy();
+
+   igt_assert_f(counter == nbr_threads * nbr_iterations,
+"Counter has unexpected value.\n");
+
+   igt_assert_f(thread_ret == 0, "A sync thread reported failure.\n");
+}
+
 igt_main
 {
igt_subtest("alloc_timeline")
@@ -225,5 +325,8 @@ igt_main
 
igt_subtest("sync_merge_same")
test_sync_merge_same();
+
+   igt_subtest("sync_multi_consumer")
+   test_sync_multi_consumer();
 }
 
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v1 10/13] tests/sw_sync: Add subtest test_sync_multi_consumer_producer

2016-08-23 Thread robert . foss
From: Robert Foss 

This test verifies that stressing the kernel by creating multiple
consumer/producer threads that wait on a single timeline to be incremented
by another conumer/producer thread does not fail.
And that the order amongst the threads is maintained.

Signed-off-by: Robert Foss 
---
 tests/sw_sync.c | 83 +
 1 file changed, 83 insertions(+)

diff --git a/tests/sw_sync.c b/tests/sw_sync.c
index 0c9c923..0e67ad5 100644
--- a/tests/sw_sync.c
+++ b/tests/sw_sync.c
@@ -303,6 +303,86 @@ static void test_sync_multi_consumer(void)
igt_assert_f(thread_ret == 0, "A sync thread reported failure.\n");
 }
 
+static void * test_sync_multi_consumer_producer_thread(void *arg)
+{
+   data_t *data = arg;
+   int thread_id = data->thread_id;
+   int nbr_threads = data->nbr_threads;
+   int timeline = data->timeline;
+   int iterations = data->nbr_iterations;
+   int ret, i;
+
+   for (i = 0; i < iterations; i++) {
+   int next_point = i * nbr_threads + thread_id;
+   int fence = sw_sync_fence_create(timeline, next_point);
+
+   ret = sw_sync_wait(fence, 1000);
+   if (ret <= 0)
+   {
+   return (void *) 1;
+   }
+
+   if (*(data->counter) != next_point)
+   {
+   return (void *) 1;
+   }
+
+   (*data->counter)++;
+
+   /* Kick off the next thread. */
+   sw_sync_timeline_inc(timeline, 1);
+
+   sw_sync_fence_destroy(fence);
+   }
+   return NULL;
+}
+
+static void test_sync_multi_consumer_producer(void)
+{
+   const uint32_t nbr_threads = 8;
+   const uint32_t nbr_iterations = 1 << 14;
+   data_t data_arr[nbr_threads];
+   pthread_t thread_arr[nbr_threads];
+   int timeline;
+   volatile uint32_t counter = 0;
+   uintptr_t thread_ret = 0;
+   data_t data;
+   int i, ret;
+
+   timeline = sw_sync_timeline_create();
+
+   data.nbr_iterations = nbr_iterations;
+   data.nbr_threads = nbr_threads;
+   data.counter = 
+   data.timeline = timeline;
+
+   /* Start consumer threads. */
+   for (i = 0; i < nbr_threads; i++)
+   {
+   data_arr[i] = data;
+   data_arr[i].thread_id = i;
+   ret = pthread_create(_arr[i], NULL,
+test_sync_multi_consumer_producer_thread,
+(void *) &(data_arr[i]));
+   igt_assert_eq(ret, 0);
+   }
+
+   /* Wait for threads to complete. */
+   for (i = 0; i < nbr_threads; i++)
+   {
+   uintptr_t local_thread_ret;
+   pthread_join(thread_arr[i], (void **)_thread_ret);
+   thread_ret |= local_thread_ret;
+   }
+
+   sw_sync_timeline_destroy(timeline);
+
+   igt_assert_f(counter == nbr_threads * nbr_iterations,
+"Counter has unexpected value.\n");
+
+   igt_assert_f(thread_ret == 0, "A sync thread reported failure.\n");
+}
+
 igt_main
 {
igt_subtest("alloc_timeline")
@@ -328,5 +408,8 @@ igt_main
 
igt_subtest("sync_multi_consumer")
test_sync_multi_consumer();
+
+   igt_subtest("sync_multi_consumer_producer")
+   test_sync_multi_consumer_producer();
 }
 
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v1 02/13] tests/sw_sync: Add sw_sync test

2016-08-23 Thread robert . foss
From: Robert Foss 

Add initial tests for sw_sync.

Signed-off-by: Gustavo Padovan 
Signed-off-by: Robert Foss 
---
 lib/sw_sync.c  | 13 ++---
 lib/sw_sync.h  |  2 +-
 tests/Makefile.sources |  1 +
 tests/sw_sync.c| 51 ++
 4 files changed, 59 insertions(+), 8 deletions(-)
 create mode 100644 tests/sw_sync.c

diff --git a/lib/sw_sync.c b/lib/sw_sync.c
index c4e7d07..e3d5f85 100644
--- a/lib/sw_sync.c
+++ b/lib/sw_sync.c
@@ -150,8 +150,6 @@ int sw_sync_wait(int fence, int timeout)
 
ret = poll(, 1, timeout);
 
-   sw_sync_fd_close(fence);
-
return ret;
 }
 
@@ -179,9 +177,10 @@ static struct sync_file_info *sync_file_info(int fd)
info->num_fences = num_fences;
 
fence_info = calloc(num_fences, sizeof(struct sync_fence_info));
-   if (!fence_info)
+   if (!fence_info) {
free(info);
return NULL;
+   }
 
info->sync_fence_info = (uint64_t)(unsigned long) (fence_info);
 
@@ -217,18 +216,18 @@ int sw_sync_fence_size(int fd)
return count;
 }
 
-int sw_sync_fence_count_with_status(int fd, int status)
+int sw_sync_fence_count_status(int fd, int status)
 {
int i, count = 0;
-   struct sync_fence_info *fenceInfo = NULL;
+   struct sync_fence_info *fence_info = NULL;
struct sync_file_info *info = sync_file_info(fd);
 
if (!info)
return -1;
 
-   fenceInfo = (struct sync_fence_info *)(uintptr_t)info->sync_fence_info;
+   fence_info = (struct sync_fence_info *)(uintptr_t)info->sync_fence_info;
for (i = 0 ; i < info->num_fences ; i++) {
-   if (fenceInfo[i].status == status)
+   if (fence_info[i].status == status)
count++;
}
 
diff --git a/lib/sw_sync.h b/lib/sw_sync.h
index b179adf..1092608 100644
--- a/lib/sw_sync.h
+++ b/lib/sw_sync.h
@@ -43,7 +43,7 @@ void sw_sync_timeline_inc(int fd, uint32_t count);
 int sw_sync_merge(int fd1, int fd2);
 int sw_sync_wait(int fence, int timeout);
 int sw_sync_fence_size(int fd);
-int sw_sync_fence_count_with_status(int fd, int status);
+int sw_sync_fence_count_status(int fd, int status);
 
 #endif
 
diff --git a/tests/Makefile.sources b/tests/Makefile.sources
index 72a58ad..0ba769f 100644
--- a/tests/Makefile.sources
+++ b/tests/Makefile.sources
@@ -125,6 +125,7 @@ TESTS_progs_M = \
prime_mmap_kms \
prime_self_import \
prime_vgem \
+   sw_sync \
template \
vgem_basic \
vgem_slow \
diff --git a/tests/sw_sync.c b/tests/sw_sync.c
new file mode 100644
index 000..d2d4c42
--- /dev/null
+++ b/tests/sw_sync.c
@@ -0,0 +1,51 @@
+/*
+ * Copyright 2012 Google, Inc
+ * Copyright © 2016 Collabora, Ltd.
+ *
+ * Based on the implementation from the Android Open Source Project
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ * Authors:
+ *Robert Foss 
+ */
+
+#include 
+
+#include "sw_sync.h"
+#include "igt.h"
+#include "igt_aux.h"
+
+IGT_TEST_DESCRIPTION("Test SW Sync Framework");
+
+static void test_alloc_timeline(void)
+{
+   int timeline;
+
+   timeline = sw_sync_timeline_create();
+   sw_sync_timeline_destroy(timeline);
+}
+
+igt_main
+{
+   igt_subtest("alloc_timeline")
+   test_alloc_timeline();
+}
+
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v1 08/13] tests/sw_sync: Add subtest test_sync_merge_same

2016-08-23 Thread robert . foss
From: Robert Foss 

This subtest verifies merging a fence with itself does not fail.

Signed-off-by: Robert Foss 
---
 tests/sw_sync.c | 27 +++
 1 file changed, 27 insertions(+)

diff --git a/tests/sw_sync.c b/tests/sw_sync.c
index 00ac44b..db03f48 100644
--- a/tests/sw_sync.c
+++ b/tests/sw_sync.c
@@ -179,6 +179,30 @@ static void test_sync_merge(void)
sw_sync_timeline_destroy(timeline);
 }
 
+static void test_sync_merge_same(void)
+{
+   int in_fence[2];
+   int timeline;
+   int signaled;
+
+   timeline = sw_sync_timeline_create();
+   in_fence[0] = sw_sync_fence_create(timeline, 1);
+   in_fence[1] = sw_sync_merge(in_fence[0], in_fence[0]);
+
+   signaled = sw_sync_fence_count_status(in_fence[0],
+ SW_SYNC_FENCE_STATUS_SIGNALED);
+   igt_assert_f(signaled == 0, "fence signaled too early\n");
+
+   sw_sync_timeline_inc(timeline, 1);
+   signaled = sw_sync_fence_count_status(in_fence[0],
+ SW_SYNC_FENCE_STATUS_SIGNALED);
+   igt_assert_f(signaled == 1, "fence did not signal\n");
+
+   sw_sync_fence_destroy(in_fence[0]);
+   sw_sync_fence_destroy(in_fence[1]);
+   sw_sync_timeline_destroy(timeline);
+}
+
 igt_main
 {
igt_subtest("alloc_timeline")
@@ -198,5 +222,8 @@ igt_main
 
igt_subtest("sync_merge")
test_sync_merge();
+
+   igt_subtest("sync_merge_same")
+   test_sync_merge_same();
 }
 
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v1 06/13] tests/sw_sync: Add subtest test_sync_wait

2016-08-23 Thread robert . foss
From: Robert Foss 

This subtest verifies that waiting on fences works properly.

Signed-off-by: Robert Foss 
---
 tests/sw_sync.c | 38 ++
 1 file changed, 38 insertions(+)

diff --git a/tests/sw_sync.c b/tests/sw_sync.c
index 851430e..07e9638 100644
--- a/tests/sw_sync.c
+++ b/tests/sw_sync.c
@@ -80,6 +80,41 @@ static void test_alloc_merge_fence(void)
sw_sync_timeline_destroy(timeline[1]);
 }
 
+static void test_sync_wait(void)
+{
+   int fence, ret;
+   int timeline;
+
+   timeline = sw_sync_timeline_create();
+   fence = sw_sync_fence_create(timeline, 5);
+
+   /* Wait on fence until timeout */
+   ret = sw_sync_wait(fence, 0);
+   igt_assert_f(ret == 0, "Failure waiting on fence until timeout\n");
+
+   /* Advance timeline from 0 -> 1 */
+   sw_sync_timeline_inc(timeline, 1);
+
+   /* Wait on fence until timeout */
+   ret = sw_sync_wait(fence, 0);
+   igt_assert_f(ret == 0, "Failure waiting on fence until timeout\n");
+
+   /* Signal the fence */
+   sw_sync_timeline_inc(timeline, 4);
+
+   /* Wait successfully */
+   ret = sw_sync_wait(fence, 0);
+   igt_assert_f(ret > 0, "Failure waiting on fence\n");
+
+   /* Go even further, and confirm wait still succeeds */
+   sw_sync_timeline_inc(timeline, 10);
+   ret = sw_sync_wait(fence, 0);
+   igt_assert_f(ret > 0, "Failure waiting ahead\n");
+
+   sw_sync_fence_destroy(fence);
+   sw_sync_timeline_destroy(timeline);
+}
+
 igt_main
 {
igt_subtest("alloc_timeline")
@@ -93,5 +128,8 @@ igt_main
 
igt_subtest("alloc_merge_fence")
test_alloc_merge_fence();
+
+   igt_subtest("sync_wait")
+   test_sync_wait();
 }
 
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v1 12/13] tests/sw_sync: Add subtest test_sync_multi_timeline_wait

2016-08-23 Thread robert . foss
From: Robert Foss 

This subtest verifies that waiting, timing out on a wait and that counting
fences in various states works.

Signed-off-by: Robert Foss 
---
 tests/sw_sync.c | 66 +
 1 file changed, 66 insertions(+)

diff --git a/tests/sw_sync.c b/tests/sw_sync.c
index 8e5a7c9..9ff5088 100644
--- a/tests/sw_sync.c
+++ b/tests/sw_sync.c
@@ -214,6 +214,69 @@ static void test_sync_merge_same(void)
sw_sync_timeline_destroy(timeline);
 }
 
+static void test_sync_multi_timeline_wait(void)
+{
+   int timeline[3];
+   int in_fence[3];
+   int fence_merge;
+   int active, signaled, ret;
+
+   timeline[0] = sw_sync_timeline_create();
+   timeline[1] = sw_sync_timeline_create();
+   timeline[2] = sw_sync_timeline_create();
+
+   in_fence[0] = sw_sync_fence_create(timeline[0], 5);
+   in_fence[1] = sw_sync_fence_create(timeline[1], 5);
+   in_fence[2] = sw_sync_fence_create(timeline[2], 5);
+
+   fence_merge = sw_sync_merge(in_fence[0], in_fence[1]);
+   fence_merge = sw_sync_merge(in_fence[2], fence_merge);
+
+   /* Confirm fence isn't signaled */
+   active = sw_sync_fence_count_status(fence_merge,
+   SW_SYNC_FENCE_STATUS_ACTIVE);
+   igt_assert_f(active == 3, "Fence signaled too early\n");
+
+   ret = sw_sync_wait(fence_merge, 0);
+   igt_assert_f(ret == 0, "Failure waiting on fence until timeout\n");
+
+   sw_sync_timeline_inc(timeline[0], 5);
+   active = sw_sync_fence_count_status(fence_merge,
+   SW_SYNC_FENCE_STATUS_ACTIVE);
+   signaled = sw_sync_fence_count_status(fence_merge,
+ SW_SYNC_FENCE_STATUS_SIGNALED);
+   igt_assert_f(active == 2 && signaled == 1,
+   "Fence did not signal properly\n");
+
+   sw_sync_timeline_inc(timeline[1], 5);
+   active = sw_sync_fence_count_status(fence_merge,
+   SW_SYNC_FENCE_STATUS_ACTIVE);
+   signaled = sw_sync_fence_count_status(fence_merge,
+ SW_SYNC_FENCE_STATUS_SIGNALED);
+   igt_assert_f(active == 1 && signaled == 2,
+   "Fence did not signal properly\n");
+
+   sw_sync_timeline_inc(timeline[2], 5);
+   active = sw_sync_fence_count_status(fence_merge,
+   SW_SYNC_FENCE_STATUS_ACTIVE);
+   signaled = sw_sync_fence_count_status(fence_merge,
+ SW_SYNC_FENCE_STATUS_SIGNALED);
+   igt_assert_f(active == 0 && signaled == 3,
+"Fence did not signal properly\n");
+
+   /* confirm you can successfully wait */
+   ret = sw_sync_wait(fence_merge, 100);
+   igt_assert_f(ret > 0, "Failure waiting on signaled fence\n");
+
+   sw_sync_fence_destroy(in_fence[0]);
+   sw_sync_fence_destroy(in_fence[1]);
+   sw_sync_fence_destroy(in_fence[2]);
+   sw_sync_fence_destroy(fence_merge);
+   sw_sync_timeline_destroy(timeline[0]);
+   sw_sync_timeline_destroy(timeline[1]);
+   sw_sync_timeline_destroy(timeline[2]);
+}
+
 static void * test_sync_multi_consumer_thread(void *arg)
 {
data_t *data = arg;
@@ -476,6 +539,9 @@ igt_main
igt_subtest("sync_merge_same")
test_sync_merge_same();
 
+   igt_subtest("sync_multi_timeline_wait")
+   test_sync_multi_timeline_wait();
+
igt_subtest("sync_multi_consumer")
test_sync_multi_consumer();
 
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v1 05/13] tests/sw_sync: Add subtest test_alloc_merge_fence

2016-08-23 Thread robert . foss
From: Robert Foss 

This subtest verifies that merging two fences works in the simples possible
case.

Signed-off-by: Robert Foss 
---
 tests/sw_sync.c | 23 +++
 1 file changed, 23 insertions(+)

diff --git a/tests/sw_sync.c b/tests/sw_sync.c
index 102647d..851430e 100644
--- a/tests/sw_sync.c
+++ b/tests/sw_sync.c
@@ -60,6 +60,26 @@ static void test_alloc_fence_invalid_timeline(void)
sw_sync_fence_create(-1, 0);
 }
 
+static void test_alloc_merge_fence(void)
+{
+   int in_fence[2];
+   int fence_merge;
+   int timeline[2];
+
+   timeline[0] = sw_sync_timeline_create();
+   timeline[1] = sw_sync_timeline_create();
+
+   in_fence[0] = sw_sync_fence_create(timeline[0], 1);
+   in_fence[1] = sw_sync_fence_create(timeline[1], 1);
+   fence_merge = sw_sync_merge(in_fence[1], in_fence[0]);
+
+   sw_sync_fence_destroy(in_fence[0]);
+   sw_sync_fence_destroy(in_fence[1]);
+   sw_sync_fence_destroy(fence_merge);
+   sw_sync_timeline_destroy(timeline[0]);
+   sw_sync_timeline_destroy(timeline[1]);
+}
+
 igt_main
 {
igt_subtest("alloc_timeline")
@@ -70,5 +90,8 @@ igt_main
 
igt_subtest("alloc_fence_invalid_timeline")
test_alloc_fence_invalid_timeline();
+
+   igt_subtest("alloc_merge_fence")
+   test_alloc_merge_fence();
 }
 
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v1 01/13] lib/sw_sync: Add helper functions for managing synchronization primitives

2016-08-23 Thread robert . foss
From: Robert Foss 

Base functions to help testing the Sync File Framework (explicit fencing
mechanism ported from Android).
These functions allow you to create, use and destroy timelines and fences.

Signed-off-by: Gustavo Padovan 
Signed-off-by: Robert Foss 
---
 lib/Makefile.sources |   2 +
 lib/sw_sync.c| 238 +++
 lib/sw_sync.h|  49 +++
 3 files changed, 289 insertions(+)
 create mode 100644 lib/sw_sync.c
 create mode 100644 lib/sw_sync.h

diff --git a/lib/Makefile.sources b/lib/Makefile.sources
index bac9a7f..3dc7c3c 100644
--- a/lib/Makefile.sources
+++ b/lib/Makefile.sources
@@ -61,6 +61,8 @@ lib_source_list = \
rendercopy_gen8.c   \
rendercopy_gen9.c   \
rendercopy.h\
+   sw_sync.c   \
+   sw_sync.h   \
intel_reg_map.c \
intel_iosf.c\
igt_kms.c   \
diff --git a/lib/sw_sync.c b/lib/sw_sync.c
new file mode 100644
index 000..c4e7d07
--- /dev/null
+++ b/lib/sw_sync.c
@@ -0,0 +1,238 @@
+/*
+ * Copyright 2012 Google, Inc
+ * Copyright © 2016 Collabora, Ltd.
+ *
+ * Based on the implementation from the Android Open Source Project
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ * Authors:
+ *Robert Foss 
+ */
+
+#ifndef ANDROID
+#define _GNU_SOURCE
+#else
+#include 
+#endif
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "sw_sync.h"
+#include "drmtest.h"
+#include "ioctl_wrappers.h"
+
+#ifndef SW_SYNC_IOC_INC
+struct sw_sync_create_fence_data {
+   __u32   value;
+   charname[32];
+   __s32   fence;
+};
+
+#define SW_SYNC_IOC_MAGIC  'W'
+#define SW_SYNC_IOC_CREATE_FENCE   _IOWR(SW_SYNC_IOC_MAGIC, 0,\
+   struct 
sw_sync_create_fence_data)
+#define SW_SYNC_IOC_INC_IOW(SW_SYNC_IOC_MAGIC, 1, 
__u32)
+#endif
+
+int sw_sync_fd_is_valid(int fd)
+{
+   int status;
+
+   if (fd == -1)
+   return 0;
+
+   status = fcntl(fd, F_GETFD, 0);
+   return status >= 0;
+}
+
+static
+void sw_sync_fd_close(int fd)
+{
+   if (fd == -1)
+   return;
+
+   if (fcntl(fd, F_GETFD, 0) < 0)
+   return;
+
+   close(fd);
+}
+
+int sw_sync_timeline_create(void)
+{
+   int fd = open("/dev/sw_sync", O_RDWR);
+
+   if (!sw_sync_fd_is_valid(fd))
+   fd = open("/sys/kernel/debug/sync/sw_sync", O_RDWR);
+
+   igt_assert(sw_sync_fd_is_valid(fd));
+
+   return fd;
+}
+
+void sw_sync_timeline_destroy(int fd)
+{
+   return sw_sync_fd_close(fd);
+}
+
+void sw_sync_fence_destroy(int fd)
+{
+   return sw_sync_fd_close(fd);
+}
+
+int sw_sync_fence_create(int fd, int32_t seqno)
+{
+   struct sw_sync_create_fence_data data = {};
+
+   data.value = seqno;
+   if (fd >= 0) {
+   do_ioctl(fd, SW_SYNC_IOC_CREATE_FENCE, );
+   return data.fence;
+   } else {
+   do_ioctl_err(fd, SW_SYNC_IOC_CREATE_FENCE, , EBADF);
+   return -1;
+   }
+}
+
+void sw_sync_timeline_inc(int fd, uint32_t count)
+{
+   uint32_t arg = count;
+
+   if (fd == 0 || fd == -1)
+   return;
+
+   do_ioctl(fd, SW_SYNC_IOC_INC, );
+}
+
+int sw_sync_merge(int fd1, int fd2)
+{
+   struct sync_merge_data data = {};
+   int err;
+
+   data.fd2 = fd2;
+
+   err = ioctl(fd1, SYNC_IOC_MERGE, );
+   if (err < 0)
+   return err;
+
+   sw_sync_fd_is_valid(data.fence);
+
+   return data.fence;
+}
+
+int sw_sync_wait(int fence, int timeout)
+{
+   struct pollfd fds;
+   int ret;
+
+   fds.fd = fence;
+   fds.events = POLLIN | POLLERR;
+
+  

[Mesa-dev] [PATCH v1 00/13] Implement sw_sync test

2016-08-23 Thread robert . foss
From: Robert Foss 

This series implements the sw_sync test and the lib/sw_sync helper functions
for said test.

Gustavo Padovans sw_sync series was just de-staged in
gregkh-staging/staging-next [1], and this test is targeted at verifying the
functionality implemented in that series.

The sw_sync subtests range from very basic tests of the sw_sync functionality,
to stress testing and randomized tests.

[1] http://git.kernel.org/cgit/linux/kernel/git/gregkh/staging.git/

Robert Foss (13):
  lib/sw_sync: Add helper functions for managing synchronization
primitives
  tests/sw_sync: Add sw_sync test
  tests/sw_sync: Add subtest test_alloc_fence
  tests/sw_sync: Add subtest test_alloc_fence_invalid_timeline
  tests/sw_sync: Add subtest test_alloc_merge_fence
  tests/sw_sync: Add subtest test_sync_wait
  tests/sw_sync: Add subtest test_sync_merge
  tests/sw_sync: Add subtest test_sync_merge_same
  tests/sw_sync: Add subtest test_sync_multi_consumer
  tests/sw_sync: Add subtest test_sync_multi_consumer_producer
  tests/sw_sync: Add subtest test_sync_random_merge
  tests/sw_sync: Add subtest test_sync_multi_timeline_wait
  tests/sw_sync: Add subtest test_sync_multi_producer_single_consumer

 lib/Makefile.sources   |   2 +
 lib/sw_sync.c  | 237 +
 lib/sw_sync.h  |  49 
 tests/Makefile.sources |   1 +
 tests/sw_sync.c| 693 +
 5 files changed, 982 insertions(+)
 create mode 100644 lib/sw_sync.c
 create mode 100644 lib/sw_sync.h
 create mode 100644 tests/sw_sync.c

-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 12/16] glsl: Add a lowering pass to handle advanced blending modes.

2016-08-23 Thread Francisco Jerez
Kenneth Graunke  writes:

> On Monday, August 22, 2016 5:50:49 PM PDT Francisco Jerez wrote:
>> Kenneth Graunke  writes:
>> 
>> > Many GPUs cannot handle GL_KHR_blend_equation_advanced natively, and
>> > need to emulate it in the pixel shader.  This lowering pass implements
>> > all the necessary math for advanced blending.  It fetches the existing
>> > framebuffer value using the MESA_shader_framebuffer_fetch built-in
>> > variables, and the previous commit's state var uniform to select
>> > which equation to use.
>> >
>> > This is done at the GLSL IR level to make it easy for all drivers to
>> > implement the GL_KHR_blend_equation_advanced extension and share code.
>> >
>> > Drivers need to hook up MESA_shader_framebuffer_fetch functionality:
>> > 1. Hook up the fb_fetch_output variable
>> > 2. Implement BlendBarrier()
>> >
>> > Then to get KHR_blend_equation_advanced, they simply need to:
>> > 3. Disable hardware blending based on ctx->Color._AdvancedBlendEnabled
>> > 4. Call this lowering pass.
>> >
>> > Very little driver specific code should be required.
>> >
>> > v2: Handle multiple output variables per render target (which may exist
>> > due to ARB_enhanced_layouts), and array variables (even with one
>> > render target, we might have out vec4 color[1]), and non-vec4
>> > variables (it's easier than finding spec text to justify not
>> > handling it).  Thanks to Francisco Jerez for the feedback.
>> >
>> > Signed-off-by: Kenneth Graunke 
>> > ---
>> >  src/compiler/Makefile.sources  |   1 +
>> >  src/compiler/glsl/ir_optimization.h|   1 +
>> >  .../glsl/lower_blend_equation_advanced.cpp | 557 
>> > +
>> >  3 files changed, 559 insertions(+)
>> >  create mode 100644 src/compiler/glsl/lower_blend_equation_advanced.cpp
>> >
>> > diff --git a/src/compiler/Makefile.sources b/src/compiler/Makefile.sources
>> > index 0ff9b23..fe26132 100644
>> > --- a/src/compiler/Makefile.sources
>> > +++ b/src/compiler/Makefile.sources
>> > @@ -77,6 +77,7 @@ LIBGLSL_FILES = \
>> >glsl/loop_analysis.h \
>> >glsl/loop_controls.cpp \
>> >glsl/loop_unroll.cpp \
>> > +  glsl/lower_blend_equation_advanced.cpp \
>> >glsl/lower_buffer_access.cpp \
>> >glsl/lower_buffer_access.h \
>> >glsl/lower_const_arrays_to_uniforms.cpp \
>> > diff --git a/src/compiler/glsl/ir_optimization.h 
>> > b/src/compiler/glsl/ir_optimization.h
>> > index c29260a..3bd6928 100644
>> > --- a/src/compiler/glsl/ir_optimization.h
>> > +++ b/src/compiler/glsl/ir_optimization.h
>> > @@ -151,6 +151,7 @@ void optimize_dead_builtin_variables(exec_list 
>> > *instructions,
>> >  bool lower_tess_level(gl_linked_shader *shader);
>> >  
>> >  bool lower_vertex_id(gl_linked_shader *shader);
>> > +bool lower_blend_equation_advanced(gl_linked_shader *shader);
>> >  
>> >  bool lower_subroutine(exec_list *instructions, struct 
>> > _mesa_glsl_parse_state *state);
>> >  void propagate_invariance(exec_list *instructions);
>> > diff --git a/src/compiler/glsl/lower_blend_equation_advanced.cpp 
>> > b/src/compiler/glsl/lower_blend_equation_advanced.cpp
>> > new file mode 100644
>> > index 000..5632865
>> > --- /dev/null
>> > +++ b/src/compiler/glsl/lower_blend_equation_advanced.cpp
>> > @@ -0,0 +1,557 @@
>> > +/*
>> > + * Copyright © 2016 Intel Corporation
>> > + *
>> > + * Permission is hereby granted, free of charge, to any person obtaining a
>> > + * copy of this software and associated documentation files (the 
>> > "Software"),
>> > + * to deal in the Software without restriction, including without 
>> > limitation
>> > + * the rights to use, copy, modify, merge, publish, distribute, 
>> > sublicense,
>> > + * and/or sell copies of the Software, and to permit persons to whom the
>> > + * Software is furnished to do so, subject to the following conditions:
>> > + *
>> > + * The above copyright notice and this permission notice (including the 
>> > next
>> > + * paragraph) shall be included in all copies or substantial portions of 
>> > the
>> > + * Software.
>> > + *
>> > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, 
>> > EXPRESS OR
>> > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF 
>> > MERCHANTABILITY,
>> > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT 
>> > SHALL
>> > + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR 
>> > OTHER
>> > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
>> > + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
>> > + * DEALINGS IN THE SOFTWARE.
>> > + */
>> > +
>> > +#include "ir.h"
>> > +#include "ir_builder.h"
>> > +#include "ir_optimization.h"
>> > +#include "ir_hierarchical_visitor.h"
>> > +#include "program/prog_instruction.h"
>> > +#include "program/prog_statevars.h"
>> > +#include "util/bitscan.h"
>> > +
>> > +using namespace 

Re: [Mesa-dev] [PATCH v2] i965: don't fail to shift height images for levels.

2016-08-23 Thread Jason Ekstrand
On Aug 23, 2016 10:10 AM, "Ilia Mirkin"  wrote:
>
> On Tue, Aug 23, 2016 at 12:34 PM, Jason Ekstrand 
wrote:
> > On Tue, Aug 23, 2016 at 9:28 AM, Jason Ekstrand 
> > wrote:
> >>
> >> This smells like fish... I'm going to have a look.
> >
> >
> > Ok, I looked...
> >
> >>
> >> On Tue, Aug 23, 2016 at 8:29 AM, Antia Puentes 
> >> wrote:
> >>>
> >>> From: Dave Airlie 
> >>>
> >>> This fixes one subtest of:
> >>> GL44-CTS.shader_image_size.advanced-nonMS-fs-int
> >>>
> >>> I've no idea why this wouldn't be scaled up here,
> >>> and I've no idea what else will break, but I might
> >>> as well open for discussion.
> >>>
> >>> v2: Only shift height if the texture is not an 1D_ARRAY,
> >>> it fixes assertion in GL44-CTS.texture_view.gettexparameter
> >>> due to the original patch (Antia).
> >>>
> >>> Signed-off-by: Dave Airlie 
> >>> Signed-off-by: Antia Puentes 
> >>> ---
> >>>
> >>> I have not taken a deep look to the test so take this with a grain of
> >>> salt.
> >>> As I said in a previous email, this patch raises an assertion in
> >>> GL44-CTS.texture_view.gettexparameter:
> >>>
> >>> "glcts: intel_mipmap_tree.c:368: intel_miptree_create_layout:
Assertion
> >>> `height0 = 1' failed."
> >>>
> >>> Looking at the code surrounding the assertion, we have:
> >>>
> >>>if (target == GL_TEXTURE_1D_ARRAY)
> >>>  assert(height0 == 1);
> >>>
> >>> which suggests that we should avoid shifting the height at least for
> >>> TEXTURE_1D_ARRAYs. Sending a second version of the patch.
> >>>
> >>>  src/mesa/drivers/dri/i965/intel_tex_image.c | 2 +-
> >>>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>>
> >>> diff --git a/src/mesa/drivers/dri/i965/intel_tex_image.c
> >>> b/src/mesa/drivers/dri/i965/intel_tex_image.c
> >>> index 958f8bd..120e7e0 100644
> >>> --- a/src/mesa/drivers/dri/i965/intel_tex_image.c
> >>> +++ b/src/mesa/drivers/dri/i965/intel_tex_image.c
> >>> @@ -49,7 +49,7 @@ intel_miptree_create_for_teximage(struct brw_context
> >>> *brw,
> >>> /* Figure out image dimensions at start level. */
> >>> for (i = intelImage->base.Base.Level; i > 0; i--) {
> >>>width <<= 1;
> >>> -  if (height != 1)
> >>> +  if (intelObj->base.Target != GL_TEXTURE_1D_ARRAY)
> >>>   height <<= 1;
> >>>if (intelObj->base.Target == GL_TEXTURE_3D)
> >>>   depth <<= 1;
> >
> >
> > I think the whole pile of code is bogus and needs to be rewritten.
First
> > off, why are we using a for loop?  Seriously?  Second, I think what we
want
> > is more like
> >
> > switch (intelObj->base.Target) {
> > case GL_TEXTURE_3D:
> >depth <<= intelImage->base.Base.Level;
> >/* Fall through */
> >
> > case GL_TEXTURE_2D:
> > case GL_TEXTURE_2D_ARRAY:
> > case GL_TEXTURE_RECTANGLE:
>
> and cube/cube_array

Yes

> and ms/ms_array

Noish... You can't have multiple levels for multisampled but sure.

> [this is why != 1d_array tends to be attractive]

Sure, but it had better be != 1d &&!= 1d_array.  I'm not married to the
switch statement, but we could easily make it exhaustive with an
unreachable default case and solve that problem.  That may actually be
better than != 1d.  If we did that, we could put an assert(levels ==0)  in
the case for multisampled.

In any case, I do think we should kill the loop and be a bit more explicit.

--Jason

> >height <<= intelImage->base.Base.Level
> >/* Fall through */
> >
> > default:
> >width <<= intelImage->base.Base.Level;
> > }
> >
> > I think that would be far more clear and correct.  I don't have a build
of
> > the CTS handy so I didn't send it as a 3rd counter-patch.
> >
> > --Jason
> >
> >>>
> >>> --
> >>> 2.7.4
> >>>
> >>> ___
> >>> mesa-dev mailing list
> >>> mesa-dev@lists.freedesktop.org
> >>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> >>
> >>
> >
> >
> > ___
> > mesa-dev mailing list
> > mesa-dev@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> >
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] radeonsi: add support for cull distances.

2016-08-23 Thread Marek Olšák
On Tue, Aug 23, 2016 at 4:46 PM, Kai Wasserbäch
 wrote:
> Hey Dave,
> Dave Airlie wrote on 23.08.2016 03:28:
>> From: Dave Airlie 
>>
>> This should be all that is required for cull distances to work
>> on radeonsi.
>
> in case this is indeed enough to enable ARB_cull_distance, mind adding that to
> GL3.txt and the release notes?

Good point.

>
>> Signed-off-by: Dave Airlie 
>> ---
>>  src/gallium/drivers/radeonsi/si_pipe.c  | 3 ++-
>>  src/gallium/drivers/radeonsi/si_state.c | 8 
>>  2 files changed, 6 insertions(+), 5 deletions(-)
>>
>> diff --git a/src/gallium/drivers/radeonsi/si_pipe.c 
>> b/src/gallium/drivers/radeonsi/si_pipe.c
>> index 62b62db..14b0b80 100644
>> --- a/src/gallium/drivers/radeonsi/si_pipe.c
>> +++ b/src/gallium/drivers/radeonsi/si_pipe.c
>> @@ -436,7 +436,6 @@ static int si_get_param(struct pipe_screen* pscreen, 
>> enum pipe_cap param)
>>   case PIPE_CAP_TEXTURE_GATHER_OFFSETS:
>>   case PIPE_CAP_VERTEXID_NOBASE:
>>   case PIPE_CAP_QUERY_BUFFER_OBJECT:
>> - case PIPE_CAP_CULL_DISTANCE:
>>   case PIPE_CAP_PRIMITIVE_RESTART_FOR_PATCHES:
>>   case PIPE_CAP_TGSI_VOTE:
>>   case PIPE_CAP_MAX_WINDOW_RECTANGLES:
>> @@ -447,6 +446,8 @@ static int si_get_param(struct pipe_screen* pscreen, 
>> enum pipe_cap param)
>>   case PIPE_CAP_MULTI_DRAW_INDIRECT_PARAMS:
>>   return sscreen->has_draw_indirect_multi;
>>
>> + case PIPE_CAP_CULL_DISTANCE:
>> + return 1;
>>   case PIPE_CAP_MAX_SHADER_PATCH_VARYINGS:
>>   return 30;
>
> Hm, shouldn't the PIPE_CAP_CULL_DISTANCE just be moved up to the long list of
> supported features with boolean caps at the beginning of si_get_param()?
> (Instead of creating another "true segment"?)

Yes.

Also, like Ilia said, clipdist_mask should be used so as not to break
gl_ClipVertex.

And please keep the empty line between declarations and code.

Thanks,
Marek
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 95190] Tomb Raider with PostProcessing enable and Depth of Field set to Ultra has white stuff in the foreground

2016-08-23 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=95190

--- Comment #10 from Samuel Pitoiset  ---
This has been recently fixed by Kenneth. At least, I can't reproduce that fog
issue on gm107 (I did force GL4.3 because that game requires a 4.2 context
though).

Karol, can you confirm and close the ticket, please?

-- 
You are receiving this mail because:
You are the QA Contact for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2] i965: don't fail to shift height images for levels.

2016-08-23 Thread Ilia Mirkin
On Tue, Aug 23, 2016 at 12:34 PM, Jason Ekstrand  wrote:
> On Tue, Aug 23, 2016 at 9:28 AM, Jason Ekstrand 
> wrote:
>>
>> This smells like fish... I'm going to have a look.
>
>
> Ok, I looked...
>
>>
>> On Tue, Aug 23, 2016 at 8:29 AM, Antia Puentes 
>> wrote:
>>>
>>> From: Dave Airlie 
>>>
>>> This fixes one subtest of:
>>> GL44-CTS.shader_image_size.advanced-nonMS-fs-int
>>>
>>> I've no idea why this wouldn't be scaled up here,
>>> and I've no idea what else will break, but I might
>>> as well open for discussion.
>>>
>>> v2: Only shift height if the texture is not an 1D_ARRAY,
>>> it fixes assertion in GL44-CTS.texture_view.gettexparameter
>>> due to the original patch (Antia).
>>>
>>> Signed-off-by: Dave Airlie 
>>> Signed-off-by: Antia Puentes 
>>> ---
>>>
>>> I have not taken a deep look to the test so take this with a grain of
>>> salt.
>>> As I said in a previous email, this patch raises an assertion in
>>> GL44-CTS.texture_view.gettexparameter:
>>>
>>> "glcts: intel_mipmap_tree.c:368: intel_miptree_create_layout: Assertion
>>> `height0 = 1' failed."
>>>
>>> Looking at the code surrounding the assertion, we have:
>>>
>>>if (target == GL_TEXTURE_1D_ARRAY)
>>>  assert(height0 == 1);
>>>
>>> which suggests that we should avoid shifting the height at least for
>>> TEXTURE_1D_ARRAYs. Sending a second version of the patch.
>>>
>>>  src/mesa/drivers/dri/i965/intel_tex_image.c | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/src/mesa/drivers/dri/i965/intel_tex_image.c
>>> b/src/mesa/drivers/dri/i965/intel_tex_image.c
>>> index 958f8bd..120e7e0 100644
>>> --- a/src/mesa/drivers/dri/i965/intel_tex_image.c
>>> +++ b/src/mesa/drivers/dri/i965/intel_tex_image.c
>>> @@ -49,7 +49,7 @@ intel_miptree_create_for_teximage(struct brw_context
>>> *brw,
>>> /* Figure out image dimensions at start level. */
>>> for (i = intelImage->base.Base.Level; i > 0; i--) {
>>>width <<= 1;
>>> -  if (height != 1)
>>> +  if (intelObj->base.Target != GL_TEXTURE_1D_ARRAY)
>>>   height <<= 1;
>>>if (intelObj->base.Target == GL_TEXTURE_3D)
>>>   depth <<= 1;
>
>
> I think the whole pile of code is bogus and needs to be rewritten.  First
> off, why are we using a for loop?  Seriously?  Second, I think what we want
> is more like
>
> switch (intelObj->base.Target) {
> case GL_TEXTURE_3D:
>depth <<= intelImage->base.Base.Level;
>/* Fall through */
>
> case GL_TEXTURE_2D:
> case GL_TEXTURE_2D_ARRAY:
> case GL_TEXTURE_RECTANGLE:

and cube/cube_array
and ms/ms_array

[this is why != 1d_array tends to be attractive]

>height <<= intelImage->base.Base.Level
>/* Fall through */
>
> default:
>width <<= intelImage->base.Base.Level;
> }
>
> I think that would be far more clear and correct.  I don't have a build of
> the CTS handy so I didn't send it as a 3rd counter-patch.
>
> --Jason
>
>>>
>>> --
>>> 2.7.4
>>>
>>> ___
>>> mesa-dev mailing list
>>> mesa-dev@lists.freedesktop.org
>>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>>
>>
>
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 97426] glScissor gives vertically inverted result

2016-08-23 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=97426

--- Comment #3 from Emil Velikov  ---
Classic xlib libGL.so (comment 1) vs classic swrast_dri.so this report ?

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 97426] glScissor gives vertically inverted result

2016-08-23 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=97426

--- Comment #2 from Ruslan Kabatsayev  ---
Hmm, a difference from mine is in GL_RENDERER: here's what I have:

GL_RENDERER   = Software Rasterizer
GL_VERSION= 2.1 Mesa 12.1.0-devel (git-607ab6d)
GL_VENDOR = Mesa Project

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] egl: reset 'ViewportInitialized' when unbinding current context

2016-08-23 Thread Emil Velikov
On 23 August 2016 at 17:36, Dongwon Kim  wrote:
> I read that part in egl specification as well but I wasn't sure
> if this statement covers the case when the context is unbound via
> eglMakeCurrent call with NULL context then it is made current again.
> The problem I saw was when the context is made current again especially
> with a new surface with different dimension, there's no way the context
> updates its viewport accordingly.
>
In that case one should adjust the viewport manually as mentioned in the spec ?

Also note that the src/mesa/ code is _not_ EGL but EGL/GLX agnostic,
so one should we
quite careful that changes don't break in either one of the two ;-)

-Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] egl/wayland-egl: Fix for segfault in dri2_wl_destroy_surface.

2016-08-23 Thread Emil Velikov
On 22 August 2016 at 08:48, Stencel, Joanna  wrote:
> Segfault occurs when destroying EGL surface attached to already destroyed
> Wayland window. The fix is to set to NULL the pointer of surface's
> native window when wl_egl_destroy_window() is called.
>
Are you sure one is not supposed to call eglDestroySurface() first and
then wl_egl_window_destroy() ? As-is the patch "fixes" a crash (by
creating a leak) caused by user misuse. Should we a) leave the crash
to teach people about bugs in their code, b) plug the crash yet cause
a leak, or c) plus the crash w/o causing a leak.

I'm leaning towards a) or c) since b) only papers over things.
-Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] egl: reset 'ViewportInitialized' when unbinding current context

2016-08-23 Thread Dongwon Kim
I read that part in egl specification as well but I wasn't sure 
if this statement covers the case when the context is unbound via 
eglMakeCurrent call with NULL context then it is made current again.
The problem I saw was when the context is made current again especially 
with a new surface with different dimension, there's no way the context 
updates its viewport accordingly. 

On Tue, Aug 23, 2016 at 12:23:47PM +0100, Emil Velikov wrote:
> On 15 August 2016 at 23:58, Dongwon Kim  wrote:
> > 'ViewportInitialized' flag in gl_context has to be reset to '0'
> > when the current context is unbound via a eglMakeCurrent call with
> > all of 'NULL' resources (surfaces and context).
> >
> > This is to make sure the viewport of the context is re-initialized
> > when the same context is bound to new read and draw surfaces
> > (or same surfaces but with different size.) next time when the
> > context is made current again.
> >
> The spec is pretty clear about this:
> 
> "The first time a OpenGL or OpenGL ES context is made current the
> viewport and scissor dimensions are set to the size of the draw
> surface (as though glViewport(0, 0, w, h) and glScissor(0, 0, w, h)
> were called, where w and h are the width and height of the surface,
> respectively). However, the viewport and scissor dimensions are not
> modified when ctx is subsequently made current."
> 
> So unless coffee hasn't kicked this patch is wrong.
> Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2] i965: don't fail to shift height images for levels.

2016-08-23 Thread Jason Ekstrand
On Tue, Aug 23, 2016 at 9:28 AM, Jason Ekstrand 
wrote:

> This smells like fish... I'm going to have a look.
>

Ok, I looked...


> On Tue, Aug 23, 2016 at 8:29 AM, Antia Puentes 
> wrote:
>
>> From: Dave Airlie 
>>
>> This fixes one subtest of:
>> GL44-CTS.shader_image_size.advanced-nonMS-fs-int
>>
>> I've no idea why this wouldn't be scaled up here,
>> and I've no idea what else will break, but I might
>> as well open for discussion.
>>
>> v2: Only shift height if the texture is not an 1D_ARRAY,
>> it fixes assertion in GL44-CTS.texture_view.gettexparameter
>> due to the original patch (Antia).
>>
>> Signed-off-by: Dave Airlie 
>> Signed-off-by: Antia Puentes 
>> ---
>>
>> I have not taken a deep look to the test so take this with a grain of
>> salt.
>> As I said in a previous email, this patch raises an assertion in
>> GL44-CTS.texture_view.gettexparameter:
>>
>> "glcts: intel_mipmap_tree.c:368: intel_miptree_create_layout: Assertion
>> `height0 = 1' failed."
>>
>> Looking at the code surrounding the assertion, we have:
>>
>>if (target == GL_TEXTURE_1D_ARRAY)
>>  assert(height0 == 1);
>>
>> which suggests that we should avoid shifting the height at least for
>> TEXTURE_1D_ARRAYs. Sending a second version of the patch.
>>
>>  src/mesa/drivers/dri/i965/intel_tex_image.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/intel_tex_image.c
>> b/src/mesa/drivers/dri/i965/intel_tex_image.c
>> index 958f8bd..120e7e0 100644
>> --- a/src/mesa/drivers/dri/i965/intel_tex_image.c
>> +++ b/src/mesa/drivers/dri/i965/intel_tex_image.c
>> @@ -49,7 +49,7 @@ intel_miptree_create_for_teximage(struct brw_context
>> *brw,
>> /* Figure out image dimensions at start level. */
>> for (i = intelImage->base.Base.Level; i > 0; i--) {
>>width <<= 1;
>> -  if (height != 1)
>> +  if (intelObj->base.Target != GL_TEXTURE_1D_ARRAY)
>>   height <<= 1;
>>if (intelObj->base.Target == GL_TEXTURE_3D)
>>   depth <<= 1;
>>
>
I think the whole pile of code is bogus and needs to be rewritten.  First
off, why are we using a for loop?  Seriously?  Second, I think what we want
is more like

switch (intelObj->base.Target) {
case GL_TEXTURE_3D:
   depth <<= intelImage->base.Base.Level;
   /* Fall through */

case GL_TEXTURE_2D:
case GL_TEXTURE_2D_ARRAY:
case GL_TEXTURE_RECTANGLE:
   height <<= intelImage->base.Base.Level
   /* Fall through */

default:
   width <<= intelImage->base.Base.Level;
}

I think that would be far more clear and correct.  I don't have a build of
the CTS handy so I didn't send it as a 3rd counter-patch.

--Jason


> --
>> 2.7.4
>>
>> ___
>> mesa-dev mailing list
>> mesa-dev@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>>
>
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] egl/dri2: Do not need to check return value from mtx_lock

2016-08-23 Thread Dongwon Kim
Sure, I will. Thanks

On Tue, Aug 23, 2016 at 11:39:52AM +0100, Emil Velikov wrote:
> Hi Dongwon,
> 
> On 15 August 2016 at 23:20, Dongwon Kim  wrote:
> > A new patch, "[PATCH] egl/dri2: remove error checks on return values from 
> > mtx_lock
> > and cnd_wait" containing additional clean-up has been submitted. Please 
> > disregard
> > this one.
> >
> Thanks for the update. For the future please add this is information
> in the new patch (after the --- line). Using [PATCH v2] and revision
> log would also be appreciated.
> 
> -Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2] i965: don't fail to shift height images for levels.

2016-08-23 Thread Jason Ekstrand
This smells like fish... I'm going to have a look.

On Tue, Aug 23, 2016 at 8:29 AM, Antia Puentes  wrote:

> From: Dave Airlie 
>
> This fixes one subtest of:
> GL44-CTS.shader_image_size.advanced-nonMS-fs-int
>
> I've no idea why this wouldn't be scaled up here,
> and I've no idea what else will break, but I might
> as well open for discussion.
>
> v2: Only shift height if the texture is not an 1D_ARRAY,
> it fixes assertion in GL44-CTS.texture_view.gettexparameter
> due to the original patch (Antia).
>
> Signed-off-by: Dave Airlie 
> Signed-off-by: Antia Puentes 
> ---
>
> I have not taken a deep look to the test so take this with a grain of salt.
> As I said in a previous email, this patch raises an assertion in
> GL44-CTS.texture_view.gettexparameter:
>
> "glcts: intel_mipmap_tree.c:368: intel_miptree_create_layout: Assertion
> `height0 = 1' failed."
>
> Looking at the code surrounding the assertion, we have:
>
>if (target == GL_TEXTURE_1D_ARRAY)
>  assert(height0 == 1);
>
> which suggests that we should avoid shifting the height at least for
> TEXTURE_1D_ARRAYs. Sending a second version of the patch.
>
>  src/mesa/drivers/dri/i965/intel_tex_image.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/mesa/drivers/dri/i965/intel_tex_image.c
> b/src/mesa/drivers/dri/i965/intel_tex_image.c
> index 958f8bd..120e7e0 100644
> --- a/src/mesa/drivers/dri/i965/intel_tex_image.c
> +++ b/src/mesa/drivers/dri/i965/intel_tex_image.c
> @@ -49,7 +49,7 @@ intel_miptree_create_for_teximage(struct brw_context
> *brw,
> /* Figure out image dimensions at start level. */
> for (i = intelImage->base.Base.Level; i > 0; i--) {
>width <<= 1;
> -  if (height != 1)
> +  if (intelObj->base.Target != GL_TEXTURE_1D_ARRAY)
>   height <<= 1;
>if (intelObj->base.Target == GL_TEXTURE_3D)
>   depth <<= 1;
> --
> 2.7.4
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 97260] [bisected] R9 290 low performance in Linux 4.7

2016-08-23 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=97260

--- Comment #26 from Kai  ---
(In reply to Alex Deucher from comment #25)
> Is anyone still having issues with vblank_mode=0?

With or without Michel's patch
(), which btw, still hasn't
landed despite the reviews?

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 18/31] i965/blorp: Add an "exec" function pointer to blorp_context

2016-08-23 Thread Pohjolainen, Topi
On Fri, Aug 19, 2016 at 09:55:55AM -0700, Jason Ekstrand wrote:
> ---
>  src/mesa/drivers/dri/i965/blorp.c   | 27 +--
>  src/mesa/drivers/dri/i965/blorp.h   |  4 
>  src/mesa/drivers/dri/i965/blorp_blit.c  |  2 +-
>  src/mesa/drivers/dri/i965/blorp_clear.c |  6 +++---
>  src/mesa/drivers/dri/i965/blorp_priv.h  | 21 -
>  src/mesa/drivers/dri/i965/brw_blorp.c   | 19 +++
>  src/mesa/drivers/dri/i965/genX_blorp_exec.c |  9 +++--
>  7 files changed, 35 insertions(+), 53 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/blorp.c 
> b/src/mesa/drivers/dri/i965/blorp.c
> index 0688f6b..df10e50 100644
> --- a/src/mesa/drivers/dri/i965/blorp.c
> +++ b/src/mesa/drivers/dri/i965/blorp.c
> @@ -345,31 +345,6 @@ brw_blorp_compile_nir_shader(struct brw_context *brw, 
> struct nir_shader *nir,
>  }
>  
>  void
> -brw_blorp_exec(struct brw_context *brw, const struct brw_blorp_params 
> *params)
> -{
> -   switch (brw->gen) {
> -   case 6:
> -  gen6_blorp_exec(brw, params);
> -  break;
> -   case 7:
> -  if (brw->is_haswell)
> - gen75_blorp_exec(brw, params);
> -  else
> - gen7_blorp_exec(brw, params);
> -  break;
> -   case 8:
> -  gen8_blorp_exec(brw, params);
> -  break;
> -   case 9:
> -  gen9_blorp_exec(brw, params);
> -  break;
> -   default:
> -  /* BLORP is not supported before Gen6. */
> -  unreachable("not reached");
> -   }
> -}
> -
> -void
>  blorp_gen6_hiz_op(struct brw_context *brw, struct brw_blorp_surf *surf,
>unsigned level, unsigned layer, enum gen6_hiz_op op)
>  {
> @@ -436,5 +411,5 @@ blorp_gen6_hiz_op(struct brw_context *brw, struct 
> brw_blorp_surf *surf,
>unreachable("not reached");
> }
>  
> -   brw_blorp_exec(brw, );
> +   brw->blorp.exec(>blorp, brw, );
>  }
> diff --git a/src/mesa/drivers/dri/i965/blorp.h 
> b/src/mesa/drivers/dri/i965/blorp.h
> index a9ef754..671731e 100644
> --- a/src/mesa/drivers/dri/i965/blorp.h
> +++ b/src/mesa/drivers/dri/i965/blorp.h
> @@ -39,6 +39,8 @@ struct hash_table;
>  extern "C" {
>  #endif
>  
> +struct brw_blorp_params;
> +
>  struct blorp_context {
> void *driver_ctx;
>  
> @@ -56,6 +58,8 @@ struct blorp_context {
>  
> uint32_t (*upload_shader)(struct blorp_context *,
>   const void *data, uint32_t size);
> +   void (*exec)(struct blorp_context *blorp, void *batch,
> +const struct brw_blorp_params *params);
>  };
>  
>  void blorp_init(struct blorp_context *blorp, void *driver_ctx,
> diff --git a/src/mesa/drivers/dri/i965/blorp_blit.c 
> b/src/mesa/drivers/dri/i965/blorp_blit.c
> index d01dfff..449e09d 100644
> --- a/src/mesa/drivers/dri/i965/blorp_blit.c
> +++ b/src/mesa/drivers/dri/i965/blorp_blit.c
> @@ -1647,5 +1647,5 @@ brw_blorp_blit(struct brw_context *brw,
>   swizzle_to_scs(GET_SWZ(src_swizzle, i));
> }
>  
> -   brw_blorp_exec(brw, );
> +   brw->blorp.exec(>blorp, brw, );
>  }
> diff --git a/src/mesa/drivers/dri/i965/blorp_clear.c 
> b/src/mesa/drivers/dri/i965/blorp_clear.c
> index fb4d050..3b6d6d7 100644
> --- a/src/mesa/drivers/dri/i965/blorp_clear.c
> +++ b/src/mesa/drivers/dri/i965/blorp_clear.c
> @@ -110,7 +110,7 @@ blorp_fast_clear(struct brw_context *brw, const struct 
> brw_blorp_surf *surf,
> brw_blorp_surface_info_init(brw, , surf, level, layer,
> surf->surf->format, true);
>  
> -   brw_blorp_exec(brw, );
> +   brw->blorp.exec(>blorp, brw, );
>  }
>  
>  
> @@ -156,7 +156,7 @@ blorp_clear(struct brw_context *brw, const struct 
> brw_blorp_surf *surf,
> brw_blorp_surface_info_init(brw, , surf, level, layer,
> format, true);
>  
> -   brw_blorp_exec(brw, );
> +   brw->blorp.exec(>blorp, brw, );
>  }
>  
>  void
> @@ -186,5 +186,5 @@ brw_blorp_ccs_resolve(struct brw_context *brw, struct 
> brw_blorp_surf *surf,
>  
> brw_blorp_params_get_clear_kernel(brw, , true);
>  
> -   brw_blorp_exec(brw, );
> +   brw->blorp.exec(>blorp, brw, );
>  }
> diff --git a/src/mesa/drivers/dri/i965/blorp_priv.h 
> b/src/mesa/drivers/dri/i965/blorp_priv.h
> index 9b987a8..c7a2a03 100644
> --- a/src/mesa/drivers/dri/i965/blorp_priv.h
> +++ b/src/mesa/drivers/dri/i965/blorp_priv.h
> @@ -183,27 +183,6 @@ struct brw_blorp_params
>  void
>  brw_blorp_params_init(struct brw_blorp_params *params);
>  
> -void
> -brw_blorp_exec(struct brw_context *brw, const struct brw_blorp_params 
> *params);
> -
> -void
> -gen6_blorp_exec(struct brw_context *brw,
> -const struct brw_blorp_params *params);
> -
> -void
> -gen7_blorp_exec(struct brw_context *brw,
> -const struct brw_blorp_params *params);
> -
> -void
> -gen75_blorp_exec(struct brw_context *brw,
> - const struct brw_blorp_params *params);
> -
> -void
> -gen8_blorp_exec(struct brw_context *brw, const struct brw_blorp_params 
> *params);
> -
> 

[Mesa-dev] [Bug 97260] [bisected] R9 290 low performance in Linux 4.7

2016-08-23 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=97260

--- Comment #25 from Alex Deucher  ---
Is anyone still having issues with vblank_mode=0?

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 17/31] i965/blorp: Move the guts of brw_blorp_exec into genX_blorp_exec.c

2016-08-23 Thread Pohjolainen, Topi
On Fri, Aug 19, 2016 at 09:55:54AM -0700, Jason Ekstrand wrote:
> ---
>  src/mesa/drivers/dri/i965/blorp.c   | 66 
> -
>  src/mesa/drivers/dri/i965/genX_blorp_exec.c | 66 
> +
>  src/mesa/drivers/dri/i965/genX_blorp_exec.h |  8 ++--
>  3 files changed, 70 insertions(+), 70 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/blorp.c 
> b/src/mesa/drivers/dri/i965/blorp.c
> index dba3441..0688f6b 100644
> --- a/src/mesa/drivers/dri/i965/blorp.c
> +++ b/src/mesa/drivers/dri/i965/blorp.c
> @@ -347,28 +347,6 @@ brw_blorp_compile_nir_shader(struct brw_context *brw, 
> struct nir_shader *nir,
>  void
>  brw_blorp_exec(struct brw_context *brw, const struct brw_blorp_params 
> *params)
>  {
> -   struct gl_context *ctx = >ctx;
> -   const uint32_t estimated_max_batch_usage = brw->gen >= 8 ? 1800 : 1500;
> -   bool check_aperture_failed_once = false;
> -
> -   /* Flush the sampler and render caches.  We definitely need to flush the
> -* sampler cache so that we get updated contents from the render cache for
> -* the glBlitFramebuffer() source.  Also, we are sometimes warned in the
> -* docs to flush the cache between reinterpretations of the same surface
> -* data with different formats, which blorp does for stencil and depth
> -* data.
> -*/
> -   brw_emit_mi_flush(brw);
> -
> -   brw_select_pipeline(brw, BRW_RENDER_PIPELINE);
> -
> -retry:
> -   intel_batchbuffer_require_space(brw, estimated_max_batch_usage, 
> RENDER_RING);
> -   intel_batchbuffer_save_state(brw);
> -   drm_intel_bo *saved_bo = brw->batch.bo;
> -   uint32_t saved_used = USED_BATCH(brw->batch);
> -   uint32_t saved_state_batch_offset = brw->batch.state_batch_offset;
> -
> switch (brw->gen) {
> case 6:
>gen6_blorp_exec(brw, params);
> @@ -389,50 +367,6 @@ retry:
>/* BLORP is not supported before Gen6. */
>unreachable("not reached");
> }
> -
> -   /* Make sure we didn't wrap the batch unintentionally, and make sure we
> -* reserved enough space that a wrap will never happen.
> -*/
> -   assert(brw->batch.bo == saved_bo);
> -   assert((USED_BATCH(brw->batch) - saved_used) * 4 +
> -  (saved_state_batch_offset - brw->batch.state_batch_offset) <
> -  estimated_max_batch_usage);
> -   /* Shut up compiler warnings on release build */
> -   (void)saved_bo;
> -   (void)saved_used;
> -   (void)saved_state_batch_offset;
> -
> -   /* Check if the blorp op we just did would make our batch likely to fail 
> to
> -* map all the BOs into the GPU at batch exec time later.  If so, flush 
> the
> -* batch and try again with nothing else in the batch.
> -*/
> -   if (dri_bufmgr_check_aperture_space(>batch.bo, 1)) {
> -  if (!check_aperture_failed_once) {
> - check_aperture_failed_once = true;
> - intel_batchbuffer_reset_to_saved(brw);
> - intel_batchbuffer_flush(brw);
> - goto retry;
> -  } else {
> - int ret = intel_batchbuffer_flush(brw);
> - WARN_ONCE(ret == -ENOSPC,
> -   "i965: blorp emit exceeded available aperture space\n");
> -  }
> -   }
> -
> -   if (unlikely(brw->always_flush_batch))
> -  intel_batchbuffer_flush(brw);
> -
> -   /* We've smashed all state compared to what the normal 3D pipeline
> -* rendering tracks for GL.
> -*/
> -   brw->ctx.NewDriverState |= BRW_NEW_BLORP;
> -   brw->no_depth_or_stencil = false;
> -   brw->ib.type = -1;
> -
> -   /* Flush the sampler cache so any texturing from the destination is
> -* coherent.
> -*/
> -   brw_emit_mi_flush(brw);
>  }
>  
>  void
> diff --git a/src/mesa/drivers/dri/i965/genX_blorp_exec.c 
> b/src/mesa/drivers/dri/i965/genX_blorp_exec.c
> index e07fa0a..9ba1f8a 100644
> --- a/src/mesa/drivers/dri/i965/genX_blorp_exec.c
> +++ b/src/mesa/drivers/dri/i965/genX_blorp_exec.c
> @@ -170,6 +170,28 @@ void
>  genX(blorp_exec)(struct brw_context *brw,
>   const struct brw_blorp_params *params)
>  {
> +   struct gl_context *ctx = >ctx;
> +   const uint32_t estimated_max_batch_usage = GEN_GEN >= 8 ? 1800 : 1500;
> +   bool check_aperture_failed_once = false;
> +
> +   /* Flush the sampler and render caches.  We definitely need to flush the
> +* sampler cache so that we get updated contents from the render cache for
> +* the glBlitFramebuffer() source.  Also, we are sometimes warned in the
> +* docs to flush the cache between reinterpretations of the same surface
> +* data with different formats, which blorp does for stencil and depth
> +* data.
> +*/
> +   brw_emit_mi_flush(brw);
> +
> +   brw_select_pipeline(brw, BRW_RENDER_PIPELINE);
> +
> +retry:
> +   intel_batchbuffer_require_space(brw, estimated_max_batch_usage, 
> RENDER_RING);
> +   intel_batchbuffer_save_state(brw);
> +   drm_intel_bo *saved_bo = brw->batch.bo;
> +   uint32_t saved_used = USED_BATCH(brw->batch);
> +   uint32_t saved_state_batch_offset = 

[Mesa-dev] [PATCH] shaderapi: don't generate not linked error on GetProgramStage in general

2016-08-23 Thread Alejandro Piñeiro
Both ARB_shader_subroutine and the GL core spec doesn't list any
error when the program is not linked.

We left a error generation for the uniform location, in order to be
consistent with other methods from the spec that generate them.
---

On 23/08/16 10:14, Tapani Pälli wrote:

> IMO it looks like GetProgramStageiv is currently not correct from this
> perspective as it does not follow these rules, it should not require
> linking either as GL_ARB_shader_subroutine spec does not specify such
> for the call.

This patch tries to put some coherence between both specs. Having said
so I found this to be tricky, as GetProgramStageiv also allow tos ask
for uniform locations. The rest of the methods at the subroutine spec
generates an error if asking for locations when the program is not linked.

So I went for the compromise of keep this method not generate errors
when the program is not linked in general, as the spec for the method
points, but still generates the error for uniform location, trying to be
coherent with other methods.

Or in other words: I tried to be as coherent as possible, but failing to
be be 100% coherent and follow the spec 100% at the same time.


 src/mesa/main/shaderapi.c | 19 ++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c
index 4f29cd9..6806c02 100644
--- a/src/mesa/main/shaderapi.c
+++ b/src/mesa/main/shaderapi.c
@@ -2745,8 +2745,25 @@ _mesa_GetProgramStageiv(GLuint program, GLenum 
shadertype,
 
stage = _mesa_shader_enum_to_shader_stage(shadertype);
sh = shProg->_LinkedShaders[stage];
+
+   /* ARB_shader_subroutine doesn't ask the program to be linked, or list any
+* INVALID_OPERATION in the case of not be linked.
+*
+* And for some pnames, like GL_ACTIVE_SUBROUTINE_UNIFORMS, you can ask the
+* same info using other specs (ARB_program_interface_query), without the
+* need of the program to be linked, being the value for that case 0.
+*
+* But at the same time, some other methods require the program to be
+* linked for pname related to locations, so it would be inconsistent to
+* not do the same here. So we are:
+*   * Return GL_INVALID_OPERATION if not linked only for locations.
+*   * Setting a default value of 0, to be returned if not linked.
+*/
if (!sh) {
-  _mesa_error(ctx, GL_INVALID_OPERATION, "%s", api_name);
+  values[0] = 0;
+  if (pname == GL_ACTIVE_SUBROUTINE_UNIFORM_LOCATIONS) {
+ _mesa_error(ctx, GL_INVALID_OPERATION, "%s", api_name);
+  }
   return;
}
 
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 15/31] i965/blorp/exec: Refactor to use blorp_context and a void *batch

2016-08-23 Thread Pohjolainen, Topi
On Fri, Aug 19, 2016 at 09:55:52AM -0700, Jason Ekstrand wrote:
> This gets rid of brw_context throughout the core of the state setup code.
> ---
>  src/mesa/drivers/dri/i965/genX_blorp_exec.c | 332 
> +++-
>  1 file changed, 182 insertions(+), 150 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/genX_blorp_exec.c 
> b/src/mesa/drivers/dri/i965/genX_blorp_exec.c
> index 288e384..8c15b16 100644
> --- a/src/mesa/drivers/dri/i965/genX_blorp_exec.c
> +++ b/src/mesa/drivers/dri/i965/genX_blorp_exec.c
> @@ -34,8 +34,11 @@
>  #include "genxml/gen_macros.h"
>  
>  static void *
> -blorp_emit_dwords(struct brw_context *brw, unsigned n)
> +blorp_emit_dwords(struct blorp_context *blorp, void *batch, unsigned n)
>  {
> +   assert(blorp->driver_ctx == batch);
> +   struct brw_context *brw = batch;
> +
> intel_batchbuffer_begin(brw, n, RENDER_RING);
> uint32_t *map = brw->batch.map_next;
> brw->batch.map_next += n;
> @@ -44,9 +47,12 @@ blorp_emit_dwords(struct brw_context *brw, unsigned n)
>  }
>  
>  static uint64_t
> -blorp_emit_reloc(struct brw_context *brw, void *location,
> - struct blorp_address address, uint32_t delta)
> +blorp_emit_reloc(struct blorp_context *blorp, void *batch,
> + void *location, struct blorp_address address, uint32_t 
> delta)
>  {
> +   assert(blorp->driver_ctx == batch);
> +   struct brw_context *brw = batch;
> +
> uint32_t offset = (char *)location - (char *)brw->batch.map;
> if (brw->gen >= 8) {
>return intel_batchbuffer_reloc64(brw, address.buffer, offset,
> @@ -62,9 +68,11 @@ blorp_emit_reloc(struct brw_context *brw, void *location,
>  }
>  
>  static void
> -blorp_surface_reloc(struct brw_context *brw, uint32_t ss_offset,
> +blorp_surface_reloc(struct blorp_context *blorp, uint32_t ss_offset,
>  struct blorp_address address, uint32_t delta)
>  {
> +   struct brw_context *brw = blorp->driver_ctx;
> +
> drm_intel_bo_emit_reloc(brw->batch.bo, ss_offset,
> address.buffer, address.offset + delta,
> address.read_domains, address.write_domain);
> @@ -129,8 +137,12 @@ blorp_alloc_vertex_buffer(struct blorp_context *blorp, 
> uint32_t size,
>  }
>  
>  static void
> -blorp_emit_urb_config(struct brw_context *brw, unsigned vs_entry_size)
> +blorp_emit_urb_config(struct blorp_context *blorp, void *batch,
> +  unsigned vs_entry_size)
>  {
> +   assert(blorp->driver_ctx == batch);
> +   struct brw_context *brw = batch;
> +
>  #if GEN_GEN >= 7
> if (!(brw->ctx.NewDriverState & (BRW_NEW_CONTEXT | BRW_NEW_URB_SIZE)) &&
> brw->urb.vsize >= vs_entry_size)
> @@ -145,26 +157,34 @@ blorp_emit_urb_config(struct brw_context *brw, unsigned 
> vs_entry_size)
>  }
>  
>  static void
> -blorp_emit_3dstate_multisample(struct brw_context *brw, unsigned samples)
> +blorp_emit_3dstate_multisample(struct blorp_context *blorp, void *batch,
> +   unsigned samples)
>  {
> +   assert(blorp->driver_ctx == batch);
>  #if GEN_GEN >= 8
> -   gen8_emit_3dstate_multisample(brw, samples);
> +   gen8_emit_3dstate_multisample(batch, samples);
>  #else
> -   gen6_emit_3dstate_multisample(brw, samples);
> +   gen6_emit_3dstate_multisample(batch, samples);
>  #endif
>  }
>  
> +struct blorp_batch {
> +   struct blorp_context *blorp;
> +   void *batch;
> +};
> +
>  #define __gen_address_type struct blorp_address
> -#define __gen_user_data struct brw_context
> +#define __gen_user_data struct blorp_batch
>  
>  static uint64_t
> -__gen_combine_address(struct brw_context *brw, void *location,
> +__gen_combine_address(struct blorp_batch *batch, void *location,
>struct blorp_address address, uint32_t delta)
>  {
> if (address.buffer == NULL) {
>return address.offset + delta;
> } else {
> -  return blorp_emit_reloc(brw, location, address, delta);
> +  return blorp_emit_reloc(batch->blorp, batch->batch,
> +  location, address, delta);
> }
>  }
>  
> @@ -175,21 +195,22 @@ __gen_combine_address(struct brw_context *brw, void 
> *location,
>  #define _blorp_cmd_header(cmd) cmd ## _header
>  #define _blorp_cmd_pack(cmd) cmd ## _pack
>  
> -#define blorp_emit(brw, cmd, name)\
> -   for (struct cmd name = { _blorp_cmd_header(cmd) }, \
> -*_dst = blorp_emit_dwords(brw, _blorp_cmd_length(cmd));   \
> -__builtin_expect(_dst != NULL, 1);\
> -_blorp_cmd_pack(cmd)(brw, (void *)_dst, ),   \
> +#define blorp_emit(batch, cmd, name)\
> +   for (struct cmd name = { _blorp_cmd_header(cmd) },   \
> +*_dst = blorp_emit_dwords(batch.blorp, batch.batch, \
> +  _blorp_cmd_length(cmd));  \
> +__builtin_expect(_dst != NULL, 1);  \
> +_blorp_cmd_pack(cmd)(, (void 

[Mesa-dev] [PATCH v2] program_resource: subroutine active uniforms should return NumSubroutineUniforms

2016-08-23 Thread Alejandro Piñeiro
Before this commit, GetProgramInterfaceiv for pname ACTIVE_RESOURCES
and all the _SUBROUTINE_UNIFORM programInterface were
returning the count of resources on the shader program using that
interface, instead of the num of uniform resources. This would get a
wrong value (for example) if the shader has an array of subroutine
uniforms.

Note that this means that in order to get a proper value, the shader
needs to be linked, something that is not explicitly mentioned on
ARB_program_interface_query spec, but comes from the general
definition of active uniform. If the program is not linked we
return 0.

v2: don't generate an error if the program is not linked, returning 0
active uniforms instead, plus extra spec references (Tapani Palli)

Fixes GL44-CTS.program_interface_query.subroutines-compute
---
 src/mesa/main/program_resource.c | 141 ---
 1 file changed, 118 insertions(+), 23 deletions(-)

diff --git a/src/mesa/main/program_resource.c b/src/mesa/main/program_resource.c
index f2a9f00..6ddbdad 100644
--- a/src/mesa/main/program_resource.c
+++ b/src/mesa/main/program_resource.c
@@ -66,6 +66,79 @@ supported_interface_enum(struct gl_context *ctx, GLenum 
iface)
}
 }
 
+static struct gl_shader_program *
+lookup_linked_program(GLuint program,
+  const char *caller,
+  bool raise_link_error)
+{
+   GET_CURRENT_CONTEXT(ctx);
+   struct gl_shader_program *prog =
+  _mesa_lookup_shader_program_err(ctx, program, caller);
+
+   if (!prog)
+  return NULL;
+
+   if (prog->LinkStatus == GL_FALSE) {
+  if (raise_link_error)
+ _mesa_error(ctx, GL_INVALID_OPERATION, "%s(program not linked)",
+ caller);
+  return NULL;
+   }
+   return prog;
+}
+
+static GLenum
+stage_from_program_interface(GLenum programInterface)
+{
+   switch(programInterface) {
+   case GL_VERTEX_SUBROUTINE_UNIFORM:
+  return MESA_SHADER_VERTEX;
+   case GL_TESS_CONTROL_SUBROUTINE_UNIFORM:
+  return MESA_SHADER_TESS_CTRL;
+   case GL_TESS_EVALUATION_SUBROUTINE_UNIFORM:
+  return MESA_SHADER_TESS_EVAL;
+   case GL_GEOMETRY_SUBROUTINE_UNIFORM:
+  return MESA_SHADER_GEOMETRY;
+   case GL_FRAGMENT_SUBROUTINE_UNIFORM:
+  return MESA_SHADER_FRAGMENT;
+   case GL_COMPUTE_SUBROUTINE_UNIFORM:
+  return MESA_SHADER_COMPUTE;
+   default:
+  assert(!"unexpected programInterface value");
+   }
+}
+
+static struct gl_linked_shader *
+lookup_linked_shader(GLuint program,
+ GLenum programInterface,
+ const char *caller)
+{
+   struct gl_shader_program *shLinkedProg =
+  lookup_linked_program(program, caller, false);
+   gl_shader_stage stage = stage_from_program_interface(programInterface);
+
+   if (!shLinkedProg)
+  return NULL;
+
+   return shLinkedProg->_LinkedShaders[stage];
+}
+
+static bool
+is_subroutine_uniform_program_interface(GLenum programInterface)
+{
+   switch(programInterface) {
+   case GL_VERTEX_SUBROUTINE_UNIFORM:
+   case GL_TESS_CONTROL_SUBROUTINE_UNIFORM:
+   case GL_TESS_EVALUATION_SUBROUTINE_UNIFORM:
+   case GL_GEOMETRY_SUBROUTINE_UNIFORM:
+   case GL_FRAGMENT_SUBROUTINE_UNIFORM:
+   case GL_COMPUTE_SUBROUTINE_UNIFORM:
+  return true;
+   default:
+  return false;
+   }
+}
+
 void GLAPIENTRY
 _mesa_GetProgramInterfaceiv(GLuint program, GLenum programInterface,
 GLenum pname, GLint *params)
@@ -101,9 +174,49 @@ _mesa_GetProgramInterfaceiv(GLuint program, GLenum 
programInterface,
/* Validate pname against interface. */
switch(pname) {
case GL_ACTIVE_RESOURCES:
-  for (i = 0, *params = 0; i < shProg->NumProgramResourceList; i++)
- if (shProg->ProgramResourceList[i].Type == programInterface)
-(*params)++;
+  if (is_subroutine_uniform_program_interface(programInterface)) {
+ /* ARB_program_interface_query doesn't explicitly says that those
+  * uniforms would need a linked shader, or that should fail if it is
+  * not the case, but Section 7.6 (Uniform Variables) of the OpenGL
+  * 4.4 Core Profile says:
+  *
+  *"A uniform is considered an active uniform if the compiler and
+  * linker determine that the uniform will actually be accessed
+  * when the executable code is executed. In cases where the
+  * compiler and linker cannot make a conclusive determination,
+  * the uniform will be considered active."
+  *
+  * So in order to know the real number of active subroutine uniforms
+  * we would need a linked shader .
+  *
+  * At the same time, Section 7.3 (Program Objects) of the OpenGL 4.4
+  * Core Profile says:
+  *
+  *"The GL provides various commands allowing applications to
+  * enumerate and query properties of active variables and in-
+  * terface blocks for a specified 

[Mesa-dev] [PATCH v2] i965: don't fail to shift height images for levels.

2016-08-23 Thread Antia Puentes
From: Dave Airlie 

This fixes one subtest of:
GL44-CTS.shader_image_size.advanced-nonMS-fs-int

I've no idea why this wouldn't be scaled up here,
and I've no idea what else will break, but I might
as well open for discussion.

v2: Only shift height if the texture is not an 1D_ARRAY,
it fixes assertion in GL44-CTS.texture_view.gettexparameter
due to the original patch (Antia).

Signed-off-by: Dave Airlie 
Signed-off-by: Antia Puentes 
---

I have not taken a deep look to the test so take this with a grain of salt.
As I said in a previous email, this patch raises an assertion in
GL44-CTS.texture_view.gettexparameter:

"glcts: intel_mipmap_tree.c:368: intel_miptree_create_layout: Assertion
`height0 = 1' failed."

Looking at the code surrounding the assertion, we have:

   if (target == GL_TEXTURE_1D_ARRAY)
 assert(height0 == 1);

which suggests that we should avoid shifting the height at least for
TEXTURE_1D_ARRAYs. Sending a second version of the patch.

 src/mesa/drivers/dri/i965/intel_tex_image.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/intel_tex_image.c 
b/src/mesa/drivers/dri/i965/intel_tex_image.c
index 958f8bd..120e7e0 100644
--- a/src/mesa/drivers/dri/i965/intel_tex_image.c
+++ b/src/mesa/drivers/dri/i965/intel_tex_image.c
@@ -49,7 +49,7 @@ intel_miptree_create_for_teximage(struct brw_context *brw,
/* Figure out image dimensions at start level. */
for (i = intelImage->base.Base.Level; i > 0; i--) {
   width <<= 1;
-  if (height != 1)
+  if (intelObj->base.Target != GL_TEXTURE_1D_ARRAY)
  height <<= 1;
   if (intelObj->base.Target == GL_TEXTURE_3D)
  depth <<= 1;
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 16/31] i965/blorp: Pull the guts of blorp_exec into a driver-agnostic header

2016-08-23 Thread Pohjolainen, Topi
On Fri, Aug 19, 2016 at 09:55:53AM -0700, Jason Ekstrand wrote:
> ---
>  src/mesa/drivers/dri/i965/Makefile.sources  |   15 +-
>  src/mesa/drivers/dri/i965/blorp_priv.h  |2 +-
>  src/mesa/drivers/dri/i965/genX_blorp_exec.c | 1113 +-
>  src/mesa/drivers/dri/i965/genX_blorp_exec.h | 1121 
> +++
>  4 files changed, 1133 insertions(+), 1118 deletions(-)
>  create mode 100644 src/mesa/drivers/dri/i965/genX_blorp_exec.h
> 
> diff --git a/src/mesa/drivers/dri/i965/Makefile.sources 
> b/src/mesa/drivers/dri/i965/Makefile.sources
> index 5ea7b96..c97486c 100644
> --- a/src/mesa/drivers/dri/i965/Makefile.sources
> +++ b/src/mesa/drivers/dri/i965/Makefile.sources
> @@ -259,16 +259,21 @@ i965_FILES = \
>   intel_upload.c
>  
>  i965_gen6_FILES = \
> - genX_blorp_exec.c
> + genX_blorp_exec.c \
> + genX_blorp_exec.h
>  
>  i965_gen7_FILES = \
> - genX_blorp_exec.c
> + genX_blorp_exec.c \
> + genX_blorp_exec.h
>  
>  i965_gen75_FILES = \
> - genX_blorp_exec.c
> + genX_blorp_exec.c \
> + genX_blorp_exec.h
>  
>  i965_gen8_FILES = \
> - genX_blorp_exec.c
> + genX_blorp_exec.c \
> + genX_blorp_exec.h
>  
>  i965_gen9_FILES = \
> - genX_blorp_exec.c
> + genX_blorp_exec.c \
> + genX_blorp_exec.h
> diff --git a/src/mesa/drivers/dri/i965/blorp_priv.h 
> b/src/mesa/drivers/dri/i965/blorp_priv.h
> index 977f54d..9b987a8 100644
> --- a/src/mesa/drivers/dri/i965/blorp_priv.h
> +++ b/src/mesa/drivers/dri/i965/blorp_priv.h
> @@ -141,7 +141,7 @@ struct brw_blorp_prog_data
>  */
> uint32_t flat_inputs;
> unsigned num_varying_inputs;
> -   GLbitfield64 inputs_read;
> +   uint64_t inputs_read;
>  };
>  
>  static inline unsigned
> diff --git a/src/mesa/drivers/dri/i965/genX_blorp_exec.c 
> b/src/mesa/drivers/dri/i965/genX_blorp_exec.c
> index 8c15b16..e07fa0a 100644
> --- a/src/mesa/drivers/dri/i965/genX_blorp_exec.c
> +++ b/src/mesa/drivers/dri/i965/genX_blorp_exec.c
> @@ -29,9 +29,7 @@
>  #include "brw_context.h"
>  #include "brw_state.h"
>  
> -#include "blorp_priv.h"
> -
> -#include "genxml/gen_macros.h"
> +#include "genX_blorp_exec.h"
>  
>  static void *
>  blorp_emit_dwords(struct blorp_context *blorp, void *batch, unsigned n)
> @@ -168,1115 +166,6 @@ blorp_emit_3dstate_multisample(struct blorp_context 
> *blorp, void *batch,
>  #endif
>  }
>  
> -struct blorp_batch {
> -   struct blorp_context *blorp;
> -   void *batch;
> -};
> -
> -#define __gen_address_type struct blorp_address
> -#define __gen_user_data struct blorp_batch
> -
> -static uint64_t
> -__gen_combine_address(struct blorp_batch *batch, void *location,
> -  struct blorp_address address, uint32_t delta)
> -{
> -   if (address.buffer == NULL) {
> -  return address.offset + delta;
> -   } else {
> -  return blorp_emit_reloc(batch->blorp, batch->batch,
> -  location, address, delta);
> -   }
> -}
> -
> -#include "genxml/genX_pack.h"
> -
> -#define _blorp_cmd_length(cmd) cmd ## _length
> -#define _blorp_cmd_length_bias(cmd) cmd ## _length_bias
> -#define _blorp_cmd_header(cmd) cmd ## _header
> -#define _blorp_cmd_pack(cmd) cmd ## _pack
> -
> -#define blorp_emit(batch, cmd, name)\
> -   for (struct cmd name = { _blorp_cmd_header(cmd) },   \
> -*_dst = blorp_emit_dwords(batch.blorp, batch.batch, \
> -  _blorp_cmd_length(cmd));  \
> -__builtin_expect(_dst != NULL, 1);  \
> -_blorp_cmd_pack(cmd)(, (void *)_dst, ),   \
> -_dst = NULL)
> -
> -#define blorp_emitn(batch, cmd, n) ({   \
> -  uint32_t *_dw = blorp_emit_dwords(batch.blorp, batch.batch, n);   \
> -  struct cmd template = {   \
> - _blorp_cmd_header(cmd),\
> - .DWordLength = n - _blorp_cmd_length_bias(cmd),\
> -  };\
> -  _blorp_cmd_pack(cmd)(, _dw, ); \
> -  _dw + 1; /* Array starts at dw[1] */  \
> -   })
> -
> -/* Once vertex fetcher has written full VUE entries with complete
> - * header the space requirement is as follows per vertex (in bytes):
> - *
> - * HeaderPositionProgram constants
> - *   +++---+
> - *   |   16   | 16 |  n x 16   |
> - *   +++---+
> - *
> - * where 'n' stands for number of varying inputs expressed as vec4s.
> - *
> - * The URB size is in turn expressed in 64 bytes (512 bits).
> - */
> -static inline unsigned
> -gen7_blorp_get_vs_entry_size(const struct brw_blorp_params *params)
> -{
> -const unsigned num_varyings =
> -   params->wm_prog_data ? params->wm_prog_data->num_varying_inputs : 0;
> -const unsigned total_needed 

Re: [Mesa-dev] [PATCH 2/2] anv: Throw INCOMPATIBLE_DRIVER for non-fatal initialization errors

2016-08-23 Thread Jason Ekstrand
On Tue, Aug 23, 2016 at 2:17 AM, Emil Velikov 
wrote:

> On 23 August 2016 at 02:11, Jason Ekstrand  wrote:
> > The only reason we should throw INITIALIZATION_FAILED is if we have found
> > useable intel hardware but have failed to bring it up for some reason.
> > Otherwise, we should just throw INCOMPATIBLE_DRIVER which will turn into
> > successfully advertising 0 physical devices
> > ---
> >  src/intel/vulkan/anv_device.c | 8 +++-
> >  1 file changed, 3 insertions(+), 5 deletions(-)
> >
> > diff --git a/src/intel/vulkan/anv_device.c
> b/src/intel/vulkan/anv_device.c
> > index f31f366..940a14e 100644
> > --- a/src/intel/vulkan/anv_device.c
> > +++ b/src/intel/vulkan/anv_device.c
> > @@ -62,8 +62,7 @@ anv_physical_device_init(struct anv_physical_device
> *device,
> >
> > fd = open(path, O_RDWR | O_CLOEXEC);
> > if (fd < 0)
> > -  return vk_errorf(VK_ERROR_INITIALIZATION_FAILED,
> > -   "failed to open %s: %m", path);
> > +  return vk_error(VK_ERROR_INCOMPATIBLE_DRIVER);
> >
> Mildly related:
> You mentioned before that you prefer no not link against libdrm. What
> is the reasoning behind that - opencoding some of it's functionality
> in here is not likely to pan out well.
>

The amount of libdrm functionality we are repeating here is pretty small.
Also, if we need more queries, having to add them to libdrm *and* add
support to the vulkan driver to query it through libdrm is kind-of
pointless since each query is only a couple of lines.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] radeonsi: add support for cull distances.

2016-08-23 Thread Kai Wasserbäch
Hey Dave,
Dave Airlie wrote on 23.08.2016 03:28:
> From: Dave Airlie 
> 
> This should be all that is required for cull distances to work
> on radeonsi.

in case this is indeed enough to enable ARB_cull_distance, mind adding that to
GL3.txt and the release notes?

> Signed-off-by: Dave Airlie 
> ---
>  src/gallium/drivers/radeonsi/si_pipe.c  | 3 ++-
>  src/gallium/drivers/radeonsi/si_state.c | 8 
>  2 files changed, 6 insertions(+), 5 deletions(-)
> 
> diff --git a/src/gallium/drivers/radeonsi/si_pipe.c 
> b/src/gallium/drivers/radeonsi/si_pipe.c
> index 62b62db..14b0b80 100644
> --- a/src/gallium/drivers/radeonsi/si_pipe.c
> +++ b/src/gallium/drivers/radeonsi/si_pipe.c
> @@ -436,7 +436,6 @@ static int si_get_param(struct pipe_screen* pscreen, enum 
> pipe_cap param)
>   case PIPE_CAP_TEXTURE_GATHER_OFFSETS:
>   case PIPE_CAP_VERTEXID_NOBASE:
>   case PIPE_CAP_QUERY_BUFFER_OBJECT:
> - case PIPE_CAP_CULL_DISTANCE:
>   case PIPE_CAP_PRIMITIVE_RESTART_FOR_PATCHES:
>   case PIPE_CAP_TGSI_VOTE:
>   case PIPE_CAP_MAX_WINDOW_RECTANGLES:
> @@ -447,6 +446,8 @@ static int si_get_param(struct pipe_screen* pscreen, enum 
> pipe_cap param)
>   case PIPE_CAP_MULTI_DRAW_INDIRECT_PARAMS:
>   return sscreen->has_draw_indirect_multi;
>  
> + case PIPE_CAP_CULL_DISTANCE:
> + return 1;
>   case PIPE_CAP_MAX_SHADER_PATCH_VARYINGS:
>   return 30;

Hm, shouldn't the PIPE_CAP_CULL_DISTANCE just be moved up to the long list of
supported features with boolean caps at the beginning of si_get_param()?
(Instead of creating another "true segment"?)

Cheers,
Kai



signature.asc
Description: OpenPGP digital signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 14/31] i965/blorp: Add a helper for allocating binding tables and surface states

2016-08-23 Thread Pohjolainen, Topi
On Fri, Aug 19, 2016 at 09:55:51AM -0700, Jason Ekstrand wrote:
> ---
>  src/mesa/drivers/dri/i965/genX_blorp_exec.c | 64 
> +++--

Had to read some bits three or four times but looks correct in the end:

Reviewed-by: Topi Pohjolainen 

>  1 file changed, 42 insertions(+), 22 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/genX_blorp_exec.c 
> b/src/mesa/drivers/dri/i965/genX_blorp_exec.c
> index 6461af7..288e384 100644
> --- a/src/mesa/drivers/dri/i965/genX_blorp_exec.c
> +++ b/src/mesa/drivers/dri/i965/genX_blorp_exec.c
> @@ -89,6 +89,25 @@ blorp_alloc_dynamic_state(struct blorp_context *blorp,
> return brw_state_batch(brw, type, size, alignment, offset);
>  }
>  
> +static void
> +blorp_alloc_binding_table(struct blorp_context *blorp, unsigned num_entries,
> +  unsigned state_size, unsigned state_alignment,
> +  uint32_t *bt_offset, uint32_t **bt_map,
> +  void **surface_maps)
> +{
> +   struct brw_context *brw = blorp->driver_ctx;
> +
> +   *bt_map = brw_state_batch(brw, AUB_TRACE_BINDING_TABLE,
> + num_entries * sizeof(uint32_t), 32,
> + bt_offset);
> +
> +   for (unsigned i = 0; i < num_entries; i++) {
> +  surface_maps[i] = brw_state_batch(brw, AUB_TRACE_SURFACE_STATE,
> +state_size, state_alignment,
> +&(*bt_map)[i]);
> +   }
> +}
> +
>  static void *
>  blorp_alloc_vertex_buffer(struct blorp_context *blorp, uint32_t size,
>struct blorp_address *addr)
> @@ -941,9 +960,10 @@ static const struct surface_state_info 
> surface_state_infos[] = {
> [9] = {16, 64, 8,  10},
>  };
>  
> -static uint32_t
> +static void
>  blorp_emit_surface_state(struct brw_context *brw,
>   const struct brw_blorp_surface_info *surface,
> + uint32_t *state, uint32_t state_offset,
>   bool is_render_target)
>  {
> const struct surface_state_info ss_info = surface_state_infos[brw->gen];
> @@ -961,21 +981,17 @@ blorp_emit_surface_state(struct brw_context *brw,
> if (aux_usage == ISL_AUX_USAGE_HIZ)
>aux_usage = ISL_AUX_USAGE_NONE;
>  
> -   uint32_t surf_offset;
> -   uint32_t *dw = brw_state_batch(brw, AUB_TRACE_SURFACE_STATE,
> -  ss_info.num_dwords * 4, ss_info.ss_align,
> -  _offset);
> -
> const uint32_t mocs =
>is_render_target ? brw->blorp.mocs.rb : brw->blorp.mocs.tex;
>  
> -   isl_surf_fill_state(>isl_dev, dw, .surf = , .view = 
> >view,
> +   isl_surf_fill_state(>isl_dev, state,
> +   .surf = , .view = >view,
> .aux_surf = >aux_surf, .aux_usage = 
> aux_usage,
> .mocs = mocs, .clear_color = surface->clear_color,
> .x_offset_sa = surface->tile_x_sa,
> .y_offset_sa = surface->tile_y_sa);
>  
> -   blorp_surface_reloc(brw, surf_offset + ss_info.reloc_dw * 4,
> +   blorp_surface_reloc(brw, state_offset + ss_info.reloc_dw * 4,
> surface->addr, 0);
>  
> if (aux_usage != ISL_AUX_USAGE_NONE) {
> @@ -984,28 +1000,32 @@ blorp_emit_surface_state(struct brw_context *brw,
> * surface buffer addresses are always 4K page alinged.
> */
>assert((surface->aux_addr.offset & 0xfff) == 0);
> -  blorp_surface_reloc(brw, surf_offset + ss_info.aux_reloc_dw * 4,
> -  surface->aux_addr, dw[ss_info.aux_reloc_dw]);
> +  blorp_surface_reloc(brw, state_offset + ss_info.aux_reloc_dw * 4,
> +  surface->aux_addr, state[ss_info.aux_reloc_dw]);
> }
> -
> -   return surf_offset;
>  }
>  
>  static void
>  blorp_emit_surface_states(struct brw_context *brw,
>const struct brw_blorp_params *params)
>  {
> -   uint32_t bind_offset;
> -   uint32_t *bind =
> -  brw_state_batch(brw, AUB_TRACE_BINDING_TABLE,
> -  sizeof(uint32_t) * BLORP_NUM_BT_ENTRIES,
> -  32, /* alignment */ _offset);
> -
> -   bind[BLORP_RENDERBUFFER_BT_INDEX] =
> -  blorp_emit_surface_state(brw, >dst, true);
> +   uint32_t bind_offset, *bind_map;
> +   void *surface_maps[2];
> +
> +   const unsigned ss_size = GENX(RENDER_SURFACE_STATE_length) * 4;
> +   const unsigned ss_align = GENX(RENDER_SURFACE_STATE_length) > 8 ? 64 : 32;
> +
> +   unsigned num_surfaces = 1 + (params->src.addr.buffer != NULL);
> +   blorp_alloc_binding_table(>blorp, num_surfaces, ss_size, ss_align,
> + _offset, _map, surface_maps);
> +
> +   blorp_emit_surface_state(brw, >dst,
> +surface_maps[BLORP_RENDERBUFFER_BT_INDEX],
> +bind_map[BLORP_RENDERBUFFER_BT_INDEX], true);
>  

Re: [Mesa-dev] [PATCH 0/2] st/va: minor clean-ups

2016-08-23 Thread Kai Wasserbäch
Hey Boyuan, hey Christian,
Zhang, Boyuan wrote on 22.08.2016 22:52:
> Thanks for cleaning up the codes. The duplicated end_frame and unused 
> variable are due to the new patch submitted last Friday. I have reviewed both 
> patches.

thanks for your reviews! Just in case you missed my postscript in the cover
letter: either of you needs to commit these changes since I do not have commit
access.

Thanks again,
Kai


> -Original Message-
> From: Kai Wasserbäch [mailto:k...@dev.carbon-project.org] 
> Sent: August-20-16 12:15 PM
> To: mesa-dev@lists.freedesktop.org
> Cc: Zhang, Boyuan
> Subject: [PATCH 0/2] st/va: minor clean-ups
> 
> Hey,
> just noticed a duplicate call to context->decoder->end_frame() while looking 
> through the recent changes. This is just a trivial clean-up and no functional 
> change is intended. And while I was there, I also noticed an unused variable, 
> which the second patch removes.
> 
> Cheers,
> Kai
> 
> P.S.: If this "series" gets accepted, please commit it for me, since I do not 
> have commit access.
> 
> 
> Kai Wasserbäch (3):
>   st/va: Remove else case in vlVaEndPicture() made superfluous by
> c59628d11b
>   st/va: Remove unused variable coded_size from vlVaEndPicture()
> 
>  src/gallium/state_trackers/va/picture.c | 9 +++--
>  1 file changed, 3 insertions(+), 6 deletions(-)
> 



signature.asc
Description: OpenPGP digital signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] egl/wayland-egl: Fix for segfault in dri2_wl_destroy_surface.

2016-08-23 Thread Eric Engestrom
On Mon, Aug 22, 2016 at 09:48:50AM +0200, Stencel, Joanna wrote:
> Segfault occurs when destroying EGL surface attached to already destroyed
> Wayland window. The fix is to set to NULL the pointer of surface's
> native window when wl_egl_destroy_window() is called.
> 
> Signed-off-by: Stencel, Joanna 

LGTM
Reviewed-by: Eric Engestrom 

> ---
>  src/egl/drivers/dri2/platform_wayland.c| 15 +--
>  src/egl/wayland/wayland-egl/wayland-egl-priv.h |  1 +
>  src/egl/wayland/wayland-egl/wayland-egl.c  |  3 +++
>  3 files changed, 17 insertions(+), 2 deletions(-)
> 
> diff --git a/src/egl/drivers/dri2/platform_wayland.c 
> b/src/egl/drivers/dri2/platform_wayland.c
> index d2f47cc..821d7c4 100644
> --- a/src/egl/drivers/dri2/platform_wayland.c
> +++ b/src/egl/drivers/dri2/platform_wayland.c
> @@ -119,6 +119,13 @@ resize_callback(struct wl_egl_window *wl_win, void *data)
> (*dri2_dpy->flush->invalidate)(dri2_surf->dri_drawable);
>  }
>  
> +static void
> +destroy_window_callback(void *data)
> +{
> +   struct dri2_egl_surface *dri2_surf = data;
> +   dri2_surf->wl_win = NULL;
> +}
> +
>  /**
>   * Called via eglCreateWindowSurface(), drv->API.CreateWindowSurface().
>   */
> @@ -160,6 +167,7 @@ dri2_wl_create_surface(_EGLDriver *drv, _EGLDisplay *disp,
>  
> dri2_surf->wl_win->private = dri2_surf;
> dri2_surf->wl_win->resize_callback = resize_callback;
> +   dri2_surf->wl_win->destroy_window_callback = destroy_window_callback;
>  
> dri2_surf->base.Width = window->width;
> dri2_surf->base.Height = window->height;
> @@ -258,8 +266,11 @@ dri2_wl_destroy_surface(_EGLDriver *drv, _EGLDisplay 
> *disp, _EGLSurface *surf)
> if (dri2_surf->throttle_callback)
>wl_callback_destroy(dri2_surf->throttle_callback);
>  
> -   dri2_surf->wl_win->private = NULL;
> -   dri2_surf->wl_win->resize_callback = NULL;
> +   if (dri2_surf->wl_win) {
> +  dri2_surf->wl_win->private = NULL;
> +  dri2_surf->wl_win->resize_callback = NULL;
> +  dri2_surf->wl_win->destroy_window_callback = NULL;
> +   }
>  
> free(surf);
>  
> diff --git a/src/egl/wayland/wayland-egl/wayland-egl-priv.h 
> b/src/egl/wayland/wayland-egl/wayland-egl-priv.h
> index 74a1552..eae1224 100644
> --- a/src/egl/wayland/wayland-egl/wayland-egl-priv.h
> +++ b/src/egl/wayland/wayland-egl/wayland-egl-priv.h
> @@ -27,6 +27,7 @@ struct wl_egl_window {
>  
>   void *private;
>   void (*resize_callback)(struct wl_egl_window *, void *);
> + void (*destroy_window_callback)(void *);
>  };
>  
>  #ifdef  __cplusplus
> diff --git a/src/egl/wayland/wayland-egl/wayland-egl.c 
> b/src/egl/wayland/wayland-egl/wayland-egl.c
> index 80a5be5..4a4701a 100644
> --- a/src/egl/wayland/wayland-egl/wayland-egl.c
> +++ b/src/egl/wayland/wayland-egl/wayland-egl.c
> @@ -66,6 +66,7 @@ wl_egl_window_create(struct wl_surface *surface,
>   egl_window->surface = surface;
>   egl_window->private = NULL;
>   egl_window->resize_callback = NULL;
> + egl_window->destroy_window_callback = NULL;
>   wl_egl_window_resize(egl_window, width, height, 0, 0);
>   egl_window->attached_width  = 0;
>   egl_window->attached_height = 0;
> @@ -76,6 +77,8 @@ wl_egl_window_create(struct wl_surface *surface,
>  WL_EGL_EXPORT void
>  wl_egl_window_destroy(struct wl_egl_window *egl_window)
>  {
> + if (egl_window->destroy_window_callback)
> + egl_window->destroy_window_callback(egl_window->private);
>   free(egl_window);
>  }
>  
> -- 
> 1.9.1
> 
> 
> 
> Intel Technology Poland sp. z o.o.
> ul. Slowackiego 173 | 80-298 Gdansk | Sad Rejonowy Gdansk Polnoc | VII 
> Wydzial Gospodarczy Krajowego Rejestru Sadowego - KRS 101882 | NIP 
> 957-07-52-316 | Kapital zakladowy 200.000 PLN.
> 
> Ta wiadomosc wraz z zalacznikami jest przeznaczona dla okreslonego adresata i 
> moze zawierac informacje poufne. W razie przypadkowego otrzymania tej 
> wiadomosci, prosimy o powiadomienie nadawcy oraz trwale jej usuniecie; 
> jakiekolwiek
> przegladanie lub rozpowszechnianie jest zabronione.
> This e-mail and any attachments may contain confidential material for the 
> sole use of the intended recipient(s). If you are not the intended recipient, 
> please contact the sender and delete all copies; any review or distribution by
> others is strictly prohibited.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 13/31] i965/blorp: Use BT_INDEX enums for setting up the binding table

2016-08-23 Thread Pohjolainen, Topi
On Fri, Aug 19, 2016 at 09:55:50AM -0700, Jason Ekstrand wrote:
> ---
>  src/mesa/drivers/dri/i965/blorp.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)

Patches 9, 12 and 13 are:

Reviewed-by: Topi Pohjolainen 

> 
> diff --git a/src/mesa/drivers/dri/i965/blorp.c 
> b/src/mesa/drivers/dri/i965/blorp.c
> index 2a488c2..dba3441 100644
> --- a/src/mesa/drivers/dri/i965/blorp.c
> +++ b/src/mesa/drivers/dri/i965/blorp.c
> @@ -304,8 +304,8 @@ brw_blorp_compile_nir_shader(struct brw_context *brw, 
> struct nir_shader *nir,
> wm_prog_data.base.param = NULL;
>  
> /* BLORP always just uses the first two binding table entries */
> -   wm_prog_data.binding_table.render_target_start = 0;
> -   wm_prog_data.base.binding_table.texture_start = 1;
> +   wm_prog_data.binding_table.render_target_start = 
> BLORP_RENDERBUFFER_BT_INDEX;
> +   wm_prog_data.base.binding_table.texture_start = BLORP_TEXTURE_BT_INDEX;
>  
> nir = brw_preprocess_nir(compiler, nir);
> nir_remove_dead_variables(nir, nir_var_shader_in);
> -- 
> 2.5.0.400.gff86faf
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 11/31] i965/blorp/genX: Add a blorp_surface_reloc helper

2016-08-23 Thread Pohjolainen, Topi
On Fri, Aug 19, 2016 at 09:55:48AM -0700, Jason Ekstrand wrote:
> ---
>  src/mesa/drivers/dri/i965/genX_blorp_exec.c | 38 
> -
>  1 file changed, 21 insertions(+), 17 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/genX_blorp_exec.c 
> b/src/mesa/drivers/dri/i965/genX_blorp_exec.c
> index 32a7445..f226255 100644
> --- a/src/mesa/drivers/dri/i965/genX_blorp_exec.c
> +++ b/src/mesa/drivers/dri/i965/genX_blorp_exec.c
> @@ -61,6 +61,23 @@ blorp_emit_reloc(struct brw_context *brw, void *location,
> }
>  }
>  
> +static void
> +blorp_surface_reloc(struct brw_context *brw, uint32_t ss_offset,
> +struct blorp_address address, uint32_t delta)
> +{
> +   drm_intel_bo_emit_reloc(brw->batch.bo, ss_offset,
> +   address.buffer, address.offset + delta,
> +   address.read_domains, address.write_domain);
> +
> +   uint64_t reloc_val = address.buffer->offset64 + address.offset + delta;
> +   void *reloc_ptr = (void *)brw->batch.map + ss_offset;
> +#if GEN_GEN >= 8
> +   *(uint64_t *)reloc_ptr = reloc_val;
> +#else
> +   *(uint32_t *)reloc_ptr = reloc_val;
> +#endif
> +}
> +
>  static void *
>  blorp_alloc_dynamic_state(struct blorp_context *blorp,
>enum aub_state_struct_type type,
> @@ -951,24 +968,15 @@ blorp_emit_surface_state(struct brw_context *brw,
>  
> const uint32_t mocs =
>is_render_target ? brw->blorp.mocs.rb : brw->blorp.mocs.tex;
> -   uint64_t aux_bo_offset =
> -  surface->aux_addr.buffer ? surface->aux_addr.buffer->offset64 : 0;
>  
> isl_surf_fill_state(>isl_dev, dw, .surf = , .view = 
> >view,
> -   .address = surface->addr.buffer->offset64 + 
> surface->addr.offset,
> .aux_surf = >aux_surf, .aux_usage = 
> aux_usage,
> -   .aux_address = aux_bo_offset + 
> surface->aux_addr.offset,

Should you have dropped ".address" and ".aux_address" already in the previous
patch?
Otherwise the patch makes sense.

> .mocs = mocs, .clear_color = surface->clear_color,
> .x_offset_sa = surface->tile_x_sa,
> .y_offset_sa = surface->tile_y_sa);
>  
> -   /* Emit relocation to surface contents */
> -   drm_intel_bo_emit_reloc(brw->batch.bo,
> -   surf_offset + ss_info.reloc_dw * 4,
> -   surface->addr.buffer,
> -   dw[ss_info.reloc_dw] - 
> surface->addr.buffer->offset64,
> -   surface->addr.read_domains,
> -   surface->addr.write_domain);
> +   blorp_surface_reloc(brw, surf_offset + ss_info.reloc_dw * 4,
> +   surface->addr, 0);
>  
> if (aux_usage != ISL_AUX_USAGE_NONE) {
>/* On gen7 and prior, the bottom 12 bits of the MCS base address are
> @@ -976,12 +984,8 @@ blorp_emit_surface_state(struct brw_context *brw,
> * surface buffer addresses are always 4K page alinged.
> */
>assert((surface->aux_addr.offset & 0xfff) == 0);
> -  drm_intel_bo_emit_reloc(brw->batch.bo,
> -  surf_offset + ss_info.aux_reloc_dw * 4,
> -  surface->aux_addr.buffer,
> -  dw[ss_info.aux_reloc_dw] & 0xfff,
> -  surface->aux_addr.read_domains,
> -  surface->aux_addr.write_domain);
> +  blorp_surface_reloc(brw, surf_offset + ss_info.aux_reloc_dw * 4,
> +  surface->aux_addr, dw[ss_info.aux_reloc_dw]);
> }
>  
> return surf_offset;
> -- 
> 2.5.0.400.gff86faf
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 97444] mesa git crashes in libxshmfence

2016-08-23 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=97444

--- Comment #2 from Fabian Maurer  ---
I'd like to, but the older versions fail due to the wrong version of
llvm/clang.
Is there somewhere a tutorial how to compile both together?

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] vbo: fix format string compiler warning for 32-bit machines

2016-08-23 Thread Brian Paul

On 08/22/2016 10:59 AM, ⚛ wrote:

I am marking this patch as (de-facto) rejected at Patchwork.


I gave you my R-b and I just pushed it to master.

-Brian



Reason: The patch moved from the 1st page to the 2nd page of
https://patchwork.freedesktop.org/project/mesa/series/?ordering=-last_updated

Although I am letting it go you can still merge it to master if you want to.




___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] egl/dri2: dri2_initialize: Do not reference-count TestOnly display

2016-08-23 Thread Emil Velikov
On 22 August 2016 at 07:10, Nicolas Boichat  wrote:
> Hi Emil,
>
> On Tue, Aug 16, 2016 at 1:17 AM, Emil Velikov  
> wrote:
>> Hi Nicolas,
>>
>> On 4 August 2016 at 02:51, Nicolas Boichat  wrote:
>>> On Thu, Aug 4, 2016 at 9:38 AM, Michel Dänzer  wrote:
 On 04.08.2016 09:53, Nicolas Boichat wrote:
> On Thu, Aug 4, 2016 at 12:22 AM, Martin Peres
>  wrote:
>> On 03/08/16 16:54, Nicolas Boichat wrote:
>>>
>>> In the case where dri2_initialize is called with a TestOnly display,
>>> the display is not actually initialized, so dri2_egl_display always
>>> fails, and we cannot do any reference counting.
>>>
>>> Fixes piglit spec@egl_khr_create_context@verify gl flavor (reproducible
>>> with LIBGL_ALWAYS_SOFTWARE=1) and spec@egl_khr_fence_sync@conformance.
>>>
>>> Fixes: 9ee683f877 (egl/dri2: Add reference count for dri2_egl_display)
>>> Cc: "12.0" 
>>> Reported-by: Michel Dänzer 
>>> Signed-off-by: Nicolas Boichat 
>>> ---
>>>
>>> Compile-tested only, please give it a spin, thanks!
>>
>> Still crashes, same backtrace before and after the patch:
>
> Actually, I was thinking about this bug:
> https://bugs.freedesktop.org/show_bug.cgi?id=97136, which should be
> spec@egl_khr_create_context@verify gl flavor? Did you try that test?

 Your patch fixes this test for me.

 Tested-by: Michel Dänzer 

 Please remove the reference to the egl_khr_fence_sync test from the
 commit log.
>>>
>>> Thanks.
>>>
>>> Emil: Can you fixup the commit message before applying?
>>>
>> Sure I can do that. Yet this change fixes another
>> unexpected/undocumented bug
>
> I don't think there is anything difference _before_ 9ee683f877
> (egl/dri2: Add reference count for dri2_egl_display), and after this
> patch.
>
> After 9ee683f877, of course, there is a bug, as eglGetProcAddress just
> returns NULL when called before initializing any display, which is
> clearly wrong.
>
Yes, that's correct. Yet I was thinking about the case explained below.

>> - currently calling eglGetProcAddress will
>> fail, when libEGL is built without the said platform.
>> Yet the API/documentation is clear - entry points are independent of
>> display (which is platform specific) or context.
>>
>> Did I miss something, does the above make sense ?
>
> Well, spec says "A return value of NULL indicates that the specific
> function does not exist for the EGL implementation.", so I guess it's
> fine to return NULL is EGL is built without a given platform.
>
> One possible issue is if the "default" display (TestOnly), is
> different from the display being used later... But I don't see any
> problem here, if I trace the code correctly:
> eglGetProcAddress:
>   => For many functions (egl*), returns a function pointer that is
> always valid => No problem
>   => For others (I guess mostly gl*): calls _eglGetDriverProc
>
> _eglGetDriverProc:
> - Creates a default display (which initializes _eglModules)
> - Iterates over the modules, and calls
> mod->Driver->API.GetProcAddress. I think this can only be
> dri2_get_proc_address.
>
> dri2_get_proc_address:
>  - Calls dri2_drv->get_proc_address (that's dlsym(handle,
> "_glapi_get_proc_address");, see dri2_load)
>
> _glapi_get_proc_address:
> /**
>  * Return pointer to the named function.  If the function name isn't found
>  * in the name of static functions, try generating a new API entrypoint on
>  * the fly with assembly language.
>  */
> _glapi_proc
> _glapi_get_proc_address(const char *funcName)
> {
>const struct mapi_stub *stub = _glapi_get_stub(funcName, 1);
>return (stub) ? (_glapi_proc) stub_get_addr(stub) : NULL;
> }
>
> Which looks platform/display independent? Or am I missing something?
>
The gist is that _eglGetNativePlatform() can return a platform which
isn't present/build in the libEGL implementation. See below for
detailed before/after:

With your patch:
 - we dive into _eglMatchDriver calling dri2_initialize (via
_eglMatchAndInitialize) with TestOnly == TRUE regardless of the
platform requested
 - thus we reach API.GetProcAddress/dri2_get_proc_address and provide
a non NULL entry point function pointer.

Before this patch (or any of your patches):
 - in _eglGetDriverProc we use eglGetDisplay(EGL_DEFAULT_DISPLAY)
 - the latter can return a dpy for a platform that is _not_ built into
the libEGL implementation (thanks to _eglGetNativePlatform)
 - thus by reusing the detected platform in dri2_initialize() we get
false and we'll never get to the API.GetProcAddress call in
_eglGetDriverProc.
 - from a user POV they'll get NULL for eglGetProcAddress because we
(sort of) miss-detect the EGL_DEFAULT_DISPLAY even if they are going
to pick/use _any_ working platform via 

Re: [Mesa-dev] [PATCH 2/3] glsl: Fix incorrect hard-coded location of the gl_SecondaryFragColorEXT built-in.

2016-08-23 Thread Ilia Mirkin
On Tue, Aug 23, 2016 at 12:18 AM, Francisco Jerez  wrote:
> Ilia Mirkin  writes:
>
>> On Tue, Aug 23, 2016 at 12:05 AM, Francisco Jerez  
>> wrote:
>>> Ilia Mirkin  writes:
>>>
 On Mon, Aug 22, 2016 at 10:55 PM, Francisco Jerez  
 wrote:
> Ilia Mirkin  writes:
>
>> On Mon, Aug 22, 2016 at 9:59 PM, Francisco Jerez  
>> wrote:
>>> gl_SecondaryFragColorEXT should have the same location as gl_FragColor
>>> for the secondary fragment color to be replicated to all fragment
>>> outputs.  The incorrect location of gl_SecondaryFragColorEXT would
>>> cause the linker to mark both FRAG_RESULT_COLOR and FRAG_RESULT_DATA0
>>> as being written to, which isn't allowed by the spec and would
>>> ultimately lead to an assertion failure in
>>> fs_visitor::emit_fb_writes() on my i965-fb-fetch branch.
>>
>> My recollection was that it didn't work with COLOR for "stupid"
>> reasons. Can you confirm that
>> bin/arb_blend_func_extended-fbo-extended-blend-pattern_gles2 -auto
>> passes with this patch?
>>
> Yes, it does, in fact
> arb_blend_func_extended-fbo-extended-blend-pattern_gles2 hits the i965
> assertion failure I mentioned above unless this patch is applied.

 This causes the test in question to fail on nouveau... the TGSI shader
 generated starts with

 FRAG
 PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1
 DCL IN[0], POSITION, LINEAR
 DCL OUT[0], SAMPLEMASK
>>>
>>> Heh, this smells a lot like the bug fixed in PATCH 1, but somewhere in
>>> the mesa state tracker.  st_glsl_to_tgsi.cpp:2422 does:
>>>
>>> | entry = new(mem_ctx) variable_storage(var,
>>> |   PROGRAM_OUTPUT,
>>> |   var->data.location
>>> |   + var->data.index);
>>>
>>> which is obviously bogus, e.g. for var->data.location ==
>>> FRAG_RESULT_COLOR and var->data.index == 1 you get
>>> FRAG_RESULT_SAMPLE_MASK which explains the sample mask declaration
>>> above.
>>
>> Right, because having FRAG_RESULT_COLOR and index != 0 was never
>> possible prior to this. That might be why Ryan stuck it into
>> FRAG_RESULT_DATA0 [I may have been the one to suggest that].
>
> Heh, so I guess that's the "stupid" reason you were referring to,
> working around this mesa state tracker bug in the GLSL front-end.

Right. Or another way of looking at it, FRAG_RESULT_COLOR + index != 0
is illegal (as it would, among other things, imply multi-rt support
for dual-source blending), and the former code was making the GLSL fe
not emit the illegal combination. Whichever way you look at it,
breaking st/mesa isn't a great option.

  -ilia
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH mesa] st/va: add missing mutex_unlock

2016-08-23 Thread Zhang, Boyuan
Hi Eric,

Thanks for catching it. The patch is Reviewed-by: Boyuan Zhang 


Regards,
Boyuan

-Original Message-
From: Eric Engestrom [mailto:eric.engest...@imgtec.com] 
Sent: August-22-16 12:17 PM
To: Zhang, Boyuan
Cc: mesa-dev@lists.freedesktop.org; Koenig, Christian; Eric Engestrom
Subject: Re: [Mesa-dev] [PATCH mesa] st/va: add missing mutex_unlock

CC'ing Boyuan Zhang (author of the original patch), who got somehow removed 
from the CC list when sending my patch.


On Sun, Aug 21, 2016 at 10:11:48PM +0100, Eric Engestrom wrote:
> Fixes: c59628d11b134fc01638 ("st/va: enable dual instances encode by 
> sync surface")
> Signed-off-by: Eric Engestrom 
> ---
>  src/gallium/state_trackers/va/surface.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/src/gallium/state_trackers/va/surface.c 
> b/src/gallium/state_trackers/va/surface.c
> index 012e48e..3ee1cdd 100644
> --- a/src/gallium/state_trackers/va/surface.c
> +++ b/src/gallium/state_trackers/va/surface.c
> @@ -106,8 +106,10 @@ vlVaSyncSurface(VADriverContextP ctx, VASurfaceID 
> render_target)
> pipe_mutex_lock(drv->mutex);
> surf = handle_table_get(drv->htab, render_target);
>  
> -   if (!surf || !surf->buffer)
> +   if (!surf || !surf->buffer) {
> +  pipe_mutex_unlock(drv->mutex);
>return VA_STATUS_ERROR_INVALID_SURFACE;
> +   }
>  
> context = handle_table_get(drv->htab, surf->ctx);
> if (!context) {
> --
> 2.9.3
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] anv: Give the installed intel_icd.json file an absolute path

2016-08-23 Thread Julien Cristau
On Fri, Aug 19, 2016 at 09:04:14 -0700, Jason Ekstrand wrote:

> diff --git a/src/intel/vulkan/Makefile.am b/src/intel/vulkan/Makefile.am
> index ad0148d..9fef960 100644
> --- a/src/intel/vulkan/Makefile.am
> +++ b/src/intel/vulkan/Makefile.am
> @@ -141,7 +141,7 @@ anv_timestamp.h:
>   $(AM_V_GEN) echo "#define ANV_TIMESTAMP \"$(TIMESTAMP_CMD)\"" > $@
>  
>  BUILT_SOURCES = $(VULKAN_GENERATED_FILES)
> -CLEANFILES = $(BUILT_SOURCES) dev_icd.json
> +CLEANFILES = $(BUILT_SOURCES) dev_icd.json intel_icd.json
>  EXTRA_DIST = \
>   $(top_srcdir)/include/vulkan/vk_icd.h \
>   anv_entrypoints_gen.py \
> @@ -170,6 +170,11 @@ dev_icd.json : dev_icd.json.in
>   -e "s#@build_libdir@#${abs_top_builddir}/${LIB_DIR}#" \
>   < $(srcdir)/dev_icd.json.in > $@
>  
> +intel_icd.json : intel_icd.json.in
> + $(AM_V_GEN) $(SED) \
> + -e "s#@install_libdir@#${libdir}#" \
> + < $(srcdir)/intel_icd.json.in > $@
> +
>  # Libvulkan with dummy gem. Used for unit tests.
>  libvulkan_test_la_SOURCES = $(VULKAN_GEM_STUB_FILES)
>  libvulkan_test_la_LIBADD = $(VULKAN_LIB_DEPS) -lX11-xcb

Incidentally, if intel_icd.json is generated then it should be removed
from EXTRA_DIST (but maybe intel_icd.json.in needs to be added?).

Cheers,
Julien
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2] i965/vec4: make offset() operate in terms of channels instead of full registers

2016-08-23 Thread Iago Toral
On Tue, 2016-08-23 at 11:21 +0200, Michael Schellenberger Costa wrote:
> Hi Iago,
> 
> given that the idea here was to unify vec4 and fs you might want to
> adopt the names/function types accordingly.
> 
> In brw_ir_fs.h there is byte_offset that returns a fs_reg while you
> have
> void add_byte_offset.

Yeah, the reason for that is that in the FS backend we only have
fs_reg, so byte_offset takes that as argument and returns an updated
copy of an fs_reg that we can return directly from the offset() helper.

In the vec4 backend we have to deal with src_reg and dst_reg but I was
trying not want to duplicate the logic for each (the switch statement).
I did not notice that the FS backend was using byte_offset() directly
in other places, so it might be a good idea to make a byte_offset()
helper in the vec4 backend that works in the same fashion. That means
that I'll move the switch statement to another helper that takes a
backend_reg as input parameter and call that from byte_offset.

Iago

> --Michael
> 
> Am 23.08.2016 um 10:24 schrieb Iago Toral Quiroga:
> > 
> > This will make it more consistent with the FS implementation of the
> > same
> > helper and will provide more flexibility that will come in handy,
> > for
> > example, when we add a SIMD lowering pass in the vec4 backend.
> > 
> > v2:
> >  - Move the switch statement to add_byte_offset (Iago)
> >  - Remove the assert on the register file, it is redundant with the
> > switch
> >    (Michael Schellenberger)
> >  - Fix use of '=' instead of '==' (Michael Schellenberger,
> >    Francesco Ansanelli)
> > ---
> >  src/mesa/drivers/dri/i965/brw_ir_vec4.h | 36
> > +
> >  1 file changed, 28 insertions(+), 8 deletions(-)
> > 
> > diff --git a/src/mesa/drivers/dri/i965/brw_ir_vec4.h
> > b/src/mesa/drivers/dri/i965/brw_ir_vec4.h
> > index 81b6a13..d55b522 100644
> > --- a/src/mesa/drivers/dri/i965/brw_ir_vec4.h
> > +++ b/src/mesa/drivers/dri/i965/brw_ir_vec4.h
> > @@ -60,12 +60,33 @@ retype(src_reg reg, enum brw_reg_type type)
> > return reg;
> >  }
> >  
> > +static inline void
> > +add_byte_offset(backend_reg *reg, unsigned delta)
> > +{
> > +   switch (reg->file) {
> > +   case BAD_FILE:
> > +  break;
> > +   case MRF:
> > +   case VGRF:
> > +   case ATTR:
> > +   case UNIFORM: {
> > +  const unsigned suboffset = reg->subreg_offset + delta;
> > +  reg->reg_offset += suboffset / REG_SIZE;
> > +  reg->subreg_offset += suboffset % REG_SIZE;
> > +  /* Align16 requires that register accesses are 16-byte
> > aligned */
> > +  assert(reg->subreg_offset % 16 == 0);
> > +  break;
> > +   }
> > +   default:
> > +  assert(delta == 0);
> > +   }
> > +}
> > +
> >  static inline src_reg
> > -offset(src_reg reg, unsigned delta)
> > +offset(src_reg reg, unsigned width, unsigned delta)
> >  {
> > -   assert(delta == 0 ||
> > -  (reg.file != ARF && reg.file != FIXED_GRF && reg.file !=
> > IMM));
> > -   reg.reg_offset += delta;
> > +   unsigned byte_offset = delta * width * type_sz(reg.type);
> > +   add_byte_offset(, byte_offset);
> > return reg;
> >  }
> >  
> > @@ -130,11 +151,10 @@ retype(dst_reg reg, enum brw_reg_type type)
> >  }
> >  
> >  static inline dst_reg
> > -offset(dst_reg reg, unsigned delta)
> > +offset(dst_reg reg, unsigned width, unsigned delta)
> >  {
> > -   assert(delta == 0 ||
> > -  (reg.file != ARF && reg.file != FIXED_GRF && reg.file !=
> > IMM));
> > -   reg.reg_offset += delta;
> > +   unsigned byte_offset = delta * width * type_sz(reg.type);
> > +   add_byte_offset(, byte_offset);
> > return reg;
> >  }
> >  
> > 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 3/5] i965: use new subroutine index uploader.

2016-08-23 Thread Dave Airlie
>>  #include "main/mtypes.h"
>>  #include "program/prog_parameter.h"
>> -
>> +#include "main/shaderapi.h"
>
> Why the extra empty line? If so, I would put it after the new include.

It actually removes the empty line, but I think it was there
deliberately, so I'll put it back.

Dave.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] egl: reset 'ViewportInitialized' when unbinding current context

2016-08-23 Thread Emil Velikov
On 15 August 2016 at 23:58, Dongwon Kim  wrote:
> 'ViewportInitialized' flag in gl_context has to be reset to '0'
> when the current context is unbound via a eglMakeCurrent call with
> all of 'NULL' resources (surfaces and context).
>
> This is to make sure the viewport of the context is re-initialized
> when the same context is bound to new read and draw surfaces
> (or same surfaces but with different size.) next time when the
> context is made current again.
>
The spec is pretty clear about this:

"The first time a OpenGL or OpenGL ES context is made current the
viewport and scissor dimensions are set to the size of the draw
surface (as though glViewport(0, 0, w, h) and glScissor(0, 0, w, h)
were called, where w and h are the width and height of the surface,
respectively). However, the viewport and scissor dimensions are not
modified when ctx is subsequently made current."

So unless coffee hasn't kicked this patch is wrong.
Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 97420] "#version 0" crashes glsl_compiler

2016-08-23 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=97420

Vinson Lee  changed:

   What|Removed |Added

   Keywords||bisected, regression

-- 
You are receiving this mail because:
You are the assignee for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/3] glsl: Fix incorrect hard-coded location of the gl_SecondaryFragColorEXT built-in.

2016-08-23 Thread Francisco Jerez
Ilia Mirkin  writes:

> On Mon, Aug 22, 2016 at 10:55 PM, Francisco Jerez  
> wrote:
>> Ilia Mirkin  writes:
>>
>>> On Mon, Aug 22, 2016 at 9:59 PM, Francisco Jerez  
>>> wrote:
 gl_SecondaryFragColorEXT should have the same location as gl_FragColor
 for the secondary fragment color to be replicated to all fragment
 outputs.  The incorrect location of gl_SecondaryFragColorEXT would
 cause the linker to mark both FRAG_RESULT_COLOR and FRAG_RESULT_DATA0
 as being written to, which isn't allowed by the spec and would
 ultimately lead to an assertion failure in
 fs_visitor::emit_fb_writes() on my i965-fb-fetch branch.
>>>
>>> My recollection was that it didn't work with COLOR for "stupid"
>>> reasons. Can you confirm that
>>> bin/arb_blend_func_extended-fbo-extended-blend-pattern_gles2 -auto
>>> passes with this patch?
>>>
>> Yes, it does, in fact
>> arb_blend_func_extended-fbo-extended-blend-pattern_gles2 hits the i965
>> assertion failure I mentioned above unless this patch is applied.
>
> This causes the test in question to fail on nouveau... the TGSI shader
> generated starts with
>
> FRAG
> PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1
> DCL IN[0], POSITION, LINEAR
> DCL OUT[0], SAMPLEMASK

Heh, this smells a lot like the bug fixed in PATCH 1, but somewhere in
the mesa state tracker.  st_glsl_to_tgsi.cpp:2422 does:

| entry = new(mem_ctx) variable_storage(var,
|   PROGRAM_OUTPUT,
|   var->data.location
|   + var->data.index);

which is obviously bogus, e.g. for var->data.location ==
FRAG_RESULT_COLOR and var->data.index == 1 you get
FRAG_RESULT_SAMPLE_MASK which explains the sample mask declaration
above.

> DCL OUT[1], COLOR
> DCL CONST[2..3]
> DCL CONST[0..1]
> DCL TEMP[0]
> DCL TEMP[1..2], LOCAL
>
> So clearly something gets confused. I think this needs a bit more
> investigation... without this patch, the header looks more like
>
> FRAG
> PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1
> DCL IN[0], POSITION, LINEAR
> DCL OUT[0], COLOR
> DCL OUT[1], COLOR[1]


signature.asc
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


  1   2   >