[Mesa-dev] [PATCH] intel/compiler: Cast reg types explicitly

2017-08-25 Thread Topi Pohjolainen
Makes coverity happier.

CC: Matt Turner 
CID: 1416799
Fixes: c1ac1a3d25 (i965: Add a brw_hw_type_to_reg_type() function)

Signed-off-by: Topi Pohjolainen 
---
 src/intel/compiler/brw_reg_type.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/intel/compiler/brw_reg_type.c b/src/intel/compiler/brw_reg_type.c
index a0f674f0d7..98c4cf7234 100644
--- a/src/intel/compiler/brw_reg_type.c
+++ b/src/intel/compiler/brw_reg_type.c
@@ -111,13 +111,13 @@ brw_hw_type_to_reg_type(const struct gen_device_info *devinfo,
 {
if (file == BRW_IMMEDIATE_VALUE) {
   for (enum brw_reg_type i = 0; i <= BRW_REGISTER_TYPE_LAST; i++) {
- if (gen4_hw_type[i].imm_type == hw_type) {
+ if (gen4_hw_type[i].imm_type == (enum hw_imm_type)hw_type) {
 return i;
  }
   }
} else {
   for (enum brw_reg_type i = 0; i <= BRW_REGISTER_TYPE_LAST; i++) {
- if (gen4_hw_type[i].reg_type == hw_type) {
+ if (gen4_hw_type[i].reg_type == (enum hw_reg_type)hw_type) {
 return i;
  }
   }
-- 
2.11.0
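
For readers following along, here is a minimal sketch of the pattern the patch applies: the caller passes a plain unsigned hardware encoding, and each comparison casts it to the enum type of the table field it is compared against. All toy_* names and values are hypothetical stand-ins, not Mesa's real enums or table, and the exact Coverity complaint is not quoted in the patch, so treat this as an illustration only.

```c
/* Hypothetical stand-ins; these are not Mesa's real enums or table. */
enum toy_hw_reg_type { TOY_HW_REG_UD = 0, TOY_HW_REG_D = 1, TOY_HW_REG_F = 7 };
enum toy_hw_imm_type { TOY_HW_IMM_UD = 0, TOY_HW_IMM_D = 1, TOY_HW_IMM_F = 7 };

enum toy_reg_type {
   TOY_TYPE_UD,
   TOY_TYPE_D,
   TOY_TYPE_F,
   TOY_TYPE_LAST = TOY_TYPE_F,
   TOY_TYPE_INVALID,
};

/* One row per logical type, holding both hardware encodings. */
static const struct {
   enum toy_hw_reg_type reg_type;
   enum toy_hw_imm_type imm_type;
} toy_hw_type[] = {
   [TOY_TYPE_UD] = { TOY_HW_REG_UD, TOY_HW_IMM_UD },
   [TOY_TYPE_D]  = { TOY_HW_REG_D,  TOY_HW_IMM_D  },
   [TOY_TYPE_F]  = { TOY_HW_REG_F,  TOY_HW_IMM_F  },
};

/* Look up the logical type for a raw hardware encoding.  The explicit
 * casts make both sides of each comparison the same enum type. */
static enum toy_reg_type
toy_hw_type_to_reg_type(int is_immediate, unsigned hw_type)
{
   for (int i = 0; i <= TOY_TYPE_LAST; i++) {
      if (is_immediate) {
         if (toy_hw_type[i].imm_type == (enum toy_hw_imm_type)hw_type)
            return (enum toy_reg_type)i;
      } else {
         if (toy_hw_type[i].reg_type == (enum toy_hw_reg_type)hw_type)
            return (enum toy_reg_type)i;
      }
   }
   return TOY_TYPE_INVALID;
}
```

The cast does not change the comparison result; it only states the intended type explicitly.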

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] nir: Fix some whatespace

2017-08-25 Thread Jason Ekstrand
On Fri, Aug 25, 2017 at 8:59 PM, Matt Turner  wrote:

> Both are
>
> Reviewed-by: Matt Turner 
>
> (Typo in the title of this patch)
>
> Should 2/2 go to stable? I'm not really sure how that code gets used.
>

Probably wouldn't hurt.  Doing a quick grep for the function, I don't see
anywhere that it actually matters for those two system values.  We use it
in nir_gather_info to fill out system_values_used and in brw_fs_nir because
we have a table of pre-computed system values.  However, since they get
lowered earlier on, I don't think it matters in practice.


Re: [Mesa-dev] [PATCH 1/2] nir: Fix some whatespace

2017-08-25 Thread Matt Turner
Both are

Reviewed-by: Matt Turner 

(Typo in the title of this patch)

Should 2/2 go to stable? I'm not really sure how that code gets used.


Re: [Mesa-dev] [PATCH 00/10] gallium: normalize CONST file accesses to 2D - 2. try without image for the list

2017-08-25 Thread Dieter Nützel

Brave man,

with the little 'squash! st/glsl_to_tgsi: inline src_register into 
translate_src' on top of your series, both issues are _solved_.


So you have my

Tested-by: Dieter Nützel 

for the series.

BTW
Running this together with:
Prehash-all-the-things.mbox (Thomas Helland)
mesa-st-glsl_to_tgsi-refined-register-merge-algorithm.mbox (v9, by Gert 
Wollny, disabled/enabled)


Cheers back to you with German beer (of course),
Dieter

On 25.08.2017 09:13, Nicolai Hähnle wrote:

Hi Dieter,

sorry for the churn -- do these issues also occur with the latest
addition to
https://cgit.freedesktop.org/~nh/mesa/log/?h=tgsi-const-2d?

Cheers,
Nicolai

On 25.08.2017 08:19, Dieter Nützel wrote:

On 25.08.2017 07:38, Dieter Nützel wrote:

On 23.08.2017 18:41, Nicolai Hähnle wrote:

Hi all,

Following the discussion on Timothy's std430 packing series, here's
a quick proposal to just always use 2D accesses to the CONST file
in TGSI.

The first patch should be sufficient for all drivers to accept
those 2D accesses. It seems that most older drivers simply ignore
the dimension, and newer ones should handle it directly.

Subsequent patches modify the producers of TGSI to always use 2D
constant references. This is mostly done by changing ureg.

Finally, the last patch adds an assertion to radeonsi to make
sure all constant references are really 2D. It has survived my
very superficial initial testing.


Addendum:

glmark2 (2017.07) threw an assertion.
A system hang followed, so sadly only a mobile screenshot is appended.

Goodnight! ;-)

Dieter


What needs to be tested is:
- some more drivers
- Nine


Sorry Nicolai,

but I see Nine corruption with Wine (LS2017 / FarmingSimulator2017) on
RX580 here.


After a KDE relogin there is partial window/screen corruption (window border
pixel flickering).

Dieter



- TGSI-to-NIR

You can find the series here:
https://cgit.freedesktop.org/~nh/mesa/log/?h=tgsi-const-2d

Please comment/review!
Thanks,
Nicolai
--
 src/gallium/auxiliary/hud/hud_context.c  |   8 +-
 src/gallium/auxiliary/nir/tgsi_to_nir.c  |   2 +-
 src/gallium/auxiliary/postprocess/pp_mlaa.h  |  20 +--
 src/gallium/auxiliary/tgsi/tgsi_ureg.c   |  22 +--
 src/gallium/auxiliary/util/u_tests.c |   4 +-
 src/gallium/docs/source/screen.rst   |  11 +-
 src/gallium/drivers/radeon/r600_query.c  |  36 ++--
 src/gallium/drivers/radeonsi/si_shader.c |   1 +
 src/gallium/state_trackers/nine/nine_ff.c|   2 +-
 .../state_trackers/nine/nine_shader.c|  10 +-
 .../tests/graw/fragment-shader/frag-cb-1d.sh |   8 +-
 .../tests/graw/vertex-shader/vert-cb-1d.sh   |   8 +-
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp   | 153 +
 13 files changed, 136 insertions(+), 149 deletions(-)



[Mesa-dev] [PATCH 2/2] nir: Fix system_value_from_intrinsic for subgroups

2017-08-25 Thread Jason Ekstrand
A couple of the cases were backwards
---
 src/compiler/nir/nir.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/compiler/nir/nir.c b/src/compiler/nir/nir.c
index e9e0489..afd4d1a 100644
--- a/src/compiler/nir/nir.c
+++ b/src/compiler/nir/nir.c
@@ -1992,10 +1992,10 @@ nir_system_value_from_intrinsic(nir_intrinsic_op intrin)
   return SYSTEM_VALUE_HELPER_INVOCATION;
case nir_intrinsic_load_view_index:
   return SYSTEM_VALUE_VIEW_INDEX;
-   case SYSTEM_VALUE_SUBGROUP_SIZE:
-  return nir_intrinsic_load_subgroup_size;
-   case SYSTEM_VALUE_SUBGROUP_INVOCATION:
-  return nir_intrinsic_load_subgroup_invocation;
+   case nir_intrinsic_load_subgroup_size:
+  return SYSTEM_VALUE_SUBGROUP_SIZE;
+   case nir_intrinsic_load_subgroup_invocation:
+  return SYSTEM_VALUE_SUBGROUP_INVOCATION;
case nir_intrinsic_load_subgroup_eq_mask:
   return SYSTEM_VALUE_SUBGROUP_EQ_MASK;
case nir_intrinsic_load_subgroup_ge_mask:
-- 
2.5.0.400.gff86faf
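
As an aside, a swap like this compiles silently because C enums convert freely to int, so a case label from the wrong enum is still accepted. A round-trip check over the two mapping functions would catch it. The sketch below uses hypothetical toy_* enums and functions (not the real nir.c API) to show the idea.

```c
/* Hypothetical toy enums; the value ranges are deliberately disjoint so a
 * swapped case label can never match by accident. */
enum toy_sysval {
   TOY_SYSVAL_SUBGROUP_SIZE,
   TOY_SYSVAL_SUBGROUP_INVOCATION,
   TOY_SYSVAL_COUNT,
};

enum toy_intrinsic {
   TOY_INTRIN_INVALID = -1,
   TOY_INTRIN_LOAD_SUBGROUP_SIZE = 100,
   TOY_INTRIN_LOAD_SUBGROUP_INVOCATION = 101,
};

/* System value -> intrinsic. */
static enum toy_intrinsic
toy_intrinsic_from_sysval(enum toy_sysval val)
{
   switch (val) {
   case TOY_SYSVAL_SUBGROUP_SIZE:
      return TOY_INTRIN_LOAD_SUBGROUP_SIZE;
   case TOY_SYSVAL_SUBGROUP_INVOCATION:
      return TOY_INTRIN_LOAD_SUBGROUP_INVOCATION;
   default:
      return TOY_INTRIN_INVALID;
   }
}

/* Intrinsic -> system value: the case labels are intrinsics and the
 * returned values are system values.  Returns -1 when there is no match. */
static int
toy_sysval_from_intrinsic(enum toy_intrinsic op)
{
   switch (op) {
   case TOY_INTRIN_LOAD_SUBGROUP_SIZE:
      return TOY_SYSVAL_SUBGROUP_SIZE;
   case TOY_INTRIN_LOAD_SUBGROUP_INVOCATION:
      return TOY_SYSVAL_SUBGROUP_INVOCATION;
   default:
      return -1;
   }
}

/* Returns 1 if every system value survives the round trip, 0 otherwise. */
static int
toy_mapping_round_trips(void)
{
   for (int v = 0; v < TOY_SYSVAL_COUNT; v++) {
      enum toy_intrinsic op = toy_intrinsic_from_sysval((enum toy_sysval)v);
      if (toy_sysval_from_intrinsic(op) != v)
         return 0;
   }
   return 1;
}
```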



[Mesa-dev] [PATCH 1/2] nir: Fix some whatespace

2017-08-25 Thread Jason Ekstrand
Somehow tabs got in there...
---
 src/compiler/nir/nir.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/src/compiler/nir/nir.c b/src/compiler/nir/nir.c
index 841b7f4..e9e0489 100644
--- a/src/compiler/nir/nir.c
+++ b/src/compiler/nir/nir.c
@@ -1928,15 +1928,15 @@ nir_intrinsic_from_system_value(gl_system_value val)
case SYSTEM_VALUE_SUBGROUP_INVOCATION:
   return nir_intrinsic_load_subgroup_invocation;
case SYSTEM_VALUE_SUBGROUP_EQ_MASK:
-   return nir_intrinsic_load_subgroup_eq_mask;
+  return nir_intrinsic_load_subgroup_eq_mask;
case SYSTEM_VALUE_SUBGROUP_GE_MASK:
-   return nir_intrinsic_load_subgroup_ge_mask;
+  return nir_intrinsic_load_subgroup_ge_mask;
case SYSTEM_VALUE_SUBGROUP_GT_MASK:
-   return nir_intrinsic_load_subgroup_gt_mask;
+  return nir_intrinsic_load_subgroup_gt_mask;
case SYSTEM_VALUE_SUBGROUP_LE_MASK:
-   return nir_intrinsic_load_subgroup_le_mask;
+  return nir_intrinsic_load_subgroup_le_mask;
case SYSTEM_VALUE_SUBGROUP_LT_MASK:
-   return nir_intrinsic_load_subgroup_lt_mask;
+  return nir_intrinsic_load_subgroup_lt_mask;
default:
   unreachable("system value does not directly correspond to intrinsic");
}
-- 
2.5.0.400.gff86faf



[Mesa-dev] [Bug 102377] PIPE_*_4BYTE_ALIGNED_ONLY caps crashing

2017-08-25 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=102377

Brian Paul  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|NEW |RESOLVED

--- Comment #5 from Brian Paul  ---
Thanks, Bruce.  Patch committed: d819b1fcec02be5e0cfc87b6246833a2a2d5f034

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH] util: move string_to_uint_map to glsl

2017-08-25 Thread Jason Ekstrand

Ack.  Not a review though.


On August 25, 2017 6:38:46 PM Emil Velikov  wrote:


From: Emil Velikov 

The functionality is used by glsl and mesa, with the latter already
depending on the former.

With this in place the src/util/ static library libmesautil.la no longer
has a C++ dependency. Thus objects which use it (like libEGL) don't need
the C++ link.

Cc: Jason Ekstrand 
Suggested-by: Jason Ekstrand 
Cc: "17.2" 
Fixes: 02cc35937277 ("egl/wayland: Use linux-dmabuf interface for buffers")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101851
Signed-off-by: Emil Velikov 
---
 src/compiler/Makefile.sources | 4 +++-
 src/compiler/glsl/link_uniform_initializers.cpp   | 2 +-
 src/compiler/glsl/link_uniforms.cpp   | 2 +-
 src/compiler/glsl/linker.cpp  | 2 +-
 src/compiler/glsl/shader_cache.cpp| 2 +-
 src/compiler/glsl/standalone.cpp  | 2 +-
 src/{util => compiler/glsl}/string_to_uint_map.cpp| 0
 src/{util => compiler/glsl}/string_to_uint_map.h  | 0
 src/compiler/glsl/tests/set_uniform_initializer_tests.cpp | 2 +-
 src/mesa/main/shader_query.cpp| 2 +-
 src/mesa/main/shaderobj.c | 2 +-
 src/mesa/program/ir_to_mesa.cpp   | 2 +-
 src/mesa/state_tracker/st_glsl_to_nir.cpp | 2 +-
 src/util/Makefile.sources | 2 --
 14 files changed, 13 insertions(+), 13 deletions(-)
 rename src/{util => compiler/glsl}/string_to_uint_map.cpp (100%)
 rename src/{util => compiler/glsl}/string_to_uint_map.h (100%)

diff --git a/src/compiler/Makefile.sources b/src/compiler/Makefile.sources
index 6e08dfb8448..a8309a1c592 100644
--- a/src/compiler/Makefile.sources
+++ b/src/compiler/Makefile.sources
@@ -140,7 +140,9 @@ LIBGLSL_FILES = \
glsl/program.h \
glsl/propagate_invariance.cpp \
glsl/s_expression.cpp \
-   glsl/s_expression.h
+   glsl/s_expression.h \
+   glsl/string_to_uint_map.cpp \
+   glsl/string_to_uint_map.h

 LIBGLSL_SHADER_CACHE_FILES = \
glsl/shader_cache.cpp \
diff --git a/src/compiler/glsl/link_uniform_initializers.cpp b/src/compiler/glsl/link_uniform_initializers.cpp
index e7f9c9d8ac0..84a38793f64 100644
--- a/src/compiler/glsl/link_uniform_initializers.cpp
+++ b/src/compiler/glsl/link_uniform_initializers.cpp
@@ -25,7 +25,7 @@
 #include "ir.h"
 #include "linker.h"
 #include "ir_uniform.h"
-#include "util/string_to_uint_map.h"
+#include "string_to_uint_map.h"

 /* These functions are put in a "private" namespace instead of being marked
  * static so that the unit tests can access them.  See
diff --git a/src/compiler/glsl/link_uniforms.cpp b/src/compiler/glsl/link_uniforms.cpp
index 1b87c5860b6..99b171d7881 100644
--- a/src/compiler/glsl/link_uniforms.cpp
+++ b/src/compiler/glsl/link_uniforms.cpp
@@ -27,7 +27,7 @@
 #include "ir_uniform.h"
 #include "glsl_symbol_table.h"
 #include "program.h"
-#include "util/string_to_uint_map.h"
+#include "string_to_uint_map.h"
 #include "ir_array_refcount.h"

 /**
diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp
index b4784c51199..90c1084c50f 100644
--- a/src/compiler/glsl/linker.cpp
+++ b/src/compiler/glsl/linker.cpp
@@ -75,7 +75,7 @@
 #include "program/program.h"
 #include "util/mesa-sha1.h"
 #include "util/set.h"
-#include "util/string_to_uint_map.h"
+#include "string_to_uint_map.h"
 #include "linker.h"
 #include "link_varyings.h"
 #include "ir_optimization.h"
diff --git a/src/compiler/glsl/shader_cache.cpp b/src/compiler/glsl/shader_cache.cpp
index cc4d24482d9..887b8954c51 100644
--- a/src/compiler/glsl/shader_cache.cpp
+++ b/src/compiler/glsl/shader_cache.cpp
@@ -59,7 +59,7 @@
 #include "program.h"
 #include "shader_cache.h"
 #include "util/mesa-sha1.h"
-#include "util/string_to_uint_map.h"
+#include "string_to_uint_map.h"

 extern "C" {
 #include "main/enums.h"
diff --git a/src/compiler/glsl/standalone.cpp b/src/compiler/glsl/standalone.cpp
index 52554bb92a2..7a84ca72212 100644
--- a/src/compiler/glsl/standalone.cpp
+++ b/src/compiler/glsl/standalone.cpp
@@ -36,7 +36,7 @@
 #include "loop_analysis.h"
 #include "standalone_scaffolding.h"
 #include "standalone.h"
-#include "util/string_to_uint_map.h"
+#include "string_to_uint_map.h"
 #include "util/set.h"
 #include "linker.h"
 #include "glsl_parser_extras.h"
diff --git a/src/util/string_to_uint_map.cpp b/src/compiler/glsl/string_to_uint_map.cpp
similarity index 100%
rename from src/util/string_to_uint_map.cpp
rename to src/compiler/glsl/string_to_uint_map.cpp
diff --git a/src/util/string_to_uint_map.h b/src/compiler/glsl/string_to_uint_map.h
similarity index 100%
rename from src/util/string_to_uint_map.h
rename 

Re: [Mesa-dev] Question for nir lower load uniform to scalar

2017-08-25 Thread Qiang Yu
On Sat, Aug 26, 2017 at 3:07 AM, Eric Anholt  wrote:
> Qiang Yu  writes:
>
>> Hi Eric,
>>
>> I'm working on the lima gp compiler, which should benefit from nir lowering
>> uniform loads to scalar.
>> I noticed you wrote nir_lower_io_to_scalar.c, which supports lowering
>> shader_in/shader_out
>> but left the uniform lowering in the vc4 driver. Is there any reason not to
>> implement it in nir_lower_io_to_scalar.c?
>
> I think my theory was that drivers would want different units for the
> base/offset (bytes or dwords), so I left it in vc4.  Anyone else want to
> weigh in on this?  vc4 wants indirect load offsets in units of bytes.
Oh, I see. Unfortunately the lima gp needs the base/offset in units of 4
components, just like the nir base/offset, so I ended up adding a component field.

>
>> I'm new to nir and tried to add it, but the result seems incorrect after
>> the optimization passes. So I must be missing
>> something; can anyone help point it out?
>
> Your nir_lower_io.c code looks correct to me, so I'm not sure what might
> be missing.  I'm not sure about using the component field, though -- for
> VC4 all I want after lowering is a byte offset within the constant
> buffer.
After dumping the shader between passes, it turns out things go wrong in the
CSE pass, which eliminates the SSA values of loads with component != 0. Maybe
the CSE pass needs a change to know about the load_uniform component field,
just as it does for load_input.

The problem turned out to be my own mistake: I had not changed num_indices:
- LOAD(uniform, 1, 2, BASE, RANGE, COMPONENT,
NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER)
+ LOAD(uniform, 1, 3, BASE, RANGE, COMPONENT,
NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER)
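
To make the symptom concrete, here is a simplified model of why a stale num_indices would let CSE merge loads that differ only in COMPONENT. It assumes the equality check used by CSE compares only the first num_indices constant indices, which matches the behaviour described above; the toy_* types and functions are hypothetical and are not the real NIR structures.

```c
#include <stdbool.h>

#define TOY_MAX_INDICES 3

/* Hypothetical per-intrinsic metadata: how many const indices are valid. */
struct toy_intrinsic_info {
   unsigned num_indices;
};

/* Hypothetical load instruction: const_index = { BASE, RANGE, COMPONENT }. */
struct toy_load {
   int const_index[TOY_MAX_INDICES];
};

/* Equality as a CSE pass might test it: only the first num_indices
 * entries take part in the comparison. */
static bool
toy_loads_equal(const struct toy_intrinsic_info *info,
                const struct toy_load *a, const struct toy_load *b)
{
   for (unsigned i = 0; i < info->num_indices; i++) {
      if (a->const_index[i] != b->const_index[i])
         return false;
   }
   return true;
}

/* Two loads with the same BASE and RANGE but different COMPONENT:
 * would CSE treat them as the same value for a given num_indices? */
static bool
toy_would_cse(unsigned num_indices)
{
   struct toy_intrinsic_info info = { num_indices };
   struct toy_load x = { { 0, 16, 0 } };
   struct toy_load y = { { 0, 16, 1 } };
   return toy_loads_equal(&info, &x, &y);
}
```

With num_indices left at 2 the two loads compare equal and one of them is dropped; with 3 they stay distinct.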

Thanks,
Qiang


[Mesa-dev] [PATCH] util: move string_to_uint_map to glsl

2017-08-25 Thread Emil Velikov
From: Emil Velikov 

The functionality is used by glsl and mesa, with the latter already
depending on the former.

With this in place the src/util/ static library libmesautil.la no longer
has a C++ dependency. Thus objects which use it (like libEGL) don't need
the C++ link.

Cc: Jason Ekstrand 
Suggested-by: Jason Ekstrand 
Cc: "17.2" 
Fixes: 02cc35937277 ("egl/wayland: Use linux-dmabuf interface for buffers")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101851
Signed-off-by: Emil Velikov 
---
 src/compiler/Makefile.sources | 4 +++-
 src/compiler/glsl/link_uniform_initializers.cpp   | 2 +-
 src/compiler/glsl/link_uniforms.cpp   | 2 +-
 src/compiler/glsl/linker.cpp  | 2 +-
 src/compiler/glsl/shader_cache.cpp| 2 +-
 src/compiler/glsl/standalone.cpp  | 2 +-
 src/{util => compiler/glsl}/string_to_uint_map.cpp| 0
 src/{util => compiler/glsl}/string_to_uint_map.h  | 0
 src/compiler/glsl/tests/set_uniform_initializer_tests.cpp | 2 +-
 src/mesa/main/shader_query.cpp| 2 +-
 src/mesa/main/shaderobj.c | 2 +-
 src/mesa/program/ir_to_mesa.cpp   | 2 +-
 src/mesa/state_tracker/st_glsl_to_nir.cpp | 2 +-
 src/util/Makefile.sources | 2 --
 14 files changed, 13 insertions(+), 13 deletions(-)
 rename src/{util => compiler/glsl}/string_to_uint_map.cpp (100%)
 rename src/{util => compiler/glsl}/string_to_uint_map.h (100%)

diff --git a/src/compiler/Makefile.sources b/src/compiler/Makefile.sources
index 6e08dfb8448..a8309a1c592 100644
--- a/src/compiler/Makefile.sources
+++ b/src/compiler/Makefile.sources
@@ -140,7 +140,9 @@ LIBGLSL_FILES = \
glsl/program.h \
glsl/propagate_invariance.cpp \
glsl/s_expression.cpp \
-   glsl/s_expression.h
+   glsl/s_expression.h \
+   glsl/string_to_uint_map.cpp \
+   glsl/string_to_uint_map.h
 
 LIBGLSL_SHADER_CACHE_FILES = \
glsl/shader_cache.cpp \
diff --git a/src/compiler/glsl/link_uniform_initializers.cpp b/src/compiler/glsl/link_uniform_initializers.cpp
index e7f9c9d8ac0..84a38793f64 100644
--- a/src/compiler/glsl/link_uniform_initializers.cpp
+++ b/src/compiler/glsl/link_uniform_initializers.cpp
@@ -25,7 +25,7 @@
 #include "ir.h"
 #include "linker.h"
 #include "ir_uniform.h"
-#include "util/string_to_uint_map.h"
+#include "string_to_uint_map.h"
 
 /* These functions are put in a "private" namespace instead of being marked
  * static so that the unit tests can access them.  See
diff --git a/src/compiler/glsl/link_uniforms.cpp b/src/compiler/glsl/link_uniforms.cpp
index 1b87c5860b6..99b171d7881 100644
--- a/src/compiler/glsl/link_uniforms.cpp
+++ b/src/compiler/glsl/link_uniforms.cpp
@@ -27,7 +27,7 @@
 #include "ir_uniform.h"
 #include "glsl_symbol_table.h"
 #include "program.h"
-#include "util/string_to_uint_map.h"
+#include "string_to_uint_map.h"
 #include "ir_array_refcount.h"
 
 /**
diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp
index b4784c51199..90c1084c50f 100644
--- a/src/compiler/glsl/linker.cpp
+++ b/src/compiler/glsl/linker.cpp
@@ -75,7 +75,7 @@
 #include "program/program.h"
 #include "util/mesa-sha1.h"
 #include "util/set.h"
-#include "util/string_to_uint_map.h"
+#include "string_to_uint_map.h"
 #include "linker.h"
 #include "link_varyings.h"
 #include "ir_optimization.h"
diff --git a/src/compiler/glsl/shader_cache.cpp b/src/compiler/glsl/shader_cache.cpp
index cc4d24482d9..887b8954c51 100644
--- a/src/compiler/glsl/shader_cache.cpp
+++ b/src/compiler/glsl/shader_cache.cpp
@@ -59,7 +59,7 @@
 #include "program.h"
 #include "shader_cache.h"
 #include "util/mesa-sha1.h"
-#include "util/string_to_uint_map.h"
+#include "string_to_uint_map.h"
 
 extern "C" {
 #include "main/enums.h"
diff --git a/src/compiler/glsl/standalone.cpp b/src/compiler/glsl/standalone.cpp
index 52554bb92a2..7a84ca72212 100644
--- a/src/compiler/glsl/standalone.cpp
+++ b/src/compiler/glsl/standalone.cpp
@@ -36,7 +36,7 @@
 #include "loop_analysis.h"
 #include "standalone_scaffolding.h"
 #include "standalone.h"
-#include "util/string_to_uint_map.h"
+#include "string_to_uint_map.h"
 #include "util/set.h"
 #include "linker.h"
 #include "glsl_parser_extras.h"
diff --git a/src/util/string_to_uint_map.cpp b/src/compiler/glsl/string_to_uint_map.cpp
similarity index 100%
rename from src/util/string_to_uint_map.cpp
rename to src/compiler/glsl/string_to_uint_map.cpp
diff --git a/src/util/string_to_uint_map.h b/src/compiler/glsl/string_to_uint_map.h
similarity index 100%
rename from src/util/string_to_uint_map.h
rename to src/compiler/glsl/string_to_uint_map.h
diff --git a/src/compiler/glsl/tests/set_uniform_initializer_tests.cpp 

Re: [Mesa-dev] [PATCH] st/query: init result data with 0

2017-08-25 Thread Ilia Mirkin
On Fri, Aug 25, 2017 at 8:23 PM, Karol Herbst  wrote:
> On Sat, Aug 26, 2017 at 1:38 AM, Ilia Mirkin  wrote:
>> On Fri, Aug 25, 2017 at 7:37 PM, Karol Herbst  wrote:
>>> On Sat, Aug 26, 2017 at 1:30 AM, Ilia Mirkin  wrote:
>>>> Why is this necessary? If data is not initialized, then presumably
>>>> pipe->get_query_result will have returned false.
>>>>
>>>
>>> but it didn't. It might be the driver's fault (in my case nouveau) that
>>> it writes garbage or nothing into data, most likely the latter.
>>
>> Sounds like a nouveau bug then.
>
> looks like nouveau never writes to
> result->pipeline_statistics.cs_invocations, because it only writes
> into the first 10 fields leaving out this 11th one.

Right. We don't support CS invocations. Can't say I'm entirely sure
how to implement it. And unfortunately mmt traces that involve compute
don't always decode properly.

I don't think there's a hw counter, so you have to do something
clever. Like keep track of it in a scratch method for QBO and in a
counter on the cpu side?

  -ilia
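
A small sketch of the situation discussed in this thread: a driver reports success but fills in only the first 10 of 11 statistics fields, so the caller zero-initializes the result first. The toy_* struct and functions are hypothetical models, not the real gallium or nouveau code.

```c
#include <stdint.h>
#include <string.h>

#define TOY_NUM_STATS 11

/* Hypothetical result struct with 11 counters; the last slot stands in
 * for the compute-shader invocation count. */
struct toy_pipeline_stats {
   uint64_t counter[TOY_NUM_STATS];
};

/* Hypothetical driver hook: reports success (1) but only writes the
 * first 10 counters, leaving the 11th untouched. */
static int
toy_driver_get_result(struct toy_pipeline_stats *out)
{
   for (int i = 0; i < 10; i++)
      out->counter[i] = (uint64_t)(i + 1);
   return 1;
}

/* Caller side: clear the result before handing it to the driver, so a
 * field the driver skips reads back as 0 instead of stack garbage. */
static uint64_t
toy_read_last_counter(void)
{
   struct toy_pipeline_stats data;
   memset(&data, 0, sizeof(data));
   if (!toy_driver_get_result(&data))
      return 0;
   return data.counter[TOY_NUM_STATS - 1];
}
```

This only hides the missing counter; it does not implement it, which is the part that would still need driver work.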


[Mesa-dev] [PATCH 13/13] i965/fs: Don't apply POW/FDIV workaround on Gen10+

2017-08-25 Thread Matt Turner
The documentation says it applies only to Gens 8 and 9.
---
 src/intel/compiler/brw_fs_generator.cpp | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/intel/compiler/brw_fs_generator.cpp b/src/intel/compiler/brw_fs_generator.cpp
index 03ee26ccd4..07fd6307f0 100644
--- a/src/intel/compiler/brw_fs_generator.cpp
+++ b/src/intel/compiler/brw_fs_generator.cpp
@@ -1639,6 +1639,7 @@ fs_generator::generate_code(const cfg_t *cfg, int dispatch_width)
* and empirically this affects CHV as well.
*/
   if (devinfo->gen >= 8 &&
+  devinfo->gen <= 9 &&
   p->nr_insn > 1 &&
   brw_inst_opcode(devinfo, brw_last_inst) == BRW_OPCODE_MATH &&
   brw_inst_math_function(devinfo, brw_last_inst) == BRW_MATH_FUNCTION_POW &&
-- 
2.13.5
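
Put another way, the generation gate changes from an open-ended ">= 8" to a closed range. A minimal sketch of just that gate follows; the helper name is hypothetical, and the remaining conditions of the real check (instruction count, opcode, math function) are left out.

```c
#include <stdbool.h>

/* Hypothetical helper: true only for the generations the commit message
 * says the workaround applies to (8 and 9). */
static bool
toy_gen_needs_pow_fdiv_workaround(int gen)
{
   return gen >= 8 && gen <= 9;
}
```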



[Mesa-dev] [PATCH 07/13] i965: Add align1 ternary instruction field encodings

2017-08-25 Thread Matt Turner
---
 src/intel/compiler/brw_eu_defines.h | 35 +++
 1 file changed, 35 insertions(+)

diff --git a/src/intel/compiler/brw_eu_defines.h b/src/intel/compiler/brw_eu_defines.h
index da482b73c5..3af55e830c 100644
--- a/src/intel/compiler/brw_eu_defines.h
+++ b/src/intel/compiler/brw_eu_defines.h
@@ -148,6 +148,18 @@ enum PACKED brw_horizontal_stride {
BRW_HORIZONTAL_STRIDE_4 = 3,
 };
 
+enum PACKED gen10_align1_3src_src_horizontal_stride {
+   BRW_ALIGN1_3SRC_SRC_HORIZONTAL_STRIDE_0 = 0,
+   BRW_ALIGN1_3SRC_SRC_HORIZONTAL_STRIDE_1 = 1,
+   BRW_ALIGN1_3SRC_SRC_HORIZONTAL_STRIDE_2 = 2,
+   BRW_ALIGN1_3SRC_SRC_HORIZONTAL_STRIDE_4 = 3,
+};
+
+enum PACKED gen10_align1_3src_dst_horizontal_stride {
+   BRW_ALIGN1_3SRC_DST_HORIZONTAL_STRIDE_1 = 0,
+   BRW_ALIGN1_3SRC_DST_HORIZONTAL_STRIDE_2 = 1,
+};
+
 #define BRW_INSTRUCTION_NORMAL0
 #define BRW_INSTRUCTION_SATURATE  1
 
@@ -819,6 +831,12 @@ enum PACKED brw_reg_file {
BAD_FILE,
 };
 
+enum PACKED gen10_align1_3src_reg_file {
+   BRW_ALIGN1_3SRC_GENERAL_REGISTER_FILE = 0,
+   BRW_ALIGN1_3SRC_IMMEDIATE_VALUE   = 1, /* src0, src2 */
+   BRW_ALIGN1_3SRC_ACCUMULATOR   = 1, /* dest, src1 */
+};
+
 /* SNB adds 3-src instructions (MAD and LRP) that only operate on floats, so
  * the types were implied. IVB adds BFE and BFI2 that operate on doublewords
  * and unsigned doublewords, so a new field is also available in the da3src
@@ -830,6 +848,16 @@ enum PACKED brw_reg_file {
 #define BRW_3SRC_TYPE_UD 2
 #define BRW_3SRC_TYPE_DF 3
 
+/* CNL adds Align1 support for 3-src instructions. Bit 35 of the instruction
+ * word is "Execution Datatype" which controls whether the instruction operates
+ * on float or integer types. The register arguments have fields that offer
+ * finer control over their respective types.
+ */
+enum PACKED gen10_align1_3src_exec_type {
+   BRW_ALIGN1_3SRC_EXEC_TYPE_INT   = 0,
+   BRW_ALIGN1_3SRC_EXEC_TYPE_FLOAT = 1,
+};
+
 #define BRW_ARF_NULL  0x00
 #define BRW_ARF_ADDRESS   0x10
 #define BRW_ARF_ACCUMULATOR   0x20
@@ -868,6 +896,13 @@ enum PACKED brw_vertical_stride {
BRW_VERTICAL_STRIDE_ONE_DIMENSIONAL = 0xF,
 };
 
+enum PACKED gen10_align1_3src_vertical_stride {
+   BRW_ALIGN1_3SRC_VERTICAL_STRIDE_0 = 0,
+   BRW_ALIGN1_3SRC_VERTICAL_STRIDE_2 = 1,
+   BRW_ALIGN1_3SRC_VERTICAL_STRIDE_4 = 2,
+   BRW_ALIGN1_3SRC_VERTICAL_STRIDE_8 = 3,
+};
+
 enum PACKED brw_width {
BRW_WIDTH_1  = 0,
BRW_WIDTH_2  = 1,
-- 
2.13.5
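
Note that the new enums are compact encodings, not the stride values themselves: for example, encoding 3 of the source horizontal stride means a stride of 4. The sketch below decodes the encodings listed in the patch into element strides. The toy_* function names are hypothetical, and the mappings are read directly off the enum names above.

```c
/* Decode the 2-bit align1 3src source horizontal stride encoding
 * (0, 1, 2, 3) into the stride it names (0, 1, 2, 4).  Returns -1 for an
 * out-of-range encoding. */
static int
toy_align1_3src_src_hstride(unsigned enc)
{
   switch (enc) {
   case 0: return 0;
   case 1: return 1;
   case 2: return 2;
   case 3: return 4;
   default: return -1;
   }
}

/* Decode the 1-bit destination horizontal stride encoding (0, 1) into
 * the stride it names (1, 2). */
static int
toy_align1_3src_dst_hstride(unsigned enc)
{
   switch (enc) {
   case 0: return 1;
   case 1: return 2;
   default: return -1;
   }
}

/* Decode the 2-bit vertical stride encoding (0, 1, 2, 3) into the
 * stride it names (0, 2, 4, 8). */
static int
toy_align1_3src_vstride(unsigned enc)
{
   switch (enc) {
   case 0: return 0;
   case 1: return 2;
   case 2: return 4;
   case 3: return 8;
   default: return -1;
   }
}
```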



[Mesa-dev] [PATCH 10/13] i965: Add align1 ternary instruction disassembler support

2017-08-25 Thread Matt Turner
---
 src/intel/compiler/brw_disasm.c | 399 +---
 src/intel/compiler/brw_eu_defines.h |  11 -
 2 files changed, 322 insertions(+), 88 deletions(-)

diff --git a/src/intel/compiler/brw_disasm.c b/src/intel/compiler/brw_disasm.c
index 3726172e5d..7215735967 100644
--- a/src/intel/compiler/brw_disasm.c
+++ b/src/intel/compiler/brw_disasm.c
@@ -30,6 +30,7 @@
 #include "brw_reg.h"
 #include "brw_inst.h"
 #include "brw_eu.h"
+#include "util/half_float.h"
 
 static bool
 has_jip(const struct gen_device_info *devinfo, enum opcode opcode)
@@ -237,13 +238,6 @@ static const char *const access_mode[2] = {
[1] = "align16",
 };
 
-static const char *const three_source_reg_encoding[] = {
-   [BRW_3SRC_TYPE_F]  = "F",
-   [BRW_3SRC_TYPE_D]  = "D",
-   [BRW_3SRC_TYPE_UD] = "UD",
-   [BRW_3SRC_TYPE_DF] = "DF",
-};
-
 static const char *const reg_file[4] = {
[0] = "A",
[1] = "g",
@@ -762,17 +756,17 @@ dest(FILE *file, const struct gen_device_info *devinfo, const brw_inst *inst)
 static int
 dest_3src(FILE *file, const struct gen_device_info *devinfo, const brw_inst *inst)
 {
+   bool is_align1 = brw_inst_3src_access_mode(devinfo, inst) == BRW_ALIGN_1;
int err = 0;
+   unsigned flags = 0;
uint32_t reg_file;
-   enum brw_reg_type type =
-  brw_hw_3src_type_to_reg_type(devinfo,
-   brw_inst_3src_a16_dst_hw_type(devinfo, inst),
-   0);
-   unsigned dst_subreg_nr =
-  brw_inst_3src_a16_dst_subreg_nr(devinfo, inst) * 4 /
-  brw_reg_type_to_size(type);
-
-   if (devinfo->gen == 6 && brw_inst_3src_a16_dst_reg_file(devinfo, inst))
+   unsigned subreg_nr;
+   unsigned hw_type;
+   enum brw_reg_type type;
+
+   if (is_align1 && brw_inst_3src_a1_dst_reg_file(devinfo, inst))
+  reg_file = BRW_ARCHITECTURE_REGISTER_FILE;
+   else if (devinfo->gen == 6 && brw_inst_3src_a16_dst_reg_file(devinfo, inst))
   reg_file = BRW_MESSAGE_REGISTER_FILE;
else
   reg_file = BRW_GENERAL_REGISTER_FILE;
@@ -780,13 +774,32 @@ dest_3src(FILE *file, const struct gen_device_info *devinfo, const brw_inst *ins
err |= reg(file, reg_file, brw_inst_3src_dst_reg_nr(devinfo, inst));
if (err == -1)
   return 0;
-   if (dst_subreg_nr)
-  format(file, ".%u", dst_subreg_nr);
+
+   if (is_align1) {
+  flags |= IS_ALIGN1;
+
+  if (brw_inst_3src_a1_exec_type(devinfo, inst) ==
+  BRW_ALIGN1_3SRC_EXEC_TYPE_INT)
+ flags |= IS_INTEGER;
+
+  hw_type = brw_inst_3src_a1_dst_hw_type(devinfo, inst);
+  subreg_nr = brw_inst_3src_a1_dst_subreg_nr(devinfo, inst);
+   } else {
+  hw_type = brw_inst_3src_a16_dst_hw_type(devinfo, inst);
+  subreg_nr = brw_inst_3src_a16_dst_subreg_nr(devinfo, inst) * 4;
+   }
+   type = brw_hw_3src_type_to_reg_type(devinfo, hw_type, flags);
+   subreg_nr /= brw_reg_type_to_size(type);
+
+   if (subreg_nr)
+  format(file, ".%u", subreg_nr);
string(file, "<1>");
-   err |= control(file, "writemask", writemask,
-  brw_inst_3src_a16_dst_writemask(devinfo, inst), NULL);
-   err |= control(file, "dest reg encoding", three_source_reg_encoding,
-  brw_inst_3src_a16_dst_hw_type(devinfo, inst), NULL);
+
+   if (!is_align1) {
+  err |= control(file, "writemask", writemask,
+ brw_inst_3src_a16_dst_writemask(devinfo, inst), NULL);
+   }
+   string(file, brw_reg_type_to_letters(type));
 
return 0;
 }
@@ -931,36 +944,169 @@ src_da16(FILE *file,
return err;
 }
 
+static enum brw_vertical_stride
+vstride_from_align1_3src_vstride(enum gen10_align1_3src_vertical_stride vstride)
+{
+   switch (vstride) {
+   case BRW_ALIGN1_3SRC_VERTICAL_STRIDE_0: return BRW_VERTICAL_STRIDE_0;
+   case BRW_ALIGN1_3SRC_VERTICAL_STRIDE_2: return BRW_VERTICAL_STRIDE_2;
+   case BRW_ALIGN1_3SRC_VERTICAL_STRIDE_4: return BRW_VERTICAL_STRIDE_4;
+   case BRW_ALIGN1_3SRC_VERTICAL_STRIDE_8: return BRW_VERTICAL_STRIDE_8;
+   default:
+  unreachable("not reached");
+   }
+}
+
+static enum brw_horizontal_stride
+hstride_from_align1_3src_hstride(enum gen10_align1_3src_src_horizontal_stride hstride)
+{
+   switch (hstride) {
+   case BRW_ALIGN1_3SRC_SRC_HORIZONTAL_STRIDE_0: return BRW_HORIZONTAL_STRIDE_0;
+   case BRW_ALIGN1_3SRC_SRC_HORIZONTAL_STRIDE_1: return BRW_HORIZONTAL_STRIDE_1;
+   case BRW_ALIGN1_3SRC_SRC_HORIZONTAL_STRIDE_2: return BRW_HORIZONTAL_STRIDE_2;
+   case BRW_ALIGN1_3SRC_SRC_HORIZONTAL_STRIDE_4: return BRW_HORIZONTAL_STRIDE_4;
+   default:
+  unreachable("not reached");
+   }
+}
+
+static enum brw_vertical_stride
+vstride_from_align1_3src_hstride(enum gen10_align1_3src_src_horizontal_stride hstride)
+{
+   switch (hstride) {
+   case BRW_ALIGN1_3SRC_SRC_HORIZONTAL_STRIDE_0: return BRW_VERTICAL_STRIDE_0;
+   case BRW_ALIGN1_3SRC_SRC_HORIZONTAL_STRIDE_1: return BRW_VERTICAL_STRIDE_1;
+   case BRW_ALIGN1_3SRC_SRC_HORIZONTAL_STRIDE_2: return BRW_VERTICAL_STRIDE_2;
+   case 

[Mesa-dev] [PATCH 09/13] i965: Add align1 ternary instruction-word support

2017-08-25 Thread Matt Turner
---
 src/intel/compiler/brw_inst.h | 114 ++
 1 file changed, 114 insertions(+)

diff --git a/src/intel/compiler/brw_inst.h b/src/intel/compiler/brw_inst.h
index e6169057e3..b9c03fa88f 100644
--- a/src/intel/compiler/brw_inst.h
+++ b/src/intel/compiler/brw_inst.h
@@ -268,6 +268,120 @@ REG_TYPE(src)
 #undef REG_TYPE
 
 /**
+ * Three-source align1 instructions:
+ *  @{
+ */
+/* Reserved 127:126 */
+/* src2_reg_nr same in align16 */
+FC(3src_a1_src2_subreg_nr, 117, 113, devinfo->gen >= 10)
+FC(3src_a1_src2_hstride,   112, 111, devinfo->gen >= 10)
+/* Reserved 110:109. src2 vstride is an implied parameter */
+FC(3src_a1_src2_hw_type,   108, 106, devinfo->gen >= 10)
+/* Reserved 105 */
+/* src1_reg_nr same in align16 */
+FC(3src_a1_src1_subreg_nr,  96,  92, devinfo->gen >= 10)
+FC(3src_a1_src1_hstride,91,  90, devinfo->gen >= 10)
+FC(3src_a1_src1_vstride,89,  88, devinfo->gen >= 10)
+FC(3src_a1_src1_hw_type,87,  85, devinfo->gen >= 10)
+/* Reserved 84 */
+/* src0_reg_nr same in align16 */
+FC(3src_a1_src0_subreg_nr,  75,  71, devinfo->gen >= 10)
+FC(3src_a1_src0_hstride,70,  69, devinfo->gen >= 10)
+FC(3src_a1_src0_vstride,68,  67, devinfo->gen >= 10)
+FC(3src_a1_src0_hw_type,66,  64, devinfo->gen >= 10)
+/* dst_reg_nr same in align16 */
+FC(3src_a1_dst_subreg_nr,   55,  54, devinfo->gen >= 10)
+FC(3src_a1_special_acc, 55,  52, devinfo->gen >= 10) /* aliases dst_subreg_nr */
+/* Reserved 51:50 */
+FC(3src_a1_dst_hstride, 49,  49, devinfo->gen >= 10)
+FC(3src_a1_dst_hw_type, 48,  46, devinfo->gen >= 10)
+FC(3src_a1_src2_reg_file,   45,  45, devinfo->gen >= 10)
+FC(3src_a1_src1_reg_file,   44,  44, devinfo->gen >= 10)
+FC(3src_a1_src0_reg_file,   43,  43, devinfo->gen >= 10)
+/* Source Modifier fields same in align16 */
+FC(3src_a1_dst_reg_file,36,  36, devinfo->gen >= 10)
+FC(3src_a1_exec_type,   35,  35, devinfo->gen >= 10)
+/* Fields below this same in align16 */
+/** @} */
+
+#define REG_TYPE(reg) \
+static inline void\
+brw_inst_set_3src_a1_##reg##_type(const struct gen_device_info *devinfo,  \
+  brw_inst *inst, enum brw_reg_type type) \
+{ \
+   enum gen10_align1_3src_exec_type exec_type =   \
+  (enum gen10_align1_3src_exec_type) brw_inst_3src_a1_exec_type(devinfo,  \
+inst);\
+   unsigned flags = IS_ALIGN1;\
+   if (brw_reg_type_is_floating_point(type)) {\
+  assert(exec_type == BRW_ALIGN1_3SRC_EXEC_TYPE_FLOAT);   \
+   } else {   \
+  assert(exec_type == BRW_ALIGN1_3SRC_EXEC_TYPE_INT); \
+  flags |= IS_INTEGER;\
+   }  \
+   unsigned hw_type = brw_reg_type_to_hw_3src_type(devinfo, type, flags); \
+   brw_inst_set_3src_a1_##reg##_hw_type(devinfo, inst, hw_type);  \
+} \
+  \
+static inline enum brw_reg_type   \
+brw_inst_3src_a1_##reg##_type(const struct gen_device_info *devinfo,  \
+  const brw_inst *inst)   \
+{ \
+   enum gen10_align1_3src_exec_type exec_type =   \
+  (enum gen10_align1_3src_exec_type) brw_inst_3src_a1_exec_type(devinfo,  \
+inst);\
+   unsigned flags = IS_ALIGN1;\
+   if (exec_type == BRW_ALIGN1_3SRC_EXEC_TYPE_INT) {  \
+  flags |= IS_INTEGER;\
+   }  \
+   unsigned hw_type = brw_inst_3src_a1_##reg##_hw_type(devinfo, inst);\
+   return brw_hw_3src_type_to_reg_type(devinfo, hw_type, flags);  \
+}
+
+REG_TYPE(dst)
+REG_TYPE(src0)
+REG_TYPE(src1)
+REG_TYPE(src2)
+#undef REG_TYPE
+
+/**
+ * Three-source align1 instruction immediates:
+ *  @{
+ */
+static inline uint16_t
+brw_inst_3src_a1_src0_imm(const struct gen_device_info *devinfo,
+   const brw_inst *insn)
+{
+   assert(devinfo->gen >= 10);
+   return brw_inst_bits(insn, 82, 67);
+}
+
+static inline uint16_t
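
The accessors in this fragment all come down to reading or writing a bit range (82:67 for the src0 immediate, for instance) in the 128-bit instruction word. Below is a rough sketch of how such a range can be extracted and stored. The names `sketch_inst`, `sketch_bits` and `sketch_set_bits` are hypothetical and are not Mesa's `brw_inst_bits()` implementation. The sketch also assumes a range never straddles the two 64-bit halves.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a 128-bit instruction word: two 64-bit halves,
 * data[0] holding bits 63:0 and data[1] holding bits 127:64. */
typedef struct {
   uint64_t data[2];
} sketch_inst;

/* Return bits high:low (inclusive) of the instruction. */
static inline uint64_t
sketch_bits(const sketch_inst *inst, unsigned high, unsigned low)
{
   assert(high >= low && high < 128);
   assert(high / 64 == low / 64);   /* range must stay within one half */

   const uint64_t word = inst->data[high / 64];
   const unsigned width = high - low + 1;
   const uint64_t mask = width == 64 ? ~0ull : (1ull << width) - 1;

   return (word >> (low % 64)) & mask;
}

/* Store value into bits high:low (inclusive) of the instruction. */
static inline void
sketch_set_bits(sketch_inst *inst, unsigned high, unsigned low, uint64_t value)
{
   assert(high >= low && high < 128);
   assert(high / 64 == low / 64);

   const unsigned width = high - low + 1;
   const uint64_t mask = width == 64 ? ~0ull : (1ull << width) - 1;
   assert((value & ~mask) == 0);    /* value must fit in the field */

   uint64_t *word = &inst->data[high / 64];
   *word = (*word & ~(mask << (low % 64))) | (value << (low % 64));
}
```

With these helpers, a 16-bit immediate stored at 82:67 lands entirely in the upper half and leaves neighbouring fields such as 66:64 untouched.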

[Mesa-dev] [PATCH 12/13] i965/fs: Use align1 mode on ternary instructions on Gen10+

2017-08-25 Thread Matt Turner
Align1 mode offers some nice features over align16, like access to more
data types and the ability to use a 16-bit immediate. This patch does
not start using any new features. It just emits ternary instructions in
align1 mode.
---
 src/intel/compiler/brw_fs_generator.cpp | 12 
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/src/intel/compiler/brw_fs_generator.cpp 
b/src/intel/compiler/brw_fs_generator.cpp
index afaec5c949..03ee26ccd4 100644
--- a/src/intel/compiler/brw_fs_generator.cpp
+++ b/src/intel/compiler/brw_fs_generator.cpp
@@ -1728,14 +1728,16 @@ fs_generator::generate_code(const cfg_t *cfg, int 
dispatch_width)
 
   case BRW_OPCODE_MAD:
  assert(devinfo->gen >= 6);
-brw_set_default_access_mode(p, BRW_ALIGN_16);
+ if (devinfo->gen < 10)
+brw_set_default_access_mode(p, BRW_ALIGN_16);
  brw_MAD(p, dst, src[0], src[1], src[2]);
 break;
 
   case BRW_OPCODE_LRP:
  assert(devinfo->gen >= 6);
-brw_set_default_access_mode(p, BRW_ALIGN_16);
+ if (devinfo->gen < 10)
+brw_set_default_access_mode(p, BRW_ALIGN_16);
  brw_LRP(p, dst, src[0], src[1], src[2]);
 break;
 
   case BRW_OPCODE_FRC:
@@ -1834,7 +1836,8 @@ fs_generator::generate_code(const cfg_t *cfg, int 
dispatch_width)
 
   case BRW_OPCODE_BFE:
  assert(devinfo->gen >= 7);
- brw_set_default_access_mode(p, BRW_ALIGN_16);
+ if (devinfo->gen < 10)
+brw_set_default_access_mode(p, BRW_ALIGN_16);
  brw_BFE(p, dst, src[0], src[1], src[2]);
  break;
 
@@ -1844,7 +1847,8 @@ fs_generator::generate_code(const cfg_t *cfg, int 
dispatch_width)
  break;
   case BRW_OPCODE_BFI2:
  assert(devinfo->gen >= 7);
- brw_set_default_access_mode(p, BRW_ALIGN_16);
+ if (devinfo->gen < 10)
+brw_set_default_access_mode(p, BRW_ALIGN_16);
  brw_BFI2(p, dst, src[0], src[1], src[2]);
  break;
 
-- 
2.13.5
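
For readers skimming the hunks above, the rule the patch applies to each ternary opcode can be condensed into one small function. This is only an illustrative sketch: `sketch_access_mode` and `sketch_ternary_access_mode` are made-up names, and the real generator sets the mode through `brw_set_default_access_mode()` rather than returning it.

```c
#include <assert.h>

/* Hypothetical stand-ins; the real driver uses BRW_ALIGN_1 / BRW_ALIGN_16. */
enum sketch_access_mode {
   SKETCH_ALIGN_1,
   SKETCH_ALIGN_16,
};

/* Pick the access mode for a ternary instruction on a given hardware
 * generation: align16 before Gen10, align1 from Gen10 onwards. */
static inline enum sketch_access_mode
sketch_ternary_access_mode(int gen)
{
   assert(gen >= 6);   /* the generator asserts gen >= 6 for MAD and LRP */
   return gen < 10 ? SKETCH_ALIGN_16 : SKETCH_ALIGN_1;
}
```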



[Mesa-dev] [PATCH 05/13] i965: Rename brw_inst's functions that access the 3src register type

2017-08-25 Thread Matt Turner
Put hw_ in the name so that it's clear these are the hardware encodings.

Similar to commit 9fb832332868 ("i965: Rename brw_inst's functions that
access the register type")
---
 src/intel/compiler/brw_disasm.c  | 16 
 src/intel/compiler/brw_eu_emit.c | 16 
 src/intel/compiler/brw_inst.h|  4 ++--
 3 files changed, 18 insertions(+), 18 deletions(-)

diff --git a/src/intel/compiler/brw_disasm.c b/src/intel/compiler/brw_disasm.c
index ade2c28336..aab4a65b7d 100644
--- a/src/intel/compiler/brw_disasm.c
+++ b/src/intel/compiler/brw_disasm.c
@@ -766,7 +766,7 @@ dest_3src(FILE *file, const struct gen_device_info 
*devinfo, const brw_inst *ins
uint32_t reg_file;
enum brw_reg_type type =
   brw_hw_3src_type_to_reg_type(devinfo,
-   brw_inst_3src_a16_dst_type(devinfo, inst));
+   brw_inst_3src_a16_dst_hw_type(devinfo, 
inst));
unsigned dst_subreg_nr =
   brw_inst_3src_a16_dst_subreg_nr(devinfo, inst) * 4 /
   brw_reg_type_to_size(type);
@@ -785,7 +785,7 @@ dest_3src(FILE *file, const struct gen_device_info 
*devinfo, const brw_inst *ins
err |= control(file, "writemask", writemask,
   brw_inst_3src_a16_dst_writemask(devinfo, inst), NULL);
err |= control(file, "dest reg encoding", three_source_reg_encoding,
-  brw_inst_3src_a16_dst_type(devinfo, inst), NULL);
+  brw_inst_3src_a16_dst_hw_type(devinfo, inst), NULL);
 
return 0;
 }
@@ -936,7 +936,7 @@ src0_3src(FILE *file, const struct gen_device_info 
*devinfo, const brw_inst *ins
int err = 0;
enum brw_reg_type type =
   brw_hw_3src_type_to_reg_type(devinfo,
-   brw_inst_3src_a16_src_type(devinfo, inst));
+   brw_inst_3src_a16_src_hw_type(devinfo, 
inst));
unsigned src0_subreg_nr =
   brw_inst_3src_a16_src0_subreg_nr(devinfo, inst) * 4 /
   brw_reg_type_to_size(type);
@@ -958,7 +958,7 @@ src0_3src(FILE *file, const struct gen_device_info 
*devinfo, const brw_inst *ins
   err |= src_swizzle(file, brw_inst_3src_a16_src0_swizzle(devinfo, inst));
}
err |= control(file, "src da16 reg type", three_source_reg_encoding,
-  brw_inst_3src_a16_src_type(devinfo, inst), NULL);
+  brw_inst_3src_a16_src_hw_type(devinfo, inst), NULL);
return err;
 }
 
@@ -968,7 +968,7 @@ src1_3src(FILE *file, const struct gen_device_info 
*devinfo, const brw_inst *ins
int err = 0;
enum brw_reg_type type =
   brw_hw_3src_type_to_reg_type(devinfo,
-   brw_inst_3src_a16_src_type(devinfo, inst));
+   brw_inst_3src_a16_src_hw_type(devinfo, 
inst));
unsigned src1_subreg_nr =
   brw_inst_3src_a16_src1_subreg_nr(devinfo, inst) * 4 /
   brw_reg_type_to_size(type);
@@ -990,7 +990,7 @@ src1_3src(FILE *file, const struct gen_device_info 
*devinfo, const brw_inst *ins
   err |= src_swizzle(file, brw_inst_3src_a16_src1_swizzle(devinfo, inst));
}
err |= control(file, "src da16 reg type", three_source_reg_encoding,
-  brw_inst_3src_a16_src_type(devinfo, inst), NULL);
+  brw_inst_3src_a16_src_hw_type(devinfo, inst), NULL);
return err;
 }
 
@@ -1001,7 +1001,7 @@ src2_3src(FILE *file, const struct gen_device_info 
*devinfo, const brw_inst *ins
int err = 0;
enum brw_reg_type type =
   brw_hw_3src_type_to_reg_type(devinfo,
-   brw_inst_3src_a16_src_type(devinfo, inst));
+   brw_inst_3src_a16_src_hw_type(devinfo, 
inst));
unsigned src2_subreg_nr =
   brw_inst_3src_a16_src2_subreg_nr(devinfo, inst) * 4 /
   brw_reg_type_to_size(type);
@@ -1023,7 +1023,7 @@ src2_3src(FILE *file, const struct gen_device_info 
*devinfo, const brw_inst *ins
   err |= src_swizzle(file, brw_inst_3src_a16_src2_swizzle(devinfo, inst));
}
err |= control(file, "src da16 reg type", three_source_reg_encoding,
-  brw_inst_3src_a16_src_type(devinfo, inst), NULL);
+  brw_inst_3src_a16_src_hw_type(devinfo, inst), NULL);
return err;
 }
 
diff --git a/src/intel/compiler/brw_eu_emit.c b/src/intel/compiler/brw_eu_emit.c
index 717824c0c3..e4fcbe908d 100644
--- a/src/intel/compiler/brw_eu_emit.c
+++ b/src/intel/compiler/brw_eu_emit.c
@@ -816,20 +816,20 @@ brw_alu3(struct brw_codegen *p, unsigned opcode, struct 
brw_reg dest,
*/
   switch (dest.type) {
   case BRW_REGISTER_TYPE_F:
- brw_inst_set_3src_a16_src_type(devinfo, inst, BRW_3SRC_TYPE_F);
- brw_inst_set_3src_a16_dst_type(devinfo, inst, BRW_3SRC_TYPE_F);
+ brw_inst_set_3src_a16_src_hw_type(devinfo, inst, BRW_3SRC_TYPE_F);
+ brw_inst_set_3src_a16_dst_hw_type(devinfo, inst, BRW_3SRC_TYPE_F);
  break;
   case BRW_REGISTER_TYPE_DF:
- 

[Mesa-dev] [PATCH 02/13] i965: Add functions for brw_reg_type <-> hw 3src type

2017-08-25 Thread Matt Turner
---
 src/intel/compiler/brw_reg_type.c | 50 +++
 src/intel/compiler/brw_reg_type.h |  8 +++
 2 files changed, 58 insertions(+)

diff --git a/src/intel/compiler/brw_reg_type.c 
b/src/intel/compiler/brw_reg_type.c
index a0f674f0d7..d65ebaee48 100644
--- a/src/intel/compiler/brw_reg_type.c
+++ b/src/intel/compiler/brw_reg_type.c
@@ -79,6 +79,27 @@ static const struct {
[BRW_REGISTER_TYPE_UV] = { INVALID, BRW_HW_IMM_TYPE_UV  },
 };
 
+/* SNB adds 3-src instructions (MAD and LRP) that only operate on floats, so
+ * the types were implied. IVB adds BFE and BFI2 that operate on doublewords
+ * and unsigned doublewords, so a new field is also available in the da3src
+ * struct (part of struct brw_instruction.bits1 in brw_structs.h) to select
+ * dst and shared-src types.
+ */
+enum hw_3src_reg_type {
+   GEN7_3SRC_TYPE_F  = 0,
+   GEN7_3SRC_TYPE_D  = 1,
+   GEN7_3SRC_TYPE_UD = 2,
+   GEN7_3SRC_TYPE_DF = 3,
+};
+
+static const enum hw_3src_reg_type gen7_3src_type[] = {
+   [0 ... BRW_REGISTER_TYPE_LAST] = INVALID,
+   [BRW_REGISTER_TYPE_F]  = GEN7_3SRC_TYPE_F,
+   [BRW_REGISTER_TYPE_D]  = GEN7_3SRC_TYPE_D,
+   [BRW_REGISTER_TYPE_UD] = GEN7_3SRC_TYPE_UD,
+   [BRW_REGISTER_TYPE_DF] = GEN7_3SRC_TYPE_DF,
+};
+
 /**
  * Convert a brw_reg_type enumeration value into the hardware representation.
  *
@@ -126,6 +147,35 @@ brw_hw_type_to_reg_type(const struct gen_device_info 
*devinfo,
 }
 
 /**
+ * Convert a brw_reg_type enumeration value into the hardware representation
+ * for a 3-src instruction
+ */
+unsigned
+brw_reg_type_to_hw_3src_type(const struct gen_device_info *devinfo,
+ enum brw_reg_type type)
+{
+   assert(type < ARRAY_SIZE(gen7_3src_type));
+   assert(gen7_3src_type[type] != -1);
+   return gen7_3src_type[type];
+}
+
+/**
+ * Convert the hardware representation for a 3-src instruction into a
+ * brw_reg_type enumeration value.
+ */
+enum brw_reg_type
+brw_hw_3src_type_to_reg_type(const struct gen_device_info *devinfo,
+ unsigned hw_type)
+{
+   for (enum brw_reg_type i = 0; i <= BRW_REGISTER_TYPE_LAST; i++) {
+  if (gen7_3src_type[i] == hw_type) {
+ return i;
+  }
+   }
+   unreachable("not reached");
+}
+
+/**
  * Return the element size given a register type.
  */
 unsigned
diff --git a/src/intel/compiler/brw_reg_type.h 
b/src/intel/compiler/brw_reg_type.h
index 0b40906d92..ed249d77e6 100644
--- a/src/intel/compiler/brw_reg_type.h
+++ b/src/intel/compiler/brw_reg_type.h
@@ -89,6 +89,14 @@ brw_hw_type_to_reg_type(const struct gen_device_info 
*devinfo,
 enum brw_reg_file file, unsigned hw_type);
 
 unsigned
+brw_reg_type_to_hw_3src_type(const struct gen_device_info *devinfo,
+ enum brw_reg_type type);
+
+enum brw_reg_type
+brw_hw_3src_type_to_reg_type(const struct gen_device_info *devinfo,
+ unsigned hw_type);
+
+unsigned
 brw_reg_type_to_size(enum brw_reg_type type);
 
 const char *
-- 
2.13.5
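
The two functions added here follow a common pattern: a designated-initializer table indexed by the software enum for the forward direction, and a linear scan of the same table for the reverse direction. A minimal standalone sketch of that pattern follows. The enum is a hypothetical, shortened one, and where the patch defaults every slot to INVALID with a GNU range designator, the sketch spells out the invalid slot by hand so that it stays plain C99.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_INVALID (-1)

/* Hypothetical, shortened register-type enum for illustration only. */
enum sketch_reg_type {
   SKETCH_TYPE_DF,
   SKETCH_TYPE_F,
   SKETCH_TYPE_HF,
   SKETCH_TYPE_D,
   SKETCH_TYPE_UD,
   SKETCH_TYPE_LAST = SKETCH_TYPE_UD
};

/* Encodings copied from the GEN7_3SRC_TYPE_* values in the patch. */
static const int sketch_3src_type[SKETCH_TYPE_LAST + 1] = {
   [SKETCH_TYPE_F]  = 0,
   [SKETCH_TYPE_D]  = 1,
   [SKETCH_TYPE_UD] = 2,
   [SKETCH_TYPE_DF] = 3,
   [SKETCH_TYPE_HF] = SKETCH_INVALID,   /* no 3-src encoding in this table */
};

/* Forward lookup: software type -> hardware encoding, or SKETCH_INVALID. */
static inline int
sketch_type_to_hw(enum sketch_reg_type type)
{
   assert((size_t)type <
          sizeof(sketch_3src_type) / sizeof(sketch_3src_type[0]));
   return sketch_3src_type[type];
}

/* Reverse lookup: linear scan, acceptable for a table this small. */
static inline int
sketch_hw_to_type(int hw_type)
{
   if (hw_type < 0)
      return SKETCH_INVALID;   /* never match the INVALID marker itself */

   for (int i = 0; i <= SKETCH_TYPE_LAST; i++) {
      if (sketch_3src_type[i] == hw_type)
         return i;
   }
   return SKETCH_INVALID;
}
```

Keeping a single table for both directions means a new type only has to be added in one place, at the cost of an O(n) reverse lookup.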



[Mesa-dev] [PATCH 11/13] i965: Add align1 ternary instruction emission support

2017-08-25 Thread Matt Turner
---
 src/intel/compiler/brw_eu_emit.c | 196 ---
 1 file changed, 143 insertions(+), 53 deletions(-)

diff --git a/src/intel/compiler/brw_eu_emit.c b/src/intel/compiler/brw_eu_emit.c
index f1a2283de8..7f3980f83e 100644
--- a/src/intel/compiler/brw_eu_emit.c
+++ b/src/intel/compiler/brw_eu_emit.c
@@ -758,64 +758,154 @@ brw_alu3(struct brw_codegen *p, unsigned opcode, struct 
brw_reg dest,
 
 gen7_convert_mrf_to_grf(p, &dest);
 
-   assert(brw_inst_access_mode(devinfo, inst) == BRW_ALIGN_16);
-
-   assert(dest.file == BRW_GENERAL_REGISTER_FILE ||
- dest.file == BRW_MESSAGE_REGISTER_FILE);
assert(dest.nr < 128);
+   assert(src0.nr < 128);
+   assert(src1.nr < 128);
+   assert(src2.nr < 128);
assert(dest.address_mode == BRW_ADDRESS_DIRECT);
-   assert(dest.type == BRW_REGISTER_TYPE_F  ||
-  dest.type == BRW_REGISTER_TYPE_DF ||
-  dest.type == BRW_REGISTER_TYPE_D  ||
-  dest.type == BRW_REGISTER_TYPE_UD);
-   if (devinfo->gen == 6) {
-  brw_inst_set_3src_a16_dst_reg_file(devinfo, inst,
- dest.file == 
BRW_MESSAGE_REGISTER_FILE);
-   }
-   brw_inst_set_3src_dst_reg_nr(devinfo, inst, dest.nr);
-   brw_inst_set_3src_a16_dst_subreg_nr(devinfo, inst, dest.subnr / 16);
-   brw_inst_set_3src_a16_dst_writemask(devinfo, inst, dest.writemask);
-
-   assert(src0.file == BRW_GENERAL_REGISTER_FILE);
assert(src0.address_mode == BRW_ADDRESS_DIRECT);
-   assert(src0.nr < 128);
-   brw_inst_set_3src_a16_src0_swizzle(devinfo, inst, src0.swizzle);
-   brw_inst_set_3src_a16_src0_subreg_nr(devinfo, inst, 
get_3src_subreg_nr(src0));
-   brw_inst_set_3src_src0_reg_nr(devinfo, inst, src0.nr);
-   brw_inst_set_3src_src0_abs(devinfo, inst, src0.abs);
-   brw_inst_set_3src_src0_negate(devinfo, inst, src0.negate);
-   brw_inst_set_3src_a16_src0_rep_ctrl(devinfo, inst,
-   src0.vstride == BRW_VERTICAL_STRIDE_0);
-
-   assert(src1.file == BRW_GENERAL_REGISTER_FILE);
assert(src1.address_mode == BRW_ADDRESS_DIRECT);
-   assert(src1.nr < 128);
-   brw_inst_set_3src_src1_reg_nr(devinfo, inst, src1.nr);
-   brw_inst_set_3src_src1_abs(devinfo, inst, src1.abs);
-   brw_inst_set_3src_src1_negate(devinfo, inst, src1.negate);
-   brw_inst_set_3src_a16_src1_rep_ctrl(devinfo, inst,
-   src1.vstride == BRW_VERTICAL_STRIDE_0);
-
-   assert(src2.file == BRW_GENERAL_REGISTER_FILE);
assert(src2.address_mode == BRW_ADDRESS_DIRECT);
-   assert(src2.nr < 128);
-   brw_inst_set_3src_a16_src2_swizzle(devinfo, inst, src2.swizzle);
-   brw_inst_set_3src_a16_src2_subreg_nr(devinfo, inst, 
get_3src_subreg_nr(src2));
-   brw_inst_set_3src_src2_reg_nr(devinfo, inst, src2.nr);
-   brw_inst_set_3src_src2_abs(devinfo, inst, src2.abs);
-   brw_inst_set_3src_src2_negate(devinfo, inst, src2.negate);
-   brw_inst_set_3src_a16_src2_rep_ctrl(devinfo, inst,
-   src2.vstride == BRW_VERTICAL_STRIDE_0);
-
-   if (devinfo->gen >= 7) {
-  /* Set both the source and destination types based on dest.type,
-   * ignoring the source register types.  The MAD and LRP emitters ensure
-   * that all four types are float.  The BFE and BFI2 emitters, however,
-   * may send us mixed D and UD types and want us to ignore that and use
-   * the destination type.
-   */
-  brw_inst_set_3src_a16_src_type(devinfo, inst, dest.type);
-  brw_inst_set_3src_a16_dst_type(devinfo, inst, dest.type);
+
+   if (brw_inst_access_mode(devinfo, inst) == BRW_ALIGN_1) {
+  unsigned flags = IS_ALIGN1;
+
+  assert(dest.file == BRW_GENERAL_REGISTER_FILE ||
+ dest.file == BRW_ARCHITECTURE_REGISTER_FILE);
+
+  if (dest.file == BRW_ARCHITECTURE_REGISTER_FILE) {
+ brw_inst_set_3src_a1_dst_reg_file(devinfo, inst,
+   BRW_ALIGN1_3SRC_ACCUMULATOR);
+ brw_inst_set_3src_dst_reg_nr(devinfo, inst, BRW_ARF_ACCUMULATOR);
+  } else {
+ brw_inst_set_3src_a1_dst_reg_file(devinfo, inst,
+   
BRW_ALIGN1_3SRC_GENERAL_REGISTER_FILE);
+ brw_inst_set_3src_dst_reg_nr(devinfo, inst, dest.nr);
+  }
+  brw_inst_set_3src_a1_dst_subreg_nr(devinfo, inst, dest.subnr / 8);
+
+  brw_inst_set_3src_a1_dst_hstride(devinfo, inst, 
BRW_ALIGN1_3SRC_DST_HORIZONTAL_STRIDE_1);
+
+  if (brw_reg_type_is_floating_point(dest.type)) {
+ brw_inst_set_3src_a1_exec_type(devinfo, inst,
+BRW_ALIGN1_3SRC_EXEC_TYPE_FLOAT);
+  } else {
+ flags |= IS_INTEGER;
+ brw_inst_set_3src_a1_exec_type(devinfo, inst,
+BRW_ALIGN1_3SRC_EXEC_TYPE_INT);
+  }
+
+  brw_inst_set_3src_a1_dst_type(devinfo, inst, dest.type);
+  brw_inst_set_3src_a1_src0_type(devinfo, inst, src0.type);
+  brw_inst_set_3src_a1_src1_type(devinfo, inst, src1.type);
+  

[Mesa-dev] [PATCH 00/13] i965: Align1 ternary instruction support for CNL

2017-08-25 Thread Matt Turner
Cannonlake (Gen10) adds align1 access mode to ternary instructions. In align1
mode, instructions can use more (and mixed) datatypes and a single 16-bit
immediate value. This series adds the infrastructure to emit and disassemble
such instructions. Patch 12 switches ternary instructions to align1 mode, but
does not begin using any of the new features. I'm not sure if that's worth
committing on its own.

i965: Move brw_reg_type_is_floating_point to brw_reg_type.h
i965: Add functions for brw_reg_type <-> hw 3src type
i965: Print subreg in units of type-size on ternary instructions
i965: Rename brw_inst 3src functions in preparation for align1
i965: Rename brw_inst's functions that access the 3src register type
i965: Add functions to abstract access to 3src register types
i965: Add align1 ternary instruction field encodings
i965: Add align1 ternary instruction support to conversion functions
i965: Add align1 ternary instruction-word support
i965: Add align1 ternary instruction disassembler support
i965: Add align1 ternary instruction emission support
i965/fs: Use align1 mode on ternary instructions on Gen10+
i965/fs: Don't apply POW/FDIV workaround on Gen10+


[Mesa-dev] [PATCH 03/13] i965: Print subreg in units of type-size on ternary instructions

2017-08-25 Thread Matt Turner
The instruction word contains SubRegNum[4:2] so it's in units of dwords
(hence the * 4 to get it in terms of bytes). Before this patch, the
subreg would have been wrong for DF arguments.
---
 src/intel/compiler/brw_disasm.c | 31 ++-
 1 file changed, 26 insertions(+), 5 deletions(-)

diff --git a/src/intel/compiler/brw_disasm.c b/src/intel/compiler/brw_disasm.c
index e2675b5f4c..188c7c53d0 100644
--- a/src/intel/compiler/brw_disasm.c
+++ b/src/intel/compiler/brw_disasm.c
@@ -764,6 +764,12 @@ dest_3src(FILE *file, const struct gen_device_info 
*devinfo, const brw_inst *ins
 {
int err = 0;
uint32_t reg_file;
+   enum brw_reg_type type =
+  brw_hw_3src_type_to_reg_type(devinfo,
+   brw_inst_3src_dst_type(devinfo, inst));
+   unsigned dst_subreg_nr =
+  brw_inst_3src_dst_subreg_nr(devinfo, inst) * 4 /
+  brw_reg_type_to_size(type);
 
if (devinfo->gen == 6 && brw_inst_3src_dst_reg_file(devinfo, inst))
   reg_file = BRW_MESSAGE_REGISTER_FILE;
@@ -773,8 +779,8 @@ dest_3src(FILE *file, const struct gen_device_info 
*devinfo, const brw_inst *ins
err |= reg(file, reg_file, brw_inst_3src_dst_reg_nr(devinfo, inst));
if (err == -1)
   return 0;
-   if (brw_inst_3src_dst_subreg_nr(devinfo, inst))
-  format(file, ".%"PRIu64, brw_inst_3src_dst_subreg_nr(devinfo, inst));
+   if (dst_subreg_nr)
+  format(file, ".%u", dst_subreg_nr);
string(file, "<1>");
err |= control(file, "writemask", writemask,
   brw_inst_3src_dst_writemask(devinfo, inst), NULL);
@@ -928,7 +934,12 @@ static int
 src0_3src(FILE *file, const struct gen_device_info *devinfo, const brw_inst 
*inst)
 {
int err = 0;
-   unsigned src0_subreg_nr = brw_inst_3src_src0_subreg_nr(devinfo, inst);
+   enum brw_reg_type type =
+  brw_hw_3src_type_to_reg_type(devinfo,
+   brw_inst_3src_src_type(devinfo, inst));
+   unsigned src0_subreg_nr =
+  brw_inst_3src_src0_subreg_nr(devinfo, inst) * 4 /
+  brw_reg_type_to_size(type);
 
err |= control(file, "negate", m_negate,
   brw_inst_3src_src0_negate(devinfo, inst), NULL);
@@ -955,7 +966,12 @@ static int
 src1_3src(FILE *file, const struct gen_device_info *devinfo, const brw_inst 
*inst)
 {
int err = 0;
-   unsigned src1_subreg_nr = brw_inst_3src_src1_subreg_nr(devinfo, inst);
+   enum brw_reg_type type =
+  brw_hw_3src_type_to_reg_type(devinfo,
+   brw_inst_3src_src_type(devinfo, inst));
+   unsigned src1_subreg_nr =
+  brw_inst_3src_src1_subreg_nr(devinfo, inst) * 4 /
+  brw_reg_type_to_size(type);
 
err |= control(file, "negate", m_negate,
   brw_inst_3src_src1_negate(devinfo, inst), NULL);
@@ -983,7 +999,12 @@ static int
 src2_3src(FILE *file, const struct gen_device_info *devinfo, const brw_inst 
*inst)
 {
int err = 0;
-   unsigned src2_subreg_nr = brw_inst_3src_src2_subreg_nr(devinfo, inst);
+   enum brw_reg_type type =
+  brw_hw_3src_type_to_reg_type(devinfo,
+   brw_inst_3src_src_type(devinfo, inst));
+   unsigned src2_subreg_nr =
+  brw_inst_3src_src2_subreg_nr(devinfo, inst) * 4 /
+  brw_reg_type_to_size(type);
 
err |= control(file, "negate", m_negate,
   brw_inst_3src_src2_negate(devinfo, inst), NULL);
-- 
2.13.5
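
The arithmetic behind the `* 4 / brw_reg_type_to_size(type)` expressions above can be checked in isolation. The sketch below uses a hypothetical helper name and takes the type size in bytes as a plain parameter instead of calling into the driver.

```c
#include <assert.h>

/* Convert an align16 3-src SubRegNum field, which is stored in dword units
 * (SubRegNum[4:2]), into units of the operand's type size.
 * type_size is in bytes. */
static inline unsigned
sketch_subreg_in_type_units(unsigned encoded_subreg, unsigned type_size)
{
   assert(type_size > 0);
   const unsigned byte_offset = encoded_subreg * 4;   /* dwords -> bytes */
   return byte_offset / type_size;                    /* bytes -> elements */
}
```

For a 4-byte type the encoded value passes through unchanged, while for an 8-byte DF operand an encoded value of 2 (byte offset 8) is printed as element 1, which is the case the commit message says was wrong before.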



[Mesa-dev] [PATCH 04/13] i965: Rename brw_inst 3src functions in preparation for align1

2017-08-25 Thread Matt Turner
---
 src/intel/compiler/brw_disasm.c | 46 +++
 src/intel/compiler/brw_eu_compact.c | 30 -
 src/intel/compiler/brw_eu_emit.c| 46 +++
 src/intel/compiler/brw_inst.h   | 54 ++---
 4 files changed, 90 insertions(+), 86 deletions(-)

diff --git a/src/intel/compiler/brw_disasm.c b/src/intel/compiler/brw_disasm.c
index 188c7c53d0..ade2c28336 100644
--- a/src/intel/compiler/brw_disasm.c
+++ b/src/intel/compiler/brw_disasm.c
@@ -766,12 +766,12 @@ dest_3src(FILE *file, const struct gen_device_info 
*devinfo, const brw_inst *ins
uint32_t reg_file;
enum brw_reg_type type =
   brw_hw_3src_type_to_reg_type(devinfo,
-   brw_inst_3src_dst_type(devinfo, inst));
+   brw_inst_3src_a16_dst_type(devinfo, inst));
unsigned dst_subreg_nr =
-  brw_inst_3src_dst_subreg_nr(devinfo, inst) * 4 /
+  brw_inst_3src_a16_dst_subreg_nr(devinfo, inst) * 4 /
   brw_reg_type_to_size(type);
 
-   if (devinfo->gen == 6 && brw_inst_3src_dst_reg_file(devinfo, inst))
+   if (devinfo->gen == 6 && brw_inst_3src_a16_dst_reg_file(devinfo, inst))
   reg_file = BRW_MESSAGE_REGISTER_FILE;
else
   reg_file = BRW_GENERAL_REGISTER_FILE;
@@ -783,9 +783,9 @@ dest_3src(FILE *file, const struct gen_device_info 
*devinfo, const brw_inst *ins
   format(file, ".%u", dst_subreg_nr);
string(file, "<1>");
err |= control(file, "writemask", writemask,
-  brw_inst_3src_dst_writemask(devinfo, inst), NULL);
+  brw_inst_3src_a16_dst_writemask(devinfo, inst), NULL);
err |= control(file, "dest reg encoding", three_source_reg_encoding,
-  brw_inst_3src_dst_type(devinfo, inst), NULL);
+  brw_inst_3src_a16_dst_type(devinfo, inst), NULL);
 
return 0;
 }
@@ -936,9 +936,9 @@ src0_3src(FILE *file, const struct gen_device_info 
*devinfo, const brw_inst *ins
int err = 0;
enum brw_reg_type type =
   brw_hw_3src_type_to_reg_type(devinfo,
-   brw_inst_3src_src_type(devinfo, inst));
+   brw_inst_3src_a16_src_type(devinfo, inst));
unsigned src0_subreg_nr =
-  brw_inst_3src_src0_subreg_nr(devinfo, inst) * 4 /
+  brw_inst_3src_a16_src0_subreg_nr(devinfo, inst) * 4 /
   brw_reg_type_to_size(type);
 
err |= control(file, "negate", m_negate,
@@ -949,16 +949,16 @@ src0_3src(FILE *file, const struct gen_device_info 
*devinfo, const brw_inst *ins
   brw_inst_3src_src0_reg_nr(devinfo, inst));
if (err == -1)
   return 0;
-   if (src0_subreg_nr || brw_inst_3src_src0_rep_ctrl(devinfo, inst))
+   if (src0_subreg_nr || brw_inst_3src_a16_src0_rep_ctrl(devinfo, inst))
   format(file, ".%d", src0_subreg_nr);
-   if (brw_inst_3src_src0_rep_ctrl(devinfo, inst))
+   if (brw_inst_3src_a16_src0_rep_ctrl(devinfo, inst))
   string(file, "<0,1,0>");
else {
   string(file, "<4,4,1>");
-  err |= src_swizzle(file, brw_inst_3src_src0_swizzle(devinfo, inst));
+  err |= src_swizzle(file, brw_inst_3src_a16_src0_swizzle(devinfo, inst));
}
err |= control(file, "src da16 reg type", three_source_reg_encoding,
-  brw_inst_3src_src_type(devinfo, inst), NULL);
+  brw_inst_3src_a16_src_type(devinfo, inst), NULL);
return err;
 }
 
@@ -968,9 +968,9 @@ src1_3src(FILE *file, const struct gen_device_info 
*devinfo, const brw_inst *ins
int err = 0;
enum brw_reg_type type =
   brw_hw_3src_type_to_reg_type(devinfo,
-   brw_inst_3src_src_type(devinfo, inst));
+   brw_inst_3src_a16_src_type(devinfo, inst));
unsigned src1_subreg_nr =
-  brw_inst_3src_src1_subreg_nr(devinfo, inst) * 4 /
+  brw_inst_3src_a16_src1_subreg_nr(devinfo, inst) * 4 /
   brw_reg_type_to_size(type);
 
err |= control(file, "negate", m_negate,
@@ -981,16 +981,16 @@ src1_3src(FILE *file, const struct gen_device_info 
*devinfo, const brw_inst *ins
   brw_inst_3src_src1_reg_nr(devinfo, inst));
if (err == -1)
   return 0;
-   if (src1_subreg_nr || brw_inst_3src_src1_rep_ctrl(devinfo, inst))
+   if (src1_subreg_nr || brw_inst_3src_a16_src1_rep_ctrl(devinfo, inst))
   format(file, ".%d", src1_subreg_nr);
-   if (brw_inst_3src_src1_rep_ctrl(devinfo, inst))
+   if (brw_inst_3src_a16_src1_rep_ctrl(devinfo, inst))
   string(file, "<0,1,0>");
else {
   string(file, "<4,4,1>");
-  err |= src_swizzle(file, brw_inst_3src_src1_swizzle(devinfo, inst));
+  err |= src_swizzle(file, brw_inst_3src_a16_src1_swizzle(devinfo, inst));
}
err |= control(file, "src da16 reg type", three_source_reg_encoding,
-  brw_inst_3src_src_type(devinfo, inst), NULL);
+  brw_inst_3src_a16_src_type(devinfo, inst), NULL);
return err;
 }
 
@@ 

[Mesa-dev] [PATCH 08/13] i965: Add align1 ternary instruction support to conversion functions

2017-08-25 Thread Matt Turner
---
 src/intel/compiler/brw_disasm.c   | 12 ---
 src/intel/compiler/brw_inst.h |  4 +--
 src/intel/compiler/brw_reg_type.c | 76 ---
 src/intel/compiler/brw_reg_type.h |  7 ++--
 4 files changed, 79 insertions(+), 20 deletions(-)

diff --git a/src/intel/compiler/brw_disasm.c b/src/intel/compiler/brw_disasm.c
index aab4a65b7d..3726172e5d 100644
--- a/src/intel/compiler/brw_disasm.c
+++ b/src/intel/compiler/brw_disasm.c
@@ -766,7 +766,8 @@ dest_3src(FILE *file, const struct gen_device_info 
*devinfo, const brw_inst *ins
uint32_t reg_file;
enum brw_reg_type type =
   brw_hw_3src_type_to_reg_type(devinfo,
-   brw_inst_3src_a16_dst_hw_type(devinfo, 
inst));
+   brw_inst_3src_a16_dst_hw_type(devinfo, 
inst),
+   0);
unsigned dst_subreg_nr =
   brw_inst_3src_a16_dst_subreg_nr(devinfo, inst) * 4 /
   brw_reg_type_to_size(type);
@@ -936,7 +937,8 @@ src0_3src(FILE *file, const struct gen_device_info 
*devinfo, const brw_inst *ins
int err = 0;
enum brw_reg_type type =
   brw_hw_3src_type_to_reg_type(devinfo,
-   brw_inst_3src_a16_src_hw_type(devinfo, 
inst));
+   brw_inst_3src_a16_src_hw_type(devinfo, 
inst),
+   0);
unsigned src0_subreg_nr =
   brw_inst_3src_a16_src0_subreg_nr(devinfo, inst) * 4 /
   brw_reg_type_to_size(type);
@@ -968,7 +970,8 @@ src1_3src(FILE *file, const struct gen_device_info 
*devinfo, const brw_inst *ins
int err = 0;
enum brw_reg_type type =
   brw_hw_3src_type_to_reg_type(devinfo,
-   brw_inst_3src_a16_src_hw_type(devinfo, 
inst));
+   brw_inst_3src_a16_src_hw_type(devinfo, 
inst),
+   0);
unsigned src1_subreg_nr =
   brw_inst_3src_a16_src1_subreg_nr(devinfo, inst) * 4 /
   brw_reg_type_to_size(type);
@@ -1001,7 +1004,8 @@ src2_3src(FILE *file, const struct gen_device_info 
*devinfo, const brw_inst *ins
int err = 0;
enum brw_reg_type type =
   brw_hw_3src_type_to_reg_type(devinfo,
-   brw_inst_3src_a16_src_hw_type(devinfo, 
inst));
+   brw_inst_3src_a16_src_hw_type(devinfo, 
inst),
+   0);
unsigned src2_subreg_nr =
   brw_inst_3src_a16_src2_subreg_nr(devinfo, inst) * 4 /
   brw_reg_type_to_size(type);
diff --git a/src/intel/compiler/brw_inst.h b/src/intel/compiler/brw_inst.h
index 0cc1a3e911..e6169057e3 100644
--- a/src/intel/compiler/brw_inst.h
+++ b/src/intel/compiler/brw_inst.h
@@ -251,7 +251,7 @@ static inline void  
  \
 brw_inst_set_3src_a16_##reg##_type(const struct gen_device_info *devinfo, \
brw_inst *inst, enum brw_reg_type type)\
 { \
-   unsigned hw_type = brw_reg_type_to_hw_3src_type(devinfo, type);\
+   unsigned hw_type = brw_reg_type_to_hw_3src_type(devinfo, type, 0); \
brw_inst_set_3src_a16_##reg##_hw_type(devinfo, inst, hw_type); \
 } \
   \
@@ -260,7 +260,7 @@ brw_inst_3src_a16_##reg##_type(const struct gen_device_info 
*devinfo, \
const brw_inst *inst)  \
 { \
unsigned hw_type = brw_inst_3src_a16_##reg##_hw_type(devinfo, inst);   \
-   return brw_hw_3src_type_to_reg_type(devinfo, hw_type); \
+   return brw_hw_3src_type_to_reg_type(devinfo, hw_type, 0);  \
 }
 
 REG_TYPE(dst)
diff --git a/src/intel/compiler/brw_reg_type.c 
b/src/intel/compiler/brw_reg_type.c
index d65ebaee48..7fb4a1e62a 100644
--- a/src/intel/compiler/brw_reg_type.c
+++ b/src/intel/compiler/brw_reg_type.c
@@ -84,20 +84,55 @@ static const struct {
  * and unsigned doublewords, so a new field is also available in the da3src
  * struct (part of struct brw_instruction.bits1 in brw_structs.h) to select
  * dst and shared-src types.
+ *
+ * CNL adds support for 3-src instructions in align1 mode, and with it support
+ * for most register types.
  */
 enum hw_3src_reg_type {
GEN7_3SRC_TYPE_F  = 0,
GEN7_3SRC_TYPE_D  = 1,
GEN7_3SRC_TYPE_UD = 2,
GEN7_3SRC_TYPE_DF = 3,
+
+   /** When ExecutionDatatype is 1: @{ */
+   GEN10_ALIGN1_3SRC_REG_TYPE_HF = 0b000,
+   GEN10_ALIGN1_3SRC_REG_TYPE_F  = 0b001,
+   GEN10_ALIGN1_3SRC_REG_TYPE_DF = 0b010,
+   /** @} */
+
+   /** When ExecutionDatatype is 0: @{ */
+   GEN10_ALIGN1_3SRC_REG_TYPE_UD = 0b000,
+   
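
The enum above reuses the same bit patterns for float and integer types, so a hardware type field only becomes meaningful once the execution datatype bit is known. The following sketch shows that two-level decode under stated assumptions: all names are hypothetical, and only the encodings visible in this excerpt (HF, F, DF for execution datatype 1, UD for execution datatype 0) are filled in. Anything else is reported as invalid here, which is a limitation of the sketch and not a statement about the hardware.

```c
#include <assert.h>

/* Hypothetical names; values follow the "ExecutionDatatype" comments. */
enum sketch_exec_type {
   SKETCH_EXEC_INT   = 0,
   SKETCH_EXEC_FLOAT = 1,
};

enum sketch_reg_type {
   SKETCH_TYPE_INVALID = -1,
   SKETCH_TYPE_HF,
   SKETCH_TYPE_F,
   SKETCH_TYPE_DF,
   SKETCH_TYPE_UD,
};

/* Decode an align1 3-src hardware type field. The same hw_type value maps
 * to different register types depending on the execution datatype. */
static inline enum sketch_reg_type
sketch_decode_align1_type(enum sketch_exec_type exec_type, unsigned hw_type)
{
   assert(exec_type == SKETCH_EXEC_INT || exec_type == SKETCH_EXEC_FLOAT);

   if (exec_type == SKETCH_EXEC_FLOAT) {
      switch (hw_type) {
      case 0:  return SKETCH_TYPE_HF;   /* 0b000 */
      case 1:  return SKETCH_TYPE_F;    /* 0b001 */
      case 2:  return SKETCH_TYPE_DF;   /* 0b010 */
      default: return SKETCH_TYPE_INVALID;
      }
   }

   /* Integer execution: only the UD encoding appears in the excerpt. */
   return hw_type == 0 ? SKETCH_TYPE_UD : SKETCH_TYPE_INVALID;
}
```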

[Mesa-dev] [PATCH 06/13] i965: Add functions to abstract access to 3src register types

2017-08-25 Thread Matt Turner
---
 src/intel/compiler/brw_eu_emit.c | 22 ++
 src/intel/compiler/brw_inst.h| 21 +
 2 files changed, 23 insertions(+), 20 deletions(-)

diff --git a/src/intel/compiler/brw_eu_emit.c b/src/intel/compiler/brw_eu_emit.c
index e4fcbe908d..f1a2283de8 100644
--- a/src/intel/compiler/brw_eu_emit.c
+++ b/src/intel/compiler/brw_eu_emit.c
@@ -814,26 +814,8 @@ brw_alu3(struct brw_codegen *p, unsigned opcode, struct 
brw_reg dest,
* may send us mixed D and UD types and want us to ignore that and use
* the destination type.
*/
-  switch (dest.type) {
-  case BRW_REGISTER_TYPE_F:
- brw_inst_set_3src_a16_src_hw_type(devinfo, inst, BRW_3SRC_TYPE_F);
- brw_inst_set_3src_a16_dst_hw_type(devinfo, inst, BRW_3SRC_TYPE_F);
- break;
-  case BRW_REGISTER_TYPE_DF:
- brw_inst_set_3src_a16_src_hw_type(devinfo, inst, BRW_3SRC_TYPE_DF);
- brw_inst_set_3src_a16_dst_hw_type(devinfo, inst, BRW_3SRC_TYPE_DF);
- break;
-  case BRW_REGISTER_TYPE_D:
- brw_inst_set_3src_a16_src_hw_type(devinfo, inst, BRW_3SRC_TYPE_D);
- brw_inst_set_3src_a16_dst_hw_type(devinfo, inst, BRW_3SRC_TYPE_D);
- break;
-  case BRW_REGISTER_TYPE_UD:
- brw_inst_set_3src_a16_src_hw_type(devinfo, inst, BRW_3SRC_TYPE_UD);
- brw_inst_set_3src_a16_dst_hw_type(devinfo, inst, BRW_3SRC_TYPE_UD);
- break;
-  default:
- unreachable("not reached");
-  }
+  brw_inst_set_3src_a16_src_type(devinfo, inst, dest.type);
+  brw_inst_set_3src_a16_dst_type(devinfo, inst, dest.type);
}
 
return inst;
diff --git a/src/intel/compiler/brw_inst.h b/src/intel/compiler/brw_inst.h
index e0bc2c1ceb..0cc1a3e911 100644
--- a/src/intel/compiler/brw_inst.h
+++ b/src/intel/compiler/brw_inst.h
@@ -246,6 +246,27 @@ F(3src_access_mode,  8,  8)
 F(3src_opcode,   6,  0)
 /** @} */
 
+#define REG_TYPE(reg) \
+static inline void\
+brw_inst_set_3src_a16_##reg##_type(const struct gen_device_info *devinfo, \
+   brw_inst *inst, enum brw_reg_type type)\
+{ \
+   unsigned hw_type = brw_reg_type_to_hw_3src_type(devinfo, type);\
+   brw_inst_set_3src_a16_##reg##_hw_type(devinfo, inst, hw_type); \
+} \
+  \
+static inline enum brw_reg_type   \
+brw_inst_3src_a16_##reg##_type(const struct gen_device_info *devinfo, \
+   const brw_inst *inst)  \
+{ \
+   unsigned hw_type = brw_inst_3src_a16_##reg##_hw_type(devinfo, inst);   \
+   return brw_hw_3src_type_to_reg_type(devinfo, hw_type); \
+}
+
+REG_TYPE(dst)
+REG_TYPE(src)
+#undef REG_TYPE
+
 /**
  * Flow control instruction bits:
  *  @{
-- 
2.13.5
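
The REG_TYPE macro in this patch stamps out one typed setter/getter pair per operand using token pasting, so callers deal in `enum brw_reg_type` and never touch the raw encoding. A self-contained sketch of the same idea follows. The struct, enum and conversion helpers are placeholders invented for the example and do not reflect the real instruction layout.

```c
#include <assert.h>

/* Hypothetical instruction with two raw hardware type fields. */
struct sketch_inst {
   unsigned dst_hw_type;
   unsigned src_hw_type;
};

enum sketch_reg_type { SKETCH_TYPE_F, SKETCH_TYPE_D };

/* Placeholder conversions: in this sketch the hardware encoding is simply
 * the enum value plus one. */
static inline unsigned
sketch_type_to_hw(enum sketch_reg_type type)
{
   return (unsigned)type + 1;
}

static inline enum sketch_reg_type
sketch_hw_to_type(unsigned hw_type)
{
   assert(hw_type >= 1 && hw_type <= 2);
   return (enum sketch_reg_type)(hw_type - 1);
}

/* Generate a typed setter and getter for one operand via token pasting. */
#define SKETCH_REG_TYPE(reg)                                          \
static inline void                                                    \
sketch_inst_set_##reg##_type(struct sketch_inst *inst,                \
                             enum sketch_reg_type type)               \
{                                                                     \
   inst->reg##_hw_type = sketch_type_to_hw(type);                     \
}                                                                     \
                                                                      \
static inline enum sketch_reg_type                                    \
sketch_inst_##reg##_type(const struct sketch_inst *inst)              \
{                                                                     \
   return sketch_hw_to_type(inst->reg##_hw_type);                     \
}

SKETCH_REG_TYPE(dst)
SKETCH_REG_TYPE(src)
#undef SKETCH_REG_TYPE
```

Because the conversion lives inside the generated accessors, a later change to the encoding only needs to touch the conversion functions and not every call site.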



[Mesa-dev] [PATCH 01/13] i965: Move brw_reg_type_is_floating_point to brw_reg_type.h

2017-08-25 Thread Matt Turner
I'm going to call this from brw_inst.h, and I don't want to have to
include all of brw_reg.h.
---
 src/intel/compiler/brw_reg.h  | 13 -
 src/intel/compiler/brw_reg_type.h | 15 +++
 2 files changed, 15 insertions(+), 13 deletions(-)

diff --git a/src/intel/compiler/brw_reg.h b/src/intel/compiler/brw_reg.h
index 441dfb2447..d68d64f003 100644
--- a/src/intel/compiler/brw_reg.h
+++ b/src/intel/compiler/brw_reg.h
@@ -289,19 +289,6 @@ type_sz(unsigned type)
}
 }
 
-static inline bool
-brw_reg_type_is_floating_point(enum brw_reg_type type)
-{
-   switch (type) {
-   case BRW_REGISTER_TYPE_F:
-   case BRW_REGISTER_TYPE_HF:
-   case BRW_REGISTER_TYPE_DF:
-  return true;
-   default:
-  return false;
-   }
-}
-
 static inline enum brw_reg_type
 get_exec_type(const enum brw_reg_type type)
 {
diff --git a/src/intel/compiler/brw_reg_type.h 
b/src/intel/compiler/brw_reg_type.h
index 87d9fe31e8..0b40906d92 100644
--- a/src/intel/compiler/brw_reg_type.h
+++ b/src/intel/compiler/brw_reg_type.h
@@ -24,6 +24,8 @@
 #ifndef BRW_REG_TYPE_H
 #define BRW_REG_TYPE_H
 
+#include <stdbool.h>
+
 #ifdef __cplusplus
 extern "C" {
 #endif
@@ -65,6 +67,19 @@ enum PACKED brw_reg_type {
BRW_REGISTER_TYPE_LAST = BRW_REGISTER_TYPE_UV
 };
 
+static inline bool
+brw_reg_type_is_floating_point(enum brw_reg_type type)
+{
+   switch (type) {
+   case BRW_REGISTER_TYPE_DF:
+   case BRW_REGISTER_TYPE_F:
+   case BRW_REGISTER_TYPE_HF:
+  return true;
+   default:
+  return false;
+   }
+}
+
 unsigned
 brw_reg_type_to_hw_type(const struct gen_device_info *devinfo,
 enum brw_reg_file file, enum brw_reg_type type);
-- 
2.13.5



Re: [Mesa-dev] [PATCH 2/2] egl: automake: don't link against libmesautil

2017-08-25 Thread Emil Velikov
On 25 August 2017 at 23:25, Jason Ekstrand  wrote:
> On Fri, Aug 25, 2017 at 1:20 PM, Emil Velikov 
> wrote:
>>
>> From: Emil Velikov 
>>
>> Originally required for the u_vector implementation, which was inlined
>> in u_vector.h with previous commit.
>>
>> Using libmesautil pulled the C++ runtime (string_to_uint_map.cpp),
>> which is something we don't want to impose on our libEGL.
>>
>> We could consider rewriting string_to_uint_map in C, but that's too
>> invasive for a stable fix.
>
>
> A quick grep and it looks like the only users of string_to_uint_map are
> src/compiler/glsl and src/mesa, which depends on src/compiler/glsl.  Why not
> just move string_to_uint_map into src/compiler/glsl or src/compiler until
> such a time as we actually have another user?  Don't get me wrong, I think
> string_to_uint_map is useful, but I think the better option here is to
> disallow C++ in src/util.
>
Great idea, thanks.
I'm build testing for subtle breakages and will send a patch in a
couple of minutes.

-Emil


Re: [Mesa-dev] [PATCH] st/query: init result data with 0

2017-08-25 Thread Karol Herbst
On Sat, Aug 26, 2017 at 1:38 AM, Ilia Mirkin  wrote:
> On Fri, Aug 25, 2017 at 7:37 PM, Karol Herbst  wrote:
>> On Sat, Aug 26, 2017 at 1:30 AM, Ilia Mirkin  wrote:
>>> Why is this necessary? If data is not initialized, then presumably
>>> pipe->get_query_result will have returned false.
>>>
>>
>> But it didn't. It might be the driver's fault (in my case nouveau) that
>> it writes garbage or nothing into data, most likely the latter.
>
> Sounds like a nouveau bug then.

Looks like nouveau never writes to
result->pipeline_statistics.cs_invocations, because it only writes
the first 10 fields, leaving out this 11th one.
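
A minimal sketch of the failure mode described above, using hypothetical stand-in types rather than the real gallium definitions: a result struct with 11 counters and a driver hook that fills only the first 10 while still reporting success. Zeroing the result before the query makes the skipped counter read back as 0 instead of whatever was on the stack.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the pipeline-statistics result: 11 counters,
 * the last one playing the role of cs_invocations. */
#define NUM_COUNTERS 11

struct fake_pipeline_stats {
   uint64_t counters[NUM_COUNTERS];
};

/* Hypothetical driver hook that, like the behaviour described above,
 * writes only the first 10 counters and still reports success. */
static bool
fake_get_query_result(struct fake_pipeline_stats *out)
{
   assert(out != NULL);

   for (int i = 0; i < NUM_COUNTERS - 1; i++)
      out->counters[i] = 100 + (uint64_t)i;

   return true;
}

/* Caller-side pattern: zero the result before the query so that fields
 * the driver skips read back as 0. */
static uint64_t
query_last_counter(void)
{
   struct fake_pipeline_stats data = { 0 };

   if (!fake_get_query_result(&data))
      return 0;

   return data.counters[NUM_COUNTERS - 1];
}
```

Note that this sketch uses a struct, where `{ 0 }` zeroes every member. The real result type is a union, and a union initializer names only its first member, so the sketch illustrates the idea rather than the exact language guarantee.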


[Mesa-dev] [Bug 102377] PIPE_*_4BYTE_ALIGNED_ONLY caps crashing

2017-08-25 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=102377

--- Comment #4 from Bruce Cherniak  ---
The proposed patch looks good.  Regressions with the 4BYTE_ALIGNED_ONLY caps
set are fixed.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


Re: [Mesa-dev] [PATCH] st/query: init result data with 0

2017-08-25 Thread Ilia Mirkin
On Fri, Aug 25, 2017 at 7:37 PM, Karol Herbst  wrote:
> On Sat, Aug 26, 2017 at 1:30 AM, Ilia Mirkin  wrote:
>> Why is this necessary? If data is not initialized, then presumably
>> pipe->get_query_result will have returned false.
>>
>
> But it didn't. It might be the driver's fault (in my case nouveau) that
> it writes garbage or nothing into data, most likely the latter.

Sounds like a nouveau bug then.


Re: [Mesa-dev] [PATCH] st/query: init result data with 0

2017-08-25 Thread Ilia Mirkin
Why is this necessary? If data is not initialized, then presumably
pipe->get_query_result will have returned false.

On Fri, Aug 25, 2017 at 7:15 PM, Karol Herbst  wrote:
> otherwise the result might contain random data.
>
> fixes on nvc0:
>  * KHR-GL45.pipeline_statistics_query_tests_ARB.functional_default_qo_values
>  * 
> KHR-GL45.pipeline_statistics_query_tests_ARB.functional_non_rendering_commands_do_not_affect_queries
>
> Signed-off-by: Karol Herbst 
> Cc: mesa-sta...@lists.freedesktop.org
> ---
>  src/mesa/state_tracker/st_cb_queryobj.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/mesa/state_tracker/st_cb_queryobj.c 
> b/src/mesa/state_tracker/st_cb_queryobj.c
> index 4c25724b5d..9a65fe7bd9 100644
> --- a/src/mesa/state_tracker/st_cb_queryobj.c
> +++ b/src/mesa/state_tracker/st_cb_queryobj.c
> @@ -211,7 +211,7 @@ get_query_result(struct pipe_context *pipe,
>   struct st_query_object *stq,
>   boolean wait)
>  {
> -   union pipe_query_result data;
> +   union pipe_query_result data = { 0 };
>
> if (!stq->pq) {
>/* Only needed in case we failed to allocate the gallium query earlier.
> --
> 2.14.1
>


[Mesa-dev] [PATCH] st/query: init result data with 0

2017-08-25 Thread Karol Herbst
otherwise the result might contain random data.

fixes on nvc0:
 * KHR-GL45.pipeline_statistics_query_tests_ARB.functional_default_qo_values
 * 
KHR-GL45.pipeline_statistics_query_tests_ARB.functional_non_rendering_commands_do_not_affect_queries

Signed-off-by: Karol Herbst 
Cc: mesa-sta...@lists.freedesktop.org
---
 src/mesa/state_tracker/st_cb_queryobj.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/state_tracker/st_cb_queryobj.c 
b/src/mesa/state_tracker/st_cb_queryobj.c
index 4c25724b5d..9a65fe7bd9 100644
--- a/src/mesa/state_tracker/st_cb_queryobj.c
+++ b/src/mesa/state_tracker/st_cb_queryobj.c
@@ -211,7 +211,7 @@ get_query_result(struct pipe_context *pipe,
  struct st_query_object *stq,
  boolean wait)
 {
-   union pipe_query_result data;
+   union pipe_query_result data = { 0 };
 
if (!stq->pq) {
   /* Only needed in case we failed to allocate the gallium query earlier.
-- 
2.14.1



Re: [Mesa-dev] TGSI 16-bit support

2017-08-25 Thread Timothy Arceri



On 26/08/17 00:50, Nicolai Hähnle wrote:

On 25.08.2017 13:58, Marek Olšák wrote:

Nicolai,

Have you thought about switching to NIR for radeonsi completely to get 
16-bit support? We need NIR support anyway for spirv, right? Would 
it be easier than adding 16-bit support into TGSI, glsl2tgsi, and 
tgsi2llvm?


Well. What's missing from the NIR path is:

(1) GS and tess (the ABI parts only)
(2) re-adding some minor extensions (shader_group_vote?)
(3) fixing all the bugs -- it's been a while since I've done a piglit 
comparison


There's a bunch of unknowns, so it's hard to say, but once we're there 
16-bit should be much easier, so may be worth it.


Hi Nicolai,

Do you have a branch somewhere with your latest work on this? Or is it 
all in Mesa currently?


I tried to run master on shader-db to see if the NIR path gave us any 
big changes either way. However I had to stop it as it hit 11GB of mem 
and rising. Not sure yet if this was a memleak, or we got stuck in a 
loop somewhere.





Cheers,
Nicolai



Re: [Mesa-dev] [PATCH 2/2] egl: automake: don't link against libmesautil

2017-08-25 Thread Jason Ekstrand
On Fri, Aug 25, 2017 at 1:20 PM, Emil Velikov 
wrote:

> From: Emil Velikov 
>
> Originally required for the u_vector implementation, which was inlined
> in u_vector.h with previous commit.
>
> Using libmesautil pulled the C++ runtime (string_to_uint_map.cpp),
> which is something we don't want to impose on our libEGL.
>
> We could consider rewriting string_to_uint_map in C, but that's too
> invasive for a stable fix.
>

A quick grep and it looks like the only users of string_to_uint_map are
src/compiler/glsl and src/mesa, which depends on src/compiler/glsl.  Why not
just move string_to_uint_map into src/compiler/glsl or src/compiler until
such a time as we actually have another user?  Don't get me wrong, I think
string_to_uint_map is useful, but I think the better option here is to
disallow C++ in src/util.

--Jason


> Cc: Daniel Stone 
> Cc: "17.2" 
> Fixes: 02cc35937277 ("egl/wayland: Use linux-dmabuf interface for buffers")
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101851
> Signed-off-by: Emil Velikov 
> ---
>  src/egl/Makefile.am | 1 -
>  1 file changed, 1 deletion(-)
>
> diff --git a/src/egl/Makefile.am b/src/egl/Makefile.am
> index bb8ec9745dd..7331b616a8d 100644
> --- a/src/egl/Makefile.am
> +++ b/src/egl/Makefile.am
> @@ -82,7 +82,6 @@ AM_CFLAGS += $(WAYLAND_CFLAGS)
>  libEGL_common_la_LIBADD += $(WAYLAND_LIBS)
>  libEGL_common_la_LIBADD += $(LIBDRM_LIBS)
>  libEGL_common_la_LIBADD += $(top_builddir)/src/egl/wayland/wayland-drm/
> libwayland-drm.la
> -libEGL_common_la_LIBADD += $(top_builddir)/src/util/libmesautil.la
>  dri2_backend_FILES += \
> drivers/dri2/platform_wayland.c \
> drivers/dri2/linux-dmabuf-unstable-v1-protocol.c \
> --
> 2.13.3
>
>


Re: [Mesa-dev] Question for nir lower load uniform to scalar

2017-08-25 Thread Jason Ekstrand

On August 25, 2017 12:14:20 PM Eric Anholt  wrote:


Qiang Yu  writes:


Hi Eric,

I'm working on the lima gp compiler, which should benefit from nir lowering
uniform loads to scalar.
I notice you wrote nir_lower_io_to_scalar.c, which supports lowering
shader_in/shader_out
but left the uniform lowering in the vc4 driver. Is there any reason not to
implement it in nir_lower_io_to_scalar.c?


I think my theory was that drivers would want different units for the
base/offset (bytes or dwords), so I left it in vc4.  Anyone else want to
weigh in on this?  vc4 wants indirect load offsets in units of bytes.


We could do the same thing as nir_lower_io and pass a size function in.  
I've thought about doing some sort of io scalarization for our driver but 
it already handles the vectors so there's no rush.  It would also be a 
reasonable thing to just combine lower_io_to_scalar with lower_io and just 
let the driver indicate what it wants scalarized.  Just thoughts...
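
One way to read the suggestion above, as a minimal sketch only: the lowering splits a vector load into per-component scalar loads, and the driver passes in a callback that says how far apart components are in its preferred units. Every name here is hypothetical and none of it is NIR API; the byte variant assumes 32-bit components.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback: maps a component index to an offset in the units
 * the driver wants (bytes, dwords, ...). Not part of the NIR API. */
typedef unsigned (*component_offset_fn)(unsigned component);

/* Assumes 32-bit components, so each one is 4 bytes apart. */
static unsigned offset_in_bytes(unsigned component)  { return component * 4; }
static unsigned offset_in_dwords(unsigned component) { return component; }

/* Hypothetical record for one scalar load produced by the lowering. */
struct scalar_load {
   unsigned base;    /* base of the original vector load, passed through */
   unsigned offset;  /* per-component offset, in driver units */
};

/* Split one vector load into up to max_out scalar loads and return how
 * many entries of out[] were written. */
static unsigned
split_vector_load(unsigned base, unsigned num_components,
                  component_offset_fn offset_of,
                  struct scalar_load *out, unsigned max_out)
{
   assert(offset_of != NULL && out != NULL);

   unsigned count = num_components < max_out ? num_components : max_out;

   for (unsigned i = 0; i < count; i++) {
      out[i].base = base;
      out[i].offset = offset_of(i);
   }

   return count;
}
```

A driver that wants byte offsets would pass `offset_in_bytes`, while one that wants dwords would pass `offset_in_dwords`; the pass itself stays unit-agnostic.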


--Jason


I'm new to nir and tried to add it, but the result does not seem correct
after the optimization passes. So I must be missing something; can anyone
help point it out?


Your nir_lower_io.c code looks correct to me, so I'm not sure what might
be missing.  I'm not sure about using the component field, though -- for
VC4 all I want after lowering is a byte offset within the constant
buffer.



--





[Mesa-dev] Mesa 17.1.8 release candidate

2017-08-25 Thread Andres Gomez
Hello list,

The candidate for the Mesa 17.1.8 is now available. Currently we have:
 - 15 queued
 - 0 nominated (outstanding)
 - and 2 rejected patches


In the current queue we have:

In Mesa Core we include a fix to prevent a crash in
glCompressedTextureSubImage3D.

The GLSL compiler now includes a fix to add some int64 constant
propagation cases.

Intel drivers have received several patches. Among those, i965 has
gotten a fix for performance queries and another one for a problem that
could have caused rendering corruption or possibly hangs in programs
which use compute shaders.

AMD's drivers have also seen some improvements. radeonsi has gotten a
temporary workaround for a tessellation driver bug while radv includes
a patch to prevent a potential crash.

nouveau has gotten a fix to properly set the sType for TXF ops to
U32.

For EGL, the Wayland platform of the DRI2 drivers has received a
fix to prevent a possible indefinite block.

Take a look at section "Mesa stable queue" for more information.


Testing reports/general approval
--------------------------------

Any testing reports (or general approval of the state of the branch)
will be greatly appreciated.

The plan is to have 17.1.8 next Monday (28th of August), around or
shortly after 13:00 GMT.

If you have any questions or suggestions - be that about the current
patch queue or otherwise, please go ahead.


Trivial merge conflicts
-----------------------

3f3e925d404c8524c342b72c32e4c151e293b2c9
Author: Dave Airlie 

radv: don't crash if we have no framebuffer

(cherry picked from commit 4a091b0788664f73bbb35c14d04c00cebf37e17a)

54bb87c25a73a9c0d4c8c65b6df586dc144db361
Author: Marek Olšák 

radeonsi/gfx9: add a temporary workaround for a tessellation driver bug

(cherry picked from commit 166823bfd26ff7e9b88099598305967336525716)

4fce4ce271b42357df50ef6d62a1481e41655f00
Author: Christoph Haag 

mesa: only copy requested compressed teximage cubemap faces

(cherry picked from commit 87556a650ad363b41d86f4e25d5c4696f9af4550)


Cheers,
Andres


Mesa stable queue
-----------------

Nominated (0)
=============


Queued (15)
===========

Andres Gomez (5):
  docs: add sha256 checksums for 17.1.7
  cherry-ignore: cherry-ignore: added 17.2 nominations.
  cherry-ignore: add "i965/tex: Don't pass samples to 
miptree_create_for_teximage"
  cherry-ignore: add "i965: Make a BRW_NEW_FAST_CLEAR_COLOR dirty bit."
  cherry-ignore: add "egl/drm: Fix misused x and y offsets in 
swrast_*_image*"

Christoph Haag (1):
  mesa: only copy requested compressed teximage cubemap faces

Dave Airlie (1):
  radv: don't crash if we have no framebuffer

Ilia Mirkin (2):
  glsl: add a few missing int64 constant propagation cases
  nv50/ir: properly set sType for TXF ops to U32

Jason Ekstrand (1):
  i965: Stop looking at NewDriverState when emitting 3DSTATE_URB

Kai Chen (1):
  egl/wayland: Use roundtrips when awaiting buffer release

Lionel Landwerlin (1):
  i965: perf: minimize the chances to spread queries across batchbuffers

Marek Olšák (1):
  radeonsi/gfx9: add a temporary workaround for a tessellation driver bug

Tim Rowley (1):
  swr/rast: switch gen_knobs.cpp license

Topi Pohjolainen (1):
  intel/blorp: Adjust intra-tile x when faking rgb with red-only


Rejected (2)
============

Jason Ekstrand (1):
  i965/tex: Don't pass samples to miptree_create_for_teximage

Depends on earlier commit 76e2f390f98 which did not land in branch.

Kenneth Graunke (1):
  i965: Make a BRW_NEW_FAST_CLEAR_COLOR dirty bit.

Depends on earlier commit f296c22989ff which did not land in branch.



Re: [Mesa-dev] [PATCH] st/va: move YUV content to deinterlaced buffer when reallocated for encoder

2017-08-25 Thread Andy Furniss

Leo Liu wrote:




+  }


Should we bail out with an error here when it's the other
way around?
Although I cannot think of any case that would get an interlaced
buffer now, it's still a good idea to bail out here when it happens.

Will add it in v4.


It's not an error in a case where the buffer is deinterlaced and the
query result is interlaced. What we need to do is nothing; just
ignore it.

I have sent out v4, please ignore it, it won't work.


Well that's not correct either.

When the buffer is allocated as progressive and the codec
doesn't support that, we should certainly do something.

Either bail out as an error when we encode because we can't
convert progressive->interlaced (just the other way around)
and/or reallocate for decoding.

Here is the current situation for transcode:

The decoder allocates I buffers as preferred, then the encoder prefers
P buffers, so they are re-allocated as P buffers. Then on the next
frame, the decoder takes a P buffer, which is not what it prefers.

3 possible ways for the decoder to go:

1. Ignore the preferred format and keep the buffer as P, and the pipe
   goes on. V3
2. Go for the preferred format, and then do endless alloc/dealloc
   frame by frame. V2
3. Bail out with an error, and the pipeline stops. V4


Won't have time to test till tomorrow but just getting one of the
cases I thought may work with this, that couldn't work with the
env, out.

ffmpeg can (in theory anyway) do -

hwdec -> hw deinterlace -> hw encode.

Possible?


Not sure about FFMpeg. Have you tried that before? I always use 
gstreamer for encode/transcode.


The env existed before ffmpeg vaapi could do it, so I expected and got
failure.

IIRC with the env = 0 hwdec -> hw deint -> copy back raw nv12 worked,
but trying to get encode failed as expected without env = 1. It was a
while ago I tried, IIRC it was possible to hang GPU. I guess the deint

I don't know how, so haven't tried a gstreamer command that would do the
same.


Re: [Mesa-dev] [PATCH] egl/drm: Don't "fall back" to /dev/dri/card0 if the first open fails

2017-08-25 Thread Adam Jackson
On Fri, 2017-08-25 at 14:41 +0100, Emil Velikov wrote:

> Should we drop the "if (n != -1 && n < sizeof(buf))" part as well with
> this patch?

Meh. I've got some other changes coming in the area so that'll probably
happen soon anyway. At least for this change I just wanted to make
things deterministic and either open the named device or fail.

- ajax


[Mesa-dev] [PATCH] radv: Fix sparse BO mapping merging.

2017-08-25 Thread Bas Nieuwenhuizen
If we merge a mapping with the mapping before it, we need to change
not only the offset but also the bo offset.

Fixes: 715df30a4e2 "radv/amdgpu: Add winsys implementation of virtual buffers."
---
 src/amd/vulkan/winsys/amdgpu/radv_amdgpu_bo.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_bo.c 
b/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_bo.c
index 5c374a238d6..75444d57dac 100644
--- a/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_bo.c
+++ b/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_bo.c
@@ -149,6 +149,7 @@ radv_amdgpu_winsys_bo_virtual_bind(struct radeon_winsys_bo 
*_parent,
if (parent->ranges[first].bo == bo && (!bo || offset - bo_offset == 
parent->ranges[first].offset - parent->ranges[first].bo_offset)) {
size += offset - parent->ranges[first].offset;
offset = parent->ranges[first].offset;
+   bo_offset = parent->ranges[first].bo_offset;
remove_first = true;
}
 
-- 
2.14.1
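
The merge rule in the patch above can be illustrated with a simplified, self-contained sketch. The struct and function names are hypothetical stand-ins, not the radv winsys types, and the sketch assumes the previous range directly precedes the new mapping. The point is that once the merged mapping starts at the previous range's offset, its bo offset has to move back with it, or the two no longer describe the same memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one virtual-address range of a sparse buffer. */
struct fake_range {
   uint64_t offset;     /* offset within the virtual buffer */
   uint64_t size;       /* size of the range */
   uint64_t bo_offset;  /* offset within the backing bo */
   int bo;              /* backing bo id, 0 meaning "no bo bound" */
};

/* Try to merge cur with the range directly before it. On success cur is
 * extended backwards to cover prev as well. */
static bool
merge_with_previous(const struct fake_range *prev, struct fake_range *cur)
{
   assert(prev != NULL && cur != NULL);

   if (prev->bo != cur->bo)
      return false;

   /* With a real bo, both ranges must map with the same delta between the
    * virtual offset and the bo offset to be contiguous in the bo. */
   if (cur->bo != 0 &&
       cur->offset - cur->bo_offset != prev->offset - prev->bo_offset)
      return false;

   cur->size += cur->offset - prev->offset;
   cur->offset = prev->offset;
   cur->bo_offset = prev->bo_offset;  /* the step the fix adds */
   return true;
}
```

Leaving out the last assignment would keep the old bo offset while the range now starts earlier, which is the mismatch the commit message describes.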



[Mesa-dev] [PATCH 2/2] egl: automake: don't link against libmesautil

2017-08-25 Thread Emil Velikov
From: Emil Velikov 

Originally required for the u_vector implementation, which was inlined
in u_vector.h with previous commit.

Using libmesautil pulled the C++ runtime (string_to_uint_map.cpp),
which is something we don't want to impose on our libEGL.

We could consider rewriting string_to_uint_map in C, but that's too
invasive for a stable fix.

Cc: Daniel Stone 
Cc: "17.2" 
Fixes: 02cc35937277 ("egl/wayland: Use linux-dmabuf interface for buffers")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101851
Signed-off-by: Emil Velikov 
---
 src/egl/Makefile.am | 1 -
 1 file changed, 1 deletion(-)

diff --git a/src/egl/Makefile.am b/src/egl/Makefile.am
index bb8ec9745dd..7331b616a8d 100644
--- a/src/egl/Makefile.am
+++ b/src/egl/Makefile.am
@@ -82,7 +82,6 @@ AM_CFLAGS += $(WAYLAND_CFLAGS)
 libEGL_common_la_LIBADD += $(WAYLAND_LIBS)
 libEGL_common_la_LIBADD += $(LIBDRM_LIBS)
 libEGL_common_la_LIBADD += 
$(top_builddir)/src/egl/wayland/wayland-drm/libwayland-drm.la
-libEGL_common_la_LIBADD += $(top_builddir)/src/util/libmesautil.la
 dri2_backend_FILES += \
drivers/dri2/platform_wayland.c \
drivers/dri2/linux-dmabuf-unstable-v1-protocol.c \
-- 
2.13.3



[Mesa-dev] [PATCH 1/2] util: inline u_vector.c within the header

2017-08-25 Thread Emil Velikov
From: Emil Velikov 

Inlining the implementation does not cause additional overhead in
terms of build time, while the binary size increases only marginally (~1k).

At the same time the compiler should be able to optimise better, although
this is not a path where we'll notice much difference.

Use a local u_is_power_of_two to avoid pulling in u_math.h, which
would require updating 5+ locations to have an extra -I directive.
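
For illustration only, a local power-of-two check of this kind can be very small. The helper below is a hypothetical sketch, not the one from the patch, and it deliberately treats 0 as not being a power of two. The second function shows why the vector code wants power-of-two sizes: a free-running index can be wrapped with a mask instead of a modulo.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical local helper: true when exactly one bit is set in v. */
static inline bool
is_power_of_two(uint32_t v)
{
   return v != 0 && (v & (v - 1)) == 0;
}

/* Wrap a free-running index into a buffer whose size is a power of two.
 * For such sizes, index & (size - 1) equals index % size. */
static inline uint32_t
wrap_index(uint32_t index, uint32_t size)
{
   assert(is_power_of_two(size));
   return index & (size - 1);
}
```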

Doing this will allow us to address an unrelated issue, as mentioned in
the report below.

Cc: "17.2" 
Cc: Daniel Stone 
Cc: Jason Ekstrand 
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101851
Signed-off-by: Emil Velikov 
---
FTR I'm not a huge fan of this change, but it's like the least invasive
one.
---
 src/util/Makefile.sources |   1 -
 src/util/u_vector.c   | 110 --
 src/util/u_vector.h   |  93 ---
 3 files changed, 88 insertions(+), 116 deletions(-)
 delete mode 100644 src/util/u_vector.c

diff --git a/src/util/Makefile.sources b/src/util/Makefile.sources
index 3315285f05e..08c7a7ea6a8 100644
--- a/src/util/Makefile.sources
+++ b/src/util/Makefile.sources
@@ -51,7 +51,6 @@ MESA_UTIL_FILES := \
u_queue.h \
u_string.h \
u_thread.h \
-   u_vector.c \
u_vector.h
 
 MESA_UTIL_GENERATED_FILES = \
diff --git a/src/util/u_vector.c b/src/util/u_vector.c
deleted file mode 100644
index 0de492ccf9a..000
--- a/src/util/u_vector.c
+++ /dev/null
@@ -1,110 +0,0 @@
-/*
- * Copyright © 2015 Intel Corporation
- *
- * Permission is hereby granted, free of charge, to any person obtaining a
- * copy of this software and associated documentation files (the "Software"),
- * to deal in the Software without restriction, including without limitation
- * the rights to use, copy, modify, merge, publish, distribute, sublicense,
- * and/or sell copies of the Software, and to permit persons to whom the
- * Software is furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice (including the next
- * paragraph) shall be included in all copies or substantial portions of the
- * Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
- * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
- * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
- * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
- * IN THE SOFTWARE.
- */
-
-#include 
-#include "util/u_math.h"
-#include "util/u_vector.h"
-
-/** @file u_vector.c
- *
- * A dynamically growable, circular buffer.  Elements are added at head and
- * removed from tail. head and tail are free-running uint32_t indices and we
- * only compute the modulo with size when accessing the array.  This way,
- * number of bytes in the queue is always head - tail, even in case of
- * wraparound.
- */
-
-int
-u_vector_init(struct u_vector *vector, uint32_t element_size, uint32_t size)
-{
-   assert(util_is_power_of_two(size));
-   assert(element_size < size && util_is_power_of_two(element_size));
-
-   vector->head = 0;
-   vector->tail = 0;
-   vector->element_size = element_size;
-   vector->size = size;
-   vector->data = malloc(size);
-
-   return vector->data != NULL;
-}
-
-void *
-u_vector_add(struct u_vector *vector)
-{
-   uint32_t offset, size, split, src_tail, dst_tail;
-   void *data;
-
-   if (vector->head - vector->tail == vector->size) {
-  size = vector->size * 2;
-  data = malloc(size);
-  if (data == NULL)
- return NULL;
-  src_tail = vector->tail & (vector->size - 1);
-  dst_tail = vector->tail & (size - 1);
-  if (src_tail == 0) {
- /* Since we know that the vector is full, this means that it's
-  * linear from start to end so we can do one copy.
-  */
- memcpy((char *)data + dst_tail, vector->data, vector->size);
-  } else {
- /* In this case, the vector is split into two pieces and we have
-  * to do two copies.  We have to be careful to make sure each
-  * piece goes to the right locations.  Thanks to the change in
-  * size, it may or may not still wrap around.
-  */
- split = u_align_u32(vector->tail, vector->size);
- assert(vector->tail <= split && split < vector->head);
- memcpy((char *)data + dst_tail, (char *)vector->data + src_tail,
-split - vector->tail);
- memcpy((char *)data + (split & (size - 1)), vector->data,
-vector->head - split);
-  }
-  free(vector->data);
-  vector->data = 

Re: [Mesa-dev] [PATCH] st/va: move YUV content to deinterlaced buffer when reallocated for encoder

2017-08-25 Thread Leo Liu




+  }


Should we bail out with an error here when it's the other way 
around?
Although I cannot think of any case that would get an interlaced 
buffer now, it's still a good idea to bail out here when it 
happens.

Will add it in v4.


It's not an error in a case where the buffer is deinterlaced and the 
query result is interlaced. What we need to do is nothing; 
just ignore it.

I have sent out v4, please ignore it, it won't work.


Well that's not correct either.

When the buffer is allocated as progressive and the codec doesn't 
support that, we should certainly do something.


Either bail out as an error when we encode because we can't convert 
progressive->interlaced (just the other way around) and/or 
reallocate for decoding.

Here is the current situation for transcode:

The decoder allocates I buffers as preferred, then the encoder prefers 
P buffers, so they are re-allocated as P buffers.

Then on the next frame, the decoder takes a P buffer, which is not what 
it prefers.

3 possible ways for the decoder to go:

1. Ignore the preferred format and keep the buffer as P, and the pipe 
   goes on. V3
2. Go for the preferred format, and then do endless alloc/dealloc 
   frame by frame. V2
3. Bail out with an error, and the pipeline stops. V4


Won't have time to test till tomorrow but just getting one of the cases
I thought may work with this, that couldn't work with the env, out.

ffmpeg can (in theory anyway) do -

hwdec -> hw deinterlace -> hw encode.

Possible?


Not sure about FFMpeg. Have you tried that before? I always use 
gstreamer for encode/transcode.


Regards,
Leo





Re: [Mesa-dev] [PATCH] st/va: move YUV content to deinterlaced buffer when reallocated for encoder

2017-08-25 Thread Andy Furniss

Leo Liu wrote:



On 08/25/2017 03:16 PM, Christian König wrote:

Am 25.08.2017 um 17:15 schrieb Leo Liu:



On 08/25/2017 10:53 AM, Leo Liu wrote:



On 08/25/2017 02:57 AM, Christian König wrote:

Am 24.08.2017 um 20:49 schrieb Leo Liu:

v2: use deinterlace common function
v3: make sure deinterlace only

Signed-off-by: Leo Liu 
---
  src/gallium/state_trackers/va/picture.c | 22 --
  1 file changed, 16 insertions(+), 6 deletions(-)

diff --git a/src/gallium/state_trackers/va/picture.c 
b/src/gallium/state_trackers/va/picture.c

index 6c3c4fe..aa4062d 100644
--- a/src/gallium/state_trackers/va/picture.c
+++ b/src/gallium/state_trackers/va/picture.c
@@ -613,17 +613,22 @@ vlVaEndPicture(VADriverContextP ctx, 
VAContextID context_id)

 mtx_lock(>mutex);
 surf = handle_table_get(drv->htab, context->target_id);
 context->mpeg4.frame_num++;
-
 screen = context->decoder->context->screen;
 interlaced = screen->get_video_param(screen, 
context->decoder->profile,

context->decoder->entrypoint,
PIPE_VIDEO_CAP_SUPPORTS_INTERLACED);
   if (surf->buffer->interlaced != interlaced) {
-  surf->templat.interlaced = screen->get_video_param(screen, 
context->decoder->profile,

- PIPE_VIDEO_ENTRYPOINT_BITSTREAM,
- PIPE_VIDEO_CAP_PREFERS_INTERLACED);
-  realloc = true;
+  interlaced = screen->get_video_param(screen, 
context->decoder->profile,

+ context->decoder->entrypoint,
+ PIPE_VIDEO_CAP_PREFERS_INTERLACED);
+  if (!interlaced) {
+ /* The current cases for buffer reallocation are
+all from the interlaced to the deinterlaced,
+and there is no case for the other way around */
+ surf->templat.interlaced = false;
+ realloc = true;
+  }


Should we bail out with an error here when it's the other way around?
Although I cannot think of any case that would get an interlaced buffer 
now, it's still a good idea to bail out here when it happens.

Will add it in v4.


It's not an error in a case where the buffer is deinterlaced and the 
query result is interlaced. What we need to do is nothing; 
just ignore it.

I have sent out v4, please ignore it, it won't work.


Well that's not correct either.

When the buffer is allocated as progressive and the codec doesn't 
support that, we should certainly do something.


Either bail out as an error when we encode because we can't convert 
progressive->interlaced (just the other way around) and/or reallocate 
for decoding.

Here is the current situation for transcode:

The decoder allocates I buffers as preferred, then the encoder prefers 
P buffers, so they are re-allocated as P buffers.

Then on the next frame, the decoder takes a P buffer, which is not what 
it prefers.

3 possible ways for the decoder to go:

1. Ignore the preferred format and keep the buffer as P, and the pipe 
   goes on. V3
2. Go for the preferred format, and then do endless alloc/dealloc 
   frame by frame. V2
3. Bail out with an error, and the pipeline stops. V4


Won't have time to test till tomorrow but just getting one of the cases
I thought may work with this, that couldn't work with the env, out.

ffmpeg can (in theory anyway) do -

hwdec -> hw deinterlace -> hw encode.

Possible?
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/2] st/mesa: only try to create 1x msaa surfaces for "fake" msaa drivers

2017-08-25 Thread Bruce Cherniak
From: Brian Paul 

For software drivers where we want "fake" msaa support for GL 3.x, we
treat 1 sample as being msaa.

For drivers with real msaa support, start format probing at 2x msaa.
For drivers with fake msaa support, start format probing at 1x msaa.

This also tweaks the MaxSamples code in st_init_extensions() so that
we use MaxSamples=1 for fake msaa.  This allows the format probe loops
to run at least one iteration.

This fixes a llvmpipe/VTK regression from commit 6839d3369905eb02151.
And for drivers with fake msaa support, calls such as
glTexImage2DMultisample(samples=1) will now succeed.
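
The probing rule described above can be sketched in isolation as follows. This is a simplified illustration with hypothetical names; the `supported` callback stands in for the real format query, and the caller is assumed to pass a requested count greater than 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver's "is this sample count usable
 * for the format" query. */
typedef bool (*sample_count_supported_fn)(unsigned samples);

/* Two example drivers: one with real 4x/8x msaa, one with fake 1x msaa. */
static bool only_4x_and_8x(unsigned samples) { return samples == 4 || samples == 8; }
static bool only_1x(unsigned samples)        { return samples == 1; }

/* Return the first supported sample count >= requested, or 0 if none is
 * found up to max_samples. */
static unsigned
choose_sample_count(unsigned requested, unsigned max_samples,
                    sample_count_supported_fn supported)
{
   assert(supported != NULL);

   unsigned start = requested;

   /* Drivers with real msaa (max_samples > 1) never get probed at 1x. */
   if (max_samples > 1 && requested == 1)
      start = 2;

   for (unsigned i = start; i <= max_samples; i++) {
      if (supported(i))
         return i;
   }

   return 0;
}
```

With a real-msaa driver a request for 1 sample is bumped to the next supported count, while a fake-msaa driver with MaxSamples=1 still gets one loop iteration at 1x.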

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102038
---
 src/mesa/state_tracker/st_cb_fbo.c | 13 ++---
 src/mesa/state_tracker/st_cb_texture.c | 11 ---
 src/mesa/state_tracker/st_extensions.c | 14 ++
 3 files changed, 24 insertions(+), 14 deletions(-)

diff --git a/src/mesa/state_tracker/st_cb_fbo.c 
b/src/mesa/state_tracker/st_cb_fbo.c
index afc7700306..a7c286bcc5 100644
--- a/src/mesa/state_tracker/st_cb_fbo.c
+++ b/src/mesa/state_tracker/st_cb_fbo.c
@@ -155,12 +155,19 @@ st_renderbuffer_alloc_storage(struct gl_context * ctx,
 *   to  and no more than the next larger sample count supported
 *   by the implementation.
 *
-* So let's find the supported number of samples closest to NumSamples.
+* Find the supported number of samples >= rb->NumSamples
 */
if (rb->NumSamples > 0) {
-  unsigned i;
+  unsigned start, i;
 
-  for (i = MAX2(2, rb->NumSamples); i <= ctx->Const.MaxSamples; i++) {
+  if (ctx->Const.MaxSamples > 1 &&  rb->NumSamples == 1) {
+ /* don't try num_samples = 1 with drivers that support real msaa */
+ start = 2;
+  } else {
+ start = rb->NumSamples;
+  }
+
+  for (i = start; i <= ctx->Const.MaxSamples; i++) {
  format = st_choose_renderbuffer_format(st, internalFormat, i);
 
  if (format != PIPE_FORMAT_NONE) {
diff --git a/src/mesa/state_tracker/st_cb_texture.c 
b/src/mesa/state_tracker/st_cb_texture.c
index af2052db24..b5006b05a7 100644
--- a/src/mesa/state_tracker/st_cb_texture.c
+++ b/src/mesa/state_tracker/st_cb_texture.c
@@ -2739,13 +2739,18 @@ st_texture_storage(struct gl_context *ctx,
 
bindings = default_bindings(st, fmt);
 
-   /* Raise the sample count if the requested one is unsupported. */
if (num_samples > 0) {
+  /* Find msaa sample count which is actually supported.  For example,
+   * if the user requests 1x but only 4x or 8x msaa is supported, we'll
+   * choose 4x here.
+   */
   enum pipe_texture_target ptarget = gl_target_to_pipe(texObj->Target);
   boolean found = FALSE;
 
-  /* start the query with at least two samples */
-  num_samples = MAX2(num_samples, 2);
+  if (ctx->Const.MaxSamples > 1 && num_samples == 1) {
+ /* don't try num_samples = 1 with drivers that support real msaa */
+ num_samples = 2;
+  }
 
   for (; num_samples <= ctx->Const.MaxSamples; num_samples++) {
  if (screen->is_format_supported(screen, fmt, ptarget,
diff --git a/src/mesa/state_tracker/st_extensions.c 
b/src/mesa/state_tracker/st_extensions.c
index 904d9cd834..2008e28250 100644
--- a/src/mesa/state_tracker/st_extensions.c
+++ b/src/mesa/state_tracker/st_extensions.c
@@ -1046,17 +1046,15 @@ void st_init_extensions(struct pipe_screen *screen,
  void_formats, 32,
  PIPE_BIND_RENDER_TARGET);
}
-   if (consts->MaxSamples == 1) {
-  /* one sample doesn't really make sense */
-  consts->MaxSamples = 0;
-   }
-   else if (consts->MaxSamples >= 2) {
+
+   if (consts->MaxSamples >= 2) {
+  /* Real MSAA support */
   extensions->EXT_framebuffer_multisample = GL_TRUE;
   extensions->EXT_framebuffer_multisample_blit_scaled = GL_TRUE;
}
-
-   if (consts->MaxSamples == 0 &&
-   screen->get_param(screen, PIPE_CAP_FAKE_SW_MSAA)) {
+   else if (consts->MaxSamples > 0 &&
+screen->get_param(screen, PIPE_CAP_FAKE_SW_MSAA)) {
+  /* fake MSAA support */
   consts->FakeSWMSAA = GL_TRUE;
   extensions->EXT_framebuffer_multisample = GL_TRUE;
   extensions->EXT_framebuffer_multisample_blit_scaled = GL_TRUE;
-- 
2.11.0



[Mesa-dev] [PATCH 2/2] swr: Report format max_samples=1 to maintain support for "fake" msaa.

2017-08-25 Thread Bruce Cherniak
Accompanying patch "st/mesa: only try to create 1x msaa surfaces for
'fake' msaa" requires driver to report max_samples=1 to enable "fake"
msaa. Previously, 0 and 1 were treated equivalently in st_init_extensions()
and either could enable "fake" msaa.

This patch raises the swr default msaa_max_count from 0 to 1, so that
swr_is_format_supported will report max_samples=1.

Real msaa can still be enabled by exporting SWR_MSAA_MAX_COUNT with a
pow2 value between 2 and 16.

This patch is necessary to prevent an OpenSWR regression resulting from
the st/mesa patch.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102038
---
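Note (illustrative only, not part of the patch): the validation rule for
SWR_MSAA_MAX_COUNT described above can be sketched in standalone C.  The
constant MAX_NUM_MULTISAMPLES is assumed to be 16 (taken from the "between
2 and 16" wording above), and the helper names are hypothetical; the real
code reads the value with debug_get_num_option() and also prints diagnostics.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed value of SWR_MAX_NUM_MULTISAMPLES for this sketch. */
#define MAX_NUM_MULTISAMPLES 16

static bool
is_power_of_two(int v)
{
   return v > 0 && (v & (v - 1)) == 0;
}

/* Returns the sanitized max sample count.  Anything out of range or not
 * a power of two falls back to 1, which now means "fake msaa only". */
static int
validate_msaa_max_count(int requested)
{
   if (requested < 1 || requested > MAX_NUM_MULTISAMPLES ||
       !is_power_of_two(requested))
      return 1;
   return requested;
}

/* Models PIPE_CAP_TEXTURE_MULTISAMPLE: real msaa only above 1 sample. */
static bool
reports_real_msaa(int msaa_max_count)
{
   assert(msaa_max_count >= 1);
   return msaa_max_count > 1;
}

/* Models PIPE_CAP_FAKE_SW_MSAA: the complement of real msaa. */
static bool
reports_fake_msaa(int msaa_max_count)
{
   assert(msaa_max_count >= 1);
   return !(msaa_max_count > 1);
}
```

Under these assumptions the default of 1 and any invalid value both end up
reporting fake msaa, and only 2, 4, 8 or 16 report real msaa.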
 src/gallium/drivers/swr/swr_screen.cpp | 22 +++---
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/src/gallium/drivers/swr/swr_screen.cpp 
b/src/gallium/drivers/swr/swr_screen.cpp
index 3287bc6fee..cc8d9955b8 100644
--- a/src/gallium/drivers/swr/swr_screen.cpp
+++ b/src/gallium/drivers/swr/swr_screen.cpp
@@ -255,13 +255,13 @@ swr_get_param(struct pipe_screen *screen, enum pipe_cap 
param)
   return 1;
 
/* MSAA support
-* If user has explicitly set max_sample_count = 0 (via SWR_MSAA_MAX_COUNT)
-* then disable all MSAA support and go back to old caps. */
+* If user has explicitly set max_sample_count = 1 (via SWR_MSAA_MAX_COUNT)
+* then disable all MSAA support and go back to old (FAKE_SW_MSAA) caps. */
case PIPE_CAP_TEXTURE_MULTISAMPLE:
case PIPE_CAP_MULTISAMPLE_Z_RESOLVE:
-  return swr_screen(screen)->msaa_max_count ? 1 : 0;
+  return (swr_screen(screen)->msaa_max_count > 1) ? 1 : 0;
case PIPE_CAP_FAKE_SW_MSAA:
-  return swr_screen(screen)->msaa_max_count ? 0 : 1;
+  return (swr_screen(screen)->msaa_max_count > 1) ? 0 : 1;
 
   /* unsupported features */
case PIPE_CAP_ANISOTROPIC_FILTER:
@@ -1079,22 +1079,22 @@ swr_validate_env_options(struct swr_screen *screen)
   screen->client_copy_limit = client_copy_limit;
 
/* XXX msaa under development, disable by default for now */
-   screen->msaa_max_count = 0; /* was SWR_MAX_NUM_MULTISAMPLES; */
+   screen->msaa_max_count = 1; /* was SWR_MAX_NUM_MULTISAMPLES; */
 
/* validate env override values, within range and power of 2 */
-   int msaa_max_count = debug_get_num_option("SWR_MSAA_MAX_COUNT", 0);
-   if (msaa_max_count) {
-  if ((msaa_max_count < 0) || (msaa_max_count > SWR_MAX_NUM_MULTISAMPLES)
+   int msaa_max_count = debug_get_num_option("SWR_MSAA_MAX_COUNT", 1);
+   if (msaa_max_count != 1) {
+  if ((msaa_max_count < 1) || (msaa_max_count > SWR_MAX_NUM_MULTISAMPLES)
 || !util_is_power_of_two(msaa_max_count)) {
  fprintf(stderr, "SWR_MSAA_MAX_COUNT invalid: %d\n", msaa_max_count);
  fprintf(stderr, "must be power of 2 between 1 and %d" \
- " (or 0 to disable msaa)\n",
+ " (or 1 to disable msaa)\n",
SWR_MAX_NUM_MULTISAMPLES);
- msaa_max_count = 0;
+ msaa_max_count = 1;
   }
 
   fprintf(stderr, "SWR_MSAA_MAX_COUNT: %d\n", msaa_max_count);
-  if (!msaa_max_count)
+  if (msaa_max_count == 1)
  fprintf(stderr, "(msaa disabled)\n");
 
   screen->msaa_max_count = msaa_max_count;
-- 
2.11.0



Re: [Mesa-dev] [PATCH] st/va: move YUV content to deinterlaced buffer when reallocated for encoder

2017-08-25 Thread Leo Liu



On 08/25/2017 03:16 PM, Christian König wrote:

Am 25.08.2017 um 17:15 schrieb Leo Liu:



On 08/25/2017 10:53 AM, Leo Liu wrote:



On 08/25/2017 02:57 AM, Christian König wrote:

Am 24.08.2017 um 20:49 schrieb Leo Liu:

v2: use deinterlace common function
v3: make sure deinterlace only

Signed-off-by: Leo Liu 
---
  src/gallium/state_trackers/va/picture.c | 22 --
  1 file changed, 16 insertions(+), 6 deletions(-)

diff --git a/src/gallium/state_trackers/va/picture.c 
b/src/gallium/state_trackers/va/picture.c

index 6c3c4fe..aa4062d 100644
--- a/src/gallium/state_trackers/va/picture.c
+++ b/src/gallium/state_trackers/va/picture.c
@@ -613,17 +613,22 @@ vlVaEndPicture(VADriverContextP ctx, 
VAContextID context_id)

 mtx_lock(>mutex);
 surf = handle_table_get(drv->htab, context->target_id);
 context->mpeg4.frame_num++;
-
 screen = context->decoder->context->screen;
 interlaced = screen->get_video_param(screen, 
context->decoder->profile,

context->decoder->entrypoint,
PIPE_VIDEO_CAP_SUPPORTS_INTERLACED);
   if (surf->buffer->interlaced != interlaced) {
-  surf->templat.interlaced = screen->get_video_param(screen, 
context->decoder->profile,

- PIPE_VIDEO_ENTRYPOINT_BITSTREAM,
- PIPE_VIDEO_CAP_PREFERS_INTERLACED);
-  realloc = true;
+  interlaced = screen->get_video_param(screen, 
context->decoder->profile,

+ context->decoder->entrypoint,
+ PIPE_VIDEO_CAP_PREFERS_INTERLACED);
+  if (!interlaced) {
+ /* The current cases for buffer reallocation are
+all from the interlaced to the deinterlaced,
+and there is no case for the other way around */
+ surf->templat.interlaced = false;
+ realloc = true;
+  }


Should we bail out with an error here when it's the other way around?
Although I cannot think of any case that would give us an interlaced 
buffer now, it's still a good idea to bail out here when it happens.

Will add it in v4.


It's not an error when, for example, the buffer is deinterlaced and the 
query returns interlaced. In that case we need to do nothing and just 
ignore it.

I have sent out v4, please ignore it, it won't work.


Well that's not correct either.

When the buffer is allocated as progressive and the codec doesn't 
support that we should certainly do something.


Either bail out as an error when we encode because we can't convert 
progressive->interlaced (just the other way around) and/or reallocate 
for decoding.

Here is the current situation for transcode:

The decoder allocates interlaced (I) buffers as preferred, then the 
encoder prefers progressive (P) buffers, so they are re-allocated as P 
buffers.

Then, on the next frame, the decoder takes a P buffer, which is not what 
it prefers.

There are 3 possible ways for the decoder to go:

1. Ignore the preference and keep the buffer as P, and the pipeline goes on. V3
2. Go for the preferred layout, and then do endless alloc/dealloc frame by frame. V2
3. Bail out with an error, and the pipeline stops. V4

Regards,
Leo
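
As a rough illustration of option 1, the decision made by the v3 hunk can 
be written as a small standalone C function. The enum and the function 
name decide_realloc are hypothetical, and the three booleans stand for 
surf->buffer->interlaced and the SUPPORTS_INTERLACED / PREFERS_INTERLACED 
query results; this is only a sketch of the logic, not the driver code.

```c
#include <assert.h>
#include <stdbool.h>

enum realloc_action {
   KEEP_BUFFER,          /* leave the surface buffer untouched */
   REALLOC_PROGRESSIVE   /* reallocate the buffer as progressive */
};

/* Mirrors the v3 hunk: only act when the buffer layout differs from
 * what the codec supports, and only ever reallocate towards
 * progressive; a preference for interlaced is ignored. */
static enum realloc_action
decide_realloc(bool buffer_interlaced, bool supports_interlaced,
               bool prefers_interlaced)
{
   if (buffer_interlaced == supports_interlaced)
      return KEEP_BUFFER;

   if (!prefers_interlaced)
      return REALLOC_PROGRESSIVE;

   /* Progressive buffer, codec prefers interlaced: do nothing. */
   return KEEP_BUFFER;
}
```

In the transcode case, the encoder sees an I buffer and gets 
REALLOC_PROGRESSIVE once; on the next frame the decoder sees the P buffer 
and gets KEEP_BUFFER, so there is no alloc/dealloc ping-pong.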



Christian.



Leo






Thanks,
Leo





Would be nice if we could at least sanely handle that case.

Apart from that it looks good to me,
Christian.


 }
   if (u_reduce_video_profile(context->templat.profile) == 
PIPE_VIDEO_FORMAT_JPEG &&
@@ -640,13 +645,18 @@ vlVaEndPicture(VADriverContextP ctx, 
VAContextID context_id)

 }
   if (realloc) {
-  surf->buffer->destroy(surf->buffer);
+  struct pipe_video_buffer *old_buf = surf->buffer;
  if (vlVaHandleSurfaceAllocate(ctx, surf, >templat) 
!= VA_STATUS_SUCCESS) {

+ old_buf->destroy(old_buf);
   mtx_unlock(>mutex);
   return VA_STATUS_ERROR_ALLOCATION_FAILED;
}
  +  if (context->decoder->entrypoint == 
PIPE_VIDEO_ENTRYPOINT_ENCODE)
+ vl_compositor_yuv_deint(>cstate, >compositor, 
old_buf, surf->buffer);

+
+  old_buf->destroy(old_buf);
context->target = surf->buffer;
 }













Re: [Mesa-dev] [PATCH] radeonsi/gfx9: add a temporary workaround for a tessellation driver bug

2017-08-25 Thread Marek Olšák
On Fri, Aug 25, 2017 at 8:42 PM, Marek Olšák  wrote:
> On Tue, Aug 22, 2017 at 2:15 PM, Nicolai Hähnle  wrote:
>> On 22.08.2017 14:10, Nicolai Hähnle wrote:
>>>
>>> On 22.08.2017 13:00, Marek Olšák wrote:

 On Tue, Aug 22, 2017 at 9:37 AM, Nicolai Hähnle 
 wrote:
>
> On 18.08.2017 19:06, Marek Olšák wrote:
>>
>>
>> Ping.
>>
>> On Wed, Aug 16, 2017 at 12:57 AM, Marek Olšák  wrote:
>>>
>>>
>>> From: Marek Olšák 
>>>
>>> The workaround will do for now. The root cause is still unknown.
>>>
>>> This fixes new piglit: 16in-1out
>
>
>
> I don't see this test. Did you already send it out?


 "[PATCH] arb_tessellation_shader: new tests for a radeonsi bug" on the
 piglit ML.
>>>
>>>
>>> Curious, I can't reproduce the problem on my Raven.
>>
>>
>> VGT_LS_HS_CONFIG.NUM_PATCHES is 16, so there should definitely be more than
>> one wave per thread-group.
>
> The test is insufficient to reproduce the issue, but you'll see it
> when you run the test with ST_DEBUG=wf with and without the fix.

I just pushed a commit into piglit that adjusts the test, so that
random geometry results in a failure.

Marek


Re: [Mesa-dev] [PATCH] st/va: move YUV content to deinterlaced buffer when reallocated for encoder

2017-08-25 Thread Christian König

Am 25.08.2017 um 17:15 schrieb Leo Liu:



On 08/25/2017 10:53 AM, Leo Liu wrote:



On 08/25/2017 02:57 AM, Christian König wrote:

Am 24.08.2017 um 20:49 schrieb Leo Liu:

v2: use deinterlace common function
v3: make sure deinterlace only

Signed-off-by: Leo Liu 
---
  src/gallium/state_trackers/va/picture.c | 22 --
  1 file changed, 16 insertions(+), 6 deletions(-)

diff --git a/src/gallium/state_trackers/va/picture.c 
b/src/gallium/state_trackers/va/picture.c

index 6c3c4fe..aa4062d 100644
--- a/src/gallium/state_trackers/va/picture.c
+++ b/src/gallium/state_trackers/va/picture.c
@@ -613,17 +613,22 @@ vlVaEndPicture(VADriverContextP ctx, 
VAContextID context_id)

 mtx_lock(>mutex);
 surf = handle_table_get(drv->htab, context->target_id);
 context->mpeg4.frame_num++;
-
 screen = context->decoder->context->screen;
 interlaced = screen->get_video_param(screen, 
context->decoder->profile,

context->decoder->entrypoint,
PIPE_VIDEO_CAP_SUPPORTS_INTERLACED);
   if (surf->buffer->interlaced != interlaced) {
-  surf->templat.interlaced = screen->get_video_param(screen, 
context->decoder->profile,

- PIPE_VIDEO_ENTRYPOINT_BITSTREAM,
- PIPE_VIDEO_CAP_PREFERS_INTERLACED);
-  realloc = true;
+  interlaced = screen->get_video_param(screen, 
context->decoder->profile,

+ context->decoder->entrypoint,
+ PIPE_VIDEO_CAP_PREFERS_INTERLACED);
+  if (!interlaced) {
+ /* The current cases for buffer reallocation are
+all from the interlaced to the deinterlaced,
+and there is no case for the other way around */
+ surf->templat.interlaced = false;
+ realloc = true;
+  }


Should we bail out with an error here when it's the other way around?
Although I cannot think of any case that would give us an interlaced 
buffer now, it's still a good idea to bail out here when it happens.

Will add it in v4.


It's not an error when, for example, the buffer is deinterlaced and the 
query returns interlaced. In that case we need to do nothing and just 
ignore it.

I have sent out v4, please ignore it, it won't work.


Well that's not correct either.

When the buffer is allocated as progressive and the codec doesn't 
support that we should certainly do something.


Either bail out as an error when we encode because we can't convert 
progressive->interlaced (just the other way around) and/or reallocate 
for decoding.


Christian.



Leo






Thanks,
Leo





Would be nice if we could at least sanely handle that case.

Apart from that it looks good to me,
Christian.


 }
   if (u_reduce_video_profile(context->templat.profile) == 
PIPE_VIDEO_FORMAT_JPEG &&
@@ -640,13 +645,18 @@ vlVaEndPicture(VADriverContextP ctx, 
VAContextID context_id)

 }
   if (realloc) {
-  surf->buffer->destroy(surf->buffer);
+  struct pipe_video_buffer *old_buf = surf->buffer;
  if (vlVaHandleSurfaceAllocate(ctx, surf, >templat) 
!= VA_STATUS_SUCCESS) {

+ old_buf->destroy(old_buf);
   mtx_unlock(>mutex);
   return VA_STATUS_ERROR_ALLOCATION_FAILED;
}
  +  if (context->decoder->entrypoint == 
PIPE_VIDEO_ENTRYPOINT_ENCODE)
+ vl_compositor_yuv_deint(>cstate, >compositor, 
old_buf, surf->buffer);

+
+  old_buf->destroy(old_buf);
context->target = surf->buffer;
 }











[Mesa-dev] [Bug 102017] Wrong colours in Cities Skyline

2017-08-25 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=102017

--- Comment #17 from Thomas Jollans  ---
Thanks everyone for looking into this. I can confirm that the issue was a
missing libtxc_dxtn for me too.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


Re: [Mesa-dev] Question for nir lower load uniform to scalar

2017-08-25 Thread Eric Anholt
Qiang Yu  writes:

> Hi Eric,
>
> I'm working on the lima gp compiler, which should benefit from nir
> lowering uniform loads to scalar.
> I noticed you wrote nir_lower_io_to_scalar.c, which supports lowering
> shader_in/shader_out,
> but left the uniform lowering in the vc4 driver. Is there any reason not
> to implement it in nir_lower_io_to_scalar.c?

I think my theory was that drivers would want different units for the
base/offset (bytes or dwords), so I left it in vc4.  Anyone else want to
weigh in on this?  vc4 wants indirect load offsets in units of bytes.

> I'm new to nir. I tried to add it, but the result does not seem correct
> after the optimization pass, so I must be missing something somewhere.
> Can anyone help point it out?

Your nir_lower_io.c code looks correct to me, so I'm not sure what might
be missing.  I'm not sure about using the component field, though -- for
VC4 all I want after lowering is a byte offset within the constant
buffer.
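
To illustrate the units question, here is a minimal standalone sketch of
splitting one vector uniform load into per-component scalar loads.  The
names (offset_unit, scalarize_uniform_load) are made up for this example
and are not NIR API; it assumes 32-bit components and a base that is
already expressed in the driver's chosen unit.

```c
#include <assert.h>
#include <stddef.h>

/* Unit in which a driver wants uniform offsets expressed. */
enum offset_unit {
   UNIT_BYTES,    /* e.g. what vc4 wants after lowering */
   UNIT_DWORDS
};

/* Writes one offset per component of a vector load starting at 'base'
 * and returns the number of scalar loads produced.  With 32-bit
 * components, consecutive components are 4 bytes or 1 dword apart. */
static size_t
scalarize_uniform_load(unsigned base, unsigned num_components,
                       enum offset_unit unit,
                       unsigned *out, size_t out_len)
{
   const unsigned step = (unit == UNIT_BYTES) ? 4 : 1;

   assert(num_components <= out_len);
   for (unsigned c = 0; c < num_components; c++)
      out[c] = base + c * step;

   return num_components;
}
```

A vec4 at byte offset 16 becomes scalar loads at 16, 20, 24 and 28, while
the same vec4 at dword offset 4 becomes loads at 4, 5, 6 and 7, which is
why a shared pass would need to know the unit.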




[Mesa-dev] [PATCH v3 10/12] anv: Use DRM sync objects to back fences whenever possible

2017-08-25 Thread Jason Ekstrand
In order to implement VK_KHR_external_fence, we need to back our fences
with something that's shareable.  Since the kernel wait interface for
sync objects already supports waiting for multiple fences in one go, it
makes anv_WaitForFences much simpler if we only have one type of fence.
---
 src/intel/vulkan/anv_batch_chain.c |   8 +++
 src/intel/vulkan/anv_device.c  |   2 +
 src/intel/vulkan/anv_private.h |   4 ++
 src/intel/vulkan/anv_queue.c   | 134 ++---
 4 files changed, 139 insertions(+), 9 deletions(-)

diff --git a/src/intel/vulkan/anv_batch_chain.c 
b/src/intel/vulkan/anv_batch_chain.c
index 0a0be8d..52c4510 100644
--- a/src/intel/vulkan/anv_batch_chain.c
+++ b/src/intel/vulkan/anv_batch_chain.c
@@ -1560,6 +1560,14 @@ anv_cmd_buffer_execbuf(struct anv_device *device,
 return result;
  break;
 
+  case ANV_FENCE_TYPE_SYNCOBJ:
+ result = anv_execbuf_add_syncobj(, impl->syncobj,
+  I915_EXEC_FENCE_SIGNAL,
+  >alloc);
+ if (result != VK_SUCCESS)
+return result;
+ break;
+
   default:
  unreachable("Invalid fence type");
   }
diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index a6d5215..2e0fa19 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -339,6 +339,8 @@ anv_physical_device_init(struct anv_physical_device *device,
device->has_exec_async = anv_gem_get_param(fd, I915_PARAM_HAS_EXEC_ASYNC);
device->has_exec_fence = anv_gem_get_param(fd, I915_PARAM_HAS_EXEC_FENCE);
device->has_syncobj = anv_gem_get_param(fd, 
I915_PARAM_HAS_EXEC_FENCE_ARRAY);
+   device->has_syncobj_wait = device->has_syncobj &&
+  anv_gem_supports_syncobj_wait(fd);
 
bool swizzled = anv_gem_get_bit6_swizzle(fd, I915_TILING_X);
 
diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index 7817dc0..f9537c2 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -646,6 +646,7 @@ struct anv_physical_device {
 boolhas_exec_async;
 boolhas_exec_fence;
 boolhas_syncobj;
+boolhas_syncobj_wait;
 
 uint32_teu_total;
 uint32_tsubslice_total;
@@ -1747,6 +1748,9 @@ struct anv_fence_impl {
  struct anv_bo bo;
  enum anv_bo_fence_state state;
   } bo;
+
+  /** DRM syncobj handle for syncobj-based fences */
+  uint32_t syncobj;
};
 };
 
diff --git a/src/intel/vulkan/anv_queue.c b/src/intel/vulkan/anv_queue.c
index df9b647..1c0de52 100644
--- a/src/intel/vulkan/anv_queue.c
+++ b/src/intel/vulkan/anv_queue.c
@@ -271,17 +271,29 @@ VkResult anv_CreateFence(
if (fence == NULL)
   return vk_error(VK_ERROR_OUT_OF_HOST_MEMORY);
 
-   fence->permanent.type = ANV_FENCE_TYPE_BO;
+   if (device->instance->physicalDevice.has_syncobj_wait) {
+  fence->permanent.type = ANV_FENCE_TYPE_SYNCOBJ;
 
-   VkResult result = anv_bo_pool_alloc(>batch_bo_pool,
-   >permanent.bo.bo, 4096);
-   if (result != VK_SUCCESS)
-  return result;
+  uint32_t create_flags = 0;
+  if (pCreateInfo->flags & VK_FENCE_CREATE_SIGNALED_BIT)
+ create_flags |= DRM_SYNCOBJ_CREATE_SIGNALED;
 
-   if (pCreateInfo->flags & VK_FENCE_CREATE_SIGNALED_BIT) {
-  fence->permanent.bo.state = ANV_BO_FENCE_STATE_SIGNALED;
+  fence->permanent.syncobj = anv_gem_syncobj_create(device, create_flags);
+  if (!fence->permanent.syncobj)
+ return vk_error(VK_ERROR_OUT_OF_HOST_MEMORY);
} else {
-  fence->permanent.bo.state = ANV_BO_FENCE_STATE_RESET;
+  fence->permanent.type = ANV_FENCE_TYPE_BO;
+
+  VkResult result = anv_bo_pool_alloc(>batch_bo_pool,
+  >permanent.bo.bo, 4096);
+  if (result != VK_SUCCESS)
+ return result;
+
+  if (pCreateInfo->flags & VK_FENCE_CREATE_SIGNALED_BIT) {
+ fence->permanent.bo.state = ANV_BO_FENCE_STATE_SIGNALED;
+  } else {
+ fence->permanent.bo.state = ANV_BO_FENCE_STATE_RESET;
+  }
}
 
*pFence = anv_fence_to_handle(fence);
@@ -301,6 +313,10 @@ anv_fence_impl_cleanup(struct anv_device *device,
case ANV_FENCE_TYPE_BO:
   anv_bo_pool_free(>batch_bo_pool, >bo.bo);
   return;
+
+   case ANV_FENCE_TYPE_SYNCOBJ:
+  anv_gem_syncobj_destroy(device, impl->syncobj);
+  return;
}
 
unreachable("Invalid fence type");
@@ -328,6 +344,8 @@ VkResult anv_ResetFences(
 uint32_tfenceCount,
 const VkFence*  pFences)
 {
+   ANV_FROM_HANDLE(anv_device, device, _device);
+
for (uint32_t 

[Mesa-dev] [PATCH v3 03/12] anv/wsi: Use QueueSubmit to trigger the fence in AcquireNextImage

2017-08-25 Thread Jason Ekstrand
---
 src/intel/vulkan/anv_wsi.c | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/src/intel/vulkan/anv_wsi.c b/src/intel/vulkan/anv_wsi.c
index 9369f26..00edb22 100644
--- a/src/intel/vulkan/anv_wsi.c
+++ b/src/intel/vulkan/anv_wsi.c
@@ -364,22 +364,25 @@ VkResult anv_GetSwapchainImagesKHR(
 }
 
 VkResult anv_AcquireNextImageKHR(
-VkDevice device,
+VkDevice _device,
 VkSwapchainKHR   _swapchain,
 uint64_t timeout,
 VkSemaphore  semaphore,
 VkFence  _fence,
 uint32_t*pImageIndex)
 {
+   ANV_FROM_HANDLE(anv_device, device, _device);
ANV_FROM_HANDLE(wsi_swapchain, swapchain, _swapchain);
ANV_FROM_HANDLE(anv_fence, fence, _fence);
 
VkResult result = swapchain->acquire_next_image(swapchain, timeout,
semaphore, pImageIndex);
 
-   /* Thanks to implicit sync, the image is ready immediately. */
+   /* Thanks to implicit sync, the image is ready immediately.  However, we
+* should wait for the current GPU state to finish.
+*/
if (fence)
-  fence->state = ANV_FENCE_STATE_SIGNALED;
+  anv_QueueSubmit(anv_queue_to_handle(>queue), 0, NULL, _fence);
 
return result;
 }
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH v3 11/12] anv: Implement VK_KHR_external_fence

2017-08-25 Thread Jason Ekstrand
---
 src/intel/vulkan/anv_batch_chain.c |  19 -
 src/intel/vulkan/anv_extensions.py |   5 ++
 src/intel/vulkan/anv_queue.c   | 142 -
 3 files changed, 161 insertions(+), 5 deletions(-)

diff --git a/src/intel/vulkan/anv_batch_chain.c 
b/src/intel/vulkan/anv_batch_chain.c
index 52c4510..4f5137c 100644
--- a/src/intel/vulkan/anv_batch_chain.c
+++ b/src/intel/vulkan/anv_batch_chain.c
@@ -1549,8 +1549,20 @@ anv_cmd_buffer_execbuf(struct anv_device *device,
}
 
if (fence) {
-  assert(fence->temporary.type == ANV_FENCE_TYPE_NONE);
-  struct anv_fence_impl *impl = >permanent;
+  /* Under most circumstances, out fences won't be temporary.  However,
+   * the spec does allow it for opaque_fd.  From the Vulkan 1.0.53 spec:
+   *
+   *"If the import is temporary, the implementation must restore the
+   *semaphore to its prior permanent state after submitting the next
+   *semaphore wait operation."
+   *
+   * The spec says nothing whatsoever about signal operations on
+   * temporarily imported semaphores so it appears they are allowed.
+   * There are also CTS tests that require this to work.
+   */
+  struct anv_fence_impl *impl =
+ fence->temporary.type != ANV_FENCE_TYPE_NONE ?
+ >temporary : >permanent;
 
   switch (impl->type) {
   case ANV_FENCE_TYPE_BO:
@@ -1617,6 +1629,9 @@ anv_cmd_buffer_execbuf(struct anv_device *device,
}
 
if (fence && fence->permanent.type == ANV_FENCE_TYPE_BO) {
+  /* BO fences can't be shared, so they can't be temporary. */
+  assert(fence->temporary.type == ANV_FENCE_TYPE_NONE);
+
   /* Once the execbuf has returned, we need to set the fence state to
* SUBMITTED.  We can't do this before calling execbuf because
* anv_GetFenceStatus does take the global device lock before checking
diff --git a/src/intel/vulkan/anv_extensions.py 
b/src/intel/vulkan/anv_extensions.py
index 3252e0f..6b3d72e 100644
--- a/src/intel/vulkan/anv_extensions.py
+++ b/src/intel/vulkan/anv_extensions.py
@@ -47,6 +47,11 @@ class Extension:
 EXTENSIONS = [
 Extension('VK_KHR_dedicated_allocation',  1, True),
 Extension('VK_KHR_descriptor_update_template',1, True),
+Extension('VK_KHR_external_fence',1,
+  'device->has_syncobj_wait'),
+Extension('VK_KHR_external_fence_capabilities',   1, True),
+Extension('VK_KHR_external_fence_fd', 1,
+  'device->has_syncobj_wait'),
 Extension('VK_KHR_external_memory',   1, True),
 Extension('VK_KHR_external_memory_capabilities',  1, True),
 Extension('VK_KHR_external_memory_fd',1, True),
diff --git a/src/intel/vulkan/anv_queue.c b/src/intel/vulkan/anv_queue.c
index 1c0de52..a954f65 100644
--- a/src/intel/vulkan/anv_queue.c
+++ b/src/intel/vulkan/anv_queue.c
@@ -349,7 +349,18 @@ VkResult anv_ResetFences(
for (uint32_t i = 0; i < fenceCount; i++) {
   ANV_FROM_HANDLE(anv_fence, fence, pFences[i]);
 
-  assert(fence->temporary.type == ANV_FENCE_TYPE_NONE);
+  /* From the Vulkan 1.0.53 spec:
+   *
+   *"If any member of pFences currently has its payload imported with
+   *temporary permanence, that fence’s prior permanent payload is
+   *first restored. The remaining operations described therefore
+   *operate on the restored payload.
+   */
+  if (fence->temporary.type != ANV_FENCE_TYPE_NONE) {
+ anv_fence_impl_cleanup(device, >temporary);
+ fence->temporary.type = ANV_FENCE_TYPE_NONE;
+  }
+
   struct anv_fence_impl *impl = >permanent;
 
   switch (impl->type) {
@@ -379,11 +390,14 @@ VkResult anv_GetFenceStatus(
if (unlikely(device->lost))
   return VK_ERROR_DEVICE_LOST;
 
-   assert(fence->temporary.type == ANV_FENCE_TYPE_NONE);
-   struct anv_fence_impl *impl = >permanent;
+   struct anv_fence_impl *impl =
+  fence->temporary.type != ANV_FENCE_TYPE_NONE ?
+  >temporary : >permanent;
 
switch (impl->type) {
case ANV_FENCE_TYPE_BO:
+  /* BO fences don't support import/export */
+  assert(fence->temporary.type == ANV_FENCE_TYPE_NONE);
   switch (impl->bo.state) {
   case ANV_BO_FENCE_STATE_RESET:
  /* If it hasn't even been sent off to the GPU yet, it's not ready */
@@ -665,6 +679,128 @@ VkResult anv_WaitForFences(
}
 }
 
+void anv_GetPhysicalDeviceExternalFencePropertiesKHR(
+VkPhysicalDevicephysicalDevice,
+const VkPhysicalDeviceExternalFenceInfoKHR* pExternalFenceInfo,
+VkExternalFencePropertiesKHR*   pExternalFenceProperties)
+{
+   ANV_FROM_HANDLE(anv_physical_device, device, physicalDevice);
+
+   switch (pExternalFenceInfo->handleType) {
+   case VK_EXTERNAL_FENCE_HANDLE_TYPE_OPAQUE_FD_BIT_KHR:
+  if (device->has_syncobj_wait) {
+ 

[Mesa-dev] [PATCH v3 02/12] anv: Rework fences to work more like BO semaphores

2017-08-25 Thread Jason Ekstrand
This commit changes fences to work a bit more like BO semaphores.
Instead of the fence being a batch, it's simply a BO that gets added
to the validation list for the last execbuf call in the QueueSubmit
operation.  It's a bit annoying finding the last submit in the execbuf
but this allows us to avoid the dummy execbuf.
---
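Note (illustrative only, not part of the patch): the "fence goes on the
last execbuf" rule from the commit message can be sketched in standalone
C.  The helper names are hypothetical; in the patch the same decisions
are made inline in anv_QueueSubmit().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if submit 'i' out of 'submit_count' should carry the fence.
 * Only the last submit does, so the fence signals after all of them. */
static bool
submit_carries_fence(uint32_t i, uint32_t submit_count, bool have_fence)
{
   assert(i < submit_count);
   return have_fence && i == submit_count - 1;
}

/* True if a dummy execbuf is needed: a fence was passed but there is
 * nothing to submit, so the kernel needs something to wait on. */
static bool
needs_dummy_execbuf(uint32_t submit_count, bool have_fence)
{
   return have_fence && submit_count == 0;
}
```

For three submits with a fence, only index 2 carries it; for zero submits
with a fence, the dummy execbuf path is taken instead.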
 src/intel/vulkan/anv_batch_chain.c | 26 ++-
 src/intel/vulkan/anv_private.h |  5 +--
 src/intel/vulkan/anv_queue.c   | 88 +++---
 3 files changed, 51 insertions(+), 68 deletions(-)

diff --git a/src/intel/vulkan/anv_batch_chain.c 
b/src/intel/vulkan/anv_batch_chain.c
index 1e7455f..ef6ada4 100644
--- a/src/intel/vulkan/anv_batch_chain.c
+++ b/src/intel/vulkan/anv_batch_chain.c
@@ -1451,8 +1451,11 @@ anv_cmd_buffer_execbuf(struct anv_device *device,
const VkSemaphore *in_semaphores,
uint32_t num_in_semaphores,
const VkSemaphore *out_semaphores,
-   uint32_t num_out_semaphores)
+   uint32_t num_out_semaphores,
+   VkFence _fence)
 {
+   ANV_FROM_HANDLE(anv_fence, fence, _fence);
+
struct anv_execbuf execbuf;
anv_execbuf_init();
 
@@ -1545,6 +1548,13 @@ anv_cmd_buffer_execbuf(struct anv_device *device,
   }
}
 
+   if (fence) {
+  result = anv_execbuf_add_bo(, >bo, NULL,
+  EXEC_OBJECT_WRITE, >alloc);
+  if (result != VK_SUCCESS)
+ return result;
+   }
+
if (cmd_buffer)
   result = setup_execbuf_for_cmd_buffer(, cmd_buffer);
else
@@ -1588,6 +1598,20 @@ anv_cmd_buffer_execbuf(struct anv_device *device,
   anv_semaphore_reset_temporary(device, semaphore);
}
 
+   if (fence) {
+  /* Once the execbuf has returned, we need to set the fence state to
+   * SUBMITTED.  We can't do this before calling execbuf because
+   * anv_GetFenceStatus does take the global device lock before checking
+   * fence->state.
+   *
+   * We set the fence state to SUBMITTED regardless of whether or not the
+   * execbuf succeeds because we need to ensure that vkWaitForFences() and
+   * vkGetFenceStatus() return a valid result (VK_ERROR_DEVICE_LOST or
+   * VK_SUCCESS) in a finite amount of time even if execbuf fails.
+   */
+  fence->state = ANV_FENCE_STATE_SUBMITTED;
+   }
+
if (result == VK_SUCCESS && need_out_fence) {
   int out_fence = execbuf.execbuf.rsvd2 >> 32;
   for (uint32_t i = 0; i < num_out_semaphores; i++) {
diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index 6b24144..715e0ad 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -1642,7 +1642,8 @@ VkResult anv_cmd_buffer_execbuf(struct anv_device *device,
 const VkSemaphore *in_semaphores,
 uint32_t num_in_semaphores,
 const VkSemaphore *out_semaphores,
-uint32_t num_out_semaphores);
+uint32_t num_out_semaphores,
+VkFence fence);
 
 VkResult anv_cmd_buffer_reset(struct anv_cmd_buffer *cmd_buffer);
 
@@ -1720,8 +1721,6 @@ enum anv_fence_state {
 
 struct anv_fence {
struct anv_bo bo;
-   struct drm_i915_gem_execbuffer2 execbuf;
-   struct drm_i915_gem_exec_object2 exec2_objects[1];
enum anv_fence_state state;
 };
 
diff --git a/src/intel/vulkan/anv_queue.c b/src/intel/vulkan/anv_queue.c
index 03769be..5023172 100644
--- a/src/intel/vulkan/anv_queue.c
+++ b/src/intel/vulkan/anv_queue.c
@@ -114,10 +114,9 @@ VkResult anv_QueueSubmit(
 VkQueue _queue,
 uint32_tsubmitCount,
 const VkSubmitInfo* pSubmits,
-VkFence _fence)
+VkFence fence)
 {
ANV_FROM_HANDLE(anv_queue, queue, _queue);
-   ANV_FROM_HANDLE(anv_fence, fence, _fence);
struct anv_device *device = queue->device;
 
/* Query for device status prior to submitting.  Technically, we don't need
@@ -158,7 +157,20 @@ VkResult anv_QueueSubmit(
 */
pthread_mutex_lock(>mutex);
 
+   if (fence && submitCount == 0) {
+  /* If we don't have any command buffers, we need to submit a dummy
+   * batch to give GEM something to wait on.  We could, potentially,
+   * come up with something more efficient but this shouldn't be a
+   * common case.
+   */
+  result = anv_cmd_buffer_execbuf(device, NULL, NULL, 0, NULL, 0, fence);
+  goto out;
+   }
+
for (uint32_t i = 0; i < submitCount; i++) {
+  /* Fence for this submit.  NULL for all but the last one */
+  VkFence submit_fence = (i == submitCount - 1) ? fence : NULL;
+
   if (pSubmits[i].commandBufferCount == 0) {
  /* If we don't have any 

[Mesa-dev] [PATCH v3 08/12] anv/gem: Add a flags parameter to syncobj_create

2017-08-25 Thread Jason Ekstrand
---
 src/intel/vulkan/anv_gem.c   | 4 ++--
 src/intel/vulkan/anv_gem_stubs.c | 2 +-
 src/intel/vulkan/anv_private.h   | 2 +-
 src/intel/vulkan/anv_queue.c | 2 +-
 4 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/src/intel/vulkan/anv_gem.c b/src/intel/vulkan/anv_gem.c
index 57a8b79..9bd37f4 100644
--- a/src/intel/vulkan/anv_gem.c
+++ b/src/intel/vulkan/anv_gem.c
@@ -438,10 +438,10 @@ anv_gem_sync_file_merge(struct anv_device *device, int 
fd1, int fd2)
 }
 
 uint32_t
-anv_gem_syncobj_create(struct anv_device *device)
+anv_gem_syncobj_create(struct anv_device *device, uint32_t flags)
 {
struct drm_syncobj_create args = {
-  .flags = 0,
+  .flags = flags,
};
 
int ret = anv_ioctl(device->fd, DRM_IOCTL_SYNCOBJ_CREATE, );
diff --git a/src/intel/vulkan/anv_gem_stubs.c b/src/intel/vulkan/anv_gem_stubs.c
index c9f05ee..a092869 100644
--- a/src/intel/vulkan/anv_gem_stubs.c
+++ b/src/intel/vulkan/anv_gem_stubs.c
@@ -188,7 +188,7 @@ anv_gem_sync_file_merge(struct anv_device *device, int fd1, 
int fd2)
 }
 
 uint32_t
-anv_gem_syncobj_create(struct anv_device *device)
+anv_gem_syncobj_create(struct anv_device *device, uint32_t flags)
 {
unreachable("Unused");
 }
diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index 3b50c49..9b3efda 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -805,7 +805,7 @@ int anv_gem_set_caching(struct anv_device *device, uint32_t 
gem_handle, uint32_t
 int anv_gem_set_domain(struct anv_device *device, uint32_t gem_handle,
uint32_t read_domains, uint32_t write_domain);
 int anv_gem_sync_file_merge(struct anv_device *device, int fd1, int fd2);
-uint32_t anv_gem_syncobj_create(struct anv_device *device);
+uint32_t anv_gem_syncobj_create(struct anv_device *device, uint32_t flags);
 void anv_gem_syncobj_destroy(struct anv_device *device, uint32_t handle);
 int anv_gem_syncobj_handle_to_fd(struct anv_device *device, uint32_t handle);
 uint32_t anv_gem_syncobj_fd_to_handle(struct anv_device *device, int fd);
diff --git a/src/intel/vulkan/anv_queue.c b/src/intel/vulkan/anv_queue.c
index 23f8d7d..df9b647 100644
--- a/src/intel/vulkan/anv_queue.c
+++ b/src/intel/vulkan/anv_queue.c
@@ -582,7 +582,7 @@ VkResult anv_CreateSemaphore(
   assert(handleTypes == 
VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_OPAQUE_FD_BIT_KHR);
   if (device->instance->physicalDevice.has_syncobj) {
  semaphore->permanent.type = ANV_SEMAPHORE_TYPE_DRM_SYNCOBJ;
- semaphore->permanent.syncobj = anv_gem_syncobj_create(device);
+ semaphore->permanent.syncobj = anv_gem_syncobj_create(device, 0);
  if (!semaphore->permanent.syncobj) {
 vk_free2(>alloc, pAllocator, semaphore);
 return vk_error(VK_ERROR_OUT_OF_HOST_MEMORY);
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH v3 06/12] vulkan/util: Add a vk_zalloc helper

2017-08-25 Thread Jason Ekstrand
---
 src/vulkan/util/vk_alloc.h | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/src/vulkan/util/vk_alloc.h b/src/vulkan/util/vk_alloc.h
index 2915021..f58a806 100644
--- a/src/vulkan/util/vk_alloc.h
+++ b/src/vulkan/util/vk_alloc.h
@@ -37,6 +37,20 @@ vk_alloc(const VkAllocationCallbacks *alloc,
 }
 
 static inline void *
+vk_zalloc(const VkAllocationCallbacks *alloc,
+  size_t size, size_t align,
+  VkSystemAllocationScope scope)
+{
+   void *mem = vk_alloc(alloc, size, align, scope);
+   if (mem == NULL)
+  return NULL;
+
+   memset(mem, 0, size);
+
+   return mem;
+}
+
+static inline void *
 vk_realloc(const VkAllocationCallbacks *alloc,
void *ptr, size_t size, size_t align,
VkSystemAllocationScope scope)
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH v3 04/12] anv: Pull the guts of anv_fence into anv_fence_impl

2017-08-25 Thread Jason Ekstrand
This is just a refactor, similar to what we did for semaphores, in
preparation for handling VK_KHR_external_fence.
---
 src/intel/vulkan/anv_batch_chain.c |  22 --
 src/intel/vulkan/anv_private.h |  42 ++-
 src/intel/vulkan/anv_queue.c   | 144 ++---
 3 files changed, 159 insertions(+), 49 deletions(-)

diff --git a/src/intel/vulkan/anv_batch_chain.c 
b/src/intel/vulkan/anv_batch_chain.c
index ef6ada4..775009c 100644
--- a/src/intel/vulkan/anv_batch_chain.c
+++ b/src/intel/vulkan/anv_batch_chain.c
@@ -1549,10 +1549,20 @@ anv_cmd_buffer_execbuf(struct anv_device *device,
}
 
if (fence) {
-  result = anv_execbuf_add_bo(, >bo, NULL,
-  EXEC_OBJECT_WRITE, >alloc);
-  if (result != VK_SUCCESS)
- return result;
+  assert(fence->temporary.type == ANV_FENCE_TYPE_NONE);
+  struct anv_fence_impl *impl = &fence->permanent;
+
+  switch (impl->type) {
+  case ANV_FENCE_TYPE_BO:
+ result = anv_execbuf_add_bo(, >bo.bo, NULL,
+ EXEC_OBJECT_WRITE, >alloc);
+ if (result != VK_SUCCESS)
+return result;
+ break;
+
+  default:
+ unreachable("Invalid fence type");
+  }
}
 
if (cmd_buffer)
@@ -1598,7 +1608,7 @@ anv_cmd_buffer_execbuf(struct anv_device *device,
   anv_semaphore_reset_temporary(device, semaphore);
}
 
-   if (fence) {
+   if (fence && fence->permanent.type == ANV_FENCE_TYPE_BO) {
   /* Once the execbuf has returned, we need to set the fence state to
* SUBMITTED.  We can't do this before calling execbuf because
* anv_GetFenceStatus does take the global device lock before checking
@@ -1609,7 +1619,7 @@ anv_cmd_buffer_execbuf(struct anv_device *device,
* vkGetFenceStatus() return a valid result (VK_ERROR_DEVICE_LOST or
* VK_SUCCESS) in a finite amount of time even if execbuf fails.
*/
-  fence->state = ANV_FENCE_STATE_SUBMITTED;
+  fence->permanent.bo.state = ANV_FENCE_STATE_SUBMITTED;
}
 
if (result == VK_SUCCESS && need_out_fence) {
diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index 715e0ad..ab6e5e2 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -1707,6 +1707,12 @@ anv_cmd_buffer_alloc_blorp_binding_table(struct 
anv_cmd_buffer *cmd_buffer,
 
 void anv_cmd_buffer_dump(struct anv_cmd_buffer *cmd_buffer);
 
+enum anv_fence_type {
+   ANV_FENCE_TYPE_NONE = 0,
+   ANV_FENCE_TYPE_BO,
+   ANV_FENCE_TYPE_SYNCOBJ,
+};
+
 enum anv_fence_state {
/** Indicates that this is a new (or newly reset fence) */
ANV_FENCE_STATE_RESET,
@@ -1719,9 +1725,41 @@ enum anv_fence_state {
ANV_FENCE_STATE_SIGNALED,
 };
 
+struct anv_fence_impl {
+   enum anv_fence_type type;
+
+   union {
+  /** Fence implementation for BO fences
+   *
+   * These fences use a BO and a set of CPU-tracked state flags.  The BO
+   * is added to the object list of the last execbuf call in a QueueSubmit
+   * and is marked EXEC_WRITE.  The state flags track when the BO has been
+   * submitted to the kernel.  We need to do this because Vulkan lets you
+   * wait on a fence that has not yet been submitted and I915_GEM_BUSY
+   * will say it's idle in this case.
+   */
+  struct {
+ struct anv_bo bo;
+ enum anv_fence_state state;
+  } bo;
+   };
+};
+
 struct anv_fence {
-   struct anv_bo bo;
-   enum anv_fence_state state;
+   /* Permanent fence state.  Every fence has some form of permanent state
+* (type != ANV_FENCE_TYPE_NONE).  This may be a BO to fence on (for
+* cross-process fences) or it could just be a dummy for use internally.
+*/
+   struct anv_fence_impl permanent;
+
+   /* Temporary fence state.  A fence *may* have temporary state.  That state
+* is added to the fence by an import operation and is reset back to
+* ANV_FENCE_TYPE_NONE when the fence is reset.  A fence with temporary
+* state cannot be signaled because the fence must already be signaled
+* before the temporary state can be exported from the fence in the other
+* process and imported here.
+*/
+   struct anv_fence_impl temporary;
 };
 
 struct anv_event {
diff --git a/src/intel/vulkan/anv_queue.c b/src/intel/vulkan/anv_queue.c
index 5023172..f8a2f64 100644
--- a/src/intel/vulkan/anv_queue.c
+++ b/src/intel/vulkan/anv_queue.c
@@ -262,23 +262,26 @@ VkResult anv_CreateFence(
 VkFence*pFence)
 {
ANV_FROM_HANDLE(anv_device, device, _device);
-   struct anv_bo fence_bo;
struct anv_fence *fence;
 
assert(pCreateInfo->sType == VK_STRUCTURE_TYPE_FENCE_CREATE_INFO);
 
-   VkResult result = anv_bo_pool_alloc(>batch_bo_pool, _bo, 
4096);
+   fence = vk_zalloc2(>alloc, pAllocator, sizeof(*fence), 8,
+  VK_SYSTEM_ALLOCATION_SCOPE_OBJECT);
+   if (fence == NULL)
+  return 

[Mesa-dev] [PATCH v3 12/12] anv: Add support for the SYNC_FD handle type for fences

2017-08-25 Thread Jason Ekstrand
---
 src/intel/vulkan/anv_gem.c   | 28 +
 src/intel/vulkan/anv_gem_stubs.c | 13 ++
 src/intel/vulkan/anv_private.h   |  4 +++
 src/intel/vulkan/anv_queue.c | 53 +++-
 4 files changed, 87 insertions(+), 11 deletions(-)

diff --git a/src/intel/vulkan/anv_gem.c b/src/intel/vulkan/anv_gem.c
index 8283117..3994c6b 100644
--- a/src/intel/vulkan/anv_gem.c
+++ b/src/intel/vulkan/anv_gem.c
@@ -489,6 +489,34 @@ anv_gem_syncobj_fd_to_handle(struct anv_device *device, 
int fd)
return args.handle;
 }
 
+int
+anv_gem_syncobj_export_sync_file(struct anv_device *device, uint32_t handle)
+{
+   struct drm_syncobj_handle args = {
+  .handle = handle,
+  .flags = DRM_SYNCOBJ_HANDLE_TO_FD_FLAGS_EXPORT_SYNC_FILE,
+   };
+
+   int ret = anv_ioctl(device->fd, DRM_IOCTL_SYNCOBJ_HANDLE_TO_FD, &args);
+   if (ret)
+  return -1;
+
+   return args.fd;
+}
+
+int
+anv_gem_syncobj_import_sync_file(struct anv_device *device,
+ uint32_t handle, int fd)
+{
+   struct drm_syncobj_handle args = {
+  .handle = handle,
+  .fd = fd,
+  .flags = DRM_SYNCOBJ_FD_TO_HANDLE_FLAGS_IMPORT_SYNC_FILE,
+   };
+
+   return anv_ioctl(device->fd, DRM_IOCTL_SYNCOBJ_FD_TO_HANDLE, &args);
+}
+
 void
 anv_gem_syncobj_reset(struct anv_device *device, uint32_t handle)
 {
diff --git a/src/intel/vulkan/anv_gem_stubs.c b/src/intel/vulkan/anv_gem_stubs.c
index 36700d7..02527b5 100644
--- a/src/intel/vulkan/anv_gem_stubs.c
+++ b/src/intel/vulkan/anv_gem_stubs.c
@@ -187,6 +187,19 @@ anv_gem_sync_file_merge(struct anv_device *device, int 
fd1, int fd2)
unreachable("Unused");
 }
 
+int
+anv_gem_syncobj_export_sync_file(struct anv_device *device, uint32_t handle)
+{
+   unreachable("Unused");
+}
+
+int
+anv_gem_syncobj_import_sync_file(struct anv_device *device,
+ uint32_t handle, int fd)
+{
+   unreachable("Unused");
+}
+
 uint32_t
 anv_gem_syncobj_create(struct anv_device *device, uint32_t flags)
 {
diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index f9537c2..b30b71f 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -810,6 +810,10 @@ uint32_t anv_gem_syncobj_create(struct anv_device *device, 
uint32_t flags);
 void anv_gem_syncobj_destroy(struct anv_device *device, uint32_t handle);
 int anv_gem_syncobj_handle_to_fd(struct anv_device *device, uint32_t handle);
 uint32_t anv_gem_syncobj_fd_to_handle(struct anv_device *device, int fd);
+int anv_gem_syncobj_export_sync_file(struct anv_device *device,
+ uint32_t handle);
+int anv_gem_syncobj_import_sync_file(struct anv_device *device,
+ uint32_t handle, int fd);
 void anv_gem_syncobj_reset(struct anv_device *device, uint32_t handle);
 bool anv_gem_supports_syncobj_wait(int fd);
 int anv_gem_syncobj_wait(struct anv_device *device,
diff --git a/src/intel/vulkan/anv_queue.c b/src/intel/vulkan/anv_queue.c
index a954f65..429bac9 100644
--- a/src/intel/vulkan/anv_queue.c
+++ b/src/intel/vulkan/anv_queue.c
@@ -688,11 +688,14 @@ void anv_GetPhysicalDeviceExternalFencePropertiesKHR(
 
switch (pExternalFenceInfo->handleType) {
case VK_EXTERNAL_FENCE_HANDLE_TYPE_OPAQUE_FD_BIT_KHR:
+   case VK_EXTERNAL_FENCE_HANDLE_TYPE_SYNC_FD_BIT_KHR:
   if (device->has_syncobj_wait) {
  pExternalFenceProperties->exportFromImportedHandleTypes =
-VK_EXTERNAL_FENCE_HANDLE_TYPE_OPAQUE_FD_BIT_KHR;
+VK_EXTERNAL_FENCE_HANDLE_TYPE_OPAQUE_FD_BIT_KHR |
+VK_EXTERNAL_FENCE_HANDLE_TYPE_SYNC_FD_BIT_KHR;
  pExternalFenceProperties->compatibleHandleTypes =
-VK_EXTERNAL_FENCE_HANDLE_TYPE_OPAQUE_FD_BIT_KHR;
+VK_EXTERNAL_FENCE_HANDLE_TYPE_OPAQUE_FD_BIT_KHR |
+VK_EXTERNAL_FENCE_HANDLE_TYPE_SYNC_FD_BIT_KHR;
  pExternalFenceProperties->externalFenceFeatures =
 VK_EXTERNAL_FENCE_FEATURE_EXPORTABLE_BIT_KHR |
 VK_EXTERNAL_FENCE_FEATURE_IMPORTABLE_BIT_KHR;
@@ -732,22 +735,41 @@ VkResult anv_ImportFenceFdKHR(
   if (!new_impl.syncobj)
  return vk_error(VK_ERROR_INVALID_EXTERNAL_HANDLE_KHR);
 
-  /* From the Vulkan 1.0.53 spec:
-   *
-   *"Importing a fence payload from a file descriptor transfers
-   *ownership of the file descriptor from the application to the
-   *Vulkan implementation. The application must not perform any
-   *operations on the file descriptor after a successful import."
-   *
-   * If the import fails, we leave the file descriptor open.
+  break;
+
+   case VK_EXTERNAL_FENCE_HANDLE_TYPE_SYNC_FD_BIT_KHR:
+  /* Sync files are a bit tricky.  Because we want to continue using the
+   * syncobj implementation of WaitForFences, we don't use the sync file
+   * directly but instead import it into a syncobj.
*/
-  close(fd);
+  new_impl.type = 

[Mesa-dev] [PATCH v3 09/12] anv/gem: Add support for syncobj wait and reset

2017-08-25 Thread Jason Ekstrand
---
 src/intel/vulkan/anv_gem.c   | 62 
 src/intel/vulkan/anv_gem_stubs.c | 20 +
 src/intel/vulkan/anv_private.h   |  5 
 3 files changed, 87 insertions(+)

diff --git a/src/intel/vulkan/anv_gem.c b/src/intel/vulkan/anv_gem.c
index 9bd37f4..8283117 100644
--- a/src/intel/vulkan/anv_gem.c
+++ b/src/intel/vulkan/anv_gem.c
@@ -488,3 +488,65 @@ anv_gem_syncobj_fd_to_handle(struct anv_device *device, 
int fd)
 
return args.handle;
 }
+
+void
+anv_gem_syncobj_reset(struct anv_device *device, uint32_t handle)
+{
+   struct drm_syncobj_array args = {
+  .handles = (uint64_t)(uintptr_t)&handle,
+  .count_handles = 1,
+   };
+
+   anv_ioctl(device->fd, DRM_IOCTL_SYNCOBJ_RESET, &args);
+}
+
+bool
+anv_gem_supports_syncobj_wait(int fd)
+{
+   int ret;
+
+   struct drm_syncobj_create create = {
+  .flags = 0,
+   };
+   ret = anv_ioctl(fd, DRM_IOCTL_SYNCOBJ_CREATE, &create);
+   if (ret)
+  return false;
+
+   uint32_t syncobj = create.handle;
+
+   struct drm_syncobj_wait wait = {
+  .handles = (uint64_t)(uintptr_t)&syncobj,
+  .count_handles = 1,
+  .timeout_nsec = 0,
+  .flags = DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT,
+   };
+   ret = anv_ioctl(fd, DRM_IOCTL_SYNCOBJ_WAIT, &wait);
+
+   struct drm_syncobj_destroy destroy = {
+  .handle = syncobj,
+   };
+   anv_ioctl(fd, DRM_IOCTL_SYNCOBJ_DESTROY, &destroy);
+
+   /* If it timed out, then we have the ioctl and it supports the
+* DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT flag.
+*/
+   return ret == -1 && errno == ETIME;
+}
+
+int
+anv_gem_syncobj_wait(struct anv_device *device,
+ uint32_t *handles, uint32_t num_handles,
+ int64_t abs_timeout_ns, bool wait_all)
+{
+   struct drm_syncobj_wait args = {
+  .handles = (uint64_t)(uintptr_t)handles,
+  .count_handles = num_handles,
+  .timeout_nsec = abs_timeout_ns,
+  .flags = DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT,
+   };
+
+   if (wait_all)
+  args.flags |= DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL;
+
+   return anv_ioctl(device->fd, DRM_IOCTL_SYNCOBJ_WAIT, &args);
+}
diff --git a/src/intel/vulkan/anv_gem_stubs.c b/src/intel/vulkan/anv_gem_stubs.c
index a092869..36700d7 100644
--- a/src/intel/vulkan/anv_gem_stubs.c
+++ b/src/intel/vulkan/anv_gem_stubs.c
@@ -210,3 +210,23 @@ anv_gem_syncobj_fd_to_handle(struct anv_device *device, 
int fd)
 {
unreachable("Unused");
 }
+
+void
+anv_gem_syncobj_reset(struct anv_device *device, uint32_t handle)
+{
+   unreachable("Unused");
+}
+
+bool
+anv_gem_supports_syncobj_wait(int fd)
+{
+   return false;
+}
+
+int
+anv_gem_syncobj_wait(struct anv_device *device,
+ uint32_t *handles, uint32_t num_handles,
+ int64_t abs_timeout_ns, bool wait_all)
+{
+   unreachable("Unused");
+}
diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index 9b3efda..7817dc0 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -809,6 +809,11 @@ uint32_t anv_gem_syncobj_create(struct anv_device *device, 
uint32_t flags);
 void anv_gem_syncobj_destroy(struct anv_device *device, uint32_t handle);
 int anv_gem_syncobj_handle_to_fd(struct anv_device *device, uint32_t handle);
 uint32_t anv_gem_syncobj_fd_to_handle(struct anv_device *device, int fd);
+void anv_gem_syncobj_reset(struct anv_device *device, uint32_t handle);
+bool anv_gem_supports_syncobj_wait(int fd);
+int anv_gem_syncobj_wait(struct anv_device *device,
+ uint32_t *handles, uint32_t num_handles,
+ int64_t abs_timeout_ns, bool wait_all);
 
 VkResult anv_bo_init_new(struct anv_bo *bo, struct anv_device *device, 
uint64_t size);
 
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH v3 07/12] drm-uapi/drm: Pull in new syncobj uabi

2017-08-25 Thread Jason Ekstrand
This adds the DRM_SYNCOBJ_CREATE_SIGNALED flag as well as the ioctls:

 - DRM_IOCTL_SYNCOBJ_WAIT
 - DRM_IOCTL_SYNCOBJ_RESET
 - DRM_IOCTL_SYNCOBJ_SIGNAL
---
 include/drm-uapi/drm.h | 22 ++
 1 file changed, 22 insertions(+)

diff --git a/include/drm-uapi/drm.h b/include/drm-uapi/drm.h
index bf3674a..4737261 100644
--- a/include/drm-uapi/drm.h
+++ b/include/drm-uapi/drm.h
@@ -694,6 +694,7 @@ struct drm_prime_handle {
 
 struct drm_syncobj_create {
__u32 handle;
+#define DRM_SYNCOBJ_CREATE_SIGNALED (1 << 0)
__u32 flags;
 };
 
@@ -712,6 +713,24 @@ struct drm_syncobj_handle {
__u32 pad;
 };
 
+#define DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL (1 << 0)
+#define DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT (1 << 1)
+struct drm_syncobj_wait {
+   __u64 handles;
+   /* absolute timeout */
+   __s64 timeout_nsec;
+   __u32 count_handles;
+   __u32 flags;
+   __u32 first_signaled; /* only valid when not waiting all */
+   __u32 pad;
+};
+
+struct drm_syncobj_array {
+   __u64 handles;
+   __u32 count_handles;
+   __u32 pad;
+};
+
 #if defined(__cplusplus)
 }
 #endif
@@ -834,6 +853,9 @@ extern "C" {
 #define DRM_IOCTL_SYNCOBJ_DESTROY  DRM_IOWR(0xC0, struct 
drm_syncobj_destroy)
 #define DRM_IOCTL_SYNCOBJ_HANDLE_TO_FD DRM_IOWR(0xC1, struct 
drm_syncobj_handle)
 #define DRM_IOCTL_SYNCOBJ_FD_TO_HANDLE DRM_IOWR(0xC2, struct 
drm_syncobj_handle)
+#define DRM_IOCTL_SYNCOBJ_WAIT DRM_IOWR(0xC3, struct drm_syncobj_wait)
+#define DRM_IOCTL_SYNCOBJ_RESET DRM_IOWR(0xC4, struct 
drm_syncobj_array)
+#define DRM_IOCTL_SYNCOBJ_SIGNAL   DRM_IOWR(0xC5, struct drm_syncobj_array)
 
 /**
  * Device specific ioctls should only be in their respective headers
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH v3 05/12] anv: Rename anv_fence_state to anv_bo_fence_state

2017-08-25 Thread Jason Ekstrand
It only applies to legacy BO fences.
---
 src/intel/vulkan/anv_batch_chain.c |  2 +-
 src/intel/vulkan/anv_private.h | 10 +-
 src/intel/vulkan/anv_queue.c   | 24 
 3 files changed, 18 insertions(+), 18 deletions(-)

diff --git a/src/intel/vulkan/anv_batch_chain.c 
b/src/intel/vulkan/anv_batch_chain.c
index 775009c..0a0be8d 100644
--- a/src/intel/vulkan/anv_batch_chain.c
+++ b/src/intel/vulkan/anv_batch_chain.c
@@ -1619,7 +1619,7 @@ anv_cmd_buffer_execbuf(struct anv_device *device,
* vkGetFenceStatus() return a valid result (VK_ERROR_DEVICE_LOST or
* VK_SUCCESS) in a finite amount of time even if execbuf fails.
*/
-  fence->permanent.bo.state = ANV_FENCE_STATE_SUBMITTED;
+  fence->permanent.bo.state = ANV_BO_FENCE_STATE_SUBMITTED;
}
 
if (result == VK_SUCCESS && need_out_fence) {
diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index ab6e5e2..3b50c49 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -1713,16 +1713,16 @@ enum anv_fence_type {
ANV_FENCE_TYPE_SYNCOBJ,
 };
 
-enum anv_fence_state {
+enum anv_bo_fence_state {
/** Indicates that this is a new (or newly reset fence) */
-   ANV_FENCE_STATE_RESET,
+   ANV_BO_FENCE_STATE_RESET,
 
/** Indicates that this fence has been submitted to the GPU but is still
 * (as far as we know) in use by the GPU.
 */
-   ANV_FENCE_STATE_SUBMITTED,
+   ANV_BO_FENCE_STATE_SUBMITTED,
 
-   ANV_FENCE_STATE_SIGNALED,
+   ANV_BO_FENCE_STATE_SIGNALED,
 };
 
 struct anv_fence_impl {
@@ -1740,7 +1740,7 @@ struct anv_fence_impl {
*/
   struct {
  struct anv_bo bo;
- enum anv_fence_state state;
+ enum anv_bo_fence_state state;
   } bo;
};
 };
diff --git a/src/intel/vulkan/anv_queue.c b/src/intel/vulkan/anv_queue.c
index f8a2f64..23f8d7d 100644
--- a/src/intel/vulkan/anv_queue.c
+++ b/src/intel/vulkan/anv_queue.c
@@ -279,9 +279,9 @@ VkResult anv_CreateFence(
   return result;
 
if (pCreateInfo->flags & VK_FENCE_CREATE_SIGNALED_BIT) {
-  fence->permanent.bo.state = ANV_FENCE_STATE_SIGNALED;
+  fence->permanent.bo.state = ANV_BO_FENCE_STATE_SIGNALED;
} else {
-  fence->permanent.bo.state = ANV_FENCE_STATE_RESET;
+  fence->permanent.bo.state = ANV_BO_FENCE_STATE_RESET;
}
 
*pFence = anv_fence_to_handle(fence);
@@ -336,7 +336,7 @@ VkResult anv_ResetFences(
 
   switch (impl->type) {
   case ANV_FENCE_TYPE_BO:
- impl->bo.state = ANV_FENCE_STATE_RESET;
+ impl->bo.state = ANV_BO_FENCE_STATE_RESET;
  break;
 
   default:
@@ -363,18 +363,18 @@ VkResult anv_GetFenceStatus(
switch (impl->type) {
case ANV_FENCE_TYPE_BO:
   switch (impl->bo.state) {
-  case ANV_FENCE_STATE_RESET:
+  case ANV_BO_FENCE_STATE_RESET:
  /* If it hasn't even been sent off to the GPU yet, it's not ready */
  return VK_NOT_READY;
 
-  case ANV_FENCE_STATE_SIGNALED:
+  case ANV_BO_FENCE_STATE_SIGNALED:
  /* It's been signaled, return success */
  return VK_SUCCESS;
 
-  case ANV_FENCE_STATE_SUBMITTED: {
+  case ANV_BO_FENCE_STATE_SUBMITTED: {
  VkResult result = anv_device_bo_busy(device, &impl->bo.bo);
  if (result == VK_SUCCESS) {
-impl->bo.state = ANV_FENCE_STATE_SIGNALED;
+impl->bo.state = ANV_BO_FENCE_STATE_SIGNALED;
 return VK_SUCCESS;
  } else {
 return result;
@@ -427,7 +427,7 @@ anv_wait_for_bo_fences(struct anv_device *device,
  struct anv_fence_impl *impl = >permanent;
 
  switch (impl->bo.state) {
- case ANV_FENCE_STATE_RESET:
+ case ANV_BO_FENCE_STATE_RESET:
 /* This fence hasn't been submitted yet, we'll catch it the next
  * time around.  Yes, this may mean we dead-loop but, short of
  * lots of locking and a condition variable, there's not much that
@@ -436,7 +436,7 @@ anv_wait_for_bo_fences(struct anv_device *device,
 pending_fences++;
 continue;
 
- case ANV_FENCE_STATE_SIGNALED:
+ case ANV_BO_FENCE_STATE_SIGNALED:
 /* This fence is not pending.  If waitAll isn't set, we can return
  * early.  Otherwise, we have to keep going.
  */
@@ -446,14 +446,14 @@ anv_wait_for_bo_fences(struct anv_device *device,
 }
 continue;
 
- case ANV_FENCE_STATE_SUBMITTED:
+ case ANV_BO_FENCE_STATE_SUBMITTED:
 /* These are the fences we really care about.  Go ahead and wait
  * on it until we hit a timeout.
  */
 result = anv_device_wait(device, &impl->bo.bo, timeout);
 switch (result) {
 case VK_SUCCESS:
-   impl->bo.state = ANV_FENCE_STATE_SIGNALED;
+   impl->bo.state = ANV_BO_FENCE_STATE_SIGNALED;
signaled_fences = true;

[Mesa-dev] [PATCH v3 01/12] anv/queue: Allow temporary import of SYNC_FD semaphores

2017-08-25 Thread Jason Ekstrand
We didn't allow them before because it didn't look like the spec allowed
it.  It certainly doesn't make much sense.  However, there are CTS tests
that apparently hit this.  What the spec actually says is:

"Importing a payload using handle types with copy transference
creates a duplicate copy of the payload at the time of import, but
makes no further reference to it. Fence signaling, waiting, and
resetting operations performed on the target of copy imports must
not affect any other fence or payload."

A SYNC_FD has copy transference but the import may be temporary or
permanent.  If you do a permanent import of something with copy
transference, I guess it's supposed to work and end up resetting the
permanent state.  In any case, there seems to be no real harm in
allowing it, so why not.
---
 src/intel/vulkan/anv_queue.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/src/intel/vulkan/anv_queue.c b/src/intel/vulkan/anv_queue.c
index 0a40ebc..03769be 100644
--- a/src/intel/vulkan/anv_queue.c
+++ b/src/intel/vulkan/anv_queue.c
@@ -751,9 +751,6 @@ VkResult anv_ImportSemaphoreFdKHR(
   anv_semaphore_impl_cleanup(device, &semaphore->temporary);
   semaphore->temporary = new_impl;
} else {
-  /* SYNC_FILE must be a temporary import */
-  assert(new_impl.type != ANV_SEMAPHORE_TYPE_SYNC_FILE);
-
   anv_semaphore_impl_cleanup(device, &semaphore->permanent);
   semaphore->permanent = new_impl;
}
-- 
2.5.0.400.gff86faf



[Mesa-dev] [Bug 102038] assertion failure in update_framebuffer_size

2017-08-25 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=102038

--- Comment #17 from Brad King  ---
After applying the two patches I can confirm that the VTK test I used to
produce the apitrace now passes again.  Thanks!

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH 41/47] i965/fs: Add reuse_16bit_conversions_register optimization

2017-08-25 Thread Francisco Jerez
Alejandro Piñeiro  writes:

> On 24/08/17 21:07, Connor Abbott wrote:
>>
>> Hi Alejandro,
>
> Hi Connor,
>
>>
>> This seems really suspicious. If the live ranges are really
>> independent, then the register allocator should be able to assign the
>> two virtual registers to the same physical register if it needs to.
>
> Yes, it is true, the register allocator should be able to assign two
> virtual registers to the same physical register. But that is done at the
> end (or really near the end), too late for the problem this optimization
> is trying to fix. We are also reducing the number of instructions used.
>
> This was probably not clear in the commit message. When I say "reduce the
> pressure of the register allocator" I mean producing code that the
> register allocator is able to handle without using too much time.
> The problem this optimization tries to solve is that for some 16 bit CTS
> tests (some with matrices and geometry shaders), the number of virtual
> registers and instructions used was really big. For the record,
> initially, some tests needed 24 min just to compile. Right now, thanks
> to other optimizations, the slowest test without this optimization needs
> 1min 30 seconds. Adding some hacky timestamps, the time used at
> fs_visitor::allocate_registers (brw_fs.cpp:6096) is:
>
> * While trying to schedule using the three available pre mode
> heuristics: 7 seconds
> * Allocation with spilling: 63 seconds
> * Final schedule using SCHEDULE_POST: 19 seconds
>
> With this optimization, the total time goes down to 14 seconds (10 + 0 +
> 3 on the previous bullet point list).
>
> One could argue that 1min 30 seconds is okish. But taking into account
> that it goes down to 14 seconds, even with some caveats (see below), I
> still think that it is worth using the optimization.
>
> And a final comment. For that same test, these are the final stats (using
> INTEL_DEBUG):
>
>  * With the optimization: SIMD8 shader: 4610 instructions. 0 loops.
> 130320 cycles. 15:9 spills:fills.
>  * Without the optimization: SIMD8 shader: 12312 instructions. 0 loops.
> 174816 cycles. 751:1851 spills:fills.
>
>> This change forces the two to be the same, which constrains the
>> register allocator unnecessarily and should make it worse, so I'm
>> confused as to why this would help at all.
>
> I didn't check that issue specifically, but I recently found that this
> optimization affects copy propagation/dead code elimination. So there is
> still some room for improvement. But in any case, take into account that
> this custom optimization is only used if there is a 32 to 16 bit
> conversion, so it only affects shaders with this specific feature.
>
>>
>> IIRC there were some issues where we unnecessarily made the sources
>> and destination of an instruction interfere with each other, but if
>> that's what's causing this, then we should fix that underlying issue.
>>
>> (From what I remember, a lot of SIMD16 instructions were expanded to SIMD8
>> in the generator, in which case the second half of the source is read after
>> the first half of the destination is written, and we falsely thought
>> that the HW did that too, so we had some code to add a fake
>> interference between them, but a while ago Curro moved the expansion
>> to happen before register allocation. I don't have the code in front
>> of me, but I think we still have this useless code lying around, and I
>> would guess this is the source of the problem.)
>
> Taking into account what I explained before, I don't think that the
> problem is the interference or this code you mention (although perhaps
> I'm wrong).
>

I agree with Connor's feedback on this change; this really smells like
a hack working around register allocator brokenness.  If the register
allocator is failing to assign two variables with disjoint live ranges
to the same register it has a bug.  If you forcefully merge the live
ranges of source and destination it might turn out that that wasn't the
optimal decision to take after all register pressure-wise (because it's
frequently harder to find room in the GRF for a variable with 2x the
live range than for two independent variables), so you will be
pessimizing register usage in some cases -- The register allocator
should know better than you.
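
To illustrate the register-pressure point above, here is a minimal toy
sketch in C. It is not Mesa code: struct live_range, ranges_overlap(),
merge_ranges() and find_free_reg() are hypothetical names, live ranges are
modelled as inclusive instruction intervals, and each register is assumed to
already hold exactly one other variable.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy live range: inclusive [start, end] in instruction indices. */
struct live_range {
   int start;
   int end;
};

/* Two inclusive intervals interfere if they share at least one point. */
static bool
ranges_overlap(struct live_range a, struct live_range b)
{
   return a.start <= b.end && b.start <= a.end;
}

/* Smallest range covering both a and b, i.e. the live range that results
 * from forcing the two variables to share a single virtual register.
 */
static struct live_range
merge_ranges(struct live_range a, struct live_range b)
{
   struct live_range m = {
      .start = a.start < b.start ? a.start : b.start,
      .end = a.end > b.end ? a.end : b.end,
   };
   return m;
}

/* Returns the index of the first register whose current occupant does not
 * interfere with v, or -1 if v would have to be spilled in this toy model.
 */
static int
find_free_reg(const struct live_range *occupied, int num_regs,
              struct live_range v)
{
   assert(num_regs >= 0);

   for (int i = 0; i < num_regs; i++) {
      if (!ranges_overlap(occupied[i], v))
         return i;
   }
   return -1;
}
```

With r0 occupied over [6, 12] and r1 over [0, 5], a source live over [1, 5]
fits in r0 and a destination live over [6, 10] fits in r1, but the merged
range [1, 10] fits in neither register.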

>> 
>> Connor
>>
>> On Aug 24, 2017 2:59 PM, "Alejandro Piñeiro" wrote:
>>
>> When dealing with HF/U/UW, it is usual to have a register with a
>> F/D/UD, and then convert it to HF/U/UW, and not use the F/D/UD value
>> again. In those cases it would be possible to reuse the register where
>> the F value is initially stored instead of having two. Take also into
>> account that when operating with HF/U/UW, you would need to use the
>> full register (so stride 2). Packs/unpacks would be only useful when
>> loading/storing several HF/W/UW.
>>
>> Note that no instruction is removed. The main benefit is reducing the
>> 

Re: [Mesa-dev] [PATCH] radeonsi/gfx9: add a temporary workaround for a tessellation driver bug

2017-08-25 Thread Marek Olšák
On Tue, Aug 22, 2017 at 2:15 PM, Nicolai Hähnle  wrote:
> On 22.08.2017 14:10, Nicolai Hähnle wrote:
>>
>> On 22.08.2017 13:00, Marek Olšák wrote:
>>>
>>> On Tue, Aug 22, 2017 at 9:37 AM, Nicolai Hähnle 
>>> wrote:

>>>> On 18.08.2017 19:06, Marek Olšák wrote:
>>>>>
>>>>>
>>>>> Ping.
>>>>>
>>>>> On Wed, Aug 16, 2017 at 12:57 AM, Marek Olšák  wrote:
>>>>>>
>>>>>>
>>>>>> From: Marek Olšák 
>>>>>>
>>>>>> The workaround will do for now. The root cause is still unknown.
>>>>>>
>>>>>> This fixes new piglit: 16in-1out
>>>>
>>>>
>>>>
>>>> I don't see this test. Did you already send it out?
>>>
>>>
>>> "[PATCH] arb_tessellation_shader: new tests for a radeonsi bug" on the
>>> piglit ML.
>>
>>
>> Curious, I can't reproduce the problem on my Raven.
>
>
> VGT_LS_HS_CONFIG.NUM_PATCHES is 16, so there should definitely be more than
> one wave per thread-group.

The test is insufficient to reproduce the issue, but you'll see it
when you run the test with ST_DEBUG=wf with and without the fix.

Marek


Re: [Mesa-dev] [PATCH 20/47] i965/fs: Define new shader opcodes to set rounding modes

2017-08-25 Thread Francisco Jerez
Alejandro Piñeiro  writes:

> Although it is possible to emit them directly as AND/OR on brw_fs_nir,
> having specific opcodes makes it easier to remove duplicate settings
> later.
>
> Signed-off-by:  Alejandro Piñeiro 
> Signed-off-by:  Jose Maria Casanova Crespo 
> ---
>  src/intel/compiler/brw_eu.h |  3 +++
>  src/intel/compiler/brw_eu_defines.h |  9 +
>  src/intel/compiler/brw_eu_emit.c| 19 +++
>  src/intel/compiler/brw_fs_generator.cpp |  8 
>  src/intel/compiler/brw_shader.cpp   |  5 +
>  5 files changed, 44 insertions(+)
>
> diff --git a/src/intel/compiler/brw_eu.h b/src/intel/compiler/brw_eu.h
> index a3a9c63239d..0a7f8020398 100644
> --- a/src/intel/compiler/brw_eu.h
> +++ b/src/intel/compiler/brw_eu.h
> @@ -500,6 +500,9 @@ brw_broadcast(struct brw_codegen *p,
>struct brw_reg src,
>struct brw_reg idx);
>  
> +void
> +brw_rounding_mode(struct brw_codegen *p,
> +  enum brw_rnd_mode mode);
>  /***
>   * brw_eu_util.c:
>   */
> diff --git a/src/intel/compiler/brw_eu_defines.h 
> b/src/intel/compiler/brw_eu_defines.h
> index 1af835d47ed..50435df2fcf 100644
> --- a/src/intel/compiler/brw_eu_defines.h
> +++ b/src/intel/compiler/brw_eu_defines.h
> @@ -388,6 +388,9 @@ enum opcode {
> SHADER_OPCODE_TYPED_SURFACE_WRITE,
> SHADER_OPCODE_TYPED_SURFACE_WRITE_LOGICAL,
>  
> +   SHADER_OPCODE_RND_MODE_RTE,
> +   SHADER_OPCODE_RND_MODE_RTZ,
> +

We don't need an opcode for each possible rounding mode (there's also RU
and RD).  How about you add a single SHADER_OPCODE_RND_MODE opcode
taking an immediate with the right rounding mode?

Also, you should be marking the rounding mode opcodes as
has_side_effects(), because otherwise you're giving the scheduler the
freedom of moving your rounding mode update instruction past the
instruction you wanted it to have an effect on...
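
As a rough illustration of both suggestions (a single opcode carrying the
mode as an immediate, and treating it as having side effects), here is a toy
sketch in C. None of these names exist in Mesa: toy_opcode, toy_inst,
toy_has_side_effects() and toy_can_reorder() are hypothetical, and the
reordering check deliberately ignores data dependencies.

```c
#include <stdbool.h>

/* Hypothetical opcode list with one rounding-mode opcode instead of one
 * opcode per mode.
 */
enum toy_opcode {
   TOY_OPCODE_ADD,
   TOY_OPCODE_MUL,
   TOY_OPCODE_RND_MODE,
};

struct toy_inst {
   enum toy_opcode opcode;
   unsigned imm; /* for TOY_OPCODE_RND_MODE: the requested rounding mode */
};

/* An instruction that changes hidden machine state (here, the rounding
 * mode) has side effects even though it writes no ordinary register.
 */
static bool
toy_has_side_effects(const struct toy_inst *inst)
{
   switch (inst->opcode) {
   case TOY_OPCODE_RND_MODE:
      return true;
   default:
      return false;
   }
}

/* A scheduler may only swap two neighbouring instructions if neither has
 * side effects (data dependencies are not modelled in this sketch).
 */
static bool
toy_can_reorder(const struct toy_inst *a, const struct toy_inst *b)
{
   return !toy_has_side_effects(a) && !toy_has_side_effects(b);
}
```

In this model an ADD and a MUL may be swapped freely, but nothing may move
across a TOY_OPCODE_RND_MODE instruction.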

> SHADER_OPCODE_MEMORY_FENCE,
>  
> SHADER_OPCODE_GEN4_SCRATCH_READ,
> @@ -1233,4 +1236,10 @@ enum brw_message_target {
>  /* R0 */
>  # define GEN7_GS_PAYLOAD_INSTANCE_ID_SHIFT   27
>  
> +enum PACKED brw_rnd_mode {
> +   BRW_RND_MODE_UNSPECIFIED,
> +   BRW_RND_MODE_RTE,
> +   BRW_RND_MODE_RTZ,

Since you're introducing a back-end-specific rounding mode enum already,
why not use the hardware values right away so you avoid hard-coding
magic constants below?
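
A rough sketch of what hardware-valued enum entries could look like, so
that a single helper replaces the separate OR and AND constants. This is
an illustration only: the names below are hypothetical, the field position
is inferred from the 0x0030 constant in the patch, and the RU/RD encodings
are assumptions that should be checked against the PRM.

```c
#include <assert.h>
#include <stdint.h>

/* Rounding-mode field of cr0.0. The mask matches the 0x0030 constant
 * used in the patch; the shift is inferred from it. */
#define RND_MODE_SHIFT 4
#define RND_MODE_MASK  (0x3u << RND_MODE_SHIFT)

/* Hypothetical enum whose values are the hardware encodings. */
enum rnd_mode {
   RND_MODE_RTE = 0, /* round to nearest even: the patch clears bits 5:4 */
   RND_MODE_RU  = 1, /* round up: assumed encoding */
   RND_MODE_RD  = 2, /* round down: assumed encoding */
   RND_MODE_RTZ = 3, /* round toward zero: the patch sets bits 5:4 */
};

/* Return the cr0 value with only the rounding-mode field replaced. */
static uint32_t
cr0_with_rnd_mode(uint32_t cr0, enum rnd_mode mode)
{
   assert((unsigned)mode <= 3u);
   return (cr0 & ~RND_MODE_MASK) | ((uint32_t)mode << RND_MODE_SHIFT);
}
```

With the encodings in the enum, the generator would not need a switch
over modes; it only needs to clear the field and OR in the shifted value.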

> +};
> +
>  #endif /* BRW_EU_DEFINES_H */
> diff --git a/src/intel/compiler/brw_eu_emit.c 
> b/src/intel/compiler/brw_eu_emit.c
> index 0b0d67a5c56..07ad3d9384b 100644
> --- a/src/intel/compiler/brw_eu_emit.c
> +++ b/src/intel/compiler/brw_eu_emit.c
> @@ -3723,3 +3723,22 @@ brw_WAIT(struct brw_codegen *p)
> brw_inst_set_exec_size(devinfo, insn, BRW_EXECUTE_1);
> brw_inst_set_mask_control(devinfo, insn, BRW_MASK_DISABLE);
>  }
> +
> +void
> +brw_rounding_mode(struct brw_codegen *p,
> +  enum brw_rnd_mode mode)
> +{
> +   switch (mode) {
> +   case BRW_RND_MODE_UNSPECIFIED:
> +  /* nothing to do here */
> +  break;
> +   case BRW_RND_MODE_RTZ:
> +  brw_OR(p, brw_cr0_reg(0), brw_cr0_reg(0), brw_imm_ud(0x0030u));
> +  break;
> +   case BRW_RND_MODE_RTE:
> +  brw_AND(p, brw_cr0_reg(0), brw_cr0_reg(0), brw_imm_ud(0xffcfu));

This has undefined behavior because the ALU instructions you use to set
cr0 have non-zero latency, so the rounding mode change won't take effect
till ~8 cycles after the instruction is issued.  Any instructions issued
in that window will pick up the wrong rounding mode.  This is likely one
of the reasons for your observations off-list regarding some shaders
using the right or wrong rounding mode non-deterministically depending
on the scheduler's behaviour.

Here's a spec quote from the SKL PRM suggesting a workaround you should
probably include in this commit:

| Implementation Restriction on Register Access: When the control
| register is used as an explicit source and/or destination, hardware
| does not ensure execution pipeline coherency. Software must set the
| thread control field to ‘switch’ for an instruction that uses control
| register as an explicit operand. This is important as the control
| register is an implicit source for most instructions. For example,
| fields like FPMode and Accumulator Disable control the arithmetic
| and/or logic instructions. Therefore, if the instruction updating the
| control register doesn’t set ‘switch’, subsequent instructions may
| have undefined results.


> +  break;
> +   default:
> +  unreachable("Not reached");
> +   }
> +}
> diff --git a/src/intel/compiler/brw_fs_generator.cpp 
> b/src/intel/compiler/brw_fs_generator.cpp
> index 2ade486705b..e0bd191ea7e 100644
> --- a/src/intel/compiler/brw_fs_generator.cpp
> +++ b/src/intel/compiler/brw_fs_generator.cpp
> @@ -2139,6 +2139,14 @@ 

[Mesa-dev] [PATCH 2/2] radeon/uvd: add Define Restart Interval to MJPEG bitstream reconstruction

2017-08-25 Thread Leo Liu
Signed-off-by: Leo Liu 
---
 src/gallium/drivers/radeon/radeon_uvd.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/src/gallium/drivers/radeon/radeon_uvd.c 
b/src/gallium/drivers/radeon/radeon_uvd.c
index 228f654af1..00d6267018 100644
--- a/src/gallium/drivers/radeon/radeon_uvd.c
+++ b/src/gallium/drivers/radeon/radeon_uvd.c
@@ -1012,6 +1012,17 @@ static void get_mjpeg_slice_header(struct ruvd_decoder 
*dec, struct pipe_mjpeg_p
 
saved_size = size;
 
+   /* DRI */
+   if (pic->slice_parameter.restart_interval) {
+   buf[size++] = 0xff;
+   buf[size++] = 0xdd;
+   buf[size++] = 0x00;
+   buf[size++] = 0x04;
+   bs = (uint16_t*)&buf[size++];
+   *bs = util_bswap16(pic->slice_parameter.restart_interval);
+   saved_size = ++size;
+   }
+
/* SOF */
buf[size++] = 0xff;
buf[size++] = 0xc0;
-- 
2.11.0



[Mesa-dev] [PATCH 1/2] radeon/uvd: fix MJPEG quantization table index

2017-08-25 Thread Leo Liu
Signed-off-by: Leo Liu 
---
 src/gallium/drivers/radeon/radeon_uvd.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeon/radeon_uvd.c 
b/src/gallium/drivers/radeon/radeon_uvd.c
index 648a493b59..228f654af1 100644
--- a/src/gallium/drivers/radeon/radeon_uvd.c
+++ b/src/gallium/drivers/radeon/radeon_uvd.c
@@ -969,7 +969,7 @@ static void get_mjpeg_slice_header(struct ruvd_decoder 
*dec, struct pipe_mjpeg_p
continue;
 
buf[size++] = i;
-   memcpy((buf + size), &pic->quantization_table.quantiser_table, 64);
+   memcpy((buf + size), &pic->quantization_table.quantiser_table[i], 64);
size += 64;
}
 
-- 
2.11.0



Re: [Mesa-dev] [PATCH 1/2] mesa: Implement GL_ARB_texture_filter_anisotropic

2017-08-25 Thread Roland Scheidegger
Am 24.08.2017 um 20:40 schrieb Adam Jackson:
> The only difference from the EXT version is bumping the minmax to 16, so
> just hit all the drivers at once.
> 
> v2: Fix driver names, add to 17.3 release notes (Ilia Mirkin)
> 
> Reviewed-by: Ilia Mirkin 
> Signed-off-by: Adam Jackson 
> ---
>  docs/features.txt| 4 +++-
>  docs/relnotes/17.3.0.html| 1 +
>  src/glx/glxextensions.c  | 1 +
>  src/glx/glxextensions.h  | 1 +
>  src/mesa/drivers/dri/i965/intel_extensions.c | 1 +
>  src/mesa/drivers/dri/r200/r200_context.c | 1 +
>  src/mesa/drivers/dri/radeon/radeon_context.c | 1 +
>  src/mesa/main/extensions.c   | 1 +
>  src/mesa/main/extensions_table.h | 1 +
>  src/mesa/main/mtypes.h   | 1 +
>  src/mesa/main/version.c  | 2 +-
>  src/mesa/state_tracker/st_extensions.c   | 4 
>  12 files changed, 17 insertions(+), 2 deletions(-)
> 
> diff --git a/docs/features.txt b/docs/features.txt
> index 6f57ec26fd..3f91c2daae 100644
> --- a/docs/features.txt
> +++ b/docs/features.txt
> @@ -231,10 +231,12 @@ GL 4.6, GLSL 4.60
>GL_ARB_shader_draw_parameters DONE (i965, nvc0, 
> radeonsi)
>GL_ARB_shader_group_vote  DONE (i965, nvc0, 
> radeonsi)
>GL_ARB_spirv_extensions   in progress (Nicolai 
> Hähnle, Ian Romanick)
> -  GL_ARB_texture_filter_anisotropic not started
> +  GL_ARB_texture_filter_anisotropic DONE (i965, nv50, 
> nvc0, r600, radeonsi, softpipe (*), llvmpipe (*))
>GL_ARB_transform_feedback_overflow_query  DONE (i965/gen6+, 
> radeonsi, llvmpipe, softpipe)
>GL_KHR_no_error   started (Timothy 
> Arceri)
>  
> +(*) softpipe and llvmpipe advertise 16x anisotropy but simply ignore the 
> setting
That's actually not quite true for either (though I thought it was for
llvmpipe until I just checked). llvmpipe says the aniso limit is 16x, but
it does not set the cap bit for anisotropic filtering
(PIPE_CAP_ANISOTROPIC_FILTER) itself, so we don't actually lie (but if
you give it an anisotropic sampler despite this, it will indeed ignore it).
softpipe, otoh, does have an AF implementation, working up to 16x (albeit
I quickly checked with texfilt, and it looks like it may not filter
between mips (might be a bug or by design, I don't know), but it does
respect high degree AF).

Roland



> +
>  These are the extensions cherry-picked to make GLES 3.1
>  GLES3.1, GLSL ES 3.1 -- all DONE: i965/hsw+, nvc0, radeonsi
>  
> diff --git a/docs/relnotes/17.3.0.html b/docs/relnotes/17.3.0.html
> index 25d02cdca7..8da43f22f0 100644
> --- a/docs/relnotes/17.3.0.html
> +++ b/docs/relnotes/17.3.0.html
> @@ -45,6 +45,7 @@ Note: some of the new features are only available with 
> certain drivers.
>  
>  
>  GL_ARB_transform_feedback_overflow_query on radeonsi
> +GL_ARB_texture_filter_anisotropic on i965, nv50, nvc0, r600, 
> radeonsi
>  GL_EXT_memory_object on radeonsi
>  GL_EXT_memory_object_fd on radeonsi
>  
> diff --git a/src/glx/glxextensions.c b/src/glx/glxextensions.c
> index 22b078ce48..88bf0de3e6 100644
> --- a/src/glx/glxextensions.c
> +++ b/src/glx/glxextensions.c
> @@ -190,6 +190,7 @@ static const struct extension_info known_gl_extensions[] 
> = {
> { GL(ARB_texture_env_combine),VER(1,3), Y, N, N, N },
> { GL(ARB_texture_env_crossbar),   VER(1,4), Y, N, N, N },
> { GL(ARB_texture_env_dot3),   VER(1,3), Y, N, N, N },
> +   { GL(ARB_texture_filter_anisotropic), VER(0,0), Y, N, N, N },
> { GL(ARB_texture_mirrored_repeat),VER(1,4), Y, N, N, N },
> { GL(ARB_texture_non_power_of_two),   VER(1,5), Y, N, N, N },
> { GL(ARB_texture_rectangle),  VER(0,0), Y, N, N, N },
> diff --git a/src/glx/glxextensions.h b/src/glx/glxextensions.h
> index 21ad02a44b..2a595516ee 100644
> --- a/src/glx/glxextensions.h
> +++ b/src/glx/glxextensions.h
> @@ -101,6 +101,7 @@ enum
> GL_ARB_texture_env_combine_bit,
> GL_ARB_texture_env_crossbar_bit,
> GL_ARB_texture_env_dot3_bit,
> +   GL_ARB_texture_filter_anisotropic_bit,
> GL_ARB_texture_mirrored_repeat_bit,
> GL_ARB_texture_non_power_of_two_bit,
> GL_ARB_texture_rectangle_bit,
> diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c 
> b/src/mesa/drivers/dri/i965/intel_extensions.c
> index b91bbdc8d9..c3cd8004a1 100644
> --- a/src/mesa/drivers/dri/i965/intel_extensions.c
> +++ b/src/mesa/drivers/dri/i965/intel_extensions.c
> @@ -80,6 +80,7 @@ intelInitExtensions(struct gl_context *ctx)
> ctx->Extensions.ARB_texture_env_combine = true;
> ctx->Extensions.ARB_texture_env_crossbar = true;
> ctx->Extensions.ARB_texture_env_dot3 = true;
> +   ctx->Extensions.ARB_texture_filter_anisotropic = true;
> ctx->Extensions.ARB_texture_float = 

Re: [Mesa-dev] [PATCH] st/va: move YUV content to deinterlaced buffer when reallocated for encoder

2017-08-25 Thread Leo Liu



On 08/25/2017 12:42 PM, Andy Furniss wrote:

Leo Liu wrote:

v2: use deinterlace common function
v3: make sure deinterlace only


Doesn't apply to master with git.


I will attach another one for you; it should be good. Too many patches in flight.

Thanks,
Leo




patch was less fussy

patch -p 1  < ~/Leo-va-interl-patches/02-3
patching file src/gallium/state_trackers/va/picture.c
Hunk #1 succeeded at 619 with fuzz 1 (offset 6 lines).
Hunk #2 succeeded at 662 (offset 17 lines).


Signed-off-by: Leo Liu 
---
  src/gallium/state_trackers/va/picture.c | 22 --
  1 file changed, 16 insertions(+), 6 deletions(-)

diff --git a/src/gallium/state_trackers/va/picture.c 
b/src/gallium/state_trackers/va/picture.c

index 6c3c4fe..aa4062d 100644
--- a/src/gallium/state_trackers/va/picture.c
+++ b/src/gallium/state_trackers/va/picture.c
@@ -613,17 +613,22 @@ vlVaEndPicture(VADriverContextP ctx, 
VAContextID context_id)

 mtx_lock(&drv->mutex);
 surf = handle_table_get(drv->htab, context->target_id);
 context->mpeg4.frame_num++;
-
 screen = context->decoder->context->screen;
 interlaced = screen->get_video_param(screen, 
context->decoder->profile,

context->decoder->entrypoint,
PIPE_VIDEO_CAP_SUPPORTS_INTERLACED);
   if (surf->buffer->interlaced != interlaced) {
-  surf->templat.interlaced = screen->get_video_param(screen, 
context->decoder->profile,

- PIPE_VIDEO_ENTRYPOINT_BITSTREAM,
- PIPE_VIDEO_CAP_PREFERS_INTERLACED);
-  realloc = true;
+  interlaced = screen->get_video_param(screen, 
context->decoder->profile,

+ context->decoder->entrypoint,
+ PIPE_VIDEO_CAP_PREFERS_INTERLACED);
+  if (!interlaced) {
+ /* The current cases for buffer reallocation are
+all from the interlaced to the deinterlaced,
+and there is no case for the other way around */
+ surf->templat.interlaced = false;
+ realloc = true;
+  }
 }
   if (u_reduce_video_profile(context->templat.profile) == 
PIPE_VIDEO_FORMAT_JPEG &&
@@ -640,13 +645,18 @@ vlVaEndPicture(VADriverContextP ctx, 
VAContextID context_id)

 }
   if (realloc) {
-  surf->buffer->destroy(surf->buffer);
+  struct pipe_video_buffer *old_buf = surf->buffer;
  if (vlVaHandleSurfaceAllocate(ctx, surf, &surf->templat) != 
VA_STATUS_SUCCESS) {

+ old_buf->destroy(old_buf);
   mtx_unlock(&drv->mutex);
   return VA_STATUS_ERROR_ALLOCATION_FAILED;
}
  +  if (context->decoder->entrypoint == 
PIPE_VIDEO_ENTRYPOINT_ENCODE)
+ vl_compositor_yuv_deint(&drv->cstate, &drv->compositor, 
old_buf, surf->buffer);

+
+  old_buf->destroy(old_buf);
context->target = surf->buffer;
 }





>From 0979afc77528557ed0e713b20c79ba91d142a889 Mon Sep 17 00:00:00 2001
From: Leo Liu 
Date: Fri, 25 Aug 2017 10:49:43 -0400
Subject: [PATCH 2/3] st/va move YUV content to deinterlaced buffer

When reallocation for encoder

v2: use deinterlace common function
v3: make sure deinterlace only

Signed-off-by: Leo Liu 
---
 src/gallium/state_trackers/va/picture.c | 21 -
 1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/src/gallium/state_trackers/va/picture.c b/src/gallium/state_trackers/va/picture.c
index 47e63d3b30..74d741f91a 100644
--- a/src/gallium/state_trackers/va/picture.c
+++ b/src/gallium/state_trackers/va/picture.c
@@ -626,10 +626,16 @@ vlVaEndPicture(VADriverContextP ctx, VAContextID context_id)
 PIPE_VIDEO_CAP_SUPPORTS_INTERLACED);
 
if (surf->buffer->interlaced != interlaced) {
-  surf->templat.interlaced = screen->get_video_param(screen, context->decoder->profile,
- PIPE_VIDEO_ENTRYPOINT_BITSTREAM,
- PIPE_VIDEO_CAP_PREFERS_INTERLACED);
-  realloc = true;
+  interlaced = screen->get_video_param(screen, context->decoder->profile,
+   context->decoder->entrypoint,
+   PIPE_VIDEO_CAP_PREFERS_INTERLACED);
+  if (!interlaced) {
+ /* The current cases for buffer reallocation are
+all from the interlaced to the deinterlaced,
+and there is no case for the other way around */
+ surf->templat.interlaced = false;
+ realloc = true;
+  }
}
 
format = screen->get_video_param(screen, context->decoder->profile,
@@ -657,13 +663,18 @@ vlVaEndPicture(VADriverContextP ctx, VAContextID context_id)
}
 
if (realloc) {
-  surf->buffer->destroy(surf->buffer);
+  struct pipe_video_buffer *old_buf = surf->buffer;
 
   if (vlVaHandleSurfaceAllocate(ctx, surf, &surf->templat) != VA_STATUS_SUCCESS) {
+ old_buf->destroy(old_buf);
  mtx_unlock(&drv->mutex);
  return VA_STATUS_ERROR_ALLOCATION_FAILED;
   }
 
+  if (context->decoder->entrypoint == 

Re: [Mesa-dev] [PATCH 1/2] mesa: Implement GL_ARB_texture_filter_anisotropic

2017-08-25 Thread Adam Jackson
On Thu, 2017-08-24 at 23:27 -0700, Kenneth Graunke wrote:

> > diff --git a/src/glx/glxextensions.h b/src/glx/glxextensions.h
> > index 21ad02a44b..2a595516ee 100644
> > --- a/src/glx/glxextensions.h
> > +++ b/src/glx/glxextensions.h
> > @@ -101,6 +101,7 @@ enum
> > GL_ARB_texture_env_combine_bit,
> > GL_ARB_texture_env_crossbar_bit,
> > GL_ARB_texture_env_dot3_bit,
> > +   GL_ARB_texture_filter_anisotropic_bit,
> > GL_ARB_texture_mirrored_repeat_bit,
> > GL_ARB_texture_non_power_of_two_bit,
> > GL_ARB_texture_rectangle_bit,
> 
> Hi Adam,
> 
> I've never seen new GL extensions added to the GLX code like this.  As
> far as I know, we haven't done that for any of the other extensions we've
> added over the last few years.  I guess this is something related to
> indirect GLX?  What does it do?  Should we drop it?

That I touched the GLX code with this change is kind of accidental, as
I was merely grepping for the EXT string. But since I did...

Yes, it's about indirect GLX. Early in GLX setup the client library
sends the server (among other things) the list of GL extension strings
it supports. glGetString(GL_EXTENSIONS) in an indirect context is then
the intersection of the client-support list and the list supported by
the indirect renderer. Since this extension's GLX support is trivial we
can just claim we support it and be done.
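
The intersection described above can be sketched as follows. This is a
minimal illustration, not the libGL code: the function names are
hypothetical, and it assumes both lists are space-separated strings in
the style of glGetString(GL_EXTENSIONS).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return nonzero if `name` (of length `len`) appears as a whole
 * space-separated token in `list`. Prefix matches do not count. */
static int
list_has_token(const char *list, const char *name, size_t len)
{
   const char *p = list;
   while (*p) {
      while (*p == ' ')
         p++;
      size_t n = strcspn(p, " ");
      if (n == len && memcmp(p, name, len) == 0)
         return 1;
      p += n;
   }
   return 0;
}

/* Write into `out` the tokens of `client` that also appear in `server`,
 * separated by single spaces, in client order.
 * Returns 0 on success, -1 if `out` is too small. */
static int
intersect_extensions(const char *client, const char *server,
                     char *out, size_t out_size)
{
   size_t used = 0;
   const char *p = client;

   if (out_size == 0)
      return -1;
   out[0] = '\0';

   while (*p) {
      while (*p == ' ')
         p++;
      size_t n = strcspn(p, " ");
      if (n > 0 && list_has_token(server, p, n)) {
         size_t need = n + (used ? 1 : 0); /* token plus separator */
         if (used + need + 1 > out_size)   /* +1 for the terminator */
            return -1;
         if (used)
            out[used++] = ' ';
         memcpy(out + used, p, n);
         used += n;
         out[used] = '\0';
      }
      p += n;
   }
   return 0;
}
```

Whole-token matching matters here: GL_ARB_texture must not match just
because the other side lists GL_ARB_texture_filter_anisotropic.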

I suspect the reason we don't touch this list very often is that the
indirect code only implements through GL 1.4 at the moment, so the set
of extensions you can usefully enable is pretty limited. 

That said, that list has a convention for aliases that I'd missed on
first writing. I'll fix that up and merge.

- ajax


Re: [Mesa-dev] [PATCH] st/va: move YUV content to deinterlaced buffer when reallocated for encoder

2017-08-25 Thread Andy Furniss

Leo Liu wrote:

v2: use deinterlace common function
v3: make sure deinterlace only


Doesn't apply to master with git.

patch was less fussy

patch -p 1  < ~/Leo-va-interl-patches/02-3
patching file src/gallium/state_trackers/va/picture.c
Hunk #1 succeeded at 619 with fuzz 1 (offset 6 lines).
Hunk #2 succeeded at 662 (offset 17 lines).


Signed-off-by: Leo Liu 
---
  src/gallium/state_trackers/va/picture.c | 22 --
  1 file changed, 16 insertions(+), 6 deletions(-)

diff --git a/src/gallium/state_trackers/va/picture.c 
b/src/gallium/state_trackers/va/picture.c
index 6c3c4fe..aa4062d 100644
--- a/src/gallium/state_trackers/va/picture.c
+++ b/src/gallium/state_trackers/va/picture.c
@@ -613,17 +613,22 @@ vlVaEndPicture(VADriverContextP ctx, VAContextID 
context_id)
 mtx_lock(&drv->mutex);
 surf = handle_table_get(drv->htab, context->target_id);
 context->mpeg4.frame_num++;
-
 screen = context->decoder->context->screen;
 interlaced = screen->get_video_param(screen, context->decoder->profile,
  context->decoder->entrypoint,
  PIPE_VIDEO_CAP_SUPPORTS_INTERLACED);
  
 if (surf->buffer->interlaced != interlaced) {

-  surf->templat.interlaced = screen->get_video_param(screen, 
context->decoder->profile,
- 
PIPE_VIDEO_ENTRYPOINT_BITSTREAM,
- 
PIPE_VIDEO_CAP_PREFERS_INTERLACED);
-  realloc = true;
+  interlaced = screen->get_video_param(screen, context->decoder->profile,
+   context->decoder->entrypoint,
+   PIPE_VIDEO_CAP_PREFERS_INTERLACED);
+  if (!interlaced) {
+ /* The current cases for buffer reallocation are
+all from the interlaced to the deinterlaced,
+and there is no case for the other way around */
+ surf->templat.interlaced = false;
+ realloc = true;
+  }
 }
  
 if (u_reduce_video_profile(context->templat.profile) == PIPE_VIDEO_FORMAT_JPEG &&

@@ -640,13 +645,18 @@ vlVaEndPicture(VADriverContextP ctx, VAContextID 
context_id)
 }
  
 if (realloc) {

-  surf->buffer->destroy(surf->buffer);
+  struct pipe_video_buffer *old_buf = surf->buffer;
  
if (vlVaHandleSurfaceAllocate(ctx, surf, &surf->templat) != VA_STATUS_SUCCESS) {

+ old_buf->destroy(old_buf);
   mtx_unlock(&drv->mutex);
   return VA_STATUS_ERROR_ALLOCATION_FAILED;
}
  
+  if (context->decoder->entrypoint == PIPE_VIDEO_ENTRYPOINT_ENCODE)

+ vl_compositor_yuv_deint(&drv->cstate, &drv->compositor, old_buf, 
surf->buffer);
+
+  old_buf->destroy(old_buf);
context->target = surf->buffer;
 }
  





Re: [Mesa-dev] TGSI 16-bit support

2017-08-25 Thread Connor Abbott
On Aug 25, 2017 4:10 PM, "Matt Turner"  wrote:

On Fri, Aug 25, 2017 at 10:50 AM, Nicolai Hähnle  wrote:
> On 25.08.2017 13:58, Marek Olšák wrote:
>>
>> Nicolai,
>>
>> Have you thought about switching to NIR for radeonsi completely to get
>> 16-bit support? We need NIR support anyway for spirv, right? Would it be
>> easier than adding 16-bit support into TGSI, glsl2tgsi, and tgsi2llvm?
>
>
> Well. What's missing from the NIR path is:
>
> (1) GS and tess (the ABI parts only)
> (2) re-adding some minor extensions (shader_group_vote?)

I added shader_group_vote/shader_ballot support to NIR when I enabled
those extensions for i965 recently, so no work to do there.


Well, you still need to hook up the NIR-to-LLVM bits, but my series for
implementing the Vulkan equivalent extension in radv does that, so once it
lands there should be no work to do - reviews for it appreciated!



Re: [Mesa-dev] [PATCH 1/2] docs: remove released and extend the calendar until the end of 2017

2017-08-25 Thread Emil Velikov
On 25 August 2017 at 12:40, Andres Gomez  wrote:
> Completed the 17.2 cycle and added the beginning of the 17.3 one.
>
> Cc: Emil Velikov 
> Cc: Juan A. Suarez Romero 
> Signed-off-by: Andres Gomez 
> ---
>  docs/release-calendar.html | 86 
> ++
>  1 file changed, 80 insertions(+), 6 deletions(-)
>
> diff --git a/docs/release-calendar.html b/docs/release-calendar.html
> index 93c02dafe94..1ed3ae14a97 100644
> --- a/docs/release-calendar.html
> +++ b/docs/release-calendar.html
> @@ -46,17 +46,91 @@ if you'd like to nominate a patch in the next stable 
> release.
>  Final planned release for the 17.1 series
>  
>  
> -17.2
> -2017-08-11
> -17.2.0-rc4
> +17.2
> +2017-09-08
> +17.2.1
>  Emil Velikov
> -May be promoted to 17.2.0 final
> +
As Eric mentioned - we had a few blockers so 17.2.0 is not out yet.
I'm hoping to have it today/the weekend but will need to double-check.
Can you add that one to the list?

We might need to tweak the dates for the stable releases... but that
as things unfold.

Reviewed-by: Emil Velikov 

Thanks
Emil


Re: [Mesa-dev] [PATCH 2/2] docs: add an additional final cycle for 17.1

2017-08-25 Thread Emil Velikov
Reviewed-by: Emil Velikov 

Thanks
Emil


Re: [Mesa-dev] [PATCH mesa] khronos/egl: remove dependency on Android NDK header

2017-08-25 Thread Emil Velikov
On 24 August 2017 at 15:22, Eric Engestrom  wrote:
> On Thursday, 2017-08-24 08:54:04 -0500, Rob Herring wrote:
>> On Thu, Aug 24, 2017 at 7:49 AM, Eric Engestrom
>>  wrote:
>> > Khronos: https://github.com/KhronosGroup/EGL-Registry/pull/22
>> > Cc: Rob Herring 
>> > Cc: Emil Velikov 
>> > Signed-off-by: Eric Engestrom 
>> > ---
>> >  include/EGL/eglplatform.h | 3 +--
>> >  1 file changed, 1 insertion(+), 2 deletions(-)
>> >
>> > diff --git a/include/EGL/eglplatform.h b/include/EGL/eglplatform.h
>> > index f045d009c0..bf9ec0bf5f 100644
>> > --- a/include/EGL/eglplatform.h
>> > +++ b/include/EGL/eglplatform.h
>> > @@ -97,8 +97,7 @@ typedef void   *EGLNativeWindowType;
>> >
>> >  #elif defined(__ANDROID__) || defined(ANDROID)
>> >
>> > -#include 
>> > -
>> > +struct ANativeWindow;
>> >  struct egl_native_pixmap_t;
>>
>> How does this work when we need to dereference the struct to call
>> ANativeWindow::dequeueBuffer() and others?
>
> Right, there are two things at play here:
> - eglplatform.h doesn't need to know the struct, so it shouldn't include
>   a whole header but simply forward declare for the pointer.
> - platform_android does need it, but wasn't including the proper
>   headers, so I missed it in my initial grep.
>
> It seems these two issues are orthogonal after all. This khronos header
> patch should land IMO, but platform_android needs a patch to avoid
> breaking (incoming) and will need the proper libraries that your patch
> 2/2 provides for the O update.
>
Thanks guys, for the correction.
The two "issues" seem the same but are not.
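
The forward-declaration point quoted above can be sketched like this.
The names are hypothetical stand-ins (native_window for ANativeWindow),
and the two halves would normally live in separate files: the public
header only needs the incomplete type, while the platform code needs
the full definition before it can dereference the pointer.

```c
#include <assert.h>
#include <stddef.h>

/* What the public header needs: an incomplete type is enough to
 * declare pointers and pass them around. */
struct native_window; /* hypothetical stand-in for ANativeWindow */
typedef struct native_window *native_window_handle;

int window_width(native_window_handle win); /* no definition required yet */

/* What the platform code needs: the full definition, because it
 * dereferences the pointer. */
struct native_window {
   int width;
   int height;
};

int
window_width(native_window_handle win)
{
   assert(win != NULL);
   return win->width;
}
```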

On this patch - Eric, can you please mention in the commit message why
we don't copy the whole file.

With that:
Reviewed-by: Emil Velikov 

-Emil


Re: [Mesa-dev] [PATCH mesa] dri: fix typo

2017-08-25 Thread Emil Velikov
Reviewed-by: Emil Velikov 

-Emil


[Mesa-dev] [PATCH v2] anv: set right datatypes in anv_pipeline_binding

2017-08-25 Thread Juan A. Suarez Romero
This structure contains two fields, binding and index, that store the
binding in the descriptor set and the index inside the binding.

These fields are defined as uint8_t, but the corresponding types in the
Vulkan specification are uint32_t, so large values are truncated.

This fixes 
dEQP-VK.binding_model.shader_access.*.multiple_arbitrary_descriptors.*

v2: use UINT32_MAX for index when having no render targets (Tapani)
---
 src/intel/vulkan/anv_pipeline.c  | 2 +-
 src/intel/vulkan/anv_private.h   | 4 ++--
 src/intel/vulkan/genX_pipeline.c | 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/intel/vulkan/anv_pipeline.c b/src/intel/vulkan/anv_pipeline.c
index 279d76561a..13ee5f47e2 100644
--- a/src/intel/vulkan/anv_pipeline.c
+++ b/src/intel/vulkan/anv_pipeline.c
@@ -907,7 +907,7 @@ anv_pipeline_compile_fs(struct anv_pipeline *pipeline,
  rt_bindings[0] = (struct anv_pipeline_binding) {
 .set = ANV_DESCRIPTOR_SET_COLOR_ATTACHMENTS,
 .binding = 0,
-.index = UINT8_MAX,
+.index = UINT32_MAX,
  };
  num_rts = 1;
   }
diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index 6b2414429f..d5657c5e09 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -1236,10 +1236,10 @@ struct anv_pipeline_binding {
uint8_t set;
 
/* Binding in the descriptor set */
-   uint8_t binding;
+   uint32_t binding;
 
/* Index in the binding */
-   uint8_t index;
+   uint32_t index;
 
/* Input attachment index (relative to the subpass) */
uint8_t input_attachment_index;
diff --git a/src/intel/vulkan/genX_pipeline.c b/src/intel/vulkan/genX_pipeline.c
index 55db5339d6..acf3ee37d3 100644
--- a/src/intel/vulkan/genX_pipeline.c
+++ b/src/intel/vulkan/genX_pipeline.c
@@ -1346,7 +1346,7 @@ has_color_buffer_write_enabled(const struct anv_pipeline 
*pipeline)
   if (bind_map->surface_to_descriptor[i].set !=
   ANV_DESCRIPTOR_SET_COLOR_ATTACHMENTS)
  continue;
-  if (bind_map->surface_to_descriptor[i].index != UINT8_MAX)
+  if (bind_map->surface_to_descriptor[i].index != UINT32_MAX)
  return true;
}
 
-- 
2.13.5



Re: [Mesa-dev] [PATCH] anv: set right datatypes in anv_pipeline_binding

2017-08-25 Thread Juan A. Suarez Romero
On Fri, 2017-08-25 at 09:08 +0300, Tapani Pälli wrote:
> On 08/24/2017 01:12 PM, Juan A. Suarez Romero wrote:
> > This structure contains two fields, binding and index, that store
> > the
> > binding in the descriptor set and the index inside the binding.
> > 
> > These structures are defined as uint8_t, but the types in Vulkan
> > specification are uint32_t, so big values are clamp.
> 
> Forgive me if I'm asking stupid things ... but does this mean also
> that 
> some of the code dealing with anv_pipeline_binding should change? I
> can 
> see that at least index gets set with UINT8_MAX when having no
> render 
> targets. Otherwise I agree that from API pov these are uint32_t.
> 

That's a good question. I think that we should use UINT32_MAX instead,
the maximum value.

I'll send a new v2 with this change.
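
For illustration, a small sketch of why both the field width and the
sentinel have to change together. The struct and function names are
hypothetical and only mirror the two fields discussed here; the point
is that storing a 32-bit value in a uint8_t keeps only the low 8 bits,
and that a legitimate index of 255 becomes indistinguishable from a
UINT8_MAX sentinel.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the field types before and after the patch. */
struct binding_u8  { uint8_t  binding; uint8_t  index; };
struct binding_u32 { uint32_t binding; uint32_t index; };

/* Store 32-bit API values into 8-bit fields: the conversion keeps only
 * the low 8 bits (the value modulo 256). */
static struct binding_u8
store_u8(uint32_t binding, uint32_t index)
{
   struct binding_u8 b;
   b.binding = (uint8_t)binding;
   b.index = (uint8_t)index;
   return b;
}

/* Store the same values into 32-bit fields: nothing is lost. */
static struct binding_u32
store_u32(uint32_t binding, uint32_t index)
{
   struct binding_u32 b;
   b.binding = binding;
   b.index = index;
   return b;
}

/* "No render target" sentinel checks for each layout. */
static int is_sentinel_u8(struct binding_u8 b)   { return b.index == UINT8_MAX; }
static int is_sentinel_u32(struct binding_u32 b) { return b.index == UINT32_MAX; }
```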


J.A.


> 
> > This fixes dEQP-
> > VK.binding_model.shader_access.*.multiple_arbitrary_descriptors.*
> > ---
> >   src/intel/vulkan/anv_private.h | 4 ++--
> >   1 file changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/src/intel/vulkan/anv_private.h
> > b/src/intel/vulkan/anv_private.h
> > index 6b2414429f..d5657c5e09 100644
> > --- a/src/intel/vulkan/anv_private.h
> > +++ b/src/intel/vulkan/anv_private.h
> > @@ -1236,10 +1236,10 @@ struct anv_pipeline_binding {
> >  uint8_t set;
> >   
> >  /* Binding in the descriptor set */
> > -   uint8_t binding;
> > +   uint32_t binding;
> >   
> >  /* Index in the binding */
> > -   uint8_t index;
> > +   uint32_t index;
> >   
> >  /* Input attachment index (relative to the subpass) */
> >  uint8_t input_attachment_index;
> > 
> 
> 


Re: [Mesa-dev] [PATCH] st/va: move YUV content to deinterlaced buffer when reallocated for encoder

2017-08-25 Thread Leo Liu



On 08/25/2017 10:53 AM, Leo Liu wrote:



On 08/25/2017 02:57 AM, Christian König wrote:

Am 24.08.2017 um 20:49 schrieb Leo Liu:

v2: use deinterlace common function
v3: make sure deinterlace only

Signed-off-by: Leo Liu 
---
  src/gallium/state_trackers/va/picture.c | 22 --
  1 file changed, 16 insertions(+), 6 deletions(-)

diff --git a/src/gallium/state_trackers/va/picture.c 
b/src/gallium/state_trackers/va/picture.c

index 6c3c4fe..aa4062d 100644
--- a/src/gallium/state_trackers/va/picture.c
+++ b/src/gallium/state_trackers/va/picture.c
@@ -613,17 +613,22 @@ vlVaEndPicture(VADriverContextP ctx, 
VAContextID context_id)

 mtx_lock(&drv->mutex);
 surf = handle_table_get(drv->htab, context->target_id);
 context->mpeg4.frame_num++;
-
 screen = context->decoder->context->screen;
 interlaced = screen->get_video_param(screen, 
context->decoder->profile,

context->decoder->entrypoint,
PIPE_VIDEO_CAP_SUPPORTS_INTERLACED);
   if (surf->buffer->interlaced != interlaced) {
-  surf->templat.interlaced = screen->get_video_param(screen, 
context->decoder->profile,

- PIPE_VIDEO_ENTRYPOINT_BITSTREAM,
- PIPE_VIDEO_CAP_PREFERS_INTERLACED);
-  realloc = true;
+  interlaced = screen->get_video_param(screen, 
context->decoder->profile,

+ context->decoder->entrypoint,
+ PIPE_VIDEO_CAP_PREFERS_INTERLACED);
+  if (!interlaced) {
+ /* The current cases for buffer reallocation are
+all from the interlaced to the deinterlaced,
+and there is no case for the other way around */
+ surf->templat.interlaced = false;
+ realloc = true;
+  }


Should we bail out with an error here when it's the other way around?
Although I cannot think of any case where the buffer would become interlaced
now, it's still a good idea to bail out here when that happens.

Will add it in v4.


It's not an error when, for example, the buffer is deinterlaced and the query
returns interlaced. In that case we should do nothing and just ignore it.

I have sent out v4, please ignore it, it won't work.

Leo






Thanks,
Leo





Would be nice if we could at least sanely handle that case.

Apart from that it looks good to me,
Christian.


 }
   if (u_reduce_video_profile(context->templat.profile) == 
PIPE_VIDEO_FORMAT_JPEG &&
@@ -640,13 +645,18 @@ vlVaEndPicture(VADriverContextP ctx, 
VAContextID context_id)

 }
   if (realloc) {
-  surf->buffer->destroy(surf->buffer);
+  struct pipe_video_buffer *old_buf = surf->buffer;
  if (vlVaHandleSurfaceAllocate(ctx, surf, &surf->templat) 
!= VA_STATUS_SUCCESS) {

+ old_buf->destroy(old_buf);
   mtx_unlock(&drv->mutex);
   return VA_STATUS_ERROR_ALLOCATION_FAILED;
}
  +  if (context->decoder->entrypoint == 
PIPE_VIDEO_ENTRYPOINT_ENCODE)
+ vl_compositor_yuv_deint(&drv->cstate, &drv->compositor, 
old_buf, surf->buffer);

+
+  old_buf->destroy(old_buf);
context->target = surf->buffer;
 }









Re: [Mesa-dev] TGSI 16-bit support

2017-08-25 Thread Matt Turner
On Fri, Aug 25, 2017 at 10:50 AM, Nicolai Hähnle  wrote:
> On 25.08.2017 13:58, Marek Olšák wrote:
>>
>> Nicolai,
>>
>> Have you thought about switching to NIR for radeonsi completely to get
>> 16-bit support? We need NIR support anyway for spirv, right? Would it be
>> easier than adding 16-bit support into TGSI, glsl2tgsi, and tgsi2llvm?
>
>
> Well. What's missing from the NIR path is:
>
> (1) GS and tess (the ABI parts only)
> (2) re-adding some minor extensions (shader_group_vote?)

I added shader_group_vote/shader_ballot support to NIR when I enabled
those extensions for i965 recently, so no work to do there.


Re: [Mesa-dev] [PATCH 1/3] st/omx: move YUV deinterlace function to common

2017-08-25 Thread Leo Liu



On 08/24/2017 02:48 PM, Leo Liu wrote:



On 08/24/2017 11:34 AM, Christian König wrote:

Am 24.08.2017 um 17:11 schrieb Leo Liu:

Signed-off-by: Leo Liu 


Reviewed-by: Christian König  for the series.

Andy do you want to test this? Should make VA-API transcoding simpler 
to use.


Just got a chance to test transcoding (previously only encoding). There is
an issue with the current patch 2: the encoder/decoder have
deinterlaced/interlaced buffers.

v3 will address that, and performance stays the same as before.
Forget v4; its logic is not right, and the v3 handling is correct. If
it's the other way around, the buffer should be ignored, not reallocated.


Regards,
Leo





Regards,
Leo






Regards,
Christian.


---
  src/gallium/auxiliary/vl/vl_compositor.c | 87 
+---

  src/gallium/auxiliary/vl/vl_compositor.h | 21 
  src/gallium/state_trackers/omx/vid_dec.c | 32 +---
  3 files changed, 68 insertions(+), 72 deletions(-)

diff --git a/src/gallium/auxiliary/vl/vl_compositor.c 
b/src/gallium/auxiliary/vl/vl_compositor.c

index a79bf11264..794c8b5b17 100644
--- a/src/gallium/auxiliary/vl/vl_compositor.c
+++ b/src/gallium/auxiliary/vl/vl_compositor.c
@@ -885,6 +885,32 @@ draw_layers(struct vl_compositor *c, struct 
vl_compositor_state *s, struct u_rec

 }
  }
  +static void
+set_yuv_layer(struct vl_compositor_state *s, struct vl_compositor 
*c, unsigned layer,
+  struct pipe_video_buffer *buffer, struct u_rect 
*src_rect,

+  struct u_rect *dst_rect, bool y)
+{
+   struct pipe_sampler_view **sampler_views;
+   unsigned i;
+
+   assert(s && c && buffer);
+
+   assert(layer < VL_COMPOSITOR_MAX_LAYERS);
+
+   s->used_layers |= 1 << layer;
+   sampler_views = buffer->get_sampler_view_components(buffer);
+   for (i = 0; i < 3; ++i) {
+  s->layers[layer].samplers[i] = c->sampler_linear;
+      pipe_sampler_view_reference(&s->layers[layer].sampler_views[i], sampler_views[i]);

+   }
+
+   calc_src_and_dst(&s->layers[layer], buffer->width, buffer->height,
+                    src_rect ? *src_rect : default_rect(&s->layers[layer]),
+                    dst_rect ? *dst_rect : default_rect(&s->layers[layer]));

+
+   s->layers[layer].fs = (y) ? c->fs_weave_yuv.y : c->fs_weave_yuv.uv;
+}
+
  void
  vl_compositor_reset_dirty_area(struct u_rect *dirty)
  {
@@ -1143,36 +1169,6 @@ vl_compositor_set_layer_rotation(struct 
vl_compositor_state *s,

  }
void
-vl_compositor_set_yuv_layer(struct vl_compositor_state *s,
-struct vl_compositor *c,
-unsigned layer,
-struct pipe_video_buffer *buffer,
-struct u_rect *src_rect,
-struct u_rect *dst_rect,
-bool y)
-{
-   struct pipe_sampler_view **sampler_views;
-   unsigned i;
-
-   assert(s && c && buffer);
-
-   assert(layer < VL_COMPOSITOR_MAX_LAYERS);
-
-   s->used_layers |= 1 << layer;
-   sampler_views = buffer->get_sampler_view_components(buffer);
-   for (i = 0; i < 3; ++i) {
-  s->layers[layer].samplers[i] = c->sampler_linear;
-      pipe_sampler_view_reference(&s->layers[layer].sampler_views[i], sampler_views[i]);

-   }
-
-   calc_src_and_dst(&s->layers[layer], buffer->width, buffer->height,
-                    src_rect ? *src_rect : default_rect(&s->layers[layer]),
-                    dst_rect ? *dst_rect : default_rect(&s->layers[layer]));

-
-   s->layers[layer].fs = (y) ? c->fs_weave_yuv.y : c->fs_weave_yuv.uv;
-}
-
-void
  vl_compositor_render(struct vl_compositor_state *s,
   struct vl_compositor   *c,
   struct pipe_surface *dst_surface,
@@ -1215,6 +1211,37 @@ vl_compositor_render(struct 
vl_compositor_state *s,

 draw_layers(c, s, dirty_area);
  }
  +void
+vl_compositor_yuv_deint(struct vl_compositor_state *s,
+struct vl_compositor *c,
+struct pipe_video_buffer *src,
+struct pipe_video_buffer *dst)
+{
+   struct pipe_surface **dst_surfaces;
+   struct u_rect dst_rect;
+
+   dst_surfaces = dst->get_surfaces(dst);
+   vl_compositor_clear_layers(s);
+
+   dst_rect.x0 = 0;
+   dst_rect.x1 = src->width;
+   dst_rect.y0 = 0;
+   dst_rect.y1 = src->height;
+
+   set_yuv_layer(s, c, 0, src, NULL, NULL, true);
+   vl_compositor_set_layer_dst_area(s, 0, &dst_rect);
+   vl_compositor_render(s, c, dst_surfaces[0], NULL, false);
+
+   dst_rect.x1 /= 2;
+   dst_rect.y1 /= 2;
+
+   set_yuv_layer(s, c, 0, src, NULL, NULL, false);
+   vl_compositor_set_layer_dst_area(s, 0, &dst_rect);
+   vl_compositor_render(s, c, dst_surfaces[1], NULL, false);
+
+   s->pipe->flush(s->pipe, NULL, 0);
+}
+
  bool
  vl_compositor_init(struct vl_compositor *c, struct pipe_context 
*pipe)

  {
diff --git a/src/gallium/auxiliary/vl/vl_compositor.h 
b/src/gallium/auxiliary/vl/vl_compositor.h

index 535abb75cd..2546d75b23 100644
--- 

[Mesa-dev] [PATCH 2/3] st/va move YUV content to deinterlaced buffer

2017-08-25 Thread Leo Liu
when reallocated for the encoder.

v2: use the common deinterlace function
v3: make sure we only deinterlace
v4: bail out when the reallocation would be to an interlaced buffer

Signed-off-by: Leo Liu 
---
 src/gallium/state_trackers/va/picture.c | 25 -
 1 file changed, 20 insertions(+), 5 deletions(-)

diff --git a/src/gallium/state_trackers/va/picture.c 
b/src/gallium/state_trackers/va/picture.c
index 47e63d3b30..76434ee721 100644
--- a/src/gallium/state_trackers/va/picture.c
+++ b/src/gallium/state_trackers/va/picture.c
@@ -626,10 +626,20 @@ vlVaEndPicture(VADriverContextP ctx, VAContextID 
context_id)
 PIPE_VIDEO_CAP_SUPPORTS_INTERLACED);
 
if (surf->buffer->interlaced != interlaced) {
-  surf->templat.interlaced = screen->get_video_param(screen, 
context->decoder->profile,
- 
PIPE_VIDEO_ENTRYPOINT_BITSTREAM,
- 
PIPE_VIDEO_CAP_PREFERS_INTERLACED);
-  realloc = true;
+  interlaced = screen->get_video_param(screen, context->decoder->profile,
+   context->decoder->entrypoint,
+   PIPE_VIDEO_CAP_PREFERS_INTERLACED);
+  if (!interlaced) {
+ /* The current cases for buffer reallocation are
+all from the interlaced to the deinterlaced,
+and there is no case for the other way around */
+ surf->templat.interlaced = false;
+ realloc = true;
+  } else {
+ mtx_unlock(&drv->mutex);
+ return VA_STATUS_ERROR_INVALID_SURFACE;
+  }
+
}
 
format = screen->get_video_param(screen, context->decoder->profile,
@@ -657,13 +667,18 @@ vlVaEndPicture(VADriverContextP ctx, VAContextID 
context_id)
}
 
if (realloc) {
-  surf->buffer->destroy(surf->buffer);
+  struct pipe_video_buffer *old_buf = surf->buffer;
 
   if (vlVaHandleSurfaceAllocate(ctx, surf, &surf->templat) != VA_STATUS_SUCCESS) {
+ old_buf->destroy(old_buf);
  mtx_unlock(&drv->mutex);
  return VA_STATUS_ERROR_ALLOCATION_FAILED;
   }
 
+  if (context->decoder->entrypoint == PIPE_VIDEO_ENTRYPOINT_ENCODE)
+ vl_compositor_yuv_deint(&drv->cstate, &drv->compositor, old_buf, surf->buffer);
+
+  old_buf->destroy(old_buf);
   context->target = surf->buffer;
}
 
-- 
2.11.0
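
The realloc hunk above depends on one ordering rule: the old buffer must stay alive until the new one has been allocated and, for the encoder, filled from it. A minimal standalone sketch of that ordering follows. Everything in it is an assumption made for illustration: `struct buffer`, `buffer_create`, `copy_contents` and `surface_realloc` are hypothetical stand-ins rather than Mesa API, and a plain `memcpy` takes the place of the compositor-based deinterlace.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for pipe_video_buffer. */
struct buffer {
   bool interlaced;
   size_t size;
   unsigned char *data;
};

static struct buffer *
buffer_create(size_t size, bool interlaced)
{
   struct buffer *buf = calloc(1, sizeof(*buf));
   if (!buf)
      return NULL;
   buf->data = calloc(size, 1);
   if (!buf->data) {
      free(buf);
      return NULL;
   }
   buf->size = size;
   buf->interlaced = interlaced;
   return buf;
}

static void
buffer_destroy(struct buffer *buf)
{
   if (buf) {
      free(buf->data);
      free(buf);
   }
}

/* Hypothetical stand-in for the deinterlacing copy: here a plain memcpy. */
static void
copy_contents(const struct buffer *src, struct buffer *dst)
{
   size_t n = src->size < dst->size ? src->size : dst->size;
   memcpy(dst->data, src->data, n);
}

/* Replace *slot with a progressive buffer of the same size.
 * The old buffer is destroyed only after the optional copy.
 * On allocation failure the old buffer is released as well, mirroring
 * the patch, and *slot is set to NULL. */
static bool
surface_realloc(struct buffer **slot, bool keep_contents)
{
   assert(slot && *slot);

   struct buffer *old_buf = *slot;
   struct buffer *new_buf = buffer_create(old_buf->size, false);

   if (!new_buf) {
      buffer_destroy(old_buf);
      *slot = NULL;
      return false;
   }

   if (keep_contents)
      copy_contents(old_buf, new_buf);

   buffer_destroy(old_buf);
   *slot = new_buf;
   return true;
}
```

Destroying the old buffer before the allocation, as the pre-patch code did, would leave nothing to copy from, which is why the pointer is saved first.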



Re: [Mesa-dev] [PATCH] st/va: move YUV content to deinterlaced buffer when reallocated for encoder

2017-08-25 Thread Leo Liu



On 08/25/2017 02:57 AM, Christian König wrote:

Am 24.08.2017 um 20:49 schrieb Leo Liu:

v2: use deinterlace common function
v3: make sure deinterlace only

Signed-off-by: Leo Liu 
---
  src/gallium/state_trackers/va/picture.c | 22 --
  1 file changed, 16 insertions(+), 6 deletions(-)

diff --git a/src/gallium/state_trackers/va/picture.c 
b/src/gallium/state_trackers/va/picture.c

index 6c3c4fe..aa4062d 100644
--- a/src/gallium/state_trackers/va/picture.c
+++ b/src/gallium/state_trackers/va/picture.c
@@ -613,17 +613,22 @@ vlVaEndPicture(VADriverContextP ctx, 
VAContextID context_id)

 mtx_lock(&drv->mutex);
 surf = handle_table_get(drv->htab, context->target_id);
 context->mpeg4.frame_num++;
-
 screen = context->decoder->context->screen;
 interlaced = screen->get_video_param(screen, 
context->decoder->profile,

context->decoder->entrypoint,
PIPE_VIDEO_CAP_SUPPORTS_INTERLACED);
   if (surf->buffer->interlaced != interlaced) {
-  surf->templat.interlaced = screen->get_video_param(screen, 
context->decoder->profile,

- PIPE_VIDEO_ENTRYPOINT_BITSTREAM,
- PIPE_VIDEO_CAP_PREFERS_INTERLACED);
-  realloc = true;
+  interlaced = screen->get_video_param(screen, 
context->decoder->profile,

+ context->decoder->entrypoint,
+ PIPE_VIDEO_CAP_PREFERS_INTERLACED);
+  if (!interlaced) {
+ /* The current cases for buffer reallocation are
+all from the interlaced to the deinterlaced,
+and there is no case for the other way around */
+ surf->templat.interlaced = false;
+ realloc = true;
+  }


Should we bail out with an error here when it's the other way around?
Although I cannot think of any case that would produce an interlaced buffer 
now, it's still a good idea to bail out here if it happens.

Will add it in v4.

Thanks,
Leo





Would be nice if we could at least sanely handle that case.

Apart from that it looks good to me,
Christian.


 }
   if (u_reduce_video_profile(context->templat.profile) == 
PIPE_VIDEO_FORMAT_JPEG &&
@@ -640,13 +645,18 @@ vlVaEndPicture(VADriverContextP ctx, 
VAContextID context_id)

 }
   if (realloc) {
-  surf->buffer->destroy(surf->buffer);
+  struct pipe_video_buffer *old_buf = surf->buffer;
  if (vlVaHandleSurfaceAllocate(ctx, surf, &surf->templat) != VA_STATUS_SUCCESS) {
+ old_buf->destroy(old_buf);
   mtx_unlock(&drv->mutex);
   return VA_STATUS_ERROR_ALLOCATION_FAILED;
}
  +  if (context->decoder->entrypoint == PIPE_VIDEO_ENTRYPOINT_ENCODE)
+ vl_compositor_yuv_deint(&drv->cstate, &drv->compositor, old_buf, surf->buffer);

+
+  old_buf->destroy(old_buf);
context->target = surf->buffer;
 }







Re: [Mesa-dev] TGSI 16-bit support

2017-08-25 Thread Nicolai Hähnle

On 25.08.2017 13:58, Marek Olšák wrote:

Nicolai,

Have you thought about switching to NIR for radeonsi completely to get 
16-bit support? We need NIR support anyway for spirv, right? Would it 
be easier than adding 16-bit support into TGSI, glsl2tgsi, and tgsi2llvm?


Well. What's missing from the NIR path is:

(1) GS and tess (the ABI parts only)
(2) re-adding some minor extensions (shader_group_vote?)
(3) fixing all the bugs -- it's been a while since I've done a piglit 
comparison


There's a bunch of unknowns, so it's hard to say, but once we're there 
16-bit should be much easier, so may be worth it.


Cheers,
Nicolai
--
Lerne, wie die Welt wirklich ist,
Aber vergiss niemals, wie sie sein sollte.


[Mesa-dev] [PATCH 2/3] radeonsi: fix ARB_transform_feedback_overflow_query on <= VI

2017-08-25 Thread Nicolai Hähnle
From: Nicolai Hähnle 

The result written by the shader workaround needs to be written back, or
the CP may read stale data.

Fixes: 78476cfe071a ("radeonsi: enable ARB_transform_feedback_overflow_query")
---
 src/gallium/drivers/radeon/r600_pipe_common.h | 5 +
 src/gallium/drivers/radeon/r600_query.c   | 4 
 src/gallium/drivers/radeonsi/si_pipe.c| 4 +++-
 3 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeon/r600_pipe_common.h 
b/src/gallium/drivers/radeon/r600_pipe_common.h
index 59886e6..dca56734cd7 100644
--- a/src/gallium/drivers/radeon/r600_pipe_common.h
+++ b/src/gallium/drivers/radeon/r600_pipe_common.h
@@ -447,20 +447,25 @@ struct r600_common_screen {
 * contexts' compressed texture binding masks.
 */
unsignedcompressed_colortex_counter;
 
struct {
/* Context flags to set so that all writes from earlier jobs
 * in the CP are seen by L2 clients.
 */
unsigned cp_to_L2;
 
+   /* Context flags to set so that all writes from earlier jobs
+* that end in L2 are seen by CP.
+*/
+   unsigned L2_to_cp;
+
/* Context flags to set so that all writes from earlier
 * compute jobs are seen by L2 clients.
 */
unsigned compute_to_L2;
} barrier_flags;
 
void (*query_opaque_metadata)(struct r600_common_screen *rscreen,
  struct r600_texture *rtex,
  struct radeon_bo_metadata *md);
 
diff --git a/src/gallium/drivers/radeon/r600_query.c 
b/src/gallium/drivers/radeon/r600_query.c
index eaff39c830d..f937612bc1f 100644
--- a/src/gallium/drivers/radeon/r600_query.c
+++ b/src/gallium/drivers/radeon/r600_query.c
@@ -1826,20 +1826,24 @@ static void r600_render_condition(struct pipe_context 
*ctx,
 
/* Reset to NULL to avoid a redundant SET_PREDICATION
 * from launching the compute grid.
 */
rctx->render_cond = NULL;
 
ctx->get_query_result_resource(
ctx, query, true, PIPE_QUERY_TYPE_U64, 0,
&rquery->workaround_buf->b.b, rquery->workaround_offset);
 
+   /* Setting this in the render cond atom is too late,
+* so set it here. */
+   rctx->flags |= rctx->screen->barrier_flags.L2_to_cp;
+
atom->num_dw = 5;
 
rctx->render_cond_force_off = old_force_off;
} else {
for (qbuf = &rquery->buffer; qbuf; qbuf = qbuf->previous)
atom->num_dw += (qbuf->results_end / rquery->result_size) * 5;
 
if (rquery->b.type == 
PIPE_QUERY_SO_OVERFLOW_ANY_PREDICATE)
atom->num_dw *= R600_MAX_STREAMS;
}
diff --git a/src/gallium/drivers/radeonsi/si_pipe.c 
b/src/gallium/drivers/radeonsi/si_pipe.c
index 74900439320..93f9e5c49af 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.c
+++ b/src/gallium/drivers/radeonsi/si_pipe.c
@@ -1071,22 +1071,24 @@ struct pipe_screen *radeonsi_screen_create(struct 
radeon_winsys *ws,
(sscreen->b.family == CHIP_STONEY ||
 sscreen->b.family == CHIP_RAVEN);
}
 
(void) mtx_init(&sscreen->shader_parts_mutex, mtx_plain);
sscreen->use_monolithic_shaders =
(sscreen->b.debug_flags & DBG_MONOLITHIC_SHADERS) != 0;
 
sscreen->b.barrier_flags.cp_to_L2 = SI_CONTEXT_INV_SMEM_L1 |
SI_CONTEXT_INV_VMEM_L1;
-   if (sscreen->b.chip_class <= VI)
+   if (sscreen->b.chip_class <= VI) {
sscreen->b.barrier_flags.cp_to_L2 |= SI_CONTEXT_INV_GLOBAL_L2;
+   sscreen->b.barrier_flags.L2_to_cp |= 
SI_CONTEXT_WRITEBACK_GLOBAL_L2;
+   }
 
sscreen->b.barrier_flags.compute_to_L2 = SI_CONTEXT_CS_PARTIAL_FLUSH;
 
if (debug_get_bool_option("RADEON_DUMP_SHADERS", false))
sscreen->b.debug_flags |= DBG_FS | DBG_VS | DBG_GS | DBG_PS | 
DBG_CS;
 
for (i = 0; i < num_compiler_threads; i++)
sscreen->tm[i] = si_create_llvm_target_machine(sscreen);
for (i = 0; i < num_compiler_threads_lowprio; i++)
sscreen->tm_low_priority[i] = 
si_create_llvm_target_machine(sscreen);
-- 
2.11.0
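
The idea of the patch above can be sketched in isolation: the screen records, per chip class, which cache operation makes L2 contents visible to the CP, and the render-condition code ORs that value into the pending context flags right after queuing the workaround shader. The sketch below is only an illustration under stated assumptions. The flag bit values, the `chip_class` enum and the helper names `init_barrier_flags` and `queue_predicate_workaround` are hypothetical; only the `cp_to_L2` / `L2_to_cp` field names and the "<= VI" condition come from the patch.

```c
#include <assert.h>

/* Hypothetical bit values; the real SI_CONTEXT_* flags live in the driver. */
#define CTX_INV_SMEM_L1         (1u << 0)
#define CTX_INV_VMEM_L1         (1u << 1)
#define CTX_INV_GLOBAL_L2       (1u << 2)
#define CTX_WRITEBACK_GLOBAL_L2 (1u << 3)

/* Hypothetical ordering, chosen so that "<= CHIP_VI" reads like the patch. */
enum chip_class { CHIP_SI, CHIP_CIK, CHIP_VI, CHIP_GFX9 };

struct barrier_flags {
   unsigned cp_to_L2; /* make earlier CP writes visible to L2 clients */
   unsigned L2_to_cp; /* make earlier writes that end in L2 visible to the CP */
};

static void
init_barrier_flags(struct barrier_flags *f, enum chip_class chip)
{
   assert(f);

   f->cp_to_L2 = CTX_INV_SMEM_L1 | CTX_INV_VMEM_L1;
   f->L2_to_cp = 0;

   if (chip <= CHIP_VI) {
      f->cp_to_L2 |= CTX_INV_GLOBAL_L2;
      f->L2_to_cp |= CTX_WRITEBACK_GLOBAL_L2;
   }
}

/* Called right after the workaround shader has been queued: request the
 * write-back now, because requesting it when the predication packet is
 * emitted would be too late. */
static unsigned
queue_predicate_workaround(unsigned ctx_flags, const struct barrier_flags *f)
{
   return ctx_flags | f->L2_to_cp;
}
```

On chip classes where `L2_to_cp` stays zero the OR is a no-op, so the caller does not need its own chip check.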



[Mesa-dev] [PATCH 3/3] radeonsi: ensure cache flushes happen before SET_PREDICATION packets

2017-08-25 Thread Nicolai Hähnle
From: Nicolai Hähnle 

The data is read when the render_cond_atom is emitted, so we must
delay emitting the atom until after the flush.

Fixes: 0fe0320dc074 ("radeonsi: use optimal packet order when doing a pipeline 
sync")
---
 src/gallium/drivers/radeon/r600_pipe_common.h |  3 ++-
 src/gallium/drivers/radeon/r600_query.c   |  9 ++---
 src/gallium/drivers/radeonsi/si_state_draw.c  | 15 ++-
 3 files changed, 18 insertions(+), 9 deletions(-)

diff --git a/src/gallium/drivers/radeon/r600_pipe_common.h 
b/src/gallium/drivers/radeon/r600_pipe_common.h
index dca56734cd7..f78e38b65af 100644
--- a/src/gallium/drivers/radeon/r600_pipe_common.h
+++ b/src/gallium/drivers/radeon/r600_pipe_common.h
@@ -54,21 +54,22 @@ struct u_log_context;
 #define R600_RESOURCE_FLAG_TRANSFER(PIPE_RESOURCE_FLAG_DRV_PRIV << 
0)
 #define R600_RESOURCE_FLAG_FLUSHED_DEPTH   (PIPE_RESOURCE_FLAG_DRV_PRIV << 
1)
 #define R600_RESOURCE_FLAG_FORCE_TILING
(PIPE_RESOURCE_FLAG_DRV_PRIV << 2)
 #define R600_RESOURCE_FLAG_DISABLE_DCC (PIPE_RESOURCE_FLAG_DRV_PRIV << 
3)
 #define R600_RESOURCE_FLAG_UNMAPPABLE  (PIPE_RESOURCE_FLAG_DRV_PRIV << 
4)
 
 #define R600_CONTEXT_STREAMOUT_FLUSH   (1u << 0)
 /* Pipeline & streamout query controls. */
 #define R600_CONTEXT_START_PIPELINE_STATS  (1u << 1)
 #define R600_CONTEXT_STOP_PIPELINE_STATS   (1u << 2)
-#define R600_CONTEXT_PRIVATE_FLAG  (1u << 3)
+#define R600_CONTEXT_FLUSH_FOR_RENDER_COND (1u << 3)
+#define R600_CONTEXT_PRIVATE_FLAG  (1u << 4)
 
 /* special primitive types */
 #define R600_PRIM_RECTANGLE_LIST   PIPE_PRIM_MAX
 
 #define R600_NOT_QUERY 0x
 
 /* Debug flags. */
 /* logging and features */
 #define DBG_TEX(1 << 0)
 #define DBG_NIR(1 << 1)
diff --git a/src/gallium/drivers/radeon/r600_query.c 
b/src/gallium/drivers/radeon/r600_query.c
index f937612bc1f..03ff1018a71 100644
--- a/src/gallium/drivers/radeon/r600_query.c
+++ b/src/gallium/drivers/radeon/r600_query.c
@@ -1828,25 +1828,28 @@ static void r600_render_condition(struct pipe_context 
*ctx,
 * from launching the compute grid.
 */
rctx->render_cond = NULL;
 
ctx->get_query_result_resource(
ctx, query, true, PIPE_QUERY_TYPE_U64, 0,
&rquery->workaround_buf->b.b, rquery->workaround_offset);
 
/* Setting this in the render cond atom is too late,
 * so set it here. */
-   rctx->flags |= rctx->screen->barrier_flags.L2_to_cp;
-
-   atom->num_dw = 5;
+   rctx->flags |= rctx->screen->barrier_flags.L2_to_cp |
+  R600_CONTEXT_FLUSH_FOR_RENDER_COND;
 
rctx->render_cond_force_off = old_force_off;
+   }
+
+   if (needs_workaround) {
+   atom->num_dw = 5;
} else {
for (qbuf = &rquery->buffer; qbuf; qbuf = qbuf->previous)
atom->num_dw += (qbuf->results_end / rquery->result_size) * 5;
 
if (rquery->b.type == 
PIPE_QUERY_SO_OVERFLOW_ANY_PREDICATE)
atom->num_dw *= R600_MAX_STREAMS;
}
}
 
rctx->render_cond = query;
diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c 
b/src/gallium/drivers/radeonsi/si_state_draw.c
index 1d8be49a480..81751d2186e 100644
--- a/src/gallium/drivers/radeonsi/si_state_draw.c
+++ b/src/gallium/drivers/radeonsi/si_state_draw.c
@@ -1385,34 +1385,39 @@ void si_draw_vbo(struct pipe_context *ctx, const struct 
pipe_draw_info *info)
  SI_CONTEXT_PS_PARTIAL_FLUSH |
  SI_CONTEXT_CS_PARTIAL_FLUSH))) {
/* If we have to wait for idle, set all states first, so that 
all
 * SET packets are processed in parallel with previous draw 
calls.
 * Then upload descriptors, set shader pointers, and draw, and
 * prefetch at the end. This ensures that the time the CUs
 * are idle is very short. (there are only SET_SH packets 
between
 * the wait and the draw)
 */
struct r600_atom *shader_pointers = &sctx->shader_pointers.atom;
+   unsigned masked_atoms = 1u << shader_pointers->id;
 
-   /* Emit all states except shader pointers. */
-   si_emit_all_states(sctx, info, 1 << shader_pointers->id);
+   if (unlikely(sctx->b.flags & 
R600_CONTEXT_FLUSH_FOR_RENDER_COND))
+   masked_atoms |= 1u << sctx->b.render_cond_atom.id;
+
+   /* Emit all states except shader pointers and render 

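
The masking scheme in the hunk above can be sketched on its own: atoms whose bit is set in the mask are skipped in the first pass and emitted only after the flush. This is a rough illustration, not the driver code. The rest of the hunk is cut off in this archive, so how the driver actually emits the deferred atoms is an assumption here, and the atom ids, `struct trace`, `emit_atoms` and `draw` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical atom ids and flush marker. */
enum {
   ATOM_BLEND,
   ATOM_RENDER_COND,
   ATOM_VIEWPORT,
   ATOM_SHADER_POINTERS,
   NUM_ATOMS
};
#define EVENT_FLUSH 100

/* Records the order in which things would reach the command stream. */
struct trace {
   int events[16];
   size_t count;
};

static void
record(struct trace *t, int ev)
{
   assert(t->count < 16);
   t->events[t->count++] = ev;
}

/* Emit every dirty atom whose bit is not set in skip_mask. */
static void
emit_atoms(struct trace *t, unsigned dirty, unsigned skip_mask)
{
   for (unsigned id = 0; id < NUM_ATOMS; id++) {
      if ((dirty & (1u << id)) && !(skip_mask & (1u << id)))
         record(t, (int)id);
   }
}

static void
draw(struct trace *t, unsigned dirty, int flush_for_render_cond)
{
   unsigned masked_atoms = 1u << ATOM_SHADER_POINTERS;

   /* The predicate is read when its atom is emitted, so hold it back too. */
   if (flush_for_render_cond)
      masked_atoms |= 1u << ATOM_RENDER_COND;

   emit_atoms(t, dirty, masked_atoms);     /* may overlap with earlier draws */
   record(t, EVENT_FLUSH);                 /* cache flush / wait for idle */
   emit_atoms(t, dirty & masked_atoms, 0); /* deferred atoms, after the flush */
}
```

With the flag set, the render-condition atom moves from before the flush to after it; without the flag only the shader pointers are deferred.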
[Mesa-dev] [PATCH 1/3] radeonsi: fix compute shader state dumping

2017-08-25 Thread Nicolai Hähnle
From: Nicolai Hähnle 

Fixes: 420c438589c8 ("radeonsi: log draw and compute state into log context")
---
 src/gallium/drivers/radeonsi/si_debug.c | 17 +++--
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_debug.c 
b/src/gallium/drivers/radeonsi/si_debug.c
index c2242a6deab..43ad73d92df 100644
--- a/src/gallium/drivers/radeonsi/si_debug.c
+++ b/src/gallium/drivers/radeonsi/si_debug.c
@@ -53,72 +53,78 @@ struct si_log_chunk_shader {
/* The shader destroy code assumes a current context for unlinking of
 * PM4 packets etc.
 *
 * While we should be able to destroy shaders without a context, doing
 * so would happen only very rarely and be therefore likely to fail
 * just when you're trying to debug something. Let's just remember the
 * current context in the chunk.
 */
struct si_context *ctx;
struct si_shader *shader;
+   enum pipe_shader_type processor;
 
/* For keep-alive reference counts */
struct si_shader_selector *sel;
struct si_compute *program;
 };
 
 static void
 si_log_chunk_shader_destroy(void *data)
 {
struct si_log_chunk_shader *chunk = data;
si_shader_selector_reference(chunk->ctx, &chunk->sel, NULL);
si_compute_reference(&chunk->program, NULL);
FREE(chunk);
 }
 
 static void
 si_log_chunk_shader_print(void *data, FILE *f)
 {
struct si_log_chunk_shader *chunk = data;
struct si_screen *sscreen = chunk->ctx->screen;
-   si_dump_shader(sscreen, chunk->shader->selector->info.processor,
+   si_dump_shader(sscreen, chunk->processor,
   chunk->shader, f);
 }
 
 static struct u_log_chunk_type si_log_chunk_type_shader = {
.destroy = si_log_chunk_shader_destroy,
.print = si_log_chunk_shader_print,
 };
 
 static void si_dump_gfx_shader(struct si_context *ctx,
   const struct si_shader_ctx_state *state,
   struct u_log_context *log)
 {
struct si_shader *current = state->current;
 
if (!state->cso || !current)
return;
 
struct si_log_chunk_shader *chunk = CALLOC_STRUCT(si_log_chunk_shader);
chunk->ctx = ctx;
+   chunk->processor = state->cso->info.processor;
chunk->shader = current;
si_shader_selector_reference(ctx, &chunk->sel, current->selector);
u_log_chunk(log, &si_log_chunk_type_shader, chunk);
 }
 
-static void si_dump_compute_shader(const struct si_cs_shader_state *state,
+static void si_dump_compute_shader(struct si_context *ctx,
   struct u_log_context *log)
 {
-   if (!state->program || state->program != state->emitted_program)
+   const struct si_cs_shader_state *state = &ctx->cs_shader_state;
+
+   if (!state->program)
return;
 
struct si_log_chunk_shader *chunk = CALLOC_STRUCT(si_log_chunk_shader);
+   chunk->ctx = ctx;
+   chunk->processor = PIPE_SHADER_COMPUTE;
chunk->shader = &state->program->shader;
si_compute_reference(&chunk->program, state->program);
u_log_chunk(log, &si_log_chunk_type_shader, chunk);
 }
 
 /**
  * Shader compiles can be overridden with arbitrary ELF objects by setting
  * the environment variable 
RADEON_REPLACE_SHADERS=num1:filename1[;num2:filename2]
  */
 bool si_replace_shader(unsigned num, struct ac_shader_binary *binary)
@@ -737,22 +743,21 @@ static void si_dump_gfx_descriptors(struct si_context 
*sctx,
 {
if (!state->cso || !state->current)
return;
 
si_dump_descriptors(sctx, state->cso->type, &state->cso->info, log);
 }
 
 static void si_dump_compute_descriptors(struct si_context *sctx,
struct u_log_context *log)
 {
-   if (!sctx->cs_shader_state.program ||
-   sctx->cs_shader_state.program != 
sctx->cs_shader_state.emitted_program)
+   if (!sctx->cs_shader_state.program)
return;
 
si_dump_descriptors(sctx, PIPE_SHADER_COMPUTE, NULL, log);
 }
 
 struct si_shader_inst {
char text[160];  /* one disasm line */
unsigned offset; /* instruction offset */
unsigned size;   /* instruction size = 4 or 8 */
 };
@@ -1060,21 +1065,21 @@ void si_log_draw_state(struct si_context *sctx, struct 
u_log_context *log)
si_dump_gfx_descriptors(sctx, &sctx->tes_shader, log);
si_dump_gfx_descriptors(sctx, &sctx->gs_shader, log);
si_dump_gfx_descriptors(sctx, &sctx->ps_shader, log);
 }
 
 void si_log_compute_state(struct si_context *sctx, struct u_log_context *log)
 {
if (!log)
return;
 
-   si_dump_compute_shader(&sctx->cs_shader_state, log);
+   si_dump_compute_shader(sctx, log);
si_dump_compute_descriptors(sctx, log);
 }
 
 static void si_dump_dma(struct si_context *sctx,
struct radeon_saved_cs *saved, FILE *f)
 {
static const 

[Mesa-dev] [PATCH v2 2/2] radv: propagate VK_ERROR_OUT_OF_HOST_MEMORY to vk{Begin, End}CommandBuffer()

2017-08-25 Thread Samuel Pitoiset
v2: - store record_result in radv_CmdBeginRenderPass()

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_cmd_buffer.c | 24 +---
 1 file changed, 17 insertions(+), 7 deletions(-)

diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index 21e2dfd9f7..3c5d80b053 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -1832,7 +1832,7 @@ radv_cmd_buffer_set_subpass(struct radv_cmd_buffer 
*cmd_buffer,
cmd_buffer->state.dirty |= RADV_CMD_DIRTY_RENDER_TARGETS;
 }
 
-static void
+static VkResult
 radv_cmd_state_setup_attachments(struct radv_cmd_buffer *cmd_buffer,
 struct radv_render_pass *pass,
 const VkRenderPassBeginInfo *info)
@@ -1841,7 +1841,7 @@ radv_cmd_state_setup_attachments(struct radv_cmd_buffer 
*cmd_buffer,
 
if (pass->attachment_count == 0) {
state->attachments = NULL;
-   return;
+   return VK_SUCCESS;
}
 
state->attachments = vk_alloc(&cmd_buffer->pool->alloc,
@@ -1849,8 +1849,8 @@ radv_cmd_state_setup_attachments(struct radv_cmd_buffer 
*cmd_buffer,
sizeof(state->attachments[0]),
8, VK_SYSTEM_ALLOCATION_SCOPE_OBJECT);
if (state->attachments == NULL) {
-   /* FIXME: Propagate VK_ERROR_OUT_OF_HOST_MEMORY to 
vkEndCommandBuffer */
-   abort();
+   cmd_buffer->record_result = VK_ERROR_OUT_OF_HOST_MEMORY;
+   return cmd_buffer->record_result;
}
 
for (uint32_t i = 0; i < pass->attachment_count; ++i) {
@@ -1887,6 +1887,8 @@ radv_cmd_state_setup_attachments(struct radv_cmd_buffer 
*cmd_buffer,
 
state->attachments[i].current_layout = att->initial_layout;
}
+
+   return VK_SUCCESS;
 }
 
 VkResult radv_AllocateCommandBuffers(
@@ -1980,6 +1982,8 @@ VkResult radv_BeginCommandBuffer(
const VkCommandBufferBeginInfo *pBeginInfo)
 {
RADV_FROM_HANDLE(radv_cmd_buffer, cmd_buffer, commandBuffer);
+   VkResult result = VK_SUCCESS;
+
radv_reset_cmd_buffer(cmd_buffer);
 
memset(&cmd_buffer->state, 0, sizeof(cmd_buffer->state));
@@ -2008,12 +2012,15 @@ VkResult radv_BeginCommandBuffer(
struct radv_subpass *subpass =

&cmd_buffer->state.pass->subpasses[pBeginInfo->pInheritanceInfo->subpass];
 
-   radv_cmd_state_setup_attachments(cmd_buffer, 
cmd_buffer->state.pass, NULL);
+   result = radv_cmd_state_setup_attachments(cmd_buffer, 
cmd_buffer->state.pass, NULL);
+   if (result != VK_SUCCESS)
+   return result;
+
radv_cmd_buffer_set_subpass(cmd_buffer, subpass, false);
}
 
radv_cmd_buffer_trace_emit(cmd_buffer);
-   return VK_SUCCESS;
+   return result;
 }
 
 void radv_CmdBindVertexBuffers(
@@ -2642,11 +2649,14 @@ void radv_CmdBeginRenderPass(
 
MAYBE_UNUSED unsigned cdw_max = 
radeon_check_space(cmd_buffer->device->ws,
   cmd_buffer->cs, 
2048);
+   MAYBE_UNUSED VkResult result;
 
cmd_buffer->state.framebuffer = framebuffer;
cmd_buffer->state.pass = pass;
cmd_buffer->state.render_area = pRenderPassBegin->renderArea;
-   radv_cmd_state_setup_attachments(cmd_buffer, pass, pRenderPassBegin);
+   result = radv_cmd_state_setup_attachments(cmd_buffer, pass, 
pRenderPassBegin);
+   if (result != VK_SUCCESS)
+   cmd_buffer->record_result = result;
 
radv_cmd_buffer_set_subpass(cmd_buffer, pass->subpasses, true);
assert(cmd_buffer->cs->cdw <= cdw_max);
-- 
2.14.1
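
The pattern this series introduces can be sketched without any Vulkan headers: a recording command returns `void`, so an allocation failure is stored in the command buffer and only reported when recording ends. The following is a minimal sketch under stated assumptions, not the radv code. `enum result`, `struct cmd_buffer`, the function names and the `simulate_oom` parameter are hypothetical stand-ins for `VkResult`, `radv_cmd_buffer`, the `radv_*` entry points and a real `vk_alloc` failure.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for VkResult values. */
enum result {
   RESULT_SUCCESS = 0,
   RESULT_OUT_OF_HOST_MEMORY = -1
};

struct cmd_buffer {
   enum result record_result; /* first error seen while recording */
   int *attachments;
};

/* Allocation helper that records the failure and also returns it. */
static enum result
setup_attachments(struct cmd_buffer *cmd, size_t count, int simulate_oom)
{
   assert(cmd);

   free(cmd->attachments);
   cmd->attachments = NULL;

   if (count == 0)
      return RESULT_SUCCESS;

   cmd->attachments = simulate_oom ? NULL : calloc(count, sizeof(int));
   if (!cmd->attachments) {
      cmd->record_result = RESULT_OUT_OF_HOST_MEMORY;
      return cmd->record_result;
   }

   return RESULT_SUCCESS;
}

static enum result
begin_command_buffer(struct cmd_buffer *cmd)
{
   cmd->record_result = RESULT_SUCCESS; /* reset, as in radv_reset_cmd_buffer */
   return RESULT_SUCCESS;
}

/* Returns void like the vkCmd* entry points, so it cannot report the error. */
static void
cmd_begin_render_pass(struct cmd_buffer *cmd, size_t count, int simulate_oom)
{
   enum result result = setup_attachments(cmd, count, simulate_oom);
   if (result != RESULT_SUCCESS)
      cmd->record_result = result;
}

/* The stored error finally reaches the application here. */
static enum result
end_command_buffer(const struct cmd_buffer *cmd)
{
   return cmd->record_result;
}
```

Compared with the `abort()` that the patch removes, the application gets a chance to handle the out-of-memory condition when it ends the command buffer.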



[Mesa-dev] [PATCH 2/2] radv: propagate VK_ERROR_OUT_OF_HOST_MEMORY to vk{Begin, End}CommandBuffer()

2017-08-25 Thread Samuel Pitoiset
Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_cmd_buffer.c | 23 ---
 1 file changed, 16 insertions(+), 7 deletions(-)

diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index 21e2dfd9f7..cc9aeafefa 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -1832,7 +1832,7 @@ radv_cmd_buffer_set_subpass(struct radv_cmd_buffer 
*cmd_buffer,
cmd_buffer->state.dirty |= RADV_CMD_DIRTY_RENDER_TARGETS;
 }
 
-static void
+static VkResult
 radv_cmd_state_setup_attachments(struct radv_cmd_buffer *cmd_buffer,
 struct radv_render_pass *pass,
 const VkRenderPassBeginInfo *info)
@@ -1841,7 +1841,7 @@ radv_cmd_state_setup_attachments(struct radv_cmd_buffer 
*cmd_buffer,
 
if (pass->attachment_count == 0) {
state->attachments = NULL;
-   return;
+   return VK_SUCCESS;
}
 
state->attachments = vk_alloc(&cmd_buffer->pool->alloc,
@@ -1849,8 +1849,8 @@ radv_cmd_state_setup_attachments(struct radv_cmd_buffer 
*cmd_buffer,
sizeof(state->attachments[0]),
8, VK_SYSTEM_ALLOCATION_SCOPE_OBJECT);
if (state->attachments == NULL) {
-   /* FIXME: Propagate VK_ERROR_OUT_OF_HOST_MEMORY to 
vkEndCommandBuffer */
-   abort();
+   cmd_buffer->record_result = VK_ERROR_OUT_OF_HOST_MEMORY;
+   return cmd_buffer->record_result;
}
 
for (uint32_t i = 0; i < pass->attachment_count; ++i) {
@@ -1887,6 +1887,8 @@ radv_cmd_state_setup_attachments(struct radv_cmd_buffer 
*cmd_buffer,
 
state->attachments[i].current_layout = att->initial_layout;
}
+
+   return VK_SUCCESS;
 }
 
 VkResult radv_AllocateCommandBuffers(
@@ -1980,6 +1982,8 @@ VkResult radv_BeginCommandBuffer(
const VkCommandBufferBeginInfo *pBeginInfo)
 {
RADV_FROM_HANDLE(radv_cmd_buffer, cmd_buffer, commandBuffer);
+   VkResult result = VK_SUCCESS;
+
radv_reset_cmd_buffer(cmd_buffer);
 
memset(&cmd_buffer->state, 0, sizeof(cmd_buffer->state));
@@ -2008,12 +2012,15 @@ VkResult radv_BeginCommandBuffer(
struct radv_subpass *subpass =

&cmd_buffer->state.pass->subpasses[pBeginInfo->pInheritanceInfo->subpass];
 
-   radv_cmd_state_setup_attachments(cmd_buffer, 
cmd_buffer->state.pass, NULL);
+   result = radv_cmd_state_setup_attachments(cmd_buffer, 
cmd_buffer->state.pass, NULL);
+   if (result != VK_SUCCESS)
+   return result;
+
radv_cmd_buffer_set_subpass(cmd_buffer, subpass, false);
}
 
radv_cmd_buffer_trace_emit(cmd_buffer);
-   return VK_SUCCESS;
+   return result;
 }
 
 void radv_CmdBindVertexBuffers(
@@ -2642,11 +2649,13 @@ void radv_CmdBeginRenderPass(
 
MAYBE_UNUSED unsigned cdw_max = 
radeon_check_space(cmd_buffer->device->ws,
   cmd_buffer->cs, 
2048);
+   MAYBE_UNUSED VkResult result;
 
cmd_buffer->state.framebuffer = framebuffer;
cmd_buffer->state.pass = pass;
cmd_buffer->state.render_area = pRenderPassBegin->renderArea;
-   radv_cmd_state_setup_attachments(cmd_buffer, pass, pRenderPassBegin);
+   result = radv_cmd_state_setup_attachments(cmd_buffer, pass, 
pRenderPassBegin);
+   assert(result == VK_SUCCESS);
 
radv_cmd_buffer_set_subpass(cmd_buffer, pass->subpasses, true);
assert(cmd_buffer->cs->cdw <= cdw_max);
-- 
2.14.1



[Mesa-dev] [PATCH 1/2] radv: rename record_fail to record_result and use VkResult

2017-08-25 Thread Samuel Pitoiset
This will allow propagating VK_ERROR_OUT_OF_HOST_MEMORY to
vkEndCommandBuffer() when necessary.

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_cmd_buffer.c | 16 
 src/amd/vulkan/radv_private.h|  2 +-
 2 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index fe8f3f2cb8..21e2dfd9f7 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -231,7 +231,7 @@ static void  radv_reset_cmd_buffer(struct radv_cmd_buffer 
*cmd_buffer)
  
cmd_buffer->upload.upload_bo, 8);
cmd_buffer->upload.offset = 0;
 
-   cmd_buffer->record_fail = false;
+   cmd_buffer->record_result = VK_SUCCESS;
 
cmd_buffer->ring_offsets_idx = -1;
 
@@ -262,7 +262,7 @@ radv_cmd_buffer_resize_upload_buf(struct radv_cmd_buffer 
*cmd_buffer,
   RADEON_FLAG_CPU_ACCESS);
 
if (!bo) {
-   cmd_buffer->record_fail = true;
+   cmd_buffer->record_result = VK_ERROR_OUT_OF_DEVICE_MEMORY;
return false;
}
 
@@ -271,7 +271,7 @@ radv_cmd_buffer_resize_upload_buf(struct radv_cmd_buffer 
*cmd_buffer,
upload = malloc(sizeof(*upload));
 
if (!upload) {
-   cmd_buffer->record_fail = true;
+   cmd_buffer->record_result = 
VK_ERROR_OUT_OF_DEVICE_MEMORY;
device->ws->buffer_destroy(bo);
return false;
}
@@ -286,7 +286,7 @@ radv_cmd_buffer_resize_upload_buf(struct radv_cmd_buffer 
*cmd_buffer,
cmd_buffer->upload.map = 
device->ws->buffer_map(cmd_buffer->upload.upload_bo);
 
if (!cmd_buffer->upload.map) {
-   cmd_buffer->record_fail = true;
+   cmd_buffer->record_result = VK_ERROR_OUT_OF_DEVICE_MEMORY;
return false;
}
 
@@ -2136,7 +2136,7 @@ static bool radv_init_push_descriptor_set(struct 
radv_cmd_buffer *cmd_buffer,
 
if (!set->mapped_ptr) {
cmd_buffer->push_descriptors.capacity = 0;
-   cmd_buffer->record_fail = true;
+   cmd_buffer->record_result = 
VK_ERROR_OUT_OF_DEVICE_MEMORY;
return false;
}
 
@@ -2252,10 +2252,10 @@ VkResult radv_EndCommandBuffer(
si_emit_cache_flush(cmd_buffer);
}
 
-   if (!cmd_buffer->device->ws->cs_finalize(cmd_buffer->cs) ||
-   cmd_buffer->record_fail)
+   if (!cmd_buffer->device->ws->cs_finalize(cmd_buffer->cs))
return VK_ERROR_OUT_OF_DEVICE_MEMORY;
-   return VK_SUCCESS;
+
+   return cmd_buffer->record_result;
 }
 
 static void
diff --git a/src/amd/vulkan/radv_private.h b/src/amd/vulkan/radv_private.h
index 0e297f5b6e..cf5c853278 100644
--- a/src/amd/vulkan/radv_private.h
+++ b/src/amd/vulkan/radv_private.h
@@ -833,7 +833,7 @@ struct radv_cmd_buffer {
bool tess_rings_needed;
bool sample_positions_needed;
 
-   bool record_fail;
+   VkResult record_result;
 
int ring_offsets_idx; /* just used for verification */
uint32_t gfx9_fence_offset;
-- 
2.14.1



[Mesa-dev] [PATCH 1/2] gallium/u_threaded: disallow discard_range if map_buffer is unsynchronized

2017-08-25 Thread Marek Olšák
From: Marek Olšák 

The discard range codepath takes precedence, so if we get both
unsynchronized and discard_range, choose unsynchronized.
---
 src/gallium/auxiliary/util/u_threaded_context.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/gallium/auxiliary/util/u_threaded_context.c 
b/src/gallium/auxiliary/util/u_threaded_context.c
index cbcd405..8e3cc34 100644
--- a/src/gallium/auxiliary/util/u_threaded_context.c
+++ b/src/gallium/auxiliary/util/u_threaded_context.c
@@ -1330,22 +1330,24 @@ tc_improve_map_buffer_flags(struct threaded_context *tc,
usage &= ~PIPE_TRANSFER_DISCARD_WHOLE_RESOURCE;
 
/* GL_AMD_pinned_memory and persistent mappings can't use staging
 * buffers. */
if (usage & (PIPE_TRANSFER_UNSYNCHRONIZED |
 PIPE_TRANSFER_PERSISTENT) ||
tres->is_user_ptr)
   usage &= ~PIPE_TRANSFER_DISCARD_RANGE;
 
/* Unsychronized buffer mappings don't have to synchronize the thread. */
-   if (usage & PIPE_TRANSFER_UNSYNCHRONIZED)
+   if (usage & PIPE_TRANSFER_UNSYNCHRONIZED) {
+  usage &= ~PIPE_TRANSFER_DISCARD_RANGE;
   usage |= TC_TRANSFER_MAP_THREADED_UNSYNC; /* notify the driver */
+   }
 
/* Never invalidate inside the driver and never infer "unsynchronized". */
return usage |
   TC_TRANSFER_MAP_NO_INVALIDATE |
   TC_TRANSFER_MAP_IGNORE_VALID_RANGE;
 }
 
 static void *
 tc_transfer_map(struct pipe_context *_pipe,
 struct pipe_resource *resource, unsigned level,
-- 
2.7.4
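
Editorial sketch, not part of the patch: the commit message says that when both "unsynchronized" and "discard range" are requested, unsynchronized should win. The C below isolates that flag handling. The SKETCH_ names and the two low bit values are illustrative assumptions; the real PIPE_TRANSFER_* values are defined in Mesa's headers and are not reproduced here. Only the (1u << 31) bit is taken from the header quoted in the follow-up patch.

```c
#include <assert.h>

/* Illustrative bit assignments, not Mesa's real PIPE_TRANSFER_* values. */
#define SKETCH_DISCARD_RANGE   (1u << 0)
#define SKETCH_UNSYNCHRONIZED  (1u << 1)
/* Same bit as TC_TRANSFER_MAP_THREADED_UNSYNC in the quoted header. */
#define SKETCH_THREADED_UNSYNC (1u << 31)

/* If the mapping is unsynchronized, drop the discard-range request and
 * tell the driver the call comes from a non-driver thread. */
static unsigned
sketch_resolve_unsync(unsigned usage)
{
   if (usage & SKETCH_UNSYNCHRONIZED) {
      usage &= ~SKETCH_DISCARD_RANGE;
      usage |= SKETCH_THREADED_UNSYNC;
   }

   return usage;
}
```

A request carrying only the discard-range bit passes through unchanged; a request carrying both bits comes out with discard-range cleared and the threaded-unsync bit set.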



[Mesa-dev] [PATCH 2/2] gallium/u_threaded: rename IGNORE_VALID_RANGE -> NO_INFER_UNSYNCHRONIZED

2017-08-25 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/auxiliary/util/u_threaded_context.c | 4 ++--
 src/gallium/auxiliary/util/u_threaded_context.h | 4 ++--
 src/gallium/drivers/radeon/r600_buffer_common.c | 2 +-
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/src/gallium/auxiliary/util/u_threaded_context.c 
b/src/gallium/auxiliary/util/u_threaded_context.c
index 8e3cc34..043d4e6 100644
--- a/src/gallium/auxiliary/util/u_threaded_context.c
+++ b/src/gallium/auxiliary/util/u_threaded_context.c
@@ -1293,21 +1293,21 @@ tc_improve_map_buffer_flags(struct threaded_context *tc,
* result in an incorrect behavior with the threaded context.
*/
   return usage;
}
 
/* Handle CPU reads trivially. */
if (usage & PIPE_TRANSFER_READ) {
   /* Drivers aren't allowed to do buffer invalidations. */
   return (usage & ~PIPE_TRANSFER_DISCARD_WHOLE_RESOURCE) |
  TC_TRANSFER_MAP_NO_INVALIDATE |
- TC_TRANSFER_MAP_IGNORE_VALID_RANGE;
+ TC_TRANSFER_MAP_NO_INFER_UNSYNCHRONIZED;
}
 
/* See if the buffer range being mapped has never been initialized,
 * in which case it can be mapped unsynchronized. */
if (!(usage & PIPE_TRANSFER_UNSYNCHRONIZED) &&
!tres->is_shared &&
 !util_ranges_intersect(&tres->valid_buffer_range, offset, offset + 
size))
   usage |= PIPE_TRANSFER_UNSYNCHRONIZED;
 
if (!(usage & PIPE_TRANSFER_UNSYNCHRONIZED)) {
@@ -1338,21 +1338,21 @@ tc_improve_map_buffer_flags(struct threaded_context *tc,
 
/* Unsychronized buffer mappings don't have to synchronize the thread. */
if (usage & PIPE_TRANSFER_UNSYNCHRONIZED) {
   usage &= ~PIPE_TRANSFER_DISCARD_RANGE;
   usage |= TC_TRANSFER_MAP_THREADED_UNSYNC; /* notify the driver */
}
 
/* Never invalidate inside the driver and never infer "unsynchronized". */
return usage |
   TC_TRANSFER_MAP_NO_INVALIDATE |
-  TC_TRANSFER_MAP_IGNORE_VALID_RANGE;
+  TC_TRANSFER_MAP_NO_INFER_UNSYNCHRONIZED;
 }
 
 static void *
 tc_transfer_map(struct pipe_context *_pipe,
 struct pipe_resource *resource, unsigned level,
 unsigned usage, const struct pipe_box *box,
 struct pipe_transfer **transfer)
 {
struct threaded_context *tc = threaded_context(_pipe);
struct threaded_resource *tres = threaded_resource(resource);
diff --git a/src/gallium/auxiliary/util/u_threaded_context.h 
b/src/gallium/auxiliary/util/u_threaded_context.h
index 0742fae..8977b03 100644
--- a/src/gallium/auxiliary/util/u_threaded_context.h
+++ b/src/gallium/auxiliary/util/u_threaded_context.h
@@ -87,21 +87,21 @@
  *
  * 1) If transfer_map has PIPE_TRANSFER_UNSYNCHRONIZED, the call is made
  *in the non-driver thread without flushing the queue. The driver will
  *receive TC_TRANSFER_MAP_THREADED_UNSYNC in addition to PIPE_TRANSFER_-
  *UNSYNCHRONIZED to indicate this.
  *Note that transfer_unmap is always enqueued and called from the driver
  *thread.
  *
  * 2) The driver isn't allowed to infer unsychronized mappings by tracking
  *the valid buffer range. The threaded context always sends TC_TRANSFER_-
- *MAP_IGNORE_VALID_RANGE to indicate this. Ignoring the flag will lead
+ *MAP_NO_INFER_UNSYNCHRONIZED to indicate this. Ignoring the flag will lead
  *to failures.
  *The threaded context does its own detection of unsynchronized mappings.
  *
  * 3) The driver isn't allowed to do buffer invalidations by itself under any
  *circumstances. This is necessary for unsychronized maps to map the latest
  *version of the buffer. (because invalidations can be queued, while
  *unsychronized maps are not queued and they should return the latest
  *storage after invalidation). The threaded context always sends
  *TC_TRANSFER_MAP_NO_INVALIDATE into transfer_map and buffer_subdata to
  *indicate this. Ignoring the flag will lead to failures.
@@ -159,21 +159,21 @@
 #define U_THREADED_CONTEXT_H
 
 #include "pipe/p_context.h"
 #include "pipe/p_state.h"
 #include "util/u_queue.h"
 #include "util/u_range.h"
 #include "util/slab.h"
 
 /* These are transfer flags sent to drivers. */
 /* Never infer whether it's safe to use unsychronized mappings: */
-#define TC_TRANSFER_MAP_IGNORE_VALID_RANGE   (1u << 29)
+#define TC_TRANSFER_MAP_NO_INFER_UNSYNCHRONIZED (1u << 29)
 /* Don't invalidate buffers: */
 #define TC_TRANSFER_MAP_NO_INVALIDATE(1u << 30)
 /* transfer_map is called from a non-driver thread: */
 #define TC_TRANSFER_MAP_THREADED_UNSYNC  (1u << 31)
 
 /* Size of the queue = number of batch slots in memory.
  * - 1 batch is always idle and records new commands
  * - 1 batch is being executed
  * so the queue size is TC_MAX_BATCHES - 2 = number of waiting batches.
  *
diff --git a/src/gallium/drivers/radeon/r600_buffer_common.c 
b/src/gallium/drivers/radeon/r600_buffer_common.c
index dd1c209..076faa9 100644
--- 

Re: [Mesa-dev] [PATCH] egl/drm: Don't "fall back" to /dev/dri/card0 if the first open fails

2017-08-25 Thread Emil Velikov
On 24 August 2017 at 19:52, Adam Jackson  wrote:
> The snprintf stuff here already constructs the right name for the device
> node, and if it doesn't, you configured Mesa wrong, don't do that.
>
I think the idea was that "snprintf can fail", even though in practice
it never will.
In all fairness, how much should we care about that case?

Should we drop the "if (n != -1 && n < sizeof(buf))" part as well with
this patch?

-Emil
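
Editorial sketch, not from the patch under discussion: the check quoted above guards against the two ways snprintf can fail to produce a usable string, namely a negative return on an output error and a return value at or beyond the buffer size when the result was truncated. The helper name and the "%s/card%d" format string are assumptions for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: format a device node path into buf and report
 * whether the full string fit. */
static bool
sketch_format_node(char *buf, size_t size, const char *dir, int minor)
{
   int n = snprintf(buf, size, "%s/card%d", dir, minor);

   /* n < 0: output error. n >= size: the result was truncated. */
   return n >= 0 && (size_t)n < size;
}
```

With a 32-byte buffer, "/dev/dri" and minor 0 fit and the helper returns true; with an 8-byte buffer the same arguments are truncated and it returns false.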


Re: [Mesa-dev] glmark2 terrain errors on imx6q

2017-08-25 Thread Fabio Estevam
Hi Lucas,

On Fri, Aug 25, 2017 at 4:57 AM, Lucas Stach  wrote:

> There is no fix for this. The terrain shaders are simply too big to be
> executed on GC2000. (You remember that 512 instruction limit mentioned
> in the reference manual? That's it.)
>
> This demo runs fine on GC3000.

Thanks for the clarification!


Re: [Mesa-dev] [PATCH] radv: Fix off by one in MAX_VBS assert.

2017-08-25 Thread Samuel Pitoiset

Thanks Bas!

This fixes CTS 
dEQP-VK.pipeline.vertex_input.max_attributes.32_attributes.binding_one_to_one.interleaved


Tested-by: Samuel Pitoiset 

On 08/25/2017 02:15 PM, Bas Nieuwenhuizen wrote:

e.g. 0 + 32 <= 32 should be valid.

Fixes: f4e499ec791 "radv: add initial non-conformant radv vulkan driver"
---
  src/amd/vulkan/radv_cmd_buffer.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index cbe0de17db4..5cb994c3604 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -2025,7 +2025,7 @@ void radv_CmdBindVertexBuffers(
/* We have to defer setting up vertex buffer since we need the buffer
 * stride from the pipeline. */
  
-   assert(firstBinding + bindingCount < MAX_VBS);
+   assert(firstBinding + bindingCount <= MAX_VBS);
for (uint32_t i = 0; i < bindingCount; i++) {
vb[firstBinding + i].buffer = 
radv_buffer_from_handle(pBuffers[i]);
vb[firstBinding + i].offset = pOffsets[i];




Re: [Mesa-dev] [PATCH 1/2] docs: remove released and extend the calendar until the end of 2017

2017-08-25 Thread Eric Engestrom
On Friday, 2017-08-25 14:40:25 +0300, Andres Gomez wrote:
> Completed the 17.2 cycle and added the beginning of the 17.3 one.
> 
> Cc: Emil Velikov 
> Cc: Juan A. Suarez Romero 
> Signed-off-by: Andres Gomez 
> ---
>  docs/release-calendar.html | 86 
> ++
>  1 file changed, 80 insertions(+), 6 deletions(-)
> 
> diff --git a/docs/release-calendar.html b/docs/release-calendar.html
> index 93c02dafe94..1ed3ae14a97 100644
> --- a/docs/release-calendar.html
> +++ b/docs/release-calendar.html
> @@ -46,17 +46,91 @@ if you'd like to nominate a patch in the next stable 
> release.
>  Final planned release for the 17.1 series
>  
>  
> -17.2
> -2017-08-11
> -17.2.0-rc4
> +17.2
> +2017-09-08
> +17.2.1
>  Emil Velikov
> -May be promoted to 17.2.0 final

17.2.0 isn't here yet, so maybe leave an "-rc5, may be promoted" line here?

> +
>  
>  
> -2017-08-25
> -17.2.1
> +2017-09-22
> +17.2.2
> +Juan A. Suarez Romero
> +
> +
> +
> +2017-10-06
> +17.2.3
> +Emil Velikov
> +
> +
> +
> +2017-10-20
> +17.2.4
> +Juan A. Suarez Romero
> +
> +
> +
> +2017-11-03
> +17.2.5
> +Andres Gomez
> +
> +
> +
> +2017-11-17
> +17.2.6
> +Andres Gomez
> +
> +
> +
> +2017-12-01
> +17.2.7
> +Andres Gomez
> +Final planned release for the 17.2 series
> +
> +
> +17.3
> +2017-10-20
> +17.3.0-rc1
>  Emil Velikov
>  
> +
> +
> +2017-10-27
> +17.3.0-rc2
> +Emil Velikov
> +
> +
> +
> +2017-11-03
> +17.3.0-rc3
> +Emil Velikov
> +
> +
> +
> +2017-11-10
> +17.3.0-rc4
> +Emil Velikov
> +May be promoted to 17.3.0 final
> +
> +
> +2017-11-24
> +17.3.1
> +Andres Gomez
> +
> +
> +
> +2017-12-08
> +17.3.2
> +Emil Velikov
> +
> +
> +
> +2017-12-22
> +17.3.3
> +Emil Velikov
> +
> +
>  
>  
>  
> -- 
> 2.14.1
> 


Re: [Mesa-dev] [PATCH 0/2] docs: update the release calendar until the end of 2017

2017-08-25 Thread Juan A. Suarez Romero
On Fri, 2017-08-25 at 14:40 +0300, Andres Gomez wrote:
> The first email updates the calendar with a proposal for the future
> releases until the end of 2017. It also removes versions that have
> already been released.
> 
> The second patch is a proposal to add yet another final iteration to
> the 17.1 cycle.
> 
> Andres Gomez (2):
>   docs: remove released and extend the calendar until the end of 2017
>   docs: add an additional final cycle for 17.1
> 
>  docs/release-calendar.html | 94
> ++
>  1 file changed, 87 insertions(+), 7 deletions(-)
> 

For the full series, 


Reviewed-by: Juan A. Suarez Romero 


Re: [Mesa-dev] [PATCH] etnaviv: use correct param for etna_compatible_rs_format(..)

2017-08-25 Thread Eric Engestrom
On Friday, 2017-08-25 13:39:05 +0200, Christian Gmeiner wrote:
> Found by code inspection.
> 
> Fixes: c9e8b49b885 ("etnaviv: gallium driver for Vivante GPUs")
> Cc: mesa-sta...@lists.freedesktop.org
> Signed-off-by: Christian Gmeiner 

Good catch!
Reviewed-by: Eric Engestrom 

> ---
>  src/gallium/drivers/etnaviv/etnaviv_clear_blit.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/src/gallium/drivers/etnaviv/etnaviv_clear_blit.c 
> b/src/gallium/drivers/etnaviv/etnaviv_clear_blit.c
> index b832dd8f26..92c9107343 100644
> --- a/src/gallium/drivers/etnaviv/etnaviv_clear_blit.c
> +++ b/src/gallium/drivers/etnaviv/etnaviv_clear_blit.c
> @@ -396,7 +396,7 @@ etna_try_rs_blit(struct pipe_context *pctx,
> }
>  
> unsigned src_format = etna_compatible_rs_format(blit_info->src.format);
> -   unsigned dst_format = etna_compatible_rs_format(blit_info->src.format);
> +   unsigned dst_format = etna_compatible_rs_format(blit_info->dst.format);
> if (translate_rs_format(src_format) == ETNA_NO_MATCH ||
> translate_rs_format(dst_format) == ETNA_NO_MATCH ||
> blit_info->scissor_enable || blit_info->src.box.x != 0 ||
> -- 
> 2.13.5
> 


[Mesa-dev] [PATCH] radv: Fix off by one in MAX_VBS assert.

2017-08-25 Thread Bas Nieuwenhuizen
e.g. 0 + 32 <= 32 should be valid.

Fixes: f4e499ec791 "radv: add initial non-conformant radv vulkan driver"
---
 src/amd/vulkan/radv_cmd_buffer.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index cbe0de17db4..5cb994c3604 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -2025,7 +2025,7 @@ void radv_CmdBindVertexBuffers(
/* We have to defer setting up vertex buffer since we need the buffer
 * stride from the pipeline. */
 
-   assert(firstBinding + bindingCount < MAX_VBS);
+   assert(firstBinding + bindingCount <= MAX_VBS);
for (uint32_t i = 0; i < bindingCount; i++) {
vb[firstBinding + i].buffer = 
radv_buffer_from_handle(pBuffers[i]);
vb[firstBinding + i].offset = pOffsets[i];
-- 
2.14.1
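
Editorial sketch, not part of the patch: the loop in the hunk writes indices firstBinding through firstBinding + bindingCount - 1, so the highest index stays in range exactly when the sum is at most the array size. The C below isolates that predicate. SKETCH_MAX_VBS and the function name are hypothetical, with 32 taken from the commit message's example, and the sketch assumes the array has that many entries.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed array size, taken from the "0 + 32 <= 32" example. */
#define SKETCH_MAX_VBS 32u

/* True when bindings [first, first + count) all fit in an array of
 * SKETCH_MAX_VBS entries. Using "<" here would wrongly reject the case
 * where the last binding lands in the final slot. */
static bool
sketch_bindings_fit(uint32_t first_binding, uint32_t binding_count)
{
   return first_binding + binding_count <= SKETCH_MAX_VBS;
}
```

Binding all 32 slots starting at 0 is accepted, while starting at 1 with the same count is rejected because it would touch index 32.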



Re: [Mesa-dev] TGSI 16-bit support

2017-08-25 Thread Marek Olšák
Nicolai,

Have you thought about switching to NIR for radeonsi completely to get
16-bit support? We need NIR support anyway for SPIR-V, right? Would it be
easier than adding 16-bit support to TGSI, glsl2tgsi, and tgsi2llvm?

Marek

