Re: [Mesa-dev] [PATCH v2] i965: Issue performance warnings when growing the program cache

2017-08-23 Thread Kenneth Graunke
On Wednesday, August 23, 2017 1:58:32 AM PDT Chris Wilson wrote:
> Quoting Kenneth Graunke (2017-08-22 21:47:54)
> > This involves a bunch of unnecessary copying, a batch flush, and
> > state re-emission.
> 
> > ---
> >  src/mesa/drivers/dri/i965/brw_program_cache.c | 3 +++
> >  1 file changed, 3 insertions(+)
> > 
> > diff --git a/src/mesa/drivers/dri/i965/brw_program_cache.c 
> > b/src/mesa/drivers/dri/i965/brw_program_cache.c
> > index 4dcfd5234df..e9706be8961 100644
> > --- a/src/mesa/drivers/dri/i965/brw_program_cache.c
> > +++ b/src/mesa/drivers/dri/i965/brw_program_cache.c
> > @@ -217,6 +217,9 @@ brw_cache_new_bo(struct brw_cache *cache, uint32_t 
> > new_size)
> > struct brw_context *brw = cache->brw;
> > struct brw_bo *new_bo;
> >  
> > +   perf_debug("Copying to larger program cache: %zu kB -> %u kB\n",
> > +  cache->bo->size / 1024, new_size / 1024);
> 
> Hmm, z -> size_t but bo->size is uint64_t, so sadly we need "%"PRIu64
> -Chris
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> 

Right...

I'll just do:

+   perf_debug("Copying to larger program cache: %u kB -> %u kB\n",
+  (unsigned) cache->bo->size / 1024, new_size / 1024);

new_size is already a uint32_t, and we know the existing size is half
of that, so it's not like the extra bits are buying us anything.

--Ken
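
For reference, the two options discussed above (an exact-width format macro versus narrowing the value to match %u) can be sketched as follows. This is a minimal illustration, not the driver code: the function names are hypothetical, and snprintf stands in for perf_debug so the result can be inspected.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Option 1: keep the 64-bit value and use the matching <inttypes.h> macro. */
int example_format_exact(char *buf, size_t len,
                         uint64_t old_size, uint32_t new_size)
{
    return snprintf(buf, len, "%" PRIu64 " kB -> %" PRIu32 " kB",
                    old_size / 1024, new_size / 1024);
}

/* Option 2: narrow the value so that plain %u is correct. This sketch
 * divides before casting; it is only safe when the result is known to
 * fit in an unsigned int. */
int example_format_narrowed(char *buf, size_t len,
                            uint64_t old_size, uint32_t new_size)
{
    return snprintf(buf, len, "%u kB -> %" PRIu32 " kB",
                    (unsigned)(old_size / 1024), new_size / 1024);
}
```

Both produce the same text whenever the old size fits in 32 bits, which is the assumption the reply above relies on.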


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] nv50/ir: properly set sType for TXF ops to U32

2017-08-23 Thread Ilia Mirkin
All of the coordinates and LOD args are integers for TXF. This mostly
doesn't matter, except for converting into a levelZero=true operation by
removing an explicit zero LOD. For the comparison against zero to work
properly, the sType of the instruction has to be set correctly.

Fixes: KHR-GL45.robust_buffer_access_behavior.texel_fetch
Reported-by: Karol Herbst 
Signed-off-by: Ilia Mirkin 
Cc: mesa-sta...@lists.freedesktop.org
---
 src/gallium/drivers/nouveau/codegen/nv50_ir.cpp | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir.cpp
index 14e8c1320cf..4076177e56d 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir.cpp
@@ -912,6 +912,9 @@ TexInstruction::TexInstruction(Function *fn, operation op)
 
tex.rIndirectSrc = -1;
tex.sIndirectSrc = -1;
+
+   if (op == OP_TXF)
+  sType = TYPE_U32;
 }
 
 TexInstruction::~TexInstruction()
-- 
2.13.5
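
As a generic illustration of why a comparison against zero depends on the source type, the sketch below tests the same 32-bit pattern under two interpretations. It is not nouveau code: the enum and function names are hypothetical, IEEE 754 single-precision floats are assumed, and it is not claimed to be the exact case the CTS test hit.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for a source type tag. */
enum example_type { EXAMPLE_TYPE_F32, EXAMPLE_TYPE_U32 };

/* Decide whether a raw 32-bit immediate is "zero" for the given type. */
bool example_is_zero(uint32_t bits, enum example_type type)
{
    if (type == EXAMPLE_TYPE_F32) {
        float f;
        memcpy(&f, &bits, sizeof f); /* reinterpret the bits as a float */
        return f == 0.0f;            /* true for both +0.0 and -0.0 */
    }
    return bits == 0;                /* integer view: only all-zero bits */
}
```

The pattern 0x80000000 is -0.0 when read as a float, so it compares equal to zero, while the same bits read as an unsigned integer are far from zero. An optimization that drops an "explicit zero" operand therefore needs the correct type to reach the right answer.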



Re: [Mesa-dev] [PATCH v2 1/5] util/disk_cache: rename mesa cache dir and introduce cache versioning

2017-08-23 Thread Nicholas Miell

On 08/23/2017 06:23 PM, Matt Turner wrote:

On Wed, Aug 23, 2017 at 2:32 AM, Timothy Arceri  wrote:

Steam is already analysing cache items


What does this mean?


Steam will attempt to download compiled shaders for your GPU and version 
of Mesa, or upload the shaders you compile locally if they're not 
already stored on Valve's servers.


How they verify shaders aren't malicious is an unanswered question.

You can disable this attack surface by setting the 
STEAM_ENABLE_SHADER_CACHE_MANAGEMENT environment variable to 0 before 
starting the Steam client.



Re: [Mesa-dev] [PATCH 1/2] glx: add support for GLX_ARB_create_context_no_error

2017-08-23 Thread Timothy Arceri

On 04/08/17 04:26, Grigori Goronzy wrote:

Hi,

there is also a patch needed to make this work for Xorg on the 
xorg-devel list, as well as a preliminary piglit test to verify the 
functionality on the piglit list.


Hi Emil,

Any chance you can take a look at this series? I feel you are much more 
familiar with this code than I am.


Thanks,
Tim



Grigori

On 2017-08-03 20:07, Grigori Goronzy wrote:

---
 src/glx/dri2_glx.c  | 12 
 src/glx/dri3_glx.c  |  8 
 src/glx/dri_common.c| 52 
-

 src/glx/dri_common.h|  5 +
 src/glx/drisw_glx.c |  3 +++
 src/glx/glxclient.h |  6 ++
 src/glx/glxextensions.c |  1 +
 src/glx/glxextensions.h |  1 +
 8 files changed, 87 insertions(+), 1 deletion(-)

diff --git a/src/glx/dri2_glx.c b/src/glx/dri2_glx.c
index ae8cb11..263f864 100644
--- a/src/glx/dri2_glx.c
+++ b/src/glx/dri2_glx.c
@@ -262,6 +262,10 @@ dri2_create_context_attribs(struct glx_screen *base,
  , , error))
   goto error_exit;

+   if (!dri2_check_no_error(flags, shareList, major_ver, error)) {
+  goto error_exit;
+   }
+
/* Check the renderType value */
if (!validate_renderType_against_config(config_base, renderType))
goto error_exit;
@@ -1159,6 +1163,14 @@ dri2BindExtensions(struct dri2_screen *psc,
struct glx_display * priv,
  __glXEnableDirectExtension(>base,
"GLX_ARB_create_context_robustness");

+  /* DRI2 version 3 is also required because
+   * GLX_ARB_create_context_no_error requires 
GLX_ARB_create_context.

+   */
+  if (psc->dri2->base.version >= 3
+  && strcmp(extensions[i]->name, __DRI2_NO_ERROR) == 0)
+ __glXEnableDirectExtension(>base,
+ "GLX_ARB_create_context_no_error");
+
   /* DRI2 version 3 is also required because GLX_MESA_query_renderer
* requires GLX_ARB_create_context_profile.
*/
diff --git a/src/glx/dri3_glx.c b/src/glx/dri3_glx.c
index 5091606..19667fa 100644
--- a/src/glx/dri3_glx.c
+++ b/src/glx/dri3_glx.c
@@ -283,6 +283,10 @@ dri3_create_context_attribs(struct glx_screen *base,
  , error))
   goto error_exit;

+   if (!dri2_check_no_error(flags, shareList, major_ver, error)) {
+  goto error_exit;
+   }
+
/* Check the renderType value */
if (!validate_renderType_against_config(config_base, render_type))
goto error_exit;
@@ -754,6 +758,10 @@ dri3_bind_extensions(struct dri3_screen *psc,
struct glx_display * priv,
  __glXEnableDirectExtension(>base,
"GLX_ARB_create_context_robustness");

+  if (strcmp(extensions[i]->name, __DRI2_NO_ERROR) == 0)
+ __glXEnableDirectExtension(>base,
+ "GLX_ARB_create_context_no_error");
+
   if (strcmp(extensions[i]->name, __DRI2_RENDERER_QUERY) == 0) {
  psc->rendererQuery = (__DRI2rendererQueryExtension *) 
extensions[i];
  __glXEnableDirectExtension(>base, 
"GLX_MESA_query_renderer");

diff --git a/src/glx/dri_common.c b/src/glx/dri_common.c
index 854733a..2cab207 100644
--- a/src/glx/dri_common.c
+++ b/src/glx/dri_common.c
@@ -468,6 +468,7 @@ dri2_convert_glx_attribs(unsigned num_attribs,
const uint32_t *attribs,
 {
unsigned i;
bool got_profile = false;
+   int no_error = 0;
uint32_t profile;

*major_ver = 1;
@@ -499,6 +500,9 @@ dri2_convert_glx_attribs(unsigned num_attribs,
const uint32_t *attribs,
   case GLX_CONTEXT_FLAGS_ARB:
  *flags = attribs[i * 2 + 1];
  break;
+  case GLX_CONTEXT_OPENGL_NO_ERROR_ARB:
+ no_error = attribs[i * 2 + 1];
+ break;
   case GLX_CONTEXT_PROFILE_MASK_ARB:
  profile = attribs[i * 2 + 1];
  got_profile = true;
@@ -527,6 +531,10 @@ dri2_convert_glx_attribs(unsigned num_attribs,
const uint32_t *attribs,
   }
}

+   if (no_error) {
+  *flags |= __DRI_CTX_FLAG_NO_ERROR;
+   }
+
if (!got_profile) {
   if (*major_ver > 3 || (*major_ver == 3 && *minor_ver >= 2))
  *api = __DRI_API_OPENGL_CORE;
@@ -567,7 +575,8 @@ dri2_convert_glx_attribs(unsigned num_attribs,
const uint32_t *attribs,
/* Unknown flag value.
 */
if (*flags & ~(__DRI_CTX_FLAG_DEBUG | 
__DRI_CTX_FLAG_FORWARD_COMPATIBLE

-  | __DRI_CTX_FLAG_ROBUST_BUFFER_ACCESS)) {
+  | __DRI_CTX_FLAG_ROBUST_BUFFER_ACCESS
+  | __DRI_CTX_FLAG_NO_ERROR)) {
   *error = __DRI_CTX_ERROR_UNKNOWN_FLAG;
   return false;
}
@@ -592,4 +601,45 @@ dri2_convert_glx_attribs(unsigned num_attribs,
const uint32_t *attribs,
return true;
 }

+_X_HIDDEN bool
+dri2_check_no_error(uint32_t flags, struct glx_context *share_context,
+int major, unsigned *error)
+{
+   Bool noError = flags & __DRI_CTX_FLAG_NO_ERROR;
+
+   /* The KHR_no_error specs say:
+*
+*Requires OpenGL ES 2.0 or OpenGL 2.0.
+*/
+   if (major < 2) {
+  *error = __DRI_CTX_ERROR_UNKNOWN_ATTRIBUTE;
+  return false;
+   }
+
+   /* The 

Re: [Mesa-dev] [PATCH 10/10] radeonsi: add an assertion that only two-dimensional constant references are used

2017-08-23 Thread Timothy Arceri

On 24/08/17 02:41, Nicolai Hähnle wrote:

From: Nicolai Hähnle 

---
  src/gallium/drivers/radeonsi/si_shader.c | 1 +
  1 file changed, 1 insertion(+)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index f02fc9e9ba2..c445c49d2aa 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -1851,6 +1851,7 @@ static LLVMValueRef fetch_constant(
return lp_build_gather_values(>gallivm, values, 4);
}
  
+	assert(reg->Register.Dimension);

buf = reg->Register.Dimension ? reg->Dimension.Index : 0;


Shouldn't you change this to:

   buf = reg->Dimension.Index;

And below this:

if (reg->Register.Dimension && reg->Dimension.Indirect) {

to

if (reg->Dimension.Indirect) {


idx = reg->Register.Index * 4 + swizzle;
  




Re: [Mesa-dev] [PATCH] spirv: Add support for the HelperInvocation builtin

2017-08-23 Thread Ian Romanick
On 08/23/2017 11:09 AM, Jason Ekstrand wrote:
> On Wed, Aug 23, 2017 at 9:58 AM, Ian Romanick wrote:
> 
> Reviewed-by: Ian Romanick
> 
> Did you submit a CTS bug?
> 
> 
> No, I didn't.  It does get some coverage through the up-and-coming
> subgroup tests but it should probably have its own test.  That's going
> to be really annoying to test...

I mean... basically *any* sort of test would have caught this, right? :)

> On 08/21/2017 10:11 PM, Jason Ekstrand wrote:
> > I have no idea how this got missed but it's been missing since
> forever.
> >
> > Cc: mesa-sta...@lists.freedesktop.org
> 
> > ---
> >  src/compiler/spirv/vtn_variables.c | 5 -
> >  1 file changed, 4 insertions(+), 1 deletion(-)
> >
> > diff --git a/src/compiler/spirv/vtn_variables.c
> b/src/compiler/spirv/vtn_variables.c
> > index 6a8776b..87cb935 100644
> > --- a/src/compiler/spirv/vtn_variables.c
> > +++ b/src/compiler/spirv/vtn_variables.c
> > @@ -1121,6 +1121,10 @@ vtn_get_builtin_location(struct vtn_builder *b,
> >*location = FRAG_RESULT_DEPTH;
> >assert(*mode == nir_var_shader_out);
> >break;
> > +   case SpvBuiltInHelperInvocation:
> > +  *location = SYSTEM_VALUE_HELPER_INVOCATION;
> > +  set_mode_system_value(mode);
> > +  break;
> > case SpvBuiltInNumWorkgroups:
> >*location = SYSTEM_VALUE_NUM_WORK_GROUPS;
> >set_mode_system_value(mode);
> > @@ -1177,7 +1181,6 @@ vtn_get_builtin_location(struct vtn_builder *b,
> >*location = SYSTEM_VALUE_VIEW_INDEX;
> >set_mode_system_value(mode);
> >break;
> > -   case SpvBuiltInHelperInvocation:
> > default:
> >unreachable("unsupported builtin");
> > }
> >
> 
> 



Re: [Mesa-dev] [PATCH v2 1/5] util/disk_cache: rename mesa cache dir and introduce cache versioning

2017-08-23 Thread Matt Turner
On Wed, Aug 23, 2017 at 2:32 AM, Timothy Arceri  wrote:
> Steam is already analysing cache items

What does this mean?


Re: [Mesa-dev] [PATCH v2 1/5] util/disk_cache: rename mesa cache dir and introduce cache versioning

2017-08-23 Thread Timothy Arceri

On 23/08/17 21:35, Vedran Miletić wrote:

On 08/23/2017 08:32 AM, Timothy Arceri wrote:

Steam is already analysing cache items. Unfortunately, we did not
introduce a versioning mechanism for identifying structural changes
to cache entries earlier, so the only way to do so now is to rename
the cache directory.

Since we are renaming it we take the opportunity to give the directory
a more meaningful name.

Adding a version field to the header of cache entries will help us to
avoid having to rename the directory in future. Please note this is
versioning for the internal structure of the entries as defined in
disk_cache.{c,h} as opposed to the structure of the data provided to
the disk cache by the GLSL compiler and the various driver backends.
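
The idea of a version field in the entry header can be sketched as below. This is only an illustration under assumed names: the one-byte layout, EXAMPLE_CACHE_VERSION, and both helper functions are hypothetical and do not reflect the actual layout in disk_cache.c.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EXAMPLE_CACHE_VERSION 1

/* Hypothetical entry layout: [version: 1 byte][payload: size bytes].
 * Returns the number of bytes written, or 0 if dst is too small. */
size_t example_entry_write(uint8_t *dst, size_t cap,
                           const void *payload, size_t size)
{
    if (cap == 0 || size > cap - 1)
        return 0;
    dst[0] = EXAMPLE_CACHE_VERSION;
    memcpy(dst + 1, payload, size);
    return size + 1;
}

/* Returns a pointer to the payload, or NULL if the entry is empty or was
 * written with a different structural version. */
const uint8_t *example_entry_read(const uint8_t *src, size_t len,
                                  size_t *payload_size)
{
    if (len < 1 || src[0] != EXAMPLE_CACHE_VERSION)
        return NULL;
    *payload_size = len - 1;
    return src + 1;
}
```

A reader that sees an unexpected version can reject the entry instead of misparsing it, which is what lets the directory name stay stable across future layout changes.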
---
  src/compiler/glsl/tests/cache_test.c |  6 +++--
  src/util/disk_cache.c| 46 ++--
  src/util/disk_cache.h|  2 ++
  3 files changed, 40 insertions(+), 14 deletions(-)

diff --git a/src/compiler/glsl/tests/cache_test.c 
b/src/compiler/glsl/tests/cache_test.c
index af1b66fb3d..3796ce6170 100644
--- a/src/compiler/glsl/tests/cache_test.c
+++ b/src/compiler/glsl/tests/cache_test.c
@@ -178,38 +178,40 @@ test_disk_cache_create(void)
 /* Test with XDG_CACHE_HOME set */
 setenv("XDG_CACHE_HOME", CACHE_TEST_TMP "/xdg-cache-home", 1);
 cache = disk_cache_create("test", "make_check", 0);
 expect_null(cache, "disk_cache_create with XDG_CACHE_HOME set with"
 "a non-existing parent directory");
  
 mkdir(CACHE_TEST_TMP, 0755);

 cache = disk_cache_create("test", "make_check", 0);
 expect_non_null(cache, "disk_cache_create with XDG_CACHE_HOME set");
  
-   check_directories_created(CACHE_TEST_TMP "/xdg-cache-home/mesa");

+   check_directories_created(CACHE_TEST_TMP "/xdg-cache-home/"
+ CACHE_DIR_NAME);
  
 disk_cache_destroy(cache);
  
 /* Test with MESA_GLSL_CACHE_DIR set */

 err = rmrf_local(CACHE_TEST_TMP);
 expect_equal(err, 0, "Removing " CACHE_TEST_TMP);
  
 setenv("MESA_GLSL_CACHE_DIR", CACHE_TEST_TMP "/mesa-glsl-cache-dir", 1);

 cache = disk_cache_create("test", "make_check", 0);
 expect_null(cache, "disk_cache_create with MESA_GLSL_CACHE_DIR set with"
 "a non-existing parent directory");
  
 mkdir(CACHE_TEST_TMP, 0755);

 cache = disk_cache_create("test", "make_check", 0);
 expect_non_null(cache, "disk_cache_create with MESA_GLSL_CACHE_DIR set");
  
-   check_directories_created(CACHE_TEST_TMP "/mesa-glsl-cache-dir/mesa");

+   check_directories_created(CACHE_TEST_TMP "/mesa-glsl-cache-dir/"
+ CACHE_DIR_NAME);
  
 disk_cache_destroy(cache);

  }
  
  static bool

  does_cache_contain(struct disk_cache *cache, const cache_key key)
  {
 void *result;
  
 result = disk_cache_get(cache, key, NULL);

diff --git a/src/util/disk_cache.c b/src/util/disk_cache.c
index b2229874e0..644a911e53 100644
--- a/src/util/disk_cache.c
+++ b/src/util/disk_cache.c
@@ -51,20 +51,34 @@
  
  /* Number of bits to mask off from a cache key to get an index. */

  #define CACHE_INDEX_KEY_BITS 16
  
  /* Mask for computing an index from a key. */

  #define CACHE_INDEX_KEY_MASK ((1 << CACHE_INDEX_KEY_BITS) - 1)
  
  /* The number of keys that can be stored in the index. */

  #define CACHE_INDEX_MAX_KEYS (1 << CACHE_INDEX_KEY_BITS)
  
+/* The cache version should be bumped whenever a change is made to the

+ * structure of cache entries or the index. This will give any 3rd party
+ * applications reading the cache entries a chance to adjust to the changes.
+ *
+ * - The cache version is checked internally when reading a cache entry. If we
+ *   ever have a mismatch we are in big trouble as this means we had a cache
+ *   collision. In case of such an event please check the skys for giant
+ *   asteroids and that the entire Mesa team hasn't been eaten by wolves.
+ *
+ * - There is no strict requirement that cache versions be backwards
+ *   compatible but effort should be taken to limit disruption where possible.
+ */
+#define CACHE_VERSION 1
+
  struct disk_cache {
 /* The path to the cache directory. */
 char *path;
  
 /* Thread queue for compressing and writing cache entries to disk */

 struct util_queue cache_queue;
  
 /* Seed for rand, which is used to pick a random directory */

 uint64_t seed_xorshift128plus[2];
  
@@ -153,20 +167,25 @@ concatenate_and_mkdir(void *ctx, const char *path, const char *name)

return NULL;
  
 new_path = ralloc_asprintf(ctx, "%s/%s", path, name);
  
 if (mkdir_if_needed(new_path) == 0)

return new_path;
 else
return NULL;
  }
  
+#define DRV_KEY_CPY(_dst, _src, _src_size) do { \
+   memcpy(_dst, _src, _src_size);\
+   _dst += _src_size;\
+} while (0)
+
  struct disk_cache *
  disk_cache_create(const char *gpu_name, const char *timestamp,

[Mesa-dev] [PATCH V3 1/5] util/disk_cache: rename mesa cache dir and introduce cache versioning

2017-08-23 Thread Timothy Arceri
Steam is already analysing cache items. Unfortunately, we did not
introduce a versioning mechanism for identifying structural changes
to cache entries earlier, so the only way to do so now is to rename
the cache directory.

Since we are renaming it we take the opportunity to give the directory
a more meaningful name.

Adding a version field to the header of cache entries will help us to
avoid having to rename the directory in future. Please note this is
versioning for the internal structure of the entries as defined in
disk_cache.{c,h} as opposed to the structure of the data provided to
the disk cache by the GLSL compiler and the various driver backends.

V3: fix silly bug where cache->driver_keys_blob was incremented directly
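
The class of bug named in the V3 note can be sketched like this: a copy-and-advance macro modifies its destination pointer, so it should be given a scratch cursor rather than the pointer that owns the buffer. The struct, macro, and function names below are hypothetical, not the ones in disk_cache.c.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Copy, then advance dst. do/while (0) makes the macro one statement. */
#define EXAMPLE_KEY_CPY(dst, src, size) \
    do {                                \
        memcpy((dst), (src), (size));   \
        (dst) += (size);                \
    } while (0)

struct example_cache {
    uint8_t *keys_blob;
    size_t keys_blob_size;
};

/* Builds "gpu_name\0timestamp\0" into a freshly allocated blob.
 * Returns 0 on success, -1 on allocation failure. */
int example_build_keys(struct example_cache *cache,
                       const char *gpu_name, const char *timestamp)
{
    size_t gpu_len = strlen(gpu_name) + 1;
    size_t ts_len = strlen(timestamp) + 1;

    cache->keys_blob_size = gpu_len + ts_len;
    cache->keys_blob = malloc(cache->keys_blob_size);
    if (!cache->keys_blob)
        return -1;

    /* Advance a local cursor; cache->keys_blob keeps pointing at the start. */
    uint8_t *cursor = cache->keys_blob;
    EXAMPLE_KEY_CPY(cursor, gpu_name, gpu_len);
    EXAMPLE_KEY_CPY(cursor, timestamp, ts_len);
    return 0;
}
```

Passing cache->keys_blob itself to the macro would leave it pointing one past the end of the data, so later reads and the eventual free would use the wrong address.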
---
 src/compiler/glsl/tests/cache_test.c |  6 +++--
 src/util/disk_cache.c| 47 +++-
 src/util/disk_cache.h|  2 ++
 3 files changed, 41 insertions(+), 14 deletions(-)

diff --git a/src/compiler/glsl/tests/cache_test.c 
b/src/compiler/glsl/tests/cache_test.c
index af1b66fb3d..3796ce6170 100644
--- a/src/compiler/glsl/tests/cache_test.c
+++ b/src/compiler/glsl/tests/cache_test.c
@@ -178,38 +178,40 @@ test_disk_cache_create(void)
/* Test with XDG_CACHE_HOME set */
setenv("XDG_CACHE_HOME", CACHE_TEST_TMP "/xdg-cache-home", 1);
cache = disk_cache_create("test", "make_check", 0);
expect_null(cache, "disk_cache_create with XDG_CACHE_HOME set with"
"a non-existing parent directory");
 
mkdir(CACHE_TEST_TMP, 0755);
cache = disk_cache_create("test", "make_check", 0);
expect_non_null(cache, "disk_cache_create with XDG_CACHE_HOME set");
 
-   check_directories_created(CACHE_TEST_TMP "/xdg-cache-home/mesa");
+   check_directories_created(CACHE_TEST_TMP "/xdg-cache-home/"
+ CACHE_DIR_NAME);
 
disk_cache_destroy(cache);
 
/* Test with MESA_GLSL_CACHE_DIR set */
err = rmrf_local(CACHE_TEST_TMP);
expect_equal(err, 0, "Removing " CACHE_TEST_TMP);
 
setenv("MESA_GLSL_CACHE_DIR", CACHE_TEST_TMP "/mesa-glsl-cache-dir", 1);
cache = disk_cache_create("test", "make_check", 0);
expect_null(cache, "disk_cache_create with MESA_GLSL_CACHE_DIR set with"
"a non-existing parent directory");
 
mkdir(CACHE_TEST_TMP, 0755);
cache = disk_cache_create("test", "make_check", 0);
expect_non_null(cache, "disk_cache_create with MESA_GLSL_CACHE_DIR set");
 
-   check_directories_created(CACHE_TEST_TMP "/mesa-glsl-cache-dir/mesa");
+   check_directories_created(CACHE_TEST_TMP "/mesa-glsl-cache-dir/"
+ CACHE_DIR_NAME);
 
disk_cache_destroy(cache);
 }
 
 static bool
 does_cache_contain(struct disk_cache *cache, const cache_key key)
 {
void *result;
 
result = disk_cache_get(cache, key, NULL);
diff --git a/src/util/disk_cache.c b/src/util/disk_cache.c
index b2229874e0..b2747fbce4 100644
--- a/src/util/disk_cache.c
+++ b/src/util/disk_cache.c
@@ -51,20 +51,34 @@
 
 /* Number of bits to mask off from a cache key to get an index. */
 #define CACHE_INDEX_KEY_BITS 16
 
 /* Mask for computing an index from a key. */
 #define CACHE_INDEX_KEY_MASK ((1 << CACHE_INDEX_KEY_BITS) - 1)
 
 /* The number of keys that can be stored in the index. */
 #define CACHE_INDEX_MAX_KEYS (1 << CACHE_INDEX_KEY_BITS)
 
+/* The cache version should be bumped whenever a change is made to the
+ * structure of cache entries or the index. This will give any 3rd party
+ * applications reading the cache entries a chance to adjust to the changes.
+ *
+ * - The cache version is checked internally when reading a cache entry. If we
+ *   ever have a mismatch we are in big trouble as this means we had a cache
+ *   collision. In case of such an event please check the skys for giant
+ *   asteroids and that the entire Mesa team hasn't been eaten by wolves.
+ *
+ * - There is no strict requirement that cache versions be backwards
+ *   compatible but effort should be taken to limit disruption where possible.
+ */
+#define CACHE_VERSION 1
+
 struct disk_cache {
/* The path to the cache directory. */
char *path;
 
/* Thread queue for compressing and writing cache entries to disk */
struct util_queue cache_queue;
 
/* Seed for rand, which is used to pick a random directory */
uint64_t seed_xorshift128plus[2];
 
@@ -153,20 +167,25 @@ concatenate_and_mkdir(void *ctx, const char *path, const 
char *name)
   return NULL;
 
new_path = ralloc_asprintf(ctx, "%s/%s", path, name);
 
if (mkdir_if_needed(new_path) == 0)
   return new_path;
else
   return NULL;
 }
 
+#define DRV_KEY_CPY(_dst, _src, _src_size) do { \
+   memcpy(_dst, _src, _src_size);\
+   _dst += _src_size;\
+} while (0)
+
 struct disk_cache *
 disk_cache_create(const char *gpu_name, const char *timestamp,
   uint64_t driver_flags)
 {
void *local;
struct disk_cache *cache = NULL;
char 

Re: [Mesa-dev] [PATCH 3/3] mesa: remove duplicate assignments in bind_xfb_buffers()

2017-08-23 Thread Timothy Arceri

2-3:

Reviewed-by: Timothy Arceri 

Honestly not sure about the first patch.


Re: [Mesa-dev] [PATCH 11/11] radv: Expose VK_KHX_multiview.

2017-08-23 Thread Dave Airlie
On 24 August 2017 at 06:51, Bas Nieuwenhuizen  wrote:
> ---
>  src/amd/vulkan/radv_device.c   | 17 +
>  src/amd/vulkan/radv_pipeline.c |  1 +
>  2 files changed, 18 insertions(+)

I've posted some comments on a couple of the patches, with those investigated,

Reviewed-by: Dave Airlie 

for the series.


Re: [Mesa-dev] [PATCH 10/11] radv: Implement multiview draws.

2017-08-23 Thread Dave Airlie
On 24 August 2017 at 06:51, Bas Nieuwenhuizen  wrote:
> ---
>  src/amd/vulkan/radv_cmd_buffer.c | 132 
> ++-
>  src/amd/vulkan/radv_private.h|   1 +
>  2 files changed, 105 insertions(+), 28 deletions(-)
>

This looks like it has a few more candidates for for_each_bit.

I really don't like the duplication of the draw PKT3s; I'm assuming
all other options are ugly.

Dave.


Re: [Mesa-dev] [PATCH] radeonsi: do not assert when reserving bindless slot 0

2017-08-23 Thread Michael Schellenberger Costa
Hi Samuel,

do you want to remove the assert entirely, or should this be something like

MAYBE_UNUSED unsigned res = util_idalloc_alloc(>bindless_used_slots);
assert(res != 0);

--Michael

-----Original Message-----
From: mesa-dev [mailto:mesa-dev-boun...@lists.freedesktop.org] On Behalf Of 
Samuel Pitoiset
Sent: Wednesday, 23 August 2017 09:43
To: mesa-dev@lists.freedesktop.org
Subject: [Mesa-dev] [PATCH] radeonsi: do not assert when reserving bindless 
slot 0

When assertions were disabled, the compiler removed
the call to util_idalloc_alloc() and the first allocated
bindless slot was 0 which is invalid per the spec.

Signed-off-by: Samuel Pitoiset 
---
 src/gallium/drivers/radeonsi/si_descriptors.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeonsi/si_descriptors.c 
b/src/gallium/drivers/radeonsi/si_descriptors.c
index f66ecc3e68..c53253ac8d 100644
--- a/src/gallium/drivers/radeonsi/si_descriptors.c
+++ b/src/gallium/drivers/radeonsi/si_descriptors.c
@@ -2192,7 +2192,7 @@ static void si_init_bindless_descriptors(struct 
si_context *sctx,
util_idalloc_resize(>bindless_used_slots, num_elements);
 
/* Reserve slot 0 because it's an invalid handle for bindless. */
-   assert(!util_idalloc_alloc(>bindless_used_slots));
+   util_idalloc_alloc(>bindless_used_slots);
 }
 
 static void si_release_bindless_descriptors(struct si_context *sctx)
-- 
2.14.1
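
The failure mode described in the commit message can be reproduced in isolation. The sketch below forces NDEBUG to mimic a release build; the slot allocator and function names are hypothetical stand-ins, not the util_idalloc API.

```c
/* Simulate a release build: with NDEBUG defined, assert(expr) expands to
 * ((void)0) and expr is never evaluated. */
#ifndef NDEBUG
#define NDEBUG
#endif
#include <assert.h>

static unsigned example_next_slot;

/* Hypothetical allocator: hands out 0, 1, 2, ... */
static unsigned example_alloc_slot(void)
{
    return example_next_slot++;
}

/* Buggy: the reservation lives inside assert() and vanishes under NDEBUG,
 * so the first real allocation gets the invalid slot 0. */
unsigned example_first_slot_buggy(void)
{
    example_next_slot = 0;
    assert(!example_alloc_slot());
    return example_alloc_slot();
}

/* Fixed: perform the side effect unconditionally, then check the result. */
unsigned example_first_slot_fixed(void)
{
    example_next_slot = 0;
    unsigned reserved = example_alloc_slot();
    (void)reserved; /* silences the unused warning when assert is a no-op */
    assert(reserved == 0);
    return example_alloc_slot();
}

/* Restore a working assert() for any code that follows. */
#undef NDEBUG
#include <assert.h>
```

The fixed variant keeps the sanity check from the original code while making sure the call itself survives in every build configuration, which is the shape suggested in the reply above.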



Re: [Mesa-dev] [PATCH 07/11] radv: Add multiview clears.

2017-08-23 Thread Dave Airlie
On 24 August 2017 at 06:51, Bas Nieuwenhuizen  wrote:
> ---
>  src/amd/vulkan/radv_cmd_buffer.c |  1 +
>  src/amd/vulkan/radv_meta_clear.c | 65 
> 
>  src/amd/vulkan/radv_private.h|  1 +
>  3 files changed, 48 insertions(+), 19 deletions(-)
>
> diff --git a/src/amd/vulkan/radv_cmd_buffer.c 
> b/src/amd/vulkan/radv_cmd_buffer.c
> index 94453094eb6..ed11a4aa35e 100644
> --- a/src/amd/vulkan/radv_cmd_buffer.c
> +++ b/src/amd/vulkan/radv_cmd_buffer.c
> @@ -1867,6 +1867,7 @@ radv_cmd_state_setup_attachments(struct radv_cmd_buffer 
> *cmd_buffer,
> }
>
> state->attachments[i].pending_clear_aspects = clear_aspects;
> +   state->attachments[i].cleared_views = 0;
> if (clear_aspects && info) {
> assert(info->clearValueCount > i);
> state->attachments[i].clear_value = 
> info->pClearValues[i];
> diff --git a/src/amd/vulkan/radv_meta_clear.c 
> b/src/amd/vulkan/radv_meta_clear.c
> index af76a517aaf..ea777d9979c 100644
> --- a/src/amd/vulkan/radv_meta_clear.c
> +++ b/src/amd/vulkan/radv_meta_clear.c
> @@ -337,7 +337,8 @@ radv_device_finish_meta_clear_state(struct radv_device 
> *device)
>  static void
>  emit_color_clear(struct radv_cmd_buffer *cmd_buffer,
>   const VkClearAttachment *clear_att,
> - const VkClearRect *clear_rect)
> + const VkClearRect *clear_rect,
> + uint32_t view_mask)
>  {
> struct radv_device *device = cmd_buffer->device;
> const struct radv_subpass *subpass = cmd_buffer->state.subpass;
> @@ -400,7 +401,14 @@ emit_color_clear(struct radv_cmd_buffer *cmd_buffer,
>
> radv_CmdSetScissor(radv_cmd_buffer_to_handle(cmd_buffer), 0, 1, 
> _rect->rect);
>
> -   radv_CmdDraw(cmd_buffer_h, 3, clear_rect->layerCount, 0, 
> clear_rect->baseArrayLayer);
> +   if (view_mask) {
> +   for (unsigned i = 0; (1u << i) <= view_mask; ++i)
> +   if ((1u << i) & view_mask) {

for_each_bit?

Dave.
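
For readers unfamiliar with the suggestion, a bit-iteration helper of this kind can be sketched as follows. The macro and function names are hypothetical and written portably; this is not the definition radv uses.

```c
#include <stdint.h>

/* Index of the lowest set bit; mask must be non-zero. */
static unsigned example_lowest_bit(uint32_t mask)
{
    unsigned i = 0;
    while (!(mask & 1u)) {
        mask >>= 1;
        ++i;
    }
    return i;
}

/* Visit each set bit of mask, lowest first. The loop variable is a copy of
 * the mask; "m &= m - 1" clears the bit that was just visited. */
#define EXAMPLE_FOR_EACH_BIT(bit, mask)                                   \
    for (uint32_t example_m_ = (mask);                                    \
         example_m_ != 0 &&                                               \
         ((bit) = example_lowest_bit(example_m_), 1);                     \
         example_m_ &= example_m_ - 1)

/* Collects the view indices selected by view_mask; returns how many. */
unsigned example_collect_views(uint32_t view_mask, unsigned out[32])
{
    unsigned n = 0;
    unsigned view;

    EXAMPLE_FOR_EACH_BIT(view, view_mask)
        out[n++] = view;
    return n;
}
```

Besides being shorter at the call site, this form never evaluates a shift by 32. The open-coded loop in the quoted patch would do so if bit 31 of view_mask were ever set, because its condition computes 1u << i again after the last increment.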
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] radv: fix resolve subpass restoring in compute resolve path.

2017-08-23 Thread Dave Airlie
From: Dave Airlie 

We need to restore the subpass before we do the fast clear flush.

Found while hacking around on Vega.

Fixes: 19be95f71 (radv: add subpass resolve compute path)
Signed-off-by: Dave Airlie 
---
 src/amd/vulkan/radv_meta_resolve_cs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/amd/vulkan/radv_meta_resolve_cs.c 
b/src/amd/vulkan/radv_meta_resolve_cs.c
index d20d042..6ac8601 100644
--- a/src/amd/vulkan/radv_meta_resolve_cs.c
+++ b/src/amd/vulkan/radv_meta_resolve_cs.c
@@ -543,7 +543,7 @@ radv_cmd_buffer_resolve_subpass_cs(struct radv_cmd_buffer 
*cmd_buffer)
 &(VkOffset2D) { 0, 0 },
 &(VkExtent2D) { fb->width, fb->height });
}
-
+   cmd_buffer->state.subpass = subpass;
radv_meta_restore_compute(_state, cmd_buffer, 16);
 
for (uint32_t i = 0; i < subpass->color_count; ++i) {
-- 
2.9.4



[Mesa-dev] [Bug 102038] assertion failure in update_framebuffer_size

2017-08-23 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=102038

--- Comment #16 from Bruce Cherniak  ---
Created attachment 133732
  --> https://bugs.freedesktop.org/attachment.cgi?id=133732=edit
Updated swr patch to fix msaa once new_patch_to_try has been applied.

Here's a patch that fixes the MSAA override validation.  "new-msaa-swr.patch"
should be included along with the "new_patch_to_try".

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


[Mesa-dev] [Bug 102038] assertion failure in update_framebuffer_size

2017-08-23 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=102038

Bruce Cherniak  changed:

   What|Removed |Added

 Attachment #133726|0   |1
is obsolete||

--- Comment #15 from Bruce Cherniak  ---
Comment on attachment 133726
  --> https://bugs.freedesktop.org/attachment.cgi?id=133726
Patch to fix swr after "new patch to try"

Never mind that patch; I found a bug in validating the MSAA override value.
Updating with a new version.



[Mesa-dev] [Bug 102017] Wrong colours in Cities Skyline

2017-08-23 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=102017

Timothy Arceri  changed:

   What|Removed |Added

 Resolution|--- |NOTABUG
 Status|NEW |RESOLVED

--- Comment #16 from Timothy Arceri  ---
Hopefully we can enable this by default soon; as far as I can tell, the patent
has only just over a month until it expires \o/



[Mesa-dev] [PATCH] i965: Simplify MOCS mashing in genX_state_upload.c.

2017-08-23 Thread Kenneth Graunke
Instead of having a proliferation of generation checks and MOCS values,
we can just #define MOCS_ALL to the generation-specific value for "use
as many caches as possible" and use that in various places.

This should make it easier to change MOCS values, as there are fewer
places that need updating.
---
 src/mesa/drivers/dri/i965/genX_state_upload.c | 26 +-
 1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c 
b/src/mesa/drivers/dri/i965/genX_state_upload.c
index f1e9fa38ffc..f2bbe4e9897 100644
--- a/src/mesa/drivers/dri/i965/genX_state_upload.c
+++ b/src/mesa/drivers/dri/i965/genX_state_upload.c
@@ -143,6 +143,16 @@ KSP(struct brw_context *brw, uint32_t offset)
 }
 #endif
 
+#if GEN_GEN == 10
+#define MOCS_ALL CNL_MOCS_WB
+#elif GEN_GEN == 9
+#define MOCS_ALL SKL_MOCS_WB
+#elif GEN_GEN == 8
+#define MOCS_ALL BDW_MOCS_WB
+#elif GEN_GEN == 7
+#define MOCS_ALL GEN7_MOCS_L3
+#endif
+
 #include "genxml/genX_pack.h"
 
 #define _brw_cmd_length(cmd) cmd ## _length
@@ -323,6 +333,7 @@ genX(emit_vertex_buffer_state)(struct brw_context *brw,
 
 #if GEN_GEN >= 7
   .AddressModifyEnable = true,
+  .VertexBufferMOCS = MOCS_ALL,
 #endif
 
 #if GEN_GEN < 8
@@ -331,16 +342,6 @@ genX(emit_vertex_buffer_state)(struct brw_context *brw,
 #if GEN_GEN >= 5
   .EndAddress = ro_bo(bo, end_offset - 1),
 #endif
-#endif
-
-#if GEN_GEN == 10
-  .VertexBufferMOCS = CNL_MOCS_WB,
-#elif GEN_GEN == 9
-  .VertexBufferMOCS = SKL_MOCS_WB,
-#elif GEN_GEN == 8
-  .VertexBufferMOCS = BDW_MOCS_WB,
-#elif GEN_GEN == 7
-  .VertexBufferMOCS = GEN7_MOCS_L3,
 #endif
};
 
@@ -847,7 +848,7 @@ genX(emit_index_buffer)(struct brw_context *brw)
   ib.IndexFormat = brw_get_index_type(index_buffer->index_size);
   ib.BufferStartingAddress = ro_bo(brw->ib.bo, 0);
 #if GEN_GEN >= 8
-  ib.IndexBufferMOCS = GEN_GEN >= 9 ? SKL_MOCS_WB : BDW_MOCS_WB;
+  ib.IndexBufferMOCS = MOCS_ALL;
   ib.BufferSize = brw->ib.size;
 #else
   ib.BufferEndingAddress = ro_bo(brw->ib.bo, brw->ib.size - 1);
@@ -3599,7 +3600,6 @@ genX(upload_3dstate_so_buffers)(struct brw_context *brw)
 #else
struct brw_transform_feedback_object *brw_obj =
   (struct brw_transform_feedback_object *) xfb_obj;
-   uint32_t mocs_wb = GEN_GEN >= 9 ? SKL_MOCS_WB : BDW_MOCS_WB;
 #endif
 
/* Set up the up to 4 output buffers.  These are the ranges defined in the
@@ -3634,7 +3634,7 @@ genX(upload_3dstate_so_buffers)(struct brw_context *brw)
  sob.SOBufferEnable = true;
  sob.StreamOffsetWriteEnable = true;
  sob.StreamOutputBufferOffsetAddressEnable = true;
- sob.SOBufferMOCS = mocs_wb;
+ sob.SOBufferMOCS = MOCS_ALL;
 
  sob.SurfaceSize = MAX2(xfb_obj->Size[i] / 4, 1) - 1;
  sob.StreamOutputBufferOffsetAddress =
-- 
2.14.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965: Do not store SRC after 0 on component control.

2017-08-23 Thread Anuj Phogat
On Wed, Aug 23, 2017 at 2:59 PM, Rafael Antognolli
 wrote:
> The PRM SKL-Vol 2b-05.16 says:
>
>"Within a VERTEX_ELEMENT_STATE structure, if a Component Control
>field is set to something other than VFCOMP_STORE_SRC, no
>higher-numbered Component Control fields may be set to
>VFCOMP_STORE_SRC. In other words, only trailing components can be set
>to something other than VFCOMP_STORE_SRC."
>
> Since we set the component 1 to VFCOMP_STORE_0 on gen8+, and
> VFCOMP_STORE_IID on gen5+, and we are not using components 2 and 3,
> let's also set them to VFCOMP_STORE_0.
>
> Signed-off-by: Rafael Antognolli 
> ---
>  src/intel/blorp/blorp_genX_exec.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/src/intel/blorp/blorp_genX_exec.h 
> b/src/intel/blorp/blorp_genX_exec.h
> index 93534169ef..524736fbc0 100644
> --- a/src/intel/blorp/blorp_genX_exec.h
> +++ b/src/intel/blorp/blorp_genX_exec.h
> @@ -395,8 +395,8 @@ blorp_emit_vertex_elements(struct blorp_batch *batch,
>  #else
>.Component1Control = VFCOMP_STORE_0,
>  #endif
> -  .Component2Control = VFCOMP_STORE_SRC,
> -  .Component3Control = VFCOMP_STORE_SRC,
> +  .Component2Control = VFCOMP_STORE_0,
> +  .Component3Control = VFCOMP_STORE_0,
>  #if GEN_GEN <= 5
>.DestinationElementOffset = slot * 4,
>  #endif
> --
> 2.13.5
>

Reviewed-by: Anuj Phogat 


Re: [Mesa-dev] [PATCH] i965: Do not store SRC after 0 on component control.

2017-08-23 Thread Jason Ekstrand
Assuming Jenkins is happy with it (both Vulkan and GL),

Reviewed-by: Jason Ekstrand 

On Wed, Aug 23, 2017 at 2:59 PM, Rafael Antognolli <
rafael.antogno...@intel.com> wrote:

> The PRM SKL-Vol 2b-05.16 says:
>
>"Within a VERTEX_ELEMENT_STATE structure, if a Component Control
>field is set to something other than VFCOMP_STORE_SRC, no
>higher-numbered Component Control fields may be set to
>VFCOMP_STORE_SRC. In other words, only trailing components can be set
>to something other than VFCOMP_STORE_SRC."
>
> Since we set the component 1 to VFCOMP_STORE_0 on gen8+, and
> VFCOMP_STORE_IID on gen5+, and we are not using components 2 and 3,
> let's also set them to VFCOMP_STORE_0.
>
> Signed-off-by: Rafael Antognolli 
> ---
>  src/intel/blorp/blorp_genX_exec.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/src/intel/blorp/blorp_genX_exec.h
> b/src/intel/blorp/blorp_genX_exec.h
> index 93534169ef..524736fbc0 100644
> --- a/src/intel/blorp/blorp_genX_exec.h
> +++ b/src/intel/blorp/blorp_genX_exec.h
> @@ -395,8 +395,8 @@ blorp_emit_vertex_elements(struct blorp_batch *batch,
>  #else
>.Component1Control = VFCOMP_STORE_0,
>  #endif
> -  .Component2Control = VFCOMP_STORE_SRC,
> -  .Component3Control = VFCOMP_STORE_SRC,
> +  .Component2Control = VFCOMP_STORE_0,
> +  .Component3Control = VFCOMP_STORE_0,
>  #if GEN_GEN <= 5
>.DestinationElementOffset = slot * 4,
>  #endif
> --
> 2.13.5
>
>


[Mesa-dev] [PATCH] i965: Do not store SRC after 0 on component control.

2017-08-23 Thread Rafael Antognolli
The PRM SKL-Vol 2b-05.16 says:

   "Within a VERTEX_ELEMENT_STATE structure, if a Component Control
   field is set to something other than VFCOMP_STORE_SRC, no
   higher-numbered Component Control fields may be set to
   VFCOMP_STORE_SRC. In other words, only trailing components can be set
   to something other than VFCOMP_STORE_SRC."

Since we set component 1 to VFCOMP_STORE_0 on gen8+ and to
VFCOMP_STORE_IID on gen5+, and we are not using components 2 and 3,
let's also set them to VFCOMP_STORE_0.

Signed-off-by: Rafael Antognolli 
---
 src/intel/blorp/blorp_genX_exec.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/intel/blorp/blorp_genX_exec.h 
b/src/intel/blorp/blorp_genX_exec.h
index 93534169ef..524736fbc0 100644
--- a/src/intel/blorp/blorp_genX_exec.h
+++ b/src/intel/blorp/blorp_genX_exec.h
@@ -395,8 +395,8 @@ blorp_emit_vertex_elements(struct blorp_batch *batch,
 #else
   .Component1Control = VFCOMP_STORE_0,
 #endif
-  .Component2Control = VFCOMP_STORE_SRC,
-  .Component3Control = VFCOMP_STORE_SRC,
+  .Component2Control = VFCOMP_STORE_0,
+  .Component3Control = VFCOMP_STORE_0,
 #if GEN_GEN <= 5
   .DestinationElementOffset = slot * 4,
 #endif
-- 
2.13.5



Re: [Mesa-dev] [PATCH] anv,i965: Move CS shared lowering into anv

2017-08-23 Thread Jordan Justen
Reviewed-by: Jordan Justen 

On 2017-08-23 14:00:29, Jason Ekstrand wrote:
> Right now, OpenGL uses the GLSL lowering for shared variables and anv
> uses NIR to lower them.  For a long time, we've done this weird thing
> where we do the NIR lowering unconditionally and then add the SLM sizes
> from the two together.  This works because one of them will always be 0
> but it's a bit sketchy.  Let's just move the NIR-based lowering into
> anv_pipeline and get rid of the sketch.
> 
> Cc: Jordan Justen 
> ---
>  src/intel/compiler/brw_fs.cpp   | 2 --
>  src/intel/vulkan/anv_pipeline.c | 5 +
>  2 files changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/src/intel/compiler/brw_fs.cpp b/src/intel/compiler/brw_fs.cpp
> index f2596e3..eb9b4c3 100644
> --- a/src/intel/compiler/brw_fs.cpp
> +++ b/src/intel/compiler/brw_fs.cpp
> @@ -6751,8 +6751,6 @@ brw_compile_cs(const struct brw_compiler *compiler, 
> void *log_data,
>  {
> nir_shader *shader = nir_shader_clone(mem_ctx, src_shader);
> shader = brw_nir_apply_sampler_key(shader, compiler, >tex, true);
> -   brw_nir_lower_cs_shared(shader);
> -   prog_data->base.total_shared += shader->num_shared;
>  
> /* Now that we cloned the nir_shader, we can update num_uniforms based on
>  * the thread_local_id_index.
> diff --git a/src/intel/vulkan/anv_pipeline.c b/src/intel/vulkan/anv_pipeline.c
> index 6ae682f..279d765 100644
> --- a/src/intel/vulkan/anv_pipeline.c
> +++ b/src/intel/vulkan/anv_pipeline.c
> @@ -381,6 +381,11 @@ anv_pipeline_compile(struct anv_pipeline *pipeline,
> if (stage != MESA_SHADER_COMPUTE)
>NIR_PASS_V(nir, anv_nir_lower_multiview, pipeline->subpass->view_mask);
>  
> +   if (stage == MESA_SHADER_COMPUTE) {
> +  NIR_PASS_V(nir, brw_nir_lower_cs_shared);
> +  prog_data->total_shared = nir->num_shared;
> +   }
> +
> nir_shader_gather_info(nir, nir_shader_get_entrypoint(nir));
>  
> /* Figure out the number of parameters */
> -- 
> 2.5.0.400.gff86faf
> 


[Mesa-dev] [PATCH 1/2] st/va: move YUV content to deinterlaced buffer when reallocated for encoder

2017-08-23 Thread Leo Liu
Signed-off-by: Leo Liu 
---
 src/gallium/state_trackers/va/picture.c | 40 +
 1 file changed, 36 insertions(+), 4 deletions(-)

diff --git a/src/gallium/state_trackers/va/picture.c 
b/src/gallium/state_trackers/va/picture.c
index 47e63d3b30..3c5eb5de97 100644
--- a/src/gallium/state_trackers/va/picture.c
+++ b/src/gallium/state_trackers/va/picture.c
@@ -626,9 +626,12 @@ vlVaEndPicture(VADriverContextP ctx, VAContextID 
context_id)
 PIPE_VIDEO_CAP_SUPPORTS_INTERLACED);
 
if (surf->buffer->interlaced != interlaced) {
-  surf->templat.interlaced = screen->get_video_param(screen, 
context->decoder->profile,
- 
PIPE_VIDEO_ENTRYPOINT_BITSTREAM,
- 
PIPE_VIDEO_CAP_PREFERS_INTERLACED);
+  surf->templat.interlaced =
+ (context->decoder->entrypoint == PIPE_VIDEO_ENTRYPOINT_ENCODE) ?
+ interlaced :
+ screen->get_video_param(screen, context->decoder->profile,
+ PIPE_VIDEO_ENTRYPOINT_BITSTREAM,
+ PIPE_VIDEO_CAP_PREFERS_INTERLACED);
   realloc = true;
}
 
@@ -657,13 +660,42 @@ vlVaEndPicture(VADriverContextP ctx, VAContextID 
context_id)
}
 
if (realloc) {
-  surf->buffer->destroy(surf->buffer);
+  struct pipe_video_buffer *old_buf = surf->buffer;
 
   if (vlVaHandleSurfaceAllocate(ctx, surf, >templat) != 
VA_STATUS_SUCCESS) {
  mtx_unlock(>mutex);
  return VA_STATUS_ERROR_ALLOCATION_FAILED;
   }
 
+  if (context->decoder->entrypoint == PIPE_VIDEO_ENTRYPOINT_ENCODE) {
+ struct vl_compositor *compositor = >compositor;
+ struct vl_compositor_state *s = >cstate;
+ struct pipe_surface **dst_surface;
+ struct u_rect dst_rect;
+
+ dst_surface = surf->buffer->get_surfaces(surf->buffer);
+ vl_compositor_clear_layers(s);
+
+ dst_rect.x0 = 0;
+ dst_rect.x1 = old_buf->width;
+ dst_rect.y0 = 0;
+ dst_rect.y1 = old_buf->height;
+
+ vl_compositor_set_yuv_layer(s, compositor, 0, old_buf, NULL, NULL, 
true);
+ vl_compositor_set_layer_dst_area(s, 0, &dst_rect);
+ vl_compositor_render(s, compositor, dst_surface[0], NULL, false);
+
+ dst_rect.x1 /= 2;
+ dst_rect.y1 /= 2;
+
+ vl_compositor_set_yuv_layer(s, compositor, 0, old_buf, NULL, NULL, 
false);
+ vl_compositor_set_layer_dst_area(s, 0, &dst_rect);
+ vl_compositor_render(s, compositor, dst_surface[1], NULL, false);
+
+ context->decoder->context->flush(context->decoder->context, NULL, 0);
+  }
+
+  old_buf->destroy(old_buf);
   context->target = surf->buffer;
}
 
-- 
2.11.0



[Mesa-dev] [PATCH 2/2] Revert "st/va: add enviromental variable to disable interlace"

2017-08-23 Thread Leo Liu
This reverts commit 10dec2de2d9f568675d66d736b48701fa26f7b50.
---
 src/gallium/state_trackers/va/surface.c | 4 
 1 file changed, 4 deletions(-)

diff --git a/src/gallium/state_trackers/va/surface.c 
b/src/gallium/state_trackers/va/surface.c
index b116fc3f27..67773cf76a 100644
--- a/src/gallium/state_trackers/va/surface.c
+++ b/src/gallium/state_trackers/va/surface.c
@@ -43,8 +43,6 @@
 
 #include "va_private.h"
 
-DEBUG_GET_ONCE_BOOL_OPTION(nointerlace, "VAAPI_DISABLE_INTERLACE", FALSE);
-
 #include 
 
 static const enum pipe_format vpp_surface_formats[] = {
@@ -709,8 +707,6 @@ vlVaCreateSurfaces2(VADriverContextP ctx, unsigned int 
format,
 
templat.width = width;
templat.height = height;
-   if (debug_get_option_nointerlace())
-  templat.interlaced = false;
 
memset(surfaces, VA_INVALID_ID, num_surfaces * sizeof(VASurfaceID));
 
-- 
2.11.0



[Mesa-dev] [PATCH] anv,i965: Move CS shared lowering into anv

2017-08-23 Thread Jason Ekstrand
Right now, OpenGL uses the GLSL lowering for shared variables and anv
uses NIR to lower them.  For a long time, we've done this weird thing
where we do the NIR lowering unconditionally and then add the SLM sizes
from the two together.  This works because one of them will always be 0
but it's a bit sketchy.  Let's just move the NIR-based lowering into
anv_pipeline and get rid of the sketch.

Cc: Jordan Justen 
---
 src/intel/compiler/brw_fs.cpp   | 2 --
 src/intel/vulkan/anv_pipeline.c | 5 +
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/src/intel/compiler/brw_fs.cpp b/src/intel/compiler/brw_fs.cpp
index f2596e3..eb9b4c3 100644
--- a/src/intel/compiler/brw_fs.cpp
+++ b/src/intel/compiler/brw_fs.cpp
@@ -6751,8 +6751,6 @@ brw_compile_cs(const struct brw_compiler *compiler, void 
*log_data,
 {
nir_shader *shader = nir_shader_clone(mem_ctx, src_shader);
shader = brw_nir_apply_sampler_key(shader, compiler, >tex, true);
-   brw_nir_lower_cs_shared(shader);
-   prog_data->base.total_shared += shader->num_shared;
 
/* Now that we cloned the nir_shader, we can update num_uniforms based on
 * the thread_local_id_index.
diff --git a/src/intel/vulkan/anv_pipeline.c b/src/intel/vulkan/anv_pipeline.c
index 6ae682f..279d765 100644
--- a/src/intel/vulkan/anv_pipeline.c
+++ b/src/intel/vulkan/anv_pipeline.c
@@ -381,6 +381,11 @@ anv_pipeline_compile(struct anv_pipeline *pipeline,
if (stage != MESA_SHADER_COMPUTE)
   NIR_PASS_V(nir, anv_nir_lower_multiview, pipeline->subpass->view_mask);
 
+   if (stage == MESA_SHADER_COMPUTE) {
+  NIR_PASS_V(nir, brw_nir_lower_cs_shared);
+  prog_data->total_shared = nir->num_shared;
+   }
+
nir_shader_gather_info(nir, nir_shader_get_entrypoint(nir));
 
/* Figure out the number of parameters */
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH 11/11] radv: Expose VK_KHX_multiview.

2017-08-23 Thread Bas Nieuwenhuizen
---
 src/amd/vulkan/radv_device.c   | 17 +
 src/amd/vulkan/radv_pipeline.c |  1 +
 2 files changed, 18 insertions(+)

diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
index 9bdad6ad6fd..9174562b1bf 100644
--- a/src/amd/vulkan/radv_device.c
+++ b/src/amd/vulkan/radv_device.c
@@ -178,6 +178,10 @@ static const VkExtensionProperties 
ext_sema_device_extensions[] = {
.extensionName = VK_KHR_EXTERNAL_SEMAPHORE_FD_EXTENSION_NAME,
.specVersion = 1,
},
+   {
+   .extensionName = VK_KHX_MULTIVIEW_EXTENSION_NAME,
+   .specVersion = 1,
+   },
 };
 
 static VkResult
@@ -628,6 +632,13 @@ void radv_GetPhysicalDeviceFeatures2KHR(
features->variablePointers = false;
break;
}
+   case VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_MULTIVIEW_FEATURES_KHX: {
+   VkPhysicalDeviceMultiviewFeaturesKHX *features = 
(VkPhysicalDeviceMultiviewFeaturesKHX*)ext;
+   features->multiview = true;
+   features->multiviewGeometryShader = true;
+   features->multiviewTessellationShader = true;
+   break;
+   }
default:
break;
}
@@ -804,6 +815,12 @@ void radv_GetPhysicalDeviceProperties2KHR(
properties->deviceLUIDValid = false;
break;
}
+   case 
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_MULTIVIEW_PROPERTIES_KHX: {
+   VkPhysicalDeviceMultiviewPropertiesKHX *properties = 
(VkPhysicalDeviceMultiviewPropertiesKHX*)ext;
+   properties->maxMultiviewViewCount = MAX_VIEWS;
+   properties->maxMultiviewInstanceIndex = INT_MAX;
+   break;
+   }
default:
break;
}
diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c
index 60740c58c2e..5a74dfabd9c 100644
--- a/src/amd/vulkan/radv_pipeline.c
+++ b/src/amd/vulkan/radv_pipeline.c
@@ -230,6 +230,7 @@ radv_shader_compile_to_nir(struct radv_device *device,
.image_write_without_format = true,
.tessellation = true,
.int64 = true,
+   .multiview = true,
.variable_pointers = true,
};
entry_point = spirv_to_nir(spirv, module->size / 4,
-- 
2.14.1



[Mesa-dev] [PATCH 08/11] ac/nir: Add shader support for multiviews.

2017-08-23 Thread Bas Nieuwenhuizen
It uses a user SGPR to pass the view index to the shaders, except
for the fragment shader where we use layer=view (which comes in
handy when we want to do the NV ext that allows us to execute pre-FS
stages once instead of per view).
---
 src/amd/common/ac_nir_to_llvm.c | 38 +-
 src/amd/common/ac_nir_to_llvm.h |  4 +++-
 src/amd/common/ac_shader_info.c |  3 +++
 src/amd/common/ac_shader_info.h |  1 +
 4 files changed, 44 insertions(+), 2 deletions(-)

diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index 90406e88dfb..ab5be457cf7 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -90,6 +90,7 @@ struct nir_to_llvm_context {
LLVMValueRef descriptor_sets[AC_UD_MAX_SETS];
LLVMValueRef ring_offsets;
LLVMValueRef push_constants;
+   LLVMValueRef view_index;
LLVMValueRef num_work_groups;
LLVMValueRef workgroup_ids;
LLVMValueRef local_invocation_ids;
@@ -744,6 +745,8 @@ static void create_function(struct nir_to_llvm_context *ctx)
if (ctx->shader_info->info.vs.needs_draw_id)
add_user_sgpr_argument(, ctx->i32, 
>abi.draw_id); // draw id
}
+   if (ctx->shader_info->info.needs_multiview_view_index || 
(!ctx->options->key.vs.as_es && !ctx->options->key.vs.as_ls && 
ctx->options->key.has_multiview_view_index))
+   add_user_sgpr_argument(, ctx->i32, 
>view_index);
if (ctx->options->key.vs.as_es)
add_sgpr_argument(, ctx->i32, >es2gs_offset); 
// es2gs offset
else if (ctx->options->key.vs.as_ls)
@@ -760,6 +763,8 @@ static void create_function(struct nir_to_llvm_context *ctx)
add_user_sgpr_argument(, ctx->i32, >tcs_out_offsets); 
// tcs out offsets
add_user_sgpr_argument(, ctx->i32, >tcs_out_layout); 
// tcs out layout
add_user_sgpr_argument(, ctx->i32, >tcs_in_layout); 
// tcs in layout
+   if (ctx->shader_info->info.needs_multiview_view_index)
+   add_user_sgpr_argument(, ctx->i32, 
>view_index);
add_sgpr_argument(, ctx->i32, >oc_lds); // param oc 
lds
add_sgpr_argument(, ctx->i32, >tess_factor_offset); 
// tess factor offset
add_vgpr_argument(, ctx->i32, >tcs_patch_id); // 
patch id
@@ -767,6 +772,8 @@ static void create_function(struct nir_to_llvm_context *ctx)
break;
case MESA_SHADER_TESS_EVAL:
add_user_sgpr_argument(, ctx->i32, 
>tcs_offchip_layout); // tcs offchip layout
+   if (ctx->shader_info->info.needs_multiview_view_index || 
(!ctx->options->key.tes.as_es && ctx->options->key.has_multiview_view_index))
+   add_user_sgpr_argument(, ctx->i32, 
>view_index);
if (ctx->options->key.tes.as_es) {
add_sgpr_argument(, ctx->i32, >oc_lds); // OC 
LDS
add_sgpr_argument(, ctx->i32, NULL); //
@@ -783,6 +790,8 @@ static void create_function(struct nir_to_llvm_context *ctx)
case MESA_SHADER_GEOMETRY:
add_user_sgpr_argument(, ctx->i32, 
>gsvs_ring_stride); // gsvs stride
add_user_sgpr_argument(, ctx->i32, 
>gsvs_num_entries); // gsvs num entires
+   if (ctx->shader_info->info.needs_multiview_view_index)
+   add_user_sgpr_argument(, ctx->i32, 
>view_index);
add_sgpr_argument(, ctx->i32, >gs2vs_offset); // 
gs2vs offset
add_sgpr_argument(, ctx->i32, >gs_wave_id); // wave id
add_vgpr_argument(, ctx->i32, >gs_vtx_offset[0]); // 
vtx0
@@ -894,6 +903,8 @@ static void create_function(struct nir_to_llvm_context *ctx)
 
set_userdata_location_shader(ctx, 
AC_UD_VS_BASE_VERTEX_START_INSTANCE, _sgpr_idx, vs_num);
}
+   if (ctx->view_index)
+   set_userdata_location_shader(ctx, AC_UD_VIEW_INDEX, 
_sgpr_idx, 1);
if (ctx->options->key.vs.as_ls) {
set_userdata_location_shader(ctx, 
AC_UD_VS_LS_TCS_IN_LAYOUT, _sgpr_idx, 1);
}
@@ -902,13 +913,19 @@ static void create_function(struct nir_to_llvm_context 
*ctx)
break;
case MESA_SHADER_TESS_CTRL:
set_userdata_location_shader(ctx, AC_UD_TCS_OFFCHIP_LAYOUT, 
_sgpr_idx, 4);
+   if (ctx->view_index)
+   set_userdata_location_shader(ctx, AC_UD_VIEW_INDEX, 
_sgpr_idx, 1);
declare_tess_lds(ctx);
break;
case MESA_SHADER_TESS_EVAL:
set_userdata_location_shader(ctx, AC_UD_TES_OFFCHIP_LAYOUT, 
_sgpr_idx, 1);
+   if (ctx->view_index)
+   set_userdata_location_shader(ctx, AC_UD_VIEW_INDEX, 
_sgpr_idx, 1);
break;
case 

[Mesa-dev] [Bug 102017] Wrong colours in Cities Skyline

2017-08-23 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=102017

--- Comment #15 from Marcin Gałązka  ---
@Gert - great catch! It seems that Steam, as provided via the official
Ubuntu/Canonical repos, depends on libtxc-dxtn-s2tc:i386 only, so one needs
to install libtxc-dxtn-s2tc:amd64 manually. See
https://bugs.launchpad.net/ubuntu/+source/steam/+bug/1684755,
https://github.com/ValveSoftware/Dota-2/issues/1213 for some more details.

After installing the amd64 package the game appears to look fine. I suppose
that we can close this issue. (?)

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


[Mesa-dev] [PATCH 10/11] radv: Implement multiview draws.

2017-08-23 Thread Bas Nieuwenhuizen
---
 src/amd/vulkan/radv_cmd_buffer.c | 132 ++-
 src/amd/vulkan/radv_private.h|   1 +
 2 files changed, 105 insertions(+), 28 deletions(-)

diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index ed11a4aa35e..4960bdf758a 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -2657,6 +2657,28 @@ void radv_CmdNextSubpass(
radv_cmd_buffer_clear_subpass(cmd_buffer);
 }
 
+static void radv_emit_view_index(struct radv_cmd_buffer *cmd_buffer, unsigned 
index)
+{
+   struct radv_pipeline *pipeline = cmd_buffer->state.pipeline;
+   for (unsigned stage = 0; stage < MESA_SHADER_STAGES; ++stage) {
+   if (!pipeline->shaders[stage])
+   continue;
+   struct ac_userdata_info *loc = radv_lookup_user_sgpr(pipeline, 
stage, AC_UD_VIEW_INDEX);
+   if (loc->sgpr_idx == -1)
+   continue;
+   uint32_t base_reg = radv_shader_stage_to_user_data_0(stage, 
radv_pipeline_has_gs(pipeline), radv_pipeline_has_tess(pipeline));
+   radeon_set_sh_reg(cmd_buffer->cs, base_reg + loc->sgpr_idx * 4, 
index);
+
+   }
+   if (pipeline->gs_copy_shader) {
+   struct ac_userdata_info *loc = 
&pipeline->gs_copy_shader->info.user_sgprs_locs.shader_data[AC_UD_VIEW_INDEX];
+   if (loc->sgpr_idx != -1) {
+   uint32_t base_reg = R_00B130_SPI_SHADER_USER_DATA_VS_0;
+   radeon_set_sh_reg(cmd_buffer->cs, base_reg + 
loc->sgpr_idx * 4, index);
+   }
+   }
+}
+
 void radv_CmdDraw(
VkCommandBuffer commandBuffer,
uint32_tvertexCount,
@@ -2668,7 +2690,7 @@ void radv_CmdDraw(
 
radv_cmd_buffer_flush_state(cmd_buffer, false, (instanceCount > 1), 
false, vertexCount);
 
-   MAYBE_UNUSED unsigned cdw_max = 
radeon_check_space(cmd_buffer->device->ws, cmd_buffer->cs, 10);
+   MAYBE_UNUSED unsigned cdw_max = 
radeon_check_space(cmd_buffer->device->ws, cmd_buffer->cs, 12 * MAX_VIEWS);
 
assert(cmd_buffer->state.pipeline->graphics.vtx_base_sgpr);
radeon_set_sh_reg_seq(cmd_buffer->cs, 
cmd_buffer->state.pipeline->graphics.vtx_base_sgpr,
@@ -2681,10 +2703,24 @@ void radv_CmdDraw(
radeon_emit(cmd_buffer->cs, PKT3(PKT3_NUM_INSTANCES, 0, 
cmd_buffer->state.predicating));
radeon_emit(cmd_buffer->cs, instanceCount);
 
-   radeon_emit(cmd_buffer->cs, PKT3(PKT3_DRAW_INDEX_AUTO, 1, 
cmd_buffer->state.predicating));
-   radeon_emit(cmd_buffer->cs, vertexCount);
-   radeon_emit(cmd_buffer->cs, V_0287F0_DI_SRC_SEL_AUTO_INDEX |
-   S_0287F0_USE_OPAQUE(0));
+   if (!cmd_buffer->state.subpass->view_mask) {
+   radeon_emit(cmd_buffer->cs, PKT3(PKT3_DRAW_INDEX_AUTO, 1, 
cmd_buffer->state.predicating));
+   radeon_emit(cmd_buffer->cs, vertexCount);
+   radeon_emit(cmd_buffer->cs, V_0287F0_DI_SRC_SEL_AUTO_INDEX |
+   S_0287F0_USE_OPAQUE(0));
+   } else {
+   for (unsigned i = 0; (1u << i) <= 
cmd_buffer->state.subpass->view_mask; ++i) {
+   if ((1u << i) & cmd_buffer->state.subpass->view_mask) {
+   radv_emit_view_index(cmd_buffer, i);
+
+   radeon_emit(cmd_buffer->cs, 
PKT3(PKT3_DRAW_INDEX_AUTO, 1, cmd_buffer->state.predicating));
+   radeon_emit(cmd_buffer->cs, vertexCount);
+   radeon_emit(cmd_buffer->cs, 
V_0287F0_DI_SRC_SEL_AUTO_INDEX |
+   
S_0287F0_USE_OPAQUE(0));
+
+   }
+   }
+   }
 
assert(cmd_buffer->cs->cdw <= cdw_max);
 
@@ -2705,7 +2741,7 @@ void radv_CmdDrawIndexed(
 
radv_cmd_buffer_flush_state(cmd_buffer, true, (instanceCount > 1), 
false, indexCount);
 
-   MAYBE_UNUSED unsigned cdw_max = 
radeon_check_space(cmd_buffer->device->ws, cmd_buffer->cs, 15);
+   MAYBE_UNUSED unsigned cdw_max = 
radeon_check_space(cmd_buffer->device->ws, cmd_buffer->cs, 17 * MAX_VIEWS);
 
if (cmd_buffer->device->physical_device->rad_info.chip_class >= GFX9) {
radeon_set_uconfig_reg_idx(cmd_buffer->cs, 
R_03090C_VGT_INDEX_TYPE,
@@ -2728,12 +2764,28 @@ void radv_CmdDrawIndexed(
 
index_va = cmd_buffer->state.index_va;
index_va += firstIndex * index_size;
-   radeon_emit(cmd_buffer->cs, PKT3(PKT3_DRAW_INDEX_2, 4, false));
-   radeon_emit(cmd_buffer->cs, cmd_buffer->state.max_index_count);
-   radeon_emit(cmd_buffer->cs, index_va);
-   radeon_emit(cmd_buffer->cs, (index_va >> 32UL) & 0xFF);
-   radeon_emit(cmd_buffer->cs, indexCount);
-   radeon_emit(cmd_buffer->cs, V_0287F0_DI_SRC_SEL_DMA);
+   if (!cmd_buffer->state.subpass->view_mask) {
+

[Mesa-dev] [PATCH 06/11] radv: Store multiview info in renderpass.

2017-08-23 Thread Bas Nieuwenhuizen
---
 src/amd/vulkan/radv_pass.c| 25 -
 src/amd/vulkan/radv_private.h |  3 +++
 2 files changed, 27 insertions(+), 1 deletion(-)

diff --git a/src/amd/vulkan/radv_pass.c b/src/amd/vulkan/radv_pass.c
index 17eff3937ac..a52dae39d93 100644
--- a/src/amd/vulkan/radv_pass.c
+++ b/src/amd/vulkan/radv_pass.c
@@ -26,6 +26,8 @@
  */
 #include "radv_private.h"
 
+#include "vk_util.h"
+
 VkResult radv_CreateRenderPass(
VkDevice_device,
const VkRenderPassCreateInfo*   pCreateInfo,
@@ -36,6 +38,7 @@ VkResult radv_CreateRenderPass(
struct radv_render_pass *pass;
size_t size;
size_t attachments_offset;
+   VkRenderPassMultiviewCreateInfoKHX *multiview_info = NULL;
 
assert(pCreateInfo->sType == VK_STRUCTURE_TYPE_RENDER_PASS_CREATE_INFO);
 
@@ -54,6 +57,16 @@ VkResult radv_CreateRenderPass(
pass->subpass_count = pCreateInfo->subpassCount;
pass->attachments = (void *) pass + attachments_offset;
 
+   vk_foreach_struct(ext, pCreateInfo->pNext) {
+   switch(ext->sType) {
+   case  VK_STRUCTURE_TYPE_RENDER_PASS_MULTIVIEW_CREATE_INFO_KHX:
+   multiview_info = ( 
VkRenderPassMultiviewCreateInfoKHX*)ext;
+   break;
+   default:
+   break;
+   }
+   }
+
for (uint32_t i = 0; i < pCreateInfo->attachmentCount; i++) {
struct radv_render_pass_attachment *att = &pass->attachments[i];
 
@@ -97,6 +110,8 @@ VkResult radv_CreateRenderPass(
 
subpass->input_count = desc->inputAttachmentCount;
subpass->color_count = desc->colorAttachmentCount;
+   if (multiview_info)
+   subpass->view_mask = multiview_info->pViewMasks[i];
 
if (desc->inputAttachmentCount > 0) {
subpass->input_attachments = p;
@@ -105,6 +120,8 @@ VkResult radv_CreateRenderPass(
for (uint32_t j = 0; j < desc->inputAttachmentCount; 
j++) {
subpass->input_attachments[j]
= desc->pInputAttachments[j];
+   if (desc->pInputAttachments[j].attachment != 
VK_ATTACHMENT_UNUSED)
+   
pass->attachments[desc->pInputAttachments[j].attachment].view_mask |= 
subpass->view_mask;
}
}
 
@@ -115,6 +132,8 @@ VkResult radv_CreateRenderPass(
for (uint32_t j = 0; j < desc->colorAttachmentCount; 
j++) {
subpass->color_attachments[j]
= desc->pColorAttachments[j];
+   if (desc->pColorAttachments[j].attachment != 
VK_ATTACHMENT_UNUSED)
+   
pass->attachments[desc->pColorAttachments[j].attachment].view_mask |= 
subpass->view_mask;
}
}
 
@@ -127,14 +146,18 @@ VkResult radv_CreateRenderPass(
uint32_t a = 
desc->pResolveAttachments[j].attachment;
subpass->resolve_attachments[j]
= desc->pResolveAttachments[j];
-   if (a != VK_ATTACHMENT_UNUSED)
+   if (a != VK_ATTACHMENT_UNUSED) {
subpass->has_resolve = true;
+   
pass->attachments[desc->pResolveAttachments[j].attachment].view_mask |= 
subpass->view_mask;
+   }
}
}
 
if (desc->pDepthStencilAttachment) {
subpass->depth_stencil_attachment =
*desc->pDepthStencilAttachment;
+   if (desc->pDepthStencilAttachment->attachment != 
VK_ATTACHMENT_UNUSED)
+   
pass->attachments[desc->pDepthStencilAttachment->attachment].view_mask |= 
subpass->view_mask;
} else {
subpass->depth_stencil_attachment.attachment = 
VK_ATTACHMENT_UNUSED;
}
diff --git a/src/amd/vulkan/radv_private.h b/src/amd/vulkan/radv_private.h
index 05db2f0f82f..79238f799be 100644
--- a/src/amd/vulkan/radv_private.h
+++ b/src/amd/vulkan/radv_private.h
@@ -1463,6 +1463,8 @@ struct radv_subpass {
bool has_resolve;
 
struct radv_subpass_barrier  start_barrier;
+
+   uint32_t view_mask;
 };
 
 struct radv_render_pass_attachment {
@@ -1472,6 +1474,7 @@ struct radv_render_pass_attachment {
VkAttachmentLoadOp   stencil_load_op;
VkImageLayoutinitial_layout;
VkImageLayout

[Mesa-dev] [PATCH 09/11] radv: Implement determining the has_multiview_view_index key.

2017-08-23 Thread Bas Nieuwenhuizen
---
 src/amd/vulkan/radv_pipeline.c | 20 
 1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c
index 96917814e56..60740c58c2e 100644
--- a/src/amd/vulkan/radv_pipeline.c
+++ b/src/amd/vulkan/radv_pipeline.c
@@ -493,7 +493,8 @@ radv_pipeline_create_gs_copy_shader(struct radv_pipeline 
*pipeline,
struct nir_shader *nir,
void** code_out,
unsigned *code_size_out,
-   bool dump_shader)
+   bool dump_shader,
+   bool multiview)
 {
struct radv_shader_variant *variant = calloc(1, sizeof(struct 
radv_shader_variant));
enum radeon_family chip_family = 
pipeline->device->physical_device->rad_info.family;
@@ -506,6 +507,7 @@ radv_pipeline_create_gs_copy_shader(struct radv_pipeline 
*pipeline,
enum ac_target_machine_options tm_options = 0;
options.family = chip_family;
options.chip_class = 
pipeline->device->physical_device->rad_info.chip_class;
+   options.key.has_multiview_view_index = multiview;
if (options.supports_spill)
tm_options |= AC_TM_SUPPORTS_SPILL;
if (pipeline->device->instance->perftest_flags & RADV_PERFTEST_SISCHED)
@@ -590,7 +592,7 @@ radv_pipeline_compile(struct radv_pipeline *pipeline,
void *gs_copy_code = NULL;
unsigned gs_copy_code_size = 0;
pipeline->gs_copy_shader = radv_pipeline_create_gs_copy_shader(
-   pipeline, nir, &gs_copy_code, &gs_copy_code_size, dump);
+   pipeline, nir, &gs_copy_code, &gs_copy_code_size, dump, 
key->has_multiview_view_index);
 
if (pipeline->gs_copy_shader) {
pipeline->gs_copy_shader =
@@ -645,7 +647,8 @@ radv_tess_pipeline_compile(struct radv_pipeline *pipeline,
   const VkSpecializationInfo *tcs_spec_info,
   const VkSpecializationInfo *tes_spec_info,
   struct radv_pipeline_layout *layout,
-  unsigned input_vertices)
+  unsigned input_vertices,
+  bool has_view_index)
 {
unsigned char tcs_sha1[20], tes_sha1[20];
struct radv_shader_variant *tes_variant = NULL, *tcs_variant = NULL;
@@ -658,6 +661,7 @@ radv_tess_pipeline_compile(struct radv_pipeline *pipeline,
 
tes_key = radv_compute_tes_key(radv_pipeline_has_gs(pipeline),
   
pipeline->shaders[MESA_SHADER_FRAGMENT]->info.fs.prim_id_input);
+   tes_key.has_multiview_view_index = has_view_index;
if (tes_module->nir)
_mesa_sha1_compute(tes_module->nir->info.name,
   strlen(tes_module->nir->info.name),
@@ -2041,7 +2045,12 @@ radv_pipeline_init(struct radv_pipeline *pipeline,
 {
struct radv_shader_module fs_m = {0};
VkResult result;
+   bool has_view_index = false;
 
+   RADV_FROM_HANDLE(radv_render_pass, pass, pCreateInfo->renderPass);
+   struct radv_subpass *subpass = pass->subpasses + pCreateInfo->subpass;
+   if (subpass->view_mask)
+   has_view_index = true;
if (alloc == NULL)
alloc = >alloc;
 
@@ -2099,6 +2108,7 @@ radv_pipeline_init(struct radv_pipeline *pipeline,
else if 
(pipeline->shaders[MESA_SHADER_FRAGMENT]->info.fs.prim_id_input)
export_prim_id = true;
struct ac_shader_variant_key key = 
radv_compute_vs_key(pCreateInfo, as_es, as_ls, export_prim_id);
+   key.has_multiview_view_index = has_view_index;
 
pipeline->shaders[MESA_SHADER_VERTEX] =
 radv_pipeline_compile(pipeline, cache, 
modules[MESA_SHADER_VERTEX],
@@ -2112,6 +2122,7 @@ radv_pipeline_init(struct radv_pipeline *pipeline,
 
if (modules[MESA_SHADER_GEOMETRY]) {
struct ac_shader_variant_key key = 
radv_compute_vs_key(pCreateInfo, false, false, false);
+   key.has_multiview_view_index = has_view_index;
 
pipeline->shaders[MESA_SHADER_GEOMETRY] =
 radv_pipeline_compile(pipeline, cache, 
modules[MESA_SHADER_GEOMETRY],
@@ -2135,7 +2146,8 @@ radv_pipeline_init(struct radv_pipeline *pipeline,
   
pStages[MESA_SHADER_TESS_CTRL]->pSpecializationInfo,
   
pStages[MESA_SHADER_TESS_EVAL]->pSpecializationInfo,
   pipeline->layout,
-  
pCreateInfo->pTessellationState->patchControlPoints);
+  
pCreateInfo->pTessellationState->patchControlPoints,
+ 

[Mesa-dev] [PATCH 04/11] radv: Use 0 for the layer id if the vertex shader does not export it.

2017-08-23 Thread Bas Nieuwenhuizen
This is for the case where we have e.g. input attachments, but there is
no layer export in the previous shader and hence no layered rendering.
---
 src/amd/vulkan/radv_pipeline.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c
index bd5eeb776c4..5d94acc1519 100644
--- a/src/amd/vulkan/radv_pipeline.c
+++ b/src/amd/vulkan/radv_pipeline.c
@@ -1995,10 +1995,11 @@ static void calculate_ps_inputs(struct radv_pipeline 
*pipeline)
 
if (ps->info.fs.layer_input) {
unsigned vs_offset = 
outinfo->vs_output_param_offset[VARYING_SLOT_LAYER];
-   if (vs_offset != AC_EXP_PARAM_UNDEFINED) {
+   if (vs_offset != AC_EXP_PARAM_UNDEFINED)
pipeline->graphics.ps_input_cntl[ps_offset] = 
offset_to_ps_input(vs_offset, true);
-   ++ps_offset;
-   }
+   else
+   pipeline->graphics.ps_input_cntl[ps_offset] = 
offset_to_ps_input(AC_EXP_PARAM_DEFAULT_VAL_, true);
+   ++ps_offset;
}
 
if (ps->info.fs.has_pcoord) {
-- 
2.14.1



[Mesa-dev] [PATCH 05/11] ac/nir: Make shader key a struct.

2017-08-23 Thread Bas Nieuwenhuizen
Some bits can be passed to almost every shader, and I don't like
adding 5 variables.
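
As a rough illustration of the layout this moves to (a minimal sketch,
not the actual Mesa definitions: the per-stage structs, the
common_flag field and the helper names are hypothetical), wrapping the
union in a struct leaves room for bits shared by all stages:

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical per-stage keys; field names are illustrative only. */
struct vs_key { uint32_t as_es; };
struct fs_key { uint32_t col_format; };

struct shader_key {
	/* Stage-specific part: only one member is meaningful per shader. */
	union {
		struct vs_key vs;
		struct fs_key fs;
	};
	/* Stage-independent part (hypothetical example field). */
	bool common_flag;
};

/* Build a zeroed key so that padding bytes are deterministic and the
 * whole struct can be hashed or compared with memcmp. */
static struct shader_key make_vs_key(bool as_es)
{
	struct shader_key key;
	memset(&key, 0, sizeof(key));
	key.vs.as_es = as_es;
	return key;
}

/* Two keys are the same variant only if every byte matches. */
static bool keys_equal(const struct shader_key *a, const struct shader_key *b)
{
	return memcmp(a, b, sizeof(*a)) == 0;
}
```

The anonymous union keeps existing accesses such as key.vs.as_es
unchanged, which is why most of the diff is a type rename.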
---
 src/amd/common/ac_nir_to_llvm.h  | 14 --
 src/amd/vulkan/radv_pipeline.c   | 26 +-
 src/amd/vulkan/radv_pipeline_cache.c |  2 +-
 src/amd/vulkan/radv_private.h|  4 ++--
 4 files changed, 24 insertions(+), 22 deletions(-)

diff --git a/src/amd/common/ac_nir_to_llvm.h b/src/amd/common/ac_nir_to_llvm.h
index 376db1387a4..2a9ea8a2b3f 100644
--- a/src/amd/common/ac_nir_to_llvm.h
+++ b/src/amd/common/ac_nir_to_llvm.h
@@ -62,16 +62,18 @@ struct ac_fs_variant_key {
uint32_t is_int10;
 };
 
-union ac_shader_variant_key {
-   struct ac_vs_variant_key vs;
-   struct ac_fs_variant_key fs;
-   struct ac_tes_variant_key tes;
-   struct ac_tcs_variant_key tcs;
+struct ac_shader_variant_key {
+   union {
+   struct ac_vs_variant_key vs;
+   struct ac_fs_variant_key fs;
+   struct ac_tes_variant_key tes;
+   struct ac_tcs_variant_key tcs;
+   };
 };
 
 struct ac_nir_compiler_options {
struct radv_pipeline_layout *layout;
-   union ac_shader_variant_key key;
+   struct ac_shader_variant_key key;
bool unsafe_math;
bool supports_spill;
enum radeon_family family;
diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c
index 5d94acc1519..96917814e56 100644
--- a/src/amd/vulkan/radv_pipeline.c
+++ b/src/amd/vulkan/radv_pipeline.c
@@ -441,7 +441,7 @@ static void radv_fill_shader_variant(struct radv_device 
*device,
 static struct radv_shader_variant *radv_shader_variant_create(struct 
radv_device *device,
  struct nir_shader 
*shader,
  struct 
radv_pipeline_layout *layout,
- const union 
ac_shader_variant_key *key,
+ const struct 
ac_shader_variant_key *key,
  void** code_out,
  unsigned 
*code_size_out,
  bool dump)
@@ -538,7 +538,7 @@ radv_pipeline_compile(struct radv_pipeline *pipeline,
  gl_shader_stage stage,
  const VkSpecializationInfo *spec_info,
  struct radv_pipeline_layout *layout,
- const union ac_shader_variant_key *key)
+ const struct ac_shader_variant_key *key)
 {
unsigned char sha1[20];
unsigned char gs_copy_sha1[20];
@@ -613,10 +613,10 @@ radv_pipeline_compile(struct radv_pipeline *pipeline,
return variant;
 }
 
-static union ac_shader_variant_key
+static struct ac_shader_variant_key
 radv_compute_tes_key(bool as_es, bool export_prim_id)
 {
-   union ac_shader_variant_key key;
+   struct ac_shader_variant_key key;
memset(&key, 0, sizeof(key));
key.tes.as_es = as_es;
/* export prim id only happens when no geom shader */
@@ -625,10 +625,10 @@ radv_compute_tes_key(bool as_es, bool export_prim_id)
return key;
 }
 
-static union ac_shader_variant_key
+static struct ac_shader_variant_key
 radv_compute_tcs_key(unsigned primitive_mode, unsigned input_vertices)
 {
-   union ac_shader_variant_key key;
+   struct ac_shader_variant_key key;
memset(&key, 0, sizeof(key));
key.tcs.primitive_mode = primitive_mode;
key.tcs.input_vertices = input_vertices;
@@ -652,8 +652,8 @@ radv_tess_pipeline_compile(struct radv_pipeline *pipeline,
nir_shader *tes_nir, *tcs_nir;
void *tes_code = NULL, *tcs_code = NULL;
unsigned tes_code_size = 0, tcs_code_size = 0;
-   union ac_shader_variant_key tes_key;
-   union ac_shader_variant_key tcs_key;
+   struct ac_shader_variant_key tes_key;
+   struct ac_shader_variant_key tcs_key;
bool dump = (pipeline->device->debug_flags & RADV_DEBUG_DUMP_SHADERS);
 
tes_key = radv_compute_tes_key(radv_pipeline_has_gs(pipeline),
@@ -1656,10 +1656,10 @@ radv_pipeline_init_dynamic_state(struct radv_pipeline 
*pipeline,
pipeline->dynamic_state_mask = states;
 }
 
-static union ac_shader_variant_key
+static struct ac_shader_variant_key
 radv_compute_vs_key(const VkGraphicsPipelineCreateInfo *pCreateInfo, bool 
as_es, bool as_ls, bool export_prim_id)
 {
-   union ac_shader_variant_key key;
+   struct ac_shader_variant_key key;
const VkPipelineVertexInputStateCreateInfo *input_state =
 pCreateInfo->pVertexInputState;
 
@@ -2068,7 +2068,7 @@ radv_pipeline_init(struct radv_pipeline *pipeline,
}
 
if (modules[MESA_SHADER_FRAGMENT]) {
-   union ac_shader_variant_key key = {0};
+   

[Mesa-dev] [PATCH 02/11] ac/nir: Determine if input attachments are used in the info pass.

2017-08-23 Thread Bas Nieuwenhuizen
---
 src/amd/common/ac_shader_info.c | 11 ++-
 src/amd/common/ac_shader_info.h |  1 +
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/src/amd/common/ac_shader_info.c b/src/amd/common/ac_shader_info.c
index 8668c4c3446..ca59965e2db 100644
--- a/src/amd/common/ac_shader_info.c
+++ b/src/amd/common/ac_shader_info.c
@@ -64,9 +64,18 @@ gather_intrinsic_info(nir_intrinsic_instr *instr, struct 
ac_shader_info *info)
case nir_intrinsic_image_atomic_xor:
case nir_intrinsic_image_atomic_exchange:
case nir_intrinsic_image_atomic_comp_swap:
-   case nir_intrinsic_image_size:
+   case nir_intrinsic_image_size: {
+   const struct glsl_type *type = instr->variables[0]->var->type;
+   if(instr->variables[0]->deref.child)
+   type = instr->variables[0]->deref.child->type;
+
+   enum glsl_sampler_dim dim = glsl_get_sampler_dim(type);
+   if (dim == GLSL_SAMPLER_DIM_SUBPASS ||
+   dim == GLSL_SAMPLER_DIM_SUBPASS_MS)
+   info->ps.uses_input_attachments = true;
mark_sampler_desc(instr->variables[0]->var, info);
break;
+   }
default:
break;
}
diff --git a/src/amd/common/ac_shader_info.h b/src/amd/common/ac_shader_info.h
index 965ad542a2a..886b5e84b57 100644
--- a/src/amd/common/ac_shader_info.h
+++ b/src/amd/common/ac_shader_info.h
@@ -38,6 +38,7 @@ struct ac_shader_info {
struct {
bool force_persample;
bool needs_sample_positions;
+   bool uses_input_attachments;
} ps;
struct {
uint8_t grid_components_used;
-- 
2.14.1



[Mesa-dev] [PATCH 07/11] radv: Add multiview clears.

2017-08-23 Thread Bas Nieuwenhuizen
---
 src/amd/vulkan/radv_cmd_buffer.c |  1 +
 src/amd/vulkan/radv_meta_clear.c | 65 
 src/amd/vulkan/radv_private.h|  1 +
 3 files changed, 48 insertions(+), 19 deletions(-)

diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index 94453094eb6..ed11a4aa35e 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -1867,6 +1867,7 @@ radv_cmd_state_setup_attachments(struct radv_cmd_buffer 
*cmd_buffer,
}
 
state->attachments[i].pending_clear_aspects = clear_aspects;
+   state->attachments[i].cleared_views = 0;
if (clear_aspects && info) {
assert(info->clearValueCount > i);
state->attachments[i].clear_value = 
info->pClearValues[i];
diff --git a/src/amd/vulkan/radv_meta_clear.c b/src/amd/vulkan/radv_meta_clear.c
index af76a517aaf..ea777d9979c 100644
--- a/src/amd/vulkan/radv_meta_clear.c
+++ b/src/amd/vulkan/radv_meta_clear.c
@@ -337,7 +337,8 @@ radv_device_finish_meta_clear_state(struct radv_device 
*device)
 static void
 emit_color_clear(struct radv_cmd_buffer *cmd_buffer,
  const VkClearAttachment *clear_att,
- const VkClearRect *clear_rect)
+ const VkClearRect *clear_rect,
+ uint32_t view_mask)
 {
struct radv_device *device = cmd_buffer->device;
const struct radv_subpass *subpass = cmd_buffer->state.subpass;
@@ -400,7 +401,14 @@ emit_color_clear(struct radv_cmd_buffer *cmd_buffer,
 
radv_CmdSetScissor(radv_cmd_buffer_to_handle(cmd_buffer), 0, 1, 
&clear_rect->rect);
 
-   radv_CmdDraw(cmd_buffer_h, 3, clear_rect->layerCount, 0, 
clear_rect->baseArrayLayer);
+   if (view_mask) {
+   for (unsigned i = 0; (1u << i) <= view_mask; ++i)
+   if ((1u << i) & view_mask) {
+   radv_CmdDraw(cmd_buffer_h, 3, 1, 0, i);
+   }
+   } else {
+   radv_CmdDraw(cmd_buffer_h, 3, clear_rect->layerCount, 0, 
clear_rect->baseArrayLayer);
+   }
 
radv_cmd_buffer_set_subpass(cmd_buffer, subpass, false);
 }
@@ -945,7 +953,8 @@ emit_fast_color_clear(struct radv_cmd_buffer *cmd_buffer,
  const VkClearAttachment *clear_att,
  const VkClearRect *clear_rect,
  enum radv_cmd_flush_bits *pre_flush,
- enum radv_cmd_flush_bits *post_flush)
+ enum radv_cmd_flush_bits *post_flush,
+  uint32_t view_mask)
 {
const struct radv_subpass *subpass = cmd_buffer->state.subpass;
const uint32_t subpass_att = clear_att->colorAttachment;
@@ -989,9 +998,12 @@ emit_fast_color_clear(struct radv_cmd_buffer *cmd_buffer,
clear_rect->rect.extent.height != iview->image->info.height)
goto fail;
 
-   if (clear_rect->baseArrayLayer != 0)
+   if (view_mask && (iview->image->info.array_size >= 32 ||
+(1u << iview->image->info.array_size) - 1u != 
view_mask))
goto fail;
-   if (clear_rect->layerCount != iview->image->info.array_size)
+   if (!view_mask && clear_rect->baseArrayLayer != 0)
+   goto fail;
+   if (!view_mask && clear_rect->layerCount != 
iview->image->info.array_size)
goto fail;
 
/* RB+ doesn't work with CMASK fast clear on Stoney. */
@@ -1060,13 +1072,13 @@ emit_clear(struct radv_cmd_buffer *cmd_buffer,
const VkClearAttachment *clear_att,
const VkClearRect *clear_rect,
enum radv_cmd_flush_bits *pre_flush,
-   enum radv_cmd_flush_bits *post_flush)
+   enum radv_cmd_flush_bits *post_flush,
+   uint32_t view_mask)
 {
if (clear_att->aspectMask & VK_IMAGE_ASPECT_COLOR_BIT) {
-
if (!emit_fast_color_clear(cmd_buffer, clear_att, clear_rect,
-  pre_flush, post_flush))
-   emit_color_clear(cmd_buffer, clear_att, clear_rect);
+  pre_flush, post_flush, view_mask))
+   emit_color_clear(cmd_buffer, clear_att, clear_rect, 
view_mask);
} else {
assert(clear_att->aspectMask & (VK_IMAGE_ASPECT_DEPTH_BIT |
VK_IMAGE_ASPECT_STENCIL_BIT));
@@ -1084,17 +1096,20 @@ subpass_needs_clear(const struct radv_cmd_buffer 
*cmd_buffer)
 
if (!cmd_state->subpass)
return false;
+   uint32_t view_mask = cmd_state->subpass->view_mask;
ds = cmd_state->subpass->depth_stencil_attachment.attachment;
for (uint32_t i = 0; i < cmd_state->subpass->color_count; ++i) {
uint32_t a = 
cmd_state->subpass->color_attachments[i].attachment;
if (a != VK_ATTACHMENT_UNUSED &&
-   

[Mesa-dev] [PATCH 01/11] ac/nir: Cast sources of integer ops to int.

2017-08-23 Thread Bas Nieuwenhuizen
The int32->float semantic conversion got dropped in a testcase,
because the src was already float. On closer inspection I decided
to add a few more casts for integer op operands to be safe too.
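
To spell out the failure mode outside of LLVM (a plain C sketch under
the assumption that a 32-bit register holding integer bits happens to
be typed as float; the helper names are made up for illustration): the
bits have to be reinterpreted as an integer before the value-converting
int->float step, otherwise the conversion is a no-op.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret the 32 bits of a float as a signed integer (a bitcast). */
static int32_t float_bits_as_int(float f)
{
	int32_t i;
	memcpy(&i, &f, sizeof(i));
	return i;
}

/* Store integer bits in a float-typed register without converting. */
static float int_bits_as_float(int32_t i)
{
	float f;
	memcpy(&f, &i, sizeof(f));
	return f;
}

/* i2f32 on a float-typed register: bitcast to integer first, then do
 * the semantic (value-preserving) int->float conversion. */
static float i2f32_from_untyped(float reg)
{
	return (float)float_bits_as_int(reg);
}
```

Without the bitcast, a register holding the integer bit pattern 3
would be passed through as the tiny denormal float with the same bits
instead of becoming 3.0f.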

Cc: 17.2 
---
 src/amd/common/ac_nir_to_llvm.c | 16 
 1 file changed, 16 insertions(+)

diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index a17a232bbed..f0120a984f8 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -1806,10 +1806,12 @@ static void visit_alu(struct ac_nir_context *ctx, const 
nir_alu_instr *instr)
break;
case nir_op_i2f32:
case nir_op_i2f64:
+   src[0] = to_integer(&ctx->ac, src[0]);
result = LLVMBuildSIToFP(ctx->ac.builder, src[0], 
to_float_type(&ctx->ac, def_type), "");
break;
case nir_op_u2f32:
case nir_op_u2f64:
+   src[0] = to_integer(&ctx->ac, src[0]);
result = LLVMBuildUIToFP(ctx->ac.builder, src[0], 
to_float_type(&ctx->ac, def_type), "");
break;
case nir_op_f2f64:
@@ -1820,6 +1822,7 @@ static void visit_alu(struct ac_nir_context *ctx, const 
nir_alu_instr *instr)
break;
case nir_op_u2u32:
case nir_op_u2u64:
+   src[0] = to_integer(&ctx->ac, src[0]);
if (get_elem_bits(&ctx->ac, LLVMTypeOf(src[0])) < 
get_elem_bits(&ctx->ac, def_type))
result = LLVMBuildZExt(ctx->ac.builder, src[0], 
def_type, "");
else
@@ -1827,6 +1830,7 @@ static void visit_alu(struct ac_nir_context *ctx, const 
nir_alu_instr *instr)
break;
case nir_op_i2i32:
case nir_op_i2i64:
+   src[0] = to_integer(&ctx->ac, src[0]);
if (get_elem_bits(&ctx->ac, LLVMTypeOf(src[0])) < 
get_elem_bits(&ctx->ac, def_type))
result = LLVMBuildSExt(ctx->ac.builder, src[0], 
def_type, "");
else
@@ -1836,18 +1840,25 @@ static void visit_alu(struct ac_nir_context *ctx, const 
nir_alu_instr *instr)
result = emit_bcsel(&ctx->ac, src[0], src[1], src[2]);
break;
case nir_op_find_lsb:
+   src[0] = to_integer(&ctx->ac, src[0]);
result = emit_find_lsb(&ctx->ac, src[0]);
break;
case nir_op_ufind_msb:
+   src[0] = to_integer(&ctx->ac, src[0]);
result = emit_ufind_msb(&ctx->ac, src[0]);
break;
case nir_op_ifind_msb:
+   src[0] = to_integer(&ctx->ac, src[0]);
result = emit_ifind_msb(&ctx->ac, src[0]);
break;
case nir_op_uadd_carry:
+   src[0] = to_integer(&ctx->ac, src[0]);
+   src[1] = to_integer(&ctx->ac, src[1]);
result = emit_uint_carry(&ctx->ac, 
"llvm.uadd.with.overflow.i32", src[0], src[1]);
break;
case nir_op_usub_borrow:
+   src[0] = to_integer(&ctx->ac, src[0]);
+   src[1] = to_integer(&ctx->ac, src[1]);
result = emit_uint_carry(&ctx->ac, 
"llvm.usub.with.overflow.i32", src[0], src[1]);
break;
case nir_op_b2f:
@@ -1860,15 +1871,20 @@ static void visit_alu(struct ac_nir_context *ctx, const 
nir_alu_instr *instr)
result = emit_b2i(&ctx->ac, src[0]);
break;
case nir_op_i2b:
+   src[0] = to_integer(&ctx->ac, src[0]);
result = emit_i2b(&ctx->ac, src[0]);
break;
case nir_op_fquantize2f16:
result = emit_f2f16(ctx->nctx, src[0]);
break;
case nir_op_umul_high:
+   src[0] = to_integer(&ctx->ac, src[0]);
+   src[1] = to_integer(&ctx->ac, src[1]);
result = emit_umul_high(&ctx->ac, src[0], src[1]);
break;
case nir_op_imul_high:
+   src[0] = to_integer(&ctx->ac, src[0]);
+   src[1] = to_integer(&ctx->ac, src[1]);
result = emit_imul_high(&ctx->ac, src[0], src[1]);
break;
case nir_op_pack_half_2x16:
-- 
2.14.1



[Mesa-dev] [PATCH 03/11] ac/nir: Implement input attachments with layered rendering.

2017-08-23 Thread Bas Nieuwenhuizen
---
 src/amd/common/ac_nir_to_llvm.c | 13 +++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index f0120a984f8..90406e88dfb 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -3345,6 +3345,7 @@ static LLVMValueRef get_image_coords(struct 
ac_nir_context *ctx,
LLVMBuildAdd(ctx->ac.builder, 
fmask_load_address[chan],

LLVMBuildFPToUI(ctx->ac.builder, ctx->abi->frag_pos[chan],
ctx->ac.i32, 
""), "");
+   fmask_load_address[2] = to_integer(&ctx->ac, 
ctx->abi->inputs[radeon_llvm_reg_index_soa(VARYING_SLOT_LAYER, 0)]);
}
sample_index = adjust_sample_index_using_fmask(&ctx->ac,
   
fmask_load_address[0],
@@ -3367,9 +3368,11 @@ static LLVMValueRef get_image_coords(struct 
ac_nir_context *ctx,
}
 
if (add_frag_pos) {
-   for (chan = 0; chan < count; ++chan)
+   for (chan = 0; chan < 2; ++chan)
coords[chan] = LLVMBuildAdd(ctx->ac.builder, 
coords[chan], LLVMBuildFPToUI(ctx->ac.builder, ctx->abi->frag_pos[chan],
ctx->ac.i32, ""), "");
+   coords[2] = to_integer(&ctx->ac, 
ctx->abi->inputs[radeon_llvm_reg_index_soa(VARYING_SLOT_LAYER, 0)]);
+   count++;
}
if (is_ms) {
coords[count] = sample_index;
@@ -3414,7 +3417,9 @@ static LLVMValueRef visit_image_load(struct 
ac_nir_context *ctx,
res = to_integer(&ctx->ac, res);
} else {
bool is_da = glsl_sampler_type_is_array(type) ||
-glsl_get_sampler_dim(type) == 
GLSL_SAMPLER_DIM_CUBE;
+glsl_get_sampler_dim(type) == 
GLSL_SAMPLER_DIM_CUBE ||
+glsl_get_sampler_dim(type) == 
GLSL_SAMPLER_DIM_SUBPASS ||
+glsl_get_sampler_dim(type) == 
GLSL_SAMPLER_DIM_SUBPASS_MS;
LLVMValueRef da = is_da ? i1true : i1false;
LLVMValueRef glc = i1false;
LLVMValueRef slc = i1false;
@@ -5040,6 +5045,10 @@ handle_fs_inputs_pre(struct nir_to_llvm_context *ctx,
 struct nir_shader *nir)
 {
unsigned index = 0;
+
+   if (ctx->shader_info->info.ps.uses_input_attachments)
+   ctx->input_mask |= 1ull << VARYING_SLOT_LAYER;
+
for (unsigned i = 0; i < RADEON_LLVM_MAX_INPUTS; ++i) {
LLVMValueRef interp_param;
LLVMValueRef *inputs = ctx->inputs 
+radeon_llvm_reg_index_soa(i, 0);
-- 
2.14.1



Re: [Mesa-dev] [PATCH v2 1/2] gallivm: correct channel shift logic on big endian

2017-08-23 Thread Ray Strode
Hi,

Just a quick note...

> From: Ray Strode 
>
> lp_build_fetch_rgba_soa fetches a texel from a texture.
> Part of that process involves first gathering the element
> together from memory into a packed format, and then breaking
> out the individual color channels into separate, parallel
> arrays.
>
> The code fails to account for endianness when reading the packed
> values.
>
> This commit attempts to correct the problem by reversing the order
> the packed values are read on big endian systems.
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100613
> Cc: "17.2" "17.1" 
If consensus is this patch is good enough for now, then it's, of course,
got my

Signed-off-by: Ray Strode 

--Ray
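
For readers following along, the idea in the quoted description can be
sketched in plain C. This is only an illustration and not the gallivm
code from the patch: the helper names are hypothetical, and it assumes
that on big endian the channels of a packed element simply appear in
the opposite order within the loaded word.

```c
#include <stdint.h>

/* Hypothetical helper: bit offset of a channel inside a packed texel.
 * le_shift is the channel's offset on little endian. On big endian the
 * channels are assumed to be laid out in the reverse order, so the
 * offset is mirrored within the block. */
static unsigned channel_shift(unsigned block_bits, unsigned le_shift,
                              unsigned width, int big_endian)
{
	return big_endian ? block_bits - le_shift - width : le_shift;
}

/* Pull one channel of the given width out of a packed 32-bit value. */
static uint32_t extract_channel(uint32_t packed, unsigned shift,
                                unsigned width)
{
	uint32_t mask = width >= 32 ? 0xffffffffu : (1u << width) - 1u;
	return (packed >> shift) & mask;
}
```

With four 8-bit channels, the first channel sits at bit 0 on little
endian and at bit 24 under the big endian assumption above.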


[Mesa-dev] [PATCH 3/4] radeonsi: clean up setting GRBM_GFX_INDEX

2017-08-23 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_state.c | 41 ++---
 1 file changed, 22 insertions(+), 19 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_state.c 
b/src/gallium/drivers/radeonsi/si_state.c
index ab0bb57..4772df2 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -4267,20 +4267,39 @@ static void si_apply_opaque_metadata(struct 
r600_common_screen *rscreen,
rtex->dcc_offset = 0;
 }
 
 void si_init_screen_state_functions(struct si_screen *sscreen)
 {
sscreen->b.b.is_format_supported = si_is_format_supported;
sscreen->b.query_opaque_metadata = si_query_opaque_metadata;
sscreen->b.apply_opaque_metadata = si_apply_opaque_metadata;
 }
 
+static void si_set_grbm_gfx_index(struct si_context *sctx,
+ struct si_pm4_state *pm4,  unsigned value)
+{
+   unsigned reg = sctx->b.chip_class >= CIK ? R_030800_GRBM_GFX_INDEX :
+  GRBM_GFX_INDEX;
+   si_pm4_set_reg(pm4, reg, value);
+}
+
+static void si_set_grbm_gfx_index_se(struct si_context *sctx,
+struct si_pm4_state *pm4, unsigned se)
+{
+   assert(se == ~0 || se < sctx->screen->b.info.max_se);
+   si_set_grbm_gfx_index(sctx, pm4,
+ (se == ~0 ? S_030800_SE_BROADCAST_WRITES(1) :
+ S_030800_SE_INDEX(se)) |
+ S_030800_SH_BROADCAST_WRITES(1) |
+ S_030800_INSTANCE_BROADCAST_WRITES(1));
+}
+
 static void
 si_write_harvested_raster_configs(struct si_context *sctx,
  struct si_pm4_state *pm4,
  unsigned raster_config,
  unsigned raster_config_1)
 {
unsigned sh_per_se = MAX2(sctx->screen->b.info.max_sh_per_se, 1);
unsigned num_se = MAX2(sctx->screen->b.info.max_se, 1);
unsigned rb_mask = sctx->screen->b.info.enabled_rb_mask;
unsigned num_rb = MIN2(sctx->screen->b.info.num_render_backends, 16);
@@ -4369,42 +4388,26 @@ si_write_harvested_raster_configs(struct si_context 
*sctx,
raster_config_se |=

S_028350_RB_MAP_PKR1(V_028350_RASTER_CONFIG_RB_MAP_3);
} else {
raster_config_se |=

S_028350_RB_MAP_PKR1(V_028350_RASTER_CONFIG_RB_MAP_0);
}
}
}
}
 
-   /* GRBM_GFX_INDEX has a different offset on SI and CI+ */
-   if (sctx->b.chip_class < CIK)
-   si_pm4_set_reg(pm4, GRBM_GFX_INDEX,
-  SE_INDEX(se) | SH_BROADCAST_WRITES |
-  INSTANCE_BROADCAST_WRITES);
-   else
-   si_pm4_set_reg(pm4, R_030800_GRBM_GFX_INDEX,
-  S_030800_SE_INDEX(se) | 
S_030800_SH_BROADCAST_WRITES(1) |
-  S_030800_INSTANCE_BROADCAST_WRITES(1));
+   si_set_grbm_gfx_index_se(sctx, pm4, se);
si_pm4_set_reg(pm4, R_028350_PA_SC_RASTER_CONFIG, 
raster_config_se);
}
+   si_set_grbm_gfx_index(sctx, pm4, ~0);
 
-   /* GRBM_GFX_INDEX has a different offset on SI and CI+ */
-   if (sctx->b.chip_class < CIK)
-   si_pm4_set_reg(pm4, GRBM_GFX_INDEX,
-  SE_BROADCAST_WRITES | SH_BROADCAST_WRITES |
-  INSTANCE_BROADCAST_WRITES);
-   else {
-   si_pm4_set_reg(pm4, R_030800_GRBM_GFX_INDEX,
-  S_030800_SE_BROADCAST_WRITES(1) | 
S_030800_SH_BROADCAST_WRITES(1) |
-  S_030800_INSTANCE_BROADCAST_WRITES(1));
-
+   if (sctx->b.chip_class >= CIK) {
if ((num_se > 2) && ((!se_mask[0] && !se_mask[1]) ||
 (!se_mask[2] && !se_mask[3]))) {
raster_config_1 &= C_028354_SE_PAIR_MAP;
 
if (!se_mask[0] && !se_mask[1]) {
raster_config_1 |=

S_028354_SE_PAIR_MAP(V_028354_RASTER_CONFIG_SE_PAIR_MAP_3);
} else {
raster_config_1 |=

S_028354_SE_PAIR_MAP(V_028354_RASTER_CONFIG_SE_PAIR_MAP_0);
-- 
2.7.4



[Mesa-dev] [PATCH 2/4] radeonsi: move PA_SC_RASTER_CONFIG emission into a separate function

2017-08-23 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_state.c | 143 
 1 file changed, 73 insertions(+), 70 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_state.c 
b/src/gallium/drivers/radeonsi/si_state.c
index 532388f..ab0bb57 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -4408,77 +4408,28 @@ si_write_harvested_raster_configs(struct si_context 
*sctx,
} else {
raster_config_1 |=

S_028354_SE_PAIR_MAP(V_028354_RASTER_CONFIG_SE_PAIR_MAP_0);
}
}
 
si_pm4_set_reg(pm4, R_028354_PA_SC_RASTER_CONFIG_1, 
raster_config_1);
}
 }
 
-static void si_init_config(struct si_context *sctx)
+static void si_set_raster_config(struct si_context *sctx, struct si_pm4_state 
*pm4)
 {
struct si_screen *sscreen = sctx->screen;
unsigned num_rb = MIN2(sctx->screen->b.info.num_render_backends, 16);
unsigned rb_mask = sctx->screen->b.info.enabled_rb_mask;
unsigned raster_config, raster_config_1;
-   uint64_t border_color_va = sctx->border_color_buffer->gpu_address;
-   bool has_clear_state = sscreen->has_clear_state;
-   struct si_pm4_state *pm4 = CALLOC_STRUCT(si_pm4_state);
-
-   /* Only SI can disable CLEAR_STATE for now. */
-   assert(has_clear_state || sscreen->b.chip_class == SI);
-
-   if (!pm4)
-   return;
-
-   si_pm4_cmd_begin(pm4, PKT3_CONTEXT_CONTROL);
-   si_pm4_cmd_add(pm4, CONTEXT_CONTROL_LOAD_ENABLE(1));
-   si_pm4_cmd_add(pm4, CONTEXT_CONTROL_SHADOW_ENABLE(1));
-   si_pm4_cmd_end(pm4, false);
 
-   if (has_clear_state) {
-   si_pm4_cmd_begin(pm4, PKT3_CLEAR_STATE);
-   si_pm4_cmd_add(pm4, 0);
-   si_pm4_cmd_end(pm4, false);
-   }
-
-   si_pm4_set_reg(pm4, R_028A18_VGT_HOS_MAX_TESS_LEVEL, fui(64));
-   if (!has_clear_state)
-   si_pm4_set_reg(pm4, R_028A1C_VGT_HOS_MIN_TESS_LEVEL, fui(0));
-
-   /* FIXME calculate these values somehow ??? */
-   if (sctx->b.chip_class <= VI) {
-   si_pm4_set_reg(pm4, R_028A54_VGT_GS_PER_ES, SI_GS_PER_ES);
-   si_pm4_set_reg(pm4, R_028A58_VGT_ES_PER_GS, 0x40);
-   }
-
-   if (!has_clear_state) {
-   si_pm4_set_reg(pm4, R_028A5C_VGT_GS_PER_VS, 0x2);
-   si_pm4_set_reg(pm4, R_028A8C_VGT_PRIMITIVEID_RESET, 0x0);
-   si_pm4_set_reg(pm4, R_028B98_VGT_STRMOUT_BUFFER_CONFIG, 0x0);
-   }
-
-   si_pm4_set_reg(pm4, R_028AA0_VGT_INSTANCE_STEP_RATE_0, 1);
-   if (!has_clear_state)
-   si_pm4_set_reg(pm4, R_028AB8_VGT_VTX_CNT_EN, 0x0);
-   if (sctx->b.chip_class < CIK)
-   si_pm4_set_reg(pm4, R_008A14_PA_CL_ENHANCE, 
S_008A14_NUM_CLIP_SEQ(3) |
-  S_008A14_CLIP_VTX_REORDER_ENA(1));
-
-   si_pm4_set_reg(pm4, R_028BD4_PA_SC_CENTROID_PRIORITY_0, 0x76543210);
-   si_pm4_set_reg(pm4, R_028BD8_PA_SC_CENTROID_PRIORITY_1, 0xfedcba98);
-
-   if (!has_clear_state)
-   si_pm4_set_reg(pm4, R_02882C_PA_SU_PRIM_FILTER_CNTL, 0);
-
-   switch (sctx->screen->b.family) {
+   switch (sctx->b.family) {
case CHIP_TAHITI:
case CHIP_PITCAIRN:
raster_config = 0x2a00126a;
raster_config_1 = 0x;
break;
case CHIP_VERDE:
raster_config = 0x124a;
raster_config_1 = 0x;
break;
case CHIP_OLAND:
@@ -4536,44 +4487,96 @@ static void si_init_config(struct si_context *sctx)
raster_config = 0x; /* 0x0002 */
raster_config_1 = 0x;
break;
case CHIP_KABINI:
case CHIP_MULLINS:
case CHIP_STONEY:
raster_config = 0x;
raster_config_1 = 0x;
break;
default:
-   if (sctx->b.chip_class <= VI) {
-   fprintf(stderr,
-   "radeonsi: Unknown GPU, using 0 for 
raster_config\n");
-   raster_config = 0x;
-   raster_config_1 = 0x;
-   }
-   break;
+   fprintf(stderr,
+   "radeonsi: Unknown GPU, using 0 for raster_config\n");
+   raster_config = 0x;
+   raster_config_1 = 0x;
+   }
+
+   if (!rb_mask || util_bitcount(rb_mask) >= num_rb) {
+   /* Always use the default config when all backends are enabled
+* (or when we failed to determine the enabled backends).
+*/
+   si_pm4_set_reg(pm4, R_028350_PA_SC_RASTER_CONFIG,
+  raster_config);
+   if 

[Mesa-dev] [PATCH 4/4] radeonsi: get the raster config from AMDGPU on SI

2017-08-23 Thread Marek Olšák
From: Marek Olšák 

Not sure yet if we wanna do this on CIK and VI too.
---
 src/amd/common/ac_gpu_info.c|  3 +++
 src/amd/common/ac_gpu_info.h|  2 ++
 src/gallium/drivers/radeonsi/si_state.c | 17 +
 3 files changed, 22 insertions(+)

diff --git a/src/amd/common/ac_gpu_info.c b/src/amd/common/ac_gpu_info.c
index e55d864..84a54bb 100644
--- a/src/amd/common/ac_gpu_info.c
+++ b/src/amd/common/ac_gpu_info.c
@@ -294,20 +294,23 @@ bool ac_query_gpu_info(int fd, amdgpu_device_handle dev,
/* Get the number of good compute units. */
info->num_good_compute_units = 0;
for (i = 0; i < info->max_se; i++)
for (j = 0; j < info->max_sh_per_se; j++)
info->num_good_compute_units +=
util_bitcount(amdinfo->cu_bitmap[i][j]);
 
memcpy(info->si_tile_mode_array, amdinfo->gb_tile_mode,
sizeof(amdinfo->gb_tile_mode));
info->enabled_rb_mask = amdinfo->enabled_rb_pipes_mask;
+   memcpy(info->pa_sc_raster_config, amdinfo->pa_sc_raster_cfg,
+  sizeof(info->pa_sc_raster_config));
+   info->pa_sc_raster_config_1 = amdinfo->pa_sc_raster_cfg1[0];
 
memcpy(info->cik_macrotile_mode_array, amdinfo->gb_macro_tile_mode,
sizeof(amdinfo->gb_macro_tile_mode));
 
info->pte_fragment_size = alignment_info.size_local;
info->gart_page_size = alignment_info.size_remote;
 
if (info->chip_class == SI)
info->gfx_ib_pad_with_type2 = TRUE;
 
diff --git a/src/amd/common/ac_gpu_info.h b/src/amd/common/ac_gpu_info.h
index 06b0c77..91d303a 100644
--- a/src/amd/common/ac_gpu_info.h
+++ b/src/amd/common/ac_gpu_info.h
@@ -89,20 +89,22 @@ struct radeon_info {
/* Render backends (color + depth blocks). */
uint32_tr300_num_gb_pipes;
uint32_tr300_num_z_pipes;
uint32_tr600_gb_backend_map; /* R600 harvest config 
*/
boolr600_gb_backend_map_valid;
uint32_tr600_num_banks;
uint32_tnum_render_backends;
uint32_tnum_tile_pipes; /* pipe count from 
PIPE_CONFIG */
uint32_tpipe_interleave_bytes;
uint32_tenabled_rb_mask; /* GCN harvest config */
+   uint32_tpa_sc_raster_config[4]; /* per SE */
+   uint32_tpa_sc_raster_config_1;
 
uint64_tmax_alignment; /* from addrlib */
/* Tile modes. */
uint32_tsi_tile_mode_array[32];
uint32_tcik_macrotile_mode_array[16];
 };
 
 bool ac_query_gpu_info(int fd, amdgpu_device_handle dev,
   struct radeon_info *info,
   struct amdgpu_gpu_info *amdinfo);
diff --git a/src/gallium/drivers/radeonsi/si_state.c 
b/src/gallium/drivers/radeonsi/si_state.c
index 4772df2..24e509c 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -4414,20 +4414,37 @@ si_write_harvested_raster_configs(struct si_context 
*sctx,
}
}
 
si_pm4_set_reg(pm4, R_028354_PA_SC_RASTER_CONFIG_1, 
raster_config_1);
}
 }
 
 static void si_set_raster_config(struct si_context *sctx, struct si_pm4_state 
*pm4)
 {
struct si_screen *sscreen = sctx->screen;
+
+   /* On SI, set the raster config value from AMDGPU. */
+   if (sscreen->b.info.drm_major == 3 && sscreen->b.chip_class == SI) {
+   if (sscreen->b.info.max_se == 1) {
+   si_pm4_set_reg(pm4, R_028350_PA_SC_RASTER_CONFIG,
+  sscreen->b.info.pa_sc_raster_config[0]);
+   } else {
+   for (unsigned se = 0; se < sscreen->b.info.max_se; 
se++) {
+   si_set_grbm_gfx_index_se(sctx, pm4, se);
+   si_pm4_set_reg(pm4, 
R_028350_PA_SC_RASTER_CONFIG,
+  
sscreen->b.info.pa_sc_raster_config[se]);
+   }
+   si_set_grbm_gfx_index_se(sctx, pm4, ~0);
+   }
+   return;
+   }
+
unsigned num_rb = MIN2(sctx->screen->b.info.num_render_backends, 16);
unsigned rb_mask = sctx->screen->b.info.enabled_rb_mask;
unsigned raster_config, raster_config_1;
 
switch (sctx->b.family) {
case CHIP_TAHITI:
case CHIP_PITCAIRN:
raster_config = 0x2a00126a;
raster_config_1 = 0x;
break;
-- 
2.7.4



[Mesa-dev] [PATCH 1/4] radeonsi: correct maximum wave count per SIMD

2017-08-23 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_shader.c | 17 -
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index f02fc9e..186a3dd 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -5029,21 +5029,36 @@ static void si_shader_dump_stats(struct si_screen 
*sscreen,
 struct pipe_debug_callback *debug,
 unsigned processor,
 FILE *file,
 bool check_debug_option)
 {
const struct si_shader_config *conf = &shader->config;
unsigned num_inputs = shader->selector ? 
shader->selector->info.num_inputs : 0;
unsigned code_size = si_get_shader_binary_size(shader);
unsigned lds_increment = sscreen->b.chip_class >= CIK ? 512 : 256;
unsigned lds_per_wave = 0;
-   unsigned max_simd_waves = 10;
+   unsigned max_simd_waves;
+
+   switch (sscreen->b.family) {
+   /* SGPR initialization bug workaround on Tonga and Iceland reduces
+* the wave count to 8. */
+   case CHIP_TONGA:
+   case CHIP_ICELAND:
+   /* These always have 8 waves: */
+   case CHIP_POLARIS10:
+   case CHIP_POLARIS11:
+   case CHIP_POLARIS12:
+   max_simd_waves = 8;
+   break;
+   default:
+   max_simd_waves = 10;
+   }
 
/* Compute LDS usage for PS. */
switch (processor) {
case PIPE_SHADER_FRAGMENT:
/* The minimum usage per wave is (num_inputs * 48). The maximum
 * usage is (num_inputs * 48 * 16).
 * We can get anything in between and it varies between waves.
 *
 * The 48 bytes per input for a single primitive is equal to
 * 4 bytes/component * 4 components/input * 3 points.
-- 
2.7.4



Re: [Mesa-dev] [PATCH 7/8] gallium: remove TGSI opcode SCS

2017-08-23 Thread Roland Scheidegger
On 22.08.2017 at 16:38, Roland Scheidegger wrote:
> On 22.08.2017 at 15:17, Jose Fonseca wrote:
>> On 22/08/17 12:28, Marek Olšák wrote:
>>> On Tue, Aug 22, 2017 at 1:10 PM, Jose Fonseca 
>>> wrote:
 On 20/08/17 01:49, Marek Olšák wrote:
>
> From: Marek Olšák 
>
> use COS+SIN instead.


 I don't know if any existing gallium driver leverages that, but it's
 a basic
 trigonometric principle that one can easily extract the sin from cos or
 vice-versa.  It requires some extra care for getting the right sign. 
 But
 the fact is that it should be considerably cheaper to comput both
 simultaneously than indepdently.

 Unfortunately GLSL/SPIR-V don't allow expressing that.  D3D9/D3D11 and
 Metal all do.  And from what I've seen from D3D9/D3D11 apps, 99% of the
 time the shader wants both SIN/COS at the same time.

 If we want one opcode to rule them all, then a combined SIN+COS seems a
 better choice IMO.  On SM4 the sincos has two outputs:
 https://msdn.microsoft.com/en-us/library/windows/desktop/hh447234.aspx
 but they are both optional to use.  I don't know if there's a precedent
 for that.  I recall we had similar discussions about UMUL/UMUL_HI, and I
 suspect we chose not to go that route.


 Don't GPUs allow expressing the computation of both sin/cos with a
 single opcode?  If nothing else there would be a non-negligible impact
 of leveraging this in llvmpipe at some point.  On the other hand, it is
 possible that LLVM's common-subexpression elimination passes already do
 that, so we gain nothing.


 In short, not a big deal either way, but I think it's worth giving it a
 second thought here.
>>>
>>> R300 doesn't have trigonometric functions. R500, R600 and later VLIW
>>> chips, and GCN all only have separate sin and cos.
>>>
>>> svga is the only driver that has sincos. No gallium hardware driver
>>> has that.
>>
>> I see.  Fair enough.  Considering that, plus the fact that GLSL doesn't
>> have combined sin/cos, and that LLVM will most likely eliminate common
>> expressions generated by llvmpipe for cos/sin with same arg, there's
>> really not a significant upside left to justify keeping this around.
>>
>> FWIW the patch is
>>
>> Acked-by: Jose Fonseca 
>>
>> Jose
>>
> 
> FWIW old i965 chips had a SINCOS instruction, in addition to SIN and COS
> (those using the shared mathbox).
> However, even there the docs state for throughput "The two-output-phase
> SINCOS function is implemented as back to back SIN and COS functions."
> So it looks like the execution isn't actually faster there either,
> albeit it would be faster due to only having to issue one send message.
> (But don't ask me how the chips actually implement this...)
> 

I forgot about this, but one reason for SCS was also that the
specification said the argument had to be in the [-pi, pi] range (at
least the GL spec said this, not really sure about d3d9). But if hw can't
make use of that in any case (and we didn't for llvmpipe either), there's
no point.

Roland



Re: [Mesa-dev] [PATCH] llvmpipe: lp_build_gather_elem_vec BE fix for 3x16 load

2017-08-23 Thread Ben Crocker
Sorry, please ignore this patch in favor of the sequence emailed
later today (23 Aug 2017).

-- Ben


- Original Message -
From: "Ben Crocker" 
To: mesa-dev@lists.freedesktop.org
Cc: "17.2 17.1" , "Dave Airlie" 
, "Ben Crocker" 
Sent: Wednesday, August 23, 2017 1:38:21 PM
Subject: [Mesa-dev] [PATCH] llvmpipe: lp_build_gather_elem_vec BE fix for   
3x16 load

Fix loading of a 3x16 vector as a single 48-bit load
on big-endian systems (PPC64, S390).

Roland Scheidegger's commit e827d9175675aaa6cfc0b981e2a80685fb7b3a74
plus Ray Strode's patch reduce pre-Roland Piglit failures from ~4000 to
~2000.  This patch fixes three of the four regressions observed by Ray:

- draw-vertices
- draw-vertices-half-float
- draw-vertices-half-float_gles2

One regression remains:
- draw-vertices-2101010

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100613
Cc: "17.2" "17.1" 

Signed-off-by: Ben Crocker 
---
 src/gallium/auxiliary/gallivm/lp_bld_gather.c | 30 +--
 1 file changed, 28 insertions(+), 2 deletions(-)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_gather.c b/src/gallium/auxiliary/gallivm/lp_bld_gather.c
index ccd0376..7d11dcd 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_gather.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_gather.c
@@ -234,13 +234,39 @@ lp_build_gather_elem_vec(struct gallivm_state *gallivm,
   */
  res = LLVMBuildZExt(gallivm->builder, res, dst_elem_type, "");
 
- if (vector_justify) {
 #ifdef PIPE_ARCH_BIG_ENDIAN
+ if (vector_justify) {
  res = LLVMBuildShl(gallivm->builder, res,
 LLVMConstInt(dst_elem_type,
  dst_type.width - src_width, 0), "");
-#endif
  }
+ if (src_width == 48) {
+/* Load 3x16 bit vector.
+ * The sequence of loads on big-endian hardware proceeds as follows.
+ * 16-bit fields are denoted by X, Y, Z, and 0.  In memory, the sequence
+ * of three fields appears in the order X, Y, Z.
+ *
+ * Load 32-bit word: 0.0.X.Y
+ * Load 16-bit halfword: 0.0.0.Z
+ * Rotate left: 0.X.Y.0
+ * Bitwise OR: 0.X.Y.Z
+ *
+ * The order in which we need the fields in the result is 0.Z.Y.X,
+ * the same as on little-endian; permute 16-bit fields accordingly
+ * within 64-bit register:
+ */
+LLVMValueRef shuffles[4] = {
+   lp_build_const_int32(gallivm, 2),
+   lp_build_const_int32(gallivm, 1),
+   lp_build_const_int32(gallivm, 0),
+   lp_build_const_int32(gallivm, 3),
+};
+res = LLVMBuildBitCast(gallivm->builder, res,
+   lp_build_vec_type(gallivm, 
lp_type_uint_vec(16, 4*16)), "");
+res = LLVMBuildShuffleVector(gallivm->builder, res, res, 
LLVMConstVector(shuffles, 4), "");
+res = LLVMBuildBitCast(gallivm->builder, res, dst_elem_type, "");
+ }
+#endif
   }
}
return res;
-- 
2.7.4



[Mesa-dev] [PATCH v2 2/2] llvmpipe: lp_build_gather_elem_vec BE fix for 3x16 load

2017-08-23 Thread Ben Crocker
Fix loading of a 3x16 vector as a single 48-bit load
on big-endian systems (PPC64, S390).

Roland Scheidegger's commit e827d9175675aaa6cfc0b981e2a80685fb7b3a74
plus Ray Strode's patch reduce pre-Roland Piglit failures from ~4000 to
~2000.  This patch fixes three of the four regressions observed by Ray:

- draw-vertices
- draw-vertices-half-float
- draw-vertices-half-float_gles2

One regression remains:
- draw-vertices-2101010

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100613
Cc: "17.2" "17.1" 


Signed-off-by: Ben Crocker 
---
 src/gallium/auxiliary/gallivm/lp_bld_gather.c | 30 +--
 1 file changed, 28 insertions(+), 2 deletions(-)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_gather.c b/src/gallium/auxiliary/gallivm/lp_bld_gather.c
index ccd0376..7d11dcd 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_gather.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_gather.c
@@ -234,13 +234,39 @@ lp_build_gather_elem_vec(struct gallivm_state *gallivm,
   */
  res = LLVMBuildZExt(gallivm->builder, res, dst_elem_type, "");
 
- if (vector_justify) {
 #ifdef PIPE_ARCH_BIG_ENDIAN
+ if (vector_justify) {
  res = LLVMBuildShl(gallivm->builder, res,
 LLVMConstInt(dst_elem_type,
  dst_type.width - src_width, 0), "");
-#endif
  }
+ if (src_width == 48) {
+/* Load 3x16 bit vector.
+ * The sequence of loads on big-endian hardware proceeds as follows.
+ * 16-bit fields are denoted by X, Y, Z, and 0.  In memory, the sequence
+ * of three fields appears in the order X, Y, Z.
+ *
+ * Load 32-bit word: 0.0.X.Y
+ * Load 16-bit halfword: 0.0.0.Z
+ * Rotate left: 0.X.Y.0
+ * Bitwise OR: 0.X.Y.Z
+ *
+ * The order in which we need the fields in the result is 0.Z.Y.X,
+ * the same as on little-endian; permute 16-bit fields accordingly
+ * within 64-bit register:
+ */
+LLVMValueRef shuffles[4] = {
+   lp_build_const_int32(gallivm, 2),
+   lp_build_const_int32(gallivm, 1),
+   lp_build_const_int32(gallivm, 0),
+   lp_build_const_int32(gallivm, 3),
+};
+res = LLVMBuildBitCast(gallivm->builder, res,
+   lp_build_vec_type(gallivm, lp_type_uint_vec(16, 4*16)), "");
+res = LLVMBuildShuffleVector(gallivm->builder, res, res, LLVMConstVector(shuffles, 4), "");
+res = LLVMBuildBitCast(gallivm->builder, res, dst_elem_type, "");
+ }
+#endif
   }
}
return res;
-- 
2.7.4
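
As a reading aid for the comment in the patch above, here is a minimal sketch of the field permutation it describes, written with plain shifts on a 64-bit integer instead of the LLVM builder. The helper name permute_0xyz_to_0zyx is hypothetical and not part of gallivm, and the sketch does not claim to mirror the shuffle indices used in the patch; it only shows the stated goal of turning 0.X.Y.Z into 0.Z.Y.X.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of gallivm: takes a 64-bit value whose
 * 16-bit fields read 0.X.Y.Z from most to least significant and returns
 * 0.Z.Y.X, i.e. X and Z trade places while Y stays put. */
static uint64_t
permute_0xyz_to_0zyx(uint64_t v)
{
   /* The top 16 bits are expected to be zero after the 48-bit load. */
   assert((v >> 48) == 0);

   uint64_t x = (v >> 32) & 0xffff;
   uint64_t y = (v >> 16) & 0xffff;
   uint64_t z = v & 0xffff;

   return (z << 32) | (y << 16) | x;
}
```

Applying the helper twice gives back the original value, which is a quick sanity check that only X and Z are exchanged.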



[Mesa-dev] [PATCH v2 1/2] gallivm: correct channel shift logic on big endian

2017-08-23 Thread Ben Crocker
From: Ray Strode 

lp_build_fetch_rgba_soa fetches a texel from a texture.
Part of that process involves first gathering the element
together from memory into a packed format, and then breaking
out the individual color channels into separate, parallel
arrays.

The code fails to account for endianness when reading the packed
values.

This commit attempts to correct the problem by reversing the order
in which the packed values are read on big-endian systems.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100613
Cc: "17.2" "17.1" 

---
 src/gallium/auxiliary/gallivm/lp_bld_format_soa.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_format_soa.c b/src/gallium/auxiliary/gallivm/lp_bld_format_soa.c
index 98eb694..22c19b1 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_format_soa.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_format_soa.c
@@ -650,7 +650,13 @@ lp_build_fetch_rgba_soa(struct gallivm_state *gallivm,
  for (i = 0; i < format_desc->nr_channels; i++) {
 struct util_format_channel_description chan_desc = format_desc->channel[i];
 unsigned blockbits = type.width;
-unsigned vec_nr = chan_desc.shift / type.width;
+unsigned vec_nr;
+
+#ifdef PIPE_ARCH_BIG_ENDIAN
+vec_nr = (format_desc->block.bits - (chan_desc.shift + chan_desc.size)) / type.width;
+#else
+vec_nr = chan_desc.shift / type.width;
+#endif
 chan_desc.shift %= type.width;
 
 output[i] = lp_build_extract_soa_chan(,
-- 
2.7.4
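
To illustrate the arithmetic in the hunk above, here is a minimal sketch of the vec_nr selection as a standalone function. The function name and the runtime big_endian flag are assumptions made for illustration; the patch itself selects the branch at compile time with PIPE_ARCH_BIG_ENDIAN and reads the values from format_desc and type.

```c
#include <assert.h>

/* Sketch of the vec_nr selection from the patch. All parameters are plain
 * integers here; in the real code they come from format_desc and type. */
static unsigned
channel_vec_nr(unsigned block_bits, unsigned chan_shift, unsigned chan_size,
               unsigned type_width, int big_endian)
{
   assert(type_width != 0);
   assert(chan_shift + chan_size <= block_bits);

   /* On big endian the channel is counted from the other end of the block. */
   if (big_endian)
      return (block_bits - (chan_shift + chan_size)) / type_width;

   return chan_shift / type_width;
}
```

For a 64-bit block of 16-bit channels read as 32-bit vectors, the channel at shift 0 lands in vector 0 on little endian and vector 1 on big endian, and the channel at shift 48 does the opposite.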



[Mesa-dev] [PATCH v2 0/2] gallivm/llvmpipe: fix gather logic for big-endian architectures

2017-08-23 Thread Ben Crocker
The following patches, on top of Roland Scheidegger's commit
e827d9175675aaa6cfc0b981e2a80685fb7b3a74, reduce pre-Roland
Piglit failures from ~4000 to ~2000 on big-endian architectures
(PPC64, S390x).

Ben Crocker (1):
  llvmpipe: lp_build_gather_elem_vec BE fix for 3x16 load

Ray Strode (1):
  gallivm: correct channel shift logic on big endian

 src/gallium/auxiliary/gallivm/lp_bld_format_soa.c |  8 +-
 src/gallium/auxiliary/gallivm/lp_bld_gather.c | 30 +--
 2 files changed, 35 insertions(+), 3 deletions(-)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100613
Cc: "17.2" "17.1" 

-- 
2.7.4



[Mesa-dev] [Bug 102017] Wrong colours in Cities Skyline

2017-08-23 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=102017

--- Comment #14 from Gert Wollny  ---
Maybe you're missing libtxc_dxtn? According to the glxinfo the dxt* texture
compression extensions are not available.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


[Mesa-dev] [Bug 102038] assertion failure in update_framebuffer_size

2017-08-23 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=102038

--- Comment #14 from Bruce Cherniak  ---
Created attachment 133726
  --> https://bugs.freedesktop.org/attachment.cgi?id=133726&action=edit
Patch to fix swr after "new patch to try"

Hi Brian,

I found the reason that swr starts failing with your patch.  I was treating any
num_samples > 0 as msaa and turning off the fake_msaa cap.  The attached (in
conjunction with your patch) works correctly.

If this makes sense to you, will you please include it with your patch so that
we don't end up with a regression?



Re: [Mesa-dev] [PATCH] i965: Only set key->flat_shade if COL0/COL1 are written.

2017-08-23 Thread Kenneth Graunke
On Wednesday, August 23, 2017 12:04:40 PM PDT Ilia Mirkin wrote:
> You might consider also including whether the interpolation method was
> forced or not. i.e. if you have
> 
> flat varying vec4 gl_Color;
> 
> then it doesn't matter whether shade model is flat or not, it'll be
> interpolated as flat. (Same with the other qualifiers made available
> in GL 3.0.)
> 
> So you only have to do funny stuff if either COL0 / COL1 don't have
> explicit interpolation qualifiers.
> 
> That might be over-optimizing it though. Your call. Just something
> that occurred to me.
> 
>   -ilia

Yeah, we definitely could do that.  I think I'd have to go loop over the
input variables, though, instead of checking a bitfield...which is a bit
of a pain.  Given that 'flat' was introduced with GLSL 1.30, which also
deprecated gl_Color, I doubt it's too common of a case.

--Ken
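
To make the bitfield check being discussed concrete, here is a minimal sketch of the condition from the patch as a standalone predicate. The SKETCH_VARYING_BIT_* values and the function name are hypothetical and chosen only for illustration; the real VARYING_BIT_COL0 / VARYING_BIT_COL1 definitions live in Mesa's headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit positions, for illustration only; the real
 * VARYING_BIT_* values are defined by Mesa. */
#define SKETCH_VARYING_BIT_COL0 (UINT64_C(1) << 1)
#define SKETCH_VARYING_BIT_COL1 (UINT64_C(1) << 2)

/* Flat shading only matters for the key if the fragment program
 * actually reads one of the color inputs. */
static bool
sketch_flat_shade_key(uint64_t inputs_read, bool shade_model_is_flat)
{
   const uint64_t color_bits =
      SKETCH_VARYING_BIT_COL0 | SKETCH_VARYING_BIT_COL1;

   return (inputs_read & color_bits) != 0 && shade_model_is_flat;
}
```

Checking explicit interpolation qualifiers, as suggested above, would need per-variable information that this single mask test does not carry.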



Re: [Mesa-dev] [PATCH] ac/nir: fixup layer/viewport export for GFX9.

2017-08-23 Thread Andres Gomez
Hi Dave,

This patch landed tagged for 17.2 only. Was it intentionally not
nominated for 17.1?

Br.

On Thu, 2017-08-17 at 14:27 +1000, Dave Airlie wrote:
> From: Dave Airlie 
> 
> GFX9 moved where the viewport index export goes.
> 
> Signed-off-by: Dave Airlie 
> ---
>  src/amd/common/ac_nir_to_llvm.c | 32 +---
>  1 file changed, 25 insertions(+), 7 deletions(-)
> 
> diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
> index 7aa7567..a17a232 100644
> --- a/src/amd/common/ac_nir_to_llvm.c
> +++ b/src/amd/common/ac_nir_to_llvm.c
> @@ -5518,11 +5518,11 @@ handle_vs_outputs_post(struct nir_to_llvm_context 
> *ctx,
>
> ctx->nir->outputs[radeon_llvm_reg_index_soa(VARYING_SLOT_VIEWPORT, 0)], "");
>   }
>  
> - uint32_t mask = ((outinfo->writes_pointsize == true ? 1 : 0) |
> -  (outinfo->writes_layer == true ? 4 : 0) |
> -  (outinfo->writes_viewport_index == true ? 8 : 0));
> - if (mask) {
> - pos_args[1].enabled_channels = mask;
> + if (outinfo->writes_pointsize ||
> + outinfo->writes_layer ||
> + outinfo->writes_viewport_index) {
> + pos_args[1].enabled_channels = ((outinfo->writes_pointsize == 
> true ? 1 : 0) |
> + (outinfo->writes_layer == true 
> ? 4 : 0));
>   pos_args[1].valid_mask = 0;
>   pos_args[1].done = 0;
>   pos_args[1].target = V_008DFC_SQ_EXP_POS + 1;
> @@ -5536,8 +5536,26 @@ handle_vs_outputs_post(struct nir_to_llvm_context *ctx,
>   pos_args[1].out[0] = psize_value;
>   if (outinfo->writes_layer == true)
>   pos_args[1].out[2] = layer_value;
> - if (outinfo->writes_viewport_index == true)
> - pos_args[1].out[3] = viewport_index_value;
> + if (outinfo->writes_viewport_index == true) {
> + if (ctx->options->chip_class >= GFX9) {
> + /* GFX9 has the layer in out.z[10:0] and the viewport
> +  * index in out.z[19:16].
> +  */
> + LLVMValueRef v = viewport_index_value;
> + v = to_integer(&ctx->ac, v);
> + v = LLVMBuildShl(ctx->builder, v,
> +  LLVMConstInt(ctx->i32, 16, false),
> +  "");
> + v = LLVMBuildOr(ctx->builder, v,
> + to_integer(&ctx->ac, pos_args[1].out[2]), "");
> +
> + pos_args[1].out[2] = to_float(&ctx->ac, v);
> + pos_args[1].enabled_channels |= 1 << 2;
> + } else {
> + pos_args[1].out[3] = viewport_index_value;
> + pos_args[1].enabled_channels |= 1 << 3;
> + }
> + }
>   }
>   for (i = 0; i < 4; i++) {
>   if (pos_args[i].out[0])
-- 
Br,

Andres
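
For reference, the GFX9 layout named in the quoted patch comment (layer in out.z[10:0], viewport index in out.z[19:16]) can be sketched on plain integers as below. The function name is hypothetical, and the masks are an addition inferred from the stated bit ranges; the quoted patch only shifts and ORs the values.

```c
#include <stdint.h>

/* Sketch of the GFX9 layout described in the patch comment: layer in
 * bits [10:0] and viewport index in bits [19:16] of one 32-bit word.
 * The masks are inferred from those bit ranges and are not in the patch. */
static uint32_t
sketch_pack_layer_viewport(uint32_t layer, uint32_t viewport_index)
{
   return (layer & 0x7ff) | ((viewport_index & 0xf) << 16);
}
```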


Re: [Mesa-dev] [PATCH] i965: Only set key->flat_shade if COL0/COL1 are written.

2017-08-23 Thread Ilia Mirkin
You might consider also including whether the interpolation method was
forced or not. i.e. if you have

flat varying vec4 gl_Color;

then it doesn't matter whether shade model is flat or not, it'll be
interpolated as flat. (Same with the other qualifiers made available
in GL 3.0.)

So you only have to do funny stuff if either COL0 / COL1 don't have
explicit interpolation qualifiers.

That might be over-optimizing it though. Your call. Just something
that occurred to me.

  -ilia


On Tue, Aug 22, 2017 at 10:19 PM, Kenneth Graunke  wrote:
> This may reduce some recompiles.
> ---
>  src/mesa/drivers/dri/i965/brw_wm.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_wm.c 
> b/src/mesa/drivers/dri/i965/brw_wm.c
> index c9c45045902..e1555d60c56 100644
> --- a/src/mesa/drivers/dri/i965/brw_wm.c
> +++ b/src/mesa/drivers/dri/i965/brw_wm.c
> @@ -531,7 +531,9 @@ brw_wm_populate_key(struct brw_context *brw, struct 
> brw_wm_prog_key *key)
>key->stats_wm = brw->stats_wm;
>
> /* _NEW_LIGHT */
> -   key->flat_shade = (ctx->Light.ShadeModel == GL_FLAT);
> +   key->flat_shade =
> +  (prog->info.inputs_read & (VARYING_BIT_COL0 | VARYING_BIT_COL1)) &&
> +  (ctx->Light.ShadeModel == GL_FLAT);
>
> /* _NEW_FRAG_CLAMP | _NEW_BUFFERS */
> key->clamp_fragment_color = ctx->Color._ClampFragmentColor;
> --
> 2.14.1
>


Re: [Mesa-dev] [PATCH] spirv: Add support for the HelperInvocation builtin

2017-08-23 Thread Matt Turner
On Wed, Aug 23, 2017 at 2:09 PM, Jason Ekstrand  wrote:
> On Wed, Aug 23, 2017 at 9:58 AM, Ian Romanick  wrote:
>>
>> Reviewed-by: Ian Romanick 
>>
>> Did you submit a CTS bug?
>
>
> No, I didn't.  It does get some coverage through the up-and-coming subgroup
> tests but it should probably have its own test.  That's going to be really
> annoying to test...

Take a look at tests/spec/glsl-4.50/execution/helper-invocation.shader_test


Re: [Mesa-dev] [PATCH] st/va: exclude the buffer reallocation for encode case

2017-08-23 Thread Christian König

On 23.08.2017 at 20:32, Leo Liu wrote:



On 08/23/2017 02:10 PM, Christian König wrote:

On 23.08.2017 at 19:21, Leo Liu wrote:

Since the encoder only supports de-interlaced buffers.

Signed-off-by: Leo Liu 
---
  src/gallium/state_trackers/va/picture.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/gallium/state_trackers/va/picture.c 
b/src/gallium/state_trackers/va/picture.c

index b2be7af8c4..ea86ce1b3b 100644
--- a/src/gallium/state_trackers/va/picture.c
+++ b/src/gallium/state_trackers/va/picture.c
@@ -625,7 +625,8 @@ vlVaEndPicture(VADriverContextP ctx, VAContextID 
context_id)

PIPE_VIDEO_ENTRYPOINT_BITSTREAM,
PIPE_VIDEO_CAP_SUPPORTS_INTERLACED);
  -   if (surf->buffer->interlaced != interlaced) {
+   if (context->decoder->entrypoint != PIPE_VIDEO_ENTRYPOINT_ENCODE &&
+   surf->buffer->interlaced != interlaced) {


I think it would be better to just use context->decoder->entrypoint 
in the call above.


That should return false for interlaced when there is some encoding 
going on.


Yeah. That should work too. v2 will address that, but the approach is
just meant to fix the regression.


Looking at the encode interlaced handling, it's a bit messy, since in
order to make encode work on VAAPI we have to set the environment
variable "VAAPI_DISABLE_INTERLACE=true" to allocate the de-interlaced
buffer at the beginning.


The problem is that when the buffer is re-allocated as de-interlaced,
the YUV content is already there, so a simple reallocation will lose the
first few frames.
Since I have done a similar thing on OMX to copy the YUV back from the
interlaced buffer to the de-interlaced one, I may try to do that on VAAPI
to remove the need for "VAAPI_DISABLE_INTERLACE".


Yeah, completely agree. That would certainly be great to have.

Regards,
Christian.



Regards,
Leo




Regards,
Christian.

surf->templat.interlaced = screen->get_video_param(screen, 
context->decoder->profile,

PIPE_VIDEO_ENTRYPOINT_BITSTREAM,
PIPE_VIDEO_CAP_PREFERS_INTERLACED);









Re: [Mesa-dev] [PATCH] st/va: exclude the buffer reallocation for encode case

2017-08-23 Thread Christian König

On 23.08.2017 at 20:35, Leo Liu wrote:

Since the encoder only supports de-interlaced buffers.

v2: move to parameter call to tell dec/enc

Signed-off-by: Leo Liu 


Reviewed-by: Christian König 


---
  src/gallium/state_trackers/va/picture.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/state_trackers/va/picture.c 
b/src/gallium/state_trackers/va/picture.c
index b2be7af8c4..47e63d3b30 100644
--- a/src/gallium/state_trackers/va/picture.c
+++ b/src/gallium/state_trackers/va/picture.c
@@ -622,7 +622,7 @@ vlVaEndPicture(VADriverContextP ctx, VAContextID context_id)
  
 screen = context->decoder->context->screen;

 interlaced = screen->get_video_param(screen, context->decoder->profile,
-PIPE_VIDEO_ENTRYPOINT_BITSTREAM,
+context->decoder->entrypoint,
  PIPE_VIDEO_CAP_SUPPORTS_INTERLACED);
  
 if (surf->buffer->interlaced != interlaced) {





[Mesa-dev] [PATCH] st/va: exclude the buffer reallocation for encode case

2017-08-23 Thread Leo Liu
Since the encoder only supports de-interlaced buffers.

v2: move to parameter call to tell dec/enc

Signed-off-by: Leo Liu 
---
 src/gallium/state_trackers/va/picture.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/state_trackers/va/picture.c 
b/src/gallium/state_trackers/va/picture.c
index b2be7af8c4..47e63d3b30 100644
--- a/src/gallium/state_trackers/va/picture.c
+++ b/src/gallium/state_trackers/va/picture.c
@@ -622,7 +622,7 @@ vlVaEndPicture(VADriverContextP ctx, VAContextID context_id)
 
screen = context->decoder->context->screen;
interlaced = screen->get_video_param(screen, context->decoder->profile,
-PIPE_VIDEO_ENTRYPOINT_BITSTREAM,
+context->decoder->entrypoint,
 PIPE_VIDEO_CAP_SUPPORTS_INTERLACED);
 
if (surf->buffer->interlaced != interlaced) {
-- 
2.11.0



Re: [Mesa-dev] [PATCH] st/va: exclude the buffer reallocation for encode case

2017-08-23 Thread Leo Liu



On 08/23/2017 02:10 PM, Christian König wrote:

On 23.08.2017 at 19:21, Leo Liu wrote:

Since the encoder only supports de-interlaced buffers.

Signed-off-by: Leo Liu 
---
  src/gallium/state_trackers/va/picture.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/gallium/state_trackers/va/picture.c 
b/src/gallium/state_trackers/va/picture.c

index b2be7af8c4..ea86ce1b3b 100644
--- a/src/gallium/state_trackers/va/picture.c
+++ b/src/gallium/state_trackers/va/picture.c
@@ -625,7 +625,8 @@ vlVaEndPicture(VADriverContextP ctx, VAContextID 
context_id)

PIPE_VIDEO_ENTRYPOINT_BITSTREAM,
PIPE_VIDEO_CAP_SUPPORTS_INTERLACED);
  -   if (surf->buffer->interlaced != interlaced) {
+   if (context->decoder->entrypoint != PIPE_VIDEO_ENTRYPOINT_ENCODE &&
+   surf->buffer->interlaced != interlaced) {


I think it would be better to just use context->decoder->entrypoint in 
the call above.


That should return false for interlaced when there is some encoding 
going on.


Yeah. That should work too. v2 will address that, but the approach is
just meant to fix the regression.


Looking at the encode interlaced handling, it's a bit messy, since in
order to make encode work on VAAPI we have to set the environment
variable "VAAPI_DISABLE_INTERLACE=true" to allocate the de-interlaced
buffer at the beginning.


The problem is that when the buffer is re-allocated as de-interlaced,
the YUV content is already there, so a simple reallocation will lose the
first few frames.
Since I have done a similar thing on OMX to copy the YUV back from the
interlaced buffer to the de-interlaced one, I may try to do that on VAAPI
to remove the need for "VAAPI_DISABLE_INTERLACE".


Regards,
Leo
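
As a rough illustration of what copying YUV from an interlaced buffer to a de-interlaced one could involve, here is a minimal sketch that weaves the rows of two field planes into one progressive plane. It assumes the interlaced buffer stores the top and bottom fields as separate planes; the function name and parameters are hypothetical, and this is neither the OMX code mentioned above nor a VAAPI implementation.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: interleaves the rows of two field planes (top and
 * bottom, each field_rows x row_bytes) into one progressive plane of
 * 2 * field_rows rows. It only shows the row ordering. */
static void
sketch_weave_fields(uint8_t *dst, const uint8_t *top, const uint8_t *bottom,
                    size_t field_rows, size_t row_bytes)
{
   for (size_t i = 0; i < field_rows; i++) {
      /* Even output rows come from the top field, odd ones from the bottom. */
      memcpy(dst + (2 * i) * row_bytes, top + i * row_bytes, row_bytes);
      memcpy(dst + (2 * i + 1) * row_bytes, bottom + i * row_bytes, row_bytes);
   }
}
```

A real copy would have to repeat this per plane (luma and chroma) and would more likely run on the GPU than on the CPU.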




Regards,
Christian.

surf->templat.interlaced = screen->get_video_param(screen, 
context->decoder->profile,

PIPE_VIDEO_ENTRYPOINT_BITSTREAM,
PIPE_VIDEO_CAP_PREFERS_INTERLACED);







[Mesa-dev] [Bug 102377] PIPE_*_4BYTE_ALIGNED_ONLY caps crashing

2017-08-23 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=102377

Bug ID: 102377
   Summary: PIPE_*_4BYTE_ALIGNED_ONLY caps crashing
   Product: Mesa
   Version: git
  Hardware: Other
OS: All
Status: NEW
  Severity: normal
  Priority: medium
 Component: Mesa core
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: timothy.o.row...@intel.com
QA Contact: mesa-dev@lists.freedesktop.org

For a potential upcoming change in the swr rasterizer, we need to set the
4BYTE_ALIGNED_ONLY caps to ensure alignment during vertex attribute fetch and
AOS to SOA conversion.

Testing with the caps caused three regressions on piglit, which can be
reproduced by setting these caps on llvmpipe.  The three failing tests all have
somewhat different failure signatures, but appear to point to mesa
infrastructure problems rather than a driver issue.

/home/torowley/src/piglit/bin/copy-pixels -auto

This one appears to clobber the stack:

Thread 1 "copy-pixels" received signal SIGSEGV, Segmentation fault.
0x761a1281 in _mesa_set_vp_override (ctx=0x77eff01000, flag=0
'\000') at ../../../mesa/src/mesa/main/state.c:450
(gdb) where
#0  0x761a1281 in _mesa_set_vp_override (ctx=0x77eff01000, flag=0
'\000') at ../../../mesa/src/mesa/main/state.c:450
#1  0x760056ea in _mesa_CopyPixels (srcx=0, srcy=0, width=15360,
height=15360, type=1572930) at ../../../mesa/src/mesa/main/drawpix.c:288
#2  0x4014af00 in ?? ()
#3  0x007fffd81000 in ?? ()

/home/torowley/src/piglit/bin/draw-vertices user -auto

Thread 1 "draw-vertices" received signal SIGSEGV, Segmentation fault.
0x in ?? ()
(gdb) where
#0  0x in ?? ()
#1  0x7655234c in pipe_resource_reference (ptr=0x812350, tex=0x0) at
../../../../mesa/src/gallium/auxiliary/util/u_inlines.h:144
#2  0x7655467a in u_vbuf_set_vertex_buffers (mgr=0x811c30,
start_slot=1, count=1, bufs=0x0) at
../../../../mesa/src/gallium/auxiliary/util/u_vbuf.c:836
#3  0x764b1411 in cso_set_vertex_buffers (ctx=0x810870, start_slot=1,
count=1, buffers=0x0) at
../../../../mesa/src/gallium/auxiliary/cso_cache/cso_context.c:1144
#4  0x76268cd3 in set_vertex_attribs (st=0x7f87a0,
vbuffers=0x7fffd360, num_vbuffers=1, velements=0x7fffd370,
num_velements=1) at ../../../mesa/src/mesa/state_tr
acker/st_atom_array.c:441
#5  0x762691b2 in setup_interleaved_attribs (st=0x7f87a0, vp=0x91f370,
arrays=0x7fe408, num_inputs=1) at
../../../mesa/src/mesa/state_tracker/st_atom_array.c:563
#6  0x76269845 in st_update_array (st=0x7f87a0) at
../../../mesa/src/mesa/state_tracker/st_atom_array.c:705
#7  0x7626af3f in st_validate_state (st=0x7f87a0,
pipeline=ST_PIPELINE_RENDER) at
../../../mesa/src/mesa/state_tracker/st_atom.c:253
#8  0x76298d85 in prepare_draw (st=0x7f87a0, ctx=0x77eff010) at
../../../mesa/src/mesa/state_tracker/st_draw.c:122
#9  0x76298e1b in st_draw_vbo (ctx=0x77eff010,
prims=0x7fffd690, nr_prims=1, ib=0x0, index_bounds_valid=1 '\001',
min_index=0, max_index=2, tfb_vertcount=0x0, st
ream=0, indirect=0x0) at ../../../mesa/src/mesa/state_tracker/st_draw.c:148
#10 0x76239f91 in vbo_draw_arrays (ctx=0x77eff010, mode=4, start=0,
count=3, numInstances=1, baseInstance=0, drawID=0) at
../../../mesa/src/mesa/vbo/vbo_exec_array.c
:486
#11 0x7623a7a1 in vbo_exec_DrawArrays (mode=4, start=0, count=3) at
../../../mesa/src/mesa/vbo/vbo_exec_array.c:641
#12 0x00402363 in test_short_vertices (x1=240, y1=0, x2=260, y2=20,
index=0) at /home/torowley/src/piglit/tests/general/draw-vertices.c:282
#13 0x00403536 in piglit_display () at
/home/torowley/src/piglit/tests/general/draw-vertices.c:568
#14 0x77b3520e in process_next_event (x11_fw=0x637ed0) at
/home/torowley/src/piglit/tests/util/piglit-framework-gl/piglit_x11_framework.c:137
#15 0x77b352ce in enter_event_loop (winsys_fw=0x637ed0) at
/home/torowley/src/piglit/tests/util/piglit-framework-gl/piglit_x11_framework.c:153
#16 0x77b34041 in run_test (gl_fw=0x637ed0, argc=2,
argv=0x7fffdb58) at
/home/torowley/src/piglit/tests/util/piglit-framework-gl/piglit_winsys_framework.c:88
#17 0x77b18378 in piglit_gl_test_run (argc=2, argv=0x7fffdb58,
config=0x7fffda10) at
/home/torowley/src/piglit/tests/util/piglit-framework-gl.c:223
#18 0x00401324 in main (argc=2, argv=0x7fffdb58) at
/home/torowley/src/piglit/tests/general/draw-vertices.c:43

/home/torowley/src/piglit/bin/fbo-depthtex -auto -fbo

Thread 1 "fbo-depthtex" received signal SIGSEGV, Segmentation fault.
0x76016e4e in check_end_texture_render (ctx=0x77eff010,
fb=0x8261cf) at ../../../mesa/src/mesa/main/fbobject.c:2566
#0  0x76016e4e in check_end_texture_render (ctx=0x77eff010,
fb=0x8261cf) at ../../../mesa/src/mesa/main/fbobject.c:2566
#1  0x7601720c in 

[Mesa-dev] [PATCH v6.2] egl: Allow creation of per surface out fence

2017-08-23 Thread yogesh . marathe
From: Zhongmin Wu 

Add plumbing to allow creation of per display surface out fence.

Currently enabled only on Android, since the system expects a valid
fd in ANativeWindow::{queue,cancel}Buffer. We currently pass an fd of
-1, with which native applications such as flatland fail. The patch
enables explicit sync on Android and fixes one of the functional issues
for apps or buffer consumers which depend upon the fence and its timestamp.

v2: a) Also implement the fence in cancelBuffer.
b) The last sync fence is stored in drawable object
   rather than brw context.
c) format clear.

v3: a) Save the last fence fd in DRI Context object.
b) Return the last fence if the batch buffer is empty and
   nothing to be flushed when _intel_batchbuffer_flush_fence
c) Add the new interface in vtbl to set the retrieve fence

v3.1 a) close fd in the new vtbl interface on non-Android platforms

v4: a) The last fence is saved in brw context.
b) The retrieve fd is for all platforms, not just Android
c) Add a uniform dri2 interface to initialize the surface.

v4.1: a) make some changes of variable name.
  b) the patch is broken into two patches.

v4.2: a) Add a deinit interface for surface to clear the out fence

v5: a) Add enable_out_fence to init, platform sets it true or
   false
b) Change get fd to update fd and check for fence
c) Commit description updated

v6: a) Heading and commit description updated
b) enable_out_fence is set only if fence is supported
c) Review comments on function names
d) Test with standalone patch, resolves the bug

v6.1: Check for old display fence reverted

v6.2: enable_out_fence initialized to false by default,
  dri2_surf_update_fence_fd updated, deinit changed to fini

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101655

Signed-off-by: Zhongmin Wu 
Signed-off-by: Yogesh Marathe 
Reviewed-by: Emil Velikov 
Reviewed-by: Tomasz Figa 
---
 src/egl/drivers/dri2/egl_dri2.c | 71 +
 src/egl/drivers/dri2/egl_dri2.h |  9 
 src/egl/drivers/dri2/platform_android.c | 29 ++--
 src/egl/drivers/dri2/platform_drm.c |  3 +-
 src/egl/drivers/dri2/platform_surfaceless.c |  3 +-
 src/egl/drivers/dri2/platform_wayland.c |  3 +-
 src/egl/drivers/dri2/platform_x11.c |  3 +-
 src/egl/drivers/dri2/platform_x11_dri3.c|  3 +-
 8 files changed, 106 insertions(+), 18 deletions(-)

diff --git a/src/egl/drivers/dri2/egl_dri2.c b/src/egl/drivers/dri2/egl_dri2.c
index aa6f02a..44b8e1d 100644
--- a/src/egl/drivers/dri2/egl_dri2.c
+++ b/src/egl/drivers/dri2/egl_dri2.c
@@ -1388,6 +1388,45 @@ dri2_destroy_context(_EGLDriver *drv, _EGLDisplay *disp, 
_EGLContext *ctx)
return EGL_TRUE;
 }
 
+EGLBoolean
+dri2_init_surface(_EGLSurface *surf, _EGLDisplay *dpy, EGLint type,
+_EGLConfig *conf, const EGLint *attrib_list, EGLBoolean 
enable_out_fence)
+{
+   struct dri2_egl_surface *dri2_surf = dri2_egl_surface(surf);
+   struct dri2_egl_display *dri2_dpy = dri2_egl_display(dpy);
+
+   dri2_surf->out_fence_fd = -1;
+   dri2_surf->enable_out_fence = false;
+   if (dri2_dpy->fence && dri2_dpy->fence->base.version >= 2 &&
+   dri2_dpy->fence->get_capabilities &&
+   (dri2_dpy->fence->get_capabilities(dri2_dpy->dri_screen) &
+__DRI_FENCE_CAP_NATIVE_FD)) {
+  dri2_surf->enable_out_fence = enable_out_fence;
+   }
+
+   return _eglInitSurface(surf, dpy, type, conf, attrib_list);
+}
+
+static void
+dri2_surface_set_out_fence_fd( _EGLSurface *surf, int fence_fd)
+{
+   struct dri2_egl_surface *dri2_surf = dri2_egl_surface(surf);
+
+   if (dri2_surf->out_fence_fd >=0)
+  close(dri2_surf->out_fence_fd);
+
+   dri2_surf->out_fence_fd = fence_fd;
+}
+
+void
+dri2_fini_surface(_EGLSurface *surf)
+{
+   struct dri2_egl_surface *dri2_surf = dri2_egl_surface(surf);
+
+   dri2_surface_set_out_fence_fd(surf, -1);
+   dri2_surf->enable_out_fence = false;
+}
+
 static EGLBoolean
 dri2_destroy_surface(_EGLDriver *drv, _EGLDisplay *dpy, _EGLSurface *surf)
 {
@@ -1399,6 +1438,28 @@ dri2_destroy_surface(_EGLDriver *drv, _EGLDisplay *dpy, 
_EGLSurface *surf)
return dri2_dpy->vtbl->destroy_surface(drv, dpy, surf);
 }
 
+static void
+dri2_surf_update_fence_fd(_EGLContext *ctx,
+  _EGLDisplay *dpy, _EGLSurface *surf)
+{
+   __DRIcontext *dri_ctx = dri2_egl_context(ctx)->dri_context;
+   struct dri2_egl_display *dri2_dpy = dri2_egl_display(dpy);
+   struct dri2_egl_surface *dri2_surf = dri2_egl_surface(surf);
+   int fence_fd = -1;
+   void *fence;
+
+   if (!dri2_surf->enable_out_fence)
+  return;
+
+   fence = dri2_dpy->fence->create_fence_fd(dri_ctx, -1);
+   if (fence) {
+  fence_fd = dri2_dpy->fence->get_fence_fd(dri2_dpy->dri_screen,
+   fence);
+  

Re: [Mesa-dev] [PATCH] Android: Fix LLVM duplicated symbols linking for N and M

2017-08-23 Thread Mauro Rossi
On 23 Aug 2017 19:57, "Rob Herring"  wrote:

On Wed, Aug 23, 2017 at 12:31 PM, Emil Velikov 
wrote:
> On 23 August 2017 at 17:50, Rob Herring  wrote:
>> On Sun, Aug 20, 2017 at 2:57 PM, Rob Herring  wrote:
>>> On Fri, Aug 18, 2017 at 8:53 PM, Chih-Wei Huang 
wrote:
 2017-08-19 8:27 GMT+08:00 Emil Velikov :
> On 18 August 2017 at 20:46, Rob Herring  wrote:
>> Both statically linking libLLVMCore and dynamically linking libLLVM
causes
>> duplicated symbols in gallium_dri.so and it fails to dlopen. We don't
>> really need to link libLLVMCore, but just need generated headers to
be
>> built first. Dynamically linking to libLLVM instead is enough to do
>> that. Thanks to Qiang Yu for finding the root cause.
>>
>> [...]
>>
>>$(if $(filter 7,$(MESA_ANDROID_MAJOR_VERSION)), \
>> -$(eval LOCAL_CFLAGS += -DHAVE_LLVM=0x0308
-DMESA_LLVM_VERSION_PATCH=0) \
>> -$(eval LOCAL_STATIC_LIBRARIES += libLLVMCore) \
>> -$(eval LOCAL_C_INCLUDES += external/llvm/include
external/llvm/device/include),) \
>> +$(eval LOCAL_CFLAGS += -DHAVE_LLVM=0x0308
-DMESA_LLVM_VERSION_PATCH=0),) \
>>$(if $(filter O,$(MESA_ANDROID_MAJOR_VERSION)), \
>> -$(eval LOCAL_CFLAGS += -DHAVE_LLVM=0x0309
-DMESA_LLVM_VERSION_PATCH=0) \
>> -$(eval LOCAL_HEADER_LIBRARIES += llvm-headers),)
>> +$(eval LOCAL_CFLAGS += -DHAVE_LLVM=0x0309
-DMESA_LLVM_VERSION_PATCH=0),) \
>> +  $(eval LOCAL_SHARED_LIBRARIES += libLLVM)
> Am I the only person getting tad confused by amount of brackets?
> As mentioned by Chih-Wei - a shell switch is not possible, but how
> about a test vague like the following?
>
> test "x$(MESA_ANDROID_MAJOR_VERSION)" = "xO" &&
>$(eval LOCAL_CFLAGS += -DHAVE_LLVM=0x0309
-DMESA_LLVM_VERSION_PATCH=0)

 Only possible if you put it into $(shell ...)
 That gives me an idea. Maybe we ca do like

 $(shell case "$(MESA_ANDROID_MAJOR_VERSION)" in \
 6) echo ... ;; \
 7) echo ... ;; \
 *)  echo ... ;; \
 esac)

 I haven't really try it yet.
>>>
>>> What does either really buy us? It's really just bike shedding and
>>> unrelated to fixing the problem at hand.
>>>
>>> I have another idea which is to use llvm-config and avoid the
>>> conditionals altogether. I haven't looked into that closely though.
>>
>> Well, the build is broken again because the version changed from O to
>> 8 (and I'm not sure if master is going to change to P or 9 at some
>> point). So I went ahead and have this all coded up like this (I don't
>> see a simple way to build and run llvm-config):
>>
> Yay :-(
>
>>   $(eval $(shell sed -n -e
>> 's/.*\(LLVM_VERSION_MAJOR\).*\([0-9].*\)/\1:=\2/p'
>> external/llvm/device/include/llvm/Config/llvm-config.h)) \
>>   $(eval $(shell sed -n -e
>> 's/.*\(LLVM_VERSION_MINOR\).*\([0-9].*\)/\1:=\2/p'
>> external/llvm/device/include/llvm/Config/llvm-config.h)) \
>>   $(eval LOCAL_CFLAGS +=
>> -DHAVE_LLVM=0x$(LLVM_VERSION_MAJOR)0$(LLVM_VERSION_MINOR))
>>
>> Only one slight problem in that for master/O it reports 3.8 as the
>> version is 3.8.275480 which I think is the SVN version number. Not
>> sure what to do with that...
>>
> Indeed, seems like a SVN version. Not sure how much to care about the
> PATCH version.
> Leave it as 0 or use the SVN one - your call.

Okay, I was a bit vague. The problem is the version is effectively 3.9
and the build breaks if we build with HAVE_LLVM=0x308 instead.

I don't think it would work on older versions either. N has 3.8.256229
and that is 3.8 (from mesa perspective). The M version was 3.6.svn,
but we pass 3.7 to mesa. So there's not really a programmatic way to
handle this.

Rob


The LLVM version in CMakeLists.txt is indeed 3.9.0.
Mauro
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v6.2] egl: Allow creation of per surface out fence

2017-08-23 Thread yogesh . marathe
From: Zhongmin Wu 

Add plumbing to allow creation of a per-surface out fence.

Currently this is enabled only on Android, since the system expects a
valid fd in ANativeWindow::{queue,cancel}Buffer. We pass an fd of -1,
with which native applications such as flatland fail. The patch enables
explicit sync on Android and fixes one of the functional issues for
apps or buffer consumers that depend on the fence and its timestamp.

v2: a) Also implement the fence in cancelBuffer.
b) The last sync fence is stored in drawable object
   rather than brw context.
c) format clear.

v3: a) Save the last fence fd in DRI Context object.
b) Return the last fence if the batch buffer is empty and
   nothing to be flushed when _intel_batchbuffer_flush_fence
c) Add the new interface in vtbl to set the retrieve fence

v3.1 a) close fd in the new vtbl interface on non-Android platforms

v4: a) The last fence is saved in brw context.
b) The retrieve fd is for all the platform but not just Android
c) Add a uniform dri2 interface to initialize the surface.

v4.1: a) make some changes of variable name.
  b) the patch is broken into two patches.

v4.2: a) Add a deinit interface for surface to clear the out fence

v5: a) Add enable_out_fence to init, platform sets it true or
   false
b) Change get fd to update fd and check for fence
c) Commit description updated

v6: a) Heading and commit description updated
b) enable_out_fence is set only if fence is supported
c) Review comments on function names
d) Test with standalone patch, resolves the bug

v6.2: a) Check for old display fence reverted
  b) enable_out_fence initialized to false by default,
 dri2_surf_update_fence_fd updated, deinit changed to fini

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101655

Signed-off-by: Zhongmin Wu 
Signed-off-by: Yogesh Marathe 
Reviewed-by: Emil Velikov 
Reviewed-by: Tomasz Figa 
---
 src/egl/drivers/dri2/egl_dri2.c | 71 +
 src/egl/drivers/dri2/egl_dri2.h |  9 
 src/egl/drivers/dri2/platform_android.c | 29 ++--
 src/egl/drivers/dri2/platform_drm.c |  3 +-
 src/egl/drivers/dri2/platform_surfaceless.c |  3 +-
 src/egl/drivers/dri2/platform_wayland.c |  3 +-
 src/egl/drivers/dri2/platform_x11.c |  3 +-
 src/egl/drivers/dri2/platform_x11_dri3.c|  3 +-
 8 files changed, 106 insertions(+), 18 deletions(-)

diff --git a/src/egl/drivers/dri2/egl_dri2.c b/src/egl/drivers/dri2/egl_dri2.c
index aa6f02a..44b8e1d 100644
--- a/src/egl/drivers/dri2/egl_dri2.c
+++ b/src/egl/drivers/dri2/egl_dri2.c
@@ -1388,6 +1388,45 @@ dri2_destroy_context(_EGLDriver *drv, _EGLDisplay *disp, 
_EGLContext *ctx)
return EGL_TRUE;
 }
 
+EGLBoolean
+dri2_init_surface(_EGLSurface *surf, _EGLDisplay *dpy, EGLint type,
+_EGLConfig *conf, const EGLint *attrib_list, EGLBoolean enable_out_fence)
+{
+   struct dri2_egl_surface *dri2_surf = dri2_egl_surface(surf);
+   struct dri2_egl_display *dri2_dpy = dri2_egl_display(dpy);
+
+   dri2_surf->out_fence_fd = -1;
+   dri2_surf->enable_out_fence = false;
+   if (dri2_dpy->fence && dri2_dpy->fence->base.version >= 2 &&
+   dri2_dpy->fence->get_capabilities &&
+   (dri2_dpy->fence->get_capabilities(dri2_dpy->dri_screen) &
+__DRI_FENCE_CAP_NATIVE_FD)) {
+  dri2_surf->enable_out_fence = enable_out_fence;
+   }
+
+   return _eglInitSurface(surf, dpy, type, conf, attrib_list);
+}
+
+static void
+dri2_surface_set_out_fence_fd( _EGLSurface *surf, int fence_fd)
+{
+   struct dri2_egl_surface *dri2_surf = dri2_egl_surface(surf);
+
+   if (dri2_surf->out_fence_fd >=0)
+  close(dri2_surf->out_fence_fd);
+
+   dri2_surf->out_fence_fd = fence_fd;
+}
+
+void
+dri2_fini_surface(_EGLSurface *surf)
+{
+   struct dri2_egl_surface *dri2_surf = dri2_egl_surface(surf);
+
+   dri2_surface_set_out_fence_fd(surf, -1);
+   dri2_surf->enable_out_fence = false;
+}
+
 static EGLBoolean
 dri2_destroy_surface(_EGLDriver *drv, _EGLDisplay *dpy, _EGLSurface *surf)
 {
@@ -1399,6 +1438,28 @@ dri2_destroy_surface(_EGLDriver *drv, _EGLDisplay *dpy, 
_EGLSurface *surf)
return dri2_dpy->vtbl->destroy_surface(drv, dpy, surf);
 }
 
+static void
+dri2_surf_update_fence_fd(_EGLContext *ctx,
+  _EGLDisplay *dpy, _EGLSurface *surf)
+{
+   __DRIcontext *dri_ctx = dri2_egl_context(ctx)->dri_context;
+   struct dri2_egl_display *dri2_dpy = dri2_egl_display(dpy);
+   struct dri2_egl_surface *dri2_surf = dri2_egl_surface(surf);
+   int fence_fd = -1;
+   void *fence;
+
+   if (!dri2_surf->enable_out_fence)
+  return;
+
+   fence = dri2_dpy->fence->create_fence_fd(dri_ctx, -1);
+   if (fence) {
+  fence_fd = dri2_dpy->fence->get_fence_fd(dri2_dpy->dri_screen,
+   fence);
+  
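
(The archived copy of the patch is cut off above.) The fd ownership rule that the visible hunks set up can be sketched in isolation. This is only a minimal model, assuming a hypothetical struct surface_fence_state in place of the fence fields of dri2_egl_surface; the helper names are made up for illustration and are not part of the patch.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <unistd.h>

/* Hypothetical stand-in for the fence-related fields of dri2_egl_surface. */
struct surface_fence_state {
   int out_fence_fd;        /* -1 means "no fence stored" */
   bool enable_out_fence;
};

/* Modelled on dri2_init_surface(): the platform's request is honoured
 * only when the driver reports native fence fd support. */
static void
fence_state_init(struct surface_fence_state *s, bool platform_wants_fence,
                 bool driver_has_native_fd)
{
   s->out_fence_fd = -1;
   s->enable_out_fence = platform_wants_fence && driver_has_native_fd;
}

/* Modelled on dri2_surface_set_out_fence_fd(): takes ownership of new_fd
 * and closes whatever fd was stored before. */
static void
fence_state_set_fd(struct surface_fence_state *s, int new_fd)
{
   if (s->out_fence_fd >= 0)
      close(s->out_fence_fd);
   s->out_fence_fd = new_fd;
}

/* Modelled on dri2_fini_surface(): drop the stored fd and disable. */
static void
fence_state_fini(struct surface_fence_state *s)
{
   fence_state_set_fd(s, -1);
   s->enable_out_fence = false;
}

/* Helper for experiments only: true if fd is currently open. */
static bool
fd_is_open(int fd)
{
   return fcntl(fd, F_GETFD) != -1;
}
```

The point of routing every update through one setter is that the surface never holds more than one open fence fd, so the fini path cannot leak.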

Re: [Mesa-dev] [PATCH] st/va: exclude the buffer reallocation for encode case

2017-08-23 Thread Christian König

On 23.08.2017 at 19:21, Leo Liu wrote:
> Since encoder only support de-interlaced buffers.
>
> Signed-off-by: Leo Liu
> ---
>  src/gallium/state_trackers/va/picture.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/src/gallium/state_trackers/va/picture.c
> b/src/gallium/state_trackers/va/picture.c
> index b2be7af8c4..ea86ce1b3b 100644
> --- a/src/gallium/state_trackers/va/picture.c
> +++ b/src/gallium/state_trackers/va/picture.c
> @@ -625,7 +625,8 @@ vlVaEndPicture(VADriverContextP ctx, VAContextID context_id)
>   PIPE_VIDEO_ENTRYPOINT_BITSTREAM,
>   PIPE_VIDEO_CAP_SUPPORTS_INTERLACED);
>
> -   if (surf->buffer->interlaced != interlaced) {
> +   if (context->decoder->entrypoint != PIPE_VIDEO_ENTRYPOINT_ENCODE &&
> +   surf->buffer->interlaced != interlaced) {

I think it would be better to just use context->decoder->entrypoint in
the call above.

That should return false for interlaced when there is some encoding
going on.

Regards,
Christian.

> surf->templat.interlaced = screen->get_video_param(screen, context->decoder->profile,
> PIPE_VIDEO_ENTRYPOINT_BITSTREAM,
> PIPE_VIDEO_CAP_PREFERS_INTERLACED);



Re: [Mesa-dev] [PATCH] spirv: Add support for the HelperInvocation builtin

2017-08-23 Thread Jason Ekstrand
On Wed, Aug 23, 2017 at 9:58 AM, Ian Romanick  wrote:

> Reviewed-by: Ian Romanick 
>
> Did you submit a CTS bug?
>

No, I didn't.  It does get some coverage through the up-and-coming subgroup
tests but it should probably have its own test.  That's going to be really
annoying to test...


> On 08/21/2017 10:11 PM, Jason Ekstrand wrote:
> > I have no idea how this got missed but it's been missing since forever.
> >
> > Cc: mesa-sta...@lists.freedesktop.org
> > ---
> >  src/compiler/spirv/vtn_variables.c | 5 -
> >  1 file changed, 4 insertions(+), 1 deletion(-)
> >
> > diff --git a/src/compiler/spirv/vtn_variables.c
> b/src/compiler/spirv/vtn_variables.c
> > index 6a8776b..87cb935 100644
> > --- a/src/compiler/spirv/vtn_variables.c
> > +++ b/src/compiler/spirv/vtn_variables.c
> > @@ -1121,6 +1121,10 @@ vtn_get_builtin_location(struct vtn_builder *b,
> >*location = FRAG_RESULT_DEPTH;
> >assert(*mode == nir_var_shader_out);
> >break;
> > +   case SpvBuiltInHelperInvocation:
> > +  *location = SYSTEM_VALUE_HELPER_INVOCATION;
> > +  set_mode_system_value(mode);
> > +  break;
> > case SpvBuiltInNumWorkgroups:
> >*location = SYSTEM_VALUE_NUM_WORK_GROUPS;
> >set_mode_system_value(mode);
> > @@ -1177,7 +1181,6 @@ vtn_get_builtin_location(struct vtn_builder *b,
> >*location = SYSTEM_VALUE_VIEW_INDEX;
> >set_mode_system_value(mode);
> >break;
> > -   case SpvBuiltInHelperInvocation:
> > default:
> >unreachable("unsupported builtin");
> > }
> >
>
>


Re: [Mesa-dev] [PATCH] Android: Fix LLVM duplicated symbols linking for N and M

2017-08-23 Thread Rob Herring
On Wed, Aug 23, 2017 at 12:31 PM, Emil Velikov  wrote:
> On 23 August 2017 at 17:50, Rob Herring  wrote:
>> On Sun, Aug 20, 2017 at 2:57 PM, Rob Herring  wrote:
>>> On Fri, Aug 18, 2017 at 8:53 PM, Chih-Wei Huang  
>>> wrote:
>>>> 2017-08-19 8:27 GMT+08:00 Emil Velikov :
>>>>> On 18 August 2017 at 20:46, Rob Herring  wrote:
>>>>>> Both statically linking libLLVMCore and dynamically linking libLLVM
>>>>>> causes
>>>>>> duplicated symbols in gallium_dri.so and it fails to dlopen. We don't
>>>>>> really need to link libLLVMCore, but just need generated headers to be
>>>>>> built first. Dynamically linking to libLLVM instead is enough to do
>>>>>> that. Thanks to Qiang Yu for finding the root cause.
>>>>>>
>>>>>> [...]
>>>>>>
>>>>>>    $(if $(filter 7,$(MESA_ANDROID_MAJOR_VERSION)), \
>>>>>> -$(eval LOCAL_CFLAGS += -DHAVE_LLVM=0x0308
>>>>>> -DMESA_LLVM_VERSION_PATCH=0) \
>>>>>> -$(eval LOCAL_STATIC_LIBRARIES += libLLVMCore) \
>>>>>> -$(eval LOCAL_C_INCLUDES += external/llvm/include
>>>>>> external/llvm/device/include),) \
>>>>>> +$(eval LOCAL_CFLAGS += -DHAVE_LLVM=0x0308
>>>>>> -DMESA_LLVM_VERSION_PATCH=0),) \
>>>>>>    $(if $(filter O,$(MESA_ANDROID_MAJOR_VERSION)), \
>>>>>> -$(eval LOCAL_CFLAGS += -DHAVE_LLVM=0x0309
>>>>>> -DMESA_LLVM_VERSION_PATCH=0) \
>>>>>> -$(eval LOCAL_HEADER_LIBRARIES += llvm-headers),)
>>>>>> +$(eval LOCAL_CFLAGS += -DHAVE_LLVM=0x0309
>>>>>> -DMESA_LLVM_VERSION_PATCH=0),) \
>>>>>> +  $(eval LOCAL_SHARED_LIBRARIES += libLLVM)
>>>>> Am I the only person getting tad confused by amount of brackets?
>>>>> As mentioned by Chih-Wei - a shell switch is not possible, but how
>>>>> about a test vague like the following?
>>>>>
>>>>> test "x$(MESA_ANDROID_MAJOR_VERSION)" = "xO" &&
>>>>>    $(eval LOCAL_CFLAGS += -DHAVE_LLVM=0x0309 -DMESA_LLVM_VERSION_PATCH=0)
>>>>
>>>> Only possible if you put it into $(shell ...)
>>>> That gives me an idea. Maybe we ca do like
>>>>
>>>> $(shell case "$(MESA_ANDROID_MAJOR_VERSION)" in \
>>>> 6) echo ... ;; \
>>>> 7) echo ... ;; \
>>>> *)  echo ... ;; \
>>>> esac)
>>>>
>>>> I haven't really try it yet.
>>>
>>> What does either really buy us? It's really just bike shedding and
>>> unrelated to fixing the problem at hand.
>>>
>>> I have another idea which is to use llvm-config and avoid the
>>> conditionals altogether. I haven't looked into that closely though.
>>
>> Well, the build is broken again because the version changed from O to
>> 8 (and I'm not sure if master is going to change to P or 9 at some
>> point). So I went ahead and have this all coded up like this (I don't
>> see a simple way to build and run llvm-config):
>>
> Yay :-(
>
>>   $(eval $(shell sed -n -e
>> 's/.*\(LLVM_VERSION_MAJOR\).*\([0-9].*\)/\1:=\2/p'
>> external/llvm/device/include/llvm/Config/llvm-config.h)) \
>>   $(eval $(shell sed -n -e
>> 's/.*\(LLVM_VERSION_MINOR\).*\([0-9].*\)/\1:=\2/p'
>> external/llvm/device/include/llvm/Config/llvm-config.h)) \
>>   $(eval LOCAL_CFLAGS +=
>> -DHAVE_LLVM=0x$(LLVM_VERSION_MAJOR)0$(LLVM_VERSION_MINOR))
>>
>> Only one slight problem in that for master/O it reports 3.8 as the
>> version is 3.8.275480 which I think is the SVN version number. Not
>> sure what to do with that...
>>
> Indeed, seems like a SVN version. Not sure how much to care about the
> PATCH version.
> Leave it as 0 or use the SVN one - your call.

Okay, I was a bit vague. The problem is the version is effectively 3.9
and the build breaks if we build with HAVE_LLVM=0x308 instead.

I don't think it would work on older versions either. N has 3.8.256229
and that is 3.8 (from mesa perspective). The M version was 3.6.svn,
but we pass 3.7 to mesa. So there's not really a programmatic way to
handle this.

Rob
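
For reference, the mismatch can be reproduced without the build system. The following is only a sketch in C, assuming HAVE_LLVM packs major and minor as the values in this thread read (3.8 -> 0x0308); the function names are hypothetical and this is not what the sed snippet or Mesa's build actually runs.

```c
#include <assert.h>
#include <stdio.h>

/* Pack major/minor the way the flags in this thread read: 3.8 -> 0x0308.
 * Assumption: one byte each, which matches the "0x<major>0<minor>" text
 * concatenation only while both numbers are single digits. */
static unsigned
have_llvm(unsigned major, unsigned minor)
{
   return (major << 8) | minor;
}

/* Parse "MAJOR.MINOR[.anything]" and return the packed value, or 0 if
 * the string does not start with two dotted integers. */
static unsigned
have_llvm_from_string(const char *version)
{
   unsigned major, minor;

   if (sscanf(version, "%u.%u", &major, &minor) != 2)
      return 0;
   return have_llvm(major, minor);
}
```

Feeding it the reported strings gives 0x0308 for "3.8.275480" and 0x0306 for "3.6.svn", whereas the builds discussed above need 0x0309 and 0x0307 respectively. Parsing the header alone therefore cannot produce the right define; some per-release override would still be needed.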


Re: [Mesa-dev] [PATCH] vbo: fix glVertexAttrib(index=0)

2017-08-23 Thread Charmaine Lee

Looks good. Thanks.

Reviewed-by: Charmaine Lee 


From: Brian Paul 
Sent: Tuesday, August 22, 2017 1:21:56 PM
To: mesa-dev@lists.freedesktop.org
Cc: Charmaine Lee; Neha Bhende
Subject: [PATCH] vbo: fix glVertexAttrib(index=0)

Depending on which extension or GL spec you read, glVertexAttrib(index=0)
either sets the current value for generic attribute 0 or emits a vertex
just like glVertex().  I believe it should do either, depending on
context (see below).

The piglit gl-2.0-vertex-const-attr test declares two vertex attributes:
  attribute vec2 vertex;
  attribute vec4 attr;
and the GLSL linker assigns "vertex" to location 0 and "attr" to location 1.
The test passes.

But if the declarations were reversed such that "attr" was location 0 and
"vertex" was location 1, the test would fail to draw properly.

The problem is the call to glVertexAttrib(index=0) to set attr's value
was interpreted as glVertex() and did not set generic attribute[0]'s value.
Interestingly, calling glVertex() outside glBegin/End (which is effectively
what the piglit test does) does not generate a GL error.

I believe the behavior of glVertexAttrib(index=0) should depend on
whether it's called inside or outside of glBegin/glEnd().  If inside
glBegin/End(), it should act like glVertex().  Else, it should behave
like glVertexAttrib(index > 0).  This seems to be what NVIDIA does.

This patch makes two changes:

1. Check if we're inside glBegin/End for glVertexAttrib()
2. Fix the vertex array binding for recalculate_input_bindings().  As it was,
   we were using &vbo->currval[VBO_ATTRIB_POS], but that's interpreted
   as a zero-stride attribute and doesn't make sense for array drawing.

No Piglit regressions.  Fixes updated gl-2.0-vertex-const-attr test and
passes new gl-2.0-vertex-attrib-0 test.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101941
---
 src/mesa/vbo/vbo_attrib_tmp.h | 7 +--
 src/mesa/vbo/vbo_exec_array.c | 2 +-
 2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/src/mesa/vbo/vbo_attrib_tmp.h b/src/mesa/vbo/vbo_attrib_tmp.h
index 5718ac5..126e4ef 100644
--- a/src/mesa/vbo/vbo_attrib_tmp.h
+++ b/src/mesa/vbo/vbo_attrib_tmp.h
@@ -524,15 +524,18 @@ TAG(MultiTexCoord4fv)(GLenum target, const GLfloat * v)

 /**
  * If index=0, does glVertexAttrib*() alias glVertex() to emit a vertex?
+ * It depends on a few things, including whether we're inside or outside
+ * of glBegin/glEnd.
  */
 static inline bool
 is_vertex_position(const struct gl_context *ctx, GLuint index)
 {
-   return index == 0 && _mesa_attr_zero_aliases_vertex(ctx);
+   return (index == 0 &&
+   _mesa_attr_zero_aliases_vertex(ctx) &&
+   _mesa_inside_begin_end(ctx));
 }


-
 static void GLAPIENTRY
 TAG(VertexAttrib1fARB)(GLuint index, GLfloat x)
 {
diff --git a/src/mesa/vbo/vbo_exec_array.c b/src/mesa/vbo/vbo_exec_array.c
index edd55ce..e3421fa 100644
--- a/src/mesa/vbo/vbo_exec_array.c
+++ b/src/mesa/vbo/vbo_exec_array.c
@@ -356,7 +356,7 @@ recalculate_input_bindings(struct gl_context *ctx)
  else if (array[VERT_ATTRIB_POS].Enabled)
 inputs[0] = [VERT_ATTRIB_POS];
  else {
-inputs[0] = &vbo->currval[VBO_ATTRIB_POS];
+inputs[0] = &vbo->currval[VBO_ATTRIB_GENERIC0];
 const_inputs |= VERT_BIT_POS;
  }

--
1.9.1
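
The decision made by the new is_vertex_position() can be modelled on its own. A minimal sketch, assuming a hypothetical struct attr0_state whose two flags stand in for what _mesa_attr_zero_aliases_vertex() and _mesa_inside_begin_end() would report for a real context:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, reduced stand-in for the parts of gl_context that the
 * check depends on. */
struct attr0_state {
   bool attr_zero_aliases_vertex;  /* API/profile lets attribute 0 alias glVertex */
   bool inside_begin_end;          /* currently between glBegin and glEnd */
};

/* True when a glVertexAttrib*() call with this index should emit a
 * vertex; otherwise it only sets the current value of the generic
 * attribute. */
static bool
attrib_emits_vertex(const struct attr0_state *st, unsigned index)
{
   return index == 0 &&
          st->attr_zero_aliases_vertex &&
          st->inside_begin_end;
}
```

Outside glBegin/glEnd the function returns false even for index 0, which is the case the reversed-declaration variant of the piglit test exercises.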



[Mesa-dev] [PATCH] llvmpipe: lp_build_gather_elem_vec BE fix for 3x16 load

2017-08-23 Thread Ben Crocker
Fix loading of a 3x16 vector as a single 48-bit load
on big-endian systems (PPC64, S390).

Roland Scheidegger's commit e827d9175675aaa6cfc0b981e2a80685fb7b3a74
plus Ray Strode's patch reduce pre-Roland Piglit failures from ~4000 to
~2000.  This patch fixes three of the four regressions observed by Ray:

- draw-vertices
- draw-vertices-half-float
- draw-vertices-half-float_gles2

One regression remains:
- draw-vertices-2101010

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100613
Cc: "17.2" "17.1" 

Signed-off-by: Ben Crocker 
---
 src/gallium/auxiliary/gallivm/lp_bld_gather.c | 30 +--
 1 file changed, 28 insertions(+), 2 deletions(-)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_gather.c 
b/src/gallium/auxiliary/gallivm/lp_bld_gather.c
index ccd0376..7d11dcd 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_gather.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_gather.c
@@ -234,13 +234,39 @@ lp_build_gather_elem_vec(struct gallivm_state *gallivm,
   */
  res = LLVMBuildZExt(gallivm->builder, res, dst_elem_type, "");
 
- if (vector_justify) {
 #ifdef PIPE_ARCH_BIG_ENDIAN
+ if (vector_justify) {
  res = LLVMBuildShl(gallivm->builder, res,
 LLVMConstInt(dst_elem_type,
  dst_type.width - src_width, 0), "");
-#endif
  }
+ if (src_width == 48) {
+/* Load 3x16 bit vector.
+ * The sequence of loads on big-endian hardware proceeds as follows.
+ * 16-bit fields are denoted by X, Y, Z, and 0.  In memory, the sequence
+ * of three fields appears in the order X, Y, Z.
+ *
+ * Load 32-bit word: 0.0.X.Y
+ * Load 16-bit halfword: 0.0.0.Z
+ * Rotate left: 0.X.Y.0
+ * Bitwise OR: 0.X.Y.Z
+ *
+ * The order in which we need the fields in the result is 0.Z.Y.X,
+ * the same as on little-endian; permute 16-bit fields accordingly
+ * within 64-bit register:
+ */
+LLVMValueRef shuffles[4] = {
+   lp_build_const_int32(gallivm, 2),
+   lp_build_const_int32(gallivm, 1),
+   lp_build_const_int32(gallivm, 0),
+   lp_build_const_int32(gallivm, 3),
+};
+res = LLVMBuildBitCast(gallivm->builder, res,
+   lp_build_vec_type(gallivm, lp_type_uint_vec(16, 4*16)), "");
+res = LLVMBuildShuffleVector(gallivm->builder, res, res, LLVMConstVector(shuffles, 4), "");
+res = LLVMBuildBitCast(gallivm->builder, res, dst_elem_type, "");
+ }
+#endif
   }
}
return res;
-- 
2.7.4
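
The field arithmetic described in the patch comment can be checked with plain integers. This is only a sketch of the X/Y/Z bookkeeping, with hypothetical function names; it does not model the LLVM IR or how vector lanes are numbered on a big-endian target.

```c
#include <assert.h>
#include <stdint.h>

/* Follow the steps from the comment, writing 16-bit fields from most
 * to least significant within a 64-bit value. */
static uint64_t
gather_3x16(uint16_t x, uint16_t y, uint16_t z)
{
   uint64_t word = ((uint64_t)x << 16) | y;   /* 32-bit load:  0.0.X.Y */
   uint64_t half = z;                          /* 16-bit load:  0.0.0.Z */

   return (word << 16) | half;                 /* shift and OR: 0.X.Y.Z */
}

/* Reorder the 16-bit fields 0.X.Y.Z -> 0.Z.Y.X, the layout the comment
 * says is needed to match little-endian. */
static uint64_t
swap_outer_fields(uint64_t v)
{
   uint64_t x = (v >> 32) & 0xffff;
   uint64_t y = (v >> 16) & 0xffff;
   uint64_t z = v & 0xffff;

   return (z << 32) | (y << 16) | x;
}
```

With x=0x1111, y=0x2222, z=0x3333 the first function yields 0x0000111122223333 and the second turns that into 0x0000333322221111.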



[Mesa-dev] [PATCH] st/va: exclude the buffer reallocation for encode case

2017-08-23 Thread Leo Liu
The encoder only supports de-interlaced buffers, so skip the buffer
reallocation in the encode case.

Signed-off-by: Leo Liu 
---
 src/gallium/state_trackers/va/picture.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/gallium/state_trackers/va/picture.c 
b/src/gallium/state_trackers/va/picture.c
index b2be7af8c4..ea86ce1b3b 100644
--- a/src/gallium/state_trackers/va/picture.c
+++ b/src/gallium/state_trackers/va/picture.c
@@ -625,7 +625,8 @@ vlVaEndPicture(VADriverContextP ctx, VAContextID context_id)
 PIPE_VIDEO_ENTRYPOINT_BITSTREAM,
 PIPE_VIDEO_CAP_SUPPORTS_INTERLACED);
 
-   if (surf->buffer->interlaced != interlaced) {
+   if (context->decoder->entrypoint != PIPE_VIDEO_ENTRYPOINT_ENCODE &&
+   surf->buffer->interlaced != interlaced) {
   surf->templat.interlaced = screen->get_video_param(screen, context->decoder->profile,
 PIPE_VIDEO_ENTRYPOINT_BITSTREAM,
 PIPE_VIDEO_CAP_PREFERS_INTERLACED);
-- 
2.11.0



Re: [Mesa-dev] [PATCH] Android: Fix LLVM duplicated symbols linking for N and M

2017-08-23 Thread Emil Velikov
On 23 August 2017 at 17:50, Rob Herring  wrote:
> On Sun, Aug 20, 2017 at 2:57 PM, Rob Herring  wrote:
>> On Fri, Aug 18, 2017 at 8:53 PM, Chih-Wei Huang  
>> wrote:
>>> 2017-08-19 8:27 GMT+08:00 Emil Velikov :
>>>> On 18 August 2017 at 20:46, Rob Herring  wrote:
>>>>> Both statically linking libLLVMCore and dynamically linking libLLVM causes
>>>>> duplicated symbols in gallium_dri.so and it fails to dlopen. We don't
>>>>> really need to link libLLVMCore, but just need generated headers to be
>>>>> built first. Dynamically linking to libLLVM instead is enough to do
>>>>> that. Thanks to Qiang Yu for finding the root cause.
>>>>>
>>>>> [...]
>>>>>
>>>>>    $(if $(filter 7,$(MESA_ANDROID_MAJOR_VERSION)), \
>>>>> -$(eval LOCAL_CFLAGS += -DHAVE_LLVM=0x0308
>>>>> -DMESA_LLVM_VERSION_PATCH=0) \
>>>>> -$(eval LOCAL_STATIC_LIBRARIES += libLLVMCore) \
>>>>> -$(eval LOCAL_C_INCLUDES += external/llvm/include
>>>>> external/llvm/device/include),) \
>>>>> +$(eval LOCAL_CFLAGS += -DHAVE_LLVM=0x0308
>>>>> -DMESA_LLVM_VERSION_PATCH=0),) \
>>>>>    $(if $(filter O,$(MESA_ANDROID_MAJOR_VERSION)), \
>>>>> -$(eval LOCAL_CFLAGS += -DHAVE_LLVM=0x0309
>>>>> -DMESA_LLVM_VERSION_PATCH=0) \
>>>>> -$(eval LOCAL_HEADER_LIBRARIES += llvm-headers),)
>>>>> +$(eval LOCAL_CFLAGS += -DHAVE_LLVM=0x0309
>>>>> -DMESA_LLVM_VERSION_PATCH=0),) \
>>>>> +  $(eval LOCAL_SHARED_LIBRARIES += libLLVM)
>>>> Am I the only person getting tad confused by amount of brackets?
>>>> As mentioned by Chih-Wei - a shell switch is not possible, but how
>>>> about a test vague like the following?
>>>>
>>>> test "x$(MESA_ANDROID_MAJOR_VERSION)" = "xO" &&
>>>>    $(eval LOCAL_CFLAGS += -DHAVE_LLVM=0x0309 -DMESA_LLVM_VERSION_PATCH=0)
>>>
>>> Only possible if you put it into $(shell ...)
>>> That gives me an idea. Maybe we ca do like
>>>
>>> $(shell case "$(MESA_ANDROID_MAJOR_VERSION)" in \
>>> 6) echo ... ;; \
>>> 7) echo ... ;; \
>>> *)  echo ... ;; \
>>> esac)
>>>
>>> I haven't really try it yet.
>>
>> What does either really buy us? It's really just bike shedding and
>> unrelated to fixing the problem at hand.
>>
>> I have another idea which is to use llvm-config and avoid the
>> conditionals altogether. I haven't looked into that closely though.
>
> Well, the build is broken again because the version changed from O to
> 8 (and I'm not sure if master is going to change to P or 9 at some
> point). So I went ahead and have this all coded up like this (I don't
> see a simple way to build and run llvm-config):
>
Yay :-(

>   $(eval $(shell sed -n -e
> 's/.*\(LLVM_VERSION_MAJOR\).*\([0-9].*\)/\1:=\2/p'
> external/llvm/device/include/llvm/Config/llvm-config.h)) \
>   $(eval $(shell sed -n -e
> 's/.*\(LLVM_VERSION_MINOR\).*\([0-9].*\)/\1:=\2/p'
> external/llvm/device/include/llvm/Config/llvm-config.h)) \
>   $(eval LOCAL_CFLAGS +=
> -DHAVE_LLVM=0x$(LLVM_VERSION_MAJOR)0$(LLVM_VERSION_MINOR))
>
> Only one slight problem in that for master/O it reports 3.8 as the
> version is 3.8.275480 which I think is the SVN version number. Not
> sure what to do with that...
>
Indeed, seems like a SVN version. Not sure how much to care about the
PATCH version.
Leave it as 0 or use the SVN one - your call.

Ideally there will be an LLVM API to query that at runtime... but
that's topic for another day.

-Emil


[Mesa-dev] [PATCH 2/2] Android: EGL: fix missing nativewindow.h include on O

2017-08-23 Thread Rob Herring
The build against AOSP master and O is broken:

external/mesa3d/include/EGL/eglplatform.h:100:10: fatal error: 'android/native_window.h' file not found

native_window.h has moved and is now part of the libnativewindow library, so
add this dependency.

Signed-off-by: Rob Herring 
---
 src/egl/Android.mk | 4 
 1 file changed, 4 insertions(+)

diff --git a/src/egl/Android.mk b/src/egl/Android.mk
index 00553226773e..3852deb4364c 100644
--- a/src/egl/Android.mk
+++ b/src/egl/Android.mk
@@ -58,6 +58,10 @@ LOCAL_SHARED_LIBRARIES := \
libgralloc_drm \
libsync
 
+ifeq ($(filter $(MESA_ANDROID_MAJOR_VERSION),5 6 7),)
+LOCAL_SHARED_LIBRARIES += libnativewindow
+endif
+
 # This controls enabling building of driver libraries
 ifneq ($(HAVE_I915_DRI),)
 LOCAL_REQUIRED_MODULES += i915_dri
-- 
2.11.0



[Mesa-dev] [PATCH 1/2] Android: fix Android O version check for LLVM

2017-08-23 Thread Rob Herring
With the release of O, MESA_ANDROID_MAJOR_VERSION has changed to 8.
Change the LLVM check to match. There's no point in continuing to support
'O', as no one is going to use an old AOSP master.

Presumably, we'll be back here to fix things again for P (or 9).

Signed-off-by: Rob Herring 
---
 Android.mk | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Android.mk b/Android.mk
index 1fcde5f4d7fb..1fb584dd5eb0 100644
--- a/Android.mk
+++ b/Android.mk
@@ -96,7 +96,7 @@ define mesa-build-with-llvm
 $(eval LOCAL_CFLAGS += -DHAVE_LLVM=0x0307 -DMESA_LLVM_VERSION_PATCH=0)) \
   $(if $(filter 7,$(MESA_ANDROID_MAJOR_VERSION)), \
 $(eval LOCAL_CFLAGS += -DHAVE_LLVM=0x0308 -DMESA_LLVM_VERSION_PATCH=0)) \
-  $(if $(filter O,$(MESA_ANDROID_MAJOR_VERSION)), \
+  $(if $(filter 8,$(MESA_ANDROID_MAJOR_VERSION)), \
 $(eval LOCAL_CFLAGS += -DHAVE_LLVM=0x0309 -DMESA_LLVM_VERSION_PATCH=0)) \
   $(eval LOCAL_SHARED_LIBRARIES += libLLVM)
 endef
-- 
2.11.0



[Mesa-dev] New #intel-3d IRC channel on Freenode

2017-08-23 Thread Kenneth Graunke
Hello,

The Intel Mesa team would like to welcome you to a new public IRC channel
on Freenode: #intel-3d.  The topic is Mesa development for Intel GPUs, in
particular the "i965" OpenGL and "anv" Vulkan drivers.

The open source graphics community has grown a lot over the last few
years, and as a result, both #intel-gfx and #dri-devel have become fairly
busy, with lots of discussions about the kernel, display, and other
drivers.  This has been great to see, but a number of us felt that we'd
outgrown our current space, so to speak, and decided it was finally time
to set up a third channel, to make it easier to discuss more topics in
parallel.

We will of course still be on #intel-gfx (general Intel graphics) and
#dri-devel (the broader Mesa and DRI community), as always.  Our hope is
to complement the existing channels.

We look forward to seeing you on #intel-3d!

--Ken



Re: [Mesa-dev] [PATCH] spirv: Add support for the HelperInvocation builtin

2017-08-23 Thread Ian Romanick
Reviewed-by: Ian Romanick 

Did you submit a CTS bug?

On 08/21/2017 10:11 PM, Jason Ekstrand wrote:
> I have no idea how this got missed but it's been missing since forever.
> 
> Cc: mesa-sta...@lists.freedesktop.org
> ---
>  src/compiler/spirv/vtn_variables.c | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/src/compiler/spirv/vtn_variables.c 
> b/src/compiler/spirv/vtn_variables.c
> index 6a8776b..87cb935 100644
> --- a/src/compiler/spirv/vtn_variables.c
> +++ b/src/compiler/spirv/vtn_variables.c
> @@ -1121,6 +1121,10 @@ vtn_get_builtin_location(struct vtn_builder *b,
>*location = FRAG_RESULT_DEPTH;
>assert(*mode == nir_var_shader_out);
>break;
> +   case SpvBuiltInHelperInvocation:
> +  *location = SYSTEM_VALUE_HELPER_INVOCATION;
> +  set_mode_system_value(mode);
> +  break;
> case SpvBuiltInNumWorkgroups:
>*location = SYSTEM_VALUE_NUM_WORK_GROUPS;
>set_mode_system_value(mode);
> @@ -1177,7 +1181,6 @@ vtn_get_builtin_location(struct vtn_builder *b,
>*location = SYSTEM_VALUE_VIEW_INDEX;
>set_mode_system_value(mode);
>break;
> -   case SpvBuiltInHelperInvocation:
> default:
>unreachable("unsupported builtin");
> }
> 



[Mesa-dev] [PATCH] gallium/docs: Fix an inequality sign of TGSI_SEMANTIC_SUBGROUP_LT_MASK

2017-08-23 Thread Gwan-gyeong Mun
The previous expression was the same as the one for
TGSI_SEMANTIC_SUBGROUP_GT_MASK. Fix the direction of the inequality for
TGSI_SEMANTIC_SUBGROUP_LT_MASK.

before:
  bit index > TGSI_SEMANTIC_SUBGROUP_INVOCATION

after:
  bit index < TGSI_SEMANTIC_SUBGROUP_INVOCATION

Signed-off-by: Mun Gwan-gyeong 
---
 src/gallium/docs/source/tgsi.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/docs/source/tgsi.rst b/src/gallium/docs/source/tgsi.rst
index 31331ef511..0bd9964a98 100644
--- a/src/gallium/docs/source/tgsi.rst
+++ b/src/gallium/docs/source/tgsi.rst
@@ -3397,7 +3397,7 @@ A bit mask of ``bit index <= 
TGSI_SEMANTIC_SUBGROUP_INVOCATION``, i.e.
 TGSI_SEMANTIC_SUBGROUP_LT_MASK
 ""
 
-A bit mask of ``bit index > TGSI_SEMANTIC_SUBGROUP_INVOCATION``, i.e.
+A bit mask of ``bit index < TGSI_SEMANTIC_SUBGROUP_INVOCATION``, i.e.
 ``(1 << subgroup_invocation) - 1`` in arbitrary precision arithmetic.
 
 
-- 
2.14.1
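
To make the corrected inequality concrete, here is a minimal C sketch of
the LT, LE and GT masks. The helper names are made up for illustration and
are not Mesa or TGSI API, and the 64-bit width is an assumption standing in
for the "arbitrary precision arithmetic" wording in the docs.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers (not Mesa API): subgroup masks for invocation
 * index i, assuming a subgroup of at most 64 invocations. */

/* Bits with index < i, i.e. (1 << i) - 1. */
static uint64_t subgroup_lt_mask(unsigned i)
{
   assert(i < 64);
   return ((uint64_t)1 << i) - 1;
}

/* Bits with index <= i: the LT mask plus the invocation's own bit. */
static uint64_t subgroup_le_mask(unsigned i)
{
   assert(i < 64);
   return subgroup_lt_mask(i) | ((uint64_t)1 << i);
}

/* Bits with index > i: the complement of the LE mask. */
static uint64_t subgroup_gt_mask(unsigned i)
{
   return ~subgroup_le_mask(i);
}
```

For invocation 3 the LT mask is 0x7 and the LE mask is 0xf, so the LT and
GT masks never overlap and neither contains the invocation's own bit.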



Re: [Mesa-dev] [PATCH] Android: Fix LLVM duplicated symbols linking for N and M

2017-08-23 Thread Rob Herring
On Sun, Aug 20, 2017 at 2:57 PM, Rob Herring  wrote:
> On Fri, Aug 18, 2017 at 8:53 PM, Chih-Wei Huang  
> wrote:
>> 2017-08-19 8:27 GMT+08:00 Emil Velikov :
>>> On 18 August 2017 at 20:46, Rob Herring  wrote:
>>>> Both statically linking libLLVMCore and dynamically linking libLLVM causes
>>>> duplicated symbols in gallium_dri.so and it fails to dlopen. We don't
>>>> really need to link libLLVMCore, but just need generated headers to be
>>>> built first. Dynamically linking to libLLVM instead is enough to do
>>>> that. Thanks to Qiang Yu for finding the root cause.

[...]

$(if $(filter 7,$(MESA_ANDROID_MAJOR_VERSION)), \
 -$(eval LOCAL_CFLAGS += -DHAVE_LLVM=0x0308 
 -DMESA_LLVM_VERSION_PATCH=0) \
 -$(eval LOCAL_STATIC_LIBRARIES += libLLVMCore) \
 -$(eval LOCAL_C_INCLUDES += external/llvm/include 
 external/llvm/device/include),) \
 +$(eval LOCAL_CFLAGS += -DHAVE_LLVM=0x0308 
 -DMESA_LLVM_VERSION_PATCH=0),) \
$(if $(filter O,$(MESA_ANDROID_MAJOR_VERSION)), \
 -$(eval LOCAL_CFLAGS += -DHAVE_LLVM=0x0309 
 -DMESA_LLVM_VERSION_PATCH=0) \
 -$(eval LOCAL_HEADER_LIBRARIES += llvm-headers),)
 +$(eval LOCAL_CFLAGS += -DHAVE_LLVM=0x0309 
 -DMESA_LLVM_VERSION_PATCH=0),) \
 +  $(eval LOCAL_SHARED_LIBRARIES += libLLVM)
>>> Am I the only person getting a tad confused by the amount of brackets?
>>> As mentioned by Chih-Wei - a shell switch is not possible, but how
>>> about a test vaguely like the following?
>>>
>>> test "x$(MESA_ANDROID_MAJOR_VERSION)" = "xO" &&
>>>$(eval LOCAL_CFLAGS += -DHAVE_LLVM=0x0309 -DMESA_LLVM_VERSION_PATCH=0)
>>
>> Only possible if you put it into $(shell ...)
>> That gives me an idea. Maybe we can do something like
>>
>> $(shell case "$(MESA_ANDROID_MAJOR_VERSION)" in \
>> 6) echo ... ;; \
>> 7) echo ... ;; \
>> *)  echo ... ;; \
>> esac)
>>
>> I haven't really tried it yet.
>
> What does either really buy us? It's really just bike shedding and
> unrelated to fixing the problem at hand.
>
> I have another idea which is to use llvm-config and avoid the
> conditionals altogether. I haven't looked into that closely though.

Well, the build is broken again because the version changed from O to
8 (and I'm not sure if master is going to change to P or 9 at some
point). So I went ahead and have this all coded up like this (I don't
see a simple way to build and run llvm-config):

  $(eval $(shell sed -n -e
's/.*\(LLVM_VERSION_MAJOR\).*\([0-9].*\)/\1:=\2/p'
external/llvm/device/include/llvm/Config/llvm-config.h)) \
  $(eval $(shell sed -n -e
's/.*\(LLVM_VERSION_MINOR\).*\([0-9].*\)/\1:=\2/p'
external/llvm/device/include/llvm/Config/llvm-config.h)) \
  $(eval LOCAL_CFLAGS +=
-DHAVE_LLVM=0x$(LLVM_VERSION_MAJOR)0$(LLVM_VERSION_MINOR))

One slight problem: for master/O it reports 3.8, because the version is
3.8.275480, where I think the last component is the SVN revision number.
Not sure what to do with that...

Rob
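
For reference, the value the make snippet above builds by text
concatenation can be sketched in C as follows. This is only an
illustration: have_llvm_from_version() is a made-up helper, and it assumes
single-digit major and minor components (as with 3.8 and 3.9), in which
case "0x" MAJOR "0" MINOR equals (major << 8) | minor. Any third component,
such as the 275480 above, is simply ignored.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: parse "major.minor[.rest]" and build the 0x0M0m
 * value used for HAVE_LLVM. Returns -1 if the string does not start with
 * two dot-separated numbers. */
static int have_llvm_from_version(const char *version)
{
   unsigned major, minor;

   if (sscanf(version, "%u.%u", &major, &minor) != 2)
      return -1;

   /* Sketch assumption: one decimal digit per component, so the hex
    * digits match what the textual concatenation would produce. */
   assert(major < 10 && minor < 10);

   return (int)((major << 8) | minor);
}
```

With this, "3.8.275480" maps to 0x0308 and "3.9.0" maps to 0x0309.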


[Mesa-dev] [PATCH 04/10] st/glsl_to_tgsi: inline src_register into translate_src

2017-08-23 Thread Nicolai Hähnle
From: Nicolai Hähnle 

src_register has no meaningful standalone use, it only makes sense when
called from translate_src.
---
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 150 +++--
 1 file changed, 76 insertions(+), 74 deletions(-)

diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index f2aae4f5183..acb47abb40b 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -5734,126 +5734,128 @@ dst_register(struct st_translate *t, gl_register_file 
file, unsigned index,
 }
 
 /**
- * Map a glsl_to_tgsi src register to a TGSI ureg_src register.
+ * Create a TGSI ureg_dst register from an st_dst_reg.
+ */
+static struct ureg_dst
+translate_dst(struct st_translate *t,
+  const st_dst_reg *dst_reg,
+  bool saturate)
+{
+   struct ureg_dst dst = dst_register(t, dst_reg->file, dst_reg->index,
+  dst_reg->array_id);
+
+   if (dst.File == TGSI_FILE_NULL)
+  return dst;
+
+   dst = ureg_writemask(dst, dst_reg->writemask);
+
+   if (saturate)
+  dst = ureg_saturate(dst);
+
+   if (dst_reg->reladdr != NULL) {
+  assert(dst_reg->file != PROGRAM_TEMPORARY);
+  dst = ureg_dst_indirect(dst, ureg_src(t->address[0]));
+   }
+
+   if (dst_reg->has_index2) {
+  if (dst_reg->reladdr2)
+ dst = ureg_dst_dimension_indirect(dst, ureg_src(t->address[1]),
+   dst_reg->index2D);
+  else
+ dst = ureg_dst_dimension(dst, dst_reg->index2D);
+   }
+
+   return dst;
+}
+
+/**
+ * Create a TGSI ureg_src register from an st_src_reg.
  */
 static struct ureg_src
-src_register(struct st_translate *t, const st_src_reg *reg)
+translate_src(struct st_translate *t, const st_src_reg *src_reg)
 {
-   int index = reg->index;
-   int double_reg2 = reg->double_reg2 ? 1 : 0;
+   struct ureg_src src;
+   int index = src_reg->index;
+   int double_reg2 = src_reg->double_reg2 ? 1 : 0;
 
-   switch(reg->file) {
+   switch(src_reg->file) {
case PROGRAM_UNDEFINED:
-  return ureg_imm4f(t->ureg, 0, 0, 0, 0);
+  src = ureg_imm4f(t->ureg, 0, 0, 0, 0);
+  break;
 
case PROGRAM_TEMPORARY:
case PROGRAM_ARRAY:
-  return ureg_src(dst_register(t, reg->file, reg->index, reg->array_id));
+  src = ureg_src(dst_register(t, src_reg->file, src_reg->index, 
src_reg->array_id));
+  break;
 
case PROGRAM_OUTPUT: {
-  struct ureg_dst dst = dst_register(t, reg->file, reg->index, 
reg->array_id);
+  struct ureg_dst dst = dst_register(t, src_reg->file, src_reg->index, 
src_reg->array_id);
   assert(dst.WriteMask != 0);
   unsigned shift = ffs(dst.WriteMask) - 1;
-  return ureg_swizzle(ureg_src(dst),
-  shift,
-  MIN2(shift + 1, 3),
-  MIN2(shift + 2, 3),
-  MIN2(shift + 3, 3));
+  src = ureg_swizzle(ureg_src(dst),
+ shift,
+ MIN2(shift + 1, 3),
+ MIN2(shift + 2, 3),
+ MIN2(shift + 3, 3));
+  break;
}
 
case PROGRAM_UNIFORM:
-  assert(reg->index >= 0);
-  return reg->index < t->num_constants ?
-   t->constants[reg->index] : ureg_imm4f(t->ureg, 0, 0, 0, 0);
+  assert(src_reg->index >= 0);
+  src = src_reg->index < t->num_constants ?
+   t->constants[src_reg->index] : ureg_imm4f(t->ureg, 0, 0, 0, 0);
+  break;
case PROGRAM_STATE_VAR:
case PROGRAM_CONSTANT:   /* ie, immediate */
-  if (reg->has_index2)
- return ureg_src_register(TGSI_FILE_CONSTANT, reg->index);
+  if (src_reg->has_index2)
+ src = ureg_src_register(TGSI_FILE_CONSTANT, src_reg->index);
   else
- return reg->index >= 0 && reg->index < t->num_constants ?
-  t->constants[reg->index] : ureg_imm4f(t->ureg, 0, 0, 0, 0);
+ src = src_reg->index >= 0 && src_reg->index < t->num_constants ?
+  t->constants[src_reg->index] : ureg_imm4f(t->ureg, 0, 0, 0, 
0);
+  break;
 
case PROGRAM_IMMEDIATE:
-  assert(reg->index >= 0 && reg->index < t->num_immediates);
-  return t->immediates[reg->index];
+  assert(src_reg->index >= 0 && src_reg->index < t->num_immediates);
+  src = t->immediates[src_reg->index];
+  break;
 
case PROGRAM_INPUT:
   /* GLSL inputs are 64-bit containers, so we have to
* map back to the original index and add the offset after
* mapping. */
   index -= double_reg2;
-  if (!reg->array_id) {
+  if (!src_reg->array_id) {
  assert(t->inputMapping[index] < ARRAY_SIZE(t->inputs));
  assert(t->inputs[t->inputMapping[index]].File != TGSI_FILE_NULL);
- return t->inputs[t->inputMapping[index] + double_reg2];
+ src = t->inputs[t->inputMapping[index] + 

[Mesa-dev] [PATCH 10/10] radeonsi: add an assertion that only two-dimensional constant references are used

2017-08-23 Thread Nicolai Hähnle
From: Nicolai Hähnle 

---
 src/gallium/drivers/radeonsi/si_shader.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index f02fc9e9ba2..c445c49d2aa 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -1851,6 +1851,7 @@ static LLVMValueRef fetch_constant(
return lp_build_gather_values(&ctx->gallivm, values, 4);
}
 
+   assert(reg->Register.Dimension);
buf = reg->Register.Dimension ? reg->Dimension.Index : 0;
idx = reg->Register.Index * 4 + swizzle;
 
-- 
2.11.0



[Mesa-dev] [PATCH 09/10] gallium/radeon: always use two-dimensional constant references

2017-08-23 Thread Nicolai Hähnle
From: Nicolai Hähnle 

---
 src/gallium/drivers/radeon/r600_query.c | 36 -
 1 file changed, 18 insertions(+), 18 deletions(-)

diff --git a/src/gallium/drivers/radeon/r600_query.c 
b/src/gallium/drivers/radeon/r600_query.c
index ca048722672..eaff39c830d 100644
--- a/src/gallium/drivers/radeon/r600_query.c
+++ b/src/gallium/drivers/radeon/r600_query.c
@@ -1459,7 +1459,7 @@ static void r600_create_query_result_shader(struct 
r600_common_context *rctx)
"DCL BUFFER[0]\n"
"DCL BUFFER[1]\n"
"DCL BUFFER[2]\n"
-   "DCL CONST[0..1]\n"
+   "DCL CONST[0][0..1]\n"
"DCL TEMP[0..5]\n"
"IMM[0] UINT32 {0, 31, 2147483647, 4294967295}\n"
"IMM[1] UINT32 {1, 2, 4, 8}\n"
@@ -1467,10 +1467,10 @@ static void r600_create_query_result_shader(struct 
r600_common_context *rctx)
"IMM[3] UINT32 {100, 0, %u, 0}\n" /* for timestamp 
conversion */
"IMM[4] UINT32 {256, 0, 0, 0}\n"
 
-   "AND TEMP[5], CONST[0]., IMM[2].\n"
+   "AND TEMP[5], CONST[0][0]., IMM[2].\n"
"UIF TEMP[5]\n"
/* Check result availability. */
-   "LOAD TEMP[1].x, BUFFER[0], CONST[1].\n"
+   "LOAD TEMP[1].x, BUFFER[0], CONST[0][1].\n"
"ISHR TEMP[0].z, TEMP[1]., IMM[0].\n"
"MOV TEMP[1], TEMP[0].\n"
"NOT TEMP[0].z, TEMP[0].\n"
@@ -1482,7 +1482,7 @@ static void r600_create_query_result_shader(struct 
r600_common_context *rctx)
"ELSE\n"
/* Load previously accumulated result if requested. */
"MOV TEMP[0], IMM[0].\n"
-   "AND TEMP[4], CONST[0]., IMM[1].\n"
+   "AND TEMP[4], CONST[0][0]., IMM[1].\n"
"UIF TEMP[4]\n"
"LOAD TEMP[0].xyz, BUFFER[1], IMM[0].\n"
"ENDIF\n"
@@ -1495,13 +1495,13 @@ static void r600_create_query_result_shader(struct 
r600_common_context *rctx)
"ENDIF\n"
 
/* Break if result_index >= result_count. */
-   "USGE TEMP[5], TEMP[1]., CONST[0].\n"
+   "USGE TEMP[5], TEMP[1]., CONST[0][0].\n"
"UIF TEMP[5]\n"
"BRK\n"
"ENDIF\n"
 
/* Load fence and check result availability */
-   "UMAD TEMP[5].x, TEMP[1]., CONST[0]., 
CONST[1].\n"
+   "UMAD TEMP[5].x, TEMP[1]., 
CONST[0][0]., CONST[0][1].\n"
"LOAD TEMP[5].x, BUFFER[0], TEMP[5].\n"
"ISHR TEMP[0].z, TEMP[5]., IMM[0].\n"
"NOT TEMP[0].z, TEMP[0].\n"
@@ -1512,16 +1512,16 @@ static void r600_create_query_result_shader(struct 
r600_common_context *rctx)
"MOV TEMP[1].y, IMM[0].\n"
"BGNLOOP\n"
/* Load start and end. */
-   "UMUL TEMP[5].x, TEMP[1]., 
CONST[0].\n"
-   "UMAD TEMP[5].x, TEMP[1]., 
CONST[1]., TEMP[5].\n"
+   "UMUL TEMP[5].x, TEMP[1]., 
CONST[0][0].\n"
+   "UMAD TEMP[5].x, TEMP[1]., 
CONST[0][1]., TEMP[5].\n"
"LOAD TEMP[2].xy, BUFFER[0], 
TEMP[5].\n"
 
-   "UADD TEMP[5].y, TEMP[5]., 
CONST[0].\n"
+   "UADD TEMP[5].y, TEMP[5]., 
CONST[0][0].\n"
"LOAD TEMP[3].xy, BUFFER[0], 
TEMP[5].\n"
 
"U64ADD TEMP[4].xy, TEMP[3], -TEMP[2]\n"
 
-   "AND TEMP[5].z, CONST[0]., 
IMM[4].\n"
+   "AND TEMP[5].z, CONST[0][0]., 
IMM[4].\n"
"UIF TEMP[5].\n"
/* Load second start/end 
half-pair and
 * take the difference
@@ -1538,7 +1538,7 @@ static void r600_create_query_result_shader(struct 
r600_common_context *rctx)
 
/* Increment pair index */
"UADD TEMP[1].y, TEMP[1]., 
IMM[1].\n"
-   "USGE TEMP[5], 

[Mesa-dev] [PATCH 07/10] pp: always use two-dimensional constant references

2017-08-23 Thread Nicolai Hähnle
From: Nicolai Hähnle 

---
 src/gallium/auxiliary/postprocess/pp_mlaa.h | 20 ++--
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/src/gallium/auxiliary/postprocess/pp_mlaa.h 
b/src/gallium/auxiliary/postprocess/pp_mlaa.h
index 0b2c363e1c4..85c14a786a3 100644
--- a/src/gallium/auxiliary/postprocess/pp_mlaa.h
+++ b/src/gallium/auxiliary/postprocess/pp_mlaa.h
@@ -164,12 +164,12 @@ static const char offsetvs[] = "VERT\n"
"DCL OUT[1], GENERIC[0]\n"
"DCL OUT[2], GENERIC[10]\n"
"DCL OUT[3], GENERIC[11]\n"
-   "DCL CONST[0]\n"
+   "DCL CONST[0][0]\n"
"IMM FLT32 {1., 0.,-1., 0.}\n"
"  0: MOV OUT[0], IN[0]\n"
"  1: MOV OUT[1], IN[1]\n"
-   "  2: MAD OUT[2], CONST[0].xyxy, IMM[0].zyyz, IN[1].xyxy\n"
-   "  3: MAD OUT[3], CONST[0].xyxy, IMM[0].xyyx, IN[1].xyxy\n"
+   "  2: MAD OUT[2], CONST[0][0].xyxy, IMM[0].zyyz, IN[1].xyxy\n"
+   "  3: MAD OUT[3], CONST[0][0].xyxy, IMM[0].xyyx, IN[1].xyxy\n"
"  4: END\n";
 
 
@@ -183,7 +183,7 @@ static const char blend2fs_1[] = "FRAG\n"
"DCL SVIEW[1], 2D, FLOAT\n"
"DCL SAMP[2]\n"
"DCL SVIEW[2], 2D, FLOAT\n"
-   "DCL CONST[0]\n"
+   "DCL CONST[0][0]\n"
"DCL TEMP[0..6]\n"
"IMM FLT32 {0.,-0.2500, 0.00609756, 0.5000}\n"
"IMM FLT32 {   -1.5000,-2., 0.9000, 1.5000}\n"
@@ -204,7 +204,7 @@ static const char blend2fs_2[] =
" 11:   BRK\n"
" 12: ENDIF\n"
" 13: MOV TEMP[4].y, IMM[0].\n"
-   " 14: MAD TEMP[3].xyz, CONST[0].xyyy, TEMP[4].xyyy, TEMP[1].xyyy\n"
+   " 14: MAD TEMP[3].xyz, CONST[0][0].xyyy, TEMP[4].xyyy, TEMP[1].xyyy\n"
" 15: MOV TEMP[3].w, IMM[0].\n"
" 16: TXL TEMP[5], TEMP[3], SAMP[2], 2D\n"
" 17: MOV TEMP[3].x, TEMP[5].\n"
@@ -229,7 +229,7 @@ static const char blend2fs_2[] =
" 36:   BRK\n"
" 37: ENDIF\n"
" 38: MOV TEMP[5].y, IMM[0].\n"
-   " 39: MAD TEMP[4].xyz, CONST[0].xyyy, TEMP[5].xyyy, TEMP[3].xyyy\n"
+   " 39: MAD TEMP[4].xyz, CONST[0][0].xyyy, TEMP[5].xyyy, TEMP[3].xyyy\n"
" 40: MOV TEMP[4].w, IMM[0].\n"
" 41: TXL TEMP[6].xy, TEMP[4], SAMP[2], 2D\n"
" 42: MOV TEMP[4].x, TEMP[6].\n"
@@ -250,7 +250,7 @@ static const char blend2fs_2[] =
" 57:   MOV TEMP[5].x, TEMP[1].\n"
" 58:   ADD TEMP[1].x, TEMP[4]., IMM[2].\n"
" 59:   MOV TEMP[5].z, TEMP[1].\n"
-   " 60:   MAD TEMP[1], TEMP[5], CONST[0].xyxy, IN[0].xyxy\n"
+   " 60:   MAD TEMP[1], TEMP[5], CONST[0][0].xyxy, IN[0].xyxy\n"
" 61:   MOV TEMP[4], TEMP[1].xyyy\n"
" 62:   MOV TEMP[4].w, IMM[0].\n"
" 63:   TXL TEMP[5].x, TEMP[4], SAMP[2], 2D\n"
@@ -278,7 +278,7 @@ static const char blend2fs_2[] =
" 85:   BRK\n"
" 86: ENDIF\n"
" 87: MOV TEMP[3].y, IMM[0].\n"
-   " 88: MAD TEMP[5].xyz, CONST[0].xyyy, TEMP[3].yxxx, TEMP[1].xyyy\n"
+   " 88: MAD TEMP[5].xyz, CONST[0][0].xyyy, TEMP[3].yxxx, TEMP[1].xyyy\n"
" 89: MOV TEMP[5].w, IMM[0].\n"
" 90: TXL TEMP[4], TEMP[5], SAMP[2], 2D\n"
" 91: MOV TEMP[2].x, TEMP[4].\n"
@@ -303,7 +303,7 @@ static const char blend2fs_2[] =
"110:   BRK\n"
"111: ENDIF\n"
"112: MOV TEMP[4].y, IMM[0].\n"
-   "113: MAD TEMP[5].xyz, CONST[0].xyyy, TEMP[4].yxxx, TEMP[2].xyyy\n"
+   "113: MAD TEMP[5].xyz, CONST[0][0].xyyy, TEMP[4].yxxx, TEMP[2].xyyy\n"
"114: MOV TEMP[5].w, IMM[0].\n"
"115: TXL TEMP[6], TEMP[5], SAMP[2], 2D\n"
"116: MOV TEMP[3].x, TEMP[6].\n"
@@ -324,7 +324,7 @@ static const char blend2fs_2[] =
"131:   MOV TEMP[4].y, TEMP[1].\n"
"132:   ADD TEMP[1].x, TEMP[3]., IMM[2].\n"
"133:   MOV TEMP[4].w, TEMP[1].\n"
-   "134:   MAD TEMP[1], TEMP[4], CONST[0].xyxy, IN[0].xyxy\n"
+   "134:   MAD TEMP[1], TEMP[4], CONST[0][0].xyxy, IN[0].xyxy\n"
"135:   MOV TEMP[3], TEMP[1].xyyy\n"
"136:   MOV TEMP[3].w, IMM[0].\n"
"137:   TXL TEMP[4].y, TEMP[3], SAMP[2], 2D\n"
-- 
2.11.0



[Mesa-dev] [PATCH 02/10] tgsi/ureg: always emit constants (and their decls) as 2D

2017-08-23 Thread Nicolai Hähnle
From: Nicolai Hähnle 

---
 src/gallium/auxiliary/tgsi/tgsi_ureg.c | 22 +++---
 1 file changed, 7 insertions(+), 15 deletions(-)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_ureg.c 
b/src/gallium/auxiliary/tgsi/tgsi_ureg.c
index ca31bc4a75a..b26434ccbde 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_ureg.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_ureg.c
@@ -180,8 +180,7 @@ struct ureg_program
unsigned array_temps[UREG_MAX_ARRAY_TEMPS];
unsigned nr_array_temps;
 
-   struct const_decl const_decls;
-   struct const_decl const_decls2D[PIPE_MAX_CONSTANT_BUFFERS];
+   struct const_decl const_decls[PIPE_MAX_CONSTANT_BUFFERS];
 
unsigned properties[TGSI_PROPERTY_COUNT];
 
@@ -507,7 +506,7 @@ ureg_DECL_constant2D(struct ureg_program *ureg,
  unsigned last,
  unsigned index2D)
 {
-   struct const_decl *decl = &ureg->const_decls2D[index2D];
+   struct const_decl *decl = &ureg->const_decls[index2D];
 
assert(index2D < PIPE_MAX_CONSTANT_BUFFERS);
 
@@ -529,7 +528,7 @@ struct ureg_src
 ureg_DECL_constant(struct ureg_program *ureg,
unsigned index)
 {
-   struct const_decl *decl = &ureg->const_decls;
+   struct const_decl *decl = &ureg->const_decls[0];
unsigned minconst = index, maxconst = index;
unsigned i;
 
@@ -579,7 +578,9 @@ out:
assert(i < decl->nr_constant_ranges);
assert(decl->constant_range[i].first <= index);
assert(decl->constant_range[i].last >= index);
-   return ureg_src_register(TGSI_FILE_CONSTANT, index);
+
+   struct ureg_src src = ureg_src_register(TGSI_FILE_CONSTANT, index);
+   return ureg_src_dimension(src, 0);
 }
 
 static struct ureg_dst alloc_temporary( struct ureg_program *ureg,
@@ -1891,17 +1892,8 @@ static void emit_decls( struct ureg_program *ureg )
  emit_decl_memory(ureg, i);
}
 
-   if (ureg->const_decls.nr_constant_ranges) {
-  for (i = 0; i < ureg->const_decls.nr_constant_ranges; i++) {
- emit_decl_range(ureg,
- TGSI_FILE_CONSTANT,
- ureg->const_decls.constant_range[i].first,
- ureg->const_decls.constant_range[i].last - 
ureg->const_decls.constant_range[i].first + 1);
-  }
-   }
-
for (i = 0; i < PIPE_MAX_CONSTANT_BUFFERS; i++) {
-  struct const_decl *decl = &ureg->const_decls2D[i];
+  struct const_decl *decl = &ureg->const_decls[i];
 
   if (decl->nr_constant_ranges) {
  uint j;
-- 
2.11.0
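
The effect of the ureg_DECL_constant() change can be sketched outside of
Mesa roughly like this. The struct and function names below are
hypothetical simplifications, not the real ureg types; the point is only
that a one-dimensional constant reference becomes a two-dimensional one
with constbuf index 0, while an existing 2D reference is left alone.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified description of a CONST source operand. */
struct const_ref {
   int index;            /* vector index inside the constant buffer */
   bool has_dimension;   /* true if a constbuf index is present */
   int dimension;        /* constbuf index, valid if has_dimension */
};

/* Old-style one-dimensional reference: CONST[index]. */
static struct const_ref const_ref_1d(int index)
{
   struct const_ref ref = { index, false, 0 };
   assert(index >= 0);
   return ref;
}

/* Normalize to CONST[0][index] unless a constbuf index is already set. */
static struct const_ref const_ref_normalize(struct const_ref ref)
{
   if (!ref.has_dimension) {
      ref.has_dimension = true;
      ref.dimension = 0;
   }
   return ref;
}
```

So CONST[3] turns into CONST[0][3], and something like CONST[2][1] passes
through unchanged.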



[Mesa-dev] [PATCH 08/10] gallium/tests: always use two-dimensional constant references

2017-08-23 Thread Nicolai Hähnle
From: Nicolai Hähnle 

---
 src/gallium/auxiliary/util/u_tests.c | 4 ++--
 src/gallium/tests/graw/fragment-shader/frag-cb-1d.sh | 8 
 src/gallium/tests/graw/vertex-shader/vert-cb-1d.sh   | 8 
 3 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/src/gallium/auxiliary/util/u_tests.c 
b/src/gallium/auxiliary/util/u_tests.c
index 7ec8eef65fc..60e77b2e3bc 100644
--- a/src/gallium/auxiliary/util/u_tests.c
+++ b/src/gallium/auxiliary/util/u_tests.c
@@ -418,10 +418,10 @@ util_test_constant_buffer(struct pipe_context *ctx,
{
   static const char *text = /* I don't like ureg... */
 "FRAG\n"
-"DCL CONST[0]\n"
+"DCL CONST[0][0]\n"
 "DCL OUT[0], COLOR\n"
 
-"MOV OUT[0], CONST[0]\n"
+"MOV OUT[0], CONST[0][0]\n"
 "END\n";
   struct tgsi_token tokens[1000];
   struct pipe_shader_state state;
diff --git a/src/gallium/tests/graw/fragment-shader/frag-cb-1d.sh 
b/src/gallium/tests/graw/fragment-shader/frag-cb-1d.sh
index 85fb9ea4e7f..097774336f7 100644
--- a/src/gallium/tests/graw/fragment-shader/frag-cb-1d.sh
+++ b/src/gallium/tests/graw/fragment-shader/frag-cb-1d.sh
@@ -2,12 +2,12 @@ FRAG
 
 DCL IN[0], COLOR, LINEAR
 DCL OUT[0], COLOR
-DCL CONST[1]
-DCL CONST[3]
+DCL CONST[0][1]
+DCL CONST[0][3]
 DCL TEMP[0..1]
 
-ADD TEMP[0], IN[0], CONST[1]
-RCP TEMP[1], CONST[3].
+ADD TEMP[0], IN[0], CONST[0][1]
+RCP TEMP[1], CONST[0][3].
 MUL OUT[0], TEMP[0], TEMP[1]
 
 END
diff --git a/src/gallium/tests/graw/vertex-shader/vert-cb-1d.sh 
b/src/gallium/tests/graw/vertex-shader/vert-cb-1d.sh
index e227917fd3b..0b05ca8b677 100644
--- a/src/gallium/tests/graw/vertex-shader/vert-cb-1d.sh
+++ b/src/gallium/tests/graw/vertex-shader/vert-cb-1d.sh
@@ -4,13 +4,13 @@ DCL IN[0]
 DCL IN[1]
 DCL OUT[0], POSITION
 DCL OUT[1], COLOR
-DCL CONST[1]
-DCL CONST[3]
+DCL CONST[0][1]
+DCL CONST[0][3]
 DCL TEMP[0..1]
 
 MOV OUT[0], IN[0]
-ADD TEMP[0], IN[1], CONST[1]
-RCP TEMP[1], CONST[3].
+ADD TEMP[0], IN[1], CONST[0][1]
+RCP TEMP[1], CONST[0][3].
 MUL OUT[1], TEMP[0], TEMP[1]
 
 END
-- 
2.11.0



[Mesa-dev] [PATCH 05/10] nine: always generate two-dimensional constant file accesses

2017-08-23 Thread Nicolai Hähnle
From: Nicolai Hähnle 

---
 src/gallium/state_trackers/nine/nine_ff.c |  2 +-
 src/gallium/state_trackers/nine/nine_shader.c | 10 --
 2 files changed, 5 insertions(+), 7 deletions(-)

diff --git a/src/gallium/state_trackers/nine/nine_ff.c 
b/src/gallium/state_trackers/nine/nine_ff.c
index 2175bdbcc51..39fcb8b1591 100644
--- a/src/gallium/state_trackers/nine/nine_ff.c
+++ b/src/gallium/state_trackers/nine/nine_ff.c
@@ -483,7 +483,7 @@ nine_ff_build_vs(struct NineDevice9 *device, struct 
vs_build_ctx *vs)
 
 for (i = 0; i < key->vertexblend; ++i) {
 for (c = 0; c < 4; ++c) {
-cWM[c] = ureg_src_register(TGSI_FILE_CONSTANT, (160 + i * 4) * 
!key->vertexblend_indexed + c);
+cWM[c] = 
ureg_src_dimension(ureg_src_register(TGSI_FILE_CONSTANT, (160 + i * 4) * 
!key->vertexblend_indexed + c), 0);
 if (key->vertexblend_indexed)
 cWM[c] = ureg_src_indirect(cWM[c], 
ureg_scalar(ureg_src(AR), i));
 }
diff --git a/src/gallium/state_trackers/nine/nine_shader.c 
b/src/gallium/state_trackers/nine/nine_shader.c
index 5b60dcbac8f..cc667ebfbcd 100644
--- a/src/gallium/state_trackers/nine/nine_shader.c
+++ b/src/gallium/state_trackers/nine/nine_shader.c
@@ -91,7 +91,7 @@ static inline const char *d3dsio_to_string(unsigned opcode);
TGSI_SWIZZLE_##x, TGSI_SWIZZLE_##y, TGSI_SWIZZLE_##z, TGSI_SWIZZLE_##w
 
 #define NINE_CONSTANT_SRC(index) \
-   ureg_src_register(TGSI_FILE_CONSTANT, index)
+   ureg_src_dimension(ureg_src_register(TGSI_FILE_CONSTANT, index), 0)
 
 #define NINE_APPLY_SWIZZLE(src, s) \
ureg_swizzle(src, NINE_SWIZZLE4(s, s, s, s))
@@ -1009,7 +1009,7 @@ tx_src_param(struct shader_translator *tx, const struct 
sm1_src_param *param)
 src = ureg_src_dimension(src, 0);
 }
 } else
-src = ureg_src_register(TGSI_FILE_CONSTANT, param->idx);
+src = NINE_CONSTANT_SRC(param->idx);
 }
 if (!IS_VS && tx->version.major < 2) {
 /* ps 1.X clamps constants */
@@ -1035,8 +1035,7 @@ tx_src_param(struct shader_translator *tx, const struct 
sm1_src_param *param)
 src = ureg_src_register(TGSI_FILE_CONSTANT, param->idx);
 src = ureg_src_dimension(src, 2);
 } else
-src = ureg_src_register(TGSI_FILE_CONSTANT,
-tx->info->const_i_base + param->idx);
+src = NINE_CONSTANT_SRC(tx->info->const_i_base + param->idx);
 }
 break;
 case D3DSPR_CONSTBOOL:
@@ -1049,8 +1048,7 @@ tx_src_param(struct shader_translator *tx, const struct 
sm1_src_param *param)
src = ureg_src_register(TGSI_FILE_CONSTANT, r);
src = ureg_src_dimension(src, 3);
} else
-   src = ureg_src_register(TGSI_FILE_CONSTANT,
-   tx->info->const_b_base + r);
+   src = NINE_CONSTANT_SRC(tx->info->const_b_base + r);
src = ureg_swizzle(src, s, s, s, s);
 }
 break;
-- 
2.11.0



[Mesa-dev] [PATCH 06/10] gallium/hud: always use two-dimensional constant references

2017-08-23 Thread Nicolai Hähnle
From: Nicolai Hähnle 

---
 src/gallium/auxiliary/hud/hud_context.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/gallium/auxiliary/hud/hud_context.c 
b/src/gallium/auxiliary/hud/hud_context.c
index 2deb48d18e7..ed2e8491143 100644
--- a/src/gallium/auxiliary/hud/hud_context.c
+++ b/src/gallium/auxiliary/hud/hud_context.c
@@ -1618,17 +1618,17 @@ hud_create(struct pipe_context *pipe, struct 
cso_context *cso)
  /* [0] = color,
   * [1] = (2/fb_width, 2/fb_height, xoffset, yoffset)
   * [2] = (xscale, yscale, 0, 0) */
- "DCL CONST[0..2]\n"
+ "DCL CONST[0][0..2]\n"
  "DCL TEMP[0]\n"
  "IMM[0] FLT32 { -1, 0, 0, 1 }\n"
 
  /* v = in * (xscale, yscale) + (xoffset, yoffset) */
- "MAD TEMP[0].xy, IN[0], CONST[2].xyyy, CONST[1].zwww\n"
+ "MAD TEMP[0].xy, IN[0], CONST[0][2].xyyy, CONST[0][1].zwww\n"
  /* pos = v * (2 / fb_width, 2 / fb_height) - (1, 1) */
- "MAD OUT[0].xy, TEMP[0], CONST[1].xyyy, IMM[0].\n"
+ "MAD OUT[0].xy, TEMP[0], CONST[0][1].xyyy, IMM[0].\n"
  "MOV OUT[0].zw, IMM[0]\n"
 
- "MOV OUT[1], CONST[0]\n"
+ "MOV OUT[1], CONST[0][0]\n"
  "MOV OUT[2], IN[1]\n"
  "END\n"
   };
-- 
2.11.0



[Mesa-dev] [PATCH 01/10] gallium: all drivers should accept two-dimensional constant buffer indexing

2017-08-23 Thread Nicolai Hähnle
From: Nicolai Hähnle 

Most older drivers seem to just ignore the Dimension setting, so virtually
no changes should be needed.
---
 src/gallium/auxiliary/nir/tgsi_to_nir.c |  2 +-
 src/gallium/docs/source/screen.rst  | 11 +++
 2 files changed, 4 insertions(+), 9 deletions(-)

diff --git a/src/gallium/auxiliary/nir/tgsi_to_nir.c 
b/src/gallium/auxiliary/nir/tgsi_to_nir.c
index 733eca0764b..aa715dcae2d 100644
--- a/src/gallium/auxiliary/nir/tgsi_to_nir.c
+++ b/src/gallium/auxiliary/nir/tgsi_to_nir.c
@@ -624,7 +624,7 @@ ttn_src_for_file_and_index(struct ttn_compile *c, unsigned 
file, unsigned index,
  assert(!dim);
  break;
   case TGSI_FILE_CONSTANT:
- if (dim) {
+ if (dim && (dim->Index > 0 || dim->Indirect)) {
 op = nir_intrinsic_load_ubo;
  } else {
 op = nir_intrinsic_load_uniform;
diff --git a/src/gallium/docs/source/screen.rst 
b/src/gallium/docs/source/screen.rst
index be14ddd0c0d..93d94a48e6b 100644
--- a/src/gallium/docs/source/screen.rst
+++ b/src/gallium/docs/source/screen.rst
@@ -446,21 +446,16 @@ support different features.
 * ``PIPE_SHADER_CAP_MAX_CONST_BUFFER_SIZE``: The maximum size per constant 
buffer in bytes.
 * ``PIPE_SHADER_CAP_MAX_CONST_BUFFERS``: Maximum number of constant buffers 
that can be bound
   to any shader stage using ``set_constant_buffer``. If 0 or 1, the pipe will
-  only permit binding one constant buffer per shader, and the shaders will
-  not permit two-dimensional access to constants.
+  only permit binding one constant buffer per shader.
 
 If a value greater than 0 is returned, the driver can have multiple
-constant buffers bound to shader stages. The CONST register file can
-be accessed with two-dimensional indices, like in the example below.
+constant buffers bound to shader stages. The CONST register file is
+accessed with two-dimensional indices, like in the example below.
 
 DCL CONST[0][0..7]   # declare first 8 vectors of constbuf 0
 DCL CONST[3][0]  # declare first vector of constbuf 3
 MOV OUT[0], CONST[0][3]  # copy vector 3 of constbuf 0
 
-For backwards compatibility, one-dimensional access to CONST register
-file is still supported. In that case, the constbuf index is assumed
-to be 0.
-
 * ``PIPE_SHADER_CAP_MAX_TEMPS``: The maximum number of temporary registers.
 * ``PIPE_SHADER_CAP_TGSI_CONT_SUPPORTED``: Whether the continue opcode is 
supported.
 * ``PIPE_SHADER_CAP_INDIRECT_INPUT_ADDR``: Whether indirect addressing
-- 
2.11.0
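
The tgsi_to_nir hunk above boils down to a small decision, sketched here
with stand-in types. struct dim_ref, enum const_load and pick_const_load()
are hypothetical names, not the real TGSI or NIR definitions: a constant
access is only treated as a UBO load when the constbuf index is non-zero
or indirect, so a 2D reference to buffer 0 still takes the plain uniform
path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the dimension part of a TGSI source. */
struct dim_ref {
   int index;       /* constant buffer index */
   bool indirect;   /* buffer index comes from an address register */
};

enum const_load {
   CONST_LOAD_UNIFORM,   /* plain uniform load (constbuf 0, direct) */
   CONST_LOAD_UBO        /* UBO load (any other constbuf, or indirect) */
};

/* dim may be NULL for an old-style one-dimensional reference. */
static enum const_load pick_const_load(const struct dim_ref *dim)
{
   assert(dim == NULL || dim->index >= 0);

   if (dim && (dim->index > 0 || dim->indirect))
      return CONST_LOAD_UBO;

   return CONST_LOAD_UNIFORM;
}
```

With this, CONST[3] and CONST[0][3] both select the uniform path, while
CONST[1][3] or an indirectly indexed buffer selects the UBO path.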



[Mesa-dev] [PATCH 03/10] st/glsl_to_tgsi: ir_load_ubo always has a second index

2017-08-23 Thread Nicolai Hähnle
From: Nicolai Hähnle 

---
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index 9f021962e40..f2aae4f5183 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -2186,14 +2186,13 @@ glsl_to_tgsi_visitor::visit_expression(ir_expression* 
ir, st_src_reg *op)
   if (const_uniform_block) {
  /* Constant constant buffer */
  cbuf.reladdr2 = NULL;
- cbuf.has_index2 = true;
   }
   else {
  /* Relative/variable constant buffer */
  cbuf.reladdr2 = ralloc(mem_ctx, st_src_reg);
 memcpy(cbuf.reladdr2, &op[0], sizeof(st_src_reg));
- cbuf.has_index2 = true;
   }
+  cbuf.has_index2 = true;
 
   cbuf.swizzle = swizzle_for_size(ir->type->vector_elements);
   if (glsl_base_type_is_64bit(cbuf.type))
-- 
2.11.0



[Mesa-dev] [PATCH 00/10] gallium: normalize CONST file accesses to 2D

2017-08-23 Thread Nicolai Hähnle
Hi all,

Following the discussion on Timothy's std430 packing series, here's
a quick proposal to just always use 2D accesses to the CONST file
in TGSI.

The first patch should be sufficient for all drivers to accept
those 2D accesses. It seems that most older drivers simply ignore
the dimension, and newer ones should handle it directly.

Subsequent patches modify the producers of TGSI to always use 2D
constant references. This is mostly done by changing ureg.

Finally, the last patch adds an assertion to radeonsi to make
sure all constant references are really 2D. It has survived my
very superficial initial testing.

What needs to be tested is:
- some more drivers
- Nine
- TGSI-to-NIR

You can find the series here: 
https://cgit.freedesktop.org/~nh/mesa/log/?h=tgsi-const-2d

Please comment/review!
Thanks,
Nicolai
--
 src/gallium/auxiliary/hud/hud_context.c  |   8 +-
 src/gallium/auxiliary/nir/tgsi_to_nir.c  |   2 +-
 src/gallium/auxiliary/postprocess/pp_mlaa.h  |  20 +--
 src/gallium/auxiliary/tgsi/tgsi_ureg.c   |  22 +--
 src/gallium/auxiliary/util/u_tests.c |   4 +-
 src/gallium/docs/source/screen.rst   |  11 +-
 src/gallium/drivers/radeon/r600_query.c  |  36 ++--
 src/gallium/drivers/radeonsi/si_shader.c |   1 +
 src/gallium/state_trackers/nine/nine_ff.c|   2 +-
 .../state_trackers/nine/nine_shader.c|  10 +-
 .../tests/graw/fragment-shader/frag-cb-1d.sh |   8 +-
 .../tests/graw/vertex-shader/vert-cb-1d.sh   |   8 +-
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp   | 153 +
 13 files changed, 136 insertions(+), 149 deletions(-)



Re: [Mesa-dev] [PATCH] swr: limit pipe_draw_info->restart_index usage

2017-08-23 Thread Cherniak, Bruce
Reviewed-by: Bruce Cherniak  

> On Aug 23, 2017, at 11:19 AM, Tim Rowley  wrote:
> 
> Only copy this value when in restart drawing mode.
> 
> Eliminates valgrind errors when running trivial programs.
> ---
> src/gallium/drivers/swr/swr_draw.cpp | 5 -
> 1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/src/gallium/drivers/swr/swr_draw.cpp 
> b/src/gallium/drivers/swr/swr_draw.cpp
> index df1c11a..2363800 100644
> --- a/src/gallium/drivers/swr/swr_draw.cpp
> +++ b/src/gallium/drivers/swr/swr_draw.cpp
> @@ -107,7 +107,10 @@ swr_draw_vbo(struct pipe_context *pipe, const struct 
> pipe_draw_info *info)
>}
> 
>struct swr_vertex_element_state *velems = ctx->velems;
> -   velems->fsState.cutIndex = info->restart_index;
> +   if (info->primitive_restart)
> +  velems->fsState.cutIndex = info->restart_index;
> +   else
> +  velems->fsState.cutIndex = 0;
>velems->fsState.bEnableCutIndex = info->primitive_restart;
>velems->fsState.bPartialVertexBuffer = (info->min_index > 0);
> 
> -- 
> 2.7.4
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev



[Mesa-dev] [Bug 102038] assertion failure in update_framebuffer_size

2017-08-23 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=102038

--- Comment #13 from Bruce Cherniak  ---
Hi Brian,

This appears to fix VTK tests on llvmpipe.  However, it breaks SWR.  Please let
me see what's going on there before submitting this patch for review.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


Re: [Mesa-dev] TGSI 16-bit support

2017-08-23 Thread Marek Olšák
On Wed, Aug 23, 2017 at 3:08 PM, Nicolai Hähnle  wrote:
> On 22.08.2017 22:39, Roland Scheidegger wrote:
>>
>> On 22.08.2017 at 19:10, Marek Olšák wrote:
>>>
>>> Hi,
>>>
>>> I'd like to discuss 16-bit float and integer support in TGSI. I'm
>>> proposing this:
>>>
>>>   struct tgsi_instruction
>>>   {
>>>  unsigned Type   : 4;  /* TGSI_TOKEN_TYPE_INSTRUCTION */
>>>  unsigned NrTokens   : 8;  /* UINT */
>>>  unsigned Opcode : 8;  /* TGSI_OPCODE_ */
>>>  unsigned Saturate   : 1;  /* BOOL */
>>>  unsigned NumDstRegs : 2;  /* UINT */
>>>  unsigned NumSrcRegs : 4;  /* UINT */
>>>  unsigned Label  : 1;
>>>  unsigned Texture: 1;
>>>  unsigned Memory : 1;
>>>  unsigned Precise: 1;
>>> -   unsigned Padding: 1;
>>> +   unsigned HalfPrecision : 1;
>>>   };
>>>
>>> There won't be any 16-bit TEMPs in TGSI, but each instruction will
>>> have the HalfPrecision flag, which is a hint for drivers that they can
>>> use a 16-bit opcode. Even texture, load, and store instructions can
>>> set HalfPrecision, which means they can accept and return 16-bit
>>> values.
>>>
>>> The catch is that drivers will have to insert 16-bit <-> 32-bit
>>> conversions manually, because they won't be present in TGSI. The
>>> advantage is that we don't have to add 200 new opcodes for the 3 new
>>> 16-bit types.
>>>
>>> What do you think?
>>>
>>
>> Flagging instructions as 16bit doesn't look too bad to me, but I'm
>> wondering if this isn't a bit problematic wrt register files. Clearly,
>> this is a restriction of tgsi "everything is a 32x4 value". Doubles, of
>> course, have a similar problem, but in the end they still have
>> well-defined interactions with the register files, because it's defined
>> what bits ultimately represent a 64bit value (at least in theory from
>> tgsi's point of view, it is perfectly valid to use some 32bit
>> calculations to set some reg, then just use double instructions directly
>> without conversion on these values - it may not be meaningful but it is
>> well defined).
>> But it looks like you want to avoid to have a well-defined mapping of
>> the registers to 16bit types (and with 16 bits instruction just being
>> hints, I can't see how it could exist).
>> Note that being able to flag instructions as HalfPrecision does not
>> necessarily mean you can't have any explicit 16bit conversion
>> instructions too.
>
>
> Those already exist: PK2H and UP2H. Or did you have something else in mind?
>
> More generally, there are really two use cases for this, and we need to be
> careful not to mix them up:
>
> - transparent downgrading to 16-bit of lowp and mediump
> - support for extensions that explicitly introduce 16-bit types
>
> For lowp and mediump, the approach of just having a HalfPrecision bit on the
> instructions is probably fine.
>
> The second case is different. I don't think there are ARB extensions for
> that yet, but there are AMD_gpu_shader_{int16,half_float} with explicitly
> 16-bit types. (There's also NV_half_float, but that's from earlier days
> without GLSL.) For those, we'd really need to provide exactly the required
> operation. No special handling of TGSI temporaries is needed: an f16vec4 is
> represented as a normal 4-component vector in TGSI, just that the upper 16
> bits of each component are ignored.

I wanted to avoid adding 16-bit opcodes to TGSI because it's too much work.

>
> Here's another question: What does "low precision" mean on a texture
> instruction? Are the offsets low precision or is it the output? Maybe we can
> punt on this for now -- at least GCN doesn't have low precision there
> anyway.

HalfPrecision means that all dst and src sources can be 16-bit.

If the consumer of a TEX instruction is 16-bit, TEX should return
16-bit automatically. If a source of a TEX instruction is 16-bit, TEX
should accept 16-bit automatically.

GFX9 can have 16-bit inputs and outputs in buffer and image
instructions. We also have 16-bit interpolation. We could, in theory,
run a whole pixel shader with 16-bit precision.

Marek
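
For readers following along, here is a rough sketch of the proposed layout.
It only mirrors the struct quoted above; the sketch_ names and the helper are
made up for illustration and are not part of Gallium. It shows that reusing
the Padding bit keeps the token at 32 bits, and how a driver might read the
hint.

```c
#include <assert.h>

/* Sketch of the proposed layout. Field names follow the quoted struct;
 * the struct itself is renamed so it is not mistaken for the real header. */
struct sketch_tgsi_instruction {
   unsigned Type          : 4;
   unsigned NrTokens      : 8;
   unsigned Opcode        : 8;
   unsigned Saturate      : 1;
   unsigned NumDstRegs    : 2;
   unsigned NumSrcRegs    : 4;
   unsigned Label         : 1;
   unsigned Texture       : 1;
   unsigned Memory        : 1;
   unsigned Precise       : 1;
   unsigned HalfPrecision : 1;  /* takes the place of the old Padding bit */
};

/* Sum of the declared bit widths. */
enum { SKETCH_TOKEN_BITS = 4 + 8 + 8 + 1 + 2 + 4 + 1 + 1 + 1 + 1 + 1 };

/* Hypothetical driver-side helper: is a 16-bit opcode allowed here?
 * The flag is only a hint, so a driver may still ignore it. */
int
sketch_may_use_16bit(const struct sketch_tgsi_instruction *inst)
{
   return inst->HalfPrecision != 0;
}
```

On the usual ABIs with 32-bit unsigned, the struct still occupies a single
4-byte token, so no token stream layout changes would be needed.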


[Mesa-dev] [PATCH] swr: limit pipe_draw_info->restart_index usage

2017-08-23 Thread Tim Rowley
Only copy this value when in restart drawing mode.

Eliminates valgrind errors when running trivial programs.
---
 src/gallium/drivers/swr/swr_draw.cpp | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/swr/swr_draw.cpp 
b/src/gallium/drivers/swr/swr_draw.cpp
index df1c11a..2363800 100644
--- a/src/gallium/drivers/swr/swr_draw.cpp
+++ b/src/gallium/drivers/swr/swr_draw.cpp
@@ -107,7 +107,10 @@ swr_draw_vbo(struct pipe_context *pipe, const struct 
pipe_draw_info *info)
}
 
struct swr_vertex_element_state *velems = ctx->velems;
-   velems->fsState.cutIndex = info->restart_index;
+   if (info->primitive_restart)
+  velems->fsState.cutIndex = info->restart_index;
+   else
+  velems->fsState.cutIndex = 0;
velems->fsState.bEnableCutIndex = info->primitive_restart;
velems->fsState.bPartialVertexBuffer = (info->min_index > 0);
 
-- 
2.7.4
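
A minimal standalone sketch of the pattern in this patch, assuming (as the
commit message suggests) that restart_index may be left uninitialized when
primitive restart is off. The struct and function names are hypothetical
stand-ins, not the real pipe_draw_info or SWR types.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the two pipe_draw_info fields involved. */
struct sketch_draw_info {
   bool primitive_restart;
   uint32_t restart_index;   /* only meaningful when primitive_restart is set */
};

/* Pick the cut index without ever reading restart_index unless restart
 * drawing is enabled; fall back to 0 otherwise, as the patch does. */
uint32_t
sketch_cut_index(const struct sketch_draw_info *info)
{
   assert(info != NULL);
   return info->primitive_restart ? info->restart_index : 0;
}
```

With restart disabled the stored index is never touched, which is the access
valgrind was presumably complaining about.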



Re: [Mesa-dev] [PATCH] radeonsi: do not assert when reserving bindless slot 0

2017-08-23 Thread Samuel Pitoiset



On 08/23/2017 05:59 PM, Nicolai Hähnle wrote:

On 23.08.2017 10:10, Samuel Pitoiset wrote:

Both solutions look good to me.

On 08/23/2017 10:06 AM, Michael Schellenberger Costa wrote:

Hi Samuel,

do you want to fully remove the assert or should this be something 
the kind of


MAYBE_UNUSED unsigned res = 
util_idalloc_alloc(&sctx->bindless_used_slots);

assert(res != 0);


I think you got the sense of the assertion wrong :)


Yeah... But after :)

I pushed a fix for that, sorry.



Cheers,
Nicolai



--Michael

-----Original Message-----
From: mesa-dev [mailto:mesa-dev-boun...@lists.freedesktop.org] On Behalf Of Samuel Pitoiset
Sent: Wednesday, 23 August 2017 09:43
To: mesa-dev@lists.freedesktop.org
Subject: [Mesa-dev] [PATCH] radeonsi: do not assert when reserving bindless slot 0


When assertions were disabled, the compiler removed
the call to util_idalloc_alloc() and the first allocated
bindless slot was 0 which is invalid per the spec.

Signed-off-by: Samuel Pitoiset 
---
  src/gallium/drivers/radeonsi/si_descriptors.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeonsi/si_descriptors.c 
b/src/gallium/drivers/radeonsi/si_descriptors.c

index f66ecc3e68..c53253ac8d 100644
--- a/src/gallium/drivers/radeonsi/si_descriptors.c
+++ b/src/gallium/drivers/radeonsi/si_descriptors.c
@@ -2192,7 +2192,7 @@ static void si_init_bindless_descriptors(struct 
si_context *sctx,

  util_idalloc_resize(&sctx->bindless_used_slots, num_elements);
  /* Reserve slot 0 because it's an invalid handle for bindless. */
-    assert(!util_idalloc_alloc(&sctx->bindless_used_slots));
+    util_idalloc_alloc(&sctx->bindless_used_slots);
  }
  static void si_release_bindless_descriptors(struct si_context *sctx)








Re: [Mesa-dev] [PATCH] radeonsi: do not assert when reserving bindless slot 0

2017-08-23 Thread Nicolai Hähnle

On 23.08.2017 10:10, Samuel Pitoiset wrote:

Both solutions look good to me.

On 08/23/2017 10:06 AM, Michael Schellenberger Costa wrote:

Hi Samuel,

do you want to fully remove the assert or should this be something the 
kind of


MAYBE_UNUSED unsigned res = 
util_idalloc_alloc(&sctx->bindless_used_slots);

assert(res != 0);


I think you got the sense of the assertion wrong :)

Cheers,
Nicolai



--Michael

-----Original Message-----
From: mesa-dev [mailto:mesa-dev-boun...@lists.freedesktop.org] On Behalf Of Samuel Pitoiset
Sent: Wednesday, 23 August 2017 09:43
To: mesa-dev@lists.freedesktop.org
Subject: [Mesa-dev] [PATCH] radeonsi: do not assert when reserving bindless slot 0


When assertions were disabled, the compiler removed
the call to util_idalloc_alloc() and the first allocated
bindless slot was 0 which is invalid per the spec.

Signed-off-by: Samuel Pitoiset 
---
  src/gallium/drivers/radeonsi/si_descriptors.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeonsi/si_descriptors.c 
b/src/gallium/drivers/radeonsi/si_descriptors.c

index f66ecc3e68..c53253ac8d 100644
--- a/src/gallium/drivers/radeonsi/si_descriptors.c
+++ b/src/gallium/drivers/radeonsi/si_descriptors.c
@@ -2192,7 +2192,7 @@ static void si_init_bindless_descriptors(struct 
si_context *sctx,

  util_idalloc_resize(&sctx->bindless_used_slots, num_elements);
  /* Reserve slot 0 because it's an invalid handle for bindless. */
-assert(!util_idalloc_alloc(&sctx->bindless_used_slots));
+util_idalloc_alloc(&sctx->bindless_used_slots);
  }
  static void si_release_bindless_descriptors(struct si_context *sctx)





--
Learn how the world really is,
but never forget how it ought to be.
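
The bug in this thread is the general hazard of putting a side effect inside
assert(). Below is a small sketch of it, under stated assumptions: the
allocator is a made-up counter, not util_idalloc, and SKETCH_ASSERT_DISABLED
imitates what assert() expands to when NDEBUG is defined, so the effect can
be seen without actually building with NDEBUG.

```c
#include <assert.h>

/* Mimics assert() under NDEBUG: the argument is never evaluated. */
#define SKETCH_ASSERT_DISABLED(expr) ((void)0)

/* Hypothetical ID allocator that hands out 0, 1, 2, ... in order. */
struct sketch_idalloc {
   unsigned next;
};

unsigned
sketch_idalloc_alloc(struct sketch_idalloc *a)
{
   return a->next++;
}

/* Buggy pattern: the reservation of slot 0 lives inside the assertion. */
unsigned
sketch_first_slot_buggy(void)
{
   struct sketch_idalloc a = { 0 };
   SKETCH_ASSERT_DISABLED(!sketch_idalloc_alloc(&a)); /* call compiled out */
   return sketch_idalloc_alloc(&a);                   /* hands out slot 0 */
}

/* Fixed pattern: always reserve slot 0, then check the result separately. */
unsigned
sketch_first_slot_fixed(void)
{
   struct sketch_idalloc a = { 0 };
   unsigned reserved = sketch_idalloc_alloc(&a);
   assert(reserved == 0);
   (void)reserved;                                    /* silence NDEBUG builds */
   return sketch_idalloc_alloc(&a);                   /* hands out slot 1 */
}
```

Note that the check on the reserved slot is "== 0", matching the sense of
the original assertion in the patch.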


[Mesa-dev] [Bug 100613] Regression in Mesa 17 on s390x (zSystems)

2017-08-23 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=100613

--- Comment #44 from Ben Crocker  ---
(In reply to Ben Crocker from comment #43)
> (In reply to Ben Crocker from comment #42)
> > Created attachment 133677 [details] [review] [review]
> > lp_build_gather_elem_vec big-endian fix for 3x16 load
> > 
> > In reply to Roland's Comment 32:
> > 
> > Roland, thanks for the constructive feedback.  Here is what the patch
> > looks like now.
> > 
> > I don't disagree that this is messy, but it DOES resolve (most of) the
> > Piglit regressions.
> > 
> > I agree that the code in lp_bld_gather should work for 4x16 bit
> > vectors, but SOMETHING appears to have gone right already for 4x16 bit
> > vectors, as the only regressions seen had to do with 3x16 vectors.
> 
> Clarification: the only regressions seen AFTER Ray Strode's patch
> (attachment 130980 [details] [review]).

By the way, draw-vertices-2101010 is failing across all architectures:
X86 and PPC64LE as well as PPC64/S390x.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH 1/3] mesa: only check errors when the state change in glDepthBoundsEXT()

2017-08-23 Thread Samuel Pitoiset



On 08/23/2017 04:53 PM, Ilia Mirkin wrote:

This is a functional change, e.g. what if

glDepthBoundsEXT(2, 1)

is called? Either way, I suspect it's fine, but just pointing it out
in case it wasn't considered.


The spec doesn't seem to explain if the INVALID_VALUE error should be 
reported before or after the values are clamped.
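
To make the behavioural difference concrete, here is a small sketch of the
two orderings. It is an assumption-laden simplification: it leaves out the
"state unchanged" early return and the GL context entirely, and the sketch_
names are hypothetical.

```c
#include <assert.h>   /* assert() is only needed by callers checking results */
#include <stdbool.h>

double
sketch_clamp01(double v)
{
   return v < 0.0 ? 0.0 : (v > 1.0 ? 1.0 : v);
}

/* Old order: validate the raw arguments, then clamp.
 * Returns true when INVALID_VALUE would be raised. */
bool
sketch_error_check_first(double zmin, double zmax)
{
   return zmin > zmax;
}

/* New order: clamp first, then validate the clamped values. */
bool
sketch_error_clamp_first(double zmin, double zmax)
{
   zmin = sketch_clamp01(zmin);
   zmax = sketch_clamp01(zmax);
   return zmin > zmax;
}
```

For glDepthBoundsEXT(2, 1) the old order reports an error, while the new
order clamps both values to 1.0 and reports none. Inputs already inside
[0, 1], such as (0.8, 0.2), fail either way.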




On Wed, Aug 23, 2017 at 10:43 AM, Samuel Pitoiset
 wrote:

Signed-off-by: Samuel Pitoiset 
---
  src/mesa/main/depth.c | 10 +++++-----
  1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/src/mesa/main/depth.c b/src/mesa/main/depth.c
index 930f5e816f..ddd91481cd 100644
--- a/src/mesa/main/depth.c
+++ b/src/mesa/main/depth.c
@@ -146,17 +146,17 @@ _mesa_DepthBoundsEXT( GLclampd zmin, GLclampd zmax )
 if (MESA_VERBOSE & VERBOSE_API)
_mesa_debug(ctx, "glDepthBounds(%f, %f)\n", zmin, zmax);

-   if (zmin > zmax) {
-  _mesa_error(ctx, GL_INVALID_VALUE, "glDepthBoundsEXT(zmin > zmax)");
-  return;
-   }
-
 zmin = CLAMP(zmin, 0.0, 1.0);
 zmax = CLAMP(zmax, 0.0, 1.0);

 if (ctx->Depth.BoundsMin == zmin && ctx->Depth.BoundsMax == zmax)
return;

+   if (zmin > zmax) {
+  _mesa_error(ctx, GL_INVALID_VALUE, "glDepthBoundsEXT(zmin > zmax)");
+  return;
+   }
+
 FLUSH_VERTICES(ctx, ctx->DriverFlags.NewDepth ? 0 : _NEW_DEPTH);
 ctx->NewDriverState |= ctx->DriverFlags.NewDepth;
 ctx->Depth.BoundsMin = (GLfloat) zmin;
--
2.14.1




Re: [Mesa-dev] [PATCH 1/7] gallium: add CONSTBUF type to tgsi_file_type

2017-08-23 Thread Roland Scheidegger
On 23.08.2017 at 17:03, Nicolai Hähnle wrote:
> On 23.08.2017 16:03, Roland Scheidegger wrote:
>> On 23.08.2017 at 15:49, Ilia Mirkin wrote:
>>> On Wed, Aug 23, 2017 at 9:20 AM, Nicolai Hähnle 
>>> wrote:
>>>> On 22.08.2017 16:56, Ilia Mirkin wrote:
>>>>>
>>>>> On Tue, Aug 22, 2017 at 10:51 AM, Roland Scheidegger wrote:
>>>>>>
>>>>>> I am probably missing something here, but why do you need a new
>>>>>> register
>>>>>> file? Since you couldn't use LOAD with TGSI_FILE_CONSTANT before,
>>>>>> can't
>>>>>> you just allow LOAD with TGSI_FILE_CONSTANT and achieve the same
>>>>>> thing?
>>>>>> Or do you need to know how it's going to be accessed in advance?
>>>>>
>>>>>
>>>>> With bindless, LOAD can take a CONST I believe [which contains the
>>>>> value of the bindless id]. I think it's nice to keep those concepts
>>>>> separate... having CONST sometimes mean the value and other times mean
>>>>> the address is a bit weird. This way CONSTBUF[0] is the address of the
>>>>> 0th constbuf.
>>>>
>>>>
>>>> I'm still not quite convinced. The levels of indirection should
>>>> clarify the
>>>> meaning, shouldn't they?
>>>>
>>>> You get
>>>>
>>>>    LOAD dst, CONST[0][0], IMM[0]
>>>>
>>>> when loading from offset IMM[0] of a bindless buffer whose handle is
>>>> at the
>>>> beginning of the buffer CONST[0].
>>>>
>>>> You get
>>>>
>>>>    LOAD dst, CONST[0], IMM[0]
>>>>
>>>> when loading from offset IMM[0] of non-bindless buffer 0.
>>>>
>>>> Is there ever really a situation where the two could be confused?
>>>
>>> I always considered CONST[0] == CONST[0][0]. Technically they're not,
>>> since one has the second dimension in the TGSI encoding while the
>>> other doesn't. But practically,
>>>
>>> MOV TEMP[0], CONST[0]
>>>
>>> and
>>>
>>> MOV TEMP[0], CONST[0][0]
>>>
>>> are in every way identical. Currently st/mesa will just use CONST[0]
>>> everywhere, never adding the 2nd dimension.
>> Maybe it would be worth the effort to fix this?
> 
> Would be nice. One thing that makes this a bit awkward is that older
> drivers just don't support two-dimensional CONST at all -- see
> PIPE_SHADER_CAP_MAX_CONST_BUFFERS. Giving them a shader that loads
> CONST[0][n] is going to fail.
I suppose it wouldn't be too difficult to make them just accept this
(basically ignoring the buffer index).
But anyway, I don't know if it's worth the hassle, I just brought it up
because if it's a problem going forward, it should be possible to change
it. (Albeit we definitely have code relying on these 1d constants too...)

Roland
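
For what it's worth, the "CONST[0] == CONST[0][0]" equivalence that applies to
ordinary constant fetches can be sketched like this. The register
representation below is invented for illustration and is not the TGSI token
encoding; it also deliberately does not model the LOAD case discussed above,
where the two forms would mean different things.

```c
#include <assert.h>   /* assert() is only needed by callers checking results */
#include <stdbool.h>

/* Hypothetical decoded constant reference: CONST[index] or
 * CONST[buffer][index], depending on has_dimension. */
struct sketch_const_ref {
   bool has_dimension;
   unsigned buffer;
   unsigned index;
};

/* Treat a one-dimensional CONST[i] as CONST[0][i]. */
struct sketch_const_ref
sketch_normalize(struct sketch_const_ref ref)
{
   if (!ref.has_dimension) {
      ref.has_dimension = true;
      ref.buffer = 0;
   }
   return ref;
}

/* Two references name the same constant if they agree after normalizing. */
bool
sketch_same_constant(struct sketch_const_ref a, struct sketch_const_ref b)
{
   a = sketch_normalize(a);
   b = sketch_normalize(b);
   return a.buffer == b.buffer && a.index == b.index;
}
```

A state tracker that always emitted the two-dimensional form would make the
normalization step unnecessary, at the cost of the driver compatibility
concern raised in the quoted text.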


> 
> Basically, changing this is a backward-compatible change to state
> trackers, which would have to promise not to produce one-dimensional
> CONST for the usual, vec4-based constant fetching.
> 
> On the other hand, maybe we're over-complicating this. The only
> instruction that is really affected is LOAD. And for LOAD, there
> shouldn't be a compatibility problem. Hmm...
> 
> Cheers,
> Nicolai
> 
>>
>> Roland
>>
>>
>>   As such, I don't think we
>>> should start having behavioural differences for those on some
>>> instructions.
>>>
>>
> 
> 



Re: [Mesa-dev] [PATCH v6.1] egl: Allow creation of per surface out fence

2017-08-23 Thread Marathe, Yogesh
> -Original Message-
> From: Tomasz Figa [mailto:tf...@chromium.org]
> Sent: Wednesday, August 23, 2017 8:17 PM
> To: Marathe, Yogesh 
> Cc: ML mesa-dev ; Emil Velikov
> ; Gao, Shuo ; Liu, Zhiquan
> ; Daniel Stone ; Nicolai
> Hähnle ; Antognolli, Rafael
> ; Eric Engestrom ; Kenneth
> Graunke ; Rainer Hochecker
> ; Kondapally, Kalyan ;
> Timothy Arceri ; Varad Gautam
> ; Wu, Zhongmin 
> Subject: Re: [PATCH v6.1] egl: Allow creation of per surface out fence
> 
> Hi Yogesh,
> 
> Sorry for being late with review. Please see some comments inline.
> 

No problem.

> On Fri, Aug 18, 2017 at 7:08 PM,   wrote:
> > From: Zhongmin Wu 
> >
> > Add plumbing to allow creation of per display surface out fence.
> >
> > Currently enabled only on android, since the system expects a valid fd
> > in ANativeWindow::{queue,cancel}Buffer. We pass a fd of -1 with which
> > native applications such as flatland fail. The patch enables explicit
> > sync on android and fixes one of the functional issue for apps or
> > buffer consumers which depend upon fence and its timestamp.
> >
> > v2: a) Also implement the fence in cancelBuffer.
> > b) The last sync fence is stored in drawable object
> >rather than brw context.
> > c) format clear.
> >
> > v3: a) Save the last fence fd in DRI Context object.
> > b) Return the last fence if the batch buffer is empty and
> >nothing to be flushed when _intel_batchbuffer_flush_fence
> > c) Add the new interface in vbtl to set the retrieve fence
> >
> > v3.1 a) close fd in the new vbtl interface on none Android platform
> >
> > v4: a) The last fence is saved in brw context.
> > b) The retrieve fd is for all the platform but not just Android
> > c) Add a uniform dri2 interface to initialize the surface.
> >
> > v4.1: a) make some changes of variable name.
> >   b) the patch is broken into two patches.
> >
> > v4.2: a) Add a deinit interface for surface to clear the out fence
> >
> > v5: a) Add enable_out_fence to init, platform sets it true or
> >false
> > b) Change get fd to update fd and check for fence
> > c) Commit description updated
> >
> > v6: a) Heading and commit description updated
> > b) enable_out_fence is set only if fence is supported
> > c) Review comments on function names
> > d) Test with standalone patch, resolves the bug
> >
> > v6.1 a) Check for old display fence reverted back
> >
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101655
> >
> > Signed-off-by: Zhongmin Wu 
> > Signed-off-by: Yogesh Marathe 
> > ---
> >  src/egl/drivers/dri2/egl_dri2.c | 69 
> > +
> >  src/egl/drivers/dri2/egl_dri2.h |  9 
> >  src/egl/drivers/dri2/platform_android.c | 29 ++--
> >  src/egl/drivers/dri2/platform_drm.c |  3 +-
> >  src/egl/drivers/dri2/platform_surfaceless.c |  3 +-
> >  src/egl/drivers/dri2/platform_wayland.c |  3 +-
> >  src/egl/drivers/dri2/platform_x11.c |  3 +-
> >  src/egl/drivers/dri2/platform_x11_dri3.c|  3 +-
> >  8 files changed, 104 insertions(+), 18 deletions(-)
> >
> > diff --git a/src/egl/drivers/dri2/egl_dri2.c
> > b/src/egl/drivers/dri2/egl_dri2.c index ed79e0d..04d0332 100644
> > --- a/src/egl/drivers/dri2/egl_dri2.c
> > +++ b/src/egl/drivers/dri2/egl_dri2.c
> > @@ -1354,6 +1354,44 @@ dri2_destroy_context(_EGLDriver *drv,
> _EGLDisplay *disp, _EGLContext *ctx)
> > return EGL_TRUE;
> >  }
> >
> > +EGLBoolean
> > +dri2_init_surface(_EGLSurface *surf, _EGLDisplay *dpy, EGLint type,
> > +_EGLConfig *conf, const EGLint *attrib_list, EGLBoolean
> > +enable_out_fence) {
> > +   struct dri2_egl_surface *dri2_surf = dri2_egl_surface(surf);
> > +   struct dri2_egl_display *dri2_dpy = dri2_egl_display(dpy);
> > +
> > +   dri2_surf->out_fence_fd = -1;
> > +   if (dri2_dpy->fence && dri2_dpy->fence->base.version >= 2 &&
> > +   dri2_dpy->fence->get_capabilities &&
> > +   (dri2_dpy->fence->get_capabilities(dri2_dpy->dri_screen) &
> > +__DRI_FENCE_CAP_NATIVE_FD)) {
> > +  dri2_surf->enable_out_fence = enable_out_fence;
> > +   }
> 
> nit: It might not change anything in practice, but it would be more logical 
> if the
> code always initialized enable_out_fence to some value.
> So maybe let's add dri2_surf->enable_out_fence = 0; above the if.
> 

Ok.
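
A reduced sketch of the agreed change, assuming a simplified surface struct
and a plain boolean in place of the real DRI fence capability query. All
sketch_ names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical reduction of the two dri2_egl_surface fields involved. */
struct sketch_surface {
   int out_fence_fd;
   bool enable_out_fence;
};

/* fence_fd_supported stands in for the fence extension and capability
 * checks; requested is what the platform passes as enable_out_fence. */
void
sketch_init_surface(struct sketch_surface *surf, bool fence_fd_supported,
                    bool requested)
{
   assert(surf != NULL);
   surf->out_fence_fd = -1;
   surf->enable_out_fence = false;   /* always initialized, per the review */
   if (fence_fd_supported)
      surf->enable_out_fence = requested;
}
```

This way the field has a defined value even when the fence extension is
missing, rather than depending on how the surface was allocated.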

> > +
> > +   return _eglInitSurface(surf, dpy, type, conf, attrib_list); }
> > +
> > +static void
> > +dri2_surface_set_out_fence_fd( _EGLSurface *surf, int 

Re: [Mesa-dev] [PATCH 1/7] gallium: add CONSTBUF type to tgsi_file_type

2017-08-23 Thread Nicolai Hähnle

On 23.08.2017 16:03, Roland Scheidegger wrote:

On 23.08.2017 at 15:49, Ilia Mirkin wrote:

On Wed, Aug 23, 2017 at 9:20 AM, Nicolai Hähnle  wrote:

On 22.08.2017 16:56, Ilia Mirkin wrote:


On Tue, Aug 22, 2017 at 10:51 AM, Roland Scheidegger 
wrote:


I am probably missing something here, but why do you need a new register
file? Since you couldn't use LOAD with TGSI_FILE_CONSTANT before, can't
you just allow LOAD with TGSI_FILE_CONSTANT and achieve the same thing?
Or do you need to know how it's going to be accessed in advance?



With bindless, LOAD can take a CONST I believe [which contains the
value of the bindless id]. I think it's nice to keep those concepts
separate... having CONST sometimes mean the value and other times mean
the address is a bit weird. This way CONSTBUF[0] is the address of the
0th constbuf.



I'm still not quite convinced. The levels of indirection should clarify the
meaning, shouldn't they?

You get

   LOAD dst, CONST[0][0], IMM[0]

when loading from offset IMM[0] of a bindless buffer whose handle is at the
beginning of the buffer CONST[0].

You get

   LOAD dst, CONST[0], IMM[0]

when loading from offset IMM[0] of non-bindless buffer 0.

Is there ever really a situation where the two could be confused?


I always considered CONST[0] == CONST[0][0]. Technically they're not,
since one has the second dimension in the TGSI encoding while the
other doesn't. But practically,

MOV TEMP[0], CONST[0]

and

MOV TEMP[0], CONST[0][0]

are in every way identical. Currently st/mesa will just use CONST[0]
everywhere, never adding the 2nd dimension.

Maybe it would be worth the effort to fix this?


Would be nice. One thing that makes this a bit awkward is that older 
drivers just don't support two-dimensional CONST at all -- see 
PIPE_SHADER_CAP_MAX_CONST_BUFFERS. Giving them a shader that loads 
CONST[0][n] is going to fail.


Basically, changing this is a backward-compatible change to state 
trackers, which would have to promise not to produce one-dimensional 
CONST for the usual, vec4-based constant fetching.


On the other hand, maybe we're over-complicating this. The only 
instruction that is really affected is LOAD. And for LOAD, there 
shouldn't be a compatibility problem. Hmm...


Cheers,
Nicolai



Roland


  As such, I don't think we

should start having behavioural differences for those on some
instructions.






--
Learn how the world really is,
but never forget how it ought to be.


Re: [Mesa-dev] [PATCH 1/3] mesa: only check errors when the state change in glDepthBoundsEXT()

2017-08-23 Thread Ilia Mirkin
This is a functional change, e.g. what if

glDepthBoundsEXT(2, 1)

is called? Either way, I suspect it's fine, but just pointing it out
in case it wasn't considered.

On Wed, Aug 23, 2017 at 10:43 AM, Samuel Pitoiset
 wrote:
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/mesa/main/depth.c | 10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/src/mesa/main/depth.c b/src/mesa/main/depth.c
> index 930f5e816f..ddd91481cd 100644
> --- a/src/mesa/main/depth.c
> +++ b/src/mesa/main/depth.c
> @@ -146,17 +146,17 @@ _mesa_DepthBoundsEXT( GLclampd zmin, GLclampd zmax )
> if (MESA_VERBOSE & VERBOSE_API)
>_mesa_debug(ctx, "glDepthBounds(%f, %f)\n", zmin, zmax);
>
> -   if (zmin > zmax) {
> -  _mesa_error(ctx, GL_INVALID_VALUE, "glDepthBoundsEXT(zmin > zmax)");
> -  return;
> -   }
> -
> zmin = CLAMP(zmin, 0.0, 1.0);
> zmax = CLAMP(zmax, 0.0, 1.0);
>
> if (ctx->Depth.BoundsMin == zmin && ctx->Depth.BoundsMax == zmax)
>return;
>
> +   if (zmin > zmax) {
> +  _mesa_error(ctx, GL_INVALID_VALUE, "glDepthBoundsEXT(zmin > zmax)");
> +  return;
> +   }
> +
> FLUSH_VERTICES(ctx, ctx->DriverFlags.NewDepth ? 0 : _NEW_DEPTH);
> ctx->NewDriverState |= ctx->DriverFlags.NewDepth;
> ctx->Depth.BoundsMin = (GLfloat) zmin;
> --
> 2.14.1
>


Re: [Mesa-dev] TGSI 16-bit support

2017-08-23 Thread Ilia Mirkin
On Wed, Aug 23, 2017 at 10:48 AM, Nicolai Hähnle  wrote:
> On 23.08.2017 16:36, Ilia Mirkin wrote:
>>
>> On Wed, Aug 23, 2017 at 10:30 AM, Nicolai Hähnle 
>> wrote:
>>>
>>> On 23.08.2017 15:15, Nicolai Hähnle wrote:


>>>> On 22.08.2017 19:32, Marek Olšák wrote:
>>>>>
>>>>>
>>>>> On Tue, Aug 22, 2017 at 7:28 PM, Ilia Mirkin wrote:
>>>>>>
>>>>>>
>>>>>> How do you propose defining the semantics for e.g. loading a 16-bit
>>>>>> value from a constbuf/ssbo? Would those get separate instructions?
>>>>>
>>>>>
>>>>>
>>>>> st/mesa should use UP2H, PK2H and similar opcodes for I16 and U16, and
>>>>> drivers can replace them with MOV if HalfPrecision == 1.
>>>>
>>>>
>>>>
>>>> You mean, if HalfPrecision == 1 for subsequent operations?
>>>>
>>>> How *do* we implement this for LLVM, anyway? Downcast (fptrunc) from
>>>> float
>>>> to half whenever we're loading operands of a HalfPrecision == 1
>>>> instruction,
>>>> and then casting (fpext) back up before storing the result?
>>>>
>>>> LLVM instcombine seems quite capable of seeing through that in simple
>>>> code, but I worry about control flow.
>>>
>>>
>>>
>>> Thinking about this some more, having the precision a property of the
>>> temporaries like Ilia suggested would probably help with emitting LLVM IR
>>> that behaves well across control flow, but complicate st_glsl_to_tgsi.
>>> Hard
>>> to say what the tradeoff is there.
>>
>>
>> Why would it complicate glsl_to_tgsi? At the GLSL level, it's not the
>> operations that have precision, but variables. And those variables map
>> to temp's... we'd have to create a separate pool of high- vs
>> low-precision temps, but that's about it.
>
>
> Well, it may not be so bad in the end. But we definitely have to be more
> careful with things like temporary remapping, peephole, etc.
>
> Also, at least OUT variables could also be affected, right?

And IN as well. Note that some hardware has support for fp16 varyings
and fp16 color outputs (the GLES end of things).


Re: [Mesa-dev] TGSI 16-bit support

2017-08-23 Thread Nicolai Hähnle

On 23.08.2017 16:36, Ilia Mirkin wrote:

On Wed, Aug 23, 2017 at 10:30 AM, Nicolai Hähnle  wrote:

On 23.08.2017 15:15, Nicolai Hähnle wrote:


On 22.08.2017 19:32, Marek Olšák wrote:


On Tue, Aug 22, 2017 at 7:28 PM, Ilia Mirkin 
wrote:


How do you propose defining the semantics for e.g. loading a 16-bit
value from a constbuf/ssbo? Would those get separate instructions?



st/mesa should use UP2H, PK2H and similar opcodes for I16 and U16, and
drivers can replace them with MOV if HalfPrecision == 1.



You mean, if HalfPrecision == 1 for subsequent operations?

How *do* we implement this for LLVM, anyway? Downcast (fptrunc) from float
to half whenever we're loading operands of a HalfPrecision == 1 instruction,
and then casting (fpext) back up before storing the result?

LLVM instcombine seems quite capable of seeing through that in simple
code, but I worry about control flow.



Thinking about this some more, having the precision a property of the
temporaries like Ilia suggested would probably help with emitting LLVM IR
that behaves well across control flow, but complicate st_glsl_to_tgsi. Hard
to say what the tradeoff is there.


Why would it complicate glsl_to_tgsi? At the GLSL level, it's not the
operations that have precision, but variables. And those variables map
to temp's... we'd have to create a separate pool of high- vs
low-precision temps, but that's about it.


Well, it may not be so bad in the end. But we definitely have to be more 
careful with things like temporary remapping, peephole, etc.


Also, at least OUT variables could also be affected, right?

Cheers,
Nicolai



   -ilia




--
Learn how the world really is,
but never forget how it ought to be.

