date:20170912

Re: [Mesa-dev] [PATCH 3/4] i965/tex_image: Reference the renderbuffer miptree in setTexBuffer2

2017-09-12 Thread Pohjolainen, Topi

On Tue, Sep 12, 2017 at 04:23:04PM -0700, Jason Ekstrand wrote:
> The old code made a new miptree that referenced the same BO as the
> renderbuffer and just trusted in the memory aliasing to work.  There are
> only two ways in which the new miptree is liable to differ from the one
> in the renderbuffer and neither of them matter:
> 
>  1) It may have a different target.  The only targets that we can ever
> see in intelSetTexBuffer2 are GL_TEXTURE_2D and GL_TEXTURE_RECTANGLE
> and the difference between the two doesn't matter as far as the
> miptree is concerned; genX(update_sampler_state) only looks at the
> gl_texture_object and not the miptree when determining whether or
> not to use normalized coordinates.
> 
>  2) It may have a very slightly different format.  Again, this doesn't
> matter because we've supported texture views for quite some time so
> we always look at the gl_texture_object format instead of the
> miptree format for hardware setup anyway.
> 
> On the other hand, because we were recreating the miptree, we were using
> intel_miptree_create_for_bo which doesn't understand modifiers.  We
> really want this function to work without doing a resolve so long as you
> have modifiers so we need to fix that.

I read the last sentence a few times but it still sounds a little odd. Maybe
split it in two?

I read the whole series and it looks sensible to me. However, I think I
understand even less glx and compositors than you. Therefore for the series
weak:

Reviewed-by: Topi Pohjolainen 

> ---
>  src/mesa/drivers/dri/i965/intel_tex_image.c | 23 ---
>  1 file changed, 4 insertions(+), 19 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/intel_tex_image.c 
> b/src/mesa/drivers/dri/i965/intel_tex_image.c
> index 4661581..09ff287 100644
> --- a/src/mesa/drivers/dri/i965/intel_tex_image.c
> +++ b/src/mesa/drivers/dri/i965/intel_tex_image.c
> @@ -223,8 +223,6 @@ intelSetTexBuffer2(__DRIcontext *pDRICtx, GLint target,
> struct intel_renderbuffer *rb;
> struct gl_texture_object *texObj;
> struct gl_texture_image *texImage;
> -   mesa_format texFormat = MESA_FORMAT_NONE;
> -   struct intel_mipmap_tree *mt;
> GLenum internal_format = 0;
>  
> texObj = _mesa_get_current_tex_object(ctx, target);
> @@ -244,33 +242,20 @@ intelSetTexBuffer2(__DRIcontext *pDRICtx, GLint target,
>return;
>  
> if (rb->mt->cpp == 4) {
> -  if (texture_format == __DRI_TEXTURE_FORMAT_RGB) {
> +  if (texture_format == __DRI_TEXTURE_FORMAT_RGB)
>   internal_format = GL_RGB;
> - texFormat = MESA_FORMAT_B8G8R8X8_UNORM;
> -  }
> -  else {
> +  else
>   internal_format = GL_RGBA;
> - texFormat = MESA_FORMAT_B8G8R8A8_UNORM;
> -  }
> } else if (rb->mt->cpp == 2) {
> +  /* This is 565 */
>internal_format = GL_RGB;
> -  texFormat = MESA_FORMAT_B5G6R5_UNORM;
> }
>  
> intel_miptree_make_shareable(brw, rb->mt);
> -   mt = intel_miptree_create_for_bo(brw, rb->mt->bo, texFormat, 0,
> -rb->Base.Base.Width,
> -rb->Base.Base.Height,
> -1, rb->mt->surf.row_pitch,
> -MIPTREE_CREATE_DEFAULT);
> -   if (mt == NULL)
> -   return;
> -   mt->target = target;
>  
> _mesa_lock_texture(&brw->ctx, texObj);
> texImage = _mesa_get_tex_image(ctx, texObj, target, 0);
> -   intel_set_texture_image_mt(brw, texImage, internal_format, mt);
> -   intel_miptree_release(&mt);
> +   intel_set_texture_image_mt(brw, texImage, internal_format, rb->mt);
> _mesa_unlock_texture(&brw->ctx, texObj);
>  }
>  
> -- 
> 2.5.0.400.gff86faf
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
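
The commit message above replaces a newly created alias miptree with a reference to the miptree the renderbuffer already owns. A minimal sketch of that ownership pattern follows. The types and helpers (toy_miptree, toy_reference, toy_release) are hypothetical stand-ins for illustration, not the i965 structures or functions.

```c
#include <stddef.h>

/* Hypothetical stand-in for a refcounted miptree; not the i965 struct. */
struct toy_miptree {
   int refcount;
};

/* Drop the reference held in *ptr and clear the pointer. */
static void
toy_release(struct toy_miptree **ptr)
{
   if (*ptr != NULL)
      (*ptr)->refcount--;
   *ptr = NULL;
}

/* Make *dst point at src, taking a new reference on src. */
static void
toy_reference(struct toy_miptree **dst, struct toy_miptree *src)
{
   if (*dst == src)
      return;
   toy_release(dst);
   if (src != NULL)
      src->refcount++;
   *dst = src;
}
```

With this shape the texture image and the renderbuffer share one object, so any extra state attached to it (such as a modifier) is visible to both without being recreated.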

[Mesa-dev] [PATCH] radv/ac: bump params array for image atomic comp swap

2017-09-12 Thread Dave Airlie

From: Dave Airlie 

For the comp_swap case this was overflowing and crashing
sometimes.

Fixes:
dEQP-VK.image.atomic_operations.compare_exchange.*

Cc: "17.2" 
Signed-off-by: Dave Airlie 
---
 src/amd/common/ac_nir_to_llvm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index 22e915d..1388ebd 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -3466,7 +3466,7 @@ static void visit_image_store(struct ac_nir_context *ctx,
 static LLVMValueRef visit_image_atomic(struct ac_nir_context *ctx,
const nir_intrinsic_instr *instr)
 {
-   LLVMValueRef params[6];
+   LLVMValueRef params[7];
int param_count = 0;
const nir_variable *var = instr->variables[0]->var;
 
-- 
2.9.3
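
The overflow described above comes from a fixed-size operand array that is one slot too small once the compare value is added. A minimal sketch of a bounds-checked append, assuming an illustrative capacity of seven and hypothetical names (toy_params, toy_push, toy_build_atomic), none of which come from ac_nir_to_llvm.c:

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_PARAMS 7

struct toy_params {
   int values[TOY_MAX_PARAMS];
   size_t count;
};

/* Append one operand; returns false instead of writing past the array. */
static bool
toy_push(struct toy_params *p, int value)
{
   if (p->count >= TOY_MAX_PARAMS)
      return false;
   p->values[p->count++] = value;
   return true;
}

/* Build an operand list; compare-and-swap needs one extra operand. */
static bool
toy_build_atomic(struct toy_params *p, bool is_comp_swap,
                 size_t common_operands)
{
   p->count = 0;
   if (is_comp_swap && !toy_push(p, /* compare value */ 0))
      return false;
   for (size_t i = 0; i < common_operands; i++) {
      if (!toy_push(p, (int)i + 1))
         return false;
   }
   return true;
}
```

A checked append like this turns a silent stack overwrite into an explicit failure, which is easier to spot than an intermittent crash.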


Re: [Mesa-dev] [PATCH] gallium/{r600, radeonsi}: Fix segfault with color format (v2)

2017-09-12 Thread Денис Паук

Do you mean deleting the check in u_format.c::util_format_is_supported? Could
you please explain more?

On Wed, Sep 13, 2017 at 1:32 AM, Marek Olšák  wrote:

> On Wed, Sep 13, 2017 at 12:31 AM, Marek Olšák  wrote:
> > I think we shouldn't be getting PIPE_FORMAT_COUNT in
> > is_format_supported in the first place, and therefore drivers don't
> > have to work around it.
>
> Or any other invalid formats, for that matter.
>
> Marek
>
> >
> > Marek
> >
> > On Tue, Sep 12, 2017 at 10:38 PM, Denis Pauk 
> wrote:
> >> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102552
> >>
> >> v2: Patch cleanup proposed by Nicolai Hähnle.
> >> * deleted changes in si_translate_texformat.
> >>
> >> Cc: Nicolai Hähnle 
> >> Cc: Ilia Mirkin 
> >> ---
> >>  src/gallium/auxiliary/util/u_format.c|  4 
> >>  src/gallium/drivers/r600/r600_state_common.c |  4 
> >>  src/gallium/drivers/radeonsi/si_state.c  | 10 +-
> >>  3 files changed, 17 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/src/gallium/auxiliary/util/u_format.c
> b/src/gallium/auxiliary/util/u_format.c
> >> index 3d281905ce..a6d42a428d 100644
> >> --- a/src/gallium/auxiliary/util/u_format.c
> >> +++ b/src/gallium/auxiliary/util/u_format.c
> >> @@ -238,6 +238,10 @@ util_format_is_subsampled_422(enum pipe_format
> format)
> >>  boolean
> >>  util_format_is_supported(enum pipe_format format, unsigned bind)
> >>  {
> >> +   if (format >= PIPE_FORMAT_COUNT) {
> >> +  return FALSE;
> >> +   }
> >> +
> >> if (util_format_is_s3tc(format) && !util_format_s3tc_enabled) {
> >>return FALSE;
> >> }
> >> diff --git a/src/gallium/drivers/r600/r600_state_common.c
> b/src/gallium/drivers/r600/r600_state_common.c
> >> index c1bce8304b..1515c28091 100644
> >> --- a/src/gallium/drivers/r600/r600_state_common.c
> >> +++ b/src/gallium/drivers/r600/r600_state_common.c
> >> @@ -2284,6 +2284,8 @@ uint32_t r600_translate_texformat(struct
> pipe_screen *screen,
> >> format = PIPE_FORMAT_A4R4_UNORM;
> >>
> >> desc = util_format_description(format);
> >> +   if (!desc)
> >> +   goto out_unknown;
> >>
> >> /* Depth and stencil swizzling is handled separately. */
> >> if (desc->colorspace != UTIL_FORMAT_COLORSPACE_ZS) {
> >> @@ -2650,6 +2652,8 @@ uint32_t r600_translate_colorformat(enum
> chip_class chip, enum pipe_format forma
> >> const struct util_format_description *desc =
> util_format_description(format);
> >> int channel = util_format_get_first_non_void_channel(format);
> >> bool is_float;
> >> +   if (!desc)
> >> +   return ~0U;
> >>
> >>  #define HAS_SIZE(x,y,z,w) \
> >> (desc->channel[0].size == (x) && desc->channel[1].size == (y)
> && \
> >> diff --git a/src/gallium/drivers/radeonsi/si_state.c
> b/src/gallium/drivers/radeonsi/si_state.c
> >> index ee070107fd..f7ee24bdc6 100644
> >> --- a/src/gallium/drivers/radeonsi/si_state.c
> >> +++ b/src/gallium/drivers/radeonsi/si_state.c
> >> @@ -1292,6 +1292,8 @@ static void si_emit_db_render_state(struct
> si_context *sctx, struct r600_atom *s
> >>  static uint32_t si_translate_colorformat(enum pipe_format format)
> >>  {
> >> const struct util_format_description *desc =
> util_format_description(format);
> >> +   if (!desc)
> >> +   return V_028C70_COLOR_INVALID;
> >>
> >>  #define HAS_SIZE(x,y,z,w) \
> >> (desc->channel[0].size == (x) && desc->channel[1].size == (y)
> && \
> >> @@ -1796,7 +1798,11 @@ static unsigned si_tex_dim(struct si_screen
> *sscreen, struct r600_texture *rtex,
> >>
> >>  static bool si_is_sampler_format_supported(struct pipe_screen
> *screen, enum pipe_format format)
> >>  {
> >> -   return si_translate_texformat(screen, format,
> util_format_description(format),
> >> +   struct util_format_description *desc = util_format_description(
> format);
> >> +   if (!desc)
> >> +   return false;
> >> +
> >> +   return si_translate_texformat(screen, format, desc,
> >>   
> >> util_format_get_first_non_void_channel(format))
> != ~0U;
> >>  }
> >>
> >> @@ -1925,6 +1931,8 @@ static unsigned si_is_vertex_format_supported(struct
> pipe_screen *screen,
> >>   PIPE_BIND_VERTEX_BUFFER)) == 0);
> >>
> >> desc = util_format_description(format);
> >> +   if (!desc)
> >> +   return 0;
> >>
> >> /* There are no native 8_8_8 or 16_16_16 data formats, and we
> currently
> >>  * select 8_8_8_8 and 16_16_16_16 instead. This works
> reasonably well
> >> --
> >> 2.14.1
> >>
>



-- 
Best regards,
  Denis.
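
The thread above weighs two places to reject an out-of-range format: a single range check before the description lookup, or NULL checks in each driver function. A minimal sketch of the lookup side, using a hypothetical toy enum and table rather than the real pipe_format or util_format_description:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy format enum; the final entry is a count, not a format. */
enum toy_format {
   TOY_FORMAT_NONE,
   TOY_FORMAT_RGBA8,
   TOY_FORMAT_R5G6B5,
   TOY_FORMAT_COUNT
};

struct toy_format_desc {
   unsigned bits_per_pixel;
};

static const struct toy_format_desc toy_table[TOY_FORMAT_COUNT] = {
   [TOY_FORMAT_NONE]   = { 0 },
   [TOY_FORMAT_RGBA8]  = { 32 },
   [TOY_FORMAT_R5G6B5] = { 16 },
};

/* Return the description, or NULL for anything outside the valid range. */
static const struct toy_format_desc *
toy_format_description(enum toy_format format)
{
   if ((unsigned)format >= TOY_FORMAT_COUNT)
      return NULL;
   return &toy_table[format];
}

/* Reject invalid formats up front instead of dereferencing NULL later. */
static bool
toy_is_format_supported(enum toy_format format)
{
   const struct toy_format_desc *desc = toy_format_description(format);
   return desc != NULL && desc->bits_per_pixel != 0;
}
```

If callers never pass the count value in the first place, as suggested in the quoted reply, the per-driver NULL checks become unnecessary.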

[Mesa-dev] [PATCH] radv/gfx9: fix image resource handling.

2017-09-12 Thread Dave Airlie

From: Dave Airlie 

GFX9 changes how images are laid out, so this needs updating.

Fixes: dEQP-VK.query_pool.statistics_query.*

CC: "17.2" 
---
 src/amd/vulkan/radv_image.c | 27 +++
 1 file changed, 19 insertions(+), 8 deletions(-)

diff --git a/src/amd/vulkan/radv_image.c b/src/amd/vulkan/radv_image.c
index df28866..46b6205 100644
--- a/src/amd/vulkan/radv_image.c
+++ b/src/amd/vulkan/radv_image.c
@@ -1059,23 +1059,34 @@ radv_DestroyImage(VkDevice _device, VkImage _image,
 }
 
 void radv_GetImageSubresourceLayout(
-   VkDevice device,
+   VkDevice _device,
VkImage _image,
const VkImageSubresource* pSubresource,
VkSubresourceLayout* pLayout)
 {
RADV_FROM_HANDLE(radv_image, image, _image);
+   RADV_FROM_HANDLE(radv_device, device, _device);
int level = pSubresource->mipLevel;
int layer = pSubresource->arrayLayer;
struct radeon_surf *surface = &image->surface;
 
-   pLayout->offset = surface->u.legacy.level[level].offset + 
surface->u.legacy.level[level].slice_size * layer;
-   pLayout->rowPitch = surface->u.legacy.level[level].nblk_x * 
surface->bpe;
-   pLayout->arrayPitch = surface->u.legacy.level[level].slice_size;
-   pLayout->depthPitch = surface->u.legacy.level[level].slice_size;
-   pLayout->size = surface->u.legacy.level[level].slice_size;
-   if (image->type == VK_IMAGE_TYPE_3D)
-   pLayout->size *= u_minify(image->info.depth, level);
+   if (device->physical_device->rad_info.chip_class >= GFX9) {
+   pLayout->offset = surface->u.gfx9.offset[level] + 
surface->u.gfx9.surf_slice_size * layer;
+   pLayout->rowPitch = surface->u.gfx9.surf_pitch * surface->bpe;
+   pLayout->arrayPitch = surface->u.gfx9.surf_slice_size;
+   pLayout->depthPitch = surface->u.gfx9.surf_slice_size;
+   pLayout->size = surface->u.gfx9.surf_slice_size;
+   if (image->type == VK_IMAGE_TYPE_3D)
+   pLayout->size *= u_minify(image->info.depth, level);
+   } else {
+   pLayout->offset = surface->u.legacy.level[level].offset + 
surface->u.legacy.level[level].slice_size * layer;
+   pLayout->rowPitch = surface->u.legacy.level[level].nblk_x * 
surface->bpe;
+   pLayout->arrayPitch = surface->u.legacy.level[level].slice_size;
+   pLayout->depthPitch = surface->u.legacy.level[level].slice_size;
+   pLayout->size = surface->u.legacy.level[level].slice_size;
+   if (image->type == VK_IMAGE_TYPE_3D)
+   pLayout->size *= u_minify(image->info.depth, level);
+   }
 }
 
 
-- 
2.9.3
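
The patch above selects between per-level (legacy) and whole-surface (GFX9) fields when filling in the layout. A minimal sketch of that branch, with a hypothetical simplified surface struct whose field names are illustrative and not those of radeon_surf:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified surface description. */
struct toy_surface {
   uint32_t bpe;                  /* bytes per element */
   /* GFX9 style: one pitch and slice size for the whole surface */
   uint64_t gfx9_offset[4];
   uint32_t gfx9_pitch;
   uint64_t gfx9_slice_size;
   /* legacy style: separate values for each mip level */
   uint64_t legacy_offset[4];
   uint32_t legacy_nblk_x[4];
   uint64_t legacy_slice_size[4];
};

struct toy_layout {
   uint64_t offset;
   uint64_t row_pitch;
   uint64_t array_pitch;
};

/* Compute the layout of one (level, layer) subresource. */
static struct toy_layout
toy_subresource_layout(const struct toy_surface *s, bool is_gfx9,
                       unsigned level, unsigned layer)
{
   struct toy_layout out;

   if (is_gfx9) {
      out.offset = s->gfx9_offset[level] + s->gfx9_slice_size * layer;
      out.row_pitch = (uint64_t)s->gfx9_pitch * s->bpe;
      out.array_pitch = s->gfx9_slice_size;
   } else {
      out.offset = s->legacy_offset[level] +
                   s->legacy_slice_size[level] * layer;
      out.row_pitch = (uint64_t)s->legacy_nblk_x[level] * s->bpe;
      out.array_pitch = s->legacy_slice_size[level];
   }
   return out;
}
```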


[Mesa-dev] [PATCH] radv/gfx9: set mip0-depth correctly for 2d arrays/3d images

2017-09-12 Thread Dave Airlie

From: Dave Airlie 

This field covers the whole resource.

Fixes:
dEQP-VK.pipeline.image.suballocation.sampling_type.combined.view_type.3d.format.*
dEQP-VK.texture.filtering.3d.combinations.*

Cc: "17.2" 
Signed-off-by: Dave Airlie 
---
 src/amd/vulkan/radv_device.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
index 6b96a3d..3c512bd 100644
--- a/src/amd/vulkan/radv_device.c
+++ b/src/amd/vulkan/radv_device.c
@@ -3094,8 +3094,8 @@ radv_initialise_color_surface(struct radv_device *device,
}
 
if (device->physical_device->rad_info.chip_class >= GFX9) {
-   uint32_t max_slice = radv_surface_layer_count(iview);
-   unsigned mip0_depth = iview->base_layer + max_slice - 1;
+   unsigned mip0_depth = iview->image->type == VK_IMAGE_TYPE_3D ?
+ (iview->extent.depth - 1) : (iview->image->info.array_size - 
1);
 
cb->cb_color_view |= S_028C6C_MIP_LEVEL(iview->base_mip);
cb->cb_color_attrib |= S_028C74_MIP0_DEPTH(mip0_depth) |
-- 
2.9.3
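
The change above derives the field from the whole resource rather than from the view's layer range. A minimal sketch contrasting the two computations, with hypothetical helper names and plain integers standing in for the radv image and view structures:

```c
#include <stdbool.h>
#include <stdint.h>

/* Old computation: last layer covered by the view. */
static uint32_t
toy_old_mip0_depth(uint32_t base_layer, uint32_t layer_count)
{
   return base_layer + layer_count - 1;
}

/* New computation: extent of the whole resource, minus one. */
static uint32_t
toy_new_mip0_depth(bool is_3d, uint32_t depth, uint32_t array_size)
{
   return is_3d ? depth - 1 : array_size - 1;
}
```

For a view of layers 2 and 3 in an 8-layer array, the old form yields 3 while the new form yields 7, which matches the statement that the field covers the whole resource.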


Re: [Mesa-dev] [RFC] NIR serialization

2017-09-12 Thread Connor Abbott

I think the arguments for doing NIR serialization and deserialization are
pretty persuasive. I've started a skeleton of a NIR serialization
implementation at
https://cgit.freedesktop.org/~cwabbott0/mesa/log/?h=nir-serialize. Note
that filling this in by following nir_clone should be mostly mechanical and
straightforward, and easy to test too: we already have the NIR_TEST_CLONE
environment variable that clones the NIR before each pass, so we can just
extend that to optionally serialize and then immediately deserialize
instead, and then test i965 piglit for regressions. Of course, this will
also have to be hooked into mesa/st and i965 to be useful. Not sure how
much time I'll have to work on it, but it should be relatively easy for
someone else to pick up, even as a first project with Mesa/NIR.
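
The testing idea above (serialize, immediately deserialize, then compare) can be sketched on a toy data structure. Everything below is hypothetical: toy_shader, toy_serialize and toy_deserialize are illustrative names with a made-up binary layout, and none of it reflects the NIR data structures or the skeleton branch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_MAX_INSTRS 8

struct toy_instr {
   uint32_t opcode;
   uint32_t operand;
};

struct toy_shader {
   uint32_t num_instrs;
   struct toy_instr instrs[TOY_MAX_INSTRS];
};

/* Write the shader to buf; returns bytes written, or 0 on failure. */
static size_t
toy_serialize(const struct toy_shader *s, uint8_t *buf, size_t cap)
{
   if (s->num_instrs > TOY_MAX_INSTRS)
      return 0;
   size_t body = s->num_instrs * sizeof(struct toy_instr);
   size_t need = sizeof(uint32_t) + body;
   if (need > cap)
      return 0;
   memcpy(buf, &s->num_instrs, sizeof(uint32_t));
   memcpy(buf + sizeof(uint32_t), s->instrs, body);
   return need;
}

/* Rebuild a shader from buf; rejects truncated or oversized input. */
static bool
toy_deserialize(struct toy_shader *out, const uint8_t *buf, size_t len)
{
   uint32_t n;
   if (len < sizeof(n))
      return false;
   memcpy(&n, buf, sizeof(n));
   if (n > TOY_MAX_INSTRS || len != sizeof(n) + n * sizeof(struct toy_instr))
      return false;
   memset(out, 0, sizeof(*out));
   out->num_instrs = n;
   memcpy(out->instrs, buf + sizeof(n), n * sizeof(struct toy_instr));
   return true;
}

/* Round-trip check in the spirit of a clone test. */
static bool
toy_round_trip_ok(const struct toy_shader *s)
{
   uint8_t buf[sizeof(uint32_t) + TOY_MAX_INSTRS * sizeof(struct toy_instr)];
   struct toy_shader copy;
   size_t len = toy_serialize(s, buf, sizeof(buf));

   if (len == 0 || !toy_deserialize(&copy, buf, len))
      return false;
   return copy.num_instrs == s->num_instrs &&
          memcmp(copy.instrs, s->instrs,
                 s->num_instrs * sizeof(struct toy_instr)) == 0;
}
```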



On Wed, Sep 6, 2017 at 5:12 PM, Daniel Schürmann <
daniel.schuerm...@campus.tu-berlin.de> wrote:

> Hello together!
> Recently, we had a small discussion (off the list) about the NIR
> serialization, which was previously discussed in [RFC] ARB_gl_spirv and NIR
> backend for radeonsi.
> As this topic could be interesting to more people, I would like to share,
> what was talked about so far (You might want to read from bottom up).
>
> TL;DR:
> - NIR serialization is in demand for shader cache
> - could be done either directly (NIR binary form) or via SPIR-V
> - Ian et al. are working on GLSL IR -> SPIR-V transformation, which could
> be adapted for a NIR -> SPIR-V pass
> - in NIR representation, some type information is lost
> - thus, a serialization via SPIR-V could NOT be a glslang alternative
> (otoh, the GLSL IR->SPIR-V pass could), but only for spirv-opt (if the
> output is valid SPIR-V)
> - now, the question is if this is worth the additional effort
>
> Kind regards,
> Daniel
>
> -------- Forwarded Message --------
> Subject: Re: NIR serialization
> Date: Tue, 5 Sep 2017 11:00:31 -0700
> From: Ian Romanick  
> To: Daniel Schürmann 
> , Nicolai Hähnle
>  , Timothy Arceri
>  
>
> Sorry for taking so long to reply.  It was a long holiday weekend in the
> US, and I was away.
>
> On 09/01/2017 05:03 AM, Daniel Schürmann wrote:
> > A direct NIR binary serialization would also do the job (vc4/freedreno
> > was mentioned as well).
> > I only thought that SPIRV is preferable because
> > - deserialization for free
> > - cached shader size
> > - spirv-opt and glslang alternative
> >
> > The term lossy doesn't make much sense to me with regard to
> > optimizations: aren't all optimizations lossy?
>
> By lossy I mean there is a significant  semantic change.  As soon as
> GLSL IR is converted to NIR, Boolean types completely cease to exist.
> They are replaced with integers that are either 0 or -1.  Similarly, all
> matrix types cease to exist.  They are replaced by a set of vectors.
>
> For the purpose of the on-disk cache, this probably doesn't matter.  It
> does mean that additional information about, for example, types of
> uniforms has to be tracked.  In a direct GLSL IR to SPIR-V translation,
> type information is maintained, so the SPIR-V has all the necessary
> information.
>
> As a glslang replacement, maintaining type information is an absolute
> requirement.  Users will use other tools to introspect the SPIR-V shader
> to find locations of uniforms, shader inputs, offsets of values in UBOs,
> etc.  If the types are changed in the SPIR-V shader that we emit, none
> of that will work.  I plan to enable retrieval of portable SPIR-V both
> from a Mesa driver and the standalone GLSL compiler.
>
> Right now SPIR-V binaries will be quite large.  I have several ideas
> that I plan to implement once we have OpenGL 4.6 done that should
> dramatically reduce the size of SPIR-V... I'm actually hoping to present
> that at FOSDEM.
>
> > The primary goal would be the lossless NIR-SPIRV-NIR round-trip.
> > Secondary, it would be desirable if we achieve valid SPIRV binaries
> > which preserve the semantics of the original shader.
> > And here is the question if this is possible with the type information
> > that are available...
> >
> > Ian: can you hint me to your repository? I couldn't find it.
> https://cgit.freedesktop.org/~idr/mesa/log/?h=emit-spirv
>
> > Kind regards,
> >
> > Daniel
> >
> >
> > On 09/01/2017 12:16 PM, Nicolai Hähnle wrote:
> >> In addition to using NIR-based optimizations, I believe Timothy
> >> mentioned that a method for serializing NIR would help the shader disk
> >> cache of i965. It would certainly help radeonsi if/when we switch to
> >> the NIR backend, because we could compile new shader variants without
> >> falling back all the way to GLSL. For that, a lossless NIR-SPIRV-NIR
> >> path would do the job.
> >>
> >> Not that falling back all the way to GLSL from radeonsi is impossible,
> >> but it

Re: [Mesa-dev] [PATCH 1/2] radv/nir: call opt_remove_phis after trivial continues.

2017-09-12 Thread Timothy Arceri




On 13/09/17 13:52, Timothy Arceri wrote:
> On 13/09/17 13:48, Dave Airlie wrote:
>> On 13 September 2017 at 13:42, Timothy Arceri wrote:
>>> On 13/09/17 12:57, Dave Airlie wrote:
>>>> From: Dave Airlie
>>>>
>>>> With the shaders in the ssao demo, the nir_opt_if wasn't
>>>> working properly without this, after this the if gets optimised
>>>> so that loop unrolling gets called.
>>>>
>>>> (loop unrolling fails due to instruction count, but at least
>>>> it gets to do that.)
>>>>
>>>> Signed-off-by: Dave Airlie
>>>> ---
>>>>   src/amd/vulkan/radv_shader.c | 1 +
>>>>   1 file changed, 1 insertion(+)
>>>>
>>>> diff --git a/src/amd/vulkan/radv_shader.c b/src/amd/vulkan/radv_shader.c
>>>> index 1e25ea3..87deb7c 100644
>>>> --- a/src/amd/vulkan/radv_shader.c
>>>> +++ b/src/amd/vulkan/radv_shader.c
>>>> @@ -129,6 +129,7 @@ radv_optimize_nir(struct nir_shader *shader)
>>>>   if (nir_opt_trivial_continues(shader)) {
>>>>   progress = true;
>>>>   NIR_PASS(progress, shader, nir_copy_prop);
>>>> +   NIR_PASS(progress, shader, nir_opt_remove_phis);
>>>
>>> Any reason for not just putting this in the main nir opt loop rather than
>>> inside this if?
>>
>> It's already in there.
>>
>> This is adding it after the second copy_prop.
>
> Right, I just noticed that. It seems i965 just calls it once, but later on.
> I wonder if radv should just do the same.

Or do we need to do this before something else kicks in? If that's the case:

Reviewed-by: Timothy Arceri

>> Dave.
>>
>>>>   NIR_PASS(progress, shader, nir_opt_dce);
>>>>   }
>>>>   NIR_PASS(progress, shader, nir_opt_if);




Re: [Mesa-dev] [PATCH 1/2] radv/nir: call opt_remove_phis after trivial continues.

2017-09-12 Thread Timothy Arceri




On 13/09/17 13:48, Dave Airlie wrote:
> On 13 September 2017 at 13:42, Timothy Arceri wrote:
>> On 13/09/17 12:57, Dave Airlie wrote:
>>> From: Dave Airlie
>>>
>>> With the shaders in the ssao demo, the nir_opt_if wasn't
>>> working properly without this, after this the if gets optimised
>>> so that loop unrolling gets called.
>>>
>>> (loop unrolling fails due to instruction count, but at least
>>> it gets to do that.)
>>>
>>> Signed-off-by: Dave Airlie
>>> ---
>>>   src/amd/vulkan/radv_shader.c | 1 +
>>>   1 file changed, 1 insertion(+)
>>>
>>> diff --git a/src/amd/vulkan/radv_shader.c b/src/amd/vulkan/radv_shader.c
>>> index 1e25ea3..87deb7c 100644
>>> --- a/src/amd/vulkan/radv_shader.c
>>> +++ b/src/amd/vulkan/radv_shader.c
>>> @@ -129,6 +129,7 @@ radv_optimize_nir(struct nir_shader *shader)
>>>   if (nir_opt_trivial_continues(shader)) {
>>>   progress = true;
>>>   NIR_PASS(progress, shader, nir_copy_prop);
>>> +   NIR_PASS(progress, shader, nir_opt_remove_phis);
>>
>> Any reason for not just putting this in the main nir opt loop rather than
>> inside this if?
>
> It's already in there.
>
> This is adding it after the second copy_prop.

Right, I just noticed that. It seems i965 just calls it once, but later on.
I wonder if radv should just do the same.

> Dave.
>
>>>   NIR_PASS(progress, shader, nir_opt_dce);
>>>   }
>>>   NIR_PASS(progress, shader, nir_opt_if);




Re: [Mesa-dev] [PATCH 1/2] radv/nir: call opt_remove_phis after trivial continues.

2017-09-12 Thread Dave Airlie

On 13 September 2017 at 13:42, Timothy Arceri  wrote:
>
> On 13/09/17 12:57, Dave Airlie wrote:
>>
>> From: Dave Airlie 
>>
>> With the shaders in the ssao demo, the nir_opt_if wasn't
>> working properly without this, after this the if gets optimised
>> so that loop unrolling gets called.
>>
>> (loop unrolling fails due to instruction count, but at least
>> it gets to do that.)
>>
>> Signed-off-by: Dave Airlie 
>> ---
>>   src/amd/vulkan/radv_shader.c | 1 +
>>   1 file changed, 1 insertion(+)
>>
>> diff --git a/src/amd/vulkan/radv_shader.c b/src/amd/vulkan/radv_shader.c
>> index 1e25ea3..87deb7c 100644
>> --- a/src/amd/vulkan/radv_shader.c
>> +++ b/src/amd/vulkan/radv_shader.c
>> @@ -129,6 +129,7 @@ radv_optimize_nir(struct nir_shader *shader)
>>   if (nir_opt_trivial_continues(shader)) {
>>   progress = true;
>>   NIR_PASS(progress, shader, nir_copy_prop);
>> +   NIR_PASS(progress, shader, nir_opt_remove_phis);
>
>
> Any reason for not just putting this in the main nir opt loop rather than
> inside this if?

It's already in there.

This is adding it after the second copy_prop.

Dave.
>
>
>>   NIR_PASS(progress, shader, nir_opt_dce);
>>   }
>>   NIR_PASS(progress, shader, nir_opt_if);
>>
>

Re: [Mesa-dev] [PATCH 2/2] [rfc] nir: bump unroll instruction count to 96.

2017-09-12 Thread Timothy Arceri


On 13/09/17 12:57, Dave Airlie wrote:
> From: Dave Airlie
>
> This gets the ssao demo from 400->440 fps on radv with the
> previous patch.
>
> Now the demo does a 0->32 loop across a ubo with 32 members,
> I don't know if we still have that sort of information available
> about the UBO in question at this stage. Maybe someone more
> familiar with spir-v/nir can tell if we can access that info
> then we can force a loop unroll like we do for var arrays.
>
> Signed-off-by: Dave Airlie
> ---
>   src/compiler/nir/nir_opt_loop_unroll.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/compiler/nir/nir_opt_loop_unroll.c b/src/compiler/nir/nir_opt_loop_unroll.c
> index 79d04f9..6158d58 100644
> --- a/src/compiler/nir/nir_opt_loop_unroll.c
> +++ b/src/compiler/nir/nir_opt_loop_unroll.c
> @@ -34,7 +34,7 @@
>    * loops that would unroll with GLSL IR fail to unroll if we set this to 25 so
>    * we set it to 26.
>    */
> -#define LOOP_UNROLL_LIMIT 26
> +#define LOOP_UNROLL_LIMIT 96

The main reason these limits exist is that they existed in GLSL IR.
In GLSL IR the chosen limits reflect an issue with the speed at which
the loop opt pass can run rather than an issue with the size of the shader
output, so there isn't any reason I've seen to keep them so low. On the
other hand, maybe detecting UBO indexing would be enough in most cases
anyway.

>
>   /* Prepare this loop for unrolling by first converting to lcssa and then
>    * converting the phis from the loops first block and the block that follows


Re: [Mesa-dev] [PATCH] i965 : optimized bucket index calculation

2017-09-12 Thread Marathe, Yogesh

>-Original Message-
>From: Matt Turner [mailto:matts...@gmail.com]
>
>On Tue, Sep 12, 2017 at 10:19 AM, Ian Romanick  wrote:
>> On 09/12/2017 02:40 AM, Marathe, Yogesh wrote:
>>> Hi Jason,
>>>
>>>
>>>
>>> On the asserts you’ve mentioned below, I assume we need to add them
>>> after ‘bufmgr->num_buckets++’ in add_bucket() as num_buckets could be
>>> 0 initially. Another clarification on ~1%, we meant approx. 1% there,
>>> that’s an improvement we saw in 3Dmark total not a degradation, we’ll
>>> correct it in commit msg.
>>
>> I think the problem is that there is insufficient information about
>> your data.  What we want to see in a commit message is something like:
>>
>> commit 5ae2de81c8350272c122ea38e6bb4c0a41d58921
>> Author: Kenneth Graunke 
>> Date:   Mon Aug 28 16:08:32 2017 -0700
>>
>> i965: Use BLORP for buffer object stall avoidance blits instead of BLT.
>>
>> Improves performance of GFXBench4 tests at 1024x768 on a Kabylake GT2:
>> - Manhattan 3.1 by 1.32134% +/- 0.322734% (n=8).
>> - Car Chase by 1.25607% +/- 0.291262% (n=5).
>>
>> Reviewed-by: Jason Ekstrand 
>>
>> The important bits are:
>>
>> - average improvement
>> - statistical deviation
>> - number of runs
>
>And for generating such data, we often use http://anholt.net/compare-perf/

Ok. Thanks Matt and Ian, we'll look at it and update commit msg accordingly.
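
The three quantities requested above (average improvement, deviation, number of runs) can be computed from per-run measurements. A minimal sketch with hypothetical names (toy_stats, toy_summarize); it reports the sample variance, whose square root is the sample standard deviation, and it does not attempt to reproduce whatever statistic compare-perf prints after "+/-":

```c
#include <stddef.h>

struct toy_stats {
   double mean;
   double variance;   /* sample variance, with n - 1 in the denominator */
   size_t n;
};

/* Summarize n per-run measurements (for example, percent improvements). */
static struct toy_stats
toy_summarize(const double *samples, size_t n)
{
   struct toy_stats out = { 0.0, 0.0, n };
   double sum = 0.0;
   double sq = 0.0;

   if (n == 0)
      return out;

   for (size_t i = 0; i < n; i++)
      sum += samples[i];
   out.mean = sum / (double)n;

   if (n > 1) {
      for (size_t i = 0; i < n; i++) {
         double d = samples[i] - out.mean;
         sq += d * d;
      }
      out.variance = sq / (double)(n - 1);
   }
   return out;
}
```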

Re: [Mesa-dev] [PATCH 1/2] radv/nir: call opt_remove_phis after trivial continues.

2017-09-12 Thread Timothy Arceri



On 13/09/17 12:57, Dave Airlie wrote:
> From: Dave Airlie
>
> With the shaders in the ssao demo, the nir_opt_if wasn't
> working properly without this, after this the if gets optimised
> so that loop unrolling gets called.
>
> (loop unrolling fails due to instruction count, but at least
> it gets to do that.)
>
> Signed-off-by: Dave Airlie
> ---
>   src/amd/vulkan/radv_shader.c | 1 +
>   1 file changed, 1 insertion(+)
>
> diff --git a/src/amd/vulkan/radv_shader.c b/src/amd/vulkan/radv_shader.c
> index 1e25ea3..87deb7c 100644
> --- a/src/amd/vulkan/radv_shader.c
> +++ b/src/amd/vulkan/radv_shader.c
> @@ -129,6 +129,7 @@ radv_optimize_nir(struct nir_shader *shader)
>   if (nir_opt_trivial_continues(shader)) {
>   progress = true;
>   NIR_PASS(progress, shader, nir_copy_prop);
> +   NIR_PASS(progress, shader, nir_opt_remove_phis);

Any reason for not just putting this in the main nir opt loop rather
than inside this if?

>   NIR_PASS(progress, shader, nir_opt_dce);
>   }
>   NIR_PASS(progress, shader, nir_opt_if);



Re: [Mesa-dev] [PATCH 00/10] swr: update rasterizer

2017-09-12 Thread Cherniak, Bruce

Reviewed-by: Bruce Cherniak  

> On Sep 11, 2017, at 2:28 PM, Tim Rowley  wrote:
> 
> Mostly some api changes, plus making the cpu topology code a bit more
> robust in the face of some odd configurations seen in virtualized
> environments.
> 
> No piglit or vtk ctest regressions.
> 
> Tim Rowley (10):
>  swr/rast: Add new API SwrStallBE
>  swr/rast: Move clip/cull enables in API
>  swr/rast: Start to remove hardcoded clipcull_dist vertex attrib slot
>  swr/rast: Remove hardcoded clip/cull slot from clipper
>  swr/rast: Migrate memory pointers to gfxptr_t type
>  swr/rast: add graph write to jit debug putput
>  swr/rast: whitespace changes
>  swr/rast: Missed conversion to SIMD_T
>  swr/rast: adjust linux cpu topology identification code
>  swr/rast: Fetch compile state changes
> 
> .../swr/rasterizer/codegen/gen_llvm_types.py   |  2 +-
> src/gallium/drivers/swr/rasterizer/core/api.cpp|  9 +++
> src/gallium/drivers/swr/rasterizer/core/api.h  |  8 +++
> .../drivers/swr/rasterizer/core/backend.cpp|  4 +-
> .../drivers/swr/rasterizer/core/backend_impl.h |  2 +-
> .../drivers/swr/rasterizer/core/backend_sample.cpp |  4 +-
> .../swr/rasterizer/core/backend_singlesample.cpp   |  4 +-
> src/gallium/drivers/swr/rasterizer/core/binner.cpp | 25 +++
> src/gallium/drivers/swr/rasterizer/core/clip.h | 57 ---
> .../drivers/swr/rasterizer/core/rasterizer.cpp |  2 +-
> src/gallium/drivers/swr/rasterizer/core/state.h| 16 +++--
> .../drivers/swr/rasterizer/core/threads.cpp| 81 ++
> .../drivers/swr/rasterizer/jitter/JitManager.cpp   |  6 +-
> .../drivers/swr/rasterizer/jitter/fetch_jit.cpp| 12 +++-
> .../drivers/swr/rasterizer/jitter/fetch_jit.h  |  7 +-
> .../drivers/swr/rasterizer/jitter/jit_api.h|  2 +
> .../drivers/swr/rasterizer/memory/StoreTile.h  |  4 +-
> .../swr/rasterizer/memory/TilingFunctions.h|  2 +-
> src/gallium/drivers/swr/swr_context.cpp| 18 ++---
> src/gallium/drivers/swr/swr_draw.cpp   |  8 +--
> src/gallium/drivers/swr/swr_resource.h |  2 +-
> src/gallium/drivers/swr/swr_screen.cpp | 21 +++---
> src/gallium/drivers/swr/swr_state.cpp  | 31 +
> 23 files changed, 182 insertions(+), 145 deletions(-)
> 
> -- 
> 2.7.4
> 

[Mesa-dev] [PATCH 2/2] [rfc] nir: bump unroll instruction count to 96.

2017-09-12 Thread Dave Airlie

From: Dave Airlie 

This gets the ssao demo from 400->440 fps on radv with the
previous patch.

Now the demo does a 0->32 loop across a ubo with 32 members,
I don't know if we still have that sort of information available
about the UBO in question at this stage. Maybe someone more
familiar with spir-v/nir can tell if we can access that info
then we can force a loop unroll like we do for var arrays.

Signed-off-by: Dave Airlie 
---
 src/compiler/nir/nir_opt_loop_unroll.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/compiler/nir/nir_opt_loop_unroll.c 
b/src/compiler/nir/nir_opt_loop_unroll.c
index 79d04f9..6158d58 100644
--- a/src/compiler/nir/nir_opt_loop_unroll.c
+++ b/src/compiler/nir/nir_opt_loop_unroll.c
@@ -34,7 +34,7 @@
  * loops that would unroll with GLSL IR fail to unroll if we set this to 25 so
  * we set it to 26.
  */
-#define LOOP_UNROLL_LIMIT 26
+#define LOOP_UNROLL_LIMIT 96
 
 /* Prepare this loop for unrolling by first converting to lcssa and then
  * converting the phis from the loops first block and the block that follows
-- 
2.9.4
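
The limit being raised acts as an instruction budget for unrolling. A simplified sketch of such a budget check follows, under the assumption that the cost is the loop body size times the trip count. The names are hypothetical, and the real condition in nir_opt_loop_unroll.c may combine the limit with other driver options.

```c
#include <stdbool.h>

#define TOY_UNROLL_LIMIT 96

/* Decide whether a loop fits the unroll budget.
 * force_unroll models cases that are unrolled regardless of size.
 */
static bool
toy_should_unroll(unsigned num_instrs, unsigned trip_count, bool force_unroll)
{
   if (force_unroll)
      return true;
   return num_instrs * trip_count <= TOY_UNROLL_LIMIT;
}
```

Under this assumed cost model a 32-iteration loop fits only if its body has at most three instructions, which is why forcing the unroll for known UBO indexing is raised as an alternative to a larger limit.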


[Mesa-dev] [PATCH 1/2] radv/nir: call opt_remove_phis after trivial continues.

2017-09-12 Thread Dave Airlie

From: Dave Airlie 

With the shaders in the ssao demo, the nir_opt_if wasn't
working properly without this, after this the if gets optimised
so that loop unrolling gets called.

(loop unrolling fails due to instruction count, but at least
it gets to do that.)

Signed-off-by: Dave Airlie 
---
 src/amd/vulkan/radv_shader.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/amd/vulkan/radv_shader.c b/src/amd/vulkan/radv_shader.c
index 1e25ea3..87deb7c 100644
--- a/src/amd/vulkan/radv_shader.c
+++ b/src/amd/vulkan/radv_shader.c
@@ -129,6 +129,7 @@ radv_optimize_nir(struct nir_shader *shader)
 if (nir_opt_trivial_continues(shader)) {
 progress = true;
 NIR_PASS(progress, shader, nir_copy_prop);
+   NIR_PASS(progress, shader, nir_opt_remove_phis);
 NIR_PASS(progress, shader, nir_opt_dce);
 }
 NIR_PASS(progress, shader, nir_opt_if);
-- 
2.9.4


[Mesa-dev] [Bug 102639] BadLength (poly request too large or internal Xlib length erro

2017-09-12 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=102639

thomas  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |INVALID

--- Comment #2 from thomas  ---
The cirrus module was not loading automatically.
Once loaded, the problem disappears. Sorry to have bothered you.


Re: [Mesa-dev] [PATCH] radeonsi/uvd: fix interlaced video buffer height alignment

2017-09-12 Thread Leo Liu




On 2017-09-12 02:39 PM, Christian König wrote:





The problem is:

In si_uvd.c

struct pipe_video_buffer *si_video_buffer_create(struct pipe_context 
*pipe,

 const struct pipe_video_buffer *tmpl)
{
struct pipe_video_buffer template;

template.height = align(tmpl->height / array_size, 
VL_MACROBLOCK_HEIGHT);



The original info with the right height is in the tmpl, and using that
was my first thought for dealing with the issue.


But when you keep looking at the code, the tmpl gets discarded, leaving
a new template with a 32-aligned height.


The video buffer was created based on this new template.


and there are the pipe_resource->width/height which are aligned so 
that the hardware can deal with them.


The video buffer and the pipe buffer are the same; they both got aligned.


Ok, then that is most likely the root problem.


Then how about adding "video_width" and "video_height" members to
"struct pipe_video_buffer"?



Regards,
Leo




This shouldn't be the case IIRC.

Anyway, feel free to go ahead with your original patch; as you noted,
better not to touch that too much or a lot of things might break.


We should just test with some low res MPEG2 stream to see if the 
standard PAL/NTSC formats still work.


Regards,
Christian.
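
To make the arithmetic in this thread concrete, here is a minimal sketch of the per-layer height computation quoted from si_uvd.c above. It assumes VL_MACROBLOCK_HEIGHT is 16 and that an interlaced buffer uses array_size 2 (one layer per field); align_pot() and layer_height() are hypothetical stand-ins, not the driver's functions.

```c
#include <assert.h>

#define MACROBLOCK_HEIGHT 16 /* assumed value of VL_MACROBLOCK_HEIGHT */

/* Round value up to a power-of-two alignment. */
unsigned
align_pot(unsigned value, unsigned alignment)
{
   assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
   return (value + alignment - 1) & ~(alignment - 1);
}

/* Height of one layer of the video buffer, mirroring
 * align(tmpl->height / array_size, VL_MACROBLOCK_HEIGHT). */
unsigned
layer_height(unsigned frame_height, unsigned array_size)
{
   assert(array_size != 0);
   return align_pot(frame_height / array_size, MACROBLOCK_HEIGHT);
}
```

For a 1080-line interlaced stream this gives 544 lines per field, so the two fields together cover 1088 lines, which is the 32-aligned height mentioned above. The original 1080 is no longer recoverable from the new template alone.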




Re: [Mesa-dev] [Intel-gfx] [PATCH 1/2] drm/i915/kbl: Remove unused Kabylake pci ids

2017-09-12 Thread Rodrigo Vivi

On Tue, Sep 12, 2017 at 08:30:47PM +, Paulo Zanoni wrote:
> Em Seg, 2017-09-11 às 10:10 -0700, Rodrigo Vivi escreveu:
> > On Mon, Sep 11, 2017 at 04:11:33PM +, Anuj Phogat wrote:
> > > See Mesa commits: ebc5ccf and b2dae9f
> > 
> > I believe we need to be in sync between multiple gfx stack
> > components,
> > but I  don't believe we should remove ids.
> > 
> > In the past we had cases where we noticed a product group using a
> > listed
> > id to do a product and we just noticed the id after a user reported
> > at fd.o.
> 
> On the other hand, don't we have the risk that someone is going to see
> that these IDs are unused for KBL and then repurpose them on some
> future non-KBL product?

There is only a risk if the id was removed from the spec. But when that
happens I'm in favor of removing it from the components as well.
While it is listed there, even without POR, it is reserved.

> 
> > 
> > For us in kernel the cycle until that id gets into a stable release
> > propagated to OSVs distros can be a bit long.
> > 
> > Also Xserver ids are nowadays in sync with Mesa ones and I believe
> > some
> > OSVs might take a while to upgrade the Xserver as well in case of a
> > new
> > found product with some "new" id.
> > 
> > For this reason I was always in favor of adding all possible reserved
> > ids from the
> > beginning.
> > 
> > And this approach worked well on BDW and SKL, where we've seen
> > later some
> > reserved ids becoming real product and we didn't have to do any extra
> > step.
> > 
> > For this same reason I believe the right solution is to
> > add those ids back to mesa instead of removing from kernel and
> > libdrm.
> > 
> > Thanks,
> > Rodrigo.
> > 
> > > 
> > > Cc: Matt Turner 
> > > Cc: Rodrigo Vivi 
> > > Signed-off-by: Anuj Phogat 
> > > ---
> > >  drivers/gpu/drm/i915/i915_pci.c |  1 -
> > >  include/drm/i915_pciids.h   | 15 ++-
> > >  2 files changed, 2 insertions(+), 14 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/i915_pci.c
> > > b/drivers/gpu/drm/i915/i915_pci.c
> > > index 129877b..ecf6d4c 100644
> > > --- a/drivers/gpu/drm/i915/i915_pci.c
> > > +++ b/drivers/gpu/drm/i915/i915_pci.c
> > > @@ -613,7 +613,6 @@ static const struct pci_device_id pciidlist[] =
> > > {
> > >   INTEL_KBL_GT1_IDS(&intel_kabylake_gt1_info),
> > >   INTEL_KBL_GT2_IDS(&intel_kabylake_gt2_info),
> > >   INTEL_KBL_GT3_IDS(&intel_kabylake_gt3_info),
> > > - INTEL_KBL_GT4_IDS(&intel_kabylake_gt3_info),
> > >   INTEL_CFL_S_GT1_IDS(&intel_coffeelake_gt1_info),
> > >   INTEL_CFL_S_GT2_IDS(&intel_coffeelake_gt2_info),
> > >   INTEL_CFL_H_GT2_IDS(&intel_coffeelake_gt2_info),
> > > diff --git a/include/drm/i915_pciids.h b/include/drm/i915_pciids.h
> > > index 1257e15..a1bf90e 100644
> > > --- a/include/drm/i915_pciids.h
> > > +++ b/include/drm/i915_pciids.h
> > > @@ -337,15 +337,10 @@
> > >   INTEL_VGA_DEVICE(0x3185, info)
> > >  
> > >  #define INTEL_KBL_GT1_IDS(info)  \
> > > - INTEL_VGA_DEVICE(0x5913, info), /* ULT GT1.5 */ \
> > > - INTEL_VGA_DEVICE(0x5915, info), /* ULX GT1.5 */ \
> > >   INTEL_VGA_DEVICE(0x5917, info), /* DT  GT1.5 */ \
> > >   INTEL_VGA_DEVICE(0x5906, info), /* ULT GT1 */ \
> > > - INTEL_VGA_DEVICE(0x590E, info), /* ULX GT1 */ \
> > >   INTEL_VGA_DEVICE(0x5902, info), /* DT  GT1 */ \
> > > - INTEL_VGA_DEVICE(0x5908, info), /* Halo GT1 */ \
> > > - INTEL_VGA_DEVICE(0x590B, info), /* Halo GT1 */ \
> > > - INTEL_VGA_DEVICE(0x590A, info) /* SRV GT1 */
> > > + INTEL_VGA_DEVICE(0x590B, info)  /* Halo GT1 */
> > >  
> > >  #define INTEL_KBL_GT2_IDS(info)  \
> > >   INTEL_VGA_DEVICE(0x5916, info), /* ULT GT2 */ \
> > > @@ -353,22 +348,16 @@
> > >   INTEL_VGA_DEVICE(0x591E, info), /* ULX GT2 */ \
> > >   INTEL_VGA_DEVICE(0x5912, info), /* DT  GT2 */ \
> > >   INTEL_VGA_DEVICE(0x591B, info), /* Halo GT2 */ \
> > > - INTEL_VGA_DEVICE(0x591A, info), /* SRV GT2 */ \
> > >   INTEL_VGA_DEVICE(0x591D, info) /* WKS GT2 */
> > >  
> > >  #define INTEL_KBL_GT3_IDS(info) \
> > > - INTEL_VGA_DEVICE(0x5923, info), /* ULT GT3 */ \
> > >   INTEL_VGA_DEVICE(0x5926, info), /* ULT GT3 */ \
> > >   INTEL_VGA_DEVICE(0x5927, info) /* ULT GT3 */
> > >  
> > > -#define INTEL_KBL_GT4_IDS(info) \
> > > - INTEL_VGA_DEVICE(0x593B, info) /* Halo GT4 */
> > > -
> > >  #define INTEL_KBL_IDS(info) \
> > >   INTEL_KBL_GT1_IDS(info), \
> > >   INTEL_KBL_GT2_IDS(info), \
> > > - INTEL_KBL_GT3_IDS(info), \
> > > - INTEL_KBL_GT4_IDS(info)
> > > + INTEL_KBL_GT3_IDS(info)
> > >  
> > >  /* CFL S */
> > >  #define INTEL_CFL_S_GT1_IDS(info) \
> > > -- 
> > > 2.9.4
> > > 
> > > ___
> > > Intel-gfx mailing list
> > > intel-...@lists.freedesktop.org
> > > https://lists.freedesktop.org/mailman/listinfo/intel-gfx
> > 

Re: [Mesa-dev] [RFC] NIR serialization

2017-09-12 Thread Timothy Arceri




On 13/09/17 03:00, Ian Romanick wrote:

On 09/11/2017 09:44 PM, Timothy Arceri wrote:

On 12/09/17 14:23, Ian Romanick wrote:

On 09/08/2017 01:59 AM, Kenneth Graunke wrote:

On Thursday, September 7, 2017 4:26:04 PM PDT Jordan Justen wrote:

On 2017-09-06 14:12:41, Daniel Schürmann wrote:

Hello everyone!
Recently, we had a small discussion (off the list) about NIR
serialization, which was previously discussed in "[RFC] ARB_gl_spirv
and NIR backend for radeonsi".

As this topic could be interesting to more people, I would like to
share what has been discussed so far (you might want to read from
the bottom up).

TL;DR:
- NIR serialization is in demand for shader cache
- could be done either directly (NIR binary form) or via SPIR-V
- Ian et al. are working on GLSL IR -> SPIR-V transformation, which
could be adapted for a NIR -> SPIR-V pass
- in NIR representation, some type information is lost
- thus, a serialization via SPIR-V could NOT be a glslang alternative
(otoh, the GLSL IR->SPIR-V pass could), but only for spirv-opt (if the
output is valid SPIR-V)


Ian,

Tim was suggesting that we might look at serializing nir for the i965
shader cache. Based on this email, it sounds like serialized nir would
not be enough for the shader cache as some GLSL type info would be
lost. It sounds like GLSL IR => SPIR-V would be good enough. Is that
right?

I don't think we have a strict requirement for the GLSL IR => SPIR-V
path for GL 4.6, right? So, this is more of a 'nice-to-have'?

I'm not sure we'd want to make i965 shader cache depend on a
nice-to-have feature. (Unless we're pretty sure it'll be available
soon.)

But, it would be nice to not have to fallback to compiling the GLSL
for i965 shader cache, so it would be worth waiting a little bit to be
able to rely on a SPIR-V serialization of the GLSL IR.

What do you suggest?

-Jordan


We shouldn't use SPIR-V for the shader cache.

The compilation process for GLSL is: GLSL -> GLSL IR -> NIR -> i965 IRs.
Storing the content at one of those points, and later loading it and
resuming the normal compilation process from that point...that's totally
reasonable.

Having a fallback for "some things in the cache but not all the variants
we needed" suddenly take a different compilation pipeline, i.e. SPIR-V
-> NIR -> ... seems risky.  It's a different compilation path that we
don't normally use.  And one you'd only hit in limited circumstances.
There's a lot of potential for really obscure bugs.


Since we're going to expose exactly that path for GL_ARB_spirv / OpenGL
4.6, we'd better make sure it works always.  Right?

One nice thing about SPIR-V is that all of the handling of uniform
layouts, initial uniform values, attribute locations, etc. is already
serialized.  If I'm not mistaken, that was one of the big pain points
for all of the existing on-disk storage methods.  All of that has been
sorted out for SPIR-V, and we have to make it work anyway.


Correct, these are the main issues for the fallback path; however, this is
only used by i965 (exactly because an intermediate cache is missing).
Using SPIR-V as the intermediate cache means we still need to convert to
NIR and run all the opts, so I don't really see the advantage of caching
to SPIR-V over NIR.


The advantage is that we have N code paths instead of N+1.  Maintenance
is the biggest cost in software development.


But a SPIR-V cache has an N+1 code path, and it's going to be less
tested than NIR serialization would be. NIR serialization should slot
seamlessly into the existing code paths, e.g. if we don't see NIR in the
buffer when we need to do a variant recompilation, we just load it from
disk. Loading SPIR-V from disk would require a separate code path to
fall back and recreate the NIR; this path would not always be hit and
would therefore be much less tested. As Ken points out, GLSL IR -> SPIR-V
is another code path on top of this. It's fine if you want to make a
glslang alternative, but there is no requirement/need to convert to
SPIR-V for ordinary GL shaders.


I've also just sent a series that introduces some basic NIR linking [1].
Ultimately, once we have a NIR packing pass, we should be able to drop
more GLSL IR opts and get even better results from using NIR. I would
expect caching to SPIR-V to make variant fallback paths even more
complicated for NIR-based linking.


[1] https://patchwork.freedesktop.org/series/30249/




Also there is going to be a requirement for a NIR cache for any of the
Gallium nir based drivers (which possibly includes radeonsi in future).


Serializing NIR, and possibly a few auxiliary structures that we need,
seems reasonable.  Although, just using the GLSL seemed reasonable to
me as well, but I guess that's proven to be painful?

--Ken
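
To illustrate the "store the IR at one point and resume from there" idea discussed in this thread, here is a toy round-trip sketch. Everything in it is hypothetical: struct toy_instr, toy_serialize() and toy_deserialize() are invented for illustration and have nothing to do with NIR's real data structures, which carry far more state. It also assumes the blob is read back on the same machine, so raw struct bytes are acceptable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical toy instruction; four 16-bit fields, no padding. */
struct toy_instr {
   uint16_t opcode;
   uint16_t dest;
   uint16_t src[2];
};

/* Write a count followed by the instructions.
 * Returns bytes written, or 0 if the buffer is too small. */
size_t
toy_serialize(const struct toy_instr *instrs, uint32_t count,
              uint8_t *buf, size_t buf_size)
{
   assert(buf != NULL && (instrs != NULL || count == 0));

   size_t payload = (size_t)count * sizeof(*instrs);
   if (buf_size < sizeof(count) + payload)
      return 0;

   memcpy(buf, &count, sizeof(count));
   if (count)
      memcpy(buf + sizeof(count), instrs, payload);
   return sizeof(count) + payload;
}

/* Read the blob back. Returns 0 on success, -1 if it is malformed
 * or holds more than max_count instructions. */
int
toy_deserialize(const uint8_t *buf, size_t len, struct toy_instr *out,
                uint32_t max_count, uint32_t *count_out)
{
   uint32_t count;

   if (len < sizeof(count))
      return -1;
   memcpy(&count, buf, sizeof(count));

   if (count > max_count ||
       len - sizeof(count) != (size_t)count * sizeof(*out))
      return -1;

   if (count)
      memcpy(out, buf + sizeof(count), (size_t)count * sizeof(*out));
   *count_out = count;
   return 0;
}
```

The point of the sketch is that the loaded data is byte-for-byte what was stored, so compilation resumes from exactly the same IR. A detour through another representation cannot promise that without extra work.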


Re: [Mesa-dev] [PATCH] util: Query build-id by symbol address, not library name

2017-09-12 Thread Matt Turner

On Tue, Sep 12, 2017 at 5:05 PM, Chad Versace  wrote:
> This patch renames build_id_find_nhdr() to
> build_id_find_nhdr_for_addr(), and changes it to never examine the
> library name.
>
> Tested on Fedora by confirming that build_id_get_data() returns the same
> build-id as the file(1) tool. For BSD, I confirmed that the API used
> (dladdr() and struct Dl_info) is documented in FreeBSD's manpages.
>
> This solves several problems, some more realistic than others:
>
> - We can now query the build-id without knowing the installed library's
>   filename.
>
>   This matters because Android requires specific filenames for HAL
>   modules, such as "/vendor/lib/hw/vulkan.${board}.so". The HAL
>   filenames do not follow the Unix convention of "libfoo.so".  In
>   other words, the same query code will now work on Linux and Android.
>
> - Querying the build-id now works correctly when the process
>   contains multiple shared objects with the same basename.
>   (Admittedly, this is a highly unlikely scenario).
>
> - Querying the build-id now works correctly when the library is
>   statically linked into the executable. (This is even more unlikely
>   than the previous scenario).

I assume this one is speculative? I'm not sure how the build-id ELF
section could exist in a static archive, and I'm not sure how it could
end up in a statically linked binary (much less more than one of
them).

Otherwise, it looks like a nice change.

Reviewed-by: Matt Turner 

[Mesa-dev] [PATCH] util: Query build-id by symbol address, not library name

2017-09-12 Thread Chad Versace

This patch renames build_id_find_nhdr() to
build_id_find_nhdr_for_addr(), and changes it to never examine the
library name.

Tested on Fedora by confirming that build_id_get_data() returns the same
build-id as the file(1) tool. For BSD, I confirmed that the API used
(dladdr() and struct Dl_info) is documented in FreeBSD's manpages.

This solves several problems, some more realistic than others:

- We can now query the build-id without knowing the installed library's
  filename.

  This matters because Android requires specific filenames for HAL
  modules, such as "/vendor/lib/hw/vulkan.${board}.so". The HAL
  filenames do not follow the Unix convention of "libfoo.so".  In
  other words, the same query code will now work on Linux and Android.

- Querying the build-id now works correctly when the process
  contains multiple shared objects with the same basename.
  (Admittedly, this is a highly unlikely scenario).

- Querying the build-id now works correctly when the library is
  statically linked into the executable. (This is even more unlikely
  than the previous scenario).
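
For illustration only, a minimal sketch of locating a loaded object from an address rather than from a filename. Note that it uses a different matching rule from the patch: the patch compares Dl_info::dli_fbase against dlpi_addr, while this sketch checks whether the address falls inside one of the object's PT_LOAD segments. The names struct lookup, lookup_callback() and addr_is_in_loaded_object() are hypothetical; dl_iterate_phdr() and struct dl_phdr_info are the real glibc/BSD interfaces.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* dl_iterate_phdr() is an extension on glibc */
#endif
#include <assert.h>
#include <link.h>
#include <stddef.h>
#include <stdint.h>

#ifndef ElfW
#define ElfW(type) Elf_##type /* fallback for libcs without ElfW() */
#endif

struct lookup {
   uintptr_t addr; /* address we are looking for */
   int found;      /* set to 1 once a containing object is seen */
};

static int
lookup_callback(struct dl_phdr_info *info, size_t size, void *data_)
{
   struct lookup *data = data_;
   (void)size;

   assert(info != NULL);

   for (unsigned i = 0; i < info->dlpi_phnum; i++) {
      const ElfW(Phdr) *phdr = &info->dlpi_phdr[i];
      if (phdr->p_type != PT_LOAD)
         continue;

      /* Segment start in memory = load bias + segment vaddr. */
      uintptr_t start = (uintptr_t)info->dlpi_addr + phdr->p_vaddr;
      if (data->addr >= start && data->addr - start < phdr->p_memsz) {
         data->found = 1;
         return 1; /* a non-zero return stops the iteration */
      }
   }
   return 0;
}

/* Returns 1 if addr lies in a loaded segment of any object, else 0. */
int
addr_is_in_loaded_object(const void *addr)
{
   struct lookup data = { .addr = (uintptr_t)addr, .found = 0 };
   dl_iterate_phdr(lookup_callback, &data);
   return data.found;
}
```

A real lookup would go on to walk the matching object's PT_NOTE segments for the build-id note, as build_id.c does. This sketch stops once the object has been identified.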

Cc: Matt Turner 
Cc: Jason Ekstrand 
Cc: Jonathan Gray 
---
 src/intel/vulkan/anv_device.c |  3 ++-
 src/util/build_id.c   | 25 ++---
 src/util/build_id.h   |  2 +-
 3 files changed, 17 insertions(+), 13 deletions(-)

diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index be2455166e3..8e2ed9eac45 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -208,7 +208,8 @@ anv_physical_device_init_heaps(struct anv_physical_device 
*device, int fd)
 static VkResult
 anv_physical_device_init_uuids(struct anv_physical_device *device)
 {
-   const struct build_id_note *note = build_id_find_nhdr("libvulkan_intel.so");
+   const struct build_id_note *note =
+  build_id_find_nhdr_for_addr(anv_physical_device_init_uuids);
if (!note) {
   return vk_errorf(device->instance, device,
VK_ERROR_INITIALIZATION_FAILED,
diff --git a/src/util/build_id.c b/src/util/build_id.c
index 898a15f2b31..6280b4a54e3 100644
--- a/src/util/build_id.c
+++ b/src/util/build_id.c
@@ -46,7 +46,9 @@ struct build_id_note {
 };
 
 struct callback_data {
-   const char *filename;
+   /* Base address of shared object, taken from Dl_info::dli_fbase */
+   const void *dli_fbase;
+
struct build_id_note *note;
 };
 
@@ -55,14 +57,7 @@ build_id_find_nhdr_callback(struct dl_phdr_info *info, 
size_t size, void *data_)
 {
struct callback_data *data = data_;
 
-   /* The first object visited by callback is the main program.
-* Android's libc returns a NULL pointer for the first executable.
-*/
-   if (info->dlpi_name == NULL)
-  return 0;
-
-   char *ptr = strstr(info->dlpi_name, data->filename);
-   if (ptr == NULL || ptr[strlen(data->filename)] != '\0')
+   if ((void *)info->dlpi_addr != data->dli_fbase)
   return 0;
 
for (unsigned i = 0; i < info->dlpi_phnum; i++) {
@@ -94,10 +89,18 @@ build_id_find_nhdr_callback(struct dl_phdr_info *info, 
size_t size, void *data_)
 }
 
 const struct build_id_note *
-build_id_find_nhdr(const char *filename)
+build_id_find_nhdr_for_addr(const void *addr)
 {
+   Dl_info info;
+
+   if (!dladdr(addr, &info))
+  return NULL;
+
+   if (!info.dli_fbase)
+  return NULL;
+
struct callback_data data = {
-  .filename = filename,
+  .dli_fbase = info.dli_fbase,
   .note = NULL,
};
 
diff --git a/src/util/build_id.h b/src/util/build_id.h
index 18641c44af2..86d611d8db7 100644
--- a/src/util/build_id.h
+++ b/src/util/build_id.h
@@ -29,7 +29,7 @@
 struct build_id_note;
 
 const struct build_id_note *
-build_id_find_nhdr(const char *filename);
+build_id_find_nhdr_for_addr(const void *addr);
 
 unsigned
 build_id_length(const struct build_id_note *note);
-- 
2.13.5


Re: [Mesa-dev] [RFC] NIR serialization

2017-09-12 Thread Rob Clark

On Tue, Sep 12, 2017 at 4:39 PM, Jason Ekstrand  wrote:
> On Tue, Sep 12, 2017 at 11:09 AM, Jason Ekstrand 
> wrote:
>>
>> On Tue, Sep 12, 2017 at 10:12 AM, Ian Romanick 
>> wrote:
>>>
>>> On 09/11/2017 11:17 PM, Kenneth Graunke wrote:
>>> > On Monday, September 11, 2017 9:23:05 PM PDT Ian Romanick wrote:
>>> >> On 09/08/2017 01:59 AM, Kenneth Graunke wrote:
>>> >>> Having a fallback for "some things in the cache but not all the
>>> >>> variants
>>> >>> we needed" suddenly take a different compilation pipeline, i.e.
>>> >>> SPIR-V
>>> >>> -> NIR -> ... seems risky.  It's a different compilation path that we
>>> >>> don't normally use.  And one you'd only hit in limited circumstances.
>>> >>> There's a lot of potential for really obscure bugs.
>>> >>
>>> >> Since we're going to expose exactly that path for GL_ARB_spirv /
>>> >> OpenGL
>>> >> 4.6, we'd better make sure it works always.  Right?
>>> >
>>> > In addition to the old pipeline:
>>> >
>>> > - GLSL from the app -> GLSL IR -> NIR -> i965 IR
>>> >
>>> > GL_ARB_spirv and OpenGL 4.6 add a second pipeline:
>>> >
>>> > - SPIR-V from the app -> NIR -> i965 IR
>>> >
>>> > Both of those absolutely have to work.  But these:
>>> >
>>> > - GLSL -> GLSL IR -> NIR -> SPIR-V -> NIR -> i965 IRs
>>> > - GLSL -> GLSL IR -> SPIR-V -> NIR -> i965 IRs
>>> >
>>> > aren't required to work, or even be supported.  It makes a lot of sense
>>> > to support them - both for testing purposes, and as an alternative to
>>> > glslang, for a broader tooling ecosystem.
>>> >
>>> > The thing that concerns me is that if you use SPIR-V for the cache, you
>>> > need these paths to not just work, but be _indistinguishable_ from one
>>> > another:
>>> >
>>> > - GLSL -> GLSL IR -> NIR -> ...
>>> > - GLSL -> GLSL IR -> NIR -> SPIR-V, then SPIR-V -> NIR -> ...
>>> >
>>> > Otherwise the original compile and partially-cached recompile might
>>> > have
>>> > different properties.  For example, if the SPIR-V step messes with
>>> > variables or instruction ordering a little, it could trip up the loop
>>> > unroller so the original compile gets unrolled, and the recompile from
>>> > partial cache doesn't get unrolled.  I don't want to have to debug
>>> > that.
>>>
>>> That is a very compelling argument.  If we want Mesa to be an
>>> alternative to glslang, I think we would like to have that property, but
>>> it's not a hard requirement for that use case.
>>
>>
>> I also find that argument rather compelling.  The SPIR-V -> NIR pass is
>> *not* a simple pass.  It does piles of lowering and things on-the-fly as
>> well as creating temporary variables for various things.  The best we could
>> hope to guarantee would be that NIR -> SPIR-V -> NIR -> vars_to_ssa -> CSE
>> is idempotent.  Even that might be a bit of a stretch.
>
>
> I was talking to Jordan about this in person this morning and one other
> roadblock for using SPIR-V cropped up.  The place where we really want to
> cache the NIR for highest effectiveness would be right after calling
> brw_create_nir in brw_link_shader.  At this point in the process, a lot of
> lowering has been done to NIR intrinsics which have no SPIR-V equivalent.
> If we wanted to serialize the NIR as SPIR-V, we would either have to do it
> much higher up in the pipeline and lose quite a bit of the benefit, or add a
> NIR extension to SPIR-V that provides those extra intrinsics.  Adding such a
> SPIR-V extension and relevant headers and to/from NIR code is probably at
> least 2x the code of writing a nir_serialize pass.
>

not that it should be the deciding point, but one of these days when I
get a chance, I'd like to move some of the lowering done in nir->ir3
to nir->nir(+ir3 specific nir instructions/intrinsics).. which seems
like it would make serialization via spirv too early to be hugely
useful..

Maybe nir->spirv->nir round-tripping is useful for other reasons.. but
really nir_serialize shouldn't be that much work, and not enough to
care about it being duplicate effort with IR round tripping (which
imho, it isn't)

BR,
-R

Re: [Mesa-dev] [PATCH] (UNTESTED) virgl: filter out 2D constant file accesses and declarations

2017-09-12 Thread Dave Airlie

On 13 September 2017 at 06:34, Nicolai Hähnle  wrote:
> From: Nicolai Hähnle 
>
> Sorry for the mess.
>
> I suspect something like this patch is needed. Is this sufficient to
> fix the problem?

Oops, I missed this. I just posted an almost identical patch, and tested mine.

btw this is normal for virgl, I just have to keep an eye out for tgsi
differences.

Dave.

Re: [Mesa-dev] [PATCH 4/4] radeonsi: rename variable to clarify its meaning

2017-09-12 Thread Marek Olšák

For the series:

Reviewed-by: Marek Olšák 

Marek

On Mon, Sep 11, 2017 at 5:06 PM, Nicolai Hähnle  wrote:
> From: Nicolai Hähnle 
>
> ---
>  src/gallium/drivers/radeonsi/si_state.c | 20 ++--
>  1 file changed, 10 insertions(+), 10 deletions(-)
>
> diff --git a/src/gallium/drivers/radeonsi/si_state.c 
> b/src/gallium/drivers/radeonsi/si_state.c
> index ee070107fd5..da3c7debd57 100644
> --- a/src/gallium/drivers/radeonsi/si_state.c
> +++ b/src/gallium/drivers/radeonsi/si_state.c
> @@ -2168,50 +2168,50 @@ static void si_choose_spi_color_formats(struct 
> r600_surface *surf,
> surf->spi_shader_col_format_blend_alpha = blend_alpha;
>  }
>
>  static void si_initialize_color_surface(struct si_context *sctx,
> struct r600_surface *surf)
>  {
> struct r600_texture *rtex = (struct r600_texture*)surf->base.texture;
> unsigned color_info, color_attrib, color_view;
> unsigned format, swap, ntype, endian;
> const struct util_format_description *desc;
> -   int i;
> +   int firstchan;
> unsigned blend_clamp = 0, blend_bypass = 0;
>
> color_view = S_028C6C_SLICE_START(surf->base.u.tex.first_layer) |
>  S_028C6C_SLICE_MAX(surf->base.u.tex.last_layer);
>
> desc = util_format_description(surf->base.format);
> -   for (i = 0; i < 4; i++) {
> -   if (desc->channel[i].type != UTIL_FORMAT_TYPE_VOID) {
> +   for (firstchan = 0; firstchan < 4; firstchan++) {
> +   if (desc->channel[firstchan].type != UTIL_FORMAT_TYPE_VOID) {
> break;
> }
> }
> -   if (i == 4 || desc->channel[i].type == UTIL_FORMAT_TYPE_FLOAT) {
> +   if (firstchan == 4 || desc->channel[firstchan].type == 
> UTIL_FORMAT_TYPE_FLOAT) {
> ntype = V_028C70_NUMBER_FLOAT;
> } else {
> ntype = V_028C70_NUMBER_UNORM;
> if (desc->colorspace == UTIL_FORMAT_COLORSPACE_SRGB)
> ntype = V_028C70_NUMBER_SRGB;
> -   else if (desc->channel[i].type == UTIL_FORMAT_TYPE_SIGNED) {
> -   if (desc->channel[i].pure_integer) {
> +   else if (desc->channel[firstchan].type == 
> UTIL_FORMAT_TYPE_SIGNED) {
> +   if (desc->channel[firstchan].pure_integer) {
> ntype = V_028C70_NUMBER_SINT;
> } else {
> -   assert(desc->channel[i].normalized);
> +   assert(desc->channel[firstchan].normalized);
> ntype = V_028C70_NUMBER_SNORM;
> }
> -   } else if (desc->channel[i].type == 
> UTIL_FORMAT_TYPE_UNSIGNED) {
> -   if (desc->channel[i].pure_integer) {
> +   } else if (desc->channel[firstchan].type == 
> UTIL_FORMAT_TYPE_UNSIGNED) {
> +   if (desc->channel[firstchan].pure_integer) {
> ntype = V_028C70_NUMBER_UINT;
> } else {
> -   assert(desc->channel[i].normalized);
> +   assert(desc->channel[firstchan].normalized);
> ntype = V_028C70_NUMBER_UNORM;
> }
> }
> }
>
> format = si_translate_colorformat(surf->base.format);
> if (format == V_028C70_COLOR_INVALID) {
> R600_ERR("Invalid CB format: %d, disabling CB.\n", 
> surf->base.format);
> }
> assert(format != V_028C70_COLOR_INVALID);
> --
> 2.11.0
>

Re: [Mesa-dev] [PATCH 2/2] radeonsi: remove SET_PREDICATION workaround on newer firmware

2017-09-12 Thread Marek Olšák

For the series:

Reviewed-by: Marek Olšák 

Marek

On Mon, Sep 11, 2017 at 5:01 PM, Nicolai Hähnle  wrote:
> From: Nicolai Hähnle 
>
> We need to keep the workaround for older firmware, though.
> ---
>  src/gallium/drivers/radeon/r600_query.c | 6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/src/gallium/drivers/radeon/r600_query.c 
> b/src/gallium/drivers/radeon/r600_query.c
> index 03ff1018a71..76307ca0662 100644
> --- a/src/gallium/drivers/radeon/r600_query.c
> +++ b/src/gallium/drivers/radeon/r600_query.c
> @@ -1796,25 +1796,27 @@ static void r600_render_condition(struct pipe_context 
> *ctx,
> struct r600_common_context *rctx = (struct r600_common_context *)ctx;
> struct r600_query_hw *rquery = (struct r600_query_hw *)query;
> struct r600_query_buffer *qbuf;
> struct r600_atom *atom = &rctx->render_cond_atom;
>
> /* Compute the size of SET_PREDICATION packets. */
> atom->num_dw = 0;
> if (query) {
> bool needs_workaround = false;
>
> -   /* There is a firmware regression in VI which causes 
> successive
> +   /* There was a firmware regression in VI which causes 
> successive
>  * SET_PREDICATION packets to give the wrong answer for
>  * non-inverted stream overflow predication.
>  */
> -   if (rctx->chip_class >= VI && !condition &&
> +   if (((rctx->chip_class == VI && 
> rctx->screen->info.pfp_fw_feature < 49) ||
> +(rctx->chip_class == GFX9 && 
> rctx->screen->info.pfp_fw_feature < 38)) &&
> +   !condition &&
> (rquery->b.type == PIPE_QUERY_SO_OVERFLOW_ANY_PREDICATE ||
>  (rquery->b.type == PIPE_QUERY_SO_OVERFLOW_PREDICATE &&
>   (rquery->buffer.previous ||
>rquery->buffer.results_end > rquery->result_size)))) {
> needs_workaround = true;
> }
>
> if (needs_workaround && !rquery->workaround_buf) {
> bool old_force_off = rctx->render_cond_force_off;
> rctx->render_cond_force_off = true;
> --
> 2.11.0
>

Re: [Mesa-dev] [PATCH 1/8] nir: add is_xfb_only to nir variable

2017-09-12 Thread Timothy Arceri


Whoops, the subject should be:

nir: add always_active_io to nir variable

On 13/09/17 09:37, Timothy Arceri wrote:

Will be used in the nir link pass to decide if we can remove a varying
or not.
---
  src/compiler/glsl/glsl_to_nir.cpp |  1 +
  src/compiler/nir/nir.h| 10 ++
  2 files changed, 11 insertions(+)

diff --git a/src/compiler/glsl/glsl_to_nir.cpp 
b/src/compiler/glsl/glsl_to_nir.cpp
index bb2ba17b220..ea75e3c8e99 100644
--- a/src/compiler/glsl/glsl_to_nir.cpp
+++ b/src/compiler/glsl/glsl_to_nir.cpp
@@ -326,6 +326,7 @@ nir_visitor::visit(ir_variable *ir)
 var->type = ir->type;
 var->name = ralloc_strdup(var, ir->name);
  
+   var->data.always_active_io = ir->data.always_active_io;

 var->data.read_only = ir->data.read_only;
 var->data.centroid = ir->data.centroid;
 var->data.sample = ir->data.sample;
diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
index 8330e6d7ce7..fab2110f619 100644
--- a/src/compiler/nir/nir.h
+++ b/src/compiler/nir/nir.h
@@ -192,6 +192,16 @@ typedef struct nir_variable {
unsigned invariant:1;
  
/**

+   * When separate shader programs are enabled, only input/outputs between
+   * the stages of a multi-stage separate program can be safely removed
+   * from the shader interface. Other input/outputs must remains active.
+   *
+   * This is also used to make sure xfb varyings that are unused by the
+   * fragment shader are not removed.
+   */
+  unsigned always_active_io:1;
+
+  /**
 * Interpolation mode for shader inputs / outputs
 *
 * \sa glsl_interp_mode



Re: [Mesa-dev] [PATCH 1.5/4] [RFC] ac/addrlib: relax an assertion

2017-09-12 Thread Marek Olšák

Reviewed-by: Marek Olšák 

Marek

On Mon, Sep 11, 2017 at 3:26 PM, Nicolai Hähnle  wrote:
> From: Nicolai Hähnle 
>
> ---
> We hit this assertion with 3D textures on gfx9.
>
> I'm not aware of any 3D-texture-specific failures, but I'm also not sure
> whether CMASK is supposed to work with 3D textures or whether we've just
> been lucky.
>
> ---
>  src/amd/addrlib/gfx9/gfx9addrlib.cpp | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/src/amd/addrlib/gfx9/gfx9addrlib.cpp 
> b/src/amd/addrlib/gfx9/gfx9addrlib.cpp
> index 57ecb058727..edb4c6e636a 100644
> --- a/src/amd/addrlib/gfx9/gfx9addrlib.cpp
> +++ b/src/amd/addrlib/gfx9/gfx9addrlib.cpp
> @@ -261,21 +261,22 @@ ADDR_E_RETURNCODE Gfx9Lib::HwlComputeHtileInfo(
>  *
>  *   @return
>  *   ADDR_E_RETURNCODE
>  
> 
>  */
>  ADDR_E_RETURNCODE Gfx9Lib::HwlComputeCmaskInfo(
>  const ADDR2_COMPUTE_CMASK_INFO_INPUT*pIn,///< [in] input 
> structure
>  ADDR2_COMPUTE_CMASK_INFO_OUTPUT* pOut///< [out] output 
> structure
>  ) const
>  {
> -ADDR_ASSERT(pIn->resourceType == ADDR_RSRC_TEX_2D);
> +// TODO: Clarify with AddrLib team
> +// ADDR_ASSERT(pIn->resourceType == ADDR_RSRC_TEX_2D);
>
>  UINT_32 numPipeTotal = 
> GetPipeNumForMetaAddressing(pIn->cMaskFlags.pipeAligned,
> pIn->swizzleMode);
>
>  UINT_32 numRbTotal = pIn->cMaskFlags.rbAligned ? m_se * m_rbPerSe : 1;
>
>  UINT_32 numCompressBlkPerMetaBlkLog2, numCompressBlkPerMetaBlk;
>
>  if ((numPipeTotal == 1) && (numRbTotal == 1))
>  {
> --
> 2.11.0
>

Re: [Mesa-dev] [Mesa-stable] [PATCH 2/2] radeonsi: hard-code pixel center for interpolateAtSample without multisample buffers

2017-09-12 Thread Marek Olšák

For the series:

Reviewed-by: Marek Olšák 

Marek

On Mon, Sep 11, 2017 at 5:11 PM, Nicolai Hähnle  wrote:
> From: Nicolai Hähnle 
>
> The GLSL rules for interpolateAtSample are unfortunate:
>
>"Returns the value of the input interpolant variable at
> the location of sample number sample. If
> multisample buffers are not available, the input
> variable will be evaluated at the center of the pixel.
> If sample sample does not exist, the position used to
> interpolate the input variable is undefined."
>
> This fix will fall back to monolithic shader compilation when
> interpolateAtSample is used without multisampling.
>
> One alternative would be to always upload 16 sample positions,
> filling the buffer up with repetition when the actual number of
> samples is less, and then ANDing the sample ID with 0xf. However,
> that punishes all well-behaving users of interpolateAtSample,
> when in reality, only conformance tests should be affected by
> the issue.
>
> Fixes
> dEQP-GLES31.functional.shaders.multisample_interpolation.interpolate_at_sample.non_multisample_buffer.*
> ---
>  src/gallium/drivers/radeonsi/si_shader.c| 28 
> -
>  src/gallium/drivers/radeonsi/si_shader.h|  3 +++
>  src/gallium/drivers/radeonsi/si_state_shaders.c |  3 +++
>  3 files changed, 33 insertions(+), 1 deletion(-)
>
> diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
> b/src/gallium/drivers/radeonsi/si_shader.c
> index 85de2e407b4..d0af60856b0 100644
> --- a/src/gallium/drivers/radeonsi/si_shader.c
> +++ b/src/gallium/drivers/radeonsi/si_shader.c
> @@ -3665,21 +3665,47 @@ static void interp_fetch_args(
> LLVMValueRef sample_id;
> LLVMValueRef halfval = LLVMConstReal(ctx->f32, 0.5f);
>
> /* fetch sample ID, then fetch its sample position,
>  * and place into first two channels.
>  */
> sample_id = lp_build_emit_fetch(bld_base,
> emit_data->inst, 1, 
> TGSI_CHAN_X);
> sample_id = LLVMBuildBitCast(gallivm->builder, sample_id,
>  ctx->i32, "");
> -   sample_position = load_sample_position(ctx, sample_id);
> +
> +   /* Section 8.13.2 (Interpolation Functions) of the OpenGL 
> Shading
> +* Language 4.50 spec says about interpolateAtSample:
> +*
> +*"Returns the value of the input interpolant variable at
> +* the location of sample number sample. If multisample
> +* buffers are not available, the input variable will be
> +* evaluated at the center of the pixel. If sample sample
> +* does not exist, the position used to interpolate the
> +* input variable is undefined."
> +*
> +* This means that sample_id values outside of the valid are
> +* in fact valid input, and the usual mechanism for loading 
> the
> +* sample position doesn't work.
> +*/
> +   if 
> (ctx->shader->key.mono.u.ps.interpolate_at_sample_force_center) {
> +   LLVMValueRef center[4] = {
> +   LLVMConstReal(ctx->f32, 0.5),
> +   LLVMConstReal(ctx->f32, 0.5),
> +   ctx->ac.f32_0,
> +   ctx->ac.f32_0,
> +   };
> +
> +   sample_position = lp_build_gather_values(gallivm, 
> center, 4);
> +   } else {
> +   sample_position = load_sample_position(ctx, 
> sample_id);
> +   }
>
> emit_data->args[0] = LLVMBuildExtractElement(gallivm->builder,
>  sample_position,
>  ctx->i32_0, "");
>
> emit_data->args[0] = LLVMBuildFSub(gallivm->builder, 
> emit_data->args[0], halfval, "");
> emit_data->args[1] = LLVMBuildExtractElement(gallivm->builder,
>  sample_position,
>  ctx->i32_1, "");
> emit_data->args[1] = LLVMBuildFSub(gallivm->builder, 
> emit_data->args[1], halfval, "");
> diff --git a/src/gallium/drivers/radeonsi/si_shader.h 
> b/src/gallium/drivers/radeonsi/si_shader.h
> index be17cf462be..22bb56b0ece 100644
> --- a/src/gallium/drivers/radeonsi/si_shader.h
> +++ b/src/gallium/drivers/radeonsi/si_shader.h
> @@ -508,20 +508,23 @@ struct si_shader_key {
>
> /* Flags for monolithic compilation only. */
> struct {
> /* One byte for every input:

[Mesa-dev] [PATCH 3/8] nir: add a helper for getting the bitmask for a variable's location

2017-09-12 Thread Timothy Arceri

---
 src/compiler/nir/nir.h | 31 +++
 1 file changed, 31 insertions(+)

diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
index fab2110f619..e52a1006896 100644
--- a/src/compiler/nir/nir.h
+++ b/src/compiler/nir/nir.h
@@ -351,6 +351,37 @@ typedef struct nir_variable {
 #define nir_foreach_variable_safe(var, var_list) \
foreach_list_typed_safe(nir_variable, var, node, var_list)
 
+/**
+ * Returns the bits in the inputs_read, outputs_written, or
+ * system_values_read bitfield corresponding to this variable.
+ */
+static inline uint64_t
+nir_variable_get_io_mask(nir_variable *var, gl_shader_stage stage)
+{
+   /* TODO: add support for tess patches */
+   if (var->data.patch || var->data.location < 0)
+  return 0;
+
+   assert(var->data.mode == nir_var_shader_in ||
+  var->data.mode == nir_var_shader_out ||
+  var->data.mode == nir_var_system_value);
+   assert(var->data.location >= 0);
+
+   const struct glsl_type *var_type = var->type;
+   if ((var->data.mode == nir_var_shader_in &&
+(stage == MESA_SHADER_GEOMETRY ||
+ stage == MESA_SHADER_TESS_CTRL ||
+ stage == MESA_SHADER_TESS_EVAL)) ||
+   (var->data.mode == nir_var_shader_out &&
+stage == MESA_SHADER_TESS_CTRL)) {
+  if (glsl_type_is_array(var_type))
+ var_type = glsl_get_array_element(var_type);
+   }
+
+   unsigned slots = glsl_count_attribute_slots(var_type, false);
+   return ((1ull << slots) - 1) << var->data.location;
+}
+
 static inline bool
 nir_variable_is_global(const nir_variable *var)
 {
-- 
2.13.5
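
For anyone following the mask arithmetic in nir_variable_get_io_mask(),
here is a minimal standalone sketch of the final expression. The name
io_mask_sketch() is made up for illustration and is not part of NIR. The
assert is my own addition: it only documents that the shifts must stay inside
64 bits, which the patch itself does not check.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the last two lines of the helper:
 * `slots` consecutive bits, starting at bit `location`.
 */
static uint64_t
io_mask_sketch(unsigned slots, int location)
{
   /* Mirror the early-out in the patch for unassigned locations. */
   if (location < 0)
      return 0;

   /* Shifting a 64-bit value by 64 or more is undefined in C, so this
    * sketch only accepts ranges that fit entirely in the bitfield.
    */
   assert(location < 64 && slots <= 64u - (unsigned) location);

   /* A full-width mask cannot be built with (1ull << 64) - 1. */
   uint64_t bits = (slots == 64) ? ~0ull : (1ull << slots) - 1;
   return bits << location;
}
```

A variable that occupies two attribute slots at location 3 would yield 0x18
here, which a caller can AND against a bitfield such as outputs_written.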

[Mesa-dev] [PATCH 4/8] nir: add some helpers for doing linking

2017-09-12 Thread Timothy Arceri

These initial helpers add support for removing unused varyings between
stages.
---
 src/compiler/Makefile.sources  |   1 +
 src/compiler/nir/nir.h |   6 ++
 src/compiler/nir/nir_linking_helpers.c | 136 +
 3 files changed, 143 insertions(+)
 create mode 100644 src/compiler/nir/nir_linking_helpers.c

diff --git a/src/compiler/Makefile.sources b/src/compiler/Makefile.sources
index 0153df2d812..9c7f057eecf 100644
--- a/src/compiler/Makefile.sources
+++ b/src/compiler/Makefile.sources
@@ -203,6 +203,7 @@ NIR_FILES = \
nir/nir_instr_set.h \
nir/nir_intrinsics.c \
nir/nir_intrinsics.h \
+   nir/nir_linking_helpers.c \
nir/nir_liveness.c \
nir/nir_loop_analyze.c \
nir/nir_loop_analyze.h \
diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
index e52a1006896..1e89c74d14c 100644
--- a/src/compiler/nir/nir.h
+++ b/src/compiler/nir/nir.h
@@ -2448,6 +2448,12 @@ void nir_shader_gather_info(nir_shader *shader, 
nir_function_impl *entrypoint);
 void nir_assign_var_locations(struct exec_list *var_list, unsigned *size,
   int (*type_size)(const struct glsl_type *));
 
+/* Some helpers to do very simple linking */
+bool nir_remove_unwritten_outputs(nir_shader *shader);
+bool nir_remove_unread_outputs(nir_shader *shader, uint64_t outputs_read);
+bool nir_remove_unused_varyings(nir_shader *producer, nir_shader *consumer);
+bool nir_compact_varyings(nir_shader *producer, nir_shader *consumer);
+
 typedef enum {
/* If set, this forces all non-flat fragment shader inputs to be
 * interpolated as if with the "sample" qualifier.  This requires
diff --git a/src/compiler/nir/nir_linking_helpers.c 
b/src/compiler/nir/nir_linking_helpers.c
new file mode 100644
index 000..d567aa713ad
--- /dev/null
+++ b/src/compiler/nir/nir_linking_helpers.c
@@ -0,0 +1,136 @@
+/*
+ * Copyright © 2015 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#include "nir.h"
+#include "util/set.h"
+#include "util/hash_table.h"
+
+/* This file contains various little helpers for doing simple linking in
+ * NIR.  Eventually, we'll probably want a full-blown varying packing
+ * implementation in here.  Right now, it just deletes unused things.
+ */
+
+static void
+find_live_tcs_outputs(nir_shader *shader, struct set *live)
+{
+   nir_foreach_function(function, shader) {
+  if (function->impl) {
+ nir_foreach_block(block, function->impl) {
+nir_foreach_instr(instr, block) {
+   if (instr->type == nir_instr_type_intrinsic) {
+  nir_intrinsic_instr *intrin_instr =
+ nir_instr_as_intrinsic(instr);
+  if (intrin_instr->intrinsic == nir_intrinsic_load_var &&
+  intrin_instr->variables[0]->var->data.mode ==
+  nir_var_shader_out) {
+ _mesa_set_add(live, intrin_instr->variables[0]->var);
+  }
+   }
+}
+ }
+  }
+   }
+}
+
+static bool
+remove_unused_io_vars(nir_shader *shader, struct exec_list *var_list,
+  uint64_t used_by_other_stage,
+  struct set *live_tcs_outputs)
+{
+   bool progress = false;
+
+   nir_foreach_variable_safe(var, var_list) {
+  /* TODO: add patch support */
+  if (var->data.patch)
+ continue;
+
+  if (var->data.location < VARYING_SLOT_VAR0 && var->data.location >= 0)
+ continue;
+
+  if (var->data.always_active_io)
+ continue;
+
+  if (!(used_by_other_stage &
+nir_variable_get_io_mask(var, shader->stage))) {
+ /* Each TCS invocation can read data written by other TCS invocations,
+  * so even if the outputs are not used by the TES we must also make
+  * sure they are not read by the TCS before

[Mesa-dev] i965 NIR linking

2017-09-12 Thread Timothy Arceri

This started out based off the work Jason did back in 2015 to add
NIR linking to the Intel VK driver. It needed a reasonable amount
of updates to work with the GL driver, tess, xfb, etc.

As per the results in patch 8, it can provide some nice
improvements despite the GLSL IR linker already doing the same
link time removal of unused varyings.

Ultimately I'd like to use this with radv but adding it to i965
first provides a good test platform given the mature test suites,
and extensive shader-db collections available for OpenGL. I'm
planning on also adding a NIR packing pass and it makes sense
to test that here also. I believe the packing pass should be the
last step towards removing any dependency on the GLSL IR
optimisation passes.

Please review.

[Mesa-dev] [PATCH 8/8] i965: make use of nir linking

2017-09-12 Thread Timothy Arceri

For now linking is just removing unused varyings between stages.

shader-db results BDW:

total instructions in shared programs: 13198288 -> 13191693 (-0.05%)
instructions in affected programs: 48325 -> 41730 (-13.65%)
helped: 473
HURT: 0

total cycles in shared programs: 541184926 -> 541159260 (-0.00%)
cycles in affected programs: 213238 -> 187572 (-12.04%)
helped: 435
HURT: 8
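
To make the link order easier to picture, here is a small standalone sketch of
how the present stages get paired up when walking from the last stage back to
the first. The names are hypothetical, and SKETCH_NUM_STAGES = 6 is an
assumption standing in for MESA_SHADER_STAGES. No NIR passes are run; the
function only records which (producer, consumer) pairs would be visited.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumption: number of shader stages in the pipeline enum. */
#define SKETCH_NUM_STAGES 6

struct sketch_pair {
   int producer;   /* earlier stage, writes the varyings */
   int consumer;   /* later stage, reads them */
};

/* Walks the stages back to front and records each adjacent pair of present
 * stages.  Returns the number of pairs written to `pairs`.
 */
static size_t
sketch_link_order(const bool present[SKETCH_NUM_STAGES],
                  struct sketch_pair pairs[SKETCH_NUM_STAGES])
{
   assert(present != NULL && pairs != NULL);

   int next = -1;      /* most recently seen later stage, if any */
   size_t count = 0;

   for (int i = SKETCH_NUM_STAGES - 1; i >= 0; i--) {
      if (!present[i])
         continue;

      if (next >= 0) {
         pairs[count].producer = i;
         pairs[count].consumer = next;
         count++;
      }
      next = i;
   }

   return count;
}
```

Visiting the last pair first means that an output removed from a middle stage
can make that stage's own inputs dead before the pair in front of it is
examined, which is the transitive effect the code comment describes.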
---
 src/mesa/drivers/dri/i965/brw_link.cpp | 48 ++
 1 file changed, 48 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_link.cpp 
b/src/mesa/drivers/dri/i965/brw_link.cpp
index 9f1634a5459..c4c89a0686e 100644
--- a/src/mesa/drivers/dri/i965/brw_link.cpp
+++ b/src/mesa/drivers/dri/i965/brw_link.cpp
@@ -252,6 +252,54 @@ brw_link_shader(struct gl_context *ctx, struct 
gl_shader_program *shProg)
  compiler->scalar_stage[stage]);
}
 
+   /* Determine first and last stage. */
+   unsigned first = MESA_SHADER_STAGES;
+   unsigned last = 0;
+   for (unsigned i = 0; i < MESA_SHADER_STAGES; i++) {
+  if (!shProg->_LinkedShaders[i])
+ continue;
+  if (first == MESA_SHADER_STAGES)
+ first = i;
+  last = i;
+   }
+
+   /* Linking the stages in the opposite order (from fragment to vertex)
+* ensures that inter-shader outputs written to in an earlier stage
+* are eliminated if they are (transitively) not used in a later
+* stage.
+*/
+if (first != last) {
+   int next = last;
+   for (int i = next - 1; i >= 0; i--) {
+  if (shProg->_LinkedShaders[i] == NULL)
+ continue;
+
+nir_shader *producer = shProg->_LinkedShaders[i]->Program->nir;
+nir_shader *consumer = shProg->_LinkedShaders[next]->Program->nir;
+
+nir_remove_dead_variables(shProg->_LinkedShaders[next]->Program->nir,
+  nir_var_shader_in);
+nir_remove_dead_variables(shProg->_LinkedShaders[i]->Program->nir,
+  nir_var_shader_out);
+if (nir_remove_unused_varyings(producer, consumer)) {
+   nir_lower_global_vars_to_local(producer);
+   nir_lower_global_vars_to_local(consumer);
+
+   nir_variable_mode indirect_mask = (nir_variable_mode) 0;
+   if (compiler->glsl_compiler_options[i].EmitNoIndirectTemp)
+  indirect_mask = (nir_variable_mode) nir_var_local;
+
+   nir_lower_indirect_derefs(producer, indirect_mask);
+
+   const bool is_scalar = compiler->scalar_stage[producer->stage];
+   shProg->_LinkedShaders[i]->Program->nir =
+ brw_nir_optimize(producer, compiler, is_scalar);
+}
+
+next = i;
+   }
+}
+
for (stage = 0; stage < ARRAY_SIZE(shProg->_LinkedShaders); stage++) {
   struct gl_linked_shader *shader = shProg->_LinkedShaders[stage];
   if (!shader)
-- 
2.13.5

[Mesa-dev] [PATCH 1/8] nir: add is_xfb_only to nir variable

2017-09-12 Thread Timothy Arceri

This will be used in the NIR link pass to decide whether we can remove a
varying.
---
 src/compiler/glsl/glsl_to_nir.cpp |  1 +
 src/compiler/nir/nir.h| 10 ++
 2 files changed, 11 insertions(+)

diff --git a/src/compiler/glsl/glsl_to_nir.cpp 
b/src/compiler/glsl/glsl_to_nir.cpp
index bb2ba17b220..ea75e3c8e99 100644
--- a/src/compiler/glsl/glsl_to_nir.cpp
+++ b/src/compiler/glsl/glsl_to_nir.cpp
@@ -326,6 +326,7 @@ nir_visitor::visit(ir_variable *ir)
var->type = ir->type;
var->name = ralloc_strdup(var, ir->name);
 
+   var->data.always_active_io = ir->data.always_active_io;
var->data.read_only = ir->data.read_only;
var->data.centroid = ir->data.centroid;
var->data.sample = ir->data.sample;
diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
index 8330e6d7ce7..fab2110f619 100644
--- a/src/compiler/nir/nir.h
+++ b/src/compiler/nir/nir.h
@@ -192,6 +192,16 @@ typedef struct nir_variable {
   unsigned invariant:1;
 
   /**
+   * When separate shader programs are enabled, only input/outputs between
+   * the stages of a multi-stage separate program can be safely removed
+   * from the shader interface. Other input/outputs must remain active.
+   *
+   * This is also used to make sure xfb varyings that are unused by the
+   * fragment shader are not removed.
+   */
+  unsigned always_active_io:1;
+
+  /**
* Interpolation mode for shader inputs / outputs
*
* \sa glsl_interp_mode
-- 
2.13.5

[Mesa-dev] [PATCH 5/8] i965: create a brw_shader_gather_info() helper

2017-09-12 Thread Timothy Arceri

This will help us call gather info at a later point and allow us
to do some linking in nir.
---
 src/mesa/drivers/dri/i965/brw_program.c | 20 +---
 src/mesa/drivers/dri/i965/brw_program.h |  3 +++
 2 files changed, 16 insertions(+), 7 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_program.c 
b/src/mesa/drivers/dri/i965/brw_program.c
index 9303dc85b9e..2d3fcd24647 100644
--- a/src/mesa/drivers/dri/i965/brw_program.c
+++ b/src/mesa/drivers/dri/i965/brw_program.c
@@ -107,6 +107,19 @@ brw_create_nir(struct brw_context *brw,
NIR_PASS(progress, nir, nir_lower_system_values);
NIR_PASS_V(nir, brw_nir_lower_uniforms, is_scalar);
 
+   brw_shader_gather_info(nir, prog);
+
+   if (shader_prog) {
+  NIR_PASS_V(nir, nir_lower_samplers, shader_prog);
+  NIR_PASS_V(nir, nir_lower_atomics, shader_prog);
+   }
+
+   return nir;
+}
+
+void
+brw_shader_gather_info(nir_shader *nir, struct gl_program *prog)
+{
nir_shader_gather_info(nir, nir_shader_get_entrypoint(nir));
 
/* Copy the info we just generated back into the gl_program */
@@ -115,13 +128,6 @@ brw_create_nir(struct brw_context *brw,
prog->info = nir->info;
prog->info.name = prog_name;
prog->info.label = prog_label;
-
-   if (shader_prog) {
-  NIR_PASS_V(nir, nir_lower_samplers, shader_prog);
-  NIR_PASS_V(nir, nir_lower_atomics, shader_prog);
-   }
-
-   return nir;
 }
 
 static unsigned
diff --git a/src/mesa/drivers/dri/i965/brw_program.h 
b/src/mesa/drivers/dri/i965/brw_program.h
index e62b7d366c8..c52193c691c 100644
--- a/src/mesa/drivers/dri/i965/brw_program.h
+++ b/src/mesa/drivers/dri/i965/brw_program.h
@@ -25,6 +25,7 @@
 #define BRW_PROGRAM_H
 
 #include "compiler/brw_compiler.h"
+#include "nir.h"
 
 #ifdef __cplusplus
 extern "C" {
@@ -38,6 +39,8 @@ struct nir_shader *brw_create_nir(struct brw_context *brw,
   gl_shader_stage stage,
   bool is_scalar);
 
+void brw_shader_gather_info(nir_shader *nir, struct gl_program *prog);
+
 void brw_setup_tex_for_precompile(struct brw_context *brw,
   struct brw_sampler_prog_key_data *tex,
   struct gl_program *prog);
-- 
2.13.5

[Mesa-dev] [PATCH 7/8] i965/nir: export nir_optimize

2017-09-12 Thread Timothy Arceri

---
 src/intel/compiler/brw_nir.c | 14 +++---
 src/intel/compiler/brw_nir.h |  4 
 2 files changed, 11 insertions(+), 7 deletions(-)

diff --git a/src/intel/compiler/brw_nir.c b/src/intel/compiler/brw_nir.c
index ce21c016699..a04f4af7b08 100644
--- a/src/intel/compiler/brw_nir.c
+++ b/src/intel/compiler/brw_nir.c
@@ -521,9 +521,9 @@ brw_nir_lower_cs_shared(nir_shader *nir)
this_progress;  \
 })
 
-static nir_shader *
-nir_optimize(nir_shader *nir, const struct brw_compiler *compiler,
- bool is_scalar)
+nir_shader *
+brw_nir_optimize(nir_shader *nir, const struct brw_compiler *compiler,
+ bool is_scalar)
 {
nir_variable_mode indirect_mask = 0;
if (compiler->glsl_compiler_options[nir->stage].EmitNoIndirectInput)
@@ -626,7 +626,7 @@ brw_preprocess_nir(const struct brw_compiler *compiler, 
nir_shader *nir)
 
OPT(nir_split_var_copies);
 
-   nir = nir_optimize(nir, compiler, is_scalar);
+   nir = brw_nir_optimize(nir, compiler, is_scalar);
 
if (is_scalar) {
   OPT(nir_lower_load_const_to_scalar);
@@ -652,7 +652,7 @@ brw_preprocess_nir(const struct brw_compiler *compiler, 
nir_shader *nir)
 nir_lower_divmod64);
 
/* Get rid of split copies */
-   nir = nir_optimize(nir, compiler, is_scalar);
+   nir = brw_nir_optimize(nir, compiler, is_scalar);
 
OPT(nir_remove_dead_variables, nir_var_local);
 
@@ -682,7 +682,7 @@ brw_postprocess_nir(nir_shader *nir, const struct 
brw_compiler *compiler,
   OPT(nir_opt_algebraic_before_ffma);
} while (progress);
 
-   nir = nir_optimize(nir, compiler, is_scalar);
+   nir = brw_nir_optimize(nir, compiler, is_scalar);
 
if (devinfo->gen >= 6) {
   /* Try and fuse multiply-adds */
@@ -776,7 +776,7 @@ brw_nir_apply_sampler_key(nir_shader *nir,
 
if (nir_lower_tex(nir, &tex_options)) {
   nir_validate_shader(nir);
-  nir = nir_optimize(nir, compiler, is_scalar);
+  nir = brw_nir_optimize(nir, compiler, is_scalar);
}
 
return nir;
diff --git a/src/intel/compiler/brw_nir.h b/src/intel/compiler/brw_nir.h
index 560027c3662..f4b13b18c34 100644
--- a/src/intel/compiler/brw_nir.h
+++ b/src/intel/compiler/brw_nir.h
@@ -148,6 +148,10 @@ void brw_nir_analyze_ubo_ranges(const struct brw_compiler 
*compiler,
 
 bool brw_nir_opt_peephole_ffma(nir_shader *shader);
 
+nir_shader *brw_nir_optimize(nir_shader *nir,
+ const struct brw_compiler *compiler,
+ bool is_scalar);
+
 #define BRW_NIR_FRAG_OUTPUT_INDEX_SHIFT 0
 #define BRW_NIR_FRAG_OUTPUT_INDEX_MASK INTEL_MASK(0, 0)
 #define BRW_NIR_FRAG_OUTPUT_LOCATION_SHIFT 1
-- 
2.13.5

[Mesa-dev] [PATCH 6/8] i965: call brw_shader_gather_info() from the callers of brw_create_nir()

2017-09-12 Thread Timothy Arceri

This will allow us to insert a nir linking step in brw_link_shader().
---
 src/mesa/drivers/dri/i965/brw_link.cpp  | 14 ++
 src/mesa/drivers/dri/i965/brw_program.c | 11 ---
 2 files changed, 18 insertions(+), 7 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_link.cpp 
b/src/mesa/drivers/dri/i965/brw_link.cpp
index a1082a7a05a..9f1634a5459 100644
--- a/src/mesa/drivers/dri/i965/brw_link.cpp
+++ b/src/mesa/drivers/dri/i965/brw_link.cpp
@@ -250,6 +250,20 @@ brw_link_shader(struct gl_context *ctx, struct 
gl_shader_program *shProg)
 
   prog->nir = brw_create_nir(brw, shProg, prog, (gl_shader_stage) stage,
  compiler->scalar_stage[stage]);
+   }
+
+   for (stage = 0; stage < ARRAY_SIZE(shProg->_LinkedShaders); stage++) {
+  struct gl_linked_shader *shader = shProg->_LinkedShaders[stage];
+  if (!shader)
+ continue;
+
+  struct gl_program *prog = shader->Program;
+  nir_shader *nir = shader->Program->nir;
+  brw_shader_gather_info(nir, prog);
+
+  NIR_PASS_V(nir, nir_lower_samplers, shProg);
+  NIR_PASS_V(nir, nir_lower_atomics, shProg);
+
   infos[stage] = &prog->nir->info;
 
   /* Make a pass over the IR to add state references for any built-in
diff --git a/src/mesa/drivers/dri/i965/brw_program.c 
b/src/mesa/drivers/dri/i965/brw_program.c
index 2d3fcd24647..ee6b23f7775 100644
--- a/src/mesa/drivers/dri/i965/brw_program.c
+++ b/src/mesa/drivers/dri/i965/brw_program.c
@@ -107,13 +107,6 @@ brw_create_nir(struct brw_context *brw,
NIR_PASS(progress, nir, nir_lower_system_values);
NIR_PASS_V(nir, brw_nir_lower_uniforms, is_scalar);
 
-   brw_shader_gather_info(nir, prog);
-
-   if (shader_prog) {
-  NIR_PASS_V(nir, nir_lower_samplers, shader_prog);
-  NIR_PASS_V(nir, nir_lower_atomics, shader_prog);
-   }
-
return nir;
 }
 
@@ -227,6 +220,8 @@ brwProgramStringNotify(struct gl_context *ctx,
 
   prog->nir = brw_create_nir(brw, NULL, prog, MESA_SHADER_FRAGMENT, true);
 
+  brw_shader_gather_info(prog->nir, prog);
+
   brw_fs_precompile(ctx, prog);
   break;
}
@@ -249,6 +244,8 @@ brwProgramStringNotify(struct gl_context *ctx,
   prog->nir = brw_create_nir(brw, NULL, prog, MESA_SHADER_VERTEX,
  compiler->scalar_stage[MESA_SHADER_VERTEX]);
 
+  brw_shader_gather_info(prog->nir, prog);
+
   brw_vs_precompile(ctx, prog);
   break;
}
-- 
2.13.5

[Mesa-dev] [PATCH 2/8] glsl: mark xfb varyings as always active

2017-09-12 Thread Timothy Arceri

This will be used by the nir linking pass so that we don't remove
otherwise unused varyings.
---
 src/compiler/glsl/link_varyings.cpp | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/compiler/glsl/link_varyings.cpp 
b/src/compiler/glsl/link_varyings.cpp
index 528506fd0eb..656bf79ca9d 100644
--- a/src/compiler/glsl/link_varyings.cpp
+++ b/src/compiler/glsl/link_varyings.cpp
@@ -2268,6 +2268,9 @@ assign_varying_locations(struct gl_context *ctx,
  return false;
   }
 
+  /* Mark xfb varyings as always active */
+  matched_candidate->toplevel_var->data.always_active_io = 1;
+
   if (matched_candidate->toplevel_var->data.is_unmatched_generic_inout) {
  matched_candidate->toplevel_var->data.is_xfb_only = 1;
  matches.record(matched_candidate->toplevel_var, NULL);
-- 
2.13.5

Re: [Mesa-dev] [PATCH 4/4] i965: Use prepare_external instead of make_shareable in setTexBuffer2

2017-09-12 Thread Jason Ekstrand

Adding people who may have some shot at understanding this stuff

On Tue, Sep 12, 2017 at 4:23 PM, Jason Ekstrand 
wrote:

> The setTexBuffer2 hook from GLX is used to implement glxBindTexImageEXT
> which has tighter restrictions than just "it's shared".  In particular,
> it says that any rendering to the image while it is bound causes the
> contents to become undefined.  This means that we can do whatever aux
> tracking we want between glxBindTexImageEXT and glxReleaseTexImageEXT so
> long as we always transition from external in Bind and to external in
> Release.
>
> The fact that we were using make_shareable before was a problem because
> it would resolve away 100% of the aux data and then throw away our
> reference to the aux buffer.  If the aux data was shared with some other
> application (i.e. if we're using I915_FORMAT_MOD_Y_TILED_CCS) then we
> would forget that the aux data even existed for the rest of eternity.
> This is fine for the first frame but any subsequent calls to
> glxBindTexImageEXT would bind the texture as if it has no aux
> whatsoever and no resolves would happen and texturing would happen as if
> there is no aux.  This was causing rendering corruption in mutter when
> running on top of X11 with modifiers.
> ---
>  src/mesa/drivers/dri/i965/intel_tex_image.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/mesa/drivers/dri/i965/intel_tex_image.c
> b/src/mesa/drivers/dri/i965/intel_tex_image.c
> index 09ff287..0e8a947 100644
> --- a/src/mesa/drivers/dri/i965/intel_tex_image.c
> +++ b/src/mesa/drivers/dri/i965/intel_tex_image.c
> @@ -251,7 +251,7 @@ intelSetTexBuffer2(__DRIcontext *pDRICtx, GLint
> target,
>internal_format = GL_RGB;
> }
>
> -   intel_miptree_make_shareable(brw, rb->mt);
> +   intel_miptree_prepare_external(brw, rb->mt);
>
> _mesa_lock_texture(&brw->ctx, texObj);
> texImage = _mesa_get_tex_image(ctx, texObj, target, 0);
> --
> 2.5.0.400.gff86faf
>
>
[Mesa-dev] [PATCH] virgl: drop const dimensions on first block.

2017-09-12 Thread Dave Airlie

From: Dave Airlie 

The virgl protocol version of tgsi doesn't handle this yet,
transform it back to the old ways.

Fixes: 41e342d5 tgsi/ureg: always emit constants (and their decls) as 2D
Signed-off-by: Dave Airlie 
---
 src/gallium/drivers/virgl/virgl_tgsi.c | 26 ++
 1 file changed, 26 insertions(+)

diff --git a/src/gallium/drivers/virgl/virgl_tgsi.c 
b/src/gallium/drivers/virgl/virgl_tgsi.c
index 7ad1cbd..bf5c84c 100644
--- a/src/gallium/drivers/virgl/virgl_tgsi.c
+++ b/src/gallium/drivers/virgl/virgl_tgsi.c
@@ -31,6 +31,24 @@ struct virgl_transform_context {
struct tgsi_transform_context base;
 };
 
+static void
+virgl_tgsi_transform_declaration(struct tgsi_transform_context *ctx,
+ struct tgsi_full_declaration *decl)
+{
+   switch (decl->Declaration.File) {
+   case TGSI_FILE_CONSTANT:
+  if (decl->Declaration.Dimension) {
+ if (decl->Dim.Index2D == 0)
+decl->Declaration.Dimension = 0;
+  }
+  break;
+   default:
+  break;
+   }
+   ctx->emit_declaration(ctx, decl);
+
+}
+
 /* for now just strip out the new properties the remote doesn't understand
yet */
 static void
@@ -54,6 +72,13 @@ virgl_tgsi_transform_instruction(struct 
tgsi_transform_context *ctx,
 {
if (inst->Instruction.Precise)
   inst->Instruction.Precise = 0;
+
+   for (unsigned i = 0; i < TGSI_FULL_MAX_SRC_REGISTERS; i++) {
+  if (inst->Src[i].Register.File == TGSI_FILE_CONSTANT &&
+  inst->Src[i].Register.Dimension &&
+  inst->Src[i].Dimension.Index == 0)
+ inst->Src[i].Register.Dimension = 0;
+   }
ctx->emit_instruction(ctx, inst);
 }
 
@@ -69,6 +94,7 @@ struct tgsi_token *virgl_tgsi_transform(const struct 
tgsi_token *tokens_in)
   return NULL;
 
memset(&transform, 0, sizeof(transform));
+   transform.base.transform_declaration = virgl_tgsi_transform_declaration;
transform.base.transform_property = virgl_tgsi_transform_property;
transform.base.transform_instruction = virgl_tgsi_transform_instruction;
tgsi_transform_shader(tokens_in, new_tokens, newLen, &transform.base);
-- 
2.9.5

[Mesa-dev] [PATCH 4/4] i965: Use prepare_external instead of make_shareable in setTexBuffer2

2017-09-12 Thread Jason Ekstrand

The setTexBuffer2 hook from GLX is used to implement glxBindTexImageEXT
which has tighter restrictions than just "it's shared".  In particular,
it says that any rendering to the image while it is bound causes the
contents to become undefined.  This means that we can do whatever aux
tracking we want between glxBindTexImageEXT and glxReleaseTexImageEXT so
long as we always transition from external in Bind and to external in
Release.

The fact that we were using make_shareable before was a problem because
it would resolve away 100% of the aux data and then throw away our
reference to the aux buffer.  If the aux data was shared with some other
application (i.e. if we're using I915_FORMAT_MOD_Y_TILED_CCS) then we
would forget that the aux data even existed for the rest of eternity.
This is fine for the first frame but any subsequent calls to
glxBindTexImageEXT would bind the texture as if it has no aux
whatsoever and no resolves would happen and texturing would happen as if
there is no aux.  This was causing rendering corruption in mutter when
running on top of X11 with modifiers.
---
 src/mesa/drivers/dri/i965/intel_tex_image.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/intel_tex_image.c 
b/src/mesa/drivers/dri/i965/intel_tex_image.c
index 09ff287..0e8a947 100644
--- a/src/mesa/drivers/dri/i965/intel_tex_image.c
+++ b/src/mesa/drivers/dri/i965/intel_tex_image.c
@@ -251,7 +251,7 @@ intelSetTexBuffer2(__DRIcontext *pDRICtx, GLint target,
   internal_format = GL_RGB;
}
 
-   intel_miptree_make_shareable(brw, rb->mt);
+   intel_miptree_prepare_external(brw, rb->mt);
 
_mesa_lock_texture(&brw->ctx, texObj);
texImage = _mesa_get_tex_image(ctx, texObj, target, 0);
-- 
2.5.0.400.gff86faf

[Mesa-dev] [PATCH 3/4] i965/tex_image: Reference the renderbuffer miptree in setTexBuffer2

2017-09-12 Thread Jason Ekstrand

The old code made a new miptree that referenced the same BO as the
renderbuffer and just trusted in the memory aliasing to work.  There are
only two ways in which the new miptree is liable to differ from the one
in the renderbuffer and neither of them matter:

 1) It may have a different target.  The only targets that we can ever
see in intelSetTexBuffer2 are GL_TEXTURE_2D and GL_TEXTURE_RECTANGLE
and the difference between the two doesn't matter as far as the
miptree is concerned; genX(update_sampler_state) only looks at the
gl_texture_object and not the miptree when determining whether or
not to use normalized coordinates.

 2) It may have a very slightly different format.  Again, this doesn't
matter because we've supported texture views for quite some time so
we always look at the gl_texture_object format instead of the
miptree format for hardware setup anyway.

On the other hand, because we were recreating the miptree, we were using
intel_miptree_create_for_bo which doesn't understand modifiers.  We
really want this function to work without doing a resolve so long as you
have modifiers so we need to fix that.
---
 src/mesa/drivers/dri/i965/intel_tex_image.c | 23 ---
 1 file changed, 4 insertions(+), 19 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_tex_image.c 
b/src/mesa/drivers/dri/i965/intel_tex_image.c
index 4661581..09ff287 100644
--- a/src/mesa/drivers/dri/i965/intel_tex_image.c
+++ b/src/mesa/drivers/dri/i965/intel_tex_image.c
@@ -223,8 +223,6 @@ intelSetTexBuffer2(__DRIcontext *pDRICtx, GLint target,
struct intel_renderbuffer *rb;
struct gl_texture_object *texObj;
struct gl_texture_image *texImage;
-   mesa_format texFormat = MESA_FORMAT_NONE;
-   struct intel_mipmap_tree *mt;
GLenum internal_format = 0;
 
texObj = _mesa_get_current_tex_object(ctx, target);
@@ -244,33 +242,20 @@ intelSetTexBuffer2(__DRIcontext *pDRICtx, GLint target,
   return;
 
if (rb->mt->cpp == 4) {
-  if (texture_format == __DRI_TEXTURE_FORMAT_RGB) {
+  if (texture_format == __DRI_TEXTURE_FORMAT_RGB)
  internal_format = GL_RGB;
- texFormat = MESA_FORMAT_B8G8R8X8_UNORM;
-  }
-  else {
+  else
  internal_format = GL_RGBA;
- texFormat = MESA_FORMAT_B8G8R8A8_UNORM;
-  }
} else if (rb->mt->cpp == 2) {
+  /* This is 565 */
   internal_format = GL_RGB;
-  texFormat = MESA_FORMAT_B5G6R5_UNORM;
}
 
intel_miptree_make_shareable(brw, rb->mt);
-   mt = intel_miptree_create_for_bo(brw, rb->mt->bo, texFormat, 0,
-rb->Base.Base.Width,
-rb->Base.Base.Height,
-1, rb->mt->surf.row_pitch,
-MIPTREE_CREATE_DEFAULT);
-   if (mt == NULL)
-   return;
-   mt->target = target;
 
_mesa_lock_texture(&brw->ctx, texObj);
texImage = _mesa_get_tex_image(ctx, texObj, target, 0);
-   intel_set_texture_image_mt(brw, texImage, internal_format, mt);
-   intel_miptree_release(&mt);
+   intel_set_texture_image_mt(brw, texImage, internal_format, rb->mt);
_mesa_unlock_texture(&brw->ctx, texObj);
 }
 
-- 
2.5.0.400.gff86faf
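
For readers skimming the diff above, the format selection that remains in
intelSetTexBuffer2 after this patch can be looked at in isolation.  This is
only a minimal standalone sketch, not the driver code: the enum values and
the function name are placeholders invented for illustration, and only the
branch structure is taken from the patch.

```c
#include <assert.h>

/* Placeholder constants for illustration only; the real values come from
 * the GL and DRI headers. */
enum sketch_internal_format {
   SKETCH_FORMAT_UNSET = 0,
   SKETCH_GL_RGB,
   SKETCH_GL_RGBA
};

enum sketch_dri_texture_format {
   SKETCH_DRI_TEXTURE_FORMAT_RGB,
   SKETCH_DRI_TEXTURE_FORMAT_RGBA
};

/* Same branch structure as the patched function: 4 bytes per pixel picks
 * RGB or RGBA from the requested texture format, 2 bytes per pixel is
 * treated as 565, and anything else leaves the format unset. */
static enum sketch_internal_format
sketch_pick_internal_format(unsigned cpp,
                            enum sketch_dri_texture_format texture_format)
{
   if (cpp == 4) {
      if (texture_format == SKETCH_DRI_TEXTURE_FORMAT_RGB)
         return SKETCH_GL_RGB;
      else
         return SKETCH_GL_RGBA;
   } else if (cpp == 2) {
      /* This is 565 */
      return SKETCH_GL_RGB;
   }
   return SKETCH_FORMAT_UNSET;
}
```

Note that nothing in the sketch picks a mesa_format any more; per the commit
message, the miptree format is simply whatever the renderbuffer already has.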

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 2/4] i965: Reset miptree aux state on update_image_buffer

2017-09-12 Thread Jason Ekstrand

When we get a miptree in through glXBindTexImageEXT, we don't know the
current aux state so we have to assume the worst case.  If the image
gets recreated, everything is fine because
intel_miptree_create_for_dri_image sets it to the default.  However, if
our miptree is recycled, then we may have stale aux_usage and we need to
reset to the default; otherwise our aux_state tracking will get messed up.
---
 src/mesa/drivers/dri/i965/brw_context.c   |  4 +++-
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 19 +++
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h |  3 +++
 3 files changed, 25 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
b/src/mesa/drivers/dri/i965/brw_context.c
index 6441311..839cb6d 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -1593,8 +1593,10 @@ intel_update_image_buffer(struct brw_context *intel,
else
   last_mt = rb->singlesample_mt;
 
-   if (last_mt && last_mt->bo == buffer->bo)
+   if (last_mt && last_mt->bo == buffer->bo) {
+  intel_miptree_finish_external(intel, last_mt);
   return;
+   }
 
enum isl_colorspace colorspace;
switch (_mesa_get_format_color_encoding(intel_rb_format(rb))) {
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index bc04ad6..ea44f85 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -2813,6 +2813,25 @@ intel_miptree_prepare_external(struct brw_context *brw,
 aux_usage, supports_fast_clear);
 }
 
+void
+intel_miptree_finish_external(struct brw_context *brw,
+  struct intel_mipmap_tree *mt)
+{
+   if (!mt->mcs_buf)
+  return;
+
+   /* We just got this image in from the window system via glxBindTexImageEXT
+* or similar and have no idea what the actual aux state is other than that
+* we aren't in AUX_INVALID.  Reset the aux state to the default for the
+* image's modifier.
+*/
+   enum isl_aux_state default_aux_state =
+  isl_drm_modifier_get_default_aux_state(mt->drm_modifier);
+   assert(mt->last_level == mt->first_level);
+   intel_miptree_set_aux_state(brw, mt, 0, 0, INTEL_REMAINING_LAYERS,
+   default_aux_state);
+}
+
 /**
  * Make it possible to share the BO backing the given miptree with another
  * process or another miptree.
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
index e2b23c5..3848192 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
@@ -673,6 +673,9 @@ intel_miptree_finish_depth(struct brw_context *brw,
 void
 intel_miptree_prepare_external(struct brw_context *brw,
struct intel_mipmap_tree *mt);
+void
+intel_miptree_finish_external(struct brw_context *brw,
+  struct intel_mipmap_tree *mt);
 
 void
 intel_miptree_make_shareable(struct brw_context *brw,
-- 
2.5.0.400.gff86faf
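
The recycle path described in the commit message can be modelled outside the
driver roughly as follows.  This is a hedged sketch under stated assumptions:
every type and function name here is hypothetical, and the default state is
stored on the struct instead of being derived from a DRM modifier as the
patch does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver's aux state tracking. */
enum sketch_aux_state {
   SKETCH_AUX_INVALID,
   SKETCH_COMPRESSED_CLEAR,
   SKETCH_COMPRESSED_NO_CLEAR,
   SKETCH_PASS_THROUGH
};

struct sketch_miptree {
   const void *bo;                          /* backing buffer identity */
   bool has_aux;                            /* models mt->mcs_buf != NULL */
   enum sketch_aux_state aux_state;         /* what we currently believe */
   enum sketch_aux_state default_aux_state; /* worst case for the modifier */
};

/* Forget whatever we tracked and fall back to the worst-case default. */
static void
sketch_finish_external(struct sketch_miptree *mt)
{
   if (!mt->has_aux)
      return;
   mt->aux_state = mt->default_aux_state;
}

/* Returns true when the old miptree was recycled.  Even then, its tracked
 * state is reset, because the window system may have touched the image. */
static bool
sketch_update_image_buffer(struct sketch_miptree *last_mt, const void *new_bo)
{
   if (last_mt && last_mt->bo == new_bo) {
      sketch_finish_external(last_mt);
      return true;
   }
   return false; /* caller would create a fresh miptree here */
}
```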


[Mesa-dev] [PATCH 1/4] intel/isl: Add a drm_modifier_get_default_aux_state helper

2017-09-12 Thread Jason Ekstrand

---
 src/intel/isl/isl.h   | 20 
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c |  3 +--
 2 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/src/intel/isl/isl.h b/src/intel/isl/isl.h
index e77d7ee..d30b2de 100644
--- a/src/intel/isl/isl.h
+++ b/src/intel/isl/isl.h
@@ -1558,6 +1558,26 @@ isl_drm_modifier_has_aux(uint64_t modifier)
return isl_drm_modifier_get_info(modifier)->aux_usage != ISL_AUX_USAGE_NONE;
 }
 
+/** Returns the default isl_aux_state for the given modifier.
+ *
+ * All modified images are required to be kept out of the AUX_INVALID state
+ * but they may or may not actually be compressed and may or may not have
+ * clear color.  This function returns the worst case aux_state that we need
+ * to assume when getting a surface from another process or API.
+ */
+static inline enum isl_aux_state
+isl_drm_modifier_get_default_aux_state(uint64_t modifier)
+{
+   const struct isl_drm_modifier_info *mod_info =
+  isl_drm_modifier_get_info(modifier);
+
+   if (!mod_info || mod_info->aux_usage == ISL_AUX_USAGE_NONE)
+  return ISL_AUX_STATE_AUX_INVALID;
+
+   return mod_info->supports_clear_color ? ISL_AUX_STATE_COMPRESSED_CLEAR :
+   ISL_AUX_STATE_COMPRESSED_NO_CLEAR;
+}
+
 uint64_t ATTRIBUTE_CONST
 isl_drm_modifier_from_tiling(enum isl_tiling tiling,
  enum isl_aux_usage aux_usage);
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index 79afdc5..bc04ad6 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -1072,8 +1072,7 @@ intel_miptree_create_for_dri_image(struct brw_context 
*brw,
* a worst case of compression.
*/
   enum isl_aux_state initial_state =
- mod_info->supports_clear_color ? ISL_AUX_STATE_COMPRESSED_CLEAR :
-  ISL_AUX_STATE_COMPRESSED_NO_CLEAR;
+ isl_drm_modifier_get_default_aux_state(image->modifier);
 
   if (!create_ccs_buf_for_image(brw, image, mt, initial_state)) {
  intel_miptree_release(&mt);
-- 
2.5.0.400.gff86faf
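
As a standalone illustration of the helper added above, the selection logic
might look like this outside of isl.  The modifier table, its values and all
names below are made up for the example; only the three-way decision (no aux,
aux with clear color, aux without clear color) follows the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum demo_aux_state {
   DEMO_AUX_INVALID,
   DEMO_COMPRESSED_CLEAR,
   DEMO_COMPRESSED_NO_CLEAR
};

/* Hypothetical modifier table; the real one lives in isl. */
struct demo_modifier_info {
   uint64_t modifier;
   bool has_aux;
   bool supports_clear_color;
};

static const struct demo_modifier_info demo_modifiers[] = {
   { 1, false, false }, /* plain tiling, no aux surface */
   { 2, true,  false }, /* compressed, no clear color */
   { 3, true,  true  }, /* compressed with clear color */
};

static const struct demo_modifier_info *
demo_get_info(uint64_t modifier)
{
   for (size_t i = 0; i < sizeof(demo_modifiers) / sizeof(demo_modifiers[0]); i++) {
      if (demo_modifiers[i].modifier == modifier)
         return &demo_modifiers[i];
   }
   return NULL;
}

/* Worst-case state to assume for an image that arrives from elsewhere. */
static enum demo_aux_state
demo_default_aux_state(uint64_t modifier)
{
   const struct demo_modifier_info *info = demo_get_info(modifier);

   if (!info || !info->has_aux)
      return DEMO_AUX_INVALID;

   return info->supports_clear_color ? DEMO_COMPRESSED_CLEAR
                                     : DEMO_COMPRESSED_NO_CLEAR;
}
```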


[Mesa-dev] [PATCH 0/4] i965: Properly handle CCS in glxBindTexImageEXT

2017-09-12 Thread Jason Ekstrand

This little series fixes (I think!) a bug in glXBindTexImageEXT when using
modifiers.  Before, we were using make_shareable which resolves everything
and then permanently throws away any aux information.  In the world of
modifiers, that aux information is suddenly important.  This was causing
rendering corruptions when running with Daniel's X11 modifiers work and
using a compositor (mutter in this case).

I've Cc'd a pile of people on this patch most of whom no longer work for
Intel.  While I'm probably the most qualified person we have to work on
this, I'm nowhere near as qualified as some other people out there are.
Please review at least the commit messages and give me a sanity check.  I
have very limited knowledge of how glx and compositors interact.

Cc: Topi Pohjolainen 
Cc: Chad Versace 
Cc: Eric Anholt 
Cc: Daniel Stone 

Jason Ekstrand (4):
  intel/isl: Add a drm_modifier_get_default_aux_state helper
  i965: Reset miptree aux state on update_image_buffer
  i965/tex_image: Reference the renderbuffer miptree in setTexBuffer2
  i965: Use prepare_external instead of make_shareable in setTexBuffer2

 src/intel/isl/isl.h   | 20 
 src/mesa/drivers/dri/i965/brw_context.c   |  4 +++-
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 22 --
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h |  3 +++
 src/mesa/drivers/dri/i965/intel_tex_image.c   | 25 +
 5 files changed, 51 insertions(+), 23 deletions(-)

-- 
2.5.0.400.gff86faf


Re: [Mesa-dev] [PATCH] gallium/{r600, radeonsi}: Fix segfault with color format (v2)

2017-09-12 Thread Marek Olšák

On Wed, Sep 13, 2017 at 12:31 AM, Marek Olšák  wrote:
> I think we shouldn't be getting PIPE_FORMAT_COUNT in
> is_format_supported in the first place, and therefore drivers don't
> have to work around it.

Or any other invalid formats, for that matter.

Marek

>
> Marek
>
> On Tue, Sep 12, 2017 at 10:38 PM, Denis Pauk  wrote:
>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102552
>>
>> v2: Patch cleanup proposed by Nicolai Hähnle.
>> * deleted changes in si_translate_texformat.
>>
>> Cc: Nicolai Hähnle 
>> Cc: Ilia Mirkin 
>> ---
>>  src/gallium/auxiliary/util/u_format.c|  4 
>>  src/gallium/drivers/r600/r600_state_common.c |  4 
>>  src/gallium/drivers/radeonsi/si_state.c  | 10 +-
>>  3 files changed, 17 insertions(+), 1 deletion(-)
>>
>> diff --git a/src/gallium/auxiliary/util/u_format.c 
>> b/src/gallium/auxiliary/util/u_format.c
>> index 3d281905ce..a6d42a428d 100644
>> --- a/src/gallium/auxiliary/util/u_format.c
>> +++ b/src/gallium/auxiliary/util/u_format.c
>> @@ -238,6 +238,10 @@ util_format_is_subsampled_422(enum pipe_format format)
>>  boolean
>>  util_format_is_supported(enum pipe_format format, unsigned bind)
>>  {
>> +   if (format >= PIPE_FORMAT_COUNT) {
>> +  return FALSE;
>> +   }
>> +
>> if (util_format_is_s3tc(format) && !util_format_s3tc_enabled) {
>>return FALSE;
>> }
>> diff --git a/src/gallium/drivers/r600/r600_state_common.c 
>> b/src/gallium/drivers/r600/r600_state_common.c
>> index c1bce8304b..1515c28091 100644
>> --- a/src/gallium/drivers/r600/r600_state_common.c
>> +++ b/src/gallium/drivers/r600/r600_state_common.c
>> @@ -2284,6 +2284,8 @@ uint32_t r600_translate_texformat(struct pipe_screen 
>> *screen,
>> format = PIPE_FORMAT_A4R4_UNORM;
>>
>> desc = util_format_description(format);
>> +   if (!desc)
>> +   goto out_unknown;
>>
>> /* Depth and stencil swizzling is handled separately. */
>> if (desc->colorspace != UTIL_FORMAT_COLORSPACE_ZS) {
>> @@ -2650,6 +2652,8 @@ uint32_t r600_translate_colorformat(enum chip_class 
>> chip, enum pipe_format forma
>> const struct util_format_description *desc = 
>> util_format_description(format);
>> int channel = util_format_get_first_non_void_channel(format);
>> bool is_float;
>> +   if (!desc)
>> +   return ~0U;
>>
>>  #define HAS_SIZE(x,y,z,w) \
>> (desc->channel[0].size == (x) && desc->channel[1].size == (y) && \
>> diff --git a/src/gallium/drivers/radeonsi/si_state.c 
>> b/src/gallium/drivers/radeonsi/si_state.c
>> index ee070107fd..f7ee24bdc6 100644
>> --- a/src/gallium/drivers/radeonsi/si_state.c
>> +++ b/src/gallium/drivers/radeonsi/si_state.c
>> @@ -1292,6 +1292,8 @@ static void si_emit_db_render_state(struct si_context 
>> *sctx, struct r600_atom *s
>>  static uint32_t si_translate_colorformat(enum pipe_format format)
>>  {
>> const struct util_format_description *desc = 
>> util_format_description(format);
>> +   if (!desc)
>> +   return V_028C70_COLOR_INVALID;
>>
>>  #define HAS_SIZE(x,y,z,w) \
>> (desc->channel[0].size == (x) && desc->channel[1].size == (y) && \
>> @@ -1796,7 +1798,11 @@ static unsigned si_tex_dim(struct si_screen *sscreen, 
>> struct r600_texture *rtex,
>>
>>  static bool si_is_sampler_format_supported(struct pipe_screen *screen, enum 
>> pipe_format format)
>>  {
>> -   return si_translate_texformat(screen, format, 
>> util_format_description(format),
>> +   struct util_format_description *desc = 
>> util_format_description(format);
>> +   if (!desc)
>> +   return false;
>> +
>> +   return si_translate_texformat(screen, format, desc,
>>   
>> util_format_get_first_non_void_channel(format)) != ~0U;
>>  }
>>
>> @@ -1925,6 +1931,8 @@ static unsigned si_is_vertex_format_supported(struct 
>> pipe_screen *screen,
>>   PIPE_BIND_VERTEX_BUFFER)) == 0);
>>
>> desc = util_format_description(format);
>> +   if (!desc)
>> +   return 0;
>>
>> /* There are no native 8_8_8 or 16_16_16 data formats, and we 
>> currently
>>  * select 8_8_8_8 and 16_16_16_16 instead. This works reasonably well
>> --
>> 2.14.1
>>
>> ___
>> mesa-dev mailing list
>> mesa-dev@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] gallium/{r600, radeonsi}: Fix segfault with color format (v2)

2017-09-12 Thread Marek Olšák

I think we shouldn't be getting PIPE_FORMAT_COUNT in
is_format_supported in the first place, and therefore drivers don't
have to work around it.

Marek
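
To make the point concrete: if the range check sits at the single place
where formats are looked up, callers only need a NULL check and no
per-driver workaround.  A minimal sketch with an invented format enum and
table follows; none of these names are the gallium ones.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Invented format list; the COUNT entry is a sentinel, not a format. */
enum sketch_format {
   SKETCH_FORMAT_NONE,
   SKETCH_FORMAT_RGBA8,
   SKETCH_FORMAT_RGB565,
   SKETCH_FORMAT_COUNT
};

struct sketch_format_desc {
   const char *name;
   unsigned bits_per_pixel;
};

static const struct sketch_format_desc sketch_format_table[SKETCH_FORMAT_COUNT] = {
   [SKETCH_FORMAT_NONE]   = { "none",   0  },
   [SKETCH_FORMAT_RGBA8]  = { "rgba8",  32 },
   [SKETCH_FORMAT_RGB565] = { "rgb565", 16 },
};

/* Single lookup point: anything at or past the sentinel yields NULL. */
static const struct sketch_format_desc *
sketch_format_description(enum sketch_format format)
{
   if ((unsigned)format >= SKETCH_FORMAT_COUNT)
      return NULL;
   return &sketch_format_table[format];
}

/* Callers reject invalid formats before any driver-specific translation. */
static bool
sketch_format_is_supported(enum sketch_format format)
{
   const struct sketch_format_desc *desc = sketch_format_description(format);
   return desc != NULL && desc->bits_per_pixel != 0;
}
```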

On Tue, Sep 12, 2017 at 10:38 PM, Denis Pauk  wrote:
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102552
>
> v2: Patch cleanup proposed by Nicolai Hähnle.
> * deleted changes in si_translate_texformat.
>
> Cc: Nicolai Hähnle 
> Cc: Ilia Mirkin 
> ---
>  src/gallium/auxiliary/util/u_format.c|  4 
>  src/gallium/drivers/r600/r600_state_common.c |  4 
>  src/gallium/drivers/radeonsi/si_state.c  | 10 +-
>  3 files changed, 17 insertions(+), 1 deletion(-)
>
> diff --git a/src/gallium/auxiliary/util/u_format.c 
> b/src/gallium/auxiliary/util/u_format.c
> index 3d281905ce..a6d42a428d 100644
> --- a/src/gallium/auxiliary/util/u_format.c
> +++ b/src/gallium/auxiliary/util/u_format.c
> @@ -238,6 +238,10 @@ util_format_is_subsampled_422(enum pipe_format format)
>  boolean
>  util_format_is_supported(enum pipe_format format, unsigned bind)
>  {
> +   if (format >= PIPE_FORMAT_COUNT) {
> +  return FALSE;
> +   }
> +
> if (util_format_is_s3tc(format) && !util_format_s3tc_enabled) {
>return FALSE;
> }
> diff --git a/src/gallium/drivers/r600/r600_state_common.c 
> b/src/gallium/drivers/r600/r600_state_common.c
> index c1bce8304b..1515c28091 100644
> --- a/src/gallium/drivers/r600/r600_state_common.c
> +++ b/src/gallium/drivers/r600/r600_state_common.c
> @@ -2284,6 +2284,8 @@ uint32_t r600_translate_texformat(struct pipe_screen 
> *screen,
> format = PIPE_FORMAT_A4R4_UNORM;
>
> desc = util_format_description(format);
> +   if (!desc)
> +   goto out_unknown;
>
> /* Depth and stencil swizzling is handled separately. */
> if (desc->colorspace != UTIL_FORMAT_COLORSPACE_ZS) {
> @@ -2650,6 +2652,8 @@ uint32_t r600_translate_colorformat(enum chip_class 
> chip, enum pipe_format forma
> const struct util_format_description *desc = 
> util_format_description(format);
> int channel = util_format_get_first_non_void_channel(format);
> bool is_float;
> +   if (!desc)
> +   return ~0U;
>
>  #define HAS_SIZE(x,y,z,w) \
> (desc->channel[0].size == (x) && desc->channel[1].size == (y) && \
> diff --git a/src/gallium/drivers/radeonsi/si_state.c 
> b/src/gallium/drivers/radeonsi/si_state.c
> index ee070107fd..f7ee24bdc6 100644
> --- a/src/gallium/drivers/radeonsi/si_state.c
> +++ b/src/gallium/drivers/radeonsi/si_state.c
> @@ -1292,6 +1292,8 @@ static void si_emit_db_render_state(struct si_context 
> *sctx, struct r600_atom *s
>  static uint32_t si_translate_colorformat(enum pipe_format format)
>  {
> const struct util_format_description *desc = 
> util_format_description(format);
> +   if (!desc)
> +   return V_028C70_COLOR_INVALID;
>
>  #define HAS_SIZE(x,y,z,w) \
> (desc->channel[0].size == (x) && desc->channel[1].size == (y) && \
> @@ -1796,7 +1798,11 @@ static unsigned si_tex_dim(struct si_screen *sscreen, 
> struct r600_texture *rtex,
>
>  static bool si_is_sampler_format_supported(struct pipe_screen *screen, enum 
> pipe_format format)
>  {
> -   return si_translate_texformat(screen, format, 
> util_format_description(format),
> +   struct util_format_description *desc = 
> util_format_description(format);
> +   if (!desc)
> +   return false;
> +
> +   return si_translate_texformat(screen, format, desc,
>   
> util_format_get_first_non_void_channel(format)) != ~0U;
>  }
>
> @@ -1925,6 +1931,8 @@ static unsigned si_is_vertex_format_supported(struct 
> pipe_screen *screen,
>   PIPE_BIND_VERTEX_BUFFER)) == 0);
>
> desc = util_format_description(format);
> +   if (!desc)
> +   return 0;
>
> /* There are no native 8_8_8 or 16_16_16 data formats, and we 
> currently
>  * select 8_8_8_8 and 16_16_16_16 instead. This works reasonably well
> --
> 2.14.1
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 2/2] radv: handle GFX9 1D textures

2017-09-12 Thread Dave Airlie

From: Dave Airlie 

As GFX9 can't handle 1D depth textures, radeonsi and
apparently pro just update all 1D textures to 2D,
and work around it.

This ports the workarounds from radeonsi.

Cc: "17.2" 
Signed-off-by: Dave Airlie 
---
 src/amd/common/ac_nir_to_llvm.c | 80 +++--
 src/amd/vulkan/radv_image.c | 10 --
 2 files changed, 76 insertions(+), 14 deletions(-)

diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index 8f9f771..22e915d 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -3264,13 +3264,13 @@ static LLVMValueRef get_image_coords(struct 
ac_nir_context *ctx,
 
int count;
enum glsl_sampler_dim dim = glsl_get_sampler_dim(type);
+   bool is_array = glsl_sampler_type_is_array(type);
bool add_frag_pos = (dim == GLSL_SAMPLER_DIM_SUBPASS ||
 dim == GLSL_SAMPLER_DIM_SUBPASS_MS);
bool is_ms = (dim == GLSL_SAMPLER_DIM_MS ||
  dim == GLSL_SAMPLER_DIM_SUBPASS_MS);
-
-   count = image_type_to_components_count(dim,
-  
glsl_sampler_type_is_array(type));
+   bool gfx9_1d = ctx->abi->chip_class >= GFX9 && dim == 
GLSL_SAMPLER_DIM_1D;
+   count = image_type_to_components_count(dim, is_array);
 
if (is_ms) {
LLVMValueRef fmask_load_address[3];
@@ -3278,7 +3278,7 @@ static LLVMValueRef get_image_coords(struct 
ac_nir_context *ctx,
 
fmask_load_address[0] = 
LLVMBuildExtractElement(ctx->ac.builder, src0, masks[0], "");
fmask_load_address[1] = 
LLVMBuildExtractElement(ctx->ac.builder, src0, masks[1], "");
-   if (glsl_sampler_type_is_array(type))
+   if (is_array)
fmask_load_address[2] = 
LLVMBuildExtractElement(ctx->ac.builder, src0, masks[2], "");
else
fmask_load_address[2] = NULL;
@@ -3297,7 +3297,7 @@ static LLVMValueRef get_image_coords(struct 
ac_nir_context *ctx,
   sample_index,
   
get_sampler_desc(ctx, instr->variables[0], AC_DESC_FMASK, true, false));
}
-   if (count == 1) {
+   if (count == 1 && !gfx9_1d) {
if (instr->src[0].ssa->num_components)
res = LLVMBuildExtractElement(ctx->ac.builder, src0, 
masks[0], "");
else
@@ -3307,9 +3307,8 @@ static LLVMValueRef get_image_coords(struct 
ac_nir_context *ctx,
if (is_ms)
count--;
for (chan = 0; chan < count; ++chan) {
-   coords[chan] = LLVMBuildExtractElement(ctx->ac.builder, 
src0, masks[chan], "");
+   coords[chan] = llvm_extract_elem(&ctx->ac, src0, chan);
}
-
if (add_frag_pos) {
for (chan = 0; chan < 2; ++chan)
coords[chan] = LLVMBuildAdd(ctx->ac.builder, 
coords[chan], LLVMBuildFPToUI(ctx->ac.builder, ctx->abi->frag_pos[chan],
@@ -3317,6 +3316,16 @@ static LLVMValueRef get_image_coords(struct 
ac_nir_context *ctx,
coords[2] = ac_to_integer(&ctx->ac, 
ctx->abi->inputs[radeon_llvm_reg_index_soa(VARYING_SLOT_LAYER, 0)]);
count++;
}
+
+   if (gfx9_1d) {
+   if (is_array) {
+   coords[2] = coords[1];
+   coords[1] = ctx->ac.i32_0;
+   } else
+   coords[1] = ctx->ac.i32_0;
+   count++;
+   }
+
if (is_ms) {
coords[count] = sample_index;
count++;
@@ -3561,14 +3570,22 @@ static LLVMValueRef visit_image_size(struct 
ac_nir_context *ctx,
 
res = ac_build_image_opcode(>ac, );
 
+   LLVMValueRef two = LLVMConstInt(ctx->ac.i32, 2, false);
+
if (glsl_get_sampler_dim(type) == GLSL_SAMPLER_DIM_CUBE &&
glsl_sampler_type_is_array(type)) {
-   LLVMValueRef two = LLVMConstInt(ctx->ac.i32, 2, false);
LLVMValueRef six = LLVMConstInt(ctx->ac.i32, 6, false);
LLVMValueRef z = LLVMBuildExtractElement(ctx->ac.builder, res, 
two, "");
z = LLVMBuildSDiv(ctx->ac.builder, z, six, "");
res = LLVMBuildInsertElement(ctx->ac.builder, res, z, two, "");
}
+   if (glsl_get_sampler_dim(type) == GLSL_SAMPLER_DIM_1D &&
+   glsl_sampler_type_is_array(type)) {
+   LLVMValueRef layers = LLVMBuildExtractElement(ctx->ac.builder, 
res, two, "");
+   res = LLVMBuildInsertElement(ctx->ac.builder, res, layers,
+
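
The coordinate fix-up that the gfx9_1d branch in get_image_coords performs
can be shown on plain integers.  This is a sketch only: the function name is
hypothetical and the coordinates are ints here rather than LLVM values, but
the slot shuffling follows the hunk above.

```c
#include <assert.h>
#include <stdbool.h>

/* Promote 1D (or 1D array) image coordinates to 2D (or 2D array) ones.
 * For arrays the layer index moves from slot 1 to slot 2; in both cases
 * slot 1 becomes a constant y of 0.  Returns the new component count. */
static int
sketch_promote_1d_coords(int coords[3], int count, bool is_array)
{
   if (is_array)
      coords[2] = coords[1]; /* keep the layer index */
   coords[1] = 0;            /* dummy y coordinate */
   return count + 1;
}
```

For example, an array access at x = 5, layer = 7 becomes (5, 0, 7), and a
plain 1D access at x = 5 becomes (5, 0).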

[Mesa-dev] [PATCH 1/2] radv: don't use iview for meta image width/height.

2017-09-12 Thread Dave Airlie

From: Dave Airlie 

Work out the width/height from the level manually, as on GFX9
we won't minify the iview width/height.

This fixes:
dEQP-VK.api.image_clearing.core.clear_color_image* on gfx9

Cc: "17.2" 
Signed-off-by: Dave Airlie 
---
 src/amd/vulkan/radv_meta_blit.c  | 19 ---
 src/amd/vulkan/radv_meta_clear.c | 15 +--
 2 files changed, 21 insertions(+), 13 deletions(-)

diff --git a/src/amd/vulkan/radv_meta_blit.c b/src/amd/vulkan/radv_meta_blit.c
index 3510e87..2c1a132 100644
--- a/src/amd/vulkan/radv_meta_blit.c
+++ b/src/amd/vulkan/radv_meta_blit.c
@@ -275,15 +275,20 @@ meta_emit_blit(struct radv_cmd_buffer *cmd_buffer,
VkFilter blit_filter)
 {
struct radv_device *device = cmd_buffer->device;
+   uint32_t src_width = radv_minify(src_iview->image->info.width, 
src_iview->base_mip);
+   uint32_t src_height = radv_minify(src_iview->image->info.height, 
src_iview->base_mip);
+   uint32_t src_depth = radv_minify(src_iview->image->info.depth, 
src_iview->base_mip);
+   uint32_t dst_width = radv_minify(dest_iview->image->info.width, 
dest_iview->base_mip);
+   uint32_t dst_height = radv_minify(dest_iview->image->info.height, 
dest_iview->base_mip);
 
assert(src_image->info.samples == dest_image->info.samples);
 
float vertex_push_constants[5] = {
-   (float)src_offset_0.x / (float)src_iview->extent.width,
-   (float)src_offset_0.y / (float)src_iview->extent.height,
-   (float)src_offset_1.x / (float)src_iview->extent.width,
-   (float)src_offset_1.y / (float)src_iview->extent.height,
-   (float)src_offset_0.z / (float)src_iview->extent.depth,
+   (float)src_offset_0.x / (float)src_width,
+   (float)src_offset_0.y / (float)src_height,
+   (float)src_offset_1.x / (float)src_width,
+   (float)src_offset_1.y / (float)src_height,
+   (float)src_offset_0.z / (float)src_depth,
};
 
radv_CmdPushConstants(radv_cmd_buffer_to_handle(cmd_buffer),
@@ -310,8 +315,8 @@ meta_emit_blit(struct radv_cmd_buffer *cmd_buffer,
   .pAttachments = (VkImageView[]) {
   
radv_image_view_to_handle(dest_iview),
   },
-  .width = dest_iview->extent.width,
-  .height = dest_iview->extent.height,
+  .width = dst_width,
+  .height = dst_height,
   .layers = 1,
}, _buffer->pool->alloc, );
VkPipeline pipeline;
diff --git a/src/amd/vulkan/radv_meta_clear.c b/src/amd/vulkan/radv_meta_clear.c
index b3eb389..08a6278 100644
--- a/src/amd/vulkan/radv_meta_clear.c
+++ b/src/amd/vulkan/radv_meta_clear.c
@@ -1202,6 +1202,9 @@ radv_clear_image_layer(struct radv_cmd_buffer *cmd_buffer,
 {
VkDevice device_h = radv_device_to_handle(cmd_buffer->device);
struct radv_image_view iview;
+   uint32_t width = radv_minify(image->info.width, range->baseMipLevel + 
level);
+   uint32_t height = radv_minify(image->info.height, range->baseMipLevel + 
level);
+
radv_image_view_init(&iview, cmd_buffer->device,
 &(VkImageViewCreateInfo) {
 .sType = 
VK_STRUCTURE_TYPE_IMAGE_VIEW_CREATE_INFO,
@@ -1225,9 +1228,9 @@ radv_clear_image_layer(struct radv_cmd_buffer *cmd_buffer,
   .pAttachments = (VkImageView[]) {
   
radv_image_view_to_handle(&iview),
   },
-  .width = iview.extent.width,
-   .height = 
iview.extent.height,
-   .layers = 1
+  .width = width,
+  .height = height,
+  .layers = 1
   },
   _buffer->pool->alloc,
   );
@@ -1283,8 +1286,8 @@ radv_clear_image_layer(struct radv_cmd_buffer *cmd_buffer,
.renderArea = {
.offset = { 0, 0, },
.extent = {
-   .width = 
iview.extent.width,
-   .height = 
iview.extent.height,
+   .width = width,
+
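
For reference, computing a level's size "manually" as the commit message puts
it amounts to the usual mip rule.  The sketch below assumes radv_minify
follows that rule (halve per level, never below 1); the function name is a
placeholder and the level is assumed to be smaller than 32.

```c
#include <assert.h>
#include <stdint.h>

/* Size of one dimension at a given mip level: shift right once per level,
 * clamped so a non-empty image never shrinks below 1.
 * Assumes level < 32 so the shift is well defined. */
static uint32_t
sketch_minify(uint32_t n, uint32_t level)
{
   if (n == 0)
      return 0;
   uint32_t m = n >> level;
   return m > 0 ? m : 1;
}
```

With a 256-wide image, level 3 is 32 wide; a 5-wide image is 2 wide at
level 1 and stays at 1 from level 3 onwards.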

Re: [Mesa-dev] [PATCH 22/23] HACK: anv: Fix query of ELF build-id on ARC++

2017-09-12 Thread Chad Versace

On Mon 04 Sep 2017, Tapani Pälli wrote:
> 
> 
> On 09/04/2017 08:37 AM, Tapani Pälli wrote:
> > 
> > 
> > On 09/02/2017 11:17 AM, Chad Versace wrote:
> > > NOT FOR UPSTREAM.
> > > 
> > > To get the driver's build-id, anv_physical_device_init_uuids() searches
> > > the current process for an ELF phdr for filename "libvulkan_intel.so".
> > > However, Android requires that the library be named
> > > "vulkan.${board}.so".
> > 
> > I don't think this requirement exists, we are using libvulkan_intel.so
> > on Android IA and running Vulkan aps. It's up to the HAL implementation
> > to choose what library it opens.
> 
> OK, now reading further I understand why. You are including HAL module
> implementation in the driver itself and that is why the requirement. That is
> quite a big difference between our implementations, do other vendors include
> HAL in the driver?

At least one other vendor's driver embeds the HAL into its Vulkan driver
(a closed-source driver). I did not check any other vendor's source code. But,
considering some discussions I had with a different major unnamed
vendor, I strongly suspect that other vendor also embeds the HAL.

Re: [Mesa-dev] [PATCH] r600/sb: remove superfluos assert

2017-09-12 Thread Glenn Kennard


On Tue, 12 Sep 2017 19:25:18 +0200, Vadim Girlin  wrote:


On 09/12/2017 12:49 PM, Gert Wollny wrote:

Am Dienstag, den 12.09.2017, 09:56 +0300 schrieb Vadim Girlin:

On 09/11/2017 07:09 PM, Emil Velikov wrote:



Anyway, if num_arrays is 0 there, I suspect it can be a result of
some other issue. At the very least it looks like a potential
performance problem, because in that case we assume all shader
registers can be  accessed with indirect addressing and it can limit
the optimizations significantly. So it might make sense to figure out
why it's zero in the first place, in theory it shouldn't happen.
Maybe something is wrong with the indirect_files bits?


The shader that's failing is this (i.e. no arrays, and indirect access
only to SV).


Is the tested feature really supported by r600g? AFAICS the indirect
index value is unused in the shader code.

Anyway, at first glance it looks like we don't need indirect addressing
for GPRs in this case, so the outer "if" around that assert probably
should handle this case too and skip the assert. I'm not 100% sure though.



FRAG
DCL SV[0], SAMPLEMASK
DCL OUT[0], COLOR
DCL CONST[0][0]
DCL TEMP[0..1], LOCAL
DCL ADDR[0]
IMM[0] FLT32 {1., 0., 0., 0.}
IMM[1] INT32 {1, 0, 0, 0}
   0: MOV TEMP[0], IMM[0].xyyx
   1: UARL ADDR[0].x, CONST[0][0].
   2: USEQ TEMP[1].x, SV[ADDR[0].x]., IMM[1].
   3: UIF TEMP[1].
   4:   MOV TEMP[0].xy, IMM[0].yxyy
   5: ENDIF
   6: MOV OUT[0], TEMP[0]
   7: END

= SHADER #12 ==
PS/BARTS/EVERGREEN =
= 36 dw = 8 gprs = 1 stack
=
  4005 a418 ALU_PUSH_BEFORE 7 @10 KC0[CB0:0-15]
0010  00f9 00400c90 1 x: MOV R2.x,  1.0
0012  04f8 20400c90   y: MOV R2.y,  0
0014  04f8 40400c90   z: MOV R2.z,  0
0016  00f9 60400c90   w: MOV R2.w,  1.0
0018  8080 00800c90   t: MOV R4.x,  KC0[0].x
0020  801f4800 00601d10 2 x: SETE_INT   R3.x,  R0.z, 1
0022  801f00fe 00e0229c 3 MP  x: PRED_SETNE_INT R7.x,  PV.x, 0
0002  0003 8281 JUMP @6 POP:1
0004  000c a804 ALU_POP_AFTER 2 @24
0024  04f8 00400c90 4 x: MOV R2.x,  0
0026  80f9 20400c90   y: MOV R2.y,  1.0
0006  000e a00c ALU 4 @28
0028  0002 00200c90 5 x: MOV R1.x,  R2.x
0030  0402 20200c90   y: MOV R1.y,  R2.y
0032  0802 40200c90   z: MOV R1.z,  R2.z
0034  8c02 60200c90   w: MOV R1.w,  R2.w
0008  c0008000 95200688 EXPORT_DONE PIXEL 0 R1.xyzw  EOP
= SHADER_END






Hi Gert,

Vadim is correct, the fix is to extend the check in the if case above to also 
exclude TGSI_FILE_SYSTEM_VALUE, and keep the assert in place. ie:

 if (pshader->indirect_files & ~((1 << TGSI_FILE_CONSTANT) | (1 << TGSI_FILE_SAMPLER) 
| (1 << TGSI_FILE_SYSTEM_VALUE))) {


Although gl_SampleMaskIn is declared as an array in GLSL, it's effectively a
32-bit mask on all hardware supported by mesa, so the array indexing is simply
ignored. Thanks for looking into this!


/Glenn
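
The extended check suggested above boils down to a bitmask test.  Here is a
small self-contained sketch of it; the enum values are placeholders and do
not match the real TGSI_FILE_* numbering, and the function name is invented.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder register files; real code uses the TGSI_FILE_* enum. */
enum sketch_file {
   SKETCH_FILE_CONSTANT,
   SKETCH_FILE_TEMPORARY,
   SKETCH_FILE_SAMPLER,
   SKETCH_FILE_SYSTEM_VALUE
};

/* True when some indirectly addressed file actually lives in GPRs.
 * Constants, samplers and system values are masked out, so a shader that
 * only indexes a system value never reaches the num_arrays assert. */
static bool
sketch_needs_gpr_indirect(unsigned indirect_files)
{
   const unsigned ignored = (1u << SKETCH_FILE_CONSTANT) |
                            (1u << SKETCH_FILE_SAMPLER) |
                            (1u << SKETCH_FILE_SYSTEM_VALUE);

   return (indirect_files & ~ignored) != 0;
}
```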

[Mesa-dev] [PATCH 0/3] RadeonSI sync_file fences

2017-09-12 Thread Marek Olšák

Hi,

This series adds support for sync_file fences, enabling
EGL_ANDROID_native_fence_sync.

Dependencies:

Kernel patches (based on drm-next):
drm/syncobj: extract two helpers from drm_syncobj_create
drm/syncobj: add a new helper drm_syncobj_get_fd
drm/amdgpu: add FENCE_TO_HANDLE ioctl that returns syncobj or sync_file

libdrm patches:
amdgpu: add sync_file import and export functions
drm: add drmSyncobjWait wrapper
amdgpu: add amdgpu_cs_syncobj_wait
amdgpu: add amdgpu_cs_fence_to_handle

So far the extension has only been tested with piglit, which only contains
very basic tests. I wonder if there are better tests.

Please review.

Thanks,
Marek

[Mesa-dev] [PATCH 3/3] radeonsi: implement sync_file import/export

2017-09-12 Thread Marek Olšák

From: Marek Olšák 

---
 src/gallium/drivers/radeon/r600_pipe_common.c | 77 ++-
 src/gallium/drivers/radeonsi/si_pipe.c|  4 +-
 2 files changed, 79 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/radeon/r600_pipe_common.c 
b/src/gallium/drivers/radeon/r600_pipe_common.c
index 48fda7b..b66acf7 100644
--- a/src/gallium/drivers/radeon/r600_pipe_common.c
+++ b/src/gallium/drivers/radeon/r600_pipe_common.c
@@ -31,20 +31,21 @@
 #include "util/u_draw_quad.h"
 #include "util/u_memory.h"
 #include "util/u_format_s3tc.h"
 #include "util/u_upload_mgr.h"
 #include "os/os_time.h"
 #include "vl/vl_decoder.h"
 #include "vl/vl_video_buffer.h"
 #include "radeon/radeon_video.h"
 #include 
 #include 
+#include 
 
 #ifndef HAVE_LLVM
 #define HAVE_LLVM 0
 #endif
 
 #if HAVE_LLVM
 #include 
 #endif
 
 #ifndef MESA_LLVM_VERSION_PATCH
@@ -448,20 +449,89 @@ static void r600_fence_server_sync(struct pipe_context 
*ctx,
 * this fence dependency is signalled.
 *
 * Should we flush the context to allow more GPU parallelism?
 */
if (rfence->sdma)
r600_add_fence_dependency(rctx, rfence->sdma);
if (rfence->gfx)
r600_add_fence_dependency(rctx, rfence->gfx);
 }
 
+static void r600_create_fence_fd(struct pipe_context *ctx,
+struct pipe_fence_handle **pfence, int fd)
+{
+   struct r600_common_screen *rscreen = (struct 
r600_common_screen*)ctx->screen;
+   struct radeon_winsys *ws = rscreen->ws;
+   struct r600_multi_fence *rfence;
+
+   *pfence = NULL;
+
+   if (!rscreen->info.has_sync_file)
+   return;
+
+   rfence = CALLOC_STRUCT(r600_multi_fence);
+   if (!rfence)
+   return;
+
+   pipe_reference_init(&rfence->reference, 1);
+   rfence->gfx = ws->fence_import_sync_file(ws, fd);
+   if (!rfence->gfx) {
+   FREE(rfence);
+   return;
+   }
+
+   *pfence = (struct pipe_fence_handle*)rfence;
+}
+
+static int r600_fence_get_fd(struct pipe_screen *screen,
+struct pipe_fence_handle *fence)
+{
+   struct r600_common_screen *rscreen = (struct r600_common_screen*)screen;
+   struct radeon_winsys *ws = rscreen->ws;
+   struct r600_multi_fence *rfence = (struct r600_multi_fence *)fence;
+   int gfx_fd = -1, sdma_fd = -1;
+
+   if (!rscreen->info.has_sync_file)
+   return -1;
+
+   /* Deferred fences aren't supported. */
+   assert(!rfence->gfx_unflushed.ctx);
+   if (rfence->gfx_unflushed.ctx)
+   return -1;
+
+   if (rfence->sdma) {
+   sdma_fd = ws->fence_export_sync_file(ws, rfence->sdma);
+   if (sdma_fd == -1)
+   return -1;
+   }
+   if (rfence->gfx) {
+   gfx_fd = ws->fence_export_sync_file(ws, rfence->gfx);
+   if (gfx_fd == -1) {
+   if (sdma_fd != -1)
+   close(sdma_fd);
+   return -1;
+   }
+   }
+
+   /* If we don't have FDs at this point, it means we don't have fences
+* either. */
+   if (sdma_fd == -1)
+   return gfx_fd;
+   if (gfx_fd == -1)
+   return sdma_fd;
+
+   /* Get a fence that will be a combination of both fences. */
+   sync_accumulate("radeonsi", &gfx_fd, sdma_fd);
+   close(sdma_fd);
+   return gfx_fd;
+}
+
 static void r600_flush_from_st(struct pipe_context *ctx,
   struct pipe_fence_handle **fence,
   unsigned flags)
 {
struct pipe_screen *screen = ctx->screen;
struct r600_common_context *rctx = (struct r600_common_context *)ctx;
struct radeon_winsys *ws = rctx->ws;
struct pipe_fence_handle *gfx_fence = NULL;
struct pipe_fence_handle *sdma_fence = NULL;
bool deferred_fence = false;
@@ -476,23 +546,26 @@ static void r600_flush_from_st(struct pipe_context *ctx,
 
if (!radeon_emitted(rctx->gfx.cs, rctx->initial_gfx_cs_size)) {
if (fence)
ws->fence_reference(&gfx_fence, rctx->last_gfx_fence);
if (!(flags & PIPE_FLUSH_DEFERRED))
ws->cs_sync_flush(rctx->gfx.cs);
} else {
/* Instead of flushing, create a deferred fence. Constraints:
 * - The state tracker must allow a deferred flush.
 * - The state tracker must request a fence.
+* - fence_get_fd is not allowed.
 * Thread safety in fence_finish must be ensured by the state tracker.
 */
-   if (flags & PIPE_FLUSH_DEFERRED && fence) {
+   if (flags & PIPE_FLUSH_DEFERRED &&
+   !(flags & PIPE_FLUSH_FENCE_FD) &&
+   fence) {
gfx_fence =
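
The tail of r600_fence_get_fd in the patch above returns whichever FD exists and only merges when both rings produced one. A minimal sketch of that decision follows. The `fake_ops` struct and its two callbacks are hypothetical stand-ins for sync_accumulate and close, introduced only so the logic can be exercised without a real sync_file; they are not Mesa or libsync API.

```c
#include <assert.h>

/* Hypothetical stand-ins for sync_accumulate() and close(); not real API. */
struct fake_ops {
    int  (*merge)(int *dst_fd, int src_fd);
    void (*close_fd)(int fd);
};

static int closed_fds; /* counts calls to the fake close */

static int fake_merge(int *dst_fd, int src_fd)
{
    /* Encode both inputs in the result so a caller can see a merge happened. */
    *dst_fd = *dst_fd * 100 + src_fd;
    return 0;
}

static void fake_close(int fd)
{
    (void)fd;
    closed_fds++;
}

/* Mirrors the tail of r600_fence_get_fd: -1 means "no fence on that ring". */
static int combine_fence_fds(const struct fake_ops *ops, int gfx_fd, int sdma_fd)
{
    if (sdma_fd == -1)
        return gfx_fd;   /* may itself be -1 if there were no fences at all */
    if (gfx_fd == -1)
        return sdma_fd;

    ops->merge(&gfx_fd, sdma_fd); /* gfx_fd now stands for both fences */
    ops->close_fd(sdma_fd);       /* the caller only ever receives one FD */
    return gfx_fd;
}
```

With both FDs present the SDMA FD is closed after the merge, so the caller owns exactly one descriptor in every case.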

[Mesa-dev] [PATCH 2/3] winsys/amdgpu: implement sync_file import/export

2017-09-12 Thread Marek Olšák

From: Marek Olšák 

syncobj is used internally for interactions with command submission.
---
 src/gallium/drivers/radeon/radeon_winsys.h |  12 +++
 src/gallium/winsys/amdgpu/drm/amdgpu_cs.c  | 115 +++--
 src/gallium/winsys/amdgpu/drm/amdgpu_cs.h  |  18 -
 3 files changed, 138 insertions(+), 7 deletions(-)

diff --git a/src/gallium/drivers/radeon/radeon_winsys.h 
b/src/gallium/drivers/radeon/radeon_winsys.h
index 99e22e0..2438ec2 100644
--- a/src/gallium/drivers/radeon/radeon_winsys.h
+++ b/src/gallium/drivers/radeon/radeon_winsys.h
@@ -590,20 +590,32 @@ struct radeon_winsys {
struct pipe_fence_handle *fence,
uint64_t timeout);
 
 /**
  * Reference counting for fences.
  */
 void (*fence_reference)(struct pipe_fence_handle **dst,
 struct pipe_fence_handle *src);
 
 /**
+ * Create a new fence object corresponding to the given sync_file.
+ */
+struct pipe_fence_handle *(*fence_import_sync_file)(struct radeon_winsys *ws,
+   int fd);
+
+/**
+ * Return a sync_file FD corresponding to the given fence object.
+ */
+int (*fence_export_sync_file)(struct radeon_winsys *ws,
+ struct pipe_fence_handle *fence);
+
+/**
  * Initialize surface
  *
  * \param ws    The winsys this function is called from.
  * \param tex   Input texture description
  * \param flags Bitmask of RADEON_SURF_* flags
  * \param bpe   Bytes per pixel, it can be different for Z buffers.
  * \param mode  Preferred tile mode. (linear, 1D, or 2D)
  * \param surf  Output structure
  */
 int (*surface_init)(struct radeon_winsys *ws,
diff --git a/src/gallium/winsys/amdgpu/drm/amdgpu_cs.c 
b/src/gallium/winsys/amdgpu/drm/amdgpu_cs.c
index 768a164..d9d2a8b 100644
--- a/src/gallium/winsys/amdgpu/drm/amdgpu_cs.c
+++ b/src/gallium/winsys/amdgpu/drm/amdgpu_cs.c
@@ -40,30 +40,86 @@ DEBUG_GET_ONCE_BOOL_OPTION(noop, "RADEON_NOOP", false)
 
 /* FENCES */
 
 static struct pipe_fence_handle *
 amdgpu_fence_create(struct amdgpu_ctx *ctx, unsigned ip_type,
 unsigned ip_instance, unsigned ring)
 {
struct amdgpu_fence *fence = CALLOC_STRUCT(amdgpu_fence);
 
fence->reference.count = 1;
+   fence->ws = ctx->ws;
fence->ctx = ctx;
fence->fence.context = ctx->ctx;
fence->fence.ip_type = ip_type;
fence->fence.ip_instance = ip_instance;
fence->fence.ring = ring;
fence->submission_in_progress = true;
p_atomic_inc(&ctx->refcount);
return (struct pipe_fence_handle *)fence;
 }
 
+static struct pipe_fence_handle *
+amdgpu_fence_import_sync_file(struct radeon_winsys *rws, int fd)
+{
+   struct amdgpu_winsys *ws = amdgpu_winsys(rws);
+   struct amdgpu_fence *fence = CALLOC_STRUCT(amdgpu_fence);
+
+   if (!fence)
+  return NULL;
+
+   pipe_reference_init(&fence->reference, 1);
+   fence->ws = ws;
+   /* fence->ctx == NULL means that the fence is syncobj-based. */
+
+   /* Convert sync_file into syncobj. */
+   int r = amdgpu_cs_create_syncobj(ws->dev, &fence->syncobj);
+   if (r) {
+  FREE(fence);
+  return NULL;
+   }
+
+   r = amdgpu_cs_syncobj_import_sync_file(ws->dev, fence->syncobj, fd);
+   if (r) {
+  amdgpu_cs_destroy_syncobj(ws->dev, fence->syncobj);
+  FREE(fence);
+  return NULL;
+   }
+   return (struct pipe_fence_handle*)fence;
+}
+
+static int amdgpu_fence_export_sync_file(struct radeon_winsys *rws,
+struct pipe_fence_handle *pfence)
+{
+   struct amdgpu_winsys *ws = amdgpu_winsys(rws);
+   struct amdgpu_fence *fence = (struct amdgpu_fence*)pfence;
+
+   if (amdgpu_fence_is_syncobj(fence)) {
+  int fd, r;
+
+  /* Convert syncobj into sync_file. */
+  r = amdgpu_cs_syncobj_export_sync_file(ws->dev, fence->syncobj, &fd);
+  return r ? -1 : fd;
+   }
+
+   os_wait_until_zero(&fence->submission_in_progress, PIPE_TIMEOUT_INFINITE);
+
+   /* Convert the amdgpu fence into a fence FD. */
+   int fd;
+   if (amdgpu_cs_fence_to_handle(ws->dev, &fence->fence,
+ AMDGPU_FENCE_TO_HANDLE_GET_SYNC_FILE_FD,
+ (uint32_t*)&fd))
+  return -1;
+
+   return fd;
+}
+
 static void amdgpu_fence_submitted(struct pipe_fence_handle *fence,
uint64_t seq_no,
uint64_t *user_fence_cpu_address)
 {
struct amdgpu_fence *rfence = (struct amdgpu_fence*)fence;
 
rfence->fence.fence = seq_no;
rfence->user_fence_cpu_address = user_fence_cpu_address;
rfence->submission_in_progress = false;
 }
@@ -81,20 +137,35 @@ bool amdgpu_fence_wait(struct pipe_fence_handle *fence, uint64_t timeout,
 {
struct amdgpu_fence *rfence = (struct amdgpu_fence*)fence;
uint32_t expired;
int64_t abs_timeout;
uint64_t *user_fence_cpu;
int r;
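
The import path in the patch above acquires resources in stages (allocate the fence, create a syncobj, import the sync_file into it) and undoes the earlier stages when a later one fails. A minimal sketch of that rollback pattern, with every `fake_*` name being a hypothetical stub rather than a libdrm or Mesa function:

```c
#include <assert.h>
#include <stdlib.h>

struct fake_fence {
    int syncobj; /* hypothetical handle */
};

static int live_syncobjs; /* how many fake syncobjs currently exist */

static int fake_create_syncobj(int *out)
{
    *out = 42;
    live_syncobjs++;
    return 0;
}

static void fake_destroy_syncobj(int handle)
{
    (void)handle;
    live_syncobjs--;
}

/* Pretend the kernel rejects negative FDs. */
static int fake_import_sync_file(int handle, int fd)
{
    (void)handle;
    return fd < 0 ? -1 : 0;
}

static struct fake_fence *fence_import(int fd)
{
    struct fake_fence *fence = calloc(1, sizeof(*fence));
    if (!fence)
        return NULL;

    if (fake_create_syncobj(&fence->syncobj)) {
        free(fence);                          /* undo stage 1 */
        return NULL;
    }

    if (fake_import_sync_file(fence->syncobj, fd)) {
        fake_destroy_syncobj(fence->syncobj); /* undo stage 2 */
        free(fence);                          /* undo stage 1 */
        return NULL;
    }
    return fence;
}
```

A failed import leaves no syncobj behind, which is the property the two error branches in amdgpu_fence_import_sync_file are there to provide.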

[Mesa-dev] [PATCH 1/3] ac: add radeon_info::has_sync_file

2017-09-12 Thread Marek Olšák

From: Marek Olšák 

---
 src/amd/common/ac_gpu_info.c  | 1 +
 src/amd/common/ac_gpu_info.h  | 1 +
 src/gallium/drivers/radeon/r600_pipe_common.c | 1 +
 3 files changed, 3 insertions(+)

diff --git a/src/amd/common/ac_gpu_info.c b/src/amd/common/ac_gpu_info.c
index e55d864..5125532 100644
--- a/src/amd/common/ac_gpu_info.c
+++ b/src/amd/common/ac_gpu_info.c
@@ -260,20 +260,21 @@ bool ac_query_gpu_info(int fd, amdgpu_device_handle dev,
info->max_se = amdinfo->num_shader_engines;
info->max_sh_per_se = amdinfo->num_shader_arrays_per_engine;
info->has_hw_decode =
(uvd.available_rings != 0) || (vcn_dec.available_rings != 0);
info->uvd_fw_version =
uvd.available_rings ? uvd_version : 0;
info->vce_fw_version =
vce.available_rings ? vce_version : 0;
info->has_userptr = true;
info->has_syncobj = has_syncobj(fd);
+   info->has_sync_file = info->has_syncobj && info->drm_minor >= 21;
info->num_render_backends = amdinfo->rb_pipes;
info->clock_crystal_freq = amdinfo->gpu_counter_freq;
if (!info->clock_crystal_freq) {
fprintf(stderr, "amdgpu: clock crystal frequency is 0, timestamps will be wrong\n");
info->clock_crystal_freq = 1;
}
info->tcc_cache_line_size = 64; /* TC L2 line size on GCN */
if (info->chip_class == GFX9) {
info->num_tile_pipes = 1 << G_0098F8_NUM_PIPES(amdinfo->gb_addr_cfg);
info->pipe_interleave_bytes =
diff --git a/src/amd/common/ac_gpu_info.h b/src/amd/common/ac_gpu_info.h
index 06b0c77..a792a1e 100644
--- a/src/amd/common/ac_gpu_info.h
+++ b/src/amd/common/ac_gpu_info.h
@@ -71,20 +71,21 @@ struct radeon_info {
uint32_t vce_harvest_config;
uint32_t clock_crystal_freq;
uint32_t tcc_cache_line_size;
 
/* Kernel info. */
uint32_t drm_major; /* version */
uint32_t drm_minor;
uint32_t drm_patchlevel;
bool has_userptr;
bool has_syncobj;
+   bool has_sync_file;
 
/* Shader cores. */
uint32_t r600_max_quad_pipes; /* wave size / 16 */
uint32_t max_shader_clock;
uint32_t num_good_compute_units;
uint32_t max_se; /* shader engines */
uint32_t max_sh_per_se; /* shader arrays per shader engine */
 
/* Render backends (color + depth blocks). */
uint32_t r300_num_gb_pipes;
diff --git a/src/gallium/drivers/radeon/r600_pipe_common.c 
b/src/gallium/drivers/radeon/r600_pipe_common.c
index fc27b4c..48fda7b 100644
--- a/src/gallium/drivers/radeon/r600_pipe_common.c
+++ b/src/gallium/drivers/radeon/r600_pipe_common.c
@@ -1556,20 +1556,21 @@ bool r600_common_screen_init(struct r600_common_screen *rscreen,
printf("me_fw_version = %i\n", rscreen->info.me_fw_version);
printf("pfp_fw_version = %i\n", rscreen->info.pfp_fw_version);
printf("ce_fw_version = %i\n", rscreen->info.ce_fw_version);
printf("vce_harvest_config = %i\n", rscreen->info.vce_harvest_config);
printf("clock_crystal_freq = %i\n", rscreen->info.clock_crystal_freq);
printf("tcc_cache_line_size = %u\n", rscreen->info.tcc_cache_line_size);
printf("drm = %i.%i.%i\n", rscreen->info.drm_major,
   rscreen->info.drm_minor, rscreen->info.drm_patchlevel);
printf("has_userptr = %i\n", rscreen->info.has_userptr);
printf("has_syncobj = %u\n", rscreen->info.has_syncobj);
+   printf("has_sync_file = %u\n", rscreen->info.has_sync_file);
 
printf("r600_max_quad_pipes = %i\n", rscreen->info.r600_max_quad_pipes);
printf("max_shader_clock = %i\n", rscreen->info.max_shader_clock);
printf("num_good_compute_units = %i\n", rscreen->info.num_good_compute_units);
printf("max_se = %i\n", rscreen->info.max_se);
printf("max_sh_per_se = %i\n", rscreen->info.max_sh_per_se);
 
printf("r600_gb_backend_map = %i\n", rscreen->info.r600_gb_backend_map);
printf("r600_gb_backend_map_valid = %i\n", rscreen->info.r600_gb_backend_map_valid);
printf("r600_num_banks = %i\n", rscreen->info.r600_num_banks);
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
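
The one functional line in this patch derives has_sync_file from two other facts: syncobj support and a DRM minor version of at least 21. A minimal sketch of that gate, using a cut-down hypothetical `kernel_info` struct instead of the real radeon_info:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical subset of radeon_info, just enough for the check. */
struct kernel_info {
    uint32_t drm_minor;
    bool     has_syncobj;
    bool     has_sync_file;
};

/* Same expression as the patch: both conditions must hold. */
static void detect_sync_file(struct kernel_info *info)
{
    info->has_sync_file = info->has_syncobj && info->drm_minor >= 21;
}

/* Callers can then bail out early, as r600_fence_get_fd does with -1. */
static int export_fd_or_fail(const struct kernel_info *info, int fd)
{
    if (!info->has_sync_file)
        return -1;
    return fd;
}
```

Computing the flag once at query time keeps the version comparison out of every fence entry point.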

Re: [Mesa-dev] [PATCH] gallium/{r600, radeonsi}: Fix segfault with color format

2017-09-12 Thread Денис Паук

Thank you, I have sent a new patch version.

Current call sequences, listed from callee back to caller:

r600 =>
* r600_state_common.c::r600_translate_texformat and
r600_state_common.c::r600_translate_colorformat are called from
evergreen_state::r600_is_colorbuffer_format_supported and
r600_state::r600_is_colorbuffer_format_supported.

radeonsi =>
  * si_state::si_translate_colorformat =>
si_state::si_is_colorbuffer_format_supported =>
si_state::si_is_format_supported
  * si_state::si_is_vertex_format_supported =>
si_state::si_is_format_supported

It looks like deleting the changes in si_translate_texformat will be enough,
because we already check the format in si_is_sampler_format_supported.
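
The idea discussed here is to reject an out-of-range format once, in the is_*_format_supported entry point, so that the translate helpers further down can assume a valid description. A minimal sketch of that split follows. The enum, table and function names are hypothetical and far simpler than the real pipe_format machinery.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical miniature format enum and description table. */
enum fmt { FMT_NONE, FMT_RGBA8, FMT_R16F, FMT_COUNT };

struct fmt_desc {
    unsigned bits; /* 0 means "no hardware encoding" in this sketch */
};

static const struct fmt_desc fmt_table[FMT_COUNT] = {
    [FMT_NONE]  = { 0 },
    [FMT_RGBA8] = { 32 },
    [FMT_R16F]  = { 16 },
};

/* Returns NULL for values outside the table. */
static const struct fmt_desc *fmt_description(unsigned format)
{
    return format < FMT_COUNT ? &fmt_table[format] : NULL;
}

/* Inner helper: callers must have validated the format already. */
static unsigned translate_colorformat(unsigned format)
{
    const struct fmt_desc *desc = fmt_description(format);
    assert(desc);
    return desc->bits;
}

/* Entry point: the only place that has to cope with garbage input. */
static bool is_format_supported(unsigned format)
{
    if (format >= FMT_COUNT)
        return false;
    return translate_colorformat(format) != 0;
}
```

Keeping the range check at the entry point and an assertion in the helper matches the suggestion in the quoted review below.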


On Mon, Sep 11, 2017 at 5:21 PM, Nicolai Hähnle  wrote:

> On 10.09.2017 20:52, Denis Pauk wrote:
>
>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102552
>> ---
>>   src/gallium/auxiliary/util/u_format.c|  4 
>>   src/gallium/drivers/r600/r600_state_common.c |  4 
>>   src/gallium/drivers/radeonsi/si_state.c  | 13 -
>>   3 files changed, 20 insertions(+), 1 deletion(-)
>>
>> diff --git a/src/gallium/auxiliary/util/u_format.c
>> b/src/gallium/auxiliary/util/u_format.c
>> index 3d281905ce..a6d42a428d 100644
>> --- a/src/gallium/auxiliary/util/u_format.c
>> +++ b/src/gallium/auxiliary/util/u_format.c
>> @@ -238,6 +238,10 @@ util_format_is_subsampled_422(enum pipe_format
>> format)
>>   boolean
>>   util_format_is_supported(enum pipe_format format, unsigned bind)
>>   {
>> +   if (format >= PIPE_FORMAT_COUNT) {
>> +  return FALSE;
>> +   }
>> +
>>  if (util_format_is_s3tc(format) && !util_format_s3tc_enabled) {
>> return FALSE;
>>  }
>> diff --git a/src/gallium/drivers/r600/r600_state_common.c
>> b/src/gallium/drivers/r600/r600_state_common.c
>> index c1bce8304b..1515c28091 100644
>> --- a/src/gallium/drivers/r600/r600_state_common.c
>> +++ b/src/gallium/drivers/r600/r600_state_common.c
>> @@ -2284,6 +2284,8 @@ uint32_t r600_translate_texformat(struct
>> pipe_screen *screen,
>> format = PIPE_FORMAT_A4R4_UNORM;
>> desc = util_format_description(format);
>> +   if (!desc)
>> +   goto out_unknown;
>> /* Depth and stencil swizzling is handled separately. */
>> if (desc->colorspace != UTIL_FORMAT_COLORSPACE_ZS) {
>> @@ -2650,6 +2652,8 @@ uint32_t r600_translate_colorformat(enum
>> chip_class chip, enum pipe_format forma
>> const struct util_format_description *desc =
>> util_format_description(format);
>> int channel = util_format_get_first_non_void_channel(format);
>> bool is_float;
>> +   if (!desc)
>> +   return ~0U;
>> #define HAS_SIZE(x,y,z,w) \
>> (desc->channel[0].size == (x) && desc->channel[1].size == (y) && \
>> diff --git a/src/gallium/drivers/radeonsi/si_state.c
>> b/src/gallium/drivers/radeonsi/si_state.c
>> index ee070107fd..06fd5718fd 100644
>> --- a/src/gallium/drivers/radeonsi/si_state.c
>> +++ b/src/gallium/drivers/radeonsi/si_state.c
>> @@ -1292,6 +1292,8 @@ static void si_emit_db_render_state(struct
>> si_context *sctx, struct r600_atom *s
>>   static uint32_t si_translate_colorformat(enum pipe_format format)
>>   {
>> const struct util_format_description *desc =
>> util_format_description(format);
>> +   if (!desc)
>> +   return V_028C70_COLOR_INVALID;
>> #define HAS_SIZE(x,y,z,w) \
>> (desc->channel[0].size == (x) && desc->channel[1].size == (y) && \
>> @@ -1442,6 +1444,9 @@ static uint32_t si_translate_texformat(struct
>> pipe_screen *screen,
>> bool uniform = true;
>> int i;
>>   + if (!desc)
>> +   goto out_unknown;
>> +
>> /* Colorspace (return non-RGB formats directly). */
>> switch (desc->colorspace) {
>> /* Depth stencil formats */
>> @@ -1796,7 +1801,11 @@ static unsigned si_tex_dim(struct si_screen
>> *sscreen, struct r600_texture *rtex,
>> static bool si_is_sampler_format_supported(struct pipe_screen
>> *screen, enum pipe_format format)
>>   {
>> -   return si_translate_texformat(screen, format,
>> util_format_description(format),
>> +   struct util_format_description *desc =
>> util_format_description(format);
>> +   if (!desc)
>> +   return false;
>> +
>> +   return si_translate_texformat(screen, format, desc,
>>   
>> util_format_get_first_non_void_channel(format))
>> != ~0U;
>>   }
>>   @@ -1925,6 +1934,8 @@ static unsigned si_is_vertex_format_supported(struct
>> pipe_screen *screen,
>>   PIPE_BIND_VERTEX_BUFFER)) == 0);
>> desc = util_format_description(format);
>> +   if (!desc)
>> +   return 0;
>>
>
> The two si_is_*_format_supported hunks are fine, but all the other places
> should never see an invalid pipe format, so you should remove those hunks
> (perhaps add an assertion).
>
> Thanks,
> Nicolai
>

Re: [Mesa-dev] [RFC] NIR serialization

2017-09-12 Thread Jason Ekstrand

On Tue, Sep 12, 2017 at 11:09 AM, Jason Ekstrand 
wrote:

> On Tue, Sep 12, 2017 at 10:12 AM, Ian Romanick 
> wrote:
>
>> On 09/11/2017 11:17 PM, Kenneth Graunke wrote:
>> > On Monday, September 11, 2017 9:23:05 PM PDT Ian Romanick wrote:
>> >> On 09/08/2017 01:59 AM, Kenneth Graunke wrote:
>> >>> On Thursday, September 7, 2017 4:26:04 PM PDT Jordan Justen wrote:
>> >>>> On 2017-09-06 14:12:41, Daniel Schürmann wrote:
>> >>>>> Hello together!
>> >>>>> Recently, we had a small discussion (off the list) about the NIR
>> >>>>> serialization, which was previously discussed in [RFC] ARB_gl_spirv and
>> >>>>> NIR backend for radeonsi.
>> >>>>>
>> >>>>> As this topic could be interesting to more people, I would like to
>> >>>>> share, what was talked about so far (You might want to read from
>> >>>>> bottom up).
>> >>>>>
>> >>>>> TL;DR:
>> >>>>> - NIR serialization is in demand for shader cache
>> >>>>> - could be done either directly (NIR binary form) or via SPIR-V
>> >>>>> - Ian et al. are working on GLSL IR -> SPIR-V transformation, which
>> >>>>> could be adapted for a NIR -> SPIR-V pass
>> >>>>> - in NIR representation, some type information is lost
>> >>>>> - thus, a serialization via SPIR-V could NOT be a glslang alternative
>> >>>>> (otoh, the GLSL IR->SPIR-V pass could), but only for spirv-opt (if the
>> >>>>> output is valid SPIR-V)
>> >>>>
>> >>>> Ian,
>> >>>>
>> >>>> Tim was suggesting that we might look at serializing nir for the i965
>> >>>> shader cache. Based on this email, it sounds like serialized nir would
>> >>>> not be enough for the shader cache as some GLSL type info would be
>> >>>> lost. It sounds like GLSL IR => SPIR-V would be good enough. Is that
>> >>>> right?
>> >>>>
>> >>>> I don't think we have a strict requirement for the GLSL IR => SPIR-V
>> >>>> path for GL 4.6, right? So, this is more of a 'nice-to-have'?
>> >>>>
>> >>>> I'm not sure we'd want to make i965 shader cache depend on a
>> >>>> nice-to-have feature. (Unless we're pretty sure it'll be available
>> >>>> soon.)
>> >>>>
>> >>>> But, it would be nice to not have to fallback to compiling the GLSL
>> >>>> for i965 shader cache, so it would be worth waiting a little bit to be
>> >>>> able to rely on a SPIR-V serialization of the GLSL IR.
>> >>>>
>> >>>> What do you suggest?
>> >>>>
>> >>>> -Jordan
>> >>>
>> >>> We shouldn't use SPIR-V for the shader cache.
>> >>>
>> >>> The compilation process for GLSL is: GLSL -> GLSL IR -> NIR -> i965
>> IRs.
>> >>> Storing the content at one of those points, and later loading it and
>> >>> resuming the normal compilation process from that point...that's
>> totally
>> >>> reasonable.
>> >>>
>> >>> Having a fallback for "some things in the cache but not all the
>> variants
>> >>> we needed" suddenly take a different compilation pipeline, i.e. SPIR-V
>> >>> -> NIR -> ... seems risky.  It's a different compilation path that we
>> >>> don't normally use.  And one you'd only hit in limited circumstances.
>> >>> There's a lot of potential for really obscure bugs.
>> >>
>> >> Since we're going to expose exactly that path for GL_ARB_spirv / OpenGL
>> >> 4.6, we'd better make sure it works always.  Right?
>> >
>> > In addition to the old pipeline:
>> >
>> > - GLSL from the app -> GLSL IR -> NIR -> i965 IR
>> >
>> > GL_ARB_spirv and OpenGL 4.6 add a second pipeline:
>> >
>> > - SPIR-V from the app -> NIR -> i965 IR
>> >
>> > Both of those absolutely have to work.  But these:
>> >
>> > - GLSL -> GLSL IR -> NIR -> SPIR-V -> NIR -> i965 IRs
>> > - GLSL -> GLSL IR -> SPIR-V -> NIR -> i965 IRs
>> >
>> > aren't required to work, or even be supported.  It makes a lot of sense
>> > to support them - both for testing purposes, and as an alternative to
>> > glslang, for a broader tooling ecosystem.
>> >
>> > The thing that concerns me is that if you use SPIR-V for the cache, you
>> > need these paths to not just work, but be _indistinguishable_ from one
>> > another:
>> >
>> > - GLSL -> GLSL IR -> NIR -> ...
>> > - GLSL -> GLSL IR -> NIR -> SPIR-V, then SPIR-V -> NIR -> ...
>> >
>> > Otherwise the original compile and partially-cached recompile might have
>> > different properties.  For example, if the SPIR-V step messes with
>> > variables or instruction ordering a little, it could trip up the loop
>> > unroller so the original compile gets unrolled, and the recompile from
>> > partial cache doesn't get unrolled.  I don't want to have to debug that.
>>
>> That is a very compelling argument.  If we want Mesa to be an
>> alternative to glslang, I think we would like to have that property, but
>> it's not a hard requirement for that use case.
>>
>
> I also find that argument rather compelling.  The SPIR-V -> NIR pass is
> *not* a simple pass.  It does piles of lowering and things on-the-fly as
> well as creating temporary variables for various things.  The best we could
> hope to guarantee would be that NIR -> SPIR-V -> NIR -> vars_to_ssa -> CSE
> is

[Mesa-dev] [PATCH] gallium/{r600, radeonsi}: Fix segfault with color format (v2)

2017-09-12 Thread Denis Pauk

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102552

v2: Patch cleanup proposed by Nicolai Hähnle.
* deleted changes in si_translate_texformat.

Cc: Nicolai Hähnle 
Cc: Ilia Mirkin 
---
 src/gallium/auxiliary/util/u_format.c|  4 
 src/gallium/drivers/r600/r600_state_common.c |  4 
 src/gallium/drivers/radeonsi/si_state.c  | 10 +-
 3 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/src/gallium/auxiliary/util/u_format.c 
b/src/gallium/auxiliary/util/u_format.c
index 3d281905ce..a6d42a428d 100644
--- a/src/gallium/auxiliary/util/u_format.c
+++ b/src/gallium/auxiliary/util/u_format.c
@@ -238,6 +238,10 @@ util_format_is_subsampled_422(enum pipe_format format)
 boolean
 util_format_is_supported(enum pipe_format format, unsigned bind)
 {
+   if (format >= PIPE_FORMAT_COUNT) {
+  return FALSE;
+   }
+
if (util_format_is_s3tc(format) && !util_format_s3tc_enabled) {
   return FALSE;
}
diff --git a/src/gallium/drivers/r600/r600_state_common.c 
b/src/gallium/drivers/r600/r600_state_common.c
index c1bce8304b..1515c28091 100644
--- a/src/gallium/drivers/r600/r600_state_common.c
+++ b/src/gallium/drivers/r600/r600_state_common.c
@@ -2284,6 +2284,8 @@ uint32_t r600_translate_texformat(struct pipe_screen *screen,
format = PIPE_FORMAT_A4R4_UNORM;
 
desc = util_format_description(format);
+   if (!desc)
+   goto out_unknown;
 
/* Depth and stencil swizzling is handled separately. */
if (desc->colorspace != UTIL_FORMAT_COLORSPACE_ZS) {
@@ -2650,6 +2652,8 @@ uint32_t r600_translate_colorformat(enum chip_class chip, enum pipe_format forma
const struct util_format_description *desc = util_format_description(format);
int channel = util_format_get_first_non_void_channel(format);
bool is_float;
+   if (!desc)
+   return ~0U;
 
 #define HAS_SIZE(x,y,z,w) \
(desc->channel[0].size == (x) && desc->channel[1].size == (y) && \
diff --git a/src/gallium/drivers/radeonsi/si_state.c 
b/src/gallium/drivers/radeonsi/si_state.c
index ee070107fd..f7ee24bdc6 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -1292,6 +1292,8 @@ static void si_emit_db_render_state(struct si_context *sctx, struct r600_atom *s
 static uint32_t si_translate_colorformat(enum pipe_format format)
 {
const struct util_format_description *desc = util_format_description(format);
+   if (!desc)
+   return V_028C70_COLOR_INVALID;
 
 #define HAS_SIZE(x,y,z,w) \
(desc->channel[0].size == (x) && desc->channel[1].size == (y) && \
@@ -1796,7 +1798,11 @@ static unsigned si_tex_dim(struct si_screen *sscreen, struct r600_texture *rtex,
 
 static bool si_is_sampler_format_supported(struct pipe_screen *screen, enum pipe_format format)
 {
-   return si_translate_texformat(screen, format, util_format_description(format),
+   const struct util_format_description *desc = util_format_description(format);
+   if (!desc)
+   return false;
+
+   return si_translate_texformat(screen, format, desc,
  util_format_get_first_non_void_channel(format)) != ~0U;
 }
 
@@ -1925,6 +1931,8 @@ static unsigned si_is_vertex_format_supported(struct pipe_screen *screen,
  PIPE_BIND_VERTEX_BUFFER)) == 0);
 
desc = util_format_description(format);
+   if (!desc)
+   return 0;
 
/* There are no native 8_8_8 or 16_16_16 data formats, and we currently
 * select 8_8_8_8 and 16_16_16_16 instead. This works reasonably well
-- 
2.14.1


Re: [Mesa-dev] [PATCH] (UNTESTED) virgl: filter out 2D constant file accesses and declarations

2017-09-12 Thread Nicolai Hähnle

FWIW, this patch should be a no-op without the offending "tgsi/ureg: 
always emit constants (and their decls) as 2D" commit.
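
The filter in the patch quoted below only fires for a constant-file register that carries a dimension, addresses it directly, and uses dimension index 0. A minimal sketch of that predicate follows. The `src_reg` struct and enum are hypothetical simplifications of tgsi_full_src_register, and the sketch assumes (as the no-op remark above suggests) that buffer-0 constants carried no dimension before the ureg commit.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, flattened view of a TGSI source register. */
enum reg_file { FILE_TEMPORARY, FILE_CONSTANT };

struct src_reg {
    enum reg_file file;
    bool has_dimension; /* true for 2D addressing such as CONST[0][3] */
    bool dim_indirect;  /* dimension index comes from an address register */
    int  dim_index;     /* which constant buffer */
    int  index;         /* slot inside that buffer */
};

/* Turn CONST[0][n] back into CONST[n]; leave everything else untouched. */
static void drop_trivial_dimension(struct src_reg *src)
{
    if (src->file == FILE_CONSTANT &&
        src->has_dimension &&
        !src->dim_indirect &&
        src->dim_index == 0)
        src->has_dimension = false;
}
```

A register that never had a dimension passes through unchanged, which is why the transform does nothing when constants are emitted as 1D.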


On 12.09.2017 22:34, Nicolai Hähnle wrote:

From: Nicolai Hähnle 

Sorry for the mess.

I suspect something like this patch is needed. Is this sufficient to
fix the problem?

Cheers,
Nicolai
---
  src/gallium/drivers/virgl/virgl_tgsi.c | 24 
  1 file changed, 24 insertions(+)

diff --git a/src/gallium/drivers/virgl/virgl_tgsi.c 
b/src/gallium/drivers/virgl/virgl_tgsi.c
index 7ad1cbdb886..4151e1d8450 100644
--- a/src/gallium/drivers/virgl/virgl_tgsi.c
+++ b/src/gallium/drivers/virgl/virgl_tgsi.c
@@ -42,36 +42,60 @@ virgl_tgsi_transform_property(struct tgsi_transform_context 
*ctx,
 case TGSI_PROPERTY_NUM_CULLDIST_ENABLED:
 case TGSI_PROPERTY_NEXT_SHADER:
break;
 default:
ctx->emit_property(ctx, prop);
break;
 }
  }
  
  static void

+virgl_tgsi_transform_declaration(struct tgsi_transform_context *ctx,
+ struct tgsi_full_declaration *decl)
+{
+   if (decl->Declaration.File == TGSI_FILE_CONSTANT &&
+   decl->Declaration.Dimension &&
+   decl->Dim.Index2D == 0)
+  decl->Declaration.Dimension = 0;
+
+   ctx->emit_declaration(ctx, decl);
+}
+
+static void
  virgl_tgsi_transform_instruction(struct tgsi_transform_context *ctx,
 struct tgsi_full_instruction *inst)
  {
 if (inst->Instruction.Precise)
inst->Instruction.Precise = 0;
+
+   for (unsigned i = 0; i < inst->Instruction.NumSrcRegs; ++i) {
+  struct tgsi_full_src_register *src = &inst->Src[i];
+
+  if (src->Register.File == TGSI_FILE_CONSTANT &&
+  src->Register.Dimension &&
+  !src->Dimension.Indirect &&
+  src->Dimension.Index == 0)
+ src->Register.Dimension = 0;
+   }
+
 ctx->emit_instruction(ctx, inst);
  }
  
  struct tgsi_token *virgl_tgsi_transform(const struct tgsi_token *tokens_in)

  {
  
 struct virgl_transform_context transform;

 const uint newLen = tgsi_num_tokens(tokens_in);
 struct tgsi_token *new_tokens;
  
 new_tokens = tgsi_alloc_tokens(newLen);

 if (!new_tokens)
return NULL;
  
 memset(&transform, 0, sizeof(transform));

 transform.base.transform_property = virgl_tgsi_transform_property;
 transform.base.transform_instruction = virgl_tgsi_transform_instruction;
+   transform.base.transform_declaration = virgl_tgsi_transform_declaration;
 tgsi_transform_shader(tokens_in, new_tokens, newLen, &transform.base);
  
 return new_tokens;

  }




--
Learn how the world really is,
but never forget how it should be.
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] (UNTESTED) virgl: filter out 2D constant file accesses and declarations

2017-09-12 Thread Nicolai Hähnle

From: Nicolai Hähnle 

Sorry for the mess.

I suspect something like this patch is needed. Is this sufficient to
fix the problem?

Cheers,
Nicolai
---
 src/gallium/drivers/virgl/virgl_tgsi.c | 24 
 1 file changed, 24 insertions(+)

diff --git a/src/gallium/drivers/virgl/virgl_tgsi.c 
b/src/gallium/drivers/virgl/virgl_tgsi.c
index 7ad1cbdb886..4151e1d8450 100644
--- a/src/gallium/drivers/virgl/virgl_tgsi.c
+++ b/src/gallium/drivers/virgl/virgl_tgsi.c
@@ -42,36 +42,60 @@ virgl_tgsi_transform_property(struct tgsi_transform_context 
*ctx,
case TGSI_PROPERTY_NUM_CULLDIST_ENABLED:
case TGSI_PROPERTY_NEXT_SHADER:
   break;
default:
   ctx->emit_property(ctx, prop);
   break;
}
 }
 
 static void
+virgl_tgsi_transform_declaration(struct tgsi_transform_context *ctx,
+ struct tgsi_full_declaration *decl)
+{
+   if (decl->Declaration.File == TGSI_FILE_CONSTANT &&
+   decl->Declaration.Dimension &&
+   decl->Dim.Index2D == 0)
+  decl->Declaration.Dimension = 0;
+
+   ctx->emit_declaration(ctx, decl);
+}
+
+static void
 virgl_tgsi_transform_instruction(struct tgsi_transform_context *ctx,
 struct tgsi_full_instruction *inst)
 {
if (inst->Instruction.Precise)
   inst->Instruction.Precise = 0;
+
+   for (unsigned i = 0; i < inst->Instruction.NumSrcRegs; ++i) {
+  struct tgsi_full_src_register *src = &inst->Src[i];
+
+  if (src->Register.File == TGSI_FILE_CONSTANT &&
+  src->Register.Dimension &&
+  !src->Dimension.Indirect &&
+  src->Dimension.Index == 0)
+ src->Register.Dimension = 0;
+   }
+
ctx->emit_instruction(ctx, inst);
 }
 
 struct tgsi_token *virgl_tgsi_transform(const struct tgsi_token *tokens_in)
 {
 
struct virgl_transform_context transform;
const uint newLen = tgsi_num_tokens(tokens_in);
struct tgsi_token *new_tokens;
 
new_tokens = tgsi_alloc_tokens(newLen);
if (!new_tokens)
   return NULL;
 
memset(&transform, 0, sizeof(transform));
transform.base.transform_property = virgl_tgsi_transform_property;
transform.base.transform_instruction = virgl_tgsi_transform_instruction;
+   transform.base.transform_declaration = virgl_tgsi_transform_declaration;
tgsi_transform_shader(tokens_in, new_tokens, newLen, &transform.base);
 
return new_tokens;
 }
-- 
2.11.0


Re: [Mesa-dev] [Intel-gfx] [PATCH 1/2] drm/i915/kbl: Remove unused Kabylake pci ids

2017-09-12 Thread Paulo Zanoni

On Mon, 2017-09-11 at 10:10 -0700, Rodrigo Vivi wrote:
> On Mon, Sep 11, 2017 at 04:11:33PM +, Anuj Phogat wrote:
> > See Mesa commits: ebc5ccf and b2dae9f
> 
> I believe we need to be in sync between multiple gfx stack
> components,
> but I  don't believe we should remove ids.
> 
> In the past we had cases where we noticed a product group using a
> listed
> id to do a product and we just noticed the id after a user reported
> at fd.o.

On the other hand, don't we run the risk that someone is going to see
that these IDs are unused for KBL and then repurpose them for some
future non-KBL product?

> 
> For us in kernel the cycle until that id gets into a stable release
> propagated to OSVs distros can be a bit long.
> 
> Also Xserver ids are nowadays in sync with Mesa ones and I believe
> some
> OSVs might take a while to upgrade the Xserver as well in case of a
> new
> found product with some "new" id.
> 
> For this reason I was always in favor of adding all possible reserved
> ids from the
> beginning.
> 
> And this approach worked well on BDW and SKL, where we later saw some
> reserved ids become real products and we didn't have to take any extra
> step.
> 
> For this same reason I believe the right solution is to
> add those ids back to mesa instead of removing from kernel and
> libdrm.
> 
> Thanks,
> Rodrigo.
> 
> > 
> > Cc: Matt Turner 
> > Cc: Rodrigo Vivi 
> > Signed-off-by: Anuj Phogat 
> > ---
> >  drivers/gpu/drm/i915/i915_pci.c |  1 -
> >  include/drm/i915_pciids.h   | 15 ++-
> >  2 files changed, 2 insertions(+), 14 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_pci.c
> > b/drivers/gpu/drm/i915/i915_pci.c
> > index 129877b..ecf6d4c 100644
> > --- a/drivers/gpu/drm/i915/i915_pci.c
> > +++ b/drivers/gpu/drm/i915/i915_pci.c
> > @@ -613,7 +613,6 @@ static const struct pci_device_id pciidlist[] =
> > {
> >     INTEL_KBL_GT1_IDS(&intel_kabylake_gt1_info),
> >     INTEL_KBL_GT2_IDS(&intel_kabylake_gt2_info),
> >     INTEL_KBL_GT3_IDS(&intel_kabylake_gt3_info),
> > -   INTEL_KBL_GT4_IDS(&intel_kabylake_gt3_info),
> >     INTEL_CFL_S_GT1_IDS(&intel_coffeelake_gt1_info),
> >     INTEL_CFL_S_GT2_IDS(&intel_coffeelake_gt2_info),
> >     INTEL_CFL_H_GT2_IDS(&intel_coffeelake_gt2_info),
> > diff --git a/include/drm/i915_pciids.h b/include/drm/i915_pciids.h
> > index 1257e15..a1bf90e 100644
> > --- a/include/drm/i915_pciids.h
> > +++ b/include/drm/i915_pciids.h
> > @@ -337,15 +337,10 @@
> >     INTEL_VGA_DEVICE(0x3185, info)
> >  
> >  #define INTEL_KBL_GT1_IDS(info)\
> > -   INTEL_VGA_DEVICE(0x5913, info), /* ULT GT1.5 */ \
> > -   INTEL_VGA_DEVICE(0x5915, info), /* ULX GT1.5 */ \
> >     INTEL_VGA_DEVICE(0x5917, info), /* DT  GT1.5 */ \
> >     INTEL_VGA_DEVICE(0x5906, info), /* ULT GT1 */ \
> > -   INTEL_VGA_DEVICE(0x590E, info), /* ULX GT1 */ \
> >     INTEL_VGA_DEVICE(0x5902, info), /* DT  GT1 */ \
> > -   INTEL_VGA_DEVICE(0x5908, info), /* Halo GT1 */ \
> > -   INTEL_VGA_DEVICE(0x590B, info), /* Halo GT1 */ \
> > -   INTEL_VGA_DEVICE(0x590A, info) /* SRV GT1 */
> > +   INTEL_VGA_DEVICE(0x590B, info)  /* Halo GT1 */
> >  
> >  #define INTEL_KBL_GT2_IDS(info)\
> >     INTEL_VGA_DEVICE(0x5916, info), /* ULT GT2 */ \
> > @@ -353,22 +348,16 @@
> >     INTEL_VGA_DEVICE(0x591E, info), /* ULX GT2 */ \
> >     INTEL_VGA_DEVICE(0x5912, info), /* DT  GT2 */ \
> >     INTEL_VGA_DEVICE(0x591B, info), /* Halo GT2 */ \
> > -   INTEL_VGA_DEVICE(0x591A, info), /* SRV GT2 */ \
> >     INTEL_VGA_DEVICE(0x591D, info) /* WKS GT2 */
> >  
> >  #define INTEL_KBL_GT3_IDS(info) \
> > -   INTEL_VGA_DEVICE(0x5923, info), /* ULT GT3 */ \
> >     INTEL_VGA_DEVICE(0x5926, info), /* ULT GT3 */ \
> >     INTEL_VGA_DEVICE(0x5927, info) /* ULT GT3 */
> >  
> > -#define INTEL_KBL_GT4_IDS(info) \
> > -   INTEL_VGA_DEVICE(0x593B, info) /* Halo GT4 */
> > -
> >  #define INTEL_KBL_IDS(info) \
> >     INTEL_KBL_GT1_IDS(info), \
> >     INTEL_KBL_GT2_IDS(info), \
> > -   INTEL_KBL_GT3_IDS(info), \
> > -   INTEL_KBL_GT4_IDS(info)
> > +   INTEL_KBL_GT3_IDS(info)
> >  
> >  /* CFL S */
> >  #define INTEL_CFL_S_GT1_IDS(info) \
> > -- 
> > 2.9.4
> > 
> > ___
> > Intel-gfx mailing list
> > intel-...@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/intel-gfx
> 

Re: [Mesa-dev] [PATCH v2 02/12] tgsi/ureg: always emit constants (and their decls) as 2D

2017-09-12 Thread Rob Herring

On Mon, Aug 28, 2017 at 3:58 AM, Nicolai Hähnle  wrote:
> From: Nicolai Hähnle 
>
> Acked-by: Roland Scheidegger 
> Tested-by: Dieter Nützel 
> ---
>  src/gallium/auxiliary/tgsi/tgsi_ureg.c | 22 +++---
>  1 file changed, 7 insertions(+), 15 deletions(-)

I bisected a virgl breakage on Android to this commit (I don't know
about other drivers). The symptom is that I just get a black screen
with no signs of any errors.

Rob

Re: [Mesa-dev] [PATCH 05/15] radv: dump the active shaders when a hang occured

2017-09-12 Thread Samuel Pitoiset




On 09/12/2017 09:36 PM, Bas Nieuwenhuizen wrote:

On Tue, Sep 12, 2017 at 9:30 PM, Samuel Pitoiset
 wrote:



On 09/12/2017 09:16 PM, Bas Nieuwenhuizen wrote:


On Tue, Sep 12, 2017 at 9:13 PM, Samuel Pitoiset
 wrote:




On 09/12/2017 09:07 PM, Bas Nieuwenhuizen wrote:



On Tue, Sep 12, 2017 at 8:57 PM, Samuel Pitoiset
 wrote:





On 09/12/2017 08:12 PM, Bas Nieuwenhuizen wrote:




On Tue, Sep 12, 2017 at 12:35 PM, Samuel Pitoiset
 wrote:




Only the disassembly is currently dumped.

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_debug.c | 78
++---
 1 file changed, 73 insertions(+), 5 deletions(-)

diff --git a/src/amd/vulkan/radv_debug.c
b/src/amd/vulkan/radv_debug.c
index 0dc2d3a22b..fe9d9cfdba 100644
--- a/src/amd/vulkan/radv_debug.c
+++ b/src/amd/vulkan/radv_debug.c
@@ -76,13 +76,62 @@ radv_dump_trace(struct radv_device *device,
struct
radeon_winsys_cs *cs)
fclose(f);
 }

+static void
+radv_dump_shader(struct radv_pipeline *pipeline, gl_shader_stage
stage,
FILE *f)
+{
+   struct radv_shader_variant *shader =
pipeline->shaders[stage];
+
+   if (!shader)
+   return;
+
+   fprintf(f, "%s:\n%s\n\n", radv_get_shader_name(shader,
stage),
+   shader->disasm_string);
+}
+
+static void
+radv_dump_shaders(struct radv_pipeline *pipeline, FILE *f)
+{
+   unsigned mask;
+
+   mask = pipeline->active_stages;
+   while (mask) {
+   int stage = u_bit_scan(&mask);
+
+   radv_dump_shader(pipeline, stage, f);
+   }
+
+   radv_dump_shader(pipeline, MESA_SHADER_COMPUTE, f);
+}
+
+static void
+radv_dump_state(struct radv_pipeline *pipeline, FILE *f)
+{
+   if (!pipeline)
+   return;
+
+   radv_dump_shaders(pipeline, f);
+}
+
+static struct radv_pipeline *
+radv_get_saved_graphics_pipeline(struct radv_device *device)
+{
+   uint64_t *ptr = (uint64_t *)device->trace_id_ptr;
+
+   return (struct radv_pipeline *)ptr[1];
+}
+
+static struct radv_pipeline *
+radv_get_saved_compute_pipeline(struct radv_device *device)
+{
+   uint64_t *ptr = (uint64_t *)device->trace_id_ptr;
+
+   return (struct radv_pipeline *)ptr[2];
+}
+
 static bool
-radv_gpu_hang_occured(struct radv_queue *queue)
+radv_gpu_hang_occured(struct radv_queue *queue, enum ring_type ring)
 {
struct radeon_winsys *ws = queue->device->ws;
-   enum ring_type ring;
-
-   ring = radv_queue_family_to_ring(queue->queue_family_index);

if (!ws->ctx_wait_idle(queue->hw_ctx, ring,
queue->queue_idx))
return true;
@@ -93,10 +142,14 @@ radv_gpu_hang_occured(struct radv_queue *queue)
 void
 radv_check_gpu_hangs(struct radv_queue *queue, struct
radeon_winsys_cs
*cs)
 {
+   struct radv_pipeline *graphics_pipeline, *compute_pipeline;
struct radv_device *device = queue->device;
+   enum ring_type ring;
uint64_t addr;

-   bool hang_occurred = radv_gpu_hang_occured(queue);
+   ring = radv_queue_family_to_ring(queue->queue_family_index);
+
+   bool hang_occurred = radv_gpu_hang_occured(queue, ring);
bool vm_fault_occurred = false;
if (queue->device->instance->debug_flags &
RADV_DEBUG_VM_FAULTS)
vm_fault_occurred =
ac_vm_fault_occured(device->physical_device->rad_info.chip_class,
@@ -104,11 +157,26 @@ radv_check_gpu_hangs(struct radv_queue *queue,
struct radeon_winsys_cs *cs)
if (!hang_occurred && !vm_fault_occurred)
return;

+   graphics_pipeline = radv_get_saved_graphics_pipeline(device);
+   compute_pipeline = radv_get_saved_compute_pipeline(device);
+
if (vm_fault_occurred) {
fprintf(stderr, "VM fault report.\n\n");
fprintf(stderr, "Failing VM page:
0x%08"PRIx64"\n\n",
addr);
}

+   switch (ring) {
+   case RING_GFX:
+   radv_dump_state(graphics_pipeline, stderr);





You may also need to dump the compute shader if set, as we can do
compute dispatches from the gfx ring.





The compute shader (if present) is already dumped in
radv_dump_shaders()
which is similar for all rings.




That dumps the compute shader of the graphics pipeline though? (which
will always be NULL). You need the compute shader of the compute
pipeline, even in the gfx ring.




Ah, pipeline->shaders[MESA_SHADER_COMPUTE] is always NULL in the gfx
ring? I didn't notice that.



It's not the ring that matters, it's the pipeline. We have gfx pipelines
and compute pipelines. The gfx pipelines will never have a compute
shader, and the compute pipelines only a compute shader. The gfx bind
point of the command buffer contains a gfx pipeline and the compute
bind point a compute pipeline, note that these can be

Re: [Mesa-dev] [PATCH 05/15] radv: dump the active shaders when a hang occured

2017-09-12 Thread Bas Nieuwenhuizen

On Tue, Sep 12, 2017 at 9:30 PM, Samuel Pitoiset
 wrote:
>
>
> On 09/12/2017 09:16 PM, Bas Nieuwenhuizen wrote:
>>
>> On Tue, Sep 12, 2017 at 9:13 PM, Samuel Pitoiset
>>  wrote:
>>>
>>>
>>>
>>> On 09/12/2017 09:07 PM, Bas Nieuwenhuizen wrote:


 On Tue, Sep 12, 2017 at 8:57 PM, Samuel Pitoiset
  wrote:
>
>
>
>
> On 09/12/2017 08:12 PM, Bas Nieuwenhuizen wrote:
>>
>>
>>
>> On Tue, Sep 12, 2017 at 12:35 PM, Samuel Pitoiset
>>  wrote:
>>>
>>>
>>>
>>> Only the disassembly is currently dumped.
>>>
>>> Signed-off-by: Samuel Pitoiset 
>>> ---
>>> src/amd/vulkan/radv_debug.c | 78
>>> ++---
>>> 1 file changed, 73 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/src/amd/vulkan/radv_debug.c
>>> b/src/amd/vulkan/radv_debug.c
>>> index 0dc2d3a22b..fe9d9cfdba 100644
>>> --- a/src/amd/vulkan/radv_debug.c
>>> +++ b/src/amd/vulkan/radv_debug.c
>>> @@ -76,13 +76,62 @@ radv_dump_trace(struct radv_device *device,
>>> struct
>>> radeon_winsys_cs *cs)
>>>fclose(f);
>>> }
>>>
>>> +static void
>>> +radv_dump_shader(struct radv_pipeline *pipeline, gl_shader_stage
>>> stage,
>>> FILE *f)
>>> +{
>>> +   struct radv_shader_variant *shader =
>>> pipeline->shaders[stage];
>>> +
>>> +   if (!shader)
>>> +   return;
>>> +
>>> +   fprintf(f, "%s:\n%s\n\n", radv_get_shader_name(shader,
>>> stage),
>>> +   shader->disasm_string);
>>> +}
>>> +
>>> +static void
>>> +radv_dump_shaders(struct radv_pipeline *pipeline, FILE *f)
>>> +{
>>> +   unsigned mask;
>>> +
>>> +   mask = pipeline->active_stages;
>>> +   while (mask) {
>>> +   int stage = u_bit_scan(&mask);
>>> +
>>> +   radv_dump_shader(pipeline, stage, f);
>>> +   }
>>> +
>>> +   radv_dump_shader(pipeline, MESA_SHADER_COMPUTE, f);
>>> +}
>>> +
>>> +static void
>>> +radv_dump_state(struct radv_pipeline *pipeline, FILE *f)
>>> +{
>>> +   if (!pipeline)
>>> +   return;
>>> +
>>> +   radv_dump_shaders(pipeline, f);
>>> +}
>>> +
>>> +static struct radv_pipeline *
>>> +radv_get_saved_graphics_pipeline(struct radv_device *device)
>>> +{
>>> +   uint64_t *ptr = (uint64_t *)device->trace_id_ptr;
>>> +
>>> +   return (struct radv_pipeline *)ptr[1];
>>> +}
>>> +
>>> +static struct radv_pipeline *
>>> +radv_get_saved_compute_pipeline(struct radv_device *device)
>>> +{
>>> +   uint64_t *ptr = (uint64_t *)device->trace_id_ptr;
>>> +
>>> +   return (struct radv_pipeline *)ptr[2];
>>> +}
>>> +
>>> static bool
>>> -radv_gpu_hang_occured(struct radv_queue *queue)
>>> +radv_gpu_hang_occured(struct radv_queue *queue, enum ring_type ring)
>>> {
>>>struct radeon_winsys *ws = queue->device->ws;
>>> -   enum ring_type ring;
>>> -
>>> -   ring = radv_queue_family_to_ring(queue->queue_family_index);
>>>
>>>if (!ws->ctx_wait_idle(queue->hw_ctx, ring,
>>> queue->queue_idx))
>>>return true;
>>> @@ -93,10 +142,14 @@ radv_gpu_hang_occured(struct radv_queue *queue)
>>> void
>>> radv_check_gpu_hangs(struct radv_queue *queue, struct
>>> radeon_winsys_cs
>>> *cs)
>>> {
>>> +   struct radv_pipeline *graphics_pipeline, *compute_pipeline;
>>>struct radv_device *device = queue->device;
>>> +   enum ring_type ring;
>>>uint64_t addr;
>>>
>>> -   bool hang_occurred = radv_gpu_hang_occured(queue);
>>> +   ring = radv_queue_family_to_ring(queue->queue_family_index);
>>> +
>>> +   bool hang_occurred = radv_gpu_hang_occured(queue, ring);
>>>bool vm_fault_occurred = false;
>>>if (queue->device->instance->debug_flags &
>>> RADV_DEBUG_VM_FAULTS)
>>>vm_fault_occurred =
>>> ac_vm_fault_occured(device->physical_device->rad_info.chip_class,
>>> @@ -104,11 +157,26 @@ radv_check_gpu_hangs(struct radv_queue *queue,
>>> struct radeon_winsys_cs *cs)
>>>if (!hang_occurred && !vm_fault_occurred)
>>>return;
>>>
>>> +   graphics_pipeline = radv_get_saved_graphics_pipeline(device);
>>> +   compute_pipeline = radv_get_saved_compute_pipeline(device);
>>> +
>>>if (vm_fault_occurred) {
>>>fprintf(stderr, "VM fault report.\n\n");
>>>fprintf(stderr, "Failing VM page:

Re: [Mesa-dev] [PATCH 05/15] radv: dump the active shaders when a hang occured

2017-09-12 Thread Samuel Pitoiset




On 09/12/2017 09:16 PM, Bas Nieuwenhuizen wrote:

On Tue, Sep 12, 2017 at 9:13 PM, Samuel Pitoiset
 wrote:



On 09/12/2017 09:07 PM, Bas Nieuwenhuizen wrote:


On Tue, Sep 12, 2017 at 8:57 PM, Samuel Pitoiset
 wrote:




On 09/12/2017 08:12 PM, Bas Nieuwenhuizen wrote:



On Tue, Sep 12, 2017 at 12:35 PM, Samuel Pitoiset
 wrote:



Only the disassembly is currently dumped.

Signed-off-by: Samuel Pitoiset 
---
src/amd/vulkan/radv_debug.c | 78
++---
1 file changed, 73 insertions(+), 5 deletions(-)

diff --git a/src/amd/vulkan/radv_debug.c b/src/amd/vulkan/radv_debug.c
index 0dc2d3a22b..fe9d9cfdba 100644
--- a/src/amd/vulkan/radv_debug.c
+++ b/src/amd/vulkan/radv_debug.c
@@ -76,13 +76,62 @@ radv_dump_trace(struct radv_device *device, struct
radeon_winsys_cs *cs)
   fclose(f);
}

+static void
+radv_dump_shader(struct radv_pipeline *pipeline, gl_shader_stage
stage,
FILE *f)
+{
+   struct radv_shader_variant *shader = pipeline->shaders[stage];
+
+   if (!shader)
+   return;
+
+   fprintf(f, "%s:\n%s\n\n", radv_get_shader_name(shader, stage),
+   shader->disasm_string);
+}
+
+static void
+radv_dump_shaders(struct radv_pipeline *pipeline, FILE *f)
+{
+   unsigned mask;
+
+   mask = pipeline->active_stages;
+   while (mask) {
+   int stage = u_bit_scan(&mask);
+
+   radv_dump_shader(pipeline, stage, f);
+   }
+
+   radv_dump_shader(pipeline, MESA_SHADER_COMPUTE, f);
+}
+
+static void
+radv_dump_state(struct radv_pipeline *pipeline, FILE *f)
+{
+   if (!pipeline)
+   return;
+
+   radv_dump_shaders(pipeline, f);
+}
+
+static struct radv_pipeline *
+radv_get_saved_graphics_pipeline(struct radv_device *device)
+{
+   uint64_t *ptr = (uint64_t *)device->trace_id_ptr;
+
+   return (struct radv_pipeline *)ptr[1];
+}
+
+static struct radv_pipeline *
+radv_get_saved_compute_pipeline(struct radv_device *device)
+{
+   uint64_t *ptr = (uint64_t *)device->trace_id_ptr;
+
+   return (struct radv_pipeline *)ptr[2];
+}
+
static bool
-radv_gpu_hang_occured(struct radv_queue *queue)
+radv_gpu_hang_occured(struct radv_queue *queue, enum ring_type ring)
{
   struct radeon_winsys *ws = queue->device->ws;
-   enum ring_type ring;
-
-   ring = radv_queue_family_to_ring(queue->queue_family_index);

   if (!ws->ctx_wait_idle(queue->hw_ctx, ring,
queue->queue_idx))
   return true;
@@ -93,10 +142,14 @@ radv_gpu_hang_occured(struct radv_queue *queue)
void
radv_check_gpu_hangs(struct radv_queue *queue, struct
radeon_winsys_cs
*cs)
{
+   struct radv_pipeline *graphics_pipeline, *compute_pipeline;
   struct radv_device *device = queue->device;
+   enum ring_type ring;
   uint64_t addr;

-   bool hang_occurred = radv_gpu_hang_occured(queue);
+   ring = radv_queue_family_to_ring(queue->queue_family_index);
+
+   bool hang_occurred = radv_gpu_hang_occured(queue, ring);
   bool vm_fault_occurred = false;
   if (queue->device->instance->debug_flags &
RADV_DEBUG_VM_FAULTS)
   vm_fault_occurred =
ac_vm_fault_occured(device->physical_device->rad_info.chip_class,
@@ -104,11 +157,26 @@ radv_check_gpu_hangs(struct radv_queue *queue,
struct radeon_winsys_cs *cs)
   if (!hang_occurred && !vm_fault_occurred)
   return;

+   graphics_pipeline = radv_get_saved_graphics_pipeline(device);
+   compute_pipeline = radv_get_saved_compute_pipeline(device);
+
   if (vm_fault_occurred) {
   fprintf(stderr, "VM fault report.\n\n");
   fprintf(stderr, "Failing VM page: 0x%08"PRIx64"\n\n",
addr);
   }

+   switch (ring) {
+   case RING_GFX:
+   radv_dump_state(graphics_pipeline, stderr);




You may also need to dump the compute shader if set, as we can do
compute dispatches from the gfx ring.




The compute shader (if present) is already dumped in radv_dump_shaders()
which is similar for all rings.



That dumps the compute shader of the graphics pipeline though? (which
will always be NULL). You need the compute shader of the compute
pipeline, even in the gfx ring.



Ah, pipeline->shaders[MESA_SHADER_COMPUTE] is always NULL in the gfx
ring? I didn't notice that.


It's not the ring that matters, it's the pipeline. We have gfx pipelines
and compute pipelines. The gfx pipelines will never have a compute
shader, and the compute pipelines have only a compute shader. The gfx
bind point of the command buffer contains a gfx pipeline and the compute
bind point a compute pipeline; these can be different pipelines. On the
gfx queue both bind points exist, on the compute queue only the compute
bind point.
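
The distinction between ring and pipeline can be sketched as follows. This is a minimal stand-alone model, not radv code: `struct pipeline`, `dump_pipeline` and `dump_state_for_ring` are hypothetical names, and the real `radv_pipeline` carries far more state. It assumes, as described above, that the gfx ring may have both a graphics and a compute pipeline bound while the compute ring has only a compute pipeline.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Simplified stand-ins for the driver types (assumption, not radv API). */
enum ring_type { RING_GFX, RING_COMPUTE };

struct pipeline {
   const char *name;
};

/* Print one pipeline if it is bound; returns how many were printed. */
static int dump_pipeline(const struct pipeline *p, FILE *f)
{
   if (!p)
      return 0;
   fprintf(f, "%s\n", p->name);
   return 1;
}

/* On the gfx ring both bind points can be live, so both saved pipelines
 * are dumped; on the compute ring only the compute pipeline is. */
static int dump_state_for_ring(enum ring_type ring,
                               const struct pipeline *gfx,
                               const struct pipeline *compute,
                               FILE *f)
{
   int dumped = 0;

   switch (ring) {
   case RING_GFX:
      dumped += dump_pipeline(gfx, f);
      dumped += dump_pipeline(compute, f);
      break;
   case RING_COMPUTE:
      dumped += dump_pipeline(compute, f);
      break;
   }
   return dumped;
}
```

Passing NULL for a pipeline that was never bound is harmless here, which mirrors the NULL check in radv_dump_state() in the patch.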


Okay, so

Re: [Mesa-dev] [PATCH 05/15] radv: dump the active shaders when a hang occured

2017-09-12 Thread Bas Nieuwenhuizen

On Tue, Sep 12, 2017 at 9:13 PM, Samuel Pitoiset
 wrote:
>
>
> On 09/12/2017 09:07 PM, Bas Nieuwenhuizen wrote:
>>
>> On Tue, Sep 12, 2017 at 8:57 PM, Samuel Pitoiset
>>  wrote:
>>>
>>>
>>>
>>> On 09/12/2017 08:12 PM, Bas Nieuwenhuizen wrote:


 On Tue, Sep 12, 2017 at 12:35 PM, Samuel Pitoiset
  wrote:
>
>
> Only the disassembly is currently dumped.
>
> Signed-off-by: Samuel Pitoiset 
> ---
>src/amd/vulkan/radv_debug.c | 78
> ++---
>1 file changed, 73 insertions(+), 5 deletions(-)
>
> diff --git a/src/amd/vulkan/radv_debug.c b/src/amd/vulkan/radv_debug.c
> index 0dc2d3a22b..fe9d9cfdba 100644
> --- a/src/amd/vulkan/radv_debug.c
> +++ b/src/amd/vulkan/radv_debug.c
> @@ -76,13 +76,62 @@ radv_dump_trace(struct radv_device *device, struct
> radeon_winsys_cs *cs)
>   fclose(f);
>}
>
> +static void
> +radv_dump_shader(struct radv_pipeline *pipeline, gl_shader_stage
> stage,
> FILE *f)
> +{
> +   struct radv_shader_variant *shader = pipeline->shaders[stage];
> +
> +   if (!shader)
> +   return;
> +
> +   fprintf(f, "%s:\n%s\n\n", radv_get_shader_name(shader, stage),
> +   shader->disasm_string);
> +}
> +
> +static void
> +radv_dump_shaders(struct radv_pipeline *pipeline, FILE *f)
> +{
> +   unsigned mask;
> +
> +   mask = pipeline->active_stages;
> +   while (mask) {
> +   int stage = u_bit_scan(&mask);
> +
> +   radv_dump_shader(pipeline, stage, f);
> +   }
> +
> +   radv_dump_shader(pipeline, MESA_SHADER_COMPUTE, f);
> +}
> +
> +static void
> +radv_dump_state(struct radv_pipeline *pipeline, FILE *f)
> +{
> +   if (!pipeline)
> +   return;
> +
> +   radv_dump_shaders(pipeline, f);
> +}
> +
> +static struct radv_pipeline *
> +radv_get_saved_graphics_pipeline(struct radv_device *device)
> +{
> +   uint64_t *ptr = (uint64_t *)device->trace_id_ptr;
> +
> +   return (struct radv_pipeline *)ptr[1];
> +}
> +
> +static struct radv_pipeline *
> +radv_get_saved_compute_pipeline(struct radv_device *device)
> +{
> +   uint64_t *ptr = (uint64_t *)device->trace_id_ptr;
> +
> +   return (struct radv_pipeline *)ptr[2];
> +}
> +
>static bool
> -radv_gpu_hang_occured(struct radv_queue *queue)
> +radv_gpu_hang_occured(struct radv_queue *queue, enum ring_type ring)
>{
>   struct radeon_winsys *ws = queue->device->ws;
> -   enum ring_type ring;
> -
> -   ring = radv_queue_family_to_ring(queue->queue_family_index);
>
>   if (!ws->ctx_wait_idle(queue->hw_ctx, ring,
> queue->queue_idx))
>   return true;
> @@ -93,10 +142,14 @@ radv_gpu_hang_occured(struct radv_queue *queue)
>void
>radv_check_gpu_hangs(struct radv_queue *queue, struct
> radeon_winsys_cs
> *cs)
>{
> +   struct radv_pipeline *graphics_pipeline, *compute_pipeline;
>   struct radv_device *device = queue->device;
> +   enum ring_type ring;
>   uint64_t addr;
>
> -   bool hang_occurred = radv_gpu_hang_occured(queue);
> +   ring = radv_queue_family_to_ring(queue->queue_family_index);
> +
> +   bool hang_occurred = radv_gpu_hang_occured(queue, ring);
>   bool vm_fault_occurred = false;
>   if (queue->device->instance->debug_flags &
> RADV_DEBUG_VM_FAULTS)
>   vm_fault_occurred =
> ac_vm_fault_occured(device->physical_device->rad_info.chip_class,
> @@ -104,11 +157,26 @@ radv_check_gpu_hangs(struct radv_queue *queue,
> struct radeon_winsys_cs *cs)
>   if (!hang_occurred && !vm_fault_occurred)
>   return;
>
> +   graphics_pipeline = radv_get_saved_graphics_pipeline(device);
> +   compute_pipeline = radv_get_saved_compute_pipeline(device);
> +
>   if (vm_fault_occurred) {
>   fprintf(stderr, "VM fault report.\n\n");
>   fprintf(stderr, "Failing VM page: 0x%08"PRIx64"\n\n",
> addr);
>   }
>
> +   switch (ring) {
> +   case RING_GFX:
> +   radv_dump_state(graphics_pipeline, stderr);



 You may also need to dump the compute shader if set, as we can do
 compute dispatches from the gfx ring.
>>>
>>>
>>>
>>> The compute shader (if present) is already dumped in radv_dump_shaders()
>>> which is similar for all rings.
>>
>>
>> That dumps the compute shader of the graphics

Re: [Mesa-dev] [PATCH 05/15] radv: dump the active shaders when a hang occured

2017-09-12 Thread Samuel Pitoiset




On 09/12/2017 09:07 PM, Bas Nieuwenhuizen wrote:

On Tue, Sep 12, 2017 at 8:57 PM, Samuel Pitoiset
 wrote:



On 09/12/2017 08:12 PM, Bas Nieuwenhuizen wrote:


On Tue, Sep 12, 2017 at 12:35 PM, Samuel Pitoiset
 wrote:


Only the disassembly is currently dumped.

Signed-off-by: Samuel Pitoiset 
---
   src/amd/vulkan/radv_debug.c | 78
++---
   1 file changed, 73 insertions(+), 5 deletions(-)

diff --git a/src/amd/vulkan/radv_debug.c b/src/amd/vulkan/radv_debug.c
index 0dc2d3a22b..fe9d9cfdba 100644
--- a/src/amd/vulkan/radv_debug.c
+++ b/src/amd/vulkan/radv_debug.c
@@ -76,13 +76,62 @@ radv_dump_trace(struct radv_device *device, struct
radeon_winsys_cs *cs)
  fclose(f);
   }

+static void
+radv_dump_shader(struct radv_pipeline *pipeline, gl_shader_stage stage,
FILE *f)
+{
+   struct radv_shader_variant *shader = pipeline->shaders[stage];
+
+   if (!shader)
+   return;
+
+   fprintf(f, "%s:\n%s\n\n", radv_get_shader_name(shader, stage),
+   shader->disasm_string);
+}
+
+static void
+radv_dump_shaders(struct radv_pipeline *pipeline, FILE *f)
+{
+   unsigned mask;
+
+   mask = pipeline->active_stages;
+   while (mask) {
+   int stage = u_bit_scan(&mask);
+
+   radv_dump_shader(pipeline, stage, f);
+   }
+
+   radv_dump_shader(pipeline, MESA_SHADER_COMPUTE, f);
+}
+
+static void
+radv_dump_state(struct radv_pipeline *pipeline, FILE *f)
+{
+   if (!pipeline)
+   return;
+
+   radv_dump_shaders(pipeline, f);
+}
+
+static struct radv_pipeline *
+radv_get_saved_graphics_pipeline(struct radv_device *device)
+{
+   uint64_t *ptr = (uint64_t *)device->trace_id_ptr;
+
+   return (struct radv_pipeline *)ptr[1];
+}
+
+static struct radv_pipeline *
+radv_get_saved_compute_pipeline(struct radv_device *device)
+{
+   uint64_t *ptr = (uint64_t *)device->trace_id_ptr;
+
+   return (struct radv_pipeline *)ptr[2];
+}
+
   static bool
-radv_gpu_hang_occured(struct radv_queue *queue)
+radv_gpu_hang_occured(struct radv_queue *queue, enum ring_type ring)
   {
  struct radeon_winsys *ws = queue->device->ws;
-   enum ring_type ring;
-
-   ring = radv_queue_family_to_ring(queue->queue_family_index);

  if (!ws->ctx_wait_idle(queue->hw_ctx, ring, queue->queue_idx))
  return true;
@@ -93,10 +142,14 @@ radv_gpu_hang_occured(struct radv_queue *queue)
   void
   radv_check_gpu_hangs(struct radv_queue *queue, struct radeon_winsys_cs
*cs)
   {
+   struct radv_pipeline *graphics_pipeline, *compute_pipeline;
  struct radv_device *device = queue->device;
+   enum ring_type ring;
  uint64_t addr;

-   bool hang_occurred = radv_gpu_hang_occured(queue);
+   ring = radv_queue_family_to_ring(queue->queue_family_index);
+
+   bool hang_occurred = radv_gpu_hang_occured(queue, ring);
  bool vm_fault_occurred = false;
  if (queue->device->instance->debug_flags & RADV_DEBUG_VM_FAULTS)
  vm_fault_occurred =
ac_vm_fault_occured(device->physical_device->rad_info.chip_class,
@@ -104,11 +157,26 @@ radv_check_gpu_hangs(struct radv_queue *queue,
struct radeon_winsys_cs *cs)
  if (!hang_occurred && !vm_fault_occurred)
  return;

+   graphics_pipeline = radv_get_saved_graphics_pipeline(device);
+   compute_pipeline = radv_get_saved_compute_pipeline(device);
+
  if (vm_fault_occurred) {
  fprintf(stderr, "VM fault report.\n\n");
  fprintf(stderr, "Failing VM page: 0x%08"PRIx64"\n\n",
addr);
  }

+   switch (ring) {
+   case RING_GFX:
+   radv_dump_state(graphics_pipeline, stderr);



You may also need to dump the compute shader if set, as we can do
compute dispatches from the gfx ring.



The compute shader (if present) is already dumped in radv_dump_shaders()
which is similar for all rings.


That dumps the compute shader of the graphics pipeline though? (which
will always be NULL). You need the compute shader of the compute
pipeline, even in the gfx ring.


Ah, pipeline->shaders[MESA_SHADER_COMPUTE] is always NULL in the gfx
ring? I didn't notice that.










+   break;
+   case RING_COMPUTE:
+   radv_dump_state(compute_pipeline, stderr);
+   break;
+   default:
+   assert(0);
+   break;
+   }
+
  radv_dump_trace(queue->device, cs);
  abort();
   }
--
2.14.1
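
The stage loop in radv_dump_shaders() walks the set bits of active_stages. The pattern can be illustrated with a small stand-alone sketch; `bit_scan` below is a portable stand-in written for this example (Mesa's own u_bit_scan() helper is not reproduced here), and `collect_stages` is a hypothetical name.

```c
#include <assert.h>
#include <stdio.h>

/* Stand-in helper: return the index of the lowest set bit of *mask and
 * clear that bit. The caller must ensure *mask is non-zero. */
static int bit_scan(unsigned *mask)
{
   int i = 0;

   while (!(*mask & (1u << i)))
      i++;
   *mask &= ~(1u << i);
   return i;
}

/* Visit every set bit from lowest to highest, storing its index in out[];
 * returns the number of indices written. */
static int collect_stages(unsigned mask, int *out)
{
   int n = 0;

   while (mask)
      out[n++] = bit_scan(&mask);
   return n;
}
```

Because each call clears the bit it returns, the loop terminates after one iteration per active stage and never visits an inactive one.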


Re: [Mesa-dev] [RFC] NIR serialization

2017-09-12 Thread Nicolai Hähnle


On 12.09.2017 06:25, Ian Romanick wrote:

On 09/07/2017 04:26 PM, Jordan Justen wrote:

On 2017-09-06 14:12:41, Daniel Schürmann wrote:

Hello everyone!
Recently, we had a small discussion (off the list) about the NIR
serialization, which was previously discussed in [RFC] ARB_gl_spirv and
NIR backend for radeonsi.

As this topic could be interesting to more people, I would like to
share what has been discussed so far (you might want to read from the
bottom up).

TL;DR:
- NIR serialization is in demand for shader cache
- could be done either directly (NIR binary form) or via SPIR-V
- Ian et al. are working on GLSL IR -> SPIR-V transformation, which
could be adapted for a NIR -> SPIR-V pass
- in NIR representation, some type information is lost
- thus, a serialization via SPIR-V could NOT be a glslang alternative
(otoh, the GLSL IR->SPIR-V pass could), but only for spirv-opt (if the
output is valid SPIR-V)


Ian,

Tim was suggesting that we might look at serializing nir for the i965
shader cache. Based on this email, it sounds like serialized nir would
not be enough for the shader cache as some GLSL type info would be
lost. It sounds like GLSL IR => SPIR-V would be good enough. Is that
right?

I don't think we have a strict requirement for the GLSL IR => SPIR-V
path for GL 4.6, right? So, this is more of a 'nice-to-have'?


I think it's basically a requirement if we want to adequately test
SPIR-V in OpenGL.  The volume of tests for SPIR-V in the CTS is going to
be small compared to the number of tests for GLSL in the CTS and piglit.


The idea being a stand-alone linker & compiler that supports everything 
Mesa does, but also writes out cross-stage valid locations for uniforms 
and I/O?


The latter is really the biggest pain when working with glslang...

Cheers,
Nicolai



I'm not sure we'd want to make i965 shader cache depend on a
nice-to-have feature. (Unless we're pretty sure it'll be available
soon.)

But, it would be nice to not have to fallback to compiling the GLSL
for i965 shader cache, so it would be worth waiting a little bit to be
able to rely on a SPIR-V serialization of the GLSL IR.

What do you suggest?

-Jordan


- now, the question is if this is worth the additional effort

Kind regards,
Daniel

-------- Forwarded Message --------
Subject: Re: NIR serialization
Date: Tue, 5 Sep 2017 11:00:31 -0700
From: Ian Romanick
To: Daniel Schürmann, Nicolai Hähnle, Timothy Arceri



Sorry for taking so long to reply.  It was a long holiday weekend in the
US, and I was away.

On 09/01/2017 05:03 AM, Daniel Schürmann wrote:

A direct NIR binary serialization would also do the job (vc4/freedreno
was mentioned as well).
I only thought that SPIRV is preferable because
- deserialization for free
- cached shader size
- spirv-opt and glslang alternative

The term lossy doesn't make much sense to me with regard to
optimizations: aren't all optimizations lossy?


By lossy I mean there is a significant semantic change.  As soon as
GLSL IR is converted to NIR, Boolean types completely cease to exist.
They are replaced with integers that are either 0 or -1.  Similarly, all
matrix types cease to exist.  They are replaced by a set of vectors.
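
A small sketch may make the type loss concrete. It is not NIR or Mesa code; `lower_bool`, `lower_mat2` and `struct vec2` are names invented for this illustration, and the matrix is assumed to be stored column-major. After lowering, nothing in the result says that the integer was once a Boolean or that the two vectors were once a matrix.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical lowering: a Boolean becomes an integer, 0 for false and
 * -1 (all bits set) for true. */
static int32_t lower_bool(bool b)
{
   return b ? -1 : 0;
}

struct vec2 {
   float x, y;
};

/* Hypothetical lowering: a column-major 2x2 matrix becomes two
 * independent column vectors. */
static void lower_mat2(const float m[4], struct vec2 cols[2])
{
   cols[0].x = m[0];
   cols[0].y = m[1];
   cols[1].x = m[2];
   cols[1].y = m[3];
}
```

Recovering "bool" or "mat2" from these results needs side information kept elsewhere, which is the extra tracking mentioned for the on-disk cache.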

For the purpose of the on-disk cache, this probably doesn't matter.  It
does mean that additional information about, for example, types of
uniforms has to be tracked.  In a direct GLSL IR to SPIR-V translation,
type information is maintained, so the SPIR-V has all the necessary
information.

As a glslang replacement, maintaining type information is an absolute
requirement.  Users will use other tools to introspect the SPIR-V shader
to find locations of uniforms, shader inputs, offsets of values in UBOs,
etc.  If the types are changed in the SPIR-V shader that we emit, none
of that will work.  I plan to enable retrieval of portable SPIR-V both
from a Mesa driver and the standalone GLSL compiler.

Right now SPIR-V binaries will be quite large.  I have several ideas
that I plan to implement once we have OpenGL 4.6 done that should
dramatically reduce the size of SPIR-V... I'm actually hoping to present
that at FOSDEM.


The primary goal would be the lossless NIR-SPIRV-NIR round-trip.
Secondary, it would be desirable if we achieve valid SPIRV binaries
which preserve the semantics of the original shader.
The question is whether this is possible with the type information
that is available...

Ian: can you point me to your repository? I couldn't find it.


https://cgit.freedesktop.org/~idr/mesa/log/?h=emit-spirv


Kind regards,

Daniel


On 09/01/2017 12:16 PM, Nicolai Hähnle wrote:

In addition to using NIR-based optimizations, I believe Timothy
mentioned that a method for serializing NIR would help the shader disk
cache of i965. It would certainly help radeonsi if/when we switch to
the NIR backend, because we could compile new shader variants without
falling

Re: [Mesa-dev] [PATCH 05/15] radv: dump the active shaders when a hang occured

2017-09-12 Thread Bas Nieuwenhuizen

On Tue, Sep 12, 2017 at 8:57 PM, Samuel Pitoiset
 wrote:
>
>
> On 09/12/2017 08:12 PM, Bas Nieuwenhuizen wrote:
>>
>> On Tue, Sep 12, 2017 at 12:35 PM, Samuel Pitoiset
>>  wrote:
>>>
>>> Only the disassembly is currently dumped.
>>>
>>> Signed-off-by: Samuel Pitoiset 
>>> ---
>>>   src/amd/vulkan/radv_debug.c | 78
>>> ++---
>>>   1 file changed, 73 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/src/amd/vulkan/radv_debug.c b/src/amd/vulkan/radv_debug.c
>>> index 0dc2d3a22b..fe9d9cfdba 100644
>>> --- a/src/amd/vulkan/radv_debug.c
>>> +++ b/src/amd/vulkan/radv_debug.c
>>> @@ -76,13 +76,62 @@ radv_dump_trace(struct radv_device *device, struct
>>> radeon_winsys_cs *cs)
>>>  fclose(f);
>>>   }
>>>
>>> +static void
>>> +radv_dump_shader(struct radv_pipeline *pipeline, gl_shader_stage stage,
>>> FILE *f)
>>> +{
>>> +   struct radv_shader_variant *shader = pipeline->shaders[stage];
>>> +
>>> +   if (!shader)
>>> +   return;
>>> +
>>> +   fprintf(f, "%s:\n%s\n\n", radv_get_shader_name(shader, stage),
>>> +   shader->disasm_string);
>>> +}
>>> +
>>> +static void
>>> +radv_dump_shaders(struct radv_pipeline *pipeline, FILE *f)
>>> +{
>>> +   unsigned mask;
>>> +
>>> +   mask = pipeline->active_stages;
>>> +   while (mask) {
>>> +   int stage = u_bit_scan(&mask);
>>> +
>>> +   radv_dump_shader(pipeline, stage, f);
>>> +   }
>>> +
>>> +   radv_dump_shader(pipeline, MESA_SHADER_COMPUTE, f);
>>> +}
>>> +
>>> +static void
>>> +radv_dump_state(struct radv_pipeline *pipeline, FILE *f)
>>> +{
>>> +   if (!pipeline)
>>> +   return;
>>> +
>>> +   radv_dump_shaders(pipeline, f);
>>> +}
>>> +
>>> +static struct radv_pipeline *
>>> +radv_get_saved_graphics_pipeline(struct radv_device *device)
>>> +{
>>> +   uint64_t *ptr = (uint64_t *)device->trace_id_ptr;
>>> +
>>> +   return (struct radv_pipeline *)ptr[1];
>>> +}
>>> +
>>> +static struct radv_pipeline *
>>> +radv_get_saved_compute_pipeline(struct radv_device *device)
>>> +{
>>> +   uint64_t *ptr = (uint64_t *)device->trace_id_ptr;
>>> +
>>> +   return (struct radv_pipeline *)ptr[2];
>>> +}
>>> +
>>>   static bool
>>> -radv_gpu_hang_occured(struct radv_queue *queue)
>>> +radv_gpu_hang_occured(struct radv_queue *queue, enum ring_type ring)
>>>   {
>>>  struct radeon_winsys *ws = queue->device->ws;
>>> -   enum ring_type ring;
>>> -
>>> -   ring = radv_queue_family_to_ring(queue->queue_family_index);
>>>
>>>  if (!ws->ctx_wait_idle(queue->hw_ctx, ring, queue->queue_idx))
>>>  return true;
>>> @@ -93,10 +142,14 @@ radv_gpu_hang_occured(struct radv_queue *queue)
>>>   void
>>>   radv_check_gpu_hangs(struct radv_queue *queue, struct radeon_winsys_cs
>>> *cs)
>>>   {
>>> +   struct radv_pipeline *graphics_pipeline, *compute_pipeline;
>>>  struct radv_device *device = queue->device;
>>> +   enum ring_type ring;
>>>  uint64_t addr;
>>>
>>> -   bool hang_occurred = radv_gpu_hang_occured(queue);
>>> +   ring = radv_queue_family_to_ring(queue->queue_family_index);
>>> +
>>> +   bool hang_occurred = radv_gpu_hang_occured(queue, ring);
>>>  bool vm_fault_occurred = false;
>>>  if (queue->device->instance->debug_flags & RADV_DEBUG_VM_FAULTS)
>>>  vm_fault_occurred =
>>> ac_vm_fault_occured(device->physical_device->rad_info.chip_class,
>>> @@ -104,11 +157,26 @@ radv_check_gpu_hangs(struct radv_queue *queue,
>>> struct radeon_winsys_cs *cs)
>>>  if (!hang_occurred && !vm_fault_occurred)
>>>  return;
>>>
>>> +   graphics_pipeline = radv_get_saved_graphics_pipeline(device);
>>> +   compute_pipeline = radv_get_saved_compute_pipeline(device);
>>> +
>>>  if (vm_fault_occurred) {
>>>  fprintf(stderr, "VM fault report.\n\n");
>>>  fprintf(stderr, "Failing VM page: 0x%08"PRIx64"\n\n",
>>> addr);
>>>  }
>>>
>>> +   switch (ring) {
>>> +   case RING_GFX:
>>> +   radv_dump_state(graphics_pipeline, stderr);
>>
>>
>> You may also need to dump the compute shader if set, as we can do
>> compute dispatches from the gfx ring.
>
>
> The compute shader (if present) is already dumped in radv_dump_shaders()
> which is similar for all rings.

That dumps the compute shader of the graphics pipeline though? (which
will always be NULL). You need the compute shader of the compute
pipeline, even in the gfx ring.
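
A minimal sketch of the point made above, using hypothetical stand-in types (`struct pipeline`, `dump_state` and `dump_for_ring` are illustrative names, not radv code): on the gfx ring both saved pipelines get dumped, since compute dispatches can be submitted there as well, while on the compute ring only the compute pipeline is dumped.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for the radv types; only what the sketch needs. */
enum ring_type { RING_GFX, RING_COMPUTE };

struct pipeline {
    const char *name; /* placeholder for the real shader disassembly */
};

/* Dump one pipeline if it is set; returns how many pipelines were dumped. */
int dump_state(const struct pipeline *pipeline, FILE *f)
{
    if (!pipeline)
        return 0;

    fprintf(f, "%s\n", pipeline->name);
    return 1;
}

/* Dump whatever may have been running on the given ring. */
int dump_for_ring(enum ring_type ring, const struct pipeline *graphics,
                  const struct pipeline *compute, FILE *f)
{
    int dumped = 0;

    assert(f != NULL);

    switch (ring) {
    case RING_GFX:
        /* Compute dispatches can run on the gfx ring too, so dump both. */
        dumped += dump_state(graphics, f);
        dumped += dump_state(compute, f);
        break;
    case RING_COMPUTE:
        dumped += dump_state(compute, f);
        break;
    }

    return dumped;
}
```

Calling `dump_for_ring(RING_GFX, graphics, compute, stderr)` with both pointers set prints both entries; a NULL pipeline is simply skipped, as in the patch's `radv_dump_state()`.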

>
>
>>
>>> +   break;
>>> +   case RING_COMPUTE:
>>> +   radv_dump_state(compute_pipeline, stderr);
>>> +   break;
>>> +   default:
>>> +   assert(0);
>>> +   break;
>>> +   }
>>> +
>>>  radv_dump_trace(queue->device, cs);
>>>  abort();
>>>   }
>>> --
>>> 2.14.1
>>>
>>>

Re: [Mesa-dev] [PATCH 05/15] radv: dump the active shaders when a hang occured

2017-09-12 Thread Samuel Pitoiset




On 09/12/2017 08:12 PM, Bas Nieuwenhuizen wrote:

On Tue, Sep 12, 2017 at 12:35 PM, Samuel Pitoiset
 wrote:

Only the disassembly is currently dumped.

Signed-off-by: Samuel Pitoiset 
---
  src/amd/vulkan/radv_debug.c | 78 ++---
  1 file changed, 73 insertions(+), 5 deletions(-)

diff --git a/src/amd/vulkan/radv_debug.c b/src/amd/vulkan/radv_debug.c
index 0dc2d3a22b..fe9d9cfdba 100644
--- a/src/amd/vulkan/radv_debug.c
+++ b/src/amd/vulkan/radv_debug.c
@@ -76,13 +76,62 @@ radv_dump_trace(struct radv_device *device, struct 
radeon_winsys_cs *cs)
 fclose(f);
  }

+static void
+radv_dump_shader(struct radv_pipeline *pipeline, gl_shader_stage stage, FILE 
*f)
+{
+   struct radv_shader_variant *shader = pipeline->shaders[stage];
+
+   if (!shader)
+   return;
+
+   fprintf(f, "%s:\n%s\n\n", radv_get_shader_name(shader, stage),
+   shader->disasm_string);
+}
+
+static void
+radv_dump_shaders(struct radv_pipeline *pipeline, FILE *f)
+{
+   unsigned mask;
+
+   mask = pipeline->active_stages;
+   while (mask) {
+   int stage = u_bit_scan(&mask);
+
+   radv_dump_shader(pipeline, stage, f);
+   }
+
+   radv_dump_shader(pipeline, MESA_SHADER_COMPUTE, f);
+}
+
+static void
+radv_dump_state(struct radv_pipeline *pipeline, FILE *f)
+{
+   if (!pipeline)
+   return;
+
+   radv_dump_shaders(pipeline, f);
+}
+
+static struct radv_pipeline *
+radv_get_saved_graphics_pipeline(struct radv_device *device)
+{
+   uint64_t *ptr = (uint64_t *)device->trace_id_ptr;
+
+   return (struct radv_pipeline *)ptr[1];
+}
+
+static struct radv_pipeline *
+radv_get_saved_compute_pipeline(struct radv_device *device)
+{
+   uint64_t *ptr = (uint64_t *)device->trace_id_ptr;
+
+   return (struct radv_pipeline *)ptr[2];
+}
+
  static bool
-radv_gpu_hang_occured(struct radv_queue *queue)
+radv_gpu_hang_occured(struct radv_queue *queue, enum ring_type ring)
  {
 struct radeon_winsys *ws = queue->device->ws;
-   enum ring_type ring;
-
-   ring = radv_queue_family_to_ring(queue->queue_family_index);

 if (!ws->ctx_wait_idle(queue->hw_ctx, ring, queue->queue_idx))
 return true;
@@ -93,10 +142,14 @@ radv_gpu_hang_occured(struct radv_queue *queue)
  void
  radv_check_gpu_hangs(struct radv_queue *queue, struct radeon_winsys_cs *cs)
  {
+   struct radv_pipeline *graphics_pipeline, *compute_pipeline;
 struct radv_device *device = queue->device;
+   enum ring_type ring;
 uint64_t addr;

-   bool hang_occurred = radv_gpu_hang_occured(queue);
+   ring = radv_queue_family_to_ring(queue->queue_family_index);
+
+   bool hang_occurred = radv_gpu_hang_occured(queue, ring);
 bool vm_fault_occurred = false;
 if (queue->device->instance->debug_flags & RADV_DEBUG_VM_FAULTS)
 vm_fault_occurred = 
ac_vm_fault_occured(device->physical_device->rad_info.chip_class,
@@ -104,11 +157,26 @@ radv_check_gpu_hangs(struct radv_queue *queue, struct 
radeon_winsys_cs *cs)
 if (!hang_occurred && !vm_fault_occurred)
 return;

+   graphics_pipeline = radv_get_saved_graphics_pipeline(device);
+   compute_pipeline = radv_get_saved_compute_pipeline(device);
+
 if (vm_fault_occurred) {
 fprintf(stderr, "VM fault report.\n\n");
 fprintf(stderr, "Failing VM page: 0x%08"PRIx64"\n\n", addr);
 }

+   switch (ring) {
+   case RING_GFX:
+   radv_dump_state(graphics_pipeline, stderr);


You may also need to dump the compute shader if set, as we can do
compute dispatches from the gfx ring.


The compute shader (if present) is already dumped in radv_dump_shaders() 
which is similar for all rings.





+   break;
+   case RING_COMPUTE:
+   radv_dump_state(compute_pipeline, stderr);
+   break;
+   default:
+   assert(0);
+   break;
+   }
+
 radv_dump_trace(queue->device, cs);
 abort();
  }
--
2.14.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 01/15] radv: add a comment that describes the trace BO layout

2017-09-12 Thread Samuel Pitoiset




On 09/12/2017 08:05 PM, Bas Nieuwenhuizen wrote:

Add a note that the offsets are in multiples of 4 bytes?


Okay.



On Tue, Sep 12, 2017 at 12:35 PM, Samuel Pitoiset
 wrote:

Signed-off-by: Samuel Pitoiset 
---
  src/amd/vulkan/radv_debug.c | 6 ++
  1 file changed, 6 insertions(+)

diff --git a/src/amd/vulkan/radv_debug.c b/src/amd/vulkan/radv_debug.c
index d52ba5d86d..052daaef2f 100644
--- a/src/amd/vulkan/radv_debug.c
+++ b/src/amd/vulkan/radv_debug.c
@@ -32,6 +32,12 @@
  #include "radv_debug.h"
  #include "radv_shader.h"

+/* Trace BO layout:
+ *
+ * [0]: primary trace ID
+ * [1]: secondary trace ID
+ */
+
  bool
  radv_init_trace(struct radv_device *device)
  {
--
2.14.1


Re: [Mesa-dev] [PATCH] radeonsi/uvd: fix interlaced video buffer height alignment

2017-09-12 Thread Christian König


On 12.09.2017 at 18:10, Leo Liu wrote:



On 09/12/2017 11:37 AM, Christian König wrote:

On 12.09.2017 at 17:32, Leo Liu wrote:



On 09/12/2017 11:23 AM, Christian König wrote:
I don't think this is correct. A long, long time ago I came up 
with this because the firmware didn't like what you proposed below.

Since this change only affects 720p video, I did quite a few 
tests on 720p video dec/enc and haven't seen any problems.


Can you point out a failing case?


No, I vaguely remember the issue was with some low res MPEG2 stream, 
but I have no idea which one exactly.






Instead we should rather fix the scaler to use the original 
width/height of the video buffer and not the adjusted width/height 
of the resources.

To be honest, I don't really want to touch this; that's why I sent 
this patch on its own as an RFC, and also why I went with a workaround 
for OMX one year ago. Now I have run into the same problem again, and 
it does not seem easy to fix in the blit: since it is scaling, I cannot 
simply use the src rect for dst as in the OMX case.


I think the problem is simply that you use the wrong variables.

See, there are video_buffer->width/height, which are the original 
values the application requested (IIRC) 


The height used from va/postproc.c is from video buffer.

static const VARectangle *
vlVaRegionDefault(const VARectangle *region, struct pipe_video_buffer *buf,
                  VARectangle *def)
{
   if (region)
      return region;

   def->x = 0;
   def->y = 0;
   def->width = buf->width;
   def->height = buf->height;

   return def;
}


The problem is:

In si_uvd.c

struct pipe_video_buffer *si_video_buffer_create(struct pipe_context *pipe,
                                                 const struct pipe_video_buffer *tmpl)
{
    struct pipe_video_buffer template;

    template.height = align(tmpl->height / array_size, VL_MACROBLOCK_HEIGHT);


The original info with the right height is in the tmpl, and using it 
was my first thought for dealing with the issue.


But when you keep reading the code, the tmpl gets dropped, leaving 
a new template with a 32-aligned height.


The video buffer was created based on this new template.


and there are the pipe_resource->width/height which are aligned so 
that the hardware can deal with them.


The video buffer and the pipe buffer are the same; they both got aligned.


Ok, then that is most likely the root problem. This shouldn't be the 
case IIRC.


Anyway, feel free to go ahead with your original patch; as you noted, 
it's better not to touch that too intensely or a lot of things might break.


We should just test with some low res MPEG2 stream to see if the 
standard PAL/NTSC formats still work.


Regards,
Christian.




Regards,
Leo




So while scaling we need to look at video_buffer->width/height 
instead of the resource width/height to figure out the destination 
area of the blit.


Regards,
Christian.



Regards,
Leo




Regards,
Christian.

On 12.09.2017 at 15:56, Leo Liu wrote:

In code:
template.height = align(tmpl->height / array_size, VL_MACROBLOCK_HEIGHT);
...
template.height *= array_size;

It turns out the height will be aligned with 2*VL_MACROBLOCK_HEIGHT.
The problematic case, for example, is VA-API postproc scaling with a
blit between interlaced buffers: if the size is 720 in height, it will
actually be scaled to 736 in height, so the scaled video will crop out
the extra 16 lines of height.

Another example is deinterlacing 720p video from an interlaced buffer
to a progressive buffer. This problem happened on OMX and was worked
around with the patch:

(0c374a777 st/omx/dec: set dst rect to match src size

When creating an interlaced video buffer, the height is set to
"template.height = align(tmpl->height / array_size, VL_MACROBLOCK_HEIGHT);",
and we use "template.height *= array_size;" for the buffer height, so it
is actually aligned to 32. A progressive video buffer is still aligned
to 16, thus causing a different height between the interlaced buffer and
the progressive buffer for 4K (height=2160) and 720p (height=720).

When transcoding the video, this causes 16 lines of corruption
at the bottom of the encoded video.)

Signed-off-by: Leo Liu 
---
  src/gallium/drivers/radeonsi/si_uvd.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeonsi/si_uvd.c b/src/gallium/drivers/radeonsi/si_uvd.c
index 2441ad248c..e4c55c20e1 100644
--- a/src/gallium/drivers/radeonsi/si_uvd.c
+++ b/src/gallium/drivers/radeonsi/si_uvd.c
@@ -62,7 +62,7 @@ struct pipe_video_buffer *si_video_buffer_create(struct pipe_context *pipe,
  array_size = tmpl->interlaced ? 2 : 1;
  template = *tmpl;
  template.width = align(tmpl->width, VL_MACROBLOCK_WIDTH);
-template.height = align(tmpl->height / array_size, VL_MACROBLOCK_HEIGHT);
+template.height = align(tmpl->height, VL_MACROBLOCK_HEIGHT) / array_size;
 vl_video_buffer_template(&templ, &template, resource_formats[0], 1, array_size, PIPE_USAGE_DEFAULT, 0);

  /* TODO: get tiling working */
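
The arithmetic behind this patch can be checked with a small worked example. This is only a sketch: `align_pot`, `buffer_height_old` and `buffer_height_new` are illustrative helpers rather than Mesa functions, and VL_MACROBLOCK_HEIGHT is assumed to be 16, as the thread implies.

```c
#include <assert.h>

/* Assumed value; the thread describes progressive buffers as 16-aligned. */
#define VL_MACROBLOCK_HEIGHT 16u

/* Round value up to the next multiple of a power-of-two alignment. */
unsigned align_pot(unsigned value, unsigned alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (value + alignment - 1) & ~(alignment - 1);
}

/* Total buffer height before the patch: each field is aligned on its own,
 * so an interlaced buffer ends up aligned to 2 * VL_MACROBLOCK_HEIGHT. */
unsigned buffer_height_old(unsigned height, int interlaced)
{
    unsigned array_size = interlaced ? 2 : 1;
    unsigned field = align_pot(height / array_size, VL_MACROBLOCK_HEIGHT);

    return field * array_size;
}

/* Total buffer height with the patch: align the full frame first, then
 * split it into fields, so both layouts agree on the total height. */
unsigned buffer_height_new(unsigned height, int interlaced)
{
    unsigned array_size = interlaced ? 2 : 1;
    unsigned field = align_pot(height, VL_MACROBLOCK_HEIGHT) / array_size;

    return field * array_size;
}
```

For a 720-line interlaced buffer the old formula gives align(360, 16) * 2 = 736, while the new one gives align(720, 16) / 2 * 2 = 720, matching the progressive case. For 1080 lines both formulas give 1088, which fits the observation that only heights such as 720 and 2160 are affected.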

Re: [Mesa-dev] [PATCH 15/15] radv: dump the list of enabled options when a hang occured

2017-09-12 Thread Bas Nieuwenhuizen

Okay, if you fix the few comments I sent, this series is

Reviewed-by: Bas Nieuwenhuizen 

On Tue, Sep 12, 2017 at 12:35 PM, Samuel Pitoiset
 wrote:
> Useful to know which debug/perftest options were enabled when
> a hang report is generated.
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_debug.c   | 25 +
>  src/amd/vulkan/radv_device.c  | 14 ++
>  src/amd/vulkan/radv_private.h |  7 +++
>  3 files changed, 46 insertions(+)
>
> diff --git a/src/amd/vulkan/radv_debug.c b/src/amd/vulkan/radv_debug.c
> index 106c6e4f64..812e868c10 100644
> --- a/src/amd/vulkan/radv_debug.c
> +++ b/src/amd/vulkan/radv_debug.c
> @@ -553,6 +553,30 @@ radv_dump_dmesg(FILE *f)
> pclose(p);
>  }
>
> +static void
> +radv_dump_enabled_options(struct radv_device *device, FILE *f)
> +{
> +   uint64_t mask;
> +
> +   fprintf(f, "Enabled debug options: ");
> +
> +   mask = device->debug_flags;
> +   while (mask) {
> +   int i = u_bit_scan64(&mask);
> +   fprintf(f, "%s, ", radv_get_debug_option_name(i));
> +   }
> +   fprintf(f, "\n");
> +
> +   fprintf(f, "Enabled perftest options: ");
> +
> +   mask = device->instance->perftest_flags;
> +   while (mask) {
> +   int i = u_bit_scan64(&mask);
> +   fprintf(f, "%s, ", radv_get_perftest_option_name(i));
> +   }
> +   fprintf(f, "\n");
> +}
> +
>  static bool
>  radv_gpu_hang_occured(struct radv_queue *queue, enum ring_type ring)
>  {
> @@ -585,6 +609,7 @@ radv_check_gpu_hangs(struct radv_queue *queue, struct 
> radeon_winsys_cs *cs)
> graphics_pipeline = radv_get_saved_graphics_pipeline(device);
> compute_pipeline = radv_get_saved_compute_pipeline(device);
>
> +   radv_dump_enabled_options(device, stderr);
> radv_dump_dmesg(stderr);
>
> if (vm_fault_occurred) {
> diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
> index 5101bd7cb2..58e6815124 100644
> --- a/src/amd/vulkan/radv_device.c
> +++ b/src/amd/vulkan/radv_device.c
> @@ -417,12 +417,26 @@ static const struct debug_control radv_debug_options[] 
> = {
> {NULL, 0}
>  };
>
> +const char *
> +radv_get_debug_option_name(int id)
> +{
> +   assert(id < ARRAY_SIZE(radv_debug_options) - 1);
> +   return radv_debug_options[id].string;
> +}
> +
>  static const struct debug_control radv_perftest_options[] = {
> {"nobatchchain", RADV_PERFTEST_NO_BATCHCHAIN},
> {"sisched", RADV_PERFTEST_SISCHED},
> {NULL, 0}
>  };
>
> +const char *
> +radv_get_perftest_option_name(int id)
> +{
> +   assert(id < ARRAY_SIZE(radv_debug_options) - 1);
> +   return radv_perftest_options[id].string;
> +}
> +
>  VkResult radv_CreateInstance(
> const VkInstanceCreateInfo* pCreateInfo,
> const VkAllocationCallbacks*pAllocator,
> diff --git a/src/amd/vulkan/radv_private.h b/src/amd/vulkan/radv_private.h
> index 31991a314c..191a81f77a 100644
> --- a/src/amd/vulkan/radv_private.h
> +++ b/src/amd/vulkan/radv_private.h
> @@ -744,6 +744,13 @@ extern const struct radv_dynamic_state 
> default_dynamic_state;
>  void radv_dynamic_state_copy(struct radv_dynamic_state *dest,
>  const struct radv_dynamic_state *src,
>  uint32_t copy_mask);
> +
> +const char *
> +radv_get_debug_option_name(int id);
> +
> +const char *
> +radv_get_perftest_option_name(int id);
> +
>  /**
>   * Attachment state when recording a renderpass instance.
>   *
> --
> 2.14.1
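
The mask-walking loop in this patch can be sketched in isolation as follows. Everything here is an assumption for illustration: `bit_scan64` is a local stand-in for Mesa's `u_bit_scan64()`, and `option_names` and `format_options` are hypothetical. Like the patch, the sketch assumes that the table index equals the bit index of the flag; unlike the patch, it bounds-checks against the table actually being indexed and emits the separator only between names.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Local stand-in for u_bit_scan64(): return the index of the lowest set
 * bit and clear that bit in *mask. The mask must not be zero. */
int bit_scan64(uint64_t *mask)
{
    int i = 0;

    assert(*mask != 0);
    while (!((*mask >> i) & 1))
        i++;
    *mask &= ~(UINT64_C(1) << i);
    return i;
}

/* Hypothetical option table; index i names the flag (1 << i). */
static const char *const option_names[] = { "alpha", "beta", "gamma", "delta" };
#define NUM_OPTIONS (sizeof(option_names) / sizeof(option_names[0]))

/* Write the names of all set bits into out, separated by ", ".
 * Returns the number of names written. */
int format_options(uint64_t mask, char *out, size_t out_size)
{
    size_t used = 0;
    int count = 0;

    if (out_size == 0)
        return 0;
    out[0] = '\0';

    while (mask) {
        int i = bit_scan64(&mask);
        int n;

        if ((size_t)i >= NUM_OPTIONS)
            continue; /* bit has no name in this table */

        n = snprintf(out + used, out_size - used, "%s%s",
                     count ? ", " : "", option_names[i]);
        if (n < 0 || (size_t)n >= out_size - used) {
            out[used] = '\0'; /* drop the partially written name */
            break;
        }
        used += (size_t)n;
        count++;
    }

    return count;
}
```

With a mask of 0x5 the buffer ends up holding "alpha, gamma", and a mask of 0 leaves it empty.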

Re: [Mesa-dev] [PATCH 05/15] radv: dump the active shaders when a hang occured

2017-09-12 Thread Bas Nieuwenhuizen

On Tue, Sep 12, 2017 at 12:35 PM, Samuel Pitoiset
 wrote:
> Only the disassembly is currently dumped.
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_debug.c | 78 
> ++---
>  1 file changed, 73 insertions(+), 5 deletions(-)
>
> diff --git a/src/amd/vulkan/radv_debug.c b/src/amd/vulkan/radv_debug.c
> index 0dc2d3a22b..fe9d9cfdba 100644
> --- a/src/amd/vulkan/radv_debug.c
> +++ b/src/amd/vulkan/radv_debug.c
> @@ -76,13 +76,62 @@ radv_dump_trace(struct radv_device *device, struct 
> radeon_winsys_cs *cs)
> fclose(f);
>  }
>
> +static void
> +radv_dump_shader(struct radv_pipeline *pipeline, gl_shader_stage stage, FILE 
> *f)
> +{
> +   struct radv_shader_variant *shader = pipeline->shaders[stage];
> +
> +   if (!shader)
> +   return;
> +
> +   fprintf(f, "%s:\n%s\n\n", radv_get_shader_name(shader, stage),
> +   shader->disasm_string);
> +}
> +
> +static void
> +radv_dump_shaders(struct radv_pipeline *pipeline, FILE *f)
> +{
> +   unsigned mask;
> +
> +   mask = pipeline->active_stages;
> +   while (mask) {
> +   int stage = u_bit_scan(&mask);
> +
> +   radv_dump_shader(pipeline, stage, f);
> +   }
> +
> +   radv_dump_shader(pipeline, MESA_SHADER_COMPUTE, f);
> +}
> +
> +static void
> +radv_dump_state(struct radv_pipeline *pipeline, FILE *f)
> +{
> +   if (!pipeline)
> +   return;
> +
> +   radv_dump_shaders(pipeline, f);
> +}
> +
> +static struct radv_pipeline *
> +radv_get_saved_graphics_pipeline(struct radv_device *device)
> +{
> +   uint64_t *ptr = (uint64_t *)device->trace_id_ptr;
> +
> +   return (struct radv_pipeline *)ptr[1];
> +}
> +
> +static struct radv_pipeline *
> +radv_get_saved_compute_pipeline(struct radv_device *device)
> +{
> +   uint64_t *ptr = (uint64_t *)device->trace_id_ptr;
> +
> +   return (struct radv_pipeline *)ptr[2];
> +}
> +
>  static bool
> -radv_gpu_hang_occured(struct radv_queue *queue)
> +radv_gpu_hang_occured(struct radv_queue *queue, enum ring_type ring)
>  {
> struct radeon_winsys *ws = queue->device->ws;
> -   enum ring_type ring;
> -
> -   ring = radv_queue_family_to_ring(queue->queue_family_index);
>
> if (!ws->ctx_wait_idle(queue->hw_ctx, ring, queue->queue_idx))
> return true;
> @@ -93,10 +142,14 @@ radv_gpu_hang_occured(struct radv_queue *queue)
>  void
>  radv_check_gpu_hangs(struct radv_queue *queue, struct radeon_winsys_cs *cs)
>  {
> +   struct radv_pipeline *graphics_pipeline, *compute_pipeline;
> struct radv_device *device = queue->device;
> +   enum ring_type ring;
> uint64_t addr;
>
> -   bool hang_occurred = radv_gpu_hang_occured(queue);
> +   ring = radv_queue_family_to_ring(queue->queue_family_index);
> +
> +   bool hang_occurred = radv_gpu_hang_occured(queue, ring);
> bool vm_fault_occurred = false;
> if (queue->device->instance->debug_flags & RADV_DEBUG_VM_FAULTS)
> vm_fault_occurred = 
> ac_vm_fault_occured(device->physical_device->rad_info.chip_class,
> @@ -104,11 +157,26 @@ radv_check_gpu_hangs(struct radv_queue *queue, struct 
> radeon_winsys_cs *cs)
> if (!hang_occurred && !vm_fault_occurred)
> return;
>
> +   graphics_pipeline = radv_get_saved_graphics_pipeline(device);
> +   compute_pipeline = radv_get_saved_compute_pipeline(device);
> +
> if (vm_fault_occurred) {
> fprintf(stderr, "VM fault report.\n\n");
> fprintf(stderr, "Failing VM page: 0x%08"PRIx64"\n\n", addr);
> }
>
> +   switch (ring) {
> +   case RING_GFX:
> +   radv_dump_state(graphics_pipeline, stderr);

You may also need to dump the compute shader if set, as we can do
compute dispatches from the gfx ring.

> +   break;
> +   case RING_COMPUTE:
> +   radv_dump_state(compute_pipeline, stderr);
> +   break;
> +   default:
> +   assert(0);
> +   break;
> +   }
> +
> radv_dump_trace(queue->device, cs);
> abort();
>  }
> --
> 2.14.1
>

Re: [Mesa-dev] [RFC] NIR serialization

2017-09-12 Thread Jason Ekstrand

On Tue, Sep 12, 2017 at 10:12 AM, Ian Romanick  wrote:

> On 09/11/2017 11:17 PM, Kenneth Graunke wrote:
> > On Monday, September 11, 2017 9:23:05 PM PDT Ian Romanick wrote:
> >> On 09/08/2017 01:59 AM, Kenneth Graunke wrote:
> >>> On Thursday, September 7, 2017 4:26:04 PM PDT Jordan Justen wrote:
>  On 2017-09-06 14:12:41, Daniel Schürmann wrote:
> > Hello together!
> > Recently, we had a small discussion (off the list) about the NIR
> > serialization, which was previously discussed in [RFC] ARB_gl_spirv
> and
> > NIR backend for radeonsi.
> >
> > As this topic could be interesting to more people, I would like to
> > share, what was talked about so far (You might want to read from
> bottom up).
> >
> > TL;DR:
> > - NIR serialization is in demand for shader cache
> > - could be done either directly (NIR binary form) or via SPIR-V
> > - Ian et al. are working on GLSL IR -> SPIR-V transformation, which
> > could be adapted for a NIR -> SPIR-V pass
> > - in NIR representation, some type information is lost
> > - thus, a serialization via SPIR-V could NOT be a glslang alternative
> > (otoh, the GLSL IR->SPIR-V pass could), but only for spirv-opt (if
> the
> > output is valid SPIR-V)
> 
>  Ian,
> 
>  Tim was suggesting that we might look at serializing nir for the i965
>  shader cache. Based on this email, it sounds like serialized nir would
>  not be enough for the shader cache as some GLSL type info would be
>  lost. It sounds like GLSL IR => SPIR-V would be good enough. Is that
>  right?
> 
>  I don't think we have a strict requirement for the GLSL IR => SPIR-V
>  path for GL 4.6, right? So, this is more of a 'nice-to-have'?
> 
>  I'm not sure we'd want to make i965 shader cache depend on a
>  nice-to-have feature. (Unless we're pretty sure it'll be available
>  soon.)
> 
>  But, it would be nice to not have to fallback to compiling the GLSL
>  for i965 shader cache, so it would be worth waiting a little bit to be
>  able to rely on a SPIR-V serialization of the GLSL IR.
> 
>  What do you suggest?
> 
>  -Jordan
> >>>
> >>> We shouldn't use SPIR-V for the shader cache.
> >>>
> >>> The compilation process for GLSL is: GLSL -> GLSL IR -> NIR -> i965
> IRs.
> >>> Storing the content at one of those points, and later loading it and
> >>> resuming the normal compilation process from that point...that's
> totally
> >>> reasonable.
> >>>
> >>> Having a fallback for "some things in the cache but not all the
> variants
> >>> we needed" suddenly take a different compilation pipeline, i.e. SPIR-V
> >>> -> NIR -> ... seems risky.  It's a different compilation path that we
> >>> don't normally use.  And one you'd only hit in limited circumstances.
> >>> There's a lot of potential for really obscure bugs.
> >>
> >> Since we're going to expose exactly that path for GL_ARB_spirv / OpenGL
> >> 4.6, we'd better make sure it works always.  Right?
> >
> > In addition to the old pipeline:
> >
> > - GLSL from the app -> GLSL IR -> NIR -> i965 IR
> >
> > GL_ARB_spirv and OpenGL 4.6 add a second pipeline:
> >
> > - SPIR-V from the app -> NIR -> i965 IR
> >
> > Both of those absolutely have to work.  But these:
> >
> > - GLSL -> GLSL IR -> NIR -> SPIR-V -> NIR -> i965 IRs
> > - GLSL -> GLSL IR -> SPIR-V -> NIR -> i965 IRs
> >
> > aren't required to work, or even be supported.  It makes a lot of sense
> > to support them - both for testing purposes, and as an alternative to
> > glslang, for a broader tooling ecosystem.
> >
> > The thing that concerns me is that if you use SPIR-V for the cache, you
> > need these paths to not just work, but be _indistinguishable_ from one
> > another:
> >
> > - GLSL -> GLSL IR -> NIR -> ...
> > - GLSL -> GLSL IR -> NIR -> SPIR-V, then SPIR-V -> NIR -> ...
> >
> > Otherwise the original compile and partially-cached recompile might have
> > different properties.  For example, if the SPIR-V step messes with
> > variables or instruction ordering a little, it could trip up the loop
> > unroller so the original compile gets unrolled, and the recompile from
> > partial cache doesn't get unrolled.  I don't want to have to debug that.
>
> That is a very compelling argument.  If we want Mesa to be an
> alternative to glslang, I think we would like to have that property, but
> it's not a hard requirement for that use case.
>

I also find that argument rather compelling.  The SPIR-V -> NIR pass is
*not* a simple pass.  It does piles of lowering and things on-the-fly as
well as creating temporary variables for various things.  The best we could
hope to guarantee would be that NIR -> SPIR-V -> NIR -> vars_to_ssa -> CSE
is idempotent.  Even that might be a bit of a stretch.


> > One could avoid this by making the original compile always go through
> > SPIR-V, and just drop glsl_to_nir altogether, so both take the same
> >

Re: [Mesa-dev] [PATCH 02/15] radv: save the bound pipeline pointers into the trace BO

2017-09-12 Thread Bas Nieuwenhuizen

On Tue, Sep 12, 2017 at 12:35 PM, Samuel Pitoiset
 wrote:
> When a GPU hang is detected in radv_gpu_hang_occured() we know
> which command buffer is faulty but the bound pipelines might
> have been updated during the execution.
>
> The pointers to the radv_pipeline objects are emitted just
> after the second trace ID, that way it would be easy to dump
> the active shaders at the moment of the hang.
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_cmd_buffer.c | 59 
> +++-
>  src/amd/vulkan/radv_debug.c  |  2 ++
>  2 files changed, 54 insertions(+), 7 deletions(-)
>
> diff --git a/src/amd/vulkan/radv_cmd_buffer.c 
> b/src/amd/vulkan/radv_cmd_buffer.c
> index 4e133d1f25..9b6c8c6106 100644
> --- a/src/amd/vulkan/radv_cmd_buffer.c
> +++ b/src/amd/vulkan/radv_cmd_buffer.c
> @@ -329,6 +329,19 @@ radv_cmd_buffer_upload_data(struct radv_cmd_buffer 
> *cmd_buffer,
> return true;
>  }
>
> +static void
> +radv_emit_write_data_packet(struct radeon_winsys_cs *cs, uint64_t va,
> +   unsigned count, uint32_t *data)

const uint32_t *data

> +{
> +   radeon_emit(cs, PKT3(PKT3_WRITE_DATA, 2 + count, 0));
> +   radeon_emit(cs, S_370_DST_SEL(V_370_MEM_ASYNC) |
> +   S_370_WR_CONFIRM(1) |
> +   S_370_ENGINE_SEL(V_370_ME));
> +   radeon_emit(cs, va);
> +   radeon_emit(cs, va >> 32);
> +   radeon_emit_array(cs, data, count);
> +}
> +
>  void radv_cmd_buffer_trace_emit(struct radv_cmd_buffer *cmd_buffer)
>  {
> struct radv_device *device = cmd_buffer->device;
> @@ -346,17 +359,46 @@ void radv_cmd_buffer_trace_emit(struct radv_cmd_buffer 
> *cmd_buffer)
>
> ++cmd_buffer->state.trace_id;
> device->ws->cs_add_buffer(cs, device->trace_bo, 8);
> -   radeon_emit(cs, PKT3(PKT3_WRITE_DATA, 3, 0));
> -   radeon_emit(cs, S_370_DST_SEL(V_370_MEM_ASYNC) |
> -   S_370_WR_CONFIRM(1) |
> -   S_370_ENGINE_SEL(V_370_ME));
> -   radeon_emit(cs, va);
> -   radeon_emit(cs, va >> 32);
> -   radeon_emit(cs, cmd_buffer->state.trace_id);
> +   radv_emit_write_data_packet(cs, va, 1, &cmd_buffer->state.trace_id);
> radeon_emit(cs, PKT3(PKT3_NOP, 0, 0));
> radeon_emit(cs, AC_ENCODE_TRACE_POINT(cmd_buffer->state.trace_id));
>  }
>
> +static void
> +radv_save_pipeline(struct radv_cmd_buffer *cmd_buffer,
> +  struct radv_pipeline *pipeline, enum ring_type ring)
> +{
> +   struct radv_device *device = cmd_buffer->device;
> +   struct radeon_winsys_cs *cs = cmd_buffer->cs;
> +   uint32_t data[2];
> +   uint64_t va;
> +
> +   if (!device->trace_bo)
> +   return;
> +
> +   va = device->ws->buffer_get_va(device->trace_bo);
> +
> +   switch (ring) {
> +   case RING_GFX:
> +   va += 8;
> +   break;
> +   case RING_COMPUTE:
> +   va += 16;
> +   break;
> +   default:
> +   assert(!"invalid ring type");
> +   }
> +
> +   MAYBE_UNUSED unsigned cdw_max = radeon_check_space(device->ws,
> +  cmd_buffer->cs, 6);
> +
> +   data[0] = (uintptr_t)pipeline;
> +   data[1] = (uintptr_t)pipeline >> 32;
> +
> +   device->ws->cs_add_buffer(cs, device->trace_bo, 8);
> +   radv_emit_write_data_packet(cs, va, 2, data);
> +}
> +
>  static void
>  radv_emit_graphics_blend_state(struct radv_cmd_buffer *cmd_buffer,
>struct radv_pipeline *pipeline)
> @@ -897,6 +939,8 @@ radv_emit_graphics_pipeline(struct radv_cmd_buffer 
> *cmd_buffer)
> }
> radeon_set_context_reg(cmd_buffer->cs, R_028A6C_VGT_GS_OUT_PRIM_TYPE, 
> pipeline->graphics.gs_out);
>
> +   radv_save_pipeline(cmd_buffer, pipeline, RING_GFX);
> +
> cmd_buffer->state.emitted_pipeline = pipeline;
>  }
>
> @@ -2292,6 +2336,7 @@ radv_emit_compute_pipeline(struct radv_cmd_buffer 
> *cmd_buffer)
> 
> S_00B81C_NUM_THREAD_FULL(compute_shader->info.cs.block_size[2]));
>
> assert(cmd_buffer->cs->cdw <= cdw_max);
> +   radv_save_pipeline(cmd_buffer, pipeline, RING_COMPUTE);
>  }
>
>  static void radv_mark_descriptor_sets_dirty(struct radv_cmd_buffer 
> *cmd_buffer)
> diff --git a/src/amd/vulkan/radv_debug.c b/src/amd/vulkan/radv_debug.c
> index 052daaef2f..0dc2d3a22b 100644
> --- a/src/amd/vulkan/radv_debug.c
> +++ b/src/amd/vulkan/radv_debug.c
> @@ -36,6 +36,8 @@
>   *
>   * [0]: primary trace ID
>   * [1]: secondary trace ID
> + * [2-3]: 64-bit GFX pipeline pointer
> + * [4-5]: 64-bit COMPUTE pipeline pointer
>   */
>
>  bool
> --
> 2.14.1
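
The trace BO layout used by this patch can be sketched as follows. The helper names and the `TRACE_*` constants are hypothetical; only the dword offsets come from the layout comment in the patch. Dwords [2-3] sit at byte offset 8 and dwords [4-5] at byte offset 16, which is why the dump code in patch 05 can read them back as 64-bit elements 1 and 2.

```c
#include <assert.h>
#include <stdint.h>

/* Layout from the patch comment, in 32-bit dwords:
 *   [0]   primary trace ID
 *   [1]   secondary trace ID
 *   [2-3] 64-bit GFX pipeline pointer     (byte offset 8)
 *   [4-5] 64-bit COMPUTE pipeline pointer (byte offset 16)
 * The constant names below are illustrative, not taken from radv. */
#define TRACE_DWORDS 6u
#define TRACE_GFX_PIPELINE_DW 2u
#define TRACE_COMPUTE_PIPELINE_DW 4u

/* Store a pointer as two dwords, low half first. Widening to 64 bits
 * before shifting keeps the shift well defined on 32-bit builds. */
void save_pointer(uint32_t *trace, unsigned dw_offset, const void *ptr)
{
    uint64_t value = (uint64_t)(uintptr_t)ptr;

    assert(dw_offset + 1 < TRACE_DWORDS);
    trace[dw_offset] = (uint32_t)value;
    trace[dw_offset + 1] = (uint32_t)(value >> 32);
}

/* Rebuild the pointer from its two dwords. */
void *load_pointer(const uint32_t *trace, unsigned dw_offset)
{
    uint64_t value;

    assert(dw_offset + 1 < TRACE_DWORDS);
    value = (uint64_t)trace[dw_offset] |
            ((uint64_t)trace[dw_offset + 1] << 32);
    return (void *)(uintptr_t)value;
}
```

Saving a pointer at `TRACE_GFX_PIPELINE_DW` and loading it back from the same offset returns the original pointer, independently of what is stored in the compute slot.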

Re: [Mesa-dev] [PATCH 01/15] radv: add a comment that describes the trace BO layout

2017-09-12 Thread Bas Nieuwenhuizen

Add a note that the offsets are in multiples of 4 bytes?

On Tue, Sep 12, 2017 at 12:35 PM, Samuel Pitoiset
 wrote:
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_debug.c | 6 ++
>  1 file changed, 6 insertions(+)
>
> diff --git a/src/amd/vulkan/radv_debug.c b/src/amd/vulkan/radv_debug.c
> index d52ba5d86d..052daaef2f 100644
> --- a/src/amd/vulkan/radv_debug.c
> +++ b/src/amd/vulkan/radv_debug.c
> @@ -32,6 +32,12 @@
>  #include "radv_debug.h"
>  #include "radv_shader.h"
>
> +/* Trace BO layout:
> + *
> + * [0]: primary trace ID
> + * [1]: secondary trace ID
> + */
> +
>  bool
>  radv_init_trace(struct radv_device *device)
>  {
> --
> 2.14.1
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
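
To illustrate the review comment, here is a minimal sketch of the layout with the indices treated as dword (4-byte) slots. The enum and helper names are hypothetical and not part of radv; the GFX/COMPUTE slots are taken from the later patch in the series, and the low-dword-first ordering of the 64-bit pointers is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Dword (4-byte) indices into the trace BO, following the layout comment.
 * Hypothetical names, for illustration only. */
enum trace_slot {
    TRACE_PRIMARY_ID       = 0,
    TRACE_SECONDARY_ID     = 1,
    TRACE_GFX_PIPELINE     = 2, /* occupies dwords 2-3 */
    TRACE_COMPUTE_PIPELINE = 4, /* occupies dwords 4-5 */
    TRACE_NUM_DWORDS       = 6
};

/* Byte offset of a slot: the indices count 4-byte units, not bytes. */
static inline uint64_t trace_slot_byte_offset(enum trace_slot slot)
{
    return (uint64_t)slot * sizeof(uint32_t);
}

/* Store a 64-bit value in two consecutive dwords.
 * Assumption: the low dword comes first. */
static inline void trace_store_u64(uint32_t *trace, enum trace_slot slot,
                                   uint64_t value)
{
    trace[slot]     = (uint32_t)value;
    trace[slot + 1] = (uint32_t)(value >> 32);
}

/* Read back a 64-bit value written by trace_store_u64(). */
static inline uint64_t trace_load_u64(const uint32_t *trace,
                                      enum trace_slot slot)
{
    return (uint64_t)trace[slot] | ((uint64_t)trace[slot + 1] << 32);
}
```

With this reading, slot [4] starts at byte offset 16, which is the kind of detail the requested note would make explicit.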

Re: [Mesa-dev] [PATCH 3/3] radv: clear push_constant_stages when resetting a command buffer

2017-09-12 Thread Bas Nieuwenhuizen

The series is

Reviewed-by: Bas Nieuwenhuizen 

On Tue, Sep 12, 2017 at 7:08 PM, Samuel Pitoiset
 wrote:
> Per the spec:
>
>"Resetting a command buffer is an operation that discards any
>previously recorded commands and puts a command buffer in the
>initial state."
>
> As far as I'm concerned, that flag can be changed by calling
> vkCmdPushConstants() (or any other functions which update it),
> so it should be cleared as well.
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_cmd_buffer.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/src/amd/vulkan/radv_cmd_buffer.c 
> b/src/amd/vulkan/radv_cmd_buffer.c
> index 6a82867c82..d888300677 100644
> --- a/src/amd/vulkan/radv_cmd_buffer.c
> +++ b/src/amd/vulkan/radv_cmd_buffer.c
> @@ -216,6 +216,7 @@ radv_reset_cmd_buffer(struct radv_cmd_buffer *cmd_buffer)
> free(up);
> }
>
> +   cmd_buffer->push_constant_stages = 0;
> cmd_buffer->scratch_size_needed = 0;
> cmd_buffer->compute_scratch_size_needed = 0;
> cmd_buffer->esgs_ring_size_needed = 0;
> --
> 2.14.1
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
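
As a rough illustration of the spec wording quoted above, the sketch below uses a trimmed-down, hypothetical stand-in for struct radv_cmd_buffer. It only shows the idea that every piece of state a recording command can touch has to go back to its initial value on reset; it is not the radv implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, trimmed-down stand-in for struct radv_cmd_buffer. */
struct toy_cmd_buffer {
    uint32_t push_constant_stages;   /* stage bits touched by push constants */
    uint32_t scratch_size_needed;
    uint32_t compute_scratch_size_needed;
    uint32_t esgs_ring_size_needed;
};

/* Models a recording command that dirties the push constant stages. */
static inline void toy_push_constants(struct toy_cmd_buffer *cmd_buffer,
                                      uint32_t stage_flags)
{
    cmd_buffer->push_constant_stages |= stage_flags;
}

/* Reset puts the command buffer back in its initial state, so state
 * written while recording must be cleared here as well. */
static inline void toy_reset_cmd_buffer(struct toy_cmd_buffer *cmd_buffer)
{
    cmd_buffer->push_constant_stages = 0;
    cmd_buffer->scratch_size_needed = 0;
    cmd_buffer->compute_scratch_size_needed = 0;
    cmd_buffer->esgs_ring_size_needed = 0;
}
```

Without the first assignment in the reset helper, stage bits from a previous recording would leak into the next one.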

[Mesa-dev] [Bug 102665] test_glsl_to_tgsi_lifetime.cpp:53:67: error: ‘>>’ should be ‘> >’ within a nested template argument list

2017-09-12 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=102665

--- Comment #4 from Vinson Lee  ---
In the travis build, the -std=c++11 option comes from LLVM_CXXFLAGS with
llvm-3.9. Older llvm versions or builds without llvm will not have -std=c++11.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.

Re: [Mesa-dev] [PATCH] i965 : optimized bucket index calculation

2017-09-12 Thread Matt Turner

On Tue, Sep 12, 2017 at 10:19 AM, Ian Romanick  wrote:
> On 09/12/2017 02:40 AM, Marathe, Yogesh wrote:
>> Hi Jason,
>>
>>
>>
>> On the asserts you’ve mentioned below, I assume we need to add them
>> after ‘bufmgr->num_buckets++’ in add_bucket() as num_buckets could be 0
>> initially. Another clarification on ~1%, we meant approx. 1% there,
>> that’s an improvement we saw in 3Dmark total not a degradation, we’ll
>> correct it in commit msg.
>
> I think the problem is that there is insufficient information about your
> data.  What we want to see in a commit message is something like:
>
> commit 5ae2de81c8350272c122ea38e6bb4c0a41d58921
> Author: Kenneth Graunke 
> Date:   Mon Aug 28 16:08:32 2017 -0700
>
> i965: Use BLORP for buffer object stall avoidance blits instead of BLT.
>
> Improves performance of GFXBench4 tests at 1024x768 on a Kabylake GT2:
> - Manhattan 3.1 by 1.32134% +/- 0.322734% (n=8).
> - Car Chase by 1.25607% +/- 0.291262% (n=5).
>
> Reviewed-by: Jason Ekstrand 
>
> The important bits are:
>
> - average improvement
> - statistical deviation
> - number of runs

And for generating such data, we often use http://anholt.net/compare-perf/

[Mesa-dev] [Bug 102682] vblank_mode ignored from ~/.drirc

2017-09-12 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=102682

--- Comment #3 from Gustaw Smolarczyk  ---
Was too fast. The passed driver name is not "loader" but "dri2". And I think
it's intentional, since vblank_mode is not an option of a device driver but of
the dri infrastructure.

I will let the more experienced developers judge this bug report, but I would
say it's NOTABUG.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.

[Mesa-dev] [Bug 102682] vblank_mode ignored from ~/.drirc

2017-09-12 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=102682

--- Comment #2 from Niklas Haas  ---
Verifying that your work-around does, indeed, work around the issue.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.

Re: [Mesa-dev] [PATCH mesa] glsl: compile unused function out

2017-09-12 Thread Matt Turner

On Tue, Sep 12, 2017 at 8:01 AM, Eric Engestrom
 wrote:
> The function is only called from one place, which is hidden behind
> the same `#ifdef DEBUG`.
>
> Fixes: ca73c3358c91434e68ab "glsl: Mark functions static"
> Signed-off-by: Eric Engestrom 

As long as make check continues passing (I broke that by marking some
function static):

Reviewed-by: Matt Turner 
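
For readers following along, a minimal sketch of the pattern the patch describes: a static helper whose only caller sits behind `#ifdef DEBUG` gets the same guard, so non-debug builds do not warn about an unused static function. The function names are made up for illustration and do not come from the glsl code.

```c
#include <assert.h>
#include <stdio.h>

#ifdef DEBUG
/* Only called from the DEBUG block below, so it lives behind the same
 * guard. Left unguarded, a non-debug build would see an unused static
 * function. */
static void toy_dump_mask(unsigned mask)
{
    fprintf(stderr, "mask = 0x%x\n", mask);
}
#endif

/* Counts the set bits in mask; the debug dump is compiled out unless
 * DEBUG is defined. */
static inline unsigned toy_count_bits(unsigned mask)
{
#ifdef DEBUG
    toy_dump_mask(mask);
#endif
    unsigned count = 0;
    for (; mask; mask &= mask - 1)
        count++;
    return count;
}
```

The result is the same with or without -DDEBUG; only the diagnostic output differs.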

[Mesa-dev] [Bug 102682] vblank_mode ignored from ~/.drirc

2017-09-12 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=102682

--- Comment #1 from Gustaw Smolarczyk  ---
Hi,

The problem seems to be with the `driver="radeonsi"' part (the screen attribute
is not used when driver is present). It seems that there is a bug and an
incorrect driver name "loader" is passed to the xmlconfig machinery.

For a quick work-around, just remove both screen and driver attributes, unless
you want to restrict your setting to a specific driver/screen.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.

Re: [Mesa-dev] [PATCH] r600/sb: remove superfluos assert

2017-09-12 Thread Vadim Girlin


On 09/12/2017 12:49 PM, Gert Wollny wrote:

On Tuesday, 12.09.2017, 09:56 +0300, Vadim Girlin wrote:

On 09/11/2017 07:09 PM, Emil Velikov wrote:



Anyway, if num_arrays is 0 there, I suspect it can be a result of
some other issue. At the very least it looks like a potential
performance problem, because in that case we assume all shader
registers can be  accessed with indirect addressing and it can limit
the optimizations significantly. So it might make sense to figure out
why it's zero in the first place, in theory it shouldn't happen.
Maybe something is wrong with the indirect_files bits?


The shader that's failing is this (i.e. no arrays, and indirect access
only to SV).


Is the tested feature really supported by r600g? AFAICS the indirect 
index value is unused in the shader code.


Anyway, at first glance it looks like we don't need indirect addressing 
for GPRs in this case, so the outer "if" around that assert probably 
should handle this case too and skip the assert. I'm not 100% sure though.




FRAG
DCL SV[0], SAMPLEMASK
DCL OUT[0], COLOR
DCL CONST[0][0]
DCL TEMP[0..1], LOCAL
DCL ADDR[0]
IMM[0] FLT32 {1., 0., 0., 0.}
IMM[1] INT32 {1, 0, 0, 0}
   0: MOV TEMP[0], IMM[0].xyyx
   1: UARL ADDR[0].x, CONST[0][0].
   2: USEQ TEMP[1].x, SV[ADDR[0].x]., IMM[1].
   3: UIF TEMP[1].
   4:   MOV TEMP[0].xy, IMM[0].yxyy
   5: ENDIF
   6: MOV OUT[0], TEMP[0]
   7: END

= SHADER #12 ==
PS/BARTS/EVERGREEN =
= 36 dw = 8 gprs = 1 stack
=
  4005 a418 ALU_PUSH_BEFORE 7 @10 KC0[CB0:0-15]
0010  00f9 00400c90 1 x: MOVR2.x,  1.0
0012  04f8 20400c90   y: MOVR2.y,  0
0014  04f8 40400c90   z: MOVR2.z,  0
0016  00f9 60400c90   w: MOVR2.w,  1.0
0018  8080 00800c90   t: MOVR4.x,  KC0[0].x
0020  801f4800 00601d10 2 x: SETE_INT   R3.x,  R0.z, 1
0022  801f00fe 00e0229c 3 MP  x: PRED_SETNE_INT R7.x,  PV.x, 0
0002  0003 8281 JUMP @6 POP:1
0004  000c a804 ALU_POP_AFTER 2 @24
0024  04f8 00400c90 4 x: MOVR2.x,  0
0026  80f9 20400c90   y: MOVR2.y,  1.0
0006  000e a00c ALU 4 @28
0028  0002 00200c90 5 x: MOVR1.x,  R2.x
0030  0402 20200c90   y: MOVR1.y,  R2.y
0032  0802 40200c90   z: MOVR1.z,  R2.z
0034  8c02 60200c90   w: MOVR1.w,  R2.w
0008  c0008000 95200688 EXPORT_DONEPIXEL 0 R1.xyzw  EOP
= SHADER_END




Re: [Mesa-dev] [PATCH mesa] radv: compile out unused code

2017-09-12 Thread Bas Nieuwenhuizen

Reviewed-by: Bas Nieuwenhuizen 

On 12 Sep 2017 5:11 PM, "Eric Engestrom"  wrote:

Signed-off-by: Eric Engestrom 
---
 src/amd/vulkan/radv_wsi.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/amd/vulkan/radv_wsi.c b/src/amd/vulkan/radv_wsi.c
index aa44b7d78a..8a551c48bb 100644
--- a/src/amd/vulkan/radv_wsi.c
+++ b/src/amd/vulkan/radv_wsi.c
@@ -28,9 +28,11 @@
 #include "wsi_common.h"
 #include "vk_util.h"

+#ifdef VK_USE_PLATFORM_WAYLAND_KHR
 static const struct wsi_callbacks wsi_cbs = {
.get_phys_device_format_properties = radv_GetPhysicalDeviceFormatProperties,
 };
+#endif

 VkResult
 radv_init_wsi(struct radv_physical_device *physical_device)
--
Cheers,
  Eric

Re: [Mesa-dev] [Intel-gfx] [PATCH 1/2] drm/i915/kbl: Remove unused Kabylake pci ids

2017-09-12 Thread Anuj Phogat

On Mon, Sep 11, 2017 at 10:10 AM, Rodrigo Vivi  wrote:
> On Mon, Sep 11, 2017 at 04:11:33PM +, Anuj Phogat wrote:
>> See Mesa commits: ebc5ccf and b2dae9f
>
> I believe we need to be in sync between multiple gfx stack components,
> but I  don't believe we should remove ids.
>
> In the past we had cases where we noticed a product group using a listed
> id to do a product and we just noticed the id after a user reported at fd.o.
>
> For us in kernel the cycle until that id gets into a stable release
> propagated to OSVs distros can be a bit long.
>
> Also Xserver ids are nowadays in sync with Mesa ones and I believe some
> OSVs might take a while to upgrade the Xserver as well in case of a new
> found product with some "new" id.
>
> For this reason I was always in favor of adding all possible reserved ids 
> from the
> beginning.
>
> And this approach worked well on BDW and SKL, where we later saw some
> reserved ids becoming real products and we didn't have to take any extra step.
>
> For this same reason I believe the right solution is to
> add those ids back to mesa instead of removing from kernel and libdrm.
>
I'm fine with keeping the unused PCI IDs in the Mesa tree so that they stay
uniform across multiple gfx stack components. I'll revert the Mesa patch.
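
To make the trade-off concrete, here is a small sketch of an ID table lookup. The table type and function are hypothetical (not the i915 or Mesa data structures); the device IDs and labels are copied from the patch quoted below, and the "reserved" flag simply marks entries that the patch would have removed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical ID table entry, for illustration only. */
struct toy_pci_id {
    uint16_t device_id;
    const char *name;
    int reserved;   /* 1 = no known product at the time of the patch */
};

static const struct toy_pci_id toy_kbl_ids[] = {
    { 0x5906, "ULT GT1",   0 },
    { 0x5902, "DT GT1",    0 },
    { 0x5913, "ULT GT1.5", 1 },
    { 0x593B, "Halo GT4",  1 },
};

/* Returns the entry name, or NULL if the driver would not bind.
 * keep_reserved models whether reserved ids stay in the table. */
static inline const char *toy_lookup(uint16_t device_id, int keep_reserved)
{
    size_t n = sizeof(toy_kbl_ids) / sizeof(toy_kbl_ids[0]);
    for (size_t i = 0; i < n; i++) {
        if (toy_kbl_ids[i].device_id != device_id)
            continue;
        if (toy_kbl_ids[i].reserved && !keep_reserved)
            return NULL;
        return toy_kbl_ids[i].name;
    }
    return NULL;
}
```

If a product later ships with 0x593B, the table that kept reserved ids recognizes it without a new release, which is the point Rodrigo makes.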

> Thanks,
> Rodrigo.
>
>>
>> Cc: Matt Turner 
>> Cc: Rodrigo Vivi 
>> Signed-off-by: Anuj Phogat 
>> ---
>>  drivers/gpu/drm/i915/i915_pci.c |  1 -
>>  include/drm/i915_pciids.h   | 15 ++-
>>  2 files changed, 2 insertions(+), 14 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_pci.c 
>> b/drivers/gpu/drm/i915/i915_pci.c
>> index 129877b..ecf6d4c 100644
>> --- a/drivers/gpu/drm/i915/i915_pci.c
>> +++ b/drivers/gpu/drm/i915/i915_pci.c
>> @@ -613,7 +613,6 @@ static const struct pci_device_id pciidlist[] = {
>>   INTEL_KBL_GT1_IDS(&intel_kabylake_gt1_info),
>>   INTEL_KBL_GT2_IDS(&intel_kabylake_gt2_info),
>>   INTEL_KBL_GT3_IDS(&intel_kabylake_gt3_info),
>> - INTEL_KBL_GT4_IDS(&intel_kabylake_gt3_info),
>>   INTEL_CFL_S_GT1_IDS(&intel_coffeelake_gt1_info),
>>   INTEL_CFL_S_GT2_IDS(&intel_coffeelake_gt2_info),
>>   INTEL_CFL_H_GT2_IDS(&intel_coffeelake_gt2_info),
>> diff --git a/include/drm/i915_pciids.h b/include/drm/i915_pciids.h
>> index 1257e15..a1bf90e 100644
>> --- a/include/drm/i915_pciids.h
>> +++ b/include/drm/i915_pciids.h
>> @@ -337,15 +337,10 @@
>>   INTEL_VGA_DEVICE(0x3185, info)
>>
>>  #define INTEL_KBL_GT1_IDS(info)  \
>> - INTEL_VGA_DEVICE(0x5913, info), /* ULT GT1.5 */ \
>> - INTEL_VGA_DEVICE(0x5915, info), /* ULX GT1.5 */ \
>>   INTEL_VGA_DEVICE(0x5917, info), /* DT  GT1.5 */ \
>>   INTEL_VGA_DEVICE(0x5906, info), /* ULT GT1 */ \
>> - INTEL_VGA_DEVICE(0x590E, info), /* ULX GT1 */ \
>>   INTEL_VGA_DEVICE(0x5902, info), /* DT  GT1 */ \
>> - INTEL_VGA_DEVICE(0x5908, info), /* Halo GT1 */ \
>> - INTEL_VGA_DEVICE(0x590B, info), /* Halo GT1 */ \
>> - INTEL_VGA_DEVICE(0x590A, info) /* SRV GT1 */
>> + INTEL_VGA_DEVICE(0x590B, info)  /* Halo GT1 */
>>
>>  #define INTEL_KBL_GT2_IDS(info)  \
>>   INTEL_VGA_DEVICE(0x5916, info), /* ULT GT2 */ \
>> @@ -353,22 +348,16 @@
>>   INTEL_VGA_DEVICE(0x591E, info), /* ULX GT2 */ \
>>   INTEL_VGA_DEVICE(0x5912, info), /* DT  GT2 */ \
>>   INTEL_VGA_DEVICE(0x591B, info), /* Halo GT2 */ \
>> - INTEL_VGA_DEVICE(0x591A, info), /* SRV GT2 */ \
>>   INTEL_VGA_DEVICE(0x591D, info) /* WKS GT2 */
>>
>>  #define INTEL_KBL_GT3_IDS(info) \
>> - INTEL_VGA_DEVICE(0x5923, info), /* ULT GT3 */ \
>>   INTEL_VGA_DEVICE(0x5926, info), /* ULT GT3 */ \
>>   INTEL_VGA_DEVICE(0x5927, info) /* ULT GT3 */
>>
>> -#define INTEL_KBL_GT4_IDS(info) \
>> - INTEL_VGA_DEVICE(0x593B, info) /* Halo GT4 */
>> -
>>  #define INTEL_KBL_IDS(info) \
>>   INTEL_KBL_GT1_IDS(info), \
>>   INTEL_KBL_GT2_IDS(info), \
>> - INTEL_KBL_GT3_IDS(info), \
>> - INTEL_KBL_GT4_IDS(info)
>> + INTEL_KBL_GT3_IDS(info)
>>
>>  /* CFL S */
>>  #define INTEL_CFL_S_GT1_IDS(info) \
>> --
>> 2.9.4
>>
>> ___
>> Intel-gfx mailing list
>> intel-...@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Mesa-dev] [PATCH] i965 : optimized bucket index calculation

2017-09-12 Thread Ian Romanick

On 09/12/2017 02:40 AM, Marathe, Yogesh wrote:
> Hi Jason,
> 
>  
> 
> On the asserts you’ve mentioned below, I assume we need to add them
> after ‘bufmgr->num_buckets++’ in add_bucket() as num_buckets could be 0
> initially. Another clarification on ~1%, we meant approx. 1% there,
> that’s an improvement we saw in 3Dmark total not a degradation, we’ll
> correct it in commit msg.

I think the problem is that there is insufficient information about your
data.  What we want to see in a commit message is something like:

commit 5ae2de81c8350272c122ea38e6bb4c0a41d58921
Author: Kenneth Graunke 
Date:   Mon Aug 28 16:08:32 2017 -0700

i965: Use BLORP for buffer object stall avoidance blits instead of BLT.

Improves performance of GFXBench4 tests at 1024x768 on a Kabylake GT2:
- Manhattan 3.1 by 1.32134% +/- 0.322734% (n=8).
- Car Chase by 1.25607% +/- 0.291262% (n=5).

Reviewed-by: Jason Ekstrand 

The important bits are:

- average improvement
- statistical deviation
- number of runs
- platform
- name of specific benchmark
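
As a rough sketch of how the first three items could be derived from raw scores: the code below assumes paired runs (before/after) where a higher score is better, and reports the sample standard deviation as the spread. Both choices are assumptions; the quoted commit does not say how its +/- figure was computed, and this is not how any particular tool does it. All names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct perf_summary {
    double mean_pct;    /* average improvement, in percent */
    double stddev_pct;  /* sample standard deviation, in percent */
    size_t n;           /* number of runs */
};

/* Newton's method square root, used so the sketch needs no libm. */
static inline double toy_sqrt(double x)
{
    if (x <= 0.0)
        return 0.0;
    double r = x > 1.0 ? x : 1.0;
    for (int i = 0; i < 64; i++)
        r = 0.5 * (r + x / r);
    return r;
}

/* Per-run improvement in percent, assuming higher scores are better. */
static inline double toy_pct(double before, double after)
{
    return (after - before) / before * 100.0;
}

static inline struct perf_summary summarize_runs(const double *before,
                                                 const double *after,
                                                 size_t n)
{
    struct perf_summary s = { 0.0, 0.0, n };
    if (n == 0)
        return s;

    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += toy_pct(before[i], after[i]);
    s.mean_pct = sum / (double)n;

    if (n > 1) {
        double sq = 0.0;
        for (size_t i = 0; i < n; i++) {
            double d = toy_pct(before[i], after[i]) - s.mean_pct;
            sq += d * d;
        }
        s.stddev_pct = toy_sqrt(sq / (double)(n - 1));
    }
    return s;
}
```

The platform and the benchmark name still have to be written into the commit message by hand.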

> Rest all review comments from you, Tapani and Emil are noted &
> implemented, we are working on running it through mesa CI/CTS and we
> should see a v2 for review after that.
> 
>  
> 
> Regards,
> 
> Yogesh.
> 
>  
> 
> *From:*mesa-dev [mailto:mesa-dev-boun...@lists.freedesktop.org] *On
> Behalf Of *Jason Ekstrand
> *Sent:* Friday, September 8, 2017 9:09 PM
> *To:* Muthukumar, Aravindan 
> *Cc:* mesa-dev@lists.freedesktop.org; J Karanje, Kedar
> 
> *Subject:* Re: [Mesa-dev] [PATCH] i965 : optimized bucket index calculation
> 
>  
> 
> In general, I'm very concerned about how this handles rounding
> behavior.  Almost everywhere, you round down when what you want to do is
> round up.  Also, as I said on IRC, I'd like to see some asserts in
> add_bucket so that we are sure this calculation is correct.  In
> particular, I'd like to see
> 
>  
> 
> assert(bucket_for_size(size) == &bufmgr->cache_bucket[i]);
> 
> assert(bucket_for_size(size - 2048) == &bufmgr->cache_bucket[i]);
> 
> assert(bucket_for_size(size + 1) != &bufmgr->cache_bucket[i]);
> 
>  
> 
> We need to check on both sides of size to be 100% sure we're doing our
> rounding correctly.
> 
>  
> 
> On Fri, Sep 8, 2017 at 1:11 AM,  > wrote:
> 
> From: Aravindan Muthukumar  >
> 
> Avoiding the loop which was running with O(n) complexity.
> Now the complexity has been reduced to O(1)
> 
> Tested with piglit.
> Slight performance improvement (~1%) in 3d mark.
> 
>  
> 
> Which 3dmark test?  Also, what's the error in that 1%?
> 
>  
> 
> Change-Id: Id099f1cd24ad5b691a69070eda79b8f4e9be39a6
> Signed-off-by: Aravindan Muthukumar  >
> Signed-off-by: Kedar Karanje  >
> Reviewed-by: Yogesh Marathe  >
> ---
>  src/mesa/drivers/dri/i965/brw_bufmgr.c | 48
> +-
>  1 file changed, 41 insertions(+), 7 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_bufmgr.c
> b/src/mesa/drivers/dri/i965/brw_bufmgr.c
> index 5b4e784..18cb166 100644
> --- a/src/mesa/drivers/dri/i965/brw_bufmgr.c
> +++ b/src/mesa/drivers/dri/i965/brw_bufmgr.c
> @@ -87,6 +87,11 @@
> 
>  #define memclear(s) memset(&s, 0, sizeof(s))
> 
> +/* Macros for BO cache size */
> +#define CACHE_PAGE_SIZE4096
> 
>  
> 
> Just call this PAGE_SIZE
> 
>  
> 
> +#define PAGE_SIZE_SHIFT12
> +#define BO_CACHE_PAGE_SIZE (4 * CACHE_PAGE_SIZE)
> 
>  
> 
> I think I'd rather we just use 4 * PAGE_SIZE explicitly than have this
> extra #define.  I think it's making things harder to read and not easier.
> 
>  
> 
> +
>  #define FILE_DEBUG_FLAG DEBUG_BUFMGR
> 
>  static inline int
> @@ -181,19 +186,48 @@ bo_tile_pitch(struct brw_bufmgr *bufmgr,
> uint32_t pitch, uint32_t tiling)
> return ALIGN(pitch, tile_width);
>  }
> 
> +/*
> + * This functions is to find the correct bucket fit for the input size.
> + * This function works with O(1) complexity when the requested size
> + * was queried instead of iterating the size through all the buckets.
> + */
>  static struct bo_cache_bucket *
>  bucket_for_size(struct brw_bufmgr *bufmgr, uint64_t size)
>  {
> -   int i;
> +   struct bo_cache_bucket *bucket = NULL;
> +   int x=0,index = -1;
> +   int row, col=0;
> 
> -   for (i = 0; i < bufmgr->num_buckets; i++) {
> -  struct bo_cache_bucket *bucket = &bufmgr->cache_bucket[i];
> -  if (bucket->size >= size) {
> - return

Re: [Mesa-dev] [PATCH] i965 : optimized bucket index calculation

2017-09-12 Thread Ian Romanick

On 09/08/2017 08:38 AM, Jason Ekstrand wrote:
> In general, I'm very concerned about how this handles rounding
> behavior.  Almost everywhere, you round down when what you want to do is
> round up.  Also, as I said on IRC, I'd like to see some asserts in
> add_bucket so that we are sure this calculation is correct.  In
> particular, I'd like to see
> 
> assert(bucket_for_size(size) == &bufmgr->cache_bucket[i]);
> assert(bucket_for_size(size - 2048) == &bufmgr->cache_bucket[i]);
> assert(bucket_for_size(size + 1) != &bufmgr->cache_bucket[i]);
> 
> We need to check on both sides of size to be 100% sure we're doing our
> rounding correctly.
> 
> On Fri, Sep 8, 2017 at 1:11 AM,  > wrote:
> 
> From: Aravindan Muthukumar  >
> 
> Avoiding the loop which was running with O(n) complexity.
> Now the complexity has been reduced to O(1)
> 
> Tested with piglit.
> Slight performance improvement (~1%) in 3d mark.
> 
> 
> Which 3dmark test?  Also, what's the error in that 1%?

And, is that Atom or Core?
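
For reference, the three asserts Jason asks for can be tried against a plain round-up lookup. This is only a sketch: the bucket table is a made-up list of page multiples rather than the real i965 bucket sizes, the lookup is the simple O(n) loop rather than the proposed O(1) calculation, and it returns an index instead of a pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bucket sizes (multiples of a 4096-byte page). */
static const uint64_t toy_bucket_sizes[] = {
    1 * 4096, 2 * 4096, 3 * 4096, 4 * 4096,
    5 * 4096, 6 * 4096, 7 * 4096, 8 * 4096,
};
#define TOY_NUM_BUCKETS \
    (sizeof(toy_bucket_sizes) / sizeof(toy_bucket_sizes[0]))

/* Reference lookup: the smallest bucket whose size is >= the request,
 * i.e. the request is rounded up. Returns the index, or -1 if none fits. */
static inline int toy_bucket_for_size(uint64_t size)
{
    for (size_t i = 0; i < TOY_NUM_BUCKETS; i++) {
        if (toy_bucket_sizes[i] >= size)
            return (int)i;
    }
    return -1;
}

/* The checks from the review, applied to bucket i: the exact size and a
 * slightly smaller size must land in bucket i, one byte more must not. */
static inline void toy_check_bucket(size_t i)
{
    uint64_t size = toy_bucket_sizes[i];

    assert(toy_bucket_for_size(size) == (int)i);
    assert(toy_bucket_for_size(size - 2048) == (int)i);
    assert(toy_bucket_for_size(size + 1) != (int)i);
}
```

A replacement lookup that rounds down instead of up would typically trip the third check, since size + 1 would then still map to bucket i.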

Re: [Mesa-dev] [RFC] NIR serialization

2017-09-12 Thread Ian Romanick

On 09/11/2017 11:17 PM, Kenneth Graunke wrote:
> On Monday, September 11, 2017 9:23:05 PM PDT Ian Romanick wrote:
>> On 09/08/2017 01:59 AM, Kenneth Graunke wrote:
>>> On Thursday, September 7, 2017 4:26:04 PM PDT Jordan Justen wrote:
 On 2017-09-06 14:12:41, Daniel Schürmann wrote:
> Hello together!
> Recently, we had a small discussion (off the list) about the NIR 
> serialization, which was previously discussed in [RFC] ARB_gl_spirv and 
> NIR backend for radeonsi.
>
> As this topic could be interesting to more people, I would like to 
> share, what was talked about so far (You might want to read from bottom 
> up).
>
> TL;DR:
> - NIR serialization is in demand for shader cache
> - could be done either directly (NIR binary form) or via SPIR-V
> - Ian et al. are working on GLSL IR -> SPIR-V transformation, which 
> could be adapted for a NIR -> SPIR-V pass
> - in NIR representation, some type information is lost
> - thus, a serialization via SPIR-V could NOT be a glslang alternative 
> (otoh, the GLSL IR->SPIR-V pass could), but only for spirv-opt (if the 
> output is valid SPIR-V)

 Ian,

 Tim was suggesting that we might look at serializing nir for the i965
 shader cache. Based on this email, it sounds like serialized nir would
 not be enough for the shader cache as some GLSL type info would be
 lost. It sounds like GLSL IR => SPIR-V would be good enough. Is that
 right?

 I don't think we have a strict requirement for the GLSL IR => SPIR-V
 path for GL 4.6, right? So, this is more of a 'nice-to-have'?

 I'm not sure we'd want to make i965 shader cache depend on a
 nice-to-have feature. (Unless we're pretty sure it'll be available
 soon.)

 But, it would be nice to not have to fallback to compiling the GLSL
 for i965 shader cache, so it would be worth waiting a little bit to be
 able to rely on a SPIR-V serialization of the GLSL IR.

 What do you suggest?

 -Jordan
>>>
>>> We shouldn't use SPIR-V for the shader cache.
>>>
>>> The compilation process for GLSL is: GLSL -> GLSL IR -> NIR -> i965 IRs.
>>> Storing the content at one of those points, and later loading it and
>>> resuming the normal compilation process from that point...that's totally
>>> reasonable.
>>>
>>> Having a fallback for "some things in the cache but not all the variants
>>> we needed" suddenly take a different compilation pipeline, i.e. SPIR-V
>>> -> NIR -> ... seems risky.  It's a different compilation path that we
>>> don't normally use.  And one you'd only hit in limited circumstances.
>>> There's a lot of potential for really obscure bugs.
>>
>> Since we're going to expose exactly that path for GL_ARB_gl_spirv / OpenGL
>> 4.6, we'd better make sure it works always.  Right?
> 
> In addition to the old pipeline:
> 
> - GLSL from the app -> GLSL IR -> NIR -> i965 IR
> 
> GL_ARB_gl_spirv and OpenGL 4.6 add a second pipeline:
> 
> - SPIR-V from the app -> NIR -> i965 IR
> 
> Both of those absolutely have to work.  But these:
> 
> - GLSL -> GLSL IR -> NIR -> SPIR-V -> NIR -> i965 IRs
> - GLSL -> GLSL IR -> SPIR-V -> NIR -> i965 IRs
> 
> aren't required to work, or even be supported.  It makes a lot of sense
> to support them - both for testing purposes, and as an alternative to
> glslang, for a broader tooling ecosystem.
> 
> The thing that concerns me is that if you use SPIR-V for the cache, you
> need these paths to not just work, but be _indistinguishable_ from one
> another:
> 
> - GLSL -> GLSL IR -> NIR -> ...
> - GLSL -> GLSL IR -> NIR -> SPIR-V, then SPIR-V -> NIR -> ...
> 
> Otherwise the original compile and partially-cached recompile might have
> different properties.  For example, if the SPIR-V step messes with
> variables or instruction ordering a little, it could trip up the loop
> unroller so the original compile gets unrolled, and the recompile from
> partial cache doesn't get unrolled.  I don't want to have to debug that.

That is a very compelling argument.  If we want Mesa to be an
alternative to glslang, I think we would like to have that property, but
it's not a hard requirement for that use case.

> One could avoid this by making the original compile always go through
> SPIR-V, and just drop glsl_to_nir altogether, so both take the same
> paths.  But...it's kind of an unnecessary step in the common case...

We may eventually partially do that, but that shouldn't block (any)
other work.  In the short term it would likely add compile overhead that
many would find unacceptable... by virtue of being non-zero.

> Just serializing/reading back the NIR and resuming the compile from the
> exact same IR would also solve that problem.
> 
> Or, just being -really- careful with the translator, I guess...
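
A toy illustration of the "serialize and read back the exact same IR" property, using a made-up four-field instruction rather than anything resembling real NIR. The format (a count followed by little-endian dwords) and all names are assumptions for the sake of the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical toy instruction; real NIR is far richer. */
struct toy_instr {
    uint32_t op, dest, src0, src1;
};

static inline void put_u32(uint8_t *out, uint32_t v)
{
    out[0] = (uint8_t)v;
    out[1] = (uint8_t)(v >> 8);
    out[2] = (uint8_t)(v >> 16);
    out[3] = (uint8_t)(v >> 24);
}

static inline uint32_t get_u32(const uint8_t *in)
{
    return (uint32_t)in[0] | ((uint32_t)in[1] << 8) |
           ((uint32_t)in[2] << 16) | ((uint32_t)in[3] << 24);
}

/* Writes count + instructions; returns bytes written, 0 if blob is too small. */
static inline size_t toy_serialize(const struct toy_instr *ir, uint32_t count,
                                   uint8_t *blob, size_t cap)
{
    size_t need = 4 + (size_t)count * 16;
    if (cap < need)
        return 0;
    put_u32(blob, count);
    for (uint32_t i = 0; i < count; i++) {
        uint8_t *p = blob + 4 + (size_t)i * 16;
        put_u32(p + 0, ir[i].op);
        put_u32(p + 4, ir[i].dest);
        put_u32(p + 8, ir[i].src0);
        put_u32(p + 12, ir[i].src1);
    }
    return need;
}

/* Reads instructions back; returns how many, 0 on malformed input. */
static inline uint32_t toy_deserialize(const uint8_t *blob, size_t size,
                                       struct toy_instr *ir, uint32_t max)
{
    if (size < 4)
        return 0;
    uint32_t count = get_u32(blob);
    if (count > max || size < 4 + (size_t)count * 16)
        return 0;
    for (uint32_t i = 0; i < count; i++) {
        const uint8_t *p = blob + 4 + (size_t)i * 16;
        ir[i].op = get_u32(p + 0);
        ir[i].dest = get_u32(p + 4);
        ir[i].src0 = get_u32(p + 8);
        ir[i].src1 = get_u32(p + 12);
    }
    return count;
}
```

Because nothing is reordered or renamed on the way through, a compile resumed from the blob starts from the same IR as the original compile, which is the property a translation through another IR would have to guarantee separately.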
> 
>> One nice thing about SPIR-V is that all of the handling of uniform
>> layouts, initial uniform values, attribute locations, etc. is already
>>

[Mesa-dev] [PATCH 3/3] radv: clear push_constant_stages when resetting a command buffer

2017-09-12 Thread Samuel Pitoiset

Per the spec:

   "Resetting a command buffer is an operation that discards any
   previously recorded commands and puts a command buffer in the
   initial state."

As far as I'm concerned, that flag can be changed by calling
vkCmdPushConstants() (or any other functions which update it),
so it should be cleared as well.

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_cmd_buffer.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index 6a82867c82..d888300677 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -216,6 +216,7 @@ radv_reset_cmd_buffer(struct radv_cmd_buffer *cmd_buffer)
free(up);
}
 
+   cmd_buffer->push_constant_stages = 0;
cmd_buffer->scratch_size_needed = 0;
cmd_buffer->compute_scratch_size_needed = 0;
cmd_buffer->esgs_ring_size_needed = 0;
-- 
2.14.1


[Mesa-dev] [PATCH 2/3] radv: add more radv_emit_XXX() helpers for the dynamic state

2017-09-12 Thread Samuel Pitoiset

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_cmd_buffer.c | 117 ++-
 1 file changed, 77 insertions(+), 40 deletions(-)

diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index 4d33c2f1d2..6a82867c82 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -1013,6 +1013,73 @@ radv_emit_scissor(struct radv_cmd_buffer *cmd_buffer)
   
cmd_buffer->state.pipeline->graphics.ms.pa_sc_mode_cntl_0 | 
S_028A48_VPORT_SCISSOR_ENABLE(count ? 1 : 0));
 }
 
+static void
+radv_emit_line_width(struct radv_cmd_buffer *cmd_buffer)
+{
+   unsigned width = cmd_buffer->state.dynamic.line_width * 8;
+
+   radeon_set_context_reg(cmd_buffer->cs, R_028A08_PA_SU_LINE_CNTL,
+  S_028A08_WIDTH(CLAMP(width, 0, 0xFFF)));
+}
+
+static void
+radv_emit_blend_constants(struct radv_cmd_buffer *cmd_buffer)
+{
+   struct radv_dynamic_state *d = &cmd_buffer->state.dynamic;
+
+   radeon_set_context_reg_seq(cmd_buffer->cs, R_028414_CB_BLEND_RED, 4);
+   radeon_emit_array(cmd_buffer->cs, (uint32_t *)d->blend_constants, 4);
+}
+
+static void
+radv_emit_stencil(struct radv_cmd_buffer *cmd_buffer)
+{
+   struct radv_dynamic_state *d = &cmd_buffer->state.dynamic;
+
+   radeon_set_context_reg_seq(cmd_buffer->cs,
+  R_028430_DB_STENCILREFMASK, 2);
+   radeon_emit(cmd_buffer->cs,
+   S_028430_STENCILTESTVAL(d->stencil_reference.front) |
+   S_028430_STENCILMASK(d->stencil_compare_mask.front) |
+   S_028430_STENCILWRITEMASK(d->stencil_write_mask.front) |
+   S_028430_STENCILOPVAL(1));
+   radeon_emit(cmd_buffer->cs,
+   S_028434_STENCILTESTVAL_BF(d->stencil_reference.back) |
+   S_028434_STENCILMASK_BF(d->stencil_compare_mask.back) |
+   S_028434_STENCILWRITEMASK_BF(d->stencil_write_mask.back) |
+   S_028434_STENCILOPVAL_BF(1));
+}
+
+static void
+radv_emit_depth_bounds(struct radv_cmd_buffer *cmd_buffer)
+{
+   struct radv_dynamic_state *d = &cmd_buffer->state.dynamic;
+
+   radeon_set_context_reg(cmd_buffer->cs, R_028020_DB_DEPTH_BOUNDS_MIN,
+  fui(d->depth_bounds.min));
+   radeon_set_context_reg(cmd_buffer->cs, R_028024_DB_DEPTH_BOUNDS_MAX,
+  fui(d->depth_bounds.max));
+}
+
+static void
+radv_emit_depth_biais(struct radv_cmd_buffer *cmd_buffer)
+{
+   struct radv_raster_state *raster = &cmd_buffer->state.pipeline->graphics.raster;
+   struct radv_dynamic_state *d = &cmd_buffer->state.dynamic;
+   unsigned slope = fui(d->depth_bias.slope * 16.0f);
+   unsigned bias = fui(d->depth_bias.bias * cmd_buffer->state.offset_scale);
+
+   if (G_028814_POLY_OFFSET_FRONT_ENABLE(raster->pa_su_sc_mode_cntl)) {
+   radeon_set_context_reg_seq(cmd_buffer->cs,
+  R_028B7C_PA_SU_POLY_OFFSET_CLAMP, 5);
+   radeon_emit(cmd_buffer->cs, fui(d->depth_bias.clamp)); /* CLAMP */
+   radeon_emit(cmd_buffer->cs, slope); /* FRONT SCALE */
+   radeon_emit(cmd_buffer->cs, bias); /* FRONT OFFSET */
+   radeon_emit(cmd_buffer->cs, slope); /* BACK SCALE */
+   radeon_emit(cmd_buffer->cs, bias); /* BACK OFFSET */
+   }
+}
+
 static void
 radv_emit_fb_color_state(struct radv_cmd_buffer *cmd_buffer,
 int index,
@@ -1370,8 +1437,6 @@ void radv_set_db_count_control(struct radv_cmd_buffer *cmd_buffer)
 static void
 radv_cmd_buffer_flush_dynamic_state(struct radv_cmd_buffer *cmd_buffer)
 {
-   struct radv_dynamic_state *d = &cmd_buffer->state.dynamic;
-
if (G_028810_DX_RASTERIZATION_KILL(cmd_buffer->state.pipeline->graphics.raster.pa_cl_clip_cntl))
return;
 
@@ -1381,52 +1446,24 @@ radv_cmd_buffer_flush_dynamic_state(struct radv_cmd_buffer *cmd_buffer)
if (cmd_buffer->state.dirty & (RADV_CMD_DIRTY_DYNAMIC_SCISSOR | RADV_CMD_DIRTY_DYNAMIC_VIEWPORT))
radv_emit_scissor(cmd_buffer);
 
-   if (cmd_buffer->state.dirty & RADV_CMD_DIRTY_DYNAMIC_LINE_WIDTH) {
-   unsigned width = cmd_buffer->state.dynamic.line_width * 8;
-   radeon_set_context_reg(cmd_buffer->cs, R_028A08_PA_SU_LINE_CNTL,
-  S_028A08_WIDTH(CLAMP(width, 0, 0xFFF)));
-   }
+   if (cmd_buffer->state.dirty & RADV_CMD_DIRTY_DYNAMIC_LINE_WIDTH)
+   radv_emit_line_width(cmd_buffer);
 
-   if (cmd_buffer->state.dirty & RADV_CMD_DIRTY_DYNAMIC_BLEND_CONSTANTS) {
-   radeon_set_context_reg_seq(cmd_buffer->cs, R_028414_CB_BLEND_RED, 4);
-   radeon_emit_array(cmd_buffer->cs, (uint32_t*)d->blend_constants, 4);
-   }
+   if (cmd_buffer->state.dirty & RADV_CMD_DIRTY_DYNAMIC_BLEND_CONSTANTS)

[Mesa-dev] [PATCH 1/3] radv: remove useless 'cmd_buffer' param from radv_buffer_view_init()

2017-09-12 Thread Samuel Pitoiset

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_image.c | 5 ++---
 src/amd/vulkan/radv_meta_blit2d.c   | 2 +-
 src/amd/vulkan/radv_meta_bufimage.c | 2 +-
 src/amd/vulkan/radv_private.h   | 3 +--
 4 files changed, 5 insertions(+), 7 deletions(-)

diff --git a/src/amd/vulkan/radv_image.c b/src/amd/vulkan/radv_image.c
index 9c5767262e..7c84f7dc10 100644
--- a/src/amd/vulkan/radv_image.c
+++ b/src/amd/vulkan/radv_image.c
@@ -1107,8 +1107,7 @@ radv_DestroyImageView(VkDevice _device, VkImageView 
_iview,
 
 void radv_buffer_view_init(struct radv_buffer_view *view,
   struct radv_device *device,
-  const VkBufferViewCreateInfo* pCreateInfo,
-  struct radv_cmd_buffer *cmd_buffer)
+  const VkBufferViewCreateInfo* pCreateInfo)
 {
RADV_FROM_HANDLE(radv_buffer, buffer, pCreateInfo->buffer);
 
@@ -1135,7 +1134,7 @@ radv_CreateBufferView(VkDevice _device,
if (!view)
return vk_error(VK_ERROR_OUT_OF_HOST_MEMORY);
 
-   radv_buffer_view_init(view, device, pCreateInfo, NULL);
+   radv_buffer_view_init(view, device, pCreateInfo);
 
*pView = radv_buffer_view_to_handle(view);
 
diff --git a/src/amd/vulkan/radv_meta_blit2d.c 
b/src/amd/vulkan/radv_meta_blit2d.c
index 05e49fea76..461d097d05 100644
--- a/src/amd/vulkan/radv_meta_blit2d.c
+++ b/src/amd/vulkan/radv_meta_blit2d.c
@@ -82,7 +82,7 @@ create_bview(struct radv_cmd_buffer *cmd_buffer,
  .format = format,
  .offset = src->offset,
  .range = VK_WHOLE_SIZE,
- }, cmd_buffer);
+ });
 
 }
 
diff --git a/src/amd/vulkan/radv_meta_bufimage.c 
b/src/amd/vulkan/radv_meta_bufimage.c
index 91af80c392..96b5c22662 100644
--- a/src/amd/vulkan/radv_meta_bufimage.c
+++ b/src/amd/vulkan/radv_meta_bufimage.c
@@ -893,7 +893,7 @@ create_bview(struct radv_cmd_buffer *cmd_buffer,
  .format = format,
  .offset = offset,
  .range = VK_WHOLE_SIZE,
- }, cmd_buffer);
+ });
 
 }
 
diff --git a/src/amd/vulkan/radv_private.h b/src/amd/vulkan/radv_private.h
index 191a81f77a..e5092a8923 100644
--- a/src/amd/vulkan/radv_private.h
+++ b/src/amd/vulkan/radv_private.h
@@ -1321,8 +1321,7 @@ struct radv_buffer_view {
 };
 void radv_buffer_view_init(struct radv_buffer_view *view,
   struct radv_device *device,
-  const VkBufferViewCreateInfo* pCreateInfo,
-  struct radv_cmd_buffer *cmd_buffer);
+  const VkBufferViewCreateInfo* pCreateInfo);
 
 static inline struct VkExtent3D
 radv_sanitize_image_extent(const VkImageType imageType,
-- 
2.14.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [RFC] NIR serialization

2017-09-12 Thread Ian Romanick

On 09/11/2017 09:44 PM, Timothy Arceri wrote:
> On 12/09/17 14:23, Ian Romanick wrote:
>> On 09/08/2017 01:59 AM, Kenneth Graunke wrote:
>>> On Thursday, September 7, 2017 4:26:04 PM PDT Jordan Justen wrote:
>>>> On 2017-09-06 14:12:41, Daniel Schürmann wrote:
>>>>> Hello together!
>>>>> Recently, we had a small discussion (off the list) about the NIR
>>>>> serialization, which was previously discussed in [RFC] ARB_gl_spirv and
>>>>> NIR backend for radeonsi.
>>>>>
>>>>> As this topic could be interesting to more people, I would like to
>>>>> share, what was talked about so far (You might want to read from
>>>>> bottom up).
>>>>>
>>>>> TL;DR:
>>>>> - NIR serialization is in demand for shader cache
>>>>> - could be done either directly (NIR binary form) or via SPIR-V
>>>>> - Ian et al. are working on GLSL IR -> SPIR-V transformation, which
>>>>> could be adapted for a NIR -> SPIR-V pass
>>>>> - in NIR representation, some type information is lost
>>>>> - thus, a serialization via SPIR-V could NOT be a glslang alternative
>>>>> (otoh, the GLSL IR->SPIR-V pass could), but only for spirv-opt (if the
>>>>> output is valid SPIR-V)
>>>>
>>>> Ian,
>>>>
>>>> Tim was suggesting that we might look at serializing nir for the i965
>>>> shader cache. Based on this email, it sounds like serialized nir would
>>>> not be enough for the shader cache as some GLSL type info would be
>>>> lost. It sounds like GLSL IR => SPIR-V would be good enough. Is that
>>>> right?
>>>>
>>>> I don't think we have a strict requirement for the GLSL IR => SPIR-V
>>>> path for GL 4.6, right? So, this is more of a 'nice-to-have'?
>>>>
>>>> I'm not sure we'd want to make i965 shader cache depend on a
>>>> nice-to-have feature. (Unless we're pretty sure it'll be available
>>>> soon.)
>>>>
>>>> But, it would be nice to not have to fallback to compiling the GLSL
>>>> for i965 shader cache, so it would be worth waiting a little bit to be
>>>> able to rely on a SPIR-V serialization of the GLSL IR.
>>>>
>>>> What do you suggest?
>>>>
>>>> -Jordan
>>>
>>> We shouldn't use SPIR-V for the shader cache.
>>>
>>> The compilation process for GLSL is: GLSL -> GLSL IR -> NIR -> i965 IRs.
>>> Storing the content at one of those points, and later loading it and
>>> resuming the normal compilation process from that point...that's totally
>>> reasonable.
>>>
>>> Having a fallback for "some things in the cache but not all the variants
>>> we needed" suddenly take a different compilation pipeline, i.e. SPIR-V
>>> -> NIR -> ... seems risky.  It's a different compilation path that we
>>> don't normally use.  And one you'd only hit in limited circumstances.
>>> There's a lot of potential for really obscure bugs.
>>
>> Since we're going to expose exactly that path for GL_ARB_gl_spirv / OpenGL
>> 4.6, we'd better make sure it works always.  Right?
>>
>> One nice thing about SPIR-V is that all of the handling of uniform
>> layouts, initial uniform values, attribute locations, etc. is already
>> serialized.  If I'm not mistaken, that was one of the big pain points
>> for all of the existing on-disk storage methods.  All of that has been
>> sorted out for SPIR-V, and we have to make it work anyway.
> 
> Correct, these are the main issues for the fallback path; however, this is
> only used by i965 (exactly because an intermediate cache is missing).
> Using SPIR-V as the intermediate cache means we still need to convert to
> NIR and run all the opts, so I don't really see the advantage of caching
> to SPIR-V over NIR.

The advantage is that we have N code paths instead of N+1.  Maintenance
is the biggest cost in software development.

> Also there is going to be a requirement for a NIR cache for any of the
> Gallium nir based drivers (which possibly includes radeonsi in future).
> 
>>> Serializing NIR, and possibly a few auxiliary structures that we need,
>>> seems reasonable.  Although, just using the GLSL seemed reasonable to
>>> me as well, but I guess that's proven to be painful?
>>>
>>> --Ken

[Mesa-dev] [PATCH 2/3] glsl: Rename ir_constant::array_elements to ::const_elements

2017-09-12 Thread Ian Romanick

From: Ian Romanick 

The next patch will unify ::array_elements and ::components, so the
name ::array_elements wouldn't be appropriate.  A lot of things use
the names array_elements and components, so grepping for either is
pretty useless.

Signed-off-by: Ian Romanick 
---
 src/compiler/glsl/glsl_to_nir.cpp  |  2 +-
 src/compiler/glsl/ir.cpp   | 24 +++---
 src/compiler/glsl/ir.h |  2 +-
 src/compiler/glsl/ir_clone.cpp |  4 ++--
 src/compiler/glsl/link_uniform_initializers.cpp|  8 
 .../glsl/tests/uniform_initializer_utils.cpp   |  4 ++--
 src/mesa/program/ir_to_mesa.cpp|  2 +-
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp |  2 +-
 8 files changed, 24 insertions(+), 24 deletions(-)

diff --git a/src/compiler/glsl/glsl_to_nir.cpp b/src/compiler/glsl/glsl_to_nir.cpp
index bb2ba17..f3cf74d 100644
--- a/src/compiler/glsl/glsl_to_nir.cpp
+++ b/src/compiler/glsl/glsl_to_nir.cpp
@@ -302,7 +302,7 @@ constant_copy(ir_constant *ir, void *mem_ctx)
   ret->num_elements = ir->type->length;
 
   for (i = 0; i < ir->type->length; i++)
- ret->elements[i] = constant_copy(ir->array_elements[i], mem_ctx);
+ ret->elements[i] = constant_copy(ir->const_elements[i], mem_ctx);
   break;
 
default:
diff --git a/src/compiler/glsl/ir.cpp b/src/compiler/glsl/ir.cpp
index 52ca836..c223ec6 100644
--- a/src/compiler/glsl/ir.cpp
+++ b/src/compiler/glsl/ir.cpp
@@ -627,14 +627,14 @@ ir_expression::variable_referenced() const
 ir_constant::ir_constant()
: ir_rvalue(ir_type_constant)
 {
-   this->array_elements = NULL;
+   this->const_elements = NULL;
 }
 
 ir_constant::ir_constant(const struct glsl_type *type,
 const ir_constant_data *data)
: ir_rvalue(ir_type_constant)
 {
-   this->array_elements = NULL;
+   this->const_elements = NULL;
 
assert((type->base_type >= GLSL_TYPE_UINT)
  && (type->base_type <= GLSL_TYPE_IMAGE));
@@ -737,7 +737,7 @@ ir_constant::ir_constant(bool b, unsigned vector_elements)
 ir_constant::ir_constant(const ir_constant *c, unsigned i)
: ir_rvalue(ir_type_constant)
 {
-   this->array_elements = NULL;
+   this->const_elements = NULL;
this->type = c->type->get_base_type();
 
switch (this->type->base_type) {
@@ -753,19 +753,19 @@ ir_constant::ir_constant(const ir_constant *c, unsigned i)
 ir_constant::ir_constant(const struct glsl_type *type, exec_list *value_list)
: ir_rvalue(ir_type_constant)
 {
-   this->array_elements = NULL;
+   this->const_elements = NULL;
this->type = type;
 
assert(type->is_scalar() || type->is_vector() || type->is_matrix()
  || type->is_record() || type->is_array());
 
if (type->is_array()) {
-  this->array_elements = ralloc_array(this, ir_constant *, type->length);
+  this->const_elements = ralloc_array(this, ir_constant *, type->length);
   unsigned i = 0;
   foreach_in_list(ir_constant, value, value_list) {
 assert(value->as_constant() != NULL);
 
-this->array_elements[i++] = value;
+this->const_elements[i++] = value;
   }
   return;
}
@@ -924,10 +924,10 @@ ir_constant::zero(void *mem_ctx, const glsl_type *type)
memset(&c->value, 0, sizeof(c->value));
 
if (type->is_array()) {
-  c->array_elements = ralloc_array(c, ir_constant *, type->length);
+  c->const_elements = ralloc_array(c, ir_constant *, type->length);
 
   for (unsigned i = 0; i < type->length; i++)
-c->array_elements[i] = ir_constant::zero(c, type->fields.array);
+c->const_elements[i] = ir_constant::zero(c, type->fields.array);
}
 
if (type->is_record()) {
@@ -1100,7 +1100,7 @@ ir_constant::get_array_element(unsigned i) const
else if (i >= this->type->length)
   i = this->type->length - 1;
 
-   return array_elements[i];
+   return const_elements[i];
 }
 
 ir_constant *
@@ -1181,7 +1181,7 @@ ir_constant::copy_offset(ir_constant *src, int offset)
case GLSL_TYPE_ARRAY: {
   assert (src->type == this->type);
   for (unsigned i = 0; i < this->type->length; i++) {
-this->array_elements[i] = src->array_elements[i]->clone(this, NULL);
+this->const_elements[i] = src->const_elements[i]->clone(this, NULL);
   }
   break;
}
@@ -1243,7 +1243,7 @@ ir_constant::has_value(const ir_constant *c) const
 
if (this->type->is_array()) {
   for (unsigned i = 0; i < this->type->length; i++) {
-if (!this->array_elements[i]->has_value(c->array_elements[i]))
+if (!this->const_elements[i]->has_value(c->const_elements[i]))
return false;
   }
   return true;
@@ -1976,7 +1976,7 @@ steal_memory(ir_instruction *ir, void *new_ctx)
 }
   } else if (constant->type->is_array()) {
 for (unsigned int i = 0; i < constant->type->length; i++) {
-

[Mesa-dev] [PATCH 3/3] glsl: Unify ir_constant::const_elements and ::components

2017-09-12 Thread Ian Romanick

From: Ian Romanick 

There was no reason to treat array types and record types differently.
Unifying them saves a bunch of code and saves a few bytes in every
ir_constant.

Signed-off-by: Ian Romanick 
---
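A minimal sketch of the idea, not the Mesa code: arrays and records both keep
their children in a single pointer array indexed by position, so a field
lookup becomes a bounds-checked index instead of a list walk. The names
`Constant`, `Kind`, `get_element` and `make_zero_aggregate` are hypothetical,
and std::vector stands in for the ralloc-allocated array used in the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for a constant IR node; this is not ir_constant.
enum class Kind { Scalar, Array, Record };

struct Constant {
   Kind kind = Kind::Scalar;
   int value = 0;                                   // used by scalars only
   std::vector<std::unique_ptr<Constant>> elements; // arrays AND records

   // One accessor serves array elements and record fields alike.
   const Constant *get_element(std::size_t idx) const
   {
      assert(kind == Kind::Array || kind == Kind::Record);
      assert(idx < elements.size());
      return elements[idx].get();
   }
};

// Build an aggregate holding `length` zero-initialised scalar children.
inline std::unique_ptr<Constant>
make_zero_aggregate(Kind kind, std::size_t length)
{
   std::unique_ptr<Constant> c(new Constant());
   c->kind = kind;
   for (std::size_t i = 0; i < length; i++)
      c->elements.push_back(std::unique_ptr<Constant>(new Constant()));
   return c;
}
```

With one storage scheme, code that copies or compares aggregates only needs a
single loop over `elements`, which is where most of the deleted lines in the
diffstat below would come from.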
 src/compiler/glsl/glsl_to_nir.cpp   | 11 ---
 src/compiler/glsl/ir.cpp| 92 +
 src/compiler/glsl/ir.h  |  5 +-
 src/compiler/glsl/ir_clone.cpp  | 16 +
 src/compiler/glsl/ir_print_visitor.cpp  |  5 +-
 src/compiler/glsl/link_uniform_initializers.cpp |  8 +--
 src/mesa/program/ir_to_mesa.cpp |  3 +-
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp  |  3 +-
 8 files changed, 28 insertions(+), 115 deletions(-)

diff --git a/src/compiler/glsl/glsl_to_nir.cpp b/src/compiler/glsl/glsl_to_nir.cpp
index f3cf74d..99df6e0 100644
--- a/src/compiler/glsl/glsl_to_nir.cpp
+++ b/src/compiler/glsl/glsl_to_nir.cpp
@@ -285,17 +285,6 @@ constant_copy(ir_constant *ir, void *mem_ctx)
   break;
 
case GLSL_TYPE_STRUCT:
-  ret->elements = ralloc_array(mem_ctx, nir_constant *,
-   ir->type->length);
-  ret->num_elements = ir->type->length;
-
-  i = 0;
-  foreach_in_list(ir_constant, field, &ir->components) {
- ret->elements[i] = constant_copy(field, mem_ctx);
- i++;
-  }
-  break;
-
case GLSL_TYPE_ARRAY:
   ret->elements = ralloc_array(mem_ctx, nir_constant *,
ir->type->length);
diff --git a/src/compiler/glsl/ir.cpp b/src/compiler/glsl/ir.cpp
index c223ec6..49db56e 100644
--- a/src/compiler/glsl/ir.cpp
+++ b/src/compiler/glsl/ir.cpp
@@ -759,7 +759,12 @@ ir_constant::ir_constant(const struct glsl_type *type, exec_list *value_list)
assert(type->is_scalar() || type->is_vector() || type->is_matrix()
  || type->is_record() || type->is_array());
 
-   if (type->is_array()) {
+   /* If the constant is a record, the types of each of the entries in
+* value_list must be a 1-for-1 match with the structure components.  Each
+* entry must also be a constant.  Just move the nodes from the value_list
+* to the list in the ir_constant.
+*/
+   if (type->is_array() || type->is_record()) {
   this->const_elements = ralloc_array(this, ir_constant *, type->length);
   unsigned i = 0;
   foreach_in_list(ir_constant, value, value_list) {
@@ -770,20 +775,6 @@ ir_constant::ir_constant(const struct glsl_type *type, exec_list *value_list)
   return;
}
 
-   /* If the constant is a record, the types of each of the entries in
-* value_list must be a 1-for-1 match with the structure components.  Each
-* entry must also be a constant.  Just move the nodes from the value_list
-* to the list in the ir_constant.
-*/
-   /* FINISHME: Should there be some type checking and / or assertions here? */
-   /* FINISHME: Should the new constant take ownership of the nodes from
-* FINISHME: value_list, or should it make copies?
-*/
-   if (type->is_record()) {
-  value_list->move_nodes_to(& this->components);
-  return;
-   }
-
for (unsigned i = 0; i < 16; i++) {
   this->value.u[i] = 0;
}
@@ -931,9 +922,11 @@ ir_constant::zero(void *mem_ctx, const glsl_type *type)
}
 
if (type->is_record()) {
+  c->const_elements = ralloc_array(c, ir_constant *, type->length);
+
   for (unsigned i = 0; i < type->length; i++) {
-ir_constant *comp = ir_constant::zero(mem_ctx, type->fields.structure[i].type);
-c->components.push_tail(comp);
+ c->const_elements[i] =
+ir_constant::zero(mem_ctx, type->fields.structure[i].type);
   }
}
 
@@ -1106,24 +1099,10 @@ ir_constant::get_array_element(unsigned i) const
 ir_constant *
 ir_constant::get_record_field(int idx)
 {
-   if (idx < 0)
-  return NULL;
-
-   if (this->components.is_empty())
-  return NULL;
-
-   exec_node *node = this->components.get_head_raw();
-   for (int i = 0; i < idx; i++) {
-  node = node->next;
+   assert(this->type->is_record());
+   assert(idx >= 0 && idx < this->type->length);
 
-  /* If the end of the list is encountered before the element matching the
-   * requested field is found, return NULL.
-   */
-  if (node->is_tail_sentinel())
-return NULL;
-   }
-
-   return (ir_constant *) node;
+   return const_elements[idx];
 }
 
 void
@@ -1169,15 +1148,7 @@ ir_constant::copy_offset(ir_constant *src, int offset)
   break;
}
 
-   case GLSL_TYPE_STRUCT: {
-  assert (src->type == this->type);
-  this->components.make_empty();
-  foreach_in_list(ir_constant, orig, &src->components) {
-this->components.push_tail(orig->clone(this, NULL));
-  }
-  break;
-   }
-
+   case GLSL_TYPE_STRUCT:
case GLSL_TYPE_ARRAY: {
   assert (src->type == this->type);
   for (unsigned i = 0; i < this->type->length; i++) {

[Mesa-dev] [PATCH 1/3] glsl: Silence unused parameter warnings

2017-09-12 Thread Ian Romanick

From: Ian Romanick 

glsl/ast_type.cpp: In function ‘void merge_bindless_qualifier(YYLTYPE*, _mesa_glsl_parse_state*, const ast_type_qualifier&, const ast_type_qualifier&)’:
glsl/ast_type.cpp:189:35: warning: unused parameter ‘loc’ [-Wunused-parameter]
 merge_bindless_qualifier(YYLTYPE *loc,
   ^~~
glsl/ast_type.cpp:191:52: warning: unused parameter ‘qualifier’ [-Wunused-parameter]
  const ast_type_qualifier &qualifier,
^
glsl/ast_type.cpp:192:52: warning: unused parameter ‘new_qualifier’ [-Wunused-parameter]
  const ast_type_qualifier &new_qualifier)
^

glsl/ir_constant_expression.cpp: In member function ‘virtual ir_constant* 
ir_rvalue::constant_expression_value(void*, hash_table*)’:
glsl/ir_constant_expression.cpp:512:44: warning: unused parameter ‘mem_ctx’ 
[-Wunused-parameter]
 ir_rvalue::constant_expression_value(void *mem_ctx, struct hash_table *)
^~~
glsl/ir_constant_expression.cpp: In member function ‘virtual ir_constant* 
ir_texture::constant_expression_value(void*, hash_table*)’:
glsl/ir_constant_expression.cpp:705:45: warning: unused parameter ‘mem_ctx’ 
[-Wunused-parameter]
 ir_texture::constant_expression_value(void *mem_ctx, struct hash_table *)
 ^~~
glsl/ir_constant_expression.cpp: In member function ‘virtual ir_constant* 
ir_assignment::constant_expression_value(void*, hash_table*)’:
glsl/ir_constant_expression.cpp:851:48: warning: unused parameter ‘mem_ctx’ 
[-Wunused-parameter]
 ir_assignment::constant_expression_value(void *mem_ctx, struct hash_table *)
^~~
glsl/ir_constant_expression.cpp: In member function ‘virtual ir_constant* 
ir_constant::constant_expression_value(void*, hash_table*)’:
glsl/ir_constant_expression.cpp:859:46: warning: unused parameter ‘mem_ctx’ 
[-Wunused-parameter]
 ir_constant::constant_expression_value(void *mem_ctx, struct hash_table *)
  ^~~

glsl/linker.cpp: In function ‘void 
link_xfb_stride_layout_qualifiers(gl_context*, gl_shader_program*, 
gl_linked_shader*, gl_shader**, unsigned int)’:
glsl/linker.cpp:1655:60: warning: unused parameter ‘linked_shader’ 
[-Wunused-parameter]
   struct gl_linked_shader *linked_shader,
^
glsl/linker.cpp: In function ‘void 
link_bindless_layout_qualifiers(gl_shader_program*, gl_program*, gl_shader**, 
unsigned int)’:
glsl/linker.cpp:1693:52: warning: unused parameter ‘gl_prog’ 
[-Wunused-parameter]
 struct gl_program *gl_prog,
^~~

glsl/lower_distance.cpp: In member function ‘virtual void 
{anonymous}::lower_distance_visitor_counter::handle_rvalue(ir_rvalue**)’:
glsl/lower_distance.cpp:652:59: warning: unused parameter ‘rv’ 
[-Wunused-parameter]
 lower_distance_visitor_counter::handle_rvalue(ir_rvalue **rv)
   ^~

glsl/opt_array_splitting.cpp: In member function ‘virtual ir_visitor_status 
{anonymous}::ir_array_reference_visitor::visit_leave(ir_assignment*)’:
glsl/opt_array_splitting.cpp:198:56: warning: unused parameter ‘ir’ 
[-Wunused-parameter]
 ir_array_reference_visitor::visit_leave(ir_assignment *ir)
^~

glsl/glsl_parser_extras.cpp: In function ‘void 
assign_subroutine_indexes(gl_shader*, _mesa_glsl_parse_state*)’:
glsl/glsl_parser_extras.cpp:1869:45: warning: unused parameter ‘sh’ 
[-Wunused-parameter]
 assign_subroutine_indexes(struct gl_shader *sh,
 ^~

Signed-off-by: Ian Romanick 
---
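For readers unfamiliar with the warning, a small sketch of the two usual
fixes follows. It is illustrative only: `rvalue_base`, `no_constant`,
`constant_value` and `merge_flags` are made-up names, not Mesa code. Where a
signature is fixed by a virtual interface, the parameter name is dropped;
where the function is a private helper, the parameter itself can be removed
along with the argument at each call site.

```cpp
#include <cassert>

struct hash_table; // opaque; only ever used through a pointer here

struct rvalue_base {
   virtual ~rvalue_base() {}
   // The interface keeps both parameters because other overrides use them.
   virtual int constant_value(void *mem_ctx, hash_table *variables) = 0;
};

struct no_constant : rvalue_base {
   // Parameters left unnamed: same signature, no -Wunused-parameter.
   int constant_value(void *, hash_table *) override { return -1; }
};

// Helper with no fixed interface: the unused parameters are simply gone.
inline int merge_flags(int state) { return state | 1; }
```

Leaving a parameter unnamed keeps the override compatible with the base
class, which is why that form suits the virtual functions, while free
functions can shed the parameter entirely.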
 src/compiler/glsl/ast_type.cpp   | 7 ++-
 src/compiler/glsl/glsl_parser_extras.cpp | 5 ++---
 src/compiler/glsl/ir_constant_expression.cpp | 8 
 src/compiler/glsl/linker.cpp | 7 ++-
 src/compiler/glsl/lower_distance.cpp | 2 +-
 src/compiler/glsl/opt_array_splitting.cpp| 2 +-
 6 files changed, 12 insertions(+), 19 deletions(-)

diff --git a/src/compiler/glsl/ast_type.cpp b/src/compiler/glsl/ast_type.cpp
index ee8697b..e9d60de 100644
--- a/src/compiler/glsl/ast_type.cpp
+++ b/src/compiler/glsl/ast_type.cpp
@@ -186,10 +186,7 @@ validate_point_mode(MAYBE_UNUSED const ast_type_qualifier &qualifier,
 }
 
 static void
-merge_bindless_qualifier(YYLTYPE *loc,
- _mesa_glsl_parse_state *state,
- const ast_type_qualifier &qualifier,
- const ast_type_qualifier &new_qualifier)
+merge_bindless_qualifier(_mesa_glsl_parse_state *state)
 {
if

[Mesa-dev] [Bug 102682] vblank_mode ignored from ~/.drirc

2017-09-12 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=102682

Bug ID: 102682
   Summary: vblank_mode ignored from ~/.drirc
   Product: Mesa
   Version: 17.2
  Hardware: Other
OS: All
Status: NEW
  Severity: normal
  Priority: medium
 Component: Mesa core
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: bugs.freedesk...@haasn.xyz
QA Contact: mesa-dev@lists.freedesktop.org

Various sources (e.g. IRC) have led me to believe that `vblank_mode` should be
a settable option via ~/.drirc; however, it doesn't seem like this is the case
to me. A demonstration:









I made this from the ~/.drirc skeleton created by `driconf`. Curiously, the
`vblank_mode` setting was also absent from that skeleton, but I added it
manually. (It's also absent from the GUI - although a friend of mine says he
has the option in his driconf interface!) As expected, this option seems to
have no effect - I still get vsync in all tested applications (e.g. glxgears).

I can override it manually by using the environment variable vblank_mode, e.g.
`vblank_mode=0 glxgears`. This *does* work, although it doesn't explain why
this option is seemingly ignored from ~/.drirc. More worrying is the fact that
the actual source code (dri3_set_swap_interval) uses configQuery("vblank_mode")
to get this value, so it should pull it from both XML and the environment,
right?
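
The expectation can be sketched as follows. This is an assumption about the
intended lookup order, not code taken from Mesa: `OptionCache`, `from_file`,
`defaults` and `query` are hypothetical names, and the precedence shown
(environment first, then the parsed config file, then the built-in default)
is what the paragraph above supposes rather than confirmed behaviour.

```cpp
#include <cstdlib>
#include <map>
#include <string>

// Hypothetical option cache; values parsed from ~/.drirc would land in
// `from_file`, compiled-in defaults in `defaults`.
struct OptionCache {
   std::map<std::string, int> from_file;
   std::map<std::string, int> defaults;

   // Assumed precedence: environment, then config file, then default.
   int query(const std::string &name) const
   {
      if (const char *env = std::getenv(name.c_str()))
         return std::atoi(env);

      std::map<std::string, int>::const_iterator it = from_file.find(name);
      if (it != from_file.end())
         return it->second;

      return defaults.at(name); // throws if the option is unknown
   }
};
```

Under that model a `vblank_mode` entry in the file should take effect
whenever the environment variable is unset, which is not what is observed
here.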

If this is not expected behavior, I can try stepping through the `configQuery`
function in gdb; but I thought I'd ask first in case this is a known or easily
explained issue.

Mesa version is 17.2.0, kernel version is 4.12.4, device is a Sapphire RX 560.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.

Re: [Mesa-dev] Is there a mesa Docker image available like on travis-ci? (was Re: [PATCH 1/2] gallium/targets/dri: Add libunwind to linker flags)

2017-09-12 Thread Eric Engestrom

On Tuesday, 2017-09-12 17:26:43 +0200, Gert Wollny wrote:
> On Tuesday, 12.09.2017, 12:26 +0100, Emil Velikov wrote:
> > On 12 September 2017 at 11:17, Gert Wollny 
> > wrote:
> > > 
> > > 
> > > Is there a docker image available that resembles the travis-ci
> > > build environment as it is set up in mesa/.travis.yml (i.e.
> > > Ubuntu/Trusty with all the manually compiled packages)?
> > > 
> > 
> > There isn't one that I know of. They are using stock Ubuntu (afaict)
> > so one should be able to reproduce/create one.
> I tried to avoid that ;) 
> 
> > 
> > That aside, something like the following should help - do tweak as
> > applicable. Don't forget to trim down the targets - no need to build
> > unrelated stuff ;-)
> 
> Thanks to your suggestion I was able to get some output [1], but the
> link command actually contains -lunwind and it is listed after
> libgallium.a, so I have no idea what's going wrong. 
> 
> Maybe it's a bug in that libtool version ... 

I wouldn't be surprised; try slibtool? We've had to use it internally
instead of libtool to build mesa for a while now.

> 
> 
> 
> 
> [1] https://travis-ci.org/gerddie/mesa/jobs/274632768

[Mesa-dev] [Bug 102677] [OpenGL CTS] KHR-GL45.CommonBugs.CommonBug_PerVertexValidation fails

2017-09-12 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=102677

Bug ID: 102677
   Summary: [OpenGL CTS] KHR-GL45.CommonBugs.CommonBug_PerVertexValidation fails
   Product: Mesa
   Version: unspecified
  Hardware: Other
OS: All
Status: NEW
  Severity: normal
  Priority: medium
 Component: glsl-compiler
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: kenn...@whitecape.org
QA Contact: intel-3d-b...@lists.freedesktop.org
Blocks: 102590

This test links together separable programs with incompatible or missing
gl_PerVertex block declarations.  It expects a linker error.  We allow the
link.


Referenced Bugs:

https://bugs.freedesktop.org/show_bug.cgi?id=102590
[Bug 102590] [OpenGL CTS] List of open issues
-- 
You are receiving this mail because:
You are the assignee for the bug.
