[Mesa-dev] [PATCH] panfrost/midgard: Implement fpow

2019-03-12 Thread Alyssa Rosenzweig
We have a native op for this, which was just found in a disassembly --
so instead of lowering, use it!

Signed-off-by: Alyssa Rosenzweig 
---
 src/gallium/drivers/panfrost/midgard/helpers.h | 1 +
 src/gallium/drivers/panfrost/midgard/midgard.h | 2 ++
 src/gallium/drivers/panfrost/midgard/midgard_compile.c | 1 +
 src/gallium/drivers/panfrost/midgard/midgard_compile.h | 1 -
 4 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/panfrost/midgard/helpers.h 
b/src/gallium/drivers/panfrost/midgard/helpers.h
index b597e516f20..606eb9982e7 100644
--- a/src/gallium/drivers/panfrost/midgard/helpers.h
+++ b/src/gallium/drivers/panfrost/midgard/helpers.h
@@ -228,6 +228,7 @@ static unsigned alu_opcode_props[256] = {
 [midgard_alu_op_frcp]   = UNIT_VLUT,
 [midgard_alu_op_frsqrt] = UNIT_VLUT,
 [midgard_alu_op_fsqrt]  = UNIT_VLUT,
+[midgard_alu_op_fpow]   = UNIT_VLUT,
 [midgard_alu_op_fexp2]  = UNIT_VLUT,
 [midgard_alu_op_flog2]  = UNIT_VLUT,
 
diff --git a/src/gallium/drivers/panfrost/midgard/midgard.h 
b/src/gallium/drivers/panfrost/midgard/midgard.h
index f6b83d30f5d..39b1df5d915 100644
--- a/src/gallium/drivers/panfrost/midgard/midgard.h
+++ b/src/gallium/drivers/panfrost/midgard/midgard.h
@@ -108,6 +108,7 @@ typedef enum {
 midgard_alu_op_fcsel  = 0xC5,
 midgard_alu_op_fround = 0xC6,
 midgard_alu_op_fatan_pt2  = 0xE8,
+midgard_alu_op_fpow   = 0xEC,
 midgard_alu_op_frcp   = 0xF0,
 midgard_alu_op_frsqrt = 0xF2,
 midgard_alu_op_fsqrt  = 0xF3,
@@ -462,6 +463,7 @@ static char *alu_opcode_names[256] = {
 [midgard_alu_op_frcp]   = "frcp",
 [midgard_alu_op_frsqrt] = "frsqrt",
 [midgard_alu_op_fsqrt]  = "fsqrt",
+[midgard_alu_op_fpow]   = "fpow",
 [midgard_alu_op_fexp2]  = "fexp2",
 [midgard_alu_op_flog2]  = "flog2",
 [midgard_alu_op_fsin]   = "fsin",
diff --git a/src/gallium/drivers/panfrost/midgard/midgard_compile.c 
b/src/gallium/drivers/panfrost/midgard/midgard_compile.c
index a6c4b4498da..7f3b6997ca0 100644
--- a/src/gallium/drivers/panfrost/midgard/midgard_compile.c
+++ b/src/gallium/drivers/panfrost/midgard/midgard_compile.c
@@ -990,6 +990,7 @@ emit_alu(compiler_context *ctx, nir_alu_instr *instr)
 ALU_CASE(frcp, frcp);
 ALU_CASE(frsq, frsqrt);
 ALU_CASE(fsqrt, fsqrt);
+ALU_CASE(fpow, fpow);
 ALU_CASE(fexp2, fexp2);
 ALU_CASE(flog2, flog2);
 
diff --git a/src/gallium/drivers/panfrost/midgard/midgard_compile.h 
b/src/gallium/drivers/panfrost/midgard/midgard_compile.h
index 2fcbc317164..6225d041c07 100644
--- a/src/gallium/drivers/panfrost/midgard/midgard_compile.h
+++ b/src/gallium/drivers/panfrost/midgard/midgard_compile.h
@@ -62,7 +62,6 @@ midgard_compile_shader_nir(nir_shader *nir, midgard_program 
*program, bool is_bl
 static const nir_shader_compiler_options midgard_nir_options = {
 .lower_ffma = true,
 .lower_sub = true,
-.lower_fpow = true,
 .lower_scmp = true,
 .lower_flrp32 = true,
 .lower_flrp64 = true,
-- 
2.20.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v2 9/9] anv: Enabled the VK_EXT_sample_locations extension

2019-03-12 Thread Sagar Ghuge
Reviewed-by: Sagar Ghuge 

On 3/12/19 3:34 AM, Eleni Maria Stea wrote:
> Enabled the VK_EXT_sample_locations for Intel Gen >= 7.
> 
> v2: Replaced device.info->gen >= 7 with True, as Anv doesn't support
> anything below Gen7. (Lionel Landwerlin)
> ---
>  src/intel/vulkan/anv_extensions.py | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/src/intel/vulkan/anv_extensions.py 
> b/src/intel/vulkan/anv_extensions.py
> index 9e4e03e46df..5a30c733c5c 100644
> --- a/src/intel/vulkan/anv_extensions.py
> +++ b/src/intel/vulkan/anv_extensions.py
> @@ -129,7 +129,7 @@ EXTENSIONS = [
>  Extension('VK_EXT_inline_uniform_block',  1, True),
>  Extension('VK_EXT_pci_bus_info',  2, True),
>  Extension('VK_EXT_post_depth_coverage',   1, 
> 'device->info.gen >= 9'),
> -Extension('VK_EXT_sample_locations',  1, False),
> +Extension('VK_EXT_sample_locations',  1, True),
>  Extension('VK_EXT_sampler_filter_minmax', 1, 
> 'device->info.gen >= 9'),
>  Extension('VK_EXT_scalar_block_layout',   1, True),
>  Extension('VK_EXT_shader_stencil_export', 1, 
> 'device->info.gen >= 9'),
> 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v2 1/9] anv: Added the VK_EXT_sample_locations extension to the anv_extensions list

2019-03-12 Thread Sagar Ghuge
Reviewed-by: Sagar Ghuge 

On 3/12/19 3:34 AM, Eleni Maria Stea wrote:
> Added the VK_EXT_sample_locations to the anv_extensions.py list to
> generate the related entrypoints.
> ---
>  src/intel/vulkan/anv_extensions.py | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/src/intel/vulkan/anv_extensions.py 
> b/src/intel/vulkan/anv_extensions.py
> index 6fff293dee4..9e4e03e46df 100644
> --- a/src/intel/vulkan/anv_extensions.py
> +++ b/src/intel/vulkan/anv_extensions.py
> @@ -129,6 +129,7 @@ EXTENSIONS = [
>  Extension('VK_EXT_inline_uniform_block',  1, True),
>  Extension('VK_EXT_pci_bus_info',  2, True),
>  Extension('VK_EXT_post_depth_coverage',   1, 
> 'device->info.gen >= 9'),
> +Extension('VK_EXT_sample_locations',  1, False),
>  Extension('VK_EXT_sampler_filter_minmax', 1, 
> 'device->info.gen >= 9'),
>  Extension('VK_EXT_scalar_block_layout',   1, True),
>  Extension('VK_EXT_shader_stencil_export', 1, 
> 'device->info.gen >= 9'),
> 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] ac/nir_to_llvm: dont add unrequired swizzle to bcsel src

2019-03-12 Thread Timothy Arceri
NIR can produce IR that looks like this:

vec1 1 ssa_51 = ilt ssa_32, ssa_50
vec4 32 ssa_54 = intrinsic load_deref (ssa_53) (0)
vec4 32 ssa_57 = intrinsic load_deref (ssa_56) (0)
vec4 32 ssa_61 = bcsel ssa_51., ssa_54, ssa_57

The swizzle on the first bcsel src causes a crash in llvm as it
doesn't seem to expect it. Here we add a special case for the
first bcsel src.
---
 src/amd/common/ac_nir_to_llvm.c | 13 ++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index 5fb5c8da609..cebf248427c 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -154,13 +154,18 @@ static LLVMBasicBlockRef get_block(struct ac_nir_context 
*nir,
 
 static LLVMValueRef get_alu_src(struct ac_nir_context *ctx,
 nir_alu_src src,
-unsigned num_components)
+unsigned num_components,
+bool is_first_bcsel_src)
 {
LLVMValueRef value = get_src(ctx, src.src);
bool need_swizzle = false;
 
assert(value);
unsigned src_components = ac_get_llvm_num_components(value);
+
+   if (is_first_bcsel_src)
+   num_components = 1;
+
for (unsigned i = 0; i < num_components; ++i) {
assert(src.swizzle[i] < src_components);
if (src.swizzle[i] != i)
@@ -584,8 +589,10 @@ static void visit_alu(struct ac_nir_context *ctx, const 
nir_alu_instr *instr)
src_components = num_components;
break;
}
-   for (unsigned i = 0; i < nir_op_infos[instr->op].num_inputs; i++)
-   src[i] = get_alu_src(ctx, instr->src[i], src_components);
+   for (unsigned i = 0; i < nir_op_infos[instr->op].num_inputs; i++) {
+   src[i] = get_alu_src(ctx, instr->src[i], src_components,
+ (instr->op == nir_op_b32csel && i == 0));
+   }
 
switch (instr->op) {
case nir_op_fmov:
-- 
2.20.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] anv/pass: Flag the need for a depth flush for resolve attachments

2019-03-12 Thread Jason Ekstrand
Cc: mesa-sta...@lists.freedesktop.org
Cc: Nanley Chery 
---
 src/intel/vulkan/anv_pass.c | 18 +-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/src/intel/vulkan/anv_pass.c b/src/intel/vulkan/anv_pass.c
index 5fac5bbb31c..ec217abfda0 100644
--- a/src/intel/vulkan/anv_pass.c
+++ b/src/intel/vulkan/anv_pass.c
@@ -178,12 +178,28 @@ anv_render_pass_compile(struct anv_render_pass *pass)
 * subpasses and checking to see if any of them don't have an external
 * dependency.  Or, we could just be lazy and add a couple extra flushes.
 * We choose to be lazy.
+*
+* From the documentation for vkCmdNextSubpass:
+*
+*"Moving to the next subpass automatically performs any multisample
+*resolve operations in the subpass being ended. End-of-subpass
+*multisample resolves are treated as color attachment writes for the
+*purposes of synchronization. This applies to resolve operations for
+*both color and depth/stencil attachments. That is, they are
+*considered to execute in the
+*VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT pipeline stage and
+*their writes are synchronized with
+*VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT."
+*
+* Therefore, the above flags concerning color attachments also apply to
+* color and depth/stencil resolve attachments.
 */
if (all_usage & VK_IMAGE_USAGE_INPUT_ATTACHMENT_BIT) {
   pass->subpass_flushes[0] |=
  ANV_PIPE_TEXTURE_CACHE_INVALIDATE_BIT;
}
-   if (all_usage & VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT) {
+   if (all_usage & (VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT |
+VK_IMAGE_USAGE_TRANSFER_DST_BIT)) {
   pass->subpass_flushes[pass->subpass_count] |=
  ANV_PIPE_RENDER_TARGET_CACHE_FLUSH_BIT;
}
-- 
2.20.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 1/8] mesa: Implement helper functions to map and unmap a VAO.

2019-03-12 Thread Brian Paul

The series looks great.  Just a few minor suggestions below.

Reviewed-by: Brian Paul 

BTW, I happened to look at _mesa_primitive_restart_index().  It seems we 
could/should add an assertion in there:


assert(index_size == 1 ||
   index_size == 2 ||
   index_size == 4);

right?

-Brian


On 03/05/2019 01:13 AM, mathias.froehl...@gmx.net wrote:

From: Mathias Fröhlich 

Provide a set of functions that maps or unmaps all VBOs held
in a VAO. The functions will be used in the following patches.

Signed-off-by: Mathias Fröhlich 
---
  src/mesa/main/arrayobj.c | 84 
  src/mesa/main/arrayobj.h | 18 +
  2 files changed, 102 insertions(+)

diff --git a/src/mesa/main/arrayobj.c b/src/mesa/main/arrayobj.c
index 68d30aa9b1f..1f6c6904739 100644
--- a/src/mesa/main/arrayobj.c
+++ b/src/mesa/main/arrayobj.c
@@ -913,6 +913,90 @@ _mesa_all_buffers_are_unmapped(const struct 
gl_vertex_array_object *vao)
 return true;
  }
  
+

+/**
+ * Map buffer objects used in attribute arrays.
+ */
+void
+_mesa_vao_map_arrays(struct gl_context *ctx, struct gl_vertex_array_object 
*vao,
+ GLbitfield access)
+{
+   GLbitfield mask = vao->Enabled & vao->VertexAttribBufferMask;
+   while (mask) {
+  /* Do not use u_bit_scan as we can walk multiple attrib arrays at once */
+  const gl_vert_attrib attr = ffs(mask) - 1;
+  const GLubyte bindex = vao->VertexAttrib[attr].BufferBindingIndex;
+  struct gl_vertex_buffer_binding *binding = >BufferBinding[bindex];
+  mask &= ~binding->_BoundArrays;
+
+  struct gl_buffer_object *bo = binding->BufferObj;
+  assert(_mesa_is_bufferobj(bo));
+  if (_mesa_bufferobj_mapped(bo, MAP_INTERNAL))
+ continue;
+
+  ctx->Driver.MapBufferRange(ctx, 0, bo->Size, access, bo, MAP_INTERNAL);
+   }
+}
+
+
+/**
+ * Map buffer objects used in the vao,
+ * attribute arrays and index buffer.


Combine onto one line.


+ */
+void
+_mesa_vao_map(struct gl_context *ctx, struct gl_vertex_array_object *vao,
+  GLbitfield access)
+{


A comment might be hlpeful here, like
/* map the index buffer, if there is one, and not already mapped */


+   struct gl_buffer_object *bo = vao->IndexBufferObj;
+
+   if (_mesa_is_bufferobj(bo) && !_mesa_bufferobj_mapped(bo, MAP_INTERNAL))
+  ctx->Driver.MapBufferRange(ctx, 0, bo->Size, access, bo, MAP_INTERNAL);
+
+   _mesa_vao_map_arrays(ctx, vao, access);
+}
+
+
+/**
+ * Unmap buffer objects used in attribute arrays.
+ */
+void
+_mesa_vao_unmap_arrays(struct gl_context *ctx,
+   struct gl_vertex_array_object *vao)
+{
+   GLbitfield mask = vao->Enabled & vao->VertexAttribBufferMask;
+   while (mask) {
+  /* Do not use u_bit_scan as we can walk multiple attrib arrays at once */
+  const gl_vert_attrib attr = ffs(mask) - 1;
+  const GLubyte bindex = vao->VertexAttrib[attr].BufferBindingIndex;
+  struct gl_vertex_buffer_binding *binding = >BufferBinding[bindex];
+  mask &= ~binding->_BoundArrays;
+
+  struct gl_buffer_object *bo = binding->BufferObj;
+  assert(_mesa_is_bufferobj(bo));
+  if (!_mesa_bufferobj_mapped(bo, MAP_INTERNAL))
+ continue;
+
+  ctx->Driver.UnmapBuffer(ctx, bo, MAP_INTERNAL);
+   }
+}
+
+
+/**
+ * Unmap buffer objects used in the vao,
+ * attribute arrays and index buffer.


combine onto one line.



+ */
+void
+_mesa_vao_unmap(struct gl_context *ctx, struct gl_vertex_array_object *vao)
+{
+   struct gl_buffer_object *bo = vao->IndexBufferObj;
+
+   if (_mesa_is_bufferobj(bo) && _mesa_bufferobj_mapped(bo, MAP_INTERNAL))
+  ctx->Driver.UnmapBuffer(ctx, bo, MAP_INTERNAL);
+
+   _mesa_vao_unmap_arrays(ctx, vao);
+}
+
+
  /**/
  /* API Functions  */
  /**/
diff --git a/src/mesa/main/arrayobj.h b/src/mesa/main/arrayobj.h
index ee87b4b6ba5..7516bae9e39 100644
--- a/src/mesa/main/arrayobj.h
+++ b/src/mesa/main/arrayobj.h
@@ -100,6 +100,24 @@ extern bool
  _mesa_all_buffers_are_unmapped(const struct gl_vertex_array_object *vao);
  
  
+extern void

+_mesa_vao_map_arrays(struct gl_context *ctx, struct gl_vertex_array_object 
*vao,
+ GLbitfield access);
+
+extern void
+_mesa_vao_map(struct gl_context *ctx, struct gl_vertex_array_object *vao,
+  GLbitfield access);
+
+
+extern void
+_mesa_vao_unmap_arrays(struct gl_context *ctx,
+   struct gl_vertex_array_object *vao);
+
+extern void
+_mesa_vao_unmap(struct gl_context *ctx,
+struct gl_vertex_array_object *vao);
+
+
  /**
   * Array to apply the position/generic0 aliasing map to
   * an attribute value used in vertex processing inputs to an attribute



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org

[Mesa-dev] [AppVeyor] mesa master #10389 completed

2019-03-12 Thread AppVeyor


Build mesa 10389 completed



Commit 3570d15b6d by Kenneth Graunke on 3/12/2019 2:00 AM:

intel/fs: Fix opt_peephole_csel to not throw away saturates.\n\nWe were not copying the saturate bit from the original instruction\nto the new replacement instruction.  This caused major misrendering\nin DiRT Rally on iris, where comparisons leading to discards failed\ndue to the missing saturate, causing lots of extra garbage pixels to\nbe drawn in text rendering, trees, and so on.\n\nThis did not show up on i965 because st/nir performs a more aggressive\nversion of nir_opt_peephole_select, yielding more b32csel operations.\n\nFixes: 52c7df1643e i965/fs: Merge CMP and SEL into CSEL on Gen8+\n\nReviewed-by: Lionel Landwerlin \nReviewed-by: Ian Romanick 


Configure your notification preferences

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [Bug 109641] GLX swrast driver leaks shared memory segments

2019-03-12 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=109641

Brian Paul  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|NEW |RESOLVED

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] RFC: Workaround for pthread_setaffinity_np() seccomp filtering

2019-03-12 Thread Bas Nieuwenhuizen
On Tue, Mar 12, 2019 at 9:59 AM Marc-André Lureau
 wrote:
>
> Hi
>
> On Fri, Mar 1, 2019 at 12:13 PM Mathias Fröhlich
>  wrote:
> >
> > On Friday, 1 March 2019 12:15:08 CET Eero Tamminen wrote:
> > > Hi,
> > >
> > > On 1.3.2019 11.12, Michel Dänzer wrote:
> > > > On 2019-02-28 8:41 p.m., Marek Olšák wrote:
> > > >>> On Thu, Feb 28, 2019 at 1:37 PM Eero Tamminen 
> > > >>> 
> > >  Why distro versions of Qemu filter sched_setaffinity() syscall?
> > > >>>
> > > >>> (https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1815889)
> > > >>>
> > > >>> Daniel Berrange (berrange) wrote on 2019-02-27: #19
> > > >>>
> > > >>> "IMHO that mesa change is not valid. It is settings its affinity to
> > > >>> run on all threads which is definitely *NOT* something we want to be
> > > >>> allowed. Management applications want to control which CPUs QEMU runs
> > > >>> on, and as such Mesa should honour the CPU placement that the QEMU
> > > >>> process has.
> > > >>>
> > > >>> This is a great example of why QEMU wants to use seccomp to block
> > > >>> affinity changes to prevent something silently trying to use more CPUs
> > > >>> than are assigned to this QEMU."
> > > >>>
> > > >>
> > > >> Mesa uses thread affinity to optimize memory access performance on some
> > > >> CPUs (see util_pin_thread_to_L3). Other places in Mesa need to restore 
> > > >> the
> > > >> original thread affinity for some child threads. Additionally, if games
> > > >> limit the thread affinity, Mesa needs to restore the full thread 
> > > >> affinity
> > > >> for some of its child threads.
> > > >
> > > > The last part sounds like Mesa clearly overstepping its authority.
> > > >
> > > >
> > > >> In essence, the thread affinity should only be considered a hint for 
> > > >> the
> > > >> kernel for optimal performance. There is no reason to kill the process 
> > > >> if
> > > >> it's disallowed. Just ignore the call or modify the thread mask to 
> > > >> make it
> > > >> legal.
> > > >
> > > > The fundamental issue here is that Mesa is using the thread affinity API
> > > > for something else than it's intended for. If there was an API for what
> > > > Mesa wants (encouraging certain sets of threads to run on topologically
> > > > close cores), there should be no need to block that.
> > >
> > > Why such process needs to be killed instead the request being masked
> > > suitably, is there some program that breaks subtly if affinity request
> > > is masked (and that being worse than the program being killed)?
> >
> > But that is still a situation that could be nicely handled with a
> > EPERM error return. Way better than just kill a process.
> > That 'badly affected' program still can call abort then.
> > But nicely working programs don't get just killed then!!
>
>
> Returning an error seems less secure that prohibiting it completely.
> And it may lead to subtle bugs in rarely tested code paths.
>
> It's legitimate that QEMU and management layers want to prevent
> arbitrary code from changing resource allocation etc.
>
> There are no easy way I can think of for mesa (and other libraries) to
> probe the seccomp filters and associated action.
>
> So we need a way to tell mesa not to call setaffinity() (and other
> syscalls). MESA_NO_THREAD_AFFINITY or MESA_NO_SYSCALLS=setaffinity,...
> seem like a relatively easy way to go.

I strongly believe we should not be going the route of adding another
environment variable.

Primarily because this is adding a big pitfall for users that have
their software not work and have to spend significant efforts to
figure out the environment variable and get things work. (as well as
associated costs of some users not getting that far and filing bugs or
proclaiming on other sites that this thing is buggy for them).

As such I'd strongly appreciate it if people look further than the
immediate crash  and figure out a way to make graceful degradation
happen without user intervention.

Is there really no way to figure out that calling setaffinity is going
to kill the process, and what would be needed to add a method?

>
> thanks
>
>
> --
> Marc-André Lureau
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [AppVeyor] mesa viewport-jobize #10388 failed

2019-03-12 Thread AppVeyor



Build mesa 10388 failed


Commit 904905e3de by Alyssa Rosenzweig on 3/13/2019 1:50 AM:

panfrost: Compute viewport state on the fly\n\nPreviously, we were caching this incorrectly; there's no real reason to\ngiven how variable it is (sensitive to changes in viewport, framebuffer\ndimensions, and scissors) and how cheap it is to recompute. So, just do\nit on the fly each draw.\n\nFixes glmark-es2 -bshadow and -brefract.\n\nSigned-off-by: Alyssa Rosenzweig 


Configure your notification preferences

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] anv: Ignore VkRenderPassInputAttachementAspectCreateInfo

2019-03-12 Thread Jason Ekstrand
Pushed.  Thanks!

On Tue, Mar 12, 2019 at 6:35 PM Sagar Ghuge  wrote:

> Reviewed-by: Sagar Ghuge 
>
> On 3/12/19 4:21 PM, Jason Ekstrand wrote:
> > We don't care about the information but there's no sense in throwing a
> > debug warning about it.  It's harmless but annoying to users.
> >
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109984
> > Cc: Craig Stout 
> > ---
> >  src/intel/vulkan/anv_pass.c | 4 
> >  1 file changed, 4 insertions(+)
> >
> > diff --git a/src/intel/vulkan/anv_pass.c b/src/intel/vulkan/anv_pass.c
> > index 02f2be60e02..5fac5bbb31c 100644
> > --- a/src/intel/vulkan/anv_pass.c
> > +++ b/src/intel/vulkan/anv_pass.c
> > @@ -332,6 +332,10 @@ VkResult anv_CreateRenderPass(
> >
> > vk_foreach_struct(ext, pCreateInfo->pNext) {
> >switch (ext->sType) {
> > +  case
> VK_STRUCTURE_TYPE_RENDER_PASS_INPUT_ATTACHMENT_ASPECT_CREATE_INFO:
> > + /* We don't care about this information */
> > + break;
> > +
> >case VK_STRUCTURE_TYPE_RENDER_PASS_MULTIVIEW_CREATE_INFO: {
> >   VkRenderPassMultiviewCreateInfo *mv = (void *)ext;
> >
> >
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] nir/loop_unroll: Fix out-of-bounds access handling

2019-03-12 Thread Jason Ekstrand
Pushed!  Thanks.

On Tue, Mar 12, 2019 at 5:42 PM Timothy Arceri 
wrote:

>
>
> On 13/3/19 8:30 am, Jason Ekstrand wrote:
> > The previous code was completely broken when it came to constructing the
> > undef values.  I'm not sure how it ever worked.  For the case of a copy
> > that reads an undefined value, we can just delete the copy because the
> > destination is a valid undefined value.  This saves us the effort of
> > trying to construct a value for an arbitrary copy_deref intrinsic.
>
> Yes. It turns out this was indeed not required. Thanks for the fix!
>
> Reviewed-by: Timothy Arceri 
>
> >
> > Fixes: e8a8937a04 "nir: add partial loop unrolling support"
> > Cc: Timothy Arceri 
> > ---
> >   src/compiler/nir/nir_opt_loop_unroll.c | 14 ++
> >   1 file changed, 2 insertions(+), 12 deletions(-)
> >
> > diff --git a/src/compiler/nir/nir_opt_loop_unroll.c
> b/src/compiler/nir/nir_opt_loop_unroll.c
> > index 06ec78b8068..b2696d4aadb 100644
> > --- a/src/compiler/nir/nir_opt_loop_unroll.c
> > +++ b/src/compiler/nir/nir_opt_loop_unroll.c
> > @@ -670,11 +670,9 @@ remove_out_of_bounds_induction_use(nir_shader
> *shader, nir_loop *loop,
> >   if (is_access_out_of_bounds(term,
> nir_src_as_deref(intrin->src[0]),
> >   trip_count)) {
> >  if (intrin->intrinsic == nir_intrinsic_load_deref) {
> > -  assert(intrin->src[0].is_ssa);
> > -  nir_ssa_def *a_ssa = intrin->src[0].ssa;
> > nir_ssa_def *undef =
> > - nir_ssa_undef(, intrin->num_components,
> > -   a_ssa->bit_size);
> > + nir_ssa_undef(, intrin->dest.ssa.num_components,
> > +   intrin->dest.ssa.bit_size);
> > nir_ssa_def_rewrite_uses(>dest.ssa,
> >  nir_src_for_ssa(undef));
> >  } else {
> > @@ -686,14 +684,6 @@ remove_out_of_bounds_induction_use(nir_shader
> *shader, nir_loop *loop,
> >   if (intrin->intrinsic == nir_intrinsic_copy_deref &&
> >   is_access_out_of_bounds(term,
> nir_src_as_deref(intrin->src[1]),
> >   trip_count)) {
> > -   assert(intrin->src[1].is_ssa);
> > -   nir_ssa_def *a_ssa = intrin->src[1].ssa;
> > -   nir_ssa_def *undef =
> > -  nir_ssa_undef(, intrin->num_components,
> a_ssa->bit_size);
> > -
> > -   /* Replace the copy with a store of the undefined value
> */
> > -   b.cursor = nir_before_instr(instr);
> > -   nir_store_deref(, nir_src_as_deref(intrin->src[0]),
> undef, ~0);
> >  nir_instr_remove(instr);
> >   }
> >}
> >
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 04/10] panfrost: Minor comment cleanup (version detection)

2019-03-12 Thread Alyssa Rosenzweig
Signed-off-by: Alyssa Rosenzweig 
---
 src/gallium/drivers/panfrost/pan_context.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/panfrost/pan_context.c 
b/src/gallium/drivers/panfrost/pan_context.c
index 48e471eace2..cb226cc2220 100644
--- a/src/gallium/drivers/panfrost/pan_context.c
+++ b/src/gallium/drivers/panfrost/pan_context.c
@@ -2408,8 +2408,9 @@ panfrost_create_context(struct pipe_screen *screen, void 
*priv, unsigned flags)
 unsigned gpu_id;
 
 gpu_id = pscreen->driver->query_gpu_version(pscreen);
-ctx->is_t6xx = gpu_id <= 0x0750; /* For now, this flag means t76x or 
less */
-ctx->require_sfbd = gpu_id < 0x0750; /* t76x is the first to support 
MFD */
+
+ctx->is_t6xx = gpu_id <= 0x0750; /* For now, this flag means T760 or 
less */
+ctx->require_sfbd = gpu_id < 0x0750; /* T760 is the first to support 
MFBD */
 
 gallium->screen = screen;
 
-- 
2.20.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 05/10] panfrost/mfbd: Implement linear depth buffers

2019-03-12 Thread Alyssa Rosenzweig
This removes a clunky hack where the depth buffer was enabled during the
*clear*, instead of during depth buffer linking. That said, this does
not yet support writeback like AFBC depth buffers.

Signed-off-by: Alyssa Rosenzweig 
---
 src/gallium/drivers/panfrost/pan_mfbd.c | 19 +--
 1 file changed, 9 insertions(+), 10 deletions(-)

diff --git a/src/gallium/drivers/panfrost/pan_mfbd.c 
b/src/gallium/drivers/panfrost/pan_mfbd.c
index a141fd314c0..b9c7cb221e7 100644
--- a/src/gallium/drivers/panfrost/pan_mfbd.c
+++ b/src/gallium/drivers/panfrost/pan_mfbd.c
@@ -58,8 +58,6 @@ panfrost_mfbd_clear(
 struct bifrost_fb_extra *fbx,
 struct bifrost_render_target *rt)
 {
-struct panfrost_context *ctx = job->ctx;
-
 if (job->clear & PIPE_CLEAR_COLOR) {
 rt->clear_color_1 = job->clear_color;
 rt->clear_color_2 = job->clear_color;
@@ -74,14 +72,6 @@ panfrost_mfbd_clear(
 if (job->clear & PIPE_CLEAR_STENCIL) {
 fb->clear_stencil = job->clear_stencil;
 }
-
-if (job->clear & (PIPE_CLEAR_DEPTH | PIPE_CLEAR_STENCIL)) {
-/* Setup combined 24/8 depth/stencil */
-fb->unk3 |= MALI_MFBD_EXTRA;
-fbx->flags = 0x405;
-fbx->ds_linear.depth = ctx->depth_stencil_buffer.gpu;
-fbx->ds_linear.depth_stride = ctx->pipe_framebuffer.width * 4;
-}
 }
 
 static void
@@ -155,6 +145,15 @@ panfrost_mfbd_set_zsbuf(
 fbx->ds_afbc.padding = 0x1000;
 
 fb->unk3 |= MALI_MFBD_DEPTH_WRITE;
+} else if (rsrc->bo->layout == PAN_LINEAR) {
+fb->unk3 |= MALI_MFBD_EXTRA;
+fbx->flags |= MALI_EXTRA_PRESENT | MALI_EXTRA_ZS | 0x1;
+
+fbx->ds_linear.depth = rsrc->bo->gpu[0];
+fbx->ds_linear.depth_stride =
+util_format_get_stride(surf->format, 
surf->texture->width0);
+} else {
+assert(0);
 }
 }
 
-- 
2.20.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 08/10] panfrost: Allocate extra data for depth buffer

2019-03-12 Thread Alyssa Rosenzweig
It's not clear why the hardware "spills" a little bit, but if we don't
do this, we get MMU faults with linear depth buffers.

Signed-off-by: Alyssa Rosenzweig 
---
 src/gallium/drivers/panfrost/pan_resource.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/src/gallium/drivers/panfrost/pan_resource.c 
b/src/gallium/drivers/panfrost/pan_resource.c
index d647f618ee7..a6e0cfe4687 100644
--- a/src/gallium/drivers/panfrost/pan_resource.c
+++ b/src/gallium/drivers/panfrost/pan_resource.c
@@ -197,6 +197,11 @@ panfrost_create_bo(struct panfrost_screen *screen, const 
struct pipe_resource *t
 if (template->height0) sz *= template->height0;
 if (template->depth0) sz *= template->depth0;
 
+/* Depth buffers require extra space for unknown reasons */
+
+if (template->bind & PIPE_BIND_DEPTH_STENCIL)
+sz = sz + sz/256;
+
 /* Based on the usage, figure out what storing will be used. There are
  * various tradeoffs:
  *
-- 
2.20.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 06/10] panfrost/mfbd: Respect per-job depth write flag

2019-03-12 Thread Alyssa Rosenzweig
While a depth buffer may be supplied, it only needs to be written to if
the depth writemask is set for any draw AND if the depth buffer is not
immediately invalidated (as is the case for scanout). This refactors
panfrost_job to provide a depth write requirement, which is now
implemented for MFBD depth buffers.

Signed-off-by: Alyssa Rosenzweig 
---
 src/gallium/drivers/panfrost/pan_context.c | 30 --
 src/gallium/drivers/panfrost/pan_job.h |  9 +--
 src/gallium/drivers/panfrost/pan_mfbd.c| 21 ---
 src/gallium/drivers/panfrost/pan_sfbd.c|  2 +-
 4 files changed, 42 insertions(+), 20 deletions(-)

diff --git a/src/gallium/drivers/panfrost/pan_context.c 
b/src/gallium/drivers/panfrost/pan_context.c
index cb226cc2220..a038ea122f7 100644
--- a/src/gallium/drivers/panfrost/pan_context.c
+++ b/src/gallium/drivers/panfrost/pan_context.c
@@ -50,19 +50,6 @@ extern const char *pan_counters_base;
 /* Do not actually send anything to the GPU; merely generate the cmdstream as 
fast as possible. Disables framebuffer writes */
 //#define DRY_RUN
 
-/* TODO: Sample size, etc */
-
-static void
-panfrost_set_framebuffer_msaa(struct panfrost_context *ctx, bool enabled)
-{
-struct panfrost_job *job = panfrost_get_job_for_fbo(ctx);
-
-job->msaa |= enabled;
-
-SET_BIT(ctx->fragment_shader_core.unknown2_3, MALI_HAS_MSAA, enabled);
-SET_BIT(ctx->fragment_shader_core.unknown2_4, MALI_NO_MSAA, !enabled);
-}
-
 /* AFBC is enabled on a per-resource basis (AFBC enabling is theoretically
  * indepdent between color buffers and depth/stencil). To enable, we allocate
  * the AFBC metadata buffer and mark that it is enabled. We do -not- actually
@@ -789,15 +776,30 @@ panfrost_emit_vertex_data(struct panfrost_context *ctx)
 void
 panfrost_emit_for_draw(struct panfrost_context *ctx, bool with_vertex_data)
 {
+struct panfrost_job *job = panfrost_get_job_for_fbo(ctx);
+
 if (with_vertex_data) {
 panfrost_emit_vertex_data(ctx);
 }
 
+bool msaa = ctx->rasterizer->base.multisample;
+
 if (ctx->dirty & PAN_DIRTY_RASTERIZER) {
 ctx->payload_tiler.gl_enables = 
ctx->rasterizer->tiler_gl_enables;
-panfrost_set_framebuffer_msaa(ctx, 
ctx->rasterizer->base.multisample);
+
+/* TODO: Sample size */
+SET_BIT(ctx->fragment_shader_core.unknown2_3, MALI_HAS_MSAA, 
msaa);
+SET_BIT(ctx->fragment_shader_core.unknown2_4, MALI_NO_MSAA, 
!msaa);
 }
 
+/* Enable job requirements at draw-time */
+
+if (msaa)
+job->requirements |= PAN_REQ_MSAA;
+
+if (ctx->depth_stencil->depth.writemask)
+job->requirements |= PAN_REQ_DEPTH_WRITE;
+
 if (ctx->occlusion_query) {
 ctx->payload_tiler.gl_enables |= MALI_OCCLUSION_QUERY | 
MALI_OCCLUSION_PRECISE;
 ctx->payload_tiler.postfix.occlusion_counter = 
ctx->occlusion_query->transfer.gpu;
diff --git a/src/gallium/drivers/panfrost/pan_job.h 
b/src/gallium/drivers/panfrost/pan_job.h
index 10503d944ac..30f1cf4bd5c 100644
--- a/src/gallium/drivers/panfrost/pan_job.h
+++ b/src/gallium/drivers/panfrost/pan_job.h
@@ -33,6 +33,9 @@ struct panfrost_job_key {
 struct pipe_surface *zsbuf;
 };
 
+#define PAN_REQ_MSAA(1 << 0)
+#define PAN_REQ_DEPTH_WRITE (1 << 1)
+
 /* A panfrost_job corresponds to a bound FBO we're rendering to,
  * collecting over multiple draws. */
 
@@ -48,8 +51,10 @@ struct panfrost_job {
 float clear_depth;
 unsigned clear_stencil;
 
-/* Whether this job uses MSAA */
-bool msaa;
+/* Whether this job uses the corresponding requirement (PAN_REQ_*
+ * bitmask) */
+unsigned requirements;
+
 };
 
 /* Functions for managing the above */
diff --git a/src/gallium/drivers/panfrost/pan_mfbd.c 
b/src/gallium/drivers/panfrost/pan_mfbd.c
index b9c7cb221e7..68c842981f3 100644
--- a/src/gallium/drivers/panfrost/pan_mfbd.c
+++ b/src/gallium/drivers/panfrost/pan_mfbd.c
@@ -143,8 +143,6 @@ panfrost_mfbd_set_zsbuf(
 
 fbx->ds_afbc.zero1 = 0x10009;
 fbx->ds_afbc.padding = 0x1000;
-
-fb->unk3 |= MALI_MFBD_DEPTH_WRITE;
 } else if (rsrc->bo->layout == PAN_LINEAR) {
 fb->unk3 |= MALI_MFBD_EXTRA;
 fbx->flags |= MALI_EXTRA_PRESENT | MALI_EXTRA_ZS | 0x1;
@@ -246,7 +244,21 @@ panfrost_mfbd_fragment(struct panfrost_context *ctx, bool 
flip_y)
 rts[0].framebuffer_stride = 0;
 }
 
-if (job->msaa) {
+/* When scanning out, the depth buffer is immediately invalidated, so
+ * we don't need to waste bandwidth writing it out. This can improve
+ * performance substantially (Z32_UNORM 1080p @ 60fps is 475 MB/s of
+ * memory bandwidth!).
+ *
+ * The exception is ReadPixels, but this is not supported on 

[Mesa-dev] [PATCH 07/10] panfrost: Comment spelling fix

2019-03-12 Thread Alyssa Rosenzweig
Signed-off-by: Alyssa Rosenzweig 
---
 src/gallium/drivers/panfrost/pan_mfbd.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/panfrost/pan_mfbd.c 
b/src/gallium/drivers/panfrost/pan_mfbd.c
index 68c842981f3..ac7f9ed665f 100644
--- a/src/gallium/drivers/panfrost/pan_mfbd.c
+++ b/src/gallium/drivers/panfrost/pan_mfbd.c
@@ -198,7 +198,7 @@ panfrost_mfbd_upload(
 UPLOAD(m_f_trans, offset, [c], total_sz);
 }
 
-/* Return pointer suitable for the fragment seciton */
+/* Return pointer suitable for the fragment section */
 return m_f_trans.gpu | MALI_MFBD | (has_extra ? 2 : 0);
 }
 
-- 
2.20.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 09/10] panfrost; Disable AFBC for depth buffers

2019-03-12 Thread Alyssa Rosenzweig
For inexplicable reasons, the depth buffer is faster if kept as linear,
whereas the colour buffers are faster if AFBC. Given both code paths are
available, we'll choose the faster one of each (which also helps with
testing coverage).

Signed-off-by: Alyssa Rosenzweig 
---
 src/gallium/drivers/panfrost/pan_context.c  | 5 +
 src/gallium/drivers/panfrost/pan_resource.c | 6 ++
 2 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/src/gallium/drivers/panfrost/pan_context.c 
b/src/gallium/drivers/panfrost/pan_context.c
index a038ea122f7..f607a9a8169 100644
--- a/src/gallium/drivers/panfrost/pan_context.c
+++ b/src/gallium/drivers/panfrost/pan_context.c
@@ -2058,10 +2058,7 @@ panfrost_set_framebuffer_state(struct pipe_context *pctx,
 panfrost_attach_vt_framebuffer(ctx);
 panfrost_set_scissor(ctx);
 
-struct panfrost_resource *tex = ((struct 
panfrost_resource *) ctx->pipe_framebuffer.zsbuf->texture);
-
-if (tex->bo->layout != PAN_AFBC && 
!panfrost_is_scanout(ctx))
-panfrost_enable_afbc(ctx, tex, true);
+/* Keep the depth FBO linear */
 }
 }
 }
diff --git a/src/gallium/drivers/panfrost/pan_resource.c 
b/src/gallium/drivers/panfrost/pan_resource.c
index a6e0cfe4687..0fe54aa8a1d 100644
--- a/src/gallium/drivers/panfrost/pan_resource.c
+++ b/src/gallium/drivers/panfrost/pan_resource.c
@@ -220,6 +220,12 @@ panfrost_create_bo(struct panfrost_screen *screen, const 
struct pipe_resource *t
 /* Tiling textures is almost always faster, unless we only use it once 
*/
 bool should_tile = (template->usage != PIPE_USAGE_STREAM) && 
(template->bind & PIPE_BIND_SAMPLER_VIEW);
 
+/* For unclear reasons, depth/stencil is faster linear than AFBC, so
+ * make sure it's linear */
+
+if (template->bind & PIPE_BIND_DEPTH_STENCIL)
+should_tile = false;
+
 /* Set the layout appropriately */
 bo->layout = should_tile ? PAN_TILED : PAN_LINEAR;
 
-- 
2.20.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 00/10] Fragment descriptor focused refactor

2019-03-12 Thread Alyssa Rosenzweig
This patch series can be considered a "v2" of the previous, for the
commits that were not merged due to regressions and such. It goes back
and fixes quite a bit of these issues the right way. One particular
benefit is the removal of the staging fragment framebuffer descriptor,
which was a major hack and probably causing issues.

Alyssa Rosenzweig (10):
  panfrost: Break out fragment to SFBD/MFBD files
  panfrost: Remove staging SFBD for pan_context
  panfrost: Remove staging MFBD
  panfrost: Minor comment cleanup (version detection)
  panfrost/mfbd: Implement linear depth buffers
  panfrost/mfbd: Respect per-job depth write flag
  panfrost: Comment spelling fix
  panfrost: Allocate extra data for depth buffer
  panfrost; Disable AFBC for depth buffers
  panfrost: Compute viewport state on the fly

 src/gallium/drivers/panfrost/meson.build|   3 +
 src/gallium/drivers/panfrost/pan_context.c  | 525 +++-
 src/gallium/drivers/panfrost/pan_context.h  |  36 +-
 src/gallium/drivers/panfrost/pan_fragment.c |  66 +++
 src/gallium/drivers/panfrost/pan_job.h  |   8 +
 src/gallium/drivers/panfrost/pan_mfbd.c | 289 +++
 src/gallium/drivers/panfrost/pan_resource.c |  11 +
 src/gallium/drivers/panfrost/pan_sfbd.c | 139 ++
 8 files changed, 598 insertions(+), 479 deletions(-)
 create mode 100644 src/gallium/drivers/panfrost/pan_fragment.c
 create mode 100644 src/gallium/drivers/panfrost/pan_mfbd.c
 create mode 100644 src/gallium/drivers/panfrost/pan_sfbd.c

-- 
2.20.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 10/10] panfrost: Compute viewport state on the fly

2019-03-12 Thread Alyssa Rosenzweig
Previously, we were caching this incorrectly; there's no real reason to
given how variable it is (sensitive to changes in viewport, framebuffer
dimensions, and scissors) and how cheap it is to recompute. So, just do
it on the fly each draw.

Fixes glmark-es2 -bshadow and -brefract.

Signed-off-by: Alyssa Rosenzweig 
---
 src/gallium/drivers/panfrost/pan_context.c | 108 -
 src/gallium/drivers/panfrost/pan_context.h |   1 -
 2 files changed, 38 insertions(+), 71 deletions(-)

diff --git a/src/gallium/drivers/panfrost/pan_context.c 
b/src/gallium/drivers/panfrost/pan_context.c
index f607a9a8169..378a631410b 100644
--- a/src/gallium/drivers/panfrost/pan_context.c
+++ b/src/gallium/drivers/panfrost/pan_context.c
@@ -311,39 +311,6 @@ panfrost_attach_vt_framebuffer(struct panfrost_context 
*ctx)
 ctx->payload_tiler.postfix.framebuffer = framebuffer;
 }
 
-static void
-panfrost_viewport(struct panfrost_context *ctx,
-  float depth_clip_near,
-  float depth_clip_far,
-  int viewport_x0, int viewport_y0,
-  int viewport_x1, int viewport_y1)
-{
-/* Clip bounds are encoded as floats. The viewport itself is encoded as
- * (somewhat) asymmetric ints. */
-
-struct mali_viewport ret = {
-/* By default, do no viewport clipping, i.e. clip to (-inf,
- * inf) in each direction. Clipping to the viewport in theory
- * should work, but in practice causes issues when we're not
- * explicitly trying to scissor */
-
-.clip_minx = -inff,
-.clip_miny = -inff,
-.clip_maxx = inff,
-.clip_maxy = inff,
-
-/* We always perform depth clipping (TODO: Can this be 
disabled?) */
-
-.clip_minz = depth_clip_near,
-.clip_maxz = depth_clip_far,
-
-.viewport0 = { viewport_x0, viewport_y0 },
-.viewport1 = { MALI_POSITIVE(viewport_x1), 
MALI_POSITIVE(viewport_y1) },
-};
-
-memcpy(ctx->viewport, , sizeof(ret));
-}
-
 /* Reset per-frame context, called on context initialisation as well as after
  * flushing a frame */
 
@@ -413,11 +380,6 @@ panfrost_emit_tiler_payload(struct panfrost_context *ctx)
 },
 };
 
-/* Reserve the viewport */
-struct panfrost_transfer t = panfrost_allocate_chunk(ctx, 
sizeof(struct mali_viewport), HEAP_DESCRIPTOR);
-ctx->viewport = (struct mali_viewport *) t.cpu;
-payload.postfix.viewport = t.gpu;
-
 memcpy(>payload_tiler, , sizeof(payload));
 }
 
@@ -1119,6 +1081,44 @@ panfrost_emit_for_draw(struct panfrost_context *ctx, 
bool with_vertex_data)
 }
 }
 
+/* TODO: Upload the viewport somewhere more appropriate */
+
+/* Clip bounds are encoded as floats. The viewport itself is encoded as
+ * (somewhat) asymmetric ints. */
+const struct pipe_scissor_state *ss = >scissor;
+
+struct mali_viewport view = {
+/* By default, do no viewport clipping, i.e. clip to (-inf,
+ * inf) in each direction. Clipping to the viewport in theory
+ * should work, but in practice causes issues when we're not
+ * explicitly trying to scissor */
+
+.clip_minx = -inff,
+.clip_miny = -inff,
+.clip_maxx = inff,
+.clip_maxy = inff,
+
+.clip_minz = 0.0,
+.clip_maxz = 1.0,
+};
+
+if (ss && ctx->rasterizer && ctx->rasterizer->base.scissor && 0) {
+view.viewport0[0] = ss->minx;
+view.viewport0[1] = ss->miny;
+view.viewport1[0] = MALI_POSITIVE(ss->maxx);
+view.viewport1[1] = MALI_POSITIVE(ss->maxy);
+} else {
+view.viewport0[0] = 0;
+view.viewport0[1] = 0;
+view.viewport1[0] = MALI_POSITIVE(ctx->pipe_framebuffer.width);
+view.viewport1[1] = 
MALI_POSITIVE(ctx->pipe_framebuffer.height);
+}
+
+ctx->payload_tiler.postfix.viewport =
+panfrost_upload_transient(ctx,
+,
+sizeof(struct mali_viewport));
+
 ctx->dirty = 0;
 }
 
@@ -1463,24 +1463,6 @@ panfrost_generic_cso_delete(struct pipe_context *pctx, 
void *hwcso)
 free(hwcso);
 }
 
-static void
-panfrost_set_scissor(struct panfrost_context *ctx)
-{
-const struct pipe_scissor_state *ss = >scissor;
-
-if (ss && ctx->rasterizer && ctx->rasterizer->base.scissor && 0) {
-ctx->viewport->viewport0[0] = ss->minx;
-ctx->viewport->viewport0[1] = ss->miny;
-ctx->viewport->viewport1[0] = MALI_POSITIVE(ss->maxx);
-ctx->viewport->viewport1[1] = MALI_POSITIVE(ss->maxy);
-   

[Mesa-dev] [PATCH 02/10] panfrost: Remove staging SFBD for pan_context

2019-03-12 Thread Alyssa Rosenzweig
The fragment framebuffer descriptor should not be a context entry;
rather, it should be constructed only at fragment time to keep analysis
tractable.

Signed-off-by: Alyssa Rosenzweig 
---
 src/gallium/drivers/panfrost/pan_context.h  |  5 +-
 src/gallium/drivers/panfrost/pan_fragment.c |  7 ++-
 src/gallium/drivers/panfrost/pan_mfbd.c |  2 +-
 src/gallium/drivers/panfrost/pan_sfbd.c | 55 +
 4 files changed, 30 insertions(+), 39 deletions(-)

diff --git a/src/gallium/drivers/panfrost/pan_context.h 
b/src/gallium/drivers/panfrost/pan_context.h
index a304d83e4fe..a3c87199d00 100644
--- a/src/gallium/drivers/panfrost/pan_context.h
+++ b/src/gallium/drivers/panfrost/pan_context.h
@@ -140,7 +140,6 @@ struct panfrost_context {
  * e.g. clearing information */
 
 union {
-struct mali_single_framebuffer fragment_sfbd;
 struct {
 struct bifrost_framebuffer fragment_mfbd;
 struct bifrost_fb_extra fragment_extra;
@@ -371,10 +370,10 @@ bool
 panfrost_is_scanout(struct panfrost_context *ctx);
 
 mali_ptr
-panfrost_sfbd_fragment(struct panfrost_context *ctx);
+panfrost_sfbd_fragment(struct panfrost_context *ctx, bool flip_y);
 
 mali_ptr
-panfrost_mfbd_fragment(struct panfrost_context *ctx);
+panfrost_mfbd_fragment(struct panfrost_context *ctx, bool flip_y);
 
 struct bifrost_framebuffer
 panfrost_emit_mfbd(struct panfrost_context *ctx);
diff --git a/src/gallium/drivers/panfrost/pan_fragment.c 
b/src/gallium/drivers/panfrost/pan_fragment.c
index 16405a4ed21..0dc15c2d235 100644
--- a/src/gallium/drivers/panfrost/pan_fragment.c
+++ b/src/gallium/drivers/panfrost/pan_fragment.c
@@ -34,9 +34,12 @@
 mali_ptr
 panfrost_fragment_job(struct panfrost_context *ctx)
 {
+/* TODO: Check viewport or something */
+bool flip_y = panfrost_is_scanout(ctx);
+
 mali_ptr framebuffer = ctx->require_sfbd ?
-panfrost_sfbd_fragment(ctx) :
-panfrost_mfbd_fragment(ctx);
+panfrost_sfbd_fragment(ctx, flip_y) :
+panfrost_mfbd_fragment(ctx, flip_y);
 
 struct mali_job_descriptor_header header = {
 .job_type = JOB_TYPE_FRAGMENT,
diff --git a/src/gallium/drivers/panfrost/pan_mfbd.c 
b/src/gallium/drivers/panfrost/pan_mfbd.c
index dffe13e6713..cb36a374063 100644
--- a/src/gallium/drivers/panfrost/pan_mfbd.c
+++ b/src/gallium/drivers/panfrost/pan_mfbd.c
@@ -234,7 +234,7 @@ panfrost_mfbd_upload(struct panfrost_context *ctx)
 /* Creates an MFBD for the FRAGMENT section of the bound framebuffer */
 
 mali_ptr
-panfrost_mfbd_fragment(struct panfrost_context *ctx)
+panfrost_mfbd_fragment(struct panfrost_context *ctx, bool flip_y)
 {
 struct panfrost_job *job = panfrost_get_job_for_fbo(ctx);
 
diff --git a/src/gallium/drivers/panfrost/pan_sfbd.c 
b/src/gallium/drivers/panfrost/pan_sfbd.c
index d04da20e258..0e283bbb082 100644
--- a/src/gallium/drivers/panfrost/pan_sfbd.c
+++ b/src/gallium/drivers/panfrost/pan_sfbd.c
@@ -36,16 +36,11 @@ panfrost_sfbd_format(struct pipe_surface *surf)
 }
 
 static void
-panfrost_sfbd_enable_msaa(struct panfrost_context *ctx)
-{
-ctx->fragment_sfbd.format |= MALI_FRAMEBUFFER_MSAA_A | 
MALI_FRAMEBUFFER_MSAA_B;
-}
-
-static void
-panfrost_sfbd_clear(struct panfrost_job *job)
+panfrost_sfbd_clear(
+struct panfrost_job *job,
+struct mali_single_framebuffer *sfbd)
 {
 struct panfrost_context *ctx = job->ctx;
-struct mali_single_framebuffer *sfbd = >fragment_sfbd;
 
 if (job->clear & PIPE_CLEAR_COLOR) {
 sfbd->clear_color_1 = job->clear_color;
@@ -91,60 +86,54 @@ panfrost_sfbd_clear(struct panfrost_job *job)
 
 static void
 panfrost_sfbd_set_cbuf(
-struct panfrost_context *ctx,
-struct pipe_surface *surf)
+struct mali_single_framebuffer *fb,
+struct pipe_surface *surf,
+bool flip_y)
 {
 struct panfrost_resource *rsrc = pan_resource(surf->texture);
 
 signed stride =
 util_format_get_stride(surf->format, surf->texture->width0);
 
-ctx->fragment_sfbd.format = panfrost_sfbd_format(surf);
+fb->format = panfrost_sfbd_format(surf);
 
 if (rsrc->bo->layout == PAN_LINEAR) {
 mali_ptr framebuffer = rsrc->bo->gpu[0];
 
 /* The default is upside down from OpenGL's perspective. */
-if (panfrost_is_scanout(ctx)) {
+if (flip_y) {
 framebuffer += stride * (surf->texture->height0 - 1);
 stride = -stride;
 }
 
-ctx->fragment_sfbd.framebuffer = framebuffer;
-ctx->fragment_sfbd.stride = stride;
+fb->framebuffer = framebuffer;
+fb->stride = stride;
 } else {
 fprintf(stderr, "Invalid render 

[Mesa-dev] [PATCH 01/10] panfrost: Break out fragment to SFBD/MFBD files

2019-03-12 Thread Alyssa Rosenzweig
This substantially cleans up the corresponding logic at the expense of a
bit of code duplication; nevertheless, it's a net win since otherwise
incompatible hardware code is mixed confusingly.

Signed-off-by: Alyssa Rosenzweig 
---
 src/gallium/drivers/panfrost/meson.build|   3 +
 src/gallium/drivers/panfrost/pan_context.c  | 385 +---
 src/gallium/drivers/panfrost/pan_context.h  |  21 ++
 src/gallium/drivers/panfrost/pan_fragment.c |  63 
 src/gallium/drivers/panfrost/pan_job.h  |   3 +
 src/gallium/drivers/panfrost/pan_mfbd.c | 273 ++
 src/gallium/drivers/panfrost/pan_sfbd.c | 150 
 7 files changed, 520 insertions(+), 378 deletions(-)
 create mode 100644 src/gallium/drivers/panfrost/pan_fragment.c
 create mode 100644 src/gallium/drivers/panfrost/pan_mfbd.c
 create mode 100644 src/gallium/drivers/panfrost/pan_sfbd.c

diff --git a/src/gallium/drivers/panfrost/meson.build 
b/src/gallium/drivers/panfrost/meson.build
index 456782355bb..e3569e73468 100644
--- a/src/gallium/drivers/panfrost/meson.build
+++ b/src/gallium/drivers/panfrost/meson.build
@@ -42,6 +42,9 @@ files_panfrost = files(
   'pan_blend_shaders.c',
   'pan_wallpaper.c',
   'pan_pretty_print.c',
+  'pan_fragment.c',
+  'pan_sfbd.c',
+  'pan_mfbd.c'
 )
 
 inc_panfrost = [
diff --git a/src/gallium/drivers/panfrost/pan_context.c 
b/src/gallium/drivers/panfrost/pan_context.c
index cbf97daf6e5..48e471eace2 100644
--- a/src/gallium/drivers/panfrost/pan_context.c
+++ b/src/gallium/drivers/panfrost/pan_context.c
@@ -50,30 +50,17 @@ extern const char *pan_counters_base;
 /* Do not actually send anything to the GPU; merely generate the cmdstream as 
fast as possible. Disables framebuffer writes */
 //#define DRY_RUN
 
-#define SET_BIT(lval, bit, cond) \
-   if (cond) \
-   lval |= (bit); \
-   else \
-   lval &= ~(bit);
-
 /* TODO: Sample size, etc */
 
 static void
 panfrost_set_framebuffer_msaa(struct panfrost_context *ctx, bool enabled)
 {
-SET_BIT(ctx->fragment_shader_core.unknown2_3, MALI_HAS_MSAA, enabled);
-SET_BIT(ctx->fragment_shader_core.unknown2_4, MALI_NO_MSAA, !enabled);
-
-if (ctx->require_sfbd) {
-SET_BIT(ctx->fragment_sfbd.format, MALI_FRAMEBUFFER_MSAA_A | 
MALI_FRAMEBUFFER_MSAA_B, enabled);
-} else {
-SET_BIT(ctx->fragment_rts[0].format.flags, 
MALI_MFBD_FORMAT_MSAA, enabled);
+struct panfrost_job *job = panfrost_get_job_for_fbo(ctx);
 
-SET_BIT(ctx->fragment_mfbd.unk1, (1 << 4) | (1 << 1), enabled);
+job->msaa |= enabled;
 
-/* XXX */
-ctx->fragment_mfbd.rt_count_2 = enabled ? 4 : 1;
-}
+SET_BIT(ctx->fragment_shader_core.unknown2_3, MALI_HAS_MSAA, enabled);
+SET_BIT(ctx->fragment_shader_core.unknown2_4, MALI_NO_MSAA, !enabled);
 }
 
 /* AFBC is enabled on a per-resource basis (AFBC enabling is theoretically
@@ -131,160 +118,6 @@ panfrost_enable_checksum(struct panfrost_context *ctx, 
struct panfrost_resource
 rsrc->bo->has_checksum = true;
 }
 
-static unsigned
-panfrost_sfbd_format_for_surface(struct pipe_surface *surf)
-{
-/* TODO */
-return 0xb84e0281; /* RGB32, no MSAA */
-}
-
-static struct mali_rt_format
-panfrost_mfbd_format_for_surface(struct pipe_surface *surf)
-{
-/* Explode details on the format */
-
-const struct util_format_description *desc =
-util_format_description(surf->texture->format);
-
-/* Fill in accordingly */
-
-struct mali_rt_format fmt = {
-.unk1 = 0x400,
-.unk2 = 0x1,
-.nr_channels = MALI_POSITIVE(desc->nr_channels),
-.flags = 0x444,
-.swizzle = panfrost_translate_swizzle_4(desc->swizzle),
-.unk4 = 0x8
-};
-
-return fmt;
-}
-
-static bool panfrost_is_scanout(struct panfrost_context *ctx);
-
-/* These routines link a fragment job with the bound surface, accounting for 
the
- * BO layout. This routine runs per-frame */
-
-static void
-panfrost_set_fragment_target_cbuf(
-struct panfrost_context *ctx,
-struct pipe_surface *surf,
-unsigned cb)
-{
-struct panfrost_resource *rsrc = pan_resource(surf->texture);
-
-signed stride =
-util_format_get_stride(surf->format, surf->texture->width0);
-
-/* First, we set the format bits */
-
-if (ctx->require_sfbd) {
-ctx->fragment_sfbd.format =
-panfrost_sfbd_format_for_surface(surf);
-} else {
-ctx->fragment_rts[cb].format =
-panfrost_mfbd_format_for_surface(surf);
-}
-
-/* Now, we set the layout specific pieces */
-
-if (rsrc->bo->layout == PAN_LINEAR) {
-mali_ptr framebuffer = rsrc->bo->gpu[0];
-
-/* The default is upside 

[Mesa-dev] [PATCH 03/10] panfrost: Remove staging MFBD

2019-03-12 Thread Alyssa Rosenzweig
Same idea as the previous commit, but for the MFBD this time instead of
the SFBD.

Signed-off-by: Alyssa Rosenzweig 
---
 src/gallium/drivers/panfrost/pan_context.h |  13 --
 src/gallium/drivers/panfrost/pan_mfbd.c| 194 +++--
 2 files changed, 98 insertions(+), 109 deletions(-)

diff --git a/src/gallium/drivers/panfrost/pan_context.h 
b/src/gallium/drivers/panfrost/pan_context.h
index a3c87199d00..15b9819484f 100644
--- a/src/gallium/drivers/panfrost/pan_context.h
+++ b/src/gallium/drivers/panfrost/pan_context.h
@@ -134,19 +134,6 @@ struct panfrost_context {
 
 struct panfrost_query *occlusion_query;
 
-/* Each render job has multiple framebuffer descriptors associated with
- * it, used for various purposes with more or less the same format. The
- * most obvious is the fragment framebuffer descriptor, which carries
- * e.g. clearing information */
-
-union {
-struct {
-struct bifrost_framebuffer fragment_mfbd;
-struct bifrost_fb_extra fragment_extra;
-struct bifrost_render_target fragment_rts[4];
-};
-};
-
 /* Each draw has corresponding vertex and tiler payloads */
 struct midgard_payload_vertex_tiler payload_vertex;
 struct midgard_payload_vertex_tiler payload_tiler;
diff --git a/src/gallium/drivers/panfrost/pan_mfbd.c 
b/src/gallium/drivers/panfrost/pan_mfbd.c
index cb36a374063..a141fd314c0 100644
--- a/src/gallium/drivers/panfrost/pan_mfbd.c
+++ b/src/gallium/drivers/panfrost/pan_mfbd.c
@@ -50,159 +50,135 @@ panfrost_mfbd_format(struct pipe_surface *surf)
 return fmt;
 }
 
-static void
-panfrost_mfbd_enable_msaa(struct panfrost_context *ctx)
-{
-ctx->fragment_rts[0].format.flags |= MALI_MFBD_FORMAT_MSAA;
-
-/* XXX */
-ctx->fragment_mfbd.unk1 |= (1 << 4) | (1 << 1);
-ctx->fragment_mfbd.rt_count_2 = 4;
-}
 
 static void
-panfrost_mfbd_clear(struct panfrost_job *job)
+panfrost_mfbd_clear(
+struct panfrost_job *job,
+struct bifrost_framebuffer *fb,
+struct bifrost_fb_extra *fbx,
+struct bifrost_render_target *rt)
 {
 struct panfrost_context *ctx = job->ctx;
-struct bifrost_render_target *buffer_color = >fragment_rts[0];
-struct bifrost_framebuffer *buffer_ds = >fragment_mfbd;
 
 if (job->clear & PIPE_CLEAR_COLOR) {
-buffer_color->clear_color_1 = job->clear_color;
-buffer_color->clear_color_2 = job->clear_color;
-buffer_color->clear_color_3 = job->clear_color;
-buffer_color->clear_color_4 = job->clear_color;
+rt->clear_color_1 = job->clear_color;
+rt->clear_color_2 = job->clear_color;
+rt->clear_color_3 = job->clear_color;
+rt->clear_color_4 = job->clear_color;
 }
 
 if (job->clear & PIPE_CLEAR_DEPTH) {
-buffer_ds->clear_depth = job->clear_depth;
+fb->clear_depth = job->clear_depth;
 }
 
 if (job->clear & PIPE_CLEAR_STENCIL) {
-buffer_ds->clear_stencil = job->clear_stencil;
+fb->clear_stencil = job->clear_stencil;
 }
 
 if (job->clear & (PIPE_CLEAR_DEPTH | PIPE_CLEAR_STENCIL)) {
 /* Setup combined 24/8 depth/stencil */
-ctx->fragment_mfbd.unk3 |= MALI_MFBD_EXTRA;
-ctx->fragment_extra.flags = 0x405;
-ctx->fragment_extra.ds_linear.depth = 
ctx->depth_stencil_buffer.gpu;
-ctx->fragment_extra.ds_linear.depth_stride = 
ctx->pipe_framebuffer.width * 4;
+fb->unk3 |= MALI_MFBD_EXTRA;
+fbx->flags = 0x405;
+fbx->ds_linear.depth = ctx->depth_stencil_buffer.gpu;
+fbx->ds_linear.depth_stride = ctx->pipe_framebuffer.width * 4;
 }
 }
 
 static void
 panfrost_mfbd_set_cbuf(
-struct panfrost_context *ctx,
+struct bifrost_render_target *rt,
 struct pipe_surface *surf,
-unsigned cb)
+bool flip_y)
 {
 struct panfrost_resource *rsrc = pan_resource(surf->texture);
 
 signed stride =
 util_format_get_stride(surf->format, surf->texture->width0);
 
-ctx->fragment_rts[cb].format = panfrost_mfbd_format(surf);
+rt->format = panfrost_mfbd_format(surf);
 
 /* Now, we set the layout specific pieces */
 
 if (rsrc->bo->layout == PAN_LINEAR) {
 mali_ptr framebuffer = rsrc->bo->gpu[0];
 
-/* The default is upside down from OpenGL's perspective. */
-if (panfrost_is_scanout(ctx)) {
+if (flip_y) {
 framebuffer += stride * (surf->texture->height0 - 1);
 stride = -stride;

Re: [Mesa-dev] [PATCH 11/11] ac: use new LLVM 8 intrinsics in ac_build_buffer_store_dword()

2019-03-12 Thread Timothy Arceri

This one causes 2000+ piglit tests to fail on radeonsi. For example:

./bin/shader_runner 
generated_tests/spec/arb_gpu_shader_fp64/execution/conversion/geom-conversion-explicit-bool-double.shader_test 
-auto -fbo


On 13/3/19 3:19 am, Samuel Pitoiset wrote:

New buffer intrinsics have a separate soffset parameter.

Signed-off-by: Samuel Pitoiset 
---
  src/amd/common/ac_llvm_build.c | 66 ++
  1 file changed, 26 insertions(+), 40 deletions(-)

diff --git a/src/amd/common/ac_llvm_build.c b/src/amd/common/ac_llvm_build.c
index ce6639d49bf..8ed5199da55 100644
--- a/src/amd/common/ac_llvm_build.c
+++ b/src/amd/common/ac_llvm_build.c
@@ -1227,59 +1227,45 @@ ac_build_buffer_store_dword(struct ac_llvm_context *ctx,
if (!swizzle_enable_hint) {
LLVMValueRef offset = soffset;
  
-		static const char *types[] = {"f32", "v2f32", "v4f32"};

-
if (inst_offset)
offset = LLVMBuildAdd(ctx->builder, offset,
  LLVMConstInt(ctx->i32, inst_offset, 0), 
"");
-   if (voffset)
-   offset = LLVMBuildAdd(ctx->builder, offset, voffset, 
"");
-
-   LLVMValueRef args[] = {
-   ac_to_float(ctx, vdata),
-   LLVMBuildBitCast(ctx->builder, rsrc, ctx->v4i32, ""),
-   ctx->i32_0,
-   offset,
-   LLVMConstInt(ctx->i1, glc, 0),
-   LLVMConstInt(ctx->i1, slc, 0),
-   };
-
-   char name[256];
-   snprintf(name, sizeof(name), "llvm.amdgcn.buffer.store.%s",
-types[CLAMP(num_channels, 1, 3) - 1]);
  
-		ac_build_intrinsic(ctx, name, ctx->voidt,

-  args, ARRAY_SIZE(args),
-  ac_get_store_intr_attribs(writeonly_memory));
+   if (HAVE_LLVM >= 0x800) {
+   ac_build_llvm8_buffer_store_common(ctx, rsrc,
+  ac_to_float(ctx, 
vdata),
+  ctx->i32_0,
+  voffset, offset,
+  num_channels,
+  glc, slc,
+  writeonly_memory,
+  false, true);
+   } else {
+   if (voffset)
+   offset = LLVMBuildAdd(ctx->builder, offset, voffset, 
"");
+
+   ac_build_buffer_store_common(ctx, rsrc,
+ac_to_float(ctx, vdata),
+ctx->i32_0, offset,
+num_channels, glc, slc,
+writeonly_memory, false);
+   }
return;
}
  
-	static const unsigned dfmt[] = {

+   static const unsigned dfmts[] = {
V_008F0C_BUF_DATA_FORMAT_32,
V_008F0C_BUF_DATA_FORMAT_32_32,
V_008F0C_BUF_DATA_FORMAT_32_32_32,
V_008F0C_BUF_DATA_FORMAT_32_32_32_32
};
-   static const char *types[] = {"i32", "v2i32", "v4i32"};
-   LLVMValueRef args[] = {
-   vdata,
-   LLVMBuildBitCast(ctx->builder, rsrc, ctx->v4i32, ""),
-   ctx->i32_0,
-   voffset ? voffset : ctx->i32_0,
-   soffset,
-   LLVMConstInt(ctx->i32, inst_offset, 0),
-   LLVMConstInt(ctx->i32, dfmt[num_channels - 1], 0),
-   LLVMConstInt(ctx->i32, V_008F0C_BUF_NUM_FORMAT_UINT, 0),
-   LLVMConstInt(ctx->i1, glc, 0),
-   LLVMConstInt(ctx->i1, slc, 0),
-   };
-   char name[256];
-   snprintf(name, sizeof(name), "llvm.amdgcn.tbuffer.store.%s",
-types[CLAMP(num_channels, 1, 3) - 1]);
+   unsigned dfmt = dfmts[num_channels - 1];
+   unsigned nfmt = V_008F0C_BUF_NUM_FORMAT_UINT;
+   LLVMValueRef immoffset = LLVMConstInt(ctx->i32, inst_offset, 0);
  
-	ac_build_intrinsic(ctx, name, ctx->voidt,

-  args, ARRAY_SIZE(args),
-  ac_get_store_intr_attribs(writeonly_memory));
+   ac_build_tbuffer_store(ctx, rsrc, vdata, ctx->i32_0, voffset, soffset,
+  immoffset, num_channels, dfmt, nfmt, glc, slc,
+  writeonly_memory);
  }
  
  static LLVMValueRef



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 1/2] panfrost: Set bo->size[0] in the DRM backend

2019-03-12 Thread Alyssa Rosenzweig
Both patches are:

Reviewed-by: Alyssa Rosenzweig 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] anv: Ignore VkRenderPassInputAttachementAspectCreateInfo

2019-03-12 Thread Sagar Ghuge
Reviewed-by: Sagar Ghuge 

On 3/12/19 4:21 PM, Jason Ekstrand wrote:
> We don't care about the information but there's no sense in throwing a
> debug warning about it.  It's harmless but annoying to users.
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109984
> Cc: Craig Stout 
> ---
>  src/intel/vulkan/anv_pass.c | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/src/intel/vulkan/anv_pass.c b/src/intel/vulkan/anv_pass.c
> index 02f2be60e02..5fac5bbb31c 100644
> --- a/src/intel/vulkan/anv_pass.c
> +++ b/src/intel/vulkan/anv_pass.c
> @@ -332,6 +332,10 @@ VkResult anv_CreateRenderPass(
>  
> vk_foreach_struct(ext, pCreateInfo->pNext) {
>switch (ext->sType) {
> +  case VK_STRUCTURE_TYPE_RENDER_PASS_INPUT_ATTACHMENT_ASPECT_CREATE_INFO:
> + /* We don't care about this information */
> + break;
> +
>case VK_STRUCTURE_TYPE_RENDER_PASS_MULTIVIEW_CREATE_INFO: {
>   VkRenderPassMultiviewCreateInfo *mv = (void *)ext;
>  
> 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] anv: Ignore VkRenderPassInputAttachementAspectCreateInfo

2019-03-12 Thread Jason Ekstrand
We don't care about the information but there's no sense in throwing a
debug warning about it.  It's harmless but annoying to users.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109984
Cc: Craig Stout 
---
 src/intel/vulkan/anv_pass.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/src/intel/vulkan/anv_pass.c b/src/intel/vulkan/anv_pass.c
index 02f2be60e02..5fac5bbb31c 100644
--- a/src/intel/vulkan/anv_pass.c
+++ b/src/intel/vulkan/anv_pass.c
@@ -332,6 +332,10 @@ VkResult anv_CreateRenderPass(
 
vk_foreach_struct(ext, pCreateInfo->pNext) {
   switch (ext->sType) {
+  case VK_STRUCTURE_TYPE_RENDER_PASS_INPUT_ATTACHMENT_ASPECT_CREATE_INFO:
+ /* We don't care about this information */
+ break;
+
   case VK_STRUCTURE_TYPE_RENDER_PASS_MULTIVIEW_CREATE_INFO: {
  VkRenderPassMultiviewCreateInfo *mv = (void *)ext;
 
-- 
2.20.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] RFC: Workaround for pthread_setaffinity_np() seccomp filtering

2019-03-12 Thread Marek Olšák
The env var workaround is fine.

Thread affinity is used for cache topology related optimizations. I think
it's a mistake to treat it only as a resource allocation tool.

Marek

On Tue, Mar 12, 2019, 1:59 AM Marc-André Lureau 
wrote:

> Hi
>
> On Fri, Mar 1, 2019 at 12:13 PM Mathias Fröhlich
>  wrote:
> >
> > On Friday, 1 March 2019 12:15:08 CET Eero Tamminen wrote:
> > > Hi,
> > >
> > > On 1.3.2019 11.12, Michel Dänzer wrote:
> > > > On 2019-02-28 8:41 p.m., Marek Olšák wrote:
> > > >>> On Thu, Feb 28, 2019 at 1:37 PM Eero Tamminen <
> eero.t.tammi...@intel.com>
> > >  Why distro versions of Qemu filter sched_setaffinity() syscall?
> > > >>>
> > > >>> (https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1815889)
> > > >>>
> > > >>> Daniel Berrange (berrange) wrote on 2019-02-27: #19
> > > >>>
> > > >>> "IMHO that mesa change is not valid. It is settings its affinity to
> > > >>> run on all threads which is definitely *NOT* something we want to
> be
> > > >>> allowed. Management applications want to control which CPUs QEMU
> runs
> > > >>> on, and as such Mesa should honour the CPU placement that the QEMU
> > > >>> process has.
> > > >>>
> > > >>> This is a great example of why QEMU wants to use seccomp to block
> > > >>> affinity changes to prevent something silently trying to use more
> CPUs
> > > >>> than are assigned to this QEMU."
> > > >>>
> > > >>
> > > >> Mesa uses thread affinity to optimize memory access performance on
> some
> > > >> CPUs (see util_pin_thread_to_L3). Other places in Mesa need to
> restore the
> > > >> original thread affinity for some child threads. Additionally, if
> games
> > > >> limit the thread affinity, Mesa needs to restore the full thread
> affinity
> > > >> for some of its child threads.
> > > >
> > > > The last part sounds like Mesa clearly overstepping its authority.
> > > >
> > > >
> > > >> In essence, the thread affinity should only be considered a hint
> for the
> > > >> kernel for optimal performance. There is no reason to kill the
> process if
> > > >> it's disallowed. Just ignore the call or modify the thread mask to
> make it
> > > >> legal.
> > > >
> > > > The fundamental issue here is that Mesa is using the thread affinity
> API
> > > > for something else than it's intended for. If there was an API for
> what
> > > > Mesa wants (encouraging certain sets of threads to run on
> topologically
> > > > close cores), there should be no need to block that.
> > >
> > > Why such process needs to be killed instead the request being masked
> > > suitably, is there some program that breaks subtly if affinity request
> > > is masked (and that being worse than the program being killed)?
> >
> > But that is still a situation that could be nicely handled with a
> > EPERM error return. Way better than just kill a process.
> > That 'badly affected' program still can call abort then.
> > But nicely working programs don't get just killed then!!
>
>
> Returning an error seems less secure that prohibiting it completely.
> And it may lead to subtle bugs in rarely tested code paths.
>
> It's legitimate that QEMU and management layers want to prevent
> arbitrary code from changing resource allocation etc.
>
> There are no easy way I can think of for mesa (and other libraries) to
> probe the seccomp filters and associated action.
>
> So we need a way to tell mesa not to call setaffinity() (and other
> syscalls). MESA_NO_THREAD_AFFINITY or MESA_NO_SYSCALLS=setaffinity,...
> seem like a relatively easy way to go.
>
> thanks
>
>
> --
> Marc-André Lureau
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v4 34/40] intel/compiler: validate region restrictions for half-float conversions

2019-03-12 Thread Francisco Jerez
Francisco Jerez  writes:

> Iago Toral  writes:
>
>> On Tue, 2019-03-05 at 07:35 +0100, Iago Toral wrote:
>>> On Mon, 2019-03-04 at 15:36 -0800, Francisco Jerez wrote:
>>> > Iago Toral  writes:
>>> > 
>>> > > On Fri, 2019-03-01 at 19:04 -0800, Francisco Jerez wrote:
>>> > > > Iago Toral  writes:
>>> > > > 
>>> > > > > On Thu, 2019-02-28 at 09:54 -0800, Francisco Jerez wrote:
>>> > > > > > Iago Toral  writes:
>>> > > > > > 
>>> > > > > > > On Wed, 2019-02-27 at 13:47 -0800, Francisco Jerez wrote:
>>> > > > > > > > Iago Toral  writes:
>>> > > > > > > > 
>>> > > > > > > > > On Tue, 2019-02-26 at 14:54 -0800, Francisco Jerez
>>> > > > > > > > > wrote:
>>> > > > > > > > > > Iago Toral Quiroga  writes:
>>> > > > > > > > > > 
>>> > > > > > > > > > > ---
>>> > > > > > > > > > >  src/intel/compiler/brw_eu_validate.c|  64
>>> > > > > > > > > > > -
>>> > > > > > > > > > >  src/intel/compiler/test_eu_validate.cpp | 122
>>> > > > > > > > > > > 
>>> > > > > > > > > > >  2 files changed, 185 insertions(+), 1 deletion(-
>>> > > > > > > > > > > )
>>> > > > > > > > > > > 
>>> > > > > > > > > > > diff --git a/src/intel/compiler/brw_eu_validate.c
>>> > > > > > > > > > > b/src/intel/compiler/brw_eu_validate.c
>>> > > > > > > > > > > index 000a05cb6ac..203641fecb9 100644
>>> > > > > > > > > > > --- a/src/intel/compiler/brw_eu_validate.c
>>> > > > > > > > > > > +++ b/src/intel/compiler/brw_eu_validate.c
>>> > > > > > > > > > > @@ -531,7 +531,69 @@
>>> > > > > > > > > > > general_restrictions_based_on_operand_types(const
>>> > > > > > > > > > > struct
>>> > > > > > > > > > > gen_device_info *devinf
>>> > > > > > > > > > > exec_type_size == 8 && dst_type_size ==
>>> > > > > > > > > > > 4)
>>> > > > > > > > > > >dst_type_size = 8;
>>> > > > > > > > > > >  
>>> > > > > > > > > > > -   if (exec_type_size > dst_type_size) {
>>> > > > > > > > > > > +   /* From the BDW+ PRM:
>>> > > > > > > > > > > +*
>>> > > > > > > > > > > +*"There is no direct conversion from HF
>>> > > > > > > > > > > to
>>> > > > > > > > > > > DF
>>> > > > > > > > > > > or
>>> > > > > > > > > > > DF to
>>> > > > > > > > > > > HF.
>>> > > > > > > > > > > +* There is no direct conversion from HF
>>> > > > > > > > > > > to
>>> > > > > > > > > > > Q/UQ or
>>> > > > > > > > > > > Q/UQ to
>>> > > > > > > > > > > HF."
>>> > > > > > > > > > > +*/
>>> > > > > > > > > > > +   enum brw_reg_type src0_type =
>>> > > > > > > > > > > brw_inst_src0_type(devinfo,
>>> > > > > > > > > > > inst);
>>> > > > > > > > > > > +   ERROR_IF(brw_inst_opcode(devinfo, inst) ==
>>> > > > > > > > > > > BRW_OPCODE_MOV
>>> > > > > > > > > > > &&
>>> > > > > > > > > > 
>>> > > > > > > > > > Why is only the MOV instruction handled here and
>>> > > > > > > > > > below?  Aren't
>>> > > > > > > > > > other
>>> > > > > > > > > > instructions able to do implicit
>>> > > > > > > > > > conversions?  Probably
>>> > > > > > > > > > means
>>> > > > > > > > > > you
>>> > > > > > > > > > need
>>> > > > > > > > > > to deal with two sources rather than one.
>>> > > > > > > > > 
>>> > > > > > > > > This comes from the programming notes of the MOV
>>> > > > > > > > > instruction
>>> > > > > > > > > (Volume
>>> > > > > > > > > 2a, Command Reference - Instructions - MOV), so it is
>>> > > > > > > > > described
>>> > > > > > > > > specifically for the MOV instruction. I should
>>> > > > > > > > > probably
>>> > > > > > > > > have
>>> > > > > > > > > made
>>> > > > > > > > > this
>>> > > > > > > > > clear in the comment.
>>> > > > > > > > > 
>>> > > > > > > > 
>>> > > > > > > > Maybe the one above is specified in the MOV page only,
>>> > > > > > > > probably
>>> > > > > > > > due
>>> > > > > > > > to
>>> > > > > > > > an oversight (If these restrictions were really
>>> > > > > > > > specific
>>> > > > > > > > to
>>> > > > > > > > the
>>> > > > > > > > MOV
>>> > > > > > > > instruction, what would prevent you from implementing
>>> > > > > > > > such
>>> > > > > > > > conversions
>>> > > > > > > > through a different instruction?  E.g. "ADD dst:df,
>>> > > > > > > > src:hf,
>>> > > > > > > > 0"
>>> > > > > > > > which
>>> > > > > > > > would be substantially more efficient than what you're
>>> > > > > > > > doing
>>> > > > > > > > in
>>> > > > > > > > PATCH
>>> > > > > > > > 02)
>>> > > > > > > 
>>> > > > > > > Instructions that take HF can only be strictly HF or mix
>>> > > > > > > F
>>> > > > > > > and
>>> > > > > > > HF
>>> > > > > > > (mixed mode float), with MOV being the only exception.
>>> > > > > > > That
>>> > > > > > > means
>>> > > > > > > that
>>> > > > > > > any instruction like the one you use above are invalid.
>>> > > > > > > Maybe
>>> > > > > > > we
>>> > > > > > > should
>>> > > > > > > validate explicitly that instructions that use HF are
>>> > > > > > > strictly
>>> > > > > > > HF
>>> > > > > > > or
>>> > > > > > > mixed-float mode only?
>>> > > > > > > 
>>> > > > > > 
>>> > > > > > So you're acknowledging that the 

Re: [Mesa-dev] [PATCH v4 34/40] intel/compiler: validate region restrictions for half-float conversions

2019-03-12 Thread Francisco Jerez
Iago Toral  writes:

> On Wed, 2019-03-06 at 09:21 +0100, Iago Toral wrote:
>> On Tue, 2019-03-05 at 07:35 +0100, Iago Toral wrote:
>> > On Mon, 2019-03-04 at 15:36 -0800, Francisco Jerez wrote:
>> > > Iago Toral  writes:
>> > > 
>> > > > On Fri, 2019-03-01 at 19:04 -0800, Francisco Jerez wrote:
>> > > > > Iago Toral  writes:
>> > > > > 
>> > > > > > On Thu, 2019-02-28 at 09:54 -0800, Francisco Jerez wrote:
>> > > > > > > Iago Toral  writes:
>> > > > > > > 
>> > > > > > > > On Wed, 2019-02-27 at 13:47 -0800, Francisco Jerez
>> > > > > > > > wrote:
>> > > > > > > > > Iago Toral  writes:
>> > > > > > > > > 
>> > > > > > > > > > On Tue, 2019-02-26 at 14:54 -0800, Francisco Jerez
>> > > > > > > > > > wrote:
>> > > > > > > > > > > Iago Toral Quiroga  writes:
>> > > > > > > > > > > 
>> > > > > > > > > > > > ---
>> > > > > > > > > > > >  src/intel/compiler/brw_eu_validate.c|  64
>> > > > > > > > > > > > -
>> > > > > > > > > > > >  src/intel/compiler/test_eu_validate.cpp | 122
>> > > > > > > > > > > > 
>> > > > > > > > > > > >  2 files changed, 185 insertions(+), 1
>> > > > > > > > > > > > deletion(-
>> > > > > > > > > > > > )
>> > > > > > > > > > > > 
>> > > > > > > > > > > > diff --git
>> > > > > > > > > > > > a/src/intel/compiler/brw_eu_validate.c
>> > > > > > > > > > > > b/src/intel/compiler/brw_eu_validate.c
>> > > > > > > > > > > > index 000a05cb6ac..203641fecb9 100644
>> > > > > > > > > > > > --- a/src/intel/compiler/brw_eu_validate.c
>> > > > > > > > > > > > +++ b/src/intel/compiler/brw_eu_validate.c
>> > > > > > > > > > > > @@ -531,7 +531,69 @@
>> > > > > > > > > > > > general_restrictions_based_on_operand_types(con
>> > > > > > > > > > > > st
>> > > > > > > > > > > > struct
>> > > > > > > > > > > > gen_device_info *devinf
>> > > > > > > > > > > > exec_type_size == 8 && dst_type_size ==
>> > > > > > > > > > > > 4)
>> > > > > > > > > > > >dst_type_size = 8;
>> > > > > > > > > > > >  
>> > > > > > > > > > > > -   if (exec_type_size > dst_type_size) {
>> > > > > > > > > > > > +   /* From the BDW+ PRM:
>> > > > > > > > > > > > +*
>> > > > > > > > > > > > +*"There is no direct conversion from
>> > > > > > > > > > > > HF
>> > > > > > > > > > > > to
>> > > > > > > > > > > > DF
>> > > > > > > > > > > > or
>> > > > > > > > > > > > DF to
>> > > > > > > > > > > > HF.
>> > > > > > > > > > > > +* There is no direct conversion from
>> > > > > > > > > > > > HF
>> > > > > > > > > > > > to
>> > > > > > > > > > > > Q/UQ or
>> > > > > > > > > > > > Q/UQ to
>> > > > > > > > > > > > HF."
>> > > > > > > > > > > > +*/
>> > > > > > > > > > > > +   enum brw_reg_type src0_type =
>> > > > > > > > > > > > brw_inst_src0_type(devinfo,
>> > > > > > > > > > > > inst);
>> > > > > > > > > > > > +   ERROR_IF(brw_inst_opcode(devinfo, inst) ==
>> > > > > > > > > > > > BRW_OPCODE_MOV
>> > > > > > > > > > > > &&
>> > > > > > > > > > > 
>> > > > > > > > > > > Why is only the MOV instruction handled here and
>> > > > > > > > > > > below?  Aren't
>> > > > > > > > > > > other
>> > > > > > > > > > > instructions able to do implicit
>> > > > > > > > > > > conversions?  Probably
>> > > > > > > > > > > means
>> > > > > > > > > > > you
>> > > > > > > > > > > need
>> > > > > > > > > > > to deal with two sources rather than one.
>> > > > > > > > > > 
>> > > > > > > > > > This comes from the programming notes of the MOV
>> > > > > > > > > > instruction
>> > > > > > > > > > (Volume
>> > > > > > > > > > 2a, Command Reference - Instructions - MOV), so it
>> > > > > > > > > > is
>> > > > > > > > > > described
>> > > > > > > > > > specifically for the MOV instruction. I should
>> > > > > > > > > > probably
>> > > > > > > > > > have
>> > > > > > > > > > made
>> > > > > > > > > > this
>> > > > > > > > > > clear in the comment.
>> > > > > > > > > > 
>> > > > > > > > > 
>> > > > > > > > > Maybe the one above is specified in the MOV page
>> > > > > > > > > only,
>> > > > > > > > > probably
>> > > > > > > > > due
>> > > > > > > > > to
>> > > > > > > > > an oversight (If these restrictions were really
>> > > > > > > > > specific
>> > > > > > > > > to
>> > > > > > > > > the
>> > > > > > > > > MOV
>> > > > > > > > > instruction, what would prevent you from implementing
>> > > > > > > > > such
>> > > > > > > > > conversions
>> > > > > > > > > through a different instruction?  E.g. "ADD dst:df,
>> > > > > > > > > src:hf,
>> > > > > > > > > 0"
>> > > > > > > > > which
>> > > > > > > > > would be substantially more efficient than what
>> > > > > > > > > you're
>> > > > > > > > > doing
>> > > > > > > > > in
>> > > > > > > > > PATCH
>> > > > > > > > > 02)
>> > > > > > > > 
>> > > > > > > > Instructions that take HF can only be strictly HF or
>> > > > > > > > mix
>> > > > > > > > F
>> > > > > > > > and
>> > > > > > > > HF
>> > > > > > > > (mixed mode float), with MOV being the only exception.
>> > > > > > > > That
>> > > > > > > > means
>> > > > > > > > that
>> > > > > > > > any instruction like 

Re: [Mesa-dev] [PATCH v4 34/40] intel/compiler: validate region restrictions for half-float conversions

2019-03-12 Thread Francisco Jerez
Iago Toral  writes:

> On Tue, 2019-03-05 at 07:35 +0100, Iago Toral wrote:
>> On Mon, 2019-03-04 at 15:36 -0800, Francisco Jerez wrote:
>> > Iago Toral  writes:
>> > 
>> > > On Fri, 2019-03-01 at 19:04 -0800, Francisco Jerez wrote:
>> > > > Iago Toral  writes:
>> > > > 
>> > > > > On Thu, 2019-02-28 at 09:54 -0800, Francisco Jerez wrote:
>> > > > > > Iago Toral  writes:
>> > > > > > 
>> > > > > > > On Wed, 2019-02-27 at 13:47 -0800, Francisco Jerez wrote:
>> > > > > > > > Iago Toral  writes:
>> > > > > > > > 
>> > > > > > > > > On Tue, 2019-02-26 at 14:54 -0800, Francisco Jerez
>> > > > > > > > > wrote:
>> > > > > > > > > > Iago Toral Quiroga  writes:
>> > > > > > > > > > 
>> > > > > > > > > > > ---
>> > > > > > > > > > >  src/intel/compiler/brw_eu_validate.c|  64
>> > > > > > > > > > > -
>> > > > > > > > > > >  src/intel/compiler/test_eu_validate.cpp | 122
>> > > > > > > > > > > 
>> > > > > > > > > > >  2 files changed, 185 insertions(+), 1 deletion(-
>> > > > > > > > > > > )
>> > > > > > > > > > > 
>> > > > > > > > > > > diff --git a/src/intel/compiler/brw_eu_validate.c
>> > > > > > > > > > > b/src/intel/compiler/brw_eu_validate.c
>> > > > > > > > > > > index 000a05cb6ac..203641fecb9 100644
>> > > > > > > > > > > --- a/src/intel/compiler/brw_eu_validate.c
>> > > > > > > > > > > +++ b/src/intel/compiler/brw_eu_validate.c
>> > > > > > > > > > > @@ -531,7 +531,69 @@
>> > > > > > > > > > > general_restrictions_based_on_operand_types(const
>> > > > > > > > > > > struct
>> > > > > > > > > > > gen_device_info *devinf
>> > > > > > > > > > > exec_type_size == 8 && dst_type_size ==
>> > > > > > > > > > > 4)
>> > > > > > > > > > >dst_type_size = 8;
>> > > > > > > > > > >  
>> > > > > > > > > > > -   if (exec_type_size > dst_type_size) {
>> > > > > > > > > > > +   /* From the BDW+ PRM:
>> > > > > > > > > > > +*
>> > > > > > > > > > > +*"There is no direct conversion from HF
>> > > > > > > > > > > to
>> > > > > > > > > > > DF
>> > > > > > > > > > > or
>> > > > > > > > > > > DF to
>> > > > > > > > > > > HF.
>> > > > > > > > > > > +* There is no direct conversion from HF
>> > > > > > > > > > > to
>> > > > > > > > > > > Q/UQ or
>> > > > > > > > > > > Q/UQ to
>> > > > > > > > > > > HF."
>> > > > > > > > > > > +*/
>> > > > > > > > > > > +   enum brw_reg_type src0_type =
>> > > > > > > > > > > brw_inst_src0_type(devinfo,
>> > > > > > > > > > > inst);
>> > > > > > > > > > > +   ERROR_IF(brw_inst_opcode(devinfo, inst) ==
>> > > > > > > > > > > BRW_OPCODE_MOV
>> > > > > > > > > > > &&
>> > > > > > > > > > 
>> > > > > > > > > > Why is only the MOV instruction handled here and
>> > > > > > > > > > below?  Aren't
>> > > > > > > > > > other
>> > > > > > > > > > instructions able to do implicit
>> > > > > > > > > > conversions?  Probably
>> > > > > > > > > > means
>> > > > > > > > > > you
>> > > > > > > > > > need
>> > > > > > > > > > to deal with two sources rather than one.
>> > > > > > > > > 
>> > > > > > > > > This comes from the programming notes of the MOV
>> > > > > > > > > instruction
>> > > > > > > > > (Volume
>> > > > > > > > > 2a, Command Reference - Instructions - MOV), so it is
>> > > > > > > > > described
>> > > > > > > > > specifically for the MOV instruction. I should
>> > > > > > > > > probably
>> > > > > > > > > have
>> > > > > > > > > made
>> > > > > > > > > this
>> > > > > > > > > clear in the comment.
>> > > > > > > > > 
>> > > > > > > > 
>> > > > > > > > Maybe the one above is specified in the MOV page only,
>> > > > > > > > probably
>> > > > > > > > due
>> > > > > > > > to
>> > > > > > > > an oversight (If these restrictions were really
>> > > > > > > > specific
>> > > > > > > > to
>> > > > > > > > the
>> > > > > > > > MOV
>> > > > > > > > instruction, what would prevent you from implementing
>> > > > > > > > such
>> > > > > > > > conversions
>> > > > > > > > through a different instruction?  E.g. "ADD dst:df,
>> > > > > > > > src:hf,
>> > > > > > > > 0"
>> > > > > > > > which
>> > > > > > > > would be substantially more efficient than what you're
>> > > > > > > > doing
>> > > > > > > > in
>> > > > > > > > PATCH
>> > > > > > > > 02)
>> > > > > > > 
>> > > > > > > Instructions that take HF can only be strictly HF or mix
>> > > > > > > F
>> > > > > > > and
>> > > > > > > HF
>> > > > > > > (mixed mode float), with MOV being the only exception.
>> > > > > > > That
>> > > > > > > means
>> > > > > > > that
>> > > > > > > any instruction like the one you use above are invalid.
>> > > > > > > Maybe
>> > > > > > > we
>> > > > > > > should
>> > > > > > > validate explicitly that instructions that use HF are
>> > > > > > > strictly
>> > > > > > > HF
>> > > > > > > or
>> > > > > > > mixed-float mode only?
>> > > > > > > 
>> > > > > > 
>> > > > > > So you're acknowledging that the conversions checked above
>> > > > > > are
>> > > > > > illegal
>> > > > > > whether the instruction is a MOV or something else.  Where
>> > > > > > are we
>> > > 

Re: [Mesa-dev] [PATCH] nir/loop_unroll: Fix out-of-bounds access handling

2019-03-12 Thread Timothy Arceri



On 13/3/19 8:30 am, Jason Ekstrand wrote:

The previous code was completely broken when it came to constructing the
undef values.  I'm not sure how it ever worked.  For the case of a copy
that reads an undefined value, we can just delete the copy because the
destination is a valid undefined value.  This saves us the effort of
trying to construct a value for an arbitrary copy_deref intrinsic.


Yes. It turns out this was indeed not required. Thanks for the fix!

Reviewed-by: Timothy Arceri 



Fixes: e8a8937a04 "nir: add partial loop unrolling support"
Cc: Timothy Arceri 
---
  src/compiler/nir/nir_opt_loop_unroll.c | 14 ++
  1 file changed, 2 insertions(+), 12 deletions(-)

diff --git a/src/compiler/nir/nir_opt_loop_unroll.c 
b/src/compiler/nir/nir_opt_loop_unroll.c
index 06ec78b8068..b2696d4aadb 100644
--- a/src/compiler/nir/nir_opt_loop_unroll.c
+++ b/src/compiler/nir/nir_opt_loop_unroll.c
@@ -670,11 +670,9 @@ remove_out_of_bounds_induction_use(nir_shader *shader, 
nir_loop *loop,
  if (is_access_out_of_bounds(term, 
nir_src_as_deref(intrin->src[0]),
  trip_count)) {
 if (intrin->intrinsic == nir_intrinsic_load_deref) {
-  assert(intrin->src[0].is_ssa);
-  nir_ssa_def *a_ssa = intrin->src[0].ssa;
nir_ssa_def *undef =
- nir_ssa_undef(, intrin->num_components,
-   a_ssa->bit_size);
+ nir_ssa_undef(, intrin->dest.ssa.num_components,
+   intrin->dest.ssa.bit_size);
nir_ssa_def_rewrite_uses(>dest.ssa,
 nir_src_for_ssa(undef));
 } else {
@@ -686,14 +684,6 @@ remove_out_of_bounds_induction_use(nir_shader *shader, 
nir_loop *loop,
  if (intrin->intrinsic == nir_intrinsic_copy_deref &&
  is_access_out_of_bounds(term, 
nir_src_as_deref(intrin->src[1]),
  trip_count)) {
-   assert(intrin->src[1].is_ssa);
-   nir_ssa_def *a_ssa = intrin->src[1].ssa;
-   nir_ssa_def *undef =
-  nir_ssa_undef(, intrin->num_components, a_ssa->bit_size);
-
-   /* Replace the copy with a store of the undefined value */
-   b.cursor = nir_before_instr(instr);
-   nir_store_deref(, nir_src_as_deref(intrin->src[0]), undef, 
~0);
 nir_instr_remove(instr);
  }
   }


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v4 34/40] intel/compiler: validate region restrictions for half-float conversions

2019-03-12 Thread Francisco Jerez
Iago Toral  writes:

> On Mon, 2019-03-04 at 15:36 -0800, Francisco Jerez wrote:
>> Iago Toral  writes:
>> 
>> > On Fri, 2019-03-01 at 19:04 -0800, Francisco Jerez wrote:
>> > > Iago Toral  writes:
>> > > 
>> > > > On Thu, 2019-02-28 at 09:54 -0800, Francisco Jerez wrote:
>> > > > > Iago Toral  writes:
>> > > > > 
>> > > > > > On Wed, 2019-02-27 at 13:47 -0800, Francisco Jerez wrote:
>> > > > > > > Iago Toral  writes:
>> > > > > > > 
>> > > > > > > > On Tue, 2019-02-26 at 14:54 -0800, Francisco Jerez
>> > > > > > > > wrote:
>> > > > > > > > > Iago Toral Quiroga  writes:
>> > > > > > > > > 
>> > > > > > > > > > ---
>> > > > > > > > > >  src/intel/compiler/brw_eu_validate.c|  64
>> > > > > > > > > > -
>> > > > > > > > > >  src/intel/compiler/test_eu_validate.cpp | 122
>> > > > > > > > > > 
>> > > > > > > > > >  2 files changed, 185 insertions(+), 1 deletion(-)
>> > > > > > > > > > 
>> > > > > > > > > > diff --git a/src/intel/compiler/brw_eu_validate.c
>> > > > > > > > > > b/src/intel/compiler/brw_eu_validate.c
>> > > > > > > > > > index 000a05cb6ac..203641fecb9 100644
>> > > > > > > > > > --- a/src/intel/compiler/brw_eu_validate.c
>> > > > > > > > > > +++ b/src/intel/compiler/brw_eu_validate.c
>> > > > > > > > > > @@ -531,7 +531,69 @@
>> > > > > > > > > > general_restrictions_based_on_operand_types(const
>> > > > > > > > > > struct
>> > > > > > > > > > gen_device_info *devinf
>> > > > > > > > > > exec_type_size == 8 && dst_type_size == 4)
>> > > > > > > > > >dst_type_size = 8;
>> > > > > > > > > >  
>> > > > > > > > > > -   if (exec_type_size > dst_type_size) {
>> > > > > > > > > > +   /* From the BDW+ PRM:
>> > > > > > > > > > +*
>> > > > > > > > > > +*"There is no direct conversion from HF to
>> > > > > > > > > > DF
>> > > > > > > > > > or
>> > > > > > > > > > DF to
>> > > > > > > > > > HF.
>> > > > > > > > > > +* There is no direct conversion from HF to
>> > > > > > > > > > Q/UQ or
>> > > > > > > > > > Q/UQ to
>> > > > > > > > > > HF."
>> > > > > > > > > > +*/
>> > > > > > > > > > +   enum brw_reg_type src0_type =
>> > > > > > > > > > brw_inst_src0_type(devinfo,
>> > > > > > > > > > inst);
>> > > > > > > > > > +   ERROR_IF(brw_inst_opcode(devinfo, inst) ==
>> > > > > > > > > > BRW_OPCODE_MOV
>> > > > > > > > > > &&
>> > > > > > > > > 
>> > > > > > > > > Why is only the MOV instruction handled here and
>> > > > > > > > > below?  Aren't
>> > > > > > > > > other
>> > > > > > > > > instructions able to do implicit
>> > > > > > > > > conversions?  Probably
>> > > > > > > > > means
>> > > > > > > > > you
>> > > > > > > > > need
>> > > > > > > > > to deal with two sources rather than one.
>> > > > > > > > 
>> > > > > > > > This comes from the programming notes of the MOV
>> > > > > > > > instruction
>> > > > > > > > (Volume
>> > > > > > > > 2a, Command Reference - Instructions - MOV), so it is
>> > > > > > > > described
>> > > > > > > > specifically for the MOV instruction. I should probably
>> > > > > > > > have
>> > > > > > > > made
>> > > > > > > > this
>> > > > > > > > clear in the comment.
>> > > > > > > > 
>> > > > > > > 
>> > > > > > > Maybe the one above is specified in the MOV page only,
>> > > > > > > probably
>> > > > > > > due
>> > > > > > > to
>> > > > > > > an oversight (If these restrictions were really specific
>> > > > > > > to
>> > > > > > > the
>> > > > > > > MOV
>> > > > > > > instruction, what would prevent you from implementing
>> > > > > > > such
>> > > > > > > conversions
>> > > > > > > through a different instruction?  E.g. "ADD dst:df,
>> > > > > > > src:hf,
>> > > > > > > 0"
>> > > > > > > which
>> > > > > > > would be substantially more efficient than what you're
>> > > > > > > doing
>> > > > > > > in
>> > > > > > > PATCH
>> > > > > > > 02)
>> > > > > > 
>> > > > > > Instructions that take HF can only be strictly HF or mix F
>> > > > > > and
>> > > > > > HF
>> > > > > > (mixed mode float), with MOV being the only exception. That
>> > > > > > means
>> > > > > > that
>> > > > > > any instruction like the one you use above are invalid.
>> > > > > > Maybe
>> > > > > > we
>> > > > > > should
>> > > > > > validate explicitly that instructions that use HF are
>> > > > > > strictly
>> > > > > > HF
>> > > > > > or
>> > > > > > mixed-float mode only?
>> > > > > > 
>> > > > > 
>> > > > > So you're acknowledging that the conversions checked above
>> > > > > are
>> > > > > illegal
>> > > > > whether the instruction is a MOV or something else.  Where
>> > > > > are we
>> > > > > preventing instructions other than MOV with such conversions
>> > > > > from
>> > > > > being
>> > > > > accepted by the validator?
>> > > > 
>> > > > We aren't, because the validator is not checking, in general,
>> > > > for
>> > > > accepted type combinations for specific instructions anywhere
>> > > > as
>> > > > far as
>> > > > I can see.
>> > > 
>> > > Luckily these type conversion restrictions aren't really specific
>> > > to
>> > > any

Re: [Mesa-dev] [PATCH 04/11] ac: add ac_build_buffer_store_format() helper

2019-03-12 Thread Samuel Pitoiset


On 3/12/19 5:19 PM, Samuel Pitoiset wrote:

Similar to ac_build_buffer_load_format().

Signed-off-by: Samuel Pitoiset 
---
  src/amd/common/ac_llvm_build.c  | 100 
  src/amd/common/ac_llvm_build.h  |  11 
  src/amd/common/ac_nir_to_llvm.c |  29 +++--
  3 files changed, 119 insertions(+), 21 deletions(-)

diff --git a/src/amd/common/ac_llvm_build.c b/src/amd/common/ac_llvm_build.c
index 253073e52fb..bb8a470ae1d 100644
--- a/src/amd/common/ac_llvm_build.c
+++ b/src/amd/common/ac_llvm_build.c
@@ -1082,6 +1082,106 @@ LLVMValueRef 
ac_build_load_to_sgpr_uint_wraparound(struct ac_llvm_context *ctx,
return ac_build_load_custom(ctx, base_ptr, index, true, true, false);
  }
  
+static void

+ac_build_buffer_store_common(struct ac_llvm_context *ctx,
+LLVMValueRef rsrc,
+LLVMValueRef data,
+LLVMValueRef vindex,
+LLVMValueRef voffset,
+unsigned num_channels,
+bool glc,
+bool slc,
+bool writeonly_memory,
+bool use_format)
+{
+   LLVMValueRef args[] = {
+   data,
+   LLVMBuildBitCast(ctx->builder, rsrc, ctx->v4i32, ""),
+   vindex ? vindex : ctx->i32_0,
+   voffset,
+   LLVMConstInt(ctx->i1, glc, 0),
+   LLVMConstInt(ctx->i1, slc, 0)
+   };
+   unsigned func = CLAMP(num_channels, 1, 3) - 1;
+
+   const char *type_names[] = {"f32", "v2f32", "v4f32"};
+   char name[256];
+
+   if (use_format) {
+   snprintf(name, sizeof(name), 
"llvm.amdgcn.buffer.store.format.%s",
+type_names[func]);
+   } else {
+   snprintf(name, sizeof(name), "llvm.amdgcn.buffer.store.%s",
+type_names[func]);
+   }
+
+   ac_build_intrinsic(ctx, name, ctx->voidt, args, ARRAY_SIZE(args),
+  ac_get_store_intr_attribs(writeonly_memory));
+}
+
+static void
+ac_build_llvm8_buffer_store_common(struct ac_llvm_context *ctx,
+  LLVMValueRef rsrc,
+  LLVMValueRef data,
+  LLVMValueRef vindex,
+  LLVMValueRef voffset,
+  LLVMValueRef soffset,
+  unsigned num_channels,
+  bool glc,
+  bool slc,
+  bool writeonly_memory,
+  bool use_format,
+  bool structurized)
+{
+   LLVMValueRef args[5];

This should be 6.

+   int idx = 0;
+   args[idx++] = data;
+   args[idx++] = LLVMBuildBitCast(ctx->builder, rsrc, ctx->v4i32, "");
+   if (structurized)
+   args[idx++] = vindex ? vindex : ctx->i32_0;
+   args[idx++] = voffset ? voffset : ctx->i32_0;
+   args[idx++] = soffset ? soffset : ctx->i32_0;
+   args[idx++] = LLVMConstInt(ctx->i32, (glc ? 1 : 0) + (slc ? 2 : 0), 0);
+   unsigned func = CLAMP(num_channels, 1, 3) - 1;
+
+   const char *type_names[] = {"f32", "v2f32", "v4f32"};
+   const char *indexing_kind = structurized ? "struct" : "raw";
+   char name[256];
+
+   if (use_format) {
+   snprintf(name, sizeof(name), 
"llvm.amdgcn.%s.buffer.store.format.%s",
+indexing_kind, type_names[func]);
+   } else {
+   snprintf(name, sizeof(name), "llvm.amdgcn.%s.buffer.store.%s",
+indexing_kind, type_names[func]);
+   }
+
+   ac_build_intrinsic(ctx, name, ctx->voidt, args, idx,
+  ac_get_store_intr_attribs(writeonly_memory));
+}
+
+void
+ac_build_buffer_store_format(struct ac_llvm_context *ctx,
+LLVMValueRef rsrc,
+LLVMValueRef data,
+LLVMValueRef vindex,
+LLVMValueRef voffset,
+unsigned num_channels,
+bool glc,
+bool writeonly_memory)
+{
+   if (HAVE_LLVM >= 0x800) {
+   ac_build_llvm8_buffer_store_common(ctx, rsrc, data, vindex,
+  voffset, NULL, num_channels,
+  glc, false, writeonly_memory,
+  true, true);
+   } else {
+   ac_build_buffer_store_common(ctx, rsrc, data, vindex, voffset,
+num_channels, glc, false,
+writeonly_memory, true);
+   }
+}
+
  /* TBUFFER_STORE_FORMAT_{X,XY,XYZ,XYZW} <- the suffix is 

Re: [Mesa-dev] [PATCH] gallium/docs: clarify set_sampler_views

2019-03-12 Thread Rob Clark
On Tue, Mar 12, 2019 at 1:59 PM Roland Scheidegger  wrote:
>
> Am 12.03.19 um 16:16 schrieb Rob Clark:
> > This previously was not called out clearly, but based on a survey of the
> > code, it seems the expected behavior is to release the reference to any
> > sampler views beyond the new range being bound.
>
> That isn't really true. This was designed to work like d3d10, where
> other views are unmodified.
> The cso code will actually unset all views which previously were set and
> are above the num_views in the call (this wouldn't be necessary if the
> pipe function itself would work like this).
> However, it will only do this for fragment textures, and pass through
> the parameters unmodified otherwise. Which means behavior might not be
> very consistent for the different stages...

hmm, I did notice w/ deqp tests (which aren't so good at
resetting/clearing state between tests), that I ended up w/ different
# of sampler views bound (without changing freedreno to match the
behavior of most of the other drivers).. I didn't really dig in that
closely but it seemed like mesa/st wasn't clearing the additional
previously bound textures.  Maybe I overlooked something, but that
seemed wrong.

One way or another, I guess we should clarify and change the various
drivers to have a common behavior because right now there two
different behaviors and I guess it is at least confusing for new
gallium driver writers (as it was for me and I've been at it for a
while)

BR,
-R

>
>
> >
> > I think radeonsi and freedreno were the only ones not doing this.  Which
> > could probably temporarily leak a bit of memory by holding on to the
> > sampler view reference.
> Not sure about other drivers, but llvmpipe will not do this neither.
>
> Roland
>
>
> >
> > Signed-off-by: Rob Clark 
> > ---
> >  src/gallium/docs/source/context.rst | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/src/gallium/docs/source/context.rst 
> > b/src/gallium/docs/source/context.rst
> > index f89d9e1005e..199d335f8f4 100644
> > --- a/src/gallium/docs/source/context.rst
> > +++ b/src/gallium/docs/source/context.rst
> > @@ -143,6 +143,9 @@ to the array index which is used for sampling.
> >to a respective sampler view and releases a reference to the previous
> >sampler view.
> >
> > +  Previously bound samplers with index ``>= num_views`` are unbound rather
> > +  than unmodified.
> > +
> >  * ``create_sampler_view`` creates a new sampler view. ``texture`` is 
> > associated
> >with the sampler view which results in sampler view holding a reference
> >to the texture. Format specified in template must be compatible
> >
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] nir/loop_unroll: Fix out-of-bounds access handling

2019-03-12 Thread Jason Ekstrand
The previous code was completely broken when it came to constructing the
undef values.  I'm not sure how it ever worked.  For the case of a copy
that reads an undefined value, we can just delete the copy because the
destination is a valid undefined value.  This saves us the effort of
trying to construct a value for an arbitrary copy_deref intrinsic.

Fixes: e8a8937a04 "nir: add partial loop unrolling support"
Cc: Timothy Arceri 
---
 src/compiler/nir/nir_opt_loop_unroll.c | 14 ++
 1 file changed, 2 insertions(+), 12 deletions(-)

diff --git a/src/compiler/nir/nir_opt_loop_unroll.c 
b/src/compiler/nir/nir_opt_loop_unroll.c
index 06ec78b8068..b2696d4aadb 100644
--- a/src/compiler/nir/nir_opt_loop_unroll.c
+++ b/src/compiler/nir/nir_opt_loop_unroll.c
@@ -670,11 +670,9 @@ remove_out_of_bounds_induction_use(nir_shader *shader, 
nir_loop *loop,
 if (is_access_out_of_bounds(term, nir_src_as_deref(intrin->src[0]),
 trip_count)) {
if (intrin->intrinsic == nir_intrinsic_load_deref) {
-  assert(intrin->src[0].is_ssa);
-  nir_ssa_def *a_ssa = intrin->src[0].ssa;
   nir_ssa_def *undef =
- nir_ssa_undef(, intrin->num_components,
-   a_ssa->bit_size);
+ nir_ssa_undef(, intrin->dest.ssa.num_components,
+   intrin->dest.ssa.bit_size);
   nir_ssa_def_rewrite_uses(>dest.ssa,
nir_src_for_ssa(undef));
} else {
@@ -686,14 +684,6 @@ remove_out_of_bounds_induction_use(nir_shader *shader, 
nir_loop *loop,
 if (intrin->intrinsic == nir_intrinsic_copy_deref &&
 is_access_out_of_bounds(term, nir_src_as_deref(intrin->src[1]),
 trip_count)) {
-   assert(intrin->src[1].is_ssa);
-   nir_ssa_def *a_ssa = intrin->src[1].ssa;
-   nir_ssa_def *undef =
-  nir_ssa_undef(, intrin->num_components, a_ssa->bit_size);
-
-   /* Replace the copy with a store of the undefined value */
-   b.cursor = nir_before_instr(instr);
-   nir_store_deref(, nir_src_as_deref(intrin->src[0]), undef, 
~0);
nir_instr_remove(instr);
 }
  }
-- 
2.20.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] radv: set the maximum number of IBs per submit to 192

2019-03-12 Thread Samuel Pitoiset
This fixes random SteamVR corruption, see
https://github.com/ValveSoftware/SteamVR-for-Linux/issues/181

Fixes: 4d30f2c6f42 ("radv/winsys: remove the max IBs per submit limit for the 
fallback path")
Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_device.c | 2 +-
 src/amd/vulkan/winsys/amdgpu/radv_amdgpu_winsys_public.h | 7 +++
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
index 83d218fb6bf..9570c15af02 100644
--- a/src/amd/vulkan/radv_device.c
+++ b/src/amd/vulkan/radv_device.c
@@ -2815,7 +2815,7 @@ VkResult radv_QueueSubmit(
struct radeon_winsys_fence *base_fence = fence ? fence->fence : NULL;
struct radeon_winsys_ctx *ctx = queue->hw_ctx;
int ret;
-   uint32_t max_cs_submission = queue->device->trace_bo ? 1 : UINT32_MAX;
+   uint32_t max_cs_submission = queue->device->trace_bo ? 1 : 
RADV_MAX_IBS_PER_SUBMIT;
uint32_t scratch_size = 0;
uint32_t compute_scratch_size = 0;
uint32_t esgs_ring_size = 0, gsvs_ring_size = 0;
diff --git a/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_winsys_public.h 
b/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_winsys_public.h
index 854e216551f..709669b2a57 100644
--- a/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_winsys_public.h
+++ b/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_winsys_public.h
@@ -29,6 +29,13 @@
 #ifndef RADV_AMDGPU_WINSYS_PUBLIC_H
 #define RADV_AMDGPU_WINSYS_PUBLIC_H
 
+/* The number of IBs per submit isn't infinite, it depends on the ring type
+ * (ie. some initial setup needed for a submit) and the number of IBs (4 DW).
+ * This limit is arbitrary but should be safe for now.  Ideally, we should get
+ * this limit from the KMD.
+*/
+#define RADV_MAX_IBS_PER_SUBMIT 192
+
 struct radeon_winsys *radv_amdgpu_winsys_create(int fd, uint64_t debug_flags,
uint64_t perftest_flags);
 
-- 
2.21.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] radv: Fix driverUUID

2019-03-12 Thread Samuel Pitoiset


On 3/12/19 7:11 PM, Emil Velikov wrote:

On Tue, 12 Mar 2019 at 15:12, Samuel Pitoiset  wrote:

Reviewed-by: Samuel Pitoiset 


A fixes tag like below would be great ;-)

Fixes: 14cad8786a8 ("radv: generate the same driver UUID as radeonsi")
Reviewed-by: Emil Velikov 

Aside: seems like radv_get_driver_uuid() produces "AMD-MESA-DRV" - a
not so unique identifier.
Perhaps we should fix that at some point?


yeah, should probably be fixed.



HTH
Emil

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] anv: Stop using VK_TRUE/FALSE

2019-03-12 Thread Jason Ekstrand
We've been fairly inconsistent about this so we should really choose
whether we're going to use VK_TRUE/FALSE or the C boolean values.  The
Vulkan #defines are set to 1 and 0 respectively so it's the same value
as C gives you when you cast a boolean expression to an integer.  Since
there are several places where we set a VkBool32 to a C logical
expression, let's just embrace C booleans and stop using the VK defines.
---
 src/intel/vulkan/anv_device.c | 42 +--
 1 file changed, 21 insertions(+), 21 deletions(-)

diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index 729cceb3e32..83fa3936c19 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -833,7 +833,7 @@ VkResult anv_EnumeratePhysicalDeviceGroups(
   memset(p->physicalDevices, 0, sizeof(p->physicalDevices));
   p->physicalDevices[0] =
  anv_physical_device_to_handle(>physicalDevice);
-  p->subsetAllocation = VK_FALSE;
+  p->subsetAllocation = false;
 
   vk_foreach_struct(ext, p->pNext)
  anv_debug_ignored_stype(ext->sType);
@@ -967,7 +967,7 @@ void anv_GetPhysicalDeviceFeatures2(
   case VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_DEPTH_CLIP_ENABLE_FEATURES_EXT: {
  VkPhysicalDeviceDepthClipEnableFeaturesEXT *features =
 (VkPhysicalDeviceDepthClipEnableFeaturesEXT *)ext;
- features->depthClipEnable = VK_TRUE;
+ features->depthClipEnable = true;
  break;
   }
 
@@ -990,7 +990,7 @@ void anv_GetPhysicalDeviceFeatures2(
 
   case VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_PROTECTED_MEMORY_FEATURES: {
  VkPhysicalDeviceProtectedMemoryFeatures *features = (void *)ext;
- features->protectedMemory = VK_FALSE;
+ features->protectedMemory = false;
  break;
   }
 
@@ -1024,23 +1024,23 @@ void anv_GetPhysicalDeviceFeatures2(
   case VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_TRANSFORM_FEEDBACK_FEATURES_EXT: {
  VkPhysicalDeviceTransformFeedbackFeaturesEXT *features =
 (VkPhysicalDeviceTransformFeedbackFeaturesEXT *)ext;
- features->transformFeedback = VK_TRUE;
- features->geometryStreams = VK_TRUE;
+ features->transformFeedback = true;
+ features->geometryStreams = true;
  break;
   }
 
   case 
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_VERTEX_ATTRIBUTE_DIVISOR_FEATURES_EXT: {
  VkPhysicalDeviceVertexAttributeDivisorFeaturesEXT *features =
 (VkPhysicalDeviceVertexAttributeDivisorFeaturesEXT *)ext;
- features->vertexAttributeInstanceRateDivisor = VK_TRUE;
- features->vertexAttributeInstanceRateZeroDivisor = VK_TRUE;
+ features->vertexAttributeInstanceRateDivisor = true;
+ features->vertexAttributeInstanceRateZeroDivisor = true;
  break;
   }
 
   case VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_YCBCR_IMAGE_ARRAYS_FEATURES_EXT: {
  VkPhysicalDeviceYcbcrImageArraysFeaturesEXT *features =
 (VkPhysicalDeviceYcbcrImageArraysFeaturesEXT *)ext;
- features->ycbcrImageArrays = VK_TRUE;
+ features->ycbcrImageArrays = true;
  break;
   }
 
@@ -1234,8 +1234,8 @@ void anv_GetPhysicalDeviceProperties2(
VK_RESOLVE_MODE_MAX_BIT_KHR;
  }
 
- props->independentResolveNone = VK_TRUE;
- props->independentResolve = VK_TRUE;
+ props->independentResolveNone = true;
+ props->independentResolve = true;
  break;
   }
 
@@ -1372,7 +1372,7 @@ void anv_GetPhysicalDeviceProperties2(

VK_SUBGROUP_FEATURE_SHUFFLE_RELATIVE_BIT |
VK_SUBGROUP_FEATURE_CLUSTERED_BIT |
VK_SUBGROUP_FEATURE_QUAD_BIT;
- properties->quadOperationsInAllStages = VK_TRUE;
+ properties->quadOperationsInAllStages = true;
  break;
   }
 
@@ -1386,10 +1386,10 @@ void anv_GetPhysicalDeviceProperties2(
  props->maxTransformFeedbackStreamDataSize = 128 * 4;
  props->maxTransformFeedbackBufferDataSize = 128 * 4;
  props->maxTransformFeedbackBufferDataStride = 2048;
- props->transformFeedbackQueries = VK_TRUE;
- props->transformFeedbackStreamsLinesTriangles = VK_FALSE;
- props->transformFeedbackRasterizationStreamSelect = VK_FALSE;
- props->transformFeedbackDraw = VK_TRUE;
+ props->transformFeedbackQueries = true;
+ props->transformFeedbackStreamsLinesTriangles = false;
+ props->transformFeedbackRasterizationStreamSelect = false;
+ props->transformFeedbackDraw = true;
  break;
   }
 
@@ -2961,8 +2961,8 @@ void anv_GetBufferMemoryRequirements2(
   switch (ext->sType) {
   case VK_STRUCTURE_TYPE_MEMORY_DEDICATED_REQUIREMENTS: {
  VkMemoryDedicatedRequirements *requirements = (void *)ext;
- requirements->prefersDedicatedAllocation = 

[Mesa-dev] [PATCH 0/2] Fix llvmpipe on ARM processors without NEON

2019-03-12 Thread Lubomir Rintel
Hi,

chained to this message are two patches that aim to fix llvmpipe driver on 
an ARM machine that has no NEON instructions. My machine is a XO-1.75
laptop with a Marvell MMP2 processor:

  [lkundrak@xo ~]$ LD_SHOW_AUXV=yespls /bin/true |grep HWCAP
  AT_HWCAP:half thumb fastmult vfp edsp iwmmxt thumbee vfpv3 vfpv3d16 
tls idivt
  AT_HWCAP2:
  [lkundrak@xo ~]$

With unpatched Mesa:

  # gdb /usr/libexec/Xorg --ex run
  ...
  (II) modeset(0): Initializing kms color map for depth 24, 8 bpc.
  warning: JITed object file architecture armv7 is not compatible with target 
architecture iwmmxt.
  ...
  Program received signal SIGILL, Illegal instruction.
  0xb6fbe028 in ?? ()
(gdb) x/i $pc
  => 0xb6fbe028:  vld1.32 {d16[]-d17[]}, [r7 :32]
  (gdb)

Seems like defaulting to enabling NEON on armv7 and thus breaking on Marvell and
NVidia (and perhaps other) platform was a deliberate decision and as such is not
considered a bug by LLVM upstream: https://bugs.llvm.org/show_bug.cgi?id=30842

Lubo


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 1/2] gallivm: guess CPU features also on ARM

2019-03-12 Thread Lubomir Rintel
getHostCPUFeatures() is also available on ARM, for even longer time than
for x86. Use it -- it potentially enables instructions that may speed
things up.

Signed-off-by: Lubomir Rintel 
Cc: 
---
 src/gallium/auxiliary/gallivm/lp_bld_misc.cpp | 13 ++---
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp 
b/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp
index fcbdd5050fe..f3b5784fce7 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp
+++ b/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp
@@ -556,11 +556,11 @@ 
lp_build_create_jit_compiler_for_module(LLVMExecutionEngineRef *OutJIT,
 
llvm::SmallVector MAttrs;
 
-#if defined(PIPE_ARCH_X86) || defined(PIPE_ARCH_X86_64)
-#if HAVE_LLVM >= 0x0400
-   /* llvm-3.7+ implements sys::getHostCPUFeatures for x86,
-* which allows us to enable/disable code generation based
-* on the results of cpuid.
+#if HAVE_LLVM >= 0x0400 && (defined(PIPE_ARCH_X86) || 
defined(PIPE_ARCH_X86_64) || defined(PIPE_ARCH_ARM))
+   /* llvm-3.3+ implements sys::getHostCPUFeatures for Arm
+* and llvm-3.7+ for x86, which allows us to enable/disable
+* code generation based on the results of cpuid on these
+* architectures.
 */
llvm::StringMap features;
llvm::sys::getHostCPUFeatures(features);
@@ -570,7 +570,7 @@ 
lp_build_create_jit_compiler_for_module(LLVMExecutionEngineRef *OutJIT,
 ++f) {
   MAttrs.push_back(((*f).second ? "+" : "-") + (*f).first().str());
}
-#else
+#elif defined(PIPE_ARCH_X86) || defined(PIPE_ARCH_X86_64)
/*
 * We need to unset attributes because sometimes LLVM mistakenly assumes
 * certain features are present given the processor name.
@@ -625,7 +625,6 @@ 
lp_build_create_jit_compiler_for_module(LLVMExecutionEngineRef *OutJIT,
MAttrs.push_back("-avx512vl");
 #endif
 #endif
-#endif
 
 #if defined(PIPE_ARCH_PPC)
MAttrs.push_back(util_cpu_caps.has_altivec ? "+altivec" : "-altivec");
-- 
2.20.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 2/2] gallivm: disable NEON instructions if they are not supported

2019-03-12 Thread Lubomir Rintel
The LLVM project made some questionable decisions about defaults for
armv7 (e.g. they enable NEON that is not there on NVIDIA and Marvell
platforms).

On top of that, getHostCPUFeatures() doesn't disable missing machine
attributes. Finally, -neon alone is not sufficient to disable emmision
of NEON instructions.

Signed-off-by: Lubomir Rintel 
Cc: 
---
 src/gallium/auxiliary/gallivm/lp_bld_misc.cpp | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp 
b/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp
index f3b5784fce7..f307c26d4f7 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp
+++ b/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp
@@ -625,6 +625,13 @@ 
lp_build_create_jit_compiler_for_module(LLVMExecutionEngineRef *OutJIT,
MAttrs.push_back("-avx512vl");
 #endif
 #endif
+#if defined(PIPE_ARCH_ARM)
+   if (!util_cpu_caps.has_neon) {
+  MAttrs.push_back("-neon");
+  MAttrs.push_back("-crypto");
+  MAttrs.push_back("-vfp2");
+   }
+#endif
 
 #if defined(PIPE_ARCH_PPC)
MAttrs.push_back(util_cpu_caps.has_altivec ? "+altivec" : "-altivec");
-- 
2.20.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v2 8/9] anv: Removed unused header file

2019-03-12 Thread Sagar Ghuge
Reviewed-by: Sagar Ghuge 

On 3/12/19 3:34 AM, Eleni Maria Stea wrote:
> In src/intel/vulkan/genX_blorp_exec.c we included the file:
> common/gen_sample_positions.h but not use it. Removed.
> ---
>  src/intel/vulkan/genX_blorp_exec.c | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/src/intel/vulkan/genX_blorp_exec.c 
> b/src/intel/vulkan/genX_blorp_exec.c
> index e9c85d56d5f..0eeefaaa9d6 100644
> --- a/src/intel/vulkan/genX_blorp_exec.c
> +++ b/src/intel/vulkan/genX_blorp_exec.c
> @@ -31,7 +31,6 @@
>  #undef __gen_combine_address
>  
>  #include "common/gen_l3_config.h"
> -#include "common/gen_sample_positions.h"
>  #include "blorp/blorp_genX_exec.h"
>  
>  static void *
> 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 3/3] radv: use typed buffer loads for vertex input fetches

2019-03-12 Thread Bas Nieuwenhuizen
r-b for the series

On Tue, Feb 26, 2019 at 1:39 PM Samuel Pitoiset
 wrote:
>
> This drastically reduces the number of SGPRs because the driver
> now uses descriptors per vertex binding, instead of per vertex
> attribute format.
>
> 29077 shaders in 15096 tests
> Totals:
> SGPRS: 1354285 -> 1282109 (-5.33 %)
> VGPRS: 909896 -> 908800 (-0.12 %)
> Spilled SGPRs: 24840 -> 24811 (-0.12 %)
> Code Size: 49221144 -> 48986628 (-0.48 %) bytes
> Max Waves: 243930 -> 244229 (0.12 %)
>
> Totals from affected shaders:
> SGPRS: 390648 -> 318472 (-18.48 %)
> VGPRS: 288432 -> 287336 (-0.38 %)
> Spilled SGPRs: 94 -> 65 (-30.85 %)
> Code Size: 11548412 -> 11313896 (-2.03 %) bytes
> Max Waves: 86460 -> 86759 (0.35 %)
>
> This gives a really tiny boost.
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_cmd_buffer.c  | 21 +-
>  src/amd/vulkan/radv_nir_to_llvm.c | 47 +--
>  src/amd/vulkan/radv_pipeline.c| 37 ++--
>  src/amd/vulkan/radv_private.h |  5 +---
>  4 files changed, 57 insertions(+), 53 deletions(-)
>
> diff --git a/src/amd/vulkan/radv_cmd_buffer.c 
> b/src/amd/vulkan/radv_cmd_buffer.c
> index ad0b934ddfc..5ab93d11d68 100644
> --- a/src/amd/vulkan/radv_cmd_buffer.c
> +++ b/src/amd/vulkan/radv_cmd_buffer.c
> @@ -1985,13 +1985,13 @@ radv_flush_vertex_descriptors(struct radv_cmd_buffer 
> *cmd_buffer,
>  {
> if ((pipeline_is_dirty ||
> (cmd_buffer->state.dirty & RADV_CMD_DIRTY_VERTEX_BUFFER)) &&
> -   cmd_buffer->state.pipeline->vertex_elements.count &&
> +   cmd_buffer->state.pipeline->num_vertex_bindings &&
> radv_get_shader(cmd_buffer->state.pipeline, 
> MESA_SHADER_VERTEX)->info.info.vs.has_vertex_buffers) {
> struct radv_vertex_elements_info *velems = 
> _buffer->state.pipeline->vertex_elements;
> unsigned vb_offset;
> void *vb_ptr;
> uint32_t i = 0;
> -   uint32_t count = velems->count;
> +   uint32_t count = 
> cmd_buffer->state.pipeline->num_vertex_bindings;
> uint64_t va;
>
> /* allocate some descriptor state for vertex buffers */
> @@ -2002,13 +2002,15 @@ radv_flush_vertex_descriptors(struct radv_cmd_buffer 
> *cmd_buffer,
> for (i = 0; i < count; i++) {
> uint32_t *desc = &((uint32_t *)vb_ptr)[i * 4];
> uint32_t offset;
> -   int vb = velems->binding[i];
> -   struct radv_buffer *buffer = 
> cmd_buffer->vertex_bindings[vb].buffer;
> -   uint32_t stride = 
> cmd_buffer->state.pipeline->binding_stride[vb];
> +   struct radv_buffer *buffer = 
> cmd_buffer->vertex_bindings[i].buffer;
> +   uint32_t stride = 
> cmd_buffer->state.pipeline->binding_stride[i];
> +
> +   if (!buffer)
> +   continue;
>
> va = radv_buffer_get_va(buffer->bo);
>
> -   offset = cmd_buffer->vertex_bindings[vb].offset + 
> velems->offset[i];
> +   offset = cmd_buffer->vertex_bindings[i].offset;
> va += offset + buffer->offset;
> desc[0] = va;
> desc[1] = S_008F04_BASE_ADDRESS_HI(va >> 32) | 
> S_008F04_STRIDE(stride);
> @@ -2016,7 +2018,12 @@ radv_flush_vertex_descriptors(struct radv_cmd_buffer 
> *cmd_buffer,
> desc[2] = (buffer->size - offset - 
> velems->format_size[i]) / stride + 1;
> else
> desc[2] = buffer->size - offset;
> -   desc[3] = velems->rsrc_word3[i];
> +   desc[3] = S_008F0C_DST_SEL_X(V_008F0C_SQ_SEL_X) |
> + S_008F0C_DST_SEL_Y(V_008F0C_SQ_SEL_Y) |
> + S_008F0C_DST_SEL_Z(V_008F0C_SQ_SEL_Z) |
> + S_008F0C_DST_SEL_W(V_008F0C_SQ_SEL_W) |
> + 
> S_008F0C_NUM_FORMAT(V_008F0C_BUF_NUM_FORMAT_UINT) |
> + 
> S_008F0C_DATA_FORMAT(V_008F0C_BUF_DATA_FORMAT_32);
> }
>
> va = radv_buffer_get_va(cmd_buffer->upload.upload_bo);
> diff --git a/src/amd/vulkan/radv_nir_to_llvm.c 
> b/src/amd/vulkan/radv_nir_to_llvm.c
> index 36f499be212..e6c8f3ecb92 100644
> --- a/src/amd/vulkan/radv_nir_to_llvm.c
> +++ b/src/amd/vulkan/radv_nir_to_llvm.c
> @@ -2008,6 +2008,8 @@ adjust_vertex_fetch_alpha(struct radv_shader_context 
> *ctx,
>
> LLVMValueRef c30 = LLVMConstInt(ctx->ac.i32, 30, 0);
>
> +   alpha = LLVMBuildBitCast(ctx->ac.builder, alpha, ctx->ac.f32, "");
> +
> if (adjustment == RADV_ALPHA_ADJUST_SSCALED)
> alpha = LLVMBuildFPToUI(ctx->ac.builder, alpha, ctx->ac.i32, 
> "");
> else
> @@ -2035,7 +2037,7 @@ 

Re: [Mesa-dev] [PATCH] radv: Fix driverUUID

2019-03-12 Thread Emil Velikov
On Tue, 12 Mar 2019 at 15:12, Samuel Pitoiset  wrote:
>
> Reviewed-by: Samuel Pitoiset 
>

A fixes tag like below would be great ;-)

Fixes: 14cad8786a8 ("radv: generate the same driver UUID as radeonsi")
Reviewed-by: Emil Velikov 

Aside: seems like radv_get_driver_uuid() produces "AMD-MESA-DRV" - a
not so unique identifier.
Perhaps we should fix that at some point?

HTH
Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] gallium/docs: clarify set_sampler_views

2019-03-12 Thread Roland Scheidegger
Am 12.03.19 um 16:16 schrieb Rob Clark:
> This previously was not called out clearly, but based on a survey of the
> code, it seems the expected behavior is to release the reference to any
> sampler views beyond the new range being bound.

That isn't really true. This was designed to work like d3d10, where
other views are unmodified.
The cso code will actually unset all views which previously were set and
are above the num_views in the call (this wouldn't be necessary if the
pipe function itself would work like this).
However, it will only do this for fragment textures, and pass through
the parameters unmodified otherwise. Which means behavior might not be
very consistent for the different stages...


> 
> I think radeonsi and freedreno were the only ones not doing this.  Which
> could probably temporarily leak a bit of memory by holding on to the
> sampler view reference.
Not sure about other drivers, but llvmpipe will not do this neither.

Roland


> 
> Signed-off-by: Rob Clark 
> ---
>  src/gallium/docs/source/context.rst | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/src/gallium/docs/source/context.rst 
> b/src/gallium/docs/source/context.rst
> index f89d9e1005e..199d335f8f4 100644
> --- a/src/gallium/docs/source/context.rst
> +++ b/src/gallium/docs/source/context.rst
> @@ -143,6 +143,9 @@ to the array index which is used for sampling.
>to a respective sampler view and releases a reference to the previous
>sampler view.
>  
> +  Previously bound samplers with index ``>= num_views`` are unbound rather
> +  than unmodified.
> +
>  * ``create_sampler_view`` creates a new sampler view. ``texture`` is 
> associated
>with the sampler view which results in sampler view holding a reference
>to the texture. Format specified in template must be compatible
> 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] st/mesa: init hash keys with memset(), not designated initializers

2019-03-12 Thread Roland Scheidegger
Am 12.03.19 um 18:34 schrieb Eric Anholt:
> Ilia Mirkin  writes:
> 
>> I believe the distinction here is between initializing all the fields
>> and using them individually (thus leaving the padding uninitialized),
>> vs using the fields to create a structure to hash as a sequence of
>> bytes. For the latter, you really need all the memory initialized, not
>> just the "valid" parts of the structure. In at least my mind, it's
>> fairly well-established that compilers don't always initialize all of
>> a structure's underlying bytes, but I also don't have a specific
>> instance of that situation I can point to.
>>
>> For most usage, foo = {0} is fine since you're not hashing the bytes
>> but rather accessing the fields directly. But for hashing, you really
>> want all the bits initialized.
> 
> Gah.  The commit message even said it was about padding, and I failed to
> read.  Sorry for the noise, this does seem right.

The alternative is to use explicit padding in the struct, so that
structure initialization is guaranteed to initialize everything to zero.
I believe some places still do that too, but it's not very nice neither
(can easily make mistakes there). So - pick your poison...

Roland

> 
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Fmesa-devdata=02%7C01%7Csroland%40vmware.com%7Cd6e706866d844beb84a808d6a7110f53%7Cb39138ca3cee4b4aa4d6cd83d9dd62f0%7C0%7C0%7C636880089068822714sdata=AENcqpYCV4d4DJW3q4Bkg9GXIEyU6Au8Yccuthd7Mbk%3Dreserved=0
> 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] st/mesa: init hash keys with memset(), not designated initializers

2019-03-12 Thread Eric Anholt
Ilia Mirkin  writes:

> I believe the distinction here is between initializing all the fields
> and using them individually (thus leaving the padding uninitialized),
> vs using the fields to create a structure to hash as a sequence of
> bytes. For the latter, you really need all the memory initialized, not
> just the "valid" parts of the structure. In at least my mind, it's
> fairly well-established that compilers don't always initialize all of
> a structure's underlying bytes, but I also don't have a specific
> instance of that situation I can point to.
>
> For most usage, foo = {0} is fine since you're not hashing the bytes
> but rather accessing the fields directly. But for hashing, you really
> want all the bits initialized.

Gah.  The commit message even said it was about padding, and I failed to
read.  Sorry for the noise, this does seem right.


signature.asc
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] st/mesa: init hash keys with memset(), not designated initializers

2019-03-12 Thread Ilia Mirkin
I believe the distinction here is between initializing all the fields
and using them individually (thus leaving the padding uninitialized),
vs using the fields to create a structure to hash as a sequence of
bytes. For the latter, you really need all the memory initialized, not
just the "valid" parts of the structure. In at least my mind, it's
fairly well-established that compilers don't always initialize all of
a structure's underlying bytes, but I also don't have a specific
instance of that situation I can point to.

For most usage, foo = {0} is fine since you're not hashing the bytes
but rather accessing the fields directly. But for hashing, you really
want all the bits initialized.

Cheers,

  -ilia

On Tue, Mar 12, 2019 at 12:31 PM Eric Anholt  wrote:
>
> Brian Paul  writes:
>
> > On 03/11/2019 04:47 PM, Eric Anholt wrote:
> >> Brian Paul  writes:
> >>
> >>> Since the compiler may not zero-out padding in the object.
> >>> Add a couple comments about this to prevent misunderstandings in
> >>> the future.
> >>>
> >>> Fixes: 67d96816ff5 ("st/mesa: move, clean-up shader variant key 
> >>> decls/inits")
> >>> ---
> >>>   src/mesa/state_tracker/st_atom_shader.c |  9 +++--
> >>>   src/mesa/state_tracker/st_program.c | 13 ++---
> >>>   2 files changed, 17 insertions(+), 5 deletions(-)
> >>>
> >>> diff --git a/src/mesa/state_tracker/st_atom_shader.c 
> >>> b/src/mesa/state_tracker/st_atom_shader.c
> >>> index ac7a1a5..a4475e2 100644
> >>> --- a/src/mesa/state_tracker/st_atom_shader.c
> >>> +++ b/src/mesa/state_tracker/st_atom_shader.c
> >>> @@ -112,7 +112,10 @@ st_update_fp( struct st_context *st )
> >>>  !stfp->variants->key.bitmap) {
> >>> shader = stfp->variants->driver_shader;
> >>>  } else {
> >>> -  struct st_fp_variant_key key = {0};
> >>> +  struct st_fp_variant_key key;
> >>> +
> >>> +  /* use memset, not an initializer to be sure all memory is zeroed 
> >>> */
> >>> +  memset(, 0, sizeof(key));
> >>
> >> Wait, what?  We rely on this form of initialization all over, what's
> >> changed?
> >
> > The question is do all compilers, when presented with
> >
> > struct st_fp_variant_key key = {0};
>
> You seem to be saying that not all compilers do, but throughout the
> compiler we're relying on all compilers doing so.  Can we get some more
> information about what compiler you're fixing here?
> -BEGIN PGP SIGNATURE-
>
> iQIzBAEBCgAdFiEE/JuuFDWp9/ZkuCBXtdYpNtH8nugFAlyH3tsACgkQtdYpNtH8
> nujFFxAAkzNBlIJyvZhaUuYZ3B7ugbHxCnyfsi6J9AmAKcXyyJ0oIAi1sXkLwflB
> e5WbyksenHiE+NnihdQZICV2fzb0G8WObn5HiBSxNGKqEwATQY3ND3gkzvH2npKT
> EGY0AZCPX/yJPA3yci/MRTLLPzLgSYTUstfEsAmTlRlqmOlWDenV6BV7OpOoTz/F
> bVdrk6QXufLcQ+ZtQS2OFKEKlXdaUaE5ruwanasc8qzXmJ0TMonvRZp4ZrNUNc4j
> p1sUQnAU0SxwuSqIVS6iVmPEH/N2e3u4XdNsZ+45KxFkAnV0wnpIKAbREa1tzOdP
> d2nhZ70BUNnHJE5q11I070mGWkburTKk5hbLvFGS59np/2nLjZQLcb3hYNyCPDP2
> sdRwqzPtwbdNLAbHpiQ3x0OuVJ6P5fuxGUjuB/AQfl1ixV03nB77AXogqhlQ/cK4
> WceUn9AXmhBb5tIdhfKK1uzuplmCSpXaKxdPubvgfI4Rey5KMvcZwBFNqztcfe24
> hc3wPFyfmayNAuFHkC+0jd4po20ibJkF1F0kXjoyl7dKMG6IFX4fICJKPsafIgNV
> cex67m6VAF9ZalBaSgnXZpcmHXSFbVSHYvES/WxRncefgAzXG/BMuywsLZ+US99A
> LfrnLaq0eRDAdpcSfITQmDgpjEvWFThxCDXXBCI5a+pQpYBB6lk=
> =4Bvh
> -END PGP SIGNATURE-___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] st/mesa: init hash keys with memset(), not designated initializers

2019-03-12 Thread Eric Anholt
Brian Paul  writes:

> On 03/11/2019 04:47 PM, Eric Anholt wrote:
>> Brian Paul  writes:
>> 
>>> Since the compiler may not zero-out padding in the object.
>>> Add a couple comments about this to prevent misunderstandings in
>>> the future.
>>>
>>> Fixes: 67d96816ff5 ("st/mesa: move, clean-up shader variant key 
>>> decls/inits")
>>> ---
>>>   src/mesa/state_tracker/st_atom_shader.c |  9 +++--
>>>   src/mesa/state_tracker/st_program.c | 13 ++---
>>>   2 files changed, 17 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/src/mesa/state_tracker/st_atom_shader.c 
>>> b/src/mesa/state_tracker/st_atom_shader.c
>>> index ac7a1a5..a4475e2 100644
>>> --- a/src/mesa/state_tracker/st_atom_shader.c
>>> +++ b/src/mesa/state_tracker/st_atom_shader.c
>>> @@ -112,7 +112,10 @@ st_update_fp( struct st_context *st )
>>>  !stfp->variants->key.bitmap) {
>>> shader = stfp->variants->driver_shader;
>>>  } else {
>>> -  struct st_fp_variant_key key = {0};
>>> +  struct st_fp_variant_key key;
>>> +
>>> +  /* use memset, not an initializer to be sure all memory is zeroed */
>>> +  memset(, 0, sizeof(key));
>> 
>> Wait, what?  We rely on this form of initialization all over, what's
>> changed?
>
> The question is do all compilers, when presented with
>
> struct st_fp_variant_key key = {0};

You seem to be saying that not all compilers do, but throughout the
compiler we're relying on all compilers doing so.  Can we get some more
information about what compiler you're fixing here?


signature.asc
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 11/11] ac: use new LLVM 8 intrinsics in ac_build_buffer_store_dword()

2019-03-12 Thread Samuel Pitoiset
New buffer intrinsics have a separate soffset parameter.

Signed-off-by: Samuel Pitoiset 
---
 src/amd/common/ac_llvm_build.c | 66 ++
 1 file changed, 26 insertions(+), 40 deletions(-)

diff --git a/src/amd/common/ac_llvm_build.c b/src/amd/common/ac_llvm_build.c
index ce6639d49bf..8ed5199da55 100644
--- a/src/amd/common/ac_llvm_build.c
+++ b/src/amd/common/ac_llvm_build.c
@@ -1227,59 +1227,45 @@ ac_build_buffer_store_dword(struct ac_llvm_context *ctx,
if (!swizzle_enable_hint) {
LLVMValueRef offset = soffset;
 
-   static const char *types[] = {"f32", "v2f32", "v4f32"};
-
if (inst_offset)
offset = LLVMBuildAdd(ctx->builder, offset,
  LLVMConstInt(ctx->i32, 
inst_offset, 0), "");
-   if (voffset)
-   offset = LLVMBuildAdd(ctx->builder, offset, voffset, 
"");
-
-   LLVMValueRef args[] = {
-   ac_to_float(ctx, vdata),
-   LLVMBuildBitCast(ctx->builder, rsrc, ctx->v4i32, ""),
-   ctx->i32_0,
-   offset,
-   LLVMConstInt(ctx->i1, glc, 0),
-   LLVMConstInt(ctx->i1, slc, 0),
-   };
-
-   char name[256];
-   snprintf(name, sizeof(name), "llvm.amdgcn.buffer.store.%s",
-types[CLAMP(num_channels, 1, 3) - 1]);
 
-   ac_build_intrinsic(ctx, name, ctx->voidt,
-  args, ARRAY_SIZE(args),
-  ac_get_store_intr_attribs(writeonly_memory));
+   if (HAVE_LLVM >= 0x800) {
+   ac_build_llvm8_buffer_store_common(ctx, rsrc,
+  ac_to_float(ctx, 
vdata),
+  ctx->i32_0,
+  voffset, offset,
+  num_channels,
+  glc, slc,
+  writeonly_memory,
+  false, true);
+   } else {
+   if (voffset)
+   offset = LLVMBuildAdd(ctx->builder, offset, 
voffset, "");
+
+   ac_build_buffer_store_common(ctx, rsrc,
+ac_to_float(ctx, vdata),
+ctx->i32_0, offset,
+num_channels, glc, slc,
+writeonly_memory, false);
+   }
return;
}
 
-   static const unsigned dfmt[] = {
+   static const unsigned dfmts[] = {
V_008F0C_BUF_DATA_FORMAT_32,
V_008F0C_BUF_DATA_FORMAT_32_32,
V_008F0C_BUF_DATA_FORMAT_32_32_32,
V_008F0C_BUF_DATA_FORMAT_32_32_32_32
};
-   static const char *types[] = {"i32", "v2i32", "v4i32"};
-   LLVMValueRef args[] = {
-   vdata,
-   LLVMBuildBitCast(ctx->builder, rsrc, ctx->v4i32, ""),
-   ctx->i32_0,
-   voffset ? voffset : ctx->i32_0,
-   soffset,
-   LLVMConstInt(ctx->i32, inst_offset, 0),
-   LLVMConstInt(ctx->i32, dfmt[num_channels - 1], 0),
-   LLVMConstInt(ctx->i32, V_008F0C_BUF_NUM_FORMAT_UINT, 0),
-   LLVMConstInt(ctx->i1, glc, 0),
-   LLVMConstInt(ctx->i1, slc, 0),
-   };
-   char name[256];
-   snprintf(name, sizeof(name), "llvm.amdgcn.tbuffer.store.%s",
-types[CLAMP(num_channels, 1, 3) - 1]);
+   unsigned dfmt = dfmts[num_channels - 1];
+   unsigned nfmt = V_008F0C_BUF_NUM_FORMAT_UINT;
+   LLVMValueRef immoffset = LLVMConstInt(ctx->i32, inst_offset, 0);
 
-   ac_build_intrinsic(ctx, name, ctx->voidt,
-  args, ARRAY_SIZE(args),
-  ac_get_store_intr_attribs(writeonly_memory));
+   ac_build_tbuffer_store(ctx, rsrc, vdata, ctx->i32_0, voffset, soffset,
+  immoffset, num_channels, dfmt, nfmt, glc, slc,
+  writeonly_memory);
 }
 
 static LLVMValueRef
-- 
2.21.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 10/11] ac: use new LLVM 8 intrinsic when storing 16-bit values

2019-03-12 Thread Samuel Pitoiset
Signed-off-by: Samuel Pitoiset 
---
 src/amd/common/ac_llvm_build.c  | 108 
 src/amd/common/ac_llvm_build.h  |  26 
 src/amd/common/ac_nir_to_llvm.c |  26 ++--
 3 files changed, 139 insertions(+), 21 deletions(-)

diff --git a/src/amd/common/ac_llvm_build.c b/src/amd/common/ac_llvm_build.c
index 7aec8154a76..ce6639d49bf 100644
--- a/src/amd/common/ac_llvm_build.c
+++ b/src/amd/common/ac_llvm_build.c
@@ -1550,6 +1550,114 @@ ac_build_llvm8_tbuffer_load(struct ac_llvm_context *ctx,
  ac_get_load_intr_attribs(can_speculate));
 }
 
+static void
+ac_build_llvm8_tbuffer_store(struct ac_llvm_context *ctx,
+LLVMValueRef rsrc,
+LLVMValueRef vdata,
+LLVMValueRef vindex,
+LLVMValueRef voffset,
+LLVMValueRef soffset,
+unsigned num_channels,
+unsigned dfmt,
+unsigned nfmt,
+bool glc,
+bool slc,
+bool writeonly_memory,
+bool structurized)
+{
+   LLVMValueRef args[7];
+   int idx = 0;
+   args[idx++] = vdata;
+   args[idx++] = LLVMBuildBitCast(ctx->builder, rsrc, ctx->v4i32, "");
+   if (structurized)
+   args[idx++] = vindex ? vindex : ctx->i32_0;
+   args[idx++] = voffset ? voffset : ctx->i32_0;
+   args[idx++] = soffset ? soffset : ctx->i32_0;
+   args[idx++] = LLVMConstInt(ctx->i32, dfmt | (nfmt << 4), 0);
+   args[idx++] = LLVMConstInt(ctx->i32, (glc ? 1 : 0) + (slc ? 2 : 0), 0);
+   unsigned func = CLAMP(num_channels, 1, 3) - 1;
+
+   const char *type_names[] = {"i32", "v2i32", "v4i32"};
+   const char *indexing_kind = structurized ? "struct" : "raw";
+   char name[256];
+
+   snprintf(name, sizeof(name), "llvm.amdgcn.%s.tbuffer.store.%s",
+indexing_kind, type_names[func]);
+
+   ac_build_intrinsic(ctx, name, ctx->voidt, args, idx,
+  ac_get_store_intr_attribs(writeonly_memory));
+}
+
+void
+ac_build_tbuffer_store(struct ac_llvm_context *ctx,
+  LLVMValueRef rsrc,
+  LLVMValueRef vdata,
+  LLVMValueRef vindex,
+  LLVMValueRef voffset,
+  LLVMValueRef soffset,
+  LLVMValueRef immoffset,
+  unsigned num_channels,
+  unsigned dfmt,
+  unsigned nfmt,
+  bool glc,
+  bool slc,
+  bool writeonly_memory)
+{
+   if (HAVE_LLVM >= 0x800) {
+   voffset = LLVMBuildAdd(ctx->builder,
+  voffset ? voffset : ctx->i32_0,
+  immoffset, "");
+
+   ac_build_llvm8_tbuffer_store(ctx, rsrc, vdata, vindex, voffset,
+soffset, num_channels, dfmt, nfmt,
+glc, slc, writeonly_memory, true);
+   } else {
+   LLVMValueRef params[] = {
+   vdata,
+   rsrc,
+   vindex,
+   voffset ? voffset : ctx->i32_0,
+   soffset ? soffset : ctx->i32_0,
+   immoffset,
+   LLVMConstInt(ctx->i32, dfmt, false),
+   LLVMConstInt(ctx->i32, nfmt, false),
+   LLVMConstInt(ctx->i32, glc, false),
+   LLVMConstInt(ctx->i32, slc, false),
+   };
+   unsigned func = CLAMP(num_channels, 1, 3) - 1;
+   const char *type_names[] = {"i32", "v2i32", "v4i32"};
+   char name[256];
+
+   snprintf(name, sizeof(name), "llvm.amdgcn.tbuffer.store.%s",
+type_names[func]);
+
+   ac_build_intrinsic(ctx, name, ctx->voidt, params, 10,
+  ac_get_store_intr_attribs(writeonly_memory));
+   }
+}
+
+void
+ac_build_tbuffer_store_short(struct ac_llvm_context *ctx,
+LLVMValueRef rsrc,
+LLVMValueRef vdata,
+LLVMValueRef vindex,
+LLVMValueRef voffset,
+LLVMValueRef soffset,
+bool glc,
+bool slc,
+bool writeonly_memory)
+{
+   unsigned dfmt = V_008F0C_BUF_DATA_FORMAT_16;
+   unsigned nfmt = V_008F0C_BUF_NUM_FORMAT_UINT;
+
+   vdata = LLVMBuildBitCast(ctx->builder, vdata, ctx->i16, "");
+   vdata = LLVMBuildZExt(ctx->builder, vdata, ctx->i32, "");
+
+   ac_build_tbuffer_store(ctx, rsrc, 

[Mesa-dev] [PATCH 09/11] ac: use new LLVM 8 intrinsics in ac_build_buffer_load()

2019-03-12 Thread Samuel Pitoiset
Signed-off-by: Samuel Pitoiset 
---
 src/amd/common/ac_llvm_build.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/src/amd/common/ac_llvm_build.c b/src/amd/common/ac_llvm_build.c
index bb8a470ae1d..7aec8154a76 100644
--- a/src/amd/common/ac_llvm_build.c
+++ b/src/amd/common/ac_llvm_build.c
@@ -1412,6 +1412,14 @@ ac_build_buffer_load(struct ac_llvm_context *ctx,
return ac_build_gather_values(ctx, result, num_channels);
}
 
+   if (HAVE_LLVM >= 0x0800) {
+   return ac_build_llvm8_buffer_load_common(ctx, rsrc, vindex,
+offset, ctx->i32_0,
+num_channels, glc, slc,
+can_speculate, false,
+true);
+   }
+
return ac_build_buffer_load_common(ctx, rsrc, vindex, offset,
   num_channels, glc, slc,
   can_speculate, false);
-- 
2.21.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 04/11] ac: add ac_build_buffer_store_format() helper

2019-03-12 Thread Samuel Pitoiset
Similar to ac_build_buffer_load_format().

Signed-off-by: Samuel Pitoiset 
---
 src/amd/common/ac_llvm_build.c  | 100 
 src/amd/common/ac_llvm_build.h  |  11 
 src/amd/common/ac_nir_to_llvm.c |  29 +++--
 3 files changed, 119 insertions(+), 21 deletions(-)

diff --git a/src/amd/common/ac_llvm_build.c b/src/amd/common/ac_llvm_build.c
index 253073e52fb..bb8a470ae1d 100644
--- a/src/amd/common/ac_llvm_build.c
+++ b/src/amd/common/ac_llvm_build.c
@@ -1082,6 +1082,106 @@ LLVMValueRef 
ac_build_load_to_sgpr_uint_wraparound(struct ac_llvm_context *ctx,
return ac_build_load_custom(ctx, base_ptr, index, true, true, false);
 }
 
+static void
+ac_build_buffer_store_common(struct ac_llvm_context *ctx,
+LLVMValueRef rsrc,
+LLVMValueRef data,
+LLVMValueRef vindex,
+LLVMValueRef voffset,
+unsigned num_channels,
+bool glc,
+bool slc,
+bool writeonly_memory,
+bool use_format)
+{
+   LLVMValueRef args[] = {
+   data,
+   LLVMBuildBitCast(ctx->builder, rsrc, ctx->v4i32, ""),
+   vindex ? vindex : ctx->i32_0,
+   voffset,
+   LLVMConstInt(ctx->i1, glc, 0),
+   LLVMConstInt(ctx->i1, slc, 0)
+   };
+   unsigned func = CLAMP(num_channels, 1, 3) - 1;
+
+   const char *type_names[] = {"f32", "v2f32", "v4f32"};
+   char name[256];
+
+   if (use_format) {
+   snprintf(name, sizeof(name), 
"llvm.amdgcn.buffer.store.format.%s",
+type_names[func]);
+   } else {
+   snprintf(name, sizeof(name), "llvm.amdgcn.buffer.store.%s",
+type_names[func]);
+   }
+
+   ac_build_intrinsic(ctx, name, ctx->voidt, args, ARRAY_SIZE(args),
+  ac_get_store_intr_attribs(writeonly_memory));
+}
+
+static void
+ac_build_llvm8_buffer_store_common(struct ac_llvm_context *ctx,
+  LLVMValueRef rsrc,
+  LLVMValueRef data,
+  LLVMValueRef vindex,
+  LLVMValueRef voffset,
+  LLVMValueRef soffset,
+  unsigned num_channels,
+  bool glc,
+  bool slc,
+  bool writeonly_memory,
+  bool use_format,
+  bool structurized)
+{
+   LLVMValueRef args[5];
+   int idx = 0;
+   args[idx++] = data;
+   args[idx++] = LLVMBuildBitCast(ctx->builder, rsrc, ctx->v4i32, "");
+   if (structurized)
+   args[idx++] = vindex ? vindex : ctx->i32_0;
+   args[idx++] = voffset ? voffset : ctx->i32_0;
+   args[idx++] = soffset ? soffset : ctx->i32_0;
+   args[idx++] = LLVMConstInt(ctx->i32, (glc ? 1 : 0) + (slc ? 2 : 0), 0);
+   unsigned func = CLAMP(num_channels, 1, 3) - 1;
+
+   const char *type_names[] = {"f32", "v2f32", "v4f32"};
+   const char *indexing_kind = structurized ? "struct" : "raw";
+   char name[256];
+
+   if (use_format) {
+   snprintf(name, sizeof(name), 
"llvm.amdgcn.%s.buffer.store.format.%s",
+indexing_kind, type_names[func]);
+   } else {
+   snprintf(name, sizeof(name), "llvm.amdgcn.%s.buffer.store.%s",
+indexing_kind, type_names[func]);
+   }
+
+   ac_build_intrinsic(ctx, name, ctx->voidt, args, idx,
+  ac_get_store_intr_attribs(writeonly_memory));
+}
+
+void
+ac_build_buffer_store_format(struct ac_llvm_context *ctx,
+LLVMValueRef rsrc,
+LLVMValueRef data,
+LLVMValueRef vindex,
+LLVMValueRef voffset,
+unsigned num_channels,
+bool glc,
+bool writeonly_memory)
+{
+   if (HAVE_LLVM >= 0x800) {
+   ac_build_llvm8_buffer_store_common(ctx, rsrc, data, vindex,
+  voffset, NULL, num_channels,
+  glc, false, writeonly_memory,
+  true, true);
+   } else {
+   ac_build_buffer_store_common(ctx, rsrc, data, vindex, voffset,
+num_channels, glc, false,
+writeonly_memory, true);
+   }
+}
+
 /* TBUFFER_STORE_FORMAT_{X,XY,XYZ,XYZW} <- the suffix is selected by 
num_channels=1..4.
  * The type of vdata must be one of i32 

[Mesa-dev] [PATCH 05/11] ac/nir: remove one useless check in visit_store_ssbo()

2019-03-12 Thread Samuel Pitoiset
Trivial.

Signed-off-by: Samuel Pitoiset 
---
 src/amd/common/ac_nir_to_llvm.c | 9 +++--
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index c10a0cce16f..55d3ce90ce4 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -1567,12 +1567,9 @@ static void visit_store_ssbo(struct ac_nir_context *ctx,
}
data = extract_vector_range(>ac, base_data, start, count);
 
-   if (start == 0) {
-   offset = base_offset;
-   } else {
-   offset = LLVMBuildAdd(ctx->ac.builder, base_offset,
- LLVMConstInt(ctx->ac.i32, start * 
elem_size_bytes, false), "");
-   }
+   offset = LLVMBuildAdd(ctx->ac.builder, base_offset,
+ LLVMConstInt(ctx->ac.i32, start * 
elem_size_bytes, false), "");
+
if (num_bytes == 2) {
store_name = "llvm.amdgcn.tbuffer.store.i32";
data_type = ctx->ac.i32;
-- 
2.21.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 08/11] ac/nir: use ac_build_buffer_store_dword() for SSBO store operations

2019-03-12 Thread Samuel Pitoiset
Signed-off-by: Samuel Pitoiset 
---
 src/amd/common/ac_nir_to_llvm.c | 23 +--
 1 file changed, 9 insertions(+), 14 deletions(-)

diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index be0dca2d857..47a865de36f 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -1591,34 +1591,29 @@ static void visit_store_ssbo(struct ac_nir_context *ctx,
   ctx->ac.voidt, tbuffer_params, 10,
   
ac_get_store_intr_attribs(writeonly_memory));
} else {
+   int num_channels = num_bytes / 4;
+
switch (num_bytes) {
case 16: /* v4f32 */
-   store_name = "llvm.amdgcn.buffer.store.v4f32";
data_type = ctx->ac.v4f32;
break;
case 8: /* v2f32 */
-   store_name = "llvm.amdgcn.buffer.store.v2f32";
data_type = ctx->ac.v2f32;
break;
case 4: /* f32 */
-   store_name = "llvm.amdgcn.buffer.store.f32";
data_type = ctx->ac.f32;
break;
default:
unreachable("Malformed vector store.");
}
data = LLVMBuildBitCast(ctx->ac.builder, data, 
data_type, "");
-   LLVMValueRef params[] = {
-   data,
-   rsrc,
-   ctx->ac.i32_0, /* vindex */
-   offset,
-   glc,
-   ctx->ac.i1false,  /* slc */
-   };
-   ac_build_intrinsic(>ac, store_name,
-  ctx->ac.voidt, params, 6,
-  
ac_get_store_intr_attribs(writeonly_memory));
+
+   ac_build_buffer_store_dword(>ac, rsrc, data,
+   num_channels, offset,
+   ctx->ac.i32_0, 0,
+   cache_policy & ac_glc,
+   false, writeonly_memory,
+   false);
}
}
 }
-- 
2.21.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 07/11] ac/nir: use ac_build_buffer_load() for SSBO load operations

2019-03-12 Thread Samuel Pitoiset
Signed-off-by: Samuel Pitoiset 
---
 src/amd/common/ac_nir_to_llvm.c | 35 ++---
 1 file changed, 6 insertions(+), 29 deletions(-)

diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index 922f1063c79..be0dca2d857 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -1703,7 +1703,6 @@ static LLVMValueRef visit_load_buffer(struct 
ac_nir_context *ctx,
int num_components = instr->num_components;
enum gl_access_qualifier access = nir_intrinsic_access(instr);
unsigned cache_policy = get_cache_policy(ctx, access, false, false);
-   LLVMValueRef glc = (cache_policy & ac_glc) ? ctx->ac.i1true : 
ctx->ac.i1false;
 
LLVMValueRef offset = get_src(ctx, instr->src[1]);
LLVMValueRef rsrc = ctx->abi->load_ssbo(ctx->abi,
@@ -1734,34 +1733,12 @@ static LLVMValueRef visit_load_buffer(struct 
ac_nir_context *ctx,
  immoffset,
  cache_policy & 
ac_glc);
} else {
-   const char *load_name;
-   LLVMTypeRef data_type;
-   switch (load_bytes) {
-   case 16:
-   case 12:
-   load_name = "llvm.amdgcn.buffer.load.v4f32";
-   data_type = ctx->ac.v4f32;
-   break;
-   case 8:
-   case 6:
-   load_name = "llvm.amdgcn.buffer.load.v2f32";
-   data_type = ctx->ac.v2f32;
-   break;
-   case 4:
-   load_name = "llvm.amdgcn.buffer.load.f32";
-   data_type = ctx->ac.f32;
-   break;
-   default:
-   unreachable("Malformed load buffer.");
-   }
-   LLVMValueRef params[] = {
-   rsrc,
-   vindex,
-   LLVMBuildAdd(ctx->ac.builder, offset, 
immoffset, ""),
-   glc,
-   ctx->ac.i1false,
-   };
-   ret = ac_build_intrinsic(>ac, load_name, 
data_type, params, 5, 0);
+   int num_channels = util_next_power_of_two(load_bytes) / 
4;
+
+   ret = ac_build_buffer_load(>ac, rsrc, num_channels,
+  vindex, offset, immoffset, 0,
+  cache_policy & ac_glc, false,
+  false, false);
}
 
LLVMTypeRef byte_vec = LLVMVectorType(ctx->ac.i8, 
ac_get_type_size(LLVMTypeOf(ret)));
-- 
2.21.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 01/11] ac: fix glc parameter use for new LLVM 8 typed buffer intrinsics

2019-03-12 Thread Samuel Pitoiset
ac_build_llvm8_tbuffer_load() expects a boolean for glc.

Fixes: 2cf5433b99f ("ac: use new LLVM 8 intrinsic when loading 16-bit values")
Signed-off-by: Samuel Pitoiset 
---
 src/amd/common/ac_llvm_build.c  | 4 ++--
 src/amd/common/ac_llvm_build.h  | 2 +-
 src/amd/common/ac_nir_to_llvm.c | 4 ++--
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/src/amd/common/ac_llvm_build.c b/src/amd/common/ac_llvm_build.c
index bc64f0bb7e3..88ea289a121 100644
--- a/src/amd/common/ac_llvm_build.c
+++ b/src/amd/common/ac_llvm_build.c
@@ -1376,7 +1376,7 @@ ac_build_tbuffer_load_short(struct ac_llvm_context *ctx,
LLVMValueRef voffset,
LLVMValueRef soffset,
LLVMValueRef immoffset,
-   LLVMValueRef glc)
+   bool glc)
 {
unsigned dfmt = V_008F0C_BUF_DATA_FORMAT_16;
unsigned nfmt = V_008F0C_BUF_NUM_FORMAT_UINT;
@@ -1399,7 +1399,7 @@ ac_build_tbuffer_load_short(struct ac_llvm_context *ctx,
immoffset,
LLVMConstInt(ctx->i32, dfmt, false),
LLVMConstInt(ctx->i32, nfmt, false),
-   glc,
+   LLVMConstInt(ctx->i32, glc, false),
ctx->i1false,
};
res = ac_build_intrinsic(ctx, name, type, params, 9, 0);
diff --git a/src/amd/common/ac_llvm_build.h b/src/amd/common/ac_llvm_build.h
index fd5c4295abf..0fb3eb52f05 100644
--- a/src/amd/common/ac_llvm_build.h
+++ b/src/amd/common/ac_llvm_build.h
@@ -304,7 +304,7 @@ ac_build_tbuffer_load_short(struct ac_llvm_context *ctx,
LLVMValueRef voffset,
LLVMValueRef soffset,
LLVMValueRef immoffset,
-   LLVMValueRef glc);
+   bool glc);
 
 LLVMValueRef
 ac_build_llvm8_tbuffer_load(struct ac_llvm_context *ctx,
diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index 5fb5c8da609..a7b3fdf64aa 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -1716,7 +1716,7 @@ static LLVMValueRef visit_load_buffer(struct 
ac_nir_context *ctx,
  offset,
  ctx->ac.i32_0,
  immoffset,
- glc);
+ cache_policy & 
ac_glc);
} else {
const char *load_name;
LLVMTypeRef data_type;
@@ -1787,7 +1787,7 @@ static LLVMValueRef visit_load_ubo_buffer(struct 
ac_nir_context *ctx,
 offset,
 ctx->ac.i32_0,
 
LLVMConstInt(ctx->ac.i32, 2 * i, 0),
-
ctx->ac.i1false);
+false);
}
ret = ac_build_gather_values(>ac, results, num_components);
} else {
-- 
2.21.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 06/11] ac/nir: use new LLVM 8 intrinsics for SSBO atomic operations

2019-03-12 Thread Samuel Pitoiset
Signed-off-by: Samuel Pitoiset 
---
 src/amd/common/ac_nir_to_llvm.c | 65 +
 1 file changed, 41 insertions(+), 24 deletions(-)

diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index 55d3ce90ce4..922f1063c79 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -1626,57 +1626,74 @@ static void visit_store_ssbo(struct ac_nir_context *ctx,
 static LLVMValueRef visit_atomic_ssbo(struct ac_nir_context *ctx,
   const nir_intrinsic_instr *instr)
 {
-   const char *name;
-   LLVMValueRef params[6];
+   const char *atomic_name;
+   char intrinsic_name[64];
+   LLVMValueRef params[7];
int arg_count = 0;
-
-   if (instr->intrinsic == nir_intrinsic_ssbo_atomic_comp_swap) {
-   params[arg_count++] = ac_llvm_extract_elem(>ac, 
get_src(ctx, instr->src[3]), 0);
-   }
-   params[arg_count++] = ac_llvm_extract_elem(>ac, get_src(ctx, 
instr->src[2]), 0);
-   params[arg_count++] = ctx->abi->load_ssbo(ctx->abi,
-get_src(ctx, instr->src[0]),
-true);
-   params[arg_count++] = ctx->ac.i32_0; /* vindex */
-   params[arg_count++] = get_src(ctx, instr->src[1]);  /* voffset */
-   params[arg_count++] = ctx->ac.i1false;  /* slc */
+   int length;
 
switch (instr->intrinsic) {
case nir_intrinsic_ssbo_atomic_add:
-   name = "llvm.amdgcn.buffer.atomic.add";
+   atomic_name = "add";
break;
case nir_intrinsic_ssbo_atomic_imin:
-   name = "llvm.amdgcn.buffer.atomic.smin";
+   atomic_name = "smin";
break;
case nir_intrinsic_ssbo_atomic_umin:
-   name = "llvm.amdgcn.buffer.atomic.umin";
+   atomic_name = "umin";
break;
case nir_intrinsic_ssbo_atomic_imax:
-   name = "llvm.amdgcn.buffer.atomic.smax";
+   atomic_name = "smax";
break;
case nir_intrinsic_ssbo_atomic_umax:
-   name = "llvm.amdgcn.buffer.atomic.umax";
+   atomic_name = "umax";
break;
case nir_intrinsic_ssbo_atomic_and:
-   name = "llvm.amdgcn.buffer.atomic.and";
+   atomic_name = "and";
break;
case nir_intrinsic_ssbo_atomic_or:
-   name = "llvm.amdgcn.buffer.atomic.or";
+   atomic_name = "or";
break;
case nir_intrinsic_ssbo_atomic_xor:
-   name = "llvm.amdgcn.buffer.atomic.xor";
+   atomic_name = "xor";
break;
case nir_intrinsic_ssbo_atomic_exchange:
-   name = "llvm.amdgcn.buffer.atomic.swap";
+   atomic_name = "swap";
break;
case nir_intrinsic_ssbo_atomic_comp_swap:
-   name = "llvm.amdgcn.buffer.atomic.cmpswap";
+   atomic_name = "cmpswap";
break;
default:
abort();
}
 
-   return ac_build_intrinsic(>ac, name, ctx->ac.i32, params, 
arg_count, 0);
+   if (instr->intrinsic == nir_intrinsic_ssbo_atomic_comp_swap) {
+   params[arg_count++] = ac_llvm_extract_elem(>ac, 
get_src(ctx, instr->src[3]), 0);
+   }
+   params[arg_count++] = ac_llvm_extract_elem(>ac, get_src(ctx, 
instr->src[2]), 0);
+   params[arg_count++] = ctx->abi->load_ssbo(ctx->abi,
+get_src(ctx, instr->src[0]),
+true);
+   params[arg_count++] = ctx->ac.i32_0; /* vindex */
+   params[arg_count++] = get_src(ctx, instr->src[1]);  /* voffset */
+
+   if (HAVE_LLVM >= 0x0800) {
+   params[arg_count++] = ctx->ac.i32_0; /* soffset */
+   params[arg_count++] = ctx->ac.i32_0;  /* slc */
+
+   length = snprintf(intrinsic_name, sizeof(intrinsic_name),
+ "llvm.amdgcn.struct.buffer.atomic.%s.i32",
+ atomic_name);
+   } else {
+   params[arg_count++] = ctx->ac.i1false;  /* slc */
+
+   length = snprintf(intrinsic_name, sizeof(intrinsic_name),
+ "llvm.amdgcn.buffer.atomic.%s", atomic_name);
+   }
+
+   assert(length < sizeof(intrinsic_name));
+   return ac_build_intrinsic(>ac, intrinsic_name, ctx->ac.i32,
+ params, arg_count, 0);
 }
 
 static LLVMValueRef visit_load_buffer(struct ac_nir_context *ctx,
-- 
2.21.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 03/11] ac/nir: set attrib flags for SSBO and image store operations

2019-03-12 Thread Samuel Pitoiset
For consistency regarding other store operations.

Signed-off-by: Samuel Pitoiset 
---
 src/amd/common/ac_nir_to_llvm.c | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index a7b3fdf64aa..ff29345ffe5 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -1591,7 +1591,8 @@ static void visit_store_ssbo(struct ac_nir_context *ctx,
ctx->ac.i1false,
};
ac_build_intrinsic(>ac, store_name,
-  ctx->ac.voidt, tbuffer_params, 10, 
0);
+  ctx->ac.voidt, tbuffer_params, 10,
+  
ac_get_store_intr_attribs(writeonly_memory));
} else {
switch (num_bytes) {
case 16: /* v4f32 */
@@ -1619,7 +1620,8 @@ static void visit_store_ssbo(struct ac_nir_context *ctx,
ctx->ac.i1false,  /* slc */
};
ac_build_intrinsic(>ac, store_name,
-  ctx->ac.voidt, params, 6, 0);
+  ctx->ac.voidt, params, 6,
+  
ac_get_store_intr_attribs(writeonly_memory));
}
}
 }
@@ -2548,7 +2550,8 @@ static void visit_image_store(struct ac_nir_context *ctx,
params[4] = LLVMConstInt(ctx->ac.i1, 
!!(args.cache_policy & ac_glc), 0);
params[5] = ctx->ac.i1false;  /* slc */
}
-   ac_build_intrinsic(>ac, name, ctx->ac.voidt, params, 6, 0);
+   ac_build_intrinsic(>ac, name, ctx->ac.voidt, params, 6,
+  ac_get_store_intr_attribs(writeonly_memory));
} else {
args.opcode = ac_image_store;
args.data[0] = ac_to_float(>ac, get_src(ctx, 
instr->src[3]));
-- 
2.21.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 02/11] ac: make use of ac_get_store_intr_attribs() where possible

2019-03-12 Thread Samuel Pitoiset
Signed-off-by: Samuel Pitoiset 
---
 src/amd/common/ac_llvm_build.c | 8 ++--
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/src/amd/common/ac_llvm_build.c b/src/amd/common/ac_llvm_build.c
index 88ea289a121..253073e52fb 100644
--- a/src/amd/common/ac_llvm_build.c
+++ b/src/amd/common/ac_llvm_build.c
@@ -1150,9 +1150,7 @@ ac_build_buffer_store_dword(struct ac_llvm_context *ctx,
 
ac_build_intrinsic(ctx, name, ctx->voidt,
   args, ARRAY_SIZE(args),
-  writeonly_memory ?
-  AC_FUNC_ATTR_INACCESSIBLE_MEM_ONLY :
-  AC_FUNC_ATTR_WRITEONLY);
+  ac_get_store_intr_attribs(writeonly_memory));
return;
}
 
@@ -1181,9 +1179,7 @@ ac_build_buffer_store_dword(struct ac_llvm_context *ctx,
 
ac_build_intrinsic(ctx, name, ctx->voidt,
   args, ARRAY_SIZE(args),
-  writeonly_memory ?
-  AC_FUNC_ATTR_INACCESSIBLE_MEM_ONLY :
-  AC_FUNC_ATTR_WRITEONLY);
+  ac_get_store_intr_attribs(writeonly_memory));
 }
 
 static LLVMValueRef
-- 
2.21.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 00/11] ac: use LLVM 8 buffer intrinsics everywhere

2019-03-12 Thread Samuel Pitoiset
Hi,

This small series makes use of new LLVM 8 buffer intrinsics.
No CTS regressions on GFX8 with LLVM 7, 8 and master.

Please review,
Thanks!

Samuel Pitoiset (11):
  ac: fix glc parameter use for new LLVM 8 typed buffer intrinsics
  ac: make use of ac_get_store_intr_attribs() where possible
  ac/nir: set attrib flags for SSBO and image store operations
  ac: add ac_build_buffer_store_format() helper
  ac/nir: remove one useless check in visit_store_ssbo()
  ac/nir: use new LLVM 8 intrinsics for SSBO atomic operations
  ac/nir: use ac_build_buffer_load() for SSBO load operations
  ac/nir: use ac_build_buffer_store_dword() for SSBO store operations
  ac: use new LLVM 8 intrinsics in ac_build_buffer_load()
  ac: use new LLVM 8 intrinsic when storing 16-bit values
  ac: use new LLVM 8 intrinsics in ac_build_buffer_store_dword()

 src/amd/common/ac_llvm_build.c  | 290 +++-
 src/amd/common/ac_llvm_build.h  |  39 -
 src/amd/common/ac_nir_to_llvm.c | 188 -
 3 files changed, 356 insertions(+), 161 deletions(-)

-- 
2.21.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 1/2] panfrost: Set bo->size[0] in the DRM backend

2019-03-12 Thread Tomeu Vizoso
So we can unmap it later.

Signed-off-by: Tomeu Vizoso 
---
 include/drm-uapi/panfrost_drm.h|  1 -
 src/gallium/drivers/panfrost/pan_drm.c | 10 +-
 2 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/include/drm-uapi/panfrost_drm.h b/include/drm-uapi/panfrost_drm.h
index 7618f14f9e26..a0ead4979ccc 100644
--- a/include/drm-uapi/panfrost_drm.h
+++ b/include/drm-uapi/panfrost_drm.h
@@ -27,7 +27,6 @@ extern "C" {
 #define DRM_IOCTL_PANFROST_GET_BO_OFFSET   DRM_IOWR(DRM_COMMAND_BASE + 
DRM_PANFROST_GET_BO_OFFSET, struct drm_panfrost_get_bo_offset)
 
 #define PANFROST_JD_REQ_FS (1 << 0)
-
 /**
  * struct drm_panfrost_submit - ioctl argument for submitting commands to the 
3D
  * engine.
diff --git a/src/gallium/drivers/panfrost/pan_drm.c 
b/src/gallium/drivers/panfrost/pan_drm.c
index 6d1129ff5f2b..71e8c6ac1f9d 100644
--- a/src/gallium/drivers/panfrost/pan_drm.c
+++ b/src/gallium/drivers/panfrost/pan_drm.c
@@ -125,7 +125,7 @@ panfrost_drm_import_bo(struct panfrost_screen *screen, 
struct winsys_handle *wha
struct panfrost_drm *drm = (struct panfrost_drm *)screen->driver;
 struct drm_panfrost_get_bo_offset get_bo_offset = {0,};
struct drm_panfrost_mmap_bo mmap_bo = {0,};
-int ret, size;
+int ret;
 unsigned gem_handle;
 
ret = drmPrimeFDToHandle(drm->fd, whandle->handle, _handle);
@@ -146,9 +146,9 @@ panfrost_drm_import_bo(struct panfrost_screen *screen, 
struct winsys_handle *wha
assert(0);
}
 
-   size = lseek(whandle->handle, 0, SEEK_END);
-   assert(size > 0);
-bo->cpu[0] = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED,
+bo->size[0] = lseek(whandle->handle, 0, SEEK_END);
+assert(bo->size[0] > 0);
+bo->cpu[0] = mmap(NULL, bo->size[0], PROT_READ | PROT_WRITE, 
MAP_SHARED,
drm->fd, mmap_bo.offset);
 if (bo->cpu[0] == MAP_FAILED) {
 fprintf(stderr, "mmap failed: %p\n", bo->cpu[0]);
@@ -156,7 +156,7 @@ panfrost_drm_import_bo(struct panfrost_screen *screen, 
struct winsys_handle *wha
}
 
 /* Record the mmap if we're tracing */
-pantrace_mmap(bo->gpu[0], bo->cpu[0], size, NULL);
+pantrace_mmap(bo->gpu[0], bo->cpu[0], bo->size[0], NULL);
 
 return bo;
 }
-- 
2.20.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 2/2] panfrost: Set bo->gem_handle when creating a linear BO

2019-03-12 Thread Tomeu Vizoso
So we can free it later.

Signed-off-by: Tomeu Vizoso 
---
 src/gallium/drivers/panfrost/pan_resource.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/panfrost/pan_resource.c 
b/src/gallium/drivers/panfrost/pan_resource.c
index d647f618ee77..2fa468b177b9 100644
--- a/src/gallium/drivers/panfrost/pan_resource.c
+++ b/src/gallium/drivers/panfrost/pan_resource.c
@@ -237,6 +237,7 @@ panfrost_create_bo(struct panfrost_screen *screen, const 
struct pipe_resource *t
 
 bo->cpu[0] = mem.cpu;
 bo->gpu[0] = mem.gpu;
+bo->gem_handle = mem.gem_handle;
 
 /* TODO: Mipmap */
 }
@@ -312,7 +313,8 @@ panfrost_destroy_bo(struct panfrost_screen *screen, struct 
panfrost_bo *pbo)
 struct panfrost_memory mem = {
 .cpu = bo->cpu[0],
 .gpu = bo->gpu[0],
-.size = bo->size[0]
+.size = bo->size[0],
+.gem_handle = bo->gem_handle,
 };
 
 screen->driver->free_slab(screen, );
-- 
2.20.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [Bug 109535] [Tracker] Mesa 19.0 release tracker

2019-03-12 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=109535
Bug 109535 depends on bug 107510, which changed state.

Bug 107510 Summary: [GEN8+] up to 10% perf drop on several 3D benchmarks
https://bugs.freedesktop.org/show_bug.cgi?id=107510

   What|Removed |Added

 Status|REOPENED|RESOLVED
 Resolution|--- |FIXED

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] gallium/docs: clarify set_sampler_views

2019-03-12 Thread Rob Clark
This previously was not called out clearly, but based on a survey of the
code, it seems the expected behavior is to release the reference to any
sampler views beyond the new range being bound.

I think radeonsi and freedreno were the only ones not doing this.  Which
could probably temporarily leak a bit of memory by holding on to the
sampler view reference.

Signed-off-by: Rob Clark 
---
 src/gallium/docs/source/context.rst | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/gallium/docs/source/context.rst 
b/src/gallium/docs/source/context.rst
index f89d9e1005e..199d335f8f4 100644
--- a/src/gallium/docs/source/context.rst
+++ b/src/gallium/docs/source/context.rst
@@ -143,6 +143,9 @@ to the array index which is used for sampling.
   to a respective sampler view and releases a reference to the previous
   sampler view.
 
+  Previously bound samplers with index ``>= num_views`` are unbound rather
+  than unmodified.
+
 * ``create_sampler_view`` creates a new sampler view. ``texture`` is associated
   with the sampler view which results in sampler view holding a reference
   to the texture. Format specified in template must be compatible
-- 
2.20.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 0/3] Fix Bugzilla Bug 109646 - jagginess with video play

2019-03-12 Thread Zhu, James
Bug 109646 - New video compositor compute shader render glitches mpv
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109646
Fixed Problem 2,3:Jaggines caused by interpolation issue with weave de-interlace

James Zhu (3):
  gallium/auxiliary/vl: Increase shader_params size
  gallium/auxiliary/vl: Change grid setting
  gallium/auxiliary/vl: Change weave compute shader implementation

 src/gallium/auxiliary/vl/vl_compositor.c|  2 +-
 src/gallium/auxiliary/vl/vl_compositor_cs.c | 97 ++---
 2 files changed, 76 insertions(+), 23 deletions(-)

-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] radv: Fix driverUUID

2019-03-12 Thread Samuel Pitoiset

Reviewed-by: Samuel Pitoiset 

On 3/12/19 4:07 PM, Józef Kucia wrote:

---
  src/amd/vulkan/radv_device.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
index 83d218fb6bf0..6deb69d22d48 100644
--- a/src/amd/vulkan/radv_device.c
+++ b/src/amd/vulkan/radv_device.c
@@ -337,7 +337,7 @@ radv_physical_device_init(struct radv_physical_device 
*device,
device->rad_info.chip_class > GFX9)
fprintf(stderr, "WARNING: radv is not a conformant vulkan 
implementation, testing use only.\n");
  
-	radv_get_driver_uuid(>device_uuid);

+   radv_get_driver_uuid(>driver_uuid);
radv_get_device_uuid(>rad_info, >device_uuid);
  
  	if (device->rad_info.family == CHIP_STONEY ||

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 2/3] gallium/auxiliary/vl: Change grid setting

2019-03-12 Thread Zhu, James
Using draw area for grid setting instead of destination
buffer size.

Signed-off-by: James Zhu 
Tested-by: Bruno Milreu 
---
 src/gallium/auxiliary/vl/vl_compositor_cs.c | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/src/gallium/auxiliary/vl/vl_compositor_cs.c 
b/src/gallium/auxiliary/vl/vl_compositor_cs.c
index 70898be..de0a3c7 100644
--- a/src/gallium/auxiliary/vl/vl_compositor_cs.c
+++ b/src/gallium/auxiliary/vl/vl_compositor_cs.c
@@ -221,7 +221,8 @@ const char *compute_shader_rgba =
 
 static void
 cs_launch(struct vl_compositor *c,
-  void *cs)
+  void *cs,
+  const struct u_rect  *draw_area)
 {
struct pipe_context *ctx = c->pipe;
 
@@ -241,8 +242,8 @@ cs_launch(struct vl_compositor *c,
info.block[0] = 8;
info.block[1] = 8;
info.block[2] = 1;
-   info.grid[0] = DIV_ROUND_UP(c->fb_state.width, info.block[0]);
-   info.grid[1] = DIV_ROUND_UP(c->fb_state.height, info.block[1]);
+   info.grid[0] = DIV_ROUND_UP(draw_area->x1, info.block[0]);
+   info.grid[1] = DIV_ROUND_UP(draw_area->y1, info.block[1]);
info.grid[2] = 1;
 
ctx->launch_grid(ctx, );
@@ -346,7 +347,7 @@ draw_layers(struct vl_compositor   *c,
  c->pipe->set_sampler_views(c->pipe, PIPE_SHADER_COMPUTE, 0,
 num_sampler_views, samplers);
 
- cs_launch(c, layer->cs);
+ cs_launch(c, layer->cs, &(drawn.area));
 
  if (dirty) {
 struct u_rect drawn = calc_drawn_area(s, layer);
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 3/3] gallium/auxiliary/vl: Change weave compute shader implementation

2019-03-12 Thread Zhu, James
Use 2D_ARRARY instead of RECT to fetch texels for weave compute
shader.

Problem 2,3: Fixed interpolation issue with weave de-interlace

Fixes: 9364d66cb7f7 (Add video compositor compute shader render)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109646
Signed-off-by: James Zhu 
Tested-by: Bruno Milreu 
---
 src/gallium/auxiliary/vl/vl_compositor_cs.c | 79 ++---
 1 file changed, 62 insertions(+), 17 deletions(-)

diff --git a/src/gallium/auxiliary/vl/vl_compositor_cs.c 
b/src/gallium/auxiliary/vl/vl_compositor_cs.c
index de0a3c7..bad7d5f 100644
--- a/src/gallium/auxiliary/vl/vl_compositor_cs.c
+++ b/src/gallium/auxiliary/vl/vl_compositor_cs.c
@@ -113,15 +113,16 @@ const char *compute_shader_weave =
   "DCL SV[1], BLOCK_ID\n"
 
   "DCL CONST[0..5]\n"
-  "DCL SVIEW[0..2], RECT, FLOAT\n"
+  "DCL SVIEW[0..2], 2D_ARRAY, FLOAT\n"
   "DCL SAMP[0..2]\n"
 
   "DCL IMAGE[0], 2D, WR\n"
-  "DCL TEMP[0..9]\n"
+  "DCL TEMP[0..15]\n"
 
   "IMM[0] UINT32 { 8, 8, 1, 0}\n"
   "IMM[1] FLT32 { 1.0, 2.0, 0.0, 0.0}\n"
   "IMM[2] UINT32 { 1, 2, 4, 0}\n"
+  "IMM[3] FLT32 { 0.25, 0.5, 0.125, 0.125}\n"
 
   "UMAD TEMP[0], SV[1], IMM[0], SV[0]\n"
 
@@ -137,26 +138,70 @@ const char *compute_shader_weave =
  /* Translate */
  "UADD TEMP[2].xy, TEMP[2], -CONST[5].xyxy\n"
 
- /* Texture layer */
- "UMOD TEMP[2].z, TEMP[2]., IMM[2].\n"
- "UMOD TEMP[3].z, TEMP[2]., IMM[2].\n"
- "USHR TEMP[3].z, TEMP[3]., IMM[2].\n"
+ /* Top Y */
+ "U2F TEMP[2], TEMP[2]\n"
+ "DIV TEMP[2].y, TEMP[2]., IMM[1].\n"
+ /* Down Y */
+ "MOV TEMP[12], TEMP[2]\n"
+
+ /* Top UV */
+ "MOV TEMP[3], TEMP[2]\n"
+ "DIV TEMP[3].xy, TEMP[3], IMM[1].\n"
+ /* Down UV */
+ "MOV TEMP[13], TEMP[3]\n"
+
+ /* Texture offset */
+ "ADD TEMP[2].x, TEMP[2]., IMM[3].\n"
+ "ADD TEMP[2].y, TEMP[2]., IMM[3].\n"
+ "ADD TEMP[12].x, TEMP[12]., IMM[3].\n"
+ "ADD TEMP[12].y, TEMP[12]., IMM[3].\n"
+
+ "ADD TEMP[3].x, TEMP[3]., IMM[3].\n"
+ "ADD TEMP[3].y, TEMP[3]., IMM[3].\n"
+ "ADD TEMP[13].x, TEMP[13]., IMM[3].\n"
+ "ADD TEMP[13].y, TEMP[13]., IMM[3].\n"
 
- "USHR TEMP[2].y, TEMP[2], IMM[2].\n"
- "USHR TEMP[3].xy, TEMP[2], IMM[2].\n"
+ /* Scale */
+ "DIV TEMP[2].xy, TEMP[2], CONST[3].zwzw\n"
+ "DIV TEMP[12].xy, TEMP[12], CONST[3].zwzw\n"
+ "DIV TEMP[3].xy, TEMP[3], CONST[3].zwzw\n"
+ "DIV TEMP[13].xy, TEMP[13], CONST[3].zwzw\n"
 
- "U2F TEMP[4], TEMP[2]\n"
- "U2F TEMP[5], TEMP[3]\n"
+ /* Weave offset */
+ "ADD TEMP[2].y, TEMP[2]., IMM[3].\n"
+ "ADD TEMP[12].y, TEMP[12]., -IMM[3].\n"
+ "ADD TEMP[3].y, TEMP[3]., IMM[3].\n"
+ "ADD TEMP[13].y, TEMP[13]., -IMM[3].\n"
 
- /* Scale */
- "DIV TEMP[4], TEMP[4], CONST[3].zwzw\n"
- "DIV TEMP[5], TEMP[5], CONST[3].zwzw\n"
+ /* Texture layer */
+ "MOV TEMP[14].x, TEMP[2].\n"
+ "MOV TEMP[14].yz, TEMP[3].\n"
+ "ROUND TEMP[15], TEMP[14]\n"
+ "ADD TEMP[14], TEMP[14], -TEMP[15]\n"
+ "MOV TEMP[14], |TEMP[14]|\n"
+ "MUL TEMP[14], TEMP[14], IMM[1].\n"
+
+ /* Normalize */
+ "DIV TEMP[2].xy, TEMP[2], CONST[5].zwzw\n"
+ "DIV TEMP[12].xy, TEMP[12], CONST[5].zwzw\n"
+ "DIV TEMP[15].xy, CONST[5].zwzw, IMM[1].\n"
+ "DIV TEMP[3].xy, TEMP[3], TEMP[15].xyxy\n"
+ "DIV TEMP[13].xy, TEMP[13], TEMP[15].xyxy\n"
 
  /* Fetch texels */
- "TEX_LZ TEMP[6].x, TEMP[4], SAMP[0], RECT\n"
- "TEX_LZ TEMP[6].y, TEMP[5], SAMP[1], RECT\n"
- "TEX_LZ TEMP[6].z, TEMP[5], SAMP[2], RECT\n"
-
+ "MOV TEMP[2].z, IMM[1].\n"
+ "MOV TEMP[3].z, IMM[1].\n"
+ "TEX_LZ TEMP[10].x, TEMP[2], SAMP[0], 2D_ARRAY\n"
+ "TEX_LZ TEMP[10].y, TEMP[3], SAMP[1], 2D_ARRAY\n"
+ "TEX_LZ TEMP[10].z, TEMP[3], SAMP[2], 2D_ARRAY\n"
+
+ "MOV TEMP[12].z, IMM[1].\n"
+ "MOV TEMP[13].z, IMM[1].\n"
+ "TEX_LZ TEMP[11].x, TEMP[12], SAMP[0], 2D_ARRAY\n"
+ "TEX_LZ TEMP[11].y, TEMP[13], SAMP[1], 2D_ARRAY\n"
+ "TEX_LZ TEMP[11].z, TEMP[13], SAMP[2], 2D_ARRAY\n"
+
+ "LRP TEMP[6], TEMP[14], TEMP[11], TEMP[10]\n"
  "MOV TEMP[6].w, IMM[1].\n"
 
  /* Color Space Conversion */
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] mesa: Fix GL_NUM_DEVICE_UUIDS_EXT

2019-03-12 Thread Józef Kucia
Cc: mesa-sta...@lists.freedesktop.org
---
 src/mesa/main/get.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c
index ee77c45d03c1..efc9c11f79d3 100644
--- a/src/mesa/main/get.c
+++ b/src/mesa/main/get.c
@@ -906,6 +906,9 @@ find_custom_value(struct gl_context *ctx, const struct 
value_desc *d, union valu
   break;
 
/* GL_EXT_external_objects */
+   case GL_NUM_DEVICE_UUIDS_EXT:
+  v->value_int = 1;
+  break;
case GL_DRIVER_UUID_EXT:
   _mesa_get_driver_uuid(ctx, v->value_int_4);
   break;
-- 
2.19.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 1/3] gallium/auxiliary/vl: Increase shader_params size

2019-03-12 Thread Zhu, James
Increase shader_params size to pass sampler data to
compute shader during weave de-interlace.

Signed-off-by: James Zhu 
Tested-by: Bruno Milreu 
---
 src/gallium/auxiliary/vl/vl_compositor.c| 2 +-
 src/gallium/auxiliary/vl/vl_compositor_cs.c | 9 -
 2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/src/gallium/auxiliary/vl/vl_compositor.c 
b/src/gallium/auxiliary/vl/vl_compositor.c
index a8f3620..12c58ff 100644
--- a/src/gallium/auxiliary/vl/vl_compositor.c
+++ b/src/gallium/auxiliary/vl/vl_compositor.c
@@ -802,7 +802,7 @@ vl_compositor_init_state(struct vl_compositor_state *s, 
struct pipe_context *pip
   pipe->screen,
   PIPE_BIND_CONSTANT_BUFFER,
   PIPE_USAGE_DEFAULT,
-  sizeof(csc_matrix) + 4*sizeof(float) + 6*sizeof(int)
+  sizeof(csc_matrix) + 6*sizeof(float) + 6*sizeof(int)
);
 
if (!s->shader_params)
diff --git a/src/gallium/auxiliary/vl/vl_compositor_cs.c 
b/src/gallium/auxiliary/vl/vl_compositor_cs.c
index 0e49c11..70898be 100644
--- a/src/gallium/auxiliary/vl/vl_compositor_cs.c
+++ b/src/gallium/auxiliary/vl/vl_compositor_cs.c
@@ -38,6 +38,8 @@ struct cs_viewport {
struct u_rect area;
int translate_x;
int translate_y;
+   float sampler0_w;
+   float sampler0_h;
 };
 
 const char *compute_shader_video_buffer =
@@ -298,8 +300,11 @@ set_viewport(struct vl_compositor_state *s,
*ptr_int++ = drawn->area.x1;
*ptr_int++ = drawn->area.y1;
*ptr_int++ = drawn->translate_x;
-   *ptr_int = drawn->translate_y;
+   *ptr_int++ = drawn->translate_y;
 
+   ptr_float = (float *)ptr_int;
+   *ptr_float++ = drawn->sampler0_w;
+   *ptr_float = drawn->sampler0_h;
pipe_buffer_unmap(s->pipe, buf_transfer);
 
return true;
@@ -328,6 +333,8 @@ draw_layers(struct vl_compositor   *c,
  drawn.scale_y = drawn.scale_x;
  drawn.translate_x = (int)layer->viewport.translate[0];
  drawn.translate_y = (int)layer->viewport.translate[1];
+ drawn.sampler0_w = (float)layer->sampler_views[0]->texture->width0;
+ drawn.sampler0_h = (float)layer->sampler_views[0]->texture->height0;
 
  if (memcmp(, _drawn, sizeof(struct cs_viewport))) {
 set_viewport(s, );
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] radv: Fix driverUUID

2019-03-12 Thread Józef Kucia
---
 src/amd/vulkan/radv_device.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
index 83d218fb6bf0..6deb69d22d48 100644
--- a/src/amd/vulkan/radv_device.c
+++ b/src/amd/vulkan/radv_device.c
@@ -337,7 +337,7 @@ radv_physical_device_init(struct radv_physical_device 
*device,
device->rad_info.chip_class > GFX9)
fprintf(stderr, "WARNING: radv is not a conformant vulkan 
implementation, testing use only.\n");
 
-   radv_get_driver_uuid(>device_uuid);
+   radv_get_driver_uuid(>driver_uuid);
radv_get_device_uuid(>rad_info, >device_uuid);
 
if (device->rad_info.family == CHIP_STONEY ||
-- 
2.19.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [Bug 109929] tgsi_to_nir.c:2111: undefined reference to `gl_nir_lower_samplers_as_deref'

2019-03-12 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=109929

--- Comment #4 from Timur Kristóf  ---
Created attachment 143635
  --> https://bugs.freedesktop.org/attachment.cgi?id=143635=edit
build log of mesa with e582e761b7f49d1c0b100289b62442e6295cefef

I tried with the commit before that, e582e761b7f49d1c0b100289b62442e6295cefef
and mesa doesn't build with autotools. Attached the log. Am I doing something
wrong here?

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [Bug 109641] GLX swrast driver leaks shared memory segments

2019-03-12 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=109641

--- Comment #2 from Brian Paul  ---
And it appears this patch will be in the Mesa 19.0 release.

--- Comment #3 from Marcello Blancasio  ---
The patch fixed the issue. Thank you.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] 2019 X.Org Foundation Election Candidates

2019-03-12 Thread Luc Verhaegen
On Tue, Mar 12, 2019 at 12:24:00AM +, Wentland, Harry wrote:
> To all X.Org Foundation Members:
> 
> The election for the X.Org Foundation Board of Directors will begin on 
> 14 March 2019. We have 6 candidates who are running for 4 seats. They 
> are (in alphabetical order):

> Attached below are the Personal Statements each candidate submitted 
> for your consideration along with their Statements of Contribution 
> that they submitted with the membership application. Please review 
> each of the candidates' statements to help you decide whom to vote for 
> during the upcoming election.
> 
> If you have questions of the candidates, you should feel free to ask them 
> here on the mailing list.
> 
> The election committee will provide detailed instructions on how the voting 
> system will work when the voting period begins.
> 
> Harry, on behalf of the X.Org elections committee

Are these candidates the only thing we will be voting on?

Luc Verhaegen.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [Bug 77449] Tracker bug for all bugs related to Steam titles

2019-03-12 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=77449
Bug 77449 depends on bug 104602, which changed state.

Bug 104602 Summary: [apitrace] Graphical artifacts in Civilization VI on RX Vega
https://bugs.freedesktop.org/show_bug.cgi?id=104602

   What|Removed |Added

 Status|REOPENED|RESOLVED
 Resolution|--- |FIXED

-- 
You are receiving this mail because:
You are the assignee for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 2/9] anv: Set the values for the VkPhysicalDeviceSampleLocationsPropertiesEXT

2019-03-12 Thread Eleni Maria Stea
On Mon, 11 Mar 2019 11:39:58 -0700
Sagar Ghuge  wrote:

> On Mon, 2019-03-11 at 17:04 +0200, Eleni Maria Stea wrote:
> > The VkPhysicalDeviceSampleLocationPropertiesEXT struct is filled
> > with implementation dependent values and according to the table
> > from the Vulkan Specification section [36.1. Limit Requirements]:
> > 
> > pname | max | min
> > pname:sampleLocationSampleCounts   |-
> > |ename:VK_SAMPLE_COU NT_4_BIT
> > pname:maxSampleLocationGridSize|-|(1, 1)
> > pname:sampleLocationCoordinateRange|(0.0, 0.9375)|(0.0, 0.9375)
> > pname:sampleLocationSubPixelBits   |-|4
> > pname:variableSampleLocations  | false   |implementation
> > dependent
> > 
> > The hardware only supports setting the same sample location for all
> > the
> > pixels, so we only support 1x1 grids.
> > 
> > Also, variableSampleLocations is set to false because we don't
> > support the
> > feature.
> > ---
> >  src/intel/vulkan/anv_device.c  | 21 +
> >  src/intel/vulkan/anv_private.h |  3 +++
> >  2 files changed, 24 insertions(+)
> > 
> > diff --git a/src/intel/vulkan/anv_device.c
> > b/src/intel/vulkan/anv_device.c
> > index 729cceb3e32..1e183b7f4ad 100644
> > --- a/src/intel/vulkan/anv_device.c
> > +++ b/src/intel/vulkan/anv_device.c
> > @@ -1401,6 +1401,27 @@ void anv_GetPhysicalDeviceProperties2(
> >   break;
> >}
> >  
> > +  case
> > VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SAMPLE_LOCATIONS_PROPERTIES_EXT: {
> > + VkPhysicalDeviceSampleLocationsPropertiesEXT *props =
> > +(VkPhysicalDeviceSampleLocationsPropertiesEXT *)ext;
> > + props->sampleLocationSampleCounts = ISL_SAMPLE_COUNT_2_BIT
> > |
> > + ISL_SAMPLE_COUNT_4_BIT
> > |
> > +
> > ISL_SAMPLE_COUNT_8_BIT;
> > + if (pdevice->info.gen >= 9)
> > +props->sampleLocationSampleCounts |=
> > ISL_SAMPLE_COUNT_16_BIT;  
> 
> Hi Eleni,
> 
> Thanks for the series.
> 
> "isl_device_get_sample_counts" method figure out values according to
> platform so maybe we can make use of it and ignore
> ISL_SAMPLE_COUNT_1_BIT. So that we don't have to take care of values
> according to platform here. 
> 
> I am not sure about this, so it might be a good idea to consult with
> Jason/Lionel once. :)

I think that not only you are right here, but on top of that we
shouldn't ignore the ISL_SAMPLE_COUNT_1_BIT, as we can still write one
user defined location when only 1 sample per pixel is used (at least
MULTISAMPLE and SAMPLE_PATTERN commands allow us to do so). So, I've
made the change, thank you. :)

> 
> with or without the fix, this patch is:
> 
> Reviewed-by: Sagar Ghuge 
> 

Thanks for the review!
Eleni
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH v2 8/9] anv: Removed unused header file

2019-03-12 Thread Eleni Maria Stea
In src/intel/vulkan/genX_blorp_exec.c we included the file:
common/gen_sample_positions.h but not use it. Removed.
---
 src/intel/vulkan/genX_blorp_exec.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/src/intel/vulkan/genX_blorp_exec.c 
b/src/intel/vulkan/genX_blorp_exec.c
index e9c85d56d5f..0eeefaaa9d6 100644
--- a/src/intel/vulkan/genX_blorp_exec.c
+++ b/src/intel/vulkan/genX_blorp_exec.c
@@ -31,7 +31,6 @@
 #undef __gen_combine_address
 
 #include "common/gen_l3_config.h"
-#include "common/gen_sample_positions.h"
 #include "blorp/blorp_genX_exec.h"
 
 static void *
-- 
2.20.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH v2 7/9] anv: Optimized the emission of the default locations on Gen8+

2019-03-12 Thread Eleni Maria Stea
We only emit sample locations when the extension is enabled by the user.
In all other cases the default locations are emitted once when the device
is initialized to increase performance.
---
 src/intel/vulkan/anv_genX.h|  3 ++-
 src/intel/vulkan/genX_cmd_buffer.c |  2 +-
 src/intel/vulkan/genX_pipeline.c   | 11 +++
 src/intel/vulkan/genX_state.c  |  8 +---
 4 files changed, 15 insertions(+), 9 deletions(-)

diff --git a/src/intel/vulkan/anv_genX.h b/src/intel/vulkan/anv_genX.h
index e82d83465ef..7f33a2b0a68 100644
--- a/src/intel/vulkan/anv_genX.h
+++ b/src/intel/vulkan/anv_genX.h
@@ -93,4 +93,5 @@ void genX(emit_ms_state)(struct anv_batch *batch,
  struct anv_sample *anv_samples,
  uint32_t num_samples,
  uint32_t log2_samples,
- bool custom_sample_locations);
+ bool custom_sample_locations,
+ bool sample_locations_ext_enabled);
diff --git a/src/intel/vulkan/genX_cmd_buffer.c 
b/src/intel/vulkan/genX_cmd_buffer.c
index 4752c66f350..ae7c5a80a3c 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -2654,7 +2654,7 @@ cmd_buffer_emit_sample_locations(struct anv_cmd_buffer 
*cmd_buffer)
anv_samples = cmd_buffer->state.gfx.dynamic.sample_locations.anv_samples;
 
genX(emit_ms_state)(_buffer->batch, anv_samples, samples,
-   log2_samples, true);
+   log2_samples, true, true);
 }
 
 void
diff --git a/src/intel/vulkan/genX_pipeline.c b/src/intel/vulkan/genX_pipeline.c
index 8afc08f0320..12adfa65da8 100644
--- a/src/intel/vulkan/genX_pipeline.c
+++ b/src/intel/vulkan/genX_pipeline.c
@@ -573,10 +573,12 @@ emit_sample_mask(struct anv_pipeline *pipeline,
 }
 
 static void
-emit_ms_state(struct anv_pipeline *pipeline,
+emit_ms_state(struct anv_device *device,
+  struct anv_pipeline *pipeline,
   const VkPipelineMultisampleStateCreateInfo *info,
   const VkPipelineDynamicStateCreateInfo *dinfo)
 {
+   bool sample_loc_enabled = device->enabled_extensions.EXT_sample_locations;
struct anv_sample anv_samples[MAX_SAMPLE_LOCATIONS];
VkSampleLocationsInfoEXT *sl;
bool custom_locations = false;
@@ -588,7 +590,7 @@ emit_ms_state(struct anv_pipeline *pipeline,
if (info) {
   samples = info->rasterizationSamples;
 
-  if (info->pNext) {
+  if (sample_loc_enabled && info->pNext) {
  VkPipelineSampleLocationsStateCreateInfoEXT *slinfo =
 (VkPipelineSampleLocationsStateCreateInfoEXT *)info->pNext;
 
@@ -617,7 +619,7 @@ emit_ms_state(struct anv_pipeline *pipeline,
}
 
genX(emit_ms_state)(>batch, anv_samples, samples, log2_samples,
-   custom_locations);
+   custom_locations, sample_loc_enabled);
 }
 
 static const uint32_t vk_to_gen_logic_op[] = {
@@ -1947,7 +1949,8 @@ genX(graphics_pipeline_create)(
assert(pCreateInfo->pRasterizationState);
emit_rs_state(pipeline, pCreateInfo->pRasterizationState,
  pCreateInfo->pMultisampleState, pass, subpass);
-   emit_ms_state(pipeline, pCreateInfo->pMultisampleState, 
pCreateInfo->pDynamicState);
+   emit_ms_state(device, pipeline, pCreateInfo->pMultisampleState,
+ pCreateInfo->pDynamicState);
emit_ds_state(pipeline, pCreateInfo->pDepthStencilState, pass, subpass);
emit_cb_state(pipeline, pCreateInfo->pColorBlendState,
pCreateInfo->pMultisampleState);
diff --git a/src/intel/vulkan/genX_state.c b/src/intel/vulkan/genX_state.c
index 804cfab3a56..bc6b5870d8d 100644
--- a/src/intel/vulkan/genX_state.c
+++ b/src/intel/vulkan/genX_state.c
@@ -552,12 +552,14 @@ genX(emit_ms_state)(struct anv_batch *batch,
   struct anv_sample *anv_samples,
   uint32_t num_samples,
   uint32_t log2_samples,
-  bool custom_sample_locations)
+  bool custom_sample_locations,
+  bool sample_locations_ext_enabled)
 {
emit_multisample(batch, anv_samples, num_samples, log2_samples,
 custom_sample_locations);
 #if GEN_GEN >= 8
-   emit_sample_locations(batch, anv_samples, num_samples,
- custom_sample_locations);
+   if (sample_locations_ext_enabled)
+  emit_sample_locations(batch, anv_samples, num_samples,
+custom_sample_locations);
 #endif
 }
-- 
2.20.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH v2 6/9] anv: Added support for dynamic and non-dynamic sample locations on Gen7

2019-03-12 Thread Eleni Maria Stea
Allowing setting dynamic and non-dynamic sample locations on Gen7.
---
 src/intel/vulkan/anv_genX.h| 13 ++---
 src/intel/vulkan/genX_cmd_buffer.c |  9 ++--
 src/intel/vulkan/genX_pipeline.c   | 13 +
 src/intel/vulkan/genX_state.c  | 86 +-
 4 files changed, 70 insertions(+), 51 deletions(-)

diff --git a/src/intel/vulkan/anv_genX.h b/src/intel/vulkan/anv_genX.h
index f84fe457152..e82d83465ef 100644
--- a/src/intel/vulkan/anv_genX.h
+++ b/src/intel/vulkan/anv_genX.h
@@ -89,11 +89,8 @@ void genX(cmd_buffer_mi_memset)(struct anv_cmd_buffer 
*cmd_buffer,
 void genX(blorp_exec)(struct blorp_batch *batch,
   const struct blorp_params *params);
 
-void genX(emit_multisample)(struct anv_batch *batch,
-uint32_t samples,
-uint32_t log2_samples);
-
-void genX(emit_sample_locations)(struct anv_batch *batch,
- const struct anv_sample *anv_samples,
- uint32_t num_samples,
- bool custom_locations);
+void genX(emit_ms_state)(struct anv_batch *batch,
+ struct anv_sample *anv_samples,
+ uint32_t num_samples,
+ uint32_t log2_samples,
+ bool custom_sample_locations);
diff --git a/src/intel/vulkan/genX_cmd_buffer.c 
b/src/intel/vulkan/genX_cmd_buffer.c
index 9229df84caa..4752c66f350 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -2643,8 +2643,7 @@ cmd_buffer_flush_push_constants(struct anv_cmd_buffer 
*cmd_buffer,
 static void
 cmd_buffer_emit_sample_locations(struct anv_cmd_buffer *cmd_buffer)
 {
-#if GEN_GEN >= 8
-   const struct anv_sample *anv_samples;
+   struct anv_sample *anv_samples;
uint32_t log2_samples;
uint32_t samples;
 
@@ -2654,10 +2653,8 @@ cmd_buffer_emit_sample_locations(struct anv_cmd_buffer 
*cmd_buffer)
log2_samples = __builtin_ffs(samples) - 1;
anv_samples = cmd_buffer->state.gfx.dynamic.sample_locations.anv_samples;
 
-   genX(emit_multisample)(_buffer->batch, samples, log2_samples);
-   genX(emit_sample_locations)(_buffer->batch, anv_samples, samples,
-  true);
-#endif
+   genX(emit_ms_state)(_buffer->batch, anv_samples, samples,
+   log2_samples, true);
 }
 
 void
diff --git a/src/intel/vulkan/genX_pipeline.c b/src/intel/vulkan/genX_pipeline.c
index fa42e622077..8afc08f0320 100644
--- a/src/intel/vulkan/genX_pipeline.c
+++ b/src/intel/vulkan/genX_pipeline.c
@@ -577,12 +577,9 @@ emit_ms_state(struct anv_pipeline *pipeline,
   const VkPipelineMultisampleStateCreateInfo *info,
   const VkPipelineDynamicStateCreateInfo *dinfo)
 {
-#if GEN_GEN >= 8
struct anv_sample anv_samples[MAX_SAMPLE_LOCATIONS];
VkSampleLocationsInfoEXT *sl;
bool custom_locations = false;
-#endif
-
uint32_t samples = 1;
uint32_t log2_samples = 0;
 
@@ -591,7 +588,6 @@ emit_ms_state(struct anv_pipeline *pipeline,
if (info) {
   samples = info->rasterizationSamples;
 
-#if GEN_GEN >= 8
   if (info->pNext) {
  VkPipelineSampleLocationsStateCreateInfoEXT *slinfo =
 (VkPipelineSampleLocationsStateCreateInfoEXT *)info->pNext;
@@ -616,17 +612,12 @@ emit_ms_state(struct anv_pipeline *pipeline,
 }
  }
   }
-#endif
 
   log2_samples = __builtin_ffs(samples) - 1;
}
 
-   genX(emit_multisample(>batch, samples, log2_samples));
-
-#if GEN_GEN >= 8
-   genX(emit_sample_locations)(>batch, anv_samples, samples,
-   custom_locations);
-#endif
+   genX(emit_ms_state)(>batch, anv_samples, samples, log2_samples,
+   custom_locations);
 }
 
 static const uint32_t vk_to_gen_logic_op[] = {
diff --git a/src/intel/vulkan/genX_state.c b/src/intel/vulkan/genX_state.c
index 44cfc925ed5..804cfab3a56 100644
--- a/src/intel/vulkan/genX_state.c
+++ b/src/intel/vulkan/genX_state.c
@@ -437,10 +437,12 @@ VkResult genX(CreateSampler)(
return VK_SUCCESS;
 }
 
-void
-genX(emit_multisample)(struct anv_batch *batch,
-   uint32_t samples,
-   uint32_t log2_samples)
+static void
+emit_multisample(struct anv_batch *batch,
+ const struct anv_sample *anv_samples,
+ uint32_t samples,
+ uint32_t log2_samples,
+ bool custom_locations)
 {
anv_batch_emit(batch, GENX(3DSTATE_MULTISAMPLE), ms) {
   ms.NumberofMultisamples = log2_samples;
@@ -453,34 +455,52 @@ genX(emit_multisample)(struct anv_batch *batch,
*/
   ms.PixelPositionOffsetEnable  = false;
 #else
-  switch (samples) {
-  case 1:
- GEN_SAMPLE_POS_1X(ms.Sample);
- break;
-  case 2:
- GEN_SAMPLE_POS_2X(ms.Sample);
- break;
-  case 4:
- GEN_SAMPLE_POS_4X(ms.Sample);
- break;
- 

[Mesa-dev] [PATCH v2 5/9] anv: Added support for dynamic sample locations on Gen8+

2019-03-12 Thread Eleni Maria Stea
Added support for setting the locations when the pipeline has been
created with the dynamic state bit enabled according to the Vulkan
Specification section [26.5. Custom Sample Locations] for the function:

'vkCmdSetSampleLocationsEXT'

The reason that we preferred to store the boolean valid inside the
dynamic state struct for locations instead of using a dirty bit
(ANV_CMD_DIRTY_SAMPLE_LOCATIONS for example) is that other functions
can modify the value of the dirty bits causing unexpected behavior.
---
 src/intel/vulkan/anv_cmd_buffer.c  | 19 
 src/intel/vulkan/anv_genX.h|  6 +++-
 src/intel/vulkan/anv_private.h |  6 
 src/intel/vulkan/genX_cmd_buffer.c | 27 ++
 src/intel/vulkan/genX_pipeline.c   | 46 --
 src/intel/vulkan/genX_state.c  | 41 +++---
 6 files changed, 99 insertions(+), 46 deletions(-)

diff --git a/src/intel/vulkan/anv_cmd_buffer.c 
b/src/intel/vulkan/anv_cmd_buffer.c
index 1b34644a434..101c1375430 100644
--- a/src/intel/vulkan/anv_cmd_buffer.c
+++ b/src/intel/vulkan/anv_cmd_buffer.c
@@ -28,6 +28,7 @@
 #include 
 
 #include "anv_private.h"
+#include "anv_sample_locations.h"
 
 #include "vk_format_info.h"
 #include "vk_util.h"
@@ -558,6 +559,24 @@ void anv_CmdSetStencilReference(
cmd_buffer->state.gfx.dirty |= ANV_CMD_DIRTY_DYNAMIC_STENCIL_REFERENCE;
 }
 
+void
+anv_CmdSetSampleLocationsEXT(VkCommandBuffer commandBuffer,
+ const VkSampleLocationsInfoEXT 
*pSampleLocationsInfo)
+{
+   ANV_FROM_HANDLE(anv_cmd_buffer, cmd_buffer, commandBuffer);
+   assert(pSampleLocationsInfo);
+
+   struct anv_dynamic_state *dyn_state = _buffer->state.gfx.dynamic;
+   dyn_state->sample_locations.num_samples =
+  pSampleLocationsInfo->sampleLocationsPerPixel;
+
+   anv_calc_sample_locations(dyn_state->sample_locations.anv_samples,
+ dyn_state->sample_locations.num_samples,
+ pSampleLocationsInfo);
+
+   cmd_buffer->state.gfx.dynamic.sample_locations.valid = true;
+}
+
 static void
 anv_cmd_buffer_bind_descriptor_set(struct anv_cmd_buffer *cmd_buffer,
VkPipelineBindPoint bind_point,
diff --git a/src/intel/vulkan/anv_genX.h b/src/intel/vulkan/anv_genX.h
index 52415c04a45..f84fe457152 100644
--- a/src/intel/vulkan/anv_genX.h
+++ b/src/intel/vulkan/anv_genX.h
@@ -89,7 +89,11 @@ void genX(cmd_buffer_mi_memset)(struct anv_cmd_buffer 
*cmd_buffer,
 void genX(blorp_exec)(struct blorp_batch *batch,
   const struct blorp_params *params);
 
+void genX(emit_multisample)(struct anv_batch *batch,
+uint32_t samples,
+uint32_t log2_samples);
+
 void genX(emit_sample_locations)(struct anv_batch *batch,
+ const struct anv_sample *anv_samples,
  uint32_t num_samples,
- const VkSampleLocationsInfoEXT *sl,
  bool custom_locations);
diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index 981956e5706..a2e1756cd99 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -2135,6 +2135,12 @@ struct anv_dynamic_state {
   uint32_t  front;
   uint32_t  back;
} stencil_reference;
+
+   struct {
+  struct anv_sample 
anv_samples[MAX_SAMPLE_LOCATIONS];
+  uint32_t  num_samples;
+  bool  valid;
+   } sample_locations;
 };
 
 extern const struct anv_dynamic_state default_dynamic_state;
diff --git a/src/intel/vulkan/genX_cmd_buffer.c 
b/src/intel/vulkan/genX_cmd_buffer.c
index 7687507e6b7..9229df84caa 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -25,11 +25,13 @@
 #include 
 
 #include "anv_private.h"
+#include "anv_sample_locations.h"
 #include "vk_format_info.h"
 #include "vk_util.h"
 #include "util/fast_idiv_by_const.h"
 
 #include "common/gen_l3_config.h"
+#include "common/gen_sample_positions.h"
 #include "genxml/gen_macros.h"
 #include "genxml/genX_pack.h"
 
@@ -2638,6 +2640,26 @@ cmd_buffer_flush_push_constants(struct anv_cmd_buffer 
*cmd_buffer,
cmd_buffer->state.push_constants_dirty &= ~flushed;
 }
 
+static void
+cmd_buffer_emit_sample_locations(struct anv_cmd_buffer *cmd_buffer)
+{
+#if GEN_GEN >= 8
+   const struct anv_sample *anv_samples;
+   uint32_t log2_samples;
+   uint32_t samples;
+
+   samples = cmd_buffer->state.gfx.dynamic.sample_locations.num_samples;
+   assert(samples > 0);
+
+   log2_samples = __builtin_ffs(samples) - 1;
+   anv_samples = cmd_buffer->state.gfx.dynamic.sample_locations.anv_samples;
+
+   genX(emit_multisample)(_buffer->batch, samples, log2_samples);
+   genX(emit_sample_locations)(_buffer->batch, 

[Mesa-dev] [PATCH v2 2/9] anv: Set the values for the VkPhysicalDeviceSampleLocationsPropertiesEXT

2019-03-12 Thread Eleni Maria Stea
The VkPhysicalDeviceSampleLocationPropertiesEXT struct is filled with
implementation dependent values and according to the table from the
Vulkan Specification section [36.1. Limit Requirements]:

pname | max | min
pname:sampleLocationSampleCounts   |-|ename:VK_SAMPLE_COUNT_4_BIT
pname:maxSampleLocationGridSize|-|(1, 1)
pname:sampleLocationCoordinateRange|(0.0, 0.9375)|(0.0, 0.9375)
pname:sampleLocationSubPixelBits   |-|4
pname:variableSampleLocations  | false   |implementation dependent

The hardware only supports setting the same sample location for all the
pixels, so we only support 1x1 grids.

Also, variableSampleLocations is set to false because we don't support the
feature.

v2: 1- Replaced false with VK_FALSE for consistency. (Sagar Ghuge)
2- Used the isl_device_sample_count to take the number of samples
per platform to avoid extra checks. (Sagar Ghuge)

Reviewed-by: Sagar Ghuge 
---
 src/intel/vulkan/anv_device.c  | 19 +++
 src/intel/vulkan/anv_private.h |  3 +++
 2 files changed, 22 insertions(+)

diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index 729cceb3e32..bf6f03ebb1a 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -1401,6 +1401,25 @@ void anv_GetPhysicalDeviceProperties2(
  break;
   }
 
+  case VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SAMPLE_LOCATIONS_PROPERTIES_EXT: {
+ VkPhysicalDeviceSampleLocationsPropertiesEXT *props =
+(VkPhysicalDeviceSampleLocationsPropertiesEXT *)ext;
+
+ props->sampleLocationSampleCounts =
+isl_device_get_sample_counts(>isl_dev);
+
+ props->maxSampleLocationGridSize.width = SAMPLE_LOC_GRID_W;
+ props->maxSampleLocationGridSize.height = SAMPLE_LOC_GRID_H;
+
+ props->sampleLocationCoordinateRange[0] = 0;
+ props->sampleLocationCoordinateRange[1] = 0.9375;
+ props->sampleLocationSubPixelBits = 4;
+
+ props->variableSampleLocations = VK_FALSE;
+
+ break;
+  }
+
   default:
  anv_debug_ignored_stype(ext->sType);
  break;
diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index eed282ff985..5905299e59d 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -195,6 +195,9 @@ struct gen_l3_config;
 
 #define anv_printflike(a, b) __attribute__((__format__(__printf__, a, b)))
 
+#define SAMPLE_LOC_GRID_W 1
+#define SAMPLE_LOC_GRID_H 1
+
 static inline uint32_t
 align_down_npot_u32(uint32_t v, uint32_t a)
 {
-- 
2.20.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH v2 3/9] anv: Implemented the vkGetPhysicalDeviceMultisamplePropertiesEXT

2019-03-12 Thread Eleni Maria Stea
Implemented the vkGetPhysicalDeviceMultisamplePropertiesEXT according to
the Vulkan Specification section [36.2. Additional Multisampling
Capabilities].
---
 src/intel/Makefile.sources  |  1 +
 src/intel/vulkan/anv_sample_locations.c | 60 +
 src/intel/vulkan/meson.build|  1 +
 3 files changed, 62 insertions(+)
 create mode 100644 src/intel/vulkan/anv_sample_locations.c

diff --git a/src/intel/Makefile.sources b/src/intel/Makefile.sources
index a5c8828a6b6..a0873c7ccc2 100644
--- a/src/intel/Makefile.sources
+++ b/src/intel/Makefile.sources
@@ -251,6 +251,7 @@ VULKAN_FILES := \
vulkan/anv_pipeline_cache.c \
vulkan/anv_private.h \
vulkan/anv_queue.c \
+   vulkan/anv_sample_locations.c \
vulkan/anv_util.c \
vulkan/anv_wsi.c \
vulkan/vk_format_info.h
diff --git a/src/intel/vulkan/anv_sample_locations.c 
b/src/intel/vulkan/anv_sample_locations.c
new file mode 100644
index 000..1ebf280e05b
--- /dev/null
+++ b/src/intel/vulkan/anv_sample_locations.c
@@ -0,0 +1,60 @@
+/*
+ * Copyright © 2019 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#include "anv_private.h"
+
+void
+anv_GetPhysicalDeviceMultisamplePropertiesEXT(VkPhysicalDevice physicalDevice,
+  VkSampleCountFlagBits samples,
+  VkMultisamplePropertiesEXT
+  *pMultisampleProperties)
+{
+   ANV_FROM_HANDLE(anv_physical_device, physical_device, physicalDevice);
+   const struct gen_device_info *devinfo = _device->info;
+
+   VkExtent2D grid_size;
+   switch (samples) {
+   case VK_SAMPLE_COUNT_2_BIT:
+   case VK_SAMPLE_COUNT_4_BIT:
+   case VK_SAMPLE_COUNT_8_BIT:
+  grid_size.width = SAMPLE_LOC_GRID_W;
+  grid_size.height = SAMPLE_LOC_GRID_H;
+  break;
+
+   case VK_SAMPLE_COUNT_16_BIT:
+  if (devinfo->gen >= 9) {
+ grid_size.width = SAMPLE_LOC_GRID_W;
+ grid_size.height = SAMPLE_LOC_GRID_H;
+ break;
+  }
+   default:
+  grid_size.width = grid_size.height = 0;
+  break;
+   };
+
+   *pMultisampleProperties = (VkMultisamplePropertiesEXT) {
+  .sType = VK_STRUCTURE_TYPE_MULTISAMPLE_PROPERTIES_EXT,
+  .pNext = NULL,
+  .maxSampleLocationGridSize = grid_size
+   };
+}
diff --git a/src/intel/vulkan/meson.build b/src/intel/vulkan/meson.build
index 7fa43a6ad79..3f78757c774 100644
--- a/src/intel/vulkan/meson.build
+++ b/src/intel/vulkan/meson.build
@@ -135,6 +135,7 @@ libanv_files = files(
   'anv_pipeline_cache.c',
   'anv_private.h',
   'anv_queue.c',
+  'anv_sample_locations.c',
   'anv_util.c',
   'anv_wsi.c',
   'vk_format_info.h',
-- 
2.20.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH v2 9/9] anv: Enabled the VK_EXT_sample_locations extension

2019-03-12 Thread Eleni Maria Stea
Enabled the VK_EXT_sample_locations for Intel Gen >= 7.

v2: Replaced device.info->gen >= 7 with True, as Anv doesn't support
anything below Gen7. (Lionel Landwerlin)
---
 src/intel/vulkan/anv_extensions.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/intel/vulkan/anv_extensions.py 
b/src/intel/vulkan/anv_extensions.py
index 9e4e03e46df..5a30c733c5c 100644
--- a/src/intel/vulkan/anv_extensions.py
+++ b/src/intel/vulkan/anv_extensions.py
@@ -129,7 +129,7 @@ EXTENSIONS = [
 Extension('VK_EXT_inline_uniform_block',  1, True),
 Extension('VK_EXT_pci_bus_info',  2, True),
 Extension('VK_EXT_post_depth_coverage',   1, 'device->info.gen 
>= 9'),
-Extension('VK_EXT_sample_locations',  1, False),
+Extension('VK_EXT_sample_locations',  1, True),
 Extension('VK_EXT_sampler_filter_minmax', 1, 'device->info.gen 
>= 9'),
 Extension('VK_EXT_scalar_block_layout',   1, True),
 Extension('VK_EXT_shader_stencil_export', 1, 'device->info.gen 
>= 9'),
-- 
2.20.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH v2 0/9] Implementation of the VK_EXT_sample_locations

2019-03-12 Thread Eleni Maria Stea
Implemented the requirements from the VK_EXT_sample_locations extension
specification to allow setting custom sample locations on Intel Gen >= 7.

Some decisions explained:

The grid size was set to 1x1 because the hardware only supports a single
set of sample locations for the whole framebuffer.

The user can only set custom sample locations per pipeline by filling
the extension provided structs or dynamically the way it is described
in the sections 26.5, 36.1, 36.2 of the Vulkan specification.

Sections 6.7.3 and 7.4 describe how to use sample locations with images
when a layout transition is about to take place. These sections were
ignored as currently we aren't using sample locations with images in the
driver.

Variable sample locations aren't required and have not been implemented.

We have 754 vk-gl-cts tests for this extension:
The 690 pass on Gen >= 9 (where we can support 16 samples).
The remaining 64 tests aren't supported because they test the variable
sample locations.

Eleni Maria Stea (9):
  anv: Added the VK_EXT_sample_locations extension to the anv_extensions
list
  anv: Set the values for the
VkPhysicalDeviceSampleLocationsPropertiesEXT
  anv: Implemented the vkGetPhysicalDeviceMultisamplePropertiesEXT
  anv: Added support for non-dynamic sample locations on Gen8+
  anv: Added support for dynamic sample locations on Gen8+
  anv: Added support for dynamic and non-dynamic sample locations on
Gen7
  anv: Optimized the emission of the default locations on Gen8+
  anv: Removed unused header file
  anv: Enabled the VK_EXT_sample_locations extension

 src/intel/Makefile.sources  |   1 +
 src/intel/common/gen_sample_positions.h |  53 ++
 src/intel/vulkan/anv_cmd_buffer.c   |  19 
 src/intel/vulkan/anv_device.c   |  21 
 src/intel/vulkan/anv_extensions.py  |   1 +
 src/intel/vulkan/anv_genX.h |   7 ++
 src/intel/vulkan/anv_private.h  |  18 
 src/intel/vulkan/anv_sample_locations.c |  96 ++
 src/intel/vulkan/anv_sample_locations.h |  29 ++
 src/intel/vulkan/genX_blorp_exec.c  |   1 -
 src/intel/vulkan/genX_cmd_buffer.c  |  24 +
 src/intel/vulkan/genX_pipeline.c|  92 +
 src/intel/vulkan/genX_state.c   | 128 
 src/intel/vulkan/meson.build|   1 +
 14 files changed, 450 insertions(+), 41 deletions(-)
 create mode 100644 src/intel/vulkan/anv_sample_locations.c
 create mode 100644 src/intel/vulkan/anv_sample_locations.h

-- 
2.20.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH v2 4/9] anv: Added support for non-dynamic sample locations on Gen8+

2019-03-12 Thread Eleni Maria Stea
Allowing the user to set custom sample locations non-dynamically, by
filling the extension structs and chaining them to the pipeline structs
according to the Vulkan specification section [26.5. Custom Sample Locations]
for the following structures:

'VkPipelineSampleLocationsStateCreateInfoEXT'
'VkSampleLocationsInfoEXT'
'VkSampleLocationEXT'

Once custom locations are used, the default locations are lost and need to be
re-emitted again in the next pipeline creation. For that, we emit the
3DSTATE_SAMPLE_PATTERN at every pipeline creation.
---
 src/intel/common/gen_sample_positions.h | 53 
 src/intel/vulkan/anv_genX.h |  5 ++
 src/intel/vulkan/anv_private.h  |  9 +++
 src/intel/vulkan/anv_sample_locations.c | 38 +++-
 src/intel/vulkan/anv_sample_locations.h | 29 +
 src/intel/vulkan/genX_pipeline.c| 80 +
 src/intel/vulkan/genX_state.c   | 59 ++
 7 files changed, 259 insertions(+), 14 deletions(-)
 create mode 100644 src/intel/vulkan/anv_sample_locations.h

diff --git a/src/intel/common/gen_sample_positions.h 
b/src/intel/common/gen_sample_positions.h
index da48dcb5ed0..e8af2a552dc 100644
--- a/src/intel/common/gen_sample_positions.h
+++ b/src/intel/common/gen_sample_positions.h
@@ -160,4 +160,57 @@ prefix##14YOffset  = 0.9375; \
 prefix##15XOffset  = 0.0625; \
 prefix##15YOffset  = 0.;
 
+/* Examples:
+ * in case of GEN_GEN < 8:
+ * SET_SAMPLE_POS(ms.Sample, 0); expands to:
+ *ms.Sample0XOffset = anv_samples[0].offs_x;
+ *ms.Sample0YOffset = anv_samples[0].offs_y;
+ *
+ * in case of GEN_GEN >= 8:
+ * SET_SAMPLE_POS(sp._16xSample, 0); expands to:
+ *sp._16xSample0XOffset = anv_samples[0].offs_x;
+ *sp._16xSample0YOffset = anv_samples[0].offs_y;
+ */
+#define SET_SAMPLE_POS(prefix, sample_idx) \
+prefix##sample_idx##XOffset = anv_samples[sample_idx].offs_x; \
+prefix##sample_idx##YOffset = anv_samples[sample_idx].offs_y;
+
+#define SET_SAMPLE_POS_2X(prefix) \
+SET_SAMPLE_POS(prefix, 0); \
+SET_SAMPLE_POS(prefix, 1);
+
+#define SET_SAMPLE_POS_4X(prefix) \
+SET_SAMPLE_POS(prefix, 0); \
+SET_SAMPLE_POS(prefix, 1); \
+SET_SAMPLE_POS(prefix, 2); \
+SET_SAMPLE_POS(prefix, 3);
+
+#define SET_SAMPLE_POS_8X(prefix) \
+SET_SAMPLE_POS(prefix, 0); \
+SET_SAMPLE_POS(prefix, 1); \
+SET_SAMPLE_POS(prefix, 2); \
+SET_SAMPLE_POS(prefix, 3); \
+SET_SAMPLE_POS(prefix, 4); \
+SET_SAMPLE_POS(prefix, 5); \
+SET_SAMPLE_POS(prefix, 6); \
+SET_SAMPLE_POS(prefix, 7);
+
+#define SET_SAMPLE_POS_16X(prefix) \
+SET_SAMPLE_POS(prefix, 0); \
+SET_SAMPLE_POS(prefix, 1); \
+SET_SAMPLE_POS(prefix, 2); \
+SET_SAMPLE_POS(prefix, 3); \
+SET_SAMPLE_POS(prefix, 4); \
+SET_SAMPLE_POS(prefix, 5); \
+SET_SAMPLE_POS(prefix, 6); \
+SET_SAMPLE_POS(prefix, 7); \
+SET_SAMPLE_POS(prefix, 8); \
+SET_SAMPLE_POS(prefix, 9); \
+SET_SAMPLE_POS(prefix, 10); \
+SET_SAMPLE_POS(prefix, 11); \
+SET_SAMPLE_POS(prefix, 12); \
+SET_SAMPLE_POS(prefix, 13); \
+SET_SAMPLE_POS(prefix, 14); \
+SET_SAMPLE_POS(prefix, 15);
+
 #endif /* GEN_SAMPLE_POSITIONS_H */
diff --git a/src/intel/vulkan/anv_genX.h b/src/intel/vulkan/anv_genX.h
index 8fd32cabf1e..52415c04a45 100644
--- a/src/intel/vulkan/anv_genX.h
+++ b/src/intel/vulkan/anv_genX.h
@@ -88,3 +88,8 @@ void genX(cmd_buffer_mi_memset)(struct anv_cmd_buffer 
*cmd_buffer,
 
 void genX(blorp_exec)(struct blorp_batch *batch,
   const struct blorp_params *params);
+
+void genX(emit_sample_locations)(struct anv_batch *batch,
+ uint32_t num_samples,
+ const VkSampleLocationsInfoEXT *sl,
+ bool custom_locations);
diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index 5905299e59d..981956e5706 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -71,6 +71,7 @@ struct anv_buffer;
 struct anv_buffer_view;
 struct anv_image_view;
 struct anv_instance;
+struct anv_sample;
 
 struct gen_l3_config;
 
@@ -165,6 +166,7 @@ struct gen_l3_config;
 #define MAX_PUSH_DESCRIPTORS 32 /* Minimum requirement */
 #define MAX_INLINE_UNIFORM_BLOCK_SIZE 4096
 #define MAX_INLINE_UNIFORM_BLOCK_DESCRIPTORS 32
+#define MAX_SAMPLE_LOCATIONS 16
 
 /* The kernel relocation API has a limitation of a 32-bit delta value
  * applied to the address before it is written which, in spite of it being
@@ -2086,6 +2088,13 @@ struct anv_push_constants {
struct brw_image_param images[MAX_GEN8_IMAGES];
 };
 
+struct
+anv_sample {
+   float offs_x;
+   float offs_y;
+   float radius;
+};
+
 struct anv_dynamic_state {
struct {
   uint32_t  count;
diff --git a/src/intel/vulkan/anv_sample_locations.c 
b/src/intel/vulkan/anv_sample_locations.c
index 1ebf280e05b..c660cb5ae84 100644
--- a/src/intel/vulkan/anv_sample_locations.c
+++ b/src/intel/vulkan/anv_sample_locations.c
@@ -21,7 +21,7 @@
  * IN THE SOFTWARE.
  */
 
-#include "anv_private.h"

[Mesa-dev] [PATCH v2 1/9] anv: Added the VK_EXT_sample_locations extension to the anv_extensions list

2019-03-12 Thread Eleni Maria Stea
Added the VK_EXT_sample_locations to the anv_extensions.py list to
generate the related entrypoints.
---
 src/intel/vulkan/anv_extensions.py | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/intel/vulkan/anv_extensions.py 
b/src/intel/vulkan/anv_extensions.py
index 6fff293dee4..9e4e03e46df 100644
--- a/src/intel/vulkan/anv_extensions.py
+++ b/src/intel/vulkan/anv_extensions.py
@@ -129,6 +129,7 @@ EXTENSIONS = [
 Extension('VK_EXT_inline_uniform_block',  1, True),
 Extension('VK_EXT_pci_bus_info',  2, True),
 Extension('VK_EXT_post_depth_coverage',   1, 'device->info.gen 
>= 9'),
+Extension('VK_EXT_sample_locations',  1, False),
 Extension('VK_EXT_sampler_filter_minmax', 1, 'device->info.gen 
>= 9'),
 Extension('VK_EXT_scalar_block_layout',   1, True),
 Extension('VK_EXT_shader_stencil_export', 1, 'device->info.gen 
>= 9'),
-- 
2.20.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] RFC: Workaround for pthread_setaffinity_np() seccomp filtering

2019-03-12 Thread Marc-André Lureau
Hi

On Fri, Mar 1, 2019 at 12:13 PM Mathias Fröhlich
 wrote:
>
> On Friday, 1 March 2019 12:15:08 CET Eero Tamminen wrote:
> > Hi,
> >
> > On 1.3.2019 11.12, Michel Dänzer wrote:
> > > On 2019-02-28 8:41 p.m., Marek Olšák wrote:
> > >>> On Thu, Feb 28, 2019 at 1:37 PM Eero Tamminen 
> > >>> 
> >  Why distro versions of Qemu filter sched_setaffinity() syscall?
> > >>>
> > >>> (https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1815889)
> > >>>
> > >>> Daniel Berrange (berrange) wrote on 2019-02-27: #19
> > >>>
> > >>> "IMHO that mesa change is not valid. It is settings its affinity to
> > >>> run on all threads which is definitely *NOT* something we want to be
> > >>> allowed. Management applications want to control which CPUs QEMU runs
> > >>> on, and as such Mesa should honour the CPU placement that the QEMU
> > >>> process has.
> > >>>
> > >>> This is a great example of why QEMU wants to use seccomp to block
> > >>> affinity changes to prevent something silently trying to use more CPUs
> > >>> than are assigned to this QEMU."
> > >>>
> > >>
> > >> Mesa uses thread affinity to optimize memory access performance on some
> > >> CPUs (see util_pin_thread_to_L3). Other places in Mesa need to restore 
> > >> the
> > >> original thread affinity for some child threads. Additionally, if games
> > >> limit the thread affinity, Mesa needs to restore the full thread affinity
> > >> for some of its child threads.
> > >
> > > The last part sounds like Mesa clearly overstepping its authority.
> > >
> > >
> > >> In essence, the thread affinity should only be considered a hint for the
> > >> kernel for optimal performance. There is no reason to kill the process if
> > >> it's disallowed. Just ignore the call or modify the thread mask to make 
> > >> it
> > >> legal.
> > >
> > > The fundamental issue here is that Mesa is using the thread affinity API
> > > for something else than it's intended for. If there was an API for what
> > > Mesa wants (encouraging certain sets of threads to run on topologically
> > > close cores), there should be no need to block that.
> >
> > Why such process needs to be killed instead the request being masked
> > suitably, is there some program that breaks subtly if affinity request
> > is masked (and that being worse than the program being killed)?
>
> But that is still a situation that could be nicely handled with a
> EPERM error return. Way better than just kill a process.
> That 'badly affected' program still can call abort then.
> But nicely working programs don't get just killed then!!


Returning an error seems less secure that prohibiting it completely.
And it may lead to subtle bugs in rarely tested code paths.

It's legitimate that QEMU and management layers want to prevent
arbitrary code from changing resource allocation etc.

There are no easy way I can think of for mesa (and other libraries) to
probe the seccomp filters and associated action.

So we need a way to tell mesa not to call setaffinity() (and other
syscalls). MESA_NO_THREAD_AFFINITY or MESA_NO_SYSCALLS=setaffinity,...
seem like a relatively easy way to go.

thanks


-- 
Marc-André Lureau
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev