Re: [Mesa-dev] [PATCH 1/5] nir/cf: Remove phi sources if needed in nir_handle_add_jump

2018-10-02 Thread Iago Toral
On Tue, 2018-10-02 at 07:50 -0500, Jason Ekstrand wrote:
> On Tue, Oct 2, 2018 at 7:30 AM Jason Ekstrand 
> wrote:
> > On Tue, Oct 2, 2018 at 5:53 AM Iago Toral 
> > wrote:
> > > On Sat, 2018-09-22 at 16:39 -0500, Jason Ekstrand wrote:
> > > 
> > > > If the block in which the jump is inserted is the predecessor of a
> > > > phi then we need to remove phi sources otherwise the phi may end up
> > > > with things improperly connected.  Found by running the Vulkan CTS
> > > > with SPIR-V optimizations enabled.
> > > > 
> > > > Cc: mesa-sta...@lists.freedesktop.org
> > > > ---
> > > >  src/compiler/nir/nir_control_flow.c | 36 +++--
> > > >  1 file changed, 19 insertions(+), 17 deletions(-)
> > > > 
> > > > diff --git a/src/compiler/nir/nir_control_flow.c b/src/compiler/nir/nir_control_flow.c
> > > > index 3b0a0f1a5b0..a82f35550b8 100644
> > > > --- a/src/compiler/nir/nir_control_flow.c
> > > > +++ b/src/compiler/nir/nir_control_flow.c
> > > > @@ -437,6 +437,23 @@ nearest_loop(nir_cf_node *node)
> > > >     return nir_cf_node_as_loop(node);
> > > >  }
> > > >  
> > > > +static void
> > > > +remove_phi_src(nir_block *block, nir_block *pred)
> > > > +{
> > > > +   nir_foreach_instr(instr, block) {
> > > > +      if (instr->type != nir_instr_type_phi)
> > > > +         break;
> > > > +
> > > > +      nir_phi_instr *phi = nir_instr_as_phi(instr);
> > > > +      nir_foreach_phi_src_safe(src, phi) {
> > > > +         if (src->pred == pred) {
> > > > +            list_del(&src->src.use_link);
> > > > +            exec_node_remove(&src->node);
> > > > +         }
> > > > +      }
> > > > +   }
> > > > +}
> > > > +
> > > >  /*
> > > >   * update the CFG after a jump instruction has been added to the end of a block
> > > >   */
> > > > @@ -447,6 +464,8 @@ nir_handle_add_jump(nir_block *block)
> > > >     nir_instr *instr = nir_block_last_instr(block);
> > > >     nir_jump_instr *jump_instr = nir_instr_as_jump(instr);
> > > >  
> > > > +   if (block->successors[0])
> > > > +      remove_phi_src(block->successors[0], block);
> > > 
> > > Don't we need to do the same for block->successors[1]?
> > 
> > I was going to say no because this function handles *adding* a phi
> > and so the block should already only have one successor.  However,
> > I suppose you could add a phi right before an if.  I'll add the one
> > for block->successors[1] just to be safe.
> 
> On further thought, I don't think it's possible to end up with phi
> sources at block->successors[1].  The only type of block that can
> have multiple successors is one right before an if and both sides of
> the if have only one predecessor so they can't have phis.  Unless, of
> course, we add a bunch of no-op phis for some reason.  Eh, removing
> phis on block->successors[1] is harmless and probably more correct. 
> Still, it's a very weird case...

Yeah, that makes sense.
Iago
> --Jason
> ___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
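
For readers following the discussion above, here is a minimal, self-contained sketch of the idea behind remove_phi_src(): when a block stops being a predecessor, every phi source coming from that block is unlinked. The structs and function names below are hypothetical stand-ins, not NIR's real types, and the list handling is deliberately simpler than Mesa's exec_list.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for phi sources; not NIR's structs. */
struct phi_src {
    int pred_block;          /* id of the predecessor block for this source */
    int value;               /* id of the SSA value supplied on that edge */
    struct phi_src *next;
};

struct phi {
    struct phi_src *srcs;    /* singly linked list of sources */
};

/* Unlink every source of `phi` that comes from `pred_block`.
 * Returns the number of sources removed. Nodes are not freed here;
 * the caller owns their storage. */
static int
phi_remove_srcs_from_pred(struct phi *phi, int pred_block)
{
    assert(phi != NULL);

    int removed = 0;
    struct phi_src **link = &phi->srcs;

    while (*link) {
        if ((*link)->pred_block == pred_block) {
            *link = (*link)->next;   /* skip over the matching node */
            removed++;
        } else {
            link = &(*link)->next;
        }
    }
    return removed;
}

/* Count the sources that remain attached to `phi`. */
static int
phi_count_srcs(const struct phi *phi)
{
    int n = 0;
    for (const struct phi_src *s = phi->srcs; s; s = s->next)
        n++;
    return n;
}
```

Calling the removal for a block that is not a predecessor removes nothing, which is in line with the remark above that handling block->successors[1] as well is harmless.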


Re: [Mesa-dev] [PATCH] glsl/linker: Check the subroutine associated functions names

2018-10-02 Thread Tapani Pälli



On 10/2/18 7:38 PM, Vadym Shovkoplias wrote:

Hi Tapani,

Thanks for the review!

Completely agree with the first comment, I'll change that and resend the 
patch.
Regarding second comment. I'm not sure if it is possible to do this 
check after the optimization loop. From my observations compiler inlines 
everything
and only after that it removes dead functions (actually all funcs except 
"main"). After the optimization I don't see any possible way how to
implement this subroutine functions check because all functions and 
functions signatures are removed at that point.


Yeah I was considering it could be done by storing some data but it 
seems this is probably the most straightforward version.
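
As an illustration of the check discussed in this thread (fail as soon as a subroutine-associated name has a second definition, without counting further), here is a small standalone sketch. It is written in C over a hypothetical flattened list of signatures; the struct and function names are assumptions and do not correspond to the linker's real ir_function / glsl_symbol_table API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flattened view of one shader stage's function signatures. */
struct func_sig {
    const char *name;
    bool is_defined;
};

/* Returns the first subroutine-associated name that has more than one
 * definition, or NULL if there is none. Stops at the second definition. */
static const char *
find_overloaded_subroutine_func(const struct func_sig *sigs, size_t num_sigs,
                                const char *const *subroutine_names,
                                size_t num_names)
{
    assert(sigs != NULL && subroutine_names != NULL);

    for (size_t j = 0; j < num_names; j++) {
        unsigned definitions = 0;

        for (size_t k = 0; k < num_sigs; k++) {
            if (!sigs[k].is_defined ||
                strcmp(sigs[k].name, subroutine_names[j]) != 0)
                continue;

            if (++definitions > 1)
                return subroutine_names[j];   /* error out early */
        }
    }
    return NULL;
}
```

A caller would report a link error for the returned name and stop. Declarations without a body (is_defined == false) do not count toward the limit.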


On Tue, Oct 2, 2018 at 10:02 AM Tapani Pälli wrote:



On 10/1/18 5:03 PM, Vadym Shovkoplias wrote:
 >  From Section 6.1.2 (Subroutines) of the GLSL 4.00 specification
 >
 >      "A program will fail to compile or link if any shader
 >       or stage contains two or more functions with the same
 >       name if the name is associated with a subroutine type."
 >
 > Fixes:
 >      * no-overloads.vert
 >
 > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108109
 > Signed-off-by: Vadym Shovkoplias <vadym.shovkopl...@globallogic.com>
 > ---
 >   src/compiler/glsl/linker.cpp | 40

 >   1 file changed, 40 insertions(+)
 >
 > diff --git a/src/compiler/glsl/linker.cpp
b/src/compiler/glsl/linker.cpp
 > index 3fde7e78d3..d0d017c7ff 100644
 > --- a/src/compiler/glsl/linker.cpp
 > +++ b/src/compiler/glsl/linker.cpp
 > @@ -4639,6 +4639,45 @@ link_assign_subroutine_types(struct
gl_shader_program *prog)
 >      }
 >   }
 >
 > +static void
 > +verify_subroutine_associated_funcs(struct gl_shader_program *prog)
 > +{
 > +   unsigned mask = prog->data->linked_stages;
 > +   while (mask) {
 > +      const int i = u_bit_scan(&mask);
 > +      gl_program *p = prog->_LinkedShaders[i]->Program;
 > +      glsl_symbol_table *symbols = prog->_LinkedShaders[i]->symbols;
 > +
 > +      /*
 > +       * From OpenGL ES Shading Language 4.00 specification
 > +       * (6.1.2 Subroutines):
 > +       *     "A program will fail to compile or link if any shader
 > +       *     or stage contains two or more functions with the same
 > +       *     name if the name is associated with a subroutine type."
 > +       */
 > +      for (unsigned j = 0; j < p->sh.NumSubroutineFunctions; j++) {
 > +         unsigned definitions = 0;
 > +         char *name = p->sh.SubroutineFunctions[j].name;
 > +         ir_function *fn = symbols->get_function(name);
 > +
 > +         /* Calculate number of function definitions with the same name */
 > +         foreach_in_list(ir_function_signature, sig, &fn->signatures) {
 > +            if (sig->is_defined)
 > +               definitions++;

You can just error out here, no need to calculate further.

I'm wondering a bit though should we fail here even if that function was
not used at all (optimized out)? I can see that the Piglit test does not
have a call to the function defined.


 > +         }
 > +
 > +         if (definitions > 1) {
 > +            linker_error(prog, "%s shader contains %u function "
 > +                  "definitions with name `%s', which is associated with"
 > +                  " a subroutine type.\n",
 > +                  _mesa_shader_stage_to_string(i), definitions, fn->name);
 > +            return;
 > +         }
 > +      }
 > +   }
 > +}
 > +
 > +
 >   static void
 >   set_always_active_io(exec_list *ir, ir_variable_mode io_mode)
 >   {
 > @@ -5024,6 +5063,7 @@ link_shaders(struct gl_context *ctx, struct
gl_shader_program *prog)
 >
 >      check_explicit_uniform_locations(ctx, prog);
 >      link_assign_subroutine_types(prog);
 > +   verify_subroutine_associated_funcs(prog);
 >
 >      if (!prog->data->LinkStatus)
 >         goto done;
 >



--

Vadym Shovkoplias | Senior Software Engineer
GlobalLogic
P +380.57.766.7667  M +3.8050.931.7304  S vadym.shovkoplias
www.globallogic.com 

http://www.globallogic.com/email_disclaimer.txt



[Mesa-dev] [Bug 108135] AVX instructions leak outside of CPU feature check and cause SIGILL

2018-10-02 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108135

--- Comment #3 from Thiago Macieira  ---
(In reply to Thiago Macieira from comment #2)
> The patch does solve the problem for the particular file, but there are more
> AVX uses in swrast_dri.so. The next issue is the initialisation of the
> builtin types in src/compiler/glsl_types.cpp, caused by:

I believe that's a red herring. It's caused by our use of -flto in the build,
meaning the use of AVX in one place "spills" to surrounding code that is
otherwise innocent.

Without -flto, the next instruction to crash happens inside
GlobalKnobs::GlobalKnobs(). I see this function (in fact, the entire
gen_knobs.cpp file) present in at least two libraries: libmesaswr and
libswrAVX. That is,
both ./src/gallium/drivers/swr/rasterizer/codegen/.libs/libmesaswr_la-gen_knobs.o
and ./src/gallium/drivers/swr/rasterizer/codegen/.libs/libswrAVX_la-gen_knobs.o
exist and contain this function. And BOTH files have AVX instructions. I can
see the -mavx flag in the build.

Was this intended?

-- 
You are receiving this mail because:
You are the QA Contact for the bug.


[Mesa-dev] [Bug 108135] AVX instructions leak outside of CPU feature check and cause SIGILL

2018-10-02 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108135

--- Comment #2 from Thiago Macieira  ---
The patch does solve the problem for the particular file, but there are more
AVX uses in swrast_dri.so. The next issue is the initialisation of the builtin
types in src/compiler/glsl_types.cpp, caused by:

#define DECL_TYPE(NAME, ...)\
   const glsl_type glsl_type::_##NAME##_type = glsl_type(__VA_ARGS__, #NAME); \
   const glsl_type *const glsl_type::NAME##_type = &glsl_type::_##NAME##_type;

#define STRUCT_TYPE(NAME)

#include "compiler/builtin_type_macros.h"



Re: [Mesa-dev] shader cache backward compatibility

2018-10-02 Thread Timothy Arceri

On 3/9/18 6:53 pm, Alexander Larsson wrote:

On Mon, Sep 3, 2018 at 10:41 AM Alexander Larsson  wrote:


On Fri, Aug 31, 2018 at 4:05 PM Emil Velikov  wrote:


Valid point - I forgot about that.

A couple of ideas come to mind:
  - static link LLVM (Flatpak already does it)
No LLVM changes needed.

  - shared link LLVM
LLVM add -Wl,--build-id=sha1


As a very very simple workaround, can you add the file sizes (as well
as the mtimes) to the staleness check? I mean, it's possible that a
rebuild generates the exact same size, but at least it's better than
always being wrong.


Also, Valentin David (of the freedesktop sdk project) started working
on a solution based on build-ids:
  https://gitlab.com/freedesktop-sdk/freedesktop-sdk/merge_requests/487/diffs

This currently relies on the freedesktop sdk having build-ids, but it
would be easy for it to fall back on the mtime if that was not found.
Also, this code is not really tested yet, but you still get the idea
from looking at it, and it should work.



I've pushed some changes. All drivers should now use build-ids when 
available; if not, they will fall back to mtime. Also, when falling back 
to mtime, if mtime is 0 we disable the cache.


Hopefully these changes resolve the problems you were having.
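
The fallback order described above (build-id first, then mtime, with the cache disabled when mtime is 0) could be sketched roughly as follows. This is only an illustration: struct binary_info, cache_identifier() and the "build-id:" / "mtime:" prefixes are made-up names for this example, not the code that was pushed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Hypothetical description of the driver binary; fields are illustrative. */
struct binary_info {
    const char *build_id;   /* hex build-id string, or NULL if none found */
    time_t mtime;           /* modification time; 0 means "unknown" */
};

/* Writes an identifier for the cache key into `out`.
 * Returns false when no trustworthy identifier exists, in which case the
 * caller should disable the cache. */
static bool
cache_identifier(const struct binary_info *bin, char *out, size_t out_size)
{
    assert(bin != NULL && out != NULL && out_size > 0);

    /* Preferred: the build-id changes whenever the binary is rebuilt. */
    if (bin->build_id != NULL && strlen(bin->build_id) > 0) {
        snprintf(out, out_size, "build-id:%s", bin->build_id);
        return true;
    }

    /* Fallback: mtime, unless it is 0 and therefore carries no information. */
    if (bin->mtime == 0)
        return false;

    snprintf(out, out_size, "mtime:%lld", (long long)bin->mtime);
    return true;
}
```

Treating an mtime of 0 as "no information" avoids two different builds that both report 0 sharing one cache.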

Tim


Re: [Mesa-dev] [PATCH 2/2] nir: Expose nir_remove_unused_io_vars().

2018-10-02 Thread Timothy Arceri

On 29/9/18 5:52 am, Eric Anholt wrote:

For gallium drivers where you want to do some linking at variant compile
time, you don't have the other producer/consumer shader on hand to modify.
By exposing the inner function, the driver can have the used varyings in
the compiled shader cache key and still do linking.

This is also useful for V3D, where the binning shader wants to only output
position and TF varyings.  We've been removing those after nir_lower_io,
but this will be less driver-specific code and let more of the shader get
DCEed early in NIR.
---
  src/compiler/nir/nir.h |  3 +++
  src/compiler/nir/nir_linking_helpers.c | 32 +++---
  2 files changed, 27 insertions(+), 8 deletions(-)

diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
index e0df95c391c9..387efc8595e4 100644
--- a/src/compiler/nir/nir.h
+++ b/src/compiler/nir/nir.h
@@ -2755,6 +2755,9 @@ void nir_assign_var_locations(struct exec_list *var_list, 
unsigned *size,
  
  /* Some helpers to do very simple linking */

  bool nir_remove_unused_varyings(nir_shader *producer, nir_shader *consumer);
+bool nir_remove_unused_io_vars(nir_shader *shader, struct exec_list *var_list,
+   uint64_t *used_by_other_stage,
+   uint64_t *used_by_other_stage_patches);
  void nir_compact_varyings(nir_shader *producer, nir_shader *consumer,
bool default_to_smooth_interp);
  
diff --git a/src/compiler/nir/nir_linking_helpers.c b/src/compiler/nir/nir_linking_helpers.c

index 7446bb826f97..85677b7c176a 100644
--- a/src/compiler/nir/nir_linking_helpers.c
+++ b/src/compiler/nir/nir_linking_helpers.c
@@ -92,10 +92,26 @@ tcs_add_output_reads(nir_shader *shader, uint64_t *read, 
uint64_t *patches_read)
 }
  }
  
-static bool

-remove_unused_io_vars(nir_shader *shader, struct exec_list *var_list,
-  uint64_t *used_by_other_stage,
-  uint64_t *used_by_other_stage_patches)
+/**
+ * Helper for removing unused shader I/O variables, by demoting them to global
+ * variables (which may then be dead code eliminated).
+ *
+ * Example usage is:
+ *
+ * progress = nir_remove_unused_io_vars(producer,
+ *  &producer->outputs,
+ *  read, patches_read) ||
+ *  progress;
+ *
+ * The "used" should be an array of 4 uint64_ts (probably of VARYING_BIT_*)
+ * representing each .location_frac used.  Note that for vector variables,
+ * only the first channel (.location_frac) is examined for deciding if the
+ * variable is used!


Yeah we depend on the lower to scalar passes and array splitting to get 
the most out of this function.


Series:

Reviewed-by: Timothy Arceri 


+ */
+bool
+nir_remove_unused_io_vars(nir_shader *shader, struct exec_list *var_list,
+  uint64_t *used_by_other_stage,
+  uint64_t *used_by_other_stage_patches)
  {
 bool progress = false;
 uint64_t *used;
@@ -169,11 +185,11 @@ nir_remove_unused_varyings(nir_shader *producer, 
nir_shader *consumer)
tcs_add_output_reads(producer, read, patches_read);
  
 bool progress = false;

-   progress = remove_unused_io_vars(producer, &producer->outputs, read,
-                                    patches_read);
+   progress = nir_remove_unused_io_vars(producer, &producer->outputs, read,
+                                        patches_read);
  
-   progress = remove_unused_io_vars(consumer, &consumer->inputs, written,
-                                    patches_written) || progress;
+   progress = nir_remove_unused_io_vars(consumer, &consumer->inputs, written,
+                                        patches_written) || progress;
  
 return progress;

  }
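
To make the "array of 4 uint64_ts" from the quoted comment concrete, here is a small standalone sketch of per-component usage masks. It assumes one 64-bit mask per .location_frac with one bit per location; struct io_var, mark_used() and var_is_used() are hypothetical names for this example and are not part of NIR.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_COMPONENTS 4   /* one mask per .location_frac (x, y, z, w) */

/* Hypothetical stand-in for an I/O variable; not NIR's nir_variable. */
struct io_var {
    unsigned location;       /* slot index; must be < 64 in this sketch */
    unsigned location_frac;  /* first component the variable occupies */
};

/* Record that the other stage touches `location` at component `frac`. */
static void
mark_used(uint64_t used[NUM_COMPONENTS], unsigned location, unsigned frac)
{
    assert(location < 64 && frac < NUM_COMPONENTS);
    used[frac] |= UINT64_C(1) << location;
}

/* Only the variable's first component is consulted, as the comment warns. */
static bool
var_is_used(const uint64_t used[NUM_COMPONENTS], const struct io_var *var)
{
    assert(var->location < 64 && var->location_frac < NUM_COMPONENTS);
    return (used[var->location_frac] >> var->location) & 1;
}
```

With only component y of a slot marked, a variable starting at component x of that slot is reported as unused, which is why running the scalarizing and array-splitting passes first matters.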




Re: [Mesa-dev] [PATCH 9/9] i965/tex_image: Drop intelCompressedTexSubImage

2018-10-02 Thread Nanley Chery
On Wed, Sep 26, 2018 at 04:31:11PM -0700, Nanley Chery wrote:
> Effectively revert 710b1d2e665ed654fb8d52b146fa22469e1dc3a7.
> 
> This function was created to perform the ASTC void-extent workaround.
> Now that the workaround is handled prior to sampling, this function is
> no longer necessary.

Adding to the commit message:

Makes the following piglit test pass:
spec@khr_texture_compression_astc@void-extent-dl-bug

In hopes that the test makes it upstream.

-Nanley
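
For reference, the per-block fix-up that the removed function performed (see the quoted flush_astc_denorms() below) can be sketched in an endian-independent way. This is an illustrative rewrite under the layout stated in the quoted code: a 16-byte little-endian block, a 12-bit header of 0xDFC for LDR void-extent, and R, G, B, A as UNORM16 in the last 8 bytes. The helper names are invented for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Read/write a 16-bit little-endian value regardless of host endianness. */
static uint16_t
read_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static void
write_le16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v & 0xff);
    p[1] = (uint8_t)(v >> 8);
}

/* If `block` (16 bytes) is an LDR void-extent block, flush each UNORM16
 * channel smaller than 4 to zero. Returns true if the header matched. */
static bool
flush_void_extent_denorms(uint8_t *block)
{
    assert(block != NULL);

    /* The low 12 bits of the first 16-bit word hold the header. */
    if ((read_le16(block) & 0xFFF) != 0xDFC)
        return false;

    /* R, G, B, A live at byte offsets 8, 10, 12 and 14. */
    for (size_t off = 8; off < 16; off += 2) {
        if (read_le16(block + off) < 4)
            write_le16(block + off, 0);
    }
    return true;
}
```

Because this modifies the stored texture data, it shows why downloads could come back altered once the fix-up was applied at upload time.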

> ---
>  src/mesa/drivers/dri/i965/intel_tex_image.c | 87 -
>  1 file changed, 87 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/intel_tex_image.c 
> b/src/mesa/drivers/dri/i965/intel_tex_image.c
> index 9775f788788..31ff08217ac 100644
> --- a/src/mesa/drivers/dri/i965/intel_tex_image.c
> +++ b/src/mesa/drivers/dri/i965/intel_tex_image.c
> @@ -843,98 +843,11 @@ intel_get_tex_sub_image(struct gl_context *ctx,
> DBG("%s - DONE\n", __func__);
>  }
>  
> -static void
> -flush_astc_denorms(struct gl_context *ctx, GLuint dims,
> -   struct gl_texture_image *texImage,
> -   GLint xoffset, GLint yoffset, GLint zoffset,
> -   GLsizei width, GLsizei height, GLsizei depth)
> -{
> -   struct compressed_pixelstore store;
> -   _mesa_compute_compressed_pixelstore(dims, texImage->TexFormat,
> -   width, height, depth,
> -   &ctx->Unpack, &store);
> -
> -   for (int slice = 0; slice < store.CopySlices; slice++) {
> -
> -  /* Map dest texture buffer */
> -  GLubyte *dstMap;
> -  GLint dstRowStride;
> -  ctx->Driver.MapTextureImage(ctx, texImage, slice + zoffset,
> -  xoffset, yoffset, width, height,
> -  GL_MAP_READ_BIT | GL_MAP_WRITE_BIT,
> -  &dstMap, &dstRowStride);
> -  if (!dstMap)
> - continue;
> -
> -  for (int i = 0; i < store.CopyRowsPerSlice; i++) {
> -
> - /* An ASTC block is stored in little endian mode. The byte that
> -  * contains bits 0..7 is stored at the lower address in memory.
> -  */
> - struct astc_void_extent {
> -uint16_t header : 12;
> -uint16_t dontcare[3];
> -uint16_t R;
> -uint16_t G;
> -uint16_t B;
> -uint16_t A;
> - } *blocks = (struct astc_void_extent*) dstMap;
> -
> - /* Iterate over every copied block in the row */
> - for (int j = 0; j < store.CopyBytesPerRow / 16; j++) {
> -
> -/* Check if the header matches that of an LDR void-extent block */
> -if (blocks[j].header == 0xDFC) {
> -
> -   /* Flush UNORM16 values that would be denormalized */
> -   if (blocks[j].A < 4) blocks[j].A = 0;
> -   if (blocks[j].B < 4) blocks[j].B = 0;
> -   if (blocks[j].G < 4) blocks[j].G = 0;
> -   if (blocks[j].R < 4) blocks[j].R = 0;
> -}
> - }
> -
> - dstMap += dstRowStride;
> -  }
> -
> -  ctx->Driver.UnmapTextureImage(ctx, texImage, slice + zoffset);
> -   }
> -}
> -
> -
> -static void
> -intelCompressedTexSubImage(struct gl_context *ctx, GLuint dims,
> -struct gl_texture_image *texImage,
> -GLint xoffset, GLint yoffset, GLint zoffset,
> -GLsizei width, GLsizei height, GLsizei depth,
> -GLenum format,
> -GLsizei imageSize, const GLvoid *data)
> -{
> -   /* Upload the compressed data blocks */
> -   _mesa_store_compressed_texsubimage(ctx, dims, texImage,
> -  xoffset, yoffset, zoffset,
> -  width, height, depth,
> -  format, imageSize, data);
> -
> -   /* Fix up copied ASTC blocks if necessary */
> -   GLenum gl_format = _mesa_compressed_format_to_glenum(ctx,
> -texImage->TexFormat);
> -   bool is_linear_astc = _mesa_is_astc_format(gl_format) &&
> -!_mesa_is_srgb_format(gl_format);
> -   struct brw_context *brw = (struct brw_context*) ctx;
> -   const struct gen_device_info *devinfo = &brw->screen->devinfo;
> -   if (devinfo->gen == 9 && !gen_device_info_is_9lp(devinfo) && is_linear_astc)
> -  flush_astc_denorms(ctx, dims, texImage,
> - xoffset, yoffset, zoffset,
> - width, height, depth);
> -}
> -
>  void
>  intelInitTextureImageFuncs(struct dd_function_table *functions)
>  {
> functions->TexImage = intelTexImage;
> functions->TexSubImage = intelTexSubImage;
> -   functions->CompressedTexSubImage = intelCompressedTexSubImage;
> functions->EGLImageTargetTexture2D = intel_image_target_texture_2d;
> functions->BindRenderbufferTexImage = 

Re: [Mesa-dev] [PATCH 0/9] i965: Re-implement the gen9 void-extent ASTC WA with BLORP

2018-10-02 Thread Nanley Chery
On Wed, Sep 26, 2018 at 04:31:02PM -0700, Nanley Chery wrote:
> The current workaround has two issues. It causes significant slow-downs [1] in
> application startup times and uses the modified ASTC blocks for non-sampling
> operations. This can result in incorrect texture downloads.
> 
> This series addresses the latter issue by keeping two copies of an ASTC
> miptree: one that's been modified for the sampler bug (the shadow) and another
> that hasn't (the main). The main copy is used for pixel transfer operations 
> and
> the shadow is used for sampling within a shader. The former issue is addressed
> by exchanging multiple GTT-mapped memory accesses at texture upload time with 
> a
> render engine read and write at sampling time.
> 
> At the moment, I don't have any empirical data on the performance
> implications nor on the bug fixes.

I just sent out a piglit test to demonstrate the fixed texture download
issue: https://patchwork.freedesktop.org/series/50474/

-Nanley

> I'm trying to get my hands on one of
> the affected benchmarks. This series does pass our CI system.
> 
> 1. 17 seconds were saved by avoiding it in commit:
>3e56e4642fb5875b3f5c4eb34798ba9f3d827705
> 
> Nanley Chery (9):
>   i965: Rename intel_mipmap_tree::r8stencil_* -> ::shadow_*
>   i965/miptree: Allocate a shadow_mt for an ASTC WA
>   i965/miptree: Track the staleness of the ASTC shadow
>   intel/blorp_blit: Fix ptr deref in convert_to_uncompressed
>   intel/blorp_blit: Add blorp_copy_astc_wa
>   i965/blorp: Drop tmp_surfs from surf_for_miptree
>   i965: Do a WA blit between ASTC main and shadow
>   i965/surface_state: Use the ASTC shadow_mt if present
>   i965/tex_image: Drop intelCompressedTexSubImage
> 
>  src/intel/blorp/blorp.h   |   6 +
>  src/intel/blorp/blorp_blit.c  | 158 +-
>  src/intel/blorp/blorp_priv.h  |   1 +
>  src/mesa/drivers/dri/i965/brw_blorp.c |  56 ---
>  src/mesa/drivers/dri/i965/brw_blorp.h |   6 +
>  src/mesa/drivers/dri/i965/brw_draw.c  |  16 ++
>  .../drivers/dri/i965/brw_wm_surface_state.c   |  11 +-
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.c |  46 -
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.h |  21 ++-
>  src/mesa/drivers/dri/i965/intel_tex_image.c   |  87 --
>  10 files changed, 276 insertions(+), 132 deletions(-)
> 
> -- 
> 2.19.0
> 


Re: [Mesa-dev] [PATCH 01/13] mesa: add a list of EXT_direct_state_access to dispatch sanity

2018-10-02 Thread Timothy Arceri

On 3/10/18 7:50 am, Marek Olšák wrote:

This is probably OK, though the TODO list in features.txt should also
be updated when a new subset is implemented.


Sure now that we have one :)

Thanks for the feedback on the series. However priorities have changed 
since I sent this series. Also it turns out a patch in Wine staging was 
responsible for Wolfenstein thinking the extension was available (it 
works fine on older versions of Wine). I'm not sure I'll get back to 
this extension anytime in the foreseeable future.





Marek
On Sat, Sep 8, 2018 at 12:32 AM Timothy Arceri  wrote:


This extension is huge and this gives us a TODO list of functions
to implement.
---
  src/mesa/main/tests/dispatch_sanity.cpp | 219 
  1 file changed, 219 insertions(+)

diff --git a/src/mesa/main/tests/dispatch_sanity.cpp 
b/src/mesa/main/tests/dispatch_sanity.cpp
index fb2acfbdeea..8b03f5377b3 100644
--- a/src/mesa/main/tests/dispatch_sanity.cpp
+++ b/src/mesa/main/tests/dispatch_sanity.cpp
@@ -1015,6 +1015,225 @@ const struct function 
common_desktop_functions_possible[] = {
 { "glGetQueryBufferObjecti64v", 45, -1 },
 { "glGetQueryBufferObjectui64v", 45, -1 },

+   /* GL_EXT_direct_state_access - GL 1.0 */
+   //{ "glMatrixLoadfEXT", 10, -1 },
+   //{ "glMatrixLoaddEXT", 10, -1 },
+   //{ "glMatrixMultfEXT", 10, -1 },
+   //{ "glMatrixMultdEXT", 10, -1 },
+   //{ "glMatrixLoadIdentityEXT", 10, -1 },
+   //{ "glMatrixRotatefEXT", 10, -1 },
+   //{ "glMatrixRotatedEXT", 10, -1 },
+   //{ "glMatrixScalefEXT", 10, -1 },
+   //{ "glMatrixScaledEXT", 10, -1 },
+   //{ "glMatrixTranslatefEXT", 10, -1 },
+   //{ "glMatrixTranslatedEXT", 10, -1 },
+   //{ "glMatrixOrthoEXT", 10, -1 },
+   //{ "glMatrixFrustumEXT", 10, -1 },
+   //{ "glMatrixPushEXT", 10, -1 },
+   //{ "glMatrixPopEXT", 10, -1 },
+   /* GL_EXT_direct_state_access - GL 1.1 */
+   //{ "glClientAttribDefaultEXT", 10, -1 },
+   //{ "glPushClientAttribDefaultEXT", 10, -1 },
+   //{ "glTextureParameteriEXT", 10, -1 },
+   //{ "glTextureParameterivEXT", 10, -1 },
+   //{ "glTextureParameterfEXT", 10, -1 },
+   //{ "glTextureParameterfvEXT", 10, -1 },
+   //{ "glTextureImage1DEXT", 10, -1 },
+   //{ "glTextureImage2DEXT", 10, -1 },
+   //{ "glTextureSubImage1DEXT", 10, -1 },
+   //{ "glTextureSubImage2DEXT", 10, -1 },
+   //{ "glCopyTextureImage1DEXT", 10, -1 },
+   //{ "glCopyTextureImage2DEXT", 10, -1 },
+   //{ "glCopyTextureSubImage1DEXT", 10, -1 },
+   //{ "glCopyTextureSubImage2DEXT", 10, -1 },
+   //{ "glGetTextureImageEXT", 10, -1 },
+   //{ "glGetTextureParameterfvEXT", 10, -1 },
+   //{ "glGetTextureParameterivEXT", 10, -1 },
+   //{ "glGetTextureLevelParameterfvEXT", 10, -1 },
+   //{ "glGetTextureLevelParameterivEXT", 10, -1 },
+   /* GL_EXT_direct_state_access - GL 1.2 */
+   //{ "glTextureImage3DEXT", 10, -1 },
+   //{ "glTextureSubImage3DEXT", 10, -1 },
+   //{ "glCopyTextureSubImage3DEXT", 10, -1 },
+   /* GL_EXT_direct_state_access - GL 1.2.1 */
+   //{ "glBindMultiTextureEXT", 10, -1 },
+   //{ "glMultiTexCoordPointerEXT", 10, -1 },
+   //{ "glMultiTexEnvfEXT", 10, -1 },
+   //{ "glMultiTexEnvfvEXT", 10, -1 },
+   //{ "glMultiTexEnviEXT", 10, -1 },
+   //{ "glMultiTexEnvivEXT", 10, -1 },
+   //{ "glMultiTexGenEXT", 10, -1 },
+   //{ "glMultiTexGenvEXT", 10, -1 },
+   //{ "glMultiTexGenfEXT", 10, -1 },
+   //{ "glMultiTexGenfvEXT", 10, -1 },
+   //{ "glMultiTexGeniEXT", 10, -1 },
+   //{ "glMultiTexGenivEXT", 10, -1 },
+   //{ "glGenMultiTexEnvfvEXT", 10, -1 },
+   //{ "glGenMultiTexEnvivEXT", 10, -1 },
+   //{ "glGenMultiTexGenvEXT", 10, -1 },
+   //{ "glGenMultiTexGenfvEXT", 10, -1 },
+   //{ "glGenMultiTexGenivEXT", 10, -1 },
+   //{ "glMultiTexParameterfEXT", 10, -1 },
+   //{ "glMultiTexParameterfvEXT", 10, -1 },
+   //{ "glMultiTexParameteriEXT", 10, -1 },
+   //{ "glMultiTexParameterivEXT", 10, -1 },
+   //{ "glMultiTexImage1DEXT", 10, -1 },
+   //{ "glMultiTexImage2DEXT", 10, -1 },
+   //{ "glMultiTexSubImage1DEXT", 10, -1 },
+   //{ "glMultiTexSubImage2DEXT", 10, -1 },
+   //{ "glCopyMultiTexImage1DEXT", 10, -1 },
+   //{ "glCopyMultiTexImage2DEXT", 10, -1 },
+   //{ "glCopyMultiTexSubImage1DEXT", 10, -1 },
+   //{ "glCopyMultiTexSubImage2DEXT", 10, -1 },
+   //{ "glGetMultiTexImageEXT", 10, -1 },
+   //{ "glGetMultiTexParameterfvEXT", 10, -1 },
+   //{ "glGetMultiTexParameterivEXT", 10, -1 },
+   //{ "glGetMultiTexLevelParameterfvEXT", 10, -1 },
+   //{ "glGetMultiTexLevelParameterivEXT", 10, -1 },
+   //{ "glMultiTexImage3DEXT", 10, -1 },
+   //{ "glMultiTexSubImage3DEXT", 10, -1 },
+   //{ "glCopyMultiTexSubImage3DEXT", 10, -1 },
+   //{ "glEnableClientStateIndexedEXT", 10, -1 },
+   //{ "glDisableClientStateIndexedEXT", 10, -1 },
+   //{ "glGetFloatIndexedvEXT", 10, -1 },
+   //{ "glGetDoubleIndexedvEXT", 10, -1 },
+   //{ "glGetPointerIndexedvEXT", 10, -1 },
+   //{ "glEnableIndexedEXT", 10, -1 },
+   //{ "glDisableIndexedEXT", 10, -1 },
+   //{ "glIsEnabledIndexedEXT", 10, -1 },
+   //{ 

Re: [Mesa-dev] [PATCH] util/u_queue: don't inherit thread affinity from parent thread

2018-10-02 Thread Marek Olšák
On Tue, Oct 2, 2018 at 6:36 PM Rob Clark  wrote:
>
> On Tue, Oct 2, 2018 at 6:30 PM Marek Olšák  wrote:
> >
> > From: Marek Olšák 
> >
> > ---
> >  src/util/u_queue.c | 12 
> >  1 file changed, 12 insertions(+)
> >
> > diff --git a/src/util/u_queue.c b/src/util/u_queue.c
> > index 22d2cdd0fa2..9dd1a69ed7a 100644
> > --- a/src/util/u_queue.c
> > +++ b/src/util/u_queue.c
> > @@ -232,20 +232,32 @@ struct thread_input {
> >  };
> >
> >  static int
> >  util_queue_thread_func(void *input)
> >  {
> > struct util_queue *queue = ((struct thread_input*)input)->queue;
> > int thread_index = ((struct thread_input*)input)->thread_index;
> >
> > free(input);
> >
> > +#ifdef HAVE_PTHREAD_SETAFFINITY
> > +   /* Don't inherit the thread affinity from the parent thread.
> > +* Set the full mask.
> > +*/
> > +   cpu_set_t cpuset;
> > +   CPU_ZERO(&cpuset);
> > +   for (unsigned i = 0; i < CPU_SETSIZE; i++)
> > +      CPU_SET(i, &cpuset);
> > +
> > +   pthread_setaffinity_np(pthread_self(), sizeof(cpuset), &cpuset);
> > +#endif
>
>
> Just curious (and maybe I missed some previous discussion), would this
> override taskset?
>
> Asking because when benchmarking on big/little arm SoCs I tend to use
> taskset to pin things to either the fast cores or slow cores, to
> eliminate a source of uncertainty in the result.  (And I use u_queue
> to split off the 2nd half of batch submits, i.e. the part that generates
> gmem/tiling cmds and does the kernel submit ioctl).  Would be slightly
> annoying to lose that ability to control which group of cores the
> u_queue thread runs on.
>
> (But admittedly this is kind of an edge case, so I guess an env var to
> override the behavior would be ok.)

I don't know, but I guess it affects it.

pipe_context::set_context_param(ctx,
PIPE_CONTEXT_PARAM_PIN_THREADS_TO_L3_CACHE, L3_group_index); is
similar to what you need.

The ideal option would be to make the most desirable behavior the default
on ARM. An env var is the second option.

Marek


[Mesa-dev] [PATCH 11/15] radeonsi: center viewport to improve guardband clipping for high resolutions

2018-10-02 Thread Marek Olšák
From: Marek Olšák 

This will be more useful when we change the quant mode to increase subpixel
precision and decrease the viewport range (which might not be possible
if the viewport is not centered in the viewport range).
---
 src/gallium/drivers/radeonsi/si_gfx_cs.c  |  1 +
 src/gallium/drivers/radeonsi/si_state.c   | 11 +++-
 src/gallium/drivers/radeonsi/si_state.h   |  2 +
 .../drivers/radeonsi/si_state_viewport.c  | 62 +++
 4 files changed, 62 insertions(+), 14 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_gfx_cs.c 
b/src/gallium/drivers/radeonsi/si_gfx_cs.c
index bdb576f7e5c..5a6f7bb35cb 100644
--- a/src/gallium/drivers/radeonsi/si_gfx_cs.c
+++ b/src/gallium/drivers/radeonsi/si_gfx_cs.c
@@ -341,20 +341,21 @@ void si_begin_new_gfx_cs(struct si_context *ctx)
ctx->tracked_regs.reg_value[SI_TRACKED_PA_SC_MODE_CNTL_1] = 
0x;

ctx->tracked_regs.reg_value[SI_TRACKED_PA_SU_SMALL_PRIM_FILTER_CNTL] = 
0x;
ctx->tracked_regs.reg_value[SI_TRACKED_PA_CL_VS_OUT_CNTL] = 
0x;
ctx->tracked_regs.reg_value[SI_TRACKED_PA_CL_CLIP_CNTL] = 
0x0009;
ctx->tracked_regs.reg_value[SI_TRACKED_PA_SC_BINNER_CNTL_0] = 
0x0003;
ctx->tracked_regs.reg_value[SI_TRACKED_DB_DFSM_CONTROL] = 
0x;
ctx->tracked_regs.reg_value[SI_TRACKED_PA_CL_GB_VERT_CLIP_ADJ]  
= 0x3f80;
ctx->tracked_regs.reg_value[SI_TRACKED_PA_CL_GB_VERT_DISC_ADJ]  
= 0x3f80;
ctx->tracked_regs.reg_value[SI_TRACKED_PA_CL_GB_HORZ_CLIP_ADJ]  
= 0x3f80;
ctx->tracked_regs.reg_value[SI_TRACKED_PA_CL_GB_HORZ_DISC_ADJ]  
= 0x3f80;
+   
ctx->tracked_regs.reg_value[SI_TRACKED_PA_SU_HARDWARE_SCREEN_OFFSET] = 0;
ctx->tracked_regs.reg_value[SI_TRACKED_PA_SC_CLIPRECT_RULE] 
= 0x;
 
/* Set all saved registers state to saved. */
ctx->tracked_regs.reg_saved = 0x;
} else {
/* Set all saved registers state to unknown. */
ctx->tracked_regs.reg_saved = 0;
}
 
/* 0x is a impossible value to register SPI_PS_INPUT_CNTL_n */
diff --git a/src/gallium/drivers/radeonsi/si_state.c 
b/src/gallium/drivers/radeonsi/si_state.c
index c2d3a6660ad..8940f78cb54 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -2724,20 +2724,29 @@ static void si_set_framebuffer_state(struct 
pipe_context *ctx,
bool old_any_dst_linear = sctx->framebuffer.any_dst_linear;
unsigned old_nr_samples = sctx->framebuffer.nr_samples;
unsigned old_colorbuf_enabled_4bit = 
sctx->framebuffer.colorbuf_enabled_4bit;
bool old_has_zsbuf = !!sctx->framebuffer.state.zsbuf;
bool old_has_stencil =
old_has_zsbuf &&
((struct 
si_texture*)sctx->framebuffer.state.zsbuf->texture)->surface.has_stencil;
bool unbound = false;
int i;
 
+   /* Reject zero-sized framebuffers due to a hw bug on SI that occurs
+* when PA_SU_HARDWARE_SCREEN_OFFSET != 0 and any_scissor.BR_X/Y <= 0.
+* We could implement the full workaround here, but it's a useless case.
+*/
+   if ((!state->width || !state->height) && (state->nr_cbufs || 
state->zsbuf)) {
+   unreachable("the framebuffer shouldn't have zero area");
+   return;
+   }
+
si_update_fb_dirtiness_after_rendering(sctx);
 
for (i = 0; i < sctx->framebuffer.state.nr_cbufs; i++) {
if (!sctx->framebuffer.state.cbufs[i])
continue;
 
tex = (struct 
si_texture*)sctx->framebuffer.state.cbufs[i]->texture;
if (tex->dcc_gather_statistics)
vi_separate_dcc_stop_query(sctx, tex);
}
@@ -4900,22 +4909,20 @@ static void si_init_config(struct si_context *sctx)
if (!has_clear_state) {
si_pm4_set_reg(pm4, R_028230_PA_SC_EDGERULE,
   S_028230_ER_TRI(0xA) |
   S_028230_ER_POINT(0xA) |
   S_028230_ER_RECT(0xA) |
   /* Required by DX10_DIAMOND_TEST_ENA: */
   S_028230_ER_LINE_LR(0x1A) |
   S_028230_ER_LINE_RL(0x26) |
   S_028230_ER_LINE_TB(0xA) |
   S_028230_ER_LINE_BT(0xA));
-   /* PA_SU_HARDWARE_SCREEN_OFFSET must be 0 due to hw bug on SI */
-   si_pm4_set_reg(pm4, R_028234_PA_SU_HARDWARE_SCREEN_OFFSET, 0);
si_pm4_set_reg(pm4, R_028820_PA_CL_NANINF_CNTL, 0);
si_pm4_set_reg(pm4, R_028AC0_DB_SRESULTS_COMPARE_STATE0, 0x0);
si_pm4_set_reg(pm4, R_028AC4_DB_SRESULTS_COMPARE_STATE1, 0x0);
si_pm4_set_reg(pm4, 

[Mesa-dev] [PATCH 15/15] radeonsi: use higher subpixel precision (QUANT_MODE) for smaller viewports

2018-10-02 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_get.c |  4 +-
 src/gallium/drivers/radeonsi/si_pipe.h|  8 +++
 .../drivers/radeonsi/si_state_viewport.c  | 50 ---
 3 files changed, 53 insertions(+), 9 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_get.c 
b/src/gallium/drivers/radeonsi/si_get.c
index a87cb3cbc8a..ac302b8a946 100644
--- a/src/gallium/drivers/radeonsi/si_get.c
+++ b/src/gallium/drivers/radeonsi/si_get.c
@@ -328,21 +328,23 @@ static int si_get_param(struct pipe_screen *pscreen, enum 
pipe_cap param)
}
 }
 
 static float si_get_paramf(struct pipe_screen* pscreen, enum pipe_capf param)
 {
switch (param) {
case PIPE_CAPF_MAX_LINE_WIDTH:
case PIPE_CAPF_MAX_LINE_WIDTH_AA:
case PIPE_CAPF_MAX_POINT_WIDTH:
case PIPE_CAPF_MAX_POINT_WIDTH_AA:
-   return 8192.0f;
+   /* This depends on the quant mode, though the precise 
interactions
+* are unknown. */
+   return 2048;
case PIPE_CAPF_MAX_TEXTURE_ANISOTROPY:
return 16.0f;
case PIPE_CAPF_MAX_TEXTURE_LOD_BIAS:
return 16.0f;
case PIPE_CAPF_MIN_CONSERVATIVE_RASTER_DILATE:
case PIPE_CAPF_MAX_CONSERVATIVE_RASTER_DILATE:
case PIPE_CAPF_CONSERVATIVE_RASTER_DILATE_GRANULARITY:
return 0.0f;
}
return 0.0f;
diff --git a/src/gallium/drivers/radeonsi/si_pipe.h 
b/src/gallium/drivers/radeonsi/si_pipe.h
index 93082e262d6..7e15412ef87 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.h
+++ b/src/gallium/drivers/radeonsi/si_pipe.h
@@ -596,25 +596,33 @@ struct si_framebuffer {
ubyte   uncompressed_cb_mask;
ubyte   color_is_int8;
ubyte   color_is_int10;
ubyte   dirty_cbufs;
booldirty_zsbuf;
boolany_dst_linear;
boolCB_has_shader_readable_metadata;
boolDB_has_shader_readable_metadata;
 };
 
+enum si_quant_mode {
+   /* This is the list we want to support. */
+   SI_QUANT_MODE_16_8_FIXED_POINT_1_256TH,
+   SI_QUANT_MODE_14_10_FIXED_POINT_1_1024TH,
+   SI_QUANT_MODE_12_12_FIXED_POINT_1_4096TH,
+};
+
 struct si_signed_scissor {
int minx;
int miny;
int maxx;
int maxy;
+   enum si_quant_mode quant_mode;
 };
 
 struct si_scissors {
unsigneddirty_mask;
struct pipe_scissor_state   states[SI_MAX_VIEWPORTS];
 };
 
 struct si_viewports {
unsigneddirty_mask;
unsigneddepth_range_dirty_mask;
diff --git a/src/gallium/drivers/radeonsi/si_state_viewport.c 
b/src/gallium/drivers/radeonsi/si_state_viewport.c
index c69a56dffae..819c773ba8e 100644
--- a/src/gallium/drivers/radeonsi/si_state_viewport.c
+++ b/src/gallium/drivers/radeonsi/si_state_viewport.c
@@ -100,20 +100,21 @@ static void si_clip_scissor(struct pipe_scissor_state 
*out,
out->maxy = MIN2(out->maxy, clip->maxy);
 }
 
 static void si_scissor_make_union(struct si_signed_scissor *out,
  struct si_signed_scissor *in)
 {
out->minx = MIN2(out->minx, in->minx);
out->miny = MIN2(out->miny, in->miny);
out->maxx = MAX2(out->maxx, in->maxx);
out->maxy = MAX2(out->maxy, in->maxy);
+   out->quant_mode = MIN2(out->quant_mode, in->quant_mode);
 }
 
 static void si_emit_one_scissor(struct si_context *ctx,
struct radeon_cmdbuf *cs,
struct si_signed_scissor *vp_scissor,
struct pipe_scissor_state *scissor)
 {
struct pipe_scissor_state final;
 
if (ctx->vs_disables_clipping_viewport) {
@@ -138,43 +139,47 @@ static void si_emit_one_scissor(struct si_context *ctx,
return;
}
 
radeon_emit(cs, S_028250_TL_X(final.minx) |
S_028250_TL_Y(final.miny) |
S_028250_WINDOW_OFFSET_DISABLE(1));
radeon_emit(cs, S_028254_BR_X(final.maxx) |
S_028254_BR_Y(final.maxy));
 }
 
-/* the range is [-MAX, MAX] */
-#define SI_MAX_VIEWPORT_RANGE 32768
-
 static void si_emit_guardband(struct si_context *ctx)
 {
const struct si_state_rasterizer *rs = ctx->queued.named.rasterizer;
struct si_signed_scissor vp_as_scissor;
struct pipe_viewport_state vp;
float left, top, right, bottom, max_range, guardband_x, guardband_y;
float discard_x, discard_y;
 
if (ctx->vs_writes_viewport_index) {
/* Shaders can draw to any viewport. Make a union of all
 * viewports. */
vp_as_scissor = ctx->viewports.as_scissor[0];
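
A note on the quant_mode union above, since the rest of the hunk is cut off here. The enum in si_pipe.h runs from 16.8 to 12.12, and si_scissor_make_union combines modes with MIN2, so a union of viewports ends up with the mode that has the fewest fractional bits (and therefore the widest integer range). The sketch below only illustrates that ordering argument; the shortened names and the frac-bits helper are hypothetical, and how each viewport picks its own mode is not shown in the quoted diff.

```c
#include <assert.h>

#define MIN2(a, b) ((a) < (b) ? (a) : (b))
#define MAX2(a, b) ((a) > (b) ? (a) : (b))

/* Same ordering as enum si_quant_mode in the patch (names shortened). */
enum quant_mode {
	QUANT_16_8,  /* 1/256th subpixel steps */
	QUANT_14_10, /* 1/1024th */
	QUANT_12_12, /* 1/4096th */
};

struct signed_scissor {
	int minx, miny, maxx, maxy;
	enum quant_mode quant_mode;
};

/* Hypothetical helper: fractional bits implied by the mode name. */
static unsigned quant_mode_frac_bits(enum quant_mode mode)
{
	switch (mode) {
	case QUANT_16_8:  return 8;
	case QUANT_14_10: return 10;
	case QUANT_12_12: return 12;
	}
	assert(!"invalid quant mode");
	return 0;
}

/* Mirrors si_scissor_make_union: grow the box, keep the smallest enum value. */
static void scissor_make_union(struct signed_scissor *out,
			       const struct signed_scissor *in)
{
	out->minx = MIN2(out->minx, in->minx);
	out->miny = MIN2(out->miny, in->miny);
	out->maxx = MAX2(out->maxx, in->maxx);
	out->maxy = MAX2(out->maxy, in->maxy);
	out->quant_mode = MIN2(out->quant_mode, in->quant_mode);
}
```

Because the enum is ordered by decreasing integer range, MIN2 is enough and no lookup table is needed; reordering the enum would silently break the union.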
   

[Mesa-dev] [PATCH 14/15] radeonsi: move emission of PA_SU_VTX_CNTL into emit_guardband

2018-10-02 Thread Marek Olšák
From: Marek Olšák 

We'll modify the quant mode there, which also affects the guardband
computation.
---
 src/gallium/drivers/radeonsi/si_gfx_cs.c | 1 +
 src/gallium/drivers/radeonsi/si_state.c  | 8 +++-
 src/gallium/drivers/radeonsi/si_state.h  | 2 ++
 src/gallium/drivers/radeonsi/si_state_viewport.c | 6 +-
 4 files changed, 11 insertions(+), 6 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_gfx_cs.c 
b/src/gallium/drivers/radeonsi/si_gfx_cs.c
index c458b68d846..60db3a6b96f 100644
--- a/src/gallium/drivers/radeonsi/si_gfx_cs.c
+++ b/src/gallium/drivers/radeonsi/si_gfx_cs.c
@@ -343,20 +343,21 @@ void si_begin_new_gfx_cs(struct si_context *ctx)

ctx->tracked_regs.reg_value[SI_TRACKED_PA_SU_SMALL_PRIM_FILTER_CNTL] = 
0x;
ctx->tracked_regs.reg_value[SI_TRACKED_PA_CL_VS_OUT_CNTL] = 
0x;
ctx->tracked_regs.reg_value[SI_TRACKED_PA_CL_CLIP_CNTL] = 
0x0009;
ctx->tracked_regs.reg_value[SI_TRACKED_PA_SC_BINNER_CNTL_0] = 
0x0003;
ctx->tracked_regs.reg_value[SI_TRACKED_DB_DFSM_CONTROL] = 
0x;
ctx->tracked_regs.reg_value[SI_TRACKED_PA_CL_GB_VERT_CLIP_ADJ]  
= 0x3f80;
ctx->tracked_regs.reg_value[SI_TRACKED_PA_CL_GB_VERT_DISC_ADJ]  
= 0x3f80;
ctx->tracked_regs.reg_value[SI_TRACKED_PA_CL_GB_HORZ_CLIP_ADJ]  
= 0x3f80;
ctx->tracked_regs.reg_value[SI_TRACKED_PA_CL_GB_HORZ_DISC_ADJ]  
= 0x3f80;

ctx->tracked_regs.reg_value[SI_TRACKED_PA_SU_HARDWARE_SCREEN_OFFSET] = 0;
+   ctx->tracked_regs.reg_value[SI_TRACKED_PA_SU_VTX_CNTL] = 
0x0005;
ctx->tracked_regs.reg_value[SI_TRACKED_PA_SC_CLIPRECT_RULE] 
= 0x;
 
/* Set all saved registers state to saved. */
ctx->tracked_regs.reg_saved = 0x;
} else {
/* Set all saved registers state to unknown. */
ctx->tracked_regs.reg_saved = 0;
}
 
/* 0x is a impossible value to register SPI_PS_INPUT_CNTL_n */
diff --git a/src/gallium/drivers/radeonsi/si_state.c 
b/src/gallium/drivers/radeonsi/si_state.c
index f4fc4fd69da..af1b9f0acc8 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -846,20 +846,21 @@ static void *si_create_rs_state(struct pipe_context *ctx,
if (!rs) {
return NULL;
}
 
rs->scissor_enable = state->scissor;
rs->clip_halfz = state->clip_halfz;
rs->two_side = state->light_twoside;
rs->multisample_enable = state->multisample;
rs->force_persample_interp = state->force_persample_interp;
rs->clip_plane_enable = state->clip_plane_enable;
+   rs->half_pixel_center = state->half_pixel_center;
rs->line_stipple_enable = state->line_stipple_enable;
rs->poly_stipple_enable = state->poly_stipple_enable;
rs->line_smooth = state->line_smooth;
rs->line_width = state->line_width;
rs->poly_smooth = state->poly_smooth;
rs->uses_poly_offset = state->offset_point || state->offset_line ||
   state->offset_tri;
rs->clamp_fragment_color = state->clamp_fragment_color;
rs->clamp_vertex_color = state->clamp_vertex_color;
rs->flatshade = state->flatshade;
@@ -906,24 +907,20 @@ static void *si_create_rs_state(struct pipe_context *ctx,
si_pm4_set_reg(pm4, R_028A08_PA_SU_LINE_CNTL,
   S_028A08_WIDTH(si_pack_float_12p4(state->line_width/2)));
si_pm4_set_reg(pm4, R_028A48_PA_SC_MODE_CNTL_0,
   S_028A48_LINE_STIPPLE_ENABLE(state->line_stipple_enable) 
|
   S_028A48_MSAA_ENABLE(state->multisample ||
state->poly_smooth ||
state->line_smooth) |
   S_028A48_VPORT_SCISSOR_ENABLE(1) |
   S_028A48_ALTERNATE_RBS_PER_TILE(sscreen->info.chip_class 
>= GFX9));
 
-   si_pm4_set_reg(pm4, R_028BE4_PA_SU_VTX_CNTL,
-  S_028BE4_PIX_CENTER(state->half_pixel_center) |
-  
S_028BE4_QUANT_MODE(V_028BE4_X_16_8_FIXED_POINT_1_256TH));
-
si_pm4_set_reg(pm4, R_028B7C_PA_SU_POLY_OFFSET_CLAMP, 
fui(state->offset_clamp));
si_pm4_set_reg(pm4, R_028814_PA_SU_SC_MODE_CNTL,
S_028814_PROVOKING_VTX_LAST(!state->flatshade_first) |
S_028814_CULL_FRONT((state->cull_face & PIPE_FACE_FRONT) ? 1 : 
0) |
S_028814_CULL_BACK((state->cull_face & PIPE_FACE_BACK) ? 1 : 0) 
|
S_028814_FACE(!state->front_ccw) |
S_028814_POLY_OFFSET_FRONT_ENABLE(util_get_offset(state, 
state->fill_front)) |
S_028814_POLY_OFFSET_BACK_ENABLE(util_get_offset(state, 
state->fill_back)) |

[Mesa-dev] [PATCH 13/15] radeonsi: don't re-upload the sample position constant buffer repeatedly

2018-10-02 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_pipe.c   |  7 +++
 src/gallium/drivers/radeonsi/si_pipe.h   | 13 -
 src/gallium/drivers/radeonsi/si_state.c  | 19 +--
 src/gallium/drivers/radeonsi/si_state_msaa.c | 10 +-
 4 files changed, 33 insertions(+), 16 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_pipe.c 
b/src/gallium/drivers/radeonsi/si_pipe.c
index 4da361c42ee..5ae9c298e77 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.c
+++ b/src/gallium/drivers/radeonsi/si_pipe.c
@@ -153,20 +153,21 @@ static void si_destroy_context(struct pipe_context 
*context)
struct pipe_framebuffer_state fb = {};
if (context->set_framebuffer_state)
context->set_framebuffer_state(context, &fb);
 
si_release_all_descriptors(sctx);
 
pipe_resource_reference(&sctx->esgs_ring, NULL);
pipe_resource_reference(&sctx->gsvs_ring, NULL);
pipe_resource_reference(&sctx->tess_rings, NULL);
pipe_resource_reference(&sctx->null_const_buf.buffer, NULL);
+   pipe_resource_reference(&sctx->sample_pos_buffer, NULL);
r600_resource_reference(&sctx->border_color_buffer, NULL);
free(sctx->border_color_table);
r600_resource_reference(&sctx->scratch_buffer, NULL);
r600_resource_reference(&sctx->compute_scratch_buffer, NULL);
r600_resource_reference(&sctx->wait_mem_scratch, NULL);
 
si_pm4_free_state(sctx, sctx->init_config, ~0);
if (sctx->init_config_gs_rings)
si_pm4_free_state(sctx, sctx->init_config_gs_rings, ~0);
for (i = 0; i < ARRAY_SIZE(sctx->vgt_shader_config); i++)
@@ -592,20 +593,26 @@ static struct pipe_context *si_create_context(struct 
pipe_screen *screen,
_mesa_key_pointer_equal);
sctx->img_handles = _mesa_hash_table_create(NULL, _mesa_hash_pointer,
_mesa_key_pointer_equal);
 
util_dynarray_init(&sctx->resident_tex_handles, NULL);
util_dynarray_init(&sctx->resident_img_handles, NULL);
util_dynarray_init(&sctx->resident_tex_needs_color_decompress, NULL);
util_dynarray_init(&sctx->resident_img_needs_color_decompress, NULL);
util_dynarray_init(&sctx->resident_tex_needs_depth_decompress, NULL);
 
+   sctx->sample_pos_buffer =
+   pipe_buffer_create(sctx->b.screen, 0, PIPE_USAGE_DEFAULT,
+  sizeof(sctx->sample_positions));
+   pipe_buffer_write(&sctx->b, sctx->sample_pos_buffer, 0,
+ sizeof(sctx->sample_positions), &sctx->sample_positions);
+
/* this must be last */
si_begin_new_gfx_cs(sctx);
return &sctx->b;
 fail:
fprintf(stderr, "radeonsi: Failed to create a context.\n");
si_destroy_context(&sctx->b);
return NULL;
 }
 
 static struct pipe_context *si_pipe_create_context(struct pipe_screen *screen,
diff --git a/src/gallium/drivers/radeonsi/si_pipe.h 
b/src/gallium/drivers/radeonsi/si_pipe.h
index ff11eab0224..93082e262d6 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.h
+++ b/src/gallium/drivers/radeonsi/si_pipe.h
@@ -958,25 +958,28 @@ struct si_context {
struct util_dynarrayresident_img_needs_color_decompress;
struct util_dynarrayresident_tex_needs_depth_decompress;
 
/* Bindless state */
booluses_bindless_samplers;
booluses_bindless_images;
 
/* MSAA sample locations.
 * The first index is the sample index.
 * The second index is the coordinate: X, Y. */
-   float   sample_locations_1x[1][2];
-   float   sample_locations_2x[2][2];
-   float   sample_locations_4x[4][2];
-   float   sample_locations_8x[8][2];
-   float   sample_locations_16x[16][2];
+   struct {
+   float   x1[1][2];
+   float   x2[2][2];
+   float   x4[4][2];
+   float   x8[8][2];
+   float   x16[16][2];
+   } sample_positions;
+   struct pipe_resource *sample_pos_buffer;
 
/* Misc stats. */
unsignednum_draw_calls;
unsignednum_decompress_calls;
unsignednum_mrt_draw_calls;
unsignednum_prim_restart_calls;
unsignednum_spill_draw_calls;
unsignednum_compute_calls;
unsignednum_spill_compute_calls;
unsignednum_dma_calls;
diff --git a/src/gallium/drivers/radeonsi/si_state.c 
b/src/gallium/drivers/radeonsi/si_state.c
index 0cebd974d80..f4fc4fd69da 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -2711,21 +2711,20 @@ 

[Mesa-dev] [PATCH 12/15] radeonsi: set PA_SU_PRIM_FILTER_CNTL optimally

2018-10-02 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_gfx_cs.c |  1 +
 src/gallium/drivers/radeonsi/si_state.c  | 15 +++
 src/gallium/drivers/radeonsi/si_state.h  |  1 +
 3 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_gfx_cs.c 
b/src/gallium/drivers/radeonsi/si_gfx_cs.c
index 5a6f7bb35cb..c458b68d846 100644
--- a/src/gallium/drivers/radeonsi/si_gfx_cs.c
+++ b/src/gallium/drivers/radeonsi/si_gfx_cs.c
@@ -332,20 +332,21 @@ void si_begin_new_gfx_cs(struct si_context *ctx)
ctx->tracked_regs.reg_value[SI_TRACKED_DB_SHADER_CONTROL] = 
0x;
ctx->tracked_regs.reg_value[SI_TRACKED_CB_TARGET_MASK] = 
0x;
ctx->tracked_regs.reg_value[SI_TRACKED_CB_DCC_CONTROL] = 
0x;
ctx->tracked_regs.reg_value[SI_TRACKED_SX_PS_DOWNCONVERT] = 
0x;
ctx->tracked_regs.reg_value[SI_TRACKED_SX_BLEND_OPT_EPSILON] = 
0x;
ctx->tracked_regs.reg_value[SI_TRACKED_SX_BLEND_OPT_CONTROL] = 
0x;
ctx->tracked_regs.reg_value[SI_TRACKED_PA_SC_LINE_CNTL] = 
0x1000;
ctx->tracked_regs.reg_value[SI_TRACKED_PA_SC_AA_CONFIG] = 
0x;
ctx->tracked_regs.reg_value[SI_TRACKED_DB_EQAA] = 0x;
ctx->tracked_regs.reg_value[SI_TRACKED_PA_SC_MODE_CNTL_1] = 
0x;
+   ctx->tracked_regs.reg_value[SI_TRACKED_PA_SU_PRIM_FILTER_CNTL] 
= 0;

ctx->tracked_regs.reg_value[SI_TRACKED_PA_SU_SMALL_PRIM_FILTER_CNTL] = 
0x;
ctx->tracked_regs.reg_value[SI_TRACKED_PA_CL_VS_OUT_CNTL] = 
0x;
ctx->tracked_regs.reg_value[SI_TRACKED_PA_CL_CLIP_CNTL] = 
0x0009;
ctx->tracked_regs.reg_value[SI_TRACKED_PA_SC_BINNER_CNTL_0] = 
0x0003;
ctx->tracked_regs.reg_value[SI_TRACKED_DB_DFSM_CONTROL] = 
0x;
ctx->tracked_regs.reg_value[SI_TRACKED_PA_CL_GB_VERT_CLIP_ADJ]  
= 0x3f80;
ctx->tracked_regs.reg_value[SI_TRACKED_PA_CL_GB_VERT_DISC_ADJ]  
= 0x3f80;
ctx->tracked_regs.reg_value[SI_TRACKED_PA_CL_GB_HORZ_CLIP_ADJ]  
= 0x3f80;
ctx->tracked_regs.reg_value[SI_TRACKED_PA_CL_GB_HORZ_DISC_ADJ]  
= 0x3f80;

ctx->tracked_regs.reg_value[SI_TRACKED_PA_SU_HARDWARE_SCREEN_OFFSET] = 0;
diff --git a/src/gallium/drivers/radeonsi/si_state.c 
b/src/gallium/drivers/radeonsi/si_state.c
index 8940f78cb54..0cebd974d80 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -3236,20 +3236,21 @@ static void si_emit_framebuffer_state(struct si_context 
*sctx)
radeon_emit(cs, EVENT_TYPE(V_028A90_BREAK_BATCH) | 
EVENT_INDEX(0));
}
 
sctx->framebuffer.dirty_cbufs = 0;
sctx->framebuffer.dirty_zsbuf = false;
 }
 
 static void si_emit_msaa_sample_locs(struct si_context *sctx)
 {
struct radeon_cmdbuf *cs = sctx->gfx_cs;
+   struct si_state_rasterizer *rs = sctx->queued.named.rasterizer;
unsigned nr_samples = sctx->framebuffer.nr_samples;
bool has_msaa_sample_loc_bug = sctx->screen->has_msaa_sample_loc_bug;
 
/* Smoothing (only possible with nr_samples == 1) uses the same
 * sample locations as the MSAA it simulates.
 */
if (nr_samples <= 1 && sctx->smoothing_enabled)
nr_samples = SI_NUM_SMOOTH_AA_SAMPLES;
 
/* On Polaris, the small primitive filter uses the sample locations
@@ -3257,40 +3258,49 @@ static void si_emit_msaa_sample_locs(struct si_context 
*sctx)
 */
if (has_msaa_sample_loc_bug)
nr_samples = MAX2(nr_samples, 1);
 
if (nr_samples != sctx->sample_locs_num_samples) {
sctx->sample_locs_num_samples = nr_samples;
si_emit_sample_locations(cs, nr_samples);
}
 
if (sctx->family >= CHIP_POLARIS10) {
-   struct si_state_rasterizer *rs = sctx->queued.named.rasterizer;
unsigned small_prim_filter_cntl =
S_028830_SMALL_PRIM_FILTER_ENABLE(1) |
/* line bug */
S_028830_LINE_FILTER_DISABLE(sctx->family <= 
CHIP_POLARIS12);
 
/* The alternative of setting sample locations to 0 would
 * require a DB flush to avoid Z errors, see
 * https://bugs.freedesktop.org/show_bug.cgi?id=96908
 */
if (has_msaa_sample_loc_bug &&
sctx->framebuffer.nr_samples > 1 &&
!rs->multisample_enable)
small_prim_filter_cntl &= 
C_028830_SMALL_PRIM_FILTER_ENABLE;
 
radeon_opt_set_context_reg(sctx,
   
R_028830_PA_SU_SMALL_PRIM_FILTER_CNTL,
   

[Mesa-dev] [PATCH 08/15] radeonsi: add GDS support to CP DMA

2018-10-02 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_cp_dma.c | 104 ++-
 src/gallium/drivers/radeonsi/si_pipe.c   |   4 +
 src/gallium/drivers/radeonsi/si_pipe.h   |   2 +
 3 files changed, 89 insertions(+), 21 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_cp_dma.c 
b/src/gallium/drivers/radeonsi/si_cp_dma.c
index e85bb9b1acf..c1ecd5fb3e8 100644
--- a/src/gallium/drivers/radeonsi/si_cp_dma.c
+++ b/src/gallium/drivers/radeonsi/si_cp_dma.c
@@ -32,22 +32,24 @@
 #define CP_DMA_CLEAR_PERF_THRESHOLD(32 * 1024) /* guess (clear is much 
slower) */
 
 /* Set this if you want the ME to wait until CP DMA is done.
  * It should be set on the last CP DMA packet. */
 #define CP_DMA_SYNC(1 << 0)
 
 /* Set this if the source data was used as a destination in a previous CP DMA
  * packet. It's for preventing a read-after-write (RAW) hazard between two
  * CP DMA packets. */
 #define CP_DMA_RAW_WAIT(1 << 1)
+#define CP_DMA_DST_IS_GDS  (1 << 2)
 #define CP_DMA_CLEAR   (1 << 3)
 #define CP_DMA_PFP_SYNC_ME (1 << 4)
+#define CP_DMA_SRC_IS_GDS  (1 << 5)
 
 /* The max number of bytes that can be copied per packet. */
 static inline unsigned cp_dma_max_byte_count(struct si_context *sctx)
 {
unsigned max = sctx->chip_class >= GFX9 ?
   S_414_BYTE_COUNT_GFX9(~0u) :
   S_414_BYTE_COUNT_GFX6(~0u);
 
/* make it aligned for optimal performance */
return max & ~(SI_CPDMA_ALIGNMENT - 1);
@@ -83,27 +85,37 @@ static void si_emit_cp_dma(struct si_context *sctx, 
uint64_t dst_va,
command |= S_414_DISABLE_WR_CONFIRM_GFX6(1);
}
 
if (flags & CP_DMA_RAW_WAIT)
command |= S_414_RAW_WAIT(1);
 
/* Src and dst flags. */
if (sctx->chip_class >= GFX9 && !(flags & CP_DMA_CLEAR) &&
src_va == dst_va) {
header |= S_411_DST_SEL(V_411_NOWHERE); /* prefetch only */
+   } else if (flags & CP_DMA_DST_IS_GDS) {
+   header |= S_411_DST_SEL(V_411_GDS);
+   /* GDS increments the address, not CP. */
+   command |= S_414_DAS(V_414_REGISTER) |
+  S_414_DAIC(V_414_NO_INCREMENT);
} else if (sctx->chip_class >= CIK && cache_policy != L2_BYPASS) {
header |= S_411_DST_SEL(V_411_DST_ADDR_TC_L2) |
  S_500_DST_CACHE_POLICY(cache_policy == L2_STREAM);
}
 
if (flags & CP_DMA_CLEAR) {
header |= S_411_SRC_SEL(V_411_DATA);
+   } else if (flags & CP_DMA_SRC_IS_GDS) {
+   header |= S_411_SRC_SEL(V_411_GDS);
+   /* Both of these are required for GDS. It does increment the 
address. */
+   command |= S_414_SAS(V_414_REGISTER) |
+  S_414_SAIC(V_414_NO_INCREMENT);
} else if (sctx->chip_class >= CIK && cache_policy != L2_BYPASS) {
header |= S_411_SRC_SEL(V_411_SRC_ADDR_TC_L2) |
  S_500_SRC_CACHE_POLICY(cache_policy == L2_STREAM);
}
 
if (sctx->chip_class >= CIK) {
radeon_emit(cs, PKT3(PKT3_DMA_DATA, 5, 0));
radeon_emit(cs, header);
radeon_emit(cs, src_va);/* SRC_ADDR_LO [31:0] */
radeon_emit(cs, src_va >> 32);  /* SRC_ADDR_HI [31:0] */
@@ -179,33 +191,35 @@ static void si_cp_dma_prepare(struct si_context *sctx, 
struct pipe_resource *dst
  unsigned *packet_flags)
 {
/* Fast exit for a CPDMA prefetch. */
if ((user_flags & SI_CPDMA_SKIP_ALL) == SI_CPDMA_SKIP_ALL) {
*is_first = false;
return;
}
 
if (!(user_flags & SI_CPDMA_SKIP_BO_LIST_UPDATE)) {
/* Count memory usage in so that need_cs_space can take it into 
account. */
-   si_context_add_resource_size(sctx, dst);
+   if (dst)
+   si_context_add_resource_size(sctx, dst);
if (src)
si_context_add_resource_size(sctx, src);
}
 
if (!(user_flags & SI_CPDMA_SKIP_CHECK_CS_SPACE))
si_need_gfx_cs_space(sctx);
 
/* This must be done after need_cs_space. */
if (!(user_flags & SI_CPDMA_SKIP_BO_LIST_UPDATE)) {
-   radeon_add_to_buffer_list(sctx, sctx->gfx_cs,
- r600_resource(dst),
- RADEON_USAGE_WRITE, 
RADEON_PRIO_CP_DMA);
+   if (dst)
+   radeon_add_to_buffer_list(sctx, sctx->gfx_cs,
+ r600_resource(dst),
+ RADEON_USAGE_WRITE, 
RADEON_PRIO_CP_DMA);
if (src)
radeon_add_to_buffer_list(sctx, sctx->gfx_cs,
  

[Mesa-dev] [PATCH 09/15] radeonsi: switch back to standard DX sample positions

2018-10-02 Thread Marek Olšák
From: Marek Olšák 

Apps may rely on them.
---
 src/gallium/drivers/radeonsi/si_state_msaa.c | 43 
 1 file changed, 26 insertions(+), 17 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_state_msaa.c 
b/src/gallium/drivers/radeonsi/si_state_msaa.c
index 10232a5e18b..f9387e75ed1 100644
--- a/src/gallium/drivers/radeonsi/si_state_msaa.c
+++ b/src/gallium/drivers/radeonsi/si_state_msaa.c
@@ -80,58 +80,67 @@
  * Groups of 8 samples in the same vicinity in 16x MSAA:
  *   Left half:  {0,2,4,6,8,10,12,14}
  *   Right half: {1,3,5,7,9,11,13,15}
  */
 
 /* 1x MSAA */
 static const uint32_t sample_locs_1x =
FILL_SREG( 0, 0,   0, 0,   0, 0,   0, 0); /* S1, S2, S3 fields are not 
used by 1x */
 static const uint64_t centroid_priority_1x = 0xull;
 
-/* 2x MSAA */
+/* 2x MSAA (the positions are sorted for EQAA) */
 static const uint32_t sample_locs_2x =
FILL_SREG(-4,-4,   4, 4,   0, 0,   0, 0); /* S2 & S3 fields are not 
used by 2x MSAA */
 static const uint64_t centroid_priority_2x = 0x1010101010101010ull;
 
-/* 4x, 8x, and 16x MSAA
- * - The first 4 locations happen to be optimal for 4x MSAA, better than
- *   the standard DX 4x locations.
- * - The first 8 locations happen to be almost as good as 8x DX locations,
- *   but the DX locations are horrible for worst-case EQAA 8s4f and 8s2f.
- */
-static const uint32_t sample_locs_4x_8x_16x[] = {
+/* 4x MSAA (the positions are sorted for EQAA) */
+static const uint32_t sample_locs_4x =
+   FILL_SREG(-2,-6,   2, 6,   -6, 2,  6,-2);
+static const uint64_t centroid_priority_4x = 0x3210321032103210ull;
+
+/* 8x MSAA (the positions are sorted for EQAA) */
+static const uint32_t sample_locs_8x[] = {
+   FILL_SREG(-3,-5,   5, 1,  -1, 3,   7,-7),
+   FILL_SREG(-7,-1,   3, 7,  -5, 5,   1,-3),
+};
+static const uint64_t centroid_priority_8x = 0x3546012735460127ull;
+
+/* 16x MSAA (the positions are sorted for EQAA) */
+static const uint32_t sample_locs_16x[] = {
FILL_SREG(-5,-2,   5, 3,  -2, 6,   3,-5),
-   FILL_SREG(-6,-7,   1, 1,  -6, 4,   7,-3),
+   FILL_SREG(-4,-6,   1, 1,  -6, 4,   7,-4),
FILL_SREG(-1,-3,   6, 7,  -3, 2,   0,-7),
-   FILL_SREG(-4,-6,   2, 5,  -8, 0,   4,-1),
+   FILL_SREG(-7,-8,   2, 5,  -8, 0,   4,-1),
 };
-static const uint64_t centroid_priority_4x = 0x2310231023102310ull;
-static const uint64_t centroid_priority_8x = 0x4762310547623105ull;
-static const uint64_t centroid_priority_16x = 0x49e7c6b231d0fa85ull;
+static const uint64_t centroid_priority_16x = 0xc97e64b231d0fa85ull;
 
 static void si_get_sample_position(struct pipe_context *ctx, unsigned 
sample_count,
   unsigned sample_index, float *out_value)
 {
const uint32_t *sample_locs;
 
switch (sample_count) {
case 1:
default:
sample_locs = &sample_locs_1x;
break;
case 2:
sample_locs = &sample_locs_2x;
break;
case 4:
+   sample_locs = &sample_locs_4x;
+   break;
case 8:
+   sample_locs = sample_locs_8x;
+   break;
case 16:
-   sample_locs = sample_locs_4x_8x_16x;
+   sample_locs = sample_locs_16x;
break;
}
 
out_value[0] = (GET_SX(sample_locs, sample_index) + 8) / 16.0f;
out_value[1] = (GET_SY(sample_locs, sample_index) + 8) / 16.0f;
 }
 
 static void si_emit_max_4_sample_locs(struct radeon_cmdbuf *cs,
  uint64_t centroid_priority,
  uint32_t sample_locs)
@@ -165,27 +174,27 @@ void si_emit_sample_locations(struct radeon_cmdbuf *cs, 
int nr_samples)
 {
switch (nr_samples) {
default:
case 1:
si_emit_max_4_sample_locs(cs, centroid_priority_1x, 
sample_locs_1x);
break;
case 2:
si_emit_max_4_sample_locs(cs, centroid_priority_2x, 
sample_locs_2x);
break;
case 4:
-   si_emit_max_4_sample_locs(cs, centroid_priority_4x, 
sample_locs_4x_8x_16x[0]);
+   si_emit_max_4_sample_locs(cs, centroid_priority_4x, 
sample_locs_4x);
break;
case 8:
-   si_emit_max_16_sample_locs(cs, centroid_priority_8x, 
sample_locs_4x_8x_16x, 8);
+   si_emit_max_16_sample_locs(cs, centroid_priority_8x, 
sample_locs_8x, 8);
break;
case 16:
-   si_emit_max_16_sample_locs(cs, centroid_priority_16x, 
sample_locs_4x_8x_16x, 16);
+   si_emit_max_16_sample_locs(cs, centroid_priority_16x, 
sample_locs_16x, 16);
break;
}
 }
 
 void si_init_msaa_functions(struct si_context *sctx)
 {
int i;
 
sctx->b.get_sample_position = si_get_sample_position;
 
-- 
2.17.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
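
For anyone checking the new tables in 09/15 by hand: si_get_sample_position turns each packed offset into a normalized position with (offset + 8) / 16.0f. The FILL_SREG and GET_SX/GET_SY macros are not part of the quoted hunk, so the sketch below assumes eight signed 4-bit fields per dword in the order x0, y0, x1, y1, and so on. The helper names are hypothetical and only meant to reproduce the arithmetic.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: eight 4-bit two's-complement fields per dword,
 * field 0 in the lowest nibble. */
static uint32_t pack_sreg(const int v[8])
{
	uint32_t reg = 0;

	for (int i = 0; i < 8; i++)
		reg |= ((uint32_t)v[i] & 0xf) << (i * 4);
	return reg;
}

/* Extract one field and sign-extend it from 4 bits. */
static int get_sfield(uint32_t reg, unsigned field)
{
	assert(field < 8);
	int v = (reg >> (field * 4)) & 0xf;
	return v >= 8 ? v - 16 : v;
}

/* Same arithmetic as si_get_sample_position: offsets in [-8, 7]
 * map to positions in [0, 1) in steps of 1/16. */
static void sample_position(const uint32_t *locs, unsigned index, float out[2])
{
	uint32_t reg = locs[index / 4];   /* four samples per dword */
	unsigned field = (index % 4) * 2; /* x field, y follows it */

	out[0] = (get_sfield(reg, field) + 8) / 16.0f;
	out[1] = (get_sfield(reg, field + 1) + 8) / 16.0f;
}
```

With the 4x table from the patch, FILL_SREG(-2,-6, 2,6, -6,2, 6,-2), sample 0 lands at (0.375, 0.125) and sample 3 at (0.875, 0.375) under this assumed layout.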

[Mesa-dev] [PATCH 10/15] radeonsi: save raster config in screen, add se_tile_repeat

2018-10-02 Thread Marek Olšák
From: Marek Olšák 

---
 src/amd/common/ac_gpu_info.c| 13 +++--
 src/amd/common/ac_gpu_info.h|  3 ++-
 src/amd/vulkan/si_cmd_buffer.c  |  2 +-
 src/gallium/drivers/radeonsi/si_pipe.c  |  9 +
 src/gallium/drivers/radeonsi/si_pipe.h  |  3 +++
 src/gallium/drivers/radeonsi/si_state.c | 12 +---
 6 files changed, 31 insertions(+), 11 deletions(-)

diff --git a/src/amd/common/ac_gpu_info.c b/src/amd/common/ac_gpu_info.c
index 766ad835476..d6df2f6443e 100644
--- a/src/amd/common/ac_gpu_info.c
+++ b/src/amd/common/ac_gpu_info.c
@@ -636,23 +636,24 @@ ac_get_gs_table_depth(enum chip_class chip_class, enum 
radeon_family family)
case CHIP_VEGAM:
return 32;
default:
unreachable("Unknown GPU");
}
 }
 
 void
 ac_get_raster_config(struct radeon_info *info,
 uint32_t *raster_config_p,
-uint32_t *raster_config_1_p)
+uint32_t *raster_config_1_p,
+uint32_t *se_tile_repeat_p)
 {
-   unsigned raster_config, raster_config_1;
+   unsigned raster_config, raster_config_1, se_tile_repeat;
 
switch (info->family) {
/* 1 SE / 1 RB */
case CHIP_HAINAN:
case CHIP_KABINI:
case CHIP_MULLINS:
case CHIP_STONEY:
raster_config = 0x;
raster_config_1 = 0x;
break;
@@ -715,22 +716,30 @@ ac_get_raster_config(struct radeon_info *info,
 
/* Fiji: Old kernels have incorrect tiling config. This decreases
 * RB performance by 25%. (it disables 1 RB in the second packer)
 */
if (info->family == CHIP_FIJI &&
info->cik_macrotile_mode_array[0] == 0x00e8) {
raster_config = 0x1612;
raster_config_1 = 0x002a;
}
 
+   unsigned se_width = 8 << G_028350_SE_XSEL_GFX6(raster_config);
+   unsigned se_height = 8 << G_028350_SE_YSEL_GFX6(raster_config);
+
+   /* I don't know how to calculate this, though this is probably a good 
guess. */
+   se_tile_repeat = MAX2(se_width, se_height) * info->max_se;
+
*raster_config_p = raster_config;
*raster_config_1_p = raster_config_1;
+   if (se_tile_repeat_p)
+   *se_tile_repeat_p = se_tile_repeat;
 }
 
 void
 ac_get_harvested_configs(struct radeon_info *info,
 unsigned raster_config,
 unsigned *cik_raster_config_1_p,
 unsigned *raster_config_se)
 {
unsigned sh_per_se = MAX2(info->max_sh_per_se, 1);
unsigned num_se = MAX2(info->max_se, 1);
diff --git a/src/amd/common/ac_gpu_info.h b/src/amd/common/ac_gpu_info.h
index 0583a6037f2..a7dc1094c05 100644
--- a/src/amd/common/ac_gpu_info.h
+++ b/src/amd/common/ac_gpu_info.h
@@ -143,21 +143,22 @@ bool ac_query_gpu_info(int fd, amdgpu_device_handle dev,
   struct radeon_info *info,
   struct amdgpu_gpu_info *amdinfo);
 
 void ac_compute_driver_uuid(char *uuid, size_t size);
 
 void ac_compute_device_uuid(struct radeon_info *info, char *uuid, size_t size);
 void ac_print_gpu_info(struct radeon_info *info);
 int ac_get_gs_table_depth(enum chip_class chip_class, enum radeon_family 
family);
 void ac_get_raster_config(struct radeon_info *info,
  uint32_t *raster_config_p,
- uint32_t *raster_config_1_p);
+ uint32_t *raster_config_1_p,
+ uint32_t *se_tile_repeat_p);
 void ac_get_harvested_configs(struct radeon_info *info,
  unsigned raster_config,
  unsigned *cik_raster_config_1_p,
  unsigned *raster_config_se);
 
 static inline unsigned ac_get_max_simd_waves(enum radeon_family family)
 {
 
switch (family) {
/* These always have 8 waves: */
diff --git a/src/amd/vulkan/si_cmd_buffer.c b/src/amd/vulkan/si_cmd_buffer.c
index e0d474756a3..de057657ee7 100644
--- a/src/amd/vulkan/si_cmd_buffer.c
+++ b/src/amd/vulkan/si_cmd_buffer.c
@@ -127,21 +127,21 @@ static unsigned radv_pack_float_12p4(float x)
 static void
 si_set_raster_config(struct radv_physical_device *physical_device,
 struct radeon_cmdbuf *cs)
 {
unsigned num_rb = MIN2(physical_device->rad_info.num_render_backends, 
16);
unsigned rb_mask = physical_device->rad_info.enabled_rb_mask;
unsigned raster_config, raster_config_1;
 
ac_get_raster_config(&physical_device->rad_info,
 &raster_config,
-&raster_config_1);
+&raster_config_1, NULL);
 
/* Always use the default config when all backends are enabled
 * (or when we failed to determine the enabled backends).
 */
if (!rb_mask || util_bitcount(rb_mask) >= num_rb) {
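
A small worked version of the se_tile_repeat computation added to ac_get_raster_config in 10/15, for reference. The patch itself calls the formula a guess, and this sketch does not claim more than that. It takes the SE_XSEL/SE_YSEL field values as plain parameters rather than decoding them from raster_config, and the 2-bit range check is an assumption about those fields; the function name is hypothetical.

```c
#include <assert.h>

#define MAX2(a, b) ((a) > (b) ? (a) : (b))

/* se_xsel / se_ysel: already-extracted field values (assumed 0..3).
 * max_se: number of shader engines, as in info->max_se. */
static unsigned se_tile_repeat(unsigned se_xsel, unsigned se_ysel,
			       unsigned max_se)
{
	assert(se_xsel < 4 && se_ysel < 4);

	unsigned se_width = 8u << se_xsel;
	unsigned se_height = 8u << se_ysel;

	/* Larger SE tile dimension times the number of SEs. */
	return MAX2(se_width, se_height) * max_se;
}
```

For example, se_xsel = 1 and se_ysel = 2 with 4 shader engines gives MAX2(16, 32) * 4 = 128.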

[Mesa-dev] [PATCH 07/15] radeonsi: rename si_gfx_* functions to si_cp_*

2018-10-02 Thread Marek Olšák
From: Marek Olšák 

and write_event_eop -> release_mem
---
 src/amd/common/sid.h  |  1 +
 src/gallium/drivers/radeonsi/si_fence.c   | 32 +--
 src/gallium/drivers/radeonsi/si_perfcounter.c | 14 
 src/gallium/drivers/radeonsi/si_pipe.h| 16 +-
 src/gallium/drivers/radeonsi/si_query.c   | 32 +--
 src/gallium/drivers/radeonsi/si_state_draw.c  | 24 +++---
 6 files changed, 60 insertions(+), 59 deletions(-)

diff --git a/src/amd/common/sid.h b/src/amd/common/sid.h
index 3e36eb2d046..69b532177ac 100644
--- a/src/amd/common/sid.h
+++ b/src/amd/common/sid.h
@@ -139,20 +139,21 @@
 #define   V_370_MEM_ASYNC  5
 #define   R_371_DST_ADDR_LO0x371
 #define   R_372_DST_ADDR_HI0x372
 #define PKT3_DRAW_INDEX_INDIRECT_MULTI 0x38
 #define PKT3_MEM_SEMAPHORE 0x39
 #define PKT3_MPEG_INDEX0x3A /* not on CIK */
 #define PKT3_WAIT_REG_MEM  0x3C
 #defineWAIT_REG_MEM_EQUAL  3
 #defineWAIT_REG_MEM_NOT_EQUAL  4
 #define WAIT_REG_MEM_MEM_SPACE(x)   (((unsigned)(x) & 0x3) << 4)
+#define WAIT_REG_MEM_PFP   (1 << 8)
 #define PKT3_MEM_WRITE 0x3D /* not on CIK */
 #define PKT3_INDIRECT_BUFFER_CIK   0x3F /* new on CIK */
 #define   R_3F0_IB_BASE_LO 0x3F0
 #define   R_3F1_IB_BASE_HI 0x3F1
 #define   R_3F2_CONTROL0x3F2
 #define S_3F2_IB_SIZE(x)   (((unsigned)(x) & 0xf) << 0)
 #define G_3F2_IB_SIZE(x)   (((unsigned)(x) >> 0) & 0xf)
 #define S_3F2_CHAIN(x) (((unsigned)(x) & 0x1) << 20)
 #define G_3F2_CHAIN(x) (((unsigned)(x) >> 20) & 0x1)
 #define S_3F2_VALID(x) (((unsigned)(x) & 0x1) << 23)
diff --git a/src/gallium/drivers/radeonsi/si_fence.c 
b/src/gallium/drivers/radeonsi/si_fence.c
index 005fd9c1576..d1aa4544578 100644
--- a/src/gallium/drivers/radeonsi/si_fence.c
+++ b/src/gallium/drivers/radeonsi/si_fence.c
@@ -57,25 +57,25 @@ struct si_multi_fence {
  * Write an EOP event.
  *
  * \param eventEVENT_TYPE_*
  * \param event_flags  Optional cache flush flags (TC)
  * \param data_sel 1 = fence, 3 = timestamp
  * \param buf  Buffer
  * \param va   GPU address
  * \param old_valuePrevious fence value (for a bug workaround)
  * \param new_valueFence value to write for this event.
  */
-void si_gfx_write_event_eop(struct si_context *ctx,
-   unsigned event, unsigned event_flags,
-   unsigned dst_sel, unsigned int_sel, unsigned 
data_sel,
-   struct r600_resource *buf, uint64_t va,
-   uint32_t new_fence, unsigned query_type)
+void si_cp_release_mem(struct si_context *ctx,
+  unsigned event, unsigned event_flags,
+  unsigned dst_sel, unsigned int_sel, unsigned data_sel,
+  struct r600_resource *buf, uint64_t va,
+  uint32_t new_fence, unsigned query_type)
 {
struct radeon_cmdbuf *cs = ctx->gfx_cs;
unsigned op = EVENT_TYPE(event) |
  EVENT_INDEX(event == V_028A90_CS_DONE ||
  event == V_028A90_PS_DONE ? 6 : 5) |
  event_flags;
unsigned sel = EOP_DST_SEL(dst_sel) |
   EOP_INT_SEL(int_sel) |
   EOP_DATA_SEL(data_sel);
 
@@ -140,38 +140,38 @@ void si_gfx_write_event_eop(struct si_context *ctx,
radeon_emit(cs, new_fence); /* immediate data */
radeon_emit(cs, 0); /* unused */
}
 
if (buf) {
radeon_add_to_buffer_list(ctx, ctx->gfx_cs, buf, 
RADEON_USAGE_WRITE,
  RADEON_PRIO_QUERY);
}
 }
 
-unsigned si_gfx_write_fence_dwords(struct si_screen *screen)
+unsigned si_cp_write_fence_dwords(struct si_screen *screen)
 {
unsigned dwords = 6;
 
if (screen->info.chip_class == CIK ||
screen->info.chip_class == VI)
dwords *= 2;
 
return dwords;
 }
 
-void si_gfx_wait_fence(struct si_context *ctx,
-  uint64_t va, uint32_t ref, uint32_t mask)
+void si_cp_wait_mem(struct si_context *ctx,
+   uint64_t va, uint32_t ref, uint32_t mask, unsigned flags)
 {
struct radeon_cmdbuf *cs = ctx->gfx_cs;
 
radeon_emit(cs, PKT3(PKT3_WAIT_REG_MEM, 5, 0));
-   radeon_emit(cs, WAIT_REG_MEM_EQUAL | WAIT_REG_MEM_MEM_SPACE(1));
+   radeon_emit(cs, WAIT_REG_MEM_EQUAL | WAIT_REG_MEM_MEM_SPACE(1) | flags);
radeon_emit(cs, va);
radeon_emit(cs, va >> 32);
radeon_emit(cs, ref); /* 

[Mesa-dev] [PATCH 01/15] ac: define all address spaces properly

2018-10-02 Thread Marek Olšák
From: Marek Olšák 

---
 src/amd/common/ac_llvm_build.c   | 10 +-
 src/amd/common/ac_llvm_build.h   | 10 ++
 src/amd/common/ac_nir_to_llvm.c  |  2 +-
 src/amd/vulkan/radv_nir_to_llvm.c|  2 +-
 src/gallium/drivers/radeonsi/si_shader.c |  6 +++---
 5 files changed, 16 insertions(+), 14 deletions(-)

diff --git a/src/amd/common/ac_llvm_build.c b/src/amd/common/ac_llvm_build.c
index dcc6beb631f..4cbf599d946 100644
--- a/src/amd/common/ac_llvm_build.c
+++ b/src/amd/common/ac_llvm_build.c
@@ -177,21 +177,21 @@ ac_get_type_size(LLVMTypeRef type)
switch (kind) {
case LLVMIntegerTypeKind:
return LLVMGetIntTypeWidth(type) / 8;
case LLVMHalfTypeKind:
return 2;
case LLVMFloatTypeKind:
return 4;
case LLVMDoubleTypeKind:
return 8;
case LLVMPointerTypeKind:
-   if (LLVMGetPointerAddressSpace(type) == 
AC_CONST_32BIT_ADDR_SPACE)
+   if (LLVMGetPointerAddressSpace(type) == 
AC_ADDR_SPACE_CONST_32BIT)
return 4;
return 8;
case LLVMVectorTypeKind:
return LLVMGetVectorSize(type) *
   ac_get_type_size(LLVMGetElementType(type));
case LLVMArrayTypeKind:
return LLVMGetArrayLength(type) *
   ac_get_type_size(LLVMGetElementType(type));
default:
assert(0);
@@ -945,21 +945,21 @@ ac_build_indexed_store(struct ac_llvm_context *ctx,
  */
 static LLVMValueRef
 ac_build_load_custom(struct ac_llvm_context *ctx, LLVMValueRef base_ptr,
 LLVMValueRef index, bool uniform, bool invariant,
 bool no_unsigned_wraparound)
 {
LLVMValueRef pointer, result;
LLVMValueRef indices[2] = {ctx->i32_0, index};
 
if (no_unsigned_wraparound &&
-   LLVMGetPointerAddressSpace(LLVMTypeOf(base_ptr)) == 
AC_CONST_32BIT_ADDR_SPACE)
+   LLVMGetPointerAddressSpace(LLVMTypeOf(base_ptr)) == 
AC_ADDR_SPACE_CONST_32BIT)
pointer = LLVMBuildInBoundsGEP(ctx->builder, base_ptr, indices, 
2, "");
else
pointer = LLVMBuildGEP(ctx->builder, base_ptr, indices, 2, "");
 
if (uniform)
LLVMSetMetadata(pointer, ctx->uniform_md_kind, ctx->empty_md);
result = LLVMBuildLoad(ctx->builder, pointer, "");
if (invariant)
LLVMSetMetadata(result, ctx->invariant_load_md_kind, 
ctx->empty_md);
return result;
@@ -2536,21 +2536,21 @@ void ac_init_exec_full_mask(struct ac_llvm_context *ctx)
LLVMValueRef full_mask = LLVMConstInt(ctx->i64, ~0ull, 0);
ac_build_intrinsic(ctx,
   "llvm.amdgcn.init.exec", ctx->voidt,
    &full_mask, 1, AC_FUNC_ATTR_CONVERGENT);
 }
 
 void ac_declare_lds_as_pointer(struct ac_llvm_context *ctx)
 {
unsigned lds_size = ctx->chip_class >= CIK ? 65536 : 32768;
ctx->lds = LLVMBuildIntToPtr(ctx->builder, ctx->i32_0,
-LLVMPointerType(LLVMArrayType(ctx->i32, 
lds_size / 4), AC_LOCAL_ADDR_SPACE),
+LLVMPointerType(LLVMArrayType(ctx->i32, 
lds_size / 4), AC_ADDR_SPACE_LDS),
 "lds");
 }
 
 LLVMValueRef ac_lds_load(struct ac_llvm_context *ctx,
 LLVMValueRef dw_addr)
 {
return ac_build_load(ctx, ctx->lds, dw_addr);
 }
 
 void ac_lds_store(struct ac_llvm_context *ctx,
@@ -2618,30 +2618,30 @@ LLVMValueRef ac_find_lsb(struct ac_llvm_context *ctx,
/* Check for zero: */
return LLVMBuildSelect(ctx->builder, LLVMBuildICmp(ctx->builder,
   LLVMIntEQ, src0,
   zero, ""),
   LLVMConstInt(ctx->i32, -1, 0), lsb, "");
 }
 
 LLVMTypeRef ac_array_in_const_addr_space(LLVMTypeRef elem_type)
 {
return LLVMPointerType(LLVMArrayType(elem_type, 0),
-  AC_CONST_ADDR_SPACE);
+  AC_ADDR_SPACE_CONST);
 }
 
 LLVMTypeRef ac_array_in_const32_addr_space(LLVMTypeRef elem_type)
 {
if (!HAVE_32BIT_POINTERS)
return ac_array_in_const_addr_space(elem_type);
 
return LLVMPointerType(LLVMArrayType(elem_type, 0),
-  AC_CONST_32BIT_ADDR_SPACE);
+  AC_ADDR_SPACE_CONST_32BIT);
 }
 
 static struct ac_llvm_flow *
 get_current_flow(struct ac_llvm_context *ctx)
 {
if (ctx->flow_depth > 0)
 return &ctx->flow[ctx->flow_depth - 1];
return NULL;
 }
 
diff --git a/src/amd/common/ac_llvm_build.h b/src/amd/common/ac_llvm_build.h
index 08f18435ddd..83aad02183e 100644
--- a/src/amd/common/ac_llvm_build.h
+++ b/src/amd/common/ac_llvm_build.h
@@ -30,24 +30,26 @@
 #include "compiler/nir/nir.h"
 

[Mesa-dev] [PATCH 04/15] ac: add ac_build_round

2018-10-02 Thread Marek Olšák
From: Marek Olšák 

---
 src/amd/common/ac_llvm_build.c| 19 +--
 src/amd/common/ac_llvm_build.h|  1 +
 src/amd/common/ac_nir_to_llvm.c   |  2 +-
 .../drivers/radeonsi/si_shader_tgsi_mem.c |  4 +---
 4 files changed, 20 insertions(+), 6 deletions(-)

diff --git a/src/amd/common/ac_llvm_build.c b/src/amd/common/ac_llvm_build.c
index fc6dc396d38..ed510a34d6f 100644
--- a/src/amd/common/ac_llvm_build.c
+++ b/src/amd/common/ac_llvm_build.c
@@ -551,20 +551,36 @@ LLVMValueRef ac_build_expand_to_vec4(struct 
ac_llvm_context *ctx,
}
elemtype = LLVMTypeOf(value);
}
 
while (num_channels < 4)
chan[num_channels++] = LLVMGetUndef(elemtype);
 
return ac_build_gather_values(ctx, chan, 4);
 }
 
+LLVMValueRef ac_build_round(struct ac_llvm_context *ctx, LLVMValueRef value)
+{
+   unsigned type_size = ac_get_type_size(LLVMTypeOf(value));
+   const char *name;
+
+   if (type_size == 2)
+   name = "llvm.rint.f16";
+   else if (type_size == 4)
+   name = "llvm.rint.f32";
+   else
+   name = "llvm.rint.f64";
+
+   return ac_build_intrinsic(ctx, name, LLVMTypeOf(value), &value, 1,
+ AC_FUNC_ATTR_READNONE);
+}
+
 LLVMValueRef
 ac_build_fdiv(struct ac_llvm_context *ctx,
  LLVMValueRef num,
  LLVMValueRef den)
 {
/* If we do (num / den), LLVM >= 7.0 does:
 *return num * v_rcp_f32(den * (fabs(den) > 0x1.0p+96f ? 0x1.0p-32f 
: 1.0f));
 *
 * If we do (num * (1 / den)), LLVM does:
 *return num * v_rcp_f32(den);
@@ -729,22 +745,21 @@ ac_prepare_cube_coords(struct ac_llvm_context *ctx,
   LLVMValueRef *coords_arg,
   LLVMValueRef *derivs_arg)
 {
 
LLVMBuilderRef builder = ctx->builder;
struct cube_selection_coords selcoords;
LLVMValueRef coords[3];
LLVMValueRef invma;
 
if (is_array && !is_lod) {
-   LLVMValueRef tmp = coords_arg[3];
-   tmp = ac_build_intrinsic(ctx, "llvm.rint.f32", ctx->f32, &tmp, 1, 0);
+   LLVMValueRef tmp = ac_build_round(ctx, coords_arg[3]);
 
/* Section 8.9 (Texture Functions) of the GLSL 4.50 spec says:
 *
 *"For Array forms, the array layer used will be
 *
 *   max(0, min(d−1, floor(layer+0.5)))
 *
 * where d is the depth of the texture array and layer
 * comes from the component indicated in the tables below.
 * Workaroudn for an issue where the layer is taken from a
diff --git a/src/amd/common/ac_llvm_build.h b/src/amd/common/ac_llvm_build.h
index 83aad02183e..32d62450dfe 100644
--- a/src/amd/common/ac_llvm_build.h
+++ b/src/amd/common/ac_llvm_build.h
@@ -168,20 +168,21 @@ ac_build_gather_values_extended(struct ac_llvm_context 
*ctx,
unsigned value_stride,
bool load,
bool always_vector);
 LLVMValueRef
 ac_build_gather_values(struct ac_llvm_context *ctx,
   LLVMValueRef *values,
   unsigned value_count);
 LLVMValueRef ac_build_expand_to_vec4(struct ac_llvm_context *ctx,
 LLVMValueRef value,
 unsigned num_channels);
+LLVMValueRef ac_build_round(struct ac_llvm_context *ctx, LLVMValueRef value);
 
 LLVMValueRef
 ac_build_fdiv(struct ac_llvm_context *ctx,
  LLVMValueRef num,
  LLVMValueRef den);
 
 LLVMValueRef ac_build_fast_udiv(struct ac_llvm_context *ctx,
LLVMValueRef num,
LLVMValueRef multiplier,
LLVMValueRef pre_shift,
diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index 312383db36c..ffc64a79d95 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -3304,21 +3304,21 @@ static void tex_fetch_ptrs(struct ac_nir_context *ctx,
}
if (fmask_ptr && (instr->op == nir_texop_txf_ms ||
  instr->op == nir_texop_samples_identical))
*fmask_ptr = get_sampler_desc(ctx, texture_deref_instr, 
AC_DESC_FMASK, instr, false, false);
 }
 
 static LLVMValueRef apply_round_slice(struct ac_llvm_context *ctx,
  LLVMValueRef coord)
 {
coord = ac_to_float(ctx, coord);
-   coord = ac_build_intrinsic(ctx, "llvm.rint.f32", ctx->f32, &coord, 1, 0);
+   coord = ac_build_round(ctx, coord);
coord = ac_to_integer(ctx, coord);
return coord;
 }
 
 static void visit_tex(struct ac_nir_context *ctx, nir_tex_instr *instr)
 {
LLVMValueRef result = NULL;
struct ac_image_args 

[Mesa-dev] [PATCH 03/15] ac: correct PKT3_COPY_DATA definitions

2018-10-02 Thread Marek Olšák
From: Marek Olšák 

---
 src/amd/common/sid.h  | 11 +--
 src/amd/vulkan/radv_cmd_buffer.c  |  6 +++---
 src/amd/vulkan/radv_query.c   |  8 
 src/gallium/drivers/radeonsi/si_compute.c |  2 +-
 src/gallium/drivers/radeonsi/si_perfcounter.c |  6 +++---
 src/gallium/drivers/radeonsi/si_query.c   |  2 +-
 src/gallium/drivers/radeonsi/si_state_draw.c  |  2 +-
 7 files changed, 22 insertions(+), 15 deletions(-)

diff --git a/src/amd/common/sid.h b/src/amd/common/sid.h
index d20b5484223..b3321ea3a77 100644
--- a/src/amd/common/sid.h
+++ b/src/amd/common/sid.h
@@ -153,28 +153,35 @@
 #define   R_3F2_CONTROL                0x3F2
 #define S_3F2_IB_SIZE(x)   (((unsigned)(x) & 0xf) << 0)
 #define G_3F2_IB_SIZE(x)   (((unsigned)(x) >> 0) & 0xf)
 #define S_3F2_CHAIN(x) (((unsigned)(x) & 0x1) << 20)
 #define G_3F2_CHAIN(x) (((unsigned)(x) >> 20) & 0x1)
 #define S_3F2_VALID(x) (((unsigned)(x) & 0x1) << 23)
 
 #define PKT3_COPY_DATA                 0x40
 #define   COPY_DATA_SRC_SEL(x)         ((x) & 0xf)
 #define     COPY_DATA_REG              0
-#define     COPY_DATA_MEM              1
+#define     COPY_DATA_SRC_MEM          1 /* only valid as source */
+#define     COPY_DATA_TC_L2            2
+#define     COPY_DATA_GDS              3
 #define     COPY_DATA_PERF             4
 #define     COPY_DATA_IMM              5
 #define     COPY_DATA_TIMESTAMP        9
 #define   COPY_DATA_DST_SEL(x)         (((unsigned)(x) & 0xf) << 8)
-#define     COPY_DATA_MEM_ASYNC        5
+#define     COPY_DATA_DST_MEM_GRBM     1 /* sync across GRBM, deprecated */
+#define     COPY_DATA_TC_L2            2
+#define     COPY_DATA_GDS              3
+#define     COPY_DATA_PERF             4
+#define     COPY_DATA_DST_MEM          5
 #define   COPY_DATA_COUNT_SEL          (1 << 16)
 #define   COPY_DATA_WR_CONFIRM         (1 << 20)
+#define   COPY_DATA_ENGINE_PFP         (1 << 30)
 #define PKT3_PFP_SYNC_ME               0x42
 #define PKT3_SURFACE_SYNC              0x43 /* deprecated on CIK, use ACQUIRE_MEM */
 #define PKT3_ME_INITIALIZE             0x44 /* not on CIK */
 #define PKT3_COND_WRITE                0x45
 #define PKT3_EVENT_WRITE               0x46
 #define PKT3_EVENT_WRITE_EOP           0x47 /* not on GFX9 */
 #define   EOP_INT_SEL(x)               ((x) << 24)
 #define     EOP_INT_SEL_NONE           0
 #define     EOP_INT_SEL_SEND_DATA_AFTER_WR_CONFIRM 3
 #define   EOP_DATA_SEL(x)              ((x) << 29)
diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index d492456d6b8..339704990e2 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -1290,21 +1290,21 @@ radv_load_ds_clear_metadata(struct radv_cmd_buffer 
*cmd_buffer,
if (aspects & VK_IMAGE_ASPECT_STENCIL_BIT) {
++reg_count;
} else {
++reg_offset;
va += 4;
}
if (aspects & VK_IMAGE_ASPECT_DEPTH_BIT)
++reg_count;
 
radeon_emit(cs, PKT3(PKT3_COPY_DATA, 4, 0));
-   radeon_emit(cs, COPY_DATA_SRC_SEL(COPY_DATA_MEM) |
+   radeon_emit(cs, COPY_DATA_SRC_SEL(COPY_DATA_SRC_MEM) |
COPY_DATA_DST_SEL(COPY_DATA_REG) |
(reg_count == 2 ? COPY_DATA_COUNT_SEL : 0));
radeon_emit(cs, va);
radeon_emit(cs, va >> 32);
radeon_emit(cs, (R_028028_DB_STENCIL_CLEAR + 4 * reg_offset) >> 2);
radeon_emit(cs, 0);
 
radeon_emit(cs, PKT3(PKT3_PFP_SYNC_ME, 0, 0));
radeon_emit(cs, 0);
 }
@@ -1420,21 +1420,21 @@ radv_load_color_clear_metadata(struct radv_cmd_buffer 
*cmd_buffer,
uint64_t va = radv_buffer_get_va(image->bo);
 
va += image->offset + image->clear_value_offset;
 
if (!radv_image_has_cmask(image) && !radv_image_has_dcc(image))
return;
 
uint32_t reg = R_028C8C_CB_COLOR0_CLEAR_WORD0 + cb_idx * 0x3c;
 
radeon_emit(cs, PKT3(PKT3_COPY_DATA, 4, cmd_buffer->state.predicating));
-   radeon_emit(cs, COPY_DATA_SRC_SEL(COPY_DATA_MEM) |
+   radeon_emit(cs, COPY_DATA_SRC_SEL(COPY_DATA_SRC_MEM) |
COPY_DATA_DST_SEL(COPY_DATA_REG) |
COPY_DATA_COUNT_SEL);
radeon_emit(cs, va);
radeon_emit(cs, va >> 32);
radeon_emit(cs, reg >> 2);
radeon_emit(cs, 0);
 
radeon_emit(cs, PKT3(PKT3_PFP_SYNC_ME, 0, 
cmd_buffer->state.predicating));

[Mesa-dev] [PATCH 02/15] ac: simplify LLVM alloca helpers

2018-10-02 Thread Marek Olšák
From: Marek Olšák 

---
 src/amd/common/ac_llvm_build.c | 11 ---
 1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/src/amd/common/ac_llvm_build.c b/src/amd/common/ac_llvm_build.c
index 4cbf599d946..fc6dc396d38 100644
--- a/src/amd/common/ac_llvm_build.c
+++ b/src/amd/common/ac_llvm_build.c
@@ -2801,50 +2801,47 @@ void ac_build_if(struct ac_llvm_context *ctx, 
LLVMValueRef value,
 
 void ac_build_uif(struct ac_llvm_context *ctx, LLVMValueRef value,
  int label_id)
 {
LLVMValueRef cond = LLVMBuildICmp(ctx->builder, LLVMIntNE,
  ac_to_integer(ctx, value),
  ctx->i32_0, "");
if_cond_emit(ctx, cond, label_id);
 }
 
-LLVMValueRef ac_build_alloca(struct ac_llvm_context *ac, LLVMTypeRef type,
+LLVMValueRef ac_build_alloca_undef(struct ac_llvm_context *ac, LLVMTypeRef 
type,
 const char *name)
 {
LLVMBuilderRef builder = ac->builder;
LLVMBasicBlockRef current_block = LLVMGetInsertBlock(builder);
LLVMValueRef function = LLVMGetBasicBlockParent(current_block);
LLVMBasicBlockRef first_block = LLVMGetEntryBasicBlock(function);
LLVMValueRef first_instr = LLVMGetFirstInstruction(first_block);
LLVMBuilderRef first_builder = LLVMCreateBuilderInContext(ac->context);
LLVMValueRef res;
 
if (first_instr) {
LLVMPositionBuilderBefore(first_builder, first_instr);
} else {
LLVMPositionBuilderAtEnd(first_builder, first_block);
}
 
res = LLVMBuildAlloca(first_builder, type, name);
-   LLVMBuildStore(builder, LLVMConstNull(type), res);
-
LLVMDisposeBuilder(first_builder);
-
return res;
 }
 
-LLVMValueRef ac_build_alloca_undef(struct ac_llvm_context *ac,
+LLVMValueRef ac_build_alloca(struct ac_llvm_context *ac,
   LLVMTypeRef type, const char *name)
 {
-   LLVMValueRef ptr = ac_build_alloca(ac, type, name);
-   LLVMBuildStore(ac->builder, LLVMGetUndef(type), ptr);
+   LLVMValueRef ptr = ac_build_alloca_undef(ac, type, name);
+   LLVMBuildStore(ac->builder, LLVMConstNull(type), ptr);
return ptr;
 }
 
 LLVMValueRef ac_cast_ptr(struct ac_llvm_context *ctx, LLVMValueRef ptr,
  LLVMTypeRef type)
 {
int addr_space = LLVMGetPointerAddressSpace(LLVMTypeOf(ptr));
return LLVMBuildBitCast(ctx->builder, ptr,
LLVMPointerType(type, addr_space), "");
 }
-- 
2.17.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 06/15] radeonsi: fix a typo at CS_PARTIAL_FLUSH

2018-10-02 Thread Marek Olšák
From: Marek Olšák 

harmless
---
 src/gallium/drivers/radeonsi/si_state_draw.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c 
b/src/gallium/drivers/radeonsi/si_state_draw.c
index 3d56d8e9ab4..81eb34d75e2 100644
--- a/src/gallium/drivers/radeonsi/si_state_draw.c
+++ b/src/gallium/drivers/radeonsi/si_state_draw.c
@@ -957,21 +957,21 @@ void si_emit_cache_flush(struct si_context *sctx)
} else if (flags & SI_CONTEXT_VS_PARTIAL_FLUSH) {
radeon_emit(cs, PKT3(PKT3_EVENT_WRITE, 0, 0));
radeon_emit(cs, EVENT_TYPE(V_028A90_VS_PARTIAL_FLUSH) | 
EVENT_INDEX(4));
sctx->num_vs_flushes++;
}
}
 
if (flags & SI_CONTEXT_CS_PARTIAL_FLUSH &&
sctx->compute_is_busy) {
radeon_emit(cs, PKT3(PKT3_EVENT_WRITE, 0, 0));
-   radeon_emit(cs, EVENT_TYPE(V_028A90_CS_PARTIAL_FLUSH | 
EVENT_INDEX(4)));
+   radeon_emit(cs, EVENT_TYPE(V_028A90_CS_PARTIAL_FLUSH) | 
EVENT_INDEX(4));
sctx->num_cs_flushes++;
sctx->compute_is_busy = false;
}
 
/* VGT state synchronization. */
if (flags & SI_CONTEXT_VGT_FLUSH) {
radeon_emit(cs, PKT3(PKT3_EVENT_WRITE, 0, 0));
radeon_emit(cs, EVENT_TYPE(V_028A90_VGT_FLUSH) | 
EVENT_INDEX(0));
}
if (flags & SI_CONTEXT_VGT_STREAMOUT_SYNC) {
-- 
2.17.1
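
The one-word commit message ("harmless") can be checked with a small sketch. It assumes EVENT_TYPE and EVENT_INDEX are plain shifts by 0 and 8 bits, as I believe they are in sid.h; the macros are not shown in this patch, so treat those definitions and the EXAMPLE_EVENT value as assumptions made for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed definitions (not shown in the patch): plain shifts, no masking. */
#define EVENT_TYPE(x)   ((x) << 0)
#define EVENT_INDEX(x)  ((x) << 8)

/* Placeholder event id, chosen only for this illustration. */
#define EXAMPLE_EVENT   0x07

/* Parenthesization before the fix: EVENT_INDEX ends up inside EVENT_TYPE. */
static uint32_t event_dword_before_fix(void)
{
   return EVENT_TYPE(EXAMPLE_EVENT | EVENT_INDEX(4));
}

/* Parenthesization after the fix: the two fields are OR'ed side by side. */
static uint32_t event_dword_after_fix(void)
{
   return EVENT_TYPE(EXAMPLE_EVENT) | EVENT_INDEX(4);
}

/* With a shift-by-zero EVENT_TYPE, both spellings yield the same dword. */
static void check_typo_was_harmless(void)
{
   assert(event_dword_before_fix() == event_dword_after_fix());
}
```

If EVENT_TYPE ever masked or shifted its argument, the old spelling would corrupt the index field, which is presumably why the typo is fixed anyway.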



[Mesa-dev] [PATCH 05/15] radeonsi: make si_gfx_write_event_eop more configurable

2018-10-02 Thread Marek Olšák
From: Marek Olšák 

---
 src/amd/common/sid.h  |  5 +
 src/gallium/drivers/radeonsi/si_fence.c   | 19 ++-
 src/gallium/drivers/radeonsi/si_perfcounter.c |  2 ++
 src/gallium/drivers/radeonsi/si_pipe.h|  2 +-
 src/gallium/drivers/radeonsi/si_query.c   |  9 +++--
 src/gallium/drivers/radeonsi/si_state_draw.c  |  8 ++--
 6 files changed, 31 insertions(+), 14 deletions(-)

diff --git a/src/amd/common/sid.h b/src/amd/common/sid.h
index b3321ea3a77..3e36eb2d046 100644
--- a/src/amd/common/sid.h
+++ b/src/amd/common/sid.h
@@ -174,28 +174,33 @@
 #define     COPY_DATA_DST_MEM          5
 #define   COPY_DATA_COUNT_SEL          (1 << 16)
 #define   COPY_DATA_WR_CONFIRM         (1 << 20)
 #define   COPY_DATA_ENGINE_PFP         (1 << 30)
 #define PKT3_PFP_SYNC_ME               0x42
 #define PKT3_SURFACE_SYNC              0x43 /* deprecated on CIK, use ACQUIRE_MEM */
 #define PKT3_ME_INITIALIZE             0x44 /* not on CIK */
 #define PKT3_COND_WRITE                0x45
 #define PKT3_EVENT_WRITE               0x46
 #define PKT3_EVENT_WRITE_EOP           0x47 /* not on GFX9 */
+#define   EOP_DST_SEL(x)               ((x) << 16)
+#define     EOP_DST_SEL_MEM            0
+#define     EOP_DST_SEL_TC_L2          1
 #define   EOP_INT_SEL(x)               ((x) << 24)
 #define     EOP_INT_SEL_NONE           0
 #define     EOP_INT_SEL_SEND_DATA_AFTER_WR_CONFIRM 3
 #define   EOP_DATA_SEL(x)              ((x) << 29)
 #define     EOP_DATA_SEL_DISCARD       0
 #define     EOP_DATA_SEL_VALUE_32BIT   1
 #define     EOP_DATA_SEL_VALUE_64BIT   2
 #define     EOP_DATA_SEL_TIMESTAMP     3
+#define     EOP_DATA_SEL_GDS           5
+#define   EOP_DATA_GDS(dw_offset, num_dwords) ((dw_offset) | ((unsigned)(num_dwords) << 16))
 /* CP DMA bug: Any use of CP_DMA.DST_SEL=TC must be avoided when EOS packets
  * are used. Use DST_SEL=MC instead. For prefetch, use SRC_SEL=TC and
  * DST_SEL=MC. Only CIK chips are affected.
  */
 /* fix CP DMA before uncommenting: */
 /*#define PKT3_EVENT_WRITE_EOS         0x48*/ /* not on GFX9 */
 #define PKT3_RELEASE_MEM               0x49 /* GFX9+ [any ring] or GFX8 [compute ring only] */
 #define PKT3_ONE_REG_WRITE             0x57 /* not on CIK */
 #define PKT3_ACQUIRE_MEM               0x58 /* new for CIK */
 #define PKT3_SET_CONFIG_REG            0x68
diff --git a/src/gallium/drivers/radeonsi/si_fence.c 
b/src/gallium/drivers/radeonsi/si_fence.c
index abb7057f299..005fd9c1576 100644
--- a/src/gallium/drivers/radeonsi/si_fence.c
+++ b/src/gallium/drivers/radeonsi/si_fence.c
@@ -59,34 +59,32 @@ struct si_multi_fence {
  * \param event        EVENT_TYPE_*
  * \param event_flags  Optional cache flush flags (TC)
  * \param data_sel     1 = fence, 3 = timestamp
  * \param buf          Buffer
  * \param va           GPU address
  * \param old_value    Previous fence value (for a bug workaround)
  * \param new_value    Fence value to write for this event.
  */
 void si_gfx_write_event_eop(struct si_context *ctx,
unsigned event, unsigned event_flags,
-   unsigned data_sel,
+   unsigned dst_sel, unsigned int_sel, unsigned 
data_sel,
struct r600_resource *buf, uint64_t va,
uint32_t new_fence, unsigned query_type)
 {
struct radeon_cmdbuf *cs = ctx->gfx_cs;
unsigned op = EVENT_TYPE(event) |
- EVENT_INDEX(5) |
+ EVENT_INDEX(event == V_028A90_CS_DONE ||
+ event == V_028A90_PS_DONE ? 6 : 5) |
  event_flags;
-   unsigned sel = EOP_DATA_SEL(data_sel);
-
-   /* Wait for write confirmation before writing data, but don't send
-* an interrupt. */
-   if (data_sel != EOP_DATA_SEL_DISCARD)
-   sel |= EOP_INT_SEL(EOP_INT_SEL_SEND_DATA_AFTER_WR_CONFIRM);
+   unsigned sel = EOP_DST_SEL(dst_sel) |
+  EOP_INT_SEL(int_sel) |
+  EOP_DATA_SEL(data_sel);
 
if (ctx->chip_class >= GFX9) {
/* A ZPASS_DONE or PIXEL_STAT_DUMP_EVENT (of the DB occlusion
 * counters) must immediately precede every timestamp event to
 * prevent a GPU hang on GFX9.
 *
 * Occlusion queries don't need to do it here, because they
 * always do ZPASS_DONE before the timestamp.
 */
if (ctx->chip_class == GFX9 &&
@@ -268,21 
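
To make the new "sel" dword concrete, here is a rough sketch of how the three selector fields combine. The shift macros are copied from the sid.h hunk above; pack_eop_sel is a hypothetical helper written only for this note and is not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Field macros as they appear in the sid.h hunk of this patch. */
#define EOP_DST_SEL(x)   ((x) << 16)
#define EOP_INT_SEL(x)   ((x) << 24)
#define EOP_DATA_SEL(x)  ((x) << 29)

#define EOP_DST_SEL_MEM                         0
#define EOP_INT_SEL_SEND_DATA_AFTER_WR_CONFIRM  3
#define EOP_DATA_SEL_VALUE_32BIT                1

/* Hypothetical helper: builds the selector dword the same way the patched
 * function computes its local "sel". Unsigned arguments keep the shift
 * into bit 31 well defined. */
static uint32_t pack_eop_sel(uint32_t dst_sel, uint32_t int_sel,
                             uint32_t data_sel)
{
   assert(data_sel < 8); /* only bits 29..31 are left for DATA_SEL */
   return EOP_DST_SEL(dst_sel) | EOP_INT_SEL(int_sel) | EOP_DATA_SEL(data_sel);
}
```

Before this patch the interrupt selector was derived from data_sel inside the function; after it, callers pass dst_sel and int_sel explicitly, so the old behaviour for a 32-bit fence write corresponds to pack_eop_sel(EOP_DST_SEL_MEM, EOP_INT_SEL_SEND_DATA_AFTER_WR_CONFIRM, EOP_DATA_SEL_VALUE_32BIT).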

[Mesa-dev] [PATCH 00/15] A bunch of shared code and RadeonSI changes

2018-10-02 Thread Marek Olšák
Hi,

Interesting bits:
- CP DMA support for GDS (unused but there is a test)
- switch back to DX sample positions
- center the viewport in the scanline area for maximizing the guardband
- optimal PA_SU_PRIM_FILTER_CNTL
- higher subpixel precision for 4K and lower resolutions
  (for more precise rendering of T-junctions in geometry)

Please review.

Thanks,
Marek


[Mesa-dev] [PATCH] util/u_queue: don't inherit thread affinity from parent thread

2018-10-02 Thread Marek Olšák
From: Marek Olšák 

---
 src/util/u_queue.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/src/util/u_queue.c b/src/util/u_queue.c
index 22d2cdd0fa2..9dd1a69ed7a 100644
--- a/src/util/u_queue.c
+++ b/src/util/u_queue.c
@@ -232,20 +232,32 @@ struct thread_input {
 };
 
 static int
 util_queue_thread_func(void *input)
 {
struct util_queue *queue = ((struct thread_input*)input)->queue;
int thread_index = ((struct thread_input*)input)->thread_index;
 
free(input);
 
+#ifdef HAVE_PTHREAD_SETAFFINITY
+   /* Don't inherit the thread affinity from the parent thread.
+* Set the full mask.
+*/
+   cpu_set_t cpuset;
+   CPU_ZERO(&cpuset);
+   for (unsigned i = 0; i < CPU_SETSIZE; i++)
+      CPU_SET(i, &cpuset);
+
+   pthread_setaffinity_np(pthread_self(), sizeof(cpuset), &cpuset);
+#endif
+
if (strlen(queue->name) > 0) {
   char name[16];
   util_snprintf(name, sizeof(name), "%s%i", queue->name, thread_index);
   u_thread_setname(name);
}
 
while (1) {
   struct util_queue_job job;
 
    mtx_lock(&queue->lock);
-- 
2.17.1
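
For readers who want to try the idea outside Mesa, the hunk above boils down to roughly the following standalone sketch. It assumes Linux with glibc (the CPU_* macros and pthread_setaffinity_np are GNU extensions); fill_full_cpu_mask and reset_thread_affinity are hypothetical names, not Mesa functions.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>
#include <sched.h>

/* Set every bit that a cpu_set_t can represent. */
static void fill_full_cpu_mask(cpu_set_t *set)
{
   CPU_ZERO(set);
   for (unsigned i = 0; i < CPU_SETSIZE; i++)
      CPU_SET(i, set);
   assert(CPU_COUNT(set) == CPU_SETSIZE);
}

/* Hypothetical helper: drop whatever affinity the calling thread inherited
 * from its parent by requesting the full mask. Returns 0 on success or the
 * error number reported by pthread_setaffinity_np. */
static int reset_thread_affinity(void)
{
   cpu_set_t cpuset;

   fill_full_cpu_mask(&cpuset);
   return pthread_setaffinity_np(pthread_self(), sizeof(cpuset), &cpuset);
}
```

A worker thread would call reset_thread_affinity() first thing in its entry function, as the patch does in util_queue_thread_func. Note that the patch ignores the return value; whether that matters depends on the platform.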



Re: [Mesa-dev] [PATCH 09/13] mesa: add support for glMapNamedBufferRangeEXT()

2018-10-02 Thread Marek Olšák
On Sat, Sep 8, 2018 at 12:33 AM Timothy Arceri  wrote:
>
> ---
>  .../glapi/gen/EXT_direct_state_access.xml | 10 +
>  src/mesa/main/bufferobj.c | 45 +--
>  src/mesa/main/bufferobj.h |  3 ++
>  src/mesa/main/tests/dispatch_sanity.cpp   |  2 +-
>  4 files changed, 46 insertions(+), 14 deletions(-)
>
> diff --git a/src/mapi/glapi/gen/EXT_direct_state_access.xml 
> b/src/mapi/glapi/gen/EXT_direct_state_access.xml
> index 203730b0242..d8fdf8921da 100644
> --- a/src/mapi/glapi/gen/EXT_direct_state_access.xml
> +++ b/src/mapi/glapi/gen/EXT_direct_state_access.xml
> @@ -124,5 +124,15 @@
>
> 
>
> +   
> +
> +   
> +  
> +  
> +  
> +  
> +  
> +   
> +
>  
>  
> diff --git a/src/mesa/main/bufferobj.c b/src/mesa/main/bufferobj.c
> index de809d31f35..23f2f713815 100644
> --- a/src/mesa/main/bufferobj.c
> +++ b/src/mesa/main/bufferobj.c
> @@ -3270,30 +3270,49 @@ _mesa_MapNamedBufferRange_no_error(GLuint buffer, 
> GLintptr offset,
> "glMapNamedBufferRange");
>  }
>
> -void * GLAPIENTRY
> -_mesa_MapNamedBufferRange(GLuint buffer, GLintptr offset, GLsizeiptr length,
> -  GLbitfield access)
> +static void *
> +map_named_buffer_range(GLuint buffer, GLintptr offset, GLsizeiptr length,
> +   GLbitfield access, bool dst_ext, const char *func)

dsa_ext?

>  {
> GET_CURRENT_CONTEXT(ctx);
> -   struct gl_buffer_object *bufObj;
> +   struct gl_buffer_object *bufObj = NULL;
>
> if (!ctx->Extensions.ARB_map_buffer_range) {
>_mesa_error(ctx, GL_INVALID_OPERATION,
> -  "glMapNamedBufferRange("
> -  "ARB_map_buffer_range not supported)");
> +  "%s(ARB_map_buffer_range not supported)", func);
>return NULL;
> }
>
> -   bufObj = _mesa_lookup_bufferobj_err(ctx, buffer, "glMapNamedBufferRange");
> -   if (!bufObj)
> -  return NULL;
> +   if (dst_ext) {
> +  bufObj = _mesa_lookup_bufferobj(ctx, buffer);
> +  if (!_mesa_handle_bind_buffer_gen(ctx, buffer, &bufObj, func))
> + return NULL;
> +   } else {
> +  bufObj = _mesa_lookup_bufferobj_err(ctx, buffer, func);
> +  if (!bufObj)
> + return NULL;
> +   }
>
> -   if (!validate_map_buffer_range(ctx, bufObj, offset, length, access,
> -  "glMapNamedBufferRange"))
> +   if (!validate_map_buffer_range(ctx, bufObj, offset, length, access, func))
>return NULL;
>
> -   return map_buffer_range(ctx, bufObj, offset, length, access,
> -   "glMapNamedBufferRange");
> +   return map_buffer_range(ctx, bufObj, offset, length, access, func);
> +}
> +
> +void * GLAPIENTRY
> +_mesa_MapNamedBufferRangeEXT(GLuint buffer, GLintptr offset, GLsizeiptr 
> length,
> + GLbitfield access)
> +{
> +   return map_named_buffer_range(buffer, offset, length, access, true,
> + "glMapNamedBufferRangeEXT");
> +}
> +
> +void * GLAPIENTRY
> +_mesa_MapNamedBufferRange(GLuint buffer, GLintptr offset, GLsizeiptr length,
> +  GLbitfield access)
> +{
> +   return map_named_buffer_range(buffer, offset, length, access, false,
> + "glMapNamedBufferRange");
>  }
>
>  /**
> diff --git a/src/mesa/main/bufferobj.h b/src/mesa/main/bufferobj.h
> index 6b35d70606f..c3b57ef7fe6 100644
> --- a/src/mesa/main/bufferobj.h
> +++ b/src/mesa/main/bufferobj.h
> @@ -357,6 +357,9 @@ _mesa_MapNamedBufferRange_no_error(GLuint buffer, 
> GLintptr offset,
>  void * GLAPIENTRY
>  _mesa_MapNamedBufferRange(GLuint buffer, GLintptr offset, GLsizeiptr length,
>GLbitfield access);
> +void * GLAPIENTRY
> +_mesa_MapNamedBufferRangeEXT(GLuint buffer, GLintptr offset,
> + GLsizeiptr length, GLbitfield access);
>
>  void * GLAPIENTRY
>  _mesa_MapBuffer_no_error(GLenum target, GLenum access);
> diff --git a/src/mesa/main/tests/dispatch_sanity.cpp 
> b/src/mesa/main/tests/dispatch_sanity.cpp
> index 5a0cdfd78f2..ba020c31e70 100644
> --- a/src/mesa/main/tests/dispatch_sanity.cpp
> +++ b/src/mesa/main/tests/dispatch_sanity.cpp
> @@ -1231,7 +1231,7 @@ const struct function 
> common_desktop_functions_possible[] = {
> //{ "glGetVertexArrayPointervEXT", 10, -1 },
> //{ "glGetVertexArrayIntegeri_vEXT", 10, -1 },
> //{ "glGetVertexArrayPointeri_vEXT", 10, -1 },
> -   //{ "glMapNamedBufferRangeEXT", 10, -1 },
> +   { "glMapNamedBufferRangeEXT", 10, -1 },
> //{ "glFlushMappedNamedBufferRangeEXT", 10, -1 },
>
> /* GL_ARB_internalformat_query */
> --
> 2.17.1
>

Re: [Mesa-dev] [PATCH 08/13] mesa: add support for glNamedBufferStorageEXT

2018-10-02 Thread Marek Olšák
features.txt can be updated.

Marek
On Sat, Sep 8, 2018 at 12:34 AM Timothy Arceri  wrote:
>
> This is available in ARB_buffer_storage when
> EXT_direct_state_access is present.
> ---
>  src/mapi/glapi/gen/gl_API.xml   |  7 +++
>  src/mesa/main/bufferobj.c   | 15 +++
>  src/mesa/main/bufferobj.h   |  3 +++
>  src/mesa/main/tests/dispatch_sanity.cpp |  1 +
>  4 files changed, 26 insertions(+)
>
> diff --git a/src/mapi/glapi/gen/gl_API.xml b/src/mapi/glapi/gen/gl_API.xml
> index 8403c80eb37..e65b24a1dbb 100644
> --- a/src/mapi/glapi/gen/gl_API.xml
> +++ b/src/mapi/glapi/gen/gl_API.xml
> @@ -8271,6 +8271,13 @@
>  
>  
>  
> +
> +   
> +  
> +  
> +  
> +  
> +   
>  
>
>   xmlns:xi="http://www.w3.org/2001/XInclude"/>
> diff --git a/src/mesa/main/bufferobj.c b/src/mesa/main/bufferobj.c
> index aa8d7062cb6..de809d31f35 100644
> --- a/src/mesa/main/bufferobj.c
> +++ b/src/mesa/main/bufferobj.c
> @@ -1930,6 +1930,21 @@ _mesa_BufferStorage(GLenum target, GLsizeiptr size, 
> const GLvoid *data,
>false, false, false, "glBufferStorage");
>  }
>
> +void GLAPIENTRY
> +_mesa_NamedBufferStorageEXT(GLuint buffer, GLsizeiptr size,
> +const GLvoid *data, GLbitfield flags)
> +{
> +   GET_CURRENT_CONTEXT(ctx);
> +
> +   struct gl_buffer_object *bufObj = _mesa_lookup_bufferobj(ctx, buffer);
> +   if (!_mesa_handle_bind_buffer_gen(ctx, buffer,
> + &bufObj, "glNamedBufferStorageEXT"))
> +  return;
> +
> +   inlined_buffer_storage(GL_NONE, buffer, size, data, flags, GL_NONE, 0,
> +  true, false, false, "glNamedBufferStorageEXT");
> +}
> +
>
>  void GLAPIENTRY
>  _mesa_BufferStorageMemEXT(GLenum target, GLsizeiptr size,
> diff --git a/src/mesa/main/bufferobj.h b/src/mesa/main/bufferobj.h
> index 74124649bb6..6b35d70606f 100644
> --- a/src/mesa/main/bufferobj.h
> +++ b/src/mesa/main/bufferobj.h
> @@ -189,6 +189,9 @@ void GLAPIENTRY
>  _mesa_BufferStorage(GLenum target, GLsizeiptr size, const GLvoid *data,
>  GLbitfield flags);
>  void GLAPIENTRY
> +_mesa_NamedBufferStorageEXT(GLuint buffer, GLsizeiptr size,
> +const GLvoid *data, GLbitfield flags);
> +void GLAPIENTRY
>  _mesa_BufferStorageMemEXT(GLenum target, GLsizeiptr size,
>GLuint memory, GLuint64 offset);
>  void GLAPIENTRY
> diff --git a/src/mesa/main/tests/dispatch_sanity.cpp 
> b/src/mesa/main/tests/dispatch_sanity.cpp
> index 1b8dec18c20..5a0cdfd78f2 100644
> --- a/src/mesa/main/tests/dispatch_sanity.cpp
> +++ b/src/mesa/main/tests/dispatch_sanity.cpp
> @@ -1304,6 +1304,7 @@ const struct function 
> common_desktop_functions_possible[] = {
>
> /* GL_ARB_buffer_storage */
> { "glBufferStorage", 43, -1 },
> +   { "glNamedBufferStorageEXT", 43, -1 },
>
> /* GL_ARB_clear_texture */
> { "glClearTexImage", 13, -1 },
> --
> 2.17.1
>


[Mesa-dev] [PATCH] anv/batch_chain: Don't start a new BO just for BATCH_BUFFER_START

2018-10-02 Thread Jason Ekstrand
Previously, we just went ahead and emitted MI_BATCH_BUFFER_START as
normal.  If we are near enough to the end, this can cause us to start a
new BO just for the MI_BATCH_BUFFER_START which messes up chaining.  We
always reserve enough space at the end for an MI_BATCH_BUFFER_START so
we can just increment cmd_buffer->batch.end prior to emitting the
command.

Fixes: a0b133286a3 "anv/batch_chain: Simplify secondary batch return..."
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107926
---
 src/intel/vulkan/anv_batch_chain.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/src/intel/vulkan/anv_batch_chain.c 
b/src/intel/vulkan/anv_batch_chain.c
index 3e13553ac18..e08e07ad7bd 100644
--- a/src/intel/vulkan/anv_batch_chain.c
+++ b/src/intel/vulkan/anv_batch_chain.c
@@ -894,8 +894,17 @@ anv_cmd_buffer_end_batch_buffer(struct anv_cmd_buffer 
*cmd_buffer)
   * It doesn't matter where it points now so long as has a valid
   * relocation.  We'll adjust it later as part of the chaining
   * process.
+  *
+  * We set the end of the batch a little short so we would be sure we
+  * have room for the chaining command.  Since we're about to emit the
+  * chaining command, let's set it back where it should go.
   */
+ cmd_buffer->batch.end += GEN8_MI_BATCH_BUFFER_START_length * 4;
+ assert(cmd_buffer->batch.start == batch_bo->bo.map);
+ assert(cmd_buffer->batch.end == batch_bo->bo.map + batch_bo->bo.size);
+
  emit_batch_buffer_start(cmd_buffer, &batch_bo->bo, 0);
+ assert(cmd_buffer->batch.start == batch_bo->bo.map);
   } else {
  cmd_buffer->exec_mode = ANV_CMD_BUFFER_EXEC_MODE_COPY_AND_CHAIN;
   }
-- 
2.17.1



[Mesa-dev] [Bug 108135] AVX instructions leak outside of CPU feature check and cause SIGILL

2018-10-02 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108135

--- Comment #1 from Thiago Macieira  ---
Created attachment 141840
  --> https://bugs.freedesktop.org/attachment.cgi?id=141840&action=edit
Attempt at making the variables local static

-- 
You are receiving this mail because:
You are the QA Contact for the bug.


Re: [Mesa-dev] [PATCH 04/13] glapi: add EXT_direct_state_access

2018-10-02 Thread Marek Olšák
The patch subject should be changed, because it adds only a tiny
subset of the extension.

Marek
On Sat, Sep 8, 2018 at 12:32 AM Timothy Arceri  wrote:
>
> From: Chris Forbes 
>
> Signed-off-by: Chris Forbes 
> ---
>  .../glapi/gen/EXT_direct_state_access.xml | 101 ++
>  src/mapi/glapi/gen/gl_API.xml |   3 +
>  src/mesa/main/tests/dispatch_sanity.cpp   |  30 +++---
>  3 files changed, 119 insertions(+), 15 deletions(-)
>  create mode 100644 src/mapi/glapi/gen/EXT_direct_state_access.xml
>
> diff --git a/src/mapi/glapi/gen/EXT_direct_state_access.xml 
> b/src/mapi/glapi/gen/EXT_direct_state_access.xml
> new file mode 100644
> index 000..c19afe80a22
> --- /dev/null
> +++ b/src/mapi/glapi/gen/EXT_direct_state_access.xml
> @@ -0,0 +1,101 @@
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +   
> +
> +   
> +
> +   
> +  
> +  
> +   
> +   
> +  
> +  
> +   
> +   
> +  
> +  
> +   
> +   
> +  
> +  
> +   
> +   
> +  
> +   
> +   
> +  
> +  
> +  
> +  
> +  
> +   
> +   
> +  
> +  
> +  
> +  
> +  
> +   
> +   
> +  
> +  
> +  
> +  
> +   
> +   
> +  
> +  
> +  
> +  
> +   
> +   
> +  
> +  
> +  
> +  
> +   
> +   
> +  
> +  
> +  
> +  
> +   
> +   
> +  
> +  
> +  
> +  
> +  
> +  
> +  
> +   
> +   
> +  
> +  
> +  
> +  
> +  
> +  
> +  
> +   
> +   
> +  
> +   
> +   
> +  
> +   
> +
> +
> +
> +
> diff --git a/src/mapi/glapi/gen/gl_API.xml b/src/mapi/glapi/gen/gl_API.xml
> index aae9a5835db..8403c80eb37 100644
> --- a/src/mapi/glapi/gen/gl_API.xml
> +++ b/src/mapi/glapi/gen/gl_API.xml
> @@ -12944,6 +12944,9 @@
>   xmlns:xi="http://www.w3.org/2001/XInclude"/>
>
> + +xmlns:xi="http://www.w3.org/2001/XInclude"/>
> +
>  
>  
>  
> diff --git a/src/mesa/main/tests/dispatch_sanity.cpp 
> b/src/mesa/main/tests/dispatch_sanity.cpp
> index 8b03f5377b3..e0ed7c17329 100644
> --- a/src/mesa/main/tests/dispatch_sanity.cpp
> +++ b/src/mesa/main/tests/dispatch_sanity.cpp
> @@ -1016,21 +1016,21 @@ const struct function 
> common_desktop_functions_possible[] = {
> { "glGetQueryBufferObjectui64v", 45, -1 },
>
> /* GL_EXT_direct_state_access - GL 1.0 */
> -   //{ "glMatrixLoadfEXT", 10, -1 },
> -   //{ "glMatrixLoaddEXT", 10, -1 },
> -   //{ "glMatrixMultfEXT", 10, -1 },
> -   //{ "glMatrixMultdEXT", 10, -1 },
> -   //{ "glMatrixLoadIdentityEXT", 10, -1 },
> -   //{ "glMatrixRotatefEXT", 10, -1 },
> -   //{ "glMatrixRotatedEXT", 10, -1 },
> -   //{ "glMatrixScalefEXT", 10, -1 },
> -   //{ "glMatrixScaledEXT", 10, -1 },
> -   //{ "glMatrixTranslatefEXT", 10, -1 },
> -   //{ "glMatrixTranslatedEXT", 10, -1 },
> -   //{ "glMatrixOrthoEXT", 10, -1 },
> -   //{ "glMatrixFrustumEXT", 10, -1 },
> -   //{ "glMatrixPushEXT", 10, -1 },
> -   //{ "glMatrixPopEXT", 10, -1 },
> +   { "glMatrixLoadfEXT", 10, -1 },
> +   { "glMatrixLoaddEXT", 10, -1 },
> +   { "glMatrixMultfEXT", 10, -1 },
> +   { "glMatrixMultdEXT", 10, -1 },
> +   { "glMatrixLoadIdentityEXT", 10, -1 },
> +   { "glMatrixRotatefEXT", 10, -1 },
> +   { "glMatrixRotatedEXT", 10, -1 },
> +   { "glMatrixScalefEXT", 10, -1 },
> +   { "glMatrixScaledEXT", 10, -1 },
> +   { "glMatrixTranslatefEXT", 10, -1 },
> +   { "glMatrixTranslatedEXT", 10, -1 },
> +   { "glMatrixOrthoEXT", 10, -1 },
> +   { "glMatrixFrustumEXT", 10, -1 },
> +   { "glMatrixPushEXT", 10, -1 },
> +   { "glMatrixPopEXT", 10, -1 },
> /* GL_EXT_direct_state_access - GL 1.1 */
> //{ "glClientAttribDefaultEXT", 10, -1 },
> //{ "glPushClientAttribDefaultEXT", 10, -1 },
> --
> 2.17.1
>


Re: [Mesa-dev] [PATCH 03/13] mesa: EXT_dsa add selectorless matrix stack functions

2018-10-02 Thread Marek Olšák
features.txt should be updated.

Marek
On Sat, Sep 8, 2018 at 12:31 AM Timothy Arceri  wrote:
>
> From: Chris Forbes 
>
> Allows the legacy matrix stacks to be manipulated without disturbing the
> matrix mode selector.
>
> Signed-off-by: Chris Forbes 
> ---
>  src/mesa/main/matrix.c | 370 +++--
>  src/mesa/main/matrix.h |  46 +
>  2 files changed, 363 insertions(+), 53 deletions(-)
>
> diff --git a/src/mesa/main/matrix.c b/src/mesa/main/matrix.c
> index bd38c1d8496..ea0a3bd537f 100644
> --- a/src/mesa/main/matrix.c
> +++ b/src/mesa/main/matrix.c
> @@ -116,25 +116,53 @@ _mesa_Frustum( GLdouble left, GLdouble right,
>  {
> GET_CURRENT_CONTEXT(ctx);
>
> -   FLUSH_VERTICES(ctx, 0);
> -
> if (nearval <= 0.0 ||
> farval <= 0.0 ||
> nearval == farval ||
> left == right ||
> -   top == bottom)
> -   {
> +   top == bottom) {
>_mesa_error( ctx,  GL_INVALID_VALUE, "glFrustum" );
>return;
> }
>
> +   FLUSH_VERTICES(ctx, 0);
> _math_matrix_frustum( ctx->CurrentStack->Top,
> - (GLfloat) left, (GLfloat) right,
> -(GLfloat) bottom, (GLfloat) top,
> + (GLfloat) left, (GLfloat) right,
> +(GLfloat) bottom, (GLfloat) top,
>  (GLfloat) nearval, (GLfloat) farval );
> ctx->NewState |= ctx->CurrentStack->DirtyFlag;
>  }
>
> +void GLAPIENTRY
> +_mesa_MatrixFrustumEXT( GLenum matrixMode,
> +GLdouble left, GLdouble right,
> +GLdouble bottom, GLdouble top,
> +GLdouble nearval, GLdouble farval )
> +{
> +   GET_CURRENT_CONTEXT(ctx);
> +   struct gl_matrix_stack *stack = get_named_matrix_stack(ctx, matrixMode);
> +
> +   if (!stack) {
> +  _mesa_error(ctx, GL_INVALID_ENUM, "glMatrixFrustumEXT(mode)");
> +  return;
> +   }
> +
> +   if (nearval <= 0.0 ||
> +   farval <= 0.0 ||
> +   nearval == farval ||
> +   left == right ||
> +   top == bottom) {
> +  _mesa_error(ctx, GL_INVALID_VALUE, "glMatrixFrustumEXT");
> +  return;
> +   }
> +
> +   FLUSH_VERTICES(ctx, 0);
> +   _math_matrix_frustum(stack->Top,
> +(GLfloat) left, (GLfloat) right,
> +(GLfloat) bottom, (GLfloat) top,
> +(GLfloat) nearval, (GLfloat) farval);
> +   ctx->NewState |= stack->DirtyFlag;
> +}
>
>  /**
>   * Apply an orthographic projection matrix.
> @@ -159,27 +187,54 @@ _mesa_Ortho( GLdouble left, GLdouble right,
>  {
> GET_CURRENT_CONTEXT(ctx);
>
> -   FLUSH_VERTICES(ctx, 0);
> -
> if (MESA_VERBOSE & VERBOSE_API)
>_mesa_debug(ctx, "glOrtho(%f, %f, %f, %f, %f, %f)\n",
>left, right, bottom, top, nearval, farval);
>
> if (left == right ||
> bottom == top ||
> -   nearval == farval)
> -   {
> +   nearval == farval) {
>_mesa_error( ctx,  GL_INVALID_VALUE, "glOrtho" );
>return;
> }
>
> +   FLUSH_VERTICES(ctx, 0);
> _math_matrix_ortho( ctx->CurrentStack->Top,
> -   (GLfloat) left, (GLfloat) right,
> -  (GLfloat) bottom, (GLfloat) top,
> +   (GLfloat) left, (GLfloat) right,
> +  (GLfloat) bottom, (GLfloat) top,
>(GLfloat) nearval, (GLfloat) farval );
> ctx->NewState |= ctx->CurrentStack->DirtyFlag;
>  }
>
> +void GLAPIENTRY
> +_mesa_MatrixOrthoEXT( GLenum matrixMode,
> +  GLdouble left, GLdouble right,
> +  GLdouble bottom, GLdouble top,
> +  GLdouble nearval, GLdouble farval )
> +{
> +   GET_CURRENT_CONTEXT(ctx);
> +   struct gl_matrix_stack *stack = get_named_matrix_stack(ctx, matrixMode);
> +
> +   if (!stack) {
> +  _mesa_error(ctx, GL_INVALID_ENUM, "glMatrixOrthoEXT(mode)");
> +  return;
> +   }
> +
> +   if (left == right ||
> +   bottom == top ||
> +   nearval == farval) {
> +  _mesa_error(ctx, GL_INVALID_VALUE, "glMatrixOrthoEXT");
> +  return;
> +   }
> +
> +   FLUSH_VERTICES(ctx, 0);
> +   _math_matrix_ortho(stack->Top,
> +  (GLfloat) left, (GLfloat) right,
> +  (GLfloat) bottom, (GLfloat) top,
> +  (GLfloat) nearval, (GLfloat) farval);
> +   ctx->NewState |= stack->DirtyFlag;
> +}
> +
>  /**
>   * Set the current matrix stack.
>   *
> @@ -211,38 +266,21 @@ _mesa_MatrixMode( GLenum mode )
> }
>  }
>
> -
> -/**
> - * Push the current matrix stack.
> - *
> - * \sa glPushMatrix().
> - *
> - * Verifies the current matrix stack is not full, and duplicates the top-most
> - * matrix in the stack.
> - * Marks __struct gl_contextRec::NewState with the stack dirty flag.
> - */
> -void GLAPIENTRY
> -_mesa_PushMatrix( void )
> +static void
> +push_matrix(struct gl_context *ctx, struct gl_matrix_stack *stack,
> +GLenum matrixMode, const char 
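
The visible hunks of the quoted patch follow one pattern: look up the stack
named by the caller, validate the arguments, and only then touch that stack,
leaving the current matrix-mode selector alone. A minimal sketch of that
pattern, assuming simplified stand-in types (`Context`, `MatrixStack`,
`named_stack` and the `k*` constants are hypothetical, not Mesa's):

```cpp
#include <cassert>

// Stand-ins for the GL matrix-mode enums (assumed values for illustration).
constexpr unsigned kModelView = 0x1700;
constexpr unsigned kProjection = 0x1701;

enum class Error { None, InvalidEnum, InvalidValue };

struct MatrixStack {
    unsigned dirty_flag;
    int updates;  // counts modifications instead of doing real matrix math
};

struct Context {
    MatrixStack modelview{1u, 0};
    MatrixStack projection{2u, 0};
    MatrixStack *current = &modelview;  // what the mode selector points at
    unsigned new_state = 0;
    Error last_error = Error::None;
};

// Map a mode enum to a stack, or nullptr for an unknown mode.
inline MatrixStack *named_stack(Context &ctx, unsigned mode) {
    switch (mode) {
    case kModelView:  return &ctx.modelview;
    case kProjection: return &ctx.projection;
    default:          return nullptr;
    }
}

// Selectorless frustum: same checks as the quoted hunk, applied to the named
// stack rather than to ctx.current.
inline void matrix_frustum_named(Context &ctx, unsigned mode,
                                 double left, double right,
                                 double bottom, double top,
                                 double nearval, double farval) {
    MatrixStack *stack = named_stack(ctx, mode);
    if (stack == nullptr) {
        ctx.last_error = Error::InvalidEnum;
        return;
    }
    if (nearval <= 0.0 || farval <= 0.0 || nearval == farval ||
        left == right || top == bottom) {
        ctx.last_error = Error::InvalidValue;
        return;
    }
    // A real implementation would flush vertices and multiply the matrix here.
    stack->updates++;
    ctx.new_state |= stack->dirty_flag;
}
```

Calling `matrix_frustum_named(ctx, kProjection, ...)` marks only the
projection stack dirty, and `ctx.current` still points at the modelview
stack afterwards.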

Re: [Mesa-dev] [PATCH 01/13] mesa: add a list of EXT_direct_state_access to dispatch sanity

2018-10-02 Thread Marek Olšák
This is probably OK, though the TODO list in features.txt should also
be updated when a new subset is implemented.

Marek
On Sat, Sep 8, 2018 at 12:32 AM Timothy Arceri  wrote:
>
> This extension is huge and this gives us a TODO list of functions
> to implement.
> ---
>  src/mesa/main/tests/dispatch_sanity.cpp | 219 
>  1 file changed, 219 insertions(+)
>
> diff --git a/src/mesa/main/tests/dispatch_sanity.cpp 
> b/src/mesa/main/tests/dispatch_sanity.cpp
> index fb2acfbdeea..8b03f5377b3 100644
> --- a/src/mesa/main/tests/dispatch_sanity.cpp
> +++ b/src/mesa/main/tests/dispatch_sanity.cpp
> @@ -1015,6 +1015,225 @@ const struct function 
> common_desktop_functions_possible[] = {
> { "glGetQueryBufferObjecti64v", 45, -1 },
> { "glGetQueryBufferObjectui64v", 45, -1 },
>
> +   /* GL_EXT_direct_state_access - GL 1.0 */
> +   //{ "glMatrixLoadfEXT", 10, -1 },
> +   //{ "glMatrixLoaddEXT", 10, -1 },
> +   //{ "glMatrixMultfEXT", 10, -1 },
> +   //{ "glMatrixMultdEXT", 10, -1 },
> +   //{ "glMatrixLoadIdentityEXT", 10, -1 },
> +   //{ "glMatrixRotatefEXT", 10, -1 },
> +   //{ "glMatrixRotatedEXT", 10, -1 },
> +   //{ "glMatrixScalefEXT", 10, -1 },
> +   //{ "glMatrixScaledEXT", 10, -1 },
> +   //{ "glMatrixTranslatefEXT", 10, -1 },
> +   //{ "glMatrixTranslatedEXT", 10, -1 },
> +   //{ "glMatrixOrthoEXT", 10, -1 },
> +   //{ "glMatrixFrustumEXT", 10, -1 },
> +   //{ "glMatrixPushEXT", 10, -1 },
> +   //{ "glMatrixPopEXT", 10, -1 },
> +   /* GL_EXT_direct_state_access - GL 1.1 */
> +   //{ "glClientAttribDefaultEXT", 10, -1 },
> +   //{ "glPushClientAttribDefaultEXT", 10, -1 },
> +   //{ "glTextureParameteriEXT", 10, -1 },
> +   //{ "glTextureParameterivEXT", 10, -1 },
> +   //{ "glTextureParameterfEXT", 10, -1 },
> +   //{ "glTextureParameterfvEXT", 10, -1 },
> +   //{ "glTextureImage1DEXT", 10, -1 },
> +   //{ "glTextureImage2DEXT", 10, -1 },
> +   //{ "glTextureSubImage1DEXT", 10, -1 },
> +   //{ "glTextureSubImage2DEXT", 10, -1 },
> +   //{ "glCopyTextureImage1DEXT", 10, -1 },
> +   //{ "glCopyTextureImage2DEXT", 10, -1 },
> +   //{ "glCopyTextureSubImage1DEXT", 10, -1 },
> +   //{ "glCopyTextureSubImage2DEXT", 10, -1 },
> +   //{ "glGetTextureImageEXT", 10, -1 },
> +   //{ "glGetTextureParameterfvEXT", 10, -1 },
> +   //{ "glGetTextureParameterivEXT", 10, -1 },
> +   //{ "glGetTextureLevelParameterfvEXT", 10, -1 },
> +   //{ "glGetTextureLevelParameterivEXT", 10, -1 },
> +   /* GL_EXT_direct_state_access - GL 1.2 */
> +   //{ "glTextureImage3DEXT", 10, -1 },
> +   //{ "glTextureSubImage3DEXT", 10, -1 },
> +   //{ "glCopyTextureSubImage3DEXT", 10, -1 },
> +   /* GL_EXT_direct_state_access - GL 1.2.1 */
> +   //{ "glBindMultiTextureEXT", 10, -1 },
> +   //{ "glMultiTexCoordPointerEXT", 10, -1 },
> +   //{ "glMultiTexEnvfEXT", 10, -1 },
> +   //{ "glMultiTexEnvfvEXT", 10, -1 },
> +   //{ "glMultiTexEnviEXT", 10, -1 },
> +   //{ "glMultiTexEnvivEXT", 10, -1 },
> +   //{ "glMultiTexGenEXT", 10, -1 },
> +   //{ "glMultiTexGenvEXT", 10, -1 },
> +   //{ "glMultiTexGenfEXT", 10, -1 },
> +   //{ "glMultiTexGenfvEXT", 10, -1 },
> +   //{ "glMultiTexGeniEXT", 10, -1 },
> +   //{ "glMultiTexGenivEXT", 10, -1 },
> +   //{ "glGenMultiTexEnvfvEXT", 10, -1 },
> +   //{ "glGenMultiTexEnvivEXT", 10, -1 },
> +   //{ "glGenMultiTexGenvEXT", 10, -1 },
> +   //{ "glGenMultiTexGenfvEXT", 10, -1 },
> +   //{ "glGenMultiTexGenivEXT", 10, -1 },
> +   //{ "glMultiTexParameterfEXT", 10, -1 },
> +   //{ "glMultiTexParameterfvEXT", 10, -1 },
> +   //{ "glMultiTexParameteriEXT", 10, -1 },
> +   //{ "glMultiTexParameterivEXT", 10, -1 },
> +   //{ "glMultiTexImage1DEXT", 10, -1 },
> +   //{ "glMultiTexImage2DEXT", 10, -1 },
> +   //{ "glMultiTexSubImage1DEXT", 10, -1 },
> +   //{ "glMultiTexSubImage2DEXT", 10, -1 },
> +   //{ "glCopyMultiTexImage1DEXT", 10, -1 },
> +   //{ "glCopyMultiTexImage2DEXT", 10, -1 },
> +   //{ "glCopyMultiTexSubImage1DEXT", 10, -1 },
> +   //{ "glCopyMultiTexSubImage2DEXT", 10, -1 },
> +   //{ "glGetMultiTexImageEXT", 10, -1 },
> +   //{ "glGetMultiTexParameterfvEXT", 10, -1 },
> +   //{ "glGetMultiTexParameterivEXT", 10, -1 },
> +   //{ "glGetMultiTexLevelParameterfvEXT", 10, -1 },
> +   //{ "glGetMultiTexLevelParameterivEXT", 10, -1 },
> +   //{ "glMultiTexImage3DEXT", 10, -1 },
> +   //{ "glMultiTexSubImage3DEXT", 10, -1 },
> +   //{ "glCopyMultiTexSubImage3DEXT", 10, -1 },
> +   //{ "glEnableClientStateIndexedEXT", 10, -1 },
> +   //{ "glDisableClientStateIndexedEXT", 10, -1 },
> +   //{ "glGetFloatIndexedvEXT", 10, -1 },
> +   //{ "glGetDoubleIndexedvEXT", 10, -1 },
> +   //{ "glGetPointerIndexedvEXT", 10, -1 },
> +   //{ "glEnableIndexedEXT", 10, -1 },
> +   //{ "glDisableIndexedEXT", 10, -1 },
> +   //{ "glIsEnabledIndexedEXT", 10, -1 },
> +   //{ "glGetIntegerIndexedvEXT", 10, -1 },
> +   //{ "glGetBooleanIndexedvEXT", 10, -1 },
> +   /* GL_EXT_direct_state_access - ARB_vertex_program */
> +   //{ "glNamedProgramStringEXT", 10, -1 },
> +   //{ 

Re: [Mesa-dev] Meson-windows v4 (9/21/2018 rebase): LLVM linking problems

2018-10-02 Thread Dylan Baker
Quoting Liviu Prodea (2018-10-02 14:05:09)
> On Tuesday, October 2, 2018, 8:08:39 PM GMT+3, Dylan Baker
>  wrote:
> 
> 
> Quoting Liviu Prodea (2018-10-02 08:08:41)
> > Made a comprehensive test of this patch series and I still stumbled upon 
> > some
> > big problems:
> >
> > 1. Automatic LLVM linking via llvm-config if used by adding LLVM bin folder
> to
> > PATH results in build failure with 'llvm-c/Core.h' not found in src/gallium/
> > auxiliary/gallivm/lp_bld.h. Appveyor CI from 
> > https://ci.appveyor.com/project/
> > dcbaker/mesa didn't encounter this as it is using llvm-wrap option.
> 
> It's possible that llvm-config wrapping on windows is broken atm, it is on
> macos, I have pull request open, 
> https://github.com/mesonbuild/meson/pull/4283.
> I'll see if that fixes windows as well, or if we need some more work there.
> 
> 
> > 2. Even if build succeeds with LLVM linked via llvm-wrap and everything
> looking
> > good at first glance, llvmpipe and swr if it was built cannot be selected.
> > GALLIUM_DRIVER variable has no effect. You only get softpipe despite
> > opengl32.dll file looking big enough and swrAVX-0.dll and swrAVX2-0.dll 
> > being
> > generated when expected. Even when having LLVM built dynamically to avoid 
> > /MD
> > to /MT override warnings and building Mesa3D with default c_args and 
> > cpp_args
> > this issue is still in effect.
> >
> > 3. Meson 0.48.0 doesn't pass the /MT or /MTd c_args and cpp_args for some
> > unexplained reasons which leads to build failure if LLVM is not built with /
> MD.
> 
> 
> Meson 0.48 has added a new option to allow you to pick which crt you want:
> 
> https://mesonbuild.com/Release-notes-for-0-48-0.html#
> toggles-for-build-type-optimization-and-vcrt-type
> 
> The list of options are here:
> 
> https://mesonbuild.com/Builtin-options.html#base-options
> 
> I'll test and see if I can add b_vscrt=from_buildtype to the default options
> without requiring a bump to 0.48.0 for the whole project.
> 
> Dylan
> 
> ---
> 
> -Db_vscrt=mt doesn't help. I use LLVM built with /MT. I still get a bunch of
> 
> error LNK2038: mismatch detected for 'RuntimeLibrary': value 
> 'MT_StaticRelease' doesn't match value 'MD_DynamicRelease'

What is your -Dbuildtype set to?

> 
> As for why I don't get llvmpipe and swr to work when Mesa3D and LLVM CRT 
> linking match when using manual llvm-wrap option I think the explanation is 
> highlighted by Appveyor: 
> https://ci.appveyor.com/project/dcbaker/mesa/build/job/k02oo9qfyuxaxpgi?fullLog=true#L221
> Looking at line 224, LLVM version is reported as undefined. This can't be 
> good and I am seeing this as well. Automatic wrap with llvm-config doesn't 
> have this problem. Unfortunately it fails to find the headers as already 
> reported.

In that appveyor the fact that version is undefined is harmless, it's because
the `project()` definition in the wrap doesn't define a version, the
`declare_dependency()` does that. If you're seeing that with llvm-config then
that's bad, and may be related to the pull request I mentioned above.

Are you building LLVM yourself, or getting it from somewhere?

Dylan




Re: [Mesa-dev] [PATCH] radeonsi: avoid sending GS_EMIT in shaders without outputs

2018-10-02 Thread Marek Olšák
Pushed, thanks!

Marek
On Sun, Sep 23, 2018 at 6:45 PM Józef Kucia  wrote:
>
> Fixes GPU hangs.
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107857
> Signed-off-by: Józef Kucia 
> ---
>  src/gallium/drivers/radeonsi/si_shader.c | 9 ++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
>
> diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
> b/src/gallium/drivers/radeonsi/si_shader.c
> index 36f58e2ce52c..fedc616ebf61 100644
> --- a/src/gallium/drivers/radeonsi/si_shader.c
> +++ b/src/gallium/drivers/radeonsi/si_shader.c
> @@ -4326,9 +4326,12 @@ static void si_llvm_emit_vertex(struct ac_shader_abi 
> *abi,
> gs_next_vertex = LLVMBuildAdd(ctx->ac.builder, gs_next_vertex, 
> ctx->i32_1, "");
> LLVMBuildStore(ctx->ac.builder, gs_next_vertex, 
> ctx->gs_next_vertex[stream]);
>
> -   /* Signal vertex emission */
> -   ac_build_sendmsg(&ctx->ac, AC_SENDMSG_GS_OP_EMIT | AC_SENDMSG_GS | 
> (stream << 8),
> -si_get_gs_wave_id(ctx));
> +   /* Signal vertex emission if vertex data was written. */
> +   if (offset) {
> +   ac_build_sendmsg(&ctx->ac, AC_SENDMSG_GS_OP_EMIT | 
> AC_SENDMSG_GS | (stream << 8),
> +si_get_gs_wave_id(ctx));
> +   }
> +
> if (!use_kill)
> lp_build_endif(&if_state);
>  }
> --
> 2.16.4
>


[Mesa-dev] [PATCH] anv: Use separate MOCS settings for external BOs

2018-10-02 Thread Jason Ekstrand
On Broadwell and above, we have to use different MOCS settings to allow
the kernel to take over and disable caching when needed for external
buffers.  On Broadwell, this is especially important because the kernel
can't disable eLLC so we have to do it in userspace.  We very badly
don't want to do that on everything so we need separate MOCS for
external and internal BOs.

In order to do this, we add an anv-specific BO flag for "external" and
use that to distinguish between buffers which may be shared with other
processes and/or display and those which are entirely internal.  That,
together with an anv_mocs_for_bo helper lets us choose the right MOCS
settings for each BO use.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99507
Cc: mesa-sta...@lists.freedesktop.org
---
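Reader's note (an illustrative sketch, not code from this patch): the idea in
the commit message is a driver-private "external" bit on each BO that selects
the MOCS value and is stripped before the flags reach the kernel. The names
below (`Bo`, `Device`, `external_mocs`, the `k*` constants and their bit
positions) are assumptions made for the example.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical bit layout: low bits are kernel exec flags, one high bit is
// private to the driver and must never be passed to the kernel.
constexpr std::uint64_t kKernelFlagWrite = 1ull << 0;
constexpr std::uint64_t kBoExternal      = 1ull << 39;
constexpr std::uint64_t kDriverFlagMask  = kBoExternal;

struct Bo {
    std::uint64_t flags;
};

struct Device {
    std::uint32_t default_mocs;   // used for BOs private to this process
    std::uint32_t external_mocs;  // used for BOs that may be shared or scanned out
};

// Choose the MOCS value from the BO's external bit; a null BO gets the default.
inline std::uint32_t mocs_for_bo(const Device &dev, const Bo *bo) {
    if (bo != nullptr && (bo->flags & kBoExternal) != 0)
        return dev.external_mocs;
    return dev.default_mocs;
}

// Flags handed to the kernel: driver-private bits masked off, extras OR-ed in.
inline std::uint64_t kernel_flags(const Bo &bo, std::uint64_t extra) {
    return (bo.flags & ~kDriverFlagMask) | extra;
}
```

Keeping the choice in one helper means every surface and buffer setup site
can ask for the MOCS of a specific BO instead of reading a single
device-wide default.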
 src/intel/vulkan/anv_allocator.c   | 12 --
 src/intel/vulkan/anv_batch_chain.c |  2 +-
 src/intel/vulkan/anv_blorp.c   | 15 ++--
 src/intel/vulkan/anv_device.c  |  9 +--
 src/intel/vulkan/anv_image.c   |  5 ++--
 src/intel/vulkan/anv_intel.c   |  2 +-
 src/intel/vulkan/anv_private.h | 38 +++---
 src/intel/vulkan/gen7_cmd_buffer.c |  3 ++-
 src/intel/vulkan/gen8_cmd_buffer.c |  3 ++-
 src/intel/vulkan/genX_cmd_buffer.c | 18 +++---
 src/intel/vulkan/genX_gpu_memcpy.c |  5 ++--
 src/intel/vulkan/genX_state.c  |  6 +
 12 files changed, 80 insertions(+), 38 deletions(-)

diff --git a/src/intel/vulkan/anv_allocator.c b/src/intel/vulkan/anv_allocator.c
index ab01d46cbeb..f62d48ae3fe 100644
--- a/src/intel/vulkan/anv_allocator.c
+++ b/src/intel/vulkan/anv_allocator.c
@@ -1253,7 +1253,8 @@ anv_bo_cache_lookup(struct anv_bo_cache *cache, uint32_t 
gem_handle)
(EXEC_OBJECT_WRITE | \
 EXEC_OBJECT_ASYNC | \
 EXEC_OBJECT_SUPPORTS_48B_ADDRESS | \
-EXEC_OBJECT_PINNED)
+EXEC_OBJECT_PINNED | \
+ANV_BO_EXTERNAL)
 
 VkResult
 anv_bo_cache_alloc(struct anv_device *device,
@@ -1311,6 +1312,7 @@ anv_bo_cache_import(struct anv_device *device,
 struct anv_bo **bo_out)
 {
assert(bo_flags == (bo_flags & ANV_BO_CACHE_SUPPORTED_FLAGS));
+   assert(bo_flags & ANV_BO_EXTERNAL);
 
pthread_mutex_lock(&cache->mutex);
 
@@ -1327,7 +1329,7 @@ anv_bo_cache_import(struct anv_device *device,
* client has imported a BO twice in different ways and they get what
* they have coming.
*/
-  uint64_t new_flags = 0;
+  uint64_t new_flags = ANV_BO_EXTERNAL;
   new_flags |= (bo->bo.flags | bo_flags) & EXEC_OBJECT_WRITE;
   new_flags |= (bo->bo.flags & bo_flags) & EXEC_OBJECT_ASYNC;
   new_flags |= (bo->bo.flags & bo_flags) & 
EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
@@ -1411,6 +1413,12 @@ anv_bo_cache_export(struct anv_device *device,
assert(anv_bo_cache_lookup(cache, bo_in->gem_handle) == bo_in);
struct anv_cached_bo *bo = (struct anv_cached_bo *)bo_in;
 
+   /* This BO must have been flagged external in order for us to be able
+* to export it.  This is done based on external options passed into
+* anv_AllocateMemory.
+*/
+   assert(bo->bo.flags & ANV_BO_EXTERNAL);
+
int fd = anv_gem_handle_to_fd(device, bo->bo.gem_handle);
if (fd < 0)
   return vk_error(VK_ERROR_TOO_MANY_OBJECTS);
diff --git a/src/intel/vulkan/anv_batch_chain.c 
b/src/intel/vulkan/anv_batch_chain.c
index 0f7c8325ea4..3e13553ac18 100644
--- a/src/intel/vulkan/anv_batch_chain.c
+++ b/src/intel/vulkan/anv_batch_chain.c
@@ -1088,7 +1088,7 @@ anv_execbuf_add_bo(struct anv_execbuf *exec,
   obj->relocs_ptr = 0;
   obj->alignment = 0;
   obj->offset = bo->offset;
-  obj->flags = bo->flags | extra_flags;
+  obj->flags = (bo->flags & ~ANV_BO_FLAG_MASK) | extra_flags;
   obj->rsvd1 = 0;
   obj->rsvd2 = 0;
}
diff --git a/src/intel/vulkan/anv_blorp.c b/src/intel/vulkan/anv_blorp.c
index fa7936d0981..29ed6b2ee35 100644
--- a/src/intel/vulkan/anv_blorp.c
+++ b/src/intel/vulkan/anv_blorp.c
@@ -156,7 +156,7 @@ get_blorp_surf_for_anv_buffer(struct anv_device *device,
   .addr = {
  .buffer = buffer->address.bo,
  .offset = buffer->address.offset + offset,
- .mocs = device->default_mocs,
+ .mocs = anv_mocs_for_bo(device, buffer->address.bo),
   },
};
 
@@ -209,7 +209,7 @@ get_blorp_surf_for_anv_image(const struct anv_device 
*device,
   .addr = {
  .buffer = image->planes[plane].address.bo,
  .offset = image->planes[plane].address.offset + surface->offset,
- .mocs = device->default_mocs,
+ .mocs = anv_mocs_for_bo(device, image->planes[plane].address.bo),
   },
};
 
@@ -219,7 +219,7 @@ get_blorp_surf_for_anv_image(const struct anv_device 
*device,
   blorp_surf->aux_addr = (struct blorp_address) {
  .buffer = image->planes[plane].address.bo,
  .offset = image->planes[plane].address.offset + aux_surface->offset,
- .mocs = device->default_mocs,
+ .mocs = anv_mocs_for_bo(device, 

Re: [Mesa-dev] Meson-windows v4 (9/21/2018 rebase): LLVM linking problems

2018-10-02 Thread Liviu Prodea
 





On Tuesday, October 2, 2018, 8:08:39 PM GMT+3, Dylan Baker 
 wrote:  
 
 Quoting Liviu Prodea (2018-10-02 08:08:41)
> Made a comprehensive test of this patch series and I still stumbled upon some
> big problems:
> 
> 1. Automatic LLVM linking via llvm-config if used by adding LLVM bin folder to
> PATH results in build failure with 'llvm-c/Core.h' not found in src/gallium/
> auxiliary/gallivm/lp_bld.h. Appveyor CI from https://ci.appveyor.com/project/
> dcbaker/mesa didn't encounter this as it is using llvm-wrap option.

It's possible that llvm-config wrapping on windows is broken atm, it is on
macos, I have pull request open, https://github.com/mesonbuild/meson/pull/4283.
I'll see if that fixes windows as well, or if we need some more work there.

> 2. Even if build succeeds with LLVM linked via llvm-wrap and everything 
> looking
> good at first glance, llvmpipe and swr if it was built cannot be selected.
> GALLIUM_DRIVER variable has no effect. You only get softpipe despite
> opengl32.dll file looking big enough and swrAVX-0.dll and swrAVX2-0.dll being
> generated when expected. Even when having LLVM built dynamically to avoid /MD
> to /MT override warnings and building Mesa3D with default c_args and cpp_args
> this issue is still in effect.
> 
> 3. Meson 0.48.0 doesn't pass the /MT or /MTd c_args and cpp_args for some
> unexplained reasons which leads to build failure if LLVM is not built with 
> /MD.

Meson 0.48 has added a new option to allow you to pick which crt you want:

https://mesonbuild.com/Release-notes-for-0-48-0.html#toggles-for-build-type-optimization-and-vcrt-type

The list of options are here:

https://mesonbuild.com/Builtin-options.html#base-options

I'll test and see if I can add b_vscrt=from_buildtype to the default options
without requiring a bump to 0.48.0 for the whole project.

Dylan
---
-Db_vscrt=mt doesn't help. I use LLVM built with /MT. I still get a bunch of

error LNK2038: mismatch detected for 'RuntimeLibrary': value 
'MT_StaticRelease' doesn't match value 'MD_DynamicRelease'

As for why I don't get llvmpipe and swr to work when Mesa3D and LLVM CRT 
linking match when using manual llvm-wrap option I think the explanation is 
highlighted by Appveyor: 
https://ci.appveyor.com/project/dcbaker/mesa/build/job/k02oo9qfyuxaxpgi?fullLog=true#L221
Looking at line 224, LLVM version is reported as undefined. This can't be good 
and I am seeing this as well. Automatic wrap with llvm-config doesn't have this 
problem. Unfortunately it fails to find the headers as already reported.   


[Mesa-dev] [Bug 108135] AVX instructions leak outside of CPU feature check and cause SIGILL

2018-10-02 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108135

Bug ID: 108135
   Summary: AVX instructions leak outside of CPU feature check and
cause SIGILL
   Product: Mesa
   Version: git
  Hardware: Other
OS: All
Status: NEW
  Severity: normal
  Priority: medium
 Component: Drivers/Gallium/swr
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: thi...@kde.org
QA Contact: mesa-dev@lists.freedesktop.org

Reported at https://github.com/clearlinux/distribution/issues/210

Symptom: when running certain applications (Xwayland in our case) on an Atom
CPU, the application crashes with SIGILL. The backtrace:

#7 _Z41__static_initialization_and_destruction_0ii.constprop.120() -
[swrast_dri.so] - lower_x86.cpp:73
#8 _GLOBAL__I_65535_0_gallium_dri_la_target.o.1657199() - [swrast_dri.so]
#9 call_init.part.0() - [ld-linux-x86-64.so.2] - dl-init.c:72
#10 _dl_init() - [ld-linux-x86-64.so.2] - dl-init.c:30
#11 dl_open_worker() - [ld-linux-x86-64.so.2] - dl-open.c:506
#12 _dl_catch_exception() - [libc.so.6] - dl-error-skeleton.c:196
#13 _dl_open() - [ld-linux-x86-64.so.2] - dl-open.c:588
#14 dlopen_doit() - [libdl.so.2] - dlopen.c:66
#15 _dl_catch_exception() - [libc.so.6] - dl-error-skeleton.c:196
#16 _dl_catch_error() - [libc.so.6] - dl-error-skeleton.c:215
#17 _dlerror_run() - [libdl.so.2] - dlerror.c:163
#18 dlopen@@GLIBC_2.2.5() - [libdl.so.2] - dlopen.c:87
#19 glxProbeDriver() - [/usr/bin/Xwayland] - glxdricommon.c:305
#20 __glXDRIscreenProbe() - [/usr/bin/Xwayland] - glxdriswrast.c:437
#21 xorgGlxServerInit() - [/usr/bin/Xwayland] - glxext.c:550
#22 _CallCallbacks() - [/usr/bin/Xwayland] - dixutils.c:737
#23 GlxExtensionInit() - [/usr/bin/Xwayland] - callback.h:83
#24 InitExtensions() - [/usr/bin/Xwayland] - miinitext.c:267
#25 dix_main() - [/usr/bin/Xwayland] - main.c:197
#26 __libc_start_main() - [libc.so.6] - libc-start.c:308
#27 _start() - [/usr/bin/Xwayland] - start.S:120

Investigation shows that the swrast_dri.so plugin contains AVX instructions
outside of a CPUID check in the "__static_initialization_and_destruction_0"
function. Further investigation shows some of those functions come from
src/gallium/drivers/swr/rasterizer/jitter/functionpasses/lower_x86.cpp, while
initialising the variable SwrJit::instructionMap (which is a global static
non-POD of type std::map - see
).

This lower_x86.cpp file is compiled with -mavx. This means the compiler is free
to generate AVX instructions anywhere in the file without a CPUID check,
including in the dynamic initialisation and destruction code that the std::map
variable requires.

The simplest solution I can think of, for this particular instruction, is to
change it from a global static to a local static. This will solve the problem
for this particular initialisation, but I think there are more variables with
this problem.
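
A minimal sketch of the proposed change, assuming a simplified table (the
key names, `instruction_table`, `lower_instruction` and the build counter are
hypothetical and only there to make the timing visible). A namespace-scope
std::map is constructed when the library is loaded; a function-local static
is constructed on the first call, so the construction can be kept behind the
CPU feature check:

```cpp
#include <cassert>
#include <map>
#include <string>

// Counts how many times the table is constructed (for demonstration only).
static int g_table_builds = 0;

static std::map<std::string, int> build_table() {
    ++g_table_builds;
    return {{"hypothetical.op.a", 1}, {"hypothetical.op.b", 2}};
}

// Function-local static: initialized on first use, not at load time.
static const std::map<std::string, int> &instruction_table() {
    static const std::map<std::string, int> table = build_table();
    return table;
}

// The table is only touched after the caller has confirmed CPU support, so
// its constructor never runs on a CPU that failed the check.
static int lower_instruction(const std::string &name, bool cpu_supports_avx) {
    if (!cpu_supports_avx)
        return -1;
    const auto &table = instruction_table();
    auto it = table.find(name);
    return it == table.end() ? -1 : it->second;
}
```

This only moves the initialisation; any other code in a file compiled with
-mavx can still contain AVX instructions, so every entry point into such a
file needs to sit behind the feature check as well.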

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


[Mesa-dev] [Bug 108135] AVX instructions leak outside of CPU feature check and cause SIGILL

2018-10-02 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108135

Thiago Macieira  changed:

   What|Removed |Added

   Assignee|mesa-dev@lists.freedesktop. |timothy.o.row...@intel.com
   |org |

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


Re: [Mesa-dev] [PATCH 5/6] r600: use build-id when available for disk cache

2018-10-02 Thread Marek Olšák
For the series:

Reviewed-by: Marek Olšák 

Marek
On Tue, Sep 18, 2018 at 10:14 PM Timothy Arceri  wrote:
>
> ---
>  src/gallium/drivers/r600/r600_pipe_common.c | 14 +++---
>  1 file changed, 7 insertions(+), 7 deletions(-)
>
> diff --git a/src/gallium/drivers/r600/r600_pipe_common.c 
> b/src/gallium/drivers/r600/r600_pipe_common.c
> index f7cfd0d46a6..6b581242a18 100644
> --- a/src/gallium/drivers/r600/r600_pipe_common.c
> +++ b/src/gallium/drivers/r600/r600_pipe_common.c
> @@ -854,13 +854,13 @@ static void r600_disk_cache_create(struct 
> r600_common_screen *rscreen)
> if (rscreen->debug_flags & DBG_ALL_SHADERS)
> return;
>
> -   uint32_t mesa_timestamp;
> -   if (disk_cache_get_function_timestamp(r600_disk_cache_create,
> - &mesa_timestamp)) {
> -   char *timestamp_str;
> +   uint32_t mesa_id;
> +   if (disk_cache_get_function_identifier(r600_disk_cache_create,
> +  &mesa_id)) {
> +   char *mesa_id_str;
> int res = -1;
>
> -   res = asprintf(&timestamp_str, "%u",mesa_timestamp);
> +   res = asprintf(&mesa_id_str, "%u", mesa_id);
> if (res != -1) {
> /* These flags affect shader compilation. */
> uint64_t shader_debug_flags =
> @@ -870,9 +870,9 @@ static void r600_disk_cache_create(struct 
> r600_common_screen *rscreen)
>
> rscreen->disk_shader_cache =
> 
> disk_cache_create(r600_get_family_name(rscreen),
> - timestamp_str,
> + mesa_id_str,
>   shader_debug_flags);
> -   free(timestamp_str);
> +   free(mesa_id_str);
> }
> }
>  }
> --
> 2.17.1
>


Re: [Mesa-dev] [PATCH 2/3] configure: allow building with python3

2018-10-02 Thread Dylan Baker
I haven't had a chance to do that yet. I'll see if I can find some time this
week.

Quoting Emil Velikov (2018-10-02 08:02:12)
> Hi Dylan,
> 
> On Mon, 3 Sep 2018 at 14:57, Emil Velikov  wrote:
> >
> > On 24 August 2018 at 19:51, Dylan Baker  wrote:
> > > Can we just change the script to write a file instead of sending its
> > > output through the shell? That should fix any encoding problems since
> > > the shell won't touch it and the LANG settings (no matter what they
> > > are) shouldn't matter.
> > >
> > Seems like I forgot to reply to this. Yes, please - that would be
> > highly preferred.
> >
> Did you get the chance to do this?
> 
> Thanks
> Emil




Re: [Mesa-dev] [PATCH mesa 1/4] meson: turn git_sha1.h target into a proper dependency

2018-10-02 Thread Dylan Baker
Quoting Eric Engestrom (2018-10-02 07:09:03)
> Cc: mesa-sta...@lists.freedesktop.org
> Signed-off-by: Eric Engestrom 
> ---
>  src/mesa/meson.build |  3 +--
>  src/meson.build  | 13 -
>  2 files changed, 9 insertions(+), 7 deletions(-)
> 
> diff --git a/src/mesa/meson.build b/src/mesa/meson.build
> index ea884977db8052d86fcb..861b0311048eff422b9f 100644
> --- a/src/mesa/meson.build
> +++ b/src/mesa/meson.build
> @@ -705,7 +705,6 @@ files_libmesa_common += [
>ir_expression_operation_h,
>main_remap_helper_h,
>matypes_h,
> -  sha1_h,
>  ]
>  
>  if with_sse41
> @@ -726,7 +725,7 @@ libmesa_classic = static_library(
>cpp_args : [cpp_vis_args, cpp_msvc_compat_args],
>include_directories : [inc_common, inc_libmesa_asm, 
> include_directories('main')],
>link_with : [libglsl, libmesa_sse41],
> -  dependencies : idep_nir_headers,
> +  dependencies : [idep_nir_headers, idep_git_sha1],
>build_by_default : false,
>  )
>  
> diff --git a/src/meson.build b/src/meson.build
> index af881cff70bf752a6474..89ffaddf47b7286e4fe0 100644
> --- a/src/meson.build
> +++ b/src/meson.build
> @@ -39,11 +39,14 @@ libglsl_util = static_library(
>build_by_default : false,
>  )
>  
> -sha1_h = custom_target(
> -  'git_sha1.h',
> -  output : 'git_sha1.h',
> -  command : [prog_python, git_sha1_gen_py, '--output', '@OUTPUT@'],
> -  build_always : true, # commit sha1 can change without having touched these 
> files
> +idep_git_sha1 = declare_dependency(
> +  sources : custom_target(
> +'git_sha1.h',
> +output : 'git_sha1.h',
> +command : [prog_python, git_sha1_gen_py, '--output', '@OUTPUT@'],
> +build_always : true, # commit sha1 can change without having touched 
> these files
> +  ),
> +  include_directories : inc_src,

What does this get us over including it in the source list? Since it's a .h,
meson should generate the proper include paths already, right?

Dylan




Re: [Mesa-dev] [PATCH] meson: Don't allow building EGL on on KMS systems or Haiku

2018-10-02 Thread Dylan Baker
That might be possible. I'll double check.

Dylan

Quoting Ilia Mirkin (2018-10-01 20:58:20)
> Shouldn't it be possible to use the x11 platform (+drisw)?
> On Mon, Oct 1, 2018 at 3:43 PM Dylan Baker  wrote:
> >
> > Currently mesa only supports EGL for KMS (Linux, *BSD) systems and
> > Haiku; we should actually enforce this. This fixes the default build on
> > MacOS.
> > ---
> >
> >  meson.build | 7 ++-
> >  1 file changed, 6 insertions(+), 1 deletion(-)
> >
> > diff --git a/meson.build b/meson.build
> > index 97693b91ecf..202f9d740d7 100644
> > --- a/meson.build
> > +++ b/meson.build
> > @@ -306,7 +306,10 @@ endif
> >
> >  _egl = get_option('egl')
> >  if _egl == 'auto'
> > -  with_egl = with_dri and with_shared_glapi and with_platforms
> > +  with_egl = (
> > +(system_has_kms_drm or with_platform_haiku) and
> > +with_dri and with_shared_glapi and with_platforms
> > +  )
> >  elif _egl == 'true'
> >if not with_dri
> >  error('EGL requires dri')
> > @@ -316,6 +319,8 @@ elif _egl == 'true'
> >  error('No platforms specified, consider 
> > -Dplatforms=drm,x11,surfaceless at least')
> >elif not ['disabled', 'dri'].contains(with_glx)
> >  error('EGL requires dri, but a GLX is being built without dri')
> > +  elif not (system_has_kms_drm or with_platform_haiku)
> > +error('EGL is not valid on systems that don\'t use KMS except Haiku.')
> >endif
> >with_egl = true
> >  else
> > --
> > 2.19.0
> >


Re: [Mesa-dev] [PATCH 1/5] util: import public domain code for integer division by a constant

2018-10-02 Thread Jason Ekstrand
On Tue, Oct 2, 2018 at 3:20 PM Marek Olšák  wrote:

>
> On Tue, Oct 2, 2018, 1:15 PM Jason Ekstrand  wrote:
>
>> Reading through things in a bit more detail, I do believe that importing
>> this version in some form would be better than using mine for a number of
>> reasons:
>>
>>  * It is better optimized for signed integers
>>  * The struct of division factors is much better than what I did.  (I did
>> consider a struct and discarded the idea; I was wrong).
>>  * Computation of the division factors doesn't involve N*2-bit
>> multiplication
>>  * The round-up algorithm here results in significantly better code than
>> the N+1-bit round-down.
>>  * I trust ridiculousfish to get this right more than I trust myself
>>
>> That said, I have a few caveats on merging this as-is:
>>
>>  * I would like to see some unit tests.  I already spent the time to
>> write some; they just have to be ported.
>>  * It needs to be adjusted to handle 64-bit integers (right now, it
>> appears to only work for num_bits <= 32)
>>  * We shouldn't define uint_t and sint_t in a header
>>
>> How do you want to proceed?
>>
>
> I don't have a plan. Anything that works for you would be OK with me, so
> if you wanna just rework it according to you, that's fine.
>

I'm not in too much of a hurry but I can probably rework it if you don't
get to it first.


> Changing the types is tricky. Template code in a C header included several
> times would work. C++ templates would be ideal.
>

We may be able to just s/uint_t/uint64_t/ for most of it.  I *think* the
functions to create the division parameters should fairly nicely
generalize.  It's the functions which actually do divisions that we'll have
to re-type for each size.
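
To illustrate the split between computing the division parameters once and
the per-size divide routine, here is a minimal 32-bit sketch of the classic
Granlund-Montgomery multiply-and-shift scheme. It is not the imported
ridiculousfish code: udiv32_info, udiv32_compute and udiv32_apply are
hypothetical names, and only the unsigned 32-bit case is covered.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical parameter struct: one multiplier and two shifts. */
struct udiv32_info {
   uint32_t multiplier;
   unsigned pre_shift;   /* 0 or 1 */
   unsigned post_shift;  /* ceil(log2(d)) - 1, clamped at 0 */
};

/* Compute the parameters once per divisor; 64-bit arithmetic is enough. */
static struct udiv32_info
udiv32_compute(uint32_t d)
{
   assert(d != 0);

   /* l = ceil(log2(d)) */
   unsigned l = 0;
   while (l < 32 && (UINT64_C(1) << l) < d)
      l++;

   struct udiv32_info info;
   /* m = floor(2^32 * (2^l - d) / d) + 1, which always fits in 32 bits. */
   info.multiplier =
      (uint32_t)(((UINT64_C(1) << 32) * ((UINT64_C(1) << l) - d)) / d + 1);
   info.pre_shift = l < 1 ? l : 1;
   info.post_shift = l > 0 ? l - 1 : 0;
   return info;
}

/* The per-size part: this is the routine that would need re-typing. */
static uint32_t
udiv32_apply(uint32_t n, struct udiv32_info info)
{
   /* High 32 bits of the 64-bit product. */
   uint32_t t = (uint32_t)(((uint64_t)info.multiplier * n) >> 32);
   /* t <= n, so the sum below cannot overflow 32 bits. */
   return (t + ((n - t) >> info.pre_shift)) >> info.post_shift;
}
```

A 64-bit variant of udiv32_apply would need the high half of a 128-bit
product, which is why the divide functions are the part that does not
generalize by a simple type substitution.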


> What's your timeframe for this? Mine is certainly more than a month.
>

Unsure.  I've got other (not the NIR pass) code pending that requires this
and will hopefully be landing in a month or so.  Sadly, I can't send it to
the ML yet.  In any case, I'd be looking at one to two months, probably.
I'm happy for it to be a matter of whoever gets to it first.

--Jason


Re: [Mesa-dev] [PATCH 1/5] util: import public domain code for integer division by a constant

2018-10-02 Thread Marek Olšák
On Tue, Oct 2, 2018, 1:15 PM Jason Ekstrand  wrote:

> Reading through things in a bit more detail, I do believe that importing
> this version in some form would be better than using mine for a number of
> reasons:
>
>  * It is better optimized for signed integers
>  * The struct of division factors is much better than what I did.  (I did
> consider a struct and discarded the idea; I was wrong).
>  * Computation of the division factors doesn't involve N*2-bit
> multiplication
>  * The round-up algorithm here results in significantly better code than
> the N+1-bit round-down.
>  * I trust ridiculousfish to get this right more than I trust myself
>
> That said, I have a few caveats on merging this as-is:
>
>  * I would like to see some unit tests.  I already spent the time to write
> some; they just have to be ported.
>  * It needs to be adjusted to handle 64-bit integers (right now, it
> appears to only work for num_bits <= 32)
>  * We shouldn't define uint_t and sint_t in a header
>
> How do you want to proceed?
>

I don't have a plan. Anything that works for you would be OK with me, so if
you wanna just rework it according to you, that's fine. Changing the types
is tricky. Template code in a C header included several times would work.
C++ templates would be ideal.

What's your timeframe for this? Mine is certainly more than a month.

Marek


> --Jason
>
>
> On Sun, Sep 23, 2018 at 11:58 AM Marek Olšák  wrote:
>
>> From: Marek Olšák 
>>
>> Compilers can use this to generate optimal code for integer division
>> by a constant.
>>
>> Additionally, an unsigned division by a uniform that is constant but not
>> known at compile time can still be optimized by passing 2-4 division
>> factors to the shader as uniforms and executing one of the fast_udiv*
>> variants. The signed division algorithm doesn't have this capability.
>> ---
>>  src/util/Makefile.sources |   2 +
>>  src/util/fast_idiv_by_const.c | 245
>> ++
>>  src/util/fast_idiv_by_const.h | 173 +
>>  src/util/meson.build  |   2 +
>>  4 files changed, 422 insertions(+)
>>  create mode 100644 src/util/fast_idiv_by_const.c
>>  create mode 100644 src/util/fast_idiv_by_const.h
>>
>> diff --git a/src/util/Makefile.sources b/src/util/Makefile.sources
>> index b562d6c..f741b2a 100644
>> --- a/src/util/Makefile.sources
>> +++ b/src/util/Makefile.sources
>> @@ -3,20 +3,22 @@ MESA_UTIL_FILES := \
>> bitscan.h \
>> bitset.h \
>> build_id.c \
>> build_id.h \
>> crc32.c \
>> crc32.h \
>> debug.c \
>> debug.h \
>> disk_cache.c \
>> disk_cache.h \
>> +   fast_idiv_by_const.c \
>> +   fast_idiv_by_const.h \
>> format_r11g11b10f.h \
>> format_rgb9e5.h \
>> format_srgb.h \
>> futex.h \
>> half_float.c \
>> half_float.h \
>> hash_table.c \
>> hash_table.h \
>> list.h \
>> macros.h \
>> diff --git a/src/util/fast_idiv_by_const.c b/src/util/fast_idiv_by_const.c
>> new file mode 100644
>> index 000..f247b66
>> --- /dev/null
>> +++ b/src/util/fast_idiv_by_const.c
>> @@ -0,0 +1,245 @@
>> +/*
>> + * Copyright © 2018 Advanced Micro Devices, Inc.
>> + *
>> + * Permission is hereby granted, free of charge, to any person obtaining
>> a
>> + * copy of this software and associated documentation files (the
>> "Software"),
>> + * to deal in the Software without restriction, including without
>> limitation
>> + * the rights to use, copy, modify, merge, publish, distribute,
>> sublicense,
>> + * and/or sell copies of the Software, and to permit persons to whom the
>> + * Software is furnished to do so, subject to the following conditions:
>> + *
>> + * The above copyright notice and this permission notice (including the
>> next
>> + * paragraph) shall be included in all copies or substantial portions of
>> the
>> + * Software.
>> + *
>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
>> EXPRESS OR
>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
>> MERCHANTABILITY,
>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT
>> SHALL
>> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
>> OTHER
>> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
>> ARISING
>> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
>> DEALINGS
>> + * IN THE SOFTWARE.
>> + */
>> +
>> +/* Imported from:
>> + *
>> https://raw.githubusercontent.com/ridiculousfish/libdivide/master/divide_by_constants_codegen_reference.c
>> + * Paper:
>> + *
>> http://ridiculousfish.com/files/faster_unsigned_division_by_constants.pdf
>> + *
>> + * The author, ridiculous_fish, wrote:
>> + *
>> + *  ''Reference implementations of computing and using the "magic number"
>> + *approach to dividing by constants, including codegen instructions.
>> + *The unsigned division 

[Mesa-dev] [Bug 108062] Mesa 18.2.0 and Mesa 18.2.1 RADV Freeze

2018-10-02 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108062

marco.grima...@gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|NEW                         |RESOLVED

--- Comment #1 from marco.grima...@gmail.com ---
Hi, 

new llvm7 plus updated mesa 18.2.1-2 solve the issue

Thanks.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH] st/va:Aligned image width and height to 16.

2018-10-02 Thread Ilia Mirkin
vlVaGetImage should respect the x/y/width/height. The surface size
need not have any correlation to the image size. Someone should
double-check the docs for how that function should work, but the
current logic seems completely bogus.
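
A minimal sketch of what respecting x/y/width/height means for the copy,
assuming a single 8-bit plane. copy_plane_rect and its parameters are
hypothetical and are not taken from image.c; the point is only that the
source and destination strides are independent and that each row copies
exactly width bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy a width x height rectangle starting at (x, y) in the source plane
 * to the top-left corner of the destination plane. */
static void
copy_plane_rect(uint8_t *dst, size_t dst_stride,
                const uint8_t *src, size_t src_stride,
                unsigned x, unsigned y, unsigned width, unsigned height)
{
   /* The requested rectangle must fit in one row of each buffer. */
   assert(width <= dst_stride && (size_t)x + width <= src_stride);

   for (unsigned row = 0; row < height; row++) {
      memcpy(dst + (size_t)row * dst_stride,
             src + (size_t)(y + row) * src_stride + x,
             width);
   }
}
```

With a 48x32 surface and a 40x24 image, each row then copies 40 bytes and
skips the remaining 8 bytes of surface padding.
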
On Tue, Oct 2, 2018 at 3:09 PM Koenig, Christian
 wrote:
>
> Well that's the completely wrong place for that.
>
> The stride of the surface is determined by addrlib. That one should handle 
> aligning the parameters.
>
> Christian.
>
> On 02.10.2018 20:38, "Sharma, Deepak" wrote:
> Christian, the issue we are trying to address here is that vlVaGetImage
> doesn't use the width/height passed to the function. box.width is calculated
> from the surface, and that ends up giving the wrong stride for the dst buffer
> at the said resolution. So I was thinking of using the aligned width/height
> for vaCreateImage as well as for the surface. But as you said, that depends
> on the codec, so I think we can either align width/height based on the codec
> or use the passed width/height in vlVaGetImage to fix this issue.
>
> Thanks,
> Deepak
>
> -Original Message-
> From: Christian König 
> Sent: Tuesday, October 2, 2018 3:42 AM
> To: Sharma, Deepak ; mesa-dev@lists.freedesktop.org
> Cc: Guttula, Suresh 
> Subject: Re: [Mesa-dev] [PATCH] st/va:Aligned image width and height to 16.
>
> On 02.10.2018 at 03:47, Sharma, Deepak wrote:
> > From: suresh guttula 
> >
> > When decoding a resolution like 40x24, the surface video buffer is always
> > allocated aligned to the macroblock width/height, which is 16. But when
> > the application tries to get the data after decoding through
> > vaCreateImage/vaGetImage, the image width/height are only aligned to 2,
> > resulting in a smaller image buffer, which causes the memory stomping issue.
>
> Well NAK. It depends on the codec if the picture needs to be aligned to
> 16 or not.
>
> For example VC-1 would create decoding errors with that.
>
> Regards,
> Christian.
>
> >
> > Signed-off-by: suresh guttula 
> > ---
> >   src/gallium/state_trackers/va/image.c | 4 ++--
> >   1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/src/gallium/state_trackers/va/image.c
> > b/src/gallium/state_trackers/va/image.c
> > index 3f892c9..2fc47b7 100644
> > --- a/src/gallium/state_trackers/va/image.c
> > +++ b/src/gallium/state_trackers/va/image.c
> > @@ -123,8 +123,8 @@ vlVaCreateImage(VADriverContextP ctx, VAImageFormat 
> > *format, int width, int heig
> >  img->format = *format;
> >  img->width = width;
> >  img->height = height;
> > -   w = align(width, 2);
> > -   h = align(height, 2);
> > +   w = align(width, 16);
> > +   h = align(height, 16);
> >
> >  switch (format->fourcc) {
> >  case VA_FOURCC('N','V','1','2'):
>


Re: [Mesa-dev] [PATCH mesa v2] i965: searching the cache doesn't need to modify it

2018-10-02 Thread Kenneth Graunke
The best way to make this code more readable is to replace it entirely.
It's horribly overcomplicated for what is basically an open-coded hash
table.  I've had various starts at that...but never quite managed to
finish. :(  You could probably grab iris_program_cache.c from my iris
branch, make it work with one buffer again, and try and just use
hash_table...

But, that's a lot of work.

This is probably fine...I think you can get into const casting hell when
you start pulling out prog_data from a const cache entry and handing it back
as a non-const pointer.  Everything goes through void * though, so it
ends up working for now...
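
To make the const point concrete: in C, a const-qualified struct makes its
pointer members const pointers, not pointers to const, so the pointee can
still be handed back as non-const without a cast. A minimal sketch with
hypothetical types (cache_item and lookup_prog_data are not the i965 code):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cache entry; prog_data is an opaque payload. */
struct cache_item {
   int key;
   void *prog_data;
};

/* Search without modifying the cache, yet return a writable payload. */
static void *
lookup_prog_data(const struct cache_item *items, size_t count, int key)
{
   assert(items != NULL || count == 0);

   for (size_t i = 0; i < count; i++) {
      const struct cache_item *c = &items[i];
      if (c->key == key)
         return c->prog_data;   /* member type is `void *const`; no cast */
   }
   return NULL;
}
```

The compiler would only start complaining if prog_data were reached through
a pointer-to-const member, which is where the casting would begin.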

Acked-by: Kenneth Graunke 

On Tuesday, October 2, 2018 7:30:09 AM PDT Eric Engestrom wrote:
> Ping?
> I'm just adding `const` to make it easier to read and understand the
> code, and allow the compiler to tell us if we make a mistake and start
> modifying things we shouldn't.
> 
> On Tuesday, 2018-08-07 12:02:53 +0100, Eric Engestrom wrote:
> > Signed-off-by: Eric Engestrom 
> > ---
> > v2: forgot the hunk that was the point of this :facepalm:
> > ---
> >  src/mesa/drivers/dri/i965/brw_program_cache.c | 12 ++--
> >  1 file changed, 6 insertions(+), 6 deletions(-)
> > 
> > diff --git a/src/mesa/drivers/dri/i965/brw_program_cache.c 
> > b/src/mesa/drivers/dri/i965/brw_program_cache.c
> > index 600b0611c8b89095e393..a9a21d911612f9218e2a 100644
> > --- a/src/mesa/drivers/dri/i965/brw_program_cache.c
> > +++ b/src/mesa/drivers/dri/i965/brw_program_cache.c
> > @@ -142,11 +142,11 @@ brw_cache_item_equals(const struct brw_cache_item *a,
> >(memcmp(a->key, b->key, a->key_size) == 0);
> >  }
> >  
> > -static struct brw_cache_item *
> > -search_cache(struct brw_cache *cache, GLuint hash,
> > - struct brw_cache_item *lookup)
> > +static const struct brw_cache_item *
> > +search_cache(const struct brw_cache *cache, GLuint hash,
> > + const struct brw_cache_item *lookup)
> >  {
> > -   struct brw_cache_item *c;
> > +   const struct brw_cache_item *c;
> >  
> >  #if 0
> > int bucketcount = 0;
> > @@ -194,11 +194,11 @@ rehash(struct brw_cache *cache)
> >   * Returns the buffer object matching cache_id and key, or NULL.
> >   */
> >  bool
> > -brw_search_cache(struct brw_cache *cache, enum brw_cache_id cache_id,
> > +brw_search_cache(const struct brw_cache *cache, enum brw_cache_id cache_id,
> >   const void *key, GLuint key_size, uint32_t *inout_offset,
> >   void *inout_prog_data, bool flag_state)
> >  {
> > -   struct brw_cache_item *item;
> > +   const struct brw_cache_item *item;
> > struct brw_cache_item lookup;
> > GLuint hash;
> >  


Re: [Mesa-dev] [PATCH] st/va:Aligned image width and height to 16.

2018-10-02 Thread Koenig, Christian
Well that's the completely wrong place for that.

The stride of the surface is determined by addrlib. That one should handle 
aligning the parameters.

Christian.

On 02.10.2018 20:38, "Sharma, Deepak" wrote:
Christian, the issue we are trying to address here is that vlVaGetImage
doesn't use the width/height passed to the function. box.width is calculated
from the surface, and that ends up giving the wrong stride for the dst buffer
at the said resolution. So I was thinking of using the aligned width/height
for vaCreateImage as well as for the surface. But as you said, that depends
on the codec, so I think we can either align width/height based on the codec
or use the passed width/height in vlVaGetImage to fix this issue.

Thanks,
Deepak

-Original Message-
From: Christian König 
Sent: Tuesday, October 2, 2018 3:42 AM
To: Sharma, Deepak ; mesa-dev@lists.freedesktop.org
Cc: Guttula, Suresh 
Subject: Re: [Mesa-dev] [PATCH] st/va:Aligned image width and height to 16.

On 02.10.2018 at 03:47, Sharma, Deepak wrote:
> From: suresh guttula 
>
> When decoding a resolution like 40x24, the surface video buffer is always
> allocated aligned to the macroblock width/height, which is 16. But when
> the application tries to get the data after decoding through
> vaCreateImage/vaGetImage, the image width/height are only aligned to 2,
> resulting in a smaller image buffer, which causes the memory stomping issue.

Well NAK. It depends on the codec if the picture needs to be aligned to
16 or not.

For example VC-1 would create decoding errors with that.

Regards,
Christian.

>
> Signed-off-by: suresh guttula 
> ---
>   src/gallium/state_trackers/va/image.c | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/src/gallium/state_trackers/va/image.c
> b/src/gallium/state_trackers/va/image.c
> index 3f892c9..2fc47b7 100644
> --- a/src/gallium/state_trackers/va/image.c
> +++ b/src/gallium/state_trackers/va/image.c
> @@ -123,8 +123,8 @@ vlVaCreateImage(VADriverContextP ctx, VAImageFormat 
> *format, int width, int heig
>  img->format = *format;
>  img->width = width;
>  img->height = height;
> -   w = align(width, 2);
> -   h = align(height, 2);
> +   w = align(width, 16);
> +   h = align(height, 16);
>
>  switch (format->fourcc) {
>  case VA_FOURCC('N','V','1','2'):



Re: [Mesa-dev] [PATCH] st/va:Aligned image width and height to 16.

2018-10-02 Thread Sharma, Deepak
Christian, the issue we are trying to address here is that vlVaGetImage
doesn't use the width/height passed to the function. box.width is calculated
from the surface, and that ends up giving the wrong stride for the dst buffer
at the said resolution. So I was thinking of using the aligned width/height
for vaCreateImage as well as for the surface. But as you said, that depends
on the codec, so I think we can either align width/height based on the codec
or use the passed width/height in vlVaGetImage to fix this issue.
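
The size mismatch for the 40x24 case can be worked through with a small
sketch. It assumes NV12 (12 bits per pixel) and uses the hypothetical helpers
align_up and nv12_size, which are not the state tracker's own functions.

```c
#include <assert.h>
#include <stddef.h>

/* Round value up to a multiple of alignment (must be a power of two). */
static unsigned
align_up(unsigned value, unsigned alignment)
{
   assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
   return (value + alignment - 1) & ~(alignment - 1);
}

/* Bytes in an NV12 image: a full-size Y plane plus a half-size UV plane. */
static size_t
nv12_size(unsigned width, unsigned height)
{
   return (size_t)width * height * 3 / 2;
}
```

With alignment 2 the image stays 40x24, or 1440 bytes, while the surface
aligned to 16 is 48x32, or 2304 bytes, so a surface-sized copy into the image
buffer would overrun it by 864 bytes.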

Thanks,
Deepak

-Original Message-
From: Christian König  
Sent: Tuesday, October 2, 2018 3:42 AM
To: Sharma, Deepak ; mesa-dev@lists.freedesktop.org
Cc: Guttula, Suresh 
Subject: Re: [Mesa-dev] [PATCH] st/va:Aligned image width and height to 16.

On 02.10.2018 at 03:47, Sharma, Deepak wrote:
> From: suresh guttula 
>
> When decoding a resolution like 40x24, the surface video buffer is always
> allocated aligned to the macroblock width/height, which is 16. But when
> the application tries to get the data after decoding through
> vaCreateImage/vaGetImage, the image width/height are only aligned to 2,
> resulting in a smaller image buffer, which causes the memory stomping issue.

Well NAK. It depends on the codec if the picture needs to be aligned to
16 or not.

For example VC-1 would create decoding errors with that.

Regards,
Christian.

>
> Signed-off-by: suresh guttula 
> ---
>   src/gallium/state_trackers/va/image.c | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/src/gallium/state_trackers/va/image.c 
> b/src/gallium/state_trackers/va/image.c
> index 3f892c9..2fc47b7 100644
> --- a/src/gallium/state_trackers/va/image.c
> +++ b/src/gallium/state_trackers/va/image.c
> @@ -123,8 +123,8 @@ vlVaCreateImage(VADriverContextP ctx, VAImageFormat 
> *format, int width, int heig
>  img->format = *format;
>  img->width = width;
>  img->height = height;
> -   w = align(width, 2);
> -   h = align(height, 2);
> +   w = align(width, 16);
> +   h = align(height, 16);
>   
>  switch (format->fourcc) {
>  case VA_FOURCC('N','V','1','2'):



[Mesa-dev] [PATCH 4/9] i965: Map the query results for the life of the bo

2018-10-02 Thread Chris Wilson
If we map the bo upon creation, we can avoid the latency of mmapping it
when querying, and later use the asynchronous, persistent map of the
predicate to do a quick query.

v2: Inline the wait on results; it disappears shortly in the next few
patches.
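
As a rough illustration of the idea (not the i965 code), the sketch below
maps a mock buffer once at allocation and reads results through the cached
pointer afterwards. All sketch_* names are hypothetical, and plain heap
memory stands in for the bo.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Mock buffer object: heap memory stands in for GPU-visible memory. */
struct sketch_bo {
   uint64_t *storage;
   unsigned map_calls;   /* counts how often a map was paid for */
};

struct sketch_query {
   struct sketch_bo bo;
   uint64_t *results;    /* mapping cached for the life of bo */
};

static uint64_t *
sketch_bo_map(struct sketch_bo *bo)
{
   bo->map_calls++;
   return bo->storage;
}

/* Allocate the bo and map it once, up front. Returns 0 on success. */
static int
sketch_query_alloc(struct sketch_query *q)
{
   q->bo.map_calls = 0;
   q->bo.storage = calloc(512, sizeof(uint64_t));
   if (q->bo.storage == NULL)
      return -1;
   q->results = sketch_bo_map(&q->bo);
   return 0;
}

/* Read start/end timestamps through the cached mapping; no map or unmap. */
static uint64_t
sketch_query_elapsed(const struct sketch_query *q)
{
   assert(q->results != NULL);
   return q->results[1] - q->results[0];
}

static void
sketch_query_free(struct sketch_query *q)
{
   free(q->bo.storage);
   q->bo.storage = NULL;
   q->results = NULL;
}
```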

Signed-off-by: Chris Wilson 
Cc: Kenneth Graunke 
Cc: Matt Turner 
---
 src/mesa/drivers/dri/i965/brw_context.h   |  1 +
 src/mesa/drivers/dri/i965/gen6_queryobj.c | 40 ---
 2 files changed, 29 insertions(+), 12 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index 3014fa68aff..840332294e6 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -459,6 +459,7 @@ struct brw_query_object {
 
/** Last query BO associated with this query. */
struct brw_bo *bo;
+   uint64_t *results;
 
/** Last index in bo with query data for this object. */
int last_index;
diff --git a/src/mesa/drivers/dri/i965/gen6_queryobj.c 
b/src/mesa/drivers/dri/i965/gen6_queryobj.c
index ffdee4040fc..17c10b135d1 100644
--- a/src/mesa/drivers/dri/i965/gen6_queryobj.c
+++ b/src/mesa/drivers/dri/i965/gen6_queryobj.c
@@ -229,7 +229,9 @@ gen6_queryobj_get_results(struct gl_context *ctx,
if (query->bo == NULL)
   return;
 
-   uint64_t *results = brw_bo_map(brw, query->bo, MAP_READ);
+   brw_bo_wait_rendering(query->bo);
+   uint64_t *results = query->results;
+
switch (query->Base.Target) {
case GL_TIME_ELAPSED:
   /* The query BO contains the starting and ending timestamps.
@@ -304,7 +306,6 @@ gen6_queryobj_get_results(struct gl_context *ctx,
default:
   unreachable("Unrecognized query target in brw_queryobj_get_results()");
}
-   brw_bo_unmap(query->bo);
 
/* Now that we've processed the data stored in the query's buffer object,
 * we can release it.
@@ -315,6 +316,25 @@ gen6_queryobj_get_results(struct gl_context *ctx,
query->Base.Ready = true;
 }
 
+static int
+gen6_alloc_query(struct brw_context *brw, struct brw_query_object *query)
+{
+   /* Since we're starting a new query, we need to throw away old results. */
+   brw_bo_unreference(query->bo);
+
+   query->bo = brw_bo_alloc(brw->bufmgr,
+"query results", 4096,
+BRW_MEMZONE_OTHER);
+   query->results = brw_bo_map(brw, query->bo,
+   MAP_COHERENT | MAP_PERSISTENT |
+   MAP_READ | MAP_ASYNC);
+
+   /* For ARB_query_buffer_object: The result is not available */
+   set_query_availability(brw, query, false);
+
+   return 0;
+}
+
 /**
  * Driver hook for glBeginQuery().
  *
@@ -326,15 +346,7 @@ gen6_begin_query(struct gl_context *ctx, struct 
gl_query_object *q)
 {
struct brw_context *brw = brw_context(ctx);
struct brw_query_object *query = (struct brw_query_object *)q;
-   const int idx = GEN6_QUERY_RESULTS;
-
-   /* Since we're starting a new query, we need to throw away old results. */
-   brw_bo_unreference(query->bo);
-   query->bo =
-  brw_bo_alloc(brw->bufmgr, "query results", 4096, BRW_MEMZONE_OTHER);
-
-   /* For ARB_query_buffer_object: The result is not available */
-   set_query_availability(brw, query, false);
+   const int idx = gen6_alloc_query(brw, query) + GEN6_QUERY_RESULTS;
 
switch (query->Base.Target) {
case GL_TIME_ELAPSED:
@@ -548,8 +560,12 @@ gen6_query_counter(struct gl_context *ctx, struct 
gl_query_object *q)
 {
struct brw_context *brw = brw_context(ctx);
struct brw_query_object *query = (struct brw_query_object *)q;
-   brw_query_counter(ctx, q);
+   const int idx = gen6_alloc_query(brw, query) + GEN6_QUERY_RESULTS;
+
+   brw_write_timestamp(brw, query->bo, idx);
set_query_availability(brw, query, true);
+
+   query->flushed = false;
 }
 
 /* Initialize Gen6+-specific query object functions. */
-- 
2.19.0



[Mesa-dev] [PATCH 1/9] i965: Check last known busy status on bo before asking the kernel

2018-10-02 Thread Chris Wilson
If we know the bo is idle (that is, we have not submitted a command buffer
referencing this bo since the last query) we can skip asking the kernel.
Note this may report a false negative if the target is being shared
between processes (exported via dmabuf or flink). To allow the caller
control over using the last known flag, the query is split into two.

v2: Check against external bo before trusting our own tracking.

Signed-off-by: Chris Wilson 
Cc: Kenneth Graunke 
Cc: Matt Turner 
---
 src/mesa/drivers/dri/i965/brw_bufmgr.c | 40 --
 src/mesa/drivers/dri/i965/brw_bufmgr.h | 11 +--
 2 files changed, 40 insertions(+), 11 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_bufmgr.c 
b/src/mesa/drivers/dri/i965/brw_bufmgr.c
index f1675b191c1..d9e8453787c 100644
--- a/src/mesa/drivers/dri/i965/brw_bufmgr.c
+++ b/src/mesa/drivers/dri/i965/brw_bufmgr.c
@@ -444,18 +444,40 @@ vma_free(struct brw_bufmgr *bufmgr,
}
 }
 
-int
+static int
+__brw_bo_busy(struct brw_bo *bo)
+{
+   struct drm_i915_gem_busy busy = { bo->gem_handle };
+
+   if (bo->idle && !bo->external)
+  return 0;
+
+   /* If we hit an error here, it means that bo->gem_handle is invalid.
+* Treat it as being idle (busy.busy is left as 0) and move along.
+*/
+   drmIoctl(bo->bufmgr->fd, DRM_IOCTL_I915_GEM_BUSY, &busy);
+
+   bo->idle = !busy.busy;
+   return busy.busy;
+}
+
+bool
 brw_bo_busy(struct brw_bo *bo)
 {
-   struct brw_bufmgr *bufmgr = bo->bufmgr;
-   struct drm_i915_gem_busy busy = { .handle = bo->gem_handle };
+   return __brw_bo_busy(bo);
+}
 
-   int ret = drmIoctl(bufmgr->fd, DRM_IOCTL_I915_GEM_BUSY, &busy);
-   if (ret == 0) {
-  bo->idle = !busy.busy;
-  return busy.busy;
-   }
-   return false;
+bool
+brw_bo_map_busy(struct brw_bo *bo, unsigned flags)
+{
+   unsigned mask;
+
+   if (flags & MAP_WRITE)
+  mask = ~0u;
+   else
+  mask = 0x;
+
+   return __brw_bo_busy(bo) & mask;
 }
 
 int
diff --git a/src/mesa/drivers/dri/i965/brw_bufmgr.h 
b/src/mesa/drivers/dri/i965/brw_bufmgr.h
index 32fc7a553c9..e1f46b091ce 100644
--- a/src/mesa/drivers/dri/i965/brw_bufmgr.h
+++ b/src/mesa/drivers/dri/i965/brw_bufmgr.h
@@ -323,10 +323,17 @@ int brw_bo_get_tiling(struct brw_bo *bo, uint32_t 
*tiling_mode,
 int brw_bo_flink(struct brw_bo *bo, uint32_t *name);
 
 /**
- * Returns 1 if mapping the buffer for write could cause the process
+ * Returns false if the buffer is not in active use by the GPU.
+ * If it returns true, any mapping for write could cause the process
  * to block, due to the object being active in the GPU.
  */
-int brw_bo_busy(struct brw_bo *bo);
+bool brw_bo_busy(struct brw_bo *bo);
+
+/**
+ * Returns true if mapping the buffer for the set of flags (i.e. MAP_READ or
+ * MAP_WRITE) will cause the process to block.
+ */
+bool brw_bo_map_busy(struct brw_bo *bo, unsigned flags);
 
 /**
  * Specify the volatility of the buffer.
-- 
2.19.0



[Mesa-dev] [PATCH 6/9] i965: Use 'available' fence for polling query results

2018-10-02 Thread Chris Wilson
If we always write the 'available' flag after writing the final result
of the query, we can probe that predicate to quickly query whether the
result is ready from userspace. The primary advantage of checking the
predicate is that it allows for more fine-grained queries, we do not
have to wait for the batch to finish before the query is marked as
ready.

We still do check the status of the batch after probing the query so
that if the worst happens and the batch did hang without completing the
query, we do not spin forever (although it is not as nice as completely
eliminating the ioctl, the busy-ioctl is lightweight!).

Signed-off-by: Chris Wilson 
Cc: Kenneth Graunke 
Cc: Matt Turner 
---
 src/mesa/drivers/dri/i965/brw_context.h   |  4 +-
 src/mesa/drivers/dri/i965/gen6_queryobj.c | 54 ++-
 2 files changed, 25 insertions(+), 33 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index 840332294e6..418941c9194 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -468,8 +468,8 @@ struct brw_query_object {
bool flushed;
 };
 
-#define GEN6_QUERY_PREDICATE (2)
-#define GEN6_QUERY_RESULTS (0)
+#define GEN6_QUERY_PREDICATE (0)
+#define GEN6_QUERY_RESULTS (1)
 
 static inline unsigned
 gen6_query_predicate_offset(const struct brw_query_object *query)
diff --git a/src/mesa/drivers/dri/i965/gen6_queryobj.c 
b/src/mesa/drivers/dri/i965/gen6_queryobj.c
index dc70e2a568a..b6832588333 100644
--- a/src/mesa/drivers/dri/i965/gen6_queryobj.c
+++ b/src/mesa/drivers/dri/i965/gen6_queryobj.c
@@ -40,8 +40,7 @@
 #include "intel_buffer_objects.h"
 
 static inline void
-set_query_availability(struct brw_context *brw, struct brw_query_object *query,
-   bool available)
+set_query_available(struct brw_context *brw, struct brw_query_object *query)
 {
/* For platforms that support ARB_query_buffer_object, we write the
 * query availability for "pipelined" queries.
@@ -58,22 +57,12 @@ set_query_availability(struct brw_context *brw, struct 
brw_query_object *query,
 * PIPE_CONTROL with an immediate write will synchronize with
 * those earlier writes, so we write 1 when the value has landed.
 */
-   if (brw->ctx.Extensions.ARB_query_buffer_object &&
-   brw_is_query_pipelined(query)) {
-  unsigned flags = PIPE_CONTROL_WRITE_IMMEDIATE;
 
-  if (available) {
- /* Order available *after* the query results. */
- flags |= PIPE_CONTROL_FLUSH_ENABLE;
-  } else {
- /* Make it unavailable *before* any pipelined reads. */
- flags |= PIPE_CONTROL_CS_STALL;
-  }
-
-  brw_emit_pipe_control_write(brw, flags,
-  query->bo, gen6_query_predicate_offset(query),
-  available);
-   }
+   brw_emit_pipe_control_write(brw,
+   PIPE_CONTROL_WRITE_IMMEDIATE |
+   PIPE_CONTROL_FLUSH_ENABLE,
+   query->bo, gen6_query_predicate_offset(query),
+   true);
 }
 
 static void
@@ -144,12 +133,12 @@ write_xfb_overflow_streams(struct gl_context *ctx,
 }
 
 static bool
-check_xfb_overflow_streams(uint64_t *results, int count)
+check_xfb_overflow_streams(const uint64_t *results, int count)
 {
bool overflow = false;
 
for (int i = 0; i < count; i++) {
-  uint64_t *result_i = &results[4 * i];
+  const uint64_t *result_i = &results[4 * i];
 
   if ((result_i[3] - result_i[2]) != (result_i[1] - result_i[0])) {
  overflow = true;
@@ -221,7 +210,8 @@ emit_pipeline_stat(struct brw_context *brw, struct brw_bo 
*bo,
  */
 static void
 gen6_queryobj_get_results(struct gl_context *ctx,
-  struct brw_query_object *query)
+  struct brw_query_object *query,
+  uint64_t *results)
 {
struct brw_context *brw = brw_context(ctx);
const struct gen_device_info *devinfo = &brw->screen->devinfo;
@@ -229,9 +219,6 @@ gen6_queryobj_get_results(struct gl_context *ctx,
if (query->bo == NULL)
   return;
 
-   brw_bo_wait_rendering(query->bo);
-   uint64_t *results = query->results;
-
switch (query->Base.Target) {
case GL_TIME_ELAPSED:
   /* The query BO contains the starting and ending timestamps.
@@ -329,10 +316,10 @@ gen6_alloc_query(struct brw_context *brw, struct 
brw_query_object *query)
 
query->results = brw_bo_map(brw, query->bo,
MAP_COHERENT | MAP_PERSISTENT |
-   MAP_READ | MAP_ASYNC);
+   MAP_READ | MAP_WRITE);
 
/* For ARB_query_buffer_object: The result is not available */
-   set_query_availability(brw, query, false);
+   query->results[GEN6_QUERY_PREDICATE] = false;
 
return 0;
 }
@@ -487,7 +474,7 @@ gen6_end_query(struct gl_context *ctx, struct 
gl_query_object *q)
query->flushed = 

[Mesa-dev] [PATCH 9/9] i965: Set query->flush after flushing the query

2018-10-02 Thread Chris Wilson
Skip the next check for brw_batch_references() by recording when we
flush the query.
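
The effect of recording the flush can be modelled in isolation. The sketch below is only an illustration: `fake_batch`, `fake_query` and `fake_batch_flush` are hypothetical stand-ins for the driver's batch, query object and intel_batchbuffer_flush(), and `references_bo` models the answer brw_batch_references() would give.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the batch and the query object. */
struct fake_batch {
   bool references_bo;    /* models brw_batch_references() */
   unsigned flush_count;  /* how many times the batch was "flushed" */
};

struct fake_query {
   bool flushed;
};

static void
fake_batch_flush(struct fake_batch *batch)
{
   batch->flush_count++;
   batch->references_bo = false;
}

static void
sketch_flush_batch_if_needed(struct fake_batch *batch, struct fake_query *query)
{
   /* Already recorded as flushed: skip the reference check entirely. */
   if (query->flushed)
      return;

   /* Flush only if the batch still references the query's BO. */
   if (batch->references_bo)
      fake_batch_flush(batch);

   /* Either way the BO is no longer in a pending batch. */
   query->flushed = true;
}
```

Calling the function a second time for the same query returns immediately, which is the check this patch saves.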
---
 src/mesa/drivers/dri/i965/gen6_queryobj.c | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen6_queryobj.c 
b/src/mesa/drivers/dri/i965/gen6_queryobj.c
index f3b9dd24624..d6e670c306e 100644
--- a/src/mesa/drivers/dri/i965/gen6_queryobj.c
+++ b/src/mesa/drivers/dri/i965/gen6_queryobj.c
@@ -509,14 +509,16 @@ gen6_end_query(struct gl_context *ctx, struct 
gl_query_object *q)
 static void
 flush_batch_if_needed(struct brw_context *brw, struct brw_query_object *query)
 {
+   if (query->flushed)
+  return;
+
/* If the batch doesn't reference the BO, it must have been flushed
 * (for example, due to being full).  Record that it's been flushed.
 */
-   query->flushed = query->flushed ||
-!brw_batch_references(&brw->batch, query->bo);
-
-   if (!query->flushed)
+   if (brw_batch_references(&brw->batch, query->bo))
   intel_batchbuffer_flush(brw);
+
+   query->flushed = true;
 }
 
 /**
-- 
2.19.0



[Mesa-dev] [PATCH 5/9] i965: Use snoop bo for accessing query results on !llc

2018-10-02 Thread Chris Wilson
On non-llc architectures, where we are primarily reading back the
results of the GPU queries, we can improve performance by using a
cacheable mapping of the results. Unfortunately, enabling snooping makes
the writes from the GPU slower, which may adversely affect pipelined
query operations (where the results are used directly by the GPU and not
CPU).

Signed-off-by: Chris Wilson 
Cc: Kenneth Graunke 
Cc: Matt Turner 
---
 src/mesa/drivers/dri/i965/brw_bufmgr.c| 24 +++
 src/mesa/drivers/dri/i965/brw_bufmgr.h|  2 ++
 src/mesa/drivers/dri/i965/gen6_queryobj.c |  2 ++
 3 files changed, 28 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_bufmgr.c 
b/src/mesa/drivers/dri/i965/brw_bufmgr.c
index d9e8453787c..3c3bdee3d2a 100644
--- a/src/mesa/drivers/dri/i965/brw_bufmgr.c
+++ b/src/mesa/drivers/dri/i965/brw_bufmgr.c
@@ -946,6 +946,30 @@ brw_bo_unreference(struct brw_bo *bo)
}
 }
 
+static bool
+__brw_bo_set_caching(struct brw_bo *bo, int caching)
+{
+   struct drm_i915_gem_caching arg = {
+  .handle = bo->gem_handle,
+  .caching = caching
+   };
+   return drmIoctl(bo->bufmgr->fd, DRM_IOCTL_I915_GEM_SET_CACHING, &arg) == 0;
+}
+
+void
+brw_bo_set_cache_coherent(struct brw_bo *bo)
+{
+   assert(!bo->external);
+   if (bo->cache_coherent)
+  return;
+
+   if (!__brw_bo_set_caching(bo, I915_CACHING_CACHED))
+  return;
+
+   bo->reusable = false;
+   bo->cache_coherent = true;
+}
+
 static void
 bo_wait_with_stall_warning(struct brw_context *brw,
struct brw_bo *bo,
diff --git a/src/mesa/drivers/dri/i965/brw_bufmgr.h 
b/src/mesa/drivers/dri/i965/brw_bufmgr.h
index e1f46b091ce..6f0fe54f79f 100644
--- a/src/mesa/drivers/dri/i965/brw_bufmgr.h
+++ b/src/mesa/drivers/dri/i965/brw_bufmgr.h
@@ -273,6 +273,8 @@ void brw_bo_unreference(struct brw_bo *bo);
 #define MAP_INTERNAL_MASK   (0xff << 24)
 #define MAP_RAW (0x01 << 24)
 
+void brw_bo_set_cache_coherent(struct brw_bo *bo);
+
 /**
  * Maps the buffer into userspace.
  *
diff --git a/src/mesa/drivers/dri/i965/gen6_queryobj.c 
b/src/mesa/drivers/dri/i965/gen6_queryobj.c
index 17c10b135d1..dc70e2a568a 100644
--- a/src/mesa/drivers/dri/i965/gen6_queryobj.c
+++ b/src/mesa/drivers/dri/i965/gen6_queryobj.c
@@ -325,6 +325,8 @@ gen6_alloc_query(struct brw_context *brw, struct 
brw_query_object *query)
query->bo = brw_bo_alloc(brw->bufmgr,
 "query results", 4096,
 BRW_MEMZONE_OTHER);
+   brw_bo_set_cache_coherent(query->bo);
+
query->results = brw_bo_map(brw, query->bo,
MAP_COHERENT | MAP_PERSISTENT |
MAP_READ | MAP_ASYNC);
-- 
2.19.0



[Mesa-dev] [PATCH 3/9] i965: Replace open-coded gen6 queryobj offsets with simple helpers

2018-10-02 Thread Chris Wilson
Lots of places open-coded the assumed layout of the predicate/results
within the query object; replace those with simple helpers.

v2: Fix function decl style.

Signed-off-by: Chris Wilson 
Cc: Kenneth Graunke 
Cc: Matt Turner 
---
 .../drivers/dri/i965/brw_conditional_render.c  | 10 --
 src/mesa/drivers/dri/i965/brw_context.h| 15 +++
 src/mesa/drivers/dri/i965/gen6_queryobj.c  |  6 +++---
 src/mesa/drivers/dri/i965/hsw_queryobj.c   | 18 +-
 4 files changed, 35 insertions(+), 14 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_conditional_render.c 
b/src/mesa/drivers/dri/i965/brw_conditional_render.c
index e33e79fb6ce..0177a7f80b4 100644
--- a/src/mesa/drivers/dri/i965/brw_conditional_render.c
+++ b/src/mesa/drivers/dri/i965/brw_conditional_render.c
@@ -87,8 +87,14 @@ set_predicate_for_occlusion_query(struct brw_context *brw,
 */
brw_emit_pipe_control_flush(brw, PIPE_CONTROL_FLUSH_ENABLE);
 
-   brw_load_register_mem64(brw, MI_PREDICATE_SRC0, query->bo, 0 /* offset */);
-   brw_load_register_mem64(brw, MI_PREDICATE_SRC1, query->bo, 8 /* offset */);
+   brw_load_register_mem64(brw,
+   MI_PREDICATE_SRC0,
+   query->bo,
+   gen6_query_results_offset(query, 0));
+   brw_load_register_mem64(brw,
+   MI_PREDICATE_SRC1,
+   query->bo,
+   gen6_query_results_offset(query, 1));
 }
 
 static void
diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index 7fd15669eb9..3014fa68aff 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -467,6 +467,21 @@ struct brw_query_object {
bool flushed;
 };
 
+#define GEN6_QUERY_PREDICATE (2)
+#define GEN6_QUERY_RESULTS (0)
+
+static inline unsigned
+gen6_query_predicate_offset(const struct brw_query_object *query)
+{
+   return GEN6_QUERY_PREDICATE * sizeof(uint64_t);
+}
+
+static inline unsigned
+gen6_query_results_offset(const struct brw_query_object *query, unsigned idx)
+{
+   return (GEN6_QUERY_RESULTS + idx) * sizeof(uint64_t);
+}
+
 struct brw_reloc_list {
struct drm_i915_gem_relocation_entry *relocs;
int reloc_count;
diff --git a/src/mesa/drivers/dri/i965/gen6_queryobj.c 
b/src/mesa/drivers/dri/i965/gen6_queryobj.c
index e3097e878aa..ffdee4040fc 100644
--- a/src/mesa/drivers/dri/i965/gen6_queryobj.c
+++ b/src/mesa/drivers/dri/i965/gen6_queryobj.c
@@ -71,7 +71,7 @@ set_query_availability(struct brw_context *brw, struct 
brw_query_object *query,
   }
 
   brw_emit_pipe_control_write(brw, flags,
-  query->bo, 2 * sizeof(uint64_t),
+  query->bo, gen6_query_predicate_offset(query),
   available);
}
 }
@@ -326,7 +326,7 @@ gen6_begin_query(struct gl_context *ctx, struct 
gl_query_object *q)
 {
struct brw_context *brw = brw_context(ctx);
struct brw_query_object *query = (struct brw_query_object *)q;
-   const int idx = 0;
+   const int idx = GEN6_QUERY_RESULTS;
 
/* Since we're starting a new query, we need to throw away old results. */
brw_bo_unreference(query->bo);
@@ -416,7 +416,7 @@ gen6_end_query(struct gl_context *ctx, struct 
gl_query_object *q)
 {
struct brw_context *brw = brw_context(ctx);
struct brw_query_object *query = (struct brw_query_object *)q;
-   const int idx = 1;
+   const int idx = GEN6_QUERY_RESULTS + 1;
 
switch (query->Base.Target) {
case GL_TIME_ELAPSED:
diff --git a/src/mesa/drivers/dri/i965/hsw_queryobj.c 
b/src/mesa/drivers/dri/i965/hsw_queryobj.c
index 24f52a7d752..120733c759a 100644
--- a/src/mesa/drivers/dri/i965/hsw_queryobj.c
+++ b/src/mesa/drivers/dri/i965/hsw_queryobj.c
@@ -191,7 +191,7 @@ load_overflow_data_to_cs_gprs(struct brw_context *brw,
   struct brw_query_object *query,
   int idx)
 {
-   int offset = idx * sizeof(uint64_t) * 4;
+   int offset = gen6_query_results_offset(query, 0) + idx * sizeof(uint64_t) * 4;
 
brw_load_register_mem64(brw, HSW_CS_GPR(1), query->bo, offset);
 
@@ -283,7 +283,7 @@ hsw_result_to_gpr0(struct gl_context *ctx, struct 
brw_query_object *query,
   brw_load_register_mem64(brw,
   HSW_CS_GPR(0),
   query->bo,
-  2 * sizeof(uint64_t));
+  gen6_query_predicate_offset(query));
   return;
}
 
@@ -300,7 +300,7 @@ hsw_result_to_gpr0(struct gl_context *ctx, struct 
brw_query_object *query,
   brw_load_register_mem64(brw,
   HSW_CS_GPR(0),
   query->bo,
-  0 * sizeof(uint64_t));
+  gen6_query_results_offset(query, 0));
} else if (query->Base.Target == 

[Mesa-dev] [PATCH 8/9] i965: Pass consistent args along gen6_queryobj.c

2018-10-02 Thread Chris Wilson
Be consistent in passing along brw_context rather than switching between
that and gl_context.

Signed-off-by: Chris Wilson 
---
 src/mesa/drivers/dri/i965/gen6_queryobj.c | 30 +++
 1 file changed, 14 insertions(+), 16 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen6_queryobj.c 
b/src/mesa/drivers/dri/i965/gen6_queryobj.c
index f73f29e8524..f3b9dd24624 100644
--- a/src/mesa/drivers/dri/i965/gen6_queryobj.c
+++ b/src/mesa/drivers/dri/i965/gen6_queryobj.c
@@ -101,11 +101,10 @@ write_xfb_primitives_written(struct brw_context *brw,
 }
 
 static void
-write_xfb_overflow_streams(struct gl_context *ctx,
+write_xfb_overflow_streams(struct brw_context *brw,
struct brw_bo *bo, int stream, int count,
int idx)
 {
-   struct brw_context *brw = brw_context(ctx);
const struct gen_device_info *devinfo = &brw->screen->devinfo;
 
brw_emit_mi_flush(brw);
@@ -209,16 +208,12 @@ emit_pipeline_stat(struct brw_context *brw, struct brw_bo 
*bo,
  * Wait on the query object's BO and calculate the final result.
  */
 static void
-gen6_queryobj_get_results(struct gl_context *ctx,
+gen6_queryobj_get_results(struct brw_context *brw,
   struct brw_query_object *query,
   uint64_t *results)
 {
-   struct brw_context *brw = brw_context(ctx);
const struct gen_device_info *devinfo = &brw->screen->devinfo;
 
-   if (query->bo == NULL)
-  return;
-
switch (query->Base.Target) {
case GL_TIME_ELAPSED:
   /* The query BO contains the starting and ending timestamps.
@@ -235,7 +230,7 @@ gen6_queryobj_get_results(struct gl_context *ctx,
   /* Ensure the scaled timestamp overflows according to
* GL_QUERY_COUNTER_BITS
*/
-  query->Base.Result &= (1ull << ctx->Const.QueryCounterBits.Timestamp) - 1;
+  query->Base.Result &= (1ull << brw->ctx.Const.QueryCounterBits.Timestamp) - 1;
   break;
 
case GL_SAMPLES_PASSED_ARB:
@@ -401,7 +396,7 @@ gen6_begin_query(struct gl_context *ctx, struct 
gl_query_object *q)
case GL_PRIMITIVES_GENERATED:
   write_primitives_generated(brw, query->bo, query->Base.Stream, idx);
   if (query->Base.Stream == 0)
- ctx->NewDriverState |= BRW_NEW_RASTERIZER_DISCARD;
+ brw->ctx.NewDriverState |= BRW_NEW_RASTERIZER_DISCARD;
   break;
 
case GL_TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN:
@@ -409,11 +404,11 @@ gen6_begin_query(struct gl_context *ctx, struct 
gl_query_object *q)
   break;
 
case GL_TRANSFORM_FEEDBACK_STREAM_OVERFLOW_ARB:
-  write_xfb_overflow_streams(ctx, query->bo, query->Base.Stream, 1, idx);
+  write_xfb_overflow_streams(brw, query->bo, query->Base.Stream, 1, idx);
   break;
 
case GL_TRANSFORM_FEEDBACK_OVERFLOW_ARB:
-  write_xfb_overflow_streams(ctx, query->bo, 0, MAX_VERTEX_STREAMS, idx);
+  write_xfb_overflow_streams(brw, query->bo, 0, MAX_VERTEX_STREAMS, idx);
   break;
 
case GL_VERTICES_SUBMITTED_ARB:
@@ -464,7 +459,7 @@ gen6_end_query(struct gl_context *ctx, struct 
gl_query_object *q)
case GL_PRIMITIVES_GENERATED:
   write_primitives_generated(brw, query->bo, query->Base.Stream, idx);
   if (query->Base.Stream == 0)
- ctx->NewDriverState |= BRW_NEW_RASTERIZER_DISCARD;
+ brw->ctx.NewDriverState |= BRW_NEW_RASTERIZER_DISCARD;
   break;
 
case GL_TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN:
@@ -472,11 +467,11 @@ gen6_end_query(struct gl_context *ctx, struct 
gl_query_object *q)
   break;
 
case GL_TRANSFORM_FEEDBACK_STREAM_OVERFLOW_ARB:
-  write_xfb_overflow_streams(ctx, query->bo, query->Base.Stream, 1, idx);
+  write_xfb_overflow_streams(brw, query->bo, query->Base.Stream, 1, idx);
   break;
 
case GL_TRANSFORM_FEEDBACK_OVERFLOW_ARB:
-  write_xfb_overflow_streams(ctx, query->bo, 0, MAX_VERTEX_STREAMS, idx);
+  write_xfb_overflow_streams(brw, query->bo, 0, MAX_VERTEX_STREAMS, idx);
   break;
 
   /* calculate overflow here */
@@ -535,6 +530,9 @@ static void gen6_wait_query(struct gl_context *ctx, struct 
gl_query_object *q)
struct brw_context *brw = brw_context(ctx);
struct brw_query_object *query = (struct brw_query_object *)q;
 
+   if (query->bo == NULL)
+  return;
+
/* If the application has requested the query result, but this batch is
 * still contributing to it, flush it now to finish that work so the
 * result will become available (eventually).
@@ -545,7 +543,7 @@ static void gen6_wait_query(struct gl_context *ctx, struct 
gl_query_object *q)
if (!results[GEN6_QUERY_PREDICATE]) /* not yet available, must wait */
   brw_bo_wait_rendering(query->bo);
 
-   gen6_queryobj_get_results(ctx, query, results + GEN6_QUERY_RESULTS);
+   gen6_queryobj_get_results(brw, query, results + GEN6_QUERY_RESULTS);
 }
 
 /**
@@ -577,7 +575,7 @@ static void gen6_check_query(struct gl_context *ctx, struct 
gl_query_object *q)
uint64_t *results = query->results;
   

[Mesa-dev] [PATCH 2/9] i965: Replace hard-coded indices with const named variables in gen6_queryobj

2018-10-02 Thread Chris Wilson
To simplify replacement later, replace repeated use of explicit 0/1 with
local variables of the same value.

Signed-off-by: Chris Wilson 
Cc: Kenneth Graunke 
Cc: Matt Turner 
---
 src/mesa/drivers/dri/i965/gen6_queryobj.c | 30 ---
 1 file changed, 16 insertions(+), 14 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen6_queryobj.c 
b/src/mesa/drivers/dri/i965/gen6_queryobj.c
index ce9bb474e18..e3097e878aa 100644
--- a/src/mesa/drivers/dri/i965/gen6_queryobj.c
+++ b/src/mesa/drivers/dri/i965/gen6_queryobj.c
@@ -326,6 +326,7 @@ gen6_begin_query(struct gl_context *ctx, struct 
gl_query_object *q)
 {
struct brw_context *brw = brw_context(ctx);
struct brw_query_object *query = (struct brw_query_object *)q;
+   const int idx = 0;
 
/* Since we're starting a new query, we need to throw away old results. */
brw_bo_unreference(query->bo);
@@ -356,31 +357,31 @@ gen6_begin_query(struct gl_context *ctx, struct 
gl_query_object *q)
* obtain the time elapsed.  Notably, this includes time elapsed while
* the system was doing other work, such as running other applications.
*/
-  brw_write_timestamp(brw, query->bo, 0);
+  brw_write_timestamp(brw, query->bo, idx);
   break;
 
case GL_ANY_SAMPLES_PASSED:
case GL_ANY_SAMPLES_PASSED_CONSERVATIVE:
case GL_SAMPLES_PASSED_ARB:
-  brw_write_depth_count(brw, query->bo, 0);
+  brw_write_depth_count(brw, query->bo, idx);
   break;
 
case GL_PRIMITIVES_GENERATED:
-  write_primitives_generated(brw, query->bo, query->Base.Stream, 0);
+  write_primitives_generated(brw, query->bo, query->Base.Stream, idx);
   if (query->Base.Stream == 0)
  ctx->NewDriverState |= BRW_NEW_RASTERIZER_DISCARD;
   break;
 
case GL_TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN:
-  write_xfb_primitives_written(brw, query->bo, query->Base.Stream, 0);
+  write_xfb_primitives_written(brw, query->bo, query->Base.Stream, idx);
   break;
 
case GL_TRANSFORM_FEEDBACK_STREAM_OVERFLOW_ARB:
-  write_xfb_overflow_streams(ctx, query->bo, query->Base.Stream, 1, 0);
+  write_xfb_overflow_streams(ctx, query->bo, query->Base.Stream, 1, idx);
   break;
 
case GL_TRANSFORM_FEEDBACK_OVERFLOW_ARB:
-  write_xfb_overflow_streams(ctx, query->bo, 0, MAX_VERTEX_STREAMS, 0);
+  write_xfb_overflow_streams(ctx, query->bo, 0, MAX_VERTEX_STREAMS, idx);
   break;
 
case GL_VERTICES_SUBMITTED_ARB:
@@ -394,7 +395,7 @@ gen6_begin_query(struct gl_context *ctx, struct 
gl_query_object *q)
case GL_COMPUTE_SHADER_INVOCATIONS_ARB:
case GL_TESS_CONTROL_SHADER_PATCHES_ARB:
case GL_TESS_EVALUATION_SHADER_INVOCATIONS_ARB:
-  emit_pipeline_stat(brw, query->bo, query->Base.Stream, query->Base.Target, 0);
+  emit_pipeline_stat(brw, query->bo, query->Base.Stream, query->Base.Target, idx);
   break;
 
default:
@@ -415,34 +416,35 @@ gen6_end_query(struct gl_context *ctx, struct 
gl_query_object *q)
 {
struct brw_context *brw = brw_context(ctx);
struct brw_query_object *query = (struct brw_query_object *)q;
+   const int idx = 1;
 
switch (query->Base.Target) {
case GL_TIME_ELAPSED:
-  brw_write_timestamp(brw, query->bo, 1);
+  brw_write_timestamp(brw, query->bo, idx);
   break;
 
case GL_ANY_SAMPLES_PASSED:
case GL_ANY_SAMPLES_PASSED_CONSERVATIVE:
case GL_SAMPLES_PASSED_ARB:
-  brw_write_depth_count(brw, query->bo, 1);
+  brw_write_depth_count(brw, query->bo, idx);
   break;
 
case GL_PRIMITIVES_GENERATED:
-  write_primitives_generated(brw, query->bo, query->Base.Stream, 1);
+  write_primitives_generated(brw, query->bo, query->Base.Stream, idx);
   if (query->Base.Stream == 0)
  ctx->NewDriverState |= BRW_NEW_RASTERIZER_DISCARD;
   break;
 
case GL_TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN:
-  write_xfb_primitives_written(brw, query->bo, query->Base.Stream, 1);
+  write_xfb_primitives_written(brw, query->bo, query->Base.Stream, idx);
   break;
 
case GL_TRANSFORM_FEEDBACK_STREAM_OVERFLOW_ARB:
-  write_xfb_overflow_streams(ctx, query->bo, query->Base.Stream, 1, 1);
+  write_xfb_overflow_streams(ctx, query->bo, query->Base.Stream, 1, idx);
   break;
 
case GL_TRANSFORM_FEEDBACK_OVERFLOW_ARB:
-  write_xfb_overflow_streams(ctx, query->bo, 0, MAX_VERTEX_STREAMS, 1);
+  write_xfb_overflow_streams(ctx, query->bo, 0, MAX_VERTEX_STREAMS, idx);
   break;
 
   /* calculate overflow here */
@@ -458,7 +460,7 @@ gen6_end_query(struct gl_context *ctx, struct 
gl_query_object *q)
case GL_TESS_CONTROL_SHADER_PATCHES_ARB:
case GL_TESS_EVALUATION_SHADER_INVOCATIONS_ARB:
   emit_pipeline_stat(brw, query->bo,
- query->Base.Stream, query->Base.Target, 1);
+ query->Base.Stream, query->Base.Target, idx);
   break;
 
default:
-- 
2.19.0


[Mesa-dev] [PATCH 7/9] i965: Pack simple pipelined query objects into the same buffer

2018-10-02 Thread Chris Wilson
Reuse the same query object buffer for multiple queries within the same
batch.

A task for the future is propagating the GL_NO_MEMORY errors.
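
A minimal sketch of the packing idea, under loud assumptions: the allocator is not visible in this excerpt, so `sketch_query_pool`, `sketch_query_pool_alloc` and the choice of three 64-bit slots per query (predicate, begin, end) are hypothetical. Only the 4096-byte buffer size, the initial `last_index = 4096` and the `(index + slot) * sizeof(uint64_t)` offset form come from the patch.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_BUFFER_SLOTS (4096 / sizeof(uint64_t)) /* 512 slots per BO */
#define SKETCH_SLOTS_PER_QUERY 3 /* assumption: predicate + begin + end */

struct sketch_query_pool {
   unsigned last_index;        /* next free slot in the current buffer */
   unsigned buffers_allocated; /* stands in for brw_bo_alloc() calls */
};

/* Hand out the first slot index for a new query, switching to a fresh
 * buffer when the current one cannot hold another query. */
static unsigned
sketch_query_pool_alloc(struct sketch_query_pool *pool)
{
   if (pool->last_index + SKETCH_SLOTS_PER_QUERY > SKETCH_BUFFER_SLOTS) {
      pool->buffers_allocated++;
      pool->last_index = 0;
   }

   unsigned index = pool->last_index;
   pool->last_index += SKETCH_SLOTS_PER_QUERY;
   return index;
}

/* Byte offset of a query's predicate, with the predicate in slot 0. */
static size_t
sketch_predicate_offset(unsigned index)
{
   return (index + 0) * sizeof(uint64_t);
}
```

Starting `last_index` at a value past the end of the buffer means the very first query takes the "buffer full" path, so no separate "no buffer yet" check is needed.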

Signed-off-by: Chris Wilson 
Cc: Kenneth Graunke 
Cc: Matt Turner 
---
 src/mesa/drivers/dri/i965/brw_context.c   |  3 ++
 src/mesa/drivers/dri/i965/brw_context.h   | 10 +++--
 src/mesa/drivers/dri/i965/brw_queryobj.c  | 16 +++
 src/mesa/drivers/dri/i965/gen6_queryobj.c | 51 ++-
 4 files changed, 59 insertions(+), 21 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
b/src/mesa/drivers/dri/i965/brw_context.c
index 6ba64e4e06d..53912c9c98e 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -953,6 +953,8 @@ brwCreateContext(gl_api api,
 
brw->isl_dev = screen->isl_dev;
 
+   brw->query.last_index = 4096;
+
brw->vs.base.stage = MESA_SHADER_VERTEX;
brw->tcs.base.stage = MESA_SHADER_TESS_CTRL;
brw->tes.base.stage = MESA_SHADER_TESS_EVAL;
@@ -1164,6 +1166,7 @@ intelDestroyContext(__DRIcontext * driContextPriv)
brw_bo_unreference(brw->tes.base.push_const_bo);
brw_bo_unreference(brw->gs.base.push_const_bo);
brw_bo_unreference(brw->wm.base.push_const_bo);
+   brw_bo_unreference(brw->query.bo);
 
brw_destroy_hw_context(brw->bufmgr, brw->hw_ctx);
 
diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index 418941c9194..917bb3a7baf 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -462,7 +462,7 @@ struct brw_query_object {
uint64_t *results;
 
/** Last index in bo with query data for this object. */
-   int last_index;
+   unsigned index;
 
/** True if we know the batch has been flushed since we ended the query. */
bool flushed;
@@ -474,13 +474,13 @@ struct brw_query_object {
 static inline unsigned
 gen6_query_predicate_offset(const struct brw_query_object *query)
 {
-   return GEN6_QUERY_PREDICATE * sizeof(uint64_t);
+   return (query->index + GEN6_QUERY_PREDICATE) * sizeof(uint64_t);
 }
 
 static inline unsigned
 gen6_query_results_offset(const struct brw_query_object *query, unsigned idx)
 {
-   return (GEN6_QUERY_RESULTS + idx) * sizeof(uint64_t);
+   return (query->index + GEN6_QUERY_RESULTS + idx) * sizeof(uint64_t);
 }
 
 struct brw_reloc_list {
@@ -1199,6 +1199,10 @@ struct brw_context
} cc;
 
struct {
+  struct brw_bo *bo;
+  uint64_t *map;
+  unsigned last_index;
+
   struct brw_query_object *obj;
   bool begin_emitted;
} query;
diff --git a/src/mesa/drivers/dri/i965/brw_queryobj.c 
b/src/mesa/drivers/dri/i965/brw_queryobj.c
index bc4b8c43e7b..c77d630a138 100644
--- a/src/mesa/drivers/dri/i965/brw_queryobj.c
+++ b/src/mesa/drivers/dri/i965/brw_queryobj.c
@@ -188,7 +188,7 @@ brw_queryobj_get_results(struct gl_context *ctx,
* run out of space in the query's BO and allocated a new one.  If so,
* this function was already called to accumulate the results so far.
*/
-  for (i = 0; i < query->last_index; i++) {
+  for (i = 0; i < query->index; i++) {
 query->Base.Result += results[i * 2 + 1] - results[i * 2];
   }
   break;
@@ -198,7 +198,7 @@ brw_queryobj_get_results(struct gl_context *ctx,
   /* If the starting and ending PS_DEPTH_COUNT from any of the batches
* differ, then some fragments passed the depth test.
*/
-  for (i = 0; i < query->last_index; i++) {
+  for (i = 0; i < query->index; i++) {
 if (results[i * 2 + 1] != results[i * 2]) {
 query->Base.Result = GL_TRUE;
 break;
@@ -304,7 +304,7 @@ brw_begin_query(struct gl_context *ctx, struct 
gl_query_object *q)
*/
   brw_bo_unreference(query->bo);
   query->bo = NULL;
-  query->last_index = -1;
+  query->index = -1;
 
   brw->query.obj = query;
 
@@ -441,7 +441,7 @@ ensure_bo_has_space(struct gl_context *ctx, struct 
brw_query_object *query)
 
assert(devinfo->gen < 6);
 
-   if (!query->bo || query->last_index * 2 + 1 >= 4096 / sizeof(uint64_t)) {
+   if (!query->bo || query->index * 2 + 1 >= 4096 / sizeof(uint64_t)) {
 
   if (query->bo != NULL) {
  /* The old query BO did not have enough space, so we allocated a new
@@ -452,7 +452,7 @@ ensure_bo_has_space(struct gl_context *ctx, struct 
brw_query_object *query)
   }
 
   query->bo = brw_bo_alloc(brw->bufmgr, "query", 4096, BRW_MEMZONE_OTHER);
-  query->last_index = 0;
+  query->index = 0;
}
 }
 
@@ -490,7 +490,7 @@ brw_emit_query_begin(struct brw_context *brw)
 
ensure_bo_has_space(ctx, query);
 
-   brw_write_depth_count(brw, query->bo, query->last_index * 2);
+   brw_write_depth_count(brw, query->bo, query->index * 2);
 
brw->query.begin_emitted = true;
 }
@@ -509,10 +509,10 @@ brw_emit_query_end(struct brw_context *brw)
if (!brw->query.begin_emitted)
   return;
 
-   brw_write_depth_count(brw, query->bo, 

[Mesa-dev] [Bug 107832] Gallium picking A16L16 formats when emulating INTENSITY16 conflicts with mesa

2018-10-02 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=107832

--- Comment #2 from Gert Wollny  ---
Yes, it fixes the bug - I proposed and pushed the patch ;)

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


Re: [Mesa-dev] [PATCH 1/5] util: import public domain code for integer division by a constant

2018-10-02 Thread Jason Ekstrand
Reading through things in a bit more detail, I do believe that importing
this version in some form would be better than using mine for a number of
reasons:

 * It is better optimized for signed integers
 * The struct of division factors is much better than what I did.  (I did
consider a struct and discarded the idea; I was wrong).
 * Computation of the division factors doesn't involve N*2-bit
multiplication
 * The round-up algorithm here results in significantly better code than
the N+1-bit round-down.
 * I trust ridiculousfish to get this right more than I trust myself

That said, I have a few caveats on merging this as-is:

 * I would like to see some unit tests.  I already spent the time to write
some; they just have to be ported.
 * It needs to be adjusted to handle 64-bit integers (right now, it appears
to only work for num_bits <= 32)
 * We shouldn't define uint_t and sint_t in a header
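
For illustration, a unit test of the sort mentioned above might look like the sketch below. It is neither the imported code nor the existing tests: it uses the classic Granlund-Montgomery multiply-and-shift for 32-bit unsigned division, and every `sketch_*` name is made up for the example. The check compares the fast path against the `/` operator over a strided sample of numerators plus the boundary cases.

```c
#include <stdbool.h>
#include <stdint.h>

/* Division factors for one 32-bit unsigned divisor (hypothetical layout). */
struct sketch_udiv_info {
   uint32_t multiplier;
   unsigned shift1; /* 0 or 1 */
   unsigned shift2;
};

/* Compute the factors for a non-zero divisor d. */
static struct sketch_udiv_info
sketch_compute_udiv_info(uint32_t d)
{
   /* l = ceil(log2(d)) */
   unsigned l = 0;
   while ((UINT64_C(1) << l) < d)
      l++;

   struct sketch_udiv_info info;
   /* multiplier = floor(2^32 * (2^l - d) / d) + 1, which fits in 32 bits */
   info.multiplier = (uint32_t)((((UINT64_C(1) << l) - d) << 32) / d + 1);
   info.shift1 = l < 1 ? l : 1;
   info.shift2 = l > 0 ? l - 1 : 0;
   return info;
}

/* n / d using one high multiply, one subtract, one add and two shifts. */
static uint32_t
sketch_fast_udiv(uint32_t n, struct sketch_udiv_info info)
{
   uint32_t t = (uint32_t)(((uint64_t)info.multiplier * n) >> 32);
   return (t + ((n - t) >> info.shift1)) >> info.shift2;
}

/* Compare against the reference operator on sampled and boundary inputs. */
static bool
sketch_udiv_matches_reference(uint32_t d)
{
   const struct sketch_udiv_info info = sketch_compute_udiv_info(d);

   for (uint64_t n = 0; n <= UINT32_MAX; n += 65521) {
      if (sketch_fast_udiv((uint32_t)n, info) != (uint32_t)n / d)
         return false;
   }

   /* Largest multiple of d, the value just below it, and the maximum. */
   const uint32_t top = (UINT32_MAX / d) * d;
   return sketch_fast_udiv(top, info) == top / d &&
          sketch_fast_udiv(top - 1, info) == (top - 1) / d &&
          sketch_fast_udiv(UINT32_MAX, info) == UINT32_MAX / d;
}
```

A real test would also want to cover the signed variants and every supported bit size; the sampling here is only meant to keep the example short.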

How do you want to proceed?

--Jason


On Sun, Sep 23, 2018 at 11:58 AM Marek Olšák  wrote:

> From: Marek Olšák 
>
> Compilers can use this to generate optimal code for integer division
> by a constant.
>
> Additionally, an unsigned division by a uniform that is constant but not
> known at compile time can still be optimized by passing 2-4 division
> factors to the shader as uniforms and executing one of the fast_udiv*
> variants. The signed division algorithm doesn't have this capability.
> ---
>  src/util/Makefile.sources |   2 +
>  src/util/fast_idiv_by_const.c | 245
> ++
>  src/util/fast_idiv_by_const.h | 173 +
>  src/util/meson.build  |   2 +
>  4 files changed, 422 insertions(+)
>  create mode 100644 src/util/fast_idiv_by_const.c
>  create mode 100644 src/util/fast_idiv_by_const.h
>
> diff --git a/src/util/Makefile.sources b/src/util/Makefile.sources
> index b562d6c..f741b2a 100644
> --- a/src/util/Makefile.sources
> +++ b/src/util/Makefile.sources
> @@ -3,20 +3,22 @@ MESA_UTIL_FILES := \
> bitscan.h \
> bitset.h \
> build_id.c \
> build_id.h \
> crc32.c \
> crc32.h \
> debug.c \
> debug.h \
> disk_cache.c \
> disk_cache.h \
> +   fast_idiv_by_const.c \
> +   fast_idiv_by_const.h \
> format_r11g11b10f.h \
> format_rgb9e5.h \
> format_srgb.h \
> futex.h \
> half_float.c \
> half_float.h \
> hash_table.c \
> hash_table.h \
> list.h \
> macros.h \
> diff --git a/src/util/fast_idiv_by_const.c b/src/util/fast_idiv_by_const.c
> new file mode 100644
> index 000..f247b66
> --- /dev/null
> +++ b/src/util/fast_idiv_by_const.c
> @@ -0,0 +1,245 @@
> +/*
> + * Copyright © 2018 Advanced Micro Devices, Inc.
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the
> "Software"),
> + * to deal in the Software without restriction, including without
> limitation
> + * the rights to use, copy, modify, merge, publish, distribute,
> sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the
> next
> + * paragraph) shall be included in all copies or substantial portions of
> the
> + * Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
> EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
> MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT
> SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
> OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
> DEALINGS
> + * IN THE SOFTWARE.
> + */
> +
> +/* Imported from:
> + *
> https://raw.githubusercontent.com/ridiculousfish/libdivide/master/divide_by_constants_codegen_reference.c
> + * Paper:
> + *
> http://ridiculousfish.com/files/faster_unsigned_division_by_constants.pdf
> + *
> + * The author, ridiculous_fish, wrote:
> + *
> + *  ''Reference implementations of computing and using the "magic number"
> + *approach to dividing by constants, including codegen instructions.
> + *The unsigned division incorporates the "round down" optimization per
> + *ridiculous_fish.
> + *
> + *This is free and unencumbered software. Any copyright is dedicated
> + *to the Public Domain.''
> + */
> +
> +#include "fast_idiv_by_const.h"
> +#include "u_math.h"
> +#include 
> +#include 
> +
> +/* uint_t and sint_t can be replaced by different integer types and the
> code
> + * will work as-is. The only requirement is that sizeof(uintN) ==
> sizeof(intN).
> + */
> +
> +struct util_fast_udiv_info
> +util_compute_fast_udiv_info(uint_t D, unsigned 

Re: [Mesa-dev] Meson-windows v4 (9/21/2018 rebase): LLVM linking problems

2018-10-02 Thread Dylan Baker
Quoting Liviu Prodea (2018-10-02 08:08:41)
> Made a comprehensive test of this patch series and I still stumbled upon some
> big problems:
> 
> 1. Automatic LLVM linking via llvm-config if used by adding LLVM bin folder to
> PATH results in build failure with 'llvm-c/Core.h' not found in src/gallium/
> auxiliary/gallivm/lp_bld.h. Appveyor CI from https://ci.appveyor.com/project/
> dcbaker/mesa didn't encounter this as it is using llvm-wrap option.

It's possible that llvm-config wrapping on Windows is broken at the moment. It
is on macOS, and I have a pull request open:
https://github.com/mesonbuild/meson/pull/4283.
I'll see if that fixes Windows as well, or if we need some more work there.

> 2. Even if build succeeds with LLVM linked via llvm-wrap and everything looking
> good at first glance, llvmpipe and swr if it was built cannot be selected.
> GALLIUM_DRIVER variable has no effect. You only get softpipe despite
> opengl32.dll file looking big enough and swrAVX-0.dll and swrAVX2-0.dll being
> generated when expected. Even when having LLVM built dynamically to avoid /MD
> to /MT override warnings and building Mesa3D with default c_args and cpp_args
> this issue is still in effect.
> 
> 3. Meson 0.48.0 doesn't pass the /MT or /MTd c_args and cpp_args for some
> unexplained reasons which leads to build failure if LLVM is not built with /MD.

Meson 0.48 has added a new option to allow you to pick which crt you want:

https://mesonbuild.com/Release-notes-for-0-48-0.html#toggles-for-build-type-optimization-and-vcrt-type

The list of options is here:

https://mesonbuild.com/Builtin-options.html#base-options

I'll test and see if I can add b_vscrt=from_buildtype to the default options
without requiring a bump to 0.48.0 for the whole project.

Dylan


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] glsl/linker: Check the subroutine associated functions names

2018-10-02 Thread Vadym Shovkoplias
Hi Tapani,

Thanks for the review!

I completely agree with the first comment; I'll change that and resend the
patch.

Regarding the second comment: I'm not sure it is possible to do this check
after the optimization loop. From my observations, the compiler inlines
everything and only then removes dead functions (actually all functions
except "main"). After optimization I don't see any way to implement this
subroutine function check, because all functions and function signatures
have been removed by that point.
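
To illustrate the kind of check I mean, here is a minimal standalone sketch.
It is not the linker code: toy_signature and has_duplicate_definition are
made-up names, and the flat array stands in for the per-function signature
lists that only exist before dead functions are removed. It also stops at the
second definition, as suggested in the first review comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a function signature in the linked shader IR. */
struct toy_signature {
   const char *name;   /* name of the function the signature belongs to */
   bool is_defined;    /* true if the signature has a body */
};

/* Returns true if `name` has two or more defined signatures.
 * Bails out at the second definition instead of counting all of them.
 */
static bool
has_duplicate_definition(const struct toy_signature *sigs, size_t count,
                         const char *name)
{
   assert(sigs != NULL || count == 0);

   unsigned definitions = 0;
   for (size_t i = 0; i < count; i++) {
      if (strcmp(sigs[i].name, name) != 0 || !sigs[i].is_defined)
         continue;
      if (++definitions > 1)
         return true;
   }
   return false;
}
```

A caller would run this once per subroutine-associated function name and
report a link error on the first name for which it returns true.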

On Tue, Oct 2, 2018 at 10:02 AM Tapani Pälli  wrote:

>
> On 10/1/18 5:03 PM, Vadym Shovkoplias wrote:
> >  From Section 6.1.2 (Subroutines) of the GLSL 4.00 specification
> >
> >  "A program will fail to compile or link if any shader
> >   or stage contains two or more functions with the same
> >   name if the name is associated with a subroutine type."
> >
> > Fixes:
> >  * no-overloads.vert
> >
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108109
> > Signed-off-by: Vadym Shovkoplias 
> > ---
> >   src/compiler/glsl/linker.cpp | 40 
> >   1 file changed, 40 insertions(+)
> >
> > diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp
> > index 3fde7e78d3..d0d017c7ff 100644
> > --- a/src/compiler/glsl/linker.cpp
> > +++ b/src/compiler/glsl/linker.cpp
> > @@ -4639,6 +4639,45 @@ link_assign_subroutine_types(struct
> gl_shader_program *prog)
> >  }
> >   }
> >
> > +static void
> > +verify_subroutine_associated_funcs(struct gl_shader_program *prog)
> > +{
> > +   unsigned mask = prog->data->linked_stages;
> > +   while (mask) {
> > +  const int i = u_bit_scan(&mask);
> > +  gl_program *p = prog->_LinkedShaders[i]->Program;
> > +  glsl_symbol_table *symbols = prog->_LinkedShaders[i]->symbols;
> > +
> > +  /*
> > +   * From OpenGL ES Shading Language 4.00 specification
> > +   * (6.1.2 Subroutines):
> > +   * "A program will fail to compile or link if any shader
> > +   * or stage contains two or more functions with the same
> > +   * name if the name is associated with a subroutine type."
> > +   */
> > +  for (unsigned j = 0; j < p->sh.NumSubroutineFunctions; j++) {
> > + unsigned definitions = 0;
> > + char *name = p->sh.SubroutineFunctions[j].name;
> > + ir_function *fn = symbols->get_function(name);
> > +
> > + /* Calculate number of function definitions with the same name
> */
> > + foreach_in_list(ir_function_signature, sig, &fn->signatures) {
> > +if (sig->is_defined)
> > +   definitions++;
>
> You can just error out here, no need to calculate further.
>
> I'm wondering a bit though should we fail here even if that function was
> not used at all (optimized out)? I can see that the Piglit test does not
> have a call to the function defined.
>
>
> > + }
> > +
> > + if (definitions > 1) {
> > +linker_error(prog, "%s shader contains %u function "
> > +  "definitions with name `%s', which is associated with"
> > +  " a subroutine type.\n",
> > +  _mesa_shader_stage_to_string(i), definitions,
> fn->name);
> > +return;
> > + }
> > +  }
> > +   }
> > +}
> > +
> > +
> >   static void
> >   set_always_active_io(exec_list *ir, ir_variable_mode io_mode)
> >   {
> > @@ -5024,6 +5063,7 @@ link_shaders(struct gl_context *ctx, struct
> gl_shader_program *prog)
> >
> >  check_explicit_uniform_locations(ctx, prog);
> >  link_assign_subroutine_types(prog);
> > +   verify_subroutine_associated_funcs(prog);
> >
> >  if (!prog->data->LinkStatus)
> > goto done;
> >
>


-- 

Vadym Shovkoplias | Senior Software Engineer
GlobalLogic
P +380.57.766.7667  M +3.8050.931.7304  S vadym.shovkoplias
www.globallogic.com

http://www.globallogic.com/email_disclaimer.txt


[Mesa-dev] [Bug 107594] [PATCH] fix crosscompilling with meson

2018-10-02 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=107594

--- Comment #3 from Dylan Baker  ---
Are you running into this issue, by chance:
https://github.com/mesonbuild/meson/issues/4254

-- 
You are receiving this mail because:
You are the QA Contact for the bug.


Re: [Mesa-dev] [PATCH] i965: Replace checks for rb->Name with FlipY (v2)

2018-10-02 Thread Jason Ekstrand
Reviewed-by: Jason Ekstrand 

On Mon, Sep 17, 2018 at 3:52 PM Fritz Koenig  wrote:

> In the GL_MESA_framebuffer_flip_y implementation
> _mesa_is_winsys_fbo checks were replaced with
> FlipY checks.  rb->Name is also used to determine
> if a buffer is winsys.
>
> v2: Fixes annotation [for emil]
>
> Fixes: ab05dd183cc ("i965: implement GL_MESA_framebuffer_flip_y [v3]")
> ---
>  src/mesa/drivers/dri/i965/brw_blorp.c| 20 +---
>  src/mesa/drivers/dri/i965/intel_pixel_read.c |  4 ++--
>  2 files changed, 11 insertions(+), 13 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c
> b/src/mesa/drivers/dri/i965/brw_blorp.c
> index ad747e0766..ad3a47ef03 100644
> --- a/src/mesa/drivers/dri/i965/brw_blorp.c
> +++ b/src/mesa/drivers/dri/i965/brw_blorp.c
> @@ -1224,12 +1224,12 @@ do_single_blorp_clear(struct brw_context *brw,
> struct gl_framebuffer *fb,
>
> x0 = fb->_Xmin;
> x1 = fb->_Xmax;
> -   if (rb->Name != 0) {
> -  y0 = fb->_Ymin;
> -  y1 = fb->_Ymax;
> -   } else {
> +   if (fb->FlipY) {
>y0 = rb->Height - fb->_Ymax;
>y1 = rb->Height - fb->_Ymin;
> +   } else {
> +  y0 = fb->_Ymin;
> +  y1 = fb->_Ymax;
> }
>
> /* If the clear region is empty, just return. */
> @@ -1415,9 +1415,8 @@ brw_blorp_clear_depth_stencil(struct brw_context
> *brw,
> if (!(mask & (BUFFER_BITS_DEPTH_STENCIL)))
>return;
>
> -   uint32_t x0, x1, y0, y1, rb_name, rb_height;
> +   uint32_t x0, x1, y0, y1, rb_height;
> if (depth_rb) {
> -  rb_name = depth_rb->Name;
>rb_height = depth_rb->Height;
>if (stencil_rb) {
>   assert(depth_rb->Width == stencil_rb->Width);
> @@ -1425,18 +1424,17 @@ brw_blorp_clear_depth_stencil(struct brw_context
> *brw,
>}
> } else {
>assert(stencil_rb);
> -  rb_name = stencil_rb->Name;
>rb_height = stencil_rb->Height;
> }
>
> x0 = fb->_Xmin;
> x1 = fb->_Xmax;
> -   if (rb_name != 0) {
> -  y0 = fb->_Ymin;
> -  y1 = fb->_Ymax;
> -   } else {
> +   if (fb->FlipY) {
>y0 = rb_height - fb->_Ymax;
>y1 = rb_height - fb->_Ymin;
> +   } else {
> +  y0 = fb->_Ymin;
> +  y1 = fb->_Ymax;
> }
>
> /* If the clear region is empty, just return. */
> diff --git a/src/mesa/drivers/dri/i965/intel_pixel_read.c
> b/src/mesa/drivers/dri/i965/intel_pixel_read.c
> index 6ed7895bc7..8a90b207ad 100644
> --- a/src/mesa/drivers/dri/i965/intel_pixel_read.c
> +++ b/src/mesa/drivers/dri/i965/intel_pixel_read.c
> @@ -181,7 +181,7 @@ intel_readpixels_tiled_memcpy(struct gl_context * ctx,
>  * tiled_to_linear a negative pitch so that it walks through the
>  * client's data backwards as it walks through the renderbufer
> forwards.
>  */
> -   if (rb->Name == 0) {
> +   if (ctx->ReadBuffer->FlipY) {
>yoffset = rb->Height - yoffset - height;
>pixels += (ptrdiff_t) (height - 1) * dst_pitch;
>dst_pitch = -dst_pitch;
> @@ -249,7 +249,7 @@ intel_readpixels_blorp(struct gl_context *ctx,
> return brw_blorp_download_miptree(brw, irb->mt, rb->Format, swizzle,
>   irb->mt_level, x, y, irb->mt_layer,
>   w, h, 1, GL_TEXTURE_2D, format, type,
> - rb->Name == 0, pixels, packing);
> + ctx->ReadBuffer->FlipY, pixels,
> packing);
>  }
>
>  void
> --
> 2.19.0.397.gdd90340f6a-goog
>
>


Re: [Mesa-dev] [PATCH mesa 1/3] include: sync eglplatform.h from Khronos

2018-10-02 Thread Eric Engestrom
On Tuesday, 2018-10-02 16:32:44 +0100, Emil Velikov wrote:
> Hi Eric,
> 
> On Sun, 10 Jun 2018 at 09:36, Eric Engestrom  wrote:
> >
> > An issue [1] was recently raised with the upstream eglplatform.h, as my
> > upstreaming of the "X11 on Apple" resulted in Apple platforms to always
> > try to include X11 headers.
> >
> > The solution [2] was to also upstream our `MESA_EGL_NO_X11_HEADERS`
> > toggle, inverted as `USE_X11`.
> >
> > This commit updates our copy of the header, and updates the build system
> > to use the new define.
> >
> I'm a bit concerned that this will break apps which want to build with GLX/X11.
> Did you try building, say piglit/waffle (or more complex ones like
> dolphin/kodi) against the updated header?

I haven't (I'll try tomorrow), but I don't expect any other platform
than MacOS to change behaviour (where it will no longer include the X11
headers without being explicitly asked to).

> 
> Assuming that works, the patch is
> Reviewed-by: Emil Velikov 
> 
> -Emil


[Mesa-dev] [PATCH] i965: consider a 'base level' when calculating width0, height0, depth0

2018-10-02 Thread asimiklit . work
From: Andrii Simiklit 

I guess that when we calculate the width0, height0 and depth0
to pass to 'intel_miptree_create', we need to take the
'base level' into account, as is done in the
'intel_miptree_create_for_teximage' function.
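
As a rough standalone illustration of that scaling (not the driver code: the
toy_target enum, toy_dims struct and dims_at_level0 name are invented for this
sketch, and only a few targets are modelled), the level-0 size can be
estimated by shifting each dimension that is mipmapped for the target. Like
the patch, this ignores the rounding that non-power-of-two sizes go through
when levels are minified.

```c
#include <assert.h>

/* Hypothetical texture kinds; the real code switches on GL target enums. */
enum toy_target { TOY_1D, TOY_2D, TOY_3D, TOY_RECT };

struct toy_dims {
   unsigned width, height, depth;
};

/* Given the dimensions of the image at mip `level`, estimate level 0. */
static struct toy_dims
dims_at_level0(enum toy_target target, struct toy_dims d, unsigned level)
{
   switch (target) {
   case TOY_RECT:
      /* Rectangle textures have no mipmaps, so only level 0 is valid. */
      assert(level == 0);
      break;
   case TOY_3D:
      d.depth <<= level;
      /* fall through */
   case TOY_2D:
      d.height <<= level;
      /* fall through */
   case TOY_1D:
      d.width <<= level;
      break;
   }
   return d;
}
```

For a 2D image of 16x8 at level 2 this gives 64x32 at level 0, with depth
left untouched.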

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107987
Signed-off-by: Andrii Simiklit 
---
 .../drivers/dri/i965/intel_tex_validate.c | 26 ++-
 1 file changed, 25 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/intel_tex_validate.c 
b/src/mesa/drivers/dri/i965/intel_tex_validate.c
index 72ce83c7ce..37aa8f43ec 100644
--- a/src/mesa/drivers/dri/i965/intel_tex_validate.c
+++ b/src/mesa/drivers/dri/i965/intel_tex_validate.c
@@ -119,8 +119,32 @@ intel_finalize_mipmap_tree(struct brw_context *brw,
/* May need to create a new tree:
 */
if (!intelObj->mt) {
+  const unsigned level = firstImage->base.Base.Level;
   intel_get_image_dims(&firstImage->base.Base, &width, &height, &depth);
-
+  /* Figure out image dimensions at start level. */
+  switch(intelObj->base.Target) {
+  case GL_TEXTURE_2D_MULTISAMPLE:
+  case GL_TEXTURE_2D_MULTISAMPLE_ARRAY:
+  case GL_TEXTURE_RECTANGLE:
+  case GL_TEXTURE_EXTERNAL_OES:
+  assert(level == 0);
+  break;
+  case GL_TEXTURE_3D:
+  depth = depth << level;
+  /* Fall through */
+  case GL_TEXTURE_2D:
+  case GL_TEXTURE_2D_ARRAY:
+  case GL_TEXTURE_CUBE_MAP:
+  case GL_TEXTURE_CUBE_MAP_ARRAY:
+  height = height << level;
+  /* Fall through */
+  case GL_TEXTURE_1D:
+  case GL_TEXTURE_1D_ARRAY:
+  width = width << level;
+  break;
+  default:
+  unreachable("Unexpected target");
+  }
   perf_debug("Creating new %s %dx%dx%d %d-level miptree to handle "
  "finalized texture miptree.\n",
  _mesa_get_format_name(firstImage->base.Base.TexFormat),
-- 
2.17.1



[Mesa-dev] [Bug 108118] AMDGPU sometimes hangs forever when running graphical applications

2018-10-02 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108118

--- Comment #3 from duoora...@gmail.com ---
Problem also occurs with kernel 4.17.14, it is not 4.18+ specific. Dmesg output
for the hang under 4.17.14 was:

[drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, last signaled
seq=73798 last emitted seq=73800
[drm] No hardware hang detected. Did some blocks stall?

Haven't seen the "No hardware hang" message before. This was collected with
amdgpu.gpu_recovery=1.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


Re: [Mesa-dev] [PATCH mesa 1/3] include: sync eglplatform.h from Khronos

2018-10-02 Thread Emil Velikov
Hi Eric,

On Sun, 10 Jun 2018 at 09:36, Eric Engestrom  wrote:
>
> An issue [1] was recently raised with the upstream eglplatform.h, as my
> upstreaming of the "X11 on Apple" resulted in Apple platforms to always
> try to include X11 headers.
>
> The solution [2] was to also upstream our `MESA_EGL_NO_X11_HEADERS`
> toggle, inverted as `USE_X11`.
>
> This commit updates our copy of the header, and updates the build system
> to use the new define.
>
I'm a bit concerned that this will break apps which want to build with GLX/X11.
Did you try building, say piglit/waffle (or more complex ones like
dolphin/kodi) against the updated header?

Assuming that works, the patch is
Reviewed-by: Emil Velikov 

-Emil


[Mesa-dev] [Bug 108118] AMDGPU sometimes hangs forever when running graphical applications

2018-10-02 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108118

--- Comment #2 from duoora...@gmail.com ---
I've failed to replicate the problem with both Wine DX9 games (backed by
OpenGL) and Dota 2's OpenGL mode so I think it is a Vulkan specific issue. Will
test an older kernel next.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


[Mesa-dev] Meson-windows v4 (9/21/2018 rebase): LLVM linking problems

2018-10-02 Thread Liviu Prodea
Made a comprehensive test of this patch series and I still stumbled upon some 
big problems:
1. Automatic LLVM linking via llvm-config if used by adding LLVM bin folder to 
PATH results in build failure with 'llvm-c/Core.h' not found in 
src/gallium/auxiliary/gallivm/lp_bld.h. Appveyor CI from 
https://ci.appveyor.com/project/dcbaker/mesa didn't encounter this as it is 
using llvm-wrap option.

2. Even if build succeeds with LLVM linked via llvm-wrap and everything looking 
good at first glance, llvmpipe and swr if it was built cannot be selected. 
GALLIUM_DRIVER variable has no effect. You only get softpipe despite 
opengl32.dll file looking big enough and swrAVX-0.dll and swrAVX2-0.dll being 
generated when expected. Even when having LLVM built dynamically to avoid /MD 
to /MT override warnings and building Mesa3D with default c_args and cpp_args 
this issue is still in effect.

3. Meson 0.48.0 doesn't pass the /MT or /MTd c_args and cpp_args for some 
unexplained reasons which leads to build failure if LLVM is not built with /MD.


Re: [Mesa-dev] [PATCH 2/3] configure: allow building with python3

2018-10-02 Thread Emil Velikov
Hi Dylan,

On Mon, 3 Sep 2018 at 14:57, Emil Velikov  wrote:
>
> On 24 August 2018 at 19:51, Dylan Baker  wrote:
> > Can we just change the script to write a file instead of sending its output
> > through the shell? That should fix any encoding problems since the shell won't
> > touch it and the LANG settings (no matter what they are) shouldn't matter.
> >
> Seems like I forgot to reply to this. Yes, please - that would be
> highly preferred.
>
Did you get the chance to do this?

Thanks
Emil


Re: [Mesa-dev] [PATCH v1] loader/dri3: wait for fences if back-buffer available

2018-10-02 Thread Michel Dänzer
On 2018-10-01 4:35 p.m., Sergii Romantsov wrote:
> Yes, it also works

Great, thanks for testing! Sent out a proper patch for review:

https://patchwork.freedesktop.org/patch/254393/


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer




[Mesa-dev] [PATCH] loader/dri3: Also wait for front buffer fence if we triggered it

2018-10-02 Thread Michel Dänzer
From: Michel Dänzer 

In that case, we have to wait for the fence to synchronize with the
corresponding drawing we triggered in the X server.

Fixes incorrect display with the i965 and some applications, e.g.
solvespace.
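
Reduced to its decision logic, the change can be sketched as below. This is
only an illustration with hypothetical names (toy_buffer_type,
toy_should_await_fence); in the actual patch the "triggered" condition is
tracked in a local flag that is set in the branches of dri3_get_buffer that
trigger the fence.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the loader's buffer type enum. */
enum toy_buffer_type { TOY_BUFFER_BACK, TOY_BUFFER_FRONT };

/* Back buffers always wait for their fence. A front buffer waits only if
 * this call triggered the fence for drawing done in the X server.
 */
static bool
toy_should_await_fence(enum toy_buffer_type type, bool triggered_fence)
{
   assert(type == TOY_BUFFER_BACK || type == TOY_BUFFER_FRONT);
   return type == TOY_BUFFER_BACK || triggered_fence;
}
```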

Bugzilla: https://bugs.freedesktop.org/108097
Fixes: aefac10fecc9 "loader/dri3: Only wait for back buffer fences in
 dri3_get_buffer"
Tested-by: Sergii Romantsov 
Signed-off-by: Michel Dänzer 
---
 src/loader/loader_dri3_helper.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/src/loader/loader_dri3_helper.c b/src/loader/loader_dri3_helper.c
index f641a34e6d1..1981b5f0515 100644
--- a/src/loader/loader_dri3_helper.c
+++ b/src/loader/loader_dri3_helper.c
@@ -1736,6 +1736,7 @@ dri3_get_buffer(__DRIdrawable *driDrawable,
 struct loader_dri3_drawable *draw)
 {
struct loader_dri3_buffer *buffer;
+   bool fence_await = buffer_type == loader_dri3_buffer_back;
int buf_id;
 
if (buffer_type == loader_dri3_buffer_back) {
@@ -1791,6 +1792,7 @@ dri3_get_buffer(__DRIdrawable *driDrawable,
0, 0, 0, 0,
draw->width, draw->height);
 dri3_fence_trigger(draw->conn, new_buffer);
+fence_await = true;
  }
  dri3_free_render_buffer(draw, buffer);
   } else if (buffer_type == loader_dri3_buffer_front) {
@@ -1812,13 +1814,14 @@ dri3_get_buffer(__DRIdrawable *driDrawable,
   new_buffer->linear_buffer,
   0, 0, draw->width, draw->height,
   0, 0, 0);
- }
+ } else
+fence_await = true;
   }
   buffer = new_buffer;
   draw->buffers[buf_id] = buffer;
}
 
-   if (buffer_type == loader_dri3_buffer_back)
+   if (fence_await)
   dri3_fence_await(draw->conn, draw, buffer);
 
/*
-- 
2.19.0



Re: [Mesa-dev] [PATCH mesa v2] i965: searching the cache doesn't need to modify it

2018-10-02 Thread Eric Engestrom
Ping?
I'm just adding `const` to make the code easier to read and understand,
and to let the compiler tell us if we make a mistake and start
modifying things we shouldn't.
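
To show what I mean on a toy example (toy_item, toy_cache and toy_search are
invented for this sketch and much simpler than the real brw_cache): once the
lookup takes and returns const pointers, any accidental write through them
becomes a compile error.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified cache: four buckets of singly linked items. */
struct toy_item {
   unsigned hash;
   int value;
   const struct toy_item *next;
};

struct toy_cache {
   const struct toy_item *buckets[4];
};

/* A pure lookup: both the cache and the returned item are const. */
static const struct toy_item *
toy_search(const struct toy_cache *cache, unsigned hash)
{
   assert(cache != NULL);

   for (const struct toy_item *c = cache->buckets[hash % 4]; c; c = c->next) {
      if (c->hash == hash)
         return c;
   }

   /* Writing "cache->buckets[0] = NULL;" here would not compile, because
    * `cache` points to a const struct.
    */
   return NULL;
}
```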

On Tuesday, 2018-08-07 12:02:53 +0100, Eric Engestrom wrote:
> Signed-off-by: Eric Engestrom 
> ---
> v2: forgot the hunk that was the point of this :facepalm:
> ---
>  src/mesa/drivers/dri/i965/brw_program_cache.c | 12 ++--
>  1 file changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_program_cache.c 
> b/src/mesa/drivers/dri/i965/brw_program_cache.c
> index 600b0611c8b89095e393..a9a21d911612f9218e2a 100644
> --- a/src/mesa/drivers/dri/i965/brw_program_cache.c
> +++ b/src/mesa/drivers/dri/i965/brw_program_cache.c
> @@ -142,11 +142,11 @@ brw_cache_item_equals(const struct brw_cache_item *a,
>(memcmp(a->key, b->key, a->key_size) == 0);
>  }
>  
> -static struct brw_cache_item *
> -search_cache(struct brw_cache *cache, GLuint hash,
> - struct brw_cache_item *lookup)
> +static const struct brw_cache_item *
> +search_cache(const struct brw_cache *cache, GLuint hash,
> + const struct brw_cache_item *lookup)
>  {
> -   struct brw_cache_item *c;
> +   const struct brw_cache_item *c;
>  
>  #if 0
> int bucketcount = 0;
> @@ -194,11 +194,11 @@ rehash(struct brw_cache *cache)
>   * Returns the buffer object matching cache_id and key, or NULL.
>   */
>  bool
> -brw_search_cache(struct brw_cache *cache, enum brw_cache_id cache_id,
> +brw_search_cache(const struct brw_cache *cache, enum brw_cache_id cache_id,
>   const void *key, GLuint key_size, uint32_t *inout_offset,
>   void *inout_prog_data, bool flag_state)
>  {
> -   struct brw_cache_item *item;
> +   const struct brw_cache_item *item;
> struct brw_cache_item lookup;
> GLuint hash;
>  
> -- 
> Cheers,
>   Eric
> 


[Mesa-dev] [PATCH mesa 4/4] clover: add missing meson build dependency

2018-10-02 Thread Eric Engestrom
Fixes: 42ea0631f108d82554339 "meson: build clover"
Signed-off-by: Eric Engestrom 
---
 src/gallium/state_trackers/clover/meson.build | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/gallium/state_trackers/clover/meson.build 
b/src/gallium/state_trackers/clover/meson.build
index d1497e657ea3c4a178fe..c2b6147ddca21f1cb3c7 100644
--- a/src/gallium/state_trackers/clover/meson.build
+++ b/src/gallium/state_trackers/clover/meson.build
@@ -119,4 +119,5 @@ libclover = static_library(
   include_directories : clover_incs,
   cpp_args : [clover_cpp_args, cpp_vis_args],
   link_with : [libcltgsi, libclllvm],
+  dependencies : idep_git_sha1,
 )
-- 
Cheers,
  Eric



[Mesa-dev] [PATCH mesa 3/4] svga: add missing meson build dependency

2018-10-02 Thread Eric Engestrom
Fixes: a537231b226280bc1e5b7 "meson: build svga driver on linux"
Signed-off-by: Eric Engestrom 
---
 src/gallium/drivers/svga/meson.build | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/gallium/drivers/svga/meson.build 
b/src/gallium/drivers/svga/meson.build
index 2976212fdfba418c12ad..9de1c0d797780d8148af 100644
--- a/src/gallium/drivers/svga/meson.build
+++ b/src/gallium/drivers/svga/meson.build
@@ -85,6 +85,7 @@ libsvga = static_library(
 inc_src, inc_include, inc_gallium, inc_gallium_aux,
 include_directories('include')
   ],
+  dependencies : idep_git_sha1,
 )
 
 driver_svga = declare_dependency(
-- 
Cheers,
  Eric



[Mesa-dev] [PATCH mesa 2/4] anv: add missing meson build dependency

2018-10-02 Thread Eric Engestrom
Fixes: e4538b93f5d5177318f2 "anv: Implement VK_KHR_driver_properties"
Signed-off-by: Eric Engestrom 
---
 src/intel/vulkan/meson.build | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/intel/vulkan/meson.build b/src/intel/vulkan/meson.build
index f1beb1de6ac99d517edd..d590cb87c3bbaf1bf44c 100644
--- a/src/intel/vulkan/meson.build
+++ b/src/intel/vulkan/meson.build
@@ -182,7 +182,7 @@ libanv_common = static_library(
 inc_vulkan_wsi,
   ],
   c_args : anv_flags,
-  dependencies : anv_deps,
+  dependencies : [anv_deps, idep_git_sha1],
 )
 
 libvulkan_intel = shared_library(
-- 
Cheers,
  Eric



[Mesa-dev] [PATCH mesa 1/4] meson: turn git_sha1.h target into a proper dependency

2018-10-02 Thread Eric Engestrom
Cc: mesa-sta...@lists.freedesktop.org
Signed-off-by: Eric Engestrom 
---
 src/mesa/meson.build |  3 +--
 src/meson.build  | 13 -
 2 files changed, 9 insertions(+), 7 deletions(-)

diff --git a/src/mesa/meson.build b/src/mesa/meson.build
index ea884977db8052d86fcb..861b0311048eff422b9f 100644
--- a/src/mesa/meson.build
+++ b/src/mesa/meson.build
@@ -705,7 +705,6 @@ files_libmesa_common += [
   ir_expression_operation_h,
   main_remap_helper_h,
   matypes_h,
-  sha1_h,
 ]
 
 if with_sse41
@@ -726,7 +725,7 @@ libmesa_classic = static_library(
   cpp_args : [cpp_vis_args, cpp_msvc_compat_args],
   include_directories : [inc_common, inc_libmesa_asm, 
include_directories('main')],
   link_with : [libglsl, libmesa_sse41],
-  dependencies : idep_nir_headers,
+  dependencies : [idep_nir_headers, idep_git_sha1],
   build_by_default : false,
 )
 
diff --git a/src/meson.build b/src/meson.build
index af881cff70bf752a6474..89ffaddf47b7286e4fe0 100644
--- a/src/meson.build
+++ b/src/meson.build
@@ -39,11 +39,14 @@ libglsl_util = static_library(
   build_by_default : false,
 )
 
-sha1_h = custom_target(
-  'git_sha1.h',
-  output : 'git_sha1.h',
-  command : [prog_python, git_sha1_gen_py, '--output', '@OUTPUT@'],
-  build_always : true, # commit sha1 can change without having touched these 
files
+idep_git_sha1 = declare_dependency(
+  sources : custom_target(
+'git_sha1.h',
+output : 'git_sha1.h',
+command : [prog_python, git_sha1_gen_py, '--output', '@OUTPUT@'],
+build_always : true, # commit sha1 can change without having touched these 
files
+  ),
+  include_directories : inc_src,
 )
 
 subdir('gtest')
-- 
Cheers,
  Eric



Re: [Mesa-dev] [PATCH mesa] anv: suppress warning about unhandled image layout

2018-10-02 Thread Jason Ekstrand

Rb

On October 2, 2018 08:33:37 Eric Engestrom  wrote:


Let's just be explicit that VK_NV_shading_rate_image is not supported.

Suggested-by: Jason Ekstrand 
Fixes: 6ee17091708a41c4aa81a "vulkan: Update the XML and headers to 1.1.86"
Signed-off-by: Eric Engestrom 
---
src/intel/vulkan/anv_image.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/src/intel/vulkan/anv_image.c b/src/intel/vulkan/anv_image.c
index b0d8c560adb2292bd0f6..9f7964ae37eba894f8e4 100644
--- a/src/intel/vulkan/anv_image.c
+++ b/src/intel/vulkan/anv_image.c
@@ -920,6 +920,9 @@ anv_layout_to_aux_usage(const struct gen_device_info * 
const devinfo,


   case VK_IMAGE_LAYOUT_SHARED_PRESENT_KHR:
  unreachable("VK_KHR_shared_presentable_image is unsupported");
+
+   case VK_IMAGE_LAYOUT_SHADING_RATE_OPTIMAL_NV:
+  unreachable("VK_NV_shading_rate_image is unsupported");
   }

   /* If the layout isn't recognized in the exhaustive switch above, the
--
Cheers,
 Eric






[Mesa-dev] [PATCH mesa] anv: suppress warning about unhandled image layout

2018-10-02 Thread Eric Engestrom
Let's just be explicit that VK_NV_shading_rate_image is not supported.

Suggested-by: Jason Ekstrand 
Fixes: 6ee17091708a41c4aa81a "vulkan: Update the XML and headers to 1.1.86"
Signed-off-by: Eric Engestrom 
---
 src/intel/vulkan/anv_image.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/intel/vulkan/anv_image.c b/src/intel/vulkan/anv_image.c
index b0d8c560adb2292bd0f6..9f7964ae37eba894f8e4 100644
--- a/src/intel/vulkan/anv_image.c
+++ b/src/intel/vulkan/anv_image.c
@@ -920,6 +920,9 @@ anv_layout_to_aux_usage(const struct gen_device_info * 
const devinfo,
 
case VK_IMAGE_LAYOUT_SHARED_PRESENT_KHR:
   unreachable("VK_KHR_shared_presentable_image is unsupported");
+
+   case VK_IMAGE_LAYOUT_SHADING_RATE_OPTIMAL_NV:
+  unreachable("VK_NV_shading_rate_image is unsupported");
}
 
/* If the layout isn't recognized in the exhaustive switch above, the
-- 
Cheers,
  Eric



Re: [Mesa-dev] [PATCH 1/5] nir/cf: Remove phi sources if needed in nir_handle_add_jump

2018-10-02 Thread Jason Ekstrand
On Tue, Oct 2, 2018 at 7:30 AM Jason Ekstrand  wrote:

> On Tue, Oct 2, 2018 at 5:53 AM Iago Toral  wrote:
>
>> On Sat, 2018-09-22 at 16:39 -0500, Jason Ekstrand wrote:
>> > If the block in which the jump is inserted is the predecessor of a
>> > phi
>> > then we need to remove phi sources otherwise the phi may end up with
>> > things improperly connected.  Found by running the Vulkan CTS with
>> > SPIR-V optimizations enabled.
>> >
>> > Cc: mesa-sta...@lists.freedesktop.org
>> > ---
>> >  src/compiler/nir/nir_control_flow.c | 36 +++--
>> > 
>> >  1 file changed, 19 insertions(+), 17 deletions(-)
>> >
>> > diff --git a/src/compiler/nir/nir_control_flow.c
>> > b/src/compiler/nir/nir_control_flow.c
>> > index 3b0a0f1a5b0..a82f35550b8 100644
>> > --- a/src/compiler/nir/nir_control_flow.c
>> > +++ b/src/compiler/nir/nir_control_flow.c
>> > @@ -437,6 +437,23 @@ nearest_loop(nir_cf_node *node)
>> > return nir_cf_node_as_loop(node);
>> >  }
>> >
>> > +static void
>> > +remove_phi_src(nir_block *block, nir_block *pred)
>> > +{
>> > +   nir_foreach_instr(instr, block) {
>> > +  if (instr->type != nir_instr_type_phi)
>> > + break;
>> > +
>> > +  nir_phi_instr *phi = nir_instr_as_phi(instr);
>> > +  nir_foreach_phi_src_safe(src, phi) {
>> > + if (src->pred == pred) {
>> > +list_del(&src->src.use_link);
>> > +exec_node_remove(&src->node);
>> > + }
>> > +  }
>> > +   }
>> > +}
>> > +
>> >  /*
>> >   * update the CFG after a jump instruction has been added to the end
>> > of a block
>> >   */
>> > @@ -447,6 +464,8 @@ nir_handle_add_jump(nir_block *block)
>> > nir_instr *instr = nir_block_last_instr(block);
>> > nir_jump_instr *jump_instr = nir_instr_as_jump(instr);
>> >
>> > +   if (block->successors[0])
>> > +  remove_phi_src(block->successors[0], block);
>>
>> Don't we need to do the same for block->successors[1]?
>>
>
> I was going to say no because this function handles *adding* a phi and so
> the block should already only have one successor.  However, I suppose you
> could add a phi right before an if.  I'll add the one for
> block->successors[1] just to be safe.
>

On further thought, I don't think it's possible to end up with phi sources
at block->successors[1].  The only type of block that can have multiple
successors is one right before an if and both sides of the if have only one
predecessor so they can't have phis.  Unless, of course, we add a bunch of
no-op phis for some reason.  Eh, removing phis on block->successors[1] is
harmless and probably more correct.  Still, it's a very weird case...
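
As a toy illustration of the "just do both successors" option (every name
here is a hypothetical stand-in; the real NIR blocks, phis and lists look
quite different), dropping the phi sources that come from the jumping block
could be sketched like this:

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_SRCS 4

struct toy_block;

/* Hypothetical phi: only records which predecessor each source comes from. */
struct toy_phi {
   const struct toy_block *preds[TOY_MAX_SRCS];
   unsigned num_srcs;
};

struct toy_block {
   struct toy_phi *phis;            /* phis at the top of the block */
   unsigned num_phis;
   struct toy_block *successors[2]; /* second entry only used before an if */
};

/* Drop every phi source in `block` that comes from `pred`. */
static void
toy_remove_phi_src(struct toy_block *block, const struct toy_block *pred)
{
   for (unsigned p = 0; p < block->num_phis; p++) {
      struct toy_phi *phi = &block->phis[p];
      unsigned kept = 0;
      for (unsigned s = 0; s < phi->num_srcs; s++) {
         if (phi->preds[s] != pred)
            phi->preds[kept++] = phi->preds[s];
      }
      phi->num_srcs = kept;
   }
}

/* Before rewiring the CFG for a new jump, clean up both old successors. */
static void
toy_handle_add_jump(struct toy_block *block)
{
   assert(block != NULL);
   for (unsigned i = 0; i < 2; i++) {
      if (block->successors[i])
         toy_remove_phi_src(block->successors[i], block);
   }
}
```

If successors[1] has no phis, the second iteration is a no-op, which is why
handling it costs nothing.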

--Jason


Re: [Mesa-dev] [PATCH 1/5] nir/cf: Remove phi sources if needed in nir_handle_add_jump

2018-10-02 Thread Jason Ekstrand
On Tue, Oct 2, 2018 at 5:53 AM Iago Toral  wrote:

> On Sat, 2018-09-22 at 16:39 -0500, Jason Ekstrand wrote:
> > If the block in which the jump is inserted is the predecessor of a
> > phi
> > then we need to remove phi sources otherwise the phi may end up with
> > things improperly connected.  Found by running the Vulkan CTS with
> > SPIR-V optimizations enabled.
> >
> > Cc: mesa-sta...@lists.freedesktop.org
> > ---
> >  src/compiler/nir/nir_control_flow.c | 36 +++--
> > 
> >  1 file changed, 19 insertions(+), 17 deletions(-)
> >
> > diff --git a/src/compiler/nir/nir_control_flow.c
> > b/src/compiler/nir/nir_control_flow.c
> > index 3b0a0f1a5b0..a82f35550b8 100644
> > --- a/src/compiler/nir/nir_control_flow.c
> > +++ b/src/compiler/nir/nir_control_flow.c
> > @@ -437,6 +437,23 @@ nearest_loop(nir_cf_node *node)
> > return nir_cf_node_as_loop(node);
> >  }
> >
> > +static void
> > +remove_phi_src(nir_block *block, nir_block *pred)
> > +{
> > +   nir_foreach_instr(instr, block) {
> > +  if (instr->type != nir_instr_type_phi)
> > + break;
> > +
> > +  nir_phi_instr *phi = nir_instr_as_phi(instr);
> > +  nir_foreach_phi_src_safe(src, phi) {
> > + if (src->pred == pred) {
> > +list_del(&src->src.use_link);
> > +exec_node_remove(&src->node);
> > + }
> > +  }
> > +   }
> > +}
> > +
> >  /*
> >   * update the CFG after a jump instruction has been added to the end
> > of a block
> >   */
> > @@ -447,6 +464,8 @@ nir_handle_add_jump(nir_block *block)
> > nir_instr *instr = nir_block_last_instr(block);
> > nir_jump_instr *jump_instr = nir_instr_as_jump(instr);
> >
> > +   if (block->successors[0])
> > +  remove_phi_src(block->successors[0], block);
>
> Don't we need to do the same for block->successors[1]?
>

I was going to say no because this function handles *adding* a jump and so
the block should already only have one successor.  However, I suppose you
could add a jump right before an if.  I'll add the one for
block->successors[1] just to be safe.
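
To make the point about both successors concrete, here is a minimal sketch
using a toy CFG model. The toy_* types and functions are hypothetical
stand-ins written for this illustration; they are not the real nir_block or
nir_phi_src structures, and the real code walks instruction lists rather than
a fixed array.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for NIR blocks and phi sources (not the real types). */
#define TOY_MAX_PHI_SRCS 4

struct toy_block;

struct toy_phi {
   const struct toy_block *preds[TOY_MAX_PHI_SRCS]; /* predecessor of each source */
   int num_srcs;
};

struct toy_block {
   struct toy_phi phi;               /* one phi per block, for brevity */
   struct toy_block *successors[2];  /* NULL when unused */
};

/* Drop every phi source in 'block' that comes from 'pred'. */
static void
toy_remove_phi_src(struct toy_block *block, const struct toy_block *pred)
{
   int kept = 0;
   for (int i = 0; i < block->phi.num_srcs; i++) {
      if (block->phi.preds[i] != pred)
         block->phi.preds[kept++] = block->phi.preds[i];
   }
   block->phi.num_srcs = kept;
}

/* Before rewiring 'block' for a newly added jump, detach it from the phis
 * of both old successors, then unlink the successors themselves. */
static void
toy_handle_add_jump(struct toy_block *block)
{
   for (int i = 0; i < 2; i++) {
      if (block->successors[i]) {
         toy_remove_phi_src(block->successors[i], block);
         block->successors[i] = NULL;
      }
   }
}
```

Handling successors[1] costs nothing when it is NULL, which is why doing it
unconditionally is the safe choice even if the two-successor case is rare.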

--Jason


Re: [Mesa-dev] [PATCH 1/6] util: rename timestamp param in disk_cache_create()

2018-10-02 Thread Timothy Arceri

Ping on patches 1-5

I've pushed patch 6.

On 19/9/18 12:13 pm, Timothy Arceri wrote:

Only some drivers use a timestamp here. Others use things such
as build-id, or even a combination of build-ids from Mesa and
LLVM.
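
For readers unfamiliar with how such an identifier ends up in the cache key,
the following is a rough sketch of packing two NUL-terminated strings into a
key blob. The toy_* helpers and the example strings are invented for this
illustration; the real code uses the DRV_KEY_CPY macro shown in the diff and
packs more fields than these two.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Append 'size' bytes from 'src' and return the advanced write pointer. */
static uint8_t *
toy_key_append(uint8_t *dst, const void *src, size_t size)
{
   memcpy(dst, src, size);
   return dst + size;
}

/* Pack driver_id and gpu_name, including their NUL terminators, into 'blob'.
 * Returns the number of bytes written, or 0 if the blob is too small. */
static size_t
toy_build_key_blob(uint8_t *blob, size_t capacity,
                   const char *driver_id, const char *gpu_name)
{
   size_t id_size = strlen(driver_id) + 1;
   size_t gpu_name_size = strlen(gpu_name) + 1;

   if (id_size + gpu_name_size > capacity)
      return 0;

   uint8_t *p = blob;
   p = toy_key_append(p, driver_id, id_size);
   p = toy_key_append(p, gpu_name, gpu_name_size);
   return (size_t)(p - blob);
}
```

Nothing in the packing cares whether the identifier is a timestamp or a
build-id, which is the motivation for the more neutral parameter name.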
---
  src/util/disk_cache.c | 8 
  1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/util/disk_cache.c b/src/util/disk_cache.c
index 87ddfb86b27..368ec417927 100644
--- a/src/util/disk_cache.c
+++ b/src/util/disk_cache.c
@@ -189,7 +189,7 @@ do {   \
  } while (0);
  
  struct disk_cache *

-disk_cache_create(const char *gpu_name, const char *timestamp,
+disk_cache_create(const char *gpu_name, const char *driver_id,
uint64_t driver_flags)
  {
 void *local;
@@ -387,9 +387,9 @@ disk_cache_create(const char *gpu_name, const char 
*timestamp,
 cache->driver_keys_blob_size = cv_size;
  
 /* Create driver id keys */

-   size_t ts_size = strlen(timestamp) + 1;
+   size_t id_size = strlen(driver_id) + 1;
 size_t gpu_name_size = strlen(gpu_name) + 1;
-   cache->driver_keys_blob_size += ts_size;
+   cache->driver_keys_blob_size += id_size;
 cache->driver_keys_blob_size += gpu_name_size;
  
 /* We sometimes store entire structs that contains a pointers in the cache,

@@ -409,7 +409,7 @@ disk_cache_create(const char *gpu_name, const char 
*timestamp,
  
 uint8_t *drv_key_blob = cache->driver_keys_blob;

 DRV_KEY_CPY(drv_key_blob, &cache_version, cv_size)
-   DRV_KEY_CPY(drv_key_blob, timestamp, ts_size)
+   DRV_KEY_CPY(drv_key_blob, driver_id, id_size)
 DRV_KEY_CPY(drv_key_blob, gpu_name, gpu_name_size)
 DRV_KEY_CPY(drv_key_blob, &ptr_size, ptr_size_size)
 DRV_KEY_CPY(drv_key_blob, &driver_flags, driver_flags_size)




[Mesa-dev] [PATCH mesa] egl: sync headers from Khronos

2018-10-02 Thread Eric Engestrom
Signed-off-by: Eric Engestrom 
---
 include/EGL/egl.h | 8 
 include/EGL/eglext.h  | 9 +
 include/EGL/eglplatform.h | 8 
 3 files changed, 13 insertions(+), 12 deletions(-)

diff --git a/include/EGL/egl.h b/include/EGL/egl.h
index 93a21873c0f3ed850866..416b0935b279dcb99a21 100644
--- a/include/EGL/egl.h
+++ b/include/EGL/egl.h
@@ -28,17 +28,17 @@ extern "C" {
 ** MATERIALS OR THE USE OR OTHER DEALINGS IN THE MATERIALS.
 */
 /*
-** This header is generated from the Khronos OpenGL / OpenGL ES XML
-** API Registry. The current version of the Registry, generator scripts
+** This header is generated from the Khronos EGL XML API Registry.
+** The current version of the Registry, generator scripts
 ** used to make the header, and the header can be found at
 **   http://www.khronos.org/registry/egl
 **
-** Khronos $Git commit SHA1: a732b061e7 $ on $Git commit date: 2017-06-17 
23:27:53 +0100 $
+** Khronos $Git commit SHA1: e87f2f2fd2 $ on $Git commit date: 2018-09-30 
21:02:01 -0700 $
 */
 
 #include 
 
-/* Generated on date 20170627 */
+/* Generated on date 20180930 */
 
 /* Generated C header for:
  * API: egl
diff --git a/include/EGL/eglext.h b/include/EGL/eglext.h
index 794bd532881befec8ed9..18a3fbc1a6316b3cca29 100644
--- a/include/EGL/eglext.h
+++ b/include/EGL/eglext.h
@@ -28,17 +28,17 @@ extern "C" {
 ** MATERIALS OR THE USE OR OTHER DEALINGS IN THE MATERIALS.
 */
 /*
-** This header is generated from the Khronos OpenGL / OpenGL ES XML
-** API Registry. The current version of the Registry, generator scripts
+** This header is generated from the Khronos EGL XML API Registry.
+** The current version of the Registry, generator scripts
 ** used to make the header, and the header can be found at
 **   http://www.khronos.org/registry/egl
 **
-** Khronos $Git commit SHA1: bae3518c48 $ on $Git commit date: 2018-05-17 
10:56:57 -0700 $
+** Khronos $Git commit SHA1: e87f2f2fd2 $ on $Git commit date: 2018-09-30 
21:02:01 -0700 $
 */
 
 #include 
 
-#define EGL_EGLEXT_VERSION 20180517
+#define EGL_EGLEXT_VERSION 20180930
 
 /* Generated C header for:
  * API: egl
@@ -681,6 +681,7 @@ EGLAPI EGLBoolean EGLAPIENTRY eglQueryDisplayAttribEXT 
(EGLDisplay dpy, EGLint a
 #ifndef EGL_EXT_device_drm
 #define EGL_EXT_device_drm 1
 #define EGL_DRM_DEVICE_FILE_EXT   0x3233
+#define EGL_DRM_MASTER_FD_EXT 0x333C
 #endif /* EGL_EXT_device_drm */
 
 #ifndef EGL_EXT_device_enumeration
diff --git a/include/EGL/eglplatform.h b/include/EGL/eglplatform.h
index b0541d52aed6584c63f2..ab4152f153fdfad75238 100644
--- a/include/EGL/eglplatform.h
+++ b/include/EGL/eglplatform.h
@@ -80,8 +80,8 @@ typedef HWND    EGLNativeWindowType;
 #elif defined(__WINSCW__) || defined(__SYMBIAN32__)  /* Symbian */
 
 typedef int   EGLNativeDisplayType;
-typedef void *EGLNativeWindowType;
 typedef void *EGLNativePixmapType;
+typedef void *EGLNativeWindowType;
 
 #elif defined(WL_EGL_PLATFORM)
 
@@ -100,15 +100,15 @@ typedef void   *EGLNativeWindowType;
 struct ANativeWindow;
 struct egl_native_pixmap_t;
 
-typedef struct ANativeWindow*   EGLNativeWindowType;
-typedef struct egl_native_pixmap_t* EGLNativePixmapType;
 typedef void*   EGLNativeDisplayType;
+typedef struct egl_native_pixmap_t* EGLNativePixmapType;
+typedef struct ANativeWindow*   EGLNativeWindowType;
 
 #elif defined(USE_OZONE)
 
 typedef intptr_t EGLNativeDisplayType;
-typedef intptr_t EGLNativeWindowType;
 typedef intptr_t EGLNativePixmapType;
+typedef intptr_t EGLNativeWindowType;
 
 #elif defined(__unix__) || defined(__APPLE__)
 
-- 
Cheers,
  Eric



Re: [Mesa-dev] [PATCH 0/5] nir, spirv: Fix bugs uncovered by spirv-opt

2018-10-02 Thread Iago Toral
On Tue, 2018-10-02 at 12:54 +0200, Iago Toral wrote:
> Left a couple of minor comments, but otherwise the series is:
> 
> Reviewed-by: Iago Toral Quiorga 

s/Quiorga/Quiroga


> On Sat, 2018-09-22 at 16:39 -0500, Jason Ekstrand wrote:
> > This little series fixes three bugs encountered while running the Vulkan
> > CTS with SPIR-V optimizations enabled.  The optimizations shouldn't, in
> > theory, change the behavior of the shader but they do change some of the
> > SPIR-V patterns and triggered a number of compiler bugs.  I'd like to CC
> > the whole thing to stable but the last three patches are probably best
> > only hitting the latest release because they're fairly major surgery.
> > 
> > Jason Ekstrand (5):
> >   nir/cf: Remove phi sources if needed in nir_handle_add_jump
> >   nir/from_ssa: Don't rewrite derefs destinations to registers
> >   spirv: Move function call handling to vtn_cfg
> >   spirv: Pass SSA values through functions
> >   spirv: Make images, samplers, and sampled images normal SSA values
> > 
> >  src/compiler/nir/nir_control_flow.c |  36 ++--
> >  src/compiler/nir/nir_from_ssa.c |   6 +
> >  src/compiler/spirv/spirv_to_nir.c   | 123 +++-
> >  src/compiler/spirv/vtn_cfg.c| 279 ++--
> > 
> >  src/compiler/spirv/vtn_private.h|   9 +-
> >  src/compiler/spirv/vtn_variables.c  |  51 +++--
> >  6 files changed, 297 insertions(+), 207 deletions(-)


Re: [Mesa-dev] [PATCH 0/5] nir, spirv: Fix bugs uncovered by spirv-opt

2018-10-02 Thread Iago Toral
Left a couple of minor comments, but otherwise the series is:

Reviewed-by: Iago Toral Quiorga 

On Sat, 2018-09-22 at 16:39 -0500, Jason Ekstrand wrote:
> This little series fixes three bugs encountered while running the Vulkan
> CTS with SPIR-V optimizations enabled.  The optimizations shouldn't, in
> theory, change the behavior of the shader but they do change some of the
> SPIR-V patterns and triggered a number of compiler bugs.  I'd like to CC
> the whole thing to stable but the last three patches are probably best
> only hitting the latest release because they're fairly major surgery.
> 
> Jason Ekstrand (5):
>   nir/cf: Remove phi sources if needed in nir_handle_add_jump
>   nir/from_ssa: Don't rewrite derefs destinations to registers
>   spirv: Move function call handling to vtn_cfg
>   spirv: Pass SSA values through functions
>   spirv: Make images, samplers, and sampled images normal SSA values
> 
>  src/compiler/nir/nir_control_flow.c |  36 ++--
>  src/compiler/nir/nir_from_ssa.c |   6 +
>  src/compiler/spirv/spirv_to_nir.c   | 123 +++-
>  src/compiler/spirv/vtn_cfg.c| 279 ++--
> 
>  src/compiler/spirv/vtn_private.h|   9 +-
>  src/compiler/spirv/vtn_variables.c  |  51 +++--
>  6 files changed, 297 insertions(+), 207 deletions(-)
> 


Re: [Mesa-dev] [PATCH 1/5] nir/cf: Remove phi sources if needed in nir_handle_add_jump

2018-10-02 Thread Iago Toral
On Sat, 2018-09-22 at 16:39 -0500, Jason Ekstrand wrote:
> If the block in which the jump is inserted is the predecessor of a phi
> then we need to remove phi sources otherwise the phi may end up with
> things improperly connected.  Found by running the Vulkan CTS with
> SPIR-V optimizations enabled.
> 
> Cc: mesa-sta...@lists.freedesktop.org
> ---
>  src/compiler/nir/nir_control_flow.c | 36 +++--
> 
>  1 file changed, 19 insertions(+), 17 deletions(-)
> 
> diff --git a/src/compiler/nir/nir_control_flow.c
> b/src/compiler/nir/nir_control_flow.c
> index 3b0a0f1a5b0..a82f35550b8 100644
> --- a/src/compiler/nir/nir_control_flow.c
> +++ b/src/compiler/nir/nir_control_flow.c
> @@ -437,6 +437,23 @@ nearest_loop(nir_cf_node *node)
> return nir_cf_node_as_loop(node);
>  }
>  
> +static void
> +remove_phi_src(nir_block *block, nir_block *pred)
> +{
> +   nir_foreach_instr(instr, block) {
> +  if (instr->type != nir_instr_type_phi)
> + break;
> +
> +  nir_phi_instr *phi = nir_instr_as_phi(instr);
> +  nir_foreach_phi_src_safe(src, phi) {
> + if (src->pred == pred) {
> +list_del(&src->src.use_link);
> +exec_node_remove(&src->node);
> + }
> +  }
> +   }
> +}
> +
>  /*
>   * update the CFG after a jump instruction has been added to the end
> of a block
>   */
> @@ -447,6 +464,8 @@ nir_handle_add_jump(nir_block *block)
> nir_instr *instr = nir_block_last_instr(block);
> nir_jump_instr *jump_instr = nir_instr_as_jump(instr);
>  
> +   if (block->successors[0])
> +  remove_phi_src(block->successors[0], block);

Don't we need to do the same for block->successors[1]?

Iago

> unlink_block_successors(block);
>  
> nir_function_impl *impl = nir_cf_node_get_function(&block->cf_node);
> @@ -470,23 +489,6 @@ nir_handle_add_jump(nir_block *block)
> }
>  }
>  
> -static void
> -remove_phi_src(nir_block *block, nir_block *pred)
> -{
> -   nir_foreach_instr(instr, block) {
> -  if (instr->type != nir_instr_type_phi)
> - break;
> -
> -  nir_phi_instr *phi = nir_instr_as_phi(instr);
> -  nir_foreach_phi_src_safe(src, phi) {
> - if (src->pred == pred) {
> -list_del(&src->src.use_link);
> -exec_node_remove(&src->node);
> - }
> -  }
> -   }
> -}
> -
>  /* Removes the successor of a block with a jump. Note that the jump
> to be
>   * eliminated may be free-floating.
>   */


Re: [Mesa-dev] [PATCH] st/va:Aligned image width and height to 16.

2018-10-02 Thread Christian König

Am 02.10.2018 um 03:47 schrieb Sharma, Deepak:

From: suresh guttula 

When decoding a resolution such as 40x24, the surface video buffer is always
allocated with its width and height aligned to the macroblock size, which is 16.
But when the application fetches the decoded data through vaCreateImage/
vaGetImage, the image width and height are only aligned to 2, which results in
a smaller image buffer and causes memory stomping.
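
The size mismatch described above can be worked through with a small sketch.
toy_align is a hypothetical round-up helper written for this illustration; it
is assumed to behave like the align() call in the diff below (rounding up to a
power-of-two multiple), but it is not copied from Mesa.

```c
#include <assert.h>

/* Round 'value' up to the next multiple of 'alignment', which must be a
 * power of two. Hypothetical helper, assumed to match align() in spirit. */
static unsigned
toy_align(unsigned value, unsigned alignment)
{
   assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
   return (value + alignment - 1) & ~(alignment - 1);
}

/* Number of pixels in one plane after aligning both dimensions. */
static unsigned
toy_aligned_pixels(unsigned width, unsigned height, unsigned alignment)
{
   return toy_align(width, alignment) * toy_align(height, alignment);
}
```

For 40x24, aligning to 2 leaves the plane at 40x24 (960 pixels), while
aligning to 16 gives 48x32 (1536 pixels), so a copy sized for the surface
overruns an image buffer sized with the smaller alignment.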


Well, NAK. Whether the picture needs to be aligned to 16 depends on the codec.

For example, VC-1 would create decoding errors with that.

Regards,
Christian.



Signed-off-by: suresh guttula 
---
  src/gallium/state_trackers/va/image.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/gallium/state_trackers/va/image.c 
b/src/gallium/state_trackers/va/image.c
index 3f892c9..2fc47b7 100644
--- a/src/gallium/state_trackers/va/image.c
+++ b/src/gallium/state_trackers/va/image.c
@@ -123,8 +123,8 @@ vlVaCreateImage(VADriverContextP ctx, VAImageFormat 
*format, int width, int heig
 img->format = *format;
 img->width = width;
 img->height = height;
-   w = align(width, 2);
-   h = align(height, 2);
+   w = align(width, 16);
+   h = align(height, 16);
  
 switch (format->fourcc) {

 case VA_FOURCC('N','V','1','2'):


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 8/8] egl: add EGL_platform_device support

2018-10-02 Thread Emil Velikov
On Thu, 20 Sep 2018 at 15:13, Mathias Fröhlich
 wrote:

>
> If I replace the above with
>
>  EGLint surface_type = 0;
>  /* Only advertise pbuffer configs for non swrast devices */
>  if (dri2_dpy->image_driver)
> surface_type = EGL_PBUFFER_BIT;
>
>  dri2_conf = dri2_add_config(dpy, dri2_dpy->driver_configs[i],
>config_count + 1, surface_type, NULL,
>visuals[j].rgba_masks);
>
> then I can easily prohibit the crash that I mentioned when trying to
> create a pbuffer surface on the swrast device.
> At least I do no longer get a valid pbuffer config from eglChooseConfig
> and without that we cannot reach the crashing
> dri2_dpy->image_driver->createNewDrawable
> call somewhere from eglCreatePbufferSurface anymore.
>
> Still getting a surface less context on swrast should work...
>
The issue is that we do not know how to create a "pbuffer only" swrast.

Hence one resorts to hacks like the ones we have in platform_surfaceless,
effectively piling hacks upon hacks - see swrast_loader_extensions
and "software path w/o DRM.".

That said, I'm working on a proper solution, but since it will take some
time to finish/merge I'll drop this and 7/8 for now.

Thanks
Emil


Re: [Mesa-dev] [PATCH v2 3/8] egl: add EGL_EXT_device_drm support

2018-10-02 Thread Emil Velikov
Hi Mathias,

On Thu, 20 Sep 2018 at 15:12, Mathias Fröhlich
 wrote:

> > @@ -141,6 +231,12 @@ _eglQueryDeviceStringEXT(_EGLDevice *dev, EGLint name)
> > switch (name) {
> > case EGL_EXTENSIONS:
> >return dev->extensions;
> > +#ifdef HAVE_LIBDRM
> > +   case EGL_DRM_DEVICE_FILE_EXT:
> > +  if (_eglDeviceSupports(dev, _EGL_DEVICE_DRM))
> > + return dev->device->nodes[DRM_NODE_PRIMARY];
> ... we probably want
> return _eglGetDRMDeviceRenderNode(dev);
>
That isn't quite possible, as discussed in 2016's thread
"EGL_EXT_*_drm - primary vs render node".

The extension is (was?) not too clear that a card node must be
returned, yet there are applications that depend on it.

As mentioned in said thread we could add another extension which adds
support for EGL_DRM_RENDER_DEVICE_FILE_EXT.
But I'd suggest keeping that as a follow-up - hence the comment above
_eglGetDRMDeviceRenderNode().

-Emil


Re: [Mesa-dev] [PATCH 4/5] spirv: Pass SSA values through functions

2018-10-02 Thread Iago Toral
On Sat, 2018-09-22 at 16:39 -0500, Jason Ekstrand wrote:
> Previously, we would create temporary variables and fill them out.
> Instead, we create as many function parameters as we need and pass them
> through as SSA defs.

(...)

>  void
>  vtn_handle_function_call(struct vtn_builder *b, SpvOp opcode,
>   const uint32_t *w, unsigned count)
> @@ -86,12 +215,8 @@ vtn_handle_function_call(struct vtn_builder *b,
> SpvOp opcode,
>   call->params[param_idx++] =
>  nir_src_for_ssa(vtn_pointer_to_ssa(b, pointer));
>} else {
> - /* This is a regular SSA value and we need a temporary */
> - nir_variable *tmp =
> -nir_local_variable_create(b->nb.impl, arg_type->type, "arg_tmp");
> - nir_deref_instr *tmp_deref = nir_build_deref_var(&b->nb, tmp);
> - vtn_local_store(b, vtn_ssa_value(b, arg_id), tmp_deref);
> - call->params[param_idx++] = nir_src_for_ssa(&tmp_deref->dest.ssa);
> + vtn_ssa_value_add_to_call_params(b, vtn_ssa_value(b, w[4 +
> i]),

arg_id instead of w[4 + i] for consistency?

Iago


> +  arg_type, call, &param_idx);
>}
> }


[Mesa-dev] [Bug 108120] tools_error2aub-error2aub fails to build: error: implicit declaration of function ‘va_start’

2018-10-02 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108120

--- Comment #4 from Lionel Landwerlin  ---
(In reply to Paul Menzel from comment #3)
> (In reply to Lionel Landwerlin from comment #2)
> > Sent https://patchwork.freedesktop.org/patch/254342/
> 
> Thanks, I just wanted to upload the same change. Two nitpicks.
> 
> 1.  Should the include be inserted lexicographically?
> 2.  I always prefer to have the error message in the commit message.
> 
> PS: I am not very involved in Mesa, but isn’t there a build tester, which
> should have caught that error? (Also, why did nobody else get it?)


Feel free to upload a better change, I'll review and push to master.

There are build testers. Not quite sure why it wasn't caught...
The only explanation I could think of is that we have different build options
(we usually build a large set of drivers, not just intel ones) and that somehow
pulls in a header file that already includes stdarg.h.
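
As a reminder of why the explicit include matters, here is a small
self-contained variadic function. toy_format is a made-up name for this
illustration and has nothing to do with the code in error2aub.c; the point is
only that va_list, va_start and va_end come from <stdarg.h>, and relying on
another header to pull it in is fragile.

```c
#include <assert.h>
#include <stdarg.h>   /* declares va_list, va_start, va_end */
#include <stdio.h>

/* Format into 'buf' printf-style and return what vsnprintf() returns.
 * Without <stdarg.h>, va_start would be an implicit declaration. */
static int
toy_format(char *buf, size_t size, const char *fmt, ...)
{
   va_list args;
   int n;

   assert(fmt != NULL);

   va_start(args, fmt);
   n = vsnprintf(buf, size, fmt, args);
   va_end(args);

   return n;
}
```

Whether a given build hits the error depends on which other headers happen to
include <stdarg.h> transitively, which would fit the explanation above.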

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH] intel: error2aub: fix missing include

2018-10-02 Thread Tapani Pälli

Reviewed-by: Tapani Pälli 

On 10/2/18 12:29 PM, Lionel Landwerlin wrote:

Signed-off-by: Lionel Landwerlin 
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108120
---
  src/intel/tools/error2aub.c | 1 +
  1 file changed, 1 insertion(+)

diff --git a/src/intel/tools/error2aub.c b/src/intel/tools/error2aub.c
index 8a23d5ef1e7..b33cb1356f9 100644
--- a/src/intel/tools/error2aub.c
+++ b/src/intel/tools/error2aub.c
@@ -28,6 +28,7 @@
  #include 
  #include 
  #include 
+#include <stdarg.h>
  #include 
  #include 
  




Re: [Mesa-dev] [PATCH v2 2/8] egl: add EGL_MESA_device_software support

2018-10-02 Thread Emil Velikov
On Thu, 20 Sep 2018 at 15:12, Mathias Fröhlich
 wrote:
>
> Hi Emil,
>
> Some comments inline below:
>
> On Tuesday, 4 September 2018 20:32:59 CEST Emil Velikov wrote:
> > From: Emil Velikov 
> >
> > Add a plain software device, which is always available.
> >
> > We can safely assign it as the first/initial device in _eglGlobals,
> > although we ensure that's the case with a handful of _eglDeviceSupports
> > checks throughout the code.
> >
> > v2:
> >  - s/_eglFindDevice/_eglAddDevice/ (Eric)
> >  - s/_eglLookupAllDevices/_eglRefreshDeviceList/ (Eric)
> >  - move ^^ helpers into a earlier patch (Eric, Mathias)
> >  - set the SW device on _eglGlobal init. (Eric)
> >  - add a number of _eglDeviceSupports checks (Mathias)
> >  - split Device/Display attach to a separate patch
> >
> > Signed-off-by: Emil Velikov 
> > ---
> >  src/egl/main/egldevice.c  | 27 +++
> >  src/egl/main/egldevice.h  |  4 +++-
> >  src/egl/main/eglglobals.c |  1 +
> >  3 files changed, 31 insertions(+), 1 deletion(-)
> >
> > diff --git a/src/egl/main/egldevice.c b/src/egl/main/egldevice.c
> > index bbc9f2060d1..1d67fd71191 100644
> > --- a/src/egl/main/egldevice.c
> > +++ b/src/egl/main/egldevice.c
> > @@ -37,6 +37,8 @@ struct _egl_device {
> > _EGLDevice *Next;
> >
> > const char *extensions;
> > +
> > +   EGLBoolean MESA_device_software;
> >  };
> >
> >  void
> > @@ -47,6 +49,12 @@ _eglFiniDevice(void)
> > /* atexit function is called with global mutex locked */
> >
> > dev_list = _eglGlobal.DeviceList;
> > +
> > +   /* The first device is on-stack allocated SW device */
>
> May be I name that wrong as non native english, but 'on-stack'
> would be something different for me. The stack is more or less
> the function local allocation scope.
>
> The sw device is much more a 'static allocated SW device'.
>
> Right?
>
Sounds better indeed. Will fix shortly.
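
For anyone following the terminology point, a short sketch of the
distinction. The toy_* names are hypothetical and only loosely mirror the
idea of a built-in software device; this is not the egldevice.c code.

```c
#include <assert.h>

struct toy_device {
   int is_software;
};

/* Static storage duration: this object exists for the whole program run.
 * It lives neither on the stack nor on the heap, so cleanup code must
 * never pass it to free(). */
static struct toy_device toy_sw_device = { 1 };

static struct toy_device *
toy_get_sw_device(void)
{
   return &toy_sw_device;
}

/* Cleanup code can use a check like this to skip the static entry. */
static int
toy_is_static_device(const struct toy_device *dev)
{
   return dev == &toy_sw_device;
}
```

An on-stack object, by contrast, would be a local variable that stops
existing when its function returns, which is why "static allocated" is the
more accurate wording for the comment.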

-Emil


Re: [Mesa-dev] [PATCH v2 1/8] egl: add base EGL_EXT_device_base implementation

2018-10-02 Thread Emil Velikov
On Thu, 20 Sep 2018 at 15:11, Mathias Fröhlich
 wrote:
>
> Hi Emil,
>
Thanks Mathias.

Just coming back from some holidays and XDC. Most of your comments are
addressed ... the ones which are not I'll comment on in a moment.

> On Tuesday, 4 September 2018 20:32:58 CEST Emil Velikov wrote:
> > From: Emil Velikov 
> >
> > Introduce the API for device query and enumeration. Those at the moment
> > produce nothing useful since zero devices are actually available.
> >
> > That contradicts with the spec, so the extension isn't advertised just
> > yet.
> >
> > With later commits we'll add support for software (always) and hardware
> > devices. Each one exposing the respective extension string.
> >
> > v2:
> >  - fold API boilerplate into this patch
> >  - move _eglAddDevice, _eglDeviceSupports, _eglRefreshDeviceList to this
> > patch (Eric, Mathias)
> >  - make _eglFiniDevice the one called last
>
> Thanks for the updated series.
>
> Nevertheless, patches #1, #2, #3, #8 have comments.
> Especially there are lots of asserts that either don't compile or assert on
> otherwise working code.
> You should double check what they are supposed to do.
>
Indeed I got my asserts the wrong way around ... at least I was consistent ;-)

> > +static int
> > +_eglRefreshDeviceList(void)
> > +{
> > +   _EGLDevice *dev;
> > +   int count = 0;
> > +
> > +   dev = _eglGlobal.DeviceList;
>
> That one gives a compile warning in release compiles.
>
> Even in the final version of the file past the whole patch series.
> The dev argument is only used in asserts finally.
>
Right, I'll annotate dev as MAYBE_UNUSED.
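
A minimal sketch of the warning and the annotation follows. TOY_MAYBE_UNUSED
is a stand-in defined here with the GCC/Clang attribute syntax; how Mesa's
MAYBE_UNUSED is actually defined is an assumption, and the toy_* names are
invented for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: GCC/Clang attribute syntax, standing in for MAYBE_UNUSED. */
#define TOY_MAYBE_UNUSED __attribute__((unused))

struct toy_device {
   struct toy_device *next;
};

static struct toy_device toy_first_device;
static struct toy_device *toy_device_list = &toy_first_device;

static int
toy_refresh_device_list(void)
{
   /* With NDEBUG defined, assert() expands to nothing, so 'dev' would be
    * initialized but never read and the compiler would warn about it.
    * The attribute documents that this is intentional. */
   TOY_MAYBE_UNUSED struct toy_device *dev = toy_device_list;
   int count = 0;

   assert(dev != NULL);
   count++;

   return count;
}
```

The annotation only silences the diagnostic; the assert still runs in debug
builds exactly as before.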

-Emil


[Mesa-dev] [Bug 108120] tools_error2aub-error2aub fails to build: error: implicit declaration of function ‘va_start’

2018-10-02 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108120

--- Comment #3 from Paul Menzel  ---
(In reply to Lionel Landwerlin from comment #2)
> Sent https://patchwork.freedesktop.org/patch/254342/

Thanks, I just wanted to upload the same change. Two nitpicks.

1.  Should the include be inserted lexicographically?
2.  I always prefer to have the error message in the commit message.

PS: I am not very involved in Mesa, but isn’t there a build tester, which
should have caught that error? (Also, why did nobody else get it?)

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


[Mesa-dev] [Bug 108120] tools_error2aub-error2aub fails to build: error: implicit declaration of function ‘va_start’

2018-10-02 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108120

--- Comment #2 from Lionel Landwerlin  ---
Sent https://patchwork.freedesktop.org/patch/254342/

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


[Mesa-dev] [PATCH] intel: error2aub: fix missing include

2018-10-02 Thread Lionel Landwerlin
Signed-off-by: Lionel Landwerlin 
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108120
---
 src/intel/tools/error2aub.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/intel/tools/error2aub.c b/src/intel/tools/error2aub.c
index 8a23d5ef1e7..b33cb1356f9 100644
--- a/src/intel/tools/error2aub.c
+++ b/src/intel/tools/error2aub.c
@@ -28,6 +28,7 @@
 #include 
 #include 
 #include 
+#include <stdarg.h>
 #include 
 #include 
 
-- 
2.19.0


