Re: [Mesa-dev] [PATCH 12/13] nir: add a loop unrolling pass

2016-08-29 Thread Connor Abbott
On Tue, Aug 30, 2016 at 12:40 AM, Matt Turner  wrote:
> On Mon, Aug 29, 2016 at 9:06 PM, Timothy Arceri
>  wrote:
>> Can't the phi have more than one source from before the loop? e.g.
>>
>>    int i = 0;
>>    if (something)
>>       i = 1;
>>    else
>>       i = 2;
>>
>>    for ( ; i < 5; i++)
>>       do_stuff(i);
>
> In fact, no. :)
>
> NIR's control flow avoids so-called "critical edges" by ensuring that
> all if/else must be preceded by a single basic block and followed by a
> single basic block. It simplifies a lot of situations. This invariant
> and a few more are documented at the top of nir_control_flow.c.
>
> In your example, there will be a basic block between the end of the
> if/else construct and the beginning of the loop which will contain a
> phi node for i.
>
> A phi node only has as many sources as its containing basic block has
> incoming edges. There are only two ways (that I can think of) that a
> block may have more than two incoming edges in NIR: multiple break or
> continue statements in a loop.

Matt is exactly right here... FWIW, I wrote up more details on NIR
control flow at
https://people.freedesktop.org/~cwabbott0/nir-docs/control_flow.html#the-nir-control-flow-model
(although I need to rebase my series and update the control flow
modification part... hmm).
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] automake: egl: Android: Add libEGL dependencies

2016-08-29 Thread Tapani Pälli



On 08/30/2016 07:59 AM, Tapani Pälli wrote:



On 08/29/2016 07:11 PM, Emil Velikov wrote:

On 29 August 2016 at 05:37, Tapani Pälli  wrote:



On 08/26/2016 03:58 PM, Emil Velikov wrote:


On 26 August 2016 at 08:50, Tapani Pälli 
wrote:


Reviewed-by: Tapani Pälli 


What happened with my suggestion about getting things fixed as opposed
to adding tape over things, namely this thread [1] ?
Can someone please look into that one instead or give me some tips how
I can get things into AOSP ? Last time I've looked AOSP had longer and
more convoluted procedures than anything in the Linux graphics stack.



I'm not sure when this kind of 'big change' would happen, would be
nice to
have a working solution now and then discuss better solution in peace?


I hope I'm wrong, but I doubt anyone had the time/chance/will to
pursue the proposed solution. As such, pushing this 'hack' will not
increase the incentive/chances of resolving this properly.


(trying to reduce the amount of patches that have to be applied to get
things working/built)


Ack and thank you for that. I believe the overall goal should be to
resolve things in a 'good enough for upstreaming' method, rather than
just pushing the first solution that comes to mind. Similar to how it
was done with the other CrOS inspired patches.



I could try to push .pc files in to those projects but I'm not sure how
that should work. Should those projects then install the .pc files
somewhere during Android build or would we mess with the PKG_CONFIG_PATH


*sigh* I mean during Chromium build ..


to be able to find them?

// Tapani


Re: [Mesa-dev] [PATCH] clover: Introduce CLOVER_COMPILER_OPTIONS

2016-08-29 Thread Edward O'Callaghan


On 08/30/2016 09:20 AM, Vedran Miletić wrote:
> Options specified via the CLOVER_COMPILER_OPTIONS shell variable are
> appended to the compiler options specified by the OpenCL program (if
> any).
> 
> Signed-off-by: Vedran Miletić 
Reviewed-by: Edward O'Callaghan 

> ---
>  docs/envvars.html | 2 ++
>  src/gallium/state_trackers/clover/llvm/invocation.cpp | 9 +++--
>  2 files changed, 9 insertions(+), 2 deletions(-)
> 
> diff --git a/docs/envvars.html b/docs/envvars.html
> index 6d79398..52835b6 100644
> --- a/docs/envvars.html
> +++ b/docs/envvars.html
> @@ -224,6 +224,8 @@ Mesa EGL supports different sets of environment 
> variables.  See the
>  GALLIUM_DUMP_CPU - if non-zero, print information about the CPU on 
> start-up
>  TGSI_PRINT_SANITY - if set, do extra sanity checking on TGSI shaders and
>  print any errors to stderr.
> +CLOVER_COMPILER_OPTIONS - allows specifying additional compiler options.
> +Specified options are appended after the options set by the OpenCL 
> program.
>  DRAW_FSE - ???
>  DRAW_NO_FSE - ???
>  DRAW_USE_LLVM - if set to zero, the draw module will not use LLVM to 
> execute
> diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp 
> b/src/gallium/state_trackers/clover/llvm/invocation.cpp
> index 5490d72..748850f 100644
> --- a/src/gallium/state_trackers/clover/llvm/invocation.cpp
> +++ b/src/gallium/state_trackers/clover/llvm/invocation.cpp
> @@ -196,11 +196,16 @@ clover::llvm::compile_program(const std::string ,
>const std::string ,
>const std::string ,
>std::string _log) {
> +   const char *extra_opts_env = getenv("CLOVER_COMPILER_OPTIONS");
> +   std::string extra_opts;
> +   if (extra_opts_env)
> +   extra_opts = std::string(extra_opts_env);
> +
> if (has_flag(debug::clc))
> -  debug::log(".cl", "// Options: " + opts + '\n' + source);
> +  debug::log(".cl", "// Compiler options: " + opts + " " + extra_opts + 
> '\n' + source);
>  
> auto ctx = create_context(r_log);
> -   auto c = create_compiler_instance(target, tokenize(opts + " input.cl"),
> +   auto c = create_compiler_instance(target, tokenize(opts + " " + 
> extra_opts + " input.cl"),
>   r_log);
> auto mod = compile(*ctx, *c, "input.cl", source, headers, target, opts,
>r_log);
> 
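
The append behaviour described in the commit message can be sketched in plain C (the patch itself is C++). This is only an illustration under stated assumptions: build_compiler_args() and build_args_from_env() are hypothetical names, not clover functions. Only the CLOVER_COMPILER_OPTIONS variable, the " input.cl" suffix and the order of concatenation come from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: writes "<opts> <extra> input.cl" into buf.
 * extra_opts_env may be NULL (variable unset), in which case nothing is
 * appended, matching the empty std::string in the patch.
 * Returns false if buf is too small.
 */
static bool
build_compiler_args(char *buf, size_t size, const char *opts,
                    const char *extra_opts_env)
{
   assert(buf != NULL && opts != NULL);

   const char *extra = extra_opts_env ? extra_opts_env : "";
   const char *suffix = " input.cl";
   size_t need = strlen(opts) + 1 + strlen(extra) + strlen(suffix) + 1;

   if (need > size)
      return false;

   snprintf(buf, size, "%s %s%s", opts, extra, suffix);
   return true;
}

/* Hypothetical wrapper that reads the variable introduced by the patch. */
static bool
build_args_from_env(char *buf, size_t size, const char *opts)
{
   return build_compiler_args(buf, size, opts,
                              getenv("CLOVER_COMPILER_OPTIONS"));
}
```

Because the extra options come last, they follow whatever the OpenCL program passed. When the variable is unset the result simply contains two consecutive spaces before "input.cl", which mirrors the string concatenation in the patch.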





Re: [Mesa-dev] [PATCH] automake: egl: Android: Add libEGL dependencies

2016-08-29 Thread Tapani Pälli



On 08/29/2016 07:11 PM, Emil Velikov wrote:

On 29 August 2016 at 05:37, Tapani Pälli  wrote:



On 08/26/2016 03:58 PM, Emil Velikov wrote:


On 26 August 2016 at 08:50, Tapani Pälli  wrote:


Reviewed-by: Tapani Pälli 


What happened with my suggestion about getting things fixed as opposed
to adding tape over things, namely this thread [1] ?
Can someone please look into that one instead or give me some tips how
I can get things into AOSP ? Last time I've looked AOSP had longer and
more convoluted procedures than anything in the Linux graphics stack.



I'm not sure when this kind of 'big change' would happen, would be nice to
have a working solution now and then discuss better solution in peace?


I hope I'm wrong, but I doubt anyone had the time/chance/will to
pursue the proposed solution. As such, pushing this 'hack' will not
increase the incentive/chances of resolving this properly.


(trying to reduce the amount of patches that have to be applied to get
things working/built)


Ack and thank you for that. I believe the overall goal should be to
resolve things in a 'good enough for upstreaming' method, rather than
just pushing the first solution that comes to mind. Similar to how it
was done with the other CrOS inspired patches.



I could try to push .pc files in to those projects but I'm not sure how 
that should work. Should those projects then install the .pc files 
somewhere during Android build or would we mess with the PKG_CONFIG_PATH 
to be able to find them?


// Tapani


Re: [Mesa-dev] [PATCH 12/13] nir: add a loop unrolling pass

2016-08-29 Thread Matt Turner
On Mon, Aug 29, 2016 at 9:06 PM, Timothy Arceri
 wrote:
> Can't the phi have more than one source from before the loop? e.g.
>
>    int i = 0;
>    if (something)
>       i = 1;
>    else
>       i = 2;
>
>    for ( ; i < 5; i++)
>       do_stuff(i);

In fact, no. :)

NIR's control flow avoids so-called "critical edges" by ensuring that
all if/else must be preceded by a single basic block and followed by a
single basic block. It simplifies a lot of situations. This invariant
and a few more are documented at the top of nir_control_flow.c.

In your example, there will be a basic block between the end of the
if/else construct and the beginning of the loop which will contain a
phi node for i.

A phi node only has as many sources as its containing basic block has
incoming edges. There are only two ways (that I can think of) that a
block may have more than two incoming edges in NIR: multiple break or
continue statements in a loop.
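
To make the edge counting concrete, here is a minimal sketch in plain C. It deliberately does not use NIR's real data structures: struct block, add_edge() and phi_num_srcs() are hypothetical stand-ins that only track predecessors. It builds the control flow for the if/else-then-loop example above and counts how many sources a phi for i would need in each block.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PREDS 4

/* Hypothetical stand-in for a basic block; only predecessors are tracked. */
struct block {
   const char *name;
   const struct block *preds[MAX_PREDS];
   size_t num_preds;
};

/* Record a control-flow edge from -> to. */
static void
add_edge(const struct block *from, struct block *to)
{
   assert(to->num_preds < MAX_PREDS);
   to->preds[to->num_preds++] = from;
}

/* A phi placed in b needs exactly one source per incoming edge of b. */
static size_t
phi_num_srcs(const struct block *b)
{
   return b->num_preds;
}

/* Build the CFG for the example and report how many sources a phi for i
 * would have in the block after the if/else and in the loop header.
 */
static void
example_phi_srcs(size_t *merge_srcs, size_t *header_srcs)
{
   struct block entry = { .name = "entry" };
   struct block then_blk = { .name = "then" };
   struct block else_blk = { .name = "else" };
   struct block merge = { .name = "after_if" };
   struct block header = { .name = "loop_header" };
   struct block latch = { .name = "loop_end" };

   add_edge(&entry, &then_blk);
   add_edge(&entry, &else_blk);
   add_edge(&then_blk, &merge);
   add_edge(&else_blk, &merge);
   add_edge(&merge, &header);   /* the only edge entering the loop */
   add_edge(&header, &latch);
   add_edge(&latch, &header);   /* back edge */

   *merge_srcs = phi_num_srcs(&merge);
   *header_srcs = phi_num_srcs(&header);
}
```

In this model the block after the if/else merges i = 1 and i = 2 with a two-source phi, so the loop header only ever sees one value of i coming from before the loop, plus one from the back edge.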


Re: [Mesa-dev] [PATCH 10/13] nir: add helper for cloning loops

2016-08-29 Thread Timothy Arceri
On Mon, 2016-08-29 at 21:06 -0400, Connor Abbott wrote:
> So, you've noticed that your method of handling phis while cloning
> doesn't handle phis that point to other phis in the same block.
> Particularly, something like:
> 
> a = phi(b, ...)
> b = phi(a, ...)
> 
> which is supposed to swap a and b each iteration. Here's a better
> strategy which should be simpler than this and fixes that problem.
> Create a new remap_table for each iteration of the loop. On the first
> iteration, make all the phi nodes remap to their pre-loop sources
> before cloning the body. On each successive iteration, make each phi
> node map to whatever the *old* remap table (from the previous iteration)
> says its back-edge source (the value from the previous iteration)
> remaps to. In pseudocode:
> 
> nir_cf_list loop_body;
> extract loop body into loop_body
> move loop header instructions (minus phi nodes) into loop_body
>
> struct hash_table *old_remap_table = NULL, *remap_table;
> for each iteration:
>     remap_table = new dictionary;
>     for each phi in original loop:
>         if old_remap_table:
>             remap_table[phi] = old_remap_table[phi source pointing to end of loop]
>         else:
>             remap_table[phi] = phi source pointing to block before loop
>     nir_cf_list cloned_body;
>     clone loop_body to cloned_body using remap_table
>     insert cloned_body before loop
>     old_remap_table = remap_table
>
> delete loop_body
> delete loop

I see what you are getting at but what about something like this?

   int j = 0;
   for (int i = 0; i < 5; i++) {
      ...
      j++;
      if (uniform)
         j += 2;
   }

I'm not sure we can throw away phis so easily.
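
For reference, the per-iteration remap idea in the quoted pseudocode can be sketched in plain C for the a/b swap case. This is a toy model under stated assumptions, not NIR code: the remap table is a small array indexed by a hypothetical enum of SSA names, concrete ints stand in for cloned SSA definitions, and unroll_swap_loop() is an invented name. It does not address phis inside the loop body, such as the one the if (uniform) case would create.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical SSA names: two pre-loop values and two loop-header phis. */
enum { A0, B0, PHI_A, PHI_B, NUM_VALUES };

struct phi {
   int def;            /* value defined by the phi */
   int pre_loop_src;   /* source coming from the block before the loop */
   int back_edge_src;  /* source coming from the end of the loop body */
};

/* a = phi(a0, b); b = phi(b0, a): the values swap on every iteration. */
static const struct phi phis[] = {
   { PHI_A, A0, PHI_B },
   { PHI_B, B0, PHI_A },
};

static void
unroll_swap_loop(int iterations, int a0, int b0, int *a_out, int *b_out)
{
   int old_remap[NUM_VALUES], remap[NUM_VALUES];
   bool have_old = false;

   assert(iterations > 0);

   for (int it = 0; it < iterations; it++) {
      /* Values defined before the loop map to themselves. */
      remap[A0] = a0;
      remap[B0] = b0;

      for (size_t p = 0; p < sizeof(phis) / sizeof(phis[0]); p++) {
         /* Always read the *previous* iteration's table. Reading the
          * table being built would see PHI_A already overwritten when
          * PHI_B is resolved, and the swap would be lost.
          */
         remap[phis[p].def] = have_old ? old_remap[phis[p].back_edge_src]
                                       : remap[phis[p].pre_loop_src];
      }

      /* A real pass would clone the loop body here, using remap. */

      memcpy(old_remap, remap, sizeof(remap));
      have_old = true;
   }

   *a_out = remap[PHI_A];
   *b_out = remap[PHI_B];
}
```

With two iterations the second copy of the body sees a and b exchanged; with three they are back in their original order, which is what the swapping phis are supposed to produce.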

> 
> This should eliminate the need to expose nir_cf_list_cleanup(),
> stitch_blocks(), etc. publicly as well as the previous patch, and it
> should handle the phi swapping case correctly.
> 
> 
> On Mon, Aug 29, 2016 at 12:59 AM, Timothy Arceri
>  wrote:
> > 
> > ---
> >  src/compiler/nir/nir.h   |  3 +++
> >  src/compiler/nir/nir_clone.c | 64
> > +++-
> >  2 files changed, 60 insertions(+), 7 deletions(-)
> > 
> > diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
> > index 0ab3ebc..9083bd0 100644
> > --- a/src/compiler/nir/nir.h
> > +++ b/src/compiler/nir/nir.h
> > @@ -2372,6 +2372,9 @@ void nir_print_instr(const nir_instr *instr,
> > FILE *fp);
> > 
> >  nir_shader *nir_shader_clone(void *mem_ctx, const nir_shader *s);
> >  nir_function_impl *nir_function_impl_clone(const nir_function_impl
> > *fi);
> > +void nir_clone_loop_list(struct exec_list *dst, const struct
> > exec_list *list,
> > + struct hash_table *remap_table,
> > + struct hash_table *phi_remap, nir_shader
> > *ns);
> >  nir_constant *nir_constant_clone(const nir_constant *c,
> > nir_variable *var);
> >  nir_variable *nir_variable_clone(const nir_variable *c, nir_shader
> > *shader);
> > 
> > diff --git a/src/compiler/nir/nir_clone.c
> > b/src/compiler/nir/nir_clone.c
> > index 8808333..071afc9 100644
> > --- a/src/compiler/nir/nir_clone.c
> > +++ b/src/compiler/nir/nir_clone.c
> > @@ -35,9 +35,17 @@ typedef struct {
> > /* True if we are cloning an entire shader. */
> > bool global_clone;
> > 
> > +   /* This allows us to clone a loop body without having to add
> > srcs from
> > +* outside the loop to the remap table. This is useful for loop
> > unrolling.
> > +*/
> > +   bool allow_remap_fallback;
> > +
> > /* maps orig ptr -> cloned ptr: */
> > struct hash_table *remap_table;
> > 
> > +   /* used for remaping when cloning loop body for loop unrolling
> > */
> > +   struct hash_table *phi_remap_table;
> > +
> > /* List of phi sources. */
> > struct list_head phi_srcs;
> > 
> > @@ -46,11 +54,20 @@ typedef struct {
> >  } clone_state;
> > 
> >  static void
> > -init_clone_state(clone_state *state, bool global)
> > +init_clone_state(clone_state *state, struct hash_table
> > *remap_table,
> > + bool global, bool allow_remap_fallback)
> >  {
> > state->global_clone = global;
> > -   state->remap_table = _mesa_hash_table_create(NULL,
> > _mesa_hash_pointer,
> > -_mesa_key_pointer_
> > equal);
> > +   state->allow_remap_fallback = allow_remap_fallback;
> > +
> > +   state->phi_remap_table = NULL;
> > +   if (remap_table) {
> > +  state->remap_table = remap_table;
> > +   } else {
> > +  state->remap_table = _mesa_hash_table_create(NULL,
> > _mesa_hash_pointer,
> > +   _mesa_key_point
> > er_equal);
> > +   }
> > +
> > list_inithead(&state->phi_srcs);
> >  }
> > 
> > @@ -72,16 +89,32 @@ _lookup_ptr(clone_state *state, const void
> > *ptr, bool global)
> >    return (void *)ptr;
> > 
> > entry = _mesa_hash_table_search(state->remap_table, ptr);
> > -   assert(entry && "Failed to find pointer!");
> > if (!entry)
> > -  return 

Re: [Mesa-dev] [PATCH 12/13] nir: add a loop unrolling pass

2016-08-29 Thread Timothy Arceri
On Tue, 2016-08-30 at 14:16 +1000, Timothy Arceri wrote:
> On Tue, 2016-08-30 at 14:06 +1000, Timothy Arceri wrote:
> > 
> > On Mon, 2016-08-29 at 20:34 -0400, Connor Abbott wrote:
> > > 
> > > 
> > > On Mon, Aug 29, 2016 at 12:59 AM, Timothy Arceri
> > >  wrote:
> > > > 
> > > > 
> > > > 
> > > > ---
> > > >  src/compiler/Makefile.sources  |   1 +
> > > >  src/compiler/nir/nir.h |   2 +
> > > >  src/compiler/nir/nir_opt_loop_unroll.c | 394
> > > > +
> > > >  3 files changed, 397 insertions(+)
> > > >  create mode 100644 src/compiler/nir/nir_opt_loop_unroll.c
> > > > 
> > > > diff --git a/src/compiler/Makefile.sources
> > > > b/src/compiler/Makefile.sources
> > > > index 79de484..a9f104d 100644
> > > > --- a/src/compiler/Makefile.sources
> > > > +++ b/src/compiler/Makefile.sources
> > > > @@ -231,6 +231,7 @@ NIR_FILES = \
> > > > nir/nir_opt_dead_cf.c \
> > > > nir/nir_opt_gcm.c \
> > > > nir/nir_opt_global_to_local.c \
> > > > +   nir/nir_opt_loop_unroll.c \
> > > > nir/nir_opt_peephole_select.c \
> > > > nir/nir_opt_remove_phis.c \
> > > > nir/nir_opt_undef.c \
> > > > diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
> > > > index 9083bd0..81d9dfc 100644
> > > > --- a/src/compiler/nir/nir.h
> > > > +++ b/src/compiler/nir/nir.h
> > > > @@ -2676,6 +2676,8 @@ bool nir_opt_dead_cf(nir_shader *shader);
> > > > 
> > > >  void nir_opt_gcm(nir_shader *shader);
> > > > 
> > > > +bool nir_opt_loop_unroll(nir_shader *shader);
> > > > +
> > > >  bool nir_opt_peephole_select(nir_shader *shader);
> > > > 
> > > >  bool nir_opt_remove_phis(nir_shader *shader);
> > > > diff --git a/src/compiler/nir/nir_opt_loop_unroll.c
> > > > b/src/compiler/nir/nir_opt_loop_unroll.c
> > > > new file mode 100644
> > > > index 000..22530c9
> > > > --- /dev/null
> > > > +++ b/src/compiler/nir/nir_opt_loop_unroll.c
> > > > @@ -0,0 +1,394 @@
> > > > +/*
> > > > + * Copyright © 2016 Intel Corporation
> > > > + *
> > > > + * Permission is hereby granted, free of charge, to any person
> > > > obtaining a
> > > > + * copy of this software and associated documentation files
> > > > (the
> > > > "Software"),
> > > > + * to deal in the Software without restriction, including
> > > > without
> > > > limitation
> > > > + * the rights to use, copy, modify, merge, publish,
> > > > distribute,
> > > > sublicense,
> > > > + * and/or sell copies of the Software, and to permit persons
> > > > to
> > > > whom the
> > > > + * Software is furnished to do so, subject to the following
> > > > conditions:
> > > > + *
> > > > + * The above copyright notice and this permission notice
> > > > (including the next
> > > > + * paragraph) shall be included in all copies or substantial
> > > > portions of the
> > > > + * Software.
> > > > + *
> > > > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY
> > > > KIND,
> > > > EXPRESS OR
> > > > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
> > > > MERCHANTABILITY,
> > > > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN
> > > > NO
> > > > EVENT SHALL
> > > > + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
> > > > DAMAGES OR OTHER
> > > > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
> > > > OTHERWISE,
> > > > ARISING
> > > > + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE
> > > > OR
> > > > OTHER
> > > > + * DEALINGS IN THE SOFTWARE.
> > > > + */
> > > > +
> > > > +#include "nir.h"
> > > > +#include "nir_builder.h"
> > > > +#include "nir_control_flow.h"
> > > > +
> > > > +typedef struct {
> > > > +   /* Array of loops */
> > > > +   nir_loop *loops;
> > > > +
> > > > +   /* Array of unroll factors */
> > > > +   int *factors;
> > > > +} unroll_vector;
> > > > +
> > > > +typedef struct {
> > > > +   /* Array of loop infos for the loop nest */
> > > > +   nir_loop_info *li;
> > > > +
> > > > +   /* List of unroll vectors */
> > > > +   struct list_head unroll_vectors;
> > > > +
> > > > +   nir_shader_compiler_options *options;
> > > > +} loop_unroll_state;
> > > > +
> > > > +static void
> > > > +extract_loop_body(nir_cf_list *extracted, nir_cf_node *node,
> > > > nir_shader *ns)
> > > 
> > > The shader argument is unused, you can delete it.
> > > 
> > > > 
> > > > 
> > > > 
> > > > +{
> > > > +   nir_cf_node *end = node;
> > > > +   while (!nir_cf_node_is_last(end))
> > > > +  end = nir_cf_node_next(end);
> > > > +
> > > > +   nir_cf_loop_list_extract(extracted, node, end);
> > > > +}
> > > > +
> > > > +static void
> > > > +clone_list(nir_shader *ns, nir_loop *loop, nir_cf_list
> > > > *src_cf_list,
> > > > +   nir_cf_list *cloned_cf_list, struct hash_table
> > > > *remap_table,
> > > > +   struct hash_table *phi_remap)
> > > > +{
> > > > +   /* Dest list needs to at least have one block */
> > > > +   nir_block *nblk = nir_block_create(ns);
> > > > +   nblk->cf_node.parent = 

Re: [Mesa-dev] [PATCH 12/13] nir: add a loop unrolling pass

2016-08-29 Thread Timothy Arceri
On Tue, 2016-08-30 at 14:06 +1000, Timothy Arceri wrote:
> On Mon, 2016-08-29 at 20:34 -0400, Connor Abbott wrote:
> > 
> > On Mon, Aug 29, 2016 at 12:59 AM, Timothy Arceri
> >  wrote:
> > > 
> > > 
> > > ---
> > >  src/compiler/Makefile.sources  |   1 +
> > >  src/compiler/nir/nir.h |   2 +
> > >  src/compiler/nir/nir_opt_loop_unroll.c | 394
> > > +
> > >  3 files changed, 397 insertions(+)
> > >  create mode 100644 src/compiler/nir/nir_opt_loop_unroll.c
> > > 
> > > diff --git a/src/compiler/Makefile.sources
> > > b/src/compiler/Makefile.sources
> > > index 79de484..a9f104d 100644
> > > --- a/src/compiler/Makefile.sources
> > > +++ b/src/compiler/Makefile.sources
> > > @@ -231,6 +231,7 @@ NIR_FILES = \
> > > nir/nir_opt_dead_cf.c \
> > > nir/nir_opt_gcm.c \
> > > nir/nir_opt_global_to_local.c \
> > > +   nir/nir_opt_loop_unroll.c \
> > > nir/nir_opt_peephole_select.c \
> > > nir/nir_opt_remove_phis.c \
> > > nir/nir_opt_undef.c \
> > > diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
> > > index 9083bd0..81d9dfc 100644
> > > --- a/src/compiler/nir/nir.h
> > > +++ b/src/compiler/nir/nir.h
> > > @@ -2676,6 +2676,8 @@ bool nir_opt_dead_cf(nir_shader *shader);
> > > 
> > >  void nir_opt_gcm(nir_shader *shader);
> > > 
> > > +bool nir_opt_loop_unroll(nir_shader *shader);
> > > +
> > >  bool nir_opt_peephole_select(nir_shader *shader);
> > > 
> > >  bool nir_opt_remove_phis(nir_shader *shader);
> > > diff --git a/src/compiler/nir/nir_opt_loop_unroll.c
> > > b/src/compiler/nir/nir_opt_loop_unroll.c
> > > new file mode 100644
> > > index 000..22530c9
> > > --- /dev/null
> > > +++ b/src/compiler/nir/nir_opt_loop_unroll.c
> > > @@ -0,0 +1,394 @@
> > > +/*
> > > + * Copyright © 2016 Intel Corporation
> > > + *
> > > + * Permission is hereby granted, free of charge, to any person
> > > obtaining a
> > > + * copy of this software and associated documentation files (the
> > > "Software"),
> > > + * to deal in the Software without restriction, including
> > > without
> > > limitation
> > > + * the rights to use, copy, modify, merge, publish, distribute,
> > > sublicense,
> > > + * and/or sell copies of the Software, and to permit persons to
> > > whom the
> > > + * Software is furnished to do so, subject to the following
> > > conditions:
> > > + *
> > > + * The above copyright notice and this permission notice
> > > (including the next
> > > + * paragraph) shall be included in all copies or substantial
> > > portions of the
> > > + * Software.
> > > + *
> > > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY
> > > KIND,
> > > EXPRESS OR
> > > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
> > > MERCHANTABILITY,
> > > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO
> > > EVENT SHALL
> > > + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
> > > DAMAGES OR OTHER
> > > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
> > > OTHERWISE,
> > > ARISING
> > > + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
> > > OTHER
> > > + * DEALINGS IN THE SOFTWARE.
> > > + */
> > > +
> > > +#include "nir.h"
> > > +#include "nir_builder.h"
> > > +#include "nir_control_flow.h"
> > > +
> > > +typedef struct {
> > > +   /* Array of loops */
> > > +   nir_loop *loops;
> > > +
> > > +   /* Array of unroll factors */
> > > +   int *factors;
> > > +} unroll_vector;
> > > +
> > > +typedef struct {
> > > +   /* Array of loop infos for the loop nest */
> > > +   nir_loop_info *li;
> > > +
> > > +   /* List of unroll vectors */
> > > +   struct list_head unroll_vectors;
> > > +
> > > +   nir_shader_compiler_options *options;
> > > +} loop_unroll_state;
> > > +
> > > +static void
> > > +extract_loop_body(nir_cf_list *extracted, nir_cf_node *node,
> > > nir_shader *ns)
> > 
> > The shader argument is unused, you can delete it.
> > 
> > > 
> > > 
> > > +{
> > > +   nir_cf_node *end = node;
> > > +   while (!nir_cf_node_is_last(end))
> > > +  end = nir_cf_node_next(end);
> > > +
> > > +   nir_cf_loop_list_extract(extracted, node, end);
> > > +}
> > > +
> > > +static void
> > > +clone_list(nir_shader *ns, nir_loop *loop, nir_cf_list
> > > *src_cf_list,
> > > +   nir_cf_list *cloned_cf_list, struct hash_table
> > > *remap_table,
> > > +   struct hash_table *phi_remap)
> > > +{
> > > +   /* Dest list needs to at least have one block */
> > > +   nir_block *nblk = nir_block_create(ns);
> > > +   nblk->cf_node.parent = loop->cf_node.parent;
> > > +   exec_list_push_tail(&cloned_cf_list->list,
> > > +                       &nblk->cf_node.node);
> > > +
> > > +   nir_clone_loop_list(&cloned_cf_list->list, &src_cf_list->list,
> > > +                       remap_table, phi_remap, ns);
> > > +}
> > > +
> > > +static void
> > > +remove_unrolled_loop(nir_cf_node *loop, nir_block
> > > *block_before_loop,
> > > + struct 

Re: [Mesa-dev] [PATCH 12/13] nir: add a loop unrolling pass

2016-08-29 Thread Timothy Arceri
On Mon, 2016-08-29 at 20:34 -0400, Connor Abbott wrote:
> On Mon, Aug 29, 2016 at 12:59 AM, Timothy Arceri
>  wrote:
> > 
> > ---
> >  src/compiler/Makefile.sources  |   1 +
> >  src/compiler/nir/nir.h |   2 +
> >  src/compiler/nir/nir_opt_loop_unroll.c | 394
> > +
> >  3 files changed, 397 insertions(+)
> >  create mode 100644 src/compiler/nir/nir_opt_loop_unroll.c
> > 
> > diff --git a/src/compiler/Makefile.sources
> > b/src/compiler/Makefile.sources
> > index 79de484..a9f104d 100644
> > --- a/src/compiler/Makefile.sources
> > +++ b/src/compiler/Makefile.sources
> > @@ -231,6 +231,7 @@ NIR_FILES = \
> > nir/nir_opt_dead_cf.c \
> > nir/nir_opt_gcm.c \
> > nir/nir_opt_global_to_local.c \
> > +   nir/nir_opt_loop_unroll.c \
> > nir/nir_opt_peephole_select.c \
> > nir/nir_opt_remove_phis.c \
> > nir/nir_opt_undef.c \
> > diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
> > index 9083bd0..81d9dfc 100644
> > --- a/src/compiler/nir/nir.h
> > +++ b/src/compiler/nir/nir.h
> > @@ -2676,6 +2676,8 @@ bool nir_opt_dead_cf(nir_shader *shader);
> > 
> >  void nir_opt_gcm(nir_shader *shader);
> > 
> > +bool nir_opt_loop_unroll(nir_shader *shader);
> > +
> >  bool nir_opt_peephole_select(nir_shader *shader);
> > 
> >  bool nir_opt_remove_phis(nir_shader *shader);
> > diff --git a/src/compiler/nir/nir_opt_loop_unroll.c
> > b/src/compiler/nir/nir_opt_loop_unroll.c
> > new file mode 100644
> > index 000..22530c9
> > --- /dev/null
> > +++ b/src/compiler/nir/nir_opt_loop_unroll.c
> > @@ -0,0 +1,394 @@
> > +/*
> > + * Copyright © 2016 Intel Corporation
> > + *
> > + * Permission is hereby granted, free of charge, to any person
> > obtaining a
> > + * copy of this software and associated documentation files (the
> > "Software"),
> > + * to deal in the Software without restriction, including without
> > limitation
> > + * the rights to use, copy, modify, merge, publish, distribute,
> > sublicense,
> > + * and/or sell copies of the Software, and to permit persons to
> > whom the
> > + * Software is furnished to do so, subject to the following
> > conditions:
> > + *
> > + * The above copyright notice and this permission notice
> > (including the next
> > + * paragraph) shall be included in all copies or substantial
> > portions of the
> > + * Software.
> > + *
> > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
> > EXPRESS OR
> > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
> > MERCHANTABILITY,
> > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO
> > EVENT SHALL
> > + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
> > DAMAGES OR OTHER
> > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
> > ARISING
> > + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
> > OTHER
> > + * DEALINGS IN THE SOFTWARE.
> > + */
> > +
> > +#include "nir.h"
> > +#include "nir_builder.h"
> > +#include "nir_control_flow.h"
> > +
> > +typedef struct {
> > +   /* Array of loops */
> > +   nir_loop *loops;
> > +
> > +   /* Array of unroll factors */
> > +   int *factors;
> > +} unroll_vector;
> > +
> > +typedef struct {
> > +   /* Array of loop infos for the loop nest */
> > +   nir_loop_info *li;
> > +
> > +   /* List of unroll vectors */
> > +   struct list_head unroll_vectors;
> > +
> > +   nir_shader_compiler_options *options;
> > +} loop_unroll_state;
> > +
> > +static void
> > +extract_loop_body(nir_cf_list *extracted, nir_cf_node *node,
> > nir_shader *ns)
> 
> The shader argument is unused, you can delete it.
> 
> > 
> > +{
> > +   nir_cf_node *end = node;
> > +   while (!nir_cf_node_is_last(end))
> > +  end = nir_cf_node_next(end);
> > +
> > +   nir_cf_loop_list_extract(extracted, node, end);
> > +}
> > +
> > +static void
> > +clone_list(nir_shader *ns, nir_loop *loop, nir_cf_list
> > *src_cf_list,
> > +   nir_cf_list *cloned_cf_list, struct hash_table
> > *remap_table,
> > +   struct hash_table *phi_remap)
> > +{
> > +   /* Dest list needs to at least have one block */
> > +   nir_block *nblk = nir_block_create(ns);
> > +   nblk->cf_node.parent = loop->cf_node.parent;
> > +   exec_list_push_tail(&cloned_cf_list->list,
> > +                       &nblk->cf_node.node);
> > +
> > +   nir_clone_loop_list(&cloned_cf_list->list, &src_cf_list->list,
> > +                       remap_table, phi_remap, ns);
> > +}
> > +
> > +static void
> > +remove_unrolled_loop(nir_cf_node *loop, nir_block
> > *block_before_loop,
> > + struct hash_table *remap_table,
> > + struct hash_table *phi_remap,
> > + nir_function_impl *impl)
> > +{
> > +   /* Fixup LCSSA-phi srcs */
> > +   nir_block *prev_block =
> > nir_cf_node_as_block(nir_cf_node_prev(loop));
> > +   nir_cf_node *cf_node = nir_cf_node_next(loop);
> > +   assert(cf_node->type == nir_cf_node_block);
> > +
> > +   nir_block *block = 

Re: [Mesa-dev] [PATCH 08/13] nir: add control flow helpers for loop unrolling

2016-08-29 Thread Timothy Arceri
On Mon, 2016-08-29 at 20:42 -0400, Connor Abbott wrote:
> On Mon, Aug 29, 2016 at 12:59 AM, Timothy Arceri
>  wrote:
> > 
> > This makes stitch_blocks() available for use elsewhere, and adds
> > a new helper that extracts a cf list without worrying about
> > validation.
> 
> I already noted in patch 12/13 that you can get rid of your use of
> stitch_blocks(). I also don't get why you need a special
> nir_cf_loop_list_extract() here... why can't you use
> nir_cf_extract()?
> The only difference between the two is that nir_cf_extract()
> re-stitches the nodes together, and it also makes sure that you don't
> handle phi nodes incorrectly, but AFAICT those differences don't
> matter for the way you intend to use it.

As I said, manipulating the control flow for loops is not much fun.
The current cf helpers are not well suited to loop unrolling: they try
to be too smart about keeping the cf in a valid state as we go, but
extracting almost anything from a loop can break it.

nir_cf_extract() will fall over if we try to use it.

> 
> > 
> > ---
> >  src/compiler/nir/nir_control_flow.c | 34
> > --
> >  src/compiler/nir/nir_control_flow.h |  5 +
> >  2 files changed, 37 insertions(+), 2 deletions(-)
> > 
> > diff --git a/src/compiler/nir/nir_control_flow.c
> > b/src/compiler/nir/nir_control_flow.c
> > index a485e71..ed8cd24 100644
> > --- a/src/compiler/nir/nir_control_flow.c
> > +++ b/src/compiler/nir/nir_control_flow.c
> > @@ -628,8 +628,7 @@ update_if_uses(nir_cf_node *node)
> >   * Stitch two basic blocks together into one. The aggregate must
> > have the same
> >   * predecessors as the first and the same successors as the
> > second.
> >   */
> > -
> > -static void
> > +void
> >  stitch_blocks(nir_block *before, nir_block *after)
> >  {
> > /*
> > @@ -791,6 +790,37 @@ nir_cf_extract(nir_cf_list *extracted,
> > nir_cursor begin, nir_cursor end)
> > stitch_blocks(block_before, block_after);
> >  }
> > 
> > +/**
> > + * Its not really possible to extract control flow from a loop
> > while keeping
> > + * the cf valid so this function just rips out what we ask for and
> > any
> > + * validation and fix ups are left to the caller.
> > + */
> > +void
> > +nir_cf_loop_list_extract(nir_cf_list *extracted, nir_cf_node
> > *begin,
> > + nir_cf_node *end)
> > +{
> > +   extracted->impl = nir_cf_node_get_function(begin);
> > +   exec_list_make_empty(&extracted->list);
> > +
> > +   /* Dominance and other block-related information is toast. */
> > +   nir_metadata_preserve(extracted->impl, nir_metadata_none);
> > +
> > +   nir_cf_node *cf_node = begin;
> > +   nir_cf_node *cf_node_end = end;
> > +   while (true) {
> > +  nir_cf_node *next = nir_cf_node_next(cf_node);
> > +
> > +  exec_node_remove(&cf_node->node);
> > +  cf_node->parent = NULL;
> > +  exec_list_push_tail(&extracted->list, &cf_node->node);
> > +
> > +  if (cf_node == cf_node_end)
> > + break;
> > +
> > +  cf_node = next;
> > +   }
> > +}
> > +
> >  void
> >  nir_cf_reinsert(nir_cf_list *cf_list, nir_cursor cursor)
> >  {
> > diff --git a/src/compiler/nir/nir_control_flow.h
> > b/src/compiler/nir/nir_control_flow.h
> > index b71382f..0d97486 100644
> > --- a/src/compiler/nir/nir_control_flow.h
> > +++ b/src/compiler/nir/nir_control_flow.h
> > @@ -78,6 +78,9 @@ nir_cf_node_insert_end(struct exec_list *list,
> > nir_cf_node *node)
> > nir_cf_node_insert(nir_after_cf_list(list), node);
> >  }
> > 
> > +void
> > +stitch_blocks(nir_block *before, nir_block *after);
> > +
> > 
> >  /** Control flow motion.
> >   *
> > @@ -148,6 +151,8 @@ nir_cf_list_extract(nir_cf_list *extracted,
> > struct exec_list *cf_list)
> >    nir_after_cf_list(cf_list));
> >  }
> > 
> > +void nir_cf_loop_list_extract(nir_cf_list *extracted, nir_cf_node
> > *begin, nir_cf_node *end);
> > +
> >  /** removes a control flow node, doing any cleanup necessary */
> >  static inline void
> >  nir_cf_node_remove(nir_cf_node *node)
> > --
> > 2.7.4
> > 


[Mesa-dev] [PATCH] glsl: Initialize outputs[] array in lower_blend_equation_advanced.

2016-08-29 Thread Kenneth Graunke
Caught by Coverity.  Likely fixes real issues if an output component
is not present.

CID: 1372278
Signed-off-by: Kenneth Graunke 
---
 src/compiler/glsl/lower_blend_equation_advanced.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/compiler/glsl/lower_blend_equation_advanced.cpp 
b/src/compiler/glsl/lower_blend_equation_advanced.cpp
index a998df1..1d03392 100644
--- a/src/compiler/glsl/lower_blend_equation_advanced.cpp
+++ b/src/compiler/glsl/lower_blend_equation_advanced.cpp
@@ -497,7 +497,7 @@ lower_blend_equation_advanced(struct gl_linked_shader *sh)
 * which writes a subset of the components, starting at location_frac.
 * The variables can't overlap, thankfully.
 */
-   ir_variable *outputs[4];
+   ir_variable *outputs[4] = { NULL, NULL, NULL, NULL };
foreach_in_list(ir_instruction, ir, sh->ir) {
   ir_variable *var = ir->as_variable();
   if (!var || var->data.mode != ir_var_shader_out)
-- 
2.9.3
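
For anyone wondering why the initializer matters: a local array of pointers has indeterminate contents until it is written, so a later NULL check on a slot is only meaningful if unused slots start out as NULL. A minimal sketch of the pattern follows. The names `struct var` and `collect_outputs` are hypothetical stand-ins, not the real GLSL IR types or the pass's actual logic.

```c
#include <stddef.h>

/* Hypothetical stand-in for ir_variable; only the fields the sketch needs. */
struct var {
   int location_frac;   /* first component written by this variable */
   int components;      /* number of components written */
};

/* Record, for each of the four components, which variable writes it.
 * Components that nothing writes keep whatever the caller put in slots[],
 * which is why the caller must start from an all-NULL array.
 * Returns the number of components that were claimed. */
static int
collect_outputs(struct var *vars, size_t count, struct var *slots[4])
{
   int covered = 0;

   for (size_t i = 0; i < count; i++) {
      for (int c = 0; c < vars[i].components; c++) {
         int idx = vars[i].location_frac + c;
         if (idx < 4 && slots[idx] == NULL) {
            slots[idx] = &vars[i];
            covered++;
         }
      }
   }
   return covered;
}
```

With `struct var *slots[4] = { NULL, NULL, NULL, NULL };` at the call site, a component that no variable writes reads back as NULL instead of stack garbage.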



Re: [Mesa-dev] [PATCH v2] egl: return corresponding offset of EGLImage instead of 0.

2016-08-29 Thread Weng, Chuanbo
Hi Dave and all,
This version is based on Dave's review comments. Could you please 
review it?
Thanks!

-Original Message-
From: Weng, Chuanbo 
Sent: Friday, August 26, 2016 1:51 AM
To: mesa-dev@lists.freedesktop.org; airl...@gmail.com
Cc: emil.l.veli...@gmail.com; Weng, Chuanbo 
Subject: [PATCH v2] egl: return corresponding offset of EGLImage instead of 0.

The offset should not always be 0. For example, if EGLImage is created from a 
2D texture with EGL_GL_TEXTURE_LEVEL=1, then the offset should be the actual 
start of miplevel 1 in drm bo.

v2: version bump on the EGL image interface and add gallium pieces.
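
Since the new attribute only exists from interface version 13 on, a caller that may run against an older driver could guard the query. The following is only a sketch under that assumption: `struct image`, `struct image_ext`, `query_with_offset` and `image_offset` are hypothetical, cut-down stand-ins, not the real __DRIimage API, and the patch itself calls queryImage without such a guard.

```c
#include <stdbool.h>
#include <stddef.h>

#define ATTRIB_OFFSET 0x200A   /* same value as __DRI_IMAGE_ATTRIB_OFFSET */

/* Hypothetical stand-ins for the image and its extension vtable. */
struct image { int offset; };
struct image_ext {
   int version;
   bool (*query)(const struct image *img, int attrib, int *value);
};

/* Driver side: report the stored offset for the new attribute. */
static bool
query_with_offset(const struct image *img, int attrib, int *value)
{
   if (attrib == ATTRIB_OFFSET) {
      *value = img->offset;
      return true;
   }
   return false;
}

/* Caller side: only ask for the offset when the extension is new enough,
 * otherwise fall back to the old behaviour of reporting 0. */
static int
image_offset(const struct image_ext *ext, const struct image *img)
{
   int offset = 0;

   if (ext && ext->version >= 13 && ext->query)
      ext->query(img, ATTRIB_OFFSET, &offset);
   return offset;
}
```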

Signed-off-by: Chuanbo Weng 
---
 include/GL/internal/dri_interface.h  | 4 +++-
 src/egl/drivers/dri2/egl_dri2.c  | 3 ++-
 src/gallium/state_trackers/dri/dri2.c| 8 +++-
 src/gbm/backends/dri/gbm_dri.c   | 5 +++--
 src/mesa/drivers/dri/i965/intel_screen.c | 9 +++--
 5 files changed, 22 insertions(+), 7 deletions(-)

diff --git a/include/GL/internal/dri_interface.h 
b/include/GL/internal/dri_interface.h
index 1c73cce..d0b1bc6 100644
--- a/include/GL/internal/dri_interface.h
+++ b/include/GL/internal/dri_interface.h
@@ -1094,7 +1094,7 @@ struct __DRIdri2ExtensionRec {
  * extensions.
  */
 #define __DRI_IMAGE "DRI_IMAGE"
-#define __DRI_IMAGE_VERSION 12
+#define __DRI_IMAGE_VERSION 13
 
 /**
  * These formats correspond to the similarly named MESA_FORMAT_*
@@ -1208,6 +1208,8 @@ struct __DRIdri2ExtensionRec {
 #define __DRI_IMAGE_ATTRIB_FOURCC   0x2008 /* available in versions 11 */
 #define __DRI_IMAGE_ATTRIB_NUM_PLANES   0x2009 /* available in versions 11 */
 
+#define __DRI_IMAGE_ATTRIB_OFFSET 0x200A /* available in versions 13 */
+
 enum __DRIYUVColorSpace {
__DRI_YUV_COLOR_SPACE_UNDEFINED = 0,
__DRI_YUV_COLOR_SPACE_ITU_REC601 = 0x327F,
diff --git a/src/egl/drivers/dri2/egl_dri2.c b/src/egl/drivers/dri2/egl_dri2.c
index e854903..3f28efe 100644
--- a/src/egl/drivers/dri2/egl_dri2.c
+++ b/src/egl/drivers/dri2/egl_dri2.c
@@ -2257,7 +2257,8 @@ dri2_export_dma_buf_image_mesa(_EGLDriver *drv, 
_EGLDisplay *disp, _EGLImage *im
  __DRI_IMAGE_ATTRIB_STRIDE, strides);
 
if (offsets)
-  offsets[0] = 0;
+  dri2_dpy->image->queryImage(dri2_img->dri_image,
+ __DRI_IMAGE_ATTRIB_OFFSET, offsets);
 
return EGL_TRUE;
 }
diff --git a/src/gallium/state_trackers/dri/dri2.c 
b/src/gallium/state_trackers/dri/dri2.c
index 9803b0e..a7f20da 100644
--- a/src/gallium/state_trackers/dri/dri2.c
+++ b/src/gallium/state_trackers/dri/dri2.c
@@ -1004,6 +1004,12 @@ dri2_query_image(__DRIimage *image, int attrib, int 
*value)
case __DRI_IMAGE_ATTRIB_NUM_PLANES:
   *value = 1;
   return GL_TRUE;
+   case __DRI_IMAGE_ATTRIB_OFFSET:
+  /*TODO: We just set the offset to 0 here. It can be set
+   * to corresponding value if someone has the requirement.
+   */
+  *value = 0;
+  return GL_TRUE;
default:
   return GL_FALSE;
}
@@ -1313,7 +1319,7 @@ dri2_get_capabilities(__DRIscreen *_screen)
 
 /* The extension is modified during runtime if DRI_PRIME is detected */
 static __DRIimageExtension dri2ImageExtension = {
-.base = { __DRI_IMAGE, 12 },
+.base = { __DRI_IMAGE, 13 },
 
 .createImageFromName  = dri2_create_image_from_name,
 .createImageFromRenderbuffer  = dri2_create_image_from_renderbuffer,
diff --git a/src/gbm/backends/dri/gbm_dri.c b/src/gbm/backends/dri/gbm_dri.c 
index c3626e3..b14faef 100644
--- a/src/gbm/backends/dri/gbm_dri.c
+++ b/src/gbm/backends/dri/gbm_dri.c
@@ -941,7 +941,7 @@ gbm_dri_bo_map(struct gbm_bo *_bo,
   return *map_data;
}
 
-   if (!dri->image || dri->image->base.version < 12) {
+   if (!dri->image || dri->image->base.version < 12 || 
+ !dri->image->mapImage) {
   errno = ENOSYS;
   return NULL;
}
@@ -972,7 +972,8 @@ gbm_dri_bo_unmap(struct gbm_bo *_bo, void *map_data)
   return;
}
 
-   if (!dri->context || !dri->image || dri->image->base.version < 12)
+   if (!dri->context || !dri->image ||
+   dri->image->base.version < 12 || !dri->image->unmapImage)
   return;
 
dri->image->unmapImage(dri->context, bo->image, map_data);
diff --git a/src/mesa/drivers/dri/i965/intel_screen.c b/src/mesa/drivers/dri/i965/intel_screen.c
index 7876652..a16d2c5 100644
--- a/src/mesa/drivers/dri/i965/intel_screen.c
+++ b/src/mesa/drivers/dri/i965/intel_screen.c
@@ -609,6 +609,9 @@ intel_query_image(__DRIimage *image, int attrib, int *value)
case __DRI_IMAGE_ATTRIB_NUM_PLANES:
   *value = 1;
   return true;
+   case __DRI_IMAGE_ATTRIB_OFFSET:
+  *value = image->offset;
+  return true;
 
   default:
   return false;
@@ -845,7 +848,7 @@ intel_from_planar(__DRIimage *parent, int plane, void *loaderPrivate)
 }
 
 static const __DRIimageExtension intelImageExtension = {
-.base = { __DRI_IMAGE, 11 },
+

Re: [Mesa-dev] [PATCH 02/13] nir: limit copy propagation inside loops

2016-08-29 Thread Connor Abbott
On Mon, Aug 29, 2016 at 2:30 AM, Timothy Arceri
 wrote:
> On Mon, 2016-08-29 at 01:30 -0400, Connor Abbott wrote:
>> On Mon, Aug 29, 2016 at 12:54 AM, Timothy Arceri
>>  wrote:
>> >
>> > Don't do copy propagation inside loops until after we try
>> > unrolling them.
>> >
>> > This helps avoid propagating everything to the phis which
>> > makes loop unrolling more difficult.
>> >
>> > For example without this:
>> >
>> >loop {
>> >   block block_1:
>> >   /* preds: block_0 block_4 */
>> >   vec1 32 ssa_10 = phi block_0: ssa_5, block_4: ssa_15
>> >   vec1 32 ssa_11 = phi block_0: ssa_6, block_4: ssa_17
>> >   vec1 32 ssa_12 = phi block_0: ssa_7, block_4: ssa_18
>> >   vec1 32 ssa_13 = phi block_0: ssa_8, block_4: ssa_19
>> >   vec1 32 ssa_14 = phi block_0: ssa_9, block_4: ssa_20
>> >   vec1 32 ssa_15 = iadd ssa_10, ssa_2
>> >   vec1 32 ssa_16 = ige ssa_15, ssa_1
>> >   /* succs: block_2 block_3 */
>> >   if ssa_16 {
>> >  block block_2:
>> >  /* preds: block_1 */
>> >  break
>> >  /* succs: block_5 */
>> >   } else {
>> >  block block_3:
>> >  /* preds: block_1 */
>> >  /* succs: block_4 */
>> >   }
>> >   block block_4:
>> >   /* preds: block_3 */
>> >   vec1 32 ssa_17 = imov ssa_12
>> >   vec1 32 ssa_18 = imov ssa_13
>> >   vec1 32 ssa_19 = imov ssa_14
>> >   vec1 32 ssa_20 = imov ssa_11
>> >   /* succs: block_1 */
>> >}
>> >
>> > Will end up as:
>> >
>> >loop {
>> >   /* preds: block_0 block_4 */
>> >   block block_1:
>> >   vec1 32 ssa_10 = phi block_0: ssa_5, block_4: ssa_15
>> >   vec1 32 ssa_11 = phi block_0: ssa_6, block_4: ssa_12
>> >   vec1 32 ssa_12 = phi block_0: ssa_7, block_4: ssa_13
>> >   vec1 32 ssa_13 = phi block_0: ssa_8, block_4: ssa_14
>> >   vec1 32 ssa_14 = phi block_0: ssa_9, block_4: ssa_11
>> >   vec1 32 ssa_15 = iadd ssa_10, ssa_2
>> >   vec1 32 ssa_16 = ige ssa_15, ssa_1
>> >   /* succs: block_2 block_3 */
>> >   if ssa_16 {
>> >  block block_2:
>> >  /* preds: block_1 */
>> >  break
>> >  /* succs: block_5 */
>> >   } else {
>> >  block block_3:
>> >  /* preds: block_1 */
>> >  /* succs: block_4 */
>> >   }
>> >   block block_4:
>> >   /* preds: block_3 */
>> >   /* succs: block_1 */
>> >}
>>
>> This change seems really fishy to me, since moves like those in your
>> example are just a trivial renaming of values.
>
> If you look closely this is not just a trivial renaming of values. This
> example was from a loop that looked something like this:
>
>int i = 0;
>while (i < 2) {
>   a = a.yzwx;
>}
>

Yes, it is a trivial renaming of values. What's not trivial is that in
SSA, you can have a loop like this:

orig_a = ...
orig_b = ...
loop {
    a = phi(b, orig_a);
    b = phi(a, orig_b);
    ...
}

in which case a and b are swapped each iteration. In other words, phi
node copies are defined to happen in parallel. But this is just a fact
of life when dealing with phi nodes, and I sent a comment to patch 10
that explains how you can change your pass to handle it.
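
To make the "in parallel" point concrete, here is a small sketch with plain ints standing in for SSA values. The function names are made up for illustration. Reading every source before writing any destination gives the swap; lowering the phis to two sequential copies does not.

```c
/* One trip around the back edge for  a = phi(b, ...); b = phi(a, ...).
 * Parallel semantics: read all sources first, then write destinations. */
static void
step_parallel(int *a, int *b)
{
   int new_a = *b;
   int new_b = *a;

   *a = new_a;
   *b = new_b;
}

/* Naive sequential lowering: the second copy sees the already
 * overwritten a, so both values end up equal to the old b. */
static void
step_sequential(int *a, int *b)
{
   *a = *b;
   *b = *a;
}
```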

>>  Just turning off copy
>> propagation isn't a very robust solution (what if whatever's
>> producing
>> the NIR doesn't insert the moves in the first place?),
>
> I haven't seen this happen when testing shader-db or any of the test
> suites.

The point isn't that anything currently does this, it's that your pass
is depending on other details that it shouldn't be relying on.

>
>>  and it can hurt
>> other optimization passes if they can't see through moves.
>
> While true, loops in GLSL tend to be fairly simple. I haven't been able
> to detect any negative results in my copy of shader-db. Also we should
> be unrolling a high percentage of loops we come across which means copy
> propagation will kick in once they are unrolled.
>
>>  Why is the
>> second loop harder to unroll?
>
> Because manipulating the control flow in NIR is painful. When the movs
> remain in place we can simply clone them as we unroll.

You don't need to manipulate the control flow to handle it.

>
>>  Why can't it be handled by inserting
>> moves before unrolling the loop, or directly during the loop
>> unrolling
>> pass?
>
> We could possibly do this and I did attempt it at first, but it would
> mean an extra pass over all phi's in the loop header that would likely
> be expensive and complicated. So instead I went with the simple
> solution that had no measurable impact as far as I could tell.

Actually, the pass wouldn't be that complicated. But it shouldn't be
necessary anyway (see above).

>
>>
>> >
>> > ---
>> >  src/compiler/nir/nir.h|  2 +-
>> >  src/compiler/nir/nir_opt_copy_propagate.c | 47
>> > ---
>> >  src/mesa/drivers/dri/i965/brw_nir.c   |  6 ++--
>> > 

Re: [Mesa-dev] [PATCH 10/13] nir: add helper for cloning loops

2016-08-29 Thread Connor Abbott
So, you've noticed that your method of handling phi's while cloning
doesn't handle phi's that point to other phi's in the same block.
Particularly, something like:

a = phi(b, ...)
b = phi(a, ...)

which is supposed to swap a and b each iteration. Here's a better
strategy which should be simpler than this and fixes that problem.
Create a new remap_table for each iteration of the loop. On the first
iteration, make all the phi nodes remap to their pre-loop sources
before cloning the body. On each successive iteration, make each phi
node map to what the *old* remap table (from the previous iteration)
says that its source from the previous iteration remaps to. In
pseudocode:

nir_cf_list loop_body;
extract loop body into loop_body
move loop header instructions (minus phi nodes) into loop_body

struct hash_table *old_remap_table = NULL, *remap_table;
for each iteration:
    remap_table = new dictionary;
    for each phi in original loop:
        if old_remap_table:
            remap_table[phi] = old_remap_table[phi source pointing to
                                               end of loop]
        else:
            remap_table[phi] = phi source pointing to block before loop
    nir_cf_list cloned_body;
    clone loop_body to cloned_body using remap_table
    insert cloned_body before loop
    old_remap_table = remap_table

delete loop_body
delete loop

This should eliminate the need to expose nir_cf_list_cleanup(),
stitch_blocks(), etc. publicly as well as the previous patch, and it
should handle the phi swapping case correctly.
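
A toy version of the per-iteration remap table, in case it helps to see it run. This is a sketch under heavy assumptions: values are plain ints rather than SSA defs, the phis only feed each other (as in the swap case), and `struct toy_loop` and `resolve_phis` are hypothetical names, not NIR API. A real pass would clone the body at the marked point, which also adds the cloned body values to the table.

```c
#include <string.h>

#define MAX_PHIS 8

/* Phi i takes pre_loop[i] on loop entry and the value of phi
 * back_edge[i] when control comes around the loop. */
struct toy_loop {
   int num_phis;
   int pre_loop[MAX_PHIS];
   int back_edge[MAX_PHIS];
};

/* Build a fresh remap table for each of `iterations` unrolled copies and
 * store in out[] what each phi maps to in the last copy. */
static void
resolve_phis(const struct toy_loop *loop, int iterations, int out[MAX_PHIS])
{
   int old_remap[MAX_PHIS];
   int remap[MAX_PHIS];

   memset(old_remap, 0, sizeof(old_remap));
   memset(remap, 0, sizeof(remap));

   for (int iter = 0; iter < iterations; iter++) {
      for (int i = 0; i < loop->num_phis; i++) {
         if (iter == 0)
            remap[i] = loop->pre_loop[i];
         else
            remap[i] = old_remap[loop->back_edge[i]];
      }

      /* A real pass would clone the loop body here using remap. */

      /* The table just built becomes the "old" one for the next copy. */
      memcpy(old_remap, remap, sizeof(remap));
   }

   memcpy(out, remap, sizeof(remap));
}
```

Because every lookup goes through the previous iteration's table, never the one being filled in, the swap comes out right without any special casing.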


On Mon, Aug 29, 2016 at 12:59 AM, Timothy Arceri
 wrote:
> ---
>  src/compiler/nir/nir.h   |  3 +++
>  src/compiler/nir/nir_clone.c | 64 
> +++-
>  2 files changed, 60 insertions(+), 7 deletions(-)
>
> diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
> index 0ab3ebc..9083bd0 100644
> --- a/src/compiler/nir/nir.h
> +++ b/src/compiler/nir/nir.h
> @@ -2372,6 +2372,9 @@ void nir_print_instr(const nir_instr *instr, FILE *fp);
>
>  nir_shader *nir_shader_clone(void *mem_ctx, const nir_shader *s);
>  nir_function_impl *nir_function_impl_clone(const nir_function_impl *fi);
> +void nir_clone_loop_list(struct exec_list *dst, const struct exec_list *list,
> + struct hash_table *remap_table,
> + struct hash_table *phi_remap, nir_shader *ns);
>  nir_constant *nir_constant_clone(const nir_constant *c, nir_variable *var);
>  nir_variable *nir_variable_clone(const nir_variable *c, nir_shader *shader);
>
> diff --git a/src/compiler/nir/nir_clone.c b/src/compiler/nir/nir_clone.c
> index 8808333..071afc9 100644
> --- a/src/compiler/nir/nir_clone.c
> +++ b/src/compiler/nir/nir_clone.c
> @@ -35,9 +35,17 @@ typedef struct {
> /* True if we are cloning an entire shader. */
> bool global_clone;
>
> +   /* This allows us to clone a loop body without having to add srcs from
> +* outside the loop to the remap table. This is useful for loop unrolling.
> +*/
> +   bool allow_remap_fallback;
> +
> /* maps orig ptr -> cloned ptr: */
> struct hash_table *remap_table;
>
> +   /* used for remapping when cloning loop body for loop unrolling */
> +   struct hash_table *phi_remap_table;
> +
> /* List of phi sources. */
> struct list_head phi_srcs;
>
> @@ -46,11 +54,20 @@ typedef struct {
>  } clone_state;
>
>  static void
> -init_clone_state(clone_state *state, bool global)
> +init_clone_state(clone_state *state, struct hash_table *remap_table,
> + bool global, bool allow_remap_fallback)
>  {
> state->global_clone = global;
> -   state->remap_table = _mesa_hash_table_create(NULL, _mesa_hash_pointer,
> -_mesa_key_pointer_equal);
> +   state->allow_remap_fallback = allow_remap_fallback;
> +
> +   state->phi_remap_table = NULL;
> +   if (remap_table) {
> +  state->remap_table = remap_table;
> +   } else {
> +  state->remap_table = _mesa_hash_table_create(NULL, _mesa_hash_pointer,
> +   _mesa_key_pointer_equal);
> +   }
> +
> list_inithead(&state->phi_srcs);
>  }
>
> @@ -72,16 +89,32 @@ _lookup_ptr(clone_state *state, const void *ptr, bool 
> global)
>return (void *)ptr;
>
> entry = _mesa_hash_table_search(state->remap_table, ptr);
> -   assert(entry && "Failed to find pointer!");
> if (!entry)
> -  return NULL;
> +  return state->allow_remap_fallback ? (void *)ptr : NULL;
>
> return entry->data;
>  }
>
> +/**
> + * Updates a phi remap table used for unrolling loops.
> + */
> +static void
> +update_phi_remap_table(clone_state *state, const void *ptr, void *nptr)
> +{
> +   if (state->phi_remap_table == NULL)
> +  return;
> +
> +   struct hash_entry *hte;
> +   hash_table_foreach(state->phi_remap_table, hte) {
> +  if (hte->data == ptr)
> + hte->data = nptr;
> +   }
> +}
> +
>  static void
>  add_remap(clone_state *state, void *nptr, const void *ptr)
>  {
> +   

Re: [Mesa-dev] [PATCH 08/13] nir: add control flow helpers for loop unrolling

2016-08-29 Thread Connor Abbott

On Mon, Aug 29, 2016 at 12:59 AM, Timothy Arceri
 wrote:
> This makes stitch_blocks() available for use elsewhere, and adds
> a new helper that extracts a cf list without worrying about
> validation.

I already noted in patch 12/13 that you can get rid of your use of
stitch_blocks(). I also don't get why you need a special
nir_cf_loop_list_extract() here... why can't you use nir_cf_extract()?
The only difference between the two is that nir_cf_extract()
re-stitches the nodes together, and it also makes sure that you don't
handle phi nodes incorrectly, but AFAICT those differences don't
matter for the way you intend to use it.

> ---
>  src/compiler/nir/nir_control_flow.c | 34 --
>  src/compiler/nir/nir_control_flow.h |  5 +
>  2 files changed, 37 insertions(+), 2 deletions(-)
>
> diff --git a/src/compiler/nir/nir_control_flow.c 
> b/src/compiler/nir/nir_control_flow.c
> index a485e71..ed8cd24 100644
> --- a/src/compiler/nir/nir_control_flow.c
> +++ b/src/compiler/nir/nir_control_flow.c
> @@ -628,8 +628,7 @@ update_if_uses(nir_cf_node *node)
>   * Stitch two basic blocks together into one. The aggregate must have the 
> same
>   * predecessors as the first and the same successors as the second.
>   */
> -
> -static void
> +void
>  stitch_blocks(nir_block *before, nir_block *after)
>  {
> /*
> @@ -791,6 +790,37 @@ nir_cf_extract(nir_cf_list *extracted, nir_cursor begin, 
> nir_cursor end)
> stitch_blocks(block_before, block_after);
>  }
>
> +/**
> + * It's not really possible to extract control flow from a loop while keeping
> + * the cf valid so this function just rips out what we ask for and any
> + * validation and fix ups are left to the caller.
> + */
> +void
> +nir_cf_loop_list_extract(nir_cf_list *extracted, nir_cf_node *begin,
> + nir_cf_node *end)
> +{
> +   extracted->impl = nir_cf_node_get_function(begin);
> +   exec_list_make_empty(&extracted->list);
> +
> +   /* Dominance and other block-related information is toast. */
> +   nir_metadata_preserve(extracted->impl, nir_metadata_none);
> +
> +   nir_cf_node *cf_node = begin;
> +   nir_cf_node *cf_node_end = end;
> +   while (true) {
> +  nir_cf_node *next = nir_cf_node_next(cf_node);
> +
> +  exec_node_remove(&cf_node->node);
> +  cf_node->parent = NULL;
> +  exec_list_push_tail(&extracted->list, &cf_node->node);
> +
> +  if (cf_node == cf_node_end)
> + break;
> +
> +  cf_node = next;
> +   }
> +}
> +
>  void
>  nir_cf_reinsert(nir_cf_list *cf_list, nir_cursor cursor)
>  {
> diff --git a/src/compiler/nir/nir_control_flow.h 
> b/src/compiler/nir/nir_control_flow.h
> index b71382f..0d97486 100644
> --- a/src/compiler/nir/nir_control_flow.h
> +++ b/src/compiler/nir/nir_control_flow.h
> @@ -78,6 +78,9 @@ nir_cf_node_insert_end(struct exec_list *list, nir_cf_node 
> *node)
> nir_cf_node_insert(nir_after_cf_list(list), node);
>  }
>
> +void
> +stitch_blocks(nir_block *before, nir_block *after);
> +
>
>  /** Control flow motion.
>   *
> @@ -148,6 +151,8 @@ nir_cf_list_extract(nir_cf_list *extracted, struct 
> exec_list *cf_list)
>nir_after_cf_list(cf_list));
>  }
>
> +void nir_cf_loop_list_extract(nir_cf_list *extracted, nir_cf_node *begin, 
> nir_cf_node *end);
> +
>  /** removes a control flow node, doing any cleanup necessary */
>  static inline void
>  nir_cf_node_remove(nir_cf_node *node)
> --
> 2.7.4
>


Re: [Mesa-dev] [PATCH 12/13] nir: add a loop unrolling pass

2016-08-29 Thread Connor Abbott
On Mon, Aug 29, 2016 at 12:59 AM, Timothy Arceri
 wrote:
> ---
>  src/compiler/Makefile.sources  |   1 +
>  src/compiler/nir/nir.h |   2 +
>  src/compiler/nir/nir_opt_loop_unroll.c | 394 
> +
>  3 files changed, 397 insertions(+)
>  create mode 100644 src/compiler/nir/nir_opt_loop_unroll.c
>
> diff --git a/src/compiler/Makefile.sources b/src/compiler/Makefile.sources
> index 79de484..a9f104d 100644
> --- a/src/compiler/Makefile.sources
> +++ b/src/compiler/Makefile.sources
> @@ -231,6 +231,7 @@ NIR_FILES = \
> nir/nir_opt_dead_cf.c \
> nir/nir_opt_gcm.c \
> nir/nir_opt_global_to_local.c \
> +   nir/nir_opt_loop_unroll.c \
> nir/nir_opt_peephole_select.c \
> nir/nir_opt_remove_phis.c \
> nir/nir_opt_undef.c \
> diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
> index 9083bd0..81d9dfc 100644
> --- a/src/compiler/nir/nir.h
> +++ b/src/compiler/nir/nir.h
> @@ -2676,6 +2676,8 @@ bool nir_opt_dead_cf(nir_shader *shader);
>
>  void nir_opt_gcm(nir_shader *shader);
>
> +bool nir_opt_loop_unroll(nir_shader *shader);
> +
>  bool nir_opt_peephole_select(nir_shader *shader);
>
>  bool nir_opt_remove_phis(nir_shader *shader);
> diff --git a/src/compiler/nir/nir_opt_loop_unroll.c 
> b/src/compiler/nir/nir_opt_loop_unroll.c
> new file mode 100644
> index 000..22530c9
> --- /dev/null
> +++ b/src/compiler/nir/nir_opt_loop_unroll.c
> @@ -0,0 +1,394 @@
> +/*
> + * Copyright © 2016 Intel Corporation
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the next
> + * paragraph) shall be included in all copies or substantial portions of the
> + * Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
> + * DEALINGS IN THE SOFTWARE.
> + */
> +
> +#include "nir.h"
> +#include "nir_builder.h"
> +#include "nir_control_flow.h"
> +
> +typedef struct {
> +   /* Array of loops */
> +   nir_loop *loops;
> +
> +   /* Array of unroll factors */
> +   int *factors;
> +} unroll_vector;
> +
> +typedef struct {
> +   /* Array of loop infos for the loop nest */
> +   nir_loop_info *li;
> +
> +   /* List of unroll vectors */
> +   struct list_head unroll_vectors;
> +
> +   nir_shader_compiler_options *options;
> +} loop_unroll_state;
> +
> +static void
> +extract_loop_body(nir_cf_list *extracted, nir_cf_node *node, nir_shader *ns)

The shader argument is unused, you can delete it.

> +{
> +   nir_cf_node *end = node;
> +   while (!nir_cf_node_is_last(end))
> +  end = nir_cf_node_next(end);
> +
> +   nir_cf_loop_list_extract(extracted, node, end);
> +}
> +
> +static void
> +clone_list(nir_shader *ns, nir_loop *loop, nir_cf_list *src_cf_list,
> +   nir_cf_list *cloned_cf_list, struct hash_table *remap_table,
> +   struct hash_table *phi_remap)
> +{
> +   /* Dest list needs to at least have one block */
> +   nir_block *nblk = nir_block_create(ns);
> +   nblk->cf_node.parent = loop->cf_node.parent;
> +   exec_list_push_tail(&cloned_cf_list->list, &nblk->cf_node.node);
> +
> +   nir_clone_loop_list(&cloned_cf_list->list, &src_cf_list->list,
> +   remap_table, phi_remap, ns);
> +}
> +
> +static void
> +remove_unrolled_loop(nir_cf_node *loop, nir_block *block_before_loop,
> + struct hash_table *remap_table,
> + struct hash_table *phi_remap,
> + nir_function_impl *impl)
> +{
> +   /* Fixup LCSSA-phi srcs */
> +   nir_block *prev_block = nir_cf_node_as_block(nir_cf_node_prev(loop));
> +   nir_cf_node *cf_node = nir_cf_node_next(loop);
> +   assert(cf_node->type == nir_cf_node_block);
> +
> +   nir_block *block = nir_cf_node_as_block(cf_node);
> +   nir_foreach_instr_safe(instr, block) {
> +  if (instr->type == nir_instr_type_phi) {
> + nir_phi_instr *phi = nir_instr_as_phi(instr);
> + assert(phi->is_lcssa_phi);
> +
> + if (nir_cf_node_as_loop(loop)->info->trip_count != 0) {
> +nir_foreach_phi_src_safe(src, phi) {
> +   /* Update predecessor */
> +   

Re: [Mesa-dev] [PATCH] aubinator: fix if indentation and add brackets to multiline body

2016-08-29 Thread Kenneth Graunke
On Tuesday, August 30, 2016 9:53:35 AM PDT Timothy Arceri wrote:
> Fixes misleading indentation warning in gcc.
> 
> Cc: Kristian Høgsberg Kristensen 
> Cc: Kenneth Graunke 
> ---
>  src/intel/tools/disasm.c | 7 ---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/src/intel/tools/disasm.c b/src/intel/tools/disasm.c
> index a1cb191..fcb61c4 100644
> --- a/src/intel/tools/disasm.c
> +++ b/src/intel/tools/disasm.c
> @@ -75,12 +75,13 @@ gen_disasm_disassemble(struct gen_disasm *disasm, void 
> *assembly, int start,
>  
>/* Simplistic, but efficient way to terminate disasm */
>if (brw_inst_opcode(devinfo, insn) == BRW_OPCODE_SEND ||
> -  brw_inst_opcode(devinfo, insn) == BRW_OPCODE_SENDC)
> +  brw_inst_opcode(devinfo, insn) == BRW_OPCODE_SENDC) {
>   if (brw_inst_eot(devinfo, insn))
>  break;
> - if (brw_inst_opcode(devinfo, insn) == 0)
> -break;
> +  }
>  
> +  if (brw_inst_opcode(devinfo, insn) == 0)
> + break;
> }
>  }
>  
> 

I think Sirisha is going to make broader changes here shortly, but in
the meantime, please go ahead:

Reviewed-by: Kenneth Graunke 




[Mesa-dev] [PATCH] aubinator: fix if indentation and add brackets to multiline body

2016-08-29 Thread Timothy Arceri
Fixes misleading indentation warning in gcc.

Cc: Kristian Høgsberg Kristensen 
Cc: Kenneth Graunke 
---
 src/intel/tools/disasm.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/src/intel/tools/disasm.c b/src/intel/tools/disasm.c
index a1cb191..fcb61c4 100644
--- a/src/intel/tools/disasm.c
+++ b/src/intel/tools/disasm.c
@@ -75,12 +75,13 @@ gen_disasm_disassemble(struct gen_disasm *disasm, void 
*assembly, int start,
 
   /* Simplistic, but efficient way to terminate disasm */
   if (brw_inst_opcode(devinfo, insn) == BRW_OPCODE_SEND ||
-  brw_inst_opcode(devinfo, insn) == BRW_OPCODE_SENDC)
+  brw_inst_opcode(devinfo, insn) == BRW_OPCODE_SENDC) {
  if (brw_inst_eot(devinfo, insn))
 break;
- if (brw_inst_opcode(devinfo, insn) == 0)
-break;
+  }
 
+  if (brw_inst_opcode(devinfo, insn) == 0)
+ break;
}
 }
 
-- 
2.7.4
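
For context on why gcc complains: an unbraced `if` governs exactly one statement, so the old indentation suggested a nesting the compiler never saw. The sketch below contrasts the two readings. The opcode values and function names are hypothetical, chosen only for illustration; `return true` stands in for the `break` in the real loop.

```c
#include <stdbool.h>

/* Hypothetical opcode values, for illustration only. */
enum { OP_ILLEGAL = 0, OP_SEND = 1, OP_SENDC = 2, OP_OTHER = 3 };

/* How the compiler parses the code, and what the patch now spells out
 * with braces: the opcode == 0 test runs for every instruction. */
static bool
stops_as_parsed(int opcode, bool eot)
{
   if (opcode == OP_SEND || opcode == OP_SENDC) {
      if (eot)
         return true;
   }

   if (opcode == OP_ILLEGAL)
      return true;

   return false;
}

/* What the old indentation suggested to a reader: both tests nested
 * under the SEND/SENDC check, so opcode == 0 could never match. */
static bool
stops_as_indented(int opcode, bool eot)
{
   if (opcode == OP_SEND || opcode == OP_SENDC) {
      if (eot)
         return true;
      if (opcode == OP_ILLEGAL)
         return true;
   }
   return false;
}
```

The two readings only differ for opcode 0, which is the case the patch makes explicit.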



[Mesa-dev] [PATCH] anv: initialise and increment send_sbc

2016-08-29 Thread Dave Airlie
From: Dave Airlie 

At least set this to not be uninitialised memory.

Signed-off-by: Dave Airlie 
---
 src/intel/vulkan/anv_wsi_x11.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/intel/vulkan/anv_wsi_x11.c b/src/intel/vulkan/anv_wsi_x11.c
index 7c6ef97..7e9ff7d 100644
--- a/src/intel/vulkan/anv_wsi_x11.c
+++ b/src/intel/vulkan/anv_wsi_x11.c
@@ -610,6 +610,7 @@ x11_queue_present(struct anv_swapchain *anv_chain,
 
xshmfence_reset(image->shm_fence);
 
+   ++chain->send_sbc;
xcb_void_cookie_t cookie =
   xcb_present_pixmap(chain->conn,
  chain->window,
@@ -842,6 +843,7 @@ x11_surface_create_swapchain(VkIcdSurfaceBase *icd_surface,
chain->window = x11_surface_get_window(icd_surface);
chain->extent = pCreateInfo->imageExtent;
chain->image_count = num_images;
+   chain->send_sbc = 0;
 
chain->event_id = xcb_generate_id(chain->conn);
xcb_present_select_input(chain->conn, chain->event_id, chain->window,
-- 
2.5.5



[Mesa-dev] [PATCH] clover: Introduce CLOVER_COMPILER_OPTIONS

2016-08-29 Thread Vedran Miletić
Options specified via the CLOVER_COMPILER_OPTIONS shell variable are
appended to the compiler options specified by the OpenCL program (if
any).
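
The append behaviour described above can be sketched in plain C as follows. This is an illustration only, not the clover code (which is C++ and uses std::string); `join_options` and `build_compiler_options` are hypothetical helper names. As in the patch, the separating space is added even when the variable is unset.

```c
#include <stdio.h>
#include <stdlib.h>

/* Write "<program_opts> <extra>" into buf; a NULL extra counts as empty.
 * Returns what snprintf returns, i.e. the untruncated length. */
static int
join_options(const char *program_opts, const char *extra,
             char *buf, size_t size)
{
   if (!extra)
      extra = "";
   return snprintf(buf, size, "%s %s", program_opts, extra);
}

/* Same, taking the extra options from the environment. */
static int
build_compiler_options(const char *program_opts, char *buf, size_t size)
{
   return join_options(program_opts, getenv("CLOVER_COMPILER_OPTIONS"),
                       buf, size);
}
```

Splitting out `join_options` keeps the string handling testable without touching the process environment.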

Signed-off-by: Vedran Miletić 
---
 docs/envvars.html | 2 ++
 src/gallium/state_trackers/clover/llvm/invocation.cpp | 9 +++--
 2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/docs/envvars.html b/docs/envvars.html
index 6d79398..52835b6 100644
--- a/docs/envvars.html
+++ b/docs/envvars.html
@@ -224,6 +224,8 @@ Mesa EGL supports different sets of environment variables.  
See the
 GALLIUM_DUMP_CPU - if non-zero, print information about the CPU on start-up
 TGSI_PRINT_SANITY - if set, do extra sanity checking on TGSI shaders and
 print any errors to stderr.
+CLOVER_COMPILER_OPTIONS - allows specifying additional compiler options.
+Specified options are appended after the options set by the OpenCL program.
 DRAW_FSE - ???
 DRAW_NO_FSE - ???
 DRAW_USE_LLVM - if set to zero, the draw module will not use LLVM to 
execute
diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp 
b/src/gallium/state_trackers/clover/llvm/invocation.cpp
index 5490d72..748850f 100644
--- a/src/gallium/state_trackers/clover/llvm/invocation.cpp
+++ b/src/gallium/state_trackers/clover/llvm/invocation.cpp
@@ -196,11 +196,16 @@ clover::llvm::compile_program(const std::string ,
   const std::string ,
   const std::string ,
   std::string &r_log) {
+   const char *extra_opts_env = getenv("CLOVER_COMPILER_OPTIONS");
+   std::string extra_opts;
+   if (extra_opts_env)
+   extra_opts = std::string(extra_opts_env);
+
if (has_flag(debug::clc))
-  debug::log(".cl", "// Options: " + opts + '\n' + source);
+  debug::log(".cl", "// Compiler options: " + opts + " " + extra_opts + 
'\n' + source);
 
auto ctx = create_context(r_log);
-   auto c = create_compiler_instance(target, tokenize(opts + " input.cl"),
+   auto c = create_compiler_instance(target, tokenize(opts + " " + extra_opts 
+ " input.cl"),
  r_log);
auto mod = compile(*ctx, *c, "input.cl", source, headers, target, opts,
   r_log);
-- 
2.7.4



Re: [Mesa-dev] [PATCH 1/2] util: import the slab allocator from gallium

2016-08-29 Thread Marek Olšák
On Mon, Aug 29, 2016 at 10:40 PM, Dave Airlie  wrote:
>>
>> There are also some cosmetic changes.
>
> Did I miss 0/2? is there some future use for this in mesa?
>
> I assume there is, just wanting more info :)

It was for an experiment that turned out to be a disaster. There is no
other use.

Marek


[Mesa-dev] [PATCH 1/2] i965: Fix missing dirty bits related to is_drawing_points/lines.

2016-08-29 Thread Kenneth Graunke
calculate_attr_overrides() uses is_drawing_points(), which depends
on tessellation and geometry program state.
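
A simplified model of why a missing bit is a bug, assuming (as I understand the state upload code) that an atom is re-emitted only when one of the bits it lists has been flagged. The flag values, `struct tracked_state` layout and `maybe_emit` below are hypothetical, not the real brw definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag values; the real BRW_NEW_* bits are defined elsewhere. */
#define NEW_PRIMITIVE         (1u << 0)
#define NEW_GEOMETRY_PROGRAM  (1u << 1)
#define NEW_TES_PROG_DATA     (1u << 2)

struct tracked_state {
   uint32_t dirty;     /* state changes this atom declares interest in */
   int emit_count;     /* how many times the atom was re-emitted */
};

/* Re-emit the atom only if something it listed has changed. */
static bool
maybe_emit(struct tracked_state *atom, uint32_t changed)
{
   if ((atom->dirty & changed) == 0)
      return false;

   atom->emit_count++;
   return true;
}
```

An atom that reads tessellation state but does not list the corresponding bit keeps its stale packet when only that state changes, which is the situation this patch addresses.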

Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/gen6_clip_state.c |  4 
 src/mesa/drivers/dri/i965/gen7_sf_state.c   | 13 ++---
 src/mesa/drivers/dri/i965/gen8_sf_state.c   |  8 ++--
 3 files changed, 20 insertions(+), 5 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen6_clip_state.c 
b/src/mesa/drivers/dri/i965/gen6_clip_state.c
index 4a3f7f9..1edefb0 100644
--- a/src/mesa/drivers/dri/i965/gen6_clip_state.c
+++ b/src/mesa/drivers/dri/i965/gen6_clip_state.c
@@ -230,6 +230,9 @@ upload_clip_state(struct brw_context *brw)
else
   enable = GEN6_CLIP_ENABLE;
 
+   /* _NEW_POLYGON,
+* BRW_NEW_GEOMETRY_PROGRAM | BRW_NEW_TES_PROG_DATA | BRW_NEW_PRIMITIVE
+*/
if (!brw_is_drawing_points(brw) && !brw_is_drawing_lines(brw))
   dw2 |= GEN6_CLIP_XY_TEST;
 
@@ -281,6 +284,7 @@ const struct brw_tracked_state gen7_clip_state = {
BRW_NEW_META_IN_PROGRESS |
BRW_NEW_PRIMITIVE |
BRW_NEW_RASTERIZER_DISCARD |
+   BRW_NEW_TES_PROG_DATA |
BRW_NEW_VUE_MAP_GEOM_OUT,
},
.emit = upload_clip_state,
diff --git a/src/mesa/drivers/dri/i965/gen7_sf_state.c b/src/mesa/drivers/dri/i965/gen7_sf_state.c
index ba0592a..ffe92a6 100644
--- a/src/mesa/drivers/dri/i965/gen7_sf_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_sf_state.c
@@ -59,8 +59,10 @@ upload_sbe_state(struct brw_context *brw)
}
dw1 |= point_sprite_origin;
 
-   /* BRW_NEW_VUE_MAP_GEOM_OUT | BRW_NEW_FRAGMENT_PROGRAM
-* _NEW_POINT | _NEW_LIGHT | _NEW_PROGRAM | BRW_NEW_FS_PROG_DATA
+   /* _NEW_POINT | _NEW_LIGHT | _NEW_PROGRAM,
+* BRW_NEW_FS_PROG_DATA | BRW_NEW_FRAGMENT_PROGRAM |
+* BRW_NEW_GEOMETRY_PROGRAM | BRW_NEW_PRIMITIVE | BRW_NEW_TES_PROG_DATA |
+* BRW_NEW_VUE_MAP_GEOM_OUT
 */
uint32_t urb_entry_read_length;
uint32_t urb_entry_read_offset;
@@ -96,6 +98,7 @@ const struct brw_tracked_state gen7_sbe_state = {
BRW_NEW_FRAGMENT_PROGRAM |
BRW_NEW_FS_PROG_DATA |
BRW_NEW_GEOMETRY_PROGRAM |
+   BRW_NEW_TES_PROG_DATA |
BRW_NEW_PRIMITIVE |
BRW_NEW_VUE_MAP_GEOM_OUT,
},
@@ -187,7 +190,9 @@ upload_sf_state(struct brw_context *brw)
   dw2 |= GEN6_SF_CULL_NONE;
}
 
-   /* _NEW_SCISSOR _NEW_POLYGON BRW_NEW_GEOMETRY_PROGRAM BRW_NEW_PRIMITIVE */
+   /* _NEW_SCISSOR | _NEW_POLYGON,
+* BRW_NEW_GEOMETRY_PROGRAM | BRW_NEW_PRIMITIVE | BRW_NEW_TES_PROG_DATA
+*/
if (ctx->Scissor.EnableFlags ||
brw_is_drawing_points(brw) || brw_is_drawing_lines(brw))
   dw2 |= GEN6_SF_SCISSOR_ENABLE;
@@ -256,7 +261,9 @@ const struct brw_tracked_state gen7_sf_state = {
_NEW_SCISSOR,
   .brw   = BRW_NEW_BLORP |
BRW_NEW_CONTEXT |
+   BRW_NEW_GEOMETRY_PROGRAM |
BRW_NEW_PRIMITIVE |
+   BRW_NEW_TES_PROG_DATA |
BRW_NEW_VUE_MAP_GEOM_OUT,
},
.emit = upload_sf_state,
diff --git a/src/mesa/drivers/dri/i965/gen8_sf_state.c b/src/mesa/drivers/dri/i965/gen8_sf_state.c
index 0c4f1df..cf3e680 100644
--- a/src/mesa/drivers/dri/i965/gen8_sf_state.c
+++ b/src/mesa/drivers/dri/i965/gen8_sf_state.c
@@ -60,8 +60,10 @@ upload_sbe(struct brw_context *brw)
else
   dw1 |= GEN6_SF_POINT_SPRITE_UPPERLEFT;
 
-   /* BRW_NEW_VUE_MAP_GEOM_OUT | BRW_NEW_FRAGMENT_PROGRAM |
-* _NEW_POINT | _NEW_LIGHT | _NEW_PROGRAM | BRW_NEW_FS_PROG_DATA
+   /* _NEW_POINT | _NEW_LIGHT | _NEW_PROGRAM,
+* BRW_NEW_FS_PROG_DATA | BRW_NEW_FRAGMENT_PROGRAM |
+* BRW_NEW_GEOMETRY_PROGRAM | BRW_NEW_PRIMITIVE | BRW_NEW_TES_PROG_DATA |
+* BRW_NEW_VUE_MAP_GEOM_OUT
 */
calculate_attr_overrides(brw, attr_overrides,
 &point_sprite_enables,
@@ -137,6 +139,8 @@ const struct brw_tracked_state gen8_sbe_state = {
BRW_NEW_CONTEXT |
BRW_NEW_FRAGMENT_PROGRAM |
BRW_NEW_FS_PROG_DATA |
+   BRW_NEW_GEOMETRY_PROGRAM |
+   BRW_NEW_TES_PROG_DATA |
BRW_NEW_VUE_MAP_GEOM_OUT,
},
.emit = upload_sbe,
-- 
2.9.3
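
Editor's note: a minimal sketch of why a missing dirty bit matters, for
readers who do not know the i965 state-tracking scheme. This is not the
driver's code. The flag names, struct atom and atom_needs_upload() are
hypothetical stand-ins for brw_tracked_state and its dirty masks. The only
thing taken from the patch above is that the SBE atom listened to the
primitive and geometry program flags but not to BRW_NEW_TES_PROG_DATA.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical dirty flags; the real driver has many more. */
enum {
   DIRTY_PRIMITIVE     = 1u << 0,
   DIRTY_GS_PROGRAM    = 1u << 1,
   DIRTY_TES_PROG_DATA = 1u << 2,
};

/* A state atom lists every flag that its emit function reads from. */
struct atom {
   uint32_t listens_to;
};

/* The atom is re-emitted only when one of its listed flags is dirty. */
static bool
atom_needs_upload(const struct atom *a, uint32_t dirty)
{
   assert(a != NULL);
   return (a->listens_to & dirty) != 0;
}

/* Before: the emit code depends on TES state, but the atom does not list it. */
static const struct atom sbe_before = {
   .listens_to = DIRTY_PRIMITIVE | DIRTY_GS_PROGRAM,
};

/* After: the missing flag is listed, so a TES change triggers a re-emit. */
static const struct atom sbe_after = {
   .listens_to = DIRTY_PRIMITIVE | DIRTY_GS_PROGRAM | DIRTY_TES_PROG_DATA,
};
```

With sbe_before, a change that only dirties the TES flag leaves stale
hardware state in place; with sbe_after the atom is uploaded again.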



[Mesa-dev] [PATCH 2/2] i965: Use gs_prog_data in is_drawing_points/lines().

2016-08-29 Thread Kenneth Graunke
State upload code should use prog_data rather than poking at core
Mesa shader data structures wherever possible.

Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/gen6_clip_state.c | 16 
 src/mesa/drivers/dri/i965/gen6_sf_state.c   |  8 +---
 src/mesa/drivers/dri/i965/gen7_sf_state.c   |  8 
 src/mesa/drivers/dri/i965/gen8_sf_state.c   |  4 ++--
 4 files changed, 19 insertions(+), 17 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen6_clip_state.c b/src/mesa/drivers/dri/i965/gen6_clip_state.c
index 1edefb0..f8c552d 100644
--- a/src/mesa/drivers/dri/i965/gen6_clip_state.c
+++ b/src/mesa/drivers/dri/i965/gen6_clip_state.c
@@ -43,9 +43,9 @@ brw_is_drawing_points(const struct brw_context *brw)
   return true;
}
 
-   if (brw->geometry_program) {
-  /* BRW_NEW_GEOMETRY_PROGRAM */
-  return brw->geometry_program->OutputType == GL_POINTS;
+   if (brw->gs.prog_data) {
+  /* BRW_NEW_GS_PROG_DATA */
+  return brw->gs.prog_data->output_topology == _3DPRIM_POINTLIST;
} else if (brw->tes.prog_data) {
   /* BRW_NEW_TES_PROG_DATA */
   return brw->tes.prog_data->output_topology ==
@@ -66,9 +66,9 @@ brw_is_drawing_lines(const struct brw_context *brw)
   return true;
}
 
-   if (brw->geometry_program) {
-  /* BRW_NEW_GEOMETRY_PROGRAM */
-  return brw->geometry_program->OutputType == GL_LINE_STRIP;
+   if (brw->gs.prog_data) {
+  /* BRW_NEW_GS_PROG_DATA */
+  return brw->gs.prog_data->output_topology == _3DPRIM_LINESTRIP;
} else if (brw->tes.prog_data) {
   /* BRW_NEW_TES_PROG_DATA */
   return brw->tes.prog_data->output_topology ==
@@ -262,7 +262,7 @@ const struct brw_tracked_state gen6_clip_state = {
   .brw   = BRW_NEW_BLORP |
BRW_NEW_CONTEXT |
BRW_NEW_FS_PROG_DATA |
-   BRW_NEW_GEOMETRY_PROGRAM |
+   BRW_NEW_GS_PROG_DATA |
BRW_NEW_META_IN_PROGRESS |
BRW_NEW_PRIMITIVE |
BRW_NEW_RASTERIZER_DISCARD |
@@ -280,7 +280,7 @@ const struct brw_tracked_state gen7_clip_state = {
   .brw   = BRW_NEW_BLORP |
BRW_NEW_CONTEXT |
BRW_NEW_FS_PROG_DATA |
-   BRW_NEW_GEOMETRY_PROGRAM |
+   BRW_NEW_GS_PROG_DATA |
BRW_NEW_META_IN_PROGRESS |
BRW_NEW_PRIMITIVE |
BRW_NEW_RASTERIZER_DISCARD |
diff --git a/src/mesa/drivers/dri/i965/gen6_sf_state.c b/src/mesa/drivers/dri/i965/gen6_sf_state.c
index 7cef17a..059dd90 100644
--- a/src/mesa/drivers/dri/i965/gen6_sf_state.c
+++ b/src/mesa/drivers/dri/i965/gen6_sf_state.c
@@ -193,7 +193,7 @@ calculate_attr_overrides(const struct brw_context *brw,
 * correctly set the attr overrides.
 *
 * _NEW_POLYGON
-* BRW_NEW_PRIMITIVE | BRW_NEW_GEOMETRY_PROGRAM | BRW_NEW_TES_PROG_DATA
+* BRW_NEW_PRIMITIVE | BRW_NEW_GS_PROG_DATA | BRW_NEW_TES_PROG_DATA
 */
bool drawing_points = brw_is_drawing_points(brw);
 
@@ -335,7 +335,9 @@ upload_sf_state(struct brw_context *brw)
unreachable("not reached");
}
 
-   /* _NEW_SCISSOR _NEW_POLYGON BRW_NEW_GEOMETRY_PROGRAM BRW_NEW_PRIMITIVE */
+   /* _NEW_SCISSOR | _NEW_POLYGON,
+* BRW_NEW_GS_PROG_DATA | BRW_NEW_TES_PROG_DATA | BRW_NEW_PRIMITIVE
+*/
if (ctx->Scissor.EnableFlags ||
brw_is_drawing_points(brw) || brw_is_drawing_lines(brw))
   dw3 |= GEN6_SF_SCISSOR_ENABLE;
@@ -448,7 +450,7 @@ const struct brw_tracked_state gen6_sf_state = {
BRW_NEW_CONTEXT |
BRW_NEW_FRAGMENT_PROGRAM |
BRW_NEW_FS_PROG_DATA |
-   BRW_NEW_GEOMETRY_PROGRAM |
+   BRW_NEW_GS_PROG_DATA |
BRW_NEW_PRIMITIVE |
BRW_NEW_TES_PROG_DATA |
BRW_NEW_VUE_MAP_GEOM_OUT,
diff --git a/src/mesa/drivers/dri/i965/gen7_sf_state.c b/src/mesa/drivers/dri/i965/gen7_sf_state.c
index ffe92a6..764b4e3 100644
--- a/src/mesa/drivers/dri/i965/gen7_sf_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_sf_state.c
@@ -61,7 +61,7 @@ upload_sbe_state(struct brw_context *brw)
 
/* _NEW_POINT | _NEW_LIGHT | _NEW_PROGRAM,
 * BRW_NEW_FS_PROG_DATA | BRW_NEW_FRAGMENT_PROGRAM |
-* BRW_NEW_GEOMETRY_PROGRAM | BRW_NEW_PRIMITIVE | BRW_NEW_TES_PROG_DATA |
+* BRW_NEW_GS_PROG_DATA | BRW_NEW_PRIMITIVE | BRW_NEW_TES_PROG_DATA |
 * BRW_NEW_VUE_MAP_GEOM_OUT
 */
uint32_t urb_entry_read_length;
@@ -97,7 +97,7 @@ const struct brw_tracked_state gen7_sbe_state = {
BRW_NEW_CONTEXT |
BRW_NEW_FRAGMENT_PROGRAM |
BRW_NEW_FS_PROG_DATA |
-   BRW_NEW_GEOMETRY_PROGRAM |
+   BRW_NEW_GS_PROG_DATA |
BRW_NEW_TES_PROG_DATA |
BRW_NEW_PRIMITIVE |
BRW_NEW_VUE_MAP_GEOM_OUT,
@@ -191,7 +191,7 @@ upload_sf_state(struct brw_context *brw)
}
 
/* _NEW_SCISSOR | _NEW_POLYGON,
-* BRW_NEW_GEOMETRY_PROGRAM | 
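
Editor's note: the archived patch is cut off above. As a rough sketch of the
decision order that brw_is_drawing_points() follows after this change, under
stated assumptions: the types, enum values and field names below are
hypothetical, and the first check (polygon mode) is inferred from the
_NEW_POLYGON comments in the diff rather than shown in it. The later stage
wins: geometry shader output first, then tessellation, then the draw call's
own primitive.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical topology values, not the hardware encodings. */
enum topology { TOPO_POINTLIST, TOPO_LINESTRIP, TOPO_TRILIST };

/* Stand-in for the per-stage compiled program data. */
struct stage_prog_data {
   enum topology output_topology;
};

struct draw_state {
   bool polygon_mode_is_point;          /* assumed early-out condition */
   const struct stage_prog_data *gs;    /* NULL when no geometry shader */
   const struct stage_prog_data *tes;   /* NULL when no tessellation */
   enum topology primitive;             /* topology of the draw call */
};

static bool
is_drawing_points(const struct draw_state *s)
{
   assert(s != NULL);

   if (s->polygon_mode_is_point)
      return true;

   /* The last enabled geometry-producing stage decides the topology. */
   if (s->gs)
      return s->gs->output_topology == TOPO_POINTLIST;
   if (s->tes)
      return s->tes->output_topology == TOPO_POINTLIST;

   return s->primitive == TOPO_POINTLIST;
}
```

Reading the answer from prog_data means the function depends only on
compiled-program state, which is what the renamed dirty bits express.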

Re: [Mesa-dev] [Mesa-stable] [PATCH] mesa: fix format conversion bug in get_tex_rgba_uncompressed()

2016-08-29 Thread Anuj Phogat
On Mon, Aug 29, 2016 at 11:44 AM, Brian Paul  wrote:
>
> We need to set the need_convert flag with each loop iteration, not
> just when the rgba pointer is null.
>
> Bug reported by Markus Müller  on mesa-users list.
> Fixes new piglit arb_texture_float-get-tex3d test.
>
> Cc: 
> ---
>  src/mesa/main/texgetimage.c | 14 --
>  1 file changed, 8 insertions(+), 6 deletions(-)
>
> diff --git a/src/mesa/main/texgetimage.c b/src/mesa/main/texgetimage.c
> index bd44c68..b900278 100644
> --- a/src/mesa/main/texgetimage.c
> +++ b/src/mesa/main/texgetimage.c
> @@ -495,13 +495,15 @@ get_tex_rgba_uncompressed(struct gl_context *ctx, GLuint dimensions,
>*/
>   if (format == rgba_format) {
>  rgba = dest;
> - } else if (rgba == NULL) { /* Allocate the RGBA buffer only once */
> + } else {
>  need_convert = true;
> -rgba = malloc(height * rgba_stride);
> -if (!rgba) {
> -   _mesa_error(ctx, GL_OUT_OF_MEMORY, "glGetTexImage()");
> -   ctx->Driver.UnmapTextureImage(ctx, texImage, img);
> -   return;
> +if (rgba == NULL) { /* Allocate the RGBA buffer only once */
> +   rgba = malloc(height * rgba_stride);
> +   if (!rgba) {
> +  _mesa_error(ctx, GL_OUT_OF_MEMORY, "glGetTexImage()");
> +  ctx->Driver.UnmapTextureImage(ctx, texImage, img);
> +  return;
> +   }
>  }
>   }
>
> --
> 1.9.1
>
> ___
> mesa-stable mailing list
> mesa-sta...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-stable

Reviewed-by: Anuj Phogat 
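
Editor's note: a small model of the bug, for readers skimming the thread. It
is only a sketch: slices_converted() and its parameters are hypothetical, and
it assumes need_convert is reset at the top of each loop iteration, which the
diff context does not show. The old code set the flag in the same branch that
allocated the buffer, so only the first slice was converted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Counts how many of `depth` slices get flagged for conversion.
 * `direct` models format == rgba_format; `fixed` selects the patched logic. */
static size_t
slices_converted(size_t depth, bool direct, bool fixed)
{
   static unsigned char scratch[16];   /* stands in for the malloc'd buffer */
   void *rgba = NULL;
   size_t converted = 0;

   for (size_t img = 0; img < depth; img++) {
      bool need_convert = false;       /* assumed: reset for every slice */

      if (direct) {
         /* Destination already has the right format; nothing to convert. */
      } else if (fixed) {
         need_convert = true;          /* patched: set on every iteration */
         if (rgba == NULL)
            rgba = scratch;            /* still allocated only once */
      } else if (rgba == NULL) {
         need_convert = true;          /* old: set on the first slice only */
         rgba = scratch;
      }

      if (need_convert)
         converted++;
   }

   assert(converted <= depth);
   return converted;
}
```

For a three-slice texture the old logic converts one slice and the patched
logic converts all three.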


Re: [Mesa-dev] [PATCH mesa] Introduce .editorconfig

2016-08-29 Thread Serge Martin
On Monday 29 August 2016 22:30:58 Eric Engestrom wrote:
> A few weeks ago, Jose Fonseca suggested [0] we use .editorconfig files
> to try and enforce the formatting of the code, to which Michel Dänzer
> suggested [1] we start by importing the existing .dir-locals.el
> settings. The first draft was discussed in the RFC [2].
> 
> These .editorconfig are a first step, one that has the advantage of
> requiring little to no intervention from the devs once the settings
> files are in place, but the settings are very limited. This does have
> the advantage of applying while the code is being written.
> This doesn't replace the need for more comprehensive formatting tools
> such as clang-format & clang-tidy, but those reformat the code after
> the fact.
> 
> [0] https://lists.freedesktop.org/archives/mesa-dev/2016-June/121545.html
> [1] https://lists.freedesktop.org/archives/mesa-dev/2016-June/121639.html
> [2] https://lists.freedesktop.org/archives/mesa-dev/2016-July/123431.html
> 
> Reviewed-by: Eric Anholt  (for vc4)
> Acked-by: Nicolai Hähnle 
> Signed-off-by: Eric Engestrom 
> ---
> 
> First off, sorry I took so long to follow up on this.
> Here's the v1 with most of the changes suggested in the RFC/v0, except for
> the following couple of issues:
> 
> Jose thinks I should remove this line:
> > +trim_trailing_whitespace = true
> 
> His reasoning is:
> > I'm sure we have lots of files with trailing whitespace, and this means
> > that people doing trivial one liner fixes might suddenly cause the whole
> > file to be munged and stripped.
> > 
> > So I think we should remove this from the top level editorconfig.  It's
> > fine to have it on specific subdirs, which are known to be
> > trailing-whitespace free.
> I don't think waiting for the rule to be followed before introducing it can
> work, but I do understand that introducing it like this might result in
> people sending a lot of whitespace changes in unrelated patches.
> One solution I would prefer is to have a cleanup patch (in the same series)
> that removes those trailing spaces, making the rule enforceable right away.
> What do you guys think?
> 
> A couple of people have also suggested I add `tab_width`, but IMO this
> concept should die: it comes from the confusion between indentation (a
> logic concept meant to convey code structure) and alignment (an aesthetic
> concept). If you need a specific size for your tabs, you're not indenting,
> you're aligning, which makes no sense to do with a character that will have
> a varying size in the first place.
> There is a (pointless IMO, who cares?) war between tab and space indentation
> with various arguments on each side, but I have yet to find anyone with a
> single argument in favour of using varying-width characters to align
> code... yet I keep seeing people doing it.
> (OK, that's not true: I do have a preference for space indentation, because
> it makes it harder for people to do the wrong thing when they get confused)
>  (sorry)
> ---
>  .editorconfig| 35
>  bin/.editorconfig|  3 ++
>  include/CL/.editorconfig |  3 ++

I'm not sure we need an editorconfig in this dir.
The headers are a copy of Khronos ones

>  include/D3D9/.editorconfig   |  2 ++
>  include/c11/.editorconfig|  3 ++
>  include/d3dadapter/.editorconfig |  3 ++
>  include/vulkan/.editorconfig |  3 ++
>  src/egl/drivers/haiku/.editorconfig  |  2 ++
>  src/egl/wayland/.editorconfig|  2 ++
>  src/gallium/drivers/freedreno/.editorconfig  |  2 ++
>  src/gallium/drivers/r300/.editorconfig   |  3 ++
>  src/gallium/drivers/r600/.editorconfig   |  2 ++
>  src/gallium/drivers/radeon/.editorconfig |  2 ++
>  src/gallium/drivers/radeonsi/.editorconfig   |  2 ++
>  src/gallium/drivers/vc4/.editorconfig|  3 ++
>  src/gallium/drivers/vc4/kernel/.editorconfig |  2 ++
>  src/gallium/state_trackers/hgl/.editorconfig |  2 ++
>  src/gallium/state_trackers/nine/.editorconfig|  3 ++
>  src/gallium/state_trackers/xa/.editorconfig  |  3 ++
>  src/gallium/targets/d3dadapter9/.editorconfig|  3 ++
>  src/gallium/targets/haiku-softpipe/.editorconfig |  2 ++
>  src/gallium/winsys/freedreno/drm/.editorconfig   |  2 ++
>  src/gallium/winsys/nouveau/drm/.editorconfig |  2 ++
>  src/gallium/winsys/radeon/drm/.editorconfig  |  3 ++
>  src/gallium/winsys/sw/hgl/.editorconfig  |  2 ++
>  src/getopt/.editorconfig |  2 ++
>  src/gtest/.editorconfig  |  3 ++
>  src/hgl/.editorconfig|  2 ++
>  src/mesa/drivers/dri/nouveau/.editorconfig   |  2 ++
>  29 files changed, 103 insertions(+)
>  create mode 100644 .editorconfig
>  create mode 

[Mesa-dev] [PATCH mesa] Introduce .editorconfig

2016-08-29 Thread Eric Engestrom
A few weeks ago, Jose Fonseca suggested [0] we use .editorconfig files
to try and enforce the formatting of the code, to which Michel Dänzer
suggested [1] we start by importing the existing .dir-locals.el
settings. The first draft was discussed in the RFC [2].

These .editorconfig are a first step, one that has the advantage of
requiring little to no intervention from the devs once the settings
files are in place, but the settings are very limited. This does have
the advantage of applying while the code is being written.
This doesn't replace the need for more comprehensive formatting tools
such as clang-format & clang-tidy, but those reformat the code after
the fact.

[0] https://lists.freedesktop.org/archives/mesa-dev/2016-June/121545.html
[1] https://lists.freedesktop.org/archives/mesa-dev/2016-June/121639.html
[2] https://lists.freedesktop.org/archives/mesa-dev/2016-July/123431.html

Reviewed-by: Eric Anholt  (for vc4)
Acked-by: Nicolai Hähnle 
Signed-off-by: Eric Engestrom 
---

First off, sorry I took so long to follow up on this.
Here's the v1 with most of the changes suggested in the RFC/v0, except for the
following couple of issues:

Jose thinks I should remove this line:
> +trim_trailing_whitespace = true

His reasoning is:
> I'm sure we have lots of files with trailing whitespace, and this means that
> people doing trivial one liner fixes might suddenly cause the whole file to
> be munged and stripped.
>
> So I think we should remove this from the top level editorconfig.  It's fine
> to have it on specific subdirs, which are known to be trailing-whitespace free.

I don't think waiting for the rule to be followed before introducing it can
work, but I do understand that introducing it like this might result in people
sending a lot of whitespace changes in unrelated patches.
One solution I would prefer is to have a cleanup patch (in the same series) that
removes those trailing spaces, making the rule enforceable right away.
What do you guys think?

A couple of people have also suggested I add `tab_width`, but IMO this concept
should die: it comes from the confusion between indentation (a logic concept
meant to convey code structure) and alignment (an aesthetic concept).
If you need a specific size for your tabs, you're not indenting, you're
aligning, which makes no sense to do with a character that will have a varying
size in the first place.
There is a (pointless IMO, who cares?) war between tab and space indentation
with various arguments on each side, but I have yet to find anyone with a single
argument in favour of using varying-width characters to align code... yet I keep
seeing people doing it.
(OK, that's not true: I do have a preference for space indentation, because it
makes it harder for people to do the wrong thing when they get confused)
 (sorry)
---
 .editorconfig| 35 
 bin/.editorconfig|  3 ++
 include/CL/.editorconfig |  3 ++
 include/D3D9/.editorconfig   |  2 ++
 include/c11/.editorconfig|  3 ++
 include/d3dadapter/.editorconfig |  3 ++
 include/vulkan/.editorconfig |  3 ++
 src/egl/drivers/haiku/.editorconfig  |  2 ++
 src/egl/wayland/.editorconfig|  2 ++
 src/gallium/drivers/freedreno/.editorconfig  |  2 ++
 src/gallium/drivers/r300/.editorconfig   |  3 ++
 src/gallium/drivers/r600/.editorconfig   |  2 ++
 src/gallium/drivers/radeon/.editorconfig |  2 ++
 src/gallium/drivers/radeonsi/.editorconfig   |  2 ++
 src/gallium/drivers/vc4/.editorconfig|  3 ++
 src/gallium/drivers/vc4/kernel/.editorconfig |  2 ++
 src/gallium/state_trackers/hgl/.editorconfig |  2 ++
 src/gallium/state_trackers/nine/.editorconfig|  3 ++
 src/gallium/state_trackers/xa/.editorconfig  |  3 ++
 src/gallium/targets/d3dadapter9/.editorconfig|  3 ++
 src/gallium/targets/haiku-softpipe/.editorconfig |  2 ++
 src/gallium/winsys/freedreno/drm/.editorconfig   |  2 ++
 src/gallium/winsys/nouveau/drm/.editorconfig |  2 ++
 src/gallium/winsys/radeon/drm/.editorconfig  |  3 ++
 src/gallium/winsys/sw/hgl/.editorconfig  |  2 ++
 src/getopt/.editorconfig |  2 ++
 src/gtest/.editorconfig  |  3 ++
 src/hgl/.editorconfig|  2 ++
 src/mesa/drivers/dri/nouveau/.editorconfig   |  2 ++
 29 files changed, 103 insertions(+)
 create mode 100644 .editorconfig
 create mode 100644 bin/.editorconfig
 create mode 100644 include/CL/.editorconfig
 create mode 100644 include/D3D9/.editorconfig
 create mode 100644 include/c11/.editorconfig
 create mode 100644 include/d3dadapter/.editorconfig
 create mode 100644 include/vulkan/.editorconfig
 create mode 100644 src/egl/drivers/haiku/.editorconfig
 create 

Re: [Mesa-dev] [PATCH 1/2] util: import the slab allocator from gallium

2016-08-29 Thread Dave Airlie
>
> There are also some cosmetic changes.

Did I miss 0/2? Is there some future use for this in mesa?

I assume there is, just wanting more info :)

Dave.


Re: [Mesa-dev] [PATCH 07/20] radeonsi: fix cubemaps viewed as 2D

2016-08-29 Thread Dave Airlie
On 30 August 2016 at 01:28, Marek Olšák  wrote:
> From: Marek Olšák 
>
> This fixes: GL43-CTS.texture_view.view_sampling

Reviewed-by: Dave Airlie 

>
> Cc: mesa-sta...@lists.freedesktop.org
> ---
>  src/gallium/drivers/radeonsi/si_state.c | 7 +++
>  1 file changed, 7 insertions(+)
>
> diff --git a/src/gallium/drivers/radeonsi/si_state.c b/src/gallium/drivers/radeonsi/si_state.c
> index 25dfe26..026aded 100644
> --- a/src/gallium/drivers/radeonsi/si_state.c
> +++ b/src/gallium/drivers/radeonsi/si_state.c
> @@ -1603,20 +1603,27 @@ static unsigned si_tex_compare(unsigned compare)
> }
>  }
>
>  static unsigned si_tex_dim(unsigned res_target, unsigned view_target,
>unsigned nr_samples)
>  {
> if (view_target == PIPE_TEXTURE_CUBE ||
> view_target == PIPE_TEXTURE_CUBE_ARRAY)
> res_target = view_target;
>
> +   /* If interpretting cubemaps as something else, set 2D_ARRAY. */
> +   if ((res_target == PIPE_TEXTURE_CUBE ||
> +res_target == PIPE_TEXTURE_CUBE_ARRAY) &&
> +   view_target != PIPE_TEXTURE_CUBE &&
> +   view_target != PIPE_TEXTURE_CUBE_ARRAY)
> +   res_target = PIPE_TEXTURE_2D_ARRAY;
> +
> switch (res_target) {
> default:
> case PIPE_TEXTURE_1D:
> return V_008F1C_SQ_RSRC_IMG_1D;
> case PIPE_TEXTURE_1D_ARRAY:
> return V_008F1C_SQ_RSRC_IMG_1D_ARRAY;
> case PIPE_TEXTURE_2D:
> case PIPE_TEXTURE_RECT:
> return nr_samples > 1 ? V_008F1C_SQ_RSRC_IMG_2D_MSAA :
> V_008F1C_SQ_RSRC_IMG_2D;
> --
> 2.7.4
>
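
Editor's note: the target selection in the patch above can be read as a small
pure function. The sketch below is an illustration only: enum tex_target,
is_cube() and effective_target() are hypothetical names, and the real
si_tex_dim() goes on to map the result to a hardware resource type.

```c
#include <stdbool.h>

/* Hypothetical subset of texture targets. */
enum tex_target { TEX_2D, TEX_2D_ARRAY, TEX_CUBE, TEX_CUBE_ARRAY };

static bool
is_cube(enum tex_target t)
{
   return t == TEX_CUBE || t == TEX_CUBE_ARRAY;
}

/* Picks the target used to program the descriptor. */
static enum tex_target
effective_target(enum tex_target res_target, enum tex_target view_target)
{
   /* A cube view always wins, whatever the resource was created as. */
   if (is_cube(view_target))
      return view_target;

   /* A cube resource viewed as something else is treated as a 2D array. */
   if (is_cube(res_target))
      return TEX_2D_ARRAY;

   return res_target;
}
```

The second rule is the one the patch adds: a non-cube view of a cubemap is
addressed as a 2D array of faces.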


Re: [Mesa-dev] [PATCH 10/20] radeonsi: fix Gather4 with integer formats

2016-08-29 Thread Dave Airlie
On 30 August 2016 at 01:28, Marek Olšák  wrote:
> From: Marek Olšák 
>
> The closed compiler does the same thing.
>
> This fixes: GL45-CTS.texture_gather.*-int-* (18 tests)

Also reused in radv, and works there to fix a bunch of gather int tests.

Reviewed-by: Dave Airlie 
> ---
>  src/gallium/drivers/radeonsi/si_shader.c | 99 +++-
>  1 file changed, 96 insertions(+), 3 deletions(-)
>
> diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi/si_shader.c
> index f8884ef..90c9b1f 100644
> --- a/src/gallium/drivers/radeonsi/si_shader.c
> +++ b/src/gallium/drivers/radeonsi/si_shader.c
> @@ -4717,30 +4717,100 @@ static void tex_fetch_args(
> gather_comp = CLAMP(gather_comp, 0, 3);
> }
>
> dmask = 1 << gather_comp;
> }
>
> set_tex_fetch_args(ctx, emit_data, opcode, target, res_ptr,
>samp_ptr, address, count, dmask);
>  }
>
> +/* Gather4 should follow the same rules as bilinear filtering, but the hardware
> + * incorrectly forces nearest filtering if the texture format is integer.
> + * The only effect it has on Gather4, which always returns 4 texels for
> + * bilinear filtering, is that the final coordinates are off by 0.5 of
> + * the texel size.
> + *
> + * The workaround is to subtract 0.5 from the unnormalized coordinates,
> + * or (0.5 / size) from the normalized coordinates.
> + */
> +static void si_lower_gather4_integer(struct si_shader_context *ctx,
> +struct lp_build_emit_data *emit_data,
> +const char *intr_name,
> +unsigned coord_vgpr_index)
> +{
> +   LLVMBuilderRef builder = ctx->radeon_bld.gallivm.builder;
> +   LLVMValueRef coord = emit_data->args[0];
> +   LLVMValueRef half_texel[2];
> +   int c;
> +
> +   if (emit_data->inst->Texture.Texture == TGSI_TEXTURE_RECT ||
> +   emit_data->inst->Texture.Texture == TGSI_TEXTURE_SHADOWRECT) {
> +   half_texel[0] = half_texel[1] = LLVMConstReal(ctx->f32, -0.5);
> +   } else {
> +   struct tgsi_full_instruction txq_inst = {};
> +   struct lp_build_emit_data txq_emit_data = {};
> +
> +   /* Query the texture size. */
> +   txq_inst.Texture.Texture = emit_data->inst->Texture.Texture;
> +   txq_emit_data.inst = &txq_inst;
> +   txq_emit_data.dst_type = ctx->v4i32;
> +   set_tex_fetch_args(ctx, &txq_emit_data, TGSI_OPCODE_TXQ,
> +  txq_inst.Texture.Texture,
> +  emit_data->args[1], NULL,
> +  &ctx->radeon_bld.soa.bld_base.uint_bld.zero,
> +  1, 0xf);
> +   txq_emit(NULL, &ctx->radeon_bld.soa.bld_base, &txq_emit_data);
> +
> +   /* Compute -0.5 / size. */
> +   for (c = 0; c < 2; c++) {
> +   half_texel[c] =
> +   LLVMBuildExtractElement(builder, txq_emit_data.output[0],
> +   LLVMConstInt(ctx->i32, c, 0), "");
> +   half_texel[c] = LLVMBuildUIToFP(builder, half_texel[c], ctx->f32, "");
> +   half_texel[c] =
> +   lp_build_emit_llvm_unary(&ctx->radeon_bld.soa.bld_base,
> +TGSI_OPCODE_RCP, half_texel[c]);
> +   half_texel[c] = LLVMBuildFMul(builder, half_texel[c],
> + LLVMConstReal(ctx->f32, -0.5), "");
> +   }
> +   }
> +
> +   for (c = 0; c < 2; c++) {
> +   LLVMValueRef tmp;
> +   LLVMValueRef index = LLVMConstInt(ctx->i32, coord_vgpr_index + c, 0);
> +
> +   tmp = LLVMBuildExtractElement(builder, coord, index, "");
> +   tmp = LLVMBuildBitCast(builder, tmp, ctx->f32, "");
> +   tmp = LLVMBuildFAdd(builder, tmp, half_texel[c], "");
> +   tmp = LLVMBuildBitCast(builder, tmp, ctx->i32, "");
> +   coord = LLVMBuildInsertElement(builder, coord, tmp, index, "");
> +   }
> +
> +   emit_data->args[0] = coord;
> +   emit_data->output[emit_data->chan] =
> +   lp_build_intrinsic(builder, intr_name, emit_data->dst_type,
> +  emit_data->args, emit_data->arg_count,
> +  LLVMReadNoneAttribute);
> +}
> +
>  static void build_tex_intrinsic(const struct lp_build_tgsi_action *action,
> struct lp_build_tgsi_context *bld_base,
> struct lp_build_emit_data *emit_data)
>  {
> struct si_shader_context *ctx = 
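
Editor's note: the archived patch is cut off above. The workaround described
in its comment reduces to a per-axis offset, sketched here in plain C rather
than LLVM IR. gather4_int_offset() is a hypothetical helper, not part of the
patch; "unnormalized" corresponds to the RECT and SHADOWRECT targets, and
size is the value the patch obtains from the TXQ query.

```c
#include <assert.h>
#include <stdbool.h>

/* Offset to add to one gather4 coordinate to undo the half-texel shift.
 * `size` is the texture size in texels along that axis; it is ignored for
 * unnormalized coordinates. */
static float
gather4_int_offset(bool unnormalized, unsigned size)
{
   if (unnormalized)
      return -0.5f;

   assert(size > 0);
   return -0.5f / (float)size;
}
```

For a 4-texel-wide texture with normalized coordinates the offset is -0.125,
that is, half a texel expressed in the [0, 1] coordinate range.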

Re: [Mesa-dev] [PATCH 09/20] radeonsi: fix a crash in imageSize for cubemap arrays

2016-08-29 Thread Dave Airlie
On 30 August 2016 at 01:28, Marek Olšák  wrote:
> From: Marek Olšák 
>
> Sometimes it was f32, other times it was i32. Now it's always i32.

I think this is pretty much what radv does in the same situation.

Reviewed-by: Dave Airlie 
>
> This fixes:
> GL45-CTS.texture_cube_map_array.image_texture_size.texture_size_compute_sh
> ---
>  src/gallium/drivers/radeonsi/si_shader.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
>
> diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
> b/src/gallium/drivers/radeonsi/si_shader.c
> index 6eca5cf..f8884ef 100644
> --- a/src/gallium/drivers/radeonsi/si_shader.c
> +++ b/src/gallium/drivers/radeonsi/si_shader.c
> @@ -4098,21 +4098,21 @@ static void atomic_emit(
>
>  static void resq_fetch_args(
> struct lp_build_tgsi_context * bld_base,
> struct lp_build_emit_data * emit_data)
>  {
> struct si_shader_context *ctx = si_shader_context(bld_base);
> struct gallivm_state *gallivm = bld_base->base.gallivm;
> const struct tgsi_full_instruction *inst = emit_data->inst;
> const struct tgsi_full_src_register *reg = &inst->Src[0];
>
> -   emit_data->dst_type = LLVMVectorType(bld_base->base.elem_type, 4);
> +   emit_data->dst_type = ctx->v4i32;
>
> if (reg->Register.File == TGSI_FILE_BUFFER) {
> emit_data->args[0] = shader_buffer_fetch_rsrc(ctx, reg);
> emit_data->arg_count = 1;
> } else if (inst->Memory.Texture == TGSI_TEXTURE_BUFFER) {
> image_fetch_rsrc(bld_base, reg, false, &emit_data->args[0]);
> emit_data->arg_count = 1;
> } else {
> emit_data->args[0] = bld_base->uint_bld.zero; /* mip level */
> image_fetch_rsrc(bld_base, reg, false, &emit_data->args[1]);
> @@ -4149,23 +4149,21 @@ static void resq_emit(
> builder, "llvm.SI.getresinfo.i32", emit_data->dst_type,
> emit_data->args, emit_data->arg_count,
> LLVMReadNoneAttribute);
>
> /* Divide the number of layers by 6 to get the number of cubes. */
> if (inst->Memory.Texture == TGSI_TEXTURE_CUBE_ARRAY) {
> LLVMValueRef imm2 = lp_build_const_int32(gallivm, 2);
> LLVMValueRef imm6 = lp_build_const_int32(gallivm, 6);
>
> LLVMValueRef z = LLVMBuildExtractElement(builder, out, imm2, "");
> -   z = LLVMBuildBitCast(builder, z, bld_base->uint_bld.elem_type, "");
> z = LLVMBuildSDiv(builder, z, imm6, "");
> -   z = LLVMBuildBitCast(builder, z, bld_base->base.elem_type, "");
> out = LLVMBuildInsertElement(builder, out, z, imm2, "");
> }
> }
>
> emit_data->output[emit_data->chan] = out;
>  }
>
>  static void set_tex_fetch_args(struct si_shader_context *ctx,
>struct lp_build_emit_data *emit_data,
>unsigned opcode, unsigned target,
> --
> 2.7.4
>
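
Editor's note: once the result vector is always i32, the cube-array fix-up in
resq_emit() is plain integer arithmetic with no bitcasts. A sketch in
ordinary C, with hypothetical names (struct resinfo, fix_cube_array_size) and
the assumption that component 2 holds the layer count:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Four i32 components; only index 2 (layers) matters for this sketch. */
struct resinfo {
   int32_t v[4];
};

/* Converts a layer count into a cube count for cube-array images. */
static struct resinfo
fix_cube_array_size(struct resinfo out, bool is_cube_array)
{
   if (is_cube_array) {
      assert(out.v[2] % 6 == 0);   /* a cube array has 6 layers per cube */
      out.v[2] /= 6;
   }
   return out;
}
```

A 64x64 cube array with 12 layers therefore reports a size of (64, 64, 2).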


Re: [Mesa-dev] [PATCH 19/20] gallium/radeon: set VPORT_ZMIN/MAX registers correctly

2016-08-29 Thread Bas Nieuwenhuizen
On Mon, Aug 29, 2016 at 5:28 PM, Marek Olšák  wrote:
> From: Marek Olšák 
>
> Calculate depth ranges from viewport states and
> pipe_rasterizer_state::clip_halfz.
>
> The evergreend.h change is required to silence a warning.
>
> This fixes this recently updated piglit: arb_depth_clamp/depth-clamp-range
> ---
>  src/gallium/drivers/r600/evergreen_state.c|  1 +
>  src/gallium/drivers/r600/evergreend.h |  4 +-
>  src/gallium/drivers/r600/r600_hw_context.c|  1 +
>  src/gallium/drivers/r600/r600_pipe.h  |  1 +
>  src/gallium/drivers/r600/r600_state.c |  1 +
>  src/gallium/drivers/r600/r600_state_common.c  |  2 +-
>  src/gallium/drivers/radeon/r600_pipe_common.h |  5 +-
>  src/gallium/drivers/radeon/r600_viewport.c| 73 ---
>  src/gallium/drivers/radeon/r600d_common.h |  2 +
>  src/gallium/drivers/radeonsi/si_hw_context.c  |  1 +
>  src/gallium/drivers/radeonsi/si_state.c   |  3 +-
>  src/gallium/drivers/radeonsi/si_state.h   |  1 +
>  12 files changed, 82 insertions(+), 13 deletions(-)
>
> diff --git a/src/gallium/drivers/r600/evergreen_state.c b/src/gallium/drivers/r600/evergreen_state.c
> index 11c8161..5ca5453 100644
> --- a/src/gallium/drivers/r600/evergreen_state.c
> +++ b/src/gallium/drivers/r600/evergreen_state.c
> @@ -466,20 +466,21 @@ static void *evergreen_create_rs_state(struct pipe_context *ctx,
> float psize_min, psize_max;
> struct r600_rasterizer_state *rs = CALLOC_STRUCT(r600_rasterizer_state);
>
> if (!rs) {
> return NULL;
> }
>
> r600_init_command_buffer(&rs->buffer, 30);
>
> rs->scissor_enable = state->scissor;
> +   rs->clip_halfz = state->clip_halfz;
> rs->flatshade = state->flatshade;
> rs->sprite_coord_enable = state->sprite_coord_enable;
> rs->two_side = state->light_twoside;
> rs->clip_plane_enable = state->clip_plane_enable;
> rs->pa_sc_line_stipple = state->line_stipple_enable ?
> S_028A0C_LINE_PATTERN(state->line_stipple_pattern) |
> S_028A0C_REPEAT_COUNT(state->line_stipple_factor) : 0;
> rs->pa_cl_clip_cntl =
> S_028810_DX_CLIP_SPACE_DEF(state->clip_halfz) |
> S_028810_ZCLIP_NEAR_DISABLE(!state->depth_clip) |
> diff --git a/src/gallium/drivers/r600/evergreend.h b/src/gallium/drivers/r600/evergreend.h
> index a81b6c5..3f33e42 100644
> --- a/src/gallium/drivers/r600/evergreend.h
> +++ b/src/gallium/drivers/r600/evergreend.h
> @@ -1856,22 +1856,22 @@
>  #define R_0283DC_SQ_VTX_SEMANTIC_23  0x000283DC
>  #define R_0283E0_SQ_VTX_SEMANTIC_24  0x000283E0
>  #define R_0283E4_SQ_VTX_SEMANTIC_25  0x000283E4
>  #define R_0283E8_SQ_VTX_SEMANTIC_26  0x000283E8
>  #define R_0283EC_SQ_VTX_SEMANTIC_27  0x000283EC
>  #define R_0283F0_SQ_VTX_SEMANTIC_28  0x000283F0
>  #define R_0283F4_SQ_VTX_SEMANTIC_29  0x000283F4
>  #define R_0283F8_SQ_VTX_SEMANTIC_30  0x000283F8
>  #define R_0283FC_SQ_VTX_SEMANTIC_31  0x000283FC
>  #define R_0288F0_SQ_VTX_SEMANTIC_CLEAR   0x000288F0
> -#define R_0282D0_PA_SC_VPORT_ZMIN_0  0x000282D0
> -#define R_0282D4_PA_SC_VPORT_ZMAX_0  0x000282D4
> +#define R_0282D0_PA_SC_VPORT_ZMIN_0 0x0282D0
> +#define R_0282D4_PA_SC_VPORT_ZMAX_0 0x0282D4

Could you align these the same as the other registers?

With that, patches 1-11,13-16,18-20 are

Reviewed-by: Bas Nieuwenhuizen 

Patches 12 and 17 look good to me, but I can't ascertain anything
about correctness as they almost entirely depend on undocumented
hardware facts that I don't know and hence don't get my r-b. I don't
know if there is a preferred tag for this?

Yours sincerely,
Bas Nieuwenhuizen

>  #define R_028400_VGT_MAX_VTX_INDX0x00028400
>  #define R_028404_VGT_MIN_VTX_INDX0x00028404
>  #define R_028408_VGT_INDX_OFFSET 0x00028408
>  #define R_02840C_VGT_MULTI_PRIM_IB_RESET_INDX0x0002840C
>  #define R_028414_CB_BLEND_RED0x00028414
>  #define R_028418_CB_BLEND_GREEN  0x00028418
>  #define R_02841C_CB_BLEND_BLUE   0x0002841C
>  #define R_028420_CB_BLEND_ALPHA  0x00028420
>  #define R_028438_SX_ALPHA_REF0x00028438
>  #define R_02843C_PA_CL_VPORT_XSCALE_00x0002843C
> diff --git a/src/gallium/drivers/r600/r600_hw_context.c b/src/gallium/drivers/r600/r600_hw_context.c
> index 58ba09d..dc5ad75 100644
> --- a/src/gallium/drivers/r600/r600_hw_context.c
> +++ b/src/gallium/drivers/r600/r600_hw_context.c
> @@ -305,20 +305,21 @@ 

Re: [Mesa-dev] [PATCH] spirv: replace assert with unreachable

2016-08-29 Thread Jason Ekstrand
On Aug 29, 2016 12:06 PM, "Matt Turner"  wrote:
>
> On Sun, Aug 28, 2016 at 7:13 PM, Timothy Arceri
>  wrote:
> > Fixes uninitialised warning for coord_components.
> > ---
> >  src/compiler/spirv/spirv_to_nir.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/src/compiler/spirv/spirv_to_nir.c
b/src/compiler/spirv/spirv_to_nir.c
> > index ca404d8..fda38f9 100644
> > --- a/src/compiler/spirv/spirv_to_nir.c
> > +++ b/src/compiler/spirv/spirv_to_nir.c
> > @@ -1426,7 +1426,7 @@ vtn_handle_texture(struct vtn_builder *b, SpvOp
opcode,
> >   coord_components = 3;
> >   break;
> >default:
> > - assert("Invalid sampler type");
> > + unreachable("Invalid sampler type");
>
> Not only does it fix an uninitialized warning, the assert was *wrong*.
> It's missing the ! so it would have always been true!

Drp... Rb

> Reviewed-by: Matt Turner 


Re: [Mesa-dev] [PATCH] clover: probe pipe for the value of CL_DEVICE_ADDRESS_BITS

2016-08-29 Thread Vedran Miletić

On 08/29/2016 09:30 PM, Jan Vesely wrote:

On Mon, 2016-08-29 at 20:13 +0200, Vedran Miletić wrote:

Clover presently reports 32 as CL_DEVICE_ADDRESS_BITS, which is not
correct for AMD SI and newer chip generations. This patch introduces
the PIPE_COMPUTE_CAP_ADDRESS_BITS pipe capability queried by Clover
to r600 pipe and sets the value to 32 for AMD EG/NI chips and 64 for
SI and newer (chips older than EG will not use this capability).


I pushed my version that handles the PIPE_CAP in other pipes as well.

Jan



Whoops, missed that one.

Sorry.

Vedran

--
Vedran Miletić
vedran.miletic.net


Re: [Mesa-dev] [PATCH] clover: probe pipe for the value of CL_DEVICE_ADDRESS_BITS

2016-08-29 Thread Jan Vesely
On Mon, 2016-08-29 at 20:13 +0200, Vedran Miletić wrote:
> Clover presently reports 32 as CL_DEVICE_ADDRESS_BITS, which is not
> correct for AMD SI and newer chip generations. This patch introduces
> the PIPE_COMPUTE_CAP_ADDRESS_BITS pipe capability queried by Clover
> to r600 pipe and sets the value to 32 for AMD EG/NI chips and 64 for
> SI and newer (chips older than EG will not use this capability).

I pushed my version that handles the PIPE_CAP in other pipes as well.

Jan

> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97513
> Signed-off-by: Vedran Miletić 
> ---
>  src/gallium/drivers/radeon/r600_pipe_common.c | 9 +
>  src/gallium/include/pipe/p_defines.h  | 3 ++-
>  src/gallium/state_trackers/clover/api/device.cpp  | 2 +-
>  src/gallium/state_trackers/clover/core/device.cpp | 6 ++
>  src/gallium/state_trackers/clover/core/device.hpp | 1 +
>  5 files changed, 19 insertions(+), 2 deletions(-)
> 
> diff --git a/src/gallium/drivers/radeon/r600_pipe_common.c
> b/src/gallium/drivers/radeon/r600_pipe_common.c
> index b1da22f..7385715 100644
> --- a/src/gallium/drivers/radeon/r600_pipe_common.c
> +++ b/src/gallium/drivers/radeon/r600_pipe_common.c
> @@ -955,6 +955,15 @@ static int r600_get_compute_param(struct
> pipe_screen *screen,
>   *subgroup_size =
> r600_wavefront_size(rscreen->family);
>   }
>   return sizeof(uint32_t);
> +case PIPE_COMPUTE_CAP_ADDRESS_BITS:
> +if (ret) {
> +uint32_t *address_bits = ret;
> +if (rscreen->chip_class >= SI)
> +*address_bits = 64;
> +else
> +*address_bits = 32;
> +}
> +return sizeof(uint32_t);
>   }
>  
>  fprintf(stderr, "unknown PIPE_COMPUTE_CAP %d\n", param);
> diff --git a/src/gallium/include/pipe/p_defines.h
> b/src/gallium/include/pipe/p_defines.h
> index 1e4d802..93e30e8 100644
> --- a/src/gallium/include/pipe/p_defines.h
> +++ b/src/gallium/include/pipe/p_defines.h
> @@ -847,7 +847,8 @@ enum pipe_compute_cap
> PIPE_COMPUTE_CAP_MAX_CLOCK_FREQUENCY,
> PIPE_COMPUTE_CAP_MAX_COMPUTE_UNITS,
> PIPE_COMPUTE_CAP_IMAGES_SUPPORTED,
> -   PIPE_COMPUTE_CAP_SUBGROUP_SIZE
> +   PIPE_COMPUTE_CAP_SUBGROUP_SIZE,
> +   PIPE_COMPUTE_CAP_ADDRESS_BITS
>  };
>  
>  /**
> diff --git a/src/gallium/state_trackers/clover/api/device.cpp
> b/src/gallium/state_trackers/clover/api/device.cpp
> index 11f21e9..f7bd61b 100644
> --- a/src/gallium/state_trackers/clover/api/device.cpp
> +++ b/src/gallium/state_trackers/clover/api/device.cpp
> @@ -158,7 +158,7 @@ clGetDeviceInfo(cl_device_id d_dev,
> cl_device_info param,
>    break;
>  
> case CL_DEVICE_ADDRESS_BITS:
> -  buf.as_scalar() = 32;
> +  buf.as_scalar() = dev.address_bits();
>    break;
>  
> case CL_DEVICE_MAX_READ_IMAGE_ARGS:
> diff --git a/src/gallium/state_trackers/clover/core/device.cpp
> b/src/gallium/state_trackers/clover/core/device.cpp
> index 39f39f4..8825f99 100644
> --- a/src/gallium/state_trackers/clover/core/device.cpp
> +++ b/src/gallium/state_trackers/clover/core/device.cpp
> @@ -193,6 +193,12 @@ device::subgroup_size() const {
>    PIPE_COMPUTE_CAP_SUBGROUP_SIZE
> )[0];
>  }
>  
> +cl_uint
> +device::address_bits() const {
> +   return get_compute_param(pipe, ir_format(),
> +  PIPE_COMPUTE_CAP_ADDRESS_BITS)
> [0];
> +}
> +
>  std::string
>  device::device_name() const {
> return pipe->get_name(pipe);
> diff --git a/src/gallium/state_trackers/clover/core/device.hpp
> b/src/gallium/state_trackers/clover/core/device.hpp
> index 2857847..6cf6c7f 100644
> --- a/src/gallium/state_trackers/clover/core/device.hpp
> +++ b/src/gallium/state_trackers/clover/core/device.hpp
> @@ -68,6 +68,7 @@ namespace clover {
>  
>    std::vector max_block_size() const;
>    cl_uint subgroup_size() const;
> +  cl_uint address_bits() const;
>    std::string device_name() const;
>    std::string vendor_name() const;
>    enum pipe_shader_ir ir_format() const;
-- 
Jan Vesely 



Re: [Mesa-dev] [PATCH 11/20] radeonsi: fix texture format reinterpretation with DCC

2016-08-29 Thread Marek Olšák
On Mon, Aug 29, 2016 at 8:29 PM, Bas Nieuwenhuizen
 wrote:
> Hi Marek,
>
> I don't think this accounts for the fast clear bits? unorm<->uint and
> snorm<->sint should have compatible clear values, but otherwise we may
> need to eliminate the fast clears.

That's a good point.

I propose that this patch be pushed as-is (unless you have other
comments on it) and the fast clear can be fixed after we have proper
tests.

Marek


Re: [Mesa-dev] [PATCH] spirv: replace assert with unreachable

2016-08-29 Thread Matt Turner
On Sun, Aug 28, 2016 at 7:13 PM, Timothy Arceri
 wrote:
> Fixes uninitialised warning for coord_components.
> ---
>  src/compiler/spirv/spirv_to_nir.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/compiler/spirv/spirv_to_nir.c 
> b/src/compiler/spirv/spirv_to_nir.c
> index ca404d8..fda38f9 100644
> --- a/src/compiler/spirv/spirv_to_nir.c
> +++ b/src/compiler/spirv/spirv_to_nir.c
> @@ -1426,7 +1426,7 @@ vtn_handle_texture(struct vtn_builder *b, SpvOp opcode,
>   coord_components = 3;
>   break;
>default:
> - assert("Invalid sampler type");
> + unreachable("Invalid sampler type");

Not only does it fix an uninitialized warning, the assert was *wrong*.
It's missing the ! so it would have always been true!
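
To illustrate the point, here is a minimal sketch (not Mesa code; both helper names are made up for this example) of why a bare string literal is useless as an assert condition. The literal decays to a non-null pointer, so the condition always holds. Negating it is what makes the assertion fail.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: a string literal decays to a non-null pointer,
 * so using it as a condition always yields "true". This is the
 * condition that assert("Invalid sampler type") ends up testing. */
static bool literal_condition(void)
{
   const char *msg = "Invalid sampler type";
   return msg != NULL;
}

/* Hypothetical helper: negating the pointer gives "false", which is
 * the condition that assert(!"Invalid sampler type") tests, so that
 * form does trip in a debug build. */
static bool negated_literal_condition(void)
{
   const char *msg = "Invalid sampler type";
   return !msg;
}
```

unreachable() has the additional effect of telling the compiler that control never continues past the default case, which is presumably what silences the uninitialised-variable warning.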

Reviewed-by: Matt Turner 


Re: [Mesa-dev] [PATCH 4/7] glsl: add core plumbing for GL_ANDROID_extension_pack_es31a

2016-08-29 Thread Ilia Mirkin
On Mon, Aug 29, 2016 at 2:45 PM, Matt Turner  wrote:
> On Sun, Aug 28, 2016 at 7:10 PM, Ilia Mirkin  wrote:
>> Signed-off-by: Ilia Mirkin 
>> ---
>>  src/compiler/glsl/glsl_parser_extras.cpp | 57 
>> +++-
>>  src/compiler/glsl/glsl_parser_extras.h   |  2 ++
>>  src/mesa/main/extensions_table.h |  2 ++
>>  src/mesa/main/mtypes.h   |  1 +
>>  4 files changed, 46 insertions(+), 16 deletions(-)
>>
>> diff --git a/src/compiler/glsl/glsl_parser_extras.cpp 
>> b/src/compiler/glsl/glsl_parser_extras.cpp
>> index b33cd3a..a44f014 100644
>> --- a/src/compiler/glsl/glsl_parser_extras.cpp
>> +++ b/src/compiler/glsl/glsl_parser_extras.cpp
>> @@ -523,6 +523,11 @@ struct _mesa_glsl_extension {
>> const char *name;
>>
>> /**
>> +* Whether this extension is a part of AEP
>> +*/
>> +   bool aep;
>
> In reviewing this patch, I developed some doubts whether this is the
> right approach. Looking at the list of extensions in the AEP spec,
> it's difficult to compare with our code to confirm that they're all
> included.
>
> Take OES_sample_shading for instance. It's enabled if
> OES_sample_variables is. OES_sample_variables is also in AEP, so using
> EXT_AEP seems fine, but it takes a bit of analysis.
>
> Or take EXT_copy_image. It's enabled based on OES_copy_image. Does
> that mean that OES_copy_image should use EXT_AEP even though it's not
> part of AEP?
>
> I wonder if handling AEP in version.c like we do with the ver_3_0
> variables and the list of extensions would be better.

From the spec:

Including the following line in a shader:

  #extension GL_ANDROID_extension_pack_es31a : 

has the same effect as including the following lines:

  #extension GL_KHR_blend_equation_advanced : 
  #extension GL_OES_sample_variables : 
  #extension GL_OES_shader_image_atomic : 
  #extension GL_OES_shader_multisample_interpolation : 
  #extension GL_OES_texture_storage_multisample_2d_array : 
  #extension GL_EXT_geometry_shader : 
  #extension GL_EXT_gpu_shader5 : 
  #extension GL_EXT_primitive_bounding_box : 
  #extension GL_EXT_shader_io_blocks : 
  #extension GL_EXT_tessellation_shader : 
  #extension GL_EXT_texture_buffer : 
  #extension GL_EXT_texture_cube_map_array : 

What I do here is directly equivalent. Note that this EXT_AEP thing is
solely for the GLSL end of things, not for the extension string.

I think you may be confusing the two concepts... the ANDROID_foo
extension enable in ctx->Extensions should only be flipped on when
those various exts (and the ones you listed below) are enabled by the
driver, which will make it available to be enabled in GLSL. This isn't
the only extension with funny dependencies. The approach thus far has
been to just add a new bit and letting the driver deal with it rather
than trying to build logic into version.c or other places. (See, for
example, the recent OES_texture_cube_map_array enable, which is
basically ARB_texture_cube_map_array && OES_geometry_shader.)
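
As a rough sketch of the GLSL-side behaviour described above (a simplified stand-in, not the actual glsl_parser_extras.cpp code; the struct, table and function names are assumptions made for this example), toggling the pack walks the table and toggles every entry flagged as part of AEP:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, trimmed-down extension table entry. */
struct ext_sketch {
   const char *name;
   bool aep;      /* part of GL_ANDROID_extension_pack_es31a? */
   bool enabled;  /* current #extension state in the shader */
};

static struct ext_sketch ext_table[] = {
   { "GL_ANDROID_extension_pack_es31a", false, false },
   { "GL_EXT_geometry_shader",          true,  false },
   { "GL_OES_sample_variables",         true,  false },
   { "GL_OES_texture_3D",               false, false },
};

#define EXT_TABLE_LEN (sizeof(ext_table) / sizeof(ext_table[0]))

/* Apply "#extension <name> : enable/disable". Toggling the pack also
 * toggles every entry flagged as AEP. Returns false for unknown names. */
static bool set_extension(const char *name, bool enable)
{
   bool is_pack = strcmp(name, "GL_ANDROID_extension_pack_es31a") == 0;
   bool found = false;

   for (size_t i = 0; i < EXT_TABLE_LEN; i++) {
      if (strcmp(ext_table[i].name, name) == 0) {
         ext_table[i].enabled = enable;
         found = true;
      } else if (is_pack && ext_table[i].aep) {
         ext_table[i].enabled = enable;
      }
   }
   return found;
}

/* Look up the current state of an extension by name. */
static bool is_enabled(const char *name)
{
   for (size_t i = 0; i < EXT_TABLE_LEN; i++)
      if (strcmp(ext_table[i].name, name) == 0)
         return ext_table[i].enabled;
   return false;
}
```

Whether the pack itself is offered to the shader at all is a separate question, decided by the driver-controlled bit in ctx->Extensions. The sketch only covers what happens once it is offered.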

>
>> +
>> +   /**
>>  * Predicate that checks whether the relevant extension is available for
>>  * this context.
>>  */
>> @@ -565,9 +570,14 @@ has_##name_str(const struct gl_context *ctx, gl_api 
>> api, uint8_t version) \
>>  #undef EXT
>>
>>  #define EXT(NAME)   \
>> -   { "GL_" #NAME, has_##NAME, \
>> - &_mesa_glsl_parse_state::NAME##_enable,\
>> - &_mesa_glsl_parse_state::NAME##_warn }
>> +   { "GL_" #NAME, false, has_##NAME,\
>> + &_mesa_glsl_parse_state::NAME##_enable,\
>> + &_mesa_glsl_parse_state::NAME##_warn }
>> +
>> +#define EXT_AEP(NAME)   \
>> +   { "GL_" #NAME, true, has_##NAME, \
>> + &_mesa_glsl_parse_state::NAME##_enable,\
>> + &_mesa_glsl_parse_state::NAME##_warn }
>>
>>  /**
>>   * Table of extensions that can be enabled/disabled within a shader,
>> @@ -623,7 +633,7 @@ static const _mesa_glsl_extension 
>> _mesa_glsl_supported_extensions[] = {
>>
>> /* KHR extensions go here, sorted alphabetically.
>>  */
>> -   EXT(KHR_blend_equation_advanced),
>> +   EXT_AEP(KHR_blend_equation_advanced),
>>
>> /* OES extensions go here, sorted alphabetically.
>>  */
>> @@ -632,17 +642,17 @@ static const _mesa_glsl_extension 
>> _mesa_glsl_supported_extensions[] = {
>> EXT(OES_geometry_shader),
>> EXT(OES_gpu_shader5),
>> EXT(OES_primitive_bounding_box),
>> -   EXT(OES_sample_variables),
>> -   EXT(OES_shader_image_atomic),
>> +   EXT_AEP(OES_sample_variables),
>> +   EXT_AEP(OES_shader_image_atomic),
>> EXT(OES_shader_io_blocks),
>> -   EXT(OES_shader_multisample_interpolation),
>> +   EXT_AEP(OES_shader_multisample_interpolation),
>> EXT(OES_standard_derivatives),
>> 

Re: [Mesa-dev] [PATCH 4/7] glsl: add core plumbing for GL_ANDROID_extension_pack_es31a

2016-08-29 Thread Matt Turner
On Sun, Aug 28, 2016 at 7:10 PM, Ilia Mirkin  wrote:
> Signed-off-by: Ilia Mirkin 
> ---
>  src/compiler/glsl/glsl_parser_extras.cpp | 57 
> +++-
>  src/compiler/glsl/glsl_parser_extras.h   |  2 ++
>  src/mesa/main/extensions_table.h |  2 ++
>  src/mesa/main/mtypes.h   |  1 +
>  4 files changed, 46 insertions(+), 16 deletions(-)
>
> diff --git a/src/compiler/glsl/glsl_parser_extras.cpp 
> b/src/compiler/glsl/glsl_parser_extras.cpp
> index b33cd3a..a44f014 100644
> --- a/src/compiler/glsl/glsl_parser_extras.cpp
> +++ b/src/compiler/glsl/glsl_parser_extras.cpp
> @@ -523,6 +523,11 @@ struct _mesa_glsl_extension {
> const char *name;
>
> /**
> +* Whether this extension is a part of AEP
> +*/
> +   bool aep;

In reviewing this patch, I developed some doubts whether this is the
right approach. Looking at the list of extensions in the AEP spec,
it's difficult to compare with our code to confirm that they're all
included.

Take OES_sample_shading for instance. It's enabled if
OES_sample_variables is. OES_sample_variables is also in AEP, so using
EXT_AEP seems fine, but it takes a bit of analysis.

Or take EXT_copy_image. It's enabled based on OES_copy_image. Does
that mean that OES_copy_image should use EXT_AEP even though it's not
part of AEP?

I wonder if handling AEP in version.c like we do with the ver_3_0
variables and the list of extensions would be better.

> +
> +   /**
>  * Predicate that checks whether the relevant extension is available for
>  * this context.
>  */
> @@ -565,9 +570,14 @@ has_##name_str(const struct gl_context *ctx, gl_api api, 
> uint8_t version) \
>  #undef EXT
>
>  #define EXT(NAME)   \
> -   { "GL_" #NAME, has_##NAME, \
> - &_mesa_glsl_parse_state::NAME##_enable,\
> - &_mesa_glsl_parse_state::NAME##_warn }
> +   { "GL_" #NAME, false, has_##NAME,\
> + &_mesa_glsl_parse_state::NAME##_enable,\
> + &_mesa_glsl_parse_state::NAME##_warn }
> +
> +#define EXT_AEP(NAME)   \
> +   { "GL_" #NAME, true, has_##NAME, \
> + &_mesa_glsl_parse_state::NAME##_enable,\
> + &_mesa_glsl_parse_state::NAME##_warn }
>
>  /**
>   * Table of extensions that can be enabled/disabled within a shader,
> @@ -623,7 +633,7 @@ static const _mesa_glsl_extension 
> _mesa_glsl_supported_extensions[] = {
>
> /* KHR extensions go here, sorted alphabetically.
>  */
> -   EXT(KHR_blend_equation_advanced),
> +   EXT_AEP(KHR_blend_equation_advanced),
>
> /* OES extensions go here, sorted alphabetically.
>  */
> @@ -632,17 +642,17 @@ static const _mesa_glsl_extension 
> _mesa_glsl_supported_extensions[] = {
> EXT(OES_geometry_shader),
> EXT(OES_gpu_shader5),
> EXT(OES_primitive_bounding_box),
> -   EXT(OES_sample_variables),
> -   EXT(OES_shader_image_atomic),
> +   EXT_AEP(OES_sample_variables),
> +   EXT_AEP(OES_shader_image_atomic),
> EXT(OES_shader_io_blocks),
> -   EXT(OES_shader_multisample_interpolation),
> +   EXT_AEP(OES_shader_multisample_interpolation),
> EXT(OES_standard_derivatives),
> EXT(OES_tessellation_point_size),
> EXT(OES_tessellation_shader),
> EXT(OES_texture_3D),
> EXT(OES_texture_buffer),
> EXT(OES_texture_cube_map_array),
> -   EXT(OES_texture_storage_multisample_2d_array),
> +   EXT_AEP(OES_texture_storage_multisample_2d_array),
>
> /* All other extensions go here, sorted alphabetically.
>  */
> @@ -651,23 +661,24 @@ static const _mesa_glsl_extension 
> _mesa_glsl_supported_extensions[] = {
> EXT(AMD_shader_trinary_minmax),
> EXT(AMD_vertex_shader_layer),
> EXT(AMD_vertex_shader_viewport_index),
> +   EXT(ANDROID_extension_pack_es31a),
> EXT(EXT_blend_func_extended),
> EXT(EXT_draw_buffers),
> EXT(EXT_clip_cull_distance),
> EXT(EXT_geometry_point_size),
> -   EXT(EXT_geometry_shader),
> -   EXT(EXT_gpu_shader5),
> -   EXT(EXT_primitive_bounding_box),
> +   EXT_AEP(EXT_geometry_shader),
> +   EXT_AEP(EXT_gpu_shader5),
> +   EXT_AEP(EXT_primitive_bounding_box),
> EXT(EXT_separate_shader_objects),
> EXT(EXT_shader_framebuffer_fetch),
> EXT(EXT_shader_integer_mix),
> -   EXT(EXT_shader_io_blocks),
> +   EXT_AEP(EXT_shader_io_blocks),
> EXT(EXT_shader_samples_identical),
> EXT(EXT_tessellation_point_size),
> -   EXT(EXT_tessellation_shader),
> +   EXT_AEP(EXT_tessellation_shader),
> EXT(EXT_texture_array),
> -   EXT(EXT_texture_buffer),
> -   EXT(EXT_texture_cube_map_array),
> +   EXT_AEP(EXT_texture_buffer),
> +   EXT_AEP(EXT_texture_cube_map_array),
> EXT(MESA_shader_integer_functions),

I think we're missing:

KHR_texture_compression_astc_ldr
OES_texture_stencil8
EXT_copy_image
EXT_draw_buffers_indexed
EXT_texture_border_clamp

[Mesa-dev] [PATCH] mesa: fix format conversion bug in get_tex_rgba_uncompressed()

2016-08-29 Thread Brian Paul
We need to set the need_convert flag with each loop iteration, not
just when the rgba pointer is null.
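
For reviewers, a minimal sketch of the failure mode (a simplified stand-in for the per-slice loop, not the real texgetimage.c code; the function names and the 16-byte buffer are made up, and it assumes need_convert is reset for every slice):

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Old logic: need_convert is only set on the iteration that allocates
 * the temporary buffer. Returns how many slices were flagged for
 * conversion, or -1 if the allocation fails. */
static int slices_converted_buggy(int depth, bool formats_match)
{
   void *rgba = NULL;
   int converted = 0;

   for (int img = 0; img < depth; img++) {
      bool need_convert = false;

      if (formats_match) {
         /* would write straight into the destination */
      } else if (rgba == NULL) {
         need_convert = true;
         rgba = malloc(16);
         if (!rgba)
            return -1;
      }
      if (need_convert)
         converted++;
   }
   free(rgba);
   return converted;
}

/* New logic: the flag is set for every slice that needs conversion,
 * while the buffer is still allocated only once. */
static int slices_converted_fixed(int depth, bool formats_match)
{
   void *rgba = NULL;
   int converted = 0;

   for (int img = 0; img < depth; img++) {
      bool need_convert = false;

      if (!formats_match) {
         need_convert = true;
         if (rgba == NULL) {
            rgba = malloc(16);
            if (!rgba)
               return -1;
         }
      }
      if (need_convert)
         converted++;
   }
   free(rgba);
   return converted;
}
```

With a three-slice texture and mismatched formats, the old logic flags only the first slice while the new logic flags all three, which would explain why only 3D (multi-slice) reads were affected.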

Bug reported by Markus Müller on the mesa-users list.
Fixes new piglit arb_texture_float-get-tex3d test.

Cc: 
---
 src/mesa/main/texgetimage.c | 14 --
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/src/mesa/main/texgetimage.c b/src/mesa/main/texgetimage.c
index bd44c68..b900278 100644
--- a/src/mesa/main/texgetimage.c
+++ b/src/mesa/main/texgetimage.c
@@ -495,13 +495,15 @@ get_tex_rgba_uncompressed(struct gl_context *ctx, GLuint 
dimensions,
   */
  if (format == rgba_format) {
 rgba = dest;
- } else if (rgba == NULL) { /* Allocate the RGBA buffer only once */
+ } else {
 need_convert = true;
-rgba = malloc(height * rgba_stride);
-if (!rgba) {
-   _mesa_error(ctx, GL_OUT_OF_MEMORY, "glGetTexImage()");
-   ctx->Driver.UnmapTextureImage(ctx, texImage, img);
-   return;
+if (rgba == NULL) { /* Allocate the RGBA buffer only once */
+   rgba = malloc(height * rgba_stride);
+   if (!rgba) {
+  _mesa_error(ctx, GL_OUT_OF_MEMORY, "glGetTexImage()");
+  ctx->Driver.UnmapTextureImage(ctx, texImage, img);
+  return;
+   }
 }
  }
 
-- 
1.9.1



Re: [Mesa-dev] [PATCH 11/20] radeonsi: fix texture format reinterpretation with DCC

2016-08-29 Thread Bas Nieuwenhuizen
Hi Marek,

I don't think this accounts for the fast clear bits? unorm<->uint and
snorm<->sint should have compatible clear values, but otherwise we may
need to eliminate the fast clears.

Yours sincerely,
Bas Nieuwenhuizen


On Mon, Aug 29, 2016 at 5:28 PM, Marek Olšák  wrote:
> From: Marek Olšák 
>
> DCC is limited in how texture formats can be reinterpreted using texture
> views. If we get a view format that is incompatible with the initial
> texture format with respect to DCC, disable DCC.
>
> There is a new piglit which tests all format combinations.
> What works and what doesn't was deduced by looking at the piglit failures.
> ---
>  src/gallium/drivers/radeon/r600_pipe_common.h |  6 ++
>  src/gallium/drivers/radeon/r600_texture.c | 96 
> +++
>  src/gallium/drivers/radeonsi/si_blit.c|  8 +++
>  src/gallium/drivers/radeonsi/si_descriptors.c |  3 +-
>  src/gallium/drivers/radeonsi/si_state.c   |  4 ++
>  5 files changed, 116 insertions(+), 1 deletion(-)
>
> diff --git a/src/gallium/drivers/radeon/r600_pipe_common.h 
> b/src/gallium/drivers/radeon/r600_pipe_common.h
> index 1924535..624dea3 100644
> --- a/src/gallium/drivers/radeon/r600_pipe_common.h
> +++ b/src/gallium/drivers/radeon/r600_pipe_common.h
> @@ -750,20 +750,26 @@ void r600_texture_get_fmask_info(struct 
> r600_common_screen *rscreen,
>  struct r600_fmask_info *out);
>  void r600_texture_get_cmask_info(struct r600_common_screen *rscreen,
>  struct r600_texture *rtex,
>  struct r600_cmask_info *out);
>  bool r600_init_flushed_depth_texture(struct pipe_context *ctx,
>  struct pipe_resource *texture,
>  struct r600_texture **staging);
>  void r600_print_texture_info(struct r600_texture *rtex, FILE *f);
>  struct pipe_resource *r600_texture_create(struct pipe_screen *screen,
> const struct pipe_resource *templ);
> +bool vi_dcc_formats_compatible(enum pipe_format format1,
> +  enum pipe_format format2);
> +void vi_dcc_disable_if_incompatible_format(struct r600_common_context *rctx,
> +  struct pipe_resource *tex,
> +  unsigned level,
> +  enum pipe_format view_format);
>  struct pipe_surface *r600_create_surface_custom(struct pipe_context *pipe,
> struct pipe_resource *texture,
> const struct pipe_surface 
> *templ,
> unsigned width, unsigned 
> height);
>  unsigned r600_translate_colorswap(enum pipe_format format, bool 
> do_endian_swap);
>  void vi_separate_dcc_start_query(struct pipe_context *ctx,
>  struct r600_texture *tex);
>  void vi_separate_dcc_stop_query(struct pipe_context *ctx,
> struct r600_texture *tex);
>  void vi_separate_dcc_process_and_reset_stats(struct pipe_context *ctx,
> diff --git a/src/gallium/drivers/radeon/r600_texture.c 
> b/src/gallium/drivers/radeon/r600_texture.c
> index 7bdceb1..2f04019 100644
> --- a/src/gallium/drivers/radeon/r600_texture.c
> +++ b/src/gallium/drivers/radeon/r600_texture.c
> @@ -1659,42 +1659,138 @@ static void r600_texture_transfer_unmap(struct 
> pipe_context *ctx,
>
>  static const struct u_resource_vtbl r600_texture_vtbl =
>  {
> NULL,   /* get_handle */
> r600_texture_destroy,   /* resource_destroy */
> r600_texture_transfer_map,  /* transfer_map */
> u_default_transfer_flush_region, /* transfer_flush_region */
> r600_texture_transfer_unmap,/* transfer_unmap */
>  };
>
> +/* DCC channel type categories within which formats can be reinterpreted
> + * while keeping the same DCC encoding. The swizzle must also match. */
> +enum dcc_channel_type {
> +   dcc_channel_any32,
> +   dcc_channel_int16,
> +   dcc_channel_float16,
> +   dcc_channel_any_10_10_10_2,
> +   dcc_channel_any8,
> +   dcc_channel_incompatible,
> +};
> +
> +/* Return the type of DCC encoding. */
> +static enum dcc_channel_type
> +vi_get_dcc_channel_type(const struct util_format_description *desc)
> +{
> +   int i;
> +
> +   /* Find the first non-void channel. */
> +   for (i = 0; i < desc->nr_channels; i++)
> +   if (desc->channel[i].type != UTIL_FORMAT_TYPE_VOID)
> +   break;
> +   if (i == desc->nr_channels)
> +   return dcc_channel_incompatible;
> +
> +   switch (desc->channel[i].size) {
> +   case 32:
> +   if (desc->nr_channels == 4)
> +   return dcc_channel_incompatible;
> +   else

Re: [Mesa-dev] [PATCH 6/7] mesa: add a GLES3.2 enums section, and expose new MS line width params

2016-08-29 Thread Ilia Mirkin
On Sun, Aug 28, 2016 at 10:10 PM, Ilia Mirkin  wrote:
> This also exposes them for ARB_ES3_2_compatibility.
>
> Signed-off-by: Ilia Mirkin 
> ---
>  src/mesa/main/context.h | 10 ++
>  src/mesa/main/get.c | 26 --
>  src/mesa/main/get_hash_generator.py | 12 
>  src/mesa/main/get_hash_params.py|  5 +
>  4 files changed, 43 insertions(+), 10 deletions(-)
>
> diff --git a/src/mesa/main/context.h b/src/mesa/main/context.h
> index 4cd149d..520b3bb 100644
> --- a/src/mesa/main/context.h
> +++ b/src/mesa/main/context.h
> @@ -318,6 +318,16 @@ _mesa_is_gles31(const struct gl_context *ctx)
>
>
>  /**
> + * Checks if the context is for GLES 3.2 or later
> + */
> +static inline bool
> +_mesa_is_gles32(const struct gl_context *ctx)
> +{
> +   return ctx->API == API_OPENGLES2 && ctx->Version >= 32;
> +}
> +
> +
> +/**
>   * Checks if the context supports geometry shaders.
>   */
>  static inline bool
> diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c
> index 810ccb9..3cabb2b 100644
> --- a/src/mesa/main/get.c
> +++ b/src/mesa/main/get.c
> @@ -142,6 +142,7 @@ enum value_extra {
> EXTRA_API_ES2,
> EXTRA_API_ES3,
> EXTRA_API_ES31,
> +   EXTRA_API_ES32,
> EXTRA_NEW_BUFFERS,
> EXTRA_NEW_FRAG_CLAMP,
> EXTRA_VALID_DRAW_BUFFER,
> @@ -416,6 +417,12 @@ static const int 
> extra_ARB_gpu_shader5_or_OES_sample_variables[] = {
> EXTRA_END
>  };
>
> +static const int extra_ES32[] = {
> +   EXT(ARB_ES3_2_compatibility),
> +   EXTRA_API_ES32,
> +   EXTRA_END
> +};
> +
>  EXTRA_EXT(ARB_texture_cube_map);
>  EXTRA_EXT(EXT_texture_array);
>  EXTRA_EXT(NV_fog_distance);
> @@ -1164,6 +1171,11 @@ check_extra(struct gl_context *ctx, const char *func, 
> const struct value_desc *d
>   if (_mesa_is_gles31(ctx))
>  api_found = GL_TRUE;
>  break;
> +  case EXTRA_API_ES32:
> + api_check = GL_TRUE;
> + if (_mesa_is_gles32(ctx))
> +api_found = GL_TRUE;
> +break;
>case EXTRA_API_GL:
>   api_check = GL_TRUE;
>   if (_mesa_is_desktop_gl(ctx))
> @@ -1312,12 +1324,14 @@ find_value(const char *func, GLenum pname, void **p, 
> union value *v)
>  * value since it's compatible with GLES2 its entry in table_set[] is at 
> the
>  * end.
>  */
> -   STATIC_ASSERT(ARRAY_SIZE(table_set) == API_OPENGL_LAST + 3);
> -   if (_mesa_is_gles3(ctx)) {
> -  api = API_OPENGL_LAST + 1;
> -   }
> -   if (_mesa_is_gles31(ctx)) {
> -  api = API_OPENGL_LAST + 2;
> +   STATIC_ASSERT(ARRAY_SIZE(table_set) == API_OPENGL_LAST + 4);
> +   if (ctx->API == API_OPENGLES2) {
> +  if (ctx->Version >= 32)
> + api = API_OPENGL_LAST + 3;
> +  else if (ctx->Version >= 31)
> + api = API_OPENGL_LAST + 2;
> +  else if (ctx->Version >= 30)
> + api = API_OPENGL_LAST + 1;
> }
> mask = ARRAY_SIZE(table(api)) - 1;
> hash = (pname * prime_factor);
> diff --git a/src/mesa/main/get_hash_generator.py 
> b/src/mesa/main/get_hash_generator.py
> index c777b78..a8b4647 100644
> --- a/src/mesa/main/get_hash_generator.py
> +++ b/src/mesa/main/get_hash_generator.py
> @@ -44,7 +44,7 @@ prime_factor = 89
>  prime_step = 281
>  hash_table_size = 1024
>
> -gl_apis=set(["GL", "GL_CORE", "GLES", "GLES2", "GLES3", "GLES31"])
> +gl_apis=set(["GL", "GL_CORE", "GLES", "GLES2", "GLES3", "GLES31", "GLES32"])
>
>  def print_header():
> print "typedef const unsigned short table_t[%d];\n" % (hash_table_size)
> @@ -69,6 +69,7 @@ api_enum = [
> 'GL_CORE',
> 'GLES3', # Not in gl_api enum in mtypes.h
> 'GLES31', # Not in gl_api enum in mtypes.h
> +   'GLES32', # Not in gl_api enum in mtypes.h
>  ]
>
>  def api_index(api):
> @@ -168,13 +169,15 @@ def generate_hash_tables(enum_list, enabled_apis, 
> param_descriptors):
>
>   for api in valid_apis:
>  add_to_hash_table(tables[api], hash_val, len(params))
> -# Also add GLES2 items to the GLES3 and GLES31 hash table
> +# Also add GLES2 items to the GLES3+ hash tables
>  if api == "GLES2":
> add_to_hash_table(tables["GLES3"], hash_val, len(params))
> add_to_hash_table(tables["GLES31"], hash_val, len(params))
> -# Also add GLES3 items to the GLES31 hash table
> +   add_to_hash_table(tables["GLES32"], hash_val, len(params))
> +# Also add GLES3 items to the GLES31+ hash tables
>  if api == "GLES3":
> add_to_hash_table(tables["GLES31"], hash_val, len(params))
> +   add_to_hash_table(tables["GLES32"], hash_val, len(params))

This is missing:

+if api == "GLES31":
+   add_to_hash_table(tables["GLES32"], hash_val, len(params))

Oops.
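
The intended cascade, sketched in C rather than in the generator's Python (the enum and function names here are made up, and desktop GL tables are left out): an enum declared for one ES level has to land in the table of that level and of every later level.

```c
/* Hypothetical ordering of the ES hash tables, oldest first. */
enum es_table_sketch {
   SK_GLES2,
   SK_GLES3,
   SK_GLES31,
   SK_GLES32,
   SK_TABLE_COUNT
};

/* Returns a bitmask of the tables that an enum declared for `api`
 * should be added to: its own table plus all later ones. */
static unsigned tables_for(enum es_table_sketch api)
{
   unsigned mask = 0;

   for (int t = api; t < SK_TABLE_COUNT; t++)
      mask |= 1u << t;
   return mask;
}
```

Without the GLES31 case quoted above, the generator behaves as if GLES31 enums mapped only to their own table, so a GLES 3.2 context would not find them.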

>   params.append(["GL_" + enum_name, param[1]])
>
> sorted_tables={}
> @@ -210,7 +213,8 @@ if __name__ == '__main__':
>

Re: [Mesa-dev] [PATCH v2 1/2] gallium: add cap to export device pointer size

2016-08-29 Thread Ilia Mirkin
On Mon, Aug 29, 2016 at 2:19 PM, Jan Vesely  wrote:
> On Mon, 2016-08-29 at 14:47 +0200, Marek Olšák wrote:
>> On Sun, Aug 28, 2016 at 7:57 PM, Jan Vesely 
>> wrote:
>> >
>> > v2: document the new cap
>> >
>> > Signed-off-by: Jan Vesely 
>> > ---
>> >  src/gallium/docs/source/screen.rst | 1 +
>> >  src/gallium/drivers/ilo/ilo_screen.c   | 6 ++
>> >  src/gallium/drivers/nouveau/nv50/nv50_screen.c | 2 ++
>> >  src/gallium/drivers/nouveau/nvc0/nvc0_screen.c | 2 ++
>> >  src/gallium/drivers/radeon/r600_pipe_common.c  | 8 
>> >  src/gallium/drivers/softpipe/sp_screen.c   | 1 +
>> >  src/gallium/include/pipe/p_defines.h   | 1 +
>> >  7 files changed, 21 insertions(+)
>> >
>> > diff --git a/src/gallium/docs/source/screen.rst
>> > b/src/gallium/docs/source/screen.rst
>> > index c00d012..8c67604 100644
>> > --- a/src/gallium/docs/source/screen.rst
>> > +++ b/src/gallium/docs/source/screen.rst
>> > @@ -496,6 +496,7 @@ pipe_screen::get_compute_param.
>> >non-zero means yes, zero means no. Value type: ``uint32_t``
>> >  * ``PIPE_COMPUTE_CAP_SUBGROUP_SIZE``: The size of a basic
>> > execution unit in
>> >threads. Also known as wavefront size, warp size or SIMD width.
>> > +* ``PIPE_COMPUTE_CAP_ADDRESS_BITS``: The default compute device
>> > address space size specified as an unsigned integer value in bits.
>>
>> There is the 80 chars per line limit in this file.
>
> I'll fix that locally before pushing.
> @Ilia, are you OK with the nouveau bits?

Acked-by: Ilia Mirkin 

It's a bit academic for nouveau, as there's no opencl support, but
perhaps hansg will finish that work some day. We can figure out what
the right thing is then, in case it's not what's in your patch today.

  -ilia


Re: [Mesa-dev] [PATCH v2 1/2] gallium: add cap to export device pointer size

2016-08-29 Thread Jan Vesely
On Mon, 2016-08-29 at 14:47 +0200, Marek Olšák wrote:
> On Sun, Aug 28, 2016 at 7:57 PM, Jan Vesely 
> wrote:
> > 
> > v2: document the new cap
> > 
> > Signed-off-by: Jan Vesely 
> > ---
> >  src/gallium/docs/source/screen.rst | 1 +
> >  src/gallium/drivers/ilo/ilo_screen.c   | 6 ++
> >  src/gallium/drivers/nouveau/nv50/nv50_screen.c | 2 ++
> >  src/gallium/drivers/nouveau/nvc0/nvc0_screen.c | 2 ++
> >  src/gallium/drivers/radeon/r600_pipe_common.c  | 8 
> >  src/gallium/drivers/softpipe/sp_screen.c   | 1 +
> >  src/gallium/include/pipe/p_defines.h   | 1 +
> >  7 files changed, 21 insertions(+)
> > 
> > diff --git a/src/gallium/docs/source/screen.rst
> > b/src/gallium/docs/source/screen.rst
> > index c00d012..8c67604 100644
> > --- a/src/gallium/docs/source/screen.rst
> > +++ b/src/gallium/docs/source/screen.rst
> > @@ -496,6 +496,7 @@ pipe_screen::get_compute_param.
> >    non-zero means yes, zero means no. Value type: ``uint32_t``
> >  * ``PIPE_COMPUTE_CAP_SUBGROUP_SIZE``: The size of a basic
> > execution unit in
> >    threads. Also known as wavefront size, warp size or SIMD width.
> > +* ``PIPE_COMPUTE_CAP_ADDRESS_BITS``: The default compute device
> > address space size specified as an unsigned integer value in bits.
> 
> There is the 80 chars per line limit in this file.

I'll fix that locally before pushing.
@Ilia, are you OK with the nouveau bits?


> 
> Other than that, the patch is:
> 
> Reviewed-by: Marek Olšák 

thanks,
Jan

> 
> Marek
-- 
Jan Vesely 



Re: [Mesa-dev] [PATCH 1/3] noop: simplify some functions

2016-08-29 Thread Marek Olšák
On Mon, Aug 29, 2016 at 7:23 PM, Emil Velikov  wrote:
> On 29 August 2016 at 16:29, Marek Olšák  wrote:
>> From: Marek Olšák 
>>
>> ---
>>  src/gallium/drivers/noop/noop_state.c | 56 
>> +--
>>  1 file changed, 7 insertions(+), 49 deletions(-)
>>
>> diff --git a/src/gallium/drivers/noop/noop_state.c 
>> b/src/gallium/drivers/noop/noop_state.c
>> index 0c0ad9f..01538bfe27 100644
>> --- a/src/gallium/drivers/noop/noop_state.c
>> +++ b/src/gallium/drivers/noop/noop_state.c
>> @@ -35,63 +35,39 @@ static void noop_draw_vbo(struct pipe_context *ctx, 
>> const struct pipe_draw_info
>>  }
>>
>>  static void noop_set_blend_color(struct pipe_context *ctx,
>>   const struct pipe_blend_color *state)
>>  {
>>  }
>>
>>  static void *noop_create_blend_state(struct pipe_context *ctx,
>>   const struct pipe_blend_state *state)
>>  {
>> -   struct pipe_blend_state *nstate = CALLOC_STRUCT(pipe_blend_state);
>> -
>> -   if (!nstate) {
>> -  return NULL;
>> -   }
>> -   *nstate = *state;
>> -   return nstate;
>> +   return malloc(1);
> You want to use [CM]ALLOC to match the FREE.
>
> Mildly related:
> Is the allocator going to do something special for the 1 byte case or
> is it going to treat it like any other "smaller than bucket size"?

I have no idea, and I'm not concerned about that.

Marek
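
For anyone skimming the thread, the "[CM]ALLOC to match the FREE" remark can be illustrated with a small standalone sketch. The demo_malloc()/demo_free() wrappers below are hypothetical stand-ins, not Mesa's actual MALLOC/CALLOC/FREE macros; they only assume that a wrapped allocator may do bookkeeping, which is why a raw malloc() paired with a wrapped free would get out of sync.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical allocator wrappers (NOT Mesa's real macros).
 * They count live allocations, as a debugging allocator might. */
static long demo_live_allocations = 0;

static void *demo_malloc(size_t size)
{
   void *ptr = malloc(size);
   if (ptr)
      demo_live_allocations++;
   return ptr;
}

static void demo_free(void *ptr)
{
   if (ptr)
      demo_live_allocations--;
   free(ptr);
}

/* A driver that stores no state still has to hand back a unique,
 * non-NULL handle; one byte from the wrapped allocator is enough. */
static void *demo_create_state(void)
{
   return demo_malloc(1);
}

/* The delete hook uses the matching wrapper, so the count balances. */
static void demo_delete_state(void *state)
{
   demo_free(state);
}

/* Usage: create and destroy one handle and check the bookkeeping.
 * The first assert aborts if the 1-byte allocation itself fails. */
static void demo_allocator_example(void)
{
   void *state = demo_create_state();
   assert(state != NULL);
   assert(demo_live_allocations == 1);
   demo_delete_state(state);
   assert(demo_live_allocations == 0);
}
```

Had demo_create_state() called plain malloc(1), the counter would go negative on delete, which is the kind of mismatch the review comment is pointing at.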


Re: [Mesa-dev] [PATCH] i915g: Fix typo in i915_translate_instruction()

2016-08-29 Thread Eric Anholt
Echelon9  writes:

> From: Rhys Kidd 
>
> Noticed this error in a debug message whilst reviewing
> https://bugs.freedesktop.org/show_bug.cgi?id=97477
>
> This patch doesn't go towards fixing that bug, but at
> least may clarify future debug output.
>
> Signed-off-by: Rhys Kidd 

Reviewed and pushed.  Thanks!




Re: [Mesa-dev] [PATCH 0/2] r600g: Pair of small code clean ups with TGSI

2016-08-29 Thread Eric Anholt
Rhys Kidd  writes:

> Having run Mesa through Clang on Eric Anholt's Travis harness, these small
> code clean ups improve readability of TGSI code in r600g and may avoid
> future problems.
>
> Series also can be found at:
> https://github.com/Echelon9/mesa/tree/fix/r600g-cleanup-tgsi-opcodes
>
> I don't have the hardware to test this so would appreciate Tested-by's.
>
> I do not have commit rights to fd.o so after R-B would appreciate if
> the reviewer(s) could push to master.

Applied the tags and pushed.  Thanks!




[Mesa-dev] [PATCH] clover: probe pipe for the value of CL_DEVICE_ADDRESS_BITS

2016-08-29 Thread Vedran Miletić
Clover presently reports 32 as CL_DEVICE_ADDRESS_BITS, which is not
correct for AMD SI and newer chip generations. This patch introduces
the PIPE_COMPUTE_CAP_ADDRESS_BITS pipe capability, which Clover now
queries, and adds it to the r600 pipe: the value is 32 for AMD EG/NI
chips and 64 for SI and newer (chips older than EG will not use this
capability).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97513
Signed-off-by: Vedran Miletić 
---
 src/gallium/drivers/radeon/r600_pipe_common.c | 9 +
 src/gallium/include/pipe/p_defines.h  | 3 ++-
 src/gallium/state_trackers/clover/api/device.cpp  | 2 +-
 src/gallium/state_trackers/clover/core/device.cpp | 6 ++
 src/gallium/state_trackers/clover/core/device.hpp | 1 +
 5 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/radeon/r600_pipe_common.c 
b/src/gallium/drivers/radeon/r600_pipe_common.c
index b1da22f..7385715 100644
--- a/src/gallium/drivers/radeon/r600_pipe_common.c
+++ b/src/gallium/drivers/radeon/r600_pipe_common.c
@@ -955,6 +955,15 @@ static int r600_get_compute_param(struct pipe_screen 
*screen,
*subgroup_size = r600_wavefront_size(rscreen->family);
}
return sizeof(uint32_t);
+case PIPE_COMPUTE_CAP_ADDRESS_BITS:
+if (ret) {
+uint32_t *address_bits = ret;
+if (rscreen->chip_class >= SI)
+*address_bits = 64;
+else
+*address_bits = 32;
+}
+return sizeof(uint32_t);
}
 
 fprintf(stderr, "unknown PIPE_COMPUTE_CAP %d\n", param);
diff --git a/src/gallium/include/pipe/p_defines.h 
b/src/gallium/include/pipe/p_defines.h
index 1e4d802..93e30e8 100644
--- a/src/gallium/include/pipe/p_defines.h
+++ b/src/gallium/include/pipe/p_defines.h
@@ -847,7 +847,8 @@ enum pipe_compute_cap
PIPE_COMPUTE_CAP_MAX_CLOCK_FREQUENCY,
PIPE_COMPUTE_CAP_MAX_COMPUTE_UNITS,
PIPE_COMPUTE_CAP_IMAGES_SUPPORTED,
-   PIPE_COMPUTE_CAP_SUBGROUP_SIZE
+   PIPE_COMPUTE_CAP_SUBGROUP_SIZE,
+   PIPE_COMPUTE_CAP_ADDRESS_BITS
 };
 
 /**
diff --git a/src/gallium/state_trackers/clover/api/device.cpp 
b/src/gallium/state_trackers/clover/api/device.cpp
index 11f21e9..f7bd61b 100644
--- a/src/gallium/state_trackers/clover/api/device.cpp
+++ b/src/gallium/state_trackers/clover/api/device.cpp
@@ -158,7 +158,7 @@ clGetDeviceInfo(cl_device_id d_dev, cl_device_info param,
   break;
 
case CL_DEVICE_ADDRESS_BITS:
-  buf.as_scalar() = 32;
+  buf.as_scalar() = dev.address_bits();
   break;
 
case CL_DEVICE_MAX_READ_IMAGE_ARGS:
diff --git a/src/gallium/state_trackers/clover/core/device.cpp 
b/src/gallium/state_trackers/clover/core/device.cpp
index 39f39f4..8825f99 100644
--- a/src/gallium/state_trackers/clover/core/device.cpp
+++ b/src/gallium/state_trackers/clover/core/device.cpp
@@ -193,6 +193,12 @@ device::subgroup_size() const {
   PIPE_COMPUTE_CAP_SUBGROUP_SIZE)[0];
 }
 
+cl_uint
+device::address_bits() const {
+   return get_compute_param(pipe, ir_format(),
+  PIPE_COMPUTE_CAP_ADDRESS_BITS)[0];
+}
+
 std::string
 device::device_name() const {
return pipe->get_name(pipe);
diff --git a/src/gallium/state_trackers/clover/core/device.hpp 
b/src/gallium/state_trackers/clover/core/device.hpp
index 2857847..6cf6c7f 100644
--- a/src/gallium/state_trackers/clover/core/device.hpp
+++ b/src/gallium/state_trackers/clover/core/device.hpp
@@ -68,6 +68,7 @@ namespace clover {
 
   std::vector max_block_size() const;
   cl_uint subgroup_size() const;
+  cl_uint address_bits() const;
   std::string device_name() const;
   std::string vendor_name() const;
   enum pipe_shader_ir ir_format() const;
-- 
2.7.4
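
The query pattern used in the r600 hunk above (write the value only when a buffer is supplied, always return the value's size) can be sketched outside of Mesa as follows. All demo_* names and the chip-class enum are hypothetical; only the 32/64 split between EG/NI and SI-and-newer comes from the patch description.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the driver's cap and chip-class enums. */
enum demo_compute_cap {
   DEMO_CAP_SUBGROUP_SIZE,
   DEMO_CAP_ADDRESS_BITS
};

enum demo_chip_class {
   DEMO_CHIP_EVERGREEN,
   DEMO_CHIP_NI,
   DEMO_CHIP_SI
};

/* Returns the size in bytes of the answer; fills 'ret' only if non-NULL.
 * Caps this sketch does not know about report a size of 0. */
static int demo_get_compute_param(enum demo_chip_class chip_class,
                                  enum demo_compute_cap param, void *ret)
{
   switch (param) {
   case DEMO_CAP_ADDRESS_BITS:
      if (ret) {
         uint32_t *address_bits = ret;
         *address_bits = chip_class >= DEMO_CHIP_SI ? 64 : 32;
      }
      return (int)sizeof(uint32_t);
   default:
      return 0;
   }
}

/* Caller side: ask for the size first, then fetch the value. */
static uint32_t demo_address_bits(enum demo_chip_class chip_class)
{
   uint32_t bits = 0;
   int size = demo_get_compute_param(chip_class, DEMO_CAP_ADDRESS_BITS, NULL);

   if (size == (int)sizeof(bits))
      demo_get_compute_param(chip_class, DEMO_CAP_ADDRESS_BITS, &bits);
   return bits;
}

/* Usage: EG/NI report 32-bit addressing, SI reports 64-bit. */
static void demo_address_bits_example(void)
{
   assert(demo_address_bits(DEMO_CHIP_EVERGREEN) == 32);
   assert(demo_address_bits(DEMO_CHIP_NI) == 32);
   assert(demo_address_bits(DEMO_CHIP_SI) == 64);
}
```

The two-step call mirrors how a state tracker can size its result buffer before reading it; whether Clover does exactly this internally is not shown in the patch.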



Re: [Mesa-dev] [PATCH] svga: s/unsigned/enum pipe_shader_type/

2016-08-29 Thread Charmaine Lee

Reviewed-by:  Charmaine Lee 

From: Brian Paul 
Sent: Monday, August 29, 2016 9:16:09 AM
To: mesa-dev@lists.freedesktop.org
Cc: Neha Bhende; Charmaine Lee
Subject: [PATCH] svga: s/unsigned/enum pipe_shader_type/


Re: [Mesa-dev] [PATCH 1/3] noop: simplify some functions

2016-08-29 Thread Emil Velikov
On 29 August 2016 at 16:29, Marek Olšák  wrote:
> From: Marek Olšák 
>
> ---
>  src/gallium/drivers/noop/noop_state.c | 56 
> +--
>  1 file changed, 7 insertions(+), 49 deletions(-)
>
> diff --git a/src/gallium/drivers/noop/noop_state.c 
> b/src/gallium/drivers/noop/noop_state.c
> index 0c0ad9f..01538bfe27 100644
> --- a/src/gallium/drivers/noop/noop_state.c
> +++ b/src/gallium/drivers/noop/noop_state.c
> @@ -35,63 +35,39 @@ static void noop_draw_vbo(struct pipe_context *ctx, const 
> struct pipe_draw_info
>  }
>
>  static void noop_set_blend_color(struct pipe_context *ctx,
>   const struct pipe_blend_color *state)
>  {
>  }
>
>  static void *noop_create_blend_state(struct pipe_context *ctx,
>   const struct pipe_blend_state *state)
>  {
> -   struct pipe_blend_state *nstate = CALLOC_STRUCT(pipe_blend_state);
> -
> -   if (!nstate) {
> -  return NULL;
> -   }
> -   *nstate = *state;
> -   return nstate;
> +   return malloc(1);
You want to use [CM]ALLOC to match the FREE.

Mildly related:
Is the allocator going to do something special for the 1-byte case, or
is it going to treat it like any other "smaller than bucket size" request?

-Emil


[Mesa-dev] [PATCH] svga: s/unsigned/enum pipe_shader_type/

2016-08-29 Thread Brian Paul
---
 src/gallium/drivers/svga/svga_draw.c|  4 ++--
 src/gallium/drivers/svga/svga_pipe_sampler.c|  2 +-
 src/gallium/drivers/svga/svga_sampler_view.h|  2 +-
 src/gallium/drivers/svga/svga_shader.c  |  3 ++-
 src/gallium/drivers/svga/svga_shader.h  |  5 +++--
 src/gallium/drivers/svga/svga_state_constants.c | 12 ++--
 src/gallium/drivers/svga/svga_state_fs.c|  2 +-
 src/gallium/drivers/svga/svga_state_sampler.c   |  6 +++---
 src/gallium/drivers/svga/svga_state_tss.c   |  6 +++---
 src/gallium/drivers/svga/svga_state_vs.c|  2 +-
 src/gallium/drivers/svga/svga_surface.c |  2 +-
 11 files changed, 24 insertions(+), 22 deletions(-)

diff --git a/src/gallium/drivers/svga/svga_draw.c 
b/src/gallium/drivers/svga/svga_draw.c
index 9e0dfe5..f8d3ae5 100644
--- a/src/gallium/drivers/svga/svga_draw.c
+++ b/src/gallium/drivers/svga/svga_draw.c
@@ -311,7 +311,7 @@ xlate_index_format(unsigned indexWidth)
 static enum pipe_error
 validate_sampler_resources(struct svga_context *svga)
 {
-   unsigned shader;
+   enum pipe_shader_type shader;
 
assert(svga_have_vgpu10(svga));
 
@@ -376,7 +376,7 @@ validate_sampler_resources(struct svga_context *svga)
 static enum pipe_error
 validate_constant_buffers(struct svga_context *svga)
 {
-   unsigned shader;
+   enum pipe_shader_type shader;
 
assert(svga_have_vgpu10(svga));
 
diff --git a/src/gallium/drivers/svga/svga_pipe_sampler.c 
b/src/gallium/drivers/svga/svga_pipe_sampler.c
index 4a2b3c3..5d7af70 100644
--- a/src/gallium/drivers/svga/svga_pipe_sampler.c
+++ b/src/gallium/drivers/svga/svga_pipe_sampler.c
@@ -529,7 +529,7 @@ done:
 void
 svga_cleanup_sampler_state(struct svga_context *svga)
 {
-   unsigned shader;
+   enum pipe_shader_type shader;
 
if (!svga_have_vgpu10(svga))
   return;
diff --git a/src/gallium/drivers/svga/svga_sampler_view.h 
b/src/gallium/drivers/svga/svga_sampler_view.h
index b36f089..7521a82 100644
--- a/src/gallium/drivers/svga/svga_sampler_view.h
+++ b/src/gallium/drivers/svga/svga_sampler_view.h
@@ -102,7 +102,7 @@ svga_sampler_view_reference(struct svga_sampler_view **ptr, 
struct svga_sampler_
 boolean
 svga_check_sampler_view_resource_collision(struct svga_context *svga,
struct svga_winsys_surface *res,
-   unsigned shader);
+   enum pipe_shader_type shader);
 
 boolean
 svga_check_sampler_framebuffer_resource_collision(struct svga_context *svga,
diff --git a/src/gallium/drivers/svga/svga_shader.c 
b/src/gallium/drivers/svga/svga_shader.c
index 9ba6055..55f7922 100644
--- a/src/gallium/drivers/svga/svga_shader.c
+++ b/src/gallium/drivers/svga/svga_shader.c
@@ -166,7 +166,8 @@ svga_remap_generic_index(int8_t 
remap_table[MAX_GENERIC_VARYING],
  * state.  This is basically the texture-related state.
  */
 void
-svga_init_shader_key_common(const struct svga_context *svga, unsigned shader,
+svga_init_shader_key_common(const struct svga_context *svga,
+enum pipe_shader_type shader,
 struct svga_compile_key *key)
 {
unsigned i, idx = 0;
diff --git a/src/gallium/drivers/svga/svga_shader.h 
b/src/gallium/drivers/svga/svga_shader.h
index b53a4bf..ec116c0 100644
--- a/src/gallium/drivers/svga/svga_shader.h
+++ b/src/gallium/drivers/svga/svga_shader.h
@@ -253,7 +253,8 @@ svga_remap_generic_index(int8_t 
remap_table[MAX_GENERIC_VARYING],
  int generic_index);
 
 void
-svga_init_shader_key_common(const struct svga_context *svga, unsigned shader,
+svga_init_shader_key_common(const struct svga_context *svga,
+enum pipe_shader_type shader,
 struct svga_compile_key *key);
 
 struct svga_shader_variant *
@@ -310,7 +311,7 @@ svga_shader_too_large(const struct svga_context *svga,
  * Convert from PIPE_SHADER_* to SVGA3D_SHADERTYPE_*
  */
 static inline SVGA3dShaderType
-svga_shader_type(unsigned shader)
+svga_shader_type(enum pipe_shader_type shader)
 {
switch (shader) {
case PIPE_SHADER_VERTEX:
diff --git a/src/gallium/drivers/svga/svga_state_constants.c 
b/src/gallium/drivers/svga/svga_state_constants.c
index 8784f47..dc80edf 100644
--- a/src/gallium/drivers/svga/svga_state_constants.c
+++ b/src/gallium/drivers/svga/svga_state_constants.c
@@ -65,7 +65,7 @@
 static unsigned
 svga_get_extra_constants_common(struct svga_context *svga,
 const struct svga_shader_variant *variant,
-unsigned shader, float *dest)
+enum pipe_shader_type shader, float *dest)
 {
uint32_t *dest_u = (uint32_t *) dest;  // uint version of dest
unsigned i;
@@ -271,7 +271,7 @@ svga_get_extra_gs_constants(struct svga_context *svga, 
float *dest)
  * \param value  the new float[4] value
  */
 static enum pipe_error
-emit_const(struct 

Re: [Mesa-dev] [PATCH v2] configure.ac: add llvm inteljitevents component if enabled

2016-08-29 Thread Emil Velikov
On 26 August 2016 at 19:24, Tim Rowley  wrote:
> Needed to successfully link llvmpipe or swr when using shared llvm libs
> built with inteljitevents enabled.
>
> v2: Make adding inteljitevents component global rather than just
> llvmpipe/swr, since libgallium will have a symbol dependency.
Cc: 
Reviewed-by: Emil Velikov 

Side note: r300 and nv30 (part of nouveau) can use the llvm paths of
libgallium for some 'corner' cases.

-Emil


Re: [Mesa-dev] [PATCH] automake: egl: Android: Add libEGL dependencies

2016-08-29 Thread Emil Velikov
On 29 August 2016 at 05:37, Tapani Pälli  wrote:
>
>
> On 08/26/2016 03:58 PM, Emil Velikov wrote:
>>
>> On 26 August 2016 at 08:50, Tapani Pälli  wrote:
>>>
>>> Reviewed-by: Tapani Pälli 
>>>
>> What happened with my suggestion about getting things fixed as opposed
>> to adding tape over things, namely this thread [1]?
>> Can someone please look into that one instead or give me some tips on how
>> I can get things into AOSP? Last time I looked, AOSP had longer and
>> more convoluted procedures than anything in the Linux graphics stack.
>
>
> I'm not sure when this kind of 'big change' would happen. Wouldn't it be nice
> to have a working solution now and then discuss a better solution in peace?
>
I hope I'm wrong, but I doubt anyone has had the time/chance/will to
pursue the proposed solution. As such, pushing this 'hack' will not
increase the incentive/chances of resolving this properly.

> (trying to reduce the amount of patches that have to be applied to get
> things working/built)
>
Ack and thank you for that. I believe the overall goal should be to
resolve things in a 'good enough for upstreaming' manner, rather than
just pushing the first solution that comes to mind, similar to how it
was done with the other CrOS-inspired patches.

Thanks
Emil


[Mesa-dev] [PATCH] intel/blorp: Inline get_vs_entry_size into emit_urb_config

2016-08-29 Thread Jason Ekstrand
Topi asked to have the prefix removed because there's nothing gen7 about
it.  However, now that everything is in a single file, there is no good
reason to have it split out into a helper function anyway.  Let's just put
the contents in emit_urb_config and call it a day.
---
 src/intel/blorp/blorp_genX_exec.h | 41 +--
 1 file changed, 18 insertions(+), 23 deletions(-)

diff --git a/src/intel/blorp/blorp_genX_exec.h 
b/src/intel/blorp/blorp_genX_exec.h
index f44076e..a39c1ae 100644
--- a/src/intel/blorp/blorp_genX_exec.h
+++ b/src/intel/blorp/blorp_genX_exec.h
@@ -113,28 +113,6 @@ __gen_combine_address(struct blorp_batch *batch, void 
*location,
   _dw + 1; /* Array starts at dw[1] */   \
})
 
-/* Once vertex fetcher has written full VUE entries with complete
- * header the space requirement is as follows per vertex (in bytes):
- *
- *     Header    Position    Program constants
- *   +--------+------------+-------------------+
- *   |   16   |     16     |      n x 16       |
- *   +--------+------------+-------------------+
- *
- * where 'n' stands for number of varying inputs expressed as vec4s.
- *
- * The URB size is in turn expressed in 64 bytes (512 bits).
- */
-static inline unsigned
-gen7_blorp_get_vs_entry_size(const struct blorp_params *params)
-{
-const unsigned num_varyings =
-   params->wm_prog_data ? params->wm_prog_data->num_varying_inputs : 0;
-const unsigned total_needed = 16 + 16 + num_varyings * 16;
-
-   return DIV_ROUND_UP(total_needed, 64);
-}
-
 /* 3DSTATE_URB
  * 3DSTATE_URB_VS
  * 3DSTATE_URB_HS
@@ -166,7 +144,24 @@ static void
 emit_urb_config(struct blorp_batch *batch,
 const struct blorp_params *params)
 {
-   blorp_emit_urb_config(batch, gen7_blorp_get_vs_entry_size(params));
+   /* Once vertex fetcher has written full VUE entries with complete
+* header the space requirement is as follows per vertex (in bytes):
+*
+*     Header    Position    Program constants
+*   +--------+------------+-------------------+
+*   |   16   |     16     |      n x 16       |
+*   +--------+------------+-------------------+
+*
+* where 'n' stands for number of varying inputs expressed as vec4s.
+*/
+const unsigned num_varyings =
+   params->wm_prog_data ? params->wm_prog_data->num_varying_inputs : 0;
+const unsigned total_needed = 16 + 16 + num_varyings * 16;
+
+   /* The URB size is expressed in units of 64 bytes (512 bits) */
+   const unsigned vs_entry_size = DIV_ROUND_UP(total_needed, 64);
+
+   blorp_emit_urb_config(batch, vs_entry_size);
 }
 
 static void
-- 
2.5.0.400.gff86faf
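
As a quick worked check of the arithmetic in the comment above (16 bytes of header, 16 bytes of position, 16 bytes per vec4 varying, rounded up to 64-byte units), here is a standalone sketch. The DEMO_DIV_ROUND_UP macro and demo_vs_entry_size() are illustrative names, assumed to behave like the DIV_ROUND_UP helper used in the patch.

```c
#include <assert.h>

/* Integer division rounding up; assumed equivalent to DIV_ROUND_UP. */
#define DEMO_DIV_ROUND_UP(a, b) (((a) + (b) - 1) / (b))

/* VS URB entry size in 64-byte units for 'num_varyings' vec4 inputs. */
static unsigned demo_vs_entry_size(unsigned num_varyings)
{
   /* header (16) + position (16) + n varyings of 16 bytes each */
   const unsigned total_needed = 16 + 16 + num_varyings * 16;

   return DEMO_DIV_ROUND_UP(total_needed, 64);
}

/* Usage: a few hand-computed cases. */
static void demo_vs_entry_size_example(void)
{
   assert(demo_vs_entry_size(0) == 1);   /*  32 bytes -> 1 unit  */
   assert(demo_vs_entry_size(2) == 1);   /*  64 bytes -> 1 unit  */
   assert(demo_vs_entry_size(3) == 2);   /*  80 bytes -> 2 units */
   assert(demo_vs_entry_size(6) == 2);   /* 128 bytes -> 2 units */
   assert(demo_vs_entry_size(7) == 3);   /* 144 bytes -> 3 units */
}
```

Up to two varyings fit in a single 64-byte unit, and each further group of four varyings adds one more unit.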



Re: [Mesa-dev] [PATCH 06/31] i965/blorp: Use gen6_upload_urb

2016-08-29 Thread Jason Ekstrand
On Sun, Aug 28, 2016 at 10:29 PM, Pohjolainen, Topi <
topi.pohjolai...@gmail.com> wrote:

> On Fri, Aug 19, 2016 at 09:55:43AM -0700, Jason Ekstrand wrote:
> > ---
> >  src/mesa/drivers/dri/i965/genX_blorp_exec.c | 6 ++
> >  1 file changed, 2 insertions(+), 4 deletions(-)
> >
> > diff --git a/src/mesa/drivers/dri/i965/genX_blorp_exec.c
> b/src/mesa/drivers/dri/i965/genX_blorp_exec.c
> > index a6ac7b0..402ae3f 100644
> > --- a/src/mesa/drivers/dri/i965/genX_blorp_exec.c
> > +++ b/src/mesa/drivers/dri/i965/genX_blorp_exec.c
> > @@ -200,9 +200,9 @@ static void
> >  emit_urb_config(struct brw_context *brw,
> >  const struct brw_blorp_params *params)
> >  {
> > -#if GEN_GEN >= 7
> > const unsigned vs_entry_size = gen7_blorp_get_vs_entry_size(params);
>
> About using logic marked as gen7 also for gen6 further down: the name of the
> function is misleading; there is nothing gen7-specific in the way the entry
> size is calculated (earlier gens also want the size in 64-byte chunks).
>
> It looks like you would need to do unnecessary rebasing just to change the
> name. Perhaps do it as a follow-up?
>

Patch incoming


> >
> > +#if GEN_GEN >= 7
> > if (!(brw->ctx.NewDriverState & (BRW_NEW_CONTEXT |
> BRW_NEW_URB_SIZE)) &&
> > brw->urb.vsize >= vs_entry_size)
> >return;
> > @@ -211,9 +211,7 @@ emit_urb_config(struct brw_context *brw,
> >
> > gen7_upload_urb(brw, vs_entry_size, false, false);
> >  #else
> > -   blorp_emit(brw, GENX(3DSTATE_URB), urb) {
> > -  urb.VSNumberofURBEntries = brw->urb.max_vs_entries;
> > -   }
>
> I wonder how correct this was before. Actual entry size
> (VSURBEntryAllocationSize) was left to zero. Now we start actually setting
> it.
>

Yeah, I was a bit disturbed by that as well. :/


> Reviewed-by: Topi Pohjolainen 
>

Thanks!


> > +   gen6_upload_urb(brw, vs_entry_size, false, 0);
> >  #endif
> >  }
> >
> > --
> > 2.5.0.400.gff86faf
> >
> > ___
> > mesa-dev mailing list
> > mesa-dev@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>


Re: [Mesa-dev] [PATCH 2/3] noop: set missing functions

2016-08-29 Thread Ilia Mirkin
On Mon, Aug 29, 2016 at 11:49 AM, Marek Olšák  wrote:
> On Mon, Aug 29, 2016 at 5:43 PM, Ilia Mirkin  wrote:
>> On Mon, Aug 29, 2016 at 11:29 AM, Marek Olšák  wrote:
>>> From: Marek Olšák 
>>>
>>> ---
>>>  src/gallium/drivers/noop/noop_pipe.c  | 51 
>>> +++
>>>  src/gallium/drivers/noop/noop_state.c | 24 +
>>>  2 files changed, 75 insertions(+)
>>>
>>> diff --git a/src/gallium/drivers/noop/noop_pipe.c 
>>> b/src/gallium/drivers/noop/noop_pipe.c
>>> index b3e2a3e..25e0c1f 100644
>>> --- a/src/gallium/drivers/noop/noop_pipe.c
>>> +++ b/src/gallium/drivers/noop/noop_pipe.c
>>> @@ -264,42 +264,56 @@ noop_flush_resource(struct pipe_context *ctx,
>>>  }
>>>
>>>
>>>  /*
>>>   * context
>>>   */
>>>  static void noop_flush(struct pipe_context *ctx,
>>> struct pipe_fence_handle **fence,
>>> unsigned flags)
>>>  {
>>> +   if (fence)
>>> +  *fence = NULL;
>>>  }
>>>
>>>  static void noop_destroy_context(struct pipe_context *ctx)
>>>  {
>>> FREE(ctx);
>>>  }
>>>
>>> +static boolean noop_generate_mipmap(struct pipe_context *ctx,
>>> +struct pipe_resource *resource,
>>> +enum pipe_format format,
>>> +unsigned base_level,
>>> +unsigned last_level,
>>> +unsigned first_layer,
>>> +unsigned last_layer)
>>> +{
>>> +   return true;
>>> +}
>>> +
>>>  static struct pipe_context *noop_create_context(struct pipe_screen *screen,
>>>  void *priv, unsigned flags)
>>>  {
>>> struct pipe_context *ctx = CALLOC_STRUCT(pipe_context);
>>>
>>> if (!ctx)
>>>return NULL;
>>> ctx->screen = screen;
>>> ctx->priv = priv;
>>> ctx->destroy = noop_destroy_context;
>>> ctx->flush = noop_flush;
>>> ctx->clear = noop_clear;
>>> ctx->clear_render_target = noop_clear_render_target;
>>> ctx->clear_depth_stencil = noop_clear_depth_stencil;
>>> ctx->resource_copy_region = noop_resource_copy_region;
>>> +   ctx->generate_mipmap = noop_generate_mipmap;
>>> ctx->blit = noop_blit;
>>> ctx->flush_resource = noop_flush_resource;
>>> ctx->create_query = noop_create_query;
>>> ctx->destroy_query = noop_destroy_query;
>>> ctx->begin_query = noop_begin_query;
>>> ctx->end_query = noop_end_query;
>>> ctx->get_query_result = noop_get_query_result;
>>> ctx->set_active_query_state = noop_set_active_query_state;
>>> ctx->transfer_map = noop_transfer_map;
>>> ctx->transfer_flush_region = noop_transfer_flush_region;
>>> @@ -352,20 +366,30 @@ static float noop_get_paramf(struct pipe_screen* 
>>> pscreen,
>>> return screen->get_paramf(screen, param);
>>>  }
>>>
>>>  static int noop_get_shader_param(struct pipe_screen* pscreen, unsigned 
>>> shader, enum pipe_shader_cap param)
>>>  {
>>> struct pipe_screen *screen = ((struct 
>>> noop_pipe_screen*)pscreen)->oscreen;
>>>
>>> return screen->get_shader_param(screen, shader, param);
>>>  }
>>>
>>> +static int noop_get_compute_param(struct pipe_screen *pscreen,
>>> +  enum pipe_shader_ir ir_type,
>>> +  enum pipe_compute_cap param,
>>> +  void *ret)
>>> +{
>>> +   struct pipe_screen *screen = ((struct 
>>> noop_pipe_screen*)pscreen)->oscreen;
>>> +
>>> +   return screen->get_compute_param(screen, ir_type, param, ret);
>>> +}
>>> +
>>>  static boolean noop_is_format_supported(struct pipe_screen* pscreen,
>>>  enum pipe_format format,
>>>  enum pipe_texture_target target,
>>>  unsigned sample_count,
>>>  unsigned usage)
>>>  {
>>> struct pipe_screen *screen = ((struct 
>>> noop_pipe_screen*)pscreen)->oscreen;
>>>
>>> return screen->is_format_supported(screen, format, target, 
>>> sample_count, usage);
>>>  }
>>> @@ -377,20 +401,43 @@ static uint64_t noop_get_timestamp(struct pipe_screen 
>>> *pscreen)
>>>
>>>  static void noop_destroy_screen(struct pipe_screen *screen)
>>>  {
>>> struct noop_pipe_screen *noop_screen = (struct noop_pipe_screen*)screen;
>>> struct pipe_screen *oscreen = noop_screen->oscreen;
>>>
>>> oscreen->destroy(oscreen);
>>> FREE(screen);
>>>  }
>>>
>>> +static void noop_fence_reference(struct pipe_screen *screen,
>>> +  struct pipe_fence_handle **ptr,
>>> +  struct pipe_fence_handle *fence)
>>> +{
>>> +}
>>> +
>>> +static boolean noop_fence_finish(struct pipe_screen *screen,
>>> + struct pipe_context *ctx,
>>> +  

Re: [Mesa-dev] [PATCH] gallium: Use enum pipe_shader_type in bind_sampler_states()

2016-08-29 Thread Ilia Mirkin
On Fri, Aug 26, 2016 at 7:58 AM, Kai Wasserbäch
 wrote:
> diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_state.c 
> b/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
> index b9ac9f4..48aaa46 100644
> --- a/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
> +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
> @@ -426,7 +426,8 @@ nvc0_sampler_state_delete(struct pipe_context *pipe, void 
> *hwcso)
>  }
>
>  static inline void
> -nvc0_stage_sampler_states_bind(struct nvc0_context *nvc0, int s,
> +nvc0_stage_sampler_states_bind(struct nvc0_context *nvc0,
> +   enum pipe_shader_type s,
> unsigned nr, void **hwcso)
>  {
> unsigned i;
> @@ -456,7 +457,7 @@ nvc0_stage_sampler_states_bind(struct nvc0_context *nvc0, 
> int s,
>
>  static void
>  nvc0_stage_sampler_states_bind_range(struct nvc0_context *nvc0,
> - const unsigned s,
> + const enum pipe_shader_type s,
>   unsigned start, unsigned nr, void **cso)
>  {
> const unsigned end = start + nr;

Just noticed it, but these hunks are wrong. No ill effect at runtime,
since enum == int basically, but s != shader enum. It's an unwritten
enum in nvc0 code where 0 = vertex, 1 = tcs, etc. In nv50 code, there
are only 4 values (no tess).

I don't have time to revert them right now, but I'll try to do it
tonight if no one else beats me to it.

  -ilia
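
To make the "s != shader enum" point concrete, here is a small hypothetical sketch of a gallium-style shader-type enum next to a driver-internal stage index. The message only spells out 0 = vertex and 1 = tcs for the nvc0 side; the remaining indices and all demo_* names below are illustrative guesses, not taken from the driver.

```c
#include <assert.h>

/* Hypothetical API-level shader enum, in an assumed declaration order. */
enum demo_pipe_shader_type {
   DEMO_PIPE_SHADER_VERTEX,
   DEMO_PIPE_SHADER_FRAGMENT,
   DEMO_PIPE_SHADER_GEOMETRY,
   DEMO_PIPE_SHADER_TESS_CTRL,
   DEMO_PIPE_SHADER_TESS_EVAL,
   DEMO_PIPE_SHADER_COMPUTE
};

/* Driver-internal stage slot in pipeline order (an assumption beyond
 * the first two entries): vertex, tcs, tes, geometry, fragment, compute. */
static int demo_driver_stage_index(enum demo_pipe_shader_type shader)
{
   switch (shader) {
   case DEMO_PIPE_SHADER_VERTEX:    return 0;
   case DEMO_PIPE_SHADER_TESS_CTRL: return 1;
   case DEMO_PIPE_SHADER_TESS_EVAL: return 2;
   case DEMO_PIPE_SHADER_GEOMETRY:  return 3;
   case DEMO_PIPE_SHADER_FRAGMENT:  return 4;
   case DEMO_PIPE_SHADER_COMPUTE:   return 5;
   }
   return -1; /* not a valid shader type */
}

/* Usage: the two numberings agree for vertex but not for fragment,
 * so typing the internal index as the API enum would be misleading. */
static void demo_stage_index_example(void)
{
   assert(demo_driver_stage_index(DEMO_PIPE_SHADER_VERTEX) == 0);
   assert(demo_driver_stage_index(DEMO_PIPE_SHADER_TESS_CTRL) == 1);
   assert(demo_driver_stage_index(DEMO_PIPE_SHADER_FRAGMENT) !=
          (int)DEMO_PIPE_SHADER_FRAGMENT);
}
```

With an explicit conversion like this, the internal index can stay a plain integer and the enum type is used only at the API boundary.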


Re: [Mesa-dev] [PATCH] st/mesa: expose OES_geometry_shader and OES_texture_cube_map_array

2016-08-29 Thread Marek Olšák
On Mon, Aug 29, 2016 at 5:41 PM, Ilia Mirkin  wrote:
> On Mon, Aug 29, 2016 at 11:35 AM, Marek Olšák  wrote:
>> On Sat, Aug 27, 2016 at 11:53 PM, Ilia Mirkin  wrote:
>>> Signed-off-by: Ilia Mirkin 
>>> ---
>>>  docs/features.txt  |  4 ++--
>>>  docs/relnotes/12.1.0.html  |  4 ++--
>>>  src/mesa/state_tracker/st_extensions.c | 14 ++
>>>  3 files changed, 18 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/docs/features.txt b/docs/features.txt
>>> index 26e8ff7..4c755c6 100644
>>> --- a/docs/features.txt
>>> +++ b/docs/features.txt
>>> @@ -260,7 +260,7 @@ GLES3.2, GLSL ES 3.2:
>>>GL_OES_copy_image DONE (all drivers)
>>>GL_OES_draw_buffers_indexed   DONE (all drivers 
>>> that support GL_ARB_draw_buffers_blend)
>>>GL_OES_draw_elements_base_vertex  DONE (all drivers)
>>> -  GL_OES_geometry_shaderDONE (i965/gen8+)
>>> +  GL_OES_geometry_shaderDONE (i965/gen8+, 
>>> nvc0, radeonsi)
>>>GL_OES_gpu_shader5DONE (all drivers 
>>> that support GL_ARB_gpu_shader5)
>>>GL_OES_primitive_bounding_box not started
>>>GL_OES_sample_shading DONE (i965, nvc0, 
>>> r600, radeonsi)
>>> @@ -271,7 +271,7 @@ GLES3.2, GLSL ES 3.2:
>>>GL_OES_tessellation_shaderstarted (Ken)
>>>GL_OES_texture_border_clamp   DONE (all drivers)
>>>GL_OES_texture_buffer DONE (i965, nvc0, 
>>> radeonsi)
>>> -  GL_OES_texture_cube_map_array DONE (i965/gen8+)
>>> +  GL_OES_texture_cube_map_array DONE (i965/gen8+, 
>>> nvc0, radeonsi)
>>>GL_OES_texture_stencil8   DONE (all drivers 
>>> that support GL_ARB_texture_stencil8)
>>>GL_OES_texture_storage_multisample_2d_array   DONE (all drivers 
>>> that support GL_ARB_texture_multisample)
>>>
>>> diff --git a/docs/relnotes/12.1.0.html b/docs/relnotes/12.1.0.html
>>> index d22d14b..f77ef91 100644
>>> --- a/docs/relnotes/12.1.0.html
>>> +++ b/docs/relnotes/12.1.0.html
>>> @@ -57,8 +57,8 @@ Note: some of the new features are only available with 
>>> certain drivers.
>>>  GL_KHR_blend_equation_advanced on i965
>>>  GL_KHR_texture_compression_astc_sliced_3d on i965
>>>  GL_OES_copy_image on nv50, nvc0, r600, radeonsi, softpipe, 
>>> llvmpipe
>>> -GL_OES_geometry_shader on i965/gen8+
>>> -GL_OES_texture_cube_map_array on i965/gen8+
>>> +GL_OES_geometry_shader on i965/gen8+, nvc0, radeonsi
>>> +GL_OES_texture_cube_map_array on i965/gen8+, nvc0, radeonsi
>>>  
>>>
>>>  Bug fixes
>>> diff --git a/src/mesa/state_tracker/st_extensions.c 
>>> b/src/mesa/state_tracker/st_extensions.c
>>> index f86a5a3..c7ec10c 100644
>>> --- a/src/mesa/state_tracker/st_extensions.c
>>> +++ b/src/mesa/state_tracker/st_extensions.c
>>> @@ -946,6 +946,15 @@ void st_init_extensions(struct pipe_screen *screen,
>>>extensions->ARB_tessellation_shader = GL_TRUE;
>>> }
>>>
>>> +   /* Ideally this should also check that invocations are supported. In
>>> +* practice, all of the hw that supports ES 3.1 also supports multiple
>>> +* invocations.
>>> +*/
>>> +   if (screen->get_shader_param(screen, PIPE_SHADER_GEOMETRY,
>>> +PIPE_SHADER_CAP_MAX_INSTRUCTIONS) > 0) {
>>> +  extensions->OES_geometry_shader = GL_TRUE;
>>> +   }
>>
>> This should also check ARB_gpu_shader5, which adds support for
>> multiple GS invocations.
>
> Yeah, I just thought it was a little heavy to check for gs5 when all
> we wanted were multiple invocations. And I didn't really want to go
> through and add a new cap. I guess the safe thing is to add a gs5 dep
> and then when some hw comes along that wants OES_geom but not ARB_gs5,
> we can worry about it then.

Yeah that sounds good.

Marek
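
The outcome agreed on above (expose OES_geometry_shader only when geometry shaders are available and ARB_gpu_shader5 is supported) boils down to a two-condition gate. The struct and function names in this sketch are hypothetical and stand in for the real screen queries and extension table.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical summary of what the driver reports. */
struct demo_caps {
   int gs_max_instructions;   /* > 0 means geometry shaders exist */
   bool has_gpu_shader5;      /* implies multiple GS invocations  */
};

/* Gate the ES extension on both conditions discussed in the thread. */
static bool demo_expose_oes_geometry_shader(const struct demo_caps *caps)
{
   return caps->gs_max_instructions > 0 && caps->has_gpu_shader5;
}

/* Usage: only the driver with both features gets the extension. */
static void demo_extension_gate_example(void)
{
   const struct demo_caps full = { 1024, true };
   const struct demo_caps no_gs5 = { 1024, false };
   const struct demo_caps no_gs = { 0, true };

   assert(demo_expose_oes_geometry_shader(&full));
   assert(!demo_expose_oes_geometry_shader(&no_gs5));
   assert(!demo_expose_oes_geometry_shader(&no_gs));
}
```

If hardware later appears that supports multiple invocations without the rest of ARB_gpu_shader5, the second condition is the one that would need a narrower cap, as Ilia notes.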


Re: [Mesa-dev] [PATCH 2/3] noop: set missing functions

2016-08-29 Thread Marek Olšák
On Mon, Aug 29, 2016 at 5:43 PM, Ilia Mirkin  wrote:
> On Mon, Aug 29, 2016 at 11:29 AM, Marek Olšák  wrote:
>> From: Marek Olšák 
>>
>> ---
>>  src/gallium/drivers/noop/noop_pipe.c  | 51 
>> +++
>>  src/gallium/drivers/noop/noop_state.c | 24 +
>>  2 files changed, 75 insertions(+)
>>
>> diff --git a/src/gallium/drivers/noop/noop_pipe.c 
>> b/src/gallium/drivers/noop/noop_pipe.c
>> index b3e2a3e..25e0c1f 100644
>> --- a/src/gallium/drivers/noop/noop_pipe.c
>> +++ b/src/gallium/drivers/noop/noop_pipe.c
>> @@ -264,42 +264,56 @@ noop_flush_resource(struct pipe_context *ctx,
>>  }
>>
>>
>>  /*
>>   * context
>>   */
>>  static void noop_flush(struct pipe_context *ctx,
>> struct pipe_fence_handle **fence,
>> unsigned flags)
>>  {
>> +   if (fence)
>> +  *fence = NULL;
>>  }
>>
>>  static void noop_destroy_context(struct pipe_context *ctx)
>>  {
>> FREE(ctx);
>>  }
>>
>> +static boolean noop_generate_mipmap(struct pipe_context *ctx,
>> +struct pipe_resource *resource,
>> +enum pipe_format format,
>> +unsigned base_level,
>> +unsigned last_level,
>> +unsigned first_layer,
>> +unsigned last_layer)
>> +{
>> +   return true;
>> +}
>> +
>>  static struct pipe_context *noop_create_context(struct pipe_screen *screen,
>>  void *priv, unsigned flags)
>>  {
>> struct pipe_context *ctx = CALLOC_STRUCT(pipe_context);
>>
>> if (!ctx)
>>return NULL;
>> ctx->screen = screen;
>> ctx->priv = priv;
>> ctx->destroy = noop_destroy_context;
>> ctx->flush = noop_flush;
>> ctx->clear = noop_clear;
>> ctx->clear_render_target = noop_clear_render_target;
>> ctx->clear_depth_stencil = noop_clear_depth_stencil;
>> ctx->resource_copy_region = noop_resource_copy_region;
>> +   ctx->generate_mipmap = noop_generate_mipmap;
>> ctx->blit = noop_blit;
>> ctx->flush_resource = noop_flush_resource;
>> ctx->create_query = noop_create_query;
>> ctx->destroy_query = noop_destroy_query;
>> ctx->begin_query = noop_begin_query;
>> ctx->end_query = noop_end_query;
>> ctx->get_query_result = noop_get_query_result;
>> ctx->set_active_query_state = noop_set_active_query_state;
>> ctx->transfer_map = noop_transfer_map;
>> ctx->transfer_flush_region = noop_transfer_flush_region;
>> @@ -352,20 +366,30 @@ static float noop_get_paramf(struct pipe_screen* 
>> pscreen,
>> return screen->get_paramf(screen, param);
>>  }
>>
>>  static int noop_get_shader_param(struct pipe_screen* pscreen, unsigned 
>> shader, enum pipe_shader_cap param)
>>  {
>> struct pipe_screen *screen = ((struct 
>> noop_pipe_screen*)pscreen)->oscreen;
>>
>> return screen->get_shader_param(screen, shader, param);
>>  }
>>
>> +static int noop_get_compute_param(struct pipe_screen *pscreen,
>> +  enum pipe_shader_ir ir_type,
>> +  enum pipe_compute_cap param,
>> +  void *ret)
>> +{
>> +   struct pipe_screen *screen = ((struct 
>> noop_pipe_screen*)pscreen)->oscreen;
>> +
>> +   return screen->get_compute_param(screen, ir_type, param, ret);
>> +}
>> +
>>  static boolean noop_is_format_supported(struct pipe_screen* pscreen,
>>  enum pipe_format format,
>>  enum pipe_texture_target target,
>>  unsigned sample_count,
>>  unsigned usage)
>>  {
>> struct pipe_screen *screen = ((struct 
>> noop_pipe_screen*)pscreen)->oscreen;
>>
>> return screen->is_format_supported(screen, format, target, sample_count, 
>> usage);
>>  }
>> @@ -377,20 +401,43 @@ static uint64_t noop_get_timestamp(struct pipe_screen 
>> *pscreen)
>>
>>  static void noop_destroy_screen(struct pipe_screen *screen)
>>  {
>> struct noop_pipe_screen *noop_screen = (struct noop_pipe_screen*)screen;
>> struct pipe_screen *oscreen = noop_screen->oscreen;
>>
>> oscreen->destroy(oscreen);
>> FREE(screen);
>>  }
>>
>> +static void noop_fence_reference(struct pipe_screen *screen,
>> +  struct pipe_fence_handle **ptr,
>> +  struct pipe_fence_handle *fence)
>> +{
>> +}
>> +
>> +static boolean noop_fence_finish(struct pipe_screen *screen,
>> + struct pipe_context *ctx,
>> + struct pipe_fence_handle *fence,
>> + uint64_t timeout)
>> +{
>> +   return true;
>> +}
>> +
>> +static void noop_query_memory_info(struct 

Re: [Mesa-dev] [PATCH] st/mesa: expose OES_geometry_shader and OES_texture_cube_map_array

2016-08-29 Thread Ilia Mirkin
On Mon, Aug 29, 2016 at 11:35 AM, Marek Olšák  wrote:
> On Sat, Aug 27, 2016 at 11:53 PM, Ilia Mirkin  wrote:
>> Signed-off-by: Ilia Mirkin 
>> ---
>>  docs/features.txt  |  4 ++--
>>  docs/relnotes/12.1.0.html  |  4 ++--
>>  src/mesa/state_tracker/st_extensions.c | 14 ++
>>  3 files changed, 18 insertions(+), 4 deletions(-)
>>
>> diff --git a/docs/features.txt b/docs/features.txt
>> index 26e8ff7..4c755c6 100644
>> --- a/docs/features.txt
>> +++ b/docs/features.txt
>> @@ -260,7 +260,7 @@ GLES3.2, GLSL ES 3.2:
>>GL_OES_copy_image DONE (all drivers)
>>GL_OES_draw_buffers_indexed   DONE (all drivers 
>> that support GL_ARB_draw_buffers_blend)
>>GL_OES_draw_elements_base_vertex  DONE (all drivers)
>> -  GL_OES_geometry_shaderDONE (i965/gen8+)
>> +  GL_OES_geometry_shaderDONE (i965/gen8+, 
>> nvc0, radeonsi)
>>GL_OES_gpu_shader5DONE (all drivers 
>> that support GL_ARB_gpu_shader5)
>>GL_OES_primitive_bounding_box not started
>>GL_OES_sample_shading DONE (i965, nvc0, 
>> r600, radeonsi)
>> @@ -271,7 +271,7 @@ GLES3.2, GLSL ES 3.2:
>>GL_OES_tessellation_shaderstarted (Ken)
>>GL_OES_texture_border_clamp   DONE (all drivers)
>>GL_OES_texture_buffer DONE (i965, nvc0, 
>> radeonsi)
>> -  GL_OES_texture_cube_map_array DONE (i965/gen8+)
>> +  GL_OES_texture_cube_map_array DONE (i965/gen8+, 
>> nvc0, radeonsi)
>>GL_OES_texture_stencil8   DONE (all drivers 
>> that support GL_ARB_texture_stencil8)
>>GL_OES_texture_storage_multisample_2d_array   DONE (all drivers 
>> that support GL_ARB_texture_multisample)
>>
>> diff --git a/docs/relnotes/12.1.0.html b/docs/relnotes/12.1.0.html
>> index d22d14b..f77ef91 100644
>> --- a/docs/relnotes/12.1.0.html
>> +++ b/docs/relnotes/12.1.0.html
>> @@ -57,8 +57,8 @@ Note: some of the new features are only available with 
>> certain drivers.
>>  GL_KHR_blend_equation_advanced on i965
>>  GL_KHR_texture_compression_astc_sliced_3d on i965
>>  GL_OES_copy_image on nv50, nvc0, r600, radeonsi, softpipe, llvmpipe
>> -GL_OES_geometry_shader on i965/gen8+
>> -GL_OES_texture_cube_map_array on i965/gen8+
>> +GL_OES_geometry_shader on i965/gen8+, nvc0, radeonsi
>> +GL_OES_texture_cube_map_array on i965/gen8+, nvc0, radeonsi
>>  
>>
>>  Bug fixes
>> diff --git a/src/mesa/state_tracker/st_extensions.c 
>> b/src/mesa/state_tracker/st_extensions.c
>> index f86a5a3..c7ec10c 100644
>> --- a/src/mesa/state_tracker/st_extensions.c
>> +++ b/src/mesa/state_tracker/st_extensions.c
>> @@ -946,6 +946,15 @@ void st_init_extensions(struct pipe_screen *screen,
>>extensions->ARB_tessellation_shader = GL_TRUE;
>> }
>>
>> +   /* Ideally this should also check that invocations are supported. In
>> +* practice, all of the hw that supports ES 3.1 also supports multiple
>> +* invocations.
>> +*/
>> +   if (screen->get_shader_param(screen, PIPE_SHADER_GEOMETRY,
>> +PIPE_SHADER_CAP_MAX_INSTRUCTIONS) > 0) {
>> +  extensions->OES_geometry_shader = GL_TRUE;
>> +   }
>
> This should also check ARB_gpu_shader5, which adds support for
> multiple GS invocations.

Yeah, I just thought it was a little heavy to check for gs5 when all
we wanted were multiple invocations. And I didn't really want to go
through and add a new cap. I guess the safe thing is to add a gs5 dep
and then when some hw comes along that wants OES_geom but not ARB_gs5,
we can worry about it then.

  -ilia
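
A minimal sketch of the stricter check agreed on above, for illustration only. The struct and function names here are hypothetical stand-ins, not the real Gallium or Mesa types; the actual code queries struct pipe_screen and sets a field in the extensions table.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver capabilities that matter here. */
struct fake_caps {
   int gs_max_instructions;   /* > 0 means geometry shaders are supported */
   bool arb_gpu_shader5;      /* brings support for multiple GS invocations */
};

/* Expose OES_geometry_shader only when both conditions hold, i.e. the
 * "add a gs5 dep" variant rather than the GS-only check in the patch. */
static bool expose_oes_geometry_shader(const struct fake_caps *caps)
{
   assert(caps != NULL);
   return caps->gs_max_instructions > 0 && caps->arb_gpu_shader5;
}
```

If hardware ever shows up that wants OES_geometry_shader without ARB_gpu_shader5, the second operand is the part that would be replaced by a dedicated cap.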


Re: [Mesa-dev] [PATCH 2/3] noop: set missing functions

2016-08-29 Thread Ilia Mirkin
On Mon, Aug 29, 2016 at 11:29 AM, Marek Olšák  wrote:
> From: Marek Olšák 
>
> ---
>  src/gallium/drivers/noop/noop_pipe.c  | 51 
> +++
>  src/gallium/drivers/noop/noop_state.c | 24 +
>  2 files changed, 75 insertions(+)
>
> diff --git a/src/gallium/drivers/noop/noop_pipe.c 
> b/src/gallium/drivers/noop/noop_pipe.c
> index b3e2a3e..25e0c1f 100644
> --- a/src/gallium/drivers/noop/noop_pipe.c
> +++ b/src/gallium/drivers/noop/noop_pipe.c
> @@ -264,42 +264,56 @@ noop_flush_resource(struct pipe_context *ctx,
>  }
>
>
>  /*
>   * context
>   */
>  static void noop_flush(struct pipe_context *ctx,
> struct pipe_fence_handle **fence,
> unsigned flags)
>  {
> +   if (fence)
> +  *fence = NULL;
>  }
>
>  static void noop_destroy_context(struct pipe_context *ctx)
>  {
> FREE(ctx);
>  }
>
> +static boolean noop_generate_mipmap(struct pipe_context *ctx,
> +struct pipe_resource *resource,
> +enum pipe_format format,
> +unsigned base_level,
> +unsigned last_level,
> +unsigned first_layer,
> +unsigned last_layer)
> +{
> +   return true;
> +}
> +
>  static struct pipe_context *noop_create_context(struct pipe_screen *screen,
>  void *priv, unsigned flags)
>  {
> struct pipe_context *ctx = CALLOC_STRUCT(pipe_context);
>
> if (!ctx)
>return NULL;
> ctx->screen = screen;
> ctx->priv = priv;
> ctx->destroy = noop_destroy_context;
> ctx->flush = noop_flush;
> ctx->clear = noop_clear;
> ctx->clear_render_target = noop_clear_render_target;
> ctx->clear_depth_stencil = noop_clear_depth_stencil;
> ctx->resource_copy_region = noop_resource_copy_region;
> +   ctx->generate_mipmap = noop_generate_mipmap;
> ctx->blit = noop_blit;
> ctx->flush_resource = noop_flush_resource;
> ctx->create_query = noop_create_query;
> ctx->destroy_query = noop_destroy_query;
> ctx->begin_query = noop_begin_query;
> ctx->end_query = noop_end_query;
> ctx->get_query_result = noop_get_query_result;
> ctx->set_active_query_state = noop_set_active_query_state;
> ctx->transfer_map = noop_transfer_map;
> ctx->transfer_flush_region = noop_transfer_flush_region;
> @@ -352,20 +366,30 @@ static float noop_get_paramf(struct pipe_screen* 
> pscreen,
> return screen->get_paramf(screen, param);
>  }
>
>  static int noop_get_shader_param(struct pipe_screen* pscreen, unsigned 
> shader, enum pipe_shader_cap param)
>  {
> struct pipe_screen *screen = ((struct noop_pipe_screen*)pscreen)->oscreen;
>
> return screen->get_shader_param(screen, shader, param);
>  }
>
> +static int noop_get_compute_param(struct pipe_screen *pscreen,
> +  enum pipe_shader_ir ir_type,
> +  enum pipe_compute_cap param,
> +  void *ret)
> +{
> +   struct pipe_screen *screen = ((struct noop_pipe_screen*)pscreen)->oscreen;
> +
> +   return screen->get_compute_param(screen, ir_type, param, ret);
> +}
> +
>  static boolean noop_is_format_supported(struct pipe_screen* pscreen,
>  enum pipe_format format,
>  enum pipe_texture_target target,
>  unsigned sample_count,
>  unsigned usage)
>  {
> struct pipe_screen *screen = ((struct noop_pipe_screen*)pscreen)->oscreen;
>
> return screen->is_format_supported(screen, format, target, sample_count, 
> usage);
>  }
> @@ -377,20 +401,43 @@ static uint64_t noop_get_timestamp(struct pipe_screen 
> *pscreen)
>
>  static void noop_destroy_screen(struct pipe_screen *screen)
>  {
> struct noop_pipe_screen *noop_screen = (struct noop_pipe_screen*)screen;
> struct pipe_screen *oscreen = noop_screen->oscreen;
>
> oscreen->destroy(oscreen);
> FREE(screen);
>  }
>
> +static void noop_fence_reference(struct pipe_screen *screen,
> +  struct pipe_fence_handle **ptr,
> +  struct pipe_fence_handle *fence)
> +{
> +}
> +
> +static boolean noop_fence_finish(struct pipe_screen *screen,
> + struct pipe_context *ctx,
> + struct pipe_fence_handle *fence,
> + uint64_t timeout)
> +{
> +   return true;
> +}
> +
> +static void noop_query_memory_info(struct pipe_screen *pscreen,
> +   struct pipe_memory_info *info)
> +{
> +   struct noop_pipe_screen *noop_screen = (struct noop_pipe_screen*)pscreen;
> +   struct pipe_screen *screen = 

[Mesa-dev] [PATCH 13/20] gallium/radeon: rename the num-cs-flushes query to num-ctx-flushes

2016-08-29 Thread Marek Olšák
From: Marek Olšák 

num-cs-flushes will mean compute shader flushes
---
 src/gallium/drivers/radeon/r600_query.c | 8 
 src/gallium/drivers/radeon/r600_query.h | 2 +-
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/src/gallium/drivers/radeon/r600_query.c 
b/src/gallium/drivers/radeon/r600_query.c
index bd0a906..29ad249 100644
--- a/src/gallium/drivers/radeon/r600_query.c
+++ b/src/gallium/drivers/radeon/r600_query.c
@@ -47,21 +47,21 @@ static void r600_query_sw_destroy(struct 
r600_common_context *rctx,
 }
 
 static enum radeon_value_id winsys_id_from_type(unsigned type)
 {
switch (type) {
case R600_QUERY_REQUESTED_VRAM: return RADEON_REQUESTED_VRAM_MEMORY;
case R600_QUERY_REQUESTED_GTT: return RADEON_REQUESTED_GTT_MEMORY;
case R600_QUERY_MAPPED_VRAM: return RADEON_MAPPED_VRAM;
case R600_QUERY_MAPPED_GTT: return RADEON_MAPPED_GTT;
case R600_QUERY_BUFFER_WAIT_TIME: return RADEON_BUFFER_WAIT_TIME_NS;
-   case R600_QUERY_NUM_CS_FLUSHES: return RADEON_NUM_CS_FLUSHES;
+   case R600_QUERY_NUM_CTX_FLUSHES: return RADEON_NUM_CS_FLUSHES;
case R600_QUERY_NUM_BYTES_MOVED: return RADEON_NUM_BYTES_MOVED;
case R600_QUERY_NUM_EVICTIONS: return RADEON_NUM_EVICTIONS;
case R600_QUERY_VRAM_USAGE: return RADEON_VRAM_USAGE;
case R600_QUERY_GTT_USAGE: return RADEON_GTT_USAGE;
case R600_QUERY_GPU_TEMPERATURE: return RADEON_GPU_TEMPERATURE;
case R600_QUERY_CURRENT_GPU_SCLK: return RADEON_CURRENT_SCLK;
case R600_QUERY_CURRENT_GPU_MCLK: return RADEON_CURRENT_MCLK;
default: unreachable("query type does not correspond to winsys id");
}
 }
@@ -96,21 +96,21 @@ static bool r600_query_sw_begin(struct r600_common_context 
*rctx,
case R600_QUERY_MAPPED_GTT:
case R600_QUERY_VRAM_USAGE:
case R600_QUERY_GTT_USAGE:
case R600_QUERY_GPU_TEMPERATURE:
case R600_QUERY_CURRENT_GPU_SCLK:
case R600_QUERY_CURRENT_GPU_MCLK:
case R600_QUERY_BACK_BUFFER_PS_DRAW_RATIO:
query->begin_result = 0;
break;
case R600_QUERY_BUFFER_WAIT_TIME:
-   case R600_QUERY_NUM_CS_FLUSHES:
+   case R600_QUERY_NUM_CTX_FLUSHES:
case R600_QUERY_NUM_BYTES_MOVED:
case R600_QUERY_NUM_EVICTIONS: {
enum radeon_value_id ws_id = winsys_id_from_type(query->b.type);
query->begin_result = rctx->ws->query_value(rctx->ws, ws_id);
break;
}
case R600_QUERY_GPU_LOAD:
query->begin_result = r600_gpu_load_begin(rctx->screen);
break;
case R600_QUERY_NUM_COMPILATIONS:
@@ -161,21 +161,21 @@ static bool r600_query_sw_end(struct r600_common_context 
*rctx,
case R600_QUERY_REQUESTED_VRAM:
case R600_QUERY_REQUESTED_GTT:
case R600_QUERY_MAPPED_VRAM:
case R600_QUERY_MAPPED_GTT:
case R600_QUERY_VRAM_USAGE:
case R600_QUERY_GTT_USAGE:
case R600_QUERY_GPU_TEMPERATURE:
case R600_QUERY_CURRENT_GPU_SCLK:
case R600_QUERY_CURRENT_GPU_MCLK:
case R600_QUERY_BUFFER_WAIT_TIME:
-   case R600_QUERY_NUM_CS_FLUSHES:
+   case R600_QUERY_NUM_CTX_FLUSHES:
case R600_QUERY_NUM_BYTES_MOVED:
case R600_QUERY_NUM_EVICTIONS: {
enum radeon_value_id ws_id = winsys_id_from_type(query->b.type);
query->end_result = rctx->ws->query_value(rctx->ws, ws_id);
break;
}
case R600_QUERY_GPU_LOAD:
query->end_result = r600_gpu_load_end(rctx->screen,
  query->begin_result);
query->begin_result = 0;
@@ -1180,21 +1180,21 @@ static struct pipe_driver_query_info 
r600_driver_query_list[] = {
X("draw-calls", DRAW_CALLS, UINT64, 
AVERAGE),
X("spill-draw-calls",   SPILL_DRAW_CALLS,   UINT64, 
AVERAGE),
X("compute-calls",  COMPUTE_CALLS,  UINT64, 
AVERAGE),
X("spill-compute-calls",SPILL_COMPUTE_CALLS,UINT64, 
AVERAGE),
X("dma-calls",  DMA_CALLS,  UINT64, 
AVERAGE),
X("requested-VRAM", REQUESTED_VRAM, BYTES, AVERAGE),
X("requested-GTT",  REQUESTED_GTT,  BYTES, AVERAGE),
X("mapped-VRAM",MAPPED_VRAM,BYTES, AVERAGE),
X("mapped-GTT", MAPPED_GTT, BYTES, AVERAGE),
X("buffer-wait-time",   BUFFER_WAIT_TIME,   MICROSECONDS, 
CUMULATIVE),
-   X("num-cs-flushes", NUM_CS_FLUSHES, UINT64, 
AVERAGE),
+   X("num-ctx-flushes",NUM_CTX_FLUSHES,UINT64, 
AVERAGE),
X("num-bytes-moved",NUM_BYTES_MOVED,BYTES, 
CUMULATIVE),
X("num-evictions",  NUM_EVICTIONS,  

[Mesa-dev] [PATCH 05/20] radeonsi: return correct eviction stats for NVX_gpu_memory_info

2016-08-29 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/radeon/r600_pipe_common.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/radeon/r600_pipe_common.c 
b/src/gallium/drivers/radeon/r600_pipe_common.c
index b1da22f..32486c8 100644
--- a/src/gallium/drivers/radeon/r600_pipe_common.c
+++ b/src/gallium/drivers/radeon/r600_pipe_common.c
@@ -1055,22 +1055,27 @@ static void r600_query_memory_info(struct pipe_screen 
*screen,
 
info->avail_device_memory =
vram_usage <= info->total_device_memory ?
info->total_device_memory - vram_usage : 0;
info->avail_staging_memory =
gtt_usage <= info->total_staging_memory ?
info->total_staging_memory - gtt_usage : 0;
 
info->device_memory_evicted =
ws->query_value(ws, RADEON_NUM_BYTES_MOVED) / 1024;
-   /* Just return the number of evicted 64KB pages. */
-   info->nr_device_memory_evictions = info->device_memory_evicted / 64;
+
+   if (rscreen->info.drm_major == 3 && rscreen->info.drm_minor >= 4)
+   info->nr_device_memory_evictions =
+   ws->query_value(ws, RADEON_NUM_EVICTIONS);
+   else
+   /* Just return the number of evicted 64KB pages. */
+   info->nr_device_memory_evictions = info->device_memory_evicted 
/ 64;
 }
 
 struct pipe_resource *r600_resource_create_common(struct pipe_screen *screen,
  const struct pipe_resource 
*templ)
 {
struct r600_common_screen *rscreen = (struct r600_common_screen*)screen;
 
if (templ->target == PIPE_BUFFER) {
return r600_buffer_create(screen, templ,
  rscreen->info.gart_page_size);
-- 
2.7.4
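
The selection logic in this patch can be sketched in isolation as follows. This is an illustration under assumptions: struct evict_inputs and its fields are hypothetical and simply stand in for the values the real code reads from rscreen->info and ws->query_value().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical inputs; the real code obtains these from the winsys. */
struct evict_inputs {
   unsigned drm_major, drm_minor;
   uint64_t bytes_moved;        /* stands in for RADEON_NUM_BYTES_MOVED */
   uint64_t kernel_evictions;   /* stands in for RADEON_NUM_EVICTIONS */
};

static uint64_t nr_device_memory_evictions(const struct evict_inputs *in)
{
   assert(in != NULL);

   /* Same version test as the patch: use the kernel's counter if present. */
   if (in->drm_major == 3 && in->drm_minor >= 4)
      return in->kernel_evictions;

   /* Fallback: convert bytes moved to KB, then count 64KB pages. */
   return (in->bytes_moved / 1024) / 64;
}
```

With 1 MiB moved, the fallback path reports 16 evictions, while the newer path returns whatever the kernel counted.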



[Mesa-dev] [Bug 97250] Mesa/Clover: openCV library bugs on CL_MEM_USE_HOST_PTR

2016-08-29 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=97250

Vedran Miletić  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ved...@miletic.net

-- 
You are receiving this mail because:
You are the assignee for the bug.


[Mesa-dev] [PATCH 03/20] gallium/radeon: use the current ctx for CMASK elimination in resource_get_handle

2016-08-29 Thread Marek Olšák
From: Marek Olšák 

For coherency with the current context.
---
 src/gallium/drivers/radeon/r600_texture.c | 17 +++--
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/src/gallium/drivers/radeon/r600_texture.c 
b/src/gallium/drivers/radeon/r600_texture.c
index e7be768..912d123 100644
--- a/src/gallium/drivers/radeon/r600_texture.c
+++ b/src/gallium/drivers/radeon/r600_texture.c
@@ -333,29 +333,34 @@ static void r600_texture_init_metadata(struct 
r600_texture *rtex,
metadata->num_banks = surface->num_banks;
metadata->stride = surface->level[0].pitch_bytes;
metadata->scanout = (surface->flags & RADEON_SURF_SCANOUT) != 0;
 }
 
 static void r600_dirty_all_framebuffer_states(struct r600_common_screen 
*rscreen)
 {
p_atomic_inc(&rscreen->dirty_fb_counter);
 }
 
-static void r600_eliminate_fast_color_clear(struct r600_common_screen *rscreen,
- struct r600_texture *rtex)
+static void r600_eliminate_fast_color_clear(struct r600_common_context *rctx,
+   struct r600_texture *rtex)
 {
-   struct pipe_context *ctx = rscreen->aux_context;
+   struct r600_common_screen *rscreen = rctx->screen;
+   struct pipe_context *ctx = &rctx->b;
+
+   if (ctx == rscreen->aux_context)
+   pipe_mutex_lock(rscreen->aux_context_lock);
 
-   pipe_mutex_lock(rscreen->aux_context_lock);
ctx->flush_resource(ctx, &rtex->resource.b.b);
ctx->flush(ctx, NULL, 0);
-   pipe_mutex_unlock(rscreen->aux_context_lock);
+
+   if (ctx == rscreen->aux_context)
+   pipe_mutex_unlock(rscreen->aux_context_lock);
 }
 
 static void r600_texture_discard_cmask(struct r600_common_screen *rscreen,
   struct r600_texture *rtex)
 {
if (!rtex->cmask.size)
return;
 
assert(rtex->resource.b.b.nr_samples <= 1);
 
@@ -538,21 +543,21 @@ static boolean r600_texture_get_handle(struct 
pipe_screen* screen,
 * access.
 */
if (usage & PIPE_HANDLE_USAGE_WRITE && rtex->dcc_offset) {
if (r600_texture_disable_dcc(rctx, rtex))
update_metadata = true;
}
 
if (!(usage & PIPE_HANDLE_USAGE_EXPLICIT_FLUSH) &&
rtex->cmask.size) {
/* Eliminate fast clear (both CMASK and DCC) */
-   r600_eliminate_fast_color_clear(rscreen, rtex);
+   r600_eliminate_fast_color_clear(rctx, rtex);
 
/* Disable CMASK if flush_resource isn't going
 * to be called.
 */
r600_texture_discard_cmask(rscreen, rtex);
}
 
/* Set metadata. */
if (!res->is_shared || update_metadata) {
r600_texture_init_metadata(rtex, &metadata);
-- 
2.7.4
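
The locking rule this patch introduces (take the aux context lock only when the context being used is the shared aux context) can be sketched as below. All types here are hypothetical stand-ins, and the mutex is modeled with plain counters so the sketch stays self-contained; it does not show real pipe_mutex usage.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for a context and the screen that owns a shared
 * auxiliary context. */
struct fake_ctx { int flushes; };

struct fake_screen {
   struct fake_ctx *aux_context;
   int aux_lock_depth;   /* models the mutex: 1 while held, 0 otherwise */
   int aux_lock_taken;   /* number of times the lock was acquired */
};

static void eliminate_fast_clear_sketch(struct fake_screen *scr,
                                        struct fake_ctx *ctx)
{
   /* Only the shared aux context needs serialization. */
   int shared = (ctx == scr->aux_context);

   if (shared) {
      scr->aux_lock_depth++;
      scr->aux_lock_taken++;
   }

   ctx->flushes++;   /* stands in for flush_resource() followed by flush() */

   if (shared)
      scr->aux_lock_depth--;
}
```

A caller passing its own context skips the lock entirely, which is the point of flushing on the current context rather than always going through the aux one.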



[Mesa-dev] [PATCH 2/2] gallium: switch drivers to the slab allocator in src/util

2016-08-29 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/auxiliary/Makefile.sources |   2 -
 src/gallium/auxiliary/util/u_slab.c| 171 -
 src/gallium/auxiliary/util/u_slab.h|  96 
 src/gallium/drivers/freedreno/freedreno_context.c  |   6 +-
 src/gallium/drivers/freedreno/freedreno_context.h  |   8 +-
 src/gallium/drivers/freedreno/freedreno_query_hw.c |  24 +--
 src/gallium/drivers/freedreno/freedreno_resource.c |   6 +-
 src/gallium/drivers/i915/i915_context.c|   8 +-
 src/gallium/drivers/i915/i915_context.h|   6 +-
 src/gallium/drivers/i915/i915_resource_buffer.c|   4 +-
 src/gallium/drivers/i915/i915_resource_texture.c   |   4 +-
 src/gallium/drivers/ilo/ilo_context.c  |   6 +-
 src/gallium/drivers/ilo/ilo_context.h  |   4 +-
 src/gallium/drivers/ilo/ilo_transfer.c |   6 +-
 src/gallium/drivers/ilo/shader/toy_compiler.c  |   8 +-
 src/gallium/drivers/ilo/shader/toy_compiler.h  |   8 +-
 src/gallium/drivers/r300/r300_context.c|   7 +-
 src/gallium/drivers/r300/r300_context.h|   2 +-
 src/gallium/drivers/r300/r300_screen.h |   2 +-
 src/gallium/drivers/r300/r300_screen_buffer.c  |   6 +-
 src/gallium/drivers/radeon/r600_buffer_common.c|   4 +-
 src/gallium/drivers/radeon/r600_pipe_common.c  |   7 +-
 src/gallium/drivers/radeon/r600_pipe_common.h  |   4 +-
 src/gallium/drivers/vc4/vc4_context.c  |   6 +-
 src/gallium/drivers/vc4/vc4_context.h  |   4 +-
 src/gallium/drivers/vc4/vc4_resource.c |   6 +-
 src/gallium/drivers/virgl/virgl_buffer.c   |   4 +-
 src/gallium/drivers/virgl/virgl_context.c  |   8 +-
 src/gallium/drivers/virgl/virgl_context.h  |   4 +-
 src/gallium/drivers/virgl/virgl_texture.c  |   4 +-
 30 files changed, 82 insertions(+), 353 deletions(-)
 delete mode 100644 src/gallium/auxiliary/util/u_slab.c
 delete mode 100644 src/gallium/auxiliary/util/u_slab.h

diff --git a/src/gallium/auxiliary/Makefile.sources 
b/src/gallium/auxiliary/Makefile.sources
index 093c45b..f8954c9 100644
--- a/src/gallium/auxiliary/Makefile.sources
+++ b/src/gallium/auxiliary/Makefile.sources
@@ -277,22 +277,20 @@ C_SOURCES := \
util/u_range.h \
util/u_rect.h \
util/u_resource.c \
util/u_resource.h \
util/u_ringbuffer.c \
util/u_ringbuffer.h \
util/u_sampler.c \
util/u_sampler.h \
util/u_simple_shaders.c \
util/u_simple_shaders.h \
-   util/u_slab.c \
-   util/u_slab.h \
util/u_split_prim.h \
util/u_sse.h \
util/u_string.h \
util/u_suballoc.c \
util/u_suballoc.h \
util/u_surface.c \
util/u_surface.h \
util/u_surfaces.c \
util/u_surfaces.h \
util/u_tests.c \
diff --git a/src/gallium/auxiliary/util/u_slab.c 
b/src/gallium/auxiliary/util/u_slab.c
deleted file mode 100644
index 7e7d43b..000
--- a/src/gallium/auxiliary/util/u_slab.c
+++ /dev/null
@@ -1,171 +0,0 @@
-/*
- * Copyright 2010 Marek Olšák 
- *
- * Permission is hereby granted, free of charge, to any person obtaining a
- * copy of this software and associated documentation files (the "Software"),
- * to deal in the Software without restriction, including without limitation
- * on the rights to use, copy, modify, merge, publish, distribute, sub
- * license, and/or sell copies of the Software, and to permit persons to whom
- * the Software is furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice (including the next
- * paragraph) shall be included in all copies or substantial portions of the
- * Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. IN NO EVENT SHALL
- * THE AUTHOR(S) AND/OR THEIR SUPPLIERS BE LIABLE FOR ANY CLAIM,
- * DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
- * OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE
- * USE OR OTHER DEALINGS IN THE SOFTWARE. */
-
-#include "util/u_slab.h"
-
-#include "util/u_math.h"
-#include "util/u_memory.h"
-#include "util/simple_list.h"
-
-#include 
-
-#define UTIL_SLAB_MAGIC 0xcafe4321
-
-/* The block is either allocated memory or free space. */
-struct util_slab_block {
-   /* The header. */
-   /* The first next free block. */
-   struct util_slab_block *next_free;
-
-   intptr_t magic;
-
-   /* Memory after the last member is dedicated to the block itself.
-* The allocated size is always larger than this structure. */
-};
-
-static struct util_slab_block *
-util_slab_get_block(struct util_slab_mempool *pool,
-struct util_slab_page *page, unsigned 

Re: [Mesa-dev] [PATCH] st/mesa: expose OES_geometry_shader and OES_texture_cube_map_array

2016-08-29 Thread Marek Olšák
On Sat, Aug 27, 2016 at 11:53 PM, Ilia Mirkin  wrote:
> Signed-off-by: Ilia Mirkin 
> ---
>  docs/features.txt  |  4 ++--
>  docs/relnotes/12.1.0.html  |  4 ++--
>  src/mesa/state_tracker/st_extensions.c | 14 ++
>  3 files changed, 18 insertions(+), 4 deletions(-)
>
> diff --git a/docs/features.txt b/docs/features.txt
> index 26e8ff7..4c755c6 100644
> --- a/docs/features.txt
> +++ b/docs/features.txt
> @@ -260,7 +260,7 @@ GLES3.2, GLSL ES 3.2:
>GL_OES_copy_image DONE (all drivers)
>GL_OES_draw_buffers_indexed   DONE (all drivers 
> that support GL_ARB_draw_buffers_blend)
>GL_OES_draw_elements_base_vertex  DONE (all drivers)
> -  GL_OES_geometry_shaderDONE (i965/gen8+)
> +  GL_OES_geometry_shaderDONE (i965/gen8+, 
> nvc0, radeonsi)
>GL_OES_gpu_shader5DONE (all drivers 
> that support GL_ARB_gpu_shader5)
>GL_OES_primitive_bounding_box not started
>GL_OES_sample_shading DONE (i965, nvc0, 
> r600, radeonsi)
> @@ -271,7 +271,7 @@ GLES3.2, GLSL ES 3.2:
>GL_OES_tessellation_shaderstarted (Ken)
>GL_OES_texture_border_clamp   DONE (all drivers)
>GL_OES_texture_buffer DONE (i965, nvc0, 
> radeonsi)
> -  GL_OES_texture_cube_map_array DONE (i965/gen8+)
> +  GL_OES_texture_cube_map_array DONE (i965/gen8+, 
> nvc0, radeonsi)
>GL_OES_texture_stencil8   DONE (all drivers 
> that support GL_ARB_texture_stencil8)
>GL_OES_texture_storage_multisample_2d_array   DONE (all drivers 
> that support GL_ARB_texture_multisample)
>
> diff --git a/docs/relnotes/12.1.0.html b/docs/relnotes/12.1.0.html
> index d22d14b..f77ef91 100644
> --- a/docs/relnotes/12.1.0.html
> +++ b/docs/relnotes/12.1.0.html
> @@ -57,8 +57,8 @@ Note: some of the new features are only available with 
> certain drivers.
>  GL_KHR_blend_equation_advanced on i965
>  GL_KHR_texture_compression_astc_sliced_3d on i965
>  GL_OES_copy_image on nv50, nvc0, r600, radeonsi, softpipe, llvmpipe
> -GL_OES_geometry_shader on i965/gen8+
> -GL_OES_texture_cube_map_array on i965/gen8+
> +GL_OES_geometry_shader on i965/gen8+, nvc0, radeonsi
> +GL_OES_texture_cube_map_array on i965/gen8+, nvc0, radeonsi
>  
>
>  Bug fixes
> diff --git a/src/mesa/state_tracker/st_extensions.c 
> b/src/mesa/state_tracker/st_extensions.c
> index f86a5a3..c7ec10c 100644
> --- a/src/mesa/state_tracker/st_extensions.c
> +++ b/src/mesa/state_tracker/st_extensions.c
> @@ -946,6 +946,15 @@ void st_init_extensions(struct pipe_screen *screen,
>extensions->ARB_tessellation_shader = GL_TRUE;
> }
>
> +   /* Ideally this should also check that invocations are supported. In
> +* practice, all of the hw that supports ES 3.1 also supports multiple
> +* invocations.
> +*/
> +   if (screen->get_shader_param(screen, PIPE_SHADER_GEOMETRY,
> +PIPE_SHADER_CAP_MAX_INSTRUCTIONS) > 0) {
> +  extensions->OES_geometry_shader = GL_TRUE;
> +   }

This should also check ARB_gpu_shader5, which adds support for
multiple GS invocations.

Marek


[Mesa-dev] [PATCH 3/3] noop: implement resource_get_handle

2016-08-29 Thread Marek Olšák
From: Marek Olšák 

X+DRI3 locks up if the returned handle is invalid.
---
 src/gallium/drivers/noop/noop_pipe.c | 16 ++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/noop/noop_pipe.c 
b/src/gallium/drivers/noop/noop_pipe.c
index 25e0c1f..3013019 100644
--- a/src/gallium/drivers/noop/noop_pipe.c
+++ b/src/gallium/drivers/noop/noop_pipe.c
@@ -127,27 +127,39 @@ static struct pipe_resource 
*noop_resource_from_handle(struct pipe_screen *scree
struct pipe_screen *oscreen = noop_screen->oscreen;
struct pipe_resource *result;
struct pipe_resource *noop_resource;
 
result = oscreen->resource_from_handle(oscreen, templ, handle, usage);
noop_resource = noop_resource_create(screen, result);
pipe_resource_reference(&result, NULL);
return noop_resource;
 }
 
-static boolean noop_resource_get_handle(struct pipe_screen *screen,
+static boolean noop_resource_get_handle(struct pipe_screen *pscreen,
 struct pipe_context *ctx,
 struct pipe_resource *resource,
 struct winsys_handle *handle,
 unsigned usage)
 {
-   return FALSE;
+   struct noop_pipe_screen *noop_screen = (struct noop_pipe_screen*)pscreen;
+   struct pipe_screen *screen = noop_screen->oscreen;
+   struct pipe_resource *tex;
+   bool result;
+
+   /* resource_get_handle mustn't fail. Just create something and return it. */
+   tex = screen->resource_create(screen, resource);
+   if (!tex)
+  return false;
+
+   result = screen->resource_get_handle(screen, NULL, tex, handle, usage);
+   pipe_resource_reference(&tex, NULL);
+   return result;
 }
 
 static void noop_resource_destroy(struct pipe_screen *screen,
   struct pipe_resource *resource)
 {
struct noop_resource *nresource = (struct noop_resource *)resource;
 
FREE(nresource->data);
FREE(resource);
 }
-- 
2.7.4
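
The call sequence in the new noop_resource_get_handle (create a real resource on the wrapped screen, export its handle, drop the temporary reference) can be sketched as follows. Every name below is a hypothetical stand-in for illustration; the sketch models only the ordering and reference counting, not real winsys handles.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct fake_resource { int refcount; };
struct fake_handle { int value; };

/* Tracks resources still alive on the hypothetical wrapped screen. */
static int live_resources;

static struct fake_resource *real_resource_create(void)
{
   struct fake_resource *r = calloc(1, sizeof(*r));

   if (r) {
      r->refcount = 1;
      live_resources++;
   }
   return r;
}

static bool real_get_handle(struct fake_resource *r, struct fake_handle *h)
{
   (void)r;
   h->value = 42;   /* any value the caller will accept as valid */
   return true;
}

/* Drops one reference and clears the caller's pointer. */
static void resource_unref(struct fake_resource **r)
{
   if (*r && --(*r)->refcount == 0) {
      free(*r);
      live_resources--;
   }
   *r = NULL;
}

static bool noop_get_handle_sketch(struct fake_handle *h)
{
   struct fake_resource *tex = real_resource_create();
   bool ok;

   if (!tex)
      return false;

   ok = real_get_handle(tex, h);
   resource_unref(&tex);   /* the temporary resource is not kept around */
   return ok;
}
```

The wrapper never returns a failure when the wrapped screen can create the resource, which matches the intent of handing the caller a valid handle instead of FALSE.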



[Mesa-dev] [PATCH 2/3] noop: set missing functions

2016-08-29 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/noop/noop_pipe.c  | 51 +++
 src/gallium/drivers/noop/noop_state.c | 24 +
 2 files changed, 75 insertions(+)

diff --git a/src/gallium/drivers/noop/noop_pipe.c 
b/src/gallium/drivers/noop/noop_pipe.c
index b3e2a3e..25e0c1f 100644
--- a/src/gallium/drivers/noop/noop_pipe.c
+++ b/src/gallium/drivers/noop/noop_pipe.c
@@ -264,42 +264,56 @@ noop_flush_resource(struct pipe_context *ctx,
 }
 
 
 /*
  * context
  */
 static void noop_flush(struct pipe_context *ctx,
struct pipe_fence_handle **fence,
unsigned flags)
 {
+   if (fence)
+  *fence = NULL;
 }
 
 static void noop_destroy_context(struct pipe_context *ctx)
 {
FREE(ctx);
 }
 
+static boolean noop_generate_mipmap(struct pipe_context *ctx,
+struct pipe_resource *resource,
+enum pipe_format format,
+unsigned base_level,
+unsigned last_level,
+unsigned first_layer,
+unsigned last_layer)
+{
+   return true;
+}
+
 static struct pipe_context *noop_create_context(struct pipe_screen *screen,
 void *priv, unsigned flags)
 {
struct pipe_context *ctx = CALLOC_STRUCT(pipe_context);
 
if (!ctx)
   return NULL;
ctx->screen = screen;
ctx->priv = priv;
ctx->destroy = noop_destroy_context;
ctx->flush = noop_flush;
ctx->clear = noop_clear;
ctx->clear_render_target = noop_clear_render_target;
ctx->clear_depth_stencil = noop_clear_depth_stencil;
ctx->resource_copy_region = noop_resource_copy_region;
+   ctx->generate_mipmap = noop_generate_mipmap;
ctx->blit = noop_blit;
ctx->flush_resource = noop_flush_resource;
ctx->create_query = noop_create_query;
ctx->destroy_query = noop_destroy_query;
ctx->begin_query = noop_begin_query;
ctx->end_query = noop_end_query;
ctx->get_query_result = noop_get_query_result;
ctx->set_active_query_state = noop_set_active_query_state;
ctx->transfer_map = noop_transfer_map;
ctx->transfer_flush_region = noop_transfer_flush_region;
@@ -352,20 +366,30 @@ static float noop_get_paramf(struct pipe_screen* pscreen,
return screen->get_paramf(screen, param);
 }
 
 static int noop_get_shader_param(struct pipe_screen* pscreen, unsigned shader, 
enum pipe_shader_cap param)
 {
struct pipe_screen *screen = ((struct noop_pipe_screen*)pscreen)->oscreen;
 
return screen->get_shader_param(screen, shader, param);
 }
 
+static int noop_get_compute_param(struct pipe_screen *pscreen,
+  enum pipe_shader_ir ir_type,
+  enum pipe_compute_cap param,
+  void *ret)
+{
+   struct pipe_screen *screen = ((struct noop_pipe_screen*)pscreen)->oscreen;
+
+   return screen->get_compute_param(screen, ir_type, param, ret);
+}
+
 static boolean noop_is_format_supported(struct pipe_screen* pscreen,
 enum pipe_format format,
 enum pipe_texture_target target,
 unsigned sample_count,
 unsigned usage)
 {
struct pipe_screen *screen = ((struct noop_pipe_screen*)pscreen)->oscreen;
 
return screen->is_format_supported(screen, format, target, sample_count, 
usage);
 }
@@ -377,20 +401,43 @@ static uint64_t noop_get_timestamp(struct pipe_screen 
*pscreen)
 
 static void noop_destroy_screen(struct pipe_screen *screen)
 {
struct noop_pipe_screen *noop_screen = (struct noop_pipe_screen*)screen;
struct pipe_screen *oscreen = noop_screen->oscreen;
 
oscreen->destroy(oscreen);
FREE(screen);
 }
 
+static void noop_fence_reference(struct pipe_screen *screen,
+  struct pipe_fence_handle **ptr,
+  struct pipe_fence_handle *fence)
+{
+}
+
+static boolean noop_fence_finish(struct pipe_screen *screen,
+ struct pipe_context *ctx,
+ struct pipe_fence_handle *fence,
+ uint64_t timeout)
+{
+   return true;
+}
+
+static void noop_query_memory_info(struct pipe_screen *pscreen,
+   struct pipe_memory_info *info)
+{
+   struct noop_pipe_screen *noop_screen = (struct noop_pipe_screen*)pscreen;
+   struct pipe_screen *screen = noop_screen->oscreen;
+
+   screen->query_memory_info(screen, info);
+}
+
 struct pipe_screen *noop_screen_create(struct pipe_screen *oscreen)
 {
struct noop_pipe_screen *noop_screen;
struct pipe_screen *screen;
 
if (!debug_get_option_noop()) {
   return oscreen;
}
 
noop_screen = 

[Mesa-dev] [PATCH 1/3] noop: simplify some functions

2016-08-29 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/noop/noop_state.c | 56 +--
 1 file changed, 7 insertions(+), 49 deletions(-)

diff --git a/src/gallium/drivers/noop/noop_state.c b/src/gallium/drivers/noop/noop_state.c
index 0c0ad9f..01538bfe27 100644
--- a/src/gallium/drivers/noop/noop_state.c
+++ b/src/gallium/drivers/noop/noop_state.c
@@ -35,63 +35,39 @@ static void noop_draw_vbo(struct pipe_context *ctx, const 
struct pipe_draw_info
 }
 
 static void noop_set_blend_color(struct pipe_context *ctx,
  const struct pipe_blend_color *state)
 {
 }
 
 static void *noop_create_blend_state(struct pipe_context *ctx,
  const struct pipe_blend_state *state)
 {
-   struct pipe_blend_state *nstate = CALLOC_STRUCT(pipe_blend_state);
-
-   if (!nstate) {
-  return NULL;
-   }
-   *nstate = *state;
-   return nstate;
+   return malloc(1);
 }
 
 static void *noop_create_dsa_state(struct pipe_context *ctx,
const struct pipe_depth_stencil_alpha_state 
*state)
 {
-   struct pipe_depth_stencil_alpha_state *nstate = 
CALLOC_STRUCT(pipe_depth_stencil_alpha_state);
-
-   if (!nstate) {
-  return NULL;
-   }
-   *nstate = *state;
-   return nstate;
+   return malloc(1);
 }
 
 static void *noop_create_rs_state(struct pipe_context *ctx,
   const struct pipe_rasterizer_state *state)
 {
-   struct pipe_rasterizer_state *nstate = CALLOC_STRUCT(pipe_rasterizer_state);
-
-   if (!nstate) {
-  return NULL;
-   }
-   *nstate = *state;
-   return nstate;
+   return malloc(1);
 }
 
 static void *noop_create_sampler_state(struct pipe_context *ctx,
const struct pipe_sampler_state *state)
 {
-   struct pipe_sampler_state *nstate = CALLOC_STRUCT(pipe_sampler_state);
-
-   if (!nstate) {
-  return NULL;
-   }
-   *nstate = *state;
-   return nstate;
+   return malloc(1);
 }
 
 static struct pipe_sampler_view *noop_create_sampler_view(struct pipe_context 
*ctx,
   struct pipe_resource 
*texture,
   const struct 
pipe_sampler_view *state)
 {
struct pipe_sampler_view *sampler_view = CALLOC_STRUCT(pipe_sampler_view);
 
if (!sampler_view)
   return NULL;
@@ -198,60 +174,42 @@ static void noop_surface_destroy(struct pipe_context *ctx,
 
 static void noop_bind_state(struct pipe_context *ctx, void *state)
 {
 }
 
 static void noop_delete_state(struct pipe_context *ctx, void *state)
 {
FREE(state);
 }
 
-static void noop_delete_vertex_element(struct pipe_context *ctx, void *state)
-{
-   FREE(state);
-}
-
-
 static void noop_set_index_buffer(struct pipe_context *ctx,
   const struct pipe_index_buffer *ib)
 {
 }
 
 static void noop_set_vertex_buffers(struct pipe_context *ctx,
 unsigned start_slot, unsigned count,
 const struct pipe_vertex_buffer *buffers)
 {
 }
 
 static void *noop_create_vertex_elements(struct pipe_context *ctx,
  unsigned count,
  const struct pipe_vertex_element 
*state)
 {
-   struct pipe_vertex_element *nstate = CALLOC_STRUCT(pipe_vertex_element);
-
-   if (!nstate) {
-  return NULL;
-   }
-   *nstate = *state;
-   return nstate;
+   return malloc(1);
 }
 
 static void *noop_create_shader_state(struct pipe_context *ctx,
   const struct pipe_shader_state *state)
 {
-   struct pipe_shader_state *nstate = CALLOC_STRUCT(pipe_shader_state);
-
-   if (!nstate) {
-  return NULL;
-   }
-   *nstate = *state;
-   return nstate;
+   return malloc(1);
 }
 
 static struct pipe_stream_output_target *noop_create_stream_output_target(
   struct pipe_context *ctx,
   struct pipe_resource *res,
   unsigned buffer_offset,
   unsigned buffer_size)
 {
struct pipe_stream_output_target *t = 
CALLOC_STRUCT(pipe_stream_output_target);
if (!t)
@@ -296,21 +254,21 @@ void noop_init_state_functions(struct pipe_context *ctx)
ctx->bind_sampler_states = noop_bind_sampler_states;
ctx->bind_fs_state = noop_bind_state;
ctx->bind_rasterizer_state = noop_bind_state;
ctx->bind_vertex_elements_state = noop_bind_state;
ctx->bind_vs_state = noop_bind_state;
ctx->delete_blend_state = noop_delete_state;
ctx->delete_depth_stencil_alpha_state = noop_delete_state;
ctx->delete_fs_state = noop_delete_state;
ctx->delete_rasterizer_state = noop_delete_state;
ctx->delete_sampler_state = noop_delete_state;
-   ctx->delete_vertex_elements_state = noop_delete_vertex_element;
+   ctx->delete_vertex_elements_state = noop_delete_state;
ctx->delete_vs_state = noop_delete_state;
ctx->set_blend_color = noop_set_blend_color;

[Mesa-dev] [PATCH 15/20] radeonsi: don't emit CS_PARTIAL_FLUSH if compute is not used

2016-08-29 Thread Marek Olšák
From: Marek Olšák 

for less noise in the HUD
---
 src/gallium/drivers/radeonsi/si_compute.c| 1 +
 src/gallium/drivers/radeonsi/si_pipe.h   | 1 +
 src/gallium/drivers/radeonsi/si_state_draw.c | 4 +++-
 3 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeonsi/si_compute.c b/src/gallium/drivers/radeonsi/si_compute.c
index 17a4125..5041761 100644
--- a/src/gallium/drivers/radeonsi/si_compute.c
+++ b/src/gallium/drivers/radeonsi/si_compute.c
@@ -505,20 +505,21 @@ static void si_launch_grid(
 
if (program->ir_type == PIPE_SHADER_IR_TGSI)
si_setup_tgsi_grid(sctx, info);
 
si_ce_pre_draw_synchronization(sctx);
 
si_emit_dispatch_packets(sctx, info);
 
si_ce_post_draw_synchronization(sctx);
 
+   sctx->compute_is_busy = true;
sctx->b.num_compute_calls++;
if (sctx->cs_shader_state.uses_scratch)
sctx->b.num_spill_compute_calls++;
 
if (cs_regalloc_hang)
sctx->b.flags |= SI_CONTEXT_CS_PARTIAL_FLUSH;
 }
 
 
 static void si_delete_compute_state(struct pipe_context *ctx, void* state){
diff --git a/src/gallium/drivers/radeonsi/si_pipe.h b/src/gallium/drivers/radeonsi/si_pipe.h
index f6535cf..5c041ce 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.h
+++ b/src/gallium/drivers/radeonsi/si_pipe.h
@@ -212,20 +212,21 @@ struct si_context {
struct si_screen*screen;
 
struct radeon_winsys_cs *ce_ib;
struct radeon_winsys_cs *ce_preamble_ib;
boolce_need_synchronization;
struct u_suballocator   *ce_suballocator;
 
struct si_shader_ctx_state  fixed_func_tcs_shader;
LLVMTargetMachineReftm; /* only non-threaded compilation */
boolgfx_flush_in_progress;
+   boolcompute_is_busy;
 
/* Atoms (direct states). */
union si_state_atomsatoms;
unsigneddirty_atoms; /* mask */
/* PM4 states (precomputed immutable states) */
union si_state  queued;
union si_state  emitted;
 
/* Atom declarations. */
struct r600_atomcache_flush;
diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c b/src/gallium/drivers/radeonsi/si_state_draw.c
index 9e50bb2..ddcb904 100644
--- a/src/gallium/drivers/radeonsi/si_state_draw.c
+++ b/src/gallium/drivers/radeonsi/si_state_draw.c
@@ -778,24 +778,26 @@ void si_emit_cache_flush(struct si_context *si_ctx, 
struct r600_atom *atom)
 */
sctx->num_vs_flushes++;
sctx->num_ps_flushes++;
} else if (sctx->flags & SI_CONTEXT_VS_PARTIAL_FLUSH) {
radeon_emit(cs, PKT3(PKT3_EVENT_WRITE, 0, 0));
radeon_emit(cs, EVENT_TYPE(V_028A90_VS_PARTIAL_FLUSH) | 
EVENT_INDEX(4));
sctx->num_vs_flushes++;
}
}
 
-   if (sctx->flags & SI_CONTEXT_CS_PARTIAL_FLUSH) {
+   if (sctx->flags & SI_CONTEXT_CS_PARTIAL_FLUSH &&
+   si_ctx->compute_is_busy) {
radeon_emit(cs, PKT3(PKT3_EVENT_WRITE, 0, 0));
radeon_emit(cs, EVENT_TYPE(V_028A90_CS_PARTIAL_FLUSH | 
EVENT_INDEX(4)));
sctx->num_cs_flushes++;
+   si_ctx->compute_is_busy = false;
}
 
/* VGT state synchronization. */
if (sctx->flags & SI_CONTEXT_VGT_FLUSH) {
radeon_emit(cs, PKT3(PKT3_EVENT_WRITE, 0, 0));
radeon_emit(cs, EVENT_TYPE(V_028A90_VGT_FLUSH) | 
EVENT_INDEX(0));
}
if (sctx->flags & SI_CONTEXT_VGT_STREAMOUT_SYNC) {
radeon_emit(cs, PKT3(PKT3_EVENT_WRITE, 0, 0));
radeon_emit(cs, EVENT_TYPE(V_028A90_VGT_STREAMOUT_SYNC) | 
EVENT_INDEX(0));
-- 
2.7.4

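The gating added by this patch can be illustrated with a small standalone sketch. Everything named mock_* or MOCK_* below is a hypothetical stand-in, not radeonsi code: a dispatch marks compute as busy, and a requested CS partial flush is only counted (in the driver: emitted) while that mark is set. Clearing the request flags at the end of the flush function is an assumption about how requests are consumed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for SI_CONTEXT_CS_PARTIAL_FLUSH. */
#define MOCK_CS_PARTIAL_FLUSH (1u << 0)

/* Hypothetical, heavily reduced context; not the real struct si_context. */
struct mock_ctx {
   unsigned flags;          /* pending flush requests */
   bool compute_is_busy;    /* set by a dispatch, cleared by a CS flush */
   unsigned num_cs_flushes; /* what a HUD counter would report */
};

/* A compute dispatch marks the compute pipeline as possibly busy. */
static void mock_launch_grid(struct mock_ctx *ctx)
{
   ctx->compute_is_busy = true;
}

/* Count a CS partial flush only if one was requested AND compute work
 * may still be in flight. Assumption: all requests are consumed here. */
static void mock_emit_cache_flush(struct mock_ctx *ctx)
{
   if ((ctx->flags & MOCK_CS_PARTIAL_FLUSH) && ctx->compute_is_busy) {
      ctx->num_cs_flushes++;
      ctx->compute_is_busy = false;
   }
   ctx->flags = 0;
}
```

With this shape, a flush request that arrives when no dispatch has happened since the last flush leaves the counter untouched, which is the "less noise" effect described in the commit message.
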


[Mesa-dev] [PATCH 20/20] gallium/radeon: remove VPORT_ZMIN/ZMAX from init config states

2016-08-29 Thread Marek Olšák
From: Marek Olšák 

It's part of the viewport state now.
---
 src/gallium/drivers/r600/evergreen_state.c | 14 +-
 src/gallium/drivers/r600/r600_state.c  |  6 --
 src/gallium/drivers/radeonsi/si_state.c|  6 --
 3 files changed, 1 insertion(+), 25 deletions(-)

diff --git a/src/gallium/drivers/r600/evergreen_state.c b/src/gallium/drivers/r600/evergreen_state.c
index 5ca5453..ed385ee 100644
--- a/src/gallium/drivers/r600/evergreen_state.c
+++ b/src/gallium/drivers/r600/evergreen_state.c
@@ -2323,21 +2323,21 @@ void cayman_init_common_regs(struct r600_command_buffer 
*cb,
r600_store_context_reg_seq(cb, R_028350_SX_MISC, 2);
r600_store_value(cb, 0);
r600_store_value(cb, S_028354_SURFACE_SYNC_MASK(0xf));
 
r600_store_context_reg(cb, R_028800_DB_DEPTH_CONTROL, 0);
 }
 
 static void cayman_init_atom_start_cs(struct r600_context *rctx)
 {
struct r600_command_buffer *cb = &rctx->start_cs_cmd;
-   int tmp, i;
+   int i;
 
r600_init_command_buffer(cb, 338);
 
/* This must be first. */
r600_store_value(cb, PKT3(PKT3_CONTEXT_CONTROL, 1, 0));
r600_store_value(cb, 0x8000);
r600_store_value(cb, 0x8000);
 
/* We're setting config registers here. */
r600_store_value(cb, PKT3(PKT3_EVENT_WRITE, 0, 0));
@@ -2415,26 +2415,20 @@ static void cayman_init_atom_start_cs(struct 
r600_context *rctx)
r600_store_context_reg(cb, R_0286DC_SPI_FOG_CNTL, 0);
 
r600_store_context_reg_seq(cb, R_028AC0_DB_SRESULTS_COMPARE_STATE0, 3);
r600_store_value(cb, 0); /* R_028AC0_DB_SRESULTS_COMPARE_STATE0 */
r600_store_value(cb, 0); /* R_028AC4_DB_SRESULTS_COMPARE_STATE1 */
r600_store_value(cb, 0); /* R_028AC8_DB_PRELOAD_CONTROL */
 
r600_store_context_reg(cb, R_028200_PA_SC_WINDOW_OFFSET, 0);
r600_store_context_reg(cb, R_02820C_PA_SC_CLIPRECT_RULE, 0x);
 
-   r600_store_context_reg_seq(cb, R_0282D0_PA_SC_VPORT_ZMIN_0, 2 * 
R600_MAX_VIEWPORTS);
-   for (tmp = 0; tmp < R600_MAX_VIEWPORTS; tmp++) {
-   r600_store_value(cb, 0); /* R_0282D0_PA_SC_VPORT_ZMIN_0 */
-   r600_store_value(cb, fui(1.0)); /* R_0282D4_PA_SC_VPORT_ZMAX_0 
*/
-   }
-
r600_store_context_reg(cb, R_028230_PA_SC_EDGERULE, 0x);
r600_store_context_reg(cb, R_028820_PA_CL_NANINF_CNTL, 0);
 
r600_store_context_reg_seq(cb, R_028240_PA_SC_GENERIC_SCISSOR_TL, 2);
r600_store_value(cb, 0); /* R_028240_PA_SC_GENERIC_SCISSOR_TL */
r600_store_value(cb, S_028244_BR_X(16384) | S_028244_BR_Y(16384)); /* 
R_028244_PA_SC_GENERIC_SCISSOR_BR */
 
r600_store_context_reg_seq(cb, R_028030_PA_SC_SCREEN_SCISSOR_TL, 2);
r600_store_value(cb, 0); /* R_028030_PA_SC_SCREEN_SCISSOR_TL */
r600_store_value(cb, S_028034_BR_X(16384) | S_028034_BR_Y(16384)); /* 
R_028034_PA_SC_SCREEN_SCISSOR_BR */
@@ -2825,26 +2819,20 @@ void evergreen_init_atom_start_cs(struct r600_context 
*rctx)
r600_store_value(cb, 0); /* R_028404_VGT_MIN_VTX_INDX */
 
r600_store_ctl_const(cb, R_03CFF0_SQ_VTX_BASE_VTX_LOC, 0);
 
r600_store_context_reg(cb, R_028028_DB_STENCIL_CLEAR, 0);
 
r600_store_context_reg(cb, R_028200_PA_SC_WINDOW_OFFSET, 0);
r600_store_context_reg(cb, R_02820C_PA_SC_CLIPRECT_RULE, 0x);
r600_store_context_reg(cb, R_028230_PA_SC_EDGERULE, 0x);
 
-   r600_store_context_reg_seq(cb, R_0282D0_PA_SC_VPORT_ZMIN_0, 2 * 
R600_MAX_VIEWPORTS);
-   for (tmp = 0; tmp < R600_MAX_VIEWPORTS; tmp++) {
-   r600_store_value(cb, 0); /* R_0282D0_PA_SC_VPORT_ZMIN_0 */
-   r600_store_value(cb, fui(1.0)); /* R_0282D4_PA_SC_VPORT_ZMAX_0 
*/
-   }
-
r600_store_context_reg(cb, R_0286DC_SPI_FOG_CNTL, 0);
r600_store_context_reg(cb, R_028820_PA_CL_NANINF_CNTL, 0);
 
r600_store_context_reg_seq(cb, R_028AC0_DB_SRESULTS_COMPARE_STATE0, 3);
r600_store_value(cb, 0); /* R_028AC0_DB_SRESULTS_COMPARE_STATE0 */
r600_store_value(cb, 0); /* R_028AC4_DB_SRESULTS_COMPARE_STATE1 */
r600_store_value(cb, 0); /* R_028AC8_DB_PRELOAD_CONTROL */
 
r600_store_context_reg_seq(cb, R_028240_PA_SC_GENERIC_SCISSOR_TL, 2);
r600_store_value(cb, 0); /* R_028240_PA_SC_GENERIC_SCISSOR_TL */
diff --git a/src/gallium/drivers/r600/r600_state.c b/src/gallium/drivers/r600/r600_state.c
index c8768e0..c55c532 100644
--- a/src/gallium/drivers/r600/r600_state.c
+++ b/src/gallium/drivers/r600/r600_state.c
@@ -2367,26 +2367,20 @@ void r600_init_atom_start_cs(struct r600_context *rctx)
r600_store_value(cb, 0); /* R_0286E4_SPI_FOG_FUNC_BIAS */
 
r600_store_context_reg_seq(cb, R_028D28_DB_SRESULTS_COMPARE_STATE0, 3);
r600_store_value(cb, 0); /* R_028D28_DB_SRESULTS_COMPARE_STATE0 */
r600_store_value(cb, 0); /* R_028D2C_DB_SRESULTS_COMPARE_STATE1 */
r600_store_value(cb, 0); /* 

[Mesa-dev] [PATCH 1/2] util: import the slab allocator from gallium

2016-08-29 Thread Marek Olšák
From: Marek Olšák 

There are also some cosmetic changes.
---
 src/util/Makefile.sources |   2 +
 src/util/slab.c   | 180 ++
 src/util/slab.h   |  62 
 3 files changed, 244 insertions(+)
 create mode 100644 src/util/slab.c
 create mode 100644 src/util/slab.h

diff --git a/src/util/Makefile.sources b/src/util/Makefile.sources
index 6def4f7..d72107a 100644
--- a/src/util/Makefile.sources
+++ b/src/util/Makefile.sources
@@ -18,19 +18,21 @@ MESA_UTIL_FILES :=  \
ralloc.c \
ralloc.h \
register_allocate.c \
register_allocate.h \
rgtc.c \
rgtc.h \
rounding.h \
set.c \
set.h \
simple_list.h \
+   slab.c \
+   slab.h \
strndup.c \
strndup.h \
strtod.c \
strtod.h \
texcompress_rgtc_tmp.h \
u_atomic.h
 
 MESA_UTIL_GENERATED_FILES = \
format_srgb.c
diff --git a/src/util/slab.c b/src/util/slab.c
new file mode 100644
index 000..acf818b
--- /dev/null
+++ b/src/util/slab.c
@@ -0,0 +1,180 @@
+/*
+ * Copyright 2010 Marek Olšák 
+ * Copyright 2016 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * on the rights to use, copy, modify, merge, publish, distribute, sub
+ * license, and/or sell copies of the Software, and to permit persons to whom
+ * the Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHOR(S) AND/OR THEIR SUPPLIERS BE LIABLE FOR ANY CLAIM,
+ * DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
+ * OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE
+ * USE OR OTHER DEALINGS IN THE SOFTWARE. */
+
+#include "slab.h"
+#include "macros.h"
+#include "simple_list.h"
+#include 
+#include 
+
+#define ALIGN(value, align) (((value) + (align) - 1) & ~((align) - 1))
+
+#ifdef DEBUG
+#define SLAB_MAGIC 0xcafe4321
+#define SET_MAGIC(element)   (element)->magic = SLAB_MAGIC
+#define CHECK_MAGIC(element) assert((element)->magic == SLAB_MAGIC)
+#else
+#define SET_MAGIC(element)
+#define CHECK_MAGIC(element)
+#endif
+
+/* One array element within a big buffer. */
+struct slab_element_header {
+   /* The next free element. */
+   struct slab_element_header *next_free;
+
+#ifdef DEBUG
+   /* Use intptr_t to keep the header aligned to a pointer size. */
+   intptr_t magic;
+#endif
+};
+
+static struct slab_element_header *
+slab_get_element(struct slab_mempool *pool,
+ struct slab_page_header *page, unsigned index)
+{
+   return (struct slab_element_header*)
+  ((uint8_t*)&page[1] + (pool->element_size * index));
+}
+
+static bool
+slab_add_new_page(struct slab_mempool *pool)
+{
+   struct slab_page_header *page;
+   struct slab_element_header *element;
+   unsigned i;
+
+   page = malloc(sizeof(struct slab_page_header) +
+ pool->num_elements * pool->element_size);
+   if (!page)
+  return false;
+
+   if (!pool->list.prev)
+  make_empty_list(&pool->list);
+
+   insert_at_tail(&pool->list, page);
+
+   /* Mark all elements as free. */
+   for (i = 0; i < pool->num_elements-1; i++) {
+  element = slab_get_element(pool, page, i);
+  element->next_free = slab_get_element(pool, page, i + 1);
+  SET_MAGIC(element);
+   }
+
+   element = slab_get_element(pool, page, pool->num_elements - 1);
+   element->next_free = pool->first_free;
+   SET_MAGIC(element);
+   pool->first_free = slab_get_element(pool, page, 0);
+   return true;
+}
+
+/**
+ * Allocate an object from the slab. Single-threaded (no mutex).
+ */
+void *
+slab_alloc_st(struct slab_mempool *pool)
+{
+   struct slab_element_header *element;
+
+   /* Allocate a new page. */
+   if (!pool->first_free &&
+   !slab_add_new_page(pool))
+  return NULL;
+
+   element = pool->first_free;
+   CHECK_MAGIC(element);
+   pool->first_free = element->next_free;
+   return &element[1];
+}
+
+/**
+ * Free an object allocated from the slab. Single-threaded (no mutex).
+ */
+void
+slab_free_st(struct slab_mempool *pool, void *ptr)
+{
+   struct slab_element_header *element =
+  ((struct slab_element_header*)ptr - 1);
+
+   CHECK_MAGIC(element);
+   element->next_free = pool->first_free;
+   pool->first_free = element;
+}
+
+/**
+ * Allocate an object from the slab. Thread-safe.
+ */
+void *

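The free-list scheme used by the patch above can be summarized in a compact standalone sketch. All tiny_* names are hypothetical and this is not the util/slab API: each object is preceded by a small header, pages are carved into fixed-size elements, and free elements are chained through their headers so that allocation and freeing are O(1) pointer swaps. Unlike the patch, the sketch keeps pages on a plain singly linked list and omits the debug magic and the thread-safe variants.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Header placed in front of every object (hypothetical name). */
struct tiny_elem {
   struct tiny_elem *next_free;
};

/* Header placed in front of every page of elements. */
struct tiny_page {
   struct tiny_page *next;
};

struct tiny_pool {
   size_t elem_size;             /* header + payload, pointer-aligned */
   unsigned num_elems;           /* elements per page, must be > 0 */
   struct tiny_elem *first_free; /* head of the free list */
   struct tiny_page *pages;      /* all pages, for destruction */
};

static void tiny_pool_init(struct tiny_pool *pool, size_t item_size,
                           unsigned num_elems)
{
   size_t align = sizeof(void *);

   assert(num_elems > 0);
   pool->elem_size =
      (sizeof(struct tiny_elem) + item_size + align - 1) & ~(align - 1);
   pool->num_elems = num_elems;
   pool->first_free = NULL;
   pool->pages = NULL;
}

/* Elements start right after the page header. */
static struct tiny_elem *tiny_get(struct tiny_pool *pool,
                                  struct tiny_page *page, unsigned i)
{
   return (struct tiny_elem *)((unsigned char *)&page[1] +
                               pool->elem_size * i);
}

static int tiny_add_page(struct tiny_pool *pool)
{
   struct tiny_page *page =
      malloc(sizeof(*page) + pool->num_elems * pool->elem_size);
   unsigned i;

   if (!page)
      return 0;
   page->next = pool->pages;
   pool->pages = page;

   /* Push every element of the new page onto the free list. */
   for (i = 0; i < pool->num_elems; i++) {
      struct tiny_elem *e = tiny_get(pool, page, i);
      e->next_free = pool->first_free;
      pool->first_free = e;
   }
   return 1;
}

/* Pop the head of the free list; grow by one page when it is empty. */
static void *tiny_alloc(struct tiny_pool *pool)
{
   struct tiny_elem *e;

   if (!pool->first_free && !tiny_add_page(pool))
      return NULL;
   e = pool->first_free;
   pool->first_free = e->next_free;
   return &e[1]; /* payload follows the header */
}

/* Step back from the payload to its header and push it on the list. */
static void tiny_free(struct tiny_pool *pool, void *ptr)
{
   struct tiny_elem *e = (struct tiny_elem *)ptr - 1;

   e->next_free = pool->first_free;
   pool->first_free = e;
}

static void tiny_pool_destroy(struct tiny_pool *pool)
{
   while (pool->pages) {
      struct tiny_page *next = pool->pages->next;
      free(pool->pages);
      pool->pages = next;
   }
   pool->first_free = NULL;
}
```

Because the free list is LIFO, an object freed and immediately reallocated comes back at the same address, which tends to keep recently used memory hot in the cache.
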
[Mesa-dev] [PATCH 19/20] gallium/radeon: set VPORT_ZMIN/MAX registers correctly

2016-08-29 Thread Marek Olšák
From: Marek Olšák 

Calculate depth ranges from viewport states and
pipe_rasterizer_state::clip_halfz.

The evergreend.h change is required to silence a warning.

This fixes this recently updated piglit: arb_depth_clamp/depth-clamp-range
---
 src/gallium/drivers/r600/evergreen_state.c|  1 +
 src/gallium/drivers/r600/evergreend.h |  4 +-
 src/gallium/drivers/r600/r600_hw_context.c|  1 +
 src/gallium/drivers/r600/r600_pipe.h  |  1 +
 src/gallium/drivers/r600/r600_state.c |  1 +
 src/gallium/drivers/r600/r600_state_common.c  |  2 +-
 src/gallium/drivers/radeon/r600_pipe_common.h |  5 +-
 src/gallium/drivers/radeon/r600_viewport.c| 73 ---
 src/gallium/drivers/radeon/r600d_common.h |  2 +
 src/gallium/drivers/radeonsi/si_hw_context.c  |  1 +
 src/gallium/drivers/radeonsi/si_state.c   |  3 +-
 src/gallium/drivers/radeonsi/si_state.h   |  1 +
 12 files changed, 82 insertions(+), 13 deletions(-)

diff --git a/src/gallium/drivers/r600/evergreen_state.c b/src/gallium/drivers/r600/evergreen_state.c
index 11c8161..5ca5453 100644
--- a/src/gallium/drivers/r600/evergreen_state.c
+++ b/src/gallium/drivers/r600/evergreen_state.c
@@ -466,20 +466,21 @@ static void *evergreen_create_rs_state(struct 
pipe_context *ctx,
float psize_min, psize_max;
struct r600_rasterizer_state *rs = CALLOC_STRUCT(r600_rasterizer_state);
 
if (!rs) {
return NULL;
}
 
r600_init_command_buffer(&rs->buffer, 30);
 
rs->scissor_enable = state->scissor;
+   rs->clip_halfz = state->clip_halfz;
rs->flatshade = state->flatshade;
rs->sprite_coord_enable = state->sprite_coord_enable;
rs->two_side = state->light_twoside;
rs->clip_plane_enable = state->clip_plane_enable;
rs->pa_sc_line_stipple = state->line_stipple_enable ?

S_028A0C_LINE_PATTERN(state->line_stipple_pattern) |

S_028A0C_REPEAT_COUNT(state->line_stipple_factor) : 0;
rs->pa_cl_clip_cntl =
S_028810_DX_CLIP_SPACE_DEF(state->clip_halfz) |
S_028810_ZCLIP_NEAR_DISABLE(!state->depth_clip) |
diff --git a/src/gallium/drivers/r600/evergreend.h b/src/gallium/drivers/r600/evergreend.h
index a81b6c5..3f33e42 100644
--- a/src/gallium/drivers/r600/evergreend.h
+++ b/src/gallium/drivers/r600/evergreend.h
@@ -1856,22 +1856,22 @@
 #define R_0283DC_SQ_VTX_SEMANTIC_23  0x000283DC
 #define R_0283E0_SQ_VTX_SEMANTIC_24  0x000283E0
 #define R_0283E4_SQ_VTX_SEMANTIC_25  0x000283E4
 #define R_0283E8_SQ_VTX_SEMANTIC_26  0x000283E8
 #define R_0283EC_SQ_VTX_SEMANTIC_27  0x000283EC
 #define R_0283F0_SQ_VTX_SEMANTIC_28  0x000283F0
 #define R_0283F4_SQ_VTX_SEMANTIC_29  0x000283F4
 #define R_0283F8_SQ_VTX_SEMANTIC_30  0x000283F8
 #define R_0283FC_SQ_VTX_SEMANTIC_31  0x000283FC
 #define R_0288F0_SQ_VTX_SEMANTIC_CLEAR   0x000288F0
-#define R_0282D0_PA_SC_VPORT_ZMIN_0  0x000282D0
-#define R_0282D4_PA_SC_VPORT_ZMAX_0  0x000282D4
+#define R_0282D0_PA_SC_VPORT_ZMIN_0 0x0282D0
+#define R_0282D4_PA_SC_VPORT_ZMAX_0 0x0282D4
 #define R_028400_VGT_MAX_VTX_INDX0x00028400
 #define R_028404_VGT_MIN_VTX_INDX0x00028404
 #define R_028408_VGT_INDX_OFFSET 0x00028408
 #define R_02840C_VGT_MULTI_PRIM_IB_RESET_INDX0x0002840C
 #define R_028414_CB_BLEND_RED0x00028414
 #define R_028418_CB_BLEND_GREEN  0x00028418
 #define R_02841C_CB_BLEND_BLUE   0x0002841C
 #define R_028420_CB_BLEND_ALPHA  0x00028420
 #define R_028438_SX_ALPHA_REF0x00028438
 #define R_02843C_PA_CL_VPORT_XSCALE_00x0002843C
diff --git a/src/gallium/drivers/r600/r600_hw_context.c b/src/gallium/drivers/r600/r600_hw_context.c
index 58ba09d..dc5ad75 100644
--- a/src/gallium/drivers/r600/r600_hw_context.c
+++ b/src/gallium/drivers/r600/r600_hw_context.c
@@ -305,20 +305,21 @@ void r600_begin_new_cs(struct r600_context *ctx)
r600_mark_atom_dirty(ctx, &ctx->db_misc_state.atom);
r600_mark_atom_dirty(ctx, &ctx->db_state.atom);
r600_mark_atom_dirty(ctx, &ctx->framebuffer.atom);
r600_mark_atom_dirty(ctx, &ctx->hw_shader_stages[R600_HW_STAGE_PS].atom);
r600_mark_atom_dirty(ctx, &ctx->poly_offset_state.atom);
r600_mark_atom_dirty(ctx, &ctx->vgt_state.atom);
r600_mark_atom_dirty(ctx, &ctx->sample_mask.atom);
ctx->b.scissors.dirty_mask = (1 << R600_MAX_VIEWPORTS) - 1;
r600_mark_atom_dirty(ctx, &ctx->b.scissors.atom);
ctx->b.viewports.dirty_mask = (1 << R600_MAX_VIEWPORTS) - 1;
+   

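For readers following the commit message above, here is a minimal sketch of how a depth range can be derived from a viewport's Z scale and translate plus the clip_halfz flag. The function name and parameters are hypothetical, and the r600_viewport.c hunk is cut off in this excerpt, so this is not claimed to match the patch line for line. It relies only on the viewport transform z_window = scale * z_ndc + translate, with z_ndc in [0, 1] when clip_halfz is set and in [-1, 1] otherwise.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: compute the window-space depth range covered by
 * a viewport. scale_z and translate_z correspond to the third components
 * of the viewport scale and translate vectors. */
static void sketch_viewport_zmin_zmax(float scale_z, float translate_z,
                                      bool clip_halfz,
                                      float *zmin, float *zmax)
{
   /* Near plane: z_ndc = 0 with half-Z clipping, z_ndc = -1 otherwise. */
   float z_near = clip_halfz ? translate_z : translate_z - scale_z;
   /* Far plane: z_ndc = +1 in both conventions. */
   float z_far = translate_z + scale_z;

   /* A reversed depth range (negative scale) swaps near and far. */
   *zmin = z_near < z_far ? z_near : z_far;
   *zmax = z_near < z_far ? z_far : z_near;
}
```

For the default GL depth range (scale 0.5, translate 0.5, clip_halfz off) this yields [0, 1], and the same interval results for a half-Z viewport with scale 1 and translate 0.
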
[Mesa-dev] [PATCH 10/20] radeonsi: fix Gather4 with integer formats

2016-08-29 Thread Marek Olšák
From: Marek Olšák 

The closed compiler does the same thing.

This fixes: GL45-CTS.texture_gather.*-int-* (18 tests)
---
 src/gallium/drivers/radeonsi/si_shader.c | 99 +++-
 1 file changed, 96 insertions(+), 3 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi/si_shader.c
index f8884ef..90c9b1f 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -4717,30 +4717,100 @@ static void tex_fetch_args(
gather_comp = CLAMP(gather_comp, 0, 3);
}
 
dmask = 1 << gather_comp;
}
 
set_tex_fetch_args(ctx, emit_data, opcode, target, res_ptr,
   samp_ptr, address, count, dmask);
 }
 
+/* Gather4 should follow the same rules as bilinear filtering, but the hardware
+ * incorrectly forces nearest filtering if the texture format is integer.
+ * The only effect it has on Gather4, which always returns 4 texels for
+ * bilinear filtering, is that the final coordinates are off by 0.5 of
+ * the texel size.
+ *
+ * The workaround is to subtract 0.5 from the unnormalized coordinates,
+ * or (0.5 / size) from the normalized coordinates.
+ */
+static void si_lower_gather4_integer(struct si_shader_context *ctx,
+struct lp_build_emit_data *emit_data,
+const char *intr_name,
+unsigned coord_vgpr_index)
+{
+   LLVMBuilderRef builder = ctx->radeon_bld.gallivm.builder;
+   LLVMValueRef coord = emit_data->args[0];
+   LLVMValueRef half_texel[2];
+   int c;
+
+   if (emit_data->inst->Texture.Texture == TGSI_TEXTURE_RECT ||
+   emit_data->inst->Texture.Texture == TGSI_TEXTURE_SHADOWRECT) {
+   half_texel[0] = half_texel[1] = LLVMConstReal(ctx->f32, -0.5);
+   } else {
+   struct tgsi_full_instruction txq_inst = {};
+   struct lp_build_emit_data txq_emit_data = {};
+
+   /* Query the texture size. */
+   txq_inst.Texture.Texture = emit_data->inst->Texture.Texture;
+   txq_emit_data.inst = &txq_inst;
+   txq_emit_data.dst_type = ctx->v4i32;
+   set_tex_fetch_args(ctx, &txq_emit_data, TGSI_OPCODE_TXQ,
+  txq_inst.Texture.Texture,
+  emit_data->args[1], NULL,
+  &ctx->radeon_bld.soa.bld_base.uint_bld.zero,
+  1, 0xf);
+   txq_emit(NULL, &ctx->radeon_bld.soa.bld_base, &txq_emit_data);
+
+   /* Compute -0.5 / size. */
+   for (c = 0; c < 2; c++) {
+   half_texel[c] =
+   LLVMBuildExtractElement(builder, 
txq_emit_data.output[0],
+   LLVMConstInt(ctx->i32, 
c, 0), "");
+   half_texel[c] = LLVMBuildUIToFP(builder, half_texel[c], 
ctx->f32, "");
+   half_texel[c] =
+   lp_build_emit_llvm_unary(&ctx->radeon_bld.soa.bld_base,
+TGSI_OPCODE_RCP, half_texel[c]);
+   half_texel[c] = LLVMBuildFMul(builder, half_texel[c],
+ LLVMConstReal(ctx->f32, 
-0.5), "");
+   }
+   }
+
+   for (c = 0; c < 2; c++) {
+   LLVMValueRef tmp;
+   LLVMValueRef index = LLVMConstInt(ctx->i32, coord_vgpr_index + 
c, 0);
+
+   tmp = LLVMBuildExtractElement(builder, coord, index, "");
+   tmp = LLVMBuildBitCast(builder, tmp, ctx->f32, "");
+   tmp = LLVMBuildFAdd(builder, tmp, half_texel[c], "");
+   tmp = LLVMBuildBitCast(builder, tmp, ctx->i32, "");
+   coord = LLVMBuildInsertElement(builder, coord, tmp, index, "");
+   }
+
+   emit_data->args[0] = coord;
+   emit_data->output[emit_data->chan] =
+   lp_build_intrinsic(builder, intr_name, emit_data->dst_type,
+  emit_data->args, emit_data->arg_count,
+  LLVMReadNoneAttribute);
+}
+
 static void build_tex_intrinsic(const struct lp_build_tgsi_action *action,
struct lp_build_tgsi_context *bld_base,
struct lp_build_emit_data *emit_data)
 {
struct si_shader_context *ctx = si_shader_context(bld_base);
struct lp_build_context *base = &bld_base->base;
-   unsigned opcode = emit_data->inst->Instruction.Opcode;
-   unsigned target = emit_data->inst->Texture.Texture;
+   const struct tgsi_full_instruction *inst = emit_data->inst;
+   unsigned opcode = inst->Instruction.Opcode;
+   unsigned target = inst->Texture.Texture;
char intr_name[127];
-   

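The coordinate adjustment described in the comment above si_lower_gather4_integer can be written out as plain scalar arithmetic. The sketch below is only an illustration with hypothetical names; the patch itself builds the equivalent LLVM IR and queries the texture size with TXQ. For rectangle textures the coordinate is unnormalized and shifts by -0.5 texels; otherwise it is normalized and shifts by -0.5 / size.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical scalar model of the Gather4 integer-format workaround.
 * coord:        one texture coordinate component (x or y)
 * size:         texture size along that axis in texels, must be > 0
 * unnormalized: true for RECT-style targets with texel coordinates */
static float sketch_gather4_int_coord(float coord, unsigned size,
                                      bool unnormalized)
{
   float half_texel;

   assert(size > 0);
   /* -0.5 texels, expressed in whichever space the coordinate uses. */
   half_texel = unnormalized ? -0.5f : -0.5f * (1.0f / (float)size);
   return coord + half_texel;
}
```

For example, a normalized coordinate of 0.5 on a 4-texel-wide texture moves to 0.375, and an unnormalized coordinate of 10.0 moves to 9.5.
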
[Mesa-dev] [PATCH 18/20] gallium/radeon: unify viewport emission code

2016-08-29 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/radeon/r600_viewport.c | 30 --
 1 file changed, 16 insertions(+), 14 deletions(-)

diff --git a/src/gallium/drivers/radeon/r600_viewport.c b/src/gallium/drivers/radeon/r600_viewport.c
index 5c998c8..2d68783 100644
--- a/src/gallium/drivers/radeon/r600_viewport.c
+++ b/src/gallium/drivers/radeon/r600_viewport.c
@@ -269,57 +269,59 @@ static void r600_set_viewport_states(struct pipe_context 
*ctx,
r600_get_scissor_from_viewport(rctx, [i],
   
&rctx->viewports.as_scissor[index]);
}
 
rctx->viewports.dirty_mask |= ((1 << num_viewports) - 1) << start_slot;
rctx->scissors.dirty_mask |= ((1 << num_viewports) - 1) << start_slot;
rctx->set_atom_dirty(rctx, &rctx->viewports.atom, true);
rctx->set_atom_dirty(rctx, &rctx->scissors.atom, true);
 }
 
+static void r600_emit_one_viewport(struct r600_common_context *rctx,
+  struct pipe_viewport_state *state)
+{
+   struct radeon_winsys_cs *cs = rctx->gfx.cs;
+
+   radeon_emit(cs, fui(state->scale[0]));
+   radeon_emit(cs, fui(state->translate[0]));
+   radeon_emit(cs, fui(state->scale[1]));
+   radeon_emit(cs, fui(state->translate[1]));
+   radeon_emit(cs, fui(state->scale[2]));
+   radeon_emit(cs, fui(state->translate[2]));
+}
+
 static void r600_emit_viewports(struct r600_common_context *rctx, struct 
r600_atom *atom)
 {
struct radeon_winsys_cs *cs = rctx->gfx.cs;
struct pipe_viewport_state *states = rctx->viewports.states;
unsigned mask = rctx->viewports.dirty_mask;
 
/* The simple case: Only 1 viewport is active. */
if (!rctx->vs_writes_viewport_index) {
if (!(mask & 1))
return;
 
radeon_set_context_reg_seq(cs, R_02843C_PA_CL_VPORT_XSCALE, 6);
-   radeon_emit(cs, fui(states[0].scale[0]));
-   radeon_emit(cs, fui(states[0].translate[0]));
-   radeon_emit(cs, fui(states[0].scale[1]));
-   radeon_emit(cs, fui(states[0].translate[1]));
-   radeon_emit(cs, fui(states[0].scale[2]));
-   radeon_emit(cs, fui(states[0].translate[2]));
+   r600_emit_one_viewport(rctx, &states[0]);
rctx->viewports.dirty_mask &= ~1; /* clear one bit */
return;
}
 
while (mask) {
int start, count, i;
 
u_bit_scan_consecutive_range(&mask, &start, &count);
 
radeon_set_context_reg_seq(cs, R_02843C_PA_CL_VPORT_XSCALE +
   start * 4 * 6, count * 6);
-   for (i = start; i < start+count; i++) {
-   radeon_emit(cs, fui(states[i].scale[0]));
-   radeon_emit(cs, fui(states[i].translate[0]));
-   radeon_emit(cs, fui(states[i].scale[1]));
-   radeon_emit(cs, fui(states[i].translate[1]));
-   radeon_emit(cs, fui(states[i].scale[2]));
-   radeon_emit(cs, fui(states[i].translate[2]));
-   }
+   for (i = start; i < start+count; i++)
+   r600_emit_one_viewport(rctx, &states[i]);
}
rctx->viewports.dirty_mask = 0;
 }
 
 void r600_set_scissor_enable(struct r600_common_context *rctx, bool enable)
 {
if (rctx->scissor_enabled != enable) {
rctx->scissor_enabled = enable;
rctx->scissors.dirty_mask = (1 << R600_MAX_VIEWPORTS) - 1;
rctx->set_atom_dirty(rctx, &rctx->scissors.atom, true);
-- 
2.7.4



[Mesa-dev] [PATCH 06/20] radeonsi: always use the same function signature for llvm.SI.export

2016-08-29 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_shader.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi/si_shader.c
index a5b566e..2863faa 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -3049,24 +3049,24 @@ static void si_export_null(struct lp_build_tgsi_context 
*bld_base)
struct si_shader_context *ctx = si_shader_context(bld_base);
struct lp_build_context *base = &bld_base->base;
struct lp_build_context *uint = &bld_base->uint_bld;
LLVMValueRef args[9];
 
args[0] = lp_build_const_int32(base->gallivm, 0x0); /* enabled channels 
*/
args[1] = uint->one; /* whether the EXEC mask is valid */
args[2] = uint->one; /* DONE bit */
args[3] = lp_build_const_int32(base->gallivm, V_008DFC_SQ_EXP_NULL);
args[4] = uint->zero; /* COMPR flag (0 = 32-bit export) */
-   args[5] = uint->undef; /* R */
-   args[6] = uint->undef; /* G */
-   args[7] = uint->undef; /* B */
-   args[8] = uint->undef; /* A */
+   args[5] = base->undef; /* R */
+   args[6] = base->undef; /* G */
+   args[7] = base->undef; /* B */
+   args[8] = base->undef; /* A */
 
lp_build_intrinsic(base->gallivm->builder, "llvm.SI.export",
   ctx->voidt, args, 9, 0);
 }
 
 static void si_llvm_emit_fs_epilogue(struct lp_build_tgsi_context *bld_base)
 {
struct si_shader_context *ctx = si_shader_context(bld_base);
struct si_shader *shader = ctx->shader;
struct lp_build_context *base = &bld_base->base;
-- 
2.7.4



[Mesa-dev] [PATCH 14/20] radeonsi: add HUD queries for counting VS/PS/CS partial flushes

2016-08-29 Thread Marek Olšák
From: Marek Olšák 

---
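Reader note: a minimal standalone sketch of how these software queries appear to work, going by the hunks below. Begin snapshots a context counter, end snapshots it again, and the reported value is assumed here to be the difference. The struct and function names are hypothetical stand-ins, not the driver's.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the driver's context and query objects. */
struct sketch_ctx {
	unsigned num_vs_flushes; /* incremented on every VS partial flush */
};

struct sketch_query {
	uint64_t begin_result;
	uint64_t end_result;
};

/* Snapshot the counter when the query begins. */
static void sketch_query_begin(struct sketch_query *q, const struct sketch_ctx *c)
{
	assert(q && c);
	q->begin_result = c->num_vs_flushes;
}

/* Snapshot the counter again when the query ends. */
static void sketch_query_end(struct sketch_query *q, const struct sketch_ctx *c)
{
	assert(q && c);
	q->end_result = c->num_vs_flushes;
}

/* Assumption: the result is the number of flushes between begin and end. */
static uint64_t sketch_query_result(const struct sketch_query *q)
{
	assert(q);
	return q->end_result - q->begin_result;
}
```

The PS and CS counters would follow the same pattern with their own fields.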
 src/gallium/drivers/radeon/r600_pipe_common.h |  3 +++
 src/gallium/drivers/radeon/r600_query.c   | 21 +
 src/gallium/drivers/radeon/r600_query.h   |  3 +++
 src/gallium/drivers/radeonsi/si_state_draw.c  |  8 
 4 files changed, 35 insertions(+)

diff --git a/src/gallium/drivers/radeon/r600_pipe_common.h 
b/src/gallium/drivers/radeon/r600_pipe_common.h
index 624dea3..d821eaa 100644
--- a/src/gallium/drivers/radeon/r600_pipe_common.h
+++ b/src/gallium/drivers/radeon/r600_pipe_common.h
@@ -551,20 +551,23 @@ struct r600_common_context {
unsignednum_cs_dw_queries_suspend;
/* Additional hardware info. */
unsignedbackend_mask;
unsignedmax_db; /* for OQ */
/* Misc stats. */
unsignednum_draw_calls;
unsignednum_spill_draw_calls;
unsignednum_compute_calls;
unsignednum_spill_compute_calls;
unsignednum_dma_calls;
+   unsignednum_vs_flushes;
+   unsignednum_ps_flushes;
+   unsignednum_cs_flushes;
uint64_tnum_alloc_tex_transfer_bytes;
unsignedlast_tex_ps_draw_ratio; /* for query */
 
/* Render condition. */
struct r600_atomrender_cond_atom;
struct pipe_query   *render_cond;
unsignedrender_cond_mode;
boolrender_cond_invert;
boolrender_cond_force_off; /* for u_blitter 
*/
 
diff --git a/src/gallium/drivers/radeon/r600_query.c 
b/src/gallium/drivers/radeon/r600_query.c
index 29ad249..2c3d530 100644
--- a/src/gallium/drivers/radeon/r600_query.c
+++ b/src/gallium/drivers/radeon/r600_query.c
@@ -83,20 +83,29 @@ static bool r600_query_sw_begin(struct r600_common_context 
*rctx,
break;
case R600_QUERY_COMPUTE_CALLS:
query->begin_result = rctx->num_compute_calls;
break;
case R600_QUERY_SPILL_COMPUTE_CALLS:
query->begin_result = rctx->num_spill_compute_calls;
break;
case R600_QUERY_DMA_CALLS:
query->begin_result = rctx->num_dma_calls;
break;
+   case R600_QUERY_NUM_VS_FLUSHES:
+   query->begin_result = rctx->num_vs_flushes;
+   break;
+   case R600_QUERY_NUM_PS_FLUSHES:
+   query->begin_result = rctx->num_ps_flushes;
+   break;
+   case R600_QUERY_NUM_CS_FLUSHES:
+   query->begin_result = rctx->num_cs_flushes;
+   break;
case R600_QUERY_REQUESTED_VRAM:
case R600_QUERY_REQUESTED_GTT:
case R600_QUERY_MAPPED_VRAM:
case R600_QUERY_MAPPED_GTT:
case R600_QUERY_VRAM_USAGE:
case R600_QUERY_GTT_USAGE:
case R600_QUERY_GPU_TEMPERATURE:
case R600_QUERY_CURRENT_GPU_SCLK:
case R600_QUERY_CURRENT_GPU_MCLK:
case R600_QUERY_BACK_BUFFER_PS_DRAW_RATIO:
@@ -151,20 +160,29 @@ static bool r600_query_sw_end(struct r600_common_context 
*rctx,
break;
case R600_QUERY_COMPUTE_CALLS:
query->end_result = rctx->num_compute_calls;
break;
case R600_QUERY_SPILL_COMPUTE_CALLS:
query->end_result = rctx->num_spill_compute_calls;
break;
case R600_QUERY_DMA_CALLS:
query->end_result = rctx->num_dma_calls;
break;
+   case R600_QUERY_NUM_VS_FLUSHES:
+   query->end_result = rctx->num_vs_flushes;
+   break;
+   case R600_QUERY_NUM_PS_FLUSHES:
+   query->end_result = rctx->num_ps_flushes;
+   break;
+   case R600_QUERY_NUM_CS_FLUSHES:
+   query->end_result = rctx->num_cs_flushes;
+   break;
case R600_QUERY_REQUESTED_VRAM:
case R600_QUERY_REQUESTED_GTT:
case R600_QUERY_MAPPED_VRAM:
case R600_QUERY_MAPPED_GTT:
case R600_QUERY_VRAM_USAGE:
case R600_QUERY_GTT_USAGE:
case R600_QUERY_GPU_TEMPERATURE:
case R600_QUERY_CURRENT_GPU_SCLK:
case R600_QUERY_CURRENT_GPU_MCLK:
case R600_QUERY_BUFFER_WAIT_TIME:
@@ -1175,20 +1193,23 @@ err:
XFULL(name_, query_type_, type_, result_type_, 
R600_QUERY_GROUP_##group_)
 
 static struct pipe_driver_query_info r600_driver_query_list[] = {
X("num-compilations",   NUM_COMPILATIONS,   UINT64, 
CUMULATIVE),
X("num-shaders-created",NUM_SHADERS_CREATED,UINT64, 
CUMULATIVE),
X("draw-calls", DRAW_CALLS, UINT64, 
AVERAGE),

[Mesa-dev] [PATCH 16/20] radeonsi: fix variable naming in si_emit_cache_flush

2016-08-29 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_state_draw.c | 62 ++--
 1 file changed, 31 insertions(+), 31 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c 
b/src/gallium/drivers/radeonsi/si_state_draw.c
index ddcb904..0a91291 100644
--- a/src/gallium/drivers/radeonsi/si_state_draw.c
+++ b/src/gallium/drivers/radeonsi/si_state_draw.c
@@ -689,158 +689,158 @@ static void si_emit_draw_packets(struct si_context 
*sctx,
radeon_emit(cs, V_0287F0_DI_SRC_SEL_DMA);
} else {
radeon_emit(cs, PKT3(PKT3_DRAW_INDEX_AUTO, 1, 
render_cond_bit));
radeon_emit(cs, info->count);
radeon_emit(cs, V_0287F0_DI_SRC_SEL_AUTO_INDEX |

S_0287F0_USE_OPAQUE(!!info->count_from_stream_output));
}
}
 }
 
-void si_emit_cache_flush(struct si_context *si_ctx, struct r600_atom *atom)
+void si_emit_cache_flush(struct si_context *sctx, struct r600_atom *atom)
 {
-   struct r600_common_context *sctx = &si_ctx->b;
-   struct radeon_winsys_cs *cs = sctx->gfx.cs;
+   struct r600_common_context *rctx = &sctx->b;
+   struct radeon_winsys_cs *cs = rctx->gfx.cs;
uint32_t cp_coher_cntl = 0;
 
/* SI has a bug that it always flushes ICACHE and KCACHE if either
 * bit is set. An alternative way is to write SQC_CACHES, but that
 * doesn't seem to work reliably. Since the bug doesn't affect
 * correctness (it only does more work than necessary) and
 * the performance impact is likely negligible, there is no plan
 * to add a workaround for it.
 */
 
-   if (sctx->flags & SI_CONTEXT_INV_ICACHE)
+   if (rctx->flags & SI_CONTEXT_INV_ICACHE)
cp_coher_cntl |= S_0085F0_SH_ICACHE_ACTION_ENA(1);
-   if (sctx->flags & SI_CONTEXT_INV_SMEM_L1)
+   if (rctx->flags & SI_CONTEXT_INV_SMEM_L1)
cp_coher_cntl |= S_0085F0_SH_KCACHE_ACTION_ENA(1);
 
-   if (sctx->flags & SI_CONTEXT_INV_VMEM_L1)
+   if (rctx->flags & SI_CONTEXT_INV_VMEM_L1)
cp_coher_cntl |= S_0085F0_TCL1_ACTION_ENA(1);
-   if (sctx->flags & SI_CONTEXT_INV_GLOBAL_L2) {
+   if (rctx->flags & SI_CONTEXT_INV_GLOBAL_L2) {
cp_coher_cntl |= S_0085F0_TC_ACTION_ENA(1);
 
-   if (sctx->chip_class >= VI)
+   if (rctx->chip_class >= VI)
cp_coher_cntl |= S_0301F0_TC_WB_ACTION_ENA(1);
}
 
-   if (sctx->flags & SI_CONTEXT_FLUSH_AND_INV_CB) {
+   if (rctx->flags & SI_CONTEXT_FLUSH_AND_INV_CB) {
cp_coher_cntl |= S_0085F0_CB_ACTION_ENA(1) |
 S_0085F0_CB0_DEST_BASE_ENA(1) |
 S_0085F0_CB1_DEST_BASE_ENA(1) |
 S_0085F0_CB2_DEST_BASE_ENA(1) |
 S_0085F0_CB3_DEST_BASE_ENA(1) |
 S_0085F0_CB4_DEST_BASE_ENA(1) |
 S_0085F0_CB5_DEST_BASE_ENA(1) |
 S_0085F0_CB6_DEST_BASE_ENA(1) |
 S_0085F0_CB7_DEST_BASE_ENA(1);
 
/* Necessary for DCC */
-   if (sctx->chip_class >= VI) {
+   if (rctx->chip_class >= VI) {
radeon_emit(cs, PKT3(PKT3_EVENT_WRITE_EOP, 4, 0));
radeon_emit(cs, 
EVENT_TYPE(V_028A90_FLUSH_AND_INV_CB_DATA_TS) |
EVENT_INDEX(5));
radeon_emit(cs, 0);
radeon_emit(cs, 0);
radeon_emit(cs, 0);
radeon_emit(cs, 0);
}
}
-   if (sctx->flags & SI_CONTEXT_FLUSH_AND_INV_DB) {
+   if (rctx->flags & SI_CONTEXT_FLUSH_AND_INV_DB) {
cp_coher_cntl |= S_0085F0_DB_ACTION_ENA(1) |
 S_0085F0_DB_DEST_BASE_ENA(1);
}
 
-   if (sctx->flags & SI_CONTEXT_FLUSH_AND_INV_CB_META) {
+   if (rctx->flags & SI_CONTEXT_FLUSH_AND_INV_CB_META) {
radeon_emit(cs, PKT3(PKT3_EVENT_WRITE, 0, 0));
radeon_emit(cs, EVENT_TYPE(V_028A90_FLUSH_AND_INV_CB_META) | 
EVENT_INDEX(0));
/* needed for wait for idle in SURFACE_SYNC */
-   assert(sctx->flags & SI_CONTEXT_FLUSH_AND_INV_CB);
+   assert(rctx->flags & SI_CONTEXT_FLUSH_AND_INV_CB);
}
-   if (sctx->flags & SI_CONTEXT_FLUSH_AND_INV_DB_META) {
+   if (rctx->flags & SI_CONTEXT_FLUSH_AND_INV_DB_META) {
radeon_emit(cs, PKT3(PKT3_EVENT_WRITE, 0, 0));
radeon_emit(cs, EVENT_TYPE(V_028A90_FLUSH_AND_INV_DB_META) | 
EVENT_INDEX(0));
/* needed for wait for idle in SURFACE_SYNC */
-   assert(sctx->flags & SI_CONTEXT_FLUSH_AND_INV_DB);
+   

[Mesa-dev] [PATCH 17/20] radeonsi: also do VS_PARTIAL_FLUSH before updating VGT ring pointers

2016-08-29 Thread Marek Olšák
From: Marek Olšák 

ported from Vulkan
---
 src/gallium/drivers/radeonsi/si_state_shaders.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/src/gallium/drivers/radeonsi/si_state_shaders.c 
b/src/gallium/drivers/radeonsi/si_state_shaders.c
index 394afaa..b4f19fe 100644
--- a/src/gallium/drivers/radeonsi/si_state_shaders.c
+++ b/src/gallium/drivers/radeonsi/si_state_shaders.c
@@ -1602,20 +1602,26 @@ static void si_emit_spi_map(struct si_context *sctx, 
struct r600_atom *atom)
 }
 
 /**
  * Writing CONFIG or UCONFIG VGT registers requires VGT_FLUSH before that.
  */
 static void si_init_config_add_vgt_flush(struct si_context *sctx)
 {
if (sctx->init_config_has_vgt_flush)
return;
 
+   /* Done by Vulkan before VGT_FLUSH. */
+   si_pm4_cmd_begin(sctx->init_config, PKT3_EVENT_WRITE);
+   si_pm4_cmd_add(sctx->init_config,
+  EVENT_TYPE(V_028A90_VS_PARTIAL_FLUSH) | EVENT_INDEX(4));
+   si_pm4_cmd_end(sctx->init_config, false);
+
/* VGT_FLUSH is required even if VGT is idle. It resets VGT pointers. */
si_pm4_cmd_begin(sctx->init_config, PKT3_EVENT_WRITE);
si_pm4_cmd_add(sctx->init_config, EVENT_TYPE(V_028A90_VGT_FLUSH) | 
EVENT_INDEX(0));
si_pm4_cmd_end(sctx->init_config, false);
sctx->init_config_has_vgt_flush = true;
 }
 
 /* Initialize state related to ESGS / GSVS ring buffers */
 static bool si_update_gs_ring_buffers(struct si_context *sctx)
 {
-- 
2.7.4



[Mesa-dev] [PATCH 11/20] radeonsi: fix texture format reinterpretation with DCC

2016-08-29 Thread Marek Olšák
From: Marek Olšák 

DCC is limited in how texture formats can be reinterpreted using texture
views. If we get a view format that is incompatible with the initial
texture format with respect to DCC, disable DCC.

There is a new piglit which tests all format combinations.
What works and what doesn't was deduced by looking at the piglit failures.
---
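Reader note: a standalone sketch of the classification idea in this patch, using simplified, hypothetical descriptor types instead of util_format_description. It mirrors the visible vi_get_dcc_channel_type logic and then treats two formats as reinterpretable only if they land in the same category. The body of vi_dcc_formats_compatible is not visible in this excerpt, so the final check is an assumption, and the swizzle comparison mentioned in the patch comment is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for util_format_description. */
struct sketch_channel {
	bool is_void;
	bool is_float;
	unsigned size; /* channel size in bits */
};

struct sketch_desc {
	unsigned nr_channels;
	struct sketch_channel channel[4];
};

enum sketch_dcc_type {
	SKETCH_ANY32,
	SKETCH_INT16,
	SKETCH_FLOAT16,
	SKETCH_ANY_10_10_10_2,
	SKETCH_ANY8,
	SKETCH_INCOMPATIBLE,
};

/* Classify a format by its first non-void channel. */
static enum sketch_dcc_type sketch_classify(const struct sketch_desc *desc)
{
	unsigned i;

	assert(desc && desc->nr_channels <= 4);

	for (i = 0; i < desc->nr_channels; i++)
		if (!desc->channel[i].is_void)
			break;
	if (i == desc->nr_channels)
		return SKETCH_INCOMPATIBLE;

	switch (desc->channel[i].size) {
	case 32:
		return desc->nr_channels == 4 ? SKETCH_INCOMPATIBLE : SKETCH_ANY32;
	case 16:
		return desc->channel[i].is_float ? SKETCH_FLOAT16 : SKETCH_INT16;
	case 10:
		return SKETCH_ANY_10_10_10_2;
	case 8:
		return SKETCH_ANY8;
	default:
		return SKETCH_INCOMPATIBLE;
	}
}

/* Assumption: same category and neither side incompatible. */
static bool sketch_same_dcc_category(const struct sketch_desc *a,
				     const struct sketch_desc *b)
{
	enum sketch_dcc_type ta = sketch_classify(a);
	enum sketch_dcc_type tb = sketch_classify(b);

	return ta != SKETCH_INCOMPATIBLE && ta == tb;
}
```

With this sketch, a 16-bit float view of a 16-bit integer texture falls into different categories, which is the case where the patch disables DCC.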
 src/gallium/drivers/radeon/r600_pipe_common.h |  6 ++
 src/gallium/drivers/radeon/r600_texture.c | 96 +++
 src/gallium/drivers/radeonsi/si_blit.c|  8 +++
 src/gallium/drivers/radeonsi/si_descriptors.c |  3 +-
 src/gallium/drivers/radeonsi/si_state.c   |  4 ++
 5 files changed, 116 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeon/r600_pipe_common.h 
b/src/gallium/drivers/radeon/r600_pipe_common.h
index 1924535..624dea3 100644
--- a/src/gallium/drivers/radeon/r600_pipe_common.h
+++ b/src/gallium/drivers/radeon/r600_pipe_common.h
@@ -750,20 +750,26 @@ void r600_texture_get_fmask_info(struct 
r600_common_screen *rscreen,
 struct r600_fmask_info *out);
 void r600_texture_get_cmask_info(struct r600_common_screen *rscreen,
 struct r600_texture *rtex,
 struct r600_cmask_info *out);
 bool r600_init_flushed_depth_texture(struct pipe_context *ctx,
 struct pipe_resource *texture,
 struct r600_texture **staging);
 void r600_print_texture_info(struct r600_texture *rtex, FILE *f);
 struct pipe_resource *r600_texture_create(struct pipe_screen *screen,
const struct pipe_resource *templ);
+bool vi_dcc_formats_compatible(enum pipe_format format1,
+  enum pipe_format format2);
+void vi_dcc_disable_if_incompatible_format(struct r600_common_context *rctx,
+  struct pipe_resource *tex,
+  unsigned level,
+  enum pipe_format view_format);
 struct pipe_surface *r600_create_surface_custom(struct pipe_context *pipe,
struct pipe_resource *texture,
const struct pipe_surface 
*templ,
unsigned width, unsigned 
height);
 unsigned r600_translate_colorswap(enum pipe_format format, bool 
do_endian_swap);
 void vi_separate_dcc_start_query(struct pipe_context *ctx,
 struct r600_texture *tex);
 void vi_separate_dcc_stop_query(struct pipe_context *ctx,
struct r600_texture *tex);
 void vi_separate_dcc_process_and_reset_stats(struct pipe_context *ctx,
diff --git a/src/gallium/drivers/radeon/r600_texture.c 
b/src/gallium/drivers/radeon/r600_texture.c
index 7bdceb1..2f04019 100644
--- a/src/gallium/drivers/radeon/r600_texture.c
+++ b/src/gallium/drivers/radeon/r600_texture.c
@@ -1659,42 +1659,138 @@ static void r600_texture_transfer_unmap(struct 
pipe_context *ctx,
 
 static const struct u_resource_vtbl r600_texture_vtbl =
 {
NULL,   /* get_handle */
r600_texture_destroy,   /* resource_destroy */
r600_texture_transfer_map,  /* transfer_map */
u_default_transfer_flush_region, /* transfer_flush_region */
r600_texture_transfer_unmap,/* transfer_unmap */
 };
 
+/* DCC channel type categories within which formats can be reinterpreted
+ * while keeping the same DCC encoding. The swizzle must also match. */
+enum dcc_channel_type {
+   dcc_channel_any32,
+   dcc_channel_int16,
+   dcc_channel_float16,
+   dcc_channel_any_10_10_10_2,
+   dcc_channel_any8,
+   dcc_channel_incompatible,
+};
+
+/* Return the type of DCC encoding. */
+static enum dcc_channel_type
+vi_get_dcc_channel_type(const struct util_format_description *desc)
+{
+   int i;
+
+   /* Find the first non-void channel. */
+   for (i = 0; i < desc->nr_channels; i++)
+   if (desc->channel[i].type != UTIL_FORMAT_TYPE_VOID)
+   break;
+   if (i == desc->nr_channels)
+   return dcc_channel_incompatible;
+
+   switch (desc->channel[i].size) {
+   case 32:
+   if (desc->nr_channels == 4)
+   return dcc_channel_incompatible;
+   else
+   return dcc_channel_any32;
+   case 16:
+   if (desc->channel[i].type == UTIL_FORMAT_TYPE_FLOAT)
+   return dcc_channel_float16;
+   else
+   return dcc_channel_int16;
+   case 10:
+   return dcc_channel_any_10_10_10_2;
+   case 8:
+   return dcc_channel_any8;
+   default:
+   return dcc_channel_incompatible;
+   }
+}
+
+/* Return if it's 

[Mesa-dev] [PATCH 08/20] radeonsi: fix gl_PatchVerticesIn for tessellation evaluation shader

2016-08-29 Thread Marek Olšák
From: Marek Olšák 

This fixes:
GL45-CTS.tessellation_shader.tessellation_control_to_tessellation_evaluation.gl_PatchVerticesIn
---
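Reader note: the fix below reads the patch vertex count from a different packed parameter depending on the shader stage. A minimal sketch of the bitfield extraction, assuming the last two arguments of unpack_param are the right shift and the bit width (so 26, 6 for the TCS layout and 9, 7 for the off-chip layout). The helper name is hypothetical and works on a plain integer rather than an LLVM value.

```c
#include <assert.h>
#include <stdint.h>

/* Extract `bitwidth` bits starting at bit `rshift` from a packed word. */
static uint32_t sketch_unpack_bits(uint32_t value, unsigned rshift,
				   unsigned bitwidth)
{
	assert(bitwidth > 0 && rshift + bitwidth <= 32);

	value >>= rshift;
	if (bitwidth < 32)
		value &= (1u << bitwidth) - 1;
	return value;
}
```

For example, sketch_unpack_bits(word, 26, 6) returns the top six bits of a 32-bit word.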
 src/gallium/drivers/radeonsi/si_shader.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index 2863faa..6eca5cf 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -1723,21 +1723,26 @@ static void declare_system_value(
if 
(ctx->shader->selector->info.properties[TGSI_PROPERTY_TES_PRIM_MODE] ==
PIPE_PRIM_TRIANGLES)
coord[2] = lp_build_sub(bld, bld->one,
lp_build_add(bld, coord[0], 
coord[1]));
 
value = lp_build_gather_values(gallivm, coord, 4);
break;
}
 
case TGSI_SEMANTIC_VERTICESIN:
-   value = unpack_param(ctx, SI_PARAM_TCS_OUT_LAYOUT, 26, 6);
+   if (ctx->type == PIPE_SHADER_TESS_CTRL)
+   value = unpack_param(ctx, SI_PARAM_TCS_OUT_LAYOUT, 26, 
6);
+   else if (ctx->type == PIPE_SHADER_TESS_EVAL)
+   value = unpack_param(ctx, SI_PARAM_TCS_OFFCHIP_LAYOUT, 
9, 7);
+   else
+   assert(!"invalid shader stage for 
TGSI_SEMANTIC_VERTICESIN");
break;
 
case TGSI_SEMANTIC_TESSINNER:
case TGSI_SEMANTIC_TESSOUTER:
{
LLVMValueRef rw_buffers, buffer, base, addr;
int param = si_shader_io_get_unique_index(decl->Semantic.Name, 
0);
 
rw_buffers = LLVMGetParam(ctx->radeon_bld.main_fn,
SI_PARAM_RW_BUFFERS);
-- 
2.7.4



[Mesa-dev] [PATCH 04/20] gallium/radeon: also eliminate DCC fast clear in resource_get_handle

2016-08-29 Thread Marek Olšák
From: Marek Olšák 

Just do what the comment says: eliminate fast clear for both CMASK and DCC.
---
 src/gallium/drivers/radeon/r600_texture.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/radeon/r600_texture.c 
b/src/gallium/drivers/radeon/r600_texture.c
index 912d123..7bdceb1 100644
--- a/src/gallium/drivers/radeon/r600_texture.c
+++ b/src/gallium/drivers/radeon/r600_texture.c
@@ -541,28 +541,29 @@ static boolean r600_texture_get_handle(struct 
pipe_screen* screen,
/* Since shader image stores don't support DCC on VI,
 * disable it for external clients that want write
 * access.
 */
if (usage & PIPE_HANDLE_USAGE_WRITE && rtex->dcc_offset) {
if (r600_texture_disable_dcc(rctx, rtex))
update_metadata = true;
}
 
if (!(usage & PIPE_HANDLE_USAGE_EXPLICIT_FLUSH) &&
-   rtex->cmask.size) {
+   (rtex->cmask.size || rtex->dcc_offset)) {
/* Eliminate fast clear (both CMASK and DCC) */
r600_eliminate_fast_color_clear(rctx, rtex);
 
/* Disable CMASK if flush_resource isn't going
 * to be called.
 */
-   r600_texture_discard_cmask(rscreen, rtex);
+   if (rtex->cmask.size)
+   r600_texture_discard_cmask(rscreen, rtex);
}
 
/* Set metadata. */
if (!res->is_shared || update_metadata) {
r600_texture_init_metadata(rtex, &metadata);
if (rscreen->query_opaque_metadata)
rscreen->query_opaque_metadata(rscreen, rtex,
   &metadata);
 
rscreen->ws->buffer_set_metadata(res->buf, &metadata);
-- 
2.7.4



[Mesa-dev] [PATCH 02/20] gallium/radeon: use the current ctx for DCC decompression in resource_get_handle

2016-08-29 Thread Marek Olšák
From: Marek Olšák 

For coherency with the current context.
---
 src/gallium/drivers/radeon/r600_texture.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/gallium/drivers/radeon/r600_texture.c 
b/src/gallium/drivers/radeon/r600_texture.c
index fb3068a..e7be768 100644
--- a/src/gallium/drivers/radeon/r600_texture.c
+++ b/src/gallium/drivers/radeon/r600_texture.c
@@ -511,41 +511,41 @@ static void r600_degrade_tile_mode_to_linear(struct 
r600_common_context *rctx,
p_atomic_inc(&rctx->screen->dirty_tex_descriptor_counter);
 }
 
 static boolean r600_texture_get_handle(struct pipe_screen* screen,
   struct pipe_context *ctx,
   struct pipe_resource *resource,
   struct winsys_handle *whandle,
unsigned usage)
 {
struct r600_common_screen *rscreen = (struct r600_common_screen*)screen;
-   struct r600_common_context *aux_context =
-   (struct r600_common_context*)rscreen->aux_context;
+   struct r600_common_context *rctx = (struct r600_common_context*)
+  (ctx ? ctx : rscreen->aux_context);
struct r600_resource *res = (struct r600_resource*)resource;
struct r600_texture *rtex = (struct r600_texture*)resource;
struct radeon_bo_metadata metadata;
bool update_metadata = false;
 
/* This is not supported now, but it might be required for OpenCL
 * interop in the future.
 */
if (resource->target != PIPE_BUFFER &&
(resource->nr_samples > 1 || rtex->is_depth))
return false;
 
if (resource->target != PIPE_BUFFER) {
/* Since shader image stores don't support DCC on VI,
 * disable it for external clients that want write
 * access.
 */
if (usage & PIPE_HANDLE_USAGE_WRITE && rtex->dcc_offset) {
-   if (r600_texture_disable_dcc(aux_context, rtex))
+   if (r600_texture_disable_dcc(rctx, rtex))
update_metadata = true;
}
 
if (!(usage & PIPE_HANDLE_USAGE_EXPLICIT_FLUSH) &&
rtex->cmask.size) {
/* Eliminate fast clear (both CMASK and DCC) */
r600_eliminate_fast_color_clear(rscreen, rtex);
 
/* Disable CMASK if flush_resource isn't going
 * to be called.
-- 
2.7.4



[Mesa-dev] [PATCH 09/20] radeonsi: fix a crash in imageSize for cubemap arrays

2016-08-29 Thread Marek Olšák
From: Marek Olšák 

Sometimes it was f32, other times it was i32. Now it's always i32.

This fixes:
GL45-CTS.texture_cube_map_array.image_texture_size.texture_size_compute_sh
---
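Reader note: the commit message says the result type was sometimes f32 and sometimes i32. A small standalone sketch, with hypothetical helper names and no LLVM involved, of why keeping the value as an integer matters: the layer count is divided by 6 as an integer, and the same 32 bits read as a float give a different value.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Number of cubes in a cube map array: six layers (faces) per cube. */
static int32_t sketch_num_cubes(int32_t num_layers)
{
	assert(num_layers >= 0);
	return num_layers / 6;
}

/* Reinterpret the same 32 bits as a float, for illustration only.
 * Assumes float is 32 bits wide. */
static float sketch_bits_as_float(int32_t bits)
{
	float f;

	memcpy(&f, &bits, sizeof(f));
	return f;
}
```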
 src/gallium/drivers/radeonsi/si_shader.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index 6eca5cf..f8884ef 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -4098,21 +4098,21 @@ static void atomic_emit(
 
 static void resq_fetch_args(
struct lp_build_tgsi_context * bld_base,
struct lp_build_emit_data * emit_data)
 {
struct si_shader_context *ctx = si_shader_context(bld_base);
struct gallivm_state *gallivm = bld_base->base.gallivm;
const struct tgsi_full_instruction *inst = emit_data->inst;
const struct tgsi_full_src_register *reg = &inst->Src[0];
 
-   emit_data->dst_type = LLVMVectorType(bld_base->base.elem_type, 4);
+   emit_data->dst_type = ctx->v4i32;
 
if (reg->Register.File == TGSI_FILE_BUFFER) {
emit_data->args[0] = shader_buffer_fetch_rsrc(ctx, reg);
emit_data->arg_count = 1;
} else if (inst->Memory.Texture == TGSI_TEXTURE_BUFFER) {
image_fetch_rsrc(bld_base, reg, false, &emit_data->args[0]);
emit_data->arg_count = 1;
} else {
emit_data->args[0] = bld_base->uint_bld.zero; /* mip level */
image_fetch_rsrc(bld_base, reg, false, &emit_data->args[1]);
@@ -4149,23 +4149,21 @@ static void resq_emit(
builder, "llvm.SI.getresinfo.i32", emit_data->dst_type,
emit_data->args, emit_data->arg_count,
LLVMReadNoneAttribute);
 
/* Divide the number of layers by 6 to get the number of cubes. 
*/
if (inst->Memory.Texture == TGSI_TEXTURE_CUBE_ARRAY) {
LLVMValueRef imm2 = lp_build_const_int32(gallivm, 2);
LLVMValueRef imm6 = lp_build_const_int32(gallivm, 6);
 
LLVMValueRef z = LLVMBuildExtractElement(builder, out, 
imm2, "");
-   z = LLVMBuildBitCast(builder, z, 
bld_base->uint_bld.elem_type, "");
z = LLVMBuildSDiv(builder, z, imm6, "");
-   z = LLVMBuildBitCast(builder, z, 
bld_base->base.elem_type, "");
out = LLVMBuildInsertElement(builder, out, z, imm2, "");
}
}
 
emit_data->output[emit_data->chan] = out;
 }
 
 static void set_tex_fetch_args(struct si_shader_context *ctx,
   struct lp_build_emit_data *emit_data,
   unsigned opcode, unsigned target,
-- 
2.7.4



[Mesa-dev] [PATCH 01/20] gallium/radeon: derive buffer placement and flags only at initialization

2016-08-29 Thread Marek Olšák
From: Marek Olšák 

Invalidated buffers don't have to go through it.

Split r600_init_resource into r600_init_resource_fields and
r600_alloc_resource.
---
 src/gallium/drivers/r600/r600_state_common.c|  5 +-
 src/gallium/drivers/radeon/r600_buffer_common.c | 84 ++---
 src/gallium/drivers/radeon/r600_pipe_common.h   | 13 ++--
 src/gallium/drivers/radeon/r600_texture.c   |  9 +--
 src/gallium/drivers/radeonsi/si_descriptors.c   |  5 +-
 5 files changed, 67 insertions(+), 49 deletions(-)

diff --git a/src/gallium/drivers/r600/r600_state_common.c 
b/src/gallium/drivers/r600/r600_state_common.c
index a5341c3..0349432 100644
--- a/src/gallium/drivers/r600/r600_state_common.c
+++ b/src/gallium/drivers/r600/r600_state_common.c
@@ -2774,26 +2774,25 @@ uint32_t r600_colorformat_endian_swap(uint32_t 
colorformat, bool do_endian_swap)
}
} else {
return ENDIAN_NONE;
}
 }
 
 static void r600_invalidate_buffer(struct pipe_context *ctx, struct 
pipe_resource *buf)
 {
struct r600_context *rctx = (struct r600_context*)ctx;
struct r600_resource *rbuffer = r600_resource(buf);
-   unsigned i, shader, mask, alignment = rbuffer->buf->alignment;
+   unsigned i, shader, mask;
struct r600_pipe_sampler_view *view;
 
/* Reallocate the buffer in the same pipe_resource. */
-   r600_init_resource(&rctx->screen->b, rbuffer, rbuffer->b.b.width0,
-  alignment);
+   r600_alloc_resource(&rctx->screen->b, rbuffer);
 
/* We changed the buffer, now we need to bind it where the old one was 
bound. */
/* Vertex buffers. */
mask = rctx->vertex_buffer_state.enabled_mask;
while (mask) {
i = u_bit_scan();
if (rctx->vertex_buffer_state.vb[i].buffer == >b.b) {
rctx->vertex_buffer_state.dirty_mask |= 1 << i;
r600_vertex_buffers_dirty(rctx);
}
diff --git a/src/gallium/drivers/radeon/r600_buffer_common.c 
b/src/gallium/drivers/radeon/r600_buffer_common.c
index 4480293..6a55de1 100644
--- a/src/gallium/drivers/radeon/r600_buffer_common.c
+++ b/src/gallium/drivers/radeon/r600_buffer_common.c
@@ -92,97 +92,118 @@ void *r600_buffer_map_sync_with_rings(struct 
r600_common_context *ctx,
ctx->ws->cs_sync_flush(ctx->gfx.cs);
if (ctx->dma.cs)
ctx->ws->cs_sync_flush(ctx->dma.cs);
}
}
 
/* Setting the CS to NULL will prevent doing checks we have done 
already. */
return ctx->ws->buffer_map(resource->buf, NULL, usage);
 }
 
-bool r600_init_resource(struct r600_common_screen *rscreen,
-   struct r600_resource *res,
-   uint64_t size, unsigned alignment)
+void r600_init_resource_fields(struct r600_common_screen *rscreen,
+  struct r600_resource *res,
+  uint64_t size, unsigned alignment)
 {
struct r600_texture *rtex = (struct r600_texture*)res;
-   struct pb_buffer *old_buf, *new_buf;
-   enum radeon_bo_flag flags = 0;
+
+   res->bo_size = size;
+   res->bo_alignment = alignment;
+   res->flags = 0;
 
switch (res->b.b.usage) {
case PIPE_USAGE_STREAM:
-   flags = RADEON_FLAG_GTT_WC;
+   res->flags = RADEON_FLAG_GTT_WC;
/* fall through */
case PIPE_USAGE_STAGING:
-   /* Transfers are likely to occur more often with these 
resources. */
+   /* Transfers are likely to occur more often with these
+* resources. */
res->domains = RADEON_DOMAIN_GTT;
break;
case PIPE_USAGE_DYNAMIC:
/* Older kernels didn't always flush the HDP cache before
 * CS execution
 */
if (rscreen->info.drm_major == 2 &&
rscreen->info.drm_minor < 40) {
res->domains = RADEON_DOMAIN_GTT;
-   flags |= RADEON_FLAG_GTT_WC;
+   res->flags |= RADEON_FLAG_GTT_WC;
break;
}
-   flags |= RADEON_FLAG_CPU_ACCESS;
+   res->flags |= RADEON_FLAG_CPU_ACCESS;
/* fall through */
case PIPE_USAGE_DEFAULT:
case PIPE_USAGE_IMMUTABLE:
default:
-   /* Not listing GTT here improves performance in some apps. */
+   /* Not listing GTT here improves performance in some
+* apps. */
res->domains = RADEON_DOMAIN_VRAM;
-   flags |= RADEON_FLAG_GTT_WC;
+   res->flags |= RADEON_FLAG_GTT_WC;
break;
}
 
if (res->b.b.target == PIPE_BUFFER &&
res->b.b.flags & (PIPE_RESOURCE_FLAG_MAP_PERSISTENT |
   

[Mesa-dev] [PATCH 12/20] radeonsi: fix a badly implemented GS bug workaround

2016-08-29 Thread Marek Olšák
From: Marek Olšák 

Limit it to geometry shaders and Hawaii.
---
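Reader note: the new condition in the hunk below can be read as a single predicate. A minimal sketch with hypothetical types and names, where num_prims stands in for si_num_prims_for_vertices(info):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the draw parameters the check looks at. */
struct sketch_draw {
	bool indirect;
	unsigned instance_count;
	unsigned num_prims; /* primitives per instance */
};

/* True when the workaround (a VGT flush) would be requested. */
static bool sketch_needs_vgt_flush(bool is_hawaii, bool has_gs,
				   bool ia_switch_on_eoi,
				   const struct sketch_draw *draw)
{
	assert(draw);

	/* Limited to Hawaii with a geometry shader and SWITCH_ON_EOI. */
	if (!is_hawaii || !has_gs || !ia_switch_on_eoi)
		return false;

	/* Indirect draws are treated as affected because their counts
	 * are not known on the CPU. */
	return draw->indirect ||
	       (draw->instance_count > 1 && draw->num_prims <= 1);
}
```

Compared with the removed check, the sketch no longer fires for every multi-SE chip or for draws without a geometry shader.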
 src/gallium/drivers/radeonsi/si_state_draw.c | 21 +
 1 file changed, 13 insertions(+), 8 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c 
b/src/gallium/drivers/radeonsi/si_state_draw.c
index bcf1104..60cc3f0 100644
--- a/src/gallium/drivers/radeonsi/si_state_draw.c
+++ b/src/gallium/drivers/radeonsi/si_state_draw.c
@@ -337,40 +337,45 @@ static unsigned si_get_ia_multi_vgt_param(struct 
si_context *sctx,
(sctx->b.family == CHIP_HAWAII ||
 (sctx->b.chip_class == VI &&
  (sctx->gs_shader.cso || max_primgroup_in_wave != 2))))
partial_vs_wave = true;
 
/* Instancing bug on Bonaire. */
if (sctx->b.family == CHIP_BONAIRE && ia_switch_on_eoi &&
(info->indirect || info->instance_count > 1))
partial_vs_wave = true;
 
+   /* GS hw bug with single-primitive instances and SWITCH_ON_EOI.
+* The hw doc says all multi-SE chips are affected, but Vulkan
+* only applies it to Hawaii. Do what Vulkan does.
+*/
+   if (sctx->b.family == CHIP_HAWAII &&
+   sctx->gs_shader.cso &&
+   ia_switch_on_eoi &&
+   (info->indirect ||
+(info->instance_count > 1 &&
+ si_num_prims_for_vertices(info) <= 1)))
+   sctx->b.flags |= SI_CONTEXT_VGT_FLUSH;
+
+
/* If the WD switch is false, the IA switch must be false too. 
*/
assert(wd_switch_on_eop || !ia_switch_on_eop);
}
 
/* If SWITCH_ON_EOI is set, PARTIAL_ES_WAVE must be set too. */
if (ia_switch_on_eoi)
partial_es_wave = true;
 
/* GS requirement. */
if (SI_GS_PER_ES / primgroup_size >= sctx->screen->gs_table_depth - 3)
partial_es_wave = true;
 
-   /* Hw bug with single-primitive instances and SWITCH_ON_EOI
-* on multi-SE chips. */
-   if (sctx->b.screen->info.max_se >= 2 && ia_switch_on_eoi &&
-   (info->indirect ||
-(info->instance_count > 1 &&
- si_num_prims_for_vertices(info) <= 1)))
-   sctx->b.flags |= SI_CONTEXT_VGT_FLUSH;
-
return S_028AA8_SWITCH_ON_EOP(ia_switch_on_eop) |
S_028AA8_SWITCH_ON_EOI(ia_switch_on_eoi) |
S_028AA8_PARTIAL_VS_WAVE_ON(partial_vs_wave) |
S_028AA8_PARTIAL_ES_WAVE_ON(partial_es_wave) |
S_028AA8_PRIMGROUP_SIZE(primgroup_size - 1) |
S_028AA8_WD_SWITCH_ON_EOP(sctx->b.chip_class >= CIK ? 
wd_switch_on_eop : 0) |
S_028AA8_MAX_PRIMGRP_IN_WAVE(sctx->b.chip_class >= VI ?
 max_primgroup_in_wave : 0);
 }
 
-- 
2.7.4



[Mesa-dev] [PATCH 07/20] radeonsi: fix cubemaps viewed as 2D

2016-08-29 Thread Marek Olšák
From: Marek Olšák 

This fixes: GL43-CTS.texture_view.view_sampling

Cc: mesa-sta...@lists.freedesktop.org
---
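Reader note: a standalone sketch of the target selection that si_tex_dim performs after this patch, using a hypothetical four-value enum instead of the PIPE_TEXTURE_* targets. A cube view keeps the cube target; a cube resource viewed as anything else is described as a 2D array.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical subset of texture targets, for illustration only. */
enum sketch_target {
	SKETCH_2D,
	SKETCH_2D_ARRAY,
	SKETCH_CUBE,
	SKETCH_CUBE_ARRAY,
};

static bool sketch_is_cube(enum sketch_target t)
{
	return t == SKETCH_CUBE || t == SKETCH_CUBE_ARRAY;
}

/* Pick the target used for the hardware resource dimension. */
static enum sketch_target sketch_effective_target(enum sketch_target res,
						  enum sketch_target view)
{
	/* Cube views always use the view target. */
	if (sketch_is_cube(view))
		return view;

	/* A cubemap viewed as something else is set up as a 2D array. */
	if (sketch_is_cube(res))
		return SKETCH_2D_ARRAY;

	return res;
}
```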
 src/gallium/drivers/radeonsi/si_state.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/src/gallium/drivers/radeonsi/si_state.c 
b/src/gallium/drivers/radeonsi/si_state.c
index 25dfe26..026aded 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -1603,20 +1603,27 @@ static unsigned si_tex_compare(unsigned compare)
}
 }
 
 static unsigned si_tex_dim(unsigned res_target, unsigned view_target,
   unsigned nr_samples)
 {
if (view_target == PIPE_TEXTURE_CUBE ||
view_target == PIPE_TEXTURE_CUBE_ARRAY)
res_target = view_target;
 
+   /* If interpreting cubemaps as something else, set 2D_ARRAY. */
+   if ((res_target == PIPE_TEXTURE_CUBE ||
+res_target == PIPE_TEXTURE_CUBE_ARRAY) &&
+   view_target != PIPE_TEXTURE_CUBE &&
+   view_target != PIPE_TEXTURE_CUBE_ARRAY)
+   res_target = PIPE_TEXTURE_2D_ARRAY;
+
switch (res_target) {
default:
case PIPE_TEXTURE_1D:
return V_008F1C_SQ_RSRC_IMG_1D;
case PIPE_TEXTURE_1D_ARRAY:
return V_008F1C_SQ_RSRC_IMG_1D_ARRAY;
case PIPE_TEXTURE_2D:
case PIPE_TEXTURE_RECT:
return nr_samples > 1 ? V_008F1C_SQ_RSRC_IMG_2D_MSAA :
V_008F1C_SQ_RSRC_IMG_2D;
-- 
2.7.4



[Mesa-dev] [PATCH 00/20] Plenty of RadeonSI fixes

2016-08-29 Thread Marek Olšák
Hi,

This series contains mostly fixes, e.g. for DCC, cubemaps, tessellation,
texture views, Gather4, and viewport depth range.

There are also some new HUD queries.

Please review.

Marek


Re: [Mesa-dev] [PATCH] android: fix build issues with genxml, isl

2016-08-29 Thread Jason Ekstrand
On Sun, Aug 28, 2016 at 6:33 PM, Mauro Rossi  wrote:

> > While you're at it, I've got another build-breaking branch here:
> >
> > https://cgit.freedesktop.org/~jekstrand/mesa/log/?h=wip/blorp-vulkan
> >
> > It's almost reviewed so I'll be pushing soon. If you could provide a
> > squash-in, that would be great.
> >
> > --Jason
>
> Hi Jason,
>
> here it is, built marshmallow-x86 successfully after breaking commit
> "i965: Move blorp into src/intel/blorp"
> also attached to email, in case of gmail issues.
>
> This patch has to be applied on top of  my former "[PATCH] android:
> intel: Flatten the makefile structure"
> KR
>

Thanks for both patches!  I'll get the first one pushed and I've already
squashed the second into my blorp series so it doesn't break anything.
I've only got one more build-system change in queue.  I'll try and remember
to CC you on it (and maybe even try my hand at making the Android changes).

--Jason


> Mauro
>
> From 3ae8ea625b5cb091438de421d9762ea8bcb8e2bc Mon Sep 17 00:00:00 2001
> From: Mauro Rossi 
> Date: Mon, 29 Aug 2016 03:08:02 +0200
> Subject: [PATCH] android: i965: Move blorp into src/intel/blorp
>
> Port to android of commit "i965: Move blorp into src/intel/blorp"
>
> libmesa_blorp static library module is built by Android.blorp.mk
>
> the necessary dependencies and includes are declared
> and nir_opcodes.h generated header is included by using
> the macro generated-sources-dir-for which requires LOCAL_MODULE_CLASS
> ---
>  src/intel/Android.blorp.mk   | 47 ++
> ++
>  src/intel/Android.mk |  1 +
>  src/mesa/drivers/dri/i965/Android.mk |  1 +
>  3 files changed, 49 insertions(+)
>  create mode 100644 src/intel/Android.blorp.mk
>
> diff --git a/src/intel/Android.blorp.mk b/src/intel/Android.blorp.mk
> new file mode 100644
> index 000..268d5eb
> --- /dev/null
> +++ b/src/intel/Android.blorp.mk
> @@ -0,0 +1,47 @@
> +# Copyright © 2016 Intel Corporation
> +# Copyright © 2016 Mauro Rossi 
> +#
> +# Permission is hereby granted, free of charge, to any person obtaining a
> +# copy of this software and associated documentation files (the
> "Software"),
> +# to deal in the Software without restriction, including without
> limitation
> +# the rights to use, copy, modify, merge, publish, distribute, sublicense,
> +# and/or sell copies of the Software, and to permit persons to whom the
> +# Software is furnished to do so, subject to the following conditions:
> +#
> +# The above copyright notice and this permission notice shall be included
> +# in all copies or substantial portions of the Software.
> +#
> +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
> OR
> +# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> +# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> +# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
> OTHER
> +# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> +# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
> +# DEALINGS IN THE SOFTWARE.
> +
> +# ---
> +# Build libmesa_blorp
> +# ---
> +
> +include $(CLEAR_VARS)
> +
> +LOCAL_MODULE := libmesa_blorp
> +
> +LOCAL_MODULE_CLASS := STATIC_LIBRARIES
> +
> +LOCAL_SRC_FILES := $(BLORP_FILES)
> +
> +LOCAL_C_INCLUDES := \
> + $(call generated-sources-dir-for,STATIC_LIBRARIES,libmesa_nir,,)/nir \
> + $(MESA_TOP)/src/gallium/auxiliary \
> + $(MESA_TOP)/src/gallium/include \
> + $(MESA_TOP)/src/mapi \
> + $(MESA_TOP)/src/mesa \
> + $(MESA_TOP)/src/mesa/drivers/dri/i965
> +
> +LOCAL_STATIC_LIBRARIES := libmesa_nir
> +
> +LOCAL_SHARED_LIBRARIES := libdrm_intel
> +
> +include $(MESA_COMMON_MK)
> +include $(BUILD_STATIC_LIBRARY)
> diff --git a/src/intel/Android.mk b/src/intel/Android.mk
> index 114b111..0e9c29d 100644
> --- a/src/intel/Android.mk
> +++ b/src/intel/Android.mk
> @@ -25,5 +25,6 @@ LOCAL_PATH := $(call my-dir)
>  # Import variables
>  include $(LOCAL_PATH)/Makefile.sources
>
> +include $(LOCAL_PATH)/Android.blorp.mk
>  include $(LOCAL_PATH)/Android.genxml.mk
>  include $(LOCAL_PATH)/Android.isl.mk
> diff --git a/src/mesa/drivers/dri/i965/Android.mk
> b/src/mesa/drivers/dri/i965/Android.mk
> index e6bcedb..32adb9a 100644
> --- a/src/mesa/drivers/dri/i965/Android.mk
> +++ b/src/mesa/drivers/dri/i965/Android.mk
> @@ -178,6 +178,7 @@ LOCAL_SRC_FILES := \
>  LOCAL_WHOLE_STATIC_LIBRARIES := \
>   $(MESA_DRI_WHOLE_STATIC_LIBRARIES) \
>   $(I965_PERGEN_LIBS) \
> + libmesa_blorp \
>   libmesa_isl
>
>  LOCAL_SHARED_LIBRARIES := \
> --
> 2.7.4
>


[Mesa-dev] [Bug 77449] Tracker bug for all bugs related to Steam titles

2016-08-29 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=77449

Vedran Miletić  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Depends on|                            |91969


Referenced Bugs:

https://bugs.freedesktop.org/show_bug.cgi?id=91969
[Bug 91969] [radeonsi][bonaire] Stalls with Borderlands 2
-- 
You are receiving this mail because:
You are the assignee for the bug.


[Mesa-dev] [Bug 77449] Tracker bug for all bugs related to Steam titles

2016-08-29 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=77449

Vedran Miletić  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Depends on|                            |97340, 95308


Referenced Bugs:

https://bugs.freedesktop.org/show_bug.cgi?id=95308
[Bug 95308] [radeonsi] Hangs after some minutes on Team Fortress 2
https://bugs.freedesktop.org/show_bug.cgi?id=97340
[Bug 97340] [radeonsi] POSTAL 2 freezes during shader compilation
-- 
You are receiving this mail because:
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH] gallium: Use enum pipe_shader_type in bind_sampler_states()

2016-08-29 Thread Brian Paul

On 08/27/2016 11:18 AM, Jan Vesely wrote:

On Fri, 2016-08-26 at 17:25 +0100, Eric Engestrom wrote:

On Fri, Aug 26, 2016 at 04:21:14PM +0200, Kai Wasserbäch wrote:


Hey Eric,
Eric Engestrom wrote on 26.08.2016 15:49:


On Fri, Aug 26, 2016 at 03:14:57PM +0200, Kai Wasserbäch wrote:


Brian Paul wrote on 26.08.2016 14:50:


On 08/26/2016 05:58 AM, Kai Wasserbäch wrote:


Cc: Brian Paul 
Signed-off-by: Kai Wasserbäch 
---

Hi Brian,
is this what you had in mind? If so, I was wondering
whether virgl_encode.c
would need to be updated as well. Doesn't seem like it,
since the functions
there map everything to uint32_t or some other standard
type.

Another point is the switch statements nouveau uses. To
silence the -Wswitch
warning of GCC I stuck a default case with two asserts at
the end of them. But
maybe it would be better to use an if...else for nv30 and
nv50.

I think one assertion is enough.  It's up to the nouveau
developers whether they
want to do more.

ok, then I'll go for the generic "unhandled type" assert.

I would've gone with `unreachable()`, since nothing outside the
enum range should ever get in.

Do you feel strongly about that and require a v3?

I don't, it's fine as is.



Personally I think the
assert() is better since the function could be called with another
enum member
value, which still is unhandled by the switch().

unreachable() still has an assert() (see src/util/macros.h:75)
The difference is that unreachable() tells the compiler that we won't
be
interested in what comes next. If we reach it, we abort with a
message
in debug builds, and after that the compiler can do whatever it
wants,
e.g. optimise out irrelevant code paths, and obviously it won't warn
about these code paths anymore, which is often more interesting.


Moreover, assert leaves an untested code path that fails silently in
NDEBUG builds.
In this case it would be better to use unreachable + unhandled enum
values instead of default (so the compiler complains when new shader
types get added, however unlikely that is).

just my 2c,
Jan
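
To illustrate the assert() vs. unreachable() point discussed above, here is a
minimal sketch. It is not Mesa's code: the UNREACHABLE macro is a simplified
stand-in for the one in src/util/macros.h, and enum shader_type and
shader_type_name() are hypothetical names used only for this example. The
switch lists every enumerator and has no default case, so a compiler with
-Wswitch can warn when a new enumerator is added.

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified stand-in for an unreachable() macro (assumption, not Mesa's
 * exact definition): assert in debug builds, then tell the compiler the
 * point cannot be reached. The fallback aborts on other compilers. */
#if defined(__GNUC__)
#define UNREACHABLE(msg) do { assert(!msg); __builtin_unreachable(); } while (0)
#else
#define UNREACHABLE(msg) do { assert(!msg); abort(); } while (0)
#endif

/* Hypothetical enum, standing in for a shader-type enum. */
enum shader_type {
    SHADER_VERTEX,
    SHADER_FRAGMENT,
    SHADER_COMPUTE
};

/* Every enumerator is handled explicitly and there is no default case,
 * so adding a new enumerator can trigger a -Wswitch warning here. */
static const char *shader_type_name(enum shader_type type)
{
    switch (type) {
    case SHADER_VERTEX:
        return "vertex";
    case SHADER_FRAGMENT:
        return "fragment";
    case SHADER_COMPUTE:
        return "compute";
    }
    /* Only reached if a value outside the enum range is passed in. */
    UNREACHABLE("unhandled shader type");
}
```

With a default: case carrying the assert instead, the compiler would stay
quiet about a missing enumerator and release builds would fall through
silently, which is the trade-off described above.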


Kai, the series LGTM.  I'll add my R-b and push the four patches.

Do you plan to do more, such as pipe_context::set_constant_buffer()?

Also, git grep 'unsigned shader' and 'uint shader' shows quite a few 
more places where similar clean-ups are possible.  I'll probably take 
care of the VMware driver.


As for assert vs. unreachable, whichever people think is better is fine 
with me.  Off-hand, I'm not aware of any new types of shaders coming 
along any time soon so I think it's a minor issue which can be addressed 
in follow-on commits.


Thanks for doing this!

-Brian



[Mesa-dev] [Bug 97260] [bisected] R9 290 low performance in Linux 4.7

2016-08-29 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=97260

Vedran Miletić  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ved...@miletic.net

-- 
You are receiving this mail because:
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH] radeonsi: add support for cull distances. (v1.1)

2016-08-29 Thread Marek Olšák
Reviewed-by: Marek Olšák 

Marek

On Mon, Aug 29, 2016 at 4:16 AM, Dave Airlie  wrote:
> From: Dave Airlie 
>
> This should be all that is required for cull distances to work
> on radeonsi.
>
> v1.1: whitespace cleanup, add docs, fix clipdist_mask usage.
>
> Signed-off-by: Dave Airlie 
> ---
>  docs/features.txt   | 2 +-
>  docs/relnotes/12.1.0.html   | 1 +
>  src/gallium/drivers/radeonsi/si_pipe.c  | 2 +-
>  src/gallium/drivers/radeonsi/si_state.c | 7 ---
>  4 files changed, 7 insertions(+), 5 deletions(-)
>
> diff --git a/docs/features.txt b/docs/features.txt
> index 26e8ff7..4ca1b99 100644
> --- a/docs/features.txt
> +++ b/docs/features.txt
> @@ -211,7 +211,7 @@ GL 4.5, GLSL 4.50:
>GL_ARB_ES3_1_compatibilityDONE (i965/gen8+, 
> nvc0, radeonsi)
>GL_ARB_clip_control   DONE (i965, nv50, 
> nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
>GL_ARB_conditional_render_invertedDONE (i965, nv50, 
> nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
> -  GL_ARB_cull_distance  DONE (i965, nv50, 
> nvc0, llvmpipe, softpipe, swr)
> +  GL_ARB_cull_distance  DONE (i965, nv50, 
> nvc0, radeonsi, llvmpipe, softpipe, swr)
>GL_ARB_derivative_control DONE (i965, nv50, 
> nvc0, r600, radeonsi)
>GL_ARB_direct_state_accessDONE (all drivers)
>GL_ARB_get_texture_sub_image  DONE (all drivers)
> diff --git a/docs/relnotes/12.1.0.html b/docs/relnotes/12.1.0.html
> index d22d14b..c7f005d 100644
> --- a/docs/relnotes/12.1.0.html
> +++ b/docs/relnotes/12.1.0.html
> @@ -47,6 +47,7 @@ Note: some of the new features are only available with 
> certain drivers.
>  OpenGL ES 3.1 on i965/hsw
>  GL_ARB_ES3_1_compatibility on i965
>  GL_ARB_clear_texture on r600, radeonsi
> +GL_ARB_cull_distance on radeonsi
>  GL_ARB_enhanced_layouts on i965
>  GL_ARB_indirect_parameters on radeonsi
>  GL_ARB_shader_draw_parameters on radeonsi
> diff --git a/src/gallium/drivers/radeonsi/si_pipe.c 
> b/src/gallium/drivers/radeonsi/si_pipe.c
> index 8e7d021..8f9e6f5 100644
> --- a/src/gallium/drivers/radeonsi/si_pipe.c
> +++ b/src/gallium/drivers/radeonsi/si_pipe.c
> @@ -399,6 +399,7 @@ static int si_get_param(struct pipe_screen* pscreen, enum 
> pipe_cap param)
> case PIPE_CAP_POLYGON_OFFSET_UNITS_UNSCALED:
> case PIPE_CAP_STRING_MARKER:
> case PIPE_CAP_CLEAR_TEXTURE:
> +   case PIPE_CAP_CULL_DISTANCE:
> return 1;
>
> case PIPE_CAP_RESOURCE_FROM_USER_MEMORY:
> @@ -448,7 +449,6 @@ static int si_get_param(struct pipe_screen* pscreen, enum 
> pipe_cap param)
> case PIPE_CAP_TEXTURE_GATHER_OFFSETS:
> case PIPE_CAP_VERTEXID_NOBASE:
> case PIPE_CAP_QUERY_BUFFER_OBJECT:
> -   case PIPE_CAP_CULL_DISTANCE:
> case PIPE_CAP_PRIMITIVE_RESTART_FOR_PATCHES:
> case PIPE_CAP_TGSI_VOTE:
> case PIPE_CAP_MAX_WINDOW_RECTANGLES:
> diff --git a/src/gallium/drivers/radeonsi/si_state.c 
> b/src/gallium/drivers/radeonsi/si_state.c
> index 25dfe26..375e74b 100644
> --- a/src/gallium/drivers/radeonsi/si_state.c
> +++ b/src/gallium/drivers/radeonsi/si_state.c
> @@ -650,21 +650,22 @@ static void si_emit_clip_regs(struct si_context *sctx, 
> struct r600_atom *atom)
>info->properties[TGSI_PROPERTY_VS_WINDOW_SPACE_POSITION];
> unsigned clipdist_mask =
> info->writes_clipvertex ? SIX_BITS : info->clipdist_writemask;
> +   unsigned total_mask = clipdist_mask | (info->culldist_writemask << 
> info->num_written_clipdistance);
>
> radeon_set_context_reg(cs, R_02881C_PA_CL_VS_OUT_CNTL,
> S_02881C_USE_VTX_POINT_SIZE(info->writes_psize) |
> S_02881C_USE_VTX_EDGE_FLAG(info->writes_edgeflag) |
> S_02881C_USE_VTX_RENDER_TARGET_INDX(info->writes_layer) |
> S_02881C_USE_VTX_VIEWPORT_INDX(info->writes_viewport_index) |
> -   S_02881C_VS_OUT_CCDIST0_VEC_ENA((clipdist_mask & 0x0F) != 0) |
> -   S_02881C_VS_OUT_CCDIST1_VEC_ENA((clipdist_mask & 0xF0) != 0) |
> +   S_02881C_VS_OUT_CCDIST0_VEC_ENA((total_mask & 0x0F) != 0) |
> +   S_02881C_VS_OUT_CCDIST1_VEC_ENA((total_mask & 0xF0) != 0) |
> S_02881C_VS_OUT_MISC_VEC_ENA(info->writes_psize ||
> info->writes_edgeflag ||
> info->writes_layer ||
>  info->writes_viewport_index) |
> S_02881C_VS_OUT_MISC_SIDE_BUS_ENA(1) |
> (sctx->queued.named.rasterizer->clip_plane_enable &
> -clipdist_mask));
> +clipdist_mask) | (info->culldist_writemask << 8));
>  
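
The mask arithmetic in the si_emit_clip_regs hunk above can be tried out in
isolation. The sketch below only mirrors the two expressions visible in the
patch; the helper names are hypothetical and nothing here touches real
registers or the radeonsi driver.

```c
/* Hypothetical helper mirroring the total_mask expression from the patch:
 * cull distance bits are placed directly above the clip distance bits
 * that were written. */
static unsigned combined_ccdist_mask(unsigned clipdist_mask,
                                     unsigned culldist_mask,
                                     unsigned num_clipdist)
{
    return clipdist_mask | (culldist_mask << num_clipdist);
}

/* The patch enables the first vec4 when any of bits 0..3 is set ... */
static int vec0_enabled(unsigned total_mask)
{
    return (total_mask & 0x0F) != 0;
}

/* ... and the second vec4 when any of bits 4..7 is set. */
static int vec1_enabled(unsigned total_mask)
{
    return (total_mask & 0xF0) != 0;
}

/* Mirrors the last changed expression: clip enables stay in the low bits,
 * the cull write mask is shifted up by 8. */
static unsigned clip_cull_enable_bits(unsigned clip_plane_enable,
                                      unsigned clipdist_mask,
                                      unsigned culldist_mask)
{
    return (clip_plane_enable & clipdist_mask) | (culldist_mask << 8);
}
```

For example, three clip distances (mask 0x7) plus two cull distances (mask
0x3) give a combined mask of 0x1F, so both vec4 outputs are enabled even
though the clip distances alone would fit in the first one.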

Re: [Mesa-dev] [PATCH 0/2] r600g: Pair of small code clean ups with TGSI

2016-08-29 Thread James Harvey

Hi Rhys,

 I ran piglit on my Evergreen HD5850 with your patches.

No regressions here.

Tested-by: James Harvey 

Thanks,
James


On 08/27/2016 09:05 AM, Rhys Kidd wrote:

Having run Mesa through Clang on Eric Anholt's Travis harness, these small
code clean ups improve readability of TGSI code in r600g and may avoid
future problems.

Series also can be found at:
https://github.com/Echelon9/mesa/tree/fix/r600g-cleanup-tgsi-opcodes

I don't have the hardware to test this so would appreciate Tested-by's.

I do not have commit rights to fd.o so after R-B would appreciate if
the reviewer(s) could push to master.

Rhys Kidd (2):
  r600g: Avoid duplicated initialization of TGSI_OPCODE_DFMA
  r600g: Clean up defined magic numbers for TGSI opcodes

 src/gallium/drivers/r600/r600_shader.c | 18 +-
 src/gallium/include/pipe/p_shader_tokens.h |  1 +
 2 files changed, 10 insertions(+), 9 deletions(-)




Re: [Mesa-dev] [PATCH 2/2] r600g: Clean up defined magic numbers for TGSI opcodes

2016-08-29 Thread Rhys Kidd
On Monday, August 29, 2016, Marek Olšák  wrote:

> For the series:
>
> Reviewed-by: Marek Olšák
>
> Marek


Thanks Marek.

As I don't have commit access, when you have a moment I would appreciate if
you could please push to master.

Regards,
Rhys


>
> On Sat, Aug 27, 2016 at 6:05 PM, Rhys Kidd wrote:
> > Small code clean up that removes magic numbers where a TGSI
> > opcode has been defined.
> >
> > No functional change expected as each opcode is unsupported on
> > the respective hardware.
> >
> > Signed-off-by: Rhys Kidd
> > ---
> >  src/gallium/drivers/r600/r600_shader.c | 14 +++---
> >  src/gallium/include/pipe/p_shader_tokens.h |  1 +
> >  2 files changed, 8 insertions(+), 7 deletions(-)
> >
> > diff --git a/src/gallium/drivers/r600/r600_shader.c
> b/src/gallium/drivers/r600/r600_shader.c
> > index a39301f..f7b8495 100644
> > --- a/src/gallium/drivers/r600/r600_shader.c
> > +++ b/src/gallium/drivers/r600/r600_shader.c
> > @@ -8998,20 +8998,20 @@ static const struct r600_shader_tgsi_instruction
> r600_shader_tgsi_instruction[]
> > [TGSI_OPCODE_ENDSUB]= { ALU_OP0_NOP, tgsi_unsupported},
> > [TGSI_OPCODE_TXQ_LZ]= { FETCH_OP_GET_TEXTURE_RESINFO,
> tgsi_tex},
> > [TGSI_OPCODE_TXQS]  = { FETCH_OP_GET_NUMBER_OF_SAMPLES,
> tgsi_tex},
> > -   [105]   = { ALU_OP0_NOP, tgsi_unsupported},
> > +   [TGSI_OPCODE_RESQ]  = { ALU_OP0_NOP, tgsi_unsupported},
> > [106]   = { ALU_OP0_NOP, tgsi_unsupported},
> > [TGSI_OPCODE_NOP]   = { ALU_OP0_NOP, tgsi_unsupported},
> > [TGSI_OPCODE_FSEQ]  = { ALU_OP2_SETE_DX10, tgsi_op2},
> > [TGSI_OPCODE_FSGE]  = { ALU_OP2_SETGE_DX10, tgsi_op2},
> > [TGSI_OPCODE_FSLT]  = { ALU_OP2_SETGT_DX10, tgsi_op2_swap},
> > [TGSI_OPCODE_FSNE]  = { ALU_OP2_SETNE_DX10, tgsi_op2_swap},
> > -   [112]   = { ALU_OP0_NOP, tgsi_unsupported},
> > +   [TGSI_OPCODE_MEMBAR]= { ALU_OP0_NOP, tgsi_unsupported},
> > [TGSI_OPCODE_CALLNZ]= { ALU_OP0_NOP, tgsi_unsupported},
> > [114]   = { ALU_OP0_NOP, tgsi_unsupported},
> > [TGSI_OPCODE_BREAKC]= { ALU_OP0_NOP, tgsi_loop_breakc},
> > [TGSI_OPCODE_KILL_IF]   = { ALU_OP2_KILLGT, tgsi_kill},  /*
> conditional kill */
> > [TGSI_OPCODE_END]   = { ALU_OP0_NOP, tgsi_end},  /* aka HALT
> */
> > -   [118]   = { ALU_OP0_NOP, tgsi_unsupported},
> > +   [TGSI_OPCODE_DFMA]  = { ALU_OP0_NOP, tgsi_unsupported},
> > [TGSI_OPCODE_F2I]   = { ALU_OP1_FLT_TO_INT, tgsi_op2_trans},
> > [TGSI_OPCODE_IDIV]  = { ALU_OP0_NOP, tgsi_idiv},
> > [TGSI_OPCODE_IMAX]  = { ALU_OP2_MAX_INT, tgsi_op2},
> > @@ -9197,14 +9197,14 @@ static const struct r600_shader_tgsi_instruction
> eg_shader_tgsi_instruction[] =
> > [TGSI_OPCODE_ENDSUB]= { ALU_OP0_NOP, tgsi_unsupported},
> > [TGSI_OPCODE_TXQ_LZ]= { FETCH_OP_GET_TEXTURE_RESINFO,
> tgsi_tex},
> > [TGSI_OPCODE_TXQS]  = { FETCH_OP_GET_NUMBER_OF_SAMPLES,
> tgsi_tex},
> > -   [105]   = { ALU_OP0_NOP, tgsi_unsupported},
> > +   [TGSI_OPCODE_RESQ]  = { ALU_OP0_NOP, tgsi_unsupported},
> > [106]   = { ALU_OP0_NOP, tgsi_unsupported},
> > [TGSI_OPCODE_NOP]   = { ALU_OP0_NOP, tgsi_unsupported},
> > [TGSI_OPCODE_FSEQ]  = { ALU_OP2_SETE_DX10, tgsi_op2},
> > [TGSI_OPCODE_FSGE]  = { ALU_OP2_SETGE_DX10, tgsi_op2},
> > [TGSI_OPCODE_FSLT]  = { ALU_OP2_SETGT_DX10, tgsi_op2_swap},
> > [TGSI_OPCODE_FSNE]  = { ALU_OP2_SETNE_DX10, tgsi_op2_swap},
> > -   [112]   = { ALU_OP0_NOP, tgsi_unsupported},
> > +   [TGSI_OPCODE_MEMBAR]= { ALU_OP0_NOP, tgsi_unsupported},
> > [TGSI_OPCODE_CALLNZ]= { ALU_OP0_NOP, tgsi_unsupported},
> > [114]   = { ALU_OP0_NOP, tgsi_unsupported},
> > [TGSI_OPCODE_BREAKC]= { ALU_OP0_NOP, tgsi_unsupported},
> > @@ -9420,14 +9420,14 @@ static const struct r600_shader_tgsi_instruction
> cm_shader_tgsi_instruction[] =
> > [TGSI_OPCODE_ENDSUB]= { ALU_OP0_NOP, tgsi_unsupported},
> > [TGSI_OPCODE_TXQ_LZ]= { FETCH_OP_GET_TEXTURE_RESINFO,
> tgsi_tex},
> > [TGSI_OPCODE_TXQS]  = { FETCH_OP_GET_NUMBER_OF_SAMPLES,
> tgsi_tex},
> > -   [105]   = { ALU_OP0_NOP, tgsi_unsupported},
> > +   [TGSI_OPCODE_RESQ]  = { ALU_OP0_NOP, tgsi_unsupported},
> > [106]   = { ALU_OP0_NOP, tgsi_unsupported},
> > [TGSI_OPCODE_NOP]   = { ALU_OP0_NOP, tgsi_unsupported},
> > [TGSI_OPCODE_FSEQ]  = { ALU_OP2_SETE_DX10, tgsi_op2},
> > [TGSI_OPCODE_FSGE]  = { ALU_OP2_SETGE_DX10, tgsi_op2},
> > 

Re: [Mesa-dev] [PATCH 2/2] r600g: Clean up defined magic numbers for TGSI opcodes

2016-08-29 Thread Marek Olšák
For the series:

Reviewed-by: Marek Olšák 

Marek

On Sat, Aug 27, 2016 at 6:05 PM, Rhys Kidd  wrote:
> Small code clean up that removes magic numbers where a TGSI
> opcode has been defined.
>
> No functional change expected as each opcode is unsupported on
> the respective hardware.
>
> Signed-off-by: Rhys Kidd 
> ---
>  src/gallium/drivers/r600/r600_shader.c | 14 +++---
>  src/gallium/include/pipe/p_shader_tokens.h |  1 +
>  2 files changed, 8 insertions(+), 7 deletions(-)
>
> diff --git a/src/gallium/drivers/r600/r600_shader.c 
> b/src/gallium/drivers/r600/r600_shader.c
> index a39301f..f7b8495 100644
> --- a/src/gallium/drivers/r600/r600_shader.c
> +++ b/src/gallium/drivers/r600/r600_shader.c
> @@ -8998,20 +8998,20 @@ static const struct r600_shader_tgsi_instruction 
> r600_shader_tgsi_instruction[]
> [TGSI_OPCODE_ENDSUB]= { ALU_OP0_NOP, tgsi_unsupported},
> [TGSI_OPCODE_TXQ_LZ]= { FETCH_OP_GET_TEXTURE_RESINFO, tgsi_tex},
> [TGSI_OPCODE_TXQS]  = { FETCH_OP_GET_NUMBER_OF_SAMPLES, tgsi_tex},
> -   [105]   = { ALU_OP0_NOP, tgsi_unsupported},
> +   [TGSI_OPCODE_RESQ]  = { ALU_OP0_NOP, tgsi_unsupported},
> [106]   = { ALU_OP0_NOP, tgsi_unsupported},
> [TGSI_OPCODE_NOP]   = { ALU_OP0_NOP, tgsi_unsupported},
> [TGSI_OPCODE_FSEQ]  = { ALU_OP2_SETE_DX10, tgsi_op2},
> [TGSI_OPCODE_FSGE]  = { ALU_OP2_SETGE_DX10, tgsi_op2},
> [TGSI_OPCODE_FSLT]  = { ALU_OP2_SETGT_DX10, tgsi_op2_swap},
> [TGSI_OPCODE_FSNE]  = { ALU_OP2_SETNE_DX10, tgsi_op2_swap},
> -   [112]   = { ALU_OP0_NOP, tgsi_unsupported},
> +   [TGSI_OPCODE_MEMBAR]= { ALU_OP0_NOP, tgsi_unsupported},
> [TGSI_OPCODE_CALLNZ]= { ALU_OP0_NOP, tgsi_unsupported},
> [114]   = { ALU_OP0_NOP, tgsi_unsupported},
> [TGSI_OPCODE_BREAKC]= { ALU_OP0_NOP, tgsi_loop_breakc},
> [TGSI_OPCODE_KILL_IF]   = { ALU_OP2_KILLGT, tgsi_kill},  /* 
> conditional kill */
> [TGSI_OPCODE_END]   = { ALU_OP0_NOP, tgsi_end},  /* aka HALT */
> -   [118]   = { ALU_OP0_NOP, tgsi_unsupported},
> +   [TGSI_OPCODE_DFMA]  = { ALU_OP0_NOP, tgsi_unsupported},
> [TGSI_OPCODE_F2I]   = { ALU_OP1_FLT_TO_INT, tgsi_op2_trans},
> [TGSI_OPCODE_IDIV]  = { ALU_OP0_NOP, tgsi_idiv},
> [TGSI_OPCODE_IMAX]  = { ALU_OP2_MAX_INT, tgsi_op2},
> @@ -9197,14 +9197,14 @@ static const struct r600_shader_tgsi_instruction 
> eg_shader_tgsi_instruction[] =
> [TGSI_OPCODE_ENDSUB]= { ALU_OP0_NOP, tgsi_unsupported},
> [TGSI_OPCODE_TXQ_LZ]= { FETCH_OP_GET_TEXTURE_RESINFO, tgsi_tex},
> [TGSI_OPCODE_TXQS]  = { FETCH_OP_GET_NUMBER_OF_SAMPLES, tgsi_tex},
> -   [105]   = { ALU_OP0_NOP, tgsi_unsupported},
> +   [TGSI_OPCODE_RESQ]  = { ALU_OP0_NOP, tgsi_unsupported},
> [106]   = { ALU_OP0_NOP, tgsi_unsupported},
> [TGSI_OPCODE_NOP]   = { ALU_OP0_NOP, tgsi_unsupported},
> [TGSI_OPCODE_FSEQ]  = { ALU_OP2_SETE_DX10, tgsi_op2},
> [TGSI_OPCODE_FSGE]  = { ALU_OP2_SETGE_DX10, tgsi_op2},
> [TGSI_OPCODE_FSLT]  = { ALU_OP2_SETGT_DX10, tgsi_op2_swap},
> [TGSI_OPCODE_FSNE]  = { ALU_OP2_SETNE_DX10, tgsi_op2_swap},
> -   [112]   = { ALU_OP0_NOP, tgsi_unsupported},
> +   [TGSI_OPCODE_MEMBAR]= { ALU_OP0_NOP, tgsi_unsupported},
> [TGSI_OPCODE_CALLNZ]= { ALU_OP0_NOP, tgsi_unsupported},
> [114]   = { ALU_OP0_NOP, tgsi_unsupported},
> [TGSI_OPCODE_BREAKC]= { ALU_OP0_NOP, tgsi_unsupported},
> @@ -9420,14 +9420,14 @@ static const struct r600_shader_tgsi_instruction 
> cm_shader_tgsi_instruction[] =
> [TGSI_OPCODE_ENDSUB]= { ALU_OP0_NOP, tgsi_unsupported},
> [TGSI_OPCODE_TXQ_LZ]= { FETCH_OP_GET_TEXTURE_RESINFO, tgsi_tex},
> [TGSI_OPCODE_TXQS]  = { FETCH_OP_GET_NUMBER_OF_SAMPLES, tgsi_tex},
> -   [105]   = { ALU_OP0_NOP, tgsi_unsupported},
> +   [TGSI_OPCODE_RESQ]  = { ALU_OP0_NOP, tgsi_unsupported},
> [106]   = { ALU_OP0_NOP, tgsi_unsupported},
> [TGSI_OPCODE_NOP]   = { ALU_OP0_NOP, tgsi_unsupported},
> [TGSI_OPCODE_FSEQ]  = { ALU_OP2_SETE_DX10, tgsi_op2},
> [TGSI_OPCODE_FSGE]  = { ALU_OP2_SETGE_DX10, tgsi_op2},
> [TGSI_OPCODE_FSLT]  = { ALU_OP2_SETGT_DX10, tgsi_op2_swap},
> [TGSI_OPCODE_FSNE]  = { ALU_OP2_SETNE_DX10, tgsi_op2_swap},
> -   [112]   = { ALU_OP0_NOP, tgsi_unsupported},
> +   [TGSI_OPCODE_MEMBAR]= { ALU_OP0_NOP, tgsi_unsupported},
> [TGSI_OPCODE_CALLNZ]= { ALU_OP0_NOP, tgsi_unsupported},
> [114]   = { ALU_OP0_NOP, tgsi_unsupported},
> 
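
As a standalone illustration of the clean-up in the patch above, the sketch
below shows a table built with C designated initializers keyed by enum names
rather than raw indices. All names (enum opcode, struct op_info, op_table) are
hypothetical and much smaller than the real r600g tables.

```c
/* Hypothetical opcode enum; the values are chosen only for this example. */
enum opcode {
    OP_NOP  = 0,
    OP_ADD  = 1,
    OP_RESQ = 2,
    OP_LAST
};

struct op_info {
    const char *name;
    int supported;
};

/* Designated initializers keyed by enum name. An entry written as
 * "[2] = ..." would fill the same slot as "[OP_RESQ] = ...", but the
 * named form stays correct and readable if the enum is renumbered. */
static const struct op_info op_table[OP_LAST] = {
    [OP_NOP]  = { "nop",  1 },
    [OP_ADD]  = { "add",  1 },
    [OP_RESQ] = { "resq", 0 },
};
```

Note that C allows the same slot to be initialized twice, with the later
initializer taking effect, so mixing a raw index with its named equivalent can
hide an entry. That appears to be the kind of issue patch 1/2 of this series
addresses.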

Re: [Mesa-dev] [PATCH v2 1/2] gallium: add cap to export device pointer size

2016-08-29 Thread Marek Olšák
On Sun, Aug 28, 2016 at 7:57 PM, Jan Vesely  wrote:
> v2: document the new cap
>
> Signed-off-by: Jan Vesely 
> ---
>  src/gallium/docs/source/screen.rst | 1 +
>  src/gallium/drivers/ilo/ilo_screen.c   | 6 ++
>  src/gallium/drivers/nouveau/nv50/nv50_screen.c | 2 ++
>  src/gallium/drivers/nouveau/nvc0/nvc0_screen.c | 2 ++
>  src/gallium/drivers/radeon/r600_pipe_common.c  | 8 
>  src/gallium/drivers/softpipe/sp_screen.c   | 1 +
>  src/gallium/include/pipe/p_defines.h   | 1 +
>  7 files changed, 21 insertions(+)
>
> diff --git a/src/gallium/docs/source/screen.rst 
> b/src/gallium/docs/source/screen.rst
> index c00d012..8c67604 100644
> --- a/src/gallium/docs/source/screen.rst
> +++ b/src/gallium/docs/source/screen.rst
> @@ -496,6 +496,7 @@ pipe_screen::get_compute_param.
>non-zero means yes, zero means no. Value type: ``uint32_t``
>  * ``PIPE_COMPUTE_CAP_SUBGROUP_SIZE``: The size of a basic execution unit in
>threads. Also known as wavefront size, warp size or SIMD width.
> +* ``PIPE_COMPUTE_CAP_ADDRESS_BITS``: The default compute device address 
> space size specified as an unsigned integer value in bits.

There is the 80 chars per line limit in this file.

Other than that, the patch is:

Reviewed-by: Marek Olšák 

Marek


Re: [Mesa-dev] [PATCH] radeonsi: Don't use global variables for tess lds

2016-08-29 Thread Marek Olšák
On Mon, Aug 29, 2016 at 2:21 AM, Edward O'Callaghan
 wrote:
> Missing Signoff-by line but otherwise with that fix,

Signed-off-by isn't required in Mesa.

The patch is:

Reviewed-by: Marek Olšák 

Marek

>
> Reviewed-By: Edward O'Callaghan 
>
> On 08/27/2016 05:52 AM, Tom Stellard wrote:
>> We were allocating global variables for the maximum LDS size
>> which made the compiler think we were using all of LDS, which
>> isn't the case.
>> ---
>>  src/gallium/drivers/radeonsi/si_shader.c | 15 ++-
>>  1 file changed, 6 insertions(+), 9 deletions(-)
>>
>> diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
>> b/src/gallium/drivers/radeonsi/si_shader.c
>> index 64c367e..5d972cb 100644
>> --- a/src/gallium/drivers/radeonsi/si_shader.c
>> +++ b/src/gallium/drivers/radeonsi/si_shader.c
>> @@ -5420,16 +5420,13 @@ static unsigned llvm_get_type_size(LLVMTypeRef type)
>>  static void declare_tess_lds(struct si_shader_context *ctx)
>>  {
>>   struct gallivm_state *gallivm = &ctx->radeon_bld.gallivm;
>> - LLVMTypeRef i32 = ctx->radeon_bld.soa.bld_base.uint_bld.elem_type;
>> - unsigned lds_size = ctx->screen->b.chip_class >= CIK ? 65536 : 32768;
>> + struct lp_build_tgsi_context *bld_base = &ctx->radeon_bld.soa.bld_base;
>> + struct lp_build_context *uint = &bld_base->uint_bld;
>>
>> - /* The actual size is computed outside of the shader to reduce
>> -  * the number of shader variants. */
>> - ctx->lds =
>> - LLVMAddGlobalInAddressSpace(gallivm->module,
>> - LLVMArrayType(i32, lds_size / 4),
>> - "tess_lds",
>> - LOCAL_ADDR_SPACE);
>> + unsigned lds_size = ctx->screen->b.chip_class >= CIK ? 65536 : 32768;
>> + ctx->lds = LLVMBuildIntToPtr(gallivm->builder, uint->zero,
>> + LLVMPointerType(LLVMArrayType(ctx->i32, lds_size / 4), 
>> LOCAL_ADDR_SPACE),
>> + "tess_lds");
>>  }
>>
>>  static void create_function(struct si_shader_context *ctx)
>>
>
>
>


Re: [Mesa-dev] [PATCH v2] st/vdpau: use temporary buffers while applying filters

2016-08-29 Thread Christian König

Hi Nayan,

yeah, that sounds like a good idea to me. In the meantime I'm taking a 
look at your Lanczos filter once more and trying to figure out where those 
artifacts come from.


The problem with using multiple output surfaces is that DRI2 didn't 
support that. E.g. in DRI2 you can only get two textures for each 
drawable (front and back).


Because of this I came up with the delayed rendering approach to avoid 
an additional copy of the texture content. For DRI3 we don't have that 
problem any more.


The steps that need to be done are:
1. Read yourself into src/gallium/auxiliary/vl/vl_winsys_dri.c and 
src/gallium/auxiliary/vl/vl_winsys_dri3.c.


2. Then make sure DRI3 is the default in the VDPAU state tracker and 
issue a warning when we need to fallback to DRI2.


3. Hack on vl_winsys_dri3.c so that the VDPAU state tracker can allocate 
multiple textures from it (one for each output surface).


4. Modify the VDPAU state tracker to use this functionality instead of 
the delayed rendering.


5. Remove the delayed rendering hack.

Regards,
Christian.

Am 29.08.2016 um 12:43 schrieb Nayan Deshmukh:

Hi Christian,

Should I start working on using DRI rather than delayed rendering to
render directly to the surface?
Also, I will need some direction to get started.


Regards,
Nayan.
On Mon, Aug 29, 2016 at 3:26 PM, Christian König
 wrote:

Am 26.08.2016 um 11:53 schrieb Nayan Deshmukh:

Use temporary buffers so that we don't read and write to the
same surface at the same time. We don't need to use linear
layout now.

v2: rebase the patch against reverted change

Signed-off-by: Nayan Deshmukh 


Reviewed, tested and pushed.

Thanks for the help,
Christian.



---
   src/gallium/state_trackers/vdpau/mixer.c | 75

   1 file changed, 57 insertions(+), 18 deletions(-)

diff --git a/src/gallium/state_trackers/vdpau/mixer.c
b/src/gallium/state_trackers/vdpau/mixer.c
index cb0ef03..c205427 100644
--- a/src/gallium/state_trackers/vdpau/mixer.c
+++ b/src/gallium/state_trackers/vdpau/mixer.c
@@ -240,8 +240,10 @@ VdpStatus vlVdpVideoMixerRender(VdpVideoMixer mixer,
  struct u_rect rect, clip, *prect, dirty_area;
  unsigned i, layer = 0;
  struct pipe_video_buffer *video_buffer;
-   struct pipe_sampler_view *sampler_view;
-   struct pipe_surface *surface;
+   struct pipe_sampler_view *sampler_view, sv_templ;
+   struct pipe_surface *surface, surf_templ;
+   struct pipe_context *pipe;
+   struct pipe_resource res_tmpl, *res;
vlVdpVideoMixer *vmixer;
  vlVdpSurface *surf;
@@ -335,25 +337,25 @@ VdpStatus vlVdpVideoMixerRender(VdpVideoMixer mixer,
  }
  vl_compositor_set_buffer_layer(&vmixer->cstate, compositor, layer,
video_buffer, prect, NULL, deinterlace);
   -   if(vmixer->bicubic.filter) {
-  struct pipe_context *pipe;
-  struct pipe_resource res_tmpl, *res;
-  struct pipe_sampler_view sv_templ;
-  struct pipe_surface surf_templ;
-
+   if (vmixer->bicubic.filter || vmixer->sharpness.filter ||
vmixer->noise_reduction.filter) {
 pipe = vmixer->device->context;
 memset(&res_tmpl, 0, sizeof(res_tmpl));
   res_tmpl.target = PIPE_TEXTURE_2D;
-  res_tmpl.width0 = surf->templat.width;
-  res_tmpl.height0 = surf->templat.height;
 res_tmpl.format = dst->sampler_view->format;
 res_tmpl.depth0 = 1;
 res_tmpl.array_size = 1;
-  res_tmpl.bind = PIPE_BIND_SAMPLER_VIEW | PIPE_BIND_RENDER_TARGET |
-  PIPE_BIND_LINEAR;
+  res_tmpl.bind = PIPE_BIND_SAMPLER_VIEW | PIPE_BIND_RENDER_TARGET;
 res_tmpl.usage = PIPE_USAGE_DEFAULT;
   +  if (!vmixer->bicubic.filter) {
+ res_tmpl.width0 = dst->surface->width;
+ res_tmpl.height0 = dst->surface->height;
+  } else {
+ res_tmpl.width0 = surf->templat.width;
+ res_tmpl.height0 = surf->templat.height;
+  }
+
 res = pipe->screen->resource_create(pipe->screen, &res_tmpl);
   vlVdpDefaultSamplerViewTemplate(&sv_templ, res);
@@ -369,6 +371,9 @@ VdpStatus vlVdpVideoMixerRender(VdpVideoMixer mixer,
 surface = dst->surface;
 sampler_view = dst->sampler_view;
 dirty_area = dst->dirty_area;
+   }
+
+   if (!vmixer->bicubic.filter) {
 vl_compositor_set_layer_dst_area(&vmixer->cstate, layer++,
RectToPipe(destination_video_rect, &rect));
 vl_compositor_set_dst_clip(&vmixer->cstate,
RectToPipe(destination_rect, &clip));
  }
@@ -394,13 +399,47 @@ VdpStatus vlVdpVideoMixerRender(VdpVideoMixer mixer,
  else {
 vl_compositor_render(&vmixer->cstate, compositor, surface,
&dirty_area, true);
   -  if (vmixer->noise_reduction.filter)
- vl_median_filter_render(vmixer->noise_reduction.filter,
- sampler_view, surface);
+  if (vmixer->noise_reduction.filter) {
+ if (!vmixer->sharpness.filter && !vmixer->bicubic.filter) {
+vl_median_filter_render(vmixer->noise_reduction.filter,
+ 
