Re: [Mesa-dev] Moving amdgpu/addrlib into a git submodule

2016-08-09 Thread Edward O'Callaghan
Hey guys,

Well taking a step back here for a second. My initial thoughts are -
what, conceptually, is the primary user of addrlib from an extremely
high level view?

If the case is mesa3d then perhaps mesa should be treated as the
upstream from which other users pull. If the case is that it is indeed
truly agnostic/general then perhaps deferring to the packaging
expertise/infrastructure of the distro's (aka pkgconfig, etc..) is
perhaps the best mechanism. In the latter case, addrlib would be
released with versioning which should be fairly maintainable as it is a
slowly moving target right?

Just my two cent, hope its helpful!
Kind Regards,
Edward.

On 08/10/2016 01:39 AM, Christian König wrote:
> Am 09.08.2016 um 15:47 schrieb Nicolai Hähnle:
>> Hi everybody,
>>
>> addrlib is the addressing and alignment calculator which is used by
>> radeonsi. It's developed (and also used) internally at AMD, and so far
>> we've had one open source copy living in the Mesa repository at
>> src/gallium/winsys/amdgpu/drm/addrlib.
>>
>> The question of using addrlib in non-Mesa parts of our open-source
>> stack has come up, in particular in relation to compute. We'd
>> obviously like to share the code rather than having multiple copies
>> flying around. Since the interface of addrlib is slow-moving but
>> unstable, shared linking is not an option.
>>
>> I think the best way forward is to create a dedicated repository for
>> addrlib which is then integrated into Mesa as a git submodule.
>>
>> The point of this email is to gather feedback from the Mesa community
>> on this plan, which is explicitly:
>>
>> (0) Create an addrlib repository, say amd/addrlib on fd.o.
>> (1) Add it as a git submodule to the Mesa repository.
>> (2) Fix up whatever aspects of the build system that may be affected
>> (perhaps for building source tarballs).
>> (3) Go back to mostly ignoring addrlib, except for trying to get
>> better at syncing with the internal closed-source version.
>>
>> From initial experiments, the impact on users interested in radeon is
>> that they will have to run `git submodule init` and then occasionally
>> `git submodule update`. Users who do not build radeonsi should be able
>> to ignore the change completely.
>>
>> There are alternatives. For example, ROCm uses Google's repo tool
>> already. But for Mesa, git submodule looks like a lightweight, well
>> supported and overall conservative option that everybody should
>> already have installed. If there are good arguments for something
>> else, let's hear them!
>>
>> Another point: if we proceed with this plan, I think we should
>> consider moving addrlib into src/amd/addrlib. There are two reasons:
>> First, transitioning to a submodule *without* changing the directory
>> is probably more fragile, i.e. what happens when you switch between
>> checkouts before and after the transition. Second, if/when radv ends
>> up being merged into Mesa master, it makes sense to have addrlib there
>> anyway.
>>
>> Thoughts? Complaints? Praise?
> 
> Well using git submodule is a possibility and we had rather good
> experience with that in GStreamer.
> 
> But it would remove one major argument to beating the addrlib guys
> towards a stable interface and/or proper library version handling.
> 
> Moving it into libdrm is clearly not an option because then you would
> need to use versioning for the whole libdrm_amdgpu library which we
> don't want.
> 
> Christian.
> 
>> Nicolai
>> ___
>> mesa-dev mailing list
>> mesa-dev@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> 
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev



signature.asc
Description: OpenPGP digital signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 0/4] RadeonSI: GPU hang fix + 3 small changes

2016-08-09 Thread Michel Dänzer
On 10/08/16 08:34 AM, Marek Olšák wrote:
> Hi,
> 
> The first patch is pretty important and fixes the root cause of
> Overlord and Witcher 2 hangs. The GLSL revert can be reverted after
> that.

I assume you don't mean "revert the revert" literally, but just
re-applying the original change again.


> Please review.

The series is

Reviewed-by: Michel Dänzer 


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 97231] GL_DEPTH_CLAMP doesn't clamp to the far plane

2016-08-09 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=97231

--- Comment #14 from Roland Scheidegger  ---
(In reply to Jules Blok from comment #13)
> Created attachment 125650 [details]
> api trace file version 4
> 
> I've added an apitrace that was captured on Linux, perhaps you will have
> less problems running that trace.

This still uses plenty of unsupported 4.0 stuff not supported on llvmpipe
unfortunately.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] st/nine: Fix invalid attempt to use indirect draws.

2016-08-09 Thread Trevor Davenport
Since commit 6d7177f01b231e9fe79a558c28d2b562a218d7ea, radeonsi
would take a different path if info->indirect_params was not
initialized properly.  Nine was not initializating this field.
---
 src/gallium/state_trackers/nine/device9.c | 1 +
 1 file changed, 1 insertion(+)

diff --git src/gallium/state_trackers/nine/device9.c 
src/gallium/state_trackers/nine/device9.c
index d233304..3f6577c 100644
--- src/gallium/state_trackers/nine/device9.c
+++ src/gallium/state_trackers/nine/device9.c
@@ -2935,6 +2935,7 @@ init_draw_info(struct pipe_draw_info *info,
 info->restart_index = 0;
 info->count_from_stream_output = NULL;
 info->indirect = NULL;
+info->indirect_params = NULL;
 }
 
 HRESULT NINE_WINAPI
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 4/6] nir: Introduce a nir_opt_move_comparisons() pass.

2016-08-09 Thread Kenneth Graunke
This tries to move comparisons (a common source of boolean values)
closer to their first use.  For GPUs which use condition codes,
this can eliminate a lot of temporary booleans and comparisons
which reload the condition code register based on a boolean.

Signed-off-by: Kenneth Graunke 
---
 src/compiler/Makefile.sources   |   1 +
 src/compiler/nir/nir.h  |   2 +
 src/compiler/nir/nir_opt_move_comparisons.c | 173 
 3 files changed, 176 insertions(+)
 create mode 100644 src/compiler/nir/nir_opt_move_comparisons.c

diff --git a/src/compiler/Makefile.sources b/src/compiler/Makefile.sources
index 0ff9b23..008a101 100644
--- a/src/compiler/Makefile.sources
+++ b/src/compiler/Makefile.sources
@@ -226,6 +226,7 @@ NIR_FILES = \
nir/nir_opt_gcm.c \
nir/nir_opt_global_to_local.c \
nir/nir_opt_peephole_select.c \
+   nir/nir_opt_move_comparisons.c \
nir/nir_opt_remove_phis.c \
nir/nir_opt_undef.c \
nir/nir_phi_builder.c \
diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
index 9ce5be2..79511a7 100644
--- a/src/compiler/nir/nir.h
+++ b/src/compiler/nir/nir.h
@@ -2574,6 +2574,8 @@ bool nir_opt_dead_cf(nir_shader *shader);
 
 void nir_opt_gcm(nir_shader *shader);
 
+bool nir_opt_move_comparisons(nir_shader *shader);
+
 bool nir_opt_peephole_select(nir_shader *shader);
 
 bool nir_opt_remove_phis(nir_shader *shader);
diff --git a/src/compiler/nir/nir_opt_move_comparisons.c 
b/src/compiler/nir/nir_opt_move_comparisons.c
new file mode 100644
index 000..74927c9
--- /dev/null
+++ b/src/compiler/nir/nir_opt_move_comparisons.c
@@ -0,0 +1,173 @@
+/*
+ * Copyright © 2016 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#include "nir.h"
+
+/**
+ * \file nir_opt_move_comparisons.c
+ *
+ * This pass moves ALU comparison operations just before their first use.
+ *
+ * It only moves instructions within a single basic block; cross-block
+ * movement is left to global code motion.
+ *
+ * Many GPUs generate condition codes for comparisons, and use predication
+ * for conditional selects and control flow.  In a sequence such as:
+ *
+ * vec1 32 ssa_1 = flt a b
+ * 
+ * vec1 32 ssa_2 = bcsel ssa_1 c d
+ *
+ * the backend would likely do the comparison, producing condition codes,
+ * then save those to a boolean value.  The intervening operations might
+ * trash the condition codes.  Then, in order to do the bcsel, it would
+ * need to re-populate the condition code register based on the boolean.
+ *
+ * By moving the comparison just before the bcsel, the condition codes could
+ * be used directly.  This eliminates the need to reload them from the boolean
+ * (generally eliminating an instruction).  It may also eliminate the need to
+ * create a boolean value altogether (unless it's used elsewhere), which could
+ * lower register pressure.
+ */
+
+static bool
+is_comparison(nir_op op)
+{
+   switch (op) {
+   case nir_op_flt:
+   case nir_op_fge:
+   case nir_op_feq:
+   case nir_op_fne:
+   case nir_op_ilt:
+   case nir_op_ult:
+   case nir_op_ige:
+   case nir_op_uge:
+   case nir_op_ieq:
+   case nir_op_ine:
+  return true;
+   default:
+  return false;
+   }
+}
+
+static bool
+move_comparison_source(nir_src *src, nir_block *block, struct exec_node 
*before)
+{
+   if (src->is_ssa && src->ssa->parent_instr->block == block &&
+   src->ssa->parent_instr->type == nir_instr_type_alu &&
+   is_comparison(nir_instr_as_alu(src->ssa->parent_instr)->op)) {
+
+  struct exec_node *src_node = >ssa->parent_instr->node;
+  exec_node_remove(src_node);
+
+  if (before)
+ exec_node_insert_node_before(before, src_node);
+  else
+ exec_list_push_tail(>instr_list, src_node);
+
+  return true;
+   }
+
+   return false;
+}
+
+/* nir_foreach_src callback 

[Mesa-dev] [PATCH 3/6] nir: Turn -(b2f(a) + b2f(b) >= 0 into !(a || b).

2016-08-09 Thread Kenneth Graunke
On Haswell (GL 3.3):

total instructions in shared programs: 6211485 -> 6211427 (-0.00%)
instructions in affected programs: 16260 -> 16202 (-0.36%)
helped: 25
HURT: 37

On Broadwell (GL 4.4):

total instructions in shared programs: 11640288 -> 11640218 (-0.00%)
instructions in affected programs: 16313 -> 16243 (-0.43%)
helped: 27
HURT: 37

Signed-off-by: Kenneth Graunke 
---
 src/compiler/nir/nir_opt_algebraic.py | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/compiler/nir/nir_opt_algebraic.py 
b/src/compiler/nir/nir_opt_algebraic.py
index 4e9896f..ef87d4d 100644
--- a/src/compiler/nir/nir_opt_algebraic.py
+++ b/src/compiler/nir/nir_opt_algebraic.py
@@ -135,6 +135,9 @@ optimizations = [
# inot(a)
(('fge', 0.0, ('b2f', a)), ('inot', a)),
 
+   # -(b2f(a) + b2f(b)) >= 0 becomes !(a || b)
+   (('fge', ('fneg', ('fadd', ('b2f', 'a@bool'), ('b2f', 'b@bool'))), 0.0), 
('inot', ('ior', a, b))),
+
# 0.0 < fabs(a)
# fabs(a) > 0.0
# fabs(a) != 0.0 because fabs(a) must be >= 0
-- 
2.9.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/6] nir: Turn bcsel of +/- 1.0 and 0.0 into b2f sequences.

2016-08-09 Thread Kenneth Graunke
On Haswell (GL 3.3):

total instructions in shared programs: 6208759 -> 6203860 (-0.08%)
instructions in affected programs: 856541 -> 851642 (-0.57%)
helped: 3157
HURT: 113
LOST:   7
GAINED: 15

On Broadwell (GL 4.4):

total instructions in shared programs: 11637854 -> 11632016 (-0.05%)
instructions in affected programs: 1055693 -> 1049855 (-0.55%)
helped: 3900
HURT: 176
LOST:   1
GAINED: 18

Signed-off-by: Kenneth Graunke 
---
 src/compiler/nir/nir_opt_algebraic.py | 4 
 1 file changed, 4 insertions(+)

diff --git a/src/compiler/nir/nir_opt_algebraic.py 
b/src/compiler/nir/nir_opt_algebraic.py
index 1cf614c..4e9896f 100644
--- a/src/compiler/nir/nir_opt_algebraic.py
+++ b/src/compiler/nir/nir_opt_algebraic.py
@@ -251,6 +251,10 @@ optimizations = [
(('ieq', 'a@bool', False), ('inot', 'a')),
(('bcsel', a, True, False), ('ine', a, 0)),
(('bcsel', a, False, True), ('ieq', a, 0)),
+   (('bcsel@32', a, 1.0, 0.0), ('b2f', ('ine', a, 0))),
+   (('bcsel@32', a, 0.0, 1.0), ('b2f', ('ieq', a, 0))),
+   (('bcsel@32', a, -1.0, -0.0), ('fneg', ('b2f', ('ine', a, 0,
+   (('bcsel@32', a, -0.0, -1.0), ('fneg', ('b2f', ('ieq', a, 0,
(('bcsel', True, b, c), b),
(('bcsel', False, b, c), c),
# The result of this should be hit by constant propagation and, in the
-- 
2.9.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 5/6] i965: Move nir_lower_locals_to_regs a bit later.

2016-08-09 Thread Kenneth Graunke
I'm going to add a boolean scheduling pass that I want run late, but
after copy propagation and dead code elimination.  Yet, I don't want
to have to think about registers.  So, move the register conversion
a little later.

No impact on shader-db.  Suggested by Jason Ekstrand.

Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_nir.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_nir.c 
b/src/mesa/drivers/dri/i965/brw_nir.c
index d12a946..db2056d 100644
--- a/src/mesa/drivers/dri/i965/brw_nir.c
+++ b/src/mesa/drivers/dri/i965/brw_nir.c
@@ -484,12 +484,12 @@ brw_postprocess_nir(nir_shader *nir,
 
OPT(nir_opt_algebraic_late);
 
-   OPT(nir_lower_locals_to_regs);
-
OPT_V(nir_lower_to_source_mods);
OPT(nir_copy_prop);
OPT(nir_opt_dce);
 
+   OPT(nir_lower_locals_to_regs);
+
if (unlikely(debug_enabled)) {
   /* Re-index SSA defs so we print more sensible numbers. */
   nir_foreach_function(function, nir) {
-- 
2.9.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 6/6] i965: Use the nir_move_comparisons pass.

2016-08-09 Thread Kenneth Graunke
On Haswell (GL 3.3):

total instructions in shared programs: 6211427 -> 6210079 (-0.02%)
instructions in affected programs: 219356 -> 218008 (-0.61%)
helped: 567
HURT: 132

No spill/fill changes.

LOST:   0
GAINED: 4

On Broadwell (GL 4.4):

total instructions in shared programs: 11640218 -> 11632136 (-0.07%)
instructions in affected programs: 542528 -> 534446 (-1.49%)
helped: 1661
HURT: 143

total spills in shared programs: 2922 -> 2932 (0.34%)
spills in affected programs: 420 -> 430 (2.38%)
helped: 1
HURT: 1

total fills in shared programs: 4389 -> 4390 (0.02%)
fills in affected programs: 967 -> 968 (0.10%)
helped: 1
HURT: 1

Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_nir.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/mesa/drivers/dri/i965/brw_nir.c 
b/src/mesa/drivers/dri/i965/brw_nir.c
index db2056d..b02c404 100644
--- a/src/mesa/drivers/dri/i965/brw_nir.c
+++ b/src/mesa/drivers/dri/i965/brw_nir.c
@@ -487,6 +487,7 @@ brw_postprocess_nir(nir_shader *nir,
OPT_V(nir_lower_to_source_mods);
OPT(nir_copy_prop);
OPT(nir_opt_dce);
+   OPT(nir_opt_move_comparisons);
 
OPT(nir_lower_locals_to_regs);
 
-- 
2.9.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/6] nir: Convert ineg(b2i(a)) to a if it's a boolean.

2016-08-09 Thread Kenneth Graunke
On Haswell (GL 3.3):

total instructions in shared programs: 6211678 -> 6211584 (-0.00%)
instructions in affected programs: 18968 -> 18874 (-0.50%)
helped: 62
HURT: 0
LOST:   0
GAINED: 4

On Broadwell (GL 4.4):

total instructions in shared programs: 11638930 -> 11638181 (-0.01%)
instructions in affected programs: 84768 -> 84019 (-0.88%)
helped: 505
HURT: 45

Signed-off-by: Kenneth Graunke 
---
 src/compiler/nir/nir_opt_algebraic.py | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/compiler/nir/nir_opt_algebraic.py 
b/src/compiler/nir/nir_opt_algebraic.py
index 0f0896b..1cf614c 100644
--- a/src/compiler/nir/nir_opt_algebraic.py
+++ b/src/compiler/nir/nir_opt_algebraic.py
@@ -180,6 +180,7 @@ optimizations = [
(('fmul', ('b2f', a), ('b2f', b)), ('b2f', ('iand', a, b))),
(('fsat', ('fadd', ('b2f', a), ('b2f', b))), ('b2f', ('ior', a, b))),
(('iand', 'a@bool', 1.0), ('b2f', a)),
+   (('ineg', ('b2i', 'a@bool')), a),
(('flt', ('fneg', ('b2f', a)), 0), a), # Generated by TGSI KILL_IF.
(('flt', ('fsub', 0.0, ('b2f', a)), 0), a), # Generated by TGSI KILL_IF.
# Comparison with the same args.  Note that these are not done for
-- 
2.9.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] glsl: look for frag data bindings with [0] tacked onto the end for arrays

2016-08-09 Thread Ilia Mirkin
ping? do we want this? should i drop it?

On Wed, Jul 13, 2016 at 3:37 PM, Ilia Mirkin  wrote:
> Thanks for confirming, Corentin.
>
> Ian, do you have any opinions on this? Seems like a fairly innocuous
> thing to be doing...
>
> On Fri, Jul 8, 2016 at 3:39 PM, Corentin Wallez  wrote:
>> Not sure how reviews work in Mesa, but this patch LGTM. I also tested that
>> it fixes the relevant tests failures it is supposed to address.
>>
>> On Wed, Jul 6, 2016 at 7:40 PM, Ilia Mirkin  wrote:
>>>
>>> The GL spec is very unclear on this point. Apparently this is discussed
>>> without resolution in the closed Khronos bugtracker at
>>> https://cvs.khronos.org/bugzilla/show_bug.cgi?id=7829 . The
>>> recommendation is to allow dropping the [0] for looking up the bindings.
>>>
>>> The approach taken in this patch is to instead tack on [0]'s for each
>>> arrayness level of the output's type, and doing the lookup again. That
>>> way, for
>>>
>>> out vec4 foo[2][2][2]
>>>
>>> we will end up looking for bindings for foo, foo[0], foo[0][0], and
>>> foo[0][0][0], in that order of preference.
>>>
>>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96765
>>> Signed-off-by: Ilia Mirkin 
>>> ---
>>>  src/compiler/glsl/linker.cpp | 39 ---
>>>  1 file changed, 28 insertions(+), 11 deletions(-)
>>>
>>> diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp
>>> index d963f54..9d54c2f 100644
>>> --- a/src/compiler/glsl/linker.cpp
>>> +++ b/src/compiler/glsl/linker.cpp
>>> @@ -2566,6 +2566,7 @@ find_available_slots(unsigned used_mask, unsigned
>>> needed_count)
>>>  /**
>>>   * Assign locations for either VS inputs or FS outputs
>>>   *
>>> + * \param mem_ctx   Temporary ralloc context used for linking
>>>   * \param prog  Shader program whose variables need locations
>>> assigned
>>>   * \param constants Driver specific constant values for the program.
>>>   * \param target_index  Selector for the program target to receive
>>> location
>>> @@ -2577,7 +2578,8 @@ find_available_slots(unsigned used_mask, unsigned
>>> needed_count)
>>>   * error is emitted to the shader link log and false is returned.
>>>   */
>>>  bool
>>> -assign_attribute_or_color_locations(gl_shader_program *prog,
>>> +assign_attribute_or_color_locations(void *mem_ctx,
>>> +gl_shader_program *prog,
>>>  struct gl_constants *constants,
>>>  unsigned target_index)
>>>  {
>>> @@ -2680,16 +2682,31 @@
>>> assign_attribute_or_color_locations(gl_shader_program *prog,
>>>} else if (target_index == MESA_SHADER_FRAGMENT) {
>>>  unsigned binding;
>>>  unsigned index;
>>> + const char *name = var->name;
>>> + const glsl_type *type = var->type;
>>> +
>>> + while (type) {
>>> +/* Check if there's a binding for the variable name */
>>> +if (prog->FragDataBindings->get(binding, name)) {
>>> +   assert(binding >= FRAG_RESULT_DATA0);
>>> +   var->data.location = binding;
>>> +   var->data.is_unmatched_generic_inout = 0;
>>> +
>>> +   if (prog->FragDataIndexBindings->get(index, name)) {
>>> +  var->data.index = index;
>>> +   }
>>> +   break;
>>> +}
>>>
>>> -if (prog->FragDataBindings->get(binding, var->name)) {
>>> -   assert(binding >= FRAG_RESULT_DATA0);
>>> -   var->data.location = binding;
>>> -var->data.is_unmatched_generic_inout = 0;
>>> +/* If not, but it's an array type, look for name[0] */
>>> +if (type->is_array()) {
>>> +   name = ralloc_asprintf(mem_ctx, "%s[0]", name);
>>> +   type = type->fields.array;
>>> +   continue;
>>> +}
>>>
>>> -   if (prog->FragDataIndexBindings->get(index, var->name)) {
>>> -  var->data.index = index;
>>> -   }
>>> -}
>>> +break;
>>> + }
>>>}
>>>
>>>/* From GL4.5 core spec, section 15.2 (Shader Execution):
>>> @@ -4816,12 +4833,12 @@ link_shaders(struct gl_context *ctx, struct
>>> gl_shader_program *prog)
>>>prev = i;
>>> }
>>>
>>> -   if (!assign_attribute_or_color_locations(prog, >Const,
>>> +   if (!assign_attribute_or_color_locations(mem_ctx, prog, >Const,
>>>  MESA_SHADER_VERTEX)) {
>>>goto done;
>>> }
>>>
>>> -   if (!assign_attribute_or_color_locations(prog, >Const,
>>> +   if (!assign_attribute_or_color_locations(mem_ctx, prog, >Const,
>>>  MESA_SHADER_FRAGMENT)) {
>>>goto done;
>>> }
>>> --
>>> 2.7.3
>>>
>>
___
mesa-dev mailing list

[Mesa-dev] [PATCH 2/2] aubinator: Fix the tool to correctly decode the DWords

2016-08-09 Thread Sirisha Gandikota
From: Sirisha Gandikota 

Several fixes have been added as part of this as listed below:

1) Fix the mask and add disassembler handling for STATE_DS, STATE_HS
as the mask returned wrong values of the fields.

2) Fix the GEN_TYPE_ADDRESS/GEN_TYPE_OFFSET decoding - the address/
offset were handled the same way as the other fields and that gives
the wrong values for the address/offset.

3) Decode nested/recurssive structures - Many packets contain nested
structures, ex: 3DSATE_SO_BUFFER, STATE_BASE_ADDRESS, etc contain MOC
structures. Previously, the aubinator printed 1 if there was a MOC
structure. Now we decode the entire structure and print out its fields.

4) Print out the DWord address along with its hex value - For a better
clarity of information, it is helpful to print both the address and
hex value of the DWord along with the DWord count. Since the DWord0
contains the instruction code and the instruction length, it is
unnecessary to print the decoded values for DWord0. This information
is already available from the DWord hex value.

5) Decode the  and the corresponding fields in the group- The
 tag can have fields of several types including structures. A
group can contain one or more number of fields and this has be correctly
decoded. Previously, aubinator did not decode the groups or the
fields/structures inside them. Now we decode the  in the
instructions and structures where the fields in it repeat for any number
of times specified.

Signed-off-by: Sirisha Gandikota 
---
 src/intel/tools/aubinator.c | 115 +---
 src/intel/tools/decoder.c   |  95 +---
 src/intel/tools/decoder.h   |  39 +++
 3 files changed, 192 insertions(+), 57 deletions(-)

diff --git a/src/intel/tools/aubinator.c b/src/intel/tools/aubinator.c
index 99d67a1..d2f9d13 100644
--- a/src/intel/tools/aubinator.c
+++ b/src/intel/tools/aubinator.c
@@ -84,17 +84,80 @@ valid_offset(uint32_t offset)
 }
 
 static void
+print_dword_val(struct gen_field_iterator *iter, uint64_t offset, int 
*dword_num)
+{
+   struct gen_field *f;
+   union {
+  uint32_t dw;
+  float f;
+   } v;
+
+   f = iter->group->fields[iter->i-1];
+   v.dw = iter->p[f->start / 32];
+
+   if (*dword_num != (f->start/32)) {
+  printf("0x%08lx:  0x%08x : Dword %d\n",(offset+4*(f->start/32)), v.dw, 
f->start/32);
+  *dword_num = (f->start/32);
+   }
+}
+
+static char *
+print_iterator_values(struct gen_field_iterator *iter, int *idx)
+{
+char *token = NULL;
+if (strstr(iter->value,"struct") == NULL) {
+printf("%s: %s\n", iter->name, iter->value);
+} else {
+token = strtok(iter->value, " ");
+if (token != NULL) {
+token = strtok(NULL, " ");
+*idx = atoi(strtok(NULL, ">"));
+} else {
+token = NULL;
+}
+printf("%s:\n", iter->name, token);
+}
+return token;
+
+}
+
+static void
 decode_structure(struct gen_spec *spec, struct gen_group *strct, const 
uint32_t *p)
 {
struct gen_field_iterator iter;
+   char *token = NULL;
+   int idx = 0, dword_num = 0;
+   uint64_t offset = 0;
+
+   if (option_print_offsets)
+  offset = (void *) p - gtt;
+   else
+  offset = 0;
 
gen_field_iterator_init(, strct, p);
while (gen_field_iterator_next()) {
-  printf("%s: %s\n", iter.name, iter.value);
+  idx = 0;
+  print_dword_val(, offset, _num);
+  token = print_iterator_values(, );
+  if (token != NULL) {
+ struct gen_group *struct_val = gen_spec_find_struct(spec, token);
+ decode_structure(spec, struct_val, [idx]);
+ token = NULL;
+  }
}
 }
 
 static void
+handle_struct_decode(struct gen_spec *spec, char *struct_name, uint32_t *p)
+{
+if (struct_name == NULL)
+return;
+struct gen_group *struct_val = gen_spec_find_struct(spec, struct_name);
+decode_structure(spec, struct_val, p);
+
+}
+
+static void
 dump_binding_table(struct gen_spec *spec, uint32_t offset)
 {
uint32_t *pointers, i;
@@ -248,7 +311,8 @@ handle_media_interface_descriptor_load(struct gen_spec 
*spec, uint32_t *p)
 }
 
 /* Heuristic to determine whether a uint32_t is probably actually a float
- * (http://stackoverflow.com/a/2953466) */
+ * (http://stackoverflow.com/a/2953466) 
+ */
 
 static bool
 probably_float(uint32_t bits)
@@ -256,15 +320,15 @@ probably_float(uint32_t bits)
int exp = ((bits & 0x7f80U) >> 23) - 127;
uint32_t mant = bits & 0x007f;
 
-   // +- 0.0
+   /* +- 0.0 */
if (exp == -127 && mant == 0)
   return true;
 
-   // +- 1 billionth to 1 billion
+   /* +- 1 billionth to 1 billion */
if (-30 <= exp && exp <= 30)
   return true;
 
-   // some value with only a few binary digits
+   /* some value with only a few binary digits */
if ((mant & 0x) == 0)
   return true;
 
@@ -342,6 +406,30 @@ 

[Mesa-dev] [PATCH 0/2] *** Aubinator tool for Intel Gen platforms ***

2016-08-09 Thread Sirisha Gandikota
From: Sirisha Gandikota 

This is a patch series for adding the aubinator tool to the codebase.

The aubinator tool is designed to help the driver developers to debug
the driver functionality by decoding the data in the .aub files. This
tool is for Intel Gen platforms and has been tested for Gen7.5, Gen8 
and Gen9 platforms.

Kristian Høgsberg Kristensen (1):
  aubinator: Add a new tool called Aubinator to the src/intel/tools
folder.

Sirisha Gandikota (1):
  aubinator: Fix the tool to correctly decode the DWords

 configure.ac |1 +
 src/Makefile.am  |4 +
 src/intel/tools/Makefile.am  |   65 +++
 src/intel/tools/aubinator.c  | 1140 ++
 src/intel/tools/decoder.c|  515 +++
 src/intel/tools/decoder.h|   96 
 src/intel/tools/disasm.c |  109 
 src/intel/tools/gen_disasm.h |   35 ++
 src/intel/tools/intel_aub.h  |  154 ++
 9 files changed, 2119 insertions(+)
 create mode 100644 src/intel/tools/Makefile.am
 create mode 100644 src/intel/tools/aubinator.c
 create mode 100644 src/intel/tools/decoder.c
 create mode 100644 src/intel/tools/decoder.h
 create mode 100644 src/intel/tools/disasm.c
 create mode 100644 src/intel/tools/gen_disasm.h
 create mode 100644 src/intel/tools/intel_aub.h

-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/2] aubinator: Add a new tool called Aubinator to the src/intel/tools folder.

2016-08-09 Thread Sirisha Gandikota
From: Kristian Høgsberg Kristensen 

The Aubinator tool is designed to help the driver developers in debugging
the driver functionality by decoding the data in the .aub files.
Primary Authors of this tool are Damien Lespiau 
and Kristian Høgsberg Kristensen .

Signed-off-by: Sirisha Gandikota 
---
 configure.ac |1 +
 src/Makefile.am  |4 +
 src/intel/tools/Makefile.am  |   65 +++
 src/intel/tools/aubinator.c  | 1039 ++
 src/intel/tools/decoder.c|  520 +
 src/intel/tools/decoder.h|   57 +++
 src/intel/tools/disasm.c |  109 +
 src/intel/tools/gen_disasm.h |   35 ++
 src/intel/tools/intel_aub.h  |  154 +++
 9 files changed, 1984 insertions(+)
 create mode 100644 src/intel/tools/Makefile.am
 create mode 100644 src/intel/tools/aubinator.c
 create mode 100644 src/intel/tools/decoder.c
 create mode 100644 src/intel/tools/decoder.h
 create mode 100644 src/intel/tools/disasm.c
 create mode 100644 src/intel/tools/gen_disasm.h
 create mode 100644 src/intel/tools/intel_aub.h

diff --git a/configure.ac b/configure.ac
index aea5890..acc8356 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2742,6 +2742,7 @@ AC_CONFIG_FILES([Makefile
src/intel/genxml/Makefile
src/intel/isl/Makefile
src/intel/vulkan/Makefile
+   src/intel/tools/Makefile
src/loader/Makefile
src/mapi/Makefile
src/mapi/es1api/glesv1_cm.pc
diff --git a/src/Makefile.am b/src/Makefile.am
index d4e34b4..190ad08 100644
--- a/src/Makefile.am
+++ b/src/Makefile.am
@@ -88,6 +88,10 @@ if HAVE_INTEL_VULKAN
 SUBDIRS += intel/vulkan
 endif
 
+if HAVE_INTEL_DRIVERS
+SUBDIRS += intel/tools
+endif
+
 if HAVE_GALLIUM
 SUBDIRS += gallium
 endif
diff --git a/src/intel/tools/Makefile.am b/src/intel/tools/Makefile.am
new file mode 100644
index 000..cf5477f
--- /dev/null
+++ b/src/intel/tools/Makefile.am
@@ -0,0 +1,65 @@
+# Copyright © 2016 Intel Corporation
+#
+# Permission is hereby granted, free of charge, to any person obtaining a
+# copy of this software and associated documentation files (the "Software"),
+# to deal in the Software without restriction, including without limitation
+# the rights to use, copy, modify, merge, publish, distribute, sublicense,
+# and/or sell copies of the Software, and to permit persons to whom the
+# Software is furnished to do so, subject to the following conditions:
+#
+# The above copyright notice and this permission notice (including the next
+# paragraph) shall be included in all copies or substantial portions of the
+# Software.
+#
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+# IN THE SOFTWARE.
+
+CLEANFILES = \
+   aubinator-aubinator.o \
+   aubinator-decoder.o \
+   disasm.lo \
+   libdisasm.la
+
+noinst_LTLIBRARIES = libdisasm.la
+
+AM_CPPFLAGS = \
+   $(INTEL_CFLAGS) \
+   $(VALGRIND_CFLAGS) \
+   $(DEFINES) \
+   -I$(top_srcdir)/include \
+   -I$(top_builddir)/src \
+   -I$(top_srcdir)/src \
+   -I$(top_builddir)/src/compiler \
+   -I$(top_srcdir)/src/compiler \
+   -I$(top_builddir)/src/compiler/nir \
+   -I$(top_srcdir)/src/mapi \
+   -I$(top_srcdir)/src/mesa \
+   -I$(top_srcdir)/src/mesa/drivers/dri/common \
+   -I$(top_srcdir)/src/mesa/drivers/dri/i965 \
+   -I$(top_srcdir)/src/gallium/auxiliary \
+   -I$(top_srcdir)/src/gallium/include \
+   -I$(top_builddir)/src/intel \
+   -I$(top_srcdir)/src/intel
+
+libdisasm_la_SOURCES = \
+   gen_disasm.h \
+   disasm.c
+libdisasm_la_LIBADD = 
$(top_builddir)/src/mesa/drivers/dri/i965/libi965_compiler.la \
+   $(top_builddir)/src/mesa/libmesa.la \
+   $(PER_GEN_LIBS) \
+   $(PTHREAD_LIBS) \
+   $(DLOPEN_LIBS) \
+   -lm
+
+
+bin_PROGRAMS = aubinator
+
+aubinator_SOURCES = intel_aub.h decoder.c decoder.h aubinator.c
+aubinator_CFLAGS =  $(EXPAT_CFLAGS) $(AM_CFLAGS) -I$(top_srcdir)/src 
-I$(top_srcdir)/include
+
+aubinator_LDADD = $(EXPAT_LIBS) libdisasm.la
diff --git a/src/intel/tools/aubinator.c b/src/intel/tools/aubinator.c
new file mode 100644
index 000..99d67a1
--- /dev/null
+++ b/src/intel/tools/aubinator.c
@@ -0,0 +1,1039 @@
+/*
+ * Copyright © 2016 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation 

[Mesa-dev] [PATCH 4/4] radeonsi: set CB_COLORn_INFO.ROUND_MODE

2016-08-09 Thread Marek Olšák
From: Marek Olšák 

just do what the register spec says
---
 src/gallium/drivers/radeonsi/si_state.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/src/gallium/drivers/radeonsi/si_state.c 
b/src/gallium/drivers/radeonsi/si_state.c
index ebd2b7d..94dbe4c 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -2060,20 +2060,25 @@ static void si_initialize_color_surface(struct 
si_context *sctx,
(format == V_028C70_COLOR_8 ||
 format == V_028C70_COLOR_8_8 ||
 format == V_028C70_COLOR_8_8_8_8))
surf->color_is_int8 = true;
 
color_info = S_028C70_FORMAT(format) |
S_028C70_COMP_SWAP(swap) |
S_028C70_BLEND_CLAMP(blend_clamp) |
S_028C70_BLEND_BYPASS(blend_bypass) |
S_028C70_SIMPLE_FLOAT(1) |
+   S_028C70_ROUND_MODE(ntype != V_028C70_NUMBER_UNORM &&
+   ntype != V_028C70_NUMBER_SNORM &&
+   ntype != V_028C70_NUMBER_SRGB &&
+   format != V_028C70_COLOR_8_24 &&
+   format != V_028C70_COLOR_24_8) |
S_028C70_NUMBER_TYPE(ntype) |
S_028C70_ENDIAN(endian);
 
/* Intensity is implemented as Red, so treat it that way. */
color_attrib = S_028C74_FORCE_DST_ALPHA_1(desc->swizzle[3] == 
PIPE_SWIZZLE_1 ||
  
util_format_is_intensity(surf->base.format));
 
if (rtex->resource.b.b.nr_samples > 1) {
unsigned log_samples = 
util_logbase2(rtex->resource.b.b.nr_samples);
 
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/4] radeonsi: disallow MIN/MAX blend equations for dual source blending

2016-08-09 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_state.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/src/gallium/drivers/radeonsi/si_state.c 
b/src/gallium/drivers/radeonsi/si_state.c
index f5b2330..8f7203e 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -457,20 +457,30 @@ static void *si_create_blend_state_mode(struct 
pipe_context *ctx,
unsigned blend_cntl = 0;
 
sx_mrt_blend_opt[i] =

S_028760_COLOR_COMB_FCN(V_028760_OPT_COMB_BLEND_DISABLED) |

S_028760_ALPHA_COMB_FCN(V_028760_OPT_COMB_BLEND_DISABLED);
 
/* Only set dual source blending for MRT0 to avoid a hang. */
if (i >= 1 && blend->dual_src_blend)
continue;
 
+   /* Only addition and subtraction equations are supported with
+* dual source blending.
+*/
+   if (blend->dual_src_blend &&
+   (eqRGB == PIPE_BLEND_MIN || eqRGB == PIPE_BLEND_MAX ||
+eqA == PIPE_BLEND_MIN || eqA == PIPE_BLEND_MAX)) {
+   assert(!"Unsupported equation for dual source 
blending");
+   continue;
+   }
+
if (!state->rt[j].colormask)
continue;
 
/* cb_render_state will disable unused ones */
blend->cb_target_mask |= (unsigned)state->rt[j].colormask << (4 
* i);
 
if (!state->rt[j].blend_enable) {
si_pm4_set_reg(pm4, R_028780_CB_BLEND0_CONTROL + i * 4, 
blend_cntl);
continue;
}
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/4] radeonsi: only set dual source blending for MRT0

2016-08-09 Thread Marek Olšák
From: Marek Olšák 

This is the proper fix for Overlord and Witcher 2 hangs.

The hang condition is that 1 app must write to MRT0 and MRT1 from a pixel
shader while MRT1 is disabled in CB_TARGET_MASK (does this generate
unflushable pixel quads? I don't know), and another app (e.g. Glamor)
must enable dual source blending in both MRT0 and MRT1. The hw gets
confused, which leads to corruption and hangs.

Cc: 12.0 11.2 
---
 src/gallium/drivers/radeonsi/si_state.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/src/gallium/drivers/radeonsi/si_state.c 
b/src/gallium/drivers/radeonsi/si_state.c
index 47fc7a0..f5b2330 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -453,20 +453,24 @@ static void *si_create_blend_state_mode(struct 
pipe_context *ctx,
unsigned srcA = state->rt[j].alpha_src_factor;
unsigned dstA = state->rt[j].alpha_dst_factor;
 
unsigned srcRGB_opt, dstRGB_opt, srcA_opt, dstA_opt;
unsigned blend_cntl = 0;
 
sx_mrt_blend_opt[i] =

S_028760_COLOR_COMB_FCN(V_028760_OPT_COMB_BLEND_DISABLED) |

S_028760_ALPHA_COMB_FCN(V_028760_OPT_COMB_BLEND_DISABLED);
 
+   /* Only set dual source blending for MRT0 to avoid a hang. */
+   if (i >= 1 && blend->dual_src_blend)
+   continue;
+
if (!state->rt[j].colormask)
continue;
 
/* cb_render_state will disable unused ones */
blend->cb_target_mask |= (unsigned)state->rt[j].colormask << (4 
* i);
 
if (!state->rt[j].blend_enable) {
si_pm4_set_reg(pm4, R_028780_CB_BLEND0_CONTROL + i * 4, 
blend_cntl);
continue;
}
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 0/4] RadeonSI: GPU hang fix + 3 small changes

2016-08-09 Thread Marek Olšák
Hi,

The first patch is pretty important and fixes the root cause of Overlord and 
Witcher 2 hangs. The GLSL revert can be reverted after that.

Please review.

Marek

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 3/4] radeonsi: set CB_COLORn_INFO.SIMPLE_FLOAT

2016-08-09 Thread Marek Olšák
From: Marek Olšák 

This can help enable some blend optimizations (see the register spec).
Vulkan always sets this.
---
 src/gallium/drivers/radeonsi/si_state.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/gallium/drivers/radeonsi/si_state.c 
b/src/gallium/drivers/radeonsi/si_state.c
index 8f7203e..ebd2b7d 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -2059,20 +2059,21 @@ static void si_initialize_color_surface(struct 
si_context *sctx,
if ((ntype == V_028C70_NUMBER_UINT || ntype == V_028C70_NUMBER_SINT) &&
(format == V_028C70_COLOR_8 ||
 format == V_028C70_COLOR_8_8 ||
 format == V_028C70_COLOR_8_8_8_8))
surf->color_is_int8 = true;
 
color_info = S_028C70_FORMAT(format) |
S_028C70_COMP_SWAP(swap) |
S_028C70_BLEND_CLAMP(blend_clamp) |
S_028C70_BLEND_BYPASS(blend_bypass) |
+   S_028C70_SIMPLE_FLOAT(1) |
S_028C70_NUMBER_TYPE(ntype) |
S_028C70_ENDIAN(endian);
 
/* Intensity is implemented as Red, so treat it that way. */
color_attrib = S_028C74_FORCE_DST_ALPHA_1(desc->swizzle[3] == 
PIPE_SWIZZLE_1 ||
  
util_format_is_intensity(surf->base.format));
 
if (rtex->resource.b.b.nr_samples > 1) {
unsigned log_samples = 
util_logbase2(rtex->resource.b.b.nr_samples);
 
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [AppVeyor] mesa master #1900 failed

2016-08-09 Thread AppVeyor



Build mesa 1900 failed


Commit 3f100b77f9 by Marek Olšák on 8/1/2016 10:48 PM:

gallium/radeon: use unflushed fences for deferred flushes (v2)\n\n+23% Bioshock Infinite performance.\n\nv2: - use the new fence_finish interface\n- allow deferred fences with multiple contexts\n- clear the ctx pointer after a deferred flush\n\nReviewed-by: Nicolai Hähnle 


Configure your notification preferences

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 00/27] i965: Rework the blorp API to use ISL

2016-08-09 Thread Jason Ekstrand
On Tue, Aug 9, 2016 at 4:02 PM, Chad Versace  wrote:

> On 07/26/2016 03:11 PM, Jason Ekstrand wrote:
>
>> This patch series builds on the previous one I just sent and reworks the
>> blorp API to be entirely ISL.  The last bits of intel_mipmap_tree are
>> removed from the ISL internals and shoved into brw_blorp.c/h which simply
>> serves as a wrapper around the ISL-centric brw_blorp.h file.  Eventually,
>> the plan is to completely separate the internals of blorp from the i965
>> driver and share it with the Vulkan driver.  This is just one more step on
>> the very long road to getting there.  This series can be found here:
>>
>> https://cgit.freedesktop.org/~jekstrand/mesa/log/?h=review/blorp-isl-pt2
>>
>> The best place to start reviewing is by looking at patch 25/27 where we
>> make the final API changes.  That shows off where things are going.  That
>> commit can be found on cgit here:
>>
>> https://cgit.freedesktop.org/~jekstrand/mesa/commit/?h=revie
>> w/blorp-isl-pt2=b9a55af924d9cab317224ccb9b507b9f87b44c5d
>>
>> Happy Reviewing!
>>
>
> Ping! What's the status of this series?
>

Pong!  Topi has been reviewing stuff but there's still quite a bit left
unreviewed ATM.  Look at my wip/blorp-isl-pt2 branch to see the latest
status (including applied R-Bs)
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 97214] X not running with error "Failed to make EGL context current"

2016-08-09 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=97214

--- Comment #11 from Alexandr Zelinsky  ---
fix working
if you need logs with last tracing patch https://bpaste.net/show/6b49674b4e26

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] glcpp: Track the actual version instead of just the version_resolved flag

2016-08-09 Thread Timothy Arceri
Both patches are:

Reviewed-by: Timothy Arceri 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/4] nir: Drop an unused program/hash_table.h include.

2016-08-09 Thread Timothy Arceri
On Tue, 2016-08-09 at 16:06 -0700, Ian Romanick wrote:
> With the one comment on patch 3 resolved, this series is
> 
> Reviewed-by: Ian Romanick 

Series also:

Reviewed-by: Timothy Arceri 

> 
> On 08/09/2016 10:17 AM, Eric Anholt wrote:
> > 
> > ---
> >  src/compiler/nir/nir_lower_samplers.c | 1 -
> >  1 file changed, 1 deletion(-)
> > 
> > diff --git a/src/compiler/nir/nir_lower_samplers.c
> > b/src/compiler/nir/nir_lower_samplers.c
> > index 4a4326983a65..e878edd9b54b 100644
> > --- a/src/compiler/nir/nir_lower_samplers.c
> > +++ b/src/compiler/nir/nir_lower_samplers.c
> > @@ -25,7 +25,6 @@
> >  
> >  #include "nir.h"
> >  #include "nir_builder.h"
> > -#include "program/hash_table.h"
> >  #include "compiler/glsl/ir_uniform.h"
> >  
> >  #include "main/compiler.h"
> > 
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/4] nir: Drop an unused program/hash_table.h include.

2016-08-09 Thread Ian Romanick
With the one comment on patch 3 resolved, this series is

Reviewed-by: Ian Romanick 

On 08/09/2016 10:17 AM, Eric Anholt wrote:
> ---
>  src/compiler/nir/nir_lower_samplers.c | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/src/compiler/nir/nir_lower_samplers.c 
> b/src/compiler/nir/nir_lower_samplers.c
> index 4a4326983a65..e878edd9b54b 100644
> --- a/src/compiler/nir/nir_lower_samplers.c
> +++ b/src/compiler/nir/nir_lower_samplers.c
> @@ -25,7 +25,6 @@
>  
>  #include "nir.h"
>  #include "nir_builder.h"
> -#include "program/hash_table.h"
>  #include "compiler/glsl/ir_uniform.h"
>  
>  #include "main/compiler.h"
> 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 3/4] prog_hash_table: Convert to using util/hash_table.h.

2016-08-09 Thread Ian Romanick
On 08/09/2016 10:17 AM, Eric Anholt wrote:
> Improves glretrace -b servo.trace (a trace of Mozilla's servo rendering
> engine booting, rendering a page, and exiting) from 1.8s to 1.1s.  It uses
> a large uniform array of structs, making a huge number of separate program
> resources, and the fixed-size hash table was killing it.  Given how many
> times we've improved performance by swapping the hash table to
> util/hash_table.h, just do it once and for all.
> 
> This just rebases the old hash table API on top of util/, for minimal
> diff.  Cleaning things up is left for later, particularly because I want
> to fix up the new hash table API a little bit.
> ---
>  src/mesa/program/hash_table.h  |  78 +++-
>  src/mesa/program/prog_hash_table.c | 181 
> -
>  2 files changed, 54 insertions(+), 205 deletions(-)
> 
> diff --git a/src/mesa/program/hash_table.h b/src/mesa/program/hash_table.h
> index aba5282fe9e5..362eb2ee0a78 100644
> --- a/src/mesa/program/hash_table.h
> +++ b/src/mesa/program/hash_table.h
> @@ -37,6 +37,7 @@
>  #include 
>  #include 
>  #include 
> +#include "util/hash_table.h"
>  
>  struct string_to_uint_map;
>  
> @@ -44,8 +45,6 @@ struct string_to_uint_map;
>  extern "C" {
>  #endif
>  
> -struct hash_table;
> -
>  typedef unsigned (*hash_func_t)(const void *key);
>  typedef bool (*hash_compare_func_t)(const void *key1, const void *key2);
>  
> @@ -60,26 +59,32 @@ typedef bool (*hash_compare_func_t)(const void *key1, 
> const void *key2);
>   * \param hash Function used to compute hash value of input keys.
>   * \param compare  Function used to compare keys.
>   */
> -extern struct hash_table *hash_table_ctor(unsigned num_buckets,
> -hash_func_t hash, hash_compare_func_t compare);
> -
> +static inline struct hash_table *hash_table_ctor(unsigned num_buckets,
^
Add UNUSED here to avoid piles of unused parameter warnings in my build.

> +hash_func_t hash, hash_compare_func_t compare)
> +{
> +   return _mesa_hash_table_create(NULL, hash, compare);
> +}
>  
>  /**
>   * Release all memory associated with a hash table
>   *
>   * \warning
> - * This function cannot release memory occupied either by keys or data.
> + * This function does not release memory occupied either by keys or data.
>   */
> -extern void hash_table_dtor(struct hash_table *ht);
> -
> +static inline void hash_table_dtor(struct hash_table *ht)
> +{
> +   return _mesa_hash_table_destroy(ht, NULL);
> +}
>  
>  /**
>   * Flush all entries from a hash table
>   *
>   * \param ht  Table to be cleared of its entries.
>   */
> -extern void hash_table_clear(struct hash_table *ht);
> -
> +static inline void hash_table_clear(struct hash_table *ht)
> +{
> +   return _mesa_hash_table_clear(ht, NULL);
> +}
>  
>  /**
>   * Search a hash table for a specific element
> @@ -92,25 +97,28 @@ extern void hash_table_clear(struct hash_table *ht);
>   * the matching key was added.  If no matching key exists in the table,
>   * \c NULL is returned.
>   */
> -extern void *hash_table_find(struct hash_table *ht, const void *key);
> -
> +static inline void *hash_table_find(struct hash_table *ht, const void *key)
> +{
> +   struct hash_entry *entry = _mesa_hash_table_search(ht, key);
> +   if (!entry)
> +  return NULL;
> +   return entry->data;
> +}
>  
>  /**
>   * Add an element to a hash table
>   *
>   * \warning
> - * If \c key is already in the hash table, it will be added again.  Future
> - * calls to \c hash_table_find and \c hash_table_remove will return or 
> remove,
> - * repsectively, the most recently added instance of \c key.
> - *
> - * \warning
>   * The value passed by \c key is kept in the hash table and is used by later
>   * calls to \c hash_table_find.
>   *
>   * \sa hash_table_replace
>   */
> -extern void hash_table_insert(struct hash_table *ht, void *data,
> -const void *key);
> +static inline void hash_table_insert(struct hash_table *ht, void *data,
> + const void *key)
> +{
> +   _mesa_hash_table_insert(ht, key, data);
> +}
>  
>  /**
>   * Add an element to a hash table with replacement
> @@ -126,13 +134,29 @@ extern void hash_table_insert(struct hash_table *ht, 
> void *data,
>   *
>   * \sa hash_table_insert
>   */
> -extern bool hash_table_replace(struct hash_table *ht, void *data,
> -const void *key);
> +static inline bool hash_table_replace(struct hash_table *ht, void *data,
> +  const void *key)
> +{
> +   struct hash_entry *entry = _mesa_hash_table_search(ht, key);
> +   if (entry) {
> +  entry->data = data;
> +  return true;
> +   } else {
> +  _mesa_hash_table_insert(ht, key, data);
> +  return false;
> +   }
> +}
>  
>  /**
>   * Remove a specific element from a hash table.
>   */
> -extern void hash_table_remove(struct hash_table *ht, const void *key);
> +static inline void hash_table_remove(struct 

Re: [Mesa-dev] [PATCH v2 00/27] i965: Rework the blorp API to use ISL

2016-08-09 Thread Chad Versace

On 07/26/2016 03:11 PM, Jason Ekstrand wrote:

This patch series builds on the previous one I just sent and reworks the
blorp API to be entirely ISL.  The last bits of intel_mipmap_tree are
removed from the ISL internals and shoved into brw_blorp.c/h which simply
serves as a wrapper around the ISL-centric brw_blorp.h file.  Eventually,
the plan is to completely separate the internals of blorp from the i965
driver and share it with the Vulkan driver.  This is just one more step on
the very long road to getting there.  This series can be found here:

https://cgit.freedesktop.org/~jekstrand/mesa/log/?h=review/blorp-isl-pt2

The best place to start reviewing is by looking at patch 25/27 where we
make the final API changes.  That shows off where things are going.  That
commit can be found on cgit here:

https://cgit.freedesktop.org/~jekstrand/mesa/commit/?h=review/blorp-isl-pt2=b9a55af924d9cab317224ccb9b507b9f87b44c5d

Happy Reviewing!


Ping! What's the status of this series?

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 97270] [softpipe] piglit ext_framebuffer_multisample-fast-clear GL_ARB_texture_rg single-sample regression

2016-08-09 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=97270

Bug ID: 97270
   Summary: [softpipe] piglit
ext_framebuffer_multisample-fast-clear
GL_ARB_texture_rg single-sample regression
   Product: Mesa
   Version: git
  Hardware: x86-64 (AMD64)
OS: Linux (All)
Status: NEW
  Keywords: bisected, regression
  Severity: normal
  Priority: medium
 Component: Mesa core
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: v...@freedesktop.org
QA Contact: mesa-dev@lists.freedesktop.org
CC: mar...@gmail.com, nhaeh...@gmail.com

mesa: aa920736feeddd1793861651e95bcd09524e024c (12.1.0-devel)


$ ./bin/ext_framebuffer_multisample-fast-clear GL_ARB_texture_rg single-sample
-auto 
Using test set: GL_ARB_texture_rg
Testing GL_R8
Probe color at (0,0)
  Expected: 1.00 0.00 0.00 1.00
  Observed: 0.00 0.00 0.00 1.00
Probe color at (0,0)
  Expected: 1.00 0.00 0.00 1.00
  Observed: 0.00 0.00 0.00 1.00
Probe color at (0,0)
  Expected: 0.25 0.00 0.00 1.00
  Observed: 0.00 0.00 0.00 1.00
Probe color at (0,0)
  Expected: 0.75 0.00 0.00 1.00
  Observed: 0.00 0.00 0.00 1.00
Probe color at (0,0)
  Expected: 0.50 0.00 0.00 1.00
  Observed: 0.00 0.00 0.00 1.00
Probe color at (0,0)
  Expected: 1.00 0.00 0.00 1.00
  Observed: 0.00 0.00 0.00 1.00
Testing GL_R16
Probe color at (0,0)
  Expected: 1.00 0.00 0.00 1.00
  Observed: 0.00 0.00 0.00 1.00
Probe color at (0,0)
  Expected: 1.00 0.00 0.00 1.00
  Observed: 0.00 0.00 0.00 1.00
Probe color at (0,0)
  Expected: 0.25 0.00 0.00 1.00
  Observed: 0.00 0.00 0.00 1.00
Probe color at (0,0)
  Expected: 0.75 0.00 0.00 1.00
  Observed: 0.00 0.00 0.00 1.00
Probe color at (0,0)
  Expected: 0.50 0.00 0.00 1.00
  Observed: 0.00 0.00 0.00 1.00
Probe color at (0,0)
  Expected: 1.00 0.00 0.00 1.00
  Observed: 0.00 0.00 0.00 1.00
Testing GL_RG
Probe color at (0,0)
  Expected: 1.00 1.00 0.00 1.00
  Observed: 0.00 0.00 0.00 1.00
Probe color at (0,0)
  Expected: 1.00 0.00 0.00 1.00
  Observed: 0.00 0.00 0.00 1.00
Probe color at (0,0)
  Expected: 0.25 0.50 0.00 1.00
  Observed: 0.00 0.00 0.00 1.00
Probe color at (0,0)
  Expected: 0.75 0.50 0.00 1.00
  Observed: 0.00 0.00 0.00 1.00
Probe color at (0,0)
  Expected: 0.50 0.25 0.00 1.00
  Observed: 0.00 0.00 0.00 1.00
Probe color at (0,0)
  Expected: 1.00 1.00 0.00 1.00
  Observed: 0.00 0.00 0.00 1.00
Testing GL_RG8
Probe color at (0,0)
  Expected: 1.00 1.00 0.00 1.00
  Observed: 0.00 0.00 0.00 1.00
Probe color at (0,0)
  Expected: 1.00 0.00 0.00 1.00
  Observed: 0.00 0.00 0.00 1.00
Probe color at (0,0)
  Expected: 0.25 0.50 0.00 1.00
  Observed: 0.00 0.00 0.00 1.00
Probe color at (0,0)
  Expected: 0.75 0.50 0.00 1.00
  Observed: 0.00 0.00 0.00 1.00
Probe color at (0,0)
  Expected: 0.50 0.25 0.00 1.00
  Observed: 0.00 0.00 0.00 1.00
Probe color at (0,0)
  Expected: 1.00 1.00 0.00 1.00
  Observed: 0.00 0.00 0.00 1.00
Testing GL_RG16
Probe color at (0,0)
  Expected: 1.00 1.00 0.00 1.00
  Observed: 0.00 0.00 0.00 1.00
Probe color at (0,0)
  Expected: 1.00 0.00 0.00 1.00
  Observed: 0.00 0.00 0.00 1.00
Probe color at (0,0)
  Expected: 0.25 0.50 0.00 1.00
  Observed: 0.00 0.00 0.00 1.00
Probe color at (0,0)
  Expected: 0.75 0.50 0.00 1.00
  Observed: 0.00 0.00 0.00 1.00
Probe color at (0,0)
  Expected: 0.50 0.25 0.00 1.00
  Observed: 0.00 0.00 0.00 1.00
Probe color at (0,0)
  Expected: 1.00 1.00 0.00 1.00
  Observed: 0.00 0.00 0.00 1.00
PIGLIT: {"result": "fail" }


79dcd69afae4ada47fd4e746e9eec87c6d8028f0 is the first bad commit
commit 79dcd69afae4ada47fd4e746e9eec87c6d8028f0
Author: Marek Olšák 
Date:   Sun Jul 17 20:37:58 2016 +0200

st/mesa: remove excessive shader state dirtying

This just needs to be done by st_validate_state.

v2: add "shaders_may_be_dirty" flags for not skipping st_validate_state
on _NEW_PROGRAM to detect real shader changes

Reviewed-by: Nicolai Hähnle 


Re: [Mesa-dev] [PATCH v2] egl: android: query native window default width and height

2016-08-09 Thread Chad Versace
On 08/09/2016 01:49 PM, Haixia Shi wrote:
> Pinging this thread - any objection to commit this? Thanks.

I pushed it.

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 97214] X not running with error "Failed to make EGL context current"

2016-08-09 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=97214

Chad Versace  changed:

   What|Removed |Added

 CC||c...@kiwitree.net

-- 
You are receiving this mail because:
You are the assignee for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/2] glcpp: Only disallow #undef of pre-defined macros on GLSL ES >= 3.00 shaders

2016-08-09 Thread Ian Romanick
From: Ian Romanick 

Section 3.4 (Preprocessor) of the GLSL ES 3.00 spec says:

   It is an error to undefine or to redefine a built-in (pre-defined)
   macro name.

The GLSL ES 1.00 spec does not contain this text.

Section 3.3 (Preprocessor) of the GLSL 1.30 spec says:

   #define and #undef functionality are defined as is standard for C++
   preprocessors for macro definitions both with and without macro
   parameters.

At least as far as I can tell GCC allow '#undef __FILE__'.  Furthermore,
there are desktop OpenGL conformance tests that expect '#undef
__VERSION__' and '#undef GL_core_profile' to work.

Fixes:

GL45-CTS.shaders.preprocessor.definitions.undefine_version_vertex
GL45-CTS.shaders.preprocessor.definitions.undefine_version_fragment
GL45-CTS.shaders.preprocessor.definitions.undefine_core_profile_vertex
GL45-CTS.shaders.preprocessor.definitions.undefine_core_profile_fragment

Signed-off-by: Ian Romanick 
Cc: mesa-sta...@lists.freedesktop.org
---
 src/compiler/glsl/glcpp/glcpp-parse.y | 32 
 1 file changed, 28 insertions(+), 4 deletions(-)

diff --git a/src/compiler/glsl/glcpp/glcpp-parse.y 
b/src/compiler/glsl/glcpp/glcpp-parse.y
index 05a76c7..eff53be 100644
--- a/src/compiler/glsl/glcpp/glcpp-parse.y
+++ b/src/compiler/glsl/glcpp/glcpp-parse.y
@@ -278,10 +278,34 @@ control_line_success:
HASH_TOKEN DEFINE_TOKEN define
 |  HASH_TOKEN UNDEF IDENTIFIER NEWLINE {
macro_t *macro;
-   if (strcmp("__LINE__", $3) == 0
-   || strcmp("__FILE__", $3) == 0
-   || strcmp("__VERSION__", $3) == 0
-   || strncmp("GL_", $3, 3) == 0)
+
+/* Section 3.4 (Preprocessor) of the GLSL ES 3.00 spec says:
+ *
+ *It is an error to undefine or to redefine a built-in
+ *(pre-defined) macro name.
+ *
+ * The GLSL ES 1.00 spec does not contain this text.
+ *
+ * Section 3.3 (Preprocessor) of the GLSL 1.30 spec says:
+ *
+ *#define and #undef functionality are defined as is
+ *standard for C++ preprocessors for macro definitions
+ *both with and without macro parameters.
+ *
+ * At least as far as I can tell GCC allow '#undef __FILE__'.
+ * Furthermore, there are desktop OpenGL conformance tests
+ * that expect '#undef __VERSION__' and '#undef
+ * GL_core_profile' to work.
+ *
+ * Only disallow #undef of pre-defined macros on GLSL ES >=
+ * 3.00 shaders.
+ */
+   if (parser->is_gles &&
+parser->version >= 300 &&
+(strcmp("__LINE__", $3) == 0
+ || strcmp("__FILE__", $3) == 0
+ || strcmp("__VERSION__", $3) == 0
+ || strncmp("GL_", $3, 3) == 0))
glcpp_error(& @1, parser, "Built-in (pre-defined)"
" macro names cannot be undefined.");
 
-- 
2.5.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/2] glcpp: Track the actual version instead of just the version_resolved flag

2016-08-09 Thread Ian Romanick
From: Ian Romanick 

Signed-off-by: Ian Romanick 
Cc: mesa-sta...@lists.freedesktop.org
---
 src/compiler/glsl/glcpp/glcpp-parse.y | 10 +-
 src/compiler/glsl/glcpp/glcpp.h   |  2 +-
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/src/compiler/glsl/glcpp/glcpp-parse.y 
b/src/compiler/glsl/glcpp/glcpp-parse.y
index ca376d9..05a76c7 100644
--- a/src/compiler/glsl/glcpp/glcpp-parse.y
+++ b/src/compiler/glsl/glcpp/glcpp-parse.y
@@ -396,13 +396,13 @@ control_line_success:
_glcpp_parser_skip_stack_pop (parser, & @1);
} NEWLINE
 |  HASH_TOKEN VERSION_TOKEN integer_constant NEWLINE {
-   if (parser->version_resolved) {
+   if (parser->version != 0) {
glcpp_error(& @1, parser, "#version must appear on the 
first line");
}
_glcpp_parser_handle_version_declaration(parser, $3, NULL, 
true);
}
 |  HASH_TOKEN VERSION_TOKEN integer_constant IDENTIFIER NEWLINE {
-   if (parser->version_resolved) {
+   if (parser->version != 0) {
glcpp_error(& @1, parser, "#version must appear on the 
first line");
}
_glcpp_parser_handle_version_declaration(parser, $3, $4, true);
@@ -1346,7 +1346,7 @@ glcpp_parser_create(glcpp_extension_iterator extensions, 
void *state, gl_api api
parser->extensions = extensions;
parser->state = state;
parser->api = api;
-   parser->version_resolved = false;
+   parser->version = 0;
 
parser->has_new_line_number = 0;
parser->new_line_number = 1;
@@ -2280,10 +2280,10 @@ _glcpp_parser_handle_version_declaration(glcpp_parser_t 
*parser, intmax_t versio
  const char *es_identifier,
  bool explicitly_set)
 {
-   if (parser->version_resolved)
+   if (parser->version != 0)
   return;
 
-   parser->version_resolved = true;
+   parser->version = version;
 
add_builtin_define (parser, "__VERSION__", version);
 
diff --git a/src/compiler/glsl/glcpp/glcpp.h b/src/compiler/glsl/glcpp/glcpp.h
index 07eaf68..9f35b05 100644
--- a/src/compiler/glsl/glcpp/glcpp.h
+++ b/src/compiler/glsl/glcpp/glcpp.h
@@ -206,7 +206,7 @@ struct glcpp_parser {
glcpp_extension_iterator extensions;
void *state;
gl_api api;
-   bool version_resolved;
+   unsigned version;
bool has_new_line_number;
int new_line_number;
bool has_new_source_number;
-- 
2.5.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Moving amdgpu/addrlib into a git submodule

2016-08-09 Thread Jason Ekstrand
On Tue, Aug 9, 2016 at 1:12 PM, Dave Airlie  wrote:

> >
> > tbh, git submodules are more annoying than they need to be, and I'm
> > not really terribly excited to use that for something that is a build
> > dependency.  Maybe just move it into libdrm instead?
> >
>
> I've only had to use git submodules once with spice project, and it
> was a nightmare. It makes packaging etc a real pita.
>
> Alternatives are something like a fetch external sources script,
> that does git submodules but does it better, you'll see Vulkan-CTS
> etc use something like that, it would have to be integrated with
> the build system a bit better though.
>

This isn't a plug for submodules, but *please* don't base anything on the
Vulkan CTS fetch_sources script.  We've had no end of trouble with their
git hacks.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH V2 2/3] anv/pipeline: Add sample locations for gen7-7.5

2016-08-09 Thread Jason Ekstrand
LGTM

On Tue, Aug 9, 2016 at 2:41 PM, Anuj Phogat  wrote:

> V1: Add multisample positions (Nanley)
> V2: Fix 8x sample positions to match OpenGL (Anuj)
> V3: Vulkan has standard sample locations. They need not be same as
> in OpenGL. (Anuj)
>
> Signed-off-by: Anuj Phogat 
> Reviewed-by: Jason Ekstrand 
> ---
>  src/intel/vulkan/genX_pipeline_util.h | 47 ++
> +
>  1 file changed, 47 insertions(+)
>
> diff --git a/src/intel/vulkan/genX_pipeline_util.h
> b/src/intel/vulkan/genX_pipeline_util.h
> index d9d8ca4..64b89cd 100644
> --- a/src/intel/vulkan/genX_pipeline_util.h
> +++ b/src/intel/vulkan/genX_pipeline_util.h
> @@ -475,6 +475,7 @@ emit_ms_state(struct anv_pipeline *pipeline,
> anv_batch_emit(>batch, GENX(3DSTATE_MULTISAMPLE), ms) {
>ms.NumberofMultisamples   = log2_samples;
>
> +#if GEN_GEN >= 8
>/* The PRM says that this bit is valid only for DX9:
> *
> *SW can choose to set this bit only for DX9 API. DX10/OGL API's
> @@ -482,6 +483,52 @@ emit_ms_state(struct anv_pipeline *pipeline,
> */
>ms.PixelPositionOffsetEnable  = false;
>ms.PixelLocation  = CENTER;
> +#else
> +  ms.PixelLocation  = PIXLOC_CENTER;
> +
> +  switch (samples) {
> +  case 1:
> + ms.Sample0XOffset  = 0.5;
> + ms.Sample0YOffset  = 0.5;
> + break;
> +  case 2:
> + ms.Sample0XOffset  = 0.25;
> + ms.Sample0YOffset  = 0.25;
> + ms.Sample1XOffset  = 0.75;
> + ms.Sample1YOffset  = 0.75;
> + break;
> +  case 4:
> + ms.Sample0XOffset  = 0.375;
> + ms.Sample0YOffset  = 0.125;
> + ms.Sample1XOffset  = 0.875;
> + ms.Sample1YOffset  = 0.375;
> + ms.Sample2XOffset  = 0.125;
> + ms.Sample2YOffset  = 0.625;
> + ms.Sample3XOffset  = 0.625;
> + ms.Sample3YOffset  = 0.875;
> + break;
> +  case 8:
> + ms.Sample0XOffset  = 0.5625;
> + ms.Sample0YOffset  = 0.3125;
> + ms.Sample1XOffset  = 0.4375;
> + ms.Sample1YOffset  = 0.6875;
> + ms.Sample2XOffset  = 0.8125;
> + ms.Sample2YOffset  = 0.5625;
> + ms.Sample3XOffset  = 0.3125;
> + ms.Sample3YOffset  = 0.1875;
> + ms.Sample4XOffset  = 0.1875;
> + ms.Sample4YOffset  = 0.8125;
> + ms.Sample5XOffset  = 0.0625;
> + ms.Sample5YOffset  = 0.4375;
> + ms.Sample6XOffset  = 0.6875;
> + ms.Sample6YOffset  = 0.9375;
> + ms.Sample7XOffset  = 0.9375;
> + ms.Sample7YOffset  = 0.0625;
> + break;
> +  default:
> + break;
> +  }
> +#endif
> }
>
> anv_batch_emit(>batch, GENX(3DSTATE_SAMPLE_MASK), sm) {
> --
> 2.5.5
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH V2 2/3] anv/pipeline: Add sample locations for gen7-7.5

2016-08-09 Thread Anuj Phogat
V1: Add multisample positions (Nanley)
V2: Fix 8x sample positions to match OpenGL (Anuj)
V3: Vulkan has standard sample locations. They need not be same as
in OpenGL. (Anuj)

Signed-off-by: Anuj Phogat 
Reviewed-by: Jason Ekstrand 
---
 src/intel/vulkan/genX_pipeline_util.h | 47 +++
 1 file changed, 47 insertions(+)

diff --git a/src/intel/vulkan/genX_pipeline_util.h 
b/src/intel/vulkan/genX_pipeline_util.h
index d9d8ca4..64b89cd 100644
--- a/src/intel/vulkan/genX_pipeline_util.h
+++ b/src/intel/vulkan/genX_pipeline_util.h
@@ -475,6 +475,7 @@ emit_ms_state(struct anv_pipeline *pipeline,
anv_batch_emit(>batch, GENX(3DSTATE_MULTISAMPLE), ms) {
   ms.NumberofMultisamples   = log2_samples;
 
+#if GEN_GEN >= 8
   /* The PRM says that this bit is valid only for DX9:
*
*SW can choose to set this bit only for DX9 API. DX10/OGL API's
@@ -482,6 +483,52 @@ emit_ms_state(struct anv_pipeline *pipeline,
*/
   ms.PixelPositionOffsetEnable  = false;
   ms.PixelLocation  = CENTER;
+#else
+  ms.PixelLocation  = PIXLOC_CENTER;
+
+  switch (samples) {
+  case 1:
+ ms.Sample0XOffset  = 0.5;
+ ms.Sample0YOffset  = 0.5;
+ break;
+  case 2:
+ ms.Sample0XOffset  = 0.25;
+ ms.Sample0YOffset  = 0.25;
+ ms.Sample1XOffset  = 0.75;
+ ms.Sample1YOffset  = 0.75;
+ break;
+  case 4:
+ ms.Sample0XOffset  = 0.375;
+ ms.Sample0YOffset  = 0.125;
+ ms.Sample1XOffset  = 0.875;
+ ms.Sample1YOffset  = 0.375;
+ ms.Sample2XOffset  = 0.125;
+ ms.Sample2YOffset  = 0.625;
+ ms.Sample3XOffset  = 0.625;
+ ms.Sample3YOffset  = 0.875;
+ break;
+  case 8:
+ ms.Sample0XOffset  = 0.5625;
+ ms.Sample0YOffset  = 0.3125;
+ ms.Sample1XOffset  = 0.4375;
+ ms.Sample1YOffset  = 0.6875;
+ ms.Sample2XOffset  = 0.8125;
+ ms.Sample2YOffset  = 0.5625;
+ ms.Sample3XOffset  = 0.3125;
+ ms.Sample3YOffset  = 0.1875;
+ ms.Sample4XOffset  = 0.1875;
+ ms.Sample4YOffset  = 0.8125;
+ ms.Sample5XOffset  = 0.0625;
+ ms.Sample5YOffset  = 0.4375;
+ ms.Sample6XOffset  = 0.6875;
+ ms.Sample6YOffset  = 0.9375;
+ ms.Sample7XOffset  = 0.9375;
+ ms.Sample7YOffset  = 0.0625;
+ break;
+  default:
+ break;
+  }
+#endif
}
 
anv_batch_emit(>batch, GENX(3DSTATE_SAMPLE_MASK), sm) {
-- 
2.5.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/6] gallium/radeon: change the vendor string to Advanced Micro Devices, Inc.

2016-08-09 Thread Marek Olšák
On Sun, Aug 7, 2016 at 12:01 AM, Axel Davy  wrote:
> Hi,
>
> This looks like it will perturb driver detection of already written games.
>
> For example from the dolphin sources, this change would make them detect
> catalyst as driver.
>
> As it is known workarounds are applied depending on the driver detected, or
> some features are disabled,
> I believe it isn't a good change.

OK. I'll keep it as-is for now.

Marek
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2] egl: android: query native window default width and height

2016-08-09 Thread Haixia Shi
Pinging this thread - any objection to commit this? Thanks.

On Thu, Jul 28, 2016 at 8:58 PM, Tomasz Figa  wrote:

> On Fri, Jul 29, 2016 at 2:51 AM, Haixia Shi  wrote:
> > On android platform, the width and height of a native window surface may
> > be updated after initialization. It is therefore necessary to query
> android
> > framework for the current width and height.
> >
> > v2: remove Android specific #ifdef's and just implement the fallback
> directly
> > if the platform query_surface() callback is not provided.
> >
> > TEST=dEQP-EGL.functional.resize.surface_size#* on cyan-cheets
> >
> > Change-Id: I673f7d2f1d90c3bf572b30f63da537f2cae1496e
> > ---
> >  src/egl/drivers/dri2/egl_dri2.c | 11 +++
> >  src/egl/drivers/dri2/egl_dri2.h |  4 
> >  src/egl/drivers/dri2/platform_android.c | 27
> +++
> >  3 files changed, 42 insertions(+)
>
> Reviewed-by: Tomasz Figa 
>
> Best regards,
> Tomasz
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] util: Use win32 intrinsics for util_last_bit if present.

2016-08-09 Thread Brian Paul

On 08/09/2016 01:41 PM, mathias.froehl...@gmx.net wrote:

From: Mathias Fröhlich 

v2: Split into two patches.
v3: Fix off by one problem.

Signed-off-by: Mathias Fröhlich 
---
  src/util/bitscan.h | 12 
  1 file changed, 12 insertions(+)

diff --git a/src/util/bitscan.h b/src/util/bitscan.h
index 0743fe7..8afef81 100644
--- a/src/util/bitscan.h
+++ b/src/util/bitscan.h
@@ -157,6 +157,12 @@ util_last_bit(unsigned u)
  {
  #if defined(HAVE___BUILTIN_CLZ)
 return u == 0 ? 0 : 32 - __builtin_clz(u);
+#elif defined(_MSC_VER) && (_M_IX86 || _M_ARM || _M_AMD64 || _M_IA64)
+   unsigned long index;
+   if (_BitScanReverse(, u))
+  return index + 1;
+   else
+  return 0;
  #else
 unsigned r = 0;
 while (u) {
@@ -177,6 +183,12 @@ util_last_bit64(uint64_t u)
  {
  #if defined(HAVE___BUILTIN_CLZLL)
 return u == 0 ? 0 : 64 - __builtin_clzll(u);
+#elif defined(_MSC_VER) && (_M_AMD64 || _M_ARM || _M_IA64)
+   unsigned long index;
+   if (_BitScanReverse64(, u))
+  return index + 1;
+   else
+  return 0;
  #else
 unsigned r = 0;
 while (u) {



Reviewed-by: Brian Paul 
Tested-by: Brian Paul 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Moving amdgpu/addrlib into a git submodule

2016-08-09 Thread Dave Airlie
>
> tbh, git submodules are more annoying than they need to be, and I'm
> not really terribly excited to use that for something that is a build
> dependency.  Maybe just move it into libdrm instead?
>

I've only had to use git submodules once with spice project, and it
was a nightmare. It makes packaging etc a real pita.

Alternatives are something like a fetch external sources script,
that does git submodules but does it better, you'll see Vulkan-CTS
etc use something like that, it would have to be integrated with
the build system a bit better though.

Dave.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 97214] X not running with error "Failed to make EGL context current"

2016-08-09 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=97214

--- Comment #10 from Nicolas Boichat  ---
Created attachment 125652
  --> https://bugs.freedesktop.org/attachment.cgi?id=125652=edit
Possible fix

Possible fix attached, please give it a try. Thanks!

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 3/3] gallium: Add c99_compat.h to u_bitcast.h

2016-08-09 Thread Mathias Fröhlich
Hi Brian,

On Tuesday, 9 August 2016 08:23:41 CEST Brian Paul wrote:
> >> As it fixes something independent, should I push that already?
> >
> > Sure.  For 1 & 3,
> > Reviewed-by: Brian Paul 
> > Tested-by: Brian Paul 
Pushed.

> Yes, we need to add one to the index.  I'll re-test that patch when you 
> update it.
Is sent.

Thanks!

Mathias
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] util: Use win32 intrinsics for util_last_bit if present.

2016-08-09 Thread Mathias . Froehlich
From: Mathias Fröhlich 

v2: Split into two patches.
v3: Fix off by one problem.

Signed-off-by: Mathias Fröhlich 
---
 src/util/bitscan.h | 12 
 1 file changed, 12 insertions(+)

diff --git a/src/util/bitscan.h b/src/util/bitscan.h
index 0743fe7..8afef81 100644
--- a/src/util/bitscan.h
+++ b/src/util/bitscan.h
@@ -157,6 +157,12 @@ util_last_bit(unsigned u)
 {
 #if defined(HAVE___BUILTIN_CLZ)
return u == 0 ? 0 : 32 - __builtin_clz(u);
+#elif defined(_MSC_VER) && (_M_IX86 || _M_ARM || _M_AMD64 || _M_IA64)
+   unsigned long index;
+   if (_BitScanReverse(, u))
+  return index + 1;
+   else
+  return 0;
 #else
unsigned r = 0;
while (u) {
@@ -177,6 +183,12 @@ util_last_bit64(uint64_t u)
 {
 #if defined(HAVE___BUILTIN_CLZLL)
return u == 0 ? 0 : 64 - __builtin_clzll(u);
+#elif defined(_MSC_VER) && (_M_AMD64 || _M_ARM || _M_IA64)
+   unsigned long index;
+   if (_BitScanReverse64(, u))
+  return index + 1;
+   else
+  return 0;
 #else
unsigned r = 0;
while (u) {
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 97231] GL_DEPTH_CLAMP doesn't clamp to the far plane

2016-08-09 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=97231

Jules Blok  changed:

   What|Removed |Added

 Attachment #125589|0   |1
is obsolete||
 Attachment #125624|0   |1
is obsolete||

--- Comment #13 from Jules Blok  ---
Created attachment 125650
  --> https://bugs.freedesktop.org/attachment.cgi?id=125650=edit
api trace file version 4

I've added an apitrace that was captured on Linux, perhaps you will have less
problems running that trace.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] vl: add a lanczos interpolation filter v3

2016-08-09 Thread Christian König

Am 09.08.2016 um 19:21 schrieb Nayan Deshmukh:

Hi Christian,

A few questions.



On Tue, Aug 9, 2016 at 5:10 PM, Christian König 
> wrote:



I am more than happy to solve these problems, the
Lanczos filtering was getting a little stale
anyway because I was not able to reproduce the problems Andy was
facing.

Yeah thought so, the reason is probably that you don't have the
necessary hardware.


Is that why I need to add a PIPE_BIND_LINEAR to a surface?

Yes exactly.


So I need to use maybe a couple of surfaces alternatively to read
and write with the filters. This approach should work I guess.

Allocate a temporary surface for each step, apply the necessary
filters using it and then use the temporary buffer as input for
the next step.

See how the deinterlacing filter does this, you should use the
same approach here.

I would use this order for doing things:
1. Median filter for noise reduction.
2. Sharpening/blur filter.
3. Deinterlacing.
4. Compositioning and CC conversion.
5. Advanced scaling.

I need to provide the median filter and the blur filter with a sampler 
view and the deint filter requires a pipe_video_buffer. I am not sure 
how to acheive this. Any suggestions?


video buffers are basically just a collection of sampler views and 
render targets (up to six). You just need to apply each filter to each 
plane separately.




Also right now deinterlacing is the first step and the other steps 
follow. But if we perform median and sharpening filter before then we 
also need to apply them on the past and the future surfaces that we 
require for deinterlacing. Am I right?


Oh, good point. Might be that we need to change the order to 1) 
Deinterlacing, 2) Median 3) Sharpening.


Otherwise the calculation overhead/memory bandwidth probably start to 
hit some limits on low end hardware.


Regards,
Christian.



Regards,
Nayan.

Regards,
Christian.


Am 08.08.2016 um 16:32 schrieb Nayan Deshmukh:

Hi Christian,

I am more than happy to solve these problems, the
Lanczos filtering was getting a little stale
anyway because I was not able to reproduce the problems Andy was
facing.

On Mon, Aug 8, 2016 at 6:24 PM, Christian König
> wrote:

Hi Nayan,

ok let's take a step back and put the lanczos filtering aside
for a moment. I have another task for you which is more
urgent right now.

The order we do things in vlVdpVideoMixerRender() was never
100% correct, so we have at least three problems here which
needs fixing:

1) The noise reduction and sharpness filter read and write to
the same surface at the same time. That only works because we
use a linear layout.

Is that why I need to add a PIPE_BIND_LINEAR to a surface? So I
need to use maybe a couple of surfaces alternatively to read and
write with the filters. This approach should work I guess.

2) We apply the noise reduction and sharpness filter after
the composition. That is rather odd and should be fixed so
that we apply those filters on the original video frame instead.

 So we need to apply the filter before the CSC conversion.

3) We use delayed rendering to render into output surfaces
directly. We should rather use the DRI3 capabilities to
allocate multiple output surfaces instead.

Could you take care of some of those issues? Especially #1
has become a problem recently.

Surely, I will start working on the first 2 problem for now and
look at the third problem a little later.

Regards,
Nayan.

Regards,
Christian.


Am 04.08.2016 um 19:22 schrieb Nayan Deshmukh:

Hi Andy,


On Thu, Aug 4, 2016 at 8:48 PM, Andy Furniss
> wrote:

Nayan Deshmukh wrote:

Hi Andy,

Thanks for testing my patches.


NP


The scaling happens after CSC.


OK, thanks.


I believe there is some misunderstanding here, I was
able to see the
artifacts in the video that you sent me previously.
But I was not
able to replicate them


Yea, I got that - just thought you may want to see how
they had changed.

with the pendulum video on my system. Same case this
time the
pendulum video plays fine this time too. I am
attacing a video of it
here


https://drive.google.com/file/d/0B1s62k5GtdBwOVAtOUVaU0V5c1E/view?usp=sharing


Re: [Mesa-dev] Moving amdgpu/addrlib into a git submodule

2016-08-09 Thread Erik Faye-Lund
On Tue, Aug 9, 2016 at 7:24 PM, Nicolai Hähnle  wrote:
>
> On 09.08.2016 18:22, Erik Faye-Lund wrote:
>>
>> On Tue, Aug 9, 2016 at 4:59 PM, Nicolai Hähnle  wrote:
>>>
>>> On 09.08.2016 15:58, Rob Clark wrote:


 tbh, git submodules are more annoying than they need to be, and I'm
 not really terribly excited to use that for something that is a build
 dependency.  Maybe just move it into libdrm instead?
>>>
>>>
>>>
>>> I know. That's what I would have proposed if the addrlib interface were
>>> stable. Unfortunately it isn't, and realistically speaking, that's not
>>> going
>>> to change.
>>>
>>> So shared linking is right out.
>>>
>>> Static linking or just including source files from a separate repository
>>> could be considered, but then what's the process for ensuring you have
>>> the
>>> right version?
>>>
>>> The nice aspect of submodules is that every commit of the Mesa repository
>>> "knows" what the corresponding right version of addrlib is, and so git
>>> can
>>> update the submodule to the correct version for you automatically.
>>
>>
>> I'm not a huge fan of submodules either. They just don't deal with
>> distributed development properly, which should be a non-starter for
>> OSS IMO. You either set the submodule to point to an absolute URL, in
>> which case it's awkward to work with if you need to change the code,
>> or you use a relative URL, which forces everyone who has a fork to
>> fork the submodule also. Yuck. As a formerly active Git developer, my
>> impression is that nobody of the core-git developers really loved the
>> idea of git-submodule, it was mostly introduced into Git to help KDE
>> transition their gigantic SVN-external based source tree to Git.
>>
>> IMO, a much better alternative would be to have addrlib live in its
>> own repository, and periodically do a git subtree-merge into mesa and
>> other dependent system. That means that nobody really needs to deal
>> with the fact that the upstream is in a different repo, except when
>> submitting patches for upstream. This is what git.git itself does for
>> some of its subsystems.
>
>
> That looks interesting. Are people using subtree-merge with the kind of
> linear history that Mesa uses?
>

I work mostly with branch-heavy work-flows these days, so I don't
really know. The subtree-merges themselves will obviously appear as
merges, but you can always keep the development in the upstream linear
if you want...
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 3/3] anv/gen7_pipeline: Set multisample state using shared function

2016-08-09 Thread Jason Ekstrand
On Tue, Aug 9, 2016 at 11:04 AM, Anuj Phogat  wrote:

> On Mon, Aug 8, 2016 at 5:09 PM, Jason Ekstrand 
> wrote:
> > Does this fix any tests?  If so, we should say so in the commit message.
> > With that updated,
> >
> No it doesn't. It surprised me too but it looks like cts don't have enough
> multisampling tests.
>

That's an understatement  In that case, it looks like this is a strict
improvement, so go ahead with my R-B.


>
> > Reviewed-by: Jason Ekstrand 
> >
> > On Mon, Aug 8, 2016 at 2:57 PM, Anuj Phogat 
> wrote:
> >>
> >> Signed-off-by: Anuj Phogat 
> >> ---
> >>  src/intel/vulkan/gen7_pipeline.c | 16 +---
> >>  1 file changed, 1 insertion(+), 15 deletions(-)
> >>
> >> diff --git a/src/intel/vulkan/gen7_pipeline.c
> >> b/src/intel/vulkan/gen7_pipeline.c
> >> index 5395e79..17d7ccc 100644
> >> --- a/src/intel/vulkan/gen7_pipeline.c
> >> +++ b/src/intel/vulkan/gen7_pipeline.c
> >> @@ -81,21 +81,7 @@ genX(graphics_pipeline_create)(
> >>   pCreateInfo->pRasterizationState, extra);
> >> emit_3dstate_streamout(pipeline, pCreateInfo->pRasterizationState);
> >>
> >> -   if (pCreateInfo->pMultisampleState &&
> >> -   pCreateInfo->pMultisampleState->rasterizationSamples > 1)
> >> -
> >> anv_finishme("VK_STRUCTURE_TYPE_PIPELINE_MULTISAMPLE_
> STATE_CREATE_INFO");
> >> -
> >> -   uint32_t samples = 1;
> >> -   uint32_t log2_samples = __builtin_ffs(samples) - 1;
> >> -
> >> -   anv_batch_emit(>batch, GENX(3DSTATE_MULTISAMPLE), ms) {
> >> -  ms.PixelLocation= PIXLOC_CENTER;
> >> -  ms.NumberofMultisamples = log2_samples;
> >> -   }
> >> -
> >> -   anv_batch_emit(>batch, GENX(3DSTATE_SAMPLE_MASK), sm) {
> >> -  sm.SampleMask = 0xff;
> >> -   }
> >> +   emit_ms_state(pipeline, pCreateInfo->pMultisampleState);
> >>
> >> const struct brw_vs_prog_data *vs_prog_data =
> >> get_vs_prog_data(pipeline);
> >>
> >> --
> >> 2.5.5
> >>
> >> ___
> >> mesa-dev mailing list
> >> mesa-dev@lists.freedesktop.org
> >> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> >
> >
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/2] anv/clear: Clear E5B9G9R9 images as R32_UINT

2016-08-09 Thread Nanley Chery
On Wed, Aug 03, 2016 at 01:06:10PM -0700, Jason Ekstrand wrote:
> We can't actually clear these images normally because we can't render to
> them.  Instead, we have to manually unpack the rgb9e5 color value on the
> CPU and clear it as R32_UINT.  We still have a bit of work to do to clear
> non-power-of-two images, but this should get all of the power-of-two clears
> working on at least Haswell.
> 
> Cc: "12.0" 
> ---
>  src/intel/vulkan/anv_meta_clear.c | 16 ++--
>  1 file changed, 14 insertions(+), 2 deletions(-)
> 
> diff --git a/src/intel/vulkan/anv_meta_clear.c 
> b/src/intel/vulkan/anv_meta_clear.c
> index fa07ec1..d8b5ce0 100644
> --- a/src/intel/vulkan/anv_meta_clear.c
> +++ b/src/intel/vulkan/anv_meta_clear.c
> @@ -25,6 +25,8 @@
>  #include "anv_private.h"
>  #include "nir/nir_builder.h"
>  
> +#include "gallium/auxiliary/util/u_format_rgb9e5.h"

I encountered a build failure on this patch. Please change the include to:

#include "util/format_rgb9e5.h"

to fix it.

With the above fixed, the spelling correction, and a mention of the
passing tests in the commit message, this series is,

Reviewed-by: Nanley Chery 

> +
>  /** Vertex attributes for color clears.  */
>  struct color_clear_vattrs {
> struct anv_vue_header vue_header;
> @@ -760,6 +762,16 @@ anv_cmd_clear_image(struct anv_cmd_buffer *cmd_buffer,
>  {
> VkDevice device_h = anv_device_to_handle(cmd_buffer->device);
>  
> +   VkFormat vk_format = image->vk_format;
> +   if (vk_format == VK_FORMAT_E5B9G9R9_UFLOAT_PACK32) {
> +  /* We can't actually render to this format so we have to work around it
> +   * by manualy unpacking and using R32_UINT.
> +   */
> +  clear_value.color.uint32[0] =
> + float3_to_rgb9e5(clear_value.color.float32);
> +  vk_format = VK_FORMAT_R32_UINT;
> +   }
> +
> for (uint32_t r = 0; r < range_count; r++) {
>const VkImageSubresourceRange *range = [r];
>for (uint32_t l = 0; l < anv_get_levelCount(image, range); ++l) {
> @@ -773,7 +785,7 @@ anv_cmd_clear_image(struct anv_cmd_buffer *cmd_buffer,
>.sType = VK_STRUCTURE_TYPE_IMAGE_VIEW_CREATE_INFO,
>.image = anv_image_to_handle(image),
>.viewType = anv_meta_get_view_type(image),
> -  .format = image->vk_format,
> +  .format = vk_format,
>.subresourceRange = {
>   .aspectMask = range->aspectMask,
>   .baseMipLevel = range->baseMipLevel + l,
> @@ -800,7 +812,7 @@ anv_cmd_clear_image(struct anv_cmd_buffer *cmd_buffer,
> );
>  
>  VkAttachmentDescription att_desc = {
> -   .format = iview.vk_format,
> +   .format = vk_format,
> .loadOp = VK_ATTACHMENT_LOAD_OP_LOAD,
> .storeOp = VK_ATTACHMENT_STORE_OP_STORE,
> .stencilLoadOp = VK_ATTACHMENT_LOAD_OP_LOAD,
> -- 
> 2.5.0.400.gff86faf
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 97214] X not running with error "Failed to make EGL context current"

2016-08-09 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=97214

Nicolas Boichat  changed:

   What|Removed |Added

 Attachment #125643|0   |1
is obsolete||

--- Comment #9 from Nicolas Boichat  ---
Created attachment 125647
  --> https://bugs.freedesktop.org/attachment.cgi?id=125647=edit
More tracing in egl_dri2.c

I see. When the second display is initialized, there is still an active
context, it seems (dri2_display_release should not be called on the first call
to dri2_make_current):

libEGL debug: Native platform type: drm (autodetected)
libEGL debug: dri2_initialize 0x90d7a0 0x9a2f10 (dri2_dpy=(null))
libEGL debug: the best driver is DRI2
libEGL debug: EGL user error 0x3009 (EGL_BAD_MATCH) in dri2_create_context
libEGL debug: dri2_make_current 0x90d7a0 0x9a2f10 (nil) (nil) 0x99dbf0
libEGL debug: dri2_display_release 0x9a2f10 2
refcount -> 1

This causes the reference count to drop to zero later on:
libEGL debug: dri2_make_current 0x90d7a0 0x9a2f10 (nil) (nil) (nil)
libEGL debug: dri2_display_release 0x9a2f10 1
refcount -> 0 => display is destroyed
libEGL debug: dri2_make_current 0x90d7a0 0x9a2f10 (nil) (nil) 0x99dbf0
libEGL debug: EGL user error 0x3001 (EGL_NOT_INITIALIZED) in eglMakeCurrent

One more patch to confirm this.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 3/3] anv/gen7_pipeline: Set multisample state using shared function

2016-08-09 Thread Anuj Phogat
On Mon, Aug 8, 2016 at 5:09 PM, Jason Ekstrand  wrote:
> Does this fix any tests?  If so, we should say so in the commit message.
> With that updated,
>
No it doesn't. It surprised me too but it looks like cts don't have enough
multisampling tests.

> Reviewed-by: Jason Ekstrand 
>
> On Mon, Aug 8, 2016 at 2:57 PM, Anuj Phogat  wrote:
>>
>> Signed-off-by: Anuj Phogat 
>> ---
>>  src/intel/vulkan/gen7_pipeline.c | 16 +---
>>  1 file changed, 1 insertion(+), 15 deletions(-)
>>
>> diff --git a/src/intel/vulkan/gen7_pipeline.c
>> b/src/intel/vulkan/gen7_pipeline.c
>> index 5395e79..17d7ccc 100644
>> --- a/src/intel/vulkan/gen7_pipeline.c
>> +++ b/src/intel/vulkan/gen7_pipeline.c
>> @@ -81,21 +81,7 @@ genX(graphics_pipeline_create)(
>>   pCreateInfo->pRasterizationState, extra);
>> emit_3dstate_streamout(pipeline, pCreateInfo->pRasterizationState);
>>
>> -   if (pCreateInfo->pMultisampleState &&
>> -   pCreateInfo->pMultisampleState->rasterizationSamples > 1)
>> -
>> anv_finishme("VK_STRUCTURE_TYPE_PIPELINE_MULTISAMPLE_STATE_CREATE_INFO");
>> -
>> -   uint32_t samples = 1;
>> -   uint32_t log2_samples = __builtin_ffs(samples) - 1;
>> -
>> -   anv_batch_emit(>batch, GENX(3DSTATE_MULTISAMPLE), ms) {
>> -  ms.PixelLocation= PIXLOC_CENTER;
>> -  ms.NumberofMultisamples = log2_samples;
>> -   }
>> -
>> -   anv_batch_emit(>batch, GENX(3DSTATE_SAMPLE_MASK), sm) {
>> -  sm.SampleMask = 0xff;
>> -   }
>> +   emit_ms_state(pipeline, pCreateInfo->pMultisampleState);
>>
>> const struct brw_vs_prog_data *vs_prog_data =
>> get_vs_prog_data(pipeline);
>>
>> --
>> 2.5.5
>>
>> ___
>> mesa-dev mailing list
>> mesa-dev@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] freedreno/a2xx: add missing casts to silence notices

2016-08-09 Thread Francesco Ansanelli
Signed-off-by: Francesco Ansanelli 
---
 src/gallium/drivers/freedreno/a2xx/ir-a2xx.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/freedreno/a2xx/ir-a2xx.c 
b/src/gallium/drivers/freedreno/a2xx/ir-a2xx.c
index 2b62b3a..163c282 100644
--- a/src/gallium/drivers/freedreno/a2xx/ir-a2xx.c
+++ b/src/gallium/drivers/freedreno/a2xx/ir-a2xx.c
@@ -403,7 +403,7 @@ static int instr_emit_alu(struct ir2_instruction *instr, 
uint32_t *dwords,
assert((src2_reg->flags & IR2_REG_EXPORT) == 0);
assert(!src2_reg->swizzle || (strlen(src2_reg->swizzle) == 4));
 
-   if (instr->alu.vector_opc == ~0) {
+   if (instr->alu.vector_opc == (instr_vector_opc_t)~0) {
alu->vector_opc  = MAXv;
alu->vector_write_mask   = 0;
} else {
@@ -431,7 +431,7 @@ static int instr_emit_alu(struct ir2_instruction *instr, 
uint32_t *dwords,
alu->vector_clamp= instr->alu.vector_clamp;
alu->scalar_clamp= instr->alu.scalar_clamp;
 
-   if (instr->alu.scalar_opc != ~0) {
+   if (instr->alu.scalar_opc != (instr_scalar_opc_t)~0) {
struct ir2_register *sdst_reg = instr->regs[reg++];
 
reg_update_stats(sdst_reg, info, true);
-- 
1.7.9.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 97214] X not running with error "Failed to make EGL context current"

2016-08-09 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=97214

--- Comment #8 from Alexandr Zelinsky  ---
Created attachment 125644
  --> https://bugs.freedesktop.org/attachment.cgi?id=125644=edit
EGL_LOG_LEVEL=debug with patches

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Moving amdgpu/addrlib into a git submodule

2016-08-09 Thread Nicolai Hähnle



On 09.08.2016 18:22, Erik Faye-Lund wrote:

On Tue, Aug 9, 2016 at 4:59 PM, Nicolai Hähnle  wrote:

On 09.08.2016 15:58, Rob Clark wrote:


tbh, git submodules are more annoying than they need to be, and I'm
not really terribly excited to use that for something that is a build
dependency.  Maybe just move it into libdrm instead?



I know. That's what I would have proposed if the addrlib interface were
stable. Unfortunately it isn't, and realistically speaking, that's not going
to change.

So shared linking is right out.

Static linking or just including source files from a separate repository
could be considered, but then what's the process for ensuring you have the
right version?

The nice aspect of submodules is that every commit of the Mesa repository
"knows" what the corresponding right version of addrlib is, and so git can
update the submodule to the correct version for you automatically.


I'm not a huge fan of submodules either. They just don't deal with
distributed development properly, which should be a non-starter for
OSS IMO. You either set the submodule to point to an absolute URL, in
which case it's awkward to work with if you need to change the code,
or you use a relative URL, which forces everyone who has a fork to
fork the submodule also. Yuck. As a formerly active Git developer, my
impression is that nobody of the core-git developers really loved the
idea of git-submodule, it was mostly introduced into Git to help KDE
transition their gigantic SVN-external based source tree to Git.

IMO, a much better alternative would be to have addrlib live in its
own repository, and periodically do a git subtree-merge into mesa and
other dependent system. That means that nobody really needs to deal
with the fact that the upstream is in a different repo, except when
submitting patches for upstream. This is what git.git itself does for
some of its subsystems.


That looks interesting. Are people using subtree-merge with the kind of 
linear history that Mesa uses?


Nicolai




___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] vl: add a lanczos interpolation filter v3

2016-08-09 Thread Nayan Deshmukh
Hi Christian,

A few questions.



On Tue, Aug 9, 2016 at 5:10 PM, Christian König 
wrote:

> I am more than happy to solve these problems, the Lanczos filtering was
> getting a little stale
> anyway because I was not able to reproduce the problems Andy was facing.
>
> Yeah thought so, the reason is probably that you don't have the necessary
> hardware.
>
> Is that why I need to add a PIPE_BIND_LINEAR to a surface?
>
> Yes exactly.
>
> So I need to use maybe a couple of surfaces alternatively to read and
> write with the filters. This approach should work I guess.
>
> Allocate a temporary surface for each step, apply the necessary filters
> using it and then use the temporary buffer as input for the next step.
>
> See how the deinterlacing filter does this, you should use the same
> approach here.
>
> I would use this order for doing things:
> 1. Median filter for noise reduction.
> 2. Sharpening/blur filter.
> 3. Deinterlacing.
> 4. Compositioning and CC conversion.
> 5. Advanced scaling.
>
>
I need to provide the median filter and the blur filter with a sampler view
and the deint filter requires a pipe_video_buffer. I am not sure how to
acheive this. Any suggestions?

Also right now deinterlacing is the first step and the other steps follow.
But if we perform median and sharpening filter before then we also need to
apply them on the past and the future surfaces that we require for
deinterlacing. Am I right?

Regards,
Nayan.

Regards,
> Christian.
>
>
> Am 08.08.2016 um 16:32 schrieb Nayan Deshmukh:
>
> Hi Christian,
>
> I am more than happy to solve these problems, the Lanczos filtering was
> getting a little stale
> anyway because I was not able to reproduce the problems Andy was facing.
>
> On Mon, Aug 8, 2016 at 6:24 PM, Christian König 
> wrote:
>
>> Hi Nayan,
>>
>> ok let's take a step back and put the lanczos filtering aside for a
>> moment. I have another task for you which is more urgent right now.
>>
>> The order we do things in vlVdpVideoMixerRender() was never 100% correct,
>> so we have at least three problems here which needs fixing:
>>
>> 1) The noise reduction and sharpness filter read and write to the same
>> surface at the same time. That only works because we use a linear layout.
>>
>> Is that why I need to add a PIPE_BIND_LINEAR to a surface? So I need to
> use maybe a couple of surfaces alternatively to read and write with the
> filters. This approach should work I guess.
>
> 2) We apply the noise reduction and sharpness filter after the
>> composition. That is rather odd and should be fixed so that we apply those
>> filters on the original video frame instead.
>>
>>  So we need to apply the filter before the CSC conversion.
>
>> 3) We use delayed rendering to render into output surfaces directly. We
>> should rather use the DRI3 capabilities to allocate multiple output
>> surfaces instead.
>>
>> Could you take care of some of those issues? Especially #1 has become a
>> problem recently.
>>
>> Surely, I will start working on the first 2 problem for now and look at
> the third problem a little later.
>
> Regards,
> Nayan.
>
>
>> Regards,
>> Christian.
>>
>>
>> Am 04.08.2016 um 19:22 schrieb Nayan Deshmukh:
>>
>> Hi Andy,
>>
>>
>> On Thu, Aug 4, 2016 at 8:48 PM, Andy Furniss  wrote:
>>
>>> Nayan Deshmukh wrote:
>>>
 Hi Andy,

 Thanks for testing my patches.

>>>
>>> NP
>>>
>>>
>>> The scaling happens after CSC.
>

>>> OK, thanks.
>>>
>>>
>>> I believe there is some misunderstanding here, I was able to see the
 artifacts in the video that you sent me previously. But I was not
 able to replicate them

>>>
>>> Yea, I got that - just thought you may want to see how they had changed.
>>>
>>> with the pendulum video on my system. Same case this time the
 pendulum video plays fine this time too. I am attacing a video of it
 here

 https://drive.google.com/file/d/0B1s62k5GtdBwOVAtOUVaU0V5c1E
 /view?usp=sharing

>>>
>>> Hmm, that's interesting for a few reasons.
>>>
>>> Though my monitor won't do 1366x768 I can replicate the same scale
>>> factor windowed with mplayer ... -xy 768/576 ...
>>>
>>> At first glance only level 2 is obviously artifacted (though very close
>>> inspection shows others are slightly).
>>>
>>> Levels: for some reason your vid has the black bars at 0 but the content
>>> isn't scaled - like your mplayer didn't expand tv to pc on playback.
>>> This shouldn't happen by default. Do you have some extra config
>>> somewhere like in $HOME/.mplayer, if so maybe better to test without.
>>>
>>> Most important - though the vp9 compression may be to blame I can't
>>> really see any difference between the levels in that vid.
>>>
>>> In fact closely comparing just your level 1 to my (admittedly
>>> uncompressed) level 1 and 0 output scaled the same plus unstretched
>>> levels wise it looks to me like you are getting level 0 on this test.
>>>
>>
>> You are 

Re: [Mesa-dev] Moving amdgpu/addrlib into a git submodule

2016-08-09 Thread Nicolai Hähnle

On 09.08.2016 19:18, Marek Olšák wrote:

On Tue, Aug 9, 2016 at 5:35 PM, Nicolai Hähnle  wrote:

On 09.08.2016 17:21, Marek Olšák wrote:


On Tue, Aug 9, 2016 at 3:47 PM, Nicolai Hähnle  wrote:


Hi everybody,

addrlib is the addressing and alignment calculator which is used by
radeonsi. It's developed (and also used) internally at AMD, and so far
we've
had one open source copy living in the Mesa repository at
src/gallium/winsys/amdgpu/drm/addrlib.

The question of using addrlib in non-Mesa parts of our open-source stack
has
come up, in particular in relation to compute. We'd obviously like to
share
the code rather than having multiple copies flying around. Since the
interface of addrlib is slow-moving but unstable, shared linking is not
an
option.

I think the best way forward is to create a dedicated repository for
addrlib
which is then integrated into Mesa as a git submodule.

The point of this email is to gather feedback from the Mesa community on
this plan, which is explicitly:

(0) Create an addrlib repository, say amd/addrlib on fd.o.
(1) Add it as a git submodule to the Mesa repository.
(2) Fix up whatever aspects of the build system that may be affected
(perhaps for building source tarballs).
(3) Go back to mostly ignoring addrlib, except for trying to get better
at
syncing with the internal closed-source version.

From initial experiments, the impact on users interested in radeon is
that
they will have to run `git submodule init` and then occasionally `git
submodule update`. Users who do not build radeonsi should be able to
ignore
the change completely.

There are alternatives. For example, ROCm uses Google's repo tool
already.
But for Mesa, git submodule looks like a lightweight, well supported and
overall conservative option that everybody should already have installed.
If
there are good arguments for something else, let's hear them!

Another point: if we proceed with this plan, I think we should consider
moving addrlib into src/amd/addrlib. There are two reasons: First,
transitioning to a submodule *without* changing the directory is probably
more fragile, i.e. what happens when you switch between checkouts before
and
after the transition. Second, if/when radv ends up being merged into Mesa
master, it makes sense to have addrlib there anyway.

Thoughts? Complaints? Praise?



I don't know.

How does this ensure that Mesa and ROCm addrlib copies won't diverge?



They won't really be different copies, because both "copies" are really
checkouts from the same repository. They will occasionally be checkouts of
_different versions_ from the same repository -- usually that would happen
after a sync with the internal copy, when one driver updates their pointer
before the other does. But that's easiy to reconcile. Usually it should just
mean changing the version pointer in whichever driver uses the older
version.



What issues can we expect if Mesa and ROCm addrlib copies diverge?



This is about software maintenance. If we _do_ have separate copies, and
someone applies a bug fix in one copy, they may forget to apply it to the
other. When we want to sync with the internal copy, that has to be done
twice. Basically, all the usual frictions that go with having the same (or
almost the same) piece of code in more than one place.


Instead of introducing a new repo, can ROCm simply copy addrlib
directly from the Mesa tree?

If I understand it correctly, the only thing the git submodule would
allow is that ROCm developers wouldn't have to clone Mesa.


They certainly can copy addrlib from the Mesa tree. The question is 
whether that's a good basis for staying synchronized in the future.


Nicolai



Marek


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Moving amdgpu/addrlib into a git submodule

2016-08-09 Thread Marek Olšák
On Tue, Aug 9, 2016 at 5:35 PM, Nicolai Hähnle  wrote:
> On 09.08.2016 17:21, Marek Olšák wrote:
>>
>> On Tue, Aug 9, 2016 at 3:47 PM, Nicolai Hähnle  wrote:
>>>
>>> Hi everybody,
>>>
>>> addrlib is the addressing and alignment calculator which is used by
>>> radeonsi. It's developed (and also used) internally at AMD, and so far
>>> we've
>>> had one open source copy living in the Mesa repository at
>>> src/gallium/winsys/amdgpu/drm/addrlib.
>>>
>>> The question of using addrlib in non-Mesa parts of our open-source stack
>>> has
>>> come up, in particular in relation to compute. We'd obviously like to
>>> share
>>> the code rather than having multiple copies flying around. Since the
>>> interface of addrlib is slow-moving but unstable, shared linking is not
>>> an
>>> option.
>>>
>>> I think the best way forward is to create a dedicated repository for
>>> addrlib
>>> which is then integrated into Mesa as a git submodule.
>>>
>>> The point of this email is to gather feedback from the Mesa community on
>>> this plan, which is explicitly:
>>>
>>> (0) Create an addrlib repository, say amd/addrlib on fd.o.
>>> (1) Add it as a git submodule to the Mesa repository.
>>> (2) Fix up whatever aspects of the build system that may be affected
>>> (perhaps for building source tarballs).
>>> (3) Go back to mostly ignoring addrlib, except for trying to get better
>>> at
>>> syncing with the internal closed-source version.
>>>
>>> From initial experiments, the impact on users interested in radeon is
>>> that
>>> they will have to run `git submodule init` and then occasionally `git
>>> submodule update`. Users who do not build radeonsi should be able to
>>> ignore
>>> the change completely.
>>>
>>> There are alternatives. For example, ROCm uses Google's repo tool
>>> already.
>>> But for Mesa, git submodule looks like a lightweight, well supported and
>>> overall conservative option that everybody should already have installed.
>>> If
>>> there are good arguments for something else, let's hear them!
>>>
>>> Another point: if we proceed with this plan, I think we should consider
>>> moving addrlib into src/amd/addrlib. There are two reasons: First,
>>> transitioning to a submodule *without* changing the directory is probably
>>> more fragile, i.e. what happens when you switch between checkouts before
>>> and
>>> after the transition. Second, if/when radv ends up being merged into Mesa
>>> master, it makes sense to have addrlib there anyway.
>>>
>>> Thoughts? Complaints? Praise?
>>
>>
>> I don't know.
>>
>> How does this ensure that Mesa and ROCm addrlib copies won't diverge?
>
>
> They won't really be different copies, because both "copies" are really
> checkouts from the same repository. They will occasionally be checkouts of
> _different versions_ from the same repository -- usually that would happen
> after a sync with the internal copy, when one driver updates their pointer
> before the other does. But that's easiy to reconcile. Usually it should just
> mean changing the version pointer in whichever driver uses the older
> version.
>
>
>> What issues can we expect if Mesa and ROCm addrlib copies diverge?
>
>
> This is about software maintenance. If we _do_ have separate copies, and
> someone applies a bug fix in one copy, they may forget to apply it to the
> other. When we want to sync with the internal copy, that has to be done
> twice. Basically, all the usual frictions that go with having the same (or
> almost the same) piece of code in more than one place.

Instead of introducing a new repo, can ROCm simply copy addrlib
directly from the Mesa tree?

If I understand it correctly, the only thing the git submodule would
allow is that ROCm developers wouldn't have to clone Mesa.

Marek
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/4] prog_hash_table: Convert compare funcs to match util/hash_table.h.

2016-08-09 Thread Eric Anholt
I'm going to replace this hash table with util/hash_table.h, and the first
step is to compare things the same way.
---
 src/mesa/program/hash_table.h  | 9 -
 src/mesa/program/prog_hash_table.c | 9 +++--
 2 files changed, 11 insertions(+), 7 deletions(-)

diff --git a/src/mesa/program/hash_table.h b/src/mesa/program/hash_table.h
index d0a2abffa343..aba5282fe9e5 100644
--- a/src/mesa/program/hash_table.h
+++ b/src/mesa/program/hash_table.h
@@ -47,7 +47,7 @@ extern "C" {
 struct hash_table;
 
 typedef unsigned (*hash_func_t)(const void *key);
-typedef int (*hash_compare_func_t)(const void *key1, const void *key2);
+typedef bool (*hash_compare_func_t)(const void *key1, const void *key2);
 
 /**
  * Hash table constructor
@@ -151,12 +151,11 @@ extern unsigned hash_table_string_hash(const void *key);
 /**
  * Compare two strings used as keys
  *
- * This is just a macro wrapper around \c strcmp.
+ * This is just a wrapper around \c strcmp.
  *
  * \sa hash_table_string_hash
  */
-#define hash_table_string_compare ((hash_compare_func_t) strcmp)
-
+bool hash_table_string_compare(const void *a, const void *b);
 
 /**
  * Compute hash value of a pointer
@@ -178,7 +177,7 @@ hash_table_pointer_hash(const void *key);
  *
  * \sa hash_table_pointer_hash
  */
-int
+bool
 hash_table_pointer_compare(const void *key1, const void *key2);
 
 void
diff --git a/src/mesa/program/prog_hash_table.c 
b/src/mesa/program/prog_hash_table.c
index 5592b6fb8148..f8a7107eb5bd 100644
--- a/src/mesa/program/prog_hash_table.c
+++ b/src/mesa/program/prog_hash_table.c
@@ -228,6 +228,11 @@ hash_table_string_hash(const void *key)
 return hash;
 }
 
+bool hash_table_string_compare(const void *a, const void *b)
+{
+   return strcmp(a, b) == 0;
+}
+
 
 unsigned
 hash_table_pointer_hash(const void *key)
@@ -236,8 +241,8 @@ hash_table_pointer_hash(const void *key)
 }
 
 
-int
+bool
 hash_table_pointer_compare(const void *key1, const void *key2)
 {
-   return key1 == key2 ? 0 : 1;
+   return key1 == key2;
 }
-- 
2.8.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/4] nir: Drop an unused program/hash_table.h include.

2016-08-09 Thread Eric Anholt
---
 src/compiler/nir/nir_lower_samplers.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/src/compiler/nir/nir_lower_samplers.c 
b/src/compiler/nir/nir_lower_samplers.c
index 4a4326983a65..e878edd9b54b 100644
--- a/src/compiler/nir/nir_lower_samplers.c
+++ b/src/compiler/nir/nir_lower_samplers.c
@@ -25,7 +25,6 @@
 
 #include "nir.h"
 #include "nir_builder.h"
-#include "program/hash_table.h"
 #include "compiler/glsl/ir_uniform.h"
 
 #include "main/compiler.h"
-- 
2.8.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 3/4] prog_hash_table: Convert to using util/hash_table.h.

2016-08-09 Thread Eric Anholt
Improves glretrace -b servo.trace (a trace of Mozilla's servo rendering
engine booting, rendering a page, and exiting) from 1.8s to 1.1s.  It uses
a large uniform array of structs, making a huge number of separate program
resources, and the fixed-size hash table was killing it.  Given how many
times we've improved performance by swapping the hash table to
util/hash_table.h, just do it once and for all.

This just rebases the old hash table API on top of util/, for minimal
diff.  Cleaning things up is left for later, particularly because I want
to fix up the new hash table API a little bit.
---
 src/mesa/program/hash_table.h  |  78 +++-
 src/mesa/program/prog_hash_table.c | 181 -
 2 files changed, 54 insertions(+), 205 deletions(-)

diff --git a/src/mesa/program/hash_table.h b/src/mesa/program/hash_table.h
index aba5282fe9e5..362eb2ee0a78 100644
--- a/src/mesa/program/hash_table.h
+++ b/src/mesa/program/hash_table.h
@@ -37,6 +37,7 @@
 #include 
 #include 
 #include 
+#include "util/hash_table.h"
 
 struct string_to_uint_map;
 
@@ -44,8 +45,6 @@ struct string_to_uint_map;
 extern "C" {
 #endif
 
-struct hash_table;
-
 typedef unsigned (*hash_func_t)(const void *key);
 typedef bool (*hash_compare_func_t)(const void *key1, const void *key2);
 
@@ -60,26 +59,32 @@ typedef bool (*hash_compare_func_t)(const void *key1, const 
void *key2);
  * \param hash Function used to compute hash value of input keys.
  * \param compare  Function used to compare keys.
  */
-extern struct hash_table *hash_table_ctor(unsigned num_buckets,
-hash_func_t hash, hash_compare_func_t compare);
-
+static inline struct hash_table *hash_table_ctor(unsigned num_buckets,
+hash_func_t hash, hash_compare_func_t compare)
+{
+   return _mesa_hash_table_create(NULL, hash, compare);
+}
 
 /**
  * Release all memory associated with a hash table
  *
  * \warning
- * This function cannot release memory occupied either by keys or data.
+ * This function does not release memory occupied either by keys or data.
  */
-extern void hash_table_dtor(struct hash_table *ht);
-
+static inline void hash_table_dtor(struct hash_table *ht)
+{
+   return _mesa_hash_table_destroy(ht, NULL);
+}
 
 /**
  * Flush all entries from a hash table
  *
  * \param ht  Table to be cleared of its entries.
  */
-extern void hash_table_clear(struct hash_table *ht);
-
+static inline void hash_table_clear(struct hash_table *ht)
+{
+   return _mesa_hash_table_clear(ht, NULL);
+}
 
 /**
  * Search a hash table for a specific element
@@ -92,25 +97,28 @@ extern void hash_table_clear(struct hash_table *ht);
  * the matching key was added.  If no matching key exists in the table,
  * \c NULL is returned.
  */
-extern void *hash_table_find(struct hash_table *ht, const void *key);
-
+static inline void *hash_table_find(struct hash_table *ht, const void *key)
+{
+   struct hash_entry *entry = _mesa_hash_table_search(ht, key);
+   if (!entry)
+  return NULL;
+   return entry->data;
+}
 
 /**
  * Add an element to a hash table
  *
  * \warning
- * If \c key is already in the hash table, it will be added again.  Future
- * calls to \c hash_table_find and \c hash_table_remove will return or remove,
- * repsectively, the most recently added instance of \c key.
- *
- * \warning
  * The value passed by \c key is kept in the hash table and is used by later
  * calls to \c hash_table_find.
  *
  * \sa hash_table_replace
  */
-extern void hash_table_insert(struct hash_table *ht, void *data,
-const void *key);
+static inline void hash_table_insert(struct hash_table *ht, void *data,
+ const void *key)
+{
+   _mesa_hash_table_insert(ht, key, data);
+}
 
 /**
  * Add an element to a hash table with replacement
@@ -126,13 +134,29 @@ extern void hash_table_insert(struct hash_table *ht, void 
*data,
  *
  * \sa hash_table_insert
  */
-extern bool hash_table_replace(struct hash_table *ht, void *data,
-const void *key);
+static inline bool hash_table_replace(struct hash_table *ht, void *data,
+  const void *key)
+{
+   struct hash_entry *entry = _mesa_hash_table_search(ht, key);
+   if (entry) {
+  entry->data = data;
+  return true;
+   } else {
+  _mesa_hash_table_insert(ht, key, data);
+  return false;
+   }
+}
 
 /**
  * Remove a specific element from a hash table.
  */
-extern void hash_table_remove(struct hash_table *ht, const void *key);
+static inline void hash_table_remove(struct hash_table *ht, const void *key)
+{
+   struct hash_entry *entry = _mesa_hash_table_search(ht, key);
+
+   if (entry)
+  _mesa_hash_table_remove(ht, entry);
+}
 
 /**
  * Compute hash value of a string
@@ -180,12 +204,18 @@ hash_table_pointer_hash(const void *key);
 bool
 hash_table_pointer_compare(const void *key1, const void *key2);
 
-void
+static inline void
 hash_table_call_foreach(struct hash_table *ht,
void 

[Mesa-dev] [PATCH 4/4] mesa: Use a temporary set to track whether we've added a resource yet.

2016-08-09 Thread Eric Anholt
Saves another .1s on servo.trace.
---
 src/compiler/glsl/linker.cpp | 76 +---
 1 file changed, 50 insertions(+), 26 deletions(-)

diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp
index f4049133ee69..ceb86aa0a929 100644
--- a/src/compiler/glsl/linker.cpp
+++ b/src/compiler/glsl/linker.cpp
@@ -73,6 +73,7 @@
 #include "program.h"
 #include "program/hash_table.h"
 #include "program/prog_instruction.h"
+#include "util/set.h"
 #include "linker.h"
 #include "link_varyings.h"
 #include "ir_optimization.h"
@@ -3528,15 +3529,15 @@ should_add_buffer_variable(struct gl_shader_program 
*shProg,
 }
 
 static bool
-add_program_resource(struct gl_shader_program *prog, GLenum type,
- const void *data, uint8_t stages)
+add_program_resource(struct gl_shader_program *prog,
+ struct set *resource_set,
+ GLenum type, const void *data, uint8_t stages)
 {
assert(data);
 
/* If resource already exists, do not add it again. */
-   for (unsigned i = 0; i < prog->NumProgramResourceList; i++)
-  if (prog->ProgramResourceList[i].Data == data)
- return true;
+   if (_mesa_set_search(resource_set, data))
+  return true;
 
prog->ProgramResourceList =
   reralloc(prog,
@@ -3558,6 +3559,8 @@ add_program_resource(struct gl_shader_program *prog, 
GLenum type,
 
prog->NumProgramResourceList++;
 
+   _mesa_set_add(resource_set, data);
+
return true;
 }
 
@@ -3722,7 +3725,8 @@ create_shader_variable(struct gl_shader_program *shProg,
 }
 
 static bool
-add_shader_variable(struct gl_shader_program *shProg, unsigned stage_mask,
+add_shader_variable(struct gl_shader_program *shProg, struct set *resource_set,
+unsigned stage_mask,
 GLenum programInterface, ir_variable *var,
 const char *name, const glsl_type *type,
 bool use_implicit_location, int location,
@@ -3750,7 +3754,8 @@ add_shader_variable(struct gl_shader_program *shProg, 
unsigned stage_mask,
   for (unsigned i = 0; i < type->length; i++) {
  const struct glsl_struct_field *field = >fields.structure[i];
  char *field_name = ralloc_asprintf(shProg, "%s.%s", name, 
field->name);
- if (!add_shader_variable(shProg, stage_mask, programInterface,
+ if (!add_shader_variable(shProg, resource_set,
+  stage_mask, programInterface,
   var, field_name, field->type,
   use_implicit_location, field_location,
   outermost_struct_type))
@@ -3792,13 +3797,15 @@ add_shader_variable(struct gl_shader_program *shProg, 
unsigned stage_mask,
   if (!sha_v)
  return false;
 
-  return add_program_resource(shProg, programInterface, sha_v, stage_mask);
+  return add_program_resource(shProg, resource_set,
+  programInterface, sha_v, stage_mask);
}
}
 }
 
 static bool
 add_interface_variables(struct gl_shader_program *shProg,
+struct set *resource_set,
 unsigned stage, GLenum programInterface)
 {
exec_list *ir = shProg->_LinkedShaders[stage]->ir;
@@ -3848,7 +3855,8 @@ add_interface_variables(struct gl_shader_program *shProg,
  (stage == MESA_SHADER_VERTEX && var->data.mode == ir_var_shader_in) ||
  (stage == MESA_SHADER_FRAGMENT && var->data.mode == 
ir_var_shader_out);
 
-  if (!add_shader_variable(shProg, 1 << stage, programInterface,
+  if (!add_shader_variable(shProg, resource_set,
+   1 << stage, programInterface,
var, var->name, var->type, 
vs_input_or_fs_output,
var->data.location - loc_bias))
  return false;
@@ -3857,7 +3865,8 @@ add_interface_variables(struct gl_shader_program *shProg,
 }
 
 static bool
-add_packed_varyings(struct gl_shader_program *shProg, int stage, GLenum type)
+add_packed_varyings(struct gl_shader_program *shProg, struct set *resource_set,
+int stage, GLenum type)
 {
struct gl_linked_shader *sh = shProg->_LinkedShaders[stage];
GLenum iface;
@@ -3882,7 +3891,8 @@ add_packed_varyings(struct gl_shader_program *shProg, int 
stage, GLenum type)
  if (type == iface) {
 const int stage_mask =
build_stageref(shProg, var->name, var->data.mode);
-if (!add_shader_variable(shProg, stage_mask,
+if (!add_shader_variable(shProg, resource_set,
+ stage_mask,
  iface, var, var->name, var->type, false,
  var->data.location - VARYING_SLOT_VAR0))
return false;
@@ -3893,7 +3903,7 @@ add_packed_varyings(struct gl_shader_program *shProg, int 
stage, GLenum type)
 }
 
 

[Mesa-dev] [Bug 97214] X not running with error "Failed to make EGL context current"

2016-08-09 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=97214

--- Comment #7 from Nicolas Boichat  ---
Created attachment 125643
  --> https://bugs.freedesktop.org/attachment.cgi?id=125643=edit
Add tracing to egl_dri2.c

libEGL debug: EGL user error 0x3001 (EGL_NOT_INITIALIZED) in eglMakeCurrent

Is due to the new code in dri2_make_current:
   if (!dri2_dpy)
  return _eglError(EGL_NOT_INITIALIZED, "eglMakeCurrent");

I don't quite understand how dri2_dpy could end up being uninitialized, unless
eglMakeCurrent is called after eglTerminate...

I'd start by adding traces in dri2_initialize, dri2_terminate,
dri2_make_current, which is what this patch does...

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Moving amdgpu/addrlib into a git submodule

2016-08-09 Thread Erik Faye-Lund
On Tue, Aug 9, 2016 at 4:59 PM, Nicolai Hähnle  wrote:
> On 09.08.2016 15:58, Rob Clark wrote:
>>
>> tbh, git submodules are more annoying than they need to be, and I'm
>> not really terribly excited to use that for something that is a build
>> dependency.  Maybe just move it into libdrm instead?
>
>
> I know. That's what I would have proposed if the addrlib interface were
> stable. Unfortunately it isn't, and realistically speaking, that's not going
> to change.
>
> So shared linking is right out.
>
> Static linking or just including source files from a separate repository
> could be considered, but then what's the process for ensuring you have the
> right version?
>
> The nice aspect of submodules is that every commit of the Mesa repository
> "knows" what the corresponding right version of addrlib is, and so git can
> update the submodule to the correct version for you automatically.

I'm not a huge fan of submodules either. They just don't deal with
distributed development properly, which should be a non-starter for
OSS IMO. You either set the submodule to point to an absolute URL, in
which case it's awkward to work with if you need to change the code,
or you use a relative URL, which forces everyone who has a fork to
fork the submodule also. Yuck. As a formerly active Git developer, my
impression is that nobody of the core-git developers really loved the
idea of git-submodule, it was mostly introduced into Git to help KDE
transition their gigantic SVN-external based source tree to Git.

IMO, a much better alternative would be to have addrlib live in its
own repository, and periodically do a git subtree-merge into mesa and
other dependent system. That means that nobody really needs to deal
with the fact that the upstream is in a different repo, except when
submitting patches for upstream. This is what git.git itself does for
some of its subsystems.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] cairo as state tracker

2016-08-09 Thread Jason Ekstrand
On Tue, Aug 9, 2016 at 8:11 AM, Enrico Weigelt, metux IT consult <
enrico.weig...@gr13.net> wrote:

> On 07.08.2016 12:50, Marek Olšák wrote:
>
> > It would mainly be a futile task if it had to compete with their
> > official Mesa driver.
>
> Not quite. Would give us all of gallium's capabilities also for
> the intel chips, for example having lots of different state trackers.
> (coming back to my original intention of cairo as a gallium st)
>

Gallium isn't an API you support it's a way you write your driver.
Switching to gallium is a fundamental change in the way the entire driver
is architected.  It could be done (and recent changes in our driver are
shrinking the dri portion so it's getting easier) but it's a massive pile
of work with mediocre benefit.  I could go on a very long rant about it but
the short version is: If we were writing a new driver or if something came
up that made gallium hugely benificial, we might consider it but at the
moment, there's no good reason.

As far as cairo goes, I'd much rather see the GL backend improved or a
Vulkan backend added as those are both stable industry-supported APIs
rather than running directly on gallium.

--Jason
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2] vl/rbsp: add a check for emulation prevention three byte

2016-08-09 Thread Christian König

Am 09.08.2016 um 17:18 schrieb Leo Liu:

This is the case when the "00 00 03" is very close to the beginning of
nal unit header

v2: move the check to rbsp init

Signed-off-by: Leo Liu 


Reviewed-by: Christian König 


---
  src/gallium/auxiliary/vl/vl_rbsp.h | 14 --
  1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/src/gallium/auxiliary/vl/vl_rbsp.h 
b/src/gallium/auxiliary/vl/vl_rbsp.h
index 7867238..c8bebff 100644
--- a/src/gallium/auxiliary/vl/vl_rbsp.h
+++ b/src/gallium/auxiliary/vl/vl_rbsp.h
@@ -50,7 +50,8 @@ struct vl_rbsp {
   */
  static inline void vl_rbsp_init(struct vl_rbsp *rbsp, struct vl_vlc *nal, 
unsigned num_bits)
  {
-   unsigned bits_left = vl_vlc_bits_left(nal);
+   unsigned valid, bits_left = vl_vlc_bits_left(nal);
+   int i;
  
 /* copy the position */

 rbsp->nal = *nal;
@@ -62,10 +63,19 @@ static inline void vl_rbsp_init(struct vl_rbsp *rbsp, 
struct vl_vlc *nal, unsign
if (vl_vlc_peekbits(nal, 24) == 0x01 ||
vl_vlc_peekbits(nal, 32) == 0x0001) {
   vl_vlc_limit(>nal, bits_left - vl_vlc_bits_left(nal));
- return;
+ break;
}
vl_vlc_eatbits(nal, 8);
 }
+
+   valid = vl_vlc_valid_bits(>nal);
+   /* search for the emulation prevention three byte */
+   for (i = 24; i <= valid; i += 8) {
+  if ((vl_vlc_peekbits(>nal, i) & 0xff) == 0x3) {
+ vl_vlc_removebits(>nal, i - 8, 8);
+ i += 8;
+  }
+   }
  }
  
  /**



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Moving amdgpu/addrlib into a git submodule

2016-08-09 Thread Marek Olšák
On Tue, Aug 9, 2016 at 3:47 PM, Nicolai Hähnle  wrote:
> Hi everybody,
>
> addrlib is the addressing and alignment calculator which is used by
> radeonsi. It's developed (and also used) internally at AMD, and so far we've
> had one open source copy living in the Mesa repository at
> src/gallium/winsys/amdgpu/drm/addrlib.
>
> The question of using addrlib in non-Mesa parts of our open-source stack has
> come up, in particular in relation to compute. We'd obviously like to share
> the code rather than having multiple copies flying around. Since the
> interface of addrlib is slow-moving but unstable, shared linking is not an
> option.
>
> I think the best way forward is to create a dedicated repository for addrlib
> which is then integrated into Mesa as a git submodule.
>
> The point of this email is to gather feedback from the Mesa community on
> this plan, which is explicitly:
>
> (0) Create an addrlib repository, say amd/addrlib on fd.o.
> (1) Add it as a git submodule to the Mesa repository.
> (2) Fix up whatever aspects of the build system that may be affected
> (perhaps for building source tarballs).
> (3) Go back to mostly ignoring addrlib, except for trying to get better at
> syncing with the internal closed-source version.
>
> From initial experiments, the impact on users interested in radeon is that
> they will have to run `git submodule init` and then occasionally `git
> submodule update`. Users who do not build radeonsi should be able to ignore
> the change completely.
>
> There are alternatives. For example, ROCm uses Google's repo tool already.
> But for Mesa, git submodule looks like a lightweight, well supported and
> overall conservative option that everybody should already have installed. If
> there are good arguments for something else, let's hear them!
>
> Another point: if we proceed with this plan, I think we should consider
> moving addrlib into src/amd/addrlib. There are two reasons: First,
> transitioning to a submodule *without* changing the directory is probably
> more fragile, i.e. what happens when you switch between checkouts before and
> after the transition. Second, if/when radv ends up being merged into Mesa
> master, it makes sense to have addrlib there anyway.
>
> Thoughts? Complaints? Praise?

I don't know.

How does this ensure that Mesa and ROCm addrlib copies won't diverge?

What issues can we expect if Mesa and ROCm addrlib copies diverge?

For texture sharing, the buffer metadata is set in a way that doesn't
leave any room for interpretation. I think it's possible to bypass
addrlib in this case.

Marek
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Moving amdgpu/addrlib into a git submodule

2016-08-09 Thread Christian König

Am 09.08.2016 um 15:47 schrieb Nicolai Hähnle:

Hi everybody,

addrlib is the addressing and alignment calculator which is used by 
radeonsi. It's developed (and also used) internally at AMD, and so far 
we've had one open source copy living in the Mesa repository at 
src/gallium/winsys/amdgpu/drm/addrlib.


The question of using addrlib in non-Mesa parts of our open-source 
stack has come up, in particular in relation to compute. We'd 
obviously like to share the code rather than having multiple copies 
flying around. Since the interface of addrlib is slow-moving but 
unstable, shared linking is not an option.


I think the best way forward is to create a dedicated repository for 
addrlib which is then integrated into Mesa as a git submodule.


The point of this email is to gather feedback from the Mesa community 
on this plan, which is explicitly:


(0) Create an addrlib repository, say amd/addrlib on fd.o.
(1) Add it as a git submodule to the Mesa repository.
(2) Fix up whatever aspects of the build system that may be affected 
(perhaps for building source tarballs).
(3) Go back to mostly ignoring addrlib, except for trying to get 
better at syncing with the internal closed-source version.


From initial experiments, the impact on users interested in radeon is 
that they will have to run `git submodule init` and then occasionally 
`git submodule update`. Users who do not build radeonsi should be able 
to ignore the change completely.


There are alternatives. For example, ROCm uses Google's repo tool 
already. But for Mesa, git submodule looks like a lightweight, well 
supported and overall conservative option that everybody should 
already have installed. If there are good arguments for something 
else, let's hear them!


Another point: if we proceed with this plan, I think we should 
consider moving addrlib into src/amd/addrlib. There are two reasons: 
First, transitioning to a submodule *without* changing the directory 
is probably more fragile, i.e. what happens when you switch between 
checkouts before and after the transition. Second, if/when radv ends 
up being merged into Mesa master, it makes sense to have addrlib there 
anyway.


Thoughts? Complaints? Praise?


Well using git submodule is a possibility and we had rather good 
experience with that in GStreamer.


But it would remove one major argument to beating the addrlib guys 
towards a stable interface and/or proper library version handling.


Moving it into libdrm is clearly not an option because then you would 
need to use versioning for the whole libdrm_amdgpu library which we 
don't want.


Christian.


Nicolai
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Moving amdgpu/addrlib into a git submodule

2016-08-09 Thread Nicolai Hähnle

On 09.08.2016 17:21, Marek Olšák wrote:

On Tue, Aug 9, 2016 at 3:47 PM, Nicolai Hähnle  wrote:

Hi everybody,

addrlib is the addressing and alignment calculator which is used by
radeonsi. It's developed (and also used) internally at AMD, and so far we've
had one open source copy living in the Mesa repository at
src/gallium/winsys/amdgpu/drm/addrlib.

The question of using addrlib in non-Mesa parts of our open-source stack has
come up, in particular in relation to compute. We'd obviously like to share
the code rather than having multiple copies flying around. Since the
interface of addrlib is slow-moving but unstable, shared linking is not an
option.

I think the best way forward is to create a dedicated repository for addrlib
which is then integrated into Mesa as a git submodule.

The point of this email is to gather feedback from the Mesa community on
this plan, which is explicitly:

(0) Create an addrlib repository, say amd/addrlib on fd.o.
(1) Add it as a git submodule to the Mesa repository.
(2) Fix up whatever aspects of the build system that may be affected
(perhaps for building source tarballs).
(3) Go back to mostly ignoring addrlib, except for trying to get better at
syncing with the internal closed-source version.

From initial experiments, the impact on users interested in radeon is that
they will have to run `git submodule init` and then occasionally `git
submodule update`. Users who do not build radeonsi should be able to ignore
the change completely.

There are alternatives. For example, ROCm uses Google's repo tool already.
But for Mesa, git submodule looks like a lightweight, well supported and
overall conservative option that everybody should already have installed. If
there are good arguments for something else, let's hear them!

Another point: if we proceed with this plan, I think we should consider
moving addrlib into src/amd/addrlib. There are two reasons: First,
transitioning to a submodule *without* changing the directory is probably
more fragile, i.e. what happens when you switch between checkouts before and
after the transition. Second, if/when radv ends up being merged into Mesa
master, it makes sense to have addrlib there anyway.

Thoughts? Complaints? Praise?


I don't know.

How does this ensure that Mesa and ROCm addrlib copies won't diverge?


They won't really be different copies, because both "copies" are really 
checkouts from the same repository. They will occasionally be checkouts 
of _different versions_ from the same repository -- usually that would 
happen after a sync with the internal copy, when one driver updates 
their pointer before the other does. But that's easiy to reconcile. 
Usually it should just mean changing the version pointer in whichever 
driver uses the older version.




What issues can we expect if Mesa and ROCm addrlib copies diverge?


This is about software maintenance. If we _do_ have separate copies, and 
someone applies a bug fix in one copy, they may forget to apply it to 
the other. When we want to sync with the internal copy, that has to be 
done twice. Basically, all the usual frictions that go with having the 
same (or almost the same) piece of code in more than one place.




For texture sharing, the buffer metadata is set in a way that doesn't
leave any room for interpretation. I think it's possible to bypass
addrlib in this case.


Right, this is orthogonal to interoperability. Multiple drivers with 
different versions of addrlib can coexist in a system.


Nicolai



Marek


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] cairo as state tracker

2016-08-09 Thread Rob Clark
On Tue, Aug 9, 2016 at 11:11 AM, Enrico Weigelt, metux IT consult
 wrote:
> On 07.08.2016 12:50, Marek Olšák wrote:
>
>> It would mainly be a futile task if it had to compete with their
>> official Mesa driver.
>
> Not quite. Would give us all of gallium's capabilities also for
> the intel chips, for example having lots of different state trackers.
> (coming back to my original intention of cairo as a gallium st)
>

If you don't realize the complexity of a gpu driver, or the 100's of
thousands of hours that have gone into i965, it's easy to say 'lets
throw that all away and start from scratch with a gallium driver' ;-)

There is ilo.. I suppose if someone cared enough they could add NIR
support and figure out how to share the compiler back-end with i965,
so it wouldn't be *completely* starting from scratch.  But there is
still a big delta between ilo and i965 in terms of features and
supported hw gen's.

BR,
-R
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v2] vl/rbsp: add a check for emulation prevention three byte

2016-08-09 Thread Leo Liu
This is the case when the "00 00 03" is very close to the beginning of
nal unit header

v2: move the check to rbsp init

Signed-off-by: Leo Liu 
---
 src/gallium/auxiliary/vl/vl_rbsp.h | 14 --
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/src/gallium/auxiliary/vl/vl_rbsp.h 
b/src/gallium/auxiliary/vl/vl_rbsp.h
index 7867238..c8bebff 100644
--- a/src/gallium/auxiliary/vl/vl_rbsp.h
+++ b/src/gallium/auxiliary/vl/vl_rbsp.h
@@ -50,7 +50,8 @@ struct vl_rbsp {
  */
 static inline void vl_rbsp_init(struct vl_rbsp *rbsp, struct vl_vlc *nal, 
unsigned num_bits)
 {
-   unsigned bits_left = vl_vlc_bits_left(nal);
+   unsigned valid, bits_left = vl_vlc_bits_left(nal);
+   int i;
 
/* copy the position */
rbsp->nal = *nal;
@@ -62,10 +63,19 @@ static inline void vl_rbsp_init(struct vl_rbsp *rbsp, 
struct vl_vlc *nal, unsign
   if (vl_vlc_peekbits(nal, 24) == 0x01 ||
   vl_vlc_peekbits(nal, 32) == 0x0001) {
  vl_vlc_limit(>nal, bits_left - vl_vlc_bits_left(nal));
- return;
+ break;
   }
   vl_vlc_eatbits(nal, 8);
}
+
+   valid = vl_vlc_valid_bits(>nal);
+   /* search for the emulation prevention three byte */
+   for (i = 24; i <= valid; i += 8) {
+  if ((vl_vlc_peekbits(>nal, i) & 0xff) == 0x3) {
+ vl_vlc_removebits(>nal, i - 8, 8);
+ i += 8;
+  }
+   }
 }
 
 /**
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] cairo as state tracker

2016-08-09 Thread Enrico Weigelt, metux IT consult
On 07.08.2016 12:50, Marek Olšák wrote:

> It would mainly be a futile task if it had to compete with their
> official Mesa driver.

Not quite. Would give us all of gallium's capabilities also for
the intel chips, for example having lots of different state trackers.
(coming back to my original intention of cairo as a gallium st)


--mtx

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Moving amdgpu/addrlib into a git submodule

2016-08-09 Thread Enrico Weigelt, metux IT consult
On 09.08.2016 16:59, Nicolai Hähnle wrote:

> So shared linking is right out.

Not exactly. Just everything needs to be linked against the matching
versions. More a dist-layer problem.

addrlibs folks should learn to introduce a proper versioning and
provide MVCC-capable build rules. That really isn't hard.
> Static linking or just including source files from a separate repository
> could be considered, but then what's the process for ensuring you have
> the right version?

pkgconfig ?

> The nice aspect of submodules is that every commit of the Mesa
> repository "knows" what the corresponding right version of addrlib is,
> and so git can update the submodule to the correct version for you
> automatically.

No, it can only checkout the ref'ed commit or anywhere else the user
tells it to. Just jumping to the head does exactly *not* jump to
anything like an correct version. And that's all that git can do
for you automatically.


--mtx

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2] vl/rbsp: add a check for emulation prevention three byte

2016-08-09 Thread Christian König

Am 09.08.2016 um 15:56 schrieb Leo Liu:

This is the case when the "00 00 03" is very close to the beginning of
nal unit header

v2: move the check to rbsp init

Signed-off-by: Leo Liu 
---
  src/gallium/auxiliary/vl/vl_rbsp.h | 11 +++
  1 file changed, 11 insertions(+)

diff --git a/src/gallium/auxiliary/vl/vl_rbsp.h 
b/src/gallium/auxiliary/vl/vl_rbsp.h
index 7867238..c175e23 100644
--- a/src/gallium/auxiliary/vl/vl_rbsp.h
+++ b/src/gallium/auxiliary/vl/vl_rbsp.h
@@ -61,7 +61,18 @@ static inline void vl_rbsp_init(struct vl_rbsp *rbsp, struct 
vl_vlc *nal, unsign
 while (vl_vlc_search_byte(nal, num_bits, 0x00)) {
if (vl_vlc_peekbits(nal, 24) == 0x01 ||
vl_vlc_peekbits(nal, 32) == 0x0001) {
+ unsigned valid;
+ int i;
+
   vl_vlc_limit(>nal, bits_left - vl_vlc_bits_left(nal));
+ valid = vl_vlc_valid_bits(>nal);
+ /* search for the emulation prevention three byte */
+ for (i = 24; i <= valid; i += 8) {
+if ((vl_vlc_peekbits(>nal, i) & 0xff) == 0x3) {
+   vl_vlc_removebits(>nal, i - 8, 8);
+   i += 8;
+}
+ }


Mhm, I think that isn't 100% correct either.

We return inside the loop only when we find the next NAL unit after the 
current one, but if this is the last one that isn't the case.



   return;


I think that just replacing thing return with a break and moving the 
code after the while should do the trick.


Regards,
Christian.


}
vl_vlc_eatbits(nal, 8);




___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Moving amdgpu/addrlib into a git submodule

2016-08-09 Thread Enrico Weigelt, metux IT consult
On 09.08.2016 15:47, Nicolai Hähnle wrote:

> I think the best way forward is to create a dedicated repository for
> addrlib which is then integrated into Mesa as a git submodule.

If you really wanna make a lot of people, especially dist-maintainers
very unhappy ...

> From initial experiments, the impact on users interested in radeon is
> that they will have to run `git submodule init` and then occasionally
> `git submodule update`.

Which requires additional, package specific, logic in build/rule files
of all the distros out there.

> There are alternatives. For example, ROCm uses Google's repo tool
> already. 

Even worse.


--mtx

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Moving amdgpu/addrlib into a git submodule

2016-08-09 Thread Nicolai Hähnle

On 09.08.2016 15:58, Rob Clark wrote:

On Tue, Aug 9, 2016 at 9:47 AM, Nicolai Hähnle  wrote:

Hi everybody,

addrlib is the addressing and alignment calculator which is used by
radeonsi. It's developed (and also used) internally at AMD, and so far we've
had one open source copy living in the Mesa repository at
src/gallium/winsys/amdgpu/drm/addrlib.

The question of using addrlib in non-Mesa parts of our open-source stack has
come up, in particular in relation to compute. We'd obviously like to share
the code rather than having multiple copies flying around. Since the
interface of addrlib is slow-moving but unstable, shared linking is not an
option.

I think the best way forward is to create a dedicated repository for addrlib
which is then integrated into Mesa as a git submodule.

The point of this email is to gather feedback from the Mesa community on
this plan, which is explicitly:

(0) Create an addrlib repository, say amd/addrlib on fd.o.
(1) Add it as a git submodule to the Mesa repository.
(2) Fix up whatever aspects of the build system that may be affected
(perhaps for building source tarballs).
(3) Go back to mostly ignoring addrlib, except for trying to get better at
syncing with the internal closed-source version.

From initial experiments, the impact on users interested in radeon is that
they will have to run `git submodule init` and then occasionally `git
submodule update`. Users who do not build radeonsi should be able to ignore
the change completely.


tbh, git submodules are more annoying than they need to be, and I'm
not really terribly excited to use that for something that is a build
dependency.  Maybe just move it into libdrm instead?


I know. That's what I would have proposed if the addrlib interface were 
stable. Unfortunately it isn't, and realistically speaking, that's not 
going to change.


So shared linking is right out.

Static linking or just including source files from a separate repository 
could be considered, but then what's the process for ensuring you have 
the right version?


The nice aspect of submodules is that every commit of the Mesa 
repository "knows" what the corresponding right version of addrlib is, 
and so git can update the submodule to the correct version for you 
automatically.


Cheers,
Nicolai


BR,
-R


There are alternatives. For example, ROCm uses Google's repo tool already.
But for Mesa, git submodule looks like a lightweight, well supported and
overall conservative option that everybody should already have installed. If
there are good arguments for something else, let's hear them!

Another point: if we proceed with this plan, I think we should consider
moving addrlib into src/amd/addrlib. There are two reasons: First,
transitioning to a submodule *without* changing the directory is probably
more fragile, i.e. what happens when you switch between checkouts before and
after the transition. Second, if/when radv ends up being merged into Mesa
master, it makes sense to have addrlib there anyway.

Thoughts? Complaints? Praise?
Nicolai
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 3/3] gallium: Add c99_compat.h to u_bitcast.h

2016-08-09 Thread Brian Paul

On 08/09/2016 07:46 AM, Brian Paul wrote:

On 08/09/2016 12:30 AM, Mathias Fröhlich wrote:

Hi Brian,

 > I don't know why my local build is failing while appveyor and our

 > in-house automated build seem OK. But applying your patch 3 alone
fixes

 > things for me.

As it fixes something independent, should I push that already?


Sure.  For 1 & 3,
Reviewed-by: Brian Paul 
Tested-by: Brian Paul 




 > Yeah, I applied your whole series and the MSVC build seems OK.
However,

 > I'm hitting a new runtime crash (even after fixing the unrelated issue

 > from Marek's rewrite of the state tracker validation code). It looks

 > like patch 2/3 is the problem. I'll try to dig deeper tomorrow...

Hmm, that part is meant as a nice addition on top, once we are at it.

But, if this patch is a problem then I am probably off by one in the
returned

index. You may 'return index + 1;' instead of just index.


I'll let you know soon...


Yes, we need to add one to the index.  I'll re-test that patch when you 
update it.


-Brian


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/3] gallivm: add lp_build_alloca_undef

2016-08-09 Thread Roland Scheidegger
Am 09.08.2016 um 12:38 schrieb Nicolai Hähnle:
> From: Nicolai Hähnle 
> 
> ---
>  src/gallium/auxiliary/gallivm/lp_bld_flow.c | 19 +++
>  src/gallium/auxiliary/gallivm/lp_bld_flow.h |  5 +
>  2 files changed, 24 insertions(+)
> 
> diff --git a/src/gallium/auxiliary/gallivm/lp_bld_flow.c 
> b/src/gallium/auxiliary/gallivm/lp_bld_flow.c
> index 9183f45..3c3f16c 100644
> --- a/src/gallium/auxiliary/gallivm/lp_bld_flow.c
> +++ b/src/gallium/auxiliary/gallivm/lp_bld_flow.c
> @@ -501,20 +501,39 @@ lp_build_alloca(struct gallivm_state *gallivm,
> res = LLVMBuildAlloca(first_builder, type, name);
> LLVMBuildStore(builder, LLVMConstNull(type), res);
>  
> LLVMDisposeBuilder(first_builder);
>  
> return res;
>  }
>  
>  
>  /**
> + * Like lp_build_alloca_undef, but do not zero-initialize the variable.

Like lp_build_alloca

For the series:
Reviewed-by: Roland Scheidegger 


> + */
> +LLVMValueRef
> +lp_build_alloca_undef(struct gallivm_state *gallivm,
> +  LLVMTypeRef type,
> +  const char *name)
> +{
> +   LLVMBuilderRef first_builder = create_builder_at_entry(gallivm);
> +   LLVMValueRef res;
> +
> +   res = LLVMBuildAlloca(first_builder, type, name);
> +
> +   LLVMDisposeBuilder(first_builder);
> +
> +   return res;
> +}
> +
> +
> +/**
>   * Allocate an array of scalars/vectors.
>   *
>   * mem2reg pass is not capable of promoting structs or arrays to registers, 
> but
>   * we still put it in the first block anyway as failure to put allocas in the
>   * first block may prevent the X86 backend from successfully align the stack 
> as
>   * required.
>   *
>   * Also the scalarrepl pass is supposedly more powerful and can promote
>   * arrays in many cases.
>   *
> diff --git a/src/gallium/auxiliary/gallivm/lp_bld_flow.h 
> b/src/gallium/auxiliary/gallivm/lp_bld_flow.h
> index 083b0ad..674fc18 100644
> --- a/src/gallium/auxiliary/gallivm/lp_bld_flow.h
> +++ b/src/gallium/auxiliary/gallivm/lp_bld_flow.h
> @@ -189,20 +189,25 @@ lp_build_endif(struct lp_build_if_state *ctx);
>  
>  LLVMBasicBlockRef
>  lp_build_insert_new_block(struct gallivm_state *gallivm, const char *name);
>  
>  LLVMValueRef
>  lp_build_alloca(struct gallivm_state *gallivm,
>  LLVMTypeRef type,
>  const char *name);
>  
>  LLVMValueRef
> +lp_build_alloca_undef(struct gallivm_state *gallivm,
> +  LLVMTypeRef type,
> +  const char *name);
> +
> +LLVMValueRef
>  lp_build_array_alloca(struct gallivm_state *gallivm,
>LLVMTypeRef type,
>LLVMValueRef count,
>const char *name);
>  
>  #ifdef __cplusplus
>  }
>  #endif
>  
>  #endif /* !LP_BLD_FLOW_H */
> 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 97261] vaapi u/v wrong order since vl/util: add copy func for yv12image to nv12surface

2016-08-09 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=97261

Andy Furniss  changed:

   What|Removed |Added

 CC||deathsim...@vodafone.de

-- 
You are receiving this mail because:
You are the assignee for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 97214] X not running with error "Failed to make EGL context current"

2016-08-09 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=97214

--- Comment #6 from Alexandr Zelinsky  ---
Created attachment 125640
  --> https://bugs.freedesktop.org/attachment.cgi?id=125640=edit
EGL_LOG_LEVEL=debug

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 97214] X not running with error "Failed to make EGL context current"

2016-08-09 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=97214

--- Comment #5 from Michel Dänzer  ---
Try EGL_LOG_LEVEL=debug as well.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v2] vl/rbsp: add a check for emulation prevention three byte

2016-08-09 Thread Leo Liu
This is the case when the "00 00 03" is very close to the beginning of
nal unit header

v2: move the check to rbsp init

Signed-off-by: Leo Liu 
---
 src/gallium/auxiliary/vl/vl_rbsp.h | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/src/gallium/auxiliary/vl/vl_rbsp.h 
b/src/gallium/auxiliary/vl/vl_rbsp.h
index 7867238..c175e23 100644
--- a/src/gallium/auxiliary/vl/vl_rbsp.h
+++ b/src/gallium/auxiliary/vl/vl_rbsp.h
@@ -61,7 +61,18 @@ static inline void vl_rbsp_init(struct vl_rbsp *rbsp, struct 
vl_vlc *nal, unsign
while (vl_vlc_search_byte(nal, num_bits, 0x00)) {
   if (vl_vlc_peekbits(nal, 24) == 0x01 ||
   vl_vlc_peekbits(nal, 32) == 0x0001) {
+ unsigned valid;
+ int i;
+
  vl_vlc_limit(>nal, bits_left - vl_vlc_bits_left(nal));
+ valid = vl_vlc_valid_bits(>nal);
+ /* search for the emulation prevention three byte */
+ for (i = 24; i <= valid; i += 8) {
+if ((vl_vlc_peekbits(>nal, i) & 0xff) == 0x3) {
+   vl_vlc_removebits(>nal, i - 8, 8);
+   i += 8;
+}
+ }
  return;
   }
   vl_vlc_eatbits(nal, 8);
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Moving amdgpu/addrlib into a git submodule

2016-08-09 Thread Rob Clark
On Tue, Aug 9, 2016 at 9:47 AM, Nicolai Hähnle  wrote:
> Hi everybody,
>
> addrlib is the addressing and alignment calculator which is used by
> radeonsi. It's developed (and also used) internally at AMD, and so far we've
> had one open source copy living in the Mesa repository at
> src/gallium/winsys/amdgpu/drm/addrlib.
>
> The question of using addrlib in non-Mesa parts of our open-source stack has
> come up, in particular in relation to compute. We'd obviously like to share
> the code rather than having multiple copies flying around. Since the
> interface of addrlib is slow-moving but unstable, shared linking is not an
> option.
>
> I think the best way forward is to create a dedicated repository for addrlib
> which is then integrated into Mesa as a git submodule.
>
> The point of this email is to gather feedback from the Mesa community on
> this plan, which is explicitly:
>
> (0) Create an addrlib repository, say amd/addrlib on fd.o.
> (1) Add it as a git submodule to the Mesa repository.
> (2) Fix up whatever aspects of the build system that may be affected
> (perhaps for building source tarballs).
> (3) Go back to mostly ignoring addrlib, except for trying to get better at
> syncing with the internal closed-source version.
>
> From initial experiments, the impact on users interested in radeon is that
> they will have to run `git submodule init` and then occasionally `git
> submodule update`. Users who do not build radeonsi should be able to ignore
> the change completely.

tbh, git submodules are more annoying than they need to be, and I'm
not really terribly excited to use that for something that is a build
dependency.  Maybe just move it into libdrm instead?

BR,
-R

> There are alternatives. For example, ROCm uses Google's repo tool already.
> But for Mesa, git submodule looks like a lightweight, well supported and
> overall conservative option that everybody should already have installed. If
> there are good arguments for something else, let's hear them!
>
> Another point: if we proceed with this plan, I think we should consider
> moving addrlib into src/amd/addrlib. There are two reasons: First,
> transitioning to a submodule *without* changing the directory is probably
> more fragile, i.e. what happens when you switch between checkouts before and
> after the transition. Second, if/when radv ends up being merged into Mesa
> master, it makes sense to have addrlib there anyway.
>
> Thoughts? Complaints? Praise?
> Nicolai
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 97261] vaapi u/v wrong order since vl/util: add copy func for yv12image to nv12surface

2016-08-09 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=97261

Bug ID: 97261
   Summary: vaapi u/v wrong order since vl/util: add copy func for
yv12image to nv12surface
   Product: Mesa
   Version: git
  Hardware: Other
OS: All
Status: NEW
  Severity: normal
  Priority: medium
 Component: Mesa core
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: adf.li...@gmail.com
QA Contact: mesa-dev@lists.freedesktop.org

Created attachment 125638
  --> https://bugs.freedesktop.org/attachment.cgi?id=125638=edit
small test vid

As noted st the time, though Boyuan said he couldn't reproduce, for me

vl/util: add copy func for yv12image to nv12surface

gets u and v for both yv12 and I420 inputs reversed whether encoding or
playing.

Both gstreamer and mpv affected.

Testing playback using attached small test vid that instantly shows the issue
either

VAAPI_DISABLE_INTERLACE=true mpv --vo=vaapi uvtest.mkv

or

gst-launch-1.0 filesrc location=uvtest.mkv ! matroskademux ! avdec_h264 !
vaapisink

Of course any test that outputs nv12 works OK as it avoids the conversion.

It seems that the new util function expects input to be yuv, but it actually
gets yvu.

I sent a patch to the list for this -

https://lists.freedesktop.org/archives/mesa-dev/2016-July/124695.html

Filing bug/test to see if anyone else reproduce.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 97214] X not running with error "Failed to make EGL context current"

2016-08-09 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=97214

--- Comment #4 from Alexandr Zelinsky  ---
Created attachment 125637
  --> https://bugs.freedesktop.org/attachment.cgi?id=125637=edit
xinit log with LIBGL_DEBUG=verbose

seems not very usefull
what your nickname on #dri-devel?

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] Moving amdgpu/addrlib into a git submodule

2016-08-09 Thread Nicolai Hähnle

Hi everybody,

addrlib is the addressing and alignment calculator which is used by 
radeonsi. It's developed (and also used) internally at AMD, and so far 
we've had one open source copy living in the Mesa repository at 
src/gallium/winsys/amdgpu/drm/addrlib.


The question of using addrlib in non-Mesa parts of our open-source stack 
has come up, in particular in relation to compute. We'd obviously like 
to share the code rather than having multiple copies flying around. 
Since the interface of addrlib is slow-moving but unstable, shared 
linking is not an option.


I think the best way forward is to create a dedicated repository for 
addrlib which is then integrated into Mesa as a git submodule.


The point of this email is to gather feedback from the Mesa community on 
this plan, which is explicitly:


(0) Create an addrlib repository, say amd/addrlib on fd.o.
(1) Add it as a git submodule to the Mesa repository.
(2) Fix up whatever aspects of the build system that may be affected 
(perhaps for building source tarballs).
(3) Go back to mostly ignoring addrlib, except for trying to get better 
at syncing with the internal closed-source version.


From initial experiments, the impact on users interested in radeon is 
that they will have to run `git submodule init` and then occasionally 
`git submodule update`. Users who do not build radeonsi should be able 
to ignore the change completely.


There are alternatives. For example, ROCm uses Google's repo tool 
already. But for Mesa, git submodule looks like a lightweight, well 
supported and overall conservative option that everybody should already 
have installed. If there are good arguments for something else, let's 
hear them!


Another point: if we proceed with this plan, I think we should consider 
moving addrlib into src/amd/addrlib. There are two reasons: First, 
transitioning to a submodule *without* changing the directory is 
probably more fragile, i.e. what happens when you switch between 
checkouts before and after the transition. Second, if/when radv ends up 
being merged into Mesa master, it makes sense to have addrlib there anyway.


Thoughts? Complaints? Praise?
Nicolai
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 3/3] gallium: Add c99_compat.h to u_bitcast.h

2016-08-09 Thread Brian Paul

On 08/09/2016 12:30 AM, Mathias Fröhlich wrote:

Hi Brian,

 > I don't know why my local build is failing while appveyor and our

 > in-house automated build seem OK. But applying your patch 3 alone fixes

 > things for me.

As it fixes something independent, should I push that already?


Sure.  For 1 & 3,
Reviewed-by: Brian Paul 
Tested-by: Brian Paul 




 > Yeah, I applied your whole series and the MSVC build seems OK. However,

 > I'm hitting a new runtime crash (even after fixing the unrelated issue

 > from Marek's rewrite of the state tracker validation code). It looks

 > like patch 2/3 is the problem. I'll try to dig deeper tomorrow...

Hmm, that part is meant as a nice addition on top, once we are at it.

But, if this patch is a problem then I am probably off by one in the
returned

index. You may 'return index + 1;' instead of just index.


I'll let you know soon...

-Brian


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] vl/rbsp: add a check for emulation prevention three byte

2016-08-09 Thread Leo Liu



On 08/09/2016 04:26 AM, Christian König wrote:

Am 08.08.2016 um 22:10 schrieb Leo Liu:

This is the case when the "00 00 03" is very close to the beginning of
nal unit header


I see where the problem is, but the fix is incorrect.

You always search for the emulation prevention three byte even when 
the previous fill has done so already. So it could happen in theory 
that you revert the escaping twice, e.g. remove valid bits.


Right. Will send v2.

Thanks,
Leo



You need to add this extra check to the end of vl_rbsp_init() after we 
searched for the end of the NAL unit.


Regards,
Christian.



Signed-off-by: Leo Liu 
---
  src/gallium/auxiliary/vl/vl_rbsp.h | 10 +-
  1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/src/gallium/auxiliary/vl/vl_rbsp.h 
b/src/gallium/auxiliary/vl/vl_rbsp.h

index 7867238..c134d31 100644
--- a/src/gallium/auxiliary/vl/vl_rbsp.h
+++ b/src/gallium/auxiliary/vl/vl_rbsp.h
@@ -77,8 +77,16 @@ static inline void vl_rbsp_fillbits(struct vl_rbsp 
*rbsp)

 unsigned i, bits;
   /* abort if we still have enough bits */
-   if (valid >= 32)
+   if (valid >= 32) {
+  /* search for the emulation prevention three byte */
+  for (i = 24; i <= valid; i += 8) {
+ if ((vl_vlc_peekbits(>nal, i) & 0xff) == 0x3) {
+vl_vlc_removebits(>nal, i - 8, 8);
+i += 8;
+ }
+  }
return;
+   }
   vl_vlc_fillbits(>nal);





___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] Fix Android compilation when swrast is enabled

2016-08-09 Thread Mathieu Maret
Swrast add dependency on libdrm, but libdrm is not defined for host
build (only for the targeted device). So host modules likes mesa_gen_matypes
cannot find there libdrm dependency

Signed-off-by: Mathieu Maret 
---
 Android.common.mk | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/Android.common.mk b/Android.common.mk
index 26d2482..3903ebe 100644
--- a/Android.common.mk
+++ b/Android.common.mk
@@ -100,10 +100,12 @@ LOCAL_CFLAGS += \
 endif
 
 # add libdrm if there are hardware drivers
+ifneq ($(strip $(LOCAL_IS_HOST_MODULE)),true)
 ifneq ($(filter-out swrast,$(MESA_GPU_DRIVERS)),)
 LOCAL_CFLAGS += -DHAVE_LIBDRM
 LOCAL_SHARED_LIBRARIES += libdrm
 endif
+endif
 
 LOCAL_CPPFLAGS += \
$(if $(filter true,$(MESA_LOLLIPOP_BUILD)),-std=c++11) \
-- 
2.9.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] vl: add a lanczos interpolation filter v3

2016-08-09 Thread Christian König
I am more than happy to solve these problems, the Lanczos filtering 
was getting a little stale

anyway because I was not able to reproduce the problems Andy was facing.
Yeah thought so, the reason is probably that you don't have the 
necessary hardware.



Is that why I need to add a PIPE_BIND_LINEAR to a surface?

Yes exactly.

So I need to use maybe a couple of surfaces alternatively to read and 
write with the filters. This approach should work I guess.
Allocate a temporary surface for each step, apply the necessary filters 
using it and then use the temporary buffer as input for the next step.


See how the deinterlacing filter does this, you should use the same 
approach here.


I would use this order for doing things:
1. Median filter for noise reduction.
2. Sharpening/blur filter.
3. Deinterlacing.
4. Compositioning and CC conversion.
5. Advanced scaling.

Regards,
Christian.

Am 08.08.2016 um 16:32 schrieb Nayan Deshmukh:

Hi Christian,

I am more than happy to solve these problems, the Lanczos filtering 
was getting a little stale

anyway because I was not able to reproduce the problems Andy was facing.

On Mon, Aug 8, 2016 at 6:24 PM, Christian König 
> wrote:


Hi Nayan,

ok let's take a step back and put the lanczos filtering aside for
a moment. I have another task for you which is more urgent right now.

The order we do things in vlVdpVideoMixerRender() was never 100%
correct, so we have at least three problems here which needs fixing:

1) The noise reduction and sharpness filter read and write to the
same surface at the same time. That only works because we use a
linear layout.

Is that why I need to add a PIPE_BIND_LINEAR to a surface? So I need 
to use maybe a couple of surfaces alternatively to read and write with 
the filters. This approach should work I guess.


2) We apply the noise reduction and sharpness filter after the
composition. That is rather odd and should be fixed so that we
apply those filters on the original video frame instead.

 So we need to apply the filter before the CSC conversion.

3) We use delayed rendering to render into output surfaces
directly. We should rather use the DRI3 capabilities to allocate
multiple output surfaces instead.

Could you take care of some of those issues? Especially #1 has
become a problem recently.

Surely, I will start working on the first 2 problem for now and look 
at the third problem a little later.


Regards,
Nayan.

Regards,
Christian.


Am 04.08.2016 um 19:22 schrieb Nayan Deshmukh:

Hi Andy,


On Thu, Aug 4, 2016 at 8:48 PM, Andy Furniss > wrote:

Nayan Deshmukh wrote:

Hi Andy,

Thanks for testing my patches.


NP


The scaling happens after CSC.


OK, thanks.


I believe there is some misunderstanding here, I was able
to see the
artifacts in the video that you sent me previously. But I
was not
able to replicate them


Yea, I got that - just thought you may want to see how they
had changed.

with the pendulum video on my system. Same case this time the
pendulum video plays fine this time too. I am attacing a
video of it
here


https://drive.google.com/file/d/0B1s62k5GtdBwOVAtOUVaU0V5c1E/view?usp=sharing




Hmm, that's interesting for a few reasons.

Though my monitor won't do 1366x768 I can replicate the same
scale
factor windowed with mplayer ... -xy 768/576 ...

At first glance only level 2 is obviously artifacted (though
very close
inspection shows others are slightly).

Levels: for some reason your vid has the black bars at 0 but
the content
isn't scaled - like your mplayer didn't expand tv to pc on
playback.
This shouldn't happen by default. Do you have some extra config
somewhere like in $HOME/.mplayer, if so maybe better to test
without.

Most important - though the vp9 compression may be to blame I
can't
really see any difference between the levels in that vid.

In fact closely comparing just your level 1 to my (admittedly
uncompressed) level 1 and 0 output scaled the same plus
unstretched
levels wise it looks to me like you are getting level 0 on
this test.


You are right it I am getting level 0 only. I have a PRIME
configuration
and I forgot to set DRI_PRIME to 1. But for some reason, I am not
able to create
a screen recording when I use my AMD card. So, for now, I can't
reproduce the artifacts
you are having so can't debug them too :(

   

Re: [Mesa-dev] [PATCH 0/8] More state optimizations for st/mesa

2016-08-09 Thread Nicolai Hähnle

I sent comments on patches 4 & 6. Apart from that, the series is

Reviewed-by: Nicolai Hähnle 

On 07.08.2016 03:12, Marek Olšák wrote:

PS: In order to make reviewing easier, all my patches have 10 lines
of contexts instead of 3. That will be the default for all my work
from now on.


I like that idea, I've made the same change to my git config.

Nicolai
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 4/8] st/mesa: determine states used or affected by shaders at compile time

2016-08-09 Thread Nicolai Hähnle

On 07.08.2016 03:12, Marek Olšák wrote:

From: Marek Olšák 

At compile time, each shader determines which ST_NEW flags should be set
at shader bind time.

This just sets the new field for all shaders. The next commit will use it.
---
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 175 -
 src/mesa/state_tracker/st_program.c|  37 +-
 src/mesa/state_tracker/st_program.h|   6 +
 3 files changed, 215 insertions(+), 3 deletions(-)

diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index 362559f..fd14766 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -,31 +,202 @@ get_mesa_program_tgsi(struct gl_context *ctx,

 static struct gl_program *
 get_mesa_program(struct gl_context *ctx,
  struct gl_shader_program *shader_program,
  struct gl_linked_shader *shader)
 {
struct pipe_screen *pscreen = ctx->st->pipe->screen;
unsigned ptarget = st_shader_stage_to_ptarget(shader->Stage);
enum pipe_shader_ir preferred_ir = (enum pipe_shader_ir)
   pscreen->get_shader_param(pscreen, ptarget, 
PIPE_SHADER_CAP_PREFERRED_IR);
+   struct gl_program *prog = NULL;
+
if (preferred_ir == PIPE_SHADER_IR_NIR) {
   /* TODO only for GLSL VS/FS for now: */
   switch (shader->Stage) {
   case MESA_SHADER_VERTEX:
   case MESA_SHADER_FRAGMENT:
- return st_nir_get_mesa_program(ctx, shader_program, shader);
+ prog = st_nir_get_mesa_program(ctx, shader_program, shader);
   default:
  break;
   }
+   } else {
+  prog = get_mesa_program_tgsi(ctx, shader_program, shader);
+   }
+
+   if (prog) {
+  uint64_t *states;
+
+  /* This determines which states will be updated when the shader is
+   * bound.
+   */
+  switch (shader->Stage) {
+  case MESA_SHADER_VERTEX:
+ states = &((struct st_vertex_program*)prog)->affected_states;
+
+ *states = ST_NEW_VS_STATE |
+   ST_NEW_RASTERIZER |
+   ST_NEW_VERTEX_ARRAYS;
+
+ if (prog->Parameters->NumParameters)
+*states |= ST_NEW_VS_CONSTANTS;
+
+ if (shader->num_samplers)
+*states |= ST_NEW_VS_SAMPLER_VIEWS |
+   ST_NEW_RENDER_SAMPLERS;
+
+ if (shader->NumImages)
+*states |= ST_NEW_VS_IMAGES;
+
+ if (shader->NumUniformBlocks)
+*states |= ST_NEW_VS_UBOS;
+
+ if (shader->NumShaderStorageBlocks)
+*states |= ST_NEW_VS_SSBOS;
+
+ if (shader->NumAtomicBuffers)
+*states |= ST_NEW_VS_ATOMICS;


I'm not overly fond of the code duplication here. Perhaps these could 
all be expressed relative to a stage-specific base flag?


Nicolai



+ break;
+
+  case MESA_SHADER_TESS_CTRL:
+ states = &((struct st_tessctrl_program*)prog)->affected_states;
+
+ *states = ST_NEW_TCS_STATE;
+
+ if (prog->Parameters->NumParameters)
+*states |= ST_NEW_TCS_CONSTANTS;
+
+ if (shader->num_samplers)
+*states |= ST_NEW_TCS_SAMPLER_VIEWS |
+   ST_NEW_RENDER_SAMPLERS;
+
+ if (shader->NumImages)
+*states |= ST_NEW_TCS_IMAGES;
+
+ if (shader->NumUniformBlocks)
+*states |= ST_NEW_TCS_UBOS;
+
+ if (shader->NumShaderStorageBlocks)
+*states |= ST_NEW_TCS_SSBOS;
+
+ if (shader->NumAtomicBuffers)
+*states |= ST_NEW_TCS_ATOMICS;
+ break;
+
+  case MESA_SHADER_TESS_EVAL:
+ states = &((struct st_tesseval_program*)prog)->affected_states;
+
+ *states = ST_NEW_TES_STATE |
+   ST_NEW_RASTERIZER;
+
+ if (prog->Parameters->NumParameters)
+*states |= ST_NEW_TES_CONSTANTS;
+
+ if (shader->num_samplers)
+*states |= ST_NEW_TES_SAMPLER_VIEWS |
+   ST_NEW_RENDER_SAMPLERS;
+
+ if (shader->NumImages)
+*states |= ST_NEW_TES_IMAGES;
+
+ if (shader->NumUniformBlocks)
+*states |= ST_NEW_TES_UBOS;
+
+ if (shader->NumShaderStorageBlocks)
+*states |= ST_NEW_TES_SSBOS;
+
+ if (shader->NumAtomicBuffers)
+*states |= ST_NEW_TES_ATOMICS;
+ break;
+
+  case MESA_SHADER_GEOMETRY:
+ states = &((struct st_geometry_program*)prog)->affected_states;
+
+ *states = ST_NEW_GS_STATE |
+   ST_NEW_RASTERIZER;
+
+ if (prog->Parameters->NumParameters)
+*states |= ST_NEW_GS_CONSTANTS;
+
+ if (shader->num_samplers)
+*states |= ST_NEW_GS_SAMPLER_VIEWS |
+   ST_NEW_RENDER_SAMPLERS;
+
+ if (shader->NumImages)
+*states |= ST_NEW_GS_IMAGES;
+
+ if (shader->NumUniformBlocks)
+*states |= ST_NEW_GS_UBOS;
+
+ if 

Re: [Mesa-dev] [PATCH 6/8] st/mesa: _NEW_TEXTURE & CONSTANTS shouldn't flag states that aren't used

2016-08-09 Thread Nicolai Hähnle



On 07.08.2016 03:12, Marek Olšák wrote:

From: Marek Olšák 

---
 src/mesa/state_tracker/st_context.c | 64 -
 src/mesa/state_tracker/st_context.h |  6 
 2 files changed, 62 insertions(+), 8 deletions(-)

diff --git a/src/mesa/state_tracker/st_context.c 
b/src/mesa/state_tracker/st_context.c
index 1ff0355..b9fc9e7 100644
--- a/src/mesa/state_tracker/st_context.c
+++ b/src/mesa/state_tracker/st_context.c
@@ -116,20 +116,65 @@ st_query_memory_info(struct gl_context *ctx, struct 
gl_memory_info *out)

out->total_device_memory = info.total_device_memory;
out->avail_device_memory = info.avail_device_memory;
out->total_staging_memory = info.total_staging_memory;
out->avail_staging_memory = info.avail_staging_memory;
out->device_memory_evicted = info.device_memory_evicted;
out->nr_device_memory_evictions = info.nr_device_memory_evictions;
 }


+uint64_t
+st_get_active_states(struct gl_context *ctx)
+{
+   struct st_vertex_program *vp =
+  st_vertex_program(ctx->VertexProgram._Current);
+   struct st_tessctrl_program *tcp =
+  st_tessctrl_program(ctx->TessCtrlProgram._Current);
+   struct st_tesseval_program *tep =
+  st_tesseval_program(ctx->TessEvalProgram._Current);
+   struct st_geometry_program *gp =
+  st_geometry_program(ctx->GeometryProgram._Current);
+   struct st_fragment_program *fp =
+  st_fragment_program(ctx->FragmentProgram._Current);
+   struct st_compute_program *cp =
+  st_compute_program(ctx->ComputeProgram._Current);
+
+   uint64_t active_shader_states = 0;
+   uint64_t all_shader_resources;
+
+   if (vp)
+  active_shader_states |= vp->affected_states;
+   if (tcp)
+  active_shader_states |= tcp->affected_states;
+   if (tep)
+  active_shader_states |= tep->affected_states;
+   if (gp)
+  active_shader_states |= gp->affected_states;
+   if (fp)
+  active_shader_states |= fp->affected_states;
+   if (cp)
+  active_shader_states |= cp->affected_states;
+
+   all_shader_resources = ST_NEW_SAMPLER_VIEWS |
+  ST_NEW_SAMPLERS |
+  ST_NEW_CONSTANTS |
+  ST_NEW_UNIFORM_BUFFER |
+  ST_NEW_ATOMIC_BUFFER |
+  ST_NEW_STORAGE_BUFFER |
+  ST_NEW_IMAGE_UNITS;


This should probably be a #define in st_atom.h.

Nicolai


+
+   /* Mark non-shader-resource shader states as "always active". */
+   return active_shader_states | ~all_shader_resources;
+}
+
+
 /**
  * Called via ctx->Driver.UpdateState()
  */
 void st_invalidate_state(struct gl_context * ctx, GLbitfield new_state)
 {
struct st_context *st = st_context(ctx);

if (new_state & _NEW_BUFFERS) {
   st->dirty |= ST_NEW_DSA |
ST_NEW_FB_STATE |
@@ -197,41 +242,44 @@ void st_invalidate_state(struct gl_context * ctx, 
GLbitfield new_state)
st_user_clip_planes_enabled(ctx))
   st->dirty |= ST_NEW_CLIP_STATE;

if (new_state & _NEW_COLOR)
   st->dirty |= ST_NEW_BLEND |
ST_NEW_DSA;

if (new_state & _NEW_PIXEL)
   st->dirty |= ST_NEW_PIXEL_TRANSFER;

-   if (new_state & _NEW_TEXTURE)
-  st->dirty |= ST_NEW_SAMPLER_VIEWS |
-   ST_NEW_SAMPLERS |
-   ST_NEW_IMAGE_UNITS;
-
if (new_state & _NEW_CURRENT_ATTRIB)
   st->dirty |= ST_NEW_VERTEX_ARRAYS;

-   if (new_state & _NEW_PROGRAM_CONSTANTS)
-  st->dirty |= ST_NEW_CONSTANTS;
-
/* Update the vertex shader if ctx->Light._ClampVertexColor was changed. */
if (st->clamp_vert_color_in_shader && (new_state & _NEW_LIGHT))
   st->dirty |= ST_NEW_VS_STATE;

/* Which shaders are dirty will be determined manually. */
if (new_state & _NEW_PROGRAM) {
   st->gfx_shaders_may_be_dirty = true;
   st->compute_shader_may_be_dirty = true;
+  /* This will mask out unused shader resources. */
+  st->active_states = st_get_active_states(ctx);
}

+   if (new_state & _NEW_TEXTURE)
+  st->dirty |= st->active_states &
+   (ST_NEW_SAMPLER_VIEWS |
+ST_NEW_SAMPLERS |
+ST_NEW_IMAGE_UNITS);
+
+   if (new_state & _NEW_PROGRAM_CONSTANTS)
+  st->dirty |= st->active_states & ST_NEW_CONSTANTS;
+
/* This is the only core Mesa module we depend upon.
 * No longer use swrast, swsetup, tnl.
 */
_vbo_InvalidateState(ctx, new_state);
 }


 static void
 st_destroy_context_priv(struct st_context *st)
 {
diff --git a/src/mesa/state_tracker/st_context.h 
b/src/mesa/state_tracker/st_context.h
index 556b9c9..f82cf3a 100644
--- a/src/mesa/state_tracker/st_context.h
+++ b/src/mesa/state_tracker/st_context.h
@@ -133,20 +133,23 @@ struct st_context
   GLuint poly_stipple[32];  /**< In OpenGL's bottom-to-top order */

   GLuint fb_orientation;
} state;

char vendor[100];
char renderer[100];

uint64_t dirty; /**< dirty states */

+   

[Mesa-dev] [PATCH 1/3] gallivm: add create_builder_at_entry helper function

2016-08-09 Thread Nicolai Hähnle
From: Nicolai Hähnle 

Reduces code duplication.
---
 src/gallium/auxiliary/gallivm/lp_bld_flow.c | 45 ++---
 1 file changed, 22 insertions(+), 23 deletions(-)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_flow.c 
b/src/gallium/auxiliary/gallivm/lp_bld_flow.c
index f3b3eab..9183f45 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_flow.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_flow.c
@@ -447,20 +447,40 @@ lp_build_endif(struct lp_build_if_state *ifthen)
   /* no else clause */
   LLVMBuildCondBr(builder, ifthen->condition,
   ifthen->true_block, ifthen->merge_block);
}
 
/* Resume building code at end of the ifthen->merge_block */
LLVMPositionBuilderAtEnd(builder, ifthen->merge_block);
 }
 
 
+static LLVMBuilderRef
+create_builder_at_entry(struct gallivm_state *gallivm)
+{
+   LLVMBuilderRef builder = gallivm->builder;
+   LLVMBasicBlockRef current_block = LLVMGetInsertBlock(builder);
+   LLVMValueRef function = LLVMGetBasicBlockParent(current_block);
+   LLVMBasicBlockRef first_block = LLVMGetEntryBasicBlock(function);
+   LLVMValueRef first_instr = LLVMGetFirstInstruction(first_block);
+   LLVMBuilderRef first_builder = LLVMCreateBuilderInContext(gallivm->context);
+
+   if (first_instr) {
+  LLVMPositionBuilderBefore(first_builder, first_instr);
+   } else {
+  LLVMPositionBuilderAtEnd(first_builder, first_block);
+   }
+
+   return first_builder;
+}
+
+
 /**
  * Allocate a scalar (or vector) variable.
  *
  * Although not strictly part of control flow, control flow has deep impact in
  * how variables should be allocated.
  *
  * The mem2reg optimization pass is the recommended way to dealing with mutable
  * variables, and SSA. It looks for allocas and if it can handle them, it
  * promotes them, but only looks for alloca instructions in the entry block of
  * the function. Being in the entry block guarantees that the alloca is only
@@ -468,33 +488,23 @@ lp_build_endif(struct lp_build_if_state *ifthen)
  *
  * See also:
  * - http://www.llvm.org/docs/tutorial/OCamlLangImpl7.html#memory
  */
 LLVMValueRef
 lp_build_alloca(struct gallivm_state *gallivm,
 LLVMTypeRef type,
 const char *name)
 {
LLVMBuilderRef builder = gallivm->builder;
-   LLVMBasicBlockRef current_block = LLVMGetInsertBlock(builder);
-   LLVMValueRef function = LLVMGetBasicBlockParent(current_block);
-   LLVMBasicBlockRef first_block = LLVMGetEntryBasicBlock(function);
-   LLVMValueRef first_instr = LLVMGetFirstInstruction(first_block);
-   LLVMBuilderRef first_builder = LLVMCreateBuilderInContext(gallivm->context);
+   LLVMBuilderRef first_builder = create_builder_at_entry(gallivm);
LLVMValueRef res;
 
-   if (first_instr) {
-  LLVMPositionBuilderBefore(first_builder, first_instr);
-   } else {
-  LLVMPositionBuilderAtEnd(first_builder, first_block);
-   }
-
res = LLVMBuildAlloca(first_builder, type, name);
LLVMBuildStore(builder, LLVMConstNull(type), res);
 
LLVMDisposeBuilder(first_builder);
 
return res;
 }
 
 
 /**
@@ -510,30 +520,19 @@ lp_build_alloca(struct gallivm_state *gallivm,
  *
  * See also:
  * - http://www.llvm.org/docs/tutorial/OCamlLangImpl7.html#memory
  */
 LLVMValueRef
 lp_build_array_alloca(struct gallivm_state *gallivm,
   LLVMTypeRef type,
   LLVMValueRef count,
   const char *name)
 {
-   LLVMBuilderRef builder = gallivm->builder;
-   LLVMBasicBlockRef current_block = LLVMGetInsertBlock(builder);
-   LLVMValueRef function = LLVMGetBasicBlockParent(current_block);
-   LLVMBasicBlockRef first_block = LLVMGetEntryBasicBlock(function);
-   LLVMValueRef first_instr = LLVMGetFirstInstruction(first_block);
-   LLVMBuilderRef first_builder = LLVMCreateBuilderInContext(gallivm->context);
+   LLVMBuilderRef first_builder = create_builder_at_entry(gallivm);
LLVMValueRef res;
 
-   if (first_instr) {
-  LLVMPositionBuilderBefore(first_builder, first_instr);
-   } else {
-  LLVMPositionBuilderAtEnd(first_builder, first_block);
-   }
-
res = LLVMBuildArrayAlloca(first_builder, type, count, name);
 
LLVMDisposeBuilder(first_builder);
 
return res;
 }
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 3/3] gallium/radeon: use lp_build_alloca_undef

2016-08-09 Thread Nicolai Hähnle
From: Nicolai Hähnle 

Avoid building all those store 0 / store undef instrucction pairs that
end up getting removed anyway.
---
 src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c | 15 +++
 1 file changed, 3 insertions(+), 12 deletions(-)

diff --git a/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c 
b/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
index 6a010d5..b419add 100644
--- a/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
+++ b/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
@@ -483,43 +483,34 @@ static LLVMValueRef fetch_system_value(struct 
lp_build_tgsi_context *bld_base,
struct gallivm_state *gallivm = bld_base->base.gallivm;
 
LLVMValueRef cval = ctx->system_values[reg->Register.Index];
if (LLVMGetTypeKind(LLVMTypeOf(cval)) == LLVMVectorTypeKind) {
cval = LLVMBuildExtractElement(gallivm->builder, cval,
   lp_build_const_int32(gallivm, 
swizzle), "");
}
return bitcast(bld_base, type, cval);
 }
 
-static LLVMValueRef si_build_alloca_undef(struct gallivm_state *gallivm,
- LLVMTypeRef type,
- const char *name)
-{
-   LLVMValueRef ptr = lp_build_alloca(gallivm, type, name);
-   LLVMBuildStore(gallivm->builder, LLVMGetUndef(type), ptr);
-   return ptr;
-}
-
 static void emit_declaration(struct lp_build_tgsi_context *bld_base,
 const struct tgsi_full_declaration *decl)
 {
struct radeon_llvm_context *ctx = radeon_llvm_context(bld_base);
LLVMBuilderRef builder = bld_base->base.gallivm->builder;
unsigned first, last, i;
switch(decl->Declaration.File) {
case TGSI_FILE_ADDRESS:
{
 unsigned idx;
for (idx = decl->Range.First; idx <= decl->Range.Last; idx++) {
unsigned chan;
for (chan = 0; chan < TGSI_NUM_CHANNELS; chan++) {
-ctx->soa.addr[idx][chan] = 
si_build_alloca_undef(
+ctx->soa.addr[idx][chan] = 
lp_build_alloca_undef(
>gallivm,
ctx->soa.bld_base.uint_bld.elem_type, 
"");
}
}
break;
}
 
case TGSI_FILE_TEMPORARY:
{
char name[16] = "";
@@ -574,21 +565,21 @@ static void emit_declaration(struct lp_build_tgsi_context 
*bld_base,
ctx->temps_count = 
bld_base->info->file_max[TGSI_FILE_TEMPORARY] + 1;
ctx->temps = MALLOC(TGSI_NUM_CHANNELS * 
ctx->temps_count * sizeof(LLVMValueRef));
}
if (!array_alloca) {
for (i = 0; i < decl_size; ++i) {
 #ifdef DEBUG
snprintf(name, sizeof(name), "TEMP%d.%c",
 first + i / 4, "xyzw"[i % 4]);
 #endif
ctx->temps[first * TGSI_NUM_CHANNELS + i] =
-   
si_build_alloca_undef(bld_base->base.gallivm,
+   
lp_build_alloca_undef(bld_base->base.gallivm,
  
bld_base->base.vec_type,
  name);
}
} else {
LLVMValueRef idxs[2] = {
bld_base->uint_bld.zero,
NULL
};
LLVMValueRef undef = NULL;
unsigned j = 0;
@@ -633,21 +624,21 @@ static void emit_declaration(struct lp_build_tgsi_context 
*bld_base,
}
break;
 
case TGSI_FILE_OUTPUT:
{
unsigned idx;
for (idx = decl->Range.First; idx <= decl->Range.Last; idx++) {
unsigned chan;
assert(idx < RADEON_LLVM_MAX_OUTPUTS);
for (chan = 0; chan < TGSI_NUM_CHANNELS; chan++) {
-   ctx->soa.outputs[idx][chan] = 
si_build_alloca_undef(
+   ctx->soa.outputs[idx][chan] = 
lp_build_alloca_undef(
>gallivm,
ctx->soa.bld_base.base.elem_type, "");
}
}
break;
}
 
case TGSI_FILE_MEMORY:
ctx->declare_memory_region(ctx, decl);
break;
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/3] gallivm: add lp_build_alloca_undef

2016-08-09 Thread Nicolai Hähnle
From: Nicolai Hähnle 

---
 src/gallium/auxiliary/gallivm/lp_bld_flow.c | 19 +++
 src/gallium/auxiliary/gallivm/lp_bld_flow.h |  5 +
 2 files changed, 24 insertions(+)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_flow.c 
b/src/gallium/auxiliary/gallivm/lp_bld_flow.c
index 9183f45..3c3f16c 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_flow.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_flow.c
@@ -501,20 +501,39 @@ lp_build_alloca(struct gallivm_state *gallivm,
res = LLVMBuildAlloca(first_builder, type, name);
LLVMBuildStore(builder, LLVMConstNull(type), res);
 
LLVMDisposeBuilder(first_builder);
 
return res;
 }
 
 
 /**
+ * Like lp_build_alloca_undef, but do not zero-initialize the variable.
+ */
+LLVMValueRef
+lp_build_alloca_undef(struct gallivm_state *gallivm,
+  LLVMTypeRef type,
+  const char *name)
+{
+   LLVMBuilderRef first_builder = create_builder_at_entry(gallivm);
+   LLVMValueRef res;
+
+   res = LLVMBuildAlloca(first_builder, type, name);
+
+   LLVMDisposeBuilder(first_builder);
+
+   return res;
+}
+
+
+/**
  * Allocate an array of scalars/vectors.
  *
  * mem2reg pass is not capable of promoting structs or arrays to registers, but
  * we still put it in the first block anyway as failure to put allocas in the
  * first block may prevent the X86 backend from successfully align the stack as
  * required.
  *
  * Also the scalarrepl pass is supposedly more powerful and can promote
  * arrays in many cases.
  *
diff --git a/src/gallium/auxiliary/gallivm/lp_bld_flow.h 
b/src/gallium/auxiliary/gallivm/lp_bld_flow.h
index 083b0ad..674fc18 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_flow.h
+++ b/src/gallium/auxiliary/gallivm/lp_bld_flow.h
@@ -189,20 +189,25 @@ lp_build_endif(struct lp_build_if_state *ctx);
 
 LLVMBasicBlockRef
 lp_build_insert_new_block(struct gallivm_state *gallivm, const char *name);
 
 LLVMValueRef
 lp_build_alloca(struct gallivm_state *gallivm,
 LLVMTypeRef type,
 const char *name);
 
 LLVMValueRef
+lp_build_alloca_undef(struct gallivm_state *gallivm,
+  LLVMTypeRef type,
+  const char *name);
+
+LLVMValueRef
 lp_build_array_alloca(struct gallivm_state *gallivm,
   LLVMTypeRef type,
   LLVMValueRef count,
   const char *name);
 
 #ifdef __cplusplus
 }
 #endif
 
 #endif /* !LP_BLD_FLOW_H */
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 16/19] gallium/radeon: always do the full store in store_value_to_array

2016-08-09 Thread Nicolai Hähnle
From: Nicolai Hähnle 

Doing the write-back of the temporary vector in radeon_llvm_emit_store makes
no sense.

This also allows us to get rid of get_alloca_for_array.
---
 .../drivers/radeon/radeon_setup_tgsi_llvm.c| 77 --
 1 file changed, 28 insertions(+), 49 deletions(-)

diff --git a/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c 
b/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
index 87fc07e..d8ab5b0 100644
--- a/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
+++ b/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
@@ -149,37 +149,20 @@ get_array_range(struct lp_build_tgsi_context *bld_base,
get_temp_array(bld_base, reg_index, reg);
if (array)
return array->range;
}
 
range.First = 0;
range.Last = bld_base->info->file_max[File];
return range;
 }
 
-static LLVMValueRef get_alloca_for_array(struct lp_build_tgsi_context 
*bld_base,
-unsigned file,
-unsigned index,
-const struct tgsi_ind_register *reg)
-{
-   const struct radeon_llvm_array *array;
-
-   if (file != TGSI_FILE_TEMPORARY)
-   return NULL;
-
-   array = get_temp_array(bld_base, index, reg);
-   if (!array)
-   return NULL;
-
-   return array->alloca;
-}
-
 static LLVMValueRef
 emit_array_index(struct lp_build_tgsi_soa_context *bld,
 const struct tgsi_ind_register *reg,
 unsigned offset)
 {
struct gallivm_state *gallivm = bld->bld_base.base.gallivm;
 
if (!reg) {
return lp_build_const_int32(gallivm, offset);
}
@@ -299,44 +282,67 @@ load_value_from_array(struct lp_build_tgsi_context 
*bld_base,
struct tgsi_declaration_range range =
get_array_range(bld_base, file, reg_index, 
reg_indirect);
LLVMValueRef index =
emit_array_index(bld, reg_indirect, reg_index - 
range.First);
LLVMValueRef array =
emit_array_fetch(bld_base, file, type, range, swizzle);
return LLVMBuildExtractElement(builder, array, index, "");
}
 }
 
-static LLVMValueRef
+static void
 store_value_to_array(struct lp_build_tgsi_context *bld_base,
 LLVMValueRef value,
 unsigned file,
 unsigned chan_index,
 unsigned reg_index,
 const struct tgsi_ind_register *reg_indirect)
 {
struct radeon_llvm_context *ctx = radeon_llvm_context(bld_base);
struct lp_build_tgsi_soa_context *bld = lp_soa_context(bld_base);
struct gallivm_state *gallivm = bld_base->base.gallivm;
LLVMBuilderRef builder = gallivm->builder;
LLVMValueRef ptr;
 
ptr = get_pointer_into_array(ctx, file, chan_index, reg_index, 
reg_indirect);
if (ptr) {
LLVMBuildStore(builder, value, ptr);
-   return NULL;
} else {
+   unsigned i, size;
struct tgsi_declaration_range range = get_array_range(bld_base, 
file, reg_index, reg_indirect);
LLVMValueRef index = emit_array_index(bld, reg_indirect, 
reg_index - range.First);
LLVMValueRef array =
emit_array_fetch(bld_base, file, TGSI_TYPE_FLOAT, 
range, chan_index);
-   return LLVMBuildInsertElement(builder, array, value, index, "");
+   LLVMValueRef temp_ptr;
+
+   array = LLVMBuildInsertElement(builder, array, value, index, 
"");
+
+   size = range.Last - range.First + 1;
+   for (i = 0; i < size; ++i) {
+   switch(file) {
+   case TGSI_FILE_OUTPUT:
+   temp_ptr = bld->outputs[i + 
range.First][chan_index];
+   break;
+
+   case TGSI_FILE_TEMPORARY:
+   if (range.First + i >= ctx->temps_count)
+   continue;
+   temp_ptr = ctx->temps[(i + range.First) * 
TGSI_NUM_CHANNELS + chan_index];
+   break;
+
+   default:
+   continue;
+   }
+   value = LLVMBuildExtractElement(builder, array,
+   lp_build_const_int32(gallivm, i), "");
+   LLVMBuildStore(builder, value, temp_ptr);
+   }
}
 }
 
 LLVMValueRef radeon_llvm_emit_fetch(struct lp_build_tgsi_context *bld_base,
const struct tgsi_full_src_register *reg,
enum tgsi_opcode_type type,
unsigned 

[Mesa-dev] [PATCH 19/19] gallium/radeon: protect against out of bounds temporary array accesses

2016-08-09 Thread Nicolai Hähnle
From: Nicolai Hähnle 

They can lead to VM faults and worse, which goes against the GL robustness
promises.
---
 src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c 
b/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
index e3b04ee..6a010d5 100644
--- a/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
+++ b/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
@@ -220,20 +220,35 @@ get_pointer_into_array(struct radeon_llvm_context *ctx,
LLVMValueRef index;
 
if (file != TGSI_FILE_TEMPORARY)
return NULL;
 
array = get_temp_array(>soa.bld_base, reg_index, reg_indirect);
if (!array || !array->alloca)
return NULL;
 
index = emit_array_index(>soa, reg_indirect, reg_index - 
array->range.First);
+
+   /* Ensure that the index is within a valid range, to guard against
+* VM faults and overwriting critical data (e.g. spilled resource
+* descriptors).
+*
+* TODO It should be possible to avoid the additional instructions
+* if LLVM is changed so that it guarantuees:
+* 1. the scratch space descriptor isolates the current wave (this
+*could even save the scratch offset SGPR at the cost of an
+*additional SALU instruction)
+* 2. the memory for allocas must be allocated at the _end_ of the
+*scratch space (after spilled registers)
+*/
+   index = radeon_llvm_bound_index(ctx, index, array->range.Last - 
array->range.First + 1);
+
index = LLVMBuildMul(
builder, index,
lp_build_const_int32(gallivm, util_bitcount(array->usagemask)),
"");
index = LLVMBuildAdd(
builder, index,
lp_build_const_int32(
gallivm,
util_bitcount(array->usagemask & ((1 << swizzle) - 1))),
"");
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 14/19] gallium/radeon: pass indirect register info into get_alloca_for_array

2016-08-09 Thread Nicolai Hähnle
From: Nicolai Hähnle 

To have the same signature as get_array_range.
---
 src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c 
b/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
index 994c7da..531a8fe 100644
--- a/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
+++ b/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
@@ -151,28 +151,29 @@ get_array_range(struct lp_build_tgsi_context *bld_base,
return array->range;
}
 
range.First = 0;
range.Last = bld_base->info->file_max[File];
return range;
 }
 
 static LLVMValueRef get_alloca_for_array(struct lp_build_tgsi_context 
*bld_base,
 unsigned file,
-unsigned index)
+unsigned index,
+const struct tgsi_ind_register *reg)
 {
const struct radeon_llvm_array *array;
 
if (file != TGSI_FILE_TEMPORARY)
return NULL;
 
-   array = get_temp_array(bld_base, index, NULL);
+   array = get_temp_array(bld_base, index, reg);
if (!array)
return NULL;
 
return array->alloca;
 }
 
 static LLVMValueRef
 emit_array_index(struct lp_build_tgsi_soa_context *bld,
 const struct tgsi_ind_register *reg,
 unsigned offset)
@@ -240,21 +241,21 @@ load_value_from_array(struct lp_build_tgsi_context 
*bld_base,
  enum tgsi_opcode_type type,
  unsigned swizzle,
  unsigned reg_index,
  const struct tgsi_ind_register *reg_indirect)
 {
struct lp_build_tgsi_soa_context *bld = lp_soa_context(bld_base);
struct gallivm_state *gallivm = bld_base->base.gallivm;
LLVMBuilderRef builder = gallivm->builder;
struct tgsi_declaration_range range = get_array_range(bld_base, file, 
reg_index, reg_indirect);
LLVMValueRef index = emit_array_index(bld, reg_indirect, reg_index - 
range.First);
-   LLVMValueRef array = get_alloca_for_array(bld_base, file, reg_index);
+   LLVMValueRef array = get_alloca_for_array(bld_base, file, reg_index, 
reg_indirect);
LLVMValueRef ptr, val, indices[2];
 
if (!array) {
/* Handle the case where the array is stored as a vector. */
return LLVMBuildExtractElement(builder,
emit_array_fetch(bld_base, file, type, range, 
swizzle),
index, "");
}
 
index = LLVMBuildMul(builder, index, lp_build_const_int32(gallivm, 
TGSI_NUM_CHANNELS), "");
@@ -280,21 +281,21 @@ store_value_to_array(struct lp_build_tgsi_context 
*bld_base,
 unsigned file,
 unsigned chan_index,
 unsigned reg_index,
 const struct tgsi_ind_register *reg_indirect)
 {
struct lp_build_tgsi_soa_context *bld = lp_soa_context(bld_base);
struct gallivm_state *gallivm = bld_base->base.gallivm;
LLVMBuilderRef builder = gallivm->builder;
struct tgsi_declaration_range range = get_array_range(bld_base, file, 
reg_index, reg_indirect);
LLVMValueRef index = emit_array_index(bld, reg_indirect, reg_index - 
range.First);
-   LLVMValueRef array = get_alloca_for_array(bld_base, file, reg_index);
+   LLVMValueRef array = get_alloca_for_array(bld_base, file, reg_index, 
reg_indirect);
 
if (array) {
LLVMValueRef indices[2];
index = LLVMBuildMul(builder, index, 
lp_build_const_int32(gallivm, TGSI_NUM_CHANNELS), "");
index = LLVMBuildAdd(builder, index, 
lp_build_const_int32(gallivm, chan_index), "");
indices[0] = bld_base->uint_bld.zero;
indices[1] = index;
LLVMValueRef pointer = LLVMBuildGEP(builder, array, indices, 2, 
"");
LLVMBuildStore(builder, value, pointer);
return NULL;
@@ -617,21 +618,21 @@ void radeon_llvm_emit_store(struct lp_build_tgsi_context 
*bld_base,
 
if (reg->Register.Indirect) {
struct tgsi_declaration_range range = 
get_array_range(bld_base,
reg->Register.File, reg->Register.Index, 
>Indirect);
 
unsigned i, size = range.Last - range.First + 1;
unsigned file = reg->Register.File;
unsigned reg_index = reg->Register.Index;
LLVMValueRef array = store_value_to_array(bld_base, 
value, file, chan_index,
  reg_index, 
>Indirect);
-   if (get_alloca_for_array(bld_base, file, reg_index)) {
+ 

[Mesa-dev] [PATCH 08/19] gallium/radeon: clean up emit_declaration for temporaries

2016-08-09 Thread Nicolai Hähnle
From: Nicolai Hähnle 

In the alloca'd array case, no longer create redundant and unused allocas
for the individual elements; create getelementptrs instead.
---
 .../drivers/radeon/radeon_setup_tgsi_llvm.c| 27 ++
 1 file changed, 18 insertions(+), 9 deletions(-)

diff --git a/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c 
b/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
index d75311e..41f24d3 100644
--- a/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
+++ b/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
@@ -408,81 +408,90 @@ static LLVMValueRef si_build_alloca_undef(struct 
gallivm_state *gallivm,
LLVMValueRef ptr = lp_build_alloca(gallivm, type, name);
LLVMBuildStore(gallivm->builder, LLVMGetUndef(type), ptr);
return ptr;
 }
 
 static void emit_declaration(struct lp_build_tgsi_context *bld_base,
 const struct tgsi_full_declaration *decl)
 {
struct radeon_llvm_context *ctx = radeon_llvm_context(bld_base);
LLVMBuilderRef builder = bld_base->base.gallivm->builder;
-   unsigned first, last, i, idx;
+   unsigned first, last, i;
switch(decl->Declaration.File) {
case TGSI_FILE_ADDRESS:
{
 unsigned idx;
for (idx = decl->Range.First; idx <= decl->Range.Last; idx++) {
unsigned chan;
for (chan = 0; chan < TGSI_NUM_CHANNELS; chan++) {
 ctx->soa.addr[idx][chan] = 
si_build_alloca_undef(
>gallivm,
ctx->soa.bld_base.uint_bld.elem_type, 
"");
}
}
break;
}
 
case TGSI_FILE_TEMPORARY:
{
+   LLVMValueRef array_alloca = NULL;
unsigned decl_size;
first = decl->Range.First;
last = decl->Range.Last;
decl_size = 4 * ((last - first) + 1);
if (decl->Declaration.Array) {
unsigned id = decl->Array.ArrayID - 1;
if (!ctx->arrays) {
int size = 
bld_base->info->array_max[TGSI_FILE_TEMPORARY];
ctx->arrays = CALLOC(size, 
sizeof(ctx->arrays[0]));
-   for (i = 0; i < size; ++i) {
-   assert(!ctx->arrays[i].alloca);}
}
 
ctx->arrays[id].range = decl->Range;
 
/* If the array is more than 16 elements (each element
 * is 32-bits), then store it in a vector.  Storing the
 * array in a vector will causes the compiler to store
 * the array in registers and access it using indirect
 * addressing.  16 is number of vector elements that
 * LLVM will store in a register.
 * FIXME: We shouldn't need to do this.  LLVM should be
 * smart enough to promote allocas int registers when
 * profitable.
 */
if (decl_size > 16) {
-   ctx->arrays[id].alloca = 
LLVMBuildAlloca(builder,
+   array_alloca = LLVMBuildAlloca(builder,
LLVMArrayType(bld_base->base.vec_type, 
decl_size),"array");
+   ctx->arrays[id].alloca = array_alloca;
}
}
-   first = decl->Range.First;
-   last = decl->Range.Last;
+
if (!ctx->temps_count) {
ctx->temps_count = 
bld_base->info->file_max[TGSI_FILE_TEMPORARY] + 1;
ctx->temps = MALLOC(TGSI_NUM_CHANNELS * 
ctx->temps_count * sizeof(LLVMValueRef));
}
-   for (idx = first; idx <= last; idx++) {
-   for (i = 0; i < TGSI_NUM_CHANNELS; i++) {
-   ctx->temps[idx * TGSI_NUM_CHANNELS + i] =
+   if (!array_alloca) {
+   for (i = 0; i < decl_size; ++i) {
+   ctx->temps[first * TGSI_NUM_CHANNELS + i] =

si_build_alloca_undef(bld_base->base.gallivm,
  
bld_base->base.vec_type,
  "temp");
}
+   } else {
+   LLVMValueRef idxs[2] = {
+   bld_base->uint_bld.zero,
+   NULL
+   };
+   for (i = 0; i < decl_size; ++i) {
+   idxs[1] = 

[Mesa-dev] [PATCH 12/19] gallium/radeon: clarify the comment on the array alloca heuristic

2016-08-09 Thread Nicolai Hähnle
From: Nicolai Hähnle 

---
 .../drivers/radeon/radeon_setup_tgsi_llvm.c| 29 ++
 1 file changed, 19 insertions(+), 10 deletions(-)

diff --git a/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c 
b/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
index 22ff18e..e4bfa74 100644
--- a/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
+++ b/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
@@ -437,33 +437,42 @@ static void emit_declaration(struct lp_build_tgsi_context 
*bld_base,
decl_size = 4 * ((last - first) + 1);
if (decl->Declaration.Array) {
unsigned id = decl->Array.ArrayID - 1;
if (!ctx->arrays) {
int size = 
bld_base->info->array_max[TGSI_FILE_TEMPORARY];
ctx->arrays = CALLOC(size, 
sizeof(ctx->arrays[0]));
}
 
ctx->arrays[id].range = decl->Range;
 
-   /* If the array is more than 16 elements (each element
-* is 32-bits), then store it in a vector.  Storing the
-* array in a vector will causes the compiler to store
-* the array in registers and access it using indirect
-* addressing.  16 is number of vector elements that
-* LLVM will store in a register.
-* FIXME: We shouldn't need to do this.  LLVM should be
-* smart enough to promote allocas int registers when
-* profitable.
+   /* If the array has more than 16 elements, store it
+* in memory using an alloca that spans the entire
+* array.
+*
+* Otherwise, store each array element individually.
+* We will then generate vectors (per-channel, up to
+* <4 x float>) for indirect addressing.
+*
+* Note that 16 is the number of vector elements that
+* LLVM will store in a register, so theoretically an
+* array with up to 4 * 16 = 64 elements could be
+* handled this way, but whether that's a good idea
+* depends on VGPR register pressure elsewhere.
+*
+* FIXME: We shouldn't need to have the non-alloca
+* code path for arrays. LLVM should be smart enough to
+* promote allocas into registers when profitable.
 */
if (decl_size > 16) {
array_alloca = LLVMBuildAlloca(builder,
-   LLVMArrayType(bld_base->base.vec_type, 
decl_size),"array");
+   LLVMArrayType(bld_base->base.vec_type,
+ decl_size), "array");
ctx->arrays[id].alloca = array_alloca;
}
}
 
if (!ctx->temps_count) {
ctx->temps_count = 
bld_base->info->file_max[TGSI_FILE_TEMPORARY] + 1;
ctx->temps = MALLOC(TGSI_NUM_CHANNELS * 
ctx->temps_count * sizeof(LLVMValueRef));
}
if (!array_alloca) {
for (i = 0; i < decl_size; ++i) {
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 10/19] gallium/radeon: simplify radeon_llvm_emit_store for direct array addressing

2016-08-09 Thread Nicolai Hähnle
From: Nicolai Hähnle 

We can use the pointer stored in the temps array directly.
---
 src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c | 7 ---
 1 file changed, 7 deletions(-)

diff --git a/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c 
b/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
index e084248..7b96a58 100644
--- a/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
+++ b/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
@@ -624,30 +624,23 @@ void radeon_llvm_emit_store(struct lp_build_tgsi_context 
*bld_base,
} else {
switch(reg->Register.File) {
case TGSI_FILE_OUTPUT:
temp_ptr = 
bld->outputs[reg->Register.Index][chan_index];
if (tgsi_type_is_64bit(dtype))
temp_ptr2 = 
bld->outputs[reg->Register.Index][chan_index + 1];
break;
 
case TGSI_FILE_TEMPORARY:
{
-   LLVMValueRef array;
if (reg->Register.Index >= ctx->temps_count)
continue;
-   array = get_alloca_for_array(bld_base, 
reg->Register.File, reg->Register.Index);
 
-   if (array) {
-   store_value_to_array(bld_base, value, 
reg->Register.File, chan_index, reg->Register.Index,
-   NULL);
-   continue;
-   }
temp_ptr = ctx->temps[ TGSI_NUM_CHANNELS * 
reg->Register.Index + chan_index];
if (tgsi_type_is_64bit(dtype))
temp_ptr2 = ctx->temps[ 
TGSI_NUM_CHANNELS * reg->Register.Index + chan_index + 1];
 
break;
}
default:
return;
}
if (!tgsi_type_is_64bit(dtype))
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 18/19] gallium/radeon: add radeon_llvm_bound_index for bounds checking

2016-08-09 Thread Nicolai Hähnle
From: Nicolai Hähnle 

---
 src/gallium/drivers/radeon/radeon_llvm.h   |  4 +++
 .../drivers/radeon/radeon_setup_tgsi_llvm.c| 29 ++
 src/gallium/drivers/radeonsi/si_shader.c   | 19 +-
 3 files changed, 34 insertions(+), 18 deletions(-)

diff --git a/src/gallium/drivers/radeon/radeon_llvm.h 
b/src/gallium/drivers/radeon/radeon_llvm.h
index 4c946b5..c3efd7d 100644
--- a/src/gallium/drivers/radeon/radeon_llvm.h
+++ b/src/gallium/drivers/radeon/radeon_llvm.h
@@ -112,20 +112,24 @@ struct radeon_llvm_context {
 
struct gallivm_state gallivm;
 };
 
 LLVMTypeRef tgsi2llvmtype(struct lp_build_tgsi_context *bld_base,
  enum tgsi_opcode_type type);
 
 LLVMValueRef bitcast(struct lp_build_tgsi_context *bld_base,
 enum tgsi_opcode_type type, LLVMValueRef value);
 
+LLVMValueRef radeon_llvm_bound_index(struct radeon_llvm_context *ctx,
+LLVMValueRef index,
+unsigned num);
+
 void radeon_llvm_emit_prepare_cube_coords(struct lp_build_tgsi_context 
*bld_base,
  struct lp_build_emit_data *emit_data,
  LLVMValueRef *coords_arg,
  LLVMValueRef *derivs_arg);
 
 void radeon_llvm_context_init(struct radeon_llvm_context *ctx,
   const char *triple);
 
 void radeon_llvm_create_func(struct radeon_llvm_context *ctx,
 LLVMTypeRef *return_types, unsigned 
num_return_elems,
diff --git a/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c 
b/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
index 73e4ce2..e3b04ee 100644
--- a/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
+++ b/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
@@ -66,20 +66,49 @@ LLVMValueRef bitcast(struct lp_build_tgsi_context *bld_base,
 {
LLVMBuilderRef builder = bld_base->base.gallivm->builder;
LLVMTypeRef dst_type = tgsi2llvmtype(bld_base, type);
 
if (dst_type)
return LLVMBuildBitCast(builder, value, dst_type, "");
else
return value;
 }
 
+/**
+ * Return a value that is equal to the given i32 \p index if it lies in [0,num)
+ * or an undefined value in the same interval otherwise.
+ */
+LLVMValueRef radeon_llvm_bound_index(struct radeon_llvm_context *ctx,
+LLVMValueRef index,
+unsigned num)
+{
+   struct gallivm_state *gallivm = >gallivm;
+   LLVMBuilderRef builder = gallivm->builder;
+   LLVMValueRef c_max = lp_build_const_int32(gallivm, num - 1);
+   LLVMValueRef cc;
+
+   if (util_is_power_of_two(num)) {
+   index = LLVMBuildAnd(builder, index, c_max, "");
+   } else {
+   /* In theory, this MAX pattern should result in code that is
+* as good as the bit-wise AND above.
+*
+* In practice, LLVM generates worse code (at the time of
+* writing), because its value tracking is not strong enough.
+*/
+   cc = LLVMBuildICmp(builder, LLVMIntULE, index, c_max, "");
+   index = LLVMBuildSelect(builder, cc, index, c_max, "");
+   }
+
+   return index;
+}
+
 static struct radeon_llvm_loop *get_current_loop(struct radeon_llvm_context 
*ctx)
 {
return ctx->loop_depth > 0 ? ctx->loop + (ctx->loop_depth - 1) : NULL;
 }
 
 static struct radeon_llvm_branch *get_current_branch(struct 
radeon_llvm_context *ctx)
 {
return ctx->branch_depth > 0 ?
ctx->branch + (ctx->branch_depth - 1) : NULL;
 }
diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index 2de20cb..5f02fcd 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -558,47 +558,30 @@ static LLVMValueRef get_indirect_index(struct 
si_shader_context *ctx,
 }
 
 /**
  * Like get_indirect_index, but restricts the return value to a (possibly
  * undefined) value inside [0..num).
  */
 static LLVMValueRef get_bounded_indirect_index(struct si_shader_context *ctx,
   const struct tgsi_ind_register 
*ind,
   int rel_index, unsigned num)
 {
-   struct gallivm_state *gallivm = >radeon_bld.gallivm;
-   LLVMBuilderRef builder = gallivm->builder;
LLVMValueRef result = get_indirect_index(ctx, ind, rel_index);
-   LLVMValueRef c_max = LLVMConstInt(ctx->i32, num - 1, 0);
-   LLVMValueRef cc;
 
/* LLVM 3.8: If indirect resource indexing is used:
 * - SI & CIK hang
 * - VI crashes
 */
if (HAVE_LLVM <= 0x0308)
return LLVMGetUndef(ctx->i32);
 
-   if 

  1   2   >