Re: [Mesa-dev] [PATCH 1/3] vl/dri3: use external texture as back buffers(v4)

2017-01-06 Thread Michel Dänzer
On 06/01/17 05:50 AM, Andy Furniss wrote:
> Christian König wrote:
>> Am 04.01.2017 um 18:13 schrieb Nayan Deshmukh:
>>> dri3 allows us to send handle of a texture directly to X
>>> so this patch allows a state tracker to directly send its
>>> texture to X to be used as back buffer and avoids extra
>>> copying
>>>
>>> v2: use clip width/height to display a portion of the surface
>>> v3: remove redundant variables, fix wrapping, rename variables
>>>  handle vaapi path
>>> v3.1: we need clip_width/height for every frame so we don't need
>>>to maintain it for each buffer instead use a global variable
>>> v4: In case of single gpu we can cache the buffers as applications
>>>  use constant number of buffer and we can avoid calls to present
>>>  extension for every frame
>>>
>>> Suggested-by: Leo Liu 
>>> Signed-off-by: Nayan Deshmukh 
>>
>> Acked-by: Christian König .
>>
>> Andy & Leo, have you guys already had a chance to test it? To me it looks
>> like this should work now.
> 
> Well there is still the tearing issue from losing pageflips.
> 
> Maybe different GPUs don't see this. I can fix by forcing perf but I
> just tested dal and it's not even fixable running that.
> 
> I guess that may not count as an issue with these patches as such if
> xorg/xf86-video-amdgpu can work around, but it's a very noticeable
> regression until that happens.

Somebody should track down why the buffers sent for presentation in this
case don't use the same tiling parameters as buffers used for GL via DRI3.


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 97542] mesa-12.0.1 with llvm-3.9.0_rc3 - src/gallium/state_trackers/clover/llvm/invocation.cpp:212:75: error: no matching function for call to clang::CompilerInvocation::setLangDefault

2017-01-06 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=97542

Michel Dänzer  changed:

   What    |Removed |Added
   ------------------------
   Version |13.0    |12.0

--- Comment #14 from Michel Dänzer  ---
(In reply to Christian from comment #13)
> Same problem, different version: 
> mesa-13.0.2
> llvm: sys-devel/llvm-3.9.1

Actually, it looks like it picks up the headers from an older LLVM version for
you. If you can't figure out why, ask for help on the mesa-dev mailing list,
providing the corresponding config.log file.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


[Mesa-dev] [Bug 94512] X segfaults with glx-tls enabled in a x32 environment

2017-01-06 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=94512

--- Comment #9 from EoD  ---
(In reply to Emil Velikov from comment #8)
> Double-checking the logs - seems like TLS is built/used throughout the board.
> One thing which comes to mind - can you try with --disable-asm. I'm fairly
> sure that the code we have in there doesn't attribute x32.
> 
> Note: I'll be pushing a patch which makes --enable-glx-tls the default in a
> moment, so please keep it disabled locally until we get to the bottom of
> this.

I can confirm that "--enable-glx-tls --disable-asm" works, as does
"--disable-glx-tls --enable-asm".

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


[Mesa-dev] [PATCH 1/2] anv/pipeline: Only call remove_dead_variables once

2017-01-06 Thread Jason Ekstrand
It can handle multiple modes at a time now so there's no reason to call
it repeatedly.
---
 src/intel/vulkan/anv_pipeline.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/intel/vulkan/anv_pipeline.c b/src/intel/vulkan/anv_pipeline.c
index db35d70..fadc76a 100644
--- a/src/intel/vulkan/anv_pipeline.c
+++ b/src/intel/vulkan/anv_pipeline.c
@@ -157,9 +157,9 @@ anv_shader_compile_to_nir(struct anv_device *device,
assert(exec_list_length(>functions) == 1);
entry_point->name = ralloc_strdup(entry_point, "main");
 
-   nir_remove_dead_variables(nir, nir_var_shader_in);
-   nir_remove_dead_variables(nir, nir_var_shader_out);
-   nir_remove_dead_variables(nir, nir_var_system_value);
+   nir_remove_dead_variables(nir, nir_var_shader_in |
+  nir_var_shader_out |
+  nir_var_system_value);
nir_validate_shader(nir);
 
/* Now that we've deleted all but the main function, we can go ahead and
-- 
2.5.0.400.gff86faf
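
The commit message relies on the pass accepting a bitmask of modes, so one call can cover what used to take three. A minimal sketch of that idea follows. It is not NIR code: `enum var_mode`, `struct var` and this `remove_dead_vars` are hypothetical stand-ins, and the real `nir_variable_mode` values may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for NIR's variable modes (assumption: the real
 * nir_variable_mode is also a set of single-bit flags). */
enum var_mode {
   VAR_SHADER_IN    = 1 << 0,
   VAR_SHADER_OUT   = 1 << 1,
   VAR_SYSTEM_VALUE = 1 << 2,
   VAR_LOCAL        = 1 << 3,
};

struct var {
   enum var_mode mode;
   bool used;      /* referenced by at least one instruction */
   bool removed;   /* set once the variable has been deleted */
};

/* Remove every unused variable whose mode is contained in the `modes`
 * bitmask.  Returns true if at least one variable was removed. */
static bool
remove_dead_vars(struct var *vars, size_t count, unsigned modes)
{
   bool progress = false;

   assert(vars != NULL || count == 0);

   for (size_t i = 0; i < count; i++) {
      if ((vars[i].mode & modes) && !vars[i].used && !vars[i].removed) {
         vars[i].removed = true;
         progress = true;
      }
   }

   return progress;
}
```

Calling it once with `VAR_SHADER_IN | VAR_SHADER_OUT | VAR_SYSTEM_VALUE` visits the variable list a single time instead of once per mode.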



[Mesa-dev] [PATCH 2/2] anv/pipeline: Call NIR passes using NIR_PASS_V

2017-01-06 Thread Jason Ekstrand
This lets us get validation without having to do it manually.
---
 src/intel/vulkan/anv_pipeline.c | 46 ++---
 1 file changed, 15 insertions(+), 31 deletions(-)

diff --git a/src/intel/vulkan/anv_pipeline.c b/src/intel/vulkan/anv_pipeline.c
index fadc76a..17491e3 100644
--- a/src/intel/vulkan/anv_pipeline.c
+++ b/src/intel/vulkan/anv_pipeline.c
@@ -131,23 +131,16 @@ anv_shader_compile_to_nir(struct anv_device *device,
 
free(spec_entries);
 
-   if (stage == MESA_SHADER_FRAGMENT) {
-  nir_lower_wpos_center(nir);
-  nir_validate_shader(nir);
-   }
+   if (stage == MESA_SHADER_FRAGMENT)
+  NIR_PASS_V(nir, nir_lower_wpos_center);
 
/* We have to lower away local constant initializers right before we
 * inline functions.  That way they get properly initialized at the top
 * of the function and not at the top of its caller.
 */
-   nir_lower_constant_initializers(nir, nir_var_local);
-   nir_validate_shader(nir);
-
-   nir_lower_returns(nir);
-   nir_validate_shader(nir);
-
-   nir_inline_functions(nir);
-   nir_validate_shader(nir);
+   NIR_PASS_V(nir, nir_lower_constant_initializers, nir_var_local);
+   NIR_PASS_V(nir, nir_lower_returns);
+   NIR_PASS_V(nir, nir_inline_functions);
 
/* Pick off the single entrypoint that we want */
foreach_list_typed_safe(nir_function, func, node, >functions) {
@@ -157,36 +150,27 @@ anv_shader_compile_to_nir(struct anv_device *device,
assert(exec_list_length(>functions) == 1);
entry_point->name = ralloc_strdup(entry_point, "main");
 
-   nir_remove_dead_variables(nir, nir_var_shader_in |
-  nir_var_shader_out |
-  nir_var_system_value);
-   nir_validate_shader(nir);
+   NIR_PASS_V(nir, nir_remove_dead_variables,
+  nir_var_shader_in | nir_var_shader_out | nir_var_system_value);
 
/* Now that we've deleted all but the main function, we can go ahead and
 * lower the rest of the constant initializers.
 */
-   nir_lower_constant_initializers(nir, ~0);
-   nir_validate_shader(nir);
-
-   nir_propagate_invariant(nir);
-   nir_validate_shader(nir);
-
-   nir_lower_io_to_temporaries(entry_point->shader, entry_point->impl,
-   true, false);
-
-   nir_lower_system_values(nir);
-   nir_validate_shader(nir);
+   NIR_PASS_V(nir, nir_lower_constant_initializers, ~0);
+   NIR_PASS_V(nir, nir_propagate_invariant);
+   NIR_PASS_V(nir, nir_lower_io_to_temporaries,
+  entry_point->impl, true, false);
+   NIR_PASS_V(nir, nir_lower_system_values);
 
/* Vulkan uses the separate-shader linking model */
nir->info->separate_shader = true;
 
nir = brw_preprocess_nir(compiler, nir);
 
-   nir_lower_clip_cull_distance_arrays(nir);
-   nir_validate_shader(nir);
+   NIR_PASS_V(nir, nir_lower_clip_cull_distance_arrays);
 
if (stage == MESA_SHADER_FRAGMENT)
-  anv_nir_lower_input_attachments(nir);
+  NIR_PASS_V(nir, anv_nir_lower_input_attachments);
 
nir_shader_gather_info(nir, entry_point->impl);
 
@@ -325,7 +309,7 @@ anv_pipeline_compile(struct anv_pipeline *pipeline,
if (nir == NULL)
   return NULL;
 
-   anv_nir_lower_push_constants(nir);
+   NIR_PASS_V(nir, anv_nir_lower_push_constants);
 
/* Figure out the number of parameters */
prog_data->nr_params = 0;
-- 
2.5.0.400.gff86faf
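
The point of the patch is that a wrapper macro can run a pass and validate the shader in one step. A rough sketch of such a wrapper is shown here under stated assumptions: `PASS_V`, `struct shader`, `validate_shader` and the two toy passes are hypothetical, and the `##__VA_ARGS__` comma handling is a GNU extension accepted by GCC and Clang. The real `NIR_PASS_V` may be defined differently.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy shader: it counts as "valid" while instr_count is non-negative. */
struct shader {
   int instr_count;
   int validations;   /* how many times validate_shader() has run */
};

static void
validate_shader(struct shader *s)
{
   s->validations++;
   assert(s->instr_count >= 0);
}

/* Hypothetical analogue of NIR_PASS_V: run the pass, ignore any result,
 * then validate.  Extra pass arguments are forwarded after the shader. */
#define PASS_V(shader, pass, ...)        \
   do {                                  \
      pass(shader, ##__VA_ARGS__);       \
      validate_shader(shader);           \
   } while (0)

/* Two toy passes, one without and one with an extra argument. */
static void
remove_one_instr(struct shader *s)
{
   if (s->instr_count > 0)
      s->instr_count--;
}

static void
append_instrs(struct shader *s, int n)
{
   s->instr_count += n;
}
```

With this shape, `PASS_V(&s, remove_one_instr);` and `PASS_V(&s, append_instrs, 3);` each validate once, so a forgotten validation call (as with nir_lower_io_to_temporaries in the old code) cannot happen.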



Re: [Mesa-dev] [PATCH] radv: fix depth transitions with layerCount = VK_REMAINING_ARRAY_LAYERS

2017-01-06 Thread Bas Nieuwenhuizen
Thanks, pushed.

- Bas

On Sat, Jan 7, 2017 at 12:08 AM, Pierre-Loup A. Griffais
 wrote:
> Yep, sorry about that and thanks for the review... Please ignore the other
> thread on mesa-dev now that this one is in the right place :(
>
> On 01/06/2017 02:05 PM, Jason Ekstrand wrote:
>>
>> Bah... cc mesa-dev
>>
>> On Fri, Jan 6, 2017 at 2:04 PM, Jason Ekstrand wrote:
>>
>> Reviewed-by: Jason Ekstrand
>>
>> I'll let Dave or Bas push though. :-)
>>
>> On Fri, Jan 6, 2017 at 12:57 PM, Pierre-Loup A. Griffais wrote:
>>
>> Interpreting layerCount literally would try to create billions
>> of image
>> views in radv_process_depth_image_inplace().
>>
>> Signed-off-by: Pierre-Loup A. Griffais
>> >
>> ---
>>  src/amd/vulkan/radv_meta_decompress.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/src/amd/vulkan/radv_meta_decompress.c
>> b/src/amd/vulkan/radv_meta_decompress.c
>> index 47ef64d..9f262e6 100644
>> --- a/src/amd/vulkan/radv_meta_decompress.c
>> +++ b/src/amd/vulkan/radv_meta_decompress.c
>> @@ -382,7 +382,7 @@ static void
>> radv_process_depth_image_inplace(struct radv_cmd_buffer
>> *cmd_buffer,
>>
>>
>> radv_meta_save_graphics_reset_vport_scissor(_state,
>> cmd_buffer);
>>
>> -   for (uint32_t layer = 0; layer <
>> subresourceRange->layerCount; layer++) {
>> +   for (uint32_t layer = 0; layer <
>> radv_get_layerCount(image, subresourceRange); layer++) {
>> struct radv_image_view iview;
>>
>> radv_image_view_init(, cmd_buffer->device,
>> --
>> 2.9.3
>>
>> ___
>> xorg-de...@lists.x.org : X.Org
>> development
>> Archives: http://lists.x.org/archives/xorg-devel
>> 
>> Info: https://lists.x.org/mailman/listinfo/xorg-devel
>> 
>>
>>
>>
>


Re: [Mesa-dev] [PATCH 1/2] isl: Mark A4B4G4R4_UNORM as supported on gen8

2017-01-06 Thread Kenneth Graunke
On Friday, January 6, 2017 2:22:42 PM PST Jason Ekstrand wrote:
> Cc: "13.1" 
> ---
>  src/intel/isl/isl_format.c | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/src/intel/isl/isl_format.c b/src/intel/isl/isl_format.c
> index 98806f4..43c2f4f 100644
> --- a/src/intel/isl/isl_format.c
> +++ b/src/intel/isl/isl_format.c
> @@ -217,7 +217,10 @@ static const struct surface_format_info format_info[] = {
> SF(50, 50,  x,  x,  x,  x,  x,  x,  x,x,   P8A8_UNORM_PALETTE0)
> SF(50, 50,  x,  x,  x,  x,  x,  x,  x,x,   P8A8_UNORM_PALETTE1)
> SF( x,  x,  x,  x,  x,  x,  x,  x,  x,x,   A1B5G5R5_UNORM)
> -   SF(90, 90,  x,  x, 90,  x,  x,  x,  x,x,   A4B4G4R4_UNORM)
> +   /* According to the PRM, A4B4G4R4_UNORM isn't supported until Sky Lake
> +* but empirical testing indicates that it works just fine on Broadwell.
> +*/
> +   SF(80, 80,  x,  x, 80,  x,  x,  x,  x,x,   A4B4G4R4_UNORM)
> SF(90,  x,  x,  x,  x,  x,  x,  x,  x,x,   L8A8_UINT)
> SF(90,  x,  x,  x,  x,  x,  x,  x,  x,x,   L8A8_SINT)
> SF( Y,  Y,  x, 45,  Y,  Y,  Y,  x,  x,x,   R8_UNORM)
> 

both are
Reviewed-by: Kenneth Graunke 




Re: [Mesa-dev] [PATCH 3/3] anv/image: Disable HiZ for depth buffer arrays

2017-01-06 Thread Nanley Chery
On Fri, Jan 06, 2017 at 03:08:12PM -0800, Jason Ekstrand wrote:
> 2017-01-06 14:46 GMT-08:00 Nanley Chery :
> 
> > We currently don't perform clears or resolves on multiple array layers
> > with HiZ.
> >
> 
> Glancing through the code, it looks like you're right.  I'm not even sure
> that you can do layered HiZ clears and/or resolves with the HZ op; you'd
> probably have to do it the gen7 way with blorp.  Thanks for catching this!
> 

No problem! I believe we can perform those HiZ operations on different
layers, but it requires changing
3DSTATE_DEPTH_BUFFER::MinimumArrayElement before each HiZ sequence.
You'll find this note in the following section of the SKL PRM:
Optimized Depth Buffer Clear and/or Stencil Buffer Clear .

> 
> > Cc: mesa-sta...@lists.freedesktop.org
> 
> 
> Do we have hiz in 13.1?  If not, it won't apply and Emil will reject it.
> 

We do. If you run a Vulkan application with INTEL_VK_HIZ=0 preceding the
command, you'll see: 'anv_image.c:190: FINISHME: Implement gen7 HiZ'

> Reviewed-by: Jason Ekstrand 
> 

Thanks for the review!

-Nanley

> 
> >
> > Signed-off-by: Nanley Chery 
> > ---
> >  src/intel/vulkan/anv_image.c | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/src/intel/vulkan/anv_image.c b/src/intel/vulkan/anv_image.c
> > index e60373a151..f262d8a524 100644
> > --- a/src/intel/vulkan/anv_image.c
> > +++ b/src/intel/vulkan/anv_image.c
> > @@ -186,6 +186,8 @@ make_surface(const struct anv_device *dev,
> >   anv_finishme("Implement gen7 HiZ");
> >} else if (vk_info->mipLevels > 1) {
> >   anv_finishme("Test multi-LOD HiZ");
> > +  } else if (vk_info->arrayLayers > 1) {
> > + anv_finishme("Implement multi-arrayLayer HiZ clears and
> > resolves");
> >} else if (dev->info.gen == 8 && vk_info->samples > 1) {
> >   anv_finishme("Test gen8 multisampled HiZ");
> >} else {
> > --
> > 2.11.0
> >


Re: [Mesa-dev] [PATCH] i965: Allow a per gen timebase scale factor

2017-01-06 Thread Jason Ekstrand
While you're at it, would you mind making Vulkan use this as well?  It
should be a 2-line change to GetPhysicalDeviceProperties.

On Fri, Jan 6, 2017 at 1:17 PM, Kenneth Graunke 
wrote:

> From: Robert Bragg 
>
> v2: (Ken) Update timebase_scale for platforms past Skylake/Broxton too.
> ---
>  src/intel/common/gen_device_info.c| 13 ++--
>  src/intel/common/gen_device_info.h| 24 ++
>  src/mesa/drivers/dri/i965/brw_context.c   | 15 +
>  src/mesa/drivers/dri/i965/brw_context.h   |  3 ++
>  src/mesa/drivers/dri/i965/brw_queryobj.c  | 53
> ---
>  src/mesa/drivers/dri/i965/gen6_queryobj.c | 28 +---
>  6 files changed, 109 insertions(+), 27 deletions(-)
>
> diff --git a/src/intel/common/gen_device_info.c
> b/src/intel/common/gen_device_info.c
> index 9bf3cd5cc42..209b293e510 100644
> --- a/src/intel/common/gen_device_info.c
> +++ b/src/intel/common/gen_device_info.c
> @@ -36,6 +36,7 @@ static const struct gen_device_info gen_device_info_i965
> = {
> .urb = {
>.size = 256,
> },
> +   .timebase_scale = 80,
>  };
>
>  static const struct gen_device_info gen_device_info_g4x = {
> @@ -51,6 +52,7 @@ static const struct gen_device_info gen_device_info_g4x
> = {
> .urb = {
>.size = 384,
> },
> +   .timebase_scale = 80,
>  };
>
>  static const struct gen_device_info gen_device_info_ilk = {
> @@ -65,6 +67,7 @@ static const struct gen_device_info gen_device_info_ilk
> = {
> .urb = {
>.size = 1024,
> },
> +   .timebase_scale = 80,
>  };
>
>  static const struct gen_device_info gen_device_info_snb_gt1 = {
> @@ -89,6 +92,7 @@ static const struct gen_device_info
> gen_device_info_snb_gt1 = {
>   [MESA_SHADER_GEOMETRY] = 256,
>},
> },
> +   .timebase_scale = 80,
>  };
>
>  static const struct gen_device_info gen_device_info_snb_gt2 = {
> @@ -113,6 +117,7 @@ static const struct gen_device_info
> gen_device_info_snb_gt2 = {
>   [MESA_SHADER_GEOMETRY] = 256,
>},
> },
> +   .timebase_scale = 80,
>  };
>
>  #define GEN7_FEATURES   \
> @@ -121,7 +126,8 @@ static const struct gen_device_info
> gen_device_info_snb_gt2 = {
> .must_use_separate_stencil = true,   \
> .has_llc = true, \
> .has_pln = true, \
> -   .has_surface_tile_offset = true
> +   .has_surface_tile_offset = true, \
> +   .timebase_scale = 80
>
>  static const struct gen_device_info gen_device_info_ivb_gt1 = {
> GEN7_FEATURES, .is_ivybridge = true, .gt = 1,
> @@ -287,7 +293,8 @@ static const struct gen_device_info
> gen_device_info_hsw_gt3 = {
> .max_tcs_threads = 504,  \
> .max_tes_threads = 504,  \
> .max_gs_threads = 504,   \
> -   .max_wm_threads = 384
> +   .max_wm_threads = 384,   \
> +   .timebase_scale = 80
>
>  static const struct gen_device_info gen_device_info_bdw_gt1 = {
> GEN8_FEATURES, .gt = 1,
> @@ -385,6 +392,7 @@ static const struct gen_device_info
> gen_device_info_chv = {
> .max_tcs_threads = 336,  \
> .max_tes_threads = 336,  \
> .max_cs_threads = 56,\
> +   .timebase_scale = 10.0 / 1200.0, \
> .urb = { \
>.size = 384,  \
>.min_entries = {  \
> @@ -410,6 +418,7 @@ static const struct gen_device_info
> gen_device_info_chv = {
> .max_tes_threads = 112, \
> .max_gs_threads = 112,  \
> .max_cs_threads = 6 * 6,\
> +   .timebase_scale = 10.0 / 19200123.0,\
> .urb = {\
>.size = 192, \
>.min_entries = { \
> diff --git a/src/intel/common/gen_device_info.h
> b/src/intel/common/gen_device_info.h
> index f0e8750d0ea..80676d0e003 100644
> --- a/src/intel/common/gen_device_info.h
> +++ b/src/intel/common/gen_device_info.h
> @@ -147,6 +147,30 @@ struct gen_device_info
> */
>unsigned max_entries[4];
> } urb;
> +
> +   /**
> +* For the longest time the timestamp frequency for Gen's timestamp
> counter
> +* could be assumed to be 12.5MHz, where the least significant bit
> neatly
> +* corresponded to 80 nanoseconds.
> +*
> +* Since Gen9 the numbers aren't so round, with a frequency of 12MHz
> for
> +* SKL (or scale factor of 83.) and a frequency of 19200123Hz
> for
> +* BXT.
> +*
> +* For simplicity to fit with the current code scaling by a single
> constant
> +* to map from raw timestamps to nanoseconds we now do the 
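
The scaling that the quoted comment describes can be sketched as below. This is only an illustration, not the driver's code: the helper names `timebase_scale_from_hz` and `raw_timestamp_to_ns` are hypothetical, and only the 12.5MHz (80ns per tick) and 12MHz figures are taken from the quoted patch.

```c
#include <assert.h>
#include <stdint.h>

/* Nanoseconds per timestamp tick for a counter running at frequency_hz. */
static double
timebase_scale_from_hz(double frequency_hz)
{
   assert(frequency_hz > 0.0);
   return 1000000000.0 / frequency_hz;
}

/* Convert a raw timestamp counter value to nanoseconds by multiplying
 * with a single per-generation constant.  Assumption: going through a
 * double is acceptable here; very large 64-bit counts lose precision. */
static uint64_t
raw_timestamp_to_ns(uint64_t raw_ticks, double timebase_scale)
{
   assert(timebase_scale > 0.0);
   return (uint64_t)((double)raw_ticks * timebase_scale);
}
```

For the 12.5MHz counter the scale is exactly 80, so 1000 ticks map to 80000ns. For a 12MHz counter the scale is a little over 83.33, which is why a per-generation constant is needed at all.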

Re: [Mesa-dev] [PATCH 1/6] nir/dead_variables: Remove shader-local variables that are only written

2017-01-06 Thread Jason Ekstrand
On Wed, Jan 4, 2017 at 3:20 PM, Timothy Arceri  wrote:

> On Mon, 2016-12-12 at 19:39 -0800, Jason Ekstrand wrote:
> > ---
> >  src/compiler/nir/nir_remove_dead_variables.c | 66
> > 
> >  1 file changed, 58 insertions(+), 8 deletions(-)
> >
> > diff --git a/src/compiler/nir/nir_remove_dead_variables.c
> > b/src/compiler/nir/nir_remove_dead_variables.c
> > index f7429eb..d22b7f5 100644
> > --- a/src/compiler/nir/nir_remove_dead_variables.c
> > +++ b/src/compiler/nir/nir_remove_dead_variables.c
> > @@ -31,9 +31,27 @@ static void
> >  add_var_use_intrinsic(nir_intrinsic_instr *instr, struct set *live)
> >  {
> > unsigned num_vars = nir_intrinsic_infos[instr-
> > >intrinsic].num_variables;
> > -   for (unsigned i = 0; i < num_vars; i++) {
> > -  nir_variable *var = instr->variables[i]->var;
> > -  _mesa_set_add(live, var);
> > +
> > +   switch (instr->intrinsic) {
> > +   case nir_intrinsic_copy_var:
> > +  _mesa_set_add(live, instr->variables[1]->var);
> > +  /* Fall through */
> > +   case nir_intrinsic_store_var: {
> > +  /* The first source in both copy_var and store_var is the
> > destination.
> > +   * If the variable is a local that never escapes the shader,
> > then we
> > +   * don't mark it as live for just a store.
> > +   */
> > +  nir_variable_mode mode = instr->variables[0]->var->data.mode;
> > +  if (!(mode & (nir_var_local | nir_var_global |
> > nir_var_shared)))
>
> So you have nir_var_shared here but I think you are missing the bit to
> add:
>
>if (modes & nir_var_shared)
>   progress = remove_dead_vars(>shared, live) || progress;
>

That needs to be its own patch, but sure.


>
> Otherwise won't the var remain while the write instruction is removed?
>
>
> > + _mesa_set_add(live, instr->variables[0]->var);
> > +  break;
> > +   }
> > +
> > +   default:
> > +  for (unsigned i = 0; i < num_vars; i++) {
> > + _mesa_set_add(live, instr->variables[i]->var);
> > +  }
> > +  break;
> > }
> >  }
> >
> > @@ -94,6 +112,31 @@ add_var_use_shader(nir_shader *shader, struct set
> > *live)
> > }
> >  }
> >
> > +static void
> > +remove_dead_var_writes(nir_shader *shader, struct set *live)
> > +{
> > +   nir_foreach_function(function, shader) {
> > +  if (!function->impl)
> > + continue;
> > +
> > +  nir_foreach_block(block, function->impl) {
> > + nir_foreach_instr_safe(instr, block) {
> > +if (instr->type != nir_instr_type_intrinsic)
> > +   continue;
> > +
> > +nir_intrinsic_instr *intrin =
> > nir_instr_as_intrinsic(instr);
> > +if (intrin->intrinsic != nir_intrinsic_copy_var &&
> > +intrin->intrinsic != nir_intrinsic_store_var)
> > +   continue;
> > +
> > +/* Stores to dead variables need to be removed */
> > +if (!_mesa_set_search(live, intrin->variables[0]->var))
> > +   nir_instr_remove(instr);
> > + }
> > +  }
> > +   }
> > +}
> > +
> >  static bool
> >  remove_dead_vars(struct exec_list *var_list, struct set *live)
> >  {
> > @@ -138,12 +181,19 @@ nir_remove_dead_variables(nir_shader *shader,
> > nir_variable_mode modes)
> > if (modes & nir_var_local) {
> >nir_foreach_function(function, shader) {
> >   if (function->impl) {
> > -if (remove_dead_vars(>impl->locals, live)) {
> > -   nir_metadata_preserve(function->impl,
> > nir_metadata_block_index |
> > - nir_metadata_do
> > minance |
> > - nir_metadata_li
> > ve_ssa_defs);
> > +if (remove_dead_vars(>impl->locals, live))
> > progress = true;
> > -}
> > + }
> > +  }
> > +   }
> > +
> > +   if (progress) {
> > +  remove_dead_var_writes(shader, live);
>
> The vars are conditional on modes passed to nir_remove_dead_variables(),
> don't we need this here too? Otherwise we will remove the write
> instruction but not the var.
>

Hrm... Yes.  I'm thinking that we probably want to set the variable mode to
0 when we delete it and use that to decide when to remove a write.  That
way we never run into this problem in the future.


> > +
> > +  nir_foreach_function(function, shader) {
> > + if (function->impl) {
> > +nir_metadata_preserve(function->impl,
> > nir_metadata_block_index |
> > +  nir_metadata_domin
> > ance);
> >   }
> >}
> > }
>
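
The liveness rule discussed in this thread (a store alone does not keep a shader-local variable alive, and stores to dead variables are then deleted) can be modelled with a small toy IR. Everything in this sketch is hypothetical: `struct variable`, `struct instr`, `mark_live` and `remove_dead_writes` are illustrative stand-ins and not the NIR data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum mode { MODE_LOCAL = 1 << 0, MODE_SHADER_OUT = 1 << 1 };
enum op { OP_LOAD, OP_STORE, OP_COPY };

struct variable {
   unsigned mode;
   bool live;
};

struct instr {
   enum op op;
   struct variable *dst;   /* written by STORE and COPY, NULL for LOAD */
   struct variable *src;   /* read by LOAD and COPY, NULL for STORE */
   bool removed;
};

/* Pass 1: a read always makes the variable live.  A write only does so
 * when the variable can be observed outside the shader (not local). */
static void
mark_live(struct instr *instrs, size_t count)
{
   for (size_t i = 0; i < count; i++) {
      struct instr *in = &instrs[i];
      if (in->src)
         in->src->live = true;
      if (in->dst && !(in->dst->mode & MODE_LOCAL))
         in->dst->live = true;
   }
}

/* Pass 2: writes whose destination never became live are removed.
 * Returns the number of instructions removed. */
static size_t
remove_dead_writes(struct instr *instrs, size_t count)
{
   size_t removed = 0;
   for (size_t i = 0; i < count; i++) {
      struct instr *in = &instrs[i];
      if (in->dst && !in->dst->live && !in->removed) {
         in->removed = true;
         removed++;
      }
   }
   return removed;
}
```

The review comments above boil down to keeping these two passes in agreement: a write may only be removed when its variable is removed as well, which is what the suggestion of marking deleted variables with mode 0 is meant to guarantee.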


Re: [Mesa-dev] [PATCH] radv: fix depth transitions with layerCount = VK_REMAINING_ARRAY_LAYERS

2017-01-06 Thread Pierre-Loup A. Griffais
Yep, sorry about that and thanks for the review... Please ignore the 
other thread on mesa-dev now that this one is in the right place :(


On 01/06/2017 02:05 PM, Jason Ekstrand wrote:

Bah... cc mesa-dev

On Fri, Jan 6, 2017 at 2:04 PM, Jason Ekstrand wrote:

Reviewed-by: Jason Ekstrand

I'll let Dave or Bas push though. :-)

On Fri, Jan 6, 2017 at 12:57 PM, Pierre-Loup A. Griffais wrote:

Interpreting layerCount literally would try to create billions
of image
views in radv_process_depth_image_inplace().

Signed-off-by: Pierre-Loup A. Griffais
>
---
 src/amd/vulkan/radv_meta_decompress.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/amd/vulkan/radv_meta_decompress.c
b/src/amd/vulkan/radv_meta_decompress.c
index 47ef64d..9f262e6 100644
--- a/src/amd/vulkan/radv_meta_decompress.c
+++ b/src/amd/vulkan/radv_meta_decompress.c
@@ -382,7 +382,7 @@ static void
radv_process_depth_image_inplace(struct radv_cmd_buffer *cmd_buffer,


radv_meta_save_graphics_reset_vport_scissor(_state,
cmd_buffer);

-   for (uint32_t layer = 0; layer <
subresourceRange->layerCount; layer++) {
+   for (uint32_t layer = 0; layer <
radv_get_layerCount(image, subresourceRange); layer++) {
struct radv_image_view iview;

radv_image_view_init(, cmd_buffer->device,
--
2.9.3
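
Why the literal interpretation goes wrong: VK_REMAINING_ARRAY_LAYERS is the sentinel value ~0U, so using it directly as a loop bound means roughly four billion iterations. A minimal sketch of resolving the sentinel first is shown below. The names `resolve_layer_count` and `struct subresource_range` are hypothetical and do not reproduce the body of radv_get_layerCount(); the sentinel is redefined locally so the sketch builds without vulkan.h.

```c
#include <assert.h>
#include <stdint.h>

/* Same value as Vulkan's VK_REMAINING_ARRAY_LAYERS. */
#define REMAINING_ARRAY_LAYERS (~0U)

/* Hypothetical, trimmed-down version of VkImageSubresourceRange. */
struct subresource_range {
   uint32_t base_array_layer;
   uint32_t layer_count;
};

/* Turn the sentinel into a real count: every layer from
 * base_array_layer up to the end of the image. */
static uint32_t
resolve_layer_count(uint32_t image_array_size,
                    const struct subresource_range *range)
{
   assert(range->base_array_layer < image_array_size);

   if (range->layer_count == REMAINING_ARRAY_LAYERS)
      return image_array_size - range->base_array_layer;

   return range->layer_count;
}
```

For a 6-layer image and a range starting at layer 2 with the sentinel count, the loop bound becomes 4 instead of 4294967295.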



Re: [Mesa-dev] [PATCH 3/3] anv/image: Disable HiZ for depth buffer arrays

2017-01-06 Thread Jason Ekstrand
2017-01-06 14:46 GMT-08:00 Nanley Chery :

> We currently don't perform clears or resolves on multiple array layers
> with HiZ.
>

Glancing through the code, it looks like you're right.  I'm not even sure
that you can do layered HiZ clears and/or resolves with the HZ op; you'd
probably have to do it the gen7 way with blorp.  Thanks for catching this!


> Cc: mesa-sta...@lists.freedesktop.org


Do we have hiz in 13.1?  If not, it won't apply and Emil will reject it.

Reviewed-by: Jason Ekstrand 


>
> Signed-off-by: Nanley Chery 
> ---
>  src/intel/vulkan/anv_image.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/src/intel/vulkan/anv_image.c b/src/intel/vulkan/anv_image.c
> index e60373a151..f262d8a524 100644
> --- a/src/intel/vulkan/anv_image.c
> +++ b/src/intel/vulkan/anv_image.c
> @@ -186,6 +186,8 @@ make_surface(const struct anv_device *dev,
>   anv_finishme("Implement gen7 HiZ");
>} else if (vk_info->mipLevels > 1) {
>   anv_finishme("Test multi-LOD HiZ");
> +  } else if (vk_info->arrayLayers > 1) {
> + anv_finishme("Implement multi-arrayLayer HiZ clears and
> resolves");
>} else if (dev->info.gen == 8 && vk_info->samples > 1) {
>   anv_finishme("Test gen8 multisampled HiZ");
>} else {
> --
> 2.11.0
>


Re: [Mesa-dev] [PATCH 2/3] anv/cmd_buffer: Fix programmed HiZ qpitch

2017-01-06 Thread Jason Ekstrand
Reviewed-by: Jason Ekstrand 

On Fri, Jan 6, 2017 at 2:46 PM, Nanley Chery  wrote:

> Match the comment above the field by using units of pixels and not HiZ
> blocks.
>
> Cc: mesa-sta...@lists.freedesktop.org
> Suggested-by: Jason Ekstrand 
> Signed-off-by: Nanley Chery 
> ---
>  src/intel/vulkan/genX_cmd_buffer.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/intel/vulkan/genX_cmd_buffer.c
> b/src/intel/vulkan/genX_cmd_buffer.c
> index 0d24aeaed6..7a44c449a8 100644
> --- a/src/intel/vulkan/genX_cmd_buffer.c
> +++ b/src/intel/vulkan/genX_cmd_buffer.c
> @@ -2190,7 +2190,7 @@ cmd_buffer_emit_depth_stencil(struct anv_cmd_buffer
> *cmd_buffer)
>* 2-D images.  Prior to Sky Lake, this field is always in rows.
>*/
>   hdb.SurfaceQPitch =
> -isl_surf_get_array_pitch_el_rows(>aux_surface.isl) >>
> 2;
> +isl_surf_get_array_pitch_sa_rows(>aux_surface.isl) >>
> 2;
>  #endif
>}
> } else {
> --
> 2.11.0
>


Re: [Mesa-dev] [PATCH 1/3] anv/cmd_buffer: Fix arrayed depth/stencil attachments

2017-01-06 Thread Jason Ekstrand
Reviewed-by: Jason Ekstrand 

On Fri, Jan 6, 2017 at 2:46 PM, Nanley Chery  wrote:

> Enable multiple layers of the depth/stencil buffers to be accessible.
>
> Fixes the crucible test, func.depthstencil.arrayed_clear.
>
> Cc: mesa-sta...@lists.freedesktop.org
> Signed-off-by: Nanley Chery 
> ---
>  src/intel/vulkan/genX_cmd_buffer.c | 7 +--
>  1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/src/intel/vulkan/genX_cmd_buffer.c
> b/src/intel/vulkan/genX_cmd_buffer.c
> index 9c6349a745..0d24aeaed6 100644
> --- a/src/intel/vulkan/genX_cmd_buffer.c
> +++ b/src/intel/vulkan/genX_cmd_buffer.c
> @@ -2122,14 +2122,17 @@ cmd_buffer_emit_depth_stencil(struct
> anv_cmd_buffer *cmd_buffer)
>   db.Height   = image->extent.height - 1;
>   db.Width= image->extent.width - 1;
>   db.LOD  = iview->isl.base_level;
> - db.Depth= image->array_size - 1; /* FIXME: 3-D */
>   db.MinimumArrayElement  = iview->isl.base_array_layer;
>
> + assert(image->depth_surface.isl.dim != ISL_SURF_DIM_3D);
> + db.Depth =
> + db.RenderTargetViewExtent =
> +iview->isl.array_len - iview->isl.base_array_layer - 1;
> +
>  #if GEN_GEN >= 8
>   db.SurfaceQPitch =
>  isl_surf_get_array_pitch_el_rows(>depth_surface.isl)
> >> 2;
>  #endif
> - db.RenderTargetViewExtent = 1 - 1;
>}
> } else {
>/* Even when no depth buffer is present, the hardware requires that
> --
> 2.11.0
>


Re: [Mesa-dev] [PATCH 1/3] radeonsi: cleanly communicate whether si_shader_dump should check R600_DEBUG

2017-01-06 Thread Marek Olšák
Ping for Rb for the first 2 patches.

Marek

On Tue, Jan 3, 2017 at 8:17 PM, Marek Olšák  wrote:
> From: Marek Olšák 
>
> ---
>  src/gallium/drivers/radeonsi/si_compute.c   |  2 +-
>  src/gallium/drivers/radeonsi/si_debug.c |  2 +-
>  src/gallium/drivers/radeonsi/si_shader.c| 20 +++-
>  src/gallium/drivers/radeonsi/si_shader.h|  2 +-
>  src/gallium/drivers/radeonsi/si_state_shaders.c |  2 +-
>  5 files changed, 15 insertions(+), 13 deletions(-)
>
> diff --git a/src/gallium/drivers/radeonsi/si_compute.c 
> b/src/gallium/drivers/radeonsi/si_compute.c
> index cb14a35..fe29fb1 100644
> --- a/src/gallium/drivers/radeonsi/si_compute.c
> +++ b/src/gallium/drivers/radeonsi/si_compute.c
> @@ -163,21 +163,21 @@ static void *si_create_compute_state(
> radeon_elf_read(code, header->num_bytes, 
> >shader.binary);
> if (program->use_code_object_v2) {
> const amd_kernel_code_t *code_object =
> si_compute_get_code_object(program, 0);
> code_object_to_config(code_object, 
> >shader.config);
> } else {
> si_shader_binary_read_config(>shader.binary,
>  >shader.config, 0);
> }
> si_shader_dump(sctx->screen, >shader, >b.debug,
> -  PIPE_SHADER_COMPUTE, stderr);
> +  PIPE_SHADER_COMPUTE, stderr, true);
> if (si_shader_binary_upload(sctx->screen, >shader) < 
> 0) {
> fprintf(stderr, "LLVM failed to upload shader\n");
> FREE(program);
> return NULL;
> }
> }
>
> return program;
>  }
>
> diff --git a/src/gallium/drivers/radeonsi/si_debug.c 
> b/src/gallium/drivers/radeonsi/si_debug.c
> index 1090dda..a1cd9e5 100644
> --- a/src/gallium/drivers/radeonsi/si_debug.c
> +++ b/src/gallium/drivers/radeonsi/si_debug.c
> @@ -38,21 +38,21 @@ static void si_dump_shader(struct si_screen *sscreen,
>  {
> struct si_shader *current = state->current;
>
> if (!state->cso || !current)
> return;
>
> if (current->shader_log)
> fwrite(current->shader_log, current->shader_log_size, 1, f);
> else
> si_shader_dump(sscreen, state->current, NULL,
> -  state->cso->info.processor, f);
> +  state->cso->info.processor, f, false);
>  }
>
>  /**
>   * Shader compiles can be overridden with arbitrary ELF objects by setting
>   * the environment variable 
> RADEON_REPLACE_SHADERS=num1:filename1[;num2:filename2]
>   */
>  bool si_replace_shader(unsigned num, struct radeon_shader_binary *binary)
>  {
> const char *p = debug_get_option_replace_shaders();
> const char *semicolon;
> diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
> b/src/gallium/drivers/radeonsi/si_shader.c
> index 8dec55c..5dfbd66 100644
> --- a/src/gallium/drivers/radeonsi/si_shader.c
> +++ b/src/gallium/drivers/radeonsi/si_shader.c
> @@ -6162,21 +6162,22 @@ static void si_shader_dump_disassembly(const struct 
> radeon_shader_binary *binary
> binary->code[i + 3], binary->code[i + 2],
> binary->code[i + 1], binary->code[i]);
> }
> }
>  }
>
>  static void si_shader_dump_stats(struct si_screen *sscreen,
>  struct si_shader *shader,
>  struct pipe_debug_callback *debug,
>  unsigned processor,
> -FILE *file)
> +FILE *file,
> +bool check_debug_option)
>  {
> struct si_shader_config *conf = >config;
> unsigned num_inputs = shader->selector ? 
> shader->selector->info.num_inputs : 0;
> unsigned code_size = si_get_shader_binary_size(shader);
> unsigned lds_increment = sscreen->b.chip_class >= CIK ? 512 : 256;
> unsigned lds_per_wave = 0;
> unsigned max_simd_waves = 10;
>
> /* Compute LDS usage for PS. */
> switch (processor) {
> @@ -6213,21 +6214,21 @@ static void si_shader_dump_stats(struct si_screen 
> *sscreen,
> }
>
> if (conf->num_vgprs)
> max_simd_waves = MIN2(max_simd_waves, 256 / conf->num_vgprs);
>
> /* LDS is 64KB per CU (4 SIMDs), which is 16KB per SIMD (usage above
>  * 16KB makes some SIMDs unoccupied). */
> if (lds_per_wave)
> max_simd_waves = MIN2(max_simd_waves, 16384 / lds_per_wave);
>
> -   if (file != stderr ||
> +   if (!check_debug_option ||
> r600_can_dump_shader(&sscreen->b, processor)) {
> if (processor == 

[Mesa-dev] [PATCH 2/3] anv/cmd_buffer: Fix programmed HiZ qpitch

2017-01-06 Thread Nanley Chery
Match the comment above the field by using units of pixels and not HiZ
blocks.

Cc: mesa-sta...@lists.freedesktop.org
Suggested-by: Jason Ekstrand 
Signed-off-by: Nanley Chery 
---
 src/intel/vulkan/genX_cmd_buffer.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/intel/vulkan/genX_cmd_buffer.c 
b/src/intel/vulkan/genX_cmd_buffer.c
index 0d24aeaed6..7a44c449a8 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -2190,7 +2190,7 @@ cmd_buffer_emit_depth_stencil(struct anv_cmd_buffer 
*cmd_buffer)
   * 2-D images.  Prior to Sky Lake, this field is always in rows.
   */
  hdb.SurfaceQPitch =
-isl_surf_get_array_pitch_el_rows(&image->aux_surface.isl) >> 2;
+isl_surf_get_array_pitch_sa_rows(&image->aux_surface.isl) >> 2;
 #endif
   }
} else {
-- 
2.11.0



[Mesa-dev] [PATCH 3/3] anv/image: Disable HiZ for depth buffer arrays

2017-01-06 Thread Nanley Chery
We currently don't perform clears or resolves on multiple array layers
with HiZ.

Cc: mesa-sta...@lists.freedesktop.org
Signed-off-by: Nanley Chery 
---
 src/intel/vulkan/anv_image.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/intel/vulkan/anv_image.c b/src/intel/vulkan/anv_image.c
index e60373a151..f262d8a524 100644
--- a/src/intel/vulkan/anv_image.c
+++ b/src/intel/vulkan/anv_image.c
@@ -186,6 +186,8 @@ make_surface(const struct anv_device *dev,
  anv_finishme("Implement gen7 HiZ");
   } else if (vk_info->mipLevels > 1) {
  anv_finishme("Test multi-LOD HiZ");
+  } else if (vk_info->arrayLayers > 1) {
+ anv_finishme("Implement multi-arrayLayer HiZ clears and resolves");
   } else if (dev->info.gen == 8 && vk_info->samples > 1) {
  anv_finishme("Test gen8 multisampled HiZ");
   } else {
-- 
2.11.0



[Mesa-dev] [PATCH 1/3] anv/cmd_buffer: Fix arrayed depth/stencil attachments

2017-01-06 Thread Nanley Chery
Enable multiple layers of the depth/stencil buffers to be accessible.

Fixes the crucible test, func.depthstencil.arrayed_clear.

Cc: mesa-sta...@lists.freedesktop.org
Signed-off-by: Nanley Chery 
---
 src/intel/vulkan/genX_cmd_buffer.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/src/intel/vulkan/genX_cmd_buffer.c 
b/src/intel/vulkan/genX_cmd_buffer.c
index 9c6349a745..0d24aeaed6 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -2122,14 +2122,17 @@ cmd_buffer_emit_depth_stencil(struct anv_cmd_buffer 
*cmd_buffer)
  db.Height   = image->extent.height - 1;
  db.Width= image->extent.width - 1;
  db.LOD  = iview->isl.base_level;
- db.Depth= image->array_size - 1; /* FIXME: 3-D */
  db.MinimumArrayElement  = iview->isl.base_array_layer;
 
+ assert(image->depth_surface.isl.dim != ISL_SURF_DIM_3D);
+ db.Depth =
+ db.RenderTargetViewExtent =
+iview->isl.array_len - iview->isl.base_array_layer - 1;
+
 #if GEN_GEN >= 8
  db.SurfaceQPitch =
 isl_surf_get_array_pitch_el_rows(&image->depth_surface.isl) >> 2;
 #endif
- db.RenderTargetViewExtent = 1 - 1;
   }
} else {
   /* Even when no depth buffer is present, the hardware requires that
-- 
2.11.0



[Mesa-dev] [PATCH 1/2] isl: Mark A4B4G4R4_UNORM as supported on gen8

2017-01-06 Thread Jason Ekstrand
Cc: "13.1" 
---
 src/intel/isl/isl_format.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/src/intel/isl/isl_format.c b/src/intel/isl/isl_format.c
index 98806f4..43c2f4f 100644
--- a/src/intel/isl/isl_format.c
+++ b/src/intel/isl/isl_format.c
@@ -217,7 +217,10 @@ static const struct surface_format_info format_info[] = {
SF(50, 50,  x,  x,  x,  x,  x,  x,  x,x,   P8A8_UNORM_PALETTE0)
SF(50, 50,  x,  x,  x,  x,  x,  x,  x,x,   P8A8_UNORM_PALETTE1)
SF( x,  x,  x,  x,  x,  x,  x,  x,  x,x,   A1B5G5R5_UNORM)
-   SF(90, 90,  x,  x, 90,  x,  x,  x,  x,x,   A4B4G4R4_UNORM)
+   /* According to the PRM, A4B4G4R4_UNORM isn't supported until Sky Lake
+* but empirical testing indicates that it works just fine on Broadwell.
+*/
+   SF(80, 80,  x,  x, 80,  x,  x,  x,  x,x,   A4B4G4R4_UNORM)
SF(90,  x,  x,  x,  x,  x,  x,  x,  x,x,   L8A8_UINT)
SF(90,  x,  x,  x,  x,  x,  x,  x,  x,x,   L8A8_SINT)
SF( Y,  Y,  x, 45,  Y,  Y,  Y,  x,  x,x,   R8_UNORM)
-- 
2.5.0.400.gff86faf



[Mesa-dev] [Bug 97879] [amdgpu] Rocket League: long hangs (several seconds) when loading assets (models/textures/shaders?)

2017-01-06 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=97879

--- Comment #47 from Marek Olšák  ---
The freezes are not caused by shader compilation. I don't know what's causing
them. They are too long (even 10 seconds). A CPU profiler shows the time is not
spent in the driver. It's spent in RocketLeague. The game is doing something
and I can't see what, because it doesn't come with debug info. Some time is
spent in sched_yield, which suggests that the game is in a loop waiting for
something. This issue might not be related to the GL driver even.

-- 
You are receiving this mail because:
You are the assignee for the bug.


[Mesa-dev] [PATCH 2/2] anv/formats: Use the real format for B4G4R4A4_UNORM_PACK16 on gen8

2017-01-06 Thread Jason Ekstrand
Because border color is handled pre-swizzle, when we move the alpha
channel around in the format, the OPAQUE_BLACK border colors don't work
correctly on B4G4R4A4_UNORM_PACK16 with the hack.  This fixes the
following Vulkan CTS tests on Broadwell:

dEQP-VK.pipeline.sampler.view_type.2d_array.format.b4g4r4a4_unorm_pack16.address_modes.all_mode_clamp_to_border_opaque_black
dEQP-VK.pipeline.sampler.view_type.1d_array.format.b4g4r4a4_unorm_pack16.address_modes.all_mode_clamp_to_border_opaque_black
dEQP-VK.pipeline.sampler.view_type.2d.format.b4g4r4a4_unorm_pack16.address_modes.all_mode_clamp_to_border_opaque_black
dEQP-VK.pipeline.sampler.view_type.1d.format.b4g4r4a4_unorm_pack16.address_modes.all_mode_clamp_to_border_opaque_black
dEQP-VK.pipeline.sampler.view_type.3d.format.b4g4r4a4_unorm_pack16.address_modes.all_mode_clamp_to_border_opaque_black

Cc: "13.1" 
---
 src/intel/vulkan/anv_formats.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/intel/vulkan/anv_formats.c b/src/intel/vulkan/anv_formats.c
index 9ef998c..1ee4d0f 100644
--- a/src/intel/vulkan/anv_formats.c
+++ b/src/intel/vulkan/anv_formats.c
@@ -295,10 +295,10 @@ anv_get_format(const struct gen_device_info *devinfo, 
VkFormat vk_format,
   }
}
 
-   /* The B4G4R4A4 format isn't available prior to Sky Lake so we have to fall
+   /* The B4G4R4A4 format isn't available prior to Broadwell so we have to fall
 * back to a format with a more complex swizzle.
 */
-   if (vk_format == VK_FORMAT_B4G4R4A4_UNORM_PACK16 && devinfo->gen < 9) {
+   if (vk_format == VK_FORMAT_B4G4R4A4_UNORM_PACK16 && devinfo->gen < 8) {
   return (struct anv_format) {
  .isl_format = ISL_FORMAT_B4G4R4A4_UNORM,
  .swizzle = ISL_SWIZZLE(GREEN, RED, ALPHA, BLUE),
-- 
2.5.0.400.gff86faf
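
A minimal sketch of why the swizzle hack breaks OPAQUE_BLACK, assuming (as the commit message says) that the border color is substituted before the swizzle is applied. The enum, struct and helper names here are made up for illustration; they are not ISL or anv API.

```c
/* Channel selectors, loosely modelled on the arguments of ISL_SWIZZLE(). */
enum chan { CHAN_RED, CHAN_GREEN, CHAN_BLUE, CHAN_ALPHA };

struct rgba { float c[4]; };

/* Apply a swizzle: output channel i reads source channel sel[i]. */
static struct rgba
apply_swizzle(struct rgba src, const enum chan sel[4])
{
   struct rgba out;
   for (int i = 0; i < 4; i++)
      out.c[i] = src.c[sel[i]];
   return out;
}

/* Hypothetical helper: the color a shader would see for an opaque-black
 * border if the border color goes through the swizzle like any texel. */
static struct rgba
opaque_black_border_after_swizzle(const enum chan sel[4])
{
   const struct rgba opaque_black = { { 0.0f, 0.0f, 0.0f, 1.0f } };
   return apply_swizzle(opaque_black, sel);
}
```

With the (GREEN, RED, ALPHA, BLUE) swizzle from the fallback path, (0, 0, 0, 1) comes out as (0, 0, 1, 0): alpha reads the source blue channel and is no longer 1. With an identity swizzle, which is what using the real format amounts to, the border stays opaque black.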



Re: [Mesa-dev] [PATCH] radv: fix depth transitions with layerCount = VK_REMAINING_ARRAY_LAYERS

2017-01-06 Thread Jason Ekstrand
Bah... cc mesa-dev

On Fri, Jan 6, 2017 at 2:04 PM, Jason Ekstrand  wrote:

> Reviewed-by: Jason Ekstrand 
>
> I'll let Dave or Bas push though. :-)
>
> On Fri, Jan 6, 2017 at 12:57 PM, Pierre-Loup A. Griffais <
> pgriff...@valvesoftware.com> wrote:
>
>> Interpreting layerCount literally would try to create billions of image
>> views in radv_process_depth_image_inplace().
>>
>> Signed-off-by: Pierre-Loup A. Griffais 
>> ---
>>  src/amd/vulkan/radv_meta_decompress.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/src/amd/vulkan/radv_meta_decompress.c
>> b/src/amd/vulkan/radv_meta_decompress.c
>> index 47ef64d..9f262e6 100644
>> --- a/src/amd/vulkan/radv_meta_decompress.c
>> +++ b/src/amd/vulkan/radv_meta_decompress.c
>> @@ -382,7 +382,7 @@ static void radv_process_depth_image_inplace(struct
>> radv_cmd_buffer *cmd_buffer,
>>
>> radv_meta_save_graphics_reset_vport_scissor(&saved_state,
>> cmd_buffer);
>>
>> -   for (uint32_t layer = 0; layer < subresourceRange->layerCount;
>> layer++) {
>> +   for (uint32_t layer = 0; layer < radv_get_layerCount(image,
>> subresourceRange); layer++) {
>> struct radv_image_view iview;
>>
>> radv_image_view_init(&iview, cmd_buffer->device,
>> --
>> 2.9.3
>>
>> ___
>> xorg-de...@lists.x.org: X.Org development
>> Archives: http://lists.x.org/archives/xorg-devel
>> Info: https://lists.x.org/mailman/listinfo/xorg-devel
>
>
>
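
For readers following the thread, a small sketch of the idea behind the fix: VK_REMAINING_ARRAY_LAYERS is the sentinel value ~0U, so it has to be resolved against the image before it is used as a loop bound. The struct and function below are hypothetical stand-ins, not the actual radv_get_layerCount() implementation.

```c
#include <stdint.h>

/* Same value as VK_REMAINING_ARRAY_LAYERS in the Vulkan headers. */
#define REMAINING_ARRAY_LAYERS (~0U)

/* Hypothetical stand-in for the relevant fields of VkImageSubresourceRange. */
struct subresource_range {
   uint32_t base_array_layer;
   uint32_t layer_count;
};

/* Resolve the sentinel to "every layer from base_array_layer to the end";
 * any other value is taken literally. */
static uint32_t
resolve_layer_count(uint32_t image_array_layers,
                    const struct subresource_range *range)
{
   if (range->layer_count == REMAINING_ARRAY_LAYERS)
      return image_array_layers - range->base_array_layer;
   return range->layer_count;
}
```

Looping up to the raw layer_count would iterate 4294967295 times when the sentinel is passed, which matches the "billions of image views" described in the commit message.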


Re: [Mesa-dev] [PATCH] gallium/radeon: use the internal clear_buffer callback to fix r600g

2017-01-06 Thread Alex Deucher
On Fri, Jan 6, 2017 at 4:26 PM, Marek Olšák  wrote:
> From: Marek Olšák 
>
> r600g doesn't set pipe_context::clear_buffer.

Mention the bug report here:
https://bugs.freedesktop.org/show_bug.cgi?id=99303
With that:
Reviewed-by: Alex Deucher 

> ---
>  src/gallium/drivers/radeon/r600_pipe_common.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/src/gallium/drivers/radeon/r600_pipe_common.c 
> b/src/gallium/drivers/radeon/r600_pipe_common.c
> index 28bb791..5113765 100644
> --- a/src/gallium/drivers/radeon/r600_pipe_common.c
> +++ b/src/gallium/drivers/radeon/r600_pipe_common.c
> @@ -538,21 +538,23 @@ bool r600_check_device_reset(struct r600_common_context 
> *rctx)
>
> rctx->device_reset_callback.reset(rctx->device_reset_callback.data, 
> status);
> return true;
>  }
>
>  static void r600_dma_clear_buffer_fallback(struct pipe_context *ctx,
>struct pipe_resource *dst,
>uint64_t offset, uint64_t size,
>unsigned value)
>  {
> -   ctx->clear_buffer(ctx, dst, offset, size, &value, 4);
> +   struct r600_common_context *rctx = (struct r600_common_context *)ctx;
> +
> +   rctx->clear_buffer(ctx, dst, offset, size, value, 
> R600_COHERENCY_NONE);
>  }
>
>  bool r600_common_context_init(struct r600_common_context *rctx,
>   struct r600_common_screen *rscreen,
>   unsigned context_flags)
>  {
> slab_create_child(&rctx->pool_transfers, &rscreen->pool_transfers);
>
> rctx->screen = rscreen;
> rctx->ws = rscreen->ws;
> --
> 2.7.4
>


Re: [Mesa-dev] [PATCH] i965: Allow a per gen timebase scale factor

2017-01-06 Thread Kenneth Graunke
On Friday, January 6, 2017 1:17:39 PM PST Kenneth Graunke wrote:
> From: Robert Bragg 
> 
> v2: (Ken) Update timebase_scale for platforms past Skylake/Broxton too.

Hi Robert!

Your patch had merge conflicts in gen_device_info.c at this point, so I
fixed those and re-sent it.  It also looked like it didn't set
timebase_scale for KBL, GLK, etc...it should now (via GEN9_FEATURES and
GEN9_LP_FEATURES).

[snip]

> +/* As best we know currently, the Gen HW timestamps are 36bits across
> + * all platforms, which we need to account for when calculating a
> + * delta to measure elapsed time.
> + *
> + * The timestamps read via glGetTimestamp() / brw_get_timestamp() sometimes
> + * only have 32bits due to a kernel bug and so in that case we make sure to
> + * treat all raw timestamps as 32bits so they overflow consistently and 
> remain
> + * comparable.
> + */
> +uint64_t
> +brw_raw_timestamp_delta(struct brw_context *brw, uint64_t time0, uint64_t 
> time1)
> +{
> +   if (brw->screen->hw_has_timestamp == 2) {
> +  /* Kernel clips timestamps to 32bits in this case */
> +  return (uint32_t)time1 - (uint32_t)time0;

Is this right?  intel_detect_timestamp() says

 return 2; /* upper dword holds the low 32bits of the timestamp */

but casting these to uint32_t should take the low DWords...

> +   } else {
> +  if (time0 > time1)
> + return (1ULL << 36) + time1 - time0;
> +  else
> + return time1 - time0;
> +   }
> +}

Otherwise, this looks good to me, and is:
Reviewed-by: Kenneth Graunke 

Could you try and hook up your fixed-point multiplier to fix query
buffer objects as well?  I'd like to land these for the 17.0 release.

Thanks for fixing this!

--Ken




Re: [Mesa-dev] [PATCH] i965: Fix texturing in the vec4 TCS and GS backends.

2017-01-06 Thread Matt Turner
On Fri, Jan 6, 2017 at 10:27 AM, Jason Ekstrand  wrote:
> Sorry I didn't fix vec4 when I fixed fs. :-(
>
> Reciewed-by: Jason Ekstrand 

Ken, make sure to fix the "Reciewed" typo when applying the patch.


[Mesa-dev] [PATCH] gallium/radeon: use the internal clear_buffer callback to fix r600g

2017-01-06 Thread Marek Olšák
From: Marek Olšák 

r600g doesn't set pipe_context::clear_buffer.
---
 src/gallium/drivers/radeon/r600_pipe_common.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeon/r600_pipe_common.c 
b/src/gallium/drivers/radeon/r600_pipe_common.c
index 28bb791..5113765 100644
--- a/src/gallium/drivers/radeon/r600_pipe_common.c
+++ b/src/gallium/drivers/radeon/r600_pipe_common.c
@@ -538,21 +538,23 @@ bool r600_check_device_reset(struct r600_common_context 
*rctx)
 
rctx->device_reset_callback.reset(rctx->device_reset_callback.data, 
status);
return true;
 }
 
 static void r600_dma_clear_buffer_fallback(struct pipe_context *ctx,
   struct pipe_resource *dst,
   uint64_t offset, uint64_t size,
   unsigned value)
 {
-   ctx->clear_buffer(ctx, dst, offset, size, &value, 4);
+   struct r600_common_context *rctx = (struct r600_common_context *)ctx;
+
+   rctx->clear_buffer(ctx, dst, offset, size, value, R600_COHERENCY_NONE);
 }
 
 bool r600_common_context_init(struct r600_common_context *rctx,
  struct r600_common_screen *rscreen,
  unsigned context_flags)
 {
slab_create_child(&rctx->pool_transfers, &rscreen->pool_transfers);
 
rctx->screen = rscreen;
rctx->ws = rscreen->ws;
-- 
2.7.4



Re: [Mesa-dev] [PATCH] i965: Allow a per gen timebase scale factor

2017-01-06 Thread Matt Turner
On Fri, Jan 6, 2017 at 1:17 PM, Kenneth Graunke  wrote:
> From: Robert Bragg 
>
> v2: (Ken) Update timebase_scale for platforms past Skylake/Broxton too.
> ---
>  src/intel/common/gen_device_info.c| 13 ++--
>  src/intel/common/gen_device_info.h| 24 ++
>  src/mesa/drivers/dri/i965/brw_context.c   | 15 +
>  src/mesa/drivers/dri/i965/brw_context.h   |  3 ++
>  src/mesa/drivers/dri/i965/brw_queryobj.c  | 53 
> ---
>  src/mesa/drivers/dri/i965/gen6_queryobj.c | 28 +---
>  6 files changed, 109 insertions(+), 27 deletions(-)
>
> diff --git a/src/intel/common/gen_device_info.c 
> b/src/intel/common/gen_device_info.c
> index 9bf3cd5cc42..209b293e510 100644
> --- a/src/intel/common/gen_device_info.c
> +++ b/src/intel/common/gen_device_info.c
> @@ -36,6 +36,7 @@ static const struct gen_device_info gen_device_info_i965 = {
> .urb = {
>.size = 256,
> },
> +   .timebase_scale = 80,
>  };
>
>  static const struct gen_device_info gen_device_info_g4x = {
> @@ -51,6 +52,7 @@ static const struct gen_device_info gen_device_info_g4x = {
> .urb = {
>.size = 384,
> },
> +   .timebase_scale = 80,
>  };
>
>  static const struct gen_device_info gen_device_info_ilk = {
> @@ -65,6 +67,7 @@ static const struct gen_device_info gen_device_info_ilk = {
> .urb = {
>.size = 1024,
> },
> +   .timebase_scale = 80,
>  };
>
>  static const struct gen_device_info gen_device_info_snb_gt1 = {
> @@ -89,6 +92,7 @@ static const struct gen_device_info gen_device_info_snb_gt1 
> = {
>   [MESA_SHADER_GEOMETRY] = 256,
>},
> },
> +   .timebase_scale = 80,
>  };
>
>  static const struct gen_device_info gen_device_info_snb_gt2 = {
> @@ -113,6 +117,7 @@ static const struct gen_device_info 
> gen_device_info_snb_gt2 = {
>   [MESA_SHADER_GEOMETRY] = 256,
>},
> },
> +   .timebase_scale = 80,
>  };
>
>  #define GEN7_FEATURES   \
> @@ -121,7 +126,8 @@ static const struct gen_device_info 
> gen_device_info_snb_gt2 = {
> .must_use_separate_stencil = true,   \
> .has_llc = true, \
> .has_pln = true, \
> -   .has_surface_tile_offset = true
> +   .has_surface_tile_offset = true, \
> +   .timebase_scale = 80

Trailing comma

>
>  static const struct gen_device_info gen_device_info_ivb_gt1 = {
> GEN7_FEATURES, .is_ivybridge = true, .gt = 1,
> @@ -287,7 +293,8 @@ static const struct gen_device_info 
> gen_device_info_hsw_gt3 = {
> .max_tcs_threads = 504,  \
> .max_tes_threads = 504,  \
> .max_gs_threads = 504,   \
> -   .max_wm_threads = 384
> +   .max_wm_threads = 384,   \
> +   .timebase_scale = 80

Trailing comma

>
>  static const struct gen_device_info gen_device_info_bdw_gt1 = {
> GEN8_FEATURES, .gt = 1,
> @@ -385,6 +392,7 @@ static const struct gen_device_info gen_device_info_chv = 
> {
> .max_tcs_threads = 336,  \
> .max_tes_threads = 336,  \
> .max_cs_threads = 56,\
> +   .timebase_scale = 1000000000.0 / 12000000.0, \
> .urb = { \
>.size = 384,  \
>.min_entries = {  \
> @@ -410,6 +418,7 @@ static const struct gen_device_info gen_device_info_chv = 
> {
> .max_tes_threads = 112, \
> .max_gs_threads = 112,  \
> .max_cs_threads = 6 * 6,\
> +   .timebase_scale = 1000000000.0 / 19200123.0,\
> .urb = {\
>.size = 192, \
>.min_entries = { \
> diff --git a/src/intel/common/gen_device_info.h 
> b/src/intel/common/gen_device_info.h
> index f0e8750d0ea..80676d0e003 100644
> --- a/src/intel/common/gen_device_info.h
> +++ b/src/intel/common/gen_device_info.h
> @@ -147,6 +147,30 @@ struct gen_device_info
> */
>unsigned max_entries[4];
> } urb;
> +
> +   /**
> +* For the longest time the timestamp frequency for Gen's timestamp 
> counter
> +* could be assumed to be 12.5MHz, where the least significant bit neatly
> +* corresponded to 80 nanoseconds.
> +*
> +* Since Gen9 the numbers aren't so round, with a frequency of 12MHz for
> +* SKL (or scale factor of 83.33) and a frequency of 19200123Hz for
> +* BXT.
> +*
> +* For simplicity to fit with the current code scaling by a single constant
> +* to map from raw timestamps to nanoseconds we now do the conversion in
> +* floating point instead of integer arithmetic.
> +*
> +* In general it's 

[Mesa-dev] [PATCH] i965: Allow a per gen timebase scale factor

2017-01-06 Thread Kenneth Graunke
From: Robert Bragg 

v2: (Ken) Update timebase_scale for platforms past Skylake/Broxton too.
---
 src/intel/common/gen_device_info.c| 13 ++--
 src/intel/common/gen_device_info.h| 24 ++
 src/mesa/drivers/dri/i965/brw_context.c   | 15 +
 src/mesa/drivers/dri/i965/brw_context.h   |  3 ++
 src/mesa/drivers/dri/i965/brw_queryobj.c  | 53 ---
 src/mesa/drivers/dri/i965/gen6_queryobj.c | 28 +---
 6 files changed, 109 insertions(+), 27 deletions(-)

diff --git a/src/intel/common/gen_device_info.c 
b/src/intel/common/gen_device_info.c
index 9bf3cd5cc42..209b293e510 100644
--- a/src/intel/common/gen_device_info.c
+++ b/src/intel/common/gen_device_info.c
@@ -36,6 +36,7 @@ static const struct gen_device_info gen_device_info_i965 = {
.urb = {
   .size = 256,
},
+   .timebase_scale = 80,
 };
 
 static const struct gen_device_info gen_device_info_g4x = {
@@ -51,6 +52,7 @@ static const struct gen_device_info gen_device_info_g4x = {
.urb = {
   .size = 384,
},
+   .timebase_scale = 80,
 };
 
 static const struct gen_device_info gen_device_info_ilk = {
@@ -65,6 +67,7 @@ static const struct gen_device_info gen_device_info_ilk = {
.urb = {
   .size = 1024,
},
+   .timebase_scale = 80,
 };
 
 static const struct gen_device_info gen_device_info_snb_gt1 = {
@@ -89,6 +92,7 @@ static const struct gen_device_info gen_device_info_snb_gt1 = 
{
  [MESA_SHADER_GEOMETRY] = 256,
   },
},
+   .timebase_scale = 80,
 };
 
 static const struct gen_device_info gen_device_info_snb_gt2 = {
@@ -113,6 +117,7 @@ static const struct gen_device_info gen_device_info_snb_gt2 
= {
  [MESA_SHADER_GEOMETRY] = 256,
   },
},
+   .timebase_scale = 80,
 };
 
 #define GEN7_FEATURES   \
@@ -121,7 +126,8 @@ static const struct gen_device_info gen_device_info_snb_gt2 
= {
.must_use_separate_stencil = true,   \
.has_llc = true, \
.has_pln = true, \
-   .has_surface_tile_offset = true
+   .has_surface_tile_offset = true, \
+   .timebase_scale = 80
 
 static const struct gen_device_info gen_device_info_ivb_gt1 = {
GEN7_FEATURES, .is_ivybridge = true, .gt = 1,
@@ -287,7 +293,8 @@ static const struct gen_device_info gen_device_info_hsw_gt3 
= {
.max_tcs_threads = 504,  \
.max_tes_threads = 504,  \
.max_gs_threads = 504,   \
-   .max_wm_threads = 384
+   .max_wm_threads = 384,   \
+   .timebase_scale = 80
 
 static const struct gen_device_info gen_device_info_bdw_gt1 = {
GEN8_FEATURES, .gt = 1,
@@ -385,6 +392,7 @@ static const struct gen_device_info gen_device_info_chv = {
.max_tcs_threads = 336,  \
.max_tes_threads = 336,  \
.max_cs_threads = 56,\
+   .timebase_scale = 1000000000.0 / 12000000.0, \
.urb = { \
   .size = 384,  \
   .min_entries = {  \
@@ -410,6 +418,7 @@ static const struct gen_device_info gen_device_info_chv = {
.max_tes_threads = 112, \
.max_gs_threads = 112,  \
.max_cs_threads = 6 * 6,\
+   .timebase_scale = 1000000000.0 / 19200123.0,\
.urb = {\
   .size = 192, \
   .min_entries = { \
diff --git a/src/intel/common/gen_device_info.h 
b/src/intel/common/gen_device_info.h
index f0e8750d0ea..80676d0e003 100644
--- a/src/intel/common/gen_device_info.h
+++ b/src/intel/common/gen_device_info.h
@@ -147,6 +147,30 @@ struct gen_device_info
*/
   unsigned max_entries[4];
} urb;
+
+   /**
+* For the longest time the timestamp frequency for Gen's timestamp counter
+* could be assumed to be 12.5MHz, where the least significant bit neatly
+* corresponded to 80 nanoseconds.
+*
+* Since Gen9 the numbers aren't so round, with a frequency of 12MHz for
+* SKL (or scale factor of 83.33) and a frequency of 19200123Hz for
+* BXT.
+*
+* For simplicity to fit with the current code scaling by a single constant
+* to map from raw timestamps to nanoseconds we now do the conversion in
+* floating point instead of integer arithmetic.
+*
+* In general it's probably worth noting that the documented constants we
+* have for the per-platform timestamp frequencies aren't perfect and
+* shouldn't be trusted for scaling and comparing timestamps with a large
+* delta.
+*
+* E.g. with crude testing on my system using the 'correct' scale factor I'm
+* seeing a drift of ~2 milliseconds 
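
The relationship between counter frequency and the per-tick scale factor described in the comment can be sketched as follows. This is only an illustration of the arithmetic, with hypothetical helper names; it is not the code from the patch.

```c
#include <stdint.h>

/* Nanoseconds per counter tick for a counter running at frequency_hz. */
static double
timebase_scale_from_hz(double frequency_hz)
{
   return 1000000000.0 / frequency_hz;
}

/* Convert a raw tick count to nanoseconds with a floating-point scale.
 * A 36-bit tick count fits comfortably in a double's 53-bit mantissa. */
static uint64_t
ticks_to_ns(uint64_t ticks, double timebase_scale)
{
   return (uint64_t)((double)ticks * timebase_scale);
}
```

At 12.5MHz the scale is exactly 80, which is why integer arithmetic was enough before Gen9. At 12MHz it is about 83.33 and at 19200123Hz about 52.08, neither of which an integer multiplier can represent.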

Re: [Mesa-dev] [PATCH] radv: fix depth transitions with layerCount = VK_REMAINING_ARRAY_LAYERS

2017-01-06 Thread Kai Wasserbäch
Hey Pierre,
this looks like it went to the wrong list. radv patches should be sent to
 AFAIK (CCed with this message).

Cheers,
Kai


Pierre-Loup A. Griffais wrote on 06.01.2017 21:57:
> Interpreting layerCount literally would try to create billions of image
> views in radv_process_depth_image_inplace().
> 
> Signed-off-by: Pierre-Loup A. Griffais 
> ---
>  src/amd/vulkan/radv_meta_decompress.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/src/amd/vulkan/radv_meta_decompress.c 
> b/src/amd/vulkan/radv_meta_decompress.c
> index 47ef64d..9f262e6 100644
> --- a/src/amd/vulkan/radv_meta_decompress.c
> +++ b/src/amd/vulkan/radv_meta_decompress.c
> @@ -382,7 +382,7 @@ static void radv_process_depth_image_inplace(struct 
> radv_cmd_buffer *cmd_buffer,
>  
>   radv_meta_save_graphics_reset_vport_scissor(&saved_state, cmd_buffer);
>  
> - for (uint32_t layer = 0; layer < subresourceRange->layerCount; layer++) 
> {
> + for (uint32_t layer = 0; layer < radv_get_layerCount(image, 
> subresourceRange); layer++) {
>   struct radv_image_view iview;
>  
>   radv_image_view_init(&iview, cmd_buffer->device,
> 





[Mesa-dev] [Bug 97542] mesa-12.0.1 with llvm-3.9.0_rc3 - src/gallium/state_trackers/clover/llvm/invocation.cpp:212:75: error: no matching function for call to clang::CompilerInvocation::setLangDefault

2017-01-06 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=97542

Christian  changed:

   What|Removed |Added

Version|12.0|13.0

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


[Mesa-dev] [Bug 97542] mesa-12.0.1 with llvm-3.9.0_rc3 - src/gallium/state_trackers/clover/llvm/invocation.cpp:212:75: error: no matching function for call to clang::CompilerInvocation::setLangDefault

2017-01-06 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=97542

--- Comment #13 from Christian  ---
Same problem, different version: 
mesa-13.0.2
llvm: sys-devel/llvm-3.9.1
sys-devel/clang-3.9.1-r100

Error:
libtool: compile:  x86_64-pc-linux-gnu-g++ -m32 -DPACKAGE_NAME=\"Mesa\"
-DPACKAGE_TARNAME=\"mesa\" -DPACKAGE_VERSION=\"13.0.2\"
"-DPACKAGE_STRING=\"Mesa 13.0.2\""
"-DPACKAGE_BUGREPORT=\"https://bugs.freedesktop.org/enter_bug.cgi?product=Mesa\";
-DPACKAGE_URL=\"\" -DPACKAGE=\"mesa\" -DVERSION=\"13.0.2\"
-D_FILE_OFFSET_BITS=64 -DSTDC_HEADERS=1 -DHAVE_SYS_TYPES_H=1
-DHAVE_SYS_STAT_H=1 -DHAVE_STDLIB_H=1 -DHAVE_STRING_H=1 -DHAVE_MEMORY_H=1
-DHAVE_STRINGS_H=1 -DHAVE_INTTYPES_H=1 -DHAVE_STDINT_H=1 -DHAVE_UNISTD_H=1
-DHAVE_DLFCN_H=1 -DLT_OBJDIR=\".libs/\" -DYYTEXT_POINTER=1
-DHAVE___BUILTIN_BSWAP32=1 -DHAVE___BUILTIN_BSWAP64=1 -DHAVE___BUILTIN_CLZ=1
-DHAVE___BUILTIN_CLZLL=1 -DHAVE___BUILTIN_CTZ=1 -DHAVE___BUILTIN_EXPECT=1
-DHAVE___BUILTIN_FFS=1 -DHAVE___BUILTIN_FFSLL=1 -DHAVE___BUILTIN_POPCOUNT=1
-DHAVE___BUILTIN_POPCOUNTLL=1 -DHAVE___BUILTIN_UNREACHABLE=1
-DHAVE_FUNC_ATTRIBUTE_CONST=1 -DHAVE_FUNC_ATTRIBUTE_FLATTEN=1
-DHAVE_FUNC_ATTRIBUTE_FORMAT=1 -DHAVE_FUNC_ATTRIBUTE_MALLOC=1
-DHAVE_FUNC_ATTRIBUTE_PACKED=1 -DHAVE_FUNC_ATTRIBUTE_PURE=1
-DHAVE_FUNC_ATTRIBUTE_RETURNS_NONNULL=1 -DHAVE_FUNC_ATTRIBUTE_UNUSED=1
-DHAVE_FUNC_ATTRIBUTE_VISIBILITY=1 -DHAVE_FUNC_ATTRIBUTE_WARN_UNUSED_RESULT=1
-DHAVE_FUNC_ATTRIBUTE_WEAK=1 -DMAJOR_IN_SYSMACROS=1 -DHAVE_DLADDR=1
-DHAVE_CLOCK_GETTIME=1 -DHAVE_PTHREAD=1 -I.
-I/var/tmp/portage/media-libs/mesa-13.0.2/work/mesa-13.0.2/src/gallium/state_trackers/clover
-I/var/tmp/portage/media-libs/mesa-13.0.2/work/mesa-13.0.2/include
-I/var/tmp/portage/media-libs/mesa-13.0.2/work/mesa-13.0.2/src
-I/var/tmp/portage/media-libs/mesa-13.0.2/work/mesa-13.0.2/src/gallium/include
-I/var/tmp/portage/media-libs/mesa-13.0.2/work/mesa-13.0.2/src/gallium/drivers
-I/var/tmp/portage/media-libs/mesa-13.0.2/work/mesa-13.0.2/src/gallium/auxiliary
-I/var/tmp/portage/media-libs/mesa-13.0.2/work/mesa-13.0.2/src/gallium/winsys
-I../../../../src
-I/var/tmp/portage/media-libs/mesa-13.0.2/work/mesa-13.0.2/src/gallium/state_trackers/clover
-std=c++11 -fvisibility=hidden -I/usr/include -std=c++11
-D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS
-D__STDC_LIMIT_MACROS -D__STDC_CONSTANT_MACROS -D_GNU_SOURCE -DUSE_SSE41
-DUSE_GCC_ATOMIC_BUILTINS -DNDEBUG -DTEXTURE_FLOAT_ENABLED -DUSE_X86_ASM
-DUSE_MMX_ASM -DUSE_3DNOW_ASM -DUSE_SSE_ASM -DHAVE_XLOCALE_H
-DHAVE_SYS_SYSCTL_H -DHAVE_STRTOF -DHAVE_MKOSTEMP -DHAVE_DLOPEN
-DHAVE_POSIX_MEMALIGN -DHAVE_LIBDRM -DHAVE_SHA1 -DGLX_USE_DRM
-DGLX_INDIRECT_RENDERING -DGLX_DIRECT_RENDERING -DGLX_USE_TLS -DHAVE_ALIAS
-DHAVE_DRI3 -DHAVE_MINCORE -DHAVE_LLVM=0x0309 -DMESA_LLVM_VERSION_PATCH=1
-DLIBCLC_INCLUDEDIR=\"/usr/include/\" -DLIBCLC_LIBEXECDIR=\"/usr/lib/clc/\"
-DCLANG_RESOURCE_DIR=\"/usr/lib/clang/3.9.1\" -mtune=k8 -O2 -pipe
-ffat-lto-objects -Wall -fno-math-errno -fno-trapping-math -c
/var/tmp/portage/media-libs/mesa-13.0.2/work/mesa-13.0.2/src/gallium/state_trackers/clover/llvm/invocation.cpp
 -fPIC -DPIC -o llvm/.libs/libclllvm_la-invocation.o
In file included from
/var/tmp/portage/media-libs/mesa-13.0.2/work/mesa-13.0.2/src/gallium/state_trackers/clover/llvm/metadata.hpp:31:0,
 from
/var/tmp/portage/media-libs/mesa-13.0.2/work/mesa-13.0.2/src/gallium/state_trackers/clover/llvm/codegen/bitcode.cpp:35:
/var/tmp/portage/media-libs/mesa-13.0.2/work/mesa-13.0.2/src/gallium/state_trackers/clover/llvm/compat.hpp:
In function 'void
clover::llvm::compat::set_lang_defaults(clang::CompilerInvocation&,
clang::LangOptions&, clang::InputKind, const llvm::Triple&,
clang::PreprocessorOptions&, clang::LangStandard::Kind)':
/var/tmp/portage/media-libs/mesa-13.0.2/work/mesa-13.0.2/src/gallium/state_trackers/clover/llvm/compat.hpp:72:58:
error: no matching function for call to
'clang::CompilerInvocation::setLangDefaults(clang::LangOptions&,
clang::InputKind&, const llvm::Triple&, clang::PreprocessorOptions&,
clang::LangStandard::Kind&)'
 inv.setLangDefaults(lopts, ik, t, ppopts, std);
  ^
In file included from
/usr/local/include/clang/Frontend/CompilerInstance.h:17:0,
 from
/var/tmp/portage/media-libs/mesa-13.0.2/work/mesa-13.0.2/src/gallium/state_trackers/clover/llvm/codegen.hpp:37,
 from
/var/tmp/portage/media-libs/mesa-13.0.2/work/mesa-13.0.2/src/gallium/state_trackers/clover/llvm/codegen/bitcode.cpp:34:
/usr/local/include/clang/Frontend/CompilerInvocation.h:157:15: note: candidate:
static void clang::CompilerInvocation::setLangDefaults(clang::LangOptions&,
clang::InputKind, clang::LangStandard::Kind)
   static void setLangDefaults(LangOptions &Opts, InputKind IK,
   ^
/usr/local/include/clang/Frontend/CompilerInvocation.h:157:15: note:  
candidate expects 3 arguments, 5 provided
In file included from

[Mesa-dev] [Bug 99305] account creation request

2017-01-06 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=99305

--- Comment #1 from George Kyriazis  ---
Created attachment 128799
  --> https://bugs.freedesktop.org/attachment.cgi?id=128799&action=edit
ssh public key

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


[Mesa-dev] [Bug 99305] account creation request

2017-01-06 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=99305

Bug ID: 99305
   Summary: account creation request
   Product: Mesa
   Version: unspecified
  Hardware: Other
OS: All
Status: NEW
  Severity: normal
  Priority: medium
 Component: Other
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: george.kyria...@intel.com
QA Contact: mesa-dev@lists.freedesktop.org

Created attachment 128798
  --> https://bugs.freedesktop.org/attachment.cgi?id=128798&action=edit
pgp key

Working on the OpenSWR driver @ Intel.

Would like to get write access to mesa repo to check in changes.

Thank you!

George

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


[Mesa-dev] [PATCH 5/7] st/nine: Process pending commands on Reset

2017-01-06 Thread Axel Davy
Some nine_state_* and nine_context_* functions
used for Reset() require that all pending commands
be flushed.

Signed-off-by: Axel Davy 
---
 src/gallium/state_trackers/nine/device9.c| 1 +
 src/gallium/state_trackers/nine/device9ex.c  | 1 +
 src/gallium/state_trackers/nine/nine_state.c | 3 +++
 3 files changed, 5 insertions(+)

diff --git a/src/gallium/state_trackers/nine/device9.c 
b/src/gallium/state_trackers/nine/device9.c
index 6f2e5e9962..03564203df 100644
--- a/src/gallium/state_trackers/nine/device9.c
+++ b/src/gallium/state_trackers/nine/device9.c
@@ -919,6 +919,7 @@ NineDevice9_Reset( struct NineDevice9 *This,
 break;
 }
 
+nine_csmt_process(This);
 nine_state_clear(&This->state, TRUE);
 nine_context_clear(This);
 
diff --git a/src/gallium/state_trackers/nine/device9ex.c 
b/src/gallium/state_trackers/nine/device9ex.c
index 30c8c65e2b..2853a813ba 100644
--- a/src/gallium/state_trackers/nine/device9ex.c
+++ b/src/gallium/state_trackers/nine/device9ex.c
@@ -257,6 +257,7 @@ NineDevice9Ex_Reset( struct NineDevice9Ex *This,
 break;
 }
 
+nine_csmt_process(&This->base);
 nine_state_clear(&This->base.state, TRUE);
 nine_context_clear(&This->base);
 
diff --git a/src/gallium/state_trackers/nine/nine_state.c 
b/src/gallium/state_trackers/nine/nine_state.c
index 697e216436..8909692594 100644
--- a/src/gallium/state_trackers/nine/nine_state.c
+++ b/src/gallium/state_trackers/nine/nine_state.c
@@ -2995,6 +2995,9 @@ static const DWORD 
nine_samp_state_defaults[NINED3DSAMP_LAST + 1] =
 [NINED3DSAMP_CUBETEX] = 0
 };
 
+/* Note: The following 4 functions assume there are no
+ * pending commands */
+
 void nine_state_restore_non_cso(struct NineDevice9 *device)
 {
 struct nine_context *context = &device->context;
-- 
2.11.0



[Mesa-dev] [PATCH 3/7] st/nine: Rework CreatePipeSurface

2017-01-06 Thread Axel Davy
Create both surfaces in one call.

Signed-off-by: Axel Davy 
---
 src/gallium/state_trackers/nine/surface9.c | 49 ++
 src/gallium/state_trackers/nine/surface9.h |  3 --
 2 files changed, 30 insertions(+), 22 deletions(-)

diff --git a/src/gallium/state_trackers/nine/surface9.c 
b/src/gallium/state_trackers/nine/surface9.c
index a5c4a9ede8..4b8e2132ab 100644
--- a/src/gallium/state_trackers/nine/surface9.c
+++ b/src/gallium/state_trackers/nine/surface9.c
@@ -44,6 +44,9 @@
 
 #define DBG_CHANNEL DBG_SURFACE
 
+static void
+NineSurface9_CreatePipeSurfaces( struct NineSurface9 *This );
+
 HRESULT
 NineSurface9_ctor( struct NineSurface9 *This,
struct NineUnknownParams *pParams,
@@ -184,10 +187,8 @@ NineSurface9_ctor( struct NineSurface9 *This,
 if (This->base.resource && (pDesc->Usage & D3DUSAGE_DYNAMIC))
 This->base.resource->flags |= NINE_RESOURCE_FLAG_LOCKABLE;
 
-if (This->base.resource && (pDesc->Usage & (D3DUSAGE_RENDERTARGET | 
D3DUSAGE_DEPTHSTENCIL))) {
-(void) NineSurface9_CreatePipeSurface(This, 0);
-(void) NineSurface9_CreatePipeSurface(This, 1);
-}
+if (This->base.resource && (pDesc->Usage & (D3DUSAGE_RENDERTARGET | 
D3DUSAGE_DEPTHSTENCIL)))
+NineSurface9_CreatePipeSurfaces(This);
 
 /* TODO: investigate what else exactly needs to be cleared */
 if (This->base.resource && (pDesc->Usage & D3DUSAGE_RENDERTARGET))
@@ -220,8 +221,8 @@ NineSurface9_dtor( struct NineSurface9 *This )
 NineResource9_dtor(&This->base);
 }
 
-struct pipe_surface *
-NineSurface9_CreatePipeSurface( struct NineSurface9 *This, const int sRGB )
+static void
+NineSurface9_CreatePipeSurfaces( struct NineSurface9 *This )
 {
 struct pipe_context *pipe;
 struct pipe_screen *screen = NineDevice9_GetScreen(This->base.base.device);
@@ -233,21 +234,33 @@ NineSurface9_CreatePipeSurface( struct NineSurface9 
*This, const int sRGB )
 assert(resource);
 
 srgb_format = util_format_srgb(resource->format);
-if (sRGB && srgb_format != PIPE_FORMAT_NONE &&
-screen->is_format_supported(screen, srgb_format,
-resource->target, 0, resource->bind))
-templ.format = srgb_format;
-else
-templ.format = resource->format;
+if (srgb_format == PIPE_FORMAT_NONE ||
+!screen->is_format_supported(screen, srgb_format,
+ resource->target, 0, resource->bind))
+srgb_format = resource->format;
+
+memset(&templ, 0, sizeof(templ));
+templ.format = resource->format;
 templ.u.tex.level = This->level;
 templ.u.tex.first_layer = This->layer;
 templ.u.tex.last_layer = This->layer;
 
 pipe = nine_context_get_pipe_acquire(This->base.base.device);
-This->surface[sRGB] = pipe->create_surface(pipe, resource, &templ);
+
+This->surface[0] = pipe->create_surface(pipe, resource, &templ);
+
+memset(&templ, 0, sizeof(templ));
+templ.format = srgb_format;
+templ.u.tex.level = This->level;
+templ.u.tex.first_layer = This->layer;
+templ.u.tex.last_layer = This->layer;
+
+This->surface[1] = pipe->create_surface(pipe, resource, &templ);
+
 nine_context_get_pipe_release(This->base.base.device);
-assert(This->surface[sRGB]);
-return This->surface[sRGB];
+
+assert(This->surface[0]); /* TODO: Handle failure */
+assert(This->surface[1]);
 }
 
 #ifdef DEBUG
@@ -762,10 +775,8 @@ NineSurface9_SetResourceResize( struct NineSurface9 *This,
 
 pipe_surface_reference(&This->surface[0], NULL);
 pipe_surface_reference(&This->surface[1], NULL);
-if (resource) {
-(void) NineSurface9_CreatePipeSurface(This, 0);
-(void) NineSurface9_CreatePipeSurface(This, 1);
-}
+if (resource)
+NineSurface9_CreatePipeSurfaces(This);
 }
 
 
diff --git a/src/gallium/state_trackers/nine/surface9.h 
b/src/gallium/state_trackers/nine/surface9.h
index 8263060cd5..6f416f2de6 100644
--- a/src/gallium/state_trackers/nine/surface9.h
+++ b/src/gallium/state_trackers/nine/surface9.h
@@ -90,9 +90,6 @@ NineSurface9_dtor( struct NineSurface9 *This );
 void
 NineSurface9_MarkContainerDirty( struct NineSurface9 *This );
 
-struct pipe_surface *
-NineSurface9_CreatePipeSurface( struct NineSurface9 *This, const int sRGB );
-
 static inline struct pipe_surface *
 NineSurface9_GetSurface( struct NineSurface9 *This, int sRGB )
 {
-- 
2.11.0
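
The format selection in this patch (use the sRGB variant when one exists and is supported, otherwise reuse the linear format for both views) can be sketched on its own. This is a minimal illustration, not the nine code: `enum fmt`, `fmt_srgb_variant()`, and the `fmt_supported_fn` callback are hypothetical stand-ins for the gallium format enum, `util_format_srgb()`, and `screen->is_format_supported()`.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, tiny stand-in for the gallium format enum. */
enum fmt { FMT_NONE, FMT_RGBA8_UNORM, FMT_RGBA8_SRGB, FMT_R32_FLOAT };

/* Stand-in for util_format_srgb(): sRGB twin of a format, or FMT_NONE. */
static enum fmt fmt_srgb_variant(enum fmt linear)
{
    return linear == FMT_RGBA8_UNORM ? FMT_RGBA8_SRGB : FMT_NONE;
}

/* Assumed shape of a "does the driver support this format" query. */
typedef bool (*fmt_supported_fn)(enum fmt f);

/* Fill out[0] with the linear view format and out[1] with the sRGB view
 * format, falling back to the linear format when there is no usable sRGB
 * variant. Both entries are therefore always valid formats. */
static void pick_view_formats(enum fmt linear, fmt_supported_fn supported,
                              enum fmt out[2])
{
    enum fmt srgb = fmt_srgb_variant(linear);

    assert(linear != FMT_NONE);
    if (srgb == FMT_NONE || !supported(srgb))
        srgb = linear;

    out[0] = linear;
    out[1] = srgb;
}

/* Two toy "drivers" used to exercise the helper. */
static bool supports_everything(enum fmt f)
{
    return f != FMT_NONE;
}

static bool supports_no_srgb(enum fmt f)
{
    return f != FMT_NONE && f != FMT_RGBA8_SRGB;
}
```

Because the fallback is resolved up front, a caller can create both views unconditionally, which is what lets the patch drop the per-call `sRGB` parameter.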



[Mesa-dev] [PATCH 4/7] st/nine: Flush pending commands if needed for surface9 changes

2017-01-06 Thread Axel Davy
nine_context uses NineSurface9 fields, thus we need to flush
pending commands using the surface before changing the fields.

Signed-off-by: Axel Davy 
---
 src/gallium/state_trackers/nine/surface9.c | 28 
 src/gallium/state_trackers/nine/surface9.h | 17 -
 2 files changed, 32 insertions(+), 13 deletions(-)

diff --git a/src/gallium/state_trackers/nine/surface9.c 
b/src/gallium/state_trackers/nine/surface9.c
index 4b8e2132ab..836369cafd 100644
--- a/src/gallium/state_trackers/nine/surface9.c
+++ b/src/gallium/state_trackers/nine/surface9.c
@@ -755,6 +755,33 @@ NineSurface9_UploadSelf( struct NineSurface9 *This,
 return D3D_OK;
 }
 
+/* Currently nine_context uses the NineSurface9
+ * fields when it is a render target. Any modification requires
+ * pending commands with the surface to be executed. If the bind
+ * count is 0, there are no pending commands. */
+#define PROCESS_IF_BOUND(surf) \
+if (surf->base.base.bind) \
+nine_csmt_process(surf->base.base.device);
+
+void
+NineSurface9_SetResource( struct NineSurface9 *This,
+  struct pipe_resource *resource, unsigned level )
+{
+/* No need to call PROCESS_IF_BOUND, because SetResource is used only
+ * for MANAGED textures, and they are not render targets. */
+assert(This->base.pool == D3DPOOL_MANAGED);
+This->level = level;
+pipe_resource_reference(&This->base.resource, resource);
+}
+
+void
+NineSurface9_SetMultiSampleType( struct NineSurface9 *This,
+ D3DMULTISAMPLE_TYPE mst )
+{
+PROCESS_IF_BOUND(This);
+This->desc.MultiSampleType = mst;
+}
+
 void
 NineSurface9_SetResourceResize( struct NineSurface9 *This,
 struct pipe_resource *resource )
@@ -764,6 +791,7 @@ NineSurface9_SetResourceResize( struct NineSurface9 *This,
 assert(This->desc.Pool == D3DPOOL_DEFAULT);
 assert(!This->texture);
 
+PROCESS_IF_BOUND(This);
 pipe_resource_reference(&This->base.resource, resource);
 
 This->desc.Width = This->base.info.width0 = resource->width0;
diff --git a/src/gallium/state_trackers/nine/surface9.h 
b/src/gallium/state_trackers/nine/surface9.h
index 6f416f2de6..7badde4e17 100644
--- a/src/gallium/state_trackers/nine/surface9.h
+++ b/src/gallium/state_trackers/nine/surface9.h
@@ -103,22 +103,13 @@ NineSurface9_GetResource( struct NineSurface9 *This )
 return This->base.resource;
 }
 
-static inline void
+void
 NineSurface9_SetResource( struct NineSurface9 *This,
-  struct pipe_resource *resource, unsigned level )
-{
-This->level = level;
-pipe_resource_reference(&This->base.resource, resource);
-pipe_surface_reference(&This->surface[0], NULL);
-pipe_surface_reference(&This->surface[1], NULL);
-}
+  struct pipe_resource *resource, unsigned level );
 
-static inline void
+void
 NineSurface9_SetMultiSampleType( struct NineSurface9 *This,
- D3DMULTISAMPLE_TYPE mst )
-{
-This->desc.MultiSampleType = mst;
-}
+ D3DMULTISAMPLE_TYPE mst );
 
 void
 NineSurface9_SetResourceResize( struct NineSurface9 *This,
-- 
2.11.0
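
One possible hardening of the `PROCESS_IF_BOUND` macro, offered as a suggestion rather than as part of the patch: a macro that expands to a bare `if` statement does not compose with a following `else`, and the usual C idiom is to wrap the body in `do { } while (0)`. The sketch below uses hypothetical `struct device` and `struct surface` types, and `device_process()` merely counts calls in place of `nine_csmt_process()`.

```c
#include <assert.h>

/* Hypothetical miniature of the objects involved; not the nine structs. */
struct device { int flushes; };
struct surface { int bind; int multisample; struct device *device; };

/* Stand-in for nine_csmt_process(): here it only counts calls. */
static void device_process(struct device *dev)
{
    dev->flushes++;
}

/* Same idea as PROCESS_IF_BOUND, wrapped in do { } while (0) so that it
 * behaves like a single statement, even in front of an `else`. */
#define PROCESS_IF_BOUND(surf)                  \
    do {                                        \
        if ((surf)->bind)                       \
            device_process((surf)->device);     \
    } while (0)

/* Setter in the style of NineSurface9_SetMultiSampleType(): flush first,
 * then touch the field that queued commands may still read. */
static void surface_set_multisample(struct surface *surf, int mst)
{
    assert(surf && surf->device);
    PROCESS_IF_BOUND(surf);
    surf->multisample = mst;
}
```

An unbound surface is modified without any flush, while a bound one triggers exactly one flush before the field changes.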



[Mesa-dev] [PATCH 2/7] st/nine: Remove duplicated checks

2017-01-06 Thread Axel Davy
There is no need to check csmt_active before
calling nine_csmt_process, because the function
already checks it.

Signed-off-by: Axel Davy 
---
 src/gallium/state_trackers/nine/device9.c|  3 +--
 src/gallium/state_trackers/nine/nine_state.c | 14 ++
 2 files changed, 7 insertions(+), 10 deletions(-)

diff --git a/src/gallium/state_trackers/nine/device9.c 
b/src/gallium/state_trackers/nine/device9.c
index 95dc703ec0..6f2e5e9962 100644
--- a/src/gallium/state_trackers/nine/device9.c
+++ b/src/gallium/state_trackers/nine/device9.c
@@ -524,8 +524,7 @@ NineDevice9_ctor( struct NineDevice9 *This,
 nine_state_init_sw(This);
 
 ID3DPresentGroup_Release(This->present);
-if (This->csmt_active)
-nine_csmt_process(This);
+nine_csmt_process(This);
 
 return D3D_OK;
 }
diff --git a/src/gallium/state_trackers/nine/nine_state.c 
b/src/gallium/state_trackers/nine/nine_state.c
index afc309f1db..697e216436 100644
--- a/src/gallium/state_trackers/nine/nine_state.c
+++ b/src/gallium/state_trackers/nine/nine_state.c
@@ -280,8 +280,7 @@ nine_csmt_resume( struct NineDevice9 *device )
 struct pipe_context *
 nine_context_get_pipe( struct NineDevice9 *device )
 {
-if (device->csmt_active)
-nine_csmt_process(device);
+nine_csmt_process(device);
 return device->context.pipe;
 }
 
@@ -1908,8 +1907,8 @@ nine_context_light_enable_stateblock(struct NineDevice9 
*device,
 {
 struct nine_context *context = &device->context;
 
-if (device->csmt_active) /* TODO: fix */
-nine_csmt_process(device);
+/* TODO: Use CSMT_* to avoid calling nine_csmt_process */
+nine_csmt_process(device);
 memcpy(context->ff.active_light, active_light, NINE_MAX_LIGHTS_ACTIVE * 
sizeof(context->ff.active_light[0]));
 context->ff.num_lights_active = num_lights_active;
 context->changed.group |= NINE_STATE_FF_LIGHTING;
@@ -2821,10 +2820,9 @@ nine_context_get_query_result(struct NineDevice9 
*device, struct pipe_query *que
 struct pipe_context *pipe;
 boolean ret;
 
-if (wait) {
-if (device->csmt_active)
-nine_csmt_process(device);
-} else if (p_atomic_read(counter) > 0) {
+if (wait)
+nine_csmt_process(device);
+else if (p_atomic_read(counter) > 0) {
 if (flush && device->csmt_active)
 nine_queue_flush(device->csmt_ctx->pool);
 DBG("Pending begin/end. Returning\n");
-- 
2.11.0



[Mesa-dev] [PATCH 7/7] st/nine: Protect dtors with mutex

2017-01-06 Thread Axel Davy
When the flag D3DCREATE_MULTITHREADED is set, a global mutex is used
to protect nine calls.
However for performance reasons, AddRef and Release didn't hold the mutex,
and instead used atomics.

Unfortunately at item release, the item can be destroyed, and that
destruction path should be protected by a mutex (at least for
some objects).

Without this patch, an app thread can be in a dtor
while another thread is making gallium nine calls. Two threads can
then end up using the same gallium pipe, which is forbidden.
The problem has been made worse with csmt, because it can cause hangs,
since nine_csmt_process is not threadsafe.

Fixes Hitman hang, and possibly others.

Signed-off-by: Axel Davy 
---
 src/gallium/state_trackers/nine/iunknown.c  | 26 +++
 src/gallium/state_trackers/nine/iunknown.h  |  3 ++
 src/gallium/state_trackers/nine/nine_lock.c | 51 ++---
 src/gallium/state_trackers/nine/nine_lock.h |  3 ++
 4 files changed, 64 insertions(+), 19 deletions(-)

diff --git a/src/gallium/state_trackers/nine/iunknown.c 
b/src/gallium/state_trackers/nine/iunknown.c
index eae4997aa1..d76d644789 100644
--- a/src/gallium/state_trackers/nine/iunknown.c
+++ b/src/gallium/state_trackers/nine/iunknown.c
@@ -26,6 +26,7 @@
 
 #include "nine_helpers.h"
 #include "nine_pdata.h"
+#include "nine_lock.h"
 
 #define DBG_CHANNEL DBG_UNKNOWN
 
@@ -135,6 +136,31 @@ NineUnknown_Release( struct NineUnknown *This )
 return r;
 }
 
+/* No need to lock the mutex protecting nine (when D3DCREATE_MULTITHREADED)
+ * for AddRef and Release, except for dtor as some of the dtors require it. */
+ULONG NINE_WINAPI
+NineUnknown_ReleaseWithDtorLock( struct NineUnknown *This )
+{
+if (This->forward)
+return NineUnknown_ReleaseWithDtorLock(This->container);
+
+ULONG r = p_atomic_dec_return(&This->refs);
+
+if (r == 0) {
+if (This->device) {
+if (NineUnknown_ReleaseWithDtorLock(NineUnknown(This->device)) == 0)
+return r; /* everything's gone */
+}
+/* Containers (here with !forward) take care of item destruction */
+if (!This->container && This->bind == 0) {
+NineLockGlobalMutex();
+This->dtor(This);
+NineUnlockGlobalMutex();
+}
+}
+return r;
+}
+
 HRESULT NINE_WINAPI
 NineUnknown_GetDevice( struct NineUnknown *This,
IDirect3DDevice9 **ppDevice )
diff --git a/src/gallium/state_trackers/nine/iunknown.h 
b/src/gallium/state_trackers/nine/iunknown.h
index 4b9edaa355..f9ce7b50c9 100644
--- a/src/gallium/state_trackers/nine/iunknown.h
+++ b/src/gallium/state_trackers/nine/iunknown.h
@@ -100,6 +100,9 @@ NineUnknown_AddRef( struct NineUnknown *This );
 ULONG NINE_WINAPI
 NineUnknown_Release( struct NineUnknown *This );
 
+ULONG NINE_WINAPI
+NineUnknown_ReleaseWithDtorLock( struct NineUnknown *This );
+
 HRESULT NINE_WINAPI
 NineUnknown_GetDevice( struct NineUnknown *This,
IDirect3DDevice9 **ppDevice );
diff --git a/src/gallium/state_trackers/nine/nine_lock.c 
b/src/gallium/state_trackers/nine/nine_lock.c
index fb24400778..1136dad494 100644
--- a/src/gallium/state_trackers/nine/nine_lock.c
+++ b/src/gallium/state_trackers/nine/nine_lock.c
@@ -43,12 +43,25 @@
 #include "volumetexture9.h"
 
 #include "d3d9.h"
+#include "nine_lock.h"
 
 #include "os/os_thread.h"
 
 /* Global mutex as described by MSDN */
 pipe_static_mutex(d3dlock_global);
 
+void
+NineLockGlobalMutex()
+{
+pipe_mutex_lock(d3dlock_global);
+}
+
+void
+NineUnlockGlobalMutex()
+{
+pipe_mutex_unlock(d3dlock_global);
+}
+
 static HRESULT NINE_WINAPI
 LockAuthenticatedChannel9_GetCertificateSize( struct NineAuthenticatedChannel9 
*This,
   UINT *pCertificateSize )
@@ -114,7 +127,7 @@ LockAuthenticatedChannel9_Configure( struct 
NineAuthenticatedChannel9 *This,
 IDirect3DAuthenticatedChannel9Vtbl LockAuthenticatedChannel9_vtable = {
 (void *)NineUnknown_QueryInterface,
 (void *)NineUnknown_AddRef,
-(void *)NineUnknown_Release,
+(void *)NineUnknown_ReleaseWithDtorLock,
 (void *)LockAuthenticatedChannel9_GetCertificateSize,
 (void *)LockAuthenticatedChannel9_GetCertificate,
 (void *)LockAuthenticatedChannel9_NegotiateKeyExchange,
@@ -398,7 +411,7 @@ LockCryptoSession9_GetEncryptionBltKey( struct 
NineCryptoSession9 *This,
 IDirect3DCryptoSession9Vtbl LockCryptoSession9_vtable = {
 (void *)NineUnknown_QueryInterface,
 (void *)NineUnknown_AddRef,
-(void *)NineUnknown_Release,
+(void *)NineUnknown_ReleaseWithDtorLock,
 (void *)LockCryptoSession9_GetCertificateSize,
 (void *)LockCryptoSession9_GetCertificate,
 (void *)LockCryptoSession9_NegotiateKeyExchange,
@@ -481,7 +494,7 @@ LockCubeTexture9_AddDirtyRect( struct NineCubeTexture9 
*This,
 IDirect3DCubeTexture9Vtbl LockCubeTexture9_vtable = {
 (void *)NineUnknown_QueryInterface,
 (void 
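
The locking scheme described in this patch (lock-free AddRef/Release, with the global mutex taken only around the destructor) can be sketched in isolation. This is a simplified illustration under stated assumptions: `struct object`, `global_lock`, and `counting_dtor()` are hypothetical, C11 atomics and a pthread mutex stand in for `p_atomic_dec_return()` and `pipe_mutex_lock()`, and forwarding, containers, and the device reference are left out.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical refcounted object; not the NineUnknown layout. */
struct object {
    atomic_int refs;
    void (*dtor)(struct object *obj);
};

/* One process-wide lock, in the spirit of d3dlock_global. */
static pthread_mutex_t global_lock = PTHREAD_MUTEX_INITIALIZER;

/* Counts destructor runs so the behaviour can be observed. */
static int dtor_calls;

static void counting_dtor(struct object *obj)
{
    (void)obj;
    dtor_calls++; /* only touched with global_lock held */
}

/* Drop one reference. The decrement is lock-free; only the thread that
 * brings the count to zero takes the lock, and only around the dtor. */
static int object_release(struct object *obj)
{
    int remaining = atomic_fetch_sub(&obj->refs, 1) - 1;

    assert(remaining >= 0); /* catches releasing more often than acquiring */
    if (remaining == 0) {
        pthread_mutex_lock(&global_lock);
        obj->dtor(obj);
        pthread_mutex_unlock(&global_lock);
    }
    return remaining;
}
```

The common path (a release that does not destroy anything) stays as cheap as before, which matches the performance argument in the commit message.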

[Mesa-dev] [PATCH 6/7] st/nine: Flush the queue at device dtor

2017-01-06 Thread Axel Davy
Flush the queue to get refcounts right, and properly
release the items, instead of throwing away all pending
commands.

Signed-off-by: Axel Davy 
---
 src/gallium/state_trackers/nine/device9.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/src/gallium/state_trackers/nine/device9.c 
b/src/gallium/state_trackers/nine/device9.c
index 03564203df..1a0ab035c7 100644
--- a/src/gallium/state_trackers/nine/device9.c
+++ b/src/gallium/state_trackers/nine/device9.c
@@ -537,8 +537,13 @@ NineDevice9_dtor( struct NineDevice9 *This )
 
 DBG("This=%p\n", This);
 
-/* Do not call nine_csmt_process here. The device is dead! */
+/* Flush all pending commands to get refcount right,
+ * and properly release bound objects. It is ok to still
+ * execute commands while we are in device dtor, because
+ * we haven't released anything yet. Note that no pending
+ * command can increase the device refcount. */
 if (This->csmt_active && This->csmt_ctx) {
+nine_csmt_process(This);
 nine_csmt_destroy(This, This->csmt_ctx);
 This->csmt_active = FALSE;
 This->csmt_ctx = NULL;
-- 
2.11.0
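
Why flushing matters for refcounts can be shown with a toy command queue. The sketch below is single-threaded and entirely hypothetical (`struct item`, `struct queue`, and the `queue_*` helpers are not nine APIs); it only assumes, as the commit message implies, that each pending command holds a reference on the object it will use.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical refcounted item that queued commands keep alive. */
struct item { int refs; };

/* A queued command holds one reference on the item it will use. */
struct command { struct item *target; };

#define QUEUE_MAX 8

struct queue {
    struct command cmds[QUEUE_MAX];
    size_t count;
};

/* Queue a command; the queue takes its own reference on the target. */
static int queue_push(struct queue *q, struct item *target)
{
    if (q->count == QUEUE_MAX)
        return 0;
    target->refs++;
    q->cmds[q->count++].target = target;
    return 1;
}

/* Execute everything that is pending and drop the references the
 * commands were holding (the role nine_csmt_process() plays here). */
static void queue_process(struct queue *q)
{
    for (size_t i = 0; i < q->count; i++) {
        assert(q->cmds[i].target->refs > 0);
        q->cmds[i].target->refs--;
    }
    q->count = 0;
}

/* Teardown that simply forgets pending commands: their references are
 * never dropped, so the targets can never reach a refcount of zero. */
static void queue_destroy_discarding(struct queue *q)
{
    q->count = 0;
}

/* Teardown in the style of the patch: drain first, then destroy. */
static void queue_destroy_flushing(struct queue *q)
{
    queue_process(q);
    queue_destroy_discarding(q);
}
```

With the flushing teardown every item returns to the refcount its owner expects; with the discarding one, each dropped command leaves a stray reference behind.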



[Mesa-dev] [PATCH 1/7] st/nine: Don't call u_box_union_* when dirty region is empty

2017-01-06 Thread Axel Davy
From: Masanori Kakura 

When the dirty region is empty, u_box_union_* incorrectly expands
the new region.

This fixes a broken font rendering issue in WOLF RPG Editor v2.10 games.

Signed-off-by: Masanori Kakura 
Reviewed-by: Axel Davy 
---
 src/gallium/state_trackers/nine/cubetexture9.c   | 12 
 src/gallium/state_trackers/nine/texture9.c   | 10 +++---
 src/gallium/state_trackers/nine/volumetexture9.c | 10 +++---
 3 files changed, 22 insertions(+), 10 deletions(-)

diff --git a/src/gallium/state_trackers/nine/cubetexture9.c 
b/src/gallium/state_trackers/nine/cubetexture9.c
index 977a345552..65251ad2b7 100644
--- a/src/gallium/state_trackers/nine/cubetexture9.c
+++ b/src/gallium/state_trackers/nine/cubetexture9.c
@@ -285,10 +285,14 @@ NineCubeTexture9_AddDirtyRect( struct NineCubeTexture9 
*This,
 This->base.base.info.height0,
 &This->dirty_rect[FaceType]);
 } else {
-struct pipe_box box;
-rect_to_pipe_box_clamp(&box, pDirtyRect);
-u_box_union_2d(&This->dirty_rect[FaceType], &This->dirty_rect[FaceType],
-   &box);
+if (This->dirty_rect[FaceType].width == 0) {
+rect_to_pipe_box_clamp(&This->dirty_rect[FaceType], pDirtyRect);
+} else {
+struct pipe_box box;
+rect_to_pipe_box_clamp(&box, pDirtyRect);
+u_box_union_2d(&This->dirty_rect[FaceType], &This->dirty_rect[FaceType],
+   &box);
+}
 (void) u_box_clip_2d(&This->dirty_rect[FaceType],
  &This->dirty_rect[FaceType],
  This->base.base.info.width0,
diff --git a/src/gallium/state_trackers/nine/texture9.c 
b/src/gallium/state_trackers/nine/texture9.c
index bf054cc305..78ca4add4a 100644
--- a/src/gallium/state_trackers/nine/texture9.c
+++ b/src/gallium/state_trackers/nine/texture9.c
@@ -330,9 +330,13 @@ NineTexture9_AddDirtyRect( struct NineTexture9 *This,
 u_box_origin_2d(This->base.base.info.width0,
 This->base.base.info.height0, &This->dirty_rect);
 } else {
-struct pipe_box box;
-rect_to_pipe_box_clamp(&box, pDirtyRect);
-u_box_union_2d(&This->dirty_rect, &This->dirty_rect, &box);
+if (This->dirty_rect.width == 0) {
+rect_to_pipe_box_clamp(&This->dirty_rect, pDirtyRect);
+} else {
+struct pipe_box box;
+rect_to_pipe_box_clamp(&box, pDirtyRect);
+u_box_union_2d(&This->dirty_rect, &This->dirty_rect, &box);
+}
 (void) u_box_clip_2d(&This->dirty_rect, &This->dirty_rect,
  This->base.base.info.width0,
  This->base.base.info.height0);
diff --git a/src/gallium/state_trackers/nine/volumetexture9.c 
b/src/gallium/state_trackers/nine/volumetexture9.c
index 5c83fdb60c..c836dd2102 100644
--- a/src/gallium/state_trackers/nine/volumetexture9.c
+++ b/src/gallium/state_trackers/nine/volumetexture9.c
@@ -222,9 +222,13 @@ NineVolumeTexture9_AddDirtyBox( struct NineVolumeTexture9 
*This,
 This->dirty_box.height = This->base.base.info.height0;
 This->dirty_box.depth = This->base.base.info.depth0;
 } else {
-struct pipe_box box;
-d3dbox_to_pipe_box(&box, pDirtyBox);
-u_box_union_3d(&This->dirty_box, &This->dirty_box, &box);
+if (This->dirty_box.width == 0) {
+d3dbox_to_pipe_box(&This->dirty_box, pDirtyBox);
+} else {
+struct pipe_box box;
+d3dbox_to_pipe_box(&box, pDirtyBox);
+u_box_union_3d(&This->dirty_box, &This->dirty_box, &box);
+}
+}
 This->dirty_box.x = MAX2(This->dirty_box.x, 0);
 This->dirty_box.y = MAX2(This->dirty_box.y, 0);
 This->dirty_box.z = MAX2(This->dirty_box.z, 0);
-- 
2.11.0
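
The failure mode fixed here is easy to reproduce with a plain bounding-box union. In the sketch below, `struct box2d` and `box_union_2d()` are hypothetical and only assumed to behave like a min/max union such as `u_box_union_2d()`; `dirty_add()` applies the same guard as the patch. Unioning an empty box at the origin with a 5x5 rectangle at (10, 10) yields a 15x15 box anchored at (0, 0), which is far larger than the area that actually changed.

```c
#include <assert.h>

/* Hypothetical 2D box, loosely modelled on gallium's struct pipe_box. */
struct box2d { int x, y, width, height; };

static int min_int(int a, int b) { return a < b ? a : b; }
static int max_int(int a, int b) { return a > b ? a : b; }

/* Bounding box of a and b. It treats both operands as real rectangles,
 * so a zero-sized box at the origin still drags the result to (0, 0). */
static void box_union_2d(struct box2d *dst,
                         const struct box2d *a, const struct box2d *b)
{
    int x0 = min_int(a->x, b->x);
    int y0 = min_int(a->y, b->y);
    int x1 = max_int(a->x + a->width, b->x + b->width);
    int y1 = max_int(a->y + a->height, b->y + b->height);

    dst->x = x0;
    dst->y = y0;
    dst->width = x1 - x0;
    dst->height = y1 - y0;
}

/* Accumulate a new dirty rectangle the way the patch does: an empty
 * dirty region (width == 0) is replaced instead of being unioned. */
static void dirty_add(struct box2d *dirty, const struct box2d *rect)
{
    assert(rect->width > 0 && rect->height > 0);
    if (dirty->width == 0)
        *dirty = *rect;
    else
        box_union_2d(dirty, dirty, rect);
}
```

Once the dirty region is non-empty, later rectangles are still merged with the ordinary union, so only the first accumulation changes behaviour.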



Re: [Mesa-dev] [PATCH] i965: Properly flush in hsw_pause_transform_feedback().

2017-01-06 Thread Anuj Phogat
On Fri, Jan 6, 2017 at 12:09 AM, Kenneth Graunke  wrote:
> Fixes a number of transform feedback tests when run with Linux 4.8,
> which allows us to use the MI_LOAD_REGISTER_REG command, at which point
> we started using this new broken path.
>
> ES3-CTS.functional.transform_feedback.array_element.interleaved.lines.*
> and Piglit's arb_transform_feedback2/draw-auto are both fixed by this
> patch, for example.
>
> Thanks to Chris Wilson for catching this mistake!
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99030
> Signed-off-by: Kenneth Graunke 
> ---
>  src/mesa/drivers/dri/i965/hsw_sol.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/i965/hsw_sol.c 
> b/src/mesa/drivers/dri/i965/hsw_sol.c
> index e299b022706..b0dd150b7df 100644
> --- a/src/mesa/drivers/dri/i965/hsw_sol.c
> +++ b/src/mesa/drivers/dri/i965/hsw_sol.c
> @@ -201,6 +201,9 @@ hsw_pause_transform_feedback(struct gl_context *ctx,
>(struct brw_transform_feedback_object *) obj;
>
> if (brw->is_haswell) {
> +  /* Flush any drawing so that the counters have the right values. */
> +  brw_emit_mi_flush(brw);
> +
>/* Save the SOL buffer offset register values. */
>for (int i = 0; i < BRW_MAX_XFB_STREAMS; i++) {
>   BEGIN_BATCH(3);
> --
> 2.11.0
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Looks reasonable to me.
Reviewed-by: Anuj Phogat 


[Mesa-dev] [PATCH] configure: Add a condition for compiling for ARM.

2017-01-06 Thread Eric Anholt
This will let VC4 do some ARM-specific optimizations while still having
the simulator build on x86.
---

I'm finishing building the series for doing NEON optimizations today,
but I wanted to get this out there for review since it touches shared
code.

I didn't replicate the 'case "$host_os"' for this check.  Do we need
that?  I'm planning on using gcc intrinsics for arm asm code, so it
doesn't seem host os dependent.

 configure.ac | 8 
 1 file changed, 8 insertions(+)

diff --git a/configure.ac b/configure.ac
index d1ffb57f57e3..1fe8e2ed071c 100644
--- a/configure.ac
+++ b/configure.ac
@@ -763,6 +763,9 @@ if test "x$enable_asm" = xyes; then
 ;;
 esac
 ;;
+arm)
+asm_arch=arm
+;;
 sparc*)
 case "$host_os" in
 linux*)
@@ -777,6 +780,10 @@ if test "x$enable_asm" = xyes; then
 DEFINES="$DEFINES -DUSE_X86_ASM -DUSE_MMX_ASM -DUSE_3DNOW_ASM 
-DUSE_SSE_ASM"
 AC_MSG_RESULT([yes, x86])
 ;;
+arm)
+DEFINES="$DEFINES -DUSE_ARM_ASM"
+AC_MSG_RESULT([yes, arm])
+;;
 x86_64|amd64)
 DEFINES="$DEFINES -DUSE_X86_64_ASM"
 AC_MSG_RESULT([yes, x86_64])
@@ -2683,6 +2690,7 @@ AM_CONDITIONAL(HAVE_COMMON_OSMESA, test "x$enable_osmesa" 
= xyes -o \
 AM_CONDITIONAL(HAVE_X86_ASM, test "x$asm_arch" = xx86 -o "x$asm_arch" = 
xx86_64)
 AM_CONDITIONAL(HAVE_X86_64_ASM, test "x$asm_arch" = xx86_64)
 AM_CONDITIONAL(HAVE_SPARC_ASM, test "x$asm_arch" = xsparc)
+AM_CONDITIONAL(HAVE_ARM_ASM, test "x$asm_arch" = xarm)
 
 AC_SUBST([NINE_MAJOR], 1)
 AC_SUBST([NINE_MINOR], 0)
-- 
2.11.0
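
For illustration, driver code could consume the new define roughly as follows. This is a sketch under assumptions, not the VC4 series itself: `copy_bytes()` and `HAVE_NEON_PATH` are hypothetical names, and additionally requiring the compiler's `__ARM_NEON` feature macro is a choice made here, not something the patch does. The intrinsics `vld1q_u8()` and `vst1q_u8()` come from `<arm_neon.h>`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* USE_ARM_ASM is the define added by the patch; __ARM_NEON is the
 * compiler's own feature macro. Requiring both is an assumption of this
 * sketch. */
#if defined(USE_ARM_ASM) && defined(__ARM_NEON)
#include <arm_neon.h>
#define HAVE_NEON_PATH 1
#else
#define HAVE_NEON_PATH 0
#endif

/* Hypothetical helper: copy `size` bytes, 16 at a time with NEON when
 * available, and with memcpy() otherwise (e.g. in an x86 simulator
 * build). */
static void copy_bytes(uint8_t *dst, const uint8_t *src, size_t size)
{
    assert(size == 0 || (dst != NULL && src != NULL));
#if HAVE_NEON_PATH
    size_t i = 0;

    for (; i + 16 <= size; i += 16)
        vst1q_u8(dst + i, vld1q_u8(src + i));
    memcpy(dst + i, src + i, size - i); /* tail shorter than 16 bytes */
#else
    memcpy(dst, src, size);
#endif
}
```

Keeping the portable fallback in the same function means the simulator build on x86 exercises the same call sites as the ARM build.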



Re: [Mesa-dev] [PATCH] anv: don't skip the VUE header if we are reading gl_Layer in a fragment shader

2017-01-06 Thread Jason Ekstrand
Thanks!

Reviewed-by: Jason Ekstrand 

On Jan 5, 2017 05:10, "Iago Toral Quiroga"  wrote:

> This is the same we do in the GL driver: the hardware provides gl_Layer
> in the VUE header, so when the fragment shader reads it we can't skip it.
> ---
>
> With this patch we now successfully read gl_Layer in fragment shaders.
> Layered rendering still does not work though, probably because we still
> need to hook up the layer_id stuff that Jason added some time ago. I'll
> look into that next.
>
>  src/intel/vulkan/genX_pipeline.c | 20 
>  1 file changed, 16 insertions(+), 4 deletions(-)
>
> diff --git a/src/intel/vulkan/genX_pipeline.c b/src/intel/vulkan/genX_pipeline.c
> index 845d020..c1d8ae6 100644
> --- a/src/intel/vulkan/genX_pipeline.c
> +++ b/src/intel/vulkan/genX_pipeline.c
> @@ -291,6 +291,8 @@ emit_3dstate_sbe(struct anv_pipeline *pipeline)
>  #  define swiz sbe
>  #endif
>
> +   /* Skip the VUE header and position slots by default */
> +   unsigned urb_entry_read_offset = 1;
> int max_source_attr = 0;
> for (int attr = 0; attr < VARYING_SLOT_MAX; attr++) {
>int input_index = wm_prog_data->urb_setup[attr];
> @@ -298,6 +300,12 @@ emit_3dstate_sbe(struct anv_pipeline *pipeline)
>if (input_index < 0)
>   continue;
>
> +  /* gl_Layer is stored in the VUE header */
> +  if (attr == VARYING_SLOT_LAYER) {
> + urb_entry_read_offset = 0;
> + continue;
> +  }
> +
>if (attr == VARYING_SLOT_PNTC) {
>   sbe.PointSpriteTextureCoordinateEnable = 1 << input_index;
>   continue;
> @@ -322,18 +330,22 @@ emit_3dstate_sbe(struct anv_pipeline *pipeline)
>   swiz.Attribute[input_index].ComponentOverrideZ = true;
>   swiz.Attribute[input_index].ComponentOverrideW = true;
>} else {
> - assert(slot >= 2);
> - const int source_attr = slot - 2;
> - max_source_attr = MAX2(max_source_attr, source_attr);
>   /* We have to subtract two slots to account for the URB entry output
>* read offset in the VS and GS stages.
>*/
> + assert(slot >= 2);
> + const int source_attr = slot - 2 * urb_entry_read_offset;
> + max_source_attr = MAX2(max_source_attr, source_attr);
>   swiz.Attribute[input_index].SourceAttribute = source_attr;
>}
> }
>
> -   sbe.VertexURBEntryReadOffset = 1; /* Skip the VUE header and position slots */
> +   sbe.VertexURBEntryReadOffset = urb_entry_read_offset;
> sbe.VertexURBEntryReadLength = DIV_ROUND_UP(max_source_attr + 1, 2);
> +#if GEN_GEN >= 8
> +   sbe.ForceVertexURBEntryReadOffset = true;
> +   sbe.ForceVertexURBEntryReadLength = true;
> +#endif
>
> uint32_t *dw = anv_batch_emit_dwords(&pipeline->batch,
>  GENX(3DSTATE_SBE_length));
> --
> 2.7.4
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
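
The slot arithmetic in the quoted patch can be checked with a few lines of C. The helpers below are hypothetical and only restate the two expressions from the diff; the reading that the offset and length count pairs of slots is inferred from the "subtract two slots" comment.

```c
#include <assert.h>

/* Round-up integer division, as DIV_ROUND_UP() does in the patch. */
static unsigned div_round_up(unsigned n, unsigned d)
{
    return (n + d - 1) / d;
}

/* The read offset counts pairs of slots: 1 skips the VUE header and
 * position (two slots), 0 keeps them so gl_Layer can be read. */
static int source_attr_for_slot(int slot, unsigned urb_entry_read_offset)
{
    assert(slot >= 2);
    return slot - 2 * (int)urb_entry_read_offset;
}

/* Read length, also in pairs of slots, for the highest source attribute. */
static unsigned read_length(int max_source_attr)
{
    assert(max_source_attr >= 0);
    return div_round_up((unsigned)max_source_attr + 1, 2);
}
```

With an offset of 1, slot 2 maps to source attribute 0; with an offset of 0 the same slot maps to source attribute 2, so the read length grows accordingly.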


Re: [Mesa-dev] [PATCH] i965: Fix texturing in the vec4 TCS and GS backends.

2017-01-06 Thread Jason Ekstrand
Sorry I didn't fix vec4 when I fixed fs. :-(

Reviewed-by: Jason Ekstrand 

On Jan 6, 2017 02:00, "Kenneth Graunke"  wrote:

We were failing to zero m0.2 of the sampler message header for TCS and
GS messages in the simple case.  fs_generator has done this for about
a year now, but we missed it in vec4_generator.

Fixes ES31-CTS.core.texture_cube_map_array.sampling,
GL45-CTS.texture_cube_map_array.sampling, and many
dEQP-GLES31.functional.shaders.opaque_type_indexing.sampler subtests:
- dynamically_uniform.tessellation_control.isampler3d
- dynamically_uniform.tessellation_control.isamplercube
- dynamically_uniform.tessellation_control.sampler2d
- dynamically_uniform.tessellation_control.usamplercube
- dynamically_uniform.tessellation_control.sampler2darray
- dynamically_uniform.tessellation_control.isampler2darray
- dynamically_uniform.tessellation_control.usampler3d
- dynamically_uniform.tessellation_control.usampler2darray
- dynamically_uniform.tessellation_control.usampler2d
- dynamically_uniform.tessellation_control.sampler3d
- dynamically_uniform.tessellation_control.samplercube
- dynamically_uniform.tessellation_control.isampler2d
- uniform.tessellation_control.isampler3d
- uniform.tessellation_control.isamplercube
- uniform.tessellation_control.usampler2d
- uniform.tessellation_control.usampler3d
- uniform.tessellation_control.sampler2darray
- uniform.tessellation_control.isampler2darray
- uniform.tessellation_control.usampler2darray
- uniform.tessellation_control.sampler2d
- uniform.tessellation_control.usamplercube
- uniform.tessellation_control.sampler3d
- uniform.tessellation_control.samplercube
- uniform.tessellation_control.isampler2d

Cc: mesa-sta...@lists.freedesktop.org
Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_vec4_generator.cpp | 14 --
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
index 3d688cff144..f095cc2d0f2 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
@@ -106,6 +106,7 @@ generate_math2_gen4(struct brw_codegen *p,
 static void
 generate_tex(struct brw_codegen *p,
  struct brw_vue_prog_data *prog_data,
+ gl_shader_stage stage,
  vec4_instruction *inst,
  struct brw_reg dst,
  struct brw_reg src,
@@ -238,8 +239,16 @@ generate_tex(struct brw_codegen *p,
  */
 dw2 |= GEN9_SAMPLER_SIMD_MODE_EXTENSION_SIMD4X2;

- if (dw2)
+ /* The VS, DS, and FS stages have the g0.2 payload delivered as 0,
+  * so header0.2 is 0 when g0 is copied.  The HS and GS stages do
+  * not, so we must set it to 0 to avoid setting undesirable bits
+  * in the message header.
+  */
+ if (dw2 ||
+ stage == MESA_SHADER_TESS_CTRL ||
+ stage == MESA_SHADER_GEOMETRY) {
 brw_MOV(p, get_element_ud(header, 2), brw_imm_ud(dw2));
+ }

  brw_adjust_sampler_state_pointer(p, header, sampler_index);
  brw_pop_insn_state(p);
@@ -1748,7 +1757,8 @@ generate_code(struct brw_codegen *p,
   case SHADER_OPCODE_TG4:
   case SHADER_OPCODE_TG4_OFFSET:
   case SHADER_OPCODE_SAMPLEINFO:
- generate_tex(p, prog_data, inst, dst, src[0], src[1], src[2]);
+ generate_tex(p, prog_data, nir->stage,
+  inst, dst, src[0], src[1], src[2]);
  break;

   case VS_OPCODE_URB_WRITE:
--
2.11.0



Re: [Mesa-dev] [PATCH] isl: render target cube maps should be handled as 2D images, not cubes

2017-01-06 Thread Jason Ekstrand
On Jan 6, 2017 10:18, "Jason Ekstrand"  wrote:

Thanks for catching this.  I wonder how I managed to switch the GL driver
over to using ISL for emitting surface states without regressing anything...

Reviewed-by: Jason Ekstrand 

On Jan 6, 2017 05:42, "Iago Toral Quiroga"  wrote:

> This fixes layered rendering Vulkan CTS tests with cube (arrays). We
> also do this in the GL driver, see this code from gen8_depth_state.c
> for example:
>
> case GL_TEXTURE_CUBE_MAP_ARRAY:
> case GL_TEXTURE_CUBE_MAP:
>/* The PRM claims that we should use BRW_SURFACE_CUBE for this
> * situation, but experiments show that gl_Layer doesn't work when we do
> * this.  So we use BRW_SURFACE_2D, since for rendering purposes this is
> * equivalent.
> */
>surftype = BRW_SURFACE_2D;
>depth *= 6;
>break;
>
> So I guess we simply forgot to port this workaround to Vulkan.
>
> Fixes:
> dEQP-VK.geometry.layered.cube*
> ---
>
> With this (and the previous patch I sent to fix the SBE state packet to not
> skip the VUE header when we need the layer information) all the layered
> rendering tests in Vulkan CTS seem to pass.
>
>  src/intel/isl/isl_surface_state.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/src/intel/isl/isl_surface_state.c
> b/src/intel/isl/isl_surface_state.c
> index 3bb0abd..0960a90 100644
> --- a/src/intel/isl/isl_surface_state.c
> +++ b/src/intel/isl/isl_surface_state.c
> @@ -113,8 +113,9 @@ get_surftype(enum isl_surf_dim dim,
> isl_surf_usage_flags_t usage)
>assert(!(usage & ISL_SURF_USAGE_CUBE_BIT));
>return SURFTYPE_1D;
> case ISL_SURF_DIM_2D:
> -  if (usage & ISL_SURF_USAGE_STORAGE_BIT) {
> - /* Storage images are always plain 2-D, not cube */
> +  if ((usage & ISL_SURF_USAGE_STORAGE_BIT) ||
> +  (usage & ISL_SURF_USAGE_RENDER_TARGET_BIT)) {
> + /* Storage / Render images are always plain 2-D, not cube */
>   return SURFTYPE_2D;
>} else if (usage & ISL_SURF_USAGE_CUBE_BIT) {
>

On second thought... I wonder if the right thing to do here wouldn't be to
just change the other condition a bit to:

if ((usage & ISL_SURF_USAGE_CUBE_BIT) &&
(usage & ISL_SURF_USAGE_TEXTURE_BIT)) {
   /* We need SURFTYPE_CUBE to make cube sampling work */
   return SURFTYPE_CUBE;
} else {
   /* Everything else (render and storage) treats cubes as plain 2D array
    * textures */
   return SURFTYPE_2D;
}

It seems like textures, and not render targets, are the special case here.
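
To make the suggestion concrete, here is a minimal standalone sketch of that condition. The usage bits and surface types below are hypothetical stand-ins, not the real ISL definitions, and the sketch assumes the usage flags describe a single view, so that a render target view does not also carry the texture bit.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the ISL_SURF_USAGE_* bits. */
enum sketch_usage {
   SKETCH_USAGE_RENDER_TARGET_BIT = 1u << 0,
   SKETCH_USAGE_TEXTURE_BIT       = 1u << 1,
   SKETCH_USAGE_CUBE_BIT          = 1u << 2,
   SKETCH_USAGE_STORAGE_BIT       = 1u << 3,
};

/* Hypothetical stand-ins for SURFTYPE_2D / SURFTYPE_CUBE. */
enum sketch_surftype {
   SKETCH_SURFTYPE_2D,
   SKETCH_SURFTYPE_CUBE,
};

/* Surface type for a 2D-dimension surface: only sampled cubes get the
 * cube type, everything else is treated as a plain 2D array. */
static enum sketch_surftype
sketch_get_surftype_2d(uint32_t usage)
{
   if ((usage & SKETCH_USAGE_CUBE_BIT) &&
       (usage & SKETCH_USAGE_TEXTURE_BIT)) {
      /* Cube sampling needs the cube surface type. */
      return SKETCH_SURFTYPE_CUBE;
   } else {
      /* Render targets and storage images see a 2D array. */
      return SKETCH_SURFTYPE_2D;
   }
}
```

Written this way, any future non-texture usage falls into the 2D branch by default instead of needing its own exception.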

>   return SURFTYPE_CUBE;
> --
> 2.7.4
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] isl: render target cube maps should be handled as 2D images, not cubes

2017-01-06 Thread Jason Ekstrand
Thanks for catching this.  I wonder how I managed to switch the GL driver
over to using ISL for emitting surface states without regressing anything...

Reviewed-by: Jason Ekstrand 

On Jan 6, 2017 05:42, "Iago Toral Quiroga"  wrote:

> This fixes layered rendering Vulkan CTS tests with cube (arrays). We
> also do this in the GL driver, see this code from gen8_depth_state.c
> for example:
>
> case GL_TEXTURE_CUBE_MAP_ARRAY:
> case GL_TEXTURE_CUBE_MAP:
>/* The PRM claims that we should use BRW_SURFACE_CUBE for this
> * situation, but experiments show that gl_Layer doesn't work when we do
> * this.  So we use BRW_SURFACE_2D, since for rendering purposes this is
> * equivalent.
> */
>surftype = BRW_SURFACE_2D;
>depth *= 6;
>break;
>
> So I guess we simply forgot to port this workaround to Vulkan.
>
> Fixes:
> dEQP-VK.geometry.layered.cube*
> ---
>
> With this (and the previous patch I sent to fix the SBE state packet to not
> skip the VUE header when we need the layer information) all the layered
> rendering tests in Vulkan CTS seem to pass.
>
>  src/intel/isl/isl_surface_state.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/src/intel/isl/isl_surface_state.c
> b/src/intel/isl/isl_surface_state.c
> index 3bb0abd..0960a90 100644
> --- a/src/intel/isl/isl_surface_state.c
> +++ b/src/intel/isl/isl_surface_state.c
> @@ -113,8 +113,9 @@ get_surftype(enum isl_surf_dim dim,
> isl_surf_usage_flags_t usage)
>assert(!(usage & ISL_SURF_USAGE_CUBE_BIT));
>return SURFTYPE_1D;
> case ISL_SURF_DIM_2D:
> -  if (usage & ISL_SURF_USAGE_STORAGE_BIT) {
> - /* Storage images are always plain 2-D, not cube */
> +  if ((usage & ISL_SURF_USAGE_STORAGE_BIT) ||
> +  (usage & ISL_SURF_USAGE_RENDER_TARGET_BIT)) {
> + /* Storage / Render images are always plain 2-D, not cube */
>   return SURFTYPE_2D;
>} else if (usage & ISL_SURF_USAGE_CUBE_BIT) {
>   return SURFTYPE_CUBE;
> --
> 2.7.4
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 7/8] android: support creating texture from gralloc buffer

2017-01-06 Thread Rob Herring
On Fri, Jan 6, 2017 at 11:35 AM, Wu Zhen  wrote:
> From: WuZhen 
>
> Change-Id: Ifabf40fe94007f73171a89b23545002707817053
> Reviewed-by: Mauro Rossi 
> Reviewed-by: Chih-Wei Huang 
> ---
>  src/gallium/targets/dri/Android.mk        |  1 +
>  src/gallium/winsys/sw/dri/dri_sw_winsys.c | 65 +++
>  2 files changed, 66 insertions(+)

I don't think we want to add a gralloc dependency here. We should use
either kms-dri or GBM (and gbm_gralloc) here instead.

Rob
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 8/8] android: egl: add support for software rasterizer

2017-01-06 Thread Wu Zhen
From: WuZhen 

This commit enables software rendering on Android with llvmpipe.
The system boots fine and the AnTuTu 3D benchmark passes.

This commit incorporates further work done by:
Paulo Sergio Travaglia 
Mauro Rossi 

Change-Id: Ibe0114333a278fd5e64632ac8c17cffde7c9b359
Reviewed-by: Mauro Rossi 
Reviewed-by: Chih-Wei Huang 
---
 src/egl/Android.mk  |   1 +
 src/egl/drivers/dri2/egl_dri2.c |   1 +
 src/egl/drivers/dri2/platform_android.c | 389 +++-
 3 files changed, 386 insertions(+), 5 deletions(-)

diff --git a/src/egl/Android.mk b/src/egl/Android.mk
index bfd56a744d..d63e71da92 100644
--- a/src/egl/Android.mk
+++ b/src/egl/Android.mk
@@ -46,6 +46,7 @@ LOCAL_CFLAGS := \
 LOCAL_C_INCLUDES := \
$(MESA_TOP)/src/egl/main \
$(MESA_TOP)/src/egl/drivers/dri2 \
+   $(MESA_TOP)/src/gallium/include
 
 LOCAL_STATIC_LIBRARIES := \
libmesa_loader
diff --git a/src/egl/drivers/dri2/egl_dri2.c b/src/egl/drivers/dri2/egl_dri2.c
index 52fbdff0b1..bdb3119496 100644
--- a/src/egl/drivers/dri2/egl_dri2.c
+++ b/src/egl/drivers/dri2/egl_dri2.c
@@ -402,6 +402,7 @@ static const struct dri2_extension_match 
swrast_driver_extensions[] = {
 
 static const struct dri2_extension_match swrast_core_extensions[] = {
{ __DRI_TEX_BUFFER, 2, offsetof(struct dri2_egl_display, tex_buffer) },
+   { __DRI_IMAGE, 1, offsetof(struct dri2_egl_display, image) },
{ NULL, 0, 0 }
 };
 
diff --git a/src/egl/drivers/dri2/platform_android.c 
b/src/egl/drivers/dri2/platform_android.c
index 1c880f934a..61c0aa4818 100644
--- a/src/egl/drivers/dri2/platform_android.c
+++ b/src/egl/drivers/dri2/platform_android.c
@@ -40,6 +40,7 @@
 #include "loader.h"
 #include "egl_dri2.h"
 #include "egl_dri2_fallbacks.h"
+#include "state_tracker/drm_driver.h"
 #include "gralloc_drm.h"
 
 #define ALIGN(val, align)  (((val) + (align) - 1) & ~((align) - 1))
@@ -157,6 +158,8 @@ get_native_buffer_name(struct ANativeWindowBuffer *buf)
return gralloc_drm_get_gem_handle(buf->handle);
 }
 
+static const gralloc_module_t *gr_module = NULL;
+
 static EGLBoolean
 droid_window_dequeue_buffer(struct dri2_egl_surface *dri2_surf)
 {
@@ -338,9 +341,14 @@ droid_create_surface(_EGLDriver *drv, _EGLDisplay *disp, 
EGLint type,
if (!config)
   goto cleanup_surface;
 
-   dri2_surf->dri_drawable =
-  (*dri2_dpy->dri2->createNewDrawable)(dri2_dpy->dri_screen, config,
-   dri2_surf);
+   if (dri2_dpy->dri2) {
+  dri2_surf->dri_drawable =
+ dri2_dpy->dri2->createNewDrawable(dri2_dpy->dri_screen, config, 
dri2_surf);
+   } else {
+  dri2_surf->dri_drawable =
+ dri2_dpy->swrast->createNewDrawable(dri2_dpy->dri_screen, config, 
dri2_surf);
+   }
+
if (dri2_surf->dri_drawable == NULL) {
   _eglError(EGL_BAD_ALLOC, "dri2->createNewDrawable");
   goto cleanup_surface;
@@ -980,6 +988,259 @@ droid_add_configs_for_visuals(_EGLDriver *drv, 
_EGLDisplay *dpy)
return (count != 0);
 }
 
+static int swrastUpdateBuffer(struct dri2_egl_surface *dri2_surf)
+{
+   if (dri2_surf->base.Type == EGL_WINDOW_BIT) {
+   if (!dri2_surf->buffer && !droid_window_dequeue_buffer(dri2_surf)) {
+  _eglLog(_EGL_WARNING, "failed to dequeue buffer for window");
+  return 1;
+   }
+   dri2_surf->base.Width = dri2_surf->buffer->width;
+   dri2_surf->base.Height = dri2_surf->buffer->height;
+   }
+   return 0;
+}
+
+static void
+swrastGetDrawableInfo(__DRIdrawable * draw,
+  int *x, int *y, int *w, int *h,
+  void *loaderPrivate)
+{
+   struct dri2_egl_surface *dri2_surf = loaderPrivate;
+
+   swrastUpdateBuffer(dri2_surf);
+
+   *x = 0;
+   *y = 0;
+   *w = dri2_surf->base.Width;
+   *h = dri2_surf->base.Height;
+}
+
+static void
+swrastPutImage2(__DRIdrawable * draw, int op,
+int x, int y, int w, int h, int stride,
+char *data, void *loaderPrivate)
+{
+   struct dri2_egl_surface *dri2_surf = loaderPrivate;
+   struct _EGLDisplay *egl_dpy = dri2_surf->base.Resource.Display;
+   char *dstPtr, *srcPtr;
+   size_t BPerPixel, dstStride, copyWidth, xOffset;
+
+   if (swrastUpdateBuffer(dri2_surf)) {
+  return;
+   }
+
+   BPerPixel = get_format_bpp(dri2_surf->buffer->format);
+   dstStride = BPerPixel * dri2_surf->buffer->stride;
+   copyWidth = BPerPixel * w;
+   xOffset = BPerPixel * x;
+
+   /* drivers expect we do these checks (and some rely on it) */
+   if (copyWidth > dstStride - xOffset)
+  copyWidth = dstStride - xOffset;
+   if (h > dri2_surf->base.Height - y)
+  h = dri2_surf->base.Height - y;
+
+   if (gr_module->lock(gr_module, dri2_surf->buffer->handle, GRALLOC_USAGE_SW_READ_OFTEN | GRALLOC_USAGE_SW_WRITE_OFTEN,
+   0, 0, dri2_surf->buffer->width, dri2_surf->buffer->height, (void**)&dstPtr)) {
+   
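
For reviewers, the clamping done in swrastPutImage2() above can be looked at in isolation. The sketch below is not the patch's code: the struct and function names are hypothetical, and it adds a guard the patch does not have, because the widths are size_t and `dstStride - xOffset` would wrap around if the x offset were ever larger than the stride.

```c
#include <stddef.h>

/* Hypothetical result type; not part of the patch. */
struct sketch_copy_extent {
   size_t x_offset_bytes;  /* byte offset of the first copied pixel in a row */
   size_t row_bytes;       /* bytes copied per row after clamping */
   int rows;               /* rows copied after clamping */
};

static struct sketch_copy_extent
sketch_clamp_copy(size_t bytes_per_pixel, size_t stride_pixels,
                  int surf_height, int x, int y, int w, int h)
{
   struct sketch_copy_extent e = { 0, 0, 0 };
   size_t dst_stride = bytes_per_pixel * stride_pixels;

   /* Extra guard, not in the patch: reject negative or empty requests
    * and origins below the surface. */
   if (x < 0 || y < 0 || w <= 0 || h <= 0 || y >= surf_height)
      return e;

   /* Extra guard, not in the patch: keeps the unsigned subtraction
    * below from wrapping. */
   if (bytes_per_pixel * (size_t)x >= dst_stride)
      return e;
   e.x_offset_bytes = bytes_per_pixel * (size_t)x;

   /* Same clamp as the patch: never copy past the end of a row. */
   e.row_bytes = bytes_per_pixel * (size_t)w;
   if (e.row_bytes > dst_stride - e.x_offset_bytes)
      e.row_bytes = dst_stride - e.x_offset_bytes;

   /* Same clamp as the patch: never copy past the last row. */
   e.rows = h;
   if (e.rows > surf_height - y)
      e.rows = surf_height - y;

   return e;
}
```

For a 4-byte format, a stride of 100 pixels, and a 50-row surface, a request at (90, 45) of size 20x10 is clamped to 40 bytes per row and 5 rows.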

[Mesa-dev] [PATCH 6/8] drisw: support fence extension and image extension

2017-01-06 Thread Wu Zhen
From: WuZhen 

Add a new winsys handle type that allows passing
a pointer-sized handle to the winsys.

Change-Id: I3bf1732619206d2bc50f6aca6b27258bb026a212
Reviewed-by: Mauro Rossi 
Reviewed-by: Chih-Wei Huang 
---
 include/GL/internal/dri_interface.h| 14 ++-
 src/gallium/include/state_tracker/drm_driver.h | 10 -
 src/gallium/state_trackers/dri/dri2.c  | 12 +++---
 src/gallium/state_trackers/dri/drisw.c | 55 ++
 src/mesa/drivers/dri/common/dri_util.c |  4 +-
 src/mesa/drivers/dri/common/dri_util.h |  2 +-
 6 files changed, 85 insertions(+), 12 deletions(-)

diff --git a/include/GL/internal/dri_interface.h 
b/include/GL/internal/dri_interface.h
index 8922356990..a84bef90a0 100644
--- a/include/GL/internal/dri_interface.h
+++ b/include/GL/internal/dri_interface.h
@@ -62,6 +62,7 @@ typedef struct __DRIdrawableRec   __DRIdrawable;
 typedef struct __DRIconfigRec  __DRIconfig;
 typedef struct __DRIframebufferRec __DRIframebuffer;
 typedef struct __DRIversionRec __DRIversion;
+typedef struct __DRIimageRec__DRIimage;
 
 typedef struct __DRIcoreExtensionRec   __DRIcoreExtension;
 typedef struct __DRIextensionRec   __DRIextension;
@@ -861,8 +862,9 @@ struct __DRIlegacyExtensionRec {
  * conjunction with the core extension.
  */
 #define __DRI_SWRAST "DRI_SWRast"
-#define __DRI_SWRAST_VERSION 4
+#define __DRI_SWRAST_VERSION 5
 
+struct winsys_handle;
 struct __DRIswrastExtensionRec {
 __DRIextension base;
 
@@ -909,6 +911,15 @@ struct __DRIswrastExtensionRec {
 const __DRIconfig ***driver_configs,
 void *loaderPrivate);
 
+   /**
+* create a dri image from native window system handle
+*
+* \since version 5
+*/
+   __DRIimage *(*createImageFromWinsys)(__DRIscreen *_screen,
+int width, int height, int format,
+int num_handles, struct winsys_handle 
*whandle,
+void *loaderPrivate);
 };
 
 /** Common DRI function definitions, shared among DRI2 and Image extensions
@@ -1308,7 +1319,6 @@ enum __DRIChromaSiting {
 #define __BLIT_FLAG_FLUSH  0x0001
 #define __BLIT_FLAG_FINISH 0x0002
 
-typedef struct __DRIimageRec  __DRIimage;
 typedef struct __DRIimageExtensionRec __DRIimageExtension;
 struct __DRIimageExtensionRec {
 __DRIextension base;
diff --git a/src/gallium/include/state_tracker/drm_driver.h 
b/src/gallium/include/state_tracker/drm_driver.h
index c80fb09dbc..e4d8f17ceb 100644
--- a/src/gallium/include/state_tracker/drm_driver.h
+++ b/src/gallium/include/state_tracker/drm_driver.h
@@ -11,6 +11,7 @@ struct pipe_resource;
 #define DRM_API_HANDLE_TYPE_SHARED 0
 #define DRM_API_HANDLE_TYPE_KMS1
 #define DRM_API_HANDLE_TYPE_FD 2
+#define DRM_API_HANDLE_TYPE_BUFFER 3
 
 
 /**
@@ -20,7 +21,7 @@ struct winsys_handle
 {
/**
 * Input for texture_from_handle, valid values are
-* DRM_API_HANDLE_TYPE_SHARED or DRM_API_HANDLE_TYPE_FD.
+* DRM_API_HANDLE_TYPE_SHARED or DRM_API_HANDLE_TYPE_FD or 
DRM_API_HANDLE_TYPE_BUFFER.
 * Input to texture_get_handle,
 * to select handle for kms, flink, or prime.
 */
@@ -30,6 +31,13 @@ struct winsys_handle
 * of a specific layer of an array texture.
 */
unsigned layer;
+
+   /**
+* Input to texture_from_handle.
+* Output for texture_get_handle.
+*/
+   void* external_buffer;
+
/**
 * Input to texture_from_handle.
 * Output for texture_get_handle.
diff --git a/src/gallium/state_trackers/dri/dri2.c 
b/src/gallium/state_trackers/dri/dri2.c
index 77523e98ff..b9d7bca711 100644
--- a/src/gallium/state_trackers/dri/dri2.c
+++ b/src/gallium/state_trackers/dri/dri2.c
@@ -111,7 +111,7 @@ static int convert_fourcc(int format, int *dri_components_p)
  * only needed for exporting dmabuf's, so I think I won't loose much
  * sleep over it.
  */
-static int convert_to_fourcc(int format)
+int convert_to_fourcc(int format)
 {
switch(format) {
case __DRI_IMAGE_FORMAT_RGB565:
@@ -765,7 +765,7 @@ dri2_update_tex_buffer(struct dri_drawable *drawable,
/* no-op */
 }
 
-static __DRIimage *
+__DRIimage *
 dri2_lookup_egl_image(struct dri_screen *screen, void *handle)
 {
const __DRIimageLookupExtension *loader = screen->sPriv->dri2.image;
@@ -780,7 +780,7 @@ dri2_lookup_egl_image(struct dri_screen *screen, void 
*handle)
return img;
 }
 
-static __DRIimage *
+__DRIimage *
 dri2_create_image_from_winsys(__DRIscreen *_screen,
   int width, int height, int format,
   int num_handles, struct winsys_handle *whandle,
@@ -1173,7 +1173,7 @@ dri2_from_planar(__DRIimage *image, int plane, void 
*loaderPrivate)
return img;
 }
 
-static 

[Mesa-dev] [PATCH 7/8] android: support creating texture from gralloc buffer

2017-01-06 Thread Wu Zhen
From: WuZhen 

Change-Id: Ifabf40fe94007f73171a89b23545002707817053
Reviewed-by: Mauro Rossi 
Reviewed-by: Chih-Wei Huang 
---
 src/gallium/targets/dri/Android.mk|  1 +
 src/gallium/winsys/sw/dri/dri_sw_winsys.c | 65 +++
 2 files changed, 66 insertions(+)

diff --git a/src/gallium/targets/dri/Android.mk 
b/src/gallium/targets/dri/Android.mk
index 5a71867381..02b3f76f09 100644
--- a/src/gallium/targets/dri/Android.mk
+++ b/src/gallium/targets/dri/Android.mk
@@ -43,6 +43,7 @@ LOCAL_SHARED_LIBRARIES := \
liblog \
libglapi \
libexpat \
+   libhardware \
 
 ifneq ($(filter freedreno,$(MESA_GPU_DRIVERS)),)
 LOCAL_CFLAGS += -DGALLIUM_FREEDRENO
diff --git a/src/gallium/winsys/sw/dri/dri_sw_winsys.c 
b/src/gallium/winsys/sw/dri/dri_sw_winsys.c
index 94d5092405..bbc7f08f5a 100644
--- a/src/gallium/winsys/sw/dri/dri_sw_winsys.c
+++ b/src/gallium/winsys/sw/dri/dri_sw_winsys.c
@@ -34,8 +34,15 @@
 #include "util/u_memory.h"
 
 #include "state_tracker/sw_winsys.h"
+#include "state_tracker/drm_driver.h"
 #include "dri_sw_winsys.h"
 
+#ifdef HAVE_ANDROID_PLATFORM
+#include 
+#include 
+#include 
+#endif
+
 
 struct dri_sw_displaytarget
 {
@@ -45,11 +52,31 @@ struct dri_sw_displaytarget
unsigned stride;
 
unsigned map_flags;
+#ifdef HAVE_ANDROID_PLATFORM
+   struct ANativeWindowBuffer *androidBuffer;
+#endif
void *data;
void *mapped;
const void *front_private;
 };
 
+#ifdef HAVE_ANDROID_PLATFORM
+const struct gralloc_module_t* get_gralloc()
+{
+   static const struct gralloc_module_t* gr_module = NULL;
+   const hw_module_t *mod;
+   int err;
+
+   if (!gr_module) {
+  err =  hw_get_module(GRALLOC_HARDWARE_MODULE_ID, &mod);
+  if (!err) {
+ gr_module = (gralloc_module_t *) mod;
+  }
+   }
+   return gr_module;
+}
+#endif
+
 struct dri_sw_winsys
 {
struct sw_winsys base;
@@ -125,6 +152,12 @@ dri_sw_displaytarget_destroy(struct sw_winsys *ws,
 {
struct dri_sw_displaytarget *dri_sw_dt = dri_sw_displaytarget(dt);
 
+#ifdef HAVE_ANDROID_PLATFORM
+   if (dri_sw_dt->androidBuffer) {
+      dri_sw_dt->androidBuffer->common.decRef(&dri_sw_dt->androidBuffer->common);
+   }
+#endif
+
align_free(dri_sw_dt->data);
 
FREE(dri_sw_dt);
@@ -136,6 +169,17 @@ dri_sw_displaytarget_map(struct sw_winsys *ws,
  unsigned flags)
 {
struct dri_sw_displaytarget *dri_sw_dt = dri_sw_displaytarget(dt);
+#ifdef HAVE_ANDROID_PLATFORM
+   if (dri_sw_dt->androidBuffer) {
+  if (!get_gralloc()->lock(get_gralloc(), dri_sw_dt->androidBuffer->handle,
+  GRALLOC_USAGE_SW_READ_OFTEN | GRALLOC_USAGE_SW_WRITE_OFTEN,
+  0, 0, dri_sw_dt->androidBuffer->width, dri_sw_dt->androidBuffer->height,
+  (void**)&dri_sw_dt->mapped)) {
+ dri_sw_dt->map_flags = flags;
+ return dri_sw_dt->mapped;
+  }
+   }
+#endif
dri_sw_dt->mapped = dri_sw_dt->data;
 
if (dri_sw_dt->front_private && (flags & PIPE_TRANSFER_READ)) {
@@ -156,6 +200,11 @@ dri_sw_displaytarget_unmap(struct sw_winsys *ws,
   dri_sw_ws->lf->put_image2((void *)dri_sw_dt->front_private, 
dri_sw_dt->data, 0, 0, dri_sw_dt->width, dri_sw_dt->height, dri_sw_dt->stride);
}
dri_sw_dt->map_flags = 0;
+#ifdef HAVE_ANDROID_PLATFORM
+   if (dri_sw_dt->androidBuffer) {
+  get_gralloc()->unlock(get_gralloc(), dri_sw_dt->androidBuffer->handle);
+   }
+#endif
dri_sw_dt->mapped = NULL;
 }
 
@@ -165,6 +214,22 @@ dri_sw_displaytarget_from_handle(struct sw_winsys *winsys,
  struct winsys_handle *whandle,
  unsigned *stride)
 {
+#ifdef HAVE_ANDROID_PLATFORM
+   struct dri_sw_displaytarget *dri_sw_dt;
+
+   if (whandle->type == DRM_API_HANDLE_TYPE_BUFFER) {
+  dri_sw_dt = CALLOC_STRUCT(dri_sw_displaytarget);
+  dri_sw_dt->width = templ->width0;
+  dri_sw_dt->height = templ->height0;
+  dri_sw_dt->androidBuffer = whandle->external_buffer;
+  dri_sw_dt->stride = whandle->stride;
+
+      dri_sw_dt->androidBuffer->common.incRef(&dri_sw_dt->androidBuffer->common);
+  *stride = dri_sw_dt->stride;
+
+  return dri_sw_dt;
+   }
+#endif
assert(0);
return NULL;
 }
-- 
2.11.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
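
A note on get_gralloc() in the patch above: it is a lazy, cached module lookup. The pattern can be sketched without the Android headers as follows. Everything named sketch_* is hypothetical, and sketch_get_module() merely stands in for hw_get_module(), which is assumed here to return 0 on success.

```c
#include <stddef.h>

/* Hypothetical module type standing in for gralloc_module_t. */
struct sketch_module {
   int id;
};

static int sketch_lookup_calls;
static struct sketch_module sketch_the_module = { 42 };

/* Hypothetical stand-in for hw_get_module(): 0 on success. */
static int
sketch_get_module(const struct sketch_module **out)
{
   sketch_lookup_calls++;
   *out = &sketch_the_module;
   return 0;
}

/* Look the module up on first use and cache the result.  Returns NULL
 * if the lookup failed, so callers need to check before dereferencing. */
static const struct sketch_module *
sketch_get_cached_module(void)
{
   static const struct sketch_module *cached = NULL;
   const struct sketch_module *mod;

   if (!cached) {
      if (sketch_get_module(&mod) == 0)
         cached = mod;
   }
   return cached;
}
```

One thing the sketch makes visible: the getter can return NULL when the lookup fails, while the map and unmap paths in the patch call get_gralloc()->lock() and ->unlock() without checking. It may be worth guarding those calls.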


[Mesa-dev] [PATCH 5/8] android: add Android.mk for llvmpipe

2017-01-06 Thread Wu Zhen
From: WuZhen 

rename old swrast to softpipe, add a new driver llvmpipe

Change-Id: Ia8bc1005ad6846df78bc1f6d0a4196310a049aca
Reviewed-by: Mauro Rossi 
Reviewed-by: Chih-Wei Huang 
---
 Android.common.mk|  2 +-
 Android.mk   |  6 ++---
 src/gallium/Android.mk   |  4 ++-
 src/gallium/auxiliary/pipe-loader/Android.mk |  2 +-
 src/gallium/drivers/llvmpipe/Android.mk  | 39 
 src/gallium/state_trackers/dri/Android.mk|  4 +--
 src/gallium/targets/dri/Android.mk   |  6 -
 7 files changed, 54 insertions(+), 9 deletions(-)
 create mode 100644 src/gallium/drivers/llvmpipe/Android.mk

diff --git a/Android.common.mk b/Android.common.mk
index cb2a4e6104..023895bffd 100644
--- a/Android.common.mk
+++ b/Android.common.mk
@@ -85,7 +85,7 @@ endif
 
 ifneq ($(LOCAL_IS_HOST_MODULE),true)
 # add libdrm if there are hardware drivers
-ifneq ($(filter-out swrast,$(MESA_GPU_DRIVERS)),)
+ifneq ($(filter-out llvmpipe softpipe,$(MESA_GPU_DRIVERS)),)
 LOCAL_CFLAGS += -DHAVE_LIBDRM
 LOCAL_SHARED_LIBRARIES += libdrm
 endif
diff --git a/Android.mk b/Android.mk
index b52e7f8232..9ef99377a1 100644
--- a/Android.mk
+++ b/Android.mk
@@ -24,7 +24,7 @@
 # BOARD_GPU_DRIVERS should be defined.  The valid values are
 #
 #   classic drivers: i915 i965
-#   gallium drivers: swrast freedreno i915g ilo nouveau r300g r600g radeonsi 
vc4 virgl vmwgfx
+#   gallium drivers: llvmpipe softpipe freedreno i915g ilo nouveau r300g r600g 
radeonsi vc4 virgl vmwgfx
 #
 # The main target is libGLES_mesa.  For each classic driver enabled, a DRI
 # module will also be built.  DRI modules will be loaded by libGLES_mesa.
@@ -50,7 +50,7 @@ MESA_COMMON_MK := $(MESA_TOP)/Android.common.mk
 MESA_PYTHON2 := python
 
 classic_drivers := i915 i965
-gallium_drivers := swrast freedreno i915g ilo nouveau r300g r600g radeonsi 
vmwgfx vc4 virgl
+gallium_drivers := llvmpipe softpipe freedreno i915g ilo nouveau r300g r600g 
radeonsi vmwgfx vc4 virgl
 
 MESA_GPU_DRIVERS := $(strip $(BOARD_GPU_DRIVERS))
 
@@ -82,7 +82,7 @@ else
 MESA_BUILD_GALLIUM := false
 endif
 
-MESA_ENABLE_LLVM := $(if $(filter radeonsi,$(MESA_GPU_DRIVERS)),true,false)
+MESA_ENABLE_LLVM := $(if $(filter radeonsi 
llvmpipe,$(MESA_GPU_DRIVERS)),true,false)
 
 # add subdirectories
 ifneq ($(strip $(MESA_GPU_DRIVERS)),)
diff --git a/src/gallium/Android.mk b/src/gallium/Android.mk
index 2b469b65ee..1c719d1968 100644
--- a/src/gallium/Android.mk
+++ b/src/gallium/Android.mk
@@ -34,7 +34,9 @@ SUBDIRS += auxiliary/pipe-loader
 #
 
 # swrast
-ifneq ($(filter swrast,$(MESA_GPU_DRIVERS)),)
+ifneq ($(filter llvmpipe,$(MESA_GPU_DRIVERS)),)
+SUBDIRS += winsys/sw/dri drivers/llvmpipe drivers/softpipe
+else ifneq ($(filter softpipe,$(MESA_GPU_DRIVERS)),)
 SUBDIRS += winsys/sw/dri drivers/softpipe
 endif
 
diff --git a/src/gallium/auxiliary/pipe-loader/Android.mk 
b/src/gallium/auxiliary/pipe-loader/Android.mk
index 006bb0ebfd..3f0563afb6 100644
--- a/src/gallium/auxiliary/pipe-loader/Android.mk
+++ b/src/gallium/auxiliary/pipe-loader/Android.mk
@@ -37,7 +37,7 @@ LOCAL_SRC_FILES := $(COMMON_SOURCES)
 
 LOCAL_MODULE := libmesa_pipe_loader
 
-ifneq ($(filter-out swrast,$(MESA_GPU_DRIVERS)),)
+ifneq ($(filter-out llvmpipe softpipe,$(MESA_GPU_DRIVERS)),)
 LOCAL_SRC_FILES += $(DRM_SOURCES)
 LOCAL_STATIC_LIBRARIES := libmesa_loader
 endif
diff --git a/src/gallium/drivers/llvmpipe/Android.mk 
b/src/gallium/drivers/llvmpipe/Android.mk
new file mode 100644
index 00..0193071e60
--- /dev/null
+++ b/src/gallium/drivers/llvmpipe/Android.mk
@@ -0,0 +1,39 @@
+# Mesa 3-D graphics library
+#
+# Copyright (C) 2015-2016 Zhen Wu 
+# Copyright (C) 2015-2016 Jide Inc.
+#
+# Permission is hereby granted, free of charge, to any person obtaining a
+# copy of this software and associated documentation files (the "Software"),
+# to deal in the Software without restriction, including without limitation
+# the rights to use, copy, modify, merge, publish, distribute, sublicense,
+# and/or sell copies of the Software, and to permit persons to whom the
+# Software is furnished to do so, subject to the following conditions:
+#
+# The above copyright notice and this permission notice shall be included
+# in all copies or substantial portions of the Software.
+#
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+# DEALINGS IN THE SOFTWARE.
+
+LOCAL_PATH := $(call my-dir)
+
+# get C_SOURCES
+include $(LOCAL_PATH)/Makefile.sources
+
+include 

[Mesa-dev] [PATCH 4/8] android: fix llvmpipe build

2017-01-06 Thread Wu Zhen
From: WuZhen 

Since cf410574 ("gallivm: Make MCJIT a runtime option."), llvmpipe assumes
MCJIT is available on x86(_64). This is not the case for Android prior to M.

Change-Id: I6c82915b043f65e15ffc73aecff6779a1d02f1f1
Reviewed-by: Mauro Rossi 
Reviewed-by: Chih-Wei Huang 
---
 Android.common.mk   |  1 +
 src/gallium/auxiliary/gallivm/lp_bld_init.c | 12 ++--
 src/mesa/Android.libmesa_st_mesa.mk |  9 -
 3 files changed, 19 insertions(+), 3 deletions(-)

diff --git a/Android.common.mk b/Android.common.mk
index 7ab3942ee2..cb2a4e6104 100644
--- a/Android.common.mk
+++ b/Android.common.mk
@@ -59,6 +59,7 @@ LOCAL_CFLAGS += \
-DHAVE___BUILTIN_UNREACHABLE \
-DHAVE_PTHREAD=1 \
-DHAVE_DLOPEN \
+   -DHAVE_ANDROID_PLATFORM \
-fvisibility=hidden \
-Wno-sign-compare
 
diff --git a/src/gallium/auxiliary/gallivm/lp_bld_init.c 
b/src/gallium/auxiliary/gallivm/lp_bld_init.c
index d1b2369f34..a378f0c850 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_init.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_init.c
@@ -46,6 +46,8 @@
 /* Only MCJIT is available as of LLVM SVN r216982 */
 #if HAVE_LLVM >= 0x0306
 #  define USE_MCJIT 1
+#elif defined(HAVE_ANDROID_PLATFORM)
+#  define USE_MCJIT 0
 #elif defined(PIPE_ARCH_PPC_64) || defined(PIPE_ARCH_S390) || 
defined(PIPE_ARCH_ARM) || defined(PIPE_ARCH_AARCH64)
 #  define USE_MCJIT 1
 #else
@@ -395,9 +397,15 @@ lp_build_init(void)
if (gallivm_initialized)
   return TRUE;
 
-   LLVMLinkInMCJIT();
-#if !defined(USE_MCJIT)
+#ifdef USE_MCJIT
+   #if USE_MCJIT
+  LLVMLinkInMCJIT();
+   #else
+  LLVMLinkInJIT();
+   #endif
+#else
USE_MCJIT = debug_get_bool_option("GALLIVM_MCJIT", 0);
+   LLVMLinkInMCJIT();
LLVMLinkInJIT();
 #endif
 
diff --git a/src/mesa/Android.libmesa_st_mesa.mk 
b/src/mesa/Android.libmesa_st_mesa.mk
index 90e4ccd210..d9b4129315 100644
--- a/src/mesa/Android.libmesa_st_mesa.mk
+++ b/src/mesa/Android.libmesa_st_mesa.mk
@@ -67,7 +67,14 @@ LOCAL_WHOLE_STATIC_LIBRARIES += \
 
 LOCAL_STATIC_LIBRARIES += libmesa_nir libmesa_glsl
 
-include external/libcxx/libcxx.mk
+ifeq ($(MESA_LOLLIPOP_BUILD),true)
+-include external/libcxx/libcxx.mk
+LOCAL_CXX_STL := libc++
+else
+include external/stlport/libstlport.mk
+endif
+
+
 include $(LOCAL_PATH)/Android.gen.mk
 include $(MESA_COMMON_MK)
 include $(BUILD_STATIC_LIBRARY)
-- 
2.11.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
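
The USE_MCJIT selection in lp_bld_init.c after this patch is a preprocessor chain, which is hard to eyeball. As a reading aid, the same decision order can be written as a plain function. This is only a sketch of how I read the hunk: the enum, the function, and its parameters are hypothetical, and the real code decides at compile time rather than at run time.

```c
#include <stdbool.h>

/* Hypothetical outcome of the selection. */
enum sketch_jit {
   SKETCH_JIT_LEGACY,          /* USE_MCJIT defined to 0 */
   SKETCH_JIT_MCJIT,           /* USE_MCJIT defined to 1 */
   SKETCH_JIT_RUNTIME_CHOICE,  /* USE_MCJIT is a variable set from the
                                * GALLIVM_MCJIT debug option */
};

/* Mirrors the order of the #if / #elif chain in the patch:
 * LLVM version first, then Android, then MCJIT-only architectures. */
static enum sketch_jit
sketch_select_jit(unsigned llvm_version, bool is_android,
                  bool arch_needs_mcjit)
{
   if (llvm_version >= 0x0306)
      return SKETCH_JIT_MCJIT;
   if (is_android)
      return SKETCH_JIT_LEGACY;
   if (arch_needs_mcjit)
      return SKETCH_JIT_MCJIT;
   return SKETCH_JIT_RUNTIME_CHOICE;
}
```

Note the consequence of the ordering: with an old LLVM, an Android build on an architecture in the MCJIT-only list (ARM, for example) would also end up on the legacy JIT, because the Android branch is checked first.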


[Mesa-dev] [PATCH 3/8] android: remove static linking LLVM parts.

2017-01-06 Thread Wu Zhen
From: WuZhen 

Linking against LLVM as both a static and a shared library causes
problems.
This requires a companion change in Android's LLVM.

Change-Id: I1d459135f7e5e242164abe38cd15f0a49448e495
Reviewed-by: Mauro Rossi 
Reviewed-by: Chih-Wei Huang 
---
 src/gallium/targets/dri/Android.mk | 7 +--
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/src/gallium/targets/dri/Android.mk 
b/src/gallium/targets/dri/Android.mk
index 0333641d97..972ea83530 100644
--- a/src/gallium/targets/dri/Android.mk
+++ b/src/gallium/targets/dri/Android.mk
@@ -118,12 +118,7 @@ LOCAL_WHOLE_STATIC_LIBRARIES := \
 LOCAL_STATIC_LIBRARIES :=
 
 ifeq ($(MESA_ENABLE_LLVM),true)
-LOCAL_STATIC_LIBRARIES += \
-   libLLVMR600CodeGen \
-   libLLVMR600Desc \
-   libLLVMR600Info \
-   libLLVMR600AsmPrinter \
-   libelf
+LOCAL_STATIC_LIBRARIES += libelf
 LOCAL_LDLIBS += $(if $(filter true,$(MESA_LOLLIPOP_BUILD)),-lgcc)
 endif
 
-- 
2.11.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/8] android: fix building on lollipop

2017-01-06 Thread Wu Zhen
From: WuZhen 

This commit fixes building Mesa on Lollipop. However, the LLVM shipped
with Lollipop is too old to build amdgpu.

based on initial work by Mauro Rossi 

Change-Id: I98d646f9e1c61fe2754479382885718386a8bbb7
Reviewed-by: Mauro Rossi 
Reviewed-by: Chih-Wei Huang 
---
 Android.common.mk   | 2 +-
 Android.mk  | 5 -
 src/gbm/Android.mk  | 1 +
 src/mesa/Android.libmesa_st_mesa.mk | 1 +
 4 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/Android.common.mk b/Android.common.mk
index 9f64c220f8..7ab3942ee2 100644
--- a/Android.common.mk
+++ b/Android.common.mk
@@ -91,7 +91,7 @@ endif
 endif
 
 LOCAL_CPPFLAGS += \
-   $(if $(filter true,$(MESA_LOLLIPOP_BUILD)),-D_USING_LIBCXX) \
+   $(if $(filter true,$(MESA_LOLLIPOP_BUILD)),-std=c++11) \
-Wno-error=non-virtual-dtor \
-Wno-non-virtual-dtor
 
diff --git a/Android.mk b/Android.mk
index fb29105a60..b52e7f8232 100644
--- a/Android.mk
+++ b/Android.mk
@@ -95,10 +95,13 @@ SUBDIRS := \
src/mesa \
src/util \
src/egl \
-   src/amd \
src/intel \
src/mesa/drivers/dri
 
+ifneq ($(filter r300g r600g radeonsi, $(MESA_GPU_DRIVERS)),)
+SUBDIRS += src/amd
+endif
+
 INC_DIRS := $(call all-named-subdir-makefiles,$(SUBDIRS))
 
 ifeq ($(strip $(MESA_BUILD_GALLIUM)),true)
diff --git a/src/gbm/Android.mk b/src/gbm/Android.mk
index a3f8fbbeab..89127766e6 100644
--- a/src/gbm/Android.mk
+++ b/src/gbm/Android.mk
@@ -33,6 +33,7 @@ LOCAL_C_INCLUDES := \
$(LOCAL_PATH)/main
 
 LOCAL_STATIC_LIBRARIES := libmesa_loader
+LOCAL_SHARED_LIBRARIES := libdl
 LOCAL_MODULE := libgbm
 
 LOCAL_SRC_FILES := \
diff --git a/src/mesa/Android.libmesa_st_mesa.mk 
b/src/mesa/Android.libmesa_st_mesa.mk
index 3905ddcf24..90e4ccd210 100644
--- a/src/mesa/Android.libmesa_st_mesa.mk
+++ b/src/mesa/Android.libmesa_st_mesa.mk
@@ -67,6 +67,7 @@ LOCAL_WHOLE_STATIC_LIBRARIES += \
 
 LOCAL_STATIC_LIBRARIES += libmesa_nir libmesa_glsl
 
+include external/libcxx/libcxx.mk
 include $(LOCAL_PATH)/Android.gen.mk
 include $(MESA_COMMON_MK)
 include $(BUILD_STATIC_LIBRARY)
-- 
2.11.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/8] android: print debug info to logcat

2017-01-06 Thread Wu Zhen
From: WuZhen 

Redirect logs printed to stderr to logcat.

Change-Id: I58e3966a608af361b86c54b4c95a92561b711968
Signed-off-by: Chih-Wei Huang 
Reviewed-by: Mauro Rossi 
Reviewed-by: Chih-Wei Huang 
---
 src/gallium/auxiliary/os/os_misc.c   | 12 ++--
 src/gallium/auxiliary/util/u_debug.c |  2 +-
 src/gallium/targets/dri/Android.mk   |  1 +
 src/mesa/main/errors.c   |  8 
 4 files changed, 20 insertions(+), 3 deletions(-)

diff --git a/src/gallium/auxiliary/os/os_misc.c 
b/src/gallium/auxiliary/os/os_misc.c
index 09d4400e08..a4d962868a 100644
--- a/src/gallium/auxiliary/os/os_misc.c
+++ b/src/gallium/auxiliary/os/os_misc.c
@@ -46,8 +46,10 @@
 
 #endif
 
-
-#if defined(PIPE_OS_LINUX) || defined(PIPE_OS_CYGWIN) || 
defined(PIPE_OS_SOLARIS)
+#if defined(PIPE_OS_ANDROID)
+#  define LOG_TAG "gallium"
+#  include 
+#elif defined(PIPE_OS_LINUX) || defined(PIPE_OS_CYGWIN) || 
defined(PIPE_OS_SOLARIS)
 #  include 
 #elif defined(PIPE_OS_APPLE) || defined(PIPE_OS_BSD)
 #  include 
@@ -100,6 +102,12 @@ os_log_message(const char *message)
   fflush(fout);
}
 #else /* !PIPE_SUBSYSTEM_WINDOWS */
+#if defined(PIPE_OS_ANDROID)
+   if (fout == stderr) {
+  ALOGD("%s", message);
+  return;
+   }
+#endif
fflush(stdout);
fputs(message, fout);
fflush(fout);
diff --git a/src/gallium/auxiliary/util/u_debug.c 
b/src/gallium/auxiliary/util/u_debug.c
index dd3e16791d..6ccc4f6ece 100644
--- a/src/gallium/auxiliary/util/u_debug.c
+++ b/src/gallium/auxiliary/util/u_debug.c
@@ -55,7 +55,7 @@ void
 _debug_vprintf(const char *format, va_list ap)
 {
static char buf[4096] = {'\0'};
-#if defined(PIPE_OS_WINDOWS) || defined(PIPE_SUBSYSTEM_EMBEDDED)
+#if defined(PIPE_OS_WINDOWS) || defined(PIPE_OS_ANDROID) || 
defined(PIPE_SUBSYSTEM_EMBEDDED)
/* We buffer until we find a newline. */
size_t len = strlen(buf);
int ret = util_vsnprintf(buf + len, sizeof(buf) - len, format, ap);
diff --git a/src/gallium/targets/dri/Android.mk 
b/src/gallium/targets/dri/Android.mk
index 950a46420c..0333641d97 100644
--- a/src/gallium/targets/dri/Android.mk
+++ b/src/gallium/targets/dri/Android.mk
@@ -40,6 +40,7 @@ LOCAL_CFLAGS :=
 
 LOCAL_SHARED_LIBRARIES := \
libdl \
+   liblog \
libglapi \
libexpat \
 
diff --git a/src/mesa/main/errors.c b/src/mesa/main/errors.c
index 3a40c7457a..d3121746a6 100644
--- a/src/mesa/main/errors.c
+++ b/src/mesa/main/errors.c
@@ -36,6 +36,10 @@
 #include "context.h"
 #include "debug_output.h"
 
+#if defined(ANDROID)
+#  define LOG_TAG "mesa"
+#  include 
+#endif
 
 static FILE *LogFile = NULL;
 
@@ -89,6 +93,10 @@ output_if_debug(const char *prefixString, const char 
*outputString,
  _mesa_snprintf(buf, sizeof(buf), "%s: %s%s", prefixString, 
outputString, newline ? "\n" : "");
  OutputDebugStringA(buf);
   }
+#elif defined(ANDROID)
+  {
+ ALOGD("%s: %s", prefixString, outputString);
+  }
 #endif
}
 }
-- 
2.11.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
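
The os_log_message() change above routes only stderr output to logcat and leaves other streams untouched. A portable sketch of that routing, for anyone who wants to try the logic off-device, could look like this. The names are hypothetical: sketch_logcat() stands in for ALOGD, and the on_android flag stands in for the PIPE_OS_ANDROID compile-time check.

```c
#include <stdio.h>
#include <string.h>

/* Last message handed to the logcat stand-in, kept for inspection. */
static char sketch_logcat_buf[256];

/* Hypothetical stand-in for ALOGD("%s", message). */
static void
sketch_logcat(const char *message)
{
   snprintf(sketch_logcat_buf, sizeof(sketch_logcat_buf), "%s", message);
}

/* Returns 1 if the message went to the logcat stand-in, 0 if it was
 * written to the given stream. */
static int
sketch_log_message(FILE *fout, const char *message, int on_android)
{
   if (on_android && fout == stderr) {
      sketch_logcat(message);
      return 1;
   }

   fflush(stdout);
   fputs(message, fout);
   fflush(fout);
   return 0;
}
```

A message sent to a log file or to stdout still goes to that stream, which matches the `fout == stderr` check in the patch.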


[Mesa-dev] [PATCH 0/8] enable llvmpipe on android

2017-01-06 Thread Wu Zhen
This series of patches enables llvmpipe on Android-x86 as
a fallback renderer. llvmpipe has been enabled and tested
on Remix OS (a variant of Android-x86) for about a year.

WuZhen (8):
  android: print debug info to logcat
  android: fix building on lollipop
  android: remove static linking LLVM parts.
  android: fix llvmpipe build
  android: add Android.mk for llvmpipe
  drisw: support fence externsion and image extension
  android: support creating texture from gralloc buffer
  android: egl: add support for software rasterizer

 Android.common.mk  |   5 +-
 Android.mk |  11 +-
 include/GL/internal/dri_interface.h|  14 +-
 src/egl/Android.mk |   1 +
 src/egl/drivers/dri2/egl_dri2.c|   1 +
 src/egl/drivers/dri2/platform_android.c| 389 -
 src/gallium/Android.mk |   4 +-
 src/gallium/auxiliary/gallivm/lp_bld_init.c|  12 +-
 src/gallium/auxiliary/os/os_misc.c |  12 +-
 src/gallium/auxiliary/pipe-loader/Android.mk   |   2 +-
 src/gallium/auxiliary/util/u_debug.c   |   2 +-
 src/gallium/drivers/llvmpipe/Android.mk|  39 +++
 src/gallium/include/state_tracker/drm_driver.h |  10 +-
 src/gallium/state_trackers/dri/Android.mk  |   4 +-
 src/gallium/state_trackers/dri/dri2.c  |  12 +-
 src/gallium/state_trackers/dri/drisw.c |  55 
 src/gallium/targets/dri/Android.mk |  15 +-
 src/gallium/winsys/sw/dri/dri_sw_winsys.c  |  65 +
 src/gbm/Android.mk |   1 +
 src/mesa/Android.libmesa_st_mesa.mk|   8 +
 src/mesa/drivers/dri/common/dri_util.c |   4 +-
 src/mesa/drivers/dri/common/dri_util.h |   2 +-
 src/mesa/main/errors.c |   8 +
 23 files changed, 637 insertions(+), 39 deletions(-)
 create mode 100644 src/gallium/drivers/llvmpipe/Android.mk

-- 
2.11.0



[Mesa-dev] [Bug 97102] [dri][swr] stack overflow / infinite loop with GALLIUM_DRIVER=swr

2017-01-06 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=97102

--- Comment #10 from Tim Rowley  ---
I think the right fix for CalculateProcessorTopology is to prune empty nodes at
the end:

for (auto it = out_nodes.begin(); it != out_nodes.end(); ) {
if ((*it).cores.size() == 0)
it = out_nodes.erase(it);
else
++it;
}

However, the rest of the topology logic with that cpuinfo comes to the
conclusion that there are only two cores, and so will only generate two threads
with one being dedicated to the API.  We'll need to adjust that logic as well.
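
For illustration only, the pruning step above can also be written with the
erase/remove idiom. This is a minimal sketch, not the SWR code: the Core and
NumaNode structs and the PruneEmptyNodes name are hypothetical stand-ins that
model only the "cores" member used in the loop above.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for the topology types; only the member that the
// pruning loop looks at ("cores") is modelled here.
struct Core {
    std::vector<uint32_t> threadIds;
};

struct NumaNode {
    std::vector<Core> cores;
};

// Drop every node that ended up without cores, keeping the order of the
// remaining nodes. Equivalent in effect to the explicit iterator loop.
void PruneEmptyNodes(std::vector<NumaNode> &nodes)
{
    nodes.erase(std::remove_if(nodes.begin(), nodes.end(),
                               [](const NumaNode &n) { return n.cores.empty(); }),
                nodes.end());
}
```

Either form leaves only populated nodes, so later code that indexes node 0
would find at least one core there (assuming any core was detected at all).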

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


[Mesa-dev] [Bug 97102] [dri][swr] stack overflow / infinite loop with GALLIUM_DRIVER=swr

2017-01-06 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=97102

--- Comment #9 from Jan Ziak <0xe2.0x9a.0...@gmail.com> ---
Created attachment 128795
  --> https://bugs.freedesktop.org/attachment.cgi?id=128795&action=edit
/proc/cpuinfo

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


[Mesa-dev] [Bug 97102] [dri][swr] stack overflow / infinite loop with GALLIUM_DRIVER=swr

2017-01-06 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=97102

--- Comment #8 from Jan Ziak <0xe2.0x9a.0...@gmail.com> ---
(In reply to Bruce Cherniak from comment #7)
> I also see you are on an AVX capable processor.  It would be helpful to know
> which model.

AMD A10-7850K

> The segfault you've referenced below is in a section of code that figures
> out processor topology.

The minimal "physical id" in /proc/cpuinfo on my machine is 1. It isn't 0.

In function CreateThreadPool():
(gdb) p nodes[0].cores
$2 = std::vector of length 0, capacity 0

Adding "numaId--" to function CalculateProcessorTopology() fixes the
segmentation fault.

$ zgrep NUMA /proc/config.gz 
# CONFIG_NUMA is not set
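
A rough sketch of how the empty nodes[0] seen in gdb can arise, assuming the
node index is taken directly from the "physical id" value (this assumption is
inferred from the observation above, not from the SWR source; CpuInfoEntry and
GroupByPhysicalId are hypothetical names):

```cpp
#include <cstdint>
#include <vector>

// Hypothetical parsed /proc/cpuinfo record: just the two fields of interest.
struct CpuInfoEntry {
    uint32_t physicalId;
    uint32_t coreId;
};

// Group core ids by "physical id", using the id itself as the vector index.
// If the smallest physical id is 1, slot 0 is created but never filled.
std::vector<std::vector<uint32_t>>
GroupByPhysicalId(const std::vector<CpuInfoEntry> &entries)
{
    std::vector<std::vector<uint32_t>> nodes;
    for (const CpuInfoEntry &e : entries) {
        if (e.physicalId >= nodes.size())
            nodes.resize(e.physicalId + 1);
        nodes[e.physicalId].push_back(e.coreId);
    }
    return nodes;
}
```

With entries whose physical id is always 1, the result has two slots and the
first one is empty, which matches the zero-length nodes[0].cores above.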

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


Re: [Mesa-dev] [PATCH] winsys/amdgpu: fix a race condition between fence updates and IB submissions

2017-01-06 Thread Marek Olšák
On Fri, Jan 6, 2017 at 1:02 PM, Nicolai Hähnle  wrote:
> On 02.01.2017 21:20, Marek Olšák wrote:
>>
>> From: Marek Olšák 
>>
>> The CS thread is needed to ensure proper ordering of operations and can't
>> be disabled (without complicating the code).
>>
>> Discovered by Nine CSMT, which ended up in a deadlock.
>
>
> I'm curious why the thread makes a difference for the deadlock. Why isn't it
> enough in the un-threaded case to extend the scope of the ws->bo_fence_lock
> to cover the submit ioctl call?

You can't extend the bo_fence lock to the CS thread, so the main
thread and the CS thread can be racing.

Marek
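
As a generic illustration of the ordering argument in this thread (the patch
keeps the fence lock held until the submission has been queued), here is a
minimal sketch. It is not the winsys code: SubmitOrder and its members are
hypothetical, and std::mutex stands in for bo_fence_lock.

```cpp
#include <cstdint>
#include <mutex>
#include <vector>

// Hypothetical model: one lock covers both the "fence update" and the
// "enqueue" step, so no other thread can slip in between the two.
struct SubmitOrder {
    std::mutex lock;               // plays the role of bo_fence_lock
    uint64_t next_fence = 0;       // last fence value handed out
    std::vector<uint64_t> queued;  // order in which jobs reached the queue

    uint64_t Submit()
    {
        std::lock_guard<std::mutex> guard(lock);
        uint64_t fence = ++next_fence;  // stands in for the fence dependency update
        queued.push_back(fence);        // stands in for queuing the job
        return fence;                   // lock is released only after queuing
    }
};
```

If the lock were dropped between the two steps, two callers could take fences
in one order and enqueue in the other; holding it across both keeps the queue
order identical to the fence order.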


Re: [Mesa-dev] [PATCH 2/9] radeonsi: clean up more HAVE_LLVM #ifdefs

2017-01-06 Thread Marek Olšák
On Fri, Jan 6, 2017 at 12:42 PM, Nicolai Hähnle  wrote:
> On 02.01.2017 21:16, Marek Olšák wrote:
>>
>> From: Marek Olšák 
>>
>> ---
>>  src/gallium/drivers/radeon/r600_pipe_common.c | 14 +-
>>  src/gallium/drivers/radeonsi/si_shader.c  | 19 +++
>>  2 files changed, 20 insertions(+), 13 deletions(-)
>>
>> diff --git a/src/gallium/drivers/radeon/r600_pipe_common.c
>> b/src/gallium/drivers/radeon/r600_pipe_common.c
>> index 74e8de9..d45a385 100644
>> --- a/src/gallium/drivers/radeon/r600_pipe_common.c
>> +++ b/src/gallium/drivers/radeon/r600_pipe_common.c
>> @@ -36,20 +36,24 @@
>>  #include "vl/vl_decoder.h"
>>  #include "vl/vl_video_buffer.h"
>>  #include "radeon/radeon_video.h"
>>  #include 
>>  #include 
>>
>>  #ifndef HAVE_LLVM
>>  #define HAVE_LLVM 0
>>  #endif
>>
>> +#ifndef MESA_LLVM_VERSION_PATCH
>> +#define MESA_LLVM_VERSION_PATCH 0
>> +#endif
>> +
>
>
> Are you sure this isn't needed? configure.ac looks like it doesn't set this
> if only r600 without llvm is compiled.

I don't understand what you are asking. :) I think it *is* needed.

Marek


Re: [Mesa-dev] [PATCH 2/5] st/nine: Remove all usage of ureg_SUB in nine_ff

2017-01-06 Thread Marek Olšák
On Fri, Jan 6, 2017 at 12:43 PM, Jose Fonseca  wrote:
> I think this is a good idea.
>
> We still use them but I'm happy to see them go
>
> It would be much easier for you and for us if you just implemented a
> ureg_ABS() / ureg_SUB inline helper that would call ureg_MOV/ureg_ADD
> internally:  fewer chances of a typo somewhere, and less work necessary all
> around.

Yeah probably. A lot of the users don't use ureg, so the chances of a
typo are still pretty high.

Marek
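
To make the helper idea concrete, here is a toy sketch of "SUB expressed as
ADD with a negated second operand". It deliberately does not use the real ureg
API: Src, Negate, Read, Add and Sub are hypothetical miniature stand-ins for a
source operand with a negate modifier.

```cpp
// Hypothetical miniature of a source operand carrying a negate modifier.
struct Src {
    float value;
    bool negate;
};

// Toggle the negate modifier, in the spirit of a "negate this source" helper.
inline Src Negate(Src s)
{
    s.negate = !s.negate;
    return s;
}

// Resolve the modifier to the value the instruction would actually read.
inline float Read(Src s)
{
    return s.negate ? -s.value : s.value;
}

inline float Add(Src a, Src b)
{
    return Read(a) + Read(b);
}

// The inline helper: callers keep writing Sub(), but only Add() exists
// underneath, so the SUB opcode itself can go away.
inline float Sub(Src a, Src b)
{
    return Add(a, Negate(b));
}
```

Because the helper toggles the modifier rather than setting it, subtracting an
already negated operand turns back into a plain addition.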


[Mesa-dev] [Bug 97102] [dri][swr] stack overflow / infinite loop with GALLIUM_DRIVER=swr

2017-01-06 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=97102

--- Comment #7 from Bruce Cherniak  ---
I also see you are on an AVX capable processor.  It would be helpful to know
which model.

The segfault you've referenced below is in a section of code that figures out
processor topology.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


[Mesa-dev] [Bug 97102] [dri][swr] stack overflow / infinite loop with GALLIUM_DRIVER=swr

2017-01-06 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=97102

--- Comment #6 from Bruce Cherniak  ---
What are your configure and run options?  I am not installing these as the
default drivers, but rather using a test sandbox.

I am simply configuring using:
--prefix=
--with-dri-drivers=swrast
--with-gallium-drivers=swrast,swr

And then to run, I have set:
LD_LIBRARY_PATH prepends /lib
LIBGL_DRIVERS_PATH=/lib/dri
LIBGL_ALWAYS_SOFTWARE=1
GALLIUM_DRIVER=swr

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


[Mesa-dev] [Bug 97102] [dri][swr] stack overflow / infinite loop with GALLIUM_DRIVER=swr

2017-01-06 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=97102

Bruce Cherniak  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
         Resolution|FIXED                       |---

--- Comment #5 from Bruce Cherniak  ---
This is quite a different bt than you had attached originally.  From this,
something is definitely going on in SWR.  I'll continue to take a look.

Lately, I've run quite a bit of stuff through OpenSWR with DRI drivers. 
Although this isn't our primary mode (most of our customers simply use the
standalone gallium libGL.so.1.5.0), we definitely support it.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


[Mesa-dev] [PATCH] isl: render target cube maps should be handled as 2D images, not cubes

2017-01-06 Thread Iago Toral Quiroga
This fixes layered rendering Vulkan CTS tests with cube (arrays). We
also do this in the GL driver, see this code from gen8_depth_state.c
for example:

case GL_TEXTURE_CUBE_MAP_ARRAY:
case GL_TEXTURE_CUBE_MAP:
   /* The PRM claims that we should use BRW_SURFACE_CUBE for this
* situation, but experiments show that gl_Layer doesn't work when we do
* this.  So we use BRW_SURFACE_2D, since for rendering purposes this is
* equivalent.
*/
   surftype = BRW_SURFACE_2D;
   depth *= 6;
   break;

So I guess we simply forgot to port this workaround to Vulkan.

Fixes:
dEQP-VK.geometry.layered.cube*
---

With this (and the previous patch I sent to fix the SBE state packet to not
skip the VUE header when we need the layer information) all the layered
rendering tests in Vulkan CTS seem to pass.
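
For readers skimming the diff below, the selection rule it ends up with can be
sketched as follows. This is a standalone illustration with hypothetical enum
and flag names (SurfDim, SurfType, kUsage*), not the ISL definitions; the
factor of 6 mirrors the "depth *= 6" in the GL snippet quoted above.

```cpp
#include <cstdint>

// Hypothetical stand-ins for the surface dimension, surface type and usage
// flags; the bit values are made up for this sketch.
enum class SurfDim { D1, D2, D3 };
enum class SurfType { T1D, T2D, T3D, Cube };

constexpr uint32_t kUsageRenderTarget = 1u << 0;
constexpr uint32_t kUsageCube         = 1u << 1;
constexpr uint32_t kUsageStorage      = 1u << 2;

// Storage images and render targets are always programmed as plain 2D, even
// when the cube bit is set; only the remaining cube usages get the cube type.
SurfType PickSurfType(SurfDim dim, uint32_t usage)
{
    switch (dim) {
    case SurfDim::D1:
        return SurfType::T1D;
    case SurfDim::D3:
        return SurfType::T3D;
    case SurfDim::D2:
        if (usage & (kUsageStorage | kUsageRenderTarget))
            return SurfType::T2D;
        if (usage & kUsageCube)
            return SurfType::Cube;
        return SurfType::T2D;
    }
    return SurfType::T2D;
}

// A cube (array) viewed as a 2D array exposes six layers per cube.
constexpr uint32_t LayersFor2DView(uint32_t num_cubes)
{
    return num_cubes * 6;
}
```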

 src/intel/isl/isl_surface_state.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/src/intel/isl/isl_surface_state.c b/src/intel/isl/isl_surface_state.c
index 3bb0abd..0960a90 100644
--- a/src/intel/isl/isl_surface_state.c
+++ b/src/intel/isl/isl_surface_state.c
@@ -113,8 +113,9 @@ get_surftype(enum isl_surf_dim dim, isl_surf_usage_flags_t usage)
   assert(!(usage & ISL_SURF_USAGE_CUBE_BIT));
   return SURFTYPE_1D;
case ISL_SURF_DIM_2D:
-  if (usage & ISL_SURF_USAGE_STORAGE_BIT) {
- /* Storage images are always plain 2-D, not cube */
+  if ((usage & ISL_SURF_USAGE_STORAGE_BIT) ||
+  (usage & ISL_SURF_USAGE_RENDER_TARGET_BIT)) {
+ /* Storage / Render images are always plain 2-D, not cube */
  return SURFTYPE_2D;
   } else if (usage & ISL_SURF_USAGE_CUBE_BIT) {
  return SURFTYPE_CUBE;
-- 
2.7.4



[Mesa-dev] [PATCH] glx: Add missing glproto dependency for gallium-xlib glx

2017-01-06 Thread Chuck Atkins
Cc: mesa-sta...@lists.freedesktop.org
Cc: Bruce Cherniak 
Signed-off-by: Chuck Atkins 
---
 configure.ac| 4 +++-
 src/gallium/state_trackers/glx/xlib/Makefile.am | 1 +
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/configure.ac b/configure.ac
index d1ffb57..092bea0 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1597,6 +1597,9 @@ AC_ARG_ENABLE([driglx-direct],
 dnl
 dnl libGL configuration per driver
 dnl
+if test "x$enable_glx" != xno; then
+PKG_CHECK_MODULES([GLPROTO], [glproto >= $GLPROTO_REQUIRED])
+fi
 case "x$enable_glx" in
 xxlib | xgallium-xlib)
 # Xlib-based GLX
@@ -1610,7 +1613,6 @@ xxlib | xgallium-xlib)
 ;;
 xdri)
 # DRI-based GLX
-PKG_CHECK_MODULES([GLPROTO], [glproto >= $GLPROTO_REQUIRED])
 
 # find the DRI deps for libGL
 dri_modules="x11 xext xdamage xfixes x11-xcb xcb xcb-glx >= $XCBGLX_REQUIRED"
diff --git a/src/gallium/state_trackers/glx/xlib/Makefile.am b/src/gallium/state_trackers/glx/xlib/Makefile.am
index a7e6c0c..112030be 100644
--- a/src/gallium/state_trackers/glx/xlib/Makefile.am
+++ b/src/gallium/state_trackers/glx/xlib/Makefile.am
@@ -25,6 +25,7 @@ include $(top_srcdir)/src/gallium/Automake.inc
 
 AM_CFLAGS = \
$(GALLIUM_CFLAGS) \
+   $(GLPROTO_CFLAGS) \
$(X11_INCLUDES)
 AM_CPPFLAGS = \
-I$(top_srcdir)/include \
-- 
2.7.4



Re: [Mesa-dev] [PATCH] glsl: fix opt_minmax redundancy checks against baserange

2017-01-06 Thread Iago Toral
Hi Tim,

it's been a while, so off the top of my head I don't have any
particular suggestions or the capacity to tell whether your fix is
correct or not :-/.

I'll try to spend some time re-acquainting myself with that code on
Monday and see if I can see what is going on here.

Iago

On Fri, 2017-01-06 at 22:08 +1100, Timothy Arceri wrote:
> Hi Iago/Petri,
> 
> Just Ccing you guys in case you have a better solution (I know it's
> been a while since this was written).
> 
> This is causing incorrect shaders in at least Serious Sam 3 possibly
> others. I've sent some piglit tests to reproduce it to the piglit
> list.
> 
> Thanks,
> Tim
> 
> On Fri, 2017-01-06 at 10:26 +1100, Timothy Arceri wrote:
> > 
> > Marking operations as redundant if they are equal to the base
> > range is fine when the tree structure is something like this:
> > 
> > max
> >   / \
> >  max b
> > /   \
> >    3max
> >    /   \
> >   3 a
> > 
> > But the opt falls apart with a tree like this:
> > 
> > max
> >  /   \
> > max max
> >    /   \   /   \
> >   3a   b3
> > 
> > I'm not 100% sure what is going wrong as a tree like:
> > 
> > max
> >  /   \
> > max max
> >    /   \   /   \
> >   3a   b4
> > 
> > Will remove the right hand side max just fine. But not marking
> > limits that equal the base range seems to fix the problem.
> > 
> > NIR algebraic opt will clean up the first tree for us anyway;
> > hopefully other backends are smart enough to do this also.
> > 
> > Cc: "13.0" 
> > ---
> >  src/compiler/glsl/opt_minmax.cpp | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/src/compiler/glsl/opt_minmax.cpp
> > b/src/compiler/glsl/opt_minmax.cpp
> > index 29482ee..9f64db9 100644
> > --- a/src/compiler/glsl/opt_minmax.cpp
> > +++ b/src/compiler/glsl/opt_minmax.cpp
> > @@ -355,7 +355,7 @@
> > ir_minmax_visitor::prune_expression(ir_expression
> > *expr, minmax_range baserange)
> >    */
> >   if (!is_redundant && limits[i].low && baserange.high) {
> >  cr = compare_components(limits[i].low,
> > baserange.high);
> > -if (cr >= EQUAL && cr != MIXED)
> > +if (cr > EQUAL && cr != MIXED)
> > is_redundant = true;
> >   }
> >    } else {
> > @@ -373,7 +373,7 @@
> > ir_minmax_visitor::prune_expression(ir_expression
> > *expr, minmax_range baserange)
> >    */
> >   if (!is_redundant && limits[i].high && baserange.low) {
> >  cr = compare_components(limits[i].high,
> > baserange.low);
> > -if (cr <= EQUAL)
> > +if (cr < EQUAL)
> > is_redundant = true;
> >   }
> >    }


Re: [Mesa-dev] [PATCH] glsl: fix opt_minmax redundancy checks against baserange

2017-01-06 Thread Timothy Arceri
On Fri, 2017-01-06 at 12:36 +0100, Nicolai Hähnle wrote:
> On 06.01.2017 00:26, Timothy Arceri wrote:
> > Marking operations as redundant if they are equal to the base
> > range is fine when the tree structure is something like this:
> > 
> > max
> >   / \
> >  max b
> > /   \
> >    3max
> >    /   \
> >   3 a
> > 
> > But the opt falls apart with a tree like this:
> > 
> > max
> >  /   \
> > max max
> >    /   \   /   \
> >   3a   b3
> 
> This will just become max(a, b), right?

Ah yes, so it does. I spent so much time trying to reproduce the problem
(mostly trying trees like the first one) that I thought it was doing
something crazier than that. It makes much more sense looking at it
again with less tired eyes.

> 
> The problem is that both branches are treated the same: descending in
> the left branch will prune the constant, and then descending the right
> branch will prune the constant there as well, because limits[0] wasn't
> updated to take the change on the left branch into account, and so we
> still get [3,\infty) as baserange.
> 
> With your change, nothing will be done at all. That's fine as far as I'm
> concerned. I don't see a clean way of updating the limits on-the-fly.
> 
> Thanks for the piglit test. An explanatory comment would be nice, but
> either way, this patch is

I'll add your explanation to the commit message.
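
For the commit message, a small numeric illustration of the explanation may
help. The following is a toy model, not the opt_minmax code: the function
names are hypothetical, and the two Redundant* predicates only mimic the
"<= EQUAL" versus "< EQUAL" comparison of a constant operand against the low
end of the base range for a max expression.

```cpp
#include <algorithm>

// The expression from the second tree: max(max(3, a), max(b, 3)).
int Original(int a, int b)
{
    return std::max(std::max(3, a), std::max(b, 3));
}

// What is left if the constant is pruned from both branches.
int OverPruned(int a, int b)
{
    return std::max(a, b);
}

// Old check: a constant equal to the low end of the base range counts as
// redundant. Each branch sees the other branch's 3 as the bound, so both
// constants are dropped.
bool RedundantOld(int constant, int base_low)
{
    return constant <= base_low;
}

// New check: only a strictly smaller constant is redundant, so with two
// equal constants neither one is removed.
bool RedundantNew(int constant, int base_low)
{
    return constant < base_low;
}
```

With a = 1 and b = 2 the original expression evaluates to 3, while the
over-pruned form evaluates to 2, which is the kind of wrong result seen in the
affected shaders.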

> 
> Reviewed-by: Nicolai Hähnle 
> 
> 

Thanks :)

> 
> > I'm not 100% sure what is going wrong as a tree like:
> > 
> > max
> >  /   \
> > max max
> >    /   \   /   \
> >   3a   b4
> > 
> > Will remove the right hand side max just fine. But not marking
> > limits that equal the base range seems to fix the problem.
> > 
> > NIR algebraic opt will clean up the first tree for us anyway;
> > hopefully other backends are smart enough to do this also.
> > 
> > Cc: "13.0" 
> > ---
> >  src/compiler/glsl/opt_minmax.cpp | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/src/compiler/glsl/opt_minmax.cpp
> > b/src/compiler/glsl/opt_minmax.cpp
> > index 29482ee..9f64db9 100644
> > --- a/src/compiler/glsl/opt_minmax.cpp
> > +++ b/src/compiler/glsl/opt_minmax.cpp
> > @@ -355,7 +355,7 @@
> > ir_minmax_visitor::prune_expression(ir_expression *expr,
> > minmax_range baserange)
> >    */
> >   if (!is_redundant && limits[i].low && baserange.high) {
> >  cr = compare_components(limits[i].low,
> > baserange.high);
> > -if (cr >= EQUAL && cr != MIXED)
> > +if (cr > EQUAL && cr != MIXED)
> > is_redundant = true;
> >   }
> >    } else {
> > @@ -373,7 +373,7 @@
> > ir_minmax_visitor::prune_expression(ir_expression *expr,
> > minmax_range baserange)
> >    */
> >   if (!is_redundant && limits[i].high && baserange.low) {
> >  cr = compare_components(limits[i].high,
> > baserange.low);
> > -if (cr <= EQUAL)
> > +if (cr < EQUAL)
> > is_redundant = true;
> >   }
> >    }
> > 
> 


Re: [Mesa-dev] [PATCH] winsys/amdgpu: fix a race condition between fence updates and IB submissions

2017-01-06 Thread Nicolai Hähnle

On 02.01.2017 21:20, Marek Olšák wrote:

From: Marek Olšák 

The CS thread is needed to ensure proper ordering of operations and can't
be disabled (without complicating the code).

Discovered by Nine CSMT, which ended up in a deadlock.


I'm curious why the thread makes a difference for the deadlock. Why 
isn't it enough in the un-threaded case to extend the scope of the 
ws->bo_fence_lock to cover the submit ioctl call?


Then again, I'm happy with simplifying the code to eliminate the 
un-threaded path, so...


Reviewed-by: Nicolai Hähnle 


---
 src/gallium/winsys/amdgpu/drm/amdgpu_cs.c | 31 +++
 src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c |  9 
 2 files changed, 22 insertions(+), 18 deletions(-)

diff --git a/src/gallium/winsys/amdgpu/drm/amdgpu_cs.c b/src/gallium/winsys/amdgpu/drm/amdgpu_cs.c
index 95402bf..87246f7 100644
--- a/src/gallium/winsys/amdgpu/drm/amdgpu_cs.c
+++ b/src/gallium/winsys/amdgpu/drm/amdgpu_cs.c
@@ -1060,25 +1060,23 @@ cleanup:
for (i = 0; i < cs->num_slab_buffers; i++)
   p_atomic_dec(>slab_buffers[i].bo->num_active_ioctls);

amdgpu_cs_context_cleanup(cs);
 }

 /* Make sure the previous submission is completed. */
 void amdgpu_cs_sync_flush(struct radeon_winsys_cs *rcs)
 {
struct amdgpu_cs *cs = amdgpu_cs(rcs);
-   struct amdgpu_winsys *ws = cs->ctx->ws;

/* Wait for any pending ioctl of this CS to complete. */
-   if (util_queue_is_initialized(>cs_queue))
-  util_queue_job_wait(>flush_completed);
+   util_queue_job_wait(>flush_completed);
 }

 static int amdgpu_cs_flush(struct radeon_winsys_cs *rcs,
unsigned flags,
struct pipe_fence_handle **fence)
 {
struct amdgpu_cs *cs = amdgpu_cs(rcs);
struct amdgpu_winsys *ws = cs->ctx->ws;
int error_code = 0;

@@ -1150,53 +1148,58 @@ static int amdgpu_cs_flush(struct radeon_winsys_cs *rcs,
  cs->next_fence = NULL;
   } else {
  cur->fence = amdgpu_fence_create(cs->ctx,
   cur->request.ip_type,
   cur->request.ip_instance,
   cur->request.ring);
   }
   if (fence)
  amdgpu_fence_reference(fence, cur->fence);

-  /* Prepare buffers. */
+  amdgpu_cs_sync_flush(rcs);
+
+  /* Prepare buffers.
+   *
+   * This fence must be held until the submission is queued to ensure
+   * that the order of fence dependency updates matches the order of
+   * submissions.
+   */
   pipe_mutex_lock(ws->bo_fence_lock);
   amdgpu_add_fence_dependencies(cs);

   num_buffers = cur->num_real_buffers;
   for (i = 0; i < num_buffers; i++) {
  struct amdgpu_winsys_bo *bo = cur->real_buffers[i].bo;
  p_atomic_inc(>num_active_ioctls);
  amdgpu_add_fence(bo, cur->fence);
   }

   num_buffers = cur->num_slab_buffers;
   for (i = 0; i < num_buffers; i++) {
  struct amdgpu_winsys_bo *bo = cur->slab_buffers[i].bo;
  p_atomic_inc(>num_active_ioctls);
  amdgpu_add_fence(bo, cur->fence);
   }
-  pipe_mutex_unlock(ws->bo_fence_lock);
-
-  amdgpu_cs_sync_flush(rcs);

   /* Swap command streams. "cst" is going to be submitted. */
   cs->csc = cs->cst;
   cs->cst = cur;

   /* Submit. */
-  if ((flags & RADEON_FLUSH_ASYNC) &&
-  util_queue_is_initialized(>cs_queue)) {
- util_queue_add_job(>cs_queue, cs, >flush_completed,
-amdgpu_cs_submit_ib, NULL);
-  } else {
- amdgpu_cs_submit_ib(cs, 0);
- error_code = cs->cst->error_code;
+  util_queue_add_job(>cs_queue, cs, >flush_completed,
+ amdgpu_cs_submit_ib, NULL);
+  /* The submission has been queued, unlock the fence now. */
+  pipe_mutex_unlock(ws->bo_fence_lock);
+
+  if (!(flags & RADEON_FLUSH_ASYNC)) {
+ amdgpu_cs_sync_flush(rcs);
+ error_code = cur->error_code;
   }
} else {
   amdgpu_cs_context_cleanup(cs->csc);
}

amdgpu_get_new_ib(>base, cs, IB_MAIN);
if (cs->const_ib.ib_mapped)
   amdgpu_get_new_ib(>base, cs, IB_CONST);
if (cs->const_preamble_ib.ib_mapped)
   amdgpu_get_new_ib(>base, cs, IB_CONST_PREAMBLE);
diff --git a/src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c b/src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c
index b950d37..e944e62 100644
--- a/src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c
+++ b/src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c
@@ -471,22 +471,20 @@ static unsigned hash_dev(void *key)
 #else
return pointer_to_intptr(key);
 #endif
 }

 static int compare_dev(void *key1, void *key2)
 {
return key1 != key2;
 }

-DEBUG_GET_ONCE_BOOL_OPTION(thread, "RADEON_THREAD", true)
-
 static bool amdgpu_winsys_unref(struct radeon_winsys *rws)
 {
struct amdgpu_winsys *ws = (struct 

Re: [Mesa-dev] [PATCH 3/3] radeonsi: add TC L2 prefetch for shaders and VBO descriptors

2017-01-06 Thread Nicolai Hähnle

On 02.01.2017 21:18, Marek Olšák wrote:

From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_cp_dma.c | 12 +
 src/gallium/drivers/radeonsi/si_pipe.h   |  2 ++
 src/gallium/drivers/radeonsi/si_state_draw.c | 37 +++-
 3 files changed, 50 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeonsi/si_cp_dma.c b/src/gallium/drivers/radeonsi/si_cp_dma.c
index 653021e..13b901b 100644
--- a/src/gallium/drivers/radeonsi/si_cp_dma.c
+++ b/src/gallium/drivers/radeonsi/si_cp_dma.c
@@ -360,14 +360,26 @@ void si_copy_buffer(struct si_context *sctx,
 _first);

if (tc_l2_flag)
r600_resource(dst)->TC_L2_dirty = true;

/* If it's not a prefetch... */
if (dst_offset != src_offset)
sctx->b.num_cp_dma_calls++;
 }

+void cik_prefetch_TC_L2_async(struct si_context *sctx, struct pipe_resource *buf,
+ uint64_t offset, unsigned size)
+{
+   assert(sctx->b.chip_class >= CIK);
+
+   si_copy_buffer(sctx, buf, buf, offset, offset, size,
+  SI_CPDMA_SKIP_CHECK_CS_SPACE |
+  SI_CPDMA_SKIP_SYNC_AFTER |
+  SI_CPDMA_SKIP_SYNC_BEFORE |
+  SI_CPDMA_SKIP_GFX_SYNC);
+}
+
 void si_init_cp_dma_functions(struct si_context *sctx)
 {
sctx->b.clear_buffer = si_clear_buffer;
 }
diff --git a/src/gallium/drivers/radeonsi/si_pipe.h b/src/gallium/drivers/radeonsi/si_pipe.h
index dc37c8d..c0a4636 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.h
+++ b/src/gallium/drivers/radeonsi/si_pipe.h
@@ -374,20 +374,22 @@ void si_resource_copy_region(struct pipe_context *ctx,
 /* si_cp_dma.c */
 #define SI_CPDMA_SKIP_CHECK_CS_SPACE   (1 << 0) /* don't call need_cs_space */
 #define SI_CPDMA_SKIP_SYNC_AFTER   (1 << 1) /* don't wait for DMA after the copy */
 #define SI_CPDMA_SKIP_SYNC_BEFORE  (1 << 2) /* don't wait for DMA before the copy (RAW hazards) */
 #define SI_CPDMA_SKIP_GFX_SYNC (1 << 3) /* don't flush caches and don't wait for PS/CS */

 void si_copy_buffer(struct si_context *sctx,
struct pipe_resource *dst, struct pipe_resource *src,
uint64_t dst_offset, uint64_t src_offset, unsigned size,
unsigned user_flags);
+void cik_prefetch_TC_L2_async(struct si_context *sctx, struct pipe_resource *buf,
+ uint64_t offset, unsigned size);
 void si_init_cp_dma_functions(struct si_context *sctx);

 /* si_debug.c */
 void si_init_debug_functions(struct si_context *sctx);
 void si_check_vm_faults(struct r600_common_context *ctx,
struct radeon_saved_cs *saved, enum ring_type ring);
 bool si_replace_shader(unsigned num, struct radeon_shader_binary *binary);

 /* si_dma.c */
 void si_init_dma_functions(struct si_context *sctx);
diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c b/src/gallium/drivers/radeonsi/si_state_draw.c
index b3f664e..7b75602 100644
--- a/src/gallium/drivers/radeonsi/si_state_draw.c
+++ b/src/gallium/drivers/radeonsi/si_state_draw.c
@@ -930,20 +930,31 @@ void si_ce_pre_draw_synchronization(struct si_context *sctx)
 void si_ce_post_draw_synchronization(struct si_context *sctx)
 {
if (sctx->ce_need_synchronization) {
radeon_emit(sctx->b.gfx.cs, PKT3(PKT3_INCREMENT_DE_COUNTER, 0, 
0));
radeon_emit(sctx->b.gfx.cs, 0);

sctx->ce_need_synchronization = false;
}
 }

+static void cik_prefetch_shader_async(struct si_context *sctx,
+ struct si_pm4_state *state)
+{
+   if (state) {
+   struct pipe_resource *bo = >bo[0]->b.b;
+   assert(state->nbo == 1);
+
+   cik_prefetch_TC_L2_async(sctx, bo, 0, bo->width0);
+   }
+}
+
 void si_draw_vbo(struct pipe_context *ctx, const struct pipe_draw_info *info)
 {
struct si_context *sctx = (struct si_context *)ctx;
struct si_state_rasterizer *rs = sctx->queued.named.rasterizer;
struct pipe_index_buffer ib = {};
unsigned mask, dirty_fb_counter, dirty_tex_counter, rast_prim;

if (likely(!info->indirect)) {
/* SI-CI treat instance_count==0 as instance_count==1. There is
 * no workaround for indirect draws, but we can at least skip
@@ -1107,24 +1118,48 @@ void si_draw_vbo(struct pipe_context *ctx, const struct pipe_draw_info *info)

si_need_cs_space(sctx);

/* Since we've called r600_context_add_resource_size for vertex buffers,
 * this must be called after si_need_cs_space, because we must let
 * need_cs_space flush before we add buffers to the buffer list.
 */
if (!si_upload_vertex_buffer_descriptors(sctx))
return;

-   /* Flushed caches prior to emitting states. */
+   /* Flushed caches prior to prefetching 

Re: [Mesa-dev] [PATCH 9/9] gallium/radeon: add new HUD query num-SDMA-IBs

2017-01-06 Thread Nicolai Hähnle

Apart from the question on patch #2, this series is

Reviewed-by: Nicolai Hähnle 

On 02.01.2017 21:17, Marek Olšák wrote:

From: Marek Olšák 

---
 src/gallium/drivers/radeon/r600_query.c   | 4 
 src/gallium/drivers/radeon/r600_query.h   | 1 +
 src/gallium/drivers/radeon/radeon_winsys.h| 1 +
 src/gallium/winsys/amdgpu/drm/amdgpu_cs.c | 6 +-
 src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c | 2 ++
 src/gallium/winsys/amdgpu/drm/amdgpu_winsys.h | 1 +
 src/gallium/winsys/radeon/drm/radeon_drm_cs.c | 5 -
 src/gallium/winsys/radeon/drm/radeon_drm_winsys.c | 2 ++
 src/gallium/winsys/radeon/drm/radeon_drm_winsys.h | 1 +
 9 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/radeon/r600_query.c b/src/gallium/drivers/radeon/r600_query.c
index 70a2568..3c72f27 100644
--- a/src/gallium/drivers/radeon/r600_query.c
+++ b/src/gallium/drivers/radeon/r600_query.c
@@ -59,20 +59,21 @@ static void r600_query_sw_destroy(struct 
r600_common_context *rctx,

 static enum radeon_value_id winsys_id_from_type(unsigned type)
 {
switch (type) {
case R600_QUERY_REQUESTED_VRAM: return RADEON_REQUESTED_VRAM_MEMORY;
case R600_QUERY_REQUESTED_GTT: return RADEON_REQUESTED_GTT_MEMORY;
case R600_QUERY_MAPPED_VRAM: return RADEON_MAPPED_VRAM;
case R600_QUERY_MAPPED_GTT: return RADEON_MAPPED_GTT;
case R600_QUERY_BUFFER_WAIT_TIME: return RADEON_BUFFER_WAIT_TIME_NS;
case R600_QUERY_NUM_GFX_IBS: return RADEON_NUM_GFX_IBS;
+   case R600_QUERY_NUM_SDMA_IBS: return RADEON_NUM_SDMA_IBS;
case R600_QUERY_NUM_BYTES_MOVED: return RADEON_NUM_BYTES_MOVED;
case R600_QUERY_NUM_EVICTIONS: return RADEON_NUM_EVICTIONS;
case R600_QUERY_VRAM_USAGE: return RADEON_VRAM_USAGE;
case R600_QUERY_GTT_USAGE: return RADEON_GTT_USAGE;
case R600_QUERY_GPU_TEMPERATURE: return RADEON_GPU_TEMPERATURE;
case R600_QUERY_CURRENT_GPU_SCLK: return RADEON_CURRENT_SCLK;
case R600_QUERY_CURRENT_GPU_MCLK: return RADEON_CURRENT_MCLK;
default: unreachable("query type does not correspond to winsys id");
}
 }
@@ -129,20 +130,21 @@ static bool r600_query_sw_begin(struct 
r600_common_context *rctx,
case R600_QUERY_VRAM_USAGE:
case R600_QUERY_GTT_USAGE:
case R600_QUERY_GPU_TEMPERATURE:
case R600_QUERY_CURRENT_GPU_SCLK:
case R600_QUERY_CURRENT_GPU_MCLK:
case R600_QUERY_BACK_BUFFER_PS_DRAW_RATIO:
query->begin_result = 0;
break;
case R600_QUERY_BUFFER_WAIT_TIME:
case R600_QUERY_NUM_GFX_IBS:
+   case R600_QUERY_NUM_SDMA_IBS:
case R600_QUERY_NUM_BYTES_MOVED:
case R600_QUERY_NUM_EVICTIONS: {
enum radeon_value_id ws_id = winsys_id_from_type(query->b.type);
query->begin_result = rctx->ws->query_value(rctx->ws, ws_id);
break;
}
case R600_QUERY_GPU_LOAD:
query->begin_result = r600_gpu_load_begin(rctx->screen);
break;
case R600_QUERY_NUM_COMPILATIONS:
@@ -219,20 +221,21 @@ static bool r600_query_sw_end(struct r600_common_context 
*rctx,
case R600_QUERY_REQUESTED_GTT:
case R600_QUERY_MAPPED_VRAM:
case R600_QUERY_MAPPED_GTT:
case R600_QUERY_VRAM_USAGE:
case R600_QUERY_GTT_USAGE:
case R600_QUERY_GPU_TEMPERATURE:
case R600_QUERY_CURRENT_GPU_SCLK:
case R600_QUERY_CURRENT_GPU_MCLK:
case R600_QUERY_BUFFER_WAIT_TIME:
case R600_QUERY_NUM_GFX_IBS:
+   case R600_QUERY_NUM_SDMA_IBS:
case R600_QUERY_NUM_BYTES_MOVED:
case R600_QUERY_NUM_EVICTIONS: {
enum radeon_value_id ws_id = winsys_id_from_type(query->b.type);
query->end_result = rctx->ws->query_value(rctx->ws, ws_id);
break;
}
case R600_QUERY_GPU_LOAD:
query->end_result = r600_gpu_load_end(rctx->screen,
  query->begin_result);
query->begin_result = 0;
@@ -1685,20 +1688,21 @@ static struct pipe_driver_query_info 
r600_driver_query_list[] = {
X("num-cs-flushes",   NUM_CS_FLUSHES, UINT64, AVERAGE),
X("num-fb-cache-flushes", NUM_FB_CACHE_FLUSHES,   UINT64, AVERAGE),
X("num-L2-invalidates",   NUM_L2_INVALIDATES, UINT64, 
AVERAGE),
X("num-L2-writebacks",NUM_L2_WRITEBACKS,  UINT64, 
AVERAGE),
X("requested-VRAM",   REQUESTED_VRAM, BYTES, AVERAGE),
X("requested-GTT",REQUESTED_GTT,  BYTES, AVERAGE),
X("mapped-VRAM",  MAPPED_VRAM,BYTES, AVERAGE),
X("mapped-GTT",   MAPPED_GTT, BYTES, 
AVERAGE),
X("buffer-wait-time", BUFFER_WAIT_TIME,  

Re: [Mesa-dev] [PATCH 2/5] st/nine: Remove all usage of ureg_SUB in nine_ff

2017-01-06 Thread Jose Fonseca

I think this is a good idea.

We still use them but I'm happy to see them go

It would be much easier for you and for us if you just implemented a 
ureg_ABS() / ureg_SUB inline helper that would call ureg_MOV/ureg_ADD 
internally:  fewer chances of a typo somewhere, and less work necessary 
all around.
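
For illustration, such a helper could look roughly like the sketch below. This is a standalone C model, not the real ureg API: `struct src`, `src_negate()`, `emit_add()` and `emit_sub()` are hypothetical stand-ins for `struct ureg_src`, `ureg_negate()`, `ureg_ADD()` and the proposed `ureg_SUB` helper, and a source is reduced to a single float so the result can be checked.

```c
/* Hypothetical stand-in for a shader source operand: a value plus a
 * negate modifier, loosely modelled on what ureg_negate() toggles. */
struct src {
    float value;
    int negate;
};

/* Flip the negate modifier without touching the stored value. */
static struct src src_negate(struct src s)
{
    s.negate = !s.negate;
    return s;
}

/* Resolve the operand, applying the negate modifier. */
static float src_resolve(struct src s)
{
    return s.negate ? -s.value : s.value;
}

/* Stand-in for the ADD opcode, the only primitive that remains. */
static float emit_add(struct src a, struct src b)
{
    return src_resolve(a) + src_resolve(b);
}

/* The inline helper: SUB expressed as ADD with a negated second source. */
static inline float emit_sub(struct src a, struct src b)
{
    return emit_add(a, src_negate(b));
}
```

With a helper of this shape, call sites keep reading as a subtraction and the negate lives in exactly one place.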


Jose


On 01/01/17 00:04, Marek Olšák wrote:

From: Axel Davy 

This is required to remove gallium SUB.

Signed-off-by: Axel Davy 
---
 src/gallium/state_trackers/nine/nine_ff.c | 40 +++
 1 file changed, 20 insertions(+), 20 deletions(-)

diff --git a/src/gallium/state_trackers/nine/nine_ff.c 
b/src/gallium/state_trackers/nine/nine_ff.c
index a0a33cd..7cbe3f7 100644
--- a/src/gallium/state_trackers/nine/nine_ff.c
+++ b/src/gallium/state_trackers/nine/nine_ff.c
@@ -442,23 +442,23 @@ nine_ff_build_vs(struct NineDevice9 *device, struct 
vs_build_ctx *vs)
 ureg_MOV(ureg, oPos, vs->aVtx);
 } else {
 struct ureg_dst tmp = ureg_DECL_temporary(ureg);
 /* vs->aVtx contains the coordinates buffer wise.
 * later in the pipeline, clipping, viewport and division
 * by w (rhw = 1/w) are going to be applied, so do the reverse
 * of these transformations (except clipping) to have the good
 * position at the end.*/
 ureg_MOV(ureg, tmp, vs->aVtx);
 /* X from [X_min, X_min + width] to [-1, 1], same for Y. Z to [0, 
1] */
-ureg_SUB(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_XYZ), 
ureg_src(tmp), _CONST(101));
+ureg_ADD(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_XYZ), 
ureg_src(tmp), ureg_negate(_CONST(101)));
 ureg_MUL(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_XYZ), 
ureg_src(tmp), _CONST(100));
-ureg_SUB(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_XY), 
ureg_src(tmp), ureg_imm1f(ureg, 1.0f));
+ureg_ADD(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_XY), 
ureg_src(tmp), ureg_imm1f(ureg, -1.0f));
 /* Y needs to be reversed */
 ureg_MOV(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_Y), 
ureg_negate(ureg_src(tmp)));
 /* inverse rhw */
 ureg_RCP(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_W), _W(tmp));
 /* multiply X, Y, Z by w */
 ureg_MUL(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_XYZ), 
ureg_src(tmp), _W(tmp));
 ureg_MOV(ureg, oPos, ureg_src(tmp));
 ureg_release_temporary(ureg, tmp);
 }
 } else if (key->vertexblend) {
@@ -504,21 +504,21 @@ nine_ff_build_vs(struct NineDevice9 *device, struct 
vs_build_ctx *vs)
 ureg_MAD(ureg, tmp2, _(vs->aNrm), cWM[1], ureg_src(tmp2));
 ureg_MAD(ureg, tmp2, _(vs->aNrm), cWM[2], ureg_src(tmp2));
 }

 if (i < (key->vertexblend - 1)) {
 /* accumulate weighted position value */
 ureg_MAD(ureg, aVtx_dst, ureg_src(tmp), ureg_scalar(vs->aWgt, 
i), ureg_src(aVtx_dst));
 if (has_aNrm)
 ureg_MAD(ureg, aNrm_dst, ureg_src(tmp2), 
ureg_scalar(vs->aWgt, i), ureg_src(aNrm_dst));
 /* subtract weighted position value for last value */
-ureg_SUB(ureg, sum_blendweights, ureg_src(sum_blendweights), 
ureg_scalar(vs->aWgt, i));
+ureg_ADD(ureg, sum_blendweights, ureg_src(sum_blendweights), 
ureg_negate(ureg_scalar(vs->aWgt, i)));
 }
 }

 /* the last weighted position is always 1 - sum_of_previous_weights */
 ureg_MAD(ureg, aVtx_dst, ureg_src(tmp), 
ureg_scalar(ureg_src(sum_blendweights), key->vertexblend - 1), 
ureg_src(aVtx_dst));
 if (has_aNrm)
 ureg_MAD(ureg, aNrm_dst, ureg_src(tmp2), 
ureg_scalar(ureg_src(sum_blendweights), key->vertexblend - 1), 
ureg_src(aNrm_dst));

 /* multiply by VIEW_PROJ */
 ureg_MUL(ureg, tmp, _X(aVtx_dst), _CONST(8));
@@ -654,36 +654,36 @@ nine_ff_build_vs(struct NineDevice9 *device, struct 
vs_build_ctx *vs)
 ureg_MOV(ureg, ureg_writemask(input_coord, TGSI_WRITEMASK_W), 
ureg_imm1f(ureg, 1.0f));
 dim_input = 4;
 break;
 case NINED3DTSS_TCI_CAMERASPACEREFLECTIONVECTOR:
 tmp.WriteMask = TGSI_WRITEMASK_XYZ;
 aVtx_normed = ureg_DECL_temporary(ureg);
 ureg_normalize3(ureg, aVtx_normed, vs->aVtx);
 ureg_DP3(ureg, tmp_x, ureg_src(aVtx_normed), vs->aNrm);
 ureg_MUL(ureg, tmp, vs->aNrm, _X(tmp));
 ureg_ADD(ureg, tmp, ureg_src(tmp), ureg_src(tmp));
-ureg_SUB(ureg, ureg_writemask(input_coord, TGSI_WRITEMASK_XYZ), 
ureg_src(aVtx_normed), ureg_src(tmp));
+ureg_ADD(ureg, ureg_writemask(input_coord, TGSI_WRITEMASK_XYZ), 
ureg_src(aVtx_normed), ureg_negate(ureg_src(tmp)));
 ureg_MOV(ureg, ureg_writemask(input_coord, TGSI_WRITEMASK_W), 
ureg_imm1f(ureg, 1.0f));
 

Re: [Mesa-dev] [PATCH 2/9] radeonsi: clean up more HAVE_LLVM #ifdefs

2017-01-06 Thread Nicolai Hähnle

On 02.01.2017 21:16, Marek Olšák wrote:

From: Marek Olšák 

---
 src/gallium/drivers/radeon/r600_pipe_common.c | 14 +-
 src/gallium/drivers/radeonsi/si_shader.c  | 19 +++
 2 files changed, 20 insertions(+), 13 deletions(-)

diff --git a/src/gallium/drivers/radeon/r600_pipe_common.c 
b/src/gallium/drivers/radeon/r600_pipe_common.c
index 74e8de9..d45a385 100644
--- a/src/gallium/drivers/radeon/r600_pipe_common.c
+++ b/src/gallium/drivers/radeon/r600_pipe_common.c
@@ -36,20 +36,24 @@
 #include "vl/vl_decoder.h"
 #include "vl/vl_video_buffer.h"
 #include "radeon/radeon_video.h"
 #include 
 #include 

 #ifndef HAVE_LLVM
 #define HAVE_LLVM 0
 #endif

+#ifndef MESA_LLVM_VERSION_PATCH
+#define MESA_LLVM_VERSION_PATCH 0
+#endif
+


Are you sure this isn't needed? configure.ac looks like it doesn't set 
this if only r600 without llvm is compiled.
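
A small sketch of why the fallback definition matters once the `#if` becomes a plain `if`: the body of a C `if` is still compiled even when the condition is a constant zero, so every identifier in it has to exist. The function name `format_llvm_string()` is made up for this example; the macro names and the format string are taken from the patch.

```c
#include <stdio.h>
#include <string.h>

/* Fallbacks so that the code below compiles in builds without LLVM. */
#ifndef HAVE_LLVM
#define HAVE_LLVM 0
#endif

#ifndef MESA_LLVM_VERSION_PATCH
#define MESA_LLVM_VERSION_PATCH 0
#endif

/* Hypothetical helper: writes ", LLVM x.y.z" or an empty string. */
static void format_llvm_string(char *buf, size_t size)
{
    buf[0] = '\0';

    /* Unlike "#if HAVE_LLVM", this branch is always parsed, so
     * MESA_LLVM_VERSION_PATCH must be defined even when HAVE_LLVM is 0. */
    if (HAVE_LLVM > 0) {
        snprintf(buf, size, ", LLVM %i.%i.%i",
                 (HAVE_LLVM >> 8) & 0xff,
                 HAVE_LLVM & 0xff,
                 MESA_LLVM_VERSION_PATCH);
    }
}
```

Without the second `#ifndef` block, a build that never defines `MESA_LLVM_VERSION_PATCH` would fail to compile here even though the branch is dead.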


Nicolai


 struct r600_multi_fence {
struct pipe_reference reference;
struct pipe_fence_handle *gfx;
struct pipe_fence_handle *sdma;

/* If the context wasn't flushed at fence creation, this is non-NULL. */
struct {
struct r600_common_context *ctx;
unsigned ib_index;
} gfx_unflushed;
@@ -1194,25 +1198,25 @@ bool r600_common_screen_init(struct r600_common_screen 
*rscreen,
 {
char llvm_string[32] = {}, kernel_version[128] = {};
struct utsname uname_data;

ws->query_info(ws, &rscreen->info);

if (uname(&uname_data) == 0)
snprintf(kernel_version, sizeof(kernel_version),
 " / %s", uname_data.release);

-#if HAVE_LLVM
-   snprintf(llvm_string, sizeof(llvm_string),
-", LLVM %i.%i.%i", (HAVE_LLVM >> 8) & 0xff,
-HAVE_LLVM & 0xff, MESA_LLVM_VERSION_PATCH);
-#endif
+   if (HAVE_LLVM > 0) {
+   snprintf(llvm_string, sizeof(llvm_string),
+", LLVM %i.%i.%i", (HAVE_LLVM >> 8) & 0xff,
+HAVE_LLVM & 0xff, MESA_LLVM_VERSION_PATCH);
+   }

snprintf(rscreen->renderer_string, sizeof(rscreen->renderer_string),
 "%s (DRM %i.%i.%i%s%s)",
 r600_get_chip_name(rscreen), rscreen->info.drm_major,
 rscreen->info.drm_minor, rscreen->info.drm_patchlevel,
 kernel_version, llvm_string);

rscreen->b.get_name = r600_get_name;
rscreen->b.get_vendor = r600_get_vendor;
rscreen->b.get_device_vendor = r600_get_device_vendor;
diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index 72cf827..f18aa82 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -1764,30 +1764,33 @@ static void declare_system_value(
}

case TGSI_SEMANTIC_BLOCK_ID:
value = LLVMGetParam(radeon_bld->main_fn, SI_PARAM_BLOCK_ID);
break;

case TGSI_SEMANTIC_THREAD_ID:
value = LLVMGetParam(radeon_bld->main_fn, SI_PARAM_THREAD_ID);
break;

-#if HAVE_LLVM >= 0x0309
case TGSI_SEMANTIC_HELPER_INVOCATION:
-   value = lp_build_intrinsic(gallivm->builder,
-  "llvm.amdgcn.ps.live",
-  ctx->i1, NULL, 0,
-  LP_FUNC_ATTR_READNONE);
-   value = LLVMBuildNot(gallivm->builder, value, "");
-   value = LLVMBuildSExt(gallivm->builder, value, ctx->i32, "");
+   if (HAVE_LLVM >= 0x0309) {
+   value = lp_build_intrinsic(gallivm->builder,
+  "llvm.amdgcn.ps.live",
+  ctx->i1, NULL, 0,
+  LP_FUNC_ATTR_READNONE);
+   value = LLVMBuildNot(gallivm->builder, value, "");
+   value = LLVMBuildSExt(gallivm->builder, value, ctx->i32, 
"");
+   } else {
+   assert(!"TGSI_SEMANTIC_HELPER_INVOCATION unsupported");
+   return;
+   }
break;
-#endif

default:
assert(!"unknown system value");
return;
}

radeon_bld->system_values[index] = value;
 }

 static void declare_compute_memory(struct si_shader_context *radeon_bld,




Re: [Mesa-dev] [PATCH] glsl: fix opt_minmax redundancy checks against baserange

2017-01-06 Thread Nicolai Hähnle

On 06.01.2017 00:26, Timothy Arceri wrote:

Marking operations as redundant if they are equal to the base
range is fine when the tree structure is something like this:

        max
       /   \
     max    b
    /   \
   3    max
       /   \
      3     a

But the opt falls apart with a tree like this:

         max
       /     \
    max       max
   /   \     /   \
  3     a   b     3


This will just become max(a, b), right?

The problem is that both branches are treated the same: descending in 
the left branch will prune the constant, and then descending the right 
branch will prune the constant there as well, because limits[0] wasn't 
updated to take the change on the left branch into account, and so we 
still get [3,\infty) as baserange.
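
To make the failure concrete, here is a tiny C sketch (the function names are made up for the example, and scalars stand in for the GLSL IR) that evaluates the tree before pruning and after both constants have been pruned:

```c
static float max_f(float x, float y)
{
    return x > y ? x : y;
}

/* The tree as written: max(max(3, a), max(b, 3)). */
static float original_tree(float a, float b)
{
    return max_f(max_f(3.0f, a), max_f(b, 3.0f));
}

/* What is left once the constant is pruned from both branches. */
static float over_pruned_tree(float a, float b)
{
    return max_f(a, b);
}

/* Pruning the constant from only one branch keeps the lower bound. */
static float correctly_pruned_tree(float a, float b)
{
    return max_f(max_f(3.0f, a), b);
}
```

For a = 1 and b = 2 the original tree yields 3 while the over-pruned one yields 2, so the lower bound of 3 is lost; pruning only one of the two constants would have been safe.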


With your change, nothing will be done at all. That's fine as far as I'm 
concerned. I don't see a clean way of updating the limits on-the-fly.


Thanks for the piglit test. An explanatory comment would be nice, but 
either way, this patch is


Reviewed-by: Nicolai Hähnle 




I'm not 100% sure what is going wrong as a tree like:

         max
       /     \
    max       max
   /   \     /   \
  3     a   b     4

Will remove the right hand side max just fine. But not marking
limits that equal the base range seems to fix the problem.

NIR algebraic opt will clean up the first tree for us anyway; hopefully
other backends are smart enough to do this also.

Cc: "13.0" 
---
 src/compiler/glsl/opt_minmax.cpp | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/compiler/glsl/opt_minmax.cpp b/src/compiler/glsl/opt_minmax.cpp
index 29482ee..9f64db9 100644
--- a/src/compiler/glsl/opt_minmax.cpp
+++ b/src/compiler/glsl/opt_minmax.cpp
@@ -355,7 +355,7 @@ ir_minmax_visitor::prune_expression(ir_expression *expr, 
minmax_range baserange)
   */
  if (!is_redundant && limits[i].low && baserange.high) {
 cr = compare_components(limits[i].low, baserange.high);
-if (cr >= EQUAL && cr != MIXED)
+if (cr > EQUAL && cr != MIXED)
is_redundant = true;
  }
   } else {
@@ -373,7 +373,7 @@ ir_minmax_visitor::prune_expression(ir_expression *expr, 
minmax_range baserange)
   */
  if (!is_redundant && limits[i].high && baserange.low) {
 cr = compare_components(limits[i].high, baserange.low);
-if (cr <= EQUAL)
+if (cr < EQUAL)
is_redundant = true;
  }
   }




Re: [Mesa-dev] [PATCH] glsl: fix opt_minmax redundancy checks against baserange

2017-01-06 Thread Timothy Arceri
Hi Iago/Petri,

Just Ccing you guys in case you have a better solution (I know it's been
a while since this was written).

This is causing incorrect shaders in at least Serious Sam 3, and possibly
others. I've sent some piglit tests to reproduce it to the piglit list.

Thanks,
Tim

On Fri, 2017-01-06 at 10:26 +1100, Timothy Arceri wrote:
> Marking operations as redundant if they are equal to the base
> range is fine when the tree structure is something like this:
> 
>         max
>        /   \
>      max    b
>     /   \
>    3    max
>        /   \
>       3     a
> 
> But the opt falls apart with a tree like this:
> 
>          max
>        /     \
>     max       max
>    /   \     /   \
>   3     a   b     3
> 
> I'm not 100% sure what is going wrong as a tree like:
> 
>          max
>        /     \
>     max       max
>    /   \     /   \
>   3     a   b     4
> 
> Will remove the right hand side max just fine. But not marking
> limits that equal the base range seems to fix the problem.
> 
> NIR algebraic opt will clean up the first tree for us anyway; hopefully
> other backends are smart enough to do this also.
> 
> Cc: "13.0" 
> ---
>  src/compiler/glsl/opt_minmax.cpp | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/src/compiler/glsl/opt_minmax.cpp
> b/src/compiler/glsl/opt_minmax.cpp
> index 29482ee..9f64db9 100644
> --- a/src/compiler/glsl/opt_minmax.cpp
> +++ b/src/compiler/glsl/opt_minmax.cpp
> @@ -355,7 +355,7 @@ ir_minmax_visitor::prune_expression(ir_expression
> *expr, minmax_range baserange)
>    */
>   if (!is_redundant && limits[i].low && baserange.high) {
>  cr = compare_components(limits[i].low, baserange.high);
> -if (cr >= EQUAL && cr != MIXED)
> +if (cr > EQUAL && cr != MIXED)
> is_redundant = true;
>   }
>    } else {
> @@ -373,7 +373,7 @@ ir_minmax_visitor::prune_expression(ir_expression
> *expr, minmax_range baserange)
>    */
>   if (!is_redundant && limits[i].high && baserange.low) {
>  cr = compare_components(limits[i].high, baserange.low);
> -if (cr <= EQUAL)
> +if (cr < EQUAL)
> is_redundant = true;
>   }
>    }


Re: [Mesa-dev] [PATCH 3/7] gallium: add FBFETCH opcode to retrieve the current sample value

2017-01-06 Thread Nicolai Hähnle

On 05.01.2017 17:59, Ilia Mirkin wrote:

On Thu, Jan 5, 2017 at 11:30 AM, Nicolai Hähnle  wrote:

On 05.01.2017 17:02, Ilia Mirkin wrote:


On Thu, Jan 5, 2017 at 10:48 AM, Nicolai Hähnle 
wrote:


On 02.01.2017 21:41, Marek Olšák wrote:



On Mon, Jan 2, 2017 at 7:01 AM, Ilia Mirkin 
wrote:



Signed-off-by: Ilia Mirkin 
---
 src/gallium/auxiliary/tgsi/tgsi_info.c |  2 +-
 src/gallium/docs/source/tgsi.rst   | 11 +++
 src/gallium/include/pipe/p_shader_tokens.h |  2 +-
 3 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_info.c
b/src/gallium/auxiliary/tgsi/tgsi_info.c
index 37549aa..e34b8c7 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_info.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_info.c
@@ -106,7 +106,7 @@ static const struct tgsi_opcode_info
opcode_info[TGSI_OPCODE_LAST] =
{ 1, 3, 0, 0, 0, 0, 0, COMP, "CMP", TGSI_OPCODE_CMP },
{ 1, 1, 0, 0, 0, 0, 0, CHAN, "SCS", TGSI_OPCODE_SCS },
{ 1, 2, 1, 0, 0, 0, 0, OTHR, "TXB", TGSI_OPCODE_TXB },
-   { 0, 1, 0, 0, 0, 0, 1, NONE, "", 69 },  /* removed */
+   { 1, 1, 0, 0, 0, 0, 0, OTHR, "FBFETCH", TGSI_OPCODE_FBFETCH },
{ 1, 2, 0, 0, 0, 0, 0, COMP, "DIV", TGSI_OPCODE_DIV },
{ 1, 2, 0, 0, 0, 0, 0, REPL, "DP2", TGSI_OPCODE_DP2 },
{ 1, 2, 1, 0, 0, 0, 0, OTHR, "TXL", TGSI_OPCODE_TXL },
diff --git a/src/gallium/docs/source/tgsi.rst
b/src/gallium/docs/source/tgsi.rst
index d2d30b4..accbe1d 100644
--- a/src/gallium/docs/source/tgsi.rst
+++ b/src/gallium/docs/source/tgsi.rst
@@ -2561,6 +2561,17 @@ Resource Access Opcodes
   image, while .w will contain the number of samples for multi-sampled
   images.

+.. opcode:: FBFETCH - Load data from framebuffer
+
+  Syntax: ``FBFETCH dst, output``
+
+  Example: ``FBFETCH TEMP[0], OUT[0]``
+
+  Returns the color of the current position in the framebuffer from
+  before this fragment shader invocation. Always returns the same
+  value from multiple calls for a particular output within a single
+  invocation.




I'm not a fan of this last sentence. It's true that we could somehow bend
things in the compiler to make the sentence true, but

(a) The statement is clearly false with a straight-forward implementation
of
the instruction: multiple fragment shaders can be simultaneously
in-flight
on the same pixel/sample. A second FBFETCH could happen after an earlier
invocation on the same pixel finished and get the new framebuffer value.

(b) I'm not aware of an API that actually requires this guarantee.



If the value is always the same, can it be declared as a system value
instead?




I don't know. I'd remove the statement about the value always being the
same
to begin with. And with an eye to how this actually ends up being
implemented, and possible interactions with
ARB_fragment_shader_interlock,
I'd say it makes sense (and our lives easier) for the TGSI to define
_when_
the framebuffer value is supposed to be read, and for that it makes sense
to
have an instruction for it.



In case it's not obvious, this is primarily for
KHR_blend_equation_advanced. It's illegal to use it with overdraw
without a barrier first.



Well, you get an undefined value :)

My point is that there's a plausible implementation of FBFETCH as an
instruction in which "the same value will be returned within a single
invocation" is only guaranteed if the application follows the rules
involving BlendBarrier.



There's also
KHR_blend_equation_advanced_coherent and EXT_shader_framebuffer_fetch,
which do allow overdraw. I'm not sure how the ordering is specified
(or, how it's specified for regular blending). The gl_LastFragData
stuff are a non-writable space, and as such, it makes sense that
they'd be computed once on invocation and then kept constant
throughout the shader run.



It still leaves open the question of _when_ they should be computed,
especially in the future when guarantees about the order of fragment shaders
are required (i.e. KHR_blend_equation_advanced_coherent): If you load the
data too early, you may lose some potential for parallelism because you have
to spend more time waiting for earlier invocations to finish. This obviously
depends on the details of the hardware, but ARB_fragment_shader_interlock
suggests that it is (or will be?) quite common to have synchronization that
is rather fine-grained.


This is related to the implementation issues I had on nouveau with
the sysval - basically the code looks like

if (advanced blend is enabled) {
MOV TEMP[0], SV
IF
  use(sv)
ELSE
  use(sv)
use(sv some more)
}

And then the st_glsl_to_tgsi copy propagation logic pushes those SV's
down to the uses. This is further complicated by the fact that the
xyzw values are logically independent at the TGSI level. And now I'm
stuck either

(a) getting the value once at the top of the shader, even if advanced
blend might be disabled entirely
(b) fetching the texture once per bb
(c) figuring out 

Re: [Mesa-dev] [PATCH 0/5] nvc0: better instruction pipelining for Maxwell GPUs

2017-01-06 Thread Samuel Pitoiset



On 01/06/2017 11:53 AM, Jan Vesely wrote:

On Fri, 2016-12-23 at 00:15 +0100, Samuel Pitoiset wrote:

Hello,

This series makes use of the scheduling control code in order to improve the
instruction pipelining on Maxwell GPUs.

Starting with the Kepler architecture, a control instruction has to be
inserted every 7 instructions. Maxwell added additional control codes and the
control instruction now has to appear every 3 instructions. Maxwell control
codes are really powerful and well documented [1]. By the way, I would like to
thank Scott Gray who did awesome reverse-engineering work, although I had to
figure out the missing parts myself.

On Maxwell, control codes are mainly used for setting the number of stall
counts and for producing/consuming dependency barriers in order to avoid
hazards. I'm not going to explain in detail how they work because the
documentation is quite good and because I added explanations here and there
in the source code. But the main thing to understand is that the previous
control code used by default (i.e. st 0x0) means "wait for all dependencies
and stall the pipeline for 15 cycles, which is the maximum".
Which is quite bad...

Now, let's have a look at the (impressive) performance improvements. :-)
I measured on a GeForce GTX 750 Ti (GM107) reclocked to the highest perf level,
with and without the control codes (NV50_PROG_SCHED=0/1).

app: number of FPS without -> number of FPS with (+gain%)

FurMark:   13  ->  42  (+223%)
Pixmark Piano: 2   ->  7   (+250%)
Pixmark Volposion: 6   ->  20  (+233%)
Julia F32: 61  ->  219 (+259%)
LightMarks:352 ->  685 (+94%)
Heaven (low):  51  ->  102 (+100%)
Heaven (ultra):14  ->  27  (+93%)
Valley (low):  30  ->  68  (+126%)
Valley (ultra):18  ->  39  (+100%)
Talos (low):   32  ->  50  (+56%)
Talos (ultra): 7   ->  14  (+100%)
Shadow of Mordor (lowest): 13  ->  20  (+53%)
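
As a reading aid for the gain column, here is a minimal C sketch, assuming the percentage is the relative FPS increase cut down to an integer (the rounding actually used for the list is not stated, and `gain_percent()` is a name made up for this example):

```c
/* Relative gain in percent between two FPS figures, truncated toward
 * zero by the integer division. fps_without must be non-zero. */
static int gain_percent(int fps_without, int fps_with)
{
    return (fps_with - fps_without) * 100 / fps_without;
}
```

For example, FurMark going from 13 to 42 FPS gives (42 - 13) * 100 / 13, which this function truncates to 223.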

That's it! I think it's enough to understand the power of Maxwell control
codes. We may get additional numbers from Phoronix (wink, wink, Michael).
As I said in the main patch, the control codes can be disabled with
'export NV50_PROG_SCHED=0'.

Now, let's have a look at how nouveau performs compared to NVIDIA's blob.

FurMark:   42  ->  59   (+40%)
Pixmark Piano: 7   ->  13   (+85%)
Pixmark Volposion: 20  ->  42   (+110%)
Julia F32: 219 ->  351  (+60%)
LightMarks:685 ->  1192 (+74%)
Heaven (low):  102 ->  144  (+41%)
Heaven (ultra):27  ->  46   (+70%)
Valley (low):  68  ->  94   (+38%)
Valley (ultra):39  ->  60   (+53%)
Talos (low):   50  ->  128  (+156%)
Talos (ultra): 14  ->  30   (+114%)
Shadow of Mordor (lowest): 20  ->  77   (+285%)


I see +45% and +33% on my gm107m (prime) for Valley and Heaven
(1024x768, medium), which pushes it above the integrated Skylake iGPU
performance. There are visual artifacts in both demos, but they appear
the same with and without these patches.

Tested-by: Jan Vesely 


Thanks for testing.

Yeah, there is a sync issue with at least Valley and Heaven on Maxwell.

If you try with MESA_DEBUG=flush, the visual artifacts no longer happen.



regards,
Jan



Nouveau is still far away from the blob, but now I think Maxwell is actually
in roughly the same shape as Kepler in terms of performance and features.
Speaking about this, I will enable OpenGL 4.3 on Maxwell in a separate patch,
later on.

The overhead at compile time added by this series is rather small. For a full
shader-db run with my private repository of shaders, it takes approximately
208s for compiling 25k shaders before the series and approximately 211s after.
Less than 2% of overhead and it's comparable to a full shader-db run on Kepler.

No regressions with both piglit and dEQP (tested multiple times) and all
benchmarks/games I have tried render fine and seem to be quite stable.

Due to a lack of time, some parts are still left to do and some others could
be improved. With the following ideas implemented I'm pretty sure we can
improve performance significantly.

* Add support for the yield flag. This seems to be a hint to the hardware for
  improving how the work is balanced between the warps. I didn't figure out
  how and where to use it without breaking a bunch of things. Need time and
  patience.

* Add support for dual-issue; the rules are pretty different from Kepler's,
  especially because of the dependency barriers. Note that the yield flag has
  to be set, otherwise the hardware won't dual-issue and in fact it will wait
  for all dependencies (i.e. st 0x0), which is really different from what you
  are looking for.

* Reduce stall counts. A bunch of instructions have a read latency which is the
  number of cycles before they can actually read the sources. This should be
  

Re: [Mesa-dev] [PATCH 0/5] nvc0: better instruction pipelining for Maxwell GPUs

2017-01-06 Thread Jan Vesely
On Fri, 2016-12-23 at 00:15 +0100, Samuel Pitoiset wrote:
> Hello,
> 
> This series makes use of the scheduling control code in order to improve the
> instruction pipelining on Maxwell GPUs.
> 
> Starting with the Kepler architecture, where a control instruction has to be
> inserted every 7 instructions, Maxwell added additional control codes and the
> control instruction now has to be every 3 instructions. Maxwell control codes
> are really powerful and well documented [1]. By the way, I would like to thank
> Scott Gray who did an awesome reverse engineering work, although I had to
> figure out the missing parts myself.
> 
> On Maxwell, control codes are mainly used for setting the number of stall
> counts and for producing/consuming dependency barriers in order to avoid
> hazards. I'm not going to explain in detail how they work because the
> documentation is quite good and because I added explanations here and there
> in the source code. But the main thing to understand is that the previous
> control code used by default (i.e. st 0x0) means "wait for all dependencies
> and stall the pipeline for 15 cycles, which is the maximum".
> Which is quite bad...
> 
> Now, let's have a look at the (impressive) performance improvements. :-)
> I measured on a GeForce GTX 750 Ti (GM107) reclocked to the highest perf 
> level,
> with and without the control codes (NV50_PROG_SCHED=0/1).
> 
> app: number of FPS without -> number of FPS with (+gain%)
> 
> FurMark:   13  ->  42  (+223%)
> Pixmark Piano: 2   ->  7   (+250%)
> Pixmark Volposion: 6   ->  20  (+233%)
> Julia F32: 61  ->  219 (+259%)
> LightMarks:352 ->  685 (+94%)
> Heaven (low):  51  ->  102 (+100%)
> Heaven (ultra):14  ->  27  (+93%)
> Valley (low):  30  ->  68  (+126%)
> Valley (ultra):18  ->  39  (+100%)
> Talos (low):   32  ->  50  (+56%)
> Talos (ultra): 7   ->  14  (+100%)
> Shadow of Mordor (lowest): 13  ->  20  (+53%)
> 
> That's it! I think it's enough to understand the power of Maxwell control
> codes. We may get additional numbers from Phoronix (wink, wink, Michael).
> As I said in the main patch, the control codes can be disabled with
> 'export NV50_PROG_SCHED=0'.
> 
> Now, let's have a look at how nouveau performs compared to NVIDIA's blob.
> 
> FurMark:   42  ->  59   (+40%)
> Pixmark Piano: 7   ->  13   (+85%)
> Pixmark Volposion: 20  ->  42   (+110%)
> Julia F32: 219 ->  351  (+60%)
> LightMarks:685 ->  1192 (+74%)
> Heaven (low):  102 ->  144  (+41%)
> Heaven (ultra):27  ->  46   (+70%)
> Valley (low):  68  ->  94   (+38%)
> Valley (ultra):39  ->  60   (+53%)
> Talos (low):   50  ->  128  (+156%)
> Talos (ultra): 14  ->  30   (+114%)
> Shadow of Mordor (lowest): 20  ->  77   (+285%)

I see +45% and +33% on my gm107m (prime) for Valley and Heaven
(1024x768, medium), which pushes it above the integrated Skylake iGPU
performance. There are visual artifacts in both demos, but they appear
the same with and without these patches.

Tested-by: Jan Vesely 

regards,
Jan

> 
> Nouveau is still far away from the blob, but now I think Maxwell is actually 
> in roughly the same shape as Kepler in terms of performance and features.
> Speaking about this, I will enable OpenGL 4.3 on Maxwell in a separate patch,
> later on.
> 
> The overhead at compile time added by this series is rather small. For a full
> shader-db run with my private repository of shaders, it takes approximately
> 208s for compiling 25k shaders before the series and approximately 211s after.
> Less than 2% of overhead and it's comparable to a full shader-db run on 
> Kepler.
> 
> No regressions with both piglit and dEQP (tested multiple times) and all
> benchmarks/games I have tried render fine and seem to be quite stable.
> 
> Due to a lack of time, some parts are still left to do and some others could
> be improved. With the following ideas implemented I'm pretty sure we can
> improve performance significantly.
> 
> * Add support for the yield flag. This seems to be a hint to the hardware for
>   improving how the work is balanced between the warps. I didn't figure out
>   how and where to use it without breaking a bunch of things. Need time and
>   patience.
> 
> * Add support for dual-issue; the rules are pretty different from Kepler's,
>   especially because of the dependency barriers. Note that the yield flag has
>   to be set, otherwise the hardware won't dual-issue and in fact it will wait
>   for all dependencies (i.e. st 0x0), which is really different from what you
>   are looking for.
> 
> * Reduce stall counts. A bunch of instructions have a read latency which is 
> the
>   number of cycles before they can actually read the sources. This should be
>   fairly easy to implement but 

[Mesa-dev] [PATCH] i965: Fix texturing in the vec4 TCS and GS backends.

2017-01-06 Thread Kenneth Graunke
We were failing to zero m0.2 of the sampler message header for TCS and
GS messages in the simple case.  fs_generator has done this for about
a year now, but we missed it in vec4_generator.
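
The rule the patch below applies can be condensed into a small predicate. This is only a sketch for readers: `enum stage` and `must_write_header_dw2()` are hypothetical names for this example, not i965 identifiers; the real change tests `MESA_SHADER_TESS_CTRL` and `MESA_SHADER_GEOMETRY` inside `generate_tex()`.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for gl_shader_stage. */
enum stage {
    STAGE_VERTEX,
    STAGE_TESS_CTRL,
    STAGE_TESS_EVAL,
    STAGE_GEOMETRY,
    STAGE_FRAGMENT
};

/* Decide whether DWord 2 of the sampler message header needs an
 * explicit write. Stages whose g0.2 payload arrives as 0 only need it
 * when dw2 carries bits; TCS and GS always need it, because the value
 * copied from g0 may contain stale bits. */
static bool must_write_header_dw2(enum stage stage, uint32_t dw2)
{
    return dw2 != 0 ||
           stage == STAGE_TESS_CTRL ||
           stage == STAGE_GEOMETRY;
}
```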

Fixes ES31-CTS.core.texture_cube_map_array.sampling,
GL45-CTS.texture_cube_map_array.sampling, and many
dEQP-GLES31.functional.shaders.opaque_type_indexing.sampler subtests:
- dynamically_uniform.tessellation_control.isampler3d
- dynamically_uniform.tessellation_control.isamplercube
- dynamically_uniform.tessellation_control.sampler2d
- dynamically_uniform.tessellation_control.usamplercube
- dynamically_uniform.tessellation_control.sampler2darray
- dynamically_uniform.tessellation_control.isampler2darray
- dynamically_uniform.tessellation_control.usampler3d
- dynamically_uniform.tessellation_control.usampler2darray
- dynamically_uniform.tessellation_control.usampler2d
- dynamically_uniform.tessellation_control.sampler3d
- dynamically_uniform.tessellation_control.samplercube
- dynamically_uniform.tessellation_control.isampler2d
- uniform.tessellation_control.isampler3d
- uniform.tessellation_control.isamplercube
- uniform.tessellation_control.usampler2d
- uniform.tessellation_control.usampler3d
- uniform.tessellation_control.sampler2darray
- uniform.tessellation_control.isampler2darray
- uniform.tessellation_control.usampler2darray
- uniform.tessellation_control.sampler2d
- uniform.tessellation_control.usamplercube
- uniform.tessellation_control.sampler3d
- uniform.tessellation_control.samplercube
- uniform.tessellation_control.isampler2d

Cc: mesa-sta...@lists.freedesktop.org
Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_vec4_generator.cpp | 14 --
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
index 3d688cff144..f095cc2d0f2 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
@@ -106,6 +106,7 @@ generate_math2_gen4(struct brw_codegen *p,
 static void
 generate_tex(struct brw_codegen *p,
  struct brw_vue_prog_data *prog_data,
+ gl_shader_stage stage,
  vec4_instruction *inst,
  struct brw_reg dst,
  struct brw_reg src,
@@ -238,8 +239,16 @@ generate_tex(struct brw_codegen *p,
  */
 dw2 |= GEN9_SAMPLER_SIMD_MODE_EXTENSION_SIMD4X2;
 
- if (dw2)
+ /* The VS, DS, and FS stages have the g0.2 payload delivered as 0,
+  * so header0.2 is 0 when g0 is copied.  The HS and GS stages do
+  * not, so we must set it to 0 to avoid setting undesirable bits
+  * in the message header.
+  */
+ if (dw2 ||
+ stage == MESA_SHADER_TESS_CTRL ||
+ stage == MESA_SHADER_GEOMETRY) {
 brw_MOV(p, get_element_ud(header, 2), brw_imm_ud(dw2));
+ }
 
  brw_adjust_sampler_state_pointer(p, header, sampler_index);
  brw_pop_insn_state(p);
@@ -1748,7 +1757,8 @@ generate_code(struct brw_codegen *p,
   case SHADER_OPCODE_TG4:
   case SHADER_OPCODE_TG4_OFFSET:
   case SHADER_OPCODE_SAMPLEINFO:
- generate_tex(p, prog_data, inst, dst, src[0], src[1], src[2]);
+ generate_tex(p, prog_data, nir->stage,
+  inst, dst, src[0], src[1], src[2]);
  break;
 
   case VS_OPCODE_URB_WRITE:
-- 
2.11.0



[Mesa-dev] [PATCH] glsl: always do sqrt(abs()) and inversesqrt(abs())

2017-01-06 Thread Samuel Pitoiset
D3D always computes the absolute value while GLSL says that the
result of inversesqrt() is undefined if x <= 0 (and undefined if
x < 0 for sqrt()). But some apps rely on this specific behaviour
which is not clearly defined by OpenGL.

Computing the absolute value before sqrt()/inversesqrt() avoids these
undefined results, especially for apps which have been ported from D3D.
Note that closed drivers also seem to use this quirk.

This gets rid of the NaN values in the "Spec Ops: The Line" game
as well as the black squares with radeonsi. Note that Nouveau is
not affected by this bug because we already take the absolute value
when translating from TGSI to nv50/ir.
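
A scalar C model of the quirk, for illustration only. `model_sqrt()` is a hypothetical stand-in for a backend square root that yields NaN for negative input (written with Newton iterations so the sketch does not depend on libm), and the `quirk_*` names are made up for this example; the real change wraps the GLSL IR operand in `abs()`.

```c
#include <math.h>   /* NAN */

/* Hypothetical backend sqrt: NaN for negative input, as seen by apps
 * that rely on the D3D behaviour. */
static float model_sqrt(float x)
{
    if (x < 0.0f)
        return NAN;
    if (x == 0.0f)
        return 0.0f;

    /* Newton iterations, starting above the root. */
    float r = x > 1.0f ? x : 1.0f;
    for (int i = 0; i < 32; i++)
        r = 0.5f * (r + x / r);
    return r;
}

static float abs_f(float x)
{
    return x < 0.0f ? -x : x;
}

/* sqrt(abs(x)): never sees a negative operand. */
static float quirk_sqrt(float x)
{
    return model_sqrt(abs_f(x));
}

/* inversesqrt(abs(x)); x == 0 still divides by zero, as before. */
static float quirk_inversesqrt(float x)
{
    return 1.0f / model_sqrt(abs_f(x));
}
```

In this model `model_sqrt(-4.0f)` is NaN, while `quirk_sqrt(-4.0f)` comes out at about 2 and `quirk_inversesqrt(-4.0f)` at about 0.5.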

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97338

Signed-off-by: Samuel Pitoiset 
---
 src/compiler/glsl/builtin_functions.cpp | 22 --
 1 file changed, 20 insertions(+), 2 deletions(-)

diff --git a/src/compiler/glsl/builtin_functions.cpp 
b/src/compiler/glsl/builtin_functions.cpp
index 797af08b6c..f816f2ff7d 100644
--- a/src/compiler/glsl/builtin_functions.cpp
+++ b/src/compiler/glsl/builtin_functions.cpp
@@ -3623,12 +3623,30 @@ builtin_builder::_pow(const glsl_type *type)
return binop(always_available, ir_binop_pow, type, type, type);
 }
 
+ir_function_signature *
+builtin_builder::_sqrt(builtin_available_predicate avail,
+   const glsl_type *type)
+{
+   ir_variable *x = in_var(type, "x");
+   MAKE_SIG(type, avail, 1, x);
+   body.emit(ret(expr(ir_unop_sqrt, abs(x))));
+   return sig;
+}
+
+ir_function_signature *
+builtin_builder::_inversesqrt(builtin_available_predicate avail,
+  const glsl_type *type)
+{
+   ir_variable *x = in_var(type, "x");
+   MAKE_SIG(type, avail, 1, x);
+   body.emit(ret(expr(ir_unop_rsq, abs(x))));
+   return sig;
+}
+
 UNOP(exp, ir_unop_exp,  always_available)
 UNOP(log, ir_unop_log,  always_available)
 UNOP(exp2,ir_unop_exp2, always_available)
 UNOP(log2,ir_unop_log2, always_available)
-UNOPA(sqrt,ir_unop_sqrt)
-UNOPA(inversesqrt, ir_unop_rsq)
 
 /** @} */
 
-- 
2.11.0



[Mesa-dev] [Bug 97102] [dri][swr] stack overflow / infinite loop with GALLIUM_DRIVER=swr

2017-01-06 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=97102

--- Comment #4 from Jan Ziak <0xe2.0x9a.0...@gmail.com> ---
(In reply to Jan Ziak from comment #3)
> $ LIBGL_ALWAYS_SOFTWARE=1 GALLIUM_DRIVER=swr

$ LIBGL_ALWAYS_SOFTWARE=1 GALLIUM_DRIVER=swr ./gl

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


[Mesa-dev] [Bug 97102] [dri][swr] stack overflow / infinite loop with GALLIUM_DRIVER=swr

2017-01-06 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=97102

--- Comment #3 from Jan Ziak <0xe2.0x9a.0...@gmail.com> ---
I am unable to confirm whether this bug has been resolved.

$ LIBGL_ALWAYS_SOFTWARE=1 GALLIUM_DRIVER=swr
SWR detected AVX
Segmentation fault

$ gdb
(gdb) bt
#0  CreateThreadPool (pContext=pContext@entry=0x63a340, pPool=pPool@entry=0x63a510)
    at /var/tmp/portage/media-libs/mesa-/work/mesa-/src/gallium/drivers/swr/rasterizer/core/threads.cpp:840
#1  0x7584c1ce in SwrCreateContext (pCreateInfo=pCreateInfo@entry=0x7fffcf60)
    at /var/tmp/portage/media-libs/mesa-/work/mesa-/src/gallium/drivers/swr/rasterizer/core/api.cpp:109
#2  0x75835d5b in swr_create_context (p_screen=0x7715f0, priv=0x0, flags=)
    at /var/tmp/portage/media-libs/mesa-/work/mesa-/src/gallium/drivers/swr/swr_context.cpp:466
#3  0x773454ce in st_api_create_context (stapi=, smapi=0x757f60, attribs=0x7fffd110, error=0x7fffd10c, shared_stctxi=0x0)
    at /var/tmp/portage/media-libs/mesa-/work/mesa-/src/mesa/state_tracker/st_manager.c:662
#4  0x774b34ba in dri_create_context (api=, visual=0x77c480, cPriv=0x63a2d0, major_version=, minor_version=, flags=, notify_reset=false, error=0x7fffd2dc, sharedContextPrivate=0x0)
    at /var/tmp/portage/media-libs/mesa-/work/mesa-/src/gallium/state_trackers/dri/dri_context.c:123
#5  0x774b298f in driCreateContextAttribs (screen=0x617ae0, api=, config=0x77c480, shared=, num_attribs=, attribs=, error=0x7fffd2dc, data=0x63a130)
    at /var/tmp/portage/media-libs/mesa-/work/mesa-/src/mesa/drivers/dri/common/dri_util.c:448
#6  0x00366d041baf in drisw_create_context_attribs (base=0x630b10, config_base=0x784240, shareList=, num_attribs=, attribs=, error=0x7fffd2dc)
    at /var/tmp/portage/media-libs/mesa-/work/mesa-/src/glx/drisw_glx.c:476
#7  0x00366d01abc0 in glXCreateContextAttribsARB (dpy=0x603070, config=0x784240, share_context=0x0, direct=1, attrib_list=0x7fffd330)
    at /var/tmp/portage/media-libs/mesa-/work/mesa-/src/glx/create_context.c:78
#8  0x77d96e55 in _glfwCreateContextGLX () from /usr/lib64/libglfw.so.3
#9  0x77d9394d in _glfwPlatformCreateWindow () from /usr/lib64/libglfw.so.3
#10 0x77d8dd3d in glfwCreateWindow () from /usr/lib64/libglfw.so.3
#11 0x00400bf0 in main ()

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH] anv: don't skip the VUE header if we are reading gl_Layer in a fragment shader

2017-01-06 Thread Iago Toral
On Thu, 2017-01-05 at 14:08 +0100, Iago Toral Quiroga wrote:
> This is the same thing we do in the GL driver: the hardware provides
> gl_Layer in the VUE header, so when the fragment shader reads it we
> can't skip it.

Forgot to add that this fixes the following Vulkan CTS tests:

dEQP-VK.geometry.layered.1d_array.fragment_layer
dEQP-VK.geometry.layered.2d_array.fragment_layer
dEQP-VK.geometry.layered.cube.fragment_layer

> ---
> 
> With this patch we now successfully read gl_Layer in fragment shaders.
> Layered rendering still does not work though, probably because we still
> need to hook up the layer_id stuff that Jason added some time ago. I'll
> look into that next.

And by this I mean that the dEQP-VK.geometry.layered.cube_array.* tests
still fail.

>  src/intel/vulkan/genX_pipeline.c | 20 ++++++++++++++++----
>  1 file changed, 16 insertions(+), 4 deletions(-)
> 
> diff --git a/src/intel/vulkan/genX_pipeline.c b/src/intel/vulkan/genX_pipeline.c
> index 845d020..c1d8ae6 100644
> --- a/src/intel/vulkan/genX_pipeline.c
> +++ b/src/intel/vulkan/genX_pipeline.c
> @@ -291,6 +291,8 @@ emit_3dstate_sbe(struct anv_pipeline *pipeline)
>  #  define swiz sbe
>  #endif
>  
> +   /* Skip the VUE header and position slots by default */
> +   unsigned urb_entry_read_offset = 1;
> int max_source_attr = 0;
> for (int attr = 0; attr < VARYING_SLOT_MAX; attr++) {
>    int input_index = wm_prog_data->urb_setup[attr];
> @@ -298,6 +300,12 @@ emit_3dstate_sbe(struct anv_pipeline *pipeline)
>    if (input_index < 0)
>   continue;
>  
> +  /* gl_Layer is stored in the VUE header */
> +  if (attr == VARYING_SLOT_LAYER) {
> + urb_entry_read_offset = 0;
> + continue;
> +  }
> +
>    if (attr == VARYING_SLOT_PNTC) {
>   sbe.PointSpriteTextureCoordinateEnable = 1 << input_index;
>   continue;
> @@ -322,18 +330,22 @@ emit_3dstate_sbe(struct anv_pipeline *pipeline)
>   swiz.Attribute[input_index].ComponentOverrideZ = true;
>   swiz.Attribute[input_index].ComponentOverrideW = true;
>    } else {
> - assert(slot >= 2);
> - const int source_attr = slot - 2;
> - max_source_attr = MAX2(max_source_attr, source_attr);
>   /* We have to subtract two slots to accout for the URB
> entry output
>    * read offset in the VS and GS stages.
>    */
> + assert(slot >= 2);
> + const int source_attr = slot - 2 * urb_entry_read_offset;
> + max_source_attr = MAX2(max_source_attr, source_attr);
>   swiz.Attribute[input_index].SourceAttribute = source_attr;
>    }
> }
>  
> -   sbe.VertexURBEntryReadOffset = 1; /* Skip the VUE header and position slots */
> +   sbe.VertexURBEntryReadOffset = urb_entry_read_offset;
> sbe.VertexURBEntryReadLength = DIV_ROUND_UP(max_source_attr + 1, 2);
> +#if GEN_GEN >= 8
> +   sbe.ForceVertexURBEntryReadOffset = true;
> +   sbe.ForceVertexURBEntryReadLength = true;
> +#endif
>  
> uint32_t *dw = anv_batch_emit_dwords(&pipeline->batch,
>  GENX(3DSTATE_SBE_length));
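
The offset arithmetic in the quoted hunk can be sketched in isolation.
This is a simplified model under stated assumptions, not the driver
code: compute_sbe_read and struct sbe_read are hypothetical names, the
slot array stands in for the urb_setup walk, the offset is decided
before the loop instead of inside it, and MAX2/DIV_ROUND_UP are
redefined locally.

```c
#include <stdbool.h>

#define MAX2(a, b) ((a) > (b) ? (a) : (b))
#define DIV_ROUND_UP(a, b) (((a) + (b) - 1) / (b))

/* Hypothetical result type: both fields count pairs of slots. */
struct sbe_read {
   unsigned offset;   /* models VertexURBEntryReadOffset */
   unsigned length;   /* models VertexURBEntryReadLength */
};

/* slots[] holds the VUE slot of each attribute the fragment shader
 * reads (excluding gl_Layer itself); reads_layer says whether the
 * shader also reads gl_Layer from the VUE header.
 */
static struct sbe_read
compute_sbe_read(const int *slots, int num_slots, bool reads_layer)
{
   /* Skip the header and position (one pair of slots) unless the
    * header is needed for gl_Layer.
    */
   unsigned urb_entry_read_offset = reads_layer ? 0 : 1;
   int max_source_attr = 0;

   for (int i = 0; i < num_slots; i++) {
      /* Each unit of read offset skips two slots. */
      int source_attr = slots[i] - 2 * (int)urb_entry_read_offset;
      max_source_attr = MAX2(max_source_attr, source_attr);
   }

   struct sbe_read r;
   r.offset = urb_entry_read_offset;
   r.length = DIV_ROUND_UP(max_source_attr + 1, 2);
   return r;
}
```

For attributes in slots 2, 3 and 4 this gives offset 1 and length 2
without gl_Layer, and offset 0 and length 3 once the header has to be
read as well.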


Re: [Mesa-dev] [PATCH 0/5] nvc0: better instruction pipelining for Maxwell GPUs

2017-01-06 Thread Samuel Pitoiset



On 01/06/2017 03:34 AM, Alexandre Courbot wrote:
> On 12/23/2016 08:15 AM, Samuel Pitoiset wrote:
>> This series makes use of the scheduling control code in order to
>> improve the instruction pipelining on Maxwell GPUs.
>
> Tested this on Jetson TX1. The performance improvement on glmark2 was
> only marginal, with terrain going from 7 to 10 fps at pstate 01 and from
> 29 to 33 fps at pstate 0d (probably due to some other non-shader related
> bottleneck on this board?), but I have not noticed any issue.
>
> Tested-by: Alexandre Courbot

Thanks for testing, Alex!

Yeah, Nouveau has a bunch of other bottlenecks, so I'm not surprised. :)

But I guess GPUTest should improve a lot on TX1, like it does on my GM107.






[Mesa-dev] [PATCH] i965: Properly flush in hsw_pause_transform_feedback().

2017-01-06 Thread Kenneth Graunke
Fixes a number of transform feedback tests when run with Linux 4.8,
which allows us to use the MI_LOAD_REGISTER_REG command, at which point
we started using this new broken path.

ES3-CTS.functional.transform_feedback.array_element.interleaved.lines.*
and Piglit's arb_transform_feedback2/draw-auto are both fixed by this
patch, for example.
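
Why the ordering matters can be sketched with a toy model in C. This is
not how the hardware or the driver is implemented; every name below
(toy_gpu, toy_draw, toy_flush, toy_pause) is hypothetical, and the
model simply assumes that a counter register only reflects draws that
have been flushed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model: draws are queued and the counter register only advances
 * when the queue is flushed.
 */
struct toy_gpu {
   uint32_t pending_prims;     /* primitives from draws not yet executed */
   uint32_t sol_write_offset;  /* stands in for an SOL offset register */
};

static void toy_draw(struct toy_gpu *gpu, uint32_t prims)
{
   gpu->pending_prims += prims;
}

static void toy_flush(struct toy_gpu *gpu)
{
   gpu->sol_write_offset += gpu->pending_prims;
   gpu->pending_prims = 0;
}

/* Returns the value that would be saved when pausing. */
static uint32_t toy_pause(struct toy_gpu *gpu, bool flush_first)
{
   if (flush_first)
      toy_flush(gpu);
   return gpu->sol_write_offset;
}
```

In this model, pausing right after a draw without flushing saves a
stale counter value, while flushing first saves the value that
includes the draw.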

Thanks to Chris Wilson for catching this mistake!

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99030
Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/hsw_sol.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/hsw_sol.c b/src/mesa/drivers/dri/i965/hsw_sol.c
index e299b022706..b0dd150b7df 100644
--- a/src/mesa/drivers/dri/i965/hsw_sol.c
+++ b/src/mesa/drivers/dri/i965/hsw_sol.c
@@ -201,6 +201,9 @@ hsw_pause_transform_feedback(struct gl_context *ctx,
   (struct brw_transform_feedback_object *) obj;
 
if (brw->is_haswell) {
+  /* Flush any drawing so that the counters have the right values. */
+  brw_emit_mi_flush(brw);
+
   /* Save the SOL buffer offset register values. */
   for (int i = 0; i < BRW_MAX_XFB_STREAMS; i++) {
  BEGIN_BATCH(3);
-- 
2.11.0



Re: [Mesa-dev] [PATCH mesa] drirc: remove spurious tabs

2017-01-06 Thread Edward O'Callaghan
Reviewed-by: Edward O'Callaghan 

On 01/06/2017 08:06 AM, Eric Engestrom wrote:
> Signed-off-by: Eric Engestrom 
> ---
>  src/mesa/drivers/dri/common/drirc | 16 ++++++++--------
>  1 file changed, 8 insertions(+), 8 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/common/drirc b/src/mesa/drivers/dri/common/drirc
> index af84ee82e8..97297b7a1c 100644
> --- a/src/mesa/drivers/dri/common/drirc
> +++ b/src/mesa/drivers/dri/common/drirc
> @@ -28,46 +28,46 @@ TODO: document the other workarounds.
>  
>  
>  
> - 
> +
>  
>  
>  
>  
> - 
> +
>  
>  
>   value="true" />
>  
>  
> - 
> +
>  
>  
>   value="true" />
>  
>  
> - 
> +
>  
>  
>   value="true" />
>  
>  
> - 
> +
>  
>  
>   value="true" />
>  
>  
> - 
> +
>  
>   executable="OilRush_x86">
>  
>   value="true" />
> - 
> +
>  
>   executable="OilRush_x64">
>  
>   value="true" />
> - 
> +
>  
>  
>  
> 


