Re: [Mesa-dev] [PATCH 0/5] GL_OES_required_internalformat

2017-06-11 Thread Tapani Pälli



On 05/18/2017 09:39 PM, Eric Anholt wrote:

Eric Anholt  writes:


This series came out of fixing dEQP failures on vc4's GLES2 context.
Mesa was allowing RGB565 textures, which is only valid with
GL_OES_required_internalformat.  Rather than disable RGB565, I decided
the extension was easy enough to support.

I've sent one piglit test for renderbuffer sizing, and dEQP has tests
for whether enums get accepted for TexImage.

There's a functional question in patch #2, see the comment there, and
there's a question of whether the extension should be dummy_true in
patch #5.

branch: https://github.com/anholt/mesa/commits/required-internalformat


I would still love review on this series.



Earlier I took a brief look at series and run through our CI, there were 
many failing tests but t seems I forgot to reply/report .. I'll run it 
again and let you know what's the result.


// Tapani
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 0/3] [RFC] mesa/st: glsl_to_tgsi: improved temp-reg lifetime estimation

2017-06-11 Thread Michel Dänzer
On 10/06/17 08:15 AM, Gert Wollny wrote:
> 
> Piglit shows 7 fixes and 6 regressions compared to git 8fac894f, but they 
> don't
> seem to be related to shaders.

Which tests regressed (maybe you can put up a piglit HTML summary
somewhere generated from a run with and without your patches)? Do they
consistently pass without your patches and fail with them?


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH mesa 0/2] egl: wire up swap_with_damage() and enable it on Android

2017-06-11 Thread Tapani Pälli

Hi Eric;

EGL_KHR_swap_buffers_with_damage is implemented by Android's EGL wrapper 
library so these tests are passing already, no implementation required 
in Mesa side.


Thanks;

On 06/12/2017 01:09 AM, Eric Engestrom wrote:

This is completely untested, as I don't have access to an Android
test device, which is the only platform that can benefit from this
improvement so far.

Can someone run dEQP-EGL.functional.swap_buffers_with_damage.*
on Android for me?

Cheers,
   Eric

---

Eric Engestrom (2):
   egl: improve dri2_fallback_swap_buffers_with_damage()
   egl/android: enable EGL_EXT_swap_buffers_with_damage

  src/egl/drivers/dri2/egl_dri2_fallbacks.h | 1 +
  src/egl/drivers/dri2/platform_android.c   | 1 +
  2 files changed, 2 insertions(+)


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] i965: disable_throttling

2017-06-11 Thread Marathe, Yogesh
Hi,

> -Original Message-
> Hi,
> 
> On 09.06.2017 09:21, Marathe, Yogesh wrote:
> > I’m looking forward to set this brw->disable_throttling to true, I’m
> > actually observing adverse effect on performance benchmarks after I
> > force set that to true. What’s the expectation from disable_throttling
> > here?
> 
> By default Mesa throttles its buffer swaps so that compositor is able to 
> easier
> insert its own batches to better place in GPU batch queue.
> 
> If throttling is disabled, compositor needs to discard more frames as 
> obsolete,
> because benchmark is running too many frames forward of it.
> 
> Normally having no throttling means that:
> * composited performance looks worse to the user, because only some of the
> rendered frames end up on screen
> * benchmark runs faster because compositor composes less frames, i.e.
> takes less of the system bandwidth
> 

Thanks Eero. Essentially, you are saying having disable_trottling=true is
actually _not_ beneficial from compositor's perspective where GPU is not
matching speeds with that of benchmark. I will leave it to default then. 

> 
>  > BTW, this is on android.
> 
> On TDP constrained machines, power management can do funky things.  When
> in doubt, use fixed GPU & CPU speeds are re-test.
> 

Alright.

> 
> 
>   - Eero
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965: Call intel_prepare_render() from intel_update_state()

2017-06-11 Thread Tapani Pälli

This fixes also the crashes seen on Android;

Tested-by: Tapani Pälli 

On 06/10/2017 12:11 AM, Kenneth Graunke wrote:

The resolve code looks at the current color draw buffers.  These are not
valid until intel_prepare_render() is called.  You can end up with one
color buffer bound, but where the renderbuffer has zero width/height and
no miptree allocated.

You can get a call chain like: _mesa_Clear -> _mesa_update_state ->
intel_update_state, where no brw driver hooks were called, so there is
no other point at which we could have called this.

Fixes crashes in KWin where Clear was causing intel_disable_rb_aux_buffer
to crash on irb != NULL but irb->mt == NULL.

Tested-by: Tobias Klausmann 
Cc: Jason Ekstrand 
---
  src/mesa/drivers/dri/i965/brw_context.c | 2 ++
  1 file changed, 2 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
b/src/mesa/drivers/dri/i965/brw_context.c
index 6d27866fcea..5433f9080ee 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -201,6 +201,8 @@ intel_update_state(struct gl_context * ctx)
  
 _mesa_unlock_context_textures(ctx);
  
+   intel_prepare_render(brw);

+
 /* Resolve the depth buffer's HiZ buffer. */
 depth_irb = intel_get_renderbuffer(ctx->DrawBuffer, BUFFER_DEPTH);
 if (depth_irb && depth_irb->mt) {


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965: Call intel_prepare_render() from intel_update_state()

2017-06-11 Thread Jason Ekstrand
Ugh... Yeah, this is probably needed.  I've got another patch in my tree
(I'll probably send that series tomorrow) which moves all of the resolve
stuff back to brw_draw.c and not intel_update_state.

Reviewed-by: Jason Ekstrand 

On Fri, Jun 9, 2017 at 2:11 PM, Kenneth Graunke 
wrote:

> The resolve code looks at the current color draw buffers.  These are not
> valid until intel_prepare_render() is called.  You can end up with one
> color buffer bound, but where the renderbuffer has zero width/height and
> no miptree allocated.
>
> You can get a call chain like: _mesa_Clear -> _mesa_update_state ->
> intel_update_state, where no brw driver hooks were called, so there is
> no other point at which we could have called this.
>
> Fixes crashes in KWin where Clear was causing intel_disable_rb_aux_buffer
> to crash on irb != NULL but irb->mt == NULL.
>
> Tested-by: Tobias Klausmann 
> Cc: Jason Ekstrand 
> ---
>  src/mesa/drivers/dri/i965/brw_context.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_context.c
> b/src/mesa/drivers/dri/i965/brw_context.c
> index 6d27866fcea..5433f9080ee 100644
> --- a/src/mesa/drivers/dri/i965/brw_context.c
> +++ b/src/mesa/drivers/dri/i965/brw_context.c
> @@ -201,6 +201,8 @@ intel_update_state(struct gl_context * ctx)
>
> _mesa_unlock_context_textures(ctx);
>
> +   intel_prepare_render(brw);
> +
> /* Resolve the depth buffer's HiZ buffer. */
> depth_irb = intel_get_renderbuffer(ctx->DrawBuffer, BUFFER_DEPTH);
> if (depth_irb && depth_irb->mt) {
> --
> 2.13.0
>
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/2] RFC: radeon/compute: Limit allocations for VRAM-based chips to 3/4 VRAM

2017-06-11 Thread Michel Dänzer
On 10/06/17 12:43 AM, Aaron Watry wrote:
> On Wed, Jun 7, 2017 at 11:12 PM, Aaron Watry  wrote:
>> On Wed, Jun 7, 2017 at 9:15 PM, Michel Dänzer  wrote:
>>> On 08/06/17 03:42 AM, Marek Olšák wrote:
 On Wed, Jun 7, 2017 at 4:10 PM, Aaron Watry  wrote:
> On Mon, Jun 5, 2017 at 3:07 PM, Marek Olšák  wrote:
>>
>> Can you make the change in radeon_drm_winsys.c instead?
>
> Something like the following?
>
> diff --git a/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
> b/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
> index a485615ae4..44948f49ef 100644
> --- a/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
> +++ b/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
> @@ -365,6 +365,8 @@ static bool do_winsys_init(struct radeon_drm_winsys 
> *ws)
>  /* Radeon allocates all buffers as contigous, which makes large 
> allocations
>   * unlikely to succeed. */
>  ws->info.max_alloc_size = MAX2(ws->info.vram_size,
> ws->info.gart_size) * 0.7;
> +if (ws->info.has_dedicated_vram)
> +ws->info.max_alloc_size = MIN2(ws->info.vram_size * 0.7,
> ws->info.max_alloc_size);
>  if (ws->info.drm_minor < 40)
>  ws->info.max_alloc_size = MIN2(ws->info.max_alloc_size, 
> 256*1024*1024);

 Yes, feel free to push that.
>>>
>>> That also affects PIPE_CAP_MAX_TEXTURE_BUFFER_SIZE, is that intended?
>>
>> Not necessarily.
>>
>> Part of the reason that I had originally put this in
>> r600_pipe_common.c under the compute params was that I didn't feel
>> comfortable changing this for all workload types. There's evidence
>> that implies that the closed-source AMD CL runtime limits global
>> allocations to either 256MB or 1/4 VRAM (on a 1GB card), so 70% of the
>> max of GART/VRAM seems a bit high for us to report. I'll probably
>> check around a bit and see what the prevailing limits seem to be and
>> if lowering the absolute max might make sense here (for compute loads
>> only), as a failure to allocate the requested amount of memory seems
>> to result in system hangs shortly thereafter, and I'd like to get the
>> frequency of those occurrences down a bit.
> 
> At least in Windows 10 using the AMD binary CL runtime, it reports
> global memory size of 2GB and max allocation of 1GB for the 1GB card
> that I've got.  Whether that's being calculated as max allocation =
> VRAM-size, or 50% of global memory size is an unknown. I'm not sure if
> you can easily adjust the gart size in windows. So my original theory
> of 1/4 VRAM seems to be limited to other cards or older drivers/OSes.
> 
> Given that Marek/Nicolai want to stick this in radeon_drm_winsys.c,
> I'm ok with putting it there.  I think it still makes sense to limit
> the max allocation to a percentage of VRAM when the card has its own
> memory available for the reasons already mentioned by Nicolai. Whether
> 70% is a good number is another question, but one thing at a time.
> 
> Any objections Michel, or were you just raising the point that it
> affected the texture allocation sizes just to make sure we were aware?

Right, just wanted to make sure everybody's aware.


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] glsl/lower_distance: only set max_array_access for 1D clip dist arrays

2017-06-11 Thread Timothy Arceri

On 12/06/17 10:27, Dave Airlie wrote:

On 12 June 2017 at 10:04, Timothy Arceri  wrote:

On 12/06/17 09:47, Dave Airlie wrote:


From: Dave Airlie 

The max_array_access field applies to the first dimension, which means
we only want to set it for the 1D clip dist arrays.

This fixes an ir_validate assert seen with
KHR-GL44.cull_distance.functional
on nouveau and radeon with debug builds.

Signed-off-by: Dave Airlie 
---
   src/compiler/glsl/lower_distance.cpp | 2 +-
   1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/compiler/glsl/lower_distance.cpp
b/src/compiler/glsl/lower_distance.cpp
index 9858503..ff04e9a 100644
--- a/src/compiler/glsl/lower_distance.cpp
+++ b/src/compiler/glsl/lower_distance.cpp
@@ -167,7 +167,6 @@ lower_distance_visitor::visit(ir_variable *ir)
 /* Clone the old var so that we inherit all of its properties */
 *new_var = ir->clone(ralloc_parent(ir), NULL);
 (*new_var)->name = ralloc_strdup(*new_var, GLSL_CLIP_VAR_NAME);
-  (*new_var)->data.max_array_access = new_size - 1;
 (*new_var)->data.location = VARYING_SLOT_CLIP_DIST0;
   if (!ir->type->fields.array->is_array()) {
@@ -182,6 +181,7 @@ lower_distance_visitor::visit(ir_variable *ir)
 this->shader_stage == MESA_SHADER_GEOMETRY)));
  assert (ir->type->fields.array == glsl_type::float_type);
+ (*new_var)->data.max_array_access = new_size - 1;



Why do we need to set this at all? As far as I can tell the only user is
update_array_sizes() which creates a new array instance type, but we are
doing that below anyway.


I think we want to set it so validation passes, not sure any other reason.


Fair enough.

Reviewed-by: Timothy Arceri 



Dave.


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 4/7] util: Port nir_array functionality to u_dynarray

2017-06-11 Thread Kenneth Graunke
On Saturday, June 3, 2017 11:11:39 AM PDT Thomas Helland wrote:
[snip]
> @@ -87,10 +102,18 @@ util_dynarray_trim(struct util_dynarray *buf)
>  {
> if (buf->size != buf->capacity) {
>if (buf->size) {
> - buf->data = realloc(buf->data, buf->size);
> + if (buf->mem_ctx) {
> +reralloc_size(buf->mem_ctx, buf->data, buf->size);
> + } else {
> +buf->data = realloc(buf->data, buf->size);
> + }

Hi Thomas,

Would it make sense to do:

  buf->data = reralloc_size(...)

It's probably safe as is given that you're trying to trim down an
existing allocation to be smaller, but it seems safer to use the return
value nonetheless.  Coverity grumbled after you pushed the patch:

** CID 1412568:  Resource leaks  (RESOURCE_LEAK)
/src/util/u_dynarray.h: 110 in util_dynarray_trim()

__
*** CID 1412568:  Resource leaks  (RESOURCE_LEAK)
/src/util/u_dynarray.h: 110 in util_dynarray_trim()
104 static inline void
105 util_dynarray_trim(struct util_dynarray *buf)
106 {
107if (buf->size != buf->capacity) {
108   if (buf->size) {
109  if (buf->mem_ctx) {
>>> CID 1412568:  Resource leaks  (RESOURCE_LEAK)
>>> Ignoring storage allocated by "reralloc_size(buf->mem_ctx, buf->data, 
>>> buf->size)" leaks it.
110 reralloc_size(buf->mem_ctx, buf->data, buf->size);
111  } else {
112 buf->data = realloc(buf->data, buf->size);
113  }
114  buf->capacity = buf->size;
115   } else {



signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] glsl/lower_distance: only set max_array_access for 1D clip dist arrays

2017-06-11 Thread Dave Airlie
On 12 June 2017 at 10:04, Timothy Arceri  wrote:
> On 12/06/17 09:47, Dave Airlie wrote:
>>
>> From: Dave Airlie 
>>
>> The max_array_access field applies to the first dimension, which means
>> we only want to set it for the 1D clip dist arrays.
>>
>> This fixes an ir_validate assert seen with
>> KHR-GL44.cull_distance.functional
>> on nouveau and radeon with debug builds.
>>
>> Signed-off-by: Dave Airlie 
>> ---
>>   src/compiler/glsl/lower_distance.cpp | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/src/compiler/glsl/lower_distance.cpp
>> b/src/compiler/glsl/lower_distance.cpp
>> index 9858503..ff04e9a 100644
>> --- a/src/compiler/glsl/lower_distance.cpp
>> +++ b/src/compiler/glsl/lower_distance.cpp
>> @@ -167,7 +167,6 @@ lower_distance_visitor::visit(ir_variable *ir)
>> /* Clone the old var so that we inherit all of its properties */
>> *new_var = ir->clone(ralloc_parent(ir), NULL);
>> (*new_var)->name = ralloc_strdup(*new_var, GLSL_CLIP_VAR_NAME);
>> -  (*new_var)->data.max_array_access = new_size - 1;
>> (*new_var)->data.location = VARYING_SLOT_CLIP_DIST0;
>>   if (!ir->type->fields.array->is_array()) {
>> @@ -182,6 +181,7 @@ lower_distance_visitor::visit(ir_variable *ir)
>> this->shader_stage == MESA_SHADER_GEOMETRY)));
>>  assert (ir->type->fields.array == glsl_type::float_type);
>> + (*new_var)->data.max_array_access = new_size - 1;
>
>
> Why do we need to set this at all? As far as I can tell the only user is
> update_array_sizes() which creates a new array instance type, but we are
> doing that below anyway.

I think we want to set it so validation passes, not sure any other reason.

Dave.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] glsl/lower_distance: only set max_array_access for 1D clip dist arrays

2017-06-11 Thread Timothy Arceri

On 12/06/17 09:47, Dave Airlie wrote:

From: Dave Airlie 

The max_array_access field applies to the first dimension, which means
we only want to set it for the 1D clip dist arrays.

This fixes an ir_validate assert seen with
KHR-GL44.cull_distance.functional
on nouveau and radeon with debug builds.

Signed-off-by: Dave Airlie 
---
  src/compiler/glsl/lower_distance.cpp | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/compiler/glsl/lower_distance.cpp 
b/src/compiler/glsl/lower_distance.cpp
index 9858503..ff04e9a 100644
--- a/src/compiler/glsl/lower_distance.cpp
+++ b/src/compiler/glsl/lower_distance.cpp
@@ -167,7 +167,6 @@ lower_distance_visitor::visit(ir_variable *ir)
/* Clone the old var so that we inherit all of its properties */
*new_var = ir->clone(ralloc_parent(ir), NULL);
(*new_var)->name = ralloc_strdup(*new_var, GLSL_CLIP_VAR_NAME);
-  (*new_var)->data.max_array_access = new_size - 1;
(*new_var)->data.location = VARYING_SLOT_CLIP_DIST0;
  
if (!ir->type->fields.array->is_array()) {

@@ -182,6 +181,7 @@ lower_distance_visitor::visit(ir_variable *ir)
this->shader_stage == MESA_SHADER_GEOMETRY)));
  
   assert (ir->type->fields.array == glsl_type::float_type);

+ (*new_var)->data.max_array_access = new_size - 1;


Why do we need to set this at all? As far as I can tell the only user is 
update_array_sizes() which creates a new array instance type, but we are 
doing that below anyway.


The other thing it does is update the number of state slot via 
var->set_num_state_slots(). Is this something we need to do for 2D 
arrays also? I'm not sure.


  
   /* And change the properties that we need to change */

   (*new_var)->type = 
glsl_type::get_array_instance(glsl_type::vec4_type,


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] glsl/lower_distance: only set max_array_access for 1D clip dist arrays

2017-06-11 Thread Dave Airlie
From: Dave Airlie 

The max_array_access field applies to the first dimension, which means
we only want to set it for the 1D clip dist arrays.

This fixes an ir_validate assert seen with
KHR-GL44.cull_distance.functional
on nouveau and radeon with debug builds.

Signed-off-by: Dave Airlie 
---
 src/compiler/glsl/lower_distance.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/compiler/glsl/lower_distance.cpp 
b/src/compiler/glsl/lower_distance.cpp
index 9858503..ff04e9a 100644
--- a/src/compiler/glsl/lower_distance.cpp
+++ b/src/compiler/glsl/lower_distance.cpp
@@ -167,7 +167,6 @@ lower_distance_visitor::visit(ir_variable *ir)
   /* Clone the old var so that we inherit all of its properties */
   *new_var = ir->clone(ralloc_parent(ir), NULL);
   (*new_var)->name = ralloc_strdup(*new_var, GLSL_CLIP_VAR_NAME);
-  (*new_var)->data.max_array_access = new_size - 1;
   (*new_var)->data.location = VARYING_SLOT_CLIP_DIST0;
 
   if (!ir->type->fields.array->is_array()) {
@@ -182,6 +181,7 @@ lower_distance_visitor::visit(ir_variable *ir)
   this->shader_stage == MESA_SHADER_GEOMETRY)));
 
  assert (ir->type->fields.array == glsl_type::float_type);
+ (*new_var)->data.max_array_access = new_size - 1;
 
  /* And change the properties that we need to change */
  (*new_var)->type = glsl_type::get_array_instance(glsl_type::vec4_type,
-- 
2.9.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH mesa 2/2] egl/android: enable EGL_EXT_swap_buffers_with_damage

2017-06-11 Thread Eric Engestrom
Signed-off-by: Eric Engestrom 
---
 src/egl/drivers/dri2/platform_android.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/egl/drivers/dri2/platform_android.c 
b/src/egl/drivers/dri2/platform_android.c
index 7a73419d14..629ee593d0 100644
--- a/src/egl/drivers/dri2/platform_android.c
+++ b/src/egl/drivers/dri2/platform_android.c
@@ -1222,6 +1222,7 @@ dri2_initialize_android(_EGLDriver *drv, _EGLDisplay *dpy)
dpy->Extensions.EXT_buffer_age = EGL_TRUE;
 #if ANDROID_API_LEVEL >= 23
dpy->Extensions.KHR_partial_update = EGL_TRUE;
+   dpy->Extensions.EXT_swap_buffers_with_damage = EGL_TRUE;
 #endif
 
/* Fill vtbl last to prevent accidentally calling virtual function during
-- 
Cheers,
  Eric

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH mesa 0/2] egl: wire up swap_with_damage() and enable it on Android

2017-06-11 Thread Eric Engestrom
This is completely untested, as I don't have access to an Android
test device, which is the only platform that can benefit from this
improvement so far.

Can someone run dEQP-EGL.functional.swap_buffers_with_damage.*
on Android for me?

Cheers,
  Eric

---

Eric Engestrom (2):
  egl: improve dri2_fallback_swap_buffers_with_damage()
  egl/android: enable EGL_EXT_swap_buffers_with_damage

 src/egl/drivers/dri2/egl_dri2_fallbacks.h | 1 +
 src/egl/drivers/dri2/platform_android.c   | 1 +
 2 files changed, 2 insertions(+)

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH mesa 1/2] egl: improve dri2_fallback_swap_buffers_with_damage()

2017-06-11 Thread Eric Engestrom
Let's (try to) set damages before swapping buffers.

Signed-off-by: Eric Engestrom 
---
 src/egl/drivers/dri2/egl_dri2_fallbacks.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/egl/drivers/dri2/egl_dri2_fallbacks.h 
b/src/egl/drivers/dri2/egl_dri2_fallbacks.h
index d8363c9bdd..604db881a8 100644
--- a/src/egl/drivers/dri2/egl_dri2_fallbacks.h
+++ b/src/egl/drivers/dri2/egl_dri2_fallbacks.h
@@ -68,6 +68,7 @@ dri2_fallback_swap_buffers_with_damage(_EGLDriver *drv, 
_EGLDisplay *dpy,
   const EGLint *rects, EGLint n_rects)
 {
struct dri2_egl_display *dri2_dpy = dri2_egl_display(dpy);
+   dri2_dpy->vtbl->set_damage_region(drv, dpy, surf, rects, n_rects);
return dri2_dpy->vtbl->swap_buffers(drv, dpy, surf);
 }
 
-- 
Cheers,
  Eric

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 1/1] radeonsi: Use libdrm to get chipset name

2017-06-11 Thread Marek Olšák
On Sun, Jun 11, 2017 at 8:25 PM, Henri Verbeet  wrote:
> On 7 June 2017 at 21:54, Marek Olšák  wrote:
>> On Wed, Jun 7, 2017 at 2:07 AM, Marek Olšák  wrote:
>>> On Wed, Jun 7, 2017 at 12:21 AM, Samuel Li  wrote:
 @@ -790,6 +790,15 @@ static const char* r600_get_device_vendor(struct 
 pipe_screen* pscreen)

  static const char* r600_get_chip_name(struct r600_common_screen *rscreen)
  {
 +   const char *mname;
 +
 +   if (rscreen->ws->get_chip_name) {
 +   mname = rscreen->ws->get_chip_name(rscreen->ws);
 +   if (mname != NULL)
 +   return mname;
 +   }
 +
 +   /* fall back to family names*/
 switch (rscreen->info.family) {
 case CHIP_R600: return "AMD R600";
 case CHIP_RV610: return "AMD RV610";
>
> As someone downstream of this, I have to say I find the "family" names
> much more informative than whatever marketing came up with. More
> importantly however, this commit changes the GL_RENDERER string
> reported to applications, like Wine, for existing GPUs in an
> incompatible way. Since I suspect displaying the "marketing" name is
> important to at least some people at AMD, could I request please
> including the family name as well, as is done by for example lspci?

Yes, if you write the patch with the codename in the existing parentheses. :)

Marek
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [RFC 8/9] nv50/ir: disable mul+add to mad for precise instructions

2017-06-11 Thread Karol Herbst
fixes missrendering in TombRaider

Signed-off-by: Karol Herbst 
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
index 4c92a1efb5..85f3f44832 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
@@ -1669,6 +1669,10 @@ AlgebraicOpt::handleABS(Instruction *abs)
 bool
 AlgebraicOpt::handleADD(Instruction *add)
 {
+   // we can't optimize to SAD/MAD if the instruction is tagged as precise
+   if (add->precise)
+  return false;
+
Value *src0 = add->getSrc(0);
Value *src1 = add->getSrc(1);
 
@@ -1712,7 +1716,7 @@ AlgebraicOpt::tryADDToMADOrSAD(Instruction *add, 
operation toOp)
   return false;
 
if (src->getInsn()->saturate || src->getInsn()->postFactor ||
-   src->getInsn()->dnz)
+   src->getInsn()->dnz || src->getInsn()->precise)
   return false;
 
if (toOp == OP_SAD) {
-- 
2.13.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [RFC 9/9] nv50/ir/tgsi: split mad to mul+add

2017-06-11 Thread Karol Herbst
fixes
KHR-GL44.gpu_shader5.precise_qualifier
KHR-GL45.gpu_shader5.precise_qualifier

Signed-off-by: Karol Herbst 
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
index c633185893..cd45e82426 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
@@ -3184,6 +3184,20 @@ Converter::handleInstruction(const struct 
tgsi_full_instruction *insn)
   break;
case TGSI_OPCODE_MAD:
case TGSI_OPCODE_UMAD:
+  FOR_EACH_DST_ENABLED_CHANNEL(0, c, tgsi) {
+ val0 = getSSA();
+ src0 = fetchSrc(0, c);
+ src1 = fetchSrc(1, c);
+ src2 = fetchSrc(2, c);
+ geni = mkOp2(OP_MUL, dstTy, val0, src0, src1);
+ if (dstTy == TYPE_F32)
+geni->dnz = info->io.mul_zero_wins;
+ geni->precise = insn->Instruction.Precise;
+
+ geni = mkOp2(OP_ADD, dstTy, dst0[c], val0, src2);
+ geni->precise = insn->Instruction.Precise;
+  }
+  break;
case TGSI_OPCODE_SAD:
case TGSI_OPCODE_FMA:
   FOR_EACH_DST_ENABLED_CHANNEL(0, c, tgsi) {
-- 
2.13.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [RFC 6/9] nv50/ir: add precise field to Instruction

2017-06-11 Thread Karol Herbst
Signed-off-by: Karol Herbst 
---
 src/gallium/drivers/nouveau/codegen/nv50_ir.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir.h 
b/src/gallium/drivers/nouveau/codegen/nv50_ir.h
index 5c09fed05c..6835c4fa8c 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir.h
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir.h
@@ -884,6 +884,7 @@ public:
unsigned perPatch   : 1;
unsigned exit   : 1; // terminate program after insn
unsigned mask   : 4; // for vector ops
+   unsigned precise: 1; // prevent algebraic optimisations like mul+add to 
mad
 
int8_t postFactor; // MUL/DIV(if < 0) by 1 << postFactor
 
-- 
2.13.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [RFC 7/9] nv50/ir/tgsi: handle precise for most ALU instructions

2017-06-11 Thread Karol Herbst
Signed-off-by: Karol Herbst 
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
index 1264dd4834..c633185893 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
@@ -3179,6 +3179,7 @@ Converter::handleInstruction(const struct 
tgsi_full_instruction *insn)
  geni->subOp = tgsi::opcodeToSubOp(tgsi.getOpcode());
  if (op == OP_MUL && dstTy == TYPE_F32)
 geni->dnz = info->io.mul_zero_wins;
+ geni->precise = insn->Instruction.Precise;
   }
   break;
case TGSI_OPCODE_MAD:
@@ -3192,6 +3193,7 @@ Converter::handleInstruction(const struct 
tgsi_full_instruction *insn)
  geni = mkOp3(op, dstTy, dst0[c], src0, src1, src2);
  if (dstTy == TYPE_F32)
 geni->dnz = info->io.mul_zero_wins;
+ geni->precise = insn->Instruction.Precise;
   }
   break;
case TGSI_OPCODE_MOV:
-- 
2.13.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [RFC 2/9] tgsi/dump: print _PRECISE modifier on Instrutions

2017-06-11 Thread Karol Herbst
Signed-off-by: Karol Herbst 
---
 src/gallium/auxiliary/tgsi/tgsi_dump.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_dump.c 
b/src/gallium/auxiliary/tgsi/tgsi_dump.c
index f6eba7424b..b58e64511c 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_dump.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_dump.c
@@ -584,6 +584,10 @@ iter_instruction(
   TXT( "_SAT" );
}
 
+   if (inst->Instruction.Precise) {
+  TXT( "_PRECISE" );
+   }
+
for (i = 0; i < inst->Instruction.NumDstRegs; i++) {
   const struct tgsi_full_dst_register *dst = &inst->Dst[i];
 
-- 
2.13.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [RFC 4/9] tgsi: populate precise

2017-06-11 Thread Karol Herbst
Only implemented for glsl->tgsi. Other converters just set precise to 0.

Signed-off-by: Karol Herbst 
---
 src/gallium/auxiliary/tgsi/tgsi_build.c   |  3 +++
 src/gallium/auxiliary/tgsi/tgsi_ureg.c| 14 +++---
 src/gallium/auxiliary/tgsi/tgsi_ureg.h| 20 +++---
 src/gallium/auxiliary/util/u_simple_shaders.c |  2 +-
 src/gallium/state_trackers/nine/nine_shader.c |  6 ++---
 src/mesa/state_tracker/st_atifs_to_tgsi.c | 38 +--
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp| 12 -
 src/mesa/state_tracker/st_mesa_to_tgsi.c  |  8 +++---
 src/mesa/state_tracker/st_pbo.c   |  2 +-
 9 files changed, 65 insertions(+), 40 deletions(-)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_build.c 
b/src/gallium/auxiliary/tgsi/tgsi_build.c
index 55e4d064ed..144a017768 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_build.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_build.c
@@ -651,6 +651,7 @@ tgsi_default_instruction( void )
 static struct tgsi_instruction
 tgsi_build_instruction(unsigned opcode,
unsigned saturate,
+   unsigned precise,
unsigned num_dst_regs,
unsigned num_src_regs,
struct tgsi_header *header)
@@ -665,6 +666,7 @@ tgsi_build_instruction(unsigned opcode,
instruction = tgsi_default_instruction();
instruction.Opcode = opcode;
instruction.Saturate = saturate;
+   instruction.Precise = precise;
instruction.NumDstRegs = num_dst_regs;
instruction.NumSrcRegs = num_src_regs;
 
@@ -1061,6 +1063,7 @@ tgsi_build_full_instruction(
 
*instruction = tgsi_build_instruction(full_inst->Instruction.Opcode,
  full_inst->Instruction.Saturate,
+ full_inst->Instruction.Precise,
  full_inst->Instruction.NumDstRegs,
  full_inst->Instruction.NumSrcRegs,
  header);
diff --git a/src/gallium/auxiliary/tgsi/tgsi_ureg.c 
b/src/gallium/auxiliary/tgsi/tgsi_ureg.c
index 5bd779728a..56db2252c5 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_ureg.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_ureg.c
@@ -1213,6 +1213,7 @@ struct ureg_emit_insn_result
 ureg_emit_insn(struct ureg_program *ureg,
unsigned opcode,
boolean saturate,
+   unsigned precise,
unsigned num_dst,
unsigned num_src)
 {
@@ -1226,6 +1227,7 @@ ureg_emit_insn(struct ureg_program *ureg,
out[0].insn = tgsi_default_instruction();
out[0].insn.Opcode = opcode;
out[0].insn.Saturate = saturate;
+   out[0].insn.Precise = precise;
out[0].insn.NumDstRegs = num_dst;
out[0].insn.NumSrcRegs = num_src;
 
@@ -1354,7 +1356,8 @@ ureg_insn(struct ureg_program *ureg,
   const struct ureg_dst *dst,
   unsigned nr_dst,
   const struct ureg_src *src,
-  unsigned nr_src )
+  unsigned nr_src,
+  unsigned precise )
 {
struct ureg_emit_insn_result insn;
unsigned i;
@@ -1369,6 +1372,7 @@ ureg_insn(struct ureg_program *ureg,
insn = ureg_emit_insn(ureg,
  opcode,
  saturate,
+ precise,
  nr_dst,
  nr_src);
 
@@ -1391,7 +1395,8 @@ ureg_tex_insn(struct ureg_program *ureg,
   const struct tgsi_texture_offset *texoffsets,
   unsigned nr_offset,
   const struct ureg_src *src,
-  unsigned nr_src )
+  unsigned nr_src,
+  unsigned precise )
 {
struct ureg_emit_insn_result insn;
unsigned i;
@@ -1406,6 +1411,7 @@ ureg_tex_insn(struct ureg_program *ureg,
insn = ureg_emit_insn(ureg,
  opcode,
  saturate,
+ precise,
  nr_dst,
  nr_src);
 
@@ -1434,7 +1440,8 @@ ureg_memory_insn(struct ureg_program *ureg,
  unsigned nr_src,
  unsigned qualifier,
  unsigned texture,
- unsigned format)
+ unsigned format,
+ unsigned precise)
 {
struct ureg_emit_insn_result insn;
unsigned i;
@@ -1442,6 +1449,7 @@ ureg_memory_insn(struct ureg_program *ureg,
insn = ureg_emit_insn(ureg,
  opcode,
  FALSE,
+ precise,
  nr_dst,
  nr_src);
 
diff --git a/src/gallium/auxiliary/tgsi/tgsi_ureg.h 
b/src/gallium/auxiliary/tgsi/tgsi_ureg.h
index 54f95ba565..105c85abd5 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_ureg.h
+++ b/src/gallium/auxiliary/tgsi/tgsi_ureg.h
@@ -546,7 +546,8 @@ ureg_insn(struct ureg_program *ureg,
   const struct ureg_dst *dst,
   u

[Mesa-dev] [RFC 1/9] tgsi: add precise flag to tgsi_instruction

2017-06-11 Thread Karol Herbst
Signed-off-by: Karol Herbst 
---
 src/gallium/auxiliary/tgsi/tgsi_build.c| 1 +
 src/gallium/include/pipe/p_shader_tokens.h | 3 ++-
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_build.c 
b/src/gallium/auxiliary/tgsi/tgsi_build.c
index 00843241f8..55e4d064ed 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_build.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_build.c
@@ -642,6 +642,7 @@ tgsi_default_instruction( void )
instruction.Label = 0;
instruction.Texture = 0;
instruction.Memory = 0;
+   instruction.Precise = 0;
instruction.Padding = 0;
 
return instruction;
diff --git a/src/gallium/include/pipe/p_shader_tokens.h 
b/src/gallium/include/pipe/p_shader_tokens.h
index 1e08d97329..aa0fb3e3b3 100644
--- a/src/gallium/include/pipe/p_shader_tokens.h
+++ b/src/gallium/include/pipe/p_shader_tokens.h
@@ -638,7 +638,8 @@ struct tgsi_instruction
unsigned Label  : 1;
unsigned Texture: 1;
unsigned Memory : 1;
-   unsigned Padding: 2;
+   unsigned Precise: 1;
+   unsigned Padding: 1;
 };
 
 /*
-- 
2.13.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [RFC 3/9] st/glsl_to_tgsi: handle precise modifier

2017-06-11 Thread Karol Herbst
all subexpression inside an ir_assignment needs to be tagged as precise.

Signed-off-by: Karol Herbst 
---
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 80 --
 1 file changed, 65 insertions(+), 15 deletions(-)

diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index c5d2e0fcd2..19f90f21fe 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -87,6 +87,13 @@ static int swizzle_for_type(const glsl_type *type, int 
component = 0)
return swizzle;
 }
 
+static unsigned is_precise(const ir_variable *ir)
+{
+   if (!ir)
+  return 0;
+   return ir->data.precise || ir->data.invariant;
+}
+
 /**
  * This struct is a corresponding struct to TGSI ureg_src.
  */
@@ -296,6 +303,7 @@ public:
ir_instruction *ir;
 
unsigned op:8; /**< TGSI opcode */
+   unsigned precise:1;
unsigned saturate:1;
unsigned is_64bit_expanded:1;
unsigned sampler_base:5;
@@ -435,6 +443,7 @@ public:
bool have_fma;
bool use_shared_memory;
bool has_tex_txf_lz;
+   unsigned precise;
 
variable_storage *find_variable_storage(ir_variable *var);
 
@@ -505,13 +514,29 @@ public:
   st_src_reg src0 = undef_src,
   st_src_reg src1 = undef_src,
   st_src_reg src2 = undef_src,
-  st_src_reg src3 = undef_src);
+  st_src_reg src3 = undef_src,
+  unsigned precise = 0);
 
glsl_to_tgsi_instruction *emit_asm(ir_instruction *ir, unsigned op,
   st_dst_reg dst, st_dst_reg dst1,
   st_src_reg src0 = undef_src,
   st_src_reg src1 = undef_src,
   st_src_reg src2 = undef_src,
+  st_src_reg src3 = undef_src,
+  unsigned precise = 0);
+
+   glsl_to_tgsi_instruction *emit_asm(ir_expression *ir, unsigned op,
+  st_dst_reg dst = undef_dst,
+  st_src_reg src0 = undef_src,
+  st_src_reg src1 = undef_src,
+  st_src_reg src2 = undef_src,
+  st_src_reg src3 = undef_src);
+
+   glsl_to_tgsi_instruction *emit_asm(ir_expression *ir, unsigned op,
+  st_dst_reg dst, st_dst_reg dst1,
+  st_src_reg src0 = undef_src,
+  st_src_reg src1 = undef_src,
+  st_src_reg src2 = undef_src,
   st_src_reg src3 = undef_src);
 
unsigned get_opcode(unsigned op,
@@ -650,7 +675,8 @@ glsl_to_tgsi_instruction *
 glsl_to_tgsi_visitor::emit_asm(ir_instruction *ir, unsigned op,
st_dst_reg dst, st_dst_reg dst1,
st_src_reg src0, st_src_reg src1,
-   st_src_reg src2, st_src_reg src3)
+   st_src_reg src2, st_src_reg src3,
+   unsigned precise)
 {
glsl_to_tgsi_instruction *inst = new(mem_ctx) glsl_to_tgsi_instruction();
int num_reladdr = 0, i, j;
@@ -691,6 +717,7 @@ glsl_to_tgsi_visitor::emit_asm(ir_instruction *ir, unsigned 
op,
STATIC_ASSERT(TGSI_OPCODE_LAST <= 255);
 
inst->op = op;
+   inst->precise = precise;
inst->info = tgsi_get_opcode_info(op);
inst->dst[0] = dst;
inst->dst[1] = dst1;
@@ -881,9 +908,28 @@ glsl_to_tgsi_instruction *
 glsl_to_tgsi_visitor::emit_asm(ir_instruction *ir, unsigned op,
st_dst_reg dst,
st_src_reg src0, st_src_reg src1,
+   st_src_reg src2, st_src_reg src3,
+   unsigned precise)
+{
+   return emit_asm(ir, op, dst, undef_dst, src0, src1, src2, src3, precise);
+}
+
+glsl_to_tgsi_instruction *
+glsl_to_tgsi_visitor::emit_asm(ir_expression *ir, unsigned op,
+   st_dst_reg dst,
+   st_src_reg src0, st_src_reg src1,
+   st_src_reg src2, st_src_reg src3)
+{
+   return emit_asm(ir, op, dst, undef_dst, src0, src1, src2, src3, 
this->precise);
+}
+
+glsl_to_tgsi_instruction *
+glsl_to_tgsi_visitor::emit_asm(ir_expression *ir, unsigned op,
+   st_dst_reg dst, st_dst_reg dst1,
+   st_src_reg src0, st_src_reg src1,
st_src_reg src2, st_src_reg src3)
 {
-   return emit_asm(ir, op, dst, undef_dst, src0, src1, src2, src3);
+   return emit_asm(ir, op, dst, dst1, src0, src1, src2, sr

[Mesa-dev] [RFC 5/9] tgsi/text: parse _PRECISE modifier

2017-06-11 Thread Karol Herbst
Signed-off-by: Karol Herbst 
---
 src/gallium/auxiliary/tgsi/tgsi_text.c | 15 +--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_text.c 
b/src/gallium/auxiliary/tgsi/tgsi_text.c
index 93a05568f4..c5fcb3283d 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_text.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_text.c
@@ -999,6 +999,7 @@ parse_texoffset_operand(
 static boolean
 match_inst(const char **pcur,
unsigned *saturate,
+   unsigned *precise,
const struct tgsi_opcode_info *info)
 {
const char *cur = *pcur;
@@ -1007,6 +1008,7 @@ match_inst(const char **pcur,
if (str_match_nocase_whole(&cur, info->mnemonic)) {
   *pcur = cur;
   *saturate = 0;
+  *precise = 0;
   return TRUE;
}
 
@@ -1015,8 +1017,15 @@ match_inst(const char **pcur,
   if (str_match_nocase_whole(&cur, "_SAT")) {
  *pcur = cur;
  *saturate = 1;
- return TRUE;
   }
+
+  if (str_match_nocase_whole(&cur, "_PRECISE")) {
+ *pcur = cur;
+ *precise = 1;
+  }
+
+  if (*precise || *saturate)
+ return TRUE;
}
 
return FALSE;
@@ -1029,6 +1038,7 @@ parse_instruction(
 {
uint i;
uint saturate = 0;
+   uint precise = 0;
const struct tgsi_opcode_info *info;
struct tgsi_full_instruction inst;
const char *cur;
@@ -1043,7 +1053,7 @@ parse_instruction(
   cur = ctx->cur;
 
   info = tgsi_get_opcode_info( i );
-  if (match_inst(&cur, &saturate, info)) {
+  if (match_inst(&cur, &saturate, &precise, info)) {
  if (info->num_dst + info->num_src + info->is_tex == 0) {
 ctx->cur = cur;
 break;
@@ -1064,6 +1074,7 @@ parse_instruction(
 
inst.Instruction.Opcode = i;
inst.Instruction.Saturate = saturate;
+   inst.Instruction.Precise = precise;
inst.Instruction.NumDstRegs = info->num_dst;
inst.Instruction.NumSrcRegs = info->num_src;
 
-- 
2.13.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [RFC 0/9] Add precise/invariant semantics to TGSI

2017-06-11 Thread Karol Herbst
Running Tomb Raider on Nouveau I found some flicker caused by ignoring precise
modifiers on variables inside Nouveau.

This series add precise/invariant handling to TGSI, which can be then used by
drivers to disable certain unsafe optimisations which may otherwise alter
calculations, which depend on having the same result across shaders.

This series fixes this bug in Tomb Raider and one CTS test for 4.4 and 4.5

Note on Patch 3: I really dislike how I tell glsl_to_tgsi_visitor to apply the
precise flag on instruction emited in ir_assignment->rhs->accept(); but I found
no other easy way to handle this. Maybe somebody of you has a better idea?

Karol Herbst (9):
  tgsi: add precise flag to tgsi_instruction
  tgsi/dump: print _PRECISE modifier on Instrutions
  st/glsl_to_tgsi: handle precise modifier
  tgsi: populate precise
  tgsi/text: parse _PRECISE modifier
  nv50/ir: add precise field to Instruction
  nv50/ir/tgsi: handle precise for most ALU instructions
  nv50/ir: disable mul+add to mad for precise instructions
  nv50/ir/tgsi: split mad to mul+add

 src/gallium/auxiliary/tgsi/tgsi_build.c|  4 +
 src/gallium/auxiliary/tgsi/tgsi_dump.c |  4 +
 src/gallium/auxiliary/tgsi/tgsi_text.c | 15 +++-
 src/gallium/auxiliary/tgsi/tgsi_ureg.c | 14 +++-
 src/gallium/auxiliary/tgsi/tgsi_ureg.h | 20 -
 src/gallium/auxiliary/util/u_simple_shaders.c  |  2 +-
 src/gallium/drivers/nouveau/codegen/nv50_ir.h  |  1 +
 .../drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp  | 16 
 .../drivers/nouveau/codegen/nv50_ir_peephole.cpp   |  6 +-
 src/gallium/include/pipe/p_shader_tokens.h |  3 +-
 src/gallium/state_trackers/nine/nine_shader.c  |  6 +-
 src/mesa/state_tracker/st_atifs_to_tgsi.c  | 38 -
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 92 +-
 src/mesa/state_tracker/st_mesa_to_tgsi.c   |  8 +-
 src/mesa/state_tracker/st_pbo.c|  2 +-
 15 files changed, 172 insertions(+), 59 deletions(-)

-- 
2.13.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 1/1] radeonsi: Use libdrm to get chipset name

2017-06-11 Thread Henri Verbeet
On 7 June 2017 at 21:54, Marek Olšák  wrote:
> On Wed, Jun 7, 2017 at 2:07 AM, Marek Olšák  wrote:
>> On Wed, Jun 7, 2017 at 12:21 AM, Samuel Li  wrote:
>>> @@ -790,6 +790,15 @@ static const char* r600_get_device_vendor(struct 
>>> pipe_screen* pscreen)
>>>
>>>  static const char* r600_get_chip_name(struct r600_common_screen *rscreen)
>>>  {
>>> +   const char *mname;
>>> +
>>> +   if (rscreen->ws->get_chip_name) {
>>> +   mname = rscreen->ws->get_chip_name(rscreen->ws);
>>> +   if (mname != NULL)
>>> +   return mname;
>>> +   }
>>> +
>>> +   /* fall back to family names*/
>>> switch (rscreen->info.family) {
>>> case CHIP_R600: return "AMD R600";
>>> case CHIP_RV610: return "AMD RV610";

As someone downstream of this, I have to say I find the "family" names
much more informative than whatever marketing came up with. More
importantly however, this commit changes the GL_RENDERER string
reported to applications, like Wine, for existing GPUs in an
incompatible way. Since I suspect displaying the "marketing" name is
important to at least some people at AMD, could I request please
including the family name as well, as is done by for example lspci?
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 0/3] [RFC] mesa/st: glsl_to_tgsi: improved temp-reg lifetime estimation

2017-06-11 Thread Gert Wollny
Hello Marek, 

thanks for chiming in. 

Am Sonntag, den 11.06.2017, 16:15 +0200 schrieb Marek Olšák:
> Also, I don't know if people will like that it uses STL. I personally
> have no issue with that as long as it doesn't break apps (e.g. the
> STL shipped with apps should be the same as the STL shipped with the
> distribution).

Well, on Linux I would take it for granted that the STL used to run the
code is the same like the one the code was compiled with, and there are
already quite some places in the mesa code where STL constructs are
used (if that wounld't  have been the case, then I would tried to avoid
the STL). I am actually more concerned that propagating the  C++11
requirement to the whole  of src/mesa might not be welcomed (although
everything compiles and runs fine).


> On Sun, Jun 11, 2017 at 4:12 PM, Marek Olšák 
> wrote:
> > Hi Gert,
> > 
> > Have you measured the CPU overhead of the new code?

So far no, I guess one would do that with the shader-db to get
reasonable complex shaders, but I only have a r600 based card so I'm
not sure whether I can run this. In any case, tomorrow I will take a
look into this. 

Best, 
Gert 

> > 
> > Marek
> > 
> > On Sat, Jun 10, 2017 at 1:15 AM, Gert Wollny 
> > wrote:
> > > Dear all,
> > > 
> > > as I wrote before, I was looking into the temporary register
> > > renaming.
> > > 
> > > This series of patches implements a new approach that achieves a
> > > tigher
> > > estimation of the life time of the temporaries, and as a result
> > > the Piano
> > > and Voloplosion benchmarks implemented in gputest [1] now work.
> > > Before
> > > they failed with "r600_pipe_shader_create - translation from TGSI
> > > failed!"
> > > 
> > > Piglit shows 7 fixes and 6 regressions compared to git 8fac894f,
> > > but they don't
> > > seem to be related to shaders. I've also tested other programs
> > > like the unignie-*
> > > benchmarks and they didn't show regressions.
> > > 
> > > I think that the patch will need a few more iterations to remove
> > > code duplication
> > > and generally adhere to the mesa style, but I think it is atthe
> > > point where I could
> > > need a bit of feedback to get it into shape to be acceptable, and
> > > I'd also like to
> > > mention that since I'm new to mesa this I have no commit rights.
> > > 
> > > many thanks,
> > > Gert
> > > 
> > > [1] http://www.geeks3d.com/gputest/
> > > 
> > > Gert Wollny (3):
> > >   mesa/st: glsl_to_tgsi move some helper classes to extra files
> > >   mesa/st: glsl_to_tgsi Implement a new lifetime tracker for
> > > temporaries
> > >   mesa/st: glsl_to_tgsi: tie in the new register renaming
> > > approach
> > > 
> > >  configure.ac   |   1 +
> > >  src/mesa/Makefile.am   |   4 +-
> > >  src/mesa/Makefile.sources  |   4 +
> > >  src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 302 +
> > > ---
> > >  src/mesa/state_tracker/st_glsl_to_tgsi_private.cpp | 241 +++
> > >  src/mesa/state_tracker/st_glsl_to_tgsi_private.h   | 135 
> > >  .../state_tracker/st_glsl_to_tgsi_temprename.cpp   | 551
> > > ++
> > >  .../state_tracker/st_glsl_to_tgsi_temprename.h | 114 +++
> > >  src/mesa/state_tracker/tests/Makefile.am   |  40 ++
> > >  src/mesa/state_tracker/tests/st-renumerate-test| 210 ++
> > >  .../tests/test_glsl_to_tgsi_lifetime.cpp   | 789
> > > +
> > >  11 files changed, 2104 insertions(+), 287 deletions(-)
> > >  create mode 100644
> > > src/mesa/state_tracker/st_glsl_to_tgsi_private.cpp
> > >  create mode 100644
> > > src/mesa/state_tracker/st_glsl_to_tgsi_private.h
> > >  create mode 100644
> > > src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
> > >  create mode 100644
> > > src/mesa/state_tracker/st_glsl_to_tgsi_temprename.h
> > >  create mode 100644 src/mesa/state_tracker/tests/Makefile.am
> > >  create mode 100755 src/mesa/state_tracker/tests/st-renumerate-
> > > test
> > >  create mode 100644
> > > src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp
> > > 
> > > --
> > > 2.13.0
> > > 
> > > ___
> > > mesa-dev mailing list
> > > mesa-dev@lists.freedesktop.org
> > > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v3 0/3] Fix missing initializer errors in generated tables

2017-06-11 Thread Benedikt Schemmer
This series aims to fix hundreds of missing initializer warnings in generated 
header files
when compiling with -Wextra

V1: Fix the old fashioned way by adding 0s where needed
V2: switch to designated initializers (Emil), didnt send
V3: add some layout so its easier to read and create a new version for vk that
just uses {0} instead of {0, 0, 0, 0, 0} (same thing, less zeros)

---

Because this generated such unexpected controversy and out of curiosity i wrote 
a little
test program to show the problem. Compile with -Wextra.
There is no init code generated for any variant on any compiler i tested.

gcc version 5.4.1 20170304 (Ubuntu 5.4.1-8ubuntu1)
gcc version 6.3.0 20170406 (Ubuntu 6.3.0-12ubuntu2)
gcc version 7.0.1 20170407 (experimental) [trunk revision 246759] (Ubuntu 
7-20170407-0ubuntu2)
clang version 4.0.0-1ubuntu1 (tags/RELEASE_400/rc1)
clang version 5.0.0-svn305158-0~z~padoka0 (trunk)
MSVC Compiler Version 19.00.24210

---

struct s {
   int a;
   int b;
   int c;
   int d;
   int e;
};

static const struct s str1 = {};   // gcc 5/6/7, clang 4/5 accept this 
without warning, MSVC2013 doesnt compile
(not Standard, will generate a warning with -Wpendantic, much prettier however)
static const struct s str2 = {0};  // clang 4/5 generate a warning here 
(although ANSI Standard)
static const struct s str3 = {0,}; // clang 4/5 generate a warning here 
(although ANSI Standard)
static const struct s str4 = {1,2,3};  // gcc 5/6/7, clang 4/5 generate a 
warning here
static const struct s str5 = {1,2,3,}; // gcc 5/6/7, clang 4/5 generate a 
warning here
static const struct s str6 = {1,2,3,0,0};  // this is fine with all compilers
static const struct s str7 = {.a = 1, .b = 2, .c = 3}; // this might not 
compile on MSVC <2013 but couldnt test

int main() {
   return 0;
}

---

This is what Rust does:

---

#![allow(unused_variables)]
#![allow(dead_code)]

#[derive(Default)]
struct Test {
   a: i32,
   b: i32,
   c: i32,
   d: i32,
   e: i32
}

fn main() {
//all of these wont work
//let t1 = Test {};
//let t2 = Test {0};
//let t3 = Test {0,};
//let t3b= Test {..}; would be very cool
//let t4 = Test {1,2,3};
//let t5 = Test {1,2,3,};
//let t6 = Test {1,2,3,0,0};
//let t7 = Test {a: 1, b: 1};
//let t8 = Test {a: 1, b: 1, ..}; would be cool

//only this is legal in rust
let t9: Test = Default::default();
let t10 = Test {..Default::default()};
let t11 = Test {a: 1, b: 1, ..Default::default()};
let t12 = Test {a: 1, b: 1, c: 1, d: 1, e: 1};

println!("Hello, world!");
}

---

So in the end I followed Emils suggestion of designated initializers for 
partial initialization,
to make it explicit and get rid of the warnings.

Please kindly review and push.

Thanks,
Benedikt

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v3 2/3] Fix missing initializer warning in egd_tables.h by adding appropriate default fields in egd_tables.py

2017-06-11 Thread Benedikt Schemmer
Fix missing initializer warning in egd_tables.h by adding appropriate 
designated initializers in egd_tables.py

---
 src/gallium/drivers/r600/egd_tables.py | 15 ++-
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/src/gallium/drivers/r600/egd_tables.py 
b/src/gallium/drivers/r600/egd_tables.py
index 4c606025ba..289981ae18 100644
--- a/src/gallium/drivers/r600/egd_tables.py
+++ b/src/gallium/drivers/r600/egd_tables.py
@@ -266,11 +266,13 @@ struct eg_packet3 {
 while value[1] >= len(values_offsets):
 values_offsets.append(-1)
 values_offsets[value[1]] = 
strings.add(strip_prefix(value[0]))
-print '\t{%s, %s(~0u), %s, %s},' % (
-strings.add(field.name), field.s_name,
+print '\t{.name_offset\t= %s,\r\n\t .mask\t\t\t= %s(~0u), \
+   \r\n\t .num_values\t= %s,\r\n\t .values_offset\t= 
%s},' \
+   % (strings.add(field.name), field.s_name,
 len(values_offsets), 
strings_offsets.add(values_offsets))
 else:
-print '\t{%s, %s(~0u)},' % (strings.add(field.name), 
field.s_name)
+print '\t{.name_offset\t= %s,\r\n\t .mask\t\t\t= 
%s(~0u)},' \
+   % (strings.add(field.name), field.s_name)
 fields_idx += 1

 print '};'
@@ -279,10 +281,13 @@ struct eg_packet3 {
 print 'static const struct eg_reg egd_reg_table[] = {'
 for reg in regs:
 if len(reg.fields):
-print '\t{%s, %s, %s, %s},' % (strings.add(reg.name), reg.r_name,
+print '\t{.name_offset\t= %s,\r\n\t .offset\t\t= %s, \
+   \r\n\t .num_fields\t= %s,\r\n\t .fields_offset\t= %s},' \
+   % (strings.add(reg.name), reg.r_name,
 len(reg.fields), reg.fields_idx if reg.own_fields else 
reg.fields_owner.fields_idx)
 else:
-print '\t{%s, %s},' % (strings.add(reg.name), reg.r_name)
+print '\t{.name_offset\t= %s,\r\n\t .offset\t\t= %s},' \
+   % (strings.add(reg.name), reg.r_name)
 print '};'
 print

-- 
2.11.0


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v3 3/3] Fix missing initializer warning in vk_format_table.h by, adding appropriate default fields in vk_format_table.py

2017-06-11 Thread Benedikt Schemmer
Fix missing initializer warning in vk_format_table.h by changing to a default 
initializer in vk_format_table.py
and correct the autogenerated from message

---
 src/amd/vulkan/vk_format_table.py | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/amd/vulkan/vk_format_table.py 
b/src/amd/vulkan/vk_format_table.py
index 36352b108d..139bb9544c 100644
--- a/src/amd/vulkan/vk_format_table.py
+++ b/src/amd/vulkan/vk_format_table.py
@@ -86,7 +86,7 @@ def print_channels(format, func):
 print '#endif'

 def write_format_table(formats):
-print '/* This file is autogenerated by u_format_table.py from 
u_format.csv. Do not edit directly. */'
+print '/* This file is autogenerated by vk_format_table.py from 
vk_format_layout.csv. Do not edit directly. */'
 print
 # This will print the copyright message on the top of this file
 print CopyRight.strip()
@@ -106,7 +106,7 @@ def write_format_table(formats):
 if channel.size:
 print "  {%s, %s, %s, %s, %u, %u}%s\t/* %s = %s */" % 
(type_map[channel.type],
bool_map(channel.norm), bool_map(channel.pure), bool_map(channel.scaled), 
channel.size, channel.shift, sep, "xyzw"[i],
channel.name)
 else:
-print "  {0, 0, 0, 0, 0}%s" % (sep,)
+print "  {0}%s" % (sep,)
 print "   },"

 def do_swizzle_array(channels, swizzles):
-- 
2.11.0
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v3 1/3] Fix missing initializer warning in sid_tables.h by adding appropriate default fields in sid_tables.py

2017-06-11 Thread Benedikt Schemmer
Fix missing initializer warning in sid_tables.h by adding appropriate 
designated initializers in sid_tables.py

---
 src/amd/common/sid_tables.py | 15 ++-
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/src/amd/common/sid_tables.py b/src/amd/common/sid_tables.py
index fd88d3c9d5..691d766b08 100644
--- a/src/amd/common/sid_tables.py
+++ b/src/amd/common/sid_tables.py
@@ -266,11 +266,13 @@ struct si_packet3 {
 while value[1] >= len(values_offsets):
 values_offsets.append(-1)
 values_offsets[value[1]] = 
strings.add(strip_prefix(value[0]))
-print '\t{%s, %s(~0u), %s, %s},' % (
-strings.add(field.name), field.s_name,
+print '\t{.name_offset\t= %s,\r\n\t .mask\t\t\t= %s(~0u), \
+   \r\n\t .num_values\t= %s,\r\n\t .values_offset\t= 
%s},' \
+   % (strings.add(field.name), field.s_name,
 len(values_offsets), 
strings_offsets.add(values_offsets))
 else:
-print '\t{%s, %s(~0u)},' % (strings.add(field.name), 
field.s_name)
+print '\t{.name_offset\t= %s,\r\n\t .mask\t\t\t= 
%s(~0u)},' \
+   % (strings.add(field.name), field.s_name)
 fields_idx += 1

 print '};'
@@ -279,10 +281,13 @@ struct si_packet3 {
 print 'static const struct si_reg sid_reg_table[] = {'
 for reg in regs:
 if len(reg.fields):
-print '\t{%s, %s, %s, %s},' % (strings.add(reg.name), reg.r_name,
+print '\t{.name_offset\t= %s,\r\n\t .offset\t\t= %s, \
+   \r\n\t .num_fields\t= %s,\r\n\t .fields_offset\t= %s},' \
+   % (strings.add(reg.name), reg.r_name,
 len(reg.fields), reg.fields_idx if reg.own_fields else 
reg.fields_owner.fields_idx)
 else:
-print '\t{%s, %s},' % (strings.add(reg.name), reg.r_name)
+print '\t{.name_offset\t= %s,\r\n\t .offset\t\t= %s},' \
+   % (strings.add(reg.name), reg.r_name)
 print '};'
 print

-- 
2.11.0


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 101334] Any vulkan app seems to freeze the system

2017-06-11 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=101334

--- Comment #15 from John  ---
Created attachment 131879
  --> https://bugs.freedesktop.org/attachment.cgi?id=131879&action=edit
trace

ooops

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 101334] Any vulkan app seems to freeze the system

2017-06-11 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=101334

--- Comment #14 from Grazvydas Ignotas  ---
Looks like you attached the wrong file.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 101334] Any vulkan app seems to freeze the system

2017-06-11 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=101334

--- Comment #13 from John  ---
Created attachment 131878
  --> https://bugs.freedesktop.org/attachment.cgi?id=131878&action=edit
trace

The ML patch worked!

Here's the trace.

Thank you!

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [RFC PATCH] st/mesa: skip texture validation logic when nothing has changed

2017-06-11 Thread Marek Olšák
If it's copied from i965, it must be correct, right? ;) It probably is.

Reviewed-by: Marek Olšák 

Marek

On Sat, Jun 10, 2017 at 6:52 AM, Timothy Arceri  wrote:
> Based on the same logic in the i965 driver 2f225f61451abd51 and
> 16060c5adcd4.
>
> perf reports st_finalize_texture() going from 0.60% -> 0.16% with
> this change when running the Xonotic benchmark from PTS.
> ---
>
>  A full run of piglit on radeonsi produced no regressions. No other drivers
>  have been tested.
>
>  src/mesa/state_tracker/st_cb_texture.c | 28 
>  src/mesa/state_tracker/st_manager.c|  2 ++
>  src/mesa/state_tracker/st_texture.h|  9 +
>  3 files changed, 39 insertions(+)
>
> diff --git a/src/mesa/state_tracker/st_cb_texture.c 
> b/src/mesa/state_tracker/st_cb_texture.c
> index 99c59f7..443bb7b 100644
> --- a/src/mesa/state_tracker/st_cb_texture.c
> +++ b/src/mesa/state_tracker/st_cb_texture.c
> @@ -147,20 +147,22 @@ st_DeleteTextureImage(struct gl_context * ctx, struct 
> gl_texture_image *img)
>
>  /** called via ctx->Driver.NewTextureObject() */
>  static struct gl_texture_object *
>  st_NewTextureObject(struct gl_context * ctx, GLuint name, GLenum target)
>  {
> struct st_texture_object *obj = ST_CALLOC_STRUCT(st_texture_object);
>
> DBG("%s\n", __func__);
> _mesa_initialize_texture_object(ctx, &obj->base, name, target);
>
> +   obj->needs_validation = true;
> +
> return &obj->base;
>  }
>
>  /** called via ctx->Driver.DeleteTextureObject() */
>  static void
>  st_DeleteTextureObject(struct gl_context *ctx,
> struct gl_texture_object *texObj)
>  {
> struct st_context *st = st_context(ctx);
> struct st_texture_object *stObj = st_texture_object(texObj);
> @@ -599,20 +601,22 @@ st_AllocTextureImageBuffer(struct gl_context *ctx,
> struct st_texture_object *stObj = st_texture_object(texImage->TexObject);
> const GLuint level = texImage->Level;
> GLuint width = texImage->Width;
> GLuint height = texImage->Height;
> GLuint depth = texImage->Depth;
>
> DBG("%s\n", __func__);
>
> assert(!stImage->pt); /* xxx this might be wrong */
>
> +   stObj->needs_validation = true;
> +
> etc_fallback_allocate(st, stImage);
>
> /* Look if the parent texture object has space for this image */
> if (stObj->pt &&
> level <= stObj->pt->last_level &&
> st_texture_match_image(st, stObj->pt, texImage)) {
>/* this image will fit in the existing texture object's memory */
>pipe_resource_reference(&stImage->pt, stObj->pt);
>return GL_TRUE;
> }
> @@ -2478,20 +2482,30 @@ st_finalize_texture(struct gl_context *ctx,
>   pipe_resource_reference(&stObj->pt, st_obj->buffer);
>   st_texture_release_all_sampler_views(st, stObj);
>}
>return GL_TRUE;
>
> }
>
> firstImage = 
> st_texture_image_const(stObj->base.Image[cubeMapFace][stObj->base.BaseLevel]);
> assert(firstImage);
>
> +   /* Skip the loop over images in the common case of no images having
> +* changed.  But if the GL_BASE_LEVEL or GL_MAX_LEVEL change to something 
> we
> +* haven't looked at, then we do need to look at those new images.
> +*/
> +   if (!stObj->needs_validation &&
> +   stObj->base.BaseLevel >= stObj->validated_first_level &&
> +   stObj->lastLevel <= stObj->validated_last_level) {
> +  return GL_TRUE;
> +   }
> +
> /* If both firstImage and stObj point to a texture which can contain
>  * all active images, favour firstImage.  Note that because of the
>  * completeness requirement, we know that the image dimensions
>  * will match.
>  */
> if (firstImage->pt &&
> firstImage->pt != stObj->pt &&
> (!stObj->pt || firstImage->pt->last_level >= stObj->pt->last_level)) {
>pipe_resource_reference(&stObj->pt, firstImage->pt);
>st_texture_release_all_sampler_views(st, stObj);
> @@ -2624,20 +2638,24 @@ st_finalize_texture(struct gl_context *ctx,
>  (stImage->base.Width == u_minify(ptWidth, level) &&
>   stImage->base.Height == height &&
>   stImage->base.Depth == depth)) {
> /* src image fits expected dest mipmap level size */
> copy_image_data_to_texture(st, stObj, level, stImage);
>  }
>   }
>}
> }
>
> +   stObj->validated_first_level = stObj->base.BaseLevel;
> +   stObj->validated_last_level = stObj->lastLevel;
> +   stObj->needs_validation = false;
> +
> return GL_TRUE;
>  }
>
>
>  /**
>   * Called via ctx->Driver.AllocTextureStorage() to allocate texture memory
>   * for a whole mipmap stack.
>   */
>  static GLboolean
>  st_AllocTextureStorage(struct gl_context *ctx,
> @@ -2705,20 +2723,25 @@ st_AllocTextureStorage(struct gl_context *ctx,
>GLuint face;
>for (face = 0; face < numFaces; face++) {
>   struct st_texture_image *stImage =
>  st_texture_ima

Re: [Mesa-dev] [PATCH] ac: Use mov_dpp for derivatives.

2017-06-11 Thread Marek Olšák
Hi Bas,

Have you tested piglit on radeonsi with this?

Marek

On Sat, Jun 10, 2017 at 10:05 PM, Bas Nieuwenhuizen
 wrote:
> Slightly faster than bpermute, and seems supported since at least
> LLVM 3.9.
>
> v2: Since this supersedes bpermute, remove the bpermute code.
> Signed-off-by: Bas Nieuwenhuizen 
> ---
>  src/amd/common/ac_llvm_build.c   | 47 
> 
>  src/amd/common/ac_llvm_build.h   |  2 +-
>  src/amd/common/ac_nir_to_llvm.c  |  8 +++---
>  src/gallium/drivers/radeonsi/si_pipe.c   |  2 +-
>  src/gallium/drivers/radeonsi/si_pipe.h   |  2 +-
>  src/gallium/drivers/radeonsi/si_shader.c |  4 +--
>  6 files changed, 38 insertions(+), 27 deletions(-)
>
> diff --git a/src/amd/common/ac_llvm_build.c b/src/amd/common/ac_llvm_build.c
> index 237e9291d41..99d41bf52d6 100644
> --- a/src/amd/common/ac_llvm_build.c
> +++ b/src/amd/common/ac_llvm_build.c
> @@ -783,41 +783,52 @@ ac_get_thread_id(struct ac_llvm_context *ctx)
>   */
>  LLVMValueRef
>  ac_build_ddxy(struct ac_llvm_context *ctx,
> - bool has_ds_bpermute,
> + bool has_mov_dpp,
>   uint32_t mask,
>   int idx,
>   LLVMValueRef lds,
>   LLVMValueRef val)
>  {
> -   LLVMValueRef thread_id, tl, trbl, tl_tid, trbl_tid, args[2];
> +   LLVMValueRef thread_id, tl, trbl, args[5];
> LLVMValueRef result;
>
> -   thread_id = ac_get_thread_id(ctx);
> +   if (has_mov_dpp) {
> +   uint32_t tl_ctrl = 0, trbl_ctrl = 0;
>
> -   tl_tid = LLVMBuildAnd(ctx->builder, thread_id,
> - LLVMConstInt(ctx->i32, mask, false), "");
> -
> -   trbl_tid = LLVMBuildAdd(ctx->builder, tl_tid,
> -   LLVMConstInt(ctx->i32, idx, false), "");
> +   for (unsigned i = 0; i < 4; ++i) {
> +   tl_ctrl |= (i & mask) << (2 * i);
> +   trbl_ctrl |= ((i & mask) + idx) << (2 * i);
> +   }
>
> -   if (has_ds_bpermute) {
> -   args[0] = LLVMBuildMul(ctx->builder, tl_tid,
> -  LLVMConstInt(ctx->i32, 4, false), "");
> -   args[1] = val;
> +   args[0] = val;
> +   args[1] = LLVMConstInt(ctx->i32, tl_ctrl, false);
> +   args[2] = LLVMConstInt(ctx->i32, 0xf, false);
> +   args[3] = LLVMConstInt(ctx->i32, 0xf, false);
> +   args[4] = LLVMConstInt(ctx->i1, 1, false);
> tl = ac_build_intrinsic(ctx,
> -   "llvm.amdgcn.ds.bpermute", ctx->i32,
> -   args, 2,
> +   "llvm.amdgcn.mov.dpp.i32", ctx->i32,
> +   args, 5,
> AC_FUNC_ATTR_READNONE |
> AC_FUNC_ATTR_CONVERGENT);
>
> -   args[0] = LLVMBuildMul(ctx->builder, trbl_tid,
> -  LLVMConstInt(ctx->i32, 4, false), "");
> +   args[1] = LLVMConstInt(ctx->i32, trbl_ctrl, false);
> trbl = ac_build_intrinsic(ctx,
> - "llvm.amdgcn.ds.bpermute", ctx->i32,
> - args, 2,
> + "llvm.amdgcn.mov.dpp.i32", ctx->i32,
> + args, 5,
>   AC_FUNC_ATTR_READNONE |
>   AC_FUNC_ATTR_CONVERGENT);
> } else {
> +   LLVMValueRef tl_tid, trbl_tid;
> +
> +   thread_id = ac_get_thread_id(ctx);
> +
> +   tl_tid = LLVMBuildAnd(ctx->builder, thread_id,
> +   LLVMConstInt(ctx->i32, mask, false), "");
> +
> +   trbl_tid = LLVMBuildAdd(ctx->builder, tl_tid,
> +   LLVMConstInt(ctx->i32, idx, false), 
> "");
> +
> +
> LLVMValueRef store_ptr, load_ptr0, load_ptr1;
>
> store_ptr = ac_build_gep0(ctx, lds, thread_id);
> diff --git a/src/amd/common/ac_llvm_build.h b/src/amd/common/ac_llvm_build.h
> index ebb78fbd79b..14260b05018 100644
> --- a/src/amd/common/ac_llvm_build.h
> +++ b/src/amd/common/ac_llvm_build.h
> @@ -161,7 +161,7 @@ ac_get_thread_id(struct ac_llvm_context *ctx);
>
>  LLVMValueRef
>  ac_build_ddxy(struct ac_llvm_context *ctx,
> - bool has_ds_bpermute,
> + bool has_mov_dpp,
>   uint32_t mask,
>   int idx,
>   LLVMValueRef lds,
> diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
> index 49117d21bd2..2385c60d316 100644
> --- a/src/amd/common/ac_nir_to_llvm.c
> +++ b/src/amd/common/ac_nir_to_llvm.c
> @@ -164,7 +164,7 @@ struct nir_to_llvm_context {
> uint8_t num_output_clips;
> uint8_t num_output_c

Re: [Mesa-dev] [PATCH] radv: fix trace dumping for !use_ib_bos

2017-06-11 Thread Bas Nieuwenhuizen
Reviewed-by: Bas Nieuwenhuizen 

We shouldn't chain when use_ib_bos is false and embed secondary
command buffers directly in the primary buffer as well, so no handling
of chaining is needed.

On Sun, Jun 11, 2017 at 4:03 PM, Grazvydas Ignotas  wrote:
> Fixes trace dumping crash for SI or when RADV_DEBUG=noibs is set.
>
> Fixes: 97dfff5410 "radv: Dump command buffer on hang."
> Signed-off-by: Grazvydas Ignotas 
> ---
> Not sure if chained buffer dumping can be done for !use_ib_bos,
> returning NULL in _get_cpu_addr() just skips that.
>
>  src/amd/vulkan/winsys/amdgpu/radv_amdgpu_cs.c | 14 +++---
>  1 file changed, 11 insertions(+), 3 deletions(-)
>
> diff --git a/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_cs.c 
> b/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_cs.c
> index 7b74970..ffc7566 100644
> --- a/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_cs.c
> +++ b/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_cs.c
> @@ -950,10 +950,13 @@ static int radv_amdgpu_winsys_cs_submit(struct 
> radeon_winsys_ctx *_ctx,
>
>  static void *radv_amdgpu_winsys_get_cpu_addr(void *_cs, uint64_t addr)
>  {
> struct radv_amdgpu_cs *cs = (struct radv_amdgpu_cs *)_cs;
> void *ret = NULL;
> +
> +   if (!cs->ib_buffer)
> +   return NULL;
> for (unsigned i = 0; i <= cs->num_old_ib_buffers; ++i) {
> struct radv_amdgpu_winsys_bo *bo;
>
> bo = (struct radv_amdgpu_winsys_bo*)
>(i == cs->num_old_ib_buffers ? cs->ib_buffer : 
> cs->old_ib_buffers[i]);
> @@ -968,14 +971,19 @@ static void *radv_amdgpu_winsys_get_cpu_addr(void *_cs, 
> uint64_t addr)
>  static void radv_amdgpu_winsys_cs_dump(struct radeon_winsys_cs *_cs,
> FILE* file,
> uint32_t trace_id)
>  {
> struct radv_amdgpu_cs *cs = (struct radv_amdgpu_cs *)_cs;
> +   void *ib = cs->base.buf;
> +   int num_dw = cs->base.cdw;
>
> -   ac_parse_ib(file,
> -   radv_amdgpu_winsys_get_cpu_addr(cs, cs->ib.ib_mc_address),
> -   cs->ib.size, trace_id,  "main IB", 
> cs->ws->info.chip_class,
> +   if (cs->ws->use_ib_bos) {
> +   ib = radv_amdgpu_winsys_get_cpu_addr(cs, 
> cs->ib.ib_mc_address);
> +   num_dw = cs->ib.size;
> +   }
> +   assert(ib);
> +   ac_parse_ib(file, ib, num_dw, trace_id, "main IB", 
> cs->ws->info.chip_class,
> radv_amdgpu_winsys_get_cpu_addr, cs);
>  }
>
>  static struct radeon_winsys_ctx *radv_amdgpu_ctx_create(struct radeon_winsys 
> *_ws)
>  {
> --
> 2.7.4
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 0/3] [RFC] mesa/st: glsl_to_tgsi: improved temp-reg lifetime estimation

2017-06-11 Thread Marek Olšák
Also, I don't know if people will like that it uses STL. I personally
have no issue with that as long as it doesn't break apps (e.g. the STL
shipped with apps should be the same as the STL shipped with the
distribution).

Marek

On Sun, Jun 11, 2017 at 4:12 PM, Marek Olšák  wrote:
> Hi Gert,
>
> Have you measured the CPU overhead of the new code?
>
> Marek
>
> On Sat, Jun 10, 2017 at 1:15 AM, Gert Wollny  wrote:
>> Dear all,
>>
>> as I wrote before, I was looking into the temporary register renaming.
>>
>> This series of patches implements a new approach that achieves a tigher
>> estimation of the life time of the temporaries, and as a result the Piano
>> and Voloplosion benchmarks implemented in gputest [1] now work. Before
>> they failed with "r600_pipe_shader_create - translation from TGSI failed!"
>>
>> Piglit shows 7 fixes and 6 regressions compared to git 8fac894f, but they 
>> don't
>> seem to be related to shaders. I've also tested other programs like the 
>> unignie-*
>> benchmarks and they didn't show regressions.
>>
>> I think that the patch will need a few more iterations to remove code 
>> duplication
>> and generally adhere to the mesa style, but I think it is atthe point where 
>> I could
>> need a bit of feedback to get it into shape to be acceptable, and I'd also 
>> like to
>> mention that since I'm new to mesa this I have no commit rights.
>>
>> many thanks,
>> Gert
>>
>> [1] http://www.geeks3d.com/gputest/
>>
>> Gert Wollny (3):
>>   mesa/st: glsl_to_tgsi move some helper classes to extra files
>>   mesa/st: glsl_to_tgsi Implement a new lifetime tracker for temporaries
>>   mesa/st: glsl_to_tgsi: tie in the new register renaming approach
>>
>>  configure.ac   |   1 +
>>  src/mesa/Makefile.am   |   4 +-
>>  src/mesa/Makefile.sources  |   4 +
>>  src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 302 +---
>>  src/mesa/state_tracker/st_glsl_to_tgsi_private.cpp | 241 +++
>>  src/mesa/state_tracker/st_glsl_to_tgsi_private.h   | 135 
>>  .../state_tracker/st_glsl_to_tgsi_temprename.cpp   | 551 ++
>>  .../state_tracker/st_glsl_to_tgsi_temprename.h | 114 +++
>>  src/mesa/state_tracker/tests/Makefile.am   |  40 ++
>>  src/mesa/state_tracker/tests/st-renumerate-test| 210 ++
>>  .../tests/test_glsl_to_tgsi_lifetime.cpp   | 789 
>> +
>>  11 files changed, 2104 insertions(+), 287 deletions(-)
>>  create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_private.cpp
>>  create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_private.h
>>  create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
>>  create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_temprename.h
>>  create mode 100644 src/mesa/state_tracker/tests/Makefile.am
>>  create mode 100755 src/mesa/state_tracker/tests/st-renumerate-test
>>  create mode 100644 
>> src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp
>>
>> --
>> 2.13.0
>>
>> ___
>> mesa-dev mailing list
>> mesa-dev@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 101334] Any vulkan app seems to freeze the system

2017-06-11 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=101334

--- Comment #12 from Grazvydas Ignotas  ---
Created attachment 131877
  --> https://bugs.freedesktop.org/attachment.cgi?id=131877&action=edit
trace everything

I've sent a patch that should fix trace dumping for SI:
https://lists.freedesktop.org/archives/mesa-dev/2017-June/158739.html

If you want to trace everything, use the attached patch.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 0/3] [RFC] mesa/st: glsl_to_tgsi: improved temp-reg lifetime estimation

2017-06-11 Thread Marek Olšák
Hi Gert,

Have you measured the CPU overhead of the new code?

Marek

On Sat, Jun 10, 2017 at 1:15 AM, Gert Wollny  wrote:
> Dear all,
>
> as I wrote before, I was looking into the temporary register renaming.
>
> This series of patches implements a new approach that achieves a tigher
> estimation of the life time of the temporaries, and as a result the Piano
> and Voloplosion benchmarks implemented in gputest [1] now work. Before
> they failed with "r600_pipe_shader_create - translation from TGSI failed!"
>
> Piglit shows 7 fixes and 6 regressions compared to git 8fac894f, but they 
> don't
> seem to be related to shaders. I've also tested other programs like the 
> unignie-*
> benchmarks and they didn't show regressions.
>
> I think that the patch will need a few more iterations to remove code 
> duplication
> and generally adhere to the mesa style, but I think it is atthe point where I 
> could
> need a bit of feedback to get it into shape to be acceptable, and I'd also 
> like to
> mention that since I'm new to mesa this I have no commit rights.
>
> many thanks,
> Gert
>
> [1] http://www.geeks3d.com/gputest/
>
> Gert Wollny (3):
>   mesa/st: glsl_to_tgsi move some helper classes to extra files
>   mesa/st: glsl_to_tgsi Implement a new lifetime tracker for temporaries
>   mesa/st: glsl_to_tgsi: tie in the new register renaming approach
>
>  configure.ac   |   1 +
>  src/mesa/Makefile.am   |   4 +-
>  src/mesa/Makefile.sources  |   4 +
>  src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 302 +---
>  src/mesa/state_tracker/st_glsl_to_tgsi_private.cpp | 241 +++
>  src/mesa/state_tracker/st_glsl_to_tgsi_private.h   | 135 
>  .../state_tracker/st_glsl_to_tgsi_temprename.cpp   | 551 ++
>  .../state_tracker/st_glsl_to_tgsi_temprename.h | 114 +++
>  src/mesa/state_tracker/tests/Makefile.am   |  40 ++
>  src/mesa/state_tracker/tests/st-renumerate-test| 210 ++
>  .../tests/test_glsl_to_tgsi_lifetime.cpp   | 789 
> +
>  11 files changed, 2104 insertions(+), 287 deletions(-)
>  create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_private.cpp
>  create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_private.h
>  create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
>  create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_temprename.h
>  create mode 100644 src/mesa/state_tracker/tests/Makefile.am
>  create mode 100755 src/mesa/state_tracker/tests/st-renumerate-test
>  create mode 100644 
> src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp
>
> --
> 2.13.0
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 90264] [Regression, bisected] Tooltip corruption in Chrome

2017-06-11 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=90264

--- Comment #76 from omerfarukdoga...@hotmail.com ---
This problem happens when a tooltip was previously shown with a multi-line
content and the current tooltip has smaller number of lines than the previous
one. Check this topic for detailed explanation:
https://www.kubuntuforums.net/showthread.php?t=71878

It's like the placement of the drawn element is calculated according to the
previous tooltip size (aligned to the bottom).

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] radv: fix trace dumping for !use_ib_bos

2017-06-11 Thread Grazvydas Ignotas
Fixes trace dumping crash for SI or when RADV_DEBUG=noibs is set.

Fixes: 97dfff5410 "radv: Dump command buffer on hang."
Signed-off-by: Grazvydas Ignotas 
---
Not sure if chained buffer dumping can be done for !use_ib_bos,
returning NULL in _get_cpu_addr() just skips that.

 src/amd/vulkan/winsys/amdgpu/radv_amdgpu_cs.c | 14 +++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_cs.c 
b/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_cs.c
index 7b74970..ffc7566 100644
--- a/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_cs.c
+++ b/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_cs.c
@@ -950,10 +950,13 @@ static int radv_amdgpu_winsys_cs_submit(struct 
radeon_winsys_ctx *_ctx,
 
 static void *radv_amdgpu_winsys_get_cpu_addr(void *_cs, uint64_t addr)
 {
struct radv_amdgpu_cs *cs = (struct radv_amdgpu_cs *)_cs;
void *ret = NULL;
+
+   if (!cs->ib_buffer)
+   return NULL;
for (unsigned i = 0; i <= cs->num_old_ib_buffers; ++i) {
struct radv_amdgpu_winsys_bo *bo;
 
bo = (struct radv_amdgpu_winsys_bo*)
   (i == cs->num_old_ib_buffers ? cs->ib_buffer : 
cs->old_ib_buffers[i]);
@@ -968,14 +971,19 @@ static void *radv_amdgpu_winsys_get_cpu_addr(void *_cs, 
uint64_t addr)
 static void radv_amdgpu_winsys_cs_dump(struct radeon_winsys_cs *_cs,
FILE* file,
uint32_t trace_id)
 {
struct radv_amdgpu_cs *cs = (struct radv_amdgpu_cs *)_cs;
+   void *ib = cs->base.buf;
+   int num_dw = cs->base.cdw;
 
-   ac_parse_ib(file,
-   radv_amdgpu_winsys_get_cpu_addr(cs, cs->ib.ib_mc_address),
-   cs->ib.size, trace_id,  "main IB", cs->ws->info.chip_class,
+   if (cs->ws->use_ib_bos) {
+   ib = radv_amdgpu_winsys_get_cpu_addr(cs, cs->ib.ib_mc_address);
+   num_dw = cs->ib.size;
+   }
+   assert(ib);
+   ac_parse_ib(file, ib, num_dw, trace_id, "main IB", 
cs->ws->info.chip_class,
radv_amdgpu_winsys_get_cpu_addr, cs);
 }
 
 static struct radeon_winsys_ctx *radv_amdgpu_ctx_create(struct radeon_winsys 
*_ws)
 {
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] st/mesa: unmap the stream_uploader buffer before drawing

2017-06-11 Thread Marek Olšák
On Sat, Jun 10, 2017 at 5:27 AM, Brian Paul  wrote:
> Some drivers require that the vertex buffers be unmapped prior to
> drawing.  This change unmaps the stream_uploader buffer after we've
> uploaded the zero-stride attributes (unless the driver supports
> rendering with mapped buffers).
>
> This fixes a regression in the VMware driver since 17f776c27be266f2.
> Some Mesa demos such as mandelbrot and brick would display black
> quads instead of the expected rendering.
>
> --
>
> Marek: can you please verify that this is the right place for this
> call (and not in st_draw_vbo())?

Yes, this is the right place.

Reviewed-by: Marek Olšák 

Marek
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [RFC 1/2] hud: Handle query values according to their type

2017-06-11 Thread Boyan Ding
Signed-off-by: Boyan Ding 
---
 src/gallium/auxiliary/hud/hud_driver_query.c | 22 +++---
 1 file changed, 19 insertions(+), 3 deletions(-)

diff --git a/src/gallium/auxiliary/hud/hud_driver_query.c 
b/src/gallium/auxiliary/hud/hud_driver_query.c
index 76104b5b49..7a469bd1bd 100644
--- a/src/gallium/auxiliary/hud/hud_driver_query.c
+++ b/src/gallium/auxiliary/hud/hud_driver_query.c
@@ -202,6 +202,7 @@ struct query_info {
unsigned query_type;
unsigned result_index; /* unit depends on query_type */
enum pipe_driver_query_result_type result_type;
+   enum pipe_driver_query_type type;
 
/* Ring of queries. If a query is busy, we use another slot. */
struct pipe_query *query[NUM_QUERIES];
@@ -229,6 +230,19 @@ query_new_value_batch(struct query_info *info)
}
 }
 
+static uint64_t
+query_result_to_u64(union pipe_query_result result,
+enum pipe_driver_query_type type)
+{
+   switch (type) {
+   case PIPE_DRIVER_QUERY_TYPE_PERCENTAGE:
+   case PIPE_DRIVER_QUERY_TYPE_FLOAT:
+  return result.f;
+   default:
+  return result.u64;
+   }
+}
+
 static void
 query_new_value_normal(struct query_info *info)
 {
@@ -242,10 +256,11 @@ query_new_value_normal(struct query_info *info)
   while (1) {
  struct pipe_query *query = info->query[info->tail];
  union pipe_query_result result;
- uint64_t *res64 = (uint64_t *)&result;
+ union pipe_query_result *presult = &result;
 
- if (query && pipe->get_query_result(pipe, query, FALSE, &result)) {
-info->results_cumulative += res64[info->result_index];
+ if (query && pipe->get_query_result(pipe, query, FALSE, presult)) {
+info->results_cumulative +=
+   query_result_to_u64(presult[info->result_index], info->type);
 info->num_results++;
 
 if (info->tail == info->head)
@@ -383,6 +398,7 @@ hud_pipe_query_install(struct hud_batch_query_context **pbq,
info = gr->query_data;
info->pipe = pipe;
info->result_type = result_type;
+   info->type = type;
 
if (flags & PIPE_DRIVER_QUERY_FLAG_BATCH) {
   if (!batch_query_add(pbq, pipe, query_type, &info->result_index))
-- 
2.13.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] gallium/util: whitespace, formatting fixes in u_upload_mgr.c

2017-06-11 Thread Marek Olšák
Reviewed-by: Marek Olšák 

Marek

On Sat, Jun 10, 2017 at 5:27 AM, Brian Paul  wrote:
> ---
>  src/gallium/auxiliary/util/u_upload_mgr.c | 54 
> +--
>  1 file changed, 29 insertions(+), 25 deletions(-)
>
> diff --git a/src/gallium/auxiliary/util/u_upload_mgr.c 
> b/src/gallium/auxiliary/util/u_upload_mgr.c
> index 9528495..4bb14d6 100644
> --- a/src/gallium/auxiliary/util/u_upload_mgr.c
> +++ b/src/gallium/auxiliary/util/u_upload_mgr.c
> @@ -1,8 +1,8 @@
>  /**
> - *
> + *
>   * Copyright 2009 VMware, Inc.
>   * All Rights Reserved.
> - *
> + *
>   * Permission is hereby granted, free of charge, to any person obtaining a
>   * copy of this software and associated documentation files (the
>   * "Software"), to deal in the Software without restriction, including
> @@ -10,11 +10,11 @@
>   * distribute, sub license, and/or sell copies of the Software, and to
>   * permit persons to whom the Software is furnished to do so, subject to
>   * the following conditions:
> - *
> + *
>   * The above copyright notice and this permission notice (including the
>   * next paragraph) shall be included in all copies or substantial portions
>   * of the Software.
> - *
> + *
>   * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
>   * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
>   * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT.
> @@ -22,7 +22,7 @@
>   * ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
>   * TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
>   * SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
> - *
> + *
>   **/
>
>  /* Helper utility for uploading user buffers & other data, and
> @@ -59,7 +59,7 @@ struct u_upload_mgr *
>  u_upload_create(struct pipe_context *pipe, unsigned default_size,
>  unsigned bind, enum pipe_resource_usage usage)
>  {
> -   struct u_upload_mgr *upload = CALLOC_STRUCT( u_upload_mgr );
> +   struct u_upload_mgr *upload = CALLOC_STRUCT(u_upload_mgr);
> if (!upload)
>return NULL;
>
> @@ -104,7 +104,8 @@ u_upload_clone(struct pipe_context *pipe, struct 
> u_upload_mgr *upload)
>upload->usage);
>  }
>
> -static void upload_unmap_internal(struct u_upload_mgr *upload, boolean 
> destroying)
> +static void
> +upload_unmap_internal(struct u_upload_mgr *upload, boolean destroying)
>  {
> if (!destroying && upload->map_persistent)
>return;
> @@ -124,30 +125,32 @@ static void upload_unmap_internal(struct u_upload_mgr 
> *upload, boolean destroyin
>  }
>
>
> -void u_upload_unmap( struct u_upload_mgr *upload )
> +void
> +u_upload_unmap(struct u_upload_mgr *upload)
>  {
> upload_unmap_internal(upload, FALSE);
>  }
>
>
> -static void u_upload_release_buffer(struct u_upload_mgr *upload)
> +static void
> +u_upload_release_buffer(struct u_upload_mgr *upload)
>  {
> /* Unmap and unreference the upload buffer. */
> upload_unmap_internal(upload, TRUE);
> -   pipe_resource_reference( &upload->buffer, NULL );
> +   pipe_resource_reference(&upload->buffer, NULL);
>  }
>
>
> -void u_upload_destroy( struct u_upload_mgr *upload )
> +void
> +u_upload_destroy(struct u_upload_mgr *upload)
>  {
> -   u_upload_release_buffer( upload );
> -   FREE( upload );
> +   u_upload_release_buffer(upload);
> +   FREE(upload);
>  }
>
>
>  static void
> -u_upload_alloc_buffer(struct u_upload_mgr *upload,
> -  unsigned min_size)
> +u_upload_alloc_buffer(struct u_upload_mgr *upload, unsigned min_size)
>  {
> struct pipe_screen *screen = upload->pipe->screen;
> struct pipe_resource buffer;
> @@ -155,9 +158,9 @@ u_upload_alloc_buffer(struct u_upload_mgr *upload,
>
> /* Release the old buffer, if present:
>  */
> -   u_upload_release_buffer( upload );
> +   u_upload_release_buffer(upload);
>
> -   /* Allocate a new one:
> +   /* Allocate a new one:
>  */
> size = align(MAX2(upload->default_size, min_size), 4096);
>
> @@ -232,7 +235,7 @@ u_upload_alloc(struct u_upload_mgr *upload,
>offset,
>buffer_size - offset,
>upload->map_flags,
> - &upload->transfer);
> +  &upload->transfer);
>if (unlikely(!upload->map)) {
>   upload->transfer = NULL;
>   *out_offset = ~0;
> @@ -256,13 +259,14 @@ u_upload_alloc(struct u_upload_mgr *upload,
> upload->offset = offset + size;
>  }
>
> -void u_upload_data(struct u_upload_mgr *upload,
> -   unsigned min_out_offset,
> -   unsigned size,
> -   unsigned alignment,
> -   const void *data,
> -   

[Mesa-dev] [RFC 0/2] nvc0: Fix non-integer counters in AMD_performance_monitor

2017-06-11 Thread Boyan Ding
Some performance counters in nouveau uses non-integer type in AMD_perfmon,
but they are currently returning int values. One reason behind this is
that gallium hud, which is one of the users of the counters, only supports
integers. This series tries to fix problem in both parts -- making nouveau
return values of appropriate types and teaching hud to be aware of types.
Although hud is still not clever enough to handle floating points, it is
becoming no worse.

Note that this series is highly RFC. I'm posting this to solicit ideas,
both on whether this approach is appropriate and details in handling --
the current code looks somewhat ugly. There are also further problems
that I look to solve, namely some performance counters, such as ipc,
which should be floats instead of ints. But I want to get basics correct
first.

Boyan Ding (2):
  hud: Handle query values according to their type
  nvc0: Return value of appropriate type instead of u64

 src/gallium/auxiliary/hud/hud_driver_query.c   | 22 ++-
 .../drivers/nouveau/nvc0/nvc0_query_hw_metric.c| 70 +-
 2 files changed, 61 insertions(+), 31 deletions(-)

-- 
2.13.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [RFC 2/2] nvc0: Return value of appropriate type instead of u64

2017-06-11 Thread Boyan Ding
Signed-off-by: Boyan Ding 
---
 .../drivers/nouveau/nvc0/nvc0_query_hw_metric.c| 70 +-
 1 file changed, 42 insertions(+), 28 deletions(-)

diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_metric.c 
b/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_metric.c
index 089af61820..6d4deaf2ba 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_metric.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_metric.c
@@ -498,53 +498,59 @@ nvc0_hw_metric_end_query(struct nvc0_context *nvc0, 
struct nvc0_hw_query *hq)
 static uint64_t
 sm20_hw_metric_calc_result(struct nvc0_hw_query *hq, uint64_t res64[8])
 {
+   union pipe_query_result result;
+
+   result.u64 = 0;
switch (hq->base.type - NVC0_HW_METRIC_QUERY(0)) {
case NVC0_HW_METRIC_QUERY_ACHIEVED_OCCUPANCY:
   /* ((active_warps / active_cycles) / max. number of warps on a MP) * 100 
*/
   if (res64[1])
- return ((res64[0] / (double)res64[1]) / 48) * 100;
+ result.f = ((res64[0] / (double)res64[1]) / 48) * 100;
   break;
case NVC0_HW_METRIC_QUERY_BRANCH_EFFICIENCY:
   /* (branch / (branch + divergent_branch)) * 100 */
   if (res64[0] + res64[1])
- return (res64[0] / (double)(res64[0] + res64[1])) * 100;
+ result.f = (res64[0] / (double)(res64[0] + res64[1])) * 100;
   break;
case NVC0_HW_METRIC_QUERY_INST_PER_WRAP:
   /* inst_executed / warps_launched */
   if (res64[1])
- return res64[0] / (double)res64[1];
+ result.u64 = res64[0] / (double)res64[1];
   break;
case NVC0_HW_METRIC_QUERY_INST_REPLAY_OVERHEAD:
   /* (inst_issued - inst_executed) / inst_executed */
   if (res64[1])
- return (res64[0] - res64[1]) / (double)res64[1];
+ result.u64 = (res64[0] - res64[1]) / (double)res64[1];
   break;
case NVC0_HW_METRIC_QUERY_ISSUED_IPC:
   /* inst_issued / active_cycles */
   if (res64[1])
- return res64[0] / (double)res64[1];
+ result.u64 = res64[0] / (double)res64[1];
   break;
case NVC0_HW_METRIC_QUERY_ISSUE_SLOT_UTILIZATION:
   /* ((inst_issued / 2) / active_cycles) * 100 */
   if (res64[1])
- return ((res64[0] / 2) / (double)res64[1]) * 100;
+ result.f = ((res64[0] / 2) / (double)res64[1]) * 100;
   break;
case NVC0_HW_METRIC_QUERY_IPC:
   /* inst_executed / active_cycles */
   if (res64[1])
- return res64[0] / (double)res64[1];
+ result.u64 = res64[0] / (double)res64[1];
   break;
default:
   debug_printf("invalid metric type: %d\n",
hq->base.type - NVC0_HW_METRIC_QUERY(0));
   break;
}
-   return 0;
+   return result.u64;
 }
 
 static uint64_t
 sm21_hw_metric_calc_result(struct nvc0_hw_query *hq, uint64_t res64[8])
 {
+   union pipe_query_result result;
+
+   result.u64 = 0;
switch (hq->base.type - NVC0_HW_METRIC_QUERY(0)) {
case NVC0_HW_METRIC_QUERY_ACHIEVED_OCCUPANCY:
   return sm20_hw_metric_calc_result(hq, res64);
@@ -552,31 +558,31 @@ sm21_hw_metric_calc_result(struct nvc0_hw_query *hq, 
uint64_t res64[8])
   return sm20_hw_metric_calc_result(hq, res64);
case NVC0_HW_METRIC_QUERY_INST_ISSUED:
   /* issued1_0 + issued1_1 + (issued2_0 + issued2_1) * 2 */
-  return res64[0] + res64[1] + (res64[2] + res64[3]) * 2;
+  result.u64 = res64[0] + res64[1] + (res64[2] + res64[3]) * 2;
   break;
case NVC0_HW_METRIC_QUERY_INST_PER_WRAP:
   return sm20_hw_metric_calc_result(hq, res64);
case NVC0_HW_METRIC_QUERY_INST_REPLAY_OVERHEAD:
   /* (metric-inst_issued - inst_executed) / inst_executed */
   if (res64[4])
- return (((res64[0] + res64[1] + (res64[2] + res64[3]) * 2) -
-   res64[4]) / (double)res64[4]);
+ result.u64 = (((res64[0] + res64[1] + (res64[2] + res64[3]) * 2) -
+ res64[4]) / (double)res64[4]);
   break;
case NVC0_HW_METRIC_QUERY_ISSUED_IPC:
   /* metric-inst_issued / active_cycles */
   if (res64[4])
- return (res64[0] + res64[1] + (res64[2] + res64[3]) * 2) /
-(double)res64[4];
+ result.u64 = (res64[0] + res64[1] + (res64[2] + res64[3]) * 2) /
+  (double)res64[4];
   break;
case NVC0_HW_METRIC_QUERY_ISSUE_SLOTS:
   /* issued1_0 + issued1_1 + issued2_0 + issued2_1 */
-  return res64[0] + res64[1] + res64[2] + res64[3];
+  result.u64 = res64[0] + res64[1] + res64[2] + res64[3];
   break;
case NVC0_HW_METRIC_QUERY_ISSUE_SLOT_UTILIZATION:
   /* ((metric-issue_slots / 2) / active_cycles) * 100 */
   if (res64[4])
- return (((res64[0] + res64[1] + res64[2] + res64[3]) / 2) /
- (double)res64[4]) * 100;
+ result.f =  (((res64[0] + res64[1] + res64[2] + res64[3]) / 2) /
+  (double)res64[4]) * 100;
   break;
case NVC0_HW_METRIC_QUERY_IPC:
   return sm20_hw_metric_calc_result(hq, res64);
@@ -585,

[Mesa-dev] [Bug 101334] Any vulkan app seems to freeze the system

2017-06-11 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=101334

--- Comment #11 from John  ---
Created attachment 131874
  --> https://bugs.freedesktop.org/attachment.cgi?id=131874&action=edit
gdb backtrace

Well, I've been able to get a backtrace thanks to screen.

That looks more interesting already.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 101334] Any vulkan app seems to freeze the system

2017-06-11 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=101334

--- Comment #10 from John  ---
I'm not sure if it's thanks to debug, but now I get something in dmesg, not
that helpful I'm afraid:

[  141.325269] raytracing[2417]: segfault at 8 ip 7fd0b21e74d2 sp
7ffc604d5520 error 4 in libvulkan_radeon.so[7fd0b217+1b3000]


The trace file has been empty the various times I've tried. Is there a way to
get a full trace of everything it's doing? maybe that would allow the last line
or so to be useful.

As for gdb, it gets stuck on "attaching to process" and the process command in
ps is displayed in square brackets.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 101334] Any vulkan app seems to freeze the system

2017-06-11 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=101334

--- Comment #9 from Grazvydas Ignotas  ---
Is the process still alive when you ssh to the system with a hung GPU? If it
is, you could attach gdb and try to get a backtrace of a hung thread.

You can try at least a few other things:
* compile mesa with --enable-debug if you aren't already, it will enable
asserts that might detect something bad
* set a RADV_TRACE_FILE=/path/to/file environment variable, it will then try to
write out trace of GPU commands to that file if/when it detects a hang.

The trace file sometimes takes a few tries to produce successfully, but if you
can get it, it might help to find the cause of the hang.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 101378] interpolateAtSample check for input parameter is too strict

2017-06-11 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=101378

--- Comment #1 from freedesk...@ca.sh13.net ---
Sorry, copy/paste error, the first two lines are from another unrelated error.
The correct error message is just this part:

0:18(33): error: parameter `interpolant` must be a shader input
0:18(6): error: no matching function for call to `length(error)';
candidates are:
0:18(6): error:float length(float)
0:18(6): error:float length(vec2)
0:18(6): error:float length(vec3)
0:18(6): error:float length(vec4)
0:18(6): error:double length(double)
0:18(6): error:double length(dvec2)
0:18(6): error:double length(dvec3)
0:18(6): error:double length(dvec4)
0:18(6): error: operands to relational operators must be scalar and numeric
0:18(6): error: if-statement condition must be scalar boolean

-- 
You are receiving this mail because:
You are the assignee for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 101378] interpolateAtSample check for input parameter is too strict

2017-06-11 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=101378

Bug ID: 101378
   Summary: interpolateAtSample check for input parameter is too
strict
   Product: Mesa
   Version: 17.0
  Hardware: x86-64 (AMD64)
OS: Linux (All)
Status: NEW
  Severity: normal
  Priority: medium
 Component: glsl-compiler
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: freedesk...@ca.sh13.net
QA Contact: intel-3d-b...@lists.freedesktop.org

The following code snippet fails on Mesa 17.0.3 with the following error:

in GeometryToPixel
{
vec3 color;
vec2 uv;
} gs2ps;

   out vec4 Color;

   void main ()
   {
Color = vec4(gs2ps.color, 1);

if (length(interpolateAtSample(gs2ps.uv, gl_SampleID))>1) {
discard;
}
}

This is the error:

0:10(1): error: if a fragment input is (or contains) an integer, then it
must be qualified with 'flat'
0:10(8): error: `gl_SampleID' redeclared
0:18(33): error: parameter `interpolant` must be a shader input
0:18(6): error: no matching function for call to `length(error)';
candidates are:
0:18(6): error:float length(float)
0:18(6): error:float length(vec2)
0:18(6): error:float length(vec3)
0:18(6): error:float length(vec4)
0:18(6): error:double length(double)
0:18(6): error:double length(dvec2)
0:18(6): error:double length(dvec3)
0:18(6): error:double length(dvec4)
0:18(6): error: operands to relational operators must be scalar and numeric
0:18(6): error: if-statement condition must be scalar boolean

Changing the shader to use:

in  vec3 icolor;
in  vec2 iuv;

fixes the problem. This seems to be related to
https://patchwork.freedesktop.org/patch/15298/, but the input variable check is
too strict. It should work with an input block as well, which is for instance
used by the G-Truc samples:
https://github.com/g-truc/ogl-samples/blob/69499c23b9566ac432cc2af33cde6646271c/data/gl-400/fbo-multisample-interpolate.frag

Code in question works fine on AMD desktop drivers.

-- 
You are receiving this mail because:
You are the assignee for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] software implementation of vulkan for gsoc/evoc

2017-06-11 Thread Jose Fonseca

On 11/06/17 07:59, Jacob Lifshay wrote:
On Sat, Jun 10, 2017 at 3:25 PM Jose Fonseca > wrote:


I don't see how to effectively tack triangle setup into the vertex
shader: vertex shader applies to vertices, where as triangle setup and
bining applies to primitives.  Usually, each vertex gets transformed
only once with llvmpipe, no matter how many triangles refer that vertex.
   The only way to tack triangle setup into vertex shading would be if
you processed vertices a primitive at a time.  Of course one could put
an if-statement to skip reprocessing a vertex that already was
processed, but then you have race conditions, and no benefit of
inlining.

I was mostly thinking of non-indexed vertices.



And I'm afraid that tacking rasterization too is one those things that
sound great on paper, quite bad in practice.  And I speak from
experience: in fact llvmpipe had the last step of rasterization bolted
on the fragment shaders for some time.  But we took it out because it
was _slower_.

The issue is that if you bolt on to the shader body, you either:

- inline in the shader body code for the maxmimum number of planes that
(which are 7, 3 sides of triangle, plus 4 sides of a scissor rect), and
waste cpu cicles going through all of those tests, even when most of the
time many of those tests aren't needed

- or you generate if/for blocks for each place, so you only do the
needed tests, but then you have branch prediction issues...

Whereas if you keep rasterization _outside_ the shader you can have
specialized functions to do the rasterization based on the primitive
itself: (is the triangle fully inside the scissor, you need 3 planes, if
the stamp is fully inside the triangle you need zero).  Essentially you
can "compose" by coupling two functions calls: you call a rasterization
function that's especiallized for the primitive, then a shading function
that's specialized for the state (but not depends on the primitive).

It makes sense: rasterization needs to be specialized for the primitive,
not the graphics state; where as the shader needs to be specialized for
the state.

I am planning on generating a function for each primitive type and state 
combination, or I can convert all primitives into triangles and just 
have a function for each state. The state includes stuff like if a 
particular clipping/scissor equation needs to be checked. I did it that 
way in my proof-of-concept code by using c++ templates to do the code 
duplication: 
https://github.com/programmerjake/tiled-renderer/blob/47e09f5d711803b8e899c3669fbeae3e62c9e32c/main.cpp#L366


I'm not sure there will be enough benefits of iniline to compensate the 
time spent on compiling 2**7 variants of each shader to cope with all 
possible incoming triangles..



And this is just one of those non-intuitive things that's not obvious
until one actually does a lot of profiling, a lot of experimentation.
And trust me, lot of time was spent fine tuning this for llvmpipe (not
be me -- most of rasterization was done by Keith Whitwell.)  And by
throwing llvmpipe out of the window and starting a new software
rendering from scratch you'd be just subscribing to do it all over
again.

Whereas if instead of starting from scratch, you take llvmpipe, and you
rewrite/replace one component at a time, you can reach exactly the same
destination you want to reach, however you'll have something working
every step of the way, so when you take a bad step, you can measure
performance impact, and readjust.  Plus if you run out of time, you have
something useful -- not yet another half finished project, which quickly
will rot away.

In the case that the project is not finished this summer, I'm still 
planning on working on it, just at a reduced rate. If all else fails, we 
will at least have a up-to-date spir-v to llvm converter that handles 
the glsl spir-v extensions.


Regarding generating the spir-v -> scalar llvm, then do whole function
vectorization, I don't think it's a bad idea per se.  If was I writing
llvmpipe from scratch today I'd do something like that.  Especially
because (scalar) LLVM IR is so pervasive in the graphics ecosistem
anyway.

It was only after I had tgsi -> llvm ir all done that I stumbled into
http://compilers.cs.uni-saarland.de/projects/wfv/ .

I think the important thing here is that, once you've vectorized the
shader, and you converted your "texture_sample" to
"texture_sample.vector8", and your "output_merger" intrinsics to
"output_merger.vector8", or you log2/exp2, you then slot the fine tuned
llvmpipe code for texture sampling and blending and math, as that's were
your bottle necks tend to be.  Because if you plan to write all texture
sampling from scratch then you need a time/clone machine to

[Mesa-dev] [Bug 101374] Worms Clan Wars hangs on loading screen

2017-06-11 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=101374

--- Comment #2 from cosiek...@o2.pl ---
Thank you for a quick response. I'm using mesa and I'm also affected. But I'm
using r600. Maybe r600 vs radeonsi?



Computer Information:
Manufacturer:  Unknown
Model:  Unknown
Form Factor: Desktop
No Touch Input Detected

Processor Information:
CPU Vendor:  AuthenticAMD
CPU Brand:  AMD Phenom(tm) II X4 965 Processor
CPU Family:  0x10
CPU Model:  0x4
CPU Stepping:  0x3
CPU Type:  0x0
Speed:  3635 Mhz
4 logical processors
4 physical processors
HyperThreading:  Unsupported
FCMOV:  Supported
SSE2:  Supported
SSE3:  Supported
SSSE3:  Unsupported
SSE4a:  Supported
SSE41:  Unsupported
SSE42:  Unsupported
AES:  Unsupported
AVX:  Unsupported
CMPXCHG16B:  Supported
LAHF/SAHF:  Supported
PrefetchW:  Unsupported

Network Information:
Network Speed:  

Operating System Version:
"Manjaro Linux" (64 bit)
Kernel Name:  Linux
Kernel Version:  4.11.4-1-MANJARO
X Server Vendor:  The X.Org Foundation
X Server Release:  11903000
X Window Manager:  Xfwm4
Steam Runtime Version:  

Video Card:
Driver:  X.Org Gallium 0.4 on AMD JUNIPER (DRM 2.49.0 / 4.11.4-1-MANJARO,
LLVM 4.0.0)

Driver Version:  3.0 Mesa 17.1.2
OpenGL Version: 3.0
Desktop Color Depth: 24 bits per pixel
Monitor Refresh Rate: 119 Hz
VendorID:  0x1002
DeviceID:  0x68ba
Revision Not Detected
Number of Monitors:  1
Number of Logical Video Cards:  1
Primary Display Resolution:  1680 x 1050
Desktop Resolution: 1680 x 1050
Primary Display Size: 18,82" x 11,81"  (22,20" diag)
47,8cm x 30,0cm  (56,4cm diag)
Primary VRAM: 1024 MB

Sound card:
Audio device: SigmaTel STAC9721,23

Memory:
RAM:  3954 Mb

Miscellaneous:
UI Language:  English
LANG:  pl_PL.UTF-8
Microphone:  Not set
Steam Controller Cable and Base:  Not set
Total Hard Disk Space Available:  126486 Mb
Largest Free Hard Disk Block:  21047 Mb
VR Headset: None detected

Recent Failure Reports:

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] software implementation of vulkan for gsoc/evoc

2017-06-11 Thread Jacob Lifshay
On Sat, Jun 10, 2017 at 3:25 PM Jose Fonseca  wrote:

> I don't see how to effectively tack triangle setup into the vertex
> shader: vertex shader applies to vertices, where as triangle setup and
> bining applies to primitives.  Usually, each vertex gets transformed
> only once with llvmpipe, no matter how many triangles refer that vertex.
>   The only way to tack triangle setup into vertex shading would be if
> you processed vertices a primitive at a time.  Of course one could put
> an if-statement to skip reprocessing a vertex that already was
> processed, but then you have race conditions, and no benefit of inlining.
>
I was mostly thinking of non-indexed vertices.

And I'm afraid that tacking rasterization too is one those things that
> sound great on paper, quite bad in practice.  And I speak from
> experience: in fact llvmpipe had the last step of rasterization bolted
> on the fragment shaders for some time.  But we took it out because it
> was _slower_.
>
> The issue is that if you bolt on to the shader body, you either:
>
> - inline in the shader body code for the maxmimum number of planes that
> (which are 7, 3 sides of triangle, plus 4 sides of a scissor rect), and
> waste cpu cicles going through all of those tests, even when most of the
> time many of those tests aren't needed
>
> - or you generate if/for blocks for each place, so you only do the
> needed tests, but then you have branch prediction issues...
>
> Whereas if you keep rasterization _outside_ the shader you can have
> specialized functions to do the rasterization based on the primitive
> itself: (is the triangle fully inside the scissor, you need 3 planes, if
> the stamp is fully inside the triangle you need zero).  Essentially you
> can "compose" by coupling two functions calls: you call a rasterization
> function that's especiallized for the primitive, then a shading function
> that's specialized for the state (but not depends on the primitive).
>
> It makes sense: rasterization needs to be specialized for the primitive,
> not the graphics state; where as the shader needs to be specialized for
> the state.
>
I am planning on generating a function for each primitive type and state
combination, or I can convert all primitives into triangles and just have a
function for each state. The state includes stuff like if a particular
clipping/scissor equation needs to be checked. I did it that way in my
proof-of-concept code by using c++ templates to do the code duplication:
https://github.com/programmerjake/tiled-renderer/blob/47e09f5d711803b8e899c3669fbeae3e62c9e32c/main.cpp#L366


And this is just one of those non-intuitive things that's not obvious
> until one actually does a lot of profiling, a lot of experimentation.
> And trust me, lot of time was spent fine tuning this for llvmpipe (not
> be me -- most of rasterization was done by Keith Whitwell.)  And by
> throwing llvmpipe out of the window and starting a new software
> rendering from scratch you'd be just subscribing to do it all over again.
>
> Whereas if instead of starting from scratch, you take llvmpipe, and you
> rewrite/replace one component at a time, you can reach exactly the same
> destination you want to reach, however you'll have something working
> every step of the way, so when you take a bad step, you can measure
> performance impact, and readjust.  Plus if you run out of time, you have
> something useful -- not yet another half finished project, which quickly
> will rot away.
>
In the case that the project is not finished this summer, I'm still
planning on working on it, just at a reduced rate. If all else fails, we
will at least have a up-to-date spir-v to llvm converter that handles the
glsl spir-v extensions.

Regarding generating the spir-v -> scalar llvm, then do whole function
> vectorization, I don't think it's a bad idea per se.  If was I writing
> llvmpipe from scratch today I'd do something like that.  Especially
> because (scalar) LLVM IR is so pervasive in the graphics ecosistem anyway.
>
> It was only after I had tgsi -> llvm ir all done that I stumbled into
> http://compilers.cs.uni-saarland.de/projects/wfv/ .
>
> I think the important thing here is that, once you've vectorized the
> shader, and you converted your "texture_sample" to
> "texture_sample.vector8", and your "output_merger" intrinsics to
> "output_merger.vector8", or you log2/exp2, you then slot the fine tuned
> llvmpipe code for texture sampling and blending and math, as that's were
> your bottle necks tend to be.  Because if you plan to write all texture
> sampling from scratch then you need a time/clone machine to complete
> this in a summer; and if just use LLVM's / standard C runtime's
> sqrt/log2/exp2/sin/cos then it would be dead slow.
>
I am planning on using c++ templates to help with a lot of the texture
sampler code generation -- clang can convert it to llvm ir and then I can
inline it into the appropriate places. I think that all of the
non-compressed image formats should be pr