date:20160626

[Mesa-dev] [Bug 96684] [swrast] piglit glsl-array-bounds-01 regression

2016-06-26 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=96684

--- Comment #1 from Kenneth Graunke  ---
I can't reproduce this.  I tried both classic swrast and llvmpipe and it seems
to be working fine.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH RFC 1/1] r600, compute: Use vtx #3 for kernel arguments

2016-06-26 Thread Jan Vesely

Both explicit and implicit.
Using vtx 0 (as existing llvm code implies) does not work for dynamic offsets.

Signed-off-by: Jan Vesely 
---
Hi,

I ran into problem when using VTX_READ from constant buffer would work only for 
0 index. The LLVM code implied that it should work (or maybe they considered 
constant offsets only), but I could not find one way or the other in ISA docs.

Switching to vtx#3 fixed the problem, though I'm not sure if it's the right 
solution.

thanks,
Jan


 src/gallium/drivers/r600/evergreen_compute.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/r600/evergreen_compute.c 
b/src/gallium/drivers/r600/evergreen_compute.c
index 7f9580c..b351cee 100644
--- a/src/gallium/drivers/r600/evergreen_compute.c
+++ b/src/gallium/drivers/r600/evergreen_compute.c
@@ -369,6 +369,8 @@ static void evergreen_compute_upload_input(struct 
pipe_context *ctx,
ctx->transfer_unmap(ctx, transfer);
 
/* ID=0 is reserved for the parameters */
+   evergreen_cs_set_vertex_buffer(rctx, 3, 0,
+   (struct pipe_resource*)shader->kernel_param);
evergreen_cs_set_constant_buffer(rctx, 0, 0, input_size,
(struct pipe_resource*)shader->kernel_param);
 }
@@ -614,9 +616,9 @@ static void evergreen_set_compute_resources(struct 
pipe_context *ctx,
start, count);
 
for (unsigned i = 0; i < count; i++) {
-   /* The First three vertex buffers are reserved for parameters 
and
+   /* The First four vertex buffers are reserved for parameters and
 * global buffers. */
-   unsigned vtx_id = 3 + i;
+   unsigned vtx_id = 4 + i;
if (resources[i]) {
struct r600_resource_global *buffer =
(struct r600_resource_global*)
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] vc4: add hash table look-up for exported dmabufs

2016-06-26 Thread Eric Anholt

Rob Clark  writes:

> On Sat, Jun 25, 2016 at 11:33 PM, Eric Anholt  wrote:
>> Rob Herring  writes:
>>
>>> It is necessary to reuse existing BOs when dmabufs are imported. There
>>> are 2 cases that need to be handled. dmabufs can be created/exported and
>>> imported by the same process and can be imported multiple times.
>>> Copying other drivers, add a hash table to track exported BOs so the
>>> BOs get reused.
>>>
>>> Cc: Eric Anholt 
>>> Signed-off-by: Rob Herring 
>>> ---
>>> With this and the fd hashing to get a single screen, the flickery screen
>>> is gone and Android is somewhat working. Several apps though hang, don't
>>> render, and then exit. I also see CMA allocation errors, but not
>>> correlating to the app problems.
>>>
>>> Also, flink names need similar hash table look-up as well. Maybe that's
>>> a don't care for vc4? In any case, I don't have the setup to test that.
>>>
>>> Rob
>>>
>>>  src/gallium/drivers/vc4/vc4_bufmgr.c | 20 +++-
>>>  src/gallium/drivers/vc4/vc4_bufmgr.h | 12 +++-
>>>  src/gallium/drivers/vc4/vc4_screen.c | 15 +++
>>>  src/gallium/drivers/vc4/vc4_screen.h |  3 +++
>>>  4 files changed, 48 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/src/gallium/drivers/vc4/vc4_bufmgr.c 
>>> b/src/gallium/drivers/vc4/vc4_bufmgr.c
>>> index 21e3bde..d91157b 100644
>>> --- a/src/gallium/drivers/vc4/vc4_bufmgr.c
>>> +++ b/src/gallium/drivers/vc4/vc4_bufmgr.c
>>> @@ -28,6 +28,7 @@
>>>  #include 
>>>  #include 
>>>
>>> +#include "util/u_hash_table.h"
>>>  #include "util/u_memory.h"
>>>  #include "util/ralloc.h"
>>>
>>> @@ -329,10 +330,19 @@ vc4_bo_open_handle(struct vc4_screen *screen,
>>> uint32_t winsys_stride,
>>> uint32_t handle, uint32_t size)
>>>  {
>>> -struct vc4_bo *bo = CALLOC_STRUCT(vc4_bo);
>>> +struct vc4_bo *bo;
>>>
>>>  assert(size);
>>>
>>> +pipe_mutex_lock(screen->bo_handles_mutex);
>>> +
>>> +bo = util_hash_table_get(screen->bo_handles, 
>>> (void*)(uintptr_t)handle);
>>> +if (bo) {
>>> +pipe_reference(NULL, >reference);
>>> +goto done;
>>> +}
>>> +
>>> +bo = CALLOC_STRUCT(vc4_bo);
>>>  pipe_reference_init(>reference, 1);
>>>  bo->screen = screen;
>>>  bo->handle = handle;
>>> @@ -347,6 +357,10 @@ vc4_bo_open_handle(struct vc4_screen *screen,
>>>  bo->map = malloc(bo->size);
>>>  #endif
>>>
>>> +util_hash_table_set(screen->bo_handles, (void *)(uintptr_t)handle, 
>>> bo);
>>> +
>>> +done:
>>> +pipe_mutex_unlock(screen->bo_handles_mutex);
>>>  return bo;
>>>  }
>>>
>>> @@ -401,6 +415,10 @@ vc4_bo_get_dmabuf(struct vc4_bo *bo)
>>>  }
>>>  bo->private = false;
>>>
>>> +pipe_mutex_lock(bo->screen->bo_handles_mutex);
>>> +util_hash_table_set(bo->screen->bo_handles, (void 
>>> *)(uintptr_t)bo->handle, bo);
>>> +pipe_mutex_unlock(bo->screen->bo_handles_mutex);
>>> +
>>>  return fd;
>>>  }
>>>
>>> diff --git a/src/gallium/drivers/vc4/vc4_bufmgr.h 
>>> b/src/gallium/drivers/vc4/vc4_bufmgr.h
>>> index b77506e..0896b30 100644
>>> --- a/src/gallium/drivers/vc4/vc4_bufmgr.h
>>> +++ b/src/gallium/drivers/vc4/vc4_bufmgr.h
>>> @@ -25,6 +25,7 @@
>>>  #define VC4_BUFMGR_H
>>>
>>>  #include 
>>> +#include "util/u_hash_table.h"
>>>  #include "util/u_inlines.h"
>>>  #include "vc4_qir.h"
>>>
>>> @@ -87,11 +88,20 @@ vc4_bo_reference(struct vc4_bo *bo)
>>>  static inline void
>>>  vc4_bo_unreference(struct vc4_bo **bo)
>>>  {
>>> +struct vc4_screen *screen;
>>>  if (!*bo)
>>>  return;
>>>
>>> -if (pipe_reference(&(*bo)->reference, NULL))
>>> +screen = (*bo)->screen;
>>> +pipe_mutex_lock(screen->bo_handles_mutex);
>>> +
>>> +if (pipe_reference(&(*bo)->reference, NULL)) {
>>>  vc4_bo_last_unreference(*bo);
>>> +util_hash_table_remove(screen->bo_handles,
>>> +   (void *)(uintptr_t)(*bo)->handle);
>>
>> I think you're use-after-freeing bo here.  Just stick it before
>> last_unreference()?
>>
>>> +}
>>> +
>>> +pipe_mutex_unlock(screen->bo_handles_mutex);
>>>  *bo = NULL;
>>>  }
>>
>> Taking a mutex on every unref sucks -- it's a *really* hot path, and it
>> kind of defeats the point of doing these pipe_reference atomics.  We
>> should be able to skip the mutex in the bo->private case, since any
>> flinked/dmabuf BO will be !private.  Think you could give that a shot?
>
> or just move the mutex inside the if (pipe_reference(...)).. so you
> only take it on final unref?

Then you can have the refcount go to 0 and then back to 1 (when it gets
pulled out of the hash table before you grabbed the mutex), which I
assumed the API would prevent.


signature.asc
Description: PGP signature

Re: [Mesa-dev] [PATCH 2/2] gm107/ir: make use of IADD32I for all immediates

2016-06-26 Thread Samuel Pitoiset




On 06/26/2016 11:55 PM, Ilia Mirkin wrote:

On Sun, Jun 26, 2016 at 5:49 PM, Samuel Pitoiset
 wrote:



On 06/26/2016 11:46 PM, Ilia Mirkin wrote:


We don't appear to handle src(0) having a neg modifier in the
immediate case. Does the insnCanLoad logic account for that? (Perhaps
the bit is there and we just forgot about it?)



I don't see any neg modifier in envydis for IADD32I.


Well, isModSupported() will happily generate such instructions:

  case OP_ADD:
 if (mod.abs())
return false;
 if (insn->src(s ? 0 : 1).mod.neg())
return false;
 break;

and insnCanLoad:

 case TYPE_S32:
 case TYPE_U32:
// with u32, 0xf counts as 0x as well
if (reg.data.s32 > 0x7 || reg.data.s32 < -0x8)
   return false;
break;



Cool, but I will double check with nvidiasm.

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 2/2] gm107/ir: make use of IADD32I for all immediates

2016-06-26 Thread Ilia Mirkin

On Sun, Jun 26, 2016 at 5:49 PM, Samuel Pitoiset
 wrote:
>
>
> On 06/26/2016 11:46 PM, Ilia Mirkin wrote:
>>
>> We don't appear to handle src(0) having a neg modifier in the
>> immediate case. Does the insnCanLoad logic account for that? (Perhaps
>> the bit is there and we just forgot about it?)
>
>
> I don't see any neg modifier in envydis for IADD32I.

Well, isModSupported() will happily generate such instructions:

  case OP_ADD:
 if (mod.abs())
return false;
 if (insn->src(s ? 0 : 1).mod.neg())
return false;
 break;

and insnCanLoad:

 case TYPE_S32:
 case TYPE_U32:
// with u32, 0xf counts as 0x as well
if (reg.data.s32 > 0x7 || reg.data.s32 < -0x8)
   return false;
break;
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 2/2] gm107/ir: make use of IADD32I for all immediates

2016-06-26 Thread Samuel Pitoiset




On 06/26/2016 11:46 PM, Ilia Mirkin wrote:

We don't appear to handle src(0) having a neg modifier in the
immediate case. Does the insnCanLoad logic account for that? (Perhaps
the bit is there and we just forgot about it?)


I don't see any neg modifier in envydis for IADD32I.



On Sun, Jun 26, 2016 at 5:40 PM, Samuel Pitoiset
 wrote:

IADD only allows to emit 19-bits immediates. This is similar to the
previous fix I did for MOV.

Signed-off-by: Samuel Pitoiset 
Cc: 
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
index bf719a9..80761e2 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
@@ -1683,7 +1683,7 @@ CodeEmitterGM107::emitNOT()
 void
 CodeEmitterGM107::emitIADD()
 {
-   if (!longIMMD(insn->src(1))) {
+   if (insn->src(1).getFile() != FILE_IMMEDIATE) {
   switch (insn->src(1).getFile()) {
   case FILE_GPR:
  emitInsn(0x5c10);
--
2.8.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 2/2] gm107/ir: make use of IADD32I for all immediates

2016-06-26 Thread Ilia Mirkin

We don't appear to handle src(0) having a neg modifier in the
immediate case. Does the insnCanLoad logic account for that? (Perhaps
the bit is there and we just forgot about it?)

On Sun, Jun 26, 2016 at 5:40 PM, Samuel Pitoiset
 wrote:
> IADD only allows to emit 19-bits immediates. This is similar to the
> previous fix I did for MOV.
>
> Signed-off-by: Samuel Pitoiset 
> Cc: 
> ---
>  src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp 
> b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
> index bf719a9..80761e2 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
> @@ -1683,7 +1683,7 @@ CodeEmitterGM107::emitNOT()
>  void
>  CodeEmitterGM107::emitIADD()
>  {
> -   if (!longIMMD(insn->src(1))) {
> +   if (insn->src(1).getFile() != FILE_IMMEDIATE) {
>switch (insn->src(1).getFile()) {
>case FILE_GPR:
>   emitInsn(0x5c10);
> --
> 2.8.0
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 1/2] gm107/ir: make use of MOV32I for all immediates

2016-06-26 Thread Samuel Pitoiset

MOV only allows to emit 19-bits immediates. This is similar to the
previous fix I did for IMUL.

Signed-off-by: Samuel Pitoiset 
Cc: 
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
index 3a02fc7..bf719a9 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
@@ -684,8 +684,7 @@ CodeEmitterGM107::emitRAM()
 void
 CodeEmitterGM107::emitMOV()
 {
-   if ( insn->src(0).getFile() != FILE_IMMEDIATE ||
-   (insn->sType != TYPE_F32 && !longIMMD(insn->src(0 {
+   if (insn->src(0).getFile() != FILE_IMMEDIATE) {
   switch (insn->src(0).getFile()) {
   case FILE_GPR:
  if (insn->def(0).getFile() == FILE_PREDICATE) {
-- 
2.8.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 2/2] gm107/ir: make use of IADD32I for all immediates

2016-06-26 Thread Samuel Pitoiset

IADD only allows to emit 19-bits immediates. This is similar to the
previous fix I did for MOV.

Signed-off-by: Samuel Pitoiset 
Cc: 
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
index bf719a9..80761e2 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
@@ -1683,7 +1683,7 @@ CodeEmitterGM107::emitNOT()
 void
 CodeEmitterGM107::emitIADD()
 {
-   if (!longIMMD(insn->src(1))) {
+   if (insn->src(1).getFile() != FILE_IMMEDIATE) {
   switch (insn->src(1).getFile()) {
   case FILE_GPR:
  emitInsn(0x5c10);
-- 
2.8.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] Fwd: [PATCH] st/vdpau: use bicubic filter for scaling

2016-06-26 Thread Andy Furniss


Nayan Deshmukh wrote:

Hi Andy,

On Sun, Jun 26, 2016 at 12:25 AM, Andy Furniss  wrote:


Nayan Deshmukh wrote:


Hi Andy,

Thanks for testing the patches.

Please send me the videos and ratios with which there is corruption.




https://drive.google.com/file/d/0BxP5-S1t9VEEaHZEM203RFpyNEE/view?usp=sharing

This has no aspect encoded and displayed fullscreen on a 1920x1080
monitor shows vertical line artifacts over the first 2/3 of the image.

When I say lines they are not lines as such just that the distortion
on the pendulum shows as it passes over imaginary lines at fixed
points on the screen.

with mplayer -aspect 4/3 or 16/9 it doesn't.



I tested the videos and found out that the distortion is because of the
amount
of calculation done in the fragment shader. I tested the video with
vl_median_filter
and it showed no distortion however, with vl_matrix_filter( which requires
more
calculations than vl_median_filter) it showed the same distortion. I'll try
to make it
more efficient. But it still requires a lot of processing for a single
pixel as it uses
15 neighbouring pixel.


Seems a bit strange, does the processing needed vary greatly with
similar scale amounts? I have a powerful GPU and can force clocks
high, but it makes no difference.

Below is a png showing the artifacts I see on pendulum fullscreen
are these what you see?

If rather than full screen I stretch out the window to scale, there
will be many sizes that don't produce those.

https://drive.google.com/file/d/0BxP5-S1t9VEEd2hwNVp0ZXRSZTA/view?usp=sharing


Also I don't see any offsets with the videos, may be I am missing something.
If could tell me more about the offsets, I'll try to debug them.


https://drive.google.com/file/d/0BxP5-S1t9VEEUGZTbndOMzBNZnM/view?usp=sharing

Is a default scale, if you download both pngs and use something to
display them both at the same time and line up the windows one on
top of the other then flip between them you can see although the
windows are lined up the images contained are not.

You can make your own screen/window shots with xwd and display them
with xwud. For me using fluxbox as a desktop it's easy to line up
windows as they snap a bit towards the edge of the screen YMMV.

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] swrast: fix active attribs with atifragshader

2016-06-26 Thread Miklós Máté

Only include the ones that can be used by the shader.

This fixes texture coordinates, which were completely wrong,
because WPOS was included in the list of attribs. It also
increases performance noticeably.

Signed-off-by: Miklós Máté 
---
 src/mesa/swrast/s_context.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/mesa/swrast/s_context.c b/src/mesa/swrast/s_context.c
index 0a5fc7e..a63179c 100644
--- a/src/mesa/swrast/s_context.c
+++ b/src/mesa/swrast/s_context.c
@@ -504,7 +504,8 @@ _swrast_update_active_attribs(struct gl_context *ctx)
   attribsMask &= ~VARYING_BIT_POS; /* WPOS is always handled specially */
}
else if (ctx->ATIFragmentShader._Enabled) {
-  attribsMask = ~0;  /* XXX fix me */
+  attribsMask = VARYING_BIT_COL0 | VARYING_BIT_COL1 |
+VARYING_BIT_FOGC | VARYING_BITS_TEX_ANY;
}
else {
   /* fixed function */
-- 
2.8.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [Bug 96684] [swrast] piglit glsl-array-bounds-01 regression

2016-06-26 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=96684

Bug ID: 96684
   Summary: [swrast] piglit glsl-array-bounds-01 regression
   Product: Mesa
   Version: git
  Hardware: x86-64 (AMD64)
OS: Linux (All)
Status: NEW
  Keywords: bisected, regression
  Severity: normal
  Priority: medium
 Component: Mesa core
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: v...@freedesktop.org
QA Contact: mesa-dev@lists.freedesktop.org
CC: kenn...@whitecape.org, t_arc...@yahoo.com.au

mesa: 367cf3a2e3e51466429a6446ef14ed398a5fb948 (12.1.0-devel)

$ ./bin/shader_runner tests/shaders/glsl-array-bounds-01.shader_test -auto
Probe color at (15,15)
  Expected: 0.00 1.00 0.00
  Observed: 1.00 0.00 0.00
PIGLIT: {"result": "fail" }


c264fdbc073a0dfc393f53a8be880f535fd4b988 is the first bad commit
commit c264fdbc073a0dfc393f53a8be880f535fd4b988
Author: Kenneth Graunke 
Date:   Mon Jun 20 11:20:51 2016 -0700

glsl: Split arrays even in the presence of whole-array copies.

Previously, we failed to split constant arrays.  Code such as

   int[2] numbers = int[](1, 2);

would generates a whole-array assignment:

  (assign () (var_ref numbers)
 (constant (array int 4) (constant int 1) (constant int 2)))

opt_array_splitting generally tried to visit ir_dereference_array nodes,
and avoid recursing into the inner ir_dereference_variable.  So if it
ever saw a ir_dereference_variable, it assumed this was a whole-array
read and bailed.  However, in the above case, there's no array deref,
and we can totally handle it - we just have to "unroll" the assignment,
creating assignments for each element.

This was mitigated by the fact that we constant propagate whole arrays,
so a dereference of a single component would usually get the desired
single value anyway.  However, I plan to stop doing that shortly;
early experiments with disabling constant propagation of arrays
revealed this shortcoming.

This patch causes some arrays in Gl32GSCloth's geometry shaders to be
split, which allows other optimizations to eliminate unused GS inputs.
The VS then doesn't have to write them, which eliminates the entire VS
(5 -> 2 instructions).  It still renders correctly.

No other change in shader-db.

v2: Drop !AOA check and improve a comment (feedback from Tim Arceri).

Cc: mesa-sta...@lists.freedesktop.org
Signed-off-by: Kenneth Graunke 
Reviewed-by: Timothy Arceri 

:04 04 5d96cd8ba322e2ac1857baf9e401b5d494a85ab2
2b920fe841296c89251cca630f51060de413cbb3 M  src
bisect run success

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] vc4: add hash table look-up for exported dmabufs

2016-06-26 Thread Rob Clark

On Sat, Jun 25, 2016 at 11:33 PM, Eric Anholt  wrote:
> Rob Herring  writes:
>
>> It is necessary to reuse existing BOs when dmabufs are imported. There
>> are 2 cases that need to be handled. dmabufs can be created/exported and
>> imported by the same process and can be imported multiple times.
>> Copying other drivers, add a hash table to track exported BOs so the
>> BOs get reused.
>>
>> Cc: Eric Anholt 
>> Signed-off-by: Rob Herring 
>> ---
>> With this and the fd hashing to get a single screen, the flickery screen
>> is gone and Android is somewhat working. Several apps though hang, don't
>> render, and then exit. I also see CMA allocation errors, but not
>> correlating to the app problems.
>>
>> Also, flink names need similar hash table look-up as well. Maybe that's
>> a don't care for vc4? In any case, I don't have the setup to test that.
>>
>> Rob
>>
>>  src/gallium/drivers/vc4/vc4_bufmgr.c | 20 +++-
>>  src/gallium/drivers/vc4/vc4_bufmgr.h | 12 +++-
>>  src/gallium/drivers/vc4/vc4_screen.c | 15 +++
>>  src/gallium/drivers/vc4/vc4_screen.h |  3 +++
>>  4 files changed, 48 insertions(+), 2 deletions(-)
>>
>> diff --git a/src/gallium/drivers/vc4/vc4_bufmgr.c 
>> b/src/gallium/drivers/vc4/vc4_bufmgr.c
>> index 21e3bde..d91157b 100644
>> --- a/src/gallium/drivers/vc4/vc4_bufmgr.c
>> +++ b/src/gallium/drivers/vc4/vc4_bufmgr.c
>> @@ -28,6 +28,7 @@
>>  #include 
>>  #include 
>>
>> +#include "util/u_hash_table.h"
>>  #include "util/u_memory.h"
>>  #include "util/ralloc.h"
>>
>> @@ -329,10 +330,19 @@ vc4_bo_open_handle(struct vc4_screen *screen,
>> uint32_t winsys_stride,
>> uint32_t handle, uint32_t size)
>>  {
>> -struct vc4_bo *bo = CALLOC_STRUCT(vc4_bo);
>> +struct vc4_bo *bo;
>>
>>  assert(size);
>>
>> +pipe_mutex_lock(screen->bo_handles_mutex);
>> +
>> +bo = util_hash_table_get(screen->bo_handles, 
>> (void*)(uintptr_t)handle);
>> +if (bo) {
>> +pipe_reference(NULL, >reference);
>> +goto done;
>> +}
>> +
>> +bo = CALLOC_STRUCT(vc4_bo);
>>  pipe_reference_init(>reference, 1);
>>  bo->screen = screen;
>>  bo->handle = handle;
>> @@ -347,6 +357,10 @@ vc4_bo_open_handle(struct vc4_screen *screen,
>>  bo->map = malloc(bo->size);
>>  #endif
>>
>> +util_hash_table_set(screen->bo_handles, (void *)(uintptr_t)handle, 
>> bo);
>> +
>> +done:
>> +pipe_mutex_unlock(screen->bo_handles_mutex);
>>  return bo;
>>  }
>>
>> @@ -401,6 +415,10 @@ vc4_bo_get_dmabuf(struct vc4_bo *bo)
>>  }
>>  bo->private = false;
>>
>> +pipe_mutex_lock(bo->screen->bo_handles_mutex);
>> +util_hash_table_set(bo->screen->bo_handles, (void 
>> *)(uintptr_t)bo->handle, bo);
>> +pipe_mutex_unlock(bo->screen->bo_handles_mutex);
>> +
>>  return fd;
>>  }
>>
>> diff --git a/src/gallium/drivers/vc4/vc4_bufmgr.h 
>> b/src/gallium/drivers/vc4/vc4_bufmgr.h
>> index b77506e..0896b30 100644
>> --- a/src/gallium/drivers/vc4/vc4_bufmgr.h
>> +++ b/src/gallium/drivers/vc4/vc4_bufmgr.h
>> @@ -25,6 +25,7 @@
>>  #define VC4_BUFMGR_H
>>
>>  #include 
>> +#include "util/u_hash_table.h"
>>  #include "util/u_inlines.h"
>>  #include "vc4_qir.h"
>>
>> @@ -87,11 +88,20 @@ vc4_bo_reference(struct vc4_bo *bo)
>>  static inline void
>>  vc4_bo_unreference(struct vc4_bo **bo)
>>  {
>> +struct vc4_screen *screen;
>>  if (!*bo)
>>  return;
>>
>> -if (pipe_reference(&(*bo)->reference, NULL))
>> +screen = (*bo)->screen;
>> +pipe_mutex_lock(screen->bo_handles_mutex);
>> +
>> +if (pipe_reference(&(*bo)->reference, NULL)) {
>>  vc4_bo_last_unreference(*bo);
>> +util_hash_table_remove(screen->bo_handles,
>> +   (void *)(uintptr_t)(*bo)->handle);
>
> I think you're use-after-freeing bo here.  Just stick it before
> last_unreference()?
>
>> +}
>> +
>> +pipe_mutex_unlock(screen->bo_handles_mutex);
>>  *bo = NULL;
>>  }
>
> Taking a mutex on every unref sucks -- it's a *really* hot path, and it
> kind of defeats the point of doing these pipe_reference atomics.  We
> should be able to skip the mutex in the bo->private case, since any
> flinked/dmabuf BO will be !private.  Think you could give that a shot?

or just move the mutex inside the if (pipe_reference(...)).. so you
only take it on final unref?

BR,
-R


> Thanks for debugging!  Note, I'm still on vacation, so I'll be slow at
> replying.
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org

Re: [Mesa-dev] Fwd: [PATCH] st/vdpau: use bicubic filter for scaling

2016-06-26 Thread Nayan Deshmukh

Hi Andy,

On Sun, Jun 26, 2016 at 12:25 AM, Andy Furniss  wrote:

> Nayan Deshmukh wrote:
>
>> Hi Andy,
>>
>> Thanks for testing the patches.
>>
>> Please send me the videos and ratios with which there is corruption.
>>
>
>
> https://drive.google.com/file/d/0BxP5-S1t9VEEaHZEM203RFpyNEE/view?usp=sharing
>
> This has no aspect encoded and displayed fullscreen on a 1920x1080
> monitor shows vertical line artifacts over the first 2/3 of the image.
>
> When I say lines they are not lines as such just that the distortion
> on the pendulum shows as it passes over imaginary lines at fixed
> points on the screen.
>
> with mplayer -aspect 4/3 or 16/9 it doesn't.
>

I tested the videos and found out that the distortion is because of the
amount
of calculation done in the fragment shader. I tested the video with
vl_median_filter
and it showed no distortion however, with vl_matrix_filter( which requires
more
calculations than vl_median_filter) it showed the same distortion. I'll try
to make it
more efficient. But it still requires a lot of processing for a single
pixel as it uses
15 neighbouring pixel.

Also I don't see any offsets with the videos, may be I am missing something.
If could tell me more about the offsets, I'll try to debug them.

Thanks,
Nayan.

>
> That one is 720x576, with a 704x576 like -
>
> ftp://ftp.tek.com/tv/test/streams/Element/MPEG-Video/625/mobl_080.m2v
>
> 4/3 still has the vertical "line" artifacts  - 16/9 doesn't.
>
> On other vids there is a hint of horizontal distortion, I'd need time
> to cut the one I saw that on - it was subtle on a 1440x1080 16/9 long vid.
>
>
>
>
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] vc4: add hash table look-up for exported dmabufs

2016-06-26 Thread Eric Anholt

Rob Herring  writes:

> It is necessary to reuse existing BOs when dmabufs are imported. There
> are 2 cases that need to be handled. dmabufs can be created/exported and
> imported by the same process and can be imported multiple times.
> Copying other drivers, add a hash table to track exported BOs so the
> BOs get reused.
>
> Cc: Eric Anholt 
> Signed-off-by: Rob Herring 
> ---
> With this and the fd hashing to get a single screen, the flickery screen 
> is gone and Android is somewhat working. Several apps though hang, don't 
> render, and then exit. I also see CMA allocation errors, but not 
> correlating to the app problems.
>
> Also, flink names need similar hash table look-up as well. Maybe that's 
> a don't care for vc4? In any case, I don't have the setup to test that.
>
> Rob
>
>  src/gallium/drivers/vc4/vc4_bufmgr.c | 20 +++-
>  src/gallium/drivers/vc4/vc4_bufmgr.h | 12 +++-
>  src/gallium/drivers/vc4/vc4_screen.c | 15 +++
>  src/gallium/drivers/vc4/vc4_screen.h |  3 +++
>  4 files changed, 48 insertions(+), 2 deletions(-)
>
> diff --git a/src/gallium/drivers/vc4/vc4_bufmgr.c 
> b/src/gallium/drivers/vc4/vc4_bufmgr.c
> index 21e3bde..d91157b 100644
> --- a/src/gallium/drivers/vc4/vc4_bufmgr.c
> +++ b/src/gallium/drivers/vc4/vc4_bufmgr.c
> @@ -28,6 +28,7 @@
>  #include 
>  #include 
>  
> +#include "util/u_hash_table.h"
>  #include "util/u_memory.h"
>  #include "util/ralloc.h"
>  
> @@ -329,10 +330,19 @@ vc4_bo_open_handle(struct vc4_screen *screen,
> uint32_t winsys_stride,
> uint32_t handle, uint32_t size)
>  {
> -struct vc4_bo *bo = CALLOC_STRUCT(vc4_bo);
> +struct vc4_bo *bo;
>  
>  assert(size);
>  
> +pipe_mutex_lock(screen->bo_handles_mutex);
> +
> +bo = util_hash_table_get(screen->bo_handles, 
> (void*)(uintptr_t)handle);
> +if (bo) {
> +pipe_reference(NULL, >reference);
> +goto done;
> +}
> +
> +bo = CALLOC_STRUCT(vc4_bo);
>  pipe_reference_init(>reference, 1);
>  bo->screen = screen;
>  bo->handle = handle;
> @@ -347,6 +357,10 @@ vc4_bo_open_handle(struct vc4_screen *screen,
>  bo->map = malloc(bo->size);
>  #endif
>  
> +util_hash_table_set(screen->bo_handles, (void *)(uintptr_t)handle, 
> bo);
> +
> +done:
> +pipe_mutex_unlock(screen->bo_handles_mutex);
>  return bo;
>  }
>  
> @@ -401,6 +415,10 @@ vc4_bo_get_dmabuf(struct vc4_bo *bo)
>  }
>  bo->private = false;
>  
> +pipe_mutex_lock(bo->screen->bo_handles_mutex);
> +util_hash_table_set(bo->screen->bo_handles, (void 
> *)(uintptr_t)bo->handle, bo);
> +pipe_mutex_unlock(bo->screen->bo_handles_mutex);
> +
>  return fd;
>  }
>  
> diff --git a/src/gallium/drivers/vc4/vc4_bufmgr.h 
> b/src/gallium/drivers/vc4/vc4_bufmgr.h
> index b77506e..0896b30 100644
> --- a/src/gallium/drivers/vc4/vc4_bufmgr.h
> +++ b/src/gallium/drivers/vc4/vc4_bufmgr.h
> @@ -25,6 +25,7 @@
>  #define VC4_BUFMGR_H
>  
>  #include 
> +#include "util/u_hash_table.h"
>  #include "util/u_inlines.h"
>  #include "vc4_qir.h"
>  
> @@ -87,11 +88,20 @@ vc4_bo_reference(struct vc4_bo *bo)
>  static inline void
>  vc4_bo_unreference(struct vc4_bo **bo)
>  {
> +struct vc4_screen *screen;
>  if (!*bo)
>  return;
>  
> -if (pipe_reference(&(*bo)->reference, NULL))
> +screen = (*bo)->screen;
> +pipe_mutex_lock(screen->bo_handles_mutex);
> +
> +if (pipe_reference(&(*bo)->reference, NULL)) {
>  vc4_bo_last_unreference(*bo);
> +util_hash_table_remove(screen->bo_handles,
> +   (void *)(uintptr_t)(*bo)->handle);

I think you're use-after-freeing bo here.  Just stick it before
last_unreference()?

> +}
> +
> +pipe_mutex_unlock(screen->bo_handles_mutex);
>  *bo = NULL;
>  }

Taking a mutex on every unref sucks -- it's a *really* hot path, and it
kind of defeats the point of doing these pipe_reference atomics.  We
should be able to skip the mutex in the bo->private case, since any
flinked/dmabuf BO will be !private.  Think you could give that a shot?

Thanks for debugging!  Note, I'm still on vacation, so I'll be slow at
replying.


signature.asc
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] nir: Fix copy_prop_src when src is an indirect access on a reg.

2016-06-26 Thread Jason Ekstrand

Rb
On Jun 25, 2016 9:30 PM, "Eric Anholt"  wrote:

> The intent was to continue down the indirect chain, not to call ourselves
> with unchanged input arguments.  Found by code inspection, and comparison
> to copy_prop_alu_src().
>
> We haven't hit this because callers of NIR's copy prop are doing so in
> SSA, before indirect variable dereferences have been lowered to registers.
> ---
>  src/compiler/nir/nir_opt_copy_propagate.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/compiler/nir/nir_opt_copy_propagate.c
> b/src/compiler/nir/nir_opt_copy_propagate.c
> index adca7fa6eff2..c26e07fda712 100644
> --- a/src/compiler/nir/nir_opt_copy_propagate.c
> +++ b/src/compiler/nir/nir_opt_copy_propagate.c
> @@ -103,7 +103,7 @@ copy_prop_src(nir_src *src, nir_instr *parent_instr,
> nir_if *parent_if)
>  {
> if (!src->is_ssa) {
>if (src->reg.indirect)
> - return copy_prop_src(src, parent_instr, parent_if);
> + return copy_prop_src(src->reg.indirect, parent_instr, parent_if);
>return false;
> }
>
> --
> 2.8.1
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] gm107/ir: make use of IMUL32I for all immediates

2016-06-26 Thread Ilia Mirkin

Reviewed-by: Ilia Mirkin 

Might be good to make a note of why, and/or reference the GK110 commit
that did the same thing.

On Sun, Jun 26, 2016 at 11:23 AM, Samuel Pitoiset
 wrote:
> Signed-off-by: Samuel Pitoiset 
> Cc: 
> ---
>  src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp 
> b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
> index 5a7f391..3a02fc7 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
> @@ -1725,7 +1725,7 @@ CodeEmitterGM107::emitIADD()
>  void
>  CodeEmitterGM107::emitIMUL()
>  {
> -   if (!longIMMD(insn->src(1))) {
> +   if (insn->src(1).getFile() != FILE_IMMEDIATE) {
>switch (insn->src(1).getFile()) {
>case FILE_GPR:
>   emitInsn(0x5c38);
> --
> 2.8.0
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] gm107/ir: make use of IMUL32I for all immediates

2016-06-26 Thread Samuel Pitoiset

Signed-off-by: Samuel Pitoiset 
Cc: 
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
index 5a7f391..3a02fc7 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
@@ -1725,7 +1725,7 @@ CodeEmitterGM107::emitIADD()
 void
 CodeEmitterGM107::emitIMUL()
 {
-   if (!longIMMD(insn->src(1))) {
+   if (insn->src(1).getFile() != FILE_IMMEDIATE) {
   switch (insn->src(1).getFile()) {
   case FILE_GPR:
  emitInsn(0x5c38);
-- 
2.8.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] V3 On disk shader cache for i965 (Now with real world results!)

2016-06-26 Thread Timothy Arceri

On Sun, 2016-06-26 at 16:15 +0300, Grazvydas Ignotas wrote:
> Tried this while playing with apitrace and am getting segfaults when
> running any trace with a cached (second) run. Not sure if it's
> "wrong"
> traces I've chosen or what, you can take one example from this bug:
> https://bugs.freedesktop.org/show_bug.cgi?id=96425

Thanks for testing I'll take a look tomorrow.

> 
> It would also be good idea to hide the cache debug messages behind
> some env var, or at least send them to stderr and not stdout, as
> stdout breaks programs that pipe data through stdout like qapitrace.

Right thats my next task, I should get this done tomorrow also. As
stated below :) "For now I have left in some printf's as the feature is
still disabled by default and they are useful for debugging. I intend
to fix this soon to hide them behind an environment var."

Thanks again.

> 
> Gražvydas
> 
> On Sun, Jun 26, 2016 at 7:16 AM, Timothy Arceri
>  wrote:
> > I've spent a bunch of time rebasing this series to remove the
> > excess
> > code churn and I've just pushed the results to the shader-cache
> > branch
> > mentioned below. There are no code changes to the end result but
> > I've
> > managed to get the patch count down to 80 (was 96 i think) and
> > things
> > should be much easier to review now.
> > 
> > I've also had reports of people testing with additional games such
> > as
> > Dota 2 and seeing good results.
> > 
> > 
> > On Tue, 2016-06-21 at 16:08 +1000, Timothy Arceri wrote:
> > > Rather than send 90+ patches to the list. Please see the repo at
> > > the
> > > bottom of this email.
> > > 
> > > The big update is I've added all stages but compute and tested
> > > with a
> > > few games and everything seems to be working well so far.
> > > Enabling
> > > shader cache with the Shadow of Mordor benchmark make things
> > > noticeably
> > > smoother and helps consitently keep the min FPS at 15 on my
> > > Skylake,
> > > were as without it can be anywhere between 4-15.
> > > 
> > > The elemental demo which Dave pointed out as also doing a bunch
> > > of
> > > compiles during the demo is also smoother especially on the
> > > second
> > > run
> > > but its really slow on my Skylake regardless. Maybe someone with
> > > a
> > > highend Skylake would like to give it a try.
> > > 
> > > 
> > > V3:
> > > - add support for geometry and tessellation stages
> > > - cache clip planes
> > > - reserve parameter storage before restoring list
> > > - stop losing  buffer blocks on cache fallback
> > > - lots of little fixes I cant remember
> > > 
> > > V2:
> > > - rebased on master
> > > - add support for encoding doubles
> > > - renamed skip_cache params to is_cache_fallback, and fix related
> > > bug
> > > when
> > >  disabling shader cache for xfb.
> > > 
> > > This series is based on the great work done by Carl, Kristian and
> > > others.
> > > 
> > > I've split up Carls original patches for easier review, and also
> > > merged
> > > a number of fixes and clean-ups into his patches. However there
> > > is a
> > > little more code churn than is ideal as the appoach taken by the
> > > original patches needed to be modified quite a lot, I'm hoping
> > > its
> > > not
> > > more than people can live with as I'd like to keep some of the
> > > history
> > > rather than just squashing everything.
> > > 
> > > For now I have left in some printf's as the feature is still
> > > disabled
> > > by default and they are useful for debugging. I intend to fix
> > > this
> > > soon
> > > to hide them behind an environment var.
> > > 
> > > There are no regressions after two runs of piglit with shader
> > > cache
> > > enabled on my Broadwell machine.
> > > 
> > > This series enables on disk shader cache for all stage except
> > > compute
> > > programs. For now transform feedback, and SSO programs skip using
> > > the
> > > cache, these will be added as follow ups.
> > > 
> > > My main goal with this series is to land something that
> > > passes piglit there is a number of optimisations that can still
> > > be
> > > done
> > > such as skipping more validation and state recreation when
> > > falling
> > > back
> > > to a full recompile but I would rather leave this until we have
> > > something fully working.
> > > 
> > > Here are the shader-db times (from V2):
> > > 
> > > Cache disabled:
> > > 
> > > Thread 1 took 1360.47 seconds and compiled 13015 shaders (not
> > > including
> > > SIMD16) with 50 GL context switches
> > > Thread 3 took 1349.85 seconds and compiled 12848 shaders (not
> > > including
> > > SIMD16) with 40 GL context switches
> > > Thread 2 took 1362.94 seconds and compiled 12637 shaders (not
> > > including
> > > SIMD16) with 36 GL context switches
> > > Thread 0 took 1352.41 seconds and compiled 12593 shaders (not
> > > including
> > > SIMD16) with 46 GL context switches
> > > 
> > > Cache enabled first run:
> > > 
> > > Thread 1 took 1410.30 seconds and compiled 12678 shaders (not
> > > including
>

Re: [Mesa-dev] [PATCH] nir: Fix copy_prop_src when src is an indirect access on a reg.

2016-06-26 Thread Rob Clark

On Sat, Jun 25, 2016 at 8:54 PM, Eric Anholt  wrote:
> The intent was to continue down the indirect chain, not to call ourselves
> with unchanged input arguments.  Found by code inspection, and comparison
> to copy_prop_alu_src().
>
> We haven't hit this because callers of NIR's copy prop are doing so in
> SSA, before indirect variable dereferences have been lowered to registers.

Reviewed-by: Rob Clark 

> ---
>  src/compiler/nir/nir_opt_copy_propagate.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/compiler/nir/nir_opt_copy_propagate.c 
> b/src/compiler/nir/nir_opt_copy_propagate.c
> index adca7fa6eff2..c26e07fda712 100644
> --- a/src/compiler/nir/nir_opt_copy_propagate.c
> +++ b/src/compiler/nir/nir_opt_copy_propagate.c
> @@ -103,7 +103,7 @@ copy_prop_src(nir_src *src, nir_instr *parent_instr, 
> nir_if *parent_if)
>  {
> if (!src->is_ssa) {
>if (src->reg.indirect)
> - return copy_prop_src(src, parent_instr, parent_if);
> + return copy_prop_src(src->reg.indirect, parent_instr, parent_if);
>return false;
> }
>
> --
> 2.8.1
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] V3 On disk shader cache for i965 (Now with real world results!)

2016-06-26 Thread Grazvydas Ignotas

Tried this while playing with apitrace and am getting segfaults when
running any trace with a cached (second) run. Not sure if it's "wrong"
traces I've chosen or what, you can take one example from this bug:
https://bugs.freedesktop.org/show_bug.cgi?id=96425

It would also be good idea to hide the cache debug messages behind
some env var, or at least send them to stderr and not stdout, as
stdout breaks programs that pipe data through stdout like qapitrace.

Gražvydas

On Sun, Jun 26, 2016 at 7:16 AM, Timothy Arceri
 wrote:
> I've spent a bunch of time rebasing this series to remove the excess
> code churn and I've just pushed the results to the shader-cache branch
> mentioned below. There are no code changes to the end result but I've
> managed to get the patch count down to 80 (was 96 i think) and things
> should be much easier to review now.
>
> I've also had reports of people testing with additional games such as
> Dota 2 and seeing good results.
>
>
> On Tue, 2016-06-21 at 16:08 +1000, Timothy Arceri wrote:
>> Rather than send 90+ patches to the list. Please see the repo at the
>> bottom of this email.
>>
>> The big update is I've added all stages but compute and tested with a
>> few games and everything seems to be working well so far. Enabling
>> shader cache with the Shadow of Mordor benchmark make things
>> noticeably
>> smoother and helps consitently keep the min FPS at 15 on my Skylake,
>> were as without it can be anywhere between 4-15.
>>
>> The elemental demo which Dave pointed out as also doing a bunch of
>> compiles during the demo is also smoother especially on the second
>> run
>> but its really slow on my Skylake regardless. Maybe someone with a
>> highend Skylake would like to give it a try.
>>
>>
>> V3:
>> - add support for geometry and tessellation stages
>> - cache clip planes
>> - reserve parameter storage before restoring list
>> - stop losing  buffer blocks on cache fallback
>> - lots of little fixes I cant remember
>>
>> V2:
>> - rebased on master
>> - add support for encoding doubles
>> - renamed skip_cache params to is_cache_fallback, and fix related bug
>> when
>>  disabling shader cache for xfb.
>>
>> This series is based on the great work done by Carl, Kristian and
>> others.
>>
>> I've split up Carls original patches for easier review, and also
>> merged
>> a number of fixes and clean-ups into his patches. However there is a
>> little more code churn than is ideal as the appoach taken by the
>> original patches needed to be modified quite a lot, I'm hoping its
>> not
>> more than people can live with as I'd like to keep some of the
>> history
>> rather than just squashing everything.
>>
>> For now I have left in some printf's as the feature is still disabled
>> by default and they are useful for debugging. I intend to fix this
>> soon
>> to hide them behind an environment var.
>>
>> There are no regressions after two runs of piglit with shader cache
>> enabled on my Broadwell machine.
>>
>> This series enables on disk shader cache for all stage except compute
>> programs. For now transform feedback, and SSO programs skip using the
>> cache, these will be added as follow ups.
>>
>> My main goal with this series is to land something that
>> passes piglit there is a number of optimisations that can still be
>> done
>> such as skipping more validation and state recreation when falling
>> back
>> to a full recompile but I would rather leave this until we have
>> something fully working.
>>
>> Here are the shader-db times (from V2):
>>
>> Cache disabled:
>>
>> Thread 1 took 1360.47 seconds and compiled 13015 shaders (not
>> including
>> SIMD16) with 50 GL context switches
>> Thread 3 took 1349.85 seconds and compiled 12848 shaders (not
>> including
>> SIMD16) with 40 GL context switches
>> Thread 2 took 1362.94 seconds and compiled 12637 shaders (not
>> including
>> SIMD16) with 36 GL context switches
>> Thread 0 took 1352.41 seconds and compiled 12593 shaders (not
>> including
>> SIMD16) with 46 GL context switches
>>
>> Cache enabled first run:
>>
>> Thread 1 took 1410.30 seconds and compiled 12678 shaders (not
>> including
>> SIMD16) with 34 GL context switches
>> Thread 2 took 1421.35 seconds and compiled 12822 shaders (not
>> including
>> SIMD16) with 50 GL context switches
>> Thread 0 took 1410.49 seconds and compiled 12999 shaders (not
>> including
>> SIMD16) with 40 GL context switches
>> Thread 3 took 1426.67 seconds and compiled 12594 shaders (not
>> including
>> SIMD16) with 48 GL context switches
>>
>> Cache enabled second run:
>>
>> Thread 0 took 259.84 seconds and compiled 12817 shaders (not
>> including
>> SIMD16) with 40 GL context switches
>> Thread 3 took 257.03 seconds and compiled 12533 shaders (not
>> including
>> SIMD16) with 50 GL context switches
>> Thread 1 took 256.18 seconds and compiled 12828 shaders (not
>> including
>> SIMD16) with 40 GL context switches
>> Thread 2 took 261.31 seconds and compiled 12915 shaders (not
>>

[Mesa-dev] [PATCH 2/2] i965: Print EOT in fs_visitor::dump_instruction().

2016-06-26 Thread Kenneth Graunke

This was useful when debugging the previous commit's issue.

Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 4 
 1 file changed, 4 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index a929a20..2f473cc 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -5324,6 +5324,10 @@ fs_visitor::dump_instruction(backend_instruction 
*be_inst, FILE *file)
   fprintf(file, "(mlen: %d) ", inst->mlen);
}
 
+   if (inst->eot) {
+  fprintf(file, "(EOT) ");
+   }
+
switch (inst->dst.file) {
case VGRF:
   fprintf(file, "vgrf%d", inst->dst.nr);
-- 
2.9.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 1/2] i965: Make emit_urb_writes() not produce an EOT message for GS.

2016-06-26 Thread Kenneth Graunke

emit_urb_writes() contains code to emit an EOT write with no actual
data when there are no output varyings.  This makes sense for the VS
and TES stages, where it's called once at the end of the program.

However, in the geometry shader stage, emit_urb_writes() is called once
for every EmitVertex().  We explicitly emit a URB write with EOT set at
the end of the shader, separately from this path.  So we'd better not
terminate the thread.  This could get us into trouble for shaders which
do EmitVertex() with no varyings followed by SSBO/image/atomic writes.

It also caused us to emit multiple sends with EOT set, which apparently
confuses the register allocator into not using g112-g127 for all but
the first one.  This caused EU validation failures in OglGSCloth
shaders in shader-db.  (The actual application was fine, but shader-db
thinks there are no outputs because it doesn't understand transform
feedback.)

Cc: mesa-sta...@lists.freedesktop.org
Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 4 
 1 file changed, 4 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
index 3a49794..4a1ff30 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
@@ -594,6 +594,10 @@ fs_visitor::emit_urb_writes(const fs_reg _vertex_count)
 *"The write data payload can be between 1 and 8 message phases long."
 */
if (vue_map->slots_valid == 0) {
+  /* For GS, just turn EmitVertex() into a no-op. */
+  if (stage == MESA_SHADER_GEOMETRY)
+ return;
+
   fs_reg payload = fs_reg(VGRF, alloc.allocate(2), BRW_REGISTER_TYPE_UD);
   bld.exec_all().MOV(payload, urb_handle);
 
-- 
2.9.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] st/va: Check NULL pointer

2016-06-26 Thread Julien Isorce

Thx for the patch:
Reviewed-by: Julien Isorce 

I'll push it today.

On 26 June 2016 at 08:02, Gurkirpal Singh  wrote:

> Call to handle_table_get in vlVaDestroySurfaces can
> return NULL on failure.
>
> CID: 1243522
>
> Signed-off-by: Gurkirpal Singh 
> ---
>  src/gallium/state_trackers/va/surface.c | 4 
>  1 file changed, 4 insertions(+)
>
> diff --git a/src/gallium/state_trackers/va/surface.c
> b/src/gallium/state_trackers/va/surface.c
> index 5efb893..3e74353 100644
> --- a/src/gallium/state_trackers/va/surface.c
> +++ b/src/gallium/state_trackers/va/surface.c
> @@ -71,6 +71,10 @@ vlVaDestroySurfaces(VADriverContextP ctx, VASurfaceID
> *surface_list, int num_sur
> pipe_mutex_lock(drv->mutex);
> for (i = 0; i < num_surfaces; ++i) {
>vlVaSurface *surf = handle_table_get(drv->htab, surface_list[i]);
> +  if (!surf) {
> + pipe_mutex_unlock(drv->mutex);
> + return VA_STATUS_ERROR_INVALID_SURFACE;
> +  }
>if (surf->buffer)
>   surf->buffer->destroy(surf->buffer);
>util_dynarray_fini(>subpics);
> --
> 2.7.4
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] st/va: Check NULL pointer

2016-06-26 Thread Gurkirpal Singh

Call to handle_table_get in vlVaDestroySurfaces can
return NULL on failure.

CID: 1243522

Signed-off-by: Gurkirpal Singh 
---
 src/gallium/state_trackers/va/surface.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/src/gallium/state_trackers/va/surface.c 
b/src/gallium/state_trackers/va/surface.c
index 5efb893..3e74353 100644
--- a/src/gallium/state_trackers/va/surface.c
+++ b/src/gallium/state_trackers/va/surface.c
@@ -71,6 +71,10 @@ vlVaDestroySurfaces(VADriverContextP ctx, VASurfaceID 
*surface_list, int num_sur
pipe_mutex_lock(drv->mutex);
for (i = 0; i < num_surfaces; ++i) {
   vlVaSurface *surf = handle_table_get(drv->htab, surface_list[i]);
+  if (!surf) {
+ pipe_mutex_unlock(drv->mutex);
+ return VA_STATUS_ERROR_INVALID_SURFACE;
+  }
   if (surf->buffer)
  surf->buffer->destroy(surf->buffer);
   util_dynarray_fini(>subpics);
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [Bug 96684] [swrast] piglit glsl-array-bounds-01 regression

[Mesa-dev] [PATCH RFC 1/1] r600, compute: Use vtx #3 for kernel arguments

Re: [Mesa-dev] [PATCH] vc4: add hash table look-up for exported dmabufs

Re: [Mesa-dev] [PATCH 2/2] gm107/ir: make use of IADD32I for all immediates

Re: [Mesa-dev] [PATCH 2/2] gm107/ir: make use of IADD32I for all immediates

Re: [Mesa-dev] [PATCH 2/2] gm107/ir: make use of IADD32I for all immediates

Re: [Mesa-dev] [PATCH 2/2] gm107/ir: make use of IADD32I for all immediates

[Mesa-dev] [PATCH 1/2] gm107/ir: make use of MOV32I for all immediates

[Mesa-dev] [PATCH 2/2] gm107/ir: make use of IADD32I for all immediates

Re: [Mesa-dev] Fwd: [PATCH] st/vdpau: use bicubic filter for scaling

[Mesa-dev] [PATCH] swrast: fix active attribs with atifragshader

[Mesa-dev] [Bug 96684] [swrast] piglit glsl-array-bounds-01 regression

Re: [Mesa-dev] [PATCH] vc4: add hash table look-up for exported dmabufs

Re: [Mesa-dev] Fwd: [PATCH] st/vdpau: use bicubic filter for scaling

Re: [Mesa-dev] [PATCH] vc4: add hash table look-up for exported dmabufs

Re: [Mesa-dev] [PATCH] nir: Fix copy_prop_src when src is an indirect access on a reg.

Re: [Mesa-dev] [PATCH] gm107/ir: make use of IMUL32I for all immediates

[Mesa-dev] [PATCH] gm107/ir: make use of IMUL32I for all immediates

Re: [Mesa-dev] V3 On disk shader cache for i965 (Now with real world results!)

Re: [Mesa-dev] [PATCH] nir: Fix copy_prop_src when src is an indirect access on a reg.

Re: [Mesa-dev] V3 On disk shader cache for i965 (Now with real world results!)

[Mesa-dev] [PATCH 2/2] i965: Print EOT in fs_visitor::dump_instruction().

[Mesa-dev] [PATCH 1/2] i965: Make emit_urb_writes() not produce an EOT message for GS.

Re: [Mesa-dev] [PATCH] st/va: Check NULL pointer

[Mesa-dev] [PATCH] st/va: Check NULL pointer

25 matches

Site Navigation

Mail list logo

Footer information