[Mesa-dev] [PATCH] r600/sb: fix rotated register in while loop

2018-05-31 Thread Gert Wollny
This patch is based on
https://lists.freedesktop.org/archives/mesa-dev/2018-February/185805.html

Dave Airlie:

 "A bunch of CTS tests led me to write
  tests/shaders/ssa/fs-while-loop-rotate-value.shader_test
  which r600/sb always fell over on.

  GCM seems to move some of the copies into other basic blocks,
  if we don't allow this to happen then it doesn't seem to schedule
  them badly.

  Everything I've read on SSA/phi copies say they have to happen
  in parallel, so keeping them in the same basic block seems like
  a good way to keep some of that property."

This patch differs from the one proposed by Dave in that it only adds 
the NF_DONT_MOVE flag to copy_move instructions that are created by split_phi* 
and that are located in loops.

Fixes piglit: tests/shaders/ssa/fs-while-loop-rotate-value.shader_test
(no regressions in the shader set). It also fixes all failing tests from 

  dEQP-GLES3.functional.shaders.loops.*

Signed-off-by: Gert Wollny 
---
 src/gallium/drivers/r600/sb/sb_ra_init.cpp | 12 
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/src/gallium/drivers/r600/sb/sb_ra_init.cpp 
b/src/gallium/drivers/r600/sb/sb_ra_init.cpp
index 985e179452..c557b86871 100644
--- a/src/gallium/drivers/r600/sb/sb_ra_init.cpp
+++ b/src/gallium/drivers/r600/sb/sb_ra_init.cpp
@@ -545,10 +545,13 @@ void ra_split::split_phi_src(container_node *loc, 
container_node *c,
continue;
 
value *t = sh.create_temp_value();
+   alu_node* n = sh.create_copy_mov(t, v);
+   if (loop)
+   n->flags |= NF_DONT_MOVE;
if (loop && id == 0)
-   loc->insert_before(sh.create_copy_mov(t, v));
+   loc->insert_before(n);
else
-   loc->push_back(sh.create_copy_mov(t, v));
+   loc->push_back(n);
v = t;
 
sh.coal.add_edge(v, d, coalescer::phi_cost);
@@ -566,9 +569,10 @@ void ra_split::split_phi_dst(node* loc, container_node *c, 
bool loop) {
 
value *t = sh.create_temp_value();
node *cp = sh.create_copy_mov(v, t);
-   if (loop)
+   if (loop) {
+   cp->flags |= NF_DONT_MOVE;
static_cast(loc)->push_front(cp);
-   else
+   } else
loc->insert_after(cp);
v = t;
}
-- 
2.16.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] r600/sb: fix rotated register in while loop

2018-02-19 Thread Gert Wollny
Am Montag, den 19.02.2018, 14:06 +1000 schrieb Dave Airlie:
> On 15 February 2018 at 01:26, Gert Wollny 
> wrote:
> > Am Mittwoch, den 14.02.2018, 17:18 +1000 schrieb Dave Airlie:
> > > From: Dave Airlie 
> > > 
> > > A bunch of CTS tests led me to write
> > > tests/shaders/ssa/fs-while-loop-rotate-value.shader_test
> > > which r600/sb always fell over on.
> > > 
> > > This patch fixes it, but I'll probably never be 100% sure why.
> 
> I've gone back to this approach, I've spot tested and I don't see
> as many regressions.

For me most of the assertion failures don't come up anymore, but in
total the numer of regressions is still quite high, i.e. 1 assertion
failure and 13 normal failures. 

  vs-struct-nonconst-non-opaque-members fails with   

  b/sb_sched.cpp:931:process_fetch: 
  Assertion `f->parent->count() ==  1' failed.

and the list of normal failures is very similar to you first patch, the
only change is that 
  glsl-1.10/execution/variable-indexing/vs-output-array-float-index-wr 
no longer fails, but instead 
  glsl-1.50/execution/geometry/dynamic_input_array_index
fails, 

Best, 
Gert


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] r600/sb: fix rotated register in while loop

2018-02-18 Thread Dave Airlie
On 15 February 2018 at 01:26, Gert Wollny  wrote:
> Am Mittwoch, den 14.02.2018, 17:18 +1000 schrieb Dave Airlie:
>> From: Dave Airlie 
>>
>> A bunch of CTS tests led me to write
>> tests/shaders/ssa/fs-while-loop-rotate-value.shader_test
>> which r600/sb always fell over on.
>>
>> This patch fixes it, but I'll probably never be 100% sure why.

I've gone back to this approach, I've spot tested and I don't see
as many regressions.

>
> fixes:
>shaders/ssa/fs-while-loop-rotate-value
>spec/arb_enhanced_layouts/execution/component-layout/
>   sso-vs-gs-fs-array-interleave
>
However this test is broken on r600 even with nosb, it also
never clears so if a previous test has the right answer it will
always pass.

I might try fixing the test then seeing why it's broken on r600.

Dave.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] r600/sb: fix rotated register in while loop (v3)

2018-02-18 Thread Dave Airlie
From: Dave Airlie 

A bunch of CTS tests led me to write
tests/shaders/ssa/fs-while-loop-rotate-value.shader_test
which r600/sb always fell over on.

This patch fixes it, but I'll probably never be 100% sure why.

Anyways what appears to be happening is when gcm is scheduling
the copy_movs used for phis, we have 4 copy_movs in the scheduling
queue, one of them releases a new copy_mov and it gets pushed
to the the front of the scheduling queue, but for correctness
we rely on the previous copy_movs getting emitted first. This
places copy_movs in order on the list.

So if the list is empty, it puts it at the front, if the list
has just copy_movs in it, it goes to the end, otherwise
we iterate the list and insert it between the copy_movs and
subsequent instructions.

I've added the additional check here that if the src[0]
of the copy mov has a constraint (which will be true for the
(copy) MOV R1.x.2, t10 case but not for MOV t20, cases,
then we should schedule those early, but the others late.

v2: wrong direction.

v3: back to v1 + changes

Signed-off-by: Dave Airlie 
---
 src/gallium/drivers/r600/sb/sb_gcm.cpp | 19 +--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/r600/sb/sb_gcm.cpp 
b/src/gallium/drivers/r600/sb/sb_gcm.cpp
index 7776a10fc86..0c715b24dfb 100644
--- a/src/gallium/drivers/r600/sb/sb_gcm.cpp
+++ b/src/gallium/drivers/r600/sb/sb_gcm.cpp
@@ -637,8 +637,23 @@ void gcm::add_ready(node *n) {
sched_queue_id sq = sh.get_queue_id(n);
if (n->flags & NF_SCHEDULE_EARLY)
bu_ready_early[sq].push_back(n);
-   else if (sq == SQ_ALU && n->is_copy_mov())
-   bu_ready[sq].push_front(n);
+   else if (sq == SQ_ALU && n->is_copy_mov()) {
+   if (bu_ready[sq].empty() || n->src[0]->constraint)
+   bu_ready[sq].push_front(n);
+   else {
+   bool inserted = false;
+   for (sched_queue::iterator I = bu_ready[sq].begin(), E 
= bu_ready[sq].end(); I != E; I++) {
+   node *a = *I;
+   if (!a->is_copy_mov()) {
+   bu_ready[sq].insert(I, n);
+   inserted = true;
+   break;
+   }
+   }
+   if (!inserted)
+   bu_ready[sq].push_back(n);
+   }
+   }
else if (n->is_alu_inst()) {
alu_node *a = static_cast(n);
if (a->bc.op_ptr->flags & AF_PRED && a->dst[2]) {
-- 
2.14.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] r600/sb: fix rotated register in while loop (attempt 2)

2018-02-16 Thread Gert Wollny
Am Freitag, den 16.02.2018, 14:44 +1000 schrieb Dave Airlie:
> From: Dave Airlie 
> 
> A bunch of CTS tests led me to write
> tests/shaders/ssa/fs-while-loop-rotate-value.shader_test
> which r600/sb always fell over on.
> 
> GCM seems to move some of the copys into other basic blocks,
> if we don't allow this to happen then it doesn't seem to schedule
> them badly.
> 
> Everything I've read on SSA/phi copies say they have to happen
> in parallel, so keeping them in the same basic block seems like
> a good way to keep some of that property.


This one fixes fs-while-loop-rotate-value for me (barts) as promised,
no longer fixes 

   spec/arb_enhanced_layouts/execution/component-layout/
sso-vs-gs-fs-array-interleave

 but breaks  

   spec/glsl-1.50/execution/variable-indexing/
vs-output-array-vec2-index-wr-before-gs


I don't know whether this is helpful, but when I apply a patch to TGSI
split arrays [1], this piglit also passes with your patch.
Specifically, since the loops are completely unrolled no indirect
access is happening and all arrays can be split. As a result, when
using plain master the shaders optimised by sb and when using your
patch on top of the array splitting are very similar (mostly
differently ordered literals and register indices), however, running 
master + this patch results in very different optimised shaders, e.g.
sb bundles many vfetch operations at the beginning.

Best, 
Gert 



[1] https://patchwork.freedesktop.org/series/37991/

> 
> Signed-off-by: Dave Airlie 
> ---
>  src/gallium/drivers/r600/sb/sb_shader.cpp | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/src/gallium/drivers/r600/sb/sb_shader.cpp
> b/src/gallium/drivers/r600/sb/sb_shader.cpp
> index 321e24ea25..8959b8391d 100644
> --- a/src/gallium/drivers/r600/sb/sb_shader.cpp
> +++ b/src/gallium/drivers/r600/sb/sb_shader.cpp
> @@ -121,7 +121,7 @@ alu_node* shader::create_copy_mov(value* dst,
> value* src, unsigned affcost) {
>   alu_node *n = create_mov(dst, src);
>  
>   dst->assign_source(src);
> - n->flags |= NF_COPY_MOV | NF_DONT_HOIST;
> + n->flags |= NF_COPY_MOV | NF_DONT_HOIST | NF_DONT_MOVE;
>  
>   if (affcost && dst->is_sgpr() && src->is_sgpr())
>   coal.add_edge(src, dst, affcost);
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] r600/sb: fix rotated register in while loop (attempt 2)

2018-02-15 Thread Dave Airlie
From: Dave Airlie 

A bunch of CTS tests led me to write
tests/shaders/ssa/fs-while-loop-rotate-value.shader_test
which r600/sb always fell over on.

GCM seems to move some of the copys into other basic blocks,
if we don't allow this to happen then it doesn't seem to schedule
them badly.

Everything I've read on SSA/phi copies say they have to happen
in parallel, so keeping them in the same basic block seems like
a good way to keep some of that property.

Signed-off-by: Dave Airlie 
---
 src/gallium/drivers/r600/sb/sb_shader.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/r600/sb/sb_shader.cpp 
b/src/gallium/drivers/r600/sb/sb_shader.cpp
index 321e24ea25..8959b8391d 100644
--- a/src/gallium/drivers/r600/sb/sb_shader.cpp
+++ b/src/gallium/drivers/r600/sb/sb_shader.cpp
@@ -121,7 +121,7 @@ alu_node* shader::create_copy_mov(value* dst, value* src, 
unsigned affcost) {
alu_node *n = create_mov(dst, src);
 
dst->assign_source(src);
-   n->flags |= NF_COPY_MOV | NF_DONT_HOIST;
+   n->flags |= NF_COPY_MOV | NF_DONT_HOIST | NF_DONT_MOVE;
 
if (affcost && dst->is_sgpr() && src->is_sgpr())
coal.add_edge(src, dst, affcost);
-- 
2.14.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] r600/sb: fix rotated register in while loop

2018-02-14 Thread Gert Wollny
Am Mittwoch, den 14.02.2018, 17:18 +1000 schrieb Dave Airlie:
> From: Dave Airlie 
> 
> A bunch of CTS tests led me to write
> tests/shaders/ssa/fs-while-loop-rotate-value.shader_test
> which r600/sb always fell over on.
> 
> This patch fixes it, but I'll probably never be 100% sure why.

Unfortunately it shows quite a number of regressions against 
(git-b9d2ff05a6), in summary:

fixes:
   shaders/ssa/fs-while-loop-rotate-value
   spec/arb_enhanced_layouts/execution/component-layout/
  sso-vs-gs-fs-array-interleave

regressions:

crash with sb/sb_sched.cpp:931:process_fetch: 
   Assertion `f->parent->count() == 1' failed: 
  
spec/
   arb_arrays_of_arrays/execution/sampler/
vs-struct-nonconst-non-opaque-members

spec/
   arb_gpu_shader5/execution/sampler_array_indexing/
sampler-nonconst-2d
sampler-nonconst-2d-array
sampler-nonconst-2d-array-grad
sampler-nonconst-2d-grad

Simply fail: 
spec/
   glsl-1.10/execution/variable-indexing/
vs-output-array-float-index-wr
vs-temp-array-mat2-col-rd
vs-temp-array-mat2-index-col-rd
vs-temp-mat2-col-rd
vs-varying-array-mat2-index-col-rd
vs-varying-mat2-col-rd

   glsl-1.20/execution/variable-indexing/
vs-temp-array-mat2-col-rd
vs-temp-mat2-col-rd
vs-varying-array-mat2-index-col-rd
vs-varying-array-mat3-col-row-wr
vs-varying-array-mat3-col-wr
vs-varying-mat2-col-rd


   glsl-120/execution/variable-indexing/
vs-varying-array-mat3-col-row-wr
vs-varying-array-mat3-col-wr

I tried to figure out goes wrong with vs-output-array-float-index-wr,
it is the vertex shader (#18 for me), but why it fails eludes me so
far. 

best, 
Gert 



> Anyways what appears to be happening is when gcm is scheduling
> the copy_movs used for phis, we have 4 copy_movs in the scheduling
> queue, one of them releases a new copy_mov and it gets pushed
> to the the front of the scheduling queue, but for correctness
> we rely on the previous copy_movs getting emitted first. This
> places copy_movs in order on the list.
> 
> So if the list is empty, it puts it at the front, if the list
> has just copy_movs in it, it goes to the end, otherwise
> we iterate the list and insert it between the copy_movs and
> subsequent instructions.
> 
> Signed-off-by: Dave Airlie 
> ---
>  src/gallium/drivers/r600/sb/sb_gcm.cpp | 19 +--
>  1 file changed, 17 insertions(+), 2 deletions(-)
> 
> diff --git a/src/gallium/drivers/r600/sb/sb_gcm.cpp
> b/src/gallium/drivers/r600/sb/sb_gcm.cpp
> index 7776a10fc8..7b4df8bd52 100644
> --- a/src/gallium/drivers/r600/sb/sb_gcm.cpp
> +++ b/src/gallium/drivers/r600/sb/sb_gcm.cpp
> @@ -637,8 +637,23 @@ void gcm::add_ready(node *n) {
>   sched_queue_id sq = sh.get_queue_id(n);
>   if (n->flags & NF_SCHEDULE_EARLY)
>   bu_ready_early[sq].push_back(n);
> - else if (sq == SQ_ALU && n->is_copy_mov())
> - bu_ready[sq].push_front(n);
> + else if (sq == SQ_ALU && n->is_copy_mov()) {
> + if (bu_ready[sq].empty())
> + bu_ready[sq].push_front(n);
> + else {
> + bool inserted = false;
> + for (sched_queue::iterator I =
> bu_ready[sq].begin(), E = bu_ready[sq].end(); I != E; I++) {
> + node *a = *I;
> + if (!a->is_copy_mov()) {
> + bu_ready[sq].insert(I, n);
> + inserted = true;
> + break;
> + }
> + }
> + if (!inserted)
> + bu_ready[sq].push_back(n);
> + }
> + }
>   else if (n->is_alu_inst()) {
>   alu_node *a = static_cast(n);
>   if (a->bc.op_ptr->flags & AF_PRED && a->dst[2]) {
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] r600/sb: fix rotated register in while loop

2018-02-13 Thread Dave Airlie
From: Dave Airlie 

A bunch of CTS tests led me to write
tests/shaders/ssa/fs-while-loop-rotate-value.shader_test
which r600/sb always fell over on.

This patch fixes it, but I'll probably never be 100% sure why.

Anyways what appears to be happening is when gcm is scheduling
the copy_movs used for phis, we have 4 copy_movs in the scheduling
queue, one of them releases a new copy_mov and it gets pushed
to the the front of the scheduling queue, but for correctness
we rely on the previous copy_movs getting emitted first. This
places copy_movs in order on the list.

So if the list is empty, it puts it at the front, if the list
has just copy_movs in it, it goes to the end, otherwise
we iterate the list and insert it between the copy_movs and
subsequent instructions.

Signed-off-by: Dave Airlie 
---
 src/gallium/drivers/r600/sb/sb_gcm.cpp | 19 +--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/r600/sb/sb_gcm.cpp 
b/src/gallium/drivers/r600/sb/sb_gcm.cpp
index 7776a10fc8..7b4df8bd52 100644
--- a/src/gallium/drivers/r600/sb/sb_gcm.cpp
+++ b/src/gallium/drivers/r600/sb/sb_gcm.cpp
@@ -637,8 +637,23 @@ void gcm::add_ready(node *n) {
sched_queue_id sq = sh.get_queue_id(n);
if (n->flags & NF_SCHEDULE_EARLY)
bu_ready_early[sq].push_back(n);
-   else if (sq == SQ_ALU && n->is_copy_mov())
-   bu_ready[sq].push_front(n);
+   else if (sq == SQ_ALU && n->is_copy_mov()) {
+   if (bu_ready[sq].empty())
+   bu_ready[sq].push_front(n);
+   else {
+   bool inserted = false;
+   for (sched_queue::iterator I = bu_ready[sq].begin(), E 
= bu_ready[sq].end(); I != E; I++) {
+   node *a = *I;
+   if (!a->is_copy_mov()) {
+   bu_ready[sq].insert(I, n);
+   inserted = true;
+   break;
+   }
+   }
+   if (!inserted)
+   bu_ready[sq].push_back(n);
+   }
+   }
else if (n->is_alu_inst()) {
alu_node *a = static_cast(n);
if (a->bc.op_ptr->flags & AF_PRED && a->dst[2]) {
-- 
2.14.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev