Re: [Mesa-dev] V3 Loop unrolling in NIR

2016-09-15 Thread Timothy Arceri
On Thu, 2016-09-15 at 12:28 +0300, Eero Tamminen wrote:
> Hi,
> 
> Have you any plans for supporting partial unrolling?

Not currently, no, although it shouldn't be too difficult to add this.

> 
> I.e. if the loop count is too large to be completely unrolled, unroll
> it a few times (so that it still fits into the instruction cache) and
> then loop that.
> 
> E.g. for a loop with 51 rounds, Mesa could unroll it 4 rounds, loop
> that 12 times and unroll (or loop) the remaining 3 rounds separately.
> 
> 
>   - Eero
> 
> On 15.09.2016 10:03, Timothy Arceri wrote:
> > 
> > Big thanks to Connor for his feedback on previous versions, and
> > to Jason for answering all my NIR questions.
> > 
> > This series works on ssa defs so for now it's only enabled for
> > the scalar backend on Gen7+.
> > 
> > V3:
> > - So called complex loop unrolling has been implemented.
> > - An instruction limit and rules from the GLSL IR pass to override
> >  the limit for unrolling have been implemented.
> > - Lots of other stuff; see individual patches.
> > 
> > total instructions in shared programs: 8488940 -> 8488648 (-0.00%)
> > instructions in affected programs: 48903 -> 48611 (-0.60%)
> > helped: 68
> > HURT: 89
> > 
> > Most of this HURT comes from switching to using
> > nir_lower_indirect_derefs(). See patch 1 for more details.
> > 
> > total cycles in shared programs: 69787006 -> 69758740 (-0.04%)
> > cycles in affected programs: 2525708 -> 2497442 (-1.12%)
> > helped: 900
> > HURT: 919
> > 
> > total loops in shared programs: 2071 -> 1499 (-27.62%)
> > loops in affected programs: 687 -> 115 (-83.26%)
> > helped: 655
> > HURT: 99
> > 
> > Helped here comes from a number of things. One example is that the
> > NIR pass is better than the GLSL pass at unrolling loops
> > regardless of which terminator has the lowest limit. We could
> > easily go further and handle unrolling of loops with complex
> > terminators, e.g. where the if's then or else blocks contain
> > instructions. Currently we just bail if they are not empty; I still
> > need to check whether it's worthwhile.
> > 
> > Another reason could be that I've set the instruction limit too
> > high but that doesn't seem to be the case.
> > 
> > I believe 82/99 of the HURT is from shaders that look something
> > like this:
> > 
> >   vec2 array[const_size_of_array];
> >   for (i = 0; i < const_size_of_array; i++) {
> >      ...  = array[i];
> > 
> >      ... lots of instructions (more than the unroll limit) ...
> >   }
> > 
> > The GLSL IR pass would force this to unroll as long as
> > const_size_of_array wasn't greater than 32. However, by the time we
> > get to the NIR pass the arrays have been removed. It seems like this
> > may only be happening for vectors, but I haven't looked into what is
> > causing it yet.
> > 
> > The other 17 shaders seem to be various corner cases that can be
> > fixed in follow-up patches.
> > 
> > total spills in shared programs: 2212 -> 2212 (0.00%)
> > spills in affected programs: 0 -> 0
> > helped: 0
> > HURT: 0
> > 
> > total fills in shared programs: 1891 -> 1891 (0.00%)
> > fills in affected programs: 0 -> 0
> > helped: 0
> > HURT: 0
> > 
> > LOST:   6
> > GAINED: 32
> > 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] V3 Loop unrolling in NIR

2016-09-15 Thread Eero Tamminen

Hi,

Have you any plans for supporting partial unrolling?

I.e. if the loop count is too large to be completely unrolled, unroll it
a few times (so that it still fits into the instruction cache) and then
loop that.

E.g. for a loop with 51 rounds, Mesa could unroll it 4 rounds, loop that
12 times and unroll (or loop) the remaining 3 rounds separately.
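
To make the arithmetic concrete, here is a minimal sketch in plain C of what the partially unrolled form would compute. It is only an illustration of the transformation, not Mesa or NIR code; `split_trip_count`, `run_unrolled_by_4` and `body` are hypothetical names invented for this example.

```c
#include <assert.h>

/* How a loop of `trip_count` rounds divides when unrolled by `factor`. */
struct unroll_split {
    int unrolled_iterations; /* passes over the unrolled body */
    int remainder;           /* leftover rounds handled separately */
};

static struct unroll_split split_trip_count(int trip_count, int factor)
{
    assert(trip_count >= 0 && factor > 0);
    struct unroll_split s = { trip_count / factor, trip_count % factor };
    return s;
}

/* Hypothetical loop body: accumulates the round index. */
static void body(int i, long *acc)
{
    *acc += i;
}

/* Equivalent to: for (i = 0; i < trip_count; i++) body(i, &acc); */
static long run_unrolled_by_4(int trip_count)
{
    struct unroll_split s = split_trip_count(trip_count, 4);
    long acc = 0;
    int i = 0;

    /* Main loop: four copies of the body per pass. */
    for (int n = 0; n < s.unrolled_iterations; n++, i += 4) {
        body(i, &acc);
        body(i + 1, &acc);
        body(i + 2, &acc);
        body(i + 3, &acc);
    }

    /* Remainder: the rounds that do not fill a whole pass. */
    for (int r = 0; r < s.remainder; r++, i++)
        body(i, &acc);

    return acc;
}
```

For 51 rounds and a factor of 4, `split_trip_count` gives 12 passes over the unrolled body plus 3 leftover rounds, since 12 * 4 + 3 = 51.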



- Eero

On 15.09.2016 10:03, Timothy Arceri wrote:

Big thanks to Connor for his feedback on previous versions, and
to Jason for answering all my NIR questions.

This series works on ssa defs so for now it's only enabled for
the scalar backend on Gen7+.

V3:
- So called complex loop unrolling has been implemented.
- An instruction limit and rules from the GLSL IR pass to override
 the limit for unrolling have been implemented.
- Lots of other stuff; see individual patches.

total instructions in shared programs: 8488940 -> 8488648 (-0.00%)
instructions in affected programs: 48903 -> 48611 (-0.60%)
helped: 68
HURT: 89

Most of this HURT comes from switching to using
nir_lower_indirect_derefs(). See patch 1 for more details.

total cycles in shared programs: 69787006 -> 69758740 (-0.04%)
cycles in affected programs: 2525708 -> 2497442 (-1.12%)
helped: 900
HURT: 919

total loops in shared programs: 2071 -> 1499 (-27.62%)
loops in affected programs: 687 -> 115 (-83.26%)
helped: 655
HURT: 99

Helped here comes from a number of things. One example is that the
NIR pass is better than the GLSL pass at unrolling loops
regardless of which terminator has the lowest limit. We could
easily go further and handle unrolling of loops with complex
terminators, e.g. where the if's then or else blocks contain
instructions. Currently we just bail if they are not empty; I still
need to check whether it's worthwhile.

Another reason could be that I've set the instruction limit too
high but that doesn't seem to be the case.

I believe 82/99 of the HURT is from shaders that look something
like this:

  vec2 array[const_size_of_array];
  for (i = 0; i < const_size_of_array; i++) {
     ...  = array[i];

     ... lots of instructions (more than the unroll limit) ...
  }

The GLSL IR pass would force this to unroll as long as const_size_of_array
wasn't greater than 32. However, by the time we get to the NIR pass the
arrays have been removed. It seems like this may only be happening for
vectors, but I haven't looked into what is causing it yet.
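
The interaction between the instruction limit and the forced-unroll rule can be sketched roughly as below. This is a hedged illustration only, not the actual GLSL IR or NIR heuristic: the struct, its fields, `should_unroll`, and the "trip count times body size" budget check are all assumptions made for the example. Only the threshold of 32 comes from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of a loop; not a Mesa data structure. */
struct loop_summary {
    unsigned trip_count;        /* known constant iteration count */
    unsigned body_instrs;       /* instructions in one iteration */
    bool indexes_array_with_iv; /* body reads array[i] with the loop counter */
    unsigned array_length;      /* constant size of that array, if any */
};

/* Threshold taken from the description of the GLSL IR pass above. */
enum { FORCE_UNROLL_MAX_ARRAY_LEN = 32 };

static bool should_unroll(const struct loop_summary *l, unsigned instr_limit)
{
    assert(l != NULL);

    /* Override: unroll regardless of size so the array access
     * becomes a set of constant-index accesses. */
    if (l->indexes_array_with_iv &&
        l->array_length <= FORCE_UNROLL_MAX_ARRAY_LEN)
        return true;

    /* Assumed budget check: total unrolled size must fit the limit. */
    return (unsigned long long)l->trip_count * l->body_instrs <= instr_limit;
}
```

Under these assumptions, a 16-round loop with a 100-instruction body and a limit of 96 is unrolled only while `indexes_array_with_iv` is still set. If the array has already been removed by the time the check runs, the override never fires and the loop stays rolled, which matches the regression described above.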

The other 17 shaders seem to be various corner cases that can be fixed
in follow-up patches.

total spills in shared programs: 2212 -> 2212 (0.00%)
spills in affected programs: 0 -> 0
helped: 0
HURT: 0

total fills in shared programs: 1891 -> 1891 (0.00%)
fills in affected programs: 0 -> 0
helped: 0
HURT: 0

LOST:   6
GAINED: 32
