On 03/12/2018 05:21 PM, Marek Olšák wrote:
On Mon, Mar 12, 2018 at 6:18 AM, Samuel Pitoiset
<samuel.pitoi...@gmail.com> wrote:

On 03/11/2018 04:41 PM, Marek Olšák wrote:

On Thu, Mar 8, 2018 at 5:48 PM, Ian Romanick <i...@freedesktop.org> wrote:

On 03/08/2018 06:50 AM, Samuel Pitoiset wrote:

This pass moves load UBO operations just before their first use,
loosely based on nir_opt_move_comparisons.

If I'm reading this correctly, it moves UBO loads closer to the first
use in the same block.  My assumption is the benefit in the next patch
occurs because live ranges are smaller.  It seems like this could also
hurt performance since it may be harder for the scheduler to hide the
latency of the load when register pressure is not an issue.  Have you
measured performance of running apps to see if this is an issue?

I'm mostly asking because Jason had a series for global code motion that
does, in some cases, the opposite of this patch by moving UBO loads up
to earlier blocks.
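
For readers following along, here is a toy sketch of the transformation being discussed. It is not the code from the patch and does not use any NIR API: the Instr type, the op names and the helper functions are all hypothetical, and a "block" is just a Python list. It only illustrates the idea of sinking each UBO load to just before its first use in the same block, and how that can shorten live ranges.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    """Hypothetical SSA-style instruction: dest = op(srcs). Not a NIR type."""
    dest: str
    op: str
    srcs: tuple = ()


def sink_ubo_loads(block):
    """Return a copy of block with each load_ubo placed just before its
    first user in the same block. Loads without a user in the block stay put."""
    result = list(block)
    loads = [ins for ins in block if ins.op == "load_ubo"]
    # Process loads bottom-up so that a load feeding another load is
    # sunk after its user has already reached its final position.
    for load in reversed(loads):
        i = result.index(load)
        first_use = next(
            (j for j in range(i + 1, len(result)) if load.dest in result[j].srcs),
            None,
        )
        if first_use is None or first_use == i + 1:
            continue  # no user in this block, or already adjacent to it
        result.pop(i)
        # After the pop the user sits at first_use - 1, so inserting there
        # puts the load immediately in front of it.
        result.insert(first_use - 1, load)
    return result


def max_live(block):
    """Peak number of block-defined values that are live between instructions."""
    last_use = {}
    for idx, ins in enumerate(block):
        for src in ins.srcs:
            last_use[src] = idx
    peak = 0
    for p in range(len(block)):
        live = sum(
            1 for d, ins in enumerate(block) if d <= p < last_use.get(ins.dest, -1)
        )
        peak = max(peak, live)
    return peak


# Example block: two loads at the top, used only further down.
# "c" is assumed to be defined in an earlier block.
example = [
    Instr("a", "load_ubo"),
    Instr("b", "load_ubo"),
    Instr("x", "fmul", ("c", "c")),
    Instr("y", "fadd", ("x", "a")),
    Instr("z", "fadd", ("y", "b")),
]

if __name__ == "__main__":
    moved = sink_ubo_loads(example)
    print([ins.dest for ins in moved])
    print(max_live(example), max_live(moved))
```

The sketch deliberately ignores latency: it models only register pressure, which is the trade-off raised above. A backend with its own scheduler may still hoist the loads again to hide memory latency.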

The pass is OK for LLVM, because LLVM does CSE across basic blocks,
and it also does instruction scheduling within a block.

radeonsi/tgsi does the same thing: it loads uniforms from memory at
every use. It sounds inefficient, but we found out that it's the best
thing to do with LLVM. LLVM can move loads away from the use, but it
doesn't move loads close to the use.

Exactly, RadeonSI does something similar. Though the shader-db result posted
by Timothy doesn't look very good for NIR, what do you think?

The results are OK.

Okay, can someone review the series then?


mesa-dev mailing list
