Hello Thomas,

is this useful even after '[Mesa-dev] [PATCH 0/2] V2: Use hash table cloning in copy propagation' landed?

I'm running both together with Dave's '[Mesa-dev] [PATCH] radv/winsys: replace bo list searchs with a hash table.' patch.

Dieter

On 24.01.2018 08:33, Thomas Helland wrote:
2018-01-21 23:58 GMT+01:00 Eric Anholt <[email protected]>:
Thomas Helland <[email protected]> writes:

Also, allocate worklist_elem in groups of 20, to reduce the burden of
allocation. Do not use rzalloc, as there is no need. This lets us drop
the number of calls to ralloc from approximately 10% of all calls to
ralloc (130 000 calls), down to a mere 2000 calls to ralloc_array_size.
This cuts the runtime of shader-db by 1%, while at the same time
reducing the number of stalled cycles, executed cycles, and executed
instructions by about 1%, as reported by perf. I did a five-run
benchmark before and after, and got a statistical variance of less than
0.1% in both cases. This was with i965's IR validation polluting the
benchmark, so the numbers are even better in release builds.
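
To make the batching idea concrete, here is a minimal sketch of handing
out worklist elements from chunks of 20, with one allocation per chunk.
It is not the code from the patch: it uses plain malloc rather than
ralloc, and the names elem_pool, pool_get, pool_put and pool_finish are
made up for illustration.

```c
#include <assert.h>
#include <stdlib.h>

#define ELEMS_PER_CHUNK 20   /* group size quoted in the commit message */

struct elem {
   struct elem *next;        /* free-list link */
   void *instr;              /* payload; stands in for an instruction pointer */
};

struct chunk {
   struct chunk *next;       /* chunks are chained so they can be freed in bulk */
   struct elem elems[ELEMS_PER_CHUNK];
};

struct elem_pool {
   struct chunk *chunks;     /* every chunk allocated so far */
   struct elem *free_list;   /* elements ready to be handed out */
   unsigned num_allocs;      /* number of malloc calls, for comparison */
};

static void pool_init(struct elem_pool *p)
{
   p->chunks = NULL;
   p->free_list = NULL;
   p->num_allocs = 0;
}

/* Hand out one element, allocating a new chunk only when none is free. */
static struct elem *pool_get(struct elem_pool *p)
{
   if (!p->free_list) {
      /* Non-zeroing allocation: the fields are initialised below. */
      struct chunk *c = malloc(sizeof(*c));
      if (!c)
         return NULL;
      p->num_allocs++;
      c->next = p->chunks;
      p->chunks = c;
      for (int i = 0; i < ELEMS_PER_CHUNK; i++) {
         c->elems[i].next = p->free_list;
         p->free_list = &c->elems[i];
      }
   }
   struct elem *e = p->free_list;
   p->free_list = e->next;
   e->next = NULL;
   e->instr = NULL;
   return e;
}

/* Return an element to the pool so that a later pool_get can reuse it. */
static void pool_put(struct elem_pool *p, struct elem *e)
{
   assert(e != NULL);
   e->next = p->free_list;
   p->free_list = e;
}

/* Free all chunks at once; every element handed out becomes invalid. */
static void pool_finish(struct elem_pool *p)
{
   while (p->chunks) {
      struct chunk *next = p->chunks->next;
      free(p->chunks);
      p->chunks = next;
   }
   p->free_list = NULL;
}
```

With this layout, 45 elements cost three allocations instead of 45,
which is the kind of reduction the numbers above describe.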

Performance change as found with perf-diff (columns: baseline, delta, shared object, symbol):
4.74%     -0.23%  libc-2.26.so            [.] _int_malloc
1.88%     -0.21%  libc-2.26.so            [.] malloc
2.27%     +0.16%  libmesa_dri_drivers.so  [.] match_value.part.7
2.95%     -0.12%  libc-2.26.so            [.] _int_free
          +0.11%  libmesa_dri_drivers.so  [.] worklist_push
1.22%     -0.08%  libc-2.26.so            [.] malloc_consolidate
0.16%     -0.06%  libmesa_dri_drivers.so  [.] mark_live_cb
1.21%     +0.06%  libmesa_dri_drivers.so  [.] match_expression.part.6
0.75%     -0.05%  libc-2.26.so            [.] cfree@GLIBC_2.2.5
0.50%     -0.05%  libmesa_dri_drivers.so  [.] ralloc_size
0.57%     +0.04%  libmesa_dri_drivers.so  [.] nir_replace_instr
1.29%     -0.04%  libmesa_dri_drivers.so  [.] unsafe_free

I'm curious, since a NIR instruction worklist seems like a generally
useful thing to have:

Could nir_worklist.c keep the implementation of this?

Also, I wonder if it wouldn't be even better to have a u_dynarray of
instructions in the worklist, with push/pop on the end of the array, and
a struct set tracking the instructions in the array to avoid
double-adding. I actually don't know if that would be better or not, so I'd be happy with the worklist management just moved to nir_worklist.c.
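
As a rough illustration of the array-plus-set idea, the sketch below
keeps pending items in a growable array used as a stack and tracks them
in a small pointer set, so that pushing an item twice is a no-op. It
does not use Mesa's u_dynarray or struct set: the array is a plain
realloc'd buffer, the set is a fixed-size open-addressing table chosen
for brevity, and all names (iwl_push, iwl_pop, ...) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define SET_SLOTS 64            /* power of two; fixed size for brevity */
#define TOMBSTONE ((void *)1)   /* marks a slot whose entry was removed */

struct iwl {
   void **items;                /* growable stack of pending items */
   size_t count, cap;
   void *slots[SET_SLOTS];      /* set of the pointers currently queued */
};

static size_t hash_ptr(const void *p)
{
   return (size_t)((((uintptr_t)p >> 3) * (uintptr_t)2654435761u) &
                   (SET_SLOTS - 1));
}

/* Return the slot holding key, or -1 if key is not in the set. */
static int set_find(const struct iwl *w, const void *key)
{
   size_t h = hash_ptr(key);
   for (size_t i = 0; i < SET_SLOTS; i++) {
      size_t s = (h + i) & (SET_SLOTS - 1);
      if (w->slots[s] == NULL)
         return -1;
      if (w->slots[s] == key)
         return (int)s;
   }
   return -1;
}

/* Store key in the first free slot; returns false if the table is full. */
static bool set_add(struct iwl *w, void *key)
{
   size_t h = hash_ptr(key);
   for (size_t i = 0; i < SET_SLOTS; i++) {
      size_t s = (h + i) & (SET_SLOTS - 1);
      if (w->slots[s] == NULL || w->slots[s] == TOMBSTONE) {
         w->slots[s] = key;
         return true;
      }
   }
   return false;
}

static void iwl_init(struct iwl *w)
{
   w->items = NULL;
   w->count = w->cap = 0;
   for (size_t i = 0; i < SET_SLOTS; i++)
      w->slots[i] = NULL;
}

/* Push item unless it is already queued; returns true if it was added. */
static bool iwl_push(struct iwl *w, void *item)
{
   assert(item != NULL && item != TOMBSTONE);
   if (set_find(w, item) >= 0)
      return false;
   if (w->count == w->cap) {
      size_t new_cap = w->cap ? w->cap * 2 : 16;
      void **grown = realloc(w->items, new_cap * sizeof(*grown));
      if (!grown)
         return false;
      w->items = grown;
      w->cap = new_cap;
   }
   if (!set_add(w, item))
      return false;
   w->items[w->count++] = item;
   return true;
}

/* Pop the most recently pushed item, or NULL if the worklist is empty. */
static void *iwl_pop(struct iwl *w)
{
   if (w->count == 0)
      return NULL;
   void *item = w->items[--w->count];
   int s = set_find(w, item);
   assert(s >= 0);
   w->slots[s] = TOMBSTONE;   /* item may be queued again later */
   return item;
}

static void iwl_finish(struct iwl *w)
{
   free(w->items);
   w->items = NULL;
   w->count = w->cap = 0;
}
```

A real implementation would need a set that grows with the array; the
fixed table here only keeps the sketch short.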

I'll look into this to see what I can do. nir_worklist.c at this time has only a block worklist. This numbers all the blocks, uses a bitset for checking
if the item is present, and uses an array with an index pointing to the
start of the queue of blocks in the buffer.
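
For readers who have not looked at that file, the scheme can be
sketched roughly as follows. This is an illustration written from the
description above, not a copy of nir_worklist.c: items are identified
by an index, a bitset records which indices are queued, and a circular
buffer holds the queue. All names here (bwl_init, bwl_push_tail, ...)
are assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

struct bwl {
   unsigned size;       /* number of numbered items, also buffer capacity */
   unsigned count;      /* how many indices are currently queued */
   unsigned start;      /* position of the queue head in buf */
   uint32_t *present;   /* one bit per index: set while it is queued */
   unsigned *buf;       /* circular buffer of queued indices */
};

static bool bwl_init(struct bwl *w, unsigned num_items)
{
   assert(num_items > 0);
   w->size = num_items;
   w->count = 0;
   w->start = 0;
   w->present = calloc((num_items + 31) / 32, sizeof(uint32_t));
   w->buf = malloc(num_items * sizeof(unsigned));
   return w->present != NULL && w->buf != NULL;
}

static bool bwl_is_present(const struct bwl *w, unsigned idx)
{
   return (w->present[idx / 32] >> (idx % 32)) & 1u;
}

/* Append idx to the queue unless it is already queued. Because each
 * index appears at most once, the buffer can never overflow. */
static void bwl_push_tail(struct bwl *w, unsigned idx)
{
   assert(idx < w->size);
   if (bwl_is_present(w, idx))
      return;
   w->buf[(w->start + w->count) % w->size] = idx;
   w->count++;
   w->present[idx / 32] |= 1u << (idx % 32);
}

/* Remove and return the index at the head of the queue. */
static unsigned bwl_pop_head(struct bwl *w)
{
   assert(w->count > 0);
   unsigned idx = w->buf[w->start];
   w->start = (w->start + 1) % w->size;
   w->count--;
   w->present[idx / 32] &= ~(1u << (idx % 32));
   return idx;
}

static void bwl_finish(struct bwl *w)
{
   free(w->present);
   free(w->buf);
   w->present = NULL;
   w->buf = NULL;
}
```

The membership test is a single bit lookup, which is what makes the
scheme attractive whenever the items already carry a dense index.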

The same scheme could be easily used for ssa-defs, as these are
also numbered. I actually did this for the VRP pass I wrote years ago.

However, for instructions we do not have a way of numbering them,
so a different scheme would have to be used. A dynarray + set type
of thing, as you're suggesting, might get us where we want.
I'll see what I can come up with.
_______________________________________________
mesa-dev mailing list
[email protected]
https://lists.freedesktop.org/mailman/listinfo/mesa-dev