On 16.09.2016 19:11, Ian Romanick wrote:
On 09/16/2016 06:57 AM, Nicolai Hähnle wrote:
Hi all,

as the title says. The implementation uses a compute shader to summarize
data from the query buffers. As long as only one query buffer is in flight
(the normal case), that compute shader is launched exactly once, on a
single thread. If multiple buffers were required, then one compute grid is
launched for each of these buffers, in sequence.

All of this could be done in much fancier ways using bindless buffers and
wave-wide computations, but really, the expectation is that most queries
will be rather simple (though occlusion queries always contain at least 8
result pairs, so it's not like it would be completely pointless).
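As a rough CPU-side illustration of the summarization described above, here is a minimal sketch in plain C. It assumes each query buffer holds begin/end counter pairs and that the result is the sum of the differences; that layout and all struct and function names are hypothetical, and the actual implementation is a compute shader, not this code.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical layout: one begin/end counter pair per slot. */
struct result_pair {
    uint64_t begin;
    uint64_t end;
};

/* Hypothetical query buffer; several may be chained for one query. */
struct query_buffer {
    const struct result_pair *pairs;
    size_t num_pairs;
    const struct query_buffer *next;
};

/* Fold one buffer into the running total. This stands in for one
 * single-threaded launch of the summarizing shader. */
static uint64_t summarize_buffer(const struct query_buffer *buf, uint64_t acc)
{
    for (size_t i = 0; i < buf->num_pairs; i++)
        acc += buf->pairs[i].end - buf->pairs[i].begin;
    return acc;
}

/* One "launch" per buffer, in sequence, carrying the total forward. */
static uint64_t summarize_query(const struct query_buffer *first)
{
    uint64_t acc = 0;
    for (const struct query_buffer *b = first; b != NULL; b = b->next)
        acc = summarize_buffer(b, acc);
    return acc;
}
```

With a single buffer in flight the outer loop runs exactly once, which corresponds to the normal case mentioned above.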

This code also exposes the hilarious lowering of 64-bit integer divides
in LLVM, since timestamp queries need such a divide. This lowering generates more than
2KB of code for a single division, which is excessive even when the division
*isn't* by a constant. The right place to fix this is in LLVM, and I'm
already looking into it. For normal queries this is completely irrelevant
because the code will just be skipped.

Is the division by a constant?  If it is, you might want to use
something like what libdivide would generate.

Yes, it is. I'd rather fix this in LLVM, though. LLVM already has the required infrastructure; it just doesn't use it in this case out of silliness.
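For reference, division by a constant can be turned into a multiply-high plus shifts. The following is a minimal sketch of the round-up method from Granlund and Montgomery, which is similar in spirit to what libdivide precomputes. It is not the code LLVM or libdivide actually emits. The names udiv64, udiv64_init, udiv64_do and mulhi64 are hypothetical, and unsigned __int128 is a GCC/Clang extension assumed here for brevity.

```c
#include <stdint.h>

/* Precomputed data for dividing by one fixed divisor (hypothetical type). */
struct udiv64 {
    uint64_t magic;
    unsigned sh1;
    unsigned sh2;
};

/* High 64 bits of the 128-bit product a * b. */
static uint64_t mulhi64(uint64_t a, uint64_t b)
{
    return (uint64_t)(((unsigned __int128)a * b) >> 64);
}

/* Precompute magic number and shifts for divisor d (d must be nonzero). */
static struct udiv64 udiv64_init(uint64_t d)
{
    struct udiv64 r;
    unsigned l = 0;

    /* l = ceil(log2(d)); stop at 64 to avoid an out-of-range shift. */
    while (l < 64 && ((uint64_t)1 << l) < d)
        l++;

    /* magic = floor(2^64 * (2^l - d) / d) + 1, which fits in 64 bits. */
    unsigned __int128 two_l = (unsigned __int128)1 << l;
    r.magic = (uint64_t)((((two_l - d) << 64) / d) + 1);
    r.sh1 = l < 1 ? l : 1;
    r.sh2 = l > 0 ? l - 1 : 0;
    return r;
}

/* Compute n / d using only a multiply-high, a subtract, an add and shifts. */
static uint64_t udiv64_do(const struct udiv64 *c, uint64_t n)
{
    uint64_t t = mulhi64(c->magic, n);
    return (t + ((n - t) >> c->sh1)) >> c->sh2;
}
```

When the divisor is known at compile time, the magic number and shifts become constants, so each division costs a handful of instructions rather than a full software divide routine.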


Please review!

mesa-dev mailing list
