"Pohjolainen, Topi" <topi.pohjolai...@gmail.com> writes: > Jason, Curro, do you have any opinion if this is worth pursuing? > I need something for blorp blits at least - using blorp for texture > uploads on top of current excessive flushing regresses perf. >
I wouldn't be surprised if it improves throughput slightly for workloads doing both compute and 3D rendering, but only testing will tell how helpful it is in practice... > When working on gpu hangs on SKL we also identified compute > flushing caches it shouldn't. I think those could be addressed > nicely here as well. I can respin if this is something we'd like > to have. > > On Fri, Feb 17, 2017 at 09:32:03PM +0200, Topi Pohjolainen wrote: >> Currently: >> >> 1) Blorp color clears and resolves emit unconditional render target >> flush + command stream after every clear/resolve (including >> regular non-fast clears). >> >> 2) Blorp color clears, resolves and blits emit texture and constant >> cache resolves even in case only destination is dirty. This is >> because brw_render_cache_set_check_flush() does both render target >> flush as well as the top-of-pipe read cache flushes. >> >> 3) Similarly to item 2, 3D and compute paths also flush texture and >> constant caches even if none of the texture surfaces are dirty. >> >> 4) In case of multiple surfaces needing resolves, all render paths >> (blorp, 3D and compute) emit render target, texture and constant >> cache flushes after each resolve instead of just once after all >> resolves. >> >> This series addresses all four cases. Good news are that even though >> the current setup isn't optimal, it doesn't actually get any better in >> most cases performance wise. There is modest gain in OglDrvRes which >> does heavy blorp blitting. I'm expecting this series also to make >> blorp tex uploads and blorp mipmap generation more competitive. >> >> Bad news are in the final patch - it looks that current unconditional >> flushing/stalling has been hiding bugs elsewhere. There are cases >> which rely on the flushes after non-fast clears. Hunting the real >> cause is, however, difficult. I only saw them in CI system within >> full runs and was not able to reproduce them myself. >> >> As first steps the series introduces end-of-pipe synchronization. >> This is a flush combined with stall and post-sync operation of >> writing a double word (32 bits). Until now this wasn't really >> needed as there was in many cases double flushing which in turn >> looks to take long enough to hide the need for the sync. I also >> noticed that one needs to be rather careful with it - performance >> gets decreased noticeably when used unneeded. >> >> I don't really know if we want to go this way myself even. Current >> logic - while not ideal - is rather simple. >> >> Topi Pohjolainen (16): >> i965/miptree: Tell if anything got resolved >> i965/gen6+: Implement end-of-pipe sync >> i965: Hook end-of-pipe-sync after texture resolves >> i965: Hook end-of-pipe-sync after image resolves >> i965: Hook end-of-pipe-sync after framebuffer resolves >> i965: Consider layered rt resolves along with other >> i965: Add color resolve end-of-pipe-sync before switch to blit ring >> i965/dri2: Add end-of-pipe-sync after color resolves >> i965/miptree: Add color resolve end-of-pipe-sync before sharing >> i965: Add end-of-pipe sync before non-gpu read of color resolves >> i965/blorp: Do more fine grained flushing/syncing >> i965/blorp/blit: Refactor hiz/ccs prep for blits >> i965/blorp: Use conditional end-of-pipe-sync >> i965: Consider surface resolves and sync after blorp ops >> i965: Check if fast color clear state transition needs sync >> i965/blorp: Drop unnecessary flushes after clear/resolve >> >> src/mesa/drivers/dri/i965/brw_blorp.c | 187 ++++++++++---- >> src/mesa/drivers/dri/i965/brw_compute.c | 2 + >> src/mesa/drivers/dri/i965/brw_context.c | 333 >> +++++++++++++++++++------ >> src/mesa/drivers/dri/i965/brw_context.h | 3 + >> src/mesa/drivers/dri/i965/brw_draw.c | 36 +-- >> src/mesa/drivers/dri/i965/brw_pipe_control.c | 91 +++++++ >> src/mesa/drivers/dri/i965/genX_blorp_exec.c | 11 - >> src/mesa/drivers/dri/i965/intel_blit.c | 16 +- >> src/mesa/drivers/dri/i965/intel_copy_image.c | 10 +- >> src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 25 +- >> src/mesa/drivers/dri/i965/intel_mipmap_tree.h | 2 +- >> src/mesa/drivers/dri/i965/intel_pixel.c | 4 + >> src/mesa/drivers/dri/i965/intel_pixel_bitmap.c | 5 +- >> src/mesa/drivers/dri/i965/intel_pixel_read.c | 7 +- >> src/mesa/drivers/dri/i965/intel_tex_image.c | 10 +- >> src/mesa/drivers/dri/i965/intel_tex_subimage.c | 11 +- >> 16 files changed, 557 insertions(+), 196 deletions(-) >> >> -- >> 2.5.5 >>
signature.asc
Description: PGP signature
_______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev