bleh, seems like max-cc's is still too low on mesa-dev, and some of the patches didn't get through. You can also find them here:
https://github.com/freedreno/mesa/commits/wip-rsq

BR,
-R

On Tue, Jun 14, 2016 at 11:57 AM, Rob Clark <[email protected]> wrote:
> From: Rob Clark <[email protected]>
>
> So, I know there were a couple of concerns voiced over the idea of
> re-ordering rendering in a gallium shim pipe driver layer.  For
> me, the main concern was whether the overhead of an extra layer,
> queueing and replaying state updates, draws, etc, would be
> prohibitive.  So I implemented it enough that I could do some
> benchmarking ;-)
>
> The first 9 patches are just some general API cleanups, which I
> found to be convenient (since the resequencer layer is generating
> most of the state handling with python + mako, so the cleanups to
> improve consistency help minimize the state which requires special
> handling).  But regardless of the outcome of the resequencer
> layer, I think these patches make sense on their own.
>
> (Note: auto-generating some of the other wrapper layers might be
> an interesting future cleanup.. at least it should be trivial
> for noop ;-))
>
> As far as overhead, I've been benchmarking (mostly glmark2 + stk +
> gfxbench), and in the current state (without actually having the
> dependency tracking implemented) it doesn't seem to cause more
> than a couple percent overhead.  From here on out, the remaining
> overhead added to implement the dependency tracking and
> re-ordering would be the same as the additional overhead required
> to implement it in the driver backend.
>
> And a couple percent overhead is small compared to the expected
> gains for games which benefit.. ie. 8MiB for 1080p rgb frame,
> avoiding copying that from tile to memory and back once or twice
> quickly dwarfs an extra copy of some 10's of kB of state.. and
> even more so for (for ex.) f32f32f32f32 intermediate buffers.
>
> Queries are still missing, but I expect what would be required
> to implement them is the same as the logic that would be needed in
> the driver backend otherwise.
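For anyone who wants to check the "8MiB for 1080p" ballpark quoted above, here is a quick back-of-the-envelope sketch. It assumes 4 bytes per pixel for the "rgb" frame (e.g. an RGBA8/RGBX8 style layout) and 16 bytes per pixel for f32f32f32f32. The helper name surface_bytes is made up for illustration and is not part of the series.

```python
MIB = 1024 * 1024


def surface_bytes(width, height, bytes_per_pixel):
    """Size in bytes of one uncompressed width x height surface."""
    return width * height * bytes_per_pixel


# Assumption: the "rgb frame" is stored as 4 bytes/pixel, and
# f32f32f32f32 is four 32-bit floats, i.e. 16 bytes/pixel.
rgba8 = surface_bytes(1920, 1080, 4)
rgba32f = surface_bytes(1920, 1080, 16)

print(f"1080p, 4 B/px:  {rgba8 / MIB:.2f} MiB")
print(f"1080p, 16 B/px: {rgba32f / MIB:.2f} MiB")

# One avoidable tile->memory store plus the matching memory->tile
# restore moves the whole surface twice.
print(f"store + restore, 4 B/px: {2 * rgba8 / MIB:.2f} MiB")
```

Under those assumptions the 4 bytes/pixel frame comes out at about 7.91 MiB and the float buffer at about 31.64 MiB, so each avoided round trip is hundreds of times larger than a few tens of kB of copied state.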
>
> Basically, the only concern I have, compared to the approach of
> implementing the dependency tracking in each driver backend, is
> pipe_constant_buffer::user_buffer.  Currently both freedreno and
> vc4 want non-UBO constant buffers to be emitted in cmdstream.
> In the adreno case, it looks like a3xx/a4xx should also support
> the non-user_buffer case, although in fact this appears to be
> broken (at least on a4xx) and I've never seen the blob driver use
> this.  At the moment I'm doing a hack in freedreno to map the
> backing fd_bo and then memcpy it into cmdstream.  Which is a
> bit silly (since it is a write-combine buffer I'm copying from).
> But in glmark I had trouble even measuring the overhead of this
> extra copy.  Although possibly I need to find something to
> measure which emits more non-UBO constant state.
>
> btw, if someone has some requests for benchmarks to try (provided
> they are available for arm/linux) I'd be happy to try some other
> things.
>
> The plus side of doing this in a separate layer is that we only
> implement the dependency tracking and resource shadowing once,
> instead of both in vc4 and freedreno (and who knows, maybe
> someday someone gets around to writing a lima gallium driver).
> Plus, I envision this to be something that mesa/st wraps the
> pipe_screen with if driconf tells it to, and pscreen->rsq_funcs
> is populated (we at least need a callback to know if resource
> is still busy).  This way we can turn it on for games/apps that
> are known to benefit, and leave it off with zero additional
> overhead for better written things (or rather, things written
> with tilers in mind).
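To make the opt-in idea above a bit more concrete, here is a toy model of it: the screen is wrapped only when the driconf option is set and the driver has populated rsq_funcs, and the wrapping context queues calls and replays them at flush time. This is only a sketch under stated assumptions. Every class and function name here is hypothetical, it is Python rather than the actual C gallium interfaces, and it replays in submission order (no dependency tracking or re-ordering).

```python
class Context:
    """Stand-in for a driver context; it only logs what it is asked to do."""

    def __init__(self):
        self.log = []

    def set_state(self, name, value):
        self.log.append(("set_state", name, value))

    def draw(self, count):
        self.log.append(("draw", count))


class Screen:
    """Stand-in for a driver screen (hypothetical, not struct pipe_screen)."""

    def __init__(self, rsq_funcs=None):
        # None means the driver did not provide the callbacks the layer
        # needs, e.g. a way to ask whether a resource is still busy.
        self.rsq_funcs = rsq_funcs

    def create_context(self):
        return Context()


class DeferredContext:
    """Queues calls instead of executing them, then replays them on flush()."""

    def __init__(self, wrapped):
        self.wrapped = wrapped
        self.queue = []

    def set_state(self, name, value):
        self.queue.append(("set_state", (name, value)))

    def draw(self, count):
        self.queue.append(("draw", (count,)))

    def flush(self):
        # No re-ordering in this sketch: replay in submission order.
        for method, args in self.queue:
            getattr(self.wrapped, method)(*args)
        self.queue.clear()


class DeferredScreen:
    """Wrapping screen whose contexts defer work to the wrapped driver."""

    def __init__(self, wrapped):
        self.wrapped = wrapped

    def create_context(self):
        return DeferredContext(self.wrapped.create_context())


def maybe_wrap(screen, driconf_enabled):
    """Wrap only if the user opted in AND the driver supplied callbacks."""
    if driconf_enabled and screen.rsq_funcs is not None:
        return DeferredScreen(screen)
    return screen  # untouched, so no extra layer and no extra overhead


if __name__ == "__main__":
    driver = Screen(rsq_funcs={"is_busy": lambda resource: False})
    ctx = maybe_wrap(driver, driconf_enabled=True).create_context()
    ctx.set_state("blend", "opaque")
    ctx.draw(3)
    print("before flush:", ctx.wrapped.log)
    ctx.flush()
    print("after flush: ", ctx.wrapped.log)
```

The point of returning the original screen untouched in the opt-out path is that apps which don't need the layer never go through it at all.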
>
>
> Rob Clark (10):
>   gallium: cleanup set_tess_state
>   gallium: make shader_buffers const
>   gallium: make constant_buffer const
>   gallium: make image_view const
>   gallium: change end_query() to return boolean
>   gallium/util: add util_copy_index_buffer() helper
>   gallium/util: add util_copy_shader_buffer() helper
>   gallium/util: add util_copy_vertex_buffer helper
>   gallium/util: make util_copy_framebuffer_state(src=NULL) work
>   RFC: gallium: add resequencer driver (INCOMPLETE)
>
>  configure.ac | 1 +
>  src/gallium/auxiliary/util/u_framebuffer.c | 37 +-
>  src/gallium/auxiliary/util/u_helpers.c | 15 -
>  src/gallium/auxiliary/util/u_helpers.h | 3 -
>  src/gallium/auxiliary/util/u_inlines.h | 49 ++
>  src/gallium/drivers/ddebug/dd_context.c | 15 +-
>  src/gallium/drivers/freedreno/freedreno_query.c | 2 +-
>  src/gallium/drivers/freedreno/freedreno_state.c | 13 +-
>  src/gallium/drivers/i915/i915_query.c | 2 +-
>  src/gallium/drivers/i915/i915_state.c | 8 +-
>  src/gallium/drivers/ilo/ilo_query.c | 2 +-
>  src/gallium/drivers/ilo/ilo_state.c | 14 +-
>  src/gallium/drivers/llvmpipe/lp_query.c | 2 +-
>  src/gallium/drivers/llvmpipe/lp_state_fs.c | 2 +-
>  src/gallium/drivers/llvmpipe/lp_state_vertex.c | 6 +-
>  src/gallium/drivers/noop/noop_pipe.c | 2 +-
>  src/gallium/drivers/noop/noop_state.c | 2 +-
>  src/gallium/drivers/nouveau/nv30/nv30_query.c | 2 +-
>  src/gallium/drivers/nouveau/nv30/nv30_state.c | 13 +-
>  src/gallium/drivers/nouveau/nv50/nv50_query.c | 2 +-
>  src/gallium/drivers/nouveau/nv50/nv50_state.c | 2 +-
>  src/gallium/drivers/nouveau/nvc0/nvc0_state.c | 25 +-
>  src/gallium/drivers/r300/r300_query.c | 4 +-
>  src/gallium/drivers/r300/r300_state.c | 10 +-
>  src/gallium/drivers/r600/evergreen_state.c | 7 +-
>  src/gallium/drivers/r600/r600_state_common.c | 7 +-
>  src/gallium/drivers/radeon/r600_query.c | 2 +-
>  src/gallium/drivers/radeonsi/si_descriptors.c | 16 +-
>  src/gallium/drivers/radeonsi/si_state.c | 13 +-
>  src/gallium/drivers/radeonsi/si_state.h | 3 +-
>  src/gallium/drivers/rbug/rbug_context.c | 6 +-
>  src/gallium/drivers/resequencer/.gitignore | 2 +
>  src/gallium/drivers/resequencer/Makefile.am | 44 ++
>  src/gallium/drivers/resequencer/Makefile.sources | 23 +
>  src/gallium/drivers/resequencer/rsq_batch.c | 144 +++++
>  src/gallium/drivers/resequencer/rsq_batch.h | 71 +++
>  src/gallium/drivers/resequencer/rsq_context.c | 457 ++++++++++++++++
>  src/gallium/drivers/resequencer/rsq_context.h | 84 +++
>  src/gallium/drivers/resequencer/rsq_draw.c | 230 ++++++++
>  src/gallium/drivers/resequencer/rsq_draw.h | 40 ++
>  src/gallium/drivers/resequencer/rsq_fence.c | 48 ++
>  src/gallium/drivers/resequencer/rsq_fence.h | 43 ++
>  src/gallium/drivers/resequencer/rsq_public.h | 68 +++
>  src/gallium/drivers/resequencer/rsq_query.c | 148 +++++
>  src/gallium/drivers/resequencer/rsq_query.h | 32 ++
>  src/gallium/drivers/resequencer/rsq_resource.c | 222 ++++++++
>  src/gallium/drivers/resequencer/rsq_resource.h | 60 ++
>  src/gallium/drivers/resequencer/rsq_screen.c | 186 +++++++
>  src/gallium/drivers/resequencer/rsq_screen.h | 50 ++
>  src/gallium/drivers/resequencer/rsq_state.py | 607 +++++++++++++++++++++
>  .../drivers/resequencer/rsq_state_helpers.h | 219 ++++++++
>  src/gallium/drivers/resequencer/rsq_surface.c | 107 ++++
>  src/gallium/drivers/resequencer/rsq_surface.h | 72 +++
>  src/gallium/drivers/softpipe/sp_query.c | 2 +-
>  src/gallium/drivers/softpipe/sp_state_image.c | 10 +-
>  src/gallium/drivers/softpipe/sp_state_shader.c | 2 +-
>  src/gallium/drivers/softpipe/sp_state_vertex.c | 6 +-
>  src/gallium/drivers/svga/svga_pipe_constants.c | 2 +-
>  src/gallium/drivers/svga/svga_pipe_query.c | 2 +-
>  src/gallium/drivers/svga/svga_pipe_vertex.c | 2 +-
>  src/gallium/drivers/swr/swr_query.cpp | 2 +-
>  src/gallium/drivers/swr/swr_state.cpp | 9 +-
>  src/gallium/drivers/trace/tr_context.c | 15 +-
>  src/gallium/drivers/vc4/vc4_query.c | 2 +-
>  src/gallium/drivers/vc4/vc4_state.c | 13 +-
>  src/gallium/drivers/virgl/virgl_context.c | 10 +-
>  src/gallium/drivers/virgl/virgl_query.c | 4 +-
>  src/gallium/include/pipe/p_context.h | 12 +-
>  src/gallium/include/pipe/p_state.h | 8 +
>  src/mesa/state_tracker/st_atom_tess.c | 13 +-
>  70 files changed, 3148 insertions(+), 210 deletions(-)
>  create mode 100644 src/gallium/drivers/resequencer/.gitignore
>  create mode 100644 src/gallium/drivers/resequencer/Makefile.am
>  create mode 100644 src/gallium/drivers/resequencer/Makefile.sources
>  create mode 100644 src/gallium/drivers/resequencer/rsq_batch.c
>  create mode 100644 src/gallium/drivers/resequencer/rsq_batch.h
>  create mode 100644 src/gallium/drivers/resequencer/rsq_context.c
>  create mode 100644 src/gallium/drivers/resequencer/rsq_context.h
>  create mode 100644 src/gallium/drivers/resequencer/rsq_draw.c
>  create mode 100644 src/gallium/drivers/resequencer/rsq_draw.h
>  create mode 100644 src/gallium/drivers/resequencer/rsq_fence.c
>  create mode 100644 src/gallium/drivers/resequencer/rsq_fence.h
>  create mode 100644 src/gallium/drivers/resequencer/rsq_public.h
>  create mode 100644 src/gallium/drivers/resequencer/rsq_query.c
>  create mode 100644 src/gallium/drivers/resequencer/rsq_query.h
>  create mode 100644 src/gallium/drivers/resequencer/rsq_resource.c
>  create mode 100644 src/gallium/drivers/resequencer/rsq_resource.h
>  create mode 100644 src/gallium/drivers/resequencer/rsq_screen.c
>  create mode 100644 src/gallium/drivers/resequencer/rsq_screen.h
>  create mode 100644 src/gallium/drivers/resequencer/rsq_state.py
>  create mode 100644 src/gallium/drivers/resequencer/rsq_state_helpers.h
>  create mode 100644 src/gallium/drivers/resequencer/rsq_surface.c
>  create mode 100644 src/gallium/drivers/resequencer/rsq_surface.h
>
> --
> 2.5.5
>
_______________________________________________
mesa-dev mailing list
[email protected]
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
