On Thu, Jul 5, 2018 at 2:18 PM, Jason Ekstrand <ja...@jlekstrand.net> wrote:
> On Thu, Jul 5, 2018 at 11:03 AM, Francisco Jerez <curroje...@riseup.net>
> wrote:
>
>> Jason Ekstrand <ja...@jlekstrand.net> writes:
>>
>> > On Wed, Jul 4, 2018 at 1:20 PM, Francisco Jerez <curroje...@riseup.net>
>> > wrote:
>> >
>> >> Jason Ekstrand <ja...@jlekstrand.net> writes:
>> >>
>> >> > Many fragment shaders do a discard using relatively little information
>> >> > but still put the discard fairly far down in the shader for no good
>> >> > reason. If the discard is moved higher up, we can possibly avoid doing
>> >> > some or almost all of the work in the shader. When this lets us skip
>> >> > texturing operations, it's an especially high win.
>> >> >
>> >> > One of the biggest offenders here is DXVK. The D3D APIs have different
>> >> > rules for discards than OpenGL and Vulkan. One effective way (which is
>> >> > what DXVK uses) to implement DX behavior on top of GL or Vulkan is to
>> >> > wait until the very end of the shader to discard. This ends up in the
>> >> > pessimal case where we always do all of the work before discarding.
>> >> > This pass helps some DXVK shaders significantly.
>> >> >
>> >>
>> >> One thing to keep in mind is that this sort of transformation is trading
>> >> off run-time of fragment shader invocations that don't call discard (or
>> >> do so non-uniformly, which means that the code the discard jump is
>> >> protecting will be executed anyway, so doing this can actually increase
>> >> the critical path of the program) in favour of invocations that call
>> >> discard uniformly (so executing discard early will effectively terminate
>> >> the program early).
>> >
>> >
>> > It's not really a uniform vs. non-uniform thing. Even if a shader only
>> > discards some of the fragments, it still reduces the number of live channels
>> > which reduces the cost of later non-uniform control-flow.
>> >
>>
>> Which only helps if the shader's control flow is sufficiently
>> non-uniform that the additional cost from performing those computations
>> early pays off -- Or not at all if the discarded fragments need to be
>> executed (non-compliantly) anyway in order to provide
>> derivatives_safe_after_discard. However, if the discard condition is
>> uniform (across a warp), the thread can be terminated early by the
>> back-end most certainly, which gives you the maximum pay-off. Uniform
>> discard conditions are therefore the best-case scenario for this
>> optimization pass.
>>
>
> Yes, that is correct. Fortunately, things that discard tend to discard
> fairly large chunks of the polygon at one time so this case is fairly
> common.
>
>
>> >
>> >> Optimizing for the latter case is an essentially
>> >> heuristic assumption that needs to be verified experimentally. Have you
>> >> tested the effect of this pass on non-DX workloads extensively?
>> >>
>> >
>> > Yes, it is a trade-off. No, I have not done particularly extensive
>> > testing. We do, however, know of non-DXVK workloads that would benefit
>> > from this. I believe Manhattan is one such example though I have not yet
>> > benchmarked it.
>> >
>>
>> You should grab some numbers then to make sure there are no
>> regressions...
>
>
> I'm working on that. Unfortunately the perf system is giving me trouble
> so I don't have the numbers yet.
>
>
>> But keep in mind that the i965 scheduler is already
>> performing a similar optimization (locally, but with cycle-count
>> information). This will only help over the existing optimization if the
>> shaders that represent a bottleneck in Manhattan have sufficient control
>> flow for the basic block boundaries to represent a problem to the
>> (local) scheduler.
>>
>
> I'm not sure about the manhattan shader but the Skyrim shader does have
> control flow which the discard has to get moved above.
>

I have results from the perf system now and somehow this pass makes
Manhattan noticeably worse. I'll look into that.
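
For reference, the trade-off discussed in the quoted thread can be put into a toy cost model. This is only a sketch under stated assumptions: `group_cost`, its cost units, and the `hoist_penalty` parameter are made up for illustration and do not reflect how the pass or the i965 scheduler actually estimate cost. It also ignores the point that fewer live channels can make later non-uniform control flow cheaper.

```python
def group_cost(discards, cond_cost, rest_cost, hoisted, hoist_penalty=0):
    """Toy cost (arbitrary units) of one SIMD group of fragment invocations.

    discards:      one bool per channel, True if that fragment is discarded
    cond_cost:     work needed to evaluate the discard condition
    rest_cost:     remaining work (e.g. texturing) the condition does not need
    hoisted:       True if the discard was moved above the remaining work
    hoist_penalty: hypothetical extra cost of computing the condition early
                   (e.g. a longer critical path); an assumption of this model
    """
    if not hoisted:
        # Discard at the very end: every group pays for all of the work.
        return cond_cost + rest_cost
    cost = cond_cost + hoist_penalty
    if not all(discards):
        # At least one channel is still live, so the group as a whole
        # still executes the remaining work.
        cost += rest_cost
    return cost


uniform = [True] * 8                # every channel discards
mixed = [True] * 4 + [False] * 4    # only some channels discard

print(group_cost(uniform, 10, 90, hoisted=False))                 # 100
print(group_cost(uniform, 10, 90, hoisted=True, hoist_penalty=5))  # 15
print(group_cost(mixed, 10, 90, hoisted=True, hoist_penalty=5))    # 105
```

In this model the uniform case gets the full pay-off, while a group that discards non-uniformly saves nothing and pays any hoisting penalty on top, which is the shape of the concern raised above. Whether real workloads behave this way has to come from measurements.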
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev