On 22.06.2017 at 18:22, Marek Olšák wrote:
> On Thu, Jun 22, 2017 at 6:13 PM, Alex Smith <[email protected]> wrote:
>> On 22 June 2017 at 15:52, Roland Scheidegger <[email protected]> wrote:
>>> On 22.06.2017 at 13:09, Nicolai Hähnle wrote:
>>>> On 22.06.2017 10:14, Michel Dänzer wrote:
>>>>> On 22/06/17 04:34 PM, Nicolai Hähnle wrote:
>>>>>> On 22.06.2017 03:38, Rob Clark wrote:
>>>>>>> On Wed, Jun 21, 2017 at 8:15 PM, Marek Olšák <[email protected]> wrote:
>>>>>>>> On Wed, Jun 21, 2017 at 10:37 PM, Rob Clark <[email protected]> wrote:
>>>>>>>>> On Tue, Jun 20, 2017 at 6:54 PM, Marek Olšák <[email protected]> wrote:
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> This series updates the pipe loaders so that flags such as drirc
>>>>>>>>>> options can be passed to create_screen(). I have compile-tested
>>>>>>>>>> everything except clover.
>>>>>>>>>>
>>>>>>>>>> The first pipe_screen flag is a drirc option to fix incorrect
>>>>>>>>>> grass rendering in Rocket League on radeonsi. Rocket League
>>>>>>>>>> expects the DirectX behavior for partial derivative computations
>>>>>>>>>> after discard/kill, but radeonsi implements the more efficient,
>>>>>>>>>> stricter OpenGL behavior, and that will remain our default. The
>>>>>>>>>> new screen flag forces radeonsi to use the DX behavior for that
>>>>>>>>>> game.
>>>>>>>>>>
>>>>>>>>> do we really want this to be a *global* option for the screen?
>>>>>>>>
>>>>>>>> Yes. Shaders are pipe_screen (global) objects in radeonsi, so a
>>>>>>>> compiler option also has to be global. We can't look at the context
>>>>>>>> during the TGSI->LLVM translation.
>>>>>>>
>>>>>>> well, I didn't really mean per-screen vs per-context, as much as
>>>>>>> per-screen vs per-shader (or maybe more per-screen vs
>>>>>>> per-instruction?)
>>>>>>
>>>>>> I honestly don't think it's worth the trouble. Applications that are
>>>>>> properly coded against GLSL can benefit from the relaxed semantics,
>>>>>> and applications that get it wrong in one shader are rather likely to
>>>>>> get it wrong everywhere.
>>>>>>
>>>>>> Since GLSL simply says derivatives are undefined after non-uniform
>>>>>> discard, and this option makes them defined instead, setting this
>>>>>> flag can never break the behavior of a correctly written shader.
>>>>>
>>>>> BTW, how expensive is the radeonsi workaround when it isn't needed?
>>>>>
>>>>> I'm starting to wonder if we shouldn't just make it always safe and
>>>>> call it a day, saving the trouble of identifying broken apps and
>>>>> plumbing the info through the API layers...
>>>>
>>>> As-is, the workaround can be *very* expensive in the worst case. A
>>>> large number of pixels could be disabled by a discard early in the
>>>> shader, and we're now moving the discard down, which means a lot of
>>>> unnecessary texture fetches may happen.
>>>>
>>>> Also, I think I spoke too soon about this flag not having negative
>>>> effects: if a shader has an image/buffer write after a discard, that
>>>> write is now no longer disabled.
>>>>
>>>> A more efficient workaround can be done at the LLVM level by doing the
>>>> discard early, but then re-enabling WQM "relative to" the new set of
>>>> active pixels. It's a bit involved, especially when the discard itself
>>>> happens in a branch, and still a little more expensive, but it's an
>>>> option.
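
To make that concrete, here is roughly the kind of shader pattern this is
about, as a minimal GLSL sketch (the texture names and the 0.5 cutoff are
made up for illustration, this is not from any real game shader):

    #version 330 core

    uniform sampler2D uAlbedo;   // made-up bindings, illustration only
    uniform sampler2D uDetail;

    in vec2 vUV;
    out vec4 fragColor;

    void main()
    {
        vec4 albedo = texture(uAlbedo, vUV);

        // Alpha-test style cutout: a non-uniform discard.
        if (albedo.a < 0.5)
            discard;

        // GLSL: implicit derivatives (and therefore the LOD picked for
        // this fetch) are undefined here, because neighbouring pixels in
        // the 2x2 quad may already have been killed. The behavior the
        // game expects is that the quad keeps running and derivatives
        // stay well defined.
        vec4 detail = texture(uDetail, vUV * 8.0);

        fragColor = albedo * detail;
    }

The "move the discard down" workaround described above effectively turns
main() into something like this (same declarations as above), keeping the
whole quad alive until the end at the cost of losing the early-out:

    void main()
    {
        vec4 albedo = texture(uAlbedo, vUV);
        bool kill = (albedo.a < 0.5);

        // Derivatives are well defined again, but every pixel now pays
        // for the detail fetch even when its result is thrown away.
        vec4 detail = texture(uDetail, vUV * 8.0);

        if (kill)
            discard;

        fragColor = albedo * detail;
    }
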
>>>>
>>>
>>> I'm wondering what your driver for the other OS does (afaik dx10 is
>>> really the odd one out; GLSL, SPIR-V, even Metal all leave derivatives
>>> undefined after non-uniform discards). Thinking surely there must be
>>> something clever you could do...
>>
>> I'm wondering the same.
>>
>> This is an issue we come across from time to time, where a game's
>> shaders expect the D3D behaviour of derivatives remaining defined
>> post-discard. For this we usually do essentially what this workaround
>> is doing: just postpone the discard until the very end of the shader.
>>
>> However, doing this seems to be slower than the original shaders
>> running on D3D. In one case I've seen, a delayed discard (which was
>> being used early in a complex shader to cull a lot of unneeded pixels)
>> caused a big performance loss relative to D3D, on both AMD and NVIDIA.
>>
>> Given that, I've wondered whether there's something clever the D3D
>> drivers are doing to optimise this. Maybe, for example, discarding
>> immediately if all pixels in a quad used for derivative calculations
>> get discarded? Is something like that possible on AMD hardware?
>
> Yes, it's possible but not implemented in LLVM yet.
>
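
For what it's worth, the quad idea, expressed in GLSL purely to illustrate
the logic (in practice this would live in the compiler/driver; the subgroup
quad ops are just a stand-in for whatever the hardware exposes, and the
texture names and cutoff are again made up):

    #version 450
    #extension GL_KHR_shader_subgroup_quad : require

    layout(binding = 0) uniform sampler2D uAlbedo;   // made-up resources
    layout(binding = 1) uniform sampler2D uDetail;
    layout(location = 0) in vec2 vUV;
    layout(location = 0) out vec4 fragColor;

    void main()
    {
        vec4 albedo = texture(uAlbedo, vUV);
        bool wantKill = albedo.a < 0.5;

        // If the entire 2x2 quad wants to die, no surviving pixel will
        // ever ask these lanes for a derivative, so it is safe to discard
        // right away and keep the early-out of the original shader.
        bool quadKill = wantKill
                     && subgroupQuadSwapHorizontal(wantKill)
                     && subgroupQuadSwapVertical(wantKill)
                     && subgroupQuadSwapDiagonal(wantKill);
        if (quadKill)
            discard;

        // Otherwise fall back to the delayed kill: keep the pixel running
        // so its neighbours still get defined derivatives, and only
        // suppress its writes at the very end.
        vec4 detail = texture(uDetail, vUV * 8.0);

        if (wantKill)
            discard;

        fragColor = albedo * detail;
    }

Note this only shows the quad-granularity logic; under strict GLSL rules the
early discard still leaves later implicit derivatives formally undefined, so
it is really something for the driver/compiler rather than the app.
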
Although if you wanted to do it correctly in the app, I'm not sure how
you could achieve that...

Roland

_______________________________________________
mesa-dev mailing list
[email protected]
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
