Am 27.08.2013 23:52, schrieb Vadim Girlin:
> On 08/28/2013 12:43 AM, Marek Olšák wrote:
>> Shader variants are BAD, BAD, BAD. Have you ever played an AAA game
>> with a Mesa driver that likes to compile shader variants on first use?
>> It's HORRIBLE.
> 
> I don't think that shader variants are bad, but it's definitely bad when
> we are compiling variants that are never used. Currently glxgears
> compiles 18 ps/vs shaders. In my branch with initial GS support [1] I
> switched handling of the shaders to deferred compilation, that is,
> shaders are compiled only before the actual draw. I found later that
> it's not really required for GS, but IIRC this change results in only 5
> shaders being compiled for glxgears instead of 18. It seems most of the
> useless variants are results of state changes between creation of the
> shader state (initial compilation) and actual draw call.
> 
> I had some concerns about increased overhead with those changes, and
> it's actually noticeable with drawoverhead demo, but I didn't see any
> regressions with a few real apps that I tested, e.g. glxgears even
> showed slightly better performance with these changes. Probably I also
> implemented it in a not very optimal way (I was mostly concentrated on
> GS support) and the overhead can be reduced.
> 
> One more thing is duplicate shaders, I've analyzed shader dumps from
> Unigine Heaven 3.0 some time ago and found that from about 320 compiled
> shaders, only about 180 (50%) were unique, others were duplicates
> (detected by comparing the bytecode dumps for them in an automated way),
> maybe they had different shader keys (which still resulted in the same
> bytecode), but I suspect duplicate pipe shaders were also involved.
> Unfortunately I didn't have a time to investigate it more thoroughly
> since then.
> 
> So my point is that we don't really need to eliminate shader variants,
> first we need to eliminate compilation of unused variants and duplicate
> shaders. Also we might want to consider offloading of the compilation to
> separate thread(s) and caching of shader binaries between runs.

Hmm ok that seems a way more complicated problem than I thought :-).
Compile early and you might compile variants you will never use, compile
late and the delay might be noticeable.
I just thought it might be unlikely you'd actually need two variants -
e.g. some depth exporting shader is probably unlikely to use alpha test.
But ok I guess it shouldn't write color in this case, so even then it
might never be worth bothering. Was just a random idea ;-).

Roland


> 
> Vadim
> 
>  [1] http://cgit.freedesktop.org/~vadimg/mesa/log/?h=r600-geom-shaders
> 
>>
>> What the patch does is probably the right solution. At least
>> alpha-test state changes don't cause shader recompilation and
>> re-binding, which also negatively affects performance. Ideally we
>> shouldn't depend on the framebuffer state at all, but we need to
>> emulate the TGSI property FS_COLOR0_WRITES_ALL_CBUFS. I think we
>> should always be fine with key.nr_cbufs forced to 8 for any shader
>> without that property. I expect app developers to do the right thing
>> and not write outputs they don't need.
>>
>> Marek
>>
>> On Tue, Aug 27, 2013 at 9:00 PM, Roland Scheidegger
>> <srol...@vmware.com> wrote:
>>> Not that I'm qualified to review r600 code, but couldn't you create
>>> different shader variants depending on whether you need alpha test? At
>>> least I would assume shader exports aren't free.
>>>
>>> Roland
>>>
>>> Am 27.08.2013 19:56, schrieb Vadim Girlin:
>>>> We need to export at least one color if the shader writes it,
>>>> even when nr_cbufs==0.
>>>>
>>>> Signed-off-by: Vadim Girlin <vadimgir...@gmail.com>
>>>> ---
>>>>
>>>> Tested on evergreen with multiple combinations of backends - no
>>>> regressions,
>>>> fixes some tests:
>>>>
>>>>    default    - fixes fb-alphatest-nocolor and fb_alphatest-nocolor-ff
>>>>    default+sb - fixes fb-alphatest-nocolor and fb_alphatest-nocolor-ff
>>>>    llvm       - fixes about 25 tests related to depth/stencil
>>>>    llvm+sb    - fixes about 300 tests (llvm's depth/stencil issues and
>>>>                 regressions cased by reordering of exports in sb)
>>>>
>>>> With this patch, there are no regressions with default+sb vs default.
>>>> There is one regression with llvm+sb vs llvm -
>>>> fs-texturegrad-miplevels,
>>>> AFAICS it's a problem with llvm backend uncovered by sb -
>>>> SET_GRADIENTS_V/H
>>>> instructions are not placed in the same TEX clause with
>>>> corresponding SAMPLE_G.
>>>>
>>>>   src/gallium/drivers/r600/r600_shader.c | 7 ++++---
>>>>   1 file changed, 4 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/src/gallium/drivers/r600/r600_shader.c
>>>> b/src/gallium/drivers/r600/r600_shader.c
>>>> index 300b5c4..f7eab76 100644
>>>> --- a/src/gallium/drivers/r600/r600_shader.c
>>>> +++ b/src/gallium/drivers/r600/r600_shader.c
>>>> @@ -918,6 +918,7 @@ static int r600_shader_from_tgsi(struct
>>>> r600_screen *rscreen,
>>>>        unsigned opcode;
>>>>        int i, j, k, r = 0;
>>>>        int next_pos_base = 60, next_param_base = 0;
>>>> +     int max_color_exports = MAX2(key.nr_cbufs, 1);
>>>>        /* Declarations used by llvm code */
>>>>        bool use_llvm = false;
>>>>        bool indirect_gprs;
>>>> @@ -1130,7 +1131,7 @@ static int r600_shader_from_tgsi(struct
>>>> r600_screen *rscreen,
>>>>                radeon_llvm_ctx.face_gpr = ctx.face_gpr;
>>>>                radeon_llvm_ctx.r600_inputs = ctx.shader->input;
>>>>                radeon_llvm_ctx.r600_outputs = ctx.shader->output;
>>>> -             radeon_llvm_ctx.color_buffer_count = MAX2(key.nr_cbufs
>>>> , 1);
>>>> +             radeon_llvm_ctx.color_buffer_count = max_color_exports;
>>>>                radeon_llvm_ctx.chip_class = ctx.bc->chip_class;
>>>>                radeon_llvm_ctx.fs_color_all = shader->fs_write_all
>>>> && (rscreen->chip_class >= EVERGREEN);
>>>>                radeon_llvm_ctx.stream_outputs = &so;
>>>> @@ -1440,7 +1441,7 @@ static int r600_shader_from_tgsi(struct
>>>> r600_screen *rscreen,
>>>>                case TGSI_PROCESSOR_FRAGMENT:
>>>>                        if (shader->output[i].name ==
>>>> TGSI_SEMANTIC_COLOR) {
>>>>                                /* never export more colors than the
>>>> number of CBs */
>>>> -                             if (shader->output[i].sid >=
>>>> key.nr_cbufs) {
>>>> +                             if (shader->output[i].sid >=
>>>> max_color_exports) {
>>>>                                        /* skip export */
>>>>                                        j--;
>>>>                                        continue;
>>>> @@ -1450,7 +1451,7 @@ static int r600_shader_from_tgsi(struct
>>>> r600_screen *rscreen,
>>>>                                output[j].type =
>>>> V_SQ_CF_ALLOC_EXPORT_WORD0_SQ_EXPORT_PIXEL;
>>>>                                shader->nr_ps_color_exports++;
>>>>                                if (shader->fs_write_all &&
>>>> (rscreen->chip_class >= EVERGREEN)) {
>>>> -                                     for (k = 1; k < key.nr_cbufs;
>>>> k++) {
>>>> +                                     for (k = 1; k <
>>>> max_color_exports; k++) {
>>>>                                                j++;
>>>>                                                memset(&output[j], 0,
>>>> sizeof(struct r600_bytecode_output));
>>>>                                                output[j].gpr =
>>>> shader->output[i].gpr;
>>>>
>>> _______________________________________________
>>> mesa-dev mailing list
>>> mesa-dev@lists.freedesktop.org
>>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>> _______________________________________________
>> mesa-dev mailing list
>> mesa-dev@lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>>
> 
> _______________________________________________
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Reply via email to