Re: [Mesa-dev] [PATCH 00/13] RadeonSI: Reduce user SGPR usage

Nicolai Hähnle Mon, 26 Feb 2018 02:21:55 -0800

Nothing wrong with them, really. All the LLVM stuff has caused me to
fall behind on going through Mesa patches, sorry for that. Patches 9-13
are finally


Reviewed-by: Nicolai Hähnle <[email protected]>


On 25.02.2018 02:04, Marek Olšák wrote:

So what is wrong with patches 9-13?

We can do cleanups after those.

Marek

On Thu, Feb 22, 2018 at 5:17 PM, Marek Olšák <[email protected]> wrote:

I don't think that adding "uint32_t userdata_XX[16];" would simplify anything.

The bottom line is, patches 9-13 are prerequisites for VBO descriptors
in user SGPRs, so they block that optimization as long as they sit on
the mailing list.

Marek

On Tue, Feb 20, 2018 at 8:51 PM, Marek Olšák <[email protected]> wrote:

The user SGPRs for blits are kinda a separate thing where the standard
emit paths are disabled. 64-bit pointers are a short-term issue and
will be removed in 2 years (or 1.5 years or when we want to kill off
old LLVM support). VBO descriptors in user SGPRs will require 32-bit
pointers. Next-gen will also require 32-bit pointers. The number of
codepaths will be reduced to merged/non-merged and mono/non-mono
again. For gfx9 and later, the only codepaths will be mono/non-mono.

There will just be a transitory period when both 32-bit and 64-bit
pointers will be supported, and both the old and new way of setting up
VBO descriptors will be supported. However, next-gen will only support
one way - the newer way.

Overall, I don't see an increase in complexity other than the transitory period.

Marek

On Tue, Feb 20, 2018 at 5:46 PM, Nicolai Hähnle <[email protected]> wrote:

With a small comment on patch 6, patches 1-8:

Reviewed-by: Nicolai Hähnle <[email protected]>

for now.

However, I'm unhappy about how complex this is all getting. 32- vs. 64-bit,
merged vs. non-merged, monolithic vs. non-monolithic, and then special user
SGPR uses like for blits and soon VBO descriptors, it feels like it's
becoming too much.

The problem is I don't have a good answer to it all :)

Perhaps some of it could be helped by having an explicit userdata staging
area, i.e.

   uint32_t userdata_XX[16]; // or 32
   uint32_t userdata_XX_dirty;

Then si_upload_descriptors would write its pointers into userdata_XX in the
right location and set the appropriate dirty bit(s), and a separate
emit_userdata function would use the contiguous bit scan to actually emit
all the userdata together -- this would include VS state bits, tess state
info, and blit shader SGPRs.
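
The staging idea above could be sketched roughly as follows. This is a minimal, self-contained illustration and not actual radeonsi code: struct userdata_stage, userdata_set, scan_consecutive_range, emit_userdata and the log_range callback are all hypothetical names, and the callback only prints where a real driver would write SH registers.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define NUM_USERDATA 16 /* assumption: 16 user SGPRs per stage */

/* Hypothetical staging area for one shader stage. */
struct userdata_stage {
   uint32_t regs[NUM_USERDATA]; /* staged user SGPR values */
   uint32_t dirty;              /* bit i set => regs[i] must be emitted */
};

/* Stage one dword and mark it dirty; nothing is emitted yet. */
static void userdata_set(struct userdata_stage *s, unsigned index,
                         uint32_t value)
{
   assert(index < NUM_USERDATA);
   s->regs[index] = value;
   s->dirty |= 1u << index;
}

/* Remove the lowest run of consecutive set bits from *mask and return
 * its start and length. *mask must be non-zero. */
static void scan_consecutive_range(uint32_t *mask, unsigned *start,
                                   unsigned *count)
{
   unsigned s = 0, c = 0;

   while (!((*mask >> s) & 1u))
      s++;
   while (s + c < 32 && ((*mask >> (s + c)) & 1u))
      c++;
   for (unsigned i = 0; i < c; i++)
      *mask &= ~(1u << (s + i));

   *start = s;
   *count = c;
}

/* Callback type: write `count` consecutive registers starting at
 * `first_reg`. */
typedef void (*emit_range_fn)(unsigned first_reg, const uint32_t *values,
                              unsigned count, void *data);

/* Emit every dirty register, one write per contiguous dirty range.
 * Returns the number of range writes issued. */
static unsigned emit_userdata(struct userdata_stage *s, emit_range_fn emit,
                              void *data)
{
   unsigned writes = 0;

   while (s->dirty) {
      unsigned start, count;

      scan_consecutive_range(&s->dirty, &start, &count);
      emit(start, &s->regs[start], count, data);
      writes++;
   }
   return writes;
}

/* Stand-in for a real register write: just log the range. */
static void log_range(unsigned first_reg, const uint32_t *values,
                      unsigned count, void *data)
{
   (void)data;
   printf("write %u dword(s) at userdata %u:", count, first_reg);
   for (unsigned i = 0; i < count; i++)
      printf(" 0x%x", (unsigned)values[i]);
   printf("\n");
}
```

With this layout, staging indices 2, 3 and 8 and then calling emit_userdata(&stage, log_range, NULL) issues two range writes instead of three, which is the coalescing mentioned as a side bonus.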

I do think this would be cleaner, especially compared to the current
si_emit_shader_pointer_* code, and it would coalesce more SH reg writes as a
side bonus. What do you think?

The other half of it is how the LLVM functions are created.

Thanks,
Nicolai


On 17.02.2018 20:43, Marek Olšák wrote:


Hi,

This series has the following effect on user SGPRs:

64-bit pointers:
      TCS:           14 -> 12
      Merged VS-TCS: 24 -> 20
      Merged VS-GS:  18 -> 16
      Merged TES-GS: 18 -> 14

32-bit pointers:
      TCS:           10 -> 8
      Merged VS-TCS: 16 -> 12
      Merged VS-GS:  11 -> 9
      Merged TES-GS: 11 -> 6

I tested both monolithic and non-monolithic shaders, and both 64-bit
and 32-bit pointers. (4 combinations)

This series is a prerequisite for VBO descriptors in user SGPRs.

Note that merged LS-HS and ES-GS don't even use s[6:7] input SGPRs
yet. Those only provide 40 bits of scalar data (not 64 bits like
s[0:1]).

Please review.

Thanks,
Marek
_______________________________________________
mesa-dev mailing list
[email protected]
https://lists.freedesktop.org/mailman/listinfo/mesa-dev



--
Learn how the world really is,
but never forget how it ought to be.



