On 6/3/19 9:46 AM, Alex Smith wrote:
On Sun, 2 Jun 2019 at 11:59, Bas Nieuwenhuizen <b...@basnieuwenhuizen.nl> wrote:

    On Sun, Jun 2, 2019 at 12:32 PM Alex Smith <asm...@feralinteractive.com> wrote:
    >
    > Put the uncached GTT type at a higher index than the visible VRAM type,
    > rather than having GTT first.
    >
    > When we don't have dedicated VRAM, we don't have a non-visible VRAM
    > type, and the property flags for GTT and visible VRAM are identical.
    > According to the spec, for types with identical flags, we should give
    > the one with better performance a lower index.
    >
    > Previously, apps which follow the spec guidance for choosing a memory
    > type would have picked the GTT type in preference to visible VRAM (all
    > Feral games will do this), and end up with lower performance.
    >
    > On a Ryzen 5 2500U laptop (Raven Ridge), this improves average FPS in
    > the Rise of the Tomb Raider benchmark by up to ~30%. Tested a couple of
    > other (Feral) games and saw similar improvement on those as well.
    >
    > Signed-off-by: Alex Smith <asm...@feralinteractive.com>
    > ---
    > I noticed that the memory types advertised on my Raven laptop looked a
    > bit odd so played around with it and found this. I'm not sure if it is
    > actually expected that the performance difference between visible VRAM
    > and GTT is so large, seeing as it's not dedicated VRAM, but the results
    > are clear (and consistent, tested multiple times).

    AFAIU it is still using different memory paths, with GTT using
    different page tables (those from the CPU, I believe, on APUs) and
    possibly CPU snooping.

    The main risk here seems to be applications pushing driver-internal
    stuff (descriptor sets etc.) out of "VRAM", possibly hitting perf
    elsewhere.


Driver-internal allocations have higher BO priorities than all app allocations; wouldn't that help avoid that? I'm not sure how much effect the priorities actually have...

Priorities shouldn't matter much.

    That said,

    Reviewed-by: Bas Nieuwenhuizen <b...@basnieuwenhuizen.nl>

    > ---
    >  src/amd/vulkan/radv_device.c | 18 +++++++++++++++---
    >  1 file changed, 15 insertions(+), 3 deletions(-)
    >
    > diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
    > index 3cf050ed220..d36ee226ebd 100644
    > --- a/src/amd/vulkan/radv_device.c
    > +++ b/src/amd/vulkan/radv_device.c
    > @@ -171,12 +171,11 @@ radv_physical_device_init_mem_types(struct radv_physical_device *device)
    >                         .heapIndex = vram_index,
    >                 };
    >         }
    > -       if (gart_index >= 0) {
    > +       if (gart_index >= 0 && device->rad_info.has_dedicated_vram) {
    >                 device->mem_type_indices[type_count] = RADV_MEM_TYPE_GTT_WRITE_COMBINE;
    >                 device->memory_properties.memoryTypes[type_count++] = (VkMemoryType) {
    >                         .propertyFlags = VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT |
    > -                       VK_MEMORY_PROPERTY_HOST_COHERENT_BIT |
    > -                       (device->rad_info.has_dedicated_vram ? 0 : VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT),
    > +                       VK_MEMORY_PROPERTY_HOST_COHERENT_BIT,
    >                         .heapIndex = gart_index,
    >                 };
    >         }
    > @@ -189,6 +188,19 @@ radv_physical_device_init_mem_types(struct radv_physical_device *device)
    >                         .heapIndex = visible_vram_index,
    >                 };
    >         }
    > +       if (gart_index >= 0 && !device->rad_info.has_dedicated_vram) {
    > +               /* Put GTT after visible VRAM for GPUs without dedicated VRAM
    > +                * as they have identical property flags, and according to the
    > +                * spec, for types with identical flags, the one with greater
    > +                * performance must be given a lower index. */
    > +               device->mem_type_indices[type_count] = RADV_MEM_TYPE_GTT_WRITE_COMBINE;
    > +               device->memory_properties.memoryTypes[type_count++] = (VkMemoryType) {
    > +                       .propertyFlags = VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT |
    > +                       VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT |
    > +                       VK_MEMORY_PROPERTY_HOST_COHERENT_BIT,
    > +                       .heapIndex = gart_index,
    > +               };
    > +       }
    >         if (gart_index >= 0) {
    >                 device->mem_type_indices[type_count] = RADV_MEM_TYPE_GTT_CACHED;
    >                 device->memory_properties.memoryTypes[type_count++] = (VkMemoryType) {
    > --
    > 2.21.0
    >
    > _______________________________________________
    > mesa-dev mailing list
    > mesa-dev@lists.freedesktop.org
    > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
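
To illustrate why the ordering in the patch matters, here is a minimal sketch of the "first matching type wins" selection that the commit message says spec-following apps use, run against a simplified model of the memory type list. Everything in it is an assumption made for illustration: the PROP_* constants are stand-ins carrying the same bit values as the corresponding VK_MEMORY_PROPERTY_* flags, build_types() and find_memory_type() are hypothetical helpers rather than RADV or Vulkan functions, and the cached GTT type is left out because its flags are not shown in the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins with the same bit values as VK_MEMORY_PROPERTY_*_BIT. */
enum {
    PROP_DEVICE_LOCAL  = 0x1,
    PROP_HOST_VISIBLE  = 0x2,
    PROP_HOST_COHERENT = 0x4,
};

/* Hypothetical labels for the memory types discussed in the thread. */
enum mem_kind { KIND_VRAM, KIND_GTT_WC, KIND_VISIBLE_VRAM };

struct mem_type {
    uint32_t flags;
    enum mem_kind kind;
};

#define MAX_TYPES 4

/* Simplified model of the advertised type order, before and after the patch.
 * Assumes, as the commit message states, that there is no non-visible VRAM
 * type when the GPU has no dedicated VRAM. Returns the number of types. */
static uint32_t build_types(bool has_dedicated_vram, bool patched,
                            struct mem_type out[MAX_TYPES])
{
    const uint32_t host = PROP_HOST_VISIBLE | PROP_HOST_COHERENT;
    uint32_t n = 0;

    assert(out != NULL);

    if (has_dedicated_vram)
        out[n++] = (struct mem_type){ PROP_DEVICE_LOCAL, KIND_VRAM };

    /* GTT comes before visible VRAM on dedicated-VRAM GPUs (the flags
     * differ there), and on every GPU before the patch. */
    if (has_dedicated_vram || !patched)
        out[n++] = (struct mem_type){
            host | (has_dedicated_vram ? 0 : PROP_DEVICE_LOCAL), KIND_GTT_WC };

    out[n++] = (struct mem_type){ PROP_DEVICE_LOCAL | host, KIND_VISIBLE_VRAM };

    /* Patched behaviour without dedicated VRAM: identical flags, so GTT
     * is listed after the faster visible VRAM type. */
    if (!has_dedicated_vram && patched)
        out[n++] = (struct mem_type){ PROP_DEVICE_LOCAL | host, KIND_GTT_WC };

    return n;
}

/* Return the lowest index that is allowed by type_bits and has all the
 * required flags, or -1 if no type matches. */
static int find_memory_type(const struct mem_type *types, uint32_t count,
                            uint32_t type_bits, uint32_t required)
{
    for (uint32_t i = 0; i < count; i++) {
        bool allowed = (type_bits & (1u << i)) != 0;
        bool has_flags = (types[i].flags & required) == required;
        if (allowed && has_flags)
            return (int)i;
    }
    return -1;
}
```

In this model, asking for PROP_HOST_VISIBLE | PROP_HOST_COHERENT without dedicated VRAM returns index 0 both before and after the patch; what changes is which type sits at that index (GTT before, visible VRAM after). With dedicated VRAM the order is unchanged.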


_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev