Re: [Mesa-dev] [PATCH 4/7] winsys/amdgpu: clean up code around BO VM alignment

2018-11-28 Thread Bas Nieuwenhuizen
On Mon, Nov 26, 2018 at 8:54 PM Marek Olšák  wrote:
>
> On Mon, Nov 26, 2018 at 8:46 AM Bas Nieuwenhuizen  wrote:
>>
>> patches 4-7 are
>>
>> Reviewed-by: Bas Nieuwenhuizen 
>>
>> though I agree with Christian that it would be nice to get a case
>> where 7 improves things before we submit it.
>>
>> For patches 1-3 I need to dive into the slab allocator before I have
>> enough knowledge to review.
>
>
> It's actually pretty simple. Each slab buffer contains suballocations of the 
> same size. E.g. one 2M slab can only have 16 128K suballocations. If you want 
> a 64K buffer, you have to allocate a separate 2M slab with 32 64K 
> suballocations. If you need to allocate just 512B, you need another 2MB slab. 
> This wastes memory in 2 ways:
> 1) a 129K buffer will have to be allocated as a 256K buffer, because the 
> allocation size has to be a power of two, wasting space
> 2) there will be too many 2MB slabs, one slab for each size (from 2^9 to 
> 2^18), each having some unused space, wasting space
> 3) having too many 2K and smaller allocations actually decreases memory 
> usage, because the minimum kernel allocation size is 4K
>
> Problem 2 is mitigated by layering slab allocators on top of each other. The 
> top slab allocator can only allocate sizes 2^17 to 2^18, so you only need two 
> 2M slabs to cover all your needs. The middle allocator uses the top allocator 
> and allocates slabs of size 2^17 and uses it to allocate 2^13 to 2^16 sizes. 
> The bottom allocator allocates slabs of size 2^13 and uses it allocate 2^9 to 
> 2^12 sizes.

Sure, just needed to familiarize myself with the code since I don't
want to review just the idea.

1-3 are

Reviewed-by: Bas Nieuwenhuizen 
>
> Marek
>
>
>> On Sat, Nov 24, 2018 at 12:41 AM Marek Olšák  wrote:
>> >
>> > From: Marek Olšák 
>> >
>> > ---
>> >  src/gallium/winsys/amdgpu/drm/amdgpu_bo.c | 9 +++--
>> >  1 file changed, 7 insertions(+), 2 deletions(-)
>> >
>> > diff --git a/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c 
>> > b/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c
>> > index a9271c33ee9..36e2c4ec0dc 100644
>> > --- a/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c
>> > +++ b/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c
>> > @@ -449,24 +449,29 @@ static struct amdgpu_winsys_bo 
>> > *amdgpu_create_bo(struct amdgpu_winsys *ws,
>> > r = amdgpu_bo_alloc(ws->dev, , _handle);
>> > if (r) {
>> >fprintf(stderr, "amdgpu: Failed to allocate a buffer:\n");
>> >fprintf(stderr, "amdgpu:size  : %"PRIu64" bytes\n", size);
>> >fprintf(stderr, "amdgpu:alignment : %u bytes\n", alignment);
>> >fprintf(stderr, "amdgpu:domains   : %u\n", initial_domain);
>> >goto error_bo_alloc;
>> > }
>> >
>> > va_gap_size = ws->check_vm ? MAX2(4 * alignment, 64 * 1024) : 0;
>> > +
>> > +   unsigned vm_alignment = alignment;
>> > +
>> > +   /* Increase the VM alignment for faster address translation. */
>> > if (size > ws->info.pte_fragment_size)
>> > -  alignment = MAX2(alignment, ws->info.pte_fragment_size);
>> > +  vm_alignment = MAX2(vm_alignment, ws->info.pte_fragment_size);
>> > +
>> > r = amdgpu_va_range_alloc(ws->dev, amdgpu_gpu_va_range_general,
>> > - size + va_gap_size, alignment, 0, , 
>> > _handle,
>> > + size + va_gap_size, vm_alignment, 0, , 
>> > _handle,
>> >   (flags & RADEON_FLAG_32BIT ? 
>> > AMDGPU_VA_RANGE_32_BIT : 0) |
>> >  AMDGPU_VA_RANGE_HIGH);
>> > if (r)
>> >goto error_va_alloc;
>> >
>> > unsigned vm_flags = AMDGPU_VM_PAGE_READABLE |
>> > AMDGPU_VM_PAGE_EXECUTABLE;
>> >
>> > if (!(flags & RADEON_FLAG_READ_ONLY))
>> > vm_flags |= AMDGPU_VM_PAGE_WRITEABLE;
>> > --
>> > 2.17.1
>> >
>> > ___
>> > mesa-dev mailing list
>> > mesa-dev@lists.freedesktop.org
>> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 4/7] winsys/amdgpu: clean up code around BO VM alignment

2018-11-26 Thread Marek Olšák
On Mon, Nov 26, 2018 at 8:46 AM Bas Nieuwenhuizen 
wrote:

> patches 4-7 are
>
> Reviewed-by: Bas Nieuwenhuizen 
>
> though I agree with Christian that it would be nice to get a case
> where 7 improves things before we submit it.
>
> For patches 1-3 I need to dive into the slab allocator before I have
> enough knowledge to review.
>

It's actually pretty simple. Each slab buffer contains suballocations of
the same size. E.g. one 2M slab can only have 16 128K suballocations. If
you want a 64K buffer, you have to allocate a separate 2M slab with 32 64K
suballocations. If you need to allocate just 512B, you need another 2MB
slab. This wastes memory in 2 ways:
1) a 129K buffer will have to be allocated as a 256K buffer, because the
allocation size has to be a power of two, wasting space
2) there will be too many 2MB slabs, one slab for each size (from 2^9 to
2^18), each having some unused space, wasting space
3) having too many 2K and smaller allocations actually decreases memory
usage, because the minimum kernel allocation size is 4K

Problem 2 is mitigated by layering slab allocators on top of each other.
The top slab allocator can only allocate sizes 2^17 to 2^18, so you only
need two 2M slabs to cover all your needs. The middle allocator uses the
top allocator and allocates slabs of size 2^17 and uses it to allocate 2^13
to 2^16 sizes. The bottom allocator allocates slabs of size 2^13 and uses
it allocate 2^9 to 2^12 sizes.

Marek


On Sat, Nov 24, 2018 at 12:41 AM Marek Olšák  wrote:
> >
> > From: Marek Olšák 
> >
> > ---
> >  src/gallium/winsys/amdgpu/drm/amdgpu_bo.c | 9 +++--
> >  1 file changed, 7 insertions(+), 2 deletions(-)
> >
> > diff --git a/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c
> b/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c
> > index a9271c33ee9..36e2c4ec0dc 100644
> > --- a/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c
> > +++ b/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c
> > @@ -449,24 +449,29 @@ static struct amdgpu_winsys_bo
> *amdgpu_create_bo(struct amdgpu_winsys *ws,
> > r = amdgpu_bo_alloc(ws->dev, , _handle);
> > if (r) {
> >fprintf(stderr, "amdgpu: Failed to allocate a buffer:\n");
> >fprintf(stderr, "amdgpu:size  : %"PRIu64" bytes\n", size);
> >fprintf(stderr, "amdgpu:alignment : %u bytes\n", alignment);
> >fprintf(stderr, "amdgpu:domains   : %u\n", initial_domain);
> >goto error_bo_alloc;
> > }
> >
> > va_gap_size = ws->check_vm ? MAX2(4 * alignment, 64 * 1024) : 0;
> > +
> > +   unsigned vm_alignment = alignment;
> > +
> > +   /* Increase the VM alignment for faster address translation. */
> > if (size > ws->info.pte_fragment_size)
> > -  alignment = MAX2(alignment, ws->info.pte_fragment_size);
> > +  vm_alignment = MAX2(vm_alignment, ws->info.pte_fragment_size);
> > +
> > r = amdgpu_va_range_alloc(ws->dev, amdgpu_gpu_va_range_general,
> > - size + va_gap_size, alignment, 0, ,
> _handle,
> > + size + va_gap_size, vm_alignment, 0, ,
> _handle,
> >   (flags & RADEON_FLAG_32BIT ?
> AMDGPU_VA_RANGE_32_BIT : 0) |
> >  AMDGPU_VA_RANGE_HIGH);
> > if (r)
> >goto error_va_alloc;
> >
> > unsigned vm_flags = AMDGPU_VM_PAGE_READABLE |
> > AMDGPU_VM_PAGE_EXECUTABLE;
> >
> > if (!(flags & RADEON_FLAG_READ_ONLY))
> > vm_flags |= AMDGPU_VM_PAGE_WRITEABLE;
> > --
> > 2.17.1
> >
> > ___
> > mesa-dev mailing list
> > mesa-dev@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 4/7] winsys/amdgpu: clean up code around BO VM alignment

2018-11-26 Thread Bas Nieuwenhuizen
patches 4-7 are

Reviewed-by: Bas Nieuwenhuizen 

though I agree with Christian that it would be nice to get a case
where 7 improves things before we submit it.

For patches 1-3 I need to dive into the slab allocator before I have
enough knowledge to review.
On Sat, Nov 24, 2018 at 12:41 AM Marek Olšák  wrote:
>
> From: Marek Olšák 
>
> ---
>  src/gallium/winsys/amdgpu/drm/amdgpu_bo.c | 9 +++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c 
> b/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c
> index a9271c33ee9..36e2c4ec0dc 100644
> --- a/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c
> +++ b/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c
> @@ -449,24 +449,29 @@ static struct amdgpu_winsys_bo *amdgpu_create_bo(struct 
> amdgpu_winsys *ws,
> r = amdgpu_bo_alloc(ws->dev, , _handle);
> if (r) {
>fprintf(stderr, "amdgpu: Failed to allocate a buffer:\n");
>fprintf(stderr, "amdgpu:size  : %"PRIu64" bytes\n", size);
>fprintf(stderr, "amdgpu:alignment : %u bytes\n", alignment);
>fprintf(stderr, "amdgpu:domains   : %u\n", initial_domain);
>goto error_bo_alloc;
> }
>
> va_gap_size = ws->check_vm ? MAX2(4 * alignment, 64 * 1024) : 0;
> +
> +   unsigned vm_alignment = alignment;
> +
> +   /* Increase the VM alignment for faster address translation. */
> if (size > ws->info.pte_fragment_size)
> -  alignment = MAX2(alignment, ws->info.pte_fragment_size);
> +  vm_alignment = MAX2(vm_alignment, ws->info.pte_fragment_size);
> +
> r = amdgpu_va_range_alloc(ws->dev, amdgpu_gpu_va_range_general,
> - size + va_gap_size, alignment, 0, , 
> _handle,
> + size + va_gap_size, vm_alignment, 0, , 
> _handle,
>   (flags & RADEON_FLAG_32BIT ? 
> AMDGPU_VA_RANGE_32_BIT : 0) |
>  AMDGPU_VA_RANGE_HIGH);
> if (r)
>goto error_va_alloc;
>
> unsigned vm_flags = AMDGPU_VM_PAGE_READABLE |
> AMDGPU_VM_PAGE_EXECUTABLE;
>
> if (!(flags & RADEON_FLAG_READ_ONLY))
> vm_flags |= AMDGPU_VM_PAGE_WRITEABLE;
> --
> 2.17.1
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 4/7] winsys/amdgpu: clean up code around BO VM alignment

2018-11-23 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/winsys/amdgpu/drm/amdgpu_bo.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c 
b/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c
index a9271c33ee9..36e2c4ec0dc 100644
--- a/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c
+++ b/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c
@@ -449,24 +449,29 @@ static struct amdgpu_winsys_bo *amdgpu_create_bo(struct 
amdgpu_winsys *ws,
r = amdgpu_bo_alloc(ws->dev, , _handle);
if (r) {
   fprintf(stderr, "amdgpu: Failed to allocate a buffer:\n");
   fprintf(stderr, "amdgpu:size  : %"PRIu64" bytes\n", size);
   fprintf(stderr, "amdgpu:alignment : %u bytes\n", alignment);
   fprintf(stderr, "amdgpu:domains   : %u\n", initial_domain);
   goto error_bo_alloc;
}
 
va_gap_size = ws->check_vm ? MAX2(4 * alignment, 64 * 1024) : 0;
+
+   unsigned vm_alignment = alignment;
+
+   /* Increase the VM alignment for faster address translation. */
if (size > ws->info.pte_fragment_size)
-  alignment = MAX2(alignment, ws->info.pte_fragment_size);
+  vm_alignment = MAX2(vm_alignment, ws->info.pte_fragment_size);
+
r = amdgpu_va_range_alloc(ws->dev, amdgpu_gpu_va_range_general,
- size + va_gap_size, alignment, 0, , _handle,
+ size + va_gap_size, vm_alignment, 0, , 
_handle,
  (flags & RADEON_FLAG_32BIT ? 
AMDGPU_VA_RANGE_32_BIT : 0) |
 AMDGPU_VA_RANGE_HIGH);
if (r)
   goto error_va_alloc;
 
unsigned vm_flags = AMDGPU_VM_PAGE_READABLE |
AMDGPU_VM_PAGE_EXECUTABLE;
 
if (!(flags & RADEON_FLAG_READ_ONLY))
vm_flags |= AMDGPU_VM_PAGE_WRITEABLE;
-- 
2.17.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev