Re: [PATCH v12] drm/amdgpu: add drm buddy support to amdgpu

2022-06-23 Thread Mike Lothian
Hi

The buddy allocator is still causing me issues in 5.19-rc3
(https://gitlab.freedesktop.org/drm/amd/-/issues/2059)

I'm no longer seeing null pointers though, so I think the bulk move
fix did its bit

Let me know if there's anything I can help with. Now that there aren't
freezes, I can offer remote access to debug if it'll help

Cheers

Mike


Re: [PATCH v12] drm/amdgpu: add drm buddy support to amdgpu

2022-05-30 Thread Christian König

On 29.05.22 at 01:52, Mike Lothian wrote:
> On Sat, 28 May 2022 at 08:44, Paneer Selvam, Arunpravin wrote:
>>
>> [Public]
>>
>> Hi,
>>
>> After spending quite some time investigating this issue, I found that the
>> freeze is not caused by the amdgpu part of the buddy allocator patch: the
>> patch doesn't throw any issues when applied on its own on top of the stable
>> base of drm-next. After digging further, the patch below seems to be the
>> cause of the problem:
>>
>> drm/ttm: rework bulk move handling v5
>> https://cgit.freedesktop.org/drm/drm/commit/?id=fee2ede155423b0f7a559050a39750b98fe9db69
>>
>> When this patch is applied on top of the stable (working) version of
>> drm-next without the buddy allocator patch, we see the issues listed below,
>> each thrown randomly at every GravityMark run:
>> 1. general protection fault at ttm_lru_bulk_move_tail()
>> 2. NULL pointer dereference at ttm_lru_bulk_move_tail()
>> 3. NULL pointer dereference at ttm_resource_init()
>>
>> Regards,
>> Arun.
>
> Thanks for tracking it down, fee2ede155423b0f7a559050a39750b98fe9db69
> isn't trivial to revert
>
> Hopefully Christian can figure it out


Unfortunately, Arun is heading in the wrong direction with his testing.
The merge fallout from "drm/ttm: rework bulk move handling v5" is
already fixed by "drm/amdgpu: fix drm-next merge fallout", but your
problem with the buddy allocator is separate from that.


Regards,
Christian.


Re: [PATCH v12] drm/amdgpu: add drm buddy support to amdgpu

2022-05-28 Thread Mike Lothian
On Sat, 28 May 2022 at 08:44, Paneer Selvam, Arunpravin
 wrote:
>
> [Public]
>
> Hi,
>
> After spending quite some time investigating this issue, I found that the
> freeze is not caused by the amdgpu part of the buddy allocator patch: the
> patch doesn't throw any issues when applied on its own on top of the stable
> base of drm-next. After digging further, the patch below seems to be the
> cause of the problem:
>
> drm/ttm: rework bulk move handling v5
> https://cgit.freedesktop.org/drm/drm/commit/?id=fee2ede155423b0f7a559050a39750b98fe9db69
>
> When this patch is applied on top of the stable (working) version of
> drm-next without the buddy allocator patch, we see the issues listed below,
> each thrown randomly at every GravityMark run:
> 1. general protection fault at ttm_lru_bulk_move_tail()
> 2. NULL pointer dereference at ttm_lru_bulk_move_tail()
> 3. NULL pointer dereference at ttm_resource_init()
>
> Regards,
> Arun.

Thanks for tracking it down, fee2ede155423b0f7a559050a39750b98fe9db69
isn't trivial to revert

Hopefully Christian can figure it out


RE: [PATCH v12] drm/amdgpu: add drm buddy support to amdgpu

2022-05-28 Thread Paneer Selvam, Arunpravin
[Public]

Hi,

After spending quite some time investigating this issue, I found that the
freeze is not caused by the amdgpu part of the buddy allocator patch: the
patch doesn't throw any issues when applied on its own on top of the stable
base of drm-next. After digging further, the patch below seems to be the
cause of the problem:

drm/ttm: rework bulk move handling v5
https://cgit.freedesktop.org/drm/drm/commit/?id=fee2ede155423b0f7a559050a39750b98fe9db69

When this patch is applied on top of the stable (working) version of
drm-next without the buddy allocator patch, we see the issues listed below,
each thrown randomly at every GravityMark run:
1. general protection fault at ttm_lru_bulk_move_tail()
2. NULL pointer dereference at ttm_lru_bulk_move_tail()
3. NULL pointer dereference at ttm_resource_init()

Regards,
Arun.
-----Original Message-----
From: Alex Deucher
Sent: Monday, May 16, 2022 8:36 PM
To: Mike Lothian
Cc: Paneer Selvam, Arunpravin; Intel Graphics Development; amd-gfx list;
Mailing list - DRI developers; Deucher, Alexander; Koenig, Christian;
Matthew Auld
Subject: Re: [PATCH v12] drm/amdgpu: add drm buddy support to amdgpu

On Mon, May 16, 2022 at 8:40 AM Mike Lothian  wrote:
>
> Hi
>
> The merge window for 5.19 will probably be opening next week, has 
> there been any progress with this bug?

It took a while to find a combination of GPUs that would repro the issue, but 
now that we can, it is still being investigated.

Alex

>
> Thanks
>
> Mike
>
> On Mon, 2 May 2022 at 17:31, Mike Lothian  wrote:
> >
> > On Mon, 2 May 2022 at 16:54, Arunpravin Paneer Selvam 
> >  wrote:
> > >
> > >
> > >
> > > On 5/2/2022 8:41 PM, Mike Lothian wrote:
> > > > On Wed, 27 Apr 2022 at 12:55, Mike Lothian  wrote:
> > > >> On Tue, 26 Apr 2022 at 17:36, Christian König 
> > > >>  wrote:
> > > >>> Hi Mike,
> > > >>>
> > > >>> sounds like somehow stitching together the SG table for PRIME 
> > > >>> doesn't work any more with this patch.
> > > >>>
> > > >>> Can you try with P2P DMA disabled?
> > > >> -CONFIG_PCI_P2PDMA=y
> > > >> +# CONFIG_PCI_P2PDMA is not set
> > > >>
> > > >> If that's what you're meaning, then there's no difference, I'll 
> > > >> upload my dmesg to the gitlab issue
> > > >>
> > > >>> Apart from that can you take a look Arun?
> > > >>>
> > > >>> Thanks,
> > > >>> Christian.
> > > > Hi
> > > >
> > > > Have you had any success in replicating this?
> > > Hi Mike,
> > > I couldn't replicate on my Raven APU machine. I see you have 2 
> > > cards initialized, one is Renoir and the other is Navy Flounder. 
> > > Could you give some more details, are you running Gravity Mark on 
> > > Renoir and what is your system RAM configuration?
> > > >
> > > > Cheers
> > > >
> > > > Mike
> > >
> > Hi
> >
> > It's a PRIME laptop, it failed on the RENOIR too, it caused a 
> > lockup, but systemd managed to capture it, I'll attach it to the 
> > issue
> >
> > I've got 64GB RAM, the 6800M has 12GB VRAM
> >
> > Cheers
> >
> > Mike


Re: [PATCH v12] drm/amdgpu: add drm buddy support to amdgpu

2022-05-16 Thread Alex Deucher
On Mon, May 16, 2022 at 8:40 AM Mike Lothian  wrote:
>
> Hi
>
> The merge window for 5.19 will probably be opening next week, has
> there been any progress with this bug?

It took a while to find a combination of GPUs that would repro the
issue, but now that we can, it is still being investigated.

Alex

>
> Thanks
>
> Mike
>
> On Mon, 2 May 2022 at 17:31, Mike Lothian  wrote:
> >
> > On Mon, 2 May 2022 at 16:54, Arunpravin Paneer Selvam
> >  wrote:
> > >
> > >
> > >
> > > On 5/2/2022 8:41 PM, Mike Lothian wrote:
> > > > On Wed, 27 Apr 2022 at 12:55, Mike Lothian  wrote:
> > > >> On Tue, 26 Apr 2022 at 17:36, Christian König 
> > > >>  wrote:
> > > >>> Hi Mike,
> > > >>>
> > > >>> sounds like somehow stitching together the SG table for PRIME doesn't
> > > >>> work any more with this patch.
> > > >>>
> > > >>> Can you try with P2P DMA disabled?
> > > >> -CONFIG_PCI_P2PDMA=y
> > > >> +# CONFIG_PCI_P2PDMA is not set
> > > >>
> > > >> If that's what you're meaning, then there's no difference, I'll upload
> > > >> my dmesg to the gitlab issue
> > > >>
> > > >>> Apart from that can you take a look Arun?
> > > >>>
> > > >>> Thanks,
> > > >>> Christian.
> > > > Hi
> > > >
> > > > Have you had any success in replicating this?
> > > Hi Mike,
> > > I couldn't replicate on my Raven APU machine. I see you have 2 cards
> > > initialized, one is Renoir
> > > and the other is Navy Flounder. Could you give some more details, are
> > > you running Gravity Mark
> > > on Renoir and what is your system RAM configuration?
> > > >
> > > > Cheers
> > > >
> > > > Mike
> > >
> > Hi
> >
> > It's a PRIME laptop, it failed on the RENOIR too, it caused a lockup,
> > but systemd managed to capture it, I'll attach it to the issue
> >
> > I've got 64GB RAM, the 6800M has 12GB VRAM
> >
> > Cheers
> >
> > Mike


Re: [PATCH v12] drm/amdgpu: add drm buddy support to amdgpu

2022-05-16 Thread Mike Lothian
Hi

The merge window for 5.19 will probably be opening next week, has
there been any progress with this bug?

Thanks

Mike

On Mon, 2 May 2022 at 17:31, Mike Lothian  wrote:
>
> On Mon, 2 May 2022 at 16:54, Arunpravin Paneer Selvam
>  wrote:
> >
> >
> >
> > On 5/2/2022 8:41 PM, Mike Lothian wrote:
> > > On Wed, 27 Apr 2022 at 12:55, Mike Lothian  wrote:
> > >> On Tue, 26 Apr 2022 at 17:36, Christian König  
> > >> wrote:
> > >>> Hi Mike,
> > >>>
> > >>> sounds like somehow stitching together the SG table for PRIME doesn't
> > >>> work any more with this patch.
> > >>>
> > >>> Can you try with P2P DMA disabled?
> > >> -CONFIG_PCI_P2PDMA=y
> > >> +# CONFIG_PCI_P2PDMA is not set
> > >>
> > >> If that's what you're meaning, then there's no difference, I'll upload
> > >> my dmesg to the gitlab issue
> > >>
> > >>> Apart from that can you take a look Arun?
> > >>>
> > >>> Thanks,
> > >>> Christian.
> > > Hi
> > >
> > > Have you had any success in replicating this?
> > Hi Mike,
> > I couldn't replicate on my Raven APU machine. I see you have 2 cards
> > initialized, one is Renoir
> > and the other is Navy Flounder. Could you give some more details, are
> > you running Gravity Mark
> > on Renoir and what is your system RAM configuration?
> > >
> > > Cheers
> > >
> > > Mike
> >
> Hi
>
> It's a PRIME laptop, it failed on the RENOIR too, it caused a lockup,
> but systemd managed to capture it, I'll attach it to the issue
>
> I've got 64GB RAM, the 6800M has 12GB VRAM
>
> Cheers
>
> Mike


Re: [PATCH v12] drm/amdgpu: add drm buddy support to amdgpu

2022-05-02 Thread Mike Lothian
On Mon, 2 May 2022 at 16:54, Arunpravin Paneer Selvam
 wrote:
>
>
>
> On 5/2/2022 8:41 PM, Mike Lothian wrote:
> > On Wed, 27 Apr 2022 at 12:55, Mike Lothian  wrote:
> >> On Tue, 26 Apr 2022 at 17:36, Christian König  
> >> wrote:
> >>> Hi Mike,
> >>>
> >>> sounds like somehow stitching together the SG table for PRIME doesn't
> >>> work any more with this patch.
> >>>
> >>> Can you try with P2P DMA disabled?
> >> -CONFIG_PCI_P2PDMA=y
> >> +# CONFIG_PCI_P2PDMA is not set
> >>
> >> If that's what you're meaning, then there's no difference, I'll upload
> >> my dmesg to the gitlab issue
> >>
> >>> Apart from that can you take a look Arun?
> >>>
> >>> Thanks,
> >>> Christian.
> > Hi
> >
> > Have you had any success in replicating this?
> Hi Mike,
> I couldn't replicate on my Raven APU machine. I see you have 2 cards
> initialized, one is Renoir
> and the other is Navy Flounder. Could you give some more details, are
> you running Gravity Mark
> on Renoir and what is your system RAM configuration?
> >
> > Cheers
> >
> > Mike
>
Hi

It's a PRIME laptop, it failed on the RENOIR too, it caused a lockup,
but systemd managed to capture it, I'll attach it to the issue

I've got 64GB RAM, the 6800M has 12GB VRAM

Cheers

Mike


Re: [PATCH v12] drm/amdgpu: add drm buddy support to amdgpu

2022-05-02 Thread Arunpravin Paneer Selvam




On 5/2/2022 8:41 PM, Mike Lothian wrote:
> On Wed, 27 Apr 2022 at 12:55, Mike Lothian  wrote:
>> On Tue, 26 Apr 2022 at 17:36, Christian König  wrote:
>>> Hi Mike,
>>>
>>> sounds like somehow stitching together the SG table for PRIME doesn't
>>> work any more with this patch.
>>>
>>> Can you try with P2P DMA disabled?
>> -CONFIG_PCI_P2PDMA=y
>> +# CONFIG_PCI_P2PDMA is not set
>>
>> If that's what you're meaning, then there's no difference, I'll upload
>> my dmesg to the gitlab issue
>>
>>> Apart from that can you take a look Arun?
>>>
>>> Thanks,
>>> Christian.
> Hi
>
> Have you had any success in replicating this?
Hi Mike,
I couldn't replicate on my Raven APU machine. I see you have 2 cards
initialized, one is Renoir and the other is Navy Flounder. Could you give
some more details, are you running Gravity Mark on Renoir and what is your
system RAM configuration?
>
> Cheers
>
> Mike




Re: [PATCH v12] drm/amdgpu: add drm buddy support to amdgpu

2022-05-02 Thread Mike Lothian
On Wed, 27 Apr 2022 at 12:55, Mike Lothian  wrote:
>
> On Tue, 26 Apr 2022 at 17:36, Christian König  
> wrote:
> >
> > Hi Mike,
> >
> > sounds like somehow stitching together the SG table for PRIME doesn't
> > work any more with this patch.
> >
> > Can you try with P2P DMA disabled?
>
> -CONFIG_PCI_P2PDMA=y
> +# CONFIG_PCI_P2PDMA is not set
>
> If that's what you're meaning, then there's no difference, I'll upload
> my dmesg to the gitlab issue
>
> >
> > Apart from that can you take a look Arun?
> >
> > Thanks,
> > Christian.

Hi

Have you had any success in replicating this?

Cheers

Mike


Re: [PATCH v12] drm/amdgpu: add drm buddy support to amdgpu

2022-04-27 Thread Mike Lothian
On Tue, 26 Apr 2022 at 17:36, Christian König  wrote:
>
> Hi Mike,
>
> sounds like somehow stitching together the SG table for PRIME doesn't
> work any more with this patch.
>
> Can you try with P2P DMA disabled?

-CONFIG_PCI_P2PDMA=y
+# CONFIG_PCI_P2PDMA is not set

If that's what you're meaning, then there's no difference, I'll upload
my dmesg to the gitlab issue

>
> Apart from that can you take a look Arun?
>
> Thanks,
> Christian.


RE: [PATCH v12] drm/amdgpu: add drm buddy support to amdgpu

2022-04-26 Thread Paneer Selvam, Arunpravin
[AMD Official Use Only - General]

Hi Christian,

I will check this issue.

Regards,
Arun
-----Original Message-----
From: Koenig, Christian
Sent: Tuesday, April 26, 2022 10:06 PM
To: Mike Lothian; Paneer Selvam, Arunpravin
Cc: intel-...@lists.freedesktop.org; dri-de...@lists.freedesktop.org;
amd-gfx@lists.freedesktop.org; Deucher, Alexander;
matthew.a...@intel.com
Subject: Re: [PATCH v12] drm/amdgpu: add drm buddy support to amdgpu

Hi Mike,

sounds like somehow stitching together the SG table for PRIME doesn't work any 
more with this patch.

Can you try with P2P DMA disabled?

Apart from that can you take a look Arun?

Thanks,
Christian.

On 26.04.22 at 17:29, Mike Lothian wrote:
> Hi
>
> I'm having issues with this patch on my PRIME system and vulkan 
> workloads
>
> https://gitlab.freedesktop.org/drm/amd/-/issues/1992
>
> Is there any chance you could take a look?
>
> Cheers
>
> Mike


Re: [PATCH v12] drm/amdgpu: add drm buddy support to amdgpu

2022-04-26 Thread Christian König

Hi Mike,

sounds like somehow stitching together the SG table for PRIME doesn't 
work any more with this patch.


Can you try with P2P DMA disabled?

Apart from that can you take a look Arun?

Thanks,
Christian.

On 26.04.22 at 17:29, Mike Lothian wrote:
> Hi
>
> I'm having issues with this patch on my PRIME system and vulkan workloads
>
> https://gitlab.freedesktop.org/drm/amd/-/issues/1992
>
> Is there any chance you could take a look?
>
> Cheers
>
> Mike




Re: [PATCH v12] drm/amdgpu: add drm buddy support to amdgpu

2022-04-26 Thread Mike Lothian
Hi

I'm having issues with this patch on my PRIME system and vulkan workloads

https://gitlab.freedesktop.org/drm/amd/-/issues/1992

Is there any chance you could take a look?

Cheers

Mike


Re: [PATCH v12] drm/amdgpu: add drm buddy support to amdgpu

2022-04-07 Thread Paul Menzel

Dear Arunpravin,


Thank you for your patch.

On 07.04.22 at 07:46, Arunpravin Paneer Selvam wrote:

- Switch to drm buddy allocator
- Add resource cursor support for drm buddy


I thought that, after the last long discussion, you would actually act on
the review comments. Daniel wrote a good summary that you could more or
less copy and paste. So why didn’t you?


So, I would really prefer that the patch not be committed as is.

The summary should also say something about using a mutex instead of
spinlocks. For me the version change summaries below are just for reviewers
of earlier iterations to see what changed, and not something to be read easily.



Kind regards,

Paul



v2(Matthew Auld):
   - replace spinlock with mutex as we call kmem_cache_zalloc
 (..., GFP_KERNEL) in drm_buddy_alloc() function

   - lock drm_buddy_block_trim() function as it calls
 mark_free/mark_split are all globally visible

v3(Matthew Auld):
   - remove trim method error handling as we address the failure case
 at drm_buddy_block_trim() function

v4:
   - fix warnings reported by kernel test robot 

v5:
   - fix merge conflict issue

v6:
   - fix warnings reported by kernel test robot 

v7:
   - remove DRM_BUDDY_RANGE_ALLOCATION flag usage

v8:
   - keep DRM_BUDDY_RANGE_ALLOCATION flag usage
   - resolve conflicts created by drm/amdgpu: remove VRAM accounting v2

v9(Christian):
   - merged the below patch
  - drm/amdgpu: move vram inline functions into a header
   - rename label name as fallback
   - move struct amdgpu_vram_mgr to amdgpu_vram_mgr.h
   - remove unnecessary flags from struct amdgpu_vram_reservation
   - rewrite block NULL check condition
   - change else style as per coding standard
   - rewrite the node max size
   - add a helper function to fetch the first entry from the list

v10(Christian):
- rename amdgpu_get_node() function name as amdgpu_vram_mgr_first_block

v11:
- if size is not aligned with min_page_size, enable is_contiguous flag,
  therefore, the size round up to the power of two and trimmed to the
  original size.
v12:
- rename the function names having prefix as amdgpu_vram_mgr_*()
- modify the round_up() logic conforming to contiguous flag enablement
  or if size is not aligned to min_block_size
- modify the trim logic
- rename node as block wherever applicable

Signed-off-by: Arunpravin Paneer Selvam 
---
  drivers/gpu/drm/Kconfig   |   1 +
  .../gpu/drm/amd/amdgpu/amdgpu_res_cursor.h|  97 -
  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h   |  10 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c  | 359 ++
  4 files changed, 291 insertions(+), 176 deletions(-)

diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
index f1422bee3dcc..5133c3f028ab 100644
--- a/drivers/gpu/drm/Kconfig
+++ b/drivers/gpu/drm/Kconfig
@@ -280,6 +280,7 @@ config DRM_AMDGPU
select HWMON
select BACKLIGHT_CLASS_DEVICE
select INTERVAL_TREE
+   select DRM_BUDDY
help
  Choose this option if you have a recent AMD Radeon graphics card.
  
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h

index acfa207cf970..6546552e596c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h
@@ -30,12 +30,15 @@
  #include 
  #include 
  
+#include "amdgpu_vram_mgr.h"

+
  /* state back for walking over vram_mgr and gtt_mgr allocations */
  struct amdgpu_res_cursor {
    uint64_t            start;
    uint64_t            size;
    uint64_t            remaining;
-   struct drm_mm_node  *node;
+   void                *node;
+   uint32_t            mem_type;
  };
  
  /**

@@ -52,27 +55,63 @@ static inline void amdgpu_res_first(struct ttm_resource *res,
uint64_t start, uint64_t size,
struct amdgpu_res_cursor *cur)
  {
+   struct drm_buddy_block *block;
+   struct list_head *head, *next;
struct drm_mm_node *node;
  
-	if (!res || res->mem_type == TTM_PL_SYSTEM) {

-   cur->start = start;
-   cur->size = size;
-   cur->remaining = size;
-   cur->node = NULL;
-   WARN_ON(res && start + size > res->num_pages << PAGE_SHIFT);
-   return;
-   }
+   if (!res)
+   goto fallback;
  
  	BUG_ON(start + size > res->num_pages << PAGE_SHIFT);
  
-	node = to_ttm_range_mgr_node(res)->mm_nodes;

-   while (start >= node->size << PAGE_SHIFT)
-   start -= node++->size << PAGE_SHIFT;
+   cur->mem_type = res->mem_type;
+
+   switch (cur->mem_type) {
+   case TTM_PL_VRAM:
+   head = &to_amdgpu_vram_mgr_resource(res)->blocks;
+
+   block = list_first_entry_or_null(head,
+struct drm_buddy_block,
+

Re: [PATCH v12] drm/amdgpu: add drm buddy support to amdgpu

2022-04-07 Thread Christian König

On 07.04.22 at 07:46, Arunpravin Paneer Selvam wrote:

- Switch to drm buddy allocator
- Add resource cursor support for drm buddy

v2(Matthew Auld):
   - replace spinlock with mutex as we call kmem_cache_zalloc
 (..., GFP_KERNEL) in drm_buddy_alloc() function

   - lock drm_buddy_block_trim() function as it calls
 mark_free/mark_split are all globally visible

v3(Matthew Auld):
   - remove trim method error handling as we address the failure case
 at drm_buddy_block_trim() function

v4:
   - fix warnings reported by kernel test robot 

v5:
   - fix merge conflict issue

v6:
   - fix warnings reported by kernel test robot 

v7:
   - remove DRM_BUDDY_RANGE_ALLOCATION flag usage

v8:
   - keep DRM_BUDDY_RANGE_ALLOCATION flag usage
   - resolve conflicts created by drm/amdgpu: remove VRAM accounting v2

v9(Christian):
   - merged the below patch
  - drm/amdgpu: move vram inline functions into a header
   - rename label name as fallback
   - move struct amdgpu_vram_mgr to amdgpu_vram_mgr.h
   - remove unnecessary flags from struct amdgpu_vram_reservation
   - rewrite block NULL check condition
   - change else style as per coding standard
   - rewrite the node max size
   - add a helper function to fetch the first entry from the list

v10(Christian):
- rename amdgpu_get_node() function name as amdgpu_vram_mgr_first_block

v11:
- if size is not aligned with min_page_size, enable is_contiguous flag,
  therefore, the size round up to the power of two and trimmed to the
  original size.
v12:
- rename the function names having prefix as amdgpu_vram_mgr_*()
- modify the round_up() logic conforming to contiguous flag enablement
  or if size is not aligned to min_block_size
- modify the trim logic
- rename node as block wherever applicable

Signed-off-by: Arunpravin Paneer Selvam 


Acked-by: Christian König 

I don't have time for a detailed in depth review, but this has seen 
enough iterations that I trust it to work fine.


Please check if the drm_buddy work has already backmerged into 
amd-staging-drm-next. If not maybe ping Alex when that is planned or 
alternatively I can push it through drm-misc-next.


Thanks,
Christian.


---
  drivers/gpu/drm/Kconfig   |   1 +
  .../gpu/drm/amd/amdgpu/amdgpu_res_cursor.h|  97 -
  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h   |  10 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c  | 359 ++
  4 files changed, 291 insertions(+), 176 deletions(-)

diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
index f1422bee3dcc..5133c3f028ab 100644
--- a/drivers/gpu/drm/Kconfig
+++ b/drivers/gpu/drm/Kconfig
@@ -280,6 +280,7 @@ config DRM_AMDGPU
select HWMON
select BACKLIGHT_CLASS_DEVICE
select INTERVAL_TREE
+   select DRM_BUDDY
help
  Choose this option if you have a recent AMD Radeon graphics card.
  
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h

index acfa207cf970..6546552e596c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h
@@ -30,12 +30,15 @@
  #include 
  #include 
  
+#include "amdgpu_vram_mgr.h"

+
  /* state back for walking over vram_mgr and gtt_mgr allocations */
  struct amdgpu_res_cursor {
    uint64_t            start;
    uint64_t            size;
    uint64_t            remaining;
-   struct drm_mm_node  *node;
+   void                *node;
+   uint32_t            mem_type;
  };
  
  /**

@@ -52,27 +55,63 @@ static inline void amdgpu_res_first(struct ttm_resource *res,
uint64_t start, uint64_t size,
struct amdgpu_res_cursor *cur)
  {
+   struct drm_buddy_block *block;
+   struct list_head *head, *next;
struct drm_mm_node *node;
  
-	if (!res || res->mem_type == TTM_PL_SYSTEM) {

-   cur->start = start;
-   cur->size = size;
-   cur->remaining = size;
-   cur->node = NULL;
-   WARN_ON(res && start + size > res->num_pages << PAGE_SHIFT);
-   return;
-   }
+   if (!res)
+   goto fallback;
  
  	BUG_ON(start + size > res->num_pages << PAGE_SHIFT);
  
-	node = to_ttm_range_mgr_node(res)->mm_nodes;

-   while (start >= node->size << PAGE_SHIFT)
-   start -= node++->size << PAGE_SHIFT;
+   cur->mem_type = res->mem_type;
+
+   switch (cur->mem_type) {
+   case TTM_PL_VRAM:
+   head = &to_amdgpu_vram_mgr_resource(res)->blocks;
+
+   block = list_first_entry_or_null(head,
+struct drm_buddy_block,
+link);
+   if (!block)
+   goto fallback;
+
+   while (start >= amdgpu_vram_mgr_block_size(block)) 

[PATCH v12] drm/amdgpu: add drm buddy support to amdgpu

2022-04-06 Thread Arunpravin Paneer Selvam
- Switch to drm buddy allocator
- Add resource cursor support for drm buddy
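
For readers following the cursor change, here is a minimal userspace sketch of what a resource cursor has to do once VRAM is backed by a list of buddy blocks: skip whole blocks until the requested offset lands inside one, then clamp the first segment to what is left of that block. The demo_* names and the flat array are assumptions made for illustration only; the patch itself walks a list of struct drm_buddy_block and uses its own helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one buddy block: byte offset and byte size. */
struct demo_block {
	uint64_t start;
	uint64_t size;
};

/* Hypothetical cursor, loosely modelled on start/size/remaining above. */
struct demo_cursor {
	uint64_t start;     /* address of the current segment */
	uint64_t size;      /* bytes usable in the current segment */
	uint64_t remaining; /* bytes left in the whole request */
	size_t index;       /* block the cursor currently sits in */
};

/*
 * Position the cursor at byte offset 'start' of an allocation made of
 * 'count' blocks. The caller must keep 'start' inside the allocation.
 */
static void demo_res_first(const struct demo_block *blocks, size_t count,
			   uint64_t start, uint64_t size,
			   struct demo_cursor *cur)
{
	size_t i = 0;
	uint64_t avail;

	assert(count > 0);

	/* Skip whole blocks until the offset falls inside block i. */
	while (i + 1 < count && start >= blocks[i].size) {
		start -= blocks[i].size;
		i++;
	}
	assert(start < blocks[i].size);

	/* Clamp the first segment to what is left of this block. */
	avail = blocks[i].size - start;
	cur->start = blocks[i].start + start;
	cur->size = avail < size ? avail : size;
	cur->remaining = size;
	cur->index = i;
}
```

Because the blocks of one allocation need not be adjacent in VRAM, the cursor reports an address inside the block rather than a simple base plus offset.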

v2(Matthew Auld):
  - replace spinlock with mutex as we call kmem_cache_zalloc
    (..., GFP_KERNEL) in drm_buddy_alloc() function

  - lock drm_buddy_block_trim() function as the mark_free/mark_split
    functions it calls are all globally visible

v3(Matthew Auld):
  - remove trim method error handling as we address the failure case
at drm_buddy_block_trim() function

v4:
  - fix warnings reported by kernel test robot 

v5:
  - fix merge conflict issue

v6:
  - fix warnings reported by kernel test robot 

v7:
  - remove DRM_BUDDY_RANGE_ALLOCATION flag usage

v8:
  - keep DRM_BUDDY_RANGE_ALLOCATION flag usage
  - resolve conflicts created by drm/amdgpu: remove VRAM accounting v2

v9(Christian):
  - merged the below patch
 - drm/amdgpu: move vram inline functions into a header
  - rename label name as fallback
  - move struct amdgpu_vram_mgr to amdgpu_vram_mgr.h
  - remove unnecessary flags from struct amdgpu_vram_reservation
  - rewrite block NULL check condition
  - change else style as per coding standard
  - rewrite the node max size
  - add a helper function to fetch the first entry from the list

v10(Christian):
   - rename amdgpu_get_node() function name as amdgpu_vram_mgr_first_block

v11:
   - if size is not aligned with min_page_size, enable the is_contiguous flag
     so that the size is rounded up to a power of two and then trimmed to
     the original size.
v12:
   - rename the function names having prefix as amdgpu_vram_mgr_*()
   - modify the round_up() logic conforming to contiguous flag enablement
 or if size is not aligned to min_block_size
   - modify the trim logic
   - rename node as block wherever applicable
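
The size handling described in v11 and v12 can be read roughly as the following userspace sketch. This is one interpretation of the changelog under stated assumptions, not the code from the patch: the demo_* helpers are hypothetical, and the kernel has its own roundup_pow_of_two() and round_up() helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Round v up to the next power of two (v must be non-zero). */
static uint64_t demo_roundup_pow2(uint64_t v)
{
	uint64_t p = 1;

	assert(v != 0 && v <= (UINT64_C(1) << 63));
	while (p < v)
		p <<= 1;
	return p;
}

/* Round v up to a multiple of align (align must be a power of two). */
static uint64_t demo_round_up(uint64_t v, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (v + align - 1) & ~(align - 1);
}

/*
 * Assumed reading of the changelog: a contiguous request is rounded up
 * to a power of two, an unaligned request is rounded up to the minimum
 * block size, and anything else is allocated as is.
 */
static uint64_t demo_alloc_size(uint64_t size, uint64_t min_block_size,
				bool contiguous)
{
	if (contiguous)
		return demo_roundup_pow2(size);
	if (size & (min_block_size - 1))
		return demo_round_up(size, min_block_size);
	return size;
}

/* Bytes to hand back when trimming to the originally requested size. */
static uint64_t demo_trim_amount(uint64_t size, uint64_t allocated)
{
	assert(allocated >= size);
	return allocated - size;
}
```

For example, a contiguous 0x3000-byte request would be allocated as 0x4000 bytes and then trimmed by 0x1000 bytes.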

Signed-off-by: Arunpravin Paneer Selvam 
---
 drivers/gpu/drm/Kconfig   |   1 +
 .../gpu/drm/amd/amdgpu/amdgpu_res_cursor.h|  97 -
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h   |  10 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c  | 359 ++
 4 files changed, 291 insertions(+), 176 deletions(-)

diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
index f1422bee3dcc..5133c3f028ab 100644
--- a/drivers/gpu/drm/Kconfig
+++ b/drivers/gpu/drm/Kconfig
@@ -280,6 +280,7 @@ config DRM_AMDGPU
select HWMON
select BACKLIGHT_CLASS_DEVICE
select INTERVAL_TREE
+   select DRM_BUDDY
help
  Choose this option if you have a recent AMD Radeon graphics card.
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h
index acfa207cf970..6546552e596c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h
@@ -30,12 +30,15 @@
 #include 
 #include 
 
+#include "amdgpu_vram_mgr.h"
+
 /* state back for walking over vram_mgr and gtt_mgr allocations */
 struct amdgpu_res_cursor {
    uint64_t            start;
    uint64_t            size;
    uint64_t            remaining;
-   struct drm_mm_node  *node;
+   void                *node;
+   uint32_t            mem_type;
 };
 
 /**
@@ -52,27 +55,63 @@ static inline void amdgpu_res_first(struct ttm_resource *res,
uint64_t start, uint64_t size,
struct amdgpu_res_cursor *cur)
 {
+   struct drm_buddy_block *block;
+   struct list_head *head, *next;
struct drm_mm_node *node;
 
-   if (!res || res->mem_type == TTM_PL_SYSTEM) {
-   cur->start = start;
-   cur->size = size;
-   cur->remaining = size;
-   cur->node = NULL;
-   WARN_ON(res && start + size > res->num_pages << PAGE_SHIFT);
-   return;
-   }
+   if (!res)
+   goto fallback;
 
BUG_ON(start + size > res->num_pages << PAGE_SHIFT);
 
-   node = to_ttm_range_mgr_node(res)->mm_nodes;
-   while (start >= node->size << PAGE_SHIFT)
-   start -= node++->size << PAGE_SHIFT;
+   cur->mem_type = res->mem_type;
+
+   switch (cur->mem_type) {
+   case TTM_PL_VRAM:
+   head = &to_amdgpu_vram_mgr_resource(res)->blocks;
+
+   block = list_first_entry_or_null(head,
+struct drm_buddy_block,
+link);
+   if (!block)
+   goto fallback;
+
+   while (start >= amdgpu_vram_mgr_block_size(block)) {
+   start -= amdgpu_vram_mgr_block_size(block);
+
+   next = block->link.next;
+   if (next != head)
+   block = list_entry(next, struct drm_buddy_block, link);
+   }
+
+   cur->start = amdgpu_vram_mgr_block_start(block) + start;
+   cur->size = min(amdgpu_vram_mgr_block_size(block) - start, size);
+