Re: [PATCH] drm/amd/pm: ignore failure of GmiPwrDnControl

2022-01-09 Thread Lazar, Lijo




On 1/10/2022 1:17 PM, Tao Zhou wrote:

PMFW only returns 0 on the master die and sends a NACK back on the other
dies for this message.

Signed-off-by: Tao Zhou 
---
  .../gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c   | 16 
  1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c 
b/drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c
index 261892977654..121b0ab5823f 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c
@@ -1625,10 +1625,18 @@ static int aldebaran_set_df_cstate(struct smu_context 
*smu,
  
  static int aldebaran_allow_xgmi_power_down(struct smu_context *smu, bool en)

  {
-   return smu_cmn_send_smc_msg_with_param(smu,
-  SMU_MSG_GmiPwrDnControl,
-  en ? 0 : 1,
-  NULL);
+   int ret;
+
+   ret = smu_cmn_send_smc_msg_with_param(smu,
+  SMU_MSG_GmiPwrDnControl,
+  en ? 0 : 1,
+  NULL);
+   /* The message only works on the master die and a NACK will be sent
+  back for other dies, so ignore CMD failure for the message */
+   if (ret == -EIO)
+   ret = 0;
+


Could you confirm if this is hive master or primary die of each device? 
If it's the latter, use aldebaran_is_primary(). For the former we can 
add support similar to is_primary().


Thanks,
Lijo


+   return ret;
  }
  
  static const struct throttling_logging_label {




RE: [PATCH] drm/amd/pm: correct the checks for fan attributes support

2022-01-09 Thread Quan, Evan
[AMD Official Use Only]



> -Original Message-
> From: Lazar, Lijo 
> Sent: Monday, January 10, 2022 3:36 PM
> To: Quan, Evan ; amd-gfx@lists.freedesktop.org
> Cc: Deucher, Alexander 
> Subject: Re: [PATCH] drm/amd/pm: correct the checks for fan attributes
> support
> 
> 
> 
> On 1/10/2022 11:30 AM, Evan Quan wrote:
> > Before, we relied on the return values from the corresponding interfaces,
> > which is inefficient. Also, the wrong intermediate variable used made the
> > fan mode stuck at manual mode, which then caused overheating in 3D
> > graphics tests.
> >
> > Signed-off-by: Evan Quan 
> > Change-Id: Ia93ccf3b929c12e6d10b50c8f3596783ac63f0e3
> > ---
> >   drivers/gpu/drm/amd/pm/amdgpu_dpm.c | 23
> +++
> >   drivers/gpu/drm/amd/pm/amdgpu_pm.c  | 20 ++--
> >   drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h | 12 
> >   3 files changed, 45 insertions(+), 10 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/pm/amdgpu_dpm.c
> b/drivers/gpu/drm/amd/pm/amdgpu_dpm.c
> > index 68d2e80a673b..e732418a9558 100644
> > --- a/drivers/gpu/drm/amd/pm/amdgpu_dpm.c
> > +++ b/drivers/gpu/drm/amd/pm/amdgpu_dpm.c
> > @@ -1547,3 +1547,26 @@ int amdgpu_dpm_get_dpm_clock_table(struct
> amdgpu_device *adev,
> >
> > return ret;
> >   }
> > +
> > +int amdgpu_dpm_is_fan_operation_supported(struct amdgpu_device
> *adev,
> > + enum fan_operation_id id)
> > +{
> > +   const struct amd_pm_funcs *pp_funcs = adev->powerplay.pp_funcs;
> > +
> > +   switch (id) {
> > +   case FAN_CONTROL_MODE_RETRIEVING:
> > +   return pp_funcs->get_fan_control_mode ? 1 : 0;
> > +   case FAN_CONTROL_MODE_SETTING:
> > +   return pp_funcs->set_fan_control_mode ? 1 : 0;
> > +   case FAN_SPEED_PWM_RETRIEVING:
> > +   return pp_funcs->get_fan_speed_pwm ? 1 : 0;
> > +   case FAN_SPEED_PWM_SETTING:
> > +   return pp_funcs->set_fan_speed_pwm ? 1 : 0;
> > +   case FAN_SPEED_RPM_RETRIEVING:
> > +   return pp_funcs->get_fan_speed_rpm ? 1 : 0;
> > +   case FAN_SPEED_RPM_SETTING:
> > +   return pp_funcs->set_fan_speed_rpm ? 1 : 0;
> > +   default:
> > +   return 0;
> > +   }
> > +}
> > diff --git a/drivers/gpu/drm/amd/pm/amdgpu_pm.c
> b/drivers/gpu/drm/amd/pm/amdgpu_pm.c
> > index d3eab245e0fe..57721750c51a 100644
> > --- a/drivers/gpu/drm/amd/pm/amdgpu_pm.c
> > +++ b/drivers/gpu/drm/amd/pm/amdgpu_pm.c
> > @@ -3263,15 +3263,15 @@ static umode_t
> hwmon_attributes_visible(struct kobject *kobj,
> > return 0;
> >
> > /* mask fan attributes if we have no bindings for this asic to expose
> */
> > -   if (((amdgpu_dpm_get_fan_speed_pwm(adev, ) == -EINVAL) &&
> > +   if ((!amdgpu_dpm_is_fan_operation_supported(adev, FAN_SPEED_PWM_RETRIEVING) &&
> 
> As per the current logic, it's really checking the hardware registers.
[Quan, Evan] I should probably mention that the "current" version you see now
is actually a regression introduced by the commit below:
801771de0331 drm/amd/pm: do not expose power implementation details to
amdgpu_pm.c

The very early version (which worked well) was something like below:
-   if (!is_support_sw_smu(adev)) {
-   /* mask fan attributes if we have no bindings for this asic to expose */
-   if ((!adev->powerplay.pp_funcs->get_fan_speed_pwm &&
-attr == _dev_attr_pwm1.dev_attr.attr) || /* can't query fan */
-   (!adev->powerplay.pp_funcs->get_fan_control_mode &&
-attr == _dev_attr_pwm1_enable.dev_attr.attr)) /* can't query state */
-   effective_mode &= ~S_IRUGO;

So the changes here really just revert to the old working version. The aim is
to provide a quick fix for the failures reported by CQE.
> For ex: we could have some SKUs that have PMFW based fan control and for
> some other SKUs, AIBs could be having a different cooling solution which
> doesn't make use of PMFW.
> 
> 
> >   attr == _dev_attr_pwm1.dev_attr.attr) || /* can't query fan */
> > -   ((amdgpu_dpm_get_fan_control_mode(adev, ) == -EOPNOTSUPP) &&
> > +   (!amdgpu_dpm_is_fan_operation_supported(adev, FAN_CONTROL_MODE_RETRIEVING) &&
> >  attr == _dev_attr_pwm1_enable.dev_attr.attr)) /* can't query state */
> > effective_mode &= ~S_IRUGO;
> >
> > -   if (((amdgpu_dpm_set_fan_speed_pwm(adev, speed) == -EINVAL) &&
> > +   if ((!amdgpu_dpm_is_fan_operation_supported(adev, FAN_SPEED_PWM_SETTING) &&
> >   attr == _dev_attr_pwm1.dev_attr.attr) || /* can't manage fan */
> > -   ((amdgpu_dpm_set_fan_control_mode(adev, speed) == -EOPNOTSUPP) &&
> > +   (!amdgpu_dpm_is_fan_operation_supported(adev, FAN_CONTROL_MODE_SETTING) &&
> >   attr == _dev_attr_pwm1_enable.dev_attr.attr)) /* can't manage state */
> > effective_mode &= ~S_IWUSR;
> >
> > @@ -3291,16 +3291,16 @@ static umode_t
> hwmon_attributes_visible(struct kobject 

[PATCH] drm/amd/pm: ignore failure of GmiPwrDnControl

2022-01-09 Thread Tao Zhou
PMFW only returns 0 on the master die and sends a NACK back on the other
dies for this message.

Signed-off-by: Tao Zhou 
---
 .../gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c   | 16 
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c 
b/drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c
index 261892977654..121b0ab5823f 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c
@@ -1625,10 +1625,18 @@ static int aldebaran_set_df_cstate(struct smu_context 
*smu,
 
 static int aldebaran_allow_xgmi_power_down(struct smu_context *smu, bool en)
 {
-   return smu_cmn_send_smc_msg_with_param(smu,
-  SMU_MSG_GmiPwrDnControl,
-  en ? 0 : 1,
-  NULL);
+   int ret;
+
+   ret = smu_cmn_send_smc_msg_with_param(smu,
+  SMU_MSG_GmiPwrDnControl,
+  en ? 0 : 1,
+  NULL);
+   /* The message only works on the master die and a NACK will be sent
+  back for other dies, so ignore CMD failure for the message */
+   if (ret == -EIO)
+   ret = 0;
+
+   return ret;
 }
 
 static const struct throttling_logging_label {
-- 
2.17.1



[PATCH] drm/amdgpu: use spin_lock_irqsave to avoid deadlock by local interrupt

2022-01-09 Thread Guchun Chen
This is observed in the SRIOV case with virtual KMS as the display.

_raw_spin_lock_irqsave+0x37/0x40
drm_handle_vblank+0x69/0x350 [drm]
? try_to_wake_up+0x432/0x5c0
? amdgpu_vkms_prepare_fb+0x1c0/0x1c0 [amdgpu]
drm_crtc_handle_vblank+0x17/0x20 [drm]
amdgpu_vkms_vblank_simulate+0x4d/0x80 [amdgpu]
__hrtimer_run_queues+0xfb/0x230
hrtimer_interrupt+0x109/0x220
__sysvec_apic_timer_interrupt+0x64/0xe0
asm_call_irq_on_stack+0x12/0x20

Fixes: ba5317109d0c ("drm/amdgpu: create amdgpu_vkms (v4)")
Signed-off-by: Guchun Chen 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c
index 2dcc68e04e84..d99c8779b51e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c
@@ -144,15 +144,16 @@ static void amdgpu_vkms_crtc_atomic_disable(struct 
drm_crtc *crtc,
 static void amdgpu_vkms_crtc_atomic_flush(struct drm_crtc *crtc,
  struct drm_atomic_state *state)
 {
+   unsigned long flags;
if (crtc->state->event) {
-   spin_lock(>dev->event_lock);
+   spin_lock_irqsave(>dev->event_lock, flags);
 
if (drm_crtc_vblank_get(crtc) != 0)
drm_crtc_send_vblank_event(crtc, crtc->state->event);
else
drm_crtc_arm_vblank_event(crtc, crtc->state->event);
 
-   spin_unlock(>dev->event_lock);
+   spin_unlock_irqrestore(>dev->event_lock, flags);
 
crtc->state->event = NULL;
}
-- 
2.17.1



[PATCH] drm/amdgpu: not return error on the init_apu_flags

2022-01-09 Thread Prike Liang
For some APU projects we don't need to assign flags to distinguish them,
so we need not return an error.

Change-Id: I92c1acb9ffbdba7e9a68469163911801db262412
Signed-off-by: Prike Liang 
Reviewed-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 4322d92c25c7..2a6671f37336 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -1449,7 +1449,7 @@ static int amdgpu_device_init_apu_flags(struct 
amdgpu_device *adev)
adev->apu_flags |= AMD_APU_IS_CYAN_SKILLFISH2;
break;
default:
-   return -EINVAL;
+   break;
}
 
return 0;
@@ -3497,9 +3497,7 @@ int amdgpu_device_init(struct amdgpu_device *adev,
mutex_init(>psp.mutex);
mutex_init(>notifier_lock);
 
-   r = amdgpu_device_init_apu_flags(adev);
-   if (r)
-   return r;
+	amdgpu_device_init_apu_flags(adev);
 
r = amdgpu_device_check_arguments(adev);
if (r)
-- 
2.17.1



Re: Various problems trying to vga-passthrough a Renoir iGPU to a xen/qubes-os hvm

2022-01-09 Thread Yann Dirson
Alex wrote:
> On Thu, Jan 6, 2022 at 10:38 AM Yann Dirson  wrote:
> >
> > Alex wrote:
> > > > How is the stolen memory communicated to the driver ?  That host
> > > > physical memory probably has to be mapped at the same guest
> > > > physical address for the magic to work, right ?
> > >
> > > Correct.  The driver reads the physical location of that memory from
> > > hardware registers.  Removing this chunk of code from gmc_v9_0.c will
> > > force the driver to use the BAR,
> >
> > That would only be a workaround for a missing mapping of stolen
> > memory to the guest, right ?
> 
> 
> Correct. That will use the PCI BAR rather than the underlying physical
> memory for CPU access to the carve out region.
> 
> >
> >
> > > but I'm not sure if there are any
> > > other places in the driver that make assumptions about using the
> > > physical host address or not on APUs off hand.
> >
> > gmc_v9_0_vram_gtt_location() updates vm_manager.vram_base_offset from
> > the same value.  I'm not sure I understand why in this case there is
> > no reason to use the BAR while there are some in gmc_v9_0_mc_init().
> >
> > vram_base_offset then gets used in several places:
> >
> > * amdgpu_gmc_init_pdb0, that seems likely enough to be problematic,
> >   right ?
> >   As a sidenote the XGMI offset added earlier gets subtracted
> >   here to deduce vram base addr
> >   (a couple of new acronyms there: PDB, PDE -- page directory
> >   base/entry?)
> >
> > * amdgpu_ttm_map_buffer, amdgpu_vm_bo_update_mapping: those seem to be
> >   as problematic
> >
> > * amdgpu_gmc_vram_mc2pa: until I got there I had assumed MC could
> >   stand for "memory controller", but then "MC address of buffer"
> >   makes me doubt
> >
> >
> 
> MC = memory controller (as in graphics memory controller).
> 
> These are GPU addresses not CPU addresses so they should be fine.
> 
> > >
> > > if ((adev->flags & AMD_IS_APU) ||
> > > (adev->gmc.xgmi.supported &&
> > >  adev->gmc.xgmi.connected_to_cpu)) {
> > > adev->gmc.aper_base =
> > > adev->gfxhub.funcs->get_mc_fb_offset(adev)
> > > +
> > > adev->gmc.xgmi.physical_node_id *
> > > adev->gmc.xgmi.node_segment_size;
> > > adev->gmc.aper_size = adev->gmc.real_vram_size;
> > > }
> >
> >
> > Now for the test... it does indeed seem to go much further, I even
> > lose the dom0's efifb to that black screen, hopefully showing the
> > driver started to set up the hardware.  Will probably still have to
> > hunt down whether it still tries to use efifb afterwards (can't see
> > why it would not, TBH, given the previous behaviour where it kept
> > using it after the guest failed to start).
> >
> > The log shows many details about TMR loading
> >
> > Then as expected:
> >
> > [2022-01-06 15:16:09] <6>[5.844589] amdgpu :00:05.0:
> > amdgpu: RAP: optional rap ta ucode is not available
> > [2022-01-06 15:16:09] <6>[5.844619] amdgpu :00:05.0:
> > amdgpu: SECUREDISPLAY: securedisplay ta ucode is not available
> > [2022-01-06 15:16:09] <7>[5.844639]
> > [drm:amdgpu_device_init.cold [amdgpu]] hw_init (phase2) of IP
> > block ...
> > [2022-01-06 15:16:09] <6>[5.845515] amdgpu :00:05.0:
> > amdgpu: SMU is initialized successfully!
> >
> >
> > not sure about that unhandled interrupt (and a bit worried about
> > messed-up logs):
> >
[2022-01-06 15:16:09] <7>[6.010681] amdgpu :00:05.0:
> > [drm:amdgpu_ring_test_helper [amdgpu]] ring test on sdma0 succeeded
> > [2022-01-06 15:16:10] <7>[6.010831] [drm:amdgpu_ih_process
> > [amdgpu]] amdgpu_ih_process: rptr 0, wptr 32
> > [2022-01-06 15:16:10] <7>[6.011002] [drm:amdgpu_irq_dispatch
> > [amdgpu]] Unhandled interrupt src_id: 243
> >
> >
> > then comes a first error:
> >
> > [2022-01-06 15:16:10] <6>[6.011785] [drm] Display Core
> > initialized with v3.2.149!
> > [2022-01-06 15:16:10] <6>[6.012714] [drm] DMUB hardware
> > initialized: version=0x0101001C
> > [2022-01-06 15:16:10] <3>[6.228263] [drm:dc_dmub_srv_wait_idle
> > [amdgpu]] *ERROR* Error waiting for DMUB idle: status=3
> > [2022-01-06 15:16:10] <7>[6.229125]
> > [drm:amdgpu_dm_init.isra.0.cold [amdgpu]] amdgpu: freesync_module
> > init done 76c7b459.
> > [2022-01-06 15:16:10] <7>[6.229677]
> > [drm:amdgpu_dm_init.isra.0.cold [amdgpu]] amdgpu: hdcp_workqueue
> > init done 87e28b47.
> > [2022-01-06 15:16:10] <7>[6.229979]
> > [drm:amdgpu_dm_init.isra.0.cold [amdgpu]]
> > amdgpu_dm_connector_init()
> >
> > ... which we can see again several times later, though the driver
> > still manages to finish init:
> >
> > [2022-01-06 15:16:10] <6>[6.615615] [drm] late_init of IP block
> > ...
> > [2022-01-06 15:16:10] <6>[6.615772] [drm] late_init of IP block
> > ...
> > [2022-01-06 15:16:10] 

[PATCH v7 6/6] drm/amdgpu: add drm buddy support to amdgpu

2022-01-09 Thread Arunpravin
- Remove drm_mm references and replace with drm buddy functionalities
- Add res cursor support for drm buddy

v2(Matthew Auld):
  - replace spinlock with mutex as we call kmem_cache_zalloc
(..., GFP_KERNEL) in drm_buddy_alloc() function

  - lock drm_buddy_block_trim() function, as the mark_free/mark_split
it calls are globally visible

v3(Matthew Auld):
  - remove trim method error handling as we address the failure case
at drm_buddy_block_trim() function

Signed-off-by: Arunpravin 
---
 drivers/gpu/drm/Kconfig   |   1 +
 .../gpu/drm/amd/amdgpu/amdgpu_res_cursor.h|  97 +--
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h   |   6 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c  | 259 ++
 4 files changed, 230 insertions(+), 133 deletions(-)

diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
index b85f7ffae621..572fcc4adedd 100644
--- a/drivers/gpu/drm/Kconfig
+++ b/drivers/gpu/drm/Kconfig
@@ -270,6 +270,7 @@ config DRM_AMDGPU
select HWMON
select BACKLIGHT_CLASS_DEVICE
select INTERVAL_TREE
+   select DRM_BUDDY
help
  Choose this option if you have a recent AMD Radeon graphics card.
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h
index acfa207cf970..da12b4ff2e45 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h
@@ -30,12 +30,15 @@
 #include 
 #include 
 
+#include "amdgpu_vram_mgr.h"
+
 /* state back for walking over vram_mgr and gtt_mgr allocations */
 struct amdgpu_res_cursor {
uint64_tstart;
uint64_tsize;
uint64_tremaining;
-   struct drm_mm_node  *node;
+   void*node;
+   uint32_tmem_type;
 };
 
 /**
@@ -52,27 +55,63 @@ static inline void amdgpu_res_first(struct ttm_resource 
*res,
uint64_t start, uint64_t size,
struct amdgpu_res_cursor *cur)
 {
+   struct drm_buddy_block *block;
+   struct list_head *head, *next;
struct drm_mm_node *node;
 
-   if (!res || res->mem_type == TTM_PL_SYSTEM) {
-   cur->start = start;
-   cur->size = size;
-   cur->remaining = size;
-   cur->node = NULL;
-   WARN_ON(res && start + size > res->num_pages << PAGE_SHIFT);
-   return;
-   }
+   if (!res)
+   goto err_out;
 
BUG_ON(start + size > res->num_pages << PAGE_SHIFT);
 
-   node = to_ttm_range_mgr_node(res)->mm_nodes;
-   while (start >= node->size << PAGE_SHIFT)
-   start -= node++->size << PAGE_SHIFT;
+   cur->mem_type = res->mem_type;
+
+   switch (cur->mem_type) {
+   case TTM_PL_VRAM:
+   head = _amdgpu_vram_mgr_node(res)->blocks;
+
+   block = list_first_entry_or_null(head,
+struct drm_buddy_block,
+link);
+   if (!block)
+   goto err_out;
+
+   while (start >= amdgpu_node_size(block)) {
+   start -= amdgpu_node_size(block);
+
+   next = block->link.next;
+   if (next != head)
+   block = list_entry(next, struct 
drm_buddy_block, link);
+   }
+
+   cur->start = amdgpu_node_start(block) + start;
+   cur->size = min(amdgpu_node_size(block) - start, size);
+   cur->remaining = size;
+   cur->node = block;
+   break;
+   case TTM_PL_TT:
+   node = to_ttm_range_mgr_node(res)->mm_nodes;
+   while (start >= node->size << PAGE_SHIFT)
+   start -= node++->size << PAGE_SHIFT;
+
+   cur->start = (node->start << PAGE_SHIFT) + start;
+   cur->size = min((node->size << PAGE_SHIFT) - start, size);
+   cur->remaining = size;
+   cur->node = node;
+   break;
+   default:
+   goto err_out;
+   }
 
-   cur->start = (node->start << PAGE_SHIFT) + start;
-   cur->size = min((node->size << PAGE_SHIFT) - start, size);
+   return;
+
+err_out:
+   cur->start = start;
+   cur->size = size;
cur->remaining = size;
-   cur->node = node;
+   cur->node = NULL;
+   WARN_ON(res && start + size > res->num_pages << PAGE_SHIFT);
+   return;
 }
 
 /**
@@ -85,7 +124,9 @@ static inline void amdgpu_res_first(struct ttm_resource *res,
  */
 static inline void amdgpu_res_next(struct amdgpu_res_cursor *cur, uint64_t 
size)
 {
-   struct drm_mm_node *node = cur->node;
+   struct drm_buddy_block *block;
+   struct drm_mm_node *node;
+   struct list_head *next;
 
BUG_ON(size > 

[PATCH v7 3/6] drm: implement top-down allocation method

2022-01-09 Thread Arunpravin
Implemented a function which walks through the order list,
compares the offsets and returns the maximum offset block;
this method is unpredictable in obtaining the high range
address blocks, as it depends on allocation and deallocation.
For instance, if the driver requests an address in a specific
low range, the allocator traverses from the root block and
splits the larger blocks until it reaches the specific block;
in the process of splitting, the lower orders in the freelist
get occupied with low range address blocks, so a subsequent
TOPDOWN memory request may be given a low range block. To
overcome this issue, we may go with the approach below.

The other approach: sort each order list's entries in
ascending order and compare the last entry of each order
list in the freelist, returning the max block. This creates
sorting overhead on every drm_buddy_free() request and
splits up larger blocks for a single page request.

v2:
  - Fix alignment issues(Matthew Auld)
  - Remove unnecessary list_empty check(Matthew Auld)
  - merged the below patch to see the feature in action
 - add top-down alloc support to i915 driver

Signed-off-by: Arunpravin 
---
 drivers/gpu/drm/drm_buddy.c   | 36 ---
 drivers/gpu/drm/i915/i915_ttm_buddy_manager.c |  3 ++
 include/drm/drm_buddy.h   |  1 +
 3 files changed, 35 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
index cd914d104b1a..11356c2bb7aa 100644
--- a/drivers/gpu/drm/drm_buddy.c
+++ b/drivers/gpu/drm/drm_buddy.c
@@ -369,6 +369,26 @@ alloc_range_bias(struct drm_buddy *mm,
return ERR_PTR(err);
 }
 
+static struct drm_buddy_block *
+get_maxblock(struct list_head *head)
+{
+   struct drm_buddy_block *max_block = NULL, *node;
+
+   max_block = list_first_entry_or_null(head,
+struct drm_buddy_block,
+link);
+   if (!max_block)
+   return NULL;
+
+   list_for_each_entry(node, head, link) {
+   if (drm_buddy_block_offset(node) >
+   drm_buddy_block_offset(max_block))
+   max_block = node;
+   }
+
+   return max_block;
+}
+
 static struct drm_buddy_block *
 alloc_from_freelist(struct drm_buddy *mm,
unsigned int order,
@@ -379,11 +399,17 @@ alloc_from_freelist(struct drm_buddy *mm,
int err;
 
for (i = order; i <= mm->max_order; ++i) {
-   block = list_first_entry_or_null(>free_list[i],
-struct drm_buddy_block,
-link);
-   if (block)
-   break;
+   if (flags & DRM_BUDDY_TOPDOWN_ALLOCATION) {
+   block = get_maxblock(>free_list[i]);
+   if (block)
+   break;
+   } else {
+   block = list_first_entry_or_null(>free_list[i],
+struct drm_buddy_block,
+link);
+   if (block)
+   break;
+   }
}
 
if (!block)
diff --git a/drivers/gpu/drm/i915/i915_ttm_buddy_manager.c 
b/drivers/gpu/drm/i915/i915_ttm_buddy_manager.c
index 143657a706ae..ae9201246bb5 100644
--- a/drivers/gpu/drm/i915/i915_ttm_buddy_manager.c
+++ b/drivers/gpu/drm/i915/i915_ttm_buddy_manager.c
@@ -53,6 +53,9 @@ static int i915_ttm_buddy_man_alloc(struct 
ttm_resource_manager *man,
INIT_LIST_HEAD(_res->blocks);
bman_res->mm = mm;
 
+   if (place->flags & TTM_PL_FLAG_TOPDOWN)
+   bman_res->flags |= DRM_BUDDY_TOPDOWN_ALLOCATION;
+
if (place->fpfn || lpfn != man->size)
bman_res->flags |= DRM_BUDDY_RANGE_ALLOCATION;
 
diff --git a/include/drm/drm_buddy.h b/include/drm/drm_buddy.h
index 865664b90a8a..424fc443115e 100644
--- a/include/drm/drm_buddy.h
+++ b/include/drm/drm_buddy.h
@@ -28,6 +28,7 @@
 })
 
 #define DRM_BUDDY_RANGE_ALLOCATION (1 << 0)
+#define DRM_BUDDY_TOPDOWN_ALLOCATION (1 << 1)
 
 struct drm_buddy_block {
 #define DRM_BUDDY_HEADER_OFFSET GENMASK_ULL(63, 12)
-- 
2.25.1



[PATCH v7 4/6] drm: implement a method to free unused pages

2022-01-09 Thread Arunpravin
On contiguous allocation, we round up the size
to the *next* power of 2; implement a function
to free the unused pages after the newly allocated block.

v2(Matthew Auld):
  - replace function name 'drm_buddy_free_unused_pages' with
drm_buddy_block_trim
  - replace input argument name 'actual_size' with 'new_size'
  - add more validation checks for input arguments
  - add overlaps check to avoid needless searching and splitting
  - merged the below patch to see the feature in action
 - add free unused pages support to i915 driver
  - lock drm_buddy_block_trim() function, as the mark_free/mark_split it calls
are globally visible

v3(Matthew Auld):
  - remove trim method error handling as we address the failure case
at drm_buddy_block_trim() function

v4:
  - in case of trim, the split_block failure path at __alloc_range()
marks the block as free and removes it from the original list,
potentially also freeing it; to overcome this problem, we turn
the drm_buddy_block_trim() input node into a temporary node to
prevent recursively freeing itself, but still retain the
un-splitting/freeing of the other nodes (Matthew Auld)

  - modify the drm_buddy_block_trim() function return type

v5(Matthew Auld):
  - revert drm_buddy_block_trim() function return type changes in v4
  - modify drm_buddy_block_trim(), passing argument original_size instead
of n_pages, as n_pages has already been rounded up to the next
power-of-two and passing n_pages results in a noop

Signed-off-by: Arunpravin 
---
 drivers/gpu/drm/drm_buddy.c   | 65 +++
 drivers/gpu/drm/i915/i915_ttm_buddy_manager.c | 10 +++
 include/drm/drm_buddy.h   |  4 ++
 3 files changed, 79 insertions(+)

diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
index 11356c2bb7aa..d97b0ba8aa1f 100644
--- a/drivers/gpu/drm/drm_buddy.c
+++ b/drivers/gpu/drm/drm_buddy.c
@@ -540,6 +540,71 @@ static int __drm_buddy_alloc_range(struct drm_buddy *mm,
return __alloc_range(mm, , start, size, blocks);
 }
 
+/**
+ * drm_buddy_block_trim - free unused pages
+ *
+ * @mm: DRM buddy manager
+ * @new_size: original size requested
+ * @blocks: output list head to add allocated blocks
+ *
+ * For contiguous allocation, we round up the size to the nearest
+ * power of two value; drivers consume the *actual* size, so the
+ * remaining portions are unused and can be freed.
+ *
+ * Returns:
+ * 0 on success, error code on failure.
+ */
+int drm_buddy_block_trim(struct drm_buddy *mm,
+u64 new_size,
+struct list_head *blocks)
+{
+   struct drm_buddy_block *parent;
+   struct drm_buddy_block *block;
+   LIST_HEAD(dfs);
+   u64 new_start;
+   int err;
+
+   if (!list_is_singular(blocks))
+   return -EINVAL;
+
+   block = list_first_entry(blocks,
+struct drm_buddy_block,
+link);
+
+   if (!drm_buddy_block_is_allocated(block))
+   return -EINVAL;
+
+   if (new_size > drm_buddy_block_size(mm, block))
+   return -EINVAL;
+
+   if (!new_size || !IS_ALIGNED(new_size, mm->chunk_size))
+   return -EINVAL;
+
+   if (new_size == drm_buddy_block_size(mm, block))
+   return 0;
+
+   list_del(>link);
+   mark_free(mm, block);
+   mm->avail += drm_buddy_block_size(mm, block);
+
+   /* Prevent recursively freeing this node */
+   parent = block->parent;
+   block->parent = NULL;
+
+   new_start = drm_buddy_block_offset(block);
+   list_add(>tmp_link, );
+   err =  __alloc_range(mm, , new_start, new_size, blocks);
+   if (err) {
+   mark_allocated(block);
+   mm->avail -= drm_buddy_block_size(mm, block);
+   list_add(>link, blocks);
+   }
+
+   block->parent = parent;
+   return err;
+}
+EXPORT_SYMBOL(drm_buddy_block_trim);
+
 /**
  * drm_buddy_alloc_blocks - allocate power-of-two blocks
  *
diff --git a/drivers/gpu/drm/i915/i915_ttm_buddy_manager.c 
b/drivers/gpu/drm/i915/i915_ttm_buddy_manager.c
index ae9201246bb5..626108fb9725 100644
--- a/drivers/gpu/drm/i915/i915_ttm_buddy_manager.c
+++ b/drivers/gpu/drm/i915/i915_ttm_buddy_manager.c
@@ -97,6 +97,16 @@ static int i915_ttm_buddy_man_alloc(struct 
ttm_resource_manager *man,
if (unlikely(err))
goto err_free_blocks;
 
+   if (place->flags & TTM_PL_FLAG_CONTIGUOUS) {
+   u64 original_size = (u64)bman_res->base.num_pages << PAGE_SHIFT;
+
+   mutex_lock(>lock);
+   drm_buddy_block_trim(mm,
+   original_size,
+   _res->blocks);
+   mutex_unlock(>lock);
+   }
+
*res = _res->base;
return 0;
 
diff --git a/include/drm/drm_buddy.h b/include/drm/drm_buddy.h
index 424fc443115e..17ca928fce8e 100644
--- a/include/drm/drm_buddy.h
+++ 

[PATCH v7 5/6] drm/amdgpu: move vram inline functions into a header

2022-01-09 Thread Arunpravin
Move shared vram inline functions and structs
into a header file

Signed-off-by: Arunpravin 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.h | 51 
 1 file changed, 51 insertions(+)
 create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.h

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.h
new file mode 100644
index ..59983464cce5
--- /dev/null
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.h
@@ -0,0 +1,51 @@
+/* SPDX-License-Identifier: MIT
+ * Copyright 2021 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ */
+
+#ifndef __AMDGPU_VRAM_MGR_H__
+#define __AMDGPU_VRAM_MGR_H__
+
+#include 
+
+struct amdgpu_vram_mgr_node {
+   struct ttm_resource base;
+   struct list_head blocks;
+   unsigned long flags;
+};
+
+static inline u64 amdgpu_node_start(struct drm_buddy_block *block)
+{
+   return drm_buddy_block_offset(block);
+}
+
+static inline u64 amdgpu_node_size(struct drm_buddy_block *block)
+{
+   return PAGE_SIZE << drm_buddy_block_order(block);
+}
+
+static inline struct amdgpu_vram_mgr_node *
+to_amdgpu_vram_mgr_node(struct ttm_resource *res)
+{
+   return container_of(res, struct amdgpu_vram_mgr_node, base);
+}
+
+#endif
-- 
2.25.1



[PATCH v7 2/6] drm: improve drm_buddy_alloc function

2022-01-09 Thread Arunpravin
- Make drm_buddy_alloc a single function to handle
  range allocation and non-range allocation demands

- Implemented a new function alloc_range() which allocates
  the requested power-of-two block, complying with range limitations

- Moved order computation and memory alignment logic from
  i915 driver to drm buddy

v2:
  merged below changes to keep the build unbroken
   - drm_buddy_alloc_range() becomes obsolete and may be removed
   - enable ttm range allocation (fpfn / lpfn) support in i915 driver
   - apply enhanced drm_buddy_alloc() function to i915 driver

v3(Matthew Auld):
  - Fix alignment issues and remove unnecessary list_empty check
  - add more validation checks for input arguments
  - make alloc_range() block allocations as bottom-up
  - optimize order computation logic
  - replace uint64_t with u64, which is preferred in the kernel

v4(Matthew Auld):
  - keep drm_buddy_alloc_range() function implementation for generic
actual range allocations
  - keep alloc_range() implementation for end bias allocations

v5(Matthew Auld):
  - modify drm_buddy_alloc() passing argument place->lpfn to lpfn
as place->lpfn will currently always be zero for i915

Signed-off-by: Arunpravin 
---
 drivers/gpu/drm/drm_buddy.c   | 316 +-
 drivers/gpu/drm/i915/i915_ttm_buddy_manager.c |  67 ++--
 drivers/gpu/drm/i915/i915_ttm_buddy_manager.h |   2 +
 include/drm/drm_buddy.h   |  22 +-
 4 files changed, 285 insertions(+), 122 deletions(-)

diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
index 46456b41da49..cd914d104b1a 100644
--- a/drivers/gpu/drm/drm_buddy.c
+++ b/drivers/gpu/drm/drm_buddy.c
@@ -282,23 +282,97 @@ void drm_buddy_free_list(struct drm_buddy *mm, struct 
list_head *objects)
 }
 EXPORT_SYMBOL(drm_buddy_free_list);
 
-/**
- * drm_buddy_alloc_blocks - allocate power-of-two blocks
- *
- * @mm: DRM buddy manager to allocate from
- * @order: size of the allocation
- *
- * The order value here translates to:
- *
- * 0 = 2^0 * mm->chunk_size
- * 1 = 2^1 * mm->chunk_size
- * 2 = 2^2 * mm->chunk_size
- *
- * Returns:
- * allocated ptr to the _buddy_block on success
- */
-struct drm_buddy_block *
-drm_buddy_alloc_blocks(struct drm_buddy *mm, unsigned int order)
+static inline bool overlaps(u64 s1, u64 e1, u64 s2, u64 e2)
+{
+   return s1 <= e2 && e1 >= s2;
+}
+
+static inline bool contains(u64 s1, u64 e1, u64 s2, u64 e2)
+{
+   return s1 <= s2 && e1 >= e2;
+}
+
+static struct drm_buddy_block *
+alloc_range_bias(struct drm_buddy *mm,
+u64 start, u64 end,
+unsigned int order)
+{
+   struct drm_buddy_block *block;
+   struct drm_buddy_block *buddy;
+   LIST_HEAD(dfs);
+   int err;
+   int i;
+
+   end = end - 1;
+
+   for (i = 0; i < mm->n_roots; ++i)
+   list_add_tail(&mm->roots[i]->tmp_link, &dfs);
+
+   do {
+   u64 block_start;
+   u64 block_end;
+
+   block = list_first_entry_or_null(&dfs,
+struct drm_buddy_block,
+tmp_link);
+   if (!block)
+   break;
+
+   list_del(&block->tmp_link);
+
+   if (drm_buddy_block_order(block) < order)
+   continue;
+
+   block_start = drm_buddy_block_offset(block);
+   block_end = block_start + drm_buddy_block_size(mm, block) - 1;
+
+   if (!overlaps(start, end, block_start, block_end))
+   continue;
+
+   if (drm_buddy_block_is_allocated(block))
+   continue;
+
+   if (contains(start, end, block_start, block_end) &&
+   order == drm_buddy_block_order(block)) {
+   /*
+* Find the free block within the range.
+*/
+   if (drm_buddy_block_is_free(block))
+   return block;
+
+   continue;
+   }
+
+   if (!drm_buddy_block_is_split(block)) {
+   err = split_block(mm, block);
+   if (unlikely(err))
+   goto err_undo;
+   }
+
+   list_add(&block->right->tmp_link, &dfs);
+   list_add(&block->left->tmp_link, &dfs);
+   } while (1);
+
+   return ERR_PTR(-ENOSPC);
+
+err_undo:
+   /*
+* We really don't want to leave around a bunch of split blocks, since
+* bigger is better, so make sure we merge everything back before we
+* free the allocated blocks.
+*/
+   buddy = get_buddy(block);
+   if (buddy &&
+   (drm_buddy_block_is_free(block) &&
+drm_buddy_block_is_free(buddy)))
+   __drm_buddy_free(mm, block);
+   return ERR_PTR(err);
+}
+
+static struct drm_buddy_block *
+alloc_from_freelist(struct drm_buddy *mm,
+  

[PATCH v7 1/6] drm: move the buddy allocator from i915 into common drm

2022-01-09 Thread Arunpravin
Move the base i915 buddy allocator code into drm
- Move i915_buddy.h to include/drm
- Move i915_buddy.c to drm root folder
- Rename "i915" string with "drm" string wherever applicable
- Rename "I915" string with "DRM" string wherever applicable
- Fix header file dependencies
- Fix alignment issues
- add Makefile support for drm buddy
- export functions and write kerneldoc description
- Remove i915 selftest config check condition as buddy selftest
  will be moved to drm selftest folder

cleanup i915 buddy references in i915 driver module
and replace with drm buddy

v2:
  - include header file in alphabetical order(Thomas)
  - merged changes listed in the body section into a single patch
to keep the build intact(Christian, Jani)

v3:
  - make drm buddy a separate module(Thomas, Christian)

v4:
  - Fix build error reported by kernel test robot 
  - removed i915 buddy selftest from i915_mock_selftests.h to
avoid build error
  - removed selftests/i915_buddy.c file as we create a new set of
buddy test cases in drm/selftests folder

v5:
  - Fix merge conflict issue

v6:
  - replace drm_buddy_mm structure name as drm_buddy(Thomas, Christian)
  - replace drm_buddy_alloc() function name as drm_buddy_alloc_blocks()
(Thomas)
  - replace drm_buddy_free() function name as drm_buddy_free_block()
(Thomas)
  - export drm_buddy_free_block() function
  - fix multiple instances of KMEM_CACHE() entry

Signed-off-by: Arunpravin 
---
 drivers/gpu/drm/Kconfig   |   6 +
 drivers/gpu/drm/Makefile  |   2 +
 drivers/gpu/drm/drm_buddy.c   | 535 
 drivers/gpu/drm/i915/Kconfig  |   1 +
 drivers/gpu/drm/i915/Makefile |   1 -
 drivers/gpu/drm/i915/i915_buddy.c | 466 ---
 drivers/gpu/drm/i915/i915_buddy.h | 143 
 drivers/gpu/drm/i915/i915_module.c|   3 -
 drivers/gpu/drm/i915/i915_scatterlist.c   |  11 +-
 drivers/gpu/drm/i915/i915_ttm_buddy_manager.c |  33 +-
 drivers/gpu/drm/i915/i915_ttm_buddy_manager.h |   4 +-
 drivers/gpu/drm/i915/selftests/i915_buddy.c   | 787 --
 .../drm/i915/selftests/i915_mock_selftests.h  |   1 -
 .../drm/i915/selftests/intel_memory_region.c  |  13 +-
 include/drm/drm_buddy.h   | 150 
 15 files changed, 725 insertions(+), 1431 deletions(-)
 create mode 100644 drivers/gpu/drm/drm_buddy.c
 delete mode 100644 drivers/gpu/drm/i915/i915_buddy.c
 delete mode 100644 drivers/gpu/drm/i915/i915_buddy.h
 delete mode 100644 drivers/gpu/drm/i915/selftests/i915_buddy.c
 create mode 100644 include/drm/drm_buddy.h

diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
index b1f22e457fd0..b85f7ffae621 100644
--- a/drivers/gpu/drm/Kconfig
+++ b/drivers/gpu/drm/Kconfig
@@ -198,6 +198,12 @@ config DRM_TTM
  GPU memory types. Will be enabled automatically if a device driver
  uses it.
 
+config DRM_BUDDY
+   tristate
+   depends on DRM
+   help
+ A page based buddy allocator
+
 config DRM_VRAM_HELPER
tristate
depends on DRM
diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile
index 301a44dc18e3..ff0286eca254 100644
--- a/drivers/gpu/drm/Makefile
+++ b/drivers/gpu/drm/Makefile
@@ -42,6 +42,8 @@ obj-$(CONFIG_DRM_GEM_CMA_HELPER) += drm_cma_helper.o
 drm_shmem_helper-y := drm_gem_shmem_helper.o
 obj-$(CONFIG_DRM_GEM_SHMEM_HELPER) += drm_shmem_helper.o
 
+obj-$(CONFIG_DRM_BUDDY) += drm_buddy.o
+
 drm_vram_helper-y := drm_gem_vram_helper.o
 obj-$(CONFIG_DRM_VRAM_HELPER) += drm_vram_helper.o
 
diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
new file mode 100644
index ..46456b41da49
--- /dev/null
+++ b/drivers/gpu/drm/drm_buddy.c
@@ -0,0 +1,535 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2021 Intel Corporation
+ */
+
+#include <linux/kmemleak.h>
+#include <linux/module.h>
+#include <linux/sizes.h>
+
+#include <drm/drm_buddy.h>
+
+static struct kmem_cache *slab_blocks;
+
+static struct drm_buddy_block *drm_block_alloc(struct drm_buddy *mm,
+  struct drm_buddy_block *parent,
+  unsigned int order,
+  u64 offset)
+{
+   struct drm_buddy_block *block;
+
+   BUG_ON(order > DRM_BUDDY_MAX_ORDER);
+
+   block = kmem_cache_zalloc(slab_blocks, GFP_KERNEL);
+   if (!block)
+   return NULL;
+
+   block->header = offset;
+   block->header |= order;
+   block->parent = parent;
+
+   BUG_ON(block->header & DRM_BUDDY_HEADER_UNUSED);
+   return block;
+}
+
+static void drm_block_free(struct drm_buddy *mm,
+  struct drm_buddy_block *block)
+{
+   kmem_cache_free(slab_blocks, block);
+}
+
+static void mark_allocated(struct drm_buddy_block *block)
+{
+   block->header &= ~DRM_BUDDY_HEADER_STATE;
+   block->header |= DRM_BUDDY_ALLOCATED;
+
   list_del(&block->link);
+}
+
+static void 

RE: [PATCH] drm/amd/amdgpu: Add pcie indirect support to amdgpu_mm_wreg_mmio_rlc()

2022-01-09 Thread Zhang, Hawking
[AMD Official Use Only]

RE - Actually, for older asics, shouldn't we be using mmINDEX/mmDATA rather 
than the pcie indirect registers?  Or is that handled already somehow?

I remember we checked this with the hw team before (possibly two years ago, when 
we made the change in the amdgpu_device_rreg/wreg helpers). The answer was that 
it is safe to retire the mmINDEX/DATA approach for SI and onwards; 
PCIE_INDEX/DATA should be good enough for indirect access in the amdgpu driver. 
For the radeon driver, mmINDEX/mmDATA shall still be kept.

Regards,
Hawking
-Original Message-
From: amd-gfx  On Behalf Of Alex Deucher
Sent: Sunday, January 9, 2022 05:13
To: StDenis, Tom 
Cc: amd-gfx list 
Subject: Re: [PATCH] drm/amd/amdgpu: Add pcie indirect support to 
amdgpu_mm_wreg_mmio_rlc()

On Fri, Jan 7, 2022 at 7:07 AM Tom St Denis  wrote:
>
> The function amdgpu_mm_wreg_mmio_rlc() is used by debugfs to write to 
> MMIO registers.  It didn't support registers beyond the BAR mapped 
> MMIO space.  This adds pcie indirect write support.
>
> Signed-off-by: Tom St Denis 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index c38e0e87090b..53a04095a6db 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -552,7 +552,7 @@ void amdgpu_device_wreg(struct amdgpu_device 
> *adev,  }
>
>  /**
> - * amdgpu_mm_wreg_mmio_rlc -  write register either with mmio or with 
> RLC path if in range
> + * amdgpu_mm_wreg_mmio_rlc -  write register either with 
> + direct/indirect mmio or with RLC path if in range
>   *
>   * this function is invoked only the debugfs register access
>   */
> @@ -567,6 +567,8 @@ void amdgpu_mm_wreg_mmio_rlc(struct amdgpu_device *adev,
> adev->gfx.rlc.funcs->is_rlcg_access_range) {
> if (adev->gfx.rlc.funcs->is_rlcg_access_range(adev, reg))
> return adev->gfx.rlc.funcs->sriov_wreg(adev, 
> reg, v, 0, 0);
> +   } else if ((reg * 4) >= adev->rmmio_size) {
> +   adev->pcie_wreg(adev, reg * 4, v);

Actually, for older asics, shouldn't we be using mmINDEX/mmDATA rather than 
the pcie indirect registers?  Or is that handled already somehow?

Alex

> } else {
> writel(v, ((void __iomem *)adev->rmmio) + (reg * 4));
> }
> --
> 2.32.0
>


Re: [PATCH v6 1/6] drm: move the buddy allocator from i915 into common drm

2022-01-09 Thread Arunpravin



On 07/01/22 9:27 pm, Christian König wrote:
> Am 07.01.22 um 16:49 schrieb Matthew Auld:
>> On 26/12/2021 22:24, Arunpravin wrote:
>>> Move the base i915 buddy allocator code into drm
>>> - Move i915_buddy.h to include/drm
>>> - Move i915_buddy.c to drm root folder
>>> - Rename "i915" string with "drm" string wherever applicable
>>> - Rename "I915" string with "DRM" string wherever applicable
>>> - Fix header file dependencies
>>> - Fix alignment issues
>>> - add Makefile support for drm buddy
>>> - export functions and write kerneldoc description
>>> - Remove i915 selftest config check condition as buddy selftest
>>>    will be moved to drm selftest folder
>>>
>>> cleanup i915 buddy references in i915 driver module
>>> and replace with drm buddy
>>>
>>> v2:
>>>    - include header file in alphabetical order(Thomas)
>>>    - merged changes listed in the body section into a single patch
>>>  to keep the build intact(Christian, Jani)
>>>
>>> v3:
>>>    - make drm buddy a separate module(Thomas, Christian)
>>>
>>> v4:
>>>    - Fix build error reported by kernel test robot 
>>>    - removed i915 buddy selftest from i915_mock_selftests.h to
>>>  avoid build error
>>>    - removed selftests/i915_buddy.c file as we create a new set of
>>>  buddy test cases in drm/selftests folder
>>>
>>> v5:
>>>    - Fix merge conflict issue
>>>
>>> Signed-off-by: Arunpravin 
>>
>> 
>>
>>> +int drm_buddy_init(struct drm_buddy_mm *mm, u64 size, u64 chunk_size)
>>> +{
>>> +    unsigned int i;
>>> +    u64 offset;
>>> +
>>> +    if (size < chunk_size)
>>> +    return -EINVAL;
>>> +
>>> +    if (chunk_size < PAGE_SIZE)
>>> +    return -EINVAL;
>>> +
>>> +    if (!is_power_of_2(chunk_size))
>>> +    return -EINVAL;
>>> +
>>> +    size = round_down(size, chunk_size);
>>> +
>>> +    mm->size = size;
>>> +    mm->avail = size;
>>> +    mm->chunk_size = chunk_size;
>>> +    mm->max_order = ilog2(size) - ilog2(chunk_size);
>>> +
>>> +    BUG_ON(mm->max_order > DRM_BUDDY_MAX_ORDER);
>>> +
>>> +    mm->slab_blocks = KMEM_CACHE(drm_buddy_block, 0);
>>> +    if (!mm->slab_blocks)
>>> +    return -ENOMEM;
>>
>> It looks like every KMEM_CACHE() also creates a debugfs entry? See the 
>> error here[1]. I guess because we end with multiple instances in i915. 
>> If so, is it possible to have a single KMEM_CACHE() as part of the 
>> buddy module, similar to what i915 was doing previously?
> 
> Oh, that is a really good point, this code here doesn't make too much sense.
> 
> The value of a KMEM_CACHE() is to allow speeding up allocation of the 
> same structure size between different drm_buddy objects. If you allocate 
> one cache per drm_buddy that makes the whole functionality useless.
> 
> Please fix, this is actually a bug.
> 
> Christian.
> 

I fixed this in the v7 version

Thanks,
Arun
>>
>> [1] 
>> https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_8217/shard-skl4/igt@i915_selftest@mock@memory_region.html
>