Re: [PATCH] drm/amdgpu: drop redundant null-pointer checks in amdgpu_ttm_tt_populate() and amdgpu_ttm_tt_unpopulate()

2021-08-04 Thread Alex Deucher
Applied.  Thanks!

Alex

On Wed, Aug 4, 2021 at 2:49 AM Christian König  wrote:
>
> Am 04.08.21 um 03:51 schrieb Tuo Li:
> > The varialbe gtt in the function amdgpu_ttm_tt_populate() and
> > amdgpu_ttm_tt_unpopulate() is guaranteed to be not NULL in the context.
> > Thus the null-pointer checks are redundant and can be dropped.
> >
> > Reported-by: TOTE Robot 
> > Signed-off-by: Tuo Li 
>
> Reviewed-by: Christian König 
>
> > ---
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 4 ++--
> >   1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> > index 3a55f08e00e1..719539bd6c44 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> > @@ -1121,7 +1121,7 @@ static int amdgpu_ttm_tt_populate(struct ttm_device 
> > *bdev,
> >   struct amdgpu_ttm_tt *gtt = (void *)ttm;
> >
> >   /* user pages are bound by amdgpu_ttm_tt_pin_userptr() */
> > - if (gtt && gtt->userptr) {
> > + if (gtt->userptr) {
> >   ttm->sg = kzalloc(sizeof(struct sg_table), GFP_KERNEL);
> >   if (!ttm->sg)
> >   return -ENOMEM;
> > @@ -1146,7 +1146,7 @@ static void amdgpu_ttm_tt_unpopulate(struct 
> > ttm_device *bdev,
> >   struct amdgpu_ttm_tt *gtt = (void *)ttm;
> >   struct amdgpu_device *adev;
> >
> > - if (gtt && gtt->userptr) {
> > + if (gtt->userptr) {
> >   amdgpu_ttm_tt_set_user_pages(ttm, NULL);
> >   kfree(ttm->sg);
> >   ttm->sg = NULL;
>


Re: [PATCH v2] drm/radeon: Update pitch for page flip

2021-08-04 Thread Alex Deucher
On Tue, Aug 3, 2021 at 10:39 PM Zhenneng Li  wrote:
>
>
> When primary bo is updated, crtc's pitch may
> have not been updated, this will lead to show
> disorder content when user changes display mode,
> we update crtc's pitch in page flip to avoid
> this bug.
> This refers to amdgpu's pageflip.
>
> v1->v2:
> Update all of the pitch in all of the page_flip functions
> in radeon rather than just the evergreen one.
>
> Cc: Alex Deucher 
> Cc: "Christian König" 
> Cc: "Pan, Xinhui" 
> Cc: David Airlie 
> Cc: Daniel Vetter 
> Cc: amd-gfx@lists.freedesktop.org
> Cc: dri-de...@lists.freedesktop.org
> Cc: linux-ker...@vger.kernel.org
> Signed-off-by: Zhenneng Li 
> ---
>  drivers/gpu/drm/radeon/evergreen.c | 8 +++-
>  drivers/gpu/drm/radeon/r100.c  | 5 +
>  drivers/gpu/drm/radeon/rs600.c | 8 +++-
>  drivers/gpu/drm/radeon/rv770.c | 8 +++-
>  4 files changed, 26 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/radeon/evergreen.c 
> b/drivers/gpu/drm/radeon/evergreen.c
> index 36a888e1b179..eeb590d2dec2 100644
> --- a/drivers/gpu/drm/radeon/evergreen.c
> +++ b/drivers/gpu/drm/radeon/evergreen.c
> @@ -28,6 +28,7 @@
>
>  #include 
>  #include 
> +#include 
>
>  #include "atom.h"
>  #include "avivod.h"
> @@ -1414,10 +1415,15 @@ void evergreen_page_flip(struct radeon_device *rdev, 
> int crtc_id, u64 crtc_base,
>  bool async)
>  {
> struct radeon_crtc *radeon_crtc = rdev->mode_info.crtcs[crtc_id];
> +   struct drm_framebuffer *fb = radeon_crtc->base.primary->fb;
>
> -   /* update the scanout addresses */
> +   /* flip at hsync for async, default is vsync */
> WREG32(EVERGREEN_GRPH_FLIP_CONTROL + radeon_crtc->crtc_offset,
>async ? EVERGREEN_GRPH_SURFACE_UPDATE_H_RETRACE_EN : 0);
> +   /* update pitch */
> +   WREG32(EVERGREEN_GRPH_PITCH + radeon_crtc->crtc_offset,
> +  fb->pitches[0] / fb->format->cpp[0]);
> +   /* update the scanout addresses */
> WREG32(EVERGREEN_GRPH_PRIMARY_SURFACE_ADDRESS_HIGH + 
> radeon_crtc->crtc_offset,
>upper_32_bits(crtc_base));
> WREG32(EVERGREEN_GRPH_PRIMARY_SURFACE_ADDRESS + 
> radeon_crtc->crtc_offset,
> diff --git a/drivers/gpu/drm/radeon/r100.c b/drivers/gpu/drm/radeon/r100.c
> index ba724198b72e..1268854552ff 100644
> --- a/drivers/gpu/drm/radeon/r100.c
> +++ b/drivers/gpu/drm/radeon/r100.c
> @@ -162,6 +162,7 @@ void r100_wait_for_vblank(struct radeon_device *rdev, int 
> crtc)
>  void r100_page_flip(struct radeon_device *rdev, int crtc_id, u64 crtc_base, 
> bool async)
>  {
> struct radeon_crtc *radeon_crtc = rdev->mode_info.crtcs[crtc_id];
> +   struct drm_framebuffer *fb = radeon_crtc->base.primary->fb;
> u32 tmp = ((u32)crtc_base) | RADEON_CRTC_OFFSET__OFFSET_LOCK;
> int i;
>
> @@ -169,6 +170,10 @@ void r100_page_flip(struct radeon_device *rdev, int 
> crtc_id, u64 crtc_base, bool
> /* update the scanout addresses */
> WREG32(RADEON_CRTC_OFFSET + radeon_crtc->crtc_offset, tmp);
>
> +   /* update pitch */
> +   WREG32(RADEON_CRTC_PITCH + radeon_crtc->crtc_offset,
> +  fb->pitches[0] / fb->format->cpp[0]);
> +

This needs the follow formatting (from radeon_legacy_crtc.c):
pitch_pixels = fb->pitches[0] / fb->format->cpp[0];
crtc_pitch = DIV_ROUND_UP(pitch_pixels * fb->format->cpp[0] * 8,
  fb->format->cpp[0] * 8 * 8);
crtc_pitch |= crtc_pitch << 16;
WREG32(RADEON_CRTC_PITCH + radeon_crtc->crtc_offset, crtc_pitch);

With that fixed,
Reviewed-by: Alex Deucher 

> /* Wait for update_pending to go high. */
> for (i = 0; i < rdev->usec_timeout; i++) {
> if (RREG32(RADEON_CRTC_OFFSET + radeon_crtc->crtc_offset) & 
> RADEON_CRTC_OFFSET__GUI_TRIG_OFFSET)
> diff --git a/drivers/gpu/drm/radeon/rs600.c b/drivers/gpu/drm/radeon/rs600.c
> index b2d22e25eee1..b87dd551e939 100644
> --- a/drivers/gpu/drm/radeon/rs600.c
> +++ b/drivers/gpu/drm/radeon/rs600.c
> @@ -41,6 +41,7 @@
>
>  #include 
>  #include 
> +#include 
>
>  #include "atom.h"
>  #include "radeon.h"
> @@ -118,6 +119,7 @@ void avivo_wait_for_vblank(struct radeon_device *rdev, 
> int crtc)
>  void rs600_page_flip(struct radeon_device *rdev, int crtc_id, u64 crtc_base, 
> bool async)
>  {
> struct radeon_crtc *radeon_crtc = rdev->mode_info.crtcs[crtc_id];
> +   struct drm_framebuffer *fb = radeon_crtc->base.primary->fb;
> u32 tmp = RREG32(AVIVO_D1GRPH_UPDATE + radeon_crtc->crtc_offset);
> int i;
>
> @@ -125,9 +127,13 @@ void rs600_page_flip(struct radeon_device *rdev, int 
> crtc_id, u64 crtc_base, boo
> tmp |= AVIVO_D1GRPH_UPDATE_LOCK;
> WREG32(AVIVO_D1GRPH_UPDATE + radeon_crtc->crtc_offset, tmp);
>
> -   /* update the scanout addresses */
> +   /* flip at hsync for async, default is vsync */
> WREG32(AVIVO_D1GRPH_FLIP_CONTROL + radeon_crtc->crtc_offset,
>

[pull] amdgpu drm-fixes-5.14

2021-08-04 Thread Alex Deucher
Hi Dave, Daniel,

Fixes for 5.14.

The following changes since commit d28e2568ac26fff351c846bf74ba6ca5dded733e:

  Merge tag 'amd-drm-fixes-5.14-2021-07-28' of 
https://gitlab.freedesktop.org/agd5f/linux into drm-fixes (2021-07-29 17:20:29 
+1000)

are available in the Git repository at:

  https://gitlab.freedesktop.org/agd5f/linux.git 
tags/amd-drm-fixes-5.14-2021-08-04

for you to fetch changes up to 574fdb20f3e2b001eeddcaf4f16a5c8258243323:

  drm/amdgpu: add DID for beige goby (2021-08-03 16:59:16 -0400)


amd-drm-fixes-5.14-2021-08-04:

amdgpu:
- Fix potential out-of-bounds read when updating GPUVM mapping
- Renoir powergating fix
- Yellow Carp updates
- 8K fix for navi1x
- Beige Goby updates and new DIDs
- Fix DMUB firmware version output
- EDP fix
- pmops config fix


Bing Guo (2):
  drm/amd/display: Fix Dynamic bpp issue with 8K30 with Navi 1X
  drm/amd/display: Increase stutter watermark for dcn303

Chengming Gui (1):
  drm/amdgpu: add DID for beige goby

Jude Shih (1):
  drm/amd/display: Fix resetting DCN3.1 HW when resuming from S4

Qingqing Zhuo (1):
  drm/amd/display: workaround for hard hang on HPD on native DP

Randy Dunlap (1):
  drm/amdgpu: fix checking pmops when PM_SLEEP is not enabled

Shirish S (1):
  drm/amdgpu/display: fix DMUB firmware version info

Wesley Chalmers (1):
  drm/amd/display: Assume LTTPR interop for DCN31+

Xiaomeng Hou (1):
  drm/amd/pm: update yellow carp pmfw interface version

Yifan Zhang (1):
  drm/amdgpu: fix the doorbell missing when in CGPG issue for renoir.

xinhui pan (1):
  drm/amdgpu: Fix out-of-bounds read when update mapping

 drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c|  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c |  7 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h  |  3 ++-
 drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c   | 21 -
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c   |  2 +-
 .../drm/amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.c   |  4 +++-
 drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c| 21 ++---
 drivers/gpu/drm/amd/display/dc/dc.h |  2 ++
 drivers/gpu/drm/amd/display/dc/dcn20/dcn20_optc.c   |  2 +-
 .../gpu/drm/amd/display/dc/dcn30/dcn30_resource.c   | 20 
 .../gpu/drm/amd/display/dc/dcn303/dcn303_resource.c |  4 ++--
 .../gpu/drm/amd/display/dc/dcn31/dcn31_resource.c   | 16 
 drivers/gpu/drm/amd/display/dmub/src/dmub_dcn31.c   |  8 +---
 drivers/gpu/drm/amd/pm/inc/smu_v13_0.h  |  2 +-
 14 files changed, 83 insertions(+), 31 deletions(-)


[PATCH] drm/amd/amdgpu: skip locking delayed work if not initialized.

2021-08-04 Thread YuBiao Wang
When init failed in early init stage, amdgpu_object has
not been initialized, so hasn't the ttm delayed queue functions.

Signed-off-by: YuBiao Wang 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 9e53ff851496..4c33985542ed 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -3825,7 +3825,8 @@ void amdgpu_device_fini_hw(struct amdgpu_device *adev)
 {
dev_info(adev->dev, "amdgpu: finishing device.\n");
flush_delayed_work(>delayed_init_work);
-   ttm_bo_lock_delayed_workqueue(>mman.bdev);
+   if (adev->mman.initialized)
+   ttm_bo_lock_delayed_workqueue(>mman.bdev);
adev->shutdown = true;
 
/* make sure IB test finished before entering exclusive mode
-- 
2.25.1



Re: [PATCH v2 11/14] drm/tilcdc: Convert to Linux IRQ interfaces

2021-08-04 Thread Sam Ravnborg
Hi Thomas,
On Wed, Aug 04, 2021 at 08:30:41PM +0200, Thomas Zimmermann wrote:
> Hi
> 
> Am 03.08.21 um 17:00 schrieb Sam Ravnborg:
> > Hi Thomas,
> > 
> > On Tue, Aug 03, 2021 at 11:07:01AM +0200, Thomas Zimmermann wrote:
> > > Drop the DRM IRQ midlayer in favor of Linux IRQ interfaces. DRM's
> > > IRQ helpers are mostly useful for UMS drivers. Modern KMS drivers
> > > don't benefit from using it.
> > > 
> > > DRM IRQ callbacks are now being called directly or inlined.
> > > 
> > > Calls to platform_get_irq() can fail with a negative errno code.
> > > Abort initialization in this case. The DRM IRQ midlayer does not
> > > handle this case correctly.
> > 
> > I cannot see why the irq_enabled flag is needed here, and the changelog
> > do not help me.
> > What do I miss?
> 
> That's a good point. Usually, only the DRM IRQ helpers use irq_enabled in
> struct drm_device. It'll become legacy and we can ignore it for the driver
> conversion.
> 
> The exception is tilcdc, which also uses irq_enabled to make its error
> rollback work correctly. So I duplicated the state in the local private
> structure.
> 
> I'll add this explanation to the commit message.
With this added:
Acked-by: Sam Ravnborg 


Re: [PATCH v2 11/14] drm/tilcdc: Convert to Linux IRQ interfaces

2021-08-04 Thread Thomas Zimmermann

Hi

Am 03.08.21 um 17:00 schrieb Sam Ravnborg:

Hi Thomas,

On Tue, Aug 03, 2021 at 11:07:01AM +0200, Thomas Zimmermann wrote:

Drop the DRM IRQ midlayer in favor of Linux IRQ interfaces. DRM's
IRQ helpers are mostly useful for UMS drivers. Modern KMS drivers
don't benefit from using it.

DRM IRQ callbacks are now being called directly or inlined.

Calls to platform_get_irq() can fail with a negative errno code.
Abort initialization in this case. The DRM IRQ midlayer does not
handle this case correctly.


I cannot see why the irq_enabled flag is needed here, and the changelog
do not help me.
What do I miss?


That's a good point. Usually, only the DRM IRQ helpers use irq_enabled 
in struct drm_device. It'll become legacy and we can ignore it for the 
driver conversion.


The exception is tilcdc, which also uses irq_enabled to make its error 
rollback work correctly. So I duplicated the state in the local private 
structure.


I'll add this explanation to the commit message.

Best regards
Thomas



Sam




Signed-off-by: Thomas Zimmermann 
---
  drivers/gpu/drm/tilcdc/tilcdc_drv.c | 51 ++---
  drivers/gpu/drm/tilcdc/tilcdc_drv.h |  3 ++
  2 files changed, 43 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/tilcdc/tilcdc_drv.c 
b/drivers/gpu/drm/tilcdc/tilcdc_drv.c
index f1d3a9f919fd..6b03f89a98d4 100644
--- a/drivers/gpu/drm/tilcdc/tilcdc_drv.c
+++ b/drivers/gpu/drm/tilcdc/tilcdc_drv.c
@@ -20,7 +20,6 @@
  #include 
  #include 
  #include 
-#include 
  #include 
  #include 
  #include 
@@ -124,6 +123,39 @@ static int cpufreq_transition(struct notifier_block *nb,
  }
  #endif
  
+static irqreturn_t tilcdc_irq(int irq, void *arg)

+{
+   struct drm_device *dev = arg;
+   struct tilcdc_drm_private *priv = dev->dev_private;
+
+   return tilcdc_crtc_irq(priv->crtc);
+}
+
+static int tilcdc_irq_install(struct drm_device *dev, unsigned int irq)
+{
+   struct tilcdc_drm_private *priv = dev->dev_private;
+   int ret;
+
+   ret = request_irq(irq, tilcdc_irq, 0, dev->driver->name, dev);
+   if (ret)
+   return ret;
+
+   priv->irq_enabled = false;
+
+   return 0;
+}
+
+static void tilcdc_irq_uninstall(struct drm_device *dev)
+{
+   struct tilcdc_drm_private *priv = dev->dev_private;
+
+   if (!priv->irq_enabled)
+   return;
+
+   free_irq(priv->irq, dev);
+   priv->irq_enabled = false;
+}
+
  /*
   * DRM operations:
   */
@@ -145,7 +177,7 @@ static void tilcdc_fini(struct drm_device *dev)
drm_dev_unregister(dev);
  
  	drm_kms_helper_poll_fini(dev);

-   drm_irq_uninstall(dev);
+   tilcdc_irq_uninstall(dev);
drm_mode_config_cleanup(dev);
  
  	if (priv->clk)

@@ -336,7 +368,12 @@ static int tilcdc_init(const struct drm_driver *ddrv, 
struct device *dev)
goto init_failed;
}
  
-	ret = drm_irq_install(ddev, platform_get_irq(pdev, 0));

+   ret = platform_get_irq(pdev, 0);
+   if (ret < 0)
+   goto init_failed;
+   priv->irq = ret;
+
+   ret = tilcdc_irq_install(ddev, priv->irq);
if (ret < 0) {
dev_err(dev, "failed to install IRQ handler\n");
goto init_failed;
@@ -360,13 +397,6 @@ static int tilcdc_init(const struct drm_driver *ddrv, 
struct device *dev)
return ret;
  }
  
-static irqreturn_t tilcdc_irq(int irq, void *arg)

-{
-   struct drm_device *dev = arg;
-   struct tilcdc_drm_private *priv = dev->dev_private;
-   return tilcdc_crtc_irq(priv->crtc);
-}
-
  #if defined(CONFIG_DEBUG_FS)
  static const struct {
const char *name;
@@ -454,7 +484,6 @@ DEFINE_DRM_GEM_CMA_FOPS(fops);
  
  static const struct drm_driver tilcdc_driver = {

.driver_features= DRIVER_GEM | DRIVER_MODESET | DRIVER_ATOMIC,
-   .irq_handler= tilcdc_irq,
DRM_GEM_CMA_DRIVER_OPS,
  #ifdef CONFIG_DEBUG_FS
.debugfs_init   = tilcdc_debugfs_init,
diff --git a/drivers/gpu/drm/tilcdc/tilcdc_drv.h 
b/drivers/gpu/drm/tilcdc/tilcdc_drv.h
index d29806ca8817..b818448c83f6 100644
--- a/drivers/gpu/drm/tilcdc/tilcdc_drv.h
+++ b/drivers/gpu/drm/tilcdc/tilcdc_drv.h
@@ -46,6 +46,8 @@ struct tilcdc_drm_private {
struct clk *clk; /* functional clock */
int rev; /* IP revision */
  
+	unsigned int irq;

+
/* don't attempt resolutions w/ higher W * H * Hz: */
uint32_t max_bandwidth;
/*
@@ -82,6 +84,7 @@ struct tilcdc_drm_private {
  
  	bool is_registered;

bool is_componentized;
+   bool irq_enabled;
  };
  
  /* Sub-module for display.  Since we don't know at compile time what panels

--
2.32.0


--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Felix Imendörffer



OpenPGP_signature
Description: OpenPGP digital signature


[PATCH] drm/amdkfd: Expose GFXIP engine version to sysfs

2021-08-04 Thread Graham Sider
Add u32 gfx_version field to kfd_node_properties and kfd_device_info.
Populate _device_info structs accordingly and expose to sysfs.

This allows eliminating device-ID-based lookup tables in user mode for
future ASICs.

Signed-off-by: Graham Sider 
---
 drivers/gpu/drm/amd/amdkfd/kfd_device.c   | 29 +++
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h |  1 +
 drivers/gpu/drm/amd/amdkfd/kfd_topology.c |  3 +++
 drivers/gpu/drm/amd/amdkfd/kfd_topology.h |  1 +
 4 files changed, 34 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
index b551dd675085..16a57b70cc1a 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
@@ -91,6 +91,7 @@ static const struct kfd2kgd_calls *kfd2kgd_funcs[] = {
 static const struct kfd_device_info kaveri_device_info = {
.asic_family = CHIP_KAVERI,
.asic_name = "kaveri",
+   .gfx_target_version = 7,
.max_pasid_bits = 16,
/* max num of queues for KV.TODO should be a dynamic value */
.max_no_of_hqd  = 24,
@@ -110,6 +111,7 @@ static const struct kfd_device_info kaveri_device_info = {
 static const struct kfd_device_info carrizo_device_info = {
.asic_family = CHIP_CARRIZO,
.asic_name = "carrizo",
+   .gfx_target_version = 80001,
.max_pasid_bits = 16,
/* max num of queues for CZ.TODO should be a dynamic value */
.max_no_of_hqd  = 24,
@@ -130,6 +132,7 @@ static const struct kfd_device_info carrizo_device_info = {
 static const struct kfd_device_info raven_device_info = {
.asic_family = CHIP_RAVEN,
.asic_name = "raven",
+   .gfx_target_version = 90002,
.max_pasid_bits = 16,
.max_no_of_hqd  = 24,
.doorbell_size  = 8,
@@ -148,6 +151,7 @@ static const struct kfd_device_info raven_device_info = {
 static const struct kfd_device_info hawaii_device_info = {
.asic_family = CHIP_HAWAII,
.asic_name = "hawaii",
+   .gfx_target_version = 70001,
.max_pasid_bits = 16,
/* max num of queues for KV.TODO should be a dynamic value */
.max_no_of_hqd  = 24,
@@ -167,6 +171,7 @@ static const struct kfd_device_info hawaii_device_info = {
 static const struct kfd_device_info tonga_device_info = {
.asic_family = CHIP_TONGA,
.asic_name = "tonga",
+   .gfx_target_version = 80002,
.max_pasid_bits = 16,
.max_no_of_hqd  = 24,
.doorbell_size  = 4,
@@ -185,6 +190,7 @@ static const struct kfd_device_info tonga_device_info = {
 static const struct kfd_device_info fiji_device_info = {
.asic_family = CHIP_FIJI,
.asic_name = "fiji",
+   .gfx_target_version = 80003,
.max_pasid_bits = 16,
.max_no_of_hqd  = 24,
.doorbell_size  = 4,
@@ -203,6 +209,7 @@ static const struct kfd_device_info fiji_device_info = {
 static const struct kfd_device_info fiji_vf_device_info = {
.asic_family = CHIP_FIJI,
.asic_name = "fiji",
+   .gfx_target_version = 80003,
.max_pasid_bits = 16,
.max_no_of_hqd  = 24,
.doorbell_size  = 4,
@@ -222,6 +229,7 @@ static const struct kfd_device_info fiji_vf_device_info = {
 static const struct kfd_device_info polaris10_device_info = {
.asic_family = CHIP_POLARIS10,
.asic_name = "polaris10",
+   .gfx_target_version = 80003,
.max_pasid_bits = 16,
.max_no_of_hqd  = 24,
.doorbell_size  = 4,
@@ -240,6 +248,7 @@ static const struct kfd_device_info polaris10_device_info = 
{
 static const struct kfd_device_info polaris10_vf_device_info = {
.asic_family = CHIP_POLARIS10,
.asic_name = "polaris10",
+   .gfx_target_version = 80003,
.max_pasid_bits = 16,
.max_no_of_hqd  = 24,
.doorbell_size  = 4,
@@ -258,6 +267,7 @@ static const struct kfd_device_info 
polaris10_vf_device_info = {
 static const struct kfd_device_info polaris11_device_info = {
.asic_family = CHIP_POLARIS11,
.asic_name = "polaris11",
+   .gfx_target_version = 80003,
.max_pasid_bits = 16,
.max_no_of_hqd  = 24,
.doorbell_size  = 4,
@@ -276,6 +286,7 @@ static const struct kfd_device_info polaris11_device_info = 
{
 static const struct kfd_device_info polaris12_device_info = {
.asic_family = CHIP_POLARIS12,
.asic_name = "polaris12",
+   .gfx_target_version = 80003,
.max_pasid_bits = 16,
.max_no_of_hqd  = 24,
.doorbell_size  = 4,
@@ -294,6 +305,7 @@ static const struct kfd_device_info polaris12_device_info = 
{
 static const struct kfd_device_info vegam_device_info = {
.asic_family = CHIP_VEGAM,
.asic_name = "vegam",
+   .gfx_target_version = 80003,
.max_pasid_bits = 16,
.max_no_of_hqd  = 24,
.doorbell_size  = 4,
@@ -312,6 +324,7 @@ static const struct kfd_device_info vegam_device_info = {
 static const struct kfd_device_info 

[PATCH] drm/vc4: hdmi: Fix build break caused by merge issue

2021-08-04 Thread Anson Jacob
Commit 6800234ceee0 ("drm/vc4: hdmi: Convert to gpiod")
was reverted partially by
Commit 0600a948942d ("Merge tag 'v5.13' into amd-staging-drm-next")
which caused build break when compiling 'make allmodconfig'

Original Patch: 
https://patchwork.freedesktop.org/patch/msgid/20210524131852.263883-2-max...@cerno.tech
Signed-off-by: Anson Jacob 
---
 drivers/gpu/drm/vc4/vc4_hdmi.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.c b/drivers/gpu/drm/vc4/vc4_hdmi.c
index df65e0b6449b..aab1b36ceb3c 100644
--- a/drivers/gpu/drm/vc4/vc4_hdmi.c
+++ b/drivers/gpu/drm/vc4/vc4_hdmi.c
@@ -168,10 +168,9 @@ vc4_hdmi_connector_detect(struct drm_connector *connector, 
bool force)
 
WARN_ON(pm_runtime_resume_and_get(_hdmi->pdev->dev));
 
-   if (vc4_hdmi->hpd_gpio) {
-   if (gpio_get_value_cansleep(vc4_hdmi->hpd_gpio) ^
-   vc4_hdmi->hpd_active_low)
-   connected = true;
+   if (vc4_hdmi->hpd_gpio &&
+   gpiod_get_value_cansleep(vc4_hdmi->hpd_gpio)) {
+   connected = true;
} else if (drm_probe_ddc(vc4_hdmi->ddc)) {
connected = true;
} else if (HDMI_READ(HDMI_HOTPLUG) & VC4_HDMI_HOTPLUG_CONNECTED) {
-- 
2.25.1



Re: [PATCH] drm/amdgpu: Fix vm free pts race when process exiting

2021-08-04 Thread Felix Kuehling
Am 2021-08-04 um 5:01 a.m. schrieb Christian König:
> Am 03.08.21 um 00:33 schrieb Philip Yang:
>> Take vm->invalidated_lock spinlock to remove vm pd and pt bos, to avoid
>> link list corruption with crash backtrace:
>>
>> [ 2290.505111] list_del corruption. next->prev should be
>>   9b2514ec0018, but was 4e03280211010f04
>> [ 2290.505154] kernel BUG at lib/list_debug.c:56!
>> [ 2290.505176] invalid opcode:  [#1] SMP NOPTI
>> [ 2290.505254] Workqueue: events delayed_fput
>> [ 2290.505271] RIP: 0010:__list_del_entry_valid.cold.1+0x20/0x4c
>> [ 2290.505513] Call Trace:
>> [ 2290.505623]  amdgpu_vm_free_table+0x26/0x80 [amdgpu]
>> [ 2290.505705]  amdgpu_vm_free_pts+0x7a/0xf0 [amdgpu]
>> [ 2290.505786]  amdgpu_vm_fini+0x1f0/0x440 [amdgpu]
>> [ 2290.505864]  amdgpu_driver_postclose_kms+0x172/0x290 [amdgpu]
>> [ 2290.505893]  drm_file_free.part.10+0x1d4/0x270 [drm]
>> [ 2290.505916]  drm_release+0xa9/0xe0 [drm]
>> [ 2290.505930]  __fput+0xb7/0x230
>> [ 2290.505942]  delayed_fput+0x1c/0x30
>> [ 2290.505957]  process_one_work+0x1a7/0x360
>> [ 2290.505971]  worker_thread+0x30/0x390
>> [ 2290.505985]  ? create_worker+0x1a0/0x1a0
>> [ 2290.505999]  kthread+0x112/0x130
>> [ 2290.506011]  ? kthread_flush_work_fn+0x10/0x10
>> [ 2290.506027]  ret_from_fork+0x22/0x40
>
> Wow, well this is a big NAK.
>
> The page tables should never ever be on the invalidation list or
> otherwise we would try to point PTEs to them which is a huge security
> issue.

entry->vm_status is used on other lists as well, and I think page tables
can be added to those when they are evicted:

static void amdgpu_vm_bo_evicted(struct amdgpu_vm_bo_base *vm_bo)
{
    struct amdgpu_vm *vm = vm_bo->vm;
    struct amdgpu_bo *bo = vm_bo->bo;

    vm_bo->moved = true;
    if (bo->tbo.type == ttm_bo_type_kernel)
    list_move(_bo->vm_status, >evicted);
    else
    list_move_tail(_bo->vm_status, >evicted);
}

But that never seems to involve the invalidated_lock. Maybe there is
some other lock we should be holding somewhere (not necessarily in
amdgpu_vm_free_table) to avoid this list corruption.

Regards,
  Felix


>
> Taking the lock just workaround that. Can you investigate how it
> happens that a page table ends up on that list?
>
> Thanks in advance,
> Christian.
>
>>
>> Signed-off-by: Philip Yang 
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 2 ++
>>   1 file changed, 2 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> index 2a88ed5d983b..5c4c355e7d6b 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> @@ -1045,7 +1045,9 @@ static void amdgpu_vm_free_table(struct
>> amdgpu_vm_bo_base *entry)
>>   return;
>>   shadow = amdgpu_bo_shadowed(entry->bo);
>>   entry->bo->vm_bo = NULL;
>> +    spin_lock(>vm->invalidated_lock);
>>   list_del(>vm_status);
>> +    spin_unlock(>vm->invalidated_lock);
>>   amdgpu_bo_unref();
>>   amdgpu_bo_unref(>bo);
>>   }
>


Re: [PATCH] drm/radeon: Update pitch for page flip

2021-08-04 Thread Daniel Vetter
On Tue, Aug 03, 2021 at 10:49:39AM -0400, Alex Deucher wrote:
> On Tue, Aug 3, 2021 at 4:34 AM Michel Dänzer  wrote:
> >
> > On 2021-08-02 4:51 p.m., Alex Deucher wrote:
> > > On Mon, Aug 2, 2021 at 4:31 AM Daniel Vetter  wrote:
> > >>
> > >> On Mon, Aug 02, 2021 at 10:12:47AM +0200, Christian König wrote:
> > >>> Am 02.08.21 um 09:43 schrieb Zhenneng Li:
> >  When primary bo is updated, crtc's pitch may
> >  have not been updated, this will lead to show
> >  disorder content when user changes display mode,
> >  we update crtc's pitch in page flip to avoid
> >  this bug.
> >  This refers to amdgpu's pageflip.
> > >>>
> > >>> Alex is the expert to ask about that code, but I'm not sure if that is
> > >>> really correct for the old hardware.
> > >>>
> > >>> As far as I know the crtc's pitch should not change during a page flip, 
> > >>> but
> > >>> only during a full mode set.
> > >>>
> > >>> So could you elaborate a bit more how you trigger this?
> > >>
> > >> legacy page_flip ioctl only verifies that the fb->format stays the same.
> > >> It doesn't check anything else (afair never has), this is all up to
> > >> drivers to verify.
> > >>
> > >> Personally I'd say add a check to reject this, since testing this and
> > >> making sure it really works everywhere is probably a bit much on this old
> > >> hw.
> > >
> > > If just the pitch changed, that probably wouldn't be much of a
> > > problem, but if the pitch is changing, that probably implies other
> > > stuff has changed as well and we'll just be chasing changes.  I agree
> > > it would be best to just reject anything other than updating the
> > > scanout address.
> >
> > FWIW, that means page flipping cannot be used in some cases which work fine 
> > by changing the pitch, which can result in tearing: 
> > https://gitlab.freedesktop.org/xorg/xserver/-/issues/839 (which says the 
> > i915 driver handles this as well).
> >
> 
> Ok.  In that case, @Zhenneng can you update all of the pitch in all of
> the page_flip functions in radeon rather than just the evergreen one?

atomic drivers handle this fairly ok-ish, since we have a proper
atomic_check there.

For legacy kms I just wouldn't bother, too many corners to validate. But
also if you're bored, just do it :-)
-Daniel

> 
> Thanks,
> 
> Alex
> 
> 
> >
> > --
> > Earthling Michel Dänzer   |   https://redhat.com
> > Libre software enthusiast | Mesa and X developer

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH] drm/amdgpu: Fix vm free pts race when process exiting

2021-08-04 Thread Christian König

Am 04.08.21 um 17:11 schrieb philip yang:


On 2021-08-04 5:01 a.m., Christian König wrote:


Am 03.08.21 um 00:33 schrieb Philip Yang:

Take vm->invalidated_lock spinlock to remove vm pd and pt bos, to avoid
link list corruption with crash backtrace:

[ 2290.505111] list_del corruption. next->prev should be
  9b2514ec0018, but was 4e03280211010f04
[ 2290.505154] kernel BUG at lib/list_debug.c:56!
[ 2290.505176] invalid opcode:  [#1] SMP NOPTI
[ 2290.505254] Workqueue: events delayed_fput
[ 2290.505271] RIP: 0010:__list_del_entry_valid.cold.1+0x20/0x4c
[ 2290.505513] Call Trace:
[ 2290.505623]  amdgpu_vm_free_table+0x26/0x80 [amdgpu]
[ 2290.505705]  amdgpu_vm_free_pts+0x7a/0xf0 [amdgpu]
[ 2290.505786]  amdgpu_vm_fini+0x1f0/0x440 [amdgpu]
[ 2290.505864]  amdgpu_driver_postclose_kms+0x172/0x290 [amdgpu]
[ 2290.505893]  drm_file_free.part.10+0x1d4/0x270 [drm]
[ 2290.505916]  drm_release+0xa9/0xe0 [drm]
[ 2290.505930]  __fput+0xb7/0x230
[ 2290.505942]  delayed_fput+0x1c/0x30
[ 2290.505957]  process_one_work+0x1a7/0x360
[ 2290.505971]  worker_thread+0x30/0x390
[ 2290.505985]  ? create_worker+0x1a0/0x1a0
[ 2290.505999]  kthread+0x112/0x130
[ 2290.506011]  ? kthread_flush_work_fn+0x10/0x10
[ 2290.506027]  ret_from_fork+0x22/0x40


Wow, well this is a big NAK.

The page tables should never ever be on the invalidation list or 
otherwise we would try to point PTEs to them which is a huge security 
issue.


Taking the lock just workaround that. Can you investigate how it 
happens that a page table ends up on that list?


I will be off 3 days, I can investigate this further next Monday. From 
debug, amdgpu_vm_bo_invalidate is called many times. The background: 
app has 8 processes, running on 8 GPUs, 64 VMs, VRAM usage is around 
85% from rocm-info, 1/5 chance this happens (kernel BUG, server die) 
after application segmentation fault crash (another app issue). It is 
new issue on rocm release 4.3, never happened before on rocm 4.1 (no 
app crash on 4.1 either). kernel version is 4.18 on CentOS.




It could be a bug in the VM code. I'm on vacation till 16th of August, 
but I will try to take a look when I have time.


Thanks for the notice,
Christian.


Regards,

Philip



Thanks in advance,
Christian.



Signed-off-by: Philip Yang 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 2 ++
  1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c

index 2a88ed5d983b..5c4c355e7d6b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -1045,7 +1045,9 @@ static void amdgpu_vm_free_table(struct 
amdgpu_vm_bo_base *entry)

  return;
  shadow = amdgpu_bo_shadowed(entry->bo);
  entry->bo->vm_bo = NULL;
+    spin_lock(>vm->invalidated_lock);
  list_del(>vm_status);
+    spin_unlock(>vm->invalidated_lock);
  amdgpu_bo_unref();
  amdgpu_bo_unref(>bo);
  }






Re: [PATCH] drm/amdgpu: Fix vm free pts race when process exiting

2021-08-04 Thread philip yang

  


On 2021-08-04 5:01 a.m., Christian
  König wrote:

Am
  03.08.21 um 00:33 schrieb Philip Yang:
  
  Take vm->invalidated_lock spinlock to
remove vm pd and pt bos, to avoid

link list corruption with crash backtrace:


[ 2290.505111] list_del corruption. next->prev should be

  9b2514ec0018, but was 4e03280211010f04

[ 2290.505154] kernel BUG at lib/list_debug.c:56!

[ 2290.505176] invalid opcode:  [#1] SMP NOPTI

[ 2290.505254] Workqueue: events delayed_fput

[ 2290.505271] RIP: 0010:__list_del_entry_valid.cold.1+0x20/0x4c

[ 2290.505513] Call Trace:

[ 2290.505623]  amdgpu_vm_free_table+0x26/0x80 [amdgpu]

[ 2290.505705]  amdgpu_vm_free_pts+0x7a/0xf0 [amdgpu]

[ 2290.505786]  amdgpu_vm_fini+0x1f0/0x440 [amdgpu]

[ 2290.505864]  amdgpu_driver_postclose_kms+0x172/0x290 [amdgpu]

[ 2290.505893]  drm_file_free.part.10+0x1d4/0x270 [drm]

[ 2290.505916]  drm_release+0xa9/0xe0 [drm]

[ 2290.505930]  __fput+0xb7/0x230

[ 2290.505942]  delayed_fput+0x1c/0x30

[ 2290.505957]  process_one_work+0x1a7/0x360

[ 2290.505971]  worker_thread+0x30/0x390

[ 2290.505985]  ? create_worker+0x1a0/0x1a0

[ 2290.505999]  kthread+0x112/0x130

[ 2290.506011]  ? kthread_flush_work_fn+0x10/0x10

[ 2290.506027]  ret_from_fork+0x22/0x40

  
  
  Wow, well this is a big NAK.
  
  
  The page tables should never ever be on the invalidation list or
  otherwise we would try to point PTEs to them which is a huge
  security issue.
  
  
  Taking the lock just workaround that. Can you investigate how it
  happens that a page table ends up on that list?
  

I will be off 3 days, I can investigate this further next Monday.
  From debug, amdgpu_vm_bo_invalidate is called many times. The
  background: app has 8 processes, running on 8 GPUs, 64 VMs, VRAM
  usage is around 85% from rocm-info, 1/5 chance this happens
  (kernel BUG, server die) after application segmentation fault
  crash (another app issue). It is new issue on rocm release 4.3,
  never happened before on rocm 4.1 (no app crash on 4.1 either).
  kernel version is 4.18 on CentOS.

Regards,
Philip


  
  Thanks in advance,
  
  Christian.
  
  
  

Signed-off-by: Philip Yang 

---

  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 2 ++

  1 file changed, 2 insertions(+)


diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c

index 2a88ed5d983b..5c4c355e7d6b 100644

--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c

+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c

@@ -1045,7 +1045,9 @@ static void amdgpu_vm_free_table(struct
amdgpu_vm_bo_base *entry)

  return;

  shadow = amdgpu_bo_shadowed(entry->bo);

  entry->bo->vm_bo = NULL;

+    spin_lock(>vm->invalidated_lock);

  list_del(>vm_status);

+    spin_unlock(>vm->invalidated_lock);

  amdgpu_bo_unref();

  amdgpu_bo_unref(>bo);

  }

  
  

  



Re: [PATCH v2] drm/amdgpu: fix fdinfo race with process exit

2021-08-04 Thread philip yang

  


On 2021-08-04 5:04 a.m., Christian
  König wrote:

Sorry
  I'm on vacation and can't reply immediately.
  
  
  This is the wrong approach. The fdinfo should have grabbed a
  reference to the fd it prints the info for.
  
  
  So we should never race here. Can you double check how this
  happens?
  

This backtrace happened once, from
  /var/crash/..$date../vmcode-dmesg.log on the server machine, I can
  not repro the issue, grep app folder, there are python scripts
  accessing /proc/pid/node_id/fdinfo. This happened after app crash
  segmentation fault killed.

fdinfo grab fpriv reference, but not fpriv->vm.root.bo
  reference, I think this is needed, otherwise
  amdgpu_bo_reserve(fpriv->vm.root.bo) may deference NULL
  pointer.
Regards,
Philip

Thanks,
  
  Christian.
  
  
  Am 03.08.21 um 16:06 schrieb philip yang:
  
  

ping?


On 2021-07-29 10:13 p.m., Philip Yang wrote:

Get process vm root BO ref in case
  process is exiting and root BO is
  
  freed, to avoid NULL pointer dereference backtrace:
  
  
  BUG: unable to handle kernel NULL pointer dereference at
  
  
  
  Call Trace:
  
  amdgpu_show_fdinfo+0xfe/0x2a0 [amdgpu]
  
  seq_show+0x12c/0x180
  
  seq_read+0x153/0x410
  
  vfs_read+0x91/0x140[ 3427.206183]  ksys_read+0x4f/0xb0
  
  do_syscall_64+0x5b/0x1a0
  
  entry_SYSCALL_64_after_hwframe+0x65/0xca
  
  
  v2: rebase to staging
  
  
  Signed-off-by: Philip Yang
  
  ---
  
    drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c | 11 +--
  
    1 file changed, 9 insertions(+), 2 deletions(-)
  
  
  diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c
  b/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c
  
  index d94c5419ec25..5a6857c44bb6 100644
  
  --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c
  
  +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c
  
  @@ -59,6 +59,7 @@ void amdgpu_show_fdinfo(struct seq_file *m,
  struct file *f)
  
    uint64_t vram_mem = 0, gtt_mem = 0, cpu_mem = 0;
  
    struct drm_file *file = f->private_data;
  
    struct amdgpu_device *adev =
  drm_to_adev(file->minor->dev);
  
  +    struct amdgpu_bo *root;
  
    int ret;
  
      ret = amdgpu_file_to_fpriv(f, );
  
  @@ -69,13 +70,19 @@ void amdgpu_show_fdinfo(struct seq_file
  *m, struct file *f)
  
    dev = PCI_SLOT(adev->pdev->devfn);
  
    fn = PCI_FUNC(adev->pdev->devfn);
  
    -    ret = amdgpu_bo_reserve(fpriv->vm.root.bo, false);
  
  +    root = amdgpu_bo_ref(fpriv->vm.root.bo);
  
  +    if (!root)
  
  +    return;
  
  +
  
  +    ret = amdgpu_bo_reserve(root, false);
  
    if (ret) {
  
    DRM_ERROR("Fail to reserve bo\n");
  
    return;
  
    }
  
    amdgpu_vm_get_memory(>vm, _mem,
  _mem, _mem);
  
  -    amdgpu_bo_unreserve(fpriv->vm.root.bo);
  
  +    amdgpu_bo_unreserve(root);
  
  +    amdgpu_bo_unref();
  
  +
  
    seq_printf(m, "pdev:\t%04x:%02x:%02x.%d\npasid:\t%u\n",
  domain, bus,
  
    dev, fn, fpriv->vm.pasid);
  
    seq_printf(m, "vram mem:\t%llu kB\n", vram_mem/1024UL);
  

  
  

  



RE: [PATCH] drm/amdgpu: set RAS EEPROM address from VBIOS

2021-08-04 Thread Chen, Guchun
[Public]

+/*
+ * Helper function to query RAS EEPROM address
+ *
+ * @adev: amdgpu_device pointer
+ *
+ * Return true if vbios supports ras rom address reporting

As you have documented the first argument of function 
amdgpu_atomfirmware_ras_rom_addr, the other one "uint8_t* i2c_address" should 
be documented as well.

BTW, if this patch fixes https://gitlab.freedesktop.org/drm/amd/-/issues/1670?

Regards,
Guchun

From: amd-gfx  On Behalf Of Clements, 
John
Sent: Wednesday, August 4, 2021 5:14 PM
To: Clements, John ; amd-gfx@lists.freedesktop.org
Cc: Zhang, Hawking 
Subject: RE: [PATCH] drm/amdgpu: set RAS EEPROM address from VBIOS


[AMD Official Use Only]

Updated patch with reviewed changes

From: amd-gfx 
mailto:amd-gfx-boun...@lists.freedesktop.org>>
 On Behalf Of Clements, John
Sent: Wednesday, August 4, 2021 4:48 PM
To: amd-gfx@lists.freedesktop.org
Cc: Zhang, Hawking mailto:hawking.zh...@amd.com>>
Subject: [PATCH] drm/amdgpu: set RAS EEPROM address from VBIOS


[AMD Official Use Only]

Submitting patch to get RAS EEPROM I2C address from VBIOS FW info table.

Thank you,
John Clements
<>

RE: [PATCH] drm/amdgpu: set RAS EEPROM address from VBIOS

2021-08-04 Thread Zhang, Hawking
[Public]

The patch is

Reviewed-by: Hawking Zhang 

Regards,
Hawking
From: Clements, John 
Sent: Wednesday, August 4, 2021 17:14
To: Clements, John ; amd-gfx@lists.freedesktop.org
Cc: Zhang, Hawking 
Subject: RE: [PATCH] drm/amdgpu: set RAS EEPROM address from VBIOS


[AMD Official Use Only]

Updated patch with reviewed changes

From: amd-gfx 
mailto:amd-gfx-boun...@lists.freedesktop.org>>
 On Behalf Of Clements, John
Sent: Wednesday, August 4, 2021 4:48 PM
To: amd-gfx@lists.freedesktop.org
Cc: Zhang, Hawking mailto:hawking.zh...@amd.com>>
Subject: [PATCH] drm/amdgpu: set RAS EEPROM address from VBIOS


[AMD Official Use Only]

Submitting patch to get RAS EEPROM I2C address from VBIOS FW info table.

Thank you,
John Clements


RE: [PATCH] drm/amdgpu: set RAS EEPROM address from VBIOS

2021-08-04 Thread Clements, John
[AMD Official Use Only]

Updated patch with reviewed changes

From: amd-gfx  On Behalf Of Clements, 
John
Sent: Wednesday, August 4, 2021 4:48 PM
To: amd-gfx@lists.freedesktop.org
Cc: Zhang, Hawking 
Subject: [PATCH] drm/amdgpu: set RAS EEPROM address from VBIOS


[AMD Official Use Only]

Submitting patch to get RAS EEPROM I2C address from VBIOS FW info table.

Thank you,
John Clements


0001-drm-amdgpu-set-RAS-EEPROM-address-from-VBIOS.patch
Description: 0001-drm-amdgpu-set-RAS-EEPROM-address-from-VBIOS.patch


Re: [PATCH] drm/amdgpu: Add driver version

2021-08-04 Thread Christian König

Why did you then send it out?

Thanks,
Christian.

Am 04.08.21 um 09:05 schrieb Zhou, Peng Ju:

[AMD Official Use Only]

Merge error, Too big to drop


-Original Message-
From: Peng Ju Zhou 
Sent: Wednesday, August 4, 2021 10:15 AM
To: amd-gfx@lists.freedesktop.org
Cc: Nieto, David M ; Zhou, Peng Ju

Subject: [PATCH] drm/amdgpu: Add driver version

From: David M Nieto 

This sysfs is only defined in DKMS drivers
it exposes the internal AMDGPU version

Signed-off-by: David M Nieto 
Signed-off-by: Peng Ju Zhou 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2659 +++-
  1 file changed, 941 insertions(+), 1718 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 9e53ff851496..d93d1c966bad 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -29,6 +29,7 @@
  #include 
  #include 
  #include 
+#include 
  #include 

  #include 
@@ -65,13 +66,9 @@
  #include "amdgpu_ras.h"
  #include "amdgpu_pmu.h"
  #include "amdgpu_fru_eeprom.h"
-#include "amdgpu_reset.h"

  #include 
  #include 
-#include 
-
-#include 

  MODULE_FIRMWARE("amdgpu/vega10_gpu_info.bin");
  MODULE_FIRMWARE("amdgpu/vega12_gpu_info.bin");
@@ -83,8 +80,6 @@ MODULE_FIRMWARE("amdgpu/renoir_gpu_info.bin");
  MODULE_FIRMWARE("amdgpu/navi10_gpu_info.bin");
  MODULE_FIRMWARE("amdgpu/navi14_gpu_info.bin");
  MODULE_FIRMWARE("amdgpu/navi12_gpu_info.bin");
-MODULE_FIRMWARE("amdgpu/vangogh_gpu_info.bin");
-MODULE_FIRMWARE("amdgpu/yellow_carp_gpu_info.bin");

  #define AMDGPU_RESUME_MS  2000

@@ -114,17 +109,9 @@ const char *amdgpu_asic_name[] = {
"RAVEN",
"ARCTURUS",
"RENOIR",
-   "ALDEBARAN",
"NAVI10",
-   "CYAN_SKILLFISH",
"NAVI14",
"NAVI12",
-   "SIENNA_CICHLID",
-   "NAVY_FLOUNDER",
-   "VANGOGH",
-   "DIMGREY_CAVEFISH",
-   "BEIGE_GOBY",
-   "YELLOW_CARP",
"LAST",
  };

@@ -141,10 +128,10 @@ static ssize_t
amdgpu_device_get_pcie_replay_count(struct device *dev,
struct device_attribute *attr, char *buf)
  {
struct drm_device *ddev = dev_get_drvdata(dev);
-   struct amdgpu_device *adev = drm_to_adev(ddev);
+   struct amdgpu_device *adev = ddev->dev_private;
uint64_t cnt = amdgpu_asic_get_pcie_replay_count(adev);

-   return sysfs_emit(buf, "%llu\n", cnt);
+   return snprintf(buf, PAGE_SIZE, "%llu\n", cnt);
  }

  static DEVICE_ATTR(pcie_replay_count, S_IRUGO,
@@ -166,9 +153,9 @@ static ssize_t
amdgpu_device_get_product_name(struct device *dev,
struct device_attribute *attr, char *buf)
  {
struct drm_device *ddev = dev_get_drvdata(dev);
-   struct amdgpu_device *adev = drm_to_adev(ddev);
+   struct amdgpu_device *adev = ddev->dev_private;

-   return sysfs_emit(buf, "%s\n", adev->product_name);
+   return snprintf(buf, PAGE_SIZE, "%s\n", adev->product_name);
  }

  static DEVICE_ATTR(product_name, S_IRUGO,
@@ -188,9 +175,9 @@ static ssize_t
amdgpu_device_get_product_number(struct device *dev,
struct device_attribute *attr, char *buf)
  {
struct drm_device *ddev = dev_get_drvdata(dev);
-   struct amdgpu_device *adev = drm_to_adev(ddev);
+   struct amdgpu_device *adev = ddev->dev_private;

-   return sysfs_emit(buf, "%s\n", adev->product_number);
+   return snprintf(buf, PAGE_SIZE, "%s\n", adev->product_number);
  }

  static DEVICE_ATTR(product_number, S_IRUGO,
@@ -210,45 +197,27 @@ static ssize_t
amdgpu_device_get_serial_number(struct device *dev,
struct device_attribute *attr, char *buf)
  {
struct drm_device *ddev = dev_get_drvdata(dev);
-   struct amdgpu_device *adev = drm_to_adev(ddev);
+   struct amdgpu_device *adev = ddev->dev_private;

-   return sysfs_emit(buf, "%s\n", adev->serial);
+   return snprintf(buf, PAGE_SIZE, "%s\n", adev->serial);
  }

  static DEVICE_ATTR(serial_number, S_IRUGO,
amdgpu_device_get_serial_number, NULL);

  /**
- * amdgpu_device_supports_px - Is the device a dGPU with ATPX power
control
- *
- * @dev: drm_device pointer
- *
- * Returns true if the device is a dGPU with ATPX power control,
- * otherwise return false.
- */
-bool amdgpu_device_supports_px(struct drm_device *dev)
-{
-   struct amdgpu_device *adev = drm_to_adev(dev);
-
-   if ((adev->flags & AMD_IS_PX) && !amdgpu_is_atpx_hybrid())
-   return true;
-   return false;
-}
-
-/**
- * amdgpu_device_supports_boco - Is the device a dGPU with ACPI power
resources
+ * amdgpu_device_supports_boco - Is the device a dGPU with HG/PX power
control
   *
   * @dev: drm_device pointer
   *
- * Returns true if the device is a dGPU with ACPI power control,
+ * Returns true if the device is a dGPU with HG/PX power control,
   * otherwise return false.
   */
  bool amdgpu_device_supports_boco(struct drm_device *dev)
  {
-   

Re: [PATCH v2] drm/amdgpu: fix fdinfo race with process exit

2021-08-04 Thread Christian König

Sorry I'm on vacation and can't reply immediately.

This is the wrong approach. The fdinfo should have grabbed a reference 
to the fd it prints the info for.


So we should never race here. Can you double check how this happens?

Thanks,
Christian.

Am 03.08.21 um 16:06 schrieb philip yang:


ping?

On 2021-07-29 10:13 p.m., Philip Yang wrote:

Get process vm root BO ref in case process is exiting and root BO is
freed, to avoid NULL pointer dereference backtrace:

BUG: unable to handle kernel NULL pointer dereference at

Call Trace:
amdgpu_show_fdinfo+0xfe/0x2a0 [amdgpu]
seq_show+0x12c/0x180
seq_read+0x153/0x410
vfs_read+0x91/0x140[ 3427.206183]  ksys_read+0x4f/0xb0
do_syscall_64+0x5b/0x1a0
entry_SYSCALL_64_after_hwframe+0x65/0xca

v2: rebase to staging

Signed-off-by: Philip Yang
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c | 11 +--
  1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c
index d94c5419ec25..5a6857c44bb6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c
@@ -59,6 +59,7 @@ void amdgpu_show_fdinfo(struct seq_file *m, struct file *f)
uint64_t vram_mem = 0, gtt_mem = 0, cpu_mem = 0;
struct drm_file *file = f->private_data;
struct amdgpu_device *adev = drm_to_adev(file->minor->dev);
+   struct amdgpu_bo *root;
int ret;
  
  	ret = amdgpu_file_to_fpriv(f, );

@@ -69,13 +70,19 @@ void amdgpu_show_fdinfo(struct seq_file *m, struct file *f)
dev = PCI_SLOT(adev->pdev->devfn);
fn = PCI_FUNC(adev->pdev->devfn);
  
-	ret = amdgpu_bo_reserve(fpriv->vm.root.bo, false);

+   root = amdgpu_bo_ref(fpriv->vm.root.bo);
+   if (!root)
+   return;
+
+   ret = amdgpu_bo_reserve(root, false);
if (ret) {
DRM_ERROR("Fail to reserve bo\n");
return;
}
amdgpu_vm_get_memory(>vm, _mem, _mem, _mem);
-   amdgpu_bo_unreserve(fpriv->vm.root.bo);
+   amdgpu_bo_unreserve(root);
+   amdgpu_bo_unref();
+
seq_printf(m, "pdev:\t%04x:%02x:%02x.%d\npasid:\t%u\n", domain, bus,
dev, fn, fpriv->vm.pasid);
seq_printf(m, "vram mem:\t%llu kB\n", vram_mem/1024UL);




Re: [PATCH] drm/amdgpu: Fix vm free pts race when process exiting

2021-08-04 Thread Christian König

Am 03.08.21 um 00:33 schrieb Philip Yang:

Take vm->invalidated_lock spinlock to remove vm pd and pt bos, to avoid
link list corruption with crash backtrace:

[ 2290.505111] list_del corruption. next->prev should be
  9b2514ec0018, but was 4e03280211010f04
[ 2290.505154] kernel BUG at lib/list_debug.c:56!
[ 2290.505176] invalid opcode:  [#1] SMP NOPTI
[ 2290.505254] Workqueue: events delayed_fput
[ 2290.505271] RIP: 0010:__list_del_entry_valid.cold.1+0x20/0x4c
[ 2290.505513] Call Trace:
[ 2290.505623]  amdgpu_vm_free_table+0x26/0x80 [amdgpu]
[ 2290.505705]  amdgpu_vm_free_pts+0x7a/0xf0 [amdgpu]
[ 2290.505786]  amdgpu_vm_fini+0x1f0/0x440 [amdgpu]
[ 2290.505864]  amdgpu_driver_postclose_kms+0x172/0x290 [amdgpu]
[ 2290.505893]  drm_file_free.part.10+0x1d4/0x270 [drm]
[ 2290.505916]  drm_release+0xa9/0xe0 [drm]
[ 2290.505930]  __fput+0xb7/0x230
[ 2290.505942]  delayed_fput+0x1c/0x30
[ 2290.505957]  process_one_work+0x1a7/0x360
[ 2290.505971]  worker_thread+0x30/0x390
[ 2290.505985]  ? create_worker+0x1a0/0x1a0
[ 2290.505999]  kthread+0x112/0x130
[ 2290.506011]  ? kthread_flush_work_fn+0x10/0x10
[ 2290.506027]  ret_from_fork+0x22/0x40


Wow, well this is a big NAK.

The page tables should never ever be on the invalidation list or 
otherwise we would try to point PTEs to them which is a huge security issue.


Taking the lock just workaround that. Can you investigate how it happens 
that a page table ends up on that list?


Thanks in advance,
Christian.



Signed-off-by: Philip Yang 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 2 ++
  1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 2a88ed5d983b..5c4c355e7d6b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -1045,7 +1045,9 @@ static void amdgpu_vm_free_table(struct amdgpu_vm_bo_base 
*entry)
return;
shadow = amdgpu_bo_shadowed(entry->bo);
entry->bo->vm_bo = NULL;
+   spin_lock(>vm->invalidated_lock);
list_del(>vm_status);
+   spin_unlock(>vm->invalidated_lock);
amdgpu_bo_unref();
amdgpu_bo_unref(>bo);
  }




RE: [PATCH] drm/amdgpu: set RAS EEPROM address from VBIOS

2021-08-04 Thread Zhang, Hawking
[AMD Official Use Only]

Hi John,

Can you please add error code check in the new atomfirmware interface so driver 
know exactly whether there is RAS EEPROM onboard, and accordingly, jump out and 
stop any further I2C request from here.

Regards,
Hawking
From: Clements, John 
Sent: Wednesday, August 4, 2021 16:48
To: amd-gfx@lists.freedesktop.org
Cc: Zhang, Hawking 
Subject: [PATCH] drm/amdgpu: set RAS EEPROM address from VBIOS


[AMD Official Use Only]

Submitting patch to get RAS EEPROM I2C address from VBIOS FW info table.

Thank you,
John Clements


[PATCH] drm/amdgpu: set RAS EEPROM address from VBIOS

2021-08-04 Thread Clements, John
[AMD Official Use Only]

Submitting patch to get RAS EEPROM I2C address from VBIOS FW info table.

Thank you,
John Clements


0001-drm-amdgpu-set-RAS-EEPROM-address-from-VBIOS.patch
Description: 0001-drm-amdgpu-set-RAS-EEPROM-address-from-VBIOS.patch


RE: [PATCH] drm/amdgpu: enable more pm sysfs under SRIOV 1-VF mode

2021-08-04 Thread Gu, JiaWei (Will)
[AMD Official Use Only]

Add Alex.

-Original Message-
From: Jiawei Gu  
Sent: Wednesday, August 4, 2021 3:50 PM
To: amd-gfx@lists.freedesktop.org
Cc: Nieto, David M ; Deng, Emily ; Gu, 
JiaWei (Will) 
Subject: [PATCH] drm/amdgpu: enable more pm sysfs under SRIOV 1-VF mode

Enable pp_num_states, pp_cur_state, pp_force_state, pp_table sysfs under SRIOV 
1-VF scenario.

Signed-off-by: Jiawei Gu 
---
 drivers/gpu/drm/amd/pm/amdgpu_pm.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/amdgpu_pm.c 
b/drivers/gpu/drm/amd/pm/amdgpu_pm.c
index 769f58d5ae1a..04c7d82f8b89 100644
--- a/drivers/gpu/drm/amd/pm/amdgpu_pm.c
+++ b/drivers/gpu/drm/amd/pm/amdgpu_pm.c
@@ -2005,10 +2005,10 @@ static int ss_bias_attr_update(struct amdgpu_device 
*adev, struct amdgpu_device_  static struct amdgpu_device_attr 
amdgpu_device_attrs[] = {
AMDGPU_DEVICE_ATTR_RW(power_dpm_state,  
ATTR_FLAG_BASIC|ATTR_FLAG_ONEVF),
AMDGPU_DEVICE_ATTR_RW(power_dpm_force_performance_level,
ATTR_FLAG_BASIC|ATTR_FLAG_ONEVF),
-   AMDGPU_DEVICE_ATTR_RO(pp_num_states,
ATTR_FLAG_BASIC),
-   AMDGPU_DEVICE_ATTR_RO(pp_cur_state, 
ATTR_FLAG_BASIC),
-   AMDGPU_DEVICE_ATTR_RW(pp_force_state,   
ATTR_FLAG_BASIC),
-   AMDGPU_DEVICE_ATTR_RW(pp_table, 
ATTR_FLAG_BASIC),
+   AMDGPU_DEVICE_ATTR_RO(pp_num_states,
ATTR_FLAG_BASIC|ATTR_FLAG_ONEVF),
+   AMDGPU_DEVICE_ATTR_RO(pp_cur_state, 
ATTR_FLAG_BASIC|ATTR_FLAG_ONEVF),
+   AMDGPU_DEVICE_ATTR_RW(pp_force_state,   
ATTR_FLAG_BASIC|ATTR_FLAG_ONEVF),
+   AMDGPU_DEVICE_ATTR_RW(pp_table, 
ATTR_FLAG_BASIC|ATTR_FLAG_ONEVF),
AMDGPU_DEVICE_ATTR_RW(pp_dpm_sclk,  
ATTR_FLAG_BASIC|ATTR_FLAG_ONEVF),
AMDGPU_DEVICE_ATTR_RW(pp_dpm_mclk,  
ATTR_FLAG_BASIC|ATTR_FLAG_ONEVF),
AMDGPU_DEVICE_ATTR_RW(pp_dpm_socclk,
ATTR_FLAG_BASIC|ATTR_FLAG_ONEVF),
--
2.17.1


[PATCH] drm/amdgpu: enable more pm sysfs under SRIOV 1-VF mode

2021-08-04 Thread Jiawei Gu
Enable pp_num_states, pp_cur_state, pp_force_state, pp_table sysfs under
SRIOV 1-VF scenario.

Signed-off-by: Jiawei Gu 
---
 drivers/gpu/drm/amd/pm/amdgpu_pm.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/amdgpu_pm.c 
b/drivers/gpu/drm/amd/pm/amdgpu_pm.c
index 769f58d5ae1a..04c7d82f8b89 100644
--- a/drivers/gpu/drm/amd/pm/amdgpu_pm.c
+++ b/drivers/gpu/drm/amd/pm/amdgpu_pm.c
@@ -2005,10 +2005,10 @@ static int ss_bias_attr_update(struct amdgpu_device 
*adev, struct amdgpu_device_
 static struct amdgpu_device_attr amdgpu_device_attrs[] = {
AMDGPU_DEVICE_ATTR_RW(power_dpm_state,  
ATTR_FLAG_BASIC|ATTR_FLAG_ONEVF),
AMDGPU_DEVICE_ATTR_RW(power_dpm_force_performance_level,
ATTR_FLAG_BASIC|ATTR_FLAG_ONEVF),
-   AMDGPU_DEVICE_ATTR_RO(pp_num_states,
ATTR_FLAG_BASIC),
-   AMDGPU_DEVICE_ATTR_RO(pp_cur_state, 
ATTR_FLAG_BASIC),
-   AMDGPU_DEVICE_ATTR_RW(pp_force_state,   
ATTR_FLAG_BASIC),
-   AMDGPU_DEVICE_ATTR_RW(pp_table, 
ATTR_FLAG_BASIC),
+   AMDGPU_DEVICE_ATTR_RO(pp_num_states,
ATTR_FLAG_BASIC|ATTR_FLAG_ONEVF),
+   AMDGPU_DEVICE_ATTR_RO(pp_cur_state, 
ATTR_FLAG_BASIC|ATTR_FLAG_ONEVF),
+   AMDGPU_DEVICE_ATTR_RW(pp_force_state,   
ATTR_FLAG_BASIC|ATTR_FLAG_ONEVF),
+   AMDGPU_DEVICE_ATTR_RW(pp_table, 
ATTR_FLAG_BASIC|ATTR_FLAG_ONEVF),
AMDGPU_DEVICE_ATTR_RW(pp_dpm_sclk,  
ATTR_FLAG_BASIC|ATTR_FLAG_ONEVF),
AMDGPU_DEVICE_ATTR_RW(pp_dpm_mclk,  
ATTR_FLAG_BASIC|ATTR_FLAG_ONEVF),
AMDGPU_DEVICE_ATTR_RW(pp_dpm_socclk,
ATTR_FLAG_BASIC|ATTR_FLAG_ONEVF),
-- 
2.17.1



RE: [PATCH] drm/amdgpu: added synchronization for psp cmd buf access

2021-08-04 Thread Clements, John
[Public]

@Chen, Guchun,
Based off your feedback I double checked the code, and I changed my opinion 
about it, I think it's better just to reuse the original mutex for now. I've 
submitted an updated patch for review

From: Clements, John 
Sent: Tuesday, August 3, 2021 10:07 PM
To: Chen, Guchun ; amd-gfx@lists.freedesktop.org
Cc: Zhang, Hawking ; Li, Candice ; 
Lazar, Lijo 
Subject: Re: [PATCH] drm/amdgpu: added synchronization for psp cmd buf access

Hello Guchun,

In most of those cases you are right it is redundant, the reason i kept them 
separate for now is to resolve this bug while also keeping those interfaces 
modular, and not affecting the psp submit sequence yet. We are planning a 
bigger change to that source to remove alot of the duplicate code regarding the 
cmd buffer prepare/submit flow and will probably go back down to one mutex 
there.

Thank you,
John Clements


From: Chen, Guchun mailto:guchun.c...@amd.com>>
Sent: Tuesday, August 3, 2021 9:58 PM
To: Clements, John mailto:john.cleme...@amd.com>>; 
amd-gfx@lists.freedesktop.org 
mailto:amd-gfx@lists.freedesktop.org>>
Cc: Zhang, Hawking mailto:hawking.zh...@amd.com>>; Li, 
Candice mailto:candice...@amd.com>>; Lazar, Lijo 
mailto:lijo.la...@amd.com>>
Subject: RE: [PATCH] drm/amdgpu: added synchronization for psp cmd buf access


[Public]



Before calling into psp_cmd_submit_buf, a mutex psp->cmd_buf_mutex is there, 
and after entering psp_cmd_submit_buf, there is another mutex psp->mutex, is it 
a bit redundant?



Regards,

Guchun



From: Clements, John mailto:john.cleme...@amd.com>>
Sent: Tuesday, August 3, 2021 5:50 PM
To: amd-gfx@lists.freedesktop.org
Cc: Zhang, Hawking mailto:hawking.zh...@amd.com>>; Li, 
Candice mailto:candice...@amd.com>>; Lazar, Lijo 
mailto:lijo.la...@amd.com>>; Chen, Guchun 
mailto:guchun.c...@amd.com>>
Subject: [PATCH] drm/amdgpu: added synchronization for psp cmd buf access



[AMD Official Use Only]



Submitting patch to synchronize access to psp cmd submission memory to resolve 
potential race conditions.


0001-drm-amdgpu-added-synchronization-for-psp-cmd-buf-acc.patch
Description: 0001-drm-amdgpu-added-synchronization-for-psp-cmd-buf-acc.patch


[PATCH] drm/amdgpu: drop redundant null-pointer checks in amdgpu_ttm_tt_populate() and amdgpu_ttm_tt_unpopulate()

2021-08-04 Thread Tuo Li
The varialbe gtt in the function amdgpu_ttm_tt_populate() and
amdgpu_ttm_tt_unpopulate() is guaranteed to be not NULL in the context.
Thus the null-pointer checks are redundant and can be dropped.

Reported-by: TOTE Robot 
Signed-off-by: Tuo Li 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index 3a55f08e00e1..719539bd6c44 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -1121,7 +1121,7 @@ static int amdgpu_ttm_tt_populate(struct ttm_device *bdev,
struct amdgpu_ttm_tt *gtt = (void *)ttm;
 
/* user pages are bound by amdgpu_ttm_tt_pin_userptr() */
-   if (gtt && gtt->userptr) {
+   if (gtt->userptr) {
ttm->sg = kzalloc(sizeof(struct sg_table), GFP_KERNEL);
if (!ttm->sg)
return -ENOMEM;
@@ -1146,7 +1146,7 @@ static void amdgpu_ttm_tt_unpopulate(struct ttm_device 
*bdev,
struct amdgpu_ttm_tt *gtt = (void *)ttm;
struct amdgpu_device *adev;
 
-   if (gtt && gtt->userptr) {
+   if (gtt->userptr) {
amdgpu_ttm_tt_set_user_pages(ttm, NULL);
kfree(ttm->sg);
ttm->sg = NULL;
-- 
2.25.1



[PATCH v2] drm/radeon: Update pitch for page flip

2021-08-04 Thread Zhenneng Li

When primary bo is updated, crtc's pitch may
have not been updated, this will lead to show
disorder content when user changes display mode,
we update crtc's pitch in page flip to avoid
this bug.
This refers to amdgpu's pageflip.

v1->v2:
Update all of the pitch in all of the page_flip functions
in radeon rather than just the evergreen one.

Cc: Alex Deucher 
Cc: "Christian König" 
Cc: "Pan, Xinhui" 
Cc: David Airlie 
Cc: Daniel Vetter 
Cc: amd-gfx@lists.freedesktop.org
Cc: dri-de...@lists.freedesktop.org
Cc: linux-ker...@vger.kernel.org
Signed-off-by: Zhenneng Li 
---
 drivers/gpu/drm/radeon/evergreen.c | 8 +++-
 drivers/gpu/drm/radeon/r100.c  | 5 +
 drivers/gpu/drm/radeon/rs600.c | 8 +++-
 drivers/gpu/drm/radeon/rv770.c | 8 +++-
 4 files changed, 26 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/radeon/evergreen.c 
b/drivers/gpu/drm/radeon/evergreen.c
index 36a888e1b179..eeb590d2dec2 100644
--- a/drivers/gpu/drm/radeon/evergreen.c
+++ b/drivers/gpu/drm/radeon/evergreen.c
@@ -28,6 +28,7 @@
 
 #include 
 #include 
+#include 
 
 #include "atom.h"
 #include "avivod.h"
@@ -1414,10 +1415,15 @@ void evergreen_page_flip(struct radeon_device *rdev, 
int crtc_id, u64 crtc_base,
 bool async)
 {
struct radeon_crtc *radeon_crtc = rdev->mode_info.crtcs[crtc_id];
+   struct drm_framebuffer *fb = radeon_crtc->base.primary->fb;
 
-   /* update the scanout addresses */
+   /* flip at hsync for async, default is vsync */
WREG32(EVERGREEN_GRPH_FLIP_CONTROL + radeon_crtc->crtc_offset,
   async ? EVERGREEN_GRPH_SURFACE_UPDATE_H_RETRACE_EN : 0);
+   /* update pitch */
+   WREG32(EVERGREEN_GRPH_PITCH + radeon_crtc->crtc_offset,
+  fb->pitches[0] / fb->format->cpp[0]);
+   /* update the scanout addresses */
WREG32(EVERGREEN_GRPH_PRIMARY_SURFACE_ADDRESS_HIGH + 
radeon_crtc->crtc_offset,
   upper_32_bits(crtc_base));
WREG32(EVERGREEN_GRPH_PRIMARY_SURFACE_ADDRESS + 
radeon_crtc->crtc_offset,
diff --git a/drivers/gpu/drm/radeon/r100.c b/drivers/gpu/drm/radeon/r100.c
index ba724198b72e..1268854552ff 100644
--- a/drivers/gpu/drm/radeon/r100.c
+++ b/drivers/gpu/drm/radeon/r100.c
@@ -162,6 +162,7 @@ void r100_wait_for_vblank(struct radeon_device *rdev, int 
crtc)
 void r100_page_flip(struct radeon_device *rdev, int crtc_id, u64 crtc_base, 
bool async)
 {
struct radeon_crtc *radeon_crtc = rdev->mode_info.crtcs[crtc_id];
+   struct drm_framebuffer *fb = radeon_crtc->base.primary->fb;
u32 tmp = ((u32)crtc_base) | RADEON_CRTC_OFFSET__OFFSET_LOCK;
int i;
 
@@ -169,6 +170,10 @@ void r100_page_flip(struct radeon_device *rdev, int 
crtc_id, u64 crtc_base, bool
/* update the scanout addresses */
WREG32(RADEON_CRTC_OFFSET + radeon_crtc->crtc_offset, tmp);
 
+   /* update pitch */
+   WREG32(RADEON_CRTC_PITCH + radeon_crtc->crtc_offset,
+  fb->pitches[0] / fb->format->cpp[0]);
+
/* Wait for update_pending to go high. */
for (i = 0; i < rdev->usec_timeout; i++) {
if (RREG32(RADEON_CRTC_OFFSET + radeon_crtc->crtc_offset) & 
RADEON_CRTC_OFFSET__GUI_TRIG_OFFSET)
diff --git a/drivers/gpu/drm/radeon/rs600.c b/drivers/gpu/drm/radeon/rs600.c
index b2d22e25eee1..b87dd551e939 100644
--- a/drivers/gpu/drm/radeon/rs600.c
+++ b/drivers/gpu/drm/radeon/rs600.c
@@ -41,6 +41,7 @@
 
 #include 
 #include 
+#include 
 
 #include "atom.h"
 #include "radeon.h"
@@ -118,6 +119,7 @@ void avivo_wait_for_vblank(struct radeon_device *rdev, int 
crtc)
 void rs600_page_flip(struct radeon_device *rdev, int crtc_id, u64 crtc_base, 
bool async)
 {
struct radeon_crtc *radeon_crtc = rdev->mode_info.crtcs[crtc_id];
+   struct drm_framebuffer *fb = radeon_crtc->base.primary->fb;
u32 tmp = RREG32(AVIVO_D1GRPH_UPDATE + radeon_crtc->crtc_offset);
int i;
 
@@ -125,9 +127,13 @@ void rs600_page_flip(struct radeon_device *rdev, int 
crtc_id, u64 crtc_base, boo
tmp |= AVIVO_D1GRPH_UPDATE_LOCK;
WREG32(AVIVO_D1GRPH_UPDATE + radeon_crtc->crtc_offset, tmp);
 
-   /* update the scanout addresses */
+   /* flip at hsync for async, default is vsync */
WREG32(AVIVO_D1GRPH_FLIP_CONTROL + radeon_crtc->crtc_offset,
   async ? AVIVO_D1GRPH_SURFACE_UPDATE_H_RETRACE_EN : 0);
+   /* update pitch */
+   WREG32(AVIVO_D1GRPH_PITCH + radeon_crtc->crtc_offset,
+  fb->pitches[0] / fb->format->cpp[0]);
+   /* update the scanout addresses */
WREG32(AVIVO_D1GRPH_SECONDARY_SURFACE_ADDRESS + 
radeon_crtc->crtc_offset,
   (u32)crtc_base);
WREG32(AVIVO_D1GRPH_PRIMARY_SURFACE_ADDRESS + radeon_crtc->crtc_offset,
diff --git a/drivers/gpu/drm/radeon/rv770.c b/drivers/gpu/drm/radeon/rv770.c
index 74499307285b..e592e57be1bb 100644
--- a/drivers/gpu/drm/radeon/rv770.c
+++ b/drivers/gpu/drm/radeon/rv770.c
@@ -32,6 +32,7 @@
 
 #include 
 #include 

Re: [PATCH v2 00/14] drm: Make DRM's IRQ helpers legacy

2021-08-04 Thread Thomas Zimmermann

Hi

Am 03.08.21 um 20:36 schrieb Chrisanthus, Anitha:

Hi Thomas,
Can you please hold off on applying the kmb patch, I am seeing some issues 
while testing. Modetest works, but video playback only plays once, and it fails 
the second time with this patch.


Sounds a bit like the testing issue at [1]. For testing, you need the 
latest drm-misc-next or drm-tip. Specifically, you need commit 
1e4cd78ed493 ("drm: Don't test for IRQ support in VBLANK ioctls").


Let me know whether this works for you.

Best regards
Thomas

[1] https://patchwork.freedesktop.org/patch/447057/?series=93078=1



Thanks,
Anitha



-Original Message-
From: Sam Ravnborg 
Sent: Tuesday, August 3, 2021 8:05 AM
To: Thomas Zimmermann 
Cc: dan...@ffwll.ch; airl...@linux.ie; alexander.deuc...@amd.com;
christian.koe...@amd.com; liviu.du...@arm.com; brian.star...@arm.com;
bbrezil...@kernel.org; nicolas.fe...@microchip.com;
maarten.lankho...@linux.intel.com; mrip...@kernel.org; ste...@agner.ch;
alison.w...@nxp.com; patrik.r.jakobs...@gmail.com; Chrisanthus, Anitha
; robdcl...@gmail.com; Dea, Edmund J
; s...@poorly.run; shawn...@kernel.org;
s.ha...@pengutronix.de; ker...@pengutronix.de; jyri.sa...@iki.fi;
to...@kernel.org; dan.sned...@microchip.com;
tomi.valkei...@ideasonboard.com; amd-gfx@lists.freedesktop.org; dri-
de...@lists.freedesktop.org; linux-arm-ker...@lists.infradead.org; linux-arm-
m...@vger.kernel.org; freedr...@lists.freedesktop.org
Subject: Re: [PATCH v2 00/14] drm: Make DRM's IRQ helpers legacy

Hi Thomas,

On Tue, Aug 03, 2021 at 11:06:50AM +0200, Thomas Zimmermann wrote:

DRM's IRQ helpers are only helpful for old, non-KMS drivers. Move
the code behind CONFIG_DRM_LEGACY. Convert KMS drivers to Linux
IRQ interfaces.

DRM provides IRQ helpers for setting up, receiving and removing IRQ
handlers. It's an abstraction over plain Linux functions. The code
is mid-layerish with several callbacks to hook into the rsp drivers.
Old UMS driver have their interrupts enabled via ioctl, so these
abstractions makes some sense. Modern KMS manage all their interrupts
internally. Using the DRM helpers adds indirection without benefits.

Most KMS drivers already use Linux IRQ functions instead of DRM's
abstraction layer. Patches 1 to 12 convert the remaining ones.
The patches also resolve a bug for devices without assigned interrupt
number. DRM helpers don't test for IRQ_NOTCONNECTED, so drivers do
not detect if the device has no interrupt assigned.

Patch 13 removes an unused function.

Patch 14 moves the DRM IRQ helpers behind CONFIG_DRM_LEGACY. Only
the old non-KMS drivers still use the functionality.

v2:
* drop IRQ_NOTCONNECTED test from atmel-hlcdc (Sam)
* use devm_request_irq() in atmel-hlcdc (Sam)
* unify variable names in arm/hlcdc (Sam)

Thomas Zimmermann (14):


The following patches are all:
Acked-by: Sam Ravnborg 


   drm/fsl-dcu: Convert to Linux IRQ interfaces
   drm/gma500: Convert to Linux IRQ interfaces
   drm/kmb: Convert to Linux IRQ interfaces
   drm/msm: Convert to Linux IRQ interfaces
   drm/mxsfb: Convert to Linux IRQ interfaces
   drm/tidss: Convert to Linux IRQ interfaces
   drm/vc4: Convert to Linux IRQ interfaces
   drm: Remove unused devm_drm_irq_install()


The remaining patches I either skipped or already had a feedback from
me or I asked a question.

Sam


--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Felix Imendörffer



OpenPGP_signature
Description: OpenPGP digital signature


RE: [PATCH] drm/amdgpu: add done BO list

2021-08-04 Thread Zhou, Peng Ju
[AMD Official Use Only]

Merge error, Too big to drop

> -Original Message-
> From: Peng Ju Zhou 
> Sent: Wednesday, August 4, 2021 10:15 AM
> To: amd-gfx@lists.freedesktop.org
> Cc: Nieto, David M ; Zhou, Peng Ju
> 
> Subject: [PATCH] drm/amdgpu: add done BO list
> 
> From: David M Nieto 
> 
> backport of "add a list in VM for BOs in the done state"
> 
> Signed-off-by: David M Nieto 
> Signed-off-by: Peng Ju Zhou 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 1196 +++-
>  1 file changed, 561 insertions(+), 635 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> index 2a88ed5d983b..ecf7f2039de0 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> @@ -25,22 +25,16 @@
>   *  Alex Deucher
>   *  Jerome Glisse
>   */
> -
>  #include 
>  #include 
>  #include 
> -#include 
> 
>  #include 
> -#include 
>  #include "amdgpu.h"
>  #include "amdgpu_trace.h"
>  #include "amdgpu_amdkfd.h"
>  #include "amdgpu_gmc.h"
>  #include "amdgpu_xgmi.h"
> -#include "amdgpu_dma_buf.h"
> -#include "amdgpu_res_cursor.h"
> -#include "kfd_svm.h"
> 
>  /**
>   * DOC: GPUVM
> @@ -89,46 +83,6 @@ struct amdgpu_prt_cb {
>  };
> 
>  /**
> - * amdgpu_vm_set_pasid - manage pasid and vm ptr mapping
> - *
> - * @adev: amdgpu_device pointer
> - * @vm: amdgpu_vm pointer
> - * @pasid: the pasid the VM is using on this GPU
> - *
> - * Set the pasid this VM is using on this GPU, can also be used to remove the
> - * pasid by passing in zero.
> - *
> - */
> -int amdgpu_vm_set_pasid(struct amdgpu_device *adev, struct amdgpu_vm
> *vm,
> - u32 pasid)
> -{
> - int r;
> -
> - if (vm->pasid == pasid)
> - return 0;
> -
> - if (vm->pasid) {
> - r = xa_err(xa_erase_irq(>vm_manager.pasids, vm-
> >pasid));
> - if (r < 0)
> - return r;
> -
> - vm->pasid = 0;
> - }
> -
> - if (pasid) {
> - r = xa_err(xa_store_irq(>vm_manager.pasids, pasid, vm,
> - GFP_KERNEL));
> - if (r < 0)
> - return r;
> -
> - vm->pasid = pasid;
> - }
> -
> -
> - return 0;
> -}
> -
> -/*
>   * vm eviction_lock can be taken in MMU notifiers. Make sure no reclaim-FS
>   * happens while holding this lock anywhere to prevent deadlocks when
>   * an MMU notifier runs in reclaim-FS context.
> @@ -136,13 +90,13 @@ int amdgpu_vm_set_pasid(struct amdgpu_device
> *adev, struct amdgpu_vm *vm,
>  static inline void amdgpu_vm_eviction_lock(struct amdgpu_vm *vm)
>  {
>   mutex_lock(>eviction_lock);
> - vm->saved_flags = memalloc_noreclaim_save();
> + vm->saved_flags = memalloc_nofs_save();
>  }
> 
>  static inline int amdgpu_vm_eviction_trylock(struct amdgpu_vm *vm)
>  {
>   if (mutex_trylock(>eviction_lock)) {
> - vm->saved_flags = memalloc_noreclaim_save();
> + vm->saved_flags = memalloc_nofs_save();
>   return 1;
>   }
>   return 0;
> @@ -150,7 +104,7 @@ static inline int amdgpu_vm_eviction_trylock(struct
> amdgpu_vm *vm)
> 
>  static inline void amdgpu_vm_eviction_unlock(struct amdgpu_vm *vm)
>  {
> - memalloc_noreclaim_restore(vm->saved_flags);
> + memalloc_nofs_restore(vm->saved_flags);
>   mutex_unlock(>eviction_lock);
>  }
> 
> @@ -372,7 +326,7 @@ static void amdgpu_vm_bo_base_init(struct
> amdgpu_vm_bo_base *base,
>   base->next = bo->vm_bo;
>   bo->vm_bo = base;
> 
> - if (bo->tbo.base.resv != vm->root.bo->tbo.base.resv)
> + if (amdkcl_ttm_resvp(>tbo) != amdkcl_ttm_resvp(
> >root.base.bo->tbo))
>   return;
> 
>   vm->bulk_moveable = false;
> @@ -382,7 +336,7 @@ static void amdgpu_vm_bo_base_init(struct
> amdgpu_vm_bo_base *base,
>   amdgpu_vm_bo_idle(base);
> 
>   if (bo->preferred_domains &
> - amdgpu_mem_type_to_domain(bo->tbo.resource->mem_type))
> + amdgpu_mem_type_to_domain(bo->tbo.mem.mem_type))
>   return;
> 
>   /*
> @@ -401,14 +355,14 @@ static void amdgpu_vm_bo_base_init(struct
> amdgpu_vm_bo_base *base,
>   * Helper to get the parent entry for the child page table. NULL if we are at
>   * the root page directory.
>   */
> -static struct amdgpu_vm_bo_base *amdgpu_vm_pt_parent(struct
> amdgpu_vm_bo_base *pt)
> +static struct amdgpu_vm_pt *amdgpu_vm_pt_parent(struct amdgpu_vm_pt
> *pt)
>  {
> - struct amdgpu_bo *parent = pt->bo->parent;
> + struct amdgpu_bo *parent = pt->base.bo->parent;
> 
>   if (!parent)
>   return NULL;
> 
> - return parent->vm_bo;
> + return container_of(parent->vm_bo, struct amdgpu_vm_pt, base);
>  }
> 
>  /*
> @@ -416,8 +370,8 @@ static struct amdgpu_vm_bo_base
> *amdgpu_vm_pt_parent(struct amdgpu_vm_bo_base *p
>   */
>  struct amdgpu_vm_pt_cursor {
>   uint64_t pfn;
> - struct amdgpu_vm_bo_base *parent;
> - 

RE: [PATCH] drm/amdgpu: Add driver version

2021-08-04 Thread Zhou, Peng Ju
[AMD Official Use Only]

Merge error, Too big to drop

> -Original Message-
> From: Peng Ju Zhou 
> Sent: Wednesday, August 4, 2021 10:15 AM
> To: amd-gfx@lists.freedesktop.org
> Cc: Nieto, David M ; Zhou, Peng Ju
> 
> Subject: [PATCH] drm/amdgpu: Add driver version
> 
> From: David M Nieto 
> 
> This sysfs is only defined in DKMS drivers
> it exposes the internal AMDGPU version
> 
> Signed-off-by: David M Nieto 
> Signed-off-by: Peng Ju Zhou 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2659 +++-
>  1 file changed, 941 insertions(+), 1718 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 9e53ff851496..d93d1c966bad 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -29,6 +29,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
> 
>  #include 
> @@ -65,13 +66,9 @@
>  #include "amdgpu_ras.h"
>  #include "amdgpu_pmu.h"
>  #include "amdgpu_fru_eeprom.h"
> -#include "amdgpu_reset.h"
> 
>  #include 
>  #include 
> -#include 
> -
> -#include 
> 
>  MODULE_FIRMWARE("amdgpu/vega10_gpu_info.bin");
>  MODULE_FIRMWARE("amdgpu/vega12_gpu_info.bin");
> @@ -83,8 +80,6 @@ MODULE_FIRMWARE("amdgpu/renoir_gpu_info.bin");
>  MODULE_FIRMWARE("amdgpu/navi10_gpu_info.bin");
>  MODULE_FIRMWARE("amdgpu/navi14_gpu_info.bin");
>  MODULE_FIRMWARE("amdgpu/navi12_gpu_info.bin");
> -MODULE_FIRMWARE("amdgpu/vangogh_gpu_info.bin");
> -MODULE_FIRMWARE("amdgpu/yellow_carp_gpu_info.bin");
> 
>  #define AMDGPU_RESUME_MS 2000
> 
> @@ -114,17 +109,9 @@ const char *amdgpu_asic_name[] = {
>   "RAVEN",
>   "ARCTURUS",
>   "RENOIR",
> - "ALDEBARAN",
>   "NAVI10",
> - "CYAN_SKILLFISH",
>   "NAVI14",
>   "NAVI12",
> - "SIENNA_CICHLID",
> - "NAVY_FLOUNDER",
> - "VANGOGH",
> - "DIMGREY_CAVEFISH",
> - "BEIGE_GOBY",
> - "YELLOW_CARP",
>   "LAST",
>  };
> 
> @@ -141,10 +128,10 @@ static ssize_t
> amdgpu_device_get_pcie_replay_count(struct device *dev,
>   struct device_attribute *attr, char *buf)
>  {
>   struct drm_device *ddev = dev_get_drvdata(dev);
> - struct amdgpu_device *adev = drm_to_adev(ddev);
> + struct amdgpu_device *adev = ddev->dev_private;
>   uint64_t cnt = amdgpu_asic_get_pcie_replay_count(adev);
> 
> - return sysfs_emit(buf, "%llu\n", cnt);
> + return snprintf(buf, PAGE_SIZE, "%llu\n", cnt);
>  }
> 
>  static DEVICE_ATTR(pcie_replay_count, S_IRUGO,
> @@ -166,9 +153,9 @@ static ssize_t
> amdgpu_device_get_product_name(struct device *dev,
>   struct device_attribute *attr, char *buf)
>  {
>   struct drm_device *ddev = dev_get_drvdata(dev);
> - struct amdgpu_device *adev = drm_to_adev(ddev);
> + struct amdgpu_device *adev = ddev->dev_private;
> 
> - return sysfs_emit(buf, "%s\n", adev->product_name);
> + return snprintf(buf, PAGE_SIZE, "%s\n", adev->product_name);
>  }
> 
>  static DEVICE_ATTR(product_name, S_IRUGO,
> @@ -188,9 +175,9 @@ static ssize_t
> amdgpu_device_get_product_number(struct device *dev,
>   struct device_attribute *attr, char *buf)
>  {
>   struct drm_device *ddev = dev_get_drvdata(dev);
> - struct amdgpu_device *adev = drm_to_adev(ddev);
> + struct amdgpu_device *adev = ddev->dev_private;
> 
> - return sysfs_emit(buf, "%s\n", adev->product_number);
> + return snprintf(buf, PAGE_SIZE, "%s\n", adev->product_number);
>  }
> 
>  static DEVICE_ATTR(product_number, S_IRUGO,
> @@ -210,45 +197,27 @@ static ssize_t
> amdgpu_device_get_serial_number(struct device *dev,
>   struct device_attribute *attr, char *buf)
>  {
>   struct drm_device *ddev = dev_get_drvdata(dev);
> - struct amdgpu_device *adev = drm_to_adev(ddev);
> + struct amdgpu_device *adev = ddev->dev_private;
> 
> - return sysfs_emit(buf, "%s\n", adev->serial);
> + return snprintf(buf, PAGE_SIZE, "%s\n", adev->serial);
>  }
> 
>  static DEVICE_ATTR(serial_number, S_IRUGO,
>   amdgpu_device_get_serial_number, NULL);
> 
>  /**
> - * amdgpu_device_supports_px - Is the device a dGPU with ATPX power
> control
> - *
> - * @dev: drm_device pointer
> - *
> - * Returns true if the device is a dGPU with ATPX power control,
> - * otherwise return false.
> - */
> -bool amdgpu_device_supports_px(struct drm_device *dev)
> -{
> - struct amdgpu_device *adev = drm_to_adev(dev);
> -
> - if ((adev->flags & AMD_IS_PX) && !amdgpu_is_atpx_hybrid())
> - return true;
> - return false;
> -}
> -
> -/**
> - * amdgpu_device_supports_boco - Is the device a dGPU with ACPI power
> resources
> + * amdgpu_device_supports_boco - Is the device a dGPU with HG/PX power
> control
>   *
>   * @dev: drm_device pointer
>   *
> - * Returns true if the device is a dGPU with ACPI power control,
> + * Returns true if the device is a dGPU with HG/PX power control,
>   * otherwise return 

Re: [PATCH] drm/amdgpu: drop redundant null-pointer checks in amdgpu_ttm_tt_populate() and amdgpu_ttm_tt_unpopulate()

2021-08-04 Thread Christian König

Am 04.08.21 um 03:51 schrieb Tuo Li:

The varialbe gtt in the function amdgpu_ttm_tt_populate() and
amdgpu_ttm_tt_unpopulate() is guaranteed to be not NULL in the context.
Thus the null-pointer checks are redundant and can be dropped.

Reported-by: TOTE Robot 
Signed-off-by: Tuo Li 


Reviewed-by: Christian König 


---
  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index 3a55f08e00e1..719539bd6c44 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -1121,7 +1121,7 @@ static int amdgpu_ttm_tt_populate(struct ttm_device *bdev,
struct amdgpu_ttm_tt *gtt = (void *)ttm;
  
  	/* user pages are bound by amdgpu_ttm_tt_pin_userptr() */

-   if (gtt && gtt->userptr) {
+   if (gtt->userptr) {
ttm->sg = kzalloc(sizeof(struct sg_table), GFP_KERNEL);
if (!ttm->sg)
return -ENOMEM;
@@ -1146,7 +1146,7 @@ static void amdgpu_ttm_tt_unpopulate(struct ttm_device 
*bdev,
struct amdgpu_ttm_tt *gtt = (void *)ttm;
struct amdgpu_device *adev;
  
-	if (gtt && gtt->userptr) {

+   if (gtt->userptr) {
amdgpu_ttm_tt_set_user_pages(ttm, NULL);
kfree(ttm->sg);
ttm->sg = NULL;




RE: [PATCH] drm/amdgpu: added synchronization for psp cmd buf access

2021-08-04 Thread Chen, Guchun
[Public]

Sorry for missing RB.

This patch is:
Reviewed-by: Guchun Chen 

Regards,
Guchun

From: amd-gfx  On Behalf Of Chen, Guchun
Sent: Wednesday, August 4, 2021 11:40 AM
To: Clements, John ; amd-gfx@lists.freedesktop.org
Cc: Zhang, Hawking ; Li, Candice ; 
Lazar, Lijo 
Subject: RE: [PATCH] drm/amdgpu: added synchronization for psp cmd buf access


[Public]

Thanks John. As in the same context, it's meaningless that two mutex target 
almost the same thing.

Regards,
Guchun

From: Clements, John mailto:john.cleme...@amd.com>>
Sent: Wednesday, August 4, 2021 11:34 AM
To: Chen, Guchun mailto:guchun.c...@amd.com>>; 
amd-gfx@lists.freedesktop.org
Cc: Zhang, Hawking mailto:hawking.zh...@amd.com>>; Li, 
Candice mailto:candice...@amd.com>>; Lazar, Lijo 
mailto:lijo.la...@amd.com>>
Subject: RE: [PATCH] drm/amdgpu: added synchronization for psp cmd buf access


[Public]

@Chen, Guchun,
Based off your feedback I double checked the code, and I changed my opinion 
about it, I think it's better just to reuse the original mutex for now. I've 
submitted an updated patch for review

From: Clements, John mailto:john.cleme...@amd.com>>
Sent: Tuesday, August 3, 2021 10:07 PM
To: Chen, Guchun mailto:guchun.c...@amd.com>>; 
amd-gfx@lists.freedesktop.org
Cc: Zhang, Hawking mailto:hawking.zh...@amd.com>>; Li, 
Candice mailto:candice...@amd.com>>; Lazar, Lijo 
mailto:lijo.la...@amd.com>>
Subject: Re: [PATCH] drm/amdgpu: added synchronization for psp cmd buf access

Hello Guchun,

In most of those cases you are right it is redundant, the reason i kept them 
separate for now is to resolve this bug while also keeping those interfaces 
modular, and not affecting the psp submit sequence yet. We are planning a 
bigger change to that source to remove alot of the duplicate code regarding the 
cmd buffer prepare/submit flow and will probably go back down to one mutex 
there.

Thank you,
John Clements


From: Chen, Guchun mailto:guchun.c...@amd.com>>
Sent: Tuesday, August 3, 2021 9:58 PM
To: Clements, John mailto:john.cleme...@amd.com>>; 
amd-gfx@lists.freedesktop.org 
mailto:amd-gfx@lists.freedesktop.org>>
Cc: Zhang, Hawking mailto:hawking.zh...@amd.com>>; Li, 
Candice mailto:candice...@amd.com>>; Lazar, Lijo 
mailto:lijo.la...@amd.com>>
Subject: RE: [PATCH] drm/amdgpu: added synchronization for psp cmd buf access


[Public]



Before calling into psp_cmd_submit_buf, a mutex psp->cmd_buf_mutex is there, 
and after entering psp_cmd_submit_buf, there is another mutex psp->mutex, is it 
a bit redundant?



Regards,

Guchun



From: Clements, John mailto:john.cleme...@amd.com>>
Sent: Tuesday, August 3, 2021 5:50 PM
To: amd-gfx@lists.freedesktop.org
Cc: Zhang, Hawking mailto:hawking.zh...@amd.com>>; Li, 
Candice mailto:candice...@amd.com>>; Lazar, Lijo 
mailto:lijo.la...@amd.com>>; Chen, Guchun 
mailto:guchun.c...@amd.com>>
Subject: [PATCH] drm/amdgpu: added synchronization for psp cmd buf access



[AMD Official Use Only]



Submitting patch to synchronize access to psp cmd submission memory to resolve 
potential race conditions.