Re: [PATCH v1 0/3] Add support for next gen eDP driver on SnapDragon

2021-05-10 Thread Bjorn Andersson
On Mon 10 May 07:16 CDT 2021, sbill...@codeaurora.org wrote:

> On 2021-05-06 20:32, Rob Clark wrote:
> > On Wed, May 5, 2021 at 11:47 PM  wrote:
> > > 
> > > On 2021-05-05 15:31, Dmitry Baryshkov wrote:
> > > > Hi,
> > > >
> > > > On Wed, 5 May 2021 at 11:17, Sankeerth Billakanti
> > > >  wrote:
> > > >>
> > > >> These patches add support for the next generation eDP driver on
> > > >> SnapDragon
> > > >> with dpu support. The existing eDP driver cannot support the new eDP
> > > >> hardware. So, to maintain backward compatibility, the older eDP driver
> > > >> is
> > > >> moved to v200 folder and the new generation eDP driver is added in
> > > >> the v510 folder.
> > > >
> > > > What exactly does this version correspond to?
> > > > I assume that v510 corresponds to sdmshrike/sc8180x. Is it right?
> > > [Sankeerth] This is for sc7280.
> > > 
> > > > Is it really so specific, or just v2/v5 would be enough? Not to
> > > > mention that this is the MDP/ version, while other blocks tend to use
> > > > block-specific versions/ids.
> > > [Sankeerth] I can rename them as edp-v1 and edp-v2. eDP v1 is a very old
> > > chip and there is a considerable HW delta between v1 and v2, so we want
> > > to keep the drivers separate. We followed a similar model for the DPU
> > > driver, where MDP4, MDP5 and DPU have separate folders. eDP v1 belongs
> > > to the MDP4 generation.
> > 
> > Bjorn brought up the idea of just dropping the existing drm/msm/edp..
> > since the efforts to upstream the platform it worked on (8084?)
> > fizzled out, I don't think there is any device which uses it.
> > 
> > But it does sound like edp is a subset of the newer dp driver, so
> > seems sort of like the better approach would be to add edp support to
> > dp.  I believe Bjorn has something based on this approach which is
> > working for sc8280 (although not sure if it is in shape to post
> > patches yet)
> > 
> > BR,
> > -R
> Hi Rob,
> I will explore integrating the native eDP driver into the DP driver and will
> follow up with new patchsets.
> 
> Hi Dmitry,
> I will move the eDP phy to the qmp drivers folder in the new patchsets so that
> it can reuse the dp core driver.
> 

Hi Sankeerth,

I've been working on eDP support for sc8180x recently, which afaict is
identical to sc7280 in this regard. I finally got the patches cleaned up
and posted here:
https://lore.kernel.org/linux-arm-msm/20210511042043.592802-1-bjorn.anders...@linaro.org/T/#t
https://lore.kernel.org/linux-arm-msm/20210511041930.592483-1-bjorn.anders...@linaro.org/T/#t

My initial patches added widebus support, rather than disabling it. But
those patches need a little more polishing - and I eventually figured out
how to disable the feature instead. So I will get back to this.

There's currently a delay of a few seconds on plug detection, which needs
to be investigated further, and I haven't looked at backlight handling
yet.

Regards,
Bjorn


Re: [PATCH v3 2/2] drm/bridge: anx7625: add suspend / resume hooks

2021-05-10 Thread Hsin-Yi Wang
On Mon, May 10, 2021 at 1:31 PM Pi-Hsun Shih  wrote:
>
> Add suspend / resume hooks for the anx7625 driver, which power off the
> device on suspend and power it back on on resume if it was previously
> powered.
>
> Signed-off-by: Pi-Hsun Shih 

Tested-by: Hsin-Yi Wang 

Tested on a mt8183 juniper device.

> ---
>
> Changes from v2:
> * No change.
>
> ---
>  drivers/gpu/drm/bridge/analogix/anx7625.c | 27 +++
>  1 file changed, 27 insertions(+)
>
> diff --git a/drivers/gpu/drm/bridge/analogix/anx7625.c 
> b/drivers/gpu/drm/bridge/analogix/anx7625.c
> index e1bf31eafe22..b165ef71e00f 100644
> --- a/drivers/gpu/drm/bridge/analogix/anx7625.c
> +++ b/drivers/gpu/drm/bridge/analogix/anx7625.c
> @@ -1705,7 +1705,34 @@ static int __maybe_unused 
> anx7625_runtime_pm_resume(struct device *dev)
> return 0;
>  }
>
> +static int __maybe_unused anx7625_resume(struct device *dev)
> +{
> +   struct anx7625_data *ctx = dev_get_drvdata(dev);
> +
> +   if (!ctx->pdata.intp_irq)
> +   return 0;
> +
> +   if (!pm_runtime_enabled(dev) || !pm_runtime_suspended(dev))
> +   anx7625_runtime_pm_resume(dev);
> +
> +   return 0;
> +}
> +
> +static int __maybe_unused anx7625_suspend(struct device *dev)
> +{
> +   struct anx7625_data *ctx = dev_get_drvdata(dev);
> +
> +   if (!ctx->pdata.intp_irq)
> +   return 0;
> +
> +   if (!pm_runtime_enabled(dev) || !pm_runtime_suspended(dev))
> +   anx7625_runtime_pm_suspend(dev);
> +
> +   return 0;
> +}
> +
>  static const struct dev_pm_ops anx7625_pm_ops = {
> +   SET_SYSTEM_SLEEP_PM_OPS(anx7625_suspend, anx7625_resume)
> SET_RUNTIME_PM_OPS(anx7625_runtime_pm_suspend,
>anx7625_runtime_pm_resume, NULL)
>  };
> --
> 2.31.1.607.g51e8a6a459-goog
>


Re: [PATCH v3 1/2] drm/bridge: anx7625: refactor power control to use runtime PM framework

2021-05-10 Thread Hsin-Yi Wang
On Mon, May 10, 2021 at 1:31 PM Pi-Hsun Shih  wrote:
>
> The driver originally used an atomic_t to keep track of the power
> status, which makes the driver more complicated than needed and has
> a race condition, as it's possible to have the power-on and power-off
> sequences running at the same time.
>
> This patch removes the usage of the atomic_t power_status and uses the
> kernel runtime power management framework instead.
>
> Signed-off-by: Pi-Hsun Shih 

Tested-by: Hsin-Yi Wang 

Tested on a mt8183 juniper device.

> ---
>
> Changes from v2:
> * Add missing .pm field to anx7625_driver.
>
> ---
>  drivers/gpu/drm/bridge/analogix/anx7625.c | 149 ++
>  drivers/gpu/drm/bridge/analogix/anx7625.h |   1 -
>  2 files changed, 64 insertions(+), 86 deletions(-)
>
> diff --git a/drivers/gpu/drm/bridge/analogix/anx7625.c 
> b/drivers/gpu/drm/bridge/analogix/anx7625.c
> index 23283ba0c4f9..e1bf31eafe22 100644
> --- a/drivers/gpu/drm/bridge/analogix/anx7625.c
> +++ b/drivers/gpu/drm/bridge/analogix/anx7625.c
> @@ -11,6 +11,7 @@
>  #include 
>  #include 
>  #include 
> +#include <linux/pm_runtime.h>
>  #include 
>  #include 
>  #include 
> @@ -1005,33 +1006,6 @@ static void anx7625_power_on_init(struct anx7625_data 
> *ctx)
> }
>  }
>
> -static void anx7625_chip_control(struct anx7625_data *ctx, int state)
> -{
> -   struct device *dev = &ctx->client->dev;
> -
> -   DRM_DEV_DEBUG_DRIVER(dev, "before set, power_state(%d).\n",
> -atomic_read(&ctx->power_status));
> -
> -   if (!ctx->pdata.low_power_mode)
> -   return;
> -
> -   if (state) {
> -   atomic_inc(&ctx->power_status);
> -   if (atomic_read(&ctx->power_status) == 1)
> -   anx7625_power_on_init(ctx);
> -   } else {
> -   if (atomic_read(&ctx->power_status)) {
> -   atomic_dec(&ctx->power_status);
> -
> -   if (atomic_read(&ctx->power_status) == 0)
> -   anx7625_power_standby(ctx);
> -   }
> -   }
> -
> -   DRM_DEV_DEBUG_DRIVER(dev, "after set, power_state(%d).\n",
> -atomic_read(&ctx->power_status));
> -}
> -
>  static void anx7625_init_gpio(struct anx7625_data *platform)
>  {
> struct device *dev = &platform->client->dev;
> @@ -1061,9 +1035,6 @@ static void anx7625_stop_dp_work(struct anx7625_data 
> *ctx)
> ctx->hpd_status = 0;
> ctx->hpd_high_cnt = 0;
> ctx->display_timing_valid = 0;
> -
> -   if (ctx->pdata.low_power_mode == 0)
> -   anx7625_disable_pd_protocol(ctx);
>  }
>
>  static void anx7625_start_dp_work(struct anx7625_data *ctx)
> @@ -1105,49 +1076,26 @@ static void anx7625_hpd_polling(struct anx7625_data 
> *ctx)
> int ret, val;
> struct device *dev = &ctx->client->dev;
>
> -   if (atomic_read(&ctx->power_status) != 1) {
> -   DRM_DEV_DEBUG_DRIVER(dev, "No need to poling HPD status.\n");
> -   return;
> -   }
> -
> ret = readx_poll_timeout(anx7625_read_hpd_status_p0,
>  ctx, val,
>  ((val & HPD_STATUS) || (val < 0)),
>  5000,
>  5000 * 100);
> if (ret) {
> -   DRM_DEV_ERROR(dev, "HPD polling timeout!\n");
> -   } else {
> -   DRM_DEV_DEBUG_DRIVER(dev, "HPD raise up.\n");
> -   anx7625_reg_write(ctx, ctx->i2c.tcpc_client,
> - INTR_ALERT_1, 0xFF);
> -   anx7625_reg_write(ctx, ctx->i2c.rx_p0_client,
> - INTERFACE_CHANGE_INT, 0);
> +   DRM_DEV_ERROR(dev, "no hpd.\n");
> +   return;
> }
>
> -   anx7625_start_dp_work(ctx);
> -}
> -
> -static void anx7625_disconnect_check(struct anx7625_data *ctx)
> -{
> -   if (atomic_read(&ctx->power_status) == 0)
> -   anx7625_stop_dp_work(ctx);
> -}
> -
> -static void anx7625_low_power_mode_check(struct anx7625_data *ctx,
> -int state)
> -{
> -   struct device *dev = &ctx->client->dev;
> +   DRM_DEV_DEBUG_DRIVER(dev, "system status: 0x%x. HPD raise up.\n", 
> val);
> +   anx7625_reg_write(ctx, ctx->i2c.tcpc_client,
> + INTR_ALERT_1, 0xFF);
> +   anx7625_reg_write(ctx, ctx->i2c.rx_p0_client,
> + INTERFACE_CHANGE_INT, 0);
>
> -   DRM_DEV_DEBUG_DRIVER(dev, "low power mode check, state(%d).\n", 
> state);
> +   anx7625_start_dp_work(ctx);
>
> -   if (ctx->pdata.low_power_mode) {
> -   anx7625_chip_control(ctx, state);
> -   if (state)
> -   anx7625_hpd_polling(ctx);
> -   else
> -   anx7625_disconnect_check(ctx);
> -   }
> +   if (!ctx->pdata.panel_bridge && ctx->bridge_attached)
> +   drm_helper_hpd_irq_event(ctx->bridge.dev);
>  }
>
>  static 

[PATCH 4/4] drm/msm/dp: Add support for SC8180x eDP

2021-05-10 Thread Bjorn Andersson
The eDP controller found in SC8180x is largely compatible with the
current implementation, but has its register blocks at slightly
different offsets.

Add the compatible and the new register layout.

Signed-off-by: Bjorn Andersson 
---
 drivers/gpu/drm/msm/dp/dp_display.c |  1 +
 drivers/gpu/drm/msm/dp/dp_parser.c  | 28 
 2 files changed, 21 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/msm/dp/dp_display.c 
b/drivers/gpu/drm/msm/dp/dp_display.c
index d1319b58e901..0be03bdc882c 100644
--- a/drivers/gpu/drm/msm/dp/dp_display.c
+++ b/drivers/gpu/drm/msm/dp/dp_display.c
@@ -121,6 +121,7 @@ struct dp_display_private {
 
 static const struct of_device_id dp_dt_match[] = {
{.compatible = "qcom,sc7180-dp"},
+   { .compatible = "qcom,sc8180x-edp" },
{}
 };
 
diff --git a/drivers/gpu/drm/msm/dp/dp_parser.c 
b/drivers/gpu/drm/msm/dp/dp_parser.c
index 51ec85b4803b..47cf18bba4b2 100644
--- a/drivers/gpu/drm/msm/dp/dp_parser.c
+++ b/drivers/gpu/drm/msm/dp/dp_parser.c
@@ -251,6 +251,7 @@ static int dp_parser_clock(struct dp_parser *parser)
 static int dp_parser_parse(struct dp_parser *parser)
 {
struct dss_io_data *io = &parser->io.dp_controller;
+   struct device *dev = &parser->pdev->dev;
int rc = 0;
 
if (!parser) {
@@ -276,14 +277,25 @@ static int dp_parser_parse(struct dp_parser *parser)
 */
parser->regulator_cfg = &sdm845_dp_reg_cfg;
 
-   io->ahb = io->base + 0x0;
-   io->ahb_len = 0x200;
-   io->aux = io->base + 0x200;
-   io->aux_len = 0x200;
-   io->link = io->base + 0x400;
-   io->link_len = 0x600;
-   io->p0 = io->base + 0x1000;
-   io->p0_len = 0x400;
+   if (of_device_is_compatible(dev->of_node, "qcom,sc8180x-edp")) {
+   io->ahb = io->base + 0x0;
+   io->ahb_len = 0x200;
+   io->aux = io->base + 0x200;
+   io->aux_len = 0x200;
+   io->link = io->base + 0x400;
+   io->link_len = 0x600;
+   io->p0 = io->base + 0xa00;
+   io->p0_len = 0x400;
+   } else {
+   io->ahb = io->base + 0x0;
+   io->ahb_len = 0x200;
+   io->aux = io->base + 0x200;
+   io->aux_len = 0x200;
+   io->link = io->base + 0x400;
+   io->link_len = 0x600;
+   io->p0 = io->base + 0x1000;
+   io->p0_len = 0x400;
+   }
 
return 0;
 }
-- 
2.29.2



[PATCH 3/4] drm/msm/dp: Initialize the INTF_CONFIG register

2021-05-10 Thread Bjorn Andersson
Some bootloaders set the widebus enable bit in the INTF_CONFIG register,
but as configuration of widebus isn't yet supported, ensure that the
register has a known value, with widebus disabled.

Fixes: c943b4948b58 ("drm/msm/dp: add displayPort driver support")
Signed-off-by: Bjorn Andersson 
---
 drivers/gpu/drm/msm/dp/dp_catalog.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/msm/dp/dp_catalog.c 
b/drivers/gpu/drm/msm/dp/dp_catalog.c
index a0449a2867e4..e3996eef5518 100644
--- a/drivers/gpu/drm/msm/dp/dp_catalog.c
+++ b/drivers/gpu/drm/msm/dp/dp_catalog.c
@@ -707,6 +707,7 @@ int dp_catalog_panel_timing_cfg(struct dp_catalog 
*dp_catalog)
dp_write_link(catalog, REG_DP_HSYNC_VSYNC_WIDTH_POLARITY,
dp_catalog->width_blanking);
dp_write_link(catalog, REG_DP_ACTIVE_HOR_VER, dp_catalog->dp_active);
+   dp_write_p0(catalog, MMSS_DP_INTF_CONFIG, 0);
return 0;
 }
 
-- 
2.29.2



[PATCH 2/4] drm/msm/dp: Store each subblock in the io region

2021-05-10 Thread Bjorn Andersson
Not all platforms have DP_P0 at offset 0x1000 from the beginning of the
DP block. So move the offsets into dss_io_data, to make it possible in
the next patch to specify alternative offsets and sizes of these
segments.

Signed-off-by: Bjorn Andersson 
---
 drivers/gpu/drm/msm/dp/dp_catalog.c | 57 -
 drivers/gpu/drm/msm/dp/dp_parser.c  | 10 +
 drivers/gpu/drm/msm/dp/dp_parser.h  |  8 
 3 files changed, 33 insertions(+), 42 deletions(-)

diff --git a/drivers/gpu/drm/msm/dp/dp_catalog.c 
b/drivers/gpu/drm/msm/dp/dp_catalog.c
index 2eb37ee48e42..a0449a2867e4 100644
--- a/drivers/gpu/drm/msm/dp/dp_catalog.c
+++ b/drivers/gpu/drm/msm/dp/dp_catalog.c
@@ -24,15 +24,6 @@
 #define DP_INTERRUPT_STATUS_ACK_SHIFT  1
 #define DP_INTERRUPT_STATUS_MASK_SHIFT 2
 
-#define MSM_DP_CONTROLLER_AHB_OFFSET   0x0000
-#define MSM_DP_CONTROLLER_AHB_SIZE 0x0200
-#define MSM_DP_CONTROLLER_AUX_OFFSET   0x0200
-#define MSM_DP_CONTROLLER_AUX_SIZE 0x0200
-#define MSM_DP_CONTROLLER_LINK_OFFSET  0x0400
-#define MSM_DP_CONTROLLER_LINK_SIZE0x0C00
-#define MSM_DP_CONTROLLER_P0_OFFSET0x1000
-#define MSM_DP_CONTROLLER_P0_SIZE  0x0400
-
 #define DP_INTERRUPT_STATUS1 \
(DP_INTR_AUX_I2C_DONE| \
DP_INTR_WRONG_ADDR | DP_INTR_TIMEOUT | \
@@ -64,75 +55,67 @@ struct dp_catalog_private {
 
 static inline u32 dp_read_aux(struct dp_catalog_private *catalog, u32 offset)
 {
-   offset += MSM_DP_CONTROLLER_AUX_OFFSET;
-   return readl_relaxed(catalog->io->dp_controller.base + offset);
+   return readl_relaxed(catalog->io->dp_controller.aux + offset);
 }
 
 static inline void dp_write_aux(struct dp_catalog_private *catalog,
   u32 offset, u32 data)
 {
-   offset += MSM_DP_CONTROLLER_AUX_OFFSET;
/*
 * To make sure aux reg writes happens before any other operation,
 * this function uses writel() instread of writel_relaxed()
 */
-   writel(data, catalog->io->dp_controller.base + offset);
+   writel(data, catalog->io->dp_controller.aux + offset);
 }
 
 static inline u32 dp_read_ahb(struct dp_catalog_private *catalog, u32 offset)
 {
-   offset += MSM_DP_CONTROLLER_AHB_OFFSET;
-   return readl_relaxed(catalog->io->dp_controller.base + offset);
+   return readl_relaxed(catalog->io->dp_controller.ahb + offset);
 }
 
 static inline void dp_write_ahb(struct dp_catalog_private *catalog,
   u32 offset, u32 data)
 {
-   offset += MSM_DP_CONTROLLER_AHB_OFFSET;
/*
 * To make sure phy reg writes happens before any other operation,
 * this function uses writel() instread of writel_relaxed()
 */
-   writel(data, catalog->io->dp_controller.base + offset);
+   writel(data, catalog->io->dp_controller.ahb + offset);
 }
 
 static inline void dp_write_p0(struct dp_catalog_private *catalog,
   u32 offset, u32 data)
 {
-   offset += MSM_DP_CONTROLLER_P0_OFFSET;
/*
 * To make sure interface reg writes happens before any other operation,
 * this function uses writel() instread of writel_relaxed()
 */
-   writel(data, catalog->io->dp_controller.base + offset);
+   writel(data, catalog->io->dp_controller.p0 + offset);
 }
 
 static inline u32 dp_read_p0(struct dp_catalog_private *catalog,
   u32 offset)
 {
-   offset += MSM_DP_CONTROLLER_P0_OFFSET;
/*
 * To make sure interface reg writes happens before any other operation,
 * this function uses writel() instread of writel_relaxed()
 */
-   return readl_relaxed(catalog->io->dp_controller.base + offset);
+   return readl_relaxed(catalog->io->dp_controller.p0 + offset);
 }
 
 static inline u32 dp_read_link(struct dp_catalog_private *catalog, u32 offset)
 {
-   offset += MSM_DP_CONTROLLER_LINK_OFFSET;
-   return readl_relaxed(catalog->io->dp_controller.base + offset);
+   return readl_relaxed(catalog->io->dp_controller.link + offset);
 }
 
 static inline void dp_write_link(struct dp_catalog_private *catalog,
   u32 offset, u32 data)
 {
-   offset += MSM_DP_CONTROLLER_LINK_OFFSET;
/*
 * To make sure link reg writes happens before any other operation,
 * this function uses writel() instread of writel_relaxed()
 */
-   writel(data, catalog->io->dp_controller.base + offset);
+   writel(data, catalog->io->dp_controller.link + offset);
 }
 
 /* aux related catalog functions */
@@ -267,29 +250,21 @@ static void dump_regs(void __iomem *base, int len)
 
 void dp_catalog_dump_regs(struct dp_catalog *dp_catalog)
 {
-   u32 offset, len;
struct dp_catalog_private *catalog = container_of(dp_catalog,
struct dp_catalog_private, dp_catalog);
+   struct dss_io_data *io = &catalog->io->dp_controller;
 
pr_info("AHB regs\n");
-   offset = MSM_DP_CONTROLLER_AHB_OFFSET;
-  

[PATCH 1/4] drm/msm/dp: Simplify the mvid/nvid calculation

2021-05-10 Thread Bjorn Andersson
In the search for the causes of timing issues seen during implementation of
eDP support for SC8180x, a fair amount of time was spent working out why
the calculated mvid/nvid values were wrong.

The overall conclusion is that the ratio of MVID/NVID describes, and
should match, the ratio between the pixel and link clock.

Downstream this calculation reads the M and N values of the pixel clock
straight from DISP_CC, which are then adjusted based on knowledge of how
the link and vco_div (parent of the pixel clock) clocks are derived from the
common VCO.

While upstreaming, and then extracting the PHY driver, the resulting
function performs the following steps:

1) Adjust the passed link rate based on the VCO divider used in the PHY
   driver, and multiply this by 10 based on the link rate divider.
2) Pick reasonable choices of M and N, by calculating the ratio between
   this new clock and the pixel clock.
3) Subtract M from N and flip the bits, to match the encoding of the N
   register in DISP_CC.
4) Flip the bits of N and add M, to get the value of N back.
5) Multiply M with 5, per the documentation.
6) Scale the values such that N is close to 0x8000 (or larger)
7) Multiply N with 2 or 3, depending on the link rate being HBR2 or HBR3.

Presumably step 3) was added to provide step 4) with expected input, so
the two cancel each other out. The factor of 10 from step 1) goes into
the denominator and is partially cancelled by the 5 in the numerator in
step 5), resulting in step 7) simply cancelling out step 1).

Left is the code that finds the ratio between the two arguments, scaled
to keep the denominator close to or larger than 0x8000. And this is our
mvid/nvid pair.
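
As a rough numeric sanity check (illustrative numbers only; assuming a 4K60
stream, stream_rate_khz = 594000, driven over HBR2, rate = 540000):

  old: dispcc_input_rate = 540000 * 10 / 4 = 1350000
       1350000 : 594000 reduces to den = 25, num = 11
       mvid = 11 * 5 = 55, nvid works out to den = 25 (steps 3 and 4 cancel),
       then nvid *= 2 for HBR2 -> 55 : 50
  new: 594000 : 540000 reduces directly to mvid : nvid = 11 : 10

Both paths arrive at the same 11:10 ratio before the final scaling towards
DP_LINK_CONSTANT_N_VALUE, which doesn't change the ratio.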

Signed-off-by: Bjorn Andersson 
---
 drivers/gpu/drm/msm/dp/dp_catalog.c | 41 +
 1 file changed, 6 insertions(+), 35 deletions(-)

diff --git a/drivers/gpu/drm/msm/dp/dp_catalog.c 
b/drivers/gpu/drm/msm/dp/dp_catalog.c
index b1a9b1b98f5f..2eb37ee48e42 100644
--- a/drivers/gpu/drm/msm/dp/dp_catalog.c
+++ b/drivers/gpu/drm/msm/dp/dp_catalog.c
@@ -415,39 +415,16 @@ void dp_catalog_ctrl_config_msa(struct dp_catalog 
*dp_catalog,
u32 rate, u32 stream_rate_khz,
bool fixed_nvid)
 {
-   u32 pixel_m, pixel_n;
-   u32 mvid, nvid, pixel_div = 0, dispcc_input_rate;
u32 const nvid_fixed = DP_LINK_CONSTANT_N_VALUE;
-   u32 const link_rate_hbr2 = 540000;
-   u32 const link_rate_hbr3 = 810000;
-   unsigned long den, num;
-
+   unsigned long mvid, nvid;
struct dp_catalog_private *catalog = container_of(dp_catalog,
struct dp_catalog_private, dp_catalog);
 
-   if (rate == link_rate_hbr3)
-   pixel_div = 6;
-   else if (rate == 1620000 || rate == 270000)
-   pixel_div = 2;
-   else if (rate == link_rate_hbr2)
-   pixel_div = 4;
-   else
-   DRM_ERROR("Invalid pixel mux divider\n");
-
-   dispcc_input_rate = (rate * 10) / pixel_div;
-
-   rational_best_approximation(dispcc_input_rate, stream_rate_khz,
-   (unsigned long)(1 << 16) - 1,
-   (unsigned long)(1 << 16) - 1, &den, &num);
-
-   den = ~(den - num);
-   den = den & 0xFFFF;
-   pixel_m = num;
-   pixel_n = den;
-
-   mvid = (pixel_m & 0xFFFF) * 5;
-   nvid = (0xFFFF & (~pixel_n)) + (pixel_m & 0xFFFF);
+   rational_best_approximation(stream_rate_khz, rate,
+   (1 << 16) - 1, (1 << 16) - 1,
+   &mvid, &nvid);
 
+   /* Adjust values so that nvid is close to DP_LINK_CONSTANT_N_VALUE */
if (nvid < nvid_fixed) {
u32 temp;
 
@@ -456,13 +433,7 @@ void dp_catalog_ctrl_config_msa(struct dp_catalog 
*dp_catalog,
nvid = temp;
}
 
-   if (link_rate_hbr2 == rate)
-   nvid *= 2;
-
-   if (link_rate_hbr3 == rate)
-   nvid *= 3;
-
-   DRM_DEBUG_DP("mvid=0x%x, nvid=0x%x\n", mvid, nvid);
+   DRM_DEBUG_DP("mvid=0x%lx, nvid=0x%lx\n", mvid, nvid);
dp_write_link(catalog, REG_DP_SOFTWARE_MVID, mvid);
dp_write_link(catalog, REG_DP_SOFTWARE_NVID, nvid);
dp_write_p0(catalog, MMSS_DP_DSC_DTO, 0x0);
-- 
2.29.2



[PATCH 0/4] drm/msm/dp: Add support for SC8180x eDP controller

2021-05-10 Thread Bjorn Andersson
The first patch in the series is somewhat unrelated to the support, but
simplifies reasoning and debugging of timing related issues.

The second patch introduces support for dealing with different register block
layouts, which is used in the fourth patch to describe the hardware blocks found
in the SC8180x eDP block.

The third patch configures the INTF_CONFIG register, which carries the
configuration for widebus handling. As with the DPU the bootloader enables
widebus and we need to disable it, or implement support for adjusting the
timing.

Bjorn Andersson (4):
  drm/msm/dp: Simplify the mvid/nvid calculation
  drm/msm/dp: Store each subblock in the io region
  drm/msm/dp: Initialize the INTF_CONFIG register
  drm/msm/dp: Add support for SC8180x eDP

 drivers/gpu/drm/msm/dp/dp_catalog.c | 99 +++--
 drivers/gpu/drm/msm/dp/dp_display.c |  1 +
 drivers/gpu/drm/msm/dp/dp_parser.c  | 22 +++
 drivers/gpu/drm/msm/dp/dp_parser.h  |  8 +++
 4 files changed, 53 insertions(+), 77 deletions(-)

-- 
2.29.2



[PATCH 3/4] drm/msm/dpu: Add SC8180x to hw catalog

2021-05-10 Thread Bjorn Andersson
From: Rob Clark 

Add SC8180x to the hardware catalog, to provide initial support for the
platform. Due to limitations in the DP driver, only one of the four DP
interfaces is left enabled.

The SC8180x platform supports the newly added DPU_INTF_WIDEBUS flag and
the Windows-on-Snapdragon bootloader leaves the widebus bit set, so this
is flagged appropriately to ensure widebus is disabled - for now.

Signed-off-by: Rob Clark 
Signed-off-by: Bjorn Andersson 
---
 .../devicetree/bindings/display/msm/dpu.txt   |   4 +-
 .../gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c| 121 ++
 .../gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h|   3 +
 drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c   |   1 +
 drivers/gpu/drm/msm/msm_drv.c |   1 +
 5 files changed, 128 insertions(+), 2 deletions(-)

diff --git a/Documentation/devicetree/bindings/display/msm/dpu.txt 
b/Documentation/devicetree/bindings/display/msm/dpu.txt
index 586e6eac5b08..b98258374a60 100644
--- a/Documentation/devicetree/bindings/display/msm/dpu.txt
+++ b/Documentation/devicetree/bindings/display/msm/dpu.txt
@@ -8,7 +8,7 @@ The DPU display controller is found in SDM845 SoC.
 
 MDSS:
 Required properties:
-- compatible:  "qcom,sdm845-mdss", "qcom,sc7180-mdss"
+- compatible:  "qcom,sdm845-mdss", "qcom,sc7180-mdss", "qcom,sc8180x-mdss"
 - reg: physical base address and length of controller's registers.
 - reg-names: register region names. The following region is required:
   * "mdss"
@@ -41,7 +41,7 @@ Optional properties:
 
 MDP:
 Required properties:
-- compatible: "qcom,sdm845-dpu", "qcom,sc7180-dpu"
+- compatible: "qcom,sdm845-dpu", "qcom,sc7180-dpu", "qcom,sc8180x-dpu"
 - reg: physical base address and length of controller's registers.
 - reg-names : register region names. The following region is required:
   * "mdp"
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
index b569030a0847..81c429ce94a9 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
@@ -56,6 +56,10 @@
 
 #define INTF_SC7280_MASK INTF_SC7180_MASK | BIT(DPU_DATA_HCTL_EN)
 
+#define INTF_SC8180X_MASK BIT(DPU_INTF_INPUT_CTRL) | \
+ BIT(DPU_INTF_TE) | \
+ BIT(DPU_INTF_WIDEBUS)
+
 #define INTR_SC7180_MASK \
(BIT(DPU_IRQ_TYPE_PING_PONG_RD_PTR) |\
BIT(DPU_IRQ_TYPE_PING_PONG_WR_PTR) |\
@@ -197,6 +201,22 @@ static const struct dpu_caps sm8150_dpu_caps = {
.max_vdeci_exp = MAX_VERT_DECIMATION,
 };
 
+static const struct dpu_caps sc8180_dpu_caps = {
+   .max_mixer_width = DEFAULT_DPU_OUTPUT_LINE_WIDTH,
+   .max_mixer_blendstages = 0xb,
+   .qseed_type = DPU_SSPP_SCALER_QSEED3,
+   .smart_dma_rev = DPU_SSPP_SMART_DMA_V2, /* TODO: v2.5 */
+   .ubwc_version = DPU_HW_UBWC_VER_30,
+   .has_src_split = true,
+   .has_dim_layer = true,
+   .has_idle_pc = true,
+   .has_3d_merge = false,   /* I think? */
+   .max_linewidth = 4096,
+   .pixel_ram_size = DEFAULT_PIXEL_RAM_SIZE,
+   .max_hdeci_exp = MAX_HORZ_DECIMATION,
+   .max_vdeci_exp = MAX_VERT_DECIMATION,
+};
+
 static const struct dpu_caps sm8250_dpu_caps = {
.max_mixer_width = DEFAULT_DPU_OUTPUT_LINE_WIDTH,
.max_mixer_blendstages = 0xb,
@@ -265,6 +285,35 @@ static const struct dpu_mdp_cfg sc7180_mdp[] = {
},
 };
 
+static const struct dpu_mdp_cfg sc8180_mdp[] = {
+   {
+   .name = "top_0", .id = MDP_TOP,
+   // TODO check len
+   .base = 0x0, .len = 0x45C,
+   .features = 0,
+   .highest_bank_bit = 0x3,
+   .clk_ctrls[DPU_CLK_CTRL_VIG0] = {
+   .reg_off = 0x2AC, .bit_off = 0},
+   .clk_ctrls[DPU_CLK_CTRL_VIG1] = {
+   .reg_off = 0x2B4, .bit_off = 0},
+   .clk_ctrls[DPU_CLK_CTRL_VIG2] = {
+   .reg_off = 0x2BC, .bit_off = 0},
+   .clk_ctrls[DPU_CLK_CTRL_VIG3] = {
+   .reg_off = 0x2C4, .bit_off = 0},
+   .clk_ctrls[DPU_CLK_CTRL_DMA0] = {
+   .reg_off = 0x2AC, .bit_off = 8},
+   .clk_ctrls[DPU_CLK_CTRL_DMA1] = {
+   .reg_off = 0x2B4, .bit_off = 8},
+   .clk_ctrls[DPU_CLK_CTRL_CURSOR0] = {
+   .reg_off = 0x2BC, .bit_off = 8},
+   .clk_ctrls[DPU_CLK_CTRL_CURSOR1] = {
+   .reg_off = 0x2C4, .bit_off = 8},
+// TODO ???
+// .clk_ctrls[DPU_CLK_CTRL_REG_DMA] = {
+// .reg_off = 0x2BC, .bit_off = 20},
+   },
+};
+
 static const struct dpu_mdp_cfg sm8250_mdp[] = {
{
.name = "top_0", .id = MDP_TOP,
@@ -789,6 +838,15 @@ static const struct dpu_intf_cfg sc7280_intf[] = {
INTF_BLK("intf_5", INTF_5, 0x39000, INTF_EDP, 0, 24, INTF_SC7280_MASK),
 };
 
+static const struct dpu_intf_cfg sc8180x_intf[] = {
+// INTF_BLK("intf_0", INTF_0, 0x6A000, INTF_DP, 0, 24, INTF_SC8180X_MASK),
+   INTF_BLK("intf_1", INTF_1, 0x6A800, INTF_DSI, 

[PATCH 4/4] dpu: hack up the irq table for 8180 intf_5

2021-05-10 Thread Bjorn Andersson
Signed-off-by: Bjorn Andersson 
---

This is a hack and, as discussed on IRC, should be replaced by some sane
mechanism for dealing with the old and new IRQ layouts. It is included in the
series for completeness.

 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_interrupts.c | 14 ++
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_interrupts.c 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_interrupts.c
index 48c96b812126..fa576c617f86 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_interrupts.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_interrupts.c
@@ -72,11 +72,13 @@
 #define DPU_INTR_INTF_1_UNDERRUN BIT(26)
 #define DPU_INTR_INTF_2_UNDERRUN BIT(28)
 #define DPU_INTR_INTF_3_UNDERRUN BIT(30)
+#define DPU_INTR_INTF_4_UNDERRUN BIT(20)
 #define DPU_INTR_INTF_5_UNDERRUN BIT(22)
 #define DPU_INTR_INTF_0_VSYNC BIT(25)
 #define DPU_INTR_INTF_1_VSYNC BIT(27)
 #define DPU_INTR_INTF_2_VSYNC BIT(29)
 #define DPU_INTR_INTF_3_VSYNC BIT(31)
+#define DPU_INTR_INTF_4_VSYNC BIT(21)
 #define DPU_INTR_INTF_5_VSYNC BIT(23)
 
 /**
@@ -310,14 +312,10 @@ static const struct dpu_irq_type dpu_irq_map[] = {
{ DPU_IRQ_TYPE_PING_PONG_WR_PTR, PINGPONG_3,
DPU_INTR_PING_PONG_3_WR_PTR, 0},
/* irq_idx: 20-23 */
-   { DPU_IRQ_TYPE_PING_PONG_AUTO_REF, PINGPONG_0,
-   DPU_INTR_PING_PONG_0_AUTOREFRESH_DONE, 0},
-   { DPU_IRQ_TYPE_PING_PONG_AUTO_REF, PINGPONG_1,
-   DPU_INTR_PING_PONG_1_AUTOREFRESH_DONE, 0},
-   { DPU_IRQ_TYPE_PING_PONG_AUTO_REF, PINGPONG_2,
-   DPU_INTR_PING_PONG_2_AUTOREFRESH_DONE, 0},
-   { DPU_IRQ_TYPE_PING_PONG_AUTO_REF, PINGPONG_3,
-   DPU_INTR_PING_PONG_3_AUTOREFRESH_DONE, 0},
+   { DPU_IRQ_TYPE_INTF_UNDER_RUN, INTF_4, DPU_INTR_INTF_4_UNDERRUN, 0},
+   { DPU_IRQ_TYPE_INTF_VSYNC, INTF_4, DPU_INTR_INTF_4_VSYNC, 0},
+   { DPU_IRQ_TYPE_INTF_UNDER_RUN, INTF_5, DPU_INTR_INTF_5_UNDERRUN, 0},
+   { DPU_IRQ_TYPE_INTF_VSYNC, INTF_5, DPU_INTR_INTF_5_VSYNC, 0},
/* irq_idx: 24-27 */
{ DPU_IRQ_TYPE_INTF_UNDER_RUN, INTF_0, DPU_INTR_INTF_0_UNDERRUN, 0},
{ DPU_IRQ_TYPE_INTF_VSYNC, INTF_0, DPU_INTR_INTF_0_VSYNC, 0},
-- 
2.29.2



[PATCH 2/4] drm/msm/dpu: Clear boot loader configured data paths

2021-05-10 Thread Bjorn Andersson
It's typical for the bootloader to configure CTL_0 for the boot splash
or EFIFB, but for non-DSI use cases the DPU driver tends to pick another
CTL and the system might end up with two configured data paths producing
data on the same INTF - with resulting graphical artifacts.

Naturally the end goal would be to inherit the bootloader's
configuration and provide the user with a glitch free handover from the
boot configuration to a running DPU.
But such an effort will affect clocks, regulators, power-domains etc, so in
the meantime this patch simply disables all INTFs and clears all
configured data paths, to avoid the graphical artifacts.

Signed-off-by: Bjorn Andersson 
---
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c |  4 +++
 drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c|  2 ++
 drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c | 36 ++
 drivers/gpu/drm/msm/disp/dpu1/dpu_rm.h |  8 +
 4 files changed, 50 insertions(+)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c
index 2d4645e01ebf..7aba27c1055a 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c
@@ -349,9 +349,13 @@ static void dpu_hw_ctl_clear_all_blendstages(struct 
dpu_hw_ctl *ctx)
DPU_REG_WRITE(c, CTL_LAYER_EXT(LM_0 + i), 0);
DPU_REG_WRITE(c, CTL_LAYER_EXT2(LM_0 + i), 0);
DPU_REG_WRITE(c, CTL_LAYER_EXT3(LM_0 + i), 0);
+
+   ctx->pending_flush_mask |= dpu_hw_ctl_get_bitmask_mixer(ctx, 
LM_0 + i);
}
 
DPU_REG_WRITE(c, CTL_FETCH_PIPE_ACTIVE, 0);
+
+   ctx->pending_flush_mask |= CTL_FLUSH_MASK_CTL;
 }
 
 static void dpu_hw_ctl_setup_blendstage(struct dpu_hw_ctl *ctx,
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
index 88e9cc38c13b..8b01cb660381 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
@@ -970,6 +970,8 @@ static int dpu_kms_hw_init(struct msm_kms *kms)
 
dpu_kms->rm_init = true;
 
+   dpu_rm_clear_boot_config(&dpu_kms->rm, dpu_kms->catalog);
+
dpu_kms->hw_mdp = dpu_hw_mdptop_init(MDP_TOP, dpu_kms->mmio,
 dpu_kms->catalog);
if (IS_ERR(dpu_kms->hw_mdp)) {
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c
index fd2d104f0a91..2cf47084482f 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c
@@ -4,6 +4,7 @@
  */
 
 #define pr_fmt(fmt)"[drm:%s] " fmt, __func__
+#include <linux/delay.h>
 #include "dpu_kms.h"
 #include "dpu_hw_lm.h"
 #include "dpu_hw_ctl.h"
@@ -229,6 +230,41 @@ int dpu_rm_init(struct dpu_rm *rm,
return rc ? rc : -EFAULT;
 }
 
+void dpu_rm_clear_boot_config(struct dpu_rm *rm, struct dpu_mdss_cfg *cat)
+{
+   struct dpu_hw_intf *intf;
+   struct dpu_hw_ctl *ctl;
+   int i;
+
+   for (i = INTF_0; i < INTF_MAX; i++) {
+   if (!rm->intf_blks[i - INTF_0])
+   continue;
+
+   DPU_DEBUG("disabling intf%d timing engine\n", i - INTF_0);
+
+   intf = to_dpu_hw_intf(rm->intf_blks[i - INTF_0]);
+   intf->ops.enable_timing(intf, 0);
+   }
+
+   /*
+* Wait one frame for the INTF timing engine to stop, and then wait one
+* more frame, per the documentation.
+*/
+   msleep(32);
+
+   for (i = CTL_0; i < CTL_MAX; i++) {
+   if (!rm->ctl_blks[i - CTL_0])
+   continue;
+
+   DPU_DEBUG("clearing ctl%d layer configuration\n", i - CTL_0);
+
+   ctl = to_dpu_hw_ctl(rm->ctl_blks[i - CTL_0]);
+   ctl->ops.clear_all_blendstages(ctl);
+   ctl->ops.trigger_flush(ctl);
+   ctl->ops.trigger_start(ctl);
+   }
+}
+
 static bool _dpu_rm_needs_split_display(const struct msm_display_topology *top)
 {
return top->num_intf > 1;
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.h 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.h
index 1f12c8d5b8aa..53cd649614a3 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.h
@@ -88,5 +88,13 @@ void dpu_rm_release(struct dpu_global_state *global_state,
 int dpu_rm_get_assigned_resources(struct dpu_rm *rm,
struct dpu_global_state *global_state, uint32_t enc_id,
enum dpu_hw_blk_type type, struct dpu_hw_blk **blks, int blks_size);
+
+/**
+ * dpu_rm_clear_boot_config() - Tear down any data paths configured by boot
+ * @rm: DPU Resource Manager handle
+ * @cat: Pointer to hardware catalog
+ */
+void dpu_rm_clear_boot_config(struct dpu_rm *rm, struct dpu_mdss_cfg *cat);
+
 #endif /* __DPU_RM_H__ */
 
-- 
2.29.2



[PATCH 1/4] drm/msm/dpu: Introduce knowledge of widebus feature

2021-05-10 Thread Bjorn Andersson
Some hardware supports clocking 2 pixels per pixel clock pulse, known as
"widebus". The configuration needs to match between the DPU and the
interface controller, and the timing parameters must be adjusted.

As a first step towards supporting this, start by adding an INTF mask
flag to signal the timing configuration code that the INTF_CONFIG2
register should be written - which will clear the bit, in the case that
the bootloader left it set.

Signed-off-by: Bjorn Andersson 
---
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h | 2 ++
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_intf.c| 3 ++-
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h
index 4dfd8a20ad5c..c2f34a4f82d9 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h
@@ -196,12 +196,14 @@ enum {
  * @DPU_INTF_TE INTF block has TE configuration support
  * @DPU_DATA_HCTL_EN        Allows data to be transferred at different rate
 than video timing
+ * @DPU_INTF_WIDEBUSINTF block supports driving 2 pixels per clock
  * @DPU_INTF_MAX
  */
 enum {
DPU_INTF_INPUT_CTRL = 0x1,
DPU_INTF_TE,
DPU_DATA_HCTL_EN,
+   DPU_INTF_WIDEBUS,
DPU_INTF_MAX
 };
 
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_intf.c 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_intf.c
index 1599e3f49a4f..933485d8c03c 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_intf.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_intf.c
@@ -183,7 +183,6 @@ static void dpu_hw_intf_setup_timing_engine(struct 
dpu_hw_intf *ctx,
if (ctx->cap->features & BIT(DPU_DATA_HCTL_EN)) {
intf_cfg2 |= BIT(4);
display_data_hctl = display_hctl;
-   DPU_REG_WRITE(c, INTF_CONFIG2, intf_cfg2);
DPU_REG_WRITE(c, INTF_DISPLAY_DATA_HCTL, display_data_hctl);
}
 
@@ -204,6 +203,8 @@ static void dpu_hw_intf_setup_timing_engine(struct 
dpu_hw_intf *ctx,
DPU_REG_WRITE(c, INTF_FRAME_LINE_COUNT_EN, 0x3);
DPU_REG_WRITE(c, INTF_CONFIG, intf_cfg);
DPU_REG_WRITE(c, INTF_PANEL_FORMAT, panel_format);
+   if (ctx->cap->features & (BIT(DPU_DATA_HCTL_EN) | 
BIT(DPU_INTF_WIDEBUS)))
+   DPU_REG_WRITE(c, INTF_CONFIG2, intf_cfg2);
 }
 
 static void dpu_hw_intf_enable_timing_engine(
-- 
2.29.2



[PATCH 0/4] drm/msm/dpu: Qualcomm SC8180x MDSS/DPU support

2021-05-10 Thread Bjorn Andersson
These patches add MDSS and DPU support for the Qualcomm SC8180x platform.

The platform supports running 2 pixels per pixel clock cycle and the bootloader
enables this, so the first patch adds enough support to the DPU driver to
disable this again.

The second patch shoots down the data path configured in CTL_0, as the DPU
driver picks CTL_2 on the laptops, causing graphical artifacts.

The third patch adds the SC8180x to the hw catalog.

The fourth patch is included for "completeness", but needs to be reworked. It
updates the IRQ mapping for INTF_5, which is where we find the eDP controller.

Bjorn Andersson (3):
  drm/msm/dpu: Introduce knowledge of widebus feature
  drm/msm/dpu: Clear boot loader configured data paths
  dpu: hack up the irq table for 8180 intf_5

Rob Clark (1):
  drm/msm/dpu: Add SC8180x to hw catalog

 .../devicetree/bindings/display/msm/dpu.txt   |   4 +-
 .../gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c| 121 ++
 .../gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h|   5 +
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c|   4 +
 .../gpu/drm/msm/disp/dpu1/dpu_hw_interrupts.c |  14 +-
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_intf.c   |   3 +-
 drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c   |   3 +
 drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c|  36 ++
 drivers/gpu/drm/msm/disp/dpu1/dpu_rm.h|   8 ++
 drivers/gpu/drm/msm/msm_drv.c |   1 +
 10 files changed, 188 insertions(+), 11 deletions(-)

-- 
2.29.2



Re: [PATCH v3] drm/i915: Invoke another _DSM to enable MUX on HP Workstation laptops

2021-05-10 Thread Kai-Heng Feng
On Mon, Apr 26, 2021 at 11:24 PM Kai-Heng Feng
 wrote:
>
> On HP Fury G7 Workstations, graphics output is re-routed from Intel GFX
> to discrete GFX after S3. This is not desirable, because userspace will
> treat the connected display as a new one, losing display settings.
>
> The expected behavior is to let the discrete GFX drive all external
> displays.
>
> The platform in question uses ACPI method \_SB.PCI0.HGME to enable MUX.
> The method is inside another _DSM, so add the _DSM and call it
> accordingly.
>
> I also tested some MUX-less and iGPU only laptops with that _DSM, no
> regression was found.
>
> v3:
>  - Remove BXT from names.
>  - Change the parameter type.
>  - Fold the function into intel_modeset_init_hw().
>
> v2:
>  - Forward declare struct pci_dev.
>
> Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/3113
> References: 
> https://lore.kernel.org/intel-gfx/1460040732-31417-4-git-send-email-animesh.ma...@intel.com/
> Signed-off-by: Kai-Heng Feng 

A gentle ping...

> ---
>  drivers/gpu/drm/i915/display/intel_acpi.c| 18 ++
>  drivers/gpu/drm/i915/display/intel_acpi.h|  3 +++
>  drivers/gpu/drm/i915/display/intel_display.c |  2 ++
>  3 files changed, 23 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/display/intel_acpi.c 
> b/drivers/gpu/drm/i915/display/intel_acpi.c
> index 833d0c1be4f1..d008d3976261 100644
> --- a/drivers/gpu/drm/i915/display/intel_acpi.c
> +++ b/drivers/gpu/drm/i915/display/intel_acpi.c
> @@ -13,12 +13,17 @@
>  #include "intel_display_types.h"
>
>  #define INTEL_DSM_REVISION_ID 1 /* For Calpella anyway... */
> +#define INTEL_DSM_FN_PLATFORM_MUX_ENABLE 0 /* No args */
>  #define INTEL_DSM_FN_PLATFORM_MUX_INFO 1 /* No args */
>
>  static const guid_t intel_dsm_guid =
> GUID_INIT(0x7ed873d3, 0xc2d0, 0x4e4f,
>   0xa8, 0x54, 0x0f, 0x13, 0x17, 0xb0, 0x1c, 0x2c);
>
> +static const guid_t intel_dsm_guid2 =
> +   GUID_INIT(0x3e5b41c6, 0xeb1d, 0x4260,
> + 0x9d, 0x15, 0xc7, 0x1f, 0xba, 0xda, 0xe4, 0x14);
> +
>  static char *intel_dsm_port_name(u8 id)
>  {
> switch (id) {
> @@ -176,6 +181,19 @@ void intel_unregister_dsm_handler(void)
>  {
>  }
>
> +void intel_dsm_enable_mux(struct drm_i915_private *i915)
> +{
> +   struct pci_dev *pdev = i915->drm.pdev;
> +   acpi_handle dhandle;
> +
> +   dhandle = ACPI_HANDLE(&pdev->dev);
> +   if (!dhandle)
> +   return;
> +
> +   acpi_evaluate_dsm(dhandle, &intel_dsm_guid2, INTEL_DSM_REVISION_ID,
> + INTEL_DSM_FN_PLATFORM_MUX_ENABLE, NULL);
> +}
> +
>  /*
>   * ACPI Specification, Revision 5.0, Appendix B.3.2 _DOD (Enumerate All 
> Devices
>   * Attached to the Display Adapter).
> diff --git a/drivers/gpu/drm/i915/display/intel_acpi.h 
> b/drivers/gpu/drm/i915/display/intel_acpi.h
> index e8b068661d22..def013cf6308 100644
> --- a/drivers/gpu/drm/i915/display/intel_acpi.h
> +++ b/drivers/gpu/drm/i915/display/intel_acpi.h
> @@ -11,11 +11,14 @@ struct drm_i915_private;
>  #ifdef CONFIG_ACPI
>  void intel_register_dsm_handler(void);
>  void intel_unregister_dsm_handler(void);
> +void intel_dsm_enable_mux(struct drm_i915_private *i915);
>  void intel_acpi_device_id_update(struct drm_i915_private *i915);
>  #else
>  static inline void intel_register_dsm_handler(void) { return; }
>  static inline void intel_unregister_dsm_handler(void) { return; }
>  static inline
> +void intel_dsm_enable_mux(struct drm_i915_private *i915) { return; }
> +static inline
>  void intel_acpi_device_id_update(struct drm_i915_private *i915) { return; }
>  #endif /* CONFIG_ACPI */
>
> diff --git a/drivers/gpu/drm/i915/display/intel_display.c 
> b/drivers/gpu/drm/i915/display/intel_display.c
> index a10e26380ef3..d79dae370b20 100644
> --- a/drivers/gpu/drm/i915/display/intel_display.c
> +++ b/drivers/gpu/drm/i915/display/intel_display.c
> @@ -11472,6 +11472,8 @@ void intel_modeset_init_hw(struct drm_i915_private 
> *i915)
>  {
> struct intel_cdclk_state *cdclk_state;
>
> +   intel_dsm_enable_mux(i915);
> +
> if (!HAS_DISPLAY(i915))
> return;
>
> --
> 2.30.2
>


Re: [PATCH v2] drm/radeon/dpm: Disable sclk switching on Oland when two 4K 60Hz monitors are connected

2021-05-10 Thread Kai-Heng Feng
On Fri, Apr 30, 2021 at 12:57 PM Kai-Heng Feng
 wrote:
>
> Screen flickers rapidly when two 4K 60Hz monitors are in use. This issue
> doesn't happen when one monitor is 4K 60Hz (pixelclock 594MHz) and
> another one is 4K 30Hz (pixelclock 297MHz).
>
> The issue is gone after setting "power_dpm_force_performance_level" to
> "high". Following the indication, we found that the issue occurs when
> sclk is too low.
>
> So resolve the issue by disabling sclk switching when there are two
> monitors requiring a high pixelclock (> 297MHz).
>
> v2:
>  - Only apply the fix to Oland.
> Signed-off-by: Kai-Heng Feng 

A gentle ping...

> ---
>  drivers/gpu/drm/radeon/radeon.h| 1 +
>  drivers/gpu/drm/radeon/radeon_pm.c | 8 
>  drivers/gpu/drm/radeon/si_dpm.c| 3 +++
>  3 files changed, 12 insertions(+)
>
> diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
> index 42281fce552e6..56ed5634cebef 100644
> --- a/drivers/gpu/drm/radeon/radeon.h
> +++ b/drivers/gpu/drm/radeon/radeon.h
> @@ -1549,6 +1549,7 @@ struct radeon_dpm {
> void*priv;
> u32 new_active_crtcs;
> int new_active_crtc_count;
> +   int high_pixelclock_count;
> u32 current_active_crtcs;
> int current_active_crtc_count;
> bool single_display;
> diff --git a/drivers/gpu/drm/radeon/radeon_pm.c 
> b/drivers/gpu/drm/radeon/radeon_pm.c
> index 0c1950f4e146f..3861c0b98fcf3 100644
> --- a/drivers/gpu/drm/radeon/radeon_pm.c
> +++ b/drivers/gpu/drm/radeon/radeon_pm.c
> @@ -1767,6 +1767,7 @@ static void radeon_pm_compute_clocks_dpm(struct 
> radeon_device *rdev)
> struct drm_device *ddev = rdev->ddev;
> struct drm_crtc *crtc;
> struct radeon_crtc *radeon_crtc;
> +   struct radeon_connector *radeon_connector;
>
> if (!rdev->pm.dpm_enabled)
> return;
> @@ -1776,6 +1777,7 @@ static void radeon_pm_compute_clocks_dpm(struct 
> radeon_device *rdev)
> /* update active crtc counts */
> rdev->pm.dpm.new_active_crtcs = 0;
> rdev->pm.dpm.new_active_crtc_count = 0;
> +   rdev->pm.dpm.high_pixelclock_count = 0;
> if (rdev->num_crtc && rdev->mode_info.mode_config_initialized) {
> list_for_each_entry(crtc,
> &ddev->mode_config.crtc_list, head) {
> @@ -1783,6 +1785,12 @@ static void radeon_pm_compute_clocks_dpm(struct 
> radeon_device *rdev)
> if (crtc->enabled) {
> rdev->pm.dpm.new_active_crtcs |= (1 << 
> radeon_crtc->crtc_id);
> rdev->pm.dpm.new_active_crtc_count++;
> +   if (!radeon_crtc->connector)
> +   continue;
> +
> +   radeon_connector = 
> to_radeon_connector(radeon_crtc->connector);
> +   if (radeon_connector->pixelclock_for_modeset 
> > 297000)
> +   rdev->pm.dpm.high_pixelclock_count++;
> }
> }
> }
> diff --git a/drivers/gpu/drm/radeon/si_dpm.c b/drivers/gpu/drm/radeon/si_dpm.c
> index 9186095518047..3cc2b96a7f368 100644
> --- a/drivers/gpu/drm/radeon/si_dpm.c
> +++ b/drivers/gpu/drm/radeon/si_dpm.c
> @@ -2979,6 +2979,9 @@ static void si_apply_state_adjust_rules(struct 
> radeon_device *rdev,
> (rdev->pdev->device == 0x6605)) {
> max_sclk = 75000;
> }
> +
> +   if (rdev->pm.dpm.high_pixelclock_count > 1)
> +   disable_sclk_switching = true;
> }
>
> if (rps->vce_active) {
> --
> 2.30.2
>


Re: [RFC PATCH 00/97] Basic GuC submission support in the i915

2021-05-10 Thread Dixit, Ashutosh
On Sun, 09 May 2021 16:11:43 -0700, Jason Ekstrand wrote:
>
> Yes, landing GuC support may be the first step in removing execlist
> support. The inevitable reality is that GPU scheduling is coming and
> likely to be the only path in the not-too-distant future.  (See also
> the ongoing thread with AMD about fences.) I'm not going to pass
> judgement on whether or not this is a good thing.  I'm just reading the
> winds and, in my view, this is where things are headed for good or ill.
>
> In answer to the question above, the answer to "what do we gain from
> GuC?" may soon be, "you get to use your GPU."  We're not there yet and,
> again, I'm not necessarily advocating for it, but that is likely where
> things are headed.
>
> A firmware-based submission model isn't a bad design IMO and, aside from
> the firmware freedom issues, I think there are actual advantages to the
> model. Immediately, it'll unlock a few features like parallel submission
> (more on that in a bit) and long-running compute because they're
> implemented in GuC and the work to implement them properly in the
> execlist scheduler is highly non-trivial.  Longer term, it may (no
> guarantees) unlock some performance by getting the kernel out of the way.

I believe another main reason for GuC is support for HW based
virtualization like SRIOV. The only way to support SRIOV with execlists
would be to statically partition the GPU between VMs; any dynamic
partitioning needs something in HW.


[Bug 211875] CPU frequency scaling lost after "WARNING: CPU: 2 PID: 2358578 at smu8_send_msg_to_smc_with_parameter+0xfe/0x140 [amdgpu]"

2021-05-10 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=211875

Erhard F. (erhar...@mailbox.org) changed:

   What|Removed |Added

 Attachment #296147|0   |1
is obsolete||

--- Comment #11 from Erhard F. (erhar...@mailbox.org) ---
Created attachment 296717
  --> https://bugzilla.kernel.org/attachment.cgi?id=296717&action=edit
kernel .config (kernel 5.13-rc1, A10-9700E)

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

[Bug 211875] CPU frequency scaling lost after "WARNING: CPU: 2 PID: 2358578 at smu8_send_msg_to_smc_with_parameter+0xfe/0x140 [amdgpu]"

2021-05-10 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=211875

--- Comment #10 from Erhard F. (erhar...@mailbox.org) ---
Created attachment 296715
  --> https://bugzilla.kernel.org/attachment.cgi?id=296715&action=edit
dmesg (kernel 5.13-rc1, A10-9700E)

Some locking stacktrace on kernel v5.13-rc1 before the usual
"smu8_send_msg_to_smc_with_parameter" messages:

[...]

WARNING: inconsistent lock state
5.13.0-rc1-bdver4 #9 Not tainted

inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
X/591 [HC1[1]:SC0[0]:HE0:SE1] takes:
be37d160 (fs_reclaim){?.+.}-{0:0}, at: fs_reclaim_acquire+0xf7/0x160
{HARDIRQ-ON-W} state was registered at:
  lock_acquire+0x1a0/0x6e0
  fs_reclaim_acquire+0x117/0x160
  kmem_cache_alloc_trace+0x3b/0x320
  init_rescuer+0x80/0x330
  workqueue_init+0x12f/0x2fd
  kernel_init_freeable+0x305/0x584
  kernel_init+0x8/0x116
  ret_from_fork+0x22/0x30
irq event stamp: 68143332
hardirqs last  enabled at (68143331): []
asm_sysvec_apic_timer_interrupt+0xf/0x20
hardirqs last disabled at (68143332): []
common_interrupt+0x14/0xa0
softirqs last  enabled at (68143304): []
irq_exit_rcu+0x119/0x1a0
softirqs last disabled at (68143299): []
irq_exit_rcu+0x119/0x1a0

other info that might help us debug this:
 Possible unsafe locking scenario:

   CPU0
   
  lock(fs_reclaim);
  
lock(fs_reclaim);

 *** DEADLOCK ***

5 locks held by X/591:
 #0: 88810397fa20 (crtc_ww_class_acquire){+.+.}-{0:0}, at:
set_property_atomic+0xb2/0x2f0 [drm]
 #1: 88812bc00538 (crtc_ww_class_mutex){+.+.}-{3:3}, at:
modeset_lock+0x364/0x500 [drm]
 #2: 88812bc1cb58 (>dm.dc_lock){+.+.}-{3:3}, at:
amdgpu_dm_atomic_commit_tail+0xa12/0x9870 [amdgpu]
 #3: 8881538d6880 (>smu_lock){+.+.}-{3:3}, at:
pp_dpm_dispatch_tasks+0x50/0x90 [amdgpu]
 #4: 8881538d6910 (>msg_lock){+.+.}-{3:3}, at:
smum_send_msg_to_smc_with_parameter+0x1bf/0x300 [amdgpu]

stack backtrace:
CPU: 1 PID: 591 Comm: X Not tainted 5.13.0-rc1-bdver4 #9
Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./A320M-HDV R3.0,
BIOS P3.10 06/26/2019
Call Trace:
 
 dump_stack+0xa5/0xe6
 mark_lock.cold+0x145/0x14f
 ? lock_chain_count+0x20/0x20
 ? mark_lock+0xee/0x2fd0
 ? lock_chain_count+0x20/0x20
 ? lock_chain_count+0x20/0x20
 ? lock_chain_count+0x20/0x20
 ? rcu_read_lock_sched_held+0x3a/0x70
 __lock_acquire+0x146a/0x5d40
 ? debug_check_no_locks_held+0xa0/0xa0
 ? mark_lock+0xee/0x2fd0
 ? debug_check_no_locks_held+0xa0/0xa0
 lock_acquire+0x1a0/0x6e0
 ? fs_reclaim_acquire+0xf7/0x160
 ? lock_release+0x730/0x730
 ? amdgpu_dm_irq_handler+0x1ad/0x780 [amdgpu]
 ? lock_downgrade+0x6e0/0x6e0
 fs_reclaim_acquire+0x117/0x160
 ? fs_reclaim_acquire+0xf7/0x160
 ? amdgpu_dm_irq_handler+0x2d1/0x780 [amdgpu]
 kmem_cache_alloc_trace+0x3b/0x320
 amdgpu_dm_irq_handler+0x2d1/0x780 [amdgpu]
 amdgpu_irq_dispatch+0x280/0x590 [amdgpu]
 ? amdgpu_irq_add_id+0x2c0/0x2c0 [amdgpu]
 ? drm_print_bits+0x190/0x190 [drm]
 ? __lock_acquire+0xd5b/0x5d40
 amdgpu_ih_process+0x1c4/0x390 [amdgpu]
 ? amdgpu_irq_disable_all+0x300/0x300 [amdgpu]
 amdgpu_irq_handler+0x22/0x210 [amdgpu]
 ? amdgpu_irq_disable_all+0x300/0x300 [amdgpu]
 __handle_irq_event_percpu+0x248/0x600
 handle_irq_event+0xfa/0x260
 ? handle_irq_event_percpu+0x140/0x140
 ? lapic_next_event+0x53/0x80
 ? clockevents_program_event+0x1c7/0x280
 handle_edge_irq+0x1f8/0xa90
 __common_interrupt+0x70/0x150
 common_interrupt+0x76/0xa0
 
 asm_common_interrupt+0x1b/0x40
RIP: 0010:delay_halt_mwaitx+0x31/0x40
Code: 00 65 48 03 05 a8 83 22 43 31 d2 48 89 d1 0f 01 fa bb ff ff ff ff b8 f0
00 00 00 b9 02 00 00 00 48 39 de 48 0f 46 de 0f 01 fb <5b> c3 66 66 2e 0f 1f 84
00 00 00 00 00 66 90 48 83 ec 08 48 89 f8
RSP: 0018:88810397ec38 EFLAGS: 0297
RAX: 00f0 RBX: 0bb3 RCX: 0002
RDX:  RSI: 0bb3 RDI: 2c4339a8e963
RBP: 0bb3 R08:  R09: be61f63f
R10: 0001 R11: 0001 R12: 8881538d6800
R13: 01d0 R14:  R15: ed102a71ad3e
 delay_halt+0x36/0x60
 phm_wait_for_register_unequal+0xd5/0x240 [amdgpu]
 ? amdgpu_device_wreg.part.0+0x2ae/0x350 [amdgpu]
 smu8_send_msg_to_smc_with_parameter+0x1e9/0x380 [amdgpu]
 smum_send_msg_to_smc_with_parameter+0x215/0x300 [amdgpu]
 smu8_set_power_state_tasks+0x691/0xe10 [amdgpu]
 ? debug_check_no_locks_held+0xa0/0xa0
 phm_set_power_state+0xcc/0x130 [amdgpu]
 ? phm_power_down_asic+0x90/0x90 [amdgpu]
 psm_adjust_power_state_dynamic+0x172/0x570 [amdgpu]
 ? psm_set_user_performance_state+0x1c0/0x1c0 [amdgpu]
 ? mutex_lock_io_nested+0xfd0/0xfe0
 ? memcpy+0x39/0x60
 ? psm_set_states+0x109/0x190 [amdgpu]
 hwmgr_handle_task+0x10a/0x1f0 [amdgpu]
 ? hwmgr_resume+0xc0/0xc0 [amdgpu]
 ? amdgpu_debugfs_fence_info_show+0x450/0x450 [amdgpu]
 pp_dpm_dispatch_tasks+0x5e/0x90 [amdgpu]
 ? pp_get_power_limit+0x250/0x250 [amdgpu]
 amdgpu_pm_compute_clocks.part.0+0x245/0x1500 [amdgpu]
 ? 

[PATCH RFC 1/3] drm: Add drm_plane_add_modifiers()

2021-05-10 Thread Tina Zhang
Add a function to add modifiers to a plane.

Signed-off-by: Tina Zhang 
---
 drivers/gpu/drm/drm_plane.c | 41 +
 include/drm/drm_plane.h |  3 +++
 2 files changed, 44 insertions(+)

diff --git a/drivers/gpu/drm/drm_plane.c b/drivers/gpu/drm/drm_plane.c
index b570a480090a..793b16d84f86 100644
--- a/drivers/gpu/drm/drm_plane.c
+++ b/drivers/gpu/drm/drm_plane.c
@@ -288,6 +288,47 @@ int drm_universal_plane_init(struct drm_device *dev, 
struct drm_plane *plane,
 }
 EXPORT_SYMBOL(drm_universal_plane_init);
 
+int drm_plane_add_modifiers(struct drm_device *dev,
+ struct drm_plane *plane,
+ const uint64_t *format_modifiers)
+{
+   struct drm_mode_config *config = &dev->mode_config;
+   const uint64_t *temp_modifiers = format_modifiers;
+   unsigned int format_modifier_count = 0;
+
+   /*
+* Only consider adding modifiers when no modifier has been
+* added to that plane before.
+*/
+   if (!temp_modifiers || plane->modifier_count)
+   return -EINVAL;
+
+   while (*temp_modifiers++ != DRM_FORMAT_MOD_INVALID)
+   format_modifier_count++;
+
+   if (format_modifier_count)
+   config->allow_fb_modifiers = true;
+
+   plane->modifier_count = format_modifier_count;
+   plane->modifiers = kmalloc_array(format_modifier_count,
+sizeof(format_modifiers[0]),
+GFP_KERNEL);
+
+   if (format_modifier_count && !plane->modifiers) {
+   DRM_DEBUG_KMS("out of memory when allocating plane\n");
+   return -ENOMEM;
+   }
+
+   memcpy(plane->modifiers, format_modifiers,
+  format_modifier_count * sizeof(format_modifiers[0]));
+   if (config->allow_fb_modifiers)
+   create_in_format_blob(dev, plane);
+
+   return 0;
+}
+EXPORT_SYMBOL(drm_plane_add_modifiers);
+
+
 int drm_plane_register_all(struct drm_device *dev)
 {
unsigned int num_planes = 0;
diff --git a/include/drm/drm_plane.h b/include/drm/drm_plane.h
index 50c23eb432b7..0dacdeffc3bc 100644
--- a/include/drm/drm_plane.h
+++ b/include/drm/drm_plane.h
@@ -827,6 +827,9 @@ int drm_universal_plane_init(struct drm_device *dev,
 const uint64_t *format_modifiers,
 enum drm_plane_type type,
 const char *name, ...);
+int drm_plane_add_modifiers(struct drm_device *dev,
+  struct drm_plane *plane,
+  const uint64_t *format_modifiers);
 int drm_plane_init(struct drm_device *dev,
   struct drm_plane *plane,
   uint32_t possible_crtcs,
-- 
2.25.1



[PATCH RFC 3/3] drm/virtio: Include modifier as part of set_scanout_blob

2021-05-10 Thread Tina Zhang
From: Vivek Kasireddy 

With new use-cases coming up that include virtio-gpu:
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9592

the FB associated with a Guest blob may have a modifier. Therefore,
this modifier info needs to be included as part of set_scanout_blob.

v2: (Tina)
* Use drm_plane_add_modifiers() to set allow_fb_modifiers.
* Append the modifier field to the virtio_gpu_set_scanout_blob struct.

Signed-off-by: Vivek Kasireddy 
Signed-off-by: Tina Zhang 
---
 drivers/gpu/drm/virtio/virtgpu_vq.c | 3 ++-
 include/uapi/linux/virtio_gpu.h | 1 +
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/virtio/virtgpu_vq.c 
b/drivers/gpu/drm/virtio/virtgpu_vq.c
index 7a6d6628e167..351befed105a 100644
--- a/drivers/gpu/drm/virtio/virtgpu_vq.c
+++ b/drivers/gpu/drm/virtio/virtgpu_vq.c
@@ -34,7 +34,7 @@
 #include "virtgpu_drv.h"
 #include "virtgpu_trace.h"
 
-#define MAX_INLINE_CMD_SIZE   96
+#define MAX_INLINE_CMD_SIZE   112
 #define MAX_INLINE_RESP_SIZE  24
 #define VBUFFER_SIZE  (sizeof(struct virtio_gpu_vbuffer) \
   + MAX_INLINE_CMD_SIZE \
@@ -1336,6 +1336,7 @@ void virtio_gpu_cmd_set_scanout_blob(struct 
virtio_gpu_device *vgdev,
cmd_p->format = cpu_to_le32(format);
cmd_p->width  = cpu_to_le32(fb->width);
cmd_p->height = cpu_to_le32(fb->height);
+   cmd_p->modifier = cpu_to_le64(fb->modifier);
 
for (i = 0; i < 4; i++) {
cmd_p->strides[i] = cpu_to_le32(fb->pitches[i]);
diff --git a/include/uapi/linux/virtio_gpu.h b/include/uapi/linux/virtio_gpu.h
index f853d7672175..6d08481ac4ef 100644
--- a/include/uapi/linux/virtio_gpu.h
+++ b/include/uapi/linux/virtio_gpu.h
@@ -420,6 +420,7 @@ struct virtio_gpu_set_scanout_blob {
__le32 padding;
__le32 strides[4];
__le32 offsets[4];
+   __le64 modifier;
 };
 
 /* VIRTIO_GPU_CMD_RESOURCE_MAP_BLOB */
-- 
2.25.1



[PATCH RFC 2/3] drm/virtio: Add modifier support

2021-05-10 Thread Tina Zhang
Add a command to get modifier info from the backend.

Signed-off-by: Tina Zhang 
---
 drivers/gpu/drm/virtio/virtgpu_drv.h   |  3 ++
 drivers/gpu/drm/virtio/virtgpu_kms.c   |  3 ++
 drivers/gpu/drm/virtio/virtgpu_plane.c | 16 ++
 drivers/gpu/drm/virtio/virtgpu_vq.c| 42 ++
 include/uapi/linux/virtio_gpu.h|  8 +
 5 files changed, 72 insertions(+)

diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.h 
b/drivers/gpu/drm/virtio/virtgpu_drv.h
index d9dbc4f258f3..e077ea065558 100644
--- a/drivers/gpu/drm/virtio/virtgpu_drv.h
+++ b/drivers/gpu/drm/virtio/virtgpu_drv.h
@@ -250,6 +250,8 @@ struct virtio_gpu_device {
spinlock_t resource_export_lock;
/* protects map state and host_visible_mm */
spinlock_t host_visible_lock;
+
+   u64 modifiers[VIRTIO_GPU_PLANE_MAX_MODIFIERS];
 };
 
 struct virtio_gpu_fpriv {
@@ -329,6 +331,7 @@ int virtio_gpu_detach_status_page(struct virtio_gpu_device 
*vgdev);
 void virtio_gpu_cursor_ping(struct virtio_gpu_device *vgdev,
struct virtio_gpu_output *output);
 int virtio_gpu_cmd_get_display_info(struct virtio_gpu_device *vgdev);
+int virtio_gpu_cmd_get_plane_info(struct virtio_gpu_device *vgdev);
 int virtio_gpu_cmd_get_capset_info(struct virtio_gpu_device *vgdev, int idx);
 int virtio_gpu_cmd_get_capset(struct virtio_gpu_device *vgdev,
  int idx, int version,
diff --git a/drivers/gpu/drm/virtio/virtgpu_kms.c 
b/drivers/gpu/drm/virtio/virtgpu_kms.c
index b4ec479c32cd..3ecf36d92c5c 100644
--- a/drivers/gpu/drm/virtio/virtgpu_kms.c
+++ b/drivers/gpu/drm/virtio/virtgpu_kms.c
@@ -226,6 +226,9 @@ int virtio_gpu_init(struct drm_device *dev)
virtio_gpu_notify(vgdev);
wait_event_timeout(vgdev->resp_wq, !vgdev->display_info_pending,
   5 * HZ);
+
+   virtio_gpu_cmd_get_plane_info(vgdev);
+
return 0;
 
 err_scanouts:
diff --git a/drivers/gpu/drm/virtio/virtgpu_plane.c 
b/drivers/gpu/drm/virtio/virtgpu_plane.c
index 42ac08ed1442..1b9b2a7bf0ac 100644
--- a/drivers/gpu/drm/virtio/virtgpu_plane.c
+++ b/drivers/gpu/drm/virtio/virtgpu_plane.c
@@ -73,6 +73,20 @@ static void virtio_gpu_plane_destroy(struct drm_plane *plane)
kfree(plane);
 }
 
+static bool virtio_plane_format_mod_supported(struct drm_plane *plane,
+   u32 format, u64 modifier)
+{
+   struct drm_device *dev = plane->dev;
+   struct virtio_gpu_device *vgdev = dev->dev_private;
+   int i;
+
+   for (i = 0; i < VIRTIO_GPU_PLANE_MAX_MODIFIERS; i++)
+   if (modifier == vgdev->modifiers[i])
+   return true;
+
+   return false;
+}
+
 static const struct drm_plane_funcs virtio_gpu_plane_funcs = {
.update_plane   = drm_atomic_helper_update_plane,
.disable_plane  = drm_atomic_helper_disable_plane,
@@ -80,6 +94,7 @@ static const struct drm_plane_funcs virtio_gpu_plane_funcs = {
.reset  = drm_atomic_helper_plane_reset,
.atomic_duplicate_state = drm_atomic_helper_plane_duplicate_state,
.atomic_destroy_state   = drm_atomic_helper_plane_destroy_state,
+   .format_mod_supported   = virtio_plane_format_mod_supported,
 };
 
 static int virtio_gpu_plane_atomic_check(struct drm_plane *plane,
@@ -348,6 +363,7 @@ struct drm_plane *virtio_gpu_plane_init(struct 
virtio_gpu_device *vgdev,
nformats = ARRAY_SIZE(virtio_gpu_formats);
 		funcs = &virtio_gpu_primary_helper_funcs;
}
+
ret = drm_universal_plane_init(dev, plane, 1 << index,
 				       &virtio_gpu_plane_funcs,
   formats, nformats,
diff --git a/drivers/gpu/drm/virtio/virtgpu_vq.c 
b/drivers/gpu/drm/virtio/virtgpu_vq.c
index cf84d382dd41..7a6d6628e167 100644
--- a/drivers/gpu/drm/virtio/virtgpu_vq.c
+++ b/drivers/gpu/drm/virtio/virtgpu_vq.c
@@ -678,6 +678,26 @@ static void virtio_gpu_cmd_get_display_info_cb(struct 
virtio_gpu_device *vgdev,
drm_kms_helper_hotplug_event(vgdev->ddev);
 }
 
+static void virtio_gpu_cmd_get_plane_info_cb(struct virtio_gpu_device *vgdev,
+  struct virtio_gpu_vbuffer *vbuf)
+{
+   struct drm_device *dev = vgdev->ddev;
+   struct virtio_gpu_output *output = vgdev->outputs;
+	struct drm_crtc *crtc = &output->crtc;
+   struct virtio_gpu_resp_plane_info *resp =
+   (struct virtio_gpu_resp_plane_info *)vbuf->resp_buf;
+   int i;
+
+   for (i = 0; i < VIRTIO_GPU_PLANE_MAX_MODIFIERS; i++) {
+   vgdev->modifiers[i] = resp->modifiers[i];
+   if (vgdev->modifiers[i] == DRM_FORMAT_MOD_INVALID)
+   break;
+   }
+
+   if (i < VIRTIO_GPU_PLANE_MAX_MODIFIERS)
+   drm_plane_add_modifiers(dev, crtc->primary, vgdev->modifiers);
+}
+
 static void virtio_gpu_cmd_get_capset_info_cb(struct virtio_gpu_device 

[PATCH RFC 0/3] Add virtio-gpu modifiers support

2021-05-10 Thread Tina Zhang
This patchset introduces modifier support to virtio-gpu.

This RFC version adds a new virtio-gpu command that lets the front-end driver
query the modifier info from the backend. In addition, the front-end driver
needs a new DRM helper function to add the supported modifiers to a plane as
properties and to allow FBs to use modifiers on that plane.
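
For illustration, a minimal usage sketch of the proposed helper (the modifier
list below is a made-up example; the only real assumption is that the array is
terminated with DRM_FORMAT_MOD_INVALID, as in patch 2/3):

	/* hypothetical modifier list, e.g. as reported by the host */
	static const u64 example_modifiers[] = {
		DRM_FORMAT_MOD_LINEAR,
		DRM_FORMAT_MOD_INVALID,	/* terminator */
	};

	/* attach the modifiers to the plane as properties */
	ret = drm_plane_add_modifiers(dev, plane, example_modifiers);
	if (ret)
		return ret;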

Tina Zhang (2):
  drm: Add drm_plane_add_modifiers()
  drm/virtio: Add modifier support

Vivek Kasireddy (1):
  drm/virtio: Include modifier as part of set_scanout_blob

 drivers/gpu/drm/drm_plane.c| 41 +++
 drivers/gpu/drm/virtio/virtgpu_drv.h   |  3 ++
 drivers/gpu/drm/virtio/virtgpu_kms.c   |  3 ++
 drivers/gpu/drm/virtio/virtgpu_plane.c | 16 +
 drivers/gpu/drm/virtio/virtgpu_vq.c| 45 +-
 include/drm/drm_plane.h|  3 ++
 include/uapi/linux/virtio_gpu.h|  9 ++
 7 files changed, 119 insertions(+), 1 deletion(-)

-- 
2.25.1



[Bug 212959] [drm:dm_helpers_dp_write_dpcd [amdgpu]] *ERROR* Failed to find connector for link! - Exclusively an issue by booting from mounted ISOs of respective OS.

2021-05-10 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=212959

tob88...@gmail.com changed:

   What|Removed |Added

   Severity|blocking|normal

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

[Bug 212959] [drm:dm_helpers_dp_write_dpcd [amdgpu]] *ERROR* Failed to find connector for link! - Exclusively an issue by booting from mounted ISOs of respective OS.

2021-05-10 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=212959

tob88...@gmail.com changed:

   What|Removed |Added

Summary|[drm:dm_helpers_dp_write_dp |[drm:dm_helpers_dp_write_dp
   |cd [amdgpu]] *ERROR* Failed |cd [amdgpu]] *ERROR* Failed
   |to find connector for link! |to find connector for link!
   ||- Exclusively an issue by
   ||booting from mounted ISOs
   ||of respective OS.

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

[Bug 212959] [drm:dm_helpers_dp_write_dpcd [amdgpu]] *ERROR* Failed to find connector for link!

2021-05-10 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=212959

--- Comment #4 from tob88...@gmail.com ---
(In reply to Alex Deucher from comment #2)
> Please attach the dmesg output for your system.  Other than the error, is
> the GPU otherwise working correctly?

The GPU is otherwise working correctly on other operating systems.

Also this issue is exclusive to attempting to boot the live systems or
installations from a mounted version of the respective downloaded ISO file.

If you flash it to a USB drive, the installation works fine, despite showing
somewhat the same errors. I am writing this currently from a working PopOS installation.

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

[Bug 212959] [drm:dm_helpers_dp_write_dpcd [amdgpu]] *ERROR* Failed to find connector for link!

2021-05-10 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=212959

--- Comment #3 from tob88...@gmail.com ---
Created attachment 296713
  --> https://bugzilla.kernel.org/attachment.cgi?id=296713&action=edit
Dmesg output

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

Re: [PATCH v6 03/16] drm/amdgpu: Split amdgpu_device_fini into early and late

2021-05-10 Thread kernel test robot
Hi Andrey,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on drm-intel/for-linux-next]
[also build test WARNING on drm-tip/drm-tip drm-exynos/exynos-drm-next 
tegra-drm/drm/tegra/for-next linus/master v5.13-rc1 next-20210510]
[cannot apply to pci/next drm/drm-next]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:
https://github.com/0day-ci/linux/commits/Andrey-Grodzovsky/RFC-Support-hot-device-unplug-in-amdgpu/20210511-003754
base:   git://anongit.freedesktop.org/drm-intel for-linux-next
config: x86_64-randconfig-a012-20210510 (attached as .config)
compiler: clang version 13.0.0 (https://github.com/llvm/llvm-project 
492173d42b32cb91d5d0d72d5ed84fcab80d059a)
reproduce (this is a W=1 build):
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# install x86_64 cross compiling tool for clang build
# apt-get install binutils-x86-64-linux-gnu
# 
https://github.com/0day-ci/linux/commit/28901216b0a25add4057d60c10eb305d4a32535e
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review 
Andrey-Grodzovsky/RFC-Support-hot-device-unplug-in-amdgpu/20210511-003754
git checkout 28901216b0a25add4057d60c10eb305d4a32535e
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 
ARCH=x86_64 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All warnings (new ones prefixed by >>):

   drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c:444: warning: Function parameter 
or member 'sched_score' not described in 'amdgpu_fence_driver_init_ring'
>> drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c:527: warning: expecting prototype 
>> for amdgpu_fence_driver_fini(). Prototype was for 
>> amdgpu_fence_driver_fini_hw() instead
--
>> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:3652: warning: expecting 
>> prototype for amdgpu_device_fini(). Prototype was for 
>> amdgpu_device_fini_hw() instead
--
>> drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c:376: warning: expecting prototype 
>> for amdgpu_irq_fini(). Prototype was for amdgpu_irq_fini_sw() instead


vim +527 drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c

d38ceaf99ed015 Alex Deucher  2015-04-20  517  
d38ceaf99ed015 Alex Deucher  2015-04-20  518  /**
d38ceaf99ed015 Alex Deucher  2015-04-20  519   * amdgpu_fence_driver_fini - 
tear down the fence driver
d38ceaf99ed015 Alex Deucher  2015-04-20  520   * for all possible rings.
d38ceaf99ed015 Alex Deucher  2015-04-20  521   *
d38ceaf99ed015 Alex Deucher  2015-04-20  522   * @adev: amdgpu device 
pointer
d38ceaf99ed015 Alex Deucher  2015-04-20  523   *
d38ceaf99ed015 Alex Deucher  2015-04-20  524   * Tear down the fence driver 
for all possible rings (all asics).
d38ceaf99ed015 Alex Deucher  2015-04-20  525   */
28901216b0a25a Andrey Grodzovsky 2021-05-10  526  void 
amdgpu_fence_driver_fini_hw(struct amdgpu_device *adev)
d38ceaf99ed015 Alex Deucher  2015-04-20 @527  {
c89377d10a11e5 Christian König   2016-03-13  528unsigned i, j;
c89377d10a11e5 Christian König   2016-03-13  529int r;
d38ceaf99ed015 Alex Deucher  2015-04-20  530  
d38ceaf99ed015 Alex Deucher  2015-04-20  531for (i = 0; i < 
AMDGPU_MAX_RINGS; i++) {
d38ceaf99ed015 Alex Deucher  2015-04-20  532struct 
amdgpu_ring *ring = adev->rings[i];
c2776afe740db5 Christian König   2015-11-03  533  
d38ceaf99ed015 Alex Deucher  2015-04-20  534if (!ring || 
!ring->fence_drv.initialized)
d38ceaf99ed015 Alex Deucher  2015-04-20  535
continue;
bb0cd09be45ea4 Emily Deng2021-03-04  536if 
(!ring->no_scheduler)
bb0cd09be45ea4 Emily Deng2021-03-04  537
drm_sched_fini(>sched);
d38ceaf99ed015 Alex Deucher  2015-04-20  538r = 
amdgpu_fence_wait_empty(ring);
d38ceaf99ed015 Alex Deucher  2015-04-20  539if (r) {
d38ceaf99ed015 Alex Deucher  2015-04-20  540/* no 
need to trigger GPU reset as we are unloading */
2f9d4084cac96a Monk Liu  2017-10-16  541
amdgpu_fence_driver_force_completion(ring);
d38ceaf99ed015 Alex Deucher  2015-04-20  542}
55611b507fd645 Jack Xiao 2019-06-05  543if 
(ring->fence_drv.irq_src)
c6a4079badc2f0 Chunming Zhou 2015-06-01  544
amdgpu_irq_put(adev, ring->fence_drv.irq_src,
c6a4079badc2f0 Chunming Zhou 2015-06-01  545
   ring->fence_drv.irq_type);
bb0cd09be45ea4 E

Re: [PATCH] drm/amd/display: remove unused function dc_link_perform_link_training

2021-05-10 Thread Aurabindo Pillai




On 2021-05-10 5:33 p.m., Rodrigo Siqueira wrote:

LGTM,

Jay, any comment?


None, LGTM, and this has already been applied.


Reviewed-by: Rodrigo Siqueira 

On 05/08, Rouven Czerwinski wrote:

This function is not used anywhere, remove it. It was added in
40dd6bd376a4 ("drm/amd/display: Linux Set/Read link rate and lane count
through debugfs") and moved in fe798de53a7a ("drm/amd/display: Move link
functions from dc to dc_link"), but a user is missing.

Signed-off-by: Rouven Czerwinski 
---
  drivers/gpu/drm/amd/display/dc/core/dc_link.c | 13 -
  drivers/gpu/drm/amd/display/dc/dc_link.h  |  3 ---
  2 files changed, 16 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
index 3fb0cebd6938..55c5cf2264b3 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
@@ -3553,19 +3553,6 @@ void dc_link_set_drive_settings(struct dc *dc,
dc_link_dp_set_drive_settings(dc->links[i], lt_settings);
  }
  
-void dc_link_perform_link_training(struct dc *dc,

-  struct dc_link_settings *link_setting,
-  bool skip_video_pattern)
-{
-   int i;
-
-   for (i = 0; i < dc->link_count; i++)
-   dc_link_dp_perform_link_training(
-   dc->links[i],
-   link_setting,
-   skip_video_pattern);
-}
-
  void dc_link_set_preferred_link_settings(struct dc *dc,
 struct dc_link_settings *link_setting,
 struct dc_link *link)
diff --git a/drivers/gpu/drm/amd/display/dc/dc_link.h 
b/drivers/gpu/drm/amd/display/dc/dc_link.h
index fc5622ffec3d..45c927cd27ab 100644
--- a/drivers/gpu/drm/amd/display/dc/dc_link.h
+++ b/drivers/gpu/drm/amd/display/dc/dc_link.h
@@ -363,9 +363,6 @@ bool dc_link_is_hdcp22(struct dc_link *link, enum 
signal_type signal);
  void dc_link_set_drive_settings(struct dc *dc,
struct link_training_settings *lt_settings,
const struct dc_link *link);
-void dc_link_perform_link_training(struct dc *dc,
-  struct dc_link_settings *link_setting,
-  bool skip_video_pattern);
  void dc_link_set_preferred_link_settings(struct dc *dc,
 struct dc_link_settings *link_setting,
 struct dc_link *link);
--
2.31.1

___
amd-gfx mailing list
amd-...@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx




--
Regards,
Aurabindo Pillai


[PATCH v17 1/2] drm/tegra: dc: Support memory bandwidth management

2021-05-10 Thread Dmitry Osipenko
The display controller (DC) performs isochronous memory transfers and thus
has a minimum memory bandwidth requirement that must be fulfilled, otherwise
framebuffer data can't be fetched fast enough. This results in a DC data-FIFO
underflow, which is followed by visual corruption.

The Memory Controller drivers provide facility for memory bandwidth
management via interconnect API. Let's wire up the interconnect API
support to the DC driver in order to fix the distorted display output
on T30 Ouya, T124 TK1 and other Tegra devices.

Tested-by: Peter Geis  # Ouya T30
Tested-by: Matt Merhar  # Ouya T30
Tested-by: Nicolas Chauvet  # PAZ00 T20 and TK1 T124
Signed-off-by: Dmitry Osipenko 
---
 drivers/gpu/drm/tegra/Kconfig |   1 +
 drivers/gpu/drm/tegra/dc.c| 352 ++
 drivers/gpu/drm/tegra/dc.h|  14 ++
 drivers/gpu/drm/tegra/drm.c   |  14 ++
 drivers/gpu/drm/tegra/hub.c   |   3 +
 drivers/gpu/drm/tegra/plane.c | 116 +++
 drivers/gpu/drm/tegra/plane.h |  15 ++
 7 files changed, 515 insertions(+)

diff --git a/drivers/gpu/drm/tegra/Kconfig b/drivers/gpu/drm/tegra/Kconfig
index 5043dcaf1cf9..1650a448eabd 100644
--- a/drivers/gpu/drm/tegra/Kconfig
+++ b/drivers/gpu/drm/tegra/Kconfig
@@ -9,6 +9,7 @@ config DRM_TEGRA
select DRM_MIPI_DSI
select DRM_PANEL
select TEGRA_HOST1X
+   select INTERCONNECT
select IOMMU_IOVA
select CEC_CORE if CEC_NOTIFIER
help
diff --git a/drivers/gpu/drm/tegra/dc.c b/drivers/gpu/drm/tegra/dc.c
index f9120dc24682..9997e4942bf8 100644
--- a/drivers/gpu/drm/tegra/dc.c
+++ b/drivers/gpu/drm/tegra/dc.c
@@ -8,6 +8,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -618,6 +619,9 @@ static int tegra_plane_atomic_check(struct drm_plane *plane,
struct tegra_dc *dc = to_tegra_dc(new_plane_state->crtc);
int err;
 
+   plane_state->peak_memory_bandwidth = 0;
+   plane_state->avg_memory_bandwidth = 0;
+
/* no need for further checks if the plane is being disabled */
if (!new_plane_state->crtc)
return 0;
@@ -808,6 +812,12 @@ static struct drm_plane *tegra_primary_plane_create(struct 
drm_device *drm,
formats = dc->soc->primary_formats;
modifiers = dc->soc->modifiers;
 
+   err = tegra_plane_interconnect_init(plane);
+   if (err) {
+   kfree(plane);
+   return ERR_PTR(err);
+   }
+
	err = drm_universal_plane_init(drm, &plane->base, possible_crtcs,
				       &tegra_plane_funcs, formats,
   num_formats, modifiers, type, NULL);
@@ -845,9 +855,13 @@ static int tegra_cursor_atomic_check(struct drm_plane 
*plane,
 {
struct drm_plane_state *new_plane_state = 
drm_atomic_get_new_plane_state(state,

 plane);
+   struct tegra_plane_state *plane_state = 
to_tegra_plane_state(new_plane_state);
struct tegra_plane *tegra = to_tegra_plane(plane);
int err;
 
+   plane_state->peak_memory_bandwidth = 0;
+   plane_state->avg_memory_bandwidth = 0;
+
/* no need for further checks if the plane is being disabled */
if (!new_plane_state->crtc)
return 0;
@@ -1030,6 +1044,12 @@ static struct drm_plane 
*tegra_dc_cursor_plane_create(struct drm_device *drm,
formats = tegra_cursor_plane_formats;
}
 
+   err = tegra_plane_interconnect_init(plane);
+   if (err) {
+   kfree(plane);
+   return ERR_PTR(err);
+   }
+
	err = drm_universal_plane_init(drm, &plane->base, possible_crtcs,
				       &tegra_plane_funcs, formats,
   num_formats, NULL,
@@ -1144,6 +1164,12 @@ static struct drm_plane 
*tegra_dc_overlay_plane_create(struct drm_device *drm,
num_formats = dc->soc->num_overlay_formats;
formats = dc->soc->overlay_formats;
 
+   err = tegra_plane_interconnect_init(plane);
+   if (err) {
+   kfree(plane);
+   return ERR_PTR(err);
+   }
+
if (!cursor)
type = DRM_PLANE_TYPE_OVERLAY;
else
@@ -1261,6 +1287,7 @@ tegra_crtc_atomic_duplicate_state(struct drm_crtc *crtc)
 {
struct tegra_dc_state *state = to_dc_state(crtc->state);
struct tegra_dc_state *copy;
+   unsigned int i;
 
copy = kmalloc(sizeof(*copy), GFP_KERNEL);
if (!copy)
@@ -1272,6 +1299,9 @@ tegra_crtc_atomic_duplicate_state(struct drm_crtc *crtc)
copy->div = state->div;
copy->planes = state->planes;
 
+   for (i = 0; i < ARRAY_SIZE(state->plane_peak_bw); i++)
+   copy->plane_peak_bw[i] = state->plane_peak_bw[i];
+
	return &copy->base;
 }
 
@@ -1798,6 +1828,106 @@ static int tegra_dc_wait_idle(struct tegra_dc *dc, 
unsigned long timeout)
return -ETIMEDOUT;
 }
 
+static 

[PATCH v17 2/2] drm/tegra: dc: Extend debug stats with total number of events

2021-05-10 Thread Dmitry Osipenko
It's useful to know the total number of underflow events, but currently the
debug stats are reset each time the CRTC is disabled. Let's also account for
the overall number of events in counters that don't get reset.

Reviewed-by: Michał Mirosław 
Signed-off-by: Dmitry Osipenko 
---
 drivers/gpu/drm/tegra/dc.c | 10 ++
 drivers/gpu/drm/tegra/dc.h |  5 +
 2 files changed, 15 insertions(+)

diff --git a/drivers/gpu/drm/tegra/dc.c b/drivers/gpu/drm/tegra/dc.c
index 9997e4942bf8..d87ef2550e26 100644
--- a/drivers/gpu/drm/tegra/dc.c
+++ b/drivers/gpu/drm/tegra/dc.c
@@ -1596,6 +1596,11 @@ static int tegra_dc_show_stats(struct seq_file *s, void 
*data)
seq_printf(s, "underflow: %lu\n", dc->stats.underflow);
seq_printf(s, "overflow: %lu\n", dc->stats.overflow);
 
+   seq_printf(s, "frames total: %lu\n", dc->stats.frames_total);
+   seq_printf(s, "vblank total: %lu\n", dc->stats.vblank_total);
+   seq_printf(s, "underflow total: %lu\n", dc->stats.underflow_total);
+   seq_printf(s, "overflow total: %lu\n", dc->stats.overflow_total);
+
return 0;
 }
 
@@ -2370,6 +2375,7 @@ static irqreturn_t tegra_dc_irq(int irq, void *data)
/*
dev_dbg(dc->dev, "%s(): frame end\n", __func__);
*/
+   dc->stats.frames_total++;
dc->stats.frames++;
}
 
@@ -2378,6 +2384,7 @@ static irqreturn_t tegra_dc_irq(int irq, void *data)
dev_dbg(dc->dev, "%s(): vertical blank\n", __func__);
*/
		drm_crtc_handle_vblank(&dc->base);
+   dc->stats.vblank_total++;
dc->stats.vblank++;
}
 
@@ -2385,6 +2392,7 @@ static irqreturn_t tegra_dc_irq(int irq, void *data)
/*
dev_dbg(dc->dev, "%s(): underflow\n", __func__);
*/
+   dc->stats.underflow_total++;
dc->stats.underflow++;
}
 
@@ -2392,11 +2400,13 @@ static irqreturn_t tegra_dc_irq(int irq, void *data)
/*
dev_dbg(dc->dev, "%s(): overflow\n", __func__);
*/
+   dc->stats.overflow_total++;
dc->stats.overflow++;
}
 
if (status & HEAD_UF_INT) {
dev_dbg_ratelimited(dc->dev, "%s(): head underflow\n", 
__func__);
+   dc->stats.underflow_total++;
dc->stats.underflow++;
}
 
diff --git a/drivers/gpu/drm/tegra/dc.h b/drivers/gpu/drm/tegra/dc.h
index db10af097033..063bb777d607 100644
--- a/drivers/gpu/drm/tegra/dc.h
+++ b/drivers/gpu/drm/tegra/dc.h
@@ -48,6 +48,11 @@ struct tegra_dc_stats {
unsigned long vblank;
unsigned long underflow;
unsigned long overflow;
+
+   unsigned long frames_total;
+   unsigned long vblank_total;
+   unsigned long underflow_total;
+   unsigned long overflow_total;
 };
 
 struct tegra_windowgroup_soc {
-- 
2.30.2



[PATCH v17 0/2] Add memory bandwidth management to NVIDIA Tegra DRM driver

2021-05-10 Thread Dmitry Osipenko
This series adds memory bandwidth management to the NVIDIA Tegra DRM driver,
which is done using interconnect framework. It fixes display corruption that
happens due to insufficient memory bandwidth.
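
For reference, the interconnect API pattern used here looks roughly like the
sketch below (illustrative only, not the actual tegra-dc code; the "dma-mem"
path name and the bandwidth variables are assumptions):

	struct icc_path *path;
	int err;

	/* request the DC -> memory path once, e.g. at plane initialization */
	path = devm_of_icc_get(dev, "dma-mem");
	if (IS_ERR(path))
		return PTR_ERR(path);

	/* update the required bandwidth (in kBps) whenever the display config changes */
	err = icc_set_bw(path, avg_kbps, peak_kbps);
	if (err)
		return err;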

Changelog:

v17: - No changes, re-sending for v5.14.

v16: - Implemented suggestions that were given by Michał Mirosław to v15.

 - Added r-b from Michał Mirosław to the debug-stats patch.

 - Rebased on top of a recent linux-next.

 - Removed bandwidth scaling based on the width difference of src/dst
   windows since it is no longer relevant. Apparently the recent memory
   driver changes fixed the problems that I witnessed before.

 - Average bandwidth calculation now won't overflow for 4k resolutions.

 - Average bandwidth calculation now uses the size of the visible
   area instead of the src area since debug stats of the memory
   controller clearly show that downscaled window takes less bandwidth,
   proportionally to the scaled size.

 - Bandwidth calculation now uses "adjusted mode" of the CRTC, which
   is what is used for h/w programming, instead of the mode that was
   requested by userspace, although the two usually match in practice.

v15: - Corrected tegra_plane_icc_names[] NULL-check that was partially lost
   by accident in v14 after unsuccessful rebase.

v14: - Made improvements that were suggested by Michał Mirosław to v13:

   - Changed 'unsigned int' to 'bool'.
   - Renamed functions which calculate bandwidth state.
   - Reworked comment in the code that explains why downscaled plane
 require higher bandwidth.
   - Added round-up to bandwidth calculation.
   - Added sanity checks of the plane index and fixed out-of-bounds
 access which happened on T124 due to the cursor plane index.

v13: - No code changes. Patches missed v5.12, re-sending them for v5.13.

Dmitry Osipenko (2):
  drm/tegra: dc: Support memory bandwidth management
  drm/tegra: dc: Extend debug stats with total number of events

 drivers/gpu/drm/tegra/Kconfig |   1 +
 drivers/gpu/drm/tegra/dc.c| 362 ++
 drivers/gpu/drm/tegra/dc.h|  19 ++
 drivers/gpu/drm/tegra/drm.c   |  14 ++
 drivers/gpu/drm/tegra/hub.c   |   3 +
 drivers/gpu/drm/tegra/plane.c | 116 +++
 drivers/gpu/drm/tegra/plane.h |  15 ++
 7 files changed, 530 insertions(+)

-- 
2.30.2



Re: [PATCH v2] drm: Declare drm_send_event_helper static.

2021-05-10 Thread Guenter Roeck
On Mon, May 10, 2021 at 06:46:16PM +0530, Rajat Asthana wrote:
> Declare drm_send_event_helper as static to fix sparse warning:
> 
> > warning: symbol 'drm_send_event_helper' was not declared.
> > Should it be static?
> 
> Signed-off-by: Rajat Asthana 
> ---
> Changes in v2:
> Provide full name in Author and Signed-off.
> 

Turns out a variant of this patch [1] has already been accepted.

Guenter

---
[1] 
https://patchwork.kernel.org/project/dri-devel/patch/20210427105503.10765-1-fmdefrance...@gmail.com/


RE: [PATCH v2 10/10] drm/amdgpu: Move dmabuf attach/detach to backend_(un)bind

2021-05-10 Thread Errabolu, Ramesh
[AMD Official Use Only - Internal Distribution Only]

Acked-by: Ramesh Errabolu 

-Original Message-
From: amd-gfx  On Behalf Of Christian 
König
Sent: Thursday, April 22, 2021 6:20 AM
To: Kuehling, Felix ; amd-...@lists.freedesktop.org; 
dri-devel@lists.freedesktop.org
Subject: Re: [PATCH v2 10/10] drm/amdgpu: Move dmabuf attach/detach to 
backend_(un)bind

Am 22.04.21 um 03:30 schrieb Felix Kuehling:
> The dmabuf attachment should be updated by moving the SG BO to 
> DOMAIN_CPU and back to DOMAIN_GTT. This does not necessarily invoke 
> the populate/unpopulate callbacks. Do this in backend_bind/unbind instead.
>
> Signed-off-by: Felix Kuehling 

Reviewed-by: Christian König 

> ---
>   .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  |  3 --
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c   | 51 +--
>   2 files changed, 25 insertions(+), 29 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> index 18a1f9222a59..68e6ce8dcf33 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> @@ -582,9 +582,6 @@ kfd_mem_dmaunmap_dmabuf(struct kfd_mem_attachment 
> *attachment)
>   
>   amdgpu_bo_placement_from_domain(bo, AMDGPU_GEM_DOMAIN_CPU);
>   	ttm_bo_validate(&bo->tbo, &bo->placement, &ctx);
> - /* FIXME: This does not guarantee that amdgpu_ttm_tt_unpopulate is
> -  * called
> -  */
>   }
>   
>   static void
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> index 7e7d8330d64b..fc2a8d681dbc 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> @@ -910,7 +910,23 @@ static int amdgpu_ttm_backend_bind(struct ttm_device 
> *bdev,
>   DRM_ERROR("failed to pin userptr\n");
>   return r;
>   }
> + } else if (ttm->page_flags & TTM_PAGE_FLAG_SG) {
> + if (!ttm->sg) {
> + struct dma_buf_attachment *attach;
> + struct sg_table *sgt;
> +
> + attach = gtt->gobj->import_attach;
> + sgt = dma_buf_map_attachment(attach, DMA_BIDIRECTIONAL);
> + if (IS_ERR(sgt))
> + return PTR_ERR(sgt);
> +
> + ttm->sg = sgt;
> + }
> +
> + drm_prime_sg_to_dma_addr_array(ttm->sg, gtt->ttm.dma_address,
> +ttm->num_pages);
>   }
> +
>   if (!ttm->num_pages) {
>   WARN(1, "nothing to bind %u pages for mreg %p back %p!\n",
>ttm->num_pages, bo_mem, ttm); @@ -1037,8 +1053,15 @@ 
> static 
> void amdgpu_ttm_backend_unbind(struct ttm_device *bdev,
>   int r;
>   
>   /* if the pages have userptr pinning then clear that first */
> - if (gtt->userptr)
> + if (gtt->userptr) {
>   amdgpu_ttm_tt_unpin_userptr(bdev, ttm);
> + } else if (ttm->sg && gtt->gobj->import_attach) {
> + struct dma_buf_attachment *attach;
> +
> + attach = gtt->gobj->import_attach;
> + dma_buf_unmap_attachment(attach, ttm->sg, DMA_BIDIRECTIONAL);
> + ttm->sg = NULL;
> + }
>   
>   if (!gtt->bound)
>   return;
> @@ -1125,23 +1148,8 @@ static int amdgpu_ttm_tt_populate(struct ttm_device 
> *bdev,
>   return 0;
>   }
>   
> - if (ttm->page_flags & TTM_PAGE_FLAG_SG) {
> - if (!ttm->sg) {
> - struct dma_buf_attachment *attach;
> - struct sg_table *sgt;
> -
> - attach = gtt->gobj->import_attach;
> - sgt = dma_buf_map_attachment(attach, DMA_BIDIRECTIONAL);
> - if (IS_ERR(sgt))
> - return PTR_ERR(sgt);
> -
> - ttm->sg = sgt;
> - }
> -
> - drm_prime_sg_to_dma_addr_array(ttm->sg, gtt->ttm.dma_address,
> -ttm->num_pages);
> + if (ttm->page_flags & TTM_PAGE_FLAG_SG)
>   return 0;
> - }
>   
>   	return ttm_pool_alloc(&adev->mman.bdev.pool, ttm, ctx);
>   }
> @@ -1165,15 +1173,6 @@ static void amdgpu_ttm_tt_unpopulate(struct ttm_device 
> *bdev,
>   return;
>   }
>   
> - if (ttm->sg && gtt->gobj->import_attach) {
> - struct dma_buf_attachment *attach;
> -
> - attach = gtt->gobj->import_attach;
> - dma_buf_unmap_attachment(attach, ttm->sg, DMA_BIDIRECTIONAL);
> - ttm->sg = NULL;
> - return;
> - }
> -
>   if (ttm->page_flags & TTM_PAGE_FLAG_SG)
>   return;
>   

___
amd-gfx mailing list
amd-...@lists.freedesktop.org

RE: [PATCH v2 09/10] drm/ttm: Don't count pages in SG BOs against pages_limit

2021-05-10 Thread Errabolu, Ramesh
[AMD Official Use Only - Internal Distribution Only]

Acked-by: Ramesh Errabolu 

-Original Message-
From: amd-gfx  On Behalf Of Kuehling, 
Felix
Sent: Wednesday, April 21, 2021 8:31 PM
To: amd-...@lists.freedesktop.org; dri-devel@lists.freedesktop.org
Subject: [PATCH v2 09/10] drm/ttm: Don't count pages in SG BOs against 
pages_limit

Pages in SG BOs were not allocated by TTM. So don't count them against TTM's 
pages limit.

Signed-off-by: Felix Kuehling 
---
 drivers/gpu/drm/ttm/ttm_tt.c | 27 ++-
 1 file changed, 18 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_tt.c b/drivers/gpu/drm/ttm/ttm_tt.c index 
5d8820725b75..e8b8c3257392 100644
--- a/drivers/gpu/drm/ttm/ttm_tt.c
+++ b/drivers/gpu/drm/ttm/ttm_tt.c
@@ -317,9 +317,12 @@ int ttm_tt_populate(struct ttm_device *bdev,
if (ttm_tt_is_populated(ttm))
return 0;
 
-	atomic_long_add(ttm->num_pages, &ttm_pages_allocated);
-	if (bdev->pool.use_dma32)
-		atomic_long_add(ttm->num_pages, &ttm_dma32_pages_allocated);
+	if (!(ttm->page_flags & TTM_PAGE_FLAG_SG)) {
+		atomic_long_add(ttm->num_pages, &ttm_pages_allocated);
+		if (bdev->pool.use_dma32)
+			atomic_long_add(ttm->num_pages,
+					&ttm_dma32_pages_allocated);
+   }
 
	while (atomic_long_read(&ttm_pages_allocated) > ttm_pages_limit ||
	       atomic_long_read(&ttm_dma32_pages_allocated) > @@ -350,9 +353,12 
@@ int ttm_tt_populate(struct ttm_device *bdev,
return 0;
 
 error:
-	atomic_long_sub(ttm->num_pages, &ttm_pages_allocated);
-	if (bdev->pool.use_dma32)
-		atomic_long_sub(ttm->num_pages, &ttm_dma32_pages_allocated);
+	if (!(ttm->page_flags & TTM_PAGE_FLAG_SG)) {
+		atomic_long_sub(ttm->num_pages, &ttm_pages_allocated);
+		if (bdev->pool.use_dma32)
+			atomic_long_sub(ttm->num_pages,
+					&ttm_dma32_pages_allocated);
+   }
return ret;
 }
 EXPORT_SYMBOL(ttm_tt_populate);
@@ -382,9 +388,12 @@ void ttm_tt_unpopulate(struct ttm_device *bdev, struct 
ttm_tt *ttm)
else
		ttm_pool_free(&bdev->pool, ttm);
 
-	atomic_long_sub(ttm->num_pages, &ttm_pages_allocated);
-	if (bdev->pool.use_dma32)
-		atomic_long_sub(ttm->num_pages, &ttm_dma32_pages_allocated);
+	if (!(ttm->page_flags & TTM_PAGE_FLAG_SG)) {
+		atomic_long_sub(ttm->num_pages, &ttm_pages_allocated);
+		if (bdev->pool.use_dma32)
+			atomic_long_sub(ttm->num_pages,
+					&ttm_dma32_pages_allocated);
+   }
 
ttm->page_flags &= ~TTM_PAGE_FLAG_PRIV_POPULATED;  }
--
2.31.1

___
amd-gfx mailing list
amd-...@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [PATCH v2 08/10] drm/amdgpu: Add DMA mapping of GTT BOs

2021-05-10 Thread Errabolu, Ramesh
[AMD Official Use Only - Internal Distribution Only]

Acked-by: Ramesh Errabolu 

-Original Message-
From: amd-gfx  On Behalf Of Kuehling, 
Felix
Sent: Tuesday, April 27, 2021 10:09 AM
To: Zeng, Oak ; amd-...@lists.freedesktop.org; 
dri-devel@lists.freedesktop.org
Subject: Re: [PATCH v2 08/10] drm/amdgpu: Add DMA mapping of GTT BOs

Am 2021-04-27 um 10:29 a.m. schrieb Zeng, Oak:
> Regards,
> Oak
>
>  
>
> On 2021-04-26, 11:56 PM, "Kuehling, Felix"  wrote:
>
> Am 2021-04-26 um 8:35 p.m. schrieb Zeng, Oak:
> > Regards,
> > Oak 
> >
> >  
> >
> > On 2021-04-21, 9:31 PM, "amd-gfx on behalf of Felix Kuehling" 
>  
> wrote:
> >
> > Use DMABufs with dynamic attachment to DMA-map GTT BOs on other 
> GPUs.
> >
> > Signed-off-by: Felix Kuehling 
> > ---
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h|  2 +
> >  .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  | 76 
> ++-
> >  2 files changed, 77 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
> > index 63668433f5a6..b706e5a54782 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
> > @@ -41,6 +41,7 @@ struct amdgpu_device;
> >  enum kfd_mem_attachment_type {
> > KFD_MEM_ATT_SHARED, /* Share kgd_mem->bo or another 
> attachment's */
> > KFD_MEM_ATT_USERPTR,/* SG bo to DMA map pages from a 
> userptr bo */
> > +   KFD_MEM_ATT_DMABUF, /* DMAbuf to DMA map TTM BOs */
> >  };
> >
> >  struct kfd_mem_attachment {
> > @@ -56,6 +57,7 @@ struct kfd_mem_attachment {
> >  struct kgd_mem {
> > struct mutex lock;
> > struct amdgpu_bo *bo;
> > +   struct dma_buf *dmabuf;
> > struct list_head attachments;
> > /* protected by amdkfd_process_info.lock */
> > struct ttm_validate_buffer validate_list;
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> > index 9eeedd0c7920..18a1f9222a59 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> > @@ -524,6 +524,16 @@ kfd_mem_dmamap_userptr(struct kgd_mem *mem,
> > return ret;
> >  }
> >
> > +static int
> > +kfd_mem_dmamap_dmabuf(struct kfd_mem_attachment *attachment)
> > +{
> > +   struct ttm_operation_ctx ctx = {.interruptible = true};
> > +   struct amdgpu_bo *bo = attachment->bo_va->base.bo;
> > +
> > +   amdgpu_bo_placement_from_domain(bo, AMDGPU_GEM_DOMAIN_GTT);
> > +   return ttm_bo_validate(&bo->tbo, &bo->placement, &ctx);
> > How does this work? The function name says this is dma mapping a 
> buffer but from the implementation, it is just a placement and 
> validation
>
> Conceptually, calling ttm_bo_validate ensures that the BO is in the
> specified domain, in this case GTT. Before calling validate, it can be
> in the CPU domain, which means it may be swapped to disk so it's not GPU
> accessible. For a DMABuf attachment, the CPU domain means, that the
> DMABuf is not attached because the underlying memory object may be on
> the move or swapped out.
>
> The actual implementation of the dmabuf attachment is currently in
> amdgpu_ttm_populate/unpopulate. This is incorrect. Patch 10 in this
> series fixes that to move the actual dmabuf attachment into
> amdgpu_ttm_backend_bind/unbind, which is called from amdgpu_bo_move when
> a BO is moved between the CPU and GTT domains.
>
> Thanks for the explanation. One more thing I don't quite understand: before 
> this series, GTT memory should already have been validated somewhere before 
> GTT memory is mapped to GPU. You added GTT memory validation here - will this 
> validation be duplicated?

When you have N GPUs there are now N BOs involved. Each GPU needs its own BO 
because it needs its own DMA mapping. There will be one actual GTT BO that 
allocates physical pages in TTM. The other BOs are dmabuf imports that DMA-map 
the same physical pages for access by the other GPUs.

The validate call here validates one of the dmabuf imports. This does not 
duplicate the validation of the underlying TTM BO with the actual physical 
memory allocation.
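
As a rough conceptual sketch of that pattern (using the generic dma-buf API,
not the actual amdgpu code), each importing GPU gets its own attachment and
therefore its own DMA mapping of the same backing pages:

	/* 'peer' is the importing GPU's struct device */
	static struct sg_table *map_for_peer_gpu(struct dma_buf *buf,
						 struct device *peer)
	{
		struct dma_buf_attachment *attach;

		attach = dma_buf_attach(buf, peer);
		if (IS_ERR(attach))
			return ERR_CAST(attach);

		/* DMA addresses in the returned table are valid for 'peer' only */
		return dma_buf_map_attachment(attach, DMA_BIDIRECTIONAL);
	}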


>
> The function naming kfd_mem_dmamap_dmabuf is still confusing since it seems 
> to me it is only some preparation work before dynamically dma-map a GTT 
> memory.

No, this series is not just preparation. It implements DMA mapping of BOs for 
multiple GPUs. TTM already handles DMA mapping of the memory for the device 
where the memory was allocated. (Yes, even GTT memory is associated with 

RE: [PATCH v2 07/10] drm/amdgpu: Move kfd_mem_attach outside reservation

2021-05-10 Thread Errabolu, Ramesh
[AMD Official Use Only - Internal Distribution Only]

Acked-by: Ramesh Errabolu 

-Original Message-
From: amd-gfx  On Behalf Of Kuehling, 
Felix
Sent: Wednesday, April 21, 2021 8:31 PM
To: amd-...@lists.freedesktop.org; dri-devel@lists.freedesktop.org
Subject: [PATCH v2 07/10] drm/amdgpu: Move kfd_mem_attach outside reservation

This is needed to avoid deadlocks with DMA buf import in the next patch.
Also move PT/PD validation out of kfd_mem_attach, that way the caller can bo 
this unconditionally.

Signed-off-by: Felix Kuehling 
---
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  | 75 +++
 1 file changed, 44 insertions(+), 31 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index 7d25d886b98c..9eeedd0c7920 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -577,6 +577,34 @@ kfd_mem_dmaunmap_attachment(struct kgd_mem *mem,
}
 }
 
+static int
+kfd_mem_attach_userptr(struct amdgpu_device *adev, struct kgd_mem *mem,
+  struct amdgpu_bo **bo)
+{
+   unsigned long bo_size = mem->bo->tbo.base.size;
+   struct drm_gem_object *gobj;
+   int ret;
+
+   ret = amdgpu_bo_reserve(mem->bo, false);
+   if (ret)
+   return ret;
+
+   ret = amdgpu_gem_object_create(adev, bo_size, 1,
+  AMDGPU_GEM_DOMAIN_CPU,
+  0, ttm_bo_type_sg,
+  mem->bo->tbo.base.resv,
+  &gobj);
+   if (ret)
+   return ret;
+
+   amdgpu_bo_unreserve(mem->bo);
+
+   *bo = gem_to_amdgpu_bo(gobj);
+   (*bo)->parent = amdgpu_bo_ref(mem->bo);
+
+   return 0;
+}
+
 /* kfd_mem_attach - Add a BO to a VM
  *
  * Everything that needs to bo done only once when a BO is first added @@ 
-598,7 +626,6 @@ static int kfd_mem_attach(struct amdgpu_device *adev, struct 
kgd_mem *mem,
uint64_t va = mem->va;
struct kfd_mem_attachment *attachment[2] = {NULL, NULL};
struct amdgpu_bo *bo[2] = {NULL, NULL};
-   struct drm_gem_object *gobj;
int i, ret;
 
if (!va) {
@@ -632,15 +659,9 @@ static int kfd_mem_attach(struct amdgpu_device *adev, 
struct kgd_mem *mem,
} else if (amdgpu_ttm_tt_get_usermm(mem->bo->tbo.ttm)) {
/* Create an SG BO to DMA-map userptrs on other GPUs */
attachment[i]->type = KFD_MEM_ATT_USERPTR;
-   ret = amdgpu_gem_object_create(adev, bo_size, 1,
-  AMDGPU_GEM_DOMAIN_CPU,
-  0, ttm_bo_type_sg,
-  mem->bo->tbo.base.resv,
-  &gobj);
+			ret = kfd_mem_attach_userptr(adev, mem, &bo[i]);
if (ret)
goto unwind;
-   bo[i] = gem_to_amdgpu_bo(gobj);
-   bo[i]->parent = amdgpu_bo_ref(mem->bo);
} else {
/* FIXME: Need to DMA-map other BO types */
attachment[i]->type = KFD_MEM_ATT_SHARED; @@ -665,13 
+686,6 @@ static int kfd_mem_attach(struct amdgpu_device *adev, struct kgd_mem 
*mem,
va += bo_size;
}
 
-   /* Allocate validate page tables if needed */
-   ret = vm_validate_pt_pd_bos(vm);
-   if (unlikely(ret)) {
-   pr_err("validate_pt_pd_bos() failed\n");
-   goto unwind;
-   }
-
return 0;
 
 unwind:
@@ -1478,12 +1492,12 @@ int amdgpu_amdkfd_gpuvm_free_memory_of_gpu(
pr_debug("Release VA 0x%llx - 0x%llx\n", mem->va,
mem->va + bo_size * (1 + mem->aql_queue));
 
+	ret = unreserve_bo_and_vms(&ctx, false, false);
+
/* Remove from VM internal data structures */
	list_for_each_entry_safe(entry, tmp, &mem->attachments, list)
kfd_mem_detach(entry);
 
-	ret = unreserve_bo_and_vms(&ctx, false, false);
-
/* Free the sync object */
	amdgpu_sync_free(&mem->sync);
 
@@ -1560,6 +1574,12 @@ int amdgpu_amdkfd_gpuvm_map_memory_to_gpu(
mem->va + bo_size * (1 + mem->aql_queue),
avm, domain_string(domain));
 
+   if (!kfd_mem_is_attached(avm, mem)) {
+   ret = kfd_mem_attach(adev, mem, avm, mem->aql_queue);
+   if (ret)
+   goto out;
+   }
+
	ret = reserve_bo_and_vm(mem, avm, &ctx);
if (unlikely(ret))
goto out;
@@ -1573,15 +1593,9 @@ int amdgpu_amdkfd_gpuvm_map_memory_to_gpu(
bo->tbo.mem.mem_type == TTM_PL_SYSTEM)
is_invalid_userptr = true;
 
-   if (!kfd_mem_is_attached(avm, mem)) {
-   

RE: [PATCH v2 06/10] drm/amdgpu: DMA map/unmap when updating GPU mappings

2021-05-10 Thread Errabolu, Ramesh
[AMD Official Use Only - Internal Distribution Only]

Acked-by: Ramesh Errabolu 

-Original Message-
From: amd-gfx  On Behalf Of Kuehling, 
Felix
Sent: Monday, April 26, 2021 10:48 PM
To: Zeng, Oak ; amd-...@lists.freedesktop.org; 
dri-devel@lists.freedesktop.org
Subject: Re: [PATCH v2 06/10] drm/amdgpu: DMA map/unmap when updating GPU 
mappings

Am 2021-04-26 um 8:23 p.m. schrieb Zeng, Oak:
> Regards,
> Oak
>
>  
>
> On 2021-04-21, 9:31 PM, "dri-devel on behalf of Felix Kuehling" 
>  
> wrote:
>
> DMA map kfd_mem_attachments in update_gpuvm_pte. This function is called
> with the BO and page tables reserved, so we can safely update the DMA
> mapping.
>
> DMA unmap when a BO is unmapped from a GPU and before updating mappings
> in restore workers.
>
> Signed-off-by: Felix Kuehling 
> ---
>  .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  | 56 ++-
>  1 file changed, 29 insertions(+), 27 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> index 49d1af4aa5f1..7d25d886b98c 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> @@ -961,11 +961,12 @@ static int unreserve_bo_and_vms(struct 
> bo_vm_reservation_context *ctx,
>   return ret;
>  }
>
> -static int unmap_bo_from_gpuvm(struct amdgpu_device *adev,
> +static void unmap_bo_from_gpuvm(struct kgd_mem *mem,
>   struct kfd_mem_attachment *entry,
>   struct amdgpu_sync *sync)
>  {
>   struct amdgpu_bo_va *bo_va = entry->bo_va;
> + struct amdgpu_device *adev = entry->adev;
>   struct amdgpu_vm *vm = bo_va->base.vm;
>
>   amdgpu_vm_bo_unmap(adev, bo_va, entry->va);
> @@ -974,15 +975,20 @@ static int unmap_bo_from_gpuvm(struct 
> amdgpu_device *adev,
>
>   amdgpu_sync_fence(sync, bo_va->last_pt_update);
>
> - return 0;
> + kfd_mem_dmaunmap_attachment(mem, entry);
>  }
>
> -static int update_gpuvm_pte(struct amdgpu_device *adev,
> - struct kfd_mem_attachment *entry,
> - struct amdgpu_sync *sync)
> +static int update_gpuvm_pte(struct kgd_mem *mem,
> + struct kfd_mem_attachment *entry,
> + struct amdgpu_sync *sync)
>  {
> - int ret;
>   struct amdgpu_bo_va *bo_va = entry->bo_va;
> + struct amdgpu_device *adev = entry->adev;
> + int ret;
> +
> + ret = kfd_mem_dmamap_attachment(mem, entry);
> Should the dma mapping be done in the kfd_mem_attach function when a memory 
> object is attached to a vm the first time? Since each memory object can be 
> mapped to many GPU or many VMs, by doing dma mapping the first it is attached 
> can simplify the logics. Or even simpler, maybe we can just just dma map when 
> a memory object is created - it wastes some iommu page table entry but really 
> simplify the logic in this patch series. I found this series is not very easy 
> to understand.

The DMA mapping must be updated every time the physical memory allocation 
changes, e.g. after a BO was evicted and restored. Basically, if the physical 
pages of the BO change, we need to update the DMA mapping to point to those new 
pages. Therefore I added this in the update_gpu_vm_pte function, which is 
called after a BO has been validated the first time, or revalidated after an 
eviction.

You'll also see that I call dmaunmap in the re-validation cases (in the restore 
workers below) to ensure that we don't leak DMA mappings.

Regards,
  Felix


> + if (ret)
> + return ret;
>
>   /* Update the page tables  */
>   ret = amdgpu_vm_bo_update(adev, bo_va, false);
> @@ -994,14 +1000,15 @@ static int update_gpuvm_pte(struct amdgpu_device 
> *adev,
>   return amdgpu_sync_fence(sync, bo_va->last_pt_update);
>  }
>
> -static int map_bo_to_gpuvm(struct amdgpu_device *adev,
> - struct kfd_mem_attachment *entry, struct amdgpu_sync *sync,
> - bool no_update_pte)
> +static int map_bo_to_gpuvm(struct kgd_mem *mem,
> +struct kfd_mem_attachment *entry,
> +struct amdgpu_sync *sync,
> +bool no_update_pte)
>  {
>   int ret;
>
>   /* Set virtual address for the allocation */
> - ret = amdgpu_vm_bo_map(adev, entry->bo_va, entry->va, 0,
> + ret = amdgpu_vm_bo_map(entry->adev, entry->bo_va, entry->va, 0,
>  amdgpu_bo_size(entry->bo_va->base.bo),
>  entry->pte_flags);
>   if (ret) {
> @@ -1013,7 +1020,7 @@ static int map_bo_to_gpuvm(struct amdgpu_device 
> *adev,
>   if (no_update_pte)
>   return 0;
>
> - ret = update_gpuvm_pte(adev, entry, sync);
> + ret = update_gpuvm_pte(mem, entry, sync);
>   if (ret) {
>   

RE: [PATCH v2 05/10] drm/amdgpu: Add multi-GPU DMA mapping helpers

2021-05-10 Thread Errabolu, Ramesh
[AMD Official Use Only - Internal Distribution Only]

Acked-by: Ramesh Errabolu 

-Original Message-
From: amd-gfx  On Behalf Of Kuehling, 
Felix
Sent: Monday, April 26, 2021 10:41 PM
To: Zeng, Oak ; amd-...@lists.freedesktop.org; 
dri-devel@lists.freedesktop.org
Subject: Re: [PATCH v2 05/10] drm/amdgpu: Add multi-GPU DMA mapping helpers

Am 2021-04-26 um 8:09 p.m. schrieb Zeng, Oak:
> As I understand it, when one GPU map another GPU's vram, this vram should 
> also be mapped in the iommu page table. Normal GTT memory (versus userptr) 
> also needs to be mapped in the iommu. But I don't see this code below.

Right, I'm not solving all problems at once. The next patch is there to handle 
GTT BOs.

Peer mappings of doorbells, MMIO and VRAM still need to be handled in the 
future. I'm trying to fix the worst issues first. This series should get 99% of 
real world tests working.


>  I only see you map userptr in iommu. Maybe you map them in iommu not during 
> memory attachment time?
>
> Also see a nit-pick inline
>
> Regards,
> Oak
>
>  
>
> On 2021-04-21, 9:31 PM, "dri-devel on behalf of Felix Kuehling" 
>  
> wrote:
>
> Add BO-type specific helpers functions to DMA-map and unmap
> kfd_mem_attachments. Implement this functionality for userptrs by creating
> one SG BO per GPU and filling it with a DMA mapping of the pages from the
> original mem->bo.
>
> Signed-off-by: Felix Kuehling 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h|   8 +-
>  .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  | 146 +-
>  2 files changed, 145 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
> index c24b2478f445..63668433f5a6 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
> @@ -38,11 +38,17 @@ extern uint64_t amdgpu_amdkfd_total_mem_size;
>
>  struct amdgpu_device;
>
> +enum kfd_mem_attachment_type {
> + KFD_MEM_ATT_SHARED, /* Share kgd_mem->bo or another attachment's */
> + KFD_MEM_ATT_USERPTR,/* SG bo to DMA map pages from a userptr bo */
> +};
> +
>  struct kfd_mem_attachment {
>   struct list_head list;
> + enum kfd_mem_attachment_type type;
> + bool is_mapped;
>   struct amdgpu_bo_va *bo_va;
>   struct amdgpu_device *adev;
> - bool is_mapped;
>   uint64_t va;
>   uint64_t pte_flags;
>  };
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> index fbd7e786b54e..49d1af4aa5f1 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> @@ -473,12 +473,117 @@ static uint64_t get_pte_flags(struct amdgpu_device 
> *adev, struct kgd_mem *mem)
>   return pte_flags;
>  }
>
> +static int
> +kfd_mem_dmamap_userptr(struct kgd_mem *mem,
> +struct kfd_mem_attachment *attachment)
> +{
> + enum dma_data_direction direction =
> + mem->alloc_flags & KFD_IOC_ALLOC_MEM_FLAGS_WRITABLE ?
> + DMA_BIDIRECTIONAL : DMA_TO_DEVICE;
> + struct ttm_operation_ctx ctx = {.interruptible = true};
> + struct amdgpu_bo *bo = attachment->bo_va->base.bo;
> + struct amdgpu_device *adev = attachment->adev;
> + struct ttm_tt *src_ttm = mem->bo->tbo.ttm;
> + struct ttm_tt *ttm = bo->tbo.ttm;
> + int ret;
> +
> + ttm->sg = kmalloc(sizeof(*ttm->sg), GFP_KERNEL);
> + if (unlikely(!ttm->sg))
> + return -ENOMEM;
> +
> + if (WARN_ON(ttm->num_pages != src_ttm->num_pages))
> + return -EINVAL;
> +
> + /* Same sequence as in amdgpu_ttm_tt_pin_userptr */
> + ret = sg_alloc_table_from_pages(ttm->sg, src_ttm->pages,
> + ttm->num_pages, 0,
> + (u64)ttm->num_pages << PAGE_SHIFT,
> + GFP_KERNEL);
> + if (unlikely(ret))
> + goto release_sg;
> Should go to a label starting from kfree below?

Thanks, I'll fix that.

Regards,
  Felix


> +
> + ret = dma_map_sgtable(adev->dev, ttm->sg, direction, 0);
> + if (unlikely(ret))
> + goto release_sg;
> +
> + drm_prime_sg_to_dma_addr_array(ttm->sg, ttm->dma_address,
> +ttm->num_pages);
> +
> + amdgpu_bo_placement_from_domain(bo, AMDGPU_GEM_DOMAIN_GTT);
> + ret = ttm_bo_validate(&bo->tbo, &bo->placement, &ctx);
> + if (ret)
> + goto release_sg;
> +
> + return 0;
> +
> +release_sg:
> + pr_err("DMA map userptr failed: %d\n", ret);
> + sg_free_table(ttm->sg);
> + kfree(ttm->sg);
> + ttm->sg = NULL;
> + return ret;
> +}
> +
> +static int
> +kfd_mem_dmamap_attachment(struct kgd_mem *mem,
> +   

RE: [PATCH v2 04/10] drm/amdgpu: Simplify AQL queue mapping

2021-05-10 Thread Errabolu, Ramesh
[AMD Official Use Only - Internal Distribution Only]

Acked-by: Ramesh Errabolu 

-Original Message-
From: amd-gfx  On Behalf Of Kuehling, 
Felix
Sent: Friday, April 23, 2021 2:23 AM
To: Zeng, Oak ; amd-...@lists.freedesktop.org; 
dri-devel@lists.freedesktop.org
Subject: Re: [PATCH v2 04/10] drm/amdgpu: Simplify AQL queue mapping

Am 2021-04-22 um 9:33 p.m. schrieb Zeng, Oak:
> Regards,
> Oak
>
>  
>
> On 2021-04-21, 9:31 PM, "amd-gfx on behalf of Felix Kuehling" 
>  
> wrote:
>
> Do AQL queue double-mapping with a single attach call. That will make it
> easier to create per-GPU BOs later, to be shared between the two BO VA
> mappings on the same GPU.
>
> Freeing the attachments is not necessary if map_to_gpu fails. These will 
> be
> cleaned up when the kdg_mem object is destroyed in
> amdgpu_amdkfd_gpuvm_free_memory_of_gpu.
>
> Signed-off-by: Felix Kuehling 
> ---
>  .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  | 103 --
>  1 file changed, 48 insertions(+), 55 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> index 34c9a2d0028e..fbd7e786b54e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> @@ -486,70 +486,76 @@ static uint64_t get_pte_flags(struct amdgpu_device 
> *adev, struct kgd_mem *mem)
>   * 4a.  Validate new page tables and directories
>   */
>  static int kfd_mem_attach(struct amdgpu_device *adev, struct kgd_mem 
> *mem,
> - struct amdgpu_vm *vm, bool is_aql,
> - struct kfd_mem_attachment **p_attachment)
> + struct amdgpu_vm *vm, bool is_aql)
>  {
>   unsigned long bo_size = mem->bo->tbo.base.size;
>   uint64_t va = mem->va;
> - struct kfd_mem_attachment *attachment;
> - struct amdgpu_bo *bo;
> - int ret;
> + struct kfd_mem_attachment *attachment[2] = {NULL, NULL};
> + struct amdgpu_bo *bo[2] = {NULL, NULL};
> + int i, ret;
>
>   if (!va) {
>   pr_err("Invalid VA when adding BO to VM\n");
>   return -EINVAL;
>   }
>
> - if (is_aql)
> - va += bo_size;
> -
> - attachment = kzalloc(sizeof(*attachment), GFP_KERNEL);
> - if (!attachment)
> - return -ENOMEM;
> + for (i = 0; i <= is_aql; i++) {
> + attachment[i] = kzalloc(sizeof(*attachment[i]), GFP_KERNEL);
> + if (unlikely(!attachment[i])) {
> + ret = -ENOMEM;
> + goto unwind;
> + }
>
> - pr_debug("\t add VA 0x%llx - 0x%llx to vm %p\n", va,
> - va + bo_size, vm);
> + pr_debug("\t add VA 0x%llx - 0x%llx to vm %p\n", va,
> +  va + bo_size, vm);
>
> - /* FIXME: For now all attachments use the same BO. This is incorrect
> -  * because one BO can only have one DMA mapping for one GPU. We need
> -  * one BO per GPU, e.g. a DMABuf import with dynamic attachment. This
> -  * will be addressed one BO-type at a time in subsequent patches.
> -  */
> - bo = mem->bo;
> - drm_gem_object_get(&bo->tbo.base);
> + /* FIXME: For now all attachments use the same BO. This is
> +  * incorrect because one BO can only have one DMA mapping
> +  * for one GPU. We need one BO per GPU, e.g. a DMABuf
> +  * import with dynamic attachment. This will be addressed
> +  * one BO-type at a time in subsequent patches.
> +  */
> + bo[i] = mem->bo;
> + drm_gem_object_get(&bo[i]->tbo.base);
>
> - /* Add BO to VM internal data structures*/
> - attachment->bo_va = amdgpu_vm_bo_add(adev, vm, bo);
> - if (!attachment->bo_va) {
> - ret = -EINVAL;
> - pr_err("Failed to add BO object to VM. ret == %d\n",
> - ret);
> - goto err_vmadd;
> - }
> + /* Add BO to VM internal data structures */
> + attachment[i]->bo_va = amdgpu_vm_bo_add(adev, vm, bo[i]);
> Just for discussion. Are we allowed to add one bo twice to a vm? When I 
> looked at amdgpu_vm_bo_base_init (called by amdgpu_vm_bo_add), line:
> bo->vm_bo = base;
> when you add the same bo to vm the second time, bo->vm_bo will be 
> overwritten. I am not sure whether this will cause an issue later.
> This is not introduced by your code. The original code (calling 
> kfd_mem_attach twice for aql) has the same problem.

If you just add one more line of context, you'll see that bo->vm_bo is the 
start of a single linked list of struct amdgpu_vm_bo_base. So adding a BO to a 
VM multiple times just extends that single-linked list:

    base->next = bo->vm_bo;
    bo->vm_bo = base;

Regards,
  Felix


> + if (unlikely(!attachment[i]->bo_va)) {
> + ret = -ENOMEM;
> +   

RE: [PATCH v2 03/10] drm/amdgpu: Keep a bo-reference per-attachment

2021-05-10 Thread Errabolu, Ramesh
[AMD Official Use Only - Internal Distribution Only]

Acked-by: Ramesh Errabolu 

-Original Message-
From: amd-gfx  On Behalf Of Kuehling, 
Felix
Sent: Wednesday, April 21, 2021 8:31 PM
To: amd-...@lists.freedesktop.org; dri-devel@lists.freedesktop.org
Subject: [PATCH v2 03/10] drm/amdgpu: Keep a bo-reference per-attachment

For now they all reference the same BO. For correct DMA mappings they will 
refer to different BOs per-GPU.

Signed-off-by: Felix Kuehling 
---
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  | 22 ++-
 1 file changed, 17 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index fee4c64dd051..34c9a2d0028e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -489,11 +489,11 @@ static int kfd_mem_attach(struct amdgpu_device *adev, 
struct kgd_mem *mem,
struct amdgpu_vm *vm, bool is_aql,
struct kfd_mem_attachment **p_attachment)  {
-   int ret;
-   struct kfd_mem_attachment *attachment;
-   struct amdgpu_bo *bo = mem->bo;
+   unsigned long bo_size = mem->bo->tbo.base.size;
uint64_t va = mem->va;
-   unsigned long bo_size = bo->tbo.base.size;
+   struct kfd_mem_attachment *attachment;
+   struct amdgpu_bo *bo;
+   int ret;
 
if (!va) {
pr_err("Invalid VA when adding BO to VM\n"); @@ -510,6 +510,14 
@@ static int kfd_mem_attach(struct amdgpu_device *adev, struct kgd_mem *mem,
pr_debug("\t add VA 0x%llx - 0x%llx to vm %p\n", va,
va + bo_size, vm);
 
+   /* FIXME: For now all attachments use the same BO. This is incorrect
+* because one BO can only have one DMA mapping for one GPU. We need
+* one BO per GPU, e.g. a DMABuf import with dynamic attachment. This
+* will be addressed one BO-type at a time in subsequent patches.
+*/
+   bo = mem->bo;
+	drm_gem_object_get(&bo->tbo.base);
+
/* Add BO to VM internal data structures*/
attachment->bo_va = amdgpu_vm_bo_add(adev, vm, bo);
if (!attachment->bo_va) {
@@ -529,7 +537,7 @@ static int kfd_mem_attach(struct amdgpu_device *adev, 
struct kgd_mem *mem,
 
/* Allocate validate page tables if needed */
ret = vm_validate_pt_pd_bos(vm);
-   if (ret) {
+   if (unlikely(ret)) {
pr_err("validate_pt_pd_bos() failed\n");
goto err_alloc_pts;
}
@@ -540,15 +548,19 @@ static int kfd_mem_attach(struct amdgpu_device *adev, 
struct kgd_mem *mem,
amdgpu_vm_bo_rmv(adev, attachment->bo_va);
	list_del(&attachment->list);
 err_vmadd:
+	drm_gem_object_put(&bo->tbo.base);
kfree(attachment);
return ret;
 }
 
 static void kfd_mem_detach(struct kfd_mem_attachment *attachment)  {
+   struct amdgpu_bo *bo = attachment->bo_va->base.bo;
+
pr_debug("\t remove VA 0x%llx in entry %p\n",
attachment->va, attachment);
amdgpu_vm_bo_rmv(attachment->adev, attachment->bo_va);
+	drm_gem_object_put(&bo->tbo.base);
	list_del(&attachment->list);
kfree(attachment);
 }
--
2.31.1

___
amd-gfx mailing list
amd-...@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [PATCH v2 02/10] drm/amdgpu: Rename kfd_bo_va_list to kfd_mem_attachment

2021-05-10 Thread Errabolu, Ramesh
[AMD Official Use Only - Internal Distribution Only]

Acked-by: Ramesh Errabolu 

-Original Message-
From: amd-gfx  On Behalf Of Kuehling, 
Felix
Sent: Wednesday, April 21, 2021 8:31 PM
To: amd-...@lists.freedesktop.org; dri-devel@lists.freedesktop.org
Subject: [PATCH v2 02/10] drm/amdgpu: Rename kfd_bo_va_list to 
kfd_mem_attachment

This name is more fitting, especially for the changes coming next to support 
multi-GPU systems with proper DMA mappings. Cleaned up the code and renamed 
some related functions and variables to improve readability.

Signed-off-by: Felix Kuehling 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h|   8 +-
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  | 209 +-
 2 files changed, 104 insertions(+), 113 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
index 313ee49b9f17..c24b2478f445 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
@@ -38,10 +38,10 @@ extern uint64_t amdgpu_amdkfd_total_mem_size;
 
 struct amdgpu_device;
 
-struct kfd_bo_va_list {
-   struct list_head bo_list;
+struct kfd_mem_attachment {
+   struct list_head list;
struct amdgpu_bo_va *bo_va;
-   void *kgd_dev;
+   struct amdgpu_device *adev;
bool is_mapped;
uint64_t va;
uint64_t pte_flags;
@@ -50,7 +50,7 @@ struct kfd_bo_va_list {  struct kgd_mem {
struct mutex lock;
struct amdgpu_bo *bo;
-   struct list_head bo_va_list;
+   struct list_head attachments;
/* protected by amdkfd_process_info.lock */
struct ttm_validate_buffer validate_list;
struct ttm_validate_buffer resv_list;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index dfa025d694f8..fee4c64dd051 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -72,16 +72,16 @@ static inline struct amdgpu_device 
*get_amdgpu_device(struct kgd_dev *kgd)
return (struct amdgpu_device *)kgd;
 }
 
-static bool check_if_add_bo_to_vm(struct amdgpu_vm *avm,
+static bool kfd_mem_is_attached(struct amdgpu_vm *avm,
struct kgd_mem *mem)
 {
-   struct kfd_bo_va_list *entry;
+   struct kfd_mem_attachment *entry;
 
-   list_for_each_entry(entry, &mem->bo_va_list, bo_list)
+   list_for_each_entry(entry, &mem->attachments, list)
if (entry->bo_va->base.vm == avm)
-   return false;
+   return true;
 
-   return true;
+   return false;
 }
 
 /* Set memory usage limits. Current, limits are @@ -473,7 +473,7 @@ static 
uint64_t get_pte_flags(struct amdgpu_device *adev, struct kgd_mem *mem)
return pte_flags;
 }
 
-/* add_bo_to_vm - Add a BO to a VM
+/* kfd_mem_attach - Add a BO to a VM
  *
  * Everything that needs to bo done only once when a BO is first added
  * to a VM. It can later be mapped and unmapped many times without @@ -485,15 
+485,14 @@ static uint64_t get_pte_flags(struct amdgpu_device *adev, struct 
kgd_mem *mem)
  * 4. Alloc page tables and directories if needed
  * 4a.  Validate new page tables and directories
  */
-static int add_bo_to_vm(struct amdgpu_device *adev, struct kgd_mem *mem,
+static int kfd_mem_attach(struct amdgpu_device *adev, struct kgd_mem 
+*mem,
struct amdgpu_vm *vm, bool is_aql,
-   struct kfd_bo_va_list **p_bo_va_entry)
+   struct kfd_mem_attachment **p_attachment)
 {
int ret;
-   struct kfd_bo_va_list *bo_va_entry;
+   struct kfd_mem_attachment *attachment;
struct amdgpu_bo *bo = mem->bo;
uint64_t va = mem->va;
-   struct list_head *list_bo_va = &mem->bo_va_list;
unsigned long bo_size = bo->tbo.base.size;
 
if (!va) {
@@ -504,29 +503,29 @@ static int add_bo_to_vm(struct amdgpu_device *adev, 
struct kgd_mem *mem,
if (is_aql)
va += bo_size;
 
-   bo_va_entry = kzalloc(sizeof(*bo_va_entry), GFP_KERNEL);
-   if (!bo_va_entry)
+   attachment = kzalloc(sizeof(*attachment), GFP_KERNEL);
+   if (!attachment)
return -ENOMEM;
 
pr_debug("\t add VA 0x%llx - 0x%llx to vm %p\n", va,
va + bo_size, vm);
 
/* Add BO to VM internal data structures*/
-   bo_va_entry->bo_va = amdgpu_vm_bo_add(adev, vm, bo);
-   if (!bo_va_entry->bo_va) {
+   attachment->bo_va = amdgpu_vm_bo_add(adev, vm, bo);
+   if (!attachment->bo_va) {
ret = -EINVAL;
pr_err("Failed to add BO object to VM. ret == %d\n",
ret);
goto err_vmadd;
}
 
-   bo_va_entry->va = va;
-   bo_va_entry->pte_flags = get_pte_flags(adev, mem);
-   bo_va_entry->kgd_dev = (void *)adev;
-   list_add(&bo_va_entry->bo_list, list_bo_va);
+   attachment->va 

Re: [PATCH] drm/radeon/ni_dpm: Fix booting bug

2021-05-10 Thread Gustavo A. R. Silva
Hi Alex,

On 5/10/21 16:17, Alex Deucher wrote:
> On Sun, May 9, 2021 at 6:48 PM Gustavo A. R. Silva
>  wrote:
[..]

>>
>> Bug: 
>> https://lore.kernel.org/dri-devel/3eedbe78-1fbd-4763-a7f3-ac5665e76...@xenosoft.de/
>> Fixes: 434fb1e7444a ("drm/radeon/nislands_smc.h: Replace one-element array 
>> with flexible-array member in struct NISLANDS_SMC_SWSTATE")
>> Cc: sta...@vger.kernel.org
>> Reported-by: Christian Zigotzky 
>> Tested-by: Christian Zigotzky 
>> Link: 
>> https://lore.kernel.org/dri-devel/9bb5fcbd-daf5-1669-b3e7-b8624b3c3...@xenosoft.de/
>> Signed-off-by: Gustavo A. R. Silva 
> 
> This seems like a lot of churn just to use flexible arrays.  That
> said, if static checkers are going to keep complaining about single
> element arrays, I don't mind applying these patches since this code is
> not likely to change.  Applied.  Thanks.

This is not only about the one-element arrays. These fixes (together with 
commits
434fb1e7444a and 96e27e8d919e) allow us to fix more than a dozen of these 
out-of-bounds
warnings:

drivers/gpu/drm/radeon/ni_dpm.c:2521:20: warning: array subscript 1 is above 
array bounds of ‘NISLANDS_SMC_HW_PERFORMANCE_LEVEL[1]’ {aka ‘struct
NISLANDS_SMC_HW_PERFORMANCE_LEVEL[1]’} [-Warray-bounds]
 2521 |   smc_state->levels[i].dpm2.MaxPS =
  |   ~^~~

which should be fixed in order to globally enable -Warray-bounds. :)

Thanks!
--
Gustavo
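
To make the layout issue concrete, here is a minimal, compilable sketch
(member names trimmed, not the actual nislands_smc.h contents) of the
flexible-array problem and the shape of the fix discussed in this thread:

#include <stdint.h>

typedef struct {
        uint32_t mclk_value;
} NISLANDS_SMC_HW_PERFORMANCE_LEVEL;

/* Before: a flexible array member reserves no storage of its own, so
 * embedding this struct and then writing initialState.levels[0] touches
 * memory past the end of the containing object. */
typedef struct {
        uint8_t levelCount;
        NISLANDS_SMC_HW_PERFORMANCE_LEVEL levels[];
} NISLANDS_SMC_SWSTATE;

/* After: states that only ever hold one level get their own type with a
 * single embedded level, so the access is always in bounds. */
typedef struct {
        uint8_t levelCount;
        NISLANDS_SMC_HW_PERFORMANCE_LEVEL level;
} NISLANDS_SMC_SWSTATE_SINGLE;

/* The table embeds the single-level states directly, as the real
 * statetable does; only driverState keeps the variable-length form. */
typedef struct {
        NISLANDS_SMC_SWSTATE_SINGLE initialState;
        NISLANDS_SMC_SWSTATE_SINGLE ACPIState;
        NISLANDS_SMC_SWSTATE_SINGLE ULVState;
        NISLANDS_SMC_SWSTATE driverState;
} NISLANDS_SMC_STATETABLE;

With the SINGLE variant, initialState.level.* always refers to storage
inside the table itself, so -Warray-bounds has nothing left to flag.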



Re: [PATCH] drm/amd/display: Expose active display color configurations to userspace

2021-05-10 Thread Alex Deucher
On Fri, May 7, 2021 at 3:27 PM Werner Sembach  wrote:
>
> xrandr --prop and other userspace info tools currently have no way of
> telling which color configuration is used on HDMI and DP ports.
>
> The ongoing transition from HDMI 1.4 to 2.0 and the different bandwidth
> requirements of YCbCr 4:2:0 and RGB color formats raise different
> incompatibilities. Having this configuration information readily
> available is a useful tool in debugging washed out colors, color artefacts
> on small fonts and missing refresh rate options.

I think we would ideally want these as generic connector properties
rather than AMD specific ones since they are not really AMD specific.
I believe there is already a generic drm property (max_bpc) for the
color depth.  At this point, I think having a generic RGB vs YCbCr
property would make sense.  I'm not sure about the color space.

Alex

>
> Signed-off-by: Werner Sembach 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_display.c   | 58 +++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h  |  4 ++
>  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 36 
>  3 files changed, 98 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
> index f753e04fee99..c0404bcda31b 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
> @@ -986,6 +986,40 @@ static const struct drm_prop_enum_list 
> amdgpu_dither_enum_list[] =
> { AMDGPU_FMT_DITHER_ENABLE, "on" },
>  };
>
> +static const struct drm_prop_enum_list 
> amdgpu_active_pixel_encoding_enum_list[] = {
> +   { PIXEL_ENCODING_UNDEFINED, "undefined" },
> +   { PIXEL_ENCODING_RGB, "RGB" },
> +   { PIXEL_ENCODING_YCBCR422, "YCbCr 4:2:2" },
> +   { PIXEL_ENCODING_YCBCR444, "YCbCr 4:4:4" },
> +   { PIXEL_ENCODING_YCBCR420, "YCbCr 4:2:0" },
> +};
> +
> +static const struct drm_prop_enum_list 
> amdgpu_active_display_color_depth_enum_list[] = {
> +   { COLOR_DEPTH_UNDEFINED, "undefined" },
> +   { COLOR_DEPTH_666, "6 bit" },
> +   { COLOR_DEPTH_888, "8 bit" },
> +   { COLOR_DEPTH_101010, "10 bit" },
> +   { COLOR_DEPTH_121212, "12 bit" },
> +   { COLOR_DEPTH_141414, "14 bit" },
> +   { COLOR_DEPTH_161616, "16 bit" },
> +   { COLOR_DEPTH_999, "9 bit" },
> +   { COLOR_DEPTH_11, "11 bit" },
> +};
> +
> +static const struct drm_prop_enum_list 
> amdgpu_active_output_color_space_enum_list[] = {
> +   { COLOR_SPACE_UNKNOWN, "unknown" },
> +   { COLOR_SPACE_SRGB, "sRGB" },
> +   { COLOR_SPACE_SRGB_LIMITED, "sRGB limited" },
> +   { COLOR_SPACE_YCBCR601, "YCbCr 601" },
> +   { COLOR_SPACE_YCBCR709, "YCbCr 709" },
> +   { COLOR_SPACE_YCBCR601_LIMITED, "YCbCr 601 limited" },
> +   { COLOR_SPACE_YCBCR709_LIMITED, "YCbCr 709 limited" },
> +   { COLOR_SPACE_2020_RGB_FULLRANGE, "RGB 2020" },
> +   { COLOR_SPACE_2020_RGB_LIMITEDRANGE, "RGB 2020 limited" },
> +   { COLOR_SPACE_2020_YCBCR, "YCbCr 2020" },
> +   { COLOR_SPACE_ADOBERGB, "Adobe RGB" },
> +};
> +
>  int amdgpu_display_modeset_create_props(struct amdgpu_device *adev)
>  {
> int sz;
> @@ -1038,6 +1072,30 @@ int amdgpu_display_modeset_create_props(struct 
> amdgpu_device *adev)
>   "abm level", 0, 4);
> if (!adev->mode_info.abm_level_property)
> return -ENOMEM;
> +
> +   sz = ARRAY_SIZE(amdgpu_active_pixel_encoding_enum_list);
> +   adev->mode_info.active_pixel_encoding_property =
> +   drm_property_create_enum(adev_to_drm(adev), 0,
> +   "active pixel encoding",
> +   amdgpu_active_pixel_encoding_enum_list, sz);
> +   if (!adev->mode_info.active_pixel_encoding_property)
> +   return -ENOMEM;
> +
> +   sz = ARRAY_SIZE(amdgpu_active_display_color_depth_enum_list);
> +   adev->mode_info.active_display_color_depth_property =
> +   drm_property_create_enum(adev_to_drm(adev), 0,
> +   "active display color depth",
> +   amdgpu_active_display_color_depth_enum_list, 
> sz);
> +   if (!adev->mode_info.active_display_color_depth_property)
> +   return -ENOMEM;
> +
> +   sz = ARRAY_SIZE(amdgpu_active_output_color_space_enum_list);
> +   adev->mode_info.active_output_color_space_property =
> +   drm_property_create_enum(adev_to_drm(adev), 0,
> +   "active output color space",
> +   amdgpu_active_output_color_space_enum_list, 
> sz);
> +   if (!adev->mode_info.active_output_color_space_property)
> +   return -ENOMEM;
> }
>
> return 0;
> diff --git 
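
For the generic-property route suggested above, a small hedged sketch (the
connector pointer and the 8..16 range are illustrative, not taken from this
patch) of attaching the existing standard "max bpc" property instead of a
driver-private one:

#include <drm/drm_connector.h>

/* Attach the standard, driver-agnostic "max bpc" range property that
 * userspace already knows how to query and set. */
static int example_attach_generic_color_props(struct drm_connector *connector)
{
        return drm_connector_attach_max_bpc_property(connector, 8, 16);
}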

Re: [PATCH v6 15/16] drm/amd/display: Remove superflous drm_mode_config_cleanup

2021-05-10 Thread Rodrigo Siqueira
lgtm,

Reviewed-by: Rodrigo Siqueira 

On 05/10, Andrey Grodzovsky wrote:
> It's already being released by DRM core through devm
> 
> Signed-off-by: Andrey Grodzovsky 
> ---
>  drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> index 6c2c6a51ce6c..9728a0158bcb 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> @@ -3757,7 +3757,6 @@ static int amdgpu_dm_initialize_drm_device(struct 
> amdgpu_device *adev)
>  
>  static void amdgpu_dm_destroy_drm_device(struct amdgpu_display_manager *dm)
>  {
> - drm_mode_config_cleanup(dm->ddev);
>   drm_atomic_private_obj_fini(&dm->atomic_obj);
>   return;
>  }
> -- 
> 2.25.1
> 

-- 
Rodrigo Siqueira
https://siqueira.tech




Re: [PATCH] drm/amd/display: remove unused function dc_link_perform_link_training

2021-05-10 Thread Rodrigo Siqueira
LGTM,

Jay, any comment?

Reviewed-by: Rodrigo Siqueira 

On 05/08, Rouven Czerwinski wrote:
> This function is not used anywhere, remove it. It was added in
> 40dd6bd376a4 ("drm/amd/display: Linux Set/Read link rate and lane count
> through debugfs") and moved in fe798de53a7a ("drm/amd/display: Move link
> functions from dc to dc_link"), but a user is missing.
> 
> Signed-off-by: Rouven Czerwinski 
> ---
>  drivers/gpu/drm/amd/display/dc/core/dc_link.c | 13 -
>  drivers/gpu/drm/amd/display/dc/dc_link.h  |  3 ---
>  2 files changed, 16 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link.c 
> b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
> index 3fb0cebd6938..55c5cf2264b3 100644
> --- a/drivers/gpu/drm/amd/display/dc/core/dc_link.c
> +++ b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
> @@ -3553,19 +3553,6 @@ void dc_link_set_drive_settings(struct dc *dc,
>   dc_link_dp_set_drive_settings(dc->links[i], lt_settings);
>  }
>  
> -void dc_link_perform_link_training(struct dc *dc,
> -struct dc_link_settings *link_setting,
> -bool skip_video_pattern)
> -{
> - int i;
> -
> - for (i = 0; i < dc->link_count; i++)
> - dc_link_dp_perform_link_training(
> - dc->links[i],
> - link_setting,
> - skip_video_pattern);
> -}
> -
>  void dc_link_set_preferred_link_settings(struct dc *dc,
>struct dc_link_settings *link_setting,
>struct dc_link *link)
> diff --git a/drivers/gpu/drm/amd/display/dc/dc_link.h 
> b/drivers/gpu/drm/amd/display/dc/dc_link.h
> index fc5622ffec3d..45c927cd27ab 100644
> --- a/drivers/gpu/drm/amd/display/dc/dc_link.h
> +++ b/drivers/gpu/drm/amd/display/dc/dc_link.h
> @@ -363,9 +363,6 @@ bool dc_link_is_hdcp22(struct dc_link *link, enum 
> signal_type signal);
>  void dc_link_set_drive_settings(struct dc *dc,
>   struct link_training_settings *lt_settings,
>   const struct dc_link *link);
> -void dc_link_perform_link_training(struct dc *dc,
> -struct dc_link_settings *link_setting,
> -bool skip_video_pattern);
>  void dc_link_set_preferred_link_settings(struct dc *dc,
>struct dc_link_settings *link_setting,
>struct dc_link *link);
> -- 
> 2.31.1
> 

-- 
Rodrigo Siqueira
https://siqueira.tech




Re: [PATCH 1/1] drm/amdgpu: Delete two unneeded bool conversions

2021-05-10 Thread Alex Deucher
Applied.  Thanks!

Alex

On Mon, May 10, 2021 at 8:24 AM Zhen Lei  wrote:
>
> The result of an expression consisting of a single relational operator is
> already of the bool type and does not need to be evaluated explicitly.
>
> No functional change.
>
> Signed-off-by: Zhen Lei 
> ---
>  drivers/gpu/drm/amd/amdgpu/mmhub_v2_3.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/mmhub_v2_3.c 
> b/drivers/gpu/drm/amd/amdgpu/mmhub_v2_3.c
> index a9899335d0b1fb0..709ac576ac7e892 100644
> --- a/drivers/gpu/drm/amd/amdgpu/mmhub_v2_3.c
> +++ b/drivers/gpu/drm/amd/amdgpu/mmhub_v2_3.c
> @@ -569,9 +569,9 @@ static int mmhub_v2_3_set_clockgating(struct 
> amdgpu_device *adev,
> return 0;
>
> mmhub_v2_3_update_medium_grain_clock_gating(adev,
> -   state == AMD_CG_STATE_GATE ? true : false);
> +   state == AMD_CG_STATE_GATE);
> mmhub_v2_3_update_medium_grain_light_sleep(adev,
> -   state == AMD_CG_STATE_GATE ? true : false);
> +   state == AMD_CG_STATE_GATE);
>
> return 0;
>  }
> --
> 2.26.0.106.g9fadedd
>
>


Re: [PATCH 1/1] drm/amd/display: Delete several unneeded bool conversions

2021-05-10 Thread Alex Deucher
Applied.  Thanks!

Alex

On Mon, May 10, 2021 at 8:16 AM Zhen Lei  wrote:
>
> The result of an expression consisting of a single relational operator is
> already of the bool type and does not need to be evaluated explicitly.
>
> No functional change.
>
> Signed-off-by: Zhen Lei 
> ---
>  drivers/gpu/drm/amd/display/dc/dcn20/dcn20_dpp_cm.c | 4 ++--
>  drivers/gpu/drm/amd/display/dc/dcn30/dcn30_mpc.c| 2 +-
>  2 files changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_dpp_cm.c 
> b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_dpp_cm.c
> index 8dc3d1f7398422e..2feb051a200294a 100644
> --- a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_dpp_cm.c
> +++ b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_dpp_cm.c
> @@ -482,7 +482,7 @@ bool dpp20_program_blnd_lut(
> next_mode = LUT_RAM_A;
>
> dpp20_power_on_blnd_lut(dpp_base, true);
> -   dpp20_configure_blnd_lut(dpp_base, next_mode == LUT_RAM_A ? 
> true:false);
> +   dpp20_configure_blnd_lut(dpp_base, next_mode == LUT_RAM_A);
>
> if (next_mode == LUT_RAM_A)
> dpp20_program_blnd_luta_settings(dpp_base, params);
> @@ -893,7 +893,7 @@ bool dpp20_program_shaper(
> else
> next_mode = LUT_RAM_A;
>
> -   dpp20_configure_shaper_lut(dpp_base, next_mode == LUT_RAM_A ? 
> true:false);
> +   dpp20_configure_shaper_lut(dpp_base, next_mode == LUT_RAM_A);
>
> if (next_mode == LUT_RAM_A)
> dpp20_program_shaper_luta_settings(dpp_base, params);
> diff --git a/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_mpc.c 
> b/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_mpc.c
> index 910c17fd4278932..950c9bfd53de516 100644
> --- a/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_mpc.c
> +++ b/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_mpc.c
> @@ -874,7 +874,7 @@ bool mpc3_program_shaper(
> else
> next_mode = LUT_RAM_A;
>
> -   mpc3_configure_shaper_lut(mpc, next_mode == LUT_RAM_A ? true:false, 
> rmu_idx);
> +   mpc3_configure_shaper_lut(mpc, next_mode == LUT_RAM_A, rmu_idx);
>
> if (next_mode == LUT_RAM_A)
> mpc3_program_shaper_luta_settings(mpc, params, rmu_idx);
> --
> 2.26.0.106.g9fadedd
>
>


Re: [PATCH] drm/amd/pm: Fix out-of-bounds bug

2021-05-10 Thread Alex Deucher
On Mon, May 10, 2021 at 4:45 PM Gustavo A. R. Silva
 wrote:
>
> Create new structure SISLANDS_SMC_SWSTATE_SINGLE, as initialState.levels
> and ACPIState.levels are never actually used as flexible arrays. Those
> arrays can be used as simple objects of type
> SISLANDS_SMC_HW_PERFORMANCE_LEVEL, instead.
>
> Currently, the code fails because flexible array _levels_ in
> struct SISLANDS_SMC_SWSTATE doesn't allow for code that accesses
> the first element of initialState.levels and ACPIState.levels
> arrays:
>
> drivers/gpu/drm/amd/pm/powerplay/si_dpm.c:
> 4820: table->initialState.levels[0].mclk.vDLL_CNTL =
> 4821: cpu_to_be32(si_pi->clock_registers.dll_cntl);
> ...
> 5021: table->ACPIState.levels[0].mclk.vDLL_CNTL =
> 5022: cpu_to_be32(dll_cntl);
>
> because such element cannot be accessed without previously allocating
> enough dynamic memory for it to exist (which never actually happens).
> So, there is an out-of-bounds bug in this case.
>
> That's why struct SISLANDS_SMC_SWSTATE should only be used as type
> for object driverState and new struct SISLANDS_SMC_SWSTATE_SINGLE is
> created as type for objects initialState, ACPIState and ULVState.
>
> Also, with the change from one-element array to flexible-array member
> in commit 0e1aa13ca3ff ("drm/amd/pm: Replace one-element array with
> flexible-array in struct SISLANDS_SMC_SWSTATE"), the size of
> dpmLevels in struct SISLANDS_SMC_STATETABLE should be fixed to be
> SISLANDS_MAX_SMC_PERFORMANCE_LEVELS_PER_SWSTATE instead of
> SISLANDS_MAX_SMC_PERFORMANCE_LEVELS_PER_SWSTATE - 1.
>
> Fixes: 0e1aa13ca3ff ("drm/amd/pm: Replace one-element array with 
> flexible-array in struct SISLANDS_SMC_SWSTATE")
> Cc: sta...@vger.kernel.org
> Signed-off-by: Gustavo A. R. Silva 

Applied.  Thanks!

Alex

> ---
>  drivers/gpu/drm/amd/pm/powerplay/si_dpm.c | 174 +-
>  .../gpu/drm/amd/pm/powerplay/sislands_smc.h   |  34 ++--
>  2 files changed, 109 insertions(+), 99 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/pm/powerplay/si_dpm.c 
> b/drivers/gpu/drm/amd/pm/powerplay/si_dpm.c
> index 26a5321e621b..15c0b8af376f 100644
> --- a/drivers/gpu/drm/amd/pm/powerplay/si_dpm.c
> +++ b/drivers/gpu/drm/amd/pm/powerplay/si_dpm.c
> @@ -4817,70 +4817,70 @@ static int si_populate_smc_initial_state(struct 
> amdgpu_device *adev,
> u32 reg;
> int ret;
>
> -   table->initialState.levels[0].mclk.vDLL_CNTL =
> +   table->initialState.level.mclk.vDLL_CNTL =
> cpu_to_be32(si_pi->clock_registers.dll_cntl);
> -   table->initialState.levels[0].mclk.vMCLK_PWRMGT_CNTL =
> +   table->initialState.level.mclk.vMCLK_PWRMGT_CNTL =
> cpu_to_be32(si_pi->clock_registers.mclk_pwrmgt_cntl);
> -   table->initialState.levels[0].mclk.vMPLL_AD_FUNC_CNTL =
> +   table->initialState.level.mclk.vMPLL_AD_FUNC_CNTL =
> cpu_to_be32(si_pi->clock_registers.mpll_ad_func_cntl);
> -   table->initialState.levels[0].mclk.vMPLL_DQ_FUNC_CNTL =
> +   table->initialState.level.mclk.vMPLL_DQ_FUNC_CNTL =
> cpu_to_be32(si_pi->clock_registers.mpll_dq_func_cntl);
> -   table->initialState.levels[0].mclk.vMPLL_FUNC_CNTL =
> +   table->initialState.level.mclk.vMPLL_FUNC_CNTL =
> cpu_to_be32(si_pi->clock_registers.mpll_func_cntl);
> -   table->initialState.levels[0].mclk.vMPLL_FUNC_CNTL_1 =
> +   table->initialState.level.mclk.vMPLL_FUNC_CNTL_1 =
> cpu_to_be32(si_pi->clock_registers.mpll_func_cntl_1);
> -   table->initialState.levels[0].mclk.vMPLL_FUNC_CNTL_2 =
> +   table->initialState.level.mclk.vMPLL_FUNC_CNTL_2 =
> cpu_to_be32(si_pi->clock_registers.mpll_func_cntl_2);
> -   table->initialState.levels[0].mclk.vMPLL_SS =
> +   table->initialState.level.mclk.vMPLL_SS =
> cpu_to_be32(si_pi->clock_registers.mpll_ss1);
> -   table->initialState.levels[0].mclk.vMPLL_SS2 =
> +   table->initialState.level.mclk.vMPLL_SS2 =
> cpu_to_be32(si_pi->clock_registers.mpll_ss2);
>
> -   table->initialState.levels[0].mclk.mclk_value =
> +   table->initialState.level.mclk.mclk_value =
> cpu_to_be32(initial_state->performance_levels[0].mclk);
>
> -   table->initialState.levels[0].sclk.vCG_SPLL_FUNC_CNTL =
> +   table->initialState.level.sclk.vCG_SPLL_FUNC_CNTL =
> cpu_to_be32(si_pi->clock_registers.cg_spll_func_cntl);
> -   table->initialState.levels[0].sclk.vCG_SPLL_FUNC_CNTL_2 =
> +   table->initialState.level.sclk.vCG_SPLL_FUNC_CNTL_2 =
> cpu_to_be32(si_pi->clock_registers.cg_spll_func_cntl_2);
> -   table->initialState.levels[0].sclk.vCG_SPLL_FUNC_CNTL_3 =
> +   table->initialState.level.sclk.vCG_SPLL_FUNC_CNTL_3 =
> cpu_to_be32(si_pi->clock_registers.cg_spll_func_cntl_3);
> -   table->initialState.levels[0].sclk.vCG_SPLL_FUNC_CNTL_4 =
> +   

Re: [PATCH] drm/radeon/si_dpm: Fix SMU power state load

2021-05-10 Thread Alex Deucher
Applied.  Thanks!

Alex

On Mon, May 10, 2021 at 1:51 AM Kai-Heng Feng
 wrote:
>
> On Mon, May 10, 2021 at 6:54 AM Gustavo A. R. Silva
>  wrote:
> >
> > Create new structure SISLANDS_SMC_SWSTATE_SINGLE, as initialState.levels
> > and ACPIState.levels are never actually used as flexible arrays. Those
> > arrays can be used as simple objects of type
> > SISLANDS_SMC_HW_PERFORMANCE_LEVEL, instead.
> >
> > Currently, the code fails because flexible array _levels_ in
> > struct SISLANDS_SMC_SWSTATE doesn't allow for code that accesses
> > the first element of initialState.levels and ACPIState.levels
> > arrays:
> >
> > 4353 table->initialState.levels[0].mclk.vDLL_CNTL =
> > 4354 cpu_to_be32(si_pi->clock_registers.dll_cntl);
> > ...
> > 4555 table->ACPIState.levels[0].mclk.vDLL_CNTL =
> > 4556 cpu_to_be32(dll_cntl);
> >
> > because such an element cannot exist without previously allocating
> > any dynamic memory for it (which never actually happens).
> >
> > That's why struct SISLANDS_SMC_SWSTATE should only be used as type
> > for object driverState and new struct SISLANDS_SMC_SWSTATE_SINGLE is
> > created as type for objects initialState, ACPIState and ULVState.
> >
> > Also, with the change from one-element array to flexible-array member
> > in commit 96e27e8d919e ("drm/radeon/si_dpm: Replace one-element array
> > with flexible-array in struct SISLANDS_SMC_SWSTATE"), the size of
> > dpmLevels in struct SISLANDS_SMC_STATETABLE should be fixed to be
> > SISLANDS_MAX_SMC_PERFORMANCE_LEVELS_PER_SWSTATE instead of
> > SISLANDS_MAX_SMC_PERFORMANCE_LEVELS_PER_SWSTATE - 1.
> >
> > Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1583
> > Fixes: 96e27e8d919e ("drm/radeon/si_dpm: Replace one-element array with 
> > flexible-array in struct SISLANDS_SMC_SWSTATE")
> > Cc: sta...@vger.kernel.org
> > Reported-by: Kai-Heng Feng 
> > Signed-off-by: Gustavo A. R. Silva 
>
> Tested-by: Kai-Heng Feng 
>
> > ---
> >  drivers/gpu/drm/radeon/si_dpm.c   | 174 +-
> >  drivers/gpu/drm/radeon/sislands_smc.h |  34 +++--
> >  2 files changed, 109 insertions(+), 99 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/radeon/si_dpm.c 
> > b/drivers/gpu/drm/radeon/si_dpm.c
> > index 91bfc4762767..2a8b9680cf6b 100644
> > --- a/drivers/gpu/drm/radeon/si_dpm.c
> > +++ b/drivers/gpu/drm/radeon/si_dpm.c
> > @@ -4350,70 +4350,70 @@ static int si_populate_smc_initial_state(struct 
> > radeon_device *rdev,
> > u32 reg;
> > int ret;
> >
> > -   table->initialState.levels[0].mclk.vDLL_CNTL =
> > +   table->initialState.level.mclk.vDLL_CNTL =
> > cpu_to_be32(si_pi->clock_registers.dll_cntl);
> > -   table->initialState.levels[0].mclk.vMCLK_PWRMGT_CNTL =
> > +   table->initialState.level.mclk.vMCLK_PWRMGT_CNTL =
> > cpu_to_be32(si_pi->clock_registers.mclk_pwrmgt_cntl);
> > -   table->initialState.levels[0].mclk.vMPLL_AD_FUNC_CNTL =
> > +   table->initialState.level.mclk.vMPLL_AD_FUNC_CNTL =
> > cpu_to_be32(si_pi->clock_registers.mpll_ad_func_cntl);
> > -   table->initialState.levels[0].mclk.vMPLL_DQ_FUNC_CNTL =
> > +   table->initialState.level.mclk.vMPLL_DQ_FUNC_CNTL =
> > cpu_to_be32(si_pi->clock_registers.mpll_dq_func_cntl);
> > -   table->initialState.levels[0].mclk.vMPLL_FUNC_CNTL =
> > +   table->initialState.level.mclk.vMPLL_FUNC_CNTL =
> > cpu_to_be32(si_pi->clock_registers.mpll_func_cntl);
> > -   table->initialState.levels[0].mclk.vMPLL_FUNC_CNTL_1 =
> > +   table->initialState.level.mclk.vMPLL_FUNC_CNTL_1 =
> > cpu_to_be32(si_pi->clock_registers.mpll_func_cntl_1);
> > -   table->initialState.levels[0].mclk.vMPLL_FUNC_CNTL_2 =
> > +   table->initialState.level.mclk.vMPLL_FUNC_CNTL_2 =
> > cpu_to_be32(si_pi->clock_registers.mpll_func_cntl_2);
> > -   table->initialState.levels[0].mclk.vMPLL_SS =
> > +   table->initialState.level.mclk.vMPLL_SS =
> > cpu_to_be32(si_pi->clock_registers.mpll_ss1);
> > -   table->initialState.levels[0].mclk.vMPLL_SS2 =
> > +   table->initialState.level.mclk.vMPLL_SS2 =
> > cpu_to_be32(si_pi->clock_registers.mpll_ss2);
> >
> > -   table->initialState.levels[0].mclk.mclk_value =
> > +   table->initialState.level.mclk.mclk_value =
> > cpu_to_be32(initial_state->performance_levels[0].mclk);
> >
> > -   table->initialState.levels[0].sclk.vCG_SPLL_FUNC_CNTL =
> > +   table->initialState.level.sclk.vCG_SPLL_FUNC_CNTL =
> > cpu_to_be32(si_pi->clock_registers.cg_spll_func_cntl);
> > -   table->initialState.levels[0].sclk.vCG_SPLL_FUNC_CNTL_2 =
> > +   table->initialState.level.sclk.vCG_SPLL_FUNC_CNTL_2 =
> > cpu_to_be32(si_pi->clock_registers.cg_spll_func_cntl_2);
> > -   table->initialState.levels[0].sclk.vCG_SPLL_FUNC_CNTL_3 =
> > +   

Re: [PATCH] drm/radeon/ni_dpm: Fix booting bug

2021-05-10 Thread Alex Deucher
On Sun, May 9, 2021 at 6:48 PM Gustavo A. R. Silva
 wrote:
>
> Create new structure NISLANDS_SMC_SWSTATE_SINGLE, as initialState.levels
> and ACPIState.levels are never actually used as flexible arrays. Those
> arrays can be used as simple objects of type
> NISLANDS_SMC_HW_PERFORMANCE_LEVEL, instead.
>
> Currently, the code fails because flexible array _levels_ in
> struct NISLANDS_SMC_SWSTATE doesn't allow for code that accesses
> the first element of initialState.levels and ACPIState.levels
> arrays:
>
> drivers/gpu/drm/radeon/ni_dpm.c:
> 1690 table->initialState.levels[0].mclk.vMPLL_AD_FUNC_CNTL =
> 1691 cpu_to_be32(ni_pi->clock_registers.mpll_ad_func_cntl);
> ...
> 1903:   table->ACPIState.levels[0].mclk.vMPLL_AD_FUNC_CNTL = 
> cpu_to_be32(mpll_ad_func_cntl);
> 1904:   table->ACPIState.levels[0].mclk.vMPLL_AD_FUNC_CNTL_2 = 
> cpu_to_be32(mpll_ad_func_cntl_2);
>
> because such an element cannot exist without previously allocating
> any dynamic memory for it (which never actually happens).
>
> That's why struct NISLANDS_SMC_SWSTATE should only be used as type
> for object driverState and new struct SISLANDS_SMC_SWSTATE_SINGLE is
> created as type for objects initialState, ACPIState and ULVState.
>
> Also, with the change from one-element array to flexible-array member
> in commit 434fb1e7444a ("drm/radeon/nislands_smc.h: Replace one-element
> array with flexible-array member in struct NISLANDS_SMC_SWSTATE"), the
> size of dpmLevels in struct NISLANDS_SMC_STATETABLE should be fixed to
> be NISLANDS_MAX_SMC_PERFORMANCE_LEVELS_PER_SWSTATE instead of
> NISLANDS_MAX_SMC_PERFORMANCE_LEVELS_PER_SWSTATE - 1.
>
> Bug: 
> https://lore.kernel.org/dri-devel/3eedbe78-1fbd-4763-a7f3-ac5665e76...@xenosoft.de/
> Fixes: 434fb1e7444a ("drm/radeon/nislands_smc.h: Replace one-element array 
> with flexible-array member in struct NISLANDS_SMC_SWSTATE")
> Cc: sta...@vger.kernel.org
> Reported-by: Christian Zigotzky 
> Tested-by: Christian Zigotzky 
> Link: 
> https://lore.kernel.org/dri-devel/9bb5fcbd-daf5-1669-b3e7-b8624b3c3...@xenosoft.de/
> Signed-off-by: Gustavo A. R. Silva 

This seems like a lot of churn just to use flexible arrays.  That
said, if static checkers are going to keep complaining about single
element arrays, I don't mind applying these patches since this code is
not likely to change.  Applied.  Thanks.

Alex



> ---
>  drivers/gpu/drm/radeon/ni_dpm.c   | 144 +-
>  drivers/gpu/drm/radeon/nislands_smc.h |  34 +++---
>  2 files changed, 94 insertions(+), 84 deletions(-)
>
> diff --git a/drivers/gpu/drm/radeon/ni_dpm.c b/drivers/gpu/drm/radeon/ni_dpm.c
> index dd5ef6493723..769f666335ac 100644
> --- a/drivers/gpu/drm/radeon/ni_dpm.c
> +++ b/drivers/gpu/drm/radeon/ni_dpm.c
> @@ -1687,102 +1687,102 @@ static int ni_populate_smc_initial_state(struct 
> radeon_device *rdev,
> u32 reg;
> int ret;
>
> -   table->initialState.levels[0].mclk.vMPLL_AD_FUNC_CNTL =
> +   table->initialState.level.mclk.vMPLL_AD_FUNC_CNTL =
> cpu_to_be32(ni_pi->clock_registers.mpll_ad_func_cntl);
> -   table->initialState.levels[0].mclk.vMPLL_AD_FUNC_CNTL_2 =
> +   table->initialState.level.mclk.vMPLL_AD_FUNC_CNTL_2 =
> cpu_to_be32(ni_pi->clock_registers.mpll_ad_func_cntl_2);
> -   table->initialState.levels[0].mclk.vMPLL_DQ_FUNC_CNTL =
> +   table->initialState.level.mclk.vMPLL_DQ_FUNC_CNTL =
> cpu_to_be32(ni_pi->clock_registers.mpll_dq_func_cntl);
> -   table->initialState.levels[0].mclk.vMPLL_DQ_FUNC_CNTL_2 =
> +   table->initialState.level.mclk.vMPLL_DQ_FUNC_CNTL_2 =
> cpu_to_be32(ni_pi->clock_registers.mpll_dq_func_cntl_2);
> -   table->initialState.levels[0].mclk.vMCLK_PWRMGT_CNTL =
> +   table->initialState.level.mclk.vMCLK_PWRMGT_CNTL =
> cpu_to_be32(ni_pi->clock_registers.mclk_pwrmgt_cntl);
> -   table->initialState.levels[0].mclk.vDLL_CNTL =
> +   table->initialState.level.mclk.vDLL_CNTL =
> cpu_to_be32(ni_pi->clock_registers.dll_cntl);
> -   table->initialState.levels[0].mclk.vMPLL_SS =
> +   table->initialState.level.mclk.vMPLL_SS =
> cpu_to_be32(ni_pi->clock_registers.mpll_ss1);
> -   table->initialState.levels[0].mclk.vMPLL_SS2 =
> +   table->initialState.level.mclk.vMPLL_SS2 =
> cpu_to_be32(ni_pi->clock_registers.mpll_ss2);
> -   table->initialState.levels[0].mclk.mclk_value =
> +   table->initialState.level.mclk.mclk_value =
> cpu_to_be32(initial_state->performance_levels[0].mclk);
>
> -   table->initialState.levels[0].sclk.vCG_SPLL_FUNC_CNTL =
> +   table->initialState.level.sclk.vCG_SPLL_FUNC_CNTL =
> cpu_to_be32(ni_pi->clock_registers.cg_spll_func_cntl);
> -   table->initialState.levels[0].sclk.vCG_SPLL_FUNC_CNTL_2 =
> +   table->initialState.level.sclk.vCG_SPLL_FUNC_CNTL_2 =
> 

[Bug 213007] [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling IBs (-22)

2021-05-10 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=213007

Alex Deucher (alexdeuc...@gmail.com) changed:

   What|Removed |Added

 CC||alexdeuc...@gmail.com

--- Comment #1 from Alex Deucher (alexdeuc...@gmail.com) ---
Something hung the GPU and it was not able to successfully reset.  Was there
some specific application that caused this?


Re: [PATCH] drm/amd/amdgpu: Fix errors in function documentation

2021-05-10 Thread Alex Deucher
Applied.  Thanks!

Alex

On Sun, May 9, 2021 at 12:30 PM Christian König
 wrote:
>
> Am 09.05.21 um 16:49 schrieb Dwaipayan Ray:
> > Fix a couple of syntax errors and removed one excess
> > parameter in the function documentations which lead
> > to kernel docs build warning.
> >
> > Signed-off-by: Dwaipayan Ray 
>
> Reviewed-by: Christian König 
>
> > ---
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 3 +++
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c  | 1 -
> >   2 files changed, 3 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c 
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> > index ae9fb2025259..312f24004413 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> > @@ -320,11 +320,14 @@ static int amdgpu_ras_debugfs_ctrl_parse_data(struct 
> > file *f,
> >* "disable" requires only the block.
> >* "enable" requires the block and error type.
> >* "inject" requires the block, error type, address, and value.
> > + *
> >* The block is one of: umc, sdma, gfx, etc.
> >*  see ras_block_string[] for details
> > + *
> >* The error type is one of: ue, ce, where,
> >*  ue is multi-uncorrectable
> >*  ce is single-correctable
> > + *
> >* The sub-block is a the sub-block index, pass 0 if there is no 
> > sub-block.
> >* The address and value are hexadecimal numbers, leading 0x is optional.
> >*
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> > index 16252d48e5a4..7e1a67295106 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> > @@ -2796,7 +2796,6 @@ long amdgpu_vm_wait_idle(struct amdgpu_vm *vm, long 
> > timeout)
> >*
> >* @adev: amdgpu_device pointer
> >* @vm: requested vm
> > - * @vm_context: Indicates if it GFX or Compute context
> >* @pasid: Process address space identifier
> >*
> >* Init @vm fields.
>


Re: [PATCH v6 08/16] PCI: Add support for dev_groups to struct pci_device_driver

2021-05-10 Thread Bjorn Helgaas
In subject:

  PCI: Add support for dev_groups to struct pci_driver

(not "struct pci_device_driver," which does not exist)

On Mon, May 10, 2021 at 12:36:17PM -0400, Andrey Grodzovsky wrote:
> This helps convert PCI drivers' sysfs attributes to static.
> 
> Analogous to b71b283e3d6d ("USB: add support for dev_groups to
> struct usb_driver")
> 
> Signed-off-by: Andrey Grodzovsky 
> Suggested-by: Greg Kroah-Hartman 

With the subject change above,

Acked-by: Bjorn Helgaas 

> ---
>  drivers/pci/pci-driver.c | 1 +
>  include/linux/pci.h  | 3 +++
>  2 files changed, 4 insertions(+)
> 
> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> index ec44a79e951a..3a72352aa5cf 100644
> --- a/drivers/pci/pci-driver.c
> +++ b/drivers/pci/pci-driver.c
> @@ -1385,6 +1385,7 @@ int __pci_register_driver(struct pci_driver *drv, 
> struct module *owner,
>   drv->driver.owner = owner;
>   drv->driver.mod_name = mod_name;
>   drv->driver.groups = drv->groups;
> + drv->driver.dev_groups = drv->dev_groups;
>  
>   spin_lock_init(&drv->dynids.lock);
>   INIT_LIST_HEAD(&drv->dynids.list);
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 86c799c97b77..b57755b03009 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -858,6 +858,8 @@ struct module;
>   *   number of VFs to enable via sysfs "sriov_numvfs" file.
>   * @err_handler: See Documentation/PCI/pci-error-recovery.rst
>   * @groups:  Sysfs attribute groups.
> + * @dev_groups: Attributes attached to the device that will be
> + *  created once it is bound to the driver.
>   * @driver:  Driver model structure.
>   * @dynids:  List of dynamically added device IDs.
>   */
> @@ -873,6 +875,7 @@ struct pci_driver {
>   int  (*sriov_configure)(struct pci_dev *dev, int num_vfs); /* On PF */
>   const struct pci_error_handlers *err_handler;
>   const struct attribute_group **groups;
> + const struct attribute_group **dev_groups;
>   struct device_driverdriver;
>   struct pci_dynids   dynids;
>  };
> -- 
> 2.25.1
> 
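
As a usage illustration (the "foo" names are hypothetical, not from this
series), a driver built on this change could declare its per-device sysfs
attributes statically and let the driver core create them when a device
binds:

#include <linux/device.h>
#include <linux/pci.h>
#include <linux/sysfs.h>

static ssize_t foo_mode_show(struct device *dev,
                             struct device_attribute *attr, char *buf)
{
        return sysfs_emit(buf, "default\n");
}
static DEVICE_ATTR_RO(foo_mode);

static struct attribute *foo_attrs[] = {
        &dev_attr_foo_mode.attr,
        NULL,
};
ATTRIBUTE_GROUPS(foo);

static struct pci_driver foo_pci_driver = {
        .name       = "foo",
        /* created when a device binds to the driver, removed on unbind */
        .dev_groups = foo_groups,
        /* .id_table and .probe omitted for brevity */
};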


Re: [PATCH] drm/amd/display: remove unused function dc_link_perform_link_training

2021-05-10 Thread Alex Deucher
Applied.  Thanks!

Alex

On Sun, May 9, 2021 at 11:42 AM Rouven Czerwinski  wrote:
>
> This function is not used anywhere, remove it. It was added in
> 40dd6bd376a4 ("drm/amd/display: Linux Set/Read link rate and lane count
> through debugfs") and moved in fe798de53a7a ("drm/amd/display: Move link
> functions from dc to dc_link"), but a user is missing.
>
> Signed-off-by: Rouven Czerwinski 
> ---
>  drivers/gpu/drm/amd/display/dc/core/dc_link.c | 13 -
>  drivers/gpu/drm/amd/display/dc/dc_link.h  |  3 ---
>  2 files changed, 16 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link.c 
> b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
> index 3fb0cebd6938..55c5cf2264b3 100644
> --- a/drivers/gpu/drm/amd/display/dc/core/dc_link.c
> +++ b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
> @@ -3553,19 +3553,6 @@ void dc_link_set_drive_settings(struct dc *dc,
> dc_link_dp_set_drive_settings(dc->links[i], lt_settings);
>  }
>
> -void dc_link_perform_link_training(struct dc *dc,
> -  struct dc_link_settings *link_setting,
> -  bool skip_video_pattern)
> -{
> -   int i;
> -
> -   for (i = 0; i < dc->link_count; i++)
> -   dc_link_dp_perform_link_training(
> -   dc->links[i],
> -   link_setting,
> -   skip_video_pattern);
> -}
> -
>  void dc_link_set_preferred_link_settings(struct dc *dc,
>  struct dc_link_settings 
> *link_setting,
>  struct dc_link *link)
> diff --git a/drivers/gpu/drm/amd/display/dc/dc_link.h 
> b/drivers/gpu/drm/amd/display/dc/dc_link.h
> index fc5622ffec3d..45c927cd27ab 100644
> --- a/drivers/gpu/drm/amd/display/dc/dc_link.h
> +++ b/drivers/gpu/drm/amd/display/dc/dc_link.h
> @@ -363,9 +363,6 @@ bool dc_link_is_hdcp22(struct dc_link *link, enum 
> signal_type signal);
>  void dc_link_set_drive_settings(struct dc *dc,
> struct link_training_settings *lt_settings,
> const struct dc_link *link);
> -void dc_link_perform_link_training(struct dc *dc,
> -  struct dc_link_settings *link_setting,
> -  bool skip_video_pattern);
>  void dc_link_set_preferred_link_settings(struct dc *dc,
>  struct dc_link_settings 
> *link_setting,
>  struct dc_link *link);
> --
> 2.31.1
>


[PATCH] drm/amd/pm: Fix out-of-bounds bug

2021-05-10 Thread Gustavo A. R. Silva
Create new structure SISLANDS_SMC_SWSTATE_SINGLE, as initialState.levels
and ACPIState.levels are never actually used as flexible arrays. Those
arrays can be used as simple objects of type
SISLANDS_SMC_HW_PERFORMANCE_LEVEL, instead.

Currently, the code fails because flexible array _levels_ in
struct SISLANDS_SMC_SWSTATE doesn't allow for code that accesses
the first element of initialState.levels and ACPIState.levels
arrays:

drivers/gpu/drm/amd/pm/powerplay/si_dpm.c:
4820: table->initialState.levels[0].mclk.vDLL_CNTL =
4821: cpu_to_be32(si_pi->clock_registers.dll_cntl);
...
5021: table->ACPIState.levels[0].mclk.vDLL_CNTL =
5022: cpu_to_be32(dll_cntl);

because such element cannot be accessed without previously allocating
enough dynamic memory for it to exist (which never actually happens).
So, there is an out-of-bounds bug in this case.

That's why struct SISLANDS_SMC_SWSTATE should only be used as type
for object driverState and new struct SISLANDS_SMC_SWSTATE_SINGLE is
created as type for objects initialState, ACPIState and ULVState.

Also, with the change from one-element array to flexible-array member
in commit 0e1aa13ca3ff ("drm/amd/pm: Replace one-element array with
flexible-array in struct SISLANDS_SMC_SWSTATE"), the size of
dpmLevels in struct SISLANDS_SMC_STATETABLE should be fixed to be
SISLANDS_MAX_SMC_PERFORMANCE_LEVELS_PER_SWSTATE instead of
SISLANDS_MAX_SMC_PERFORMANCE_LEVELS_PER_SWSTATE - 1.

Fixes: 0e1aa13ca3ff ("drm/amd/pm: Replace one-element array with flexible-array 
in struct SISLANDS_SMC_SWSTATE")
Cc: sta...@vger.kernel.org
Signed-off-by: Gustavo A. R. Silva 
---
 drivers/gpu/drm/amd/pm/powerplay/si_dpm.c | 174 +-
 .../gpu/drm/amd/pm/powerplay/sislands_smc.h   |  34 ++--
 2 files changed, 109 insertions(+), 99 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/powerplay/si_dpm.c 
b/drivers/gpu/drm/amd/pm/powerplay/si_dpm.c
index 26a5321e621b..15c0b8af376f 100644
--- a/drivers/gpu/drm/amd/pm/powerplay/si_dpm.c
+++ b/drivers/gpu/drm/amd/pm/powerplay/si_dpm.c
@@ -4817,70 +4817,70 @@ static int si_populate_smc_initial_state(struct 
amdgpu_device *adev,
u32 reg;
int ret;
 
-   table->initialState.levels[0].mclk.vDLL_CNTL =
+   table->initialState.level.mclk.vDLL_CNTL =
cpu_to_be32(si_pi->clock_registers.dll_cntl);
-   table->initialState.levels[0].mclk.vMCLK_PWRMGT_CNTL =
+   table->initialState.level.mclk.vMCLK_PWRMGT_CNTL =
cpu_to_be32(si_pi->clock_registers.mclk_pwrmgt_cntl);
-   table->initialState.levels[0].mclk.vMPLL_AD_FUNC_CNTL =
+   table->initialState.level.mclk.vMPLL_AD_FUNC_CNTL =
cpu_to_be32(si_pi->clock_registers.mpll_ad_func_cntl);
-   table->initialState.levels[0].mclk.vMPLL_DQ_FUNC_CNTL =
+   table->initialState.level.mclk.vMPLL_DQ_FUNC_CNTL =
cpu_to_be32(si_pi->clock_registers.mpll_dq_func_cntl);
-   table->initialState.levels[0].mclk.vMPLL_FUNC_CNTL =
+   table->initialState.level.mclk.vMPLL_FUNC_CNTL =
cpu_to_be32(si_pi->clock_registers.mpll_func_cntl);
-   table->initialState.levels[0].mclk.vMPLL_FUNC_CNTL_1 =
+   table->initialState.level.mclk.vMPLL_FUNC_CNTL_1 =
cpu_to_be32(si_pi->clock_registers.mpll_func_cntl_1);
-   table->initialState.levels[0].mclk.vMPLL_FUNC_CNTL_2 =
+   table->initialState.level.mclk.vMPLL_FUNC_CNTL_2 =
cpu_to_be32(si_pi->clock_registers.mpll_func_cntl_2);
-   table->initialState.levels[0].mclk.vMPLL_SS =
+   table->initialState.level.mclk.vMPLL_SS =
cpu_to_be32(si_pi->clock_registers.mpll_ss1);
-   table->initialState.levels[0].mclk.vMPLL_SS2 =
+   table->initialState.level.mclk.vMPLL_SS2 =
cpu_to_be32(si_pi->clock_registers.mpll_ss2);
 
-   table->initialState.levels[0].mclk.mclk_value =
+   table->initialState.level.mclk.mclk_value =
cpu_to_be32(initial_state->performance_levels[0].mclk);
 
-   table->initialState.levels[0].sclk.vCG_SPLL_FUNC_CNTL =
+   table->initialState.level.sclk.vCG_SPLL_FUNC_CNTL =
cpu_to_be32(si_pi->clock_registers.cg_spll_func_cntl);
-   table->initialState.levels[0].sclk.vCG_SPLL_FUNC_CNTL_2 =
+   table->initialState.level.sclk.vCG_SPLL_FUNC_CNTL_2 =
cpu_to_be32(si_pi->clock_registers.cg_spll_func_cntl_2);
-   table->initialState.levels[0].sclk.vCG_SPLL_FUNC_CNTL_3 =
+   table->initialState.level.sclk.vCG_SPLL_FUNC_CNTL_3 =
cpu_to_be32(si_pi->clock_registers.cg_spll_func_cntl_3);
-   table->initialState.levels[0].sclk.vCG_SPLL_FUNC_CNTL_4 =
+   table->initialState.level.sclk.vCG_SPLL_FUNC_CNTL_4 =
cpu_to_be32(si_pi->clock_registers.cg_spll_func_cntl_4);
-   table->initialState.levels[0].sclk.vCG_SPLL_SPREAD_SPECTRUM =
+   table->initialState.level.sclk.vCG_SPLL_SPREAD_SPECTRUM =

Re: [PATCH V3 2/2] drm/bridge: ti-sn65dsi83: Add TI SN65DSI83 and SN65DSI84 driver

2021-05-10 Thread Marek Vasut

On 5/10/21 8:04 PM, Dave Stevenson wrote:

Hi,
[...]

+static void sn65dsi83_enable(struct drm_bridge *bridge)
+{
+   struct sn65dsi83 *ctx = bridge_to_sn65dsi83(bridge);
+   unsigned int pval;
+   u16 val;
+   int ret;
+
+   /* Clear reset, disable PLL */
+   regmap_write(ctx->regmap, REG_RC_RESET, 0x00);
+   regmap_write(ctx->regmap, REG_RC_PLL_EN, 0x00);


Sorry, a further thread of discussion coming from the investigations
I've been involved with.

You've powered up in pre_enable, and are sending the I2C writes in enable.

>From the docs for drm_bridge_funcs->enable[1]

* The bridge can assume that the display pipe (i.e. clocks and timing
* signals) feeding it is running when this callback is called. This
* callback must enable the display link feeding the next bridge in the
* chain if there is one.

So video is running when enable is called, and the DSI data lanes may
be HS. (Someone correct me if that is an incorrect reading of the
text).

The SN65DSI84 datasheet table 7-2 Initialization Sequence gives init
seq 8 as being "Change DSI data lanes to HS state and start DSI video
stream", AFTER all the I2C has been completed except reading back
registers and checking for errors.
With video running you don't fulfil the second part of init seq 2 "the
DSI data lanes MUST be driven to LP11 state"

My investigations have been over delaying starting the DSI video
stream until after enable, but reading the descriptive text for enable
I believe the Pi is correct to be sending video at that point.
I guess there is some ambiguity as to whether the clock lane is going
to be in HS mode during pre_enable. On the Pi the PHY and clocks will
be enabled prior to pre_enable to allow for sending DSI commands
during pre_enable, but it may not be true on other platforms.


You have to make sure the clock lane is running and in HS mode when
configuring the DSI83, otherwise the internal DSI83 state machine won't
be able to operate.


Indeed, but my reading of the documentation says that neither
pre_enable nor enable give you the state that you require.
You need a hook in the middle, an option to ask for clock lanes during
pre_enable or no video during enable, or an amendment to the docs over
the state during enable.

Having the data lanes in HS mode does appear to stop the DSI83
accepting the I2C setup commands.


Uhh, that is new. Is that what you observed in your lab ?

I saw the DSI83 behave this way if the clock lane was stopped, but the
data lanes had no impact. Was your clock lane running when the DSI83 was
not accepting i2c commands ? Does your DSI83 source clock from it or
from external Xtal ?


I haven't got into the lab as yet, and I don't have a DSI83 myself.
This is relaying experimentation from others.
They're using the DSI clock lane as the clock source. Yes, the clock
lane on the Pi is started before any of the bridge enable calls.

In the vc4 driver[1] it runs through the all pre-enables, configures
register DISP0_CTRL including setting bit DSI_DISP0_ENABLE which
starts it requesting pixels from the pipeline, and then calls all the
enables. With that behaviour it fails to start the DSI83.

If the DSI83 I2C setup code is moved from enable to pre_enable then it
works, or if patch [2] is used to move the setting of the
DSI_DISP0_ENABLE bit to after enable it also works.


The pre_enable option won't work on MX8M and I suspect Exynos5, because 
that DSIM PHY only enables the DSI HS clock in its bridge enable 
(whether that is OK or not, I cannot tell, I am hoping someone can 
clarify that).



Sorry life is all rather up in the air with working from home. I'll go
into the lab and try to confirm that DSI_DISP0_ENABLE does what the
documentation implies it does.
Those who do have hardware now have it working on the Pi, although
with a version of Jagan's driver rather than yours. We're trying to
figure out the diffs with yours.


All right, if you figure it out, I'd be interested to know what it is.


If you have it working reliably on other platforms that you believe
are following the docs during pre_enable and enable, then I'm happy to
drop out of the discussions for now. We can revisit it once we have
determined exactly why it's being fussy on the Pi.


Since you have one working setup and another non-working, and the only 
difference is the DSI83 bridge driver, it should be possible to find the 
difference easily.


I had a look at the driver from Jagan again, and there is no 
configuration in pre_enable either, so the pre_enable is likely not the 
reason why it works for you. Maybe the extra mdelays the driver adds all 
over the place are the reason ?



Cheers
   Dave

[1] 
https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/vc4/vc4_dsi.c#L1072
[2] 
https://github.com/6by9/linux/commit/b939eaffc47cc84ebfea6bf1ab10ae1ec9fa58c2



[...]
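
For readers following the pre_enable/enable argument above, a bare sketch
(names are illustrative, not Marek's or Jagan's driver) of the two hooks in
question and what each one may assume about the DSI link:

#include <drm/drm_bridge.h>

static void example_dsi83_pre_enable(struct drm_bridge *bridge)
{
        /* Data lanes are expected to be in LP11 here, but the DSI HS
         * clock may not be running yet on hosts that only start it in
         * their own enable path. */
}

static void example_dsi83_enable(struct drm_bridge *bridge)
{
        /* The display pipe (clock and video) is running here, so the
         * data lanes are no longer guaranteed to be in LP11 while the
         * bridge registers are written over I2C. */
}

static const struct drm_bridge_funcs example_dsi83_bridge_funcs = {
        .pre_enable = example_dsi83_pre_enable,
        .enable     = example_dsi83_enable,
};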


Re: [PATCH 00/53] Get rid of UTF-8 chars that can be mapped as ASCII

2021-05-10 Thread Theodore Ts'o
On Mon, May 10, 2021 at 02:49:44PM +0100, David Woodhouse wrote:
> On Mon, 2021-05-10 at 13:55 +0200, Mauro Carvalho Chehab wrote:
> > This patch series is doing conversion only when using ASCII makes
> > more sense than using UTF-8. 
> > 
> > See, a number of converted documents ended with weird characters
> > like ZERO WIDTH NO-BREAK SPACE (U+FEFF) character. This specific
> > character doesn't do any good.
> > 
> > Others use NO-BREAK SPACE (U+A0) instead of 0x20. Harmless, until
> > someone tries to use grep[1].
> 
> Replacing those makes sense. But replacing emdashes — which are a
> distinct character that has no direct replacement in ASCII and which
> people do *deliberately* use instead of hyphen-minus — does not.

I regularly use --- for em-dashes and -- for en-dashes.  Markdown will
automatically translate 3 ASCII hypens to em-dashes, and 2 ASCII
hyphens to en-dashes.  It's much, much easier for me to type 2 or 3
hypens into my text editor of choice than trying to enter the UTF-8
characters.  If we can make sphinx do this translation, maybe that's
the best way of dealing with these two characters?

Cheers,

- Ted


Re: [PATCH] component: Move host device to end of device lists on binding

2021-05-10 Thread Stephen Boyd
Quoting Daniel Vetter (2021-05-10 11:26:40)
> On Mon, May 10, 2021 at 7:52 PM Stephen Boyd  wrote:
> > The device list now has msm, i2c, bridge in that order. When we go to
> > system wide shutdown the bridge is shutdown first, then the i2c bus, and
> > then msm calls drm_atomic_helper_shutdown(). That tries to call the i2c
> > bridge ops because it's attached to the end of the DSI encoder and
> > things don't go well because i2c is gone. This patch fixes the order of
> > the list so that msm is moved on the device list after all the
> > components that make up the aggregate device have probed. This only
> > works to move the aggregate device after the i2c bridge because the
> > msm_dsi_host_register() function won't return success until the bridge
> > device is probed.
>
> Ah I think I get this now. There is indeed a design problem:
> component.c only has bind/unbind hooks for all its things. Which means
> driver load/unload will work correctly because in your above sequence:
>
> 1. drm_brige unbinds
> -> this triggers the unbind of the entire aggregate of components
> 2. i2c unbinds
> 3. msm unbinds, but there's nothing to clean up anymore except the
> aggregate/master struct

Yes. I just tried this though and it didn't work, so I suspect there are
bugs in bridge unbind. Another rabbit hole.

>
> Now for runtime pm this also all works out, because each component
> grabs the right runtime pm references. But for the system-wide pm
> changes, where we rely on the device list order to make sure things
> happen in the right way, it all blows up.
>
> 1. drm_bringe shutdown
> 2. i2c shutdown
> 3. msm shutdown, and with very sad thrombones because we blow up
>
> I think the right fix is to make component.c more of  a driver model
> thing, which probably means either the aggregate must get tied closer
> to the main struct device, or it needs to gain its own struct device.
> Or minimally at least, the aggregate needs to gain an entire set of
> pm_ops, which gets called in the right order if any of the component's
> pm_ops gets called. Wiring that all up will be major surgery I think.

Yes the root of the problem is that the aggregate device is not part of
the kernel's driver model. It's basically a pair of probe and remove
functions and nothing else.

>
> I guess another option would be trying to figure out how the aggreate
> registration could fail with EPROBE_DEFER until all the parts are
> there, to guarantee the right ordering. Not sure that will work with
> the current component users though.

I had that written up and it worked for me but I was concerned it would
break other users, plus it didn't feel correct to defer probe just
because the components weren't probed yet. The aggregate device wasn't
waiting for the components to probe, so why change that? For msm it led
to more work too, because we have some child devices that are removed if
the aggregate device fails to probe, meaning we go through a few cycles
of add/remove of the components this way. If the aggregate doesn't defer
probe then we can avoid the other components adding/removing over and
over again until the final component, DSI that is waiting for the
bridge, can probe.

That's why I opted to move the device on the list to the tail. I'm
hoping that most component users (which is basically drm?) don't do much
with the device they're using to host the aggregate device besides tell
drm that the display pipeline is here now. Everything else would be in
the bind/unbind callbacks. If there was a 'struct device', or maybe a
'struct class', that was associated with the whole display pipeline and
aggregate device we could attach the pm ops to that. Would 'struct
drm_device' be that? If yes we could make some drm functions that let
you attach PM ops to a struct device inside of that and make it a child
of the device that calls drm_dev_alloc().

>
> > It's an interesting idea to trigger shutdown when the component device
> > is unbound. Are you suggesting that the i2c bridge device have a
> > 'shutdown' callback, that essentially removes the bridge from the
> > encoder chain via mipi_dsi_detach() and then drm_bridge_remove()?
> > Presumably that would somehow tell the DSI encoder that it should stop
> > trying to use the i2c bridge and then drm_atomic_helper_shutdown()
> > wouldn't try to traverse beyond the DSI to shut things down.
>
> Nope, we don't want to unbind the driver on shutdown. I somehow
> thought you're dying in there, which is why I wondered what's going
> on. But since you're dying in pm_ops->shutdown, that's a different
> thing.

I'm dying in msm_pdev_shutdown(), but yes pm_ops are similar.

>
> > I will try it, but then I wonder about things like system wide
> > suspend/resume too. The drm encoder chain would need to reimplement the
> > logic for system wide suspend/resume so that any PM ops attached to the
> > msm device run in the correct order. Right now the bridge PM ops will
> > run, the i2c bus PM ops will run, and then the msm PM 

Re: [PATCH 1/2] drm: Fix dirtyfb stalls

2021-05-10 Thread Rob Clark
On Mon, May 10, 2021 at 10:44 AM Daniel Vetter  wrote:
>
> On Mon, May 10, 2021 at 6:51 PM Rob Clark  wrote:
> >
> > On Mon, May 10, 2021 at 9:14 AM Daniel Vetter  wrote:
> > >
> > > On Sat, May 08, 2021 at 12:56:38PM -0700, Rob Clark wrote:
> > > > From: Rob Clark 
> > > >
> > > > drm_atomic_helper_dirtyfb() will end up stalling for vblank on "video
> > > > mode" type displays, which is pointless and unnecessary.  Add an
> > > > optional helper vfunc to determine if a plane is attached to a CRTC
> > > > that actually needs dirtyfb, and skip over them.
> > > >
> > > > Signed-off-by: Rob Clark 
> > >
> > > So this is a bit annoying because the idea of all these "remap legacy uapi
> > > to atomic constructs" helpers is that they shouldn't need/use anything
> > > beyond what userspace also has available. So adding hacks for them feels
> > > really bad.
> >
> > I suppose the root problem is that userspace doesn't know if dirtyfb
> > (or similar) is actually required or is a no-op.
> >
> > But it is perhaps less of a problem because this essentially boils
> > down to "x11 vs wayland", and it seems like wayland compositors for
> > non-vsync'd rendering just pageflips and throws away extra frames from
> > the app?
>
> Yeah it's about not adequately batching up rendering and syncing with
> hw. bare metal x11 is just especially stupid about it :-)
>
> > > Also I feel like it's not entirely the right thing to do here either.
> > > We've had this problem already on the fbcon emulation side (which also
> > > shouldn't be able to peek behind the atomic kms uapi curtain), and the fix
> > > there was to have a worker which batches up all the updates and avoids any
> > > stalls in bad places.
> >
> > I'm not too worried about fbcon not being able to render faster than
> > vblank.  OTOH it is a pretty big problem for x11
>
> That's why we'd let the worker get ahead at most one dirtyfb. We do
> the same with fbcon, which trivially can get ahead of vblank otherwise
> (it sometimes flushes each character, so you have to pile them up into
> a single update if that's still pending).
>
> > > Since this is for frontbuffer rendering userspace only we can probably get
> > > away with assuming there's only a single fb, so the implementation becomes
> > > pretty simple:
> > >
> > > - 1 worker, and we keep track of a single pending fb
> > > - if there's already a dirty fb pending on a different fb, we stall for
> > >   the worker to start processing that one already (i.e. the fb we track is
> > >   reset to NULL)
> > > - if it's pending on the same fb we just toss away all the updates and go
> > >   with a full update, since merging the clip rects is too much work :-) I
> > >   think there's helpers so you could be slightly more clever and just have
> > >   an overall bounding box
> >
> > This doesn't really fix the problem, you still end up delaying sending
> > the next back-buffer to mesa
>
> With this the dirtyfb would never block. Also glorious frontbuffer
> tracking corruption is possible, but that's not the kernel's problem.
> So how would anything get held up in userspace.

the part about stalling if a dirtyfb is pending was what I was worried
about.. but I suppose you meant the worker stalling, rather than
userspace stalling (where I had interpreted it the other way around).
As soon as userspace needs to stall, you're losing again.

> > But we could re-work drm_framebuffer_funcs::dirty to operate on a
> > per-crtc basis and hoist the loop and check if dirtyfb is needed out
> > of drm_atomic_helper_dirtyfb()
>
> That's still using information that userspace doesn't have, which is a
> bit irky. We might as well go with your thing here then.

arguably, this is something we should expose to userspace.. for DSI
command-mode panels, you probably want to make a different decision
with regard to how many buffers in your flip-chain..

Possibly we should add/remove the fb_damage_clips property depending
on the display type (ie. video/pull vs cmd/push mode)?
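Something like this at plane init time is what I have in mind (sketch
only; msm_plane_maybe_enable_damage() and the cmd_mode flag are made up,
drm_plane_enable_fb_damage_clips() is the existing helper that attaches
the property):

#include <drm/drm_damage_helper.h>
#include <drm/drm_plane.h>

/* Only advertise FB_DAMAGE_CLIPS on planes that feed a command-mode
 * (push) panel, so userspace can tell whether dirtyfb/damage posts
 * actually do anything on this display. */
static void msm_plane_maybe_enable_damage(struct drm_plane *plane,
					  bool cmd_mode)
{
	if (cmd_mode)
		drm_plane_enable_fb_damage_clips(plane);
}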

BR,
-R

> -Daniel
>
> > BR,
> > -R
> >
> > >
> > > Could probably steal most of the implementation.
> > >
> > > This approach here feels a tad too much in the hacky area ...
> > >
> > > Thoughts?
> > > -Daniel
> > >
> > > > ---
> > > >  drivers/gpu/drm/drm_damage_helper.c  |  8 
> > > >  include/drm/drm_modeset_helper_vtables.h | 14 ++
> > > >  2 files changed, 22 insertions(+)
> > > >
> > > > diff --git a/drivers/gpu/drm/drm_damage_helper.c 
> > > > b/drivers/gpu/drm/drm_damage_helper.c
> > > > index 3a4126dc2520..a0bed1a2c2dc 100644
> > > > --- a/drivers/gpu/drm/drm_damage_helper.c
> > > > +++ b/drivers/gpu/drm/drm_damage_helper.c
> > > > @@ -211,6 +211,7 @@ int drm_atomic_helper_dirtyfb(struct 
> > > > drm_framebuffer *fb,
> > > >  retry:
> > > >   drm_for_each_plane(plane, fb->dev) {
> > > >   struct drm_plane_state *plane_state;
> > > > + struct drm_crtc *crtc;
> > > >
> > > >   ret = drm_modeset_lock(&plane->mutex, 

Re: [PATCH] drm/dp: Fix bogus DPCD version check in drm_dp_read_downstream_info()

2021-05-10 Thread Ville Syrjälä
On Fri, May 07, 2021 at 05:42:09PM -0400, Lyude Paul wrote:
> Ville pointed this out to me when fixing some issues in
> drm_dp_read_downstream_info() - the DPCD version check here is bogus as
> there's no DisplayPort versions prior to 1.0. The original code from i915
> that this was extracted from actually did:
> 
>   dpcd[DP_DPCD_REV] == DP_DPCD_REV_10
> 
> Which is correct, and somehow got missed when extracting this function. So
> let's fix this. Note that as far as I'm aware, I don't think this fixes any
> actual issues users are hitting.
> 
> Signed-off-by: Lyude Paul 
> Cc: Ville Syrjälä 

Reviewed-by: Ville Syrjälä 

> ---
>  drivers/gpu/drm/drm_dp_helper.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/drm_dp_helper.c b/drivers/gpu/drm/drm_dp_helper.c
> index 0f84df8798ab..55b53df6ce34 100644
> --- a/drivers/gpu/drm/drm_dp_helper.c
> +++ b/drivers/gpu/drm/drm_dp_helper.c
> @@ -677,7 +677,7 @@ int drm_dp_read_downstream_info(struct drm_dp_aux *aux,
>   memset(downstream_ports, 0, DP_MAX_DOWNSTREAM_PORTS);
>  
>   /* No downstream info to read */
> - if (!drm_dp_is_branch(dpcd) || dpcd[DP_DPCD_REV] < DP_DPCD_REV_10)
> + if (!drm_dp_is_branch(dpcd) || dpcd[DP_DPCD_REV] == DP_DPCD_REV_10)
>   return 0;
>  
>   /* Some branches advertise having 0 downstream ports, despite also 
> advertising they have a
> -- 
> 2.30.2

-- 
Ville Syrjälä
Intel


Re: [Intel-gfx] [RFC PATCH 00/97] Basic GuC submission support in the i915

2021-05-10 Thread Francisco Jerez
Daniel Vetter  writes:

> On Mon, May 10, 2021 at 3:55 PM Martin Peres  wrote:
>>
>> On 10/05/2021 02:11, Jason Ekstrand wrote:
>> > On May 9, 2021 12:12:36 Martin Peres  wrote:
>> >
>> >> Hi,
>> >>
>> >> On 06/05/2021 22:13, Matthew Brost wrote:
>> >>> Basic GuC submission support. This is the first bullet point in the
>> >>> upstreaming plan covered in the following RFC [1].
>> >>>
>> >>> At a very high level the GuC is a piece of firmware which sits between
>> >>> the i915 and the GPU. It offloads some of the scheduling of contexts
>> >>> from the i915 and programs the GPU to submit contexts. The i915
>> >>> communicates with the GuC and the GuC communicates with the GPU.
>> >>
>> >> May I ask what will GuC command submission do that execlist won't/can't
>> >> do? And what would be the impact on users? Even forgetting the troubled
>> >> history of GuC (instability, performance regression, poor level of user
>> >> support, 6+ years of trying to upstream it...), adding this much code
>> >> and doubling the amount of validation needed should come with a
>> >> rationale making it feel worth it... and I am not seeing here. Would you
>> >> mind providing the rationale behind this work?
>> >>
>> >>>
>> >>> GuC submission will be disabled by default on all current upstream
>> >>> platforms behind a module parameter - enable_guc. A value of 3 will
>> >>> enable submission and HuC loading via the GuC. GuC submission should
>> >>> work on all gen11+ platforms assuming the GuC firmware is present.
>> >>
>> >> What is the plan here when it comes to keeping support for execlist? I
>> >> am afraid that landing GuC support in Linux is the first step towards
>> >> killing the execlist, which would force users to use proprietary
>> >> firmwares that even most Intel engineers have little influence over.
>> >> Indeed, if "drm/i915/guc: Disable semaphores when using GuC scheduling"
>> >> which states "Disable semaphores when using GuC scheduling as semaphores
>> >> are broken in the current GuC firmware." is anything to go by, it means
>> >> that even Intel developers seem to prefer working around the GuC
>> >> firmware, rather than fixing it.
>> >
>> > Yes, landing GuC support may be the first step in removing execlist
>> > support. The inevitable reality is that GPU scheduling is coming and
> > likely to be the only path in the not-too-distant future. (See also
>> > the ongoing thread with AMD about fences.) I'm not going to pass
>> > judgement on whether or not this is a good thing.  I'm just reading the
>> > winds and, in my view, this is where things are headed for good or ill.
>> >
>> > In answer to the question above, the answer to "what do we gain from
>> > GuC?" may soon be, "you get to use your GPU."  We're not there yet and,
>> > again, I'm not necessarily advocating for it, but that is likely where
>> > things are headed.
>>
>> This will be a sad day, especially since it seems fundamentally opposed
>> to any long-term support, on top of taking away user freedom to
>> fix/tweak their system when Intel won't.
>>
>> > A firmware-based submission model isn't a bad design IMO and, aside from
>> > the firmware freedom issues, I think there are actual advantages to the
>> > model. Immediately, it'll unlock a few features like parallel submission
>> > (more on that in a bit) and long-running compute because they're
>> > implemented in GuC and the work to implement them properly in the
>> > execlist scheduler is highly non-trivial. Longer term, it may (no
>> > guarantees) unlock some performance by getting the kernel out of the way.
>>
>> Oh, I definitely agree with firmware-based submission model not being a
>> bad design. I was even cheering for it in 2015. Experience with it made
>> me regret that deeply since :s
>>
>> But with the DRM scheduler being responsible for most things, I fail to
>> see what we could offload in the GuC except context switching (like
>> every other manufacturer). The problem is, the GuC does way more than
>> just switching registers in bulk, and if the number of revisions of the
>> GuC is anything to go by, it is way too complex for me to feel
>> comfortable with it.
>
> We need to flesh out that part of the plan more, but we're not going
> to use drm scheduler for everything. It's only to handle the dma-fence
> legacy side of things, which means:
> - timeout handling for batches that take too long
> - dma_fence dependency sorting/handling
> - boosting of context from display flips (currently missing, needs to
> be ported from drm/i915)
>
> The actual round-robin/preempt/priority handling is still left to the
> backend, in this case here the fw. So there's large chunks of
> code/functionality where drm/scheduler won't be involved, and like
> Jason says: the hw winds definitely blow in the direction
> that this is all handled in hw.
>

I agree with Martin on this.  Given that using GuC currently involves
making your open-source graphics stack rely on a closed-source

Re: [PATCH v6 02/16] drm/ttm: Expose ttm_tt_unpopulate for driver use

2021-05-10 Thread Andrey Grodzovsky



On 2021-05-10 2:27 p.m., Felix Kuehling wrote:

On 2021-05-10 12:36 p.m., Andrey Grodzovsky wrote:

It's needed to drop iommu backed pages on device unplug
before device's IOMMU group is released.

I don't see any calls to ttm_tt_unpopulate in the rest of the series
now. Is that an accident, or can this patch be dropped?

Regards,
   Felix



You are right, it can be dropped because it's not required post 5.11 kernel
(at least not in the use cases I tested).

Andrey





Signed-off-by: Andrey Grodzovsky 
---
  drivers/gpu/drm/ttm/ttm_tt.c | 1 +
  1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/ttm/ttm_tt.c b/drivers/gpu/drm/ttm/ttm_tt.c
index 539e0232cb3b..dfbe1ea8763f 100644
--- a/drivers/gpu/drm/ttm/ttm_tt.c
+++ b/drivers/gpu/drm/ttm/ttm_tt.c
@@ -433,3 +433,4 @@ void ttm_tt_mgr_init(unsigned long num_pages, unsigned long 
num_dma32_pages)
if (!ttm_dma32_pages_limit)
ttm_dma32_pages_limit = num_dma32_pages;
  }
+EXPORT_SYMBOL(ttm_tt_unpopulate);


Re: [PATCH v6 02/16] drm/ttm: Expose ttm_tt_unpopulate for driver use

2021-05-10 Thread Felix Kuehling
On 2021-05-10 12:36 p.m., Andrey Grodzovsky wrote:
> It's needed to drop iommu backed pages on device unplug
> before device's IOMMU group is released.

I don't see any calls to ttm_tt_unpopulate in the rest of the series
now. Is that an accident, or can this patch be dropped?

Regards,
  Felix


>
> Signed-off-by: Andrey Grodzovsky 
> ---
>  drivers/gpu/drm/ttm/ttm_tt.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/drivers/gpu/drm/ttm/ttm_tt.c b/drivers/gpu/drm/ttm/ttm_tt.c
> index 539e0232cb3b..dfbe1ea8763f 100644
> --- a/drivers/gpu/drm/ttm/ttm_tt.c
> +++ b/drivers/gpu/drm/ttm/ttm_tt.c
> @@ -433,3 +433,4 @@ void ttm_tt_mgr_init(unsigned long num_pages, unsigned 
> long num_dma32_pages)
>   if (!ttm_dma32_pages_limit)
>   ttm_dma32_pages_limit = num_dma32_pages;
>  }
> +EXPORT_SYMBOL(ttm_tt_unpopulate);


Re: [PATCH] component: Move host device to end of device lists on binding

2021-05-10 Thread Daniel Vetter
On Mon, May 10, 2021 at 7:52 PM Stephen Boyd  wrote:
>
> Quoting Daniel Vetter (2021-05-10 09:05:21)
> > On Sat, May 08, 2021 at 12:41:18AM -0700, Stephen Boyd wrote:
> > > The device lists are poorly ordered when the component device code is
> > > used. This is because component_master_add_with_match() returns 0
> > > regardless of component devices calling component_add() first. It can
> > > really only fail if an allocation fails, in which case everything is
> > > going bad and we're out of memory. The host device (called master_dev in
> > > the code), can succeed at probe and be put on the device lists before
> > > any of the component devices are probed and put on the lists.
> > >
> > > Within the component device framework this usually isn't that bad
> > > because the real driver work is done at bind time via
> > > component{,master}_ops::bind(). It becomes a problem when the driver
> > > core, or host driver, wants to operate on the component device outside
> > > of the bind/unbind functions, e.g. via 'remove' or 'shutdown'. The
> > > driver core doesn't understand the relationship between the host device
> > > and the component devices and could possibly try to operate on component
> > > devices when they're already removed from the system or shut down.
> > >
> > > Normally, device links or probe defer would reorder the lists and put
> > > devices that depend on other devices in the lists at the correct
> > > location, but with component devices this doesn't happen because this
> > > information isn't expressed anywhere. Drivers simply succeed at
> > > registering their component or host with the component framework and
> > > wait for their bind() callback to be called once the other components
> > > are ready. We could make various device links between 'master_dev' and
> > > 'component->dev' but it's not necessary. Let's simply move the hosting
> > > device to the end of the device lists when the component device fully
> > > binds. This way we know that all components are present and have probed
> > > properly and now the host device has really probed so it's safe to
> > > assume the host driver ops can operate on any component device.
> > >
> > > This fixes the msm display driver shutdown path when the DSI controller
> > > is connected to a DSI bridge that is controlled via i2c. In this case,
> > > the msm display driver wants to tear down the display pipeline on
> > > shutdown at msm_pdev_shutdown() by calling drm_atomic_helper_shutdown(),
> > > and it can't do that unless the whole display chain is still probed and
> > > active in the system. When a display bridge is on i2c, the i2c device
> > > for the bridge will be created whenever the i2c controller probes, which
> > > could be before or after the msm display driver probes. If the i2c
> > > controller probes after the display driver, then the i2c controller will
> > > be shutdown before the display controller during system wide shutdown
> > > and thus i2c transactions will stop working before the display pipeline
> > > is shut down. This means we'll have the display bridge trying to access
> > > an i2c bus that's shut down because drm_atomic_helper_shutdown() is
> > > trying to disable the bridge after the bridge is off.
> > >
> > > Moving the host device to the end of the lists at bind time moves the
> > > drm_atomic_helper_shutdown() call before the i2c bus is shutdown.
> > > This fixes the immediate problem, but we could improve it a bit by
> > > modeling device links from the component devices to the host device
> > > indicating that they supply something, although it is slightly different
> > > because the consumer doesn't need the suppliers to probe to succeed.
> > >
> > > Cc: "Rafael J. Wysocki" 
> > > Cc: Daniel Vetter 
> > > Cc: Russell King 
> > > Cc: Rob Clark 
> > > Cc: 
> > > Signed-off-by: Stephen Boyd 
> >
> > Entirely aside, but an s/master/aggregate/ or similar over the entire
> > component.c codebase would help a pile in making it easier to understand
> > which part does what. Or at least I'm always terribly confused about which
> > bind binds what and all that, so maybe an additional review whether we
> > have a clear split into aggregate and individual components after that
> > initial fix is needed.
>
> Agreed.
>
> >
> > On the actual topic: I agree there's a problem here, but I'm honestly not
> > sure how it should be fixed. That's way over my understanding of all the
> > device probe and pm interactions. Of which there are plenty.
> >
> > One question I have: Why is the bridge component driver not correctly
> > ordered wrt the i2c driver it needs? The idea is that the aggregate driver
> > doesn't access any hw itself, but entirely relies on all its components.
> > So as long as all the component drivers are sorted correctly in the device
> > list, things /should/ work. And as soon as we drop out a single component,
> > the aggregate gets unbound (and then does all the
> > drm_atomic_helper_shutdown and all the other drm teardown). So is the bug
> > perhaps that msm does the drm teardown in the wrong callback?

Re: [PULL] topic/iomem-mmap-vs-gup

2021-05-10 Thread Paolo Bonzini

On 10/05/21 19:57, Sean Christopherson wrote:

+Paolo

On Mon, May 10, 2021, Jason Gunthorpe wrote:

On Mon, May 10, 2021 at 04:55:39PM +0200, Daniel Vetter wrote:


yeah vfio is still broken for the case I care about. I think there's
also some questions open still about whether kvm really uses
mmu_notifier in all cases correctly,


IIRC kvm doesn't either.


Yep, KVM on x86 has a non-trivial number of flows that don't properly hook into
the mmu_notifier.  Paolo is working on fixing the problem, but I believe the
rework won't be ready until 5.14.


Yeah, I like the way it's coming, but I'm at 20-ish patches and counting.

Paolo



Re: [RFC] Implicit vs explicit user fence sync

2021-05-10 Thread Christian König

On 04.05.21 17:11, Daniel Vetter wrote:

On Tue, May 04, 2021 at 04:26:42PM +0200, Christian König wrote:

Hi Daniel,

On 04.05.21 16:15, Daniel Vetter wrote:

Hi Christian,

On Tue, May 04, 2021 at 03:27:17PM +0200, Christian König wrote:

Hi guys,

with this patch set I want to look into how much additional work it
would be to support implicit sync compared to only explicit sync.

Turned out that this is much simpler than expected since the only
addition is that before a command submission or flip the kernel and
classic drivers would need to wait for the user fence to signal before
taking any locks.

It's a lot more I think
- sync_file/drm_syncobj still need to be supported somehow

You need that with explicit fences as well.

I'm just concentrating on what extra burden implicit sync would get us.

It's not just implicit sync. Currently the best approach we have for
explicit sync is hiding them in drm_syncobj. Because for that all the work
with intentional stall points and userspace submit thread already exists.

None of this work has been done for sync_file. And looking at how much
work it was to get drm_syncobj going, that will be anything but easy.


I don't think we will want this for sync_file in the first place.


- we need userspace to handle the stall in a submit thread at least
- there's nothing here that sets the sync object
- implicit sync isn't just execbuf, it's everything. E.g. the various
wait_bo ioctl also need to keep working, including timeout and
everything

Good point, but that should be relatively easy to add as well.


- we can't stall in atomic kms where you're currently stalling, that's for
sure. The uapi says "we're not stalling for fences in there", and you're
breaking that.

Again as far as I can see we run into the same problem with explicit sync.

So the question is where could we block for atomic modeset for user fences
in general?

Nah, I have an idea. But it only works if userspace is aware, because the
rules are essentially:

- when you supply a userspace in-fence, then you only get a userspace
   out-fence
- mixing in fences between dma-fence and user fence is ok
- mixing out fences isn't

And we currently do have sync_file out fence. So it's not possible to
support implicit user fence in atomic in a way which doesn't break the
uapi somewhere.

Doing the explicit user fence support first will make that very obvious.

And that's just the one ioctl I know is big trouble, I'm sure we'll find
more funny corner cases when we roll out explicit user fencing.


I think we can just ignore sync_file. As far as it concerns me that UAPI 
is pretty much dead.


What we should support is drm_syncobj, but that also only as an in-fence 
since that's what our hardware supports.



Anotherone that looks very sketchy right now is buffer sharing between
different userspace drivers, like compute <-> media (if you have some
fancy AI pipeline in your media workload, as an example).


Yeah, we are certainly going to get that. But only inside the same 
driver, so not much of a problem.





- ... at this point I stopped pondering but there's definitely more

Imo the only way we'll even get this complete is if we do the following:
1. roll out implicit sync with userspace fences on a driver-by-driver basis

s/implicit/explicit/

But I think you got that.


 1a. including all the winsys/modeset stuff

Completely agree, that's why I've split that up into individual patches.

I'm also fine if drivers can just opt out of user fence based
synchronization and we return an error from dma_buf_dynamic_attach() if some
driver says it can't handle that.

Yeah, but that boils down to us just breaking those use-cases. Which is
exactly what you're trying to avoid by rolling out implicit user fence I
think.


But we can add support to all drivers as necessary.




2. roll out support for userspace fences to drm_syncobj timeline for
 interop, both across process/userspace and across drivers
 2a. including all the winsys/modeset stuff, but hopefully that's
 largely solved with 1. already.

Correct, but again we need this for explicit fencing as well.


3. only then try to figure out how to retroshoehorn this into implicit
 sync, and whether that even makes sense.

Because doing 3 before we've done 1&2 for at least 2 drivers (2 because
interop fun across drivers) is just praying that this time around we're
not collectively idiots and can correctly predict the future. That never
worked :-)


For this prototype this patch set doesn't implement any user fence
synchronization at all, but just assumes that faulting user pages is
sufficient to make sure that we can wait for user space to finish
submitting the work. If necessary this can be made even more strict, the
only use case I could find which blocks this is the radeon driver and
that should be handle able.

This of course doesn't give you the same semantic as the classic
implicit sync to 

Re: [PATCH V3 2/2] drm/bridge: ti-sn65dsi83: Add TI SN65DSI83 and SN65DSI84 driver

2021-05-10 Thread Dave Stevenson
On Mon, 10 May 2021 at 12:16, Marek Vasut  wrote:
>
> On 5/10/21 11:58 AM, Dave Stevenson wrote:
> > On Sat, 8 May 2021 at 21:26, Marek Vasut  wrote:
> >>
> >> On 5/7/21 2:48 PM, Dave Stevenson wrote:
> >>
> >> [...]
> >>
>  +static void sn65dsi83_enable(struct drm_bridge *bridge)
>  +{
>  +   struct sn65dsi83 *ctx = bridge_to_sn65dsi83(bridge);
>  +   unsigned int pval;
>  +   u16 val;
>  +   int ret;
>  +
>  +   /* Clear reset, disable PLL */
>  +   regmap_write(ctx->regmap, REG_RC_RESET, 0x00);
>  +   regmap_write(ctx->regmap, REG_RC_PLL_EN, 0x00);
> >>>
> >>> Sorry, a further thread of discussion coming from the investigations
> >>> I've been involved with.
> >>>
> >>> You've powered up in pre_enable, and are sending the I2C writes in enable.
> >>>
> >>> From the docs for drm_bridge_funcs->enable[1]
> >>>
> >>>* The bridge can assume that the display pipe (i.e. clocks and timing
> >>>* signals) feeding it is running when this callback is called. This
> >>>* callback must enable the display link feeding the next bridge in the
> >>>* chain if there is one.
> >>>
> >>> So video is running when enable is called, and the DSI data lanes may
> >>> be HS. (Someone correct me if that is an incorrect reading of the
> >>> text).
> >>>
> >>> The SN65DSI84 datasheet table 7-2 Initialization Sequence gives init
> >>> seq 8 as being "Change DSI data lanes to HS state and start DSI video
> >>> stream", AFTER all the I2C has been completed except reading back
> >>> registers and checking for errors.
> >>> With video running you don't fulfil the second part of init seq 2 "the
> >>> DSI data lanes MUST be driven to LP11 state"
> >>>
> >>> My investigations have been over delaying starting the DSI video
> >>> stream until after enable, but reading the descriptive text for enable
> >>> I believe the Pi is correct to be sending video at that point.
> >>> I guess there is some ambiguity as to whether the clock lane is going
> >>> to be in HS mode during pre_enable. On the Pi the PHY and clocks will
> >>> be enabled prior to pre_enable to allow for sending DSI commands
> >>> during pre_enable, but it may not be true on other platforms.
> >>
> >> You have to make sure the clock lane is running and in HS mode when
> >> configuring the DSI83, otherwise the internal DSI83 state machine won't
> >> be able to operate.
> >
> > Indeed, but my reading of the documentation says that neither
> > pre_enable nor enable give you the state that you require.
> > You need a hook in the middle, an option to ask for clock lanes during
> > pre_enable or no video during enable, or an amendment to the docs over
> > the state during enable.
> >
> > Having the data lanes in HS mode does appear to stop the DSI83
> > accepting the I2C setup commands.
>
> Uhh, that is new. Is that what you observed in your lab ?
>
> I saw the DSI83 behave this way if the clock lane was stopped, but the
> data lanes had no impact. Was your clock lane running when the DSI83 was
> not accepting i2c commands ? Does your DSI83 source clock from it or
> from external Xtal ?

I haven't got into the lab as yet, and I don't have a DSI83 myself.
This is relaying experimentation from others.
They're using the DSI clock lane as the clock source. Yes, the clock
lane on the Pi is started before any of the enable bridge calls.

In the vc4 driver[1] it runs through all the pre-enables, configures
register DISP0_CTRL including setting bit DSI_DISP0_ENABLE which
starts it requesting pixels from the pipeline, and then calls all the
enables. With that behaviour it fails to start the DSI83.

If the DSI83 I2C setup code is moved from enable to pre_enable then it
works, or if patch [2] is used to move the setting of the
DSI_DISP0_ENABLE bit to after enable it also works.

Sorry life is all rather up in the air with working from home. I'll go
into the lab and try to confirm that DSI_DISP0_ENABLE does what the
documentation implies it does.
Those who do have hardware now have it working on the Pi, although
with a version of Jagan's driver rather than yours. We're trying to
figure out the diffs with yours.

If you have it working reliably on other platforms that you believe
are following the docs during pre_enable and enable, then I'm happy to
drop out of the discussions for now. We can revisit it once we have
determined exactly why it's being fussy on the Pi.

Cheers
  Dave

[1] 
https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/vc4/vc4_dsi.c#L1072
[2] 
https://github.com/6by9/linux/commit/b939eaffc47cc84ebfea6bf1ab10ae1ec9fa58c2


Re: [PATCH] component: Move host device to end of device lists on binding

2021-05-10 Thread Stephen Boyd
Quoting Daniel Vetter (2021-05-10 09:05:21)
> On Sat, May 08, 2021 at 12:41:18AM -0700, Stephen Boyd wrote:
> > The device lists are poorly ordered when the component device code is
> > used. This is because component_master_add_with_match() returns 0
> > regardless of component devices calling component_add() first. It can
> > really only fail if an allocation fails, in which case everything is
> > going bad and we're out of memory. The host device (called master_dev in
> > the code), can succeed at probe and be put on the device lists before
> > any of the component devices are probed and put on the lists.
> >
> > Within the component device framework this usually isn't that bad
> > because the real driver work is done at bind time via
> > component{,master}_ops::bind(). It becomes a problem when the driver
> > core, or host driver, wants to operate on the component device outside
> > of the bind/unbind functions, e.g. via 'remove' or 'shutdown'. The
> > driver core doesn't understand the relationship between the host device
> > and the component devices and could possibly try to operate on component
> > devices when they're already removed from the system or shut down.
> >
> > Normally, device links or probe defer would reorder the lists and put
> > devices that depend on other devices in the lists at the correct
> > location, but with component devices this doesn't happen because this
> > information isn't expressed anywhere. Drivers simply succeed at
> > registering their component or host with the component framework and
> > wait for their bind() callback to be called once the other components
> > are ready. We could make various device links between 'master_dev' and
> > 'component->dev' but it's not necessary. Let's simply move the hosting
> > device to the end of the device lists when the component device fully
> > binds. This way we know that all components are present and have probed
> > properly and now the host device has really probed so it's safe to
> > assume the host driver ops can operate on any component device.
> >
> > This fixes the msm display driver shutdown path when the DSI controller
> > is connected to a DSI bridge that is controlled via i2c. In this case,
> > the msm display driver wants to tear down the display pipeline on
> > shutdown at msm_pdev_shutdown() by calling drm_atomic_helper_shutdown(),
> > and it can't do that unless the whole display chain is still probed and
> > active in the system. When a display bridge is on i2c, the i2c device
> > for the bridge will be created whenever the i2c controller probes, which
> > could be before or after the msm display driver probes. If the i2c
> > controller probes after the display driver, then the i2c controller will
> > be shutdown before the display controller during system wide shutdown
> > and thus i2c transactions will stop working before the display pipeline
> > is shut down. This means we'll have the display bridge trying to access
> > an i2c bus that's shut down because drm_atomic_helper_shutdown() is
> > trying to disable the bridge after the bridge is off.
> >
> > Moving the host device to the end of the lists at bind time moves the
> > drm_atomic_helper_shutdown() call before the i2c bus is shutdown.
> > This fixes the immediate problem, but we could improve it a bit by
> > modeling device links from the component devices to the host device
> > indicating that they supply something, although it is slightly different
> > because the consumer doesn't need the suppliers to probe to succeed.
> >
> > Cc: "Rafael J. Wysocki" 
> > Cc: Daniel Vetter 
> > Cc: Russell King 
> > Cc: Rob Clark 
> > Cc: 
> > Signed-off-by: Stephen Boyd 
>
> Entirely aside, but an s/master/aggregate/ or similar over the entire
> component.c codebase would help a pile in making it easier to understand
> which part does what. Or at least I'm always terribly confused about which
> bind binds what and all that, so maybe an additional review whether we
> have a clear split into aggregate and individual components after that
> initial fix is needed.

Agreed.

>
> On the actual topic: I agree there's a problem here, but I'm honestly not
> sure how it should be fixed. That's way over my understanding of all the
> device probe and pm interactions. Of which there are plenty.
>
> One question I have: Why is the bridge component driver not correctly
> ordered wrt the i2c driver it needs? The idea is that the aggregate driver
> doesn't access any hw itself, but entirely relies on all its components.
> So as long as all the component drivers are sorted correctly in the device
> list, things /should/ work. And as soon as we drop out a single component,
> the aggregate gets unbound (and then does all the
> drm_atomic_helper_shutdown and all the other drm teardown). So is the bug
> perhaps that msm does the drm teardown in the wrong callback?

I see my explanation of the problem wasn't sufficient :|

The bridge driver is not a component device. It is connected to 

Re: [PATCH 1/2] drm: Fix dirtyfb stalls

2021-05-10 Thread Daniel Vetter
On Mon, May 10, 2021 at 6:51 PM Rob Clark  wrote:
>
> On Mon, May 10, 2021 at 9:14 AM Daniel Vetter  wrote:
> >
> > On Sat, May 08, 2021 at 12:56:38PM -0700, Rob Clark wrote:
> > > From: Rob Clark 
> > >
> > > drm_atomic_helper_dirtyfb() will end up stalling for vblank on "video
> > > mode" type displays, which is pointless and unnecessary.  Add an
> > > optional helper vfunc to determine if a plane is attached to a CRTC
> > > that actually needs dirtyfb, and skip over them.
> > >
> > > Signed-off-by: Rob Clark 
> >
> > So this is a bit annoying because the idea of all these "remap legacy uapi
> > to atomic constructs" helpers is that they shouldn't need/use anything
> > beyond what userspace also has available. So adding hacks for them feels
> > really bad.
>
> I suppose the root problem is that userspace doesn't know if dirtyfb
> (or similar) is actually required or is a no-op.
>
> But it is perhaps less of a problem because this essentially boils
> down to "x11 vs wayland", and it seems like wayland compositors for
> non-vsync'd rendering just pageflip and throw away extra frames from
> the app?

Yeah it's about not adequately batching up rendering and syncing with
hw. bare metal x11 is just especially stupid about it :-)

> > Also I feel like it's not entirely the right thing to do here either.
> > We've had this problem already on the fbcon emulation side (which also
> > shouldn't be able to peek behind the atomic kms uapi curtain), and the fix
> > there was to have a worker which batches up all the updates and avoids any
> > stalls in bad places.
>
> I'm not too worried about fbcon not being able to render faster than
> vblank.  OTOH it is a pretty big problem for x11

That's why we'd let the worker get ahead at most one dirtyfb. We do
the same with fbcon, which trivially can get ahead of vblank otherwise
(it sometimes flushes each character, so you have to pile them up into
a single update if that's still pending).
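Roughly this shape, if it helps -- sketch only, all the dirtyfb_worker
names are made up, the "stall until the worker picked up the previous
fb" step and most locking details are hand-waved, and
drm_atomic_helper_dirtyfb() is the existing helper doing the actual
(possibly vblank-stalling) commit:

#include <linux/minmax.h>
#include <linux/spinlock.h>
#include <linux/workqueue.h>
#include <drm/drm_damage_helper.h>
#include <drm/drm_framebuffer.h>

struct dirtyfb_worker {
	struct work_struct work;	/* INIT_WORK(&w->work, dirtyfb_work_fn) */
	spinlock_t lock;		/* spin_lock_init(&w->lock) at setup */
	struct drm_framebuffer *fb;	/* at most one pending fb */
	struct drm_clip_rect bounds;	/* merged bounding box */
};

/* ioctl side: record the damage and kick the worker, never block here */
static void dirtyfb_queue(struct dirtyfb_worker *w,
			  struct drm_framebuffer *fb,
			  struct drm_clip_rect *clip)
{
	spin_lock(&w->lock);
	if (w->fb == fb) {
		/* same fb still pending: just grow the bounding box */
		w->bounds.x1 = min(w->bounds.x1, clip->x1);
		w->bounds.y1 = min(w->bounds.y1, clip->y1);
		w->bounds.x2 = max(w->bounds.x2, clip->x2);
		w->bounds.y2 = max(w->bounds.y2, clip->y2);
	} else {
		/* first (or different) fb: fall back to a full update; the
		 * real thing would first wait for the worker to pick up
		 * the previous fb, as described above */
		w->fb = fb;
		w->bounds = (struct drm_clip_rect){
			.x1 = 0, .y1 = 0,
			.x2 = fb->width, .y2 = fb->height,
		};
	}
	spin_unlock(&w->lock);
	schedule_work(&w->work);
}

/* worker side: the potentially stalling commit happens here */
static void dirtyfb_work_fn(struct work_struct *work)
{
	struct dirtyfb_worker *w =
		container_of(work, struct dirtyfb_worker, work);
	struct drm_framebuffer *fb;
	struct drm_clip_rect clip;

	spin_lock(&w->lock);
	fb = w->fb;
	clip = w->bounds;
	w->fb = NULL;
	spin_unlock(&w->lock);

	if (fb)
		drm_atomic_helper_dirtyfb(fb, NULL, 0, 0, &clip, 1);
}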

> > Since this is for frontbuffer rendering userspace only we can probably get
> > away with assuming there's only a single fb, so the implementation becomes
> > pretty simple:
> >
> > - 1 worker, and we keep track of a single pending fb
> > - if there's already a dirty fb pending on a different fb, we stall for
> >   the worker to start processing that one already (i.e. the fb we track is
> >   reset to NULL)
> > - if it's pending on the same fb we just toss away all the updates and go
> >   with a full update, since merging the clip rects is too much work :-) I
> >   think there's helpers so you could be slightly more clever and just have
> >   an overall bounding box
>
> This doesn't really fix the problem, you still end up delaying sending
> the next back-buffer to mesa

With this the dirtyfb would never block. Also glorious frontbuffer
tracking corruption is possible, but that's not the kernel's problem.
So how would anything get held up in userspace.

> But we could re-work drm_framebuffer_funcs::dirty to operate on a
> per-crtc basis and hoist the loop and check if dirtyfb is needed out
> of drm_atomic_helper_dirtyfb()

That's still using information that userspace doesn't have, which is a
bit irky. We might as well go with your thing here then.
-Daniel

> BR,
> -R
>
> >
> > Could probably steal most of the implementation.
> >
> > This approach here feels a tad too much in the hacky area ...
> >
> > Thoughts?
> > -Daniel
> >
> > > ---
> > >  drivers/gpu/drm/drm_damage_helper.c  |  8 
> > >  include/drm/drm_modeset_helper_vtables.h | 14 ++
> > >  2 files changed, 22 insertions(+)
> > >
> > > diff --git a/drivers/gpu/drm/drm_damage_helper.c 
> > > b/drivers/gpu/drm/drm_damage_helper.c
> > > index 3a4126dc2520..a0bed1a2c2dc 100644
> > > --- a/drivers/gpu/drm/drm_damage_helper.c
> > > +++ b/drivers/gpu/drm/drm_damage_helper.c
> > > @@ -211,6 +211,7 @@ int drm_atomic_helper_dirtyfb(struct drm_framebuffer 
> > > *fb,
> > >  retry:
> > >   drm_for_each_plane(plane, fb->dev) {
> > >   struct drm_plane_state *plane_state;
> > > + struct drm_crtc *crtc;
> > >
> > >   ret = drm_modeset_lock(&plane->mutex, state->acquire_ctx);
> > >   if (ret)
> > > @@ -221,6 +222,13 @@ int drm_atomic_helper_dirtyfb(struct drm_framebuffer 
> > > *fb,
> > >   continue;
> > >   }
> > >
> > > + crtc = plane->state->crtc;
> > > + if (crtc->helper_private->needs_dirtyfb &&
> > > + !crtc->helper_private->needs_dirtyfb(crtc)) 
> > > {
> > > + drm_modeset_unlock(&plane->mutex);
> > > + continue;
> > > + }
> > > +
> > >   plane_state = drm_atomic_get_plane_state(state, plane);
> > >   if (IS_ERR(plane_state)) {
> > >   ret = PTR_ERR(plane_state);
> > > diff --git a/include/drm/drm_modeset_helper_vtables.h 
> > > b/include/drm/drm_modeset_helper_vtables.h
> > > index 

Re: [PATCH v2 1/1] drm/msm/dpu: Fix error return code in dpu_mdss_init()

2021-05-10 Thread Stephen Boyd
Quoting Zhen Lei (2021-05-09 23:38:05)
> The error code returned by platform_get_irq() is stored in 'irq', but it
> is never copied to 'ret' before returning. As a result, the initial
> value 0 of 'ret' is returned incorrectly.
>
> After the above fix is completed, initializing the local variable 'ret'
> to 0 is no longer needed, remove it.
>
> In addition, when dpu_mdss_init() returns successfully, the value of
> 'ret' is always 0. Therefore, replace "return ret" with "return 0" to make
> the code clearer.
>
> Fixes: 070e64dc1bbc ("drm/msm/dpu: Convert to a chained irq chip")
> Reported-by: Hulk Robot 
> Signed-off-by: Zhen Lei 
> ---

Reviewed-by: Stephen Boyd 
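Not the actual diff, but the shape of the fix being described is roughly
this (the function name is made up; only the platform_get_irq() error
handling is the point):

#include <linux/platform_device.h>

static int dpu_mdss_irq_init_sketch(struct platform_device *pdev)
{
	int irq = platform_get_irq(pdev, 0);

	if (irq < 0)
		return irq;	/* propagate the error; previously the
				 * initialised-to-zero 'ret' leaked out */

	/* ... irq chip / domain setup as in dpu_mdss_init() ... */

	return 0;		/* success path returns 0 explicitly */
}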


Re: [PATCH 1/2] drm: Fix dirtyfb stalls

2021-05-10 Thread Rob Clark
On Mon, May 10, 2021 at 9:14 AM Daniel Vetter  wrote:
>
> On Sat, May 08, 2021 at 12:56:38PM -0700, Rob Clark wrote:
> > From: Rob Clark 
> >
> > drm_atomic_helper_dirtyfb() will end up stalling for vblank on "video
> > mode" type displays, which is pointless and unnecessary.  Add an
> > optional helper vfunc to determine if a plane is attached to a CRTC
> > that actually needs dirtyfb, and skip over them.
> >
> > Signed-off-by: Rob Clark 
>
> So this is a bit annoying because the idea of all these "remap legacy uapi
> to atomic constructs" helpers is that they shouldn't need/use anything
> beyond what userspace also has available. So adding hacks for them feels
> really bad.

I suppose the root problem is that userspace doesn't know if dirtyfb
(or similar) is actually required or is a no-op.

But it is perhaps less of a problem because this essentially boils
down to "x11 vs wayland", and it seems like wayland compositors for
non-vsync'd rendering just pageflip and throw away extra frames from
the app?

> Also I feel like it's not entirely the right thing to do here either.
> We've had this problem already on the fbcon emulation side (which also
> shouldn't be able to peek behind the atomic kms uapi curtain), and the fix
> there was to have a worker which batches up all the updates and avoids any
> stalls in bad places.

I'm not too worried about fbcon not being able to render faster than
vblank.  OTOH it is a pretty big problem for x11

> Since this is for frontbuffer rendering userspace only we can probably get
> away with assuming there's only a single fb, so the implementation becomes
> pretty simple:
>
> - 1 worker, and we keep track of a single pending fb
> - if there's already a dirty fb pending on a different fb, we stall for
>   the worker to start processing that one already (i.e. the fb we track is
>   reset to NULL)
> - if it's pending on the same fb we just toss away all the updates and go
>   with a full update, since merging the clip rects is too much work :-) I
>   think there's helpers so you could be slightly more clever and just have
>   an overall bounding box

This doesn't really fix the problem, you still end up delaying sending
the next back-buffer to mesa

But we could re-work drm_framebuffer_funcs::dirty to operate on a
per-crtc basis and hoist the loop and check if dirtyfb is needed out
of drm_atomic_helper_dirtyfb()

BR,
-R

>
> Could probably steal most of the implementation.
>
> This approach here feels a tad too much in the hacky area ...
>
> Thoughts?
> -Daniel
>
> > ---
> >  drivers/gpu/drm/drm_damage_helper.c  |  8 
> >  include/drm/drm_modeset_helper_vtables.h | 14 ++
> >  2 files changed, 22 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/drm_damage_helper.c 
> > b/drivers/gpu/drm/drm_damage_helper.c
> > index 3a4126dc2520..a0bed1a2c2dc 100644
> > --- a/drivers/gpu/drm/drm_damage_helper.c
> > +++ b/drivers/gpu/drm/drm_damage_helper.c
> > @@ -211,6 +211,7 @@ int drm_atomic_helper_dirtyfb(struct drm_framebuffer 
> > *fb,
> >  retry:
> >   drm_for_each_plane(plane, fb->dev) {
> >   struct drm_plane_state *plane_state;
> > + struct drm_crtc *crtc;
> >
> >   ret = drm_modeset_lock(&plane->mutex, state->acquire_ctx);
> >   if (ret)
> > @@ -221,6 +222,13 @@ int drm_atomic_helper_dirtyfb(struct drm_framebuffer 
> > *fb,
> >   continue;
> >   }
> >
> > + crtc = plane->state->crtc;
> > + if (crtc->helper_private->needs_dirtyfb &&
> > + !crtc->helper_private->needs_dirtyfb(crtc)) {
> > + drm_modeset_unlock(&plane->mutex);
> > + continue;
> > + }
> > +
> >   plane_state = drm_atomic_get_plane_state(state, plane);
> >   if (IS_ERR(plane_state)) {
> >   ret = PTR_ERR(plane_state);
> > diff --git a/include/drm/drm_modeset_helper_vtables.h 
> > b/include/drm/drm_modeset_helper_vtables.h
> > index eb706342861d..afa8ec5754e7 100644
> > --- a/include/drm/drm_modeset_helper_vtables.h
> > +++ b/include/drm/drm_modeset_helper_vtables.h
> > @@ -487,6 +487,20 @@ struct drm_crtc_helper_funcs {
> >bool in_vblank_irq, int *vpos, int *hpos,
> >ktime_t *stime, ktime_t *etime,
> >const struct drm_display_mode *mode);
> > +
> > + /**
> > +  * @needs_dirtyfb
> > +  *
> > +  * Optional callback used by damage helpers to determine if 
> > fb_damage_clips
> > +  * update is needed.
> > +  *
> > +  * Returns:
> > +  *
> > +  * True if fb_damage_clips update is needed to handle DIRTYFB, False
> > +  * otherwise.  If this callback is not implemented, then True is
> > +  * assumed.
> > +  */
> > + bool (*needs_dirtyfb)(struct drm_crtc *crtc);
> >  };
> >
> >  /**
> > --
> > 2.30.2
> >
>
> --
> Daniel Vetter
> 

[PATCH v6 16/16] drm/amdgpu: Verify DMA opearations from device are done

2021-05-10 Thread Andrey Grodzovsky
In case device removal is just simulated by sysfs, verify that the
device doesn't keep doing DMA to the released memory after
pci_remove is done.

Signed-off-by: Andrey Grodzovsky 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 83006f45b10b..5e6af9e0b7bf 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -1314,7 +1314,13 @@ amdgpu_pci_remove(struct pci_dev *pdev)
drm_dev_unplug(dev);
amdgpu_driver_unload_kms(dev);
 
+   /*
+* Flush any in flight DMA operations from device.
+* Clear the Bus Master Enable bit and then wait on the PCIe Device
+* Status Transactions Pending bit.
+*/
pci_disable_device(pdev);
+   pci_wait_for_pending_transaction(pdev);
 }
 
 static void
-- 
2.25.1



[PATCH v6 14/16] drm/scheduler: Fix hang when sched_entity released

2021-05-10 Thread Andrey Grodzovsky
Problem: If the scheduler is already stopped by the time sched_entity
is released and the entity's job_queue is not empty, I encountered
a hang in drm_sched_entity_flush. This is because drm_sched_entity_is_idle
never becomes false.

Fix: In drm_sched_fini detach all sched_entities from the
scheduler's run queues. This will satisfy drm_sched_entity_is_idle.
Also wakeup all those processes stuck in sched_entity flushing
as the scheduler main thread which wakes them up is stopped by now.

v2:
Reverse order of drm_sched_rq_remove_entity and marking
s_entity as stopped to prevent reinsertion back to rq due
to race.

v3:
Drop drm_sched_rq_remove_entity, only modify entity->stopped
and check for it in drm_sched_entity_is_idle

Signed-off-by: Andrey Grodzovsky 
Reviewed-by: Christian König 
---
 drivers/gpu/drm/scheduler/sched_entity.c |  3 ++-
 drivers/gpu/drm/scheduler/sched_main.c   | 24 
 2 files changed, 26 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/scheduler/sched_entity.c 
b/drivers/gpu/drm/scheduler/sched_entity.c
index 0249c7450188..2e93e881b65f 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -116,7 +116,8 @@ static bool drm_sched_entity_is_idle(struct 
drm_sched_entity *entity)
rmb(); /* for list_empty to work without lock */
 
if (list_empty(&entity->list) ||
-   spsc_queue_count(&entity->job_queue) == 0)
+   spsc_queue_count(&entity->job_queue) == 0 ||
+   entity->stopped)
return true;
 
return false;
diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
b/drivers/gpu/drm/scheduler/sched_main.c
index 8d1211e87101..a2a953693b45 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -898,9 +898,33 @@ EXPORT_SYMBOL(drm_sched_init);
  */
 void drm_sched_fini(struct drm_gpu_scheduler *sched)
 {
+   struct drm_sched_entity *s_entity;
+   int i;
+
if (sched->thread)
kthread_stop(sched->thread);
 
+   for (i = DRM_SCHED_PRIORITY_COUNT - 1; i >= DRM_SCHED_PRIORITY_MIN; 
i--) {
+   struct drm_sched_rq *rq = &sched->sched_rq[i];
+
+   if (!rq)
+   continue;
+
+   spin_lock(&rq->lock);
+   list_for_each_entry(s_entity, &rq->entities, list)
+   /*
+* Prevents reinsertion and marks job_queue as idle,
+* it will be removed from rq in drm_sched_entity_fini
+* eventually
+*/
+   s_entity->stopped = true;
+   spin_unlock(&rq->lock);
+
+   }
+
+   /* Wakeup everyone stuck in drm_sched_entity_flush for this scheduler */
+   wake_up_all(&sched->job_scheduled);
+
/* Confirm no work left behind accessing device structures */
cancel_delayed_work_sync(&sched->work_tdr);
 
-- 
2.25.1



[PATCH v6 10/16] drm/amdgpu: Guard against write accesses after device removal

2021-05-10 Thread Andrey Grodzovsky
This should prevent writing to memory or IO ranges possibly
already allocated for other uses after our device is removed.

v5:
Protect more places where memcpy_to/from_io takes place
Protect IB submissions

v6: Switch to !drm_dev_enter instead of scoping entire code
with brackets.

Signed-off-by: Andrey Grodzovsky 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c| 11 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c   |  9 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c| 17 +++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c   | 63 +++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h   |  2 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c  | 70 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h  | 49 ++---
 drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c   | 31 +---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c   | 11 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c   | 22 --
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c|  7 +-
 drivers/gpu/drm/amd/amdgpu/psp_v11_0.c| 44 ++--
 drivers/gpu/drm/amd/amdgpu/psp_v12_0.c|  8 +--
 drivers/gpu/drm/amd/amdgpu/psp_v3_1.c |  8 +--
 drivers/gpu/drm/amd/amdgpu/vce_v4_0.c | 26 ---
 drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c | 22 +++---
 .../drm/amd/pm/powerplay/smumgr/smu7_smumgr.c |  2 +
 17 files changed, 257 insertions(+), 145 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index a0bff4713672..94c415176cdc 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -71,6 +71,8 @@
 #include 
 #include 
 
+#include <drm/drm_drv.h>
+
 MODULE_FIRMWARE("amdgpu/vega10_gpu_info.bin");
 MODULE_FIRMWARE("amdgpu/vega12_gpu_info.bin");
 MODULE_FIRMWARE("amdgpu/raven_gpu_info.bin");
@@ -281,7 +283,10 @@ void amdgpu_device_vram_access(struct amdgpu_device *adev, 
loff_t pos,
unsigned long flags;
uint32_t hi = ~0;
uint64_t last;
+   int idx;
 
+if (!drm_dev_enter(&adev->ddev, &idx))
+return;
 
 #ifdef CONFIG_64BIT
last = min(pos + size, adev->gmc.visible_vram_size);
@@ -299,8 +304,10 @@ void amdgpu_device_vram_access(struct amdgpu_device *adev, 
loff_t pos,
memcpy_fromio(buf, addr, count);
}
 
-   if (count == size)
+   if (count == size) {
+   drm_dev_exit(idx);
return;
+   }
 
pos += count;
buf += count / 4;
@@ -323,6 +330,8 @@ void amdgpu_device_vram_access(struct amdgpu_device *adev, 
loff_t pos,
*buf++ = RREG32_NO_KIQ(mmMM_DATA);
}
spin_unlock_irqrestore(&adev->mmio_idx_lock, flags);
+
+   drm_dev_exit(idx);
 }
 
 /*
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
index 4d32233cde92..04ba5eef1e88 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
@@ -31,6 +31,8 @@
 #include "amdgpu_ras.h"
 #include "amdgpu_xgmi.h"
 
+#include <drm/drm_drv.h>
+
 /**
  * amdgpu_gmc_pdb0_alloc - allocate vram for pdb0
  *
@@ -151,6 +153,10 @@ int amdgpu_gmc_set_pte_pde(struct amdgpu_device *adev, 
void *cpu_pt_addr,
 {
void __iomem *ptr = (void *)cpu_pt_addr;
uint64_t value;
+   int idx;
+
+   if (!drm_dev_enter(&adev->ddev, &idx))
+   return 0;
 
/*
 * The following is for PTE only. GART does not have PDEs.
@@ -158,6 +164,9 @@ int amdgpu_gmc_set_pte_pde(struct amdgpu_device *adev, void 
*cpu_pt_addr,
value = addr & 0xF000ULL;
value |= flags;
writeq(value, ptr + (gpu_page_idx * 8));
+
+   drm_dev_exit(idx);
+
return 0;
 }
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
index 148a3b481b12..62fcbd446c71 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
@@ -30,6 +30,7 @@
 #include 
 
 #include 
+#include <drm/drm_drv.h>
 
 #include "amdgpu.h"
 #include "atom.h"
@@ -137,7 +138,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned 
num_ibs,
bool secure;
 
unsigned i;
-   int r = 0;
+   int idx, r = 0;
bool need_pipe_sync = false;
 
if (num_ibs == 0)
@@ -169,13 +170,16 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned 
num_ibs,
return -EINVAL;
}
 
+   if (!drm_dev_enter(&adev->ddev, &idx))
+   return -ENODEV;
+
alloc_size = ring->funcs->emit_frame_size + num_ibs *
ring->funcs->emit_ib_size;
 
r = amdgpu_ring_alloc(ring, alloc_size);
if (r) {
dev_err(adev->dev, "scheduling IB failed (%d).\n", r);
-   return r;
+   goto exit;
}
 
need_ctx_switch = ring->current_ctx != fence_ctx;
@@ -205,7 +209,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, 

[PATCH v6 13/16] drm/amdgpu: Fix hang on device removal.

2021-05-10 Thread Andrey Grodzovsky
If removing while commands in flight you cannot wait to flush the
HW fences on a ring since the device is gone.

Signed-off-by: Andrey Grodzovsky 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 16 ++--
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
index 1ffb36bd0b19..fa03702ecbfb 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
@@ -36,6 +36,7 @@
 #include 
 #include 
 
+#include <drm/drm_drv.h>
 #include "amdgpu.h"
 #include "amdgpu_trace.h"
 
@@ -525,8 +526,7 @@ int amdgpu_fence_driver_init(struct amdgpu_device *adev)
  */
 void amdgpu_fence_driver_fini_hw(struct amdgpu_device *adev)
 {
-   unsigned i, j;
-   int r;
+   int i, r;
 
for (i = 0; i < AMDGPU_MAX_RINGS; i++) {
struct amdgpu_ring *ring = adev->rings[i];
@@ -535,11 +535,15 @@ void amdgpu_fence_driver_fini_hw(struct amdgpu_device 
*adev)
continue;
if (!ring->no_scheduler)
drm_sched_fini(&ring->sched);
-   r = amdgpu_fence_wait_empty(ring);
-   if (r) {
-   /* no need to trigger GPU reset as we are unloading */
+   /* You can't wait for HW to signal if it's gone */
+   if (!drm_dev_is_unplugged(&adev->ddev))
+   r = amdgpu_fence_wait_empty(ring);
+   else
+   r = -ENODEV;
+   /* no need to trigger GPU reset as we are unloading */
+   if (r)
amdgpu_fence_driver_force_completion(ring);
-   }
+
if (ring->fence_drv.irq_src)
amdgpu_irq_put(adev, ring->fence_drv.irq_src,
   ring->fence_drv.irq_type);
-- 
2.25.1



[PATCH v6 12/16] drm/amdgpu: Prevent any job recoveries after device is unplugged.

2021-05-10 Thread Andrey Grodzovsky
Return DRM_TASK_STATUS_ENODEV back to the scheduler when device
is not present so the timeout timer will not be rearmed.

v5: Update to match updated return values in enum drm_gpu_sched_stat

Signed-off-by: Andrey Grodzovsky 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 19 ---
 1 file changed, 16 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
index 759b34799221..d33e6d97cc89 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -25,6 +25,8 @@
 #include 
 #include 
 
+#include <drm/drm_drv.h>
+
 #include "amdgpu.h"
 #include "amdgpu_trace.h"
 
@@ -34,6 +36,15 @@ static enum drm_gpu_sched_stat amdgpu_job_timedout(struct 
drm_sched_job *s_job)
struct amdgpu_job *job = to_amdgpu_job(s_job);
struct amdgpu_task_info ti;
struct amdgpu_device *adev = ring->adev;
+   int idx;
+
+   if (!drm_dev_enter(&adev->ddev, &idx)) {
+   DRM_INFO("%s - device unplugged skipping recovery on 
scheduler:%s",
+__func__, s_job->sched->name);
+
+   /* Effectively the job is aborted as the device is gone */
+   return DRM_GPU_SCHED_STAT_ENODEV;
+   }
 
memset(&ti, 0, sizeof(struct amdgpu_task_info));
 
@@ -41,7 +52,7 @@ static enum drm_gpu_sched_stat amdgpu_job_timedout(struct 
drm_sched_job *s_job)
amdgpu_ring_soft_recovery(ring, job->vmid, s_job->s_fence->parent)) 
{
DRM_ERROR("ring %s timeout, but soft recovered\n",
  s_job->sched->name);
-   return DRM_GPU_SCHED_STAT_NOMINAL;
+   goto exit;
}
 
amdgpu_vm_get_task_info(ring->adev, job->pasid, &ti);
@@ -53,13 +64,15 @@ static enum drm_gpu_sched_stat amdgpu_job_timedout(struct 
drm_sched_job *s_job)
 
if (amdgpu_device_should_recover_gpu(ring->adev)) {
amdgpu_device_gpu_recover(ring->adev, job);
-   return DRM_GPU_SCHED_STAT_NOMINAL;
} else {
drm_sched_suspend_timeout(&ring->sched);
if (amdgpu_sriov_vf(adev))
adev->virt.tdr_debug = true;
-   return DRM_GPU_SCHED_STAT_NOMINAL;
}
+
+exit:
+   drm_dev_exit(idx);
+   return DRM_GPU_SCHED_STAT_NOMINAL;
 }
 
 int amdgpu_job_alloc(struct amdgpu_device *adev, unsigned num_ibs,
-- 
2.25.1



[PATCH v6 11/16] drm/sched: Make timeout timer rearm conditional.

2021-05-10 Thread Andrey Grodzovsky
We don't want to rearm the timer if driver hook reports
that the device is gone.

v5: Update drm_gpu_sched_stat values in code.

Signed-off-by: Andrey Grodzovsky 
---
 drivers/gpu/drm/scheduler/sched_main.c | 11 +++
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
b/drivers/gpu/drm/scheduler/sched_main.c
index f4f474944169..8d1211e87101 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -314,6 +314,7 @@ static void drm_sched_job_timedout(struct work_struct *work)
 {
struct drm_gpu_scheduler *sched;
struct drm_sched_job *job;
+   enum drm_gpu_sched_stat status = DRM_GPU_SCHED_STAT_NOMINAL;
 
sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work);
 
@@ -331,7 +332,7 @@ static void drm_sched_job_timedout(struct work_struct *work)
list_del_init(&job->list);
spin_unlock(&sched->job_list_lock);
 
-   job->sched->ops->timedout_job(job);
+   status = job->sched->ops->timedout_job(job);
 
/*
 * Guilty job did complete and hence needs to be manually 
removed
@@ -345,9 +346,11 @@ static void drm_sched_job_timedout(struct work_struct 
*work)
spin_unlock(&sched->job_list_lock);
}
 
-   spin_lock(&sched->job_list_lock);
-   drm_sched_start_timeout(sched);
-   spin_unlock(&sched->job_list_lock);
+   if (status != DRM_GPU_SCHED_STAT_ENODEV) {
+   spin_lock(&sched->job_list_lock);
+   drm_sched_start_timeout(sched);
+   spin_unlock(&sched->job_list_lock);
+   }
 }
 
  /**
-- 
2.25.1



[PATCH v6 07/16] drm/amdgpu: Remap all page faults to per process dummy page.

2021-05-10 Thread Andrey Grodzovsky
On device removal reroute all CPU mappings to dummy page
per drm_file instance or imported GEM object.

v4:
Update for modified ttm_bo_vm_dummy_page

Signed-off-by: Andrey Grodzovsky 
Reviewed-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 21 -
 1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index 8c7ec09eb1a4..0d54e70278ca 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -48,6 +48,7 @@
 #include 
 
 #include 
+#include <drm/drm_drv.h>
 
 #include "amdgpu.h"
 #include "amdgpu_object.h"
@@ -1905,18 +1906,28 @@ void amdgpu_ttm_set_buffer_funcs_status(struct 
amdgpu_device *adev, bool enable)
 static vm_fault_t amdgpu_ttm_fault(struct vm_fault *vmf)
 {
struct ttm_buffer_object *bo = vmf->vma->vm_private_data;
+   struct drm_device *ddev = bo->base.dev;
vm_fault_t ret;
+   int idx;
 
ret = ttm_bo_vm_reserve(bo, vmf);
if (ret)
return ret;
 
-   ret = amdgpu_bo_fault_reserve_notify(bo);
-   if (ret)
-   goto unlock;
+   if (drm_dev_enter(ddev, )) {
+   ret = amdgpu_bo_fault_reserve_notify(bo);
+   if (ret) {
+   drm_dev_exit(idx);
+   goto unlock;
+   }
 
-   ret = ttm_bo_vm_fault_reserved(vmf, vmf->vma->vm_page_prot,
-  TTM_BO_VM_NUM_PREFAULT, 1);
+ret = ttm_bo_vm_fault_reserved(vmf, vmf->vma->vm_page_prot,
+   TTM_BO_VM_NUM_PREFAULT, 1);
+
+drm_dev_exit(idx);
+   } else {
+   ret = ttm_bo_vm_dummy_page(vmf, vmf->vma->vm_page_prot);
+   }
if (ret == VM_FAULT_RETRY && !(vmf->flags & FAULT_FLAG_RETRY_NOWAIT))
return ret;
 
-- 
2.25.1



[PATCH v6 09/16] drm/amdgpu: Convert driver sysfs attributes to static attributes

2021-05-10 Thread Andrey Grodzovsky
This allows removing the explicit creation and destruction
of those attributes, and thereby avoids warnings when the device is
finalized after physical device extraction.

v5: Use newly added pci_driver.dev_groups directly

Signed-off-by: Andrey Grodzovsky 
Acked-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c | 17 ++---
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c  | 13 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c  | 25 
 drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 14 ---
 4 files changed, 37 insertions(+), 32 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c
index 494b2e1717d5..879ed3e50a6e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c
@@ -1768,6 +1768,15 @@ static ssize_t amdgpu_atombios_get_vbios_version(struct 
device *dev,
 static DEVICE_ATTR(vbios_version, 0444, amdgpu_atombios_get_vbios_version,
   NULL);
 
+static struct attribute *amdgpu_vbios_version_attrs[] = {
+   _attr_vbios_version.attr,
+   NULL
+};
+
+const struct attribute_group amdgpu_vbios_version_attr_group = {
+   .attrs = amdgpu_vbios_version_attrs
+};
+
 /**
  * amdgpu_atombios_fini - free the driver info and callbacks for atombios
  *
@@ -1787,7 +1796,6 @@ void amdgpu_atombios_fini(struct amdgpu_device *adev)
adev->mode_info.atom_context = NULL;
kfree(adev->mode_info.atom_card_info);
adev->mode_info.atom_card_info = NULL;
-   device_remove_file(adev->dev, _attr_vbios_version);
 }
 
 /**
@@ -1804,7 +1812,6 @@ int amdgpu_atombios_init(struct amdgpu_device *adev)
 {
struct card_info *atom_card_info =
kzalloc(sizeof(struct card_info), GFP_KERNEL);
-   int ret;
 
if (!atom_card_info)
return -ENOMEM;
@@ -1833,12 +1840,6 @@ int amdgpu_atombios_init(struct amdgpu_device *adev)
amdgpu_atombios_allocate_fb_scratch(adev);
}
 
-   ret = device_create_file(adev->dev, _attr_vbios_version);
-   if (ret) {
-   DRM_ERROR("Failed to create device file for VBIOS version\n");
-   return ret;
-   }
-
return 0;
 }
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 5ebed4c7d9c0..83006f45b10b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -1766,6 +1766,18 @@ static struct pci_error_handlers amdgpu_pci_err_handler 
= {
.resume = amdgpu_pci_resume,
 };
 
+extern const struct attribute_group amdgpu_vram_mgr_attr_group;
+extern const struct attribute_group amdgpu_gtt_mgr_attr_group;
+extern const struct attribute_group amdgpu_vbios_version_attr_group;
+
+static const struct attribute_group *amdgpu_sysfs_groups[] = {
+   _vram_mgr_attr_group,
+   _gtt_mgr_attr_group,
+   _vbios_version_attr_group,
+   NULL,
+};
+
+
 static struct pci_driver amdgpu_kms_pci_driver = {
.name = DRIVER_NAME,
.id_table = pciidlist,
@@ -1774,6 +1786,7 @@ static struct pci_driver amdgpu_kms_pci_driver = {
.shutdown = amdgpu_pci_shutdown,
.driver.pm = _pm_ops,
.err_handler = _pci_err_handler,
+   .dev_groups = amdgpu_sysfs_groups,
 };
 
 static int __init amdgpu_init(void)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
index 72962de4c04c..a4404da8ca6d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
@@ -75,6 +75,16 @@ static DEVICE_ATTR(mem_info_gtt_total, S_IRUGO,
 static DEVICE_ATTR(mem_info_gtt_used, S_IRUGO,
   amdgpu_mem_info_gtt_used_show, NULL);
 
+static struct attribute *amdgpu_gtt_mgr_attributes[] = {
+   _attr_mem_info_gtt_total.attr,
+   _attr_mem_info_gtt_used.attr,
+   NULL
+};
+
+const struct attribute_group amdgpu_gtt_mgr_attr_group = {
+   .attrs = amdgpu_gtt_mgr_attributes
+};
+
 static const struct ttm_resource_manager_func amdgpu_gtt_mgr_func;
 /**
  * amdgpu_gtt_mgr_init - init GTT manager and DRM MM
@@ -89,7 +99,6 @@ int amdgpu_gtt_mgr_init(struct amdgpu_device *adev, uint64_t 
gtt_size)
struct amdgpu_gtt_mgr *mgr = >mman.gtt_mgr;
struct ttm_resource_manager *man = >manager;
uint64_t start, size;
-   int ret;
 
man->use_tt = true;
man->func = _gtt_mgr_func;
@@ -102,17 +111,6 @@ int amdgpu_gtt_mgr_init(struct amdgpu_device *adev, 
uint64_t gtt_size)
spin_lock_init(>lock);
atomic64_set(>available, gtt_size >> PAGE_SHIFT);
 
-   ret = device_create_file(adev->dev, _attr_mem_info_gtt_total);
-   if (ret) {
-   DRM_ERROR("Failed to create device file mem_info_gtt_total\n");
-   return ret;
-   }
-   ret = device_create_file(adev->dev, _attr_mem_info_gtt_used);
-   if (ret) {
- 

[PATCH v6 06/16] drm/amdgpu: Handle IOMMU enabled case.

2021-05-10 Thread Andrey Grodzovsky
Handle all DMA IOMMU group related dependencies before the
group is removed.

v5: Drop IOMMU notifier and switch to lockless call to ttm_tt_unpopulate
v6: Drop the BO unmap list

Signed-off-by: Andrey Grodzovsky 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 ++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c   | 3 +--
 drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h   | 1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c| 9 +
 drivers/gpu/drm/amd/amdgpu/cik_ih.c| 1 -
 drivers/gpu/drm/amd/amdgpu/cz_ih.c | 1 -
 drivers/gpu/drm/amd/amdgpu/iceland_ih.c| 1 -
 drivers/gpu/drm/amd/amdgpu/navi10_ih.c | 3 ---
 drivers/gpu/drm/amd/amdgpu/si_ih.c | 1 -
 drivers/gpu/drm/amd/amdgpu/tonga_ih.c  | 1 -
 drivers/gpu/drm/amd/amdgpu/vega10_ih.c | 3 ---
 11 files changed, 13 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 18598eda18f6..a0bff4713672 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -3256,7 +3256,6 @@ static const struct attribute *amdgpu_dev_attributes[] = {
NULL
 };
 
-
 /**
  * amdgpu_device_init - initialize the driver
  *
@@ -3698,12 +3697,13 @@ void amdgpu_device_fini_hw(struct amdgpu_device *adev)
amdgpu_ucode_sysfs_fini(adev);
sysfs_remove_files(>dev->kobj, amdgpu_dev_attributes);
 
-
amdgpu_fbdev_fini(adev);
 
amdgpu_irq_fini_hw(adev);
 
amdgpu_device_ip_fini_early(adev);
+
+   amdgpu_gart_dummy_page_fini(adev);
 }
 
 void amdgpu_device_fini_sw(struct amdgpu_device *adev)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
index c5a9a4fb10d2..354e68081b53 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
@@ -92,7 +92,7 @@ static int amdgpu_gart_dummy_page_init(struct amdgpu_device 
*adev)
  *
  * Frees the dummy page used by the driver (all asics).
  */
-static void amdgpu_gart_dummy_page_fini(struct amdgpu_device *adev)
+void amdgpu_gart_dummy_page_fini(struct amdgpu_device *adev)
 {
if (!adev->dummy_page_addr)
return;
@@ -375,5 +375,4 @@ int amdgpu_gart_init(struct amdgpu_device *adev)
  */
 void amdgpu_gart_fini(struct amdgpu_device *adev)
 {
-   amdgpu_gart_dummy_page_fini(adev);
 }
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
index a25fe97b0196..78dc7a23da56 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
@@ -58,6 +58,7 @@ int amdgpu_gart_table_vram_pin(struct amdgpu_device *adev);
 void amdgpu_gart_table_vram_unpin(struct amdgpu_device *adev);
 int amdgpu_gart_init(struct amdgpu_device *adev);
 void amdgpu_gart_fini(struct amdgpu_device *adev);
+void amdgpu_gart_dummy_page_fini(struct amdgpu_device *adev);
 int amdgpu_gart_unbind(struct amdgpu_device *adev, uint64_t offset,
   int pages);
 int amdgpu_gart_map(struct amdgpu_device *adev, uint64_t offset,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
index 233b64dab94b..a14973a7a9c9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
@@ -361,6 +361,15 @@ void amdgpu_irq_fini_hw(struct amdgpu_device *adev)
if (!amdgpu_device_has_dc_support(adev))
flush_work(>hotplug_work);
}
+
+   if (adev->irq.ih_soft.ring)
+   amdgpu_ih_ring_fini(adev, >irq.ih_soft);
+   if (adev->irq.ih.ring)
+   amdgpu_ih_ring_fini(adev, >irq.ih);
+   if (adev->irq.ih1.ring)
+   amdgpu_ih_ring_fini(adev, >irq.ih1);
+   if (adev->irq.ih2.ring)
+   amdgpu_ih_ring_fini(adev, >irq.ih2);
 }
 
 /**
diff --git a/drivers/gpu/drm/amd/amdgpu/cik_ih.c 
b/drivers/gpu/drm/amd/amdgpu/cik_ih.c
index 183d44a6583c..df385ffc9768 100644
--- a/drivers/gpu/drm/amd/amdgpu/cik_ih.c
+++ b/drivers/gpu/drm/amd/amdgpu/cik_ih.c
@@ -310,7 +310,6 @@ static int cik_ih_sw_fini(void *handle)
struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
amdgpu_irq_fini_sw(adev);
-   amdgpu_ih_ring_fini(adev, >irq.ih);
amdgpu_irq_remove_domain(adev);
 
return 0;
diff --git a/drivers/gpu/drm/amd/amdgpu/cz_ih.c 
b/drivers/gpu/drm/amd/amdgpu/cz_ih.c
index d32743949003..b8c47e0cf37a 100644
--- a/drivers/gpu/drm/amd/amdgpu/cz_ih.c
+++ b/drivers/gpu/drm/amd/amdgpu/cz_ih.c
@@ -302,7 +302,6 @@ static int cz_ih_sw_fini(void *handle)
struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
amdgpu_irq_fini_sw(adev);
-   amdgpu_ih_ring_fini(adev, >irq.ih);
amdgpu_irq_remove_domain(adev);
 
return 0;
diff --git a/drivers/gpu/drm/amd/amdgpu/iceland_ih.c 
b/drivers/gpu/drm/amd/amdgpu/iceland_ih.c
index da96c6013477..ddfe4eaeea05 100644
--- 

[PATCH v6 08/16] PCI: Add support for dev_groups to struct pci_device_driver

2021-05-10 Thread Andrey Grodzovsky
This helps convert PCI drivers' sysfs attributes to static attributes.

Analogous to b71b283e3d6d ("USB: add support for dev_groups to
struct usb_driver")

Signed-off-by: Andrey Grodzovsky 
Suggested-by: Greg Kroah-Hartman 
---
 drivers/pci/pci-driver.c | 1 +
 include/linux/pci.h  | 3 +++
 2 files changed, 4 insertions(+)
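
A minimal usage sketch of the new field (hypothetical foo_* driver; the
amdgpu patch later in this series does the same conversion for its
existing attributes):

static ssize_t foo_version_show(struct device *dev,
				struct device_attribute *attr, char *buf)
{
	return sysfs_emit(buf, "1\n");
}
static DEVICE_ATTR_RO(foo_version);

static struct attribute *foo_attrs[] = {
	&dev_attr_foo_version.attr,
	NULL
};
ATTRIBUTE_GROUPS(foo);	/* generates foo_groups from foo_attrs */

static struct pci_driver foo_pci_driver = {
	.name		= "foo",
	.id_table	= foo_pci_ids,	/* assumed defined elsewhere */
	.probe		= foo_probe,
	.remove		= foo_remove,
	.dev_groups	= foo_groups,	/* attributes exist for the bind lifetime */
};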

diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index ec44a79e951a..3a72352aa5cf 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -1385,6 +1385,7 @@ int __pci_register_driver(struct pci_driver *drv, struct 
module *owner,
drv->driver.owner = owner;
drv->driver.mod_name = mod_name;
drv->driver.groups = drv->groups;
+   drv->driver.dev_groups = drv->dev_groups;
 
spin_lock_init(>dynids.lock);
INIT_LIST_HEAD(>dynids.list);
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 86c799c97b77..b57755b03009 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -858,6 +858,8 @@ struct module;
  * number of VFs to enable via sysfs "sriov_numvfs" file.
  * @err_handler: See Documentation/PCI/pci-error-recovery.rst
  * @groups:Sysfs attribute groups.
+ * @dev_groups: Attributes attached to the device that will be
+ *  created once it is bound to the driver.
  * @driver:Driver model structure.
  * @dynids:List of dynamically added device IDs.
  */
@@ -873,6 +875,7 @@ struct pci_driver {
int  (*sriov_configure)(struct pci_dev *dev, int num_vfs); /* On PF */
const struct pci_error_handlers *err_handler;
const struct attribute_group **groups;
+   const struct attribute_group **dev_groups;
struct device_driverdriver;
struct pci_dynids   dynids;
 };
-- 
2.25.1



[PATCH v6 03/16] drm/amdgpu: Split amdgpu_device_fini into early and late

2021-05-10 Thread Andrey Grodzovsky
Some of the work in amdgpu_device_fini, such as disabling HW interrupts
and finalizing pending fences, must be done right away on
pci_remove, while most of the work related to finalizing and
releasing driver data structures can be deferred until the
drm_driver.release hook is called, i.e. when the last device
reference is dropped.

v4: Change functions prefix early->hw and late->sw

Signed-off-by: Andrey Grodzovsky 
Acked-by: Christian König 
Reviewed-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h|  6 -
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 26 +++---
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c|  7 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c  | 15 -
 drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c| 26 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h|  3 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c| 12 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c|  1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h   |  3 ++-
 drivers/gpu/drm/amd/amdgpu/cik_ih.c|  2 +-
 drivers/gpu/drm/amd/amdgpu/cz_ih.c |  2 +-
 drivers/gpu/drm/amd/amdgpu/iceland_ih.c|  2 +-
 drivers/gpu/drm/amd/amdgpu/navi10_ih.c |  2 +-
 drivers/gpu/drm/amd/amdgpu/si_ih.c |  2 +-
 drivers/gpu/drm/amd/amdgpu/tonga_ih.c  |  2 +-
 drivers/gpu/drm/amd/amdgpu/vega10_ih.c |  2 +-
 drivers/gpu/drm/amd/amdgpu/vega20_ih.c |  2 +-
 17 files changed, 79 insertions(+), 36 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 380801b59b07..d830a541ba89 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -1099,7 +1099,9 @@ static inline struct amdgpu_device 
*amdgpu_ttm_adev(struct ttm_device *bdev)
 
 int amdgpu_device_init(struct amdgpu_device *adev,
   uint32_t flags);
-void amdgpu_device_fini(struct amdgpu_device *adev);
+void amdgpu_device_fini_hw(struct amdgpu_device *adev);
+void amdgpu_device_fini_sw(struct amdgpu_device *adev);
+
 int amdgpu_gpu_wait_for_idle(struct amdgpu_device *adev);
 
 void amdgpu_device_vram_access(struct amdgpu_device *adev, loff_t pos,
@@ -1319,6 +1321,8 @@ void amdgpu_driver_lastclose_kms(struct drm_device *dev);
 int amdgpu_driver_open_kms(struct drm_device *dev, struct drm_file *file_priv);
 void amdgpu_driver_postclose_kms(struct drm_device *dev,
 struct drm_file *file_priv);
+void amdgpu_driver_release_kms(struct drm_device *dev);
+
 int amdgpu_device_ip_suspend(struct amdgpu_device *adev);
 int amdgpu_device_suspend(struct drm_device *dev, bool fbcon);
 int amdgpu_device_resume(struct drm_device *dev, bool fbcon);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index b4ad1c055c70..3760ce7d8ff8 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -3648,15 +3648,13 @@ int amdgpu_device_init(struct amdgpu_device *adev,
  * Tear down the driver info (all asics).
  * Called at driver shutdown.
  */
-void amdgpu_device_fini(struct amdgpu_device *adev)
+void amdgpu_device_fini_hw(struct amdgpu_device *adev)
 {
dev_info(adev->dev, "amdgpu: finishing device.\n");
flush_delayed_work(>delayed_init_work);
ttm_bo_lock_delayed_workqueue(>mman.bdev);
adev->shutdown = true;
 
-   kfree(adev->pci_state);
-
/* make sure IB test finished before entering exclusive mode
 * to avoid preemption on IB test
 * */
@@ -3673,11 +3671,24 @@ void amdgpu_device_fini(struct amdgpu_device *adev)
else
drm_atomic_helper_shutdown(adev_to_drm(adev));
}
-   amdgpu_fence_driver_fini(adev);
+   amdgpu_fence_driver_fini_hw(adev);
+
if (adev->pm_sysfs_en)
amdgpu_pm_sysfs_fini(adev);
+   if (adev->ucode_sysfs_en)
+   amdgpu_ucode_sysfs_fini(adev);
+   sysfs_remove_files(>dev->kobj, amdgpu_dev_attributes);
+
+
amdgpu_fbdev_fini(adev);
+
+   amdgpu_irq_fini_hw(adev);
+}
+
+void amdgpu_device_fini_sw(struct amdgpu_device *adev)
+{
amdgpu_device_ip_fini(adev);
+   amdgpu_fence_driver_fini_sw(adev);
release_firmware(adev->firmware.gpu_info_fw);
adev->firmware.gpu_info_fw = NULL;
adev->accel_working = false;
@@ -3703,14 +3714,13 @@ void amdgpu_device_fini(struct amdgpu_device *adev)
adev->rmmio = NULL;
amdgpu_device_doorbell_fini(adev);
 
-   if (adev->ucode_sysfs_en)
-   amdgpu_ucode_sysfs_fini(adev);
-
-   sysfs_remove_files(>dev->kobj, amdgpu_dev_attributes);
if (IS_ENABLED(CONFIG_PERF_EVENTS))
amdgpu_pmu_fini(adev);
if (adev->mman.discovery_bin)
amdgpu_discovery_fini(adev);
+
+   kfree(adev->pci_state);
+
 }
 
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 

[PATCH v6 04/16] drm/amdkfd: Split kfd suspend from device exit

2021-05-10 Thread Andrey Grodzovsky
Helps to expedite HW related teardown to amdgpu_pci_remove

Signed-off-by: Andrey Grodzovsky 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h | 2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_device.c| 3 ++-
 3 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
index 5f6696a3c778..2b06dee9a0ce 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
@@ -170,7 +170,7 @@ void amdgpu_amdkfd_device_init(struct amdgpu_device *adev)
}
 }
 
-void amdgpu_amdkfd_device_fini(struct amdgpu_device *adev)
+void amdgpu_amdkfd_device_fini_sw(struct amdgpu_device *adev)
 {
if (adev->kfd.dev) {
kgd2kfd_device_exit(adev->kfd.dev);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
index 14f68c028126..f8e10af99c28 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
@@ -127,7 +127,7 @@ void amdgpu_amdkfd_interrupt(struct amdgpu_device *adev,
const void *ih_ring_entry);
 void amdgpu_amdkfd_device_probe(struct amdgpu_device *adev);
 void amdgpu_amdkfd_device_init(struct amdgpu_device *adev);
-void amdgpu_amdkfd_device_fini(struct amdgpu_device *adev);
+void amdgpu_amdkfd_device_fini_sw(struct amdgpu_device *adev);
 int amdgpu_amdkfd_submit_ib(struct kgd_dev *kgd, enum kgd_engine_type engine,
uint32_t vmid, uint64_t gpu_addr,
uint32_t *ib_cmd, uint32_t ib_len);
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
index 357b9bf62a1c..ab6d2a43c9a3 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
@@ -858,10 +858,11 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,
return kfd->init_complete;
 }
 
+
+
 void kgd2kfd_device_exit(struct kfd_dev *kfd)
 {
if (kfd->init_complete) {
-   kgd2kfd_suspend(kfd, false);
device_queue_manager_uninit(kfd->dqm);
kfd_interrupt_exit(kfd);
kfd_topology_remove_device(kfd);
-- 
2.25.1



[PATCH v6 15/16] drm/amd/display: Remove superfluous drm_mode_config_cleanup

2021-05-10 Thread Andrey Grodzovsky
It's already being released by DRM core through devm

Signed-off-by: Andrey Grodzovsky 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 6c2c6a51ce6c..9728a0158bcb 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -3757,7 +3757,6 @@ static int amdgpu_dm_initialize_drm_device(struct 
amdgpu_device *adev)
 
 static void amdgpu_dm_destroy_drm_device(struct amdgpu_display_manager *dm)
 {
-   drm_mode_config_cleanup(dm->ddev);
drm_atomic_private_obj_fini(>atomic_obj);
return;
 }
-- 
2.25.1



[PATCH v6 05/16] drm/amdgpu: Add early fini callback

2021-05-10 Thread Andrey Grodzovsky
Use it to call display code dependent on device->drv_data
before it's set to NULL on device unplug.

v5: Move HW finalization into this callback to prevent MMIO accesses
post pci remove.

Signed-off-by: Andrey Grodzovsky 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c| 59 +--
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 12 +++-
 drivers/gpu/drm/amd/include/amd_shared.h  |  2 +
 3 files changed, 52 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 3760ce7d8ff8..18598eda18f6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2558,34 +2558,26 @@ static int amdgpu_device_ip_late_init(struct 
amdgpu_device *adev)
return 0;
 }
 
-/**
- * amdgpu_device_ip_fini - run fini for hardware IPs
- *
- * @adev: amdgpu_device pointer
- *
- * Main teardown pass for hardware IPs.  The list of all the hardware
- * IPs that make up the asic is walked and the hw_fini and sw_fini callbacks
- * are run.  hw_fini tears down the hardware associated with each IP
- * and sw_fini tears down any software state associated with each IP.
- * Returns 0 on success, negative error code on failure.
- */
-static int amdgpu_device_ip_fini(struct amdgpu_device *adev)
+static int amdgpu_device_ip_fini_early(struct amdgpu_device *adev)
 {
int i, r;
 
-   if (amdgpu_sriov_vf(adev) && adev->virt.ras_init_done)
-   amdgpu_virt_release_ras_err_handler_data(adev);
+   for (i = 0; i < adev->num_ip_blocks; i++) {
+   if (!adev->ip_blocks[i].version->funcs->early_fini)
+   continue;
 
-   amdgpu_ras_pre_fini(adev);
+   r = adev->ip_blocks[i].version->funcs->early_fini((void *)adev);
+   if (r) {
+   DRM_DEBUG("early_fini of IP block <%s> failed %d\n",
+ adev->ip_blocks[i].version->funcs->name, r);
+   }
+   }
 
-   if (adev->gmc.xgmi.num_physical_nodes > 1)
-   amdgpu_xgmi_remove_device(adev);
+   amdgpu_amdkfd_suspend(adev, false);
 
amdgpu_device_set_pg_state(adev, AMD_PG_STATE_UNGATE);
amdgpu_device_set_cg_state(adev, AMD_CG_STATE_UNGATE);
 
-   amdgpu_amdkfd_device_fini(adev);
-
/* need to disable SMC first */
for (i = 0; i < adev->num_ip_blocks; i++) {
if (!adev->ip_blocks[i].status.hw)
@@ -2616,6 +2608,33 @@ static int amdgpu_device_ip_fini(struct amdgpu_device 
*adev)
adev->ip_blocks[i].status.hw = false;
}
 
+   return 0;
+}
+
+/**
+ * amdgpu_device_ip_fini - run fini for hardware IPs
+ *
+ * @adev: amdgpu_device pointer
+ *
+ * Main teardown pass for hardware IPs.  The list of all the hardware
+ * IPs that make up the asic is walked and the hw_fini and sw_fini callbacks
+ * are run.  hw_fini tears down the hardware associated with each IP
+ * and sw_fini tears down any software state associated with each IP.
+ * Returns 0 on success, negative error code on failure.
+ */
+static int amdgpu_device_ip_fini(struct amdgpu_device *adev)
+{
+   int i, r;
+
+   if (amdgpu_sriov_vf(adev) && adev->virt.ras_init_done)
+   amdgpu_virt_release_ras_err_handler_data(adev);
+
+   amdgpu_ras_pre_fini(adev);
+
+   if (adev->gmc.xgmi.num_physical_nodes > 1)
+   amdgpu_xgmi_remove_device(adev);
+
+   amdgpu_amdkfd_device_fini_sw(adev);
 
for (i = adev->num_ip_blocks - 1; i >= 0; i--) {
if (!adev->ip_blocks[i].status.sw)
@@ -3683,6 +3702,8 @@ void amdgpu_device_fini_hw(struct amdgpu_device *adev)
amdgpu_fbdev_fini(adev);
 
amdgpu_irq_fini_hw(adev);
+
+   amdgpu_device_ip_fini_early(adev);
 }
 
 void amdgpu_device_fini_sw(struct amdgpu_device *adev)
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 296704ce3768..6c2c6a51ce6c 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -1251,6 +1251,15 @@ static int amdgpu_dm_init(struct amdgpu_device *adev)
return -EINVAL;
 }
 
+static int amdgpu_dm_early_fini(void *handle)
+{
+   struct amdgpu_device *adev = (struct amdgpu_device *)handle;
+
+   amdgpu_dm_audio_fini(adev);
+
+   return 0;
+}
+
 static void amdgpu_dm_fini(struct amdgpu_device *adev)
 {
int i;
@@ -1259,8 +1268,6 @@ static void amdgpu_dm_fini(struct amdgpu_device *adev)
drm_encoder_cleanup(>dm.mst_encoders[i].base);
}
 
-   amdgpu_dm_audio_fini(adev);
-
amdgpu_dm_destroy_drm_device(>dm);
 
 #if defined(CONFIG_DRM_AMD_SECURE_DISPLAY)
@@ -2298,6 +2305,7 @@ static const struct amd_ip_funcs amdgpu_dm_funcs = {
.late_init = dm_late_init,
.sw_init = dm_sw_init,
.sw_fini = dm_sw_fini,
+   .early_fini = 

[PATCH v6 02/16] drm/ttm: Expose ttm_tt_unpopulate for driver use

2021-05-10 Thread Andrey Grodzovsky
It's needed to drop IOMMU-backed pages on device unplug
before the device's IOMMU group is released.

Signed-off-by: Andrey Grodzovsky 
---
 drivers/gpu/drm/ttm/ttm_tt.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/ttm/ttm_tt.c b/drivers/gpu/drm/ttm/ttm_tt.c
index 539e0232cb3b..dfbe1ea8763f 100644
--- a/drivers/gpu/drm/ttm/ttm_tt.c
+++ b/drivers/gpu/drm/ttm/ttm_tt.c
@@ -433,3 +433,4 @@ void ttm_tt_mgr_init(unsigned long num_pages, unsigned long 
num_dma32_pages)
if (!ttm_dma32_pages_limit)
ttm_dma32_pages_limit = num_dma32_pages;
 }
+EXPORT_SYMBOL(ttm_tt_unpopulate);
-- 
2.25.1



[PATCH v6 01/16] drm/ttm: Remap all page faults to per process dummy page.

2021-05-10 Thread Andrey Grodzovsky
On device removal reroute all CPU mappings to dummy page.

v3:
Remove loop to find DRM file and instead access it
by vma->vm_file->private_data. Move dummy page installation
into a separate function.

v4:
Map the entire BOs VA space into on demand allocated dummy page
on the first fault for that BO.

v5: Remove duplicate return.

v6: Polish ttm_bo_vm_dummy_page, remove superfluous code.

Signed-off-by: Andrey Grodzovsky 
---
 drivers/gpu/drm/ttm/ttm_bo_vm.c | 57 -
 include/drm/ttm/ttm_bo_api.h|  2 ++
 2 files changed, 58 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c b/drivers/gpu/drm/ttm/ttm_bo_vm.c
index b31b18058965..e5a9615519d1 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c
@@ -34,6 +34,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 #include 
 #include 
 #include 
@@ -380,19 +382,72 @@ vm_fault_t ttm_bo_vm_fault_reserved(struct vm_fault *vmf,
 }
 EXPORT_SYMBOL(ttm_bo_vm_fault_reserved);
 
+static void ttm_bo_release_dummy_page(struct drm_device *dev, void *res)
+{
+   struct page *dummy_page = (struct page *)res;
+
+   __free_page(dummy_page);
+}
+
+vm_fault_t ttm_bo_vm_dummy_page(struct vm_fault *vmf, pgprot_t prot)
+{
+   struct vm_area_struct *vma = vmf->vma;
+   struct ttm_buffer_object *bo = vma->vm_private_data;
+   struct drm_device *ddev = bo->base.dev;
+   vm_fault_t ret = VM_FAULT_NOPAGE;
+   unsigned long address;
+   unsigned long pfn;
+   struct page *page;
+
+   /* Allocate new dummy page to map all the VA range in this VMA to it*/
+   page = alloc_page(GFP_KERNEL | __GFP_ZERO);
+   if (!page)
+   return VM_FAULT_OOM;
+
+   pfn = page_to_pfn(page);
+
+   /* Prefault the entire VMA range right away to avoid further faults */
+   for (address = vma->vm_start; address < vma->vm_end; address += 
PAGE_SIZE) {
+
+   if (unlikely(address >= vma->vm_end))
+   break;
+
+   if (vma->vm_flags & VM_MIXEDMAP)
+   ret = vmf_insert_mixed_prot(vma, address,
+   __pfn_to_pfn_t(pfn, 
PFN_DEV),
+   prot);
+   else
+   ret = vmf_insert_pfn_prot(vma, address, pfn, prot);
+   }
+
+   /* Set the page to be freed using drmm release action */
+   if (drmm_add_action_or_reset(ddev, ttm_bo_release_dummy_page, page))
+   return VM_FAULT_OOM;
+
+   return ret;
+}
+EXPORT_SYMBOL(ttm_bo_vm_dummy_page);
+
 vm_fault_t ttm_bo_vm_fault(struct vm_fault *vmf)
 {
struct vm_area_struct *vma = vmf->vma;
pgprot_t prot;
struct ttm_buffer_object *bo = vma->vm_private_data;
+   struct drm_device *ddev = bo->base.dev;
vm_fault_t ret;
+   int idx;
 
ret = ttm_bo_vm_reserve(bo, vmf);
if (ret)
return ret;
 
prot = vma->vm_page_prot;
-   ret = ttm_bo_vm_fault_reserved(vmf, prot, TTM_BO_VM_NUM_PREFAULT, 1);
+   if (drm_dev_enter(ddev, )) {
+   ret = ttm_bo_vm_fault_reserved(vmf, prot, 
TTM_BO_VM_NUM_PREFAULT, 1);
+   drm_dev_exit(idx);
+   } else {
+   ret = ttm_bo_vm_dummy_page(vmf, prot);
+   }
if (ret == VM_FAULT_RETRY && !(vmf->flags & FAULT_FLAG_RETRY_NOWAIT))
return ret;
 
diff --git a/include/drm/ttm/ttm_bo_api.h b/include/drm/ttm/ttm_bo_api.h
index 639521880c29..254ede97f8e3 100644
--- a/include/drm/ttm/ttm_bo_api.h
+++ b/include/drm/ttm/ttm_bo_api.h
@@ -620,4 +620,6 @@ int ttm_bo_vm_access(struct vm_area_struct *vma, unsigned 
long addr,
 void *buf, int len, int write);
 bool ttm_bo_delayed_delete(struct ttm_device *bdev, bool remove_all);
 
+vm_fault_t ttm_bo_vm_dummy_page(struct vm_fault *vmf, pgprot_t prot);
+
 #endif
-- 
2.25.1



[PATCH v6 00/16] RFC Support hot device unplug in amdgpu

2021-05-10 Thread Andrey Grodzovsky
Until now, extracting a card, either physically (e.g. an eGPU with a
Thunderbolt connection) or by emulation through sysfs ->
/sys/bus/pci/devices/device_id/remove,
would cause random crashes in user apps. The random crashes in apps were
mostly due to the app having mapped a device backed BO into its address
space and still trying to access the BO while the backing device was gone.
To answer this first problem Christian suggested fixing the handling of mapped 
memory in the clients when the device goes away by forcibly unmapping all 
buffers the 
user processes have by clearing their respective VMAs mapping the device BOs.
Then when the VMAs try to fill in the page tables again we check in the fault 
handler if the device is removed and if so, return an error. This will generate 
a 
SIGBUS to the application which can then cleanly terminate. This indeed was 
done 
but this in turn created a problem of kernel OOPs where the OOPSes were due to 
the 
fact that while the app was terminating because of the SIGBUS it would trigger 
use 
after free in the driver by calling to access device structures that were 
already
released from the pci remove sequence. This was handled by introducing a 
'flush' 
sequence during device removal where we wait for drm file reference to drop to 
0 
meaning all user clients directly using this device terminated.

v2:
Based on discussions in the mailing list with Daniel and Pekka [1] and based on 
the document 
produced by Pekka from those discussions [2] the whole approach with returning 
SIGBUS and 
waiting for all user clients having CPU mapping of device BOs to die was 
dropped. 
Instead as per the document suggestion the device structures are kept alive 
until 
the last reference to the device is dropped by user client and in the meanwhile 
all existing and new CPU mappings of the BOs 
belonging to the device directly or by dma-buf import are rerouted to per user 
process dummy rw page. Also, I skipped the 'Requirements for KMS UAPI' section 
of [2] 
since i am trying to get the minimal set of requirements that still give useful 
solution 
to work and this is the'Requirements for Render and Cross-Device UAPI' section 
and so my 
test case is removing a secondary device, which is render only and is not 
involved 
in KMS.

v3:
More updates following comments from v2 such as removing loop to find DRM file 
when rerouting 
page faults to dummy page,getting rid of unnecessary sysfs handling refactoring 
and moving 
prevention of GPU recovery post device unplug from amdgpu to scheduler layer. 
On top of that added unplug support for the IOMMU enabled system.

v4:
Drop last sysfs hack and use sysfs default attribute.
Guard against write accesses after device removal to avoid modifying released 
memory.
Update dummy pages handling to on demand allocation and release through drm 
managed framework.
Add return value to scheduler job TO handler (by Luben Tuikov) and use this in 
amdgpu for prevention 
of GPU recovery post device unplug
Also rebase on top of drm-misc-mext instead of amd-staging-drm-next

v5:
The most significant change in this series is the improved protection against the
kernel driver accessing MMIO ranges that were allocated
for the device once the device is gone. To do this, first a patch 'drm/amdgpu: 
Unmap all MMIO mappings' is introduced.
This patch unmaps all MMIO mapped into the kernel address space in the form of 
BARs and kernel BOs with CPU visible VRAM mappings.
This way it helped to discover multiple such access points because a page fault 
would be immediately generated on access. Most of them
were solved by moving HW fini code into pci_remove stage (patch drm/amdgpu: Add 
early fini callback) and for some who 
were harder to unwind drm_dev_enter/exit scoping was used. In addition all the 
IOCTLs and all background work and timers 
are now protected with drm_dev_enter/exit at their root in an attempt that 
after drm_dev_unplug is finished none of them 
run anymore and the pci_remove thread is the only thread executing which might 
touch the HW. To prevent deadlocks in such 
case against threads stuck on various HW or SW fences patches 'drm/amdgpu: 
Finalise device fences on device remove'  
and 'drm/amdgpu: Add rw_sem to pushing job into sched queue' take care of force 
signaling all such existing fences 
and rejecting any newly added ones.
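
As an illustration of that scoping, the guard pattern at the root of an
IOCTL or background worker looks roughly like this (hypothetical handler
and foo_do_hw_work(), not one of the patches below):

static int foo_ioctl(struct drm_device *dev, void *data, struct drm_file *filp)
{
	int idx, ret;

	if (!drm_dev_enter(dev, &idx))
		return -ENODEV;	/* device already unplugged, don't touch HW */

	ret = foo_do_hw_work(dev, data);	/* MMIO access is safe inside the scope */

	drm_dev_exit(idx);
	return ret;
}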

v6:
Drop using drm_dev_enter/exit in conjunction with signalling HW fences before 
setting drm_dev_unplug.
We need to devise a more robust cross-DRM approach to the problem of dma fence 
waits falling
inside drm_dev_enter/exit scopes -> move to TODO.

With these patches I am able to gracefully remove the secondary card using 
sysfs remove hook while glxgears is running off of secondary 
card (DRI_PRIME=1) without kernel oopses or hangs and keep working with the 
primary card or soft reset the device without hangs or oopses.
Also as per Daniel's comment I added 3 tests to IGT [4] to core_hotunplug test 
suite - remove 

Re: [RFC PATCH 00/97] Basic GuC submission support in the i915

2021-05-10 Thread Daniel Vetter
On Mon, May 10, 2021 at 3:55 PM Martin Peres  wrote:
>
> On 10/05/2021 02:11, Jason Ekstrand wrote:
> > On May 9, 2021 12:12:36 Martin Peres  wrote:
> >
> >> Hi,
> >>
> >> On 06/05/2021 22:13, Matthew Brost wrote:
> >>> Basic GuC submission support. This is the first bullet point in the
> >>> upstreaming plan covered in the following RFC [1].
> >>>
> >>> At a very high level the GuC is a piece of firmware which sits between
> >>> the i915 and the GPU. It offloads some of the scheduling of contexts
> >>> from the i915 and programs the GPU to submit contexts. The i915
> >>> communicates with the GuC and the GuC communicates with the GPU.
> >>
> >> May I ask what will GuC command submission do that execlist won't/can't
> >> do? And what would be the impact on users? Even forgetting the troubled
> >> history of GuC (instability, performance regression, poor level of user
> >> support, 6+ years of trying to upstream it...), adding this much code
> >> and doubling the amount of validation needed should come with a
> >> rationale making it feel worth it... and I am not seeing here. Would you
> >> mind providing the rationale behind this work?
> >>
> >>>
> >>> GuC submission will be disabled by default on all current upstream
> >>> platforms behind a module parameter - enable_guc. A value of 3 will
> >>> enable submission and HuC loading via the GuC. GuC submission should
> >>> work on all gen11+ platforms assuming the GuC firmware is present.
> >>
> >> What is the plan here when it comes to keeping support for execlist? I
> >> am afraid that landing GuC support in Linux is the first step towards
> >> killing the execlist, which would force users to use proprietary
> >> firmwares that even most Intel engineers have little influence over.
> >> Indeed, if "drm/i915/guc: Disable semaphores when using GuC scheduling"
> >> which states "Disable semaphores when using GuC scheduling as semaphores
> >> are broken in the current GuC firmware." is anything to go by, it means
> >> that even Intel developers seem to prefer working around the GuC
> >> firmware, rather than fixing it.
> >
> > Yes, landing GuC support may be the first step in removing execlist
> > support. The inevitable reality is that GPU scheduling is coming and
> > likely to be there only path in the not-too-distant future. (See also
> > the ongoing thread with AMD about fences.) I'm not going to pass
> > judgement on whether or not this is a good thing.  I'm just reading the
> > winds and, in my view, this is where things are headed for good or ill.
> >
> > In answer to the question above, the answer to "what do we gain from
> > GuC?" may soon be, "you get to use your GPU."  We're not there yet and,
> > again, I'm not necessarily advocating for it, but that is likely where
> > things are headed.
>
> This will be a sad day, especially since it seems fundamentally opposed
> with any long-term support, on top of taking away user freedom to
> fix/tweak their system when Intel won't.
>
> > A firmware-based submission model isn't a bad design IMO and, aside from
> > the firmware freedom issues, I think there are actual advantages to the
> > model. Immediately, it'll unlock a few features like parallel submission
> > (more on that in a bit) and long-running compute because they're
> > implemented in GuC and the work to implement them properly in the
> > execlist scheduler is highly non-trivial. Longer term, it may (no
> > guarantees) unlock some performance by getting the kernel out of the way.
>
> Oh, I definitely agree with firmware-based submission model not being a
> bad design. I was even cheering for it in 2015. Experience with it made
> me regret that deeply since :s
>
> But with the DRM scheduler being responsible for most things, I fail to
> see what we could offload in the GuC except context switching (like
> every other manufacturer). The problem is, the GuC does way more than
> just switching registers in bulk, and if the number of revisions of the
> GuC is anything to go by, it is way too complex for me to feel
> comfortable with it.

We need to flesh out that part of the plan more, but we're not going
to use drm scheduler for everything. It's only to handle the dma-fence
legacy side of things, which means:
- timeout handling for batches that take too long
- dma_fence dependency sorting/handling
- boosting of context from display flips (currently missing, needs to
be ported from drm/i915)

The actual round-robin/preempt/priority handling is still left to the
backend, in this case the fw. So there are large chunks of
code/functionality where drm/scheduler won't be involved, and like
Jason says: The hw direction winds definitely blow in the direction
that this is all handled in hw.

> >> In the same vein, I have another concern related to the impact of GuC on
> >> Linux's stable releases. Let's say that in 3 years, a new application
> >> triggers a bug in command submission inside the firmware. Given that the
> >> Linux community cannot patch the 

Re: [RFC PATCH 00/97] Basic GuC submission support in the i915

2021-05-10 Thread Jason Ekstrand

On May 10, 2021 08:55:55 Martin Peres  wrote:


On 10/05/2021 02:11, Jason Ekstrand wrote:

On May 9, 2021 12:12:36 Martin Peres  wrote:


Hi,

On 06/05/2021 22:13, Matthew Brost wrote:

Basic GuC submission support. This is the first bullet point in the
upstreaming plan covered in the following RFC [1].

At a very high level the GuC is a piece of firmware which sits between
the i915 and the GPU. It offloads some of the scheduling of contexts
from the i915 and programs the GPU to submit contexts. The i915
communicates with the GuC and the GuC communicates with the GPU.


May I ask what will GuC command submission do that execlist won't/can't
do? And what would be the impact on users? Even forgetting the troubled
history of GuC (instability, performance regression, poor level of user
support, 6+ years of trying to upstream it...), adding this much code
and doubling the amount of validation needed should come with a
rationale making it feel worth it... and I am not seeing here. Would you
mind providing the rationale behind this work?



GuC submission will be disabled by default on all current upstream
platforms behind a module parameter - enable_guc. A value of 3 will
enable submission and HuC loading via the GuC. GuC submission should
work on all gen11+ platforms assuming the GuC firmware is present.


What is the plan here when it comes to keeping support for execlist? I
am afraid that landing GuC support in Linux is the first step towards
killing the execlist, which would force users to use proprietary
firmwares that even most Intel engineers have little influence over.
Indeed, if "drm/i915/guc: Disable semaphores when using GuC scheduling"
which states "Disable semaphores when using GuC scheduling as semaphores
are broken in the current GuC firmware." is anything to go by, it means
that even Intel developers seem to prefer working around the GuC
firmware, rather than fixing it.


Yes, landing GuC support may be the first step in removing execlist
support. The inevitable reality is that GPU scheduling is coming and
likely to be there only path in the not-too-distant future. (See also
the ongoing thread with AMD about fences.) I'm not going to pass
judgement on whether or not this is a good thing.  I'm just reading the
winds and, in my view, this is where things are headed for good or ill.

In answer to the question above, the answer to "what do we gain from
GuC?" may soon be, "you get to use your GPU."  We're not there yet and,
again, I'm not necessarily advocating for it, but that is likely where
things are headed.


This will be a sad day, especially since it seems fundamentally opposed
with any long-term support, on top of taking away user freedom to
fix/tweak their system when Intel won't.


A firmware-based submission model isn't a bad design IMO and, aside from
the firmware freedom issues, I think there are actual advantages to the
model. Immediately, it'll unlock a few features like parallel submission
(more on that in a bit) and long-running compute because they're
implemented in GuC and the work to implement them properly in the
execlist scheduler is highly non-trivial. Longer term, it may (no
guarantees) unlock some performance by getting the kernel out of the way.


Oh, I definitely agree with firmware-based submission model not being a
bad design. I was even cheering for it in 2015. Experience with it made
me regret that deeply since :s

But with the DRM scheduler being responsible for most things, I fail to
see what we could offload in the GuC except context switching (like
every other manufacturer). The problem is, the GuC does way more than
just switching registers in bulk, and if the number of revisions of the
GuC is anything to go by, it is way too complex for me to feel
comfortable with it.


It's more than just bulk register writes. When it comes to load-balancing 
multiple GPU users, firmware can theoretically preempt and switch faster 
leading to more efficient time-slicing. All we really need the DRM 
scheduler for is handling implicit dma_fence dependencies between different 
applications.






In the same vein, I have another concern related to the impact of GuC on
Linux's stable releases. Let's say that in 3 years, a new application
triggers a bug in command submission inside the firmware. Given that the
Linux community cannot patch the GuC firmware, how likely is it that
Intel would release a new GuC version? That would not be necessarily
such a big problem if newer versions of the GuC could easily be
backported to this potentially-decade-old Linux version, but given that
the GuC seems to have ABI-breaking changes on a monthly cadence (we are
at major version 60 *already*? :o), I would say that it is
highly-unlikely that it would not require potentially-extensive changes
to i915 to make it work, making the fix almost impossible to land in the
stable tree... Do you have a plan to mitigate this problem?

Patches like "drm/i915/guc: Disable bonding extension with GuC
submission" 

Re: [PATCH 3/3] [RFC] drm/exynos: Add basic i.MX8MM support code

2021-05-10 Thread Tim Harvey
On Mon, May 10, 2021 at 7:12 AM Adam Ford  wrote:
>
> On Mon, Oct 5, 2020 at 8:48 AM Marek Vasut  wrote:
> >
> > This adds basic i.MX8MM glue code for the Samsung DSIM PHY.
> > There are still a couple of items which need to be sorted out
> > in drivers/gpu/drm/bridge/samsung-dsim.c before this can even
> > be merged, specifically:
> >
> > - The dsi->out_bridge is not populated until samsung_dsim_host_attach()
> >   is called, however samsung_dsim_host_attach() is not called until the
> >   next bridge attaches and calls mipi_dsi_attach(), and that only happens
> >   after the DSIM calls drm_bridge_attach() on that next bridge.
> >
> > - The samsung_dsim_bridge_mode_fixup() is needed for iMX8MM LCDIF to set
> >   the correct sync flags. This likely needs to be done in the glue code.
>
> Since you asked for an RFC, I
> I applied Michael's series and this series to 5.12 since we are so
> close on having the blk-clk and the power domain stuff working.  I
> also tried your patch for the ti-sn65dsi83 and the adv7511 on the
> Beacon imx8mm development kit.
>
> In both the HDMI bridge and LVDS bridge, I am able to get the modetest
> and drmdevice to return data that looks valid.  The resolution and
> refresh look correct, but I am not able to get an actual image to
> generate out to either the LVDS or the HDMI.  I am able to get the
> image to appear using the NXP kernel with the ADV7511 HDMI bridge, so
> that leads me to believe there might be something wrong with either
> LCDIF or the Samsung DSIM layer code.  I am guessing it's the Samsung
> DSIM stuff since the LCDIF has been around for a while.
>
> I am not particularly well versed in the video world, but if you have
> something you'd like me to try, I am willing to try it.
>

Adam,

Try the patches Frieder had to make for his display from his git here:
https://github.com/fschrempf/linux/commits/v5.10-mx8mm-graphics.

I found I needed these for the display I have:
drm/exynos: Fix DE polarity for usage with LCDIF encoder
drm/exynos: Fix PLL PMS offset for P value bitfield

Best regards,

Tim


Re: [PATCH 1/2] drm: Fix dirtyfb stalls

2021-05-10 Thread Daniel Vetter
On Mon, May 10, 2021 at 06:14:20PM +0200, Daniel Vetter wrote:
> On Sat, May 08, 2021 at 12:56:38PM -0700, Rob Clark wrote:
> > From: Rob Clark 
> > 
> > drm_atomic_helper_dirtyfb() will end up stalling for vblank on "video
> > mode" type displays, which is pointless and unnecessary.  Add an
> > optional helper vfunc to determine if a plane is attached to a CRTC
> > that actually needs dirtyfb, and skip over them.
> > 
> > Signed-off-by: Rob Clark 
> 
> So this is a bit annoying because the idea of all these "remap legacy uapi
> to atomic constructs" helpers is that they shouldn't need/use anything
> beyond what userspace also has available. So adding hacks for them feels
> really bad.
> 
> Also I feel like it's not entirely the right thing to do here either.
> We've had this problem already on the fbcon emulation side (which also
> shouldn't be able to peek behind the atomic kms uapi curtain), and the fix
> there was to have a worker which batches up all the updates and avoids any
> stalls in bad places.
> 
> Since this is for frontbuffer rendering userspace only we can probably get
> away with assuming there's only a single fb, so the implementation becomes
> pretty simple:
> 
> - 1 worker, and we keep track of a single pending fb
> - if there's already a dirty fb pending on a different fb, we stall for
>   the worker to start processing that one already (i.e. the fb we track is
>   reset to NULL)
> - if it's pending on the same fb we just toss away all the updates and go
>   with a full update, since merging the clip rects is too much work :-) I
>   think there's helpers so you could be slightly more clever and just have
>   an overall bounding box
> 
> Could probably steal most of the implementation.

Maybe we should even do all this in the common dirtyfb code, before we
call into the driver hook. Gives more symmetry in how it works between
fbcon and direct rendering userspace.
-Daniel

> 
> This approach here feels a tad too much in the hacky area ...
> 
> Thoughts?
> -Daniel
> 
> > ---
> >  drivers/gpu/drm/drm_damage_helper.c  |  8 
> >  include/drm/drm_modeset_helper_vtables.h | 14 ++
> >  2 files changed, 22 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/drm_damage_helper.c 
> > b/drivers/gpu/drm/drm_damage_helper.c
> > index 3a4126dc2520..a0bed1a2c2dc 100644
> > --- a/drivers/gpu/drm/drm_damage_helper.c
> > +++ b/drivers/gpu/drm/drm_damage_helper.c
> > @@ -211,6 +211,7 @@ int drm_atomic_helper_dirtyfb(struct drm_framebuffer 
> > *fb,
> >  retry:
> > drm_for_each_plane(plane, fb->dev) {
> > struct drm_plane_state *plane_state;
> > +   struct drm_crtc *crtc;
> >  
> > ret = drm_modeset_lock(>mutex, state->acquire_ctx);
> > if (ret)
> > @@ -221,6 +222,13 @@ int drm_atomic_helper_dirtyfb(struct drm_framebuffer 
> > *fb,
> > continue;
> > }
> >  
> > +   crtc = plane->state->crtc;
> > +   if (crtc->helper_private->needs_dirtyfb &&
> > +   !crtc->helper_private->needs_dirtyfb(crtc)) {
> > +   drm_modeset_unlock(>mutex);
> > +   continue;
> > +   }
> > +
> > plane_state = drm_atomic_get_plane_state(state, plane);
> > if (IS_ERR(plane_state)) {
> > ret = PTR_ERR(plane_state);
> > diff --git a/include/drm/drm_modeset_helper_vtables.h 
> > b/include/drm/drm_modeset_helper_vtables.h
> > index eb706342861d..afa8ec5754e7 100644
> > --- a/include/drm/drm_modeset_helper_vtables.h
> > +++ b/include/drm/drm_modeset_helper_vtables.h
> > @@ -487,6 +487,20 @@ struct drm_crtc_helper_funcs {
> >  bool in_vblank_irq, int *vpos, int *hpos,
> >  ktime_t *stime, ktime_t *etime,
> >  const struct drm_display_mode *mode);
> > +
> > +   /**
> > +* @needs_dirtyfb
> > +*
> > +* Optional callback used by damage helpers to determine if 
> > fb_damage_clips
> > +* update is needed.
> > +*
> > +* Returns:
> > +*
> > +* True if fb_damage_clips update is needed to handle DIRTYFB, False
> > +* otherwise.  If this callback is not implemented, then True is
> > +* assumed.
> > +*/
> > +   bool (*needs_dirtyfb)(struct drm_crtc *crtc);
> >  };
> >  
> >  /**
> > -- 
> > 2.30.2
> > 
> 
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH V4 1/2] dt-bindings: drm/bridge: ti-sn65dsi83: Add TI SN65DSI83 and SN65DSI84 bindings

2021-05-10 Thread Rob Herring
On Sat, 08 May 2021 22:22:50 +0200, Marek Vasut wrote:
> Add DT binding document for TI SN65DSI83 and SN65DSI84 DSI to LVDS bridge.
> 
> Reviewed-by: Linus Walleij 
> Signed-off-by: Marek Vasut 
> Cc: Douglas Anderson 
> Cc: Jagan Teki 
> Cc: Laurent Pinchart 
> Cc: Linus Walleij 
> Cc: Rob Herring 
> Cc: Sam Ravnborg 
> Cc: Stephen Boyd 
> Cc: devicet...@vger.kernel.org
> To: dri-devel@lists.freedesktop.org
> ---
> V2: Add compatible string for SN65DSI84, since this is now tested on it
> V3: - Add 0x2c as valid i2c address
> - Switch to schemas/graph.yaml
> - Constraint data-lanes to <1>, <1 2>, <1 2 3>, <1 2 3 4> only
> - Indent example by 4 spaces
> - Handle dual-link LVDS with two ports and describe the second DSI
>   channel-B port as well. Based on the register defaults of DSI83
>   and DSI84, it is likely that the LVDS-channel-B and DSI-channel-B
>   hardware is present in all the chips, so just reuse port@0 and 2
>   for DSI83, port@0,2,3 for DSI84 and all of 0,1,2,3 for DSI85 when
>   that is supported
> V4: - Fix typo in port@3 description
> - Add RB from Linus Walleij
> - Replace oneOf: and const with enum:
> - ref /schemas/media/video-interfaces.yaml#
> - Drop empty endpoint: and properties:
> ---
>  .../bindings/display/bridge/ti,sn65dsi83.yaml | 159 ++
>  1 file changed, 159 insertions(+)
>  create mode 100644 
> Documentation/devicetree/bindings/display/bridge/ti,sn65dsi83.yaml
> 

Reviewed-by: Rob Herring 


Re: [PATCH 1/2] drm: Fix dirtyfb stalls

2021-05-10 Thread Daniel Vetter
On Sat, May 08, 2021 at 12:56:38PM -0700, Rob Clark wrote:
> From: Rob Clark 
> 
> drm_atomic_helper_dirtyfb() will end up stalling for vblank on "video
> mode" type displays, which is pointless and unnecessary.  Add an
> optional helper vfunc to determine if a plane is attached to a CRTC
> that actually needs dirtyfb, and skip over them.
> 
> Signed-off-by: Rob Clark 

So this is a bit annoying because the idea of all these "remap legacy uapi
to atomic constructs" helpers is that they shouldn't need/use anything
beyond what userspace also has available. So adding hacks for them feels
really bad.

Also I feel like it's not entirely the right thing to do here either.
We've had this problem already on the fbcon emulation side (which also
shouldn't be able to peek behind the atomic kms uapi curtain), and the fix
there was to have a worker which batches up all the updates and avoids any
stalls in bad places.

Since this is for frontbuffer rendering userspace only we can probably get
away with assuming there's only a single fb, so the implementation becomes
pretty simple:

- 1 worker, and we keep track of a single pending fb
- if there's already a dirty fb pending on a different fb, we stall for
  the worker to start processing that one already (i.e. the fb we track is
  reset to NULL)
- if it's pending on the same fb we just toss away all the updates and go
  with a full update, since merging the clip rects is too much work :-) I
  think there's helpers so you could be slightly more clever and just have
  an overall bounding box

Could probably steal most of the implementation.

This approach here feels a tad too much in the hacky area ...

Thoughts?
-Daniel
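
A rough sketch of that batching worker (hypothetical names, single tracked
fb, full-frame fallback instead of clip-rect merging, fb refcounting
omitted):

struct dirtyfb_batch {
	struct work_struct work;
	struct mutex lock;
	struct drm_framebuffer *fb;	/* pending fb, NULL once the worker took it */
};

static void dirtyfb_batch_work(struct work_struct *work)
{
	struct dirtyfb_batch *batch = container_of(work, struct dirtyfb_batch, work);
	struct drm_framebuffer *fb;

	mutex_lock(&batch->lock);
	fb = batch->fb;
	batch->fb = NULL;
	mutex_unlock(&batch->lock);

	if (fb)	/* full update, no clip rects */
		drm_atomic_helper_dirtyfb(fb, NULL, 0, 0, NULL, 0);
}

static int dirtyfb_batch_queue(struct dirtyfb_batch *batch,
			       struct drm_framebuffer *fb)
{
	mutex_lock(&batch->lock);
	while (batch->fb && batch->fb != fb) {
		/* different fb pending: stall until the worker has processed it */
		mutex_unlock(&batch->lock);
		flush_work(&batch->work);
		mutex_lock(&batch->lock);
	}
	batch->fb = fb;	/* same fb: all updates collapse into one full update */
	mutex_unlock(&batch->lock);
	schedule_work(&batch->work);

	return 0;
}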

> ---
>  drivers/gpu/drm/drm_damage_helper.c  |  8 
>  include/drm/drm_modeset_helper_vtables.h | 14 ++
>  2 files changed, 22 insertions(+)
> 
> diff --git a/drivers/gpu/drm/drm_damage_helper.c 
> b/drivers/gpu/drm/drm_damage_helper.c
> index 3a4126dc2520..a0bed1a2c2dc 100644
> --- a/drivers/gpu/drm/drm_damage_helper.c
> +++ b/drivers/gpu/drm/drm_damage_helper.c
> @@ -211,6 +211,7 @@ int drm_atomic_helper_dirtyfb(struct drm_framebuffer *fb,
>  retry:
>   drm_for_each_plane(plane, fb->dev) {
>   struct drm_plane_state *plane_state;
> + struct drm_crtc *crtc;
>  
>   ret = drm_modeset_lock(>mutex, state->acquire_ctx);
>   if (ret)
> @@ -221,6 +222,13 @@ int drm_atomic_helper_dirtyfb(struct drm_framebuffer *fb,
>   continue;
>   }
>  
> + crtc = plane->state->crtc;
> + if (crtc->helper_private->needs_dirtyfb &&
> + !crtc->helper_private->needs_dirtyfb(crtc)) {
> + drm_modeset_unlock(>mutex);
> + continue;
> + }
> +
>   plane_state = drm_atomic_get_plane_state(state, plane);
>   if (IS_ERR(plane_state)) {
>   ret = PTR_ERR(plane_state);
> diff --git a/include/drm/drm_modeset_helper_vtables.h 
> b/include/drm/drm_modeset_helper_vtables.h
> index eb706342861d..afa8ec5754e7 100644
> --- a/include/drm/drm_modeset_helper_vtables.h
> +++ b/include/drm/drm_modeset_helper_vtables.h
> @@ -487,6 +487,20 @@ struct drm_crtc_helper_funcs {
>bool in_vblank_irq, int *vpos, int *hpos,
>ktime_t *stime, ktime_t *etime,
>const struct drm_display_mode *mode);
> +
> + /**
> +  * @needs_dirtyfb
> +  *
> +  * Optional callback used by damage helpers to determine if 
> fb_damage_clips
> +  * update is needed.
> +  *
> +  * Returns:
> +  *
> +  * True if fb_damage_clips update is needed to handle DIRTYFB, False
> +  * otherwise.  If this callback is not implemented, then True is
> +  * assumed.
> +  */
> + bool (*needs_dirtyfb)(struct drm_crtc *crtc);
>  };
>  
>  /**
> -- 
> 2.30.2
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [RFC PATCH 8/9] drm/gem: Associate GEM objects with drm cgroup

2021-05-10 Thread Tamminen, Eero T
Hi,

Mon, 2021-05-10 at 17:36 +0200, Daniel Vetter wrote:
> 
...
> > If DRM allows user-space to exhaust all of system memory, this seems
> > to be a gap in enforcement of MEMCG limits for system memory.
> > I tried to look into it when this was discussed in the past
> > My guess is that shmem_read_mapping_page_gfp() ->
> > shmem_getpage_gfp() is not choosing the correct MM to charge against
> > in the use case of drivers using shmemfs for backing gem buffers.
> 
> Yeah we know about this one since forever. The bug report is roughly
> as old as the gem/ttm memory managers :-/ So another problem might be
> that if we now suddenly include gpu memory in the memcg accounting, we
> start breaking a bunch of workloads that worked just fine beforehand.

It's not the first time tightening security requires adapting settings
for running workloads...

Workload GPU memory usage needs to be a significant portion of the
application's real memory usage for the workload to hit limits that
were set for it earlier.  Therefore I think it is definitely
something that a user setting such limits actually cares about.

=> I think the important thing is that the reason for the failures is clear
from the OOM message.  This works much better if GPU related memory
usage is specifically stated in that message, once that memory starts to
be accounted as system memory.


- Eero



Re: [PATCH] component: Move host device to end of device lists on binding

2021-05-10 Thread Daniel Vetter
On Sat, May 08, 2021 at 12:41:18AM -0700, Stephen Boyd wrote:
> The device lists are poorly ordered when the component device code is
> used. This is because component_master_add_with_match() returns 0
> regardless of whether the component devices have called component_add()
> first. It can really only fail if an allocation fails, in which case
> everything is going badly and we're out of memory. The host device
> (called master_dev in the code) can succeed at probe and be put on the
> device lists before
> any of the component devices are probed and put on the lists.
> 
> Within the component device framework this usually isn't that bad
> because the real driver work is done at bind time via
> component{,master}_ops::bind(). It becomes a problem when the driver
> core, or host driver, wants to operate on the component device outside
> of the bind/unbind functions, e.g. via 'remove' or 'shutdown'. The
> driver core doesn't understand the relationship between the host device
> and the component devices and could possibly try to operate on component
> devices when they're already removed from the system or shut down.
> 
> Normally, device links or probe defer would reorder the lists and put
> devices that depend on other devices in the lists at the correct
> location, but with component devices this doesn't happen because this
> information isn't expressed anywhere. Drivers simply succeed at
> registering their component or host with the component framework and
> wait for their bind() callback to be called once the other components
> are ready. We could make various device links between 'master_dev' and
> 'component->dev' but it's not necessary. Let's simply move the hosting
> device to the end of the device lists when the component device fully
> binds. This way we know that all components are present and have probed
> properly and now the host device has really probed so it's safe to
> assume the host driver ops can operate on any component device.
> 
> This fixes the msm display driver shutdown path when the DSI controller
> is connected to a DSI bridge that is controlled via i2c. In this case,
> the msm display driver wants to tear down the display pipeline on
> shutdown at msm_pdev_shutdown() by calling drm_atomic_helper_shutdown(),
> and it can't do that unless the whole display chain is still probed and
> active in the system. When a display bridge is on i2c, the i2c device
> for the bridge will be created whenever the i2c controller probes, which
> could be before or after the msm display driver probes. If the i2c
> controller probes after the display driver, then the i2c controller will
> be shutdown before the display controller during system wide shutdown
> and thus i2c transactions will stop working before the display pipeline
> is shut down. This means we'll have the display bridge trying to access
> an i2c bus that's shut down because drm_atomic_helper_shutdown() is
> trying to disable the bridge after the bridge is off.
> 
> Moving the host device to the end of the lists at bind time moves the
> drm_atomic_helper_shutdown() call before the i2c bus is shutdown.
> This fixes the immediate problem, but we could improve it a bit by
> modeling device links from the component devices to the host device
> indicating that they supply something, although it is slightly different
> because the consumer doesn't need the suppliers to probe to succeed.
> 
> Cc: "Rafael J. Wysocki" 
> Cc: Daniel Vetter 
> Cc: Russell King 
> Cc: Rob Clark 
> Cc: 
> Signed-off-by: Stephen Boyd 

Entirely aside, but an s/master/aggregate/ or similar over the entire
component.c codebase would help a pile in making it easier to understand
which part does what. Or at least I'm always terribly confused about which
bind binds what and all that, so maybe an additional review of whether we
have a clear split into aggregate and individual components is needed after
that initial fix.

On the actual topic: I agree there's a problem here, but I'm honestly not
sure how it should be fixed. That's way over my understanding of all the
device probe and pm interactions. Of which there are plenty.

One question I have: Why is the bridge component driver not correctly
ordered wrt the i2c driver it needs? The idea is that the aggregate driver
doesn't access any hw itself, but entirely relies on all its components.
So as long as all the component drivers are sorted correctly in the device
list, things /should/ work. And as soon as we drop out a single component,
the aggregate gets unbound (and then does all the
drm_atomic_helper_shutdown and all the other drm teardown). So is the bug
perhaps that msm does the drm teardown in the wrong callback?
-Daniel

> ---
>  drivers/base/component.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/drivers/base/component.c b/drivers/base/component.c
> index dcfbe7251dc4..de645420bae2 100644
> --- a/drivers/base/component.c
> +++ b/drivers/base/component.c
> @@ -15,6 +15,8 @@
>  #include 
>  #include 
>  
> 
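
The patch body is truncated in this archive. As a rough, hypothetical sketch
of the idea only (the helper choice, function name, and call site below are
assumptions, not taken from the patch): once the aggregate driver's bind()
has succeeded, the host device could be pushed to the tail of the device
lists with the driver-core internal helper device_pm_move_to_tail(), e.g.:

/*
 * Hypothetical helper: called once every component has bound and the
 * aggregate driver's ->bind() returned 0.  Moving the host device to the
 * tail of the device lists means system-wide shutdown handles it before
 * the component devices it depends on.
 */
static void component_master_reorder(struct device *host_dev)
{
	device_pm_move_to_tail(host_dev);
}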

Re: [PATCH 00/53] Get rid of UTF-8 chars that can be mapped as ASCII

2021-05-10 Thread Edward Cree
On 10/05/2021 14:38, Mauro Carvalho Chehab wrote:
> Em Mon, 10 May 2021 14:16:16 +0100
> Edward Cree  escreveu:
>> But what kinds of things with × or — in are going to be grept for?
> 
> Actually, in almost all places, those aren't used inside math formulae, but
> instead they describe some video resolutions:
Ehh, those are also proper uses of ×.  It's still a multiplication,
 after all.

> it is way more likely that, if someone wants to grep, they would be
> doing something like this, in order to get video resolutions:
Why would someone be grepping for "all video resolutions mentioned in
 the documentation"?  That seems contrived to me.

-ed


Re: [PATCH 00/53] Get rid of UTF-8 chars that can be mapped as ASCII

2021-05-10 Thread Edward Cree
On 10/05/2021 14:59, Matthew Wilcox wrote:
> Most of these
> UTF-8 characters come from latex conversions and really aren't
> necessary (and are being used incorrectly).
I fully agree with fixing those.
The cover-letter, however, gave the impression that that was not the
 main purpose of this series; just, perhaps, a happy side-effect.

> You seem quite knowledgeable about the various differences.  Perhaps
> you'd be willing to write a document for Documentation/doc-guide/
> that provides guidance for when to use which kinds of horizontal
> line?
I have Opinions about the proper usage of punctuation, but I also know
 that other people have differing opinions.  For instance, I place
 spaces around an em dash, which is nonstandard according to most
 style guides.  Really this is an individual enough thing that I'm not
 sure we could have a "kernel style guide" that would be more useful
 than general-purpose guidance like the page you linked.
Moreover, such a guide could make non-native speakers needlessly self-
 conscious about their writing and discourage them from contributing
 documentation at all.  I'm not advocating here for trying to push
 kernel developers towards an eats-shoots-and-leaves level of
 linguistic pedantry; rather, I merely think that existing correct
 usages should be left intact (and therefore, excising incorrect usage
 should only be attempted by someone with both the expertise and time
 to check each case).

But if you really want such a doc I wouldn't mind contributing to it.

-ed


Re: [PATCH 00/53] Get rid of UTF-8 chars that can be mapped as ASCII

2021-05-10 Thread Ben Boeckel
On Mon, May 10, 2021 at 13:55:18 +0200, Mauro Carvalho Chehab wrote:
> $ git grep "CPU 0 has been" Documentation/RCU/
>   Documentation/RCU/Design/Data-Structures/Data-Structures.rst:| #. CPU 0 
> has been in dyntick-idle mode for quite some time. When it   |
>   Documentation/RCU/Design/Data-Structures/Data-Structures.rst:|
> notices that CPU 0 has been in dyntick idle mode, which qualifies  |

The kernel documentation uses hard line wraps, so such a naive grep is
going to always fail unless such line wraps are taken into account. Not
saying this isn't an improvement in and of itself, but smarter searching
strategies are likely needed anyways.

--Ben


Re: [PATCH 00/53] Get rid of UTF-8 chars that can be mapped as ASCII

2021-05-10 Thread Edward Cree
On 10/05/2021 12:55, Mauro Carvalho Chehab wrote:
> The main point on this series is to replace just the occurrences
> where ASCII represents the symbol equally well

>   - U+2014 ('—'): EM DASH
Em dash is not the same thing as hyphen-minus, and the latter does not
 serve 'equally well'.  People use em dashes because — even in
 monospace fonts — they make text easier to read and comprehend, when
 used correctly.
I accept that some of the other distinctions — like en dashes — are
 needlessly pedantic (though I don't doubt there is someone out there
 who will gladly defend them with the same fervour with which I argue
 for the em dash) and I wouldn't take the trouble to use them myself;
 but I think there is a reasonable assumption that when someone goes
 to the effort of using a Unicode punctuation mark that is semantic
 (rather than merely typographical), they probably had a reason for
 doing so.

>   - U+2018 ('‘'): LEFT SINGLE QUOTATION MARK
>   - U+2019 ('’'): RIGHT SINGLE QUOTATION MARK
>   - U+201c ('“'): LEFT DOUBLE QUOTATION MARK
>   - U+201d ('”'): RIGHT DOUBLE QUOTATION MARK
(These are purely typographic, I have no problem with dumping them.)

>   - U+00d7 ('×'): MULTIPLICATION SIGN
Presumably this is appearing in mathematical formulae, in which case
 changing it to 'x' loses semantic information.

> Using the above symbols will just trick tools like grep for no good
> reason.
NBSP, sure.  That one's probably an artefact of some document format
 conversion somewhere along the line, anyway.
But what kinds of things with × or — in are going to be grept for?

If there are em dashes lying around that semantically _should_ be
 hyphen-minus (one of your patches I've seen, for instance, fixes an
 *en* dash moonlighting as the option character in an `ethtool`
 command line), then sure, convert them.
But any time someone is using a Unicode character to *express
 semantics*, even if you happen to think the semantic distinction
 involved is a pedantic or unimportant one, I think you need an
 explicit grep case to justify ASCIIfying it.

-ed


  1   2   >