date:20211115


Am 16.11.21 um 08:37 schrieb Daniel Vetter:

On Mon, Nov 15, 2021 at 9:23 PM Christian König
 wrote:



Am 15.11.21 um 16:05 schrieb Daniel Vetter:

You need

commit 13e9e30cafea10dff6bc8d63a38a61249e83fd65
Author: Christian König 
Date:   Mon Oct 18 21:27:55 2021 +0200

 drm/scheduler: fix drm_sched_job_add_implicit_dependencies

which Christian pushed to drm-misc-next instead of drm-misc-fixes. I
already asked Christian in some other thread to cherry-pick it over.

Sounds like you haven't seen my answer to that request.

I can't cherry pick the patch to drm-misc-fixes because the patch which
broke things hasn't showed up in that branch yet causing a conflict.

Yeah I asked Maxime to roll forward to -rc1 right after sending out
this mail so you can do that.


I've pined him again just a second ago because a "dim update-branches" 
still doesn't show the patches from -rc1 this morning.



  Which you could have done too :-)


Hui? I can push merges from upstream into drm-misc-fixes? ^^

Christian.


-Daniel


Regards,
Christian.


-Daniel

On Mon, Nov 15, 2021 at 3:56 PM Daniel Stone  wrote:

Hi Ondrej,

On Mon, 15 Nov 2021 at 07:35, Ondřej Jirman  wrote:

I'm getting some fence refcounting related panics with the current
Linus's master branch:

It happens immediately whenever I start Xorg or sway.

Anyone has any ideas where to start looking? It works fine with v5.15.

(sorry for the interleaved log, it's coming from multiple CPUs at once
I guess)

Thanks a lot for the report - are you able to bisect this please?

Cheers,
Daniel


--
Daniel Vetter
Software Engineer, Intel Corporation
https://nam11.safelinks.protection.outlook.com/?url=http%3A%2F%2Fblog.ffwll.ch%2Fdata=04%7C01%7Cchristian.koenig%40amd.com%7C16b6abb8435be5c908d9a8d3e8d8%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637726450378319982%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000sdata=0SieiO%2FNcLgRmlDWvyifVcfsfHGbVhQqA4ff6oj81SQ%3Dreserved=0

Re: Panic with linus/master and panfrost

2021-11-15 Thread Daniel Vetter

On Mon, Nov 15, 2021 at 9:23 PM Christian König
 wrote:
>
>
>
> Am 15.11.21 um 16:05 schrieb Daniel Vetter:
> > You need
> >
> > commit 13e9e30cafea10dff6bc8d63a38a61249e83fd65
> > Author: Christian König 
> > Date:   Mon Oct 18 21:27:55 2021 +0200
> >
> > drm/scheduler: fix drm_sched_job_add_implicit_dependencies
> >
> > which Christian pushed to drm-misc-next instead of drm-misc-fixes. I
> > already asked Christian in some other thread to cherry-pick it over.
>
> Sounds like you haven't seen my answer to that request.
>
> I can't cherry pick the patch to drm-misc-fixes because the patch which
> broke things hasn't showed up in that branch yet causing a conflict.

Yeah I asked Maxime to roll forward to -rc1 right after sending out
this mail so you can do that. Which you could have done too :-)
-Daniel

>
> Regards,
> Christian.
>
> > -Daniel
> >
> > On Mon, Nov 15, 2021 at 3:56 PM Daniel Stone  wrote:
> >> Hi Ondrej,
> >>
> >> On Mon, 15 Nov 2021 at 07:35, Ondřej Jirman  wrote:
> >>> I'm getting some fence refcounting related panics with the current
> >>> Linus's master branch:
> >>>
> >>> It happens immediately whenever I start Xorg or sway.
> >>>
> >>> Anyone has any ideas where to start looking? It works fine with v5.15.
> >>>
> >>> (sorry for the interleaved log, it's coming from multiple CPUs at once
> >>> I guess)
> >> Thanks a lot for the report - are you able to bisect this please?
> >>
> >> Cheers,
> >> Daniel
> >
> >
> > --
> > Daniel Vetter
> > Software Engineer, Intel Corporation
> > https://nam11.safelinks.protection.outlook.com/?url=http%3A%2F%2Fblog.ffwll.ch%2Fdata=04%7C01%7Cchristian.koenig%40amd.com%7Cc541030e445e472b082808d9a84954cc%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637725855208408806%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000sdata=LVQEVyNFPE1hpZjlD%2BApOVsfUBEPYPiRVVp5Gkut%2BcU%3Dreserved=0
>


-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

Re: [PATCH] drm/radeon:WARNING opportunity for max()


Am 16.11.21 um 06:50 schrieb zhaoxiao:

Fix following coccicheck warning:
drivers/gpu/drm/radeon/r100.c:3450:26-27: WARNING opportunity for max()
drivers/gpu/drm/radeon/r100.c:2812:23-24: WARNING opportunity for max()

Signed-off-by: zhaoxiao 


In general feel free to add my Acked-by, but I'm not sure if we want 
such cleanups in a driver which is only in maintenance mode.


If you do this as part of a general and automated cleanup then it is 
probably ok.


Regards,
Christian.


---
  drivers/gpu/drm/radeon/r100.c | 10 ++
  1 file changed, 2 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/radeon/r100.c b/drivers/gpu/drm/radeon/r100.c
index 2dd85ba1faa2..c65ee6f44af2 100644
--- a/drivers/gpu/drm/radeon/r100.c
+++ b/drivers/gpu/drm/radeon/r100.c
@@ -2809,10 +2809,7 @@ void r100_vram_init_sizes(struct radeon_device *rdev)
if (rdev->mc.aper_size > config_aper_size)
config_aper_size = rdev->mc.aper_size;
  
-		if (config_aper_size > rdev->mc.real_vram_size)

-   rdev->mc.mc_vram_size = config_aper_size;
-   else
-   rdev->mc.mc_vram_size = rdev->mc.real_vram_size;
+   rdev->mc.mc_vram_size = max(config_aper_size, 
rdev->mc.real_vram_size);
}
  }
  
@@ -3447,10 +3444,7 @@ void r100_bandwidth_update(struct radeon_device *rdev)

mc_latency_mclk.full += disp_latency_overhead.full + 
cur_latency_mclk.full;
mc_latency_sclk.full += disp_latency_overhead.full + 
cur_latency_sclk.full;
  
-	if (mc_latency_mclk.full > mc_latency_sclk.full)

-   disp_latency.full = mc_latency_mclk.full;
-   else
-   disp_latency.full = mc_latency_sclk.full;
+   disp_latency.full = max(mc_latency_mclk.full, mc_latency_sclk.full);
  
  	/* setup Max GRPH_STOP_REQ default value */

if (ASIC_IS_RV100(rdev))

Re: [PATCH v3 5/6] drm/i915/ttm: Implement asynchronous TTM moves

2021-11-15 Thread Thomas Hellström




On 11/15/21 18:16, Matthew Auld wrote:

Thanks for reviewing, Matthew,

I'll take a look at the comments.

/Thomas



On 14/11/2021 11:12, Thomas Hellström wrote:

Don't wait sync while migrating, but rather make the GPU blit await the
dependencies and add a moving fence to the object.

This also enables asynchronous VRAM management in that on eviction,
rather than waiting for the moving fence to expire before freeing VRAM,
it is freed immediately and the fence is stored with the VRAM manager 
and
handed out to newly allocated objects to await before clears and 
swapins,

or for kernel objects before setting up gpu vmas or mapping.

To collect dependencies before migrating, add a set of utilities that
coalesce these to a single dma_fence.

What is still missing for fully asynchronous operation is 
asynchronous vma

unbinding, which is still to be implemented.

This commit substantially reduces execution time in the 
gem_lmem_swapping

test.

v2:
- Make a couple of functions static.

Re: [PATCH v4] drm/ttm: Clarify that the TTM_PL_SYSTEM is under TTMs control

Am 13.11.21 um 12:26 schrieb Thomas Hellström:

Hi, Zack,

On 11/11/21 17:44, Zack Rusin wrote:

On Wed, 2021-11-10 at 09:50 -0500, Zack Rusin wrote:

I took a slightly deeper look into this. I think we need to formalize
this a bit more to understand pros and cons and what the restrictions
are really all about. Anybody looking at the prevous discussion will
mostly see arguments similar to "this is stupid and difficult" and "it
has always been this way" which are not really constructive.

First disregarding all accounting stuff, I think this all boils down
to TTM_PL_SYSTEM having three distinct states:

1) POPULATED
2) LIMBO (Or whatever we want to call it. No pages present)
3) SWAPPED.

Now using a bounce domain to proxy SYSTEM means that the driver can
forget about the SWAPPED state, it's automatically handled by the move
setup code. However, another pitfall is LIMBO, in that if when you
move from SYSTEM/LIMBO to your bounce domain, the BO will be
populated. So any naive accelerated move() implementation creating a
1GB BO in fixed memory, like VRAM, will needlessly allocate and free
1GB of system memory in the process instead of just performing a clear
operation. Looks like amdgpu suffers from this?

I think what is really needed is either

b) Make LIMBO and SWAPPED distinct memory regions. (I think I'd vote
for this without looking into it in detail).

In both these cases, we should really make SYSTEM bindable by GPU,
otherwise we'd just be trading one pitfall for another related without
really resolving the root problem.

I think the problem goes deeper than what has been mentioned here so far.

Having fences attached to BOs in the system domain is probably ok, but
the key point is that the BOs in the system domain are under TTMs
control and should not be touched by the driver.

What we have now is that TTMs internals like the allocation state of BOs
in system memory (the populated, limbo, swapped you mentioned above) is
leaking into the drivers and I think exactly that is the part which
doesn't work reliable here. You can of course can get that working, but
that requires knowledge of the internal state which in my eyes was
always illegal.

What we could do is to split the system domain into SYSTEM and SWAP and
then say only the swap domain is under TTMs control.

Regards,
Christian.

Thanks,

/Thomas

Polite and gentle ping on that one. Are we ok with the wording here?

RE: [PATCH 3/4] drm/aspeed: Update INTR_STS handling

2021-11-15 Thread Tommy Huang

Hi Joel,

Thanks for your comment.
I will modify the patch and send it again.

> -Original Message-
> From: Joel Stanley 
> Sent: Tuesday, November 16, 2021 2:19 PM
> To: Tommy Huang 
> Cc: David Airlie ; Daniel Vetter ; Rob
> Herring ; Andrew Jeffery ;
> linux-aspeed ; open list:DRM DRIVERS
> ; devicetree ;
> Linux ARM ; Linux Kernel Mailing List
> ; BMC-SW 
> Subject: Re: [PATCH 3/4] drm/aspeed: Update INTR_STS handling
> 
> On Mon, 1 Nov 2021 at 11:01, tommy-huang
>  wrote:
> >
> > The V-sync INTR_STS is differnet on AST2600.
> > Change into general rule to handle it.
> >
> > Signed-off-by: tommy-huang 
> > ---
> >  drivers/gpu/drm/aspeed/aspeed_gfx.h |  2 ++
> >  drivers/gpu/drm/aspeed/aspeed_gfx_drv.c | 26
> > ++---
> >  2 files changed, 25 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/aspeed/aspeed_gfx.h
> > b/drivers/gpu/drm/aspeed/aspeed_gfx.h
> > index 96501152bafa..5eed9275bce7 100644
> > --- a/drivers/gpu/drm/aspeed/aspeed_gfx.h
> > +++ b/drivers/gpu/drm/aspeed/aspeed_gfx.h
> > @@ -12,6 +12,8 @@ struct aspeed_gfx {
> > struct regmap   *scu;
> >
> > u32 dac_reg;
> > +   u32 int_reg;
> > +   u32 int_clr_reg;
> > u32 vga_scratch_reg;
> > u32 throd_val;
> > u32 scan_line_max;
> > diff --git a/drivers/gpu/drm/aspeed/aspeed_gfx_drv.c
> > b/drivers/gpu/drm/aspeed/aspeed_gfx_drv.c
> > index b53fee6f1c17..1092060cb59c 100644
> > --- a/drivers/gpu/drm/aspeed/aspeed_gfx_drv.c
> > +++ b/drivers/gpu/drm/aspeed/aspeed_gfx_drv.c
> > @@ -60,6 +60,8 @@
> >
> >  struct aspeed_gfx_config {
> > u32 dac_reg;/* DAC register in SCU */
> > +   u32 int_status_reg; /* Interrupt status register */
> 
> This is the same for all supported chips; do you need to introduce the 
> variable
> for it?
> 
> > +   u32 int_clear_reg;  /* Interrupt clear register */
> > u32 vga_scratch_reg;/* VGA scratch register in SCU */
> > u32 throd_val;  /* Default Threshold Seting */
> > u32 scan_line_max;  /* Max memory size of one scan line */
> > @@ -67,6 +69,8 @@ struct aspeed_gfx_config {
> >
> >  static const struct aspeed_gfx_config ast2400_config = {
> > .dac_reg = 0x2c,
> > +   .int_status_reg = 0x60,
> > +   .int_clear_reg = 0x60,
> > .vga_scratch_reg = 0x50,
> > .throd_val = CRT_THROD_LOW(0x1e) | CRT_THROD_HIGH(0x12),
> > .scan_line_max = 64,
> > @@ -74,14 +78,26 @@ static const struct aspeed_gfx_config
> > ast2400_config = {
> >
> >  static const struct aspeed_gfx_config ast2500_config = {
> > .dac_reg = 0x2c,
> > +   .int_status_reg = 0x60,
> > +   .int_clear_reg = 0x60,
> > .vga_scratch_reg = 0x50,
> > .throd_val = CRT_THROD_LOW(0x24) | CRT_THROD_HIGH(0x3c),
> > .scan_line_max = 128,
> >  };
> >
> > +static const struct aspeed_gfx_config ast2600_config = {
> > +   .dac_reg = 0xc0,
> > +   .int_status_reg = 0x60,
> > +   .int_clear_reg = 0x68,
> > +   .vga_scratch_reg = 0x50,
> > +   .throd_val = CRT_THROD_LOW(0x50) | CRT_THROD_HIGH(0x70),
> > +   .scan_line_max = 128,
> > +};
> 
> This patch combines adding the clear_reg functionality with support for the
> 2600. Can you split them out; add int_clear_reg first, and then add support 
> for
> the 2600 in the following patch?
> 
> 
> > +
> >  static const struct of_device_id aspeed_gfx_match[] = {
> > { .compatible = "aspeed,ast2400-gfx", .data = _config },
> > { .compatible = "aspeed,ast2500-gfx", .data = _config
> > },
> > +   { .compatible = "aspeed,ast2600-gfx", .data = _config
> > + },
> > { },
> >  };
> >  MODULE_DEVICE_TABLE(of, aspeed_gfx_match); @@ -113,13 +129,15 @@
> > static irqreturn_t aspeed_gfx_irq_handler(int irq, void *data)  {
> > struct drm_device *drm = data;
> > struct aspeed_gfx *priv = to_aspeed_gfx(drm);
> > -   u32 reg;
> > +   u32 reg, clr_reg;
> >
> > -   reg = readl(priv->base + CRT_CTRL1);
> > +   reg = readl(priv->base + priv->int_reg);
> >
> > if (reg & CRT_CTRL_VERTICAL_INTR_STS) {
> > drm_crtc_handle_vblank(>pipe.crtc);
> > -   writel(reg, priv->base + CRT_CTRL1);
> > +   clr_reg = (readl(priv->base + priv->int_clr_reg) |
> > +   CRT_CTRL_VERTICAL_INTR_STS);
> > +   writel(clr_reg, priv->base + priv->int_clr_reg);
> You do not need to do read-modify-write.
> 
> On the 2500 you're writing back the value you read. On the 2600 all of the 
> bits
> are read only except the INTR_STS bit, which is W1C, so you can use the value
> you read from CRT_CTRL1, so this should be enough:
> 
>  writel(reg, priv->base + priv->int_clr_reg);
> 
> >

[PATCH] In function nvkm_ioctl_map(), the variable "type" could be uninitialized if "nvkm_object_map()" returns error code, however, it does not check the return value and directly use the "type" in t

2021-11-15 Thread Yizhuo Zhai

Fixes:01326050391ce("drm/nouveau/core/object: allow arguments to
be passed to map function")
Signed-off-by: Yizhuo Zhai 
---
 drivers/gpu/drm/nouveau/nvkm/core/ioctl.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/nouveau/nvkm/core/ioctl.c 
b/drivers/gpu/drm/nouveau/nvkm/core/ioctl.c
index 735cb6816f10..4264d9d79783 100644
--- a/drivers/gpu/drm/nouveau/nvkm/core/ioctl.c
+++ b/drivers/gpu/drm/nouveau/nvkm/core/ioctl.c
@@ -266,6 +266,8 @@ nvkm_ioctl_map(struct nvkm_client *client,
ret = nvkm_object_map(object, data, size, ,
  >v0.handle,
  >v0.length);
+   if (ret)
+   return ret;
if (type == NVKM_OBJECT_MAP_IO)
args->v0.type = NVIF_IOCTL_MAP_V0_IO;
else
-- 
2.25.1

Re: Questions about KMS flip


Am 16.11.21 um 04:27 schrieb Lang Yu:

On Mon, Nov 15, 2021 at 01:04:15PM +0100, Michel DDDnzer wrote:

[SNIP]

Though a single call to dce_v*_0_crtc_do_set_base() will
only pin the BO, I found it will be unpinned in next call to
dce_v*_0_crtc_do_set_base().

Yeah, that's the normal case when the new BO is different from the old one.

To catch the case I described, try something like

diff --git a/drivers/gpu/drm/amd/amdgpu/dce_v11_0.c 
b/drivers/gpu/drm/amd/amdgpu/dce_v11_0.c

index 18a7b3bd633b..5726bd87a355 100644

--- a/drivers/gpu/drm/amd/amdgpu/dce_v11_0.c

+++ b/drivers/gpu/drm/amd/amdgpu/dce_v11_0.c

@@ -1926,6 +1926,7 @@ static int dce_v11_0_crtc_do_set_base(struct drm_crtc 
*crtc,

 return r;



 if (!atomic) {

+   WARN_ON_ONCE(target_fb == fb);

 r = amdgpu_bo_pin(abo, AMDGPU_GEM_DOMAIN_VRAM);

 if (unlikely(r != 0)) {

 amdgpu_bo_unreserve(abo);


I did some tests, the warning can be triggered.

pin/unpin operations in *_crtc_do_set_base() and
amdgpu_display_crtc_page_flip_target() are mixed.


Ok sounds like we narrowed down the root cause pretty well.

Question is now how can we fix this? Just not pin the BO when target_fb 
== fb?


Thanks,
Christian.



Regards,
Lang

[PATCH v1 3/3] phy: qcom: Program SSC only if supported by sink

Some legacy eDP sinks may not support SSC. The support for SSC is
indicated through an opts flag from the controller driver. This
change will enable SSC only if the sink supports it.

Signed-off-by: Sankeerth Billakanti 
---
 drivers/phy/qualcomm/phy-qcom-edp.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/phy/qualcomm/phy-qcom-edp.c 
b/drivers/phy/qualcomm/phy-qcom-edp.c
index 6d9404d..06ba609 100644
--- a/drivers/phy/qualcomm/phy-qcom-edp.c
+++ b/drivers/phy/qualcomm/phy-qcom-edp.c
@@ -335,9 +335,11 @@ static int qcom_edp_phy_power_on(struct phy *phy)
writel(0x00, edp->tx0 + TXn_LANE_MODE_1);
writel(0x00, edp->tx1 + TXn_LANE_MODE_1);
 
-   ret = qcom_edp_configure_ssc(edp);
-   if (ret)
-   return ret;
+   if (edp->dp_opts.ssc) {
+   ret = qcom_edp_configure_ssc(edp);
+   if (ret)
+   return ret;
+   }
 
ret = qcom_edp_configure_pll(edp);
if (ret)
-- 
2.7.4

[PATCH v1 1/3] dt-bindings: phy: Add eDP PHY compatible for sc7280

Add compatible string for the supported eDP PHY on sc7280 platform.

Signed-off-by: Sankeerth Billakanti 
---
 Documentation/devicetree/bindings/phy/qcom,edp-phy.yaml | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/Documentation/devicetree/bindings/phy/qcom,edp-phy.yaml 
b/Documentation/devicetree/bindings/phy/qcom,edp-phy.yaml
index 9076e19..a5850ff 100644
--- a/Documentation/devicetree/bindings/phy/qcom,edp-phy.yaml
+++ b/Documentation/devicetree/bindings/phy/qcom,edp-phy.yaml
@@ -16,7 +16,9 @@ description:
 
 properties:
   compatible:
-const: qcom,sc8180x-edp-phy
+enum:
+  - qcom,sc7280-edp-phy
+  - qcom,sc8180x-edp-phy
 
   reg:
 items:
-- 
2.7.4

[PATCH v1 0/3] Add support for eDP PHY on SC7280 platform

This series adds support for the eDP PHY on Qualcomm SC7280 platform.
The changes are dependent on v4 of the new eDP PHY driver introduced by Bjorn:
https://patchwork.kernel.org/project/linux-arm-msm/list/?series=575135

Sankeerth Billakanti (3):
  dt-bindings: phy: Add eDP PHY compatible for sc7280
  phy: qcom: Add support for eDP PHY on sc7280
  phy: qcom: Program SSC only if supported by sink

 Documentation/devicetree/bindings/phy/qcom,edp-phy.yaml | 4 +++-
 drivers/phy/qualcomm/phy-qcom-edp.c | 9 ++---
 2 files changed, 9 insertions(+), 4 deletions(-)

-- 
2.7.4

[PATCH v1 2/3] phy: qcom: Add support for eDP PHY on sc7280

The sc7280 platform supports native eDP controller and PHY.
This change will add support for the eDP PHY on sc7280.

Signed-off-by: Sankeerth Billakanti 
---
 drivers/phy/qualcomm/phy-qcom-edp.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/phy/qualcomm/phy-qcom-edp.c 
b/drivers/phy/qualcomm/phy-qcom-edp.c
index 17d5653..6d9404d 100644
--- a/drivers/phy/qualcomm/phy-qcom-edp.c
+++ b/drivers/phy/qualcomm/phy-qcom-edp.c
@@ -654,6 +654,7 @@ static int qcom_edp_phy_probe(struct platform_device *pdev)
 }
 
 static const struct of_device_id qcom_edp_phy_match_table[] = {
+   { .compatible = "qcom,sc7280-edp-phy" },
{ .compatible = "qcom,sc8180x-edp-phy" },
{ }
 };
-- 
2.7.4

[PATCH] In function nvkm_ioctl_map(), the variable "type" could be uninitialized if "nvkm_object_map()" returns error code, however, it does not check the return value and directly use the "type" in t

2021-11-15 Thread Yizhuo Zhai

Fixes:01326050391ce("drm/nouveau/core/object: allow arguments to
be passed to map function")
Signed-off-by: Yizhuo Zhai 
---
 drivers/gpu/drm/nouveau/nvkm/core/ioctl.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/nouveau/nvkm/core/ioctl.c 
b/drivers/gpu/drm/nouveau/nvkm/core/ioctl.c
index 735cb6816f10..4264d9d79783 100644
--- a/drivers/gpu/drm/nouveau/nvkm/core/ioctl.c
+++ b/drivers/gpu/drm/nouveau/nvkm/core/ioctl.c
@@ -266,6 +266,8 @@ nvkm_ioctl_map(struct nvkm_client *client,
ret = nvkm_object_map(object, data, size, ,
  >v0.handle,
  >v0.length);
+   if (ret)
+   return ret;
if (type == NVKM_OBJECT_MAP_IO)
args->v0.type = NVIF_IOCTL_MAP_V0_IO;
else
-- 
2.25.1

Re: Panic with linus/master and panfrost


Am 16.11.21 um 00:04 schrieb Rob Clark:

On Mon, Nov 15, 2021 at 2:43 PM Rob Clark  wrote:

On Mon, Nov 15, 2021 at 8:16 AM Ondřej Jirman  wrote:

On Mon, Nov 15, 2021 at 05:04:36PM +0100, megi xff wrote:

On Mon, Nov 15, 2021 at 04:05:02PM +0100, Daniel Vetter wrote:

You need

commit 13e9e30cafea10dff6bc8d63a38a61249e83fd65
Author: Christian König 
Date:   Mon Oct 18 21:27:55 2021 +0200

drm/scheduler: fix drm_sched_job_add_implicit_dependencies

Thank you, that fixed the panic. :)

I spoke too soon. Panic is gone, but I still see (immediately after
starting Xorg):

[   13.290795] [ cut here ]
[   13.291103] refcount_t: addition on 0; use-after-free.
[   13.291495] WARNING: CPU: 5 PID: 548 at lib/refcount.c:25 
refcount_warn_saturate+0x98/0x140
[   13.292124] Modules linked in:
[   13.292285] CPU: 5 PID: 548 Comm: Xorg Not tainted 
5.16.0-rc1-00414-g21a254904a26 #29
[   13.292857] Hardware name: Pine64 PinePhonePro (DT)
[   13.293172] pstate: 4005 (nZcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[   13.293669] pc : refcount_warn_saturate+0x98/0x140
[   13.293977] lr : refcount_warn_saturate+0x98/0x140
[   13.294285] sp : 8000129a3b50
[   13.294464] x29: 8000129a3b50 x28: 8000129a3d50 x27: 17ec4b00
[   13.294979] x26: 0001 x25: 0001 x24: 127cca48
[   13.295494] x23: 17d19b00 x22: 000a x21: 0001
[   13.296006] x20: 17e15500 x19: 12980580 x18: 0003
[   13.296520] x17:  x16:  x15: 8000129a3b58
[   13.297033] x14:  x13: 2e656572662d7265 x12: 7466612d65737520
[   13.297546] x11: 3b30206e6f206e6f x10: 800011d6e8a0 x9 : 80001022f37c
[   13.298059] x8 : efff x7 : 800011dc68a0 x6 : 0001
[   13.298573] x5 :  x4 : f77a9788 x3 : f77b56f0
[   13.299085] x2 : f77a9788 x1 : 8000e5eb1000 x0 : 002a
[   13.299600] Call trace:
[   13.299704]  refcount_warn_saturate+0x98/0x140
[   13.299981]  drm_sched_job_add_implicit_dependencies+0x90/0xdc
[   13.300385]  panfrost_job_push+0xd0/0x1d4
[   13.300628]  panfrost_ioctl_submit+0x34c/0x440
[   13.300906]  drm_ioctl_kernel+0x9c/0x154
[   13.301142]  drm_ioctl+0x1f0/0x410
[   13.301330]  __arm64_sys_ioctl+0xb4/0xdc
[   13.301566]  invoke_syscall+0x4c/0x110
[   13.301787]  el0_svc_common.constprop.0+0x48/0xf0
[   13.302090]  do_el0_svc+0x2c/0x90
[   13.302271]  el0_svc+0x14/0x50
[   13.302431]  el0t_64_sync_handler+0x9c/0x120
[   13.302693]  el0t_64_sync+0x158/0x15c
[   13.302904] ---[ end trace 8c211e57f89714c8 ]---
[   13.303211] [ cut here ]
[   13.303504] refcount_t: underflow; use-after-free.
[   13.303820] WARNING: CPU: 5 PID: 548 at lib/refcount.c:28 
refcount_warn_saturate+0xec/0x140
[   13.304439] Modules linked in:
[   13.304596] CPU: 5 PID: 548 Comm: Xorg Tainted: GW 
5.16.0-rc1-00414-g21a254904a26 #29
[   13.305286] Hardware name: Pine64 PinePhonePro (DT)
[   13.305600] pstate: 4005 (nZcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[   13.306095] pc : refcount_warn_saturate+0xec/0x140
[   13.306402] lr : refcount_warn_saturate+0xec/0x140
[   13.306710] sp : 8000129a3b70
[   13.306887] x29: 8000129a3b70 x28: 8000129a3d50 x27: 17ec4b00
[   13.307401] x26: 0001 x25: 0001 x24: 
[   13.307914] x23:  x22: 129807c0 x21: 12980580
[   13.308428] x20: 17c54d00 x19:  x18: 0003
[   13.308942] x17:  x16:  x15: 8000129a3b58
[   13.309454] x14:  x13: 2e656572662d7265 x12: 7466612d65737520
[   13.309967] x11: 3b776f6c66726564 x10: 800011d6e8a0 x9 : 80001017893c
[   13.310480] x8 : efff x7 : 800011dc68a0 x6 : 0001
[   13.310993] x5 : f77a9788 x4 :  x3 : 0027
[   13.311506] x2 : 0023 x1 : f77a9790 x0 : 0026
[   13.312020] Call trace:
[   13.312123]  refcount_warn_saturate+0xec/0x140
[   13.312401]  dma_resv_add_excl_fence+0x1a8/0x1bc
[   13.312700]  panfrost_job_push+0x174/0x1d4
[   13.312949]  panfrost_ioctl_submit+0x34c/0x440
[   13.313229]  drm_ioctl_kernel+0x9c/0x154
[   13.313464]  drm_ioctl+0x1f0/0x410
[   13.313651]  __arm64_sys_ioctl+0xb4/0xdc
[   13.313884]  invoke_syscall+0x4c/0x110
[   13.314103]  el0_svc_common.constprop.0+0x48/0xf0
[   13.314405]  do_el0_svc+0x2c/0x90
[   13.314586]  el0_svc+0x14/0x50
[   13.314745]  el0t_64_sync_handler+0x9c/0x120
[   13.315007]  el0t_64_sync+0x158/0x15c
[   13.315217] ---[ end trace 8c211e57f89714c9 ]---

In dmesg. So this looks like some independent issue.



I'm seeing something similar with drm/msm, which is, I think, due to
the introduction and location of call to drm_sched_job_arm().. I'm
still trying to untangle where it should go,

[PATCH v3 13/13] drm/msm/dsi: Pass DSC params to drm_panel

When DSC is enabled, we need to pass the DSC parameters to panel driver
as well, so add a dsc parameter in panel and set it when DSC is enabled

Signed-off-by: Vinod Koul 
---
 drivers/gpu/drm/msm/dsi/dsi_host.c | 16 +++-
 include/drm/drm_panel.h|  7 +++
 2 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/msm/dsi/dsi_host.c 
b/drivers/gpu/drm/msm/dsi/dsi_host.c
index 2c14c36f0b3d..3d5773fcf496 100644
--- a/drivers/gpu/drm/msm/dsi/dsi_host.c
+++ b/drivers/gpu/drm/msm/dsi/dsi_host.c
@@ -2159,11 +2159,25 @@ int msm_dsi_host_modeset_init(struct mipi_dsi_host 
*host,
struct msm_dsi_host *msm_host = to_msm_dsi_host(host);
const struct msm_dsi_cfg_handler *cfg_hnd = msm_host->cfg_hnd;
struct msm_drm_private *priv;
+   struct drm_panel *panel;
int ret;
 
msm_host->dev = dev;
+   panel = msm_dsi_host_get_panel(_host->base);
priv = dev->dev_private;
-   priv->dsc = msm_host->dsc;
+
+   if (panel && panel->dsc) {
+   struct msm_display_dsc_config *dsc = priv->dsc;
+
+   if (!dsc) {
+   dsc = kzalloc(sizeof(*dsc), GFP_KERNEL);
+   if (!dsc)
+   return -ENOMEM;
+   dsc->drm = panel->dsc;
+   priv->dsc = dsc;
+   msm_host->dsc = dsc;
+   }
+   }
 
ret = cfg_hnd->ops->tx_buf_alloc(msm_host, SZ_4K);
if (ret) {
diff --git a/include/drm/drm_panel.h b/include/drm/drm_panel.h
index 4602f833eb51..eb8ae9bf32ed 100644
--- a/include/drm/drm_panel.h
+++ b/include/drm/drm_panel.h
@@ -171,6 +171,13 @@ struct drm_panel {
 * Panel entry in registry.
 */
struct list_head list;
+
+   /**
+* @dsc:
+*
+* Panel DSC pps payload to be sent
+*/
+   struct drm_dsc_config *dsc;
 };
 
 void drm_panel_init(struct drm_panel *panel, struct device *dev,
-- 
2.31.1

[PATCH v3 12/13] drm/msm/dsi: Add support for DSC configuration

When DSC is enabled, we need to configure DSI registers accordingly and
configure the respective stream compression registers.

Add support to calculate the register setting based on DSC params and
timing information and configure these registers.

Signed-off-by: Vinod Koul 
---
 drivers/gpu/drm/msm/dsi/dsi.xml.h  |  10 +++
 drivers/gpu/drm/msm/dsi/dsi_host.c | 113 -
 2 files changed, 122 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/msm/dsi/dsi.xml.h 
b/drivers/gpu/drm/msm/dsi/dsi.xml.h
index 49b551ad1bff..c1c85df58c4b 100644
--- a/drivers/gpu/drm/msm/dsi/dsi.xml.h
+++ b/drivers/gpu/drm/msm/dsi/dsi.xml.h
@@ -706,4 +706,14 @@ static inline uint32_t DSI_VERSION_MAJOR(uint32_t val)
 #define REG_DSI_CPHY_MODE_CTRL 0x02d4
 
 
+#define REG_DSI_VIDEO_COMPRESSION_MODE_CTRL0x029c
+
+#define REG_DSI_VIDEO_COMPRESSION_MODE_CTRL2   0x02a0
+
+#define REG_DSI_COMMAND_COMPRESSION_MODE_CTRL  0x02a4
+
+#define REG_DSI_COMMAND_COMPRESSION_MODE_CTRL2 0x02a8
+
+#define REG_DSI_COMMAND_COMPRESSION_MODE_CTRL3 0x02ac
+
 #endif /* DSI_XML */
diff --git a/drivers/gpu/drm/msm/dsi/dsi_host.c 
b/drivers/gpu/drm/msm/dsi/dsi_host.c
index 31d385d8d834..2c14c36f0b3d 100644
--- a/drivers/gpu/drm/msm/dsi/dsi_host.c
+++ b/drivers/gpu/drm/msm/dsi/dsi_host.c
@@ -908,6 +908,20 @@ static void dsi_ctrl_config(struct msm_dsi_host *msm_host, 
bool enable,
dsi_write(msm_host, REG_DSI_CPHY_MODE_CTRL, BIT(0));
 }
 
+static int dsi_dsc_update_pic_dim(struct msm_display_dsc_config *dsc,
+ int pic_width, int pic_height)
+{
+   if (!dsc || !pic_width || !pic_height) {
+   pr_err("DSI: invalid input: pic_width: %d pic_height: %d\n", 
pic_width, pic_height);
+   return -EINVAL;
+   }
+
+   dsc->drm->pic_width = pic_width;
+   dsc->drm->pic_height = pic_height;
+
+   return 0;
+}
+
 static void dsi_timing_setup(struct msm_dsi_host *msm_host, bool is_bonded_dsi)
 {
struct drm_display_mode *mode = msm_host->mode;
@@ -940,7 +954,68 @@ static void dsi_timing_setup(struct msm_dsi_host 
*msm_host, bool is_bonded_dsi)
hdisplay /= 2;
}
 
+   if (msm_host->dsc) {
+   struct msm_display_dsc_config *dsc = msm_host->dsc;
+
+   /* update dsc params with timing params */
+   dsi_dsc_update_pic_dim(dsc, mode->hdisplay, mode->vdisplay);
+   DBG("Mode Width- %d x Height %d\n", dsc->drm->pic_width, 
dsc->drm->pic_height);
+
+   /* we do the calculations for dsc parameters here so that
+* panel can use these parameters
+*/
+   dsi_populate_dsc_params(dsc);
+
+   /* Divide the display by 3 but keep back/font porch and
+* pulse width same
+*/
+   h_total -= hdisplay;
+   hdisplay /= 3;
+   h_total += hdisplay;
+   ha_end = ha_start + hdisplay;
+   }
+
if (msm_host->mode_flags & MIPI_DSI_MODE_VIDEO) {
+   if (msm_host->dsc) {
+   struct msm_display_dsc_config *dsc = msm_host->dsc;
+   u32 reg, intf_width, slice_per_intf;
+   u32 total_bytes_per_intf;
+
+   /* first calculate dsc parameters and then program
+* compress mode registers
+*/
+   intf_width = hdisplay;
+   slice_per_intf = DIV_ROUND_UP(intf_width, 
dsc->drm->slice_width);
+
+   dsc->drm->slice_count = 1;
+   dsc->bytes_in_slice = 
DIV_ROUND_UP(dsc->drm->slice_width * 8, 8);
+   total_bytes_per_intf = dsc->bytes_in_slice * 
slice_per_intf;
+
+   dsc->eol_byte_num = total_bytes_per_intf % 3;
+   dsc->pclk_per_line =  
DIV_ROUND_UP(total_bytes_per_intf, 3);
+   dsc->bytes_per_pkt = dsc->bytes_in_slice * 
dsc->drm->slice_count;
+   dsc->pkt_per_line = slice_per_intf / 
dsc->drm->slice_count;
+
+   reg = dsc->bytes_per_pkt << 16;
+   reg |= (0x0b << 8);/* dtype of compressed image */
+
+   /* pkt_per_line:
+* 0 == 1 pkt
+* 1 == 2 pkt
+* 2 == 4 pkt
+* 3 pkt is not supported
+* above translates to ffs() - 1
+*/
+   reg |= (ffs(dsc->pkt_per_line) - 1) << 6;
+
+   dsc->eol_byte_num = total_bytes_per_intf % 3;
+   reg |= dsc->eol_byte_num << 4;
+   reg |= 1;
+
+   dsi_write(msm_host,
+

[PATCH v3 11/13] drm/msm/dsi: add mode valid callback for dsi_mgr

Add a mode valid callback for dsi_mgr for checking mode being valid in
case of DSC. For DSC the height and width needs to be multiple of slice,
so we check that here

Signed-off-by: Vinod Koul 
---
 drivers/gpu/drm/msm/dsi/dsi.h |  2 ++
 drivers/gpu/drm/msm/dsi/dsi_host.c| 26 ++
 drivers/gpu/drm/msm/dsi/dsi_manager.c | 12 
 3 files changed, 40 insertions(+)

diff --git a/drivers/gpu/drm/msm/dsi/dsi.h b/drivers/gpu/drm/msm/dsi/dsi.h
index 569c8ff062ba..e7affab2fc1e 100644
--- a/drivers/gpu/drm/msm/dsi/dsi.h
+++ b/drivers/gpu/drm/msm/dsi/dsi.h
@@ -115,6 +115,8 @@ int msm_dsi_host_power_on(struct mipi_dsi_host *host,
 int msm_dsi_host_power_off(struct mipi_dsi_host *host);
 int msm_dsi_host_set_display_mode(struct mipi_dsi_host *host,
  const struct drm_display_mode *mode);
+enum drm_mode_status msm_dsi_host_check_dsc(struct mipi_dsi_host *host,
+   const struct drm_display_mode 
*mode);
 struct drm_panel *msm_dsi_host_get_panel(struct mipi_dsi_host *host);
 unsigned long msm_dsi_host_get_mode_flags(struct mipi_dsi_host *host);
 struct drm_bridge *msm_dsi_host_get_bridge(struct mipi_dsi_host *host);
diff --git a/drivers/gpu/drm/msm/dsi/dsi_host.c 
b/drivers/gpu/drm/msm/dsi/dsi_host.c
index 30c1e299aa52..31d385d8d834 100644
--- a/drivers/gpu/drm/msm/dsi/dsi_host.c
+++ b/drivers/gpu/drm/msm/dsi/dsi_host.c
@@ -2588,6 +2588,32 @@ int msm_dsi_host_set_display_mode(struct mipi_dsi_host 
*host,
return 0;
 }
 
+enum drm_mode_status msm_dsi_host_check_dsc(struct mipi_dsi_host *host,
+   const struct drm_display_mode *mode)
+{
+   struct msm_dsi_host *msm_host = to_msm_dsi_host(host);
+   struct msm_display_dsc_config *dsc = msm_host->dsc;
+   int pic_width = mode->hdisplay;
+   int pic_height = mode->vdisplay;
+
+   if (!msm_host->dsc)
+   return MODE_OK;
+
+   if (pic_width % dsc->drm->slice_width) {
+   pr_err("DSI: pic_width %d has to be multiple of slice %d\n",
+  pic_width, dsc->drm->slice_width);
+   return MODE_H_ILLEGAL;
+   }
+
+   if (pic_height % dsc->drm->slice_height) {
+   pr_err("DSI: pic_height %d has to be multiple of slice %d\n",
+  pic_height, dsc->drm->slice_height);
+   return MODE_V_ILLEGAL;
+   }
+
+   return MODE_OK;
+}
+
 struct drm_panel *msm_dsi_host_get_panel(struct mipi_dsi_host *host)
 {
return of_drm_find_panel(to_msm_dsi_host(host)->device_node);
diff --git a/drivers/gpu/drm/msm/dsi/dsi_manager.c 
b/drivers/gpu/drm/msm/dsi/dsi_manager.c
index 20c4d650fd80..0ad8a53aaa0e 100644
--- a/drivers/gpu/drm/msm/dsi/dsi_manager.c
+++ b/drivers/gpu/drm/msm/dsi/dsi_manager.c
@@ -579,6 +579,17 @@ static void dsi_mgr_bridge_mode_set(struct drm_bridge 
*bridge,
msm_dsi_host_set_display_mode(other_dsi->host, adjusted_mode);
 }
 
+static enum drm_mode_status dsi_mgr_bridge_mode_valid(struct drm_bridge 
*bridge,
+ const struct 
drm_display_info *info,
+ const struct 
drm_display_mode *mode)
+{
+   int id = dsi_mgr_bridge_get_id(bridge);
+   struct msm_dsi *msm_dsi = dsi_mgr_get_dsi(id);
+   struct mipi_dsi_host *host = msm_dsi->host;
+
+   return msm_dsi_host_check_dsc(host, mode);
+}
+
 static const struct drm_connector_funcs dsi_mgr_connector_funcs = {
.detect = dsi_mgr_connector_detect,
.fill_modes = drm_helper_probe_single_connector_modes,
@@ -600,6 +611,7 @@ static const struct drm_bridge_funcs dsi_mgr_bridge_funcs = 
{
.disable = dsi_mgr_bridge_disable,
.post_disable = dsi_mgr_bridge_post_disable,
.mode_set = dsi_mgr_bridge_mode_set,
+   .mode_valid = dsi_mgr_bridge_mode_valid,
 };
 
 /* initialize connector when we're connected to a drm_panel */
-- 
2.31.1

[PATCH v3 10/13] drm/msm/disp/dpu1: Add DSC support in RM

This add the bits in RM to enable the DSC blocks

Signed-off-by: Vinod Koul 
---
 drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h |  1 +
 drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c  | 66 +
 drivers/gpu/drm/msm/disp/dpu1/dpu_rm.h  |  1 +
 3 files changed, 68 insertions(+)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h
index 775bcbda860f..fd6672efb096 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h
@@ -146,6 +146,7 @@ struct dpu_global_state {
uint32_t ctl_to_enc_id[CTL_MAX - CTL_0];
uint32_t intf_to_enc_id[INTF_MAX - INTF_0];
uint32_t dspp_to_enc_id[DSPP_MAX - DSPP_0];
+   uint32_t dsc_to_enc_id[DSC_MAX - DSC_0];
 };
 
 struct dpu_global_state
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c
index f9c83d6e427a..c9d0fc765aaf 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c
@@ -11,6 +11,7 @@
 #include "dpu_hw_intf.h"
 #include "dpu_hw_dspp.h"
 #include "dpu_hw_merge3d.h"
+#include "dpu_hw_dsc.h"
 #include "dpu_encoder.h"
 #include "dpu_trace.h"
 
@@ -75,6 +76,14 @@ int dpu_rm_destroy(struct dpu_rm *rm)
dpu_hw_intf_destroy(hw);
}
}
+   for (i = 0; i < ARRAY_SIZE(rm->dsc_blks); i++) {
+   struct dpu_hw_dsc *hw;
+
+   if (rm->dsc_blks[i]) {
+   hw = to_dpu_hw_dsc(rm->dsc_blks[i]);
+   dpu_hw_dsc_destroy(hw);
+   }
+   }
 
return 0;
 }
@@ -221,6 +230,19 @@ int dpu_rm_init(struct dpu_rm *rm,
rm->dspp_blks[dspp->id - DSPP_0] = >base;
}
 
+   for (i = 0; i < cat->dsc_count; i++) {
+   struct dpu_hw_dsc *hw;
+   const struct dpu_dsc_cfg *dsc = >dsc[i];
+
+   hw = dpu_hw_dsc_init(dsc->id, mmio, cat);
+   if (IS_ERR_OR_NULL(hw)) {
+   rc = PTR_ERR(hw);
+   DPU_ERROR("failed dsc object creation: err %d\n", rc);
+   goto fail;
+   }
+   rm->dsc_blks[dsc->id - DSC_0] = >base;
+   }
+
return 0;
 
 fail:
@@ -476,6 +498,7 @@ static int _dpu_rm_reserve_intf(
}
 
global_state->intf_to_enc_id[idx] = enc_id;
+
return 0;
 }
 
@@ -500,6 +523,38 @@ static int _dpu_rm_reserve_intf_related_hw(
return ret;
 }
 
+static int _dpu_rm_reserve_dsc(struct dpu_rm *rm,
+  struct dpu_global_state *global_state,
+  struct drm_encoder *enc,
+  const struct msm_display_topology *top)
+{
+   struct msm_drm_private *priv;
+   int num_dsc = top->num_dsc;
+   int i;
+
+   priv = enc->dev->dev_private;
+
+   if (!priv)
+   return -EIO;
+
+   /* check if DSC is supported */
+   if (!priv->dsc)
+   return 0;
+
+   /* check if DSC required are allocated or not */
+   for (i = 0; i < num_dsc; i++) {
+   if (global_state->dsc_to_enc_id[i]) {
+   DPU_ERROR("DSC %d is already allocated\n", i);
+   return -EIO;
+   }
+   }
+
+   for (i = 0; i < num_dsc; i++)
+   global_state->dsc_to_enc_id[i] = enc->base.id;
+
+   return 0;
+}
+
 static int _dpu_rm_make_reservation(
struct dpu_rm *rm,
struct dpu_global_state *global_state,
@@ -526,6 +581,10 @@ static int _dpu_rm_make_reservation(
if (ret)
return ret;
 
+   ret  = _dpu_rm_reserve_dsc(rm, global_state, enc, >topology);
+   if (ret)
+   return ret;
+
return ret;
 }
 
@@ -567,6 +626,8 @@ void dpu_rm_release(struct dpu_global_state *global_state,
ARRAY_SIZE(global_state->ctl_to_enc_id), enc->base.id);
_dpu_rm_clear_mapping(global_state->intf_to_enc_id,
ARRAY_SIZE(global_state->intf_to_enc_id), enc->base.id);
+   _dpu_rm_clear_mapping(global_state->dsc_to_enc_id,
+   ARRAY_SIZE(global_state->dsc_to_enc_id), enc->base.id);
 }
 
 int dpu_rm_reserve(
@@ -640,6 +701,11 @@ int dpu_rm_get_assigned_resources(struct dpu_rm *rm,
hw_to_enc_id = global_state->dspp_to_enc_id;
max_blks = ARRAY_SIZE(rm->dspp_blks);
break;
+   case DPU_HW_BLK_DSC:
+   hw_blks = rm->dsc_blks;
+   hw_to_enc_id = global_state->dsc_to_enc_id;
+   max_blks = ARRAY_SIZE(rm->dsc_blks);
+   break;
default:
DPU_ERROR("blk type %d not managed by rm\n", type);
return 0;
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.h 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.h
index 1f12c8d5b8aa..278d2a510b80 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.h
+++

[PATCH v3 09/13] drm/msm/disp/dpu1: Add support for DSC in topology

For DSC to work we typically need a 2,2,1 configuration. This should
suffice for resolutions up to 4k. For more resolutions like 8k this won't
work.

Also, it is better to use 2 LMs and DSC instances as half width results
in lesser power consumption as compared to single LM, DSC at full width.

The panel has been tested only with 2,2,1 configuration, so for
now we blindly create 2,2,1 topology when DSC is enabled

Co-developed-by: Abhinav Kumar 
Signed-off-by: Abhinav Kumar 
Signed-off-by: Vinod Koul 
---
 drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c | 18 ++
 drivers/gpu/drm/msm/msm_drv.h   |  2 ++
 2 files changed, 20 insertions(+)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
index f2ff8a504918..12f58de88ac7 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
@@ -533,6 +533,8 @@ static struct msm_display_topology dpu_encoder_get_topology(
struct drm_display_mode *mode)
 {
struct msm_display_topology topology = {0};
+   struct drm_encoder *drm_enc;
+   struct msm_drm_private *priv;
int i, intf_count = 0;
 
for (i = 0; i < MAX_PHYS_ENCODERS_PER_VIRTUAL; i++)
@@ -567,8 +569,24 @@ static struct msm_display_topology 
dpu_encoder_get_topology(
topology.num_enc = 0;
topology.num_intf = intf_count;
 
+   drm_enc = _enc->base;
+   priv = drm_enc->dev->dev_private;
+   if (priv && priv->dsc) {
+   /* In case of Display Stream Compression DSC, we would use
+* 2 encoders, 2 line mixers and 1 interface
+* this is power optimal and can drive up to (including) 4k
+* screens
+*/
+   topology.num_enc = 2;
+   topology.num_dsc = 2;
+   topology.num_intf = 1;
+   topology.num_lm = 2;
+   priv->dsc->dsc_mask = BIT(0) | BIT(1);
+   }
+
return topology;
 }
+
 static int dpu_encoder_virt_atomic_check(
struct drm_encoder *drm_enc,
struct drm_crtc_state *crtc_state,
diff --git a/drivers/gpu/drm/msm/msm_drv.h b/drivers/gpu/drm/msm/msm_drv.h
index c4a588ad226e..d6b25d77700e 100644
--- a/drivers/gpu/drm/msm/msm_drv.h
+++ b/drivers/gpu/drm/msm/msm_drv.h
@@ -103,12 +103,14 @@ enum msm_event_wait {
  * @num_enc:  number of compression encoder blocks used
  * @num_intf: number of interfaces the panel is mounted on
  * @num_dspp: number of dspp blocks used
+ * @num_dsc:  number of Display Stream Compression (DSC) blocks used
  */
 struct msm_display_topology {
u32 num_lm;
u32 num_enc;
u32 num_intf;
u32 num_dspp;
+   u32 num_dsc;
 };
 
 /**
-- 
2.31.1

[PATCH v3 08/13] drm/msm: Add missing structure documentation

Somehow documentation for dspp was missed, so add that

Signed-off-by: Vinod Koul 
---
 drivers/gpu/drm/msm/msm_drv.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/msm/msm_drv.h b/drivers/gpu/drm/msm/msm_drv.h
index de7cb65bfc52..c4a588ad226e 100644
--- a/drivers/gpu/drm/msm/msm_drv.h
+++ b/drivers/gpu/drm/msm/msm_drv.h
@@ -102,6 +102,7 @@ enum msm_event_wait {
  * @num_lm:   number of layer mixers used
  * @num_enc:  number of compression encoder blocks used
  * @num_intf: number of interfaces the panel is mounted on
+ * @num_dspp: number of dspp blocks used
  */
 struct msm_display_topology {
u32 num_lm;
-- 
2.31.1

[PATCH v3 07/13] drm/msm/disp/dpu1: Add support for DSC in encoder

We need to configure the encoder for DSC configuration and calculate DSC
parameters for the given timing so this patch adds that support by
adding dpu_encoder_prep_dsc() which is invoked when DSC is enabled.

Signed-off-by: Vinod Koul 
---
 drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c | 140 +++-
 1 file changed, 139 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
index e7ee4cfb8461..f2ff8a504918 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
@@ -21,6 +21,7 @@
 #include "dpu_hw_intf.h"
 #include "dpu_hw_ctl.h"
 #include "dpu_hw_dspp.h"
+#include "dpu_hw_dsc.h"
 #include "dpu_formats.h"
 #include "dpu_encoder_phys.h"
 #include "dpu_crtc.h"
@@ -136,6 +137,7 @@ enum dpu_enc_rc_states {
  * @cur_slave: As above but for the slave encoder.
  * @hw_pp: Handle to the pingpong blocks used for the display. No.
  * pingpong blocks can be different than num_phys_encs.
+ * @hw_dsc:Handle to the DSC blocks used for the display.
  * @intfs_swapped: Whether or not the phys_enc interfaces have been swapped
  * for partial update right-only cases, such as pingpong
  * split where virtual pingpong does not generate IRQs
@@ -182,6 +184,7 @@ struct dpu_encoder_virt {
struct dpu_encoder_phys *cur_master;
struct dpu_encoder_phys *cur_slave;
struct dpu_hw_pingpong *hw_pp[MAX_CHANNELS_PER_ENC];
+   struct dpu_hw_dsc *hw_dsc[MAX_CHANNELS_PER_ENC];
 
bool intfs_swapped;
 
@@ -972,7 +975,8 @@ static void dpu_encoder_virt_mode_set(struct drm_encoder 
*drm_enc,
struct dpu_hw_blk *hw_ctl[MAX_CHANNELS_PER_ENC];
struct dpu_hw_blk *hw_lm[MAX_CHANNELS_PER_ENC];
struct dpu_hw_blk *hw_dspp[MAX_CHANNELS_PER_ENC] = { NULL };
-   int num_lm, num_ctl, num_pp;
+   struct dpu_hw_blk *hw_dsc[MAX_CHANNELS_PER_ENC];
+   int num_lm, num_ctl, num_pp, num_dsc;
int i, j;
 
if (!drm_enc) {
@@ -1030,6 +1034,14 @@ static void dpu_encoder_virt_mode_set(struct drm_encoder 
*drm_enc,
dpu_enc->hw_pp[i] = i < num_pp ? to_dpu_hw_pingpong(hw_pp[i])
: NULL;
 
+   if (priv->dsc) {
+   num_dsc = dpu_rm_get_assigned_resources(_kms->rm, 
global_state,
+   drm_enc->base.id, 
DPU_HW_BLK_DSC,
+   hw_dsc, 
ARRAY_SIZE(hw_dsc));
+   for (i = 0; i < MAX_CHANNELS_PER_ENC; i++)
+   dpu_enc->hw_dsc[i] = i < num_dsc ? 
to_dpu_hw_dsc(hw_dsc[i]) : NULL;
+   }
+
cstate = to_dpu_crtc_state(drm_crtc->state);
 
for (i = 0; i < num_lm; i++) {
@@ -1772,10 +1784,132 @@ static void 
dpu_encoder_vsync_event_work_handler(struct kthread_work *work)
nsecs_to_jiffies(ktime_to_ns(wakeup_time)));
 }
 
+static void
+dpu_encoder_dsc_pclk_param_calc(struct msm_display_dsc_config *dsc, u32 width)
+{
+   int slice_count, slice_per_intf;
+   int bytes_in_slice, total_bytes_per_intf;
+
+   if (!dsc || !dsc->drm->slice_width || !dsc->drm->slice_count) {
+   DPU_ERROR("Invalid DSC/slices\n");
+   return;
+   }
+
+   slice_count = dsc->drm->slice_count;
+   slice_per_intf = DIV_ROUND_UP(width, dsc->drm->slice_width);
+
+   /*
+* If slice_count is greater than slice_per_intf then default to 1.
+* This can happen during partial update.
+*/
+   if (slice_count > slice_per_intf)
+   slice_count = 1;
+
+   bytes_in_slice = DIV_ROUND_UP(dsc->drm->slice_width *
+ dsc->drm->bits_per_pixel, 8);
+   total_bytes_per_intf = bytes_in_slice * slice_per_intf;
+
+   dsc->eol_byte_num = total_bytes_per_intf % 3;
+   dsc->pclk_per_line =  DIV_ROUND_UP(total_bytes_per_intf, 3);
+   dsc->bytes_in_slice = bytes_in_slice;
+   dsc->bytes_per_pkt = bytes_in_slice * slice_count;
+   dsc->pkt_per_line = slice_per_intf / slice_count;
+}
+
+static void
+dpu_encoder_dsc_initial_line_calc(struct msm_display_dsc_config *dsc,
+ u32 enc_ip_width)
+{
+   int ssm_delay, total_pixels, soft_slice_per_enc;
+
+   soft_slice_per_enc = enc_ip_width / dsc->drm->slice_width;
+
+   /*
+* minimum number of initial line pixels is a sum of:
+* 1. sub-stream multiplexer delay (83 groups for 8bpc,
+*91 for 10 bpc) * 3
+* 2. for two soft slice cases, add extra sub-stream multiplexer * 3
+* 3. the initial xmit delay
+* 4. total pipeline delay through the "lock step" of encoder (47)
+* 5. 6 additional pixels as the output of the rate buffer is
+*48 bits wide
+*/
+   ssm_delay =

[PATCH v3 06/13] drm/msm/disp/dpu1: Add DSC support in hw_ctl

Later gens of hardware have DSC bits moved to hw_ctl, so configure these
bits so that DSC would work there as well

Signed-off-by: Vinod Koul 
---
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c
index 36831457a91b..66b0c44118d8 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c
@@ -25,6 +25,8 @@
 #define   CTL_MERGE_3D_ACTIVE   0x0E4
 #define   CTL_INTF_ACTIVE   0x0F4
 #define   CTL_MERGE_3D_FLUSH0x100
+#define   CTL_DSC_ACTIVE0x0E8
+#define   CTL_DSC_FLUSH0x104
 #define   CTL_INTF_FLUSH0x110
 #define   CTL_INTF_MASTER   0x134
 #define   CTL_FETCH_PIPE_ACTIVE 0x0FC
@@ -34,6 +36,7 @@
 
 #define DPU_REG_RESET_TIMEOUT_US2000
 #define  MERGE_3D_IDX   23
+#define  DSC_IDX22
 #define  INTF_IDX   31
 #define CTL_INVALID_BIT 0x
 
@@ -120,7 +123,6 @@ static u32 dpu_hw_ctl_get_pending_flush(struct dpu_hw_ctl 
*ctx)
 
 static inline void dpu_hw_ctl_trigger_flush_v1(struct dpu_hw_ctl *ctx)
 {
-
if (ctx->pending_flush_mask & BIT(MERGE_3D_IDX))
DPU_REG_WRITE(>hw, CTL_MERGE_3D_FLUSH,
ctx->pending_merge_3d_flush_mask);
@@ -498,6 +500,9 @@ static void dpu_hw_ctl_intf_cfg_v1(struct dpu_hw_ctl *ctx,
u32 intf_active = 0;
u32 mode_sel = 0;
 
+   if (cfg->dsc)
+   DPU_REG_WRITE(>hw, CTL_DSC_FLUSH, cfg->dsc);
+
if (cfg->intf_mode_sel == DPU_CTL_MODE_SEL_CMD)
mode_sel |= BIT(17);
 
@@ -509,6 +514,10 @@ static void dpu_hw_ctl_intf_cfg_v1(struct dpu_hw_ctl *ctx,
if (cfg->merge_3d)
DPU_REG_WRITE(c, CTL_MERGE_3D_ACTIVE,
  BIT(cfg->merge_3d - MERGE_3D_0));
+   if (cfg->dsc) {
+   DPU_REG_WRITE(>hw, CTL_FLUSH, cfg->dsc);
+   DPU_REG_WRITE(c, CTL_DSC_ACTIVE, cfg->dsc);
+   }
 }
 
 static void dpu_hw_ctl_intf_cfg(struct dpu_hw_ctl *ctx,
-- 
2.31.1

[PATCH v3 05/13] drm/msm/disp/dpu1: Don't use DSC with mode_3d

We cannot enable mode_3d when we are using the DSC. So pass
configuration to detect DSC is enabled and not enable mode_3d
when we are using DSC

We add a helper dpu_encoder_helper_get_dsc() to detect dsc
enabled and pass this to .setup_intf_cfg()

Signed-off-by: Vinod Koul 
---
 drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h | 11 +++
 drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_cmd.c |  2 ++
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c   |  3 ++-
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h   |  2 ++
 4 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h
index e7270eb6b84b..efb85d595598 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h
@@ -332,6 +332,17 @@ static inline enum dpu_3d_blend_mode 
dpu_encoder_helper_get_3d_blend_mode(
return BLEND_3D_NONE;
 }
 
+static inline bool dpu_encoder_helper_get_dsc(struct dpu_encoder_phys 
*phys_enc)
+{
+   struct drm_encoder *drm_enc = phys_enc->parent;
+   struct msm_drm_private *priv = drm_enc->dev->dev_private;
+
+   if (priv->dsc)
+   return priv->dsc->dsc_mask;
+
+   return 0;
+}
+
 /**
  * dpu_encoder_helper_split_config - split display configuration helper 
function
  * This helper function may be used by physical encoders to configure
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_cmd.c 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_cmd.c
index 34a6940d12c5..f3f00f4d0193 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_cmd.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_cmd.c
@@ -70,6 +70,8 @@ static void _dpu_encoder_phys_cmd_update_intf_cfg(
intf_cfg.intf_mode_sel = DPU_CTL_MODE_SEL_CMD;
intf_cfg.stream_sel = cmd_enc->stream_sel;
intf_cfg.mode_3d = dpu_encoder_helper_get_3d_blend_mode(phys_enc);
+   intf_cfg.dsc = dpu_encoder_helper_get_dsc(phys_enc);
+
ctl->ops.setup_intf_cfg(ctl, _cfg);
 }
 
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c
index 64740ddb983e..36831457a91b 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c
@@ -519,7 +519,8 @@ static void dpu_hw_ctl_intf_cfg(struct dpu_hw_ctl *ctx,
 
intf_cfg |= (cfg->intf & 0xF) << 4;
 
-   if (cfg->mode_3d) {
+   /* In DSC we can't set merge, so check for dsc too */
+   if (cfg->mode_3d && !cfg->dsc) {
intf_cfg |= BIT(19);
intf_cfg |= (cfg->mode_3d - 0x1) << 20;
}
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h
index 806c171e5df2..9847c9c46d6f 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h
@@ -40,6 +40,7 @@ struct dpu_hw_stage_cfg {
  * @merge_3d:  3d merge block used
  * @intf_mode_sel: Interface mode, cmd / vid
  * @stream_sel:Stream selection for multi-stream interfaces
+ * @dsc:   DSC BIT masks
  */
 struct dpu_hw_intf_cfg {
enum dpu_intf intf;
@@ -47,6 +48,7 @@ struct dpu_hw_intf_cfg {
enum dpu_merge_3d merge_3d;
enum dpu_ctl_mode_sel intf_mode_sel;
int stream_sel;
+   unsigned int dsc;
 };
 
 /**
-- 
2.31.1

[PATCH v3 04/13] drm/msm/disp/dpu1: Add DSC for SDM845 to hw_catalog

This adds SDM845 DSC blocks into hw_catalog

Reviewed-by: Dmitry Baryshkov 
Signed-off-by: Vinod Koul 
---
 .../gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c| 20 +++
 1 file changed, 20 insertions(+)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
index ce6f32a919e5..c773bbe57b6b 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
@@ -821,6 +821,24 @@ static const struct dpu_pingpong_cfg sc7280_pp[] = {
PP_BLK("pingpong_2", PINGPONG_2, 0x6b000, 0, sc7280_pp_sblk, -1, -1),
PP_BLK("pingpong_3", PINGPONG_3, 0x6c000, 0, sc7280_pp_sblk, -1, -1),
 };
+
+/*
+ * DSC sub blocks config
+ */
+#define DSC_BLK(_name, _id, _base) \
+   {\
+   .name = _name, .id = _id, \
+   .base = _base, .len = 0x140, \
+   .features = 0, \
+   }
+
+static struct dpu_dsc_cfg sdm845_dsc[] = {
+   DSC_BLK("dsc_0", DSC_0, 0x8),
+   DSC_BLK("dsc_1", DSC_1, 0x80400),
+   DSC_BLK("dsc_2", DSC_2, 0x80800),
+   DSC_BLK("dsc_3", DSC_3, 0x80c00),
+};
+
 /*
  * INTF sub blocks config
  */
@@ -1124,6 +1142,8 @@ static void sdm845_cfg_init(struct dpu_mdss_cfg *dpu_cfg)
.mixer = sdm845_lm,
.pingpong_count = ARRAY_SIZE(sdm845_pp),
.pingpong = sdm845_pp,
+   .dsc_count = ARRAY_SIZE(sdm845_dsc),
+   .dsc = sdm845_dsc,
.intf_count = ARRAY_SIZE(sdm845_intf),
.intf = sdm845_intf,
.vbif_count = ARRAY_SIZE(sdm845_vbif),
-- 
2.31.1

[PATCH v3 03/13] drm/msm/disp/dpu1: Add support for DSC in pingpong block

In SDM845, DSC can be enabled by writing to pingpong block registers, so
add support for DSC in hw_pp

Reviewed-by: Abhinav Kumar 
Reviewed-by: Dmitry Baryshkov 
Signed-off-by: Vinod Koul 
---
 .../gpu/drm/msm/disp/dpu1/dpu_hw_pingpong.c   | 32 +++
 .../gpu/drm/msm/disp/dpu1/dpu_hw_pingpong.h   | 14 
 2 files changed, 46 insertions(+)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_pingpong.c 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_pingpong.c
index 55766c97c4c8..47c6ab6caf95 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_pingpong.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_pingpong.c
@@ -28,6 +28,9 @@
 #define PP_FBC_MODE 0x034
 #define PP_FBC_BUDGET_CTL   0x038
 #define PP_FBC_LOSSY_MODE   0x03C
+#define PP_DSC_MODE 0x0a0
+#define PP_DCE_DATA_IN_SWAP 0x0ac
+#define PP_DCE_DATA_OUT_SWAP0x0c8
 
 #define PP_DITHER_EN   0x000
 #define PP_DITHER_BITDEPTH 0x004
@@ -245,6 +248,32 @@ static u32 dpu_hw_pp_get_line_count(struct dpu_hw_pingpong 
*pp)
return line;
 }
 
+static int dpu_hw_pp_dsc_enable(struct dpu_hw_pingpong *pp)
+{
+   struct dpu_hw_blk_reg_map *c = >hw;
+
+   DPU_REG_WRITE(c, PP_DSC_MODE, 1);
+   return 0;
+}
+
+static void dpu_hw_pp_dsc_disable(struct dpu_hw_pingpong *pp)
+{
+   struct dpu_hw_blk_reg_map *c = >hw;
+
+   DPU_REG_WRITE(c, PP_DSC_MODE, 0);
+}
+
+static int dpu_hw_pp_setup_dsc(struct dpu_hw_pingpong *pp)
+{
+   struct dpu_hw_blk_reg_map *pp_c = >hw;
+   int data;
+
+   data = DPU_REG_READ(pp_c, PP_DCE_DATA_OUT_SWAP);
+   data |= BIT(18); /* endian flip */
+   DPU_REG_WRITE(pp_c, PP_DCE_DATA_OUT_SWAP, data);
+   return 0;
+}
+
 static void _setup_pingpong_ops(struct dpu_hw_pingpong *c,
unsigned long features)
 {
@@ -256,6 +285,9 @@ static void _setup_pingpong_ops(struct dpu_hw_pingpong *c,
c->ops.get_autorefresh = dpu_hw_pp_get_autorefresh_config;
c->ops.poll_timeout_wr_ptr = dpu_hw_pp_poll_timeout_wr_ptr;
c->ops.get_line_count = dpu_hw_pp_get_line_count;
+   c->ops.setup_dsc = dpu_hw_pp_setup_dsc;
+   c->ops.enable_dsc = dpu_hw_pp_dsc_enable;
+   c->ops.disable_dsc = dpu_hw_pp_dsc_disable;
 
if (test_bit(DPU_PINGPONG_DITHER, ))
c->ops.setup_dither = dpu_hw_pp_setup_dither;
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_pingpong.h 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_pingpong.h
index 89d08a715c16..12758468d9ca 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_pingpong.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_pingpong.h
@@ -124,6 +124,20 @@ struct dpu_hw_pingpong_ops {
 */
void (*setup_dither)(struct dpu_hw_pingpong *pp,
struct dpu_hw_dither_cfg *cfg);
+   /**
+* Enable DSC
+*/
+   int (*enable_dsc)(struct dpu_hw_pingpong *pp);
+
+   /**
+* Disable DSC
+*/
+   void (*disable_dsc)(struct dpu_hw_pingpong *pp);
+
+   /**
+* Setup DSC
+*/
+   int (*setup_dsc)(struct dpu_hw_pingpong *pp);
 };
 
 struct dpu_hw_merge_3d;
-- 
2.31.1

[PATCH v3 02/13] drm/msm/disp/dpu1: Add support for DSC

Display Stream Compression (DSC) is one of the hw blocks in dpu, so add
support by adding hw blocks for DSC

Signed-off-by: Vinod Koul 
---
 drivers/gpu/drm/msm/Makefile  |   1 +
 .../gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h|  13 ++
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc.c| 210 ++
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc.h|  77 +++
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h   |  13 ++
 5 files changed, 314 insertions(+)
 create mode 100644 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc.c
 create mode 100644 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc.h

diff --git a/drivers/gpu/drm/msm/Makefile b/drivers/gpu/drm/msm/Makefile
index 40577f8856d8..7d7058f1f5c1 100644
--- a/drivers/gpu/drm/msm/Makefile
+++ b/drivers/gpu/drm/msm/Makefile
@@ -59,6 +59,7 @@ msm-y := \
disp/dpu1/dpu_formats.o \
disp/dpu1/dpu_hw_catalog.o \
disp/dpu1/dpu_hw_ctl.o \
+   disp/dpu1/dpu_hw_dsc.o \
disp/dpu1/dpu_hw_interrupts.o \
disp/dpu1/dpu_hw_intf.o \
disp/dpu1/dpu_hw_lm.o \
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h
index 4ade44bbd37e..65f43fadcda0 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h
@@ -553,6 +553,16 @@ struct dpu_merge_3d_cfg  {
const struct dpu_merge_3d_sub_blks *sblk;
 };
 
+/**
+ * struct dpu_dsc_cfg - information of DSC blocks
+ * @id enum identifying this block
+ * @base   register offset of this block
+ * @features   bit mask identifying sub-blocks/features
+ */
+struct dpu_dsc_cfg {
+   DPU_HW_BLK_INFO;
+};
+
 /**
  * struct dpu_intf_cfg - information of timing engine blocks
  * @id enum identifying this block
@@ -749,6 +759,9 @@ struct dpu_mdss_cfg {
u32 merge_3d_count;
const struct dpu_merge_3d_cfg *merge_3d;
 
+   u32 dsc_count;
+   struct dpu_dsc_cfg *dsc;
+
u32 intf_count;
const struct dpu_intf_cfg *intf;
 
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc.c 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc.c
new file mode 100644
index ..449d6f1dad28
--- /dev/null
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc.c
@@ -0,0 +1,210 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (c) 2020-2021, Linaro Limited
+ */
+
+#include "dpu_kms.h"
+#include "dpu_hw_catalog.h"
+#include "dpu_hwio.h"
+#include "dpu_hw_mdss.h"
+#include "dpu_hw_dsc.h"
+
+#define DSC_COMMON_MODE0x000
+#define DSC_ENC 0X004
+#define DSC_PICTURE 0x008
+#define DSC_SLICE   0x00C
+#define DSC_CHUNK_SIZE  0x010
+#define DSC_DELAY   0x014
+#define DSC_SCALE_INITIAL   0x018
+#define DSC_SCALE_DEC_INTERVAL  0x01C
+#define DSC_SCALE_INC_INTERVAL  0x020
+#define DSC_FIRST_LINE_BPG_OFFSET   0x024
+#define DSC_BPG_OFFSET  0x028
+#define DSC_DSC_OFFSET  0x02C
+#define DSC_FLATNESS0x030
+#define DSC_RC_MODEL_SIZE   0x034
+#define DSC_RC  0x038
+#define DSC_RC_BUF_THRESH   0x03C
+#define DSC_RANGE_MIN_QP0x074
+#define DSC_RANGE_MAX_QP0x0B0
+#define DSC_RANGE_BPG_OFFSET0x0EC
+
+static void dpu_hw_dsc_disable(struct dpu_hw_dsc *dsc)
+{
+   struct dpu_hw_blk_reg_map *c = >hw;
+
+   DPU_REG_WRITE(c, DSC_COMMON_MODE, 0);
+}
+
+static void dpu_hw_dsc_config(struct dpu_hw_dsc *hw_dsc,
+ struct msm_display_dsc_config *dsc, u32 mode)
+{
+   struct dpu_hw_blk_reg_map *c = _dsc->hw;
+   u32 data, lsb, bpp;
+   u32 initial_lines = dsc->initial_lines;
+   bool is_cmd_mode = !(mode & DSC_MODE_VIDEO);
+
+   DPU_REG_WRITE(c, DSC_COMMON_MODE, mode);
+
+   if (is_cmd_mode)
+   initial_lines += 1;
+
+   data = (initial_lines << 20);
+   data |= ((dsc->slice_last_group_size - 1) << 18);
+   /* bpp is 6.4 format, 4 LSBs bits are for fractional part */
+   data |= dsc->drm->bits_per_pixel << 12;
+   lsb = dsc->drm->bits_per_pixel % 4;
+   bpp = dsc->drm->bits_per_pixel / 4;
+   bpp *= 4;
+   bpp <<= 4;
+   bpp |= lsb;
+
+   data |= bpp << 8;
+   data |= (dsc->drm->block_pred_enable << 7);
+   data |= (dsc->drm->line_buf_depth << 3);
+   data |= (dsc->drm->simple_422 << 2);
+   data |= (dsc->drm->convert_rgb << 1);
+   data |= dsc->drm->bits_per_component;
+
+   DPU_REG_WRITE(c, DSC_ENC, data);
+
+   data = dsc->drm->pic_width << 16;
+   data |= dsc->drm->pic_height;
+   DPU_REG_WRITE(c, DSC_PICTURE, data);
+
+   data = dsc->drm->slice_width << 16;
+   data |= dsc->drm->slice_height;
+   DPU_REG_WRITE(c, DSC_SLICE, data);
+
+   data = dsc->drm->slice_chunk_size << 16;
+

[PATCH v3 01/13] drm/msm/dsi: add support for dsc data

Display Stream Compression (DSC) parameters need to be calculated. Add
helpers and struct msm_display_dsc_config in msm_drv for this
msm_display_dsc_config uses drm_dsc_config for DSC parameters.

Signed-off-by: Vinod Koul 
---
 drivers/gpu/drm/msm/dsi/dsi_host.c | 132 +
 drivers/gpu/drm/msm/msm_drv.h  |  20 +
 2 files changed, 152 insertions(+)

diff --git a/drivers/gpu/drm/msm/dsi/dsi_host.c 
b/drivers/gpu/drm/msm/dsi/dsi_host.c
index f69a125f9559..30c1e299aa52 100644
--- a/drivers/gpu/drm/msm/dsi/dsi_host.c
+++ b/drivers/gpu/drm/msm/dsi/dsi_host.c
@@ -31,6 +31,8 @@
 
 #define DSI_RESET_TOGGLE_DELAY_MS 20
 
+static int dsi_populate_dsc_params(struct msm_display_dsc_config *dsc);
+
 static int dsi_get_version(const void __iomem *base, u32 *major, u32 *minor)
 {
u32 ver;
@@ -157,6 +159,7 @@ struct msm_dsi_host {
struct regmap *sfpb;
 
struct drm_display_mode *mode;
+   struct msm_display_dsc_config *dsc;
 
/* connected device info */
struct device_node *device_node;
@@ -1710,6 +1713,135 @@ static int dsi_host_parse_lane_data(struct msm_dsi_host 
*msm_host,
return -EINVAL;
 }
 
+static u32 dsi_dsc_rc_buf_thresh[DSC_NUM_BUF_RANGES - 1] = {
+   0x0e, 0x1c, 0x2a, 0x38, 0x46, 0x54, 0x62,
+   0x69, 0x70, 0x77, 0x79, 0x7b, 0x7d, 0x7e
+};
+
+/* only 8bpc, 8bpp added */
+static char min_qp[DSC_NUM_BUF_RANGES] = {
+   0, 0, 1, 1, 3, 3, 3, 3, 3, 3, 5, 5, 5, 7, 13
+};
+
+static char max_qp[DSC_NUM_BUF_RANGES] = {
+   4, 4, 5, 6, 7, 7, 7, 8, 9, 10, 11, 12, 13, 13, 15
+};
+
+static char bpg_offset[DSC_NUM_BUF_RANGES] = {
+   2, 0, 0, -2, -4, -6, -8, -8, -8, -10, -10, -12, -12, -12, -12
+};
+
+static int dsi_populate_dsc_params(struct msm_display_dsc_config *dsc)
+{
+   int mux_words_size;
+   int groups_per_line, groups_total;
+   int min_rate_buffer_size;
+   int hrd_delay;
+   int pre_num_extra_mux_bits, num_extra_mux_bits;
+   int slice_bits;
+   int target_bpp_x16;
+   int data;
+   int final_value, final_scale;
+   int i;
+
+   dsc->drm->rc_model_size = 8192;
+   dsc->drm->first_line_bpg_offset = 12;
+   dsc->drm->rc_edge_factor = 6;
+   dsc->drm->rc_tgt_offset_high = 3;
+   dsc->drm->rc_tgt_offset_low = 3;
+   dsc->drm->simple_422 = 0;
+   dsc->drm->convert_rgb = 1;
+   dsc->drm->vbr_enable = 0;
+
+   /* handle only bpp = bpc = 8 */
+   for (i = 0; i < DSC_NUM_BUF_RANGES - 1 ; i++)
+   dsc->drm->rc_buf_thresh[i] = dsi_dsc_rc_buf_thresh[i];
+
+   for (i = 0; i < DSC_NUM_BUF_RANGES; i++) {
+   dsc->drm->rc_range_params[i].range_min_qp = min_qp[i];
+   dsc->drm->rc_range_params[i].range_max_qp = max_qp[i];
+   dsc->drm->rc_range_params[i].range_bpg_offset = bpg_offset[i];
+   }
+
+   dsc->drm->initial_offset = 6144; /* Not bpp 12 */
+   if (dsc->drm->bits_per_pixel != 8)
+   dsc->drm->initial_offset = 2048;/* bpp = 12 */
+
+   mux_words_size = 48;/* bpc == 8/10 */
+   if (dsc->drm->bits_per_component == 12)
+   mux_words_size = 64;
+
+   dsc->drm->initial_xmit_delay = 512;
+   dsc->drm->initial_scale_value = 32;
+   dsc->drm->first_line_bpg_offset = 12;
+   dsc->drm->line_buf_depth = dsc->drm->bits_per_component + 1;
+
+   /* bpc 8 */
+   dsc->drm->flatness_min_qp = 3;
+   dsc->drm->flatness_max_qp = 12;
+   dsc->det_thresh_flatness = 7 + 2 * (dsc->drm->bits_per_component - 8);
+   dsc->drm->rc_quant_incr_limit0 = 11;
+   dsc->drm->rc_quant_incr_limit1 = 11;
+   dsc->drm->mux_word_size = DSC_MUX_WORD_SIZE_8_10_BPC;
+
+   /* FIXME: need to call drm_dsc_compute_rc_parameters() so that rest of
+* params are calculated
+*/
+   dsc->slice_last_group_size = 3 - (dsc->drm->slice_width % 3);
+   groups_per_line = DIV_ROUND_UP(dsc->drm->slice_width, 3);
+   dsc->drm->slice_chunk_size = dsc->drm->slice_width * 
dsc->drm->bits_per_pixel / 8;
+   if ((dsc->drm->slice_width * dsc->drm->bits_per_pixel) % 8)
+   dsc->drm->slice_chunk_size++;
+
+   /* rbs-min */
+   min_rate_buffer_size =  dsc->drm->rc_model_size - 
dsc->drm->initial_offset +
+   dsc->drm->initial_xmit_delay * 
dsc->drm->bits_per_pixel +
+   groups_per_line * 
dsc->drm->first_line_bpg_offset;
+
+   hrd_delay = DIV_ROUND_UP(min_rate_buffer_size, 
dsc->drm->bits_per_pixel);
+
+   dsc->drm->initial_dec_delay = hrd_delay - dsc->drm->initial_xmit_delay;
+
+   dsc->drm->initial_scale_value = 8 * dsc->drm->rc_model_size /
+  (dsc->drm->rc_model_size - 
dsc->drm->initial_offset);
+
+   slice_bits = 8 * dsc->drm->slice_chunk_size * dsc->drm->slice_height;
+
+   groups_total = groups_per_line * dsc->drm->slice_height;
+
+   data = dsc->drm->first_line_bpg_offset *

[PATCH v3 00/13] drm/msm: Add Display Stream Compression Support

Display Stream Compression (DSC) compresses the display stream in host which
is later decoded by panel. This series enables this for Qualcomm msm driver.
This was tested on Google Pixel3 phone which use LGE SW43408 panel.
 
The changes include DSC data and hardware block enabling for DPU1 then
support in encoder. We also add support in DSI and introduce required
topology changes.
 
In order for panel to set the DSC parameters we add dsc in drm_panel and set
it from the msm driver.

We still have dsc as a globabl entity. I think while doing DP + DSC we
should be able to update it, right now comprehending the requirements are
bit difficult.
 
Complete changes which enable this for Pixel3 along with panel driver (not
part of this series) and DT changes can be found at:
git.linaro.org/people/vinod.koul/kernel.git pixel/dsc_v3
 
Comments welcome!

Changes since v2:
 - Fix comments by Dimitry except the dsc being global.
 - Move RM patch later for dependency on topology now
 - Add patch for mode valid callback for dsi_mgr
 - Add missing structure documentation patch
 - Fix errors in mode_3d changes
 - Rebase on v5.16-rc1 and test

Changes since v1:
 - Fix various issues spotted by kbuildbot
 - Rebase to v5.15-rc3
 - Remove unused fields and duplicate defines
 - Enable DSC blocks only when DSC is enabled
 - remove sdm845 feature mask, use 0
 - Check for DSC in hw_ctl

Changes since RFC:
 - Drop the DT binding patch as we derive the configuration from panel
 - Drop the drm api patch as we no longer need it (use pps drm api)
 - Fix comments raised by Dimitry
 - Add dsc parameters calculation from downstream

Vinod Koul (13):
  drm/msm/dsi: add support for dsc data
  drm/msm/disp/dpu1: Add support for DSC
  drm/msm/disp/dpu1: Add support for DSC in pingpong block
  drm/msm/disp/dpu1: Add DSC for SDM845 to hw_catalog
  drm/msm/disp/dpu1: Don't use DSC with mode_3d
  drm/msm/disp/dpu1: Add DSC support in hw_ctl
  drm/msm/disp/dpu1: Add support for DSC in encoder
  drm/msm: Add missing structure documentation
  drm/msm/disp/dpu1: Add support for DSC in topology
  drm/msm/disp/dpu1: Add DSC support in RM
  drm/msm/dsi: add mode valid callback for dsi_mgr
  drm/msm/dsi: Add support for DSC configuration
  drm/msm/dsi: Pass DSC params to drm_panel

 drivers/gpu/drm/msm/Makefile  |   1 +
 drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c   | 158 +-
 .../gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h  |  11 +
 .../drm/msm/disp/dpu1/dpu_encoder_phys_cmd.c  |   2 +
 .../gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c|  20 ++
 .../gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h|  13 +
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c|  14 +-
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h|   2 +
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc.c| 210 +
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc.h|  77 +
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h   |  13 +
 .../gpu/drm/msm/disp/dpu1/dpu_hw_pingpong.c   |  32 ++
 .../gpu/drm/msm/disp/dpu1/dpu_hw_pingpong.h   |  14 +
 drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h   |   1 +
 drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c|  66 
 drivers/gpu/drm/msm/disp/dpu1/dpu_rm.h|   1 +
 drivers/gpu/drm/msm/dsi/dsi.h |   2 +
 drivers/gpu/drm/msm/dsi/dsi.xml.h |  10 +
 drivers/gpu/drm/msm/dsi/dsi_host.c| 285 +-
 drivers/gpu/drm/msm/dsi/dsi_manager.c |  12 +
 drivers/gpu/drm/msm/msm_drv.h |  23 ++
 include/drm/drm_panel.h   |   7 +
 22 files changed, 970 insertions(+), 4 deletions(-)
 create mode 100644 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc.c
 create mode 100644 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc.h

-- 
2.31.1

Re: [PATCH] drm/nouveau/core: fix the uninitialized use in nvkm_ioctl_map()

2021-11-15 Thread Yizhuo Zhai

Hi Karol:
Thanks for the feedback, the patch might be too old to apply to the
latest code tree. Let me check and get back to you soon.

On Sat, Nov 13, 2021 at 12:22 PM Karol Herbst  wrote:
>
> something seems to have messed with the patch so it doesn't apply correctly.
>
> On Thu, Jun 17, 2021 at 9:39 AM Yizhuo Zhai  wrote:
> >
> > In function nvkm_ioctl_map(), the variable "type" could be
> > uninitialized if "nvkm_object_map()" returns error code,
> > however, it does not check the return value and directly
> > use the "type" in the if statement, which is potentially
> > unsafe.
> >
> > Signed-off-by: Yizhuo 
> > ---
> >  drivers/gpu/drm/nouveau/nvkm/core/ioctl.c | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/nouveau/nvkm/core/ioctl.c
> > b/drivers/gpu/drm/nouveau/nvkm/core/ioctl.c
> > index d777df5a64e6..7f2e8482f167 100644
> > --- a/drivers/gpu/drm/nouveau/nvkm/core/ioctl.c
> > +++ b/drivers/gpu/drm/nouveau/nvkm/core/ioctl.c
> > @@ -266,6 +266,8 @@ nvkm_ioctl_map(struct nvkm_client *client,
> > ret = nvkm_object_map(object, data, size, ,
> >   >v0.handle,
> >   >v0.length);
> > +   if (ret)
> > +   return ret;
> > if (type == NVKM_OBJECT_MAP_IO)
> > args->v0.type = NVIF_IOCTL_MAP_V0_IO;
> > else
> > --
> > 2.17.1
> >
>


-- 
Kind Regards,

Yizhuo Zhai

Computer Science, Graduate Student
University of California, Riverside

Re: [PATCH 2/2] drm/msm: Restore error return on invalid fence

2021-11-15 Thread Akhil P Oommen


On 11/15/2021 10:26 PM, Rob Clark wrote:

On Mon, Nov 15, 2021 at 6:43 AM Akhil P Oommen  wrote:


On 11/12/2021 12:54 AM, Rob Clark wrote:

From: Rob Clark 

When converting to use an idr to map userspace fence seqno values back
to a dma_fence, we lost the error return when userspace passes seqno
that is larger than the last submitted fence.  Restore this check.

Reported-by: Akhil P Oommen 
Fixes: a61acbbe9cf8 ("drm/msm: Track "seqno" fences by idr")
Signed-off-by: Rob Clark 
---
Note: I will rebase "drm/msm: Handle fence rollover" on top of this,
to simplify backporting this patch to stable kernels

   drivers/gpu/drm/msm/msm_drv.c| 6 ++
   drivers/gpu/drm/msm/msm_gem_submit.c | 1 +
   drivers/gpu/drm/msm/msm_gpu.h| 3 +++
   3 files changed, 10 insertions(+)

diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
index cb14d997c174..56500eb5219e 100644
--- a/drivers/gpu/drm/msm/msm_drv.c
+++ b/drivers/gpu/drm/msm/msm_drv.c
@@ -967,6 +967,12 @@ static int wait_fence(struct msm_gpu_submitqueue *queue, 
uint32_t fence_id,
   struct dma_fence *fence;
   int ret;

+ if (fence_id > queue->last_fence) {


But fence_id can wrap around and then this check won't be valid.


that is correct, but see my note about rebasing "drm/msm: Handle fence
rollover" on top of this patch, so this patch could be more easily
cherry-picked to stable/lts branches

BR,
-R


Missed that. Thanks.

Reviewed-by: Akhil P Oommen 

-Akhil.



-Akhil.


+ DRM_ERROR_RATELIMITED("waiting on invalid fence: %u (of %u)\n",
+   fence_id, queue->last_fence);
+ return -EINVAL;
+ }
+
   /*
* Map submitqueue scoped "seqno" (which is actually an idr key)
* back to underlying dma-fence
diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c 
b/drivers/gpu/drm/msm/msm_gem_submit.c
index 151d19e4453c..a38f23be497d 100644
--- a/drivers/gpu/drm/msm/msm_gem_submit.c
+++ b/drivers/gpu/drm/msm/msm_gem_submit.c
@@ -911,6 +911,7 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void *data,
   drm_sched_entity_push_job(>base, queue->entity);

   args->fence = submit->fence_id;
+ queue->last_fence = submit->fence_id;

   msm_reset_syncobjs(syncobjs_to_reset, args->nr_in_syncobjs);
   msm_process_post_deps(post_deps, args->nr_out_syncobjs,
diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h
index bd4e0024033e..e73a5bb03544 100644
--- a/drivers/gpu/drm/msm/msm_gpu.h
+++ b/drivers/gpu/drm/msm/msm_gpu.h
@@ -376,6 +376,8 @@ static inline int msm_gpu_convert_priority(struct msm_gpu 
*gpu, int prio,
* @ring_nr:   the ringbuffer used by this submitqueue, which is determined
* by the submitqueue's priority
* @faults:the number of GPU hangs associated with this submitqueue
+ * @last_fence: the sequence number of the last allocated fence (for error
+ * checking)
* @ctx:   the per-drm_file context associated with the submitqueue (ie.
* which set of pgtables do submits jobs associated with the
* submitqueue use)
@@ -391,6 +393,7 @@ struct msm_gpu_submitqueue {
   u32 flags;
   u32 ring_nr;
   int faults;
+ uint32_t last_fence;
   struct msm_file_private *ctx;
   struct list_head node;
   struct idr fence_idr;

[PATCH] drm/radeon:WARNING opportunity for max()

2021-11-15 Thread zhaoxiao

Fix following coccicheck warning:
drivers/gpu/drm/radeon/r100.c:3450:26-27: WARNING opportunity for max()
drivers/gpu/drm/radeon/r100.c:2812:23-24: WARNING opportunity for max()

Signed-off-by: zhaoxiao 
---
 drivers/gpu/drm/radeon/r100.c | 10 ++
 1 file changed, 2 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/radeon/r100.c b/drivers/gpu/drm/radeon/r100.c
index 2dd85ba1faa2..c65ee6f44af2 100644
--- a/drivers/gpu/drm/radeon/r100.c
+++ b/drivers/gpu/drm/radeon/r100.c
@@ -2809,10 +2809,7 @@ void r100_vram_init_sizes(struct radeon_device *rdev)
if (rdev->mc.aper_size > config_aper_size)
config_aper_size = rdev->mc.aper_size;
 
-   if (config_aper_size > rdev->mc.real_vram_size)
-   rdev->mc.mc_vram_size = config_aper_size;
-   else
-   rdev->mc.mc_vram_size = rdev->mc.real_vram_size;
+   rdev->mc.mc_vram_size = max(config_aper_size, 
rdev->mc.real_vram_size);
}
 }
 
@@ -3447,10 +3444,7 @@ void r100_bandwidth_update(struct radeon_device *rdev)
mc_latency_mclk.full += disp_latency_overhead.full + 
cur_latency_mclk.full;
mc_latency_sclk.full += disp_latency_overhead.full + 
cur_latency_sclk.full;
 
-   if (mc_latency_mclk.full > mc_latency_sclk.full)
-   disp_latency.full = mc_latency_mclk.full;
-   else
-   disp_latency.full = mc_latency_sclk.full;
+   disp_latency.full = max(mc_latency_mclk.full, mc_latency_sclk.full);
 
/* setup Max GRPH_STOP_REQ default value */
if (ASIC_IS_RV100(rdev))
-- 
2.20.1

Re: [PATCH 08/11] dmaengine: xilinx_dpdma: stop using slave_id field

On 15-11-21, 11:21, Arnd Bergmann wrote:
> On Mon, Nov 15, 2021 at 10:14 AM Laurent Pinchart
>  wrote:
> > On Mon, Nov 15, 2021 at 09:54:00AM +0100, Arnd Bergmann wrote:
> > > @@ -1285,11 +1287,13 @@ static int xilinx_dpdma_config(struct dma_chan 
> > > *dchan,
> > >   spin_lock_irqsave(>lock, flags);
> > >
> > >   /*
> > > -  * Abuse the slave_id to indicate that the channel is part of a 
> > > video
> > > -  * group.
> > > +  * Abuse the peripheral_config to indicate that the channel is part
> >
> > Is it still an abuse, or is this now the right way to pass custom data
> > to the DMA engine driver ?
> 
> It doesn't make the driver any more portable, but it's now being
> more explicit about it. As far as I can tell, this is the best way
> to pass data that cannot be expressed through the regular interfaces
> in DT and the dmaengine API.
> 
> Ideally there would be a generic way to pass this flag, but I couldn't
> figure out what this is actually doing, or whether there is a better
> way. Maybe Vinod has an idea.
> 
> I'll change s/Abuse/Use/ for the moment until I get a definite answer.

I would feel this is still not use for the peripheral_config, but lets
keep it to get rid of slave_id.

Also, I would be better if this was moved to DT as the next cell, don't
recall why that was not done/feasible.

-- 
~Vinod

[PATCH v2] backlight: ili922x: fix kernel-doc warnings & notation

2021-11-15 Thread Randy Dunlap

Convert function-like macro comments to kernel-doc notation and
fix other kernel-doc warnings:

drivers/video/backlight/ili922x.c:85: warning: This comment starts with '/**', 
but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
* START_BYTE(id, rs, rw)
drivers/video/backlight/ili922x.c:118: warning: expecting prototype for 
CHECK_FREQ_REG(spi_device s, spi_transfer x)(). Prototype was for 
CHECK_FREQ_REG() instead

ili922x.c:92: warning: contents before sections
ili922x.c:150: warning: No description found for return value of 
'ili922x_read_status'
ili922x.c:193: warning: No description found for return value of 'ili922x_read'
ili922x.c:247: warning: No description found for return value of 'ili922x_write'
ili922x.c:353: warning: No description found for return value of 
'ili922x_poweron'
ili922x.c:382: warning: No description found for return value of 
'ili922x_poweroff'

Fixes: 4cfbfa971478 ("video: backlight: add ili922x lcd driver")
Signed-off-by: Randy Dunlap 
Reported-by: kernel test robot 
Cc: Lee Jones 
Cc: Daniel Thompson 
Cc: Jingoo Han 
Cc: dri-devel@lists.freedesktop.org
Cc: Stefano Babic 
Cc: Anatolij Gustschin 
---
v2: add periods at end of sentences so that the generated documentation
flows correctly. (thanks, Daniel)

 drivers/video/backlight/ili922x.c |   29 ++--
 1 file changed, 19 insertions(+), 10 deletions(-)

--- linux-next-2025.orig/drivers/video/backlight/ili922x.c
+++ linux-next-2025/drivers/video/backlight/ili922x.c
@@ -82,13 +82,7 @@
 #define START_RW_READ  1
 
 /**
- * START_BYTE(id, rs, rw)
- *
- * Set the start byte according to the required operation.
- * The start byte is defined as:
- *   --
- *  | 0 | 1 | 1 | 1 | 0 | ID | RS | RW |
- *   --
+ * START_BYTE() - Set the start byte according to the required operation.
  * @id: display's id as set by the manufacturer
  * @rs: operation type bit, one of:
  *   - START_RS_INDEX  set the index register
@@ -96,14 +90,19 @@
  * @rw: read/write operation
  *  - START_RW_WRITE   write
  *  - START_RW_READread
+ *
+ * The start byte is defined as:
+ *   --
+ *  | 0 | 1 | 1 | 1 | 0 | ID | RS | RW |
+ *   --
  */
 #define START_BYTE(id, rs, rw) \
(0x70 | (((id) & 0x01) << 2) | (((rs) & 0x01) << 1) | ((rw) & 0x01))
 
 /**
- * CHECK_FREQ_REG(spi_device s, spi_transfer x) - Check the frequency
- * for the SPI transfer. According to the datasheet, the controller
- * accept higher frequency for the GRAM transfer, but it requires
+ * CHECK_FREQ_REG() - Check the frequency for the SPI transfer.
+ * According to the datasheet, the controller
+ * accepts higher frequency for the GRAM transfer, but it requires
  * lower frequency when the registers are read/written.
  * The macro sets the frequency in the spi_transfer structure if
  * the frequency exceeds the maximum value.
@@ -145,6 +144,8 @@ struct ili922x {
  * ili922x_read_status - read status register from display
  * @spi: spi device
  * @rs:  output value
+ *
+ * Return: %0 on success or a negative error code on failure
  */
 static int ili922x_read_status(struct spi_device *spi, u16 *rs)
 {
@@ -188,6 +189,8 @@ static int ili922x_read_status(struct sp
  * @spi: spi device
  * @reg: offset of the register to be read
  * @rx:  output value
+ *
+ * Return: %0 on success or a negative error code on failure
  */
 static int ili922x_read(struct spi_device *spi, u8 reg, u16 *rx)
 {
@@ -242,6 +245,8 @@ static int ili922x_read(struct spi_devic
  * @spi: struct spi_device *
  * @reg: offset of the register to be written
  * @value: value to be written
+ *
+ * Return: %0 on success or a negative error code on failure
  */
 static int ili922x_write(struct spi_device *spi, u8 reg, u16 value)
 {
@@ -348,6 +353,8 @@ static void set_write_to_gram_reg(struct
  * The sequence to turn on the display is taken from
  * the datasheet and/or the example code provided by the
  * manufacturer.
+ *
+ * Return: %0 on success or a negative value on failure
  */
 static int ili922x_poweron(struct spi_device *spi)
 {
@@ -377,6 +384,8 @@ static int ili922x_poweron(struct spi_de
 /**
  * ili922x_poweroff - turn the display off
  * @spi: spi device
+ *
+ * Return: %0 on success or a negative value on failure
  */
 static int ili922x_poweroff(struct spi_device *spi)
 {

Re: [PATCH] backlight: ili922x: fix kernel-doc warnings & notation

2021-11-15 Thread Randy Dunlap


On 11/15/21 3:38 AM, Daniel Thompson wrote:



Thanks for the fixes. Just a could of quibbles about full
stops/periods.



---
  drivers/video/backlight/ili922x.c |   29 ++--
  1 file changed, 19 insertions(+), 10 deletions(-)

--- linux-next-20211102.orig/drivers/video/backlight/ili922x.c
+++ linux-next-20211102/drivers/video/backlight/ili922x.c



  /**
- * CHECK_FREQ_REG(spi_device s, spi_transfer x) - Check the frequency
- * for the SPI transfer. According to the datasheet, the controller
- * accept higher frequency for the GRAM transfer, but it requires
+ * CHECK_FREQ_REG() - Check the frequency for the SPI transfer
+ * According to the datasheet, the controller
+ * accepts higher frequency for the GRAM transfer, but it requires


Also missing the full stop/period in the first sentence of the summary.

Note that here the missing full stop does not benefit from a new line to
conceal it and we will generate bad text as a result.

   Check the frequency for the SPI transfer According to the data
   sheet, the controller accepts...


Got it. I'll send a v2.

Thanks.
--
~Randy

[PATCH] drm/sun4i: remove no need type conversion to bool

This change is to cleanup the code a bit.

Signed-off-by: Bernard Zhao 
---
 drivers/gpu/drm/sun4i/sun4i_hdmi_tmds_clk.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/sun4i/sun4i_hdmi_tmds_clk.c 
b/drivers/gpu/drm/sun4i/sun4i_hdmi_tmds_clk.c
index fbf7da9d9592..25e6f85fed0b 100644
--- a/drivers/gpu/drm/sun4i/sun4i_hdmi_tmds_clk.c
+++ b/drivers/gpu/drm/sun4i/sun4i_hdmi_tmds_clk.c
@@ -49,7 +49,7 @@ static unsigned long sun4i_tmds_calc_divider(unsigned long 
rate,
(rate - tmp_rate) < (rate - best_rate)) {
best_rate = tmp_rate;
best_m = m;
-   is_double = (d == 2) ? true : false;
+   is_double = (d == 2);
}
}
}
-- 
2.33.1

[PATCH] drm/tegra: remove no need NULL check before kfree

This change is to cleanup the code a bit.

Signed-off-by: Bernard Zhao 
---
 drivers/gpu/drm/tegra/submit.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/tegra/submit.c b/drivers/gpu/drm/tegra/submit.c
index 776f825df52f..c2fc9677742e 100644
--- a/drivers/gpu/drm/tegra/submit.c
+++ b/drivers/gpu/drm/tegra/submit.c
@@ -608,12 +608,10 @@ int tegra_drm_ioctl_channel_submit(struct drm_device 
*drm, void *data,
if (job_data && job_data->used_mappings) {
for (i = 0; i < job_data->num_used_mappings; i++)

tegra_drm_mapping_put(job_data->used_mappings[i].mapping);
-
-   kfree(job_data->used_mappings);
}
 
-   if (job_data)
-   kfree(job_data);
+   kfree(job_data->used_mappings);
+   kfree(job_data);
 put_bo:
gather_bo_put(>base);
 unlock:
-- 
2.33.1

[PATCH] drm/amd/display: remove no need NULL check before kfree

This change is to cleanup the code a bit.

Signed-off-by: Bernard Zhao 
---
 .../drm/amd/display/dc/dcn10/dcn10_resource.c  | 18 ++
 1 file changed, 6 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_resource.c 
b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_resource.c
index f37551e00023..0090550d4aee 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_resource.c
@@ -978,10 +978,8 @@ static void dcn10_resource_destruct(struct 
dcn10_resource_pool *pool)
pool->base.mpc = NULL;
}
 
-   if (pool->base.hubbub != NULL) {
-   kfree(pool->base.hubbub);
-   pool->base.hubbub = NULL;
-   }
+   kfree(pool->base.hubbub);
+   pool->base.hubbub = NULL;
 
for (i = 0; i < pool->base.pipe_count; i++) {
if (pool->base.opps[i] != NULL)
@@ -1011,14 +1009,10 @@ static void dcn10_resource_destruct(struct 
dcn10_resource_pool *pool)
for (i = 0; i < pool->base.res_cap->num_ddc; i++) {
if (pool->base.engines[i] != NULL)
dce110_engine_destroy(>base.engines[i]);
-   if (pool->base.hw_i2cs[i] != NULL) {
-   kfree(pool->base.hw_i2cs[i]);
-   pool->base.hw_i2cs[i] = NULL;
-   }
-   if (pool->base.sw_i2cs[i] != NULL) {
-   kfree(pool->base.sw_i2cs[i]);
-   pool->base.sw_i2cs[i] = NULL;
-   }
+   kfree(pool->base.hw_i2cs[i]);
+   pool->base.hw_i2cs[i] = NULL;
+   kfree(pool->base.sw_i2cs[i]);
+   pool->base.sw_i2cs[i] = NULL;
}
 
for (i = 0; i < pool->base.audio_count; i++) {
-- 
2.33.1

[PATCH] drm/amd/display: cleanup the code a bit

In function dc_sink_destruct, kfree will check pointer, no need
to check again.
This change is to cleanup the code a bit.

Signed-off-by: Bernard Zhao 
---
 drivers/gpu/drm/amd/display/dc/core/dc_sink.c | 10 +-
 1 file changed, 1 insertion(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_sink.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_sink.c
index a249a0e5edd0..4b5e4d8e7735 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_sink.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_sink.c
@@ -33,14 +33,6 @@
  * Private functions
  
**/
 
-static void dc_sink_destruct(struct dc_sink *sink)
-{
-   if (sink->dc_container_id) {
-   kfree(sink->dc_container_id);
-   sink->dc_container_id = NULL;
-   }
-}
-
 static bool dc_sink_construct(struct dc_sink *sink, const struct 
dc_sink_init_data *init_params)
 {
 
@@ -75,7 +67,7 @@ void dc_sink_retain(struct dc_sink *sink)
 static void dc_sink_free(struct kref *kref)
 {
struct dc_sink *sink = container_of(kref, struct dc_sink, refcount);
-   dc_sink_destruct(sink);
+   kfree(sink->dc_container_id);
kfree(sink);
 }
 
-- 
2.33.1

Re: [PATCH linux-next] drm/msm/dp: remove unneeded variable

2021-11-15 Thread Stephen Boyd

Quoting cgel@gmail.com (2021-11-10 04:05:12)
> From: Changcheng Deng 
>
> Fix the following coccicheck review:
> ./drivers/gpu/drm/msm/dp/dp_debug.c: Unneeded variable
>
> Remove unneeded variable used to store return value.
>
> Reported-by: Zeal Robot 
> Signed-off-by: Changcheng Deng 
> ---
>  drivers/gpu/drm/msm/dp/dp_debug.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/msm/dp/dp_debug.c 
> b/drivers/gpu/drm/msm/dp/dp_debug.c
> index 2f6247e80e9d..c5c75273d1e5 100644
> --- a/drivers/gpu/drm/msm/dp/dp_debug.c
> +++ b/drivers/gpu/drm/msm/dp/dp_debug.c
> @@ -365,7 +365,6 @@ static const struct file_operations test_active_fops = {
>
>  static int dp_debug_init(struct dp_debug *dp_debug, struct drm_minor *minor)

Would be nice to make it void as well and then cleanup the caller. Can
you do that too?

>  {
> -   int rc = 0;
> struct dp_debug_private *debug = container_of(dp_debug,
> struct dp_debug_private, dp_debug);
>

Re: [PATCH] drm/msm: Fix mmap to include VM_IO and VM_DONTDUMP

2021-11-15 Thread Stephen Boyd

Quoting Douglas Anderson (2021-11-10 11:33:42)
> In commit 510410bfc034 ("drm/msm: Implement mmap as GEM object
> function") we switched to a new/cleaner method of doing things. That's
> good, but we missed a little bit.
>
> Before that commit, we used to _first_ run through the
> drm_gem_mmap_obj() case where `obj->funcs->mmap()` was NULL. That meant
> that we ran:
>
>   vma->vm_flags |= VM_IO | VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP;
>   vma->vm_page_prot = pgprot_writecombine(vm_get_page_prot(vma->vm_flags));
>   vma->vm_page_prot = pgprot_decrypted(vma->vm_page_prot);
>
> ...and _then_ we modified those mappings with our own. Now that
> `obj->funcs->mmap()` is no longer NULL we don't run the default
> code. It looks like the fact that the vm_flags got VM_IO / VM_DONTDUMP
> was important because we're now getting crashes on Chromebooks that
> use ARC++ while logging out. Specifically a crash that looks like this
> (this is on a 5.10 kernel w/ relevant backports but also seen on a
> 5.15 kernel):
>
>   Unable to handle kernel paging request at virtual address ffc00800
>   Mem abort info:
> ESR = 0x9606
> EC = 0x25: DABT (current EL), IL = 32 bits
> SET = 0, FnV = 0
> EA = 0, S1PTW = 0
>   Data abort info:
> ISV = 0, ISS = 0x0006
> CM = 0, WnR = 0
>   swapper pgtable: 4k pages, 39-bit VAs, pgdp=8293d000
>   [ffc00800] pgd=0001002b3003, p4d=0001002b3003,
>  pud=0001002b3003, pmd=
>   Internal error: Oops: 9606 [#1] PREEMPT SMP
>   [...]
>   CPU: 7 PID: 15734 Comm: crash_dump64 Tainted: G W 5.10.67 #1 [...]
>   Hardware name: Qualcomm Technologies, Inc. sc7280 IDP SKU2 platform (DT)
>   pstate: 8049 (Nzcv daif +PAN -UAO -TCO BTYPE=--)
>   pc : __arch_copy_to_user+0xc0/0x30c
>   lr : copyout+0xac/0x14c
>   [...]
>   Call trace:
>__arch_copy_to_user+0xc0/0x30c
>copy_page_to_iter+0x1a0/0x294
>process_vm_rw_core+0x240/0x408
>process_vm_rw+0x110/0x16c
>__arm64_sys_process_vm_readv+0x30/0x3c
>el0_svc_common+0xf8/0x250
>do_el0_svc+0x30/0x80
>el0_svc+0x10/0x1c
>el0_sync_handler+0x78/0x108
>el0_sync+0x184/0x1c0
>   Code: f8408423 f80008c3 910020c6 36100082 (b8404423)
>
> Let's add the two flags back in.
>
> While we're at it, the fact that we aren't running the default means
> that we _don't_ need to clear out VM_PFNMAP, so remove that and save
> an instruction.
>
> NOTE: it was confirmed that VM_IO was the important flag to fix the
> problem I was seeing, but adding back VM_DONTDUMP seems like a sane
> thing to do so I'm doing that too.
>
> Fixes: 510410bfc034 ("drm/msm: Implement mmap as GEM object function")
> Reported-by: Stephen Boyd 
> Signed-off-by: Douglas Anderson 
> ---

Reviewed-by: Stephen Boyd 
Tested-by: Stephen Boyd

Re: Build regressions/improvements in v5.16-rc1

2021-11-15 Thread Kees Cook

On Mon, Nov 15, 2021 at 05:44:33PM +0100, Marco Elver wrote:
> On Mon, Nov 15, 2021 at 05:12PM +0100, Geert Uytterhoeven wrote:
> [...]
> > >   + /kisskb/src/include/linux/fortify-string.h: error: call to 
> > > '__read_overflow' declared with attribute error: detected read beyond 
> > > size of object (1st parameter):  => 263:25, 277:17
> > 
> > in lib/test_kasan.c
> > 
> > s390-all{mod,yes}config
> > arm64-allmodconfig (gcc11)
> 
> Kees, wasn't that what [1] was meant to fix?
> [1] https://lkml.kernel.org/r/20211006181544.1670992-1-keesc...@chromium.org

Ah, I found it:

http://kisskb.ellerman.id.au/kisskb/buildresult/14660585/log/

it's actually:

inlined from 'kasan_memcmp' at /kisskb/src/lib/test_kasan.c:897:2:

and

inlined from 'kasan_memchr' at /kisskb/src/lib/test_kasan.c:872:2:

I can send a patch doing the same as what [1] does for these cases too.

-- 
Kees Cook

Re: Build regressions/improvements in v5.16-rc1

2021-11-15 Thread Kees Cook

On Mon, Nov 15, 2021 at 05:44:33PM +0100, Marco Elver wrote:
> On Mon, Nov 15, 2021 at 05:12PM +0100, Geert Uytterhoeven wrote:
> [...]
> > >   + /kisskb/src/include/linux/fortify-string.h: error: call to 
> > > '__read_overflow' declared with attribute error: detected read beyond 
> > > size of object (1st parameter):  => 263:25, 277:17
> > 
> > in lib/test_kasan.c
> > 
> > s390-all{mod,yes}config
> > arm64-allmodconfig (gcc11)
> 
> Kees, wasn't that what [1] was meant to fix?
> [1] https://lkml.kernel.org/r/20211006181544.1670992-1-keesc...@chromium.org

[1] fixed the ones I found when scanning for __write_overflow(). [2]
fixed some others, so it's possible there are yet more to fix?

Taking a look at Linus's tree, though, the "263" and "277" lines don't
line up correctly. I'll go see if I can reproduce this. Is this with
W=1?

-Kees

[2] 
https://www.ozlabs.org/~akpm/mmotm/broken-out/kasan-test-consolidate-workarounds-for-unwanted-__alloc_size-protection.patch


-- 
Kees Cook

Re: [PATCH] drm/bridge: parade-ps8640: Fix additional suspend/resume at bootup

2021-11-15 Thread Stephen Boyd

Quoting yangcong (2021-11-12 00:43:02)
> Through log and waveform, we can see that there will be additional
> suspend/resume when booting. This timing does not meet the ps8640 spec.
> It seems that the delay of 500ms does not satisfied drm_panel_get_modes.
> I increased it to 900ms and it seems that this problem can be solved.
> To be safe, I'd just round up to a full 1000.
>
> Signed-off-by: yangcong 
> ---

Reviewed-by: Stephen Boyd

Re: [PATCH v10, 2/5] drm/mediatek: add component POSTMASK

Hi, Yongqiang:

Yongqiang Niu  於 2021年9月30日 週四 下午11:52寫道：
>
> This patch add component POSTMASK.

Applied to mediatek-drm-next [1], thanks.

[1] 
https://git.kernel.org/pub/scm/linux/kernel/git/chunkuang.hu/linux.git/log/?h=mediatek-drm-next

Regards,
Chun-Kuang.

>
> Signed-off-by: Yongqiang Niu 
> Signed-off-by: Hsin-Yi Wang 
> Reviewed-by: CK Hu 
> ---
>  drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c | 102 ++--
>  drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.h |   1 +
>  2 files changed, 73 insertions(+), 30 deletions(-)
>
> diff --git a/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c 
> b/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c
> index 4a2abcf3e5f9..89170ad825fd 100644
> --- a/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c
> +++ b/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c
> @@ -62,6 +62,12 @@
>  #define DITHER_ADD_LSHIFT_G(x) (((x) & 0x7) << 4)
>  #define DITHER_ADD_RSHIFT_G(x) (((x) & 0x7) << 0)
>
> +#define DISP_POSTMASK_EN   0x
> +#define POSTMASK_ENBIT(0)
> +#define DISP_POSTMASK_CFG  0x0020
> +#define POSTMASK_RELAY_MODEBIT(0)
> +#define DISP_POSTMASK_SIZE 0x0030
> +
>  struct mtk_ddp_comp_dev {
> struct clk *clk;
> void __iomem *regs;
> @@ -214,6 +220,32 @@ static void mtk_dither_stop(struct device *dev)
> writel_relaxed(0x0, priv->regs + DISP_DITHER_EN);
>  }
>
> +static void mtk_postmask_config(struct device *dev, unsigned int w,
> +   unsigned int h, unsigned int vrefresh,
> +   unsigned int bpc, struct cmdq_pkt *cmdq_pkt)
> +{
> +   struct mtk_ddp_comp_dev *priv = dev_get_drvdata(dev);
> +
> +   mtk_ddp_write(cmdq_pkt, w << 16 | h, >cmdq_reg, priv->regs,
> + DISP_POSTMASK_SIZE);
> +   mtk_ddp_write(cmdq_pkt, POSTMASK_RELAY_MODE, >cmdq_reg,
> + priv->regs, DISP_POSTMASK_CFG);
> +}
> +
> +static void mtk_postmask_start(struct device *dev)
> +{
> +   struct mtk_ddp_comp_dev *priv = dev_get_drvdata(dev);
> +
> +   writel(POSTMASK_EN, priv->regs + DISP_POSTMASK_EN);
> +}
> +
> +static void mtk_postmask_stop(struct device *dev)
> +{
> +   struct mtk_ddp_comp_dev *priv = dev_get_drvdata(dev);
> +
> +   writel_relaxed(0x0, priv->regs + DISP_POSTMASK_EN);
> +}
> +
>  static const struct mtk_ddp_comp_funcs ddp_aal = {
> .clk_enable = mtk_aal_clk_enable,
> .clk_disable = mtk_aal_clk_disable,
> @@ -289,6 +321,14 @@ static const struct mtk_ddp_comp_funcs ddp_ovl = {
> .bgclr_in_off = mtk_ovl_bgclr_in_off,
>  };
>
> +static const struct mtk_ddp_comp_funcs ddp_postmask = {
> +   .clk_enable = mtk_ddp_clk_enable,
> +   .clk_disable = mtk_ddp_clk_disable,
> +   .config = mtk_postmask_config,
> +   .start = mtk_postmask_start,
> +   .stop = mtk_postmask_stop,
> +};
> +
>  static const struct mtk_ddp_comp_funcs ddp_rdma = {
> .clk_enable = mtk_rdma_clk_enable,
> .clk_disable = mtk_rdma_clk_disable,
> @@ -324,6 +364,7 @@ static const char * const 
> mtk_ddp_comp_stem[MTK_DDP_COMP_TYPE_MAX] = {
> [MTK_DISP_MUTEX] = "mutex",
> [MTK_DISP_OD] = "od",
> [MTK_DISP_BLS] = "bls",
> +   [MTK_DISP_POSTMASK] = "postmask",
>  };
>
>  struct mtk_ddp_comp_match {
> @@ -333,36 +374,37 @@ struct mtk_ddp_comp_match {
>  };
>
>  static const struct mtk_ddp_comp_match mtk_ddp_matches[DDP_COMPONENT_ID_MAX] 
> = {
> -   [DDP_COMPONENT_AAL0]= { MTK_DISP_AAL,   0, _aal },
> -   [DDP_COMPONENT_AAL1]= { MTK_DISP_AAL,   1, _aal },
> -   [DDP_COMPONENT_BLS] = { MTK_DISP_BLS,   0, NULL },
> -   [DDP_COMPONENT_CCORR]   = { MTK_DISP_CCORR, 0, _ccorr },
> -   [DDP_COMPONENT_COLOR0]  = { MTK_DISP_COLOR, 0, _color },
> -   [DDP_COMPONENT_COLOR1]  = { MTK_DISP_COLOR, 1, _color },
> -   [DDP_COMPONENT_DITHER]  = { MTK_DISP_DITHER,0, _dither },
> -   [DDP_COMPONENT_DPI0]= { MTK_DPI,0, _dpi },
> -   [DDP_COMPONENT_DPI1]= { MTK_DPI,1, _dpi },
> -   [DDP_COMPONENT_DSI0]= { MTK_DSI,0, _dsi },
> -   [DDP_COMPONENT_DSI1]= { MTK_DSI,1, _dsi },
> -   [DDP_COMPONENT_DSI2]= { MTK_DSI,2, _dsi },
> -   [DDP_COMPONENT_DSI3]= { MTK_DSI,3, _dsi },
> -   [DDP_COMPONENT_GAMMA]   = { MTK_DISP_GAMMA, 0, _gamma },
> -   [DDP_COMPONENT_OD0] = { MTK_DISP_OD,0, _od },
> -   [DDP_COMPONENT_OD1] = { MTK_DISP_OD,1, _od },
> -   [DDP_COMPONENT_OVL0]= { MTK_DISP_OVL,   0, _ovl },
> -   [DDP_COMPONENT_OVL1]= { MTK_DISP_OVL,   1, _ovl },
> -   [DDP_COMPONENT_OVL_2L0] = { MTK_DISP_OVL_2L,0, _ovl },
> -   [DDP_COMPONENT_OVL_2L1] = { MTK_DISP_OVL_2L,1, _ovl },
> -   [DDP_COMPONENT_OVL_2L2] = { MTK_DISP_OVL_2L,

Re: [PATCH v10, 5/5] drm/mediatek: add support for mediatek SOC MT8192

Hi, Yongqiang:

Yongqiang Niu  於 2021年9月30日 週四 下午11:52寫道：
>
> add support for mediatek SOC MT8192

Applied to mediatek-drm-next [1], thanks.

[1] 
https://git.kernel.org/pub/scm/linux/kernel/git/chunkuang.hu/linux.git/log/?h=mediatek-drm-next

Regards,
Chun-Kuang.

>
> Signed-off-by: Yongqiang Niu 
> Signed-off-by: Hsin-Yi Wang 
> Reviewed-by: CK Hu 
> ---
>  drivers/gpu/drm/mediatek/mtk_disp_ccorr.c |  6 
>  drivers/gpu/drm/mediatek/mtk_disp_ovl.c   | 20 +++
>  drivers/gpu/drm/mediatek/mtk_disp_rdma.c  |  6 
>  drivers/gpu/drm/mediatek/mtk_drm_drv.c| 42 +++
>  4 files changed, 74 insertions(+)
>
> diff --git a/drivers/gpu/drm/mediatek/mtk_disp_ccorr.c 
> b/drivers/gpu/drm/mediatek/mtk_disp_ccorr.c
> index 141cb36b9c07..3a53ebc4e172 100644
> --- a/drivers/gpu/drm/mediatek/mtk_disp_ccorr.c
> +++ b/drivers/gpu/drm/mediatek/mtk_disp_ccorr.c
> @@ -205,9 +205,15 @@ static const struct mtk_disp_ccorr_data 
> mt8183_ccorr_driver_data = {
> .matrix_bits = 10,
>  };
>
> +static const struct mtk_disp_ccorr_data mt8192_ccorr_driver_data = {
> +   .matrix_bits = 11,
> +};
> +
>  static const struct of_device_id mtk_disp_ccorr_driver_dt_match[] = {
> { .compatible = "mediatek,mt8183-disp-ccorr",
>   .data = _ccorr_driver_data},
> +   { .compatible = "mediatek,mt8192-disp-ccorr",
> + .data = _ccorr_driver_data},
> {},
>  };
>  MODULE_DEVICE_TABLE(of, mtk_disp_ccorr_driver_dt_match);
> diff --git a/drivers/gpu/drm/mediatek/mtk_disp_ovl.c 
> b/drivers/gpu/drm/mediatek/mtk_disp_ovl.c
> index 5326989d5206..2146299e5f52 100644
> --- a/drivers/gpu/drm/mediatek/mtk_disp_ovl.c
> +++ b/drivers/gpu/drm/mediatek/mtk_disp_ovl.c
> @@ -456,6 +456,22 @@ static const struct mtk_disp_ovl_data 
> mt8183_ovl_2l_driver_data = {
> .fmt_rgb565_is_0 = true,
>  };
>
> +static const struct mtk_disp_ovl_data mt8192_ovl_driver_data = {
> +   .addr = DISP_REG_OVL_ADDR_MT8173,
> +   .gmc_bits = 10,
> +   .layer_nr = 4,
> +   .fmt_rgb565_is_0 = true,
> +   .smi_id_en = true,
> +};
> +
> +static const struct mtk_disp_ovl_data mt8192_ovl_2l_driver_data = {
> +   .addr = DISP_REG_OVL_ADDR_MT8173,
> +   .gmc_bits = 10,
> +   .layer_nr = 2,
> +   .fmt_rgb565_is_0 = true,
> +   .smi_id_en = true,
> +};
> +
>  static const struct of_device_id mtk_disp_ovl_driver_dt_match[] = {
> { .compatible = "mediatek,mt2701-disp-ovl",
>   .data = _ovl_driver_data},
> @@ -465,6 +481,10 @@ static const struct of_device_id 
> mtk_disp_ovl_driver_dt_match[] = {
>   .data = _ovl_driver_data},
> { .compatible = "mediatek,mt8183-disp-ovl-2l",
>   .data = _ovl_2l_driver_data},
> +   { .compatible = "mediatek,mt8192-disp-ovl",
> + .data = _ovl_driver_data},
> +   { .compatible = "mediatek,mt8192-disp-ovl-2l",
> + .data = _ovl_2l_driver_data},
> {},
>  };
>  MODULE_DEVICE_TABLE(of, mtk_disp_ovl_driver_dt_match);
> diff --git a/drivers/gpu/drm/mediatek/mtk_disp_rdma.c 
> b/drivers/gpu/drm/mediatek/mtk_disp_rdma.c
> index 75d7f45579e2..d41a3970b944 100644
> --- a/drivers/gpu/drm/mediatek/mtk_disp_rdma.c
> +++ b/drivers/gpu/drm/mediatek/mtk_disp_rdma.c
> @@ -353,6 +353,10 @@ static const struct mtk_disp_rdma_data 
> mt8183_rdma_driver_data = {
> .fifo_size = 5 * SZ_1K,
>  };
>
> +static const struct mtk_disp_rdma_data mt8192_rdma_driver_data = {
> +   .fifo_size = 5 * SZ_1K,
> +};
> +
>  static const struct of_device_id mtk_disp_rdma_driver_dt_match[] = {
> { .compatible = "mediatek,mt2701-disp-rdma",
>   .data = _rdma_driver_data},
> @@ -360,6 +364,8 @@ static const struct of_device_id 
> mtk_disp_rdma_driver_dt_match[] = {
>   .data = _rdma_driver_data},
> { .compatible = "mediatek,mt8183-disp-rdma",
>   .data = _rdma_driver_data},
> +   { .compatible = "mediatek,mt8192-disp-rdma",
> + .data = _rdma_driver_data},
> {},
>  };
>  MODULE_DEVICE_TABLE(of, mtk_disp_rdma_driver_dt_match);
> diff --git a/drivers/gpu/drm/mediatek/mtk_drm_drv.c 
> b/drivers/gpu/drm/mediatek/mtk_drm_drv.c
> index aec39724ebeb..fa86485b4b9a 100644
> --- a/drivers/gpu/drm/mediatek/mtk_drm_drv.c
> +++ b/drivers/gpu/drm/mediatek/mtk_drm_drv.c
> @@ -158,6 +158,25 @@ static const enum mtk_ddp_comp_id mt8183_mtk_ddp_ext[] = 
> {
> DDP_COMPONENT_DPI0,
>  };
>
> +static const enum mtk_ddp_comp_id mt8192_mtk_ddp_main[] = {
> +   DDP_COMPONENT_OVL0,
> +   DDP_COMPONENT_OVL_2L0,
> +   DDP_COMPONENT_RDMA0,
> +   DDP_COMPONENT_COLOR0,
> +   DDP_COMPONENT_CCORR,
> +   DDP_COMPONENT_AAL0,
> +   DDP_COMPONENT_GAMMA,
> +   DDP_COMPONENT_POSTMASK0,
> +   DDP_COMPONENT_DITHER,
> +   DDP_COMPONENT_DSI0,
> +};
> +
> +static const enum mtk_ddp_comp_id mt8192_mtk_ddp_ext[] = {
> +   DDP_COMPONENT_OVL_2L2,
> +   DDP_COMPONENT_RDMA4,
> +   DDP_COMPONENT_DPI0,
> +};
> +
>  static const struct

Re: [PATCH v10, 3/5] drm/mediatek: add component RDMA4

Hi, Yongqiang:

Yongqiang Niu  於 2021年9月30日 週四 下午11:52寫道：
>
> This patch add component RDMA4

Applied to mediatek-drm-next [1], thanks.

[1] 
https://git.kernel.org/pub/scm/linux/kernel/git/chunkuang.hu/linux.git/log/?h=mediatek-drm-next

Regards,
Chun-Kuang.

>
> Signed-off-by: Yongqiang Niu 
> Reviewed-by: Chun-Kuang Hu 
> Signed-off-by: Hsin-Yi Wang 
> ---
>  drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c 
> b/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c
> index 89170ad825fd..6491eadf34c2 100644
> --- a/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c
> +++ b/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c
> @@ -402,6 +402,7 @@ static const struct mtk_ddp_comp_match 
> mtk_ddp_matches[DDP_COMPONENT_ID_MAX] = {
> [DDP_COMPONENT_RDMA0]   = { MTK_DISP_RDMA,  0, _rdma 
> },
> [DDP_COMPONENT_RDMA1]   = { MTK_DISP_RDMA,  1, _rdma 
> },
> [DDP_COMPONENT_RDMA2]   = { MTK_DISP_RDMA,  2, _rdma 
> },
> +   [DDP_COMPONENT_RDMA4]   = { MTK_DISP_RDMA,  4, _rdma 
> },
> [DDP_COMPONENT_UFOE]= { MTK_DISP_UFOE,  0, _ufoe 
> },
> [DDP_COMPONENT_WDMA0]   = { MTK_DISP_WDMA,  0, NULL },
> [DDP_COMPONENT_WDMA1]   = { MTK_DISP_WDMA,  1, NULL },
> --
> 2.25.1
>

Re: [PATCH v10, 1/5] drm/mediatek: add component OVL_2L2

Hi, Yongqiang:

Yongqiang Niu  於 2021年9月30日 週四 下午11:52寫道：
>
> This patch add component OVL_2L2

Applied to mediatek-drm-next [1], thanks.

[1] 
https://git.kernel.org/pub/scm/linux/kernel/git/chunkuang.hu/linux.git/log/?h=mediatek-drm-next

Regards,
Chun-Kuang.

>
> Signed-off-by: Yongqiang Niu 
> Reviewed-by: Chun-Kuang Hu 
> Signed-off-by: Hsin-Yi Wang 
> ---
>  drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c 
> b/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c
> index 33e8789fde8a..4a2abcf3e5f9 100644
> --- a/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c
> +++ b/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c
> @@ -353,6 +353,7 @@ static const struct mtk_ddp_comp_match 
> mtk_ddp_matches[DDP_COMPONENT_ID_MAX] = {
> [DDP_COMPONENT_OVL1]= { MTK_DISP_OVL,   1, _ovl },
> [DDP_COMPONENT_OVL_2L0] = { MTK_DISP_OVL_2L,0, _ovl },
> [DDP_COMPONENT_OVL_2L1] = { MTK_DISP_OVL_2L,1, _ovl },
> +   [DDP_COMPONENT_OVL_2L2] = { MTK_DISP_OVL_2L,2, _ovl },
> [DDP_COMPONENT_PWM0]= { MTK_DISP_PWM,   0, NULL },
> [DDP_COMPONENT_PWM1]= { MTK_DISP_PWM,   1, NULL },
> [DDP_COMPONENT_PWM2]= { MTK_DISP_PWM,   2, NULL },
> --
> 2.25.1
>

Re: [PATCH] fbdev: sh7760fb: document fallthrough cases

2021-11-15 Thread Gustavo A. R. Silva

On Mon, Nov 15, 2021 at 09:35:09AM +0100, Geert Uytterhoeven wrote:
> On Mon, Nov 15, 2021 at 7:33 AM Randy Dunlap  wrote:
> > Fix fallthrough warnings in sh776fb.c:
> >
> > ../drivers/video/fbdev/sh7760fb.c: In function 'sh7760fb_get_color_info':
> > ../drivers/video/fbdev/sh7760fb.c:138:23: warning: this statement may fall 
> > through [-Wimplicit-fallthrough=]
> >   138 | lgray = 1;
> > ../drivers/video/fbdev/sh7760fb.c:143:23: warning: this statement may fall 
> > through [-Wimplicit-fallthrough=]
> >   143 | lgray = 1;
> >
> > Just document the current state of code execution/flow.
> >
> > Fixes: 4a25e41831ee ("video: sh7760fb: SH7760/SH7763 LCDC framebuffer 
> > driver")
> > Signed-off-by: Randy Dunlap 
> 
> Section 30.4.4 ("Data Format") of the SH7760 Group Hardware
> Manual confirms fall-through is appropriate here (especially for
> the odd 6 bpp mode).
> 
> Reviewed-by: Geert Uytterhoeven 

I'm taking this in my -next tree[1].

Thanks
--
Gustavo

[1] 
https://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux.git/log/?h=for-next/kspp-misc-fixes

[Bug 214921] amdgpu hangs HP Laptop on shutdown

https://bugzilla.kernel.org/show_bug.cgi?id=214921

--- Comment #6 from spassw...@web.de ---
Tested the patch with linux-5.15.2, linux-next-2025 and linux-5.16-rc1. It
solves the hang on suspend (or shutdown) problem in all cases but resuming from
suspend is still broken on linux-5.15.2 when the IOMMU is missing:
https://bugzilla.kernel.org/show_bug.cgi?id=214963

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

[Bug 214963] [amdgpu] resuming from suspend fails when IOMMU is missing

https://bugzilla.kernel.org/show_bug.cgi?id=214963

spassw...@web.de changed:

   What|Removed |Added

 Kernel Version|5.15.0, 5.15.1  |5.15.0, 5.15.1, 5.15.2

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

Re: Panic with linus/master and panfrost

2021-11-15 Thread Rob Clark

On Mon, Nov 15, 2021 at 2:43 PM Rob Clark  wrote:
>
> On Mon, Nov 15, 2021 at 8:16 AM Ondřej Jirman  wrote:
> >
> > On Mon, Nov 15, 2021 at 05:04:36PM +0100, megi xff wrote:
> > > On Mon, Nov 15, 2021 at 04:05:02PM +0100, Daniel Vetter wrote:
> > > > You need
> > > >
> > > > commit 13e9e30cafea10dff6bc8d63a38a61249e83fd65
> > > > Author: Christian König 
> > > > Date:   Mon Oct 18 21:27:55 2021 +0200
> > > >
> > > >drm/scheduler: fix drm_sched_job_add_implicit_dependencies
> > >
> > > Thank you, that fixed the panic. :)
> >
> > I spoke too soon. Panic is gone, but I still see (immediately after
> > starting Xorg):
> >
> > [   13.290795] [ cut here ]
> > [   13.291103] refcount_t: addition on 0; use-after-free.
> > [   13.291495] WARNING: CPU: 5 PID: 548 at lib/refcount.c:25 
> > refcount_warn_saturate+0x98/0x140
> > [   13.292124] Modules linked in:
> > [   13.292285] CPU: 5 PID: 548 Comm: Xorg Not tainted 
> > 5.16.0-rc1-00414-g21a254904a26 #29
> > [   13.292857] Hardware name: Pine64 PinePhonePro (DT)
> > [   13.293172] pstate: 4005 (nZcv daif -PAN -UAO -TCO -DIT -SSBS 
> > BTYPE=--)
> > [   13.293669] pc : refcount_warn_saturate+0x98/0x140
> > [   13.293977] lr : refcount_warn_saturate+0x98/0x140
> > [   13.294285] sp : 8000129a3b50
> > [   13.294464] x29: 8000129a3b50 x28: 8000129a3d50 x27: 
> > 17ec4b00
> > [   13.294979] x26: 0001 x25: 0001 x24: 
> > 127cca48
> > [   13.295494] x23: 17d19b00 x22: 000a x21: 
> > 0001
> > [   13.296006] x20: 17e15500 x19: 12980580 x18: 
> > 0003
> > [   13.296520] x17:  x16:  x15: 
> > 8000129a3b58
> > [   13.297033] x14:  x13: 2e656572662d7265 x12: 
> > 7466612d65737520
> > [   13.297546] x11: 3b30206e6f206e6f x10: 800011d6e8a0 x9 : 
> > 80001022f37c
> > [   13.298059] x8 : efff x7 : 800011dc68a0 x6 : 
> > 0001
> > [   13.298573] x5 :  x4 : f77a9788 x3 : 
> > f77b56f0
> > [   13.299085] x2 : f77a9788 x1 : 8000e5eb1000 x0 : 
> > 002a
> > [   13.299600] Call trace:
> > [   13.299704]  refcount_warn_saturate+0x98/0x140
> > [   13.299981]  drm_sched_job_add_implicit_dependencies+0x90/0xdc
> > [   13.300385]  panfrost_job_push+0xd0/0x1d4
> > [   13.300628]  panfrost_ioctl_submit+0x34c/0x440
> > [   13.300906]  drm_ioctl_kernel+0x9c/0x154
> > [   13.301142]  drm_ioctl+0x1f0/0x410
> > [   13.301330]  __arm64_sys_ioctl+0xb4/0xdc
> > [   13.301566]  invoke_syscall+0x4c/0x110
> > [   13.301787]  el0_svc_common.constprop.0+0x48/0xf0
> > [   13.302090]  do_el0_svc+0x2c/0x90
> > [   13.302271]  el0_svc+0x14/0x50
> > [   13.302431]  el0t_64_sync_handler+0x9c/0x120
> > [   13.302693]  el0t_64_sync+0x158/0x15c
> > [   13.302904] ---[ end trace 8c211e57f89714c8 ]---
> > [   13.303211] [ cut here ]
> > [   13.303504] refcount_t: underflow; use-after-free.
> > [   13.303820] WARNING: CPU: 5 PID: 548 at lib/refcount.c:28 
> > refcount_warn_saturate+0xec/0x140
> > [   13.304439] Modules linked in:
> > [   13.304596] CPU: 5 PID: 548 Comm: Xorg Tainted: GW 
> > 5.16.0-rc1-00414-g21a254904a26 #29
> > [   13.305286] Hardware name: Pine64 PinePhonePro (DT)
> > [   13.305600] pstate: 4005 (nZcv daif -PAN -UAO -TCO -DIT -SSBS 
> > BTYPE=--)
> > [   13.306095] pc : refcount_warn_saturate+0xec/0x140
> > [   13.306402] lr : refcount_warn_saturate+0xec/0x140
> > [   13.306710] sp : 8000129a3b70
> > [   13.306887] x29: 8000129a3b70 x28: 8000129a3d50 x27: 
> > 17ec4b00
> > [   13.307401] x26: 0001 x25: 0001 x24: 
> > 
> > [   13.307914] x23:  x22: 129807c0 x21: 
> > 12980580
> > [   13.308428] x20: 17c54d00 x19:  x18: 
> > 0003
> > [   13.308942] x17:  x16:  x15: 
> > 8000129a3b58
> > [   13.309454] x14:  x13: 2e656572662d7265 x12: 
> > 7466612d65737520
> > [   13.309967] x11: 3b776f6c66726564 x10: 800011d6e8a0 x9 : 
> > 80001017893c
> > [   13.310480] x8 : efff x7 : 800011dc68a0 x6 : 
> > 0001
> > [   13.310993] x5 : f77a9788 x4 :  x3 : 
> > 0027
> > [   13.311506] x2 : 0023 x1 : f77a9790 x0 : 
> > 0026
> > [   13.312020] Call trace:
> > [   13.312123]  refcount_warn_saturate+0xec/0x140
> > [   13.312401]  dma_resv_add_excl_fence+0x1a8/0x1bc
> > [   13.312700]  panfrost_job_push+0x174/0x1d4
> > [   13.312949]  panfrost_ioctl_submit+0x34c/0x440
> > [   13.313229]  drm_ioctl_kernel+0x9c/0x154
> > [   13.313464]  drm_ioctl+0x1f0/0x410
> > [   13.313651]  __arm64_sys_ioctl+0xb4/0xdc
> > [   13.313884]  invoke_syscall+0x4c/0x110
> > [   13.314103]  el0_svc_common.constprop.0+0x48/0xf0
> > [

[Bug 215025] [amdgpu] Thinkpad A275 hangs on shutdown / screen does not turn on after reboot

https://bugzilla.kernel.org/show_bug.cgi?id=215025

--- Comment #7 from Bjoern Franke (b...@nord-west.org) ---
The patch fixes it, thanks.

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

[Bug 215025] [amdgpu] Thinkpad A275 hangs on shutdown / screen does not turn on after reboot

https://bugzilla.kernel.org/show_bug.cgi?id=215025

--- Comment #6 from Bjoern Franke (b...@nord-west.org) ---
Created attachment 299595
  --> https://bugzilla.kernel.org/attachment.cgi?id=299595=edit
dmesg

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

Re: Panic with linus/master and panfrost

2021-11-15 Thread Rob Clark

On Mon, Nov 15, 2021 at 8:16 AM Ondřej Jirman  wrote:
>
> On Mon, Nov 15, 2021 at 05:04:36PM +0100, megi xff wrote:
> > On Mon, Nov 15, 2021 at 04:05:02PM +0100, Daniel Vetter wrote:
> > > You need
> > >
> > > commit 13e9e30cafea10dff6bc8d63a38a61249e83fd65
> > > Author: Christian König 
> > > Date:   Mon Oct 18 21:27:55 2021 +0200
> > >
> > >drm/scheduler: fix drm_sched_job_add_implicit_dependencies
> >
> > Thank you, that fixed the panic. :)
>
> I spoke too soon. Panic is gone, but I still see (immediately after
> starting Xorg):
>
> [   13.290795] [ cut here ]
> [   13.291103] refcount_t: addition on 0; use-after-free.
> [   13.291495] WARNING: CPU: 5 PID: 548 at lib/refcount.c:25 
> refcount_warn_saturate+0x98/0x140
> [   13.292124] Modules linked in:
> [   13.292285] CPU: 5 PID: 548 Comm: Xorg Not tainted 
> 5.16.0-rc1-00414-g21a254904a26 #29
> [   13.292857] Hardware name: Pine64 PinePhonePro (DT)
> [   13.293172] pstate: 4005 (nZcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [   13.293669] pc : refcount_warn_saturate+0x98/0x140
> [   13.293977] lr : refcount_warn_saturate+0x98/0x140
> [   13.294285] sp : 8000129a3b50
> [   13.294464] x29: 8000129a3b50 x28: 8000129a3d50 x27: 
> 17ec4b00
> [   13.294979] x26: 0001 x25: 0001 x24: 
> 127cca48
> [   13.295494] x23: 17d19b00 x22: 000a x21: 
> 0001
> [   13.296006] x20: 17e15500 x19: 12980580 x18: 
> 0003
> [   13.296520] x17:  x16:  x15: 
> 8000129a3b58
> [   13.297033] x14:  x13: 2e656572662d7265 x12: 
> 7466612d65737520
> [   13.297546] x11: 3b30206e6f206e6f x10: 800011d6e8a0 x9 : 
> 80001022f37c
> [   13.298059] x8 : efff x7 : 800011dc68a0 x6 : 
> 0001
> [   13.298573] x5 :  x4 : f77a9788 x3 : 
> f77b56f0
> [   13.299085] x2 : f77a9788 x1 : 8000e5eb1000 x0 : 
> 002a
> [   13.299600] Call trace:
> [   13.299704]  refcount_warn_saturate+0x98/0x140
> [   13.299981]  drm_sched_job_add_implicit_dependencies+0x90/0xdc
> [   13.300385]  panfrost_job_push+0xd0/0x1d4
> [   13.300628]  panfrost_ioctl_submit+0x34c/0x440
> [   13.300906]  drm_ioctl_kernel+0x9c/0x154
> [   13.301142]  drm_ioctl+0x1f0/0x410
> [   13.301330]  __arm64_sys_ioctl+0xb4/0xdc
> [   13.301566]  invoke_syscall+0x4c/0x110
> [   13.301787]  el0_svc_common.constprop.0+0x48/0xf0
> [   13.302090]  do_el0_svc+0x2c/0x90
> [   13.302271]  el0_svc+0x14/0x50
> [   13.302431]  el0t_64_sync_handler+0x9c/0x120
> [   13.302693]  el0t_64_sync+0x158/0x15c
> [   13.302904] ---[ end trace 8c211e57f89714c8 ]---
> [   13.303211] [ cut here ]
> [   13.303504] refcount_t: underflow; use-after-free.
> [   13.303820] WARNING: CPU: 5 PID: 548 at lib/refcount.c:28 
> refcount_warn_saturate+0xec/0x140
> [   13.304439] Modules linked in:
> [   13.304596] CPU: 5 PID: 548 Comm: Xorg Tainted: GW 
> 5.16.0-rc1-00414-g21a254904a26 #29
> [   13.305286] Hardware name: Pine64 PinePhonePro (DT)
> [   13.305600] pstate: 4005 (nZcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [   13.306095] pc : refcount_warn_saturate+0xec/0x140
> [   13.306402] lr : refcount_warn_saturate+0xec/0x140
> [   13.306710] sp : 8000129a3b70
> [   13.306887] x29: 8000129a3b70 x28: 8000129a3d50 x27: 
> 17ec4b00
> [   13.307401] x26: 0001 x25: 0001 x24: 
> 
> [   13.307914] x23:  x22: 129807c0 x21: 
> 12980580
> [   13.308428] x20: 17c54d00 x19:  x18: 
> 0003
> [   13.308942] x17:  x16:  x15: 
> 8000129a3b58
> [   13.309454] x14:  x13: 2e656572662d7265 x12: 
> 7466612d65737520
> [   13.309967] x11: 3b776f6c66726564 x10: 800011d6e8a0 x9 : 
> 80001017893c
> [   13.310480] x8 : efff x7 : 800011dc68a0 x6 : 
> 0001
> [   13.310993] x5 : f77a9788 x4 :  x3 : 
> 0027
> [   13.311506] x2 : 0023 x1 : f77a9790 x0 : 
> 0026
> [   13.312020] Call trace:
> [   13.312123]  refcount_warn_saturate+0xec/0x140
> [   13.312401]  dma_resv_add_excl_fence+0x1a8/0x1bc
> [   13.312700]  panfrost_job_push+0x174/0x1d4
> [   13.312949]  panfrost_ioctl_submit+0x34c/0x440
> [   13.313229]  drm_ioctl_kernel+0x9c/0x154
> [   13.313464]  drm_ioctl+0x1f0/0x410
> [   13.313651]  __arm64_sys_ioctl+0xb4/0xdc
> [   13.313884]  invoke_syscall+0x4c/0x110
> [   13.314103]  el0_svc_common.constprop.0+0x48/0xf0
> [   13.314405]  do_el0_svc+0x2c/0x90
> [   13.314586]  el0_svc+0x14/0x50
> [   13.314745]  el0t_64_sync_handler+0x9c/0x120
> [   13.315007]  el0t_64_sync+0x158/0x15c
> [   13.315217] ---[ end trace 8c211e57f89714c9 ]---
>
> In dmesg. So this looks like some independent issue.
>


I'm

Re: [PATCH v2 2/8] drm: improve drm_buddy_alloc function

2021-11-15 Thread Arunpravin

Hi Matthew,
I am preparing version3 of the buddy allocator,
Please find the updated comments.

SNIP

>>> -int drm_buddy_alloc_range(struct drm_buddy_mm *mm,
>>> - struct list_head *blocks,
>>> - u64 start, u64 size)
>>> +static struct drm_buddy_block *
>>> +alloc_range(struct drm_buddy_mm *mm,
>>> +   u64 start, u64 end,
>>> +   unsigned int order)
>>>   {
>>> struct drm_buddy_block *block;
>>> struct drm_buddy_block *buddy;
>>> -   LIST_HEAD(allocated);
>>> LIST_HEAD(dfs);
>>> -   u64 end;
>>> int err;
>>> int i;
>>>   
>>
>>> if (!block)
>>> break;
>>>   
>>> list_del(>tmp_link);
>>>   
>>> +   if (drm_buddy_block_order(block) < order)
>>> +   continue;
>>> +
>>> block_start = drm_buddy_block_offset(block);
>>> block_end = block_start + drm_buddy_block_size(mm, block) - 1;
>>>   
>>> if (!overlaps(start, end, block_start, block_end))
>>> continue;
>>>   
>>> -   if (drm_buddy_block_is_allocated(block)) {
>>> -   err = -ENOSPC;
>>> -   goto err_free;
>>> -   }
>>> +   if (drm_buddy_block_is_allocated(block))
>>> +   continue;
>>>   
>>> -   if (contains(start, end, block_start, block_end)) {
>>> -   if (!drm_buddy_block_is_free(block)) {
>>> -   err = -ENOSPC;
>>> -   goto err_free;
>>> -   }
>>> +   if (contains(start, end, block_start, block_end)
>>> +   && order == drm_buddy_block_order(block)) {
>>
>> Alignment looks off, also && should be on the line above.
> 
> [Arun] ok
>>
>>> +   /*
>>> +* Find the free block within the range.
>>> +*/
>>> +   if (drm_buddy_block_is_free(block))
>>> +   return block;
>>
>> Would it make sense to keep searching here, rather than restarting the 
>> search from scratch every time? Would it work if we pass in the total 
>> size and min order?
> [Arun] yes, I will rewrite this function

I tried to rewrite the function, AFAIK, in case of end bias allocation,
we have to restart the search on every new order computed value from the
requested total size since we have to find a free node in the required
order level traversing from left to right, here continuing the search
for the subsequent order value would skip the free nodes present in the
beginning of the tree.

In case of actual range allocation, as handled at
i915_buddy_alloc_range, we can continue the search from where the
previous allocation happened since we allocate all the blocks
progressively within the start and end address values.

alloc_range() handles both the cases, having a penalty of restarting the
search in case of actual range allocation. Please let me know if any
suggestions?

>>> +int drm_buddy_alloc(struct drm_buddy_mm *mm,
>>> +   u64 start, u64 end, u64 size,
>>> +   u64 min_page_size,
>>> +   struct list_head *blocks,
>>> +   unsigned long flags)
>>
>> Do we need to validate the flags somewhere?
> [Arun] I will move 'unsigned long flags' to enum type declaration
I tried to move 'unsigned long flags' to enum type declaration, it
creates an ambiguity in i915 driver as both DRM_BUDDY_ALLOC_TOPDOWN and
DRM_BUDDY_ALLOC_RANGE are mutually non-exclusive. So I think its better
to have 'unsigned long flags'.

AFAIK, we don't need to validate the flags since we check flags using
'flags & DRM_BUDDY_RANGE_ALLOCATION'

>>
>>> +   BUG_ON(order > mm->max_order);
>>> +   BUG_ON(order < min_order);
>>> +
>>> +   do {
>>> +   if (flags & DRM_BUDDY_RANGE_ALLOCATION)
>>> +   /* Allocate traversing within the range */
>>> +   block = alloc_range(mm, start, end, order);
>>
>> Ok, so blocks might be in a random order, which is a slight concern for 
>> actual range allocations(not the bias thing). Can we somehow make 
>> alloc_range just do the old behaviour when end - start == size? Not the 
>> end of the world though if not.
> [Arun] I will change the alloc_range() block allocations to bottom-up,
> so both actual range allocation and end bias allocation blocks will
> start from lowest address. And, since we are traversing the tree from
> left to right, blocks will be in order.
> 
> And, alloc_range() handles actual range allocation demands the same way
> as in the old i915_buddy_alloc_range() function except alloc_range()
> make use of order value to find the blocks within the actual range
> allocation.

Correction - I will change alloc_range() block allocations to bottom-up,
so actual range allocation blocks will start from lowest address (not
the bias thing), and since we are traversing the tree from left to

[PATCH v3 2/9] backlight: qcom-wled: Pass number of elements to read to read_u32_array

of_property_read_u32_array takes the number of elements to read as last
argument. This does not always need to be 4 (sizeof(u32)) but should
instead be the size of the array in DT as read just above with
of_property_count_elems_of_size.

To not make such an error go unnoticed again the driver now bails
accordingly when of_property_read_u32_array returns an error.
Surprisingly the indentation of newlined arguments is lining up again
after prepending `rc = `.

Fixes: 775d2ffb4af6 ("backlight: qcom-wled: Restructure the driver for WLED3")
Signed-off-by: Marijn Suijten 
Reviewed-by: AngeloGioacchino Del Regno 

Reviewed-by: Daniel Thompson 
---
 drivers/video/backlight/qcom-wled.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/video/backlight/qcom-wled.c 
b/drivers/video/backlight/qcom-wled.c
index 8a42ed89c59c..d413b913fef3 100644
--- a/drivers/video/backlight/qcom-wled.c
+++ b/drivers/video/backlight/qcom-wled.c
@@ -1535,10 +1535,15 @@ static int wled_configure(struct wled *wled)
return -EINVAL;
}
 
-   of_property_read_u32_array(dev->of_node,
+   rc = of_property_read_u32_array(dev->of_node,
"qcom,enabled-strings",
wled->cfg.enabled_strings,
-   sizeof(u32));
+   string_len);
+   if (rc) {
+   dev_err(dev, "Failed to read %d elements from 
qcom,enabled-strings: %d\n",
+   string_len, rc);
+   return rc;
+   }
 
for (i = 0; i < string_len; ++i) {
if (wled->cfg.enabled_strings[i] >= 
wled->max_string_count) {
-- 
2.33.1

[PATCH v3 8/9] backlight: qcom-wled: Remove unnecessary double whitespace

Remove redundant spaces inside for loop conditions.  No other double
spaces were found that are not part of indentation with `[^\s]  `.

Signed-off-by: Marijn Suijten 
Reviewed-by: AngeloGioacchino Del Regno 

Reviewed-by: Daniel Thompson 
---
 drivers/video/backlight/qcom-wled.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/video/backlight/qcom-wled.c 
b/drivers/video/backlight/qcom-wled.c
index f975c1f6398b..e2a78f4a9668 100644
--- a/drivers/video/backlight/qcom-wled.c
+++ b/drivers/video/backlight/qcom-wled.c
@@ -235,7 +235,7 @@ static int wled3_set_brightness(struct wled *wled, u16 
brightness)
 
v = cpu_to_le16(brightness & WLED3_SINK_REG_BRIGHT_MAX);
 
-   for (i = 0;  i < wled->cfg.num_strings; ++i) {
+   for (i = 0; i < wled->cfg.num_strings; ++i) {
rc = regmap_bulk_write(wled->regmap, wled->ctrl_addr +
   WLED3_SINK_REG_BRIGHT(i),
   , sizeof(v));
@@ -258,7 +258,7 @@ static int wled4_set_brightness(struct wled *wled, u16 
brightness)
 
v = cpu_to_le16(brightness & WLED3_SINK_REG_BRIGHT_MAX);
 
-   for (i = 0;  i < wled->cfg.num_strings; ++i) {
+   for (i = 0; i < wled->cfg.num_strings; ++i) {
rc = regmap_bulk_write(wled->regmap, wled->sink_addr +
   WLED4_SINK_REG_BRIGHT(i),
   , sizeof(v));
-- 
2.33.1

[PATCH v3 9/9] backlight: qcom-wled: Respect enabled-strings in set_brightness

The hardware is capable of controlling any non-contiguous sequence of
LEDs specified in the DT using qcom,enabled-strings as u32
array, and this also follows from the DT-bindings documentation.  The
numbers specified in this array represent indices of the LED strings
that are to be enabled and disabled.

Its value is appropriately used to setup and enable string modules, but
completely disregarded in the set_brightness paths which only iterate
over the number of strings linearly.
Take an example where only string 2 is enabled with
qcom,enabled_strings=<2>: this string is appropriately enabled but
subsequent brightness changes would have only touched the zero'th
brightness register because num_strings is 1 here.  This is simply
addressed by looking up the string for this index in the enabled_strings
array just like the other codepaths that iterate over num_strings.

Likewise enabled_strings is now also used in the autodetection path for
consistent behaviour: when a list of strings is specified in DT only
those strings will be probed for autodetection, analogous to how the
number of strings that need to be probed is already bound by
qcom,num-strings.  After all autodetection uses the set_brightness
helpers to set an initial value, which could otherwise end up changing
brightness on a different set of strings.

Fixes: 775d2ffb4af6 ("backlight: qcom-wled: Restructure the driver for WLED3")
Fixes: 03b2b5e86986 ("backlight: qcom-wled: Add support for WLED4 peripheral")
Signed-off-by: Marijn Suijten 
Reviewed-by: AngeloGioacchino Del Regno 

Reviewed-by: Daniel Thompson 
---
 drivers/video/backlight/qcom-wled.c | 22 --
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/drivers/video/backlight/qcom-wled.c 
b/drivers/video/backlight/qcom-wled.c
index e2a78f4a9668..306bcc6ccb92 100644
--- a/drivers/video/backlight/qcom-wled.c
+++ b/drivers/video/backlight/qcom-wled.c
@@ -237,7 +237,7 @@ static int wled3_set_brightness(struct wled *wled, u16 
brightness)
 
for (i = 0; i < wled->cfg.num_strings; ++i) {
rc = regmap_bulk_write(wled->regmap, wled->ctrl_addr +
-  WLED3_SINK_REG_BRIGHT(i),
+  
WLED3_SINK_REG_BRIGHT(wled->cfg.enabled_strings[i]),
   , sizeof(v));
if (rc < 0)
return rc;
@@ -260,7 +260,7 @@ static int wled4_set_brightness(struct wled *wled, u16 
brightness)
 
for (i = 0; i < wled->cfg.num_strings; ++i) {
rc = regmap_bulk_write(wled->regmap, wled->sink_addr +
-  WLED4_SINK_REG_BRIGHT(i),
+  
WLED4_SINK_REG_BRIGHT(wled->cfg.enabled_strings[i]),
   , sizeof(v));
if (rc < 0)
return rc;
@@ -571,7 +571,7 @@ static irqreturn_t wled_short_irq_handler(int irq, void 
*_wled)
 
 static void wled_auto_string_detection(struct wled *wled)
 {
-   int rc = 0, i, delay_time_us;
+   int rc = 0, i, j, delay_time_us;
u32 sink_config = 0;
u8 sink_test = 0, sink_valid = 0, val;
bool fault_set;
@@ -618,14 +618,15 @@ static void wled_auto_string_detection(struct wled *wled)
 
/* Iterate through the strings one by one */
for (i = 0; i < wled->cfg.num_strings; i++) {
-   sink_test = BIT((WLED4_SINK_REG_CURR_SINK_SHFT + i));
+   j = wled->cfg.enabled_strings[i];
+   sink_test = BIT((WLED4_SINK_REG_CURR_SINK_SHFT + j));
 
/* Enable feedback control */
rc = regmap_write(wled->regmap, wled->ctrl_addr +
- WLED3_CTRL_REG_FEEDBACK_CONTROL, i + 1);
+ WLED3_CTRL_REG_FEEDBACK_CONTROL, j + 1);
if (rc < 0) {
dev_err(wled->dev, "Failed to enable feedback for SINK 
%d rc = %d\n",
-   i + 1, rc);
+   j + 1, rc);
goto failed_detect;
}
 
@@ -634,7 +635,7 @@ static void wled_auto_string_detection(struct wled *wled)
  WLED4_SINK_REG_CURR_SINK, sink_test);
if (rc < 0) {
dev_err(wled->dev, "Failed to configure SINK %d 
rc=%d\n",
-   i + 1, rc);
+   j + 1, rc);
goto failed_detect;
}
 
@@ -661,7 +662,7 @@ static void wled_auto_string_detection(struct wled *wled)
 
if (fault_set)
dev_dbg(wled->dev, "WLED OVP fault detected with SINK 
%d\n",
-   i + 1);
+   j + 1);
else
sink_valid |= sink_test;
 
@@ -701,15 +702,16 @@ static void wled_auto_string_detection(struct wled *wled)

[PATCH v3 5/9] backlight: qcom-wled: Override default length with qcom, enabled-strings

The length of qcom,enabled-strings as property array is enough to
determine the number of strings to be enabled, without needing to set
qcom,num-strings to override the default number of strings when less
than the default (which is also the maximum) is provided in DT.

This also introduces an extra warning when qcom,num-strings is set,
denoting that it is not necessary to set both anymore.  It is usually
more concise to set just qcom,num-length when a zero-based, contiguous
range of strings is needed (the majority of the cases), or to only set
qcom,enabled-strings when a specific set of indices is desired.

Fixes: 775d2ffb4af6 ("backlight: qcom-wled: Restructure the driver for WLED3")
Signed-off-by: Marijn Suijten 
Reviewed-by: AngeloGioacchino Del Regno 

---
 drivers/video/backlight/qcom-wled.c | 12 +---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/video/backlight/qcom-wled.c 
b/drivers/video/backlight/qcom-wled.c
index ab10910971e9..5306b06044b4 100644
--- a/drivers/video/backlight/qcom-wled.c
+++ b/drivers/video/backlight/qcom-wled.c
@@ -1520,6 +1520,8 @@ static int wled_configure(struct wled *wled)
return -EINVAL;
}
}
+
+   cfg->num_strings = string_len;
}
 
rc = of_property_read_u32(dev->of_node, "qcom,num-strings", );
@@ -1530,9 +1532,13 @@ static int wled_configure(struct wled *wled)
return -EINVAL;
}
 
-   if (string_len > 0 && val > string_len) {
-   dev_err(dev, "qcom,num-strings exceeds 
qcom,enabled-strings\n");
-   return -EINVAL;
+   if (string_len > 0) {
+   dev_warn(dev, "Only one of qcom,num-strings or 
qcom,enabled-strings"
+ " should be set\n");
+   if (val > string_len) {
+   dev_err(dev, "qcom,num-strings exceeds 
qcom,enabled-strings\n");
+   return -EINVAL;
+   }
}
 
cfg->num_strings = val;
-- 
2.33.1

[PATCH v3 7/9] backlight: qcom-wled: Provide enabled_strings default for WLED 4 and 5

Only WLED 3 sets a sensible default that allows operating this driver
with just qcom,num-strings in the DT; WLED 4 and 5 require
qcom,enabled-strings to be provided otherwise enabled_strings remains
zero-initialized, resulting in every string-specific register write
(currently only the setup and config functions, brightness follows in a
future patch) to only configure the zero'th string multiple times.

Signed-off-by: Marijn Suijten 
Reviewed-by: AngeloGioacchino Del Regno 

Reviewed-by: Daniel Thompson 
---
 drivers/video/backlight/qcom-wled.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/video/backlight/qcom-wled.c 
b/drivers/video/backlight/qcom-wled.c
index 5c5df5a3deab..f975c1f6398b 100644
--- a/drivers/video/backlight/qcom-wled.c
+++ b/drivers/video/backlight/qcom-wled.c
@@ -1079,6 +1079,7 @@ static const struct wled_config wled4_config_defaults = {
.cabc = false,
.external_pfet = false,
.auto_detection_enabled = false,
+   .enabled_strings = {0, 1, 2, 3},
 };
 
 static int wled5_setup(struct wled *wled)
@@ -1192,6 +1193,7 @@ static const struct wled_config wled5_config_defaults = {
.cabc = false,
.external_pfet = false,
.auto_detection_enabled = false,
+   .enabled_strings = {0, 1, 2, 3},
 };
 
 static const u32 wled3_boost_i_limit_values[] = {
-- 
2.33.1

[PATCH v3 6/9] backlight: qcom-wled: Remove unnecessary 4th default string in WLED3

The previous commit improves num_strings parsing to not go over the
maximum of 3 strings for WLED3 anymore.  Likewise this default index for
a hypothetical 4th string is invalid and could access registers that are
not mapped to the desired purpose.
Removing this value gets rid of undesired confusion and avoids the
possibility of accessing registers at this offset even if the 4th array
element is used by accident.

Signed-off-by: Marijn Suijten 
Reviewed-by: AngeloGioacchino Del Regno 

Reviewed-by: Daniel Thompson 
---
 drivers/video/backlight/qcom-wled.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/video/backlight/qcom-wled.c 
b/drivers/video/backlight/qcom-wled.c
index 5306b06044b4..5c5df5a3deab 100644
--- a/drivers/video/backlight/qcom-wled.c
+++ b/drivers/video/backlight/qcom-wled.c
@@ -948,7 +948,7 @@ static const struct wled_config wled3_config_defaults = {
.cs_out_en = false,
.ext_gen = false,
.cabc = false,
-   .enabled_strings = {0, 1, 2, 3},
+   .enabled_strings = {0, 1, 2},
 };
 
 static int wled4_setup(struct wled *wled)
-- 
2.33.1

[PATCH v3 3/9] backlight: qcom-wled: Use cpu_to_le16 macro to perform conversion

The kernel already provides appropriate primitives to perform endianness
conversion which should be used in favour of manual bit-wrangling.

Signed-off-by: Marijn Suijten 
Reviewed-by: AngeloGioacchino Del Regno 

---
 drivers/video/backlight/qcom-wled.c | 23 +++
 1 file changed, 11 insertions(+), 12 deletions(-)

diff --git a/drivers/video/backlight/qcom-wled.c 
b/drivers/video/backlight/qcom-wled.c
index d413b913fef3..9d883e702134 100644
--- a/drivers/video/backlight/qcom-wled.c
+++ b/drivers/video/backlight/qcom-wled.c
@@ -231,14 +231,14 @@ struct wled {
 static int wled3_set_brightness(struct wled *wled, u16 brightness)
 {
int rc, i;
-   u8 v[2];
+   __le16 v;
 
-   v[0] = brightness & 0xff;
-   v[1] = (brightness >> 8) & 0xf;
+   v = cpu_to_le16(brightness & WLED3_SINK_REG_BRIGHT_MAX);
 
for (i = 0;  i < wled->cfg.num_strings; ++i) {
rc = regmap_bulk_write(wled->regmap, wled->ctrl_addr +
-  WLED3_SINK_REG_BRIGHT(i), v, 2);
+  WLED3_SINK_REG_BRIGHT(i),
+  , sizeof(v));
if (rc < 0)
return rc;
}
@@ -250,18 +250,18 @@ static int wled4_set_brightness(struct wled *wled, u16 
brightness)
 {
int rc, i;
u16 low_limit = wled->max_brightness * 4 / 1000;
-   u8 v[2];
+   __le16 v;
 
/* WLED4's lower limit of operation is 0.4% */
if (brightness > 0 && brightness < low_limit)
brightness = low_limit;
 
-   v[0] = brightness & 0xff;
-   v[1] = (brightness >> 8) & 0xf;
+   v = cpu_to_le16(brightness & WLED3_SINK_REG_BRIGHT_MAX);
 
for (i = 0;  i < wled->cfg.num_strings; ++i) {
rc = regmap_bulk_write(wled->regmap, wled->sink_addr +
-  WLED4_SINK_REG_BRIGHT(i), v, 2);
+  WLED4_SINK_REG_BRIGHT(i),
+  , sizeof(v));
if (rc < 0)
return rc;
}
@@ -273,21 +273,20 @@ static int wled5_set_brightness(struct wled *wled, u16 
brightness)
 {
int rc, offset;
u16 low_limit = wled->max_brightness * 1 / 1000;
-   u8 v[2];
+   __le16 v;
 
/* WLED5's lower limit is 0.1% */
if (brightness < low_limit)
brightness = low_limit;
 
-   v[0] = brightness & 0xff;
-   v[1] = (brightness >> 8) & 0x7f;
+   v = cpu_to_le16(brightness & WLED5_SINK_REG_BRIGHT_MAX_15B);
 
offset = (wled->cfg.mod_sel == MOD_A) ?
  WLED5_SINK_REG_MOD_A_BRIGHTNESS_LSB :
  WLED5_SINK_REG_MOD_B_BRIGHTNESS_LSB;
 
rc = regmap_bulk_write(wled->regmap, wled->sink_addr + offset,
-  v, 2);
+  , sizeof(v));
return rc;
 }
 
-- 
2.33.1

[PATCH v3 4/9] backlight: qcom-wled: Fix off-by-one maximum with default num_strings

When not specifying num-strings in the DT the default is used, but +1 is
added to it which turns WLED3 into 4 and WLED4/5 into 5 strings instead
of 3 and 4 respectively, causing out-of-bounds reads and register
read/writes.  This +1 exists for a deficiency in the DT parsing code,
and is simply omitted entirely - solving this oob issue - by parsing the
property separately much like qcom,enabled-strings.

This also enables more stringent checks on the maximum value when
qcom,enabled-strings is provided in the DT, by parsing num-strings after
enabled-strings to allow it to check against (and in a subsequent patch
override) the length of enabled-strings: it is invalid to set
num-strings higher than that.
The DT currently utilizes it to get around an incorrect fixed read of
four elements from that array (has been addressed in a prior patch) by
setting a lower num-strings where desired.

Fixes: 93c64f1ea1e8 ("leds: add Qualcomm PM8941 WLED driver")
Signed-off-by: Marijn Suijten 
Reviewed-By: AngeloGioacchino Del Regno 

---
 drivers/video/backlight/qcom-wled.c | 48 ++---
 1 file changed, 16 insertions(+), 32 deletions(-)

diff --git a/drivers/video/backlight/qcom-wled.c 
b/drivers/video/backlight/qcom-wled.c
index 9d883e702134..ab10910971e9 100644
--- a/drivers/video/backlight/qcom-wled.c
+++ b/drivers/video/backlight/qcom-wled.c
@@ -1255,21 +1255,6 @@ static const struct wled_var_cfg wled5_ovp_cfg = {
.size = 16,
 };
 
-static u32 wled3_num_strings_values_fn(u32 idx)
-{
-   return idx + 1;
-}
-
-static const struct wled_var_cfg wled3_num_strings_cfg = {
-   .fn = wled3_num_strings_values_fn,
-   .size = 3,
-};
-
-static const struct wled_var_cfg wled4_num_strings_cfg = {
-   .fn = wled3_num_strings_values_fn,
-   .size = 4,
-};
-
 static u32 wled3_switch_freq_values_fn(u32 idx)
 {
return 19200 / (2 * (1 + idx));
@@ -1343,11 +1328,6 @@ static int wled_configure(struct wled *wled)
.val_ptr = >switch_freq,
.cfg = _switch_freq_cfg,
},
-   {
-   .name = "qcom,num-strings",
-   .val_ptr = >num_strings,
-   .cfg = _num_strings_cfg,
-   },
};
 
const struct wled_u32_opts wled4_opts[] = {
@@ -1371,11 +1351,6 @@ static int wled_configure(struct wled *wled)
.val_ptr = >switch_freq,
.cfg = _switch_freq_cfg,
},
-   {
-   .name = "qcom,num-strings",
-   .val_ptr = >num_strings,
-   .cfg = _num_strings_cfg,
-   },
};
 
const struct wled_u32_opts wled5_opts[] = {
@@ -1399,11 +1374,6 @@ static int wled_configure(struct wled *wled)
.val_ptr = >switch_freq,
.cfg = _switch_freq_cfg,
},
-   {
-   .name = "qcom,num-strings",
-   .val_ptr = >num_strings,
-   .cfg = _num_strings_cfg,
-   },
{
.name = "qcom,modulator-sel",
.val_ptr = >mod_sel,
@@ -1522,8 +1492,6 @@ static int wled_configure(struct wled *wled)
*bool_opts[i].val_ptr = true;
}
 
-   cfg->num_strings = cfg->num_strings + 1;
-
string_len = of_property_count_elems_of_size(dev->of_node,
 "qcom,enabled-strings",
 sizeof(u32));
@@ -1554,6 +1522,22 @@ static int wled_configure(struct wled *wled)
}
}
 
+   rc = of_property_read_u32(dev->of_node, "qcom,num-strings", );
+   if (!rc) {
+   if (val < 1 || val > wled->max_string_count) {
+   dev_err(dev, "qcom,num-strings must be between 1 and 
%d\n",
+   wled->max_string_count);
+   return -EINVAL;
+   }
+
+   if (string_len > 0 && val > string_len) {
+   dev_err(dev, "qcom,num-strings exceeds 
qcom,enabled-strings\n");
+   return -EINVAL;
+   }
+
+   cfg->num_strings = val;
+   }
+
return 0;
 }
 
-- 
2.33.1

[PATCH v3 1/9] backlight: qcom-wled: Validate enabled string indices in DT

The strings passed in DT may possibly cause out-of-bounds register
accesses and should be validated before use.

Fixes: 775d2ffb4af6 ("backlight: qcom-wled: Restructure the driver for WLED3")
Signed-off-by: Marijn Suijten 
Reviewed-by: AngeloGioacchino Del Regno 

Reviewed-by: Daniel Thompson 
---
 drivers/video/backlight/qcom-wled.c | 18 +-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/drivers/video/backlight/qcom-wled.c 
b/drivers/video/backlight/qcom-wled.c
index d094299c2a48..8a42ed89c59c 100644
--- a/drivers/video/backlight/qcom-wled.c
+++ b/drivers/video/backlight/qcom-wled.c
@@ -1528,12 +1528,28 @@ static int wled_configure(struct wled *wled)
string_len = of_property_count_elems_of_size(dev->of_node,
 "qcom,enabled-strings",
 sizeof(u32));
-   if (string_len > 0)
+   if (string_len > 0) {
+   if (string_len > wled->max_string_count) {
+   dev_err(dev, "Cannot have more than %d strings\n",
+   wled->max_string_count);
+   return -EINVAL;
+   }
+
of_property_read_u32_array(dev->of_node,
"qcom,enabled-strings",
wled->cfg.enabled_strings,
sizeof(u32));
 
+   for (i = 0; i < string_len; ++i) {
+   if (wled->cfg.enabled_strings[i] >= 
wled->max_string_count) {
+   dev_err(dev,
+   "qcom,enabled-strings index %d at %d is 
out of bounds\n",
+   wled->cfg.enabled_strings[i], i);
+   return -EINVAL;
+   }
+   }
+   }
+
return 0;
 }
 
-- 
2.33.1

[PATCH v3 0/9] backlight: qcom-wled: fix and solidify handling of enabled-strings

This patchset fixes WLED's handling of enabled-strings: besides some
cleanup it is now actually possible to specify a non-contiguous array of
enabled strings (not necessarily starting at zero) and the values from
DT are now validated to prevent possible unexpected out-of-bounds
register and array element accesses.
Off-by-one mistakes in the maximum number of strings, also causing
out-of-bounds access, have been addressed as well.

Changes in v3:
- Use __le16 type for cpu_to_le16 result;
- Reword ambiguity warning between qcom,num-strings and
  qcom,enabled-strings to explain that only one should/needs to be set;
- Move this warning from patch 4 patch 5, where the length of
  qcom,enabled-strings starts to be taken into account;
- Drop DT patches that have been picked up in the qcom tree.

v2: 
https://lore.kernel.org/lkml/2022002706.453289-1-marijn.suij...@somainline.org/T

Changes in v2:
- Reordered patch 4/10 (Validate enabled string indices in DT) to sit
  before patch 1/10 (Pass number of elements to read to read_u32_array);
- Pulled qcom,num-strings out of the DT enumeration parser, and moved it
  after qcom,enabled-strings parser to always have final sign-off over
  the number of strings;
- Extra validation for this number of strings against
  qcom,enabled-strings;
- Recombined patch 9 (Consistently use enabled-strings in
  set_brightness) and patch 10 (Consider enabled_strings in
  autodetection), which both solve the same problem in two different
  functions.  In addition the autodetection code uses set_brightness as
  helper already;
- Improved DT configurations for pmi8994 and pm660l, currently in 5.15
  rc's.

v1: 
https://lore.kernel.org/dri-devel/20211004192741.621870-1-marijn.suij...@somainline.org/T

Marijn Suijten (9):
  backlight: qcom-wled: Validate enabled string indices in DT
  backlight: qcom-wled: Pass number of elements to read to
read_u32_array
  backlight: qcom-wled: Use cpu_to_le16 macro to perform conversion
  backlight: qcom-wled: Fix off-by-one maximum with default num_strings
  backlight: qcom-wled: Override default length with
qcom,enabled-strings
  backlight: qcom-wled: Remove unnecessary 4th default string in WLED3
  backlight: qcom-wled: Provide enabled_strings default for WLED 4 and 5
  backlight: qcom-wled: Remove unnecessary double whitespace
  backlight: qcom-wled: Respect enabled-strings in set_brightness

 drivers/video/backlight/qcom-wled.c | 130 +++-
 1 file changed, 72 insertions(+), 58 deletions(-)


base-commit: fa55b7dcdc43c1aa1ba12bca9d2dd4318c2a0dbf
--
2.33.1

Re: [PATCH 12/26] drm/scheduler: use new iterator in drm_sched_job_add_implicit_dependencies v2


Am 15.11.21 um 15:08 schrieb Daniel Vetter:

On Mon, Nov 15, 2021 at 03:03:53PM +0100, Sascha Hauer wrote:

Hi,

On Fri, Sep 17, 2021 at 02:34:59PM +0200, Christian König wrote:

Simplifying the code a bit.

v2: use dma_resv_for_each_fence

Signed-off-by: Christian König 
---
  drivers/gpu/drm/scheduler/sched_main.c | 26 ++
  1 file changed, 6 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
b/drivers/gpu/drm/scheduler/sched_main.c
index 042c16b5d54a..5bc5f775abe1 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -699,30 +699,16 @@ int drm_sched_job_add_implicit_dependencies(struct 
drm_sched_job *job,
struct drm_gem_object *obj,
bool write)
  {
+   struct dma_resv_iter cursor;
+   struct dma_fence *fence;
int ret;
-   struct dma_fence **fences;
-   unsigned int i, fence_count;
-
-   if (!write) {
-   struct dma_fence *fence = dma_resv_get_excl_unlocked(obj->resv);
-
-   return drm_sched_job_add_dependency(job, fence);
-   }
-
-   ret = dma_resv_get_fences(obj->resv, NULL, _count, );
-   if (ret || !fence_count)
-   return ret;
  
-	for (i = 0; i < fence_count; i++) {

-   ret = drm_sched_job_add_dependency(job, fences[i]);
+   dma_resv_for_each_fence(, obj->resv, write, fence) {
+   ret = drm_sched_job_add_dependency(job, fence);
if (ret)
-   break;
+   return ret;
}
-
-   for (; i < fence_count; i++)
-   dma_fence_put(fences[i]);
-   kfree(fences);
-   return ret;
+   return 0;
  }
  EXPORT_SYMBOL(drm_sched_job_add_implicit_dependencies);
  

This patch lets the panfrost driver explode on v5.16-rc1 with the
following. I didn't bisect it, but it goes away when I revert this
patch. I only started weston, nothing more.

Any idea what goes wrong here?

Should be fixed in 13e9e30cafea1, but Christian pushed it to the wrong
patch so it missed -rc1.

Christian, this needs to go into drm-misc-fixes, pls cherry-pick it over.


The problem is it doesn't apply to drm-misc-fixes. Looks like the branch 
wasn't updated.


What's going on here?

Christian.


-Daniel


Sascha

[   12.512606] Fence drm_sched:pan_js:a:1 released with pending signals!
[   12.513225] WARNING: CPU: 3 PID: 257 at drivers/dma-buf/dma-fence.c:526 
dma_fence_release+0xac/0xe8
[   12.514056] Modules linked in:
[   12.514334] CPU: 3 PID: 257 Comm: weston Not tainted 
5.16.0-rc1-00043-g794870164a37 #443
[   12.514621] [ cut here ]
[   12.515040] Hardware name: Rockchip RK3568 EVB1 DDR4 V10 Board (DT)
[   12.515044] pstate: 4049 (nZcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[   12.515049] pc : dma_fence_release+0xac/0xe8
[   12.515056] lr : dma_fence_release+0xac/0xe8
[   12.515061] sp : 8000123ebb20
[   12.515064] x29: 8000123ebb20 x28: 8000123ebd58
[   12.515518] refcount_t: addition on 0; use-after-free.
[   12.516015]  x27: 
[   12.516668] WARNING: CPU: 0 PID: 145 at lib/refcount.c:25 
refcount_warn_saturate+0x98/0x140
[   12.516992] x26: 0001
[   12.517366] Modules linked in:
[   12.517654]  x25: 04b051c0
[   12.518108]
[   12.518555]  x24: 
[   12.518854] CPU: 0 PID: 145 Comm: irq/25-panfrost Not tainted 
5.16.0-rc1-00043-g794870164a37 #443
[   12.519576]
[   12.519866] Hardware name: Rockchip RK3568 EVB1 DDR4 V10 Board (DT)
[   12.520133] x23: 
[   12.520430] pstate: 4049 (nZcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[   12.520559]  x22: 800010d41b78
[   12.520856] pc : refcount_warn_saturate+0x98/0x140
[   12.521625]  x21: 04b05050
[   12.521755] lr : refcount_warn_saturate+0x98/0x140
[   12.522299]
[   12.522588] sp : 8000122b3bc0
[   12.523192] x20: 04b05040
[   12.523489] x29: 8000122b3bc0
[   12.523906]  x19: 04b05078
[   12.524203]  x28: 
[   12.524620]  x18: 0010
[   12.524751]  x27: 03791880
[   12.525040]
[   12.525329]
[   12.525618] x17: 
[   12.525915] x26: 8000122b3d30
[   12.526212]  x16: 
[   12.526509]  x25: 0001
[   12.526806]  x15: 050e2dc0
[   12.526937]  x24: 03791a10
[   12.527067]
[   12.527357]
[   12.527646] x14: 01b5
[   12.527942] x23: 
[   12.528240]  x13: 050e2dc0
[   12.528536]  x22: 03505280
[   12.528833]  x12: ffea
[   12.528964]  x21: 03a2a220
[   12.529095]
[   12.529384]
[   12.529673] x11: 800011761ec8
[   12.529970] x20: 04b05078
[   12.530267]  x10: 8000115e1e88
[   12.530564]  x19: 04b05000
[   12.530861]  x9 : 8000115e1ee0
[   12.530992]  x18: 0010
[   12.531123]
[

Re: Panic with linus/master and panfrost





Am 15.11.21 um 16:05 schrieb Daniel Vetter:

You need

commit 13e9e30cafea10dff6bc8d63a38a61249e83fd65
Author: Christian König 
Date:   Mon Oct 18 21:27:55 2021 +0200

drm/scheduler: fix drm_sched_job_add_implicit_dependencies

which Christian pushed to drm-misc-next instead of drm-misc-fixes. I
already asked Christian in some other thread to cherry-pick it over.


Sounds like you haven't seen my answer to that request.

I can't cherry pick the patch to drm-misc-fixes because the patch which 
broke things hasn't showed up in that branch yet causing a conflict.


Regards,
Christian.


-Daniel

On Mon, Nov 15, 2021 at 3:56 PM Daniel Stone  wrote:

Hi Ondrej,

On Mon, 15 Nov 2021 at 07:35, Ondřej Jirman  wrote:

I'm getting some fence refcounting related panics with the current
Linus's master branch:

It happens immediately whenever I start Xorg or sway.

Anyone has any ideas where to start looking? It works fine with v5.15.

(sorry for the interleaved log, it's coming from multiple CPUs at once
I guess)

Thanks a lot for the report - are you able to bisect this please?

Cheers,
Daniel



--
Daniel Vetter
Software Engineer, Intel Corporation
https://nam11.safelinks.protection.outlook.com/?url=http%3A%2F%2Fblog.ffwll.ch%2Fdata=04%7C01%7Cchristian.koenig%40amd.com%7Cc541030e445e472b082808d9a84954cc%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637725855208408806%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000sdata=LVQEVyNFPE1hpZjlD%2BApOVsfUBEPYPiRVVp5Gkut%2BcU%3Dreserved=0

Re: [PATCH v4] drm/msm/dp: do not initialize phy until plugin interrupt received

2021-11-15 Thread khsieh


On 2021-11-09 13:38, Kuogee Hsieh wrote:

From: Kuogee Hsieh 

Current DP drivers have regulators, clocks, irq and phy are grouped
together within a function and executed not in a symmetric manner.
This increase difficulty of code maintenance and limited code 
scalability.

This patch divided the driver life cycle of operation into four states,
resume (including booting up), dongle plugin, dongle unplugged and 
suspend.
Regulators, core clocks and irq are grouped together and enabled at 
resume
(or booting up) so that the DP controller is armed and ready to receive 
HPD
plugin interrupts. HPD plugin interrupt is generated when a dongle 
plugs

into DUT (device under test). Once HPD plugin interrupt is received, DP
controller will initialize phy so that dpcd read/write will function 
and

following link training can be proceeded successfully. DP phy will be
disabled after main link is teared down at end of unplugged HPD 
interrupt
handle triggered by dongle unplugged out of DUT. Finally regulators, 
code

clocks and irq are disabled at corresponding suspension.

Changes in V2:
-- removed unnecessary dp_ctrl NULL check
-- removed unnecessary phy init_count and power_count DRM_DEBUG_DP logs
-- remove flip parameter out of dp_ctrl_irq_enable()
-- add fixes tag

Changes in V3:
-- call dp_display_host_phy_init() instead of dp_ctrl_phy_init() at
dp_display_host_init() for eDP

Changes in V4:
-- rewording commit text to match this commit changes

Fixes: e91e3065a806 ("drm/msm/dp: Add DP compliance tests on
Snapdragon Chipsets")

Signed-off-by: Kuogee Hsieh 
---


any more comments?

 drivers/gpu/drm/msm/dp/dp_ctrl.c| 87 
-

 drivers/gpu/drm/msm/dp/dp_ctrl.h|  9 ++--
 drivers/gpu/drm/msm/dp/dp_display.c | 83 
++-

 3 files changed, 105 insertions(+), 74 deletions(-)

diff --git a/drivers/gpu/drm/msm/dp/dp_ctrl.c 
b/drivers/gpu/drm/msm/dp/dp_ctrl.c

index 7ec155d..4788e8c 100644
--- a/drivers/gpu/drm/msm/dp/dp_ctrl.c
+++ b/drivers/gpu/drm/msm/dp/dp_ctrl.c
@@ -1364,60 +1364,54 @@ static int dp_ctrl_enable_stream_clocks(struct
dp_ctrl_private *ctrl)
return ret;
 }

-int dp_ctrl_host_init(struct dp_ctrl *dp_ctrl, bool flip, bool reset)
+void dp_ctrl_irq_enable(struct dp_ctrl *dp_ctrl)
+{
+   struct dp_ctrl_private *ctrl;
+
+   ctrl = container_of(dp_ctrl, struct dp_ctrl_private, dp_ctrl);
+
+   dp_catalog_ctrl_reset(ctrl->catalog);
+
+   dp_catalog_ctrl_enable_irq(ctrl->catalog, true);
+}
+
+void dp_ctrl_irq_disable(struct dp_ctrl *dp_ctrl)
+{
+   struct dp_ctrl_private *ctrl;
+
+   ctrl = container_of(dp_ctrl, struct dp_ctrl_private, dp_ctrl);
+
+   dp_catalog_ctrl_reset(ctrl->catalog);
+
+   dp_catalog_ctrl_enable_irq(ctrl->catalog, false);
+}
+
+void dp_ctrl_phy_init(struct dp_ctrl *dp_ctrl)
 {
struct dp_ctrl_private *ctrl;
struct dp_io *dp_io;
struct phy *phy;

-   if (!dp_ctrl) {
-   DRM_ERROR("Invalid input data\n");
-   return -EINVAL;
-   }
-
ctrl = container_of(dp_ctrl, struct dp_ctrl_private, dp_ctrl);
dp_io = >parser->io;
phy = dp_io->phy;

-   ctrl->dp_ctrl.orientation = flip;
-
-   if (reset)
-   dp_catalog_ctrl_reset(ctrl->catalog);
-
-   DRM_DEBUG_DP("flip=%d\n", flip);
dp_catalog_ctrl_phy_reset(ctrl->catalog);
phy_init(phy);
-   dp_catalog_ctrl_enable_irq(ctrl->catalog, true);
-
-   return 0;
 }

-/**
- * dp_ctrl_host_deinit() - Uninitialize DP controller
- * @dp_ctrl: Display Port Driver data
- *
- * Perform required steps to uninitialize DP controller
- * and its resources.
- */
-void dp_ctrl_host_deinit(struct dp_ctrl *dp_ctrl)
+void dp_ctrl_phy_exit(struct dp_ctrl *dp_ctrl)
 {
struct dp_ctrl_private *ctrl;
struct dp_io *dp_io;
struct phy *phy;

-   if (!dp_ctrl) {
-   DRM_ERROR("Invalid input data\n");
-   return;
-   }
-
ctrl = container_of(dp_ctrl, struct dp_ctrl_private, dp_ctrl);
dp_io = >parser->io;
phy = dp_io->phy;

-   dp_catalog_ctrl_enable_irq(ctrl->catalog, false);
+   dp_catalog_ctrl_phy_reset(ctrl->catalog);
phy_exit(phy);
-
-   DRM_DEBUG_DP("Host deinitialized successfully\n");
 }

 static bool dp_ctrl_use_fixed_nvid(struct dp_ctrl_private *ctrl)
@@ -1895,8 +1889,14 @@ int dp_ctrl_off_link_stream(struct dp_ctrl 
*dp_ctrl)

return ret;
}

+   DRM_DEBUG_DP("Before, phy=%x init_count=%d power_on=%d\n",
+   (u32)(uintptr_t)phy, phy->init_count, phy->power_count);
+
phy_power_off(phy);

+   DRM_DEBUG_DP("After, phy=%x init_count=%d power_on=%d\n",
+   (u32)(uintptr_t)phy, phy->init_count, phy->power_count);
+
/* aux channel down, reinit phy */
phy_exit(phy);
phy_init(phy);
@@ -1905,23 +1905,6 @@ int dp_ctrl_off_link_stream(struct dp_ctrl 
*dp_ctrl)

[PATCH v4.5 12/14] dt-bindings: msm/dp: Add bindings for HDCP registers

2021-11-15 Thread Sean Paul

From: Sean Paul 

This patch adds the bindings for the MSM DisplayPort HDCP registers
which are required to write the HDCP key into the display controller as
well as the registers to enable HDCP authentication/key
exchange/encryption.

We'll use a new compatible string for this since the fields are optional.

Cc: Rob Herring 
Cc: Stephen Boyd 
Signed-off-by: Sean Paul 
Link: 
https://patchwork.freedesktop.org/patch/msgid/20210913175747.47456-13-s...@poorly.run
 #v1
Link: 
https://patchwork.freedesktop.org/patch/msgid/20210915203834.1439-13-s...@poorly.run
 #v2
Link: 
https://patchwork.freedesktop.org/patch/msgid/20211001151145.55916-13-s...@poorly.run
 #v3
Link: 
https://patchwork.freedesktop.org/patch/msgid/20211105030434.2828845-13-s...@poorly.run
 #v4

Changes in v2:
-Drop register range names (Stephen)
-Fix yaml errors (Rob)
Changes in v3:
-Add new compatible string for dp-hdcp
-Add descriptions to reg
-Add minItems/maxItems to reg
-Make reg depend on the new hdcp compatible string
Changes in v4:
-Rebase on Bjorn's multi-dp patchset
Changes in v4.5:
-Remove maxItems from reg (Rob)
-Remove leading zeros in example (Rob)
---
 .../devicetree/bindings/display/msm/dp-controller.yaml | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/Documentation/devicetree/bindings/display/msm/dp-controller.yaml 
b/Documentation/devicetree/bindings/display/msm/dp-controller.yaml
index b36d74c1da7c..aff7d45ba6ed 100644
--- a/Documentation/devicetree/bindings/display/msm/dp-controller.yaml
+++ b/Documentation/devicetree/bindings/display/msm/dp-controller.yaml
@@ -21,12 +21,15 @@ properties:
   - qcom,sc8180x-edp
 
   reg:
+minItems: 5
 items:
   - description: ahb register block
   - description: aux register block
   - description: link register block
   - description: p0 register block
   - description: p1 register block
+  - description: (Optional) Registers for HDCP device key injection
+  - description: (Optional) Registers for HDCP TrustZone interaction
 
   interrupts:
 maxItems: 1
@@ -111,7 +114,9 @@ examples:
   <0xae90200 0x200>,
   <0xae90400 0xc00>,
   <0xae91000 0x400>,
-  <0xae91400 0x400>;
+  <0xae91400 0x400>,
+  <0xaed1000 0x174>,
+  <0xaee1000 0x2c>;
 interrupt-parent = <>;
 interrupts = <12>;
 clocks = < DISP_CC_MDSS_AHB_CLK>,
-- 
Sean Paul, Software Engineer, Google / Chromium OS

Re: [PATCH] drm: change logs to print connectors in the form CONNECTOR:id:name

2021-11-15 Thread Jani Nikula

On Mon, 15 Nov 2021, Claudio Suarez  wrote:
> On Mon, Nov 15, 2021 at 12:24:26PM +0200, Jani Nikula wrote:
>> On Sun, 14 Nov 2021, Claudio Suarez  wrote:
>> > On Sat, Nov 13, 2021 at 09:39:46PM +0100, Sam Ravnborg wrote:
>> >> Hi Claudio,
>> >> 
>> >> On Sat, Nov 13, 2021 at 08:27:30PM +0100, Claudio Suarez wrote:
>> >> > The prefered way to log connectors is [CONNECTOR:id:name]. Change it in
>> >> > drm core programs.
>> >> > 
>> >> > Suggested-by: Ville Syrjälä 
>> >> > Signed-off-by: Claudio Suarez 
>> >> 
>> >> While touching all these logging calls, could you convernt them to the
>> >> newer drm_dbg_kms variants?
>> >> DRM_DEBUG_* are all deprecated.
>> >> 
>> >
>> > Yes, I can, but it is recommended to do it in a different patch:
>> >
>> > https://www.kernel.org/doc/html/latest/process/submitting-patches.html#separate-your-changes
>> >
>> > C:
>> > "Separate your changes
>> > Separate each logical change into a separate patch.
>> > For example, if your changes include both bug fixes and performance 
>> > enhancements..."
>> >
>> >
>> > I will study and send a new separate patch with your suggestion.
>> 
>> I feel these logging changes are the types of changes where I'd err on
>> the side of fewer patches than strictly following the above guidelines.
>
> To size the problem:
> - there are about 3434 references to DRM_DEBUG_* in all the drm code, all 
> drivers.
> - there are 413 references to DRM_DEBUG_* in the drm core code, only 66 of
>them are related to connectors.
> - there are 62 references to DRM_ERR* and 7 references to DRM_INFO in
>the drm core programs.
>
> I meant I can make two patches:
>  1 - this one with the change to CONNECTOR:id:name (29 changes)
>  2 - a new and bigger patch to change 413 + 62 + 7 references to 
> DRM_{DEBUG,ERR,INFO}
>  in the drm core programs.

The second one is an on-going change that will have to happen gradually,
file by file. Changing connector references while at it seems like a
reasonable drive-by-change to me. (Others may disagree.)

> The second patch will be ready in a few days.
>
> Is it a good plan ? or can it be improved ?

It can't be a single patch. It needs to be split to smaller changes.

BR,
Jani.

>
> Best regards,
> Claudio Suarez.
>
>
>

-- 
Jani Nikula, Intel Open Source Graphics Center

Re: [PATCH] drm/virtio: add null check in virtio_gpu_poll

2021-11-15 Thread Gurchetan Singh

On Mon, Nov 15, 2021 at 9:58 AM Gurchetan Singh 
wrote:

> From: Gurchetan Singh 
>
> If vfpriv is null, we shouldn't check vfpriv->ring_idx_mask.
>
> Signed-off-by: Gurchetan Singh 
> ---
>  drivers/gpu/drm/virtio/virtgpu_drv.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.c
> b/drivers/gpu/drm/virtio/virtgpu_drv.c
> index 749db18dcfa2..7975ea06b316 100644
> --- a/drivers/gpu/drm/virtio/virtgpu_drv.c
> +++ b/drivers/gpu/drm/virtio/virtgpu_drv.c
> @@ -166,7 +166,7 @@ static __poll_t virtio_gpu_poll(struct file *filp,
> struct drm_pending_event *e = NULL;
> __poll_t mask = 0;
>
> -   if (!vfpriv->ring_idx_mask)
> +   if (!vfpriv || !vfpriv->ring_idx_mask)
> return drm_poll(filp, wait);
>
> poll_wait(filp, _file->event_wait, wait);
>

Nevermind, looks like fix was merged in the main tree and will make it back
to to drm-misc-next:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d89c0c8322ecdc9a2ec84b959b6f766be082da76

-- 
> 2.34.0.rc1.387.gb447b232ab-goog
>
>

Re: [RESEND PATCH v2 04/13] backlight: qcom-wled: Fix off-by-one maximum with default num_strings

On 2021-11-15 11:23:27, Daniel Thompson wrote:
> On Fri, Nov 12, 2021 at 10:43:37PM +0100, Marijn Suijten wrote:
> > On 2021-11-12 13:35:03, Marijn Suijten wrote:
> > > On 2021-11-12 12:08:39, Daniel Thompson wrote:
> > > > On Fri, Nov 12, 2021 at 01:26:57AM +0100, Marijn Suijten wrote:
> > > > > When not specifying num-strings in the DT the default is used, but +1 
> > > > > is
> > > > > added to it which turns WLED3 into 4 and WLED4/5 into 5 strings 
> > > > > instead
> > > > > of 3 and 4 respectively, causing out-of-bounds reads and register
> > > > > read/writes.  This +1 exists for a deficiency in the DT parsing code,
> > > > > and is simply omitted entirely - solving this oob issue - by parsing 
> > > > > the
> > > > > property separately much like qcom,enabled-strings.
> > > > > 
> > > > > This also allows more stringent checks on the maximum value when
> > > > > qcom,enabled-strings is provided in the DT.  Note that num-strings is
> > > > > parsed after enabled-strings to give it final sign-off over the 
> > > > > length,
> > > > > which DT currently utilizes to get around an incorrect fixed read of
> > > > > four elements from that array (has been addressed in a prior patch).
> > > > > 
> > > > > Fixes: 93c64f1ea1e8 ("leds: add Qualcomm PM8941 WLED driver")
> > > > > Signed-off-by: Marijn Suijten 
> > > > > Reviewed-By: AngeloGioacchino Del Regno 
> > > > > 
> > > > > ---
> > > > >  drivers/video/backlight/qcom-wled.c | 51 
> > > > > +++--
> > > > >  1 file changed, 19 insertions(+), 32 deletions(-)
> > > > > 
> > > > > diff --git a/drivers/video/backlight/qcom-wled.c 
> > > > > b/drivers/video/backlight/qcom-wled.c
> > > > > index 977cd75827d7..c5232478a343 100644
> > > > > --- a/drivers/video/backlight/qcom-wled.c
> > > > > +++ b/drivers/video/backlight/qcom-wled.c
> > > > > @@ -1552,6 +1520,25 @@ static int wled_configure(struct wled *wled)
> > > > >   }
> > > > >   }
> > > > > 
> > > > > + rc = of_property_read_u32(dev->of_node, "qcom,num-strings", 
> > > > > );
> > > > > + if (!rc) {
> > > > > + if (val < 1 || val > wled->max_string_count) {
> > > > > + dev_err(dev, "qcom,num-strings must be between 
> > > > > 1 and %d\n",
> > > > > + wled->max_string_count);
> > > > > + return -EINVAL;
> > > > > + }
> > > > > +
> > > > > + if (string_len > 0) {
> > > > > + dev_warn(dev, "qcom,num-strings and 
> > > > > qcom,enabled-strings are ambiguous\n");
> > > > 
> > > > The warning should also be below the error message on the next if 
> > > > statement.
> > > 
> > > Agreed.
> > 
> > Thinking about this again while reworking the patches, I initially put
> > this above the error to make DT writers aware.  There's no point telling
> > them that their values are out of sync (num-strings >
> > len(enabled-strings)), when they "shouldn't even" (don't need to) set
> > both in the first place.  They might needlessly fix the discrepancy, see
> > the driver finally probe (working backlight) and carry on without
> > noticing this warning that now appears.
> > 
> > Sorry for bringing this back up, but I'm curious about your opinion.
> 
> With a more helpful warning about how to fix then I think it is OK to
> have both the warning and the error.

Thanks - I presume the message we settled upon last time is helpful
enough:

Only one of qcom,num-strings or qcom,enabled-strings should be set

I'll respin this, together with this warning reordered into the next
commit, and using __le16 for the cpu_to_le16 output.

- Marijn

Re: [PATCH] drm: change logs to print connectors in the form CONNECTOR:id:name

2021-11-15 Thread Claudio Suarez

On Mon, Nov 15, 2021 at 12:24:26PM +0200, Jani Nikula wrote:
> On Sun, 14 Nov 2021, Claudio Suarez  wrote:
> > On Sat, Nov 13, 2021 at 09:39:46PM +0100, Sam Ravnborg wrote:
> >> Hi Claudio,
> >> 
> >> On Sat, Nov 13, 2021 at 08:27:30PM +0100, Claudio Suarez wrote:
> >> > The prefered way to log connectors is [CONNECTOR:id:name]. Change it in
> >> > drm core programs.
> >> > 
> >> > Suggested-by: Ville Syrjälä 
> >> > Signed-off-by: Claudio Suarez 
> >> 
> >> While touching all these logging calls, could you convernt them to the
> >> newer drm_dbg_kms variants?
> >> DRM_DEBUG_* are all deprecated.
> >> 
> >
> > Yes, I can, but it is recommended to do it in a different patch:
> >
> > https://www.kernel.org/doc/html/latest/process/submitting-patches.html#separate-your-changes
> >
> > C:
> > "Separate your changes
> > Separate each logical change into a separate patch.
> > For example, if your changes include both bug fixes and performance 
> > enhancements..."
> >
> >
> > I will study and send a new separate patch with your suggestion.
> 
> I feel these logging changes are the types of changes where I'd err on
> the side of fewer patches than strictly following the above guidelines.

To size the problem:
- there are about 3434 references to DRM_DEBUG_* in all the drm code, all 
drivers.
- there are 413 references to DRM_DEBUG_* in the drm core code, only 66 of
   them are related to connectors.
- there are 62 references to DRM_ERR* and 7 references to DRM_INFO in
   the drm core programs.

I meant I can make two patches:
 1 - this one with the change to CONNECTOR:id:name (29 changes)
 2 - a new and bigger patch to change 413 + 62 + 7 references to 
DRM_{DEBUG,ERR,INFO}
 in the drm core programs.

The second patch will be ready in a few days.

Is it a good plan ? or can it be improved ?

Best regards,
Claudio Suarez.

[PATCH v1 1/2] ext4/xfs: add page refcount helper

From: Ralph Campbell 

There are several places where ZONE_DEVICE struct pages assume a reference
count == 1 means the page is idle and free. Instead of open coding this,
add a helper function to hide this detail.

Signed-off-by: Ralph Campbell 
Signed-off-by: Alex Sierra 
Reviewed-by: Christoph Hellwig 
Acked-by: Theodore Ts'o 
Acked-by: Darrick J. Wong 
---
v3:
[AS]: rename dax_layout_is_idle_page func to dax_page_unused

v4:
[AS]: This ref count functionality was missing on fuse/dax.c.
---
 fs/dax.c|  4 ++--
 fs/ext4/inode.c |  5 +
 fs/fuse/dax.c   |  4 +---
 fs/xfs/xfs_file.c   |  4 +---
 include/linux/dax.h | 10 ++
 5 files changed, 15 insertions(+), 12 deletions(-)

diff --git a/fs/dax.c b/fs/dax.c
index 4e3e5a283a91..84803c835650 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -369,7 +369,7 @@ static void dax_disassociate_entry(void *entry, struct 
address_space *mapping,
for_each_mapped_pfn(entry, pfn) {
struct page *page = pfn_to_page(pfn);
 
-   WARN_ON_ONCE(trunc && page_ref_count(page) > 1);
+   WARN_ON_ONCE(trunc && !dax_page_unused(page));
WARN_ON_ONCE(page->mapping && page->mapping != mapping);
page->mapping = NULL;
page->index = 0;
@@ -383,7 +383,7 @@ static struct page *dax_busy_page(void *entry)
for_each_mapped_pfn(entry, pfn) {
struct page *page = pfn_to_page(pfn);
 
-   if (page_ref_count(page) > 1)
+   if (!dax_page_unused(page))
return page;
}
return NULL;
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 0f06305167d5..068e8f78ec02 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -3913,10 +3913,7 @@ int ext4_break_layouts(struct inode *inode)
if (!page)
return 0;
 
-   error = ___wait_var_event(>_refcount,
-   atomic_read(>_refcount) == 1,
-   TASK_INTERRUPTIBLE, 0, 0,
-   ext4_wait_dax_page(inode));
+   error = dax_wait_page(inode, page, ext4_wait_dax_page);
} while (error == 0);
 
return error;
diff --git a/fs/fuse/dax.c b/fs/fuse/dax.c
index 281d79f8b3d3..f6d2a36e56e2 100644
--- a/fs/fuse/dax.c
+++ b/fs/fuse/dax.c
@@ -676,9 +676,7 @@ static int __fuse_dax_break_layouts(struct inode *inode, 
bool *retry,
return 0;
 
*retry = true;
-   return ___wait_var_event(>_refcount,
-   atomic_read(>_refcount) == 1, TASK_INTERRUPTIBLE,
-   0, 0, fuse_wait_dax_page(inode));
+   return dax_wait_page(inode, page, fuse_wait_dax_page);
 }
 
 /* dmap_end == 0 leads to unmapping of whole file */
diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 7aa943edfc02..fb13b12fd032 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -860,9 +860,7 @@ xfs_break_dax_layouts(
return 0;
 
*retry = true;
-   return ___wait_var_event(>_refcount,
-   atomic_read(>_refcount) == 1, TASK_INTERRUPTIBLE,
-   0, 0, xfs_wait_dax_page(inode));
+   return dax_wait_page(inode, page, xfs_wait_dax_page);
 }
 
 int
diff --git a/include/linux/dax.h b/include/linux/dax.h
index 2619d94c308d..c9c27fbf0b98 100644
--- a/include/linux/dax.h
+++ b/include/linux/dax.h
@@ -216,6 +216,16 @@ static inline bool dax_mapping(struct address_space 
*mapping)
return mapping->host && IS_DAX(mapping->host);
 }
 
+static inline bool dax_page_unused(struct page *page)
+{
+   return page_ref_count(page) == 1;
+}
+
+#define dax_wait_page(_inode, _page, _wait_cb) \
+   ___wait_var_event(&(_page)->_refcount,  \
+   dax_page_unused(_page), \
+   TASK_INTERRUPTIBLE, 0, 0, _wait_cb(_inode))
+
 #ifdef CONFIG_DEV_DAX_HMEM_DEVICES
 void hmem_register_device(int target_nid, struct resource *r);
 #else
-- 
2.32.0

[PATCH v1 2/2] mm: remove extra ZONE_DEVICE struct page refcount

From: Ralph Campbell 

ZONE_DEVICE struct pages have an extra reference count that complicates the
code for put_page() and several places in the kernel that need to check the
reference count to see that a page is not being used (gup, compaction,
migration, etc.). Clean up the code so the reference count doesn't need to
be treated specially for ZONE_DEVICE.

Signed-off-by: Ralph Campbell 
Signed-off-by: Alex Sierra 
Reviewed-by: Christoph Hellwig 
---
 arch/powerpc/kvm/book3s_hv_uvmem.c   |  2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_migrate.c |  3 +-
 drivers/gpu/drm/nouveau/nouveau_dmem.c   |  2 +-
 fs/dax.c |  4 +-
 include/linux/dax.h  |  2 +-
 include/linux/memremap.h |  7 ++-
 include/linux/mm.h   | 11 
 lib/test_hmm.c   |  2 +-
 mm/internal.h|  7 +++
 mm/memcontrol.c  |  6 +-
 mm/memremap.c| 72 +++-
 mm/migrate.c |  5 --
 mm/page_alloc.c  |  3 +
 mm/swap.c| 45 ++-
 14 files changed, 48 insertions(+), 123 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv_uvmem.c 
b/arch/powerpc/kvm/book3s_hv_uvmem.c
index a7061ee3b157..4de7c77ea17b 100644
--- a/arch/powerpc/kvm/book3s_hv_uvmem.c
+++ b/arch/powerpc/kvm/book3s_hv_uvmem.c
@@ -712,7 +712,7 @@ static struct page *kvmppc_uvmem_get_page(unsigned long 
gpa, struct kvm *kvm)
 
dpage = pfn_to_page(uvmem_pfn);
dpage->zone_device_data = pvt;
-   get_page(dpage);
+   init_page_count(dpage);
lock_page(dpage);
return dpage;
 out_clear:
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
index 3e405f078ade..c1b41c301a67 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
@@ -224,7 +224,8 @@ svm_migrate_get_vram_page(struct svm_range *prange, 
unsigned long pfn)
page = pfn_to_page(pfn);
svm_range_bo_ref(prange->svm_bo);
page->zone_device_data = prange->svm_bo;
-   get_page(page);
+   VM_BUG_ON_PAGE(page_ref_count(page), page);
+   init_page_count(page);
lock_page(page);
 }
 
diff --git a/drivers/gpu/drm/nouveau/nouveau_dmem.c 
b/drivers/gpu/drm/nouveau/nouveau_dmem.c
index 92987daa5e17..8bc7120e1216 100644
--- a/drivers/gpu/drm/nouveau/nouveau_dmem.c
+++ b/drivers/gpu/drm/nouveau/nouveau_dmem.c
@@ -324,7 +324,7 @@ nouveau_dmem_page_alloc_locked(struct nouveau_drm *drm)
return NULL;
}
 
-   get_page(page);
+   init_page_count(page);
lock_page(page);
return page;
 }
diff --git a/fs/dax.c b/fs/dax.c
index 84803c835650..72bb6f1e155c 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -572,14 +572,14 @@ static void *grab_mapping_entry(struct xa_state *xas,
 
 /**
  * dax_layout_busy_page_range - find first pinned page in @mapping
- * @mapping: address space to scan for a page with ref count > 1
+ * @mapping: address space to scan for a page with ref count > 0
  * @start: Starting offset. Page containing 'start' is included.
  * @end: End offset. Page containing 'end' is included. If 'end' is LLONG_MAX,
  *   pages from 'start' till the end of file are included.
  *
  * DAX requires ZONE_DEVICE mapped pages. These pages are never
  * 'onlined' to the page allocator so they are considered idle when
- * page->count == 1. A filesystem uses this interface to determine if
+ * page->count == 0. A filesystem uses this interface to determine if
  * any page in the mapping is busy, i.e. for DMA, or other
  * get_user_pages() usages.
  *
diff --git a/include/linux/dax.h b/include/linux/dax.h
index c9c27fbf0b98..1d3e25d51fc6 100644
--- a/include/linux/dax.h
+++ b/include/linux/dax.h
@@ -218,7 +218,7 @@ static inline bool dax_mapping(struct address_space 
*mapping)
 
 static inline bool dax_page_unused(struct page *page)
 {
-   return page_ref_count(page) == 1;
+   return page_ref_count(page) == 0;
 }
 
 #define dax_wait_page(_inode, _page, _wait_cb) \
diff --git a/include/linux/memremap.h b/include/linux/memremap.h
index ff4d398edf35..8340e39241fc 100644
--- a/include/linux/memremap.h
+++ b/include/linux/memremap.h
@@ -74,9 +74,10 @@ enum memory_type {
 
 struct dev_pagemap_ops {
/*
-* Called once the page refcount reaches 1.  (ZONE_DEVICE pages never
-* reach 0 refcount unless there is a refcount bug. This allows the
-* device driver to implement its own memory management.)
+* Called once the page refcount reaches 0. The reference count
+* should be reset to one with init_page_count(page) before reusing
+* the page. This allows the device driver to implement its own
+* memory management.
 */
void (*page_free)(struct page *page);
 
diff --git

[PATCH v1 0/2] Remove extra ZONE_DEVICE page refcount

This patch includes Ralph Campbell’s ZONE_DEVICE refcount cleanup and
additionally the changes necessary to avoid breaking the separately
submitted MEMORY_DEVICE_COHERENT page migration code.
Ralph’s original description:
ZONE_DEVICE struct pages have an extra reference count that complicates
the code for put_page() and several places in the kernel that need to
check the reference count to see that a page is not being used (gup,
compaction, migration, etc.). Clean up the code so the reference count
doesn't need to be treated specially for ZONE_DEVICE.
Following a suggestion by Christoph, we attempted to combine this
cleanup with the device patch migration patch series, however this
caused xftests 413 to fail with a warning, the root cause of which has
large kernel footprint than just device memory. Fixing this issue
properly will require cooperation between multiple development groups
working across multiple kernel subsystems, as is apparent from the
discussion under the earlier, combined patch submission.
We therefore propose to break this work out separately as its own patch,
so it can receive the cooperative development work it needs. The deep
problem arises from the get_user_pages API, which has proved troublesome
for many years. It is possible that a concerted effort to repair this
particular refcount issue properly will be a step in the direction of
rationalizing this popular and problematic API.
In the larger picture, this API rationalization work probably deserves
an agenda item at the upcoming Filesystem, MM & BPF Summit:
https://events.linuxfoundation.org/lsfmm/

The wide ranging discussion following previous iterations of the
migration patchset focused almost exclusively on the refcount cleanup
patch. The thread is here:
https://lore.kernel.org/linux-mm/20211014153928.16805-3-alex.sie...@amd.com/
and links a number of previous threads. It is apparent that there is a
lot of work in progress by a number of developer groups in parallel,
and that there are issues with the order in which this work should be
attempted and merged.
Jason provided his list of “balls in the air”:
 - Joao's compound page support for device_dax and more
 - Alex's DEVICE_COHERENT
 - The refcount normalization
 - Removing the pgmap test from GUP
 - Removing the need for the PUD/PMD/PTE special bit
 - Removing the need for the PUD/PMD/PTE devmap bit
 - Remove PUD/PMD vma_is_special
 - folios for fsdax
 - shootdown for fsdax
It is not clear that the refcount cleanup in this patch should be the
first item on the list to be merged, however it has proved to be a good
starting point for a cooperative effort to address the underlying
issues.
Ralph, if you would prefer to take back “ownership” of this patch, it’s
yours, otherwise we will be happy to keep it in play and get it merged
when some other pieces of the puzzle fall into place.

Ralph Campbell (2):
  ext4/xfs: add page refcount helper
  mm: remove extra ZONE_DEVICE struct page refcount

 arch/powerpc/kvm/book3s_hv_uvmem.c   |  2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_migrate.c |  3 +-
 drivers/gpu/drm/nouveau/nouveau_dmem.c   |  2 +-
 fs/dax.c |  8 +--
 fs/ext4/inode.c  |  5 +-
 fs/fuse/dax.c|  4 +-
 fs/xfs/xfs_file.c|  4 +-
 include/linux/dax.h  | 10 
 include/linux/memremap.h |  7 ++-
 include/linux/mm.h   | 11 
 lib/test_hmm.c   |  2 +-
 mm/internal.h|  7 +++
 mm/memcontrol.c  |  6 +-
 mm/memremap.c| 72 +++-
 mm/migrate.c |  5 --
 mm/page_alloc.c  |  3 +
 mm/swap.c| 45 ++-
 17 files changed, 62 insertions(+), 134 deletions(-)

-- 
2.32.0

[PATCH v1 6/9] lib: test_hmm add module param for zone device type

In order to configure device coherent in test_hmm, two module parameters
should be passed, which correspond to the SP start address of each
device (2) spm_addr_dev0 & spm_addr_dev1. If no parameters are passed,
private device type is configured.

Signed-off-by: Alex Sierra 
---
 lib/test_hmm.c  | 66 +++--
 lib/test_hmm_uapi.h |  1 +
 2 files changed, 47 insertions(+), 20 deletions(-)

diff --git a/lib/test_hmm.c b/lib/test_hmm.c
index 2daaa7b3710b..45334df28d7b 100644
--- a/lib/test_hmm.c
+++ b/lib/test_hmm.c
@@ -34,6 +34,16 @@
 #define DEVMEM_CHUNK_SIZE  (256 * 1024 * 1024U)
 #define DEVMEM_CHUNKS_RESERVE  16
 
+static unsigned long spm_addr_dev0;
+module_param(spm_addr_dev0, long, 0644);
+MODULE_PARM_DESC(spm_addr_dev0,
+   "Specify start address for SPM (special purpose memory) used 
for device 0. By setting this Coherent device type will be used. Make sure 
spm_addr_dev1 is set too");
+
+static unsigned long spm_addr_dev1;
+module_param(spm_addr_dev1, long, 0644);
+MODULE_PARM_DESC(spm_addr_dev1,
+   "Specify start address for SPM (special purpose memory) used 
for device 1. By setting this Coherent device type will be used. Make sure 
spm_addr_dev0 is set too");
+
 static const struct dev_pagemap_ops dmirror_devmem_ops;
 static const struct mmu_interval_notifier_ops dmirror_min_ops;
 static dev_t dmirror_dev;
@@ -452,11 +462,11 @@ static int dmirror_write(struct dmirror *dmirror, struct 
hmm_dmirror_cmd *cmd)
return ret;
 }
 
-static bool dmirror_allocate_chunk(struct dmirror_device *mdevice,
+static int dmirror_allocate_chunk(struct dmirror_device *mdevice,
   struct page **ppage)
 {
struct dmirror_chunk *devmem;
-   struct resource *res;
+   struct resource *res = NULL;
unsigned long pfn;
unsigned long pfn_first;
unsigned long pfn_last;
@@ -464,17 +474,29 @@ static bool dmirror_allocate_chunk(struct dmirror_device 
*mdevice,
 
devmem = kzalloc(sizeof(*devmem), GFP_KERNEL);
if (!devmem)
-   return false;
+   return -ENOMEM;
 
-   res = request_free_mem_region(_resource, DEVMEM_CHUNK_SIZE,
- "hmm_dmirror");
-   if (IS_ERR(res))
-   goto err_devmem;
+   if (!spm_addr_dev0 && !spm_addr_dev1) {
+   res = request_free_mem_region(_resource, 
DEVMEM_CHUNK_SIZE,
+ "hmm_dmirror");
+   if (IS_ERR_OR_NULL(res))
+   goto err_devmem;
+   devmem->pagemap.range.start = res->start;
+   devmem->pagemap.range.end = res->end;
+   devmem->pagemap.type = MEMORY_DEVICE_PRIVATE;
+   mdevice->zone_device_type = HMM_DMIRROR_MEMORY_DEVICE_PRIVATE;
+   } else if (spm_addr_dev0 && spm_addr_dev1) {
+   devmem->pagemap.range.start = MINOR(mdevice->cdevice.dev) ?
+   spm_addr_dev0 :
+   spm_addr_dev1;
+   devmem->pagemap.range.end = devmem->pagemap.range.start +
+   DEVMEM_CHUNK_SIZE - 1;
+   devmem->pagemap.type = MEMORY_DEVICE_COHERENT;
+   mdevice->zone_device_type = HMM_DMIRROR_MEMORY_DEVICE_COHERENT;
+   } else {
+   pr_err("Both spm_addr_dev parameters should be set\n");
+   }
 
-   mdevice->zone_device_type = HMM_DMIRROR_MEMORY_DEVICE_PRIVATE;
-   devmem->pagemap.type = MEMORY_DEVICE_PRIVATE;
-   devmem->pagemap.range.start = res->start;
-   devmem->pagemap.range.end = res->end;
devmem->pagemap.nr_range = 1;
devmem->pagemap.ops = _devmem_ops;
devmem->pagemap.owner = mdevice;
@@ -495,10 +517,14 @@ static bool dmirror_allocate_chunk(struct dmirror_device 
*mdevice,
mdevice->devmem_capacity = new_capacity;
mdevice->devmem_chunks = new_chunks;
}
-
ptr = memremap_pages(>pagemap, numa_node_id());
-   if (IS_ERR(ptr))
+   if (IS_ERR_OR_NULL(ptr)) {
+   if (ptr)
+   ret = PTR_ERR(ptr);
+   else
+   ret = -EFAULT;
goto err_release;
+   }
 
devmem->mdevice = mdevice;
pfn_first = devmem->pagemap.range.start >> PAGE_SHIFT;
@@ -531,7 +557,8 @@ static bool dmirror_allocate_chunk(struct dmirror_device 
*mdevice,
 
 err_release:
mutex_unlock(>devmem_lock);
-   release_mem_region(devmem->pagemap.range.start, 
range_len(>pagemap.range));
+   if (res)
+   release_mem_region(devmem->pagemap.range.start, 
range_len(>pagemap.range));
 err_devmem:
kfree(devmem);
 
@@ -1219,10 +1246,8 @@ static int dmirror_device_init(struct dmirror_device 
*mdevice, int id)
if (ret)
return ret;
 
-

[PATCH v1 9/9] tools: update test_hmm script to support SP config

Add two more parameters to set spm_addr_dev0 & spm_addr_dev1
addresses. These two parameters configure the start SP
addresses for each device in test_hmm driver.
Consequently, this configures zone device type as coherent.

Signed-off-by: Alex Sierra 
---
 tools/testing/selftests/vm/test_hmm.sh | 20 +---
 1 file changed, 17 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/vm/test_hmm.sh 
b/tools/testing/selftests/vm/test_hmm.sh
index 0647b525a625..3eeabe94399f 100755
--- a/tools/testing/selftests/vm/test_hmm.sh
+++ b/tools/testing/selftests/vm/test_hmm.sh
@@ -40,7 +40,18 @@ check_test_requirements()
 
 load_driver()
 {
-   modprobe $DRIVER > /dev/null 2>&1
+   if [ $# -eq 0 ]; then
+   modprobe $DRIVER > /dev/null 2>&1
+   else
+   if [ $# -eq 2 ]; then
+   modprobe $DRIVER spm_addr_dev0=$1 spm_addr_dev1=$2
+   > /dev/null 2>&1
+   else
+   echo "Missing module parameters. Make sure pass"\
+   "spm_addr_dev0 and spm_addr_dev1"
+   usage
+   fi
+   fi
if [ $? == 0 ]; then
major=$(awk "\$2==\"HMM_DMIRROR\" {print \$1}" /proc/devices)
mknod /dev/hmm_dmirror0 c $major 0
@@ -58,7 +69,7 @@ run_smoke()
 {
echo "Running smoke test. Note, this test provides basic coverage."
 
-   load_driver
+   load_driver $1 $2
$(dirname "${BASH_SOURCE[0]}")/hmm-tests
unload_driver
 }
@@ -75,6 +86,9 @@ usage()
echo "# Smoke testing"
echo "./${TEST_NAME}.sh smoke"
echo
+   echo "# Smoke testing with SPM enabled"
+   echo "./${TEST_NAME}.sh smoke  "
+   echo
exit 0
 }
 
@@ -84,7 +98,7 @@ function run_test()
usage
else
if [ "$1" = "smoke" ]; then
-   run_smoke
+   run_smoke $2 $3
else
usage
fi
-- 
2.32.0

[PATCH v1 2/9] mm: add device coherent vma selection for memory migration

This case is used to migrate pages from device memory, back to system
memory. Device coherent type memory is cache coherent from device and CPU
point of view.

Signed-off-by: Alex Sierra 
---
v2:
condition added when migrations from device coherent pages.
---
 include/linux/migrate.h | 1 +
 mm/migrate.c| 9 +++--
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/include/linux/migrate.h b/include/linux/migrate.h
index c8077e936691..e74bb0978f6f 100644
--- a/include/linux/migrate.h
+++ b/include/linux/migrate.h
@@ -138,6 +138,7 @@ static inline unsigned long migrate_pfn(unsigned long pfn)
 enum migrate_vma_direction {
MIGRATE_VMA_SELECT_SYSTEM = 1 << 0,
MIGRATE_VMA_SELECT_DEVICE_PRIVATE = 1 << 1,
+   MIGRATE_VMA_SELECT_DEVICE_COHERENT = 1 << 2,
 };
 
 struct migrate_vma {
diff --git a/mm/migrate.c b/mm/migrate.c
index f74422a42192..166bfec7d85e 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -2340,8 +2340,6 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp,
if (is_writable_device_private_entry(entry))
mpfn |= MIGRATE_PFN_WRITE;
} else {
-   if (!(migrate->flags & MIGRATE_VMA_SELECT_SYSTEM))
-   goto next;
pfn = pte_pfn(pte);
if (is_zero_pfn(pfn)) {
mpfn = MIGRATE_PFN_MIGRATE;
@@ -2349,6 +2347,13 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp,
goto next;
}
page = vm_normal_page(migrate->vma, addr, pte);
+   if (!is_zone_device_page(page) &&
+   !(migrate->flags & MIGRATE_VMA_SELECT_SYSTEM))
+   goto next;
+   if (is_zone_device_page(page) &&
+   (!(migrate->flags & 
MIGRATE_VMA_SELECT_DEVICE_COHERENT) ||
+page->pgmap->owner != migrate->pgmap_owner))
+   goto next;
mpfn = migrate_pfn(pfn) | MIGRATE_PFN_MIGRATE;
mpfn |= pte_write(pte) ? MIGRATE_PFN_WRITE : 0;
}
-- 
2.32.0

[PATCH v1 7/9] lib: add support for device coherent type in test_hmm

Device Coherent type uses device memory that is coherently accesible by
the CPU. This could be shown as SP (special purpose) memory range
at the BIOS-e820 memory enumeration. If no SP memory is supported in
system, this could be faked by setting CONFIG_EFI_FAKE_MEMMAP.

Currently, test_hmm only supports two different SP ranges of at least
256MB size. This could be specified in the kernel parameter variable
efi_fake_mem. Ex. Two SP ranges of 1GB starting at 0x1 &
0x14000 physical address. Ex.
efi_fake_mem=1G@0x1:0x4,1G@0x14000:0x4

Signed-off-by: Alex Sierra 
---
 lib/test_hmm.c  | 191 +---
 lib/test_hmm_uapi.h |  15 ++--
 2 files changed, 153 insertions(+), 53 deletions(-)

diff --git a/lib/test_hmm.c b/lib/test_hmm.c
index 45334df28d7b..9e2cc0cd4412 100644
--- a/lib/test_hmm.c
+++ b/lib/test_hmm.c
@@ -471,6 +471,7 @@ static int dmirror_allocate_chunk(struct dmirror_device 
*mdevice,
unsigned long pfn_first;
unsigned long pfn_last;
void *ptr;
+   int ret = -ENOMEM;
 
devmem = kzalloc(sizeof(*devmem), GFP_KERNEL);
if (!devmem)
@@ -553,7 +554,7 @@ static int dmirror_allocate_chunk(struct dmirror_device 
*mdevice,
}
spin_unlock(>lock);
 
-   return true;
+   return 0;
 
 err_release:
mutex_unlock(>devmem_lock);
@@ -562,7 +563,7 @@ static int dmirror_allocate_chunk(struct dmirror_device 
*mdevice,
 err_devmem:
kfree(devmem);
 
-   return false;
+   return ret;
 }
 
 static struct page *dmirror_devmem_alloc_page(struct dmirror_device *mdevice)
@@ -571,8 +572,10 @@ static struct page *dmirror_devmem_alloc_page(struct 
dmirror_device *mdevice)
struct page *rpage;
 
/*
-* This is a fake device so we alloc real system memory to store
-* our device memory.
+* For ZONE_DEVICE private type, this is a fake device so we alloc real
+* system memory to store our device memory.
+* For ZONE_DEVICE coherent type we use the actual dpage to store the 
data
+* and ignore rpage.
 */
rpage = alloc_page(GFP_HIGHUSER);
if (!rpage)
@@ -623,12 +626,17 @@ static void dmirror_migrate_alloc_and_copy(struct 
migrate_vma *args,
 * unallocated pte_none() or read-only zero page.
 */
spage = migrate_pfn_to_page(*src);
+   if (spage && is_zone_device_page(spage))
+   pr_err("page already in device spage pfn: 0x%lx\n",
+   page_to_pfn(spage));
+   WARN_ON(spage && is_zone_device_page(spage));
 
dpage = dmirror_devmem_alloc_page(mdevice);
if (!dpage)
continue;
 
-   rpage = dpage->zone_device_data;
+   rpage = is_device_private_page(dpage) ? dpage->zone_device_data 
:
+   dpage;
if (spage)
copy_highpage(rpage, spage);
else
@@ -640,8 +648,11 @@ static void dmirror_migrate_alloc_and_copy(struct 
migrate_vma *args,
 * the simulated device memory and that page holds the pointer
 * to the mirror.
 */
+   rpage = dpage->zone_device_data;
rpage->zone_device_data = dmirror;
 
+   pr_debug("migrating from sys to dev pfn src: 0x%lx pfn dst: 
0x%lx\n",
+page_to_pfn(spage), page_to_pfn(dpage));
*dst = migrate_pfn(page_to_pfn(dpage)) |
MIGRATE_PFN_LOCKED;
if ((*src & MIGRATE_PFN_WRITE) ||
@@ -721,10 +732,13 @@ static int dmirror_migrate_finalize_and_map(struct 
migrate_vma *args,
continue;
 
/*
-* Store the page that holds the data so the page table
-* doesn't have to deal with ZONE_DEVICE private pages.
+* For ZONE_DEVICE private pages we store the page that
+* holds the data so the page table doesn't have to deal it.
+* For ZONE_DEVICE coherent pages we store the actual page, 
since
+* the CPU has coherent access to the page.
 */
-   entry = dpage->zone_device_data;
+   entry = is_device_private_page(dpage) ? dpage->zone_device_data 
:
+   dpage;
if (*dst & MIGRATE_PFN_WRITE)
entry = xa_tag_pointer(entry, DPT_XA_TAG_WRITE);
entry = xa_store(>pt, pfn, entry, GFP_ATOMIC);
@@ -804,8 +818,110 @@ static int dmirror_exclusive(struct dmirror *dmirror,
return ret;
 }
 
-static int dmirror_migrate(struct dmirror *dmirror,
-  struct hmm_dmirror_cmd *cmd)
+static vm_fault_t dmirror_devmem_fault_alloc_and_copy(struct migrate_vma

[PATCH v1 8/9] tools: update hmm-test to support device coherent type

Test cases such as migrate_fault and migrate_multiple,
were modified to explicit migrate from device to sys memory
without the need of page faults, when using device coherent
type.

Snapshot test case updated to read memory device type
first and based on that, get the proper returned results
migrate_ping_pong test case added to test explicit migration
from device to sys memory for both private and coherent zone
types.

Helpers to migrate from device to sys memory and vicerversa
were also added.

Signed-off-by: Alex Sierra 
---
 tools/testing/selftests/vm/hmm-tests.c | 156 ++---
 1 file changed, 138 insertions(+), 18 deletions(-)

diff --git a/tools/testing/selftests/vm/hmm-tests.c 
b/tools/testing/selftests/vm/hmm-tests.c
index 864f126ffd78..6091e30636d5 100644
--- a/tools/testing/selftests/vm/hmm-tests.c
+++ b/tools/testing/selftests/vm/hmm-tests.c
@@ -44,6 +44,7 @@ struct hmm_buffer {
int fd;
uint64_tcpages;
uint64_tfaults;
+   int zone_device_type;
 };
 
 #define TWOMEG (1 << 21)
@@ -144,6 +145,7 @@ static int hmm_dmirror_cmd(int fd,
}
buffer->cpages = cmd.cpages;
buffer->faults = cmd.faults;
+   buffer->zone_device_type = cmd.zone_device_type;
 
return 0;
 }
@@ -211,6 +213,32 @@ static void hmm_nanosleep(unsigned int n)
nanosleep(, NULL);
 }
 
+static int hmm_migrate_sys_to_dev(int fd,
+  struct hmm_buffer *buffer,
+  unsigned long npages)
+{
+   return hmm_dmirror_cmd(fd, HMM_DMIRROR_MIGRATE_TO_DEV, buffer, npages);
+}
+
+static int hmm_migrate_dev_to_sys(int fd,
+  struct hmm_buffer *buffer,
+  unsigned long npages)
+{
+   return hmm_dmirror_cmd(fd, HMM_DMIRROR_MIGRATE_TO_SYS, buffer, npages);
+}
+
+static int hmm_is_private_device(int fd, bool *res)
+{
+   struct hmm_buffer buffer;
+   int ret;
+
+   buffer.ptr = 0;
+   ret = hmm_dmirror_cmd(fd, HMM_DMIRROR_GET_MEM_DEV_TYPE, , 1);
+   *res = (buffer.zone_device_type == HMM_DMIRROR_MEMORY_DEVICE_PRIVATE);
+
+   return ret;
+}
+
 /*
  * Simple NULL test of device open/close.
  */
@@ -875,7 +903,7 @@ TEST_F(hmm, migrate)
ptr[i] = i;
 
/* Migrate memory to device. */
-   ret = hmm_dmirror_cmd(self->fd, HMM_DMIRROR_MIGRATE, buffer, npages);
+   ret = hmm_migrate_sys_to_dev(self->fd, buffer, npages);
ASSERT_EQ(ret, 0);
ASSERT_EQ(buffer->cpages, npages);
 
@@ -923,7 +951,7 @@ TEST_F(hmm, migrate_fault)
ptr[i] = i;
 
/* Migrate memory to device. */
-   ret = hmm_dmirror_cmd(self->fd, HMM_DMIRROR_MIGRATE, buffer, npages);
+   ret = hmm_migrate_sys_to_dev(self->fd, buffer, npages);
ASSERT_EQ(ret, 0);
ASSERT_EQ(buffer->cpages, npages);
 
@@ -936,7 +964,7 @@ TEST_F(hmm, migrate_fault)
ASSERT_EQ(ptr[i], i);
 
/* Migrate memory to the device again. */
-   ret = hmm_dmirror_cmd(self->fd, HMM_DMIRROR_MIGRATE, buffer, npages);
+   ret = hmm_migrate_sys_to_dev(self->fd, buffer, npages);
ASSERT_EQ(ret, 0);
ASSERT_EQ(buffer->cpages, npages);
 
@@ -976,7 +1004,7 @@ TEST_F(hmm, migrate_shared)
ASSERT_NE(buffer->ptr, MAP_FAILED);
 
/* Migrate memory to device. */
-   ret = hmm_dmirror_cmd(self->fd, HMM_DMIRROR_MIGRATE, buffer, npages);
+   ret = hmm_migrate_sys_to_dev(self->fd, buffer, npages);
ASSERT_EQ(ret, -ENOENT);
 
hmm_buffer_free(buffer);
@@ -1015,7 +1043,7 @@ TEST_F(hmm2, migrate_mixed)
p = buffer->ptr;
 
/* Migrating a protected area should be an error. */
-   ret = hmm_dmirror_cmd(self->fd1, HMM_DMIRROR_MIGRATE, buffer, npages);
+   ret = hmm_migrate_sys_to_dev(self->fd1, buffer, npages);
ASSERT_EQ(ret, -EINVAL);
 
/* Punch a hole after the first page address. */
@@ -1023,7 +1051,7 @@ TEST_F(hmm2, migrate_mixed)
ASSERT_EQ(ret, 0);
 
/* We expect an error if the vma doesn't cover the range. */
-   ret = hmm_dmirror_cmd(self->fd1, HMM_DMIRROR_MIGRATE, buffer, 3);
+   ret = hmm_migrate_sys_to_dev(self->fd1, buffer, 3);
ASSERT_EQ(ret, -EINVAL);
 
/* Page 2 will be a read-only zero page. */
@@ -1055,13 +1083,13 @@ TEST_F(hmm2, migrate_mixed)
 
/* Now try to migrate pages 2-5 to device 1. */
buffer->ptr = p + 2 * self->page_size;
-   ret = hmm_dmirror_cmd(self->fd1, HMM_DMIRROR_MIGRATE, buffer, 4);
+   ret = hmm_migrate_sys_to_dev(self->fd1, buffer, 4);
ASSERT_EQ(ret, 0);
ASSERT_EQ(buffer->cpages, 4);
 
/* Page 5 won't be migrated to device 0 because it's on device 1. */
buffer->ptr = p + 5 * self->page_size;
-   ret = hmm_dmirror_cmd(self->fd0, HMM_DMIRROR_MIGRATE, buffer, 1);
+   ret = hmm_migrate_sys_to_dev(self->fd0, buffer, 1);

[PATCH v1 0/9] Add MEMORY_DEVICE_COHERENT for coherent device memory mapping

This patch series introduces MEMORY_DEVICE_COHERENT, a type of memory
owned by a device that can be mapped into CPU page tables like
MEMORY_DEVICE_GENERIC and can also be migrated like
MEMORY_DEVICE_PRIVATE.
Christoph, the suggestion to incorporate Ralph Campbell’s refcount
cleanup patch into our hardware page migration patchset originally came
from you, but it proved impractical to do things in that order because
the refcount cleanup introduced a bug with wide ranging structural
implications. Instead, we amended Ralph’s patch so that it could be
applied after merging the migration work. As we saw from the recent
discussion, merging the refcount work is going to take some time and
cooperation between multiple development groups, while the migration
work is ready now and is needed now. So we propose to merge this
patchset first and continue to work with Ralph and others to merge the
refcount cleanup separately, when it is ready.
This patch series is mostly self-contained except for a few places where
it needs to update other subsystems to handle the new memory type.
System stability and performance are not affected according to our
ongoing testing, including xfstests.
How it works: The system BIOS advertises the GPU device memory
(aka VRAM) as SPM (special purpose memory) in the UEFI system address
map.
The amdgpu driver registers the memory with devmap as
MEMORY_DEVICE_COHERENT using devm_memremap_pages. The initial user for
this hardware page migration capability is the Frontier supercomputer
project. This functionality is not AMD-specific. We expect other GPU
vendors to find this functionality useful, and possibly other hardware
types in the future.
Our test nodes in the lab are similar to the Frontier configuration,
with .5 TB of system memory plus 256 GB of device memory split across
4 GPUs, all in a single coherent address space. Page migration is
expected to improve application efficiency significantly. We will
report empirical results as they become available.
We extended hmm_test to cover migration of MEMORY_DEVICE_COHERENT. This
patch set builds on HMM and our SVM memory manager already merged in
5.15.

Alex Sierra (9):
  mm: add zone device coherent type memory support
  mm: add device coherent vma selection for memory migration
  drm/amdkfd: add SPM support for SVM
  drm/amdkfd: coherent type as sys mem on migration to ram
  lib: test_hmm add ioctl to get zone device type
  lib: test_hmm add module param for zone device type
  lib: add support for device coherent type in test_hmm
  tools: update hmm-test to support device coherent type
  tools: update test_hmm script to support SP config

 drivers/gpu/drm/amd/amdkfd/kfd_migrate.c |  34 ++-
 include/linux/memremap.h |   8 +
 include/linux/migrate.h  |   1 +
 include/linux/mm.h   |   9 +
 lib/test_hmm.c   | 270 +--
 lib/test_hmm_uapi.h  |  21 +-
 mm/memcontrol.c  |   6 +-
 mm/memory-failure.c  |   6 +-
 mm/memremap.c|   5 +-
 mm/migrate.c |  30 ++-
 tools/testing/selftests/vm/hmm-tests.c   | 156 +++--
 tools/testing/selftests/vm/test_hmm.sh   |  20 +-
 12 files changed, 446 insertions(+), 120 deletions(-)

-- 
2.32.0

[PATCH v1 3/9] drm/amdkfd: add SPM support for SVM

When CPU is connected throug XGMI, it has coherent
access to VRAM resource. In this case that resource
is taken from a table in the device gmc aperture base.
This resource is used along with the device type, which could
be DEVICE_PRIVATE or DEVICE_COHERENT to create the device
page map region.

Signed-off-by: Alex Sierra 
Reviewed-by: Felix Kuehling 
---
v7:
Remove lookup_resource call, so export symbol for this function
is not longer required. Patch dropped "kernel: resource:
lookup_resource as exported symbol"
---
 drivers/gpu/drm/amd/amdkfd/kfd_migrate.c | 29 +++-
 1 file changed, 18 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
index aeade32ec298..9e36fe8aea0f 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
@@ -935,7 +935,7 @@ int svm_migrate_init(struct amdgpu_device *adev)
 {
struct kfd_dev *kfddev = adev->kfd.dev;
struct dev_pagemap *pgmap;
-   struct resource *res;
+   struct resource *res = NULL;
unsigned long size;
void *r;
 
@@ -950,28 +950,34 @@ int svm_migrate_init(struct amdgpu_device *adev)
 * should remove reserved size
 */
size = ALIGN(adev->gmc.real_vram_size, 2ULL << 20);
-   res = devm_request_free_mem_region(adev->dev, _resource, size);
-   if (IS_ERR(res))
-   return -ENOMEM;
+   if (adev->gmc.xgmi.connected_to_cpu) {
+   pgmap->range.start = adev->gmc.aper_base;
+   pgmap->range.end = adev->gmc.aper_base + adev->gmc.aper_size - 
1;
+   pgmap->type = MEMORY_DEVICE_COHERENT;
+   } else {
+   res = devm_request_free_mem_region(adev->dev, _resource, 
size);
+   if (IS_ERR(res))
+   return -ENOMEM;
+   pgmap->range.start = res->start;
+   pgmap->range.end = res->end;
+   pgmap->type = MEMORY_DEVICE_PRIVATE;
+   }
 
-   pgmap->type = MEMORY_DEVICE_PRIVATE;
pgmap->nr_range = 1;
-   pgmap->range.start = res->start;
-   pgmap->range.end = res->end;
pgmap->ops = _migrate_pgmap_ops;
pgmap->owner = SVM_ADEV_PGMAP_OWNER(adev);
-   pgmap->flags = MIGRATE_VMA_SELECT_DEVICE_PRIVATE;
-
+   pgmap->flags = 0;
/* Device manager releases device-specific resources, memory region and
 * pgmap when driver disconnects from device.
 */
r = devm_memremap_pages(adev->dev, pgmap);
if (IS_ERR(r)) {
pr_err("failed to register HMM device memory\n");
-
/* Disable SVM support capability */
pgmap->type = 0;
-   devm_release_mem_region(adev->dev, res->start, 
resource_size(res));
+   if (pgmap->type == MEMORY_DEVICE_PRIVATE)
+   devm_release_mem_region(adev->dev, res->start,
+   res->end - res->start + 1);
return PTR_ERR(r);
}
 
@@ -984,3 +990,4 @@ int svm_migrate_init(struct amdgpu_device *adev)
 
return 0;
 }
+
-- 
2.32.0

[PATCH v1 5/9] lib: test_hmm add ioctl to get zone device type

new ioctl cmd added to query zone device type. This will be
used once the test_hmm adds zone device coherent type.

Signed-off-by: Alex Sierra 
---
 lib/test_hmm.c  | 15 ++-
 lib/test_hmm_uapi.h |  7 +++
 2 files changed, 21 insertions(+), 1 deletion(-)

diff --git a/lib/test_hmm.c b/lib/test_hmm.c
index c259842f6d44..2daaa7b3710b 100644
--- a/lib/test_hmm.c
+++ b/lib/test_hmm.c
@@ -84,6 +84,7 @@ struct dmirror_chunk {
 struct dmirror_device {
struct cdev cdevice;
struct hmm_devmem   *devmem;
+   unsigned intzone_device_type;
 
unsigned intdevmem_capacity;
unsigned intdevmem_count;
@@ -470,6 +471,7 @@ static bool dmirror_allocate_chunk(struct dmirror_device 
*mdevice,
if (IS_ERR(res))
goto err_devmem;
 
+   mdevice->zone_device_type = HMM_DMIRROR_MEMORY_DEVICE_PRIVATE;
devmem->pagemap.type = MEMORY_DEVICE_PRIVATE;
devmem->pagemap.range.start = res->start;
devmem->pagemap.range.end = res->end;
@@ -1025,6 +1027,15 @@ static int dmirror_snapshot(struct dmirror *dmirror,
return ret;
 }
 
+static int dmirror_get_device_type(struct dmirror *dmirror,
+   struct hmm_dmirror_cmd *cmd)
+{
+   mutex_lock(>mutex);
+   cmd->zone_device_type = dmirror->mdevice->zone_device_type;
+   mutex_unlock(>mutex);
+
+   return 0;
+}
 static long dmirror_fops_unlocked_ioctl(struct file *filp,
unsigned int command,
unsigned long arg)
@@ -1074,7 +1085,9 @@ static long dmirror_fops_unlocked_ioctl(struct file *filp,
case HMM_DMIRROR_SNAPSHOT:
ret = dmirror_snapshot(dmirror, );
break;
-
+   case HMM_DMIRROR_GET_MEM_DEV_TYPE:
+   ret = dmirror_get_device_type(dmirror, );
+   break;
default:
return -EINVAL;
}
diff --git a/lib/test_hmm_uapi.h b/lib/test_hmm_uapi.h
index f14dea5dcd06..c42e57a6a71e 100644
--- a/lib/test_hmm_uapi.h
+++ b/lib/test_hmm_uapi.h
@@ -26,6 +26,7 @@ struct hmm_dmirror_cmd {
__u64   npages;
__u64   cpages;
__u64   faults;
+   __u64   zone_device_type;
 };
 
 /* Expose the address space of the calling process through hmm device file */
@@ -35,6 +36,7 @@ struct hmm_dmirror_cmd {
 #define HMM_DMIRROR_SNAPSHOT   _IOWR('H', 0x03, struct hmm_dmirror_cmd)
 #define HMM_DMIRROR_EXCLUSIVE  _IOWR('H', 0x04, struct hmm_dmirror_cmd)
 #define HMM_DMIRROR_CHECK_EXCLUSIVE_IOWR('H', 0x05, struct hmm_dmirror_cmd)
+#define HMM_DMIRROR_GET_MEM_DEV_TYPE   _IOWR('H', 0x06, struct hmm_dmirror_cmd)
 
 /*
  * Values returned in hmm_dmirror_cmd.ptr for HMM_DMIRROR_SNAPSHOT.
@@ -62,4 +64,9 @@ enum {
HMM_DMIRROR_PROT_DEV_PRIVATE_REMOTE = 0x30,
 };
 
+enum {
+   /* 0 is reserved to catch uninitialized type fields */
+   HMM_DMIRROR_MEMORY_DEVICE_PRIVATE = 1,
+};
+
 #endif /* _LIB_TEST_HMM_UAPI_H */
-- 
2.32.0

[PATCH v1 4/9] drm/amdkfd: coherent type as sys mem on migration to ram

Coherent device type memory on VRAM to RAM migration, has similar access
as System RAM from the CPU. This flag sets the source from the sender.
Which in Coherent type case, should be set as
MIGRATE_VMA_SELECT_DEVICE_COHERENT.

Signed-off-by: Alex Sierra 
Reviewed-by: Felix Kuehling 
---
 drivers/gpu/drm/amd/amdkfd/kfd_migrate.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
index 9e36fe8aea0f..3e405f078ade 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
@@ -661,9 +661,12 @@ svm_migrate_vma_to_ram(struct amdgpu_device *adev, struct 
svm_range *prange,
migrate.vma = vma;
migrate.start = start;
migrate.end = end;
-   migrate.flags = MIGRATE_VMA_SELECT_DEVICE_PRIVATE;
migrate.pgmap_owner = SVM_ADEV_PGMAP_OWNER(adev);
 
+   if (adev->gmc.xgmi.connected_to_cpu)
+   migrate.flags = MIGRATE_VMA_SELECT_DEVICE_COHERENT;
+   else
+   migrate.flags = MIGRATE_VMA_SELECT_DEVICE_PRIVATE;
size = 2 * sizeof(*migrate.src) + sizeof(uint64_t) + sizeof(dma_addr_t);
size *= npages;
buf = kvmalloc(size, GFP_KERNEL | __GFP_ZERO);
-- 
2.32.0

[PATCH v1 1/9] mm: add zone device coherent type memory support