Re: [PATCH v5 3/5] drm/msm/dp: set stream_pixel rate directly

2022-03-03 Thread Dmitry Baryshkov
On Fri, 4 Mar 2022 at 07:31, Stephen Boyd  wrote:
>
> Quoting Dmitry Baryshkov (2022-03-03 20:23:06)
> > On Fri, 4 Mar 2022 at 01:32, Stephen Boyd  wrote:
> > >
> > > Quoting Dmitry Baryshkov (2022-02-16 21:55:27)
> > > > The only clock for which we set the rate is the "stream_pixel". Rather
> > > > than storing the rate and then setting it by looping over all the
> > > > clocks, set the clock rate directly.
> > > >
> > > > Signed-off-by: Dmitry Baryshkov 
> > > [...]
> > > > diff --git a/drivers/gpu/drm/msm/dp/dp_ctrl.c 
> > > > b/drivers/gpu/drm/msm/dp/dp_ctrl.c
> > > > index 07f6bf7e1acb..8e6361dedd77 100644
> > > > --- a/drivers/gpu/drm/msm/dp/dp_ctrl.c
> > > > +++ b/drivers/gpu/drm/msm/dp/dp_ctrl.c
> > > > @@ -1315,7 +1315,7 @@ static void dp_ctrl_set_clock_rate(struct 
> > > > dp_ctrl_private *ctrl,
> > > > DRM_DEBUG_DP("setting rate=%lu on clk=%s\n", rate, name);
> > > >
> > > > if (num)
> > > > -   cfg->rate = rate;
> > > > +   clk_set_rate(cfg->clk, rate);
> > >
> > > This looks bad. From what I can tell we set the rate of the pixel clk
> > > after enabling the phy and configuring it. See the order of operations
> > > in dp_ctrl_enable_mainlink_clocks() and note how dp_power_clk_enable()
> > > is the one that eventually sets a rate through dp_power_clk_set_rate()
> > >
> > > dp_ctrl_set_clock_rate(ctrl, DP_CTRL_PM, "ctrl_link",
> > > ctrl->link->link_params.rate * 
> > > 1000);
> > >
> > > phy_configure(phy, &dp_io->phy_opts);
> > > phy_power_on(phy);
> > >
> > > ret = dp_power_clk_enable(ctrl->power, DP_CTRL_PM, true);
> >
> > This code has been changed in the previous patch.
> >
> > Let's get back a bit.
> > Currently dp_ctrl_set_clock_rate() doesn't change the clock rate. It
> > just stores the rate in the config so that later the sequence of
> > dp_power_clk_enable() -> dp_power_clk_set_rate() ->
> > [dp_power_clk_set_link_rate() -> dev_pm_opp_set_rate() or
> > msm_dss_clk_set_rate() -> clk_set_rate()] will use that.
> >
> > There are only two users of dp_ctrl_set_clock_rate():
> > - dp_ctrl_enable_mainlink_clocks(), which you have quoted above.
> >   This case is handled in the patch 1 from this series. It makes
>
> Patch 1 from this series says DP is unaffected. Huh?
>
> > dp_ctrl_enable_mainlink_clocks() call dev_pm_opp_set_rate() directly
> > without storing (!) the rate in the config, calling
> > phy_configure()/phy_power_on() and then setting the opp via the
> > sequence of calls specified above

Note, this handles the "ctrl_link" clock.

> >
> > - dp_ctrl_enable_stream_clocks(), which calls dp_power_clk_enable()
> > immediately afterwards. This call would set the stream_pixel rate
> > while enabling stream clocks. As far as I can see, the stream_pixel is
> > the only stream clock. So this patch sets the clock rate without
> > storing in the interim configuration data.
> >
> > Could you please clarify, what exactly looks bad to you?
> >

Note, this handles the "stream_pixel" clock.

>
> I'm concerned about the order of operations changing between the
> phy being powered on and the pixel clk frequency being set. From what I
> recall the pixel clk rate operations depend on the phy frequency being
> set (which is done through phy_configure?) so if we call clk_set_rate()
> on the pixel clk before the phy is set then the clk frequency will be
> calculated badly and probably be incorrect.

But the order of operations is mostly unchanged. The only major change
is that the OPP point is now set before calling
phy_configure()/phy_power_on().

For the pixel clock the driver has:
static int dp_ctrl_enable_stream_clocks(struct dp_ctrl_private *ctrl)
{
	int ret = 0;

	dp_ctrl_set_clock_rate(ctrl, DP_STREAM_PM, "stream_pixel",
			       ctrl->dp_ctrl.pixel_rate * 1000);

	ret = dp_power_clk_enable(ctrl->power, DP_STREAM_PM, true);
	[skipped the error handling]
}

dp_power_clk_enable() doesn't have any special handlers for
DP_STREAM_PM, so this code would be equivalent to the following pseudo
code (given that there is only one stream clock):

unsigned int rate = ctrl->dp_ctrl.pixel_rate * 1000;

/* dp_ctrl_set_clock_rate() */
cfg = find_clock_cfg("stream_pixel");
cfg->rate = rate;

/* dp_power_clk_enable() */
clk = find_clock("stream_pixel");
clk_set_rate(clk, cfg->rate);
clk_prepare_enable(clk);

The proposed patch does exactly this.
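
For illustration only, the same equivalence written against the common clk
API (a sketch of the argument above, not the literal driver code; the cfg
and clk lookups and error handling are omitted):

	/* before the patch: stash the rate, apply it later */
	cfg->rate = rate;			/* dp_ctrl_set_clock_rate() */
	clk_set_rate(cfg->clk, cfg->rate);	/* dp_power_clk_enable() */
	clk_prepare_enable(cfg->clk);

	/* after the patch: apply the rate immediately, enabling is unchanged */
	clk_set_rate(cfg->clk, rate);		/* dp_ctrl_set_clock_rate() */
	clk_prepare_enable(cfg->clk);		/* dp_power_clk_enable() */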

Please correct me if I'm wrong.

-- 
With best wishes
Dmitry


Re: [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr

2022-03-03 Thread Xiaomeng Tong
A correction for a typo:

-for (struct list_head *list = head->next, cond = (struct list_head *)-1; cond == (struct list_head *)-1; cond = NULL) \
+for (struct list_head *list = head->next, *cond = (struct list_head *)-1; cond == (struct list_head *)-1; cond = NULL) \

--
Xiaomeng Tong


RE: [v1] drm/msm/disp/dpu1: add inline rotation support for sc7280 target

2022-03-03 Thread Vinod Polimera
> 
> On 18/02/2022 14:30, Vinod Polimera wrote:
> > - Some DPU versions support inline rot90. It is supported only for
> > limited amount of UBWC formats.
> > - There are two versions of inline rotators, v1 (present on sm8250 and
> > sm7250) and v2 (sc7280). These versions differ in the list of supported
> > formats and in the scaler possibilities.
> >
> > Changes in RFC:
> > - Rebase changes to the latest code base.
> > - Append rotation config variables with v2 and
> > remove unused variables.(Dmitry)
> > - Move pixel_ext setup separately from scaler3 config.(Dmitry)
> > - Add 270 degree rotation to supported rotation list.(Dmitry)
> >
> > Signed-off-by: Kalyan Thota 
> > Signed-off-by: Vinod Polimera 
> > ---
> >   drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c |  44 ---
> >   drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h |  15 
> >   drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c  | 105
> -
> >   drivers/gpu/drm/msm/disp/dpu1/dpu_plane.h  |   2 +
> >   4 files changed, 134 insertions(+), 32 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
> b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
> > index aa75991..ae17a61 100644
> > --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
> > +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
> > @@ -25,6 +25,9 @@
> >   #define VIG_SM8250_MASK \
> >   (VIG_MASK | BIT(DPU_SSPP_QOS_8LVL) |
> BIT(DPU_SSPP_SCALER_QSEED3LITE))
> >
> > +#define VIG_SC7280_MASK \
> > + (VIG_SC7180_MASK | BIT(DPU_SSPP_INLINE_ROTATION))
> > +
> >   #define DMA_SDM845_MASK \
> >   (BIT(DPU_SSPP_SRC) | BIT(DPU_SSPP_QOS) |
> BIT(DPU_SSPP_QOS_8LVL) |\
> >   BIT(DPU_SSPP_TS_PREFILL) | BIT(DPU_SSPP_TS_PREFILL_REC1) |\
> > @@ -102,6 +105,8 @@
> >   #define MAX_DOWNSCALE_RATIO 4
> >   #define SSPP_UNITY_SCALE 1
> >
> > +#define INLINE_ROTATOR_V2 2
> 
> Unused
> 
> > +
> >   #define STRCAT(X, Y) (X Y)
> >
> >   static const uint32_t plane_formats[] = {
> > @@ -177,6 +182,11 @@ static const uint32_t plane_formats_yuv[] = {
> >   DRM_FORMAT_YVU420,
> >   };
> >
> > +static const uint32_t rotation_v2_formats[] = {
> > + DRM_FORMAT_NV12,
> > + /* TODO add formats after validation */
> > +};
> > +
> >
> /**
> ***
> >* DPU sub blocks config
> >
> **
> ***/
> > @@ -465,7 +475,13 @@ static const struct dpu_ctl_cfg sc7280_ctl[] = {
> >
> >   /* SSPP common configuration */
> >
> > -#define _VIG_SBLK(num, sdma_pri, qseed_ver) \
> > +static const struct dpu_rotation_cfg dpu_rot_cfg_v2 = {
> > + .rot_maxheight = 1088,
> 
> Is the maxheight expected to be common between the SoC generations?
> You are declaring it inside generic `dpu_rot_cfg_v2`, which means that
> the struct will be used unchanged for several platforms.

Changed 'dpu_rot_cfg_v2' to 'dpu_rot_sc7280_cfg_v2' so that it will be
specific to sc7280.
> 
> > + .rot_num_formats = ARRAY_SIZE(rotation_v2_formats),
> > + .rot_format_list = rotation_v2_formats,
> > +};
> 
> This should come later, together with the rest of structures.
> 
> > +
> > +#define _VIG_SBLK(num, sdma_pri, qseed_ver, rot_cfg) \
> >   { \
> >   .maxdwnscale = MAX_DOWNSCALE_RATIO, \
> >   .maxupscale = MAX_UPSCALE_RATIO, \
> > @@ -482,6 +498,7 @@ static const struct dpu_ctl_cfg sc7280_ctl[] = {
> >   .num_formats = ARRAY_SIZE(plane_formats_yuv), \
> >   .virt_format_list = plane_formats, \
> >   .virt_num_formats = ARRAY_SIZE(plane_formats), \
> > + .rotation_cfg = rot_cfg, \
> >   }
> >
> >   #define _DMA_SBLK(num, sdma_pri) \
> > @@ -498,13 +515,13 @@ static const struct dpu_ctl_cfg sc7280_ctl[] = {
> >   }
> >
> >   static const struct dpu_sspp_sub_blks sdm845_vig_sblk_0 =
> > - _VIG_SBLK("0", 5, DPU_SSPP_SCALER_QSEED3);
> > + _VIG_SBLK("0", 5, DPU_SSPP_SCALER_QSEED3, NULL);
> >   static const struct dpu_sspp_sub_blks sdm845_vig_sblk_1 =
> > - _VIG_SBLK("1", 6, DPU_SSPP_SCALER_QSEED3);
> > + _VIG_SBLK("1", 6, DPU_SSPP_SCALER_QSEED3, NULL);
> >   static const struct dpu_sspp_sub_blks sdm845_vig_sblk_2 =
> > - _VIG_SBLK("2", 7, DPU_SSPP_SCALER_QSEED3);
> > + _VIG_SBLK("2", 7, DPU_SSPP_SCALER_QSEED3, NULL);
> >   static const struct dpu_sspp_sub_blks sdm845_vig_sblk_3 =
> > - _VIG_SBLK("3", 8, DPU_SSPP_SCALER_QSEED3);
> > + _VIG_SBLK("3", 8, DPU_SSPP_SCALER_QSEED3, NULL);
> >
> >   static const struct dpu_sspp_sub_blks sdm845_dma_sblk_0 =
> _DMA_SBLK("8", 1);
> >   static const struct dpu_sspp_sub_blks sdm845_dma_sblk_1 =
> _DMA_SBLK("9", 2);
> > @@

RE: [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr

2022-03-03 Thread Xiaomeng Tong
> From: Xiaomeng Tong
> > Sent: 03 March 2022 07:27
> > 
> > On Thu, 3 Mar 2022 04:58:23 +, David Laight wrote:
> > > on 3 Mar 2022 10:27:29 +0800, Xiaomeng Tong wrote:
> > > > The problem is the mis-use of iterator outside the loop on exit, and
> > > > the iterator will be the HEAD's container_of pointer which pointers
> > > > to a type-confused struct. Sidenote: The *mis-use* here refers to
> > > > mistakely access to other members of the struct, instead of the
> > > > list_head member which acutally is the valid HEAD.
> > >
> > > The problem is that the HEAD's container_of pointer should never
> > > be calculated at all.
> > > This is what is fundamentally broken about the current definition.
> > 
> > Yes, the rule is "the HEAD's container_of pointer should never be
> > calculated at all outside the loop", but how do you make sure everyone
> > follows this rule?
> > Everyone makes mistakes, but we can eliminate them all from the beginning
> > with the help of compiler which can catch such use-after-loop things.
> > 
> > > > IOW, you would dereference a (NULL + offset_of_member) address here.
> > >
> > >Where?
> > 
> > In the case where a developer do not follows the above rule, and mistakely
> > access a non-list-head member of the HEAD's container_of pointer outside
> > the loop. For example:
> > struct req {
> >   int a;
> >   struct list_head h;
> > };
> > struct req *r;
> > list_for_each_entry(r, HEAD, h) {
> >   if (r->a == 0x10)
> > break;
> > }
> > // the developer made a mistake: he didn't take this situation into
> > // account where all entries in the list are *r->a != 0x10*, and now
> > // the r is the HEAD's container_of pointer.
> > r->a = 0x20;
> > Thus the "r->a = 0x20" would dereference a (NULL + offset_of_member)
> > address here.
> 
> That is just a bug.
> No different to failing to check anything else might 'return'
> a NULL pointer.

Yes, but it's a mistake everyone has made and will make; we should avoid
this from the beginning with the help of the compiler.

> Because it is a NULL dereference you find out pretty quickly.

AFAIK, NULL dereference is undefined behavior; can the compiler quickly
catch it? Or can it only be applied to some simple/restricted cases?

> The existing loop leaves you with a valid pointer to something
> that isn't a list item.
> 
> > > > Please remind me if i missed something, thanks.
> > > >
> > > > Can you share your "alternative definitions" details? thanks!
> > >
> > > The loop should probably use as extra variable that points
> > > to the 'list node' in the next structure.
> > > Something like:
> > >   for (xxx *iter = head->next;
> > >   iter == &head ? ((item = NULL),0) : ((item = list_item(iter),1));
> > >   iter = item->member->next) {
> > >  ...
> > > With a bit of casting you can use 'item' to hold 'iter'.
> > 
> > you still can not make sure everyone follows this rule:
> > "do not use iterator outside the loop" without the help of compiler,
> > because item is declared outside the loop.
> 
> That one has 'iter' defined in the loop.

Oh, sorry, I misunderstood. Then this is the same as my way of
list_for_each_entry_inside(pos, type, head, member), which declares
the iterator inside the loop.
You go further and make things like "&pos->member == (head)" go away
to avoid calculating the HEAD's container_of pointer, although the
calculation is harmless.
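
For reference, a minimal sketch of the declare-inside idea (illustrative
only, not the exact definition from my patch):

#define list_for_each_entry_inside(pos, type, head, member)		\
	for (type *pos = list_first_entry(head, type, member);		\
	     &pos->member != (head);					\
	     pos = list_next_entry(pos, member))

Because pos is declared in the scope of the for statement, any use of pos
after the loop is a compile-time error, which is exactly the property I am
arguing for.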

> 
> > BTW, to avoid ambiguity,the "alternative definitions" here i asked is
> > something from you in this context:
> > "OTOH there may be alternative definitions that can be used to get
> > the compiler (or other compiler-like tools) to detect broken code.
> > Even if the definition can't possibly generate a working kerrnel."
> 
> I was thinking of something like:
>   if ((pos = list_first)), 1) pos = NULL else
> so that unchecked dereferences after the loop will be detectable
> as NULL pointer offsets - but that in itself isn't enough to avoid
> other warnings.
> 

Do you mean putting this right after the loop (I changed something that I
do not understand; please correct me if I am wrong, thanks):
   if (pos == list_first) pos = NULL; else
and then the compiler can detect the following NULL dereference?
But if the NULL dereference is just right after the loop originally,
it will be masked by the *else* keyword.

> > > > The "list_for_each_entry_inside(pos, type, head, member)" way makes
> > > > the iterator invisiable outside the loop, and would be catched by
> > > > compiler if use-after-loop things happened.
> > 
> > > It is also a complete PITA for anything doing a search.
> > 
> > You mean it would be a burden on search? can you show me some examples?
> 
> The whole business of having to save the pointer to the located item
> before breaking the loop, remembering to have set it to NULL earlier etc.

OK, I see. And then you need to pass an "item" to the list_for_each_entry
macro as a new argument.

> 
> It is so much better if y

Re: [PATCH v3 00/21] DEPT(Dependency Tracker)

2022-03-03 Thread Hyeonggon Yoo
On Thu, Mar 03, 2022 at 06:48:24PM +0900, Byungchul Park wrote:
> On Thu, Mar 03, 2022 at 08:03:21AM +, Hyeonggon Yoo wrote:
> > On Thu, Mar 03, 2022 at 09:18:13AM +0900, Byungchul Park wrote:
> > > Hi Hyeonggon,
> > > 
> > > Dept also allows the following scenario when a user guarantees that
> > > each lock instance is different from another at a different depth:
> > >
> > >lock A0 with depth
> > >lock A1 with depth + 1
> > >lock A2 with depth + 2
> > >lock A3 with depth + 3
> > >(and so on)
> > >..
> > >unlock A3
> > >unlock A2
> > >unlock A1
> > >unlock A0
> 

[+Cc kmemleak maintainer]

> Look at this. Dept allows object->lock -> other_object->lock (with a
> different depth using *_lock_nested()) so won't report it.
>

No, it did.

S: object->lock ( _raw_spin_lock_irqsave)
W: other_object->lock (_raw_spin_lock_nested)

DEPT reported this as AA deadlock.

===
DEPT: Circular dependency has been detected.
5.17.0-rc1+ #1 Tainted: GW
---
summary
---
*** AA DEADLOCK ***

context A
[S] __raw_spin_lock_irqsave(&object->lock:0)
[W] _raw_spin_lock_nested(&object->lock:0)
[E] spin_unlock(&object->lock:0)

[S]: start of the event context
[W]: the wait blocked
[E]: the event not reachable
---
context A's detail
---
context A
[S] __raw_spin_lock_irqsave(&object->lock:0)
[W] _raw_spin_lock_nested(&object->lock:0)
[E] spin_unlock(&object->lock:0)
---
context A's detail
---
context A
[S] __raw_spin_lock_irqsave(&object->lock:0)
[W] _raw_spin_lock_nested(&object->lock:0)
[E] spin_unlock(&object->lock:0)

[S] __raw_spin_lock_irqsave(&object->lock:0):
[] scan_gray_list+0x84/0x13c
stacktrace:
  dept_ecxt_enter+0x88/0xf4
  _raw_spin_lock_irqsave+0xf0/0x1c4
  scan_gray_list+0x84/0x13c
  kmemleak_scan+0x2d8/0x54c
  kmemleak_scan_thread+0xac/0xd4
  kthread+0xd4/0xe4
  ret_from_fork+0x10/0x20

[W] _raw_spin_lock_nested(&object->lock:0):
[] scan_block+0xb4/0x128
stacktrace:
  __dept_wait+0x8c/0xa4
  dept_wait+0x6c/0x88
  _raw_spin_lock_nested+0xa8/0x1b0
  scan_block+0xb4/0x128
  scan_gray_list+0xc4/0x13c
  kmemleak_scan+0x2d8/0x54c
  kmemleak_scan_thread+0xac/0xd4
  kthread+0xd4/0xe4
  ret_from_fork+0x10/0x20

[E] spin_unlock(&object->lock:0):
[] scan_block+0x60/0x128
---
information that might be helpful
---
CPU: 2 PID: 38 Comm: kmemleak Tainted: GW 5.17.0-rc1+ #1
Hardware name: linux,dummy-virt (DT)
Call trace:
 dump_backtrace.part.0+0x9c/0xc4
 show_stack+0x14/0x28
 dump_stack_lvl+0x9c/0xcc
 dump_stack+0x14/0x2c
 print_circle+0x2d4/0x438
 cb_check_dl+0x44/0x70
 bfs+0x60/0x168
 add_dep+0x88/0x11c
 add_wait+0x2d0/0x2dc
 __dept_wait+0x8c/0xa4
 dept_wait+0x6c/0x88
 _raw_spin_lock_nested+0xa8/0x1b0
 scan_block+0xb4/0x128
 scan_gray_list+0xc4/0x13c
 kmemleak_scan+0x2d8/0x54c
 kmemleak_scan_thread+0xac/0xd4
 kthread+0xd4/0xe4
 ret_from_fork+0x10/0x20

> > > However, Dept does not allow the following scenario where another lock
> > > class cuts in the dependency chain:
> > > 
> > >lock A0 with depth
> > >lock B
> > >lock A1 with depth + 1
> > >lock A2 with depth + 2
> > >lock A3 with depth + 3
> > >(and so on)
> > >..
> > >unlock A3
> > >unlock A2
> > >unlock A1
> > >unlock B
> > >unlock A0
> > > 
> > > This scenario is clearly problematic. What do you think is going to
> > > happen with another context running the following?
> > >
> > 
> > First of all, I want to say I'm not expert at locking primitives.
> > I may be wrong.
> 
> It's okay. Thanks anyway for your feedback.
>

Thanks.

> > > >   45  *   scan_mutex [-> object->lock] -> kmemleak_lock -> other_object->lock (SINGLE_DEPTH_NESTING)
> > > >   46  *
> > > >   47  * No kmemleak_lock and object->lock nesting is allowed outside scan_mutex
> > > >   48  * regions.
> > 
> > lock order in kmemleak is described above.
> > 
> > and DEPT detects two cases as deadlock:
> > 
> > 1) object->lock -> other_object->lock
> 
> It's not a deadlock *IF* two have different depth using *_lock_nested().
> Dept also allows this case. So Dept wouldn't report it.
>
> > 2) object->lock -> kmemleak_lock, kmemleak_lock -> other_object->lock
>
> But this usage is risky. I already explained it in the mail you replied
> to. I copied it. See below.
>

I understand why you said this is risky.
Its lock ordering is not good.
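
(For readers following along, the nesting pattern being discussed looks
roughly like this; a simplified sketch of the kmemleak-style usage, not the
actual kmemleak code:)

	unsigned long flags;

	/* outer lock, depth 0 */
	raw_spin_lock_irqsave(&object->lock, flags);
	/* inner lock of the same class, explicitly marked one level deeper */
	raw_spin_lock_nested(&other_object->lock, SINGLE_DEPTH_NESTING);
	/* ... scan other_object ... */
	raw_spin_unlock(&other_object->lock);
	raw_spin_unlock_irqrestore(&object->lock, flags);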

> context A
> > >lock A0 with depth
> > >lock B
> > >lock A1 with depth + 1
> > >lock A2 with depth + 2
> > >l

Re: [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr

2022-03-03 Thread Xiaomeng Tong
On Thu, 3 Mar 2022 12:18:24 +, Daniel Thompson wrote:
> On Thu, Mar 03, 2022 at 03:26:57PM +0800, Xiaomeng Tong wrote:
> > On Thu, 3 Mar 2022 04:58:23 +, David Laight wrote:
> > > on 3 Mar 2022 10:27:29 +0800, Xiaomeng Tong wrote:
> > > > The problem is the mis-use of iterator outside the loop on exit, and
> > > > the iterator will be the HEAD's container_of pointer which pointers
> > > > to a type-confused struct. Sidenote: The *mis-use* here refers to
> > > > mistakely access to other members of the struct, instead of the
> > > > list_head member which acutally is the valid HEAD.
> > >
> > > The problem is that the HEAD's container_of pointer should never
> > > be calculated at all.
> > > This is what is fundamentally broken about the current definition.
> > 
> > Yes, the rule is "the HEAD's container_of pointer should never be
> > calculated at all outside the loop", but how do you make sure everyone
> > follows this rule?
> 
> Your formulation of the rule is correct: never run container_of() on HEAD
> pointer.

Actually, it is not my rule. My rule is: never access other members
of the struct except for the list_head member after the loop, because
those are invalid members after loop exit, while the list_head member
is valid (it is just HEAD), and the later calculation (&pos->head)
seems harmless.

I have considered the case where the HEAD's container "pos" is laid out
across the max and the min address boundary, which means the address of
HEAD is likely 0x60 and the address of pos is likely 0xffe0.
It seems OK to calculate pos with:
((type *)(__mptr - offsetof(type, member)));
and it seems OK to calculate head outside the loop with:
if (&pos->head == &HEAD)
return NULL;

The only case I can think of where the rule "never run container_of()
on HEAD" must be followed is when the first argument (which is &HEAD)
passed to container_of() is NULL + some offset; it may lead to the
resulting "pos->member" access being a NULL dereference. But maybe
the caller can take the responsibility to check if it is NULL, not
container_of() itself.

Please remind me if I missed something, thanks.

> 
> However the rule that is introduced by list_for_each_entry_inside() is
> *not* this rule. The rule it introduces is: never access the iterator
> variable outside the loop.

Sorry for the confusion; indeed, those are two *different* rules.

> 
> Making the iterator NULL on loop exit does follow the rule you proposed
> but using a different technique: do not allow HEAD to be stored in the
> iterator variable after loop exit. This also makes it impossible to run
> container_of() on the HEAD pointer.
> 

It does not. My rule is: never access the iterator variable outside the loop.
The "make the iterator NULL on loop exit" way still leaks pos (as NULL)
outside the loop, which may lead to a NULL dereference.
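
(For concreteness, the NULLify-on-exit variant being discussed could look
roughly like the following; the macro name and exact form are hypothetical,
purely for illustration:)

#define list_for_each_entry_or_null(pos, head, member)			\
	for (pos = list_first_entry(head, typeof(*pos), member);	\
	     &pos->member != (head) || ((pos = NULL), 0);		\
	     pos = list_next_entry(pos, member))

With this form, pos keeps the found entry on a break but becomes NULL when
the loop runs to completion, which is the NULL value the dereference concern
above is about.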

> 
> > Everyone makes mistakes, but we can eliminate them all from the beginning
> > with the help of compiler which can catch such use-after-loop things.
> 
> Indeed but if we introduce new interfaces then we don't have to worry
> about existing usages and silent regressions. Code will have been
> written knowing the loop can exit with the iterator set to NULL.

Yes, it is simpler and compatible with existing interfaces. However,
you would have to make every developer remember that "pos will be set to
NULL on loop exit", which is unreasonable and impossible for *every* single
person. Otherwise the misuse after the loop will lead to a NULL dereference.
But we can kill this problem by declaring the iterator inside the loop, and
the compiler will catch it if somebody misuses it after the loop.

> 
> Sure it is still possible for programmers to make mistakes and
> dereference the NULL pointer but C programmers are well training w.r.t.
> NULL pointer checking so such mistakes are much less likely than with
> the current list_for_each_entry() macro. This risk must be offset
> against the way a NULLify approach can lead to more elegant code when we
> are doing a list search.
> 

Yes, the NULLify approach is better than the current list_for_each_entry()
macro, but I still hold that the list_for_each_entry_inside() way is the
best and perfect _technically_.

Thus, my idea is *better a finger off than always aching*: let's settle this
damn problem once and for all, with list_for_each_entry_inside().

--
Xiaomeng Tong


Re: [PATCH] i2c: at91: use dma safe buffers

2022-03-03 Thread Christian König

Am 03.03.22 um 17:17 schrieb Michael Walle:

The supplied buffer might be on the stack and we get the following error
message:
[3.312058] at91_i2c e0070600.i2c: rejecting DMA map of vmalloc memory

Use i2c_{get,put}_dma_safe_msg_buf() to get a DMA-able memory region if
necessary.

Cc: sta...@vger.kernel.org
Signed-off-by: Michael Walle 
---

I'm not sure if or which Fixes: tag I should add to this patch. The issue
seems to have existed for a very long time, but nobody seems to have
triggered it. FWIW, I'm using the sff,sfp driver, which triggers this.

  drivers/i2c/busses/i2c-at91-master.c | 12 
  1 file changed, 12 insertions(+)

diff --git a/drivers/i2c/busses/i2c-at91-master.c 
b/drivers/i2c/busses/i2c-at91-master.c
index b0eae94909f4..a7a22fedbaba 100644
--- a/drivers/i2c/busses/i2c-at91-master.c
+++ b/drivers/i2c/busses/i2c-at91-master.c
@@ -656,6 +656,7 @@ static int at91_twi_xfer(struct i2c_adapter *adap, struct 
i2c_msg *msg, int num)
unsigned int_addr_flag = 0;
struct i2c_msg *m_start = msg;
bool is_read;
+   u8 *dma_buf;


Maybe call your variable differently. DMA-buf is an inter-driver buffer
sharing framework we use for GPU acceleration and V4L.


It doesn't cause any technical issues, but the maintainer regex now 
triggers on that. So you are CCing people not related to this code in 
any way.


Regards,
Christian.

  
  	dev_dbg(&adap->dev, "at91_xfer: processing %d messages:\n", num);
  
@@ -703,7 +704,18 @@ static int at91_twi_xfer(struct i2c_adapter *adap, struct i2c_msg *msg, int num)

dev->msg = m_start;
dev->recv_len_abort = false;
  
+	if (dev->use_dma) {
+		dma_buf = i2c_get_dma_safe_msg_buf(m_start, 1);
+		if (!dma_buf) {
+			ret = -ENOMEM;
+			goto out;
+		}
+		dev->buf = dma_buf;
+	}
+
 	ret = at91_do_twi_transfer(dev);
+	i2c_put_dma_safe_msg_buf(dma_buf, m_start, !ret);
  
  	ret = (ret < 0) ? ret : num;

  out:




Re: [Intel-gfx] [PATCH 1/2] drm/i915/xehp: Support platforms with CCS engines but no RCS

2022-03-03 Thread Lucas De Marchi

On Thu, Mar 03, 2022 at 02:34:34PM -0800, Matt Roper wrote:

In the past we've always assumed that an RCS engine is present on every
platform.  However now that we have compute engines there may be
platforms that have CCS engines but no RCS, or platforms that are
designed to have both, but have the RCS engine fused off.

Various engine-centric initialization that only needs to be done a
single time for the group of RCS+CCS engines can't rely on being set up
with the RCS now; instead we add an I915_ENGINE_FIRST_RENDER_COMPUTE flag
that will be assigned to a single engine in the group; whichever engine
has this flag will be responsible for some of the general setup
(RCU_MODE programming, initialization of certain workarounds, etc.).

Signed-off-by: Matt Roper 



Reviewed-by: Lucas De Marchi 

Lucas De Marchi


[PATCH v4 24/24] dept: Disable Dept on that map once it's been handled until next turn

2022-03-03 Thread Byungchul Park
Dept works with waits preceding an event, which might lead to a deadlock.
Once the event has been handled, it's hard to ensure that further waits
actually contribute to a deadlock until the next turn, which will start when
a sleep associated with that map happens.

So let Dept start tracking dependencies when a sleep happens and stop
tracking dependencies once the event, i.e. the wake up, has been handled.

Signed-off-by: Byungchul Park 
---
 kernel/dependency/dept.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/kernel/dependency/dept.c b/kernel/dependency/dept.c
index cc1b3a3..1c91db8 100644
--- a/kernel/dependency/dept.c
+++ b/kernel/dependency/dept.c
@@ -2325,6 +2325,12 @@ void dept_event(struct dept_map *m, unsigned long e_f, 
unsigned long ip,
do_event((void *)m, c, READ_ONCE(m->wgen), ip);
pop_ecxt((void *)m);
}
+
+   /*
+* Keep the map disabled until the next sleep.
+*/
+   WRITE_ONCE(m->wgen, 0);
+
dept_exit(flags);
 }
 EXPORT_SYMBOL_GPL(dept_event);
@@ -2447,6 +2453,11 @@ void dept_event_split_map(struct dept_map_each *me,
pop_ecxt((void *)me);
}
 
+   /*
+* Keep the map disabled until the next sleep.
+*/
+   WRITE_ONCE(me->wgen, 0);
+
dept_exit(flags);
 }
 EXPORT_SYMBOL_GPL(dept_event_split_map);
-- 
1.9.1



[PATCH v4 08/24] dept: Apply Dept to wait_for_completion()/complete()

2022-03-03 Thread Byungchul Park
Makes Dept able to track dependencies by
wait_for_completion()/complete().
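
(Illustration only, not part of the patch: a typical pair that this change
lets Dept track. With this patch, the wait_for_completion() below records a
Dept wait on the completion's dept_map and complete() records the
corresponding event.)

static DECLARE_COMPLETION(setup_done);

static int waiter_thread(void *unused)
{
	/* recorded by Dept as a wait on setup_done's map */
	wait_for_completion(&setup_done);
	return 0;
}

static void setup_finished(void)
{
	/* recorded by Dept as the event that resolves the wait above */
	complete(&setup_done);
}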

Signed-off-by: Byungchul Park 
---
 include/linux/completion.h | 42 --
 kernel/sched/completion.c  | 12 ++--
 2 files changed, 50 insertions(+), 4 deletions(-)

diff --git a/include/linux/completion.h b/include/linux/completion.h
index 51d9ab0..a1ad5a8 100644
--- a/include/linux/completion.h
+++ b/include/linux/completion.h
@@ -26,14 +26,48 @@
 struct completion {
unsigned int done;
struct swait_queue_head wait;
+   struct dept_map dmap;
 };
 
+#ifdef CONFIG_DEPT
+#define dept_wfc_init(m, k, s, n)  dept_map_init(m, k, s, n)
+#define dept_wfc_reinit(m) dept_map_reinit(m)
+#define dept_wfc_wait(m, ip)   \
+do {   \
+   dept_ask_event(m);  \
+   dept_wait(m, 1UL, ip, __func__, 0); \
+} while (0)
+#define dept_wfc_complete(m, ip)   dept_event(m, 1UL, ip, __func__)
+#define dept_wfc_enter(m, ip)  dept_ecxt_enter(m, 1UL, ip, 
"completion_context_enter", "complete", 0)
+#define dept_wfc_exit(m, ip)   dept_ecxt_exit(m, ip)
+#else
+#define dept_wfc_init(m, k, s, n)  do { (void)(n); (void)(k); } 
while (0)
+#define dept_wfc_reinit(m) do { } while (0)
+#define dept_wfc_wait(m, ip)   do { } while (0)
+#define dept_wfc_complete(m, ip)   do { } while (0)
+#define dept_wfc_enter(m, ip)  do { } while (0)
+#define dept_wfc_exit(m, ip)   do { } while (0)
+#endif
+
+#ifdef CONFIG_DEPT
+#define WFC_DEPT_MAP_INIT(work) .dmap = { .name = #work, .skip_cnt = 
ATOMIC_INIT(0) }
+#else
+#define WFC_DEPT_MAP_INIT(work)
+#endif
+
+#define init_completion(x) \
+   do {\
+   static struct dept_key __dkey;  \
+   __init_completion(x, &__dkey, #x);  \
+   } while (0)
+
 #define init_completion_map(x, m) init_completion(x)
 static inline void complete_acquire(struct completion *x) {}
 static inline void complete_release(struct completion *x) {}
 
 #define COMPLETION_INITIALIZER(work) \
-   { 0, __SWAIT_QUEUE_HEAD_INITIALIZER((work).wait) }
+   { 0, __SWAIT_QUEUE_HEAD_INITIALIZER((work).wait), \
+   WFC_DEPT_MAP_INIT(work) }
 
 #define COMPLETION_INITIALIZER_ONSTACK_MAP(work, map) \
(*({ init_completion_map(&(work), &(map)); &(work); }))
@@ -81,9 +115,12 @@ static inline void complete_release(struct completion *x) {}
  * This inline function will initialize a dynamically created completion
  * structure.
  */
-static inline void init_completion(struct completion *x)
+static inline void __init_completion(struct completion *x,
+struct dept_key *dkey,
+const char *name)
 {
x->done = 0;
+   dept_wfc_init(&x->dmap, dkey, 0, name);
init_swait_queue_head(&x->wait);
 }
 
@@ -97,6 +134,7 @@ static inline void init_completion(struct completion *x)
 static inline void reinit_completion(struct completion *x)
 {
x->done = 0;
+   dept_wfc_reinit(&x->dmap);
 }
 
 extern void wait_for_completion(struct completion *);
diff --git a/kernel/sched/completion.c b/kernel/sched/completion.c
index a778554..6e31cc0 100644
--- a/kernel/sched/completion.c
+++ b/kernel/sched/completion.c
@@ -29,6 +29,7 @@ void complete(struct completion *x)
 {
unsigned long flags;
 
+   dept_wfc_complete(&x->dmap, _RET_IP_);
raw_spin_lock_irqsave(&x->wait.lock, flags);
 
if (x->done != UINT_MAX)
@@ -58,6 +59,7 @@ void complete_all(struct completion *x)
 {
unsigned long flags;
 
+   dept_wfc_complete(&x->dmap, _RET_IP_);
lockdep_assert_RT_in_threaded_ctx();
 
raw_spin_lock_irqsave(&x->wait.lock, flags);
@@ -112,17 +114,23 @@ void complete_all(struct completion *x)
 }
 
 static long __sched
-wait_for_common(struct completion *x, long timeout, int state)
+_wait_for_common(struct completion *x, long timeout, int state)
 {
return __wait_for_common(x, schedule_timeout, timeout, state);
 }
 
 static long __sched
-wait_for_common_io(struct completion *x, long timeout, int state)
+_wait_for_common_io(struct completion *x, long timeout, int state)
 {
return __wait_for_common(x, io_schedule_timeout, timeout, state);
 }
 
+#define wait_for_common(x, t, s)   \
+({ dept_wfc_wait(&(x)->dmap, _RET_IP_); _wait_for_common(x, t, s); })
+
+#define wait_for_common_io(x, t, s)\
+({ dept_wfc_wait(&(x)->dmap, _RET_IP_); _wait_for_common_io(x, t, s); })
+
 /**
  * wait_for_completion: - waits for completion of a tas

[PATCH v4 20/24] dept: Add nocheck version of init_completion()

2022-03-03 Thread Byungchul Park
For completions that shouldn't be tracked by Dept, add
init_completion_nocheck() to disable Dept on them.

Signed-off-by: Byungchul Park 
---
 include/linux/completion.h | 14 +++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/include/linux/completion.h b/include/linux/completion.h
index a1ad5a8..9bd3bc9 100644
--- a/include/linux/completion.h
+++ b/include/linux/completion.h
@@ -30,6 +30,7 @@ struct completion {
 };
 
 #ifdef CONFIG_DEPT
+#define dept_wfc_nocheck(m)dept_map_nocheck(m)
 #define dept_wfc_init(m, k, s, n)  dept_map_init(m, k, s, n)
 #define dept_wfc_reinit(m) dept_map_reinit(m)
 #define dept_wfc_wait(m, ip)   \
@@ -41,6 +42,7 @@ struct completion {
 #define dept_wfc_enter(m, ip)  dept_ecxt_enter(m, 1UL, ip, 
"completion_context_enter", "complete", 0)
 #define dept_wfc_exit(m, ip)   dept_ecxt_exit(m, ip)
 #else
+#define dept_wfc_nocheck(m)do { } while (0)
 #define dept_wfc_init(m, k, s, n)  do { (void)(n); (void)(k); } 
while (0)
 #define dept_wfc_reinit(m) do { } while (0)
 #define dept_wfc_wait(m, ip)   do { } while (0)
@@ -55,10 +57,11 @@ struct completion {
 #define WFC_DEPT_MAP_INIT(work)
 #endif
 
+#define init_completion_nocheck(x) __init_completion(x, NULL, #x, false)
 #define init_completion(x) \
do {\
static struct dept_key __dkey;  \
-   __init_completion(x, &__dkey, #x);  \
+   __init_completion(x, &__dkey, #x, true);\
} while (0)
 
 #define init_completion_map(x, m) init_completion(x)
@@ -117,10 +120,15 @@ static inline void complete_release(struct completion *x) 
{}
  */
 static inline void __init_completion(struct completion *x,
 struct dept_key *dkey,
-const char *name)
+const char *name, bool check)
 {
x->done = 0;
-   dept_wfc_init(&x->dmap, dkey, 0, name);
+
+   if (check)
+   dept_wfc_init(&x->dmap, dkey, 0, name);
+   else
+   dept_wfc_nocheck(&x->dmap);
+
init_swait_queue_head(&x->wait);
 }
 
-- 
1.9.1



[PATCH v4 22/24] dept: Don't create dependencies between different depths in any case

2022-03-03 Thread Byungchul Park
Dept already prevents creating dependencies between different depths of
the class indicated by *_lock_nested() when the lock acquisitions happen
consecutively.

   lock A0 with depth
   lock_nested A1 with depth + 1
   ...
   unlock A1
   unlock A0

Dept does not create A0 -> A1 dependency in this case, either.

However, once another class cuts in, the code becomes problematic. When
Dept tries to create real dependencies, it creates not only the real
ones but also wrong ones between different depths of the class.

   lock A0 with depth
   lock B
   lock_nested A1 with depth + 1
   ...
   unlock A1
   unlock B
   unlock A0

Even in this case, Dept should not create A0 -> A1 dependency.

So let Dept not create wrong dependencies between different depths of
the class in any case.

Reported-by: 42.hye...@gmail.com
Signed-off-by: Byungchul Park 
---
 kernel/dependency/dept.c | 9 +
 1 file changed, 1 insertion(+), 8 deletions(-)

diff --git a/kernel/dependency/dept.c b/kernel/dependency/dept.c
index 5d4efc3..cc1b3a3 100644
--- a/kernel/dependency/dept.c
+++ b/kernel/dependency/dept.c
@@ -1458,14 +1458,7 @@ static void add_wait(struct dept_class *c, unsigned long 
ip,
 
eh = dt->ecxt_held + i;
if (eh->ecxt->class != c || eh->nest == ne)
-   break;
-   }
-
-   for (; i >= 0; i--) {
-   struct dept_ecxt_held *eh;
-
-   eh = dt->ecxt_held + i;
-   add_dep(eh->ecxt, w);
+   add_dep(eh->ecxt, w);
}
 
if (!wait_consumed(w) && !rich_stack) {
-- 
1.9.1



[PATCH v4 21/24] dept: Disable Dept on struct crypto_larval's completion for now

2022-03-03 Thread Byungchul Park
struct crypto_larval's completion is used for multiple purposes, e.g.
waiting for a test to complete or waiting for a probe to complete.

The completion variable needs to be split according to what it's used
for. Otherwise, Dept cannot distinguish one use from another and doesn't
work properly. Since it isn't split yet, disable Dept on it for now.

Signed-off-by: Byungchul Park 
---
 crypto/api.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/crypto/api.c b/crypto/api.c
index cf0869d..f501b91 100644
--- a/crypto/api.c
+++ b/crypto/api.c
@@ -115,7 +115,12 @@ struct crypto_larval *crypto_larval_alloc(const char 
*name, u32 type, u32 mask)
larval->alg.cra_destroy = crypto_larval_destroy;
 
strlcpy(larval->alg.cra_name, name, CRYPTO_MAX_ALG_NAME);
-   init_completion(&larval->completion);
+   /*
+* TODO: Split ->completion according to what it's used for e.g.
+* ->test_completion, ->probe_completion and the like, so that
+*  Dept can track its dependency properly.
+*/
+   init_completion_nocheck(&larval->completion);
 
return larval;
 }
-- 
1.9.1



[PATCH v4 06/24] dept: Apply Dept to mutex families

2022-03-03 Thread Byungchul Park
Makes Dept able to track dependencies by mutex families.

Signed-off-by: Byungchul Park 
---
 include/linux/lockdep.h | 18 +++---
 include/linux/mutex.h   | 33 +
 include/linux/rtmutex.h |  7 +++
 3 files changed, 55 insertions(+), 3 deletions(-)

diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h
index 529ea18..6653a4f 100644
--- a/include/linux/lockdep.h
+++ b/include/linux/lockdep.h
@@ -615,9 +615,21 @@ static inline void print_irqtrace_events(struct 
task_struct *curr)
 #define seqcount_acquire_read(l, s, t, i)  
lock_acquire_shared_recursive(l, s, t, NULL, i)
 #define seqcount_release(l, i) lock_release(l, i)
 
-#define mutex_acquire(l, s, t, i)  lock_acquire_exclusive(l, s, t, 
NULL, i)
-#define mutex_acquire_nest(l, s, t, n, i)  lock_acquire_exclusive(l, s, t, 
n, i)
-#define mutex_release(l, i)lock_release(l, i)
+#define mutex_acquire(l, s, t, i)  \
+do {   \
+   lock_acquire_exclusive(l, s, t, NULL, i);   \
+   dept_mutex_lock(&(l)->dmap, s, t, NULL, "mutex_unlock", i); \
+} while (0)
+#define mutex_acquire_nest(l, s, t, n, i)  \
+do {   \
+   lock_acquire_exclusive(l, s, t, n, i);  \
+   dept_mutex_lock(&(l)->dmap, s, t, (n) ? &(n)->dmap : NULL, 
"mutex_unlock", i);\
+} while (0)
+#define mutex_release(l, i)\
+do {   \
+   lock_release(l, i); \
+   dept_mutex_unlock(&(l)->dmap, i);   \
+} while (0)
 
 #define rwsem_acquire(l, s, t, i)  lock_acquire_exclusive(l, s, t, 
NULL, i)
 #define rwsem_acquire_nest(l, s, t, n, i)  lock_acquire_exclusive(l, s, t, 
n, i)
diff --git a/include/linux/mutex.h b/include/linux/mutex.h
index 8f226d4..204f976 100644
--- a/include/linux/mutex.h
+++ b/include/linux/mutex.h
@@ -20,11 +20,18 @@
 #include 
 #include 
 
+#ifdef CONFIG_DEPT
+# define DMAP_MUTEX_INIT(lockname) .dmap = { .name = #lockname, .skip_cnt 
= ATOMIC_INIT(0) },
+#else
+# define DMAP_MUTEX_INIT(lockname)
+#endif
+
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
 # define __DEP_MAP_MUTEX_INITIALIZER(lockname) \
, .dep_map = {  \
.name = #lockname,  \
.wait_type_inner = LD_WAIT_SLEEP,   \
+   DMAP_MUTEX_INIT(lockname)   \
}
 #else
 # define __DEP_MAP_MUTEX_INITIALIZER(lockname)
@@ -75,6 +82,32 @@ struct mutex {
 #endif
 };
 
+#ifdef CONFIG_DEPT
+#define dept_mutex_lock(m, ne, t, n, e_fn, ip) \
+do {   \
+   if (t) {\
+   dept_ecxt_enter(m, 1UL, ip, __func__, e_fn, ne);\
+   dept_ask_event(m);  \
+   } else if (n) { \
+   dept_skip(m);   \
+   } else {\
+   dept_wait(m, 1UL, ip, __func__, ne);\
+   dept_ecxt_enter(m, 1UL, ip, __func__, e_fn, ne);\
+   dept_ask_event(m);  \
+   }   \
+} while (0)
+#define dept_mutex_unlock(m, ip)   \
+do {   \
+   if (!dept_unskip_if_skipped(m)) {   \
+   dept_event(m, 1UL, ip, __func__);   \
+   dept_ecxt_exit(m, ip);  \
+   }   \
+} while (0)
+#else
+#define dept_mutex_lock(m, ne, t, n, e_fn, ip) do { } while (0)
+#define dept_mutex_unlock(m, ip)   do { } while (0)
+#endif
+
 #ifdef CONFIG_DEBUG_MUTEXES
 
 #define __DEBUG_MUTEX_INITIALIZER(lockname)\
diff --git a/include/linux/rtmutex.h b/include/linux/rtmutex.h
index 7d04988..712d6e6 100644
--- a/include/linux/rtmutex.h
+++ b/include/linux/rtmutex.h
@@ -76,11 +76,18 @@ static inline void rt_mutex_debug_task_free(struct 
task_struct *tsk) { }
__rt_mutex_init(mutex, __func__, &__key); \
 } while (0)
 
+#ifdef CONFIG_DEPT
+#define DMAP_RT_MUTEX_INIT(mutexname)  .dmap = { .name = #mutexname, .skip_cnt 
= ATOMIC_INIT(0) },
+#else
+#define DMAP

[PATCH v4 18/24] dept: Distinguish each work from another

2022-03-03 Thread Byungchul Park
Workqueue already provides concurrency control. Thanks to that, a wait in
one work doesn't prevent events in other works while the control is enabled.
Thus, each work is better considered a different context.

So let Dept assign a different context id to each work.

Signed-off-by: Byungchul Park 
---
 include/linux/dept.h |  2 ++
 kernel/dependency/dept.c | 10 ++
 kernel/workqueue.c   |  3 +++
 3 files changed, 15 insertions(+)

diff --git a/include/linux/dept.h b/include/linux/dept.h
index 1a1c307..55c5ed5 100644
--- a/include/linux/dept.h
+++ b/include/linux/dept.h
@@ -486,6 +486,7 @@ struct dept_task {
 extern void dept_event_split_map(struct dept_map_each *me, struct 
dept_map_common *mc, unsigned long ip, const char *e_fn);
 extern void dept_ask_event_split_map(struct dept_map_each *me, struct 
dept_map_common *mc);
 extern void dept_kernel_enter(void);
+extern void dept_work_enter(void);
 
 /*
  * for users who want to manage external keys
@@ -527,6 +528,7 @@ struct dept_task {
 #define dept_event_split_map(me, mc, ip, e_fn) do { } while (0)
 #define dept_ask_event_split_map(me, mc)   do { } while (0)
 #define dept_kernel_enter()do { } while (0)
+#define dept_work_enter()  do { } while (0)
 #define dept_key_init(k)   do { (void)(k); } while 
(0)
 #define dept_key_destroy(k)do { (void)(k); } while 
(0)
 #endif
diff --git a/kernel/dependency/dept.c b/kernel/dependency/dept.c
index 8f962ae..5d4efc3 100644
--- a/kernel/dependency/dept.c
+++ b/kernel/dependency/dept.c
@@ -1873,6 +1873,16 @@ void dept_disable_hardirq(unsigned long ip)
dept_exit(flags);
 }
 
+/*
+ * Assign a different context id to each work.
+ */
+void dept_work_enter(void)
+{
+   struct dept_task *dt = dept_task();
+
+   dt->cxt_id[DEPT_CXT_PROCESS] += (1UL << DEPT_CXTS_NR);
+}
+
 void dept_kernel_enter(void)
 {
struct dept_task *dt = dept_task();
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 33f1106..f5d762c 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -51,6 +51,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "workqueue_internal.h"
 
@@ -2217,6 +2218,8 @@ static void process_one_work(struct worker *worker, 
struct work_struct *work)
 
lockdep_copy_map(&lockdep_map, &work->lockdep_map);
 #endif
+   dept_work_enter();
+
/* ensure we're on the correct CPU */
WARN_ON_ONCE(!(pool->flags & POOL_DISASSOCIATED) &&
 raw_smp_processor_id() != pool->cpu);
-- 
1.9.1



[PATCH v4 23/24] dept: Let it work with real sleeps in __schedule()

2022-03-03 Thread Byungchul Park
Dept commits the staged wait in __schedule() even if the corresponding
wake_up() has already woken up the task, which means Dept considers the
case a sleep. This would help Dept perform stronger detection but
also leads to false positives.

It'd be better to let Dept work only with real sleeps, conservatively, for
now. So do that.

Signed-off-by: Byungchul Park 
---
 kernel/sched/core.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 6a422aa..2ec7cf8 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6192,7 +6192,12 @@ static void __sched notrace __schedule(unsigned int 
sched_mode)
local_irq_disable();
rcu_note_context_switch(!!sched_mode);
 
-   if (sched_mode == SM_NONE)
+   /*
+* Skip the commit if the current task does not actually go to
+* sleep.
+*/
+   if (READ_ONCE(prev->__state) & TASK_NORMAL &&
+   sched_mode == SM_NONE)
dept_ask_event_wait_commit(_RET_IP_);
 
/*
-- 
1.9.1



[PATCH v4 19/24] dept: Disable Dept within the wait_bit layer by default

2022-03-03 Thread Byungchul Park
The struct wait_queue_head entries of the bit_wait_table[] array in
sched/wait_bit.c are shared by all its users, which unfortunately vary in
terms of class. So each should've been assigned its own class to avoid
false positives.

It'd be better to let Dept work at a higher layer than wait_bit. So disable
Dept within the wait_bit layer by default.

It's worth noting that Dept is still working with the other struct
wait_queue_head ones that are mostly well-classified.

Signed-off-by: Byungchul Park 
---
 kernel/sched/wait_bit.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/wait_bit.c b/kernel/sched/wait_bit.c
index 02ce292..3e5a3eb 100644
--- a/kernel/sched/wait_bit.c
+++ b/kernel/sched/wait_bit.c
@@ -3,6 +3,7 @@
  * The implementation of the wait_bit*() and related waiting APIs:
  */
 #include "sched.h"
+#include 
 
 #define WAIT_TABLE_BITS 8
 #define WAIT_TABLE_SIZE (1 << WAIT_TABLE_BITS)
@@ -246,6 +247,8 @@ void __init wait_bit_init(void)
 {
int i;
 
-   for (i = 0; i < WAIT_TABLE_SIZE; i++)
+   for (i = 0; i < WAIT_TABLE_SIZE; i++) {
init_waitqueue_head(bit_wait_table + i);
+   dept_map_nocheck(&(bit_wait_table + i)->dmap);
+   }
 }
-- 
1.9.1



[PATCH v4 14/24] dept: Apply SDT to swait

2022-03-03 Thread Byungchul Park
Makes SDT able to track dependencies by swait.

Signed-off-by: Byungchul Park 
---
 include/linux/swait.h |  4 
 kernel/sched/swait.c  | 10 ++
 2 files changed, 14 insertions(+)

diff --git a/include/linux/swait.h b/include/linux/swait.h
index 6a8c22b..dbdf2ce 100644
--- a/include/linux/swait.h
+++ b/include/linux/swait.h
@@ -6,6 +6,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 /*
@@ -43,6 +44,7 @@
 struct swait_queue_head {
raw_spinlock_t  lock;
struct list_headtask_list;
+   struct dept_map dmap;
 };
 
 struct swait_queue {
@@ -61,6 +63,7 @@ struct swait_queue {
 #define __SWAIT_QUEUE_HEAD_INITIALIZER(name) { \
.lock   = __RAW_SPIN_LOCK_UNLOCKED(name.lock),  \
.task_list  = LIST_HEAD_INIT((name).task_list), \
+   .dmap   = DEPT_SDT_MAP_INIT(name),  \
 }
 
 #define DECLARE_SWAIT_QUEUE_HEAD(name) \
@@ -72,6 +75,7 @@ extern void __init_swait_queue_head(struct swait_queue_head 
*q, const char *name
 #define init_swait_queue_head(q)   \
do {\
static struct lock_class_key __key; \
+   sdt_map_init(&(q)->dmap);   \
__init_swait_queue_head((q), #q, &__key);   \
} while (0)
 
diff --git a/kernel/sched/swait.c b/kernel/sched/swait.c
index e1c655f..4ca7d6e 100644
--- a/kernel/sched/swait.c
+++ b/kernel/sched/swait.c
@@ -27,6 +27,7 @@ void swake_up_locked(struct swait_queue_head *q)
return;
 
curr = list_first_entry(&q->task_list, typeof(*curr), task_list);
+   sdt_event(&q->dmap);
wake_up_process(curr->task);
list_del_init(&curr->task_list);
 }
@@ -69,6 +70,7 @@ void swake_up_all(struct swait_queue_head *q)
while (!list_empty(&tmp)) {
curr = list_first_entry(&tmp, typeof(*curr), task_list);
 
+   sdt_event(&q->dmap);
wake_up_state(curr->task, TASK_NORMAL);
list_del_init(&curr->task_list);
 
@@ -97,6 +99,9 @@ void prepare_to_swait_exclusive(struct swait_queue_head *q, 
struct swait_queue *
__prepare_to_swait(q, wait);
set_current_state(state);
raw_spin_unlock_irqrestore(&q->lock, flags);
+
+   if (state & TASK_NORMAL)
+   sdt_wait_prepare(&q->dmap);
 }
 EXPORT_SYMBOL(prepare_to_swait_exclusive);
 
@@ -119,12 +124,16 @@ long prepare_to_swait_event(struct swait_queue_head *q, 
struct swait_queue *wait
}
raw_spin_unlock_irqrestore(&q->lock, flags);
 
+   if (!ret && state & TASK_NORMAL)
+   sdt_wait_prepare(&q->dmap);
+
return ret;
 }
 EXPORT_SYMBOL(prepare_to_swait_event);
 
 void __finish_swait(struct swait_queue_head *q, struct swait_queue *wait)
 {
+   sdt_wait_finish();
__set_current_state(TASK_RUNNING);
if (!list_empty(&wait->task_list))
list_del_init(&wait->task_list);
@@ -134,6 +143,7 @@ void finish_swait(struct swait_queue_head *q, struct 
swait_queue *wait)
 {
unsigned long flags;
 
+   sdt_wait_finish();
__set_current_state(TASK_RUNNING);
 
if (!list_empty_careful(&wait->task_list)) {
-- 
1.9.1



[PATCH v4 16/24] locking/lockdep, cpu/hotplug: Use a weaker annotation in AP thread

2022-03-03 Thread Byungchul Park
cb92173d1f0 (locking/lockdep, cpu/hotplug: Annotate AP thread) was
introduced to make lockdep_assert_cpus_held() work in the AP thread.

However, the annotation is too strong for that purpose. We don't have to
use more than a trylock annotation for that.

Furthermore, now that Dept has been introduced, false positive alarms were
reported because of it. Replace it with a trylock annotation.

Signed-off-by: Byungchul Park 
---
 kernel/cpu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/cpu.c b/kernel/cpu.c
index 407a256..1f92a42 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -355,7 +355,7 @@ int lockdep_is_cpus_held(void)
 
 static void lockdep_acquire_cpus_lock(void)
 {
-   rwsem_acquire(&cpu_hotplug_lock.dep_map, 0, 0, _THIS_IP_);
+   rwsem_acquire(&cpu_hotplug_lock.dep_map, 0, 1, _THIS_IP_);
 }
 
 static void lockdep_release_cpus_lock(void)
-- 
1.9.1



[PATCH v4 02/24] dept: Implement Dept(Dependency Tracker)

2022-03-03 Thread Byungchul Park
CURRENT STATUS
--------------
Lockdep tracks the acquisition order of locks in order to detect deadlock,
and IRQ and IRQ enable/disable state as well to take accidental
acquisitions into account.

Lockdep should be turned off once it detects and reports a deadlock
since the data structure and algorithm are not reusable after detection
because of the complex design.

PROBLEM
-------
*Waits* and their *events* that never arrive eventually cause deadlock.
However, Lockdep is only interested in lock acquisition order, forcing us
to emulate lock acquisition even for plain waits and events that have
nothing to do with a real lock.

Even worse, no one likes Lockdep's false positive detection because it
prevents further reports that might be more valuable. That's why all the
kernel developers are sensitive to Lockdep's false positives.

Besides that, by tracking acquisition order, it cannot correctly deal
with read locks and cross-events, e.g. wait_for_completion()/complete(),
for deadlock detection. Lockdep is no longer a good tool for that purpose.

SOLUTION
--------
Again, *waits* and their *events* that never arrive eventually cause
deadlock. The new solution, Dept (DEPendency Tracker), focuses on waits
and events themselves. Dept tracks waits and events, and reports it if
any event would never be reachable.

Dept does:
   . Works with read locks in the right way.
   . Works with any wait and event, i.e. cross-events.
   . Continues to work even after reporting multiple times.
   . Provides simple and intuitive APIs.
   . Does exactly what a dependency checker should do.
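
(Illustration only, not part of this patch: a sketch of annotating a custom
wait/event pair with the single-map SDT helpers used later in this series,
e.g. sdt_map_init()/sdt_wait_prepare()/sdt_event()/sdt_wait_finish(). The
surrounding structure and functions are hypothetical.)

struct my_queue {
	struct dept_map	dmap;
	/* ... */
};

static void my_queue_init(struct my_queue *q)
{
	sdt_map_init(&q->dmap);
}

static void my_queue_wait(struct my_queue *q)
{
	sdt_wait_prepare(&q->dmap);	/* Dept: a wait on this map begins */
	/* ... actually sleep until woken up ... */
	sdt_wait_finish();		/* Dept: the wait is over */
}

static void my_queue_wake(struct my_queue *q)
{
	sdt_event(&q->dmap);		/* Dept: the event the wait above waits for */
	/* ... wake up the waiter ... */
}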

Q & A
-----
Q. Is this the first try ever to address the problem?
A. No. The cross-release feature (b09be676e0ff2 locking/lockdep: Implement
   the 'crossrelease' feature) addressed it 2 years ago. It was a
   Lockdep extension and was merged but reverted shortly afterwards because:

   Cross-release started to report valuable hidden problems but started
   to give false positive reports as well. For sure, no one
   likes Lockdep's false positive reports since they make Lockdep stop,
   preventing it from reporting further real problems.

Q. Why not Dept was developed as an extension of Lockdep?
A. Lockdep definitely includes all the efforts great developers have
   made for a long time so as to be quite stable enough. But I had to
   design and implement newly because of the following:

   1) Lockdep was designed to track lock acquisition order. The APIs and
  implementation do not fit the wait-event model.
   2) Lockdep is turned off on detection, including false positives, which
  is terrible and prevents developing any extension for stronger
  detection.

Q. Do you intend to totally replace Lockdep?
A. No. Lockdep also checks if lock usage is correct. Of course, the
   dependency check routine should be replaced but the other functions
   should be still there.

Q. Do you mean the dependency check routine should be replaced right
   away?
A. No. I admit Lockdep is stable enough thanks to great efforts kernel
   developers have made. Lockdep and Dept, both should be in the kernel
   until Dept gets considered stable.

Q. Stronger detection capability would give more false positive reports,
   which was a big problem when cross-release was introduced. Is it ok
   with Dept?
A. It's ok. Dept allows multiple reporting thanks to its simple and quite
   generalized design. Of course, false positive reports should be fixed
   anyway, but it's no longer as critical a problem as it was.

Signed-off-by: Byungchul Park 
---
 include/linux/dept.h|  481 
 include/linux/dept_sdt.h|   62 +
 include/linux/hardirq.h |3 +
 include/linux/irqflags.h|   33 +-
 include/linux/sched.h   |7 +
 init/init_task.c|2 +
 init/main.c |2 +
 kernel/Makefile |1 +
 kernel/dependency/Makefile  |3 +
 kernel/dependency/dept.c| 2536 +++
 kernel/dependency/dept_hash.h   |   10 +
 kernel/dependency/dept_object.h |   13 +
 kernel/exit.c   |1 +
 kernel/fork.c   |2 +
 kernel/module.c |2 +
 kernel/sched/core.c |3 +
 kernel/softirq.c|6 +-
 kernel/trace/trace_preemptirq.c |   19 +-
 lib/Kconfig.debug   |   20 +
 19 files changed, 3197 insertions(+), 9 deletions(-)
 create mode 100644 include/linux/dept.h
 create mode 100644 include/linux/dept_sdt.h
 create mode 100644 kernel/dependency/Makefile
 create mode 100644 kernel/dependency/dept.c
 create mode 100644 kernel/dependency/dept_hash.h
 create mode 100644 kernel/dependency/dept_object.h

diff --git a/include/linux/dept.h b/include/linux/dept.h
new file mode 100644
index 000..c3fb3cf
--- /dev/null
+++ b/include/linux/dept.h
@@ -0,0 +1,481 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * DEPT(DEPendency Tracker) - runtime dependency tracker
+ *
+ * Started by Byungchul Park :
+ *
+ *  Copyright (c

[PATCH v4 12/24] dept: Introduce split map concept and new APIs for them

2022-03-03 Thread Byungchul Park
There is a case where all the maps used for a type of wait/event are too
large in total. For instance, struct page can be such a type for
(un)lock_page(). The additional memory size for the maps would be 'the # of
pages * sizeof(struct dept_map)' if each struct page kept its own map all
the way, which might be too big to accept on some systems.

It'd be better to have a split map: one part for each instance and the other
for the type, which is commonly used, plus new APIs using them. So
introduce the split map and the new APIs for it.
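
(Illustration only, not part of the patch: a sketch of how the split-map API
declared in the diff below might be used. The struct, functions and call
sites here are hypothetical, and the exact call ordering in real users may
differ.)

static struct dept_map_common my_page_wait_common;	/* one per type */

struct my_page {
	struct dept_map_each dmap_each;			/* one per instance */
	/* ... */
};

static void my_page_subsys_init(void)
{
	static struct dept_key key;

	dept_split_map_common_init(&my_page_wait_common, &key, "my_page");
}

static void my_page_init(struct my_page *p)
{
	dept_split_map_each_init(&p->dmap_each);
}

static void my_page_wait(struct my_page *p)
{
	dept_wait_split_map(&p->dmap_each, &my_page_wait_common,
			    _RET_IP_, __func__, 0);
	/* ... sleep until the event for this page happens ... */
}

static void my_page_wake(struct my_page *p)
{
	dept_event_split_map(&p->dmap_each, &my_page_wait_common,
			     _RET_IP_, __func__);
	/* ... wake up the waiters ... */
}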

Signed-off-by: Byungchul Park 
---
 include/linux/dept.h |  80 ++-
 kernel/dependency/dept.c | 122 +++
 2 files changed, 180 insertions(+), 22 deletions(-)

diff --git a/include/linux/dept.h b/include/linux/dept.h
index c0bbb8e..e2d4aea 100644
--- a/include/linux/dept.h
+++ b/include/linux/dept.h
@@ -362,6 +362,30 @@ struct dept_map {
boolnocheck;
 };
 
+struct dept_map_each {
+   /*
+* wait timestamp associated to this map
+*/
+   unsigned int wgen;
+};
+
+struct dept_map_common {
+   const char *name;
+   struct dept_key *keys;
+   int sub_usr;
+
+   /*
+* It's a local copy for fast access to the associated classes. And
+* Also used for dept_key instance for statically defined map.
+*/
+   struct dept_key keys_local;
+
+   /*
+* whether this map should be going to be checked or not
+*/
+   bool nocheck;
+};
+
 struct dept_task {
/*
 * all event contexts that have entered and before exiting
@@ -451,6 +475,11 @@ struct dept_task {
 extern void dept_ecxt_exit(struct dept_map *m, unsigned long ip);
 extern void dept_skip(struct dept_map *m);
 extern bool dept_unskip_if_skipped(struct dept_map *m);
+extern void dept_split_map_each_init(struct dept_map_each *me);
+extern void dept_split_map_common_init(struct dept_map_common *mc, struct 
dept_key *k, const char *n);
+extern void dept_wait_split_map(struct dept_map_each *me, struct 
dept_map_common *mc, unsigned long ip, const char *w_fn, int ne);
+extern void dept_event_split_map(struct dept_map_each *me, struct 
dept_map_common *mc, unsigned long ip, const char *e_fn);
+extern void dept_ask_event_split_map(struct dept_map_each *me, struct 
dept_map_common *mc);
 
 /*
  * for users who want to manage external keys
@@ -460,31 +489,38 @@ struct dept_task {
 #else /* !CONFIG_DEPT */
 struct dept_key  { };
 struct dept_map  { };
+struct dept_map_each{ };
+struct dept_map_commmon { };
 struct dept_task { };
 
 #define DEPT_TASK_INITIALIZER(t)
 
-#define dept_on()  do { } while (0)
-#define dept_off() do { } while (0)
-#define dept_init()do { } while (0)
-#define dept_task_init(t)  do { } while (0)
-#define dept_task_exit(t)  do { } while (0)
-#define dept_free_range(s, sz) do { } while (0)
-#define dept_map_init(m, k, s, n)  do { (void)(n); (void)(k); } 
while (0)
-#define dept_map_reinit(m) do { } while (0)
-#define dept_map_nocheck(m)do { } while (0)
-
-#define dept_wait(m, w_f, ip, w_fn, ne)do { (void)(w_fn); } 
while (0)
-#define dept_stage_wait(m, w_f, w_fn, ne)  do { (void)(w_fn); } while (0)
-#define dept_ask_event_wait_commit(ip) do { } while (0)
-#define dept_clean_stage() do { } while (0)
-#define dept_ecxt_enter(m, e_f, ip, c_fn, e_fn, ne) do { (void)(c_fn); 
(void)(e_fn); } while (0)
-#define dept_ask_event(m)  do { } while (0)
-#define dept_event(m, e_f, ip, e_fn)   do { (void)(e_fn); } while (0)
-#define dept_ecxt_exit(m, ip)  do { } while (0)
-#define dept_skip(m)   do { } while (0)
-#define dept_unskip_if_skipped(m)  (false)
-#define dept_key_init(k)   do { (void)(k); } while (0)
-#define dept_key_destroy(k)do { (void)(k); } while (0)
+#define dept_on()  do { } while (0)
+#define dept_off() do { } while (0)
+#define dept_init()do { } while (0)
+#define dept_task_init(t)  do { } while (0)
+#define dept_task_exit(t)  do { } while (0)
+#define dept_free_range(s, sz) do { } while (0)
+#define dept_map_init(m, k, s, n)  do { (void)(n); 
(void)(k); } while (0)
+#define dept_map_reinit(m) do { } while (0)
+#define dept_map_nocheck(m)do { } while (0)
+
+#define dept_wait(m, w_f, ip, w_fn, ne)do { 
(void)(w_fn); } while (0)
+#define dept_stage_wait(m, w_f, w_fn, ne)  do { (void)(w_fn); } 

[PATCH v4 11/24] dept: Add proc knobs to show stats and dependency graph

2022-03-03 Thread Byungchul Park
It'd be useful to show Dept's internal stats and dependency graph at
runtime via procfs for better observability. Introduce the knobs.
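
dept_proc.c is cut off at the end of this message. As a hedged sketch
of the general shape such knobs usually take (the file name and output
format below are assumptions, not taken from the patch), the dept_pool[]
array exported through dept_internal.h could be dumped via seq_file:

#include <linux/init.h>
#include <linux/proc_fs.h>
#include <linux/seq_file.h>
#include "dept_internal.h"

/* sketch only: dump the name and object size of each object pool */
static int dept_stats_show(struct seq_file *m, void *v)
{
        int i;

        for (i = 0; i < OBJECT_NR; i++)
                seq_printf(m, "%s: obj_sz=%lu\n", dept_pool[i].name,
                           (unsigned long)dept_pool[i].obj_sz);
        return 0;
}

static int __init dept_proc_init(void)
{
        proc_create_single("dept_stats", 0444, NULL, dept_stats_show);
        return 0;
}
late_initcall(dept_proc_init);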

Signed-off-by: Byungchul Park 
---
 kernel/dependency/Makefile|  1 +
 kernel/dependency/dept.c  | 24 --
 kernel/dependency/dept_internal.h | 26 +++
 kernel/dependency/dept_proc.c | 92 +++
 4 files changed, 128 insertions(+), 15 deletions(-)
 create mode 100644 kernel/dependency/dept_internal.h
 create mode 100644 kernel/dependency/dept_proc.c

diff --git a/kernel/dependency/Makefile b/kernel/dependency/Makefile
index b5cfb8a..92f1654 100644
--- a/kernel/dependency/Makefile
+++ b/kernel/dependency/Makefile
@@ -1,3 +1,4 @@
 # SPDX-License-Identifier: GPL-2.0
 
 obj-$(CONFIG_DEPT) += dept.o
+obj-$(CONFIG_DEPT) += dept_proc.o
diff --git a/kernel/dependency/dept.c b/kernel/dependency/dept.c
index 3f22c5b..4142c78 100644
--- a/kernel/dependency/dept.c
+++ b/kernel/dependency/dept.c
@@ -73,6 +73,7 @@
 #include 
 #include 
 #include 
+#include "dept_internal.h"
 
 static int dept_stop;
 static int dept_per_cpu_ready;
@@ -233,20 +234,13 @@ static inline struct dept_task *dept_task(void)
  *   have been freed will be placed.
  */
 
-enum object_t {
-#define OBJECT(id, nr) OBJECT_##id,
-   #include "dept_object.h"
-#undef  OBJECT
-   OBJECT_NR,
-};
-
 #define OBJECT(id, nr) \
 static struct dept_##id spool_##id[nr];
\
 static DEFINE_PER_CPU(struct llist_head, lpool_##id);
#include "dept_object.h"
 #undef  OBJECT
 
-static struct dept_pool pool[OBJECT_NR] = {
+struct dept_pool dept_pool[OBJECT_NR] = {
 #define OBJECT(id, nr) {   \
.name = #id,\
.obj_sz = sizeof(struct dept_##id), \
@@ -276,7 +270,7 @@ static void *from_pool(enum object_t t)
if (DEPT_WARN_ON(!irqs_disabled()))
return NULL;
 
-   p = &pool[t];
+   p = &dept_pool[t];
 
/*
 * Try local pool first.
@@ -306,7 +300,7 @@ static void *from_pool(enum object_t t)
 
 static void to_pool(void *o, enum object_t t)
 {
-   struct dept_pool *p = &pool[t];
+   struct dept_pool *p = &dept_pool[t];
struct llist_head *h;
 
preempt_disable();
@@ -1986,7 +1980,7 @@ void dept_map_nocheck(struct dept_map *m)
 }
 EXPORT_SYMBOL_GPL(dept_map_nocheck);
 
-static LIST_HEAD(classes);
+LIST_HEAD(dept_classes);
 
 static inline bool within(const void *addr, void *start, unsigned long size)
 {
@@ -2013,7 +2007,7 @@ void dept_free_range(void *start, unsigned int sz)
while (unlikely(!dept_lock()))
cpu_relax();
 
-   list_for_each_entry_safe(c, n, &classes, all_node) {
+   list_for_each_entry_safe(c, n, &dept_classes, all_node) {
if (!within((void *)c->key, start, sz) &&
!within(c->name, start, sz))
continue;
@@ -2082,7 +2076,7 @@ static struct dept_class *check_new_class(struct dept_key 
*local,
c->sub = sub;
c->key = (unsigned long)(k->subkeys + sub);
hash_add_class(c);
-   list_add(&c->all_node, &classes);
+   list_add(&c->all_node, &dept_classes);
 unlock:
dept_unlock();
 caching:
@@ -2537,8 +2531,8 @@ static void migrate_per_cpu_pool(void)
struct llist_head *from;
struct llist_head *to;
 
-   from = &pool[i].boot_pool;
-   to = per_cpu_ptr(pool[i].lpool, boot_cpu);
+   from = &dept_pool[i].boot_pool;
+   to = per_cpu_ptr(dept_pool[i].lpool, boot_cpu);
move_llist(to, from);
}
 }
diff --git a/kernel/dependency/dept_internal.h 
b/kernel/dependency/dept_internal.h
new file mode 100644
index 000..007c1ee
--- /dev/null
+++ b/kernel/dependency/dept_internal.h
@@ -0,0 +1,26 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Dept(DEPendency Tracker) - runtime dependency tracker internal header
+ *
+ * Started by Byungchul Park :
+ *
+ *  Copyright (c) 2020 LG Electronics, Inc., Byungchul Park
+ */
+
+#ifndef __DEPT_INTERNAL_H
+#define __DEPT_INTERNAL_H
+
+#ifdef CONFIG_DEPT
+
+enum object_t {
+#define OBJECT(id, nr) OBJECT_##id,
+   #include "dept_object.h"
+#undef  OBJECT
+   OBJECT_NR,
+};
+
+extern struct list_head dept_classes;
+extern struct dept_pool dept_pool[];
+
+#endif
+#endif /* __DEPT_INTERNAL_H */
diff --git a/kernel/dependency/dept_proc.c b/kernel/dependency/dept_proc.c
new file mode 100644
index 000..c069354
--- /dev/null
+++ b/kernel/dependency/dept_proc.c
@@ -0,0 +1,92 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Procfs knobs for Dept(DEPendency Tracker)
+ *
+ * Started by Byungchul Park :
+ *
+ *  Copyright (C) 2021 LG Electronics, Inc. , Byungchul Park
+ */
+#include 
+#include 
+#include 
+#include "dept_inte

[PATCH v4 10/24] dept: Apply Dept to rwsem

2022-03-03 Thread Byungchul Park
Make Dept able to track dependencies created by rwsem operations.
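
For reference, the three branches of the dept_rwsem_lock() helper added
below roughly correspond to the following callers (illustration only,
not part of the patch):

#include <linux/rwsem.h>

static DECLARE_RWSEM(example_sem);
static DECLARE_RWSEM(example_outer);

static void example_rwsem_users(void)
{
        /* normal: a wait is recorded, then the "up_write" event context */
        down_write(&example_sem);
        up_write(&example_sem);

        /* trylock (the 't' branch): cannot block, so no wait is recorded */
        if (down_write_trylock(&example_sem))
                up_write(&example_sem);

        /*
         * nest lock (the 'n' branch): the dependency check is skipped via
         * dept_skip(); Lockdep still checks the lock usage itself.
         */
        down_write(&example_outer);
        down_write_nest_lock(&example_sem, &example_outer);
        up_write(&example_sem);
        up_write(&example_outer);
}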

Signed-off-by: Byungchul Park 
---
 include/linux/lockdep.h  | 24 
 include/linux/percpu-rwsem.h | 10 +-
 include/linux/rwsem.h| 33 +
 3 files changed, 62 insertions(+), 5 deletions(-)

diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h
index b93a707..37af50c 100644
--- a/include/linux/lockdep.h
+++ b/include/linux/lockdep.h
@@ -646,10 +646,26 @@ static inline void print_irqtrace_events(struct 
task_struct *curr)
dept_mutex_unlock(&(l)->dmap, i);   \
 } while (0)
 
-#define rwsem_acquire(l, s, t, i)  lock_acquire_exclusive(l, s, t, 
NULL, i)
-#define rwsem_acquire_nest(l, s, t, n, i)  lock_acquire_exclusive(l, s, t, 
n, i)
-#define rwsem_acquire_read(l, s, t, i) lock_acquire_shared(l, s, t, 
NULL, i)
-#define rwsem_release(l, i)lock_release(l, i)
+#define rwsem_acquire(l, s, t, i)  \
+do {   \
+   lock_acquire_exclusive(l, s, t, NULL, i);   \
+   dept_rwsem_lock(&(l)->dmap, s, t, NULL, "up_write", i); \
+} while (0)
+#define rwsem_acquire_nest(l, s, t, n, i)  \
+do {   \
+   lock_acquire_exclusive(l, s, t, n, i);  \
+   dept_rwsem_lock(&(l)->dmap, s, t, (n) ? &(n)->dmap : NULL, "up_write", 
i);\
+} while (0)
+#define rwsem_acquire_read(l, s, t, i) \
+do {   \
+   lock_acquire_shared(l, s, t, NULL, i);  \
+   dept_rwsem_lock(&(l)->dmap, s, t, NULL, "up_read", i);  \
+} while (0)
+#define rwsem_release(l, i)\
+do {   \
+   lock_release(l, i); \
+   dept_rwsem_unlock(&(l)->dmap, i);   \
+} while (0)
 
 #define lock_map_acquire(l)lock_acquire_exclusive(l, 0, 0, 
NULL, _THIS_IP_)
 #define lock_map_acquire_read(l)   
lock_acquire_shared_recursive(l, 0, 0, NULL, _THIS_IP_)
diff --git a/include/linux/percpu-rwsem.h b/include/linux/percpu-rwsem.h
index 5fda40f..ac2b1a5 100644
--- a/include/linux/percpu-rwsem.h
+++ b/include/linux/percpu-rwsem.h
@@ -20,8 +20,16 @@ struct percpu_rw_semaphore {
 #endif
 };
 
+#ifdef CONFIG_DEPT
+#define __PERCPU_RWSEM_DMAP_INIT(lockname) .dmap = { .name = #lockname, 
.skip_cnt = ATOMIC_INIT(0) }
+#else
+#define __PERCPU_RWSEM_DMAP_INIT(lockname)
+#endif
+
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
-#define __PERCPU_RWSEM_DEP_MAP_INIT(lockname)  .dep_map = { .name = #lockname 
},
+#define __PERCPU_RWSEM_DEP_MAP_INIT(lockname)  .dep_map = {\
+   .name = #lockname,  \
+   __PERCPU_RWSEM_DMAP_INIT(lockname) },
 #else
 #define __PERCPU_RWSEM_DEP_MAP_INIT(lockname)
 #endif
diff --git a/include/linux/rwsem.h b/include/linux/rwsem.h
index f934876..dc7977a 100644
--- a/include/linux/rwsem.h
+++ b/include/linux/rwsem.h
@@ -16,11 +16,18 @@
 #include 
 #include 
 
+#ifdef CONFIG_DEPT
+# define RWSEM_DMAP_INIT(lockname) .dmap = { .name = #lockname, .skip_cnt 
= ATOMIC_INIT(0) },
+#else
+# define RWSEM_DMAP_INIT(lockname)
+#endif
+
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
 # define __RWSEM_DEP_MAP_INIT(lockname)\
.dep_map = {\
.name = #lockname,  \
.wait_type_inner = LD_WAIT_SLEEP,   \
+   RWSEM_DMAP_INIT(lockname)   \
},
 #else
 # define __RWSEM_DEP_MAP_INIT(lockname)
@@ -32,6 +39,32 @@
 #include 
 #endif
 
+#ifdef CONFIG_DEPT
+#define dept_rwsem_lock(m, ne, t, n, e_fn, ip) \
+do {   \
+   if (t) {\
+   dept_ecxt_enter(m, 1UL, ip, __func__, e_fn, ne);\
+   dept_ask_event(m);  \
+   } else if (n) { \
+   dept_skip(m);   \
+   } else {\
+   dept_wait(m, 1UL, ip, __func__, ne);\
+   dept_ecxt_enter(m, 1UL, ip, __func__, e_fn, ne);\
+   dept_ask_event(m);  \
+   }   \
+} while (0)
+#define dept_rwsem_unlock(m, ip)   \
+do {

[PATCH v4 03/24] dept: Embed Dept data in Lockdep

2022-03-03 Thread Byungchul Park
Dept should work independently of Lockdep. However, for now there's no
choice but to rely on Lockdep code and its instances.

Signed-off-by: Byungchul Park 
---
 include/linux/lockdep.h   | 71 ---
 include/linux/lockdep_types.h |  3 ++
 kernel/locking/lockdep.c  | 12 
 3 files changed, 76 insertions(+), 10 deletions(-)

diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h
index 467b942..c56f6b6 100644
--- a/include/linux/lockdep.h
+++ b/include/linux/lockdep.h
@@ -20,6 +20,33 @@
 extern int prove_locking;
 extern int lock_stat;
 
+#ifdef CONFIG_DEPT
+static inline void dept_after_copy_map(struct dept_map *to,
+  struct dept_map *from)
+{
+   int i;
+
+   if (from->keys == &from->keys_local)
+   to->keys = &to->keys_local;
+
+   if (!to->keys)
+   return;
+
+   /*
+* Since the class cache can be modified concurrently we could observe
+* half pointers (64bit arch using 32bit copy insns). Therefore clear
+* the caches and take the performance hit.
+*
+* XXX it doesn't work well with lockdep_set_class_and_subclass(), since
+* that relies on cache abuse.
+*/
+   for (i = 0; i < DEPT_MAX_SUBCLASSES_CACHE; i++)
+   to->keys->classes[i] = NULL;
+}
+#else
+#define dept_after_copy_map(t, f)  do { } while (0)
+#endif
+
 #ifdef CONFIG_LOCKDEP
 
 #include 
@@ -43,6 +70,8 @@ static inline void lockdep_copy_map(struct lockdep_map *to,
 */
for (i = 0; i < NR_LOCKDEP_CACHING_CLASSES; i++)
to->class_cache[i] = NULL;
+
+   dept_after_copy_map(&to->dmap, &from->dmap);
 }
 
 /*
@@ -176,8 +205,19 @@ struct held_lock {
current->lockdep_recursion -= LOCKDEP_OFF;  \
 } while (0)
 
-extern void lockdep_register_key(struct lock_class_key *key);
-extern void lockdep_unregister_key(struct lock_class_key *key);
+extern void __lockdep_register_key(struct lock_class_key *key);
+extern void __lockdep_unregister_key(struct lock_class_key *key);
+
+#define lockdep_register_key(k)\
+do {   \
+   __lockdep_register_key(k);  \
+   dept_key_init(&(k)->dkey);  \
+} while (0)
+#define lockdep_unregister_key(k)  \
+do {   \
+   __lockdep_unregister_key(k);\
+   dept_key_destroy(&(k)->dkey);   \
+} while (0)
 
 /*
  * These methods are used by specific locking variants (spinlocks,
@@ -185,9 +225,18 @@ struct held_lock {
  * to lockdep:
  */
 
-extern void lockdep_init_map_type(struct lockdep_map *lock, const char *name,
+extern void __lockdep_init_map_type(struct lockdep_map *lock, const char *name,
struct lock_class_key *key, int subclass, u8 inner, u8 outer, u8 
lock_type);
 
+#define lockdep_init_map_type(l, n, k, s, i, o, t) \
+do {   \
+   __lockdep_init_map_type(l, n, k, s, i, o, t);   \
+   if ((k) == &__lockdep_no_validate__)\
+   dept_map_nocheck(&(l)->dmap);   \
+   else\
+   dept_map_init(&(l)->dmap, &(k)->dkey, s, n);\
+} while (0)
+
 static inline void
 lockdep_init_map_waits(struct lockdep_map *lock, const char *name,
   struct lock_class_key *key, int subclass, u8 inner, u8 
outer)
@@ -431,13 +480,27 @@ enum xhlock_context_t {
XHLOCK_CTX_NR,
 };
 
+#ifdef CONFIG_DEPT
+/*
+ * TODO: I found the case to use an address of other than a real key as
+ * _key, for instance, in workqueue. So for now, we cannot use the
+ * assignment like '.dmap.keys = &(_key)->dkey' unless it's fixed.
+ */
+#define STATIC_DEPT_MAP_INIT(_name, _key) .dmap = {\
+   .name = (_name),\
+   .keys = NULL },
+#else
+#define STATIC_DEPT_MAP_INIT(_name, _key)
+#endif
+
 #define lockdep_init_map_crosslock(m, n, k, s) do {} while (0)
 /*
  * To initialize a lockdep_map statically use this macro.
  * Note that _name must not be NULL.
  */
 #define STATIC_LOCKDEP_MAP_INIT(_name, _key) \
-   { .name = (_name), .key = (void *)(_key), }
+   { .name = (_name), .key = (void *)(_key), \
+   STATIC_DEPT_MAP_INIT(_name, _key) }
 
 static inline void lockdep_invariant_state(bool force) {}
 static inline void lockdep_free_task(struct task_struct *task) {}
diff --git a/include/linux/lockdep_types.h b/include/linux/lockdep_types.h
index d224308..50c8879 100644
--- a/include/linux/lockdep_types.h
+++ b/include/linux/lockdep_types.h
@@ -11,6 +11,7 @@
 #define __LINUX_LOCKDEP_TYPES_H
 
 #include 
+#include 
 
 #define MAX_LOCKDEP_SUBCLASSES 8UL
 
@@ -76,6 +77,7 @@ struct 

[PATCH v4 17/24] dept: Distinguish each syscall context from another

2022-03-03 Thread Byungchul Park
The CPU enters kernel mode on each syscall, and each syscall handling
should be considered independent from Dept's point of view. Otherwise,
Dept may wrongly track dependencies across different syscalls.

Such a dependency might be real when seen from user mode. However, now
that Dept has only just started to work, conservatively make Dept not
track dependencies across different syscalls.
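
The kernel/entry/common.c hunk is truncated in this message.
Conceptually it boils down to a single call at the kernel entry
boundary, roughly like the sketch below (the placement and surrounding
code are assumptions; only dept_kernel_enter() comes from the patch):

#include <linux/dept.h>

/* sketch, not the actual hunk */
static void example_syscall_entry_work(void)
{
        /*
         * Start a fresh Dept context for this syscall so that waits and
         * events from a previous syscall are not chained into this one.
         */
        dept_kernel_enter();

        /* ... the usual syscall entry work follows ... */
}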

Signed-off-by: Byungchul Park 
---
 include/linux/dept.h | 39 
 kernel/dependency/dept.c | 67 
 kernel/entry/common.c|  3 +++
 3 files changed, 60 insertions(+), 49 deletions(-)

diff --git a/include/linux/dept.h b/include/linux/dept.h
index e2d4aea..1a1c307 100644
--- a/include/linux/dept.h
+++ b/include/linux/dept.h
@@ -25,11 +25,16 @@
 #define DEPT_MAX_SUBCLASSES_USR(DEPT_MAX_SUBCLASSES / 
DEPT_MAX_SUBCLASSES_EVT)
 #define DEPT_MAX_SUBCLASSES_CACHE  2
 
-#define DEPT_SIRQ  0
-#define DEPT_HIRQ  1
-#define DEPT_IRQS_NR   2
-#define DEPT_SIRQF (1UL << DEPT_SIRQ)
-#define DEPT_HIRQF (1UL << DEPT_HIRQ)
+enum {
+   DEPT_CXT_SIRQ = 0,
+   DEPT_CXT_HIRQ,
+   DEPT_CXT_IRQS_NR,
+   DEPT_CXT_PROCESS = DEPT_CXT_IRQS_NR,
+   DEPT_CXTS_NR
+};
+
+#define DEPT_SIRQF (1UL << DEPT_CXT_SIRQ)
+#define DEPT_HIRQF (1UL << DEPT_CXT_HIRQ)
 
 struct dept_ecxt;
 struct dept_iecxt {
@@ -95,8 +100,8 @@ struct dept_class {
/*
 * for tracking IRQ dependencies
 */
-   struct dept_iecxt   iecxt[DEPT_IRQS_NR];
-   struct dept_iwait   iwait[DEPT_IRQS_NR];
+   struct dept_iecxt   iecxt[DEPT_CXT_IRQS_NR];
+   struct dept_iwait   iwait[DEPT_CXT_IRQS_NR];
 };
 
 struct dept_stack {
@@ -150,8 +155,8 @@ struct dept_ecxt {
/*
 * where the IRQ-enabled happened
 */
-   unsigned long   enirq_ip[DEPT_IRQS_NR];
-   struct dept_stack   *enirq_stack[DEPT_IRQS_NR];
+   unsigned long   enirq_ip[DEPT_CXT_IRQS_NR];
+   struct dept_stack   *enirq_stack[DEPT_CXT_IRQS_NR];
 
/*
 * where the event context started
@@ -194,8 +199,8 @@ struct dept_wait {
/*
 * where the IRQ wait happened
 */
-   unsigned long   irq_ip[DEPT_IRQS_NR];
-   struct dept_stack   *irq_stack[DEPT_IRQS_NR];
+   unsigned long   irq_ip[DEPT_CXT_IRQS_NR];
+   struct dept_stack   *irq_stack[DEPT_CXT_IRQS_NR];
 
/*
 * where the wait happened
@@ -400,19 +405,19 @@ struct dept_task {
int wait_hist_pos;
 
/*
-* sequential id to identify each IRQ context
+* sequential id to identify each context
 */
-   unsigned intirq_id[DEPT_IRQS_NR];
+   unsigned intcxt_id[DEPT_CXTS_NR];
 
/*
 * for tracking IRQ-enabled points with cross-event
 */
-   unsigned intwgen_enirq[DEPT_IRQS_NR];
+   unsigned intwgen_enirq[DEPT_CXT_IRQS_NR];
 
/*
 * for keeping up-to-date IRQ-enabled points
 */
-   unsigned long   enirq_ip[DEPT_IRQS_NR];
+   unsigned long   enirq_ip[DEPT_CXT_IRQS_NR];
 
/*
 * current effective IRQ-enabled flag
@@ -448,7 +453,7 @@ struct dept_task {
.dept_task.wait_hist = { { .wait = NULL, } },   \
.dept_task.ecxt_held_pos = 0,   \
.dept_task.wait_hist_pos = 0,   \
-   .dept_task.irq_id = { 0 },  \
+   .dept_task.cxt_id = { 0 },  \
.dept_task.wgen_enirq = { 0 },  \
.dept_task.enirq_ip = { 0 },\
.dept_task.recursive = 0,   \
@@ -480,6 +485,7 @@ struct dept_task {
 extern void dept_wait_split_map(struct dept_map_each *me, struct 
dept_map_common *mc, unsigned long ip, const char *w_fn, int ne);
 extern void dept_event_split_map(struct dept_map_each *me, struct 
dept_map_common *mc, unsigned long ip, const char *e_fn);
 extern void dept_ask_event_split_map(struct dept_map_each *me, struct 
dept_map_common *mc);
+extern void dept_kernel_enter(void);
 
 /*
  * for users who want to manage external keys
@@ -520,6 +526,7 @@ struct dept_task {
 #define dept_wait_split_map(me, mc, ip, w_fn, ne)  do { } while (0)
 #define dept_event_split_map(me, mc, ip, e_fn) do { } while (0)
 #define dept_ask_event_split_map(me, mc)   do { } while (0)
+#define dept_kernel_enter()  

[PATCH v4 09/24] dept: Apply Dept to seqlock

2022-03-03 Thread Byungchul Park
Make Dept able to track dependencies involving seqlocks by adding a
wait annotation on the read side of the seqlock.
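
As a sketch of what the annotations below are meant to express
(illustration only, not part of the patch): the reader's retry loop is
modelled as a wait on the writer finishing its section, and the write
section is the matching event context:

#include <linux/seqlock.h>

static DEFINE_SEQLOCK(example_seqlock);
static int example_value;

/* read side: the intent is to record a wait (dept_seq_wait) at the
 * start of the read section */
static int example_read(void)
{
        unsigned int seq;
        int v;

        do {
                seq = read_seqbegin(&example_seqlock);
                v = example_value;
        } while (read_seqretry(&example_seqlock, seq));

        return v;
}

/* write side: dept_seq_writebegin()/writeend() bracket the event context */
static void example_write(int v)
{
        write_seqlock(&example_seqlock);
        example_value = v;
        write_sequnlock(&example_seqlock);
}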

Signed-off-by: Byungchul Park 
---
 include/linux/seqlock.h | 59 -
 1 file changed, 58 insertions(+), 1 deletion(-)

diff --git a/include/linux/seqlock.h b/include/linux/seqlock.h
index 37ded6b..6e8ecd7 100644
--- a/include/linux/seqlock.h
+++ b/include/linux/seqlock.h
@@ -23,6 +23,25 @@
 
 #include 
 
+#ifdef CONFIG_DEPT
+#define DEPT_EVT_ALL   ((1UL << DEPT_MAX_SUBCLASSES_EVT) - 1)
+#define dept_seq_wait(m, ip)   dept_wait(m, DEPT_EVT_ALL, ip, __func__, 0)
+#define dept_seq_writebegin(m, ip) \
+do {   \
+   dept_ecxt_enter(m, 1UL, ip, __func__, "write_seqcount_end", 0);\
+   dept_ask_event(m);  \
+} while (0)
+#define dept_seq_writeend(m, ip)   \
+do {   \
+   dept_event(m, 1UL, ip, __func__);   \
+   dept_ecxt_exit(m, ip);  \
+} while (0)
+#else
+#define dept_seq_wait(m, ip)   do { } while (0)
+#define dept_seq_writebegin(m, ip) do { } while (0)
+#define dept_seq_writeend(m, ip)   do { } while (0)
+#endif
+
 /*
  * The seqlock seqcount_t interface does not prescribe a precise sequence of
  * read begin/retry/end. For readers, typically there is a call to
@@ -148,7 +167,7 @@ static inline void seqcount_lockdep_reader_access(const 
seqcount_t *s)
  * This lock-unlock technique must be implemented for all of PREEMPT_RT
  * sleeping locks.  See Documentation/locking/locktypes.rst
  */
-#if defined(CONFIG_LOCKDEP) || defined(CONFIG_PREEMPT_RT)
+#if defined(CONFIG_LOCKDEP) || defined(CONFIG_DEPT) || 
defined(CONFIG_PREEMPT_RT)
 #define __SEQ_LOCK(expr)   expr
 #else
 #define __SEQ_LOCK(expr)
@@ -203,6 +222,22 @@ static inline void seqcount_lockdep_reader_access(const 
seqcount_t *s)
__SEQ_LOCK(locktype *lock); \
 } seqcount_##lockname##_t; \
\
+static __always_inline void\
+__seqprop_##lockname##_wait(const seqcount_##lockname##_t *s)  \
+{  \
+   __SEQ_LOCK(dept_seq_wait(&(lockmember)->dep_map.dmap, _RET_IP_));\
+}  \
+   \
+static __always_inline void\
+__seqprop_##lockname##_writebegin(const seqcount_##lockname##_t *s)\
+{  \
+}  \
+   \
+static __always_inline void\
+__seqprop_##lockname##_writeend(const seqcount_##lockname##_t *s)  \
+{  \
+}  \
+   \
 static __always_inline seqcount_t *\
 __seqprop_##lockname##_ptr(seqcount_##lockname##_t *s) \
 {  \
@@ -271,6 +306,21 @@ static inline void __seqprop_assert(const seqcount_t *s)
lockdep_assert_preemption_disabled();
 }
 
+static inline void __seqprop_wait(seqcount_t *s)
+{
+   dept_seq_wait(&s->dep_map.dmap, _RET_IP_);
+}
+
+static inline void __seqprop_writebegin(seqcount_t *s)
+{
+   dept_seq_writebegin(&s->dep_map.dmap, _RET_IP_);
+}
+
+static inline void __seqprop_writeend(seqcount_t *s)
+{
+   dept_seq_writeend(&s->dep_map.dmap, _RET_IP_);
+}
+
 #define __SEQ_RT   IS_ENABLED(CONFIG_PREEMPT_RT)
 
 SEQCOUNT_LOCKNAME(raw_spinlock, raw_spinlock_t,  false,s->lock,
raw_spin, raw_spin_lock(s->lock))
@@ -311,6 +361,9 @@ static inline void __seqprop_assert(const seqcount_t *s)
 #define seqprop_sequence(s)__seqprop(s, sequence)
 #define seqprop_preemptible(s) __seqprop(s, preemptible)
 #define seqprop_assert(s)  __seqprop(s, assert)
+#define seqprop_dept_wait(s)   __seqprop(s, wait)
+#define seqprop_dept_writebegin(s) __seqprop(s, writebegin)
+#define seqprop_dept_writeend(s)   __seqprop(s, writeend)
 
 /**
  * __read_seqcount_begin() - begin a seqcount_t read section w/o barrier
@@ -360,6 +413,7 @@ static inline void __seqprop_assert(const seqcount_t *s)
 #define read_seqcount_begin(s)  

[PATCH v4 13/24] dept: Apply Dept to wait/event of PG_{locked, writeback}

2022-03-03 Thread Byungchul Park
Make Dept able to track dependencies on the waits/events of
PG_{locked,writeback}. For instance, (un)lock_page() generates that
type of dependency.
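
The mm/filemap.c and page-flags.h hunks are partly truncated below; as
a rough sketch of the intended pairing of the helpers added in
dept_page.h (illustration only, the example_* wrappers are made up and
are not the actual hook points):

#include <linux/pagemap.h>
#include <linux/dept_page.h>

static void example_lock_folio(struct folio *folio)
{
        /* annotate the potential wait on PG_locked before sleeping */
        dept_pglocked_wait(folio);
        folio_lock(folio);
        /* the bit is now set; expect a folio_unlock() event later */
        dept_pglocked_set_bit(folio);
}

static void example_unlock_folio(struct folio *folio)
{
        /* the event that waiters of PG_locked are waiting for */
        dept_pglocked_event(folio);
        folio_unlock(folio);
}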

Signed-off-by: Byungchul Park 
---
 include/linux/dept_page.h   | 78 +
 include/linux/page-flags.h  | 45 ++--
 include/linux/pagemap.h |  7 +++-
 init/main.c |  2 ++
 kernel/dependency/dept_object.h |  2 +-
 lib/Kconfig.debug   |  1 +
 mm/filemap.c| 68 +++
 mm/page_ext.c   |  5 +++
 8 files changed, 204 insertions(+), 4 deletions(-)
 create mode 100644 include/linux/dept_page.h

diff --git a/include/linux/dept_page.h b/include/linux/dept_page.h
new file mode 100644
index 000..d2d093d
--- /dev/null
+++ b/include/linux/dept_page.h
@@ -0,0 +1,78 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __LINUX_DEPT_PAGE_H
+#define __LINUX_DEPT_PAGE_H
+
+#ifdef CONFIG_DEPT
+#include 
+
+extern struct page_ext_operations dept_pglocked_ops;
+extern struct page_ext_operations dept_pgwriteback_ops;
+extern struct dept_map_common pglocked_mc;
+extern struct dept_map_common pgwriteback_mc;
+
+extern void dept_page_init(void);
+extern struct dept_map_each *get_pglocked_me(struct page *page);
+extern struct dept_map_each *get_pgwriteback_me(struct page *page);
+
+#define dept_pglocked_wait(f)  \
+do {   \
+   struct dept_map_each *me = get_pglocked_me(&(f)->page); \
+   \
+   if (likely(me)) \
+   dept_wait_split_map(me, &pglocked_mc, _RET_IP_, \
+   __func__, 0);   \
+} while (0)
+
+#define dept_pglocked_set_bit(f)   \
+do {   \
+   struct dept_map_each *me = get_pglocked_me(&(f)->page); \
+   \
+   if (likely(me)) \
+   dept_ask_event_split_map(me, &pglocked_mc); \
+} while (0)
+
+#define dept_pglocked_event(f) \
+do {   \
+   struct dept_map_each *me = get_pglocked_me(&(f)->page); \
+   \
+   if (likely(me)) \
+   dept_event_split_map(me, &pglocked_mc, _RET_IP_,\
+__func__); \
+} while (0)
+
+#define dept_pgwriteback_wait(f)   \
+do {   \
+   struct dept_map_each *me = get_pgwriteback_me(&(f)->page);\
+   \
+   if (likely(me)) \
+   dept_wait_split_map(me, &pgwriteback_mc, _RET_IP_,\
+   __func__, 0);   \
+} while (0)
+
+#define dept_pgwriteback_set_bit(f)\
+do {   \
+   struct dept_map_each *me = get_pgwriteback_me(&(f)->page);\
+   \
+   if (likely(me)) \
+   dept_ask_event_split_map(me, &pgwriteback_mc);\
+} while (0)
+
+#define dept_pgwriteback_event(f)  \
+do {   \
+   struct dept_map_each *me = get_pgwriteback_me(&(f)->page);\
+   \
+   if (likely(me)) \
+   dept_event_split_map(me, &pgwriteback_mc, _RET_IP_,\
+__func__); \
+} while (0)
+#else
+#define dept_page_init()   do { } while (0)
+#define dept_pglocked_wait(f)  do { } while (0)
+#define dept_pglocked_set_bit(f)   do { } while (0)
+#define dept_pglocked_event(f) do { } while (0)
+#define dept_pgwriteback_wait(f)   do { } while (0)
+#define dept_pgwriteback_set_bit(f)do { } while (0)
+#define dept_pgwriteback_event(f)  do { } while (0)
+#endif
+
+#endif /* __LINUX_DEPT_PAGE_H */
diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index 1c3b6e5..066b6a5 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -411,7 +411,6 @@ static unsigned long *folio_flags(struct folio *folio, 
unsigned n)
 #define TESTSCFLAG_FALSE(uname, lname) \
TESTSETFLAG_FALSE(uname, lname) TESTCLEARFLAG_FALSE(uname, lname)
 
-__PAGEFLAG(Locked, locked, PF_NO_TAIL

[PATCH v4 15/24] dept: Apply SDT to wait(waitqueue)

2022-03-03 Thread Byungchul Park
Make SDT able to track dependencies involving waits on a waitqueue.
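
With this patch a plain waitqueue user gets the tracking for free:
prepare_to_wait*() records the wait and __wake_up_common() records the
event on the waitqueue head's dmap. A minimal sketch (illustration
only, names are made up):

#include <linux/wait.h>

static DECLARE_WAIT_QUEUE_HEAD(example_wq);     /* dmap via DEPT_SDT_MAP_INIT */
static bool example_done;

/* waiter: prepare_to_wait_event() calls sdt_wait_prepare(&example_wq.dmap) */
static void example_wait_for_done(void)
{
        wait_event(example_wq, example_done);
}

/* waker: __wake_up_common() calls sdt_event(&example_wq.dmap) */
static void example_finish(void)
{
        example_done = true;
        wake_up(&example_wq);
}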

Signed-off-by: Byungchul Park 
---
 include/linux/wait.h |  6 +-
 kernel/sched/wait.c  | 16 
 2 files changed, 21 insertions(+), 1 deletion(-)

diff --git a/include/linux/wait.h b/include/linux/wait.h
index 851e07d..2133998 100644
--- a/include/linux/wait.h
+++ b/include/linux/wait.h
@@ -7,6 +7,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -37,6 +38,7 @@ struct wait_queue_entry {
 struct wait_queue_head {
spinlock_t  lock;
struct list_headhead;
+   struct dept_map dmap;
 };
 typedef struct wait_queue_head wait_queue_head_t;
 
@@ -56,7 +58,8 @@ struct wait_queue_head {
 
 #define __WAIT_QUEUE_HEAD_INITIALIZER(name) {  
\
.lock   = __SPIN_LOCK_UNLOCKED(name.lock),  
\
-   .head   = LIST_HEAD_INIT(name.head) }
+   .head   = LIST_HEAD_INIT(name.head),
\
+   .dmap   = DEPT_SDT_MAP_INIT(name) }
 
 #define DECLARE_WAIT_QUEUE_HEAD(name) \
struct wait_queue_head name = __WAIT_QUEUE_HEAD_INITIALIZER(name)
@@ -67,6 +70,7 @@ struct wait_queue_head {
do {
\
static struct lock_class_key __key; 
\

\
+   sdt_map_init(&(wq_head)->dmap); 
\
__init_waitqueue_head((wq_head), #wq_head, &__key); 
\
} while (0)
 
diff --git a/kernel/sched/wait.c b/kernel/sched/wait.c
index eca3810..fc5a16a 100644
--- a/kernel/sched/wait.c
+++ b/kernel/sched/wait.c
@@ -105,6 +105,7 @@ static int __wake_up_common(struct wait_queue_head 
*wq_head, unsigned int mode,
if (flags & WQ_FLAG_BOOKMARK)
continue;
 
+   sdt_event(&wq_head->dmap);
ret = curr->func(curr, mode, wake_flags, key);
if (ret < 0)
break;
@@ -268,6 +269,9 @@ void __wake_up_pollfree(struct wait_queue_head *wq_head)
__add_wait_queue(wq_head, wq_entry);
set_current_state(state);
spin_unlock_irqrestore(&wq_head->lock, flags);
+
+   if (state & TASK_NORMAL)
+   sdt_wait_prepare(&wq_head->dmap);
 }
 EXPORT_SYMBOL(prepare_to_wait);
 
@@ -286,6 +290,10 @@ void __wake_up_pollfree(struct wait_queue_head *wq_head)
}
set_current_state(state);
spin_unlock_irqrestore(&wq_head->lock, flags);
+
+   if (state & TASK_NORMAL)
+   sdt_wait_prepare(&wq_head->dmap);
+
return was_empty;
 }
 EXPORT_SYMBOL(prepare_to_wait_exclusive);
@@ -331,6 +339,9 @@ long prepare_to_wait_event(struct wait_queue_head *wq_head, 
struct wait_queue_en
}
spin_unlock_irqrestore(&wq_head->lock, flags);
 
+   if (!ret && state & TASK_NORMAL)
+   sdt_wait_prepare(&wq_head->dmap);
+
return ret;
 }
 EXPORT_SYMBOL(prepare_to_wait_event);
@@ -352,7 +363,9 @@ int do_wait_intr(wait_queue_head_t *wq, wait_queue_entry_t 
*wait)
return -ERESTARTSYS;
 
spin_unlock(&wq->lock);
+   sdt_wait_prepare(&wq->dmap);
schedule();
+   sdt_wait_finish();
spin_lock(&wq->lock);
 
return 0;
@@ -369,7 +382,9 @@ int do_wait_intr_irq(wait_queue_head_t *wq, 
wait_queue_entry_t *wait)
return -ERESTARTSYS;
 
spin_unlock_irq(&wq->lock);
+   sdt_wait_prepare(&wq->dmap);
schedule();
+   sdt_wait_finish();
spin_lock_irq(&wq->lock);
 
return 0;
@@ -389,6 +404,7 @@ void finish_wait(struct wait_queue_head *wq_head, struct 
wait_queue_entry *wq_en
 {
unsigned long flags;
 
+   sdt_wait_finish();
__set_current_state(TASK_RUNNING);
/*
 * We can check for list emptiness outside the lock
-- 
1.9.1



[PATCH v4 04/24] dept: Add an API for skipping dependency check temporarily

2022-03-03 Thread Byungchul Park
Dept permanently skips the check for dmaps marked by dept_map_nocheck().
However, sometimes the check needs to be skipped for some dmaps only
temporarily and then go back to normal, for instance, for lock
acquisition with a nest lock.

Lock usage checks with regard to a nest lock can be performed by
Lockdep; however, a dependency check is not necessary in that case. So
prepare for it by adding two new APIs, dept_skip() and
dept_unskip_if_skipped().
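
As an illustration of the intended pairing (this mirrors the spinlock
annotation added later in this series; the example_* macros are made
up): the lock-side annotation calls dept_skip() instead of recording a
wait when a nest lock is supplied, and the unlock side only records the
event if the acquisition was not skipped:

#define example_lock_annotate(m, nest, ip)                              \
do {                                                                    \
        if (nest) {                                                     \
                dept_skip(m);   /* nest lock: skip the dependency check */\
        } else {                                                        \
                dept_wait(m, 1UL, ip, __func__, 0);                     \
                dept_ecxt_enter(m, 1UL, ip, __func__, "example_unlock", 0);\
                dept_ask_event(m);                                      \
        }                                                               \
} while (0)

#define example_unlock_annotate(m, ip)                                  \
do {                                                                    \
        if (!dept_unskip_if_skipped(m)) {                               \
                dept_event(m, 1UL, ip, __func__);                       \
                dept_ecxt_exit(m, ip);                                  \
        }                                                               \
} while (0)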

Signed-off-by: Byungchul Park 
---
 include/linux/dept.h |  9 +
 include/linux/dept_sdt.h |  2 +-
 include/linux/lockdep.h  |  4 +++-
 kernel/dependency/dept.c | 49 
 4 files changed, 62 insertions(+), 2 deletions(-)

diff --git a/include/linux/dept.h b/include/linux/dept.h
index c3fb3cf..c0bbb8e 100644
--- a/include/linux/dept.h
+++ b/include/linux/dept.h
@@ -352,6 +352,11 @@ struct dept_map {
unsigned intwgen;
 
/*
+* for skipping dependency check temporarily
+*/
+   atomic_tskip_cnt;
+
+   /*
 * whether this map should be going to be checked or not
 */
boolnocheck;
@@ -444,6 +449,8 @@ struct dept_task {
 extern void dept_ask_event(struct dept_map *m);
 extern void dept_event(struct dept_map *m, unsigned long e_f, unsigned long 
ip, const char *e_fn);
 extern void dept_ecxt_exit(struct dept_map *m, unsigned long ip);
+extern void dept_skip(struct dept_map *m);
+extern bool dept_unskip_if_skipped(struct dept_map *m);
 
 /*
  * for users who want to manage external keys
@@ -475,6 +482,8 @@ struct dept_task {
 #define dept_ask_event(m)  do { } while (0)
 #define dept_event(m, e_f, ip, e_fn)   do { (void)(e_fn); } while (0)
 #define dept_ecxt_exit(m, ip)  do { } while (0)
+#define dept_skip(m)   do { } while (0)
+#define dept_unskip_if_skipped(m)  (false)
 #define dept_key_init(k)   do { (void)(k); } while (0)
 #define dept_key_destroy(k)do { (void)(k); } while (0)
 #endif
diff --git a/include/linux/dept_sdt.h b/include/linux/dept_sdt.h
index 375c4c3..e9d558d 100644
--- a/include/linux/dept_sdt.h
+++ b/include/linux/dept_sdt.h
@@ -13,7 +13,7 @@
 #include 
 
 #ifdef CONFIG_DEPT
-#define DEPT_SDT_MAP_INIT(dname)   { .name = #dname }
+#define DEPT_SDT_MAP_INIT(dname)   { .name = #dname, .skip_cnt = 
ATOMIC_INIT(0) }
 
 /*
  * SDT(Single-event Dependency Tracker) APIs
diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h
index c56f6b6..c1a56fe 100644
--- a/include/linux/lockdep.h
+++ b/include/linux/lockdep.h
@@ -488,7 +488,9 @@ enum xhlock_context_t {
  */
 #define STATIC_DEPT_MAP_INIT(_name, _key) .dmap = {\
.name = (_name),\
-   .keys = NULL },
+   .keys = NULL,   \
+   .skip_cnt = ATOMIC_INIT(0), \
+   },
 #else
 #define STATIC_DEPT_MAP_INIT(_name, _key)
 #endif
diff --git a/kernel/dependency/dept.c b/kernel/dependency/dept.c
index ec3f131..3f22c5b 100644
--- a/kernel/dependency/dept.c
+++ b/kernel/dependency/dept.c
@@ -1943,6 +1943,7 @@ void dept_map_init(struct dept_map *m, struct dept_key 
*k, int sub,
m->name = n;
m->wgen = 0U;
m->nocheck = false;
+   atomic_set(&m->skip_cnt, 0);
 exit:
dept_exit(flags);
 }
@@ -1963,6 +1964,7 @@ void dept_map_reinit(struct dept_map *m)
 
clean_classes_cache(&m->keys_local);
m->wgen = 0U;
+   atomic_set(&m->skip_cnt, 0);
 
dept_exit(flags);
 }
@@ -2346,6 +2348,53 @@ void dept_ecxt_exit(struct dept_map *m, unsigned long ip)
 }
 EXPORT_SYMBOL_GPL(dept_ecxt_exit);
 
+void dept_skip(struct dept_map *m)
+{
+   struct dept_task *dt = dept_task();
+   unsigned long flags;
+
+   if (READ_ONCE(dept_stop) || dt->recursive)
+   return;
+
+   if (m->nocheck)
+   return;
+
+   flags = dept_enter();
+
+   atomic_inc(&m->skip_cnt);
+
+   dept_exit(flags);
+}
+EXPORT_SYMBOL_GPL(dept_skip);
+
+/*
+ * Return true if successfully unskip, otherwise false.
+ */
+bool dept_unskip_if_skipped(struct dept_map *m)
+{
+   struct dept_task *dt = dept_task();
+   unsigned long flags;
+   bool ret = false;
+
+   if (READ_ONCE(dept_stop) || dt->recursive)
+   return false;
+
+   if (m->nocheck)
+   return false;
+
+   flags = dept_enter();
+
+   if (!atomic_read(&m->skip_cnt))
+   goto exit;
+
+   atomic_dec(&m->skip_cnt);
+   ret = true;
+exit:
+   dept_exit(flags);
+   return ret;
+}
+EXPORT_SYMBOL_GPL(dept_unskip_if_skipped);
+
 void dept_task_exit(struct task_struct *t)
 {
struct dept_task *dt = &t->dept_task;
-- 
1.9.1



[PATCH v4 01/24] llist: Move llist_{head,node} definition to types.h

2022-03-03 Thread Byungchul Park
llist_head and llist_node can be used by very low-level primitives.
For example, Dept, which tracks dependencies, uses llist in its header.
To avoid a header dependency, move those definitions to types.h.
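
For example, after this change a header that only needs to embed the
list types no longer has to pull in llist.h at all (sketch, the header
name is made up):

/* include/linux/some_primitive.h */
#ifndef __LINUX_SOME_PRIMITIVE_H
#define __LINUX_SOME_PRIMITIVE_H

#include <linux/types.h>        /* struct llist_head/llist_node now live here */

struct some_primitive {
        struct llist_head pool; /* no #include <linux/llist.h> needed */
        struct llist_node node;
};

#endif /* __LINUX_SOME_PRIMITIVE_H */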

Signed-off-by: Byungchul Park 
---
 include/linux/llist.h | 8 
 include/linux/types.h | 8 
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/include/linux/llist.h b/include/linux/llist.h
index 85bda2d..99cc3c3 100644
--- a/include/linux/llist.h
+++ b/include/linux/llist.h
@@ -53,14 +53,6 @@
 #include 
 #include 
 
-struct llist_head {
-   struct llist_node *first;
-};
-
-struct llist_node {
-   struct llist_node *next;
-};
-
 #define LLIST_HEAD_INIT(name)  { NULL }
 #define LLIST_HEAD(name)   struct llist_head name = LLIST_HEAD_INIT(name)
 
diff --git a/include/linux/types.h b/include/linux/types.h
index ac825ad..4662d6e 100644
--- a/include/linux/types.h
+++ b/include/linux/types.h
@@ -187,6 +187,14 @@ struct hlist_node {
struct hlist_node *next, **pprev;
 };
 
+struct llist_head {
+   struct llist_node *first;
+};
+
+struct llist_node {
+   struct llist_node *next;
+};
+
 struct ustat {
__kernel_daddr_tf_tfree;
 #ifdef CONFIG_ARCH_32BIT_USTAT_F_TINODE
-- 
1.9.1



[PATCH v4 00/24] DEPT(Dependency Tracker)

2022-03-03 Thread Byungchul Park
Hi Linus and folks,

I've been developing a tool for detecting deadlock possibilities by
tracking wait/event rather than lock(?) acquisition order, trying to
cover all synchronization mechanisms. It's based on the v5.17-rc1 tag.

https://github.com/lgebyungchulpark/linux-dept/commits/dept1.14_on_v5.17-rc1

Benefits:

0. Works with all lock primitives.
1. Works with wait_for_completion()/complete().
2. Works with 'wait' on PG_locked.
3. Works with 'wait' on PG_writeback.
4. Works with swait/wakeup.
5. Works with waitqueue.
6. Multiple reports are allowed.
7. Deduplication control on multiple reports.
8. Withstand false positives thanks to 6.
9. Easy to tag any wait/event.

Future work:

0. To make it more stable.
1. To separate Dept from Lockdep.
2. To improve performance in terms of time and space.
3. To use Dept as a dependency engine for Lockdep.
4. To add any missing tags of wait/event in the kernel.
5. To deduplicate stack trace.

How to interpret reports:

1. The E(event) in each context cannot be triggered because of the
   W(wait) that cannot be woken.
2. The stack trace that helps find the problematic code is located
   in each context's detail.

Thanks,
Byungchul

---

Changes from v3:

1. Dept shouldn't create dependencies between different depths
   of a class that were indicated by *_lock_nested(). Dept
   normally doesn't but it does once another lock class comes
   in. So fixed it. (feedback from Hyeonggon)
2. Dept considered a wait as a real wait once getting to
   __schedule() even if it has been set to TASK_RUNNING by wake
   up sources in advance. Fixed it so that Dept doesn't consider
   the case as a real wait. (feedback from Jan Kara)
3. Stop tracking dependencies with a map once the event
   associated with the map has been handled. Dept will start to
   work with the map again, on the next sleep.

Changes from v2:

1. Disable Dept on bit_wait_table[] in sched/wait_bit.c, which was
   reporting a lot of false positives; that is my fault. Wait/event
   for bit_wait_table[] should've been tagged in a higher layer to
   work better, which is future work.
   (feedback from Jan Kara)
2. Disable Dept on crypto_larval's completion to prevent a false
   positive.

Changes from v1:

1. Fix coding style and typo. (feedback from Steven)
2. Distinguish each work context from another in workqueue.
3. Skip checking lock acquisition with nest_lock, which is about
   correct lock usage that should be checked by Lockdep.

Changes from RFC:

1. Prevent adding a wait tag at prepare_to_wait() but __schedule().
   (feedback from Linus and Matthew)
2. Use try version at lockdep_acquire_cpus_lock() annotation.
3. Distinguish each syscall context from another.

Byungchul Park (24):
  llist: Move llist_{head,node} definition to types.h
  dept: Implement Dept(Dependency Tracker)
  dept: Embed Dept data in Lockdep
  dept: Add an API for skipping dependency check temporarily
  dept: Apply Dept to spinlock
  dept: Apply Dept to mutex families
  dept: Apply Dept to rwlock
  dept: Apply Dept to wait_for_completion()/complete()
  dept: Apply Dept to seqlock
  dept: Apply Dept to rwsem
  dept: Add proc knobs to show stats and dependency graph
  dept: Introduce split map concept and new APIs for them
  dept: Apply Dept to wait/event of PG_{locked,writeback}
  dept: Apply SDT to swait
  dept: Apply SDT to wait(waitqueue)
  locking/lockdep, cpu/hotplug: Use a weaker annotation in AP thread
  dept: Distinguish each syscall context from another
  dept: Distinguish each work from another
  dept: Disable Dept within the wait_bit layer by default
  dept: Add nocheck version of init_completion()
  dept: Disable Dept on struct crypto_larval's completion for now
  dept: Don't create dependencies between different depths in any case
  dept: Let it work with real sleeps in __schedule()
  dept: Disable Dept on that map once it's been handled until next turn

 crypto/api.c   |7 +-
 include/linux/completion.h |   50 +-
 include/linux/dept.h   |  535 +++
 include/linux/dept_page.h  |   78 ++
 include/linux/dept_sdt.h   |   62 +
 include/linux/hardirq.h|3 +
 include/linux/irqflags.h   |   33 +-
 include/linux/llist.h  |8 -
 include/linux/lockdep.h|  158 ++-
 include/linux/lockdep_types.h  |3 +
 include/linux/mutex.h  |   33 +
 include/linux/page-flags.h |   45 +-
 include/linux/pagemap.h|7 +-
 include/linux/percpu-rwsem.h   |   10 +-
 include/linux/rtmutex.h|7 +
 include/linux/rwlock.h |   52 +
 include/lin

[PATCH v4 07/24] dept: Apply Dept to rwlock

2022-03-03 Thread Byungchul Park
Make Dept able to track dependencies created by rwlock operations.

Signed-off-by: Byungchul Park 
---
 include/linux/lockdep.h| 25 
 include/linux/rwlock.h | 52 ++
 include/linux/rwlock_api_smp.h |  8 +++
 include/linux/rwlock_types.h   |  7 ++
 4 files changed, 83 insertions(+), 9 deletions(-)

diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h
index 6653a4f..b93a707 100644
--- a/include/linux/lockdep.h
+++ b/include/linux/lockdep.h
@@ -600,16 +600,31 @@ static inline void print_irqtrace_events(struct 
task_struct *curr)
dept_spin_unlock(&(l)->dmap, i);\
 } while (0)
 
-#define rwlock_acquire(l, s, t, i) lock_acquire_exclusive(l, s, t, 
NULL, i)
+#define rwlock_acquire(l, s, t, i) \
+do {   \
+   lock_acquire_exclusive(l, s, t, NULL, i);   \
+   dept_rwlock_wlock(&(l)->dmap, s, t, NULL, "write_unlock", i);   \
+} while (0)
 #define rwlock_acquire_read(l, s, t, i)
\
 do {   \
-   if (read_lock_is_recursive())   \
+   if (read_lock_is_recursive()) { \
lock_acquire_shared_recursive(l, s, t, NULL, i);\
-   else\
+   dept_rwlock_rlock(&(l)->dmap, s, t, NULL, "read_unlock", i, 0);\
+   } else {\
lock_acquire_shared(l, s, t, NULL, i);  \
+   dept_rwlock_rlock(&(l)->dmap, s, t, NULL, "read_unlock", i, 1);\
+   }   \
+} while (0)
+#define rwlock_release(l, i)   \
+do {   \
+   lock_release(l, i); \
+   dept_rwlock_wunlock(&(l)->dmap, i); \
+} while (0)
+#define rwlock_release_read(l, i)  \
+do {   \
+   lock_release(l, i); \
+   dept_rwlock_runlock(&(l)->dmap, i); \
 } while (0)
-
-#define rwlock_release(l, i)   lock_release(l, i)
 
 #define seqcount_acquire(l, s, t, i)   lock_acquire_exclusive(l, s, t, 
NULL, i)
 #define seqcount_acquire_read(l, s, t, i)  
lock_acquire_shared_recursive(l, s, t, NULL, i)
diff --git a/include/linux/rwlock.h b/include/linux/rwlock.h
index 8f416c5..768ad9e 100644
--- a/include/linux/rwlock.h
+++ b/include/linux/rwlock.h
@@ -28,6 +28,58 @@
do { *(lock) = __RW_LOCK_UNLOCKED(lock); } while (0)
 #endif
 
+#ifdef CONFIG_DEPT
+#define DEPT_EVT_RWLOCK_R  1UL
+#define DEPT_EVT_RWLOCK_W  (1UL << 1)
+#define DEPT_EVT_RWLOCK_RW (DEPT_EVT_RWLOCK_R | DEPT_EVT_RWLOCK_W)
+
+#define dept_rwlock_wlock(m, ne, t, n, e_fn, ip)   \
+do {   \
+   if (t) {\
+   dept_ecxt_enter(m, DEPT_EVT_RWLOCK_W, ip, __func__, e_fn, ne);\
+   dept_ask_event(m);  \
+   } else if (n) { \
+   dept_skip(m);   \
+   } else {\
+   dept_wait(m, DEPT_EVT_RWLOCK_RW, ip, __func__, ne); \
+   dept_ecxt_enter(m, DEPT_EVT_RWLOCK_W, ip, __func__, e_fn, ne);\
+   dept_ask_event(m);  \
+   }   \
+} while (0)
+#define dept_rwlock_rlock(m, ne, t, n, e_fn, ip, q)\
+do {   \
+   if (t) {\
+   dept_ecxt_enter(m, DEPT_EVT_RWLOCK_R, ip, __func__, e_fn, ne);\
+   dept_ask_event(m);  \
+   } else if (n) { \
+   dept_skip(m);   \
+   } else {\
+   dept_wait(m, (q) ? DEPT_EVT_RWLOCK_RW : DEPT_EVT_RWLOCK_W, ip, 
__func__, ne);\
+   dept_ecxt_enter(m, DEPT_EVT_RWLOCK_R, ip, __func__, e_fn, ne);\
+   dept_ask_event(m); 

[PATCH v4 05/24] dept: Apply Dept to spinlock

2022-03-03 Thread Byungchul Park
Make Dept able to track dependencies created by spinlock operations.

Signed-off-by: Byungchul Park 
---
 include/linux/lockdep.h| 18 +++---
 include/linux/spinlock.h   | 26 ++
 include/linux/spinlock_types_raw.h | 13 +
 3 files changed, 54 insertions(+), 3 deletions(-)

diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h
index c1a56fe..529ea18 100644
--- a/include/linux/lockdep.h
+++ b/include/linux/lockdep.h
@@ -584,9 +584,21 @@ static inline void print_irqtrace_events(struct 
task_struct *curr)
 #define lock_acquire_shared(l, s, t, n, i) lock_acquire(l, s, t, 
1, 1, n, i)
 #define lock_acquire_shared_recursive(l, s, t, n, i)   lock_acquire(l, s, t, 
2, 1, n, i)
 
-#define spin_acquire(l, s, t, i)   lock_acquire_exclusive(l, s, t, 
NULL, i)
-#define spin_acquire_nest(l, s, t, n, i)   lock_acquire_exclusive(l, s, t, 
n, i)
-#define spin_release(l, i) lock_release(l, i)
+#define spin_acquire(l, s, t, i)   \
+do {   \
+   lock_acquire_exclusive(l, s, t, NULL, i);   \
+   dept_spin_lock(&(l)->dmap, s, t, NULL, "spin_unlock", i);   \
+} while (0)
+#define spin_acquire_nest(l, s, t, n, i)   \
+do {   \
+   lock_acquire_exclusive(l, s, t, n, i);  \
+   dept_spin_lock(&(l)->dmap, s, t, (n) ? &(n)->dmap : NULL, 
"spin_unlock", i); \
+} while (0)
+#define spin_release(l, i) \
+do {   \
+   lock_release(l, i); \
+   dept_spin_unlock(&(l)->dmap, i);\
+} while (0)
 
 #define rwlock_acquire(l, s, t, i) lock_acquire_exclusive(l, s, t, 
NULL, i)
 #define rwlock_acquire_read(l, s, t, i)
\
diff --git a/include/linux/spinlock.h b/include/linux/spinlock.h
index 5c0c517..6b5c3f4 100644
--- a/include/linux/spinlock.h
+++ b/include/linux/spinlock.h
@@ -95,6 +95,32 @@
 # include 
 #endif
 
+#ifdef CONFIG_DEPT
+#define dept_spin_lock(m, ne, t, n, e_fn, ip)  \
+do {   \
+   if (t) {\
+   dept_ecxt_enter(m, 1UL, ip, __func__, e_fn, ne);\
+   dept_ask_event(m);  \
+   } else if (n) { \
+   dept_skip(m);   \
+   } else {\
+   dept_wait(m, 1UL, ip, __func__, ne);\
+   dept_ecxt_enter(m, 1UL, ip, __func__, e_fn, ne);\
+   dept_ask_event(m);  \
+   }   \
+} while (0)
+#define dept_spin_unlock(m, ip)
\
+do {   \
+   if (!dept_unskip_if_skipped(m)) {   \
+   dept_event(m, 1UL, ip, __func__);   \
+   dept_ecxt_exit(m, ip);  \
+   }   \
+} while (0)
+#else
+#define dept_spin_lock(m, ne, t, n, e_fn, ip)  do { } while (0)
+#define dept_spin_unlock(m, ip)do { } while (0)
+#endif
+
 #ifdef CONFIG_DEBUG_SPINLOCK
   extern void __raw_spin_lock_init(raw_spinlock_t *lock, const char *name,
   struct lock_class_key *key, short inner);
diff --git a/include/linux/spinlock_types_raw.h 
b/include/linux/spinlock_types_raw.h
index 91cb36b..279e821 100644
--- a/include/linux/spinlock_types_raw.h
+++ b/include/linux/spinlock_types_raw.h
@@ -26,16 +26,28 @@
 
 #define SPINLOCK_OWNER_INIT((void *)-1L)
 
+#ifdef CONFIG_DEPT
+# define RAW_SPIN_DMAP_INIT(lockname)  .dmap = { .name = #lockname, .skip_cnt 
= ATOMIC_INIT(0) },
+# define SPIN_DMAP_INIT(lockname)  .dmap = { .name = #lockname, .skip_cnt 
= ATOMIC_INIT(0) },
+# define LOCAL_SPIN_DMAP_INIT(lockname).dmap = { .name = #lockname, 
.skip_cnt = ATOMIC_INIT(0) },
+#else
+# define RAW_SPIN_DMAP_INIT(lockname)
+# define SPIN_DMAP_INIT(lockname)
+# define LOCAL_SPIN_DMAP_INIT(lockname)
+#endif
+
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
 # define RAW_SPIN_DEP_MAP_INIT(lockname)   \
.dep_map = {\
.name = #lockname,  \
.wait_type_inner =

[PATCH v1 1/2] drm/aspeed: Add gfx flags and clock selection for AST2600

2022-03-03 Thread Tommy Haung
Add clock selection code for the AST2600. On the AST2600 the user can
select more than one display timing. Add gfx flags for future usage.

Signed-off-by: Tommy Haung 
---
 drivers/gpu/drm/aspeed/aspeed_gfx.h  | 11 +++
 drivers/gpu/drm/aspeed/aspeed_gfx_crtc.c | 14 ++
 drivers/gpu/drm/aspeed/aspeed_gfx_drv.c  |  4 
 3 files changed, 29 insertions(+)

diff --git a/drivers/gpu/drm/aspeed/aspeed_gfx.h 
b/drivers/gpu/drm/aspeed/aspeed_gfx.h
index 4e6a442c3886..eb4c267cde5e 100644
--- a/drivers/gpu/drm/aspeed/aspeed_gfx.h
+++ b/drivers/gpu/drm/aspeed/aspeed_gfx.h
@@ -16,6 +16,7 @@ struct aspeed_gfx {
u32 vga_scratch_reg;
u32 throd_val;
u32 scan_line_max;
+   u32 flags;
 
struct drm_simple_display_pipe  pipe;
struct drm_connectorconnector;
@@ -106,3 +107,13 @@ int aspeed_gfx_create_output(struct drm_device *drm);
 /* CRT_THROD */
 #define CRT_THROD_LOW(x)   (x)
 #define CRT_THROD_HIGH(x)  ((x) << 8)
+
+/* SCU control */
+#define SCU_G6_CLK_COURCE  0x300
+
+/* GFX FLAGS */
+#define CLK_MASK   BIT(0)
+#define CLK_G6 BIT(0)
+
+#define G6_CLK_MASK(BIT(8) | BIT(9) | BIT(10))
+#define G6_USB_40_CLK  BIT(9)
diff --git a/drivers/gpu/drm/aspeed/aspeed_gfx_crtc.c 
b/drivers/gpu/drm/aspeed/aspeed_gfx_crtc.c
index 827e62c1daba..a24fab22eac4 100644
--- a/drivers/gpu/drm/aspeed/aspeed_gfx_crtc.c
+++ b/drivers/gpu/drm/aspeed/aspeed_gfx_crtc.c
@@ -77,6 +77,18 @@ static void aspeed_gfx_disable_controller(struct aspeed_gfx 
*priv)
regmap_update_bits(priv->scu, priv->dac_reg, BIT(16), 0);
 }
 
+static void aspeed_gfx_set_clk(struct aspeed_gfx *priv)
+{
+   switch (priv->flags & CLK_MASK) {
+   case CLK_G6:
+   regmap_update_bits(priv->scu, SCU_G6_CLK_COURCE, G6_CLK_MASK, 
0x0);
+   regmap_update_bits(priv->scu, SCU_G6_CLK_COURCE, G6_CLK_MASK, 
G6_USB_40_CLK);
+   break;
+   default:
+   break;
+   }
+}
+
 static void aspeed_gfx_crtc_mode_set_nofb(struct aspeed_gfx *priv)
 {
struct drm_display_mode *m = &priv->pipe.crtc.state->adjusted_mode;
@@ -87,6 +99,8 @@ static void aspeed_gfx_crtc_mode_set_nofb(struct aspeed_gfx 
*priv)
if (err)
return;
 
+   aspeed_gfx_set_clk(priv);
+
 #if 0
/* TODO: we have only been able to test with the 40MHz USB clock. The
 * clock is fixed, so we cannot adjust it here. */
diff --git a/drivers/gpu/drm/aspeed/aspeed_gfx_drv.c 
b/drivers/gpu/drm/aspeed/aspeed_gfx_drv.c
index d10246b1d1c2..af56ffdccc65 100644
--- a/drivers/gpu/drm/aspeed/aspeed_gfx_drv.c
+++ b/drivers/gpu/drm/aspeed/aspeed_gfx_drv.c
@@ -64,6 +64,7 @@ struct aspeed_gfx_config {
u32 vga_scratch_reg;/* VGA scratch register in SCU */
u32 throd_val;  /* Default Threshold Seting */
u32 scan_line_max;  /* Max memory size of one scan line */
+   u32 gfx_flags;  /* Flags for gfx chip caps */
 };
 
 static const struct aspeed_gfx_config ast2400_config = {
@@ -72,6 +73,7 @@ static const struct aspeed_gfx_config ast2400_config = {
.vga_scratch_reg = 0x50,
.throd_val = CRT_THROD_LOW(0x1e) | CRT_THROD_HIGH(0x12),
.scan_line_max = 64,
+   .gfx_flags = 0,
 };
 
 static const struct aspeed_gfx_config ast2500_config = {
@@ -80,6 +82,7 @@ static const struct aspeed_gfx_config ast2500_config = {
.vga_scratch_reg = 0x50,
.throd_val = CRT_THROD_LOW(0x24) | CRT_THROD_HIGH(0x3c),
.scan_line_max = 128,
+   .gfx_flags = 0,
 };
 
 static const struct aspeed_gfx_config ast2600_config = {
@@ -88,6 +91,7 @@ static const struct aspeed_gfx_config ast2600_config = {
.vga_scratch_reg = 0x50,
.throd_val = CRT_THROD_LOW(0x50) | CRT_THROD_HIGH(0x70),
.scan_line_max = 128,
+   .gfx_flags = CLK_G6,
 };
 
 static const struct of_device_id aspeed_gfx_match[] = {
-- 
2.17.1



[PATCH v1 0/2] Add 1024x768 timing for AST2600

2022-03-03 Thread Tommy Haung
v1: 
  Add 1024x768@70Hz for AST2600 SoC display timing selection.
  Add gfx flag for future usage.

Testing steps:

1. Add the config below to turn VT and LOGO on.

CONFIG_TTY=y
CONFIG_VT=y
CONFIG_CONSOLE_TRANSLATIONS=y
CONFIG_VT_CONSOLE=y
CONFIG_VT_CONSOLE_SLEEP=y
CONFIG_HW_CONSOLE=y
CONFIG_VT_HW_CONSOLE_BINDING=y
CONFIG_UNIX98_PTYS=y
CONFIG_LDISC_AUTOLOAD=y
CONFIG_DEVMEM=y
CONFIG_DUMMY_CONSOLE=y
CONFIG_FRAMEBUFFER_CONSOLE=y
CONFIG_FRAMEBUFFER_CONSOLE_DETECT_PRIMARY=y
CONFIG_LOGO=y
CONFIG_LOGO_LINUX_CLUT224=y

  2. The Linux logo will be shown on the screen when the BMC boots into Linux.
  3. Check that the display mode is 1024x768@70Hz on the AST2600.
  4. Check that the display mode is 800x600@60Hz on the AST2500.

Tommy Haung (2):
  drm/aspeed: Add gfx flags and clock selection for AST2600
  drm/aspeed: Add 1024x768 mode for AST2600

 drivers/gpu/drm/aspeed/aspeed_gfx.h  | 15 ++
 drivers/gpu/drm/aspeed/aspeed_gfx_crtc.c | 35 
 drivers/gpu/drm/aspeed/aspeed_gfx_drv.c  | 20 --
 drivers/gpu/drm/aspeed/aspeed_gfx_out.c  | 14 +-
 4 files changed, 81 insertions(+), 3 deletions(-)

-- 
2.17.1



[PATCH v1 2/2] drm/aspeed: Add 1024x768 mode for AST2600

2022-03-03 Thread Tommy Haung
Update aspeed_gfx_set_clk() to take the display width. On the AST2600,
the display clock can come from the HPLL clock divided by 16, i.e.
75MHz, which fits 1024x768@70Hz. Other chips will still keep 800x600.
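
(For reference: the standard 1024x768@70Hz mode uses a 75MHz pixel
clock, so assuming the AST2600 HPLL runs at 1.2GHz, an assumption not
stated in this patch, 1200MHz / 16 = 75MHz matches that exactly.)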

Signed-off-by: Tommy Haung 
---
 drivers/gpu/drm/aspeed/aspeed_gfx.h  | 12 ++
 drivers/gpu/drm/aspeed/aspeed_gfx_crtc.c | 29 
 drivers/gpu/drm/aspeed/aspeed_gfx_drv.c  | 16 +++--
 drivers/gpu/drm/aspeed/aspeed_gfx_out.c  | 14 +++-
 4 files changed, 60 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/aspeed/aspeed_gfx.h 
b/drivers/gpu/drm/aspeed/aspeed_gfx.h
index eb4c267cde5e..c7aefee0657a 100644
--- a/drivers/gpu/drm/aspeed/aspeed_gfx.h
+++ b/drivers/gpu/drm/aspeed/aspeed_gfx.h
@@ -109,11 +109,15 @@ int aspeed_gfx_create_output(struct drm_device *drm);
 #define CRT_THROD_HIGH(x)  ((x) << 8)
 
 /* SCU control */
-#define SCU_G6_CLK_COURCE  0x300
+#define G6_CLK_SOURCE  0x300
+#define G6_CLK_SOURCE_MASK (BIT(8) | BIT(9) | BIT(10))
+#define G6_CLK_SOURCE_HPLL (BIT(8) | BIT(9) | BIT(10))
+#define G6_CLK_SOURCE_USB  BIT(9)
+#define G6_CLK_SEL30x308
+#define G6_CLK_DIV_MASK0x3F000
+#define G6_CLK_DIV_16  (BIT(16)|BIT(15)|BIT(13)|BIT(12))
+#define G6_USB_40_CLK  BIT(9)
 
 /* GFX FLAGS */
 #define CLK_MASK   BIT(0)
 #define CLK_G6 BIT(0)
-
-#define G6_CLK_MASK(BIT(8) | BIT(9) | BIT(10))
-#define G6_USB_40_CLK  BIT(9)
diff --git a/drivers/gpu/drm/aspeed/aspeed_gfx_crtc.c 
b/drivers/gpu/drm/aspeed/aspeed_gfx_crtc.c
index a24fab22eac4..5829be9c7c67 100644
--- a/drivers/gpu/drm/aspeed/aspeed_gfx_crtc.c
+++ b/drivers/gpu/drm/aspeed/aspeed_gfx_crtc.c
@@ -23,6 +23,28 @@ drm_pipe_to_aspeed_gfx(struct drm_simple_display_pipe *pipe)
return container_of(pipe, struct aspeed_gfx, pipe);
 }
 
+static void aspeed_gfx_set_clock_source(struct aspeed_gfx *priv, int 
mode_width)
+{
+   regmap_update_bits(priv->scu, G6_CLK_SOURCE, G6_CLK_SOURCE_MASK, 0x0);
+   regmap_update_bits(priv->scu, G6_CLK_SEL3, G6_CLK_DIV_MASK, 0x0);
+
+   switch (mode_width) {
+   case 1024:
+   /* hpll div 16 = 75Mhz */
+   regmap_update_bits(priv->scu, G6_CLK_SOURCE,
+   G6_CLK_SOURCE_MASK, G6_CLK_SOURCE_HPLL);
+   regmap_update_bits(priv->scu, G6_CLK_SEL3,
+   G6_CLK_DIV_MASK, G6_CLK_DIV_16);
+   break;
+   case 800:
+   default:
+   /* usb 40Mhz */
+   regmap_update_bits(priv->scu, G6_CLK_SOURCE,
+   G6_CLK_SOURCE_MASK, G6_CLK_SOURCE_USB);
+   break;
+   }
+}
+
 static int aspeed_gfx_set_pixel_fmt(struct aspeed_gfx *priv, u32 *bpp)
 {
struct drm_crtc *crtc = &priv->pipe.crtc;
@@ -77,12 +99,11 @@ static void aspeed_gfx_disable_controller(struct aspeed_gfx 
*priv)
regmap_update_bits(priv->scu, priv->dac_reg, BIT(16), 0);
 }
 
-static void aspeed_gfx_set_clk(struct aspeed_gfx *priv)
+static void aspeed_gfx_set_clk(struct aspeed_gfx *priv, int mode_width)
 {
switch (priv->flags & CLK_MASK) {
case CLK_G6:
-   regmap_update_bits(priv->scu, SCU_G6_CLK_COURCE, G6_CLK_MASK, 
0x0);
-   regmap_update_bits(priv->scu, SCU_G6_CLK_COURCE, G6_CLK_MASK, 
G6_USB_40_CLK);
+   aspeed_gfx_set_clock_source(priv, mode_width);
break;
default:
break;
@@ -99,7 +120,7 @@ static void aspeed_gfx_crtc_mode_set_nofb(struct aspeed_gfx 
*priv)
if (err)
return;
 
-   aspeed_gfx_set_clk(priv);
+   aspeed_gfx_set_clk(priv, m->hdisplay);
 
 #if 0
/* TODO: we have only been able to test with the 40MHz USB clock. The
diff --git a/drivers/gpu/drm/aspeed/aspeed_gfx_drv.c 
b/drivers/gpu/drm/aspeed/aspeed_gfx_drv.c
index af56ffdccc65..e1a814aebc2d 100644
--- a/drivers/gpu/drm/aspeed/aspeed_gfx_drv.c
+++ b/drivers/gpu/drm/aspeed/aspeed_gfx_drv.c
@@ -110,6 +110,7 @@ static const struct drm_mode_config_funcs 
aspeed_gfx_mode_config_funcs = {
 
 static int aspeed_gfx_setup_mode_config(struct drm_device *drm)
 {
+   struct aspeed_gfx *priv = to_aspeed_gfx(drm);
int ret;
 
ret = drmm_mode_config_init(drm);
@@ -118,8 +119,18 @@ static int aspeed_gfx_setup_mode_config(struct drm_device 
*drm)
 
drm->mode_config.min_width = 0;
drm->mode_config.min_height = 0;
-   drm->mode_config.max_width = 800;
-   drm->mode_config.max_height = 600;
+
+   switch (priv->flags & CLK_MASK) {
+   case CLK_G6:
+   drm->mode_config.max_width = 1024;
+   drm->mode_config.max_height = 768;
+   break;
+   default:
+   drm->mode_config.max_width = 800;
+   drm->mode_config.max_height = 600;
+   break;
+   }
+
drm-

Re: [PATCH 1/9] dt-bindings: mxsfb: Add compatible for i.MX8MP

2022-03-03 Thread Marek Vasut

On 3/3/22 09:21, Lucas Stach wrote:

Am Donnerstag, dem 03.03.2022 um 04:14 +0100 schrieb Marek Vasut:

On 3/2/22 10:23, Lucas Stach wrote:

[...]


I tend to agree with Marek on this one.  We have an instance where the
blk-ctrl and the GPC driver between 8m, mini, nano, plus are close,
but different enough where each SoC has it's own set of tables and
some checks.   Lucas created the framework, and others adapted it for
various SoC's.  If there really is nearly 50% common code for the
LCDIF, why not either leave the driver as one or split the common code
into its own driver like lcdif-common and then have smaller drivers
that handle their specific variations.


I don't know exactly how the standalone driver looks like, but I guess
the overlap is not really in any real HW specific parts, but the common
DRM boilerplate, so there isn't much point in creating a common lcdif
driver.


The mxsfb currently has 1280 LoC as of patch 8/9 of this series. Of
that, there is some 400 LoC which are specific to old LCDIF and this
patch adds 380 LoC for the new LCDIF. So that's 800 LoC or ~60% of
shared boilerplate that would be duplicated.


That is probably ignoring the fact that the 8MP LCDIF does not support
any overlays, so it could use the drm_simple_display_pipe
infrastructure to reduce the needed boilerplate.


It seems the IMXRT1070 LCDIF v2 (heh ...) does support overlays, so no,
the mxsfb and hypothetical lcdif drivers would look really very similar.


As you brought up the blk-ctrl as an example: I'm all for supporting
slightly different hardware in the same driver, as long as the HW
interface is close enough. But then I also opted for a separate 8MP
blk-ctrl driver for those blk-ctrls that differ significantly from the
others, as I think it would make the common driver unmaintainable
trying to support all the different variants in one driver.


But then you also need to maintain two sets of boilerplate, they
diverge, and that is not good.


I don't think that there is much chance for bugs going unfixed due to
divergence in the boilerplate, especially if you use the simple pipe
framework to handle most of that stuff for you, which gives you a lot
of code sharing with other simple DRM drivers.


But I can not use the simple pipe because overlays, see imxrt1070 .

[...]

We can always split the drivers later if this becomes unmaintainable
too, no ?


Not if you want to keep the same userspace running. As userspace has
some ties to the DRM driver name, e.g. for finding the right GBM
implementation, splitting the driver later on would be a UABI break.


Hum, so what other options do we have left ? Duplicate 60% of the driver 
right away ?


Re: [PATCH v5 3/5] drm/msm/dp: set stream_pixel rate directly

2022-03-03 Thread Stephen Boyd
Quoting Dmitry Baryshkov (2022-03-03 20:23:06)
> On Fri, 4 Mar 2022 at 01:32, Stephen Boyd  wrote:
> >
> > Quoting Dmitry Baryshkov (2022-02-16 21:55:27)
> > > The only clock for which we set the rate is the "stream_pixel". Rather
> > > than storing the rate and then setting it by looping over all the
> > > clocks, set the clock rate directly.
> > >
> > > Signed-off-by: Dmitry Baryshkov 
> > [...]
> > > diff --git a/drivers/gpu/drm/msm/dp/dp_ctrl.c 
> > > b/drivers/gpu/drm/msm/dp/dp_ctrl.c
> > > index 07f6bf7e1acb..8e6361dedd77 100644
> > > --- a/drivers/gpu/drm/msm/dp/dp_ctrl.c
> > > +++ b/drivers/gpu/drm/msm/dp/dp_ctrl.c
> > > @@ -1315,7 +1315,7 @@ static void dp_ctrl_set_clock_rate(struct 
> > > dp_ctrl_private *ctrl,
> > > DRM_DEBUG_DP("setting rate=%lu on clk=%s\n", rate, name);
> > >
> > > if (num)
> > > -   cfg->rate = rate;
> > > +   clk_set_rate(cfg->clk, rate);
> >
> > This looks bad. From what I can tell we set the rate of the pixel clk
> > after enabling the phy and configuring it. See the order of operations
> > in dp_ctrl_enable_mainlink_clocks() and note how dp_power_clk_enable()
> > is the one that eventually sets a rate through dp_power_clk_set_rate()
> >
> > dp_ctrl_set_clock_rate(ctrl, DP_CTRL_PM, "ctrl_link",
> > ctrl->link->link_params.rate * 
> > 1000);
> >
> > phy_configure(phy, &dp_io->phy_opts);
> > phy_power_on(phy);
> >
> > ret = dp_power_clk_enable(ctrl->power, DP_CTRL_PM, true);
>
> This code has been changed in the previous patch.
>
> Let's get back a bit.
> Currently dp_ctrl_set_clock_rate() doesn't change the clock rate. It
> just stores the rate in the config so that later the sequence of
> dp_power_clk_enable() -> dp_power_clk_set_rate() ->
> [dp_power_clk_set_link_rate() -> dev_pm_opp_set_rate() or
> msm_dss_clk_set_rate() -> clk_set_rate()] will use that.
>
> There are only two users of dp_ctrl_set_clock_rate():
> - dp_ctrl_enable_mainlink_clocks(), which you have quoted above.
>   This case is handled in the patch 1 from this series. It makes

Patch 1 from this series says DP is unaffected. Huh?

> dp_ctrl_enable_mainlink_clocks() call dev_pm_opp_set_rate() directly
> without storing (!) the rate in the config, calling
> phy_configure()/phy_power_on() and then setting the opp via the
> sequence of calls specified above
>
> - dp_ctrl_enable_stream_clocks(), which calls dp_power_clk_enable()
> immediately afterwards. This call would set the stream_pixel rate
> while enabling stream clocks. As far as I can see, the stream_pixel is
> the only stream clock. So this patch sets the clock rate without
> storing in the interim configuration data.
>
> Could you please clarify, what exactly looks bad to you?
>

I'm concerned about the order of operations changing between the
phy being powered on and the pixel clk frequency being set. From what I
recall the pixel clk rate operations depend on the phy frequency being
set (which is done through phy_configure?) so if we call clk_set_rate()
on the pixel clk before the phy is set then the clk frequency will be
calculated badly and probably be incorrect.


Re: [PATCH v5 3/5] drm/msm/dp: set stream_pixel rate directly

2022-03-03 Thread Dmitry Baryshkov
On Fri, 4 Mar 2022 at 01:32, Stephen Boyd  wrote:
>
> Quoting Dmitry Baryshkov (2022-02-16 21:55:27)
> > The only clock for which we set the rate is the "stream_pixel". Rather
> > than storing the rate and then setting it by looping over all the
> > clocks, set the clock rate directly.
> >
> > Signed-off-by: Dmitry Baryshkov 
> [...]
> > diff --git a/drivers/gpu/drm/msm/dp/dp_ctrl.c 
> > b/drivers/gpu/drm/msm/dp/dp_ctrl.c
> > index 07f6bf7e1acb..8e6361dedd77 100644
> > --- a/drivers/gpu/drm/msm/dp/dp_ctrl.c
> > +++ b/drivers/gpu/drm/msm/dp/dp_ctrl.c
> > @@ -1315,7 +1315,7 @@ static void dp_ctrl_set_clock_rate(struct 
> > dp_ctrl_private *ctrl,
> > DRM_DEBUG_DP("setting rate=%lu on clk=%s\n", rate, name);
> >
> > if (num)
> > -   cfg->rate = rate;
> > +   clk_set_rate(cfg->clk, rate);
>
> This looks bad. From what I can tell we set the rate of the pixel clk
> after enabling the phy and configuring it. See the order of operations
> in dp_ctrl_enable_mainlink_clocks() and note how dp_power_clk_enable()
> is the one that eventually sets a rate through dp_power_clk_set_rate()
>
> dp_ctrl_set_clock_rate(ctrl, DP_CTRL_PM, "ctrl_link",
> ctrl->link->link_params.rate * 1000);
>
> phy_configure(phy, &dp_io->phy_opts);
> phy_power_on(phy);
>
> ret = dp_power_clk_enable(ctrl->power, DP_CTRL_PM, true);

This code has been changed in the previous patch.

Let's get back a bit.
Currently dp_ctrl_set_clock_rate() doesn't change the clock rate. It
just stores the rate in the config so that later the sequence of
dp_power_clk_enable() -> dp_power_clk_set_rate() ->
[dp_power_clk_set_link_rate() -> dev_pm_opp_set_rate() or
msm_dss_clk_set_rate() -> clk_set_rate()] will use that.

There are only two users of dp_ctrl_set_clock_rate():
- dp_ctrl_enable_mainlink_clocks(), which you have quoted above.
  This case is handled in the patch 1 from this series. It makes
dp_ctrl_enable_mainlink_clocks() call dev_pm_opp_set_rate() directly
without storing (!) the rate in the config, calling
phy_configure()/phy_power_on() and then setting the opp via the
sequence of calls specified above

- dp_ctrl_enable_stream_clocks(), which calls dp_power_clk_enable()
immediately afterwards. This call would set the stream_pixel rate
while enabling stream clocks. As far as I can see, the stream_pixel is
the only stream clock. So this patch sets the clock rate without
storing in the interim configuration data.

Could you please clarify, what exactly looks bad to you?

> and I vaguely recall that the DP phy needs to be configured for some
> frequency so that the pixel clk can use it when determining the rate to
> set.
>
> > else
> > DRM_ERROR("%s clock doesn't exit to set rate %lu\n",
> > name, rate);



-- 
With best wishes
Dmitry


[git pull] drm fixes for 5.17-rc7

2022-03-03 Thread Dave Airlie
Hi Linus,

Things are quieting down as expected, just a small set of fixes, i915,
exynos, amdgpu, vrr, bridge and hdlcd. Nothing scary at all.

Dave.

drm-fixes-2022-03-04:
drm fixes for 5.17-rc7

i915:
- Fix GuC SLPC unset command
- Fix misidentification of some Apple MacBook Pro laptops as Jasper Lake.

amdgpu:
- Suspend regression fix

exynos:
- irq handling fixes.
- Fix two regressions to TE-gpio handling.

arm/hdlcd:
- Select DRM_GEM_CMA_HELPER for HDLCD

bridge:
- ti-sn65dsi86: Properly undo autosuspend

vrr:
- Fix potential NULL-pointer deref
The following changes since commit 7e57714cd0ad2d5bb90e50b5096a0e671dec1ef3:

  Linux 5.17-rc6 (2022-02-27 14:36:33 -0800)

are available in the Git repository at:

  git://anongit.freedesktop.org/drm/drm tags/drm-fixes-2022-03-04

for you to fetch changes up to 8fdb19679722a02fe21642d39710c701d2ed567a:

  Merge tag 'drm-misc-fixes-2022-03-03' of
git://anongit.freedesktop.org/drm/drm-misc into drm-fixes (2022-03-04
13:04:11 +1000)


drm fixes for 5.17-rc7

i915:
- Fix GuC SLPC unset command
- Fix misidentification of some Apple MacBook Pro laptops as Jasper Lake.

amdgpu:
- Suspend regression fix

exynos:
- irq handling fixes.
- Fix two regressions to TE-gpio handling.

arm/hdlcd:
- Select DRM_GEM_CMA_HELPER for HDLCD

bridge:
- ti-sn65dsi86: Properly undo autosuspend

vrr:
- Fix potential NULL-pointer deref


Carsten Haitzler (1):
  drm/arm: arm hdlcd select DRM_GEM_CMA_HELPER

Dave Airlie (4):
  Merge tag 'exynos-drm-fixes-v5.17-rc6' of
git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos into
drm-fixes
  Merge tag 'drm-intel-fixes-2022-03-03' of
git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
  Merge tag 'amd-drm-fixes-5.17-2022-03-02' of
https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
  Merge tag 'drm-misc-fixes-2022-03-03' of
git://anongit.freedesktop.org/drm/drm-misc into drm-fixes

Douglas Anderson (1):
  drm/bridge: ti-sn65dsi86: Properly undo autosuspend

Lad Prabhakar (5):
  drm/exynos/exynos7_drm_decon: Use platform_get_irq_byname() to
get the interrupt
  drm/exynos: mixer: Use platform_get_irq() to get the interrupt
  drm/exynos/exynos_drm_fimd: Use platform_get_irq_byname() to get
the interrupt
  drm/exynos/fimc: Use platform_get_irq() to get the interrupt
  drm/exynos: gsc: Use platform_get_irq() to get the interrupt

Manasi Navare (1):
  drm/vrr: Set VRR capable prop only if it is attached to connector

Marek Szyprowski (2):
  drm/exynos: Don't fail if no TE-gpio is defined for DSI driver
  drm/exynos: Search for TE-gpio in DSI panel's node

Qiang Yu (1):
  drm/amdgpu: fix suspend/resume hang regression

Ville Syrjälä (1):
  drm/i915: s/JSP2/ICP2/ PCH

Vinay Belgaumkar (1):
  drm/i915/guc/slpc: Correct the param count for unset param

 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c  |  3 ++-
 drivers/gpu/drm/arm/Kconfig |  1 +
 drivers/gpu/drm/bridge/ti-sn65dsi86.c   |  5 +++--
 drivers/gpu/drm/drm_connector.c |  3 +++
 drivers/gpu/drm/exynos/exynos7_drm_decon.c  | 12 +++-
 drivers/gpu/drm/exynos/exynos_drm_dsi.c |  6 --
 drivers/gpu/drm/exynos/exynos_drm_fimc.c| 13 +
 drivers/gpu/drm/exynos/exynos_drm_fimd.c| 13 -
 drivers/gpu/drm/exynos/exynos_drm_gsc.c | 10 +++---
 drivers/gpu/drm/exynos/exynos_mixer.c   | 14 ++
 drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c |  2 +-
 drivers/gpu/drm/i915/intel_pch.c|  2 +-
 drivers/gpu/drm/i915/intel_pch.h|  2 +-
 13 files changed, 37 insertions(+), 49 deletions(-)


[PATCH v3 5/5] drm/msm: allow compile time selection of driver components

2022-03-03 Thread Dmitry Baryshkov
The MSM DRM driver already allows one to compile out DP or DSI support.
Add support for disabling other features too, like the MDP4/MDP5/DPU
drivers or direct HDMI output support.

Suggested-by: Stephen Boyd 
Signed-off-by: Dmitry Baryshkov 
---
 drivers/gpu/drm/msm/Kconfig| 50 --
 drivers/gpu/drm/msm/Makefile   | 18 ++--
 drivers/gpu/drm/msm/msm_drv.h  | 33 ++
 drivers/gpu/drm/msm/msm_mdss.c | 13 +++--
 4 files changed, 106 insertions(+), 8 deletions(-)
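
A hypothetical sketch of how msm_drv.h can keep the core code building
when a component is compiled out (the #ifdef layout is an assumption;
only the function and config names appear in this series):

#ifdef CONFIG_DRM_MSM_MDP4
void msm_mdp4_register(void);
void msm_mdp4_unregister(void);
#else
static inline void msm_mdp4_register(void) {}
static inline void msm_mdp4_unregister(void) {}
#endif

With stubs like these, the core msm_drv.c can call the register/unregister
helpers unconditionally while the Makefile only builds the objects
selected in Kconfig.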

diff --git a/drivers/gpu/drm/msm/Kconfig b/drivers/gpu/drm/msm/Kconfig
index 9b019598e042..3735fd41eb3b 100644
--- a/drivers/gpu/drm/msm/Kconfig
+++ b/drivers/gpu/drm/msm/Kconfig
@@ -46,12 +46,39 @@ config DRM_MSM_GPU_SUDO
  Only use this if you are a driver developer.  This should *not*
  be enabled for production kernels.  If unsure, say N.
 
-config DRM_MSM_HDMI_HDCP
-   bool "Enable HDMI HDCP support in MSM DRM driver"
+config DRM_MSM_MDSS
+   bool
+   depends on DRM_MSM
+   default n
+
+config DRM_MSM_MDP4
+   bool "Enable MDP4 support in MSM DRM driver"
depends on DRM_MSM
default y
help
- Choose this option to enable HDCP state machine
+ Compile in support for the Mobile Display Processor v4 (MDP4) in
+ the MSM DRM driver. It is the older display controller found in
+ devices using APQ8064/MSM8960/MSM8x60 platforms.
+
+config DRM_MSM_MDP5
+   bool "Enable MDP5 support in MSM DRM driver"
+   depends on DRM_MSM
+   select DRM_MSM_MDSS
+   default y
+   help
+ Compile in support for the Mobile Display Processor v5 (MDP5) in
+ the MSM DRM driver. It is the display controller found in devices
+ using e.g. APQ8016/MSM8916/APQ8096/MSM8996/MSM8974/SDM6x0 platforms.
+
+config DRM_MSM_DPU
+   bool "Enable DPU support in MSM DRM driver"
+   depends on DRM_MSM
+   select DRM_MSM_MDSS
+   default y
+   help
+ Compile in support for the Display Processing Unit in
+ the MSM DRM driver. It is the display controller found in devices
+ using e.g. SDM845 and newer platforms.
 
 config DRM_MSM_DP
bool "Enable DisplayPort support in MSM DRM driver"
@@ -116,3 +143,20 @@ config DRM_MSM_DSI_7NM_PHY
help
  Choose this option if DSI PHY on SM8150/SM8250/SC7280 is used on
  the platform.
+
+config DRM_MSM_HDMI
+   bool "Enable HDMI support in MSM DRM driver"
+   depends on DRM_MSM
+   default y
+   help
+ Compile in support for the HDMI output in the MSM DRM driver. It can
+ be a primary or a secondary display on the device. Note that this is
+ used only for the direct HDMI output. If the device outputs HDMI data
+ through some kind of DSI-to-HDMI bridge, this option can be disabled.
+
+config DRM_MSM_HDMI_HDCP
+   bool "Enable HDMI HDCP support in MSM DRM driver"
+   depends on DRM_MSM && DRM_MSM_HDMI
+   default y
+   help
+ Choose this option to enable HDCP state machine
diff --git a/drivers/gpu/drm/msm/Makefile b/drivers/gpu/drm/msm/Makefile
index e76927b42033..5fe9c20ab9ee 100644
--- a/drivers/gpu/drm/msm/Makefile
+++ b/drivers/gpu/drm/msm/Makefile
@@ -16,6 +16,8 @@ msm-y := \
adreno/a6xx_gpu.o \
adreno/a6xx_gmu.o \
adreno/a6xx_hfi.o \
+
+msm-$(CONFIG_DRM_MSM_HDMI) += \
hdmi/hdmi.o \
hdmi/hdmi_audio.o \
hdmi/hdmi_bridge.o \
@@ -27,8 +29,8 @@ msm-y := \
hdmi/hdmi_phy_8x60.o \
hdmi/hdmi_phy_8x74.o \
hdmi/hdmi_pll_8960.o \
-   disp/mdp_format.o \
-   disp/mdp_kms.o \
+
+msm-$(CONFIG_DRM_MSM_MDP4) += \
disp/mdp4/mdp4_crtc.o \
disp/mdp4/mdp4_dtv_encoder.o \
disp/mdp4/mdp4_lcdc_encoder.o \
@@ -37,6 +39,8 @@ msm-y := \
disp/mdp4/mdp4_irq.o \
disp/mdp4/mdp4_kms.o \
disp/mdp4/mdp4_plane.o \
+
+msm-$(CONFIG_DRM_MSM_MDP5) += \
disp/mdp5/mdp5_cfg.o \
disp/mdp5/mdp5_ctl.o \
disp/mdp5/mdp5_crtc.o \
@@ -47,6 +51,8 @@ msm-y := \
disp/mdp5/mdp5_mixer.o \
disp/mdp5/mdp5_plane.o \
disp/mdp5/mdp5_smp.o \
+
+msm-$(CONFIG_DRM_MSM_DPU) += \
disp/dpu1/dpu_core_perf.o \
disp/dpu1/dpu_crtc.o \
disp/dpu1/dpu_encoder.o \
@@ -69,6 +75,13 @@ msm-y := \
disp/dpu1/dpu_plane.o \
disp/dpu1/dpu_rm.o \
disp/dpu1/dpu_vbif.o \
+
+msm-$(CONFIG_DRM_MSM_MDSS) += \
+   msm_mdss.o \
+
+msm-y += \
+   disp/mdp_format.o \
+   disp/mdp_kms.o \
disp/msm_disp_snapshot.o \
disp/msm_disp_snapshot_util.o \
msm_atomic.o \
@@ -86,7 +99,6 @@ msm-y := \
msm_gpu_devfreq.o \
msm_io_utils.o \
msm_iommu.o \
-   msm_mdss.o \
msm_perf.o \
msm_rd.o \
msm_ringbuffer.o \
diff --git a/drivers/gpu/drm/msm/msm_drv.h b/drivers/gpu/drm/msm/msm_drv.h
index c1aaadfbea34..6bad7e7b479d 100644
--- a/drivers/gpu

[PATCH v3 4/5] drm/msm: stop using device's match data pointer

2022-03-03 Thread Dmitry Baryshkov
Let's stop using the match's data pointer in favour of the
(sub-)driver's private data. The only current user is the
msm_drm_init() function, which uses this data to select the kms_init
callback. Pass this callback through the driver's private data instead.

Signed-off-by: Dmitry Baryshkov 
---
 drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c  | 10 ---
 drivers/gpu/drm/msm/disp/mdp4/mdp4_kms.c | 14 +
 drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c | 11 ---
 drivers/gpu/drm/msm/msm_drv.c| 38 ++--
 drivers/gpu/drm/msm/msm_drv.h|  5 +---
 drivers/gpu/drm/msm/msm_kms.h|  4 ---
 drivers/gpu/drm/msm/msm_mdss.c   | 29 +++---
 7 files changed, 42 insertions(+), 69 deletions(-)
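
A hypothetical sketch of the consumer side of this change, i.e. how
msm_drm_init() can invoke the callback stored by the sub-driver (the
exact call site is an assumption; the kms_init prototype and the
priv->kms comment follow the hunks in this patch):

/* in struct msm_drm_private: */
int (*kms_init)(struct drm_device *dev);

/* in msm_drm_init(), instead of switching on the OF match data: */
if (priv->kms_init) {
        ret = priv->kms_init(ddev);
        if (ret) {
                DRM_DEV_ERROR(dev, "failed to init kms\n");
                return ret;
        }
        kms = priv->kms;        /* populated by the sub-driver */
}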

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
index e29796c4f27b..38627ccf3068 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
@@ -1172,7 +1172,7 @@ static int dpu_kms_hw_init(struct msm_kms *kms)
return rc;
 }
 
-struct msm_kms *dpu_kms_init(struct drm_device *dev)
+static int dpu_kms_init(struct drm_device *dev)
 {
struct msm_drm_private *priv;
struct dpu_kms *dpu_kms;
@@ -1180,7 +1180,7 @@ struct msm_kms *dpu_kms_init(struct drm_device *dev)
 
if (!dev) {
DPU_ERROR("drm device node invalid\n");
-   return ERR_PTR(-EINVAL);
+   return -EINVAL;
}
 
priv = dev->dev_private;
@@ -1189,11 +1189,11 @@ struct msm_kms *dpu_kms_init(struct drm_device *dev)
irq = irq_of_parse_and_map(dpu_kms->pdev->dev.of_node, 0);
if (irq < 0) {
DPU_ERROR("failed to get irq: %d\n", irq);
-   return ERR_PTR(irq);
+   return irq;
}
dpu_kms->base.irq = irq;
 
-   return &dpu_kms->base;
+   return 0;
 }
 
 static int dpu_bind(struct device *dev, struct device *master, void *data)
@@ -1204,6 +1204,8 @@ static int dpu_bind(struct device *dev, struct device 
*master, void *data)
struct dpu_kms *dpu_kms;
int ret = 0;
 
+   priv->kms_init = dpu_kms_init;
+
dpu_kms = devm_kzalloc(&pdev->dev, sizeof(*dpu_kms), GFP_KERNEL);
if (!dpu_kms)
return -ENOMEM;
diff --git a/drivers/gpu/drm/msm/disp/mdp4/mdp4_kms.c 
b/drivers/gpu/drm/msm/disp/mdp4/mdp4_kms.c
index c5c0650414c5..2e5f6b6fd3c3 100644
--- a/drivers/gpu/drm/msm/disp/mdp4/mdp4_kms.c
+++ b/drivers/gpu/drm/msm/disp/mdp4/mdp4_kms.c
@@ -389,7 +389,7 @@ static void read_mdp_hw_revision(struct mdp4_kms *mdp4_kms,
DRM_DEV_INFO(dev->dev, "MDP4 version v%d.%d", *major, *minor);
 }
 
-struct msm_kms *mdp4_kms_init(struct drm_device *dev)
+static int mdp4_kms_init(struct drm_device *dev)
 {
struct platform_device *pdev = to_platform_device(dev->dev);
struct mdp4_platform_config *config = mdp4_get_config(pdev);
@@ -403,8 +403,7 @@ struct msm_kms *mdp4_kms_init(struct drm_device *dev)
mdp4_kms = kzalloc(sizeof(*mdp4_kms), GFP_KERNEL);
if (!mdp4_kms) {
DRM_DEV_ERROR(dev->dev, "failed to allocate kms\n");
-   ret = -ENOMEM;
-   goto fail;
+   return -ENOMEM;
}
 
ret = mdp_kms_init(&mdp4_kms->base, &kms_funcs);
@@ -551,12 +550,13 @@ struct msm_kms *mdp4_kms_init(struct drm_device *dev)
dev->mode_config.max_width = 2048;
dev->mode_config.max_height = 2048;
 
-   return kms;
+   return 0;
 
 fail:
if (kms)
mdp4_destroy(kms);
-   return ERR_PTR(ret);
+
+   return ret;
 }
 
 static struct mdp4_platform_config *mdp4_get_config(struct platform_device 
*dev)
@@ -583,6 +583,8 @@ static int mdp4_probe(struct platform_device *pdev)
if (!priv)
return -ENOMEM;
 
+   priv->kms_init = mdp4_kms_init;
+
platform_set_drvdata(pdev, priv);
 
/*
@@ -600,7 +602,7 @@ static int mdp4_remove(struct platform_device *pdev)
 }
 
 static const struct of_device_id mdp4_dt_match[] = {
-   { .compatible = "qcom,mdp4", .data = (void *)KMS_MDP4 },
+   { .compatible = "qcom,mdp4" },
{ /* sentinel */ }
 };
 MODULE_DEVICE_TABLE(of, mdp4_dt_match);
diff --git a/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c 
b/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c
index 3b92372e7bdf..0c78608832c3 100644
--- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c
+++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c
@@ -544,7 +544,7 @@ static int get_clk(struct platform_device *pdev, struct clk 
**clkp,
return 0;
 }
 
-struct msm_kms *mdp5_kms_init(struct drm_device *dev)
+static int mdp5_kms_init(struct drm_device *dev)
 {
struct msm_drm_private *priv = dev->dev_private;
struct platform_device *pdev;
@@ -558,7 +558,7 @@ struct msm_kms *mdp5_kms_init(struct drm_device *dev)
/* priv->kms would have been populated by the MDP5 driver */
kms = priv->kms;
if (!kms)
-   return NU

[PATCH v3 3/5] drm/msm: split the main platform driver

2022-03-03 Thread Dmitry Baryshkov
Currently the msm platform driver is a multiplexer handling several cases:
- headless GPU-only driver,
- MDP4 with flat device nodes,
- MDP5/DPU MDSS with all the nodes being children of MDSS node.

This results in not-so-perfect code, checking the hardware version
(MDP4/MDP5/DPU) in several places, checking for mdss even when it
cannot exist, etc. Split the code into three separate subdrivers (mdp4,
mdss and headless msm).

Signed-off-by: Dmitry Baryshkov 
---
 drivers/gpu/drm/msm/disp/mdp4/mdp4_kms.c |  56 ++
 drivers/gpu/drm/msm/msm_drv.c| 228 ---
 drivers/gpu/drm/msm/msm_drv.h|  27 ++-
 drivers/gpu/drm/msm/msm_kms.h|   7 -
 drivers/gpu/drm/msm/msm_mdss.c   | 178 +-
 5 files changed, 291 insertions(+), 205 deletions(-)
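
For context, a minimal sketch of the component-master pattern these
probe paths build on (generic <linux/component.h> API; master_dev,
child_np, child_dev and child_component_ops are illustrative names, not
the literal msm_drv_probe() body):

#include <linux/component.h>

static int compare_of(struct device *dev, void *data)
{
        return dev->of_node == data;
}

/* master side (e.g. the MDP4 or MDSS device): collect sub-devices ... */
struct component_match *match = NULL;
component_match_add(master_dev, &match, compare_of, child_np);
/* ... then register the aggregate driver */
ret = component_master_add_with_match(master_dev, &msm_drm_ops, match);

/* each interface (DSI, DP, HDMI, ...) registers itself as a component */
ret = component_add(child_dev, &child_component_ops);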

diff --git a/drivers/gpu/drm/msm/disp/mdp4/mdp4_kms.c 
b/drivers/gpu/drm/msm/disp/mdp4/mdp4_kms.c
index 3cf476c55158..c5c0650414c5 100644
--- a/drivers/gpu/drm/msm/disp/mdp4/mdp4_kms.c
+++ b/drivers/gpu/drm/msm/disp/mdp4/mdp4_kms.c
@@ -569,3 +569,59 @@ static struct mdp4_platform_config *mdp4_get_config(struct 
platform_device *dev)
 
return &config;
 }
+
+static const struct dev_pm_ops mdp4_pm_ops = {
+   .prepare = msm_pm_prepare,
+   .complete = msm_pm_complete,
+};
+
+static int mdp4_probe(struct platform_device *pdev)
+{
+   struct msm_drm_private *priv;
+
+   priv = devm_kzalloc(&pdev->dev, sizeof(*priv), GFP_KERNEL);
+   if (!priv)
+   return -ENOMEM;
+
+   platform_set_drvdata(pdev, priv);
+
+   /*
+* on MDP4 based platforms, the MDP platform device is the component
+* master that adds other display interface components to itself.
+*/
+   return msm_drv_probe(&pdev->dev, &pdev->dev);
+}
+
+static int mdp4_remove(struct platform_device *pdev)
+{
+   component_master_del(&pdev->dev, &msm_drm_ops);
+
+   return 0;
+}
+
+static const struct of_device_id mdp4_dt_match[] = {
+   { .compatible = "qcom,mdp4", .data = (void *)KMS_MDP4 },
+   { /* sentinel */ }
+};
+MODULE_DEVICE_TABLE(of, mdp4_dt_match);
+
+static struct platform_driver mdp4_platform_driver = {
+   .probe  = mdp4_probe,
+   .remove = mdp4_remove,
+   .shutdown   = msm_drv_shutdown,
+   .driver = {
+   .name   = "mdp4",
+   .of_match_table = mdp4_dt_match,
+   .pm = &mdp4_pm_ops,
+   },
+};
+
+void __init msm_mdp4_register(void)
+{
+   platform_driver_register(&mdp4_platform_driver);
+}
+
+void __exit msm_mdp4_unregister(void)
+{
+   platform_driver_unregister(&mdp4_platform_driver);
+}
diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
index f3f33b8c6eba..2f44df8c5585 100644
--- a/drivers/gpu/drm/msm/msm_drv.c
+++ b/drivers/gpu/drm/msm/msm_drv.c
@@ -255,10 +255,6 @@ static int msm_drm_uninit(struct device *dev)
return 0;
 }
 
-#define KMS_MDP4 4
-#define KMS_MDP5 5
-#define KMS_DPU  3
-
 static int get_mdp_ver(struct platform_device *pdev)
 {
struct device *dev = &pdev->dev;
@@ -941,50 +937,7 @@ static const struct drm_driver msm_driver = {
.patchlevel = MSM_VERSION_PATCHLEVEL,
 };
 
-static int __maybe_unused msm_runtime_suspend(struct device *dev)
-{
-   struct msm_drm_private *priv = dev_get_drvdata(dev);
-   struct msm_mdss *mdss = priv->mdss;
-
-   DBG("");
-
-   if (mdss)
-   return msm_mdss_disable(mdss);
-
-   return 0;
-}
-
-static int __maybe_unused msm_runtime_resume(struct device *dev)
-{
-   struct msm_drm_private *priv = dev_get_drvdata(dev);
-   struct msm_mdss *mdss = priv->mdss;
-
-   DBG("");
-
-   if (mdss)
-   return msm_mdss_enable(mdss);
-
-   return 0;
-}
-
-static int __maybe_unused msm_pm_suspend(struct device *dev)
-{
-
-   if (pm_runtime_suspended(dev))
-   return 0;
-
-   return msm_runtime_suspend(dev);
-}
-
-static int __maybe_unused msm_pm_resume(struct device *dev)
-{
-   if (pm_runtime_suspended(dev))
-   return 0;
-
-   return msm_runtime_resume(dev);
-}
-
-static int __maybe_unused msm_pm_prepare(struct device *dev)
+int msm_pm_prepare(struct device *dev)
 {
struct msm_drm_private *priv = dev_get_drvdata(dev);
struct drm_device *ddev = priv ? priv->dev : NULL;
@@ -995,7 +948,7 @@ static int __maybe_unused msm_pm_prepare(struct device *dev)
return drm_mode_config_helper_suspend(ddev);
 }
 
-static void __maybe_unused msm_pm_complete(struct device *dev)
+void msm_pm_complete(struct device *dev)
 {
struct msm_drm_private *priv = dev_get_drvdata(dev);
struct drm_device *ddev = priv ? priv->dev : NULL;
@@ -1007,8 +960,6 @@ static void __maybe_unused msm_pm_complete(struct device 
*dev)
 }
 
 static const struct dev_pm_ops msm_pm_ops = {
-   SET_SYSTEM_SLEEP_PM_OPS(msm_pm_suspend, msm_pm_resume)
-   SET_RUNTIME_PM_OPS(msm_runtime_suspend, 

[PATCH v3 2/5] drm/msm: remove extra indirection for msm_mdss

2022-03-03 Thread Dmitry Baryshkov
Now that there is just one mdss subdriver, drop all the indirection:
make the msm_mdss struct completely opaque (defined inside msm_mdss.c)
and call the mdss functions directly.

Signed-off-by: Dmitry Baryshkov 
---
 drivers/gpu/drm/msm/msm_drv.c  |  29 +++
 drivers/gpu/drm/msm/msm_kms.h  |  16 ++--
 drivers/gpu/drm/msm/msm_mdss.c | 136 +++--
 3 files changed, 81 insertions(+), 100 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
index 078c7e951a6e..f3f33b8c6eba 100644
--- a/drivers/gpu/drm/msm/msm_drv.c
+++ b/drivers/gpu/drm/msm/msm_drv.c
@@ -948,8 +948,8 @@ static int __maybe_unused msm_runtime_suspend(struct device 
*dev)
 
DBG("");
 
-   if (mdss && mdss->funcs)
-   return mdss->funcs->disable(mdss);
+   if (mdss)
+   return msm_mdss_disable(mdss);
 
return 0;
 }
@@ -961,8 +961,8 @@ static int __maybe_unused msm_runtime_resume(struct device 
*dev)
 
DBG("");
 
-   if (mdss && mdss->funcs)
-   return mdss->funcs->enable(mdss);
+   if (mdss)
+   return msm_mdss_enable(mdss);
 
return 0;
 }
@@ -1197,6 +1197,7 @@ static const struct component_master_ops msm_drm_ops = {
 static int msm_pdev_probe(struct platform_device *pdev)
 {
struct component_match *match = NULL;
+   struct msm_mdss *mdss;
struct msm_drm_private *priv;
int ret;
 
@@ -1208,20 +1209,22 @@ static int msm_pdev_probe(struct platform_device *pdev)
 
switch (get_mdp_ver(pdev)) {
case KMS_MDP5:
-   ret = msm_mdss_init(pdev, true);
+   mdss = msm_mdss_init(pdev, true);
break;
case KMS_DPU:
-   ret = msm_mdss_init(pdev, false);
+   mdss = msm_mdss_init(pdev, false);
break;
default:
-   ret = 0;
+   mdss = NULL;
break;
}
-   if (ret) {
-   platform_set_drvdata(pdev, NULL);
+   if (IS_ERR(mdss)) {
+   ret = PTR_ERR(mdss);
return ret;
}
 
+   priv->mdss = mdss;
+
if (get_mdp_ver(pdev)) {
ret = add_display_components(pdev, &match);
if (ret)
@@ -1248,8 +1251,8 @@ static int msm_pdev_probe(struct platform_device *pdev)
 fail:
of_platform_depopulate(&pdev->dev);
 
-   if (priv->mdss && priv->mdss->funcs)
-   priv->mdss->funcs->destroy(priv->mdss);
+   if (priv->mdss)
+   msm_mdss_destroy(priv->mdss);
 
return ret;
 }
@@ -1262,8 +1265,8 @@ static int msm_pdev_remove(struct platform_device *pdev)
component_master_del(&pdev->dev, &msm_drm_ops);
of_platform_depopulate(&pdev->dev);
 
-   if (mdss && mdss->funcs)
-   mdss->funcs->destroy(mdss);
+   if (mdss)
+   msm_mdss_destroy(mdss);
 
return 0;
 }
diff --git a/drivers/gpu/drm/msm/msm_kms.h b/drivers/gpu/drm/msm/msm_kms.h
index 10d5ae3e76df..09c21994 100644
--- a/drivers/gpu/drm/msm/msm_kms.h
+++ b/drivers/gpu/drm/msm/msm_kms.h
@@ -201,18 +201,12 @@ struct msm_kms *dpu_kms_init(struct drm_device *dev);
 extern const struct of_device_id dpu_dt_match[];
 extern const struct of_device_id mdp5_dt_match[];
 
-struct msm_mdss_funcs {
-   int (*enable)(struct msm_mdss *mdss);
-   int (*disable)(struct msm_mdss *mdss);
-   void (*destroy)(struct msm_mdss *mdss);
-};
-
-struct msm_mdss {
-   struct device *dev;
-   const struct msm_mdss_funcs *funcs;
-};
+struct msm_mdss;
 
-int msm_mdss_init(struct platform_device *pdev, bool is_mdp5);
+struct msm_mdss *msm_mdss_init(struct platform_device *pdev, bool is_mdp5);
+int msm_mdss_enable(struct msm_mdss *mdss);
+int msm_mdss_disable(struct msm_mdss *mdss);
+void msm_mdss_destroy(struct msm_mdss *mdss);
 
 #define for_each_crtc_mask(dev, crtc, crtc_mask) \
drm_for_each_crtc(crtc, dev) \
diff --git a/drivers/gpu/drm/msm/msm_mdss.c b/drivers/gpu/drm/msm/msm_mdss.c
index 71f3277bde32..857eefbb8649 100644
--- a/drivers/gpu/drm/msm/msm_mdss.c
+++ b/drivers/gpu/drm/msm/msm_mdss.c
@@ -3,19 +3,16 @@
  * Copyright (c) 2018, The Linux Foundation
  */
 
+#include 
 #include 
 #include 
 #include 
 #include 
-
-#include "msm_drv.h"
-#include "msm_kms.h"
+#include 
 
 /* for DPU_HW_* defines */
 #include "disp/dpu1/dpu_hw_catalog.h"
 
-#define to_dpu_mdss(x) container_of(x, struct dpu_mdss, base)
-
 #define HW_REV 0x0
 #define HW_INTR_STATUS 0x0010
 
@@ -23,8 +20,9 @@
 #define UBWC_CTRL_20x150
 #define UBWC_PREDICTION_MODE   0x154
 
-struct dpu_mdss {
-   struct msm_mdss base;
+struct msm_mdss {
+   struct device *dev;
+
void __iomem *mmio;
struct clk_bulk_data *clocks;
size_t num_clocks;
@@ -36,22 +34,22 @@ struct dpu_mdss {
 
 static void msm_mdss_irq(struct irq_desc *desc)
 {
-   struct dpu_mdss *dpu_md

[PATCH v3 1/5] drm/msm: unify MDSS drivers

2022-03-03 Thread Dmitry Baryshkov
MDP5 and DPU1 both provide a driver handling the MDSS region, which
handles the irq domain and (in the case of DPU1) adds some init for the
UBWC controller. Unify those two pieces of code into a common driver.

Signed-off-by: Dmitry Baryshkov 
---
 drivers/gpu/drm/msm/Makefile  |   3 +-
 drivers/gpu/drm/msm/disp/mdp5/mdp5_mdss.c | 252 --
 drivers/gpu/drm/msm/msm_drv.c |   4 +-
 drivers/gpu/drm/msm/msm_kms.h |   3 +-
 .../msm/{disp/dpu1/dpu_mdss.c => msm_mdss.c}  | 145 +-
 5 files changed, 83 insertions(+), 324 deletions(-)
 delete mode 100644 drivers/gpu/drm/msm/disp/mdp5/mdp5_mdss.c
 rename drivers/gpu/drm/msm/{disp/dpu1/dpu_mdss.c => msm_mdss.c} (63%)
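
A sketch of the chained-handler plus irq_domain pattern the unified
driver uses to fan MDSS interrupts out to its sub-blocks (generic kernel
APIs; my_mdss and my_mdss_irqdomain_ops are placeholder names, and the
details differ from the real msm_mdss.c):

#include <linux/io.h>
#include <linux/irq.h>
#include <linux/irqchip/chained_irq.h>
#include <linux/irqdomain.h>

#define HW_INTR_STATUS 0x0010   /* as in msm_mdss.c */

struct my_mdss {
        void __iomem *mmio;
        struct irq_domain *irq_domain;
};

static void my_mdss_irq(struct irq_desc *desc)
{
        struct my_mdss *mdss = irq_desc_get_handler_data(desc);
        struct irq_chip *chip = irq_desc_get_chip(desc);
        u32 status;

        chained_irq_enter(chip, desc);

        status = readl_relaxed(mdss->mmio + HW_INTR_STATUS);
        while (status) {
                irq_hw_number_t hwirq = fls(status) - 1;

                generic_handle_domain_irq(mdss->irq_domain, hwirq);
                status &= ~BIT(hwirq);
        }

        chained_irq_exit(chip, desc);
}

/* at probe time: */
mdss->irq_domain = irq_domain_add_linear(dev->of_node, 32,
                                         &my_mdss_irqdomain_ops, mdss);
irq_set_chained_handler_and_data(platform_get_irq(pdev, 0),
                                 my_mdss_irq, mdss);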

diff --git a/drivers/gpu/drm/msm/Makefile b/drivers/gpu/drm/msm/Makefile
index e9cc7d8ac301..e76927b42033 100644
--- a/drivers/gpu/drm/msm/Makefile
+++ b/drivers/gpu/drm/msm/Makefile
@@ -42,7 +42,6 @@ msm-y := \
disp/mdp5/mdp5_crtc.o \
disp/mdp5/mdp5_encoder.o \
disp/mdp5/mdp5_irq.o \
-   disp/mdp5/mdp5_mdss.o \
disp/mdp5/mdp5_kms.o \
disp/mdp5/mdp5_pipe.o \
disp/mdp5/mdp5_mixer.o \
@@ -67,7 +66,6 @@ msm-y := \
disp/dpu1/dpu_hw_util.o \
disp/dpu1/dpu_hw_vbif.o \
disp/dpu1/dpu_kms.o \
-   disp/dpu1/dpu_mdss.o \
disp/dpu1/dpu_plane.o \
disp/dpu1/dpu_rm.o \
disp/dpu1/dpu_vbif.o \
@@ -88,6 +86,7 @@ msm-y := \
msm_gpu_devfreq.o \
msm_io_utils.o \
msm_iommu.o \
+   msm_mdss.o \
msm_perf.o \
msm_rd.o \
msm_ringbuffer.o \
diff --git a/drivers/gpu/drm/msm/disp/mdp5/mdp5_mdss.c 
b/drivers/gpu/drm/msm/disp/mdp5/mdp5_mdss.c
deleted file mode 100644
index 049c6784a531..
--- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_mdss.c
+++ /dev/null
@@ -1,252 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0-only
-/*
- * Copyright (c) 2016, The Linux Foundation. All rights reserved.
- */
-
-#include 
-#include 
-
-#include "msm_drv.h"
-#include "mdp5_kms.h"
-
-#define to_mdp5_mdss(x) container_of(x, struct mdp5_mdss, base)
-
-struct mdp5_mdss {
-   struct msm_mdss base;
-
-   void __iomem *mmio, *vbif;
-
-   struct clk *ahb_clk;
-   struct clk *axi_clk;
-   struct clk *vsync_clk;
-
-   struct {
-   volatile unsigned long enabled_mask;
-   struct irq_domain *domain;
-   } irqcontroller;
-};
-
-static inline void mdss_write(struct mdp5_mdss *mdp5_mdss, u32 reg, u32 data)
-{
-   msm_writel(data, mdp5_mdss->mmio + reg);
-}
-
-static inline u32 mdss_read(struct mdp5_mdss *mdp5_mdss, u32 reg)
-{
-   return msm_readl(mdp5_mdss->mmio + reg);
-}
-
-static irqreturn_t mdss_irq(int irq, void *arg)
-{
-   struct mdp5_mdss *mdp5_mdss = arg;
-   u32 intr;
-
-   intr = mdss_read(mdp5_mdss, REG_MDSS_HW_INTR_STATUS);
-
-   VERB("intr=%08x", intr);
-
-   while (intr) {
-   irq_hw_number_t hwirq = fls(intr) - 1;
-
-   generic_handle_domain_irq(mdp5_mdss->irqcontroller.domain, 
hwirq);
-   intr &= ~(1 << hwirq);
-   }
-
-   return IRQ_HANDLED;
-}
-
-/*
- * interrupt-controller implementation, so sub-blocks (MDP/HDMI/eDP/DSI/etc)
- * can register to get their irq's delivered
- */
-
-#define VALID_IRQS  (MDSS_HW_INTR_STATUS_INTR_MDP | \
-   MDSS_HW_INTR_STATUS_INTR_DSI0 | \
-   MDSS_HW_INTR_STATUS_INTR_DSI1 | \
-   MDSS_HW_INTR_STATUS_INTR_HDMI | \
-   MDSS_HW_INTR_STATUS_INTR_EDP)
-
-static void mdss_hw_mask_irq(struct irq_data *irqd)
-{
-   struct mdp5_mdss *mdp5_mdss = irq_data_get_irq_chip_data(irqd);
-
-   smp_mb__before_atomic();
-   clear_bit(irqd->hwirq, &mdp5_mdss->irqcontroller.enabled_mask);
-   smp_mb__after_atomic();
-}
-
-static void mdss_hw_unmask_irq(struct irq_data *irqd)
-{
-   struct mdp5_mdss *mdp5_mdss = irq_data_get_irq_chip_data(irqd);
-
-   smp_mb__before_atomic();
-   set_bit(irqd->hwirq, &mdp5_mdss->irqcontroller.enabled_mask);
-   smp_mb__after_atomic();
-}
-
-static struct irq_chip mdss_hw_irq_chip = {
-   .name   = "mdss",
-   .irq_mask   = mdss_hw_mask_irq,
-   .irq_unmask = mdss_hw_unmask_irq,
-};
-
-static int mdss_hw_irqdomain_map(struct irq_domain *d, unsigned int irq,
-irq_hw_number_t hwirq)
-{
-   struct mdp5_mdss *mdp5_mdss = d->host_data;
-
-   if (!(VALID_IRQS & (1 << hwirq)))
-   return -EPERM;
-
-   irq_set_chip_and_handler(irq, &mdss_hw_irq_chip, handle_level_irq);
-   irq_set_chip_data(irq, mdp5_mdss);
-
-   return 0;
-}
-
-static const struct irq_domain_ops mdss_hw_irqdomain_ops = {
-   .map = mdss_hw_irqdomain_map,
-   .xlate = irq_domain_xlate_onecell,
-};
-
-
-static int mdss_irq_domain_init(struct mdp5_mdss *mdp5_mdss)
-{
-   struct device *dev = mdp5_mdss->base.dev;
-   struct irq_domain *d;
-
-   d = irq_domain_add_lin

[PATCH v3 0/5] drm/msm: rework MDSS drivers

2022-03-03 Thread Dmitry Baryshkov
These patches continue the work started by AngeloGioacchino Del Regno
in the previous cycle by further decoupling and dissecting the MDSS and
MDP drivers' probe/binding paths.

This removes code duplication between MDP5 and DPU1 MDSS drivers, by
merging them and moving to the top level.

This patchset depends on the patches 1 and 2 from [1]

Changes since v2:
 - Rebased on top of current msm/msm-next(-staging)
 - Allow disabling MDP4/MDP5/DPU/HDMI components (like we do for DP and
   DSI)
 - Made mdp5_mdss_parse_clock() static
 - Changed mdp5 to is_mdp5 argument in several functions
 - Dropped boolean device data from the mdss driver
 - Reworked error handling in msm_pdev_probe()
 - Removed unused header inclusion
 - Dropped __init/__exit from function prototypes

Changes since v1:
 - Rebased on top of [2] and [1]

[1] https://patchwork.freedesktop.org/series/99066/
[2] https://patchwork.freedesktop.org/series/98521/

Dmitry Baryshkov (5):
  drm/msm: unify MDSS drivers
  drm/msm: remove extra indirection for msm_mdss
  drm/msm: split the main platform driver
  drm/msm: stop using device's match data pointer
  drm/msm: allow runtime selection of driver components

 drivers/gpu/drm/msm/Kconfig   |  50 ++-
 drivers/gpu/drm/msm/Makefile  |  19 +-
 drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c   |  10 +-
 drivers/gpu/drm/msm/disp/dpu1/dpu_mdss.c  | 260 -
 drivers/gpu/drm/msm/disp/mdp4/mdp4_kms.c  |  68 +++-
 drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c  |  11 +-
 drivers/gpu/drm/msm/disp/mdp5/mdp5_mdss.c | 252 -
 drivers/gpu/drm/msm/msm_drv.c | 263 +++--
 drivers/gpu/drm/msm/msm_drv.h |  57 ++-
 drivers/gpu/drm/msm/msm_kms.h |  18 -
 drivers/gpu/drm/msm/msm_mdss.c| 429 ++
 11 files changed, 667 insertions(+), 770 deletions(-)
 delete mode 100644 drivers/gpu/drm/msm/disp/dpu1/dpu_mdss.c
 delete mode 100644 drivers/gpu/drm/msm/disp/mdp5/mdp5_mdss.c
 create mode 100644 drivers/gpu/drm/msm/msm_mdss.c


base-commit: 8ddb80c5fcf455fe38156636126a83eadacfb743
-- 
2.34.1



Re: Report 2 in ext4 and journal based on v5.17-rc1

2022-03-03 Thread Byungchul Park
On Thu, Mar 03, 2022 at 09:36:25AM -0500, Theodore Ts'o wrote:
> On Thu, Mar 03, 2022 at 02:23:33PM +0900, Byungchul Park wrote:
> > I totally agree with you. *They aren't really locks but it's just waits
> > and wakeups.* That's exactly why I decided to develop Dept. Dept is not
> > interested in locks unlike Lockdep, but fouces on waits and wakeup
> > sources itself. I think you get Dept wrong a lot. Please ask me more if
> > you have things you doubt about Dept.
> 
> So the question is this --- do you now understand why, even though
> there is a circular dependency, nothing gets stalled in the
> interactions between the two wait channels?

I found a case, thanks to Jan Kara, where the two wait channels don't
lead to a deadlock. I will fix it so that Dept won't complain about it.

Thanks,
Byungchul

> 
>   - Ted


Re: Report 2 in ext4 and journal based on v5.17-rc1

2022-03-03 Thread Byungchul Park
On Thu, Mar 03, 2022 at 10:54:56AM +0100, Jan Kara wrote:
> On Thu 03-03-22 10:00:33, Byungchul Park wrote:
> > Unfortunately, it's neither perfect nor safe without another wakeup
> > source - rescue wakeup source.
> > 
> >consumer producer
> > 
> > lock L
> > (too much work queued == true)
> > unlock L
> > --- preempted
> >lock L
> >unlock L
> >do work
> >lock L
> >unlock L
> >do work
> >...
> >(no work == true)
> >sleep
> > --- scheduled in
> > sleep
> > 
> > This code leads a deadlock without another wakeup source, say, not safe.
> 
> So the scenario you describe above is indeed possible. But the trick is
> that the wakeup from 'consumer' as is doing work will remove 'producer'
> from the wait queue and change the 'producer' process state to
> 'TASK_RUNNING'. So when 'producer' calls sleep (in fact schedule()), the
> scheduler will just treat this as another preemption point and the
> 'producer' will immediately or soon continue to run. So indeed we can think
> of this as "another wakeup source" but the source is in the CPU scheduler
> itself. This is the standard way how waitqueues are used in the kernel...

Nice! Thanks for the explanation. I will take it into account if needed.
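
For reference, the standard waitqueue usage Jan describes boils down to
the pattern below (a minimal sketch with generic wait_event()/wake_up(),
not the actual workqueue/ext4 code under discussion):

#include <linux/wait.h>

static DECLARE_WAIT_QUEUE_HEAD(wq);
static bool condition;

static void waiter(void)
{
        /*
         * If the waker sets the condition and calls wake_up() between
         * the condition check and the schedule() inside wait_event(),
         * the task has already been put back to TASK_RUNNING, so the
         * "sleep" is just another preemption point rather than a
         * missed wakeup.
         */
        wait_event(wq, condition);
}

static void waker(void)
{
        condition = true;
        wake_up(&wq);
}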

> > Lastly, just for your information, I need to explain how Dept works a
> > little more for you not to misunderstand Dept.
> > 
> > Assuming the consumer and producer guarantee not to lead a deadlock like
> > the following, Dept won't report it a problem:
> > 
> >consumer producer
> > 
> > sleep
> >wakeup work_done
> > queue work
> >sleep
> > wakeup work_queued
> >do work
> > sleep
> >wakeup work_done
> > queue work
> >sleep
> > wakeup work_queued
> >do work
> > sleep
> >...  ...
> > 
> > Dept does not consider all waits preceeding an event but only waits that
> > might lead a deadlock. In this case, Dept works with each region
> > independently.
> > 
> >consumer producer
> > 
> > sleep <- initiates region 1
> >--- region 1 starts
> >...  ...
> >--- region 1 ends
> >wakeup work_done
> >...  ...
> > queue work
> >...  ...
> >sleep <- initiates region 2
> > --- region 2 starts
> >...  ...
> > --- region 2 ends
> > wakeup work_queued
> >...  ...
> >do work
> >...  ...
> > sleep <- initiates region 3
> >--- region 3 starts
> >...  ...
> >--- region 3 ends
> >wakeup work_done
> >...  ...
> > queue work
> >...  ...
> >sleep <- initiates region 4
> > --- region 4 starts
> >...  ...
> > --- region 4 ends
> > wakeup work_queued
> >...  ...
> >do work
> >...  ...
> > 
> > That is, Dept does not build dependencies across different regions. So
> > you don't have to worry about unreasonable false positives that much.
> > 
> > Thoughts?
> 
> Thanks for explanation! And what exactly defines the 'regions'? When some
> process goes to sleep on some waitqueue, this defines a start of a region
> at the place where all the other processes are at that moment and wakeup of
> the waitqueue is an end of the region?

Yes. Let me explain it more for better understanding.
(I copied it from the talk I had with Matthew.)


                  ideal view
                  ----------

   context X                        context Y

   request event E                  ...
   write REQUESTEVENT               when (notice REQUESTEVENT written)
   ...                              notice the request from X [S]

   --- ideally region 1 starts here
   wait for the event               ...
   sleep                            if (can see REQUESTEVENT written)
                                       it's on the way to the event
   ...                              ...
   --- ideally region 1 ends here

                                    finally the event [E]

Dept basically works with the above view with regard to wait and event.
But it's very 

[PATCH v2 2/2] drm/msm/dp: Implement oob_hotplug_event()

2022-03-03 Thread Bjorn Andersson
The Qualcomm DisplayPort driver contains traces of the necessary
plumbing to hook up USB HPD, in the form of the dp_hpd module and the
dp_usbpd_cb struct. Use this as a basis for implementing the
oob_hotplug_event() callback, by amending the dp_hpd module with the
missing logic.

Overall the solution is similar to what's done downstream, but upstream
all the code to dissect the HPD notification lives on the calling side of
drm_connector_oob_hotplug_event().

drm_connector_oob_hotplug_event() performs the lookup of the
drm_connector based on fwnode, hence the need to assign the fwnode in
dp_drm_connector_init().

Changes in v2:
- Adopt enum drm_connector_hpd_state

Signed-off-by: Bjorn Andersson 
---
 drivers/gpu/drm/msm/dp/dp_display.c |  9 +
 drivers/gpu/drm/msm/dp/dp_display.h |  3 +++
 drivers/gpu/drm/msm/dp/dp_drm.c | 11 +++
 drivers/gpu/drm/msm/dp/dp_hpd.c | 21 +
 drivers/gpu/drm/msm/dp/dp_hpd.h |  5 +
 5 files changed, 49 insertions(+)

diff --git a/drivers/gpu/drm/msm/dp/dp_display.c 
b/drivers/gpu/drm/msm/dp/dp_display.c
index 178b774a5fbd..3d9d754a75f3 100644
--- a/drivers/gpu/drm/msm/dp/dp_display.c
+++ b/drivers/gpu/drm/msm/dp/dp_display.c
@@ -449,6 +449,14 @@ static int dp_display_usbpd_configure_cb(struct device 
*dev)
return dp_display_process_hpd_high(dp);
 }
 
+void dp_display_oob_hotplug_event(struct msm_dp *dp_display,
+ enum drm_connector_hpd_state hpd_state)
+{
+   struct dp_display_private *dp = container_of(dp_display, struct 
dp_display_private, dp_display);
+
+   dp->usbpd->oob_event(dp->usbpd, hpd_state);
+}
+
 static int dp_display_usbpd_disconnect_cb(struct device *dev)
 {
struct dp_display_private *dp = dev_get_dp_display_private(dev);
@@ -1296,6 +1304,7 @@ static int dp_display_probe(struct platform_device *pdev)
dp->pdev = pdev;
dp->name = "drm_dp";
dp->dp_display.connector_type = desc->connector_type;
+   dp->dp_display.dev = &pdev->dev;
 
rc = dp_init_sub_modules(dp);
if (rc) {
diff --git a/drivers/gpu/drm/msm/dp/dp_display.h 
b/drivers/gpu/drm/msm/dp/dp_display.h
index 7af2b186d2d9..16658270df2c 100644
--- a/drivers/gpu/drm/msm/dp/dp_display.h
+++ b/drivers/gpu/drm/msm/dp/dp_display.h
@@ -11,6 +11,7 @@
 #include "disp/msm_disp_snapshot.h"
 
 struct msm_dp {
+   struct device *dev;
struct drm_device *drm_dev;
struct device *codec_dev;
struct drm_bridge *bridge;
@@ -40,5 +41,7 @@ bool dp_display_check_video_test(struct msm_dp *dp_display);
 int dp_display_get_test_bpp(struct msm_dp *dp_display);
 void dp_display_signal_audio_start(struct msm_dp *dp_display);
 void dp_display_signal_audio_complete(struct msm_dp *dp_display);
+void dp_display_oob_hotplug_event(struct msm_dp *dp_display,
+ enum drm_connector_hpd_state hpd_state);
 
 #endif /* _DP_DISPLAY_H_ */
diff --git a/drivers/gpu/drm/msm/dp/dp_drm.c b/drivers/gpu/drm/msm/dp/dp_drm.c
index 80f59cf99089..76904b1601b1 100644
--- a/drivers/gpu/drm/msm/dp/dp_drm.c
+++ b/drivers/gpu/drm/msm/dp/dp_drm.c
@@ -123,6 +123,14 @@ static enum drm_mode_status dp_connector_mode_valid(
return dp_display_validate_mode(dp_disp, mode->clock);
 }
 
+static void dp_oob_hotplug_event(struct drm_connector *connector,
+enum drm_connector_hpd_state hpd_state)
+{
+   struct msm_dp *dp_disp = to_dp_connector(connector)->dp_display;
+
+   dp_display_oob_hotplug_event(dp_disp, hpd_state);
+}
+
 static const struct drm_connector_funcs dp_connector_funcs = {
.detect = dp_connector_detect,
.fill_modes = drm_helper_probe_single_connector_modes,
@@ -130,6 +138,7 @@ static const struct drm_connector_funcs dp_connector_funcs 
= {
.reset = drm_atomic_helper_connector_reset,
.atomic_duplicate_state = drm_atomic_helper_connector_duplicate_state,
.atomic_destroy_state = drm_atomic_helper_connector_destroy_state,
+   .oob_hotplug_event = dp_oob_hotplug_event,
 };
 
 static const struct drm_connector_helper_funcs dp_connector_helper_funcs = {
@@ -160,6 +169,8 @@ struct drm_connector *dp_drm_connector_init(struct msm_dp 
*dp_display)
if (ret)
return ERR_PTR(ret);
 
+   connector->fwnode = fwnode_handle_get(dev_fwnode(dp_display->dev));
+
drm_connector_helper_add(connector, &dp_connector_helper_funcs);
 
/*
diff --git a/drivers/gpu/drm/msm/dp/dp_hpd.c b/drivers/gpu/drm/msm/dp/dp_hpd.c
index db98a1d431eb..cdb1feea5ebf 100644
--- a/drivers/gpu/drm/msm/dp/dp_hpd.c
+++ b/drivers/gpu/drm/msm/dp/dp_hpd.c
@@ -7,6 +7,8 @@
 
 #include 
 #include 
+#include 
+#include 
 
 #include "dp_hpd.h"
 
@@ -45,6 +47,24 @@ int dp_hpd_connect(struct dp_usbpd *dp_usbpd, bool hpd)
return rc;
 }
 
+static void dp_hpd_oob_event(struct dp_usbpd *dp_usbpd,
+enum drm_connector_hpd_state hpd_state)
+{
+   struct dp_hpd_

[PATCH v2 1/2] drm: Add HPD state to drm_connector_oob_hotplug_event()

2022-03-03 Thread Bjorn Andersson
In some implementations, such as the Qualcomm platforms, the display
driver has no way to query the current HPD state and as such it's
impossible to distinguish between disconnect and attention events.

Add a parameter to drm_connector_oob_hotplug_event() to pass the HPD
state.

Also push the test for unchanged state in the displayport altmode driver
into the i915 driver, to allow other drivers to act upon each update.
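
As an illustration, a caller on the Type-C side would now pass the state
along roughly like this (hedged sketch: dp->connector_fwnode follows the
existing altmode code, and DRM_CONNECTOR_HPD_LOW is assumed as the
counterpart of DRM_CONNECTOR_HPD_HIGH):

/* in the displayport altmode driver, with no local "unchanged state" filter: */
drm_connector_oob_hotplug_event(dp->connector_fwnode,
                                hpd ? DRM_CONNECTOR_HPD_HIGH
                                    : DRM_CONNECTOR_HPD_LOW);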

Changes in v2:
- Replace bool with drm_connector_hpd_state enum to represent "state" better
- Track old hpd state per encoder in i915

Signed-off-by: Bjorn Andersson 
---
 drivers/gpu/drm/drm_connector.c  |  6 --
 drivers/gpu/drm/i915/display/intel_dp.c  | 17 ++---
 drivers/gpu/drm/i915/i915_drv.h  |  3 +++
 drivers/usb/typec/altmodes/displayport.c | 10 +++---
 include/drm/drm_connector.h  | 11 +--
 5 files changed, 33 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/drm_connector.c b/drivers/gpu/drm/drm_connector.c
index a50c82bc2b2f..a44f082ebd9d 100644
--- a/drivers/gpu/drm/drm_connector.c
+++ b/drivers/gpu/drm/drm_connector.c
@@ -2825,6 +2825,7 @@ struct drm_connector *drm_connector_find_by_fwnode(struct 
fwnode_handle *fwnode)
 /**
  * drm_connector_oob_hotplug_event - Report out-of-band hotplug event to 
connector
  * @connector_fwnode: fwnode_handle to report the event on
+ * @hpd_state: hot plug detect logical state
  *
  * On some hardware a hotplug event notification may come from outside the 
display
  * driver / device. An example of this is some USB Type-C setups where the 
hardware
@@ -2834,7 +2835,8 @@ struct drm_connector *drm_connector_find_by_fwnode(struct 
fwnode_handle *fwnode)
  * This function can be used to report these out-of-band events after obtaining
  * a drm_connector reference through calling drm_connector_find_by_fwnode().
  */
-void drm_connector_oob_hotplug_event(struct fwnode_handle *connector_fwnode)
+void drm_connector_oob_hotplug_event(struct fwnode_handle *connector_fwnode,
+enum drm_connector_hpd_state hpd_state)
 {
struct drm_connector *connector;
 
@@ -2843,7 +2845,7 @@ void drm_connector_oob_hotplug_event(struct fwnode_handle 
*connector_fwnode)
return;
 
if (connector->funcs->oob_hotplug_event)
-   connector->funcs->oob_hotplug_event(connector);
+   connector->funcs->oob_hotplug_event(connector, hpd_state);
 
drm_connector_put(connector);
 }
diff --git a/drivers/gpu/drm/i915/display/intel_dp.c 
b/drivers/gpu/drm/i915/display/intel_dp.c
index 1046e7fe310a..a3c9dbae5cee 100644
--- a/drivers/gpu/drm/i915/display/intel_dp.c
+++ b/drivers/gpu/drm/i915/display/intel_dp.c
@@ -4825,15 +4825,26 @@ static int intel_dp_connector_atomic_check(struct 
drm_connector *conn,
return intel_modeset_synced_crtcs(state, conn);
 }
 
-static void intel_dp_oob_hotplug_event(struct drm_connector *connector)
+static void intel_dp_oob_hotplug_event(struct drm_connector *connector,
+  enum drm_connector_hpd_state hpd_state)
 {
struct intel_encoder *encoder = 
intel_attached_encoder(to_intel_connector(connector));
struct drm_i915_private *i915 = to_i915(connector->dev);
+   bool hpd_high = hpd_state == DRM_CONNECTOR_HPD_HIGH;
+   unsigned int hpd_pin = encoder->hpd_pin;
+   bool need_work = false;
 
spin_lock_irq(&i915->irq_lock);
-   i915->hotplug.event_bits |= BIT(encoder->hpd_pin);
+   if (hpd_high != test_bit(hpd_pin, 
&i915->hotplug.oob_hotplug_last_state)) {
+   i915->hotplug.event_bits |= BIT(hpd_pin);
+
+   __assign_bit(hpd_pin, &i915->hotplug.oob_hotplug_last_state, 
hpd_high);
+   need_work = true;
+   }
spin_unlock_irq(&i915->irq_lock);
-   queue_delayed_work(system_wq, &i915->hotplug.hotplug_work, 0);
+
+   if (need_work)
+   queue_delayed_work(system_wq, &i915->hotplug.hotplug_work, 0);
 }
 
 static const struct drm_connector_funcs intel_dp_connector_funcs = {
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 5cfe69b30841..80a4615a38e2 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -138,6 +138,9 @@ struct i915_hotplug {
/* Whether or not to count short HPD IRQs in HPD storms */
u8 hpd_short_storm_enabled;
 
+   /* Last state reported by oob_hotplug_event for each encoder */
+   unsigned long oob_hotplug_last_state;
+
/*
 * if we get a HPD irq from DP and a HPD irq from non-DP
 * the non-DP HPD could block the workqueue on a mode config
diff --git a/drivers/usb/typec/altmodes/displayport.c 
b/drivers/usb/typec/altmodes/displayport.c
index c1d8c23baa39..ea9cb1d71fd2 100644
--- a/drivers/usb/typec/altmodes/displayport.c
+++ b/drivers/usb/typec/altmodes/displayport.c
@@ -59,7 +59,6 @@ struct dp_altmode {
struct typec_displ

[PATCH 4/4] drm/msm/a6xx: Zap counters across context switch

2022-03-03 Thread Rob Clark
From: Rob Clark 

Any app-controlled perfcntr collection (GL_AMD_performance_monitor, etc.)
does not require counters to maintain state across context switches.  So
clear them if systemwide profiling is not active.

Signed-off-by: Rob Clark 
---
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 29 +++
 1 file changed, 29 insertions(+)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index 237c2e7a7baa..02b47977b5c3 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -101,6 +101,7 @@ static void get_stats_counter(struct msm_ringbuffer *ring, 
u32 counter,
 static void a6xx_set_pagetable(struct a6xx_gpu *a6xx_gpu,
struct msm_ringbuffer *ring, struct msm_file_private *ctx)
 {
+   bool sysprof = refcount_read(&a6xx_gpu->base.base.sysprof_active) > 1;
phys_addr_t ttbr;
u32 asid;
u64 memptr = rbmemptr(ring, ttbr0);
@@ -111,6 +112,15 @@ static void a6xx_set_pagetable(struct a6xx_gpu *a6xx_gpu,
if (msm_iommu_pagetable_params(ctx->aspace->mmu, &ttbr, &asid))
return;
 
+   if (!sysprof) {
+   /* Turn off protected mode to write to special registers */
+   OUT_PKT7(ring, CP_SET_PROTECTED_MODE, 1);
+   OUT_RING(ring, 0);
+
+   OUT_PKT4(ring, REG_A6XX_RBBM_PERFCTR_SRAM_INIT_CMD, 1);
+   OUT_RING(ring, 1);
+   }
+
/* Execute the table update */
OUT_PKT7(ring, CP_SMMU_TABLE_UPDATE, 4);
OUT_RING(ring, CP_SMMU_TABLE_UPDATE_0_TTBR0_LO(lower_32_bits(ttbr)));
@@ -137,6 +147,25 @@ static void a6xx_set_pagetable(struct a6xx_gpu *a6xx_gpu,
 
OUT_PKT7(ring, CP_EVENT_WRITE, 1);
OUT_RING(ring, 0x31);
+
+   if (!sysprof) {
+   /*
+* Wait for SRAM clear after the pgtable update, so the
+* two can happen in parallel:
+*/
+   OUT_PKT7(ring, CP_WAIT_REG_MEM, 6);
+   OUT_RING(ring, CP_WAIT_REG_MEM_0_FUNCTION(WRITE_EQ));
+   OUT_RING(ring, CP_WAIT_REG_MEM_1_POLL_ADDR_LO(
+   REG_A6XX_RBBM_PERFCTR_SRAM_INIT_STATUS));
+   OUT_RING(ring, CP_WAIT_REG_MEM_2_POLL_ADDR_HI(0));
+   OUT_RING(ring, CP_WAIT_REG_MEM_3_REF(0x1));
+   OUT_RING(ring, CP_WAIT_REG_MEM_4_MASK(0x1));
+   OUT_RING(ring, CP_WAIT_REG_MEM_5_DELAY_LOOP_CYCLES(0));
+
+   /* Re-enable protected mode: */
+   OUT_PKT7(ring, CP_SET_PROTECTED_MODE, 1);
+   OUT_RING(ring, 1);
+   }
 }
 
 static void a6xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit)
-- 
2.35.1



[PATCH 3/4] drm/msm: Add SYSPROF param (v2)

2022-03-03 Thread Rob Clark
From: Rob Clark 

Add a SYSPROF param for system profiling tools like Mesa's pps-producer
(perfetto) to control behavior related to system-wide performance
counter collection.  In particular, for profiling, one wants to ensure
that GPU context switches do not affect perfcounter state, and might
want to suppress suspend (which would cause counters to lose state).

v2: Swap the order in msm_file_private_set_sysprof() [sboyd] and
initialize the sysprof_active refcount to one (because the under/
overflow checking in refcount_t doesn't expect a 0->1 transition)
meaning that a value greater than 1 means sysprof is active.
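
The biased-refcount trick described above, as a standalone sketch
(generic refcount_t API; the real accounting lives in
msm_file_private_set_sysprof(), which is not reproduced here):

#include <linux/refcount.h>

/* starts at 1 so refcount_t never sees a 0 <-> 1 transition */
static refcount_t sysprof_active = REFCOUNT_INIT(1);

static void sysprof_get(void)
{
        refcount_inc(&sysprof_active);
}

static void sysprof_put(void)
{
        refcount_dec(&sysprof_active);
}

static bool sysprof_is_active(void)
{
        /* > 1 means at least one context enabled system profiling */
        return refcount_read(&sysprof_active) > 1;
}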

Signed-off-by: Rob Clark 
---
 drivers/gpu/drm/msm/adreno/adreno_gpu.c |  4 +++
 drivers/gpu/drm/msm/msm_drv.c   |  8 +
 drivers/gpu/drm/msm/msm_gpu.c   |  2 ++
 drivers/gpu/drm/msm/msm_gpu.h   | 27 +
 drivers/gpu/drm/msm/msm_submitqueue.c   | 39 +
 include/uapi/drm/msm_drm.h  |  1 +
 6 files changed, 81 insertions(+)

diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c 
b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
index 6a37d409653b..c91ea363c373 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
@@ -287,6 +287,10 @@ int adreno_set_param(struct msm_gpu *gpu, struct 
msm_file_private *ctx,
 uint32_t param, uint64_t value)
 {
switch (param) {
+   case MSM_PARAM_SYSPROF:
+   if (!capable(CAP_SYS_ADMIN))
+   return -EPERM;
+   return msm_file_private_set_sysprof(ctx, gpu, value);
default:
DBG("%s: invalid param: %u", gpu->name, param);
return -EINVAL;
diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
index ca9a8a866292..780f9748aaaf 100644
--- a/drivers/gpu/drm/msm/msm_drv.c
+++ b/drivers/gpu/drm/msm/msm_drv.c
@@ -559,8 +559,16 @@ static void context_close(struct msm_file_private *ctx)
 
 static void msm_postclose(struct drm_device *dev, struct drm_file *file)
 {
+   struct msm_drm_private *priv = dev->dev_private;
struct msm_file_private *ctx = file->driver_priv;
 
+   /*
+* It is not possible to set sysprof param to non-zero if gpu
+* is not initialized:
+*/
+   if (priv->gpu)
+   msm_file_private_set_sysprof(ctx, priv->gpu, 0);
+
context_close(ctx);
 }
 
diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
index c4fe8bc9445e..8fe4aee96aa9 100644
--- a/drivers/gpu/drm/msm/msm_gpu.c
+++ b/drivers/gpu/drm/msm/msm_gpu.c
@@ -975,6 +975,8 @@ int msm_gpu_init(struct drm_device *drm, struct 
platform_device *pdev,
 
gpu->nr_rings = nr_rings;
 
+   refcount_set(&gpu->sysprof_active, 1);
+
return 0;
 
 fail:
diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h
index fde9a29f884e..a84140055920 100644
--- a/drivers/gpu/drm/msm/msm_gpu.h
+++ b/drivers/gpu/drm/msm/msm_gpu.h
@@ -160,6 +160,13 @@ struct msm_gpu {
struct msm_ringbuffer *rb[MSM_GPU_MAX_RINGS];
int nr_rings;
 
+   /**
+* sysprof_active:
+*
+* The count of contexts that have enabled system profiling.
+*/
+   refcount_t sysprof_active;
+
/**
 * cur_ctx_seqno:
 *
@@ -330,6 +337,24 @@ struct msm_file_private {
struct kref ref;
int seqno;
 
+   /**
+* sysprof:
+*
+* The value of MSM_PARAM_SYSPROF set by userspace.  This is
+* intended to be used by system profiling tools like Mesa's
+* pps-producer (perfetto), and restricted to CAP_SYS_ADMIN.
+*
+* Setting a value of 1 will preserve performance counters across
+* context switches.  Setting a value of 2 will in addition
+* suppress suspend.  (Performance counters lose state across
+* power collapse, which is undesirable for profiling in some
+* cases.)
+*
+* The value automatically reverts to zero when the drm device
+* file is closed.
+*/
+   int sysprof;
+
/**
 * elapsed:
 *
@@ -545,6 +570,8 @@ void msm_submitqueue_close(struct msm_file_private *ctx);
 
 void msm_submitqueue_destroy(struct kref *kref);
 
+int msm_file_private_set_sysprof(struct msm_file_private *ctx,
+struct msm_gpu *gpu, int sysprof);
 void __msm_file_private_destroy(struct kref *kref);
 
 static inline void msm_file_private_put(struct msm_file_private *ctx)
diff --git a/drivers/gpu/drm/msm/msm_submitqueue.c 
b/drivers/gpu/drm/msm/msm_submitqueue.c
index 7cb158bcbcf6..79b6ccd6ce64 100644
--- a/drivers/gpu/drm/msm/msm_submitqueue.c
+++ b/drivers/gpu/drm/msm/msm_submitqueue.c
@@ -7,6 +7,45 @@
 
 #include "msm_gpu.h"
 
+int msm_file_private_set_sysprof(struct msm_file_private *ctx,
+struct msm_gpu *gpu, int sysprof)
+{
+   /*

[PATCH 2/4] drm/msm: Add SET_PARAM ioctl

2022-03-03 Thread Rob Clark
From: Rob Clark 

It was always expected to have a use for this some day, so we left a
placeholder.  Now we do.  (And I expect another use in the not too
distant future when we start allowing userspace to allocate GPU iova.)
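
A hedged sketch of how userspace could exercise the new ioctl once the
SYSPROF param from the next patch exists; the field names follow the
existing struct drm_msm_param used for GET_PARAM, and the use of
drmCommandWrite() is an assumption about how a tool would wire this up:

#include <stdint.h>
#include <xf86drm.h>
#include "msm_drm.h"    /* the drm/msm UAPI header */

static int msm_set_sysprof(int fd, uint64_t value)
{
        struct drm_msm_param req = {
                .pipe  = MSM_PIPE_3D0,
                .param = MSM_PARAM_SYSPROF,   /* added in patch 3/4 */
                .value = value,  /* 1: preserve counters, 2: also block suspend */
        };

        return drmCommandWrite(fd, DRM_MSM_SET_PARAM, &req, sizeof(req));
}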

Signed-off-by: Rob Clark 
---
 drivers/gpu/drm/msm/adreno/a2xx_gpu.c   |  1 +
 drivers/gpu/drm/msm/adreno/a3xx_gpu.c   |  1 +
 drivers/gpu/drm/msm/adreno/a4xx_gpu.c   |  1 +
 drivers/gpu/drm/msm/adreno/a5xx_gpu.c   |  1 +
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c   |  1 +
 drivers/gpu/drm/msm/adreno/adreno_gpu.c | 10 +
 drivers/gpu/drm/msm/adreno/adreno_gpu.h |  2 ++
 drivers/gpu/drm/msm/msm_drv.c   | 20 ++
 drivers/gpu/drm/msm/msm_gpu.h   |  2 ++
 include/uapi/drm/msm_drm.h  | 27 ++---
 10 files changed, 54 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a2xx_gpu.c 
b/drivers/gpu/drm/msm/adreno/a2xx_gpu.c
index 22e8295a5e2b..6c9a747eb4ad 100644
--- a/drivers/gpu/drm/msm/adreno/a2xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a2xx_gpu.c
@@ -471,6 +471,7 @@ static u32 a2xx_get_rptr(struct msm_gpu *gpu, struct 
msm_ringbuffer *ring)
 static const struct adreno_gpu_funcs funcs = {
.base = {
.get_param = adreno_get_param,
+   .set_param = adreno_set_param,
.hw_init = a2xx_hw_init,
.pm_suspend = msm_gpu_pm_suspend,
.pm_resume = msm_gpu_pm_resume,
diff --git a/drivers/gpu/drm/msm/adreno/a3xx_gpu.c 
b/drivers/gpu/drm/msm/adreno/a3xx_gpu.c
index 2e481e2692ba..0ab0e1dd8bbb 100644
--- a/drivers/gpu/drm/msm/adreno/a3xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a3xx_gpu.c
@@ -486,6 +486,7 @@ static u32 a3xx_get_rptr(struct msm_gpu *gpu, struct 
msm_ringbuffer *ring)
 static const struct adreno_gpu_funcs funcs = {
.base = {
.get_param = adreno_get_param,
+   .set_param = adreno_set_param,
.hw_init = a3xx_hw_init,
.pm_suspend = msm_gpu_pm_suspend,
.pm_resume = msm_gpu_pm_resume,
diff --git a/drivers/gpu/drm/msm/adreno/a4xx_gpu.c 
b/drivers/gpu/drm/msm/adreno/a4xx_gpu.c
index c5524d6e8705..0c6b2a6d0b4c 100644
--- a/drivers/gpu/drm/msm/adreno/a4xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a4xx_gpu.c
@@ -621,6 +621,7 @@ static u32 a4xx_get_rptr(struct msm_gpu *gpu, struct 
msm_ringbuffer *ring)
 static const struct adreno_gpu_funcs funcs = {
.base = {
.get_param = adreno_get_param,
+   .set_param = adreno_set_param,
.hw_init = a4xx_hw_init,
.pm_suspend = a4xx_pm_suspend,
.pm_resume = a4xx_pm_resume,
diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c 
b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
index 3d28fcf841a6..407f50a15faa 100644
--- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
@@ -1700,6 +1700,7 @@ static uint32_t a5xx_get_rptr(struct msm_gpu *gpu, struct 
msm_ringbuffer *ring)
 static const struct adreno_gpu_funcs funcs = {
.base = {
.get_param = adreno_get_param,
+   .set_param = adreno_set_param,
.hw_init = a5xx_hw_init,
.pm_suspend = a5xx_pm_suspend,
.pm_resume = a5xx_pm_resume,
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index 7d23c741db4a..237c2e7a7baa 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -1800,6 +1800,7 @@ static int a6xx_set_supported_hw(struct device *dev, 
struct adreno_rev rev)
 static const struct adreno_gpu_funcs funcs = {
.base = {
.get_param = adreno_get_param,
+   .set_param = adreno_set_param,
.hw_init = a6xx_hw_init,
.pm_suspend = a6xx_pm_suspend,
.pm_resume = a6xx_pm_resume,
diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c 
b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
index 15c8997b7251..6a37d409653b 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
@@ -283,6 +283,16 @@ int adreno_get_param(struct msm_gpu *gpu, struct 
msm_file_private *ctx,
}
 }
 
+int adreno_set_param(struct msm_gpu *gpu, struct msm_file_private *ctx,
+uint32_t param, uint64_t value)
+{
+   switch (param) {
+   default:
+   DBG("%s: invalid param: %u", gpu->name, param);
+   return -EINVAL;
+   }
+}
+
 const struct firmware *
 adreno_request_fw(struct adreno_gpu *adreno_gpu, const char *fwname)
 {
diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.h 
b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
index b1ee453d627d..0490c5fbb780 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.h
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
@@ -282,6 +282,8 @@ static inline int adreno_is_a650_family(struct adreno_gpu 
*gpu)
 
 int adreno_get_param(struct msm_gpu *gpu, struct msm_file_

[PATCH 0/4] drm/msm: Clear perf counters across context switch

2022-03-03 Thread Rob Clark
From: Rob Clark 

Some clever folks figured out a way to use performance counters as a
side-channel[1].  But, other than the special case of using the perf
counters for system profiling, we can reset the counters across context
switches to protect against this.

This series introduces a SYSPROF param which sufficiently privileged
userspace (like Mesa's pps-producer, which already must run as root) can
use to opt out, and makes resetting the counters on context switches the
default behavior.

[1] https://dl.acm.org/doi/pdf/10.1145/3503222.3507757
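
For illustration, opting out from a privileged profiler would look roughly
like the sketch below. The ioctl index, param number and value semantics
are assumptions based on this series' description (patches 2/4 and 3/4
define the real ABI), so treat every constant here as illustrative:

  #include <stdint.h>
  #include <xf86drm.h>   /* drmCommandWrite() from libdrm */

  /* Assumed numbers; the series defines the authoritative ones. */
  #define DRM_MSM_SET_PARAM   0x01    /* the old placeholder slot */
  #define MSM_PARAM_SYSPROF   0x0b    /* added by patch 3/4 */
  #define MSM_PIPE_3D0        0x10

  struct drm_msm_param {
          uint32_t pipe;    /* MSM_PIPE_x */
          uint32_t param;   /* MSM_PARAM_x */
          uint64_t value;
  };

  /* Must run with sufficient privileges, like pps-producer does. */
  static int msm_enable_sysprof(int drm_fd)
  {
          struct drm_msm_param req = {
                  .pipe  = MSM_PIPE_3D0,
                  .param = MSM_PARAM_SYSPROF,
                  /* non-zero: keep counters across context switches */
                  .value = 1,
          };

          return drmCommandWrite(drm_fd, DRM_MSM_SET_PARAM,
                                 &req, sizeof(req));
  }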

Rob Clark (4):
  drm/msm: Update generated headers
  drm/msm: Add SET_PARAM ioctl
  drm/msm: Add SYSPROF param (v2)
  drm/msm/a6xx: Zap counters across context switch

 drivers/gpu/drm/msm/adreno/a2xx.xml.h |  26 +-
 drivers/gpu/drm/msm/adreno/a2xx_gpu.c |   1 +
 drivers/gpu/drm/msm/adreno/a3xx.xml.h |  30 +-
 drivers/gpu/drm/msm/adreno/a3xx_gpu.c |   1 +
 drivers/gpu/drm/msm/adreno/a4xx.xml.h | 112 ++-
 drivers/gpu/drm/msm/adreno/a4xx_gpu.c |   1 +
 drivers/gpu/drm/msm/adreno/a5xx.xml.h |  63 +-
 drivers/gpu/drm/msm/adreno/a5xx_gpu.c |   1 +
 drivers/gpu/drm/msm/adreno/a6xx.xml.h | 674 +++---
 drivers/gpu/drm/msm/adreno/a6xx_gmu.xml.h |  26 +-
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c |  30 +
 .../gpu/drm/msm/adreno/adreno_common.xml.h|  31 +-
 drivers/gpu/drm/msm/adreno/adreno_gpu.c   |  14 +
 drivers/gpu/drm/msm/adreno/adreno_gpu.h   |   2 +
 drivers/gpu/drm/msm/adreno/adreno_pm4.xml.h   |  46 +-
 drivers/gpu/drm/msm/disp/mdp4/mdp4.xml.h  |  37 +-
 drivers/gpu/drm/msm/disp/mdp5/mdp5.xml.h  |  37 +-
 drivers/gpu/drm/msm/disp/mdp_common.xml.h |  37 +-
 drivers/gpu/drm/msm/dsi/dsi.xml.h |  37 +-
 drivers/gpu/drm/msm/dsi/dsi_phy_10nm.xml.h|  37 +-
 drivers/gpu/drm/msm/dsi/dsi_phy_14nm.xml.h|  37 +-
 drivers/gpu/drm/msm/dsi/dsi_phy_20nm.xml.h|  37 +-
 drivers/gpu/drm/msm/dsi/dsi_phy_28nm.xml.h|  37 +-
 .../gpu/drm/msm/dsi/dsi_phy_28nm_8960.xml.h   |  37 +-
 drivers/gpu/drm/msm/dsi/dsi_phy_5nm.xml.h | 480 -
 drivers/gpu/drm/msm/dsi/dsi_phy_7nm.xml.h |  43 +-
 drivers/gpu/drm/msm/dsi/mmss_cc.xml.h |  37 +-
 drivers/gpu/drm/msm/dsi/sfpb.xml.h|  37 +-
 drivers/gpu/drm/msm/hdmi/hdmi.xml.h   |  37 +-
 drivers/gpu/drm/msm/hdmi/qfprom.xml.h |  37 +-
 drivers/gpu/drm/msm/msm_drv.c |  28 +
 drivers/gpu/drm/msm/msm_gpu.c |   2 +
 drivers/gpu/drm/msm/msm_gpu.h |  29 +
 drivers/gpu/drm/msm/msm_submitqueue.c |  39 +
 include/uapi/drm/msm_drm.h|  28 +-
 35 files changed, 1058 insertions(+), 1130 deletions(-)
 delete mode 100644 drivers/gpu/drm/msm/dsi/dsi_phy_5nm.xml.h

-- 
2.35.1



Re: Report 2 in ext4 and journal based on v5.17-rc1

2022-03-03 Thread Byungchul Park
On Thu, Mar 03, 2022 at 09:36:25AM -0500, Theodore Ts'o wrote:
> On Thu, Mar 03, 2022 at 02:23:33PM +0900, Byungchul Park wrote:
> > I totally agree with you. *They aren't really locks but it's just waits
> > and wakeups.* That's exactly why I decided to develop Dept. Dept is not
> > interested in locks unlike Lockdep, but focuses on waits and wakeup
> > sources itself. I think you get Dept wrong a lot. Please ask me more if
> > you have things you doubt about Dept.
> 
> So the question is this --- do you now understand why, even though
> there is a circular dependency, nothing gets stalled in the
> interactions between the two wait channels?

??? I'm afraid I don't get you.

All contexts waiting for any of the events in the circular dependency
chain will definitely be stuck if there is a circular dependency as I
explained. So we need another wakeup source to break the circle. In the
ext4 code, you might have a wakeup source that breaks the circle.

What I agreed with is:

   The case where 1) the circular dependency is unavoidable, 2) there is
   another wakeup source that breaks the circle and 3) the duration of
   the sleep is short enough, should be acceptable.
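
As a concrete sketch of that acceptable pattern (names are purely
illustrative): ctx_a() and ctx_b() wait on each other, which is formally
a circle, but the timeout in ctx_a() is the extra wakeup source that
breaks it and bounds the sleep:

  static DECLARE_WAIT_QUEUE_HEAD(wq1);
  static DECLARE_WAIT_QUEUE_HEAD(wq2);
  static bool cond1, cond2;

  static void ctx_a(void)
  {
          /* woken by ctx_b() or, at the latest, by the timeout */
          wait_event_timeout(wq1, cond1, HZ / 10);
          cond2 = true;
          wake_up(&wq2);
  }

  static void ctx_b(void)
  {
          wait_event(wq2, cond2);         /* woken by ctx_a() */
          cond1 = true;
          wake_up(&wq1);
  }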

Sounds good?

Thanks,
Byungchul


Re: [PATCH v3 00/21] DEPT(Dependency Tracker)

2022-03-03 Thread Byungchul Park
On Thu, Mar 03, 2022 at 12:38:39PM +, Hyeonggon Yoo wrote:
> On Thu, Mar 03, 2022 at 06:48:24PM +0900, Byungchul Park wrote:
> > On Thu, Mar 03, 2022 at 08:03:21AM +, Hyeonggon Yoo wrote:
> > > On Thu, Mar 03, 2022 at 09:18:13AM +0900, Byungchul Park wrote:
> > > > Hi Hyeonggon,
> > > > 
> > > > Dept also allows the following scenario when an user guarantees that
> > > > each lock instance is different from another at a different depth:
> > > >
> > > >lock A0 with depth
> > > >lock A1 with depth + 1
> > > >lock A2 with depth + 2
> > > >lock A3 with depth + 3
> > > >(and so on)
> > > >..
> > > >unlock A3
> > > >unlock A2
> > > >unlock A1
> > > >unlock A0
> > 
> 
> [+Cc kmemleak maintainer]
> 
> > Look at this. Dept allows object->lock -> other_object->lock (with a
> > different depth using *_lock_nested()) so won't report it.
> >
> 
> No, It did.

Yes, you are right. I should have asked you to resend the AA deadlock
report when I noticed that [W]'s stacktrace was missing from what you
shared, and I should have taken a closer look at it.

Dept normally doesn't report this type of AA deadlock. But it does in
the case we are talking about, where another lock class cuts in between
the nesting locks. I will fix it. The AA deadlock report here doesn't
make sense. Thank you.

However, the other report below still makes sense.

> > > > >   45  *   scan_mutex [-> object->lock] -> kmemleak_lock -> 
> > > > > other_object->lock (SINGLE_DEPTH_NESTING)
> > > > >   46  *
> > > > >   47  * No kmemleak_lock and object->lock nesting is allowed outside 
> > > > > scan_mutex
> > > > >   48  * regions.
> > > 
> > > lock order in kmemleak is described above.
> > > 
> > > and DEPT detects two cases as deadlock:
> > > 
> > > 1) object->lock -> other_object->lock
> > 
> > It's not a deadlock *IF* two have different depth using *_lock_nested().
> > Dept also allows this case. So Dept wouldn't report it.
> >
> > > 2) object->lock -> kmemleak_lock, kmemleak_lock -> other_object->lock
> >
> > But this usage is risky. I already explained it in the mail you replied
> > to. I copied it. See the below.
> >
> 
> I understand why you said this is risky.
> Its lock ordering is not good.
> 
> > context A
> > > >lock A0 with depth
> > > >lock B
> > > >lock A1 with depth + 1
> > > >lock A2 with depth + 2
> > > >lock A3 with depth + 3
> > > >(and so on)
> > > >..
> > > >unlock A3
> > > >unlock A2
> > > >unlock A1
> > > >unlock B
> > > >unlock A0
> >
> > ...
> >
> > context B
> > > >lock A1 with depth
> > > >lock B
> > > >lock A2 with depth + 1
> > > >lock A3 with depth + 2
> > > >(and so on)
> > > >..
> > > >unlock A3
> > > >unlock A2
> > > >unlock B
> > > >unlock A1
> > 
> > where Ax : object->lock, B : kmemleak_lock.
> > 
> > A deadlock might occur if the two contexts run at the same time.
> >
> 
> But I want to say kmemleak is getting things under control. No two contexts
> can run at same time.

So.. do you think the below is also okay? Because lock C and lock B are
under control?

   context Xcontext Y

   lock mutex A lock mutex A
   lock B   lock C
   lock C   lock B
   unlock C unlock B
   unlock B unlock C
   unlock mutex A   unlock mutex A

In my opinion, lock B and lock C are unnecessary if they are always
taken together with mutex A. Otherwise we should keep a correct lock
order across all the code.

> > > And in kmemleak case, 1) and 2) is not possible because it must hold
> > > scan_mutex first.
> > 
> > This is another issue. Let's focus on whether the order is okay for now.
> >
> 
> Why is it another issue?

You seem to insist that locking order is not important *if* the locks
are under control because the sections are serialized. I meant that this
is another issue.

> > > I think the author of kmemleak intended lockdep to treat object->lock
> > > and other_object->lock as different class, using raw_spin_lock_nested().
> > 
> > Yes. The author meant to assign a different class according to its depth
> > using a Lockdep API. Strictly speaking, those are the same class anyway
> > but we assign a different class to each depth to avoid Lockdep splats
> > *IF* the user guarantees the nesting lock usage is safe, IOW, guarantees
> > each lock instance is different at a different depth.
> 
> Then why DEPT reports 1) and 2) as deadlock?

1) will be fixed so that Dept doesn't report it. But I still think the
case 2) should be reported for the wrong usage.

Thanks,
Byungchul

> Does DEPT assign same class unlike Lockdep?
> 
> > I was fundamentally asking you... so... is the nesting lock usage safe
> > for real?
> 
> I don't get what the point is. I agree it's not a good lock ordering.
> But in kmemleak case, I think kmemleak is getting things under control.
> 
> -- 
> Thank you, You are awesome!
> Hyeonggon :-)
> 
> 

Re: [PATCH V2 04/12] drm: bridge: icn6211: Add DSI lane count DT property parsing

2022-03-03 Thread Marek Vasut

On 3/3/22 13:54, Maxime Ripard wrote:

[...]


Regarding the default value -- there are no in-tree users of this driver yet
(per git grep in current linux-next), do we really care about backward
compatibility in this case?


If it hasn't been in a stable release yet, no. If it did, yes


It was in a stable release, V3 is out.


[PATCH V3 12/13] drm: bridge: icn6211: Rework ICN6211_DSI to chipone_writeb()

2022-03-03 Thread Marek Vasut
Rename and inline macro ICN6211_DSI() into function chipone_writeb()
to keep all function names lower-case. No functional change.

Acked-by: Maxime Ripard 
Signed-off-by: Marek Vasut 
Cc: Jagan Teki 
Cc: Maxime Ripard 
Cc: Robert Foss 
Cc: Sam Ravnborg 
Cc: Thomas Zimmermann 
To: dri-devel@lists.freedesktop.org
---
V2: Rebase on next-20220214
V3: Add AB from Maxime
---
 drivers/gpu/drm/bridge/chipone-icn6211.c | 63 +++-
 1 file changed, 28 insertions(+), 35 deletions(-)

diff --git a/drivers/gpu/drm/bridge/chipone-icn6211.c 
b/drivers/gpu/drm/bridge/chipone-icn6211.c
index 4ad149c13f599..c66eacc6b1e2a 100644
--- a/drivers/gpu/drm/bridge/chipone-icn6211.c
+++ b/drivers/gpu/drm/bridge/chipone-icn6211.c
@@ -153,8 +153,7 @@ static inline struct chipone *bridge_to_chipone(struct 
drm_bridge *bridge)
return container_of(bridge, struct chipone, bridge);
 }
 
-static inline int chipone_dsi_write(struct chipone *icn,  const void *seq,
-   size_t len)
+static void chipone_writeb(struct chipone *icn, u8 reg, u8 val)
 {
if (icn->interface_i2c)
i2c_smbus_write_byte_data(icn->client, reg, val);
@@ -162,12 +161,6 @@ static inline int chipone_dsi_write(struct chipone *icn,  
const void *seq,
mipi_dsi_generic_write(icn->dsi, (u8[]){reg, val}, 2);
 }
 
-#define ICN6211_DSI(icn, seq...)   \
-   {   \
-   const u8 d[] = { seq }; \
-   chipone_dsi_write(icn, d, ARRAY_SIZE(d));   \
-   }
-
 static void chipone_configure_pll(struct chipone *icn,
  const struct drm_display_mode *mode)
 {
@@ -242,11 +235,11 @@ static void chipone_configure_pll(struct chipone *icn,
(fin * best_m) / BIT(best_p + best_s + 2));
 
/* Clock source selection fixed to MIPI DSI clock lane */
-   ICN6211_DSI(icn, PLL_CTRL(6), PLL_CTRL_6_MIPI_CLK);
-   ICN6211_DSI(icn, PLL_REF_DIV,
+   chipone_writeb(icn, PLL_CTRL(6), PLL_CTRL_6_MIPI_CLK);
+   chipone_writeb(icn, PLL_REF_DIV,
(best_p ? PLL_REF_DIV_Pe : 0) | /* Prefer /2 pre-divider */
PLL_REF_DIV_P(best_p) | PLL_REF_DIV_S(best_s));
-   ICN6211_DSI(icn, PLL_INT(0), best_m);
+   chipone_writeb(icn, PLL_INT(0), best_m);
 }
 
 static void chipone_atomic_enable(struct drm_bridge *bridge,
@@ -265,19 +258,19 @@ static void chipone_atomic_enable(struct drm_bridge 
*bridge,
bus_flags = bridge_state->output_bus_cfg.flags;
 
if (icn->interface_i2c)
-   ICN6211_DSI(icn, MIPI_CFG_PW, MIPI_CFG_PW_CONFIG_I2C);
+   chipone_writeb(icn, MIPI_CFG_PW, MIPI_CFG_PW_CONFIG_I2C);
else
-   ICN6211_DSI(icn, MIPI_CFG_PW, MIPI_CFG_PW_CONFIG_DSI);
+   chipone_writeb(icn, MIPI_CFG_PW, MIPI_CFG_PW_CONFIG_DSI);
 
-   ICN6211_DSI(icn, HACTIVE_LI, mode->hdisplay & 0xff);
+   chipone_writeb(icn, HACTIVE_LI, mode->hdisplay & 0xff);
 
-   ICN6211_DSI(icn, VACTIVE_LI, mode->vdisplay & 0xff);
+   chipone_writeb(icn, VACTIVE_LI, mode->vdisplay & 0xff);
 
/*
 * lsb nibble: 2nd nibble of hdisplay
 * msb nibble: 2nd nibble of vdisplay
 */
-   ICN6211_DSI(icn, VACTIVE_HACTIVE_HI,
+   chipone_writeb(icn, VACTIVE_HACTIVE_HI,
((mode->hdisplay >> 8) & 0xf) |
(((mode->vdisplay >> 8) & 0xf) << 4));
 
@@ -285,49 +278,49 @@ static void chipone_atomic_enable(struct drm_bridge 
*bridge,
hsync = mode->hsync_end - mode->hsync_start;
hbp = mode->htotal - mode->hsync_end;
 
-   ICN6211_DSI(icn, HFP_LI, hfp & 0xff);
-   ICN6211_DSI(icn, HSYNC_LI, hsync & 0xff);
-   ICN6211_DSI(icn, HBP_LI, hbp & 0xff);
+   chipone_writeb(icn, HFP_LI, hfp & 0xff);
+   chipone_writeb(icn, HSYNC_LI, hsync & 0xff);
+   chipone_writeb(icn, HBP_LI, hbp & 0xff);
/* Top two bits of Horizontal Front porch/Sync/Back porch */
-   ICN6211_DSI(icn, HFP_HSW_HBP_HI,
+   chipone_writeb(icn, HFP_HSW_HBP_HI,
HFP_HSW_HBP_HI_HFP(hfp) |
HFP_HSW_HBP_HI_HS(hsync) |
HFP_HSW_HBP_HI_HBP(hbp));
 
-   ICN6211_DSI(icn, VFP, mode->vsync_start - mode->vdisplay);
+   chipone_writeb(icn, VFP, mode->vsync_start - mode->vdisplay);
 
-   ICN6211_DSI(icn, VSYNC, mode->vsync_end - mode->vsync_start);
+   chipone_writeb(icn, VSYNC, mode->vsync_end - mode->vsync_start);
 
-   ICN6211_DSI(icn, VBP, mode->vtotal - mode->vsync_end);
+   chipone_writeb(icn, VBP, mode->vtotal - mode->vsync_end);
 
/* dsi specific sequence */
-   ICN6211_DSI(icn, SYNC_EVENT_DLY, 0x80);
-   ICN6211_DSI(icn, HFP_MIN, hfp & 0xff);
+   chipone_writeb(icn, SYNC_EVENT_DLY, 0x80);
+   chipone_writeb(icn, HFP_MIN, hfp & 0xff);
 
/* DSI data lane count */
-   ICN

[PATCH V3 06/13] drm: bridge: icn6211: Add generic DSI-to-DPI PLL configuration

2022-03-03 Thread Marek Vasut
The chip contains a fractional PLL, however the driver currently hard-codes
one specific PLL setting. Implement generic PLL parameter calculation code,
so any DPI panel with an arbitrary pixel clock can be attached to this bridge.

The datasheet for this bridge is not available; the PLL behavior has been
inferred from [1] and [2] and by analyzing the DPI pixel clock with a scope.
The PLL limits might be wrong, but at least the calculated values match all
the example code available. This is better than a single hard-coded pixel
clock value anyway.

[1] 
https://github.com/rockchip-linux/kernel/blob/develop-4.19/drivers/gpu/drm/bridge/icn6211.c
[2] https://github.com/tdjastrzebski/ICN6211-Configurator
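
For experimenting with the search outside the kernel, the calculation can
be reproduced with a small standalone program like the sketch below; the
5..20 MHz PLL input window, the 1 GHz VCO limit and the 8-bit multiplier
mirror the guesses used in the driver code and are not datasheet facts:

  #include <stdio.h>

  int main(void)
  {
          unsigned int mode_clock = 74250;        /* DPI pixel clock, kHz */
          unsigned int bpp = 24, lanes = 4;
          unsigned int fin = mode_clock * bpp / lanes / 2; /* DSI clk, kHz */
          unsigned int best_p = 0, best_m = 0, best_s = 0, min_delta = ~0u;
          unsigned int p, s;

          for (p = 0; p < 16; p++) {              /* 2^(p+1) pre-divider */
                  unsigned int freq_p = fin / (1u << (p + 1));

                  if (freq_p < 5000 || freq_p > 20000)
                          continue;               /* PLL input 5..20 MHz */

                  for (s = 0; s < 7; s++) {       /* 2^(s+1) post-divider */
                          unsigned int freq_s = freq_p / (1u << (s + 1));
                          unsigned int m, out, delta;

                          if (!freq_s)
                                  break;

                          m = mode_clock / freq_s;
                          if (!m || m > 0xff)     /* 8-bit multiplier */
                                  continue;
                          if (freq_p * m > 1000000)
                                  continue;       /* VCO <= 1 GHz */

                          out = freq_p * m / (1u << (s + 1));
                          delta = out > mode_clock ? out - mode_clock
                                                   : mode_clock - out;
                          if (delta < min_delta) {
                                  best_p = p;
                                  best_m = m;
                                  best_s = s;
                                  min_delta = delta;
                          }
                  }
          }

          printf("P=%u M=%u S=%u delta=%u kHz\n",
                 best_p, best_m, best_s, min_delta);
          return 0;
  }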

Acked-by: Maxime Ripard 
Signed-off-by: Marek Vasut 
Cc: Jagan Teki 
Cc: Maxime Ripard 
Cc: Robert Foss 
Cc: Sam Ravnborg 
Cc: Thomas Zimmermann 
To: dri-devel@lists.freedesktop.org
---
V2: Rebase on next-20220214
V3: Add AB from Maxime
---
 drivers/gpu/drm/bridge/chipone-icn6211.c | 87 +++-
 1 file changed, 84 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/bridge/chipone-icn6211.c 
b/drivers/gpu/drm/bridge/chipone-icn6211.c
index df8e75a068ad0..71c83a18984fa 100644
--- a/drivers/gpu/drm/bridge/chipone-icn6211.c
+++ b/drivers/gpu/drm/bridge/chipone-icn6211.c
@@ -163,6 +163,87 @@ static inline int chipone_dsi_write(struct chipone *icn,  
const void *seq,
chipone_dsi_write(icn, d, ARRAY_SIZE(d));   \
}
 
+static void chipone_configure_pll(struct chipone *icn,
+ const struct drm_display_mode *mode)
+{
+   unsigned int best_p = 0, best_m = 0, best_s = 0;
+   unsigned int delta, min_delta = 0xffffffff;
+   unsigned int freq_p, freq_s, freq_out;
+   unsigned int p_min, p_max;
+   unsigned int p, m, s;
+   unsigned int fin;
+
+   /*
+* DSI clock lane frequency (input into PLL) is calculated as:
+*  DSI_CLK = mode clock * bpp / dsi_data_lanes / 2
+* the 2 is there because the bus is DDR.
+*
+* DPI pixel clock frequency (output from PLL) is mode clock.
+*
+* The chip contains fractional PLL which works as follows:
+*  DPI_CLK = ((DSI_CLK / P) * M) / S
+* P is pre-divider, register PLL_REF_DIV[3:0] is 2^(n+1) divider
+*   register PLL_REF_DIV[4] is extra 1:2 divider
+* M is integer multiplier, register PLL_INT(0) is multiplier
+* S is post-divider, register PLL_REF_DIV[7:5] is 2^(n+1) divider
+*
+* It seems the PLL input clock after applying P pre-divider have
+* to be lower than 20 MHz.
+*/
+   fin = mode->clock * mipi_dsi_pixel_format_to_bpp(icn->dsi->format) /
+ icn->dsi_lanes / 2; /* in kHz */
+
+   /* Minimum value of P predivider for PLL input in 5..20 MHz */
+   p_min = ffs(fin / 2);
+   p_max = (fls(fin / 5000) - 1) & 0x1f;
+
+   for (p = p_min; p < p_max; p++) {   /* PLL_REF_DIV[4,3:0] */
+   freq_p = fin / BIT(p + 1);
+   if (freq_p == 0)/* Divider too high */
+   break;
+
+   for (s = 0; s < 0x7; s++) { /* PLL_REF_DIV[7:5] */
+   freq_s = freq_p / BIT(s + 1);
+   if (freq_s == 0)/* Divider too high */
+   break;
+
+   m = mode->clock / freq_s;
+
+   /* Multiplier is 8 bit */
+   if (m > 0xff)
+   continue;
+
+   /* Limit PLL VCO frequency to 1 GHz */
+   freq_out = (fin * m) / BIT(p + 1);
+   if (freq_out > 1000000)
+   continue;
+
+   /* Apply post-divider */
+   freq_out /= BIT(s + 1);
+
+   delta = abs(mode->clock - freq_out);
+   if (delta < min_delta) {
+   best_p = p;
+   best_m = m;
+   best_s = s;
+   min_delta = delta;
+   }
+   }
+   }
+
+   dev_dbg(icn->dev,
+   "PLL: P[3:0]=2^%d P[4]=2*%d M=%d S[7:5]=2^%d delta=%d => DSI 
f_in=%d kHz ; DPI f_out=%ld kHz\n",
+   best_p, !!best_p, best_m, best_s + 1, min_delta, fin,
+   (fin * best_m) / BIT(best_p + best_s + 2));
+
+   /* Clock source selection fixed to MIPI DSI clock lane */
+   ICN6211_DSI(icn, PLL_CTRL(6), PLL_CTRL_6_MIPI_CLK);
+   ICN6211_DSI(icn, PLL_REF_DIV,
+   (best_p ? PLL_REF_DIV_Pe : 0) | /* Prefer /2 pre-divider */
+   PLL_REF_DIV_P(best_p) | PLL_REF_DIV_S(best_s));
+   ICN6211_DSI(icn, PLL_INT(0), best_m);
+}
+
 static void chipone_atomic_enable(struct drm_bridge *bridge,
  struct drm_b

[PATCH V3 08/13] drm: bridge: icn6211: Disable DPI color swap

2022-03-03 Thread Marek Vasut
The chip is capable of swapping DPI RGB channels. The driver currently
does not implement support for this functionality. Write the MIPI_PN_SWAP
register to 0 to ensure the color swap is disabled.

Acked-by: Maxime Ripard 
Signed-off-by: Marek Vasut 
Cc: Jagan Teki 
Cc: Maxime Ripard 
Cc: Robert Foss 
Cc: Sam Ravnborg 
Cc: Thomas Zimmermann 
To: dri-devel@lists.freedesktop.org
---
V2: Rebase on next-20220214
V3: Add AB from Maxime
---
 drivers/gpu/drm/bridge/chipone-icn6211.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/bridge/chipone-icn6211.c 
b/drivers/gpu/drm/bridge/chipone-icn6211.c
index b4e886c2b92a5..1a3afefcc9e80 100644
--- a/drivers/gpu/drm/bridge/chipone-icn6211.c
+++ b/drivers/gpu/drm/bridge/chipone-icn6211.c
@@ -302,6 +302,7 @@ static void chipone_atomic_enable(struct drm_bridge *bridge,
 
ICN6211_DSI(icn, MIPI_PD_CK_LANE, 0xa0);
ICN6211_DSI(icn, PLL_CTRL(12), 0xff);
+   ICN6211_DSI(icn, MIPI_PN_SWAP, 0x00);
 
/* DPI HS/VS/DE polarity */
pol = ((mode->flags & DRM_MODE_FLAG_PHSYNC) ? BIST_POL_HSYNC_POL : 0) |
-- 
2.34.1



[PATCH V3 09/13] drm: bridge: icn6211: Set SYS_CTRL_1 to value used in examples

2022-03-03 Thread Marek Vasut
Both example codebases [1] and [2], as well as one provided by a custom
panel vendor, set register SYS_CTRL_1 to 0x88. What exactly the value means
is unknown because the datasheet is unavailable. Align this register value
with the example code.

[1] 
https://github.com/rockchip-linux/kernel/blob/develop-4.19/drivers/gpu/drm/bridge/icn6211.c
[2] https://github.com/tdjastrzebski/ICN6211-Configurator

Acked-by: Maxime Ripard 
Signed-off-by: Marek Vasut 
Cc: Jagan Teki 
Cc: Maxime Ripard 
Cc: Robert Foss 
Cc: Sam Ravnborg 
Cc: Thomas Zimmermann 
To: dri-devel@lists.freedesktop.org
---
V2: Rebase on next-20220214
V3: Add AB from Maxime
---
 drivers/gpu/drm/bridge/chipone-icn6211.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/bridge/chipone-icn6211.c 
b/drivers/gpu/drm/bridge/chipone-icn6211.c
index 1a3afefcc9e80..095002a40d0e8 100644
--- a/drivers/gpu/drm/bridge/chipone-icn6211.c
+++ b/drivers/gpu/drm/bridge/chipone-icn6211.c
@@ -314,7 +314,7 @@ static void chipone_atomic_enable(struct drm_bridge *bridge,
chipone_configure_pll(icn, mode);
 
ICN6211_DSI(icn, SYS_CTRL(0), 0x40);
-   ICN6211_DSI(icn, SYS_CTRL(1), 0x98);
+   ICN6211_DSI(icn, SYS_CTRL(1), 0x88);
 
/* icn6211 specific sequence */
ICN6211_DSI(icn, MIPI_FORCE_0, 0x20);
-- 
2.34.1



[PATCH V3 11/13] drm: bridge: icn6211: Add I2C configuration support

2022-03-03 Thread Marek Vasut
The ICN6211 chip starts in I2C configuration mode after cold boot.
Implement support for configuring the chip via I2C in addition to
the current DSI LP command mode configuration support. The latter
seems to be available only on chips which have an additional MCU on
the panel/bridge board that preconfigures the ICN6211, while the
I2C configuration mode added by this patch does not require any
such MCU.

Acked-by: Maxime Ripard 
Signed-off-by: Marek Vasut 
Cc: Jagan Teki 
Cc: Maxime Ripard 
Cc: Robert Foss 
Cc: Sam Ravnborg 
Cc: Thomas Zimmermann 
To: dri-devel@lists.freedesktop.org
---
V2: - Drop the abridge variable
- Rename chipone_dsi_setup to chipone_dsi_host_attach and call
  it from chipone_i2c_probe()
V3: Add AB from Maxime
---
 drivers/gpu/drm/bridge/chipone-icn6211.c | 183 ---
 1 file changed, 161 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/drm/bridge/chipone-icn6211.c 
b/drivers/gpu/drm/bridge/chipone-icn6211.c
index afc619e215c3b..4ad149c13f599 100644
--- a/drivers/gpu/drm/bridge/chipone-icn6211.c
+++ b/drivers/gpu/drm/bridge/chipone-icn6211.c
@@ -11,6 +11,7 @@
 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -133,15 +134,18 @@
 
 struct chipone {
struct device *dev;
+   struct i2c_client *client;
struct drm_bridge bridge;
struct drm_display_mode mode;
struct drm_bridge *panel_bridge;
struct device_node *host_node;
+   struct mipi_dsi_device *dsi;
struct gpio_desc *enable_gpio;
struct regulator *vdd1;
struct regulator *vdd2;
struct regulator *vdd3;
int dsi_lanes;
+   bool interface_i2c;
 };
 
 static inline struct chipone *bridge_to_chipone(struct drm_bridge *bridge)
@@ -152,9 +156,10 @@ static inline struct chipone *bridge_to_chipone(struct 
drm_bridge *bridge)
 static inline int chipone_dsi_write(struct chipone *icn,  const void *seq,
size_t len)
 {
-   struct mipi_dsi_device *dsi = to_mipi_dsi_device(icn->dev);
-
-   return mipi_dsi_generic_write(dsi, seq, len);
+   if (icn->interface_i2c)
+   i2c_smbus_write_byte_data(icn->client, reg, val);
+   else
+   mipi_dsi_generic_write(icn->dsi, (u8[]){reg, val}, 2);
 }
 
 #define ICN6211_DSI(icn, seq...)   \
@@ -259,7 +264,10 @@ static void chipone_atomic_enable(struct drm_bridge 
*bridge,
bridge_state = drm_atomic_get_new_bridge_state(state, bridge);
bus_flags = bridge_state->output_bus_cfg.flags;
 
-   ICN6211_DSI(icn, MIPI_CFG_PW, MIPI_CFG_PW_CONFIG_DSI);
+   if (icn->interface_i2c)
+   ICN6211_DSI(icn, MIPI_CFG_PW, MIPI_CFG_PW_CONFIG_I2C);
+   else
+   ICN6211_DSI(icn, MIPI_CFG_PW, MIPI_CFG_PW_CONFIG_DSI);
 
ICN6211_DSI(icn, HACTIVE_LI, mode->hdisplay & 0xff);
 
@@ -380,6 +388,57 @@ static void chipone_mode_set(struct drm_bridge *bridge,
struct chipone *icn = bridge_to_chipone(bridge);
 
drm_mode_copy(&icn->mode, adjusted_mode);
+};
+
+static int chipone_dsi_attach(struct chipone *icn)
+{
+   struct mipi_dsi_device *dsi = icn->dsi;
+   int ret;
+
+   dsi->lanes = icn->dsi_lanes;
+   dsi->format = MIPI_DSI_FMT_RGB888;
+   dsi->mode_flags = MIPI_DSI_MODE_VIDEO | MIPI_DSI_MODE_VIDEO_BURST |
+ MIPI_DSI_MODE_LPM | MIPI_DSI_MODE_NO_EOT_PACKET;
+
+   ret = mipi_dsi_attach(dsi);
+   if (ret < 0)
+   dev_err(icn->dev, "failed to attach dsi\n");
+
+   return ret;
+}
+
+static int chipone_dsi_host_attach(struct chipone *icn)
+{
+   struct device *dev = icn->dev;
+   struct mipi_dsi_device *dsi;
+   struct mipi_dsi_host *host;
+   int ret = 0;
+
+   const struct mipi_dsi_device_info info = {
+   .type = "chipone",
+   .channel = 0,
+   .node = NULL,
+   };
+
+   host = of_find_mipi_dsi_host_by_node(icn->host_node);
+   if (!host) {
+   dev_err(dev, "failed to find dsi host\n");
+   return -EPROBE_DEFER;
+   }
+
+   dsi = mipi_dsi_device_register_full(host, &info);
+   if (IS_ERR(dsi)) {
+   return dev_err_probe(dev, PTR_ERR(dsi),
+"failed to create dsi device\n");
+   }
+
+   icn->dsi = dsi;
+
+   ret = chipone_dsi_attach(icn);
+   if (ret < 0)
+   mipi_dsi_device_unregister(dsi);
+
+   return ret;
 }
 
 static int chipone_attach(struct drm_bridge *bridge, enum 
drm_bridge_attach_flags flags)
@@ -506,9 +565,8 @@ static int chipone_parse_dt(struct chipone *icn)
return ret;
 }
 
-static int chipone_probe(struct mipi_dsi_device *dsi)
+static int chipone_common_probe(struct device *dev, struct chipone **icnr)
 {
-   struct device *dev = &dsi->dev;
struct chipone *icn;
int ret;
 
@@ -516,7 +574,6 @@ static int chipone_probe(struct mipi_dsi_device *dsi)
i

[PATCH V3 02/13] drm: bridge: icn6211: Fix register layout

2022-03-03 Thread Marek Vasut
The chip register layout has nothing to do with MIPI DCS; the registers
incorrectly marked as MIPI DCS in the driver are regular chip registers,
often with a completely different function.

Fill in the actual register names and bits from [1] and [2] and add the
entire register layout, since the documentation for this chip is hard to
come by.

[1] 
https://github.com/rockchip-linux/kernel/blob/develop-4.19/drivers/gpu/drm/bridge/icn6211.c
[2] https://github.com/tdjastrzebski/ICN6211-Configurator

Acked-by: Maxime Ripard 
Fixes: ce517f18944e3 ("drm: bridge: Add Chipone ICN6211 MIPI-DSI to RGB bridge")
Signed-off-by: Marek Vasut 
Cc: Jagan Teki 
Cc: Maxime Ripard 
Cc: Robert Foss 
Cc: Sam Ravnborg 
Cc: Thomas Zimmermann 
To: dri-devel@lists.freedesktop.org
---
V2: Rebase on next-20220214
V3: Add AB from Maxime
---
 drivers/gpu/drm/bridge/chipone-icn6211.c | 134 ---
 1 file changed, 117 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/bridge/chipone-icn6211.c 
b/drivers/gpu/drm/bridge/chipone-icn6211.c
index e8f36dca56b33..4b8d1a5a50302 100644
--- a/drivers/gpu/drm/bridge/chipone-icn6211.c
+++ b/drivers/gpu/drm/bridge/chipone-icn6211.c
@@ -15,8 +15,19 @@
 #include 
 #include 
 
-#include 
-
+#define VENDOR_ID  0x00
+#define DEVICE_ID_H0x01
+#define DEVICE_ID_L0x02
+#define VERSION_ID 0x03
+#define FIRMWARE_VERSION   0x08
+#define CONFIG_FINISH  0x09
+#define PD_CTRL(n) (0x0a + ((n) & 0x3)) /* 0..3 */
+#define RST_CTRL(n)(0x0e + ((n) & 0x1)) /* 0..1 */
+#define SYS_CTRL(n)(0x10 + ((n) & 0x7)) /* 0..4 */
+#define RGB_DRV(n) (0x18 + ((n) & 0x3)) /* 0..3 */
+#define RGB_DLY(n) (0x1c + ((n) & 0x1)) /* 0..1 */
+#define RGB_TEST_CTRL  0x1e
+#define ATE_PLL_EN 0x1f
 #define HACTIVE_LI 0x20
 #define VACTIVE_LI 0x21
 #define VACTIVE_HACTIVE_HI 0x22
@@ -27,6 +38,95 @@
 #define VFP0x27
 #define VSYNC  0x28
 #define VBP0x29
+#define BIST_POL   0x2a
+#define BIST_POL_BIST_MODE(n)  (((n) & 0xf) << 4)
+#define BIST_POL_BIST_GEN  BIT(3)
+#define BIST_POL_HSYNC_POL BIT(2)
+#define BIST_POL_VSYNC_POL BIT(1)
+#define BIST_POL_DE_POLBIT(0)
+#define BIST_RED   0x2b
+#define BIST_GREEN 0x2c
+#define BIST_BLUE  0x2d
+#define BIST_CHESS_X   0x2e
+#define BIST_CHESS_Y   0x2f
+#define BIST_CHESS_XY_H0x30
+#define BIST_FRAME_TIME_L  0x31
+#define BIST_FRAME_TIME_H  0x32
+#define FIFO_MAX_ADDR_LOW  0x33
+#define SYNC_EVENT_DLY 0x34
+#define HSW_MIN0x35
+#define HFP_MIN0x36
+#define LOGIC_RST_NUM  0x37
+#define OSC_CTRL(n)(0x48 + ((n) & 0x7)) /* 0..5 */
+#define BG_CTRL0x4e
+#define LDO_PLL0x4f
+#define PLL_CTRL(n)(0x50 + ((n) & 0xf)) /* 0..15 */
+#define PLL_CTRL_6_EXTERNAL0x90
+#define PLL_CTRL_6_MIPI_CLK0x92
+#define PLL_CTRL_6_INTERNAL0x93
+#define PLL_REM(n) (0x60 + ((n) & 0x3)) /* 0..2 */
+#define PLL_DIV(n) (0x63 + ((n) & 0x3)) /* 0..2 */
+#define PLL_FRAC(n)(0x66 + ((n) & 0x3)) /* 0..2 */
+#define PLL_INT(n) (0x69 + ((n) & 0x1)) /* 0..1 */
+#define PLL_REF_DIV0x6b
+#define PLL_REF_DIV_P(n)   ((n) & 0xf)
+#define PLL_REF_DIV_Pe BIT(4)
+#define PLL_REF_DIV_S(n)   (((n) & 0x7) << 5)
+#define PLL_SSC_P(n)   (0x6c + ((n) & 0x3)) /* 0..2 */
+#define PLL_SSC_STEP(n)(0x6f + ((n) & 0x3)) /* 0..2 */
+#define PLL_SSC_OFFSET(n)  (0x72 + ((n) & 0x3)) /* 0..3 */
+#define GPIO_OEN   0x79
+#define MIPI_CFG_PW0x7a
+#define MIPI_CFG_PW_CONFIG_DSI 0xc1
+#define MIPI_CFG_PW_CONFIG_I2C 0x3e
+#define GPIO_SEL(n)(0x7b + ((n) & 0x1)) /* 0..1 */
+#define IRQ_SEL0x7d
+#define DBG_SEL0x7e
+#define DBG_SIGNAL 0x7f
+#define MIPI_ERR_VECTOR_L  0x80
+#define MIPI_ERR_VECTOR_H  0x81
+#define MIPI_ERR_VECTOR_EN_L   0x82
+#define MIPI_ERR_VECTOR_EN_H   0x83
+#define MIPI_MAX_SIZE_L0x84
+#define MIPI_MAX_SIZE_H0x85
+#define DSI_CTRL   0x86
+#define DSI_CTRL_UNKNOWN   0x28
+#define DSI_CTRL_DSI_LANES(n)  ((n) & 0x3)
+#define MIPI_PN_SWAP   0x87
+#define MIPI_PN_SWAP_CLK   BIT(4)
+#define MIPI_PN_SWAP_D(n)  BIT((n) & 0x3)
+#define MIPI_SOT_SYNC_BIT_(n)  (0x88 + ((n) & 0x1)) /* 0..1 */
+#define MIPI_ULPS_CTRL 0x8a
+#define MIPI_CLK_CHK_VAR   0x8e
+#define MIPI_CLK_CHK_INI   0x8f
+#define MIPI_T_TERM_EN 0x90
+#define MIPI_T_HS_SETTLE  

[PATCH V3 13/13] drm: bridge: icn6211: Read and validate chip IDs before configuration

2022-03-03 Thread Marek Vasut
Read out the Vendor/Chip/Version ID registers from the chip before
performing any configuration, and validate that the registers have
correct values. This is mostly a simple test of whether DSI register
access works, since that tends to be broken on various bridges.

Acked-by: Maxime Ripard 
Signed-off-by: Marek Vasut 
Cc: Jagan Teki 
Cc: Maxime Ripard 
Cc: Robert Foss 
Cc: Sam Ravnborg 
Cc: Thomas Zimmermann 
To: dri-devel@lists.freedesktop.org
---
V2: Rebase on next-20220214
V3: Add AB from Maxime
---
 drivers/gpu/drm/bridge/chipone-icn6211.c | 24 +++-
 1 file changed, 23 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/bridge/chipone-icn6211.c 
b/drivers/gpu/drm/bridge/chipone-icn6211.c
index c66eacc6b1e2a..0a07023d0aeec 100644
--- a/drivers/gpu/drm/bridge/chipone-icn6211.c
+++ b/drivers/gpu/drm/bridge/chipone-icn6211.c
@@ -153,6 +153,14 @@ static inline struct chipone *bridge_to_chipone(struct 
drm_bridge *bridge)
return container_of(bridge, struct chipone, bridge);
 }
 
+static void chipone_readb(struct chipone *icn, u8 reg, u8 *val)
+{
+   if (icn->interface_i2c)
+   *val = i2c_smbus_read_byte_data(icn->client, reg);
+   else
+   mipi_dsi_generic_read(icn->dsi, (u8[]){reg, 1}, 2, val, 1);
+}
+
 static void chipone_writeb(struct chipone *icn, u8 reg, u8 val)
 {
if (icn->interface_i2c)
@@ -251,7 +259,21 @@ static void chipone_atomic_enable(struct drm_bridge 
*bridge,
const struct drm_bridge_state *bridge_state;
u16 hfp, hbp, hsync;
u32 bus_flags;
-   u8 pol;
+   u8 pol, id[4];
+
+   chipone_readb(icn, VENDOR_ID, id);
+   chipone_readb(icn, DEVICE_ID_H, id + 1);
+   chipone_readb(icn, DEVICE_ID_L, id + 2);
+   chipone_readb(icn, VERSION_ID, id + 3);
+
+   dev_dbg(icn->dev,
+   "Chip IDs: Vendor=0x%02x Device=0x%02x:0x%02x Version=0x%02x\n",
+   id[0], id[1], id[2], id[3]);
+
+   if (id[0] != 0xc1 || id[1] != 0x62 || id[2] != 0x11) {
+   dev_dbg(icn->dev, "Invalid Chip IDs, aborting configuration\n");
+   return;
+   }
 
/* Get the DPI flags from the bridge state. */
bridge_state = drm_atomic_get_new_bridge_state(state, bridge);
-- 
2.34.1



[PATCH V3 07/13] drm: bridge: icn6211: Use DSI burst mode without EoT and with LP command mode

2022-03-03 Thread Marek Vasut
The DSI burst mode is more energy efficient than the DSI sync pulse mode,
so make use of the burst mode since the chip supports it as well. Disable
the generation of the EoT packet; the chip ignores it, so there is no point
in emitting it. Enable transmission of data in LP mode, otherwise register
reads via DSI do not work with this chip.

Acked-by: Maxime Ripard 
Signed-off-by: Marek Vasut 
Cc: Jagan Teki 
Cc: Maxime Ripard 
Cc: Robert Foss 
Cc: Sam Ravnborg 
Cc: Thomas Zimmermann 
To: dri-devel@lists.freedesktop.org
---
V2: Rebase on next-20220214
V3: Add AB from Maxime
---
 drivers/gpu/drm/bridge/chipone-icn6211.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/bridge/chipone-icn6211.c 
b/drivers/gpu/drm/bridge/chipone-icn6211.c
index 71c83a18984fa..b4e886c2b92a5 100644
--- a/drivers/gpu/drm/bridge/chipone-icn6211.c
+++ b/drivers/gpu/drm/bridge/chipone-icn6211.c
@@ -503,7 +503,8 @@ static int chipone_probe(struct mipi_dsi_device *dsi)
 
dsi->lanes = icn->dsi_lanes;
dsi->format = MIPI_DSI_FMT_RGB888;
-   dsi->mode_flags = MIPI_DSI_MODE_VIDEO_SYNC_PULSE;
+   dsi->mode_flags = MIPI_DSI_MODE_VIDEO | MIPI_DSI_MODE_VIDEO_BURST |
+ MIPI_DSI_MODE_LPM | MIPI_DSI_MODE_NO_EOT_PACKET;
 
ret = mipi_dsi_attach(dsi);
if (ret < 0) {
-- 
2.34.1



[PATCH V3 04/13] drm: bridge: icn6211: Add HS/VS/DE polarity handling

2022-03-03 Thread Marek Vasut
The driver currently hard-codes HS/VS polarity to active-low and DE to
active-high, which is not correct for a lot of supported DPI panels.
Add the missing mode flag handling for HS/VS/DE polarity.

Signed-off-by: Marek Vasut 
Cc: Jagan Teki 
Cc: Maxime Ripard 
Cc: Robert Foss 
Cc: Sam Ravnborg 
Cc: Thomas Zimmermann 
To: dri-devel@lists.freedesktop.org
---
V2: Rebase on next-20220214
V3: No change
---
 drivers/gpu/drm/bridge/chipone-icn6211.c | 16 +++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/bridge/chipone-icn6211.c 
b/drivers/gpu/drm/bridge/chipone-icn6211.c
index e29e6a84c39a6..2ac8eb7e25f52 100644
--- a/drivers/gpu/drm/bridge/chipone-icn6211.c
+++ b/drivers/gpu/drm/bridge/chipone-icn6211.c
@@ -165,8 +165,16 @@ static void chipone_atomic_enable(struct drm_bridge 
*bridge,
  struct drm_bridge_state *old_bridge_state)
 {
struct chipone *icn = bridge_to_chipone(bridge);
+   struct drm_atomic_state *state = old_bridge_state->base.state;
struct drm_display_mode *mode = &icn->mode;
+   const struct drm_bridge_state *bridge_state;
u16 hfp, hbp, hsync;
+   u32 bus_flags;
+   u8 pol;
+
+   /* Get the DPI flags from the bridge state. */
+   bridge_state = drm_atomic_get_new_bridge_state(state, bridge);
+   bus_flags = bridge_state->output_bus_cfg.flags;
 
ICN6211_DSI(icn, MIPI_CFG_PW, MIPI_CFG_PW_CONFIG_DSI);
 
@@ -206,7 +214,13 @@ static void chipone_atomic_enable(struct drm_bridge 
*bridge,
ICN6211_DSI(icn, HFP_MIN, hfp & 0xff);
ICN6211_DSI(icn, MIPI_PD_CK_LANE, 0xa0);
ICN6211_DSI(icn, PLL_CTRL(12), 0xff);
-   ICN6211_DSI(icn, BIST_POL, BIST_POL_DE_POL);
+
+   /* DPI HS/VS/DE polarity */
+   pol = ((mode->flags & DRM_MODE_FLAG_PHSYNC) ? BIST_POL_HSYNC_POL : 0) |
+ ((mode->flags & DRM_MODE_FLAG_PVSYNC) ? BIST_POL_VSYNC_POL : 0) |
+ ((bus_flags & DRM_BUS_FLAG_DE_HIGH) ? BIST_POL_DE_POL : 0);
+   ICN6211_DSI(icn, BIST_POL, pol);
+
ICN6211_DSI(icn, PLL_CTRL(6), PLL_CTRL_6_MIPI_CLK);
ICN6211_DSI(icn, PLL_REF_DIV, 0x71);
ICN6211_DSI(icn, PLL_INT(0), 0x2b);
-- 
2.34.1



[PATCH V3 00/13] drm: bridge: icn6211: Fix hard-coded panel settings and add I2C support

2022-03-03 Thread Marek Vasut
This series fixes multiple problems with the ICN6211 driver and adds
support for configuration of the chip via I2C bus.

First, in the current state, the ICN6211 driver hard-codes DPI timing
and clock settings specific to some unknown panel. The settings provided
by the panel driver are ignored. Using any panel other than the one for
which this driver is currently hard-coded can lead to permanent damage of
the panel (per the display supplier's warning, and it did indeed happen in
my case; the damage looks like multiple rows of dead pixels at the bottom
of the panel, and it does not go away even after a long power-off time).

Much of this series thus fixes the incorrect register layout and DPI
timing programming, and fixes clock generation by adding actual PLL
configuration code. This series also adds lane count decoding instead of
using a hard-coded value, and fills in a couple of registers with likely
correct default values.

Second, this series adds support for I2C configuration of the ICN6211.
The device can be configured either via DSI command mode or via I2C;
the register layout is the same in both cases.

Since the datasheet for this device is very hard to come by, a lot of
information has been salvaged from [1] and [2].

[1] 
https://github.com/rockchip-linux/kernel/blob/develop-4.19/drivers/gpu/drm/bridge/icn6211.c
[2] https://github.com/tdjastrzebski/ICN6211-Configurator

Cc: Jagan Teki 
Cc: Maxime Ripard 
Cc: Robert Foss 
Cc: Sam Ravnborg 
Cc: Thomas Zimmermann 
To: dri-devel@lists.freedesktop.org

Marek Vasut (13):
  dt-bindings: display: bridge: icn6211: Document DSI data-lanes
property
  drm: bridge: icn6211: Fix register layout
  drm: bridge: icn6211: Fix HFP_HSW_HBP_HI and HFP_MIN handling
  drm: bridge: icn6211: Add HS/VS/DE polarity handling
  drm: bridge: icn6211: Add DSI lane count DT property parsing
  drm: bridge: icn6211: Add generic DSI-to-DPI PLL configuration
  drm: bridge: icn6211: Use DSI burst mode without EoT and with LP
command mode
  drm: bridge: icn6211: Disable DPI color swap
  drm: bridge: icn6211: Set SYS_CTRL_1 to value used in examples
  drm: bridge: icn6211: Implement atomic_get_input_bus_fmts
  drm: bridge: icn6211: Add I2C configuration support
  drm: bridge: icn6211: Rework ICN6211_DSI to chipone_writeb()
  drm: bridge: icn6211: Read and validate chip IDs before configuration

 .../display/bridge/chipone,icn6211.yaml   |  18 +-
 drivers/gpu/drm/bridge/chipone-icn6211.c  | 534 --
 2 files changed, 496 insertions(+), 56 deletions(-)

-- 
2.34.1



[PATCH V3 10/13] drm: bridge: icn6211: Implement atomic_get_input_bus_fmts

2022-03-03 Thread Marek Vasut
Implement .atomic_get_input_bus_fmts callback, which sets up the
input (DSI-end) format, and that format can then be used in pipeline
format negotiation between the DSI-end of this bridge and the other
component closer to the scanout engine.

Acked-by: Maxime Ripard 
Signed-off-by: Marek Vasut 
Cc: Jagan Teki 
Cc: Maxime Ripard 
Cc: Robert Foss 
Cc: Sam Ravnborg 
Cc: Thomas Zimmermann 
To: dri-devel@lists.freedesktop.org
---
V2: Rebase on next-20220214
V3: Add AB from Maxime
---
 drivers/gpu/drm/bridge/chipone-icn6211.c | 27 
 1 file changed, 27 insertions(+)

diff --git a/drivers/gpu/drm/bridge/chipone-icn6211.c 
b/drivers/gpu/drm/bridge/chipone-icn6211.c
index 095002a40d0e8..afc619e215c3b 100644
--- a/drivers/gpu/drm/bridge/chipone-icn6211.c
+++ b/drivers/gpu/drm/bridge/chipone-icn6211.c
@@ -389,6 +389,32 @@ static int chipone_attach(struct drm_bridge *bridge, enum 
drm_bridge_attach_flag
return drm_bridge_attach(bridge->encoder, icn->panel_bridge, bridge, 
flags);
 }
 
+#define MAX_INPUT_SEL_FORMATS  1
+
+static u32 *
+chipone_atomic_get_input_bus_fmts(struct drm_bridge *bridge,
+ struct drm_bridge_state *bridge_state,
+ struct drm_crtc_state *crtc_state,
+ struct drm_connector_state *conn_state,
+ u32 output_fmt,
+ unsigned int *num_input_fmts)
+{
+   u32 *input_fmts;
+
+   *num_input_fmts = 0;
+
+   input_fmts = kcalloc(MAX_INPUT_SEL_FORMATS, sizeof(*input_fmts),
+GFP_KERNEL);
+   if (!input_fmts)
+   return NULL;
+
+   /* This is the DSI-end bus format */
+   input_fmts[0] = MEDIA_BUS_FMT_RGB888_1X24;
+   *num_input_fmts = 1;
+
+   return input_fmts;
+}
+
 static const struct drm_bridge_funcs chipone_bridge_funcs = {
.atomic_duplicate_state = drm_atomic_helper_bridge_duplicate_state,
.atomic_destroy_state   = drm_atomic_helper_bridge_destroy_state,
@@ -398,6 +424,7 @@ static const struct drm_bridge_funcs chipone_bridge_funcs = 
{
.atomic_post_disable= chipone_atomic_post_disable,
.mode_set   = chipone_mode_set,
.attach = chipone_attach,
+   .atomic_get_input_bus_fmts = chipone_atomic_get_input_bus_fmts,
 };
 
 static int chipone_parse_dt(struct chipone *icn)
-- 
2.34.1



[PATCH V3 03/13] drm: bridge: icn6211: Fix HFP_HSW_HBP_HI and HFP_MIN handling

2022-03-03 Thread Marek Vasut
The HFP_HSW_HBP_HI register must be programmed with the two MSbits of
each of the Horizontal Front Porch/Sync/Back Porch values. Currently the
driver programs this register to 0, which breaks displays where any of
these values is above 255.

The HFP_MIN register must be set to the same value as HFP_LI, otherwise
there is visible image distortion, usually in the form of missing lines
at the bottom of the panel.

Fix this by correctly programming the HFP_HSW_HBP_HI and HFP_MIN registers.
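
For example, with hfp = 300 (0x12c), hsync = 40 (0x028) and hbp = 260
(0x104), values picked only for illustration, the registers end up as:

  /*
   *   HFP_LI         = hfp   & 0xff            = 0x2c
   *   HSYNC_LI       = hsync & 0xff            = 0x28
   *   HBP_LI         = hbp   & 0xff            = 0x04
   *   HFP_HSW_HBP_HI = ((hfp   & 0x300) >> 4) |
   *                    ((hsync & 0x300) >> 6) |
   *                    ((hbp   & 0x300) >> 8)  = 0x10 | 0x00 | 0x01 = 0x11
   *
   * i.e. bits 9:8 of each value land in the HI register instead of being
   * dropped as they were before.
   */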

Acked-by: Maxime Ripard 
Fixes: ce517f18944e3 ("drm: bridge: Add Chipone ICN6211 MIPI-DSI to RGB bridge")
Signed-off-by: Marek Vasut 
Cc: Jagan Teki 
Cc: Maxime Ripard 
Cc: Robert Foss 
Cc: Sam Ravnborg 
Cc: Thomas Zimmermann 
To: dri-devel@lists.freedesktop.org
---
V2: Rebase on next-20220214
V3: Add AB from Maxime
---
 drivers/gpu/drm/bridge/chipone-icn6211.c | 23 ---
 1 file changed, 16 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/bridge/chipone-icn6211.c 
b/drivers/gpu/drm/bridge/chipone-icn6211.c
index 4b8d1a5a50302..e29e6a84c39a6 100644
--- a/drivers/gpu/drm/bridge/chipone-icn6211.c
+++ b/drivers/gpu/drm/bridge/chipone-icn6211.c
@@ -35,6 +35,9 @@
 #define HSYNC_LI   0x24
 #define HBP_LI 0x25
 #define HFP_HSW_HBP_HI 0x26
+#define HFP_HSW_HBP_HI_HFP(n)  (((n) & 0x300) >> 4)
+#define HFP_HSW_HBP_HI_HS(n)   (((n) & 0x300) >> 6)
+#define HFP_HSW_HBP_HI_HBP(n)  (((n) & 0x300) >> 8)
 #define VFP0x27
 #define VSYNC  0x28
 #define VBP0x29
@@ -163,6 +166,7 @@ static void chipone_atomic_enable(struct drm_bridge *bridge,
 {
struct chipone *icn = bridge_to_chipone(bridge);
struct drm_display_mode *mode = &icn->mode;
+   u16 hfp, hbp, hsync;
 
ICN6211_DSI(icn, MIPI_CFG_PW, MIPI_CFG_PW_CONFIG_DSI);
 
@@ -178,13 +182,18 @@ static void chipone_atomic_enable(struct drm_bridge 
*bridge,
((mode->hdisplay >> 8) & 0xf) |
(((mode->vdisplay >> 8) & 0xf) << 4));
 
-   ICN6211_DSI(icn, HFP_LI, mode->hsync_start - mode->hdisplay);
+   hfp = mode->hsync_start - mode->hdisplay;
+   hsync = mode->hsync_end - mode->hsync_start;
+   hbp = mode->htotal - mode->hsync_end;
 
-   ICN6211_DSI(icn, HSYNC_LI, mode->hsync_end - mode->hsync_start);
-
-   ICN6211_DSI(icn, HBP_LI, mode->htotal - mode->hsync_end);
-
-   ICN6211_DSI(icn, HFP_HSW_HBP_HI, 0x00);
+   ICN6211_DSI(icn, HFP_LI, hfp & 0xff);
+   ICN6211_DSI(icn, HSYNC_LI, hsync & 0xff);
+   ICN6211_DSI(icn, HBP_LI, hbp & 0xff);
+   /* Top two bits of Horizontal Front porch/Sync/Back porch */
+   ICN6211_DSI(icn, HFP_HSW_HBP_HI,
+   HFP_HSW_HBP_HI_HFP(hfp) |
+   HFP_HSW_HBP_HI_HS(hsync) |
+   HFP_HSW_HBP_HI_HBP(hbp));
 
ICN6211_DSI(icn, VFP, mode->vsync_start - mode->vdisplay);
 
@@ -194,7 +203,7 @@ static void chipone_atomic_enable(struct drm_bridge *bridge,
 
/* dsi specific sequence */
ICN6211_DSI(icn, SYNC_EVENT_DLY, 0x80);
-   ICN6211_DSI(icn, HFP_MIN, 0x28);
+   ICN6211_DSI(icn, HFP_MIN, hfp & 0xff);
ICN6211_DSI(icn, MIPI_PD_CK_LANE, 0xa0);
ICN6211_DSI(icn, PLL_CTRL(12), 0xff);
ICN6211_DSI(icn, BIST_POL, BIST_POL_DE_POL);
-- 
2.34.1



[PATCH V3 05/13] drm: bridge: icn6211: Add DSI lane count DT property parsing

2022-03-03 Thread Marek Vasut
The driver currently hard-codes the DSI lane count to four, however the
chip is capable of operating with 1 to 4 DSI lanes. Parse the 'data-lanes'
DT property and program the result into the DSI_CTRL register.

Signed-off-by: Marek Vasut 
Cc: Jagan Teki 
Cc: Maxime Ripard 
Cc: Robert Foss 
Cc: Sam Ravnborg 
Cc: Thomas Zimmermann 
To: dri-devel@lists.freedesktop.org
---
V2: Rebase on next-20220214
V3: Default to 4 data lanes unless specified otherwise
---
 drivers/gpu/drm/bridge/chipone-icn6211.c | 45 +---
 1 file changed, 41 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/bridge/chipone-icn6211.c 
b/drivers/gpu/drm/bridge/chipone-icn6211.c
index 2ac8eb7e25f52..df8e75a068ad0 100644
--- a/drivers/gpu/drm/bridge/chipone-icn6211.c
+++ b/drivers/gpu/drm/bridge/chipone-icn6211.c
@@ -136,10 +136,12 @@ struct chipone {
struct drm_bridge bridge;
struct drm_display_mode mode;
struct drm_bridge *panel_bridge;
+   struct device_node *host_node;
struct gpio_desc *enable_gpio;
struct regulator *vdd1;
struct regulator *vdd2;
struct regulator *vdd3;
+   int dsi_lanes;
 };
 
 static inline struct chipone *bridge_to_chipone(struct drm_bridge *bridge)
@@ -212,6 +214,11 @@ static void chipone_atomic_enable(struct drm_bridge 
*bridge,
/* dsi specific sequence */
ICN6211_DSI(icn, SYNC_EVENT_DLY, 0x80);
ICN6211_DSI(icn, HFP_MIN, hfp & 0xff);
+
+   /* DSI data lane count */
+   ICN6211_DSI(icn, DSI_CTRL,
+   DSI_CTRL_UNKNOWN | DSI_CTRL_DSI_LANES(icn->dsi_lanes - 1));
+
ICN6211_DSI(icn, MIPI_PD_CK_LANE, 0xa0);
ICN6211_DSI(icn, PLL_CTRL(12), 0xff);
 
@@ -314,7 +321,9 @@ static const struct drm_bridge_funcs chipone_bridge_funcs = 
{
 static int chipone_parse_dt(struct chipone *icn)
 {
struct device *dev = icn->dev;
+   struct device_node *endpoint;
struct drm_panel *panel;
+   int dsi_lanes;
int ret;
 
icn->vdd1 = devm_regulator_get_optional(dev, "vdd1");
@@ -350,15 +359,42 @@ static int chipone_parse_dt(struct chipone *icn)
return PTR_ERR(icn->enable_gpio);
}
 
+   endpoint = of_graph_get_endpoint_by_regs(dev->of_node, 0, 0);
+   dsi_lanes = of_property_count_u32_elems(endpoint, "data-lanes");
+   icn->host_node = of_graph_get_remote_port_parent(endpoint);
+   of_node_put(endpoint);
+
+   if (!icn->host_node)
+   return -ENODEV;
+
+   /*
+* If the 'data-lanes' property does not exist in DT or is invalid,
+* default to previously hard-coded behavior, which was 4 data lanes.
+*/
+   if (dsi_lanes < 0) {
+   icn->dsi_lanes = 4;
+   } else if (dsi_lanes > 4) {
+   ret = -EINVAL;
+   goto err_data_lanes;
+   } else {
+   icn->dsi_lanes = dsi_lanes;
+   }
+
ret = drm_of_find_panel_or_bridge(dev->of_node, 1, 0, &panel, NULL);
if (ret)
-   return ret;
+   goto err_data_lanes;
 
icn->panel_bridge = devm_drm_panel_bridge_add(dev, panel);
-   if (IS_ERR(icn->panel_bridge))
-   return PTR_ERR(icn->panel_bridge);
+   if (IS_ERR(icn->panel_bridge)) {
+   ret = PTR_ERR(icn->panel_bridge);
+   goto err_data_lanes;
+   }
 
return 0;
+
+err_data_lanes:
+   of_node_put(icn->host_node);
+   return ret;
 }
 
 static int chipone_probe(struct mipi_dsi_device *dsi)
@@ -384,7 +420,7 @@ static int chipone_probe(struct mipi_dsi_device *dsi)
 
drm_bridge_add(&icn->bridge);
 
-   dsi->lanes = 4;
+   dsi->lanes = icn->dsi_lanes;
dsi->format = MIPI_DSI_FMT_RGB888;
dsi->mode_flags = MIPI_DSI_MODE_VIDEO_SYNC_PULSE;
 
@@ -403,6 +439,7 @@ static int chipone_remove(struct mipi_dsi_device *dsi)
 
mipi_dsi_detach(dsi);
drm_bridge_remove(&icn->bridge);
+   of_node_put(icn->host_node);
 
return 0;
 }
-- 
2.34.1



[PATCH V3 01/13] dt-bindings: display: bridge: icn6211: Document DSI data-lanes property

2022-03-03 Thread Marek Vasut
It is necessary to specify the number of connected/used DSI data lanes when
using the DSI input port of this bridge. Document the 'data-lanes' property
of the DSI input port.

Signed-off-by: Marek Vasut 
Cc: Jagan Teki 
Cc: Maxime Ripard 
Cc: Rob Herring 
Cc: Robert Foss 
Cc: Sam Ravnborg 
Cc: Thomas Zimmermann 
Cc: devicet...@vger.kernel.org
To: dri-devel@lists.freedesktop.org
---
V3: New patch
---
 .../display/bridge/chipone,icn6211.yaml| 18 +-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git 
a/Documentation/devicetree/bindings/display/bridge/chipone,icn6211.yaml 
b/Documentation/devicetree/bindings/display/bridge/chipone,icn6211.yaml
index 62c3bd4cb28d8..f8cac721a7330 100644
--- a/Documentation/devicetree/bindings/display/bridge/chipone,icn6211.yaml
+++ b/Documentation/devicetree/bindings/display/bridge/chipone,icn6211.yaml
@@ -41,10 +41,26 @@ properties:
 
 properties:
   port@0:
-$ref: /schemas/graph.yaml#/properties/port
+$ref: /schemas/graph.yaml#/$defs/port-base
+unevaluatedProperties: false
 description:
   Video port for MIPI DSI input
 
+properties:
+  endpoint:
+$ref: /schemas/media/video-interfaces.yaml#
+unevaluatedProperties: false
+
+properties:
+  data-lanes:
+description: array of physical DSI data lane indexes.
+minItems: 1
+items:
+  - const: 1
+  - const: 2
+  - const: 3
+  - const: 4
+
   port@1:
 $ref: /schemas/graph.yaml#/properties/port
 description:
-- 
2.34.1



Re: [PATCH v4 4/4] arm64/dts/qcom/sm8250: remove assigned-clock-rate property for mdp clk

2022-03-03 Thread Dmitry Baryshkov
On Fri, 4 Mar 2022 at 02:56, Stephen Boyd  wrote:
>
> Quoting Dmitry Baryshkov (2022-03-03 15:50:50)
> > On Thu, 3 Mar 2022 at 12:40, Vinod Polimera  
> > wrote:
> > >
> > > Kernel clock driver assumes that initial rate is the
> > > max rate for that clock and was not allowing it to scale
> > > beyond the assigned clock value.
> > >
> > > Drop the assigned clock rate property and vote on the mdp clock as per
> > > calculated value during the usecase.
> > >
> > > Fixes: 7c1dffd471("arm64: dts: qcom: sm8250.dtsi: add display system 
> > > nodes")
> >
> > Please remove the Fixes tags from all commits. Otherwise the patches
> > might be picked up into earlier kernels, which do not have a patch
> > adding a vote on the MDP clock.
>
> What patch is that? The Fixes tag could point to that commit.

Please correct me if I'm wrong.
Currently the dtsi enforces bumping the MDP clock when the mdss device
is being probed and when the dpu device is being probed.
Later during the DPU lifetime the core_perf would change the clock's
rate as it sees fit according to the CRTC requirements.

However that happens only during dpu_crtc_atomic_flush(); before we
call this function, the MDP clock is left in an undetermined state, and
so are the power rails controlled by the OPP table.

I suppose that during the dpu_bind we should bump the clock to the max
possible freq and let dpu_core_perf handle it afterwards.
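
Something along these lines (an untested sketch; the clock name and the
exact placement inside dpu_bind() are assumptions):

  /* in dpu_bind(), once the clocks and the OPP table are set up */
  struct clk *core_clk = msm_clk_get(pdev, "core");
  unsigned long max_rate;

  if (!IS_ERR(core_clk)) {
          max_rate = clk_round_rate(core_clk, ULONG_MAX);
          /* votes both the clock and the power rail via the OPP table */
          dev_pm_opp_set_rate(dev, max_rate);
  }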


--
With best wishes
Dmitry


Re: [PATCH v2] drm/msm/disp/dpu1: add inline rotation support for sc7280 target

2022-03-03 Thread Dmitry Baryshkov
On Thu, 3 Mar 2022 at 14:43, Vinod Polimera  wrote:
>
> - Some DPU versions support inline rot90. It is supported only for
> limited amount of UBWC formats.
> - There are two versions of inline rotators, v1 (present on sm8250 and
> sm7250) and v2 (sc7280). These versions differ in the list of supported
> formats and in the scaler possibilities.
>
> Changes in RFC:
> - Rebase changes to the latest code base.
> - Append rotation config variables with v2 and
> remove unused variables.(Dmitry)
> - Move pixel_ext setup separately from scaler3 config.(Dmitry)
> - Add 270 degree rotation to supported rotation list.(Dmitry)
>
> Changes in V2:
> - Remove unused macros and fix indentation.
> - Add check if 90 rotation is supported and add supported rotations to 
> rot_cfg.
>
> Signed-off-by: Kalyan Thota 
> Signed-off-by: Vinod Polimera 
> ---
>  drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c |  44 +++---
>  drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h |  17 
>  drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c  | 108 
> +++--
>  drivers/gpu/drm/msm/disp/dpu1/dpu_plane.h  |   2 +
>  4 files changed, 134 insertions(+), 37 deletions(-)
>
> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c 
> b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
> index aa75991..7cd07be 100644
> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
> @@ -25,6 +25,9 @@
>  #define VIG_SM8250_MASK \
> (VIG_MASK | BIT(DPU_SSPP_QOS_8LVL) | BIT(DPU_SSPP_SCALER_QSEED3LITE))
>
> +#define VIG_SC7280_MASK \
> +   (VIG_SC7180_MASK | BIT(DPU_SSPP_INLINE_ROTATION))
> +
>  #define DMA_SDM845_MASK \
> (BIT(DPU_SSPP_SRC) | BIT(DPU_SSPP_QOS) | BIT(DPU_SSPP_QOS_8LVL) |\
> BIT(DPU_SSPP_TS_PREFILL) | BIT(DPU_SSPP_TS_PREFILL_REC1) |\
> @@ -177,6 +180,11 @@ static const uint32_t plane_formats_yuv[] = {
> DRM_FORMAT_YVU420,
>  };
>
> +static const uint32_t rotation_v2_formats[] = {
> +   DRM_FORMAT_NV12,
> +   /* TODO add formats after validation */
> +};
> +
>  /*
>   * DPU sub blocks config
>   */
> @@ -464,8 +472,7 @@ static const struct dpu_ctl_cfg sc7280_ctl[] = {
>   */
>
>  /* SSPP common configuration */
> -
> -#define _VIG_SBLK(num, sdma_pri, qseed_ver) \
> +#define _VIG_SBLK(num, sdma_pri, qseed_ver, rot_cfg) \
> { \
> .maxdwnscale = MAX_DOWNSCALE_RATIO, \
> .maxupscale = MAX_UPSCALE_RATIO, \
> @@ -482,6 +489,7 @@ static const struct dpu_ctl_cfg sc7280_ctl[] = {
> .num_formats = ARRAY_SIZE(plane_formats_yuv), \
> .virt_format_list = plane_formats, \
> .virt_num_formats = ARRAY_SIZE(plane_formats), \
> +   .rotation_cfg = rot_cfg, \
> }
>
>  #define _DMA_SBLK(num, sdma_pri) \
> @@ -497,14 +505,21 @@ static const struct dpu_ctl_cfg sc7280_ctl[] = {
> .virt_num_formats = ARRAY_SIZE(plane_formats), \
> }
>
> +static const struct dpu_rotation_cfg dpu_rot_sc7280_cfg_v2 = {
> +   .rot_maxheight = 1088,
> +   .rot_num_formats = ARRAY_SIZE(rotation_v2_formats),
> +   .rot_format_list = rotation_v2_formats,
> +   .rot_supported = DRM_MODE_ROTATE_MASK | DRM_MODE_REFLECT_MASK,
> +};
> +
>  static const struct dpu_sspp_sub_blks sdm845_vig_sblk_0 =
> -   _VIG_SBLK("0", 5, DPU_SSPP_SCALER_QSEED3);
> +   _VIG_SBLK("0", 5, DPU_SSPP_SCALER_QSEED3, 
> NULL);
>  static const struct dpu_sspp_sub_blks sdm845_vig_sblk_1 =
> -   _VIG_SBLK("1", 6, DPU_SSPP_SCALER_QSEED3);
> +   _VIG_SBLK("1", 6, DPU_SSPP_SCALER_QSEED3, 
> NULL);
>  static const struct dpu_sspp_sub_blks sdm845_vig_sblk_2 =
> -   _VIG_SBLK("2", 7, DPU_SSPP_SCALER_QSEED3);
> +   _VIG_SBLK("2", 7, DPU_SSPP_SCALER_QSEED3, 
> NULL);
>  static const struct dpu_sspp_sub_blks sdm845_vig_sblk_3 =
> -   _VIG_SBLK("3", 8, DPU_SSPP_SCALER_QSEED3);
> +   _VIG_SBLK("3", 8, DPU_SSPP_SCALER_QSEED3, 
> NULL);
>
>  static const struct dpu_sspp_sub_blks sdm845_dma_sblk_0 = _DMA_SBLK("8", 1);
>  static const struct dpu_sspp_sub_blks sdm845_dma_sblk_1 = _DMA_SBLK("9", 2);
> @@ -543,7 +558,10 @@ static const struct dpu_sspp_cfg sdm845_sspp[] = {
>  };
>
>  static const struct dpu_sspp_sub_blks sc7180_vig_sblk_0 =
> -   _VIG_SBLK("0", 4, DPU_SSPP_SCALER_QSEED4);
> +   _VIG_SBLK("0", 4, DPU_SSPP_SCALER_QSEED4, 
> NULL);
> +
> +static const struct dpu_sspp_sub_blks sc7280_vig_sblk_0 =
> +   _VIG_SBLK("0", 4, DPU_SSPP_SCALER_QSEED4, 
> &dpu_rot_sc7280_cfg_v2);
>
>  static const struct dpu_sspp_cfg sc7180_sspp[] = {
> SSPP_BLK

[PATCH v2 2/2] dt-bindings: gpu: Convert aspeed-gfx bindings to yaml

2022-03-03 Thread Joel Stanley
Convert the bindings to yaml and add the ast2600 compatible string.

The legacy mfd description was put in place before the gfx bindings
existed, to document the compatible that is used in the pinctrl
bindings.

Signed-off-by: Joel Stanley 
---
 .../devicetree/bindings/gpu/aspeed,gfx.yaml   | 69 +++
 .../devicetree/bindings/gpu/aspeed-gfx.txt| 41 ---
 .../devicetree/bindings/mfd/aspeed-gfx.txt| 17 -
 3 files changed, 69 insertions(+), 58 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/gpu/aspeed,gfx.yaml
 delete mode 100644 Documentation/devicetree/bindings/gpu/aspeed-gfx.txt
 delete mode 100644 Documentation/devicetree/bindings/mfd/aspeed-gfx.txt

diff --git a/Documentation/devicetree/bindings/gpu/aspeed,gfx.yaml 
b/Documentation/devicetree/bindings/gpu/aspeed,gfx.yaml
new file mode 100644
index 000000000000..8ddc4fa6e8e4
--- /dev/null
+++ b/Documentation/devicetree/bindings/gpu/aspeed,gfx.yaml
@@ -0,0 +1,69 @@
+# SPDX-License-Identifier: GPL-2.0-only
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/gpu/aspeed,gfx.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: ASPEED GFX display device
+
+maintainers:
+  - Joel Stanley 
+
+properties:
+  compatible:
+items:
+  - enum:
+  - aspeed,ast2400-gfx
+  - aspeed,ast2500-gfx
+  - aspeed,ast2600-gfx
+  - const: syscon
+
+  reg:
+minItems: 1
+
+  interrupts:
+maxItems: 1
+
+  clocks:
+maxItems: 1
+
+  resets:
+maxItems: 1
+
+  memory-region: true
+
+  syscon: true
+
+  reg-io-width: true
+
+required:
+  - reg
+  - compatible
+  - interrupts
+  - clocks
+  - resets
+  - memory-region
+  - syscon
+
+additionalProperties: false
+
+examples:
+  - |
+   #include 
+   gfx: display@1e6e6000 {
+   compatible = "aspeed,ast2500-gfx", "syscon";
+   reg = <0x1e6e6000 0x1000>;
+   reg-io-width = <4>;
+   clocks = <&syscon ASPEED_CLK_GATE_D1CLK>;
+   resets = <&syscon ASPEED_RESET_CRT1>;
+   interrupts = <0x19>;
+   syscon = <&syscon>;
+   memory-region = <&gfx_memory>;
+   };
+
+   gfx_memory: framebuffer {
+   size = <0x0100>;
+   alignment = <0x0100>;
+   compatible = "shared-dma-pool";
+   reusable;
+   };
diff --git a/Documentation/devicetree/bindings/gpu/aspeed-gfx.txt 
b/Documentation/devicetree/bindings/gpu/aspeed-gfx.txt
deleted file mode 100644
index 958bdf962339..
--- a/Documentation/devicetree/bindings/gpu/aspeed-gfx.txt
+++ /dev/null
@@ -1,41 +0,0 @@
-Device tree configuration for the GFX display device on the ASPEED SoCs
-
-Required properties:
-  - compatible
-* Must be one of the following:
-  + aspeed,ast2500-gfx
-  + aspeed,ast2400-gfx
-* In addition, the ASPEED pinctrl bindings require the 'syscon' property to
-  be present
-
-  - reg: Physical base address and length of the GFX registers
-
-  - interrupts: interrupt number for the GFX device
-
-  - clocks: clock number used to generate the pixel clock
-
-  - resets: reset line that must be released to use the GFX device
-
-  - memory-region:
-Phandle to a memory region to allocate from, as defined in
-Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt
-
-
-Example:
-
-gfx: display@1e6e6000 {
-   compatible = "aspeed,ast2500-gfx", "syscon";
-   reg = <0x1e6e6000 0x1000>;
-   reg-io-width = <4>;
-   clocks = <&syscon ASPEED_CLK_GATE_D1CLK>;
-   resets = <&syscon ASPEED_RESET_CRT1>;
-   interrupts = <0x19>;
-   memory-region = <&gfx_memory>;
-};
-
-gfx_memory: framebuffer {
-   size = <0x0100>;
-   alignment = <0x0100>;
-   compatible = "shared-dma-pool";
-   reusable;
-};
diff --git a/Documentation/devicetree/bindings/mfd/aspeed-gfx.txt 
b/Documentation/devicetree/bindings/mfd/aspeed-gfx.txt
deleted file mode 100644
index aea5370efd97..
--- a/Documentation/devicetree/bindings/mfd/aspeed-gfx.txt
+++ /dev/null
@@ -1,17 +0,0 @@
-* Device tree bindings for Aspeed SoC Display Controller (GFX)
-
-The Aspeed SoC Display Controller primarily does as its name suggests, but also
-participates in pinmux requests on the g5 SoCs. It is therefore considered a
-syscon device.
-
-Required properties:
-- compatible:  "aspeed,ast2500-gfx", "syscon"
-- reg: contains offset/length value of the GFX memory
-   region.
-
-Example:
-
-gfx: display@1e6e6000 {
-   compatible = "aspeed,ast2500-gfx", "syscon";
-   reg = <0x1e6e6000 0x1000>;
-};
-- 
2.34.1



[PATCH v2 1/2] dt-bindings: pinctrl: aspeed: Update gfx node in example

2022-03-03 Thread Joel Stanley
The example needs updating to match the to-be-added yaml bindings for
the gfx node.

Signed-off-by: Joel Stanley 
---
 .../bindings/pinctrl/aspeed,ast2500-pinctrl.yaml | 16 
 1 file changed, 16 insertions(+)

diff --git 
a/Documentation/devicetree/bindings/pinctrl/aspeed,ast2500-pinctrl.yaml 
b/Documentation/devicetree/bindings/pinctrl/aspeed,ast2500-pinctrl.yaml
index d316cc082107..9969997c2f1b 100644
--- a/Documentation/devicetree/bindings/pinctrl/aspeed,ast2500-pinctrl.yaml
+++ b/Documentation/devicetree/bindings/pinctrl/aspeed,ast2500-pinctrl.yaml
@@ -73,6 +73,7 @@ additionalProperties: false
 
 examples:
   - |
+#include 
 apb {
 compatible = "simple-bus";
 #address-cells = <1>;
@@ -82,6 +83,8 @@ examples:
 syscon: scu@1e6e2000 {
 compatible = "aspeed,ast2500-scu", "syscon", "simple-mfd";
 reg = <0x1e6e2000 0x1a8>;
+#clock-cells = <1>;
+#reset-cells = <1>;
 
 pinctrl: pinctrl {
 compatible = "aspeed,ast2500-pinctrl";
@@ -102,6 +105,12 @@ examples:
 gfx: display@1e6e6000 {
 compatible = "aspeed,ast2500-gfx", "syscon";
 reg = <0x1e6e6000 0x1000>;
+reg-io-width = <4>;
+clocks = <&syscon ASPEED_CLK_GATE_D1CLK>;
+resets = <&syscon ASPEED_RESET_CRT1>;
+interrupts = <0x19>;
+syscon = <&syscon>;
+memory-region = <&gfx_memory>;
 };
 };
 
@@ -128,3 +137,10 @@ examples:
 };
 };
 };
+
+gfx_memory: framebuffer {
+size = <0x0100>;
+alignment = <0x0100>;
+compatible = "shared-dma-pool";
+reusable;
+};
-- 
2.34.1



[PATCH v2 0/2] dt-bindings: Convert GFX bindings to yaml

2022-03-03 Thread Joel Stanley
v1: https://lore.kernel.org/all/20220302051056.678367-1-j...@jms.id.au/

This series cleans up the bindings for the ASPEED GFX unit.

The old text files are deleted for both the description under gpu, and the
placeholder one under mfd.

The mfd one existed because pinctrl for the 2500 depends on the gfx
bindings, and at the time we didn't have any support for the gfx device,
so Andrew added the mfd ones.

The example in the pinctrl bindings is updated to prevent warnings about
missing properties that pop up when the gfx yaml bindings are added.

Joel Stanley (2):
  dt-bindings: pinctrl: aspeed: Update gfx node in example
  dt-bindings: gpu: Convert aspeed-gfx bindings to yaml

 .../devicetree/bindings/gpu/aspeed,gfx.yaml   | 69 +++
 .../devicetree/bindings/gpu/aspeed-gfx.txt| 41 ---
 .../devicetree/bindings/mfd/aspeed-gfx.txt| 17 -
 .../pinctrl/aspeed,ast2500-pinctrl.yaml   | 16 +
 4 files changed, 85 insertions(+), 58 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/gpu/aspeed,gfx.yaml
 delete mode 100644 Documentation/devicetree/bindings/gpu/aspeed-gfx.txt
 delete mode 100644 Documentation/devicetree/bindings/mfd/aspeed-gfx.txt

-- 
2.34.1



Re: [PATCH v4 4/4] arm64/dts/qcom/sm8250: remove assigned-clock-rate property for mdp clk

2022-03-03 Thread Stephen Boyd
Quoting Dmitry Baryshkov (2022-03-03 15:50:50)
> On Thu, 3 Mar 2022 at 12:40, Vinod Polimera  wrote:
> >
> > The kernel clock driver assumes that the initial rate is the
> > max rate for that clock and does not allow it to scale
> > beyond the assigned clock value.
> >
> > Drop the assigned clock rate property and vote on the mdp clock as per
> > calculated value during the usecase.
> >
> > Fixes: 7c1dffd471("arm64: dts: qcom: sm8250.dtsi: add display system nodes")
>
> Please remove the Fixes tags from all commits. Otherwise the patches
> might be picked up into earlier kernels, which do not have a patch
> adding a vote on the MDP clock.

What patch is that? The Fixes tag could point to that commit.


Re: [PATCH v4 4/4] arm64/dts/qcom/sm8250: remove assigned-clock-rate property for mdp clk

2022-03-03 Thread Dmitry Baryshkov
On Thu, 3 Mar 2022 at 12:40, Vinod Polimera  wrote:
>
> The kernel clock driver assumes that the initial rate is the
> max rate for that clock and does not allow it to scale
> beyond the assigned clock value.
>
> Drop the assigned clock rate property and vote on the mdp clock as per
> calculated value during the usecase.
>
> Fixes: 7c1dffd471("arm64: dts: qcom: sm8250.dtsi: add display system nodes")

Please remove the Fixes tags from all commits. Otherwise the patches
might be picked up into earlier kernels, which do not have a patch
adding a vote on the MDP clock.

> Signed-off-by: Vinod Polimera 
> ---
>  arch/arm64/boot/dts/qcom/sm8250.dtsi | 9 ++---
>  1 file changed, 2 insertions(+), 7 deletions(-)
>
> diff --git a/arch/arm64/boot/dts/qcom/sm8250.dtsi 
> b/arch/arm64/boot/dts/qcom/sm8250.dtsi
> index fdaf303..2105eb7 100644
> --- a/arch/arm64/boot/dts/qcom/sm8250.dtsi
> +++ b/arch/arm64/boot/dts/qcom/sm8250.dtsi
> @@ -3164,9 +3164,6 @@
>  <&dispcc DISP_CC_MDSS_MDP_CLK>;
> clock-names = "iface", "bus", "nrt_bus", "core";
>
> -   assigned-clocks = <&dispcc DISP_CC_MDSS_MDP_CLK>;
> -   assigned-clock-rates = <46000>;
> -
> interrupts = ;
> interrupt-controller;
> #interrupt-cells = <1>;
> @@ -3191,10 +3188,8 @@
>  <&dispcc DISP_CC_MDSS_VSYNC_CLK>;
> clock-names = "iface", "bus", "core", "vsync";
>
> -   assigned-clocks = <&dispcc 
> DISP_CC_MDSS_MDP_CLK>,
> - <&dispcc 
> DISP_CC_MDSS_VSYNC_CLK>;
> -   assigned-clock-rates = <46000>,
> -  <1920>;
> +   assigned-clocks = <&dispcc 
> DISP_CC_MDSS_VSYNC_CLK>;
> +   assigned-clock-rates = <1920>;
>
> operating-points-v2 = <&mdp_opp_table>;
> power-domains = <&rpmhpd SM8250_MMCX>;
> --
> 2.7.4
>


-- 
With best wishes
Dmitry


Re: [PATCH v2 3/4] drm/msm: split the main platform driver

2022-03-03 Thread Dmitry Baryshkov
On Fri, 4 Mar 2022 at 02:00, Stephen Boyd  wrote:
>
> Quoting Dmitry Baryshkov (2022-01-19 14:40:04)
> > diff --git a/drivers/gpu/drm/msm/msm_drv.h b/drivers/gpu/drm/msm/msm_drv.h
> > index 06d26c5fb274..6895c056be19 100644
> > --- a/drivers/gpu/drm/msm/msm_drv.h
> > +++ b/drivers/gpu/drm/msm/msm_drv.h
> > @@ -451,10 +451,18 @@ static inline void msm_dp_debugfs_init(struct msm_dp 
> > *dp_display,
> >
> >  #endif
> >
> > +#define KMS_MDP4 4
> > +#define KMS_MDP5 5
> > +#define KMS_DPU  3
> > +
> > +void __init msm_mdp4_register(void);
> > +void __exit msm_mdp4_unregister(void);
> >  void __init msm_mdp_register(void);
> >  void __exit msm_mdp_unregister(void);
> >  void __init msm_dpu_register(void);
> >  void __exit msm_dpu_unregister(void);
> > +void __init msm_mdss_register(void);
> > +void __exit msm_mdss_unregister(void);
>
> Don't need __init or __exit on prototypes.
>
> >
> >  #ifdef CONFIG_DEBUG_FS
> >  void msm_framebuffer_describe(struct drm_framebuffer *fb, struct seq_file 
> > *m);
> > diff --git a/drivers/gpu/drm/msm/msm_mdss.c b/drivers/gpu/drm/msm/msm_mdss.c
> > index 92562221b517..759076357e0e 100644
> > --- a/drivers/gpu/drm/msm/msm_mdss.c
> > +++ b/drivers/gpu/drm/msm/msm_mdss.c
> > @@ -8,6 +8,8 @@
> >  #include 
> >  #include 
> >
> > +#include 
>
> What's this include for?
>
> > +
> >  #include "msm_drv.h"
> >  #include "msm_kms.h"
> >
> > @@ -127,7 +129,7 @@ static int _msm_mdss_irq_domain_add(struct msm_mdss 
> > *msm_mdss)
> > return 0;
> >  }
> >
> > -int msm_mdss_enable(struct msm_mdss *msm_mdss)
> > +static int msm_mdss_enable(struct msm_mdss *msm_mdss)
> >  {
> > int ret;
> >
> > @@ -163,14 +165,14 @@ int msm_mdss_enable(struct msm_mdss *msm_mdss)
> > return ret;
> >  }
> >
> > -int msm_mdss_disable(struct msm_mdss *msm_mdss)
> > +static int msm_mdss_disable(struct msm_mdss *msm_mdss)
> >  {
> > clk_bulk_disable_unprepare(msm_mdss->num_clocks, msm_mdss->clocks);
> >
> > return 0;
> >  }
> >
> > -void msm_mdss_destroy(struct msm_mdss *msm_mdss)
> > +static void msm_mdss_destroy(struct msm_mdss *msm_mdss)
> >  {
> > struct platform_device *pdev = to_platform_device(msm_mdss->dev);
> > int irq;
> > @@ -228,7 +230,7 @@ int mdp5_mdss_parse_clock(struct platform_device *pdev, 
> > struct clk_bulk_data **c
> > return num_clocks;
> >  }
> >
> > -struct msm_mdss *msm_mdss_init(struct platform_device *pdev, bool mdp5)
> > +static struct msm_mdss *msm_mdss_init(struct platform_device *pdev, bool 
> > mdp5)
> >  {
> > struct msm_mdss *msm_mdss;
> > int ret;
> > @@ -269,3 +271,171 @@ struct msm_mdss *msm_mdss_init(struct platform_device 
> > *pdev, bool mdp5)
> >
> > return msm_mdss;
> >  }
> > +
> > +static int __maybe_unused mdss_runtime_suspend(struct device *dev)
> > +{
> > +   struct msm_drm_private *priv = dev_get_drvdata(dev);
> > +
> > +   DBG("");
> > +
> > +   return msm_mdss_disable(priv->mdss);
> > +}
> > +
> > +static int __maybe_unused mdss_runtime_resume(struct device *dev)
> > +{
> > +   struct msm_drm_private *priv = dev_get_drvdata(dev);
> > +
> > +   DBG("");
> > +
> > +   return msm_mdss_enable(priv->mdss);
> > +}
> > +
> > +static int __maybe_unused mdss_pm_suspend(struct device *dev)
> > +{
> > +
> > +   if (pm_runtime_suspended(dev))
> > +   return 0;
> > +
> > +   return mdss_runtime_suspend(dev);
> > +}
> > +
> > +static int __maybe_unused mdss_pm_resume(struct device *dev)
> > +{
> > +   if (pm_runtime_suspended(dev))
> > +   return 0;
> > +
> > +   return mdss_runtime_resume(dev);
> > +}
> > +
> > +static const struct dev_pm_ops mdss_pm_ops = {
> > +   SET_SYSTEM_SLEEP_PM_OPS(mdss_pm_suspend, mdss_pm_resume)
> > +   SET_RUNTIME_PM_OPS(mdss_runtime_suspend, mdss_runtime_resume, NULL)
> > +   .prepare = msm_pm_prepare,
> > +   .complete = msm_pm_complete,
> > +};
> > +
> > +static int get_mdp_ver(struct platform_device *pdev)
> > +{
> > +   struct device *dev = &pdev->dev;
> > +
> > +   return (int) (unsigned long) of_device_get_match_data(dev);
> > +}
> > +
> > +static int find_mdp_node(struct device *dev, void *data)
> > +{
> > +   return of_match_node(dpu_dt_match, dev->of_node) ||
> > +   of_match_node(mdp5_dt_match, dev->of_node);
> > +}
> > +
> > +static int mdss_probe(struct platform_device *pdev)
> > +{
> > +   struct msm_mdss *mdss;
> > +   struct msm_drm_private *priv;
> > +   int mdp_ver = get_mdp_ver(pdev);
> > +   struct device *mdp_dev;
> > +   struct device *dev = &pdev->dev;
> > +   int ret;
> > +
> > +   if (mdp_ver != KMS_MDP5 && mdp_ver != KMS_DPU)
> > +   return -EINVAL;
>
> Is it possible anymore? Now that the driver is split it seems like no.

Yes, I'll drop this.

>
> > +
> > +   priv = devm_kzalloc(&pdev->dev, sizeof(*priv), GFP_KERNEL);
> > +   if (!priv)
> > +   return -ENOMEM;
> > +
> > +  

Re: [PATCH v2 2/4] drm/msm: remove extra indirection for msm_mdss

2022-03-03 Thread Dmitry Baryshkov
On Fri, 4 Mar 2022 at 01:54, Stephen Boyd  wrote:
>
> Quoting Dmitry Baryshkov (2022-01-19 14:40:03)
> > diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
> > index be06a62d7ccb..f18dfbb614f0 100644
> > --- a/drivers/gpu/drm/msm/msm_drv.c
> > +++ b/drivers/gpu/drm/msm/msm_drv.c
> > @@ -1211,19 +1212,32 @@ static int msm_pdev_probe(struct platform_device 
> > *pdev)
> >
> > switch (get_mdp_ver(pdev)) {
> > case KMS_MDP5:
> > -   ret = msm_mdss_init(pdev, true);
> > +   mdss = msm_mdss_init(pdev, true);
> > +   if (IS_ERR(mdss)) {
> > +   ret = PTR_ERR(mdss);
> > +   platform_set_drvdata(pdev, NULL);
> > +
> > +   return ret;
> > +   } else {
>
> Drop else
>
> > +   priv->mdss = mdss;
> > +   pm_runtime_enable(&pdev->dev);
> > +   }
> > break;
> > case KMS_DPU:
> > -   ret = msm_mdss_init(pdev, false);
> > +   mdss = msm_mdss_init(pdev, false);
> > +   if (IS_ERR(mdss)) {
> > +   ret = PTR_ERR(mdss);
> > +   platform_set_drvdata(pdev, NULL);
> > +
> > +   return ret;
> > +   } else {
> > +   priv->mdss = mdss;
> > +   pm_runtime_enable(&pdev->dev);
> > +   }
>
> This is the same so why can't it be done below in the deleted if (ret)?

I didn't like the idea of checking the if (IS_ERR(mdss)) outside of
the case blocks, but now I can move it back.
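
Something along these lines, sketched from the quoted hunk (not the actual
respin):

	switch (get_mdp_ver(pdev)) {
	case KMS_MDP5:
		mdss = msm_mdss_init(pdev, true);
		break;
	case KMS_DPU:
		mdss = msm_mdss_init(pdev, false);
		break;
	default:
		mdss = NULL;
		break;
	}

	/* single error path shared by both cases */
	if (IS_ERR(mdss)) {
		platform_set_drvdata(pdev, NULL);
		return PTR_ERR(mdss);
	}

	if (mdss) {
		priv->mdss = mdss;
		pm_runtime_enable(&pdev->dev);
	}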

>
> > break;
> > default:
> > -   ret = 0;
> > break;
> > }
> > -   if (ret) {
> > -   platform_set_drvdata(pdev, NULL);
> > -   return ret;
> > -   }
> >
> > if (get_mdp_ver(pdev)) {
> > ret = add_display_components(pdev, &match);
> > diff --git a/drivers/gpu/drm/msm/msm_kms.h b/drivers/gpu/drm/msm/msm_kms.h
> > index 2459ba479caf..0c341660941a 100644
> > --- a/drivers/gpu/drm/msm/msm_kms.h
> > +++ b/drivers/gpu/drm/msm/msm_kms.h
> > @@ -239,50 +228,44 @@ int mdp5_mdss_parse_clock(struct platform_device 
> > *pdev, struct clk_bulk_data **c
> > return num_clocks;
> >  }
> >
> > -int msm_mdss_init(struct platform_device *pdev, bool mdp5)
> > +struct msm_mdss *msm_mdss_init(struct platform_device *pdev, bool mdp5)
>
> Ah I see it will quickly become not static. Still should have static
> first and remove it here.



-- 
With best wishes
Dmitry


Re: [PATCH 00/12] Add writeback block support for DPU

2022-03-03 Thread Abhinav Kumar

Hi Stephen

There is some discussion going on about the base dependency of the change:

https://patchwork.kernel.org/project/dri-devel/patch/20220202085429.22261-6-suraj.kand...@intel.com/

I will resend this with comments addressed once the dependency is sorted 
out among Intel, QC and Laurent.


Thanks

Abhinav
On 3/3/2022 2:46 PM, Stephen Boyd wrote:

Quoting Abhinav Kumar (2022-02-04 13:17:13)

This series adds support for writeback block on DPU. Writeback
block is extremely useful to validate boards having no physical displays
in addition to many other use-cases where we want to get the output
of the display pipeline to examine whether issue is with the display
pipeline or with the panel.


Is this series going to be resent?


Re: [PATCH] dt-bindings: gpu: Convert aspeed-gfx bindings to yaml

2022-03-03 Thread Andrew Jeffery



On Fri, 4 Mar 2022, at 08:05, Joel Stanley wrote:
> On Thu, 3 Mar 2022 at 19:34, Rob Herring  wrote:
>>
>> On Wed, Mar 2, 2022 at 12:01 PM Rob Herring  wrote:
>> >
>> > On Wed, Mar 02, 2022 at 03:40:56PM +1030, Joel Stanley wrote:
>> > > Convert the bindings to yaml and add the ast2600 compatible string.
>> > >
>> > > Signed-off-by: Joel Stanley 
>> > > ---
>> > >  .../devicetree/bindings/gpu/aspeed,gfx.yaml   | 69 +++
>> > >  .../devicetree/bindings/gpu/aspeed-gfx.txt| 41 ---
>> > >  2 files changed, 69 insertions(+), 41 deletions(-)
>> > >  create mode 100644 Documentation/devicetree/bindings/gpu/aspeed,gfx.yaml
>> > >  delete mode 100644 Documentation/devicetree/bindings/gpu/aspeed-gfx.txt
>> >
>> > Applied, thanks.
>>
>> Uggg, now dropped...
>>
>> What's Documentation/devicetree/bindings/mfd/aspeed-gfx.txt and also
>> the example in 
>> Documentation/devicetree/bindings/pinctrl/aspeed,ast2500-pinctrl.yaml?
>> Please sort those out.
>
> I think the aspeed-gfx.txt can be deleted. And the example in the
> pinctrl bindings needs to be updated with the required properties.
>
> Andrew, can you clarify what's going on with those other files?

Looks like you'll just need to paste your example from 
aspeed,gfx.yaml into the pinctrl yamls to replace the existing gfx 
nodes.

Andrew


Re: [PATCH v3 2/4] drm/i915: Fix compute pre-emption w/a to apply to compute engines

2022-03-03 Thread Matt Roper
On Thu, Mar 03, 2022 at 02:37:35PM -0800, john.c.harri...@intel.com wrote:
> From: John Harrison 
> 
> An earlier patch added support for compute engines. However, it missed
> enabling the anti-pre-emption w/a for the new engine class. So move
> the 'compute capable' flag earlier and use it for the pre-emption w/a
> test.
> 
> Fixes: c674c5b9342e ("drm/i915/xehp: CCS should use RCS setup functions")
> Cc: Tvrtko Ursulin 
> Cc: Daniele Ceraolo Spurio 
> Cc: Aravind Iddamsetty 
> Cc: Matt Roper 
> Cc: Tvrtko Ursulin 
> Cc: Daniel Vetter 
> Cc: Maarten Lankhorst 
> Cc: Lucas De Marchi 
> Cc: John Harrison 
> Cc: Jason Ekstrand 
> Cc: "Michał Winiarski" 
> Cc: Matthew Brost 
> Cc: Chris Wilson 
> Cc: Tejas Upadhyay 
> Cc: Umesh Nerlige Ramappa 
> Cc: "Thomas Hellström" 
> Cc: Stuart Summers 
> Cc: Matthew Auld 
> Cc: Jani Nikula 
> Cc: Ramalingam C 
> Cc: Akeem G Abodunrin 
> Signed-off-by: John Harrison 

Reviewed-by: Matt Roper 

> ---
>  drivers/gpu/drm/i915/gt/intel_engine_cs.c | 14 +++---
>  1 file changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
> b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> index 22e70e4e007c..4185c7338581 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> @@ -421,6 +421,12 @@ static int intel_engine_setup(struct intel_gt *gt, enum 
> intel_engine_id id,
>   engine->logical_mask = BIT(logical_instance);
>   __sprint_engine_name(engine);
>  
> + /* features common between engines sharing EUs */
> + if (engine->class == RENDER_CLASS || engine->class == COMPUTE_CLASS) {
> + engine->flags |= I915_ENGINE_HAS_RCS_REG_STATE;
> + engine->flags |= I915_ENGINE_HAS_EU_PRIORITY;
> + }
> +
>   engine->props.heartbeat_interval_ms =
>   CONFIG_DRM_I915_HEARTBEAT_INTERVAL;
>   engine->props.max_busywait_duration_ns =
> @@ -433,15 +439,9 @@ static int intel_engine_setup(struct intel_gt *gt, enum 
> intel_engine_id id,
>   CONFIG_DRM_I915_TIMESLICE_DURATION;
>  
>   /* Override to uninterruptible for OpenCL workloads. */
> - if (GRAPHICS_VER(i915) == 12 && engine->class == RENDER_CLASS)
> + if (GRAPHICS_VER(i915) == 12 && (engine->flags & 
> I915_ENGINE_HAS_RCS_REG_STATE))
>   engine->props.preempt_timeout_ms = 0;
>  
> - /* features common between engines sharing EUs */
> - if (engine->class == RENDER_CLASS || engine->class == COMPUTE_CLASS) {
> - engine->flags |= I915_ENGINE_HAS_RCS_REG_STATE;
> - engine->flags |= I915_ENGINE_HAS_EU_PRIORITY;
> - }
> -
>   /* Cap properties according to any system limits */
>  #define CLAMP_PROP(field) \
>   do { \
> -- 
> 2.25.1
> 

-- 
Matt Roper
Graphics Software Engineer
VTT-OSGC Platform Enablement
Intel Corporation
(916) 356-2795


Re: [PATCH v2 4/4] drm/msm: stop using device's match data pointer

2022-03-03 Thread Stephen Boyd
Quoting Dmitry Baryshkov (2022-01-19 14:40:05)
> diff --git a/drivers/gpu/drm/msm/msm_mdss.c b/drivers/gpu/drm/msm/msm_mdss.c
> index 759076357e0e..f83dca99f03d 100644
> --- a/drivers/gpu/drm/msm/msm_mdss.c
> +++ b/drivers/gpu/drm/msm/msm_mdss.c
> @@ -314,11 +314,11 @@ static const struct dev_pm_ops mdss_pm_ops = {
> .complete = msm_pm_complete,
>  };
>
> -static int get_mdp_ver(struct platform_device *pdev)
> +static bool get_is_mdp5(struct platform_device *pdev)
>  {
> struct device *dev = &pdev->dev;
>
> -   return (int) (unsigned long) of_device_get_match_data(dev);
> +   return (bool) (unsigned long) of_device_get_match_data(dev);
>  }
>
>  static int find_mdp_node(struct device *dev, void *data)
> @@ -331,21 +331,18 @@ static int mdss_probe(struct platform_device *pdev)
>  {
> struct msm_mdss *mdss;
> struct msm_drm_private *priv;
> -   int mdp_ver = get_mdp_ver(pdev);
> +   bool is_mdp5 = get_is_mdp5(pdev);

is_mdp5 = of_device_is_compatible(pdev->dev.of_node, "qcom,mdss");

> struct device *mdp_dev;
> struct device *dev = &pdev->dev;
> int ret;
>
> -   if (mdp_ver != KMS_MDP5 && mdp_ver != KMS_DPU)
> -   return -EINVAL;
> -
> priv = devm_kzalloc(&pdev->dev, sizeof(*priv), GFP_KERNEL);
> if (!priv)
> return -ENOMEM;
>
> platform_set_drvdata(pdev, priv);
>
> -   mdss = msm_mdss_init(pdev, mdp_ver == KMS_MDP5);
> +   mdss = msm_mdss_init(pdev, is_mdp5);
> if (IS_ERR(mdss)) {
> ret = PTR_ERR(mdss);
> platform_set_drvdata(pdev, NULL);
> @@ -409,12 +406,12 @@ static int mdss_remove(struct platform_device *pdev)
>  }
>
>  static const struct of_device_id mdss_dt_match[] = {
> -   { .compatible = "qcom,mdss", .data = (void *)KMS_MDP5 },
> -   { .compatible = "qcom,sdm845-mdss", .data = (void *)KMS_DPU },
> -   { .compatible = "qcom,sc7180-mdss", .data = (void *)KMS_DPU },
> -   { .compatible = "qcom,sc7280-mdss", .data = (void *)KMS_DPU },
> -   { .compatible = "qcom,sm8150-mdss", .data = (void *)KMS_DPU },
> -   { .compatible = "qcom,sm8250-mdss", .data = (void *)KMS_DPU },
> +   { .compatible = "qcom,mdss", .data = (void *)true },
> +   { .compatible = "qcom,sdm845-mdss", .data = (void *)false },
> +   { .compatible = "qcom,sc7180-mdss", .data = (void *)false },
> +   { .compatible = "qcom,sc7280-mdss", .data = (void *)false },
> +   { .compatible = "qcom,sm8150-mdss", .data = (void *)false },
> +   { .compatible = "qcom,sm8250-mdss", .data = (void *)false },

And then no data needed?
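
i.e. something along these lines (a sketch of the suggestion, untested):

	static const struct of_device_id mdss_dt_match[] = {
		{ .compatible = "qcom,mdss" },
		{ .compatible = "qcom,sdm845-mdss" },
		{ .compatible = "qcom,sc7180-mdss" },
		{ .compatible = "qcom,sc7280-mdss" },
		{ .compatible = "qcom,sm8150-mdss" },
		{ .compatible = "qcom,sm8250-mdss" },
		{}
	};

	static bool is_mdp5_mdss(struct device *dev)
	{
		/* only the legacy MDP5 node uses the bare "qcom,mdss" compatible */
		return of_device_is_compatible(dev->of_node, "qcom,mdss");
	}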


Re: [PATCH v4 2/2] drm/bridge: analogix_dp: Enable autosuspend

2022-03-03 Thread Doug Anderson
Hi,

On Tue, Mar 1, 2022 at 6:11 PM Brian Norris  wrote:
>
> DP AUX transactions can consist of many short operations. There's no
> need to power things up/down in short intervals.
>
> I pick an arbitrary 100ms; for the systems I'm testing (Rockchip
> RK3399), runtime-PM transitions only take a few microseconds.
>
> Signed-off-by: Brian Norris 
> ---
>
> Changes in v4:
>  - call pm_runtime_mark_last_busy() and
>pm_runtime_dont_use_autosuspend()
>  - drop excess pm references around drm_get_edid(), now that we grab and
>hold in the dp-aux helper
>
> Changes in v3:
>  - New in v3
>
>  drivers/gpu/drm/bridge/analogix/analogix_dp_core.c | 9 ++---
>  1 file changed, 6 insertions(+), 3 deletions(-)

This looks great to me now, thanks.

Reviewed-by: Douglas Anderson 

Though I'm not a massive expert on the Analogix DP driver, I'm pretty
confident about the DP AUX stuff that Brian is touching. I just
checked and I see that this driver isn't changing much and the last
change landed in drm-misc, which means that I can commit this. Thus,
unless someone else shouts, I'll plan to wait until next week and
commit these two patches to drm-misc.

The first of the two patches is a "Fix" but since it's been broken
since 2016 I'll assume that nobody is chomping at the bit for these to
get into stable and that it would be easier to land both in
"drm-misc-next". Please yell if someone disagrees.

-Doug
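
(For reference, since the diff itself isn't quoted above: the usual
runtime-PM autosuspend pattern in a driver looks roughly like the sketch
below. The 100 ms delay mirrors the value mentioned in the commit message;
the call sites are generic assumptions, not the actual analogix_dp changes.)

	/* at probe / AUX registration time */
	pm_runtime_set_autosuspend_delay(dev, 100);
	pm_runtime_use_autosuspend(dev);

	/* around each AUX transfer (or held by the dp-aux helper) */
	pm_runtime_get_sync(dev);
	/* ... perform the transfer ... */
	pm_runtime_mark_last_busy(dev);
	pm_runtime_put_autosuspend(dev);

	/* on teardown */
	pm_runtime_dont_use_autosuspend(dev);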


Re: [PATCH v2 3/4] drm/msm: split the main platform driver

2022-03-03 Thread Stephen Boyd
Quoting Dmitry Baryshkov (2022-01-19 14:40:04)
> diff --git a/drivers/gpu/drm/msm/msm_drv.h b/drivers/gpu/drm/msm/msm_drv.h
> index 06d26c5fb274..6895c056be19 100644
> --- a/drivers/gpu/drm/msm/msm_drv.h
> +++ b/drivers/gpu/drm/msm/msm_drv.h
> @@ -451,10 +451,18 @@ static inline void msm_dp_debugfs_init(struct msm_dp 
> *dp_display,
>
>  #endif
>
> +#define KMS_MDP4 4
> +#define KMS_MDP5 5
> +#define KMS_DPU  3
> +
> +void __init msm_mdp4_register(void);
> +void __exit msm_mdp4_unregister(void);
>  void __init msm_mdp_register(void);
>  void __exit msm_mdp_unregister(void);
>  void __init msm_dpu_register(void);
>  void __exit msm_dpu_unregister(void);
> +void __init msm_mdss_register(void);
> +void __exit msm_mdss_unregister(void);

Don't need __init or __exit on prototypes.

>
>  #ifdef CONFIG_DEBUG_FS
>  void msm_framebuffer_describe(struct drm_framebuffer *fb, struct seq_file 
> *m);
> diff --git a/drivers/gpu/drm/msm/msm_mdss.c b/drivers/gpu/drm/msm/msm_mdss.c
> index 92562221b517..759076357e0e 100644
> --- a/drivers/gpu/drm/msm/msm_mdss.c
> +++ b/drivers/gpu/drm/msm/msm_mdss.c
> @@ -8,6 +8,8 @@
>  #include 
>  #include 
>
> +#include 

What's this include for?

> +
>  #include "msm_drv.h"
>  #include "msm_kms.h"
>
> @@ -127,7 +129,7 @@ static int _msm_mdss_irq_domain_add(struct msm_mdss 
> *msm_mdss)
> return 0;
>  }
>
> -int msm_mdss_enable(struct msm_mdss *msm_mdss)
> +static int msm_mdss_enable(struct msm_mdss *msm_mdss)
>  {
> int ret;
>
> @@ -163,14 +165,14 @@ int msm_mdss_enable(struct msm_mdss *msm_mdss)
> return ret;
>  }
>
> -int msm_mdss_disable(struct msm_mdss *msm_mdss)
> +static int msm_mdss_disable(struct msm_mdss *msm_mdss)
>  {
> clk_bulk_disable_unprepare(msm_mdss->num_clocks, msm_mdss->clocks);
>
> return 0;
>  }
>
> -void msm_mdss_destroy(struct msm_mdss *msm_mdss)
> +static void msm_mdss_destroy(struct msm_mdss *msm_mdss)
>  {
> struct platform_device *pdev = to_platform_device(msm_mdss->dev);
> int irq;
> @@ -228,7 +230,7 @@ int mdp5_mdss_parse_clock(struct platform_device *pdev, 
> struct clk_bulk_data **c
> return num_clocks;
>  }
>
> -struct msm_mdss *msm_mdss_init(struct platform_device *pdev, bool mdp5)
> +static struct msm_mdss *msm_mdss_init(struct platform_device *pdev, bool 
> mdp5)
>  {
> struct msm_mdss *msm_mdss;
> int ret;
> @@ -269,3 +271,171 @@ struct msm_mdss *msm_mdss_init(struct platform_device 
> *pdev, bool mdp5)
>
> return msm_mdss;
>  }
> +
> +static int __maybe_unused mdss_runtime_suspend(struct device *dev)
> +{
> +   struct msm_drm_private *priv = dev_get_drvdata(dev);
> +
> +   DBG("");
> +
> +   return msm_mdss_disable(priv->mdss);
> +}
> +
> +static int __maybe_unused mdss_runtime_resume(struct device *dev)
> +{
> +   struct msm_drm_private *priv = dev_get_drvdata(dev);
> +
> +   DBG("");
> +
> +   return msm_mdss_enable(priv->mdss);
> +}
> +
> +static int __maybe_unused mdss_pm_suspend(struct device *dev)
> +{
> +
> +   if (pm_runtime_suspended(dev))
> +   return 0;
> +
> +   return mdss_runtime_suspend(dev);
> +}
> +
> +static int __maybe_unused mdss_pm_resume(struct device *dev)
> +{
> +   if (pm_runtime_suspended(dev))
> +   return 0;
> +
> +   return mdss_runtime_resume(dev);
> +}
> +
> +static const struct dev_pm_ops mdss_pm_ops = {
> +   SET_SYSTEM_SLEEP_PM_OPS(mdss_pm_suspend, mdss_pm_resume)
> +   SET_RUNTIME_PM_OPS(mdss_runtime_suspend, mdss_runtime_resume, NULL)
> +   .prepare = msm_pm_prepare,
> +   .complete = msm_pm_complete,
> +};
> +
> +static int get_mdp_ver(struct platform_device *pdev)
> +{
> +   struct device *dev = &pdev->dev;
> +
> +   return (int) (unsigned long) of_device_get_match_data(dev);
> +}
> +
> +static int find_mdp_node(struct device *dev, void *data)
> +{
> +   return of_match_node(dpu_dt_match, dev->of_node) ||
> +   of_match_node(mdp5_dt_match, dev->of_node);
> +}
> +
> +static int mdss_probe(struct platform_device *pdev)
> +{
> +   struct msm_mdss *mdss;
> +   struct msm_drm_private *priv;
> +   int mdp_ver = get_mdp_ver(pdev);
> +   struct device *mdp_dev;
> +   struct device *dev = &pdev->dev;
> +   int ret;
> +
> +   if (mdp_ver != KMS_MDP5 && mdp_ver != KMS_DPU)
> +   return -EINVAL;

Is it possible anymore? Now that the driver is split it seems like no.

> +
> +   priv = devm_kzalloc(&pdev->dev, sizeof(*priv), GFP_KERNEL);
> +   if (!priv)
> +   return -ENOMEM;
> +
> +   platform_set_drvdata(pdev, priv);
> +
> +   mdss = msm_mdss_init(pdev, mdp_ver == KMS_MDP5);
> +   if (IS_ERR(mdss)) {
> +   ret = PTR_ERR(mdss);
> +   platform_set_drvdata(pdev, NULL);
> +
> +   return ret;
> +   }
> +
> +   priv->mdss = mdss;
> +   pm_runtime_enable(&pdev->dev);
> +
> +   /*
> +* MDP5/DPU based devices don't

Re: [PATCH v2 2/4] drm/msm: remove extra indirection for msm_mdss

2022-03-03 Thread Stephen Boyd
Quoting Dmitry Baryshkov (2022-01-19 14:40:03)
> diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
> index be06a62d7ccb..f18dfbb614f0 100644
> --- a/drivers/gpu/drm/msm/msm_drv.c
> +++ b/drivers/gpu/drm/msm/msm_drv.c
> @@ -1211,19 +1212,32 @@ static int msm_pdev_probe(struct platform_device 
> *pdev)
>
> switch (get_mdp_ver(pdev)) {
> case KMS_MDP5:
> -   ret = msm_mdss_init(pdev, true);
> +   mdss = msm_mdss_init(pdev, true);
> +   if (IS_ERR(mdss)) {
> +   ret = PTR_ERR(mdss);
> +   platform_set_drvdata(pdev, NULL);
> +
> +   return ret;
> +   } else {

Drop else

> +   priv->mdss = mdss;
> +   pm_runtime_enable(&pdev->dev);
> +   }
> break;
> case KMS_DPU:
> -   ret = msm_mdss_init(pdev, false);
> +   mdss = msm_mdss_init(pdev, false);
> +   if (IS_ERR(mdss)) {
> +   ret = PTR_ERR(mdss);
> +   platform_set_drvdata(pdev, NULL);
> +
> +   return ret;
> +   } else {
> +   priv->mdss = mdss;
> +   pm_runtime_enable(&pdev->dev);
> +   }

This is the same so why can't it be done below in the deleted if (ret)?

> break;
> default:
> -   ret = 0;
> break;
> }
> -   if (ret) {
> -   platform_set_drvdata(pdev, NULL);
> -   return ret;
> -   }
>
> if (get_mdp_ver(pdev)) {
> ret = add_display_components(pdev, &match);
> diff --git a/drivers/gpu/drm/msm/msm_kms.h b/drivers/gpu/drm/msm/msm_kms.h
> index 2459ba479caf..0c341660941a 100644
> --- a/drivers/gpu/drm/msm/msm_kms.h
> +++ b/drivers/gpu/drm/msm/msm_kms.h
> @@ -239,50 +228,44 @@ int mdp5_mdss_parse_clock(struct platform_device *pdev, 
> struct clk_bulk_data **c
> return num_clocks;
>  }
>
> -int msm_mdss_init(struct platform_device *pdev, bool mdp5)
> +struct msm_mdss *msm_mdss_init(struct platform_device *pdev, bool mdp5)

Ah I see it will quickly become not static. Still should have static
first and remove it here.


Re: [PATCH v2 1/4] drm/msm: unify MDSS drivers

2022-03-03 Thread Stephen Boyd
Quoting Dmitry Baryshkov (2022-01-19 14:40:02)
> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_mdss.c 
> b/drivers/gpu/drm/msm/msm_mdss.c
> similarity index 58%
> rename from drivers/gpu/drm/msm/disp/dpu1/dpu_mdss.c
> rename to drivers/gpu/drm/msm/msm_mdss.c
> index 9f5cc7f9e9a9..f5429eb0ae52 100644
> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_mdss.c
> +++ b/drivers/gpu/drm/msm/msm_mdss.c
> @@ -188,22 +182,64 @@ static void dpu_mdss_destroy(struct msm_mdss *mdss)
>
> pm_runtime_suspend(mdss->dev);
> pm_runtime_disable(mdss->dev);
> -   _dpu_mdss_irq_domain_fini(dpu_mdss);
> +   irq_domain_remove(dpu_mdss->irq_controller.domain);
> +   dpu_mdss->irq_controller.domain = NULL;
> irq = platform_get_irq(pdev, 0);
> irq_set_chained_handler_and_data(irq, NULL, NULL);
> -
> -   if (dpu_mdss->mmio)
> -   devm_iounmap(&pdev->dev, dpu_mdss->mmio);
> -   dpu_mdss->mmio = NULL;
>  }
>
>  static const struct msm_mdss_funcs mdss_funcs = {
> -   .enable = dpu_mdss_enable,
> -   .disable = dpu_mdss_disable,
> -   .destroy = dpu_mdss_destroy,
> +   .enable = msm_mdss_enable,
> +   .disable = msm_mdss_disable,
> +   .destroy = msm_mdss_destroy,
>  };
>
> -int dpu_mdss_init(struct platform_device *pdev)
> +/*
> + * MDP5 MDSS uses at most three specified clocks.
> + */
> +#define MDP5_MDSS_NUM_CLOCKS 3
> +int mdp5_mdss_parse_clock(struct platform_device *pdev, struct clk_bulk_data 
> **clocks)

static?

> +{
> +   struct clk_bulk_data *bulk;
> +   struct clk *clk;
> +   int num_clocks = 0;
> +
> +   if (!pdev)
> +   return -EINVAL;
> +
> +   bulk = devm_kcalloc(&pdev->dev, MDP5_MDSS_NUM_CLOCKS, sizeof(struct 
> clk_bulk_data), GFP_KERNEL);
> +   if (!bulk)
> +   return -ENOMEM;
> +
> +   /* We ignore all the errors except deferral: typically they mean that 
> the clock is not provided in the dts. */
> +   clk = msm_clk_get(pdev, "iface");
> +   if (!IS_ERR(clk)) {
> +   bulk[num_clocks].id = "iface";
> +   bulk[num_clocks].clk = clk;
> +   num_clocks++;
> +   } else if (clk == ERR_PTR(-EPROBE_DEFER))
> +   return -EPROBE_DEFER;
> +
> +   clk = msm_clk_get(pdev, "bus");
> +   if (!IS_ERR(clk)) {
> +   bulk[num_clocks].id = "bus";
> +   bulk[num_clocks].clk = clk;
> +   num_clocks++;
> +   } else if (clk == ERR_PTR(-EPROBE_DEFER))
> +   return -EPROBE_DEFER;
> +
> +   clk = msm_clk_get(pdev, "vsync");
> +   if (!IS_ERR(clk)) {
> +   bulk[num_clocks].id = "vsync";
> +   bulk[num_clocks].clk = clk;
> +   num_clocks++;
> +   } else if (clk == ERR_PTR(-EPROBE_DEFER))
> +   return -EPROBE_DEFER;
> +
> +   return num_clocks;
> +}
> +
> +int msm_mdss_init(struct platform_device *pdev, bool mdp5)

Maybe is_mdp5 so the if reads simpler.

>  {
> struct msm_drm_private *priv = platform_get_drvdata(pdev);
> struct dpu_mdss *dpu_mdss;
> @@ -220,27 +256,28 @@ int dpu_mdss_init(struct platform_device *pdev)
>
> DRM_DEBUG("mapped mdss address space @%pK\n", dpu_mdss->mmio);
>
> -   ret = msm_parse_clock(pdev, &dpu_mdss->clocks);
> +   if (mdp5)
> +   ret = mdp5_mdss_parse_clock(pdev, &dpu_mdss->clocks);
> +   else
> +   ret = msm_parse_clock(pdev, &dpu_mdss->clocks);
> if (ret < 0) {
> -   DPU_ERROR("failed to parse clocks, ret=%d\n", ret);
> -   goto clk_parse_err;
> +   DRM_ERROR("failed to parse clocks, ret=%d\n", ret);
> +   return ret;
> }
> dpu_mdss->num_clocks = ret;


Re: [Intel-gfx] [PATCH 2/2] drm/i915: Add RCS mask to GuC ADS params

2022-03-03 Thread John Harrison

On 3/3/2022 14:34, Matt Roper wrote:

From: Stuart Summers 

If RCS is not enumerated, GuC will return invalid parameters.
Make sure we do not send RCS supported when we have not enumerated
it.

Cc: Vinay Belgaumkar 
Signed-off-by: Stuart Summers 
Signed-off-by: Matt Roper 

Reviewed-by: John Harrison 


---
  drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
index 32c2053f2f08..acc4a3766dc1 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
@@ -433,7 +433,7 @@ static void guc_mmio_reg_state_init(struct intel_guc *guc)
  static void fill_engine_enable_masks(struct intel_gt *gt,
 struct iosys_map *info_map)
  {
-   info_map_write(info_map, engine_enabled_masks[GUC_RENDER_CLASS], 1);
+   info_map_write(info_map, engine_enabled_masks[GUC_RENDER_CLASS], 
RCS_MASK(gt));
info_map_write(info_map, engine_enabled_masks[GUC_COMPUTE_CLASS], 
CCS_MASK(gt));
info_map_write(info_map, engine_enabled_masks[GUC_BLITTER_CLASS], 1);
info_map_write(info_map, engine_enabled_masks[GUC_VIDEO_CLASS], 
VDBOX_MASK(gt));




Re: [PATCH 00/12] Add writeback block support for DPU

2022-03-03 Thread Stephen Boyd
Quoting Abhinav Kumar (2022-02-04 13:17:13)
> This series adds support for writeback block on DPU. Writeback
> block is extremely useful to validate boards having no physical displays
> in addition to many other use-cases where we want to get the output
> of the display pipeline to examine whether issue is with the display
> pipeline or with the panel.

Is this series going to be resent?


[PATCH] drm/i915/dg2: Add preemption changes for Wa_14015141709

2022-03-03 Thread Matt Roper
From: Akeem G Abodunrin 

Starting with DG2, preemption can no longer be controlled by userspace
on a per-context basis. Instead, the hardware only allows us to enable or
disable preemption in a global, system-wide basis. Also, we lose the
ability to specify the preemption granularity (such as batch-level vs
command-level vs object-level).

As a result, for debugging purposes, this patch adds a debugfs
interface to configure (enable/disable) preemption globally.

Jira: VLK-27831

Cc: Matt Roper 
Cc: Prathap Kumar Valsan 
Cc: John Harrison 
Cc: Joonas Lahtinen 
Signed-off-by: Akeem G Abodunrin 
Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/gt/intel_gt_regs.h |  3 ++
 drivers/gpu/drm/i915/gt/intel_workarounds.c |  2 +-
 drivers/gpu/drm/i915/i915_debugfs.c | 50 +
 drivers/gpu/drm/i915/i915_drv.h |  3 ++
 4 files changed, 57 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h 
b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
index 19cd34f24263..21ede1887b9f 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
@@ -468,6 +468,9 @@
 #define VF_PREEMPTION  _MMIO(0x83a4)
 #define   PREEMPTION_VERTEX_COUNT  REG_GENMASK(15, 0)
 
+#define GEN12_VFG_PREEMPTION_CHICKEN   _MMIO(0x83b4)
+#define   GEN12_VFG_PREEMPT_CHICKEN_DISABLEREG_BIT(8)
+
 #define GEN8_RC6_CTX_INFO  _MMIO(0x8504)
 
 #define GEN12_SQCM _MMIO(0x8724)
diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c 
b/drivers/gpu/drm/i915/gt/intel_workarounds.c
index c014b40d2e9f..18dc82f29776 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
@@ -2310,7 +2310,7 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct 
i915_wa_list *wal)
 FF_DOP_CLOCK_GATE_DISABLE);
}
 
-   if (IS_GRAPHICS_VER(i915, 9, 12)) {
+   if (HAS_PERCTX_PREEMPT_CTRL(i915)) {
/* 
FtrPerCtxtPreemptionGranularityControl:skl,bxt,kbl,cfl,cnl,icl,tgl */
wa_masked_en(wal,
 GEN7_FF_SLICE_CS_CHICKEN1,
diff --git a/drivers/gpu/drm/i915/i915_debugfs.c 
b/drivers/gpu/drm/i915/i915_debugfs.c
index 747fe9f41e1f..40e6e17e2950 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -571,6 +571,55 @@ static int i915_wa_registers(struct seq_file *m, void 
*unused)
return 0;
 }
 
+static void i915_global_preemption_config(struct drm_i915_private *i915,
+ u32 val)
+{
+   const u32 bit = GEN12_VFG_PREEMPT_CHICKEN_DISABLE;
+
+   if (val)
+   intel_uncore_write(&i915->uncore, GEN12_VFG_PREEMPTION_CHICKEN,
+  _MASKED_BIT_DISABLE(bit));
+   else
+   intel_uncore_write(&i915->uncore, GEN12_VFG_PREEMPTION_CHICKEN,
+  _MASKED_BIT_ENABLE(bit));
+}
+
+static int i915_global_preempt_support_get(void *data, u64 *val)
+{
+   struct drm_i915_private *i915 = data;
+   intel_wakeref_t wakeref;
+   u32 curr_status = 0;
+
+   if (HAS_PERCTX_PREEMPT_CTRL(i915) || GRAPHICS_VER(i915) < 11)
+   return -EINVAL;
+
+   with_intel_runtime_pm(&i915->runtime_pm, wakeref)
+   curr_status = intel_uncore_read(&i915->uncore,
+   GEN12_VFG_PREEMPTION_CHICKEN);
+   *val = (curr_status & GEN12_VFG_PREEMPT_CHICKEN_DISABLE) ? 0 : 1;
+
+   return 0;
+}
+
+static int i915_global_preempt_support_set(void *data, u64 val)
+{
+   struct drm_i915_private *i915 = data;
+   intel_wakeref_t wakeref;
+
+   if (HAS_PERCTX_PREEMPT_CTRL(i915) || GRAPHICS_VER(i915) < 11)
+   return -EINVAL;
+
+   with_intel_runtime_pm(&i915->runtime_pm, wakeref)
+   i915_global_preemption_config(i915, val);
+
+   return 0;
+}
+
+DEFINE_SIMPLE_ATTRIBUTE(i915_global_preempt_support_fops,
+   i915_global_preempt_support_get,
+   i915_global_preempt_support_set,
+   "%lld\n");
+
 static int i915_wedged_get(void *data, u64 *val)
 {
struct drm_i915_private *i915 = data;
@@ -765,6 +814,7 @@ static const struct i915_debugfs_files {
const struct file_operations *fops;
 } i915_debugfs_files[] = {
{"i915_perf_noa_delay", &i915_perf_noa_delay_fops},
+   {"i915_global_preempt_support", &i915_global_preempt_support_fops},
{"i915_wedged", &i915_wedged_fops},
{"i915_gem_drop_caches", &i915_drop_caches_fops},
 #if IS_ENABLED(CONFIG_DRM_I915_CAPTURE_ERROR)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 457bc1993d19..8c3f69c87d36 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1407,6 +1407,9 @@ IS_SUBPLATFORM(con

Re: [PATCH v4 1/4] arm64/dts/qcom/sc7280: remove assigned-clock-rate property for mdp clk

2022-03-03 Thread Doug Anderson
Hi,

On Thu, Mar 3, 2022 at 1:40 AM Vinod Polimera  wrote:
>
> The kernel clock driver assumes that the initial rate is the
> max rate for that clock and does not allow it to scale
> beyond the assigned clock value.
>
> Drop the assigned clock rate property and vote on the mdp clock as per
> calculated value during the usecase.

I see the "Drop the assigned clock rate property" part, but where is
the "and vote on the mdp clock" part? Did it already land or
something? I definitely see that commit 5752c921d267 ("drm/msm/dpu:
simplify clocks handling") changed a bunch of this but it looks like
dpu_core_perf_init() still sets "max_core_clk_rate" to whatever the
clock was at bootup. I assume you need to modify that function to call
into the OPP layer to find the max frequency?


> Changes in v2:
> - Remove assigned-clock-rate property and set mdp clk during resume sequence.
> - Add fixes tag.
>
> Changes in v3:
> - Remove extra line after fixes tag.(Stephen Boyd)
>
> Fixes: 62fbdce91("arm64: dts: qcom: sc7280: add display dt nodes")

Having a "Fixes" is good, but presumably you need a code change along
with this, right? Otherwise if someone picks this back to stable then
they'll end up breaking, right? We need to tag / note that _somehow_.
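
For the "query the OPP layer for the max frequency" part, the usual idiom is
roughly the sketch below (assuming the OPP table is already attached to the
DPU device; the field name is a placeholder, not the actual dpu_core_perf
member):

	struct dev_pm_opp *opp;
	unsigned long max_freq = ULONG_MAX;

	/* find the highest OPP frequency <= ULONG_MAX, i.e. the table maximum */
	opp = dev_pm_opp_find_freq_floor(dev, &max_freq);
	if (!IS_ERR(opp)) {
		dev_pm_opp_put(opp);
		perf->max_core_clk_rate = max_freq;	/* placeholder field name */
	}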


[PATCH v3 3/4] drm/i915: Make the heartbeat play nice with long pre-emption timeouts

2022-03-03 Thread John . C . Harrison
From: John Harrison 

Compute workloads are inherently not pre-emptible for long periods on
current hardware. As a workaround for this, the pre-emption timeout
for compute capable engines was disabled. This is undesirable with GuC
submission as it prevents per engine reset of hung contexts. Hence the
next patch will re-enable the timeout but bumped up by an order of
magnitude.

However, the heartbeat might not respect that. Depending upon current
activity, a pre-emption to the heartbeat pulse might not even be
attempted until the last heartbeat period, which means that only one
period is granted for the pre-emption to occur. With the aforesaid
bump, the pre-emption timeout could be significantly larger than this
heartbeat period.

So adjust the heartbeat code to take the pre-emption timeout into
account. When it reaches the final (high priority) period, it now
ensures the delay before hitting reset is bigger than the pre-emption
timeout.

v2: Fix for selftests which adjust the heartbeat period manually.

Signed-off-by: John Harrison 
---
 .../gpu/drm/i915/gt/intel_engine_heartbeat.c   | 18 ++
 1 file changed, 18 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c 
b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
index a3698f611f45..0dc53def8e42 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
@@ -22,9 +22,27 @@
 
 static bool next_heartbeat(struct intel_engine_cs *engine)
 {
+   struct i915_request *rq;
long delay;
 
delay = READ_ONCE(engine->props.heartbeat_interval_ms);
+
+   rq = engine->heartbeat.systole;
+
+   if (rq && rq->sched.attr.priority >= I915_PRIORITY_BARRIER &&
+   delay == engine->defaults.heartbeat_interval_ms) {
+   long longer;
+
+   /*
+* The final try is at the highest priority possible. Up until 
now
+* a pre-emption might not even have been attempted. So make 
sure
+* this last attempt allows enough time for a pre-emption to 
occur.
+*/
+   longer = READ_ONCE(engine->props.preempt_timeout_ms) * 2;
+   if (longer > delay)
+   delay = longer;
+   }
+
if (!delay)
return false;
 
-- 
2.25.1



[PATCH v3 4/4] drm/i915: Improve long running OCL w/a for GuC submission

2022-03-03 Thread John . C . Harrison
From: John Harrison 

A workaround was added to the driver to allow OpenCL workloads to run
'forever' by disabling pre-emption on the RCS engine for Gen12.
It is not totally unbounded, as the heartbeat will kick in eventually
and cause a reset of the hung engine.

However, this does not work well in GuC submission mode. In GuC mode,
the pre-emption timeout is how GuC detects hung contexts and triggers
a per engine reset. Thus, disabling the timeout means also losing all
per engine reset ability. A full GT reset will still occur when the
heartbeat finally expires, but that is a much more destructive and
undesirable mechanism.

The purpose of the workaround is actually to give OpenCL tasks longer
to reach a pre-emption point after a pre-emption request has been
issued. This is necessary because Gen12 does not support mid-thread
pre-emption and OpenCL can have long running threads.

So, rather than disabling the timeout completely, just set it to a
'long' value.

v2: Review feedback from Tvrtko - must hard code the 'long' value
instead of determining it algorithmically. So make it an extra CONFIG
definition. Also, remove the execlist centric comment from the
existing pre-emption timeout CONFIG option given that it applies to
more than just execlists.

Signed-off-by: John Harrison 
Reviewed-by: Daniele Ceraolo Spurio  (v1)
Acked-by: Michal Mrozek 
---
 drivers/gpu/drm/i915/Kconfig.profile  | 26 +++
 drivers/gpu/drm/i915/gt/intel_engine_cs.c |  9 ++--
 2 files changed, 29 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/Kconfig.profile 
b/drivers/gpu/drm/i915/Kconfig.profile
index 39328567c200..7cc38d25ee5c 100644
--- a/drivers/gpu/drm/i915/Kconfig.profile
+++ b/drivers/gpu/drm/i915/Kconfig.profile
@@ -57,10 +57,28 @@ config DRM_I915_PREEMPT_TIMEOUT
default 640 # milliseconds
help
  How long to wait (in milliseconds) for a preemption event to occur
- when submitting a new context via execlists. If the current context
- does not hit an arbitration point and yield to HW before the timer
- expires, the HW will be reset to allow the more important context
- to execute.
+ when submitting a new context. If the current context does not hit
+ an arbitration point and yield to HW before the timer expires, the
+ HW will be reset to allow the more important context to execute.
+
+ This is adjustable via
+ /sys/class/drm/card?/engine/*/preempt_timeout_ms
+
+ May be 0 to disable the timeout.
+
+ The compiled in default may get overridden at driver probe time on
+ certain platforms and certain engines which will be reflected in the
+ sysfs control.
+
+config DRM_I915_PREEMPT_TIMEOUT_COMPUTE
+   int "Preempt timeout for compute engines (ms, jiffy granularity)"
+   default 7500 # milliseconds
+   help
+ How long to wait (in milliseconds) for a preemption event to occur
+ when submitting a new context to a compute capable engine. If the
+ current context does not hit an arbitration point and yield to HW
+ before the timer expires, the HW will be reset to allow the more
+ important context to execute.
 
  This is adjustable via
  /sys/class/drm/card?/engine/*/preempt_timeout_ms
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 4185c7338581..cc0954ad836a 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -438,9 +438,14 @@ static int intel_engine_setup(struct intel_gt *gt, enum 
intel_engine_id id,
engine->props.timeslice_duration_ms =
CONFIG_DRM_I915_TIMESLICE_DURATION;
 
-   /* Override to uninterruptible for OpenCL workloads. */
+   /*
+* Mid-thread pre-emption is not available in Gen12. Unfortunately,
+* some OpenCL workloads run quite long threads. That means they get
+* reset due to not pre-empting in a timely manner. So, bump the
+* pre-emption timeout value to be much higher for compute engines.
+*/
if (GRAPHICS_VER(i915) == 12 && (engine->flags & 
I915_ENGINE_HAS_RCS_REG_STATE))
-   engine->props.preempt_timeout_ms = 0;
+   engine->props.preempt_timeout_ms = 
CONFIG_DRM_I915_PREEMPT_TIMEOUT_COMPUTE;
 
/* Cap properties according to any system limits */
 #define CLAMP_PROP(field) \
-- 
2.25.1



[PATCH v3 1/4] drm/i915/guc: Limit scheduling properties to avoid overflow

2022-03-03 Thread John . C . Harrison
From: John Harrison 

GuC converts the pre-emption timeout and timeslice quantum values into
clock ticks internally. That significantly reduces the point of 32bit
overflow. On current platforms, worst case scenario is approximately
110 seconds. Rather than allowing the user to set higher values and
then get confused by early timeouts, add limits when setting these
values.

v2: Add helper functions for clamping (review feedback from Tvrtko).

Signed-off-by: John Harrison 
Reviewed-by: Daniele Ceraolo Spurio  (v1)
---
 drivers/gpu/drm/i915/gt/intel_engine.h  |  6 ++
 drivers/gpu/drm/i915/gt/intel_engine_cs.c   | 69 +
 drivers/gpu/drm/i915/gt/sysfs_engines.c | 25 +---
 drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h |  9 +++
 4 files changed, 99 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine.h 
b/drivers/gpu/drm/i915/gt/intel_engine.h
index 1c0ab05c3c40..d7044c4e526e 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine.h
@@ -351,4 +351,10 @@ intel_engine_get_hung_context(struct intel_engine_cs 
*engine)
return engine->hung_ce;
 }
 
+u64 intel_clamp_heartbeat_interval_ms(struct intel_engine_cs *engine, u64 
value);
+u64 intel_clamp_max_busywait_duration_ns(struct intel_engine_cs *engine, u64 
value);
+u64 intel_clamp_preempt_timeout_ms(struct intel_engine_cs *engine, u64 value);
+u64 intel_clamp_stop_timeout_ms(struct intel_engine_cs *engine, u64 value);
+u64 intel_clamp_timeslice_duration_ms(struct intel_engine_cs *engine, u64 
value);
+
 #endif /* _INTEL_RINGBUFFER_H_ */
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 7447411a5b26..22e70e4e007c 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -442,6 +442,26 @@ static int intel_engine_setup(struct intel_gt *gt, enum 
intel_engine_id id,
engine->flags |= I915_ENGINE_HAS_EU_PRIORITY;
}
 
+   /* Cap properties according to any system limits */
+#define CLAMP_PROP(field) \
+   do { \
+   u64 clamp = intel_clamp_##field(engine, engine->props.field); \
+   if (clamp != engine->props.field) { \
+   drm_notice(&engine->i915->drm, \
+  "Warning, clamping %s to %lld to prevent 
overflow\n", \
+  #field, clamp); \
+   engine->props.field = clamp; \
+   } \
+   } while (0)
+
+   CLAMP_PROP(heartbeat_interval_ms);
+   CLAMP_PROP(max_busywait_duration_ns);
+   CLAMP_PROP(preempt_timeout_ms);
+   CLAMP_PROP(stop_timeout_ms);
+   CLAMP_PROP(timeslice_duration_ms);
+
+#undef CLAMP_PROP
+
engine->defaults = engine->props; /* never to change again */
 
engine->context_size = intel_engine_context_size(gt, engine->class);
@@ -464,6 +484,55 @@ static int intel_engine_setup(struct intel_gt *gt, enum 
intel_engine_id id,
return 0;
 }
 
+u64 intel_clamp_heartbeat_interval_ms(struct intel_engine_cs *engine, u64 
value)
+{
+   value = min_t(u64, value, jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT));
+
+   return value;
+}
+
+u64 intel_clamp_max_busywait_duration_ns(struct intel_engine_cs *engine, u64 
value)
+{
+   value = min(value, jiffies_to_nsecs(2));
+
+   return value;
+}
+
+u64 intel_clamp_preempt_timeout_ms(struct intel_engine_cs *engine, u64 value)
+{
+   /*
+* NB: The GuC API only supports 32bit values. However, the limit is 
further
+* reduced due to internal calculations which would otherwise overflow.
+*/
+   if (intel_guc_submission_is_wanted(&engine->gt->uc.guc))
+   value = min_t(u64, value, GUC_POLICY_MAX_PREEMPT_TIMEOUT_MS);
+
+   value = min_t(u64, value, jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT));
+
+   return value;
+}
+
+u64 intel_clamp_stop_timeout_ms(struct intel_engine_cs *engine, u64 value)
+{
+   value = min_t(u64, value, jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT));
+
+   return value;
+}
+
+u64 intel_clamp_timeslice_duration_ms(struct intel_engine_cs *engine, u64 
value)
+{
+   /*
+* NB: The GuC API only supports 32bit values. However, the limit is 
further
+* reduced due to internal calculations which would otherwise overflow.
+*/
+   if (intel_guc_submission_is_wanted(&engine->gt->uc.guc))
+   value = min_t(u64, value, GUC_POLICY_MAX_EXEC_QUANTUM_MS);
+
+   value = min_t(u64, value, jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT));
+
+   return value;
+}
+
 static void __setup_engine_capabilities(struct intel_engine_cs *engine)
 {
struct drm_i915_private *i915 = engine->i915;
diff --git a/drivers/gpu/drm/i915/gt/sysfs_engines.c 
b/drivers/gpu/drm/i915/gt/sysfs_engines.c
index 967031056202..f2d9858d827c 100644
--- a/drivers/gpu/drm/i915/gt/sysfs_engines.c
+++ b/drivers/gpu/drm/i915/gt/sysfs_engines.c
@@ -144

[PATCH v3 0/4] Improve anti-pre-emption w/a for compute workloads

2022-03-03 Thread John . C . Harrison
From: John Harrison 

Compute workloads are inherently not pre-emptible on current hardware.
Thus the pre-emption timeout was disabled as a workaround to prevent
unwanted resets. Instead, the hang detection was left to the heartbeat
and its (longer) timeout. This is undesirable with GuC submission as
the heartbeat is a full GT reset rather than a per engine reset and so
is much more destructive. Instead, just bump the pre-emption timeout
to a big value. Also, update the heartbeat to allow such a long
pre-emption delay in the final heartbeat period.

v2: Add clamping helpers.
v3: Remove long timeout algorithm and replace with hard coded value
(review feedback from Tvrtko). Also, fix execlist selftest failure and
fix bug in compute enabling patch related to pre-emption timeouts.

Signed-off-by: John Harrison 


John Harrison (4):
  drm/i915/guc: Limit scheduling properties to avoid overflow
  drm/i915: Fix compute pre-emption w/a to apply to compute engines
  drm/i915: Make the heartbeat play nice with long pre-emption timeouts
  drm/i915: Improve long running OCL w/a for GuC submission

 drivers/gpu/drm/i915/Kconfig.profile  | 26 +-
 drivers/gpu/drm/i915/gt/intel_engine.h|  6 ++
 drivers/gpu/drm/i915/gt/intel_engine_cs.c | 92 +--
 .../gpu/drm/i915/gt/intel_engine_heartbeat.c  | 18 
 drivers/gpu/drm/i915/gt/sysfs_engines.c   | 25 +++--
 drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h   |  9 ++
 6 files changed, 153 insertions(+), 23 deletions(-)

-- 
2.25.1



[PATCH v3 2/4] drm/i915: Fix compute pre-emption w/a to apply to compute engines

2022-03-03 Thread John . C . Harrison
From: John Harrison 

An earlier patch added support for compute engines. However, it missed
enabling the anti-pre-emption w/a for the new engine class. So move
the 'compute capable' flag earlier and use it for the pre-emption w/a
test.

Fixes: c674c5b9342e ("drm/i915/xehp: CCS should use RCS setup functions")
Cc: Tvrtko Ursulin 
Cc: Daniele Ceraolo Spurio 
Cc: Aravind Iddamsetty 
Cc: Matt Roper 
Cc: Tvrtko Ursulin 
Cc: Daniel Vetter 
Cc: Maarten Lankhorst 
Cc: Lucas De Marchi 
Cc: John Harrison 
Cc: Jason Ekstrand 
Cc: "Michał Winiarski" 
Cc: Matthew Brost 
Cc: Chris Wilson 
Cc: Tejas Upadhyay 
Cc: Umesh Nerlige Ramappa 
Cc: "Thomas Hellström" 
Cc: Stuart Summers 
Cc: Matthew Auld 
Cc: Jani Nikula 
Cc: Ramalingam C 
Cc: Akeem G Abodunrin 
Signed-off-by: John Harrison 
---
 drivers/gpu/drm/i915/gt/intel_engine_cs.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 22e70e4e007c..4185c7338581 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -421,6 +421,12 @@ static int intel_engine_setup(struct intel_gt *gt, enum 
intel_engine_id id,
engine->logical_mask = BIT(logical_instance);
__sprint_engine_name(engine);
 
+   /* features common between engines sharing EUs */
+   if (engine->class == RENDER_CLASS || engine->class == COMPUTE_CLASS) {
+   engine->flags |= I915_ENGINE_HAS_RCS_REG_STATE;
+   engine->flags |= I915_ENGINE_HAS_EU_PRIORITY;
+   }
+
engine->props.heartbeat_interval_ms =
CONFIG_DRM_I915_HEARTBEAT_INTERVAL;
engine->props.max_busywait_duration_ns =
@@ -433,15 +439,9 @@ static int intel_engine_setup(struct intel_gt *gt, enum 
intel_engine_id id,
CONFIG_DRM_I915_TIMESLICE_DURATION;
 
/* Override to uninterruptible for OpenCL workloads. */
-   if (GRAPHICS_VER(i915) == 12 && engine->class == RENDER_CLASS)
+   if (GRAPHICS_VER(i915) == 12 && (engine->flags & I915_ENGINE_HAS_RCS_REG_STATE))
engine->props.preempt_timeout_ms = 0;
 
-   /* features common between engines sharing EUs */
-   if (engine->class == RENDER_CLASS || engine->class == COMPUTE_CLASS) {
-   engine->flags |= I915_ENGINE_HAS_RCS_REG_STATE;
-   engine->flags |= I915_ENGINE_HAS_EU_PRIORITY;
-   }
-
/* Cap properties according to any system limits */
 #define CLAMP_PROP(field) \
do { \
-- 
2.25.1



Re: [PATCH 3/4] drm/msm: Add SYSPROF param

2022-03-03 Thread Stephen Boyd
Quoting Rob Clark (2022-03-03 13:47:14)
> On Thu, Mar 3, 2022 at 1:17 PM Rob Clark  wrote:
> >
> > On Thu, Mar 3, 2022 at 12:47 PM Stephen Boyd  wrote:
> > >
> > > Quoting Rob Clark (2022-03-03 11:46:47)
> > > > +
> > > > +   /* then apply new value: */
> > >
> > > It would be safer to swap this. Otherwise a set when the values are at
> > > "1" would drop to "zero" here and potentially trigger some glitch,
> > > whereas incrementing one more time and then dropping the previous state
> > > would avoid that short blip.
> > >
> > > > +   switch (sysprof) {
> > > > +   default:
> > > > +   return -EINVAL;
> > >
> > > This will become more complicated though.
> >
> > Right, that is why I took the "unwind first and then re-apply"
> > approach.. in practice I expect userspace to set the value before it
> > starts sampling counter values, so I wasn't too concerned about this
> > racing with a submit and clearing the counters.  (Plus any glitch if
> > userspace did decide to change it dynamically would just be transient
> > and not really a big deal.)
>
> Actually I could just swap the two switches... the result would be that
> an -EINVAL would not change the state, instead of dropping the state to
> zero.  Maybe that is better anyway
>

Yeah it isn't clear to me what should happen if the new state is
invalid. Outright rejection is probably better than replacing the
previous state with an invalid state.
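
A minimal sketch of that ordering (hypothetical structure and values, not
the actual drm/msm code): validate and apply the new value first, then
unwind the old one, so an invalid request is rejected without disturbing
the existing state and the count never transiently drops to zero:

#include <errno.h>	/* for EINVAL */

struct sysprof_ctx {
	int sysprof_active;	/* illustrative counter, not the real field */
};

static int sysprof_set(struct sysprof_ctx *ctx, int new_val, int old_val)
{
	/* apply the new value first */
	switch (new_val) {
	case 0:
		break;
	case 1:
		ctx->sysprof_active++;
		break;
	default:
		return -EINVAL;	/* reject; existing state untouched */
	}

	/* then unwind the previous value */
	switch (old_val) {
	case 1:
		ctx->sysprof_active--;
		break;
	}

	return 0;
}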


[PATCH 2/2] drm/i915: Add RCS mask to GuC ADS params

2022-03-03 Thread Matt Roper
From: Stuart Summers 

If RCS is not enumerated, GuC will return invalid parameters.
Make sure we do not report RCS as supported when we have not
enumerated it.

Cc: Vinay Belgaumkar 
Signed-off-by: Stuart Summers 
Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
index 32c2053f2f08..acc4a3766dc1 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
@@ -433,7 +433,7 @@ static void guc_mmio_reg_state_init(struct intel_guc *guc)
 static void fill_engine_enable_masks(struct intel_gt *gt,
 struct iosys_map *info_map)
 {
-   info_map_write(info_map, engine_enabled_masks[GUC_RENDER_CLASS], 1);
+   info_map_write(info_map, engine_enabled_masks[GUC_RENDER_CLASS], RCS_MASK(gt));
info_map_write(info_map, engine_enabled_masks[GUC_COMPUTE_CLASS], CCS_MASK(gt));
info_map_write(info_map, engine_enabled_masks[GUC_BLITTER_CLASS], 1);
info_map_write(info_map, engine_enabled_masks[GUC_VIDEO_CLASS], VDBOX_MASK(gt));
-- 
2.34.1



[PATCH 1/2] drm/i915/xehp: Support platforms with CCS engines but no RCS

2022-03-03 Thread Matt Roper
In the past we've always assumed that an RCS engine is present on every
platform.  However, now that we have compute engines, there may be
platforms that have CCS engines but no RCS, or platforms that are
designed to have both but have the RCS engine fused off.

Various engine-centric initialization that only needs to be done once
for the whole RCS+CCS group can therefore no longer be tied to RCS
setup.  Instead, add an I915_ENGINE_FIRST_RENDER_COMPUTE flag that is
assigned to exactly one engine in the group; whichever engine carries
the flag is responsible for the common setup (RCU_MODE programming,
initialization of certain workarounds, etc.), as sketched below.
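
The selection rule amounts to the following (simplified sketch with
stand-in types and masks; the exact condition is in the intel_engine_cs.c
hunk further down): pick the RCS engine when one exists, otherwise the
lowest-numbered enabled CCS instance.

enum { RENDER_CLASS, COMPUTE_CLASS };	/* stand-ins for the i915 class ids */

/* Exactly one engine per RCS/CCS group should return true here. */
static int is_first_render_compute(int class, int instance,
				   unsigned long rcs_mask,
				   unsigned long ccs_mask)
{
	if (class == RENDER_CLASS)
		return 1;

	/* no RCS fused in: the first enabled CCS instance takes over */
	return class == COMPUTE_CLASS && !rcs_mask && ccs_mask &&
	       instance == __builtin_ctzl(ccs_mask);
}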

Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/gt/intel_engine_cs.c| 5 +
 drivers/gpu/drm/i915/gt/intel_engine_types.h | 2 ++
 drivers/gpu/drm/i915/gt/intel_execlists_submission.c | 2 +-
 drivers/gpu/drm/i915/gt/intel_workarounds.c  | 2 +-
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c   | 2 +-
 drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c| 2 +-
 drivers/gpu/drm/i915/i915_drv.h  | 2 ++
 7 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 7447411a5b26..8080479f27aa 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -436,6 +436,11 @@ static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id,
if (GRAPHICS_VER(i915) == 12 && engine->class == RENDER_CLASS)
engine->props.preempt_timeout_ms = 0;
 
+   if ((engine->class == COMPUTE_CLASS && !RCS_MASK(engine->gt) &&
+__ffs(CCS_MASK(engine->gt)) == engine->instance) ||
+engine->class == RENDER_CLASS)
+   engine->flags |= I915_ENGINE_FIRST_RENDER_COMPUTE;
+
/* features common between engines sharing EUs */
if (engine->class == RENDER_CLASS || engine->class == COMPUTE_CLASS) {
engine->flags |= I915_ENGINE_HAS_RCS_REG_STATE;
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
index 19ff8758e34d..4fbf45a74ec0 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
@@ -97,6 +97,7 @@ struct i915_ctx_workarounds {
 #define I915_MAX_VCS   8
 #define I915_MAX_VECS  4
 #define I915_MAX_CCS   4
+#define I915_MAX_RCS   1
 
 /*
  * Engine IDs definitions.
@@ -526,6 +527,7 @@ struct intel_engine_cs {
 #define I915_ENGINE_WANT_FORCED_PREEMPTION BIT(8)
 #define I915_ENGINE_HAS_RCS_REG_STATE  BIT(9)
 #define I915_ENGINE_HAS_EU_PRIORITY    BIT(10)
+#define I915_ENGINE_FIRST_RENDER_COMPUTE BIT(11)
unsigned int flags;
 
/*
diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
index 1c602d4ae297..e1470bb60f34 100644
--- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
@@ -2912,7 +2912,7 @@ static int execlists_resume(struct intel_engine_cs *engine)
 
enable_execlists(engine);
 
-   if (engine->class == RENDER_CLASS)
+   if (engine->flags & I915_ENGINE_FIRST_RENDER_COMPUTE)
xehp_enable_ccs_engines(engine);
 
return 0;
diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
index c014b40d2e9f..beca8735bae5 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
@@ -2633,7 +2633,7 @@ engine_init_workarounds(struct intel_engine_cs *engine, struct i915_wa_list *wal
 * to a single RCS/CCS engine's workaround list since
 * they're reset as part of the general render domain reset.
 */
-   if (engine->class == RENDER_CLASS)
+   if (engine->flags & I915_ENGINE_FIRST_RENDER_COMPUTE)
general_render_compute_wa_init(engine, wal);
 
if (engine->class == RENDER_CLASS)
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
index 9bb551b83e7a..32c2053f2f08 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
@@ -335,7 +335,7 @@ static int guc_mmio_regset_init(struct temp_regset *regset,
ret |= GUC_MMIO_REG_ADD(regset, RING_HWS_PGA(base), false);
ret |= GUC_MMIO_REG_ADD(regset, RING_IMR(base), false);
 
-   if (engine->class == RENDER_CLASS &&
+   if ((engine->flags & I915_ENGINE_FIRST_RENDER_COMPUTE) &&
CCS_MASK(engine->gt))
ret |= GUC_MMIO_REG_ADD(regset, GEN12_RCU_MODE, true);
 
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index 1ce7e04aa837..8a8bb87e77a0 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/
