Re: [PATCH v5 3/5] drm/msm/dp: set stream_pixel rate directly
On Fri, 4 Mar 2022 at 07:31, Stephen Boyd wrote: > > Quoting Dmitry Baryshkov (2022-03-03 20:23:06) > > On Fri, 4 Mar 2022 at 01:32, Stephen Boyd wrote: > > > > > > Quoting Dmitry Baryshkov (2022-02-16 21:55:27) > > > > The only clock for which we set the rate is the "stream_pixel". Rather > > > > than storing the rate and then setting it by looping over all the > > > > clocks, set the clock rate directly. > > > > > > > > Signed-off-by: Dmitry Baryshkov > > > [...] > > > > diff --git a/drivers/gpu/drm/msm/dp/dp_ctrl.c > > > > b/drivers/gpu/drm/msm/dp/dp_ctrl.c > > > > index 07f6bf7e1acb..8e6361dedd77 100644 > > > > --- a/drivers/gpu/drm/msm/dp/dp_ctrl.c > > > > +++ b/drivers/gpu/drm/msm/dp/dp_ctrl.c > > > > @@ -1315,7 +1315,7 @@ static void dp_ctrl_set_clock_rate(struct > > > > dp_ctrl_private *ctrl, > > > > DRM_DEBUG_DP("setting rate=%lu on clk=%s\n", rate, name); > > > > > > > > if (num) > > > > - cfg->rate = rate; > > > > + clk_set_rate(cfg->clk, rate); > > > > > > This looks bad. From what I can tell we set the rate of the pixel clk > > > after enabling the phy and configuring it. See the order of operations > > > in dp_ctrl_enable_mainlink_clocks() and note how dp_power_clk_enable() > > > is the one that eventually sets a rate through dp_power_clk_set_rate() > > > > > > dp_ctrl_set_clock_rate(ctrl, DP_CTRL_PM, "ctrl_link", > > > ctrl->link->link_params.rate * > > > 1000); > > > > > > phy_configure(phy, &dp_io->phy_opts); > > > phy_power_on(phy); > > > > > > ret = dp_power_clk_enable(ctrl->power, DP_CTRL_PM, true); > > > > This code has been changed in the previous patch. > > > > Let's get back a bit. > > Currently dp_ctrl_set_clock_rate() doesn't change the clock rate. It > > just stores the rate in the config so that later the sequence of > > dp_power_clk_enable() -> dp_power_clk_set_rate() -> > > [dp_power_clk_set_link_rate() -> dev_pm_opp_set_rate() or > > msm_dss_clk_set_rate() -> clk_set_rate()] will use that. 
> > > > There are only two users of dp_ctrl_set_clock_rate(): > > - dp_ctrl_enable_mainlink_clocks(), which you have quoted above. > > This case is handled in the patch 1 from this series. It makes > > Patch 1 from this series says DP is unaffected. Huh? > > > dp_ctrl_enable_mainlink_clocks() call dev_pm_opp_set_rate() directly > > without storing (!) the rate in the config, calling > > phy_configure()/phy_power_on() and then setting the opp via the > > sequence of calls specified above Note, this handles the "ctrl_link" clock. > > > > - dp_ctrl_enable_stream_clocks(), which calls dp_power_clk_enable() > > immediately afterwards. This call would set the stream_pixel rate > > while enabling stream clocks. As far as I can see, the stream_pixel is > > the only stream clock. So this patch sets the clock rate without > > storing in the interim configuration data. > > > > Could you please clarify, what exactly looks bad to you? > > Note, this handles the "stream_pixel" clock. > > I'm concerned about the order of operations changing between the > phy being powered on and the pixel clk frequency being set. From what I > recall the pixel clk rate operations depend on the phy frequency being > set (which is done through phy_configure?) so if we call clk_set_rate() > on the pixel clk before the phy is set then the clk frequency will be > calculated badly and probably be incorrect. But the order of operations is mostly unchanged. 
The only major change is that the opp point is now set before calling
phy_configure()/phy_power_on().

For the pixel clock the driver has:

static int dp_ctrl_enable_stream_clocks(struct dp_ctrl_private *ctrl)
{
	int ret = 0;

	dp_ctrl_set_clock_rate(ctrl, DP_STREAM_PM, "stream_pixel",
			       ctrl->dp_ctrl.pixel_rate * 1000);

	ret = dp_power_clk_enable(ctrl->power, DP_STREAM_PM, true);
	[skipped the error handling]
}

dp_power_clk_enable() doesn't have any special handlers for
DP_STREAM_PM, so this code would be equivalent to the following pseudo
code (given that there is only one stream clock):

unsigned int rate = ctrl->dp_ctrl.pixel_rate * 1000;

/* dp_ctrl_set_clock_rate() */
cfg = find_clock_cfg("stream_pixel");
cfg->rate = rate;

/* dp_power_clk_enable() */
clk = find_clock("stream_pixel");
clk_set_rate(clk, cfg->rate);
clk_prepare_enable(clk);

The proposed patch does exactly this. Please correct me if I'm wrong.

-- 
With best wishes
Dmitry
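As an aside, the equivalence Dmitry describes can be checked with a small user-space model. The structs and helpers below are stand-ins, not the driver or clock-framework code; the point is only that storing the rate in a config and applying it at enable time ends in the same clock state as setting the rate directly:

```c
/* Stand-ins for the clock framework; not the real kernel API. */
struct clk { unsigned long rate; int enabled; };
struct cfg { struct clk *clk; unsigned long rate; };

static void clk_set_rate(struct clk *c, unsigned long r) { c->rate = r; }
static void clk_prepare_enable(struct clk *c) { c->enabled = 1; }

/* Old path: dp_ctrl_set_clock_rate() stores the rate, then
 * dp_power_clk_enable() applies it and enables the clock. */
static void old_path(struct cfg *cfg, unsigned long rate)
{
	cfg->rate = rate;			/* dp_ctrl_set_clock_rate() */
	clk_set_rate(cfg->clk, cfg->rate);	/* dp_power_clk_enable() */
	clk_prepare_enable(cfg->clk);
}

/* New path: set the rate directly, then enable. */
static void new_path(struct cfg *cfg, unsigned long rate)
{
	clk_set_rate(cfg->clk, rate);
	clk_prepare_enable(cfg->clk);
}

/* Run both paths on fresh clocks and compare the final state. */
static int paths_equivalent(unsigned long rate)
{
	struct clk a = { 0, 0 }, b = { 0, 0 };
	struct cfg ca = { &a, 0 }, cb = { &b, 0 };

	old_path(&ca, rate);
	new_path(&cb, rate);
	return a.rate == b.rate && a.enabled == b.enabled;
}
```

Both paths leave the clock at the same rate and enabled, which is the equivalence the patch relies on.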
Re: [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr
Correction for a typo:

-for (struct list_head *list = head->next, cond = (struct list_head *)-1; cond == (struct list_head *)-1; cond = NULL) \
+for (struct list_head *list = head->next, *cond = (struct list_head *)-1; cond == (struct list_head *)-1; cond = NULL) \

-- 
Xiaomeng Tong
RE: [v1] drm/msm/disp/dpu1: add inline rotation support for sc7280 target
> WARNING: This email originated from outside of Qualcomm. Please be wary > of any links or attachments, and do not enable macros. > > On 18/02/2022 14:30, Vinod Polimera wrote: > > - Some DPU versions support inline rot90. It is supported only for > > limited amount of UBWC formats. > > - There are two versions of inline rotators, v1 (present on sm8250 and > > sm7250) and v2 (sc7280). These versions differ in the list of supported > > formats and in the scaler possibilities. > > > > Changes in RFC: > > - Rebase changes to the latest code base. > > - Append rotation config variables with v2 and > > remove unused variables.(Dmitry) > > - Move pixel_ext setup separately from scaler3 config.(Dmitry) > > - Add 270 degree rotation to supported rotation list.(Dmitry) > > > > Signed-off-by: Kalyan Thota > > Signed-off-by: Vinod Polimera > > --- > > drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c | 44 --- > > drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h | 15 > > drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c | 105 > - > > drivers/gpu/drm/msm/disp/dpu1/dpu_plane.h | 2 + > > 4 files changed, 134 insertions(+), 32 deletions(-) > > > > diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c > b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c > > index aa75991..ae17a61 100644 > > --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c > > +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c > > @@ -25,6 +25,9 @@ > > #define VIG_SM8250_MASK \ > > (VIG_MASK | BIT(DPU_SSPP_QOS_8LVL) | > BIT(DPU_SSPP_SCALER_QSEED3LITE)) > > > > +#define VIG_SC7280_MASK \ > > + (VIG_SC7180_MASK | BIT(DPU_SSPP_INLINE_ROTATION)) > > + > > #define DMA_SDM845_MASK \ > > (BIT(DPU_SSPP_SRC) | BIT(DPU_SSPP_QOS) | > BIT(DPU_SSPP_QOS_8LVL) |\ > > BIT(DPU_SSPP_TS_PREFILL) | BIT(DPU_SSPP_TS_PREFILL_REC1) |\ > > @@ -102,6 +105,8 @@ > > #define MAX_DOWNSCALE_RATIO 4 > > #define SSPP_UNITY_SCALE1 > > > > +#define INLINE_ROTATOR_V22 > > Unused > > > + > > #define STRCAT(X, Y) (X Y) > > > > static const uint32_t 
plane_formats[] = { > > @@ -177,6 +182,11 @@ static const uint32_t plane_formats_yuv[] = { > > DRM_FORMAT_YVU420, > > }; > > > > +static const uint32_t rotation_v2_formats[] = { > > + DRM_FORMAT_NV12, > > + /* TODO add formats after validation */ > > +}; > > + > > > /** > *** > >* DPU sub blocks config > > > ** > ***/ > > @@ -465,7 +475,13 @@ static const struct dpu_ctl_cfg sc7280_ctl[] = { > > > > /* SSPP common configuration */ > > > > -#define _VIG_SBLK(num, sdma_pri, qseed_ver) \ > > +static const struct dpu_rotation_cfg dpu_rot_cfg_v2 = { > > + .rot_maxheight = 1088, > > Is the maxheight expected to be common between the SoC generations? > You are declaring it inside generic `dpu_rot_cfg_v2`, which means that > the struct will be used unchanged for several platforms. Changed 'dpu_rot_cfg_v2' to 'dpu_rot_sc7280_cfg_v2' so that it will be specific to sc7280. > > > + .rot_num_formats = ARRAY_SIZE(rotation_v2_formats), > > + .rot_format_list = rotation_v2_formats, > > +}; > > This should come later, together with the rest of structures. 
> > > + > > +#define _VIG_SBLK(num, sdma_pri, qseed_ver, rot_cfg) \ > > { \ > > .maxdwnscale = MAX_DOWNSCALE_RATIO, \ > > .maxupscale = MAX_UPSCALE_RATIO, \ > > @@ -482,6 +498,7 @@ static const struct dpu_ctl_cfg sc7280_ctl[] = { > > .num_formats = ARRAY_SIZE(plane_formats_yuv), \ > > .virt_format_list = plane_formats, \ > > .virt_num_formats = ARRAY_SIZE(plane_formats), \ > > + .rotation_cfg = rot_cfg, \ > > } > > > > #define _DMA_SBLK(num, sdma_pri) \ > > @@ -498,13 +515,13 @@ static const struct dpu_ctl_cfg sc7280_ctl[] = { > > } > > > > static const struct dpu_sspp_sub_blks sdm845_vig_sblk_0 = > > - _VIG_SBLK("0", 5, DPU_SSPP_SCALER_QSEED3); > > + _VIG_SBLK("0", 5, DPU_SSPP_SCALER_QSEED3, > > NULL); > > static const struct dpu_sspp_sub_blks sdm845_vig_sblk_1 = > > - _VIG_SBLK("1", 6, DPU_SSPP_SCALER_QSEED3); > > + _VIG_SBLK("1", 6, DPU_SSPP_SCALER_QSEED3, > > NULL); > > static const struct dpu_sspp_sub_blks sdm845_vig_sblk_2 = > > - _VIG_SBLK("2", 7, DPU_SSPP_SCALER_QSEED3); > > + _VIG_SBLK("2", 7, DPU_SSPP_SCALER_QSEED3, > > NULL); > > static const struct dpu_sspp_sub_blks sdm845_vig_sblk_3 = > > - _VIG_SBLK("3", 8, DPU_SSPP_SCALER_QSEED3); > > + _VIG_SBLK("3", 8, DPU_SSPP_SCALER_QSEED3, > > NULL); > > > > static const struct dpu_sspp_sub_blks sdm845_dma_sblk_0 = > _DMA_SBLK("8", 1); > > static const struct dpu_sspp_sub_blks sdm845_dma_sblk_1 = > _DMA_SBLK("9", 2); > > @@
RE: [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr
> From: Xiaomeng Tong > > Sent: 03 March 2022 07:27 > > > > On Thu, 3 Mar 2022 04:58:23 +, David Laight wrote: > > > on 3 Mar 2022 10:27:29 +0800, Xiaomeng Tong wrote: > > > > The problem is the mis-use of iterator outside the loop on exit, and > > > > the iterator will be the HEAD's container_of pointer which pointers > > > > to a type-confused struct. Sidenote: The *mis-use* here refers to > > > > mistakely access to other members of the struct, instead of the > > > > list_head member which acutally is the valid HEAD. > > > > > > The problem is that the HEAD's container_of pointer should never > > > be calculated at all. > > > This is what is fundamentally broken about the current definition. > > > > Yes, the rule is "the HEAD's container_of pointer should never be > > calculated at all outside the loop", but how do you make sure everyone > > follows this rule? > > Everyone makes mistakes, but we can eliminate them all from the beginning > > with the help of compiler which can catch such use-after-loop things. > > > > > > IOW, you would dereference a (NULL + offset_of_member) address here. > > > > > >Where? > > > > In the case where a developer do not follows the above rule, and mistakely > > access a non-list-head member of the HEAD's container_of pointer outside > > the loop. For example: > > struct req{ > > int a; > > struct list_head h; > > } > > struct req *r; > > list_for_each_entry(r, HEAD, h) { > > if (r->a == 0x10) > > break; > > } > > // the developer made a mistake: he didn't take this situation into > > // account where all entries in the list are *r->a != 0x10*, and now > > // the r is the HEAD's container_of pointer. > > r->a = 0x20; > > Thus the "r->a = 0x20" would dereference a (NULL + offset_of_member) > > address here. > > That is just a bug. > No different to failing to check anything else might 'return' > a NULL pointer. Yes, but it‘s a mistake everyone has made and will make, we should avoid this at the beginning with the help of compiler. 
> Because it is a NULL dereference you find out pretty quickly.

AFAIK, NULL dereference is undefined behavior; can the compiler catch it quickly? Or can it only be applied to some simple/restricted cases?

> The existing loop leaves you with a valid pointer to something > that isn't a list item. > > > > Please remind me if i missed something, thanks. > > > > > > > > Can you share your "alternative definitions" details? thanks! > > > > > > The loop should probably use as extra variable that points > > > to the 'list node' in the next structure. > > > Something like: > > > for (xxx *iter = head->next; > > > iter == &head ? ((item = NULL),0) : ((item = > > > list_item(iter),1)); > > > iter = item->member->next) { > > > ... > > > With a bit of casting you can use 'item' to hold 'iter'. > > > > you still can not make sure everyone follows this rule: > > "do not use iterator outside the loop" without the help of compiler, > > because item is declared outside the loop. > > That one has 'iter' defined in the loop.

Oh, sorry, I misunderstood. Then this is the same as my list_for_each_entry_inside(pos, type, head, member), which declares the iterator inside the loop. You go further and make things like "&pos->member == (head)" go away, to avoid calculating the HEAD's container_of pointer, although the calculation is harmless.

> > BTW, to avoid ambiguity, the "alternative definitions" here i asked is > > something from you in this context: > > "OTOH there may be alternative definitions that can be used to get > > the compiler (or other compiler-like tools) to detect broken code. > > Even if the definition can't possibly generate a working kerrnel." > > I was thinking of something like: > if ((pos = list_first)), 1) pos = NULL else > so that unchecked dereferences after the loop will be detectable > as NULL pointer offsets - but that in itself isn't enough to avoid > other warnings. 
> 

Do you mean put this right after the loop (I changed something that I do not understand, please correct me if I am wrong, thanks):

if (pos == list_first)
	pos = NULL;
else

and the compiler can detect the following NULL dereference? But if the NULL dereference is just right after the loop originally, it will be masked by the *else* keyword.

> > > > The "list_for_each_entry_inside(pos, type, head, member)" way makes > > > > the iterator invisiable outside the loop, and would be catched by > > > > compiler if use-after-loop things happened. > > > It is also a compete PITA for anything doing a search. > > > > You mean it would be a burden on search? can you show me some examples? > > The whole business of having to save the pointer to the located item > before breaking the loop, remembering to have set it to NULL earlier etc.

Ok, I see. And then you need to pass an "item" to the list_for_each_entry macro as a new argument.

> > It is so much better if y
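To make the hazard in this thread concrete, here is a user-space miniature of the pattern under discussion. `container_of` and the list layout are re-created here, not taken from list.h; the search helper returns NULL on a full traversal instead of letting the head's type-confused container_of pointer escape:

```c
#include <stddef.h>

/* Minimal re-creation of the kernel list idiom being debated. */
struct list_head { struct list_head *next, *prev; };

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct req { int a; struct list_head h; };

/* Never let the HEAD's bogus container_of pointer escape the loop:
 * when the traversal completes without a hit, return NULL so callers
 * check for NULL instead of dereferencing a type-confused pointer. */
static struct req *find_req(struct list_head *head, int key)
{
	for (struct list_head *p = head->next; p != head; p = p->next) {
		struct req *r = container_of(p, struct req, h);

		if (r->a == key)
			return r;
	}
	return NULL;
}

/* Build a two-entry circular list and search it; returns 1 on hit. */
static int demo(int key)
{
	struct req r1 = { 0x10, { 0, 0 } }, r2 = { 0x20, { 0, 0 } };
	struct list_head head = { &r1.h, &r2.h };

	r1.h.next = &r2.h; r1.h.prev = &head;
	r2.h.next = &head; r2.h.prev = &r1.h;

	return find_req(&head, key) ? 1 : 0;
}
```

The miss case is the one the thread worries about: with the raw `list_for_each_entry()` idiom the iterator would hold `container_of(&head, ...)` after the loop, and `r->a` on it would read past the head.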
Re: [PATCH v3 00/21] DEPT(Dependency Tracker)
On Thu, Mar 03, 2022 at 06:48:24PM +0900, Byungchul Park wrote: > On Thu, Mar 03, 2022 at 08:03:21AM +, Hyeonggon Yoo wrote: > > On Thu, Mar 03, 2022 at 09:18:13AM +0900, Byungchul Park wrote: > > > Hi Hyeonggon, > > > > > > Dept also allows the following scenario when an user guarantees that > > > each lock instance is different from another at a different depth: > > > > > >lock A0 with depth > > >lock A1 with depth + 1 > > >lock A2 with depth + 2 > > >lock A3 with depth + 3 > > >(and so on) > > >.. > > >unlock A3 > > >unlock A2 > > >unlock A1 > > >unlock A0 > [+Cc kmemleak maintainer] > Look at this. Dept allows object->lock -> other_object->lock (with a > different depth using *_lock_nested()) so won't report it. > No, It did. S: object->lock ( _raw_spin_lock_irqsave) W: other_object->lock (_raw_spin_lock_nested) DEPT reported this as AA deadlock. === DEPT: Circular dependency has been detected. 5.17.0-rc1+ #1 Tainted: GW --- summary --- *** AA DEADLOCK *** context A [S] __raw_spin_lock_irqsave(&object->lock:0) [W] _raw_spin_lock_nested(&object->lock:0) [E] spin_unlock(&object->lock:0) [S]: start of the event context [W]: the wait blocked [E]: the event not reachable --- context A's detail --- context A [S] __raw_spin_lock_irqsave(&object->lock:0) [W] _raw_spin_lock_nested(&object->lock:0) [E] spin_unlock(&object->lock:0) --- context A's detail --- context A [S] __raw_spin_lock_irqsave(&object->lock:0) [W] _raw_spin_lock_nested(&object->lock:0) [E] spin_unlock(&object->lock:0) [S] __raw_spin_lock_irqsave(&object->lock:0): [] scan_gray_list+0x84/0x13c stacktrace: dept_ecxt_enter+0x88/0xf4 _raw_spin_lock_irqsave+0xf0/0x1c4 scan_gray_list+0x84/0x13c kmemleak_scan+0x2d8/0x54c kmemleak_scan_thread+0xac/0xd4 kthread+0xd4/0xe4 ret_from_fork+0x10/0x20 [W] _raw_spin_lock_nested(&object->lock:0): [] scan_block+0xb4/0x128 stacktrace: __dept_wait+0x8c/0xa4 dept_wait+0x6c/0x88 _raw_spin_lock_nested+0xa8/0x1b0 scan_block+0xb4/0x128 scan_gray_list+0xc4/0x13c 
kmemleak_scan+0x2d8/0x54c kmemleak_scan_thread+0xac/0xd4 kthread+0xd4/0xe4 ret_from_fork+0x10/0x20 [E] spin_unlock(&object->lock:0): [] scan_block+0x60/0x128 --- information that might be helpful --- CPU: 2 PID: 38 Comm: kmemleak Tainted: GW 5.17.0-rc1+ #1 Hardware name: linux,dummy-virt (DT) Call trace: dump_backtrace.part.0+0x9c/0xc4 show_stack+0x14/0x28 dump_stack_lvl+0x9c/0xcc dump_stack+0x14/0x2c print_circle+0x2d4/0x438 cb_check_dl+0x44/0x70 bfs+0x60/0x168 add_dep+0x88/0x11c add_wait+0x2d0/0x2dc __dept_wait+0x8c/0xa4 dept_wait+0x6c/0x88 _raw_spin_lock_nested+0xa8/0x1b0 scan_block+0xb4/0x128 scan_gray_list+0xc4/0x13c kmemleak_scan+0x2d8/0x54c kmemleak_scan_thread+0xac/0xd4 kthread+0xd4/0xe4 ret_from_fork+0x10/0x20 > > > However, Dept does not allow the following scenario where another lock > > > class cuts in the dependency chain: > > > > > >lock A0 with depth > > >lock B > > >lock A1 with depth + 1 > > >lock A2 with depth + 2 > > >lock A3 with depth + 3 > > >(and so on) > > >.. > > >unlock A3 > > >unlock A2 > > >unlock A1 > > >unlock B > > >unlock A0 > > > > > > This scenario is clearly problematic. What do you think is going to > > > happen with another context running the following? > > > > > > > First of all, I want to say I'm not expert at locking primitives. > > I may be wrong. > > It's okay. Thanks anyway for your feedback. > Thanks. > > > > 45 * scan_mutex [-> object->lock] -> kmemleak_lock -> > > > > other_object->lock (SINGLE_DEPTH_NESTING) > > > > 46 * > > > > 47 * No kmemleak_lock and object->lock nesting is allowed outside > > > > scan_mutex > > > > 48 * regions. > > > > lock order in kmemleak is described above. > > > > and DEPT detects two cases as deadlock: > > > > 1) object->lock -> other_object->lock > > It's not a deadlock *IF* two have different depth using *_lock_nested(). > Dept also allows this case. So Dept wouldn't report it. > > > 2) object->lock -> kmemleak_lock, kmemleak_lock -> other_object->lock > > But this usage is risky. 
I already explained it in the mail you replied > to. I copied it. See the below. > I understand why you said this is risky. Its lock ordering is not good. > context A > > >lock A0 with depth > > >lock B > > >lock A1 with depth + 1 > > >lock A2 with depth + 2 > > >l
Re: [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr
On Thu, 3 Mar 2022 12:18:24 +, Daniel Thompson wrote: > On Thu, Mar 03, 2022 at 03:26:57PM +0800, Xiaomeng Tong wrote: > > On Thu, 3 Mar 2022 04:58:23 +, David Laight wrote: > > > on 3 Mar 2022 10:27:29 +0800, Xiaomeng Tong wrote: > > > > The problem is the mis-use of iterator outside the loop on exit, and > > > > the iterator will be the HEAD's container_of pointer which pointers > > > > to a type-confused struct. Sidenote: The *mis-use* here refers to > > > > mistakely access to other members of the struct, instead of the > > > > list_head member which acutally is the valid HEAD. > > > > > > The problem is that the HEAD's container_of pointer should never > > > be calculated at all. > > > This is what is fundamentally broken about the current definition. > > > > Yes, the rule is "the HEAD's container_of pointer should never be > > calculated at all outside the loop", but how do you make sure everyone > > follows this rule? > > Your formulation of the rule is correct: never run container_of() on HEAD > pointer.

Actually, it is not my rule. My rule is: never access other members of the struct except the list_head member after the loop, because such a member is invalid after loop exit, while the list_head member is still valid (it is just HEAD) and the later calculation (&pos->head) seems harmless.

I have considered the case where the HEAD's container "pos" is laid out across the max and the min address boundary, which means the address of HEAD is likely 0x60, and the address of pos is likely 0xffe0. It seems ok to calculate pos with:

((type *)(__mptr - offsetof(type, member)));

and it seems ok to calculate head outside the loop with:

if (&pos->head == &HEAD)
	return NULL;

The only case I can think of where the rule "never run container_of() on HEAD" must be followed is when the first argument (which is &HEAD) passed to container_of() is NULL + some offset, since it may lead to the resulting "pos->member" access being a NULL dereference. 
But maybe the caller can take the responsibility to check if it is NULL, not container_of() itself. Please remind me if I missed something, thanks.

> > However the rule that is introduced by list_for_each_entry_inside() is > *not* this rule. The rule it introduces is: never access the iterator > variable outside the loop.

Sorry for the confusion; indeed, those are two *different* rules.

> > Making the iterator NULL on loop exit does follow the rule you proposed > but using a different technique: do not allow HEAD to be stored in the > iterator variable after loop exit. This also makes it impossible to run > container_of() on the HEAD pointer. >

It does not. My rule is: never access the iterator variable outside the loop. The "Making the iterator NULL on loop exit" way still leaks pos (as NULL) outside the loop, which may lead to a NULL dereference.

> > > Everyone makes mistakes, but we can eliminate them all from the beginning > > with the help of compiler which can catch such use-after-loop things. > > Indeed but if we introduce new interfaces then we don't have to worry > about existing usages and silent regressions. Code will have been > written knowing the loop can exit with the iterator set to NULL.

Yes, it is simpler and compatible with existing interfaces. However, you would have to make every developer remember that "pos will be set to NULL on loop exit", which is unreasonable and impossible for *every* single person. Otherwise the misuse after the loop will lead to a NULL dereference. But we can kill this problem by declaring the iterator inside the loop, and the compiler will catch it if somebody misuses it after the loop.

> > Sure it is still possible for programmers to make mistakes and > dereference the NULL pointer but C programmers are well training w.r.t. > NULL pointer checking so such mistakes are much less likely than with > the current list_for_each_entry() macro. 
This risk must be offset > against the way a NULLify approach can lead to more elegant code when we > are doing a list search. >

Yes, the NULLify approach is better than the current list_for_each_entry() macro, but I still think the list_for_each_entry_inside() way is technically the best. Thus, my idea is *better a finger off than always aching*: let's settle this damn problem once and for all, with list_for_each_entry_inside().

-- 
Xiaomeng Tong
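For reference, the variant being argued for can be sketched in user space. The macro and list machinery below are re-created for illustration (not copied from the kernel): the macro itself declares `pos`, so the iterator does not exist after the loop, and a search must copy its result into an outer variable.

```c
#include <stddef.h>

struct list_head { struct list_head *next, *prev; };

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Sketch of the proposal: `pos` is declared by the macro, so any
 * use-after-loop is a compile-time error, not a runtime type
 * confusion on the head. */
#define list_for_each_entry_inside(pos, type, head, member)		\
	for (type *pos = container_of((head)->next, type, member);	\
	     &pos->member != (head);					\
	     pos = container_of(pos->member.next, type, member))

struct req { int a; struct list_head h; };

/* The search idiom under this macro: copy the hit out of the loop. */
static int find_value(struct list_head *head, int key, int *out)
{
	int found = 0;

	list_for_each_entry_inside(r, struct req, head, h) {
		if (r->a == key) {
			*out = r->a;
			found = 1;
			break;
		}
	}
	return found;	/* `r` is out of scope here, by construction */
}

/* Two-entry circular list; returns the found value or -1. */
static int demo(int key)
{
	struct req r1 = { 0x10, { 0, 0 } }, r2 = { 0x20, { 0, 0 } };
	struct list_head head = { &r1.h, &r2.h };
	int v = 0;

	r1.h.next = &r2.h; r1.h.prev = &head;
	r2.h.next = &head; r2.h.prev = &r1.h;

	return find_value(&head, key, &v) ? v : -1;
}
```

This is exactly the "save the pointer before breaking the loop" cost Daniel mentions, traded for the compile-time guarantee Xiaomeng wants.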
Re: [PATCH] i2c: at91: use dma safe buffers
Am 03.03.22 um 17:17 schrieb Michael Walle:

The supplied buffer might be on the stack and we get the following error
message:

[3.312058] at91_i2c e0070600.i2c: rejecting DMA map of vmalloc memory

Use i2c_{get,put}_dma_safe_msg_buf() to get a DMA-able memory region if
necessary.

Cc: sta...@vger.kernel.org
Signed-off-by: Michael Walle
---
I'm not sure if or which Fixes: tag I should add to this patch. The
issue seems to have been there for a very long time, but nobody seems
to have triggered it. FWIW, I'm using the sff,sfp driver, which
triggers this.

 drivers/i2c/busses/i2c-at91-master.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/drivers/i2c/busses/i2c-at91-master.c b/drivers/i2c/busses/i2c-at91-master.c
index b0eae94909f4..a7a22fedbaba 100644
--- a/drivers/i2c/busses/i2c-at91-master.c
+++ b/drivers/i2c/busses/i2c-at91-master.c
@@ -656,6 +656,7 @@ static int at91_twi_xfer(struct i2c_adapter *adap, struct i2c_msg *msg, int num)
 	unsigned int_addr_flag = 0;
 	struct i2c_msg *m_start = msg;
 	bool is_read;
+	u8 *dma_buf;

Maybe call your variable differently. DMA-buf is an inter-driver buffer
sharing framework we use for GPU acceleration and V4L. It doesn't cause
any technical issues, but the maintainer regex now triggers on that. So
you are CCing people not related to this code in any way.

Regards,
Christian.

 	dev_dbg(&adap->dev, "at91_xfer: processing %d messages:\n", num);
@@ -703,7 +704,18 @@ static int at91_twi_xfer(struct i2c_adapter *adap, struct i2c_msg *msg, int num)
 	dev->msg = m_start;
 	dev->recv_len_abort = false;
 
+	if (dev->use_dma) {
+		dma_buf = i2c_get_dma_safe_msg_buf(m_start, 1);
+		if (!dma_buf) {
+			ret = -ENOMEM;
+			goto out;
+		}
+		dev->buf = dma_buf;
+	}
+
 	ret = at91_do_twi_transfer(dev);
+	i2c_put_dma_safe_msg_buf(dma_buf, m_start, !ret);
 
 	ret = (ret < 0) ? ret : num;
 out:
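The bounce-buffer idea behind i2c_{get,put}_dma_safe_msg_buf() can be modeled in user space. Everything below is a simplified stand-in (the msg struct, the "DMA-safe" flag, and the helper names are illustrative, not the kernel API): an unsafe buffer is replaced with a heap copy for the transfer, and read data is copied back on success.

```c
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for struct i2c_msg; not the kernel API. */
struct msg {
	unsigned char *buf;
	size_t len;
	int read;	/* 1 = device-to-host transfer */
	int dma_safe;	/* stand-in for "not vmalloc/stack memory" */
};

/* Return a buffer DMA can target: the original if already safe,
 * otherwise a heap bounce buffer (filled in only for writes). */
static unsigned char *get_dma_safe_buf(struct msg *m)
{
	unsigned char *b;

	if (m->dma_safe)
		return m->buf;
	b = malloc(m->len);
	if (b && !m->read)
		memcpy(b, m->buf, m->len);
	return b;
}

/* Release the bounce buffer, copying data back on a successful read. */
static void put_dma_safe_buf(unsigned char *b, struct msg *m, int xferred)
{
	if (!b || b == m->buf)
		return;
	if (m->read && xferred)
		memcpy(m->buf, b, m->len);
	free(b);
}

/* Simulate a read: "the device" fills the DMA-safe buffer with 0xAB. */
static int demo_read(int dma_safe)
{
	unsigned char user[4] = { 0, 0, 0, 0 };
	struct msg m = { user, sizeof(user), 1, dma_safe };
	unsigned char *b = get_dma_safe_buf(&m);

	if (!b)
		return -1;
	memset(b, 0xAB, m.len);		/* the transfer */
	put_dma_safe_buf(b, &m, 1);
	return user[0];
}
```

Either way the caller's buffer ends up with the read data, which is why the driver can wrap the transfer in the get/put pair without changing its result handling.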
Re: [Intel-gfx] [PATCH 1/2] drm/i915/xehp: Support platforms with CCS engines but no RCS
On Thu, Mar 03, 2022 at 02:34:34PM -0800, Matt Roper wrote: In the past we've always assumed that an RCS engine is present on every platform. However now that we have compute engines there may be platforms that have CCS engines but no RCS, or platforms that are designed to have both, but have the RCS engine fused off. Various engine-centric initialization that only needs to be done a single time for the group of RCS+CCS engines can't rely on being setup with the RCS now; instead we add a I915_ENGINE_FIRST_RENDER_COMPUTE flag that will be assigned to a single engine in the group; whichever engine has this flag will be responsible for some of the general setup (RCU_MODE programming, initialization of certain workarounds, etc.). Signed-off-by: Matt Roper Reviewed-by: Lucas De Marchi Lucas De Marchi
[PATCH v4 24/24] dept: Disable Dept on that map once it's been handled until next turn
Dept works with waits preceding an event, which might lead to a deadlock. Once the event has been handled, it's hard to ensure further waits actually contribute to a deadlock until the next turn, which will start when a sleep associated with that map happens. So let Dept start tracking dependency when a sleep happens and stop tracking dependency once the event, i.e. wake up, has been handled.

Signed-off-by: Byungchul Park
---
 kernel/dependency/dept.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/kernel/dependency/dept.c b/kernel/dependency/dept.c
index cc1b3a3..1c91db8 100644
--- a/kernel/dependency/dept.c
+++ b/kernel/dependency/dept.c
@@ -2325,6 +2325,12 @@ void dept_event(struct dept_map *m, unsigned long e_f, unsigned long ip,
 		do_event((void *)m, c, READ_ONCE(m->wgen), ip);
 		pop_ecxt((void *)m);
 	}
+
+	/*
+	 * Keep the map disabled until the next sleep.
+	 */
+	WRITE_ONCE(m->wgen, 0);
+
 	dept_exit(flags);
 }
 EXPORT_SYMBOL_GPL(dept_event);
@@ -2447,6 +2453,11 @@ void dept_event_split_map(struct dept_map_each *me,
 		pop_ecxt((void *)me);
 	}
 
+	/*
+	 * Keep the map disabled until the next sleep.
+	 */
+	WRITE_ONCE(me->wgen, 0);
+
 	dept_exit(flags);
 }
 EXPORT_SYMBOL_GPL(dept_event_split_map);
-- 
1.9.1
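The arm/disarm lifecycle this patch gives a map can be shown with a toy model (this is not Dept's real implementation; the names are illustrative): the event handler zeroes `wgen`, so waits on that map are ignored until a subsequent sleep re-arms it.

```c
/* Toy model of the patch's idea: events disarm the map, sleeps re-arm it. */
struct map { unsigned int wgen; };

static unsigned int next_wgen = 1;

static void on_sleep(struct map *m) { m->wgen = next_wgen++; }	/* re-arm */
static void on_event(struct map *m) { m->wgen = 0; }	/* WRITE_ONCE(m->wgen, 0) in the patch */

/* A wait contributes to dependency tracking only while armed. */
static int wait_tracked(struct map *m)
{
	return m->wgen != 0;
}

/* Sleep, two waits, event, then one more wait: only the first two count. */
static int demo(void)
{
	struct map m = { 0 };
	int tracked = 0;

	on_sleep(&m);
	tracked += wait_tracked(&m);
	tracked += wait_tracked(&m);
	on_event(&m);
	tracked += wait_tracked(&m);	/* disabled until the next sleep */
	return tracked;
}
```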
[PATCH v4 08/24] dept: Apply Dept to wait_for_completion()/complete()
Makes Dept able to track dependencies by wait_for_completion()/complete(). Signed-off-by: Byungchul Park --- include/linux/completion.h | 42 -- kernel/sched/completion.c | 12 ++-- 2 files changed, 50 insertions(+), 4 deletions(-) diff --git a/include/linux/completion.h b/include/linux/completion.h index 51d9ab0..a1ad5a8 100644 --- a/include/linux/completion.h +++ b/include/linux/completion.h @@ -26,14 +26,48 @@ struct completion { unsigned int done; struct swait_queue_head wait; + struct dept_map dmap; }; +#ifdef CONFIG_DEPT +#define dept_wfc_init(m, k, s, n) dept_map_init(m, k, s, n) +#define dept_wfc_reinit(m) dept_map_reinit(m) +#define dept_wfc_wait(m, ip) \ +do { \ + dept_ask_event(m); \ + dept_wait(m, 1UL, ip, __func__, 0); \ +} while (0) +#define dept_wfc_complete(m, ip) dept_event(m, 1UL, ip, __func__) +#define dept_wfc_enter(m, ip) dept_ecxt_enter(m, 1UL, ip, "completion_context_enter", "complete", 0) +#define dept_wfc_exit(m, ip) dept_ecxt_exit(m, ip) +#else +#define dept_wfc_init(m, k, s, n) do { (void)(n); (void)(k); } while (0) +#define dept_wfc_reinit(m) do { } while (0) +#define dept_wfc_wait(m, ip) do { } while (0) +#define dept_wfc_complete(m, ip) do { } while (0) +#define dept_wfc_enter(m, ip) do { } while (0) +#define dept_wfc_exit(m, ip) do { } while (0) +#endif + +#ifdef CONFIG_DEPT +#define WFC_DEPT_MAP_INIT(work) .dmap = { .name = #work, .skip_cnt = ATOMIC_INIT(0) } +#else +#define WFC_DEPT_MAP_INIT(work) +#endif + +#define init_completion(x) \ + do {\ + static struct dept_key __dkey; \ + __init_completion(x, &__dkey, #x); \ + } while (0) + #define init_completion_map(x, m) init_completion(x) static inline void complete_acquire(struct completion *x) {} static inline void complete_release(struct completion *x) {} #define COMPLETION_INITIALIZER(work) \ - { 0, __SWAIT_QUEUE_HEAD_INITIALIZER((work).wait) } + { 0, __SWAIT_QUEUE_HEAD_INITIALIZER((work).wait), \ + WFC_DEPT_MAP_INIT(work) } #define COMPLETION_INITIALIZER_ONSTACK_MAP(work, map) \ (*({ 
init_completion_map(&(work), &(map)); &(work); })) @@ -81,9 +115,12 @@ static inline void complete_release(struct completion *x) {} * This inline function will initialize a dynamically created completion * structure. */ -static inline void init_completion(struct completion *x) +static inline void __init_completion(struct completion *x, +struct dept_key *dkey, +const char *name) { x->done = 0; + dept_wfc_init(&x->dmap, dkey, 0, name); init_swait_queue_head(&x->wait); } @@ -97,6 +134,7 @@ static inline void init_completion(struct completion *x) static inline void reinit_completion(struct completion *x) { x->done = 0; + dept_wfc_reinit(&x->dmap); } extern void wait_for_completion(struct completion *); diff --git a/kernel/sched/completion.c b/kernel/sched/completion.c index a778554..6e31cc0 100644 --- a/kernel/sched/completion.c +++ b/kernel/sched/completion.c @@ -29,6 +29,7 @@ void complete(struct completion *x) { unsigned long flags; + dept_wfc_complete(&x->dmap, _RET_IP_); raw_spin_lock_irqsave(&x->wait.lock, flags); if (x->done != UINT_MAX) @@ -58,6 +59,7 @@ void complete_all(struct completion *x) { unsigned long flags; + dept_wfc_complete(&x->dmap, _RET_IP_); lockdep_assert_RT_in_threaded_ctx(); raw_spin_lock_irqsave(&x->wait.lock, flags); @@ -112,17 +114,23 @@ void complete_all(struct completion *x) } static long __sched -wait_for_common(struct completion *x, long timeout, int state) +_wait_for_common(struct completion *x, long timeout, int state) { return __wait_for_common(x, schedule_timeout, timeout, state); } static long __sched -wait_for_common_io(struct completion *x, long timeout, int state) +_wait_for_common_io(struct completion *x, long timeout, int state) { return __wait_for_common(x, io_schedule_timeout, timeout, state); } +#define wait_for_common(x, t, s) \ +({ dept_wfc_wait(&(x)->dmap, _RET_IP_); _wait_for_common(x, t, s); }) + +#define wait_for_common_io(x, t, s)\ +({ dept_wfc_wait(&(x)->dmap, _RET_IP_); _wait_for_common_io(x, t, s); }) + /** * 
wait_for_completion: - waits for completion of a tas
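One detail of the `init_completion()` macro in this series worth spelling out is the `static struct dept_key __dkey` inside the `do { } while (0)` body: each expansion of the macro gets its own static object, so every call site gets its own key. A hypothetical user-space model of that mechanism (`dept_key` here is a stand-in struct, not the real one):

```c
/* Stand-in for Dept's key type; only its identity (address) matters. */
struct dept_key { int dummy; };

/* Model of the macro pattern: the `static` inside the body gives each
 * textual expansion (i.e. each call site) its own key object. */
#define INIT_WITH_SITE_KEY(out)					\
	do {							\
		static struct dept_key __dkey;			\
		(out) = &__dkey;				\
	} while (0)

static struct dept_key *site_a(void)
{
	struct dept_key *k;

	INIT_WITH_SITE_KEY(k);	/* one expansion -> one key */
	return k;
}

static struct dept_key *site_b(void)
{
	struct dept_key *k;

	INIT_WITH_SITE_KEY(k);	/* a different expansion -> a different key */
	return k;
}
```

Repeated calls through the same site return the same key, while distinct sites get distinct keys, which is how completions initialized at different places end up in different Dept classes.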
[PATCH v4 20/24] dept: Add nocheck version of init_completion()
For completions who don't want to get tracked by Dept, added init_completion_nocheck() to disable Dept on it. Signed-off-by: Byungchul Park --- include/linux/completion.h | 14 +++--- 1 file changed, 11 insertions(+), 3 deletions(-) diff --git a/include/linux/completion.h b/include/linux/completion.h index a1ad5a8..9bd3bc9 100644 --- a/include/linux/completion.h +++ b/include/linux/completion.h @@ -30,6 +30,7 @@ struct completion { }; #ifdef CONFIG_DEPT +#define dept_wfc_nocheck(m)dept_map_nocheck(m) #define dept_wfc_init(m, k, s, n) dept_map_init(m, k, s, n) #define dept_wfc_reinit(m) dept_map_reinit(m) #define dept_wfc_wait(m, ip) \ @@ -41,6 +42,7 @@ struct completion { #define dept_wfc_enter(m, ip) dept_ecxt_enter(m, 1UL, ip, "completion_context_enter", "complete", 0) #define dept_wfc_exit(m, ip) dept_ecxt_exit(m, ip) #else +#define dept_wfc_nocheck(m)do { } while (0) #define dept_wfc_init(m, k, s, n) do { (void)(n); (void)(k); } while (0) #define dept_wfc_reinit(m) do { } while (0) #define dept_wfc_wait(m, ip) do { } while (0) @@ -55,10 +57,11 @@ struct completion { #define WFC_DEPT_MAP_INIT(work) #endif +#define init_completion_nocheck(x) __init_completion(x, NULL, #x, false) #define init_completion(x) \ do {\ static struct dept_key __dkey; \ - __init_completion(x, &__dkey, #x); \ + __init_completion(x, &__dkey, #x, true);\ } while (0) #define init_completion_map(x, m) init_completion(x) @@ -117,10 +120,15 @@ static inline void complete_release(struct completion *x) {} */ static inline void __init_completion(struct completion *x, struct dept_key *dkey, -const char *name) +const char *name, bool check) { x->done = 0; - dept_wfc_init(&x->dmap, dkey, 0, name); + + if (check) + dept_wfc_init(&x->dmap, dkey, 0, name); + else + dept_wfc_nocheck(&x->dmap); + init_swait_queue_head(&x->wait); } -- 1.9.1
[PATCH v4 22/24] dept: Don't create dependencies between different depths in any case
Dept already prevents creating dependencies between different depths of a class indicated by *_lock_nested() when the lock acquisitions happen consecutively:

   lock A0 with depth
   lock_nested A1 with depth + 1
   ...
   unlock A1
   unlock A0

Dept does not create an A0 -> A1 dependency in this case, either. However, once another class cuts in, the code becomes problematic. When Dept tries to create real dependencies, it creates not only the real ones but also wrong ones between different depths of the class:

   lock A0 with depth
   lock B
   lock_nested A1 with depth + 1
   ...
   unlock A1
   unlock B
   unlock A0

Even in this case, Dept should not create an A0 -> A1 dependency. So let Dept not create wrong dependencies between different depths of a class in any case.

Reported-by: 42.hye...@gmail.com
Signed-off-by: Byungchul Park
---
 kernel/dependency/dept.c | 9 +--------
 1 file changed, 1 insertion(+), 8 deletions(-)

diff --git a/kernel/dependency/dept.c b/kernel/dependency/dept.c
index 5d4efc3..cc1b3a3 100644
--- a/kernel/dependency/dept.c
+++ b/kernel/dependency/dept.c
@@ -1458,14 +1458,7 @@ static void add_wait(struct dept_class *c, unsigned long ip,
 		eh = dt->ecxt_held + i;
 		if (eh->ecxt->class != c || eh->nest == ne)
-			break;
-	}
-
-	for (; i >= 0; i--) {
-		struct dept_ecxt_held *eh;
-
-		eh = dt->ecxt_held + i;
-		add_dep(eh->ecxt, w);
+			add_dep(eh->ecxt, w);
 	}
 
 	if (!wait_consumed(w) && !rich_stack) {
-- 
1.9.1
[PATCH v4 21/24] dept: Disable Dept on struct crypto_larval's completion for now
struct crypto_larval's completion is used for multiple purposes, e.g. waiting for a test to complete or waiting for a probe to complete. The completion variable needs to be split according to what it's used for; otherwise, Dept cannot distinguish one use from another and doesn't work properly. Until that split is done, disable Dept on it.

Signed-off-by: Byungchul Park
---
 crypto/api.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/crypto/api.c b/crypto/api.c
index cf0869d..f501b91 100644
--- a/crypto/api.c
+++ b/crypto/api.c
@@ -115,7 +115,12 @@ struct crypto_larval *crypto_larval_alloc(const char *name, u32 type, u32 mask)
 	larval->alg.cra_destroy = crypto_larval_destroy;
 
 	strlcpy(larval->alg.cra_name, name, CRYPTO_MAX_ALG_NAME);
-	init_completion(&larval->completion);
+	/*
+	 * TODO: Split ->completion according to what it's used for e.g.
+	 * ->test_completion, ->probe_completion and the like, so that
+	 * Dept can track its dependency properly.
+	 */
+	init_completion_nocheck(&larval->completion);
 
 	return larval;
 }
-- 
1.9.1
[PATCH v4 06/24] dept: Apply Dept to mutex families
Makes Dept able to track dependencies by mutex families. Signed-off-by: Byungchul Park --- include/linux/lockdep.h | 18 +++--- include/linux/mutex.h | 33 + include/linux/rtmutex.h | 7 +++ 3 files changed, 55 insertions(+), 3 deletions(-) diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h index 529ea18..6653a4f 100644 --- a/include/linux/lockdep.h +++ b/include/linux/lockdep.h @@ -615,9 +615,21 @@ static inline void print_irqtrace_events(struct task_struct *curr) #define seqcount_acquire_read(l, s, t, i) lock_acquire_shared_recursive(l, s, t, NULL, i) #define seqcount_release(l, i) lock_release(l, i) -#define mutex_acquire(l, s, t, i) lock_acquire_exclusive(l, s, t, NULL, i) -#define mutex_acquire_nest(l, s, t, n, i) lock_acquire_exclusive(l, s, t, n, i) -#define mutex_release(l, i)lock_release(l, i) +#define mutex_acquire(l, s, t, i) \ +do { \ + lock_acquire_exclusive(l, s, t, NULL, i); \ + dept_mutex_lock(&(l)->dmap, s, t, NULL, "mutex_unlock", i); \ +} while (0) +#define mutex_acquire_nest(l, s, t, n, i) \ +do { \ + lock_acquire_exclusive(l, s, t, n, i); \ + dept_mutex_lock(&(l)->dmap, s, t, (n) ? 
&(n)->dmap : NULL, "mutex_unlock", i);\ +} while (0) +#define mutex_release(l, i)\ +do { \ + lock_release(l, i); \ + dept_mutex_unlock(&(l)->dmap, i); \ +} while (0) #define rwsem_acquire(l, s, t, i) lock_acquire_exclusive(l, s, t, NULL, i) #define rwsem_acquire_nest(l, s, t, n, i) lock_acquire_exclusive(l, s, t, n, i) diff --git a/include/linux/mutex.h b/include/linux/mutex.h index 8f226d4..204f976 100644 --- a/include/linux/mutex.h +++ b/include/linux/mutex.h @@ -20,11 +20,18 @@ #include #include +#ifdef CONFIG_DEPT +# define DMAP_MUTEX_INIT(lockname) .dmap = { .name = #lockname, .skip_cnt = ATOMIC_INIT(0) }, +#else +# define DMAP_MUTEX_INIT(lockname) +#endif + #ifdef CONFIG_DEBUG_LOCK_ALLOC # define __DEP_MAP_MUTEX_INITIALIZER(lockname) \ , .dep_map = { \ .name = #lockname, \ .wait_type_inner = LD_WAIT_SLEEP, \ + DMAP_MUTEX_INIT(lockname) \ } #else # define __DEP_MAP_MUTEX_INITIALIZER(lockname) @@ -75,6 +82,32 @@ struct mutex { #endif }; +#ifdef CONFIG_DEPT +#define dept_mutex_lock(m, ne, t, n, e_fn, ip) \ +do { \ + if (t) {\ + dept_ecxt_enter(m, 1UL, ip, __func__, e_fn, ne);\ + dept_ask_event(m); \ + } else if (n) { \ + dept_skip(m); \ + } else {\ + dept_wait(m, 1UL, ip, __func__, ne);\ + dept_ecxt_enter(m, 1UL, ip, __func__, e_fn, ne);\ + dept_ask_event(m); \ + } \ +} while (0) +#define dept_mutex_unlock(m, ip) \ +do { \ + if (!dept_unskip_if_skipped(m)) { \ + dept_event(m, 1UL, ip, __func__); \ + dept_ecxt_exit(m, ip); \ + } \ +} while (0) +#else +#define dept_mutex_lock(m, ne, t, n, e_fn, ip) do { } while (0) +#define dept_mutex_unlock(m, ip) do { } while (0) +#endif + #ifdef CONFIG_DEBUG_MUTEXES #define __DEBUG_MUTEX_INITIALIZER(lockname)\ diff --git a/include/linux/rtmutex.h b/include/linux/rtmutex.h index 7d04988..712d6e6 100644 --- a/include/linux/rtmutex.h +++ b/include/linux/rtmutex.h @@ -76,11 +76,18 @@ static inline void rt_mutex_debug_task_free(struct task_struct *tsk) { } __rt_mutex_init(mutex, __func__, &__key); \ } while (0) +#ifdef CONFIG_DEPT 
+#define DMAP_RT_MUTEX_INIT(mutexname) .dmap = { .name = #mutexname, .skip_cnt = ATOMIC_INIT(0) }, +#else +#define DMAP
[PATCH v4 18/24] dept: Distinguish each work from another
The workqueue already provides concurrency control. With that in place, a wait in one work doesn't prevent events in other works. Thus, each work had better be considered a different context.

So let Dept assign a different context id to each work.

Signed-off-by: Byungchul Park
---
 include/linux/dept.h     |  2 ++
 kernel/dependency/dept.c | 10 ++++++++++
 kernel/workqueue.c       |  3 +++
 3 files changed, 15 insertions(+)

diff --git a/include/linux/dept.h b/include/linux/dept.h
index 1a1c307..55c5ed5 100644
--- a/include/linux/dept.h
+++ b/include/linux/dept.h
@@ -486,6 +486,7 @@ struct dept_task {
 extern void dept_event_split_map(struct dept_map_each *me, struct dept_map_common *mc, unsigned long ip, const char *e_fn);
 extern void dept_ask_event_split_map(struct dept_map_each *me, struct dept_map_common *mc);
 extern void dept_kernel_enter(void);
+extern void dept_work_enter(void);
 
 /*
  * for users who want to manage external keys
@@ -527,6 +528,7 @@ struct dept_task {
 #define dept_event_split_map(me, mc, ip, e_fn)	do { } while (0)
 #define dept_ask_event_split_map(me, mc)	do { } while (0)
 #define dept_kernel_enter()			do { } while (0)
+#define dept_work_enter()			do { } while (0)
 #define dept_key_init(k)			do { (void)(k); } while (0)
 #define dept_key_destroy(k)			do { (void)(k); } while (0)
 #endif
diff --git a/kernel/dependency/dept.c b/kernel/dependency/dept.c
index 8f962ae..5d4efc3 100644
--- a/kernel/dependency/dept.c
+++ b/kernel/dependency/dept.c
@@ -1873,6 +1873,16 @@ void dept_disable_hardirq(unsigned long ip)
 	dept_exit(flags);
 }
 
+/*
+ * Assign a different context id to each work.
+ */
+void dept_work_enter(void)
+{
+	struct dept_task *dt = dept_task();
+
+	dt->cxt_id[DEPT_CXT_PROCESS] += (1UL << DEPT_CXTS_NR);
+}
+
 void dept_kernel_enter(void)
 {
 	struct dept_task *dt = dept_task();
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 33f1106..f5d762c 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -51,6 +51,7 @@
 #include
 #include
 #include
+#include
 
 #include "workqueue_internal.h"
 
@@ -2217,6 +2218,8 @@ static void process_one_work(struct worker *worker, struct work_struct *work)
 	lockdep_copy_map(&lockdep_map, &work->lockdep_map);
 #endif
 
+	dept_work_enter();
+
 	/* ensure we're on the correct CPU */
 	WARN_ON_ONCE(!(pool->flags & POOL_DISASSOCIATED) &&
		     raw_smp_processor_id() != pool->cpu);
-- 
1.9.1
[PATCH v4 23/24] dept: Let it work with real sleeps in __schedule()
Dept commits the staged wait in __schedule() even if the corresponding wake_up() has already woken up the task, which means Dept considers the case a sleep. That helps Dept do stronger detection but also leads to false positives. It'd be better, for now, to conservatively let Dept work only with real sleeps. So do that.

Signed-off-by: Byungchul Park
---
 kernel/sched/core.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 6a422aa..2ec7cf8 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6192,7 +6192,12 @@ static void __sched notrace __schedule(unsigned int sched_mode)
 	local_irq_disable();
 	rcu_note_context_switch(!!sched_mode);
 
-	if (sched_mode == SM_NONE)
+	/*
+	 * Skip the commit if the current task does not actually go to
+	 * sleep.
+	 */
+	if (READ_ONCE(prev->__state) & TASK_NORMAL &&
+	    sched_mode == SM_NONE)
 		dept_ask_event_wait_commit(_RET_IP_);
 
 	/*
-- 
1.9.1
[PATCH v4 19/24] dept: Disable Dept within the wait_bit layer by default
The struct wait_queue_head array bit_wait_table[] in sched/wait_bit.c is shared by all its users, which unfortunately vary in terms of class, so each user should have been assigned its own class to avoid false positives.

It'd be better to let Dept work at a higher layer than wait_bit, so disable Dept within the wait_bit layer by default.

It's worth noting that Dept still works with the other struct wait_queue_head instances, which are mostly well-classified.

Signed-off-by: Byungchul Park
---
 kernel/sched/wait_bit.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/wait_bit.c b/kernel/sched/wait_bit.c
index 02ce292..3e5a3eb 100644
--- a/kernel/sched/wait_bit.c
+++ b/kernel/sched/wait_bit.c
@@ -3,6 +3,7 @@
  * The implementation of the wait_bit*() and related waiting APIs:
  */
 #include "sched.h"
+#include
 
 #define WAIT_TABLE_BITS 8
 #define WAIT_TABLE_SIZE (1 << WAIT_TABLE_BITS)
@@ -246,6 +247,8 @@ void __init wait_bit_init(void)
 {
 	int i;
 
-	for (i = 0; i < WAIT_TABLE_SIZE; i++)
+	for (i = 0; i < WAIT_TABLE_SIZE; i++) {
 		init_waitqueue_head(bit_wait_table + i);
+		dept_map_nocheck(&(bit_wait_table + i)->dmap);
+	}
 }
-- 
1.9.1
[PATCH v4 14/24] dept: Apply SDT to swait
Makes SDT able to track dependencies by swait. Signed-off-by: Byungchul Park --- include/linux/swait.h | 4 kernel/sched/swait.c | 10 ++ 2 files changed, 14 insertions(+) diff --git a/include/linux/swait.h b/include/linux/swait.h index 6a8c22b..dbdf2ce 100644 --- a/include/linux/swait.h +++ b/include/linux/swait.h @@ -6,6 +6,7 @@ #include #include #include +#include #include /* @@ -43,6 +44,7 @@ struct swait_queue_head { raw_spinlock_t lock; struct list_headtask_list; + struct dept_map dmap; }; struct swait_queue { @@ -61,6 +63,7 @@ struct swait_queue { #define __SWAIT_QUEUE_HEAD_INITIALIZER(name) { \ .lock = __RAW_SPIN_LOCK_UNLOCKED(name.lock), \ .task_list = LIST_HEAD_INIT((name).task_list), \ + .dmap = DEPT_SDT_MAP_INIT(name), \ } #define DECLARE_SWAIT_QUEUE_HEAD(name) \ @@ -72,6 +75,7 @@ extern void __init_swait_queue_head(struct swait_queue_head *q, const char *name #define init_swait_queue_head(q) \ do {\ static struct lock_class_key __key; \ + sdt_map_init(&(q)->dmap); \ __init_swait_queue_head((q), #q, &__key); \ } while (0) diff --git a/kernel/sched/swait.c b/kernel/sched/swait.c index e1c655f..4ca7d6e 100644 --- a/kernel/sched/swait.c +++ b/kernel/sched/swait.c @@ -27,6 +27,7 @@ void swake_up_locked(struct swait_queue_head *q) return; curr = list_first_entry(&q->task_list, typeof(*curr), task_list); + sdt_event(&q->dmap); wake_up_process(curr->task); list_del_init(&curr->task_list); } @@ -69,6 +70,7 @@ void swake_up_all(struct swait_queue_head *q) while (!list_empty(&tmp)) { curr = list_first_entry(&tmp, typeof(*curr), task_list); + sdt_event(&q->dmap); wake_up_state(curr->task, TASK_NORMAL); list_del_init(&curr->task_list); @@ -97,6 +99,9 @@ void prepare_to_swait_exclusive(struct swait_queue_head *q, struct swait_queue * __prepare_to_swait(q, wait); set_current_state(state); raw_spin_unlock_irqrestore(&q->lock, flags); + + if (state & TASK_NORMAL) + sdt_wait_prepare(&q->dmap); } EXPORT_SYMBOL(prepare_to_swait_exclusive); @@ -119,12 +124,16 @@ long 
prepare_to_swait_event(struct swait_queue_head *q, struct swait_queue *wait } raw_spin_unlock_irqrestore(&q->lock, flags); + if (!ret && state & TASK_NORMAL) + sdt_wait_prepare(&q->dmap); + return ret; } EXPORT_SYMBOL(prepare_to_swait_event); void __finish_swait(struct swait_queue_head *q, struct swait_queue *wait) { + sdt_wait_finish(); __set_current_state(TASK_RUNNING); if (!list_empty(&wait->task_list)) list_del_init(&wait->task_list); @@ -134,6 +143,7 @@ void finish_swait(struct swait_queue_head *q, struct swait_queue *wait) { unsigned long flags; + sdt_wait_finish(); __set_current_state(TASK_RUNNING); if (!list_empty_careful(&wait->task_list)) { -- 1.9.1
[PATCH v4 16/24] locking/lockdep, cpu/hotplug: Use a weaker annotation in AP thread
cb92173d1f0 (locking/lockdep, cpu/hotplug: Annotate AP thread) was introduced to make lockdep_assert_cpus_held() work in the AP thread. However, the annotation is too strong for that purpose; nothing more than a trylock annotation is needed. Furthermore, now that Dept has been introduced, false positive alarms were reported because of it. Replace it with a trylock annotation.

Signed-off-by: Byungchul Park
---
 kernel/cpu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/cpu.c b/kernel/cpu.c
index 407a256..1f92a42 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -355,7 +355,7 @@ int lockdep_is_cpus_held(void)
 
 static void lockdep_acquire_cpus_lock(void)
 {
-	rwsem_acquire(&cpu_hotplug_lock.dep_map, 0, 0, _THIS_IP_);
+	rwsem_acquire(&cpu_hotplug_lock.dep_map, 0, 1, _THIS_IP_);
 }
 
 static void lockdep_release_cpus_lock(void)
-- 
1.9.1
[PATCH v4 02/24] dept: Implement Dept(Dependency Tracker)
CURRENT STATUS
--------------
Lockdep tracks the acquisition order of locks in order to detect deadlock, and tracks IRQ state and IRQ enable/disable state as well to take accidental acquisitions into account. Lockdep should be turned off once it detects and reports a deadlock, since its data structure and algorithm are not reusable after detection because of the complex design.

PROBLEM
-------
*Waits* and their *events* that are never reached eventually cause deadlock. However, Lockdep is only interested in lock acquisition order, forcing it to emulate lock acquisition even for plain waits and events that have nothing to do with real locks. Even worse, no one likes Lockdep's false positive detections, because each one prevents a further detection that might be more valuable. That's why all kernel developers are sensitive to Lockdep's false positives.

Besides that, by tracking acquisition order, it cannot correctly deal with read locks and cross-events, e.g. wait_for_completion()/complete(), for deadlock detection. Lockdep is no longer a good tool for that purpose.

SOLUTION
--------
Again, *waits* and their *events* that are never reached eventually cause deadlock. The new solution, Dept (DEPendency Tracker), focuses on waits and events themselves. Dept tracks waits and events and reports it if any event would never be reachable.

Dept does:
   . Work with read locks in the right way.
   . Work with any wait and event, i.e. cross-events.
   . Continue to work even after reporting multiple times.
   . Provide simple and intuitive APIs.
   . Do exactly what a dependency checker should do.

Q & A
-----
Q. Is this the first try ever to address the problem?
A. No. The cross-release feature (b09be676e0ff2 locking/lockdep: Implement the 'crossrelease' feature) addressed it 2 years ago as a Lockdep extension. It was merged but reverted shortly after, because cross-release started to report valuable hidden problems but gave false positive reports as well.
For sure, no one likes Lockdep's false positive reports, since they make Lockdep stop, preventing it from reporting further real problems.

Q. Why wasn't Dept developed as an extension of Lockdep?
A. Lockdep definitely includes all the efforts great developers have made for a long time, so it is quite stable. But I had to design and implement Dept anew because of the following:

   1) Lockdep was designed to track lock acquisition order. The APIs and implementation do not fit the wait-event model.
   2) Lockdep is turned off on detection, including on false positives, which is terrible and prevents developing any extension for stronger detection.

Q. Do you intend to totally replace Lockdep?
A. No. Lockdep also checks whether lock usage is correct. Of course, the dependency check routine should be replaced, but the other functions should still be there.

Q. Do you mean the dependency check routine should be replaced right away?
A. No. I admit Lockdep is stable enough thanks to the great efforts kernel developers have made. Lockdep and Dept should both be in the kernel until Dept is considered stable.

Q. Stronger detection capability would give more false positive reports, which was a big problem when cross-release was introduced. Is that ok with Dept?
A. It's ok. Dept allows multiple reports thanks to its simple and quite generalized design. Of course, false positive reports should be fixed anyway, but they are no longer as critical a problem as they once were.
Signed-off-by: Byungchul Park --- include/linux/dept.h| 481 include/linux/dept_sdt.h| 62 + include/linux/hardirq.h |3 + include/linux/irqflags.h| 33 +- include/linux/sched.h |7 + init/init_task.c|2 + init/main.c |2 + kernel/Makefile |1 + kernel/dependency/Makefile |3 + kernel/dependency/dept.c| 2536 +++ kernel/dependency/dept_hash.h | 10 + kernel/dependency/dept_object.h | 13 + kernel/exit.c |1 + kernel/fork.c |2 + kernel/module.c |2 + kernel/sched/core.c |3 + kernel/softirq.c|6 +- kernel/trace/trace_preemptirq.c | 19 +- lib/Kconfig.debug | 20 + 19 files changed, 3197 insertions(+), 9 deletions(-) create mode 100644 include/linux/dept.h create mode 100644 include/linux/dept_sdt.h create mode 100644 kernel/dependency/Makefile create mode 100644 kernel/dependency/dept.c create mode 100644 kernel/dependency/dept_hash.h create mode 100644 kernel/dependency/dept_object.h diff --git a/include/linux/dept.h b/include/linux/dept.h new file mode 100644 index 000..c3fb3cf --- /dev/null +++ b/include/linux/dept.h @@ -0,0 +1,481 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * DEPT(DEPendency Tracker) - runtime dependency tracker + * + * Started by Byungchul Park : + * + * Copyright (c
[PATCH v4 12/24] dept: Introduce split map concept and new APIs for them
There is a case where all maps used for a type of wait/event is so large in size. For instance, struct page can be a type for (un)lock_page(). The additional memory size for the maps would be 'the # of pages * sizeof(struct dept_map)' if each struct page keeps its map all the way, which might be too big to accept in some system. It'd better to have split map, one is for each instance and the other is for the type which is commonly used, and new APIs using them. So introduced split map and new APIs for them. Signed-off-by: Byungchul Park --- include/linux/dept.h | 80 ++- kernel/dependency/dept.c | 122 +++ 2 files changed, 180 insertions(+), 22 deletions(-) diff --git a/include/linux/dept.h b/include/linux/dept.h index c0bbb8e..e2d4aea 100644 --- a/include/linux/dept.h +++ b/include/linux/dept.h @@ -362,6 +362,30 @@ struct dept_map { boolnocheck; }; +struct dept_map_each { + /* +* wait timestamp associated to this map +*/ + unsigned int wgen; +}; + +struct dept_map_common { + const char *name; + struct dept_key *keys; + int sub_usr; + + /* +* It's local copy for fast acces to the associated classes. And +* Also used for dept_key instance for statically defined map. 
+*/ + struct dept_key keys_local; + + /* +* whether this map should be going to be checked or not +*/ + bool nocheck; +}; + struct dept_task { /* * all event contexts that have entered and before exiting @@ -451,6 +475,11 @@ struct dept_task { extern void dept_ecxt_exit(struct dept_map *m, unsigned long ip); extern void dept_skip(struct dept_map *m); extern bool dept_unskip_if_skipped(struct dept_map *m); +extern void dept_split_map_each_init(struct dept_map_each *me); +extern void dept_split_map_common_init(struct dept_map_common *mc, struct dept_key *k, const char *n); +extern void dept_wait_split_map(struct dept_map_each *me, struct dept_map_common *mc, unsigned long ip, const char *w_fn, int ne); +extern void dept_event_split_map(struct dept_map_each *me, struct dept_map_common *mc, unsigned long ip, const char *e_fn); +extern void dept_ask_event_split_map(struct dept_map_each *me, struct dept_map_common *mc); /* * for users who want to manage external keys @@ -460,31 +489,38 @@ struct dept_task { #else /* !CONFIG_DEPT */ struct dept_key { }; struct dept_map { }; +struct dept_map_each{ }; +struct dept_map_commmon { }; struct dept_task { }; #define DEPT_TASK_INITIALIZER(t) -#define dept_on() do { } while (0) -#define dept_off() do { } while (0) -#define dept_init()do { } while (0) -#define dept_task_init(t) do { } while (0) -#define dept_task_exit(t) do { } while (0) -#define dept_free_range(s, sz) do { } while (0) -#define dept_map_init(m, k, s, n) do { (void)(n); (void)(k); } while (0) -#define dept_map_reinit(m) do { } while (0) -#define dept_map_nocheck(m)do { } while (0) - -#define dept_wait(m, w_f, ip, w_fn, ne)do { (void)(w_fn); } while (0) -#define dept_stage_wait(m, w_f, w_fn, ne) do { (void)(w_fn); } while (0) -#define dept_ask_event_wait_commit(ip) do { } while (0) -#define dept_clean_stage() do { } while (0) -#define dept_ecxt_enter(m, e_f, ip, c_fn, e_fn, ne) do { (void)(c_fn); (void)(e_fn); } while (0) -#define dept_ask_event(m) do { } while (0) 
-#define dept_event(m, e_f, ip, e_fn) do { (void)(e_fn); } while (0) -#define dept_ecxt_exit(m, ip) do { } while (0) -#define dept_skip(m) do { } while (0) -#define dept_unskip_if_skipped(m) (false) -#define dept_key_init(k) do { (void)(k); } while (0) -#define dept_key_destroy(k)do { (void)(k); } while (0) +#define dept_on() do { } while (0) +#define dept_off() do { } while (0) +#define dept_init()do { } while (0) +#define dept_task_init(t) do { } while (0) +#define dept_task_exit(t) do { } while (0) +#define dept_free_range(s, sz) do { } while (0) +#define dept_map_init(m, k, s, n) do { (void)(n); (void)(k); } while (0) +#define dept_map_reinit(m) do { } while (0) +#define dept_map_nocheck(m)do { } while (0) + +#define dept_wait(m, w_f, ip, w_fn, ne)do { (void)(w_fn); } while (0) +#define dept_stage_wait(m, w_f, w_fn, ne) do { (void)(w_fn); }
[PATCH v4 11/24] dept: Add proc knobs to show stats and dependency graph
It'd be useful to show Dept internal stats and dependency graph on runtime via proc for better information. Introduced the knobs. Signed-off-by: Byungchul Park --- kernel/dependency/Makefile| 1 + kernel/dependency/dept.c | 24 -- kernel/dependency/dept_internal.h | 26 +++ kernel/dependency/dept_proc.c | 92 +++ 4 files changed, 128 insertions(+), 15 deletions(-) create mode 100644 kernel/dependency/dept_internal.h create mode 100644 kernel/dependency/dept_proc.c diff --git a/kernel/dependency/Makefile b/kernel/dependency/Makefile index b5cfb8a..92f1654 100644 --- a/kernel/dependency/Makefile +++ b/kernel/dependency/Makefile @@ -1,3 +1,4 @@ # SPDX-License-Identifier: GPL-2.0 obj-$(CONFIG_DEPT) += dept.o +obj-$(CONFIG_DEPT) += dept_proc.o diff --git a/kernel/dependency/dept.c b/kernel/dependency/dept.c index 3f22c5b..4142c78 100644 --- a/kernel/dependency/dept.c +++ b/kernel/dependency/dept.c @@ -73,6 +73,7 @@ #include #include #include +#include "dept_internal.h" static int dept_stop; static int dept_per_cpu_ready; @@ -233,20 +234,13 @@ static inline struct dept_task *dept_task(void) * have been freed will be placed. */ -enum object_t { -#define OBJECT(id, nr) OBJECT_##id, - #include "dept_object.h" -#undef OBJECT - OBJECT_NR, -}; - #define OBJECT(id, nr) \ static struct dept_##id spool_##id[nr]; \ static DEFINE_PER_CPU(struct llist_head, lpool_##id); #include "dept_object.h" #undef OBJECT -static struct dept_pool pool[OBJECT_NR] = { +struct dept_pool dept_pool[OBJECT_NR] = { #define OBJECT(id, nr) { \ .name = #id,\ .obj_sz = sizeof(struct dept_##id), \ @@ -276,7 +270,7 @@ static void *from_pool(enum object_t t) if (DEPT_WARN_ON(!irqs_disabled())) return NULL; - p = &pool[t]; + p = &dept_pool[t]; /* * Try local pool first. 
@@ -306,7 +300,7 @@ static void *from_pool(enum object_t t) static void to_pool(void *o, enum object_t t) { - struct dept_pool *p = &pool[t]; + struct dept_pool *p = &dept_pool[t]; struct llist_head *h; preempt_disable(); @@ -1986,7 +1980,7 @@ void dept_map_nocheck(struct dept_map *m) } EXPORT_SYMBOL_GPL(dept_map_nocheck); -static LIST_HEAD(classes); +LIST_HEAD(dept_classes); static inline bool within(const void *addr, void *start, unsigned long size) { @@ -2013,7 +2007,7 @@ void dept_free_range(void *start, unsigned int sz) while (unlikely(!dept_lock())) cpu_relax(); - list_for_each_entry_safe(c, n, &classes, all_node) { + list_for_each_entry_safe(c, n, &dept_classes, all_node) { if (!within((void *)c->key, start, sz) && !within(c->name, start, sz)) continue; @@ -2082,7 +2076,7 @@ static struct dept_class *check_new_class(struct dept_key *local, c->sub = sub; c->key = (unsigned long)(k->subkeys + sub); hash_add_class(c); - list_add(&c->all_node, &classes); + list_add(&c->all_node, &dept_classes); unlock: dept_unlock(); caching: @@ -2537,8 +2531,8 @@ static void migrate_per_cpu_pool(void) struct llist_head *from; struct llist_head *to; - from = &pool[i].boot_pool; - to = per_cpu_ptr(pool[i].lpool, boot_cpu); + from = &dept_pool[i].boot_pool; + to = per_cpu_ptr(dept_pool[i].lpool, boot_cpu); move_llist(to, from); } } diff --git a/kernel/dependency/dept_internal.h b/kernel/dependency/dept_internal.h new file mode 100644 index 000..007c1ee --- /dev/null +++ b/kernel/dependency/dept_internal.h @@ -0,0 +1,26 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Dept(DEPendency Tracker) - runtime dependency tracker internal header + * + * Started by Byungchul Park : + * + * Copyright (c) 2020 LG Electronics, Inc., Byungchul Park + */ + +#ifndef __DEPT_INTERNAL_H +#define __DEPT_INTERNAL_H + +#ifdef CONFIG_DEPT + +enum object_t { +#define OBJECT(id, nr) OBJECT_##id, + #include "dept_object.h" +#undef OBJECT + OBJECT_NR, +}; + +extern struct list_head dept_classes; +extern 
struct dept_pool dept_pool[]; + +#endif +#endif /* __DEPT_INTERNAL_H */ diff --git a/kernel/dependency/dept_proc.c b/kernel/dependency/dept_proc.c new file mode 100644 index 000..c069354 --- /dev/null +++ b/kernel/dependency/dept_proc.c @@ -0,0 +1,92 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Procfs knobs for Dept(DEPendency Tracker) + * + * Started by Byungchul Park : + * + * Copyright (C) 2021 LG Electronics, Inc. , Byungchul Park + */ +#include +#include +#include +#include "dept_inte
[PATCH v4 10/24] dept: Apply Dept to rwsem
Makes Dept able to track dependencies by rwsem. Signed-off-by: Byungchul Park --- include/linux/lockdep.h | 24 include/linux/percpu-rwsem.h | 10 +- include/linux/rwsem.h| 33 + 3 files changed, 62 insertions(+), 5 deletions(-) diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h index b93a707..37af50c 100644 --- a/include/linux/lockdep.h +++ b/include/linux/lockdep.h @@ -646,10 +646,26 @@ static inline void print_irqtrace_events(struct task_struct *curr) dept_mutex_unlock(&(l)->dmap, i); \ } while (0) -#define rwsem_acquire(l, s, t, i) lock_acquire_exclusive(l, s, t, NULL, i) -#define rwsem_acquire_nest(l, s, t, n, i) lock_acquire_exclusive(l, s, t, n, i) -#define rwsem_acquire_read(l, s, t, i) lock_acquire_shared(l, s, t, NULL, i) -#define rwsem_release(l, i)lock_release(l, i) +#define rwsem_acquire(l, s, t, i) \ +do { \ + lock_acquire_exclusive(l, s, t, NULL, i); \ + dept_rwsem_lock(&(l)->dmap, s, t, NULL, "up_write", i); \ +} while (0) +#define rwsem_acquire_nest(l, s, t, n, i) \ +do { \ + lock_acquire_exclusive(l, s, t, n, i); \ + dept_rwsem_lock(&(l)->dmap, s, t, (n) ? 
&(n)->dmap : NULL, "up_write", i);\ +} while (0) +#define rwsem_acquire_read(l, s, t, i) \ +do { \ + lock_acquire_shared(l, s, t, NULL, i); \ + dept_rwsem_lock(&(l)->dmap, s, t, NULL, "up_read", i); \ +} while (0) +#define rwsem_release(l, i)\ +do { \ + lock_release(l, i); \ + dept_rwsem_unlock(&(l)->dmap, i); \ +} while (0) #define lock_map_acquire(l)lock_acquire_exclusive(l, 0, 0, NULL, _THIS_IP_) #define lock_map_acquire_read(l) lock_acquire_shared_recursive(l, 0, 0, NULL, _THIS_IP_) diff --git a/include/linux/percpu-rwsem.h b/include/linux/percpu-rwsem.h index 5fda40f..ac2b1a5 100644 --- a/include/linux/percpu-rwsem.h +++ b/include/linux/percpu-rwsem.h @@ -20,8 +20,16 @@ struct percpu_rw_semaphore { #endif }; +#ifdef CONFIG_DEPT +#define __PERCPU_RWSEM_DMAP_INIT(lockname) .dmap = { .name = #lockname, .skip_cnt = ATOMIC_INIT(0) } +#else +#define __PERCPU_RWSEM_DMAP_INIT(lockname) +#endif + #ifdef CONFIG_DEBUG_LOCK_ALLOC -#define __PERCPU_RWSEM_DEP_MAP_INIT(lockname) .dep_map = { .name = #lockname }, +#define __PERCPU_RWSEM_DEP_MAP_INIT(lockname) .dep_map = {\ + .name = #lockname, \ + __PERCPU_RWSEM_DMAP_INIT(lockname) }, #else #define __PERCPU_RWSEM_DEP_MAP_INIT(lockname) #endif diff --git a/include/linux/rwsem.h b/include/linux/rwsem.h index f934876..dc7977a 100644 --- a/include/linux/rwsem.h +++ b/include/linux/rwsem.h @@ -16,11 +16,18 @@ #include #include +#ifdef CONFIG_DEPT +# define RWSEM_DMAP_INIT(lockname) .dmap = { .name = #lockname, .skip_cnt = ATOMIC_INIT(0) }, +#else +# define RWSEM_DMAP_INIT(lockname) +#endif + #ifdef CONFIG_DEBUG_LOCK_ALLOC # define __RWSEM_DEP_MAP_INIT(lockname)\ .dep_map = {\ .name = #lockname, \ .wait_type_inner = LD_WAIT_SLEEP, \ + RWSEM_DMAP_INIT(lockname) \ }, #else # define __RWSEM_DEP_MAP_INIT(lockname) @@ -32,6 +39,32 @@ #include #endif +#ifdef CONFIG_DEPT +#define dept_rwsem_lock(m, ne, t, n, e_fn, ip) \ +do { \ + if (t) {\ + dept_ecxt_enter(m, 1UL, ip, __func__, e_fn, ne);\ + dept_ask_event(m); \ + } else if (n) { \ + 
dept_skip(m); \ + } else {\ + dept_wait(m, 1UL, ip, __func__, ne);\ + dept_ecxt_enter(m, 1UL, ip, __func__, e_fn, ne);\ + dept_ask_event(m); \ + } \ +} while (0) +#define dept_rwsem_unlock(m, ip) \ +do {
[PATCH v4 03/24] dept: Embed Dept data in Lockdep
Dept should work independently from Lockdep. However, there's no choise but to rely on Lockdep code and its instances for now. Signed-off-by: Byungchul Park --- include/linux/lockdep.h | 71 --- include/linux/lockdep_types.h | 3 ++ kernel/locking/lockdep.c | 12 3 files changed, 76 insertions(+), 10 deletions(-) diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h index 467b942..c56f6b6 100644 --- a/include/linux/lockdep.h +++ b/include/linux/lockdep.h @@ -20,6 +20,33 @@ extern int prove_locking; extern int lock_stat; +#ifdef CONFIG_DEPT +static inline void dept_after_copy_map(struct dept_map *to, + struct dept_map *from) +{ + int i; + + if (from->keys == &from->keys_local) + to->keys = &to->keys_local; + + if (!to->keys) + return; + + /* +* Since the class cache can be modified concurrently we could observe +* half pointers (64bit arch using 32bit copy insns). Therefore clear +* the caches and take the performance hit. +* +* XXX it doesn't work well with lockdep_set_class_and_subclass(), since +* that relies on cache abuse. 
+*/ + for (i = 0; i < DEPT_MAX_SUBCLASSES_CACHE; i++) + to->keys->classes[i] = NULL; +} +#else +#define dept_after_copy_map(t, f) do { } while (0) +#endif + #ifdef CONFIG_LOCKDEP #include @@ -43,6 +70,8 @@ static inline void lockdep_copy_map(struct lockdep_map *to, */ for (i = 0; i < NR_LOCKDEP_CACHING_CLASSES; i++) to->class_cache[i] = NULL; + + dept_after_copy_map(&to->dmap, &from->dmap); } /* @@ -176,8 +205,19 @@ struct held_lock { current->lockdep_recursion -= LOCKDEP_OFF; \ } while (0) -extern void lockdep_register_key(struct lock_class_key *key); -extern void lockdep_unregister_key(struct lock_class_key *key); +extern void __lockdep_register_key(struct lock_class_key *key); +extern void __lockdep_unregister_key(struct lock_class_key *key); + +#define lockdep_register_key(k)\ +do { \ + __lockdep_register_key(k); \ + dept_key_init(&(k)->dkey); \ +} while (0) +#define lockdep_unregister_key(k) \ +do { \ + __lockdep_unregister_key(k);\ + dept_key_destroy(&(k)->dkey); \ +} while (0) /* * These methods are used by specific locking variants (spinlocks, @@ -185,9 +225,18 @@ struct held_lock { * to lockdep: */ -extern void lockdep_init_map_type(struct lockdep_map *lock, const char *name, +extern void __lockdep_init_map_type(struct lockdep_map *lock, const char *name, struct lock_class_key *key, int subclass, u8 inner, u8 outer, u8 lock_type); +#define lockdep_init_map_type(l, n, k, s, i, o, t) \ +do { \ + __lockdep_init_map_type(l, n, k, s, i, o, t); \ + if ((k) == &__lockdep_no_validate__)\ + dept_map_nocheck(&(l)->dmap); \ + else\ + dept_map_init(&(l)->dmap, &(k)->dkey, s, n);\ +} while (0) + static inline void lockdep_init_map_waits(struct lockdep_map *lock, const char *name, struct lock_class_key *key, int subclass, u8 inner, u8 outer) @@ -431,13 +480,27 @@ enum xhlock_context_t { XHLOCK_CTX_NR, }; +#ifdef CONFIG_DEPT +/* + * TODO: I found the case to use an address of other than a real key as + * _key, for instance, in workqueue. 
So for now, we cannot use the + * assignment like '.dmap.keys = &(_key)->dkey' unless it's fixed. + */ +#define STATIC_DEPT_MAP_INIT(_name, _key) .dmap = {\ + .name = (_name),\ + .keys = NULL }, +#else +#define STATIC_DEPT_MAP_INIT(_name, _key) +#endif + #define lockdep_init_map_crosslock(m, n, k, s) do {} while (0) /* * To initialize a lockdep_map statically use this macro. * Note that _name must not be NULL. */ #define STATIC_LOCKDEP_MAP_INIT(_name, _key) \ - { .name = (_name), .key = (void *)(_key), } + { .name = (_name), .key = (void *)(_key), \ + STATIC_DEPT_MAP_INIT(_name, _key) } static inline void lockdep_invariant_state(bool force) {} static inline void lockdep_free_task(struct task_struct *task) {} diff --git a/include/linux/lockdep_types.h b/include/linux/lockdep_types.h index d224308..50c8879 100644 --- a/include/linux/lockdep_types.h +++ b/include/linux/lockdep_types.h @@ -11,6 +11,7 @@ #define __LINUX_LOCKDEP_TYPES_H #include +#include #define MAX_LOCKDEP_SUBCLASSES 8UL @@ -76,6 +77,7 @@ struct
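The patch's core trick is mechanical: rename the original Lockdep function with a double-underscore prefix and re-expose the old name as a macro that calls the original plus the Dept hook. A minimal userspace sketch of that wrapping pattern (all names here are illustrative stand-ins, not the kernel API):

```c
/* Counters let us observe that both halves of the wrapper run. */
static int lockdep_calls, dept_calls;

/* Stand-ins for __lockdep_register_key() and dept_key_init(). */
static void __register_key(void *key) { (void)key; lockdep_calls++; }
static void dept_key_init_hook(void *key) { (void)key; dept_calls++; }

/* Same shape as the patch: the public name becomes a macro that
 * invokes the renamed original and then the extra Dept hook. */
#define register_key(k)            \
do {                               \
	__register_key(k);         \
	dept_key_init_hook(k);     \
} while (0)

/* Exercise the wrapper once; returns 1 if both hooks fired. */
static int exercise(void)
{
	int key;

	register_key(&key);
	return lockdep_calls == 1 && dept_calls == 1;
}
```

Because the wrapper is a macro rather than a function, existing callers pick up the extra hook with no source changes, which is why the patch can leave every lockdep_register_key() call site untouched.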
[PATCH v4 17/24] dept: Distinguish each syscall context from another
It enters kernel mode on each syscall and each syscall handling should be considered independently from the point of view of Dept. Otherwise, Dept may wrongly track dependencies across different syscalls. That might be a real dependency from user mode. However, now that Dept just started to work, conservatively let Dept not track dependencies across different syscalls. Signed-off-by: Byungchul Park --- include/linux/dept.h | 39 kernel/dependency/dept.c | 67 kernel/entry/common.c| 3 +++ 3 files changed, 60 insertions(+), 49 deletions(-) diff --git a/include/linux/dept.h b/include/linux/dept.h index e2d4aea..1a1c307 100644 --- a/include/linux/dept.h +++ b/include/linux/dept.h @@ -25,11 +25,16 @@ #define DEPT_MAX_SUBCLASSES_USR(DEPT_MAX_SUBCLASSES / DEPT_MAX_SUBCLASSES_EVT) #define DEPT_MAX_SUBCLASSES_CACHE 2 -#define DEPT_SIRQ 0 -#define DEPT_HIRQ 1 -#define DEPT_IRQS_NR 2 -#define DEPT_SIRQF (1UL << DEPT_SIRQ) -#define DEPT_HIRQF (1UL << DEPT_HIRQ) +enum { + DEPT_CXT_SIRQ = 0, + DEPT_CXT_HIRQ, + DEPT_CXT_IRQS_NR, + DEPT_CXT_PROCESS = DEPT_CXT_IRQS_NR, + DEPT_CXTS_NR +}; + +#define DEPT_SIRQF (1UL << DEPT_CXT_SIRQ) +#define DEPT_HIRQF (1UL << DEPT_CXT_HIRQ) struct dept_ecxt; struct dept_iecxt { @@ -95,8 +100,8 @@ struct dept_class { /* * for tracking IRQ dependencies */ - struct dept_iecxt iecxt[DEPT_IRQS_NR]; - struct dept_iwait iwait[DEPT_IRQS_NR]; + struct dept_iecxt iecxt[DEPT_CXT_IRQS_NR]; + struct dept_iwait iwait[DEPT_CXT_IRQS_NR]; }; struct dept_stack { @@ -150,8 +155,8 @@ struct dept_ecxt { /* * where the IRQ-enabled happened */ - unsigned long enirq_ip[DEPT_IRQS_NR]; - struct dept_stack *enirq_stack[DEPT_IRQS_NR]; + unsigned long enirq_ip[DEPT_CXT_IRQS_NR]; + struct dept_stack *enirq_stack[DEPT_CXT_IRQS_NR]; /* * where the event context started @@ -194,8 +199,8 @@ struct dept_wait { /* * where the IRQ wait happened */ - unsigned long irq_ip[DEPT_IRQS_NR]; - struct dept_stack *irq_stack[DEPT_IRQS_NR]; + unsigned long irq_ip[DEPT_CXT_IRQS_NR]; + struct 
dept_stack *irq_stack[DEPT_CXT_IRQS_NR]; /* * where the wait happened @@ -400,19 +405,19 @@ struct dept_task { int wait_hist_pos; /* -* sequential id to identify each IRQ context +* sequential id to identify each context */ - unsigned intirq_id[DEPT_IRQS_NR]; + unsigned intcxt_id[DEPT_CXTS_NR]; /* * for tracking IRQ-enabled points with cross-event */ - unsigned intwgen_enirq[DEPT_IRQS_NR]; + unsigned intwgen_enirq[DEPT_CXT_IRQS_NR]; /* * for keeping up-to-date IRQ-enabled points */ - unsigned long enirq_ip[DEPT_IRQS_NR]; + unsigned long enirq_ip[DEPT_CXT_IRQS_NR]; /* * current effective IRQ-enabled flag @@ -448,7 +453,7 @@ struct dept_task { .dept_task.wait_hist = { { .wait = NULL, } }, \ .dept_task.ecxt_held_pos = 0, \ .dept_task.wait_hist_pos = 0, \ - .dept_task.irq_id = { 0 }, \ + .dept_task.cxt_id = { 0 }, \ .dept_task.wgen_enirq = { 0 }, \ .dept_task.enirq_ip = { 0 },\ .dept_task.recursive = 0, \ @@ -480,6 +485,7 @@ struct dept_task { extern void dept_wait_split_map(struct dept_map_each *me, struct dept_map_common *mc, unsigned long ip, const char *w_fn, int ne); extern void dept_event_split_map(struct dept_map_each *me, struct dept_map_common *mc, unsigned long ip, const char *e_fn); extern void dept_ask_event_split_map(struct dept_map_each *me, struct dept_map_common *mc); +extern void dept_kernel_enter(void); /* * for users who want to manage external keys @@ -520,6 +526,7 @@ struct dept_task { #define dept_wait_split_map(me, mc, ip, w_fn, ne) do { } while (0) #define dept_event_split_map(me, mc, ip, e_fn) do { } while (0) #define dept_ask_event_split_map(me, mc) do { } while (0) +#define dept_kernel_enter()
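The effect of dept_kernel_enter() can be modeled with a per-task counter: every syscall entry bumps a context id, and waits recorded under different ids are never connected. A toy model of that idea (names and types are illustrative, not the kernel's):

```c
#include <stdbool.h>

/* Minimal model of the per-task cxt_id in the patch: each syscall
 * entry bumps it, so dependencies recorded under different ids are
 * treated as belonging to independent contexts. */
struct task_model { unsigned int cxt_id; };

static void kernel_enter(struct task_model *t)
{
	t->cxt_id++;		/* fresh, independent context per syscall */
}

static bool same_context(unsigned int a, unsigned int b)
{
	return a == b;
}

/* Two consecutive syscalls must land in different contexts. */
static bool contexts_differ(void)
{
	struct task_model t = { 0 };
	unsigned int first;

	kernel_enter(&t);
	first = t.cxt_id;
	kernel_enter(&t);	/* second syscall */
	return !same_context(first, t.cxt_id);
}
```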
[PATCH v4 09/24] dept: Apply Dept to seqlock
Makes Dept able to track dependencies by seqlock with adding wait annotation on read side of seqlock. Signed-off-by: Byungchul Park --- include/linux/seqlock.h | 59 - 1 file changed, 58 insertions(+), 1 deletion(-) diff --git a/include/linux/seqlock.h b/include/linux/seqlock.h index 37ded6b..6e8ecd7 100644 --- a/include/linux/seqlock.h +++ b/include/linux/seqlock.h @@ -23,6 +23,25 @@ #include +#ifdef CONFIG_DEPT +#define DEPT_EVT_ALL ((1UL << DEPT_MAX_SUBCLASSES_EVT) - 1) +#define dept_seq_wait(m, ip) dept_wait(m, DEPT_EVT_ALL, ip, __func__, 0) +#define dept_seq_writebegin(m, ip) \ +do { \ + dept_ecxt_enter(m, 1UL, ip, __func__, "write_seqcount_end", 0);\ + dept_ask_event(m); \ +} while (0) +#define dept_seq_writeend(m, ip) \ +do { \ + dept_event(m, 1UL, ip, __func__); \ + dept_ecxt_exit(m, ip); \ +} while (0) +#else +#define dept_seq_wait(m, ip) do { } while (0) +#define dept_seq_writebegin(m, ip) do { } while (0) +#define dept_seq_writeend(m, ip) do { } while (0) +#endif + /* * The seqlock seqcount_t interface does not prescribe a precise sequence of * read begin/retry/end. For readers, typically there is a call to @@ -148,7 +167,7 @@ static inline void seqcount_lockdep_reader_access(const seqcount_t *s) * This lock-unlock technique must be implemented for all of PREEMPT_RT * sleeping locks. 
See Documentation/locking/locktypes.rst */ -#if defined(CONFIG_LOCKDEP) || defined(CONFIG_PREEMPT_RT) +#if defined(CONFIG_LOCKDEP) || defined(CONFIG_DEPT) || defined(CONFIG_PREEMPT_RT) #define __SEQ_LOCK(expr) expr #else #define __SEQ_LOCK(expr) @@ -203,6 +222,22 @@ static inline void seqcount_lockdep_reader_access(const seqcount_t *s) __SEQ_LOCK(locktype *lock); \ } seqcount_##lockname##_t; \ \ +static __always_inline void\ +__seqprop_##lockname##_wait(const seqcount_##lockname##_t *s) \ +{ \ + __SEQ_LOCK(dept_seq_wait(&(lockmember)->dep_map.dmap, _RET_IP_));\ +} \ + \ +static __always_inline void\ +__seqprop_##lockname##_writebegin(const seqcount_##lockname##_t *s)\ +{ \ +} \ + \ +static __always_inline void\ +__seqprop_##lockname##_writeend(const seqcount_##lockname##_t *s) \ +{ \ +} \ + \ static __always_inline seqcount_t *\ __seqprop_##lockname##_ptr(seqcount_##lockname##_t *s) \ { \ @@ -271,6 +306,21 @@ static inline void __seqprop_assert(const seqcount_t *s) lockdep_assert_preemption_disabled(); } +static inline void __seqprop_wait(seqcount_t *s) +{ + dept_seq_wait(&s->dep_map.dmap, _RET_IP_); +} + +static inline void __seqprop_writebegin(seqcount_t *s) +{ + dept_seq_writebegin(&s->dep_map.dmap, _RET_IP_); +} + +static inline void __seqprop_writeend(seqcount_t *s) +{ + dept_seq_writeend(&s->dep_map.dmap, _RET_IP_); +} + #define __SEQ_RT IS_ENABLED(CONFIG_PREEMPT_RT) SEQCOUNT_LOCKNAME(raw_spinlock, raw_spinlock_t, false,s->lock, raw_spin, raw_spin_lock(s->lock)) @@ -311,6 +361,9 @@ static inline void __seqprop_assert(const seqcount_t *s) #define seqprop_sequence(s)__seqprop(s, sequence) #define seqprop_preemptible(s) __seqprop(s, preemptible) #define seqprop_assert(s) __seqprop(s, assert) +#define seqprop_dept_wait(s) __seqprop(s, wait) +#define seqprop_dept_writebegin(s) __seqprop(s, writebegin) +#define seqprop_dept_writeend(s) __seqprop(s, writeend) /** * __read_seqcount_begin() - begin a seqcount_t read section w/o barrier @@ -360,6 +413,7 @@ static 
inline void __seqprop_assert(const seqcount_t *s) #define read_seqcount_begin(s)
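The write-side annotations come in balanced pairs: dept_seq_writebegin() opens an event context and asks for the event, and dept_seq_writeend() fires the event and closes the context. A toy model of that pairing, using counters in place of real Dept state (a sketch, not the actual implementation):

```c
#include <stdbool.h>

/* Counters standing in for Dept's event-context bookkeeping. */
static int ecxt_depth, events;

/* dept_ecxt_enter() + dept_ask_event() */
static void seq_writebegin(void) { ecxt_depth++; }

/* dept_event() + dept_ecxt_exit() */
static void seq_writeend(void) { events++; ecxt_depth--; }

/* One write section must fire exactly one event and leave the
 * context balanced, mirroring the begin/end macro pair. */
static bool balanced_after_one_write(void)
{
	seq_writebegin();
	seq_writeend();
	return ecxt_depth == 0 && events == 1;
}
```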
[PATCH v4 13/24] dept: Apply Dept to wait/event of PG_{locked, writeback}
Makes Dept able to track dependencies by PG_{locked,writeback}. For instance, (un)lock_page() generates that type of dependency. Signed-off-by: Byungchul Park --- include/linux/dept_page.h | 78 + include/linux/page-flags.h | 45 ++-- include/linux/pagemap.h | 7 +++- init/main.c | 2 ++ kernel/dependency/dept_object.h | 2 +- lib/Kconfig.debug | 1 + mm/filemap.c| 68 +++ mm/page_ext.c | 5 +++ 8 files changed, 204 insertions(+), 4 deletions(-) create mode 100644 include/linux/dept_page.h diff --git a/include/linux/dept_page.h b/include/linux/dept_page.h new file mode 100644 index 000..d2d093d --- /dev/null +++ b/include/linux/dept_page.h @@ -0,0 +1,78 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef __LINUX_DEPT_PAGE_H +#define __LINUX_DEPT_PAGE_H + +#ifdef CONFIG_DEPT +#include + +extern struct page_ext_operations dept_pglocked_ops; +extern struct page_ext_operations dept_pgwriteback_ops; +extern struct dept_map_common pglocked_mc; +extern struct dept_map_common pgwriteback_mc; + +extern void dept_page_init(void); +extern struct dept_map_each *get_pglocked_me(struct page *page); +extern struct dept_map_each *get_pgwriteback_me(struct page *page); + +#define dept_pglocked_wait(f) \ +do { \ + struct dept_map_each *me = get_pglocked_me(&(f)->page); \ + \ + if (likely(me)) \ + dept_wait_split_map(me, &pglocked_mc, _RET_IP_, \ + __func__, 0); \ +} while (0) + +#define dept_pglocked_set_bit(f) \ +do { \ + struct dept_map_each *me = get_pglocked_me(&(f)->page); \ + \ + if (likely(me)) \ + dept_ask_event_split_map(me, &pglocked_mc); \ +} while (0) + +#define dept_pglocked_event(f) \ +do { \ + struct dept_map_each *me = get_pglocked_me(&(f)->page); \ + \ + if (likely(me)) \ + dept_event_split_map(me, &pglocked_mc, _RET_IP_,\ +__func__); \ +} while (0) + +#define dept_pgwriteback_wait(f) \ +do { \ + struct dept_map_each *me = get_pgwriteback_me(&(f)->page);\ + \ + if (likely(me)) \ + dept_wait_split_map(me, &pgwriteback_mc, _RET_IP_,\ + __func__, 0); \ +} while (0) + +#define 
dept_pgwriteback_set_bit(f)\ +do { \ + struct dept_map_each *me = get_pgwriteback_me(&(f)->page);\ + \ + if (likely(me)) \ + dept_ask_event_split_map(me, &pgwriteback_mc);\ +} while (0) + +#define dept_pgwriteback_event(f) \ +do { \ + struct dept_map_each *me = get_pgwriteback_me(&(f)->page);\ + \ + if (likely(me)) \ + dept_event_split_map(me, &pgwriteback_mc, _RET_IP_,\ +__func__); \ +} while (0) +#else +#define dept_page_init() do { } while (0) +#define dept_pglocked_wait(f) do { } while (0) +#define dept_pglocked_set_bit(f) do { } while (0) +#define dept_pglocked_event(f) do { } while (0) +#define dept_pgwriteback_wait(f) do { } while (0) +#define dept_pgwriteback_set_bit(f)do { } while (0) +#define dept_pgwriteback_event(f) do { } while (0) +#endif + +#endif /* __LINUX_DEPT_PAGE_H */ diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h index 1c3b6e5..066b6a5 100644 --- a/include/linux/page-flags.h +++ b/include/linux/page-flags.h @@ -411,7 +411,6 @@ static unsigned long *folio_flags(struct folio *folio, unsigned n) #define TESTSCFLAG_FALSE(uname, lname) \ TESTSETFLAG_FALSE(uname, lname) TESTCLEARFLAG_FALSE(uname, lname) -__PAGEFLAG(Locked, locked, PF_NO_TAIL
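Since struct page cannot grow, the patch keeps each page's dept_map_each in page_ext and guards every annotation with a NULL check. The lookup-then-guard pattern can be sketched with a plain side table (hypothetical types and bounds, standing in for page_ext):

```c
#include <stddef.h>

#define NPAGES 4

/* Stand-in for struct dept_map_each stored in page_ext. */
struct map_each { int waits; };
static struct map_each ext_table[NPAGES];

/* Stand-in for get_pglocked_me(): may fail, returning NULL. */
static struct map_each *get_me(int pfn)
{
	return (pfn >= 0 && pfn < NPAGES) ? &ext_table[pfn] : NULL;
}

/* Mirrors the "if (likely(me))" guard in dept_pglocked_wait(). */
static void pg_wait(int pfn)
{
	struct map_each *me = get_me(pfn);

	if (me)
		me->waits++;
}

/* In-range lookup is annotated; out-of-range is silently skipped. */
static int exercise_pg_wait(void)
{
	pg_wait(2);
	pg_wait(99);	/* no page_ext entry: no annotation */
	return ext_table[2].waits;
}
```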
[PATCH v4 15/24] dept: Apply SDT to wait(waitqueue)
Makes SDT able to track dependencies by wait(waitqueue). Signed-off-by: Byungchul Park --- include/linux/wait.h | 6 +- kernel/sched/wait.c | 16 2 files changed, 21 insertions(+), 1 deletion(-) diff --git a/include/linux/wait.h b/include/linux/wait.h index 851e07d..2133998 100644 --- a/include/linux/wait.h +++ b/include/linux/wait.h @@ -7,6 +7,7 @@ #include #include #include +#include #include #include @@ -37,6 +38,7 @@ struct wait_queue_entry { struct wait_queue_head { spinlock_t lock; struct list_headhead; + struct dept_map dmap; }; typedef struct wait_queue_head wait_queue_head_t; @@ -56,7 +58,8 @@ struct wait_queue_head { #define __WAIT_QUEUE_HEAD_INITIALIZER(name) { \ .lock = __SPIN_LOCK_UNLOCKED(name.lock), \ - .head = LIST_HEAD_INIT(name.head) } + .head = LIST_HEAD_INIT(name.head), \ + .dmap = DEPT_SDT_MAP_INIT(name) } #define DECLARE_WAIT_QUEUE_HEAD(name) \ struct wait_queue_head name = __WAIT_QUEUE_HEAD_INITIALIZER(name) @@ -67,6 +70,7 @@ struct wait_queue_head { do { \ static struct lock_class_key __key; \ \ + sdt_map_init(&(wq_head)->dmap); \ __init_waitqueue_head((wq_head), #wq_head, &__key); \ } while (0) diff --git a/kernel/sched/wait.c b/kernel/sched/wait.c index eca3810..fc5a16a 100644 --- a/kernel/sched/wait.c +++ b/kernel/sched/wait.c @@ -105,6 +105,7 @@ static int __wake_up_common(struct wait_queue_head *wq_head, unsigned int mode, if (flags & WQ_FLAG_BOOKMARK) continue; + sdt_event(&wq_head->dmap); ret = curr->func(curr, mode, wake_flags, key); if (ret < 0) break; @@ -268,6 +269,9 @@ void __wake_up_pollfree(struct wait_queue_head *wq_head) __add_wait_queue(wq_head, wq_entry); set_current_state(state); spin_unlock_irqrestore(&wq_head->lock, flags); + + if (state & TASK_NORMAL) + sdt_wait_prepare(&wq_head->dmap); } EXPORT_SYMBOL(prepare_to_wait); @@ -286,6 +290,10 @@ void __wake_up_pollfree(struct wait_queue_head *wq_head) } set_current_state(state); spin_unlock_irqrestore(&wq_head->lock, flags); + + if (state & TASK_NORMAL) + 
sdt_wait_prepare(&wq_head->dmap); + return was_empty; } EXPORT_SYMBOL(prepare_to_wait_exclusive); @@ -331,6 +339,9 @@ long prepare_to_wait_event(struct wait_queue_head *wq_head, struct wait_queue_en } spin_unlock_irqrestore(&wq_head->lock, flags); + if (!ret && state & TASK_NORMAL) + sdt_wait_prepare(&wq_head->dmap); + return ret; } EXPORT_SYMBOL(prepare_to_wait_event); @@ -352,7 +363,9 @@ int do_wait_intr(wait_queue_head_t *wq, wait_queue_entry_t *wait) return -ERESTARTSYS; spin_unlock(&wq->lock); + sdt_wait_prepare(&wq->dmap); schedule(); + sdt_wait_finish(); spin_lock(&wq->lock); return 0; @@ -369,7 +382,9 @@ int do_wait_intr_irq(wait_queue_head_t *wq, wait_queue_entry_t *wait) return -ERESTARTSYS; spin_unlock_irq(&wq->lock); + sdt_wait_prepare(&wq->dmap); schedule(); + sdt_wait_finish(); spin_lock_irq(&wq->lock); return 0; @@ -389,6 +404,7 @@ void finish_wait(struct wait_queue_head *wq_head, struct wait_queue_entry *wq_en { unsigned long flags; + sdt_wait_finish(); __set_current_state(TASK_RUNNING); /* * We can check for list emptiness outside the lock -- 1.9.1
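Note that the wait annotation in prepare_to_wait() is gated on TASK_NORMAL, so only genuine sleeps are recorded. A minimal model of that gate (the flag values mirror the kernel's definitions but are restated here as assumptions):

```c
#include <stdbool.h>

/* Assumed flag values, matching include/linux/sched.h. */
#define TASK_INTERRUPTIBLE	0x1
#define TASK_UNINTERRUPTIBLE	0x2
#define TASK_NORMAL		(TASK_INTERRUPTIBLE | TASK_UNINTERRUPTIBLE)

static int annotated;

/* Mirrors "if (state & TASK_NORMAL) sdt_wait_prepare(...)". */
static void prepare_to_wait_model(unsigned int state)
{
	if (state & TASK_NORMAL)
		annotated++;
}

/* A normal sleep is annotated; an unrelated state flag is not. */
static bool gate_works(void)
{
	prepare_to_wait_model(TASK_INTERRUPTIBLE);
	prepare_to_wait_model(0x8);	/* hypothetical non-sleep flag */
	return annotated == 1;
}
```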
[PATCH v4 04/24] dept: Add an API for skipping dependency check temporarily
Dept would skip check for dmaps marked by dept_map_nocheck() permanently. However, sometimes it needs to skip check for some dmaps temporarily and back to normal, for instance, lock acquisition with a nest lock. Lock usage check with regard to nest lock could be performed by Lockdep, however, dependency check is not necessary for that case. So prepared for it by adding two new APIs, dept_skip() and dept_unskip_if_skipped(). Signed-off-by: Byungchul Park --- include/linux/dept.h | 9 + include/linux/dept_sdt.h | 2 +- include/linux/lockdep.h | 4 +++- kernel/dependency/dept.c | 49 4 files changed, 62 insertions(+), 2 deletions(-) diff --git a/include/linux/dept.h b/include/linux/dept.h index c3fb3cf..c0bbb8e 100644 --- a/include/linux/dept.h +++ b/include/linux/dept.h @@ -352,6 +352,11 @@ struct dept_map { unsigned intwgen; /* +* for skipping dependency check temporarily +*/ + atomic_tskip_cnt; + + /* * whether this map should be going to be checked or not */ boolnocheck; @@ -444,6 +449,8 @@ struct dept_task { extern void dept_ask_event(struct dept_map *m); extern void dept_event(struct dept_map *m, unsigned long e_f, unsigned long ip, const char *e_fn); extern void dept_ecxt_exit(struct dept_map *m, unsigned long ip); +extern void dept_skip(struct dept_map *m); +extern bool dept_unskip_if_skipped(struct dept_map *m); /* * for users who want to manage external keys @@ -475,6 +482,8 @@ struct dept_task { #define dept_ask_event(m) do { } while (0) #define dept_event(m, e_f, ip, e_fn) do { (void)(e_fn); } while (0) #define dept_ecxt_exit(m, ip) do { } while (0) +#define dept_skip(m) do { } while (0) +#define dept_unskip_if_skipped(m) (false) #define dept_key_init(k) do { (void)(k); } while (0) #define dept_key_destroy(k)do { (void)(k); } while (0) #endif diff --git a/include/linux/dept_sdt.h b/include/linux/dept_sdt.h index 375c4c3..e9d558d 100644 --- a/include/linux/dept_sdt.h +++ b/include/linux/dept_sdt.h @@ -13,7 +13,7 @@ #include #ifdef CONFIG_DEPT -#define 
DEPT_SDT_MAP_INIT(dname) { .name = #dname } +#define DEPT_SDT_MAP_INIT(dname) { .name = #dname, .skip_cnt = ATOMIC_INIT(0) } /* * SDT(Single-event Dependency Tracker) APIs diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h index c56f6b6..c1a56fe 100644 --- a/include/linux/lockdep.h +++ b/include/linux/lockdep.h @@ -488,7 +488,9 @@ enum xhlock_context_t { */ #define STATIC_DEPT_MAP_INIT(_name, _key) .dmap = {\ .name = (_name),\ - .keys = NULL }, + .keys = NULL, \ + .skip_cnt = ATOMIC_INIT(0), \ + }, #else #define STATIC_DEPT_MAP_INIT(_name, _key) #endif diff --git a/kernel/dependency/dept.c b/kernel/dependency/dept.c index ec3f131..3f22c5b 100644 --- a/kernel/dependency/dept.c +++ b/kernel/dependency/dept.c @@ -1943,6 +1943,7 @@ void dept_map_init(struct dept_map *m, struct dept_key *k, int sub, m->name = n; m->wgen = 0U; m->nocheck = false; + atomic_set(&m->skip_cnt, 0); exit: dept_exit(flags); } @@ -1963,6 +1964,7 @@ void dept_map_reinit(struct dept_map *m) clean_classes_cache(&m->keys_local); m->wgen = 0U; + atomic_set(&m->skip_cnt, 0); dept_exit(flags); } @@ -2346,6 +2348,53 @@ void dept_ecxt_exit(struct dept_map *m, unsigned long ip) } EXPORT_SYMBOL_GPL(dept_ecxt_exit); +void dept_skip(struct dept_map *m) +{ + struct dept_task *dt = dept_task(); + unsigned long flags; + + if (READ_ONCE(dept_stop) || dt->recursive) + return; + + if (m->nocheck) + return; + + flags = dept_enter(); + + atomic_inc(&m->skip_cnt); + + dept_exit(flags); +} +EXPORT_SYMBOL_GPL(dept_skip); + +/* + * Return true if successfully unskip, otherwise false. 
+ */ +bool dept_unskip_if_skipped(struct dept_map *m) +{ + struct dept_task *dt = dept_task(); + unsigned long flags; + bool ret = false; + + if (READ_ONCE(dept_stop) || dt->recursive) + return false; + + if (m->nocheck) + return false; + + flags = dept_enter(); + + if (!atomic_read(&m->skip_cnt)) + goto exit; + + atomic_dec(&m->skip_cnt); + ret = true; +exit: + dept_exit(flags); + return ret; +} +EXPORT_SYMBOL_GPL(dept_unskip_if_skipped); + void dept_task_exit(struct task_struct *t) { struct dept_task *dt = &t->dept_task; -- 1.9.1
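The skip counter itself is simple: dept_skip() increments an atomic, and dept_unskip_if_skipped() decrements it only if it is non-zero, reporting whether a skip was pending. A userspace sketch using C11 atomics (without the dept_enter()/recursion guards of the real code; like the patch, the read and decrement are not atomic as a pair):

```c
#include <stdatomic.h>
#include <stdbool.h>

static atomic_int skip_cnt;

static void skip(void)
{
	atomic_fetch_add(&skip_cnt, 1);
}

/* Return true on successful unskip, otherwise false. */
static bool unskip_if_skipped(void)
{
	if (!atomic_load(&skip_cnt))
		return false;
	atomic_fetch_sub(&skip_cnt, 1);
	return true;
}

/* No pending skip -> false; after one skip() exactly one
 * unskip succeeds. */
static bool round_trip(void)
{
	if (unskip_if_skipped())
		return false;
	skip();
	return unskip_if_skipped() && !unskip_if_skipped();
}
```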
[PATCH v4 01/24] llist: Move llist_{head,node} definition to types.h
llist_head and llist_node can be used by very primitive code. For example, Dept, which tracks dependencies, uses llist in its header. To avoid a header dependency, move those definitions to types.h. Signed-off-by: Byungchul Park --- include/linux/llist.h | 8 include/linux/types.h | 8 2 files changed, 8 insertions(+), 8 deletions(-) diff --git a/include/linux/llist.h b/include/linux/llist.h index 85bda2d..99cc3c3 100644 --- a/include/linux/llist.h +++ b/include/linux/llist.h @@ -53,14 +53,6 @@ #include #include -struct llist_head { - struct llist_node *first; -}; - -struct llist_node { - struct llist_node *next; -}; - #define LLIST_HEAD_INIT(name) { NULL } #define LLIST_HEAD(name) struct llist_head name = LLIST_HEAD_INIT(name) diff --git a/include/linux/types.h b/include/linux/types.h index ac825ad..4662d6e 100644 --- a/include/linux/types.h +++ b/include/linux/types.h @@ -187,6 +187,14 @@ struct hlist_node { struct hlist_node *next, **pprev; }; +struct llist_head { + struct llist_node *first; +}; + +struct llist_node { + struct llist_node *next; +}; + struct ustat { __kernel_daddr_tf_tfree; #ifdef CONFIG_ARCH_32BIT_USTAT_F_TINODE -- 1.9.1
[PATCH v4 00/24] DEPT(Dependency Tracker)
Hi Linus and folks, I've been developing a tool for detecting deadlock possibilities by tracking wait/event rather than lock(?) acquisition order to try to cover all synchronization mechanisms. It's done on the v5.17-rc1 tag. https://github.com/lgebyungchulpark/linux-dept/commits/dept1.14_on_v5.17-rc1 Benefit: 0. Works with all lock primitives. 1. Works with wait_for_completion()/complete(). 2. Works with 'wait' on PG_locked. 3. Works with 'wait' on PG_writeback. 4. Works with swait/wakeup. 5. Works with waitqueue. 6. Multiple reports are allowed. 7. Deduplication control on multiple reports. 8. Withstands false positives thanks to 6. 9. Easy to tag any wait/event. Future work: 0. To make it more stable. 1. To separate Dept from Lockdep. 2. To improve performance in terms of time and space. 3. To use Dept as a dependency engine for Lockdep. 4. To add any missing tags of wait/event in the kernel. 5. To deduplicate stack traces. How to interpret reports: 1. E(event) in each context cannot be triggered because of the W(wait) that cannot be woken. 2. The stack trace helping find the problematic code is located in each context's detail. Thanks, Byungchul --- Changes from v3: 1. Dept shouldn't create dependencies between different depths of a class that were indicated by *_lock_nested(). Dept normally doesn't but it does once another lock class comes in. So fixed it. (feedback from Hyeonggon) 2. Dept considered a wait as a real wait once getting to __schedule() even if it has been set to TASK_RUNNING by wake up sources in advance. Fixed it so that Dept doesn't consider the case as a real wait. (feedback from Jan Kara) 3. Stop tracking dependencies with a map once the event associated with the map has been handled. Dept will start to work with the map again, on the next sleep. Changes from v2: 1. Disable Dept on bit_wait_table[] in sched/wait_bit.c reporting a lot of false positives, which is my fault. 
Wait/event for bit_wait_table[] should've been tagged in a higher layer for better work, which is a future work. (feedback from Jan Kara) 2. Disable Dept on crypto_larval's completion to prevent a false positive. Changes from v1: 1. Fix coding style and typo. (feedback from Steven) 2. Distinguish each work context from another in workqueue. 3. Skip checking lock acquisition with nest_lock, which is about correct lock usage that should be checked by Lockdep. Changes from RFC: 1. Prevent adding a wait tag at prepare_to_wait() but __schedule(). (feedback from Linus and Matthew) 2. Use try version at lockdep_acquire_cpus_lock() annotation. 3. Distinguish each syscall context from another. Byungchul Park (24): llist: Move llist_{head,node} definition to types.h dept: Implement Dept(Dependency Tracker) dept: Embed Dept data in Lockdep dept: Add a API for skipping dependency check temporarily dept: Apply Dept to spinlock dept: Apply Dept to mutex families dept: Apply Dept to rwlock dept: Apply Dept to wait_for_completion()/complete() dept: Apply Dept to seqlock dept: Apply Dept to rwsem dept: Add proc knobs to show stats and dependency graph dept: Introduce split map concept and new APIs for them dept: Apply Dept to wait/event of PG_{locked,writeback} dept: Apply SDT to swait dept: Apply SDT to wait(waitqueue) locking/lockdep, cpu/hotplus: Use a weaker annotation in AP thread dept: Distinguish each syscall context from another dept: Distinguish each work from another dept: Disable Dept within the wait_bit layer by default dept: Add nocheck version of init_completion() dept: Disable Dept on struct crypto_larval's completion for now dept: Don't create dependencies between different depths in any case dept: Let it work with real sleeps in __schedule() dept: Disable Dept on that map once it's been handled until next turn crypto/api.c |7 +- include/linux/completion.h | 50 +- include/linux/dept.h | 535 +++ include/linux/dept_page.h | 78 ++ include/linux/dept_sdt.h | 62 + 
include/linux/hardirq.h|3 + include/linux/irqflags.h | 33 +- include/linux/llist.h |8 - include/linux/lockdep.h| 158 ++- include/linux/lockdep_types.h |3 + include/linux/mutex.h | 33 + include/linux/page-flags.h | 45 +- include/linux/pagemap.h|7 +- include/linux/percpu-rwsem.h | 10 +- include/linux/rtmutex.h|7 + include/linux/rwlock.h | 52 + include/lin
[PATCH v4 07/24] dept: Apply Dept to rwlock
Makes Dept able to track dependencies by rwlock. Signed-off-by: Byungchul Park --- include/linux/lockdep.h| 25 include/linux/rwlock.h | 52 ++ include/linux/rwlock_api_smp.h | 8 +++ include/linux/rwlock_types.h | 7 ++ 4 files changed, 83 insertions(+), 9 deletions(-) diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h index 6653a4f..b93a707 100644 --- a/include/linux/lockdep.h +++ b/include/linux/lockdep.h @@ -600,16 +600,31 @@ static inline void print_irqtrace_events(struct task_struct *curr) dept_spin_unlock(&(l)->dmap, i);\ } while (0) -#define rwlock_acquire(l, s, t, i) lock_acquire_exclusive(l, s, t, NULL, i) +#define rwlock_acquire(l, s, t, i) \ +do { \ + lock_acquire_exclusive(l, s, t, NULL, i); \ + dept_rwlock_wlock(&(l)->dmap, s, t, NULL, "write_unlock", i); \ +} while (0) #define rwlock_acquire_read(l, s, t, i) \ do { \ - if (read_lock_is_recursive()) \ + if (read_lock_is_recursive()) { \ lock_acquire_shared_recursive(l, s, t, NULL, i);\ - else\ + dept_rwlock_rlock(&(l)->dmap, s, t, NULL, "read_unlock", i, 0);\ + } else {\ lock_acquire_shared(l, s, t, NULL, i); \ + dept_rwlock_rlock(&(l)->dmap, s, t, NULL, "read_unlock", i, 1);\ + } \ +} while (0) +#define rwlock_release(l, i) \ +do { \ + lock_release(l, i); \ + dept_rwlock_wunlock(&(l)->dmap, i); \ +} while (0) +#define rwlock_release_read(l, i) \ +do { \ + lock_release(l, i); \ + dept_rwlock_runlock(&(l)->dmap, i); \ } while (0) - -#define rwlock_release(l, i) lock_release(l, i) #define seqcount_acquire(l, s, t, i) lock_acquire_exclusive(l, s, t, NULL, i) #define seqcount_acquire_read(l, s, t, i) lock_acquire_shared_recursive(l, s, t, NULL, i) diff --git a/include/linux/rwlock.h b/include/linux/rwlock.h index 8f416c5..768ad9e 100644 --- a/include/linux/rwlock.h +++ b/include/linux/rwlock.h @@ -28,6 +28,58 @@ do { *(lock) = __RW_LOCK_UNLOCKED(lock); } while (0) #endif +#ifdef CONFIG_DEPT +#define DEPT_EVT_RWLOCK_R 1UL +#define DEPT_EVT_RWLOCK_W (1UL << 1) +#define DEPT_EVT_RWLOCK_RW 
(DEPT_EVT_RWLOCK_R | DEPT_EVT_RWLOCK_W) + +#define dept_rwlock_wlock(m, ne, t, n, e_fn, ip) \ +do { \ + if (t) {\ + dept_ecxt_enter(m, DEPT_EVT_RWLOCK_W, ip, __func__, e_fn, ne);\ + dept_ask_event(m); \ + } else if (n) { \ + dept_skip(m); \ + } else {\ + dept_wait(m, DEPT_EVT_RWLOCK_RW, ip, __func__, ne); \ + dept_ecxt_enter(m, DEPT_EVT_RWLOCK_W, ip, __func__, e_fn, ne);\ + dept_ask_event(m); \ + } \ +} while (0) +#define dept_rwlock_rlock(m, ne, t, n, e_fn, ip, q)\ +do { \ + if (t) {\ + dept_ecxt_enter(m, DEPT_EVT_RWLOCK_R, ip, __func__, e_fn, ne);\ + dept_ask_event(m); \ + } else if (n) { \ + dept_skip(m); \ + } else {\ + dept_wait(m, (q) ? DEPT_EVT_RWLOCK_RW : DEPT_EVT_RWLOCK_W, ip, __func__, ne);\ + dept_ecxt_enter(m, DEPT_EVT_RWLOCK_R, ip, __func__, e_fn, ne);\ + dept_ask_event(m);
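The event classes here are plain bit masks: readers fire DEPT_EVT_RWLOCK_R, writers fire DEPT_EVT_RWLOCK_W, and a waiter's mask records which classes it conflicts with — writers and non-recursive readers wait on both, recursive readers only on W. The mask arithmetic can be checked in isolation (constants copied from the patch):

```c
#include <stdbool.h>

#define EVT_R	(1UL)		/* DEPT_EVT_RWLOCK_R */
#define EVT_W	(1UL << 1)	/* DEPT_EVT_RWLOCK_W */
#define EVT_RW	(EVT_R | EVT_W)	/* DEPT_EVT_RWLOCK_RW */

/* True if an event of class 'evt' matches a wait on 'mask'. */
static bool wakes(unsigned long mask, unsigned long evt)
{
	return (mask & evt) != 0;
}
```

A waiter on EVT_RW (a writer) is matched by both reader and writer events, while a waiter on EVT_W alone (a recursive reader) ignores other readers.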
[PATCH v4 05/24] dept: Apply Dept to spinlock
Makes Dept able to track dependencies by spinlock. Signed-off-by: Byungchul Park --- include/linux/lockdep.h| 18 +++--- include/linux/spinlock.h | 26 ++ include/linux/spinlock_types_raw.h | 13 + 3 files changed, 54 insertions(+), 3 deletions(-) diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h index c1a56fe..529ea18 100644 --- a/include/linux/lockdep.h +++ b/include/linux/lockdep.h @@ -584,9 +584,21 @@ static inline void print_irqtrace_events(struct task_struct *curr) #define lock_acquire_shared(l, s, t, n, i) lock_acquire(l, s, t, 1, 1, n, i) #define lock_acquire_shared_recursive(l, s, t, n, i) lock_acquire(l, s, t, 2, 1, n, i) -#define spin_acquire(l, s, t, i) lock_acquire_exclusive(l, s, t, NULL, i) -#define spin_acquire_nest(l, s, t, n, i) lock_acquire_exclusive(l, s, t, n, i) -#define spin_release(l, i) lock_release(l, i) +#define spin_acquire(l, s, t, i) \ +do { \ + lock_acquire_exclusive(l, s, t, NULL, i); \ + dept_spin_lock(&(l)->dmap, s, t, NULL, "spin_unlock", i); \ +} while (0) +#define spin_acquire_nest(l, s, t, n, i) \ +do { \ + lock_acquire_exclusive(l, s, t, n, i); \ + dept_spin_lock(&(l)->dmap, s, t, (n) ? 
&(n)->dmap : NULL, "spin_unlock", i); \ +} while (0) +#define spin_release(l, i) \ +do { \ + lock_release(l, i); \ + dept_spin_unlock(&(l)->dmap, i);\ +} while (0) #define rwlock_acquire(l, s, t, i) lock_acquire_exclusive(l, s, t, NULL, i) #define rwlock_acquire_read(l, s, t, i) \ diff --git a/include/linux/spinlock.h b/include/linux/spinlock.h index 5c0c517..6b5c3f4 100644 --- a/include/linux/spinlock.h +++ b/include/linux/spinlock.h @@ -95,6 +95,32 @@ # include #endif +#ifdef CONFIG_DEPT +#define dept_spin_lock(m, ne, t, n, e_fn, ip) \ +do { \ + if (t) {\ + dept_ecxt_enter(m, 1UL, ip, __func__, e_fn, ne);\ + dept_ask_event(m); \ + } else if (n) { \ + dept_skip(m); \ + } else {\ + dept_wait(m, 1UL, ip, __func__, ne);\ + dept_ecxt_enter(m, 1UL, ip, __func__, e_fn, ne);\ + dept_ask_event(m); \ + } \ +} while (0) +#define dept_spin_unlock(m, ip) \ +do { \ + if (!dept_unskip_if_skipped(m)) { \ + dept_event(m, 1UL, ip, __func__); \ + dept_ecxt_exit(m, ip); \ + } \ +} while (0) +#else +#define dept_spin_lock(m, ne, t, n, e_fn, ip) do { } while (0) +#define dept_spin_unlock(m, ip)do { } while (0) +#endif + #ifdef CONFIG_DEBUG_SPINLOCK extern void __raw_spin_lock_init(raw_spinlock_t *lock, const char *name, struct lock_class_key *key, short inner); diff --git a/include/linux/spinlock_types_raw.h b/include/linux/spinlock_types_raw.h index 91cb36b..279e821 100644 --- a/include/linux/spinlock_types_raw.h +++ b/include/linux/spinlock_types_raw.h @@ -26,16 +26,28 @@ #define SPINLOCK_OWNER_INIT((void *)-1L) +#ifdef CONFIG_DEPT +# define RAW_SPIN_DMAP_INIT(lockname) .dmap = { .name = #lockname, .skip_cnt = ATOMIC_INIT(0) }, +# define SPIN_DMAP_INIT(lockname) .dmap = { .name = #lockname, .skip_cnt = ATOMIC_INIT(0) }, +# define LOCAL_SPIN_DMAP_INIT(lockname).dmap = { .name = #lockname, .skip_cnt = ATOMIC_INIT(0) }, +#else +# define RAW_SPIN_DMAP_INIT(lockname) +# define SPIN_DMAP_INIT(lockname) +# define LOCAL_SPIN_DMAP_INIT(lockname) +#endif + #ifdef CONFIG_DEBUG_LOCK_ALLOC # 
define RAW_SPIN_DEP_MAP_INIT(lockname) \ .dep_map = {\ .name = #lockname, \ .wait_type_inner =
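dept_spin_lock() branches three ways: a successful trylock records only the event context (trylock never blocks, so there is no wait), an acquisition under a nest lock is skipped, and a plain lock records a wait followed by the event context. A counter-based model of those branches (illustrative only, with counters in place of the real dept_* calls):

```c
#include <stdbool.h>

static int waits, ecxts, skips;

static void spin_lock_annotate(bool trylock, bool nested)
{
	if (trylock) {
		ecxts++;	/* dept_ecxt_enter + dept_ask_event */
	} else if (nested) {
		skips++;	/* dept_skip */
	} else {
		waits++;	/* dept_wait: this path can block */
		ecxts++;	/* dept_ecxt_enter + dept_ask_event */
	}
}

/* Exercise each branch once and check the tallies. */
static bool branches_ok(void)
{
	spin_lock_annotate(true, false);	/* trylock: no wait */
	spin_lock_annotate(false, true);	/* nest lock: skipped */
	spin_lock_annotate(false, false);	/* plain lock */
	return waits == 1 && ecxts == 2 && skips == 1;
}
```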
[PATCH v1 1/2] drm/aspeed: Add gfx flags and clock selection for AST2600
Add clock selection code for AST2600. On AST2600, the user can select more than one display timing. Add gfx flags for future usage. Signed-off-by: Tommy Haung --- drivers/gpu/drm/aspeed/aspeed_gfx.h | 11 +++ drivers/gpu/drm/aspeed/aspeed_gfx_crtc.c | 14 ++ drivers/gpu/drm/aspeed/aspeed_gfx_drv.c | 4 3 files changed, 29 insertions(+) diff --git a/drivers/gpu/drm/aspeed/aspeed_gfx.h b/drivers/gpu/drm/aspeed/aspeed_gfx.h index 4e6a442c3886..eb4c267cde5e 100644 --- a/drivers/gpu/drm/aspeed/aspeed_gfx.h +++ b/drivers/gpu/drm/aspeed/aspeed_gfx.h @@ -16,6 +16,7 @@ struct aspeed_gfx { u32 vga_scratch_reg; u32 throd_val; u32 scan_line_max; + u32 flags; struct drm_simple_display_pipe pipe; struct drm_connectorconnector; @@ -106,3 +107,13 @@ int aspeed_gfx_create_output(struct drm_device *drm); /* CRT_THROD */ #define CRT_THROD_LOW(x) (x) #define CRT_THROD_HIGH(x) ((x) << 8) + +/* SCU control */ +#define SCU_G6_CLK_COURCE 0x300 + +/* GFX FLAGS */ +#define CLK_MASK BIT(0) +#define CLK_G6 BIT(0) + +#define G6_CLK_MASK(BIT(8) | BIT(9) | BIT(10)) +#define G6_USB_40_CLK BIT(9) diff --git a/drivers/gpu/drm/aspeed/aspeed_gfx_crtc.c b/drivers/gpu/drm/aspeed/aspeed_gfx_crtc.c index 827e62c1daba..a24fab22eac4 100644 --- a/drivers/gpu/drm/aspeed/aspeed_gfx_crtc.c +++ b/drivers/gpu/drm/aspeed/aspeed_gfx_crtc.c @@ -77,6 +77,18 @@ static void aspeed_gfx_disable_controller(struct aspeed_gfx *priv) regmap_update_bits(priv->scu, priv->dac_reg, BIT(16), 0); } +static void aspeed_gfx_set_clk(struct aspeed_gfx *priv) +{ + switch (priv->flags & CLK_MASK) { + case CLK_G6: + regmap_update_bits(priv->scu, SCU_G6_CLK_COURCE, G6_CLK_MASK, 0x0); + regmap_update_bits(priv->scu, SCU_G6_CLK_COURCE, G6_CLK_MASK, G6_USB_40_CLK); + break; + default: + break; + } +} + static void aspeed_gfx_crtc_mode_set_nofb(struct aspeed_gfx *priv) { struct drm_display_mode *m = &priv->pipe.crtc.state->adjusted_mode; @@ -87,6 +99,8 @@ static void aspeed_gfx_crtc_mode_set_nofb(struct aspeed_gfx *priv) if (err) return; + 
aspeed_gfx_set_clk(priv); + #if 0 /* TODO: we have only been able to test with the 40MHz USB clock. The * clock is fixed, so we cannot adjust it here. */ diff --git a/drivers/gpu/drm/aspeed/aspeed_gfx_drv.c b/drivers/gpu/drm/aspeed/aspeed_gfx_drv.c index d10246b1d1c2..af56ffdccc65 100644 --- a/drivers/gpu/drm/aspeed/aspeed_gfx_drv.c +++ b/drivers/gpu/drm/aspeed/aspeed_gfx_drv.c @@ -64,6 +64,7 @@ struct aspeed_gfx_config { u32 vga_scratch_reg;/* VGA scratch register in SCU */ u32 throd_val; /* Default Threshold Seting */ u32 scan_line_max; /* Max memory size of one scan line */ + u32 gfx_flags; /* Flags for gfx chip caps */ }; static const struct aspeed_gfx_config ast2400_config = { @@ -72,6 +73,7 @@ static const struct aspeed_gfx_config ast2400_config = { .vga_scratch_reg = 0x50, .throd_val = CRT_THROD_LOW(0x1e) | CRT_THROD_HIGH(0x12), .scan_line_max = 64, + .gfx_flags = 0, }; static const struct aspeed_gfx_config ast2500_config = { @@ -80,6 +82,7 @@ static const struct aspeed_gfx_config ast2500_config = { .vga_scratch_reg = 0x50, .throd_val = CRT_THROD_LOW(0x24) | CRT_THROD_HIGH(0x3c), .scan_line_max = 128, + .gfx_flags = 0, }; static const struct aspeed_gfx_config ast2600_config = { @@ -88,6 +91,7 @@ static const struct aspeed_gfx_config ast2600_config = { .vga_scratch_reg = 0x50, .throd_val = CRT_THROD_LOW(0x50) | CRT_THROD_HIGH(0x70), .scan_line_max = 128, + .gfx_flags = CLK_G6, }; static const struct of_device_id aspeed_gfx_match[] = { -- 2.17.1
[PATCH v1 0/2] Add 1024x768 timing for AST2600
v1: Add 1024x768@70Hz for AST2600 SoC display timing selection. Add gfx flag for future usage. Testing steps: 1. Add the configs below to turn VT and LOGO on. CONFIG_TTY=y CONFIG_VT=y CONFIG_CONSOLE_TRANSLATIONS=y CONFIG_VT_CONSOLE=y CONFIG_VT_CONSOLE_SLEEP=y CONFIG_HW_CONSOLE=y CONFIG_VT_HW_CONSOLE_BINDING=y CONFIG_UNIX98_PTYS=y CONFIG_LDISC_AUTOLOAD=y CONFIG_DEVMEM=y CONFIG_DUMMY_CONSOLE=y CONFIG_FRAMEBUFFER_CONSOLE=y CONFIG_FRAMEBUFFER_CONSOLE_DETECT_PRIMARY=y CONFIG_LOGO=y CONFIG_LOGO_LINUX_CLUT224=y 2. The Linux logo will be shown on the screen when the BMC boots into Linux. 3. Check the display mode is 1024x768@70Hz on AST2600. 4. Check the display mode is 800x600@60Hz on AST2500. Tommy Haung (2): drm/aspeed: Add gfx flags and clock selection for AST2600 drm/aspeed: Add 1024x768 mode for AST2600 drivers/gpu/drm/aspeed/aspeed_gfx.h | 15 ++ drivers/gpu/drm/aspeed/aspeed_gfx_crtc.c | 35 drivers/gpu/drm/aspeed/aspeed_gfx_drv.c | 20 -- drivers/gpu/drm/aspeed/aspeed_gfx_out.c | 14 +- 4 files changed, 81 insertions(+), 3 deletions(-) -- 2.17.1
[PATCH v1 2/2] drm/aspeed: Add 1024x768 mode for AST2600
Update aspeed_gfx_set_clk() to take the display width. On AST2600, the display clock can be sourced from the HPLL clock / 16 = 75MHz, which fits 1024x768@70Hz. Other chips will still keep 800x600. Signed-off-by: Tommy Haung --- drivers/gpu/drm/aspeed/aspeed_gfx.h | 12 ++ drivers/gpu/drm/aspeed/aspeed_gfx_crtc.c | 29 drivers/gpu/drm/aspeed/aspeed_gfx_drv.c | 16 +++-- drivers/gpu/drm/aspeed/aspeed_gfx_out.c | 14 +++- 4 files changed, 60 insertions(+), 11 deletions(-) diff --git a/drivers/gpu/drm/aspeed/aspeed_gfx.h b/drivers/gpu/drm/aspeed/aspeed_gfx.h index eb4c267cde5e..c7aefee0657a 100644 --- a/drivers/gpu/drm/aspeed/aspeed_gfx.h +++ b/drivers/gpu/drm/aspeed/aspeed_gfx.h @@ -109,11 +109,15 @@ int aspeed_gfx_create_output(struct drm_device *drm); #define CRT_THROD_HIGH(x) ((x) << 8) /* SCU control */ -#define SCU_G6_CLK_COURCE 0x300 +#define G6_CLK_SOURCE 0x300 +#define G6_CLK_SOURCE_MASK (BIT(8) | BIT(9) | BIT(10)) +#define G6_CLK_SOURCE_HPLL (BIT(8) | BIT(9) | BIT(10)) +#define G6_CLK_SOURCE_USB BIT(9) +#define G6_CLK_SEL3 0x308 +#define G6_CLK_DIV_MASK 0x3F000 +#define G6_CLK_DIV_16 (BIT(16)|BIT(15)|BIT(13)|BIT(12)) +#define G6_USB_40_CLK BIT(9) /* GFX FLAGS */ #define CLK_MASK BIT(0) #define CLK_G6 BIT(0) - -#define G6_CLK_MASK (BIT(8) | BIT(9) | BIT(10)) -#define G6_USB_40_CLK BIT(9) diff --git a/drivers/gpu/drm/aspeed/aspeed_gfx_crtc.c b/drivers/gpu/drm/aspeed/aspeed_gfx_crtc.c index a24fab22eac4..5829be9c7c67 100644 --- a/drivers/gpu/drm/aspeed/aspeed_gfx_crtc.c +++ b/drivers/gpu/drm/aspeed/aspeed_gfx_crtc.c @@ -23,6 +23,28 @@ drm_pipe_to_aspeed_gfx(struct drm_simple_display_pipe *pipe) return container_of(pipe, struct aspeed_gfx, pipe); } +static void aspeed_gfx_set_clock_source(struct aspeed_gfx *priv, int mode_width) +{ + regmap_update_bits(priv->scu, G6_CLK_SOURCE, G6_CLK_SOURCE_MASK, 0x0); + regmap_update_bits(priv->scu, G6_CLK_SEL3, G6_CLK_DIV_MASK, 0x0); + + switch (mode_width) { + case 1024: + /* hpll div 16 = 75Mhz */ + regmap_update_bits(priv->scu, 
G6_CLK_SOURCE, + G6_CLK_SOURCE_MASK, G6_CLK_SOURCE_HPLL); + regmap_update_bits(priv->scu, G6_CLK_SEL3, + G6_CLK_DIV_MASK, G6_CLK_DIV_16); + break; + case 800: + default: + /* usb 40Mhz */ + regmap_update_bits(priv->scu, G6_CLK_SOURCE, + G6_CLK_SOURCE_MASK, G6_CLK_SOURCE_USB); + break; + } +} + static int aspeed_gfx_set_pixel_fmt(struct aspeed_gfx *priv, u32 *bpp) { struct drm_crtc *crtc = &priv->pipe.crtc; @@ -77,12 +99,11 @@ static void aspeed_gfx_disable_controller(struct aspeed_gfx *priv) regmap_update_bits(priv->scu, priv->dac_reg, BIT(16), 0); } -static void aspeed_gfx_set_clk(struct aspeed_gfx *priv) +static void aspeed_gfx_set_clk(struct aspeed_gfx *priv, int mode_width) { switch (priv->flags & CLK_MASK) { case CLK_G6: - regmap_update_bits(priv->scu, SCU_G6_CLK_COURCE, G6_CLK_MASK, 0x0); - regmap_update_bits(priv->scu, SCU_G6_CLK_COURCE, G6_CLK_MASK, G6_USB_40_CLK); + aspeed_gfx_set_clock_source(priv, mode_width); break; default: break; @@ -99,7 +120,7 @@ static void aspeed_gfx_crtc_mode_set_nofb(struct aspeed_gfx *priv) if (err) return; - aspeed_gfx_set_clk(priv); + aspeed_gfx_set_clk(priv, m->hdisplay); #if 0 /* TODO: we have only been able to test with the 40MHz USB clock. 
The diff --git a/drivers/gpu/drm/aspeed/aspeed_gfx_drv.c b/drivers/gpu/drm/aspeed/aspeed_gfx_drv.c index af56ffdccc65..e1a814aebc2d 100644 --- a/drivers/gpu/drm/aspeed/aspeed_gfx_drv.c +++ b/drivers/gpu/drm/aspeed/aspeed_gfx_drv.c @@ -110,6 +110,7 @@ static const struct drm_mode_config_funcs aspeed_gfx_mode_config_funcs = { static int aspeed_gfx_setup_mode_config(struct drm_device *drm) { + struct aspeed_gfx *priv = to_aspeed_gfx(drm); int ret; ret = drmm_mode_config_init(drm); @@ -118,8 +119,18 @@ static int aspeed_gfx_setup_mode_config(struct drm_device *drm) drm->mode_config.min_width = 0; drm->mode_config.min_height = 0; - drm->mode_config.max_width = 800; - drm->mode_config.max_height = 600; + + switch (priv->flags & CLK_MASK) { + case CLK_G6: + drm->mode_config.max_width = 1024; + drm->mode_config.max_height = 768; + break; + default: + drm->mode_config.max_width = 800; + drm->mode_config.max_height = 600; + break; + } + drm-
Re: [PATCH 1/9] dt-bindings: mxsfb: Add compatible for i.MX8MP
On 3/3/22 09:21, Lucas Stach wrote: On Thursday, 2022-03-03 at 04:14 +0100, Marek Vasut wrote: On 3/2/22 10:23, Lucas Stach wrote: [...] I tend to agree with Marek on this one. We have an instance where the blk-ctrl and the GPC driver between 8m, mini, nano, plus are close, but different enough that each SoC has its own set of tables and some checks. Lucas created the framework, and others adapted it for various SoCs. If there really is nearly 50% common code for the LCDIF, why not either leave the driver as one, or split the common code into its own driver like lcdif-common and then have smaller drivers that handle their specific variations. I don't know exactly what the standalone driver looks like, but I guess the overlap is not really in any real HW-specific parts, but in the common DRM boilerplate, so there isn't much point in creating a common lcdif driver. The mxsfb currently has 1280 LoC as of patch 8/9 of this series. Of that, there are some 400 LoC which are specific to the old LCDIF, and this patch adds 380 LoC for the new LCDIF. So that's 800 LoC or ~60% of shared boilerplate that would be duplicated. That is probably ignoring the fact that the 8MP LCDIF does not support any overlays, so it could use the drm_simple_display_pipe infrastructure to reduce the needed boilerplate. It seems the IMXRT1070 LCDIF v2 (heh ...) does support overlays, so no, the mxsfb and hypothetical lcdif drivers would look very similar. As you brought up the blk-ctrl as an example: I'm all for supporting slightly different hardware in the same driver, as long as the HW interface is close enough. But then I also opted for a separate 8MP blk-ctrl driver for those blk-ctrls that differ significantly from the others, as I think it would make the common driver unmaintainable trying to support all the different variants in one driver. But then you also need to maintain two sets of boilerplate, they diverge, and that is not good. 
I don't think that there is much chance of bugs going unfixed due to divergence in the boilerplate, especially if you use the simple pipe framework to handle most of that stuff for you, which gives you a lot of code sharing with other simple DRM drivers. But I cannot use the simple pipe framework because of overlays; see imxrt1070. [...] We can always split the drivers later if this becomes unmaintainable too, no? Not if you want to keep the same userspace running. As userspace has some ties to the DRM driver name, e.g. for finding the right GBM implementation, splitting the driver later on would be a UABI break. Hum, so what other options do we have left? Duplicate 60% of the driver right away?
Re: [PATCH v5 3/5] drm/msm/dp: set stream_pixel rate directly
Quoting Dmitry Baryshkov (2022-03-03 20:23:06) > On Fri, 4 Mar 2022 at 01:32, Stephen Boyd wrote: > > > > Quoting Dmitry Baryshkov (2022-02-16 21:55:27) > > > The only clock for which we set the rate is the "stream_pixel". Rather > > > than storing the rate and then setting it by looping over all the > > > clocks, set the clock rate directly. > > > > > > Signed-off-by: Dmitry Baryshkov > > [...] > > > diff --git a/drivers/gpu/drm/msm/dp/dp_ctrl.c > > > b/drivers/gpu/drm/msm/dp/dp_ctrl.c > > > index 07f6bf7e1acb..8e6361dedd77 100644 > > > --- a/drivers/gpu/drm/msm/dp/dp_ctrl.c > > > +++ b/drivers/gpu/drm/msm/dp/dp_ctrl.c > > > @@ -1315,7 +1315,7 @@ static void dp_ctrl_set_clock_rate(struct > > > dp_ctrl_private *ctrl, > > > DRM_DEBUG_DP("setting rate=%lu on clk=%s\n", rate, name); > > > > > > if (num) > > > - cfg->rate = rate; > > > + clk_set_rate(cfg->clk, rate); > > > > This looks bad. From what I can tell we set the rate of the pixel clk > > after enabling the phy and configuring it. See the order of operations > > in dp_ctrl_enable_mainlink_clocks() and note how dp_power_clk_enable() > > is the one that eventually sets a rate through dp_power_clk_set_rate() > > > > dp_ctrl_set_clock_rate(ctrl, DP_CTRL_PM, "ctrl_link", > > ctrl->link->link_params.rate * > > 1000); > > > > phy_configure(phy, &dp_io->phy_opts); > > phy_power_on(phy); > > > > ret = dp_power_clk_enable(ctrl->power, DP_CTRL_PM, true); > > This code has been changed in the previous patch. > > Let's get back a bit. > Currently dp_ctrl_set_clock_rate() doesn't change the clock rate. It > just stores the rate in the config so that later the sequence of > dp_power_clk_enable() -> dp_power_clk_set_rate() -> > [dp_power_clk_set_link_rate() -> dev_pm_opp_set_rate() or > msm_dss_clk_set_rate() -> clk_set_rate()] will use that. > > There are only two users of dp_ctrl_set_clock_rate(): > - dp_ctrl_enable_mainlink_clocks(), which you have quoted above. > This case is handled in the patch 1 from this series. 
Patch 1 from this series says DP is unaffected. Huh? > It makes > dp_ctrl_enable_mainlink_clocks() call dev_pm_opp_set_rate() directly > without storing (!) the rate in the config, calling > phy_configure()/phy_power_on() and then setting the opp via the > sequence of calls specified above > > - dp_ctrl_enable_stream_clocks(), which calls dp_power_clk_enable() > immediately afterwards. This call would set the stream_pixel rate > while enabling stream clocks. As far as I can see, the stream_pixel is > the only stream clock. So this patch sets the clock rate without > storing in the interim configuration data. > > Could you please clarify, what exactly looks bad to you? > I'm concerned about the order of operations changing between the phy being powered on and the pixel clk frequency being set. From what I recall the pixel clk rate operations depend on the phy frequency being set (which is done through phy_configure?) so if we call clk_set_rate() on the pixel clk before the phy is set then the clk frequency will be calculated badly and probably be incorrect.
Re: [PATCH v5 3/5] drm/msm/dp: set stream_pixel rate directly
On Fri, 4 Mar 2022 at 01:32, Stephen Boyd wrote: > > Quoting Dmitry Baryshkov (2022-02-16 21:55:27) > > The only clock for which we set the rate is the "stream_pixel". Rather > > than storing the rate and then setting it by looping over all the > > clocks, set the clock rate directly. > > > > Signed-off-by: Dmitry Baryshkov > [...] > > diff --git a/drivers/gpu/drm/msm/dp/dp_ctrl.c > > b/drivers/gpu/drm/msm/dp/dp_ctrl.c > > index 07f6bf7e1acb..8e6361dedd77 100644 > > --- a/drivers/gpu/drm/msm/dp/dp_ctrl.c > > +++ b/drivers/gpu/drm/msm/dp/dp_ctrl.c > > @@ -1315,7 +1315,7 @@ static void dp_ctrl_set_clock_rate(struct > > dp_ctrl_private *ctrl, > > DRM_DEBUG_DP("setting rate=%lu on clk=%s\n", rate, name); > > > > if (num) > > - cfg->rate = rate; > > + clk_set_rate(cfg->clk, rate); > > This looks bad. From what I can tell we set the rate of the pixel clk > after enabling the phy and configuring it. See the order of operations > in dp_ctrl_enable_mainlink_clocks() and note how dp_power_clk_enable() > is the one that eventually sets a rate through dp_power_clk_set_rate() > > dp_ctrl_set_clock_rate(ctrl, DP_CTRL_PM, "ctrl_link", > ctrl->link->link_params.rate * 1000); > > phy_configure(phy, &dp_io->phy_opts); > phy_power_on(phy); > > ret = dp_power_clk_enable(ctrl->power, DP_CTRL_PM, true); This code has been changed in the previous patch. Let's get back a bit. Currently dp_ctrl_set_clock_rate() doesn't change the clock rate. It just stores the rate in the config so that later the sequence of dp_power_clk_enable() -> dp_power_clk_set_rate() -> [dp_power_clk_set_link_rate() -> dev_pm_opp_set_rate() or msm_dss_clk_set_rate() -> clk_set_rate()] will use that. There are only two users of dp_ctrl_set_clock_rate(): - dp_ctrl_enable_mainlink_clocks(), which you have quoted above. This case is handled in the patch 1 from this series. It makes dp_ctrl_enable_mainlink_clocks() call dev_pm_opp_set_rate() directly without storing (!) 
the rate in the config, calling phy_configure()/phy_power_on() and then setting the opp via the sequence of calls specified above - dp_ctrl_enable_stream_clocks(), which calls dp_power_clk_enable() immediately afterwards. This call would set the stream_pixel rate while enabling stream clocks. As far as I can see, the stream_pixel is the only stream clock. So this patch sets the clock rate without storing in the interim configuration data. Could you please clarify, what exactly looks bad to you? > and I vaguely recall that the DP phy needs to be configured for some > frequency so that the pixel clk can use it when determining the rate to > set. > > > else > > DRM_ERROR("%s clock doesn't exit to set rate %lu\n", > > name, rate); -- With best wishes Dmitry
[git pull] drm fixes for 5.17-rc7
Hi Linus, Things are quieting down as expected, just a small set of fixes, i915, exynos, amdgpu, vrr, bridge and hdlcd. Nothing scary at all. Dave. drm-fixes-2022-03-04: drm fixes for 5.17-rc7 i915: - Fix GuC SLPC unset command - Fix misidentification of some Apple MacBook Pro laptops as Jasper Lake. amdgpu: - Suspend regression fix exynos: - irq handling fixes. - Fix two regressions to TE-gpio handling. arm/hdlcd: - Select DRM_GEM_CMA_HELPER for HDLCD bridge: - ti-sn65dsi86: Properly undo autosuspend vrr: - Fix potential NULL-pointer deref The following changes since commit 7e57714cd0ad2d5bb90e50b5096a0e671dec1ef3: Linux 5.17-rc6 (2022-02-27 14:36:33 -0800) are available in the Git repository at: git://anongit.freedesktop.org/drm/drm tags/drm-fixes-2022-03-04 for you to fetch changes up to 8fdb19679722a02fe21642d39710c701d2ed567a: Merge tag 'drm-misc-fixes-2022-03-03' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes (2022-03-04 13:04:11 +1000) drm fixes for 5.17-rc7 i915: - Fix GuC SLPC unset command - Fix misidentification of some Apple MacBook Pro laptops as Jasper Lake. amdgpu: - Suspend regression fix exynos: - irq handling fixes. - Fix two regressions to TE-gpio handling. 
arm/hdlcd: - Select DRM_GEM_CMA_HELPER for HDLCD bridge: - ti-sn65dsi86: Properly undo autosuspend vrr: - Fix potential NULL-pointer deref Carsten Haitzler (1): drm/arm: arm hdlcd select DRM_GEM_CMA_HELPER Dave Airlie (4): Merge tag 'exynos-drm-fixes-v5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos into drm-fixes Merge tag 'drm-intel-fixes-2022-03-03' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes Merge tag 'amd-drm-fixes-5.17-2022-03-02' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes Merge tag 'drm-misc-fixes-2022-03-03' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes Douglas Anderson (1): drm/bridge: ti-sn65dsi86: Properly undo autosuspend Lad Prabhakar (5): drm/exynos/exynos7_drm_decon: Use platform_get_irq_byname() to get the interrupt drm/exynos: mixer: Use platform_get_irq() to get the interrupt drm/exynos/exynos_drm_fimd: Use platform_get_irq_byname() to get the interrupt drm/exynos/fimc: Use platform_get_irq() to get the interrupt drm/exynos: gsc: Use platform_get_irq() to get the interrupt Manasi Navare (1): drm/vrr: Set VRR capable prop only if it is attached to connector Marek Szyprowski (2): drm/exynos: Don't fail if no TE-gpio is defined for DSI driver drm/exynos: Search for TE-gpio in DSI panel's node Qiang Yu (1): drm/amdgpu: fix suspend/resume hang regression Ville Syrjälä (1): drm/i915: s/JSP2/ICP2/ PCH Vinay Belgaumkar (1): drm/i915/guc/slpc: Correct the param count for unset param drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 3 ++- drivers/gpu/drm/arm/Kconfig | 1 + drivers/gpu/drm/bridge/ti-sn65dsi86.c | 5 +++-- drivers/gpu/drm/drm_connector.c | 3 +++ drivers/gpu/drm/exynos/exynos7_drm_decon.c | 12 +++- drivers/gpu/drm/exynos/exynos_drm_dsi.c | 6 -- drivers/gpu/drm/exynos/exynos_drm_fimc.c | 13 + drivers/gpu/drm/exynos/exynos_drm_fimd.c | 13 - drivers/gpu/drm/exynos/exynos_drm_gsc.c | 10 +++--- drivers/gpu/drm/exynos/exynos_mixer.c | 14 ++ 
drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c | 2 +- drivers/gpu/drm/i915/intel_pch.c| 2 +- drivers/gpu/drm/i915/intel_pch.h| 2 +- 13 files changed, 37 insertions(+), 49 deletions(-)
[PATCH v3 5/5] drm/msm: allow compile time selection of driver components
MSM DRM driver already allows one to compile out the DP or DSI support. Add support for disabling other features like MDP4/MDP5/DPU drivers or direct HDMI output support. Suggested-by: Stephen Boyd Signed-off-by: Dmitry Baryshkov --- drivers/gpu/drm/msm/Kconfig| 50 -- drivers/gpu/drm/msm/Makefile | 18 ++-- drivers/gpu/drm/msm/msm_drv.h | 33 ++ drivers/gpu/drm/msm/msm_mdss.c | 13 +++-- 4 files changed, 106 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/msm/Kconfig b/drivers/gpu/drm/msm/Kconfig index 9b019598e042..3735fd41eb3b 100644 --- a/drivers/gpu/drm/msm/Kconfig +++ b/drivers/gpu/drm/msm/Kconfig @@ -46,12 +46,39 @@ config DRM_MSM_GPU_SUDO Only use this if you are a driver developer. This should *not* be enabled for production kernels. If unsure, say N. -config DRM_MSM_HDMI_HDCP - bool "Enable HDMI HDCP support in MSM DRM driver" +config DRM_MSM_MDSS + bool + depends on DRM_MSM + default n + +config DRM_MSM_MDP4 + bool "Enable MDP4 support in MSM DRM driver" depends on DRM_MSM default y help - Choose this option to enable HDCP state machine + Compile in support for the Mobile Display Processor v4 (MDP4) in + the MSM DRM driver. It is the older display controller found in + devices using APQ8064/MSM8960/MSM8x60 platforms. + +config DRM_MSM_MDP5 + bool "Enable MDP5 support in MSM DRM driver" + depends on DRM_MSM + select DRM_MSM_MDSS + default y + help + Compile in support for the Mobile Display Processor v5 (MDP4) in + the MSM DRM driver. It is the display controller found in devices + using e.g. APQ8016/MSM8916/APQ8096/MSM8996/MSM8974/SDM6x0 platforms. + +config DRM_MSM_DPU + bool "Enable DPU support in MSM DRM driver" + depends on DRM_MSM + select DRM_MSM_MDSS + default y + help + Compile in support for the Display Processing Unit in + the MSM DRM driver. It is the display controller found in devices + using e.g. SDM845 and newer platforms. 
config DRM_MSM_DP bool "Enable DisplayPort support in MSM DRM driver" @@ -116,3 +143,20 @@ config DRM_MSM_DSI_7NM_PHY help Choose this option if DSI PHY on SM8150/SM8250/SC7280 is used on the platform. + +config DRM_MSM_HDMI + bool "Enable HDMI support in MSM DRM driver" + depends on DRM_MSM + default y + help + Compile in support for the HDMI output MSM DRM driver. It can + be a primary or a secondary display on device. Note that this is used + only for the direct HDMI output. If the device outputs HDMI data + throught some kind of DSI-to-HDMI bridge, this option can be disabled. + +config DRM_MSM_HDMI_HDCP + bool "Enable HDMI HDCP support in MSM DRM driver" + depends on DRM_MSM && DRM_MSM_HDMI + default y + help + Choose this option to enable HDCP state machine diff --git a/drivers/gpu/drm/msm/Makefile b/drivers/gpu/drm/msm/Makefile index e76927b42033..5fe9c20ab9ee 100644 --- a/drivers/gpu/drm/msm/Makefile +++ b/drivers/gpu/drm/msm/Makefile @@ -16,6 +16,8 @@ msm-y := \ adreno/a6xx_gpu.o \ adreno/a6xx_gmu.o \ adreno/a6xx_hfi.o \ + +msm-$(CONFIG_DRM_MSM_HDMI) += \ hdmi/hdmi.o \ hdmi/hdmi_audio.o \ hdmi/hdmi_bridge.o \ @@ -27,8 +29,8 @@ msm-y := \ hdmi/hdmi_phy_8x60.o \ hdmi/hdmi_phy_8x74.o \ hdmi/hdmi_pll_8960.o \ - disp/mdp_format.o \ - disp/mdp_kms.o \ + +msm-$(CONFIG_DRM_MSM_MDP4) += \ disp/mdp4/mdp4_crtc.o \ disp/mdp4/mdp4_dtv_encoder.o \ disp/mdp4/mdp4_lcdc_encoder.o \ @@ -37,6 +39,8 @@ msm-y := \ disp/mdp4/mdp4_irq.o \ disp/mdp4/mdp4_kms.o \ disp/mdp4/mdp4_plane.o \ + +msm-$(CONFIG_DRM_MSM_MDP5) += \ disp/mdp5/mdp5_cfg.o \ disp/mdp5/mdp5_ctl.o \ disp/mdp5/mdp5_crtc.o \ @@ -47,6 +51,8 @@ msm-y := \ disp/mdp5/mdp5_mixer.o \ disp/mdp5/mdp5_plane.o \ disp/mdp5/mdp5_smp.o \ + +msm-$(CONFIG_DRM_MSM_DPU) += \ disp/dpu1/dpu_core_perf.o \ disp/dpu1/dpu_crtc.o \ disp/dpu1/dpu_encoder.o \ @@ -69,6 +75,13 @@ msm-y := \ disp/dpu1/dpu_plane.o \ disp/dpu1/dpu_rm.o \ disp/dpu1/dpu_vbif.o \ + +msm-$(CONFIG_DRM_MSM_MDSS) += \ + msm_mdss.o \ + +msm-y += \ + disp/mdp_format.o \ 
+ disp/mdp_kms.o \ disp/msm_disp_snapshot.o \ disp/msm_disp_snapshot_util.o \ msm_atomic.o \ @@ -86,7 +99,6 @@ msm-y := \ msm_gpu_devfreq.o \ msm_io_utils.o \ msm_iommu.o \ - msm_mdss.o \ msm_perf.o \ msm_rd.o \ msm_ringbuffer.o \ diff --git a/drivers/gpu/drm/msm/msm_drv.h b/drivers/gpu/drm/msm/msm_drv.h index c1aaadfbea34..6bad7e7b479d 100644 --- a/drivers/gpu
[PATCH v3 4/5] drm/msm: stop using device's match data pointer
Let's make the match's data pointer a (sub-)driver's private data. The only user currently is the msm_drm_init() function, using this data to select kms_init callback. Pass this callback through the driver's private data instead. Signed-off-by: Dmitry Baryshkov --- drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c | 10 --- drivers/gpu/drm/msm/disp/mdp4/mdp4_kms.c | 14 + drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c | 11 --- drivers/gpu/drm/msm/msm_drv.c| 38 ++-- drivers/gpu/drm/msm/msm_drv.h| 5 +--- drivers/gpu/drm/msm/msm_kms.h| 4 --- drivers/gpu/drm/msm/msm_mdss.c | 29 +++--- 7 files changed, 42 insertions(+), 69 deletions(-) diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c index e29796c4f27b..38627ccf3068 100644 --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c @@ -1172,7 +1172,7 @@ static int dpu_kms_hw_init(struct msm_kms *kms) return rc; } -struct msm_kms *dpu_kms_init(struct drm_device *dev) +static int dpu_kms_init(struct drm_device *dev) { struct msm_drm_private *priv; struct dpu_kms *dpu_kms; @@ -1180,7 +1180,7 @@ struct msm_kms *dpu_kms_init(struct drm_device *dev) if (!dev) { DPU_ERROR("drm device node invalid\n"); - return ERR_PTR(-EINVAL); + return -EINVAL; } priv = dev->dev_private; @@ -1189,11 +1189,11 @@ struct msm_kms *dpu_kms_init(struct drm_device *dev) irq = irq_of_parse_and_map(dpu_kms->pdev->dev.of_node, 0); if (irq < 0) { DPU_ERROR("failed to get irq: %d\n", irq); - return ERR_PTR(irq); + return irq; } dpu_kms->base.irq = irq; - return &dpu_kms->base; + return 0; } static int dpu_bind(struct device *dev, struct device *master, void *data) @@ -1204,6 +1204,8 @@ static int dpu_bind(struct device *dev, struct device *master, void *data) struct dpu_kms *dpu_kms; int ret = 0; + priv->kms_init = dpu_kms_init; + dpu_kms = devm_kzalloc(&pdev->dev, sizeof(*dpu_kms), GFP_KERNEL); if (!dpu_kms) return -ENOMEM; diff --git a/drivers/gpu/drm/msm/disp/mdp4/mdp4_kms.c 
b/drivers/gpu/drm/msm/disp/mdp4/mdp4_kms.c index c5c0650414c5..2e5f6b6fd3c3 100644 --- a/drivers/gpu/drm/msm/disp/mdp4/mdp4_kms.c +++ b/drivers/gpu/drm/msm/disp/mdp4/mdp4_kms.c @@ -389,7 +389,7 @@ static void read_mdp_hw_revision(struct mdp4_kms *mdp4_kms, DRM_DEV_INFO(dev->dev, "MDP4 version v%d.%d", *major, *minor); } -struct msm_kms *mdp4_kms_init(struct drm_device *dev) +static int mdp4_kms_init(struct drm_device *dev) { struct platform_device *pdev = to_platform_device(dev->dev); struct mdp4_platform_config *config = mdp4_get_config(pdev); @@ -403,8 +403,7 @@ struct msm_kms *mdp4_kms_init(struct drm_device *dev) mdp4_kms = kzalloc(sizeof(*mdp4_kms), GFP_KERNEL); if (!mdp4_kms) { DRM_DEV_ERROR(dev->dev, "failed to allocate kms\n"); - ret = -ENOMEM; - goto fail; + return -ENOMEM; } ret = mdp_kms_init(&mdp4_kms->base, &kms_funcs); @@ -551,12 +550,13 @@ struct msm_kms *mdp4_kms_init(struct drm_device *dev) dev->mode_config.max_width = 2048; dev->mode_config.max_height = 2048; - return kms; + return 0; fail: if (kms) mdp4_destroy(kms); - return ERR_PTR(ret); + + return ret; } static struct mdp4_platform_config *mdp4_get_config(struct platform_device *dev) @@ -583,6 +583,8 @@ static int mdp4_probe(struct platform_device *pdev) if (!priv) return -ENOMEM; + priv->kms_init = mdp4_kms_init; + platform_set_drvdata(pdev, priv); /* @@ -600,7 +602,7 @@ static int mdp4_remove(struct platform_device *pdev) } static const struct of_device_id mdp4_dt_match[] = { - { .compatible = "qcom,mdp4", .data = (void *)KMS_MDP4 }, + { .compatible = "qcom,mdp4" }, { /* sentinel */ } }; MODULE_DEVICE_TABLE(of, mdp4_dt_match); diff --git a/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c b/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c index 3b92372e7bdf..0c78608832c3 100644 --- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c +++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c @@ -544,7 +544,7 @@ static int get_clk(struct platform_device *pdev, struct clk **clkp, return 0; } -struct msm_kms *mdp5_kms_init(struct 
drm_device *dev) +static int mdp5_kms_init(struct drm_device *dev) { struct msm_drm_private *priv = dev->dev_private; struct platform_device *pdev; @@ -558,7 +558,7 @@ struct msm_kms *mdp5_kms_init(struct drm_device *dev) /* priv->kms would have been populated by the MDP5 driver */ kms = priv->kms; if (!kms) - return NU
[PATCH v3 3/5] drm/msm: split the main platform driver
Currently the msm platform driver is a multiplex handling several cases: - headless GPU-only driver, - MDP4 with flat device nodes, - MDP5/DPU MDSS with all the nodes being children of MDSS node. This results in not-so-perfect code, checking the hardware version (MDP4/MDP5/DPU) in several places, checking for mdss even when it can not exist, etc. Split the code into three handling subdrivers (mdp4, mdss and headless msm). Signed-off-by: Dmitry Baryshkov --- drivers/gpu/drm/msm/disp/mdp4/mdp4_kms.c | 56 ++ drivers/gpu/drm/msm/msm_drv.c| 228 --- drivers/gpu/drm/msm/msm_drv.h| 27 ++- drivers/gpu/drm/msm/msm_kms.h| 7 - drivers/gpu/drm/msm/msm_mdss.c | 178 +- 5 files changed, 291 insertions(+), 205 deletions(-) diff --git a/drivers/gpu/drm/msm/disp/mdp4/mdp4_kms.c b/drivers/gpu/drm/msm/disp/mdp4/mdp4_kms.c index 3cf476c55158..c5c0650414c5 100644 --- a/drivers/gpu/drm/msm/disp/mdp4/mdp4_kms.c +++ b/drivers/gpu/drm/msm/disp/mdp4/mdp4_kms.c @@ -569,3 +569,59 @@ static struct mdp4_platform_config *mdp4_get_config(struct platform_device *dev) return &config; } + +static const struct dev_pm_ops mdp4_pm_ops = { + .prepare = msm_pm_prepare, + .complete = msm_pm_complete, +}; + +static int mdp4_probe(struct platform_device *pdev) +{ + struct msm_drm_private *priv; + + priv = devm_kzalloc(&pdev->dev, sizeof(*priv), GFP_KERNEL); + if (!priv) + return -ENOMEM; + + platform_set_drvdata(pdev, priv); + + /* +* on MDP4 based platforms, the MDP platform device is the component +* master that adds other display interface components to itself. 
+*/ + return msm_drv_probe(&pdev->dev, &pdev->dev); +} + +static int mdp4_remove(struct platform_device *pdev) +{ + component_master_del(&pdev->dev, &msm_drm_ops); + + return 0; +} + +static const struct of_device_id mdp4_dt_match[] = { + { .compatible = "qcom,mdp4", .data = (void *)KMS_MDP4 }, + { /* sentinel */ } +}; +MODULE_DEVICE_TABLE(of, mdp4_dt_match); + +static struct platform_driver mdp4_platform_driver = { + .probe = mdp4_probe, + .remove = mdp4_remove, + .shutdown = msm_drv_shutdown, + .driver = { + .name = "mdp4", + .of_match_table = mdp4_dt_match, + .pm = &mdp4_pm_ops, + }, +}; + +void __init msm_mdp4_register(void) +{ + platform_driver_register(&mdp4_platform_driver); +} + +void __exit msm_mdp4_unregister(void) +{ + platform_driver_unregister(&mdp4_platform_driver); +} diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c index f3f33b8c6eba..2f44df8c5585 100644 --- a/drivers/gpu/drm/msm/msm_drv.c +++ b/drivers/gpu/drm/msm/msm_drv.c @@ -255,10 +255,6 @@ static int msm_drm_uninit(struct device *dev) return 0; } -#define KMS_MDP4 4 -#define KMS_MDP5 5 -#define KMS_DPU 3 - static int get_mdp_ver(struct platform_device *pdev) { struct device *dev = &pdev->dev; @@ -941,50 +937,7 @@ static const struct drm_driver msm_driver = { .patchlevel = MSM_VERSION_PATCHLEVEL, }; -static int __maybe_unused msm_runtime_suspend(struct device *dev) -{ - struct msm_drm_private *priv = dev_get_drvdata(dev); - struct msm_mdss *mdss = priv->mdss; - - DBG(""); - - if (mdss) - return msm_mdss_disable(mdss); - - return 0; -} - -static int __maybe_unused msm_runtime_resume(struct device *dev) -{ - struct msm_drm_private *priv = dev_get_drvdata(dev); - struct msm_mdss *mdss = priv->mdss; - - DBG(""); - - if (mdss) - return msm_mdss_enable(mdss); - - return 0; -} - -static int __maybe_unused msm_pm_suspend(struct device *dev) -{ - - if (pm_runtime_suspended(dev)) - return 0; - - return msm_runtime_suspend(dev); -} - -static int __maybe_unused 
msm_pm_resume(struct device *dev) -{ - if (pm_runtime_suspended(dev)) - return 0; - - return msm_runtime_resume(dev); -} - -static int __maybe_unused msm_pm_prepare(struct device *dev) +int msm_pm_prepare(struct device *dev) { struct msm_drm_private *priv = dev_get_drvdata(dev); struct drm_device *ddev = priv ? priv->dev : NULL; @@ -995,7 +948,7 @@ static int __maybe_unused msm_pm_prepare(struct device *dev) return drm_mode_config_helper_suspend(ddev); } -static void __maybe_unused msm_pm_complete(struct device *dev) +void msm_pm_complete(struct device *dev) { struct msm_drm_private *priv = dev_get_drvdata(dev); struct drm_device *ddev = priv ? priv->dev : NULL; @@ -1007,8 +960,6 @@ static void __maybe_unused msm_pm_complete(struct device *dev) } static const struct dev_pm_ops msm_pm_ops = { - SET_SYSTEM_SLEEP_PM_OPS(msm_pm_suspend, msm_pm_resume) - SET_RUNTIME_PM_OPS(msm_runtime_suspend,
[PATCH v3 2/5] drm/msm: remove extra indirection for msm_mdss
Since now there is just one mdss subdriver, drop all the indirection, make msm_mdss struct completely opaque (and defined inside msm_mdss.c) and call mdss functions directly. Signed-off-by: Dmitry Baryshkov --- drivers/gpu/drm/msm/msm_drv.c | 29 +++ drivers/gpu/drm/msm/msm_kms.h | 16 ++-- drivers/gpu/drm/msm/msm_mdss.c | 136 +++-- 3 files changed, 81 insertions(+), 100 deletions(-) diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c index 078c7e951a6e..f3f33b8c6eba 100644 --- a/drivers/gpu/drm/msm/msm_drv.c +++ b/drivers/gpu/drm/msm/msm_drv.c @@ -948,8 +948,8 @@ static int __maybe_unused msm_runtime_suspend(struct device *dev) DBG(""); - if (mdss && mdss->funcs) - return mdss->funcs->disable(mdss); + if (mdss) + return msm_mdss_disable(mdss); return 0; } @@ -961,8 +961,8 @@ static int __maybe_unused msm_runtime_resume(struct device *dev) DBG(""); - if (mdss && mdss->funcs) - return mdss->funcs->enable(mdss); + if (mdss) + return msm_mdss_enable(mdss); return 0; } @@ -1197,6 +1197,7 @@ static const struct component_master_ops msm_drm_ops = { static int msm_pdev_probe(struct platform_device *pdev) { struct component_match *match = NULL; + struct msm_mdss *mdss; struct msm_drm_private *priv; int ret; @@ -1208,20 +1209,22 @@ static int msm_pdev_probe(struct platform_device *pdev) switch (get_mdp_ver(pdev)) { case KMS_MDP5: - ret = msm_mdss_init(pdev, true); + mdss = msm_mdss_init(pdev, true); break; case KMS_DPU: - ret = msm_mdss_init(pdev, false); + mdss = msm_mdss_init(pdev, false); break; default: - ret = 0; + mdss = NULL; break; } - if (ret) { - platform_set_drvdata(pdev, NULL); + if (IS_ERR(mdss)) { + ret = PTR_ERR(mdss); return ret; } + priv->mdss = mdss; + if (get_mdp_ver(pdev)) { ret = add_display_components(pdev, &match); if (ret) @@ -1248,8 +1251,8 @@ static int msm_pdev_probe(struct platform_device *pdev) fail: of_platform_depopulate(&pdev->dev); - if (priv->mdss && priv->mdss->funcs) - priv->mdss->funcs->destroy(priv->mdss); + if 
(priv->mdss) + msm_mdss_destroy(priv->mdss); return ret; } @@ -1262,8 +1265,8 @@ static int msm_pdev_remove(struct platform_device *pdev) component_master_del(&pdev->dev, &msm_drm_ops); of_platform_depopulate(&pdev->dev); - if (mdss && mdss->funcs) - mdss->funcs->destroy(mdss); + if (mdss) + msm_mdss_destroy(mdss); return 0; } diff --git a/drivers/gpu/drm/msm/msm_kms.h b/drivers/gpu/drm/msm/msm_kms.h index 10d5ae3e76df..09c21994 100644 --- a/drivers/gpu/drm/msm/msm_kms.h +++ b/drivers/gpu/drm/msm/msm_kms.h @@ -201,18 +201,12 @@ struct msm_kms *dpu_kms_init(struct drm_device *dev); extern const struct of_device_id dpu_dt_match[]; extern const struct of_device_id mdp5_dt_match[]; -struct msm_mdss_funcs { - int (*enable)(struct msm_mdss *mdss); - int (*disable)(struct msm_mdss *mdss); - void (*destroy)(struct msm_mdss *mdss); -}; - -struct msm_mdss { - struct device *dev; - const struct msm_mdss_funcs *funcs; -}; +struct msm_mdss; -int msm_mdss_init(struct platform_device *pdev, bool is_mdp5); +struct msm_mdss *msm_mdss_init(struct platform_device *pdev, bool is_mdp5); +int msm_mdss_enable(struct msm_mdss *mdss); +int msm_mdss_disable(struct msm_mdss *mdss); +void msm_mdss_destroy(struct msm_mdss *mdss); #define for_each_crtc_mask(dev, crtc, crtc_mask) \ drm_for_each_crtc(crtc, dev) \ diff --git a/drivers/gpu/drm/msm/msm_mdss.c b/drivers/gpu/drm/msm/msm_mdss.c index 71f3277bde32..857eefbb8649 100644 --- a/drivers/gpu/drm/msm/msm_mdss.c +++ b/drivers/gpu/drm/msm/msm_mdss.c @@ -3,19 +3,16 @@ * Copyright (c) 2018, The Linux Foundation */ +#include #include #include #include #include - -#include "msm_drv.h" -#include "msm_kms.h" +#include /* for DPU_HW_* defines */ #include "disp/dpu1/dpu_hw_catalog.h" -#define to_dpu_mdss(x) container_of(x, struct dpu_mdss, base) - #define HW_REV 0x0 #define HW_INTR_STATUS 0x0010 @@ -23,8 +20,9 @@ #define UBWC_CTRL_20x150 #define UBWC_PREDICTION_MODE 0x154 -struct dpu_mdss { - struct msm_mdss base; +struct msm_mdss { + struct device 
*dev; + void __iomem *mmio; struct clk_bulk_data *clocks; size_t num_clocks; @@ -36,22 +34,22 @@ struct dpu_mdss { static void msm_mdss_irq(struct irq_desc *desc) { - struct dpu_mdss *dpu_md
[PATCH v3 1/5] drm/msm: unify MDSS drivers
MDP5 and DPU1 both provide the driver handling the MDSS region, which handles the irq domain and (in case of DPU1) adds some init for the UBWC controller. Unify those two pieces of code into a common driver. Signed-off-by: Dmitry Baryshkov --- drivers/gpu/drm/msm/Makefile | 3 +- drivers/gpu/drm/msm/disp/mdp5/mdp5_mdss.c | 252 -- drivers/gpu/drm/msm/msm_drv.c | 4 +- drivers/gpu/drm/msm/msm_kms.h | 3 +- .../msm/{disp/dpu1/dpu_mdss.c => msm_mdss.c} | 145 +- 5 files changed, 83 insertions(+), 324 deletions(-) delete mode 100644 drivers/gpu/drm/msm/disp/mdp5/mdp5_mdss.c rename drivers/gpu/drm/msm/{disp/dpu1/dpu_mdss.c => msm_mdss.c} (63%) diff --git a/drivers/gpu/drm/msm/Makefile b/drivers/gpu/drm/msm/Makefile index e9cc7d8ac301..e76927b42033 100644 --- a/drivers/gpu/drm/msm/Makefile +++ b/drivers/gpu/drm/msm/Makefile @@ -42,7 +42,6 @@ msm-y := \ disp/mdp5/mdp5_crtc.o \ disp/mdp5/mdp5_encoder.o \ disp/mdp5/mdp5_irq.o \ - disp/mdp5/mdp5_mdss.o \ disp/mdp5/mdp5_kms.o \ disp/mdp5/mdp5_pipe.o \ disp/mdp5/mdp5_mixer.o \ @@ -67,7 +66,6 @@ msm-y := \ disp/dpu1/dpu_hw_util.o \ disp/dpu1/dpu_hw_vbif.o \ disp/dpu1/dpu_kms.o \ - disp/dpu1/dpu_mdss.o \ disp/dpu1/dpu_plane.o \ disp/dpu1/dpu_rm.o \ disp/dpu1/dpu_vbif.o \ @@ -88,6 +86,7 @@ msm-y := \ msm_gpu_devfreq.o \ msm_io_utils.o \ msm_iommu.o \ + msm_mdss.o \ msm_perf.o \ msm_rd.o \ msm_ringbuffer.o \ diff --git a/drivers/gpu/drm/msm/disp/mdp5/mdp5_mdss.c b/drivers/gpu/drm/msm/disp/mdp5/mdp5_mdss.c deleted file mode 100644 index 049c6784a531.. --- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_mdss.c +++ /dev/null @@ -1,252 +0,0 @@ -// SPDX-License-Identifier: GPL-2.0-only -/* - * Copyright (c) 2016, The Linux Foundation. All rights reserved. 
- */ - -#include -#include - -#include "msm_drv.h" -#include "mdp5_kms.h" - -#define to_mdp5_mdss(x) container_of(x, struct mdp5_mdss, base) - -struct mdp5_mdss { - struct msm_mdss base; - - void __iomem *mmio, *vbif; - - struct clk *ahb_clk; - struct clk *axi_clk; - struct clk *vsync_clk; - - struct { - volatile unsigned long enabled_mask; - struct irq_domain *domain; - } irqcontroller; -}; - -static inline void mdss_write(struct mdp5_mdss *mdp5_mdss, u32 reg, u32 data) -{ - msm_writel(data, mdp5_mdss->mmio + reg); -} - -static inline u32 mdss_read(struct mdp5_mdss *mdp5_mdss, u32 reg) -{ - return msm_readl(mdp5_mdss->mmio + reg); -} - -static irqreturn_t mdss_irq(int irq, void *arg) -{ - struct mdp5_mdss *mdp5_mdss = arg; - u32 intr; - - intr = mdss_read(mdp5_mdss, REG_MDSS_HW_INTR_STATUS); - - VERB("intr=%08x", intr); - - while (intr) { - irq_hw_number_t hwirq = fls(intr) - 1; - - generic_handle_domain_irq(mdp5_mdss->irqcontroller.domain, hwirq); - intr &= ~(1 << hwirq); - } - - return IRQ_HANDLED; -} - -/* - * interrupt-controller implementation, so sub-blocks (MDP/HDMI/eDP/DSI/etc) - * can register to get their irq's delivered - */ - -#define VALID_IRQS (MDSS_HW_INTR_STATUS_INTR_MDP | \ - MDSS_HW_INTR_STATUS_INTR_DSI0 | \ - MDSS_HW_INTR_STATUS_INTR_DSI1 | \ - MDSS_HW_INTR_STATUS_INTR_HDMI | \ - MDSS_HW_INTR_STATUS_INTR_EDP) - -static void mdss_hw_mask_irq(struct irq_data *irqd) -{ - struct mdp5_mdss *mdp5_mdss = irq_data_get_irq_chip_data(irqd); - - smp_mb__before_atomic(); - clear_bit(irqd->hwirq, &mdp5_mdss->irqcontroller.enabled_mask); - smp_mb__after_atomic(); -} - -static void mdss_hw_unmask_irq(struct irq_data *irqd) -{ - struct mdp5_mdss *mdp5_mdss = irq_data_get_irq_chip_data(irqd); - - smp_mb__before_atomic(); - set_bit(irqd->hwirq, &mdp5_mdss->irqcontroller.enabled_mask); - smp_mb__after_atomic(); -} - -static struct irq_chip mdss_hw_irq_chip = { - .name = "mdss", - .irq_mask = mdss_hw_mask_irq, - .irq_unmask = mdss_hw_unmask_irq, -}; - -static int 
mdss_hw_irqdomain_map(struct irq_domain *d, unsigned int irq, -irq_hw_number_t hwirq) -{ - struct mdp5_mdss *mdp5_mdss = d->host_data; - - if (!(VALID_IRQS & (1 << hwirq))) - return -EPERM; - - irq_set_chip_and_handler(irq, &mdss_hw_irq_chip, handle_level_irq); - irq_set_chip_data(irq, mdp5_mdss); - - return 0; -} - -static const struct irq_domain_ops mdss_hw_irqdomain_ops = { - .map = mdss_hw_irqdomain_map, - .xlate = irq_domain_xlate_onecell, -}; - - -static int mdss_irq_domain_init(struct mdp5_mdss *mdp5_mdss) -{ - struct device *dev = mdp5_mdss->base.dev; - struct irq_domain *d; - - d = irq_domain_add_lin
[PATCH v3 0/5] drm/msm: rework MDSS drivers
These patches continue work started by AngeloGioacchino Del Regno in the previous cycle by further decoupling and dissecting the MDSS and MDP drivers' probe/binding paths. This removes code duplication between the MDP5 and DPU1 MDSS drivers, by merging them and moving the result to the top level. This patchset depends on patches 1 and 2 from [1] Changes since v2: - Rebased on top of current msm/msm-next(-staging) - Allow disabling MDP4/MDP5/DPU/HDMI components (like we do for DP and DSI) - Made mdp5_mdss_parse_clock() static - Changed mdp5 to is_mdp5 argument in several functions - Dropped boolean device data from the mdss driver - Reworked error handling in msm_pdev_probe() - Removed unused header inclusion - Dropped __init/__exit from function prototypes Changes since v1: - Rebased on top of [2] and [1] [1] https://patchwork.freedesktop.org/series/99066/ [2] https://patchwork.freedesktop.org/series/98521/ Dmitry Baryshkov (5): drm/msm: unify MDSS drivers drm/msm: remove extra indirection for msm_mdss drm/msm: split the main platform driver drm/msm: stop using device's match data pointer drm/msm: allow runtime selection of driver components drivers/gpu/drm/msm/Kconfig | 50 ++- drivers/gpu/drm/msm/Makefile | 19 +- drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c | 10 +- drivers/gpu/drm/msm/disp/dpu1/dpu_mdss.c | 260 - drivers/gpu/drm/msm/disp/mdp4/mdp4_kms.c | 68 +++- drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c | 11 +- drivers/gpu/drm/msm/disp/mdp5/mdp5_mdss.c | 252 - drivers/gpu/drm/msm/msm_drv.c | 263 +++-- drivers/gpu/drm/msm/msm_drv.h | 57 ++- drivers/gpu/drm/msm/msm_kms.h | 18 - drivers/gpu/drm/msm/msm_mdss.c | 429 ++ 11 files changed, 667 insertions(+), 770 deletions(-) delete mode 100644 drivers/gpu/drm/msm/disp/dpu1/dpu_mdss.c delete mode 100644 drivers/gpu/drm/msm/disp/mdp5/mdp5_mdss.c create mode 100644 drivers/gpu/drm/msm/msm_mdss.c base-commit: 8ddb80c5fcf455fe38156636126a83eadacfb743 -- 2.34.1
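[Editorial aside: patch 2/5 of this series makes struct msm_mdss opaque — the definition moves into msm_mdss.c and callers see only a forward declaration plus accessor functions. The sketch below shows that general opaque-handle idiom in plain userspace C; the `demo_mdss_*` names are illustrative, not the actual msm_mdss API, and the fields are stand-ins.]

```c
/*
 * Minimal sketch of the opaque-handle idiom: the struct definition lives
 * only in the implementation file, so callers cannot poke at the fields
 * and the internals can change without touching users.
 */
#include <stdlib.h>
#include <stdbool.h>

/* In the "header": callers only ever see this forward declaration. */
struct demo_mdss;

/* In the "implementation": the full definition stays private. */
struct demo_mdss {
	bool enabled;
	int irq;
};

struct demo_mdss *demo_mdss_init(int irq)
{
	struct demo_mdss *mdss = calloc(1, sizeof(*mdss));

	if (!mdss)
		return NULL;
	mdss->irq = irq;
	return mdss;
}

int demo_mdss_enable(struct demo_mdss *mdss)
{
	mdss->enabled = true;
	return 0;
}

int demo_mdss_disable(struct demo_mdss *mdss)
{
	mdss->enabled = false;
	return 0;
}

bool demo_mdss_is_enabled(const struct demo_mdss *mdss)
{
	return mdss->enabled;
}

void demo_mdss_destroy(struct demo_mdss *mdss)
{
	free(mdss);
}
```

This is also why the series can drop the msm_mdss_funcs vtable: with a single subdriver there is exactly one implementation behind the opaque type, so direct calls replace the indirection.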
Re: Report 2 in ext4 and journal based on v5.17-rc1
On Thu, Mar 03, 2022 at 09:36:25AM -0500, Theodore Ts'o wrote: > On Thu, Mar 03, 2022 at 02:23:33PM +0900, Byungchul Park wrote: > > I totally agree with you. *They aren't really locks but it's just waits > > and wakeups.* That's exactly why I decided to develop Dept. Dept is not > > interested in locks unlike Lockdep, but focuses on waits and wakeup > > sources itself. I think you get Dept wrong a lot. Please ask me more if > > you have things you doubt about Dept. > > So the question is this --- do you now understand why, even though > there is a circular dependency, nothing gets stalled in the > interactions between the two wait channels? Thanks to Jan Kara, I found a case where the two wait channels don't lead to a deadlock. I will fix Dept so that it won't complain about it. Thanks, Byungchul > > - Ted
Re: Report 2 in ext4 and journal based on v5.17-rc1
On Thu, Mar 03, 2022 at 10:54:56AM +0100, Jan Kara wrote: > On Thu 03-03-22 10:00:33, Byungchul Park wrote: > > Unfortunately, it's neither perfect nor safe without another wakeup > > source - rescue wakeup source. > > > >consumer producer > > > > lock L > > (too much work queued == true) > > unlock L > > --- preempted > >lock L > >unlock L > >do work > >lock L > >unlock L > >do work > >... > >(no work == true) > >sleep > > --- scheduled in > > sleep > > > > This code leads a deadlock without another wakeup source, say, not safe. > > So the scenario you describe above is indeed possible. But the trick is > that the wakeup from 'consumer' as is doing work will remove 'producer' > from the wait queue and change the 'producer' process state to > 'TASK_RUNNING'. So when 'producer' calls sleep (in fact schedule()), the > scheduler will just treat this as another preemption point and the > 'producer' will immediately or soon continue to run. So indeed we can think > of this as "another wakeup source" but the source is in the CPU scheduler > itself. This is the standard way how waitqueues are used in the kernel... Nice! Thanks for the explanation. I will take it into account if needed. > > Lastly, just for your information, I need to explain how Dept works a > > little more for you not to misunderstand Dept. > > > > Assuming the consumer and producer guarantee not to lead a deadlock like > > the following, Dept won't report it a problem: > > > >consumer producer > > > > sleep > >wakeup work_done > > queue work > >sleep > > wakeup work_queued > >do work > > sleep > >wakeup work_done > > queue work > >sleep > > wakeup work_queued > >do work > > sleep > >... ... > > > > Dept does not consider all waits preceeding an event but only waits that > > might lead a deadlock. In this case, Dept works with each region > > independently. > > > >consumer producer > > > > sleep <- initiates region 1 > >--- region 1 starts > >... ... > >--- region 1 ends > >wakeup work_done > >... ... 
> > queue work > >... ... > >sleep <- initiates region 2 > > --- region 2 starts > >... ... > > --- region 2 ends > > wakeup work_queued > >... ... > >do work > >... ... > > sleep <- initiates region 3 > >--- region 3 starts > >... ... > >--- region 3 ends > >wakeup work_done > >... ... > > queue work > >... ... > >sleep <- initiates region 4 > > --- region 4 starts > >... ... > > --- region 4 ends > > wakeup work_queued > >... ... > >do work > >... ... > > > > That is, Dept does not build dependencies across different regions. So > > you don't have to worry about unreasonable false positives that much. > > > > Thoughts? > > Thanks for explanation! And what exactly defines the 'regions'? When some > process goes to sleep on some waitqueue, this defines a start of a region > at the place where all the other processes are at that moment and wakeup of > the waitqueue is an end of the region? Yes. Let me explain it more for better understanding. (I copied it from the talk I did with Matthew..) ideal view --- context Xcontext Y request event E ... write REQUESTEVENTwhen (notice REQUESTEVENT written) ... notice the request from X [S] --- ideally region 1 starts here wait for the event ... sleep if (can see REQUESTEVENT written) it's on the way to the event ... ... --- ideally region 1 ends here finally the event [E] Dept basically works with the above view with regard to wait and event. But it's very
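[Editorial aside: the "wakeup source in the scheduler itself" behaviour Jan describes — the waker removes the sleeper from the wait queue and sets it runnable, so a later schedule() is just a preemption point — has a familiar userspace analogue: re-checking the condition under the lock before blocking, so a wakeup that races with the decision to sleep is never lost. The pthreads sketch below illustrates that pattern only; kernel waitqueues work differently internally, and all names here are illustrative.]

```c
/*
 * No-lost-wakeup pattern, userspace analogue: the waiter tests its
 * condition under the same lock the waker uses to set it, so even if the
 * signal fires before the waiter reaches the blocking call, the waiter
 * sees the condition already true and never sleeps.
 */
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
static bool work_done;

static void *consumer(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&lock);
	work_done = true;           /* "do work" */
	pthread_cond_signal(&cond); /* wakeup; may happen before the sleep */
	pthread_mutex_unlock(&lock);
	return NULL;
}

/* Returns 1 once the work is observed; cannot deadlock regardless of
 * whether the consumer's signal races ahead of the producer's sleep. */
int producer_wait(void)
{
	pthread_t t;

	pthread_create(&t, NULL, consumer, NULL);

	pthread_mutex_lock(&lock);
	while (!work_done)          /* re-check before (and after) sleeping */
		pthread_cond_wait(&cond, &lock);
	pthread_mutex_unlock(&lock);

	pthread_join(t, NULL);
	return 1;
}
```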
[PATCH v2 2/2] drm/msm/dp: Implement oob_hotplug_event()
The Qualcomm DisplayPort driver contains traces of the necessary plumbing to hook up USB HPD, in the form of the dp_hpd module and the dp_usbpd_cb struct. Use this as a basis for implementing the oob_hotplug_event() callback, by amending the dp_hpd module with the missing logic. Overall the solution is similar to what's done downstream, but upstream all the code to dissect the HPD notification lives on the calling side of drm_connector_oob_hotplug_event(). drm_connector_oob_hotplug_event() performs the lookup of the drm_connector based on fwnode, hence the need to assign the fwnode in dp_drm_connector_init(). Changes in v2: - Adopt enum drm_connector_hpd_state Signed-off-by: Bjorn Andersson --- drivers/gpu/drm/msm/dp/dp_display.c | 9 + drivers/gpu/drm/msm/dp/dp_display.h | 3 +++ drivers/gpu/drm/msm/dp/dp_drm.c | 11 +++ drivers/gpu/drm/msm/dp/dp_hpd.c | 21 + drivers/gpu/drm/msm/dp/dp_hpd.h | 5 + 5 files changed, 49 insertions(+) diff --git a/drivers/gpu/drm/msm/dp/dp_display.c b/drivers/gpu/drm/msm/dp/dp_display.c index 178b774a5fbd..3d9d754a75f3 100644 --- a/drivers/gpu/drm/msm/dp/dp_display.c +++ b/drivers/gpu/drm/msm/dp/dp_display.c @@ -449,6 +449,14 @@ static int dp_display_usbpd_configure_cb(struct device *dev) return dp_display_process_hpd_high(dp); } +void dp_display_oob_hotplug_event(struct msm_dp *dp_display, + enum drm_connector_hpd_state hpd_state) +{ + struct dp_display_private *dp = container_of(dp_display, struct dp_display_private, dp_display); + + dp->usbpd->oob_event(dp->usbpd, hpd_state); +} + static int dp_display_usbpd_disconnect_cb(struct device *dev) { struct dp_display_private *dp = dev_get_dp_display_private(dev); @@ -1296,6 +1304,7 @@ static int dp_display_probe(struct platform_device *pdev) dp->pdev = pdev; dp->name = "drm_dp"; dp->dp_display.connector_type = desc->connector_type; + dp->dp_display.dev = &pdev->dev; rc = dp_init_sub_modules(dp); if (rc) { diff --git a/drivers/gpu/drm/msm/dp/dp_display.h b/drivers/gpu/drm/msm/dp/dp_display.h 
index 7af2b186d2d9..16658270df2c 100644 --- a/drivers/gpu/drm/msm/dp/dp_display.h +++ b/drivers/gpu/drm/msm/dp/dp_display.h @@ -11,6 +11,7 @@ #include "disp/msm_disp_snapshot.h" struct msm_dp { + struct device *dev; struct drm_device *drm_dev; struct device *codec_dev; struct drm_bridge *bridge; @@ -40,5 +41,7 @@ bool dp_display_check_video_test(struct msm_dp *dp_display); int dp_display_get_test_bpp(struct msm_dp *dp_display); void dp_display_signal_audio_start(struct msm_dp *dp_display); void dp_display_signal_audio_complete(struct msm_dp *dp_display); +void dp_display_oob_hotplug_event(struct msm_dp *dp_display, + enum drm_connector_hpd_state hpd_state); #endif /* _DP_DISPLAY_H_ */ diff --git a/drivers/gpu/drm/msm/dp/dp_drm.c b/drivers/gpu/drm/msm/dp/dp_drm.c index 80f59cf99089..76904b1601b1 100644 --- a/drivers/gpu/drm/msm/dp/dp_drm.c +++ b/drivers/gpu/drm/msm/dp/dp_drm.c @@ -123,6 +123,14 @@ static enum drm_mode_status dp_connector_mode_valid( return dp_display_validate_mode(dp_disp, mode->clock); } +static void dp_oob_hotplug_event(struct drm_connector *connector, +enum drm_connector_hpd_state hpd_state) +{ + struct msm_dp *dp_disp = to_dp_connector(connector)->dp_display; + + dp_display_oob_hotplug_event(dp_disp, hpd_state); +} + static const struct drm_connector_funcs dp_connector_funcs = { .detect = dp_connector_detect, .fill_modes = drm_helper_probe_single_connector_modes, @@ -130,6 +138,7 @@ static const struct drm_connector_funcs dp_connector_funcs = { .reset = drm_atomic_helper_connector_reset, .atomic_duplicate_state = drm_atomic_helper_connector_duplicate_state, .atomic_destroy_state = drm_atomic_helper_connector_destroy_state, + .oob_hotplug_event = dp_oob_hotplug_event, }; static const struct drm_connector_helper_funcs dp_connector_helper_funcs = { @@ -160,6 +169,8 @@ struct drm_connector *dp_drm_connector_init(struct msm_dp *dp_display) if (ret) return ERR_PTR(ret); + connector->fwnode = fwnode_handle_get(dev_fwnode(dp_display->dev)); + 
drm_connector_helper_add(connector, &dp_connector_helper_funcs); /* diff --git a/drivers/gpu/drm/msm/dp/dp_hpd.c b/drivers/gpu/drm/msm/dp/dp_hpd.c index db98a1d431eb..cdb1feea5ebf 100644 --- a/drivers/gpu/drm/msm/dp/dp_hpd.c +++ b/drivers/gpu/drm/msm/dp/dp_hpd.c @@ -7,6 +7,8 @@ #include #include +#include +#include #include "dp_hpd.h" @@ -45,6 +47,24 @@ int dp_hpd_connect(struct dp_usbpd *dp_usbpd, bool hpd) return rc; } +static void dp_hpd_oob_event(struct dp_usbpd *dp_usbpd, +enum drm_connector_hpd_state hpd_state) +{ + struct dp_hpd_
[PATCH v2 1/2] drm: Add HPD state to drm_connector_oob_hotplug_event()
In some implementations, such as the Qualcomm platforms, the display driver has no way to query the current HPD state and as such it's impossible to distinguish between disconnect and attention events. Add a parameter to drm_connector_oob_hotplug_event() to pass the HPD state. Also push the test for unchanged state in the displayport altmode driver into the i915 driver, to allow other drivers to act upon each update. Changes in v2: - Replace bool with drm_connector_hpd_state enum to represent "state" better - Track old hpd state per encoder in i915 Signed-off-by: Bjorn Andersson --- drivers/gpu/drm/drm_connector.c | 6 -- drivers/gpu/drm/i915/display/intel_dp.c | 17 ++--- drivers/gpu/drm/i915/i915_drv.h | 3 +++ drivers/usb/typec/altmodes/displayport.c | 10 +++--- include/drm/drm_connector.h | 11 +-- 5 files changed, 33 insertions(+), 14 deletions(-) diff --git a/drivers/gpu/drm/drm_connector.c b/drivers/gpu/drm/drm_connector.c index a50c82bc2b2f..a44f082ebd9d 100644 --- a/drivers/gpu/drm/drm_connector.c +++ b/drivers/gpu/drm/drm_connector.c @@ -2825,6 +2825,7 @@ struct drm_connector *drm_connector_find_by_fwnode(struct fwnode_handle *fwnode) /** * drm_connector_oob_hotplug_event - Report out-of-band hotplug event to connector * @connector_fwnode: fwnode_handle to report the event on + * @hpd_state: hot plug detect logical state * * On some hardware a hotplug event notification may come from outside the display * driver / device. An example of this is some USB Type-C setups where the hardware @@ -2834,7 +2835,8 @@ struct drm_connector *drm_connector_find_by_fwnode(struct fwnode_handle *fwnode) * This function can be used to report these out-of-band events after obtaining * a drm_connector reference through calling drm_connector_find_by_fwnode(). 
*/ -void drm_connector_oob_hotplug_event(struct fwnode_handle *connector_fwnode) +void drm_connector_oob_hotplug_event(struct fwnode_handle *connector_fwnode, +enum drm_connector_hpd_state hpd_state) { struct drm_connector *connector; @@ -2843,7 +2845,7 @@ void drm_connector_oob_hotplug_event(struct fwnode_handle *connector_fwnode) return; if (connector->funcs->oob_hotplug_event) - connector->funcs->oob_hotplug_event(connector); + connector->funcs->oob_hotplug_event(connector, hpd_state); drm_connector_put(connector); } diff --git a/drivers/gpu/drm/i915/display/intel_dp.c b/drivers/gpu/drm/i915/display/intel_dp.c index 1046e7fe310a..a3c9dbae5cee 100644 --- a/drivers/gpu/drm/i915/display/intel_dp.c +++ b/drivers/gpu/drm/i915/display/intel_dp.c @@ -4825,15 +4825,26 @@ static int intel_dp_connector_atomic_check(struct drm_connector *conn, return intel_modeset_synced_crtcs(state, conn); } -static void intel_dp_oob_hotplug_event(struct drm_connector *connector) +static void intel_dp_oob_hotplug_event(struct drm_connector *connector, + enum drm_connector_hpd_state hpd_state) { struct intel_encoder *encoder = intel_attached_encoder(to_intel_connector(connector)); struct drm_i915_private *i915 = to_i915(connector->dev); + bool hpd_high = hpd_state == DRM_CONNECTOR_HPD_HIGH; + unsigned int hpd_pin = encoder->hpd_pin; + bool need_work = false; spin_lock_irq(&i915->irq_lock); - i915->hotplug.event_bits |= BIT(encoder->hpd_pin); + if (hpd_high != test_bit(hpd_pin, &i915->hotplug.oob_hotplug_last_state)) { + i915->hotplug.event_bits |= BIT(hpd_pin); + + __assign_bit(hpd_pin, &i915->hotplug.oob_hotplug_last_state, hpd_high); + need_work = true; + } spin_unlock_irq(&i915->irq_lock); - queue_delayed_work(system_wq, &i915->hotplug.hotplug_work, 0); + + if (need_work) + queue_delayed_work(system_wq, &i915->hotplug.hotplug_work, 0); } static const struct drm_connector_funcs intel_dp_connector_funcs = { diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h 
index 5cfe69b30841..80a4615a38e2 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -138,6 +138,9 @@ struct i915_hotplug { /* Whether or not to count short HPD IRQs in HPD storms */ u8 hpd_short_storm_enabled; + /* Last state reported by oob_hotplug_event for each encoder */ + unsigned long oob_hotplug_last_state; + /* * if we get a HPD irq from DP and a HPD irq from non-DP * the non-DP HPD could block the workqueue on a mode config diff --git a/drivers/usb/typec/altmodes/displayport.c b/drivers/usb/typec/altmodes/displayport.c index c1d8c23baa39..ea9cb1d71fd2 100644 --- a/drivers/usb/typec/altmodes/displayport.c +++ b/drivers/usb/typec/altmodes/displayport.c @@ -59,7 +59,6 @@ struct dp_altmode { struct typec_displ
[PATCH 4/4] drm/msm/a6xx: Zap counters across context switch
From: Rob Clark Any app controlled perfcntr collection (GL_AMD_performance_monitor, etc) does not require counters to maintain state across context switches. So clear them if systemwide profiling is not active. Signed-off-by: Rob Clark --- drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 29 +++ 1 file changed, 29 insertions(+) diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c index 237c2e7a7baa..02b47977b5c3 100644 --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c @@ -101,6 +101,7 @@ static void get_stats_counter(struct msm_ringbuffer *ring, u32 counter, static void a6xx_set_pagetable(struct a6xx_gpu *a6xx_gpu, struct msm_ringbuffer *ring, struct msm_file_private *ctx) { + bool sysprof = refcount_read(&a6xx_gpu->base.base.sysprof_active) > 1; phys_addr_t ttbr; u32 asid; u64 memptr = rbmemptr(ring, ttbr0); @@ -111,6 +112,15 @@ static void a6xx_set_pagetable(struct a6xx_gpu *a6xx_gpu, if (msm_iommu_pagetable_params(ctx->aspace->mmu, &ttbr, &asid)) return; + if (!sysprof) { + /* Turn off protected mode to write to special registers */ + OUT_PKT7(ring, CP_SET_PROTECTED_MODE, 1); + OUT_RING(ring, 0); + + OUT_PKT4(ring, REG_A6XX_RBBM_PERFCTR_SRAM_INIT_CMD, 1); + OUT_RING(ring, 1); + } + /* Execute the table update */ OUT_PKT7(ring, CP_SMMU_TABLE_UPDATE, 4); OUT_RING(ring, CP_SMMU_TABLE_UPDATE_0_TTBR0_LO(lower_32_bits(ttbr))); @@ -137,6 +147,25 @@ static void a6xx_set_pagetable(struct a6xx_gpu *a6xx_gpu, OUT_PKT7(ring, CP_EVENT_WRITE, 1); OUT_RING(ring, 0x31); + + if (!sysprof) { + /* +* Wait for SRAM clear after the pgtable update, so the +* two can happen in parallel: +*/ + OUT_PKT7(ring, CP_WAIT_REG_MEM, 6); + OUT_RING(ring, CP_WAIT_REG_MEM_0_FUNCTION(WRITE_EQ)); + OUT_RING(ring, CP_WAIT_REG_MEM_1_POLL_ADDR_LO( + REG_A6XX_RBBM_PERFCTR_SRAM_INIT_STATUS)); + OUT_RING(ring, CP_WAIT_REG_MEM_2_POLL_ADDR_HI(0)); + OUT_RING(ring, CP_WAIT_REG_MEM_3_REF(0x1)); + OUT_RING(ring, CP_WAIT_REG_MEM_4_MASK(0x1)); + 
OUT_RING(ring, CP_WAIT_REG_MEM_5_DELAY_LOOP_CYCLES(0)); + + /* Re-enable protected mode: */ + OUT_PKT7(ring, CP_SET_PROTECTED_MODE, 1); + OUT_RING(ring, 1); + } } static void a6xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit) -- 2.35.1
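[Editorial aside: the `sysprof` check above reads `refcount_read(...) > 1` because patch 3/4 initializes `sysprof_active` to one — kernel `refcount_t` flags a 0->1 increment as a potential use-after-free, so the count is biased and "active" means strictly greater than one. A plain-int sketch of that biased-counter gate, with illustrative `demo_*` names rather than the real msm_gpu API:]

```c
/*
 * Biased-refcount gate: baseline count is 1, so "at least one profiler
 * holds a reference" is expressed as count > 1, never as count > 0.
 * Plain int here stands in for the kernel's refcount_t.
 */
#include <stdbool.h>

struct demo_gpu {
	int sysprof_active; /* stand-in for refcount_t */
};

void demo_gpu_init(struct demo_gpu *gpu)
{
	gpu->sysprof_active = 1; /* bias: baseline is 1, not 0 */
}

void demo_sysprof_begin(struct demo_gpu *gpu)
{
	gpu->sysprof_active++;
}

void demo_sysprof_end(struct demo_gpu *gpu)
{
	gpu->sysprof_active--;
}

/* Mirrors the check at the top of a6xx_set_pagetable(): perfcounters are
 * preserved across context switches only while this returns true. */
bool demo_sysprof_is_active(const struct demo_gpu *gpu)
{
	return gpu->sysprof_active > 1;
}
```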
[PATCH 3/4] drm/msm: Add SYSPROF param (v2)
From: Rob Clark Add a SYSPROF param for system profiling tools like Mesa's pps-producer (perfetto) to control behavior related to system-wide performance counter collection. In particular, for profiling, one wants to ensure that GPU context switches do not affect perfcounter state, and might want to suppress suspend (which would cause counters to lose state). v2: Swap the order in msm_file_private_set_sysprof() [sboyd] and initialize the sysprof_active refcount to one (because the under/ overflow checking in refcount_t doesn't expect a 0->1 transition), meaning that values greater than 1 mean sysprof is active. Signed-off-by: Rob Clark --- drivers/gpu/drm/msm/adreno/adreno_gpu.c | 4 +++ drivers/gpu/drm/msm/msm_drv.c | 8 + drivers/gpu/drm/msm/msm_gpu.c | 2 ++ drivers/gpu/drm/msm/msm_gpu.h | 27 + drivers/gpu/drm/msm/msm_submitqueue.c | 39 + include/uapi/drm/msm_drm.h | 1 + 6 files changed, 81 insertions(+) diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c index 6a37d409653b..c91ea363c373 100644 --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c @@ -287,6 +287,10 @@ int adreno_set_param(struct msm_gpu *gpu, struct msm_file_private *ctx, uint32_t param, uint64_t value) { switch (param) { + case MSM_PARAM_SYSPROF: + if (!capable(CAP_SYS_ADMIN)) + return -EPERM; + return msm_file_private_set_sysprof(ctx, gpu, value); default: DBG("%s: invalid param: %u", gpu->name, param); return -EINVAL; } } diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c index ca9a8a866292..780f9748aaaf 100644 --- a/drivers/gpu/drm/msm/msm_drv.c +++ b/drivers/gpu/drm/msm/msm_drv.c @@ -559,8 +559,16 @@ static void context_close(struct msm_file_private *ctx) static void msm_postclose(struct drm_device *dev, struct drm_file *file) { + struct msm_drm_private *priv = dev->dev_private; struct msm_file_private *ctx = file->driver_priv; + /* +* It is not possible to set sysprof param to non-zero if gpu 
+* is not initialized: +*/ + if (priv->gpu) + msm_file_private_set_sysprof(ctx, priv->gpu, 0); + context_close(ctx); } diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c index c4fe8bc9445e..8fe4aee96aa9 100644 --- a/drivers/gpu/drm/msm/msm_gpu.c +++ b/drivers/gpu/drm/msm/msm_gpu.c @@ -975,6 +975,8 @@ int msm_gpu_init(struct drm_device *drm, struct platform_device *pdev, gpu->nr_rings = nr_rings; + refcount_set(&gpu->sysprof_active, 1); + return 0; fail: diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h index fde9a29f884e..a84140055920 100644 --- a/drivers/gpu/drm/msm/msm_gpu.h +++ b/drivers/gpu/drm/msm/msm_gpu.h @@ -160,6 +160,13 @@ struct msm_gpu { struct msm_ringbuffer *rb[MSM_GPU_MAX_RINGS]; int nr_rings; + /** +* sysprof_active: +* +* The count of contexts that have enabled system profiling. +*/ + refcount_t sysprof_active; + /** * cur_ctx_seqno: * @@ -330,6 +337,24 @@ struct msm_file_private { struct kref ref; int seqno; + /** +* sysprof: +* +* The value of MSM_PARAM_SYSPROF set by userspace. This is +* intended to be used by system profiling tools like Mesa's +* pps-producer (perfetto), and restricted to CAP_SYS_ADMIN. +* +* Setting a value of 1 will preserve performance counters across +* context switches. Setting a value of 2 will in addition +* suppress suspend. (Performance counters lose state across +* power collapse, which is undesirable for profiling in some +* cases.) +* +* The value automatically reverts to zero when the drm device +* file is closed. 
+*/ + int sysprof; + /** * elapsed: * @@ -545,6 +570,8 @@ void msm_submitqueue_close(struct msm_file_private *ctx); void msm_submitqueue_destroy(struct kref *kref); +int msm_file_private_set_sysprof(struct msm_file_private *ctx, +struct msm_gpu *gpu, int sysprof); void __msm_file_private_destroy(struct kref *kref); static inline void msm_file_private_put(struct msm_file_private *ctx) diff --git a/drivers/gpu/drm/msm/msm_submitqueue.c b/drivers/gpu/drm/msm/msm_submitqueue.c index 7cb158bcbcf6..79b6ccd6ce64 100644 --- a/drivers/gpu/drm/msm/msm_submitqueue.c +++ b/drivers/gpu/drm/msm/msm_submitqueue.c @@ -7,6 +7,45 @@ #include "msm_gpu.h" +int msm_file_private_set_sysprof(struct msm_file_private *ctx, +struct msm_gpu *gpu, int sysprof) +{ + /*
[PATCH 2/4] drm/msm: Add SET_PARAM ioctl
From: Rob Clark It was always expected to have a use for this some day, so we left a placeholder. Now we do. (And I expect another use in the not too distant future when we start allowing userspace to allocate GPU iova.) Signed-off-by: Rob Clark --- drivers/gpu/drm/msm/adreno/a2xx_gpu.c | 1 + drivers/gpu/drm/msm/adreno/a3xx_gpu.c | 1 + drivers/gpu/drm/msm/adreno/a4xx_gpu.c | 1 + drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 1 + drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 1 + drivers/gpu/drm/msm/adreno/adreno_gpu.c | 10 + drivers/gpu/drm/msm/adreno/adreno_gpu.h | 2 ++ drivers/gpu/drm/msm/msm_drv.c | 20 ++ drivers/gpu/drm/msm/msm_gpu.h | 2 ++ include/uapi/drm/msm_drm.h | 27 ++--- 10 files changed, 54 insertions(+), 12 deletions(-) diff --git a/drivers/gpu/drm/msm/adreno/a2xx_gpu.c b/drivers/gpu/drm/msm/adreno/a2xx_gpu.c index 22e8295a5e2b..6c9a747eb4ad 100644 --- a/drivers/gpu/drm/msm/adreno/a2xx_gpu.c +++ b/drivers/gpu/drm/msm/adreno/a2xx_gpu.c @@ -471,6 +471,7 @@ static u32 a2xx_get_rptr(struct msm_gpu *gpu, struct msm_ringbuffer *ring) static const struct adreno_gpu_funcs funcs = { .base = { .get_param = adreno_get_param, + .set_param = adreno_set_param, .hw_init = a2xx_hw_init, .pm_suspend = msm_gpu_pm_suspend, .pm_resume = msm_gpu_pm_resume, diff --git a/drivers/gpu/drm/msm/adreno/a3xx_gpu.c b/drivers/gpu/drm/msm/adreno/a3xx_gpu.c index 2e481e2692ba..0ab0e1dd8bbb 100644 --- a/drivers/gpu/drm/msm/adreno/a3xx_gpu.c +++ b/drivers/gpu/drm/msm/adreno/a3xx_gpu.c @@ -486,6 +486,7 @@ static u32 a3xx_get_rptr(struct msm_gpu *gpu, struct msm_ringbuffer *ring) static const struct adreno_gpu_funcs funcs = { .base = { .get_param = adreno_get_param, + .set_param = adreno_set_param, .hw_init = a3xx_hw_init, .pm_suspend = msm_gpu_pm_suspend, .pm_resume = msm_gpu_pm_resume, diff --git a/drivers/gpu/drm/msm/adreno/a4xx_gpu.c b/drivers/gpu/drm/msm/adreno/a4xx_gpu.c index c5524d6e8705..0c6b2a6d0b4c 100644 --- a/drivers/gpu/drm/msm/adreno/a4xx_gpu.c +++ b/drivers/gpu/drm/msm/adreno/a4xx_gpu.c 
@@ -621,6 +621,7 @@ static u32 a4xx_get_rptr(struct msm_gpu *gpu, struct msm_ringbuffer *ring) static const struct adreno_gpu_funcs funcs = { .base = { .get_param = adreno_get_param, + .set_param = adreno_set_param, .hw_init = a4xx_hw_init, .pm_suspend = a4xx_pm_suspend, .pm_resume = a4xx_pm_resume, diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c index 3d28fcf841a6..407f50a15faa 100644 --- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c +++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c @@ -1700,6 +1700,7 @@ static uint32_t a5xx_get_rptr(struct msm_gpu *gpu, struct msm_ringbuffer *ring) static const struct adreno_gpu_funcs funcs = { .base = { .get_param = adreno_get_param, + .set_param = adreno_set_param, .hw_init = a5xx_hw_init, .pm_suspend = a5xx_pm_suspend, .pm_resume = a5xx_pm_resume, diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c index 7d23c741db4a..237c2e7a7baa 100644 --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c @@ -1800,6 +1800,7 @@ static int a6xx_set_supported_hw(struct device *dev, struct adreno_rev rev) static const struct adreno_gpu_funcs funcs = { .base = { .get_param = adreno_get_param, + .set_param = adreno_set_param, .hw_init = a6xx_hw_init, .pm_suspend = a6xx_pm_suspend, .pm_resume = a6xx_pm_resume, diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c index 15c8997b7251..6a37d409653b 100644 --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c @@ -283,6 +283,16 @@ int adreno_get_param(struct msm_gpu *gpu, struct msm_file_private *ctx, } } +int adreno_set_param(struct msm_gpu *gpu, struct msm_file_private *ctx, +uint32_t param, uint64_t value) +{ + switch (param) { + default: + DBG("%s: invalid param: %u", gpu->name, param); + return -EINVAL; + } +} + const struct firmware * adreno_request_fw(struct adreno_gpu *adreno_gpu, const char *fwname) { diff --git 
a/drivers/gpu/drm/msm/adreno/adreno_gpu.h b/drivers/gpu/drm/msm/adreno/adreno_gpu.h index b1ee453d627d..0490c5fbb780 100644 --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.h +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.h @@ -282,6 +282,8 @@ static inline int adreno_is_a650_family(struct adreno_gpu *gpu) int adreno_get_param(struct msm_gpu *gpu, struct msm_file_
[PATCH 0/4] drm/msm: Clear perf counters across context switch
From: Rob Clark Some clever folks figured out a way to use performance counters as a side-channel[1]. But, other than the special case of using the perf counters for system profiling, we can reset the counters across context switches to protect against this. This series introduces a SYSPROF param which allows a sufficiently privileged userspace (like Mesa's pps-producer, which already must run as root) to opt out, and makes the default behavior to reset counters on context switches. [1] https://dl.acm.org/doi/pdf/10.1145/3503222.3507757 Rob Clark (4): drm/msm: Update generated headers drm/msm: Add SET_PARAM ioctl drm/msm: Add SYSPROF param (v2) drm/msm/a6xx: Zap counters across context switch drivers/gpu/drm/msm/adreno/a2xx.xml.h | 26 +- drivers/gpu/drm/msm/adreno/a2xx_gpu.c | 1 + drivers/gpu/drm/msm/adreno/a3xx.xml.h | 30 +- drivers/gpu/drm/msm/adreno/a3xx_gpu.c | 1 + drivers/gpu/drm/msm/adreno/a4xx.xml.h | 112 ++- drivers/gpu/drm/msm/adreno/a4xx_gpu.c | 1 + drivers/gpu/drm/msm/adreno/a5xx.xml.h | 63 +- drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 1 + drivers/gpu/drm/msm/adreno/a6xx.xml.h | 674 +++--- drivers/gpu/drm/msm/adreno/a6xx_gmu.xml.h | 26 +- drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 30 + .../gpu/drm/msm/adreno/adreno_common.xml.h| 31 +- drivers/gpu/drm/msm/adreno/adreno_gpu.c | 14 + drivers/gpu/drm/msm/adreno/adreno_gpu.h | 2 + drivers/gpu/drm/msm/adreno/adreno_pm4.xml.h | 46 +- drivers/gpu/drm/msm/disp/mdp4/mdp4.xml.h | 37 +- drivers/gpu/drm/msm/disp/mdp5/mdp5.xml.h | 37 +- drivers/gpu/drm/msm/disp/mdp_common.xml.h | 37 +- drivers/gpu/drm/msm/dsi/dsi.xml.h | 37 +- drivers/gpu/drm/msm/dsi/dsi_phy_10nm.xml.h| 37 +- drivers/gpu/drm/msm/dsi/dsi_phy_14nm.xml.h| 37 +- drivers/gpu/drm/msm/dsi/dsi_phy_20nm.xml.h| 37 +- drivers/gpu/drm/msm/dsi/dsi_phy_28nm.xml.h| 37 +- .../gpu/drm/msm/dsi/dsi_phy_28nm_8960.xml.h | 37 +- drivers/gpu/drm/msm/dsi/dsi_phy_5nm.xml.h | 480 - drivers/gpu/drm/msm/dsi/dsi_phy_7nm.xml.h | 43 +- drivers/gpu/drm/msm/dsi/mmss_cc.xml.h | 37 +-
drivers/gpu/drm/msm/dsi/sfpb.xml.h| 37 +- drivers/gpu/drm/msm/hdmi/hdmi.xml.h | 37 +- drivers/gpu/drm/msm/hdmi/qfprom.xml.h | 37 +- drivers/gpu/drm/msm/msm_drv.c | 28 + drivers/gpu/drm/msm/msm_gpu.c | 2 + drivers/gpu/drm/msm/msm_gpu.h | 29 + drivers/gpu/drm/msm/msm_submitqueue.c | 39 + include/uapi/drm/msm_drm.h| 28 +- 35 files changed, 1058 insertions(+), 1130 deletions(-) delete mode 100644 drivers/gpu/drm/msm/dsi/dsi_phy_5nm.xml.h -- 2.35.1
Re: Report 2 in ext4 and journal based on v5.17-rc1
On Thu, Mar 03, 2022 at 09:36:25AM -0500, Theodore Ts'o wrote: > On Thu, Mar 03, 2022 at 02:23:33PM +0900, Byungchul Park wrote: > > I totally agree with you. *They aren't really locks but it's just waits > > and wakeups.* That's exactly why I decided to develop Dept. Dept is not > > interested in locks unlike Lockdep, but focuses on waits and wakeup > > sources itself. I think you get Dept wrong a lot. Please ask me more if > > you have things you doubt about Dept. > > So the question is this --- do you now understand why, even though > there is a circular dependency, nothing gets stalled in the > interactions between the two wait channels? ??? I'm afraid I don't get you. All contexts waiting for any of the events in the circular dependency chain will definitely be stuck if there is a circular dependency as I explained. So we need another wakeup source to break the circle. In ext4 code, you might have the wakeup source for breaking the circle. What I agreed with is: The case that 1) the circular dependency is inevitable, 2) there is another wakeup source for breaking the circle, and 3) the duration in sleep is short enough, should be acceptable. Sounds good? Thanks, Byungchul
Re: [PATCH v3 00/21] DEPT(Dependency Tracker)
On Thu, Mar 03, 2022 at 12:38:39PM +, Hyeonggon Yoo wrote: > On Thu, Mar 03, 2022 at 06:48:24PM +0900, Byungchul Park wrote: > > On Thu, Mar 03, 2022 at 08:03:21AM +, Hyeonggon Yoo wrote: > > > On Thu, Mar 03, 2022 at 09:18:13AM +0900, Byungchul Park wrote: > > > > Hi Hyeonggon, > > > > > > > > Dept also allows the following scenario when a user guarantees that > > > > each lock instance is different from another at a different depth: > > > > > > > >lock A0 with depth > > > >lock A1 with depth + 1 > > > >lock A2 with depth + 2 > > > >lock A3 with depth + 3 > > > >(and so on) > > > >.. > > > >unlock A3 > > > >unlock A2 > > > >unlock A1 > > > >unlock A0 > > > > [+Cc kmemleak maintainer] > > > Look at this. Dept allows object->lock -> other_object->lock (with a > > different depth using *_lock_nested()) so won't report it. > > > > No, It did. Yes, you are right. I should've asked you to resend the AA deadlock report when I found [W]'s stacktrace was missed in what you shared and should've taken a look at it more. Dept normally doesn't report this type of AA deadlock. But it does in the case we are talking about here, namely when another lock class is cut in between the nesting locks. I will fix it. The AA deadlock report here doesn't make sense. Thank you. However, the other report below still makes sense. > > > > > 45 * scan_mutex [-> object->lock] -> kmemleak_lock -> > > > > > other_object->lock (SINGLE_DEPTH_NESTING) > > > > > 46 * > > > > > 47 * No kmemleak_lock and object->lock nesting is allowed outside > > > > > scan_mutex > > > > > 48 * regions. > > > > > > lock order in kmemleak is described above. > > > > > > and DEPT detects two cases as deadlock: > > > > > > 1) object->lock -> other_object->lock > > > > It's not a deadlock *IF* two have different depth using *_lock_nested(). > > Dept also allows this case. So Dept wouldn't report it. > > > > > 2) object->lock -> kmemleak_lock, kmemleak_lock -> other_object->lock > > > > But this usage is risky.
I already explained it in the mail you replied > > to. I copied it. See the below. > > > > I understand why you said this is risky. > Its lock ordering is not good. > > > context A > > > >lock A0 with depth > > > >lock B > > > >lock A1 with depth + 1 > > > >lock A2 with depth + 2 > > > >lock A3 with depth + 3 > > > >(and so on) > > > >.. > > > >unlock A3 > > > >unlock A2 > > > >unlock A1 > > > >unlock B > > > >unlock A0 > > > > ... > > > > context B > > > >lock A1 with depth > > > >lock B > > > >lock A2 with depth + 1 > > > >lock A3 with depth + 2 > > > >(and so on) > > > >.. > > > >unlock A3 > > > >unlock A2 > > > >unlock B > > > >unlock A1 > > > > where Ax : object->lock, B : kmemleak_lock. > > > > A deadlock might occur if the two contexts run at the same time. > > > > But I want to say kmemleak is getting things under control. No two contexts > can run at same time. So.. do you think the below is also okay? Because lock C and lock B are under control?

   context X                 context Y

      lock mutex A              lock mutex A
      lock B                    lock C
      lock C                    lock B
      unlock C                  unlock B
      unlock B                  unlock C
      unlock mutex A            unlock mutex A

In my opinion, lock B and lock C are unnecessary if they are always along with lock mutex A. Or we should keep correct lock order across all the code. > > > And in kmemleak case, 1) and 2) is not possible because it must hold > > > scan_mutex first. > > > > This is another issue. Let's focus on whether the order is okay for now. > > > > Why is it another issue? You seem to insist that locking order is not important *if* they are under control by serializing the sections. I meant this is another issue. > > > I think the author of kmemleak intended lockdep to treat object->lock > > > and other_object->lock as different class, using raw_spin_lock_nested(). > > > > Yes. The author meant to assign a different class according to its depth > > using a Lockdep API.
Strictly speaking, those are the same class anyway > > but we assign a different class to each depth to avoid Lockdep splats > > *IF* the user guarantees the nesting lock usage is safe, IOW, guarantees > > each lock instance is different at a different depth. > > Then why DEPT reports 1) and 2) as deadlock? 1) will be fixed so that Dept doesn't report it. But I still think the case 2) should be reported, because the usage is wrong. Thanks, Byungchul > Does DEPT assign same class unlike Lockdep? > > I was fundamentally asking you... so... is the nesting lock usage safe > for real? > I don't get what the point is. I agree it's not a good lock ordering. > But in kmemleak case, I think kmemleak is getting things under control. > > -- > Thank you, You are awesome! > Hyeonggon :-) > >
Re: [PATCH V2 04/12] drm: bridge: icn6211: Add DSI lane count DT property parsing
On 3/3/22 13:54, Maxime Ripard wrote: [...] Regarding the default value -- there are no in-tree users of this driver yet (per git grep in current linux-next), do we really care about backward compatibility in this case? If it hasn't been in a stable release yet, no. If it did, yes. It was in a stable release, V3 is out.
[PATCH V3 12/13] drm: bridge: icn6211: Rework ICN6211_DSI to chipone_writeb()
Rename and inline macro ICN6211_DSI() into function chipone_writeb() to keep all function names lower-case. No functional change. Acked-by: Maxime Ripard Signed-off-by: Marek Vasut Cc: Jagan Teki Cc: Maxime Ripard Cc: Robert Foss Cc: Sam Ravnborg Cc: Thomas Zimmermann To: dri-devel@lists.freedesktop.org --- V2: Rebase on next-20220214 V3: Add AB from Maxime --- drivers/gpu/drm/bridge/chipone-icn6211.c | 63 +++- 1 file changed, 28 insertions(+), 35 deletions(-) diff --git a/drivers/gpu/drm/bridge/chipone-icn6211.c b/drivers/gpu/drm/bridge/chipone-icn6211.c index 4ad149c13f599..c66eacc6b1e2a 100644 --- a/drivers/gpu/drm/bridge/chipone-icn6211.c +++ b/drivers/gpu/drm/bridge/chipone-icn6211.c @@ -153,8 +153,7 @@ static inline struct chipone *bridge_to_chipone(struct drm_bridge *bridge) return container_of(bridge, struct chipone, bridge); } -static inline int chipone_dsi_write(struct chipone *icn, const void *seq, - size_t len) +static void chipone_writeb(struct chipone *icn, u8 reg, u8 val) { if (icn->interface_i2c) i2c_smbus_write_byte_data(icn->client, reg, val); @@ -162,12 +161,6 @@ static inline int chipone_dsi_write(struct chipone *icn, const void *seq, mipi_dsi_generic_write(icn->dsi, (u8[]){reg, val}, 2); } -#define ICN6211_DSI(icn, seq...) \ - { \ - const u8 d[] = { seq }; \ - chipone_dsi_write(icn, d, ARRAY_SIZE(d)); \ - } - static void chipone_configure_pll(struct chipone *icn, const struct drm_display_mode *mode) { @@ -242,11 +235,11 @@ static void chipone_configure_pll(struct chipone *icn, (fin * best_m) / BIT(best_p + best_s + 2)); /* Clock source selection fixed to MIPI DSI clock lane */ - ICN6211_DSI(icn, PLL_CTRL(6), PLL_CTRL_6_MIPI_CLK); - ICN6211_DSI(icn, PLL_REF_DIV, + chipone_writeb(icn, PLL_CTRL(6), PLL_CTRL_6_MIPI_CLK); + chipone_writeb(icn, PLL_REF_DIV, (best_p ? 
PLL_REF_DIV_Pe : 0) | /* Prefer /2 pre-divider */ PLL_REF_DIV_P(best_p) | PLL_REF_DIV_S(best_s)); - ICN6211_DSI(icn, PLL_INT(0), best_m); + chipone_writeb(icn, PLL_INT(0), best_m); } static void chipone_atomic_enable(struct drm_bridge *bridge, @@ -265,19 +258,19 @@ static void chipone_atomic_enable(struct drm_bridge *bridge, bus_flags = bridge_state->output_bus_cfg.flags; if (icn->interface_i2c) - ICN6211_DSI(icn, MIPI_CFG_PW, MIPI_CFG_PW_CONFIG_I2C); + chipone_writeb(icn, MIPI_CFG_PW, MIPI_CFG_PW_CONFIG_I2C); else - ICN6211_DSI(icn, MIPI_CFG_PW, MIPI_CFG_PW_CONFIG_DSI); + chipone_writeb(icn, MIPI_CFG_PW, MIPI_CFG_PW_CONFIG_DSI); - ICN6211_DSI(icn, HACTIVE_LI, mode->hdisplay & 0xff); + chipone_writeb(icn, HACTIVE_LI, mode->hdisplay & 0xff); - ICN6211_DSI(icn, VACTIVE_LI, mode->vdisplay & 0xff); + chipone_writeb(icn, VACTIVE_LI, mode->vdisplay & 0xff); /* * lsb nibble: 2nd nibble of hdisplay * msb nibble: 2nd nibble of vdisplay */ - ICN6211_DSI(icn, VACTIVE_HACTIVE_HI, + chipone_writeb(icn, VACTIVE_HACTIVE_HI, ((mode->hdisplay >> 8) & 0xf) | (((mode->vdisplay >> 8) & 0xf) << 4)); @@ -285,49 +278,49 @@ static void chipone_atomic_enable(struct drm_bridge *bridge, hsync = mode->hsync_end - mode->hsync_start; hbp = mode->htotal - mode->hsync_end; - ICN6211_DSI(icn, HFP_LI, hfp & 0xff); - ICN6211_DSI(icn, HSYNC_LI, hsync & 0xff); - ICN6211_DSI(icn, HBP_LI, hbp & 0xff); + chipone_writeb(icn, HFP_LI, hfp & 0xff); + chipone_writeb(icn, HSYNC_LI, hsync & 0xff); + chipone_writeb(icn, HBP_LI, hbp & 0xff); /* Top two bits of Horizontal Front porch/Sync/Back porch */ - ICN6211_DSI(icn, HFP_HSW_HBP_HI, + chipone_writeb(icn, HFP_HSW_HBP_HI, HFP_HSW_HBP_HI_HFP(hfp) | HFP_HSW_HBP_HI_HS(hsync) | HFP_HSW_HBP_HI_HBP(hbp)); - ICN6211_DSI(icn, VFP, mode->vsync_start - mode->vdisplay); + chipone_writeb(icn, VFP, mode->vsync_start - mode->vdisplay); - ICN6211_DSI(icn, VSYNC, mode->vsync_end - mode->vsync_start); + chipone_writeb(icn, VSYNC, mode->vsync_end - mode->vsync_start); - 
ICN6211_DSI(icn, VBP, mode->vtotal - mode->vsync_end); + chipone_writeb(icn, VBP, mode->vtotal - mode->vsync_end); /* dsi specific sequence */ - ICN6211_DSI(icn, SYNC_EVENT_DLY, 0x80); - ICN6211_DSI(icn, HFP_MIN, hfp & 0xff); + chipone_writeb(icn, SYNC_EVENT_DLY, 0x80); + chipone_writeb(icn, HFP_MIN, hfp & 0xff); /* DSI data lane count */ - ICN
[PATCH V3 06/13] drm: bridge: icn6211: Add generic DSI-to-DPI PLL configuration
The chip contains a fractional PLL, however the driver currently hard-codes one specific PLL setting. Implement generic PLL parameter calculation code, so any DPI panel with an arbitrary pixel clock can be attached to this bridge. The datasheet for this bridge is not available; the PLL behavior has been inferred from [1] and [2] and by analyzing the DPI pixel clock with a scope. The PLL limits might be wrong, but at least the calculated values match all the example code available. This is better than one hard-coded pixel clock value anyway. [1] https://github.com/rockchip-linux/kernel/blob/develop-4.19/drivers/gpu/drm/bridge/icn6211.c [2] https://github.com/tdjastrzebski/ICN6211-Configurator Acked-by: Maxime Ripard Signed-off-by: Marek Vasut Cc: Jagan Teki Cc: Maxime Ripard Cc: Robert Foss Cc: Sam Ravnborg Cc: Thomas Zimmermann To: dri-devel@lists.freedesktop.org --- V2: Rebase on next-20220214 V3: Add AB from Maxime --- drivers/gpu/drm/bridge/chipone-icn6211.c | 87 +++- 1 file changed, 84 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/bridge/chipone-icn6211.c b/drivers/gpu/drm/bridge/chipone-icn6211.c index df8e75a068ad0..71c83a18984fa 100644 --- a/drivers/gpu/drm/bridge/chipone-icn6211.c +++ b/drivers/gpu/drm/bridge/chipone-icn6211.c @@ -163,6 +163,87 @@ static inline int chipone_dsi_write(struct chipone *icn, const void *seq, chipone_dsi_write(icn, d, ARRAY_SIZE(d)); \ } +static void chipone_configure_pll(struct chipone *icn, + const struct drm_display_mode *mode) +{ + unsigned int best_p = 0, best_m = 0, best_s = 0; + unsigned int delta, min_delta = 0x; + unsigned int freq_p, freq_s, freq_out; + unsigned int p_min, p_max; + unsigned int p, m, s; + unsigned int fin; + + /* +* DSI clock lane frequency (input into PLL) is calculated as: +* DSI_CLK = mode clock * bpp / dsi_data_lanes / 2 +* the 2 is there because the bus is DDR. +* +* DPI pixel clock frequency (output from PLL) is mode clock.
+* +* The chip contains fractional PLL which works as follows: +* DPI_CLK = ((DSI_CLK / P) * M) / S +* P is pre-divider, register PLL_REF_DIV[3:0] is 2^(n+1) divider +* register PLL_REF_DIV[4] is extra 1:2 divider +* M is integer multiplier, register PLL_INT(0) is multiplier +* S is post-divider, register PLL_REF_DIV[7:5] is 2^(n+1) divider +* +* It seems the PLL input clock after applying P pre-divider have +* to be lower than 20 MHz. +*/ + fin = mode->clock * mipi_dsi_pixel_format_to_bpp(icn->dsi->format) / + icn->dsi_lanes / 2; /* in kHz */ + + /* Minimum value of P predivider for PLL input in 5..20 MHz */ + p_min = ffs(fin / 2); + p_max = (fls(fin / 5000) - 1) & 0x1f; + + for (p = p_min; p < p_max; p++) { /* PLL_REF_DIV[4,3:0] */ + freq_p = fin / BIT(p + 1); + if (freq_p == 0)/* Divider too high */ + break; + + for (s = 0; s < 0x7; s++) { /* PLL_REF_DIV[7:5] */ + freq_s = freq_p / BIT(s + 1); + if (freq_s == 0)/* Divider too high */ + break; + + m = mode->clock / freq_s; + + /* Multiplier is 8 bit */ + if (m > 0xff) + continue; + + /* Limit PLL VCO frequency to 1 GHz */ + freq_out = (fin * m) / BIT(p + 1); + if (freq_out > 100) + continue; + + /* Apply post-divider */ + freq_out /= BIT(s + 1); + + delta = abs(mode->clock - freq_out); + if (delta < min_delta) { + best_p = p; + best_m = m; + best_s = s; + min_delta = delta; + } + } + } + + dev_dbg(icn->dev, + "PLL: P[3:0]=2^%d P[4]=2*%d M=%d S[7:5]=2^%d delta=%d => DSI f_in=%d kHz ; DPI f_out=%ld kHz\n", + best_p, !!best_p, best_m, best_s + 1, min_delta, fin, + (fin * best_m) / BIT(best_p + best_s + 2)); + + /* Clock source selection fixed to MIPI DSI clock lane */ + ICN6211_DSI(icn, PLL_CTRL(6), PLL_CTRL_6_MIPI_CLK); + ICN6211_DSI(icn, PLL_REF_DIV, + (best_p ? PLL_REF_DIV_Pe : 0) | /* Prefer /2 pre-divider */ + PLL_REF_DIV_P(best_p) | PLL_REF_DIV_S(best_s)); + ICN6211_DSI(icn, PLL_INT(0), best_m); +} + static void chipone_atomic_enable(struct drm_bridge *bridge, struct drm_b
[PATCH V3 08/13] drm: bridge: icn6211: Disable DPI color swap
The chip is capable of swapping DPI RGB channels. The driver currently does not implement support for this functionality. Write the MIPI_PN_SWAP register to 0 to ensure the color swap is disabled. Acked-by: Maxime Ripard Signed-off-by: Marek Vasut Cc: Jagan Teki Cc: Maxime Ripard Cc: Robert Foss Cc: Sam Ravnborg Cc: Thomas Zimmermann To: dri-devel@lists.freedesktop.org --- V2: Rebase on next-20220214 V3: Add AB from Maxime --- drivers/gpu/drm/bridge/chipone-icn6211.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/bridge/chipone-icn6211.c b/drivers/gpu/drm/bridge/chipone-icn6211.c index b4e886c2b92a5..1a3afefcc9e80 100644 --- a/drivers/gpu/drm/bridge/chipone-icn6211.c +++ b/drivers/gpu/drm/bridge/chipone-icn6211.c @@ -302,6 +302,7 @@ static void chipone_atomic_enable(struct drm_bridge *bridge, ICN6211_DSI(icn, MIPI_PD_CK_LANE, 0xa0); ICN6211_DSI(icn, PLL_CTRL(12), 0xff); + ICN6211_DSI(icn, MIPI_PN_SWAP, 0x00); /* DPI HS/VS/DE polarity */ pol = ((mode->flags & DRM_MODE_FLAG_PHSYNC) ? BIST_POL_HSYNC_POL : 0) | -- 2.34.1
[PATCH V3 09/13] drm: bridge: icn6211: Set SYS_CTRL_1 to value used in examples
Both example codes [1], [2], as well as the one provided by a custom panel vendor, set register SYS_CTRL_1 to 0x88. What exactly the value means is unknown due to the unavailable datasheet. Align this register value with the example code. [1] https://github.com/rockchip-linux/kernel/blob/develop-4.19/drivers/gpu/drm/bridge/icn6211.c [2] https://github.com/tdjastrzebski/ICN6211-Configurator Acked-by: Maxime Ripard Signed-off-by: Marek Vasut Cc: Jagan Teki Cc: Maxime Ripard Cc: Robert Foss Cc: Sam Ravnborg Cc: Thomas Zimmermann To: dri-devel@lists.freedesktop.org --- V2: Rebase on next-20220214 V3: Add AB from Maxime --- drivers/gpu/drm/bridge/chipone-icn6211.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/bridge/chipone-icn6211.c b/drivers/gpu/drm/bridge/chipone-icn6211.c index 1a3afefcc9e80..095002a40d0e8 100644 --- a/drivers/gpu/drm/bridge/chipone-icn6211.c +++ b/drivers/gpu/drm/bridge/chipone-icn6211.c @@ -314,7 +314,7 @@ static void chipone_atomic_enable(struct drm_bridge *bridge, chipone_configure_pll(icn, mode); ICN6211_DSI(icn, SYS_CTRL(0), 0x40); - ICN6211_DSI(icn, SYS_CTRL(1), 0x98); + ICN6211_DSI(icn, SYS_CTRL(1), 0x88); /* icn6211 specific sequence */ ICN6211_DSI(icn, MIPI_FORCE_0, 0x20); -- 2.34.1
[PATCH V3 11/13] drm: bridge: icn6211: Add I2C configuration support
The ICN6211 chip starts in I2C configuration mode after cold boot. Implement support for configuring the chip via I2C in addition to the current DSI LP command mode configuration support. The latter seems to be available only on chips which have additional MCU on the panel/bridge board which preconfigures the ICN6211, while the I2C configuration mode added by this patch does not require any such MCU. Acked-by: Maxime Ripard Signed-off-by: Marek Vasut Cc: Jagan Teki Cc: Maxime Ripard Cc: Robert Foss Cc: Sam Ravnborg Cc: Thomas Zimmermann To: dri-devel@lists.freedesktop.org --- V2: - Drop the abridge variable - Rename chipone_dsi_setup to chipone_dsi_host_attach and call it from chipone_i2c_probe() V3: Add AB from Maxime --- drivers/gpu/drm/bridge/chipone-icn6211.c | 183 --- 1 file changed, 161 insertions(+), 22 deletions(-) diff --git a/drivers/gpu/drm/bridge/chipone-icn6211.c b/drivers/gpu/drm/bridge/chipone-icn6211.c index afc619e215c3b..4ad149c13f599 100644 --- a/drivers/gpu/drm/bridge/chipone-icn6211.c +++ b/drivers/gpu/drm/bridge/chipone-icn6211.c @@ -11,6 +11,7 @@ #include #include +#include #include #include #include @@ -133,15 +134,18 @@ struct chipone { struct device *dev; + struct i2c_client *client; struct drm_bridge bridge; struct drm_display_mode mode; struct drm_bridge *panel_bridge; struct device_node *host_node; + struct mipi_dsi_device *dsi; struct gpio_desc *enable_gpio; struct regulator *vdd1; struct regulator *vdd2; struct regulator *vdd3; int dsi_lanes; + bool interface_i2c; }; static inline struct chipone *bridge_to_chipone(struct drm_bridge *bridge) @@ -152,9 +156,10 @@ static inline struct chipone *bridge_to_chipone(struct drm_bridge *bridge) static inline int chipone_dsi_write(struct chipone *icn, const void *seq, size_t len) { - struct mipi_dsi_device *dsi = to_mipi_dsi_device(icn->dev); - - return mipi_dsi_generic_write(dsi, seq, len); + if (icn->interface_i2c) + i2c_smbus_write_byte_data(icn->client, reg, val); + else +
mipi_dsi_generic_write(icn->dsi, (u8[]){reg, val}, 2); } #define ICN6211_DSI(icn, seq...) \ @@ -259,7 +264,10 @@ static void chipone_atomic_enable(struct drm_bridge *bridge, bridge_state = drm_atomic_get_new_bridge_state(state, bridge); bus_flags = bridge_state->output_bus_cfg.flags; - ICN6211_DSI(icn, MIPI_CFG_PW, MIPI_CFG_PW_CONFIG_DSI); + if (icn->interface_i2c) + ICN6211_DSI(icn, MIPI_CFG_PW, MIPI_CFG_PW_CONFIG_I2C); + else + ICN6211_DSI(icn, MIPI_CFG_PW, MIPI_CFG_PW_CONFIG_DSI); ICN6211_DSI(icn, HACTIVE_LI, mode->hdisplay & 0xff); @@ -380,6 +388,57 @@ static void chipone_mode_set(struct drm_bridge *bridge, struct chipone *icn = bridge_to_chipone(bridge); drm_mode_copy(&icn->mode, adjusted_mode); +}; + +static int chipone_dsi_attach(struct chipone *icn) +{ + struct mipi_dsi_device *dsi = icn->dsi; + int ret; + + dsi->lanes = icn->dsi_lanes; + dsi->format = MIPI_DSI_FMT_RGB888; + dsi->mode_flags = MIPI_DSI_MODE_VIDEO | MIPI_DSI_MODE_VIDEO_BURST | + MIPI_DSI_MODE_LPM | MIPI_DSI_MODE_NO_EOT_PACKET; + + ret = mipi_dsi_attach(dsi); + if (ret < 0) + dev_err(icn->dev, "failed to attach dsi\n"); + + return ret; +} + +static int chipone_dsi_host_attach(struct chipone *icn) +{ + struct device *dev = icn->dev; + struct mipi_dsi_device *dsi; + struct mipi_dsi_host *host; + int ret = 0; + + const struct mipi_dsi_device_info info = { + .type = "chipone", + .channel = 0, + .node = NULL, + }; + + host = of_find_mipi_dsi_host_by_node(icn->host_node); + if (!host) { + dev_err(dev, "failed to find dsi host\n"); + return -EPROBE_DEFER; + } + + dsi = mipi_dsi_device_register_full(host, &info); + if (IS_ERR(dsi)) { + return dev_err_probe(dev, PTR_ERR(dsi), +"failed to create dsi device\n"); + } + + icn->dsi = dsi; + + ret = chipone_dsi_attach(icn); + if (ret < 0) + mipi_dsi_device_unregister(dsi); + + return ret; } static int chipone_attach(struct drm_bridge *bridge, enum drm_bridge_attach_flags flags) @@ -506,9 +565,8 @@ static int chipone_parse_dt(struct chipone *icn) return ret; 
} -static int chipone_probe(struct mipi_dsi_device *dsi) +static int chipone_common_probe(struct device *dev, struct chipone **icnr) { - struct device *dev = &dsi->dev; struct chipone *icn; int ret; @@ -516,7 +574,6 @@ static int chipone_probe(struct mipi_dsi_device *dsi) i
[PATCH V3 02/13] drm: bridge: icn6211: Fix register layout
The chip register layout has nothing to do with MIPI DCS, the registers incorrectly marked as MIPI DCS in the driver are regular chip registers often with completely different function. Fill in the actual register names and bits from [1] and [2] and add the entire register layout, since the documentation for this chip is hard to come by. [1] https://github.com/rockchip-linux/kernel/blob/develop-4.19/drivers/gpu/drm/bridge/icn6211.c [2] https://github.com/tdjastrzebski/ICN6211-Configurator Acked-by: Maxime Ripard Fixes: ce517f18944e3 ("drm: bridge: Add Chipone ICN6211 MIPI-DSI to RGB bridge") Signed-off-by: Marek Vasut Cc: Jagan Teki Cc: Maxime Ripard Cc: Robert Foss Cc: Sam Ravnborg Cc: Thomas Zimmermann To: dri-devel@lists.freedesktop.org --- V2: Rebase on next-20220214 V3: Add AB from Maxime --- drivers/gpu/drm/bridge/chipone-icn6211.c | 134 --- 1 file changed, 117 insertions(+), 17 deletions(-) diff --git a/drivers/gpu/drm/bridge/chipone-icn6211.c b/drivers/gpu/drm/bridge/chipone-icn6211.c index e8f36dca56b33..4b8d1a5a50302 100644 --- a/drivers/gpu/drm/bridge/chipone-icn6211.c +++ b/drivers/gpu/drm/bridge/chipone-icn6211.c @@ -15,8 +15,19 @@ #include #include -#include - +#define VENDOR_ID 0x00 +#define DEVICE_ID_H0x01 +#define DEVICE_ID_L0x02 +#define VERSION_ID 0x03 +#define FIRMWARE_VERSION 0x08 +#define CONFIG_FINISH 0x09 +#define PD_CTRL(n) (0x0a + ((n) & 0x3)) /* 0..3 */ +#define RST_CTRL(n)(0x0e + ((n) & 0x1)) /* 0..1 */ +#define SYS_CTRL(n)(0x10 + ((n) & 0x7)) /* 0..4 */ +#define RGB_DRV(n) (0x18 + ((n) & 0x3)) /* 0..3 */ +#define RGB_DLY(n) (0x1c + ((n) & 0x1)) /* 0..1 */ +#define RGB_TEST_CTRL 0x1e +#define ATE_PLL_EN 0x1f #define HACTIVE_LI 0x20 #define VACTIVE_LI 0x21 #define VACTIVE_HACTIVE_HI 0x22 @@ -27,6 +38,95 @@ #define VFP0x27 #define VSYNC 0x28 #define VBP0x29 +#define BIST_POL 0x2a +#define BIST_POL_BIST_MODE(n) (((n) & 0xf) << 4) +#define BIST_POL_BIST_GEN BIT(3) +#define BIST_POL_HSYNC_POL BIT(2) +#define BIST_POL_VSYNC_POL BIT(1) +#define 
BIST_POL_DE_POLBIT(0) +#define BIST_RED 0x2b +#define BIST_GREEN 0x2c +#define BIST_BLUE 0x2d +#define BIST_CHESS_X 0x2e +#define BIST_CHESS_Y 0x2f +#define BIST_CHESS_XY_H0x30 +#define BIST_FRAME_TIME_L 0x31 +#define BIST_FRAME_TIME_H 0x32 +#define FIFO_MAX_ADDR_LOW 0x33 +#define SYNC_EVENT_DLY 0x34 +#define HSW_MIN0x35 +#define HFP_MIN0x36 +#define LOGIC_RST_NUM 0x37 +#define OSC_CTRL(n)(0x48 + ((n) & 0x7)) /* 0..5 */ +#define BG_CTRL0x4e +#define LDO_PLL0x4f +#define PLL_CTRL(n)(0x50 + ((n) & 0xf)) /* 0..15 */ +#define PLL_CTRL_6_EXTERNAL0x90 +#define PLL_CTRL_6_MIPI_CLK0x92 +#define PLL_CTRL_6_INTERNAL0x93 +#define PLL_REM(n) (0x60 + ((n) & 0x3)) /* 0..2 */ +#define PLL_DIV(n) (0x63 + ((n) & 0x3)) /* 0..2 */ +#define PLL_FRAC(n)(0x66 + ((n) & 0x3)) /* 0..2 */ +#define PLL_INT(n) (0x69 + ((n) & 0x1)) /* 0..1 */ +#define PLL_REF_DIV0x6b +#define PLL_REF_DIV_P(n) ((n) & 0xf) +#define PLL_REF_DIV_Pe BIT(4) +#define PLL_REF_DIV_S(n) (((n) & 0x7) << 5) +#define PLL_SSC_P(n) (0x6c + ((n) & 0x3)) /* 0..2 */ +#define PLL_SSC_STEP(n)(0x6f + ((n) & 0x3)) /* 0..2 */ +#define PLL_SSC_OFFSET(n) (0x72 + ((n) & 0x3)) /* 0..3 */ +#define GPIO_OEN 0x79 +#define MIPI_CFG_PW0x7a +#define MIPI_CFG_PW_CONFIG_DSI 0xc1 +#define MIPI_CFG_PW_CONFIG_I2C 0x3e +#define GPIO_SEL(n)(0x7b + ((n) & 0x1)) /* 0..1 */ +#define IRQ_SEL0x7d +#define DBG_SEL0x7e +#define DBG_SIGNAL 0x7f +#define MIPI_ERR_VECTOR_L 0x80 +#define MIPI_ERR_VECTOR_H 0x81 +#define MIPI_ERR_VECTOR_EN_L 0x82 +#define MIPI_ERR_VECTOR_EN_H 0x83 +#define MIPI_MAX_SIZE_L0x84 +#define MIPI_MAX_SIZE_H0x85 +#define DSI_CTRL 0x86 +#define DSI_CTRL_UNKNOWN 0x28 +#define DSI_CTRL_DSI_LANES(n) ((n) & 0x3) +#define MIPI_PN_SWAP 0x87 +#define MIPI_PN_SWAP_CLK BIT(4) +#define MIPI_PN_SWAP_D(n) BIT((n) & 0x3) +#define MIPI_SOT_SYNC_BIT_(n) (0x88 + ((n) & 0x1)) /* 0..1 */ +#define MIPI_ULPS_CTRL 0x8a +#define MIPI_CLK_CHK_VAR 0x8e +#define MIPI_CLK_CHK_INI 0x8f +#define MIPI_T_TERM_EN 0x90 +#define MIPI_T_HS_SETTLE
[PATCH V3 13/13] drm: bridge: icn6211: Read and validate chip IDs before configuration
Read out the Vendor/Chip/Version ID registers from the chip before performing any configuration, and validate that the registers have correct values. This is mostly a simple test whether DSI register access does work, since that tends to be broken on various bridges. Acked-by: Maxime Ripard Signed-off-by: Marek Vasut Cc: Jagan Teki Cc: Maxime Ripard Cc: Robert Foss Cc: Sam Ravnborg Cc: Thomas Zimmermann To: dri-devel@lists.freedesktop.org --- V2: Rebase on next-20220214 V3: Add AB from Maxime --- drivers/gpu/drm/bridge/chipone-icn6211.c | 24 +++- 1 file changed, 23 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/bridge/chipone-icn6211.c b/drivers/gpu/drm/bridge/chipone-icn6211.c index c66eacc6b1e2a..0a07023d0aeec 100644 --- a/drivers/gpu/drm/bridge/chipone-icn6211.c +++ b/drivers/gpu/drm/bridge/chipone-icn6211.c @@ -153,6 +153,14 @@ static inline struct chipone *bridge_to_chipone(struct drm_bridge *bridge) return container_of(bridge, struct chipone, bridge); } +static void chipone_readb(struct chipone *icn, u8 reg, u8 *val) +{ + if (icn->interface_i2c) + *val = i2c_smbus_read_byte_data(icn->client, reg); + else + mipi_dsi_generic_read(icn->dsi, (u8[]){reg, 1}, 2, val, 1); +} + static void chipone_writeb(struct chipone *icn, u8 reg, u8 val) { if (icn->interface_i2c) @@ -251,7 +259,21 @@ static void chipone_atomic_enable(struct drm_bridge *bridge, const struct drm_bridge_state *bridge_state; u16 hfp, hbp, hsync; u32 bus_flags; - u8 pol; + u8 pol, id[4]; + + chipone_readb(icn, VENDOR_ID, id); + chipone_readb(icn, DEVICE_ID_H, id + 1); + chipone_readb(icn, DEVICE_ID_L, id + 2); + chipone_readb(icn, VERSION_ID, id + 3); + + dev_dbg(icn->dev, + "Chip IDs: Vendor=0x%02x Device=0x%02x:0x%02x Version=0x%02x\n", + id[0], id[1], id[2], id[3]); + + if (id[0] != 0xc1 || id[1] != 0x62 || id[2] != 0x11) { + dev_dbg(icn->dev, "Invalid Chip IDs, aborting configuration\n"); + return; + } /* Get the DPI flags from the bridge state. 
*/ bridge_state = drm_atomic_get_new_bridge_state(state, bridge); -- 2.34.1
[PATCH V3 07/13] drm: bridge: icn6211: Use DSI burst mode without EoT and with LP command mode
The DSI burst mode is more energy efficient than the DSI sync pulse mode; make use of the burst mode since the chip supports it as well. Disable the generation of the EoT packet: the chip ignores it, so there is no point in emitting it. Enable transmission of data in LP mode, since otherwise register reads via DSI do not work with this chip. Acked-by: Maxime Ripard Signed-off-by: Marek Vasut Cc: Jagan Teki Cc: Maxime Ripard Cc: Robert Foss Cc: Sam Ravnborg Cc: Thomas Zimmermann To: dri-devel@lists.freedesktop.org --- V2: Rebase on next-20220214 V3: Add AB from Maxime --- drivers/gpu/drm/bridge/chipone-icn6211.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/bridge/chipone-icn6211.c b/drivers/gpu/drm/bridge/chipone-icn6211.c index 71c83a18984fa..b4e886c2b92a5 100644 --- a/drivers/gpu/drm/bridge/chipone-icn6211.c +++ b/drivers/gpu/drm/bridge/chipone-icn6211.c @@ -503,7 +503,8 @@ static int chipone_probe(struct mipi_dsi_device *dsi) dsi->lanes = icn->dsi_lanes; dsi->format = MIPI_DSI_FMT_RGB888; - dsi->mode_flags = MIPI_DSI_MODE_VIDEO_SYNC_PULSE; + dsi->mode_flags = MIPI_DSI_MODE_VIDEO | MIPI_DSI_MODE_VIDEO_BURST | + MIPI_DSI_MODE_LPM | MIPI_DSI_MODE_NO_EOT_PACKET; ret = mipi_dsi_attach(dsi); if (ret < 0) { -- 2.34.1
[PATCH V3 04/13] drm: bridge: icn6211: Add HS/VS/DE polarity handling
The driver currently hard-codes HS/VS polarity to active-low and DE to active-high, which is not correct for a lot of supported DPI panels. Add the missing mode flag handling for HS/VS/DE polarity. Signed-off-by: Marek Vasut Cc: Jagan Teki Cc: Maxime Ripard Cc: Robert Foss Cc: Sam Ravnborg Cc: Thomas Zimmermann To: dri-devel@lists.freedesktop.org --- V2: Rebase on next-20220214 V3: No change --- drivers/gpu/drm/bridge/chipone-icn6211.c | 16 +++- 1 file changed, 15 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/bridge/chipone-icn6211.c b/drivers/gpu/drm/bridge/chipone-icn6211.c index e29e6a84c39a6..2ac8eb7e25f52 100644 --- a/drivers/gpu/drm/bridge/chipone-icn6211.c +++ b/drivers/gpu/drm/bridge/chipone-icn6211.c @@ -165,8 +165,16 @@ static void chipone_atomic_enable(struct drm_bridge *bridge, struct drm_bridge_state *old_bridge_state) { struct chipone *icn = bridge_to_chipone(bridge); + struct drm_atomic_state *state = old_bridge_state->base.state; struct drm_display_mode *mode = &icn->mode; + const struct drm_bridge_state *bridge_state; u16 hfp, hbp, hsync; + u32 bus_flags; + u8 pol; + + /* Get the DPI flags from the bridge state. */ + bridge_state = drm_atomic_get_new_bridge_state(state, bridge); + bus_flags = bridge_state->output_bus_cfg.flags; ICN6211_DSI(icn, MIPI_CFG_PW, MIPI_CFG_PW_CONFIG_DSI); @@ -206,7 +214,13 @@ static void chipone_atomic_enable(struct drm_bridge *bridge, ICN6211_DSI(icn, HFP_MIN, hfp & 0xff); ICN6211_DSI(icn, MIPI_PD_CK_LANE, 0xa0); ICN6211_DSI(icn, PLL_CTRL(12), 0xff); - ICN6211_DSI(icn, BIST_POL, BIST_POL_DE_POL); + + /* DPI HS/VS/DE polarity */ + pol = ((mode->flags & DRM_MODE_FLAG_PHSYNC) ? BIST_POL_HSYNC_POL : 0) | + ((mode->flags & DRM_MODE_FLAG_PVSYNC) ? BIST_POL_VSYNC_POL : 0) | + ((bus_flags & DRM_BUS_FLAG_DE_HIGH) ? BIST_POL_DE_POL : 0); + ICN6211_DSI(icn, BIST_POL, pol); + ICN6211_DSI(icn, PLL_CTRL(6), PLL_CTRL_6_MIPI_CLK); ICN6211_DSI(icn, PLL_REF_DIV, 0x71); ICN6211_DSI(icn, PLL_INT(0), 0x2b); -- 2.34.1
[PATCH V3 00/13] drm: bridge: icn6211: Fix hard-coded panel settings and add I2C support
This series fixes multiple problems with the ICN6211 driver and adds support for configuring the chip via the I2C bus. First, in its current state, the ICN6211 driver hard-codes DPI timing and clock settings specific to some unknown panel; the settings provided by the panel driver are ignored. Using any panel other than the one for which this driver is currently hard-coded can lead to permanent damage of the panel (per a display supplier warning, and it did in my case: the damage looks like multiple rows of dead pixels at the bottom of the panel, and it does not go away even after a long power-off time). Much of this series thus fixes the incorrect register layout and DPI timing programming, and corrects clock generation by adding actual PLL configuration code. The series also adds lane count decoding instead of using a hard-coded value, and fills in a couple of registers with likely correct default values. Second, this series adds support for I2C configuration of the ICN6211. The device can be configured either via DSI command mode or via I2C; the register layout is the same in both cases. Since the datasheet for this device is very hard to come by, a lot of information has been salvaged from [1] and [2].
[1] https://github.com/rockchip-linux/kernel/blob/develop-4.19/drivers/gpu/drm/bridge/icn6211.c [2] https://github.com/tdjastrzebski/ICN6211-Configurator Cc: Jagan Teki Cc: Maxime Ripard Cc: Robert Foss Cc: Sam Ravnborg Cc: Thomas Zimmermann To: dri-devel@lists.freedesktop.org Marek Vasut (13): dt-bindings: display: bridge: icn6211: Document DSI data-lanes property drm: bridge: icn6211: Fix register layout drm: bridge: icn6211: Fix HFP_HSW_HBP_HI and HFP_MIN handling drm: bridge: icn6211: Add HS/VS/DE polarity handling drm: bridge: icn6211: Add DSI lane count DT property parsing drm: bridge: icn6211: Add generic DSI-to-DPI PLL configuration drm: bridge: icn6211: Use DSI burst mode without EoT and with LP command mode drm: bridge: icn6211: Disable DPI color swap drm: bridge: icn6211: Set SYS_CTRL_1 to value used in examples drm: bridge: icn6211: Implement atomic_get_input_bus_fmts drm: bridge: icn6211: Add I2C configuration support drm: bridge: icn6211: Rework ICN6211_DSI to chipone_writeb() drm: bridge: icn6211: Read and validate chip IDs before configuration .../display/bridge/chipone,icn6211.yaml | 18 +- drivers/gpu/drm/bridge/chipone-icn6211.c | 534 -- 2 files changed, 496 insertions(+), 56 deletions(-) -- 2.34.1
[PATCH V3 10/13] drm: bridge: icn6211: Implement atomic_get_input_bus_fmts
Implement .atomic_get_input_bus_fmts callback, which sets up the input (DSI-end) format, and that format can then be used in pipeline format negotiation between the DSI-end of this bridge and the other component closer to the scanout engine. Acked-by: Maxime Ripard Signed-off-by: Marek Vasut Cc: Jagan Teki Cc: Maxime Ripard Cc: Robert Foss Cc: Sam Ravnborg Cc: Thomas Zimmermann To: dri-devel@lists.freedesktop.org --- V2: Rebase on next-20220214 V3: Add AB from Maxime --- drivers/gpu/drm/bridge/chipone-icn6211.c | 27 1 file changed, 27 insertions(+) diff --git a/drivers/gpu/drm/bridge/chipone-icn6211.c b/drivers/gpu/drm/bridge/chipone-icn6211.c index 095002a40d0e8..afc619e215c3b 100644 --- a/drivers/gpu/drm/bridge/chipone-icn6211.c +++ b/drivers/gpu/drm/bridge/chipone-icn6211.c @@ -389,6 +389,32 @@ static int chipone_attach(struct drm_bridge *bridge, enum drm_bridge_attach_flag return drm_bridge_attach(bridge->encoder, icn->panel_bridge, bridge, flags); } +#define MAX_INPUT_SEL_FORMATS 1 + +static u32 * +chipone_atomic_get_input_bus_fmts(struct drm_bridge *bridge, + struct drm_bridge_state *bridge_state, + struct drm_crtc_state *crtc_state, + struct drm_connector_state *conn_state, + u32 output_fmt, + unsigned int *num_input_fmts) +{ + u32 *input_fmts; + + *num_input_fmts = 0; + + input_fmts = kcalloc(MAX_INPUT_SEL_FORMATS, sizeof(*input_fmts), +GFP_KERNEL); + if (!input_fmts) + return NULL; + + /* This is the DSI-end bus format */ + input_fmts[0] = MEDIA_BUS_FMT_RGB888_1X24; + *num_input_fmts = 1; + + return input_fmts; +} + static const struct drm_bridge_funcs chipone_bridge_funcs = { .atomic_duplicate_state = drm_atomic_helper_bridge_duplicate_state, .atomic_destroy_state = drm_atomic_helper_bridge_destroy_state, @@ -398,6 +424,7 @@ static const struct drm_bridge_funcs chipone_bridge_funcs = { .atomic_post_disable= chipone_atomic_post_disable, .mode_set = chipone_mode_set, .attach = chipone_attach, + .atomic_get_input_bus_fmts = chipone_atomic_get_input_bus_fmts, 
}; static int chipone_parse_dt(struct chipone *icn) -- 2.34.1
[PATCH V3 03/13] drm: bridge: icn6211: Fix HFP_HSW_HBP_HI and HFP_MIN handling
The HFP_HSW_HBP_HI register must be programmed with the 2 MSbits (bits 9:8) of each of the Horizontal Front Porch/Sync/Back Porch values. Currently the driver programs this register to 0, which breaks displays with any of these values above 255. The HFP_MIN register must be set to the same value as HFP_LI, otherwise there is visible image distortion, usually in the form of missing lines at the bottom of the panel. Fix this by correctly programming the HFP_HSW_HBP_HI and HFP_MIN registers. Acked-by: Maxime Ripard Fixes: ce517f18944e3 ("drm: bridge: Add Chipone ICN6211 MIPI-DSI to RGB bridge") Signed-off-by: Marek Vasut Cc: Jagan Teki Cc: Maxime Ripard Cc: Robert Foss Cc: Sam Ravnborg Cc: Thomas Zimmermann To: dri-devel@lists.freedesktop.org --- V2: Rebase on next-20220214 V3: Add AB from Maxime --- drivers/gpu/drm/bridge/chipone-icn6211.c | 23 --- 1 file changed, 16 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/bridge/chipone-icn6211.c b/drivers/gpu/drm/bridge/chipone-icn6211.c index 4b8d1a5a50302..e29e6a84c39a6 100644 --- a/drivers/gpu/drm/bridge/chipone-icn6211.c +++ b/drivers/gpu/drm/bridge/chipone-icn6211.c @@ -35,6 +35,9 @@ #define HSYNC_LI 0x24 #define HBP_LI 0x25 #define HFP_HSW_HBP_HI 0x26 +#define HFP_HSW_HBP_HI_HFP(n) (((n) & 0x300) >> 4) +#define HFP_HSW_HBP_HI_HS(n) (((n) & 0x300) >> 6) +#define HFP_HSW_HBP_HI_HBP(n) (((n) & 0x300) >> 8) #define VFP 0x27 #define VSYNC 0x28 #define VBP 0x29 @@ -163,6 +166,7 @@ static void chipone_atomic_enable(struct drm_bridge *bridge, { struct chipone *icn = bridge_to_chipone(bridge); struct drm_display_mode *mode = &icn->mode; + u16 hfp, hbp, hsync; ICN6211_DSI(icn, MIPI_CFG_PW, MIPI_CFG_PW_CONFIG_DSI); @@ -178,13 +182,18 @@ static void chipone_atomic_enable(struct drm_bridge *bridge, ((mode->hdisplay >> 8) & 0xf) | (((mode->vdisplay >> 8) & 0xf) << 4)); - ICN6211_DSI(icn, HFP_LI, mode->hsync_start - mode->hdisplay); + hfp = mode->hsync_start - mode->hdisplay; + hsync = mode->hsync_end - mode->hsync_start; + hbp = mode->htotal -
mode->hsync_end; - ICN6211_DSI(icn, HSYNC_LI, mode->hsync_end - mode->hsync_start); - - ICN6211_DSI(icn, HBP_LI, mode->htotal - mode->hsync_end); - - ICN6211_DSI(icn, HFP_HSW_HBP_HI, 0x00); + ICN6211_DSI(icn, HFP_LI, hfp & 0xff); + ICN6211_DSI(icn, HSYNC_LI, hsync & 0xff); + ICN6211_DSI(icn, HBP_LI, hbp & 0xff); + /* Top two bits of Horizontal Front porch/Sync/Back porch */ + ICN6211_DSI(icn, HFP_HSW_HBP_HI, + HFP_HSW_HBP_HI_HFP(hfp) | + HFP_HSW_HBP_HI_HS(hsync) | + HFP_HSW_HBP_HI_HBP(hbp)); ICN6211_DSI(icn, VFP, mode->vsync_start - mode->vdisplay); @@ -194,7 +203,7 @@ static void chipone_atomic_enable(struct drm_bridge *bridge, /* dsi specific sequence */ ICN6211_DSI(icn, SYNC_EVENT_DLY, 0x80); - ICN6211_DSI(icn, HFP_MIN, 0x28); + ICN6211_DSI(icn, HFP_MIN, hfp & 0xff); ICN6211_DSI(icn, MIPI_PD_CK_LANE, 0xa0); ICN6211_DSI(icn, PLL_CTRL(12), 0xff); ICN6211_DSI(icn, BIST_POL, BIST_POL_DE_POL); -- 2.34.1
[PATCH V3 05/13] drm: bridge: icn6211: Add DSI lane count DT property parsing
The driver currently hard-codes DSI lane count to two, however the chip is capable of operating in 1..4 DSI lanes mode. Parse 'data-lanes' DT property and program the result into DSI_CTRL register. Signed-off-by: Marek Vasut Cc: Jagan Teki Cc: Maxime Ripard Cc: Robert Foss Cc: Sam Ravnborg Cc: Thomas Zimmermann To: dri-devel@lists.freedesktop.org --- V2: Rebase on next-20220214 V3: Default to 4 data lanes unless specified otherwise --- drivers/gpu/drm/bridge/chipone-icn6211.c | 45 +--- 1 file changed, 41 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/bridge/chipone-icn6211.c b/drivers/gpu/drm/bridge/chipone-icn6211.c index 2ac8eb7e25f52..df8e75a068ad0 100644 --- a/drivers/gpu/drm/bridge/chipone-icn6211.c +++ b/drivers/gpu/drm/bridge/chipone-icn6211.c @@ -136,10 +136,12 @@ struct chipone { struct drm_bridge bridge; struct drm_display_mode mode; struct drm_bridge *panel_bridge; + struct device_node *host_node; struct gpio_desc *enable_gpio; struct regulator *vdd1; struct regulator *vdd2; struct regulator *vdd3; + int dsi_lanes; }; static inline struct chipone *bridge_to_chipone(struct drm_bridge *bridge) @@ -212,6 +214,11 @@ static void chipone_atomic_enable(struct drm_bridge *bridge, /* dsi specific sequence */ ICN6211_DSI(icn, SYNC_EVENT_DLY, 0x80); ICN6211_DSI(icn, HFP_MIN, hfp & 0xff); + + /* DSI data lane count */ + ICN6211_DSI(icn, DSI_CTRL, + DSI_CTRL_UNKNOWN | DSI_CTRL_DSI_LANES(icn->dsi_lanes - 1)); + ICN6211_DSI(icn, MIPI_PD_CK_LANE, 0xa0); ICN6211_DSI(icn, PLL_CTRL(12), 0xff); @@ -314,7 +321,9 @@ static const struct drm_bridge_funcs chipone_bridge_funcs = { static int chipone_parse_dt(struct chipone *icn) { struct device *dev = icn->dev; + struct device_node *endpoint; struct drm_panel *panel; + int dsi_lanes; int ret; icn->vdd1 = devm_regulator_get_optional(dev, "vdd1"); @@ -350,15 +359,42 @@ static int chipone_parse_dt(struct chipone *icn) return PTR_ERR(icn->enable_gpio); } + endpoint = of_graph_get_endpoint_by_regs(dev->of_node, 0, 0); + 
dsi_lanes = of_property_count_u32_elems(endpoint, "data-lanes"); + icn->host_node = of_graph_get_remote_port_parent(endpoint); + of_node_put(endpoint); + + if (!icn->host_node) + return -ENODEV; + + /* +* If the 'data-lanes' property does not exist in DT or is invalid, +* default to previously hard-coded behavior, which was 4 data lanes. +*/ + if (dsi_lanes < 0) { + icn->dsi_lanes = 4; + } else if (dsi_lanes > 4) { + ret = -EINVAL; + goto err_data_lanes; + } else { + icn->dsi_lanes = dsi_lanes; + } + ret = drm_of_find_panel_or_bridge(dev->of_node, 1, 0, &panel, NULL); if (ret) - return ret; + goto err_data_lanes; icn->panel_bridge = devm_drm_panel_bridge_add(dev, panel); - if (IS_ERR(icn->panel_bridge)) - return PTR_ERR(icn->panel_bridge); + if (IS_ERR(icn->panel_bridge)) { + ret = PTR_ERR(icn->panel_bridge); + goto err_data_lanes; + } return 0; + +err_data_lanes: + of_node_put(icn->host_node); + return ret; } static int chipone_probe(struct mipi_dsi_device *dsi) @@ -384,7 +420,7 @@ static int chipone_probe(struct mipi_dsi_device *dsi) drm_bridge_add(&icn->bridge); - dsi->lanes = 4; + dsi->lanes = icn->dsi_lanes; dsi->format = MIPI_DSI_FMT_RGB888; dsi->mode_flags = MIPI_DSI_MODE_VIDEO_SYNC_PULSE; @@ -403,6 +439,7 @@ static int chipone_remove(struct mipi_dsi_device *dsi) mipi_dsi_detach(dsi); drm_bridge_remove(&icn->bridge); + of_node_put(icn->host_node); return 0; } -- 2.34.1
[PATCH V3 01/13] dt-bindings: display: bridge: icn6211: Document DSI data-lanes property
It is necessary to specify the number of connected/used DSI data lanes when using the DSI input port of this bridge. Document the 'data-lanes' property of the DSI input port. Signed-off-by: Marek Vasut Cc: Jagan Teki Cc: Maxime Ripard Cc: Rob Herring Cc: Robert Foss Cc: Sam Ravnborg Cc: Thomas Zimmermann Cc: devicet...@vger.kernel.org To: dri-devel@lists.freedesktop.org --- V3: New patch --- .../display/bridge/chipone,icn6211.yaml| 18 +- 1 file changed, 17 insertions(+), 1 deletion(-) diff --git a/Documentation/devicetree/bindings/display/bridge/chipone,icn6211.yaml b/Documentation/devicetree/bindings/display/bridge/chipone,icn6211.yaml index 62c3bd4cb28d8..f8cac721a7330 100644 --- a/Documentation/devicetree/bindings/display/bridge/chipone,icn6211.yaml +++ b/Documentation/devicetree/bindings/display/bridge/chipone,icn6211.yaml @@ -41,10 +41,26 @@ properties: properties: port@0: -$ref: /schemas/graph.yaml#/properties/port +$ref: /schemas/graph.yaml#/$defs/port-base +unevaluatedProperties: false description: Video port for MIPI DSI input +properties: + endpoint: +$ref: /schemas/media/video-interfaces.yaml# +unevaluatedProperties: false + +properties: + data-lanes: +description: array of physical DSI data lane indexes. +minItems: 1 +items: + - const: 1 + - const: 2 + - const: 3 + - const: 4 + port@1: $ref: /schemas/graph.yaml#/properties/port description: -- 2.34.1
Re: [PATCH v4 4/4] arm64/dts/qcom/sm8250: remove assigned-clock-rate property for mdp clk
On Fri, 4 Mar 2022 at 02:56, Stephen Boyd wrote: > > Quoting Dmitry Baryshkov (2022-03-03 15:50:50) > > On Thu, 3 Mar 2022 at 12:40, Vinod Polimera > > wrote: > > > > > > Kernel clock driver assumes that initial rate is the > > > max rate for that clock and was not allowing it to scale > > > beyond the assigned clock value. > > > > > > Drop the assigned clock rate property and vote on the mdp clock as per > > > calculated value during the usecase. > > > > > > Fixes: 7c1dffd471("arm64: dts: qcom: sm8250.dtsi: add display system > > > nodes") > > > > Please remove the Fixes tags from all commits. Otherwise the patches > > might be picked up into earlier kernels, which do not have a patch > > adding a vote on the MDP clock. > > What patch is that? The Fixes tag could point to that commit. Please correct me if I'm wrong. Currently the dtsi enforces bumping the MDP clock when the mdss device is being probed and when the dpu device is being probed. Later, during the DPU lifetime, core_perf would change the clock's rate as it sees fit according to the CRTC requirements. However, this happens only during dpu_crtc_atomic_flush(); before that function is called, the MDP clock is left in an undetermined state, as are the power rails controlled by the OPP table. I suppose that during dpu_bind we should bump the clock to the maximum possible frequency and let dpu_core_perf handle it afterwards. -- With best wishes Dmitry
Re: [PATCH v2] drm/msm/disp/dpu1: add inline rotation support for sc7280 target
On Thu, 3 Mar 2022 at 14:43, Vinod Polimera wrote: > > - Some DPU versions support inline rot90. It is supported only for > limited amount of UBWC formats. > - There are two versions of inline rotators, v1 (present on sm8250 and > sm7250) and v2 (sc7280). These versions differ in the list of supported > formats and in the scaler possibilities. > > Changes in RFC: > - Rebase changes to the latest code base. > - Append rotation config variables with v2 and > remove unused variables.(Dmitry) > - Move pixel_ext setup separately from scaler3 config.(Dmitry) > - Add 270 degree rotation to supported rotation list.(Dmitry) > > Changes in V2: > - Remove unused macros and fix indentation. > - Add check if 90 rotation is supported and add supported rotations to > rot_cfg. > > Signed-off-by: Kalyan Thota > Signed-off-by: Vinod Polimera > --- > drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c | 44 +++--- > drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h | 17 > drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c | 108 > +++-- > drivers/gpu/drm/msm/disp/dpu1/dpu_plane.h | 2 + > 4 files changed, 134 insertions(+), 37 deletions(-) > > diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c > b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c > index aa75991..7cd07be 100644 > --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c > +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c > @@ -25,6 +25,9 @@ > #define VIG_SM8250_MASK \ > (VIG_MASK | BIT(DPU_SSPP_QOS_8LVL) | BIT(DPU_SSPP_SCALER_QSEED3LITE)) > > +#define VIG_SC7280_MASK \ > + (VIG_SC7180_MASK | BIT(DPU_SSPP_INLINE_ROTATION)) > + > #define DMA_SDM845_MASK \ > (BIT(DPU_SSPP_SRC) | BIT(DPU_SSPP_QOS) | BIT(DPU_SSPP_QOS_8LVL) |\ > BIT(DPU_SSPP_TS_PREFILL) | BIT(DPU_SSPP_TS_PREFILL_REC1) |\ > @@ -177,6 +180,11 @@ static const uint32_t plane_formats_yuv[] = { > DRM_FORMAT_YVU420, > }; > > +static const uint32_t rotation_v2_formats[] = { > + DRM_FORMAT_NV12, > + /* TODO add formats after validation */ > +}; > + > /* > * DPU sub blocks config > 
*/ > @@ -464,8 +472,7 @@ static const struct dpu_ctl_cfg sc7280_ctl[] = { > */ > > /* SSPP common configuration */ > - > -#define _VIG_SBLK(num, sdma_pri, qseed_ver) \ > +#define _VIG_SBLK(num, sdma_pri, qseed_ver, rot_cfg) \ > { \ > .maxdwnscale = MAX_DOWNSCALE_RATIO, \ > .maxupscale = MAX_UPSCALE_RATIO, \ > @@ -482,6 +489,7 @@ static const struct dpu_ctl_cfg sc7280_ctl[] = { > .num_formats = ARRAY_SIZE(plane_formats_yuv), \ > .virt_format_list = plane_formats, \ > .virt_num_formats = ARRAY_SIZE(plane_formats), \ > + .rotation_cfg = rot_cfg, \ > } > > #define _DMA_SBLK(num, sdma_pri) \ > @@ -497,14 +505,21 @@ static const struct dpu_ctl_cfg sc7280_ctl[] = { > .virt_num_formats = ARRAY_SIZE(plane_formats), \ > } > > +static const struct dpu_rotation_cfg dpu_rot_sc7280_cfg_v2 = { > + .rot_maxheight = 1088, > + .rot_num_formats = ARRAY_SIZE(rotation_v2_formats), > + .rot_format_list = rotation_v2_formats, > + .rot_supported = DRM_MODE_ROTATE_MASK | DRM_MODE_REFLECT_MASK, > +}; > + > static const struct dpu_sspp_sub_blks sdm845_vig_sblk_0 = > - _VIG_SBLK("0", 5, DPU_SSPP_SCALER_QSEED3); > + _VIG_SBLK("0", 5, DPU_SSPP_SCALER_QSEED3, > NULL); > static const struct dpu_sspp_sub_blks sdm845_vig_sblk_1 = > - _VIG_SBLK("1", 6, DPU_SSPP_SCALER_QSEED3); > + _VIG_SBLK("1", 6, DPU_SSPP_SCALER_QSEED3, > NULL); > static const struct dpu_sspp_sub_blks sdm845_vig_sblk_2 = > - _VIG_SBLK("2", 7, DPU_SSPP_SCALER_QSEED3); > + _VIG_SBLK("2", 7, DPU_SSPP_SCALER_QSEED3, > NULL); > static const struct dpu_sspp_sub_blks sdm845_vig_sblk_3 = > - _VIG_SBLK("3", 8, DPU_SSPP_SCALER_QSEED3); > + _VIG_SBLK("3", 8, DPU_SSPP_SCALER_QSEED3, > NULL); > > static const struct dpu_sspp_sub_blks sdm845_dma_sblk_0 = _DMA_SBLK("8", 1); > static const struct dpu_sspp_sub_blks sdm845_dma_sblk_1 = _DMA_SBLK("9", 2); > @@ -543,7 +558,10 @@ static const struct dpu_sspp_cfg sdm845_sspp[] = { > }; > > static const struct dpu_sspp_sub_blks sc7180_vig_sblk_0 = > - _VIG_SBLK("0", 4, DPU_SSPP_SCALER_QSEED4); > + 
_VIG_SBLK("0", 4, DPU_SSPP_SCALER_QSEED4, > NULL); > + > +static const struct dpu_sspp_sub_blks sc7280_vig_sblk_0 = > + _VIG_SBLK("0", 4, DPU_SSPP_SCALER_QSEED4, > &dpu_rot_sc7280_cfg_v2); > > static const struct dpu_sspp_cfg sc7180_sspp[] = { > SSPP_BLK
[PATCH v2 2/2] dt-bindings: gpu: Convert aspeed-gfx bindings to yaml
Convert the bindings to yaml and add the ast2600 compatible string. The legacy mfd description was put in place before the gfx bindings existed, to document the compatible that is used in the pinctrl bindings. Signed-off-by: Joel Stanley --- .../devicetree/bindings/gpu/aspeed,gfx.yaml | 69 +++ .../devicetree/bindings/gpu/aspeed-gfx.txt| 41 --- .../devicetree/bindings/mfd/aspeed-gfx.txt| 17 - 3 files changed, 69 insertions(+), 58 deletions(-) create mode 100644 Documentation/devicetree/bindings/gpu/aspeed,gfx.yaml delete mode 100644 Documentation/devicetree/bindings/gpu/aspeed-gfx.txt delete mode 100644 Documentation/devicetree/bindings/mfd/aspeed-gfx.txt diff --git a/Documentation/devicetree/bindings/gpu/aspeed,gfx.yaml b/Documentation/devicetree/bindings/gpu/aspeed,gfx.yaml new file mode 100644 index ..8ddc4fa6e8e4 --- /dev/null +++ b/Documentation/devicetree/bindings/gpu/aspeed,gfx.yaml @@ -0,0 +1,69 @@ +# SPDX-License-Identifier: GPL-2.0-only +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/gpu/aspeed,gfx.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: ASPEED GFX display device + +maintainers: + - Joel Stanley + +properties: + compatible: +items: + - enum: + - aspeed,ast2400-gfx + - aspeed,ast2500-gfx + - aspeed,ast2600-gfx + - const: syscon + + reg: +minItems: 1 + + interrupts: +maxItems: 1 + + clocks: +maxItems: 1 + + resets: +maxItems: 1 + + memory-region: true + + syscon: true + + reg-io-width: true + +required: + - reg + - compatible + - interrupts + - clocks + - resets + - memory-region + - syscon + +additionalProperties: false + +examples: + - | + #include + gfx: display@1e6e6000 { + compatible = "aspeed,ast2500-gfx", "syscon"; + reg = <0x1e6e6000 0x1000>; + reg-io-width = <4>; + clocks = <&syscon ASPEED_CLK_GATE_D1CLK>; + resets = <&syscon ASPEED_RESET_CRT1>; + interrupts = <0x19>; + syscon = <&syscon>; + memory-region = <&gfx_memory>; + }; + + gfx_memory: framebuffer { + size = <0x0100>; + alignment = <0x0100>; + compatible = 
"shared-dma-pool"; + reusable; + }; diff --git a/Documentation/devicetree/bindings/gpu/aspeed-gfx.txt b/Documentation/devicetree/bindings/gpu/aspeed-gfx.txt deleted file mode 100644 index 958bdf962339.. --- a/Documentation/devicetree/bindings/gpu/aspeed-gfx.txt +++ /dev/null @@ -1,41 +0,0 @@ -Device tree configuration for the GFX display device on the ASPEED SoCs - -Required properties: - - compatible -* Must be one of the following: - + aspeed,ast2500-gfx - + aspeed,ast2400-gfx -* In addition, the ASPEED pinctrl bindings require the 'syscon' property to - be present - - - reg: Physical base address and length of the GFX registers - - - interrupts: interrupt number for the GFX device - - - clocks: clock number used to generate the pixel clock - - - resets: reset line that must be released to use the GFX device - - - memory-region: -Phandle to a memory region to allocate from, as defined in -Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt - - -Example: - -gfx: display@1e6e6000 { - compatible = "aspeed,ast2500-gfx", "syscon"; - reg = <0x1e6e6000 0x1000>; - reg-io-width = <4>; - clocks = <&syscon ASPEED_CLK_GATE_D1CLK>; - resets = <&syscon ASPEED_RESET_CRT1>; - interrupts = <0x19>; - memory-region = <&gfx_memory>; -}; - -gfx_memory: framebuffer { - size = <0x0100>; - alignment = <0x0100>; - compatible = "shared-dma-pool"; - reusable; -}; diff --git a/Documentation/devicetree/bindings/mfd/aspeed-gfx.txt b/Documentation/devicetree/bindings/mfd/aspeed-gfx.txt deleted file mode 100644 index aea5370efd97.. --- a/Documentation/devicetree/bindings/mfd/aspeed-gfx.txt +++ /dev/null @@ -1,17 +0,0 @@ -* Device tree bindings for Aspeed SoC Display Controller (GFX) - -The Aspeed SoC Display Controller primarily does as its name suggests, but also -participates in pinmux requests on the g5 SoCs. It is therefore considered a -syscon device. 
- -Required properties: -- compatible: "aspeed,ast2500-gfx", "syscon" -- reg: contains offset/length value of the GFX memory - region. - -Example: - -gfx: display@1e6e6000 { - compatible = "aspeed,ast2500-gfx", "syscon"; - reg = <0x1e6e6000 0x1000>; -}; -- 2.34.1
[PATCH v2 1/2] dt-bindings: pinctrl: aspeed: Update gfx node in example
The example needs updating to match the to be added yaml bindings for the gfx node. Signed-off-by: Joel Stanley --- .../bindings/pinctrl/aspeed,ast2500-pinctrl.yaml | 16 1 file changed, 16 insertions(+) diff --git a/Documentation/devicetree/bindings/pinctrl/aspeed,ast2500-pinctrl.yaml b/Documentation/devicetree/bindings/pinctrl/aspeed,ast2500-pinctrl.yaml index d316cc082107..9969997c2f1b 100644 --- a/Documentation/devicetree/bindings/pinctrl/aspeed,ast2500-pinctrl.yaml +++ b/Documentation/devicetree/bindings/pinctrl/aspeed,ast2500-pinctrl.yaml @@ -73,6 +73,7 @@ additionalProperties: false examples: - | +#include apb { compatible = "simple-bus"; #address-cells = <1>; @@ -82,6 +83,8 @@ examples: syscon: scu@1e6e2000 { compatible = "aspeed,ast2500-scu", "syscon", "simple-mfd"; reg = <0x1e6e2000 0x1a8>; +#clock-cells = <1>; +#reset-cells = <1>; pinctrl: pinctrl { compatible = "aspeed,ast2500-pinctrl"; @@ -102,6 +105,12 @@ examples: gfx: display@1e6e6000 { compatible = "aspeed,ast2500-gfx", "syscon"; reg = <0x1e6e6000 0x1000>; +reg-io-width = <4>; +clocks = <&syscon ASPEED_CLK_GATE_D1CLK>; +resets = <&syscon ASPEED_RESET_CRT1>; +interrupts = <0x19>; +syscon = <&syscon>; +memory-region = <&gfx_memory>; }; }; @@ -128,3 +137,10 @@ examples: }; }; }; + +gfx_memory: framebuffer { +size = <0x0100>; +alignment = <0x0100>; +compatible = "shared-dma-pool"; +reusable; +}; -- 2.34.1
[PATCH v2 0/2] dt-bindings: Convert GFX bindings to yaml
v1: https://lore.kernel.org/all/20220302051056.678367-1-j...@jms.id.au/ This series cleans up the bindings for the ASPEED GFX unit. The old text files are deleted for both the description under gpu and the placeholder one under mfd. The mfd one existed because pinctrl for the 2500 depends on the gfx bindings, and at the time we didn't have any support for the gfx device, so Andrew added the mfd ones. The example in the pinctrl bindings is updated to prevent warnings about missing properties that pop up when the gfx yaml bindings are added. Joel Stanley (2): dt-bindings: pinctrl: aspeed: Update gfx node in example dt-bindings: gpu: Convert aspeed-gfx bindings to yaml .../devicetree/bindings/gpu/aspeed,gfx.yaml | 69 +++ .../devicetree/bindings/gpu/aspeed-gfx.txt| 41 --- .../devicetree/bindings/mfd/aspeed-gfx.txt| 17 - .../pinctrl/aspeed,ast2500-pinctrl.yaml | 16 + 4 files changed, 85 insertions(+), 58 deletions(-) create mode 100644 Documentation/devicetree/bindings/gpu/aspeed,gfx.yaml delete mode 100644 Documentation/devicetree/bindings/gpu/aspeed-gfx.txt delete mode 100644 Documentation/devicetree/bindings/mfd/aspeed-gfx.txt -- 2.34.1
Re: [PATCH v4 4/4] arm64/dts/qcom/sm8250: remove assigned-clock-rate property for mdp clk
Quoting Dmitry Baryshkov (2022-03-03 15:50:50) > On Thu, 3 Mar 2022 at 12:40, Vinod Polimera wrote: > > > > Kernel clock driver assumes that initial rate is the > > max rate for that clock and was not allowing it to scale > > beyond the assigned clock value. > > > > Drop the assigned clock rate property and vote on the mdp clock as per > > calculated value during the usecase. > > > > Fixes: 7c1dffd471("arm64: dts: qcom: sm8250.dtsi: add display system nodes") > > Please remove the Fixes tags from all commits. Otherwise the patches > might be picked up into earlier kernels, which do not have a patch > adding a vote on the MDP clock. What patch is that? The Fixes tag could point to that commit.
Re: [PATCH v4 4/4] arm64/dts/qcom/sm8250: remove assigned-clock-rate property for mdp clk
On Thu, 3 Mar 2022 at 12:40, Vinod Polimera wrote: > > Kernel clock driver assumes that initial rate is the > max rate for that clock and was not allowing it to scale > beyond the assigned clock value. > > Drop the assigned clock rate property and vote on the mdp clock as per > calculated value during the usecase. > > Fixes: 7c1dffd471("arm64: dts: qcom: sm8250.dtsi: add display system nodes") Please remove the Fixes tags from all commits. Otherwise the patches might be picked up into earlier kernels, which do not have a patch adding a vote on the MDP clock. > Signed-off-by: Vinod Polimera > --- > arch/arm64/boot/dts/qcom/sm8250.dtsi | 9 ++--- > 1 file changed, 2 insertions(+), 7 deletions(-) > > diff --git a/arch/arm64/boot/dts/qcom/sm8250.dtsi > b/arch/arm64/boot/dts/qcom/sm8250.dtsi > index fdaf303..2105eb7 100644 > --- a/arch/arm64/boot/dts/qcom/sm8250.dtsi > +++ b/arch/arm64/boot/dts/qcom/sm8250.dtsi > @@ -3164,9 +3164,6 @@ > <&dispcc DISP_CC_MDSS_MDP_CLK>; > clock-names = "iface", "bus", "nrt_bus", "core"; > > - assigned-clocks = <&dispcc DISP_CC_MDSS_MDP_CLK>; > - assigned-clock-rates = <46000>; > - > interrupts = ; > interrupt-controller; > #interrupt-cells = <1>; > @@ -3191,10 +3188,8 @@ > <&dispcc DISP_CC_MDSS_VSYNC_CLK>; > clock-names = "iface", "bus", "core", "vsync"; > > - assigned-clocks = <&dispcc > DISP_CC_MDSS_MDP_CLK>, > - <&dispcc > DISP_CC_MDSS_VSYNC_CLK>; > - assigned-clock-rates = <46000>, > - <1920>; > + assigned-clocks = <&dispcc > DISP_CC_MDSS_VSYNC_CLK>; > + assigned-clock-rates = <1920>; > > operating-points-v2 = <&mdp_opp_table>; > power-domains = <&rpmhpd SM8250_MMCX>; > -- > 2.7.4 > -- With best wishes Dmitry
Re: [PATCH v2 3/4] drm/msm: split the main platform driver
On Fri, 4 Mar 2022 at 02:00, Stephen Boyd wrote: > > Quoting Dmitry Baryshkov (2022-01-19 14:40:04) > > diff --git a/drivers/gpu/drm/msm/msm_drv.h b/drivers/gpu/drm/msm/msm_drv.h > > index 06d26c5fb274..6895c056be19 100644 > > --- a/drivers/gpu/drm/msm/msm_drv.h > > +++ b/drivers/gpu/drm/msm/msm_drv.h > > @@ -451,10 +451,18 @@ static inline void msm_dp_debugfs_init(struct msm_dp > > *dp_display, > > > > #endif > > > > +#define KMS_MDP4 4 > > +#define KMS_MDP5 5 > > +#define KMS_DPU 3 > > + > > +void __init msm_mdp4_register(void); > > +void __exit msm_mdp4_unregister(void); > > void __init msm_mdp_register(void); > > void __exit msm_mdp_unregister(void); > > void __init msm_dpu_register(void); > > void __exit msm_dpu_unregister(void); > > +void __init msm_mdss_register(void); > > +void __exit msm_mdss_unregister(void); > > Don't need __init or __exit on prototypes. > > > > > #ifdef CONFIG_DEBUG_FS > > void msm_framebuffer_describe(struct drm_framebuffer *fb, struct seq_file > > *m); > > diff --git a/drivers/gpu/drm/msm/msm_mdss.c b/drivers/gpu/drm/msm/msm_mdss.c > > index 92562221b517..759076357e0e 100644 > > --- a/drivers/gpu/drm/msm/msm_mdss.c > > +++ b/drivers/gpu/drm/msm/msm_mdss.c > > @@ -8,6 +8,8 @@ > > #include > > #include > > > > +#include > > What's this include for? 
> > > + > > #include "msm_drv.h" > > #include "msm_kms.h" > > > > @@ -127,7 +129,7 @@ static int _msm_mdss_irq_domain_add(struct msm_mdss > > *msm_mdss) > > return 0; > > } > > > > -int msm_mdss_enable(struct msm_mdss *msm_mdss) > > +static int msm_mdss_enable(struct msm_mdss *msm_mdss) > > { > > int ret; > > > > @@ -163,14 +165,14 @@ int msm_mdss_enable(struct msm_mdss *msm_mdss) > > return ret; > > } > > > > -int msm_mdss_disable(struct msm_mdss *msm_mdss) > > +static int msm_mdss_disable(struct msm_mdss *msm_mdss) > > { > > clk_bulk_disable_unprepare(msm_mdss->num_clocks, msm_mdss->clocks); > > > > return 0; > > } > > > > -void msm_mdss_destroy(struct msm_mdss *msm_mdss) > > +static void msm_mdss_destroy(struct msm_mdss *msm_mdss) > > { > > struct platform_device *pdev = to_platform_device(msm_mdss->dev); > > int irq; > > @@ -228,7 +230,7 @@ int mdp5_mdss_parse_clock(struct platform_device *pdev, > > struct clk_bulk_data **c > > return num_clocks; > > } > > > > -struct msm_mdss *msm_mdss_init(struct platform_device *pdev, bool mdp5) > > +static struct msm_mdss *msm_mdss_init(struct platform_device *pdev, bool > > mdp5) > > { > > struct msm_mdss *msm_mdss; > > int ret; > > @@ -269,3 +271,171 @@ struct msm_mdss *msm_mdss_init(struct platform_device > > *pdev, bool mdp5) > > > > return msm_mdss; > > } > > + > > +static int __maybe_unused mdss_runtime_suspend(struct device *dev) > > +{ > > + struct msm_drm_private *priv = dev_get_drvdata(dev); > > + > > + DBG(""); > > + > > + return msm_mdss_disable(priv->mdss); > > +} > > + > > +static int __maybe_unused mdss_runtime_resume(struct device *dev) > > +{ > > + struct msm_drm_private *priv = dev_get_drvdata(dev); > > + > > + DBG(""); > > + > > + return msm_mdss_enable(priv->mdss); > > +} > > + > > +static int __maybe_unused mdss_pm_suspend(struct device *dev) > > +{ > > + > > + if (pm_runtime_suspended(dev)) > > + return 0; > > + > > + return mdss_runtime_suspend(dev); > > +} > > + > > +static int __maybe_unused 
mdss_pm_resume(struct device *dev) > > +{ > > + if (pm_runtime_suspended(dev)) > > + return 0; > > + > > + return mdss_runtime_resume(dev); > > +} > > + > > +static const struct dev_pm_ops mdss_pm_ops = { > > + SET_SYSTEM_SLEEP_PM_OPS(mdss_pm_suspend, mdss_pm_resume) > > + SET_RUNTIME_PM_OPS(mdss_runtime_suspend, mdss_runtime_resume, NULL) > > + .prepare = msm_pm_prepare, > > + .complete = msm_pm_complete, > > +}; > > + > > +static int get_mdp_ver(struct platform_device *pdev) > > +{ > > + struct device *dev = &pdev->dev; > > + > > + return (int) (unsigned long) of_device_get_match_data(dev); > > +} > > + > > +static int find_mdp_node(struct device *dev, void *data) > > +{ > > + return of_match_node(dpu_dt_match, dev->of_node) || > > + of_match_node(mdp5_dt_match, dev->of_node); > > +} > > + > > +static int mdss_probe(struct platform_device *pdev) > > +{ > > + struct msm_mdss *mdss; > > + struct msm_drm_private *priv; > > + int mdp_ver = get_mdp_ver(pdev); > > + struct device *mdp_dev; > > + struct device *dev = &pdev->dev; > > + int ret; > > + > > + if (mdp_ver != KMS_MDP5 && mdp_ver != KMS_DPU) > > + return -EINVAL; > > Is it possible anymore? Now that the driver is split it seems like no. Yes, I'll drop this. > > > + > > + priv = devm_kzalloc(&pdev->dev, sizeof(*priv), GFP_KERNEL); > > + if (!priv) > > + return -ENOMEM; > > + > > +
Re: [PATCH v2 2/4] drm/msm: remove extra indirection for msm_mdss
On Fri, 4 Mar 2022 at 01:54, Stephen Boyd wrote: > > Quoting Dmitry Baryshkov (2022-01-19 14:40:03) > > diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c > > index be06a62d7ccb..f18dfbb614f0 100644 > > --- a/drivers/gpu/drm/msm/msm_drv.c > > +++ b/drivers/gpu/drm/msm/msm_drv.c > > @@ -1211,19 +1212,32 @@ static int msm_pdev_probe(struct platform_device > > *pdev) > > > > switch (get_mdp_ver(pdev)) { > > case KMS_MDP5: > > - ret = msm_mdss_init(pdev, true); > > + mdss = msm_mdss_init(pdev, true); > > + if (IS_ERR(mdss)) { > > + ret = PTR_ERR(mdss); > > + platform_set_drvdata(pdev, NULL); > > + > > + return ret; > > + } else { > > Drop else > > > + priv->mdss = mdss; > > + pm_runtime_enable(&pdev->dev); > > + } > > break; > > case KMS_DPU: > > - ret = msm_mdss_init(pdev, false); > > + mdss = msm_mdss_init(pdev, false); > > + if (IS_ERR(mdss)) { > > + ret = PTR_ERR(mdss); > > + platform_set_drvdata(pdev, NULL); > > + > > + return ret; > > + } else { > > + priv->mdss = mdss; > > + pm_runtime_enable(&pdev->dev); > > + } > > This is the same so why can't it be done below in the deleted if (ret)? I didn't like the idea of checking the if (IS_ERR(mdss)) outside of the case blocks, but now I can move it back. 
> > > break; > > default: > > - ret = 0; > > break; > > } > > - if (ret) { > > - platform_set_drvdata(pdev, NULL); > > - return ret; > > - } > > > > if (get_mdp_ver(pdev)) { > > ret = add_display_components(pdev, &match); > > diff --git a/drivers/gpu/drm/msm/msm_kms.h b/drivers/gpu/drm/msm/msm_kms.h > > index 2459ba479caf..0c341660941a 100644 > > --- a/drivers/gpu/drm/msm/msm_kms.h > > +++ b/drivers/gpu/drm/msm/msm_kms.h > > @@ -239,50 +228,44 @@ int mdp5_mdss_parse_clock(struct platform_device > > *pdev, struct clk_bulk_data **c > > return num_clocks; > > } > > > > -int msm_mdss_init(struct platform_device *pdev, bool mdp5) > > +struct msm_mdss *msm_mdss_init(struct platform_device *pdev, bool mdp5) > > Ah I see it will quickly become not static. Still should have static > first and remove it here. -- With best wishes Dmitry
Re: [PATCH 00/12] Add writeback block support for DPU
Hi Stephen, There is some discussion going on about the base dependency of the change: https://patchwork.kernel.org/project/dri-devel/patch/20220202085429.22261-6-suraj.kand...@intel.com/ I will resend this with comments addressed once the dependency is sorted out among Intel, QC and Laurent. Thanks, Abhinav On 3/3/2022 2:46 PM, Stephen Boyd wrote: Quoting Abhinav Kumar (2022-02-04 13:17:13) This series adds support for the writeback block on DPU. The writeback block is extremely useful for validating boards that have no physical displays, in addition to many other use-cases where we want to capture the output of the display pipeline to examine whether an issue is with the display pipeline or with the panel. Is this series going to be resent?
Re: [PATCH] dt-bindings: gpu: Convert aspeed-gfx bindings to yaml
On Fri, 4 Mar 2022, at 08:05, Joel Stanley wrote: > On Thu, 3 Mar 2022 at 19:34, Rob Herring wrote: >> >> On Wed, Mar 2, 2022 at 12:01 PM Rob Herring wrote: >> > >> > On Wed, Mar 02, 2022 at 03:40:56PM +1030, Joel Stanley wrote: >> > > Convert the bindings to yaml and add the ast2600 compatible string. >> > > >> > > Signed-off-by: Joel Stanley >> > > --- >> > > .../devicetree/bindings/gpu/aspeed,gfx.yaml | 69 +++ >> > > .../devicetree/bindings/gpu/aspeed-gfx.txt| 41 --- >> > > 2 files changed, 69 insertions(+), 41 deletions(-) >> > > create mode 100644 Documentation/devicetree/bindings/gpu/aspeed,gfx.yaml >> > > delete mode 100644 Documentation/devicetree/bindings/gpu/aspeed-gfx.txt >> > >> > Applied, thanks. >> >> Uggg, now dropped... >> >> What's Documentation/devicetree/bindings/mfd/aspeed-gfx.txt and also >> the example in >> Documentation/devicetree/bindings/pinctrl/aspeed,ast2500-pinctrl.yaml? >> Please sort those out. > > I think the aspeed-gfx.txt can be deleted. And the example in the > pinctrl bindings needs to be updated with the required properties. > > Andrew, can you clarify what's going on with those other files? Looks like you'll just need to paste your example from aspeed,gfx.yaml into the pinctrl yamls to replace the existing gfx nodes. Andrew
Re: [PATCH v3 2/4] drm/i915: Fix compute pre-emption w/a to apply to compute engines
On Thu, Mar 03, 2022 at 02:37:35PM -0800, john.c.harri...@intel.com wrote: > From: John Harrison > > An earlier patch added support for compute engines. However, it missed > enabling the anti-pre-emption w/a for the new engine class. So move > the 'compute capable' flag earlier and use it for the pre-emption w/a > test. > > Fixes: c674c5b9342e ("drm/i915/xehp: CCS should use RCS setup functions") > Cc: Tvrtko Ursulin > Cc: Daniele Ceraolo Spurio > Cc: Aravind Iddamsetty > Cc: Matt Roper > Cc: Tvrtko Ursulin > Cc: Daniel Vetter > Cc: Maarten Lankhorst > Cc: Lucas De Marchi > Cc: John Harrison > Cc: Jason Ekstrand > Cc: "Michał Winiarski" > Cc: Matthew Brost > Cc: Chris Wilson > Cc: Tejas Upadhyay > Cc: Umesh Nerlige Ramappa > Cc: "Thomas Hellström" > Cc: Stuart Summers > Cc: Matthew Auld > Cc: Jani Nikula > Cc: Ramalingam C > Cc: Akeem G Abodunrin > Signed-off-by: John Harrison Reviewed-by: Matt Roper > --- > drivers/gpu/drm/i915/gt/intel_engine_cs.c | 14 +++--- > 1 file changed, 7 insertions(+), 7 deletions(-) > > diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c > b/drivers/gpu/drm/i915/gt/intel_engine_cs.c > index 22e70e4e007c..4185c7338581 100644 > --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c > +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c > @@ -421,6 +421,12 @@ static int intel_engine_setup(struct intel_gt *gt, enum > intel_engine_id id, > engine->logical_mask = BIT(logical_instance); > __sprint_engine_name(engine); > > + /* features common between engines sharing EUs */ > + if (engine->class == RENDER_CLASS || engine->class == COMPUTE_CLASS) { > + engine->flags |= I915_ENGINE_HAS_RCS_REG_STATE; > + engine->flags |= I915_ENGINE_HAS_EU_PRIORITY; > + } > + > engine->props.heartbeat_interval_ms = > CONFIG_DRM_I915_HEARTBEAT_INTERVAL; > engine->props.max_busywait_duration_ns = > @@ -433,15 +439,9 @@ static int intel_engine_setup(struct intel_gt *gt, enum > intel_engine_id id, > CONFIG_DRM_I915_TIMESLICE_DURATION; > > /* Override to uninterruptible for 
OpenCL workloads. */ > - if (GRAPHICS_VER(i915) == 12 && engine->class == RENDER_CLASS) > + if (GRAPHICS_VER(i915) == 12 && (engine->flags & > I915_ENGINE_HAS_RCS_REG_STATE)) > engine->props.preempt_timeout_ms = 0; > > - /* features common between engines sharing EUs */ > - if (engine->class == RENDER_CLASS || engine->class == COMPUTE_CLASS) { > - engine->flags |= I915_ENGINE_HAS_RCS_REG_STATE; > - engine->flags |= I915_ENGINE_HAS_EU_PRIORITY; > - } > - > /* Cap properties according to any system limits */ > #define CLAMP_PROP(field) \ > do { \ > -- > 2.25.1 > -- Matt Roper Graphics Software Engineer VTT-OSGC Platform Enablement Intel Corporation (916) 356-2795
Re: [PATCH v2 4/4] drm/msm: stop using device's match data pointer
Quoting Dmitry Baryshkov (2022-01-19 14:40:05) > diff --git a/drivers/gpu/drm/msm/msm_mdss.c b/drivers/gpu/drm/msm/msm_mdss.c > index 759076357e0e..f83dca99f03d 100644 > --- a/drivers/gpu/drm/msm/msm_mdss.c > +++ b/drivers/gpu/drm/msm/msm_mdss.c > @@ -314,11 +314,11 @@ static const struct dev_pm_ops mdss_pm_ops = { > .complete = msm_pm_complete, > }; > > -static int get_mdp_ver(struct platform_device *pdev) > +static bool get_is_mdp5(struct platform_device *pdev) > { > struct device *dev = &pdev->dev; > > - return (int) (unsigned long) of_device_get_match_data(dev); > + return (bool) (unsigned long) of_device_get_match_data(dev); > } > > static int find_mdp_node(struct device *dev, void *data) > @@ -331,21 +331,18 @@ static int mdss_probe(struct platform_device *pdev) > { > struct msm_mdss *mdss; > struct msm_drm_private *priv; > - int mdp_ver = get_mdp_ver(pdev); > + bool is_mdp5 = get_is_mdp5(pdev); is_mdp5 = of_device_is_compatible(pdev->dev.of_node, "qcom,mdss"); > struct device *mdp_dev; > struct device *dev = &pdev->dev; > int ret; > > - if (mdp_ver != KMS_MDP5 && mdp_ver != KMS_DPU) > - return -EINVAL; > - > priv = devm_kzalloc(&pdev->dev, sizeof(*priv), GFP_KERNEL); > if (!priv) > return -ENOMEM; > > platform_set_drvdata(pdev, priv); > > - mdss = msm_mdss_init(pdev, mdp_ver == KMS_MDP5); > + mdss = msm_mdss_init(pdev, is_mdp5); > if (IS_ERR(mdss)) { > ret = PTR_ERR(mdss); > platform_set_drvdata(pdev, NULL); > @@ -409,12 +406,12 @@ static int mdss_remove(struct platform_device *pdev) > } > > static const struct of_device_id mdss_dt_match[] = { > - { .compatible = "qcom,mdss", .data = (void *)KMS_MDP5 }, > - { .compatible = "qcom,sdm845-mdss", .data = (void *)KMS_DPU }, > - { .compatible = "qcom,sc7180-mdss", .data = (void *)KMS_DPU }, > - { .compatible = "qcom,sc7280-mdss", .data = (void *)KMS_DPU }, > - { .compatible = "qcom,sm8150-mdss", .data = (void *)KMS_DPU }, > - { .compatible = "qcom,sm8250-mdss", .data = (void *)KMS_DPU }, > + { .compatible = 
"qcom,mdss", .data = (void *)true }, > + { .compatible = "qcom,sdm845-mdss", .data = (void *)false }, > + { .compatible = "qcom,sc7180-mdss", .data = (void *)false }, > + { .compatible = "qcom,sc7280-mdss", .data = (void *)false }, > + { .compatible = "qcom,sm8150-mdss", .data = (void *)false }, > + { .compatible = "qcom,sm8250-mdss", .data = (void *)false }, And then no data needed?
Re: [PATCH v4 2/2] drm/bridge: analogix_dp: Enable autosuspend
Hi, On Tue, Mar 1, 2022 at 6:11 PM Brian Norris wrote: > > DP AUX transactions can consist of many short operations. There's no > need to power things up/down in short intervals. > > I pick an arbitrary 100ms; for the systems I'm testing (Rockchip > RK3399), runtime-PM transitions only take a few microseconds. > > Signed-off-by: Brian Norris > --- > > Changes in v4: > - call pm_runtime_mark_last_busy() and >pm_runtime_dont_use_autosuspend() > - drop excess pm references around drm_get_edid(), now that we grab and >hold in the dp-aux helper > > Changes in v3: > - New in v3 > > drivers/gpu/drm/bridge/analogix/analogix_dp_core.c | 9 ++--- > 1 file changed, 6 insertions(+), 3 deletions(-) This looks great to me now, thanks. Reviewed-by: Douglas Anderson Though I'm not a massive expert on the Analogix DP driver, I'm pretty confident about the DP AUX stuff that Brian is touching. I just checked and I see that this driver isn't changing lots and the last change landed in drm-misc, which means that I can commit this. Thus, unless someone else shouts, I'll plan to wait until next week and commit these two patches to drm-misc. The first of the two patches is a "Fix" but since it's been broken since 2016 I'll assume that nobody is chomping at the bit for these to get into stable and that it would be easier to land both in "drm-misc-next". Please yell if someone disagrees. -Doug
Re: [PATCH v2 3/4] drm/msm: split the main platform driver
Quoting Dmitry Baryshkov (2022-01-19 14:40:04) > diff --git a/drivers/gpu/drm/msm/msm_drv.h b/drivers/gpu/drm/msm/msm_drv.h > index 06d26c5fb274..6895c056be19 100644 > --- a/drivers/gpu/drm/msm/msm_drv.h > +++ b/drivers/gpu/drm/msm/msm_drv.h > @@ -451,10 +451,18 @@ static inline void msm_dp_debugfs_init(struct msm_dp > *dp_display, > > #endif > > +#define KMS_MDP4 4 > +#define KMS_MDP5 5 > +#define KMS_DPU 3 > + > +void __init msm_mdp4_register(void); > +void __exit msm_mdp4_unregister(void); > void __init msm_mdp_register(void); > void __exit msm_mdp_unregister(void); > void __init msm_dpu_register(void); > void __exit msm_dpu_unregister(void); > +void __init msm_mdss_register(void); > +void __exit msm_mdss_unregister(void); Don't need __init or __exit on prototypes. > > #ifdef CONFIG_DEBUG_FS > void msm_framebuffer_describe(struct drm_framebuffer *fb, struct seq_file > *m); > diff --git a/drivers/gpu/drm/msm/msm_mdss.c b/drivers/gpu/drm/msm/msm_mdss.c > index 92562221b517..759076357e0e 100644 > --- a/drivers/gpu/drm/msm/msm_mdss.c > +++ b/drivers/gpu/drm/msm/msm_mdss.c > @@ -8,6 +8,8 @@ > #include > #include > > +#include What's this include for? 
> + > #include "msm_drv.h" > #include "msm_kms.h" > > @@ -127,7 +129,7 @@ static int _msm_mdss_irq_domain_add(struct msm_mdss > *msm_mdss) > return 0; > } > > -int msm_mdss_enable(struct msm_mdss *msm_mdss) > +static int msm_mdss_enable(struct msm_mdss *msm_mdss) > { > int ret; > > @@ -163,14 +165,14 @@ int msm_mdss_enable(struct msm_mdss *msm_mdss) > return ret; > } > > -int msm_mdss_disable(struct msm_mdss *msm_mdss) > +static int msm_mdss_disable(struct msm_mdss *msm_mdss) > { > clk_bulk_disable_unprepare(msm_mdss->num_clocks, msm_mdss->clocks); > > return 0; > } > > -void msm_mdss_destroy(struct msm_mdss *msm_mdss) > +static void msm_mdss_destroy(struct msm_mdss *msm_mdss) > { > struct platform_device *pdev = to_platform_device(msm_mdss->dev); > int irq; > @@ -228,7 +230,7 @@ int mdp5_mdss_parse_clock(struct platform_device *pdev, > struct clk_bulk_data **c > return num_clocks; > } > > -struct msm_mdss *msm_mdss_init(struct platform_device *pdev, bool mdp5) > +static struct msm_mdss *msm_mdss_init(struct platform_device *pdev, bool > mdp5) > { > struct msm_mdss *msm_mdss; > int ret; > @@ -269,3 +271,171 @@ struct msm_mdss *msm_mdss_init(struct platform_device > *pdev, bool mdp5) > > return msm_mdss; > } > + > +static int __maybe_unused mdss_runtime_suspend(struct device *dev) > +{ > + struct msm_drm_private *priv = dev_get_drvdata(dev); > + > + DBG(""); > + > + return msm_mdss_disable(priv->mdss); > +} > + > +static int __maybe_unused mdss_runtime_resume(struct device *dev) > +{ > + struct msm_drm_private *priv = dev_get_drvdata(dev); > + > + DBG(""); > + > + return msm_mdss_enable(priv->mdss); > +} > + > +static int __maybe_unused mdss_pm_suspend(struct device *dev) > +{ > + > + if (pm_runtime_suspended(dev)) > + return 0; > + > + return mdss_runtime_suspend(dev); > +} > + > +static int __maybe_unused mdss_pm_resume(struct device *dev) > +{ > + if (pm_runtime_suspended(dev)) > + return 0; > + > + return mdss_runtime_resume(dev); > +} > + > +static const struct 
dev_pm_ops mdss_pm_ops = { > + SET_SYSTEM_SLEEP_PM_OPS(mdss_pm_suspend, mdss_pm_resume) > + SET_RUNTIME_PM_OPS(mdss_runtime_suspend, mdss_runtime_resume, NULL) > + .prepare = msm_pm_prepare, > + .complete = msm_pm_complete, > +}; > + > +static int get_mdp_ver(struct platform_device *pdev) > +{ > + struct device *dev = &pdev->dev; > + > + return (int) (unsigned long) of_device_get_match_data(dev); > +} > + > +static int find_mdp_node(struct device *dev, void *data) > +{ > + return of_match_node(dpu_dt_match, dev->of_node) || > + of_match_node(mdp5_dt_match, dev->of_node); > +} > + > +static int mdss_probe(struct platform_device *pdev) > +{ > + struct msm_mdss *mdss; > + struct msm_drm_private *priv; > + int mdp_ver = get_mdp_ver(pdev); > + struct device *mdp_dev; > + struct device *dev = &pdev->dev; > + int ret; > + > + if (mdp_ver != KMS_MDP5 && mdp_ver != KMS_DPU) > + return -EINVAL; Is it possible anymore? Now that the driver is split it seems like no. > + > + priv = devm_kzalloc(&pdev->dev, sizeof(*priv), GFP_KERNEL); > + if (!priv) > + return -ENOMEM; > + > + platform_set_drvdata(pdev, priv); > + > + mdss = msm_mdss_init(pdev, mdp_ver == KMS_MDP5); > + if (IS_ERR(mdss)) { > + ret = PTR_ERR(mdss); > + platform_set_drvdata(pdev, NULL); > + > + return ret; > + } > + > + priv->mdss = mdss; > + pm_runtime_enable(&pdev->dev); > + > + /* > +* MDP5/DPU based devices don't
Re: [PATCH v2 2/4] drm/msm: remove extra indirection for msm_mdss
Quoting Dmitry Baryshkov (2022-01-19 14:40:03) > diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c > index be06a62d7ccb..f18dfbb614f0 100644 > --- a/drivers/gpu/drm/msm/msm_drv.c > +++ b/drivers/gpu/drm/msm/msm_drv.c > @@ -1211,19 +1212,32 @@ static int msm_pdev_probe(struct platform_device > *pdev) > > switch (get_mdp_ver(pdev)) { > case KMS_MDP5: > - ret = msm_mdss_init(pdev, true); > + mdss = msm_mdss_init(pdev, true); > + if (IS_ERR(mdss)) { > + ret = PTR_ERR(mdss); > + platform_set_drvdata(pdev, NULL); > + > + return ret; > + } else { Drop else > + priv->mdss = mdss; > + pm_runtime_enable(&pdev->dev); > + } > break; > case KMS_DPU: > - ret = msm_mdss_init(pdev, false); > + mdss = msm_mdss_init(pdev, false); > + if (IS_ERR(mdss)) { > + ret = PTR_ERR(mdss); > + platform_set_drvdata(pdev, NULL); > + > + return ret; > + } else { > + priv->mdss = mdss; > + pm_runtime_enable(&pdev->dev); > + } This is the same so why can't it be done below in the deleted if (ret)? > break; > default: > - ret = 0; > break; > } > - if (ret) { > - platform_set_drvdata(pdev, NULL); > - return ret; > - } > > if (get_mdp_ver(pdev)) { > ret = add_display_components(pdev, &match); > diff --git a/drivers/gpu/drm/msm/msm_kms.h b/drivers/gpu/drm/msm/msm_kms.h > index 2459ba479caf..0c341660941a 100644 > --- a/drivers/gpu/drm/msm/msm_kms.h > +++ b/drivers/gpu/drm/msm/msm_kms.h > @@ -239,50 +228,44 @@ int mdp5_mdss_parse_clock(struct platform_device *pdev, > struct clk_bulk_data **c > return num_clocks; > } > > -int msm_mdss_init(struct platform_device *pdev, bool mdp5) > +struct msm_mdss *msm_mdss_init(struct platform_device *pdev, bool mdp5) Ah I see it will quickly become not static. Still should have static first and remove it here.
Re: [PATCH v2 1/4] drm/msm: unify MDSS drivers
Quoting Dmitry Baryshkov (2022-01-19 14:40:02) > diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_mdss.c > b/drivers/gpu/drm/msm/msm_mdss.c > similarity index 58% > rename from drivers/gpu/drm/msm/disp/dpu1/dpu_mdss.c > rename to drivers/gpu/drm/msm/msm_mdss.c > index 9f5cc7f9e9a9..f5429eb0ae52 100644 > --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_mdss.c > +++ b/drivers/gpu/drm/msm/msm_mdss.c > @@ -188,22 +182,64 @@ static void dpu_mdss_destroy(struct msm_mdss *mdss) > > pm_runtime_suspend(mdss->dev); > pm_runtime_disable(mdss->dev); > - _dpu_mdss_irq_domain_fini(dpu_mdss); > + irq_domain_remove(dpu_mdss->irq_controller.domain); > + dpu_mdss->irq_controller.domain = NULL; > irq = platform_get_irq(pdev, 0); > irq_set_chained_handler_and_data(irq, NULL, NULL); > - > - if (dpu_mdss->mmio) > - devm_iounmap(&pdev->dev, dpu_mdss->mmio); > - dpu_mdss->mmio = NULL; > } > > static const struct msm_mdss_funcs mdss_funcs = { > - .enable = dpu_mdss_enable, > - .disable = dpu_mdss_disable, > - .destroy = dpu_mdss_destroy, > + .enable = msm_mdss_enable, > + .disable = msm_mdss_disable, > + .destroy = msm_mdss_destroy, > }; > > -int dpu_mdss_init(struct platform_device *pdev) > +/* > + * MDP5 MDSS uses at most three specified clocks. > + */ > +#define MDP5_MDSS_NUM_CLOCKS 3 > +int mdp5_mdss_parse_clock(struct platform_device *pdev, struct clk_bulk_data > **clocks) static? > +{ > + struct clk_bulk_data *bulk; > + struct clk *clk; > + int num_clocks = 0; > + > + if (!pdev) > + return -EINVAL; > + > + bulk = devm_kcalloc(&pdev->dev, MDP5_MDSS_NUM_CLOCKS, sizeof(struct > clk_bulk_data), GFP_KERNEL); > + if (!bulk) > + return -ENOMEM; > + > + /* We ignore all the errors except deferral: typically they mean that > the clock is not provided in the dts. 
*/ > + clk = msm_clk_get(pdev, "iface"); > + if (!IS_ERR(clk)) { > + bulk[num_clocks].id = "iface"; > + bulk[num_clocks].clk = clk; > + num_clocks++; > + } else if (clk == ERR_PTR(-EPROBE_DEFER)) > + return -EPROBE_DEFER; > + > + clk = msm_clk_get(pdev, "bus"); > + if (!IS_ERR(clk)) { > + bulk[num_clocks].id = "bus"; > + bulk[num_clocks].clk = clk; > + num_clocks++; > + } else if (clk == ERR_PTR(-EPROBE_DEFER)) > + return -EPROBE_DEFER; > + > + clk = msm_clk_get(pdev, "vsync"); > + if (!IS_ERR(clk)) { > + bulk[num_clocks].id = "vsync"; > + bulk[num_clocks].clk = clk; > + num_clocks++; > + } else if (clk == ERR_PTR(-EPROBE_DEFER)) > + return -EPROBE_DEFER; > + > + return num_clocks; > +} > + > +int msm_mdss_init(struct platform_device *pdev, bool mdp5) Maybe is_mdp5 so the if reads simpler. > { > struct msm_drm_private *priv = platform_get_drvdata(pdev); > struct dpu_mdss *dpu_mdss; > @@ -220,27 +256,28 @@ int dpu_mdss_init(struct platform_device *pdev) > > DRM_DEBUG("mapped mdss address space @%pK\n", dpu_mdss->mmio); > > - ret = msm_parse_clock(pdev, &dpu_mdss->clocks); > + if (mdp5) > + ret = mdp5_mdss_parse_clock(pdev, &dpu_mdss->clocks); > + else > + ret = msm_parse_clock(pdev, &dpu_mdss->clocks); > if (ret < 0) { > - DPU_ERROR("failed to parse clocks, ret=%d\n", ret); > - goto clk_parse_err; > + DRM_ERROR("failed to parse clocks, ret=%d\n", ret); > + return ret; > } > dpu_mdss->num_clocks = ret;
Re: [Intel-gfx] [PATCH 2/2] drm/i915: Add RCS mask to GuC ADS params
On 3/3/2022 14:34, Matt Roper wrote: From: Stuart Summers If RCS is not enumerated, GuC will return invalid parameters. Make sure we do not send RCS supported when we have not enumerated it. Cc: Vinay Belgaumkar Signed-off-by: Stuart Summers Signed-off-by: Matt Roper Reviewed-by: John Harrison --- drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c index 32c2053f2f08..acc4a3766dc1 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c @@ -433,7 +433,7 @@ static void guc_mmio_reg_state_init(struct intel_guc *guc) static void fill_engine_enable_masks(struct intel_gt *gt, struct iosys_map *info_map) { - info_map_write(info_map, engine_enabled_masks[GUC_RENDER_CLASS], 1); + info_map_write(info_map, engine_enabled_masks[GUC_RENDER_CLASS], RCS_MASK(gt)); info_map_write(info_map, engine_enabled_masks[GUC_COMPUTE_CLASS], CCS_MASK(gt)); info_map_write(info_map, engine_enabled_masks[GUC_BLITTER_CLASS], 1); info_map_write(info_map, engine_enabled_masks[GUC_VIDEO_CLASS], VDBOX_MASK(gt));
Re: [PATCH 00/12] Add writeback block support for DPU
Quoting Abhinav Kumar (2022-02-04 13:17:13) > This series adds support for the writeback block on DPU. The writeback > block is extremely useful for validating boards that have no physical displays, > in addition to many other use-cases where we want to capture the output > of the display pipeline to examine whether an issue is with the display > pipeline or with the panel. Is this series going to be resent?
[PATCH] drm/i915/dg2: Add preemption changes for Wa_14015141709
From: Akeem G Abodunrin Starting with DG2, preemption can no longer be controlled using userspace on a per-context basis. Instead, the hardware only allows us to enable or disable preemption in a global, system-wide basis. Also, we lose the ability to specify the preemption granularity (such as batch-level vs command-level vs object-level). As a result of this - for debugging purposes, this patch adds debugfs interface to configure (disable/enable) preemption globally. Jira: VLK-27831 Cc: Matt Roper Cc: Prathap Kumar Valsan Cc: John Harrison Cc: Joonas Lahtinen Signed-off-by: Akeem G Abodunrin Signed-off-by: Matt Roper --- drivers/gpu/drm/i915/gt/intel_gt_regs.h | 3 ++ drivers/gpu/drm/i915/gt/intel_workarounds.c | 2 +- drivers/gpu/drm/i915/i915_debugfs.c | 50 + drivers/gpu/drm/i915/i915_drv.h | 3 ++ 4 files changed, 57 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h b/drivers/gpu/drm/i915/gt/intel_gt_regs.h index 19cd34f24263..21ede1887b9f 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h +++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h @@ -468,6 +468,9 @@ #define VF_PREEMPTION _MMIO(0x83a4) #define PREEMPTION_VERTEX_COUNT REG_GENMASK(15, 0) +#define GEN12_VFG_PREEMPTION_CHICKEN _MMIO(0x83b4) +#define GEN12_VFG_PREEMPT_CHICKEN_DISABLEREG_BIT(8) + #define GEN8_RC6_CTX_INFO _MMIO(0x8504) #define GEN12_SQCM _MMIO(0x8724) diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c index c014b40d2e9f..18dc82f29776 100644 --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c @@ -2310,7 +2310,7 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal) FF_DOP_CLOCK_GATE_DISABLE); } - if (IS_GRAPHICS_VER(i915, 9, 12)) { + if (HAS_PERCTX_PREEMPT_CTRL(i915)) { /* FtrPerCtxtPreemptionGranularityControl:skl,bxt,kbl,cfl,cnl,icl,tgl */ wa_masked_en(wal, GEN7_FF_SLICE_CS_CHICKEN1, diff --git a/drivers/gpu/drm/i915/i915_debugfs.c 
b/drivers/gpu/drm/i915/i915_debugfs.c index 747fe9f41e1f..40e6e17e2950 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -571,6 +571,55 @@ static int i915_wa_registers(struct seq_file *m, void *unused) return 0; } +static void i915_global_preemption_config(struct drm_i915_private *i915, + u32 val) +{ + const u32 bit = GEN12_VFG_PREEMPT_CHICKEN_DISABLE; + + if (val) + intel_uncore_write(&i915->uncore, GEN12_VFG_PREEMPTION_CHICKEN, + _MASKED_BIT_DISABLE(bit)); + else + intel_uncore_write(&i915->uncore, GEN12_VFG_PREEMPTION_CHICKEN, + _MASKED_BIT_ENABLE(bit)); +} + +static int i915_global_preempt_support_get(void *data, u64 *val) +{ + struct drm_i915_private *i915 = data; + intel_wakeref_t wakeref; + u32 curr_status = 0; + + if (HAS_PERCTX_PREEMPT_CTRL(i915) || GRAPHICS_VER(i915) < 11) + return -EINVAL; + + with_intel_runtime_pm(&i915->runtime_pm, wakeref) + curr_status = intel_uncore_read(&i915->uncore, + GEN12_VFG_PREEMPTION_CHICKEN); + *val = (curr_status & GEN12_VFG_PREEMPT_CHICKEN_DISABLE) ? 
0 : 1; + + return 0; +} + +static int i915_global_preempt_support_set(void *data, u64 val) +{ + struct drm_i915_private *i915 = data; + intel_wakeref_t wakeref; + + if (HAS_PERCTX_PREEMPT_CTRL(i915) || GRAPHICS_VER(i915) < 11) + return -EINVAL; + + with_intel_runtime_pm(&i915->runtime_pm, wakeref) + i915_global_preemption_config(i915, val); + + return 0; +} + +DEFINE_SIMPLE_ATTRIBUTE(i915_global_preempt_support_fops, + i915_global_preempt_support_get, + i915_global_preempt_support_set, + "%lld\n"); + static int i915_wedged_get(void *data, u64 *val) { struct drm_i915_private *i915 = data; @@ -765,6 +814,7 @@ static const struct i915_debugfs_files { const struct file_operations *fops; } i915_debugfs_files[] = { {"i915_perf_noa_delay", &i915_perf_noa_delay_fops}, + {"i915_global_preempt_support", &i915_global_preempt_support_fops}, {"i915_wedged", &i915_wedged_fops}, {"i915_gem_drop_caches", &i915_drop_caches_fops}, #if IS_ENABLED(CONFIG_DRM_I915_CAPTURE_ERROR) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 457bc1993d19..8c3f69c87d36 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1407,6 +1407,9 @@ IS_SUBPLATFORM(con
Re: [PATCH v4 1/4] arm64/dts/qcom/sc7280: remove assigned-clock-rate property for mdp clk
Hi, On Thu, Mar 3, 2022 at 1:40 AM Vinod Polimera wrote: > > Kernel clock driver assumes that initial rate is the > max rate for that clock and was not allowing it to scale > beyond the assigned clock value. > > Drop the assigned clock rate property and vote on the mdp clock as per > calculated value during the usecase. I see the "Drop the assigned clock rate property" part, but where is the "and vote on the mdp clock" part? Did it already land or something? I definitely see that commit 5752c921d267 ("drm/msm/dpu: simplify clocks handling") changed a bunch of this but it looks like dpu_core_perf_init() still sets "max_core_clk_rate" to whatever the clock was at bootup. I assume you need to modify that function to call into the OPP layer to find the max frequency? > Changes in v2: > - Remove assigned-clock-rate property and set mdp clk during resume sequence. > - Add fixes tag. > > Changes in v3: > - Remove extra line after fixes tag. (Stephen Boyd) > > Fixes: 62fbdce91 ("arm64: dts: qcom: sc7280: add display dt nodes") Having a "Fixes" is good, but presumably you need a code change along with this, right? Otherwise if someone picks this back to stable then they'll end up breaking, right? We need to tag / note that _somehow_.
[PATCH v3 3/4] drm/i915: Make the heartbeat play nice with long pre-emption timeouts
From: John Harrison

Compute workloads are inherently not pre-emptible for long periods on current hardware. As a workaround for this, the pre-emption timeout for compute capable engines was disabled. This is undesirable with GuC submission as it prevents per engine reset of hung contexts. Hence the next patch will re-enable the timeout but bumped up by an order of magnitude.

However, the heartbeat might not respect that. Depending upon current activity, a pre-emption to the heartbeat pulse might not even be attempted until the last heartbeat period, which means that only one period is granted for the pre-emption to occur. With the aforementioned bump, the pre-emption timeout could be significantly larger than this heartbeat period. So adjust the heartbeat code to take the pre-emption timeout into account. When it reaches the final (high priority) period, it now ensures the delay before hitting reset is bigger than the pre-emption timeout.

v2: Fix for selftests which adjust the heartbeat period manually.

Signed-off-by: John Harrison
---
 .../gpu/drm/i915/gt/intel_engine_heartbeat.c | 18 ++
 1 file changed, 18 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
index a3698f611f45..0dc53def8e42 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
@@ -22,9 +22,27 @@

 static bool next_heartbeat(struct intel_engine_cs *engine)
 {
+	struct i915_request *rq;
 	long delay;

 	delay = READ_ONCE(engine->props.heartbeat_interval_ms);
+
+	rq = engine->heartbeat.systole;
+
+	if (rq && rq->sched.attr.priority >= I915_PRIORITY_BARRIER &&
+	    delay == engine->defaults.heartbeat_interval_ms) {
+		long longer;
+
+		/*
+		 * The final try is at the highest priority possible. Up until now
+		 * a pre-emption might not even have been attempted. So make sure
+		 * this last attempt allows enough time for a pre-emption to occur.
+		 */
+		longer = READ_ONCE(engine->props.preempt_timeout_ms) * 2;
+		if (longer > delay)
+			delay = longer;
+	}
+
 	if (!delay)
 		return false;
--
2.25.1
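The delay adjustment in the hunk above can be modeled as a standalone function. This is an illustrative sketch, not driver code: the function name and parameters are invented, but the rule matches the patch — in the final (highest-priority) heartbeat period, stretch the delay to at least twice the pre-emption timeout.

```c
#include <assert.h>

/*
 * Hypothetical standalone model of the next_heartbeat() change: the final
 * heartbeat period must be long enough for a pre-emption attempt (with
 * safety margin) before a reset is declared.
 */
static long final_heartbeat_delay_ms(long heartbeat_interval_ms,
				     long preempt_timeout_ms)
{
	/* allow 2x the pre-emption timeout in the last period */
	long longer = preempt_timeout_ms * 2;

	if (longer > heartbeat_interval_ms)
		return longer;
	return heartbeat_interval_ms;
}
```

With the 7500 ms compute timeout proposed later in this series and a default heartbeat period of a few seconds, the final period would grow to 15 seconds; with the old 640 ms execlist default, the heartbeat period is unchanged.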
[PATCH v3 4/4] drm/i915: Improve long running OCL w/a for GuC submission
From: John Harrison

A workaround was added to the driver to allow OpenCL workloads to run 'forever' by disabling pre-emption on the RCS engine for Gen12. It is not totally unbound as the heartbeat will kick in eventually and cause a reset of the hung engine.

However, this does not work well in GuC submission mode. In GuC mode, the pre-emption timeout is how GuC detects hung contexts and triggers a per engine reset. Thus, disabling the timeout means also losing all per engine reset ability. A full GT reset will still occur when the heartbeat finally expires, but that is a much more destructive and undesirable mechanism.

The purpose of the workaround is actually to give OpenCL tasks longer to reach a pre-emption point after a pre-emption request has been issued. This is necessary because Gen12 does not support mid-thread pre-emption and OpenCL can have long running threads. So, rather than disabling the timeout completely, just set it to a 'long' value.

v2: Review feedback from Tvrtko - must hard code the 'long' value instead of determining it algorithmically. So make it an extra CONFIG definition. Also, remove the execlist centric comment from the existing pre-emption timeout CONFIG option given that it applies to more than just execlists.

Signed-off-by: John Harrison
Reviewed-by: Daniele Ceraolo Spurio (v1)
Acked-by: Michal Mrozek
---
 drivers/gpu/drm/i915/Kconfig.profile | 26 +++
 drivers/gpu/drm/i915/gt/intel_engine_cs.c | 9 ++--
 2 files changed, 29 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/Kconfig.profile b/drivers/gpu/drm/i915/Kconfig.profile
index 39328567c200..7cc38d25ee5c 100644
--- a/drivers/gpu/drm/i915/Kconfig.profile
+++ b/drivers/gpu/drm/i915/Kconfig.profile
@@ -57,10 +57,28 @@ config DRM_I915_PREEMPT_TIMEOUT
 	default 640 # milliseconds
 	help
 	  How long to wait (in milliseconds) for a preemption event to occur
-	  when submitting a new context via execlists. If the current context
-	  does not hit an arbitration point and yield to HW before the timer
-	  expires, the HW will be reset to allow the more important context
-	  to execute.
+	  when submitting a new context. If the current context does not hit
+	  an arbitration point and yield to HW before the timer expires, the
+	  HW will be reset to allow the more important context to execute.
+
+	  This is adjustable via
+	  /sys/class/drm/card?/engine/*/preempt_timeout_ms
+
+	  May be 0 to disable the timeout.
+
+	  The compiled in default may get overridden at driver probe time on
+	  certain platforms and certain engines which will be reflected in the
+	  sysfs control.
+
+config DRM_I915_PREEMPT_TIMEOUT_COMPUTE
+	int "Preempt timeout for compute engines (ms, jiffy granularity)"
+	default 7500 # milliseconds
+	help
+	  How long to wait (in milliseconds) for a preemption event to occur
+	  when submitting a new context to a compute capable engine. If the
+	  current context does not hit an arbitration point and yield to HW
+	  before the timer expires, the HW will be reset to allow the more
+	  important context to execute.

 	  This is adjustable via
 	  /sys/class/drm/card?/engine/*/preempt_timeout_ms

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 4185c7338581..cc0954ad836a 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -438,9 +438,14 @@ static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id,
 	engine->props.timeslice_duration_ms =
 		CONFIG_DRM_I915_TIMESLICE_DURATION;

-	/* Override to uninterruptible for OpenCL workloads. */
+	/*
+	 * Mid-thread pre-emption is not available in Gen12. Unfortunately,
+	 * some OpenCL workloads run quite long threads. That means they get
+	 * reset due to not pre-empting in a timely manner. So, bump the
+	 * pre-emption timeout value to be much higher for compute engines.
+	 */
 	if (GRAPHICS_VER(i915) == 12 && (engine->flags & I915_ENGINE_HAS_RCS_REG_STATE))
-		engine->props.preempt_timeout_ms = 0;
+		engine->props.preempt_timeout_ms = CONFIG_DRM_I915_PREEMPT_TIMEOUT_COMPUTE;

 	/* Cap properties according to any system limits */
 #define CLAMP_PROP(field) \
--
2.25.1
[PATCH v3 1/4] drm/i915/guc: Limit scheduling properties to avoid overflow
From: John Harrison

GuC converts the pre-emption timeout and timeslice quantum values into clock ticks internally. That significantly reduces the point of 32bit overflow. On current platforms, worst case scenario is approximately 110 seconds. Rather than allowing the user to set higher values and then get confused by early timeouts, add limits when setting these values.

v2: Add helper functions for clamping (review feedback from Tvrtko).

Signed-off-by: John Harrison
Reviewed-by: Daniele Ceraolo Spurio (v1)
---
 drivers/gpu/drm/i915/gt/intel_engine.h | 6 ++
 drivers/gpu/drm/i915/gt/intel_engine_cs.c | 69 +
 drivers/gpu/drm/i915/gt/sysfs_engines.c | 25 +---
 drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h | 9 +++
 4 files changed, 99 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine.h b/drivers/gpu/drm/i915/gt/intel_engine.h
index 1c0ab05c3c40..d7044c4e526e 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine.h
@@ -351,4 +351,10 @@ intel_engine_get_hung_context(struct intel_engine_cs *engine)
 	return engine->hung_ce;
 }

+u64 intel_clamp_heartbeat_interval_ms(struct intel_engine_cs *engine, u64 value);
+u64 intel_clamp_max_busywait_duration_ns(struct intel_engine_cs *engine, u64 value);
+u64 intel_clamp_preempt_timeout_ms(struct intel_engine_cs *engine, u64 value);
+u64 intel_clamp_stop_timeout_ms(struct intel_engine_cs *engine, u64 value);
+u64 intel_clamp_timeslice_duration_ms(struct intel_engine_cs *engine, u64 value);
+
 #endif /* _INTEL_RINGBUFFER_H_ */

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 7447411a5b26..22e70e4e007c 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -442,6 +442,26 @@ static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id,
 		engine->flags |= I915_ENGINE_HAS_EU_PRIORITY;
 	}

+	/* Cap properties according to any system limits */
+#define CLAMP_PROP(field) \
+	do { \
+		u64 clamp = intel_clamp_##field(engine, engine->props.field); \
+		if (clamp != engine->props.field) { \
+			drm_notice(&engine->i915->drm, \
+				   "Warning, clamping %s to %lld to prevent overflow\n", \
+				   #field, clamp); \
+			engine->props.field = clamp; \
+		} \
+	} while (0)
+
+	CLAMP_PROP(heartbeat_interval_ms);
+	CLAMP_PROP(max_busywait_duration_ns);
+	CLAMP_PROP(preempt_timeout_ms);
+	CLAMP_PROP(stop_timeout_ms);
+	CLAMP_PROP(timeslice_duration_ms);
+
+#undef CLAMP_PROP
+
 	engine->defaults = engine->props; /* never to change again */

 	engine->context_size = intel_engine_context_size(gt, engine->class);
@@ -464,6 +484,55 @@ static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id,
 	return 0;
 }

+u64 intel_clamp_heartbeat_interval_ms(struct intel_engine_cs *engine, u64 value)
+{
+	value = min_t(u64, value, jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT));
+
+	return value;
+}
+
+u64 intel_clamp_max_busywait_duration_ns(struct intel_engine_cs *engine, u64 value)
+{
+	value = min(value, jiffies_to_nsecs(2));
+
+	return value;
+}
+
+u64 intel_clamp_preempt_timeout_ms(struct intel_engine_cs *engine, u64 value)
+{
+	/*
+	 * NB: The GuC API only supports 32bit values. However, the limit is further
+	 * reduced due to internal calculations which would otherwise overflow.
+	 */
+	if (intel_guc_submission_is_wanted(&engine->gt->uc.guc))
+		value = min_t(u64, value, GUC_POLICY_MAX_PREEMPT_TIMEOUT_MS);
+
+	value = min_t(u64, value, jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT));
+
+	return value;
+}
+
+u64 intel_clamp_stop_timeout_ms(struct intel_engine_cs *engine, u64 value)
+{
+	value = min_t(u64, value, jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT));
+
+	return value;
+}
+
+u64 intel_clamp_timeslice_duration_ms(struct intel_engine_cs *engine, u64 value)
+{
+	/*
+	 * NB: The GuC API only supports 32bit values. However, the limit is further
+	 * reduced due to internal calculations which would otherwise overflow.
+	 */
+	if (intel_guc_submission_is_wanted(&engine->gt->uc.guc))
+		value = min_t(u64, value, GUC_POLICY_MAX_EXEC_QUANTUM_MS);
+
+	value = min_t(u64, value, jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT));
+
+	return value;
+}
+
 static void __setup_engine_capabilities(struct intel_engine_cs *engine)
 {
 	struct drm_i915_private *i915 = engine->i915;

diff --git a/drivers/gpu/drm/i915/gt/sysfs_engines.c b/drivers/gpu/drm/i915/gt/sysfs_engines.c
index 967031056202..f2d9858d827c 100644
--- a/drivers/gpu/drm/i915/gt/sysfs_engines.c
+++ b/drivers/gpu/drm/i915/gt/sysfs_engines.c
@@ -144
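The "approximately 110 seconds" figure in the commit message can be illustrated with back-of-the-envelope arithmetic. This sketch is not from the patch: the 19.2 MHz reference clock and the factor-of-2 internal headroom are assumptions chosen only to show how a 32-bit tick counter bounds the settable millisecond value.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Illustrative model of why the clamp exists: GuC stores timeouts as clock
 * ticks in a 32-bit field, so the largest safe millisecond value depends on
 * the reference clock and on any internal calculation headroom.
 * ref_clk_hz and headroom_div are assumed values, not driver constants.
 */
static uint64_t max_timeout_ms(uint64_t ref_clk_hz, uint64_t headroom_div)
{
	uint64_t ticks_per_ms = ref_clk_hz / 1000;

	return UINT32_MAX / ticks_per_ms / headroom_div;
}

/* Clamp a user-supplied value to the computed limit, as the helpers do. */
static uint64_t clamp_timeout_ms(uint64_t value, uint64_t limit)
{
	return value < limit ? value : limit;
}
```

With a 19.2 MHz clock and 2x headroom this yields about 112 seconds, which is in the same ballpark as the ~110 s worst case quoted above; a user asking for, say, 500 seconds would be silently early-firing without the clamp.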
[PATCH v3 0/4] Improve anti-pre-emption w/a for compute workloads
From: John Harrison

Compute workloads are inherently not pre-emptible on current hardware. Thus the pre-emption timeout was disabled as a workaround to prevent unwanted resets. Instead, the hang detection was left to the heartbeat and its (longer) timeout. This is undesirable with GuC submission as the heartbeat is a full GT reset rather than a per engine reset and so is much more destructive. Instead, just bump the pre-emption timeout to a big value. Also, update the heartbeat to allow such a long pre-emption delay in the final heartbeat period.

v2: Add clamping helpers.

v3: Remove long timeout algorithm and replace with hard coded value (review feedback from Tvrtko). Also, fix execlist selftest failure and fix bug in compute enabling patch related to pre-emption timeouts.

Signed-off-by: John Harrison

John Harrison (4):
  drm/i915/guc: Limit scheduling properties to avoid overflow
  drm/i915: Fix compute pre-emption w/a to apply to compute engines
  drm/i915: Make the heartbeat play nice with long pre-emption timeouts
  drm/i915: Improve long running OCL w/a for GuC submission

 drivers/gpu/drm/i915/Kconfig.profile | 26 +-
 drivers/gpu/drm/i915/gt/intel_engine.h | 6 ++
 drivers/gpu/drm/i915/gt/intel_engine_cs.c | 92 +--
 .../gpu/drm/i915/gt/intel_engine_heartbeat.c | 18
 drivers/gpu/drm/i915/gt/sysfs_engines.c | 25 +++--
 drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h | 9 ++
 6 files changed, 153 insertions(+), 23 deletions(-)

--
2.25.1
[PATCH v3 2/4] drm/i915: Fix compute pre-emption w/a to apply to compute engines
From: John Harrison

An earlier patch added support for compute engines. However, it missed enabling the anti-pre-emption w/a for the new engine class. So move the 'compute capable' flag earlier and use it for the pre-emption w/a test.

Fixes: c674c5b9342e ("drm/i915/xehp: CCS should use RCS setup functions")
Cc: Tvrtko Ursulin
Cc: Daniele Ceraolo Spurio
Cc: Aravind Iddamsetty
Cc: Matt Roper
Cc: Daniel Vetter
Cc: Maarten Lankhorst
Cc: Lucas De Marchi
Cc: John Harrison
Cc: Jason Ekstrand
Cc: "Michał Winiarski"
Cc: Matthew Brost
Cc: Chris Wilson
Cc: Tejas Upadhyay
Cc: Umesh Nerlige Ramappa
Cc: "Thomas Hellström"
Cc: Stuart Summers
Cc: Matthew Auld
Cc: Jani Nikula
Cc: Ramalingam C
Cc: Akeem G Abodunrin
Signed-off-by: John Harrison
---
 drivers/gpu/drm/i915/gt/intel_engine_cs.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 22e70e4e007c..4185c7338581 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -421,6 +421,12 @@ static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id,
 	engine->logical_mask = BIT(logical_instance);
 	__sprint_engine_name(engine);

+	/* features common between engines sharing EUs */
+	if (engine->class == RENDER_CLASS || engine->class == COMPUTE_CLASS) {
+		engine->flags |= I915_ENGINE_HAS_RCS_REG_STATE;
+		engine->flags |= I915_ENGINE_HAS_EU_PRIORITY;
+	}
+
 	engine->props.heartbeat_interval_ms =
 		CONFIG_DRM_I915_HEARTBEAT_INTERVAL;
 	engine->props.max_busywait_duration_ns =
@@ -433,15 +439,9 @@ static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id,
 		CONFIG_DRM_I915_TIMESLICE_DURATION;

 	/* Override to uninterruptible for OpenCL workloads. */
-	if (GRAPHICS_VER(i915) == 12 && engine->class == RENDER_CLASS)
+	if (GRAPHICS_VER(i915) == 12 && (engine->flags & I915_ENGINE_HAS_RCS_REG_STATE))
 		engine->props.preempt_timeout_ms = 0;

-	/* features common between engines sharing EUs */
-	if (engine->class == RENDER_CLASS || engine->class == COMPUTE_CLASS) {
-		engine->flags |= I915_ENGINE_HAS_RCS_REG_STATE;
-		engine->flags |= I915_ENGINE_HAS_EU_PRIORITY;
-	}
-
 	/* Cap properties according to any system limits */
 #define CLAMP_PROP(field) \
 	do { \
--
2.25.1
Re: [PATCH 3/4] drm/msm: Add SYSPROF param
Quoting Rob Clark (2022-03-03 13:47:14) > On Thu, Mar 3, 2022 at 1:17 PM Rob Clark wrote: > > > > On Thu, Mar 3, 2022 at 12:47 PM Stephen Boyd wrote: > > > > > > Quoting Rob Clark (2022-03-03 11:46:47) > > > > + > > > > + /* then apply new value: */ > > > > > > It would be safer to swap this. Otherwise a set when the values are at > > > "1" would drop to "zero" here and potentially trigger some glitch, > > > whereas incrementing one more time and then dropping the previous state > > > would avoid that short blip. > > > > > > > + switch (sysprof) { > > > > + default: > > > > + return -EINVAL; > > > > > > This will become more complicated though. > > > > Right, that is why I took the "unwind first and then re-apply" > > approach.. in practice I expect userspace to set the value before it > > starts sampling counter values, so I wasn't too concerned about this > > racing with a submit and clearing the counters. (Plus any glitch if > > userspace did decide to change it dynamically would just be transient > > and not really a big deal.) > > Actually I could just swap the two switch's.. the result would be that > an EINVAL would not change the state instead of dropping the state to > zero. Maybe that is better anyways > Yeah it isn't clear to me what should happen if the new state is invalid. Outright rejection is probably better than replacing the previous state with an invalid state.
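The ordering question debated above — apply the new value before unwinding the old one, so a refcount held at 1 never transiently drops to 0 — is a general pattern. This is a minimal standalone sketch of that idea; the struct, state encoding, and function names are invented for illustration and are not the msm driver's.

```c
#include <assert.h>

/* Invented stand-in for the driver state touched by the SYSPROF param. */
struct profiling_state {
	int sysprof_active;	/* refcount keeping profiling enabled */
};

/*
 * Transition from `old` to `new`: validate and apply the new value first,
 * then release the references held for the old value. An invalid `new`
 * returns early and leaves the previous state fully intact.
 */
static int set_sysprof(struct profiling_state *ps, int old, int new)
{
	/* then apply new value (before dropping the old one): */
	switch (new) {
	case 0:
		break;
	case 1:
	case 2:
		ps->sysprof_active++;
		break;
	default:
		return -1;	/* reject: previous state unchanged */
	}

	/* finally unwind the previous value: */
	if (old > 0)
		ps->sysprof_active--;

	return 0;
}
```

With this ordering, going from 1 to 2 keeps `sysprof_active` at 1 or above throughout, avoiding the short "disabled" blip the review pointed out, and an invalid request rejects cleanly instead of zeroing the state.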
[PATCH 2/2] drm/i915: Add RCS mask to GuC ADS params
From: Stuart Summers

If RCS is not enumerated, GuC will return invalid parameters. Make sure we do not send RCS supported when we have not enumerated it.

Cc: Vinay Belgaumkar
Signed-off-by: Stuart Summers
Signed-off-by: Matt Roper
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
index 32c2053f2f08..acc4a3766dc1 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
@@ -433,7 +433,7 @@ static void guc_mmio_reg_state_init(struct intel_guc *guc)
 static void fill_engine_enable_masks(struct intel_gt *gt,
				     struct iosys_map *info_map)
 {
-	info_map_write(info_map, engine_enabled_masks[GUC_RENDER_CLASS], 1);
+	info_map_write(info_map, engine_enabled_masks[GUC_RENDER_CLASS], RCS_MASK(gt));
 	info_map_write(info_map, engine_enabled_masks[GUC_COMPUTE_CLASS], CCS_MASK(gt));
 	info_map_write(info_map, engine_enabled_masks[GUC_BLITTER_CLASS], 1);
 	info_map_write(info_map, engine_enabled_masks[GUC_VIDEO_CLASS], VDBOX_MASK(gt));
--
2.34.1
[PATCH 1/2] drm/i915/xehp: Support platforms with CCS engines but no RCS
In the past we've always assumed that an RCS engine is present on every platform. However, now that we have compute engines there may be platforms that have CCS engines but no RCS, or platforms that are designed to have both, but have the RCS engine fused off. Various engine-centric initialization that only needs to be done a single time for the group of RCS+CCS engines can't rely on being setup with the RCS now; instead we add an I915_ENGINE_FIRST_RENDER_COMPUTE flag that will be assigned to a single engine in the group; whichever engine has this flag will be responsible for some of the general setup (RCU_MODE programming, initialization of certain workarounds, etc.).

Signed-off-by: Matt Roper
---
 drivers/gpu/drm/i915/gt/intel_engine_cs.c | 5 +
 drivers/gpu/drm/i915/gt/intel_engine_types.h | 2 ++
 drivers/gpu/drm/i915/gt/intel_execlists_submission.c | 2 +-
 drivers/gpu/drm/i915/gt/intel_workarounds.c | 2 +-
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 2 +-
 drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 2 +-
 drivers/gpu/drm/i915/i915_drv.h | 2 ++
 7 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 7447411a5b26..8080479f27aa 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -436,6 +436,11 @@ static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id,
 	if (GRAPHICS_VER(i915) == 12 && engine->class == RENDER_CLASS)
 		engine->props.preempt_timeout_ms = 0;

+	if ((engine->class == COMPUTE_CLASS && !RCS_MASK(engine->gt) &&
+	     __ffs(CCS_MASK(engine->gt)) == engine->instance) ||
+	    engine->class == RENDER_CLASS)
+		engine->flags |= I915_ENGINE_FIRST_RENDER_COMPUTE;
+
 	/* features common between engines sharing EUs */
 	if (engine->class == RENDER_CLASS || engine->class == COMPUTE_CLASS) {
 		engine->flags |= I915_ENGINE_HAS_RCS_REG_STATE;

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
index 19ff8758e34d..4fbf45a74ec0 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
@@ -97,6 +97,7 @@ struct i915_ctx_workarounds {
 #define I915_MAX_VCS	8
 #define I915_MAX_VECS	4
 #define I915_MAX_CCS	4
+#define I915_MAX_RCS	1

 /*
  * Engine IDs definitions.
@@ -526,6 +527,7 @@ struct intel_engine_cs {
 #define I915_ENGINE_WANT_FORCED_PREEMPTION BIT(8)
 #define I915_ENGINE_HAS_RCS_REG_STATE  BIT(9)
 #define I915_ENGINE_HAS_EU_PRIORITY	BIT(10)
+#define I915_ENGINE_FIRST_RENDER_COMPUTE BIT(11)
 	unsigned int flags;

 /*

diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
index 1c602d4ae297..e1470bb60f34 100644
--- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
@@ -2912,7 +2912,7 @@ static int execlists_resume(struct intel_engine_cs *engine)
 	enable_execlists(engine);

-	if (engine->class == RENDER_CLASS)
+	if (engine->flags & I915_ENGINE_FIRST_RENDER_COMPUTE)
 		xehp_enable_ccs_engines(engine);

 	return 0;

diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
index c014b40d2e9f..beca8735bae5 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
@@ -2633,7 +2633,7 @@ engine_init_workarounds(struct intel_engine_cs *engine, struct i915_wa_list *wal
	 * to a single RCS/CCS engine's workaround list since
	 * they're reset as part of the general render domain reset.
	 */
-	if (engine->class == RENDER_CLASS)
+	if (engine->flags & I915_ENGINE_FIRST_RENDER_COMPUTE)
 		general_render_compute_wa_init(engine, wal);

 	if (engine->class == RENDER_CLASS)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
index 9bb551b83e7a..32c2053f2f08 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
@@ -335,7 +335,7 @@ static int guc_mmio_regset_init(struct temp_regset *regset,
 	ret |= GUC_MMIO_REG_ADD(regset, RING_HWS_PGA(base), false);
 	ret |= GUC_MMIO_REG_ADD(regset, RING_IMR(base), false);

-	if (engine->class == RENDER_CLASS &&
+	if ((engine->flags & I915_ENGINE_FIRST_RENDER_COMPUTE) &&
 	    CCS_MASK(engine->gt))
 		ret |= GUC_MMIO_REG_ADD(regset, GEN12_RCU_MODE, true);

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index 1ce7e04aa837..8a8bb87e77a0 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/
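The selection rule in the intel_engine_cs.c hunk of this patch can be modeled in isolation: the render engine, if any, owns the one-time RCS/CCS group setup; otherwise the lowest enumerated CCS instance does. The enum and helper below are invented for this sketch (the driver uses `__ffs`, which is `ffs() - 1` for a non-zero mask).

```c
#include <stdint.h>
#include <strings.h>	/* ffs() */

/* Invented class encoding for the example; not the driver's values. */
enum eng_class { ENG_RENDER, ENG_COMPUTE, ENG_OTHER };

/*
 * Returns 1 if this engine should get the FIRST_RENDER_COMPUTE role:
 * any render engine, or — when RCS is absent/fused off — the CCS engine
 * whose instance number matches the lowest set bit of the CCS fuse mask.
 */
static int is_first_render_compute(enum eng_class class, unsigned int instance,
				   uint32_t rcs_mask, uint32_t ccs_mask)
{
	if (class == ENG_RENDER)
		return 1;
	if (class == ENG_COMPUTE && !rcs_mask && ccs_mask &&
	    (unsigned int)(ffs(ccs_mask) - 1) == instance)
		return 1;
	return 0;
}
```

So on a platform with RCS present, CCS engines never take the role; on an RCS-less part with, say, CCS2 and CCS3 enabled (mask 0b1100), CCS2 does the RCU_MODE programming and shared workaround setup.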