Re: [RFC PATCH] drm/ttm: Add a private member to the struct ttm_resource
On Fri, 2021-09-10 at 19:03 +0200, Christian König wrote:
> On 10.09.21 at 17:30, Thomas Hellström wrote:
> > On Fri, 2021-09-10 at 16:40 +0200, Christian König wrote:
> > >
> > > On 10.09.21 at 15:15, Thomas Hellström wrote:
> > > > Both the provider (resource manager) and the consumer (the TTM
> > > > driver) want to subclass struct ttm_resource. Since this is left
> > > > for the resource manager, we need to provide a private pointer
> > > > for the TTM driver.
> > > >
> > > > Provide a struct ttm_resource_private for the driver to subclass
> > > > for data with the same lifetime as the struct ttm_resource: In
> > > > the i915 case it will, for example, be an sg-table and radix
> > > > tree into the LMEM/VRAM pages that currently are awkwardly
> > > > attached to the GEM object.
> > > >
> > > > Provide an ops structure for associated ops (Which is only
> > > > destroy() ATM)
> > > > It might seem pointless to provide a separate ops structure, but
> > > > Linus has previously made it clear that that's the norm.
> > > >
> > > > After careful audit one could perhaps also on a per-driver basis
> > > > replace the delete_mem_notify() TTM driver callback with the
> > > > above destroy function.
> > > Well this is a really big NAK to this approach.
> > >
> > > If you need to attach some additional information to the resource
> > > then implement your own resource manager like everybody else does.
> > Well this was the long discussion we had back then when the resource
> > managers started to derive from struct resource and I was under the
> > impression that we had come to an agreement about the different
> > use-cases here, and this was my main concern.
>
> Ok, then we somehow didn't understand each other.
>
> > I mean, it's a pretty big layer violation to do that for this
> > use-case.
>
> Well exactly that's the point. TTM should not have a layer design in
> the first place.
>
> Devices, BOs, resources etc.. are base classes which should implement
> a base functionality which is then extended by the drivers to implement
> the driver specific functionality.
>
> That is a component based approach, and not layered at all.
>
> > The TTM resource manager doesn't want to know about this data at
> > all, it's private to the ttm resource user layer and the resource
> > manager works perfectly well without it. (I assume the other drivers
> > that implement their own resource managers need the data that the
> > subclassing provides?)
>
> Yes, that's exactly why we have the subclassing.
>
> > The fundamental problem here is that there are two layers wanting to
> > subclass struct ttm_resource. That means one layer gets to do that,
> > the second gets to use a private pointer, (which in turn can provide
> > yet another private pointer to a potential third layer). With your
> > suggestion, the second layer instead is forced to subclass each
> > subclassed instance that the first layer provides?
>
> Well completely drop the layer approach/thinking here.
>
> The resource is an object with a base class. The base class implements
> the interface TTM needs to handle the object, e.g.
> create/destroy/debug etc...
>
> Then we need to subclass this object because without any additional
> information the object is pretty pointless.
>
> One possibility for this is to use the range manager to implement
> something drm_mm based. BTW: We should probably rename that to
> something like ttm_res_drm_mm or similar.

Sure I'm all in on that, but my point is this becomes pretty awkward
because the reusable code already subclasses struct ttm_resource. Let
me give you an example:

Prereqs:
1) We want to be able to re-use resource manager implementations among
drivers.
2) A driver might want to re-use multiple implementations and have
identical data "struct i915_data" attached to both

With your suggestion that combination of prereqs would look like:

struct i915_resource {
	/* Reason why we subclass */
	struct i915_data my_data;

	/*
	 * Uh this is awkward. We need to do this because these
	 * already subclassed struct ttm_resource.
	 */
	struct ttm_resource *resource;
	union {
		struct ttm_range_mgr_node range;
		struct i915_ttm_buddy_resource buddy;
	};
};

And I can't make it look like

struct i915_resource {
	struct i915_data my_data;
	struct ttm_resource *resource;
}

Without that private back pointer.

But what I'd *really* want is:

struct i915_resource {
	struct i915_data my_data;
	struct ttm_resource resource;
};

This would be identical to how we subclass a struct ttm_buffer_object
or a struct ttm_tt. But it can't look like this because then we can't
reuse existing implementations that *already subclass* struct ttm_resou
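The "embed the base object" style of subclassing that the message above asks for can be sketched in plain userspace C. This is only an illustration under stated assumptions: struct base_resource, struct drv_data, struct drv_resource and every function below are hypothetical stand-ins, not the real struct ttm_resource or any TTM API, and container_of() is re-implemented locally rather than taken from the kernel headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace stand-in for the kernel's container_of() helper. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical base class; NOT the real struct ttm_resource. */
struct base_resource {
	unsigned long num_pages;
	void (*destroy)(struct base_resource *res);
};

/* Hypothetical driver data, standing in for "struct i915_data". */
struct drv_data {
	int cookie;
};

/* Driver subclass: the base object is embedded, not pointed to. */
struct drv_resource {
	struct drv_data my_data;
	struct base_resource resource;
};

/* Records which object was destroyed last, for demonstration only. */
static int last_destroyed_cookie = -1;

/* Upcast from the embedded base object to the driver subclass. */
static struct drv_resource *to_drv_resource(struct base_resource *res)
{
	return container_of(res, struct drv_resource, resource);
}

/* The base-class destroy hook frees the whole subclass in one go. */
static void drv_resource_destroy(struct base_resource *res)
{
	struct drv_resource *dres = to_drv_resource(res);

	last_destroyed_cookie = dres->my_data.cookie;
	free(dres);
}

/* Allocate the subclass and hand generic code only the base pointer. */
static struct base_resource *drv_resource_create(int cookie,
						 unsigned long num_pages)
{
	struct drv_resource *dres = calloc(1, sizeof(*dres));

	if (!dres)
		return NULL;

	dres->my_data.cookie = cookie;
	dres->resource.num_pages = num_pages;
	dres->resource.destroy = drv_resource_destroy;
	return &dres->resource;
}
```

Because the driver data and the base object share one allocation, they also share one lifetime, which is the property the thread is arguing about. The sketch does not address the harder case raised above, where a reusable manager has already wrapped the base object in its own subclass.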
[PATCH v2] drm/rockchip: cdn-dp-core: Fix cdn_dp_resume unused warning
From: Palmer Dabbelt

cdn_dp_resume is only used under PM_SLEEP, and now that it's static an
unused function warning is triggered under !PM_SLEEP.  This marks the
function as possibly unused, to avoid triggering compiler warnings.

Fixes: 7c49abb4c2f8 ("drm/rockchip: cdn-dp-core: Make cdn_dp_core_suspend/resume static")
Reviewed-by: Geert Uytterhoeven
Signed-off-by: Palmer Dabbelt
---
This is breaking my builds and looks like it'll land after -rc1, so I've
put it on a shared tag for-rockchip-cdn_dp_resume-v2 which will let me
pull it in to my fixes.  LMK if you guys want me to send this up on my
own, but I'm assuming that the drm/rockchip folks will handle it.
---
 drivers/gpu/drm/rockchip/cdn-dp-core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/rockchip/cdn-dp-core.c b/drivers/gpu/drm/rockchip/cdn-dp-core.c
index 8ab3247dbc4a..13c6b857158f 100644
--- a/drivers/gpu/drm/rockchip/cdn-dp-core.c
+++ b/drivers/gpu/drm/rockchip/cdn-dp-core.c
@@ -1123,7 +1123,7 @@ static int cdn_dp_suspend(struct device *dev)
 	return ret;
 }
 
-static int cdn_dp_resume(struct device *dev)
+static __maybe_unused int cdn_dp_resume(struct device *dev)
 {
 	struct cdn_dp_device *dp = dev_get_drvdata(dev);
 
-- 
2.33.0.309.g3052b89438-goog
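How a static function ends up unused in one configuration only can be shown with a small userspace sketch. Everything here is an assumption made for illustration: struct pm_ops, demo_suspend(), demo_resume() and demo_get_pm_ops() are hypothetical names, and __maybe_unused is defined locally as the GCC/Clang unused attribute instead of coming from the kernel headers.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's __maybe_unused annotation. */
#ifndef __maybe_unused
#define __maybe_unused __attribute__((__unused__))
#endif

/* Hypothetical ops table, loosely modelled on a PM callback table. */
struct pm_ops {
	int (*suspend)(int *state);
	int (*resume)(int *state);
};

static int demo_suspend(int *state)
{
	*state = 0;
	return 0;
}

/*
 * Only referenced when CONFIG_PM_SLEEP is defined. Without the
 * annotation, a build with CONFIG_PM_SLEEP undefined would emit
 * -Wunused-function for this static function.
 */
static __maybe_unused int demo_resume(int *state)
{
	*state = 1;
	return 0;
}

static const struct pm_ops demo_pm_ops = {
	.suspend = demo_suspend,
#ifdef CONFIG_PM_SLEEP
	.resume = demo_resume,
#endif
};

/* Accessor so the table itself is always referenced. */
const struct pm_ops *demo_get_pm_ops(void)
{
	return &demo_pm_ops;
}
```

Built without -DCONFIG_PM_SLEEP, the resume slot stays NULL and the annotation keeps the compiler quiet; built with it, the function is wired in and the annotation is simply a no-op.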
Intel UHD resolutions
Hi,

I would like to use QHD resolution (2560x1440) with my shiny new
computer and display. That resolution works if I boot Windows 10 (cough).

What do I need to do to use that resolution in Linux?

I first tried openSUSE 15.3 (kernel 5.3.18-59.19-default), then I built
a v5.14 kernel and tried that. Both of them max out at FHD (1920x1080).

I am booting with "i915.force_probe=4c8a" on the kernel command line.
My desktop is XFCE4.

CPU is:
processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 167
model name	: 11th Gen Intel(R) Core(TM) i9-11900 @ 2.50GHz
stepping	: 1
microcode	: 0x40
cpu MHz		: 1021.742
cache size	: 16384 KB
physical id	: 0
siblings	: 16

with an H470 chipset. (ASRock DeskMini H470)

00:02.0 VGA compatible controller: Intel Corporation RocketLake-S GT1 [UHD Graphics 750] (rev 04)

or verbose:

00:02.0 VGA compatible controller: Intel Corporation RocketLake-S GT1 [UHD Graphics 750] (rev 04) (prog-if 00 [VGA controller])
	Subsystem: ASRock Incorporation Device 4c8a
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR-
	Capabilities: [70] Express (v2) Root Complex Integrated Endpoint, MSI 00
		DevCap:	MaxPayload 128 bytes, PhantFunc 0
			ExtTag- RBE+
		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
			RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
			MaxPayload 128 bytes, MaxReadReq 128 bytes
		DevSta:	CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
		DevCap2: Completion Timeout: Not Supported, TimeoutDis-, LTR-, OBFF Not Supported
			 AtomicOpsCap: 32bit- 64bit- 128bitCAS-
		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
			 AtomicOpsCtl: ReqEn-
	Capabilities: [ac] MSI: Enable+ Count=1/1 Maskable+ 64bit-
		Address: fee00018  Data:
		Masking:  Pending:
	Capabilities: [d0] Power Management version 2
		Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [100 v1] Process Address Space ID (PASID)
		PASIDCap: Exec- Priv-, Max PASID Width: 14
		PASIDCtl: Enable- Exec- Priv-
	Capabilities: [200 v1] Address Translation Service (ATS)
		ATSCap:	Invalidate Queue Depth: 00
		ATSCtl:	Enable-, Smallest Translation Unit: 00
	Capabilities: [300 v1] Page Request Interface (PRI)
		PRICtl: Enable- Reset-
		PRISta: RF- UPRGI- Stopped+
		Page Request Capacity: 8000, Page Request Allocation:
	Kernel driver in use: i915
	Kernel modules: i915

thanks.
-- 
~Randy
[PATCH] drm/i915: fix odd_ptr_err.cocci warnings
From: kernel test robot

drivers/gpu/drm/i915/display/intel_dpt.c:145:6-12: inconsistent IS_ERR and PTR_ERR on line 146.

 PTR_ERR should access the value just tested by IS_ERR

Semantic patch information:
 There can be false positives in the patch case, where it is the call to
 IS_ERR that is wrong.

Generated by: scripts/coccinelle/tests/odd_ptr_err.cocci

CC: Maarten Lankhorst
Reported-by: kernel test robot
Signed-off-by: kernel test robot
---

url:    https://github.com/0day-ci/linux/commits/Maarten-Lankhorst/drm-i915-Add-ww-context-to-intel_dpt_pin/20210910-162231
base:   git://anongit.freedesktop.org/drm/drm-tip drm-tip
:: branch date: 17 hours ago
:: commit date: 17 hours ago

 intel_dpt.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/gpu/drm/i915/display/intel_dpt.c
+++ b/drivers/gpu/drm/i915/display/intel_dpt.c
@@ -143,7 +143,7 @@ struct i915_vma *intel_dpt_pin(struct i9
 		i915_vma_unpin(vma);
 
 		if (IS_ERR(iomem)) {
-			err = PTR_ERR(vma);
+			err = PTR_ERR(iomem);
 			continue;
 		}
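Why the mismatch flagged above matters can be illustrated with a userspace sketch. The ERR_PTR(), PTR_ERR() and IS_ERR() helpers below are simplified local re-implementations written for this example, not the kernel's own definitions, and map_step_buggy() / map_step_fixed() are hypothetical functions that only mimic the shape of the patched code.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Largest errno value encoded in a pointer (assumption for this sketch). */
#define MAX_ERRNO 4095

/* Encode a negative errno value as a pointer. */
static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

/* Decode a pointer back into the value it carries. */
static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* True if the pointer lies in the top MAX_ERRNO addresses. */
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Mirrors the flagged pattern: tests one pointer, decodes another. */
static long map_step_buggy(void *vma, void *iomem)
{
	if (IS_ERR(iomem))
		return PTR_ERR(vma);	/* vma is valid, so this is garbage */
	return 0;
}

/* Mirrors the fix: decode the same pointer that was just tested. */
static long map_step_fixed(void *vma, void *iomem)
{
	(void)vma;
	if (IS_ERR(iomem))
		return PTR_ERR(iomem);
	return 0;
}
```

In the buggy variant the caller receives the numeric value of a valid pointer instead of an errno, so any later check for a specific error code silently fails.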
Re: [Intel-gfx] [PATCH 3/4] drm/i915: rename debugfs_gt_pm files
On Fri, 2021-09-10 at 14:52 -0700, Lucas De Marchi wrote: > On Fri, Sep 10, 2021 at 09:14:37PM +, Yokoyama, Caz wrote: > > On Fri, 2021-09-10 at 10:52 -0700, Lucas De Marchi wrote: > > > On Wed, Sep 08, 2021 at 05:49:40PM -0700, Lucas De Marchi wrote: > > > > We shouldn't be using debugfs_ namespace for this > > > > functionality. > > > > Rename > > > > debugfs_gt_pm.[ch] to intel_gt_pm_debugfs.[ch] and then make > > > > functions, defines and structs follow suit. > > > > > > > > Signed-off-by: Lucas De Marchi > > > > --- > > > > drivers/gpu/drm/i915/Makefile | 2 +- > > > > drivers/gpu/drm/i915/gt/debugfs_gt_pm.h| 14 - > > > > > > > > - > > > > drivers/gpu/drm/i915/gt/intel_gt_debugfs.c | 4 ++-- > > > > .../gt/{debugfs_gt_pm.c => intel_gt_pm_debugfs.c} | 4 ++-- > > > > drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.h | 14 > > > > ++ > > > > 5 files changed, 19 insertions(+), 19 deletions(-) > > > > delete mode 100644 drivers/gpu/drm/i915/gt/debugfs_gt_pm.h > > > > rename drivers/gpu/drm/i915/gt/{debugfs_gt_pm.c => > > > > intel_gt_pm_debugfs.c} (99%) > > > > create mode 100644 > > > > drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.h > > > > > > > > diff --git a/drivers/gpu/drm/i915/Makefile > > > > b/drivers/gpu/drm/i915/Makefile > > > > index 232c9673a2e5..dd656f2d7721 100644 > > > > --- a/drivers/gpu/drm/i915/Makefile > > > > +++ b/drivers/gpu/drm/i915/Makefile > > > > @@ -79,7 +79,6 @@ i915-$(CONFIG_PERF_EVENTS) += i915_pmu.o > > > > > > > > # "Graphics Technology" (aka we talk to the gpu) > > > > gt-y += \ > > > > - gt/debugfs_gt_pm.o \ > > > > gt/gen2_engine_cs.o \ > > > > gt/gen6_engine_cs.o \ > > > > gt/gen6_ppgtt.o \ > > > > @@ -103,6 +102,7 @@ gt-y += \ > > > > gt/intel_gt_engines_debugfs.o \ > > > > gt/intel_gt_irq.o \ > > > > gt/intel_gt_pm.o \ > > > > + gt/intel_gt_pm_debugfs.o \ > > > > gt/intel_gt_pm_irq.o \ > > > > gt/intel_gt_requests.o \ > > > > gt/intel_gtt.o \ > > > > diff --git a/drivers/gpu/drm/i915/gt/debugfs_gt_pm.h > > > > 
b/drivers/gpu/drm/i915/gt/debugfs_gt_pm.h > > > > deleted file mode 100644 > > > > index 4cf5f5c9da7d.. > > > > --- a/drivers/gpu/drm/i915/gt/debugfs_gt_pm.h > > > > +++ /dev/null > > > > @@ -1,14 +0,0 @@ > > > > -/* SPDX-License-Identifier: MIT */ > > > > -/* > > > > - * Copyright © 2019 Intel Corporation > > > > - */ > > > > - > > > > -#ifndef DEBUGFS_GT_PM_H > > > > -#define DEBUGFS_GT_PM_H > > > > - > > > > -struct intel_gt; > > > > -struct dentry; > > > > - > > > > -void debugfs_gt_pm_register(struct intel_gt *gt, struct dentry > > > > *root); > > > > - > > > > -#endif /* DEBUGFS_GT_PM_H */ > > > > diff --git a/drivers/gpu/drm/i915/gt/intel_gt_debugfs.c > > > > b/drivers/gpu/drm/i915/gt/intel_gt_debugfs.c > > > > index e5d173c235a3..4096ee893b69 100644 > > > > --- a/drivers/gpu/drm/i915/gt/intel_gt_debugfs.c > > > > +++ b/drivers/gpu/drm/i915/gt/intel_gt_debugfs.c > > > > @@ -5,10 +5,10 @@ > > > > > > > > #include > > > > > > > > -#include "debugfs_gt_pm.h" > > > > #include "i915_drv.h" > > > > #include "intel_gt_debugfs.h" > > > > #include "intel_gt_engines_debugfs.h" > > > > +#include "intel_gt_pm_debugfs.h" > > Why locate here? Why not just replace debugfs_gt_pm.h? Compile > > error? > > are you asking why I moved the include? Because sorting them > alphabetically avoid big messes in these includes As the patch, it is easy to see if - and + lines are side by side. Anyway, I honor and respect your decision. -caz > > Lucas De Marchi > > > -caz > > > > > > #include "intel_sseu_debugfs.h" > > > > #include "uc/intel_uc_debugfs.h" > > > > > > > > @@ -24,7 +24,7 @@ void intel_gt_register_debugfs(struct > > > > intel_gt > > > > *gt) > > > > return; > > > > > > > > intel_gt_engines_register_debugfs(gt, root); > > > > - debugfs_gt_pm_register(gt, root); > > > > + intel_gt_pm_register_debugfs(gt, root); > > > > > > This is one case I usually don't know what convention to follow > > > since > > > it > > > changes in different places. 
> > > > > > I did it like _register_debugfs because of calls like > > > intel_gt_init_scratch(), xxx_init_hw, etc. However here I see > > > that > > > just > > > below we have intel_sseu_debugfs_register(), so maybe I should > > > consider > > > debugfs as part of the namespace? > > > > > > Lucas De Marchi
Re: [Intel-gfx] [PATCH v9 15/17] drm/i915/pxp: add pxp debugfs
Reviewed-by: Alan Previn

..alan

On Fri, 2021-09-10 at 08:36 -0700, Daniele Ceraolo Spurio wrote:
> 2 debugfs files, one to query the current status of the pxp session and one
> to trigger an invalidation for testing.
>
> v2: rename debugfs, fix date (Alan)
>
> Signed-off-by: Daniele Ceraolo Spurio
> Reviewed-by : Alan Previn
> ---
>  drivers/gpu/drm/i915/Makefile                |  1 +
>  drivers/gpu/drm/i915/gt/debugfs_gt.c         |  2 +
>  drivers/gpu/drm/i915/pxp/intel_pxp_debugfs.c | 78
>  drivers/gpu/drm/i915/pxp/intel_pxp_debugfs.h | 21 ++
>  4 files changed, 102 insertions(+)
>  create mode 100644 drivers/gpu/drm/i915/pxp/intel_pxp_debugfs.c
>  create mode 100644 drivers/gpu/drm/i915/pxp/intel_pxp_debugfs.h
>
> diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
> index 366e82cec44d..b46474ee1a1f 100644
> --- a/drivers/gpu/drm/i915/Makefile
> +++ b/drivers/gpu/drm/i915/Makefile
> @@ -285,6 +285,7 @@ i915-y += i915_perf.o
>  i915-$(CONFIG_DRM_I915_PXP) += \
>  	pxp/intel_pxp.o \
>  	pxp/intel_pxp_cmd.o \
> +	pxp/intel_pxp_debugfs.o \
>  	pxp/intel_pxp_irq.o \
>  	pxp/intel_pxp_pm.o \
>  	pxp/intel_pxp_session.o \
> diff --git a/drivers/gpu/drm/i915/gt/debugfs_gt.c
> b/drivers/gpu/drm/i915/gt/debugfs_gt.c
> index 591eb60785db..c27847ddb796 100644
> --- a/drivers/gpu/drm/i915/gt/debugfs_gt.c
> +++ b/drivers/gpu/drm/i915/gt/debugfs_gt.c
> @@ -9,6 +9,7 @@
>  #include "debugfs_gt.h"
>  #include "debugfs_gt_pm.h"
>  #include "intel_sseu_debugfs.h"
> +#include "pxp/intel_pxp_debugfs.h"
>  #include "uc/intel_uc_debugfs.h"
>  #include "i915_drv.h"
>
> @@ -28,6 +29,7 @@ void debugfs_gt_register(struct intel_gt *gt)
>  	intel_sseu_debugfs_register(gt, root);
>
>  	intel_uc_debugfs_register(&gt->uc, root);
> +	intel_pxp_debugfs_register(&gt->pxp, root);
>  }
>
>  void intel_gt_debugfs_register_files(struct dentry *root,
> diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp_debugfs.c
> b/drivers/gpu/drm/i915/pxp/intel_pxp_debugfs.c
> new file mode 100644
> index ..cbb1853676cc
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/pxp/intel_pxp_debugfs.c
> @@ -0,0 +1,78 @@
> +// SPDX-License-Identifier: MIT
> +/*
> + * Copyright © 2021 Intel Corporation
> + */
> +
> +#include
> +#include
> +
> +#include "gt/debugfs_gt.h"
> +#include "pxp/intel_pxp.h"
> +#include "pxp/intel_pxp_irq.h"
> +#include "i915_drv.h"
> +
> +static int pxp_info_show(struct seq_file *m, void *data)
> +{
> +	struct intel_pxp *pxp = m->private;
> +	struct drm_printer p = drm_seq_file_printer(m);
> +	bool enabled = intel_pxp_is_enabled(pxp);
> +
> +	if (!enabled) {
> +		drm_printf(&p, "pxp disabled\n");
> +		return 0;
> +	}
> +
> +	drm_printf(&p, "active: %s\n", yesno(intel_pxp_is_active(pxp)));
> +	drm_printf(&p, "instance counter: %u\n", pxp->key_instance);
> +
> +	return 0;
> +}
> +DEFINE_GT_DEBUGFS_ATTRIBUTE(pxp_info);
> +
> +static int pxp_terminate_get(void *data, u64 *val)
> +{
> +	/* nothing to read */
> +	return -EPERM;
> +}
> +
> +static int pxp_terminate_set(void *data, u64 val)
> +{
> +	struct intel_pxp *pxp = data;
> +	struct intel_gt *gt = pxp_to_gt(pxp);
> +
> +	if (!intel_pxp_is_active(pxp))
> +		return -ENODEV;
> +
> +	/* simulate a termination interrupt */
> +	spin_lock_irq(&gt->irq_lock);
> +	intel_pxp_irq_handler(pxp,
> GEN12_DISPLAY_PXP_STATE_TERMINATED_INTERRUPT);
> +	spin_unlock_irq(&gt->irq_lock);
> +
> +	if (!wait_for_completion_timeout(&pxp->termination,
> +					 msecs_to_jiffies(100)))
> +		return -ETIMEDOUT;
> +
> +	return 0;
> +}
> +
> +DEFINE_SIMPLE_ATTRIBUTE(pxp_terminate_fops, pxp_terminate_get,
> pxp_terminate_set, "%llx\n");
> +void intel_pxp_debugfs_register(struct intel_pxp *pxp, struct dentry
> *gt_root)
> +{
> +	static const struct debugfs_gt_file files[] = {
> +		{ "info", &pxp_info_fops, NULL },
> +		{ "terminate_state", &pxp_terminate_fops, NULL },
> +	};
> +	struct dentry *root;
> +
> +	if (!gt_root)
> +		return;
> +
> +	if (!HAS_PXP((pxp_to_gt(pxp)->i915)))
> +		return;
> +
> +	root = debugfs_create_dir("pxp", gt_root);
> +	if (IS_ERR(root))
> +		return;
> +
> +	intel_gt_debugfs_register_files(root, files, ARRAY_SIZE(files), pxp);
> +}
> diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp_debugfs.h
> b/drivers/gpu/drm/i915/pxp/intel_pxp_debugfs.h
> new file mode 100644
> index ..7e0c3d2f5d7e
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/pxp/intel_pxp_debugfs.h
> @@ -0,0 +1,21 @@
> +/* SPDX-License-Identifier: MIT */
> +/*
> + * Copyright © 2021 Intel Corporation
> + */
> +
> +#ifndef __INTEL_PXP_DEBUGFS_H__
> +#define __INTEL_PXP_DEBUGFS_H__
> +
> +struct intel_pxp;
> +struct dentry;
> +
> +#ifdef CONFIG_DRM_I915_PXP
> +void intel_pxp_debugfs_register(struct intel_
Re: [PATCH] drm/rockchip: Update crtc fixup to account for fractional clk change
On Wed, Sep 08, 2021 at 09:05:52PM +0300, Andy Shevchenko wrote: > On Wed, Sep 08, 2021 at 08:53:56AM -0500, Chris Morgan wrote: > > From: Chris Morgan > > > > After commit 928f9e268611 ("clk: fractional-divider: Hide > > clk_fractional_divider_ops from wide audience") was merged it appears > > that the DSI panel on my Odroid Go Advance stopped working. Upon closer > > examination of the problem, it looks like it was the fixup in the > > rockchip_drm_vop.c file was causing the issue. The changes made to the > > clk driver appear to change some assumptions made in the fixup. > > > > After debugging the working 5.14 kernel and the no-longer working > > 5.15 kernel, it looks like this was broken all along but still > > worked, whereas after the fractional clock change it stopped > > working despite the issue (it went from sort-of broken to very broken). > > > > In the 5.14 kernel the dclk_vopb_frac was being requested to be set to > > 17000999 on my board. The clock driver was taking the value of the > > parent clock and attempting to divide the requested value from it > > (1700/17000999 = 0), then subtracting 1 from it (making it -1), > > and running it through fls_long to get 64. It would then subtract > > the value of fd->mwidth from it to get 48, and then bit shift > > 17000999 to the left by 48, coming up with a very large number of > > 7649082492112076800. This resulted in a numerator of 65535 and a > > denominator of 1 from the clk driver. The driver seemingly would > > try again and get a correct 1:1 value later, and then move on. 
> > > > Output from my 5.14 kernel (with some printfs for good measure): > > [2.830066] rockchip-drm display-subsystem: bound ff46.vop (ops > > vop_component_ops) > > [2.839431] rockchip-drm display-subsystem: bound ff45.dsi (ops > > dw_mipi_dsi_rockchip_ops) > > [2.855980] Clock is dclk_vopb_frac > > [2.856004] Scale 64, Rate 7649082492112076800, Oldrate 17000999, Parent > > Rate 1700, Best Numerator 65535, Best Denominator 1, fd->mwidth 16 > > [2.903529] Clock is dclk_vopb_frac > > [2.903556] Scale 0, Rate 1700, Oldrate 1700, Parent Rate > > 1700, Best Numerator 1, Best Denominator 1, fd->mwidth 16 > > [2.903579] Clock is dclk_vopb_frac > > [2.903583] Scale 0, Rate 1700, Oldrate 1700, Parent Rate > > 1700, Best Numerator 1, Best Denominator 1, fd->mwidth 16 > > > > Contrast this with 5.15 after the clk change where the rate of 17000999 > > was getting passed and resulted in numerators/denomiators of 17001/ > > 17000. > > > > Output from my 5.15 kernel (with some printfs added for good measure): > > [2.817571] rockchip-drm display-subsystem: bound ff46.vop (ops > > vop_component_ops) > > [2.826975] rockchip-drm display-subsystem: bound ff45.dsi (ops > > dw_mipi_dsi_rockchip_ops) > > [2.843430] Rate 17000999, Parent Rate 1700, Best Numerator 17018, > > Best Denominator 17017 > > [2.891073] Rate 17001000, Parent Rate 1700, Best Numerator 17001, > > Best Denominator 17000 > > [2.891269] Rate 17001000, Parent Rate 1700, Best Numerator 17001, > > Best Denominator 17000 > > [2.891281] Rate 17001000, Parent Rate 1700, Best Numerator 17001, > > Best Denominator 17000 > > > > After tracing through the code it appeared that this function here was > > adding a 999 to the requested frequency because of how the clk driver > > was rounding/accepting those frequencies. I believe after the changes > > made in the commit listed above the assumptions listed in this driver > > are no longer true. When I remove the + 999 from the driver the DSI > > panel begins to work again. 
> > > > Output from my 5.15 kernel with 999 removed (printfs added): > > [2.852054] rockchip-drm display-subsystem: bound ff46.vop (ops > > vop_component_ops) > > [2.864483] rockchip-drm display-subsystem: bound ff45.dsi (ops > > dw_mipi_dsi_rockchip_ops) > > [2.880869] Clock is dclk_vopb_frac > > [2.880892] Rate 1700, Parent Rate 1700, Best Numerator 1, Best > > Denominator 1 > > [2.928521] Clock is dclk_vopb_frac > > [2.928551] Rate 1700, Parent Rate 1700, Best Numerator 1, Best > > Denominator 1 > > [2.928570] Clock is dclk_vopb_frac > > [2.928574] Rate 1700, Parent Rate 1700, Best Numerator 1, Best > > Denominator 1 > > > > I have tested the change extensively on my Odroid Go Advance (Rockchip > > RK3326) and it appears to work well. However, this change will affect > > all Rockchip SoCs that use this driver so I believe further testing > > is warranted. Please note that without this change I can confirm > > at least all PX30s with DSI panels will stop working with the 5.15 > > kernel. > > To me it all makes a lot of sense, thank you for deep analysis of the issue! > In any case I think we will need a Fixes tag to something (either one of > clk-fractional-divider.c series or preexisted). Would this work for a
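The arithmetic described for the 5.14 kernel in the message above (a quotient of 0, minus 1, fed through fls_long, then a left shift by 48) can be reproduced with a short userspace sketch. The helper names are made up for this example and the code is not the clk driver itself. The parent rate of 17000000 Hz is an assumption, since the value printed in the logs appears truncated; any parent rate below the requested rate gives the same quotient of 0.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Userspace stand-in for fls_long() on a 64-bit value: 1-based index
 * of the highest set bit, or 0 when no bit is set.
 */
static unsigned int fls64_sketch(uint64_t x)
{
	unsigned int bits = 0;

	while (x) {
		bits++;
		x >>= 1;
	}
	return bits;
}

/*
 * Reproduces the scaling step as described in the message: the
 * unsigned subtraction wraps when parent_rate / rate is 0, so the
 * scale becomes 64 and the rate is shifted far out of range.
 * rate must be non-zero; the sketch does not guard against that.
 */
static uint64_t scaled_rate(uint64_t rate, uint64_t parent_rate,
			    unsigned int mwidth)
{
	unsigned int scale = fls64_sketch(parent_rate / rate - 1);

	if (scale > mwidth)
		rate <<= scale - mwidth;

	return rate;
}
```

With the requested rate 17000999 and a width of 16, the wrap-around yields the 7649082492112076800 seen in the log; with the requested rate equal to the parent rate the scale is 0 and the rate passes through unchanged.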
Re: [Intel-gfx] [PATCH 3/4] drm/i915: rename debugfs_gt_pm files
On Fri, Sep 10, 2021 at 09:14:37PM +, Yokoyama, Caz wrote: On Fri, 2021-09-10 at 10:52 -0700, Lucas De Marchi wrote: On Wed, Sep 08, 2021 at 05:49:40PM -0700, Lucas De Marchi wrote: > We shouldn't be using debugfs_ namespace for this functionality. > Rename > debugfs_gt_pm.[ch] to intel_gt_pm_debugfs.[ch] and then make > functions, defines and structs follow suit. > > Signed-off-by: Lucas De Marchi > --- > drivers/gpu/drm/i915/Makefile | 2 +- > drivers/gpu/drm/i915/gt/debugfs_gt_pm.h| 14 - > - > drivers/gpu/drm/i915/gt/intel_gt_debugfs.c | 4 ++-- > .../gt/{debugfs_gt_pm.c => intel_gt_pm_debugfs.c} | 4 ++-- > drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.h | 14 > ++ > 5 files changed, 19 insertions(+), 19 deletions(-) > delete mode 100644 drivers/gpu/drm/i915/gt/debugfs_gt_pm.h > rename drivers/gpu/drm/i915/gt/{debugfs_gt_pm.c => > intel_gt_pm_debugfs.c} (99%) > create mode 100644 drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.h > > diff --git a/drivers/gpu/drm/i915/Makefile > b/drivers/gpu/drm/i915/Makefile > index 232c9673a2e5..dd656f2d7721 100644 > --- a/drivers/gpu/drm/i915/Makefile > +++ b/drivers/gpu/drm/i915/Makefile > @@ -79,7 +79,6 @@ i915-$(CONFIG_PERF_EVENTS) += i915_pmu.o > > # "Graphics Technology" (aka we talk to the gpu) > gt-y += \ > - gt/debugfs_gt_pm.o \ >gt/gen2_engine_cs.o \ >gt/gen6_engine_cs.o \ >gt/gen6_ppgtt.o \ > @@ -103,6 +102,7 @@ gt-y += \ >gt/intel_gt_engines_debugfs.o \ >gt/intel_gt_irq.o \ >gt/intel_gt_pm.o \ > + gt/intel_gt_pm_debugfs.o \ >gt/intel_gt_pm_irq.o \ >gt/intel_gt_requests.o \ >gt/intel_gtt.o \ > diff --git a/drivers/gpu/drm/i915/gt/debugfs_gt_pm.h > b/drivers/gpu/drm/i915/gt/debugfs_gt_pm.h > deleted file mode 100644 > index 4cf5f5c9da7d.. 
> --- a/drivers/gpu/drm/i915/gt/debugfs_gt_pm.h > +++ /dev/null > @@ -1,14 +0,0 @@ > -/* SPDX-License-Identifier: MIT */ > -/* > - * Copyright © 2019 Intel Corporation > - */ > - > -#ifndef DEBUGFS_GT_PM_H > -#define DEBUGFS_GT_PM_H > - > -struct intel_gt; > -struct dentry; > - > -void debugfs_gt_pm_register(struct intel_gt *gt, struct dentry > *root); > - > -#endif /* DEBUGFS_GT_PM_H */ > diff --git a/drivers/gpu/drm/i915/gt/intel_gt_debugfs.c > b/drivers/gpu/drm/i915/gt/intel_gt_debugfs.c > index e5d173c235a3..4096ee893b69 100644 > --- a/drivers/gpu/drm/i915/gt/intel_gt_debugfs.c > +++ b/drivers/gpu/drm/i915/gt/intel_gt_debugfs.c > @@ -5,10 +5,10 @@ > > #include > > -#include "debugfs_gt_pm.h" > #include "i915_drv.h" > #include "intel_gt_debugfs.h" > #include "intel_gt_engines_debugfs.h" > +#include "intel_gt_pm_debugfs.h" Why locate here? Why not just replace debugfs_gt_pm.h? Compile error? are you asking why I moved the include? Because sorting them alphabetically avoid big messes in these includes Lucas De Marchi -caz > #include "intel_sseu_debugfs.h" > #include "uc/intel_uc_debugfs.h" > > @@ -24,7 +24,7 @@ void intel_gt_register_debugfs(struct intel_gt > *gt) >return; > >intel_gt_engines_register_debugfs(gt, root); > - debugfs_gt_pm_register(gt, root); > + intel_gt_pm_register_debugfs(gt, root); This is one case I usually don't know what convention to follow since it changes in different places. I did it like _register_debugfs because of calls like intel_gt_init_scratch(), xxx_init_hw, etc. However here I see that just below we have intel_sseu_debugfs_register(), so maybe I should consider debugfs as part of the namespace? Lucas De Marchi
Re: [PATCH 2/2] drm/msm/dpu: Fix timeout issues on command mode panels
Hi Angelo!

On 2021-09-01 19:43:47, AngeloGioacchino Del Regno wrote:
> In function dpu_encoder_phys_cmd_wait_for_commit_done we are always
> checking if the relative CTL is started by waiting for an interrupt
> to fire: it is fine to do that, but then sometimes we call this
> function while the CTL is up and has never been put down, but that
> interrupt gets raised only when the CTL gets a state change from
> 0 to 1 (disabled to enabled), so we're going to wait for something
> that will never happen on its own.
>
> Solving this while avoiding to restart the CTL is actually possible
> and can be done by just checking if it is already up and running
> when the wait_for_commit_done function is called: in this case, so,
> if the CTL was already running, we can say that the commit is done
> if the command transmission is complete (in other terms, if the
> interface has been flushed).
>
> Signed-off-by: AngeloGioacchino Del Regno
> ---
>  drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_cmd.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_cmd.c
> b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_cmd.c
> index aa01698d6b25..b5b1b555ac4e 100644
> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_cmd.c
> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_cmd.c
> @@ -682,6 +682,9 @@ static int dpu_encoder_phys_cmd_wait_for_commit_done(
>  	if (!dpu_encoder_phys_cmd_is_master(phys_enc))
>  		return 0;
>
> +	if (phys_enc->hw_ctl->ops.is_started)
> +		return dpu_encoder_phys_cmd_wait_for_tx_complete(phys_enc);

In the previous commit you introduced is_started to the ops struct as
function pointer, and you probably intend to call it here instead of
just checking whether it might be NULL.

As far as I remember this was also the reason for previously mentioning
that it was faulty and required a v2 in:
https://lore.kernel.org/linux-arm-msm/bdc67afc-3736-5497-c43f-5165c55e0...@somainline.org/

Thanks!
- Marijn

> +
>  	return _dpu_encoder_phys_cmd_wait_for_ctl_start(phys_enc);
>  }
>
> --
> 2.32.0
>
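The difference between testing a function pointer and calling it, which the review comment above points out, can be sketched in userspace C. struct hw_ctl, struct hw_ctl_ops and the helper functions are hypothetical stand-ins invented for this example, not the DPU driver's real types, and the "called" variant is only one possible reading of what a corrected check might look like.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct hw_ctl;

/* Hypothetical ops table with an optional is_started hook. */
struct hw_ctl_ops {
	bool (*is_started)(struct hw_ctl *ctl);
};

/* Hypothetical control block; "started" models the hardware state. */
struct hw_ctl {
	struct hw_ctl_ops ops;
	bool started;
};

static bool ctl_is_started(struct hw_ctl *ctl)
{
	return ctl->started;
}

/* As posted: true whenever the hook exists, whatever the state is. */
static bool started_check_as_posted(struct hw_ctl *ctl)
{
	return ctl->ops.is_started != NULL;
}

/* Guard against a missing hook, then actually call it. */
static bool started_check_called(struct hw_ctl *ctl)
{
	return ctl->ops.is_started && ctl->ops.is_started(ctl);
}
```

With the hook installed and the block stopped, the first check still reports true, which in the posted patch would always take the wait_for_tx_complete path on hardware that implements the hook.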
Re: [Intel-gfx] [PATCH 3/4] drm/i915: rename debugfs_gt_pm files
On Fri, 2021-09-10 at 10:52 -0700, Lucas De Marchi wrote: > On Wed, Sep 08, 2021 at 05:49:40PM -0700, Lucas De Marchi wrote: > > We shouldn't be using debugfs_ namespace for this functionality. > > Rename > > debugfs_gt_pm.[ch] to intel_gt_pm_debugfs.[ch] and then make > > functions, defines and structs follow suit. > > > > Signed-off-by: Lucas De Marchi > > --- > > drivers/gpu/drm/i915/Makefile | 2 +- > > drivers/gpu/drm/i915/gt/debugfs_gt_pm.h| 14 - > > - > > drivers/gpu/drm/i915/gt/intel_gt_debugfs.c | 4 ++-- > > .../gt/{debugfs_gt_pm.c => intel_gt_pm_debugfs.c} | 4 ++-- > > drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.h | 14 > > ++ > > 5 files changed, 19 insertions(+), 19 deletions(-) > > delete mode 100644 drivers/gpu/drm/i915/gt/debugfs_gt_pm.h > > rename drivers/gpu/drm/i915/gt/{debugfs_gt_pm.c => > > intel_gt_pm_debugfs.c} (99%) > > create mode 100644 drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.h > > > > diff --git a/drivers/gpu/drm/i915/Makefile > > b/drivers/gpu/drm/i915/Makefile > > index 232c9673a2e5..dd656f2d7721 100644 > > --- a/drivers/gpu/drm/i915/Makefile > > +++ b/drivers/gpu/drm/i915/Makefile > > @@ -79,7 +79,6 @@ i915-$(CONFIG_PERF_EVENTS) += i915_pmu.o > > > > # "Graphics Technology" (aka we talk to the gpu) > > gt-y += \ > > - gt/debugfs_gt_pm.o \ > > gt/gen2_engine_cs.o \ > > gt/gen6_engine_cs.o \ > > gt/gen6_ppgtt.o \ > > @@ -103,6 +102,7 @@ gt-y += \ > > gt/intel_gt_engines_debugfs.o \ > > gt/intel_gt_irq.o \ > > gt/intel_gt_pm.o \ > > + gt/intel_gt_pm_debugfs.o \ > > gt/intel_gt_pm_irq.o \ > > gt/intel_gt_requests.o \ > > gt/intel_gtt.o \ > > diff --git a/drivers/gpu/drm/i915/gt/debugfs_gt_pm.h > > b/drivers/gpu/drm/i915/gt/debugfs_gt_pm.h > > deleted file mode 100644 > > index 4cf5f5c9da7d.. 
> > --- a/drivers/gpu/drm/i915/gt/debugfs_gt_pm.h > > +++ /dev/null > > @@ -1,14 +0,0 @@ > > -/* SPDX-License-Identifier: MIT */ > > -/* > > - * Copyright © 2019 Intel Corporation > > - */ > > - > > -#ifndef DEBUGFS_GT_PM_H > > -#define DEBUGFS_GT_PM_H > > - > > -struct intel_gt; > > -struct dentry; > > - > > -void debugfs_gt_pm_register(struct intel_gt *gt, struct dentry > > *root); > > - > > -#endif /* DEBUGFS_GT_PM_H */ > > diff --git a/drivers/gpu/drm/i915/gt/intel_gt_debugfs.c > > b/drivers/gpu/drm/i915/gt/intel_gt_debugfs.c > > index e5d173c235a3..4096ee893b69 100644 > > --- a/drivers/gpu/drm/i915/gt/intel_gt_debugfs.c > > +++ b/drivers/gpu/drm/i915/gt/intel_gt_debugfs.c > > @@ -5,10 +5,10 @@ > > > > #include > > > > -#include "debugfs_gt_pm.h" > > #include "i915_drv.h" > > #include "intel_gt_debugfs.h" > > #include "intel_gt_engines_debugfs.h" > > +#include "intel_gt_pm_debugfs.h" Why locate here? Why not just replace debugfs_gt_pm.h? Compile error? -caz > > #include "intel_sseu_debugfs.h" > > #include "uc/intel_uc_debugfs.h" > > > > @@ -24,7 +24,7 @@ void intel_gt_register_debugfs(struct intel_gt > > *gt) > > return; > > > > intel_gt_engines_register_debugfs(gt, root); > > - debugfs_gt_pm_register(gt, root); > > + intel_gt_pm_register_debugfs(gt, root); > > This is one case I usually don't know what convention to follow since > it > changes in different places. > > I did it like _register_debugfs because of calls like > intel_gt_init_scratch(), xxx_init_hw, etc. However here I see that > just > below we have intel_sseu_debugfs_register(), so maybe I should > consider > debugfs as part of the namespace? > > Lucas De Marchi
Re: [Intel-gfx] [PATCH 23/27] drm/i915/guc: Implement no mid batch preemption for multi-lrc
On Fri, Sep 10, 2021 at 12:25:43PM +0100, Tvrtko Ursulin wrote:
>
> On 20/08/2021 23:44, Matthew Brost wrote:
> > For some users of multi-lrc, e.g. split frame, it isn't safe to preempt
> > mid BB. To safely enable preemption at the BB boundary, a handshake
> > between the parent and child is needed. This is implemented via custom
> > emit_bb_start & emit_fini_breadcrumb functions and enabled by
> > default if a context is configured by set parallel extension.
>
> FWIW I think it's wrong to hardcode the requirements of a particular
> hardware generation fixed media pipeline into the uapi. IMO better solution
> was when concept of parallel submission was decoupled from the no preemption
> mid batch preambles. Otherwise might as well call the extension
> I915_CONTEXT_ENGINES_EXT_MEDIA_SPLIT_FRAME_SUBMIT or something.
>

I don't disagree, but this is where we landed per Daniel Vetter's feedback:
default to what our current hardware supports and extend it later to newer
hardware / requirements as needed.

Matt > Regards, > > Tvrtko > > Signed-off-by: Matthew Brost > > --- > > drivers/gpu/drm/i915/gt/intel_context.c | 2 +- > > drivers/gpu/drm/i915/gt/intel_context_types.h | 3 + > > drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h | 2 +- > > .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 283 +- > > 4 files changed, 287 insertions(+), 3 deletions(-) > > > > diff --git a/drivers/gpu/drm/i915/gt/intel_context.c > > b/drivers/gpu/drm/i915/gt/intel_context.c > > index 5615be32879c..2de62649e275 100644 > > --- a/drivers/gpu/drm/i915/gt/intel_context.c > > +++ b/drivers/gpu/drm/i915/gt/intel_context.c > > @@ -561,7 +561,7 @@ void intel_context_bind_parent_child(struct > > intel_context *parent, > > GEM_BUG_ON(intel_context_is_child(child)); > > GEM_BUG_ON(intel_context_is_parent(child)); > > - parent->guc_number_children++; > > + child->guc_child_index = parent->guc_number_children++; > > list_add_tail(&child->guc_child_link, > > &parent->guc_child_list); > > child->parent = parent; > > diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h > > b/drivers/gpu/drm/i915/gt/intel_context_types.h > > index 713d85b0b364..727f91e7f7c2 100644 > > --- a/drivers/gpu/drm/i915/gt/intel_context_types.h > > +++ b/drivers/gpu/drm/i915/gt/intel_context_types.h > > @@ -246,6 +246,9 @@ struct intel_context { > > /** @guc_number_children: number of children if parent */ > > u8 guc_number_children; > > + /** @guc_child_index: index into guc_child_list if child */ > > + u8 guc_child_index; > > + > > /** > > * @parent_page: page in context used by parent for work queue, > > * work queue descriptor > > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h > > b/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h > > index 6cd26dc060d1..9f61cfa5566a 100644 > > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h > > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h > > @@ -188,7 +188,7 @@ struct guc_process_desc { > > u32 wq_status; > > u32 engine_presence; > > u32 priority; > > - u32 reserved[30]; > > + 
u32 reserved[36]; > > } __packed; > > #define CONTEXT_REGISTRATION_FLAG_KMD BIT(0) > > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c > > b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c > > index 91330525330d..1a18f99bf12a 100644 > > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c > > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c > > @@ -11,6 +11,7 @@ > > #include "gt/intel_context.h" > > #include "gt/intel_engine_pm.h" > > #include "gt/intel_engine_heartbeat.h" > > +#include "gt/intel_gpu_commands.h" > > #include "gt/intel_gt.h" > > #include "gt/intel_gt_irq.h" > > #include "gt/intel_gt_pm.h" > > @@ -366,10 +367,14 @@ static struct i915_priolist *to_priolist(struct > > rb_node *rb) > > /* > >* When using multi-lrc submission an extra page in the context state is > > - * reserved for the process descriptor and work queue. > > + * reserved for the process descriptor, work queue, and preempt BB boundary > > + * handshake between the parent + childlren contexts. > >* > >* The layout of this page is below: > >* 0 guc_process_desc > > + * + sizeof(struct guc_process_desc) child go > > + * + CACHELINE_BYTES child join ... > > + * + CACHELINE_BYTES ... > >* ...unused > >* PAGE_SIZE / 2 work queue start > >* ...work queue > > @@ -1785,6 +1790,30 @@ static int deregister_context(struct intel_context > > *ce, u32 guc_id, bool loop) > > return __guc_action_deregister_context(guc, guc_id, loop); > > } > > +static inline void clear_children_join_go_memory(struct intel_context *ce) > > +
Re: [Intel-gfx] [PATCH 05/27] drm/i915: Add GT PM unpark worker
On Fri, Sep 10, 2021 at 09:36:17AM +0100, Tvrtko Ursulin wrote:
>
> On 20/08/2021 23:44, Matthew Brost wrote:
> > Sometimes it is desirable to queue work up for later if the GT PM isn't
> > held and run that work on next GT PM unpark.
>
> Sounds maybe plausible, but it depends how much work can happen on unpark
> and whether it can have too much of a negative impact on latency for
> interactive loads? Or from a reverse angle, why the work wouldn't be done on

All it does is add an interface to kick a work queue on unpark, i.e. all the
actual work is done async in the work queue, so it shouldn't add any latency.

> parking?
>
> Also what kind of mechanism for dealing with too much stuff being put on
> this list you have? Can there be pressure which triggers (or would need to

No limits on pressure. See above, I don't think this is a concern.

> trigger) these deregistrations to happen at runtime (no park/unpark
> transitions)?
>
> > Implemented with a list in the GT of all pending work, workqueues in
> > the list, a callback to add a workqueue to the list, and finally a
> > wakeref post_get callback that iterates / drains the list + queues the
> > workqueues.
> >
> > First user of this is deregistration of GuC contexts.
>
> Does first imply there are more incoming?
>

Haven't found another user yet, but this is a generic mechanism, so we can add
more in the future if other use cases arise.

> > Signed-off-by: Matthew Brost > > --- > > drivers/gpu/drm/i915/Makefile | 1 + > > drivers/gpu/drm/i915/gt/intel_gt.c| 3 ++ > > drivers/gpu/drm/i915/gt/intel_gt_pm.c | 8 > > .../gpu/drm/i915/gt/intel_gt_pm_unpark_work.c | 35 > > .../gpu/drm/i915/gt/intel_gt_pm_unpark_work.h | 40 +++ > > drivers/gpu/drm/i915/gt/intel_gt_types.h | 10 + > > drivers/gpu/drm/i915/gt/uc/intel_guc.h| 8 ++-- > > .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 15 +-- > > drivers/gpu/drm/i915/intel_wakeref.c | 5 +++ > > drivers/gpu/drm/i915/intel_wakeref.h | 1 + > > 10 files changed, 119 insertions(+), 7 deletions(-) > > create mode 100644 drivers/gpu/drm/i915/gt/intel_gt_pm_unpark_work.c > > create mode 100644 drivers/gpu/drm/i915/gt/intel_gt_pm_unpark_work.h > > > > diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile > > index 642a5b5a1b81..579bdc069f25 100644 > > --- a/drivers/gpu/drm/i915/Makefile > > +++ b/drivers/gpu/drm/i915/Makefile > > @@ -103,6 +103,7 @@ gt-y += \ > > gt/intel_gt_clock_utils.o \ > > gt/intel_gt_irq.o \ > > gt/intel_gt_pm.o \ > > + gt/intel_gt_pm_unpark_work.o \ > > gt/intel_gt_pm_irq.o \ > > gt/intel_gt_requests.o \ > > gt/intel_gtt.o \ > > diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c > > b/drivers/gpu/drm/i915/gt/intel_gt.c > > index 62d40c986642..7e690e74baa2 100644 > > --- a/drivers/gpu/drm/i915/gt/intel_gt.c > > +++ b/drivers/gpu/drm/i915/gt/intel_gt.c > > @@ -29,6 +29,9 @@ void intel_gt_init_early(struct intel_gt *gt, struct > > drm_i915_private *i915) > > spin_lock_init(>->irq_lock); > > + spin_lock_init(>->pm_unpark_work_lock); > > + INIT_LIST_HEAD(>->pm_unpark_work_list); > > + > > INIT_LIST_HEAD(>->closed_vma); > > spin_lock_init(>->closed_lock); > > diff --git a/drivers/gpu/drm/i915/gt/intel_gt_pm.c > > b/drivers/gpu/drm/i915/gt/intel_gt_pm.c > > index dea8e2479897..564c11a3748b 100644 > > --- a/drivers/gpu/drm/i915/gt/intel_gt_pm.c > > +++ b/drivers/gpu/drm/i915/gt/intel_gt_pm.c > > @@ -90,6 +90,13 @@ static int 
__gt_unpark(struct intel_wakeref *wf) > > return 0; > > } > > +static void __gt_unpark_work_queue(struct intel_wakeref *wf) > > +{ > > + struct intel_gt *gt = container_of(wf, typeof(*gt), wakeref); > > + > > + intel_gt_pm_unpark_work_queue(gt); > > +} > > + > > static int __gt_park(struct intel_wakeref *wf) > > { > > struct intel_gt *gt = container_of(wf, typeof(*gt), wakeref); > > @@ -118,6 +125,7 @@ static int __gt_park(struct intel_wakeref *wf) > > static const struct intel_wakeref_ops wf_ops = { > > .get = __gt_unpark, > > + .post_get = __gt_unpark_work_queue, > > .put = __gt_park, > > }; > > diff --git a/drivers/gpu/drm/i915/gt/intel_gt_pm_unpark_work.c > > b/drivers/gpu/drm/i915/gt/intel_gt_pm_unpark_work.c > > new file mode 100644 > > index ..23162dbd0c35 > > --- /dev/null > > +++ b/drivers/gpu/drm/i915/gt/intel_gt_pm_unpark_work.c > > @@ -0,0 +1,35 @@ > > +// SPDX-License-Identifier: MIT > > +/* > > + * Copyright © 2021 Intel Corporation > > + */ > > + > > +#include "i915_drv.h" > > +#include "intel_runtime_pm.h" > > +#include "intel_gt_pm.h" > > + > > +void intel_gt_pm_unpark_work_queue(struct intel_gt *gt) > > +{ > > + struct intel_gt_pm_unpark_work *work, *next; > > + unsigned long flags; > > + > > + spin_lock_irqsave(>->pm_unpark_work_lock, flags); > > + list_for_each_entry_safe(work, next, > > +
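The mechanism described in the commit message (park work on a list while the GT is asleep, drain the list on the next unpark) can be sketched in plain C. This is a minimal userspace illustration, not the code from the patch: `struct gt_sketch`, `struct unpark_work`, and every `gt_sketch_*` function are hypothetical names, the "run immediately if already awake" branch is an assumption, and the spinlock the real code takes around the list is omitted.

```c
#include <stddef.h>

/* Hypothetical work item: one deferred callback plus a list link. */
struct unpark_work {
	struct unpark_work *next;
	void (*fn)(struct unpark_work *work);
};

/* Hypothetical stand-in for the GT: an awake flag and a FIFO of pending work. */
struct gt_sketch {
	int awake;
	struct unpark_work *head;
	struct unpark_work **tail;
};

static void gt_sketch_init(struct gt_sketch *gt)
{
	gt->awake = 0;
	gt->head = NULL;
	gt->tail = &gt->head;
}

/* Run now if the GT is awake (assumption), otherwise append to the pending list. */
static void gt_sketch_add_work(struct gt_sketch *gt, struct unpark_work *work)
{
	if (gt->awake) {
		work->fn(work);
		return;
	}
	work->next = NULL;
	*gt->tail = work;
	gt->tail = &work->next;
}

/* Unpark: mark awake, detach the whole pending list, run each item in order. */
static void gt_sketch_unpark(struct gt_sketch *gt)
{
	struct unpark_work *work = gt->head;

	gt->awake = 1;
	gt->head = NULL;
	gt->tail = &gt->head;

	while (work) {
		struct unpark_work *next = work->next;

		work->next = NULL;
		work->fn(work);
		work = next;
	}
}

/* Demo callback: counts how many work items have run. */
static int demo_runs;

static void demo_fn(struct unpark_work *work)
{
	(void)work;
	demo_runs++;
}
```

In the real driver the drain step only kicks workqueues, so the heavy lifting stays asynchronous; here the callback runs inline purely to keep the sketch self-contained.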
[PATCH v2 3/6] drm/i915/uncore: Replace gen8 write functions with general fwtable
Now that we have both a standard forcewake table (albeit a single-entry
table) and the shadow table stored in the uncore, we can drop the
gen8-specific write handlers in favor of the general fwtable version.

Signed-off-by: Matt Roper
---
 drivers/gpu/drm/i915/intel_uncore.c | 13 +
 1 file changed, 1 insertion(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
index 5fa2bf26a948..4c6898746d10 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -1046,16 +1046,6 @@ gen6_reg_write_fw_domains(struct intel_uncore *uncore, i915_reg_t reg)
 	return FORCEWAKE_RENDER;
 }
 
-#define __gen8_reg_write_fw_domains(uncore, offset) \
-({ \
-	enum forcewake_domains __fwd; \
-	if (NEEDS_FORCE_WAKE(offset) && !is_shadowed(uncore, offset)) \
-		__fwd = FORCEWAKE_RENDER; \
-	else \
-		__fwd = 0; \
-	__fwd; \
-})
-
 static const struct intel_forcewake_range __gen6_fw_ranges[] = {
 	GEN_FW_RANGE(0x0, 0x3, FORCEWAKE_RENDER),
 };
@@ -1711,7 +1701,6 @@ __gen_write(func, 32)
 __gen_reg_write_funcs(gen12_fwtable);
 __gen_reg_write_funcs(gen11_fwtable);
 __gen_reg_write_funcs(fwtable);
-__gen_reg_write_funcs(gen8);
 
 #undef __gen_reg_write_funcs
 #undef GEN6_WRITE_FOOTER
@@ -2121,7 +2110,7 @@ static int uncore_forcewake_init(struct intel_uncore *uncore)
 	} else if (GRAPHICS_VER(i915) == 8) {
 		ASSIGN_FW_DOMAINS_TABLE(uncore, __gen6_fw_ranges);
 		ASSIGN_SHADOW_TABLE(uncore, gen8_shadowed_regs);
-		ASSIGN_WRITE_MMIO_VFUNCS(uncore, gen8);
+		ASSIGN_WRITE_MMIO_VFUNCS(uncore, fwtable);
 		ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable);
 	} else if (IS_VALLEYVIEW(i915)) {
 		ASSIGN_FW_DOMAINS_TABLE(uncore, __vlv_fw_ranges);
-- 
2.25.4
[PATCH v2 6/6] drm/i915/dg2: Add DG2-specific shadow register table
We thought the DG2 table of shadowed registers would be the same as the gen12/xehp table, but it turns out that there are a few minor differences that require us to define a new DG2-specific table: * One register is removed (0xC4D4) * One register is added (0xC4E0) Signed-off-by: Matt Roper --- drivers/gpu/drm/i915/intel_uncore.c | 41 ++- drivers/gpu/drm/i915/selftests/intel_uncore.c | 1 + 2 files changed, 41 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c index 10f124297e7c..b3ba710d4310 100644 --- a/drivers/gpu/drm/i915/intel_uncore.c +++ b/drivers/gpu/drm/i915/intel_uncore.c @@ -1016,6 +1016,45 @@ static const struct i915_range gen12_shadowed_regs[] = { { .start = 0x1F8510, .end = 0x1F8550 }, }; +static const struct i915_range dg2_shadowed_regs[] = { + { .start = 0x2030, .end = 0x2030 }, + { .start = 0x2510, .end = 0x2550 }, + { .start = 0xA008, .end = 0xA00C }, + { .start = 0xA188, .end = 0xA188 }, + { .start = 0xA278, .end = 0xA278 }, + { .start = 0xA540, .end = 0xA56C }, + { .start = 0xC4C8, .end = 0xC4C8 }, + { .start = 0xC4E0, .end = 0xC4E0 }, + { .start = 0xC600, .end = 0xC600 }, + { .start = 0xC658, .end = 0xC658 }, + { .start = 0x22030, .end = 0x22030 }, + { .start = 0x22510, .end = 0x22550 }, + { .start = 0x1C0030, .end = 0x1C0030 }, + { .start = 0x1C0510, .end = 0x1C0550 }, + { .start = 0x1C4030, .end = 0x1C4030 }, + { .start = 0x1C4510, .end = 0x1C4550 }, + { .start = 0x1C8030, .end = 0x1C8030 }, + { .start = 0x1C8510, .end = 0x1C8550 }, + { .start = 0x1D0030, .end = 0x1D0030 }, + { .start = 0x1D0510, .end = 0x1D0550 }, + { .start = 0x1D4030, .end = 0x1D4030 }, + { .start = 0x1D4510, .end = 0x1D4550 }, + { .start = 0x1D8030, .end = 0x1D8030 }, + { .start = 0x1D8510, .end = 0x1D8550 }, + { .start = 0x1E0030, .end = 0x1E0030 }, + { .start = 0x1E0510, .end = 0x1E0550 }, + { .start = 0x1E4030, .end = 0x1E4030 }, + { .start = 0x1E4510, .end = 0x1E4550 }, + { .start = 0x1E8030, .end = 
0x1E8030 }, + { .start = 0x1E8510, .end = 0x1E8550 }, + { .start = 0x1F0030, .end = 0x1F0030 }, + { .start = 0x1F0510, .end = 0x1F0550 }, + { .start = 0x1F4030, .end = 0x1F4030 }, + { .start = 0x1F4510, .end = 0x1F4550 }, + { .start = 0x1F8030, .end = 0x1F8030 }, + { .start = 0x1F8510, .end = 0x1F8550 }, +}; + static int mmio_range_cmp(u32 key, const struct i915_range *range) { if (key < range->start) @@ -2054,7 +2093,7 @@ static int uncore_forcewake_init(struct intel_uncore *uncore) if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 55)) { ASSIGN_FW_DOMAINS_TABLE(uncore, __dg2_fw_ranges); - ASSIGN_SHADOW_TABLE(uncore, gen12_shadowed_regs); + ASSIGN_SHADOW_TABLE(uncore, dg2_shadowed_regs); ASSIGN_WRITE_MMIO_VFUNCS(uncore, fwtable); ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable); } else if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) { diff --git a/drivers/gpu/drm/i915/selftests/intel_uncore.c b/drivers/gpu/drm/i915/selftests/intel_uncore.c index 22ef2c87df1a..bc8128170a99 100644 --- a/drivers/gpu/drm/i915/selftests/intel_uncore.c +++ b/drivers/gpu/drm/i915/selftests/intel_uncore.c @@ -68,6 +68,7 @@ static int intel_shadow_table_check(void) { gen8_shadowed_regs, ARRAY_SIZE(gen8_shadowed_regs) }, { gen11_shadowed_regs, ARRAY_SIZE(gen11_shadowed_regs) }, { gen12_shadowed_regs, ARRAY_SIZE(gen12_shadowed_regs) }, + { dg2_shadowed_regs, ARRAY_SIZE(dg2_shadowed_regs) }, }; const struct i915_range *range; unsigned int i, j; -- 2.25.4
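Because the shadow tables are searched with a binary search, a new table is only usable if its ranges are sorted and non-overlapping, which is why the patch also registers the table with the selftest. The check below is a rough standalone sketch of that kind of validation, not the actual `intel_shadow_table_check()`; `struct range_sketch`, `ranges_sorted()`, and the two demo tables are hypothetical, with the "good" table borrowing three entries from the DG2 list above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical inclusive register range, mirroring the { .start, .end } pairs. */
struct range_sketch {
	uint32_t start;
	uint32_t end;
};

/*
 * Return true when every range is well-formed (start <= end) and each
 * range begins strictly after the previous one ends.
 */
static bool ranges_sorted(const struct range_sketch *r, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (r[i].end < r[i].start)
			return false;
		if (i > 0 && r[i].start <= r[i - 1].end)
			return false;
	}
	return true;
}

/* Three entries taken from the DG2 table, in table order. */
static const struct range_sketch demo_good[] = {
	{ 0xC4C8, 0xC4C8 },
	{ 0xC4E0, 0xC4E0 },
	{ 0xC600, 0xC600 },
};

/* The same kind of entries deliberately out of order. */
static const struct range_sketch demo_bad[] = {
	{ 0xC4E0, 0xC4E0 },
	{ 0xC4C8, 0xC4C8 },
};
```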
[PATCH v2 5/6] drm/i915/uncore: Drop gen11 mmio read handlers
Consolidate down to just a single 'fwtable' implementation. For reads we don't need to worry about shadow tables. Also, the NEEDS_FORCE_WAKE() check we previously had in the fwtable implementation can be dropped --- if a register is outside that range on one of the old platforms, then it won't belong to any forcewake range and 0 will be returned anyway. v2: - Restore NEEDS_FORCE_WAKE() check. (Chris, Tvrtko) Cc: Chris Wilson Cc: Tvrtko Ursulin Signed-off-by: Matt Roper --- drivers/gpu/drm/i915/intel_uncore.c | 40 - 1 file changed, 17 insertions(+), 23 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c index bfb2a6337f9d..10f124297e7c 100644 --- a/drivers/gpu/drm/i915/intel_uncore.c +++ b/drivers/gpu/drm/i915/intel_uncore.c @@ -935,9 +935,6 @@ static const struct intel_forcewake_range __vlv_fw_ranges[] = { __fwd; \ }) -#define __gen11_fwtable_reg_read_fw_domains(uncore, offset) \ - find_fw_domain(uncore, offset) - /* *Must* be sorted by offset! See intel_shadow_table_check(). 
*/ static const struct i915_range gen8_shadowed_regs[] = { { .start = 0x2030, .end = 0x2030 }, @@ -1570,33 +1567,30 @@ static inline void __force_wake_auto(struct intel_uncore *uncore, ___force_wake_auto(uncore, fw_domains); } -#define __gen_read(func, x) \ +#define __gen_fwtable_read(x) \ static u##x \ -func##_read##x(struct intel_uncore *uncore, i915_reg_t reg, bool trace) { \ +fwtable_read##x(struct intel_uncore *uncore, i915_reg_t reg, bool trace) \ +{ \ enum forcewake_domains fw_engine; \ GEN6_READ_HEADER(x); \ - fw_engine = __##func##_reg_read_fw_domains(uncore, offset); \ + fw_engine = __fwtable_reg_read_fw_domains(uncore, offset); \ if (fw_engine) \ __force_wake_auto(uncore, fw_engine); \ val = __raw_uncore_read##x(uncore, reg); \ GEN6_READ_FOOTER; \ } -#define __gen_reg_read_funcs(func) \ -static enum forcewake_domains \ -func##_reg_read_fw_domains(struct intel_uncore *uncore, i915_reg_t reg) { \ - return __##func##_reg_read_fw_domains(uncore, i915_mmio_reg_offset(reg)); \ -} \ -\ -__gen_read(func, 8) \ -__gen_read(func, 16) \ -__gen_read(func, 32) \ -__gen_read(func, 64) +static enum forcewake_domains +fwtable_reg_read_fw_domains(struct intel_uncore *uncore, i915_reg_t reg) { + return __fwtable_reg_read_fw_domains(uncore, i915_mmio_reg_offset(reg)); +} -__gen_reg_read_funcs(gen11_fwtable); -__gen_reg_read_funcs(fwtable); +__gen_fwtable_read(8) +__gen_fwtable_read(16) +__gen_fwtable_read(32) +__gen_fwtable_read(64) -#undef __gen_reg_read_funcs +#undef __gen_fwtable_read #undef GEN6_READ_FOOTER #undef GEN6_READ_HEADER @@ -2062,22 +2056,22 @@ static int uncore_forcewake_init(struct intel_uncore *uncore) ASSIGN_FW_DOMAINS_TABLE(uncore, __dg2_fw_ranges); ASSIGN_SHADOW_TABLE(uncore, gen12_shadowed_regs); ASSIGN_WRITE_MMIO_VFUNCS(uncore, fwtable); - ASSIGN_READ_MMIO_VFUNCS(uncore, gen11_fwtable); + ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable); } else if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) { ASSIGN_FW_DOMAINS_TABLE(uncore, __xehp_fw_ranges); 
ASSIGN_SHADOW_TABLE(uncore, gen12_shadowed_regs); ASSIGN_WRITE_MMIO_VFUNCS(uncore, fwtable); - ASSIGN_READ_MMIO_VFUNCS(uncore, gen11_fwtable); + ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable); } else if (GRAPHICS_VER(i915) >= 12) { ASSIGN_FW_DOMAINS_TABLE(uncore, __gen12_fw_ranges); ASSIGN_SHADOW_TABLE(uncore, gen12_shadowed_regs); ASSIGN_WRITE_MMIO_VFUNCS(uncore, fwtable); - ASSIGN_READ_MMIO_VFUNCS(uncore, gen11_fwtable); + ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable); } else if (GRAPHICS_VER(i915) == 11) { ASSIGN_FW_DOMAINS_TABLE(uncore, __gen11_fw_ranges); ASSIGN_SHADOW_TABLE(uncore, gen11_shadowed_regs); ASSIGN_WRITE_MMIO_VFUNCS(uncore, fwtable); - ASSIGN_READ_MMIO_VFUNCS(uncore, gen11_fwtable); + ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable); } else if (IS_GRAPHICS_VER(i915, 9, 10)) { ASSIGN_FW_DOMAINS_TABLE(uncore, __gen9_fw_ranges); ASSIGN_SHADOW_TABLE(uncore, gen8_shadowed_regs); -- 2.25.4
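For readers less familiar with the pattern, `__gen_fwtable_read(x)` relies on token pasting to stamp out one reader per access width. The following is a simplified userspace sketch of that technique only: `DEMO_GEN_READ` and the `demo_read*` functions are hypothetical, they read from an ordinary memory buffer, and they do none of the forcewake or tracing work the real handlers perform.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Generate demo_read8/16/32/64. The width argument x is pasted into both
 * the return type (uint<x>_t) and the function name (demo_read<x>).
 */
#define DEMO_GEN_READ(x) \
static uint##x##_t demo_read##x(const void *base, size_t offset) \
{ \
	uint##x##_t val; \
	/* memcpy avoids unaligned-access and aliasing problems */ \
	memcpy(&val, (const unsigned char *)base + offset, sizeof(val)); \
	return val; \
}

DEMO_GEN_READ(8)
DEMO_GEN_READ(16)
DEMO_GEN_READ(32)
DEMO_GEN_READ(64)

#undef DEMO_GEN_READ
```

Once only one family of readers is left, the macro no longer needs a `func` parameter, which is the simplification this patch makes.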
[PATCH v2 2/6] drm/i915/uncore: Associate shadow table with uncore
Store a reference to a platform's shadow table inside the uncore, the same as we do with the forcewake table. This will allow us to use a single set of functions that operate on the shadow table reference rather than generating lots of nearly-identical functions via macros that differ only in terms of the table that they reference. Signed-off-by: Matt Roper --- drivers/gpu/drm/i915/intel_uncore.c | 40 - drivers/gpu/drm/i915/intel_uncore.h | 7 + 2 files changed, 35 insertions(+), 12 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c index 8c09af1e9f7a..5fa2bf26a948 100644 --- a/drivers/gpu/drm/i915/intel_uncore.c +++ b/drivers/gpu/drm/i915/intel_uncore.c @@ -1026,17 +1026,19 @@ static int mmio_range_cmp(u32 key, const struct i915_range *range) return 0; } -#define __is_X_shadowed(x) \ -static bool is_##x##_shadowed(u32 offset) \ -{ \ - const struct i915_range *regs = x##_shadowed_regs; \ - return BSEARCH(offset, regs, ARRAY_SIZE(x##_shadowed_regs), \ +static bool +is_shadowed(struct intel_uncore *uncore, u32 offset) +{ + if (drm_WARN_ON(&uncore->i915->drm, !uncore->shadowed_reg_table)) + return false; + + return BSEARCH(offset, + uncore->shadowed_reg_table, + uncore->shadowed_reg_table_entries, mmio_range_cmp); \ } -__is_X_shadowed(gen8) -__is_X_shadowed(gen11) -__is_X_shadowed(gen12) + static enum forcewake_domains gen6_reg_write_fw_domains(struct intel_uncore *uncore, i915_reg_t reg) @@ -1047,7 +1049,7 @@ gen6_reg_write_fw_domains(struct intel_uncore *uncore, i915_reg_t reg) #define __gen8_reg_write_fw_domains(uncore, offset) \ ({ \ enum forcewake_domains __fwd; \ - if (NEEDS_FORCE_WAKE(offset) && !is_gen8_shadowed(offset)) \ + if (NEEDS_FORCE_WAKE(offset) && !is_shadowed(uncore, offset)) \ __fwd = FORCEWAKE_RENDER; \ else \ __fwd = 0; \ @@ -1081,7 +1083,7 @@ static const struct intel_forcewake_range __chv_fw_ranges[] = { #define __fwtable_reg_write_fw_domains(uncore, offset) \ ({ \ enum forcewake_domains __fwd = 0; 
\ - if (NEEDS_FORCE_WAKE((offset)) && !is_gen8_shadowed(offset)) \ + if (NEEDS_FORCE_WAKE((offset)) && !is_shadowed(uncore, offset)) \ __fwd = find_fw_domain(uncore, offset); \ __fwd; \ }) @@ -1090,7 +1092,7 @@ static const struct intel_forcewake_range __chv_fw_ranges[] = { ({ \ enum forcewake_domains __fwd = 0; \ const u32 __offset = (offset); \ - if (!is_gen11_shadowed(__offset)) \ + if (!is_shadowed(uncore, __offset)) \ __fwd = find_fw_domain(uncore, __offset); \ __fwd; \ }) @@ -1099,7 +1101,7 @@ static const struct intel_forcewake_range __chv_fw_ranges[] = { ({ \ enum forcewake_domains __fwd = 0; \ const u32 __offset = (offset); \ - if (!is_gen12_shadowed(__offset)) \ + if (!is_shadowed(uncore, __offset)) \ __fwd = find_fw_domain(uncore, __offset); \ __fwd; \ }) @@ -1705,6 +1707,7 @@ __gen_write(func, 8) \ __gen_write(func, 16) \ __gen_write(func, 32) + __gen_reg_write_funcs(gen12_fwtable); __gen_reg_write_funcs(gen11_fwtable); __gen_reg_write_funcs(fwtable); @@ -1969,6 +1972,12 @@ static int intel_uncore_fw_domains_init(struct intel_uncore *uncore) (uncore)->fw_domains_table_entries = ARRAY_SIZE((d)); \ } +#define ASSIGN_SHADOW_TABLE(uncore, d) \ +{ \ + (uncore)->shadowed_reg_table = d; \ + (uncore)->shadowed_reg_table_entries = ARRAY_SIZE((d)); \ +} + static int i915_pmic_bus_access_notifier(struct notifier_block *nb, unsigned long action, void *data) { @@ -2081,30 +2090,37 @@ static int uncore_forcewake_init(struct intel_uncore *uncore) if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 55)) { ASSIGN_FW_DOMAINS_TABLE(uncore, __dg2_fw_ranges); + ASSIGN_SHADOW_TABLE(uncore, gen12_shadowed_regs); ASSIGN_WRITE_MMIO_VFUNCS(uncore, gen12_fwtable); ASSIGN_READ_MMIO_VFUNCS(uncore, gen11_fwtable); } else if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) { ASSIGN_FW_DOMAINS_TABLE(uncore, __xehp_fw_ranges); + ASSIGN_SHADOW_TABLE(uncore, gen12_shadowed_regs); ASSIGN_WRITE_MMIO_VFUNCS(uncore, gen12_fwtable); ASSIGN_READ_MMIO_VFUNCS(uncore, gen11_fwtable); } else if 
(GRAPHICS_VER(i915) >= 12) { ASSIGN_FW_DOMAINS_TABLE(uncore, __gen12_fw_ranges); + ASSIGN_SHADOW_TABLE(uncore, gen12_shadowed_regs); ASSIGN_WRITE_MMIO_VFUNCS(uncore, gen12_fwtable); ASSIGN_READ_MMIO_VFUNCS(uncore, gen11_fwtable); } else if (GRAPHICS_VER(i915) == 11) { ASSIGN_FW_DOMAINS_TABLE(uncore, __gen11_fw_ra
[PATCH v2 4/6] drm/i915/uncore: Drop gen11/gen12 mmio write handlers
Now that the reference to the shadow table is stored within the uncore, we don't need to generate separate fwtable, gen11_fwtable, and gen12_fwtable variants of the register write functions; a single 'fwtable' implementation will work for all of those platforms now. While consolidating the functions, gen11/gen12 pick up a NEEDS_FORCE_WAKE() check that they didn't have before, allowing them to bypass a lot of forcewake/shadow checking for non-GT registers (e.g., display). However since these later platforms also introduce media engines at higher MMIO offsets, the definition of NEEDS_FORCE_WAKE() is extended to also consider register offsets above GEN11_BSD_RING_BASE. v2: - Restore NEEDS_FORCE_WAKE(), but extend it for compatibility with the gen11+ platforms by also passing offsets above GEN11_BSD_RING_BASE. (Chris, Tvrtko) Cc: Tvrtko Ursulin Cc: Chris Wilson Signed-off-by: Matt Roper --- drivers/gpu/drm/i915/intel_uncore.c | 61 ++--- 1 file changed, 21 insertions(+), 40 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c index 4c6898746d10..bfb2a6337f9d 100644 --- a/drivers/gpu/drm/i915/intel_uncore.c +++ b/drivers/gpu/drm/i915/intel_uncore.c @@ -851,7 +851,10 @@ void assert_forcewakes_active(struct intel_uncore *uncore, } /* We give fast paths for the really cool registers */ -#define NEEDS_FORCE_WAKE(reg) ((reg) < 0x4) +#define NEEDS_FORCE_WAKE(reg) ({ \ + u32 __reg = (reg); \ + __reg < 0x4 || __reg >= GEN11_BSD_RING_BASE; \ +}) static int fw_range_cmp(u32 offset, const struct intel_forcewake_range *entry) { @@ -1071,27 +1074,10 @@ static const struct intel_forcewake_range __chv_fw_ranges[] = { }; #define __fwtable_reg_write_fw_domains(uncore, offset) \ -({ \ - enum forcewake_domains __fwd = 0; \ - if (NEEDS_FORCE_WAKE((offset)) && !is_shadowed(uncore, offset)) \ - __fwd = find_fw_domain(uncore, offset); \ - __fwd; \ -}) - -#define __gen11_fwtable_reg_write_fw_domains(uncore, offset) \ ({ \ enum forcewake_domains 
__fwd = 0; \ const u32 __offset = (offset); \ - if (!is_shadowed(uncore, __offset)) \ - __fwd = find_fw_domain(uncore, __offset); \ - __fwd; \ -}) - -#define __gen12_fwtable_reg_write_fw_domains(uncore, offset) \ -({ \ - enum forcewake_domains __fwd = 0; \ - const u32 __offset = (offset); \ - if (!is_shadowed(uncore, __offset)) \ + if (NEEDS_FORCE_WAKE((__offset)) && !is_shadowed(uncore, __offset)) \ __fwd = find_fw_domain(uncore, __offset); \ __fwd; \ }) @@ -1675,34 +1661,29 @@ __gen6_write(8) __gen6_write(16) __gen6_write(32) -#define __gen_write(func, x) \ +#define __gen_fwtable_write(x) \ static void \ -func##_write##x(struct intel_uncore *uncore, i915_reg_t reg, u##x val, bool trace) { \ +fwtable_write##x(struct intel_uncore *uncore, i915_reg_t reg, u##x val, bool trace) { \ enum forcewake_domains fw_engine; \ GEN6_WRITE_HEADER; \ - fw_engine = __##func##_reg_write_fw_domains(uncore, offset); \ + fw_engine = __fwtable_reg_write_fw_domains(uncore, offset); \ if (fw_engine) \ __force_wake_auto(uncore, fw_engine); \ __raw_uncore_write##x(uncore, reg, val); \ GEN6_WRITE_FOOTER; \ } -#define __gen_reg_write_funcs(func) \ -static enum forcewake_domains \ -func##_reg_write_fw_domains(struct intel_uncore *uncore, i915_reg_t reg) { \ - return __##func##_reg_write_fw_domains(uncore, i915_mmio_reg_offset(reg)); \ -} \ -\ -__gen_write(func, 8) \ -__gen_write(func, 16) \ -__gen_write(func, 32) - +static enum forcewake_domains +fwtable_reg_write_fw_domains(struct intel_uncore *uncore, i915_reg_t reg) +{ + return __fwtable_reg_write_fw_domains(uncore, i915_mmio_reg_offset(reg)); +} -__gen_reg_write_funcs(gen12_fwtable); -__gen_reg_write_funcs(gen11_fwtable); -__gen_reg_write_funcs(fwtable); +__gen_fwtable_write(8) +__gen_fwtable_write(16) +__gen_fwtable_write(32) -#undef __gen_reg_write_funcs +#undef __gen_fwtable_write #undef GEN6_WRITE_FOOTER #undef GEN6_WRITE_HEADER @@ -2080,22 +2061,22 @@ static int uncore_forcewake_init(struct intel_uncore *uncore) if 
(GRAPHICS_VER_FULL(i915) >= IP_VER(12, 55)) { ASSIGN_FW_DOMAINS_TABLE(uncore, __dg2_fw_ranges); ASSIGN_SHADOW_TABLE(uncore, gen12_shadowed_regs); - ASSIGN_WRITE_MMIO_VFUNCS(uncore, gen12_fwtable); + ASSIGN_WRITE_MMIO_VFUNCS(uncore, fwtable); ASSIGN_READ_MMIO_VFUNCS(uncore, gen11_fwtable); } else if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) { ASSIGN_FW_DOMAINS_TABLE(uncore, __xehp_fw_ranges); ASSIGN_SHADOW_TABLE(uncore, gen12_shadowed_regs); - ASSIGN_WRITE_MMIO_VFUNCS(uncore, gen12_fwtable); + ASSIGN_WRITE_MMIO_VFUNCS(uncore, fwtable); ASSIGN_READ
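The extended NEEDS_FORCE_WAKE() check from the commit message can be written as a small predicate. This is only a sketch: `DEMO_LOW_LIMIT` and `DEMO_MEDIA_BASE` are illustrative values chosen for the example, not constants quoted from the patch, and `demo_needs_force_wake()` is a hypothetical name. An inline function evaluates its argument exactly once, which is the same goal the `__reg` local serves in the macro version.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative boundaries only; substitute the real limits for a given platform. */
#define DEMO_LOW_LIMIT  0x40000u   /* hypothetical upper bound of the low GT range */
#define DEMO_MEDIA_BASE 0x1c0000u  /* hypothetical stand-in for GEN11_BSD_RING_BASE */

/*
 * True for offsets in the low GT range or at/above the media engine base;
 * everything in between (e.g. display registers) takes the fast path.
 */
static inline bool demo_needs_force_wake(uint32_t reg)
{
	return reg < DEMO_LOW_LIMIT || reg >= DEMO_MEDIA_BASE;
}
```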
[PATCH v2 1/6] drm/i915/uncore: Convert gen6/gen7 read operations to fwtable
On gen6-gen8 (except vlv/chv) we don't use a forcewake lookup table; we simply check whether the register offset is < 0x4, and return FORCEWAKE_RENDER if it is. To prepare for upcoming refactoring, let's define a single-entry forcewake table from [0x0, 0x3] and switch these platforms over to use the fwtable reader functions. v2: - Drop __gen6_reg_read_fw_domains which is no longer used. (Tvrtko) Cc: Tvrtko Ursulin Signed-off-by: Matt Roper --- drivers/gpu/drm/i915/intel_uncore.c | 21 - 1 file changed, 8 insertions(+), 13 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c index f9767054dbdf..8c09af1e9f7a 100644 --- a/drivers/gpu/drm/i915/intel_uncore.c +++ b/drivers/gpu/drm/i915/intel_uncore.c @@ -853,16 +853,6 @@ void assert_forcewakes_active(struct intel_uncore *uncore, /* We give fast paths for the really cool registers */ #define NEEDS_FORCE_WAKE(reg) ((reg) < 0x4) -#define __gen6_reg_read_fw_domains(uncore, offset) \ -({ \ - enum forcewake_domains __fwd; \ - if (NEEDS_FORCE_WAKE(offset)) \ - __fwd = FORCEWAKE_RENDER; \ - else \ - __fwd = 0; \ - __fwd; \ -}) - static int fw_range_cmp(u32 offset, const struct intel_forcewake_range *entry) { if (offset < entry->start) @@ -1064,6 +1054,10 @@ gen6_reg_write_fw_domains(struct intel_uncore *uncore, i915_reg_t reg) __fwd; \ }) +static const struct intel_forcewake_range __gen6_fw_ranges[] = { + GEN_FW_RANGE(0x0, 0x3, FORCEWAKE_RENDER), +}; + /* *Must* be sorted by offset ranges! See intel_fw_table_check(). 
*/ static const struct intel_forcewake_range __chv_fw_ranges[] = { GEN_FW_RANGE(0x2000, 0x3fff, FORCEWAKE_RENDER), @@ -1623,7 +1617,6 @@ __gen_read(func, 64) __gen_reg_read_funcs(gen11_fwtable); __gen_reg_read_funcs(fwtable); -__gen_reg_read_funcs(gen6); #undef __gen_reg_read_funcs #undef GEN6_READ_FOOTER @@ -2111,15 +2104,17 @@ static int uncore_forcewake_init(struct intel_uncore *uncore) ASSIGN_WRITE_MMIO_VFUNCS(uncore, fwtable); ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable); } else if (GRAPHICS_VER(i915) == 8) { + ASSIGN_FW_DOMAINS_TABLE(uncore, __gen6_fw_ranges); ASSIGN_WRITE_MMIO_VFUNCS(uncore, gen8); - ASSIGN_READ_MMIO_VFUNCS(uncore, gen6); + ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable); } else if (IS_VALLEYVIEW(i915)) { ASSIGN_FW_DOMAINS_TABLE(uncore, __vlv_fw_ranges); ASSIGN_WRITE_MMIO_VFUNCS(uncore, gen6); ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable); } else if (IS_GRAPHICS_VER(i915, 6, 7)) { + ASSIGN_FW_DOMAINS_TABLE(uncore, __gen6_fw_ranges); ASSIGN_WRITE_MMIO_VFUNCS(uncore, gen6); - ASSIGN_READ_MMIO_VFUNCS(uncore, gen6); + ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable); } uncore->pmic_bus_access_nb.notifier_call = i915_pmic_bus_access_notifier; -- 2.25.4
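To see why a single-entry table can stand in for the open-coded range check, the two approaches can be compared side by side. The sketch below makes several assumptions: `DEMO_GT_LIMIT` is an illustrative boundary rather than the value used by the driver, `struct fw_range_sketch` and both functions are hypothetical, and the lookup is a plain binary search standing in for the driver's `find_fw_domain()`.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_FW_RENDER 0x1u       /* hypothetical domain bit */
#define DEMO_GT_LIMIT  0x40000u   /* illustrative boundary, not taken from the patch */

/* Hypothetical forcewake range: inclusive bounds plus the domains to wake. */
struct fw_range_sketch {
	uint32_t start;
	uint32_t end;
	unsigned int domains;
};

/* Single-entry table covering exactly the offsets the open-coded check accepts. */
static const struct fw_range_sketch demo_single_entry[] = {
	{ 0x0, DEMO_GT_LIMIT - 1, DEMO_FW_RENDER },
};

/* Old style: open-coded comparison against the boundary. */
static unsigned int open_coded_domains(uint32_t offset)
{
	return offset < DEMO_GT_LIMIT ? DEMO_FW_RENDER : 0;
}

/* New style: binary search over a sorted table of inclusive ranges. */
static unsigned int table_domains(const struct fw_range_sketch *table,
				  size_t n, uint32_t offset)
{
	size_t lo = 0, hi = n;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (offset < table[mid].start)
			hi = mid;
		else if (offset > table[mid].end)
			lo = mid + 1;
		else
			return table[mid].domains;
	}
	return 0;
}
```

Both functions return the same result for every offset, so switching these platforms to the table-driven reader should not change behavior; it only routes them through the shared code path.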
[PATCH v2 0/6] i915: Simplify mmio handling & add new DG2 shadow table
Our uncore MMIO functions for reading/writing registers have become very
complicated over time.  There's significant macro magic used to generate
several nearly-identical functions that only really differ in terms of
which platform-specific shadow register table they should check on write
operations.  We can significantly simplify our MMIO handlers by storing a
reference to the current platform's shadow table within the 'struct
intel_uncore' the same way we already do for forcewake; this allows us
to consolidate the multiple variants of each 'write' function down to
just a single 'fwtable' version that gets the shadow table out of the
uncore struct rather than hardcoding the name of a specific platform's
table.  We can do similar consolidation on the MMIO read side by
creating a single-entry forcewake table to replace the open-coded range
check they had been using previously.

The final patch of the series adds a new shadow table for DG2; this
becomes quite clean and simple now, given the refactoring in the first
five patches.

Aside from simplifying the code significantly, this series reduces the
size of the generated .ko in exchange for adding an extra pointer
indirection to access the tables.  The size deltas (for just the first
five patches, before we add an additional table in the final patch) are:

Old:
    $ size drivers/gpu/drm/i915/i915.ko
       text    data     bss     dec     hex filename
    2865921   88972    2912 2957805  2d21ed drivers/gpu/drm/i915/i915.ko

New:
    $ size drivers/gpu/drm/i915/i915.ko
       text    data     bss     dec     hex filename
    2854181   88236    2912 2945329  2cf131 drivers/gpu/drm/i915/i915.ko

The code size deltas will become larger as we add more platforms; we
already add one new platform table in the final patch of this series and
our next few platforms are all expected to bring new shadow tables as
well.

I don't think the impact of the indirect table reference for shadow
tables should be a concern for a few reasons:

 * The stored table + indirect lookup design is already deemed good
   enough for forcewake, which is used more frequently (both reads and
   writes, compared to shadow tables which are only used for writes) and
   operates on much larger tables.
 * Performance-critical sections of the code or those reading/writing
   lots of registers in a batch usually do an explicit grab of the
   relevant forcewake domains and then perform their MMIO operations via
   *_fw() functions without considering shadowed registers and bypassing
   all of the table lookups.
 * In v2 of the series, we still apply NEEDS_FORCE_WAKE() checks that
   will bypass all of the forcewake and shadow logic for display
   register writes.

v2:
 - Drop orphaned definition of __gen6_reg_read_fw_domains.  (Tvrtko)
 - Restore NEEDS_FORCE_WAKE() check to
   __fwtable_reg_{read,write}_fw_domains, but update the definition of
   NEEDS_FORCE_WAKE to also return 'true' on offsets above
   GEN11_BSD_RING_BASE for compatibility with gen11+ platforms.  (Chris,
   Tvrtko).

Cc: Tvrtko Ursulin
Cc: Chris Wilson

Matt Roper (6):
  drm/i915/uncore: Convert gen6/gen7 read operations to fwtable
  drm/i915/uncore: Associate shadow table with uncore
  drm/i915/uncore: Replace gen8 write functions with general fwtable
  drm/i915/uncore: Drop gen11/gen12 mmio write handlers
  drm/i915/uncore: Drop gen11 mmio read handlers
  drm/i915/dg2: Add DG2-specific shadow register table

 drivers/gpu/drm/i915/intel_uncore.c           | 200 ++
 drivers/gpu/drm/i915/intel_uncore.h           |   7 +
 drivers/gpu/drm/i915/selftests/intel_uncore.c |   1 +
 3 files changed, 115 insertions(+), 93 deletions(-)

-- 
2.25.4
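
For readers who want to picture the "stored table + indirect lookup" design the cover letter describes, here is a minimal userspace sketch. It is not the i915 code: struct shadow_table, struct uncore_sketch, is_shadowed() and the register offsets are all hypothetical names and values made up for illustration, and the binary search is simply one plausible way to do the lookup. The only idea taken from the cover letter is that the write path consults whichever table pointer was attached at init time, instead of naming one platform's table.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a per-platform shadow register table. */
struct shadow_table {
	const uint32_t *regs;	/* register offsets, sorted ascending */
	size_t count;
};

/* Hypothetical stand-in for the relevant part of 'struct intel_uncore'. */
struct uncore_sketch {
	const struct shadow_table *shadow;	/* chosen once at init time */
};

/* Illustrative offsets only; this is not a real platform's table. */
static const uint32_t demo_shadow_regs[] = { 0x2030, 0x2550, 0x12030, 0x22030 };

static const struct shadow_table demo_shadow_table = {
	.regs = demo_shadow_regs,
	.count = sizeof(demo_shadow_regs) / sizeof(demo_shadow_regs[0]),
};

/*
 * One generic lookup for every platform: binary-search whichever table is
 * attached to the uncore. The cost compared with a hardcoded table name is
 * the extra pointer dereference of uncore->shadow.
 */
static bool is_shadowed(const struct uncore_sketch *uncore, uint32_t offset)
{
	const struct shadow_table *t;
	size_t lo = 0, hi;

	assert(uncore != NULL);

	t = uncore->shadow;
	if (!t)
		return false;	/* platform without a shadow table */

	hi = t->count;
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (t->regs[mid] == offset)
			return true;
		if (t->regs[mid] < offset)
			lo = mid + 1;
		else
			hi = mid;
	}

	return false;
}
```

A caller would set `uncore.shadow = &demo_shadow_table` once during setup and then call `is_shadowed(&uncore, offset)` from a single write helper; adding a platform then means adding a table, not another generated function.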
Re: [Intel-gfx] [PATCH 08/27] drm/i915: Add logical engine mapping
On Fri, Sep 10, 2021 at 12:12:42PM +0100, Tvrtko Ursulin wrote:
>
> On 20/08/2021 23:44, Matthew Brost wrote:
> > Add logical engine mapping. This is required for split-frame, as
> > workloads need to be placed on engines in a logically contiguous manner.
> >
> > v2:
> >   (Daniel Vetter)
> >    - Add kernel doc for new fields
> >
> > Signed-off-by: Matthew Brost
> > ---
> >   drivers/gpu/drm/i915/gt/intel_engine_cs.c     | 60 ---
> >   drivers/gpu/drm/i915/gt/intel_engine_types.h  |  5 ++
> >   .../drm/i915/gt/intel_execlists_submission.c  |  1 +
> >   drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c    |  2 +-
> >   .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 21 +--
> >   5 files changed, 60 insertions(+), 29 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> > index 0d9105a31d84..4d790f9a65dd 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> > @@ -290,7 +290,8 @@ static void nop_irq_handler(struct intel_engine_cs *engine, u16 iir)
> >   	GEM_DEBUG_WARN_ON(iir);
> >   }
> > -static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id)
> > +static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id,
> > +			      u8 logical_instance)
> >   {
> >   	const struct engine_info *info = &intel_engines[id];
> >   	struct drm_i915_private *i915 = gt->i915;
> > @@ -334,6 +335,7 @@ static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id)
> >   	engine->class = info->class;
> >   	engine->instance = info->instance;
> > +	engine->logical_mask = BIT(logical_instance);
> >   	__sprint_engine_name(engine);
> >   	engine->props.heartbeat_interval_ms =
> > @@ -572,6 +574,37 @@ static intel_engine_mask_t init_engine_mask(struct intel_gt *gt)
> >   	return info->engine_mask;
> >   }
> > +static void populate_logical_ids(struct intel_gt *gt, u8 *logical_ids,
> > +				 u8 class, const u8 *map, u8 num_instances)
> > +{
> > +	int i, j;
> > +	u8 current_logical_id = 0;
> > +
> > +	for (j = 0; j < num_instances; ++j) {
> > +		for (i = 0; i < ARRAY_SIZE(intel_engines); ++i) {
> > +			if (!HAS_ENGINE(gt, i) ||
> > +			    intel_engines[i].class != class)
> > +				continue;
> > +
> > +			if (intel_engines[i].instance == map[j]) {
> > +				logical_ids[intel_engines[i].instance] =
> > +					current_logical_id++;
> > +				break;
> > +			}
> > +		}
> > +	}
> > +}
> > +
> > +static void setup_logical_ids(struct intel_gt *gt, u8 *logical_ids, u8 class)
> > +{
> > +	int i;
> > +	u8 map[MAX_ENGINE_INSTANCE + 1];
> > +
> > +	for (i = 0; i < MAX_ENGINE_INSTANCE + 1; ++i)
> > +		map[i] = i;
>
> What's the point of the map array since it is 1:1 with instance?
>

Future products do not have a 1 to 1 mapping and that mapping can change
based on fusing, e.g. XeHP SDV.

Also technically ICL / TGL / ADL physical instance 2 maps to logical
instance 1.

> > +	populate_logical_ids(gt, logical_ids, class, map, ARRAY_SIZE(map));
> > +}
> > +
> >   /**
> >    * intel_engines_init_mmio() - allocate and prepare the Engine Command Streamers
> >    * @gt: pointer to struct intel_gt
> > @@ -583,7 +616,8 @@ int intel_engines_init_mmio(struct intel_gt *gt)
> >   	struct drm_i915_private *i915 = gt->i915;
> >   	const unsigned int engine_mask = init_engine_mask(gt);
> >   	unsigned int mask = 0;
> > -	unsigned int i;
> > +	unsigned int i, class;
> > +	u8 logical_ids[MAX_ENGINE_INSTANCE + 1];
> >   	int err;
> >   	drm_WARN_ON(&i915->drm, engine_mask == 0);
> > @@ -593,15 +627,23 @@ int intel_engines_init_mmio(struct intel_gt *gt)
> >   	if (i915_inject_probe_failure(i915))
> >   		return -ENODEV;
> > -	for (i = 0; i < ARRAY_SIZE(intel_engines); i++) {
> > -		if (!HAS_ENGINE(gt, i))
> > -			continue;
> > +	for (class = 0; class < MAX_ENGINE_CLASS + 1; ++class) {
> > +		setup_logical_ids(gt, logical_ids, class);
> > -		err = intel_engine_setup(gt, i);
> > -		if (err)
> > -			goto cleanup;
> > +		for (i = 0; i < ARRAY_SIZE(intel_engines); ++i) {
> > +			u8 instance = intel_engines[i].instance;
> > +
> > +			if (intel_engines[i].class != class ||
> > +			    !HAS_ENGINE(gt, i))
> > +				continue;
> > -		mask |= BIT(i);
> > +			err = intel_engine_setup(gt, i,
> > +						 logical_ids[instance]);
> > +			if (err)
> > +				goto cleanup;
> > +
> > +			mask |= BIT(i);
>
> I still think there is a less clu
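
To make the physical-to-logical numbering discussed in this thread concrete, the following is a simplified userspace sketch of the idea, not the code from the patch. assign_logical_ids() and MAX_INSTANCES are hypothetical names, and a plain bitmask stands in for the HAS_ENGINE() and intel_engines[] checks of one engine class. It walks the map in order and hands out consecutive logical ids only to instances that are present.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_INSTANCES 8		/* hypothetical upper bound for this sketch */

/*
 * Assign consecutive logical ids to the physical instances that exist.
 * Bit N of 'present_mask' is set when physical instance N is present;
 * 'map' lists physical instances in the order they should be numbered.
 * 'logical_ids' must hold MAX_INSTANCES entries; entries for absent
 * instances are left untouched.
 */
static void assign_logical_ids(uint32_t present_mask, const uint8_t *map,
			       size_t num_instances, uint8_t *logical_ids)
{
	uint8_t next_logical_id = 0;
	size_t j;

	assert(num_instances <= MAX_INSTANCES);

	for (j = 0; j < num_instances; ++j) {
		uint8_t phys = map[j];

		/* Fused-off or out-of-range instances consume no logical id. */
		if (phys >= MAX_INSTANCES || !(present_mask & (1u << phys)))
			continue;

		logical_ids[phys] = next_logical_id++;
	}
}
```

With an identity map and only physical instances 0 and 2 present (mask 0x5), instance 2 ends up with logical id 1, which matches the ICL / TGL / ADL remark above. A non-identity map would let a platform renumber engines without touching the loop.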
[PATCH 1/1] drm/amdkfd: Add sysfs bitfields and enums to uAPI
These bits are de-facto part of the uAPI, so declare them in a uAPI header. Signed-off-by: Felix Kuehling --- MAINTAINERS | 1 + drivers/gpu/drm/amd/amdkfd/kfd_topology.h | 46 + include/uapi/linux/kfd_sysfs.h| 108 ++ 3 files changed, 110 insertions(+), 45 deletions(-) create mode 100644 include/uapi/linux/kfd_sysfs.h diff --git a/MAINTAINERS b/MAINTAINERS index 84cd16694640..7554ec928ee2 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -930,6 +930,7 @@ F: drivers/gpu/drm/amd/include/kgd_kfd_interface.h F: drivers/gpu/drm/amd/include/v9_structs.h F: drivers/gpu/drm/amd/include/vi_structs.h F: include/uapi/linux/kfd_ioctl.h +F: include/uapi/linux/kfd_sysfs.h AMD SPI DRIVER M: Sanjay R Mehta diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.h b/drivers/gpu/drm/amd/amdkfd/kfd_topology.h index a8db017c9b8e..f0cc59d2fd5d 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.h +++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.h @@ -25,38 +25,11 @@ #include #include +#include #include "kfd_crat.h" #define KFD_TOPOLOGY_PUBLIC_NAME_SIZE 32 -#define HSA_CAP_HOT_PLUGGABLE 0x0001 -#define HSA_CAP_ATS_PRESENT0x0002 -#define HSA_CAP_SHARED_WITH_GRAPHICS 0x0004 -#define HSA_CAP_QUEUE_SIZE_POW20x0008 -#define HSA_CAP_QUEUE_SIZE_32BIT 0x0010 -#define HSA_CAP_QUEUE_IDLE_EVENT 0x0020 -#define HSA_CAP_VA_LIMIT 0x0040 -#define HSA_CAP_WATCH_POINTS_SUPPORTED 0x0080 -#define HSA_CAP_WATCH_POINTS_TOTALBITS_MASK0x0f00 -#define HSA_CAP_WATCH_POINTS_TOTALBITS_SHIFT 8 -#define HSA_CAP_DOORBELL_TYPE_TOTALBITS_MASK 0x3000 -#define HSA_CAP_DOORBELL_TYPE_TOTALBITS_SHIFT 12 - -#define HSA_CAP_DOORBELL_TYPE_PRE_1_0 0x0 -#define HSA_CAP_DOORBELL_TYPE_1_0 0x1 -#define HSA_CAP_DOORBELL_TYPE_2_0 0x2 -#define HSA_CAP_AQL_QUEUE_DOUBLE_MAP 0x4000 - -#define HSA_CAP_RESERVED_WAS_SRAM_EDCSUPPORTED 0x0008 /* Old buggy user mode depends on this being 0 */ -#define HSA_CAP_MEM_EDCSUPPORTED 0x0010 -#define HSA_CAP_RASEVENTNOTIFY 0x0020 -#define HSA_CAP_ASIC_REVISION_MASK 0x03c0 -#define HSA_CAP_ASIC_REVISION_SHIFT22 
-#define HSA_CAP_SRAM_EDCSUPPORTED 0x0400 -#define HSA_CAP_SVMAPI_SUPPORTED 0x0800 -#define HSA_CAP_FLAGS_COHERENTHOSTACCESS 0x1000 -#define HSA_CAP_RESERVED 0xe00f8000 - struct kfd_node_properties { uint64_t hive_id; uint32_t cpu_cores_count; @@ -93,17 +66,6 @@ struct kfd_node_properties { char name[KFD_TOPOLOGY_PUBLIC_NAME_SIZE]; }; -#define HSA_MEM_HEAP_TYPE_SYSTEM 0 -#define HSA_MEM_HEAP_TYPE_FB_PUBLIC1 -#define HSA_MEM_HEAP_TYPE_FB_PRIVATE 2 -#define HSA_MEM_HEAP_TYPE_GPU_GDS 3 -#define HSA_MEM_HEAP_TYPE_GPU_LDS 4 -#define HSA_MEM_HEAP_TYPE_GPU_SCRATCH 5 - -#define HSA_MEM_FLAGS_HOT_PLUGGABLE0x0001 -#define HSA_MEM_FLAGS_NON_VOLATILE 0x0002 -#define HSA_MEM_FLAGS_RESERVED 0xfffc - struct kfd_mem_properties { struct list_headlist; uint32_theap_type; @@ -116,12 +78,6 @@ struct kfd_mem_properties { struct attributeattr; }; -#define HSA_CACHE_TYPE_DATA0x0001 -#define HSA_CACHE_TYPE_INSTRUCTION 0x0002 -#define HSA_CACHE_TYPE_CPU 0x0004 -#define HSA_CACHE_TYPE_HSACU 0x0008 -#define HSA_CACHE_TYPE_RESERVED0xfff0 - struct kfd_cache_properties { struct list_headlist; uint32_tprocessor_id_low; diff --git a/include/uapi/linux/kfd_sysfs.h b/include/uapi/linux/kfd_sysfs.h new file mode 100644 index ..e1fb78b4bf09 --- /dev/null +++ b/include/uapi/linux/kfd_sysfs.h @@ -0,0 +1,108 @@ +/* SPDX-License-Identifier: GPL-2.0 OR MIT WITH Linux-syscall-note */ +/* + * Copyright 2021 Advanced Micro Devices, Inc. 
+ * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONI
Re: [virtio-dev] [PATCH v1 09/12] drm/virtio: implement context init: allocate an array of fence contexts
On Wed, Sep 8, 2021 at 6:37 PM Gurchetan Singh wrote:
>
> We don't want fences from different 3D contexts (virgl, gfxstream,
> venus) to be on the same timeline.  With explicit context creation,
> we can specify the number of ring each context wants.
>
> Execbuffer can specify which ring to use.
>
> Signed-off-by: Gurchetan Singh
> Acked-by: Lingfeng Yang
> ---
>  drivers/gpu/drm/virtio/virtgpu_drv.h   |  3 +++
>  drivers/gpu/drm/virtio/virtgpu_ioctl.c | 34 --
>  2 files changed, 35 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.h b/drivers/gpu/drm/virtio/virtgpu_drv.h
> index a5142d60c2fa..cca9ab505deb 100644
> --- a/drivers/gpu/drm/virtio/virtgpu_drv.h
> +++ b/drivers/gpu/drm/virtio/virtgpu_drv.h
> @@ -56,6 +56,7 @@
>  #define STATE_ERR 2
>
>  #define MAX_CAPSET_ID 63
> +#define MAX_RINGS 64
>
>  struct virtio_gpu_object_params {
>  	unsigned long size;
> @@ -263,6 +264,8 @@ struct virtio_gpu_fpriv {
>  	uint32_t ctx_id;
>  	uint32_t context_init;
>  	bool context_created;
> +	uint32_t num_rings;
> +	uint64_t base_fence_ctx;
>  	struct mutex context_lock;
>  };
>
> diff --git a/drivers/gpu/drm/virtio/virtgpu_ioctl.c b/drivers/gpu/drm/virtio/virtgpu_ioctl.c
> index f51f3393a194..262f79210283 100644
> --- a/drivers/gpu/drm/virtio/virtgpu_ioctl.c
> +++ b/drivers/gpu/drm/virtio/virtgpu_ioctl.c
> @@ -99,6 +99,11 @@ static int virtio_gpu_execbuffer_ioctl(struct drm_device *dev, void *data,
>  	int in_fence_fd = exbuf->fence_fd;
>  	int out_fence_fd = -1;
>  	void *buf;
> +	uint64_t fence_ctx;
> +	uint32_t ring_idx;
> +
> +	fence_ctx = vgdev->fence_drv.context;
> +	ring_idx = 0;
>
>  	if (vgdev->has_virgl_3d == false)
>  		return -ENOSYS;
> @@ -106,6 +111,17 @@ static int virtio_gpu_execbuffer_ioctl(struct drm_device *dev, void *data,
>  	if ((exbuf->flags & ~VIRTGPU_EXECBUF_FLAGS))
>  		return -EINVAL;
>
> +	if ((exbuf->flags & VIRTGPU_EXECBUF_RING_IDX)) {
> +		if (exbuf->ring_idx >= vfpriv->num_rings)
> +			return -EINVAL;
> +
> +		if (!vfpriv->base_fence_ctx)
> +			return -EINVAL;
> +
> +		fence_ctx = vfpriv->base_fence_ctx;
> +		ring_idx = exbuf->ring_idx;
> +	}
> +
>  	exbuf->fence_fd = -1;
>
>  	virtio_gpu_create_context(dev, file);
> @@ -173,7 +189,7 @@ static int virtio_gpu_execbuffer_ioctl(struct drm_device *dev, void *data,
>  		goto out_memdup;
>  	}
>
> -	out_fence = virtio_gpu_fence_alloc(vgdev, vgdev->fence_drv.context, 0);
> +	out_fence = virtio_gpu_fence_alloc(vgdev, fence_ctx, ring_idx);
>  	if(!out_fence) {
>  		ret = -ENOMEM;
>  		goto out_unresv;
> @@ -691,7 +707,7 @@ static int virtio_gpu_context_init_ioctl(struct drm_device *dev,
>  		return -EINVAL;
>
>  	/* Number of unique parameters supported at this time. */
> -	if (num_params > 1)
> +	if (num_params > 2)
>  		return -EINVAL;
>
>  	ctx_set_params = memdup_user(u64_to_user_ptr(args->ctx_set_params),
> @@ -731,6 +747,20 @@ static int virtio_gpu_context_init_ioctl(struct drm_device *dev,
>
>  			vfpriv->context_init |= value;
>  			break;
> +		case VIRTGPU_CONTEXT_PARAM_NUM_RINGS:
> +			if (vfpriv->base_fence_ctx) {
> +				ret = -EINVAL;
> +				goto out_unlock;
> +			}
> +
> +			if (value > MAX_RINGS) {
> +				ret = -EINVAL;
> +				goto out_unlock;
> +			}
> +
> +			vfpriv->base_fence_ctx = dma_fence_context_alloc(value);

With multiple fence contexts, we should do something about implicit
fencing.

The classic example is Mesa and X server.  When both use virgl and the
global fence context, no dma_fence_wait is fine.  But when Mesa uses
venus and the ring fence context, dma_fence_wait should be inserted.

> +			vfpriv->num_rings = value;
> +			break;
>  		default:
>  			ret = -EINVAL;
>  			goto out_unlock;
> --
> 2.33.0.153.gba50c8fa24-goog
>
>
> -
> To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
> For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org
>
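
The per-context ring bookkeeping in the quoted patch can be modelled in a few lines of userspace C. This is a sketch under stated assumptions, not the virtio-gpu code: struct ctx_sketch, fence_context_alloc(), ctx_set_num_rings() and ctx_pick_timeline() are hypothetical names, the allocator is a plain counter standing in for dma_fence_context_alloc(), and folding the result into base + ring_idx is an assumption made here (the patch passes the base context and the ring index to virtio_gpu_fence_alloc() separately). The validation order mirrors the checks in the diff.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define MAX_RINGS 64

/*
 * Userspace stand-in for a fence context allocator: hands out 'num'
 * consecutive ids and returns the first. It starts at 1 in this sketch so
 * that 0 can mean "not allocated yet", like the !base_fence_ctx test above.
 */
static uint64_t next_fence_ctx = 1;

static uint64_t fence_context_alloc(unsigned int num)
{
	uint64_t base = next_fence_ctx;

	next_fence_ctx += num;
	return base;
}

/* Hypothetical stand-in for the two fields added to virtio_gpu_fpriv. */
struct ctx_sketch {
	uint32_t num_rings;
	uint64_t base_fence_ctx;
};

/* Models the NUM_RINGS context parameter: may only be set once, bounded. */
static int ctx_set_num_rings(struct ctx_sketch *ctx, uint64_t value)
{
	if (ctx->base_fence_ctx)
		return -EINVAL;
	if (value > MAX_RINGS)
		return -EINVAL;

	ctx->base_fence_ctx = fence_context_alloc((unsigned int)value);
	ctx->num_rings = (uint32_t)value;
	return 0;
}

/* Models the execbuffer-side checks and picks one timeline per ring. */
static int ctx_pick_timeline(const struct ctx_sketch *ctx, uint32_t ring_idx,
			     uint64_t *fence_ctx)
{
	if (ring_idx >= ctx->num_rings)
		return -EINVAL;
	if (!ctx->base_fence_ctx)
		return -EINVAL;

	*fence_ctx = ctx->base_fence_ctx + ring_idx;	/* assumption, see above */
	return 0;
}
```

Two contexts that each call ctx_set_num_rings() end up with disjoint id ranges, which is the property the commit message asks for. The reviewer's implicit-fencing concern is outside what this sketch covers.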
Re: [git pull] drm fixes for 5.15-rc1
The pull request you sent on Fri, 10 Sep 2021 16:35:59 +1000:

> git://anongit.freedesktop.org/drm/drm tags/drm-next-2021-09-10

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/a668acb8f01fc0d1e3877cddecbe319ef2ef651c

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html
Re: [Intel-gfx] [PATCH v9 05/17] drm/i915/pxp: Implement funcs to create the TEE channel
On Fri, Sep 10, 2021 at 08:36:15AM -0700, Daniele Ceraolo Spurio wrote:
> From: "Huang, Sean Z"
>
> Implement the funcs to create the TEE channel, so kernel can
> send the TEE commands directly to TEE for creating the arbitrary
> (default) session.
>
> v2: fix locking, don't pollute dev_priv (Chris)
>
> v3: wait for mei PXP component to be bound.
>
> v4: drop the wait, as the component might be bound after i915 load
> completes. We'll instead check when sending a tee message.
>
> v5: fix an issue with mei_pxp module removal
>
> v6: don't use fetch_and_zero in fini (Rodrigo)
>
> Signed-off-by: Huang, Sean Z
> Signed-off-by: Daniele Ceraolo Spurio
> Cc: Chris Wilson

Reviewed-by: Rodrigo Vivi

> ---
>  drivers/gpu/drm/i915/Makefile              |  3 +-
>  drivers/gpu/drm/i915/pxp/intel_pxp.c       | 13
>  drivers/gpu/drm/i915/pxp/intel_pxp_tee.c   | 79 ++
>  drivers/gpu/drm/i915/pxp/intel_pxp_tee.h   | 14
>  drivers/gpu/drm/i915/pxp/intel_pxp_types.h |  6 ++
>  5 files changed, 114 insertions(+), 1 deletion(-)
>  create mode 100644 drivers/gpu/drm/i915/pxp/intel_pxp_tee.c
>  create mode 100644 drivers/gpu/drm/i915/pxp/intel_pxp_tee.h
>
> diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
> index 23f5bc268962..d39bd0cefc64 100644
> --- a/drivers/gpu/drm/i915/Makefile
> +++ b/drivers/gpu/drm/i915/Makefile
> @@ -283,7 +283,8 @@ i915-y += i915_perf.o
>
>  # Protected execution platform (PXP) support
>  i915-$(CONFIG_DRM_I915_PXP) += \
> -	pxp/intel_pxp.o
> +	pxp/intel_pxp.o \
> +	pxp/intel_pxp_tee.o
>
>  # Post-mortem debug and GPU hang state capture
>  i915-$(CONFIG_DRM_I915_CAPTURE_ERROR) += i915_gpu_error.o
> diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp.c b/drivers/gpu/drm/i915/pxp/intel_pxp.c
> index 7b2053902146..400deaea2d8a 100644
> --- a/drivers/gpu/drm/i915/pxp/intel_pxp.c
> +++ b/drivers/gpu/drm/i915/pxp/intel_pxp.c
> @@ -3,6 +3,7 @@
>   * Copyright(c) 2020 Intel Corporation.
>   */
>  #include "intel_pxp.h"
> +#include "intel_pxp_tee.h"
>  #include "gt/intel_context.h"
>  #include "i915_drv.h"
>
> @@ -50,7 +51,16 @@ void intel_pxp_init(struct intel_pxp *pxp)
>  	if (ret)
>  		return;
>
> +	ret = intel_pxp_tee_component_init(pxp);
> +	if (ret)
> +		goto out_context;
> +
>  	drm_info(&gt->i915->drm, "Protected Xe Path (PXP) protected content support initialized\n");
> +
> +	return;
> +
> +out_context:
> +	destroy_vcs_context(pxp);
>  }
>
>  void intel_pxp_fini(struct intel_pxp *pxp)
> @@ -58,5 +68,8 @@ void intel_pxp_fini(struct intel_pxp *pxp)
>  	if (!intel_pxp_is_enabled(pxp))
>  		return;
>
> +	intel_pxp_tee_component_fini(pxp);
> +
>  	destroy_vcs_context(pxp);
> +
>  }
> diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp_tee.c b/drivers/gpu/drm/i915/pxp/intel_pxp_tee.c
> new file mode 100644
> index ..f1d8de832653
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/pxp/intel_pxp_tee.c
> @@ -0,0 +1,79 @@
> +// SPDX-License-Identifier: MIT
> +/*
> + * Copyright(c) 2020 Intel Corporation.
> + */
> +
> +#include
> +#include "drm/i915_pxp_tee_interface.h"
> +#include "drm/i915_component.h"
> +#include "i915_drv.h"
> +#include "intel_pxp.h"
> +#include "intel_pxp_tee.h"
> +
> +static inline struct intel_pxp *i915_dev_to_pxp(struct device *i915_kdev)
> +{
> +	return &kdev_to_i915(i915_kdev)->gt.pxp;
> +}
> +
> +/**
> + * i915_pxp_tee_component_bind - bind function to pass the function pointers to pxp_tee
> + * @i915_kdev: pointer to i915 kernel device
> + * @tee_kdev: pointer to tee kernel device
> + * @data: pointer to pxp_tee_master containing the function pointers
> + *
> + * This bind function is called during the system boot or resume from system sleep.
> + *
> + * Return: return 0 if successful.
> + */
> +static int i915_pxp_tee_component_bind(struct device *i915_kdev,
> +				       struct device *tee_kdev, void *data)
> +{
> +	struct intel_pxp *pxp = i915_dev_to_pxp(i915_kdev);
> +
> +	pxp->pxp_component = data;
> +	pxp->pxp_component->tee_dev = tee_kdev;
> +
> +	return 0;
> +}
> +
> +static void i915_pxp_tee_component_unbind(struct device *i915_kdev,
> +					  struct device *tee_kdev, void *data)
> +{
> +	struct intel_pxp *pxp = i915_dev_to_pxp(i915_kdev);
> +
> +	pxp->pxp_component = NULL;
> +}
> +
> +static const struct component_ops i915_pxp_tee_component_ops = {
> +	.bind   = i915_pxp_tee_component_bind,
> +	.unbind = i915_pxp_tee_component_unbind,
> +};
> +
> +int intel_pxp_tee_component_init(struct intel_pxp *pxp)
> +{
> +	int ret;
> +	struct intel_gt *gt = pxp_to_gt(pxp);
> +	struct drm_i915_private *i915 = gt->i915;
> +
> +	ret = component_add_typed(i915->drm.dev, &i915_pxp_tee_component_ops,
> +				  I915_COMPONENT_PXP);
> +	if (ret < 0) {
> +		drm_err(&i915->drm, "
Re: [PATCH v9 10/17] drm/i915/pxp: interfaces for using protected objects
On Fri, Sep 10, 2021 at 08:36:20AM -0700, Daniele Ceraolo Spurio wrote: > This api allow user mode to create protected buffers and to mark > contexts as making use of such objects. Only when using contexts > marked in such a way is the execution guaranteed to work as expected. > > Contexts can only be marked as using protected content at creation time > (i.e. the parameter is immutable) and they must be both bannable and not > recoverable. Given that the protected session gets invalidated on > suspend, contexts created this way hold a runtime pm wakeref until > they're either destroyed or invalidated. > > All protected objects and contexts will be considered invalid when the > PXP session is destroyed and all new submissions using them will be > rejected. All intel contexts within the invalidated gem contexts will be > marked banned. Userspace can detect that an invalidation has occurred via > the RESET_STATS ioctl, where we report it the same way as a ban due to a > hang. > > v5: squash patches, rebase on proto_ctx, update kerneldoc > > v6: rebase on obj create_ext changes > > v7: Use session counter to check if an object it valid, hold wakeref in > context, don't add a new flag to RESET_STATS (Daniel) > > v8: don't increase guilty count for contexts banned during pxp > invalidation (Rodrigo) > > v9: better comments, avoid wakeref put race between pxp_inval and > context_close, add usage examples (Rodrigo) > > Signed-off-by: Daniele Ceraolo Spurio > Signed-off-by: Bommu Krishnaiah > Cc: Rodrigo Vivi > Cc: Chris Wilson > Cc: Lionel Landwerlin > Cc: Jason Ekstrand > Cc: Daniel Vetter Reviewed-by: Rodrigo Vivi > --- > drivers/gpu/drm/i915/gem/i915_gem_context.c | 98 --- > drivers/gpu/drm/i915/gem/i915_gem_context.h | 6 ++ > .../gpu/drm/i915/gem/i915_gem_context_types.h | 28 ++ > drivers/gpu/drm/i915/gem/i915_gem_create.c| 72 ++ > .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 18 > drivers/gpu/drm/i915/gem/i915_gem_object.c| 1 + > 
drivers/gpu/drm/i915/gem/i915_gem_object.h| 6 ++ > .../gpu/drm/i915/gem/i915_gem_object_types.h | 8 ++ > .../gpu/drm/i915/gem/selftests/mock_context.c | 4 +- > drivers/gpu/drm/i915/pxp/intel_pxp.c | 78 +++ > drivers/gpu/drm/i915/pxp/intel_pxp.h | 12 +++ > drivers/gpu/drm/i915/pxp/intel_pxp_session.c | 6 ++ > drivers/gpu/drm/i915/pxp/intel_pxp_types.h| 9 ++ > include/uapi/drm/i915_drm.h | 96 +- > 14 files changed, 407 insertions(+), 35 deletions(-) > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c > b/drivers/gpu/drm/i915/gem/i915_gem_context.c > index c2ab0e22db0a..3418be4f727f 100644 > --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c > +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c > @@ -77,6 +77,8 @@ > #include "gt/intel_gpu_commands.h" > #include "gt/intel_ring.h" > > +#include "pxp/intel_pxp.h" > + > #include "i915_gem_context.h" > #include "i915_trace.h" > #include "i915_user_extensions.h" > @@ -186,10 +188,13 @@ static int validate_priority(struct drm_i915_private > *i915, > return 0; > } > > -static void proto_context_close(struct i915_gem_proto_context *pc) > +static void proto_context_close(struct drm_i915_private *i915, > + struct i915_gem_proto_context *pc) > { > int i; > > + if (pc->pxp_wakeref) > + intel_runtime_pm_put(&i915->runtime_pm, pc->pxp_wakeref); > if (pc->vm) > i915_vm_put(pc->vm); > if (pc->user_engines) { > @@ -241,6 +246,33 @@ static int proto_context_set_persistence(struct > drm_i915_private *i915, > return 0; > } > > +static int proto_context_set_protected(struct drm_i915_private *i915, > +struct i915_gem_proto_context *pc, > +bool protected) > +{ > + int ret = 0; > + > + if (!intel_pxp_is_enabled(&i915->gt.pxp)) { > + ret = -ENODEV; > + } else if (!protected) { > + pc->uses_protected_content = false; > + } else if ((pc->user_flags & BIT(UCONTEXT_RECOVERABLE)) || > +!(pc->user_flags & BIT(UCONTEXT_BANNABLE))) { > + ret = -EPERM; > + } else { > + pc->uses_protected_content = true; > + > + /* > + * protected context usage 
requires the PXP session to be up, > + * which in turn requires the device to be active. > + */ > + pc->pxp_wakeref = intel_runtime_pm_get(&i915->runtime_pm); > + ret = intel_pxp_wait_for_arb_start(&i915->gt.pxp); > + } > + > + return ret; > +} > + > static struct i915_gem_proto_context * > proto_context_create(struct drm_i915_private *i915, unsigned int flags) > { > @@ -269,7 +301,7 @@ proto_context_create(struct drm_i915_private *i915, > unsigned int flags) > return pc; > > proto_close: > - p
Re: [Intel-gfx] [PATCH v9 16/17] drm/i915/pxp: add PXP documentation
On Fri, Sep 10, 2021 at 08:36:26AM -0700, Daniele Ceraolo Spurio wrote: > Now that all the pieces are in place we can add a description of how the > feature works. Also modify the comments in struct intel_pxp into > kerneldoc. > > v2: improve doc (Rodrigo) > > Signed-off-by: Daniele Ceraolo Spurio > Cc: Daniel Vetter > Cc: Rodrigo Vivi Reviewed-by: Rodrigo Vivi > --- > Documentation/gpu/i915.rst | 8 > drivers/gpu/drm/i915/pxp/intel_pxp.c | 28 + > drivers/gpu/drm/i915/pxp/intel_pxp_types.h | 47 -- > 3 files changed, 71 insertions(+), 12 deletions(-) > > diff --git a/Documentation/gpu/i915.rst b/Documentation/gpu/i915.rst > index 101dde3eb1ea..78ecb9d5ec20 100644 > --- a/Documentation/gpu/i915.rst > +++ b/Documentation/gpu/i915.rst > @@ -471,6 +471,14 @@ Object Tiling IOCTLs > .. kernel-doc:: drivers/gpu/drm/i915/gem/i915_gem_tiling.c > :doc: buffer object tiling > > +Protected Objects > +- > + > +.. kernel-doc:: drivers/gpu/drm/i915/pxp/intel_pxp.c > + :doc: PXP > + > +.. kernel-doc:: drivers/gpu/drm/i915/pxp/intel_pxp_types.h > + > Microcontrollers > > > diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp.c > b/drivers/gpu/drm/i915/pxp/intel_pxp.c > index 97c6368fddc3..5610634f8929 100644 > --- a/drivers/gpu/drm/i915/pxp/intel_pxp.c > +++ b/drivers/gpu/drm/i915/pxp/intel_pxp.c > @@ -11,6 +11,34 @@ > #include "gt/intel_context.h" > #include "i915_drv.h" > > +/** > + * DOC: PXP > + * > + * PXP (Protected Xe Path) is a feature available in Gen12 and newer > platforms. > + * It allows execution and flip to display of protected (i.e. encrypted) > + * objects. The SW support is enabled via the CONFIG_DRM_I915_PXP kconfig. > + * > + * Objects can opt-in to PXP encryption at creation time via the > + * I915_GEM_CREATE_EXT_PROTECTED_CONTENT create_ext flag. For objects to be > + * correctly protected they must be used in conjunction with a context > created > + * with the I915_CONTEXT_PARAM_PROTECTED_CONTENT flag. 
See the documentation > + * of those two uapi flags for details and restrictions. > + * > + * Protected objects are tied to a pxp session; currently we only support one > + * session, which i915 manages and whose index is available in the uapi > + * (I915_PROTECTED_CONTENT_DEFAULT_SESSION) for use in instructions targeting > + * protected objects. > + * The session is invalidated by the HW when certain events occur (e.g. > + * suspend/resume). When this happens, all the objects that were used with > the > + * session are marked as invalid and all contexts marked as using protected > + * content are banned. Any further attempt at using them in an execbuf call > is > + * rejected, while flips are converted to black frames. > + * > + * Some of the PXP setup operations are performed by the Management Engine, > + * which is handled by the mei driver; communication between i915 and mei is > + * performed via the mei_pxp component module. > + */ > + > /* KCR register definitions */ > #define KCR_INIT _MMIO(0x320f0) > > diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp_types.h > b/drivers/gpu/drm/i915/pxp/intel_pxp_types.h > index ae24064bb57e..73ef7d1754e1 100644 > --- a/drivers/gpu/drm/i915/pxp/intel_pxp_types.h > +++ b/drivers/gpu/drm/i915/pxp/intel_pxp_types.h > @@ -16,42 +16,65 @@ > struct intel_context; > struct i915_pxp_component; > > +/** > + * struct intel_pxp - pxp state > + */ > struct intel_pxp { > + /** > + * @pxp_component: i915_pxp_component struct of the bound mei_pxp > + * module. Only set and cleared inside component bind/unbind functions, > + * which are protected by &tee_mutex. > + */ > struct i915_pxp_component *pxp_component; > + /** > + * @pxp_component_added: track if the pxp component has been added. > + * Set and cleared in tee init and fini functions respectively. 
> + */ > bool pxp_component_added; > > + /** @ce: kernel-owned context used for PXP operations */ > struct intel_context *ce; > > - /* > + /** @arb_mutex: protects arb session start */ > + struct mutex arb_mutex; > + /** > + * @arb_is_valid: tracks arb session status. >* After a teardown, the arb session can still be in play on the HW >* even if the keys are gone, so we can't rely on the HW state of the >* session to know if it's valid and need to track the status in SW. >*/ > - struct mutex arb_mutex; /* protects arb session start */ > bool arb_is_valid; > > - /* > - * Keep track of which key instance we're on, so we can use it to > - * determine if an object was created using the current key or a > + /** > + * @key_instance: tracks which key instance we're on, so we can use it > + * to determine if an object was created using the current key or a >* previous one. >*/ > u32 key_instance; > > - struct mutex tee
Re: [PATCH] drm/ttm: add a WARN_ON in ttm_set_driver_manager when array bounds (v2)
On 2021-09-10 11:09, Guchun Chen wrote:
> Vendor will define their own memory types on top of TTM_PL_PRIV,
> but call ttm_set_driver_manager directly without checking mem_type
> value when setting up memory manager. So add such check to aware
> the case when array bounds.
>
> v2: lower check level to WARN_ON
>
> Signed-off-by: Leslie Shi
> Signed-off-by: Guchun Chen
> ---
>  include/drm/ttm/ttm_device.h | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/include/drm/ttm/ttm_device.h b/include/drm/ttm/ttm_device.h
> index 07d722950d5b..aa79953c807c 100644
> --- a/include/drm/ttm/ttm_device.h
> +++ b/include/drm/ttm/ttm_device.h
> @@ -291,6 +291,7 @@ ttm_manager_type(struct ttm_device *bdev, int mem_type)
>  static inline void ttm_set_driver_manager(struct ttm_device *bdev, int type,
>  					  struct ttm_resource_manager *manager)
>  {
> +	WARN_ON(type >= TTM_NUM_MEM_TYPES);

Nit: I know nothing about this code, but from the context alone it would
seem sensible to do

	if (WARN_ON(type >= TTM_NUM_MEM_TYPES))
		return;

to avoid making the subsequent assignment when we *know* it's invalid
and likely to corrupt memory.

Robin.

>  	bdev->man_drv[type] = manager;
>  }
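
The difference between warning only and warning plus early return can be tried out in userspace with the sketch below. It is an illustration of the review suggestion, not TTM code: warn_on(), struct device_sketch, struct manager_sketch, NUM_MEM_TYPES and set_driver_manager() are hypothetical stand-ins. Two things are additions made for this sketch and are not in the thread: the function returns a bool so the outcome can be checked, and negative indices are rejected as well since the parameter is a signed int.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#define NUM_MEM_TYPES 8		/* hypothetical stand-in for TTM_NUM_MEM_TYPES */

struct manager_sketch {
	int id;
};

struct device_sketch {
	struct manager_sketch *man_drv[NUM_MEM_TYPES];
};

/* Userspace stand-in for WARN_ON(): report the condition and return it. */
static bool warn_on(bool cond, const char *what)
{
	if (cond)
		fprintf(stderr, "WARNING: %s\n", what);
	return cond;
}

/*
 * Store 'manager' for memory type 'type'. Returns true when stored and
 * false when the index was rejected, so the out-of-bounds write that a
 * warning alone would still allow never happens.
 */
static bool set_driver_manager(struct device_sketch *dev, int type,
			       struct manager_sketch *manager)
{
	assert(dev != NULL);

	if (warn_on(type < 0 || type >= NUM_MEM_TYPES,
		    "memory type out of range"))
		return false;

	dev->man_drv[type] = manager;
	return true;
}
```

Calling `set_driver_manager(&dev, NUM_MEM_TYPES, &m)` prints the warning and leaves the array untouched, whereas a warn-then-assign version would write one slot past the end of man_drv[].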
Re: [PATCH v2] kernel/locking: Add context to ww_mutex_trylock.
On Thu, Sep 09, 2021 at 11:32:18AM +0200, Maarten Lankhorst wrote: > This is also useful in regulator_lock_nested, which may avoid dropping > regulator_nesting_mutex in the uncontended path, so use it there. Acked-by: Mark Brown
Re: [Intel-gfx] [PATCH 3/4] drm/i915: rename debugfs_gt_pm files
On Wed, Sep 08, 2021 at 05:49:40PM -0700, Lucas De Marchi wrote:
> We shouldn't be using debugfs_ namespace for this functionality. Rename
> debugfs_gt_pm.[ch] to intel_gt_pm_debugfs.[ch] and then make
> functions, defines and structs follow suit.
>
> Signed-off-by: Lucas De Marchi
> ---
>  drivers/gpu/drm/i915/Makefile                      |  2 +-
>  drivers/gpu/drm/i915/gt/debugfs_gt_pm.h            | 14 --
>  drivers/gpu/drm/i915/gt/intel_gt_debugfs.c         |  4 ++--
>  .../gt/{debugfs_gt_pm.c => intel_gt_pm_debugfs.c}  |  4 ++--
>  drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.h      | 14 ++
>  5 files changed, 19 insertions(+), 19 deletions(-)
>  delete mode 100644 drivers/gpu/drm/i915/gt/debugfs_gt_pm.h
>  rename drivers/gpu/drm/i915/gt/{debugfs_gt_pm.c => intel_gt_pm_debugfs.c} (99%)
>  create mode 100644 drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.h
>
> diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
> index 232c9673a2e5..dd656f2d7721 100644
> --- a/drivers/gpu/drm/i915/Makefile
> +++ b/drivers/gpu/drm/i915/Makefile
> @@ -79,7 +79,6 @@ i915-$(CONFIG_PERF_EVENTS) += i915_pmu.o
>
>  # "Graphics Technology" (aka we talk to the gpu)
>  gt-y += \
> -	gt/debugfs_gt_pm.o \
>  	gt/gen2_engine_cs.o \
>  	gt/gen6_engine_cs.o \
>  	gt/gen6_ppgtt.o \
> @@ -103,6 +102,7 @@ gt-y += \
>  	gt/intel_gt_engines_debugfs.o \
>  	gt/intel_gt_irq.o \
>  	gt/intel_gt_pm.o \
> +	gt/intel_gt_pm_debugfs.o \
>  	gt/intel_gt_pm_irq.o \
>  	gt/intel_gt_requests.o \
>  	gt/intel_gtt.o \
> diff --git a/drivers/gpu/drm/i915/gt/debugfs_gt_pm.h b/drivers/gpu/drm/i915/gt/debugfs_gt_pm.h
> deleted file mode 100644
> index 4cf5f5c9da7d..
> --- a/drivers/gpu/drm/i915/gt/debugfs_gt_pm.h
> +++ /dev/null
> @@ -1,14 +0,0 @@
> -/* SPDX-License-Identifier: MIT */
> -/*
> - * Copyright © 2019 Intel Corporation
> - */
> -
> -#ifndef DEBUGFS_GT_PM_H
> -#define DEBUGFS_GT_PM_H
> -
> -struct intel_gt;
> -struct dentry;
> -
> -void debugfs_gt_pm_register(struct intel_gt *gt, struct dentry *root);
> -
> -#endif /* DEBUGFS_GT_PM_H */
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_debugfs.c b/drivers/gpu/drm/i915/gt/intel_gt_debugfs.c
> index e5d173c235a3..4096ee893b69 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt_debugfs.c
> +++ b/drivers/gpu/drm/i915/gt/intel_gt_debugfs.c
> @@ -5,10 +5,10 @@
>
>  #include
>
> -#include "debugfs_gt_pm.h"
>  #include "i915_drv.h"
>  #include "intel_gt_debugfs.h"
>  #include "intel_gt_engines_debugfs.h"
> +#include "intel_gt_pm_debugfs.h"
>  #include "intel_sseu_debugfs.h"
>  #include "uc/intel_uc_debugfs.h"
>
> @@ -24,7 +24,7 @@ void intel_gt_register_debugfs(struct intel_gt *gt)
>  		return;
>
>  	intel_gt_engines_register_debugfs(gt, root);
> -	debugfs_gt_pm_register(gt, root);
> +	intel_gt_pm_register_debugfs(gt, root);

This is one case I usually don't know what convention to follow since it
changes in different places.

I did it like _register_debugfs because of calls like
intel_gt_init_scratch(), xxx_init_hw, etc. However here I see that just
below we have intel_sseu_debugfs_register(), so maybe I should consider
debugfs as part of the namespace?

Lucas De Marchi
Re: [PATCH] drm/msm: Disable frequency clamping on a630
On 10/09/2021 18:18, Rob Clark wrote:
On Tue, Sep 7, 2021 at 7:20 PM Bjorn Andersson wrote:
On Mon 09 Aug 10:26 PDT 2021, Akhil P Oommen wrote:
On 8/9/2021 9:48 PM, Caleb Connolly wrote:
On 09/08/2021 17:12, Rob Clark wrote:
On Mon, Aug 9, 2021 at 7:52 AM Akhil P Oommen wrote:
[..]

I am a bit confused. We don't define a power domain for gpu in dt,
correct? Then what exactly does set_opp do here? Do you think this
usleep is what is helping here somehow to mask the issue?

The power domains (for cx and gx) are defined in the GMU DT, the OPPs
in the GPU DT. For the sake of simplicity I'll refer to the lowest
frequency (25700) and OPP level (RPMH_REGULATOR_LEVEL_LOW_SVS) as the
"min" state, and the highest frequency (71000) and OPP level
(RPMH_REGULATOR_LEVEL_TURBO_L1) as the "max" state. These are defined
in sdm845.dtsi under the gpu node.

The new devfreq behaviour unmasks what I think is a driver bug: it
inadvertently puts much more strain on the GPU regulators than they
usually get. With the new behaviour the GPU jumps from its min state to
the max state and back again extremely rapidly under workloads as small
as refreshing UI. Where previously the GPU would rarely if ever go
above 342MHz when interacting with the device, it now jumps between min
and max many times per second.

If my understanding is correct, the current implementation of the GMU
set freq is the following:
 - Get OPP for frequency to set
 - Push the frequency to the GMU - immediately updating the core clock
 - Call dev_pm_opp_set_opp() which triggers a notify chain, this winds
   up somewhere in power management code and causes the gx regulator
   level to be updated

Nope. dev_pm_opp_set_opp() sets the bandwidth for gpu and nothing else.
We were using a different api earlier which got deprecated -
dev_pm_opp_set_bw().

On the Lenovo Yoga C630 this is reproduced by starting alacritty, and
if I'm lucky I manage to hit a few keys before it crashes, so I spent a
few hours looking into this as well...

As you say, dev_pm_opp_set_opp() will only cast an interconnect vote.
The opp-level is just there for show and isn't used by anything, at
least not on 845.

Furthermore, I'm missing something in my tree, so the interconnect
doesn't hit sync_state, and as such we're not actually scaling the
buses. So the problem is not that Linux doesn't turn on the buses in
time.

So I suspect that the "AHB bus error" isn't saying that we turned off
the bus, but rather that the GPU becomes unstable or something of that
sort.

Lastly, I reverted 9bc95570175a ("drm/msm: Devfreq tuning") and ran
Aquarium for 20 minutes without a problem. I then switched the gpu
devfreq governor to "userspace" and ran the following:

while true; do
  echo 25700 > /sys/class/devfreq/500.gpu/userspace/set_freq
  echo 71000 > /sys/class/devfreq/500.gpu/userspace/set_freq
done

It took 19 iterations of this loop to crash the GPU.

I assume you still had aquarium running, to keep the gpu awake while
you ran that loop?

Fwiw, I modified this slightly to match sc7180's min/max gpu freq and
could not trigger any issue.. interestingly sc7180 has a lower min freq
(180) and higher max freq (800) so it was toggling over a wider freq
range. I also tried on a device that had the higher 825MHz opp (since I
noticed that was the only opp that used RPMH_REGULATOR_LEVEL_TURBO_L1
and wanted to rule that out), but could not reproduce.

I guess a630 (sdm845) should have higher power draw (it is 2x # of
shader cores and 2x GMEM size, but lower max freq).. the question is,
is this the reason we see this on sdm845 and not sc7180? Or is there
some other difference. On the gpu side of this, they are both closely
related (ie. the same "sub-generation" of a6xx, same gmu fw, etc)..
I'm less sure about the other parts (icc, rpmh, etc)

My guess would be power draw. Nobody has mentioned this yet, but I've
realised that the vdd_gfx rail is powered by a buck converter, which
could explain a lot of the symptoms.

Buck converters depend on high frequency switching and inductors to
work. This inherently leads to some lag time when changing voltages,
and also means that the behaviour of the regulator is defined in part
by how much current is being drawn. Wikipedia has a pretty good
explanation: https://en.wikipedia.org/wiki/Buck_converter

At the best of times these regulators have a known voltage ripple; when
under load and when rapidly switching voltages this will get a lot
worse. Someone with an oscilloscope and schematics could probe the rail
and probably see exactly what's going on when the GPU crashes.

Because of the lag time in the regulator changing voltage, it might be
undershooting whilst the GPU is trying to clock up and draw more
current, causing instability and crashes.

BR,
-R

So the problem doesn't seem to be Rob's change, it's just that prior to
it the chance of hitting it was way lower. Question is still what it is
that we're triggering.
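
For anyone wanting to drive the min/max toggling reproducer quoted above from a small program rather than a shell loop, a minimal C sketch follows. It is only an illustration under assumptions: the helper names write_freq() and toggle_freq() are hypothetical, the target path and the two frequency values are passed in by the caller (the thread uses the userspace governor's set_freq file with 25700 and 71000), and nothing here keeps the GPU awake or detects the crash.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: write one frequency value to a sysfs-style file. */
static int write_freq(const char *path, unsigned long freq)
{
	FILE *f = fopen(path, "w");

	if (!f)
		return -1;
	if (fprintf(f, "%lu\n", freq) < 0) {
		fclose(f);
		return -1;
	}
	return fclose(f) == 0 ? 0 : -1;
}

/*
 * Hypothetical helper: alternate between the "min" and "max" frequency
 * for a bounded number of iterations, mirroring the quoted shell loop.
 * Returns 0 on success, -1 as soon as a write fails.
 */
static int toggle_freq(const char *path, unsigned long min_freq,
		       unsigned long max_freq, unsigned int iterations)
{
	unsigned int i;

	assert(path != NULL);

	for (i = 0; i < iterations; i++) {
		if (write_freq(path, min_freq) || write_freq(path, max_freq))
			return -1;
	}
	return 0;
}
```

Bounding the iteration count (instead of looping forever as the shell version does) makes it easier to report how many min/max transitions were needed before the GPU fell over.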
Re: [PATCH v2] kernel/locking: Add context to ww_mutex_trylock.
On Fri, Sep 10, 2021 at 05:02:54PM +0200, Peter Zijlstra wrote: > That doesn't look right, how's this for you? Full patch for the robots here: https://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git/commit/?h=locking/core&id=826e7b8826f0af185bb93249600533c33fd69a95
Re: [PATCH] drm/msm: Disable frequency clamping on a630
On Thu, Sep 9, 2021 at 1:54 PM Rob Clark wrote: > > On Thu, Sep 9, 2021 at 12:50 PM Akhil P Oommen wrote: > > > > On 9/9/2021 9:42 PM, Amit Pundir wrote: > > > On Thu, 9 Sept 2021 at 17:47, Amit Pundir wrote: > > >> > > >> On Wed, 8 Sept 2021 at 07:50, Bjorn Andersson > > >> wrote: > > >>> > > >>> On Mon 09 Aug 10:26 PDT 2021, Akhil P Oommen wrote: > > >>> > > On 8/9/2021 9:48 PM, Caleb Connolly wrote: > > > > > > > > > On 09/08/2021 17:12, Rob Clark wrote: > > >> On Mon, Aug 9, 2021 at 7:52 AM Akhil P Oommen > > >> wrote: > > >>> [..] > > >>> I am a bit confused. We don't define a power domain for gpu in dt, > > >>> correct? Then what exactly set_opp do here? Do you think this > > >>> usleep is > > >>> what is helping here somehow to mask the issue? > > > The power domains (for cx and gx) are defined in the GMU DT, the OPPs > > > in > > > the GPU DT. For the sake of simplicity I'll refer to the lowest > > > frequency (25700) and OPP level (RPMH_REGULATOR_LEVEL_LOW_SVS) as > > > the "min" state, and the highest frequency (71000) and OPP level > > > (RPMH_REGULATOR_LEVEL_TURBO_L1) as the "max" state. These are defined > > > in > > > sdm845.dtsi under the gpu node. > > > > > > The new devfreq behaviour unmasks what I think is a driver bug, it > > > inadvertently puts much more strain on the GPU regulators than they > > > usually get. With the new behaviour the GPU jumps from it's min state > > > to > > > the max state and back again extremely rapidly under workloads as > > > small > > > as refreshing UI. Where previously the GPU would rarely if ever go > > > above > > > 342MHz when interacting with the device, it now jumps between min and > > > max many times per second. 
> > > > > > If my understanding is correct, the current implementation of the GMU > > > set freq is the following: > > >- Get OPP for frequency to set > > >- Push the frequency to the GMU - immediately updating the core > > > clock > > >- Call dev_pm_opp_set_opp() which triggers a notify chain, this > > > winds > > > up somewhere in power management code and causes the gx regulator > > > level > > > to be updated > > > > Nope. dev_pm_opp_set_opp() sets the bandwidth for gpu and nothing > > else. We > > were using a different api earlier which got deprecated - > > dev_pm_opp_set_bw(). > > > > >>> > > >>> On the Lenovo Yoga C630 this is reproduced by starting alacritty and if > > >>> I'm lucky I managed to hit a few keys before it crashes, so I spent a > > >>> few hours looking into this as well... > > >>> > > >>> As you say, the dev_pm_opp_set_opp() will only cast a interconnect vote. > > >>> The opp-level is just there for show and isn't used by anything, at > > >>> least not on 845. > > >>> > > >>> Further more, I'm missing something in my tree, so the interconnect > > >>> doesn't hit sync_state, and as such we're not actually scaling the > > >>> buses. So the problem is not that Linux doesn't turn on the buses in > > >>> time. > > >>> > > >>> So I suspect that the "AHB bus error" isn't saying that we turned off > > >>> the bus, but rather that the GPU becomes unstable or something of that > > >>> sort. > > >>> > > >>> > > >>> Lastly, I reverted 9bc95570175a ("drm/msm: Devfreq tuning") and ran > > >>> Aquarium for 20 minutes without a problem. I then switched the gpu > > >>> devfreq governor to "userspace" and ran the following: > > >>> > > >>> while true; do > > >>>echo 25700 > /sys/class/devfreq/500.gpu/userspace/set_freq > > >>>echo 71000 > /sys/class/devfreq/500.gpu/userspace/set_freq > > >>> done > > >>> > > >>> It took 19 iterations of this loop to crash the GPU. > > >> > > >> Ack. 
With your above script, I can reproduce a crash too on db845c > > >> (A630) running v5.14. I didn't get any crash log though and device > > >> just rebooted to USB crash mode. > > >> > > >> And same crash on RB5 (A650) too https://hastebin.com/raw/ejutetuwun > > > > Are we sure this is the same issue? It could be, but I thought we were > > seeing a bunch of random gpu errors (which may eventually hit device crash). > > In the sense that async-serror often seems to be a clk issue, it > *could* be related.. but this would have to be triggered by CPU > access. The symptom does seem very different. > The more I think about it, the more I think this is a different issue.. a650 is somewhat different wrt gmu (ie. hfi vs legacy code paths). Amit, could you try the same experiment (with 9bc95570175a ("drm/msm: Devfreq tuning") revert) while running something like webgl aquarium to prevent the GPU from suspending? I'm kinda suspecting the issue you hit is more likely some suspend/resume issue. BR, -R
Re: [PATCH] drm/msm: Disable frequency clamping on a630
On Tue, Sep 7, 2021 at 7:20 PM Bjorn Andersson wrote: > > On Mon 09 Aug 10:26 PDT 2021, Akhil P Oommen wrote: > > > On 8/9/2021 9:48 PM, Caleb Connolly wrote: > > > > > > > > > On 09/08/2021 17:12, Rob Clark wrote: > > > > On Mon, Aug 9, 2021 at 7:52 AM Akhil P Oommen > > > > wrote: > [..] > > > > > I am a bit confused. We don't define a power domain for gpu in dt, > > > > > correct? Then what exactly set_opp do here? Do you think this usleep > > > > > is > > > > > what is helping here somehow to mask the issue? > > > The power domains (for cx and gx) are defined in the GMU DT, the OPPs in > > > the GPU DT. For the sake of simplicity I'll refer to the lowest > > > frequency (25700) and OPP level (RPMH_REGULATOR_LEVEL_LOW_SVS) as > > > the "min" state, and the highest frequency (71000) and OPP level > > > (RPMH_REGULATOR_LEVEL_TURBO_L1) as the "max" state. These are defined in > > > sdm845.dtsi under the gpu node. > > > > > > The new devfreq behaviour unmasks what I think is a driver bug, it > > > inadvertently puts much more strain on the GPU regulators than they > > > usually get. With the new behaviour the GPU jumps from it's min state to > > > the max state and back again extremely rapidly under workloads as small > > > as refreshing UI. Where previously the GPU would rarely if ever go above > > > 342MHz when interacting with the device, it now jumps between min and > > > max many times per second. > > > > > > If my understanding is correct, the current implementation of the GMU > > > set freq is the following: > > > - Get OPP for frequency to set > > > - Push the frequency to the GMU - immediately updating the core clock > > > - Call dev_pm_opp_set_opp() which triggers a notify chain, this winds > > > up somewhere in power management code and causes the gx regulator level > > > to be updated > > > > Nope. dev_pm_opp_set_opp() sets the bandwidth for gpu and nothing else. We > > were using a different api earlier which got deprecated - > > dev_pm_opp_set_bw(). 
> > > > On the Lenovo Yoga C630 this is reproduced by starting alacritty and if > I'm lucky I managed to hit a few keys before it crashes, so I spent a > few hours looking into this as well... > > As you say, the dev_pm_opp_set_opp() will only cast a interconnect vote. > The opp-level is just there for show and isn't used by anything, at > least not on 845. > > Further more, I'm missing something in my tree, so the interconnect > doesn't hit sync_state, and as such we're not actually scaling the > buses. So the problem is not that Linux doesn't turn on the buses in > time. > > So I suspect that the "AHB bus error" isn't saying that we turned off > the bus, but rather that the GPU becomes unstable or something of that > sort. > > > Lastly, I reverted 9bc95570175a ("drm/msm: Devfreq tuning") and ran > Aquarium for 20 minutes without a problem. I then switched the gpu > devfreq governor to "userspace" and ran the following: > > while true; do > echo 25700 > /sys/class/devfreq/500.gpu/userspace/set_freq > echo 71000 > /sys/class/devfreq/500.gpu/userspace/set_freq > done > > It took 19 iterations of this loop to crash the GPU. I assume you still had aquarium running, to keep the gpu awake while you ran that loop? Fwiw, I modified this slightly to match sc7180's min/max gpu freq and could not trigger any issue.. interestingly sc7180 has a lower min freq (180) and higher max freq (800) so it was toggling over a wider freq range. I also tried on a device that had the higher 825MHz opp (since I noticed that was the only opp that used RPMH_REGULATOR_LEVEL_TURBO_L1 and wanted to rule that out), but could not reproduce. I guess a630 (sdm845) should have higher power draw (it is 2x # of shader cores and 2x GMEM size, but lower max freq).. the question is, is this the reason we see this on sdm845 and not sc7180? Or is there some other difference. On the gpu side of this, they are both closely related (ie. the same "sub-generation" of a6xx, same gmu fw, etc).. 
I'm less sure about the other parts (icc, rpmh, etc) BR, -R > So the problem doesn't seem to be Rob's change, it's just that prior to > it the chance to hitting it is way lower. Question is still what it is > that we're triggering. > > Regards, > Bjorn
Re: [PATCH] dt-bindings: More use 'enum' instead of 'oneOf' plus 'const' entries
On 9/10/21 9:51 AM, Rob Herring wrote:
'enum' is equivalent to 'oneOf' with a list of 'const' entries, but 'enum'
is more concise and yields better error messages.

Fix a couple more cases which have appeared.

Cc: Rob Clark
Cc: Sean Paul
Cc: Mark Brown
Cc: Wim Van Sebroeck
Cc: Guenter Roeck
Cc: Jonathan Marek
Cc: Aswath Govindraju
Cc: Marc Zyngier
Cc: Linus Walleij
Cc: dri-devel@lists.freedesktop.org
Cc: freedr...@lists.freedesktop.org
Cc: linux-...@vger.kernel.org
Cc: linux-watch...@vger.kernel.org
Signed-off-by: Rob Herring
---
 .../bindings/display/msm/dsi-phy-7nm.yaml | 8 
 .../devicetree/bindings/spi/omap-spi.yaml | 6 +++---
 .../bindings/watchdog/maxim,max63xx.yaml | 14 +++---

For watchdog:

Acked-by: Guenter Roeck

 3 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/Documentation/devicetree/bindings/display/msm/dsi-phy-7nm.yaml b/Documentation/devicetree/bindings/display/msm/dsi-phy-7nm.yaml
index 4265399bb154..c851770bbdf2 100644
--- a/Documentation/devicetree/bindings/display/msm/dsi-phy-7nm.yaml
+++ b/Documentation/devicetree/bindings/display/msm/dsi-phy-7nm.yaml
@@ -14,10 +14,10 @@ allOf:
 properties:
   compatible:
-    oneOf:
-      - const: qcom,dsi-phy-7nm
-      - const: qcom,dsi-phy-7nm-8150
-      - const: qcom,sc7280-dsi-phy-7nm
+    enum:
+      - qcom,dsi-phy-7nm
+      - qcom,dsi-phy-7nm-8150
+      - qcom,sc7280-dsi-phy-7nm
   reg:
     items:
diff --git a/Documentation/devicetree/bindings/spi/omap-spi.yaml b/Documentation/devicetree/bindings/spi/omap-spi.yaml
index e55538186cf6..9952199cae11 100644
--- a/Documentation/devicetree/bindings/spi/omap-spi.yaml
+++ b/Documentation/devicetree/bindings/spi/omap-spi.yaml
@@ -84,9 +84,9 @@ unevaluatedProperties: false
 if:
   properties:
     compatible:
-      oneOf:
-        - const: ti,omap2-mcspi
-        - const: ti,omap4-mcspi
+      enum:
+        - ti,omap2-mcspi
+        - ti,omap4-mcspi
 then:
   properties:
diff --git a/Documentation/devicetree/bindings/watchdog/maxim,max63xx.yaml b/Documentation/devicetree/bindings/watchdog/maxim,max63xx.yaml
index f2105eedac2c..ab9641e845db 100644
--- a/Documentation/devicetree/bindings/watchdog/maxim,max63xx.yaml
+++ b/Documentation/devicetree/bindings/watchdog/maxim,max63xx.yaml
@@ -15,13 +15,13 @@ maintainers:
 properties:
   compatible:
-    oneOf:
-      - const: maxim,max6369
-      - const: maxim,max6370
-      - const: maxim,max6371
-      - const: maxim,max6372
-      - const: maxim,max6373
-      - const: maxim,max6374
+    enum:
+      - maxim,max6369
+      - maxim,max6370
+      - maxim,max6371
+      - maxim,max6372
+      - maxim,max6373
+      - maxim,max6374
   reg:
     description: This is a 1-byte memory-mapped address
Re: [RFC PATCH] drm/ttm: Add a private member to the struct ttm_resource
On 10.09.21 at 17:30, Thomas Hellström wrote:
On Fri, 2021-09-10 at 16:40 +0200, Christian König wrote:
On 10.09.21 at 15:15, Thomas Hellström wrote:

Both the provider (resource manager) and the consumer (the TTM driver)
want to subclass struct ttm_resource. Since this is left for the
resource manager, we need to provide a private pointer for the TTM
driver.

Provide a struct ttm_resource_private for the driver to subclass for
data with the same lifetime as the struct ttm_resource: In the i915
case it will, for example, be an sg-table and radix tree into the
LMEM/VRAM pages that currently are awkwardly attached to the GEM
object.

Provide an ops structure for associated ops (which is only destroy()
ATM). It might seem pointless to provide a separate ops structure, but
Linus has previously made it clear that that's the norm.

After careful audit one could perhaps also on a per-driver basis
replace the delete_mem_notify() TTM driver callback with the above
destroy function.

Well this is a really big NAK to this approach.

If you need to attach some additional information to the resource then
implement your own resource manager like everybody else does.

Well this was the long discussion we had back then when the resource
managers started to derive from struct resource and I was under the
impression that we had come to an agreement about the different
use-cases here, and this was my main concern.

Ok, then we somehow didn't understand each other.

I mean, it's a pretty big layer violation to do that for this use-case.

Well exactly that's the point. TTM should not have a layer design in
the first place.

Devices, BOs, resources etc.. are base classes which should implement
a base functionality which is then extended by the drivers to implement
the driver specific functionality.

That is a component based approach, and not layered at all.

The TTM resource manager doesn't want to know about this data at all,
it's private to the ttm resource user layer and the resource manager
works perfectly well without it. (I assume the other drivers that
implement their own resource managers need the data that the
subclassing provides?)

Yes, that's exactly why we have the subclassing.

The fundamental problem here is that there are two layers wanting to
subclass struct ttm_resource. That means one layer gets to do that, the
second gets to use a private pointer (which in turn can provide yet
another private pointer to a potential third layer). With your
suggestion, the second layer instead is forced to subclass each
subclassed instance it uses that the first layer provides?

Well completely drop the layer approach/thinking here.

The resource is an object with a base class. The base class implements
the interface TTM needs to handle the object, e.g.
create/destroy/debug etc...

Then we need to subclass this object because without any additional
information the object is pretty pointless.

One possibility for this is to use the range manager to implement
something drm_mm based. BTW: We should probably rename that to
something like ttm_res_drm_mm or similar.

What we should avoid is to abuse TTM resource interfaces in the driver,
e.g. what i915 is currently doing. This is a TTM->resource mgr
interface and should not be used by drivers at all.

Ofc we can do that, but it does indeed feel pretty awkward.

In any case, if you still think that's the approach we should go for,
I'd need to add init() and fini() members to the
ttm_range_manager_func struct to allow subclassing without having to
unnecessarily copy the full code?

Yes, exporting the ttm_range_manager functions as needed is one thing I
wanted to do for the amdgpu_gtt_mgr.c code as well.

Just don't extend the function table but rather directly export the
necessary functions.

Regards,
Christian.

Thanks,
Thomas

Regards,
Christian.

Cc: Matthew Auld
Cc: König Christian
Signed-off-by: Thomas Hellström
---
 drivers/gpu/drm/ttm/ttm_resource.c | 10 +++---
 include/drm/ttm/ttm_resource.h | 28 
 2 files changed, 35 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_resource.c b/drivers/gpu/drm/ttm/ttm_resource.c
index 2431717376e7..973e7c50bfed 100644
--- a/drivers/gpu/drm/ttm/ttm_resource.c
+++ b/drivers/gpu/drm/ttm/ttm_resource.c
@@ -57,13 +57,17 @@ int ttm_resource_alloc(struct ttm_buffer_object *bo,
 void ttm_resource_free(struct ttm_buffer_object *bo, struct ttm_resource **res)
 {
 	struct ttm_resource_manager *man;
+	struct ttm_resource *resource = *res;
-	if (!*res)
+	if (!resource)
 		return;
-	man = ttm_manager_type(bo->bdev, (*res)->mem_type);
-	man->func->free(man, *res);
 	*res = NULL;
+	if (resource->priv)
+		resource->priv->ops.destroy(resource->priv);
+
+	man = ttm_manager_type(bo->bdev, resource->mem_type);
+	man->func->free(man, resource);
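
For readers less familiar with the "base class plus driver subclass" pattern argued for in this thread, here is a minimal userspace C sketch of it. It is not TTM code: struct base_resource, struct drv_resource and the drv_resource_*() helpers are hypothetical names chosen for illustration, and container_of() is redefined locally as a stand-in for the kernel macro. The idea shown is that the manager allocates the larger driver object, hands the embedded base object to the core, and recovers its own data by pointer arithmetic rather than through a private pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace stand-in for the kernel's container_of() macro. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical base class: only what the core needs to handle the object. */
struct base_resource {
	unsigned long num_pages;
	unsigned int mem_type;
};

/* Hypothetical driver subclass: embeds the base and adds private data. */
struct drv_resource {
	struct base_resource base;
	int drv_private;	/* stands in for driver-specific state */
};

/* The "manager" allocates the subclass and returns the embedded base. */
static struct base_resource *drv_resource_alloc(unsigned long num_pages,
						int drv_private)
{
	struct drv_resource *res = calloc(1, sizeof(*res));

	if (!res)
		return NULL;
	res->base.num_pages = num_pages;
	res->drv_private = drv_private;
	return &res->base;
}

/* Upcast from the base object back to the driver subclass. */
static struct drv_resource *to_drv_resource(struct base_resource *base)
{
	assert(base != NULL);
	return container_of(base, struct drv_resource, base);
}

/* Private data shares the lifetime of the base object: one free for both. */
static void drv_resource_free(struct base_resource *base)
{
	if (base)
		free(to_drv_resource(base));
}
```

Because the private data lives in the same allocation as the base object, no separate destroy() callback is needed in this sketch; that is the trade-off against the private-pointer approach proposed in the RFC.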
[PATCH] video: fbdev: atyfb: Remove assigned but never used variable statements
From: Colin Ian King

There are a couple of statements where local variables are being assigned
values that are never read because the function returns immediately after
the assignment. Clean up the code by removing them.

Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King
---
 drivers/video/fbdev/aty/mach64_gx.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/video/fbdev/aty/mach64_gx.c b/drivers/video/fbdev/aty/mach64_gx.c
index 9c37e28fb78b..d06d24830080 100644
--- a/drivers/video/fbdev/aty/mach64_gx.c
+++ b/drivers/video/fbdev/aty/mach64_gx.c
@@ -352,10 +352,8 @@ static int aty_var_to_pll_18818(const struct fb_info *info, u32 vclk_per,
 	post_divider = 1;
 	if (MHz100 > MAX_FREQ_2595) {
-		MHz100 = MAX_FREQ_2595;
 		return -EINVAL;
 	} else if (MHz100 < ABS_MIN_FREQ_2595) {
-		program_bits = 0;	/* MHz100 = 257 */
 		return -EINVAL;
 	} else {
 		while (MHz100 < MIN_FREQ_2595) {
-- 
2.32.0
Re: [PATCH] dt-bindings: More use 'enum' instead of 'oneOf' plus 'const' entries
On Fri, Sep 10, 2021 at 11:51:53AM -0500, Rob Herring wrote:
> 'enum' is equivalent to 'oneOf' with a list of 'const' entries, but 'enum'
> is more concise and yields better error messages.

Acked-by: Mark Brown
[PATCH] dt-bindings: More use 'enum' instead of 'oneOf' plus 'const' entries
'enum' is equivalent to 'oneOf' with a list of 'const' entries, but 'enum'
is more concise and yields better error messages.

Fix a couple more cases which have appeared.

Cc: Rob Clark
Cc: Sean Paul
Cc: Mark Brown
Cc: Wim Van Sebroeck
Cc: Guenter Roeck
Cc: Jonathan Marek
Cc: Aswath Govindraju
Cc: Marc Zyngier
Cc: Linus Walleij
Cc: dri-devel@lists.freedesktop.org
Cc: freedr...@lists.freedesktop.org
Cc: linux-...@vger.kernel.org
Cc: linux-watch...@vger.kernel.org
Signed-off-by: Rob Herring
---
 .../bindings/display/msm/dsi-phy-7nm.yaml | 8 
 .../devicetree/bindings/spi/omap-spi.yaml | 6 +++---
 .../bindings/watchdog/maxim,max63xx.yaml | 14 +++---
 3 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/Documentation/devicetree/bindings/display/msm/dsi-phy-7nm.yaml b/Documentation/devicetree/bindings/display/msm/dsi-phy-7nm.yaml
index 4265399bb154..c851770bbdf2 100644
--- a/Documentation/devicetree/bindings/display/msm/dsi-phy-7nm.yaml
+++ b/Documentation/devicetree/bindings/display/msm/dsi-phy-7nm.yaml
@@ -14,10 +14,10 @@ allOf:
 properties:
   compatible:
-    oneOf:
-      - const: qcom,dsi-phy-7nm
-      - const: qcom,dsi-phy-7nm-8150
-      - const: qcom,sc7280-dsi-phy-7nm
+    enum:
+      - qcom,dsi-phy-7nm
+      - qcom,dsi-phy-7nm-8150
+      - qcom,sc7280-dsi-phy-7nm
   reg:
     items:
diff --git a/Documentation/devicetree/bindings/spi/omap-spi.yaml b/Documentation/devicetree/bindings/spi/omap-spi.yaml
index e55538186cf6..9952199cae11 100644
--- a/Documentation/devicetree/bindings/spi/omap-spi.yaml
+++ b/Documentation/devicetree/bindings/spi/omap-spi.yaml
@@ -84,9 +84,9 @@ unevaluatedProperties: false
 if:
   properties:
     compatible:
-      oneOf:
-        - const: ti,omap2-mcspi
-        - const: ti,omap4-mcspi
+      enum:
+        - ti,omap2-mcspi
+        - ti,omap4-mcspi
 then:
   properties:
diff --git a/Documentation/devicetree/bindings/watchdog/maxim,max63xx.yaml b/Documentation/devicetree/bindings/watchdog/maxim,max63xx.yaml
index f2105eedac2c..ab9641e845db 100644
--- a/Documentation/devicetree/bindings/watchdog/maxim,max63xx.yaml
+++ b/Documentation/devicetree/bindings/watchdog/maxim,max63xx.yaml
@@ -15,13 +15,13 @@ maintainers:
 properties:
   compatible:
-    oneOf:
-      - const: maxim,max6369
-      - const: maxim,max6370
-      - const: maxim,max6371
-      - const: maxim,max6372
-      - const: maxim,max6373
-      - const: maxim,max6374
+    enum:
+      - maxim,max6369
+      - maxim,max6370
+      - maxim,max6371
+      - maxim,max6372
+      - maxim,max6373
+      - maxim,max6374
   reg:
     description: This is a 1-byte memory-mapped address
-- 
2.30.2
Re: [RFC PATCH 0/4] Allow to use DRM fbdev emulation layer with CONFIG_FB disabled
Hi Noralf,

On Thu, Sep 09, 2021 at 06:27:02PM +0200, Noralf Trønnes wrote:
> > > Hi Daniel,
> > >
> > > > I think for a substantial improvement here in robustness what you
> > > > really want is
> > > > - kmscon in userspace
> > > > - disable FB layer
> > > > - ideally also disable console/vt layer in the kernel
> > > > - have a minimal emergency/boot-up log thing in drm, patches for that
> > > >   floated around a few times
> > >
> > > I assume you refer to this work by David Herrmann:
> > > "[RFC] drm: add kernel-log renderer"
> > > https://lists.freedesktop.org/archives/dri-devel/2014-March/055136.html
> >
> > There's also this:
> >
> > [PATCH v2 0/3] drm: Add panic handling
> > https://lore.kernel.org/dri-devel/20190311174218.51899-1-nor...@tronnes.org/
>
> And here's a DRM console example that was part of the early drm_client work:
>
> [RFC v4 25/25] drm/client: Hack: Add DRM VT console client
> https://lore.kernel.org/dri-devel/20180414115318.14500-26-nor...@tronnes.org/

Thanks for providing these pointers.
Looking forward to finding time to play with all this.

Having an embedded board without any fbdev stuff seems like a nice goal.

	Sam
Re: Habanalabs Open-Source TPC LLVM compiler and SynapseAI Core library
Forgot to add dri-devel. On Fri, Sep 10, 2021 at 6:09 PM Daniel Vetter wrote: > > On Fri, Sep 10, 2021 at 9:58 AM Greg Kroah-Hartman > wrote: > > On Fri, Sep 10, 2021 at 10:26:56AM +0300, Oded Gabbay wrote: > > > Hi Greg, > > > > > > Following our conversations a couple of months ago, I'm happy to tell you > > > that > > > Habanalabs has open-sourced its TPC (Tensor Processing Core) LLVM > > > compiler, > > > which is a fork of the LLVM open-source project. > > > > > > The project can be found on Habanalabs GitHub website at: > > > https://github.com/HabanaAI/tpc_llvm > > > > > > There is a companion guide on how to write TPC kernels at: > > > https://docs.habana.ai/en/latest/TPC_User_Guide/TPC_User_Guide.html > > > > That's great news, thanks for pushing for this and releasing it all! > > Yeah this is neat. > > There's still the problem that we spent the past 2.5 years pissing off > a lot of people for an imo questionable political project, bypassing > all the technical review and expertise. Now that the political > nonsense is resolved I think we need to look at at least the technical > cleanup. The angered people are much harder to fix, so let's maybe > ignore that (or perhaps a ks topic, no idea, I'm honestly not super > motivated to rehash this entire story again). 
Here's what I think we > should do: > > - move drivers/misc/habanalabs under drivers/gpu/habanalabs and > review/discussions on dri-devel > - grandfather the entire current situation in as-is, it's not the only > driver we have with a funny uapi of its own (but the other driver did > manage to get their compiler into upstream llvm even, and not like 2 > years late) > - review the dma-buf stuff on dri-devel and then land it through > standard flows, not the gregk-misc bypass > - close drivers/misc backdoor for further accel driver submissions, > I'd like to focus on technical stuff in this area going forward and > not pointless exercises in bypassing due process and all that > > I expect we'll have a proper discussion what the stack should look > like with the next submission (from a different vendor maybe), that > ship kinda sailed with habanalabs. > > Cheers, Daniel > -- > Daniel Vetter > Software Engineer, Intel Corporation > http://blog.ffwll.ch -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
Re: [PATCH v2 5/9] vfio/mdev: Consolidate all the device_api sysfs into the core code
On Fri, 10 Sep 2021 10:38:50 -0300 Jason Gunthorpe wrote:
> On Fri, Sep 10, 2021 at 01:10:46PM +0100, Christoph Hellwig wrote:
> > On Thu, Sep 09, 2021 at 04:38:45PM -0300, Jason Gunthorpe wrote:
> > > Every driver just emits a static string, simply feed it through the ops
> > > and provide a standard sysfs show function.
> >
> > Looks sensible. But can you make the attribute optional and add a
> > comment marking it deprecated? Because it really is completely useless.
> > We don't version userspace APIs, userspace has to discover new features
> > individually by e.g. finding new sysfs files or just trying new ioctls.
>
> To be honest I have no idea what side effects that would have..
>
> device code search tells me libvirt reads it and stuffs it into some
> XML
>
> Something called mdevctl touches it, feeds it into some JSON and
> other stuff..
>
> qemu has some VFIO_DEVICE_API_* constants but it is all dead code
>
> I agree it shouldn't have been there in the first place
>
> Cornelia? Alex? Any thoughts?

It's not a version, it's a means for userspace to determine the basic
API for an mdev device without needing to go through the process of
creating a container, adding the group, setting an IOMMU type, and
opening the device before being able to call VFIO_DEVICE_GET_INFO to
determine the API. For example, it wouldn't make sense for libvirt to
attach a vfio-ccw device to a PCIe root port in a VM. It's a means to
say this mdev device is a vfio-pci or that mdev device is a vfio-ccw.
If it were optional, then management tools would have no basic idea how
to attach the device to a VM without gaining access to the device
themselves.

Thanks,
Alex
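
To make the point about management tools concrete, here is a small C sketch of how a tool might classify an mdev device from the contents of its device_api attribute, before opening the device at all. This is an illustration under assumptions: the enum and the function classify_device_api() are hypothetical, only the two API strings named in the mail ("vfio-pci" and "vfio-ccw") are handled, and reading the string out of sysfs is left to the caller.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical classification of the API strings mentioned in the thread. */
enum mdev_api {
	MDEV_API_UNKNOWN,
	MDEV_API_PCI,	/* attribute reads "vfio-pci" */
	MDEV_API_CCW,	/* attribute reads "vfio-ccw" */
};

/*
 * Hypothetical helper: map the raw attribute contents (as read from the
 * device_api sysfs file, usually newline-terminated) to an enum value.
 * Anything that is not an exact match is reported as unknown.
 */
static enum mdev_api classify_device_api(const char *attr)
{
	size_t len;

	assert(attr != NULL);

	/* Compare only up to the first newline, if any. */
	len = strcspn(attr, "\n");

	if (len == strlen("vfio-pci") && !strncmp(attr, "vfio-pci", len))
		return MDEV_API_PCI;
	if (len == strlen("vfio-ccw") && !strncmp(attr, "vfio-ccw", len))
		return MDEV_API_CCW;
	return MDEV_API_UNKNOWN;
}
```

A tool following this approach could refuse to place a device classified as MDEV_API_CCW behind a PCIe root port without ever needing access to the device itself.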
[PATCH v9 15/17] drm/i915/pxp: add pxp debugfs
2 debugfs files, one to query the current status of the pxp session and one
to trigger an invalidation for testing.

v2: rename debugfs, fix date (Alan)

Signed-off-by: Daniele Ceraolo Spurio
Reviewed-by: Alan Previn
---
 drivers/gpu/drm/i915/Makefile                | 1 +
 drivers/gpu/drm/i915/gt/debugfs_gt.c         | 2 +
 drivers/gpu/drm/i915/pxp/intel_pxp_debugfs.c | 78
 drivers/gpu/drm/i915/pxp/intel_pxp_debugfs.h | 21 ++
 4 files changed, 102 insertions(+)
 create mode 100644 drivers/gpu/drm/i915/pxp/intel_pxp_debugfs.c
 create mode 100644 drivers/gpu/drm/i915/pxp/intel_pxp_debugfs.h

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 366e82cec44d..b46474ee1a1f 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -285,6 +285,7 @@ i915-y += i915_perf.o
 i915-$(CONFIG_DRM_I915_PXP) += \
 	pxp/intel_pxp.o \
 	pxp/intel_pxp_cmd.o \
+	pxp/intel_pxp_debugfs.o \
 	pxp/intel_pxp_irq.o \
 	pxp/intel_pxp_pm.o \
 	pxp/intel_pxp_session.o \
diff --git a/drivers/gpu/drm/i915/gt/debugfs_gt.c b/drivers/gpu/drm/i915/gt/debugfs_gt.c
index 591eb60785db..c27847ddb796 100644
--- a/drivers/gpu/drm/i915/gt/debugfs_gt.c
+++ b/drivers/gpu/drm/i915/gt/debugfs_gt.c
@@ -9,6 +9,7 @@
 #include "debugfs_gt.h"
 #include "debugfs_gt_pm.h"
 #include "intel_sseu_debugfs.h"
+#include "pxp/intel_pxp_debugfs.h"
 #include "uc/intel_uc_debugfs.h"

 #include "i915_drv.h"
@@ -28,6 +29,7 @@ void debugfs_gt_register(struct intel_gt *gt)

 	intel_sseu_debugfs_register(gt, root);
 	intel_uc_debugfs_register(&gt->uc, root);
+	intel_pxp_debugfs_register(&gt->pxp, root);
 }

 void intel_gt_debugfs_register_files(struct dentry *root,
diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp_debugfs.c b/drivers/gpu/drm/i915/pxp/intel_pxp_debugfs.c
new file mode 100644
index ..cbb1853676cc
--- /dev/null
+++ b/drivers/gpu/drm/i915/pxp/intel_pxp_debugfs.c
@@ -0,0 +1,78 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2021 Intel Corporation
+ */
+
+#include
+#include
+
+#include "gt/debugfs_gt.h"
+#include "pxp/intel_pxp.h"
+#include "pxp/intel_pxp_irq.h"
+#include "i915_drv.h"
+
+static int pxp_info_show(struct seq_file *m, void *data)
+{
+	struct intel_pxp *pxp = m->private;
+	struct drm_printer p = drm_seq_file_printer(m);
+	bool enabled = intel_pxp_is_enabled(pxp);
+
+	if (!enabled) {
+		drm_printf(&p, "pxp disabled\n");
+		return 0;
+	}
+
+	drm_printf(&p, "active: %s\n", yesno(intel_pxp_is_active(pxp)));
+	drm_printf(&p, "instance counter: %u\n", pxp->key_instance);
+
+	return 0;
+}
+DEFINE_GT_DEBUGFS_ATTRIBUTE(pxp_info);
+
+static int pxp_terminate_get(void *data, u64 *val)
+{
+	/* nothing to read */
+	return -EPERM;
+}
+
+static int pxp_terminate_set(void *data, u64 val)
+{
+	struct intel_pxp *pxp = data;
+	struct intel_gt *gt = pxp_to_gt(pxp);
+
+	if (!intel_pxp_is_active(pxp))
+		return -ENODEV;
+
+	/* simulate a termination interrupt */
+	spin_lock_irq(&gt->irq_lock);
+	intel_pxp_irq_handler(pxp, GEN12_DISPLAY_PXP_STATE_TERMINATED_INTERRUPT);
+	spin_unlock_irq(&gt->irq_lock);
+
+	if (!wait_for_completion_timeout(&pxp->termination,
+					 msecs_to_jiffies(100)))
+		return -ETIMEDOUT;
+
+	return 0;
+}
+
+DEFINE_SIMPLE_ATTRIBUTE(pxp_terminate_fops, pxp_terminate_get, pxp_terminate_set, "%llx\n");
+void intel_pxp_debugfs_register(struct intel_pxp *pxp, struct dentry *gt_root)
+{
+	static const struct debugfs_gt_file files[] = {
+		{ "info", &pxp_info_fops, NULL },
+		{ "terminate_state", &pxp_terminate_fops, NULL },
+	};
+	struct dentry *root;
+
+	if (!gt_root)
+		return;
+
+	if (!HAS_PXP((pxp_to_gt(pxp)->i915)))
+		return;
+
+	root = debugfs_create_dir("pxp", gt_root);
+	if (IS_ERR(root))
+		return;
+
+	intel_gt_debugfs_register_files(root, files, ARRAY_SIZE(files), pxp);
+}
diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp_debugfs.h b/drivers/gpu/drm/i915/pxp/intel_pxp_debugfs.h
new file mode 100644
index ..7e0c3d2f5d7e
--- /dev/null
+++ b/drivers/gpu/drm/i915/pxp/intel_pxp_debugfs.h
@@ -0,0 +1,21 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2021 Intel Corporation
+ */
+
+#ifndef __INTEL_PXP_DEBUGFS_H__
+#define __INTEL_PXP_DEBUGFS_H__
+
+struct intel_pxp;
+struct dentry;
+
+#ifdef CONFIG_DRM_I915_PXP
+void intel_pxp_debugfs_register(struct intel_pxp *pxp, struct dentry *root);
+#else
+static inline void
+intel_pxp_debugfs_register(struct intel_pxp *pxp, struct dentry *root)
+{
+}
+#endif
+
+#endif /* __INTEL_PXP_DEBUGFS_H__ */
--
2.25.1
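For anyone scripting against the new "info" file, a minimal userspace sketch of a parser may help. It assumes the text layout printed by pxp_info_show() in this patch (either "pxp disabled", or an "active:" line followed by an "instance counter:" line). The struct pxp_info type and pxp_info_parse() are hypothetical names made up for this illustration and are not part of the patch; reading the file from debugfs is left out on purpose, since the mount point and directory depend on the system.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical userspace view of the "info" file; not part of the patch. */
struct pxp_info {
	bool enabled;          /* false when the file reads "pxp disabled" */
	bool active;           /* value of the "active:" line */
	unsigned int instance; /* value of the "instance counter:" line */
};

/*
 * Parse the text emitted by pxp_info_show(). Returns 0 on success and -1
 * if the text matches neither of the two layouts the patch prints.
 */
static int pxp_info_parse(const char *text, struct pxp_info *out)
{
	char state[8];

	assert(text && out);
	memset(out, 0, sizeof(*out));

	/* First layout: PXP is not enabled, nothing else is printed. */
	if (strncmp(text, "pxp disabled", strlen("pxp disabled")) == 0)
		return 0;

	/* Second layout: "active: yes|no" then "instance counter: N". */
	if (sscanf(text, "active: %7s instance counter: %u",
		   state, &out->instance) != 2)
		return -1;

	out->enabled = true;
	out->active = strcmp(state, "yes") == 0;
	return 0;
}
```

A test could write any value to "terminate_state" and then re-read "info" to confirm that the instance counter moved, assuming the session gets re-created.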
[PATCH v9 16/17] drm/i915/pxp: add PXP documentation
Now that all the pieces are in place we can add a description of how the
feature works. Also modify the comments in struct intel_pxp into kerneldoc.

v2: improve doc (Rodrigo)

Signed-off-by: Daniele Ceraolo Spurio
Cc: Daniel Vetter
Cc: Rodrigo Vivi
---
 Documentation/gpu/i915.rst                 | 8
 drivers/gpu/drm/i915/pxp/intel_pxp.c       | 28 +
 drivers/gpu/drm/i915/pxp/intel_pxp_types.h | 47 --
 3 files changed, 71 insertions(+), 12 deletions(-)

diff --git a/Documentation/gpu/i915.rst b/Documentation/gpu/i915.rst
index 101dde3eb1ea..78ecb9d5ec20 100644
--- a/Documentation/gpu/i915.rst
+++ b/Documentation/gpu/i915.rst
@@ -471,6 +471,14 @@ Object Tiling IOCTLs
 .. kernel-doc:: drivers/gpu/drm/i915/gem/i915_gem_tiling.c
    :doc: buffer object tiling

+Protected Objects
+-
+
+.. kernel-doc:: drivers/gpu/drm/i915/pxp/intel_pxp.c
+   :doc: PXP
+
+.. kernel-doc:: drivers/gpu/drm/i915/pxp/intel_pxp_types.h
+
 Microcontrollers

diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp.c b/drivers/gpu/drm/i915/pxp/intel_pxp.c
index 97c6368fddc3..5610634f8929 100644
--- a/drivers/gpu/drm/i915/pxp/intel_pxp.c
+++ b/drivers/gpu/drm/i915/pxp/intel_pxp.c
@@ -11,6 +11,34 @@
 #include "gt/intel_context.h"
 #include "i915_drv.h"

+/**
+ * DOC: PXP
+ *
+ * PXP (Protected Xe Path) is a feature available in Gen12 and newer platforms.
+ * It allows execution and flip to display of protected (i.e. encrypted)
+ * objects. The SW support is enabled via the CONFIG_DRM_I915_PXP kconfig.
+ *
+ * Objects can opt-in to PXP encryption at creation time via the
+ * I915_GEM_CREATE_EXT_PROTECTED_CONTENT create_ext flag. For objects to be
+ * correctly protected they must be used in conjunction with a context created
+ * with the I915_CONTEXT_PARAM_PROTECTED_CONTENT flag. See the documentation
+ * of those two uapi flags for details and restrictions.
+ *
+ * Protected objects are tied to a pxp session; currently we only support one
+ * session, which i915 manages and whose index is available in the uapi
+ * (I915_PROTECTED_CONTENT_DEFAULT_SESSION) for use in instructions targeting
+ * protected objects.
+ * The session is invalidated by the HW when certain events occur (e.g.
+ * suspend/resume). When this happens, all the objects that were used with the
+ * session are marked as invalid and all contexts marked as using protected
+ * content are banned. Any further attempt at using them in an execbuf call is
+ * rejected, while flips are converted to black frames.
+ *
+ * Some of the PXP setup operations are performed by the Management Engine,
+ * which is handled by the mei driver; communication between i915 and mei is
+ * performed via the mei_pxp component module.
+ */
+
 /* KCR register definitions */
 #define KCR_INIT _MMIO(0x320f0)

diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp_types.h b/drivers/gpu/drm/i915/pxp/intel_pxp_types.h
index ae24064bb57e..73ef7d1754e1 100644
--- a/drivers/gpu/drm/i915/pxp/intel_pxp_types.h
+++ b/drivers/gpu/drm/i915/pxp/intel_pxp_types.h
@@ -16,42 +16,65 @@
 struct intel_context;
 struct i915_pxp_component;

+/**
+ * struct intel_pxp - pxp state
+ */
 struct intel_pxp {
+	/**
+	 * @pxp_component: i915_pxp_component struct of the bound mei_pxp
+	 * module. Only set and cleared inside component bind/unbind functions,
+	 * which are protected by &tee_mutex.
+	 */
 	struct i915_pxp_component *pxp_component;
+	/**
+	 * @pxp_component_added: track if the pxp component has been added.
+	 * Set and cleared in tee init and fini functions respectively.
+	 */
 	bool pxp_component_added;

+	/** @ce: kernel-owned context used for PXP operations */
 	struct intel_context *ce;

-	/*
+	/** @arb_mutex: protects arb session start */
+	struct mutex arb_mutex;
+	/**
+	 * @arb_is_valid: tracks arb session status.
 	 * After a teardown, the arb session can still be in play on the HW
 	 * even if the keys are gone, so we can't rely on the HW state of the
 	 * session to know if it's valid and need to track the status in SW.
 	 */
-	struct mutex arb_mutex; /* protects arb session start */
 	bool arb_is_valid;

-	/*
-	 * Keep track of which key instance we're on, so we can use it to
-	 * determine if an object was created using the current key or a
+	/**
+	 * @key_instance: tracks which key instance we're on, so we can use it
+	 * to determine if an object was created using the current key or a
 	 * previous one.
 	 */
 	u32 key_instance;

-	struct mutex tee_mutex; /* protects the tee channel binding */
+	/** @tee_mutex: protects the tee channel binding and messaging. */
+	struct mutex tee_mutex;

-	/*
-	 * If the HW perceives an attack on the integrity of the encryption it
-	 * will invalidate the keys and expect SW
[PATCH v9 12/17] drm/i915/pxp: Enable PXP power management
From: "Huang, Sean Z"

During the power event S3+ sleep/resume, hardware will lose all the
encryption keys for every hardware session, even though the session state
might still be marked as alive after resume. Therefore, we should consider
the session as dead on suspend and invalidate all the objects. The session
will be automatically restarted on the first protected submission on resume.

v2: runtime suspend also invalidates the keys
v3: fix return codes, simplify rpm ops (Chris), use the new worker func
v4: invalidate the objects on suspend, don't re-create the arb session on
    resume (delayed to first submission).
v5: move irq changes back to irq patch (Rodrigo)
v6: drop invalidation in runtime suspend (Rodrigo)

Signed-off-by: Huang, Sean Z
Signed-off-by: Daniele Ceraolo Spurio
Cc: Chris Wilson
Cc: Rodrigo Vivi
---
 drivers/gpu/drm/i915/Makefile                | 1 +
 drivers/gpu/drm/i915/gt/intel_gt_pm.c        | 15 ++-
 drivers/gpu/drm/i915/i915_drv.c              | 2 +
 drivers/gpu/drm/i915/pxp/intel_pxp_irq.c     | 1 +
 drivers/gpu/drm/i915/pxp/intel_pxp_pm.c      | 46
 drivers/gpu/drm/i915/pxp/intel_pxp_pm.h      | 23 ++
 drivers/gpu/drm/i915/pxp/intel_pxp_session.c | 38 +++-
 drivers/gpu/drm/i915/pxp/intel_pxp_tee.c     | 9
 8 files changed, 124 insertions(+), 11 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/pxp/intel_pxp_pm.c
 create mode 100644 drivers/gpu/drm/i915/pxp/intel_pxp_pm.h

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index b22b8c195bb8..366e82cec44d 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -286,6 +286,7 @@ i915-$(CONFIG_DRM_I915_PXP) += \
 	pxp/intel_pxp.o \
 	pxp/intel_pxp_cmd.o \
 	pxp/intel_pxp_irq.o \
+	pxp/intel_pxp_pm.o \
 	pxp/intel_pxp_session.o \
 	pxp/intel_pxp_tee.o

diff --git a/drivers/gpu/drm/i915/gt/intel_gt_pm.c b/drivers/gpu/drm/i915/gt/intel_gt_pm.c
index dea8e2479897..b47a8d8f1bb5 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_pm.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_pm.c
@@ -18,6 +18,7 @@
 #include "intel_rc6.h"
 #include "intel_rps.h"
 #include "intel_wakeref.h"
+#include "pxp/intel_pxp_pm.h"

 static void user_forcewake(struct intel_gt *gt, bool suspend)
 {
@@ -262,6 +263,8 @@ int intel_gt_resume(struct intel_gt *gt)

 	intel_uc_resume(&gt->uc);

+	intel_pxp_resume(&gt->pxp);
+
 	user_forcewake(gt, false);

 out_fw:
@@ -296,6 +299,7 @@ void intel_gt_suspend_prepare(struct intel_gt *gt)
 	user_forcewake(gt, true);
 	wait_for_suspend(gt);

+	intel_pxp_suspend(&gt->pxp, false);
 	intel_uc_suspend(&gt->uc);
 }

@@ -346,6 +350,7 @@ void intel_gt_suspend_late(struct intel_gt *gt)

 void intel_gt_runtime_suspend(struct intel_gt *gt)
 {
+	intel_pxp_suspend(&gt->pxp, true);
 	intel_uc_runtime_suspend(&gt->uc);

 	GT_TRACE(gt, "\n");
@@ -353,11 +358,19 @@ void intel_gt_runtime_suspend(struct intel_gt *gt)

 int intel_gt_runtime_resume(struct intel_gt *gt)
 {
+	int ret;
+
 	GT_TRACE(gt, "\n");
 	intel_gt_init_swizzling(gt);
 	intel_ggtt_restore_fences(gt->ggtt);

-	return intel_uc_runtime_resume(&gt->uc);
+	ret = intel_uc_runtime_resume(&gt->uc);
+	if (ret)
+		return ret;
+
+	intel_pxp_resume(&gt->pxp);
+
+	return 0;
 }

 static ktime_t __intel_gt_get_awake_time(const struct intel_gt *gt)
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 59fb4c710c8c..d5bcc70a22d4 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -67,6 +67,8 @@
 #include "gt/intel_gt_pm.h"
 #include "gt/intel_rc6.h"

+#include "pxp/intel_pxp_pm.h"
+
 #include "i915_debugfs.h"
 #include "i915_drv.h"
 #include "i915_ioc32.h"
diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp_irq.c b/drivers/gpu/drm/i915/pxp/intel_pxp_irq.c
index 340f20d130a8..9e5847c653f2 100644
--- a/drivers/gpu/drm/i915/pxp/intel_pxp_irq.c
+++ b/drivers/gpu/drm/i915/pxp/intel_pxp_irq.c
@@ -9,6 +9,7 @@
 #include "gt/intel_gt_irq.h"
 #include "i915_irq.h"
 #include "i915_reg.h"
+#include "intel_runtime_pm.h"

 /**
  * intel_pxp_irq_handler - Handles PXP interrupts.
diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp_pm.c b/drivers/gpu/drm/i915/pxp/intel_pxp_pm.c
new file mode 100644
index ..23fd86de5a24
--- /dev/null
+++ b/drivers/gpu/drm/i915/pxp/intel_pxp_pm.c
@@ -0,0 +1,46 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright(c) 2020 Intel Corporation.
+ */
+
+#include "intel_pxp.h"
+#include "intel_pxp_irq.h"
+#include "intel_pxp_pm.h"
+#include "intel_pxp_session.h"
+
+void intel_pxp_suspend(struct intel_pxp *pxp, bool runtime)
+{
+	if (!intel_pxp_is_enabled(pxp))
+		return;
+
+	pxp->arb_is_valid = false;
+
+	/*
+	 * Contexts using protected objects keep a runtime PM reference, so we
+	 * can only runtime suspend when all of t
[PATCH v9 13/17] drm/i915/pxp: Add plane decryption support
From: Anshuman Gupta

Add support to enable/disable the PLANE_SURF Decryption Request bit.
Plane decryption only needs to be enabled when the following conditions
are met:
1. PXP session is enabled.
2. Buffer object is protected.

v2:
- Used gen fb obj user_flags instead gem_object_metadata. [Krishna]
v3:
- intel_pxp_gem_object_status() API changes.
v4: use intel_pxp_is_active (Daniele)
v5: rebase and use the new protected object status checker (Daniele)
v6: used plane state for plane_decryption to handle async flip
    as suggested by Ville.
v7: check pxp session while plane decrypt state computation. [Ville]
    removed pointless code. [Ville]
v8 (Daniele): update PXP check
v9: move decrypt check after icl_check_nv12_planes() when overlays
    have fb set (Juston)
v10 (Daniele): update PXP check again to match rework in earlier patches and
    don't consider protection valid if the object has not been used in an
    execbuf beforehand.

Cc: Bommu Krishnaiah
Cc: Huang Sean Z
Cc: Gaurav Kumar
Cc: Ville Syrjälä
Signed-off-by: Anshuman Gupta
Signed-off-by: Daniele Ceraolo Spurio
Signed-off-by: Juston Li
Reviewed-by: Rodrigo Vivi
Reviewed-by: Uma Shankar #v9
---
 drivers/gpu/drm/i915/display/intel_display.c  | 26 +++
 .../drm/i915/display/intel_display_types.h    | 3 +++
 .../drm/i915/display/skl_universal_plane.c    | 15 ---
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    | 2 +-
 drivers/gpu/drm/i915/i915_reg.h               | 1 +
 drivers/gpu/drm/i915/pxp/intel_pxp.c          | 9 ---
 drivers/gpu/drm/i915/pxp/intel_pxp.h          | 7 +++--
 7 files changed, 54 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
index a7ca38613f89..7c19a7b0676a 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -71,6 +71,8 @@
 #include "gt/intel_rps.h"
 #include "gt/gen8_ppgtt.h"

+#include "pxp/intel_pxp.h"
+
 #include "g4x_dp.h"
 #include "g4x_hdmi.h"
 #include "i915_drv.h"
@@ -8994,13 +8996,23 @@ static int intel_bigjoiner_add_affected_planes(struct intel_atomic_state *state)
 	return 0;
 }

+static bool bo_has_valid_encryption(struct drm_i915_gem_object *obj)
+{
+	struct drm_i915_private *i915 = to_i915(obj->base.dev);
+
+	return intel_pxp_key_check(&i915->gt.pxp, obj, false) == 0;
+}
+
 static int intel_atomic_check_planes(struct intel_atomic_state *state)
 {
 	struct drm_i915_private *dev_priv = to_i915(state->base.dev);
 	struct intel_crtc_state *old_crtc_state, *new_crtc_state;
 	struct intel_plane_state *plane_state;
 	struct intel_plane *plane;
+	struct intel_plane_state *new_plane_state;
+	struct intel_plane_state *old_plane_state;
 	struct intel_crtc *crtc;
+	const struct drm_framebuffer *fb;
 	int i, ret;

 	ret = icl_add_linked_planes(state);
@@ -9048,6 +9060,16 @@ static int intel_atomic_check_planes(struct intel_atomic_state *state)
 			return ret;
 	}

+	for_each_new_intel_plane_in_state(state, plane, plane_state, i) {
+		new_plane_state = intel_atomic_get_new_plane_state(state, plane);
+		old_plane_state = intel_atomic_get_old_plane_state(state, plane);
+		fb = new_plane_state->hw.fb;
+		if (fb)
+			new_plane_state->decrypt = bo_has_valid_encryption(intel_fb_obj(fb));
+		else
+			new_plane_state->decrypt = old_plane_state->decrypt;
+	}
+
 	return 0;
 }

@@ -9334,6 +9356,10 @@ static int intel_atomic_check_async(struct intel_atomic_state *state)
 			drm_dbg_kms(&i915->drm, "Color range cannot be changed in async flip\n");
 			return -EINVAL;
 		}
+
+		/* plane decryption is allowed to change only in synchronous flips */
+		if (old_plane_state->decrypt != new_plane_state->decrypt)
+			return -EINVAL;
 	}

 	return 0;
diff --git a/drivers/gpu/drm/i915/display/intel_display_types.h b/drivers/gpu/drm/i915/display/intel_display_types.h
index e9e806d90eec..d75c8bd39abc 100644
--- a/drivers/gpu/drm/i915/display/intel_display_types.h
+++ b/drivers/gpu/drm/i915/display/intel_display_types.h
@@ -625,6 +625,9 @@ struct intel_plane_state {

 	struct intel_fb_view view;

+	/* Plane pxp decryption state */
+	bool decrypt;
+
 	/* plane control register */
 	u32 ctl;

diff --git a/drivers/gpu/drm/i915/display/skl_universal_plane.c b/drivers/gpu/drm/i915/display/skl_universal_plane.c
index 724e7b04f3b6..55e3f093b951 100644
--- a/drivers/gpu/drm/i915/display/skl_universal_plane.c
+++ b/drivers/gpu/drm/i915/display/skl_universal_plane.c
@@ -18,6 +18,7 @@
 #include "intel_sprite.h"
 #include "skl_scaler.h"
 #include "skl_universal_plane.h"
+#include "pxp/intel_pxp.h"
[PATCH v9 09/17] drm/i915/pxp: Implement PXP irq handler
From: "Huang, Sean Z"

The HW will generate a teardown interrupt when session termination is
required, which requires i915 to submit a terminating batch. Once the HW
is done with the termination it will generate another interrupt, at
which point it is safe to re-create the session.

Since the termination and re-creation flow is something we want to
trigger from the driver as well, use a common work function that can be
called both from the irq handler and from the driver set-up flows, which
has the added benefit of allowing us to skip any extra locks because
the work itself serializes the operations.

v2: use struct completion instead of bool (Chris)
v3: drop locks, clean up functions and improve comments (Chris),
    move to common work function.
v4: improve comments, simplify wait logic (Rodrigo)
v5: unconditionally set interrupts, rename state_attacked var (Rodrigo)

Signed-off-by: Huang, Sean Z
Signed-off-by: Daniele Ceraolo Spurio
Cc: Chris Wilson
Cc: Rodrigo Vivi
Reviewed-by: Rodrigo Vivi
---
 drivers/gpu/drm/i915/Makefile                | 1 +
 drivers/gpu/drm/i915/gt/intel_gt_irq.c       | 7 ++
 drivers/gpu/drm/i915/i915_reg.h              | 1 +
 drivers/gpu/drm/i915/pxp/intel_pxp.c         | 66 +++--
 drivers/gpu/drm/i915/pxp/intel_pxp.h         | 8 ++
 drivers/gpu/drm/i915/pxp/intel_pxp_irq.c     | 99
 drivers/gpu/drm/i915/pxp/intel_pxp_irq.h     | 32 +++
 drivers/gpu/drm/i915/pxp/intel_pxp_session.c | 54 ++-
 drivers/gpu/drm/i915/pxp/intel_pxp_session.h | 5 +-
 drivers/gpu/drm/i915/pxp/intel_pxp_tee.c     | 8 +-
 drivers/gpu/drm/i915/pxp/intel_pxp_types.h   | 18
 11 files changed, 283 insertions(+), 16 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/pxp/intel_pxp_irq.c
 create mode 100644 drivers/gpu/drm/i915/pxp/intel_pxp_irq.h

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 4fb663de344d..b22b8c195bb8 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -285,6 +285,7 @@ i915-y += i915_perf.o
 i915-$(CONFIG_DRM_I915_PXP) += \
 	pxp/intel_pxp.o \
 	pxp/intel_pxp_cmd.o \
+	pxp/intel_pxp_irq.o \
 	pxp/intel_pxp_session.o \
 	pxp/intel_pxp_tee.o

diff --git a/drivers/gpu/drm/i915/gt/intel_gt_irq.c b/drivers/gpu/drm/i915/gt/intel_gt_irq.c
index b2de83be4d97..699a74582d32 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_irq.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_irq.c
@@ -13,6 +13,7 @@
 #include "intel_lrc_reg.h"
 #include "intel_uncore.h"
 #include "intel_rps.h"
+#include "pxp/intel_pxp_irq.h"

 static void guc_irq_handler(struct intel_guc *guc, u16 iir)
 {
@@ -64,6 +65,9 @@ gen11_other_irq_handler(struct intel_gt *gt, const u8 instance,
 	if (instance == OTHER_GTPM_INSTANCE)
 		return gen11_rps_irq_handler(&gt->rps, iir);

+	if (instance == OTHER_KCR_INSTANCE)
+		return intel_pxp_irq_handler(&gt->pxp, iir);
+
 	WARN_ONCE(1, "unhandled other interrupt instance=0x%x, iir=0x%x\n",
 		  instance, iir);
 }
@@ -196,6 +200,9 @@ void gen11_gt_irq_reset(struct intel_gt *gt)
 	intel_uncore_write(uncore, GEN11_GPM_WGBOXPERF_INTR_MASK, ~0);
 	intel_uncore_write(uncore, GEN11_GUC_SG_INTR_ENABLE, 0);
 	intel_uncore_write(uncore, GEN11_GUC_SG_INTR_MASK, ~0);
+
+	intel_uncore_write(uncore, GEN11_CRYPTO_RSVD_INTR_ENABLE, 0);
+	intel_uncore_write(uncore, GEN11_CRYPTO_RSVD_INTR_MASK, ~0);
 }

 void gen11_gt_irq_postinstall(struct intel_gt *gt)
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index c2853cc005ee..84bc884bd474 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -8117,6 +8117,7 @@ enum {
 /* irq instances for OTHER_CLASS */
 #define OTHER_GUC_INSTANCE	0
 #define OTHER_GTPM_INSTANCE	1
+#define OTHER_KCR_INSTANCE	4

 #define GEN11_INTR_IDENTITY_REG(x)	_MMIO(0x190060 + ((x) * 4))

diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp.c b/drivers/gpu/drm/i915/pxp/intel_pxp.c
index 26176d43a02d..b0c7edc10cc3 100644
--- a/drivers/gpu/drm/i915/pxp/intel_pxp.c
+++ b/drivers/gpu/drm/i915/pxp/intel_pxp.c
@@ -2,7 +2,9 @@
 /*
  * Copyright(c) 2020 Intel Corporation.
  */
+#include
 #include "intel_pxp.h"
+#include "intel_pxp_irq.h"
 #include "intel_pxp_session.h"
 #include "intel_pxp_tee.h"
 #include "gt/intel_context.h"
@@ -68,6 +70,16 @@ void intel_pxp_init(struct intel_pxp *pxp)

 	mutex_init(&pxp->tee_mutex);

+	/*
+	 * we'll use the completion to check if there is a termination pending,
+	 * so we start it as completed and we reinit it when a termination
+	 * is triggered.
+	 */
+	init_completion(&pxp->termination);
+	complete_all(&pxp->termination);
+
+	INIT_WORK(&pxp->session_work, intel_pxp_session_work);
+
 	ret = create_vcs_context(pxp);
 	if (ret)
 		return;
@@ -96,19 +108,61 @@ void intel_pxp_fini(struct in
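The termination flow described in the commit message (teardown interrupt, terminating batch, completion interrupt, session re-creation) can be pictured with a small single-threaded model. This is only a sketch under loud assumptions: struct pxp_model, the event names and the counters are invented for illustration, a plain bool stands in for the struct completion, and the real driver does this work from a work item rather than inline.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical events, loosely matching the two interrupts described. */
enum pxp_event {
	PXP_EV_TEARDOWN_IRQ,     /* HW asks for a session termination */
	PXP_EV_TERMINATION_DONE, /* HW reports the termination finished */
};

/* Hypothetical model state; not the driver's struct intel_pxp. */
struct pxp_model {
	bool arb_is_valid;             /* session usable for protected work */
	bool termination_pending;      /* stands in for the completion */
	unsigned int batches_sent;     /* terminating batches submitted */
	unsigned int sessions_created; /* sessions (re-)created */
};

static void pxp_model_init(struct pxp_model *m)
{
	/* Like the patch, start with no termination pending ("completed"). */
	*m = (struct pxp_model){ 0 };
}

static void pxp_model_handle(struct pxp_model *m, enum pxp_event ev)
{
	switch (ev) {
	case PXP_EV_TEARDOWN_IRQ:
		/* The keys are gone: invalidate, then terminate the session. */
		m->arb_is_valid = false;
		m->termination_pending = true; /* reinit_completion() analogue */
		m->batches_sent++;
		break;
	case PXP_EV_TERMINATION_DONE:
		if (!m->termination_pending)
			break; /* nothing was in flight, ignore */
		/* Only now is it safe to re-create the session. */
		m->sessions_created++;
		m->arb_is_valid = true;
		m->termination_pending = false; /* complete_all() analogue */
		break;
	}
}
```

Because every transition happens inside one handler, the model needs no locking, which mirrors the argument in the commit message that the work function itself serializes the operations.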
[PATCH v9 11/17] drm/i915/pxp: start the arb session on demand
Now that we can handle destruction and re-creation of the arb session,
we can postpone the start of the session to the first submission that
requires it, to avoid keeping it running with no user.

Signed-off-by: Daniele Ceraolo Spurio
Reviewed-by: Rodrigo Vivi
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c  | 4 ++-
 drivers/gpu/drm/i915/pxp/intel_pxp.c         | 37 +---
 drivers/gpu/drm/i915/pxp/intel_pxp.h         | 5 +--
 drivers/gpu/drm/i915/pxp/intel_pxp_irq.c     | 2 +-
 drivers/gpu/drm/i915/pxp/intel_pxp_session.c | 6 ++--
 drivers/gpu/drm/i915/pxp/intel_pxp_tee.c     | 10 +-
 drivers/gpu/drm/i915/pxp/intel_pxp_types.h   | 2 ++
 7 files changed, 37 insertions(+), 29 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 3418be4f727f..f1a6cfc33148 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -267,7 +267,9 @@ static int proto_context_set_protected(struct drm_i915_private *i915,
 		 * which in turn requires the device to be active.
 		 */
 		pc->pxp_wakeref = intel_runtime_pm_get(&i915->runtime_pm);
-		ret = intel_pxp_wait_for_arb_start(&i915->gt.pxp);
+
+		if (!intel_pxp_is_active(&i915->gt.pxp))
+			ret = intel_pxp_start(&i915->gt.pxp);
 	}

 	return ret;
diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp.c b/drivers/gpu/drm/i915/pxp/intel_pxp.c
index e49e60567a56..e183ac479e8b 100644
--- a/drivers/gpu/drm/i915/pxp/intel_pxp.c
+++ b/drivers/gpu/drm/i915/pxp/intel_pxp.c
@@ -79,6 +79,7 @@ void intel_pxp_init(struct intel_pxp *pxp)
 	init_completion(&pxp->termination);
 	complete_all(&pxp->termination);

+	mutex_init(&pxp->arb_mutex);
 	INIT_WORK(&pxp->session_work, intel_pxp_session_work);

 	ret = create_vcs_context(pxp);
@@ -115,7 +116,7 @@ void intel_pxp_mark_termination_in_progress(struct intel_pxp *pxp)
 	reinit_completion(&pxp->termination);
 }

-static void intel_pxp_queue_termination(struct intel_pxp *pxp)
+static void pxp_queue_termination(struct intel_pxp *pxp)
 {
 	struct intel_gt *gt = pxp_to_gt(pxp);

@@ -134,31 +135,41 @@ static void intel_pxp_queue_termination(struct intel_pxp *pxp)
  * the arb session is restarted from the irq work when we receive the
  * termination completion interrupt
  */
-int intel_pxp_wait_for_arb_start(struct intel_pxp *pxp)
+int intel_pxp_start(struct intel_pxp *pxp)
 {
+	int ret = 0;
+
 	if (!intel_pxp_is_enabled(pxp))
-		return 0;
+		return -ENODEV;
+
+	mutex_lock(&pxp->arb_mutex);
+
+	if (pxp->arb_is_valid)
+		goto unlock;
+
+	pxp_queue_termination(pxp);

 	if (!wait_for_completion_timeout(&pxp->termination,
-					 msecs_to_jiffies(100)))
-		return -ETIMEDOUT;
+					msecs_to_jiffies(100))) {
+		ret = -ETIMEDOUT;
+		goto unlock;
+	}
+
+	/* make sure the compiler doesn't optimize the double access */
+	barrier();

 	if (!pxp->arb_is_valid)
-		return -EIO;
+		ret = -EIO;

-	return 0;
+unlock:
+	mutex_unlock(&pxp->arb_mutex);
+	return ret;
 }

 void intel_pxp_init_hw(struct intel_pxp *pxp)
 {
 	kcr_pxp_enable(pxp_to_gt(pxp));
 	intel_pxp_irq_enable(pxp);
-
-	/*
-	 * the session could've been attacked while we weren't loaded, so
-	 * handle it as if it was and re-create it.
-	 */
-	intel_pxp_queue_termination(pxp);
 }

 void intel_pxp_fini_hw(struct intel_pxp *pxp)
diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp.h b/drivers/gpu/drm/i915/pxp/intel_pxp.h
index f942bdd2af0c..424fe00a91fb 100644
--- a/drivers/gpu/drm/i915/pxp/intel_pxp.h
+++ b/drivers/gpu/drm/i915/pxp/intel_pxp.h
@@ -34,7 +34,8 @@ void intel_pxp_init_hw(struct intel_pxp *pxp);
 void intel_pxp_fini_hw(struct intel_pxp *pxp);

 void intel_pxp_mark_termination_in_progress(struct intel_pxp *pxp);
-int intel_pxp_wait_for_arb_start(struct intel_pxp *pxp);
+
+int intel_pxp_start(struct intel_pxp *pxp);

 int intel_pxp_key_check(struct intel_pxp *pxp, struct drm_i915_gem_object *obj);

@@ -48,7 +49,7 @@ static inline void intel_pxp_fini(struct intel_pxp *pxp)
 {
 }

-static inline int intel_pxp_wait_for_arb_start(struct intel_pxp *pxp)
+static inline int intel_pxp_start(struct intel_pxp *pxp)
 {
 	return -ENODEV;
 }
diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp_irq.c b/drivers/gpu/drm/i915/pxp/intel_pxp_irq.c
index 46eca1e81b9b..340f20d130a8 100644
--- a/drivers/gpu/drm/i915/pxp/intel_pxp_irq.c
+++ b/drivers/gpu/drm/i915/pxp/intel_pxp_irq.c
@@ -31,7 +31,7 @@ void intel_pxp_irq_handler(struct intel_pxp *pxp, u16 iir)
 		   GEN12_DISPLAY_APP_TERMINATED_PER_FW_REQ_INTERRUPT)) {
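To make the return-code contract of the new intel_pxp_start() easier to see at a glance, here is a userspace sketch that follows the same decision order as the hunk above. Everything in it is an assumption made for illustration: struct pxp_sim and its fields are hypothetical, the mutex is omitted because the model is single-threaded, and the wait on the completion is reduced to a flag that says whether the hardware answered in time.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the driver state; not struct intel_pxp. */
struct pxp_sim {
	bool enabled;      /* PXP supported and initialised */
	bool arb_is_valid; /* arb session currently usable */
	bool hw_responds;  /* termination completes before the timeout */
	bool restart_ok;   /* session comes back valid after the restart */
	unsigned int terminations_queued;
};

/* Same ordering of checks as intel_pxp_start() in the patch. */
static int pxp_sim_start(struct pxp_sim *p)
{
	if (!p->enabled)
		return -ENODEV;

	/* Already running: nothing to do, and nothing gets queued. */
	if (p->arb_is_valid)
		return 0;

	/* Force a termination so the session is re-created from scratch. */
	p->terminations_queued++;

	if (!p->hw_responds)
		return -ETIMEDOUT;

	p->arb_is_valid = p->restart_ok;
	return p->arb_is_valid ? 0 : -EIO;
}
```

The point of the early return on a valid session is that repeated context creation does not restart a working session, which matches the intent of starting the session only on demand.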
[PATCH v9 17/17] drm/i915/pxp: enable PXP for integrated Gen12
Note that discrete cards can support PXP as well, but we haven't tested
on those yet so keeping it disabled for now.

Signed-off-by: Daniele Ceraolo Spurio
Reviewed-by: Rodrigo Vivi
---
 drivers/gpu/drm/i915/i915_pci.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
index d4a6a9dcf182..169837de395d 100644
--- a/drivers/gpu/drm/i915/i915_pci.c
+++ b/drivers/gpu/drm/i915/i915_pci.c
@@ -865,6 +865,7 @@ static const struct intel_device_info jsl_info = {
 	}, \
 	TGL_CURSOR_OFFSETS, \
 	.has_global_mocs = 1, \
+	.has_pxp = 1, \
 	.display.has_dsb = 1

 static const struct intel_device_info tgl_info = {
@@ -891,6 +892,7 @@ static const struct intel_device_info rkl_info = {
 #define DGFX_FEATURES \
 	.memory_regions = REGION_SMEM | REGION_LMEM | REGION_STOLEN_LMEM, \
 	.has_llc = 0, \
+	.has_pxp = 0, \
 	.has_snoop = 1, \
 	.is_dgfx = 1
--
2.25.1
[PATCH v9 14/17] drm/i915/pxp: black pixels on pxp disabled
From: Anshuman Gupta

When protected surfaces are flipped while the pxp session is disabled,
display black pixels by using plane color CTM correction.

v2:
- Display black pixels in async flip too.

v3:
- Removed the black pixels logic for async flip. [Ville]
- Used plane state to force black pixels. [Ville]

v4 (Daniele): update pxp_is_borked check.

v5: rebase on top of v9 plane decryption moving the decrypt check
    (Juston)

Cc: Ville Syrjälä
Cc: Gaurav Kumar
Cc: Shankar Uma
Signed-off-by: Anshuman Gupta
Signed-off-by: Daniele Ceraolo Spurio
Signed-off-by: Juston Li
Reviewed-by: Rodrigo Vivi
Reviewed-by: Uma Shankar
---
 drivers/gpu/drm/i915/display/intel_display.c  | 12 -
 .../drm/i915/display/intel_display_types.h    | 3 ++
 .../drm/i915/display/skl_universal_plane.c    | 36 ++-
 drivers/gpu/drm/i915/i915_reg.h               | 46 +++
 4 files changed, 94 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
index 7c19a7b0676a..755f3e32516d 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -9003,6 +9003,11 @@ static bool bo_has_valid_encryption(struct drm_i915_gem_object *obj)
 	return intel_pxp_key_check(&i915->gt.pxp, obj, false) == 0;
 }

+static bool pxp_is_borked(struct drm_i915_gem_object *obj)
+{
+	return i915_gem_object_is_protected(obj) && !bo_has_valid_encryption(obj);
+}
+
 static int intel_atomic_check_planes(struct intel_atomic_state *state)
 {
 	struct drm_i915_private *dev_priv = to_i915(state->base.dev);
@@ -9064,10 +9069,13 @@ static int intel_atomic_check_planes(struct intel_atomic_state *state)
 		new_plane_state = intel_atomic_get_new_plane_state(state, plane);
 		old_plane_state = intel_atomic_get_old_plane_state(state, plane);
 		fb = new_plane_state->hw.fb;
-		if (fb)
+		if (fb) {
 			new_plane_state->decrypt = bo_has_valid_encryption(intel_fb_obj(fb));
-		else
+			new_plane_state->force_black = pxp_is_borked(intel_fb_obj(fb));
+		} else {
 			new_plane_state->decrypt = old_plane_state->decrypt;
+			new_plane_state->force_black = old_plane_state->force_black;
+		}
 	}

 	return 0;
diff --git a/drivers/gpu/drm/i915/display/intel_display_types.h b/drivers/gpu/drm/i915/display/intel_display_types.h
index d75c8bd39abc..9fa4ef06e377 100644
--- a/drivers/gpu/drm/i915/display/intel_display_types.h
+++ b/drivers/gpu/drm/i915/display/intel_display_types.h
@@ -628,6 +628,9 @@ struct intel_plane_state {
 	/* Plane pxp decryption state */
 	bool decrypt;

+	/* Plane state to display black pixels when pxp is borked */
+	bool force_black;
+
 	/* plane control register */
 	u32 ctl;

diff --git a/drivers/gpu/drm/i915/display/skl_universal_plane.c b/drivers/gpu/drm/i915/display/skl_universal_plane.c
index 55e3f093b951..c4adcb3e12b3 100644
--- a/drivers/gpu/drm/i915/display/skl_universal_plane.c
+++ b/drivers/gpu/drm/i915/display/skl_universal_plane.c
@@ -1002,6 +1002,33 @@ static u32 skl_surf_address(const struct intel_plane_state *plane_state,
 	}
 }

+static void intel_load_plane_csc_black(struct intel_plane *intel_plane)
+{
+	struct drm_i915_private *dev_priv = to_i915(intel_plane->base.dev);
+	enum pipe pipe = intel_plane->pipe;
+	enum plane_id plane = intel_plane->id;
+	u16 postoff = 0;
+
+	drm_dbg_kms(&dev_priv->drm, "plane color CTM to black %s:%d\n",
+		    intel_plane->base.name, plane);
+	intel_de_write_fw(dev_priv, PLANE_CSC_COEFF(pipe, plane, 0), 0);
+	intel_de_write_fw(dev_priv, PLANE_CSC_COEFF(pipe, plane, 1), 0);
+
+	intel_de_write_fw(dev_priv, PLANE_CSC_COEFF(pipe, plane, 2), 0);
+	intel_de_write_fw(dev_priv, PLANE_CSC_COEFF(pipe, plane, 3), 0);
+
+	intel_de_write_fw(dev_priv, PLANE_CSC_COEFF(pipe, plane, 4), 0);
+	intel_de_write_fw(dev_priv, PLANE_CSC_COEFF(pipe, plane, 5), 0);
+
+	intel_de_write_fw(dev_priv, PLANE_CSC_PREOFF(pipe, plane, 0), 0);
+	intel_de_write_fw(dev_priv, PLANE_CSC_PREOFF(pipe, plane, 1), 0);
+	intel_de_write_fw(dev_priv, PLANE_CSC_PREOFF(pipe, plane, 2), 0);
+
+	intel_de_write_fw(dev_priv, PLANE_CSC_POSTOFF(pipe, plane, 0), postoff);
+	intel_de_write_fw(dev_priv, PLANE_CSC_POSTOFF(pipe, plane, 1), postoff);
+	intel_de_write_fw(dev_priv, PLANE_CSC_POSTOFF(pipe, plane, 2), postoff);
+}
+
 static void
 skl_program_plane(struct intel_plane *plane,
 		  const struct intel_crtc_state *crtc_state,
@@ -1115,14 +1142,21 @@ skl_program_plane(struct intel_plane *plane,
 	 */
 	intel_de_write_fw(dev_priv, PLANE_CTL(pipe, plane_id), plane_ctl);
 	plane_surf = intel_plane_ggtt_offs
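The way the decrypt and force_black plane flags are derived can be summarised with a small standalone sketch. It mirrors the logic of intel_atomic_check_planes() and the async-flip check from the previous patch, but all types and names below (struct fb_model, struct plane_flags, plane_flags_compute(), async_flip_allowed()) are hypothetical. It also assumes that the key check can only succeed for protected objects, which is how the sketch models bo_has_valid_encryption().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of a framebuffer's backing object. */
struct fb_model {
	bool is_protected; /* object created as protected content */
	bool key_valid;    /* object key matches the current session */
};

/* Hypothetical subset of the plane state touched by these patches. */
struct plane_flags {
	bool decrypt;     /* request decryption for this plane */
	bool force_black; /* show black pixels instead of the content */
};

/* With no framebuffer, the old flags are carried over unchanged. */
static struct plane_flags
plane_flags_compute(const struct fb_model *fb, struct plane_flags old)
{
	struct plane_flags new_flags = old;

	if (fb != NULL) {
		/* Assumption: only protected objects can pass the key check. */
		bool valid = fb->is_protected && fb->key_valid;

		new_flags.decrypt = valid;
		/* "borked": protected content whose key is no longer valid */
		new_flags.force_black = fb->is_protected && !valid;
	}

	return new_flags;
}

/* Async flips must not change the decryption state of a plane. */
static bool async_flip_allowed(struct plane_flags old,
			       struct plane_flags new_flags)
{
	return old.decrypt == new_flags.decrypt;
}
```

Under these assumptions the two flags are never set together: a plane either decrypts valid protected content, shows black for invalidated protected content, or does neither for ordinary buffers.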
[PATCH v9 10/17] drm/i915/pxp: interfaces for using protected objects
This api allow user mode to create protected buffers and to mark contexts as making use of such objects. Only when using contexts marked in such a way is the execution guaranteed to work as expected. Contexts can only be marked as using protected content at creation time (i.e. the parameter is immutable) and they must be both bannable and not recoverable. Given that the protected session gets invalidated on suspend, contexts created this way hold a runtime pm wakeref until they're either destroyed or invalidated. All protected objects and contexts will be considered invalid when the PXP session is destroyed and all new submissions using them will be rejected. All intel contexts within the invalidated gem contexts will be marked banned. Userspace can detect that an invalidation has occurred via the RESET_STATS ioctl, where we report it the same way as a ban due to a hang. v5: squash patches, rebase on proto_ctx, update kerneldoc v6: rebase on obj create_ext changes v7: Use session counter to check if an object it valid, hold wakeref in context, don't add a new flag to RESET_STATS (Daniel) v8: don't increase guilty count for contexts banned during pxp invalidation (Rodrigo) v9: better comments, avoid wakeref put race between pxp_inval and context_close, add usage examples (Rodrigo) Signed-off-by: Daniele Ceraolo Spurio Signed-off-by: Bommu Krishnaiah Cc: Rodrigo Vivi Cc: Chris Wilson Cc: Lionel Landwerlin Cc: Jason Ekstrand Cc: Daniel Vetter --- drivers/gpu/drm/i915/gem/i915_gem_context.c | 98 --- drivers/gpu/drm/i915/gem/i915_gem_context.h | 6 ++ .../gpu/drm/i915/gem/i915_gem_context_types.h | 28 ++ drivers/gpu/drm/i915/gem/i915_gem_create.c| 72 ++ .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 18 drivers/gpu/drm/i915/gem/i915_gem_object.c| 1 + drivers/gpu/drm/i915/gem/i915_gem_object.h| 6 ++ .../gpu/drm/i915/gem/i915_gem_object_types.h | 8 ++ .../gpu/drm/i915/gem/selftests/mock_context.c | 4 +- drivers/gpu/drm/i915/pxp/intel_pxp.c | 78 +++ 
drivers/gpu/drm/i915/pxp/intel_pxp.h | 12 +++ drivers/gpu/drm/i915/pxp/intel_pxp_session.c | 6 ++ drivers/gpu/drm/i915/pxp/intel_pxp_types.h| 9 ++ include/uapi/drm/i915_drm.h | 96 +- 14 files changed, 407 insertions(+), 35 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c index c2ab0e22db0a..3418be4f727f 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c @@ -77,6 +77,8 @@ #include "gt/intel_gpu_commands.h" #include "gt/intel_ring.h" +#include "pxp/intel_pxp.h" + #include "i915_gem_context.h" #include "i915_trace.h" #include "i915_user_extensions.h" @@ -186,10 +188,13 @@ static int validate_priority(struct drm_i915_private *i915, return 0; } -static void proto_context_close(struct i915_gem_proto_context *pc) +static void proto_context_close(struct drm_i915_private *i915, + struct i915_gem_proto_context *pc) { int i; + if (pc->pxp_wakeref) + intel_runtime_pm_put(&i915->runtime_pm, pc->pxp_wakeref); if (pc->vm) i915_vm_put(pc->vm); if (pc->user_engines) { @@ -241,6 +246,33 @@ static int proto_context_set_persistence(struct drm_i915_private *i915, return 0; } +static int proto_context_set_protected(struct drm_i915_private *i915, + struct i915_gem_proto_context *pc, + bool protected) +{ + int ret = 0; + + if (!intel_pxp_is_enabled(&i915->gt.pxp)) { + ret = -ENODEV; + } else if (!protected) { + pc->uses_protected_content = false; + } else if ((pc->user_flags & BIT(UCONTEXT_RECOVERABLE)) || + !(pc->user_flags & BIT(UCONTEXT_BANNABLE))) { + ret = -EPERM; + } else { + pc->uses_protected_content = true; + + /* +* protected context usage requires the PXP session to be up, +* which in turn requires the device to be active. 
+*/ + pc->pxp_wakeref = intel_runtime_pm_get(&i915->runtime_pm); + ret = intel_pxp_wait_for_arb_start(&i915->gt.pxp); + } + + return ret; +} + static struct i915_gem_proto_context * proto_context_create(struct drm_i915_private *i915, unsigned int flags) { @@ -269,7 +301,7 @@ proto_context_create(struct drm_i915_private *i915, unsigned int flags) return pc; proto_close: - proto_context_close(pc); + proto_context_close(i915, pc); return err; } @@ -693,6 +725,8 @@ static int set_proto_ctx_param(struct drm_i915_file_private *fpriv, ret = -EPERM; else if (args->value) pc->user_flag
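As a reading aid for proto_context_set_protected() in the hunk above, here is a minimal stand-alone sketch of the same decision logic in plain userspace C. The names (`model_ctx`, `model_set_protected`, the `MODEL_*` flags) are hypothetical stand-ins and not i915 API; taking the runtime pm wakeref and waiting for the arb session to start are deliberately left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the UCONTEXT_* user flag bits. */
enum {
	MODEL_BANNABLE    = 1u << 0,
	MODEL_RECOVERABLE = 1u << 1,
};

struct model_ctx {
	unsigned int user_flags;
	bool uses_protected_content;
};

/*
 * Follows the branch order of the patch: no PXP -> -ENODEV, clearing the
 * flag is always allowed, setting it requires bannable && !recoverable.
 */
static int model_set_protected(struct model_ctx *pc, bool pxp_enabled,
			       bool protected_content)
{
	assert(pc != NULL);

	if (!pxp_enabled)
		return -ENODEV;

	if (!protected_content) {
		pc->uses_protected_content = false;
		return 0;
	}

	if ((pc->user_flags & MODEL_RECOVERABLE) ||
	    !(pc->user_flags & MODEL_BANNABLE))
		return -EPERM;

	pc->uses_protected_content = true;
	return 0;
}
```

In this model a context that is still marked recoverable is refused with -EPERM, which matches the commit message's requirement that protected contexts be "both bannable and not recoverable".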
[PATCH v9 08/17] drm/i915/pxp: Implement arb session teardown
From: "Huang, Sean Z"

Teardown is triggered when the display topology changes and no longer meets the secure playback requirement, and hardware trashes all the encryption keys for display. Additionally, we want to emit a teardown operation to make sure we're clean on boot and resume.

v2: emit in the ring, use high prio request (Chris)
v3: better defines, stalling flush, cleaned up and renamed submission funcs (Chris)

Signed-off-by: Huang, Sean Z
Signed-off-by: Daniele Ceraolo Spurio
Cc: Chris Wilson
Reviewed-by: Rodrigo Vivi
---
drivers/gpu/drm/i915/Makefile| 1 + drivers/gpu/drm/i915/gt/intel_gpu_commands.h | 22 ++- drivers/gpu/drm/i915/pxp/intel_pxp.c | 7 +- drivers/gpu/drm/i915/pxp/intel_pxp_cmd.c | 141 +++ drivers/gpu/drm/i915/pxp/intel_pxp_cmd.h | 15 ++ drivers/gpu/drm/i915/pxp/intel_pxp_session.c | 29 drivers/gpu/drm/i915/pxp/intel_pxp_session.h | 1 + 7 files changed, 212 insertions(+), 4 deletions(-) create mode 100644 drivers/gpu/drm/i915/pxp/intel_pxp_cmd.c create mode 100644 drivers/gpu/drm/i915/pxp/intel_pxp_cmd.h diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile index 405e04f4dd59..4fb663de344d 100644 --- a/drivers/gpu/drm/i915/Makefile +++ b/drivers/gpu/drm/i915/Makefile @@ -284,6 +284,7 @@ i915-y += i915_perf.o # Protected execution platform (PXP) support i915-$(CONFIG_DRM_I915_PXP) += \ pxp/intel_pxp.o \ + pxp/intel_pxp_cmd.o \ pxp/intel_pxp_session.o \ pxp/intel_pxp_tee.o diff --git a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h index 1c3af0fc0456..ec2a0a566c40 100644 --- a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h +++ b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h @@ -28,10 +28,13 @@ #define INSTR_26_TO_24_MASK0x700 #define INSTR_26_TO_24_SHIFT 24 +#define __INSTR(client) ((client) << INSTR_CLIENT_SHIFT) + /* * Memory interface instructions used by the kernel */ -#define MI_INSTR(opcode, flags) (((opcode) << 23) | (flags)) +#define MI_INSTR(opcode, flags) \ + 
(__INSTR(INSTR_MI_CLIENT) | (opcode) << 23 | (flags)) /* Many MI commands use bit 22 of the header dword for GGTT vs PPGTT */ #define MI_GLOBAL_GTT(1<<22) @@ -57,6 +60,7 @@ #define MI_SUSPEND_FLUSH MI_INSTR(0x0b, 0) #define MI_SUSPEND_FLUSH_EN (1<<0) #define MI_SET_APPID MI_INSTR(0x0e, 0) +#define MI_SET_APPID_SESSION_ID(x) ((x) << 0) #define MI_OVERLAY_FLIPMI_INSTR(0x11, 0) #define MI_OVERLAY_CONTINUE (0x0<<21) #define MI_OVERLAY_ON(0x1<<21) @@ -146,6 +150,7 @@ #define MI_STORE_REGISTER_MEM_GEN8 MI_INSTR(0x24, 2) #define MI_SRM_LRM_GLOBAL_GTT(1<<22) #define MI_FLUSH_DWMI_INSTR(0x26, 1) /* for GEN6 */ +#define MI_FLUSH_DW_PROTECTED_MEM_EN (1<<22) #define MI_FLUSH_DW_STORE_INDEX (1<<21) #define MI_INVALIDATE_TLB(1<<18) #define MI_FLUSH_DW_OP_STOREDW (1<<14) @@ -272,6 +277,19 @@ #define MI_MATH_REG_ZF 0x32 #define MI_MATH_REG_CF 0x33 +/* + * Media instructions used by the kernel + */ +#define MEDIA_INSTR(pipe, op, sub_op, flags) \ + (__INSTR(INSTR_RC_CLIENT) | (pipe) << INSTR_SUBCLIENT_SHIFT | \ + (op) << INSTR_26_TO_24_SHIFT | (sub_op) << 16 | (flags)) + +#define MFX_WAIT MEDIA_INSTR(1, 0, 0, 0) +#define MFX_WAIT_DW0_MFX_SYNC_CONTROL_FLAGREG_BIT(8) +#define MFX_WAIT_DW0_PXP_SYNC_CONTROL_FLAGREG_BIT(9) + +#define CRYPTO_KEY_EXCHANGEMEDIA_INSTR(2, 6, 9, 0) + /* * Commands used only by the command parser */ @@ -328,8 +346,6 @@ #define GFX_OP_3DSTATE_BINDING_TABLE_EDIT_PS \ ((0x3<<29)|(0x3<<27)|(0x0<<24)|(0x47<<16)) -#define MFX_WAIT ((0x3<<29)|(0x1<<27)|(0x0<<16)) - #define COLOR_BLT ((0x2<<29)|(0x40<<22)) #define SRC_COPY_BLT ((0x2<<29)|(0x43<<22)) diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp.c b/drivers/gpu/drm/i915/pxp/intel_pxp.c index e1370f323126..26176d43a02d 100644 --- a/drivers/gpu/drm/i915/pxp/intel_pxp.c +++ b/drivers/gpu/drm/i915/pxp/intel_pxp.c @@ -98,9 +98,14 @@ void intel_pxp_fini(struct intel_pxp *pxp) void intel_pxp_init_hw(struct intel_pxp *pxp) { + int ret; + kcr_pxp_enable(pxp_to_gt(pxp)); - intel_pxp_create_arb_session(pxp); + /* always emit a 
full termination to clean the state */ + ret = intel_pxp_terminate_arb_session_and_global(pxp); + if (!ret) + intel_pxp_create_arb_session(pxp); } void intel_pxp_fini_hw(struct intel_pxp *pxp) diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp_cmd.c b/drivers/gpu/drm/i915/pxp/intel_pxp_cmd.c new file mode 100644 index ..80678dafde15 --- /dev/null +++ b/drivers/gpu/drm/i915/pxp/intel_pxp_cmd.c @@ -0,0 +1,141 @@ +// SPDX-License-Identifier: MIT +/* + * Copyright(c) 2020, Intel Corporation. All rights reserved. + */ + +#in
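The intel_gpu_commands.h hunk in this patch replaces the open-coded MFX_WAIT with a MEDIA_INSTR() helper. A small userspace sketch can confirm that the two encodings agree. The client and subclient constants below are assumptions: only INSTR_26_TO_24_SHIFT is visible in the excerpt, and the others are taken to match the existing i915 header. The `MODEL_*` macro names are hypothetical, chosen to avoid reserved double-underscore identifiers outside the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to match the i915 header; only the last one appears in the excerpt. */
#define INSTR_CLIENT_SHIFT	29
#define INSTR_MI_CLIENT		0x0u
#define INSTR_RC_CLIENT		0x3u
#define INSTR_SUBCLIENT_SHIFT	27
#define INSTR_26_TO_24_SHIFT	24

/* Userspace copies of the helpers introduced by the patch. */
#define MODEL_INSTR(client)	((uint32_t)(client) << INSTR_CLIENT_SHIFT)
#define MODEL_MI_INSTR(opcode, flags) \
	(MODEL_INSTR(INSTR_MI_CLIENT) | (uint32_t)(opcode) << 23 | (flags))
#define MODEL_MEDIA_INSTR(pipe, op, sub_op, flags) \
	(MODEL_INSTR(INSTR_RC_CLIENT) | \
	 (uint32_t)(pipe) << INSTR_SUBCLIENT_SHIFT | \
	 (uint32_t)(op) << INSTR_26_TO_24_SHIFT | \
	 (uint32_t)(sub_op) << 16 | (flags))

/* The open-coded MFX_WAIT definition that the patch removes. */
#define OLD_MFX_WAIT	((0x3u << 29) | (0x1u << 27) | (0x0u << 16))

static uint32_t model_mfx_wait(void)
{
	return MODEL_MEDIA_INSTR(1, 0, 0, 0);
}

static uint32_t model_crypto_key_exchange(void)
{
	return MODEL_MEDIA_INSTR(2, 6, 9, 0);
}

static uint32_t model_mi_flush_dw(void)
{
	return MODEL_MI_INSTR(0x26, 1);
}

/* The refactor should not change the dword that ends up in the ring. */
static void model_check_mfx_wait(void)
{
	assert(model_mfx_wait() == OLD_MFX_WAIT);
}
```

Under these assumptions the MI client value is 0, so adding `__INSTR(INSTR_MI_CLIENT)` to MI_INSTR() leaves every existing MI command header unchanged.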
[PATCH v9 06/17] drm/i915/pxp: set KCR reg init
The setting is required by hardware to allow us to perform further protection operations, such as sending commands to the GPU or TEE. The register needs to be re-programmed on resume, so for simplicity we bundle the programming with the component binding, which is automatically called on resume.

Further HW set-up operations will be added in the same location in follow-up patches, so get ready for them by using a couple of init/fini_hw wrappers instead of calling the KCR funcs directly.

v3: move programming to component binding function, rework commit msg

Signed-off-by: Huang, Sean Z
Signed-off-by: Daniele Ceraolo Spurio
Reviewed-by: Rodrigo Vivi
---
drivers/gpu/drm/i915/pxp/intel_pxp.c | 27 drivers/gpu/drm/i915/pxp/intel_pxp.h | 3 +++ drivers/gpu/drm/i915/pxp/intel_pxp_tee.c | 5 + 3 files changed, 35 insertions(+) diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp.c b/drivers/gpu/drm/i915/pxp/intel_pxp.c index 400deaea2d8a..66a98feb33ab 100644 --- a/drivers/gpu/drm/i915/pxp/intel_pxp.c +++ b/drivers/gpu/drm/i915/pxp/intel_pxp.c @@ -7,6 +7,24 @@ #include "gt/intel_context.h" #include "i915_drv.h" +/* KCR register definitions */ +#define KCR_INIT _MMIO(0x320f0) + +/* Setting KCR Init bit is required after system boot */ +#define KCR_INIT_ALLOW_DISPLAY_ME_WRITES REG_BIT(14) + +static void kcr_pxp_enable(struct intel_gt *gt) +{ + intel_uncore_write(gt->uncore, KCR_INIT, + _MASKED_BIT_ENABLE(KCR_INIT_ALLOW_DISPLAY_ME_WRITES)); +} + +static void kcr_pxp_disable(struct intel_gt *gt) +{ + intel_uncore_write(gt->uncore, KCR_INIT, + _MASKED_BIT_DISABLE(KCR_INIT_ALLOW_DISPLAY_ME_WRITES)); +} + static int create_vcs_context(struct intel_pxp *pxp) { static struct lock_class_key pxp_lock; @@ -71,5 +89,14 @@ void intel_pxp_fini(struct intel_pxp *pxp) intel_pxp_tee_component_fini(pxp); destroy_vcs_context(pxp); +} + +void intel_pxp_init_hw(struct intel_pxp *pxp) +{ + kcr_pxp_enable(pxp_to_gt(pxp)); +} +void intel_pxp_fini_hw(struct intel_pxp *pxp) +{ + kcr_pxp_disable(pxp_to_gt(pxp)); } diff 
--git a/drivers/gpu/drm/i915/pxp/intel_pxp.h b/drivers/gpu/drm/i915/pxp/intel_pxp.h index e87550fb9821..5427c3b28aa9 100644 --- a/drivers/gpu/drm/i915/pxp/intel_pxp.h +++ b/drivers/gpu/drm/i915/pxp/intel_pxp.h @@ -22,6 +22,9 @@ static inline bool intel_pxp_is_enabled(const struct intel_pxp *pxp) #ifdef CONFIG_DRM_I915_PXP void intel_pxp_init(struct intel_pxp *pxp); void intel_pxp_fini(struct intel_pxp *pxp); + +void intel_pxp_init_hw(struct intel_pxp *pxp); +void intel_pxp_fini_hw(struct intel_pxp *pxp); #else static inline void intel_pxp_init(struct intel_pxp *pxp) { diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp_tee.c b/drivers/gpu/drm/i915/pxp/intel_pxp_tee.c index f1d8de832653..0c0c7946e6a0 100644 --- a/drivers/gpu/drm/i915/pxp/intel_pxp_tee.c +++ b/drivers/gpu/drm/i915/pxp/intel_pxp_tee.c @@ -33,6 +33,9 @@ static int i915_pxp_tee_component_bind(struct device *i915_kdev, pxp->pxp_component = data; pxp->pxp_component->tee_dev = tee_kdev; + /* the component is required to fully start the PXP HW */ + intel_pxp_init_hw(pxp); + return 0; } @@ -41,6 +44,8 @@ static void i915_pxp_tee_component_unbind(struct device *i915_kdev, { struct intel_pxp *pxp = i915_dev_to_pxp(i915_kdev); + intel_pxp_fini_hw(pxp); + pxp->pxp_component = NULL; } -- 2.25.1
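kcr_pxp_enable() and kcr_pxp_disable() in this patch write KCR_INIT through the masked-bit helpers. The sketch below illustrates what such a write looks like on the wire. It assumes the usual i915 convention (write-enable mask in bits 31:16, value in bits 15:0); the `MODEL_*` macros and `model_masked_write()` are hypothetical userspace stand-ins, not the kernel helpers themselves.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_BIT(n)			(UINT32_C(1) << (n))
/* Assumed convention: mask in the upper 16 bits, value in the lower 16. */
#define MODEL_MASKED_BIT_ENABLE(a)	(((a) << 16) | (a))
#define MODEL_MASKED_BIT_DISABLE(a)	((a) << 16)

/* Bit 14, as in KCR_INIT_ALLOW_DISPLAY_ME_WRITES. */
#define MODEL_KCR_ALLOW_DISPLAY_ME_WRITES	MODEL_BIT(14)

/* Toy model of a masked register: only bits selected by the mask change. */
static uint16_t model_masked_write(uint16_t reg, uint32_t val)
{
	uint16_t mask = (uint16_t)(val >> 16);
	uint16_t bits = (uint16_t)(val & 0xffff);

	/* The helpers above never set value bits outside the mask. */
	assert((bits & ~mask) == 0);

	return (uint16_t)((reg & ~mask) | (bits & mask));
}
```

With this layout a single write can set or clear bit 14 without a read-modify-write cycle, which would explain why the patch needs only intel_uncore_write() on both the enable and disable paths.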
[PATCH v9 07/17] drm/i915/pxp: Create the arbitrary session after boot
From: "Huang, Sean Z"

Create the arbitrary session, with the fixed session id 0xf, after system boot, for the case where an application allocates a protected buffer without establishing any protection session, because the hardware requires at least one alive session for protected buffer creation. This arbitrary session will need to be re-created after teardown or a power event, because the hardware encryption key won't be valid after such cases.

The session ID is exposed as part of the uapi so it can be used as part of userspace commands.

v2: use gt->uncore->rpm (Chris)
v3: s/arb_is_in_play/arb_is_valid (Chris), move set-up to the new init_hw function
v4: move interface defs to separate header, set arb_is valid to false on fini (Rodrigo)
v5: handle async component binding

Signed-off-by: Huang, Sean Z
Signed-off-by: Daniele Ceraolo Spurio
Cc: Chris Wilson
Cc: Rodrigo Vivi
Reviewed-by: Rodrigo Vivi
---
drivers/gpu/drm/i915/Makefile | 1 + drivers/gpu/drm/i915/pxp/intel_pxp.c | 7 ++ drivers/gpu/drm/i915/pxp/intel_pxp.h | 5 ++ drivers/gpu/drm/i915/pxp/intel_pxp_session.c | 74 drivers/gpu/drm/i915/pxp/intel_pxp_session.h | 15 drivers/gpu/drm/i915/pxp/intel_pxp_tee.c | 87 +++ drivers/gpu/drm/i915/pxp/intel_pxp_tee.h | 3 + .../drm/i915/pxp/intel_pxp_tee_interface.h| 37 drivers/gpu/drm/i915/pxp/intel_pxp_types.h| 10 +++ include/uapi/drm/i915_drm.h | 3 + 10 files changed, 242 insertions(+) create mode 100644 drivers/gpu/drm/i915/pxp/intel_pxp_session.c create mode 100644 drivers/gpu/drm/i915/pxp/intel_pxp_session.h create mode 100644 drivers/gpu/drm/i915/pxp/intel_pxp_tee_interface.h diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile index d39bd0cefc64..405e04f4dd59 100644 --- a/drivers/gpu/drm/i915/Makefile +++ b/drivers/gpu/drm/i915/Makefile @@ -284,6 +284,7 @@ i915-y += i915_perf.o # Protected execution platform (PXP) support i915-$(CONFIG_DRM_I915_PXP) += \ pxp/intel_pxp.o \ + pxp/intel_pxp_session.o \ pxp/intel_pxp_tee.o # Post-mortem debug and GPU hang 
state capture diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp.c b/drivers/gpu/drm/i915/pxp/intel_pxp.c index 66a98feb33ab..e1370f323126 100644 --- a/drivers/gpu/drm/i915/pxp/intel_pxp.c +++ b/drivers/gpu/drm/i915/pxp/intel_pxp.c @@ -3,6 +3,7 @@ * Copyright(c) 2020 Intel Corporation. */ #include "intel_pxp.h" +#include "intel_pxp_session.h" #include "intel_pxp_tee.h" #include "gt/intel_context.h" #include "i915_drv.h" @@ -65,6 +66,8 @@ void intel_pxp_init(struct intel_pxp *pxp) if (!HAS_PXP(gt->i915)) return; + mutex_init(&pxp->tee_mutex); + ret = create_vcs_context(pxp); if (ret) return; @@ -86,6 +89,8 @@ void intel_pxp_fini(struct intel_pxp *pxp) if (!intel_pxp_is_enabled(pxp)) return; + pxp->arb_is_valid = false; + intel_pxp_tee_component_fini(pxp); destroy_vcs_context(pxp); @@ -94,6 +99,8 @@ void intel_pxp_fini(struct intel_pxp *pxp) void intel_pxp_init_hw(struct intel_pxp *pxp) { kcr_pxp_enable(pxp_to_gt(pxp)); + + intel_pxp_create_arb_session(pxp); } void intel_pxp_fini_hw(struct intel_pxp *pxp) diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp.h b/drivers/gpu/drm/i915/pxp/intel_pxp.h index 5427c3b28aa9..8eeb65af78b1 100644 --- a/drivers/gpu/drm/i915/pxp/intel_pxp.h +++ b/drivers/gpu/drm/i915/pxp/intel_pxp.h @@ -19,6 +19,11 @@ static inline bool intel_pxp_is_enabled(const struct intel_pxp *pxp) return pxp->ce; } +static inline bool intel_pxp_is_active(const struct intel_pxp *pxp) +{ + return pxp->arb_is_valid; +} + #ifdef CONFIG_DRM_I915_PXP void intel_pxp_init(struct intel_pxp *pxp); void intel_pxp_fini(struct intel_pxp *pxp); diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp_session.c b/drivers/gpu/drm/i915/pxp/intel_pxp_session.c new file mode 100644 index ..3331868f354c --- /dev/null +++ b/drivers/gpu/drm/i915/pxp/intel_pxp_session.c @@ -0,0 +1,74 @@ +// SPDX-License-Identifier: MIT +/* + * Copyright(c) 2020, Intel Corporation. All rights reserved. 
+ */ + +#include "drm/i915_drm.h" +#include "i915_drv.h" + +#include "intel_pxp.h" +#include "intel_pxp_session.h" +#include "intel_pxp_tee.h" +#include "intel_pxp_types.h" + +#define ARB_SESSION I915_PROTECTED_CONTENT_DEFAULT_SESSION /* shorter define */ + +#define GEN12_KCR_SIP _MMIO(0x32260) /* KCR hwdrm session in play 0-31 */ + +static bool intel_pxp_session_is_in_play(struct intel_pxp *pxp, u32 id) +{ + struct intel_gt *gt = pxp_to_gt(pxp); + intel_wakeref_t wakeref; + u32 sip = 0; + + with_intel_runtime_pm(gt->uncore->rpm, wakeref) + sip = intel_uncore_read(gt->uncore, GEN12_KCR_SIP); + + return sip & BIT(id); +} + +static int pxp_wait_for_session_state(struct intel_pxp *pxp, u32 id, bool in_
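intel_pxp_session_is_in_play() above tests one bit of the "session in play" register for a given session id. The sketch below reproduces that bit test in userspace and adds a guess at the general shape of a wait helper. The body of pxp_wait_for_session_state() is cut off in this excerpt, so the polling loop, the `read_sip` callback, the `max_polls` budget and the fake register are all hypothetical and only meant to illustrate the idea.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_ARB_SESSION 0xf	/* fixed arb session id from the commit message */

/* Same test as intel_pxp_session_is_in_play(), minus the MMIO read. */
static bool model_session_is_in_play(uint32_t sip, uint32_t id)
{
	assert(id < 32);	/* the register tracks sessions 0-31 */
	return sip & (UINT32_C(1) << id);
}

/* Fake hardware for the demo: the arb session shows up after a few reads. */
struct model_fake_hw {
	int reads;		/* number of reads so far */
	int ready_after;	/* session is in play from this read on */
};

static uint32_t model_fake_read_sip(void *data)
{
	struct model_fake_hw *hw = data;

	return ++hw->reads >= hw->ready_after ?
		UINT32_C(1) << MODEL_ARB_SESSION : 0;
}

/*
 * Hypothetical wait helper: poll until the session bit matches in_play,
 * giving up with -ETIMEDOUT after max_polls reads.
 */
static int model_wait_for_session_state(uint32_t (*read_sip)(void *),
					void *data, uint32_t id,
					bool in_play, int max_polls)
{
	while (max_polls-- > 0) {
		if (model_session_is_in_play(read_sip(data), id) == in_play)
			return 0;
	}

	return -ETIMEDOUT;
}
```

The same helper covers both directions: waiting for the bit to appear after session creation and waiting for it to clear after a teardown, depending on the `in_play` argument.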
[PATCH v9 05/17] drm/i915/pxp: Implement funcs to create the TEE channel
From: "Huang, Sean Z" Implement the funcs to create the TEE channel, so kernel can send the TEE commands directly to TEE for creating the arbitrary (default) session. v2: fix locking, don't pollute dev_priv (Chris) v3: wait for mei PXP component to be bound. v4: drop the wait, as the component might be bound after i915 load completes. We'll instead check when sending a tee message. v5: fix an issue with mei_pxp module removal v6: don't use fetch_and_zero in fini (Rodrigo) Signed-off-by: Huang, Sean Z Signed-off-by: Daniele Ceraolo Spurio Cc: Chris Wilson --- drivers/gpu/drm/i915/Makefile | 3 +- drivers/gpu/drm/i915/pxp/intel_pxp.c | 13 drivers/gpu/drm/i915/pxp/intel_pxp_tee.c | 79 ++ drivers/gpu/drm/i915/pxp/intel_pxp_tee.h | 14 drivers/gpu/drm/i915/pxp/intel_pxp_types.h | 6 ++ 5 files changed, 114 insertions(+), 1 deletion(-) create mode 100644 drivers/gpu/drm/i915/pxp/intel_pxp_tee.c create mode 100644 drivers/gpu/drm/i915/pxp/intel_pxp_tee.h diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile index 23f5bc268962..d39bd0cefc64 100644 --- a/drivers/gpu/drm/i915/Makefile +++ b/drivers/gpu/drm/i915/Makefile @@ -283,7 +283,8 @@ i915-y += i915_perf.o # Protected execution platform (PXP) support i915-$(CONFIG_DRM_I915_PXP) += \ - pxp/intel_pxp.o + pxp/intel_pxp.o \ + pxp/intel_pxp_tee.o # Post-mortem debug and GPU hang state capture i915-$(CONFIG_DRM_I915_CAPTURE_ERROR) += i915_gpu_error.o diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp.c b/drivers/gpu/drm/i915/pxp/intel_pxp.c index 7b2053902146..400deaea2d8a 100644 --- a/drivers/gpu/drm/i915/pxp/intel_pxp.c +++ b/drivers/gpu/drm/i915/pxp/intel_pxp.c @@ -3,6 +3,7 @@ * Copyright(c) 2020 Intel Corporation. 
*/ #include "intel_pxp.h" +#include "intel_pxp_tee.h" #include "gt/intel_context.h" #include "i915_drv.h" @@ -50,7 +51,16 @@ void intel_pxp_init(struct intel_pxp *pxp) if (ret) return; + ret = intel_pxp_tee_component_init(pxp); + if (ret) + goto out_context; + drm_info(&gt->i915->drm, "Protected Xe Path (PXP) protected content support initialized\n"); + + return; + +out_context: + destroy_vcs_context(pxp); } void intel_pxp_fini(struct intel_pxp *pxp) @@ -58,5 +68,8 @@ void intel_pxp_fini(struct intel_pxp *pxp) if (!intel_pxp_is_enabled(pxp)) return; + intel_pxp_tee_component_fini(pxp); + destroy_vcs_context(pxp); + } diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp_tee.c b/drivers/gpu/drm/i915/pxp/intel_pxp_tee.c new file mode 100644 index ..f1d8de832653 --- /dev/null +++ b/drivers/gpu/drm/i915/pxp/intel_pxp_tee.c @@ -0,0 +1,79 @@ +// SPDX-License-Identifier: MIT +/* + * Copyright(c) 2020 Intel Corporation. + */ + +#include +#include "drm/i915_pxp_tee_interface.h" +#include "drm/i915_component.h" +#include "i915_drv.h" +#include "intel_pxp.h" +#include "intel_pxp_tee.h" + +static inline struct intel_pxp *i915_dev_to_pxp(struct device *i915_kdev) +{ + return &kdev_to_i915(i915_kdev)->gt.pxp; +} + +/** + * i915_pxp_tee_component_bind - bind function to pass the function pointers to pxp_tee + * @i915_kdev: pointer to i915 kernel device + * @tee_kdev: pointer to tee kernel device + * @data: pointer to pxp_tee_master containing the function pointers + * + * This bind function is called during the system boot or resume from system sleep. + * + * Return: return 0 if successful. 
+ */ +static int i915_pxp_tee_component_bind(struct device *i915_kdev, + struct device *tee_kdev, void *data) +{ + struct intel_pxp *pxp = i915_dev_to_pxp(i915_kdev); + + pxp->pxp_component = data; + pxp->pxp_component->tee_dev = tee_kdev; + + return 0; +} + +static void i915_pxp_tee_component_unbind(struct device *i915_kdev, + struct device *tee_kdev, void *data) +{ + struct intel_pxp *pxp = i915_dev_to_pxp(i915_kdev); + + pxp->pxp_component = NULL; +} + +static const struct component_ops i915_pxp_tee_component_ops = { + .bind = i915_pxp_tee_component_bind, + .unbind = i915_pxp_tee_component_unbind, +}; + +int intel_pxp_tee_component_init(struct intel_pxp *pxp) +{ + int ret; + struct intel_gt *gt = pxp_to_gt(pxp); + struct drm_i915_private *i915 = gt->i915; + + ret = component_add_typed(i915->drm.dev, &i915_pxp_tee_component_ops, + I915_COMPONENT_PXP); + if (ret < 0) { + drm_err(&i915->drm, "Failed to add PXP component (%d)\n", ret); + return ret; + } + + pxp->pxp_component_added = true; + + return 0; +} + +void intel_pxp_tee_component_fini(struct intel_pxp *pxp) +{ + struct drm_i915_private *i915 = pxp_to_gt(pxp)->i915; + + if (!pxp->pxp_component_added) + return; + +
[PATCH v9 04/17] drm/i915/pxp: allocate a vcs context for pxp usage
The context is required to send the session termination commands to the VCS, which will be implemented in a follow-up patch. We can also use the presence of the context as a check of pxp initialization completion. v2: use perma-pinned context (Chris) v3: rename pinned_context functions (Chris) v4: split export of pinned_context functions to a separate patch (Rodrigo) Signed-off-by: Daniele Ceraolo Spurio Cc: Chris Wilson Reviewed-by: Rodrigo Vivi --- drivers/gpu/drm/i915/Makefile | 4 ++ drivers/gpu/drm/i915/gt/intel_engine.h | 2 + drivers/gpu/drm/i915/gt/intel_gt.c | 5 ++ drivers/gpu/drm/i915/gt/intel_gt_types.h | 3 ++ drivers/gpu/drm/i915/pxp/intel_pxp.c | 62 ++ drivers/gpu/drm/i915/pxp/intel_pxp.h | 35 drivers/gpu/drm/i915/pxp/intel_pxp_types.h | 15 ++ 7 files changed, 126 insertions(+) create mode 100644 drivers/gpu/drm/i915/pxp/intel_pxp.c create mode 100644 drivers/gpu/drm/i915/pxp/intel_pxp.h create mode 100644 drivers/gpu/drm/i915/pxp/intel_pxp_types.h diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile index c36c8a4f0716..23f5bc268962 100644 --- a/drivers/gpu/drm/i915/Makefile +++ b/drivers/gpu/drm/i915/Makefile @@ -281,6 +281,10 @@ i915-y += \ i915-y += i915_perf.o +# Protected execution platform (PXP) support +i915-$(CONFIG_DRM_I915_PXP) += \ + pxp/intel_pxp.o + # Post-mortem debug and GPU hang state capture i915-$(CONFIG_DRM_I915_CAPTURE_ERROR) += i915_gpu_error.o i915-$(CONFIG_DRM_I915_SELFTEST) += \ diff --git a/drivers/gpu/drm/i915/gt/intel_engine.h b/drivers/gpu/drm/i915/gt/intel_engine.h index 87579affb952..eed4634c08cd 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine.h +++ b/drivers/gpu/drm/i915/gt/intel_engine.h @@ -175,6 +175,8 @@ intel_write_status_page(struct intel_engine_cs *engine, int reg, u32 value) #define I915_GEM_HWS_SEQNO 0x40 #define I915_GEM_HWS_SEQNO_ADDR(I915_GEM_HWS_SEQNO * sizeof(u32)) #define I915_GEM_HWS_MIGRATE (0x42 * sizeof(u32)) +#define I915_GEM_HWS_PXP 0x60 +#define I915_GEM_HWS_PXP_ADDR 
(I915_GEM_HWS_PXP * sizeof(u32)) #define I915_GEM_HWS_SCRATCH 0x80 #define I915_HWS_CSB_BUF0_INDEX0x10 diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c index 2aeaae036a6f..da30919b7e99 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt.c +++ b/drivers/gpu/drm/i915/gt/intel_gt.c @@ -21,6 +21,7 @@ #include "intel_uncore.h" #include "intel_pm.h" #include "shmem_utils.h" +#include "pxp/intel_pxp.h" void intel_gt_init_early(struct intel_gt *gt, struct drm_i915_private *i915) { @@ -712,6 +713,8 @@ int intel_gt_init(struct intel_gt *gt) intel_migrate_init(&gt->migrate, gt); + intel_pxp_init(&gt->pxp); + goto out_fw; err_gt: __intel_gt_disable(gt); @@ -747,6 +750,8 @@ void intel_gt_driver_unregister(struct intel_gt *gt) intel_rps_driver_unregister(&gt->rps); + intel_pxp_fini(&gt->pxp); + /* * Upon unregistering the device to prevent any new users, cancel * all in-flight requests so that we can quickly unbind the active diff --git a/drivers/gpu/drm/i915/gt/intel_gt_types.h b/drivers/gpu/drm/i915/gt/intel_gt_types.h index 6fdcde64c180..8001a61f42e5 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt_types.h +++ b/drivers/gpu/drm/i915/gt/intel_gt_types.h @@ -26,6 +26,7 @@ #include "intel_rps_types.h" #include "intel_migrate_types.h" #include "intel_wakeref.h" +#include "pxp/intel_pxp_types.h" struct drm_i915_private; struct i915_ggtt; @@ -196,6 +197,8 @@ struct intel_gt { struct { u8 uc_index; } mocs; + + struct intel_pxp pxp; }; enum intel_gt_scratch_field { diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp.c b/drivers/gpu/drm/i915/pxp/intel_pxp.c new file mode 100644 index ..7b2053902146 --- /dev/null +++ b/drivers/gpu/drm/i915/pxp/intel_pxp.c @@ -0,0 +1,62 @@ +// SPDX-License-Identifier: MIT +/* + * Copyright(c) 2020 Intel Corporation. 
+ */ +#include "intel_pxp.h" +#include "gt/intel_context.h" +#include "i915_drv.h" + +static int create_vcs_context(struct intel_pxp *pxp) +{ + static struct lock_class_key pxp_lock; + struct intel_gt *gt = pxp_to_gt(pxp); + struct intel_engine_cs *engine; + struct intel_context *ce; + + /* +* Find the first VCS engine present. We're guaranteed there is one +* if we're in this function due to the check in has_pxp +*/ + for (engine = gt->engine_class[VIDEO_DECODE_CLASS][0]; !engine; engine++); + GEM_BUG_ON(!engine || engine->class != VIDEO_DECODE_CLASS); + + ce = intel_engine_create_pinned_context(engine, engine->gt->vm, SZ_4K, + I915_GEM_HWS_PXP_ADDR, + &pxp_lock, "pxp_context"); + if (IS_ERR(ce)) {
[PATCH v9 03/17] drm/i915/pxp: define PXP device flag and kconfig
Ahead of the PXP implementation, define the relevant define flag and kconfig option. v2: flip kconfig default to N. Some machines have IFWIs that do not support PXP, so we need it to be an opt-in until we add support to query the caps from the mei device. Signed-off-by: Daniele Ceraolo Spurio Reviewed-by: Rodrigo Vivi --- drivers/gpu/drm/i915/Kconfig | 11 +++ drivers/gpu/drm/i915/i915_drv.h | 3 +++ drivers/gpu/drm/i915/intel_device_info.h | 1 + 3 files changed, 15 insertions(+) diff --git a/drivers/gpu/drm/i915/Kconfig b/drivers/gpu/drm/i915/Kconfig index f960f5d7664e..5987c3d5d9fb 100644 --- a/drivers/gpu/drm/i915/Kconfig +++ b/drivers/gpu/drm/i915/Kconfig @@ -131,6 +131,17 @@ config DRM_I915_GVT_KVMGT Choose this option if you want to enable KVMGT support for Intel GVT-g. +config DRM_I915_PXP + bool "Enable Intel PXP support for Intel Gen12+ platform" + depends on DRM_I915 + depends on INTEL_MEI && INTEL_MEI_PXP + default n + help + PXP (Protected Xe Path) is an i915 component, available on GEN12+ + GPUs, that helps to establish the hardware protected session and + manage the status of the alive software session, as well as its life + cycle. 
+ menu "drm/i915 Debugging" depends on DRM_I915 depends on EXPERT diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 37c1ca266bcd..447a248f14aa 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1678,6 +1678,9 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915, #define HAS_GLOBAL_MOCS_REGISTERS(dev_priv) (INTEL_INFO(dev_priv)->has_global_mocs) +#define HAS_PXP(dev_priv) (IS_ENABLED(CONFIG_DRM_I915_PXP) && \ + INTEL_INFO(dev_priv)->has_pxp) && \ + VDBOX_MASK(&dev_priv->gt) #define HAS_GMCH(dev_priv) (INTEL_INFO(dev_priv)->display.has_gmch) diff --git a/drivers/gpu/drm/i915/intel_device_info.h b/drivers/gpu/drm/i915/intel_device_info.h index d328bb95c49b..8e6f48d1eb7b 100644 --- a/drivers/gpu/drm/i915/intel_device_info.h +++ b/drivers/gpu/drm/i915/intel_device_info.h @@ -133,6 +133,7 @@ enum intel_ppgtt_type { func(has_logical_ring_elsq); \ func(has_mslices); \ func(has_pooled_eu); \ + func(has_pxp); \ func(has_rc6); \ func(has_rc6p); \ func(has_rps); \ -- 2.25.1
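One detail that may be worth double-checking in the HAS_PXP() definition in this patch: as quoted, the expansion is not wrapped in an outer pair of parentheses, while later patches in the series call it as `if (!HAS_PXP(gt->i915))`. The sketch below uses hypothetical stand-in macros (`LOOSE_HAS_PXP`, `TIGHT_HAS_PXP`) and plain integers instead of the device info to show how the negation binds in each form. It is an illustration of C operator precedence, not a statement about how the final upstream macro looks.

```c
#include <assert.h>
#include <stdbool.h>

/* Same shape as the quoted macro: no outer parentheses. */
#define LOOSE_HAS_PXP(cfg, flag, vdbox_mask) \
	((cfg) && (flag)) && (vdbox_mask)

/* Conventional form: the whole expansion is one parenthesised expression. */
#define TIGHT_HAS_PXP(cfg, flag, vdbox_mask) \
	(((cfg) && (flag)) && (vdbox_mask))

/* "Should init bail out?", written the way a caller would: !HAS_PXP(...) */
static bool model_bails_out_loose(int cfg, int flag, unsigned int vdbox_mask)
{
	/* Expands to !((cfg) && (flag)) && (vdbox_mask). */
	return !LOOSE_HAS_PXP(cfg, flag, vdbox_mask);
}

static bool model_bails_out_tight(int cfg, int flag, unsigned int vdbox_mask)
{
	/* Expands to !(((cfg) && (flag)) && (vdbox_mask)). */
	return !TIGHT_HAS_PXP(cfg, flag, vdbox_mask);
}

/* has_pxp set but no VCS engine present: the two forms disagree. */
static void model_demo_negation(void)
{
	assert(model_bails_out_tight(1, 1, 0));
	assert(!model_bails_out_loose(1, 1, 0));
}
```

The two forms agree whenever the VDBOX mask is non-zero and differ only when it is zero, so the difference would be easy to miss in testing on hardware that always has a video engine.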
[PATCH v9 02/17] mei: pxp: export pavp client to me client bus
From: Vitaly Lubart Export PAVP client to work with i915 driver, for binding it uses kernel component framework. v2:drop debug prints, refactor match code to match mei_hdcp (Tomas) Signed-off-by: Vitaly Lubart Signed-off-by: Tomas Winkler Signed-off-by: Daniele Ceraolo Spurio Reviewed-by: Rodrigo Vivi --- drivers/misc/mei/Kconfig | 2 + drivers/misc/mei/Makefile | 1 + drivers/misc/mei/pxp/Kconfig | 13 ++ drivers/misc/mei/pxp/Makefile | 7 + drivers/misc/mei/pxp/mei_pxp.c | 229 + drivers/misc/mei/pxp/mei_pxp.h | 18 +++ 6 files changed, 270 insertions(+) create mode 100644 drivers/misc/mei/pxp/Kconfig create mode 100644 drivers/misc/mei/pxp/Makefile create mode 100644 drivers/misc/mei/pxp/mei_pxp.c create mode 100644 drivers/misc/mei/pxp/mei_pxp.h diff --git a/drivers/misc/mei/Kconfig b/drivers/misc/mei/Kconfig index f5fd5b786607..0e0bcd0da852 100644 --- a/drivers/misc/mei/Kconfig +++ b/drivers/misc/mei/Kconfig @@ -47,3 +47,5 @@ config INTEL_MEI_TXE Intel Bay Trail source "drivers/misc/mei/hdcp/Kconfig" +source "drivers/misc/mei/pxp/Kconfig" + diff --git a/drivers/misc/mei/Makefile b/drivers/misc/mei/Makefile index f1c76f7ee804..d8e5165917f2 100644 --- a/drivers/misc/mei/Makefile +++ b/drivers/misc/mei/Makefile @@ -26,3 +26,4 @@ mei-$(CONFIG_EVENT_TRACING) += mei-trace.o CFLAGS_mei-trace.o = -I$(src) obj-$(CONFIG_INTEL_MEI_HDCP) += hdcp/ +obj-$(CONFIG_INTEL_MEI_PXP) += pxp/ diff --git a/drivers/misc/mei/pxp/Kconfig b/drivers/misc/mei/pxp/Kconfig new file mode 100644 index ..4029b96afc04 --- /dev/null +++ b/drivers/misc/mei/pxp/Kconfig @@ -0,0 +1,13 @@ + +# SPDX-License-Identifier: GPL-2.0 +# Copyright (c) 2020, Intel Corporation. All rights reserved. +# +config INTEL_MEI_PXP + tristate "Intel PXP services of ME Interface" + select INTEL_MEI_ME + depends on DRM_I915 + help + MEI Support for PXP Services on Intel platforms. + + Enables the ME FW services required for PXP support through + I915 display driver of Intel. 
diff --git a/drivers/misc/mei/pxp/Makefile b/drivers/misc/mei/pxp/Makefile new file mode 100644 index ..0329950d5794 --- /dev/null +++ b/drivers/misc/mei/pxp/Makefile @@ -0,0 +1,7 @@ +# SPDX-License-Identifier: GPL-2.0 +# +# Copyright (c) 2020, Intel Corporation. All rights reserved. +# +# Makefile - PXP client driver for Intel MEI Bus Driver. + +obj-$(CONFIG_INTEL_MEI_PXP) += mei_pxp.o diff --git a/drivers/misc/mei/pxp/mei_pxp.c b/drivers/misc/mei/pxp/mei_pxp.c new file mode 100644 index ..f7380d387bab --- /dev/null +++ b/drivers/misc/mei/pxp/mei_pxp.c @@ -0,0 +1,229 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright © 2020 - 2021 Intel Corporation + */ + +/** + * DOC: MEI_PXP Client Driver + * + * The mei_pxp driver acts as a translation layer between PXP + * protocol implementer (I915) and ME FW by translating PXP + * negotiation messages to ME FW command payloads and vice versa. + */ + +#include +#include +#include +#include +#include +#include +#include +#include + +#include "mei_pxp.h" + +/** + * mei_pxp_send_message() - Sends a PXP message to ME FW. + * @dev: device corresponding to the mei_cl_device + * @message: a message buffer to send + * @size: size of the message + * Return: 0 on Success, <0 on Failure + */ +static int +mei_pxp_send_message(struct device *dev, const void *message, size_t size) +{ + struct mei_cl_device *cldev; + ssize_t byte; + + if (!dev || !message) + return -EINVAL; + + cldev = to_mei_cl_device(dev); + + /* temporary drop const qualifier till the API is fixed */ + byte = mei_cldev_send(cldev, (u8 *)message, size); + if (byte < 0) { + dev_dbg(dev, "mei_cldev_send failed. %zd\n", byte); + return byte; + } + + return 0; +} + +/** + * mei_pxp_receive_message() - Receives a PXP message from ME FW. 
+ * @dev: device corresponding to the mei_cl_device + * @buffer: a message buffer to contain the received message + * @size: size of the buffer + * Return: bytes sent on Success, <0 on Failure + */ +static int +mei_pxp_receive_message(struct device *dev, void *buffer, size_t size) +{ + struct mei_cl_device *cldev; + ssize_t byte; + + if (!dev || !buffer) + return -EINVAL; + + cldev = to_mei_cl_device(dev); + + byte = mei_cldev_recv(cldev, buffer, size); + if (byte < 0) { + dev_dbg(dev, "mei_cldev_recv failed. %zd\n", byte); + return byte; + } + + return byte; +} + +static const struct i915_pxp_component_ops mei_pxp_ops = { + .owner = THIS_MODULE, + .send = mei_pxp_send_message, + .recv = mei_pxp_receive_message, +}; + +static int mei_component_master_bind(struct device *dev) +{ + struct mei_cl_device *cldev = to_mei_cl_device(dev); + struct i915_pxp_component *comp_master = mei_
[PATCH v9 01/17] drm/i915/pxp: Define PXP component interface
This will be used for communication between the i915 driver and the mei one. Defining it in a stand-alone patch to avoid circular dependencies between the patches modifying the 2 drivers.

Split out from an original patch from Huang, Sean Z

v2: rename the component struct (Rodrigo)

Signed-off-by: Daniele Ceraolo Spurio
Cc: Rodrigo Vivi
Reviewed-by: Rodrigo Vivi
---
include/drm/i915_component.h | 1 + include/drm/i915_pxp_tee_interface.h | 42 2 files changed, 43 insertions(+) create mode 100644 include/drm/i915_pxp_tee_interface.h diff --git a/include/drm/i915_component.h b/include/drm/i915_component.h index 55c3b123581b..c1e2a43d2d1e 100644 --- a/include/drm/i915_component.h +++ b/include/drm/i915_component.h @@ -29,6 +29,7 @@ enum i915_component_type { I915_COMPONENT_AUDIO = 1, I915_COMPONENT_HDCP, + I915_COMPONENT_PXP }; /* MAX_PORT is the number of port diff --git a/include/drm/i915_pxp_tee_interface.h b/include/drm/i915_pxp_tee_interface.h new file mode 100644 index ..af593ec64469 --- /dev/null +++ b/include/drm/i915_pxp_tee_interface.h @@ -0,0 +1,42 @@ +/* SPDX-License-Identifier: MIT */ +/* + * Copyright © 2020 Intel Corporation + */ + +#ifndef _I915_PXP_TEE_INTERFACE_H_ +#define _I915_PXP_TEE_INTERFACE_H_ + +#include +#include + +/** + * struct i915_pxp_component_ops - ops for PXP services. + * @owner: Module providing the ops + * @send: sends data to PXP + * @receive: receives data from PXP + */ +struct i915_pxp_component_ops { + /** +* @owner: owner of the module providing the ops +*/ + struct module *owner; + + int (*send)(struct device *dev, const void *message, size_t size); + int (*recv)(struct device *dev, void *buffer, size_t size); +}; + +/** + * struct i915_pxp_component - Used for communication between i915 and TEE + * drivers for the PXP services + * @tee_dev: device that provide the PXP service from TEE Bus. + * @pxp_ops: Ops implemented by TEE driver, used by i915 driver. 
+ */ +struct i915_pxp_component { + struct device *tee_dev; + const struct i915_pxp_component_ops *ops; + + /* To protect the above members. */ + struct mutex mutex; +}; + +#endif /* _I915_TEE_PXP_INTERFACE_H_ */ -- 2.25.1
[PATCH v9 00/17] drm/i915: Introduce Intel PXP
PXP (Protected Xe Path) is an i915 component, available on GEN12i and newer platforms, that helps to establish the hardware protected session and manage the status of the alive software session, as well as its life cycle. changes from v8: - comments/docs improvements - remove rpm put race (pxp_inval vs context_close) - don't call pxp_invalidate on rpm suspend because it's redundant Tested with: https://patchwork.freedesktop.org/series/87570/ Cc: Gaurav Kumar Cc: Chris Wilson Cc: Rodrigo Vivi Cc: Joonas Lahtinen Cc: Juston Li Cc: Alan Previn Cc: Lionel Landwerlin Cc: Jason Ekstrand Cc: Daniel Vetter Anshuman Gupta (2): drm/i915/pxp: Add plane decryption support drm/i915/pxp: black pixels on pxp disabled Daniele Ceraolo Spurio (9): drm/i915/pxp: Define PXP component interface drm/i915/pxp: define PXP device flag and kconfig drm/i915/pxp: allocate a vcs context for pxp usage drm/i915/pxp: set KCR reg init drm/i915/pxp: interfaces for using protected objects drm/i915/pxp: start the arb session on demand drm/i915/pxp: add pxp debugfs drm/i915/pxp: add PXP documentation drm/i915/pxp: enable PXP for integrated Gen12 Huang, Sean Z (5): drm/i915/pxp: Implement funcs to create the TEE channel drm/i915/pxp: Create the arbitrary session after boot drm/i915/pxp: Implement arb session teardown drm/i915/pxp: Implement PXP irq handler drm/i915/pxp: Enable PXP power management Vitaly Lubart (1): mei: pxp: export pavp client to me client bus Documentation/gpu/i915.rst| 8 + drivers/gpu/drm/i915/Kconfig | 11 + drivers/gpu/drm/i915/Makefile | 10 + drivers/gpu/drm/i915/display/intel_display.c | 34 +++ .../drm/i915/display/intel_display_types.h| 6 + .../drm/i915/display/skl_universal_plane.c| 49 ++- drivers/gpu/drm/i915/gem/i915_gem_context.c | 100 +- drivers/gpu/drm/i915/gem/i915_gem_context.h | 6 + .../gpu/drm/i915/gem/i915_gem_context_types.h | 28 ++ drivers/gpu/drm/i915/gem/i915_gem_create.c| 72 +++-- .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 18 ++ 
drivers/gpu/drm/i915/gem/i915_gem_object.c| 1 + drivers/gpu/drm/i915/gem/i915_gem_object.h| 6 + .../gpu/drm/i915/gem/i915_gem_object_types.h | 8 + .../gpu/drm/i915/gem/selftests/mock_context.c | 4 +- drivers/gpu/drm/i915/gt/debugfs_gt.c | 2 + drivers/gpu/drm/i915/gt/intel_engine.h| 2 + drivers/gpu/drm/i915/gt/intel_gpu_commands.h | 22 +- drivers/gpu/drm/i915/gt/intel_gt.c| 5 + drivers/gpu/drm/i915/gt/intel_gt_irq.c| 7 + drivers/gpu/drm/i915/gt/intel_gt_pm.c | 15 +- drivers/gpu/drm/i915/gt/intel_gt_types.h | 3 + drivers/gpu/drm/i915/i915_drv.c | 2 + drivers/gpu/drm/i915/i915_drv.h | 3 + drivers/gpu/drm/i915/i915_pci.c | 2 + drivers/gpu/drm/i915/i915_reg.h | 48 +++ drivers/gpu/drm/i915/intel_device_info.h | 1 + drivers/gpu/drm/i915/pxp/intel_pxp.c | 288 ++ drivers/gpu/drm/i915/pxp/intel_pxp.h | 67 drivers/gpu/drm/i915/pxp/intel_pxp_cmd.c | 141 + drivers/gpu/drm/i915/pxp/intel_pxp_cmd.h | 15 + drivers/gpu/drm/i915/pxp/intel_pxp_debugfs.c | 78 + drivers/gpu/drm/i915/pxp/intel_pxp_debugfs.h | 21 ++ drivers/gpu/drm/i915/pxp/intel_pxp_irq.c | 100 ++ drivers/gpu/drm/i915/pxp/intel_pxp_irq.h | 32 ++ drivers/gpu/drm/i915/pxp/intel_pxp_pm.c | 46 +++ drivers/gpu/drm/i915/pxp/intel_pxp_pm.h | 23 ++ drivers/gpu/drm/i915/pxp/intel_pxp_session.c | 175 +++ drivers/gpu/drm/i915/pxp/intel_pxp_session.h | 15 + drivers/gpu/drm/i915/pxp/intel_pxp_tee.c | 172 +++ drivers/gpu/drm/i915/pxp/intel_pxp_tee.h | 17 ++ .../drm/i915/pxp/intel_pxp_tee_interface.h| 37 +++ drivers/gpu/drm/i915/pxp/intel_pxp_types.h| 83 + drivers/misc/mei/Kconfig | 2 + drivers/misc/mei/Makefile | 1 + drivers/misc/mei/pxp/Kconfig | 13 + drivers/misc/mei/pxp/Makefile | 7 + drivers/misc/mei/pxp/mei_pxp.c| 229 ++ drivers/misc/mei/pxp/mei_pxp.h| 18 ++ include/drm/i915_component.h | 1 + include/drm/i915_pxp_tee_interface.h | 42 +++ include/uapi/drm/i915_drm.h | 99 +- 52 files changed, 2153 insertions(+), 42 deletions(-) create mode 100644 drivers/gpu/drm/i915/pxp/intel_pxp.c create mode 100644 
drivers/gpu/drm/i915/pxp/intel_pxp.h create mode 100644 drivers/gpu/drm/i915/pxp/intel_pxp_cmd.c create mode 100644 drivers/gpu/drm/i915/pxp/intel_pxp_cmd.h create mode 100644 drivers/gpu/drm/i915/pxp/intel_pxp_debugfs.c create mode 100644 drivers/gpu/drm/i915/pxp/intel_pxp_debugfs.h create mode 100644 drivers/gpu/drm/i915/pxp/intel_pxp_irq.c create mode 100644 drivers/gpu/drm/i915/pxp/int
Re: [RFC PATCH] drm/ttm: Add a private member to the struct ttm_resource
On Fri, 2021-09-10 at 16:40 +0200, Christian König wrote:
> On 10.09.21 at 15:15, Thomas Hellström wrote:
> > Both the provider (resource manager) and the consumer (the TTM driver) want to subclass struct ttm_resource. Since this is left for the resource manager, we need to provide a private pointer for the TTM driver.
> >
> > Provide a struct ttm_resource_private for the driver to subclass for data with the same lifetime as the struct ttm_resource: In the i915 case it will, for example, be an sg-table and radix tree into the LMEM /VRAM pages that currently are awkwardly attached to the GEM object.
> >
> > Provide an ops structure for associated ops (Which is only destroy() ATM)
> > It might seem pointless to provide a separate ops structure, but Linus has previously made it clear that that's the norm.
> >
> > After careful audit one could perhaps also on a per-driver basis replace the delete_mem_notify() TTM driver callback with the above destroy function.
>
> Well this is a really big NAK to this approach.
>
> If you need to attach some additional information to the resource then implement your own resource manager like everybody else does.

Well this was the long discussion we had back then when the resource managers started to derive from struct resource and I was under the impression that we had come to an agreement about the different use-cases here, and this was my main concern.

I mean, it's a pretty big layer violation to do that for this use-case. The TTM resource manager doesn't want to know about this data at all, it's private to the ttm resource user layer and the resource manager works perfectly well without it. (I assume the other drivers that implement their own resource managers need the data that the subclassing provides?)

The fundamental problem here is that there are two layers wanting to subclass struct ttm_resource.
That means one layer gets to do that, the second gets to use a private pointer, (which in turn can provide yet another private pointer to a potential third layer). With your suggestion, the second layer instead is forced to subclass each subclassed instance it uses from the first layer provides? Ofc we can do that, but it does indeed feel pretty awkward. In any case, if you still think that's the approach we should go for, I'd need to add init() and fini() members to the ttm_range_manager_func struct to allow subclassing without having to unnecessarily copy the full code? Thanks, Thomas > > Regards, > Christian. > > > > > Cc: Matthew Auld > > Cc: König Christian > > Signed-off-by: Thomas Hellström > > --- > > drivers/gpu/drm/ttm/ttm_resource.c | 10 +++--- > > include/drm/ttm/ttm_resource.h | 28 > > > > 2 files changed, 35 insertions(+), 3 deletions(-) > > > > diff --git a/drivers/gpu/drm/ttm/ttm_resource.c > > b/drivers/gpu/drm/ttm/ttm_resource.c > > index 2431717376e7..973e7c50bfed 100644 > > --- a/drivers/gpu/drm/ttm/ttm_resource.c > > +++ b/drivers/gpu/drm/ttm/ttm_resource.c > > @@ -57,13 +57,17 @@ int ttm_resource_alloc(struct ttm_buffer_object > > *bo, > > void ttm_resource_free(struct ttm_buffer_object *bo, struct > > ttm_resource **res) > > { > > struct ttm_resource_manager *man; > > + struct ttm_resource *resource = *res; > > > > - if (!*res) > > + if (!resource) > > return; > > > > - man = ttm_manager_type(bo->bdev, (*res)->mem_type); > > - man->func->free(man, *res); > > *res = NULL; > > + if (resource->priv) > > + resource->priv->ops.destroy(resource->priv); > > + > > + man = ttm_manager_type(bo->bdev, resource->mem_type); > > + man->func->free(man, resource); > > } > > EXPORT_SYMBOL(ttm_resource_free); > > > > diff --git a/include/drm/ttm/ttm_resource.h > > b/include/drm/ttm/ttm_resource.h > > index 140b6b9a8bbe..5a22c9a29c05 100644 > > --- a/include/drm/ttm/ttm_resource.h > > +++ b/include/drm/ttm/ttm_resource.h > > @@ -44,6 +44,7 @@ struct 
dma_buf_map; > > struct io_mapping; > > struct sg_table; > > struct scatterlist; > > +struct ttm_resource_private; > > > > struct ttm_resource_manager_func { > > /** > > @@ -153,6 +154,32 @@ struct ttm_bus_placement { > > enum ttm_cachingcaching; > > }; > > > > +/** > > + * struct ttm_resource_private_ops - Operations for a struct > > + * ttm_resource_private > > + * > > + * Not much benefit to keep this as a separate struct with only a > > single member, > > + * but keeping a separate ops struct is the norm. > > + */ > > +struct ttm_resource_private_ops { > > + /** > > + * destroy() - Callback to destroy the private data > > + * @priv - The private data to destroy > > + */ > > + void (*destroy) (struct ttm_resource_private *priv); > > +}; > > + > > +/** > > + * struct ttm_resource_private - TTM drive
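The two-layer arrangement argued about in this thread can be sketched in user-space C. This is only an illustration under stated assumptions: every `sketch_*` name is hypothetical and none of it is TTM code. The manager layer subclasses the base resource by embedding it, the driver layer hangs its lifetime-bound data off a private pointer, and the free path destroys the private data before the manager releases the resource, in the same order as the quoted `ttm_resource_free()` hunk.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct sketch_resource_private;

/* Ops for driver-private data; only destroy(), as in the RFC description. */
struct sketch_private_ops {
	void (*destroy)(struct sketch_resource_private *priv);
};

/* Base type the driver layer subclasses (hypothetical). */
struct sketch_resource_private {
	const struct sketch_private_ops *ops;
};

/* Base object, standing in for a struct ttm_resource. */
struct sketch_resource {
	int mem_type;
	struct sketch_resource_private *priv;
};

/* Layer 1: the manager subclasses by embedding the base object. */
struct sketch_range_node {
	struct sketch_resource base;
	unsigned long start;
};

/* Layer 2: the driver subclasses the private part instead. */
struct sketch_driver_priv {
	struct sketch_resource_private base;
	int *destroy_count;	/* lets a caller observe destroy() calls */
};

#define sketch_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static void sketch_driver_priv_destroy(struct sketch_resource_private *priv)
{
	struct sketch_driver_priv *dp =
		sketch_container_of(priv, struct sketch_driver_priv, base);

	(*dp->destroy_count)++;
	free(dp);
}

static const struct sketch_private_ops sketch_driver_priv_ops = {
	.destroy = sketch_driver_priv_destroy,
};

/* Allocate a manager node and attach driver-private data to it. */
static struct sketch_resource *sketch_resource_alloc(int *destroy_count)
{
	struct sketch_range_node *node = calloc(1, sizeof(*node));
	struct sketch_driver_priv *dp = calloc(1, sizeof(*dp));

	assert(destroy_count != NULL);
	if (!node || !dp) {
		free(node);
		free(dp);
		return NULL;
	}
	dp->base.ops = &sketch_driver_priv_ops;
	dp->destroy_count = destroy_count;
	node->base.priv = &dp->base;
	return &node->base;
}

/* Destroy the private data first, then let the "manager" free its node. */
static void sketch_resource_free(struct sketch_resource **res)
{
	struct sketch_resource *resource = *res;

	if (!resource)
		return;

	*res = NULL;
	if (resource->priv)
		resource->priv->ops->destroy(resource->priv);
	free(sketch_container_of(resource, struct sketch_range_node, base));
}
```

The alternative Christian suggests would instead have the driver embed each manager's node type in its own struct, which is why Thomas asks about init()/fini() hooks on the range manager.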
Re: [PATCH v2 1/3] video: fbdev: asiliantfb: Error out if 'pixclock' equals zero
Hi Zheyu, On Mon, Jul 26, 2021 at 12:04 PM Zheyu Ma wrote: > The userspace program could pass any values to the driver through > ioctl() interface. If the driver doesn't check the value of 'pixclock', > it may cause divide error. > > Fix this by checking whether 'pixclock' is zero first. > > The following log reveals it: > > [ 43.861711] divide error: [#1] PREEMPT SMP KASAN PTI > [ 43.861737] CPU: 2 PID: 11764 Comm: i740 Not tainted > 5.14.0-rc2-00513-gac532c9bbcfb-dirty #224 > [ 43.861756] RIP: 0010:asiliantfb_check_var+0x4e/0x730 > [ 43.861843] Call Trace: > [ 43.861848] ? asiliantfb_remove+0x190/0x190 > [ 43.861858] fb_set_var+0x2e4/0xeb0 > [ 43.861866] ? fb_blank+0x1a0/0x1a0 > [ 43.861873] ? lock_acquire+0x1ef/0x530 > [ 43.861884] ? lock_release+0x810/0x810 > [ 43.861892] ? lock_is_held_type+0x100/0x140 > [ 43.861903] ? ___might_sleep+0x1ee/0x2d0 > [ 43.861914] ? __mutex_lock+0x620/0x1190 > [ 43.861921] ? do_fb_ioctl+0x313/0x700 > [ 43.861929] ? mutex_lock_io_nested+0xfa0/0xfa0 > [ 43.861936] ? __this_cpu_preempt_check+0x1d/0x30 > [ 43.861944] ? _raw_spin_unlock_irqrestore+0x46/0x60 > [ 43.861952] ? lockdep_hardirqs_on+0x59/0x100 > [ 43.861959] ? _raw_spin_unlock_irqrestore+0x46/0x60 > [ 43.861967] ? trace_hardirqs_on+0x6a/0x1c0 > [ 43.861978] do_fb_ioctl+0x31e/0x700 > > Signed-off-by: Zheyu Ma Thanks for your patch! 
> --- > Changes in v2: > - Make commit log more descriptive > --- > drivers/video/fbdev/asiliantfb.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/drivers/video/fbdev/asiliantfb.c > b/drivers/video/fbdev/asiliantfb.c > index 3e006da47752..84c56f525889 100644 > --- a/drivers/video/fbdev/asiliantfb.c > +++ b/drivers/video/fbdev/asiliantfb.c > @@ -227,6 +227,9 @@ static int asiliantfb_check_var(struct fb_var_screeninfo > *var, > { > unsigned long Ftarget, ratio, remainder; > > + if (!var->pixclock) > + return -EINVAL; While this fixes the crash, it is not correct: according to the fbdev API, invalid values must be rounded up to a supported value, if possible. -EINVAL should only be returned if rounding up values in fb_var_screeninfo cannot give a valid mode. The same comment applies to the other patches in this series: [PATCH v2 2/3] video: fbdev: kyro: Error out if 'pixclock' equals zero [PATCH v2 3/3] video: fbdev: riva: Error out if 'pixclock' equals zero > + > ratio = 100 / var->pixclock; > remainder = 100 % var->pixclock; > Ftarget = 100 * ratio + (100 * remainder) / var->pixclock; Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds
Re: [Intel-gfx] [PATCH 0/6] i915: Simplify mmio handling & add new DG2 shadow table
On Fri, Sep 10, 2021 at 04:03:50PM +0100, Tvrtko Ursulin wrote: > > On 10/09/2021 15:24, Matt Roper wrote: > > On Fri, Sep 10, 2021 at 02:03:44PM +0100, Tvrtko Ursulin wrote: > > > > > > On 10/09/2021 06:33, Matt Roper wrote: > > > > Our uncore MMIO functions for reading/writing registers have become very > > > > complicated over time. There's significant macro magic used to generate > > > > several nearly-identical functions that only really differ in terms of > > > > which platform-specific shadow register table they should check on write > > > > operations. We can significantly simplify our MMIO handlers by storing > > > > a reference to the current platform's shadow table within the 'struct > > > > intel_uncore' the same way we already do for forcewake; this allows us > > > > to consolidate the multiple variants of each 'write' function down to > > > > just a single 'fwtable' version that gets the shadow table out of the > > > > uncore struct rather than hardcoding the name of a specific platform's > > > > table. We can do similar consolidation on the MMIO read side by > > > > creating a single-entry forcewake table to replace the open-coded range > > > > check they had been using previously. > > > > > > > > The final patch of the series adds a new shadow table for DG2; this > > > > becomes quite clean and simple now, given the refactoring in the first > > > > five patches. > > > > > > Tidy and it ends up saving kernel binary size. > > > > > > However I am undecided yet, because one thing to note is that the trade > > > off > > > is source code and kernel text consolidation at the expense of more > > > indirect > > > calls at runtime and larger common read/write functions. 
> > >
> > > To expand, current code generates a bunch of per gen functions but in doing so it manages to inline a bunch of checks like NEEDS_FORCE_WAKE and BSEARCH (from find_fw_domain) so at runtime each platform mmio read/write does not have to do indirect calls to do lookups.
> > >
> > > It may matter a lot in the grand scheme of things but this trade off is something to note in the cover letter I think.
> >
> > That's true. However it seems like if the extra indirect calls are good enough for our forcewake lookups (which are called more frequently and have to search through much larger tables) then using the same strategy for shadow registers should be less of a concern. Plus most of timing-critical parts of the code don't call through this at all; they just grab an explicit forcewake and then issue a bunch of *_fw() operations that skip all the per-register forcewake and shadow handling.
>
> With lookups you mean intel_uncore_forcewake_for_reg? Yeah I don't have a good idea of how many of those followed by "_fw" accessors we have vs "un-optimized" access. But it's a good point.
>
> I was mostly coming from the point of view of old platforms like gen6, where with this series reads go from inlined checks (NEEDS_FORCE_WAKE) to always calling find_fw_domain. Just because it is a bit unfortunate to burden old CPUs (they are not getting any faster) with executing more code. It's not nice when old hardware gets slower and slower with software updates. :) But whether or not this case would at all be measurable.. probably not. Unless some compounding effects, like "death by thousand cuts", would come into play.

Chris pointed out in an offline mail that NEEDS_FORCE_WAKE does cut out a lot of display MMIO lookups. So I think it might be worth adding that back, but also adding an "|| GEN11_BSD_RING_BASE" so that it will still be accurate for the newer platforms too.
But I think another thing to consider here would be that we might want to switch our intel_de_{read,write} wrappers to call raw mmio directly, to completely bypass forcewake and shadow logic. Matt > > Regards, > > Tvrtko > > > But you're right that this is something I should mention more clearly in > > the cover letter. > > > > > > Matt > > > > > > > > Regards, > > > > > > Tvrtko > > > > > > > Matt Roper (6): > > > > drm/i915/uncore: Convert gen6/gen7 read operations to fwtable > > > > drm/i915/uncore: Associate shadow table with uncore > > > > drm/i915/uncore: Replace gen8 write functions with general fwtable > > > > drm/i915/uncore: Drop gen11/gen12 mmio write handlers > > > > drm/i915/uncore: Drop gen11 mmio read handlers > > > > drm/i915/dg2: Add DG2-specific shadow register table > > > > > > > >drivers/gpu/drm/i915/intel_uncore.c | 190 > > > > ++ > > > >drivers/gpu/drm/i915/intel_uncore.h | 7 + > > > >drivers/gpu/drm/i915/selftests/intel_uncore.c | 1 + > > > >3 files changed, 110 insertions(+), 88 deletions(-) > > > > > > -- Matt Roper Graphics Software Engineer VTT-OSGC Platform Enablement Intel Corporation (916) 356-2795
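The consolidation described in the cover letter can be pictured with a small user-space sketch: the per-platform shadow table becomes data referenced from the uncore structure, and one common write path consults whatever table is installed. This is a simplified model, not the i915 code; the `sketch_*` names and the register offsets are invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-in for the uncore struct: it only carries the platform's table. */
struct sketch_uncore {
	const uint32_t *shadowed_regs;	/* sorted ascending */
	size_t num_shadowed_regs;
};

static int sketch_cmp_u32(const void *key, const void *elem)
{
	uint32_t a = *(const uint32_t *)key;
	uint32_t b = *(const uint32_t *)elem;

	return (a > b) - (a < b);
}

/* Binary search in the table the platform installed at init time. */
static bool sketch_is_shadowed(const struct sketch_uncore *uncore,
			       uint32_t offset)
{
	if (!uncore->shadowed_regs || !uncore->num_shadowed_regs)
		return false;

	return bsearch(&offset, uncore->shadowed_regs,
		       uncore->num_shadowed_regs, sizeof(uint32_t),
		       sketch_cmp_u32) != NULL;
}

/*
 * Single write-path decision shared by all platforms: a write to a
 * shadowed register can skip forcewake, everything else cannot.
 */
static bool sketch_write_needs_forcewake(const struct sketch_uncore *uncore,
					 uint32_t offset)
{
	return !sketch_is_shadowed(uncore, offset);
}

/* Made-up per-platform tables; real tables differ. */
static const uint32_t sketch_platform_a_shadowed[] = {
	0x2030, 0x2550, 0xa008,
};
static const uint32_t sketch_platform_b_shadowed[] = {
	0x2030, 0x1c0030, 0x1c4030,
};
```

The cost Tvrtko raises is visible here too: the lookup is a runtime search through a pointer rather than a check the compiler could inline into a per-platform function.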
Re: [PATCH v2] kernel/locking: Add context to ww_mutex_trylock.
On Thu, Sep 09, 2021 at 11:32:18AM +0200, Maarten Lankhorst wrote: > diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c > index d456579d0952..791c28005eef 100644 > --- a/kernel/locking/mutex.c > +++ b/kernel/locking/mutex.c > @@ -736,6 +736,44 @@ __ww_mutex_lock(struct mutex *lock, unsigned int state, > unsigned int subclass, > return __mutex_lock_common(lock, state, subclass, NULL, ip, ww_ctx, > true); > } > > +/** > + * ww_mutex_trylock - tries to acquire the w/w mutex with optional acquire > context > + * @lock: mutex to lock > + * @ctx: optional w/w acquire context > + * > + * Trylocks a mutex with the optional acquire context; no deadlock detection > is > + * possible. Returns 1 if the mutex has been acquired successfully, 0 > otherwise. > + * > + * Unlike ww_mutex_lock, no deadlock handling is performed. However, if a > @ctx is > + * specified, -EALREADY and -EDEADLK handling may happen in calls to > ww_mutex_lock. > + * > + * A mutex acquired with this function must be released with ww_mutex_unlock. 
> + */ > +int __sched > +ww_mutex_trylock(struct ww_mutex *ww, struct ww_acquire_ctx *ctx) > +{ > + bool locked; > + > + if (!ctx) > + return mutex_trylock(&ww->base); > + > +#ifdef CONFIG_DEBUG_MUTEXES > + DEBUG_LOCKS_WARN_ON(ww->base.magic != &ww->base); > +#endif > + > + preempt_disable(); > + locked = __mutex_trylock(&ww->base); > + > + if (locked) { > + ww_mutex_set_context_fastpath(ww, ctx); > + mutex_acquire_nest(&ww->base.dep_map, 0, 1, &ctx->dep_map, > _RET_IP_); > + } > + preempt_enable(); > + > + return locked; > +} > +EXPORT_SYMBOL(ww_mutex_trylock); > + > #ifdef CONFIG_DEBUG_LOCK_ALLOC > void __sched > mutex_lock_nested(struct mutex *lock, unsigned int subclass) > diff --git a/kernel/locking/ww_rt_mutex.c b/kernel/locking/ww_rt_mutex.c > index 3f1fff7d2780..c4cb863edb4c 100644 > --- a/kernel/locking/ww_rt_mutex.c > +++ b/kernel/locking/ww_rt_mutex.c > @@ -50,6 +50,18 @@ __ww_rt_mutex_lock(struct ww_mutex *lock, struct > ww_acquire_ctx *ww_ctx, > return ret; > } > > +int __sched > +ww_mutex_trylock(struct ww_mutex *lock, struct ww_acquire_ctx *ctx) > +{ > + int locked = rt_mutex_trylock(&lock->base); > + > + if (locked && ctx) > + ww_mutex_set_context_fastpath(lock, ctx); > + > + return locked; > +} > +EXPORT_SYMBOL(ww_mutex_trylock); > + > int __sched > ww_mutex_lock(struct ww_mutex *lock, struct ww_acquire_ctx *ctx) > { That doesn't look right, how's this for you? --- --- a/kernel/locking/mutex.c +++ b/kernel/locking/mutex.c @@ -94,6 +94,9 @@ static inline unsigned long __owner_flag return owner & MUTEX_FLAGS; } +/* + * Returns: __mutex_owner(lock) on failure or NULL on success. 
+ */ static inline struct task_struct *__mutex_trylock_common(struct mutex *lock, bool handoff) { unsigned long owner, curr = (unsigned long)current; @@ -736,6 +739,47 @@ __ww_mutex_lock(struct mutex *lock, unsi return __mutex_lock_common(lock, state, subclass, NULL, ip, ww_ctx, true); } +/** + * ww_mutex_trylock - tries to acquire the w/w mutex with optional acquire context + * @ww: mutex to lock + * @ww_ctx: optional w/w acquire context + * + * Trylocks a mutex with the optional acquire context; no deadlock detection is + * possible. Returns 1 if the mutex has been acquired successfully, 0 otherwise. + * + * Unlike ww_mutex_lock, no deadlock handling is performed. However, if a @ctx is + * specified, -EALREADY handling may happen in calls to ww_mutex_trylock. + * + * A mutex acquired with this function must be released with ww_mutex_unlock. + */ +int ww_mutex_trylock(struct ww_mutex *ww, struct ww_acquire_ctx *ww_ctx) +{ + if (!ww_ctx) + return mutex_trylock(&ww->base); + + MUTEX_WARN_ON(ww->base.magic != &ww->base); + + if (unlikely(ww_ctx == READ_ONCE(ww->ctx))) + return -EALREADY; + + /* +* Reset the wounded flag after a kill. No other process can +* race and wound us here, since they can't have a valid owner +* pointer if we don't have any locks held. +*/ + if (ww_ctx->acquired == 0) + ww_ctx->wounded = 0; + + if (__mutex_trylock(&ww->base)) { + ww_mutex_set_context_fastpath(ww, ww_ctx); + mutex_acquire_nest(&ww->base.dep_map, 0, 1, &ww_ctx->dep_map, _RET_IP_); + return 1; + } + + return 0; +} +EXPORT_SYMBOL(ww_mutex_trylock); + #ifdef CONFIG_DEBUG_LOCK_ALLOC void __sched mutex_lock_nested(struct mutex *lock, unsigned int subclass) --- a/kernel/locking/ww_rt_mutex.c +++ b/kernel/locking/ww_rt_mutex.c @@ -9,6 +9,34 @@ #define WW_RT #include "rtmutex.c" +int ww_mutex_trylock(struct ww_mutex *lock, struct ww_acquire_ctx *ww_ctx) +{ + struct rt_mutex *rtm = &lock->base; + + if (!ww_ctx) + return rt_mutex_trylock(rtm); + + if (unlikely(ww_
Re: [Intel-gfx] [PATCH 0/6] i915: Simplify mmio handling & add new DG2 shadow table
On 10/09/2021 15:24, Matt Roper wrote: On Fri, Sep 10, 2021 at 02:03:44PM +0100, Tvrtko Ursulin wrote: On 10/09/2021 06:33, Matt Roper wrote: Our uncore MMIO functions for reading/writing registers have become very complicated over time. There's significant macro magic used to generate several nearly-identical functions that only really differ in terms of which platform-specific shadow register table they should check on write operations. We can significantly simplify our MMIO handlers by storing a reference to the current platform's shadow table within the 'struct intel_uncore' the same way we already do for forcewake; this allows us to consolidate the multiple variants of each 'write' function down to just a single 'fwtable' version that gets the shadow table out of the uncore struct rather than hardcoding the name of a specific platform's table. We can do similar consolidation on the MMIO read side by creating a single-entry forcewake table to replace the open-coded range check they had been using previously. The final patch of the series adds a new shadow table for DG2; this becomes quite clean and simple now, given the refactoring in the first five patches. Tidy and it ends up saving kernel binary size. However I am undecided yet, because one thing to note is that the trade off is source code and kernel text consolidation at the expense of more indirect calls at runtime and larger common read/write functions. To expand, current code generates a bunch of per gen functions but in doing so it manages to inline a bunch of checks like NEEDS_FORCE_WAKE and BSEARCH (from find_fw_domain) so at runtime each platform mmio read/write does not have to do indirect calls to do lookups. It may matter a lot in the grand scheme of things but this trade off is something to note in the cover letter I think. That's true. 
However it seems like if the extra indirect calls are good enough for our forcewake lookups (which are called more frequently and have to search through much larger tables) then using the same strategy for shadow registers should be less of a concern. Plus most of timing-critical parts of the code don't call through this at all; they just grab an explicit forcewake and then issue a bunch of *_fw() operations that skip all the per-register forcewake and shadow handling. With lookups you mean intel_uncore_forcewake_for_reg? Yeah I don't have a good idea of how many of those followed by "_fw" accessors we have vs "un-optimized" access. But it's a good point. I was mostly coming from the point of view of old platforms like gen6, where with this series reads go from inlined checks (NEEDS_FORCE_WAKE) to always calling find_fw_domain. Just because it is a bit unfortunate to burden old CPUs (they are not getting any faster) with executing more code. It's not nice when old hardware gets slower and slower with software updates. :) But whether or not this case would at all be measurable.. probably not. Unless some compounding effects, like "death by thousand cuts", would come into play. Regards, Tvrtko But you're right that this is something I should mention more clearly in the cover letter. Matt Regards, Tvrtko Matt Roper (6): drm/i915/uncore: Convert gen6/gen7 read operations to fwtable drm/i915/uncore: Associate shadow table with uncore drm/i915/uncore: Replace gen8 write functions with general fwtable drm/i915/uncore: Drop gen11/gen12 mmio write handlers drm/i915/uncore: Drop gen11 mmio read handlers drm/i915/dg2: Add DG2-specific shadow register table drivers/gpu/drm/i915/intel_uncore.c | 190 ++ drivers/gpu/drm/i915/intel_uncore.h | 7 + drivers/gpu/drm/i915/selftests/intel_uncore.c | 1 + 3 files changed, 110 insertions(+), 88 deletions(-)
[PATCH 0/1] lib, stackdepot: Add helper to print stack entries into buffer.
This change is in response to discussion at [1]. The patch has been created on top of my earlier changes [2] and [3]. If needed I can resend all of these patches together, though my earlier patches have been Acked. [1] https://lore.kernel.org/lkml/e6f6fb85-1d83-425b-9e36-b5784cc9e...@suse.cz/ [2] https://lore.kernel.org/lkml/fe94ffd8-d235-87d8-9c3d-80f7f73e0...@suse.cz/ [3] https://lore.kernel.org/lkml/85f4f073-0b5a-9052-0ba9-74d450608...@suse.cz/ Imran Khan (1): lib, stackdepot: Add helper to print stack entries into buffer. drivers/gpu/drm/drm_dp_mst_topology.c | 5 + drivers/gpu/drm/drm_mm.c| 5 + drivers/gpu/drm/i915/i915_vma.c | 5 + drivers/gpu/drm/i915/intel_runtime_pm.c | 20 +--- include/linux/stackdepot.h | 3 +++ lib/stackdepot.c| 23 +++ mm/page_owner.c | 5 + 7 files changed, 35 insertions(+), 31 deletions(-) -- 2.30.2
[PATCH 1/1] lib, stackdepot: Add helper to print stack entries into buffer.
To print stack entries into a buffer, users of stackdepot, first get a list of stack entries using stack_depot_fetch and then print this list into a buffer using stack_trace_snprint. Provide a helper in stackdepot for this purpose. Also change above mentioned users to use this helper. Signed-off-by: Imran Khan Suggested-by: Vlastimil Babka --- drivers/gpu/drm/drm_dp_mst_topology.c | 5 + drivers/gpu/drm/drm_mm.c| 5 + drivers/gpu/drm/i915/i915_vma.c | 5 + drivers/gpu/drm/i915/intel_runtime_pm.c | 20 +--- include/linux/stackdepot.h | 3 +++ lib/stackdepot.c| 23 +++ mm/page_owner.c | 5 + 7 files changed, 35 insertions(+), 31 deletions(-) diff --git a/drivers/gpu/drm/drm_dp_mst_topology.c b/drivers/gpu/drm/drm_dp_mst_topology.c index 86d13d6bc463..2d1adab9e360 100644 --- a/drivers/gpu/drm/drm_dp_mst_topology.c +++ b/drivers/gpu/drm/drm_dp_mst_topology.c @@ -1668,13 +1668,10 @@ __dump_topology_ref_history(struct drm_dp_mst_topology_ref_history *history, for (i = 0; i < history->len; i++) { const struct drm_dp_mst_topology_ref_entry *entry = &history->entries[i]; - ulong *entries; - uint nr_entries; u64 ts_nsec = entry->ts_nsec; u32 rem_nsec = do_div(ts_nsec, 10); - nr_entries = stack_depot_fetch(entry->backtrace, &entries); - stack_trace_snprint(buf, PAGE_SIZE, entries, nr_entries, 4); + stack_depot_snprint(entry->backtrace, buf, PAGE_SIZE, 4); drm_printf(&p, " %d %ss (last at %5llu.%06u):\n%s", entry->count, diff --git a/drivers/gpu/drm/drm_mm.c b/drivers/gpu/drm/drm_mm.c index 93d48a6f04ab..ca04d7f6f7b5 100644 --- a/drivers/gpu/drm/drm_mm.c +++ b/drivers/gpu/drm/drm_mm.c @@ -118,8 +118,6 @@ static noinline void save_stack(struct drm_mm_node *node) static void show_leaks(struct drm_mm *mm) { struct drm_mm_node *node; - unsigned long *entries; - unsigned int nr_entries; char *buf; buf = kmalloc(BUFSZ, GFP_KERNEL); @@ -133,8 +131,7 @@ static void show_leaks(struct drm_mm *mm) continue; } - nr_entries = stack_depot_fetch(node->stack, &entries); - stack_trace_snprint(buf, 
BUFSZ, entries, nr_entries, 0); + stack_depot_snprint(node->stack, buf, BUFSZ, 0); DRM_ERROR("node [%08llx + %08llx]: inserted at\n%s", node->start, node->size, buf); } diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c index 4b7fc4647e46..f2d9ed375109 100644 --- a/drivers/gpu/drm/i915/i915_vma.c +++ b/drivers/gpu/drm/i915/i915_vma.c @@ -56,8 +56,6 @@ void i915_vma_free(struct i915_vma *vma) static void vma_print_allocator(struct i915_vma *vma, const char *reason) { - unsigned long *entries; - unsigned int nr_entries; char buf[512]; if (!vma->node.stack) { @@ -66,8 +64,7 @@ static void vma_print_allocator(struct i915_vma *vma, const char *reason) return; } - nr_entries = stack_depot_fetch(vma->node.stack, &entries); - stack_trace_snprint(buf, sizeof(buf), entries, nr_entries, 0); + stack_depot_snprint(vma->node.stack, buf, sizeof(buf), 0); DRM_DEBUG_DRIVER("vma.node [%08llx + %08llx] %s: inserted at %s\n", vma->node.start, vma->node.size, reason, buf); } diff --git a/drivers/gpu/drm/i915/intel_runtime_pm.c b/drivers/gpu/drm/i915/intel_runtime_pm.c index eaf7688f517d..cc312f0a05eb 100644 --- a/drivers/gpu/drm/i915/intel_runtime_pm.c +++ b/drivers/gpu/drm/i915/intel_runtime_pm.c @@ -65,16 +65,6 @@ static noinline depot_stack_handle_t __save_depot_stack(void) return stack_depot_save(entries, n, GFP_NOWAIT | __GFP_NOWARN); } -static void __print_depot_stack(depot_stack_handle_t stack, - char *buf, int sz, int indent) -{ - unsigned long *entries; - unsigned int nr_entries; - - nr_entries = stack_depot_fetch(stack, &entries); - stack_trace_snprint(buf, sz, entries, nr_entries, indent); -} - static void init_intel_runtime_pm_wakeref(struct intel_runtime_pm *rpm) { spin_lock_init(&rpm->debug.lock); @@ -146,12 +136,12 @@ static void untrack_intel_runtime_pm_wakeref(struct intel_runtime_pm *rpm, if (!buf) return; - __print_depot_stack(stack, buf, PAGE_SIZE, 2); + stack_depot_snprint(stack, buf, PAGE_SIZE, 2); DRM_DEBUG_DRIVER("wakeref %x from\n%s", 
stack, buf); stack = READ_ONCE(rpm->debug.last_release); if (stack) { - __print_depot_stack(stack, buf, PAGE_SIZE, 2); +
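The pattern this patch factors out (fetch the entries for a handle, then print them into a buffer) can be modelled in user space as below. This is a sketch under assumptions: the toy depot, the `sketch_*` names and the plain hexadecimal output are all invented here, whereas the kernel helpers resolve addresses to symbols and take a depot handle from a real hash table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

typedef unsigned int sketch_handle_t;

struct sketch_stack {
	const unsigned long *entries;
	unsigned int nr;
};

/* Toy depot: handle N (1-based) indexes this fixed table; 0 is invalid. */
static const unsigned long sketch_trace_a[] = { 0x1000, 0x2000 };
static const struct sketch_stack sketch_depot[] = {
	{ sketch_trace_a, 2 },
};

/* Step 1 of the open-coded pattern: look up the saved entries. */
static unsigned int sketch_depot_fetch(sketch_handle_t handle,
				       const unsigned long **entries)
{
	size_t count = sizeof(sketch_depot) / sizeof(sketch_depot[0]);

	if (handle == 0 || handle > count) {
		*entries = NULL;
		return 0;
	}
	*entries = sketch_depot[handle - 1].entries;
	return sketch_depot[handle - 1].nr;
}

/* Step 2: print one entry per line, indented by 'spaces'. */
static int sketch_trace_snprint(char *buf, size_t size,
				const unsigned long *entries,
				unsigned int nr, int spaces)
{
	int total = 0;
	unsigned int i;

	for (i = 0; i < nr; i++) {
		int n = snprintf(buf, size, "%*s0x%lx\n", spaces, "",
				 entries[i]);

		if (n < 0 || (size_t)n >= size)
			break;	/* out of room: stop at the truncated line */
		buf += n;
		size -= (size_t)n;
		total += n;
	}
	return total;
}

/* The helper: both steps behind one call, so callers need no locals. */
static int sketch_depot_snprint(sketch_handle_t handle, char *buf,
				size_t size, int spaces)
{
	const unsigned long *entries;
	unsigned int nr = sketch_depot_fetch(handle, &entries);

	return nr ? sketch_trace_snprint(buf, size, entries, nr, spaces) : 0;
}
```

A caller then shrinks to a single line such as `sketch_depot_snprint(handle, buf, sizeof(buf), 2);`, which is the shape of the converted call sites in the diff.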
Re: [PATCH v3 2/8] mm: Introduce a function to check for confidential computing features
On Wed, Sep 08, 2021 at 05:58:33PM -0500, Tom Lendacky wrote:
> In prep for other confidential computing technologies, introduce a generic

preparation

> helper function, cc_platform_has(), that can be used to check for specific
> active confidential computing attributes, like memory encryption. This is
> intended to eliminate having to add multiple technology-specific checks to
> the code (e.g. if (sev_active() || tdx_active())).

...

> diff --git a/include/linux/cc_platform.h b/include/linux/cc_platform.h
> new file mode 100644
> index ..253f3ea66cd8
> --- /dev/null
> +++ b/include/linux/cc_platform.h
> @@ -0,0 +1,88 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Confidential Computing Platform Capability checks
> + *
> + * Copyright (C) 2021 Advanced Micro Devices, Inc.
> + *
> + * Author: Tom Lendacky
> + */
> +
> +#ifndef _CC_PLATFORM_H

	_LINUX_CC_PLATFORM_H

> +#define _CC_PLATFORM_H

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette
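The idea behind the generic helper can be sketched as follows: callers ask about an attribute, and one function dispatches to whichever technology is active, so call sites no longer name a vendor. This is a user-space illustration only; the `sketch_*` enums, the two "vendors" and the attribute sets they report are assumptions, not the contents of the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical attributes a caller may ask about. */
enum sketch_cc_attr {
	SKETCH_CC_ATTR_MEM_ENCRYPT,
	SKETCH_CC_ATTR_GUEST_STATE_ENCRYPT,
};

/* Hypothetical technologies; at most one is active at a time. */
enum sketch_cc_vendor {
	SKETCH_CC_NONE,
	SKETCH_CC_VENDOR_A,
	SKETCH_CC_VENDOR_B,
};

static enum sketch_cc_vendor sketch_active_vendor = SKETCH_CC_NONE;

/* Each technology answers for itself. */
static bool sketch_vendor_a_has(enum sketch_cc_attr attr)
{
	return attr == SKETCH_CC_ATTR_MEM_ENCRYPT ||
	       attr == SKETCH_CC_ATTR_GUEST_STATE_ENCRYPT;
}

static bool sketch_vendor_b_has(enum sketch_cc_attr attr)
{
	return attr == SKETCH_CC_ATTR_MEM_ENCRYPT;
}

/* The one check generic code calls instead of per-technology tests. */
static bool sketch_cc_platform_has(enum sketch_cc_attr attr)
{
	switch (sketch_active_vendor) {
	case SKETCH_CC_VENDOR_A:
		return sketch_vendor_a_has(attr);
	case SKETCH_CC_VENDOR_B:
		return sketch_vendor_b_has(attr);
	default:
		return false;
	}
}
```

A call site that used to read `if (sev_active() || tdx_active())` would, in this model, become a single `if (sketch_cc_platform_has(SKETCH_CC_ATTR_MEM_ENCRYPT))`.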
[PATCH v5 1/3] dt-bindings: Add YAML bindings for NVDEC
Add YAML device tree bindings for NVDEC, now in a more appropriate place compared to the old textual Host1x bindings. Signed-off-by: Mikko Perttunen --- v5: * Changed from nvidia,instance to nvidia,host1x-class optional property. * Added dma-coherent v4: * Fix incorrect compatibility string in 'if' condition v3: * Drop host1x bindings * Change read2 to read-1 in interconnect names v2: * Fix issues pointed out in v1 * Add T194 nvidia,instance property --- .../gpu/host1x/nvidia,tegra210-nvdec.yaml | 104 ++ MAINTAINERS | 1 + 2 files changed, 105 insertions(+) create mode 100644 Documentation/devicetree/bindings/gpu/host1x/nvidia,tegra210-nvdec.yaml diff --git a/Documentation/devicetree/bindings/gpu/host1x/nvidia,tegra210-nvdec.yaml b/Documentation/devicetree/bindings/gpu/host1x/nvidia,tegra210-nvdec.yaml new file mode 100644 index ..f1f8d083d736 --- /dev/null +++ b/Documentation/devicetree/bindings/gpu/host1x/nvidia,tegra210-nvdec.yaml @@ -0,0 +1,104 @@ +# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause) +%YAML 1.2 +--- +$id: "http://devicetree.org/schemas/gpu/host1x/nvidia,tegra210-nvdec.yaml#"; +$schema: "http://devicetree.org/meta-schemas/core.yaml#"; + +title: Device tree binding for NVIDIA Tegra NVDEC + +description: | + NVDEC is the hardware video decoder present on NVIDIA Tegra210 + and newer chips. It is located on the Host1x bus and typically + programmed through Host1x channels. 
+ +maintainers: + - Thierry Reding + - Mikko Perttunen + +properties: + $nodename: +pattern: "^nvdec@[0-9a-f]*$" + + compatible: +enum: + - nvidia,tegra210-nvdec + - nvidia,tegra186-nvdec + - nvidia,tegra194-nvdec + + reg: +maxItems: 1 + + clocks: +maxItems: 1 + + clock-names: +items: + - const: nvdec + + resets: +maxItems: 1 + + reset-names: +items: + - const: nvdec + + power-domains: +maxItems: 1 + + iommus: +maxItems: 1 + + dma-coherent: true + + interconnects: +items: + - description: DMA read memory client + - description: DMA read 2 memory client + - description: DMA write memory client + + interconnect-names: +items: + - const: dma-mem + - const: read-1 + - const: write + + nvidia,host1x-class: +description: Host1x class of the engine. If not specified, defaults to 0xf0. +$ref: /schemas/types.yaml#/definitions/uint32 + +required: + - compatible + - reg + - clocks + - clock-names + - resets + - reset-names + - power-domains + +additionalProperties: false + +examples: + - | +#include +#include +#include +#include +#include + +nvdec@1548 { +compatible = "nvidia,tegra186-nvdec"; +reg = <0x1548 0x4>; +clocks = <&bpmp TEGRA186_CLK_NVDEC>; +clock-names = "nvdec"; +resets = <&bpmp TEGRA186_RESET_NVDEC>; +reset-names = "nvdec"; + +power-domains = <&bpmp TEGRA186_POWER_DOMAIN_NVDEC>; +interconnects = <&mc TEGRA186_MEMORY_CLIENT_NVDECSRD &emc>, +<&mc TEGRA186_MEMORY_CLIENT_NVDECSRD1 &emc>, +<&mc TEGRA186_MEMORY_CLIENT_NVDECSWR &emc>; +interconnect-names = "dma-mem", "read-1", "write"; +iommus = <&smmu TEGRA186_SID_NVDEC>; +}; + + diff --git a/MAINTAINERS b/MAINTAINERS index 69932194e1ba..ce9e360639d5 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -6230,6 +6230,7 @@ L:linux-te...@vger.kernel.org S: Supported T: git git://anongit.freedesktop.org/tegra/linux.git F: Documentation/devicetree/bindings/display/tegra/nvidia,tegra20-host1x.txt +F: Documentation/devicetree/bindings/gpu/host1x/ F: drivers/gpu/drm/tegra/ F: drivers/gpu/host1x/ F: include/linux/host1x.h -- 2.32.0
[PATCH v5 2/3] arm64: tegra: Add NVDEC to Tegra186/194 device trees
Add a device tree node for NVDEC on Tegra186, and device tree nodes for NVDEC and NVDEC1 on Tegra194. Signed-off-by: Mikko Perttunen --- v5: * Change from nvidia,instance to nvidia,host1x-class v4: * Add dma-coherent markers v3: * Change read2 to read-1 v2: * Add NVDECSRD1 memory client * Add also to T194 (both NVDEC0/1) --- arch/arm64/boot/dts/nvidia/tegra186.dtsi | 16 ++ arch/arm64/boot/dts/nvidia/tegra194.dtsi | 38 2 files changed, 54 insertions(+) diff --git a/arch/arm64/boot/dts/nvidia/tegra186.dtsi b/arch/arm64/boot/dts/nvidia/tegra186.dtsi index d02f6bf3e2ca..4f2f21242b2c 100644 --- a/arch/arm64/boot/dts/nvidia/tegra186.dtsi +++ b/arch/arm64/boot/dts/nvidia/tegra186.dtsi @@ -1342,6 +1342,22 @@ dsib: dsi@1540 { power-domains = <&bpmp TEGRA186_POWER_DOMAIN_DISP>; }; + nvdec@1548 { + compatible = "nvidia,tegra186-nvdec"; + reg = <0x1548 0x4>; + clocks = <&bpmp TEGRA186_CLK_NVDEC>; + clock-names = "nvdec"; + resets = <&bpmp TEGRA186_RESET_NVDEC>; + reset-names = "nvdec"; + + power-domains = <&bpmp TEGRA186_POWER_DOMAIN_NVDEC>; + interconnects = <&mc TEGRA186_MEMORY_CLIENT_NVDECSRD &emc>, + <&mc TEGRA186_MEMORY_CLIENT_NVDECSRD1 &emc>, + <&mc TEGRA186_MEMORY_CLIENT_NVDECSWR &emc>; + interconnect-names = "dma-mem", "read-1", "write"; + iommus = <&smmu TEGRA186_SID_NVDEC>; + }; + sor0: sor@1554 { compatible = "nvidia,tegra186-sor"; reg = <0x1554 0x1>; diff --git a/arch/arm64/boot/dts/nvidia/tegra194.dtsi b/arch/arm64/boot/dts/nvidia/tegra194.dtsi index 5ba7a4519b95..04e883aa7aa2 100644 --- a/arch/arm64/boot/dts/nvidia/tegra194.dtsi +++ b/arch/arm64/boot/dts/nvidia/tegra194.dtsi @@ -1412,6 +1412,25 @@ host1x@13e0 { interconnect-names = "dma-mem"; iommus = <&smmu TEGRA194_SID_HOST1X>; + nvdec@1514 { + compatible = "nvidia,tegra194-nvdec"; + reg = <0x1514 0x0004>; + clocks = <&bpmp TEGRA194_CLK_NVDEC1>; + clock-names = "nvdec"; + resets = <&bpmp TEGRA194_RESET_NVDEC1>; + reset-names = "nvdec"; + + power-domains = <&bpmp TEGRA194_POWER_DOMAIN_NVDECB>; + interconnects = 
<&mc TEGRA194_MEMORY_CLIENT_NVDEC1SRD &emc>, + <&mc TEGRA194_MEMORY_CLIENT_NVDEC1SRD1 &emc>, + <&mc TEGRA194_MEMORY_CLIENT_NVDEC1SWR &emc>; + interconnect-names = "dma-mem", "read-1", "write"; + iommus = <&smmu TEGRA194_SID_NVDEC1>; + dma-coherent; + + nvidia,host1x-class = <0xf5>; + }; + display-hub@1520 { compatible = "nvidia,tegra194-display"; reg = <0x1520 0x0004>; @@ -1525,6 +1544,25 @@ vic@1534 { iommus = <&smmu TEGRA194_SID_VIC>; }; + nvdec@1548 { + compatible = "nvidia,tegra194-nvdec"; + reg = <0x1548 0x0004>; + clocks = <&bpmp TEGRA194_CLK_NVDEC>; + clock-names = "nvdec"; + resets = <&bpmp TEGRA194_RESET_NVDEC>; + reset-names = "nvdec"; + + power-domains = <&bpmp TEGRA194_POWER_DOMAIN_NVDECA>; + interconnects = <&mc TEGRA194_MEMORY_CLIENT_NVDECSRD &emc>, + <&mc TEGRA194_MEMORY_CLIENT_NVDECSRD1 &emc>, + <&mc TEGRA194_MEMORY_CLIENT_NVDECSWR &emc>; + interconnect-names = "dma-mem", "read-1", "write"; + iommus = <&smmu TEGRA194_SID_NVDEC>; + dma-coherent; + + nvidia,host1x-class = <0xf0>; + }; + dpaux0: dpaux@155c { compatible = "nvidia,tegra194-dpaux"; reg = <0x155c 0x1>; -- 2.32.0
[PATCH v5 3/3] drm/tegra: Add NVDEC driver
Add support for booting and using NVDEC on Tegra210, Tegra186 and Tegra194 to the Host1x and TegraDRM drivers. Booting in secure mode is not currently supported. Signed-off-by: Mikko Perttunen --- v5: * Remove num_instances * Change from nvidia,instance to nvidia,host1x-class v3: * Change num_instances to unsigned int * Remove unnecessary '= 0' initializer * Populate num_instances data * Fix instance number check v2: * Use devm_platform_get_and_ioremap_resource * Remove reset handling, done by power domain code * Assume runtime PM is enabled --- drivers/gpu/drm/tegra/Makefile | 3 +- drivers/gpu/drm/tegra/drm.c| 4 + drivers/gpu/drm/tegra/drm.h| 1 + drivers/gpu/drm/tegra/nvdec.c | 464 + drivers/gpu/host1x/dev.c | 18 ++ include/linux/host1x.h | 2 + 6 files changed, 491 insertions(+), 1 deletion(-) create mode 100644 drivers/gpu/drm/tegra/nvdec.c diff --git a/drivers/gpu/drm/tegra/Makefile b/drivers/gpu/drm/tegra/Makefile index 5d2039f0c734..b248c631f790 100644 --- a/drivers/gpu/drm/tegra/Makefile +++ b/drivers/gpu/drm/tegra/Makefile @@ -24,7 +24,8 @@ tegra-drm-y := \ gr2d.o \ gr3d.o \ falcon.o \ - vic.o + vic.o \ + nvdec.o tegra-drm-y += trace.o diff --git a/drivers/gpu/drm/tegra/drm.c b/drivers/gpu/drm/tegra/drm.c index b20fd0833661..5f5afd7ba37e 100644 --- a/drivers/gpu/drm/tegra/drm.c +++ b/drivers/gpu/drm/tegra/drm.c @@ -1337,15 +1337,18 @@ static const struct of_device_id host1x_drm_subdevs[] = { { .compatible = "nvidia,tegra210-sor", }, { .compatible = "nvidia,tegra210-sor1", }, { .compatible = "nvidia,tegra210-vic", }, + { .compatible = "nvidia,tegra210-nvdec", }, { .compatible = "nvidia,tegra186-display", }, { .compatible = "nvidia,tegra186-dc", }, { .compatible = "nvidia,tegra186-sor", }, { .compatible = "nvidia,tegra186-sor1", }, { .compatible = "nvidia,tegra186-vic", }, + { .compatible = "nvidia,tegra186-nvdec", }, { .compatible = "nvidia,tegra194-display", }, { .compatible = "nvidia,tegra194-dc", }, { .compatible = "nvidia,tegra194-sor", }, { .compatible = 
"nvidia,tegra194-vic", }, + { .compatible = "nvidia,tegra194-nvdec", }, { /* sentinel */ } }; @@ -1369,6 +1372,7 @@ static struct platform_driver * const drivers[] = { &tegra_gr2d_driver, &tegra_gr3d_driver, &tegra_vic_driver, + &tegra_nvdec_driver, }; static int __init host1x_drm_init(void) diff --git a/drivers/gpu/drm/tegra/drm.h b/drivers/gpu/drm/tegra/drm.h index 8b28327c931c..fc0a19554eac 100644 --- a/drivers/gpu/drm/tegra/drm.h +++ b/drivers/gpu/drm/tegra/drm.h @@ -202,5 +202,6 @@ extern struct platform_driver tegra_sor_driver; extern struct platform_driver tegra_gr2d_driver; extern struct platform_driver tegra_gr3d_driver; extern struct platform_driver tegra_vic_driver; +extern struct platform_driver tegra_nvdec_driver; #endif /* HOST1X_DRM_H */ diff --git a/drivers/gpu/drm/tegra/nvdec.c b/drivers/gpu/drm/tegra/nvdec.c new file mode 100644 index ..c3b6fe7fb454 --- /dev/null +++ b/drivers/gpu/drm/tegra/nvdec.c @@ -0,0 +1,464 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * Copyright (c) 2015-2021, NVIDIA Corporation. 
+ */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include + +#include "drm.h" +#include "falcon.h" +#include "vic.h" + +struct nvdec_config { + const char *firmware; + unsigned int version; + bool supports_sid; +}; + +struct nvdec { + struct falcon falcon; + + void __iomem *regs; + struct tegra_drm_client client; + struct host1x_channel *channel; + struct device *dev; + struct clk *clk; + + /* Platform configuration */ + const struct nvdec_config *config; +}; + +static inline struct nvdec *to_nvdec(struct tegra_drm_client *client) +{ + return container_of(client, struct nvdec, client); +} + +static void nvdec_writel(struct nvdec *nvdec, u32 value, unsigned int offset) +{ + writel(value, nvdec->regs + offset); +} + +static int nvdec_boot(struct nvdec *nvdec) +{ +#ifdef CONFIG_IOMMU_API + struct iommu_fwspec *spec = dev_iommu_fwspec_get(nvdec->dev); +#endif + int err; + +#ifdef CONFIG_IOMMU_API + if (nvdec->config->supports_sid && spec) { + u32 value; + + value = TRANSCFG_ATT(1, TRANSCFG_SID_FALCON) | TRANSCFG_ATT(0, TRANSCFG_SID_HW); + nvdec_writel(nvdec, value, VIC_TFBIF_TRANSCFG); + + if (spec->num_ids > 0) { + value = spec->ids[0] & 0x; + + nvdec_writel(nvdec, value, VIC_THI_STREAMID0); + nvdec_writel(nvdec, value, VIC_THI_STREAMID1); + } + } +#endif + + err = falcon_boot(&nvdec->falcon); + if (
[PATCH v5 0/3] NVIDIA Tegra NVDEC support
Here's the v5 of the NVDEC support series, containing the following changes: * Changed from nvidia,instance property to nvidia,host1x-class property. * Set additionalProperties to false in DT bindings. * Added dma-coherent property to DT bindings. NVDEC hardware documentation can be found at https://github.com/NVIDIA/open-gpu-doc/tree/master/classes/video and example userspace can be found at https://github.com/cyndis/vaapi-tegra-driver Thanks, Mikko Mikko Perttunen (3): dt-bindings: Add YAML bindings for NVDEC arm64: tegra: Add NVDEC to Tegra186/194 device trees drm/tegra: Add NVDEC driver .../gpu/host1x/nvidia,tegra210-nvdec.yaml | 104 MAINTAINERS | 1 + arch/arm64/boot/dts/nvidia/tegra186.dtsi | 16 + arch/arm64/boot/dts/nvidia/tegra194.dtsi | 38 ++ drivers/gpu/drm/tegra/Makefile| 3 +- drivers/gpu/drm/tegra/drm.c | 4 + drivers/gpu/drm/tegra/drm.h | 1 + drivers/gpu/drm/tegra/nvdec.c | 464 ++ drivers/gpu/host1x/dev.c | 18 + include/linux/host1x.h| 2 + 10 files changed, 650 insertions(+), 1 deletion(-) create mode 100644 Documentation/devicetree/bindings/gpu/host1x/nvidia,tegra210-nvdec.yaml create mode 100644 drivers/gpu/drm/tegra/nvdec.c -- 2.32.0
Re: [RFC PATCH] drm/ttm: Add a private member to the struct ttm_resource
Am 10.09.21 um 15:15 schrieb Thomas Hellström: Both the provider (resource manager) and the consumer (the TTM driver) want to subclass struct ttm_resource. Since this is left for the resource manager, we need to provide a private pointer for the TTM driver. Provide a struct ttm_resource_private for the driver to subclass for data with the same lifetime as the struct ttm_resource: In the i915 case it will, for example, be an sg-table and radix tree into the LMEM /VRAM pages that currently are awkwardly attached to the GEM object. Provide an ops structure for associated ops (Which is only destroy() ATM) It might seem pointless to provide a separate ops structure, but Linus has previously made it clear that that's the norm. After careful audit one could perhaps also on a per-driver basis replace the delete_mem_notify() TTM driver callback with the above destroy function. Well this is a really big NAK to this approach. If you need to attach some additional information to the resource then implement your own resource manager like everybody else does. Regards, Christian. 
Cc: Matthew Auld Cc: König Christian Signed-off-by: Thomas Hellström --- drivers/gpu/drm/ttm/ttm_resource.c | 10 +++--- include/drm/ttm/ttm_resource.h | 28 2 files changed, 35 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/ttm/ttm_resource.c b/drivers/gpu/drm/ttm/ttm_resource.c index 2431717376e7..973e7c50bfed 100644 --- a/drivers/gpu/drm/ttm/ttm_resource.c +++ b/drivers/gpu/drm/ttm/ttm_resource.c @@ -57,13 +57,17 @@ int ttm_resource_alloc(struct ttm_buffer_object *bo, void ttm_resource_free(struct ttm_buffer_object *bo, struct ttm_resource **res) { struct ttm_resource_manager *man; + struct ttm_resource *resource = *res; - if (!*res) + if (!resource) return; - man = ttm_manager_type(bo->bdev, (*res)->mem_type); - man->func->free(man, *res); *res = NULL; + if (resource->priv) + resource->priv->ops.destroy(resource->priv); + + man = ttm_manager_type(bo->bdev, resource->mem_type); + man->func->free(man, resource); } EXPORT_SYMBOL(ttm_resource_free); diff --git a/include/drm/ttm/ttm_resource.h b/include/drm/ttm/ttm_resource.h index 140b6b9a8bbe..5a22c9a29c05 100644 --- a/include/drm/ttm/ttm_resource.h +++ b/include/drm/ttm/ttm_resource.h @@ -44,6 +44,7 @@ struct dma_buf_map; struct io_mapping; struct sg_table; struct scatterlist; +struct ttm_resource_private; struct ttm_resource_manager_func { /** @@ -153,6 +154,32 @@ struct ttm_bus_placement { enum ttm_cachingcaching; }; +/** + * struct ttm_resource_private_ops - Operations for a struct + * ttm_resource_private + * + * Not much benefit to keep this as a separate struct with only a single member, + * but keeping a separate ops struct is the norm. 
+ */ +struct ttm_resource_private_ops { + /** +* destroy() - Callback to destroy the private data +* @priv - The private data to destroy +*/ + void (*destroy) (struct ttm_resource_private *priv); +}; + +/** + * struct ttm_resource_private - TTM driver private data + * @ops: Pointer to struct ttm_resource_private_ops with associated operations + * + * Intended to be subclassed to hold, for example cached data sharing the + * lifetime with a struct ttm_resource. + */ +struct ttm_resource_private { + const struct ttm_resource_private_ops ops; +}; + /** * struct ttm_resource * @@ -171,6 +198,7 @@ struct ttm_resource { uint32_t mem_type; uint32_t placement; struct ttm_bus_placement bus; + struct ttm_resource_private *priv; }; /**
Re: [Intel-gfx] [PATCH 0/6] i915: Simplify mmio handling & add new DG2 shadow table
On Fri, Sep 10, 2021 at 02:03:44PM +0100, Tvrtko Ursulin wrote: > > On 10/09/2021 06:33, Matt Roper wrote: > > Our uncore MMIO functions for reading/writing registers have become very > > complicated over time. There's significant macro magic used to generate > > several nearly-identical functions that only really differ in terms of > > which platform-specific shadow register table they should check on write > > operations. We can significantly simplify our MMIO handlers by storing > > a reference to the current platform's shadow table within the 'struct > > intel_uncore' the same way we already do for forcewake; this allows us > > to consolidate the multiple variants of each 'write' function down to > > just a single 'fwtable' version that gets the shadow table out of the > > uncore struct rather than hardcoding the name of a specific platform's > > table. We can do similar consolidation on the MMIO read side by > > creating a single-entry forcewake table to replace the open-coded range > > check they had been using previously. > > > > The final patch of the series adds a new shadow table for DG2; this > > becomes quite clean and simple now, given the refactoring in the first > > five patches. > > Tidy and it ends up saving kernel binary size. > > However I am undecided yet, because one thing to note is that the trade off > is source code and kernel text consolidation at the expense of more indirect > calls at runtime and larger common read/write functions. > > To expand, current code generates a bunch of per gen functions but in doing > so it manages to inline a bunch of checks like NEEDS_FORCE_WAKE and BSEARCH > (from find_fw_domain) so at runtime each platform mmio read/write does not > have to do indirect calls to do lookups. > > It may not matter a lot in the grand scheme of things but this trade off is > something to note in the cover letter I think. That's true. 
However it seems like if the extra indirect calls are good enough for our forcewake lookups (which are called more frequently and have to search through much larger tables) then using the same strategy for shadow registers should be less of a concern. Plus most of the timing-critical parts of the code don't call through this at all; they just grab an explicit forcewake and then issue a bunch of *_fw() operations that skip all the per-register forcewake and shadow handling. But you're right that this is something I should mention more clearly in the cover letter. Matt > > Regards, > > Tvrtko > > > Matt Roper (6): > >drm/i915/uncore: Convert gen6/gen7 read operations to fwtable > >drm/i915/uncore: Associate shadow table with uncore > >drm/i915/uncore: Replace gen8 write functions with general fwtable > >drm/i915/uncore: Drop gen11/gen12 mmio write handlers > >drm/i915/uncore: Drop gen11 mmio read handlers > >drm/i915/dg2: Add DG2-specific shadow register table > > > > drivers/gpu/drm/i915/intel_uncore.c | 190 ++ > > drivers/gpu/drm/i915/intel_uncore.h | 7 + > > drivers/gpu/drm/i915/selftests/intel_uncore.c | 1 + > > 3 files changed, 110 insertions(+), 88 deletions(-) > > -- Matt Roper Graphics Software Engineer VTT-OSGC Platform Enablement Intel Corporation (916) 356-2795
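The trade-off discussed in this exchange (one common function that reads a per-platform table through a pointer, instead of per-platform functions with the table hardcoded) might look roughly like the userspace sketch below. All names ending in `_sketch`, the struct layout and the register offsets in `example_table` are assumptions made up for the example; they are not the i915 definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical range entry: an inclusive [start, end] register span. */
struct shadow_range_sketch {
	uint32_t start;
	uint32_t end;
};

/* Stand-in for the per-device struct that carries the table reference. */
struct uncore_sketch {
	const struct shadow_range_sketch *shadow_table;
	size_t shadow_table_entries;
};

/*
 * Binary search over a table sorted by start offset with non-overlapping
 * ranges. The table comes from the struct, so one function serves every
 * platform at the cost of an extra pointer dereference.
 */
bool is_shadowed_sketch(const struct uncore_sketch *uncore, uint32_t offset)
{
	size_t lo = 0, hi = uncore->shadow_table_entries;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;
		const struct shadow_range_sketch *r = &uncore->shadow_table[mid];

		if (offset < r->start)
			hi = mid;
		else if (offset > r->end)
			lo = mid + 1;
		else
			return true;
	}

	return false;
}

/* Made-up offsets, only to exercise the lookup. */
static const struct shadow_range_sketch example_table[] = {
	{ 0x2030, 0x2030 },
	{ 0x2550, 0x2550 },
	{ 0xa008, 0xa00c },
};

static const struct uncore_sketch example_uncore = {
	.shadow_table = example_table,
	.shadow_table_entries = sizeof(example_table) / sizeof(example_table[0]),
};
```

Adding a new platform in this scheme means adding a table and pointing the struct at it, which is the consolidation the cover letter describes; the inlined per-generation checks Tvrtko mentions are what is given up.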
Re: [PATCH] drm/vc4: hdmi: Remove unused struct
On Thu, 19 Aug 2021 at 15:08, Maxime Ripard wrote: > > Commit c7d30623540b ("drm/vc4: hdmi: Remove unused struct") removed the > references to the vc4_hdmi_audio_widgets and vc4_hdmi_audio_routes > structures, but not the structures themselves resulting in two warnings. > Remove it. > > Fixes: c7d30623540b ("drm/vc4: hdmi: Remove unused struct") > Reported-by: kernel test robot > Signed-off-by: Maxime Ripard Reviewed-by: Dave Stevenson > --- > drivers/gpu/drm/vc4/vc4_hdmi.c | 8 > 1 file changed, 8 deletions(-) > > diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.c b/drivers/gpu/drm/vc4/vc4_hdmi.c > index b7dc32a0c9bb..1e2d976e8736 100644 > --- a/drivers/gpu/drm/vc4/vc4_hdmi.c > +++ b/drivers/gpu/drm/vc4/vc4_hdmi.c > @@ -1403,14 +1403,6 @@ static int vc4_hdmi_audio_prepare(struct device *dev, > void *data, > return 0; > } > > -static const struct snd_soc_dapm_widget vc4_hdmi_audio_widgets[] = { > - SND_SOC_DAPM_OUTPUT("TX"), > -}; > - > -static const struct snd_soc_dapm_route vc4_hdmi_audio_routes[] = { > - { "TX", NULL, "Playback" }, > -}; > - > static const struct snd_soc_component_driver vc4_hdmi_audio_cpu_dai_comp = { > .name = "vc4-hdmi-cpu-dai-component", > }; > -- > 2.31.1 >
Re: [PATCH v3 1/6] drm/vc4: select PM
On Thu, 19 Aug 2021 at 14:59, Maxime Ripard wrote: > > We already depend on runtime PM to get the power domains and clocks for > most of the devices supported by the vc4 driver, so let's just select it > to make sure it's there, and remove the ifdef. > > Signed-off-by: Maxime Ripard Reviewed-by: Dave Stevenson > --- > drivers/gpu/drm/vc4/Kconfig| 1 + > drivers/gpu/drm/vc4/vc4_hdmi.c | 2 -- > 2 files changed, 1 insertion(+), 2 deletions(-) > > diff --git a/drivers/gpu/drm/vc4/Kconfig b/drivers/gpu/drm/vc4/Kconfig > index 118e8a426b1a..f774ab340863 100644 > --- a/drivers/gpu/drm/vc4/Kconfig > +++ b/drivers/gpu/drm/vc4/Kconfig > @@ -9,6 +9,7 @@ config DRM_VC4 > select DRM_KMS_CMA_HELPER > select DRM_GEM_CMA_HELPER > select DRM_PANEL_BRIDGE > + select PM > select SND_PCM > select SND_PCM_ELD > select SND_SOC_GENERIC_DMAENGINE_PCM > diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.c b/drivers/gpu/drm/vc4/vc4_hdmi.c > index c2876731ee2d..602203b2d8e1 100644 > --- a/drivers/gpu/drm/vc4/vc4_hdmi.c > +++ b/drivers/gpu/drm/vc4/vc4_hdmi.c > @@ -2107,7 +2107,6 @@ static int vc5_hdmi_init_resources(struct vc4_hdmi > *vc4_hdmi) > return 0; > } > > -#ifdef CONFIG_PM > static int vc4_hdmi_runtime_suspend(struct device *dev) > { > struct vc4_hdmi *vc4_hdmi = dev_get_drvdata(dev); > @@ -2128,7 +2127,6 @@ static int vc4_hdmi_runtime_resume(struct device *dev) > > return 0; > } > -#endif > > static int vc4_hdmi_bind(struct device *dev, struct device *master, void > *data) > { > -- > 2.31.1 >
Re: [PATCH v2 5/9] vfio/mdev: Consolidate all the device_api sysfs into the core code
On Fri, Sep 10, 2021 at 01:10:46PM +0100, Christoph Hellwig wrote: > On Thu, Sep 09, 2021 at 04:38:45PM -0300, Jason Gunthorpe wrote: > > Every driver just emits a static string, simply feed it through the ops > > and provide a standard sysfs show function. > > Looks sensible. But can you make the attribute optional and add a > comment marking it deprecated? Because it really is completely useless. > We don't version userspace APIs, userspace has to discover new features > individually by e.g. finding new sysfs files or just trying new ioctls. To be honest I have no idea what side effects that would have.. device code search tells me libvirt reads it and stuffs it into some XML Something called mdevctl touches it, feeds it into some JSON and other stuff.. qemu has some VFIO_DEVICE_API_* constants but it is all dead code I agree it shouldn't have been there in the first place Cornelia? Alex? Any thoughts? Jason
Re: [Intel-gfx] [PATCH v5] drm/i915: Use Transparent Hugepages when IOMMU is enabled
On 09/09/2021 17:17, Rodrigo Vivi wrote: On Thu, Sep 09, 2021 at 12:44:48PM +0100, Tvrtko Ursulin wrote: From: Tvrtko Ursulin Usage of Transparent Hugepages was disabled in 9987da4b5dcf ("drm/i915: Disable THP until we have a GPU read BW W/A"), but since it appears the majority of performance regressions reported with an enabled IOMMU can be almost eliminated by turning them on, let's just do that. To err on the side of safety we keep the current default in cases where IOMMU is not active, and only when it is default to the "huge=within_size" mode. Although there probably would be wins to enable them throughout, more extensive testing across benchmarks and platforms would need to be done. With the patch and IOMMU enabled my local testing on a small Skylake part shows OglVSTangent regression being reduced from ~14% (IOMMU on versus IOMMU off) to ~2% (same comparison but with THP on). More detailed testing done in the below referenced Gitlab issue by Eero: Skylake GT4e: Performance drops from enabling IOMMU: 30-35% SynMark CSDof 20-25% Unigine Heaven, MemBW GPU write, SynMark VSTangent ~20% GLB Egypt (1/2 screen window) 10-15% GLB T-Rex (1/2 screen window) 8-10% GfxBench T-Rex, MemBW GPU blit 7-8% SynMark DeferredAA + TerrainFly* + ZBuffer 6-7% GfxBench Manhattan 3.0 + 3.1, SynMark TexMem128 & CSCloth 5-6% GfxBench CarChase, Unigine Valley 3-5% GfxBench Vulkan & GL AztecRuins + ALU2, MemBW GPU texture, SynMark Fill*, Deferred, TerrainPan* 1-2% Most of the other tests With the patch drops become: 20-25% SynMark TexMem* 15-20% GLB Egypt (1/2 screen window) 10-15% GLB T-Rex (1/2 screen window) 4-7% GfxBench T-Rex, GpuTest Triangle 1-8% GfxBench ALU2 (offscreen 1%, onscreen 8%) 3% GfxBench Manhattan 3.0, SynMark CSDof 2-3% Unigine Heaven + Valley, MemBW GPU texture 1-3% GfxBench Manhattan 3.1 + CarChase + Vulkan & GL AztecRuins Broxton: Performance drops from IOMMU, without patch: 30% MemBW GPU write 25% SynMark ZBuffer + Fill* 20% MemBW GPU blit 15% MemBW GPU blend, GpuTest 
Triangle 10-15% MemBW GPU texture 10% GLB Egypt, Unigine Heaven (had hangs), SynMark TerrainFly* 7-9% GLB T-Rex, GfxBench Manhattan 3.0 + T-Rex, SynMark Deferred* + TexMem* 6-8% GfxBench CarChase, Unigine Valley, SynMark CSCloth + ShMapVsm + TerrainPan* 5-6% GfxBench Manhattan 3.1 + GL AztecRuins, SynMark CSDof + TexFilterTri 2-4% GfxBench ALU2, SynMark DrvRes + GSCloth + ShMapPcf + Batch[0-5] + TexFilterAniso, GpuTest GiMark + 32-bit Julia And with patch: 15-20% MemBW GPU texture 10% SynMark TexMem* 8-9% GLB Egypt (1/2 screen window) 4-5% GLB T-Rex (1/2 screen window) 3-6% GfxBench Manhattan 3.0, GpuTest FurMark, SynMark Deferred + TexFilterTri 3-4% GfxBench Manhattan 3.1 + T-Rex, SynMark VSInstancing 2-4% GpuTest Triangle, SynMark DeferredAA 2-3% Unigine Heaven + Valley 1-3% SynMark Terrain* 1-2% GfxBench CarChase, SynMark TexFilterAniso + ZBuffer Tigerlake-H: 20-25% MemBW GPU texture 15-20% GpuTest Triangle 13-15% SynMark TerrainFly* + DeferredAA + HdrBloom 8-10% GfxBench Manhattan 3.1, SynMark TerrainPan* + DrvRes 6-7% GfxBench Manhattan 3.0, SynMark TexMem* 4-8% GLB onscreen Fill + T-Rex + Egypt (more in onscreen than offscreen versions of T-Rex/Egypt) 4-6% GfxBench CarChase + GLES AztecRuins + ALU2, GpuTest 32-bit Julia, SynMark CSDof + DrvState 3-5% GfxBench T-Rex + Egypt, Unigine Heaven + Valley, GpuTest Plot3D 1-7% Media tests 2-3% MemBW GPU blit 1-3% Most of the rest of 3D tests With the patch: 6-8% MemBW GPU blend => the only regression in these tests (compared to IOMMU without THP) 4-6% SynMark DrvState (not impacted) + HdrBloom (improved) 3-4% GLB T-Rex ~3% GLB Egypt, SynMark DrvRes 1-3% GfxBench T-Rex + Egypt, SynMark TexFilterTri 1-2% GfxBench CarChase + GLES AztecRuins, Unigine Valley, GpuTest Triangle ~1% GfxBench Manhattan 3.0/3.1, Unigine Heaven Perf of several tests actually improved with IOMMU + THP, compared to no IOMMU / no THP: 10-15% SynMark Batch[0-3] 5-10% MemBW GPU texture, SynMark ShMapVsm 3-4% SynMark Fill* + Geom* 2-3% SynMark 
TexMem512 + CSCloth 1-2% SynMark TexMem128 + DeferredAA As a summary across all platforms, these are the benchmarks where enabling THP on top of IOMMU enabled brings regressions: * Skylake GT4e: 20-25% SynMark TexMem* (whereas all MemBW GPU tests either improve or are not affected) * Broxton J4205: 7% MemBW GPU texture 2-3% SynMark TexMem* * Tigerlake-H: 7% MemBW GPU blend Other benchmarks show either lowering of regressions or improvements. v2: * Add Kconfig dependency to transparent hugepages and some help text. * Move to helper for e
[Bug 213391] AMDGPU retries page fault with some specific processes amdgpu and sometimes followed [gfxhub0] retry page fault until *ERROR* ring gfx timeout, but soft recovered
https://bugzilla.kernel.org/show_bug.cgi?id=213391 --- Comment #37 from Michel Dänzer (mic...@daenzer.net) --- (In reply to Lahfa Samy from comment #36) > Did anyone test whether this has been fixed in newer firmware updates, or > should we still stay on version 20210315.3568f96-3 ? It's fixed in upstream linux-firmware 20210818. -- You may reply to this email to add a comment. You are receiving this mail because: You are watching the assignee of the bug.
Re: [PATCH v2 3/6] drm/i915 Implement LMEM backup and restore for suspend / resume
On 9/6/21 6:55 PM, Thomas Hellström wrote: Just evict unpinned objects to system. For pinned LMEM objects, make a backup system object and blit the contents to that. Backup is performed in three steps, 1: Opportunistically evict evictable objects using the gpu blitter. 2: After gt idle, evict evictable objects using the gpu blitter. This will be modified in an upcoming patch to backup pinned objects that are not used by the blitter itself. 3: Backup remaining pinned objects using memcpy. Also move uC suspend to after 2) to make sure we have a functional GuC during 2) if using GuC submission. v2: - Major refactor to make sure gem_exec_suspend@hang-SX subtests work, and suspend / resume works with a slightly modified GuC submission enabling patch series. Signed-off-by: Thomas Hellström --- drivers/gpu/drm/i915/Makefile | 1 + .../gpu/drm/i915/gem/i915_gem_object_types.h | 1 + drivers/gpu/drm/i915/gem/i915_gem_pm.c| 92 +++- drivers/gpu/drm/i915/gem/i915_gem_pm.h| 3 +- drivers/gpu/drm/i915/gem/i915_gem_ttm.c | 29 ++- drivers/gpu/drm/i915/gem/i915_gem_ttm.h | 10 + drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.c| 205 ++ drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.h| 24 ++ drivers/gpu/drm/i915/gt/intel_gt_pm.c | 4 +- drivers/gpu/drm/i915/i915_drv.c | 10 +- drivers/gpu/drm/i915/i915_drv.h | 2 +- 11 files changed, 364 insertions(+), 17 deletions(-) create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.c create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.h diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile index c36c8a4f0716..3379a0a6c91e 100644 --- a/drivers/gpu/drm/i915/Makefile +++ b/drivers/gpu/drm/i915/Makefile @@ -155,6 +155,7 @@ gem-y += \ gem/i915_gem_throttle.o \ gem/i915_gem_tiling.o \ gem/i915_gem_ttm.o \ + gem/i915_gem_ttm_pm.o \ gem/i915_gem_userptr.o \ gem/i915_gem_wait.o \ gem/i915_gemfs.o diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h index 
2471f36aaff3..734cc8e16481 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h @@ -534,6 +534,7 @@ struct drm_i915_gem_object { struct { struct sg_table *cached_io_st; struct i915_gem_object_page_iter get_io_page; + struct drm_i915_gem_object *backup; bool created:1; } ttm; diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pm.c b/drivers/gpu/drm/i915/gem/i915_gem_pm.c index 8b9d7d14c4bd..9746c255ddcc 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_pm.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_pm.c @@ -5,6 +5,7 @@ */ #include "gem/i915_gem_pm.h" +#include "gem/i915_gem_ttm_pm.h" #include "gt/intel_gt.h" #include "gt/intel_gt_pm.h" #include "gt/intel_gt_requests.h" @@ -39,7 +40,79 @@ void i915_gem_suspend(struct drm_i915_private *i915) i915_gem_drain_freed_objects(i915); } -void i915_gem_suspend_late(struct drm_i915_private *i915) +static int lmem_restore(struct drm_i915_private *i915, bool allow_gpu) +{ + struct intel_memory_region *mr; + int ret = 0, id; + + for_each_memory_region(mr, i915, id) { + if (mr->type == INTEL_MEMORY_LOCAL) { + ret = i915_ttm_restore_region(mr, allow_gpu); + if (ret) + break; + } + } + + return ret; +} + +static int lmem_suspend(struct drm_i915_private *i915, bool allow_gpu, + bool backup_pinned) +{ + struct intel_memory_region *mr; + int ret = 0, id; + + for_each_memory_region(mr, i915, id) { + if (mr->type == INTEL_MEMORY_LOCAL) { + ret = i915_ttm_backup_region(mr, allow_gpu, backup_pinned); + if (ret) + break; + } + } + + return ret; +} + +static void lmem_recover(struct drm_i915_private *i915) +{ + struct intel_memory_region *mr; + int id; + + for_each_memory_region(mr, i915, id) + if (mr->type == INTEL_MEMORY_LOCAL) + i915_ttm_recover_region(mr); +} + +int i915_gem_backup_suspend(struct drm_i915_private *i915) +{ + int ret; + + /* Opportunistically try to evict unpinned objects */ + ret = lmem_suspend(i915, true, false); + if (ret) + goto out_recover; + + 
i915_gem_suspend(i915); + + /* +* More objects may have become unpinned as requests were +* retired. Now try to evict again. The gt may be wedged here +* in which case we automatically fall back to memcpy. +*/ + + ret = lmem_suspend(i915, true, false); + if (ret) +
[RFC PATCH] drm/ttm: Add a private member to the struct ttm_resource
Both the provider (resource manager) and the consumer (the TTM driver) want to subclass struct ttm_resource. Since this is left for the resource manager, we need to provide a private pointer for the TTM driver. Provide a struct ttm_resource_private for the driver to subclass for data with the same lifetime as the struct ttm_resource: In the i915 case it will, for example, be an sg-table and radix tree into the LMEM /VRAM pages that currently are awkwardly attached to the GEM object. Provide an ops structure for associated ops (Which is only destroy() ATM) It might seem pointless to provide a separate ops structure, but Linus has previously made it clear that that's the norm. After careful audit one could perhaps also on a per-driver basis replace the delete_mem_notify() TTM driver callback with the above destroy function. Cc: Matthew Auld Cc: König Christian Signed-off-by: Thomas Hellström --- drivers/gpu/drm/ttm/ttm_resource.c | 10 +++--- include/drm/ttm/ttm_resource.h | 28 2 files changed, 35 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/ttm/ttm_resource.c b/drivers/gpu/drm/ttm/ttm_resource.c index 2431717376e7..973e7c50bfed 100644 --- a/drivers/gpu/drm/ttm/ttm_resource.c +++ b/drivers/gpu/drm/ttm/ttm_resource.c @@ -57,13 +57,17 @@ int ttm_resource_alloc(struct ttm_buffer_object *bo, void ttm_resource_free(struct ttm_buffer_object *bo, struct ttm_resource **res) { struct ttm_resource_manager *man; + struct ttm_resource *resource = *res; - if (!*res) + if (!resource) return; - man = ttm_manager_type(bo->bdev, (*res)->mem_type); - man->func->free(man, *res); *res = NULL; + if (resource->priv) + resource->priv->ops.destroy(resource->priv); + + man = ttm_manager_type(bo->bdev, resource->mem_type); + man->func->free(man, resource); } EXPORT_SYMBOL(ttm_resource_free); diff --git a/include/drm/ttm/ttm_resource.h b/include/drm/ttm/ttm_resource.h index 140b6b9a8bbe..5a22c9a29c05 100644 --- a/include/drm/ttm/ttm_resource.h +++ b/include/drm/ttm/ttm_resource.h 
@@ -44,6 +44,7 @@ struct dma_buf_map; struct io_mapping; struct sg_table; struct scatterlist; +struct ttm_resource_private; struct ttm_resource_manager_func { /** @@ -153,6 +154,32 @@ struct ttm_bus_placement { enum ttm_cachingcaching; }; +/** + * struct ttm_resource_private_ops - Operations for a struct + * ttm_resource_private + * + * Not much benefit to keep this as a separate struct with only a single member, + * but keeping a separate ops struct is the norm. + */ +struct ttm_resource_private_ops { + /** +* destroy() - Callback to destroy the private data +* @priv - The private data to destroy +*/ + void (*destroy) (struct ttm_resource_private *priv); +}; + +/** + * struct ttm_resource_private - TTM driver private data + * @ops: Pointer to struct ttm_resource_private_ops with associated operations + * + * Intended to be subclassed to hold, for example cached data sharing the + * lifetime with a struct ttm_resource. + */ +struct ttm_resource_private { + const struct ttm_resource_private_ops ops; +}; + /** * struct ttm_resource * @@ -171,6 +198,7 @@ struct ttm_resource { uint32_t mem_type; uint32_t placement; struct ttm_bus_placement bus; + struct ttm_resource_private *priv; }; /** -- 2.31.1
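For readers following the subclassing argument, here is a minimal user-space sketch of how a driver might embed struct ttm_resource_private in a structure of its own and recover it in destroy(). It is not taken from the patch: every demo_* name is hypothetical, the structs are simplified stand-ins for the kernel ones, and ops is modelled as a pointer (as the kerneldoc words it) rather than as the embedded member shown in the diff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* User-space stand-ins for the structures in the patch, not the kernel ones. */
struct ttm_resource_private;

struct ttm_resource_private_ops {
	void (*destroy)(struct ttm_resource_private *priv);
};

struct ttm_resource_private {
	/* Assumption: a pointer to shared ops, as the kerneldoc describes. */
	const struct ttm_resource_private_ops *ops;
};

struct ttm_resource {
	struct ttm_resource_private *priv;
};

/* Hypothetical driver subclass carrying data with the resource's lifetime. */
struct demo_res_priv {
	struct ttm_resource_private base;
	int *cached_pages;	/* stands in for an sg-table or radix tree */
};

/* Stand-in for the kernel's container_of(). */
#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static int demo_destroy_calls;

static void demo_priv_destroy(struct ttm_resource_private *priv)
{
	struct demo_res_priv *p =
		demo_container_of(priv, struct demo_res_priv, base);

	free(p->cached_pages);
	free(p);
	demo_destroy_calls++;
}

static const struct ttm_resource_private_ops demo_priv_ops = {
	.destroy = demo_priv_destroy,
};

/* Attach private data to a resource; returns 0, or -1 if allocation fails. */
static int demo_attach_priv(struct ttm_resource *res, size_t npages)
{
	struct demo_res_priv *p = calloc(1, sizeof(*p));

	if (!p)
		return -1;
	p->cached_pages = calloc(npages, sizeof(*p->cached_pages));
	if (!p->cached_pages) {
		free(p);
		return -1;
	}
	p->base.ops = &demo_priv_ops;
	res->priv = &p->base;
	return 0;
}

/* Same ordering as the patched ttm_resource_free(): private data goes first. */
static void demo_resource_free(struct ttm_resource **res)
{
	struct ttm_resource *resource = *res;

	if (!resource)
		return;
	*res = NULL;
	if (resource->priv)
		resource->priv->ops->destroy(resource->priv);
	free(resource);
}

/* Runs one allocate/attach/free cycle; returns the destroy() call count. */
int demo_run(void)
{
	struct ttm_resource *res = calloc(1, sizeof(*res));

	if (!res)
		return -1;
	if (demo_attach_priv(res, 4)) {
		free(res);
		return -1;
	}
	demo_resource_free(&res);
	assert(res == NULL);	/* the caller's pointer is cleared */
	return demo_destroy_calls;
}
```

The point of the sketch is only the lifetime coupling: the private data is torn down from the same call that frees the resource, so the driver needs no separate bookkeeping on the GEM object.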
[PATCH 3/3] drm/vc4: dsi: Switch to devm_drm_of_get_bridge
The new devm_drm_of_get_bridge removes most of the boilerplate we have to deal with. Let's switch to it. Signed-off-by: Maxime Ripard --- drivers/gpu/drm/vc4/vc4_drv.c | 2 ++ drivers/gpu/drm/vc4/vc4_dsi.c | 28 2 files changed, 6 insertions(+), 24 deletions(-) diff --git a/drivers/gpu/drm/vc4/vc4_drv.c b/drivers/gpu/drm/vc4/vc4_drv.c index 16abc3a3d601..96c526f1022e 100644 --- a/drivers/gpu/drm/vc4/vc4_drv.c +++ b/drivers/gpu/drm/vc4/vc4_drv.c @@ -25,7 +25,9 @@ #include #include #include +#include #include +#include #include #include #include diff --git a/drivers/gpu/drm/vc4/vc4_dsi.c b/drivers/gpu/drm/vc4/vc4_dsi.c index a185027911ce..a229da58962a 100644 --- a/drivers/gpu/drm/vc4/vc4_dsi.c +++ b/drivers/gpu/drm/vc4/vc4_dsi.c @@ -1497,7 +1497,6 @@ static int vc4_dsi_bind(struct device *dev, struct device *master, void *data) struct drm_device *drm = dev_get_drvdata(master); struct vc4_dsi *dsi = dev_get_drvdata(dev); struct vc4_dsi_encoder *vc4_dsi_encoder; - struct drm_panel *panel; const struct of_device_id *match; dma_cap_mask_t dma_mask; int ret; @@ -1609,27 +1608,9 @@ static int vc4_dsi_bind(struct device *dev, struct device *master, void *data) return ret; } - ret = drm_of_find_panel_or_bridge(dev->of_node, 0, 0, - &panel, &dsi->bridge); - if (ret) { - /* If the bridge or panel pointed by dev->of_node is not -* enabled, just return 0 here so that we don't prevent the DRM -* dev from being registered. Of course that means the DSI -* encoder won't be exposed, but that's not a problem since -* nothing is connected to it. -*/ - if (ret == -ENODEV) - return 0; - - return ret; - } - - if (panel) { - dsi->bridge = devm_drm_panel_bridge_add_typed(dev, panel, - DRM_MODE_CONNECTOR_DSI); - if (IS_ERR(dsi->bridge)) - return PTR_ERR(dsi->bridge); - } + dsi->bridge = devm_drm_of_get_bridge(dev, dev->of_node, 0, 0); + if (IS_ERR(dsi->bridge)) + return PTR_ERR(dsi->bridge); /* The esc clock rate is supposed to always be 100Mhz. 
*/ ret = clk_set_rate(dsi->escape_clock, 100 * 1000000); @@ -1667,8 +1648,7 @@ static void vc4_dsi_unbind(struct device *dev, struct device *master, { struct vc4_dsi *dsi = dev_get_drvdata(dev); - if (dsi->bridge) - pm_runtime_disable(dev); + pm_runtime_disable(dev); /* * Restore the bridge_chain so the bridge detach procedure can happen -- 2.31.1
[PATCH 2/3] drm/vc4: dpi: Switch to devm_drm_of_get_bridge
The new devm_drm_of_get_bridge removes most of the boilerplate we have to deal with. Let's switch to it. Signed-off-by: Maxime Ripard --- drivers/gpu/drm/vc4/vc4_dpi.c | 15 --- 1 file changed, 4 insertions(+), 11 deletions(-) diff --git a/drivers/gpu/drm/vc4/vc4_dpi.c b/drivers/gpu/drm/vc4/vc4_dpi.c index a90f2545baee..c180eb60bee8 100644 --- a/drivers/gpu/drm/vc4/vc4_dpi.c +++ b/drivers/gpu/drm/vc4/vc4_dpi.c @@ -229,26 +229,19 @@ static const struct of_device_id vc4_dpi_dt_match[] = { static int vc4_dpi_init_bridge(struct vc4_dpi *dpi) { struct device *dev = &dpi->pdev->dev; - struct drm_panel *panel; struct drm_bridge *bridge; - int ret; - ret = drm_of_find_panel_or_bridge(dev->of_node, 0, 0, - &panel, &bridge); - if (ret) { + bridge = devm_drm_of_get_bridge(dev, dev->of_node, 0, 0); + if (IS_ERR(bridge)) { /* If nothing was connected in the DT, that's not an * error. */ - if (ret == -ENODEV) + if (PTR_ERR(bridge) == -ENODEV) return 0; else - return ret; + return PTR_ERR(bridge); } - if (panel) - bridge = drm_panel_bridge_add_typed(panel, - DRM_MODE_CONNECTOR_DPI); - return drm_bridge_attach(dpi->encoder, bridge, NULL, 0); } -- 2.31.1
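The error handling in vc4_dpi_init_bridge() relies on the kernel's error-pointer convention, where a failing lookup returns an errno encoded in the pointer itself. The following is a user-space sketch of that pattern, assuming simplified stand-ins: the demo_* helpers imitate ERR_PTR(), PTR_ERR() and IS_ERR(), and demo_get_bridge() is a hypothetical lookup, not devm_drm_of_get_bridge().

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define DEMO_MAX_ERRNO 4095

/* Stand-ins for the kernel's ERR_PTR(), PTR_ERR() and IS_ERR(). */
static inline void *demo_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

static inline long demo_ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

static inline int demo_is_err(const void *p)
{
	/* Error pointers occupy the top DEMO_MAX_ERRNO addresses. */
	return (uintptr_t)p >= (uintptr_t)-DEMO_MAX_ERRNO;
}

struct demo_bridge {
	int id;
};

/*
 * Hypothetical lookup: mode 0 means nothing is connected, mode 1 means a
 * bridge was found, anything else is a genuine failure.
 */
static struct demo_bridge *demo_get_bridge(int mode, struct demo_bridge *found)
{
	if (mode == 0)
		return demo_err_ptr(-ENODEV);
	if (mode == 1)
		return found;
	return demo_err_ptr(-EINVAL);
}

/* Mirrors the shape of vc4_dpi_init_bridge(): -ENODEV is not an error. */
int demo_init_bridge(int mode)
{
	static struct demo_bridge the_bridge = { .id = 1 };
	struct demo_bridge *bridge = demo_get_bridge(mode, &the_bridge);

	if (demo_is_err(bridge)) {
		if (demo_ptr_err(bridge) == -ENODEV)
			return 0;
		return (int)demo_ptr_err(bridge);
	}

	/* Stands in for drm_bridge_attach(); returns the bridge id here. */
	return bridge->id;
}

int demo_init_bridge_check(void)
{
	assert(demo_init_bridge(0) == 0);	/* nothing connected */
	assert(demo_init_bridge(1) == 1);	/* bridge attached */
	assert(demo_init_bridge(2) == -EINVAL);	/* real error propagated */
	return 0;
}
```

Folding the error into the returned pointer is what lets the helper have a single return value, which is why the separate ret variable disappears from the driver in this patch.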
[PATCH 1/3] drm/bridge: Add a function to abstract away panels
Display drivers so far need to have a lot of boilerplate to first retrieve either the panel or bridge that they are connected to using drm_of_find_panel_or_bridge(), and then either deal with each with ad-hoc functions or create a drm panel bridge through drm_panel_bridge_add. In order to reduce the boilerplate and hopefully create a path of least resistance towards using the DRM panel bridge layer, let's create the function devm_drm_of_get_bridge(). Signed-off-by: Maxime Ripard --- drivers/gpu/drm/drm_bridge.c | 42 drivers/gpu/drm/drm_of.c | 3 +++ include/drm/drm_bridge.h | 2 ++ 3 files changed, 43 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/drm_bridge.c b/drivers/gpu/drm/drm_bridge.c index a8ed66751c2d..10ddca4638b0 100644 --- a/drivers/gpu/drm/drm_bridge.c +++ b/drivers/gpu/drm/drm_bridge.c @@ -28,6 +28,7 @@ #include #include #include +#include #include #include "drm_crtc_internal.h" @@ -51,10 +52,8 @@ * * Display drivers are responsible for linking encoders with the first bridge * in the chains. This is done by acquiring the appropriate bridge with - * of_drm_find_bridge() or drm_of_find_panel_or_bridge(), or creating it for a - * panel with drm_panel_bridge_add_typed() (or the managed version - * devm_drm_panel_bridge_add_typed()). Once acquired, the bridge shall be - * attached to the encoder with a call to drm_bridge_attach(). + * devm_drm_of_get_bridge(). Once acquired, the bridge shall be attached to the + * encoder with a call to drm_bridge_attach(). * * Bridges are responsible for linking themselves with the next bridge in the * chain, if any.
This is done the same way as for encoders, with the call to @@ -1233,6 +1232,41 @@ struct drm_bridge *of_drm_find_bridge(struct device_node *np) return NULL; } EXPORT_SYMBOL(of_drm_find_bridge); + +/** + * devm_drm_of_get_bridge - Return next bridge in the chain + * @dev: device to tie the bridge lifetime to + * @np: device tree node containing encoder output ports + * @port: port in the device tree node + * @endpoint: endpoint in the device tree node + * + * Given a DT node's port and endpoint number, finds the connected node + * and returns the associated bridge if any, or creates and returns a + * drm panel bridge instance if a panel is connected. + * + * Returns a pointer to the bridge if successful, or an error pointer + * otherwise. + */ +struct drm_bridge *devm_drm_of_get_bridge(struct device *dev, + struct device_node *np, + unsigned int port, + unsigned int endpoint) +{ + struct drm_bridge *bridge; + struct drm_panel *panel; + int ret; + + ret = drm_of_find_panel_or_bridge(np, port, endpoint, + &panel, &bridge); + if (ret) + return ERR_PTR(ret); + + if (panel) + bridge = devm_drm_panel_bridge_add(dev, panel); + + return bridge; +} +EXPORT_SYMBOL(devm_drm_of_get_bridge); #endif MODULE_AUTHOR("Ajay Kumar "); diff --git a/drivers/gpu/drm/drm_of.c b/drivers/gpu/drm/drm_of.c index 997b8827fed2..37c34146eea8 100644 --- a/drivers/gpu/drm/drm_of.c +++ b/drivers/gpu/drm/drm_of.c @@ -231,6 +231,9 @@ EXPORT_SYMBOL_GPL(drm_of_encoder_active_endpoint); * return either the associated struct drm_panel or drm_bridge device. Either * @panel or @bridge must not be NULL. * + * This function is deprecated and should not be used in new drivers. Use + * devm_drm_of_get_bridge() instead. + * * Returns zero if successful, or one of the standard error codes if it fails. 
*/ int drm_of_find_panel_or_bridge(const struct device_node *np, diff --git a/include/drm/drm_bridge.h b/include/drm/drm_bridge.h index 46bdfa48c413..f70c88ca96ef 100644 --- a/include/drm/drm_bridge.h +++ b/include/drm/drm_bridge.h @@ -911,6 +911,8 @@ struct drm_bridge *devm_drm_panel_bridge_add(struct device *dev, struct drm_bridge *devm_drm_panel_bridge_add_typed(struct device *dev, struct drm_panel *panel, u32 connector_type); +struct drm_bridge *devm_drm_of_get_bridge(struct device *dev, struct device_node *node, + unsigned int port, unsigned int endpoint); struct drm_connector *drm_panel_bridge_connector(struct drm_bridge *bridge); #endif -- 2.31.1
[PATCH 0/3] drm/bridge: Create a function to abstract panels away
Hi,

This series used to be part of the DSI probe order series, but got removed
since it wasn't useful there anymore.

However, I still believe there is value in moving towards merging bridges and
panels by only making encoders (or upstream bridges) manipulate bridges.

The first patch creates a new helper that does just this by looking for a
bridge and a panel, and if a panel is found, creating a panel_bridge and
returning that bridge instead.

The next two patches convert the vc4 encoders to use it.

If it's accepted, I plan on converting all the relevant users over time.

Let me know what you think,
Maxime

Maxime Ripard (3):
  drm/bridge: Add a function to abstract away panels
  drm/vc4: dpi: Switch to devm_drm_of_get_bridge
  drm/vc4: dsi: Switch to devm_drm_of_get_bridge

 drivers/gpu/drm/drm_bridge.c  | 42 +++
 drivers/gpu/drm/drm_of.c      |  3 +++
 drivers/gpu/drm/vc4/vc4_dpi.c | 15 -
 drivers/gpu/drm/vc4/vc4_drv.c |  2 ++
 drivers/gpu/drm/vc4/vc4_dsi.c | 28 ---
 include/drm/drm_bridge.h      |  2 ++
 6 files changed, 53 insertions(+), 39 deletions(-)

-- 
2.31.1
Re: [PATCH v5 1/3] dt-bindings: Add YAML bindings for NVDEC
On Fri, 10 Sep 2021 13:42:45 +0300, Mikko Perttunen wrote:
> Add YAML device tree bindings for NVDEC, now in a more appropriate
> place compared to the old textual Host1x bindings.
>
> Signed-off-by: Mikko Perttunen
> ---
> v5:
> * Changed from nvidia,instance to nvidia,host1x-class optional
>   property.
> * Added dma-coherent
> v4:
> * Fix incorrect compatibility string in 'if' condition
> v3:
> * Drop host1x bindings
> * Change read2 to read-1 in interconnect names
> v2:
> * Fix issues pointed out in v1
> * Add T194 nvidia,instance property
> ---
>  .../gpu/host1x/nvidia,tegra210-nvdec.yaml | 104 ++
>  MAINTAINERS                               |   1 +
>  2 files changed, 105 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/gpu/host1x/nvidia,tegra210-nvdec.yaml
>

My bot found errors running 'make DT_CHECKER_FLAGS=-m dt_binding_check'
on your patch (DT_CHECKER_FLAGS is new in v5.13):

yamllint warnings/errors:
./Documentation/devicetree/bindings/gpu/host1x/nvidia,tegra210-nvdec.yaml:104:1: [warning] too many blank lines (2 > 1) (empty-lines)

dtschema/dtc warnings/errors:

doc reference errors (make refcheckdocs):

See https://patchwork.ozlabs.org/patch/1526459

This check can fail if there are any dependencies. The base for a patch
series is generally the most recent rc1.

If you already ran 'make dt_binding_check' and didn't see the above
error(s), then make sure 'yamllint' is installed and dt-schema is up to
date:

pip3 install dtschema --upgrade

Please check and re-submit.
Re: [Intel-gfx] [PATCH 0/6] i915: Simplify mmio handling & add new DG2 shadow table
On 10/09/2021 06:33, Matt Roper wrote:
> Our uncore MMIO functions for reading/writing registers have become very
> complicated over time. There's significant macro magic used to generate
> several nearly-identical functions that only really differ in terms of
> which platform-specific shadow register table they should check on write
> operations. We can significantly simplify our MMIO handlers by storing
> a reference to the current platform's shadow table within the 'struct
> intel_uncore' the same way we already do for forcewake; this allows us
> to consolidate the multiple variants of each 'write' function down to
> just a single 'fwtable' version that gets the shadow table out of the
> uncore struct rather than hardcoding the name of a specific platform's
> table. We can do similar consolidation on the MMIO read side by
> creating a single-entry forcewake table to replace the open-coded range
> check they had been using previously.
>
> The final patch of the series adds a new shadow table for DG2; this
> becomes quite clean and simple now, given the refactoring in the first
> five patches.

Tidy and it ends up saving kernel binary size.

However I am undecided yet, because one thing to note is that the trade
off is source code and kernel text consolidation at the expense of more
indirect calls at runtime and larger common read/write functions.

To expand, current code generates a bunch of per gen functions but in
doing so it manages to inline a bunch of checks like NEEDS_FORCE_WAKE
and BSEARCH (from find_fw_domain) so at runtime each platform mmio
read/write does not have to do indirect calls to do lookups.

It may matter a lot in the grand scheme of things but this trade off is
something to note in the cover letter I think.

Regards,

Tvrtko

> Matt Roper (6):
>   drm/i915/uncore: Convert gen6/gen7 read operations to fwtable
>   drm/i915/uncore: Associate shadow table with uncore
>   drm/i915/uncore: Replace gen8 write functions with general fwtable
>   drm/i915/uncore: Drop gen11/gen12 mmio write handlers
>   drm/i915/uncore: Drop gen11 mmio read handlers
>   drm/i915/dg2: Add DG2-specific shadow register table
>
>  drivers/gpu/drm/i915/intel_uncore.c           | 190 ++
>  drivers/gpu/drm/i915/intel_uncore.h           |   7 +
>  drivers/gpu/drm/i915/selftests/intel_uncore.c |   1 +
>  3 files changed, 110 insertions(+), 88 deletions(-)
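To make the trade-off under discussion concrete, the following is a rough user-space sketch of the consolidated approach: one lookup routine that binary-searches whichever sorted range table the uncore structure points at, instead of one generated function per platform. All demo_* names and the table offsets are invented for illustration; this is not the i915 find_fw_domain() implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum demo_fw_domains {
	DEMO_FW_RENDER = 1u << 0,
	DEMO_FW_MEDIA  = 1u << 1,
};

struct demo_fw_range {
	uint32_t start;
	uint32_t end;		/* inclusive */
	unsigned int domains;
};

/* Must be sorted by offset and non-overlapping. The values are made up. */
static const struct demo_fw_range demo_ranges[] = {
	{ 0x2000, 0x3fff, DEMO_FW_RENDER },
	{ 0x5000, 0x7fff, DEMO_FW_RENDER },
	{ 0x9000, 0xafff, DEMO_FW_RENDER | DEMO_FW_MEDIA },
	{ 0xd000, 0xd7ff, DEMO_FW_MEDIA },
};

/* The per-platform table hangs off the device, as the series proposes. */
struct demo_uncore {
	const struct demo_fw_range *fw_ranges;
	size_t fw_ranges_count;
};

const struct demo_uncore demo_uncore_example = {
	.fw_ranges = demo_ranges,
	.fw_ranges_count = sizeof(demo_ranges) / sizeof(demo_ranges[0]),
};

/* Binary search; offsets outside every range need no forcewake (0). */
unsigned int demo_find_fw_domain(const struct demo_uncore *uncore,
				 uint32_t offset)
{
	size_t lo = 0, hi = uncore->fw_ranges_count;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;
		const struct demo_fw_range *r = &uncore->fw_ranges[mid];

		if (offset < r->start)
			hi = mid;
		else if (offset > r->end)
			lo = mid + 1;
		else
			return r->domains;
	}
	return 0;
}

int demo_find_fw_domain_check(void)
{
	const struct demo_uncore *u = &demo_uncore_example;

	assert(demo_find_fw_domain(u, 0x2000) == DEMO_FW_RENDER);
	assert(demo_find_fw_domain(u, 0x4000) == 0);
	assert(demo_find_fw_domain(u, 0x9abc) ==
	       (DEMO_FW_RENDER | DEMO_FW_MEDIA));
	assert(demo_find_fw_domain(u, 0xd800) == 0);
	return 0;
}
```

In this shape every read or write pays for a table dereference and a search, which is the runtime cost being weighed against the smaller source and kernel text.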
Re: [Intel-gfx] [PATCH 1/6] drm/i915/uncore: Convert gen6/gen7 read operations to fwtable
On 10/09/2021 06:33, Matt Roper wrote: On gen6-gen8 (except vlv/chv) we don't use a forcewake lookup table; we simply check whether the register offset is < 0x4, and return FORCEWAKE_RENDER if it is. To prepare for upcoming refactoring, let's define a single-entry forcewake table from [0x0, 0x3] and switch these platforms over to use the fwtable reader functions. Signed-off-by: Matt Roper --- drivers/gpu/drm/i915/intel_uncore.c | 11 --- 1 file changed, 8 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c index f9767054dbdf..7f92f12d95f2 100644 --- a/drivers/gpu/drm/i915/intel_uncore.c +++ b/drivers/gpu/drm/i915/intel_uncore.c @@ -1064,6 +1064,10 @@ gen6_reg_write_fw_domains(struct intel_uncore *uncore, i915_reg_t reg) __fwd; \ }) Is __gen6_reg_read_fw_domains left orphaned somewhere around here or in a later patch? Regards, Tvrtko +static const struct intel_forcewake_range __gen6_fw_ranges[] = { + GEN_FW_RANGE(0x0, 0x3, FORCEWAKE_RENDER), +}; + /* *Must* be sorted by offset ranges! See intel_fw_table_check(). 
*/ static const struct intel_forcewake_range __chv_fw_ranges[] = { GEN_FW_RANGE(0x2000, 0x3fff, FORCEWAKE_RENDER), @@ -1623,7 +1627,6 @@ __gen_read(func, 64) __gen_reg_read_funcs(gen11_fwtable); __gen_reg_read_funcs(fwtable); -__gen_reg_read_funcs(gen6); #undef __gen_reg_read_funcs #undef GEN6_READ_FOOTER @@ -2111,15 +2114,17 @@ static int uncore_forcewake_init(struct intel_uncore *uncore) ASSIGN_WRITE_MMIO_VFUNCS(uncore, fwtable); ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable); } else if (GRAPHICS_VER(i915) == 8) { + ASSIGN_FW_DOMAINS_TABLE(uncore, __gen6_fw_ranges); ASSIGN_WRITE_MMIO_VFUNCS(uncore, gen8); - ASSIGN_READ_MMIO_VFUNCS(uncore, gen6); + ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable); } else if (IS_VALLEYVIEW(i915)) { ASSIGN_FW_DOMAINS_TABLE(uncore, __vlv_fw_ranges); ASSIGN_WRITE_MMIO_VFUNCS(uncore, gen6); ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable); } else if (IS_GRAPHICS_VER(i915, 6, 7)) { + ASSIGN_FW_DOMAINS_TABLE(uncore, __gen6_fw_ranges); ASSIGN_WRITE_MMIO_VFUNCS(uncore, gen6); - ASSIGN_READ_MMIO_VFUNCS(uncore, gen6); + ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable); } uncore->pmic_bus_access_nb.notifier_call = i915_pmic_bus_access_notifier;
Re: [Intel-gfx] [PATCH 5/6] drm/i915/uncore: Drop gen11 mmio read handlers
On 10/09/2021 06:33, Matt Roper wrote: Consolidate down to just a single 'fwtable' implementation. For reads we don't need to worry about shadow tables. Also, the NEEDS_FORCE_WAKE() check we previously had in the fwtable implementation can be dropped --- if a register is outside that range on one of the old platforms, then it won't belong to any forcewake range and 0 will be returned anyway. Signed-off-by: Matt Roper --- drivers/gpu/drm/i915/intel_uncore.c | 45 +++-- 1 file changed, 17 insertions(+), 28 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c index c181e74fbf43..95398cb69722 100644 --- a/drivers/gpu/drm/i915/intel_uncore.c +++ b/drivers/gpu/drm/i915/intel_uncore.c @@ -935,14 +935,6 @@ static const struct intel_forcewake_range __vlv_fw_ranges[] = { }; #define __fwtable_reg_read_fw_domains(uncore, offset) \ -({ \ - enum forcewake_domains __fwd = 0; \ - if (NEEDS_FORCE_WAKE((offset))) \ - __fwd = find_fw_domain(uncore, offset); \ - __fwd; \ -}) - -#define __gen11_fwtable_reg_read_fw_domains(uncore, offset) \ find_fw_domain(uncore, offset) Looks like you can drop this macro and just call find_fw_domain or you think there is value to keep it? Regards, Tvrtko /* *Must* be sorted by offset! See intel_shadow_table_check(). 
*/ @@ -1577,33 +1569,30 @@ static inline void __force_wake_auto(struct intel_uncore *uncore, ___force_wake_auto(uncore, fw_domains); } -#define __gen_read(func, x) \ +#define __gen_fwtable_read(x) \ static u##x \ -func##_read##x(struct intel_uncore *uncore, i915_reg_t reg, bool trace) { \ +fwtable_read##x(struct intel_uncore *uncore, i915_reg_t reg, bool trace) \ +{ \ enum forcewake_domains fw_engine; \ GEN6_READ_HEADER(x); \ - fw_engine = __##func##_reg_read_fw_domains(uncore, offset); \ + fw_engine = __fwtable_reg_read_fw_domains(uncore, offset); \ if (fw_engine) \ __force_wake_auto(uncore, fw_engine); \ val = __raw_uncore_read##x(uncore, reg); \ GEN6_READ_FOOTER; \ } -#define __gen_reg_read_funcs(func) \ -static enum forcewake_domains \ -func##_reg_read_fw_domains(struct intel_uncore *uncore, i915_reg_t reg) { \ - return __##func##_reg_read_fw_domains(uncore, i915_mmio_reg_offset(reg)); \ -} \ -\ -__gen_read(func, 8) \ -__gen_read(func, 16) \ -__gen_read(func, 32) \ -__gen_read(func, 64) +static enum forcewake_domains +fwtable_reg_read_fw_domains(struct intel_uncore *uncore, i915_reg_t reg) { + return __fwtable_reg_read_fw_domains(uncore, i915_mmio_reg_offset(reg)); +} -__gen_reg_read_funcs(gen11_fwtable); -__gen_reg_read_funcs(fwtable); +__gen_fwtable_read(8) +__gen_fwtable_read(16) +__gen_fwtable_read(32) +__gen_fwtable_read(64) -#undef __gen_reg_read_funcs +#undef __gen_fwtable_read #undef GEN6_READ_FOOTER #undef GEN6_READ_HEADER @@ -2069,22 +2058,22 @@ static int uncore_forcewake_init(struct intel_uncore *uncore) ASSIGN_FW_DOMAINS_TABLE(uncore, __dg2_fw_ranges); ASSIGN_SHADOW_TABLE(uncore, gen12_shadowed_regs); ASSIGN_WRITE_MMIO_VFUNCS(uncore, fwtable); - ASSIGN_READ_MMIO_VFUNCS(uncore, gen11_fwtable); + ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable); } else if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) { ASSIGN_FW_DOMAINS_TABLE(uncore, __xehp_fw_ranges); ASSIGN_SHADOW_TABLE(uncore, gen12_shadowed_regs); ASSIGN_WRITE_MMIO_VFUNCS(uncore, fwtable); - 
ASSIGN_READ_MMIO_VFUNCS(uncore, gen11_fwtable); + ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable); } else if (GRAPHICS_VER(i915) >= 12) { ASSIGN_FW_DOMAINS_TABLE(uncore, __gen12_fw_ranges); ASSIGN_SHADOW_TABLE(uncore, gen12_shadowed_regs); ASSIGN_WRITE_MMIO_VFUNCS(uncore, fwtable); - ASSIGN_READ_MMIO_VFUNCS(uncore, gen11_fwtable); + ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable); } else if (GRAPHICS_VER(i915) == 11) { ASSIGN_FW_DOMAINS_TABLE(uncore, __gen11_fw_ranges); ASSIGN_SHADOW_TABLE(uncore, gen11_shadowed_regs); ASSIGN_WRITE_MMIO_VFUNCS(uncore, fwtable); - ASSIGN_READ_MMIO_VFUNCS(uncore, gen11_fwtable); + ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable); } else if (IS_GRAPHICS_VER(i915, 9, 10)) { ASSIGN_FW_DOMAINS_TABLE(uncore, __gen9_fw_ranges); ASSIGN_SHADOW_TABLE(uncore, gen8_shadowed_regs);
[Bug 213391] AMDGPU retries page fault with some specific processes amdgpu and sometimes followed [gfxhub0] retry page fault until *ERROR* ring gfx timeout, but soft recovered
https://bugzilla.kernel.org/show_bug.cgi?id=213391

--- Comment #36 from Lahfa Samy (s...@lahfa.xyz) ---
Did anyone test whether this has been fixed in newer firmware updates, or should we still stay on version 20210315.3568f96-3?
Re: [PATCH RESEND] drm/i915: Mark GPU wedging on driver unregister unrecoverable
On 03.09.2021 16:28, Janusz Krzysztofik wrote:
> The GPU wedged flag, now set on driver unregister to prevent further use
> of the GPU, can then be cleared unintentionally by a call to
> __intel_gt_unset_wedged() made before the flag is finally marked
> unrecoverable. We need to have it marked unrecoverable earlier.
> Implement that by replacing a call to intel_gt_set_wedged() in
> intel_gt_driver_unregister() with intel_gt_set_wedged_on_fini().
>
> With the above in place, intel_gt_set_wedged_on_fini() is now called
> twice on driver remove, the second time from __intel_gt_disable(). This
> seems harmless, while dropping intel_gt_set_wedged_on_fini() from
> __intel_gt_disable() proved to break some driver probe error unwind
> paths as well as the mock selftest exit path.
>
> Signed-off-by: Janusz Krzysztofik
> Cc: Michał Winiarski

Reviewed-by: Michał Winiarski

-Michał

> ---
> Resending with Cc: dri-devel@lists.freedesktop.org as requested.
>
> Thanks,
> Janusz
>
>  drivers/gpu/drm/i915/gt/intel_gt.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c
> index 62d40c986642..173b53cb2b47 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt.c
> +++ b/drivers/gpu/drm/i915/gt/intel_gt.c
> @@ -750,7 +750,7 @@ void intel_gt_driver_unregister(struct intel_gt *gt)
>  	 * all in-flight requests so that we can quickly unbind the active
>  	 * resources.
>  	 */
> -	intel_gt_set_wedged(gt);
> +	intel_gt_set_wedged_on_fini(gt);
>
>  	/* Scrub all HW state upon release */
>  	with_intel_runtime_pm(gt->uncore->rpm, wakeref)
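The race described in the commit message can be modelled with two flag bits. This is a simplified sketch under the assumption that "unrecoverable" is a second bit that makes unset refuse to clear the wedged bit; the demo_* names are hypothetical and the real i915 functions do considerably more work.

```c
#include <assert.h>
#include <stdbool.h>

enum {
	DEMO_WEDGED         = 1u << 0,
	DEMO_WEDGED_ON_FINI = 1u << 1,	/* assumed "unrecoverable" marker */
};

struct demo_gt {
	unsigned int reset_flags;
};

void demo_set_wedged(struct demo_gt *gt)
{
	gt->reset_flags |= DEMO_WEDGED;
}

/* Wedge and mark the state unrecoverable in one step. */
void demo_set_wedged_on_fini(struct demo_gt *gt)
{
	demo_set_wedged(gt);
	gt->reset_flags |= DEMO_WEDGED_ON_FINI;
}

/* Returns true if the wedged state was cleared. */
bool demo_unset_wedged(struct demo_gt *gt)
{
	if (gt->reset_flags & DEMO_WEDGED_ON_FINI)
		return false;
	gt->reset_flags &= ~(unsigned int)DEMO_WEDGED;
	return true;
}

bool demo_is_wedged(const struct demo_gt *gt)
{
	return (gt->reset_flags & DEMO_WEDGED) != 0;
}

int demo_wedge_check(void)
{
	struct demo_gt before = { 0 }, after = { 0 };

	/* Plain wedge on unregister: a later unset clears it again. */
	demo_set_wedged(&before);
	assert(demo_unset_wedged(&before));
	assert(!demo_is_wedged(&before));

	/* Wedge-on-fini on unregister: the unset is refused. */
	demo_set_wedged_on_fini(&after);
	assert(!demo_unset_wedged(&after));
	assert(demo_is_wedged(&after));
	return 0;
}
```

Calling demo_set_wedged_on_fini() a second time only sets bits that are already set, which matches the observation that the double call on driver remove seems harmless.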
Re: [Intel-gfx] [PATCH 23/27] drm/i915/guc: Implement no mid batch preemption for multi-lrc
On 20/08/2021 23:44, Matthew Brost wrote: For some users of multi-lrc, e.g. split frame, it isn't safe to preempt mid BB. To safely enable preemption at the BB boundary, a handshake between to parent and child is needed. This is implemented via custom emit_bb_start & emit_fini_breadcrumb functions and enabled via by default if a context is configured by set parallel extension. FWIW I think it's wrong to hardcode the requirements of a particular hardware generation fixed media pipeline into the uapi. IMO better solution was when concept of parallel submission was decoupled from the no preemption mid batch preambles. Otherwise might as well call the extension I915_CONTEXT_ENGINES_EXT_MEDIA_SPLIT_FRAME_SUBMIT or something. Regards, Tvrtko Signed-off-by: Matthew Brost --- drivers/gpu/drm/i915/gt/intel_context.c | 2 +- drivers/gpu/drm/i915/gt/intel_context_types.h | 3 + drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h | 2 +- .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 283 +- 4 files changed, 287 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c index 5615be32879c..2de62649e275 100644 --- a/drivers/gpu/drm/i915/gt/intel_context.c +++ b/drivers/gpu/drm/i915/gt/intel_context.c @@ -561,7 +561,7 @@ void intel_context_bind_parent_child(struct intel_context *parent, GEM_BUG_ON(intel_context_is_child(child)); GEM_BUG_ON(intel_context_is_parent(child)); - parent->guc_number_children++; + child->guc_child_index = parent->guc_number_children++; list_add_tail(&child->guc_child_link, &parent->guc_child_list); child->parent = parent; diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h b/drivers/gpu/drm/i915/gt/intel_context_types.h index 713d85b0b364..727f91e7f7c2 100644 --- a/drivers/gpu/drm/i915/gt/intel_context_types.h +++ b/drivers/gpu/drm/i915/gt/intel_context_types.h @@ -246,6 +246,9 @@ struct intel_context { /** @guc_number_children: number of children if parent */ u8 guc_number_children; + /** 
@guc_child_index: index into guc_child_list if child */ + u8 guc_child_index; + /** * @parent_page: page in context used by parent for work queue, * work queue descriptor diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h b/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h index 6cd26dc060d1..9f61cfa5566a 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h @@ -188,7 +188,7 @@ struct guc_process_desc { u32 wq_status; u32 engine_presence; u32 priority; - u32 reserved[30]; + u32 reserved[36]; } __packed; #define CONTEXT_REGISTRATION_FLAG_KMD BIT(0) diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c index 91330525330d..1a18f99bf12a 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c @@ -11,6 +11,7 @@ #include "gt/intel_context.h" #include "gt/intel_engine_pm.h" #include "gt/intel_engine_heartbeat.h" +#include "gt/intel_gpu_commands.h" #include "gt/intel_gt.h" #include "gt/intel_gt_irq.h" #include "gt/intel_gt_pm.h" @@ -366,10 +367,14 @@ static struct i915_priolist *to_priolist(struct rb_node *rb) /* * When using multi-lrc submission an extra page in the context state is - * reserved for the process descriptor and work queue. + * reserved for the process descriptor, work queue, and preempt BB boundary + * handshake between the parent + childlren contexts. * * The layout of this page is below: * 0 guc_process_desc + * + sizeof(struct guc_process_desc) child go + * + CACHELINE_BYTES child join ... + * + CACHELINE_BYTES ... 
* ...unused * PAGE_SIZE / 2 work queue start * ...work queue @@ -1785,6 +1790,30 @@ static int deregister_context(struct intel_context *ce, u32 guc_id, bool loop) return __guc_action_deregister_context(guc, guc_id, loop); } +static inline void clear_children_join_go_memory(struct intel_context *ce) +{ + u32 *mem = (u32 *)(__get_process_desc(ce) + 1); + u8 i; + + for (i = 0; i < ce->guc_number_children + 1; ++i) + mem[i * (CACHELINE_BYTES / sizeof(u32))] = 0; +} + +static inline u32 get_children_go_value(struct intel_context *ce) +{ + u32 *mem = (u32 *)(__get_process_desc(ce) + 1); + + return mem[0]; +} + +static inline u32 get_children_join_value(struct intel_context *ce, + u8 child_index) +{ + u32 *mem = (u32 *)(__get_process_desc(ce) + 1); + +
Re: [Intel-gfx] [PATCH 08/27] drm/i915: Add logical engine mapping
On 20/08/2021 23:44, Matthew Brost wrote: Add logical engine mapping. This is required for split-frame, as workloads need to be placed on engines in a logically contiguous manner. v2: (Daniel Vetter) - Add kernel doc for new fields Signed-off-by: Matthew Brost --- drivers/gpu/drm/i915/gt/intel_engine_cs.c | 60 --- drivers/gpu/drm/i915/gt/intel_engine_types.h | 5 ++ .../drm/i915/gt/intel_execlists_submission.c | 1 + drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c| 2 +- .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 21 +-- 5 files changed, 60 insertions(+), 29 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c index 0d9105a31d84..4d790f9a65dd 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c @@ -290,7 +290,8 @@ static void nop_irq_handler(struct intel_engine_cs *engine, u16 iir) GEM_DEBUG_WARN_ON(iir); } -static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id) +static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id, + u8 logical_instance) { const struct engine_info *info = &intel_engines[id]; struct drm_i915_private *i915 = gt->i915; @@ -334,6 +335,7 @@ static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id) engine->class = info->class; engine->instance = info->instance; + engine->logical_mask = BIT(logical_instance); __sprint_engine_name(engine); engine->props.heartbeat_interval_ms = @@ -572,6 +574,37 @@ static intel_engine_mask_t init_engine_mask(struct intel_gt *gt) return info->engine_mask; } +static void populate_logical_ids(struct intel_gt *gt, u8 *logical_ids, +u8 class, const u8 *map, u8 num_instances) +{ + int i, j; + u8 current_logical_id = 0; + + for (j = 0; j < num_instances; ++j) { + for (i = 0; i < ARRAY_SIZE(intel_engines); ++i) { + if (!HAS_ENGINE(gt, i) || + intel_engines[i].class != class) + continue; + + if (intel_engines[i].instance == map[j]) { + 
logical_ids[intel_engines[i].instance] = + current_logical_id++; + break; + } + } + } +} + +static void setup_logical_ids(struct intel_gt *gt, u8 *logical_ids, u8 class) +{ + int i; + u8 map[MAX_ENGINE_INSTANCE + 1]; + + for (i = 0; i < MAX_ENGINE_INSTANCE + 1; ++i) + map[i] = i;

What's the point of the map array since it is 1:1 with instance?

+ populate_logical_ids(gt, logical_ids, class, map, ARRAY_SIZE(map)); +} + /** * intel_engines_init_mmio() - allocate and prepare the Engine Command Streamers * @gt: pointer to struct intel_gt @@ -583,7 +616,8 @@ int intel_engines_init_mmio(struct intel_gt *gt) struct drm_i915_private *i915 = gt->i915; const unsigned int engine_mask = init_engine_mask(gt); unsigned int mask = 0; - unsigned int i; + unsigned int i, class; + u8 logical_ids[MAX_ENGINE_INSTANCE + 1]; int err; drm_WARN_ON(&i915->drm, engine_mask == 0); @@ -593,15 +627,23 @@ int intel_engines_init_mmio(struct intel_gt *gt) if (i915_inject_probe_failure(i915)) return -ENODEV; - for (i = 0; i < ARRAY_SIZE(intel_engines); i++) { - if (!HAS_ENGINE(gt, i)) - continue; + for (class = 0; class < MAX_ENGINE_CLASS + 1; ++class) { + setup_logical_ids(gt, logical_ids, class); - err = intel_engine_setup(gt, i); - if (err) - goto cleanup; + for (i = 0; i < ARRAY_SIZE(intel_engines); ++i) { + u8 instance = intel_engines[i].instance; + + if (intel_engines[i].class != class || + !HAS_ENGINE(gt, i)) + continue; - mask |= BIT(i); + err = intel_engine_setup(gt, i, +logical_ids[instance]); + if (err) + goto cleanup; + + mask |= BIT(i);

I still think there is a less clunky way to set this up, in less code and more readable at the same time. Like do it in two passes so you can iterate gt->engine_class[] array instead of having to implement a skip condition (both on class and HAS_ENGINE at two places) and also avoid walking the flat intel_engines array recursively.

+ } } /* diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h index ed91bcff20eb..fddf35546b58 10
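As an illustration of the review comment about the 1:1 map, here is a user-space sketch of what the logical id assignment reduces to if the map really is the identity: walk the instances of one class in order and hand out contiguous ids to the ones that are present. The demo_* names, the instance limit and the "no id" marker are assumptions for the example, not i915 definitions.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_MAX_INSTANCE	7
#define DEMO_NO_LOGICAL_ID	0xff	/* marker for absent instances */

/*
 * Assign contiguous logical ids to the present instances of one engine
 * class. Bit N of present_mask is set if physical instance N exists.
 * Returns the number of logical ids handed out.
 */
unsigned int demo_assign_logical_ids(unsigned int present_mask,
				     uint8_t logical_ids[DEMO_MAX_INSTANCE + 1])
{
	unsigned int next = 0;
	unsigned int i;

	for (i = 0; i <= DEMO_MAX_INSTANCE; i++) {
		if (present_mask & (1u << i))
			logical_ids[i] = (uint8_t)next++;
		else
			logical_ids[i] = DEMO_NO_LOGICAL_ID;
	}
	return next;
}

int demo_logical_ids_check(void)
{
	uint8_t ids[DEMO_MAX_INSTANCE + 1];

	/* Instances 0, 2 and 3 present; instance 1 absent (mask 0b1101). */
	assert(demo_assign_logical_ids(0xd, ids) == 3);
	assert(ids[0] == 0 && ids[2] == 1 && ids[3] == 2);
	assert(ids[1] == DEMO_NO_LOGICAL_ID);
	return 0;
}
```

A non-identity map would only matter on hardware where the logical order differs from the physical instance order; the sketch says nothing about whether such hardware exists.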
[PATCH v4 24/24] drm/exynos: dsi: Adjust probe order
Without proper care and an agreement between how DSI hosts and device
drivers register their MIPI-DSI entities and potential components, we can
end up in a situation where the drivers can never probe.

Most drivers were taking evasive maneuvers to try to work around this,
but not all of them were following the same conventions, resulting in
various incompatibilities between DSI hosts and devices.

Now that we have a sequence agreed upon and documented, let's convert
exynos to it.

Signed-off-by: Maxime Ripard
---
 drivers/gpu/drm/exynos/exynos_drm_dsi.c | 19 ++++++++++++-------
 1 file changed, 12 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/exynos/exynos_drm_dsi.c b/drivers/gpu/drm/exynos/exynos_drm_dsi.c
index e39fac889edc..dfda2b259c44 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_dsi.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_dsi.c
@@ -1529,6 +1529,7 @@ static const struct drm_encoder_helper_funcs exynos_dsi_encoder_helper_funcs = {
 
 MODULE_DEVICE_TABLE(of, exynos_dsi_of_match);
 
+static const struct component_ops exynos_dsi_component_ops;
 static int exynos_dsi_host_attach(struct mipi_dsi_host *host,
 				  struct mipi_dsi_device *device)
 {
@@ -1536,6 +1537,7 @@ static int exynos_dsi_host_attach(struct mipi_dsi_host *host,
 	struct drm_encoder *encoder = &dsi->encoder;
 	struct drm_device *drm = encoder->dev;
 	struct drm_bridge *out_bridge;
+	struct device *dev = host->dev;
 
 	out_bridge = of_drm_find_bridge(device->dev.of_node);
 	if (out_bridge) {
@@ -1585,7 +1587,7 @@ static int exynos_dsi_host_attach(struct mipi_dsi_host *host,
 	if (drm->mode_config.poll_enabled)
 		drm_kms_helper_hotplug_event(drm);
 
-	return 0;
+	return component_add(dev, &exynos_dsi_component_ops);
 }
 
 static int exynos_dsi_host_detach(struct mipi_dsi_host *host,
@@ -1593,6 +1595,9 @@ static int exynos_dsi_host_detach(struct mipi_dsi_host *host,
 {
 	struct exynos_dsi *dsi = host_to_dsi(host);
 	struct drm_device *drm = dsi->encoder.dev;
+	struct device *dev = host->dev;
+
+	component_del(dev, &exynos_dsi_component_ops);
 
 	if (dsi->panel) {
 		mutex_lock(&drm->mode_config.mutex);
@@ -1716,7 +1721,7 @@ static int exynos_dsi_bind(struct device *dev, struct device *master,
 		of_node_put(in_bridge_node);
 	}
 
-	return mipi_dsi_host_register(&dsi->dsi_host);
+	return 0;
 }
 
 static void exynos_dsi_unbind(struct device *dev, struct device *master,
@@ -1726,8 +1731,6 @@ static void exynos_dsi_unbind(struct device *dev, struct device *master,
 	struct drm_encoder *encoder = &dsi->encoder;
 
 	exynos_dsi_disable(encoder);
-
-	mipi_dsi_host_unregister(&dsi->dsi_host);
 }
 
 static const struct component_ops exynos_dsi_component_ops = {
@@ -1821,7 +1824,7 @@ static int exynos_dsi_probe(struct platform_device *pdev)
 
 	pm_runtime_enable(dev);
 
-	ret = component_add(dev, &exynos_dsi_component_ops);
+	ret = mipi_dsi_host_register(&dsi->dsi_host);
 	if (ret)
 		goto err_disable_runtime;
 
@@ -1835,10 +1838,12 @@ static int exynos_dsi_probe(struct platform_device *pdev)
 
 static int exynos_dsi_remove(struct platform_device *pdev)
 {
+	struct exynos_dsi *dsi = platform_get_drvdata(pdev);
+
+	mipi_dsi_host_unregister(&dsi->dsi_host);
+
 	pm_runtime_disable(&pdev->dev);
 
-	component_del(&pdev->dev, &exynos_dsi_component_ops);
-
 	return 0;
 }
 
--
2.31.1
[PATCH v4 23/24] drm/kirin: dsi: Adjust probe order
Without proper care and an agreement between how DSI hosts and device
drivers register their MIPI-DSI entities and potential components, we can
end up in a situation where the drivers can never probe.

Most drivers were taking evasive maneuvers to try to work around this,
but not all of them were following the same conventions, resulting in
various incompatibilities between DSI hosts and devices.

Now that we have a sequence agreed upon and documented, let's convert
kirin to it.

Signed-off-by: Maxime Ripard
---
 drivers/gpu/drm/hisilicon/kirin/dw_drm_dsi.c | 27 +++-
 1 file changed, 20 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/hisilicon/kirin/dw_drm_dsi.c b/drivers/gpu/drm/hisilicon/kirin/dw_drm_dsi.c
index 952cfdb1961d..be20c2ffe798 100644
--- a/drivers/gpu/drm/hisilicon/kirin/dw_drm_dsi.c
+++ b/drivers/gpu/drm/hisilicon/kirin/dw_drm_dsi.c
@@ -720,10 +720,13 @@ static int dw_drm_encoder_init(struct device *dev,
 	return 0;
 }
 
+static const struct component_ops dsi_ops;
 static int dsi_host_attach(struct mipi_dsi_host *host,
 			   struct mipi_dsi_device *mdsi)
 {
 	struct dw_dsi *dsi = host_to_dsi(host);
+	struct device *dev = host->dev;
+	int ret;
 
 	if (mdsi->lanes < 1 || mdsi->lanes > 4) {
 		DRM_ERROR("dsi device params invalid\n");
@@ -734,13 +737,20 @@ static int dsi_host_attach(struct mipi_dsi_host *host,
 	dsi->format = mdsi->format;
 	dsi->mode_flags = mdsi->mode_flags;
 
+	ret = component_add(dev, &dsi_ops);
+	if (ret)
+		return ret;
+
 	return 0;
 }
 
 static int dsi_host_detach(struct mipi_dsi_host *host,
 			   struct mipi_dsi_device *mdsi)
 {
-	/* do nothing */
+	struct device *dev = host->dev;
+
+	component_del(dev, &dsi_ops);
+
 	return 0;
 }
 
@@ -785,10 +795,6 @@ static int dsi_bind(struct device *dev, struct device *master, void *data)
 	if (ret)
 		return ret;
 
-	ret = dsi_host_init(dev, dsi);
-	if (ret)
-		return ret;
-
 	ret = dsi_bridge_init(drm_dev, dsi);
 	if (ret)
 		return ret;
@@ -859,12 +865,19 @@ static int dsi_probe(struct platform_device *pdev)
 
 	platform_set_drvdata(pdev, data);
 
-	return component_add(&pdev->dev, &dsi_ops);
+	ret = dsi_host_init(&pdev->dev, dsi);
+	if (ret)
+		return ret;
+
+	return 0;
 }
 
 static int dsi_remove(struct platform_device *pdev)
 {
-	component_del(&pdev->dev, &dsi_ops);
+	struct dsi_data *data = platform_get_drvdata(pdev);
+	struct dw_dsi *dsi = &data->dsi;
+
+	mipi_dsi_host_unregister(&dsi->host);
 
 	return 0;
 }
--
2.31.1
[PATCH v4 22/24] drm/bridge: tc358775: Register and attach our DSI device at probe
In order to avoid any probe ordering issue, the best practice is to move
the secondary MIPI-DSI device registration and attachment to the
MIPI-DSI host at probe time.

Let's do this.

Signed-off-by: Maxime Ripard
---
 drivers/gpu/drm/bridge/tc358775.c | 37 +--
 1 file changed, 25 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/bridge/tc358775.c b/drivers/gpu/drm/bridge/tc358775.c
index 35e66d1b6456..2c76331b251d 100644
--- a/drivers/gpu/drm/bridge/tc358775.c
+++ b/drivers/gpu/drm/bridge/tc358775.c
@@ -594,11 +594,26 @@ static int tc_bridge_attach(struct drm_bridge *bridge,
 			    enum drm_bridge_attach_flags flags)
 {
 	struct tc_data *tc = bridge_to_tc(bridge);
+
+	/* Attach the panel-bridge to the dsi bridge */
+	return drm_bridge_attach(bridge->encoder, tc->panel_bridge,
+				 &tc->bridge, flags);
+}
+
+static const struct drm_bridge_funcs tc_bridge_funcs = {
+	.attach = tc_bridge_attach,
+	.pre_enable = tc_bridge_pre_enable,
+	.enable = tc_bridge_enable,
+	.mode_valid = tc_mode_valid,
+	.post_disable = tc_bridge_post_disable,
+};
+
+static int tc_attach_host(struct tc_data *tc)
+{
 	struct device *dev = &tc->i2c->dev;
 	struct mipi_dsi_host *host;
 	struct mipi_dsi_device *dsi;
 	int ret;
-
 	const struct mipi_dsi_device_info info = { .type = "tc358775",
 							.channel = 0,
 							.node = NULL,
@@ -628,19 +643,9 @@ static int tc_bridge_attach(struct drm_bridge *bridge,
 		return ret;
 	}
 
-	/* Attach the panel-bridge to the dsi bridge */
-	return drm_bridge_attach(bridge->encoder, tc->panel_bridge,
-				 &tc->bridge, flags);
+	return 0;
 }
 
-static const struct drm_bridge_funcs tc_bridge_funcs = {
-	.attach = tc_bridge_attach,
-	.pre_enable = tc_bridge_pre_enable,
-	.enable = tc_bridge_enable,
-	.mode_valid = tc_mode_valid,
-	.post_disable = tc_bridge_post_disable,
-};
-
 static int tc_probe(struct i2c_client *client, const struct i2c_device_id *id)
 {
 	struct device *dev = &client->dev;
@@ -704,7 +709,15 @@ static int tc_probe(struct i2c_client *client, const struct i2c_device_id *id)
 
 	i2c_set_clientdata(client, tc);
 
+	ret = tc_attach_host(tc);
+	if (ret)
+		goto err_bridge_remove;
+
 	return 0;
+
+err_bridge_remove:
+	drm_bridge_remove(&tc->bridge);
+	return ret;
 }
 
 static int tc_remove(struct i2c_client *client)
--
2.31.1
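The probe-time unwind introduced above (add the bridge, try to attach to the host, remove the bridge again on failure) can be sketched outside the kernel. This is only a userspace toy, not driver code: toy_bridge, toy_bridge_add(), toy_bridge_remove(), toy_attach_host() and toy_probe() are hypothetical stand-ins for the bridge list and the host attach step.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a bridge sitting in the global bridge list. */
struct toy_bridge {
	bool listed;
};

static void toy_bridge_add(struct toy_bridge *b)
{
	b->listed = true;
}

static void toy_bridge_remove(struct toy_bridge *b)
{
	b->listed = false;
}

/* Hypothetical host attach step: fails when the host is not there yet. */
static int toy_attach_host(bool host_present)
{
	return host_present ? 0 : -1;
}

/* Same shape as the probe paths in the patch: add, attach, unwind on error. */
static int toy_probe(struct toy_bridge *b, bool host_present)
{
	int ret;

	toy_bridge_add(b);

	ret = toy_attach_host(host_present);
	if (ret)
		goto err_remove_bridge;

	return 0;

err_remove_bridge:
	toy_bridge_remove(b);
	return ret;
}
```

The point of the label is that a failed (or deferred) probe must not leave the bridge visible in the list, since a host looking it up would otherwise find a bridge whose driver is not bound.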
[PATCH v4 21/24] drm/bridge: tc358775: Switch to devm MIPI-DSI helpers
Let's switch to the new devm MIPI-DSI function to register and attach
our secondary device. This also avoids leaking the device when we detach
the bridge.

Signed-off-by: Maxime Ripard
---
 drivers/gpu/drm/bridge/tc358775.c | 13 ++++---------
 1 file changed, 4 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/bridge/tc358775.c b/drivers/gpu/drm/bridge/tc358775.c
index 2272adcc5b4a..35e66d1b6456 100644
--- a/drivers/gpu/drm/bridge/tc358775.c
+++ b/drivers/gpu/drm/bridge/tc358775.c
@@ -610,11 +610,10 @@ static int tc_bridge_attach(struct drm_bridge *bridge,
 		return -EPROBE_DEFER;
 	}
 
-	dsi = mipi_dsi_device_register_full(host, &info);
+	dsi = devm_mipi_dsi_device_register_full(dev, host, &info);
 	if (IS_ERR(dsi)) {
 		dev_err(dev, "failed to create dsi device\n");
-		ret = PTR_ERR(dsi);
-		goto err_dsi_device;
+		return PTR_ERR(dsi);
 	}
 
 	tc->dsi = dsi;
@@ -623,19 +622,15 @@ static int tc_bridge_attach(struct drm_bridge *bridge,
 	dsi->format = MIPI_DSI_FMT_RGB888;
 	dsi->mode_flags = MIPI_DSI_MODE_VIDEO;
 
-	ret = mipi_dsi_attach(dsi);
+	ret = devm_mipi_dsi_attach(dev, dsi);
 	if (ret < 0) {
 		dev_err(dev, "failed to attach dsi to host\n");
-		goto err_dsi_attach;
+		return ret;
 	}
 
 	/* Attach the panel-bridge to the dsi bridge */
 	return drm_bridge_attach(bridge->encoder, tc->panel_bridge,
 				 &tc->bridge, flags);
-err_dsi_attach:
-	mipi_dsi_device_unregister(dsi);
-err_dsi_device:
-	return ret;
 }
 
 static const struct drm_bridge_funcs tc_bridge_funcs = {
--
2.31.1
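As a rough illustration of why the error labels can go away with the devm helpers, here is a userspace toy of device-managed cleanup: each successful step queues an undo action on the device, and the queue is run in reverse order on teardown. It is a sketch under assumptions, not the kernel's devres implementation; toy_device, toy_add_action(), toy_release_all() and toy_demo() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_ACTIONS 8

typedef void (*action_fn)(void *data);

/* Hypothetical device carrying a small stack of undo actions. */
struct toy_device {
	action_fn fn[MAX_ACTIONS];
	void *data[MAX_ACTIONS];
	size_t count;
};

/* Queue an undo action; returns -1 when the stack is full. */
static int toy_add_action(struct toy_device *dev, action_fn fn, void *data)
{
	if (dev->count == MAX_ACTIONS)
		return -1;
	dev->fn[dev->count] = fn;
	dev->data[dev->count] = data;
	dev->count++;
	return 0;
}

/* Run the queued actions last-in, first-out, as on driver unbind. */
static void toy_release_all(struct toy_device *dev)
{
	while (dev->count > 0) {
		dev->count--;
		dev->fn[dev->count](dev->data[dev->count]);
	}
}

static void log_unregister(void *log)
{
	strcat(log, "unregister;");
}

static void log_detach(void *log)
{
	strcat(log, "detach;");
}

/* "Register", then "attach", then tear down; log records the undo order. */
static void toy_demo(char *log)
{
	struct toy_device dev = { .count = 0 };

	log[0] = '\0';
	toy_add_action(&dev, log_unregister, log);	/* queued once register succeeded */
	toy_add_action(&dev, log_detach, log);		/* queued once attach succeeded */
	toy_release_all(&dev);
}
```

Because the undo actions are tied to the device rather than to one function's control flow, an early return after a failed attach needs no explicit unregister call: whatever was queued so far is undone when the device goes away.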
[PATCH v4 20/24] drm/bridge: sn65dsi86: Register and attach our DSI device at probe
In order to avoid any probe ordering issue, the best practice is to move
the secondary MIPI-DSI device registration and attachment to the
MIPI-DSI host at probe time.

Let's do this.

Signed-off-by: Maxime Ripard
---
 drivers/gpu/drm/bridge/ti-sn65dsi86.c | 74 ++-
 1 file changed, 38 insertions(+), 36 deletions(-)

diff --git a/drivers/gpu/drm/bridge/ti-sn65dsi86.c b/drivers/gpu/drm/bridge/ti-sn65dsi86.c
index b5662269ff95..7f71329536a2 100644
--- a/drivers/gpu/drm/bridge/ti-sn65dsi86.c
+++ b/drivers/gpu/drm/bridge/ti-sn65dsi86.c
@@ -667,58 +667,27 @@ static struct ti_sn65dsi86 *bridge_to_ti_sn65dsi86(struct drm_bridge *bridge)
 	return container_of(bridge, struct ti_sn65dsi86, bridge);
 }
 
-static int ti_sn_bridge_attach(struct drm_bridge *bridge,
-			       enum drm_bridge_attach_flags flags)
+static int ti_sn_attach_host(struct ti_sn65dsi86 *pdata)
 {
 	int ret, val;
-	struct ti_sn65dsi86 *pdata = bridge_to_ti_sn65dsi86(bridge);
 	struct mipi_dsi_host *host;
 	struct mipi_dsi_device *dsi;
 	struct device *dev = pdata->dev;
 	const struct mipi_dsi_device_info info = { .type = "ti_sn_bridge",
 						   .channel = 0,
 						   .node = NULL,
-						 };
+	};
 
-	if (flags & DRM_BRIDGE_ATTACH_NO_CONNECTOR) {
-		DRM_ERROR("Fix bridge driver to make connector optional!");
-		return -EINVAL;
-	}
-
-	pdata->aux.drm_dev = bridge->dev;
-	ret = drm_dp_aux_register(&pdata->aux);
-	if (ret < 0) {
-		drm_err(bridge->dev, "Failed to register DP AUX channel: %d\n", ret);
-		return ret;
-	}
-
-	ret = ti_sn_bridge_connector_init(pdata);
-	if (ret < 0)
-		goto err_conn_init;
-
-	/*
-	 * TODO: ideally finding host resource and dsi dev registration needs
-	 * to be done in bridge probe. But some existing DSI host drivers will
-	 * wait for any of the drm_bridge/drm_panel to get added to the global
-	 * bridge/panel list, before completing their probe. So if we do the
-	 * dsi dev registration part in bridge probe, before populating in
-	 * the global bridge list, then it will cause deadlock as dsi host probe
-	 * will never complete, neither our bridge probe. So keeping it here
-	 * will satisfy most of the existing host drivers. Once the host driver
-	 * is fixed we can move the below code to bridge probe safely.
-	 */
 	host = of_find_mipi_dsi_host_by_node(pdata->host_node);
 	if (!host) {
 		DRM_ERROR("failed to find dsi host\n");
-		ret = -ENODEV;
-		goto err_dsi_host;
+		return -ENODEV;
 	}
 
 	dsi = devm_mipi_dsi_device_register_full(dev, host, &info);
 	if (IS_ERR(dsi)) {
 		DRM_ERROR("failed to create dsi device\n");
-		ret = PTR_ERR(dsi);
-		goto err_dsi_host;
+		return PTR_ERR(dsi);
 	}
 
 	/* TODO: setting to 4 MIPI lanes always for now */
@@ -736,10 +705,35 @@ static int ti_sn_bridge_attach(struct drm_bridge *bridge,
 	ret = devm_mipi_dsi_attach(dev, dsi);
 	if (ret < 0) {
 		DRM_ERROR("failed to attach dsi to host\n");
-		goto err_dsi_host;
+		return ret;
 	}
 	pdata->dsi = dsi;
 
+	return 0;
+}
+
+static int ti_sn_bridge_attach(struct drm_bridge *bridge,
+			       enum drm_bridge_attach_flags flags)
+{
+	struct ti_sn65dsi86 *pdata = bridge_to_ti_sn65dsi86(bridge);
+	int ret;
+
+	if (flags & DRM_BRIDGE_ATTACH_NO_CONNECTOR) {
+		DRM_ERROR("Fix bridge driver to make connector optional!");
+		return -EINVAL;
+	}
+
+	pdata->aux.drm_dev = bridge->dev;
+	ret = drm_dp_aux_register(&pdata->aux);
+	if (ret < 0) {
+		drm_err(bridge->dev, "Failed to register DP AUX channel: %d\n", ret);
+		return ret;
+	}
+
+	ret = ti_sn_bridge_connector_init(pdata);
+	if (ret < 0)
+		goto err_conn_init;
+
 	/* We never want the next bridge to *also* create a connector: */
 	flags |= DRM_BRIDGE_ATTACH_NO_CONNECTOR;
 
@@ -1223,7 +1217,15 @@ static int ti_sn_bridge_probe(struct auxiliary_device *adev,
 
 	drm_bridge_add(&pdata->bridge);
 
+	ret = ti_sn_attach_host(pdata);
+	if (ret)
+		goto err_remove_bridge;
+
 	return 0;
+
+err_remove_bridge:
+	drm_bridge_remove(&pdata->bridge);
+	return ret;
 }
 
 static void ti_sn_bridge_remove(struct auxiliary_device *adev)
--
2.31.1
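The deadlock described in the TODO comment removed by this patch can be modelled in a few lines of userspace C. This is a deliberately simplified toy, not a description of any particular host driver: struct model, host_probe(), bridge_probe() and pipeline_binds() are hypothetical, DEFER merely stands in for -EPROBE_DEFER, and the "legacy" branch assumes a host that waits for the bridge before registering itself.

```c
#include <assert.h>
#include <stdbool.h>

#define DEFER (-1)	/* stand-in for -EPROBE_DEFER */

/* Hypothetical global state shared by the two toy drivers. */
struct model {
	bool host_registered;	/* DSI host can be found by devices */
	bool bridge_added;	/* bridge is in the global bridge list */
	bool component_added;	/* host component is known to the master */
};

static int host_probe(struct model *m, bool agreed_order)
{
	if (!agreed_order) {
		/* Assumed legacy habit: wait for the bridge before registering. */
		if (!m->bridge_added)
			return DEFER;
		m->component_added = true;
	}
	m->host_registered = true;
	return 0;
}

static int bridge_probe(struct model *m, bool agreed_order)
{
	/* The bridge attaches at probe time, so it needs the host first. */
	if (!m->host_registered)
		return DEFER;
	m->bridge_added = true;
	if (agreed_order)
		m->component_added = true;	/* done from the host's attach callback */
	return 0;
}

/* Retry deferred probes until nothing makes progress any more. */
static bool pipeline_binds(bool agreed_order)
{
	struct model m = { 0 };
	bool host_done = false, bridge_done = false, progress = true;

	while (progress) {
		progress = false;
		if (!host_done && host_probe(&m, agreed_order) == 0)
			host_done = progress = true;
		if (!bridge_done && bridge_probe(&m, agreed_order) == 0)
			bridge_done = progress = true;
	}
	return host_done && bridge_done && m.component_added;
}
```

In this model pipeline_binds(false) never makes progress, because each side defers on the other, while pipeline_binds(true) completes after one pass: the host registers unconditionally at probe, and its component only appears once a device has attached.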
[PATCH v4 19/24] drm/bridge: sn65dsi86: Switch to devm MIPI-DSI helpers
Let's switch to the new devm MIPI-DSI function to register and attach
our secondary device. This also avoids leaking the device when we detach
the bridge.

Signed-off-by: Maxime Ripard
---
 drivers/gpu/drm/bridge/ti-sn65dsi86.c | 22 +++++++---------------
 1 file changed, 7 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/bridge/ti-sn65dsi86.c b/drivers/gpu/drm/bridge/ti-sn65dsi86.c
index 41d48a393e7f..b5662269ff95 100644
--- a/drivers/gpu/drm/bridge/ti-sn65dsi86.c
+++ b/drivers/gpu/drm/bridge/ti-sn65dsi86.c
@@ -674,6 +674,7 @@ static int ti_sn_bridge_attach(struct drm_bridge *bridge,
 	struct ti_sn65dsi86 *pdata = bridge_to_ti_sn65dsi86(bridge);
 	struct mipi_dsi_host *host;
 	struct mipi_dsi_device *dsi;
+	struct device *dev = pdata->dev;
 	const struct mipi_dsi_device_info info = { .type = "ti_sn_bridge",
 						   .channel = 0,
 						   .node = NULL,
@@ -713,7 +714,7 @@ static int ti_sn_bridge_attach(struct drm_bridge *bridge,
 		goto err_dsi_host;
 	}
 
-	dsi = mipi_dsi_device_register_full(host, &info);
+	dsi = devm_mipi_dsi_device_register_full(dev, host, &info);
 	if (IS_ERR(dsi)) {
 		DRM_ERROR("failed to create dsi device\n");
 		ret = PTR_ERR(dsi);
@@ -726,16 +727,16 @@ static int ti_sn_bridge_attach(struct drm_bridge *bridge,
 	dsi->mode_flags = MIPI_DSI_MODE_VIDEO;
 
 	/* check if continuous dsi clock is required or not */
-	pm_runtime_get_sync(pdata->dev);
+	pm_runtime_get_sync(dev);
 	regmap_read(pdata->regmap, SN_DPPLL_SRC_REG, &val);
-	pm_runtime_put_autosuspend(pdata->dev);
+	pm_runtime_put_autosuspend(dev);
 	if (!(val & DPPLL_CLK_SRC_DSICLK))
 		dsi->mode_flags |= MIPI_DSI_CLOCK_NON_CONTINUOUS;
 
-	ret = mipi_dsi_attach(dsi);
+	ret = devm_mipi_dsi_attach(dev, dsi);
 	if (ret < 0) {
 		DRM_ERROR("failed to attach dsi to host\n");
-		goto err_dsi_attach;
+		goto err_dsi_host;
 	}
 	pdata->dsi = dsi;
 
@@ -746,14 +747,10 @@ static int ti_sn_bridge_attach(struct drm_bridge *bridge,
 	ret = drm_bridge_attach(bridge->encoder, pdata->next_bridge,
 				&pdata->bridge, flags);
 	if (ret < 0)
-		goto err_dsi_detach;
+		goto err_dsi_host;
 
 	return 0;
 
-err_dsi_detach:
-	mipi_dsi_detach(dsi);
-err_dsi_attach:
-	mipi_dsi_device_unregister(dsi);
 err_dsi_host:
 	drm_connector_cleanup(&pdata->connector);
 err_conn_init:
@@ -1236,11 +1233,6 @@ static void ti_sn_bridge_remove(struct auxiliary_device *adev)
 	if (!pdata)
 		return;
 
-	if (pdata->dsi) {
-		mipi_dsi_detach(pdata->dsi);
-		mipi_dsi_device_unregister(pdata->dsi);
-	}
-
 	drm_bridge_remove(&pdata->bridge);
 
 	of_node_put(pdata->host_node);
--
2.31.1
[PATCH v4 18/24] drm/bridge: sn65dsi83: Register and attach our DSI device at probe
In order to avoid any probe ordering issue, the best practice is to move
the secondary MIPI-DSI device registration and attachment to the
MIPI-DSI host at probe time.

Let's do this.

Signed-off-by: Maxime Ripard
---
 drivers/gpu/drm/bridge/ti-sn65dsi83.c | 80 +++
 1 file changed, 46 insertions(+), 34 deletions(-)

diff --git a/drivers/gpu/drm/bridge/ti-sn65dsi83.c b/drivers/gpu/drm/bridge/ti-sn65dsi83.c
index db4d39082705..f951eb19767b 100644
--- a/drivers/gpu/drm/bridge/ti-sn65dsi83.c
+++ b/drivers/gpu/drm/bridge/ti-sn65dsi83.c
@@ -245,40 +245,6 @@ static int sn65dsi83_attach(struct drm_bridge *bridge,
 			    enum drm_bridge_attach_flags flags)
 {
 	struct sn65dsi83 *ctx = bridge_to_sn65dsi83(bridge);
-	struct device *dev = ctx->dev;
-	struct mipi_dsi_device *dsi;
-	struct mipi_dsi_host *host;
-	int ret = 0;
-
-	const struct mipi_dsi_device_info info = {
-		.type = "sn65dsi83",
-		.channel = 0,
-		.node = NULL,
-	};
-
-	host = of_find_mipi_dsi_host_by_node(ctx->host_node);
-	if (!host) {
-		dev_err(dev, "failed to find dsi host\n");
-		return -EPROBE_DEFER;
-	}
-
-	dsi = devm_mipi_dsi_device_register_full(dev, host, &info);
-	if (IS_ERR(dsi)) {
-		return dev_err_probe(dev, PTR_ERR(dsi),
-				     "failed to create dsi device\n");
-	}
-
-	ctx->dsi = dsi;
-
-	dsi->lanes = ctx->dsi_lanes;
-	dsi->format = MIPI_DSI_FMT_RGB888;
-	dsi->mode_flags = MIPI_DSI_MODE_VIDEO | MIPI_DSI_MODE_VIDEO_BURST;
-
-	ret = devm_mipi_dsi_attach(dev, dsi);
-	if (ret < 0) {
-		dev_err(dev, "failed to attach dsi to host\n");
-		return ret;
-	}
 
 	return drm_bridge_attach(bridge->encoder, ctx->panel_bridge,
 				 &ctx->bridge, flags);
@@ -646,6 +612,44 @@ static int sn65dsi83_parse_dt(struct sn65dsi83 *ctx, enum sn65dsi83_model model)
 	return 0;
 }
 
+static int sn65dsi83_host_attach(struct sn65dsi83 *ctx)
+{
+	struct device *dev = ctx->dev;
+	struct mipi_dsi_device *dsi;
+	struct mipi_dsi_host *host;
+	const struct mipi_dsi_device_info info = {
+		.type = "sn65dsi83",
+		.channel = 0,
+		.node = NULL,
+	};
+	int ret;
+
+	host = of_find_mipi_dsi_host_by_node(ctx->host_node);
+	if (!host) {
+		dev_err(dev, "failed to find dsi host\n");
+		return -EPROBE_DEFER;
+	}
+
+	dsi = devm_mipi_dsi_device_register_full(dev, host, &info);
+	if (IS_ERR(dsi))
+		return dev_err_probe(dev, PTR_ERR(dsi),
+				     "failed to create dsi device\n");
+
+	ctx->dsi = dsi;
+
+	dsi->lanes = ctx->dsi_lanes;
+	dsi->format = MIPI_DSI_FMT_RGB888;
+	dsi->mode_flags = MIPI_DSI_MODE_VIDEO | MIPI_DSI_MODE_VIDEO_BURST;
+
+	ret = devm_mipi_dsi_attach(dev, dsi);
+	if (ret < 0) {
+		dev_err(dev, "failed to attach dsi to host: %d\n", ret);
+		return ret;
+	}
+
+	return 0;
+}
+
 static int sn65dsi83_probe(struct i2c_client *client,
 			   const struct i2c_device_id *id)
 {
@@ -686,7 +690,15 @@ static int sn65dsi83_probe(struct i2c_client *client,
 	ctx->bridge.of_node = dev->of_node;
 	drm_bridge_add(&ctx->bridge);
 
+	ret = sn65dsi83_host_attach(ctx);
+	if (ret)
+		goto err_remove_bridge;
+
 	return 0;
+
+err_remove_bridge:
+	drm_bridge_remove(&ctx->bridge);
+	return ret;
 }
 
 static int sn65dsi83_remove(struct i2c_client *client)
--
2.31.1
[PATCH v4 17/24] drm/bridge: sn65dsi83: Switch to devm MIPI-DSI helpers
Let's switch to the new devm MIPI-DSI function to register and attach
our secondary device. This also avoids leaking the device when we detach
the bridge but don't remove its driver.

Signed-off-by: Maxime Ripard
---
 drivers/gpu/drm/bridge/ti-sn65dsi83.c | 12 +++---------
 1 file changed, 3 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/bridge/ti-sn65dsi83.c b/drivers/gpu/drm/bridge/ti-sn65dsi83.c
index a32f70bc68ea..db4d39082705 100644
--- a/drivers/gpu/drm/bridge/ti-sn65dsi83.c
+++ b/drivers/gpu/drm/bridge/ti-sn65dsi83.c
@@ -262,7 +262,7 @@ static int sn65dsi83_attach(struct drm_bridge *bridge,
 		return -EPROBE_DEFER;
 	}
 
-	dsi = mipi_dsi_device_register_full(host, &info);
+	dsi = devm_mipi_dsi_device_register_full(dev, host, &info);
 	if (IS_ERR(dsi)) {
 		return dev_err_probe(dev, PTR_ERR(dsi),
 				     "failed to create dsi device\n");
@@ -274,18 +274,14 @@ static int sn65dsi83_attach(struct drm_bridge *bridge,
 	dsi->format = MIPI_DSI_FMT_RGB888;
 	dsi->mode_flags = MIPI_DSI_MODE_VIDEO | MIPI_DSI_MODE_VIDEO_BURST;
 
-	ret = mipi_dsi_attach(dsi);
+	ret = devm_mipi_dsi_attach(dev, dsi);
 	if (ret < 0) {
 		dev_err(dev, "failed to attach dsi to host\n");
-		goto err_dsi_attach;
+		return ret;
 	}
 
 	return drm_bridge_attach(bridge->encoder, ctx->panel_bridge,
 				 &ctx->bridge, flags);
-
-err_dsi_attach:
-	mipi_dsi_device_unregister(dsi);
-	return ret;
 }
 
 static void sn65dsi83_atomic_pre_enable(struct drm_bridge *bridge,
@@ -697,8 +693,6 @@ static int sn65dsi83_remove(struct i2c_client *client)
 {
 	struct sn65dsi83 *ctx = i2c_get_clientdata(client);
 
-	mipi_dsi_detach(ctx->dsi);
-	mipi_dsi_device_unregister(ctx->dsi);
 	drm_bridge_remove(&ctx->bridge);
 	of_node_put(ctx->host_node);
 
--
2.31.1