Re: [Intel-gfx] [PATCH v5 00/19] Add vfio_device cdev for iommufd support
On Fri, Mar 03, 2023 at 03:01:03PM +, Shameerali Kolothum Thodi wrote: > External email: Use caution opening links or attachments > > > > -Original Message- > > From: Nicolin Chen [mailto:nicol...@nvidia.com] > > Sent: 02 March 2023 23:51 > > To: Shameerali Kolothum Thodi > > Cc: Xu, Terrence ; Liu, Yi L ; > > Jason Gunthorpe ; alex.william...@redhat.com; Tian, > > Kevin ; j...@8bytes.org; robin.mur...@arm.com; > > coh...@redhat.com; eric.au...@redhat.com; k...@vger.kernel.org; > > mjros...@linux.ibm.com; chao.p.p...@linux.intel.com; > > yi.y@linux.intel.com; pet...@redhat.com; jasow...@redhat.com; > > l...@redhat.com; suravee.suthikulpa...@amd.com; > > intel-gvt-...@lists.freedesktop.org; intel-gfx@lists.freedesktop.org; > > linux-s...@vger.kernel.org; Hao, Xudong ; Zhao, > > Yan Y > > Subject: Re: [PATCH v5 00/19] Add vfio_device cdev for iommufd support > > > > On Thu, Mar 02, 2023 at 09:43:00AM +, Shameerali Kolothum Thodi > > wrote: > > > > > Hi Nicolin, > > > > > > Thanks for the latest ARM64 branch. Do you have a working Qemu branch > > corresponding to the > > > above one? > > > > > > I tried the > > https://github.com/nicolinc/qemu/tree/wip/iommufd_rfcv3%2Bnesting%2B > > smmuv3 > > > but for some reason not able to launch the Guest. > > > > > > Please let me know. > > > > I do use that branch. It might not be that robust though as it > > went through a big rebase. > > Ok. The issue seems to be quite random in nature and only happens when there > are multiple vCPUs. Also doesn't look like related to VFIO device assignment > as I can reproduce Guest hang without it by only having nested-smmuv3 and > iommufd object. > > ./qemu-system-aarch64-iommuf -machine > virt,gic-version=3,iommu=nested-smmuv3,iommufd=iommufd0 \ > -enable-kvm -cpu host -m 1G -smp cpus=8,maxcpus=8 \ > -object iommufd,id=iommufd0 \ > -bios QEMU_EFI.fd \ > -kernel Image-6.2-iommufd \ > -initrd rootfs-iperf.cpio \ > -net none \ > -nographic \ > -append "rdinit=init console=ttyAMA0 root=/dev/vda rw > earlycon=pl011,0x900" \ > -trace events=events \ > -D trace_iommufd > > When the issue happens, no output on terminal as if Qemu is in a locked state. > > Can you try with the followings? > > > > --trace "iommufd*" --trace "smmu*" --trace "vfio_*" --trace "pci_*" --trace > > "msi_*" --trace "nvme_*" > > The only trace events with above are this, > > iommufd_backend_connect fd=22 owned=1 users=1 (0) > smmu_add_mr smmuv3-iommu-memory-region-0-0 > > I haven't debugged this further. Please let me know if issue is reproducible > with multiple vCPUs at your end. For now will focus on VFIO dev specific > tests. Oh. My test environment has been a single-core vCPU. So that doesn't happen to me. Can you try a vanilla QEMU branch that our nesting branch is rebased on? I took a branch from Yi as the baseline, while he might take from Eric for the rfcv3. I am guessing that it might be an issue in the common tree. Thanks Nicolin
Re: [Intel-gfx] [PATCH v2 1/2] drm/i915/gt: Create per-tile debugfs files
On Wed, Mar 01, 2023 at 09:35:33PM +, Sripada, Radhakrishna wrote: > I am not sure if Tiles is appropriate usage here. Since MTL does not have the > concept of tiles. > Shouldn't we be using gt instead of tile in our usage? > > With s/tile/gt/g, > Reviewed-by: Radhakrishna Sripada Just one question... you reviewed twice Patch number 1. Did you mean to review patch 1 and patch 2? Thanks, Andi > > > -Original Message- > > From: dri-devel On Behalf Of Andi > > Shyti > > Sent: Wednesday, March 1, 2023 3:03 AM > > To: intel-gfx@lists.freedesktop.org; dri-de...@lists.freedesktop.org > > Cc: Tvrtko Ursulin ; Andi Shyti > > ; Patelczyk, Maciej ; Andi > > Shyti ; Wajdeczko, Michal > > > > Subject: [PATCH v2 1/2] drm/i915/gt: Create per-tile debugfs files > > > > To support multi-GT configurations, we need to generate > > independent debug files for each GT. > > > > To achieve this create a separate directory for each GT under the > > debugfs directory. For instance, in a system with two tiles, the > > debugfs structure would look like this: > > > > /sys/kernel/debug/dri > > └── 0 > > ├── gt0 > > │ ├── drpc > > │ ├── engines > > │ ├── forcewake > > │ ├── frequency > > │ └── rps_boost > > └── gt1 > > : ├── drpc > > : ├── engines > > : ├── forcewake > > ├── frequency > > └── rps_boost > > > > Signed-off-by: Andi Shyti > > Cc: Tvrtko Ursulin > > --- > > drivers/gpu/drm/i915/gt/intel_gt_debugfs.c| 4 +++- > > drivers/gpu/drm/i915/gt/uc/intel_guc.h| 2 ++ > > drivers/gpu/drm/i915/gt/uc/intel_guc_log.c| 5 - > > drivers/gpu/drm/i915/gt/uc/intel_uc_debugfs.c | 2 ++ > > 4 files changed, 11 insertions(+), 2 deletions(-) > > > > diff --git a/drivers/gpu/drm/i915/gt/intel_gt_debugfs.c > > b/drivers/gpu/drm/i915/gt/intel_gt_debugfs.c > > index 5fc2df01aa0df..4dc23b8d3aa2d 100644 > > --- a/drivers/gpu/drm/i915/gt/intel_gt_debugfs.c > > +++ b/drivers/gpu/drm/i915/gt/intel_gt_debugfs.c > > @@ -83,11 +83,13 @@ static void gt_debugfs_register(struct intel_gt *gt, > > struct dentry *root) > > void intel_gt_debugfs_register(struct intel_gt *gt) > > { > > struct dentry *root; > > + char gtname[4]; > > > > if (!gt->i915->drm.primary->debugfs_root) > > return; > > > > - root = debugfs_create_dir("gt", gt->i915->drm.primary->debugfs_root); > > + snprintf(gtname, sizeof(gtname), "gt%u", gt->info.id); > > + root = debugfs_create_dir(gtname, gt->i915->drm.primary- > > >debugfs_root); > > if (IS_ERR(root)) > > return; > > > > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.h > > b/drivers/gpu/drm/i915/gt/uc/intel_guc.h > > index bb4dfe707a7d0..e46aac1a41e6d 100644 > > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.h > > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.h > > @@ -42,6 +42,8 @@ struct intel_guc { > > /** @capture: the error-state-capture module's data and objects */ > > struct intel_guc_state_capture *capture; > > > > + struct dentry *dbgfs_node; > > + > > /** @sched_engine: Global engine used to submit requests to GuC */ > > struct i915_sched_engine *sched_engine; > > /** > > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c > > b/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c > > index 195db8c9d4200..55bc8b55fbc05 100644 > > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c > > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c > > @@ -542,8 +542,11 @@ static int guc_log_relay_create(struct intel_guc_log > > *log) > > */ > > n_subbufs = 8; > > > > + if (!guc->dbgfs_node) > > + return -ENOENT; > > + > > guc_log_relay_chan = relay_open("guc_log", > > - i915->drm.primary->debugfs_root, > > + guc->dbgfs_node, > > subbuf_size, n_subbufs, > > _callbacks, i915); > > if (!guc_log_relay_chan) { > > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc_debugfs.c > > b/drivers/gpu/drm/i915/gt/uc/intel_uc_debugfs.c > > index 284d6fbc2d08c..2f93cc4e408a8 100644 > > --- a/drivers/gpu/drm/i915/gt/uc/intel_uc_debugfs.c > > +++ b/drivers/gpu/drm/i915/gt/uc/intel_uc_debugfs.c > > @@ -54,6 +54,8 @@ void intel_uc_debugfs_register(struct intel_uc *uc, struct > > dentry *gt_root) > > if (IS_ERR(root)) > > return; > > > > + uc->guc.dbgfs_node = root; > > + > > intel_gt_debugfs_register_files(root, files, ARRAY_SIZE(files), uc); > > > > intel_guc_debugfs_register(>guc, root); > > -- > > 2.39.1 >
[Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915/pmu: Freq sampling: Fix requested freq fallback
== Series Details == Series: drm/i915/pmu: Freq sampling: Fix requested freq fallback URL : https://patchwork.freedesktop.org/series/114643/ State : success == Summary == CI Bug Log - changes from CI_DRM_12810 -> Patchwork_114643v1 Summary --- **SUCCESS** No regressions found. External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114643v1/index.html Participating hosts (39 -> 40) -- Additional (1): fi-kbl-soraka Known issues Here are the changes found in Patchwork_114643v1 that come from known issues: ### IGT changes ### Issues hit * igt@gem_huc_copy@huc-copy: - fi-kbl-soraka: NOTRUN -> [SKIP][1] ([fdo#109271] / [i915#2190]) [1]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114643v1/fi-kbl-soraka/igt@gem_huc_c...@huc-copy.html * igt@gem_lmem_swapping@basic: - fi-kbl-soraka: NOTRUN -> [SKIP][2] ([fdo#109271] / [i915#4613]) +3 similar issues [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114643v1/fi-kbl-soraka/igt@gem_lmem_swapp...@basic.html * igt@i915_selftest@live@gt_pm: - fi-kbl-soraka: NOTRUN -> [DMESG-FAIL][3] ([i915#1886]) [3]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114643v1/fi-kbl-soraka/igt@i915_selftest@live@gt_pm.html * igt@i915_selftest@live@hangcheck: - fi-kbl-soraka: NOTRUN -> [INCOMPLETE][4] ([i915#7913]) [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114643v1/fi-kbl-soraka/igt@i915_selftest@l...@hangcheck.html * igt@i915_selftest@live@slpc: - bat-rpls-2: NOTRUN -> [DMESG-FAIL][5] ([i915#6997] / [i915#7913]) [5]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114643v1/bat-rpls-2/igt@i915_selftest@l...@slpc.html * igt@kms_chamelium_hpd@common-hpd-after-suspend: - bat-rpls-2: NOTRUN -> [SKIP][6] ([i915#7828]) [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114643v1/bat-rpls-2/igt@kms_chamelium_...@common-hpd-after-suspend.html - bat-rpls-1: NOTRUN -> [SKIP][7] ([i915#7828]) [7]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114643v1/bat-rpls-1/igt@kms_chamelium_...@common-hpd-after-suspend.html * igt@kms_cursor_legacy@basic-busy-flip-before-cursor-atomic: - fi-kbl-soraka: NOTRUN -> [SKIP][8] ([fdo#109271]) +16 similar issues [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114643v1/fi-kbl-soraka/igt@kms_cursor_leg...@basic-busy-flip-before-cursor-atomic.html * igt@kms_pipe_crc_basic@suspend-read-crc: - bat-rpls-2: NOTRUN -> [SKIP][9] ([i915#1845]) [9]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114643v1/bat-rpls-2/igt@kms_pipe_crc_ba...@suspend-read-crc.html - bat-rpls-1: NOTRUN -> [SKIP][10] ([i915#1845]) [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114643v1/bat-rpls-1/igt@kms_pipe_crc_ba...@suspend-read-crc.html Possible fixes * igt@gem_exec_suspend@basic-s3@smem: - bat-rpls-1: [ABORT][11] ([i915#6687] / [i915#7978]) -> [PASS][12] [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12810/bat-rpls-1/igt@gem_exec_suspend@basic...@smem.html [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114643v1/bat-rpls-1/igt@gem_exec_suspend@basic...@smem.html * igt@i915_selftest@live@gt_heartbeat: - fi-apl-guc: [DMESG-FAIL][13] ([i915#5334]) -> [PASS][14] [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12810/fi-apl-guc/igt@i915_selftest@live@gt_heartbeat.html [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114643v1/fi-apl-guc/igt@i915_selftest@live@gt_heartbeat.html * igt@i915_selftest@live@gt_pm: - bat-rpls-2: [DMESG-FAIL][15] ([i915#4258]) -> [PASS][16] [15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12810/bat-rpls-2/igt@i915_selftest@live@gt_pm.html [16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114643v1/bat-rpls-2/igt@i915_selftest@live@gt_pm.html * igt@i915_selftest@live@migrate: - bat-adlp-6: [DMESG-FAIL][17] ([i915#7699]) -> [PASS][18] [17]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12810/bat-adlp-6/igt@i915_selftest@l...@migrate.html [18]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114643v1/bat-adlp-6/igt@i915_selftest@l...@migrate.html * igt@i915_selftest@live@requests: - bat-rpls-2: [ABORT][19] ([i915#7694] / [i915#7913] / [i915#7982]) -> [PASS][20] [19]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12810/bat-rpls-2/igt@i915_selftest@l...@requests.html [20]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114643v1/bat-rpls-2/igt@i915_selftest@l...@requests.html [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271 [i915#1845]: https://gitlab.freedesktop.org/drm/intel/issues/1845 [i915#1886]: https://gitlab.freedesktop.org/drm/intel/issues/1886 [i915#2190]:
Re: [Intel-gfx] [PATCH v6 8/8] drm/i915/pxp: Enable PXP with MTL-GSC-CS
On 2/27/2023 6:21 PM, Alan Previn wrote: Enable PXP with MTL-GSC-CS: add the has_pxp into device info and increase the debugfs teardown timeouts to align with new GSC-CS + firmware specs. Signed-off-by: Alan Previn Reviewed-by: Daniele Ceraolo Spurio Daniele --- drivers/gpu/drm/i915/i915_pci.c | 1 + drivers/gpu/drm/i915/pxp/intel_pxp_debugfs.c | 9 - drivers/gpu/drm/i915/pxp/intel_pxp_session.c | 2 +- 3 files changed, 10 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c index a8d942b16223..4ecf0f2ab6ec 100644 --- a/drivers/gpu/drm/i915/i915_pci.c +++ b/drivers/gpu/drm/i915/i915_pci.c @@ -1150,6 +1150,7 @@ static const struct intel_device_info mtl_info = { .has_guc_deprivilege = 1, .has_mslice_steering = 0, .has_snoop = 1, + .has_pxp = 1, .__runtime.memory_regions = REGION_SMEM | REGION_STOLEN_LMEM, .__runtime.platform_engine_mask = BIT(RCS0) | BIT(BCS0) | BIT(CCS0), .require_force_probe = 1, diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp_debugfs.c b/drivers/gpu/drm/i915/pxp/intel_pxp_debugfs.c index 9f6e300486b4..ddf9f8bb7791 100644 --- a/drivers/gpu/drm/i915/pxp/intel_pxp_debugfs.c +++ b/drivers/gpu/drm/i915/pxp/intel_pxp_debugfs.c @@ -14,6 +14,7 @@ #include "intel_pxp.h" #include "intel_pxp_debugfs.h" +#include "intel_pxp_gsccs.h" #include "intel_pxp_irq.h" #include "intel_pxp_types.h" @@ -45,6 +46,7 @@ static int pxp_terminate_set(void *data, u64 val) { struct intel_pxp *pxp = data; struct intel_gt *gt = intel_pxp_get_irq_gt(pxp); + int timeout_ms; if (!intel_pxp_is_active(pxp)) return -ENODEV; @@ -54,8 +56,13 @@ static int pxp_terminate_set(void *data, u64 val) intel_pxp_irq_handler(pxp, GEN12_DISPLAY_PXP_STATE_TERMINATED_INTERRUPT); spin_unlock_irq(gt->irq_lock); + if (HAS_ENGINE(pxp->ctrl_gt, GSC0)) + timeout_ms = GSC_PENDING_RETRY_LATENCY_MS; + else + timeout_ms = 250; + if (!wait_for_completion_timeout(>termination, -msecs_to_jiffies(100))) +msecs_to_jiffies(timeout_ms))) return -ETIMEDOUT; return 0; diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp_session.c b/drivers/gpu/drm/i915/pxp/intel_pxp_session.c index 4ddf2ee60222..03f006f9dc2e 100644 --- a/drivers/gpu/drm/i915/pxp/intel_pxp_session.c +++ b/drivers/gpu/drm/i915/pxp/intel_pxp_session.c @@ -44,7 +44,7 @@ static int pxp_wait_for_session_state(struct intel_pxp *pxp, u32 id, bool in_pla KCR_SIP(pxp->kcr_base), mask, in_play ? mask : 0, - 100); + 250); intel_runtime_pm_put(uncore->rpm, wakeref);
Re: [Intel-gfx] [PATCH v6 7/8] drm/i915/pxp: On MTL, KCR enabling doesn't wait on tee component
On 2/27/2023 6:21 PM, Alan Previn wrote: On legacy platforms, KCR HW enabling is done at the time the mei component interface is bound. It's also disabled during unbind. However, for MTL onwards, we don't depend on a tee component to start sending GSC-CS firmware messages. Thus, immediately enable (or disable) KCR HW on PXP's init, fini and resume. Signed-off-by: Alan Previn --- drivers/gpu/drm/i915/pxp/intel_pxp.c| 19 +++ drivers/gpu/drm/i915/pxp/intel_pxp_pm.c | 3 ++- 2 files changed, 17 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp.c b/drivers/gpu/drm/i915/pxp/intel_pxp.c index 61041277be24..e2f2cc5f6a6e 100644 --- a/drivers/gpu/drm/i915/pxp/intel_pxp.c +++ b/drivers/gpu/drm/i915/pxp/intel_pxp.c @@ -119,6 +119,7 @@ static void destroy_vcs_context(struct intel_pxp *pxp) static void pxp_init_full(struct intel_pxp *pxp) { struct intel_gt *gt = pxp->ctrl_gt; + intel_wakeref_t wakeref; int ret; /* @@ -140,10 +141,15 @@ static void pxp_init_full(struct intel_pxp *pxp) if (ret) return; - if (HAS_ENGINE(pxp->ctrl_gt, GSC0)) + if (HAS_ENGINE(pxp->ctrl_gt, GSC0)) { ret = intel_pxp_gsccs_init(pxp); - else + if (!ret) { + with_intel_runtime_pm(>ctrl_gt->i915->runtime_pm, wakeref) + intel_pxp_init_hw(pxp); personal preference: I'd move this (and the matching call in fini) inside intel_pxp_gsccs_init/fini. That way we can see this as more back-end specific: the gsccs initialize everything immediately, while the tee back-end follows a 2-step approach with the component. Not a blocker since it is a personal preference, so with or without the change: Reviewed-by: Daniele Ceraolo Spurio Daniele + } + } else { ret = intel_pxp_tee_component_init(pxp); + } if (ret) goto out_context; @@ -239,15 +245,20 @@ int intel_pxp_init(struct drm_i915_private *i915) void intel_pxp_fini(struct drm_i915_private *i915) { + intel_wakeref_t wakeref; + if (!i915->pxp) return; i915->pxp->arb_is_valid = false; - if (HAS_ENGINE(i915->pxp->ctrl_gt, GSC0)) + if (HAS_ENGINE(i915->pxp->ctrl_gt, GSC0)) { + with_intel_runtime_pm(>pxp->ctrl_gt->i915->runtime_pm, wakeref) + intel_pxp_fini_hw(i915->pxp); intel_pxp_gsccs_fini(i915->pxp); - else + } else { intel_pxp_tee_component_fini(i915->pxp); + } destroy_vcs_context(i915->pxp); diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp_pm.c b/drivers/gpu/drm/i915/pxp/intel_pxp_pm.c index 4f836b317424..1a04067f61fc 100644 --- a/drivers/gpu/drm/i915/pxp/intel_pxp_pm.c +++ b/drivers/gpu/drm/i915/pxp/intel_pxp_pm.c @@ -43,8 +43,9 @@ void intel_pxp_resume_complete(struct intel_pxp *pxp) * The PXP component gets automatically unbound when we go into S3 and * re-bound after we come out, so in that scenario we can defer the * hw init to the bind call. +* NOTE: GSC-CS backend doesn't rely on components. */ - if (!pxp->pxp_component) + if (!HAS_ENGINE(pxp->ctrl_gt, GSC0) && !pxp->pxp_component) return; intel_pxp_init_hw(pxp);
Re: [Intel-gfx] [PATCH v6 6/8] drm/i915/pxp: MTL-KCR interrupt ctrl's are in GT-0
On 2/27/2023 6:21 PM, Alan Previn wrote: Despite KCR subsystem being in the media-tile (close to the GSC-CS), the IRQ controls for it are on GT-0 with other global IRQ controls. Thus, add a helper for KCR hw interrupt enable/disable functions to get the correct gt structure (for uncore) for MTL. This is not correct. On MTL, the interrupts logic isn't on any particular GT, it is in a shared area. The fact that we handle all interrupts as if they were triggered on the root GT is an i915 implementation decision. Both uncores have access to the irq regs and the 2 GTs share the irq lock. A comparable example is the media GuC, where the interrupts enable/disable functions are called with the media GT structure. In the helper, we get GT-0's handle for uncore when touching IRQ registers despite the pxp->ctrl_gt being the media-tile. No difference for legacy of course. Signed-off-by: Alan Previn --- drivers/gpu/drm/i915/pxp/intel_pxp_debugfs.c | 2 +- drivers/gpu/drm/i915/pxp/intel_pxp_irq.c | 24 +--- drivers/gpu/drm/i915/pxp/intel_pxp_irq.h | 8 +++ 3 files changed, 30 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp_debugfs.c b/drivers/gpu/drm/i915/pxp/intel_pxp_debugfs.c index 4b8e70caa3ad..9f6e300486b4 100644 --- a/drivers/gpu/drm/i915/pxp/intel_pxp_debugfs.c +++ b/drivers/gpu/drm/i915/pxp/intel_pxp_debugfs.c @@ -44,7 +44,7 @@ static int pxp_terminate_get(void *data, u64 *val) static int pxp_terminate_set(void *data, u64 val) { struct intel_pxp *pxp = data; - struct intel_gt *gt = pxp->ctrl_gt; + struct intel_gt *gt = intel_pxp_get_irq_gt(pxp); In this function the only use you have of the GT is to take gt->irq_lock, but that's shared between the GTs so it is ok to use the media GT for it. if (!intel_pxp_is_active(pxp)) return -ENODEV; diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp_irq.c b/drivers/gpu/drm/i915/pxp/intel_pxp_irq.c index 91e9622c07d0..3a725397349f 100644 --- a/drivers/gpu/drm/i915/pxp/intel_pxp_irq.c +++ b/drivers/gpu/drm/i915/pxp/intel_pxp_irq.c @@ -4,10 +4,12 @@ */ #include +#include "gt/intel_gt.h" #include "gt/intel_gt_irq.h" #include "gt/intel_gt_regs.h" #include "gt/intel_gt_types.h" +#include "i915_drv.h" #include "i915_irq.h" #include "i915_reg.h" @@ -17,6 +19,22 @@ #include "intel_pxp_types.h" #include "intel_runtime_pm.h" +/** + * intel_pxp_get_irq_gt - Find the correct GT that owns KCR interrupts + * @pxp: pointer to pxp struct + * + * For platforms with a single GT, we return the pxp->ctrl_gt (as expected) + * but for MTL+ that has a media-tile, although the KCR engine is in the + * media-tile (i.e. pxp->ctrl_gt), the IRQ controls are on the root tile. + * In the end, we don't use pxp->ctrl_gt for IRQ, we always return root gt. + */ +struct intel_gt *intel_pxp_get_irq_gt(struct intel_pxp *pxp) +{ + WARN_ON_ONCE(!pxp->ctrl_gt->i915->media_gt && !gt_is_root(pxp->ctrl_gt)); + + return to_gt(pxp->ctrl_gt->i915); +} + /** * intel_pxp_irq_handler - Handles PXP interrupts. * @pxp: pointer to pxp struct @@ -29,7 +47,7 @@ void intel_pxp_irq_handler(struct intel_pxp *pxp, u16 iir) if (GEM_WARN_ON(!intel_pxp_is_enabled(pxp))) return; - gt = pxp->ctrl_gt; + gt = intel_pxp_get_irq_gt(pxp); same as above, only use here is the lock. lockdep_assert_held(gt->irq_lock); @@ -68,7 +86,7 @@ static inline void pxp_irq_reset(struct intel_gt *gt) void intel_pxp_irq_enable(struct intel_pxp *pxp) { - struct intel_gt *gt = pxp->ctrl_gt; + struct intel_gt *gt = intel_pxp_get_irq_gt(pxp); in this function we use the gt for: 1 - the lock: see above about this 2 - gen11_gt_reset_one_iir(): this should work with the media GT (we use it for media GuC) 3 - writes to the GEN11_CRYPTO_* regs: those should also work with the media GT uncore as these regs are in the same range as the GuC scratch regs and we use the media uncore for those accesses. spin_lock_irq(gt->irq_lock); @@ -83,7 +101,7 @@ void intel_pxp_irq_enable(struct intel_pxp *pxp) void intel_pxp_irq_disable(struct intel_pxp *pxp) { - struct intel_gt *gt = pxp->ctrl_gt; + struct intel_gt *gt = intel_pxp_get_irq_gt(pxp); AFAICS this functions uses the same 3 cases as above. Overall, I am not sure this patch is required. Am I missing something? Daniele /* * We always need to submit a global termination when we re-enable the diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp_irq.h b/drivers/gpu/drm/i915/pxp/intel_pxp_irq.h index 8c292dc86f68..eea87c9eb62b 100644 --- a/drivers/gpu/drm/i915/pxp/intel_pxp_irq.h +++ b/drivers/gpu/drm/i915/pxp/intel_pxp_irq.h @@ -9,6 +9,7 @@ #include struct intel_pxp; +struct intel_gt; #define GEN12_DISPLAY_PXP_STATE_TERMINATED_INTERRUPT BIT(1) #define
Re: [Intel-gfx] [PATCH v6 5/8] drm/i915/pxp: Add ARB session creation and cleanup
On 2/27/2023 6:21 PM, Alan Previn wrote: Add MTL's function for ARB session creation using PXP firmware version 4.3 ABI structure format. Also add MTL's function for ARB session invalidation but this reuses PXP firmware version 4.2 ABI structure format. Before checking the return status, look at the GSC-CS-Mem-Header's pending-bit which means the GSC firmware is busy and we should resubmit. Signed-off-by: Alan Previn --- drivers/gpu/drm/i915/pxp/intel_pxp.c | 34 -- .../drm/i915/pxp/intel_pxp_cmd_interface_43.h | 21 +++ drivers/gpu/drm/i915/pxp/intel_pxp_gsccs.c| 62 +++ drivers/gpu/drm/i915/pxp/intel_pxp_gsccs.h| 4 ++ drivers/gpu/drm/i915/pxp/intel_pxp_session.c | 11 +++- 5 files changed, 126 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp.c b/drivers/gpu/drm/i915/pxp/intel_pxp.c index aecc65b5da70..61041277be24 100644 --- a/drivers/gpu/drm/i915/pxp/intel_pxp.c +++ b/drivers/gpu/drm/i915/pxp/intel_pxp.c @@ -290,6 +290,8 @@ static bool pxp_component_bound(struct intel_pxp *pxp) static int __pxp_global_teardown_final(struct intel_pxp *pxp) { + int timeout; + if (!pxp->arb_is_valid) return 0; /* @@ -299,7 +301,12 @@ static int __pxp_global_teardown_final(struct intel_pxp *pxp) intel_pxp_mark_termination_in_progress(pxp); intel_pxp_terminate(pxp, false); - if (!wait_for_completion_timeout(>termination, msecs_to_jiffies(250))) + if (HAS_ENGINE(pxp->ctrl_gt, GSC0)) + timeout = GSC_PENDING_RETRY_LATENCY_MS; + else + timeout = 250; + + if (!wait_for_completion_timeout(>termination, msecs_to_jiffies(timeout))) return -ETIMEDOUT; return 0; @@ -307,6 +314,8 @@ static int __pxp_global_teardown_final(struct intel_pxp *pxp) static int __pxp_global_teardown_restart(struct intel_pxp *pxp) { + int timeout; + if (pxp->arb_is_valid) return 0; /* @@ -315,7 +324,12 @@ static int __pxp_global_teardown_restart(struct intel_pxp *pxp) */ pxp_queue_termination(pxp); - if (!wait_for_completion_timeout(>termination, msecs_to_jiffies(250))) + if (HAS_ENGINE(pxp->ctrl_gt, GSC0)) + timeout = GSC_PENDING_RETRY_LATENCY_MS; + else + timeout = 250; + + if (!wait_for_completion_timeout(>termination, msecs_to_jiffies(timeout))) return -ETIMEDOUT; return 0; @@ -353,8 +367,20 @@ int intel_pxp_start(struct intel_pxp *pxp) if (!intel_pxp_is_enabled(pxp)) return -ENODEV; - if (wait_for(pxp_component_bound(pxp), 250)) - return -ENXIO; + if (HAS_ENGINE(pxp->ctrl_gt, GSC0)) { + /* +* GSC-fw loading, GSC-proxy init (requiring an mei component driver) and +* HuC-fw loading must all occur first before we start requesting for PXP +* sessions. Checking HuC authentication (the last dependency) will suffice. +* Let's use a much larger 8 second timeout considering all the types of +* dependencies prior to that. +*/ + if (wait_for(intel_huc_is_authenticated(>ctrl_gt->uc.huc), 8000)) This big timeout needs an ack from userspace drivers, as intel_pxp_start is called during context creation and the current way to query if the feature is supported is to create a protected context. Unfortunately, we do need to wait to confirm that PXP is available (although in most cases it shouldn't take even close to 8 secs), because until everything is setup we're not sure if things will work as expected. I see 2 potential mitigations in case the timeout doesn't work as-is: 1) we return -EAGAIN (or another dedicated error code) to userspace if the prerequisite steps aren't done yet. This would indicate that the feature is there, but that we haven't completed the setup yet. The caller can then decide if they want to retry immediately or later. Pro: more flexibility for userspace; Cons: new interface return code. 2) we add a getparam to say if PXP is supported in HW and the support is compiled in i915. Userspace can query this as a way to check the feature support and only create the context if they actually need it for PXP operations. Pro: simpler kernel implementation; Cons: new getparam, plus even if the getparam returns true the pxp_start could later fail, so userspace needs to handle that case. + return -ENXIO; + } else { + if (wait_for(pxp_component_bound(pxp), 250)) + return -ENXIO; + } mutex_lock(>arb_mutex); diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp_cmd_interface_43.h b/drivers/gpu/drm/i915/pxp/intel_pxp_cmd_interface_43.h index b2523d6918c7..9089e02a8c2d 100644 ---
[Intel-gfx] [PATCH 2/2] drm/i915/pmu: Use correct requested freq for SLPC
SLPC does not use 'struct intel_rps'. Use UNSLICE_RATIO bits from GEN6_RPNSWREQ for SLPC. See intel_rps_get_requested_frequency. Bspec: 52745 Signed-off-by: Ashutosh Dixit --- drivers/gpu/drm/i915/i915_pmu.c | 9 +++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c index f0a1e36915b8..5ee836610801 100644 --- a/drivers/gpu/drm/i915/i915_pmu.c +++ b/drivers/gpu/drm/i915/i915_pmu.c @@ -394,8 +394,13 @@ frequency_sample(struct intel_gt *gt, unsigned int period_ns) * frequency. Fortunately, the read should rarely fail! */ val = intel_rps_get_cagf(rps, intel_rps_read_rpstat_fw(rps)); - if (!val) - val = rps->cur_freq; + if (!val) { + if (intel_uc_uses_guc_slpc(>uc)) + val = intel_rps_read_punit_req(rps) >> + GEN9_SW_REQ_UNSLICE_RATIO_SHIFT; + else + val = rps->cur_freq; + } add_sample_mult(>sample[__I915_SAMPLE_FREQ_ACT], intel_gpu_freq(rps, val), period_ns / 1000); -- 2.38.0
[Intel-gfx] [PATCH 1/2] drm/i915/pmu: Use only freq bits for falling back to requested freq
On newer generations, the GEN12_RPSTAT1 register contains more than freq information, e.g. see GEN12_VOLTAGE_MASK. Therefore use only the freq bits to decide whether to fall back to requested freq. Signed-off-by: Ashutosh Dixit --- drivers/gpu/drm/i915/i915_pmu.c | 6 ++ 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c index 52531ab28c5f..f0a1e36915b8 100644 --- a/drivers/gpu/drm/i915/i915_pmu.c +++ b/drivers/gpu/drm/i915/i915_pmu.c @@ -393,10 +393,8 @@ frequency_sample(struct intel_gt *gt, unsigned int period_ns) * case we assume the system is running at the intended * frequency. Fortunately, the read should rarely fail! */ - val = intel_rps_read_rpstat_fw(rps); - if (val) - val = intel_rps_get_cagf(rps, val); - else + val = intel_rps_get_cagf(rps, intel_rps_read_rpstat_fw(rps)); + if (!val) val = rps->cur_freq; add_sample_mult(>sample[__I915_SAMPLE_FREQ_ACT], -- 2.38.0
[Intel-gfx] [PATCH 0/2] drm/i915/pmu: Freq sampling: Fix requested freq fallback
A couple of minor fixes to the PMU requested freq fallback for PMU freq sampling. Ashutosh Dixit (2): drm/i915/pmu: Use only freq bits for falling back to requested freq drm/i915/pmu: Use correct requested freq for SLPC drivers/gpu/drm/i915/i915_pmu.c | 13 - 1 file changed, 8 insertions(+), 5 deletions(-) -- 2.38.0
[Intel-gfx] ✓ Fi.CI.IGT: success for series starting with [v4,1/5] drm/i915/power: move dc state members to struct i915_power_domains
== Series Details == Series: series starting with [v4,1/5] drm/i915/power: move dc state members to struct i915_power_domains URL : https://patchwork.freedesktop.org/series/114515/ State : success == Summary == CI Bug Log - changes from CI_DRM_12799_full -> Patchwork_114515v1_full Summary --- **SUCCESS** No regressions found. Participating hosts (19 -> 19) -- No changes in participating hosts Possible new issues --- Here are the unknown changes that may have been introduced in Patchwork_114515v1_full: ### IGT changes ### Possible regressions * igt@kms_atomic_transition@modeset-transition-nonblocking@3x-outputs (NEW): - {shard-dg2-11}: NOTRUN -> [FAIL][1] [1]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114515v1/shard-dg2-11/igt@kms_atomic_transition@modeset-transition-nonblock...@3x-outputs.html Suppressed The following results come from untrusted machines, tests, or statuses. They do not affect the overall result. * igt@kms_content_protection@atomic@pipe-a-dp-2: - {shard-dg2-12}: NOTRUN -> [TIMEOUT][2] [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114515v1/shard-dg2-12/igt@kms_content_protection@ato...@pipe-a-dp-2.html * igt@kms_hdr@static-toggle-suspend@pipe-a-dp-1: - {shard-dg2-11}: NOTRUN -> [FAIL][3] [3]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114515v1/shard-dg2-11/igt@kms_hdr@static-toggle-susp...@pipe-a-dp-1.html * {igt@kms_plane_scaling@planes-downscale-factor-0-25-upscale-0-25@pipe-d-dp-4}: - {shard-dg2-11}: NOTRUN -> [SKIP][4] [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114515v1/shard-dg2-11/igt@kms_plane_scaling@planes-downscale-factor-0-25-upscale-0...@pipe-d-dp-4.html * {igt@kms_plane_scaling@planes-downscale-factor-0-25-upscale-20x20@pipe-c-dp-2}: - {shard-dg2-12}: NOTRUN -> [SKIP][5] +2 similar issues [5]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114515v1/shard-dg2-12/igt@kms_plane_scaling@planes-downscale-factor-0-25-upscale-20...@pipe-c-dp-2.html New tests - New tests have been introduced between CI_DRM_12799_full and Patchwork_114515v1_full: ### New IGT tests (23) ### * igt@kms_atomic_transition@modeset-transition-nonblocking@3x-outputs: - Statuses : 1 fail(s) - Exec time: [0.0] s * igt@kms_atomic_transition@modeset-transition-nonblocking@4x-outputs: - Statuses : 1 pass(s) - Exec time: [0.0] s * igt@kms_flip@2x-flip-vs-modeset-vs-hang@ab-dp2-dp3: - Statuses : - Exec time: [None] s * igt@kms_flip@2x-plain-flip@ab-dp2-dp3: - Statuses : 1 pass(s) - Exec time: [0.0] s * igt@kms_flip@2x-plain-flip@ab-dp2-dp4: - Statuses : 1 pass(s) - Exec time: [0.0] s * igt@kms_flip@2x-plain-flip@ab-dp3-dp4: - Statuses : 1 pass(s) - Exec time: [0.0] s * igt@kms_flip@2x-plain-flip@ac-dp2-dp3: - Statuses : 1 pass(s) - Exec time: [0.0] s * igt@kms_flip@2x-plain-flip@ac-dp2-dp4: - Statuses : 1 pass(s) - Exec time: [0.0] s * igt@kms_flip@2x-plain-flip@ac-dp3-dp4: - Statuses : 1 pass(s) - Exec time: [0.0] s * igt@kms_flip@2x-plain-flip@ad-dp2-dp3: - Statuses : 1 pass(s) - Exec time: [0.0] s * igt@kms_flip@2x-plain-flip@ad-dp2-dp4: - Statuses : 1 pass(s) - Exec time: [0.0] s * igt@kms_flip@2x-plain-flip@ad-dp3-dp4: - Statuses : 1 pass(s) - Exec time: [0.0] s * igt@kms_flip@2x-plain-flip@bc-dp2-dp3: - Statuses : 1 pass(s) - Exec time: [0.0] s * igt@kms_flip@2x-plain-flip@bc-dp2-dp4: - Statuses : 1 pass(s) - Exec time: [0.0] s * igt@kms_flip@2x-plain-flip@bc-dp3-dp4: - Statuses : 1 pass(s) - Exec time: [0.0] s * igt@kms_flip@2x-plain-flip@bd-dp2-dp3: - Statuses : 1 pass(s) - Exec time: [0.0] s * igt@kms_flip@2x-plain-flip@bd-dp2-dp4: - Statuses : 1 pass(s) - Exec time: [0.0] s * igt@kms_flip@2x-plain-flip@bd-dp3-dp4: - Statuses : 1 pass(s) - Exec time: [0.0] s * igt@kms_flip@2x-plain-flip@cd-dp2-dp3: - Statuses : 1 pass(s) - Exec time: [0.0] s * igt@kms_flip@2x-plain-flip@cd-dp2-dp4: - Statuses : 1 pass(s) - Exec time: [0.0] s * igt@kms_flip@2x-plain-flip@cd-dp3-dp4: - Statuses : 1 pass(s) - Exec time: [0.0] s * igt@kms_flip@flip-vs-suspend-interruptible@d-dp4: - Statuses : 1 pass(s) - Exec time: [0.0] s * igt@kms_flip@modeset-vs-vblank-race-interruptible@d-dp4: - Statuses : 1 pass(s) - Exec time: [0.0] s Known issues Here are the changes found in Patchwork_114515v1_full that come from known issues: ### IGT changes ### Issues hit * igt@device_reset@cold-reset-bound: - shard-tglu-10: NOTRUN -> [SKIP][6] ([i915#7701]) [6]:
Re: [Intel-gfx] [PATCH v6 4/8] drm/i915/pxp: Add GSC-CS backend to send GSC fw messages
On 2/27/2023 6:21 PM, Alan Previn wrote: Add GSC engine based method for sending PXP firmware packets to the GSC firmware for MTL (and future) products. Use the newly added helpers to populate the GSC-CS memory header and send the message packet to the FW by dispatching the GSC_HECI_CMD_PKT instruction on the GSC engine. We use non-priveleged batches for submission to GSC engine which require two buffers for the request: - a buffer for the HECI packet that contains PXP FW commands - a batch-buffer that contains the engine instruction for sending the HECI packet to the GSC firmware. Thus, add the allocation and freeing of these buffers in gsccs init and fini. The GSC-fw may reply to commands with a SUCCESS but with an additional pending-bit set in the reply packet. This bit means the GSC-FW is currently busy and the caller needs to try again with the gsc_message_handle the fw gave. The GSC-fw requires a non-zero host_session_handle provided by the caller to enable gsc_message_handle tracking. Thus, allocate the host_session_handle at init and destroy it at fini (the latter requiring an FYI to the gsc-firmware). Also ensure the send-message function allows replay of the gsc_message_handle. Signed-off-by: Alan Previn --- .../drm/i915/pxp/intel_pxp_cmd_interface_43.h | 4 + drivers/gpu/drm/i915/pxp/intel_pxp_gsccs.c| 239 +- drivers/gpu/drm/i915/pxp/intel_pxp_gsccs.h| 4 + drivers/gpu/drm/i915/pxp/intel_pxp_types.h| 6 + 4 files changed, 251 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp_cmd_interface_43.h b/drivers/gpu/drm/i915/pxp/intel_pxp_cmd_interface_43.h index ad67e3f49c20..b2523d6918c7 100644 --- a/drivers/gpu/drm/i915/pxp/intel_pxp_cmd_interface_43.h +++ b/drivers/gpu/drm/i915/pxp/intel_pxp_cmd_interface_43.h @@ -12,6 +12,10 @@ /* PXP-Cmd-Op definitions */ #define PXP43_CMDID_START_HUC_AUTH 0x003A +/* PXP-Packet sizes for MTL's GSCCS-HECI instruction */ +#define PXP43_MAX_HECI_IN_SIZE (SZ_32K) +#define PXP43_MAX_HECI_OUT_SIZE (SZ_32K) + /* PXP-Input-Packet: HUC-Authentication */ struct pxp43_start_huc_auth_in { struct pxp_cmd_header header; diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp_gsccs.c b/drivers/gpu/drm/i915/pxp/intel_pxp_gsccs.c index 13693e78b57e..30aa660a975f 100644 --- a/drivers/gpu/drm/i915/pxp/intel_pxp_gsccs.c +++ b/drivers/gpu/drm/i915/pxp/intel_pxp_gsccs.c @@ -6,19 +6,226 @@ #include "gem/i915_gem_internal.h" #include "gt/intel_context.h" +#include "gt/uc/intel_gsc_uc_heci_cmd_submit.h" #include "i915_drv.h" #include "intel_pxp_cmd_interface_43.h" #include "intel_pxp_gsccs.h" #include "intel_pxp_types.h" +static int +gsccs_send_message(struct intel_pxp *pxp, + void *msg_in, size_t msg_in_size, + void *msg_out, size_t msg_out_size_max, + size_t *msg_out_len, + u64 *gsc_msg_handle_retry) +{ + struct intel_gt *gt = pxp->ctrl_gt; + struct drm_i915_private *i915 = gt->i915; + struct gsccs_session_resources *exec = >gsccs_res; in the alloc/free functions you called this object *strm_res; IMO better to use a consistent naming so it is clear they're the same object + struct intel_gsc_mtl_header *header = exec->pkt_vaddr; + struct intel_gsc_heci_non_priv_pkt pkt; + bool null_pkt = !msg_in && !msg_out; + size_t max_msg_size; + u32 reply_size; + int ret; + + if (!exec->ce) + return -ENODEV; + + max_msg_size = PXP43_MAX_HECI_IN_SIZE - sizeof(*header); Using the same max_msg_size for both in and out only works if PXP43_MAX_HECI_IN_SIZE == PXP43_MAX_HECI_OUT_SIZE. This is true now, but I'd add a: BUILD_BUG_ON(PXP43_MAX_HECI_IN_SIZE != PXP43_MAX_HECI_OUT_SIZE); just to be safe. Potentially also a: GEM_BUG_ON(exec->pkt_vma->size < (PXP43_MAX_HECI_IN_SIZE + PXP43_MAX_HECI_OUT_SIZE)); After checking that exec->pkt_vma exists. + + if (msg_in_size > max_msg_size || msg_out_size_max > max_msg_size) + return -ENOSPC; + + if (!exec->pkt_vma || !exec->bb_vma) + return -ENOENT; + + mutex_lock(>tee_mutex); + + memset(header, 0, sizeof(*header)); + intel_gsc_uc_heci_cmd_emit_mtl_header(header, GSC_HECI_MEADDRESS_PXP, + msg_in_size + sizeof(*header), + exec->host_session_handle); + + /* check if this is a host-session-handle cleanup call */ + if (null_pkt) nit: can't you just use if (!msg_in && !msg_out) instead of a local var? not a blocker. + header->flags |= GSC_HECI_FLAG_MSG_CLEANUP; + + /* copy caller provided gsc message handle if this is polling for a prior msg completion */ + header->gsc_message_handle = *gsc_msg_handle_retry; + + /* NOTE: zero size packets are used for session-cleanups */ +
Re: [Intel-gfx] [PATCH] drm/i915: Set wedged if enable guc communication failed
On 3/2/2023 1:50 PM, Zhanjun Dong wrote: Add err code check for enable_communication on resume path. When resume failed, we can no longer use the GPU, marking the GPU as wedged. This is slightly incorrect. If we fail to enable communication, the consequence is that we can't use the GuC. That translates to a failure to use the GT only if GuC submission is enabled; in execlists submission mode we can keep going, although we might end up losing HuC functionality (which is not considered a fatal error). Therefore, the code below should be updated to reflect this. Signed-off-by: Zhanjun Dong --- drivers/gpu/drm/i915/gt/intel_gt_pm.c | 7 ++- drivers/gpu/drm/i915/gt/uc/intel_uc.c | 9 +++-- 2 files changed, 13 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_gt_pm.c b/drivers/gpu/drm/i915/gt/intel_gt_pm.c index cef3d6f5c34e..42a7d75ce39e 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt_pm.c +++ b/drivers/gpu/drm/i915/gt/intel_gt_pm.c @@ -401,8 +401,13 @@ int intel_gt_runtime_resume(struct intel_gt *gt) intel_ggtt_restore_fences(gt->ggtt); ret = intel_uc_runtime_resume(>uc); - if (ret) + if (ret) { + /* Resume failed, we can no longer use the GPU, marking the GPU +* as wedged. +*/ + intel_gt_set_wedged(gt); intel_gt_set_wedged calls intel_runtime_pm_get, so it will deadlock if you call if from within the runtime_resume flow. return ret; + } return 0; } diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc.c b/drivers/gpu/drm/i915/gt/uc/intel_uc.c index 6648691bd645..d4f428acf20a 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_uc.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_uc.c @@ -698,8 +698,13 @@ static int __uc_resume(struct intel_uc *uc, bool enable_communication) /* Make sure we enable communication if and only if it's disabled */ GEM_BUG_ON(enable_communication == intel_guc_ct_enabled(>ct)); - if (enable_communication) - guc_enable_communication(guc); + if (enable_communication) { + err = guc_enable_communication(guc); + if (err) { + guc_dbg(guc, "Failed to resume, %pe", ERR_PTR(err)); nit: this isn't exactly a failure to resume because the GuC is running. It's more a failure to re-establish communication with the GuC. Daniele + return err; + } + } /* If we are only resuming GuC communication but not reloading * GuC, we need to ensure the ARAT timer interrupt is enabled
[Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915: Check HPD during eDP probe (rev5)
== Series Details == Series: drm/i915: Check HPD during eDP probe (rev5) URL : https://patchwork.freedesktop.org/series/114577/ State : success == Summary == CI Bug Log - changes from CI_DRM_12810 -> Patchwork_114577v5 Summary --- **SUCCESS** No regressions found. External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114577v5/index.html Participating hosts (39 -> 40) -- Additional (1): fi-kbl-soraka Known issues Here are the changes found in Patchwork_114577v5 that come from known issues: ### IGT changes ### Issues hit * igt@gem_huc_copy@huc-copy: - fi-kbl-soraka: NOTRUN -> [SKIP][1] ([fdo#109271] / [i915#2190]) [1]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114577v5/fi-kbl-soraka/igt@gem_huc_c...@huc-copy.html * igt@gem_lmem_swapping@basic: - fi-kbl-soraka: NOTRUN -> [SKIP][2] ([fdo#109271] / [i915#4613]) +3 similar issues [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114577v5/fi-kbl-soraka/igt@gem_lmem_swapp...@basic.html * igt@i915_selftest@live@execlists: - fi-kbl-soraka: NOTRUN -> [INCOMPLETE][3] ([i915#7913]) [3]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114577v5/fi-kbl-soraka/igt@i915_selftest@l...@execlists.html * igt@i915_selftest@live@gt_heartbeat: - fi-kbl-soraka: NOTRUN -> [DMESG-FAIL][4] ([i915#5334] / [i915#7872]) [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114577v5/fi-kbl-soraka/igt@i915_selftest@live@gt_heartbeat.html * igt@i915_selftest@live@gt_pm: - fi-kbl-soraka: NOTRUN -> [DMESG-FAIL][5] ([i915#1886]) [5]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114577v5/fi-kbl-soraka/igt@i915_selftest@live@gt_pm.html * igt@i915_selftest@live@slpc: - bat-rpls-1: [PASS][6] -> [DMESG-FAIL][7] ([i915#6367] / [i915#7996]) [6]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12810/bat-rpls-1/igt@i915_selftest@l...@slpc.html [7]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114577v5/bat-rpls-1/igt@i915_selftest@l...@slpc.html * igt@kms_cursor_legacy@basic-busy-flip-before-cursor-atomic: - fi-kbl-soraka: NOTRUN -> [SKIP][8] ([fdo#109271]) +16 similar issues [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114577v5/fi-kbl-soraka/igt@kms_cursor_leg...@basic-busy-flip-before-cursor-atomic.html * igt@kms_pipe_crc_basic@suspend-read-crc@pipe-c-dp-3: - bat-dg2-9: [PASS][9] -> [FAIL][10] ([i915#7932]) [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12810/bat-dg2-9/igt@kms_pipe_crc_basic@suspend-read-...@pipe-c-dp-3.html [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114577v5/bat-dg2-9/igt@kms_pipe_crc_basic@suspend-read-...@pipe-c-dp-3.html Possible fixes * igt@i915_selftest@live@gt_heartbeat: - fi-apl-guc: [DMESG-FAIL][11] ([i915#5334]) -> [PASS][12] [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12810/fi-apl-guc/igt@i915_selftest@live@gt_heartbeat.html [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114577v5/fi-apl-guc/igt@i915_selftest@live@gt_heartbeat.html * igt@i915_selftest@live@migrate: - bat-adlp-6: [DMESG-FAIL][13] ([i915#7699]) -> [PASS][14] [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12810/bat-adlp-6/igt@i915_selftest@l...@migrate.html [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114577v5/bat-adlp-6/igt@i915_selftest@l...@migrate.html Warnings * igt@i915_selftest@live@requests: - bat-rpls-2: [ABORT][15] ([i915#7694] / [i915#7913] / [i915#7982]) -> [ABORT][16] ([i915#4983] / [i915#7694] / [i915#7913] / [i915#7981]) [15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12810/bat-rpls-2/igt@i915_selftest@l...@requests.html [16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114577v5/bat-rpls-2/igt@i915_selftest@l...@requests.html [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271 [i915#1886]: https://gitlab.freedesktop.org/drm/intel/issues/1886 [i915#2190]: https://gitlab.freedesktop.org/drm/intel/issues/2190 [i915#4613]: https://gitlab.freedesktop.org/drm/intel/issues/4613 [i915#4983]: https://gitlab.freedesktop.org/drm/intel/issues/4983 [i915#5334]: https://gitlab.freedesktop.org/drm/intel/issues/5334 [i915#6367]: https://gitlab.freedesktop.org/drm/intel/issues/6367 [i915#7694]: https://gitlab.freedesktop.org/drm/intel/issues/7694 [i915#7699]: https://gitlab.freedesktop.org/drm/intel/issues/7699 [i915#7872]: https://gitlab.freedesktop.org/drm/intel/issues/7872 [i915#7913]: https://gitlab.freedesktop.org/drm/intel/issues/7913 [i915#7932]: https://gitlab.freedesktop.org/drm/intel/issues/7932 [i915#7981]: https://gitlab.freedesktop.org/drm/intel/issues/7981 [i915#7982]: https://gitlab.freedesktop.org/drm/intel/issues/7982 [i915#7996]:
[Intel-gfx] ✗ Fi.CI.SPARSE: warning for drm/i915: Check HPD during eDP probe (rev5)
== Series Details == Series: drm/i915: Check HPD during eDP probe (rev5) URL : https://patchwork.freedesktop.org/series/114577/ State : warning == Summary == Error: dim sparse failed Sparse version: v0.6.2 Fast mode used, each commit won't be checked separately. +./arch/x86/include/asm/bitops.h:117:1: warning: unreplaced symbol 'return' +./arch/x86/include/asm/bitops.h:117:1: warning: unreplaced symbol 'return' +./arch/x86/include/asm/bitops.h:117:1: warning: unreplaced symbol 'return' +./arch/x86/include/asm/bitops.h:117:1: warning: unreplaced symbol 'return' +./arch/x86/include/asm/bitops.h:117:1: warning: unreplaced symbol 'return' +./arch/x86/include/asm/bitops.h:117:1: warning: unreplaced symbol 'return' +./arch/x86/include/asm/bitops.h:117:1: warning: unreplaced symbol 'return' +./arch/x86/include/asm/bitops.h:117:1: warning: unreplaced symbol 'return' +./arch/x86/include/asm/bitops.h:117:1: warning: unreplaced symbol 'return' +./arch/x86/include/asm/bitops.h:117:1: warning: unreplaced symbol 'return' +./arch/x86/include/asm/bitops.h:117:1: warning: unreplaced symbol 'return' +./arch/x86/include/asm/bitops.h:117:1: warning: unreplaced symbol 'return' +./arch/x86/include/asm/bitops.h:117:1: warning: unreplaced symbol 'return' +./arch/x86/include/asm/bitops.h:117:1: warning: unreplaced symbol 'return' +./arch/x86/include/asm/bitops.h:117:1: warning: unreplaced symbol 'return' +./arch/x86/include/asm/bitops.h:117:1: warning: unreplaced symbol 'return' +./arch/x86/include/asm/bitops.h:117:1: warning: unreplaced symbol 'return' +./arch/x86/include/asm/bitops.h:117:1: warning: unreplaced symbol 'return' +./arch/x86/include/asm/bitops.h:117:1: warning: unreplaced symbol 'return' +./arch/x86/include/asm/bitops.h:117:1: warning: unreplaced symbol 'return' +./arch/x86/include/asm/bitops.h:117:1: warning: unreplaced symbol 'return' +./arch/x86/include/asm/bitops.h:117:1: warning: unreplaced symbol 'return' +./arch/x86/include/asm/bitops.h:117:1: warning: unreplaced symbol 'return' +./arch/x86/include/asm/bitops.h:117:1: warning: unreplaced symbol 'return' +./arch/x86/include/asm/bitops.h:117:1: warning: unreplaced symbol 'return' +./arch/x86/include/asm/bitops.h:117:1: warning: unreplaced symbol 'return' +./arch/x86/include/asm/bitops.h:117:1: warning: unreplaced symbol 'return' +./arch/x86/include/asm/bitops.h:117:1: warning: unreplaced symbol 'return' +./arch/x86/include/asm/bitops.h:117:1: warning: unreplaced symbol 'return' +./arch/x86/include/asm/bitops.h:117:1: warning: unreplaced symbol 'return' +./arch/x86/include/asm/bitops.h:117:1: warning: unreplaced symbol 'return' +./arch/x86/include/asm/bitops.h:117:1: warning: unreplaced symbol 'return' +./arch/x86/include/asm/bitops.h:117:1: warning: unreplaced symbol 'return' +./arch/x86/include/asm/bitops.h:117:1: warning: unreplaced symbol 'return' +./arch/x86/include/asm/bitops.h:117:1: warning: unreplaced symbol 'return' +./arch/x86/include/asm/bitops.h:117:1: warning: unreplaced symbol 'return' +./arch/x86/include/asm/bitops.h:117:1: warning: unreplaced symbol 'return' +./arch/x86/include/asm/bitops.h:117:1: warning: unreplaced symbol 'return' +./arch/x86/include/asm/bitops.h:117:1: warning: unreplaced symbol 'return' +./arch/x86/include/asm/bitops.h:117:1: warning: unreplaced symbol 'return' +./arch/x86/include/asm/bitops.h:117:1: warning: unreplaced symbol 'return' +./arch/x86/include/asm/bitops.h:117:1: warning: unreplaced symbol 'return' +./arch/x86/include/asm/bitops.h:117:1: warning: unreplaced symbol 'return' +./arch/x86/include/asm/bitops.h:117:1: warning: unreplaced symbol 'return' +./arch/x86/include/asm/bitops.h:117:1: warning: unreplaced symbol 'return' +./arch/x86/include/asm/bitops.h:117:1: warning: unreplaced symbol 'return' +./arch/x86/include/asm/bitops.h:117:1: warning: unreplaced symbol 'return' +./arch/x86/include/asm/bitops.h:117:1: warning: unreplaced symbol 'return' +./arch/x86/include/asm/bitops.h:117:1: warning: unreplaced symbol 'return' +./arch/x86/include/asm/bitops.h:117:1: warning: unreplaced symbol 'return' +./arch/x86/include/asm/bitops.h:117:1: warning: unreplaced symbol 'return' +./arch/x86/include/asm/bitops.h:117:1: warning: unreplaced symbol 'return' +./arch/x86/include/asm/bitops.h:117:1: warning: unreplaced symbol 'return' +./arch/x86/include/asm/bitops.h:117:1: warning: unreplaced symbol 'return' +./arch/x86/include/asm/bitops.h:117:1: warning: unreplaced symbol 'return' +./arch/x86/include/asm/bitops.h:117:1: warning: unreplaced symbol 'return' +./arch/x86/include/asm/bitops.h:117:1: warning: unreplaced symbol 'return' +./arch/x86/include/asm/bitops.h:117:1: warning: unreplaced symbol 'return' +./arch/x86/include/asm/bitops.h:117:1: warning: unreplaced symbol 'return' +./arch/x86/include/asm/bitops.h:117:1: warning: unreplaced symbol 'return' +./arch/x86/include/asm/bitops.h:117:1: warning: unreplaced symbol 'return' +./arch/x86/include/asm/bitops.h:117:1: warning: unreplaced symbol 'return'
[Intel-gfx] ✗ Fi.CI.BAT: failure for drm/i915: Check HPD during eDP probe (rev4)
== Series Details == Series: drm/i915: Check HPD during eDP probe (rev4) URL : https://patchwork.freedesktop.org/series/114577/ State : failure == Summary == CI Bug Log - changes from CI_DRM_12810 -> Patchwork_114577v4 Summary --- **FAILURE** Serious unknown changes coming with Patchwork_114577v4 absolutely need to be verified manually. If you think the reported changes have nothing to do with the changes introduced in Patchwork_114577v4, please notify your bug team to allow them to document this new failure mode, which will reduce false positives in CI. External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114577v4/index.html Participating hosts (39 -> 40) -- Additional (1): fi-kbl-soraka Possible new issues --- Here are the unknown changes that may have been introduced in Patchwork_114577v4: ### IGT changes ### Possible regressions * igt@gem_exec_suspend@basic-s0@smem: - fi-blb-e6850: [PASS][1] -> [ABORT][2] [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12810/fi-blb-e6850/igt@gem_exec_suspend@basic...@smem.html [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114577v4/fi-blb-e6850/igt@gem_exec_suspend@basic...@smem.html Known issues Here are the changes found in Patchwork_114577v4 that come from known issues: ### IGT changes ### Issues hit * igt@gem_exec_gttfill@basic: - fi-pnv-d510:[PASS][3] -> [FAIL][4] ([i915#7229]) [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12810/fi-pnv-d510/igt@gem_exec_gttf...@basic.html [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114577v4/fi-pnv-d510/igt@gem_exec_gttf...@basic.html * igt@gem_exec_suspend@basic-s0@smem: - fi-rkl-11600: [PASS][5] -> [FAIL][6] ([fdo#103375]) [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12810/fi-rkl-11600/igt@gem_exec_suspend@basic...@smem.html [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114577v4/fi-rkl-11600/igt@gem_exec_suspend@basic...@smem.html * igt@gem_huc_copy@huc-copy: - fi-kbl-soraka: NOTRUN -> [SKIP][7] ([fdo#109271] / [i915#2190]) [7]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114577v4/fi-kbl-soraka/igt@gem_huc_c...@huc-copy.html * igt@gem_lmem_swapping@basic: - fi-kbl-soraka: NOTRUN -> [SKIP][8] ([fdo#109271] / [i915#4613]) +3 similar issues [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114577v4/fi-kbl-soraka/igt@gem_lmem_swapp...@basic.html * igt@i915_selftest@live@execlists: - fi-kbl-soraka: NOTRUN -> [INCOMPLETE][9] ([i915#7913]) [9]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114577v4/fi-kbl-soraka/igt@i915_selftest@l...@execlists.html * igt@i915_selftest@live@gt_pm: - fi-kbl-soraka: NOTRUN -> [DMESG-FAIL][10] ([i915#1886]) [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114577v4/fi-kbl-soraka/igt@i915_selftest@live@gt_pm.html * igt@i915_selftest@live@slpc: - bat-rpls-2: NOTRUN -> [DMESG-FAIL][11] ([i915#6367] / [i915#7913]) [11]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114577v4/bat-rpls-2/igt@i915_selftest@l...@slpc.html - bat-rpls-1: [PASS][12] -> [DMESG-FAIL][13] ([i915#6367] / [i915#7996]) [12]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12810/bat-rpls-1/igt@i915_selftest@l...@slpc.html [13]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114577v4/bat-rpls-1/igt@i915_selftest@l...@slpc.html * igt@kms_chamelium_hpd@common-hpd-after-suspend: - bat-rpls-2: NOTRUN -> [SKIP][14] ([i915#7828]) [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114577v4/bat-rpls-2/igt@kms_chamelium_...@common-hpd-after-suspend.html * igt@kms_cursor_legacy@basic-busy-flip-before-cursor-atomic: - fi-kbl-soraka: NOTRUN -> [SKIP][15] ([fdo#109271]) +16 similar issues [15]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114577v4/fi-kbl-soraka/igt@kms_cursor_leg...@basic-busy-flip-before-cursor-atomic.html * igt@kms_pipe_crc_basic@suspend-read-crc: - bat-rpls-2: NOTRUN -> [SKIP][16] ([i915#1845]) [16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114577v4/bat-rpls-2/igt@kms_pipe_crc_ba...@suspend-read-crc.html Possible fixes * igt@i915_selftest@live@gt_heartbeat: - fi-apl-guc: [DMESG-FAIL][17] ([i915#5334]) -> [PASS][18] [17]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12810/fi-apl-guc/igt@i915_selftest@live@gt_heartbeat.html [18]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114577v4/fi-apl-guc/igt@i915_selftest@live@gt_heartbeat.html * igt@i915_selftest@live@migrate: - bat-adlp-6: [DMESG-FAIL][19] ([i915#7699]) -> [PASS][20] [19]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12810/bat-adlp-6/igt@i915_selftest@l...@migrate.html [20]:
Re: [Intel-gfx] [PATCH v5 00/19] Add vfio_device cdev for iommufd support
On 2/28/23 9:29 PM, Nicolin Chen wrote: > On Tue, Feb 28, 2023 at 04:58:06PM +, Xu, Terrence wrote: > >> Verified this series by "Intel GVT-g GPU device mediated passthrough" and >> "Intel GVT-d GPU device direct passthrough" technologies. >> Both passed VFIO legacy mode / compat mode / cdev mode, including negative >> tests. >> >> Tested-by: Terrence Xu > > Sanity-tested this series on ARM64 with my wip branch: > https://github.com/nicolinc/iommufd/commits/wip/iommufd-v6.2-nesting > (Covering new iommufd and vfio-compat) > > Tested-by: Nicolin Chen Tested a few different flavors of this series on s390 (I grabbed the most recent v6 copy from github): legacy (IOMMUFD=n): vfio-pci, vfio-ccw, vfio-ap compat (CONFIG_IOMMUFD_VFIO_CONTAINER=y): vfio-pci, vfio-ccw, vfio-ap compat+cdev+group (VFIO_DEVICE_CDEV=y && VFIO_GROUP=y): vfio-pci (over cdev using Yi's qemu branch as well as via group), vfio-ccw and vfio-ap via group compat+cdev-only (VFIO_DEVICE_CDEV=y && VFIO_GROUP=n): vfio-pci using Yi's qemu branch Tested-by: Matthew Rosato
[Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for drm/i915: Check HPD during eDP probe (rev4)
== Series Details == Series: drm/i915: Check HPD during eDP probe (rev4) URL : https://patchwork.freedesktop.org/series/114577/ State : warning == Summary == Error: dim checkpatch failed 16df697e08ab drm/i915: Populate dig_port->connected() before connector init -:60: WARNING:BLOCK_COMMENT_STYLE: Block comments use a trailing */ on a separate line #60: FILE: drivers/gpu/drm/i915/display/intel_ddi.c:4538: +* case we have some really bad VBTs... */ total: 0 errors, 1 warnings, 0 checks, 46 lines checked af0d3df94868 drm/i915: Fix SKL DDI A digital port .connected() 289843ee7db8 drm/i915: Get rid of the gm45 HPD live state nonsense 74641e72441b drm/i915: Introduce _hotplug_mask() 96a7c1c4e586 drm/i915: Introduce intel_hpd_enable_detection() e6532b561ef1 drm/i915: Check HPD live state during eDP probe -:51: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ chars of sha1> ("")' - ie: 'commit 41e35ffb380b ("drm/i915: Favor last VBT child device with conflicting AUX ch/DDC pin")' #51: Also the systems (Asrock B250M-HDV at least) fixed by commit total: 1 errors, 0 warnings, 0 checks, 46 lines checked 81a56acf7696 drm/i915: Reuse _hotplug_mask() in .hpd_detection_setup()
Re: [Intel-gfx] [PATCH] drm/i915/pxp: limit drm-errors or warning on firmware API failures
On 2/3/2023 9:48 PM, Alan Previn wrote: MESA driver is creating protected context on every driver handle creation to query caps bits for app. So when running CI tests, they are observing hundreds of drm_errors when enabling PXP in .config but using SOC fusing or BIOS configuration that cannot support PXP sessions. The fixes tag referenced below was to resolve a related issue where we wanted to silence error messages, but that case was due to outdated IFWI (firmware) that definitely needed an upgrade and was, at that point, considered a one-off case as opposed to today's realization that default CI was enabling PXP in kernel config for all testing. So with this patch, let's strike a balance between issues that is critical but are root-caused from HW/platform gaps (louder drm-warn but just ONCE) vs other cases where it could also come from session state machine (which cannot be a WARN_ONCE since it can be triggered due to runtime operation events). Let's use helpers for these so as more functions are added in future features / HW (or as FW designers continue to bless upstreaming of the error codes and meanings), we only need to update the helpers. NOTE: Don't completely remove FW errors (via drm_debug) or else cusomer apps that really needs to know that content protection failed won't be aware of it. Fixes: b762787bf767 ("drm/i915/pxp: Use drm_dbg if arb session failed due to fw version") Signed-off-by: Alan Previn --- .../i915/pxp/intel_pxp_cmd_interface_cmn.h| 3 + drivers/gpu/drm/i915/pxp/intel_pxp_session.c | 2 +- drivers/gpu/drm/i915/pxp/intel_pxp_tee.c | 73 +++ 3 files changed, 64 insertions(+), 14 deletions(-) diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp_cmd_interface_cmn.h b/drivers/gpu/drm/i915/pxp/intel_pxp_cmd_interface_cmn.h index ae9b151b7cb7..6f6541d5e49a 100644 --- a/drivers/gpu/drm/i915/pxp/intel_pxp_cmd_interface_cmn.h +++ b/drivers/gpu/drm/i915/pxp/intel_pxp_cmd_interface_cmn.h @@ -18,6 +18,9 @@ enum pxp_status { PXP_STATUS_SUCCESS = 0x0, PXP_STATUS_ERROR_API_VERSION = 0x1002, + PXP_STATUS_NOT_READY = 0x100e, + PXP_STATUS_PLATFCONFIG_KF1_NOVERIF = 0x101a, + PXP_STATUS_PLATFCONFIG_KF1_BAD = 0x101f, PXP_STATUS_OP_NOT_PERMITTED = 0x4013 }; diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp_session.c b/drivers/gpu/drm/i915/pxp/intel_pxp_session.c index 448cacb0465d..7de849cb6c47 100644 --- a/drivers/gpu/drm/i915/pxp/intel_pxp_session.c +++ b/drivers/gpu/drm/i915/pxp/intel_pxp_session.c @@ -74,7 +74,7 @@ static int pxp_create_arb_session(struct intel_pxp *pxp) ret = pxp_wait_for_session_state(pxp, ARB_SESSION, true); if (ret) { - drm_err(>i915->drm, "arb session failed to go in play\n"); + drm_dbg(>i915->drm, "arb session failed to go in play\n"); return ret; } drm_dbg(>i915->drm, "PXP ARB session is alive\n"); diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp_tee.c b/drivers/gpu/drm/i915/pxp/intel_pxp_tee.c index d9d248b48093..2d3bcff93da3 100644 --- a/drivers/gpu/drm/i915/pxp/intel_pxp_tee.c +++ b/drivers/gpu/drm/i915/pxp/intel_pxp_tee.c @@ -19,6 +19,37 @@ #include "intel_pxp_tee.h" #include "intel_pxp_types.h" +static bool +is_fw_err_platform_config(u32 type) +{ + switch (type) { + case PXP_STATUS_ERROR_API_VERSION: + case PXP_STATUS_PLATFCONFIG_KF1_NOVERIF: + case PXP_STATUS_PLATFCONFIG_KF1_BAD: + return true; + default: + break; + } + return false; +} + +static const char * +fw_err_to_string(u32 type) +{ + switch (type) { + case PXP_STATUS_ERROR_API_VERSION: + return "ERR_API_VERSION"; + case PXP_STATUS_NOT_READY: + return "ERR_NOT_READY"; + case PXP_STATUS_PLATFCONFIG_KF1_NOVERIF: + case PXP_STATUS_PLATFCONFIG_KF1_BAD: + return "ERR_PLATFORM_CONFIG"; + default: + break; + } + return NULL; +} + static int intel_pxp_tee_io_message(struct intel_pxp *pxp, void *msg_in, u32 msg_in_size, void *msg_out, u32 msg_out_max_size, @@ -307,15 +338,21 @@ int intel_pxp_tee_cmd_create_arb_session(struct intel_pxp *pxp, _out, sizeof(msg_out), NULL); - if (ret) - drm_err(>drm, "Failed to send tee msg ret=[%d]\n", ret); - else if (msg_out.header.status == PXP_STATUS_ERROR_API_VERSION) - drm_dbg(>drm, "PXP firmware version unsupported, requested: " - "CMD-ID-[0x%08x] on API-Ver-[0x%08x]\n", - msg_in.header.command_id, msg_in.header.api_version); - else if (msg_out.header.status != 0x0) - drm_warn(>drm, "PXP firmware failed arb session init request ret=[0x%08x]\n", -
Re: [Intel-gfx] [PATCH 2/2] drm/i915/guc: Allow for very slow GuC loading
On 2/17/2023 3:47 PM, john.c.harri...@intel.com wrote: From: John Harrison A failure to load the GuC is occasionally observed where the GuC log actually showed that the GuC had loaded just fine. The implication being that the load just took ever so slightly longer than the 200ms timeout. Given that the actual time should be tens of milliseconds at the slowest, this should never happen. So far the issue has generally been caused by a bad IFWI resulting in low frequencies during boot (depsite the KMD requesting max frequency). However, the issue seems to happen more often than one would like. So a) increase the timeout so that the user still gets a working system even in the case of slow load. And b) report the frequency during the load to see if that is the case of the slow down. Some refs would be good here. From a quick search, these seems to match: https://gitlab.freedesktop.org/drm/intel/-/issues/7931 https://gitlab.freedesktop.org/drm/intel/-/issues/8060 https://gitlab.freedesktop.org/drm/intel/-/issues/8083 https://gitlab.freedesktop.org/drm/intel/-/issues/8136 https://gitlab.freedesktop.org/drm/intel/-/issues/8137 Signed-off-by: John Harrison --- drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c | 37 +-- 1 file changed, 34 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c index 2f5942606913d..72e003f50617d 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c @@ -12,6 +12,7 @@ #include "gt/intel_gt.h" #include "gt/intel_gt_mcr.h" #include "gt/intel_gt_regs.h" +#include "gt/intel_rps.h" #include "intel_guc_fw.h" #include "intel_guc_print.h" #include "i915_drv.h" @@ -139,9 +140,12 @@ static int guc_wait_ucode(struct intel_guc *guc) { struct intel_gt *gt = guc_to_gt(guc); struct intel_uncore *uncore = gt->uncore; + ktime_t before, after, delta; bool success; u32 status; - int ret; + int ret, count; + u64 delta_ms; + u32 before_freq; /* * Wait for the GuC to start up. @@ -159,13 +163,32 @@ static int guc_wait_ucode(struct intel_guc *guc) * issues to be resolved. In the meantime bump the timeout to * 200ms. Even at slowest clock, this should be sufficient. And * in the working case, a larger timeout makes no difference. +* +* IFWI updates have also been seen to cause sporadic failures due to +* the requested frequency not being granted and thus the firmware +* load is attempted at minimum frequency. That can lead to load times +* in the seconds range. However, there is a limit on how long an +* individual wait_for() can wait. So wrap it in a loop. */ - ret = wait_for(guc_load_done(uncore, , ), 200); + before_freq = intel_rps_read_actual_frequency(>gt->rps); + before = ktime_get(); + for (count = 0; count < 20; count++) { + ret = wait_for(guc_load_done(uncore, , ), 1000); Isn't 20 secs a bit too long for an in-place wait? I get that if the GuC doesn't load (or fail to) within a few secs the HW is likely toast, but still that seems a bit too long to me. What's the worst case load time ever observed? I suggest reducing the wait to 3 secs as a compromise, if that's bigger than the worst case. The patch LGTM apart from this point. Daniele + if (!ret || !success) + break; + + guc_dbg(guc, "load still in progress, count = %d, freq = %dMHz\n", + count, intel_rps_read_actual_frequency(>gt->rps)); + } + after = ktime_get(); + delta = ktime_sub(after, before); + delta_ms = ktime_to_ms(delta); if (ret || !success) { u32 ukernel = REG_FIELD_GET(GS_UKERNEL_MASK, status); u32 bootrom = REG_FIELD_GET(GS_BOOTROM_MASK, status); - guc_info(guc, "load failed: status = 0x%08X, ret = %d\n", status, ret); + guc_info(guc, "load failed: status = 0x%08X, time = %lldms, freq = %dMHz, ret = %d\n", +status, delta_ms, intel_rps_read_actual_frequency(>gt->rps), ret); guc_info(guc, "load failed: status: Reset = %d, BootROM = 0x%02X, UKernel = 0x%02X, MIA = 0x%02X, Auth = 0x%02X\n", REG_FIELD_GET(GS_MIA_IN_RESET, status), bootrom, ukernel, @@ -206,6 +229,14 @@ static int guc_wait_ucode(struct intel_guc *guc) /* Uncommon/unexpected error, see earlier status code print for details */ if (ret == 0) ret = -ENXIO; + } else if (delta_ms > 200) { + guc_warn(guc, "excessive init time: %lldms! [freq = %dMHz, before = %dMHz, status = 0x%08X, count = %d, ret = %d]\n", +delta_ms, intel_rps_read_actual_frequency(>gt->rps), +
Re: [Intel-gfx] [PATCH 1/2] drm/i915/guc: Improve GuC load error reporting
On 2/17/2023 3:47 PM, john.c.harri...@intel.com wrote: From: John Harrison There are multiple ways in which the GuC load can fail. The driver was reporting the status register as is, but not everyone can read the matrix unfiltered. So add decoding of the common error cases. Also, remove the comment about interrupt based load completion checking being not recommended. The interrupt was removed from the GuC firmware some time ago so it is no longer an option anyway. While at it, also abort the timeout if a known error code is reported. No need to keep waiting if the GuC has already given up the load. Signed-off-by: John Harrison --- .../gpu/drm/i915/gt/uc/abi/guc_errors_abi.h | 17 drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c | 95 +++ drivers/gpu/drm/i915/gt/uc/intel_guc_reg.h| 4 +- 3 files changed, 95 insertions(+), 21 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/uc/abi/guc_errors_abi.h b/drivers/gpu/drm/i915/gt/uc/abi/guc_errors_abi.h index 8085fb1812748..750fe0c6d8529 100644 --- a/drivers/gpu/drm/i915/gt/uc/abi/guc_errors_abi.h +++ b/drivers/gpu/drm/i915/gt/uc/abi/guc_errors_abi.h @@ -21,6 +21,9 @@ enum intel_guc_load_status { INTEL_GUC_LOAD_STATUS_ERROR_DEVID_BUILD_MISMATCH = 0x02, INTEL_GUC_LOAD_STATUS_GUC_PREPROD_BUILD_MISMATCH = 0x03, INTEL_GUC_LOAD_STATUS_ERROR_DEVID_INVALID_GUCTYPE = 0x04, + INTEL_GUC_LOAD_STATUS_HWCONFIG_START = 0x05, + INTEL_GUC_LOAD_STATUS_HWCONFIG_DONE= 0x06, + INTEL_GUC_LOAD_STATUS_HWCONFIG_ERROR = 0x07, INTEL_GUC_LOAD_STATUS_GDT_DONE = 0x10, INTEL_GUC_LOAD_STATUS_IDT_DONE = 0x20, INTEL_GUC_LOAD_STATUS_LAPIC_DONE = 0x30, @@ -38,4 +41,18 @@ enum intel_guc_load_status { INTEL_GUC_LOAD_STATUS_READY= 0xF0, }; +enum intel_bootrom_load_status { + INTEL_BOOTROM_STATUS_NO_KEY_FOUND = 0x13, + INTEL_BOOTROM_STATUS_AES_PROD_KEY_FOUND = 0x1A, + INTEL_BOOTROM_STATUS_RSA_FAILED = 0x50, + INTEL_BOOTROM_STATUS_PAVPC_FAILED = 0x73, + INTEL_BOOTROM_STATUS_WOPCM_FAILED = 0x74, + INTEL_BOOTROM_STATUS_LOADLOC_FAILED = 0x75, + INTEL_BOOTROM_STATUS_JUMP_PASSED = 0x76, + INTEL_BOOTROM_STATUS_JUMP_FAILED = 0x77, + INTEL_BOOTROM_STATUS_RC6CTXCONFIG_FAILED = 0x79, + INTEL_BOOTROM_STATUS_MPUMAP_INCORRECT = 0x7a, nit: you've used uppercase for the other hex characters, while only this one has a lowercase "a" + INTEL_BOOTROM_STATUS_EXCEPTION= 0x7E, +}; I've double checked the defines against the specs and they all match. + #endif /* _ABI_GUC_ERRORS_ABI_H */ diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c index 69133420c78b2..2f5942606913d 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c @@ -88,31 +88,64 @@ static int guc_xfer_rsa(struct intel_uc_fw *guc_fw, /* * Read the GuC status register (GUC_STATUS) and store it in the * specified location; then return a boolean indicating whether - * the value matches either of two values representing completion - * of the GuC boot process. + * the value matches either completion or a known failure code. * * This is used for polling the GuC status in a wait_for() * loop below. */ -static inline bool guc_ready(struct intel_uncore *uncore, u32 *status) +static inline bool guc_load_done(struct intel_uncore *uncore, u32 *status, bool *success) { u32 val = intel_uncore_read(uncore, GUC_STATUS); u32 uk_val = REG_FIELD_GET(GS_UKERNEL_MASK, val); + u32 br_val = REG_FIELD_GET(GS_BOOTROM_MASK, val); *status = val; - return uk_val == INTEL_GUC_LOAD_STATUS_READY; + *success = true; It feels a bit weird to default this to true. If we don't return true from one of the switches below, we can end up returning false from the wait but leaving success to true. I understand that this is used more as a "not failed" flag rather than a success one, so it is functionally correct, but maybe rename it? not a blocker. Apart from the nits, the patch LGTM: Reviewed-by: Daniele Ceraolo Spurio Daniele + switch (uk_val) { + case INTEL_GUC_LOAD_STATUS_READY: + return true; + + case INTEL_GUC_LOAD_STATUS_ERROR_DEVID_BUILD_MISMATCH: + case INTEL_GUC_LOAD_STATUS_GUC_PREPROD_BUILD_MISMATCH: + case INTEL_GUC_LOAD_STATUS_ERROR_DEVID_INVALID_GUCTYPE: + case INTEL_GUC_LOAD_STATUS_HWCONFIG_ERROR: + case INTEL_GUC_LOAD_STATUS_DPC_ERROR: + case INTEL_GUC_LOAD_STATUS_EXCEPTION: + case INTEL_GUC_LOAD_STATUS_INIT_DATA_INVALID: +
Re: [Intel-gfx] [PATCH v9 15/15] drm/i915: Add deadline based boost support
On Fri, Mar 3, 2023 at 10:08 AM Tvrtko Ursulin wrote: > > > On 03/03/2023 14:48, Rob Clark wrote: > > On Fri, Mar 3, 2023 at 1:58 AM Tvrtko Ursulin > > wrote: > >> > >> > >> On 03/03/2023 03:21, Rodrigo Vivi wrote: > >>> On Thu, Mar 02, 2023 at 03:53:37PM -0800, Rob Clark wrote: > From: Rob Clark > > >>> > >>> missing some wording here... > >>> > v2: rebase > > Signed-off-by: Rob Clark > --- > drivers/gpu/drm/i915/i915_request.c | 20 > 1 file changed, 20 insertions(+) > > diff --git a/drivers/gpu/drm/i915/i915_request.c > b/drivers/gpu/drm/i915/i915_request.c > index 7503dcb9043b..44491e7e214c 100644 > --- a/drivers/gpu/drm/i915/i915_request.c > +++ b/drivers/gpu/drm/i915/i915_request.c > @@ -97,6 +97,25 @@ static bool i915_fence_enable_signaling(struct > dma_fence *fence) > return i915_request_enable_breadcrumb(to_request(fence)); > } > > +static void i915_fence_set_deadline(struct dma_fence *fence, ktime_t > deadline) > +{ > +struct i915_request *rq = to_request(fence); > + > +if (i915_request_completed(rq)) > +return; > + > +if (i915_request_started(rq)) > +return; > >>> > >>> why do we skip the boost if already started? > >>> don't we want to boost the freq anyway? > >> > >> I'd wager Rob is just copying the current i915 wait boost logic. > > > > Yup, and probably incorrectly.. Matt reported fewer boosts/sec > > compared to your RFC, this could be the bug > > Hm, there I have preserved this same !i915_request_started logic. > > Presumably it's not just fewer boosts but lower performance. How is he > setting the deadline? Somehow from clFlush or so? > > Regards, > > Tvrtko > > P.S. Take note that I did not post the latest version of my RFC. The one > where I fix the fence chain and array misses you pointed out. I did not > think it would be worthwhile given no universal love for it, but if > people are testing with it more widely that I was aware perhaps I should. Yep, that would be great. We're interested in it for ChromeOS. Please Cc me on the series when you send it.
[Intel-gfx] ✗ Fi.CI.BAT: failure for drm/i915: Check HPD during eDP probe (rev3)
== Series Details == Series: drm/i915: Check HPD during eDP probe (rev3) URL : https://patchwork.freedesktop.org/series/114577/ State : failure == Summary == CI Bug Log - changes from CI_DRM_12808 -> Patchwork_114577v3 Summary --- **FAILURE** Serious unknown changes coming with Patchwork_114577v3 absolutely need to be verified manually. If you think the reported changes have nothing to do with the changes introduced in Patchwork_114577v3, please notify your bug team to allow them to document this new failure mode, which will reduce false positives in CI. External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114577v3/index.html Participating hosts (39 -> 40) -- Additional (1): fi-kbl-soraka Possible new issues --- Here are the unknown changes that may have been introduced in Patchwork_114577v3: ### IGT changes ### Possible regressions * igt@gem_exec_fence@basic-await@ccs3: - bat-dg2-9: [PASS][1] -> [FAIL][2] [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12808/bat-dg2-9/igt@gem_exec_fence@basic-aw...@ccs3.html [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114577v3/bat-dg2-9/igt@gem_exec_fence@basic-aw...@ccs3.html Known issues Here are the changes found in Patchwork_114577v3 that come from known issues: ### IGT changes ### Issues hit * igt@gem_huc_copy@huc-copy: - fi-kbl-soraka: NOTRUN -> [SKIP][3] ([fdo#109271] / [i915#2190]) [3]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114577v3/fi-kbl-soraka/igt@gem_huc_c...@huc-copy.html * igt@gem_lmem_swapping@basic: - fi-kbl-soraka: NOTRUN -> [SKIP][4] ([fdo#109271] / [i915#4613]) +3 similar issues [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114577v3/fi-kbl-soraka/igt@gem_lmem_swapp...@basic.html * igt@i915_selftest@live@client: - fi-kbl-soraka: NOTRUN -> [INCOMPLETE][5] ([i915#7913]) [5]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114577v3/fi-kbl-soraka/igt@i915_selftest@l...@client.html * igt@i915_selftest@live@gt_heartbeat: - fi-kbl-soraka: NOTRUN -> [DMESG-FAIL][6] ([i915#5334] / [i915#7872]) [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114577v3/fi-kbl-soraka/igt@i915_selftest@live@gt_heartbeat.html * igt@i915_selftest@live@gt_pm: - fi-kbl-soraka: NOTRUN -> [DMESG-FAIL][7] ([i915#1886]) [7]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114577v3/fi-kbl-soraka/igt@i915_selftest@live@gt_pm.html * igt@i915_selftest@live@guc: - bat-rpls-2: NOTRUN -> [DMESG-WARN][8] ([i915#7852]) [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114577v3/bat-rpls-2/igt@i915_selftest@l...@guc.html * igt@i915_selftest@live@hangcheck: - fi-skl-guc: [PASS][9] -> [DMESG-WARN][10] ([i915#8073]) [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12808/fi-skl-guc/igt@i915_selftest@l...@hangcheck.html [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114577v3/fi-skl-guc/igt@i915_selftest@l...@hangcheck.html * igt@i915_selftest@live@slpc: - bat-rpls-2: NOTRUN -> [DMESG-FAIL][11] ([i915#6367] / [i915#7913] / [i915#7996]) [11]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114577v3/bat-rpls-2/igt@i915_selftest@l...@slpc.html * igt@kms_chamelium_hpd@common-hpd-after-suspend: - bat-rpls-2: NOTRUN -> [SKIP][12] ([i915#7828]) [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114577v3/bat-rpls-2/igt@kms_chamelium_...@common-hpd-after-suspend.html * igt@kms_cursor_legacy@basic-busy-flip-before-cursor-atomic: - fi-kbl-soraka: NOTRUN -> [SKIP][13] ([fdo#109271]) +16 similar issues [13]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114577v3/fi-kbl-soraka/igt@kms_cursor_leg...@basic-busy-flip-before-cursor-atomic.html * igt@kms_pipe_crc_basic@suspend-read-crc: - bat-rpls-2: NOTRUN -> [SKIP][14] ([i915#1845]) [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114577v3/bat-rpls-2/igt@kms_pipe_crc_ba...@suspend-read-crc.html Possible fixes * igt@i915_selftest@live@migrate: - bat-dg2-11: [DMESG-WARN][15] ([i915#7699]) -> [PASS][16] [15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12808/bat-dg2-11/igt@i915_selftest@l...@migrate.html [16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114577v3/bat-dg2-11/igt@i915_selftest@l...@migrate.html * igt@i915_selftest@live@reset: - bat-rpls-2: [ABORT][17] ([i915#4983] / [i915#7913]) -> [PASS][18] [17]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12808/bat-rpls-2/igt@i915_selftest@l...@reset.html [18]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114577v3/bat-rpls-2/igt@i915_selftest@l...@reset.html [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271 [i915#1845]:
[Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for drm/i915: Check HPD during eDP probe (rev3)
== Series Details == Series: drm/i915: Check HPD during eDP probe (rev3) URL : https://patchwork.freedesktop.org/series/114577/ State : warning == Summary == Error: dim checkpatch failed cd335763b419 drm/i915: Populate dig_port->connected() before connector init -:60: WARNING:BLOCK_COMMENT_STYLE: Block comments use a trailing */ on a separate line #60: FILE: drivers/gpu/drm/i915/display/intel_ddi.c:4538: +* case we have some really bad VBTs... */ total: 0 errors, 1 warnings, 0 checks, 46 lines checked 777d74c6da73 drm/i915: Fix SKL DDI A digital port .connected() 1933e83aba4a drm/i915: Get rid of the gm45 HPD live state nonsense 0c03dc5b520b drm/i915: Introduce _hotplug_mask() 1a6d613b25bb drm/i915: Introduce intel_hpd_enable_detection() 06b959fa1b7e drm/i915: Check HPD live state during eDP probe -:51: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ chars of sha1> ("")' - ie: 'commit 41e35ffb380b ("drm/i915: Favor last VBT child device with conflicting AUX ch/DDC pin")' #51: Also the systems (Asrock B250M-HDV at least) fixed by commit total: 1 errors, 0 warnings, 0 checks, 46 lines checked f8d051f54631 drm/i915: Reuse _hotplug_mask() in .hpd_detection_setup()
[Intel-gfx] [PATCH v2 4/7] drm/i915: Introduce _hotplug_mask()
From: Ville Syrjälä Pair each _hotplug_enables() function with a corresponding _hotplug_mask() function so that we can determine right bits to clear on a per hpd_pin basis. We'll need this for turning on HPD sense for a specific encoder rather than just all of them. v2: Drop the unused 'i915' param (Jani) v3: Drop the _foo_hotplug_enables() redirection too Cc: Jani Nikula Signed-off-by: Ville Syrjälä --- drivers/gpu/drm/i915/i915_irq.c | 205 +--- 1 file changed, 134 insertions(+), 71 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c index cc3d016f76d1..22658b38454d 100644 --- a/drivers/gpu/drm/i915/i915_irq.c +++ b/drivers/gpu/drm/i915/i915_irq.c @@ -2836,6 +2836,22 @@ static void cherryview_irq_reset(struct drm_i915_private *dev_priv) spin_unlock_irq(_priv->irq_lock); } +static u32 ibx_hotplug_mask(enum hpd_pin hpd_pin) +{ + switch (hpd_pin) { + case HPD_PORT_A: + return PORTA_HOTPLUG_ENABLE; + case HPD_PORT_B: + return PORTB_HOTPLUG_ENABLE | PORTB_PULSE_DURATION_MASK; + case HPD_PORT_C: + return PORTC_HOTPLUG_ENABLE | PORTC_PULSE_DURATION_MASK; + case HPD_PORT_D: + return PORTD_HOTPLUG_ENABLE | PORTD_PULSE_DURATION_MASK; + default: + return 0; + } +} + static u32 ibx_hotplug_enables(struct intel_encoder *encoder) { struct drm_i915_private *i915 = to_i915(encoder->base.dev); @@ -2870,13 +2886,10 @@ static void ibx_hpd_detection_setup(struct drm_i915_private *dev_priv) * The pulse duration bits are reserved on LPT+. */ intel_uncore_rmw(_priv->uncore, PCH_PORT_HOTPLUG, -PORTA_HOTPLUG_ENABLE | -PORTB_HOTPLUG_ENABLE | -PORTC_HOTPLUG_ENABLE | -PORTD_HOTPLUG_ENABLE | -PORTB_PULSE_DURATION_MASK | -PORTC_PULSE_DURATION_MASK | -PORTD_PULSE_DURATION_MASK, +ibx_hotplug_mask(HPD_PORT_A) | +ibx_hotplug_mask(HPD_PORT_B) | +ibx_hotplug_mask(HPD_PORT_C) | +ibx_hotplug_mask(HPD_PORT_D), intel_hpd_hotplug_enables(dev_priv, ibx_hotplug_enables)); } @@ -2892,14 +2905,34 @@ static void ibx_hpd_irq_setup(struct drm_i915_private *dev_priv) ibx_hpd_detection_setup(dev_priv); } +static u32 icp_ddi_hotplug_mask(enum hpd_pin hpd_pin) +{ + switch (hpd_pin) { + case HPD_PORT_A: + case HPD_PORT_B: + case HPD_PORT_C: + case HPD_PORT_D: + return SHOTPLUG_CTL_DDI_HPD_ENABLE(hpd_pin); + default: + return 0; + } +} + static u32 icp_ddi_hotplug_enables(struct intel_encoder *encoder) { - switch (encoder->hpd_pin) { - case HPD_PORT_A: - case HPD_PORT_B: - case HPD_PORT_C: - case HPD_PORT_D: - return SHOTPLUG_CTL_DDI_HPD_ENABLE(encoder->hpd_pin); + return icp_ddi_hotplug_mask(encoder->hpd_pin); +} + +static u32 icp_tc_hotplug_mask(enum hpd_pin hpd_pin) +{ + switch (hpd_pin) { + case HPD_PORT_TC1: + case HPD_PORT_TC2: + case HPD_PORT_TC3: + case HPD_PORT_TC4: + case HPD_PORT_TC5: + case HPD_PORT_TC6: + return ICP_TC_HPD_ENABLE(hpd_pin); default: return 0; } @@ -2907,38 +2940,28 @@ static u32 icp_ddi_hotplug_enables(struct intel_encoder *encoder) static u32 icp_tc_hotplug_enables(struct intel_encoder *encoder) { - switch (encoder->hpd_pin) { - case HPD_PORT_TC1: - case HPD_PORT_TC2: - case HPD_PORT_TC3: - case HPD_PORT_TC4: - case HPD_PORT_TC5: - case HPD_PORT_TC6: - return ICP_TC_HPD_ENABLE(encoder->hpd_pin); - default: - return 0; - } + return icp_tc_hotplug_mask(encoder->hpd_pin); } static void icp_ddi_hpd_detection_setup(struct drm_i915_private *dev_priv) { intel_uncore_rmw(_priv->uncore, SHOTPLUG_CTL_DDI, -SHOTPLUG_CTL_DDI_HPD_ENABLE(HPD_PORT_A) | -SHOTPLUG_CTL_DDI_HPD_ENABLE(HPD_PORT_B) | -SHOTPLUG_CTL_DDI_HPD_ENABLE(HPD_PORT_C) | -SHOTPLUG_CTL_DDI_HPD_ENABLE(HPD_PORT_D), +icp_ddi_hotplug_mask(HPD_PORT_A) | +icp_ddi_hotplug_mask(HPD_PORT_B) | +icp_ddi_hotplug_mask(HPD_PORT_C) | +icp_ddi_hotplug_mask(HPD_PORT_D), intel_hpd_hotplug_enables(dev_priv, icp_ddi_hotplug_enables)); } static void icp_tc_hpd_detection_setup(struct drm_i915_private *dev_priv) { intel_uncore_rmw(_priv->uncore, SHOTPLUG_CTL_TC, -ICP_TC_HPD_ENABLE(HPD_PORT_TC1) | -
Re: [Intel-gfx] [PATCH v5 09/19] vfio/pci: Allow passing zero-length fd array in VFIO_DEVICE_PCI_HOT_RESET
On Fri, 3 Mar 2023 06:36:35 + "Tian, Kevin" wrote: > > From: Liu, Yi L > > Sent: Thursday, March 2, 2023 10:20 PM > > > > > From: Jason Gunthorpe > > > Sent: Thursday, March 2, 2023 8:35 PM > > > > > > On Thu, Mar 02, 2023 at 09:55:46AM +, Tian, Kevin wrote: > > > > > From: Liu, Yi L > > > > > Sent: Thursday, March 2, 2023 2:07 PM > > > > > > > > > > > - if (!vfio_dev_in_groups(cur_vma, groups)) { > > > > > > + if (cur_vma->vdev.open_count && > > > > > > + !vfio_dev_in_groups(cur_vma, groups) && > > > > > > + !vfio_dev_in_iommufd_ctx(cur_vma, > > iommufd_ctx)) { > > > > > > > > > > Hi Alex, Jason, > > > > > > > > > > There is one concern on this approach which is related to the > > > > > cdev noiommu mode. As patch 16 of this series, cdev path > > > > > supports noiommu mode by passing a negative iommufd to > > > > > kernel. In such case, the vfio_device is not bound to a valid > > > > > iommufd. Then the check in vfio_dev_in_iommufd_ctx() is > > > > > to be broken. > > > > > > > > > > An idea is to add a cdev_noiommu flag in vfio_device, when > > > > > checking the iommufd_ictx, also check this flag. If all the opened > > > > > devices in the dev_set have vfio_device->cdev_noiommu==true, > > > > > then the reset is considered to be doable. But there is a special > > > > > case. If devices in this dev_set are opened by two applications > > > > > that operates in cdev noiommu mode, then this logic is not able > > > > > to differentiate them. In that case, should we allow the reset? > > > > > It seems to ok to allow reset since noiommu mode itself means > > > > > no security between the applications that use it. thoughts? > > > > > > > > > > > > > Probably we need still pass in a valid iommufd (instead of using > > > > a negative value) in noiommu case to mark the ownership so the > > > > check in the reset path can correctly catch whether an opened > > > > device belongs to this user. > > > > > > There should be no iommufd at all in no-iommu mode > > > > > > Adding one just to deal with noiommu reset seems pretty sad :\ > > > > > > no-iommu is only really used by dpdk, and it doesn't invoke > > > VFIO_DEVICE_PCI_HOT_RESET at all. > > > > Does it happen to be or by design, this ioctl is not needed by dpdk? I can't think of a reason DPDK couldn't use hot-reset. If we want to make it a policy, it should be enforced by code, but creating that policy based on a difficulty in supporting that mode with iommufd isn't great. > use of noiommu should be discouraged. > > if only known noiommu user doesn't use it then having certain > new restriction for noiommu in the hot reset path might be an > acceptable tradeoff. > > but again needs Alex's input as he knows all the history about > noiommu. No-IOMMU mode was meant to be a minimally invasive code change to re-use the vfio device interface, or alternatively avoid extending uio-pci-generic to support MSI/X, with better logging/tainting to know when userspace is driving devices without IOMMU protection, and as a means to promote a transition to standard support of vfio. AFAIK, there are still environments without v/IOMMU that make use of no-iommu mode. Thanks, Alex
[Intel-gfx] ✓ Fi.CI.IGT: success for drm/i915/active: Fix misuse of non-idle barriers as fence trackers (rev6)
== Series Details == Series: drm/i915/active: Fix misuse of non-idle barriers as fence trackers (rev6) URL : https://patchwork.freedesktop.org/series/113950/ State : success == Summary == CI Bug Log - changes from CI_DRM_12799_full -> Patchwork_113950v6_full Summary --- **SUCCESS** No regressions found. Participating hosts (19 -> 19) -- No changes in participating hosts Possible new issues --- Here are the unknown changes that may have been introduced in Patchwork_113950v6_full: ### IGT changes ### Suppressed The following results come from untrusted machines, tests, or statuses. They do not affect the overall result. * igt@gem_create@create-clear@smem0: - {shard-rkl}:NOTRUN -> [ABORT][1] [1]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_113950v6/shard-rkl-4/igt@gem_create@create-cl...@smem0.html * igt@gem_create@create-ext-cpu-access-big: - {shard-dg2-5}: NOTRUN -> [ABORT][2] [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_113950v6/shard-dg2-5/igt@gem_cre...@create-ext-cpu-access-big.html * igt@kms_content_protection@srm@pipe-a-dp-2: - {shard-dg2-12}: NOTRUN -> [TIMEOUT][3] [3]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_113950v6/shard-dg2-12/igt@kms_content_protection@s...@pipe-a-dp-2.html * igt@kms_flip@flip-vs-expired-vblank@c-hdmi-a3: - {shard-dg2-6}: [PASS][4] -> [FAIL][5] [4]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12799/shard-dg2-6/igt@kms_flip@flip-vs-expired-vbl...@c-hdmi-a3.html [5]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_113950v6/shard-dg2-6/igt@kms_flip@flip-vs-expired-vbl...@c-hdmi-a3.html * igt@kms_frontbuffer_tracking@fbc-1p-offscren-pri-shrfb-draw-blt: - {shard-dg2-8}: NOTRUN -> [FAIL][6] [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_113950v6/shard-dg2-8/igt@kms_frontbuffer_track...@fbc-1p-offscren-pri-shrfb-draw-blt.html New tests - New tests have been introduced between CI_DRM_12799_full and Patchwork_113950v6_full: ### New IGT tests (53) ### * igt@kms_cursor_crc@cursor-random-64x64@pipe-d-dp-4: - Statuses : 1 pass(s) - Exec time: [0.0] s * igt@kms_flip@2x-dpms-vs-vblank-race-interruptible@ab-dp2-dp3: - Statuses : 1 pass(s) - Exec time: [0.0] s * igt@kms_flip@2x-dpms-vs-vblank-race-interruptible@ab-dp2-dp4: - Statuses : 1 pass(s) - Exec time: [0.0] s * igt@kms_flip@2x-dpms-vs-vblank-race-interruptible@ab-dp3-dp4: - Statuses : 1 pass(s) - Exec time: [0.0] s * igt@kms_flip@2x-dpms-vs-vblank-race-interruptible@ac-dp2-dp3: - Statuses : 1 pass(s) - Exec time: [0.0] s * igt@kms_flip@2x-dpms-vs-vblank-race-interruptible@ac-dp2-dp4: - Statuses : 1 pass(s) - Exec time: [0.0] s * igt@kms_flip@2x-dpms-vs-vblank-race-interruptible@ac-dp3-dp4: - Statuses : 1 pass(s) - Exec time: [0.0] s * igt@kms_flip@2x-dpms-vs-vblank-race-interruptible@ad-dp2-dp3: - Statuses : 1 pass(s) - Exec time: [0.0] s * igt@kms_flip@2x-dpms-vs-vblank-race-interruptible@ad-dp2-dp4: - Statuses : 1 pass(s) - Exec time: [0.0] s * igt@kms_flip@2x-dpms-vs-vblank-race-interruptible@ad-dp3-dp4: - Statuses : 1 pass(s) - Exec time: [0.0] s * igt@kms_flip@2x-dpms-vs-vblank-race-interruptible@bc-dp2-dp3: - Statuses : 1 pass(s) - Exec time: [0.0] s * igt@kms_flip@2x-dpms-vs-vblank-race-interruptible@bc-dp3-dp4: - Statuses : 1 pass(s) - Exec time: [0.0] s * igt@kms_flip@2x-dpms-vs-vblank-race-interruptible@bd-dp2-dp3: - Statuses : 1 pass(s) - Exec time: [0.0] s * igt@kms_flip@2x-dpms-vs-vblank-race-interruptible@bd-dp3-dp4: - Statuses : 1 pass(s) - Exec time: [0.0] s * igt@kms_flip@2x-flip-vs-rmfb@ab-dp2-dp3: - Statuses : 1 pass(s) - Exec time: [0.0] s * igt@kms_flip@2x-flip-vs-rmfb@ab-dp2-dp4: - Statuses : 1 pass(s) - Exec time: [0.0] s * igt@kms_flip@2x-flip-vs-rmfb@ab-dp3-dp4: - Statuses : 1 pass(s) - Exec time: [0.0] s * igt@kms_flip@2x-flip-vs-rmfb@ac-dp2-dp3: - Statuses : 1 pass(s) - Exec time: [0.0] s * igt@kms_flip@2x-flip-vs-rmfb@ac-dp2-dp4: - Statuses : 1 pass(s) - Exec time: [0.0] s * igt@kms_flip@2x-flip-vs-rmfb@ac-dp3-dp4: - Statuses : 1 pass(s) - Exec time: [0.0] s * igt@kms_flip@2x-flip-vs-rmfb@ad-dp2-dp3: - Statuses : 1 pass(s) - Exec time: [0.0] s * igt@kms_flip@2x-flip-vs-rmfb@ad-dp2-dp4: - Statuses : 1 pass(s) - Exec time: [0.0] s * igt@kms_flip@2x-flip-vs-rmfb@ad-dp3-dp4: - Statuses : 1 pass(s) - Exec time: [0.0] s * igt@kms_flip@2x-flip-vs-rmfb@bc-dp2-dp3: - Statuses : 1 pass(s) - Exec time: [0.0] s * igt@kms_flip@2x-flip-vs-rmfb@bc-dp2-dp4: - Statuses : 1 pass(s) - Exec time: [0.0] s * igt@kms_flip@2x-flip-vs-rmfb@bc-dp3-dp4: - Statuses : 1
[Intel-gfx] ✓ Fi.CI.IGT: success for drm/i915/active: Fix misuse of non-idle barriers as fence trackers (rev6)
== Series Details == Series: drm/i915/active: Fix misuse of non-idle barriers as fence trackers (rev6) URL : https://patchwork.freedesktop.org/series/113950/ State : success == Summary == CI Bug Log - changes from CI_DRM_12799_full -> Patchwork_113950v6_full Summary --- **SUCCESS** No regressions found. Participating hosts (19 -> 19) -- No changes in participating hosts Possible new issues --- Here are the unknown changes that may have been introduced in Patchwork_113950v6_full: ### IGT changes ### Suppressed The following results come from untrusted machines, tests, or statuses. They do not affect the overall result. * igt@gem_create@create-clear@smem0: - {shard-rkl}:NOTRUN -> [ABORT][1] [1]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_113950v6/shard-rkl-4/igt@gem_create@create-cl...@smem0.html * igt@gem_create@create-ext-cpu-access-big: - {shard-dg2-5}: NOTRUN -> [ABORT][2] [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_113950v6/shard-dg2-5/igt@gem_cre...@create-ext-cpu-access-big.html * igt@kms_content_protection@srm@pipe-a-dp-2: - {shard-dg2-12}: NOTRUN -> [TIMEOUT][3] [3]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_113950v6/shard-dg2-12/igt@kms_content_protection@s...@pipe-a-dp-2.html * igt@kms_flip@flip-vs-expired-vblank@c-hdmi-a3: - {shard-dg2-6}: [PASS][4] -> [FAIL][5] [4]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12799/shard-dg2-6/igt@kms_flip@flip-vs-expired-vbl...@c-hdmi-a3.html [5]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_113950v6/shard-dg2-6/igt@kms_flip@flip-vs-expired-vbl...@c-hdmi-a3.html * igt@kms_frontbuffer_tracking@fbc-1p-offscren-pri-shrfb-draw-blt: - {shard-dg2-8}: NOTRUN -> [FAIL][6] [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_113950v6/shard-dg2-8/igt@kms_frontbuffer_track...@fbc-1p-offscren-pri-shrfb-draw-blt.html New tests - New tests have been introduced between CI_DRM_12799_full and Patchwork_113950v6_full: ### New IGT tests (53) ### * igt@kms_cursor_crc@cursor-random-64x64@pipe-d-dp-4: - Statuses : 1 pass(s) - Exec time: [0.0] s * igt@kms_flip@2x-dpms-vs-vblank-race-interruptible@ab-dp2-dp3: - Statuses : 1 pass(s) - Exec time: [0.0] s * igt@kms_flip@2x-dpms-vs-vblank-race-interruptible@ab-dp2-dp4: - Statuses : 1 pass(s) - Exec time: [0.0] s * igt@kms_flip@2x-dpms-vs-vblank-race-interruptible@ab-dp3-dp4: - Statuses : 1 pass(s) - Exec time: [0.0] s * igt@kms_flip@2x-dpms-vs-vblank-race-interruptible@ac-dp2-dp3: - Statuses : 1 pass(s) - Exec time: [0.0] s * igt@kms_flip@2x-dpms-vs-vblank-race-interruptible@ac-dp2-dp4: - Statuses : 1 pass(s) - Exec time: [0.0] s * igt@kms_flip@2x-dpms-vs-vblank-race-interruptible@ac-dp3-dp4: - Statuses : 1 pass(s) - Exec time: [0.0] s * igt@kms_flip@2x-dpms-vs-vblank-race-interruptible@ad-dp2-dp3: - Statuses : 1 pass(s) - Exec time: [0.0] s * igt@kms_flip@2x-dpms-vs-vblank-race-interruptible@ad-dp2-dp4: - Statuses : 1 pass(s) - Exec time: [0.0] s * igt@kms_flip@2x-dpms-vs-vblank-race-interruptible@ad-dp3-dp4: - Statuses : 1 pass(s) - Exec time: [0.0] s * igt@kms_flip@2x-dpms-vs-vblank-race-interruptible@bc-dp2-dp3: - Statuses : 1 pass(s) - Exec time: [0.0] s * igt@kms_flip@2x-dpms-vs-vblank-race-interruptible@bc-dp3-dp4: - Statuses : 1 pass(s) - Exec time: [0.0] s * igt@kms_flip@2x-dpms-vs-vblank-race-interruptible@bd-dp2-dp3: - Statuses : 1 pass(s) - Exec time: [0.0] s * igt@kms_flip@2x-dpms-vs-vblank-race-interruptible@bd-dp3-dp4: - Statuses : 1 pass(s) - Exec time: [0.0] s * igt@kms_flip@2x-flip-vs-rmfb@ab-dp2-dp3: - Statuses : 1 pass(s) - Exec time: [0.0] s * igt@kms_flip@2x-flip-vs-rmfb@ab-dp2-dp4: - Statuses : 1 pass(s) - Exec time: [0.0] s * igt@kms_flip@2x-flip-vs-rmfb@ab-dp3-dp4: - Statuses : 1 pass(s) - Exec time: [0.0] s * igt@kms_flip@2x-flip-vs-rmfb@ac-dp2-dp3: - Statuses : 1 pass(s) - Exec time: [0.0] s * igt@kms_flip@2x-flip-vs-rmfb@ac-dp2-dp4: - Statuses : 1 pass(s) - Exec time: [0.0] s * igt@kms_flip@2x-flip-vs-rmfb@ac-dp3-dp4: - Statuses : 1 pass(s) - Exec time: [0.0] s * igt@kms_flip@2x-flip-vs-rmfb@ad-dp2-dp3: - Statuses : 1 pass(s) - Exec time: [0.0] s * igt@kms_flip@2x-flip-vs-rmfb@ad-dp2-dp4: - Statuses : 1 pass(s) - Exec time: [0.0] s * igt@kms_flip@2x-flip-vs-rmfb@ad-dp3-dp4: - Statuses : 1 pass(s) - Exec time: [0.0] s * igt@kms_flip@2x-flip-vs-rmfb@bc-dp2-dp3: - Statuses : 1 pass(s) - Exec time: [0.0] s * igt@kms_flip@2x-flip-vs-rmfb@bc-dp2-dp4: - Statuses : 1 pass(s) - Exec time: [0.0] s * igt@kms_flip@2x-flip-vs-rmfb@bc-dp3-dp4: - Statuses : 1
Re: [Intel-gfx] [PATCH v9 15/15] drm/i915: Add deadline based boost support
On Fri, Mar 3, 2023 at 7:20 AM Ville Syrjälä wrote: > > On Fri, Mar 03, 2023 at 05:00:03PM +0200, Ville Syrjälä wrote: > > On Fri, Mar 03, 2023 at 06:48:43AM -0800, Rob Clark wrote: > > > On Fri, Mar 3, 2023 at 1:58 AM Tvrtko Ursulin > > > wrote: > > > > > > > > > > > > On 03/03/2023 03:21, Rodrigo Vivi wrote: > > > > > On Thu, Mar 02, 2023 at 03:53:37PM -0800, Rob Clark wrote: > > > > >> From: Rob Clark > > > > >> > > > > > > > > > > missing some wording here... > > > > > > > > > >> v2: rebase > > > > >> > > > > >> Signed-off-by: Rob Clark > > > > >> --- > > > > >> drivers/gpu/drm/i915/i915_request.c | 20 > > > > >> 1 file changed, 20 insertions(+) > > > > >> > > > > >> diff --git a/drivers/gpu/drm/i915/i915_request.c > > > > >> b/drivers/gpu/drm/i915/i915_request.c > > > > >> index 7503dcb9043b..44491e7e214c 100644 > > > > >> --- a/drivers/gpu/drm/i915/i915_request.c > > > > >> +++ b/drivers/gpu/drm/i915/i915_request.c > > > > >> @@ -97,6 +97,25 @@ static bool i915_fence_enable_signaling(struct > > > > >> dma_fence *fence) > > > > >> return i915_request_enable_breadcrumb(to_request(fence)); > > > > >> } > > > > >> > > > > >> +static void i915_fence_set_deadline(struct dma_fence *fence, > > > > >> ktime_t deadline) > > > > >> +{ > > > > >> +struct i915_request *rq = to_request(fence); > > > > >> + > > > > >> +if (i915_request_completed(rq)) > > > > >> +return; > > > > >> + > > > > >> +if (i915_request_started(rq)) > > > > >> +return; > > > > > > > > > > why do we skip the boost if already started? > > > > > don't we want to boost the freq anyway? > > > > > > > > I'd wager Rob is just copying the current i915 wait boost logic. > > > > > > Yup, and probably incorrectly.. Matt reported fewer boosts/sec > > > compared to your RFC, this could be the bug > > > > I don't think i915 calls drm_atomic_helper_wait_for_fences() > > so that could explain something. > > Oh, I guess this wasn't even supposed to take over the current > display boost stuff since you didn't remove the old one. Right, I didn't try to replace the current thing.. but hopefully at least make it possible for i915 to use more of the atomic helpers in the future BR, -R > The current one just boosts after a missed vblank. The deadline > could use your timer approach I suppose and boost already a bit > earlier in the hopes of not missing the vblank. > > -- > Ville Syrjälä > Intel
Re: [Intel-gfx] [Freedreno] [PATCH v9 15/15] drm/i915: Add deadline based boost support
On Fri, Mar 3, 2023 at 7:08 AM Tvrtko Ursulin wrote: > > > On 03/03/2023 14:48, Rob Clark wrote: > > On Fri, Mar 3, 2023 at 1:58 AM Tvrtko Ursulin > > wrote: > >> > >> > >> On 03/03/2023 03:21, Rodrigo Vivi wrote: > >>> On Thu, Mar 02, 2023 at 03:53:37PM -0800, Rob Clark wrote: > From: Rob Clark > > >>> > >>> missing some wording here... > >>> > v2: rebase > > Signed-off-by: Rob Clark > --- > drivers/gpu/drm/i915/i915_request.c | 20 > 1 file changed, 20 insertions(+) > > diff --git a/drivers/gpu/drm/i915/i915_request.c > b/drivers/gpu/drm/i915/i915_request.c > index 7503dcb9043b..44491e7e214c 100644 > --- a/drivers/gpu/drm/i915/i915_request.c > +++ b/drivers/gpu/drm/i915/i915_request.c > @@ -97,6 +97,25 @@ static bool i915_fence_enable_signaling(struct > dma_fence *fence) > return i915_request_enable_breadcrumb(to_request(fence)); > } > > +static void i915_fence_set_deadline(struct dma_fence *fence, ktime_t > deadline) > +{ > +struct i915_request *rq = to_request(fence); > + > +if (i915_request_completed(rq)) > +return; > + > +if (i915_request_started(rq)) > +return; > >>> > >>> why do we skip the boost if already started? > >>> don't we want to boost the freq anyway? > >> > >> I'd wager Rob is just copying the current i915 wait boost logic. > > > > Yup, and probably incorrectly.. Matt reported fewer boosts/sec > > compared to your RFC, this could be the bug > > Hm, there I have preserved this same !i915_request_started logic. > > Presumably it's not just fewer boosts but lower performance. How is he > setting the deadline? Somehow from clFlush or so? Yeah, fewer boosts, lower freq/perf.. I cobbled together a quick mesa hack to set the DEADLINE flag on syncobj waits, but it seems likely that I missed something somewhere BR, -R > Regards, > > Tvrtko > > P.S. Take note that I did not post the latest version of my RFC. The one > where I fix the fence chain and array misses you pointed out. I did not > think it would be worthwhile given no universal love for it, but if > people are testing with it more widely that I was aware perhaps I should. > > + > +/* > + * TODO something more clever for deadlines that are in the > + * future. I think probably track the nearest deadline in > + * rq->timeline and set timer to trigger boost accordingly? > + */ > >>> > >>> I'm afraid it will be very hard to find some heuristics of what's > >>> late enough for the boost no? > >>> I mean, how early to boost the freq on an upcoming deadline for the > >>> timer? > >> > >> We can off load this patch from Rob and deal with it separately, or > >> after the fact? > > > > That is completely my intention, I expect you to replace my i915 patch ;-) > > > > Rough idea when everyone is happy with the core bits is to setup an > > immutable branch without the driver specific patches, which could be > > merged into drm-next and $driver-next and then each driver team can > > add there own driver patches on top > > > > BR, > > -R > > > >> It's a half solution without a smarter scheduler too. Like > >> https://lore.kernel.org/all/20210208105236.28498-10-ch...@chris-wilson.co.uk/, > >> or if GuC plans to do something like that at any point. > >> > >> Or bump the priority too if deadline is looming? > >> > >> IMO it is not very effective to fiddle with the heuristic on an ad-hoc > >> basis. For instance I have a new heuristics which improves the > >> problematic OpenCL cases for further 5% (relative to the current > >> waitboost improvement from adding missing syncobj waitboost). But I > >> can't really test properly for regressions over platforms, stacks, > >> workloads.. :( > >> > >> Regards, > >> > >> Tvrtko > >> > >>> > + > +intel_rps_boost(rq); > +} > + > static signed long i915_fence_wait(struct dma_fence *fence, > bool interruptible, > signed long timeout) > @@ -182,6 +201,7 @@ const struct dma_fence_ops i915_fence_ops = { > .signaled = i915_fence_signaled, > .wait = i915_fence_wait, > .release = i915_fence_release, > +.set_deadline = i915_fence_set_deadline, > }; > > static void irq_execute_cb(struct irq_work *wrk) > -- > 2.39.1 >
Re: [Intel-gfx] [PATCH] Change the meaning of the fields in the ttm_place structure from pfn to bytes
On Fri, Mar 03, 2023 at 03:55:56PM +0100, Michel Dänzer wrote: > On 3/3/23 08:16, Somalapuram Amaranath wrote: > > Change the ttm_place structure member fpfn, lpfn, mem_type to > > res_start, res_end, res_type. > > Change the unsigned to u64. > > Fix the dependence in all the DRM drivers and > > clean up PAGE_SHIFT operation. > > > > Signed-off-by: Somalapuram Amaranath > > > > [...] > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c > > index 44367f03316f..5b5104e724e3 100644 > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c > > @@ -131,11 +131,12 @@ static int amdgpu_gtt_mgr_new(struct > > ttm_resource_manager *man, > > goto err_free; > > } > > > > - if (place->lpfn) { > > + if (place->res_end) { > > spin_lock(>lock); > > r = drm_mm_insert_node_in_range(>mm, >mm_nodes[0], > > - num_pages, tbo->page_alignment, > > - 0, place->fpfn, place->lpfn, > > + num_pages, tbo->page_alignment, > > 0, > > + place->res_start << PAGE_SHIFT, > > + place->res_end << PAGE_SHIFT, > > DRM_MM_INSERT_BEST); > > This should be >> or no shift instead of <<, shouldn't it? Multiplying a > value in bytes by the page size doesn't make sense. > > > I didn't check the rest of the patch in detail, but it's easy introduce > subtle regressions with this kind of change. It'll require a lot of review & > testing scrutiny. Also good justification. The changelog says only what is done, nothing about why the change is needed. Regards Stanislaw
Re: [Intel-gfx] [PATCH v9 15/15] drm/i915: Add deadline based boost support
On Fri, Mar 03, 2023 at 05:00:03PM +0200, Ville Syrjälä wrote: > On Fri, Mar 03, 2023 at 06:48:43AM -0800, Rob Clark wrote: > > On Fri, Mar 3, 2023 at 1:58 AM Tvrtko Ursulin > > wrote: > > > > > > > > > On 03/03/2023 03:21, Rodrigo Vivi wrote: > > > > On Thu, Mar 02, 2023 at 03:53:37PM -0800, Rob Clark wrote: > > > >> From: Rob Clark > > > >> > > > > > > > > missing some wording here... > > > > > > > >> v2: rebase > > > >> > > > >> Signed-off-by: Rob Clark > > > >> --- > > > >> drivers/gpu/drm/i915/i915_request.c | 20 > > > >> 1 file changed, 20 insertions(+) > > > >> > > > >> diff --git a/drivers/gpu/drm/i915/i915_request.c > > > >> b/drivers/gpu/drm/i915/i915_request.c > > > >> index 7503dcb9043b..44491e7e214c 100644 > > > >> --- a/drivers/gpu/drm/i915/i915_request.c > > > >> +++ b/drivers/gpu/drm/i915/i915_request.c > > > >> @@ -97,6 +97,25 @@ static bool i915_fence_enable_signaling(struct > > > >> dma_fence *fence) > > > >> return i915_request_enable_breadcrumb(to_request(fence)); > > > >> } > > > >> > > > >> +static void i915_fence_set_deadline(struct dma_fence *fence, ktime_t > > > >> deadline) > > > >> +{ > > > >> +struct i915_request *rq = to_request(fence); > > > >> + > > > >> +if (i915_request_completed(rq)) > > > >> +return; > > > >> + > > > >> +if (i915_request_started(rq)) > > > >> +return; > > > > > > > > why do we skip the boost if already started? > > > > don't we want to boost the freq anyway? > > > > > > I'd wager Rob is just copying the current i915 wait boost logic. > > > > Yup, and probably incorrectly.. Matt reported fewer boosts/sec > > compared to your RFC, this could be the bug > > I don't think i915 calls drm_atomic_helper_wait_for_fences() > so that could explain something. Oh, I guess this wasn't even supposed to take over the current display boost stuff since you didn't remove the old one. The current one just boosts after a missed vblank. The deadline could use your timer approach I suppose and boost already a bit earlier in the hopes of not missing the vblank. -- Ville Syrjälä Intel
Re: [Intel-gfx] [PATCH v9 15/15] drm/i915: Add deadline based boost support
On 03/03/2023 14:48, Rob Clark wrote: On Fri, Mar 3, 2023 at 1:58 AM Tvrtko Ursulin wrote: On 03/03/2023 03:21, Rodrigo Vivi wrote: On Thu, Mar 02, 2023 at 03:53:37PM -0800, Rob Clark wrote: From: Rob Clark missing some wording here... v2: rebase Signed-off-by: Rob Clark --- drivers/gpu/drm/i915/i915_request.c | 20 1 file changed, 20 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c index 7503dcb9043b..44491e7e214c 100644 --- a/drivers/gpu/drm/i915/i915_request.c +++ b/drivers/gpu/drm/i915/i915_request.c @@ -97,6 +97,25 @@ static bool i915_fence_enable_signaling(struct dma_fence *fence) return i915_request_enable_breadcrumb(to_request(fence)); } +static void i915_fence_set_deadline(struct dma_fence *fence, ktime_t deadline) +{ +struct i915_request *rq = to_request(fence); + +if (i915_request_completed(rq)) +return; + +if (i915_request_started(rq)) +return; why do we skip the boost if already started? don't we want to boost the freq anyway? I'd wager Rob is just copying the current i915 wait boost logic. Yup, and probably incorrectly.. Matt reported fewer boosts/sec compared to your RFC, this could be the bug Hm, there I have preserved this same !i915_request_started logic. Presumably it's not just fewer boosts but lower performance. How is he setting the deadline? Somehow from clFlush or so? Regards, Tvrtko P.S. Take note that I did not post the latest version of my RFC. The one where I fix the fence chain and array misses you pointed out. I did not think it would be worthwhile given no universal love for it, but if people are testing with it more widely that I was aware perhaps I should. + +/* + * TODO something more clever for deadlines that are in the + * future. I think probably track the nearest deadline in + * rq->timeline and set timer to trigger boost accordingly? + */ I'm afraid it will be very hard to find some heuristics of what's late enough for the boost no? I mean, how early to boost the freq on an upcoming deadline for the timer? We can off load this patch from Rob and deal with it separately, or after the fact? That is completely my intention, I expect you to replace my i915 patch ;-) Rough idea when everyone is happy with the core bits is to setup an immutable branch without the driver specific patches, which could be merged into drm-next and $driver-next and then each driver team can add there own driver patches on top BR, -R It's a half solution without a smarter scheduler too. Like https://lore.kernel.org/all/20210208105236.28498-10-ch...@chris-wilson.co.uk/, or if GuC plans to do something like that at any point. Or bump the priority too if deadline is looming? IMO it is not very effective to fiddle with the heuristic on an ad-hoc basis. For instance I have a new heuristics which improves the problematic OpenCL cases for further 5% (relative to the current waitboost improvement from adding missing syncobj waitboost). But I can't really test properly for regressions over platforms, stacks, workloads.. :( Regards, Tvrtko + +intel_rps_boost(rq); +} + static signed long i915_fence_wait(struct dma_fence *fence, bool interruptible, signed long timeout) @@ -182,6 +201,7 @@ const struct dma_fence_ops i915_fence_ops = { .signaled = i915_fence_signaled, .wait = i915_fence_wait, .release = i915_fence_release, +.set_deadline = i915_fence_set_deadline, }; static void irq_execute_cb(struct irq_work *wrk) -- 2.39.1
Re: [Intel-gfx] [PATCH] Change the meaning of the fields in the ttm_place structure from pfn to bytes
On 3/3/23 08:16, Somalapuram Amaranath wrote: > Change the ttm_place structure member fpfn, lpfn, mem_type to > res_start, res_end, res_type. > Change the unsigned to u64. > Fix the dependence in all the DRM drivers and > clean up PAGE_SHIFT operation. > > Signed-off-by: Somalapuram Amaranath > > [...] > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c > index 44367f03316f..5b5104e724e3 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c > @@ -131,11 +131,12 @@ static int amdgpu_gtt_mgr_new(struct > ttm_resource_manager *man, > goto err_free; > } > > - if (place->lpfn) { > + if (place->res_end) { > spin_lock(>lock); > r = drm_mm_insert_node_in_range(>mm, >mm_nodes[0], > - num_pages, tbo->page_alignment, > - 0, place->fpfn, place->lpfn, > + num_pages, tbo->page_alignment, > 0, > + place->res_start << PAGE_SHIFT, > + place->res_end << PAGE_SHIFT, > DRM_MM_INSERT_BEST); This should be >> or no shift instead of <<, shouldn't it? Multiplying a value in bytes by the page size doesn't make sense. I didn't check the rest of the patch in detail, but it's easy introduce subtle regressions with this kind of change. It'll require a lot of review & testing scrutiny. -- Earthling Michel Dänzer| https://redhat.com Libre software enthusiast | Mesa and Xwayland developer
[Intel-gfx] ✓ Fi.CI.IGT: success for Some debugfs refactoring and improvements
== Series Details == Series: Some debugfs refactoring and improvements URL : https://patchwork.freedesktop.org/series/114510/ State : success == Summary == CI Bug Log - changes from CI_DRM_12799_full -> Patchwork_114510v1_full Summary --- **SUCCESS** No regressions found. Participating hosts (19 -> 19) -- No changes in participating hosts Possible new issues --- Here are the unknown changes that may have been introduced in Patchwork_114510v1_full: ### IGT changes ### Suppressed The following results come from untrusted machines, tests, or statuses. They do not affect the overall result. * igt@gem_create@create-ext-cpu-access-big: - {shard-dg2-1}: NOTRUN -> [ABORT][1] +1 similar issue [1]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114510v1/shard-dg2-1/igt@gem_cre...@create-ext-cpu-access-big.html * igt@kms_frontbuffer_tracking@fbc-1p-primscrn-pri-indfb-draw-mmap-cpu: - {shard-dg2-3}: NOTRUN -> [FAIL][2] [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114510v1/shard-dg2-3/igt@kms_frontbuffer_track...@fbc-1p-primscrn-pri-indfb-draw-mmap-cpu.html * igt@kms_hdr@static-toggle-suspend@pipe-a-dp-1: - {shard-dg2-11}: NOTRUN -> [FAIL][3] [3]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114510v1/shard-dg2-11/igt@kms_hdr@static-toggle-susp...@pipe-a-dp-1.html * {igt@kms_plane_scaling@planes-downscale-factor-0-25-upscale-0-25@pipe-d-dp-4}: - {shard-dg2-11}: NOTRUN -> [SKIP][4] [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114510v1/shard-dg2-11/igt@kms_plane_scaling@planes-downscale-factor-0-25-upscale-0...@pipe-d-dp-4.html New tests - New tests have been introduced between CI_DRM_12799_full and Patchwork_114510v1_full: ### New IGT tests (5) ### * igt@kms_flip@flip-vs-suspend-interruptible@d-dp4: - Statuses : 1 pass(s) - Exec time: [0.0] s * igt@kms_plane_scaling@planes-downscale-factor-0-75-upscale-0-25@pipe-a-dp-2: - Statuses : 1 pass(s) - Exec time: [0.0] s * igt@kms_plane_scaling@planes-downscale-factor-0-75-upscale-0-25@pipe-c-dp-2: - Statuses : 1 pass(s) - Exec time: [0.0] s * igt@kms_plane_scaling@planes-downscale-factor-0-75-upscale-0-25@pipe-d-dp-2: - Statuses : 1 pass(s) - Exec time: [0.0] s * igt@kms_rotation_crc: - Statuses : - Exec time: [None] s Known issues Here are the changes found in Patchwork_114510v1_full that come from known issues: ### IGT changes ### Issues hit * igt@fbdev@write: - shard-tglu-9: NOTRUN -> [SKIP][5] ([i915#2582]) [5]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114510v1/shard-tglu-9/igt@fb...@write.html * igt@feature_discovery@display-3x: - shard-tglu-10: NOTRUN -> [SKIP][6] ([i915#1839]) +1 similar issue [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114510v1/shard-tglu-10/igt@feature_discov...@display-3x.html * igt@gem_ccs@ctrl-surf-copy: - shard-tglu-10: NOTRUN -> [SKIP][7] ([i915#3555] / [i915#5325]) [7]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114510v1/shard-tglu-10/igt@gem_...@ctrl-surf-copy.html * igt@gem_create@create-ext-cpu-access-sanity-check: - shard-tglu-9: NOTRUN -> [SKIP][8] ([i915#6335]) [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114510v1/shard-tglu-9/igt@gem_cre...@create-ext-cpu-access-sanity-check.html * igt@gem_ctx_persistence@engines-hostile-preempt: - shard-snb: NOTRUN -> [SKIP][9] ([fdo#109271] / [i915#1099]) [9]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114510v1/shard-snb1/igt@gem_ctx_persiste...@engines-hostile-preempt.html * igt@gem_ctx_sseu@engines: - shard-tglu-9: NOTRUN -> [SKIP][10] ([i915#280]) [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114510v1/shard-tglu-9/igt@gem_ctx_s...@engines.html * igt@gem_exec_fair@basic-none-solo@rcs0: - shard-tglu-10: NOTRUN -> [FAIL][11] ([i915#2842]) [11]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114510v1/shard-tglu-10/igt@gem_exec_fair@basic-none-s...@rcs0.html * igt@gem_exec_fair@basic-none@vcs0: - shard-glk: [PASS][12] -> [FAIL][13] ([i915#2842]) +1 similar issue [12]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12799/shard-glk2/igt@gem_exec_fair@basic-n...@vcs0.html [13]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114510v1/shard-glk1/igt@gem_exec_fair@basic-n...@vcs0.html * igt@gem_exec_gttfill@multigpu-basic: - shard-tglu-10: NOTRUN -> [SKIP][14] ([i915#7697]) +1 similar issue [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114510v1/shard-tglu-10/igt@gem_exec_gttf...@multigpu-basic.html * igt@gem_lmem_swapping@massive-random: - shard-tglu-10: NOTRUN -> [SKIP][15] ([i915#4613]) +2 similar issues [15]:
Re: [Intel-gfx] [PATCH v9 15/15] drm/i915: Add deadline based boost support
On Fri, Mar 03, 2023 at 06:48:43AM -0800, Rob Clark wrote: > On Fri, Mar 3, 2023 at 1:58 AM Tvrtko Ursulin > wrote: > > > > > > On 03/03/2023 03:21, Rodrigo Vivi wrote: > > > On Thu, Mar 02, 2023 at 03:53:37PM -0800, Rob Clark wrote: > > >> From: Rob Clark > > >> > > > > > > missing some wording here... > > > > > >> v2: rebase > > >> > > >> Signed-off-by: Rob Clark > > >> --- > > >> drivers/gpu/drm/i915/i915_request.c | 20 > > >> 1 file changed, 20 insertions(+) > > >> > > >> diff --git a/drivers/gpu/drm/i915/i915_request.c > > >> b/drivers/gpu/drm/i915/i915_request.c > > >> index 7503dcb9043b..44491e7e214c 100644 > > >> --- a/drivers/gpu/drm/i915/i915_request.c > > >> +++ b/drivers/gpu/drm/i915/i915_request.c > > >> @@ -97,6 +97,25 @@ static bool i915_fence_enable_signaling(struct > > >> dma_fence *fence) > > >> return i915_request_enable_breadcrumb(to_request(fence)); > > >> } > > >> > > >> +static void i915_fence_set_deadline(struct dma_fence *fence, ktime_t > > >> deadline) > > >> +{ > > >> +struct i915_request *rq = to_request(fence); > > >> + > > >> +if (i915_request_completed(rq)) > > >> +return; > > >> + > > >> +if (i915_request_started(rq)) > > >> +return; > > > > > > why do we skip the boost if already started? > > > don't we want to boost the freq anyway? > > > > I'd wager Rob is just copying the current i915 wait boost logic. > > Yup, and probably incorrectly.. Matt reported fewer boosts/sec > compared to your RFC, this could be the bug I don't think i915 calls drm_atomic_helper_wait_for_fences() so that could explain something. -- Ville Syrjälä Intel
Re: [Intel-gfx] [PATCH v9 15/15] drm/i915: Add deadline based boost support
On Thu, Mar 2, 2023 at 7:21 PM Rodrigo Vivi wrote: > > On Thu, Mar 02, 2023 at 03:53:37PM -0800, Rob Clark wrote: > > From: Rob Clark > > > > missing some wording here... the wording should be "Pls replace this patch, kthx" ;-) > > > v2: rebase > > > > Signed-off-by: Rob Clark > > --- > > drivers/gpu/drm/i915/i915_request.c | 20 > > 1 file changed, 20 insertions(+) > > > > diff --git a/drivers/gpu/drm/i915/i915_request.c > > b/drivers/gpu/drm/i915/i915_request.c > > index 7503dcb9043b..44491e7e214c 100644 > > --- a/drivers/gpu/drm/i915/i915_request.c > > +++ b/drivers/gpu/drm/i915/i915_request.c > > @@ -97,6 +97,25 @@ static bool i915_fence_enable_signaling(struct dma_fence > > *fence) > > return i915_request_enable_breadcrumb(to_request(fence)); > > } > > > > +static void i915_fence_set_deadline(struct dma_fence *fence, ktime_t > > deadline) > > +{ > > + struct i915_request *rq = to_request(fence); > > + > > + if (i915_request_completed(rq)) > > + return; > > + > > + if (i915_request_started(rq)) > > + return; > > why do we skip the boost if already started? > don't we want to boost the freq anyway? > > > + > > + /* > > + * TODO something more clever for deadlines that are in the > > + * future. I think probably track the nearest deadline in > > + * rq->timeline and set timer to trigger boost accordingly? > > + */ > > I'm afraid it will be very hard to find some heuristics of what's > late enough for the boost no? > I mean, how early to boost the freq on an upcoming deadline for the > timer? So, from my understanding of i915 boosting, it applies more specifically to a given request (vs msm which just bumps up the freq until the next devfreq sampling period which then recalculates target freq based on busy cycles and avg freq over the last sampling period). For msm I just set a timer for 3ms before the deadline and boost if the fence isn't signaled when the timer fires. It is kinda impossible to predict, even for userspace, how long a job will take to complete, so the goal isn't really to finish the specified job by the deadline, but instead to avoid devfreq landing at a local minimum (maximum?) AFAIU what I _think_ would make sense for i915 is to do the same thing you do if you miss a vblank pageflip deadline if the deadline passes without the fence signaling. BR, -R > > + > > + intel_rps_boost(rq); > > +} > > + > > static signed long i915_fence_wait(struct dma_fence *fence, > > bool interruptible, > > signed long timeout) > > @@ -182,6 +201,7 @@ const struct dma_fence_ops i915_fence_ops = { > > .signaled = i915_fence_signaled, > > .wait = i915_fence_wait, > > .release = i915_fence_release, > > + .set_deadline = i915_fence_set_deadline, > > }; > > > > static void irq_execute_cb(struct irq_work *wrk) > > -- > > 2.39.1 > >
Re: [Intel-gfx] [PATCH v9 15/15] drm/i915: Add deadline based boost support
On Fri, Mar 3, 2023 at 1:58 AM Tvrtko Ursulin wrote: > > > On 03/03/2023 03:21, Rodrigo Vivi wrote: > > On Thu, Mar 02, 2023 at 03:53:37PM -0800, Rob Clark wrote: > >> From: Rob Clark > >> > > > > missing some wording here... > > > >> v2: rebase > >> > >> Signed-off-by: Rob Clark > >> --- > >> drivers/gpu/drm/i915/i915_request.c | 20 > >> 1 file changed, 20 insertions(+) > >> > >> diff --git a/drivers/gpu/drm/i915/i915_request.c > >> b/drivers/gpu/drm/i915/i915_request.c > >> index 7503dcb9043b..44491e7e214c 100644 > >> --- a/drivers/gpu/drm/i915/i915_request.c > >> +++ b/drivers/gpu/drm/i915/i915_request.c > >> @@ -97,6 +97,25 @@ static bool i915_fence_enable_signaling(struct > >> dma_fence *fence) > >> return i915_request_enable_breadcrumb(to_request(fence)); > >> } > >> > >> +static void i915_fence_set_deadline(struct dma_fence *fence, ktime_t > >> deadline) > >> +{ > >> +struct i915_request *rq = to_request(fence); > >> + > >> +if (i915_request_completed(rq)) > >> +return; > >> + > >> +if (i915_request_started(rq)) > >> +return; > > > > why do we skip the boost if already started? > > don't we want to boost the freq anyway? > > I'd wager Rob is just copying the current i915 wait boost logic. Yup, and probably incorrectly.. Matt reported fewer boosts/sec compared to your RFC, this could be the bug > >> + > >> +/* > >> + * TODO something more clever for deadlines that are in the > >> + * future. I think probably track the nearest deadline in > >> + * rq->timeline and set timer to trigger boost accordingly? > >> + */ > > > > I'm afraid it will be very hard to find some heuristics of what's > > late enough for the boost no? > > I mean, how early to boost the freq on an upcoming deadline for the > > timer? > > We can off load this patch from Rob and deal with it separately, or > after the fact? That is completely my intention, I expect you to replace my i915 patch ;-) Rough idea when everyone is happy with the core bits is to setup an immutable branch without the driver specific patches, which could be merged into drm-next and $driver-next and then each driver team can add there own driver patches on top BR, -R > It's a half solution without a smarter scheduler too. Like > https://lore.kernel.org/all/20210208105236.28498-10-ch...@chris-wilson.co.uk/, > or if GuC plans to do something like that at any point. > > Or bump the priority too if deadline is looming? > > IMO it is not very effective to fiddle with the heuristic on an ad-hoc > basis. For instance I have a new heuristics which improves the > problematic OpenCL cases for further 5% (relative to the current > waitboost improvement from adding missing syncobj waitboost). But I > can't really test properly for regressions over platforms, stacks, > workloads.. :( > > Regards, > > Tvrtko > > > > >> + > >> +intel_rps_boost(rq); > >> +} > >> + > >> static signed long i915_fence_wait(struct dma_fence *fence, > >> bool interruptible, > >> signed long timeout) > >> @@ -182,6 +201,7 @@ const struct dma_fence_ops i915_fence_ops = { > >> .signaled = i915_fence_signaled, > >> .wait = i915_fence_wait, > >> .release = i915_fence_release, > >> +.set_deadline = i915_fence_set_deadline, > >> }; > >> > >> static void irq_execute_cb(struct irq_work *wrk) > >> -- > >> 2.39.1 > >>
[Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915: Check HPD during eDP probe (rev2)
== Series Details == Series: drm/i915: Check HPD during eDP probe (rev2) URL : https://patchwork.freedesktop.org/series/114577/ State : success == Summary == CI Bug Log - changes from CI_DRM_12807 -> Patchwork_114577v2 Summary --- **SUCCESS** No regressions found. External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114577v2/index.html Participating hosts (36 -> 40) -- Additional (4): fi-kbl-soraka bat-atsm-1 fi-apl-guc fi-pnv-d510 Known issues Here are the changes found in Patchwork_114577v2 that come from known issues: ### IGT changes ### Issues hit * igt@fbdev@eof: - bat-atsm-1: NOTRUN -> [SKIP][1] ([i915#2582]) +4 similar issues [1]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114577v2/bat-atsm-1/igt@fb...@eof.html * igt@gem_exec_gttfill@basic: - fi-pnv-d510:NOTRUN -> [FAIL][2] ([i915#7229]) [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114577v2/fi-pnv-d510/igt@gem_exec_gttf...@basic.html * igt@gem_huc_copy@huc-copy: - fi-kbl-soraka: NOTRUN -> [SKIP][3] ([fdo#109271] / [i915#2190]) [3]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114577v2/fi-kbl-soraka/igt@gem_huc_c...@huc-copy.html * igt@gem_lmem_swapping@basic: - fi-apl-guc: NOTRUN -> [SKIP][4] ([fdo#109271] / [i915#4613]) +3 similar issues [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114577v2/fi-apl-guc/igt@gem_lmem_swapp...@basic.html - fi-kbl-soraka: NOTRUN -> [SKIP][5] ([fdo#109271] / [i915#4613]) +3 similar issues [5]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114577v2/fi-kbl-soraka/igt@gem_lmem_swapp...@basic.html * igt@gem_mmap@basic: - bat-atsm-1: NOTRUN -> [SKIP][6] ([i915#4083]) [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114577v2/bat-atsm-1/igt@gem_m...@basic.html * igt@gem_tiled_fence_blits@basic: - bat-atsm-1: NOTRUN -> [SKIP][7] ([i915#4077]) +2 similar issues [7]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114577v2/bat-atsm-1/igt@gem_tiled_fence_bl...@basic.html * igt@gem_tiled_pread_basic: - bat-atsm-1: NOTRUN -> [SKIP][8] ([i915#4079]) +1 similar issue [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114577v2/bat-atsm-1/igt@gem_tiled_pread_basic.html * igt@i915_pm_rps@basic-api: - bat-atsm-1: NOTRUN -> [SKIP][9] ([i915#6621]) [9]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114577v2/bat-atsm-1/igt@i915_pm_...@basic-api.html * igt@i915_selftest@live@gt_pm: - fi-kbl-soraka: NOTRUN -> [DMESG-FAIL][10] ([i915#1886]) [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114577v2/fi-kbl-soraka/igt@i915_selftest@live@gt_pm.html * igt@i915_selftest@live@hangcheck: - fi-kbl-soraka: NOTRUN -> [INCOMPLETE][11] ([i915#7913]) [11]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114577v2/fi-kbl-soraka/igt@i915_selftest@l...@hangcheck.html * igt@i915_selftest@live@reset: - bat-rpls-1: [PASS][12] -> [ABORT][13] ([i915#4983]) [12]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12807/bat-rpls-1/igt@i915_selftest@l...@reset.html [13]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114577v2/bat-rpls-1/igt@i915_selftest@l...@reset.html * igt@i915_suspend@basic-s3-without-i915: - bat-atsm-1: NOTRUN -> [SKIP][14] ([i915#6645]) [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114577v2/bat-atsm-1/igt@i915_susp...@basic-s3-without-i915.html * igt@kms_addfb_basic@size-max: - bat-atsm-1: NOTRUN -> [SKIP][15] ([i915#6077]) +36 similar issues [15]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114577v2/bat-atsm-1/igt@kms_addfb_ba...@size-max.html * igt@kms_chamelium_frames@hdmi-crc-fast: - fi-kbl-soraka: NOTRUN -> [SKIP][16] ([fdo#109271]) +16 similar issues [16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114577v2/fi-kbl-soraka/igt@kms_chamelium_fra...@hdmi-crc-fast.html * igt@kms_chamelium_hpd@vga-hpd-fast: - fi-apl-guc: NOTRUN -> [SKIP][17] ([fdo#109271]) +22 similar issues [17]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114577v2/fi-apl-guc/igt@kms_chamelium_...@vga-hpd-fast.html * igt@kms_cursor_legacy@basic-flip-after-cursor-atomic: - bat-atsm-1: NOTRUN -> [SKIP][18] ([i915#6078]) +19 similar issues [18]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114577v2/bat-atsm-1/igt@kms_cursor_leg...@basic-flip-after-cursor-atomic.html * igt@kms_flip@basic-plain-flip: - bat-atsm-1: NOTRUN -> [SKIP][19] ([i915#6166]) +3 similar issues [19]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114577v2/bat-atsm-1/igt@kms_f...@basic-plain-flip.html * igt@kms_force_connector_basic@prune-stale-modes: - bat-atsm-1: NOTRUN -> [SKIP][20] ([i915#6093]) +3 similar issues
[Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for drm/i915: Check HPD during eDP probe (rev2)
== Series Details == Series: drm/i915: Check HPD during eDP probe (rev2) URL : https://patchwork.freedesktop.org/series/114577/ State : warning == Summary == Error: dim checkpatch failed c7ee1a40c690 drm/i915: Populate dig_port->connected() before connector init -:60: WARNING:BLOCK_COMMENT_STYLE: Block comments use a trailing */ on a separate line #60: FILE: drivers/gpu/drm/i915/display/intel_ddi.c:4538: +* case we have some really bad VBTs... */ total: 0 errors, 1 warnings, 0 checks, 46 lines checked 17870842bd7a drm/i915: Fix SKL DDI A digital port .connected() 2b3299c63ec8 drm/i915: Get rid of the gm45 HPD live state nonsense 76519a95743b drm/i915: Introduce _hotplug_mask() 6e22c5be534e drm/i915: Introduce intel_hpd_enable_detection() 5f095199edf8 drm/i915: Check HPD live state during eDP probe -:51: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ chars of sha1> ("")' - ie: 'commit 41e35ffb380b ("drm/i915: Favor last VBT child device with conflicting AUX ch/DDC pin")' #51: Also the systems (Asrock B250M-HDV at least) fixed by commit total: 1 errors, 0 warnings, 0 checks, 46 lines checked 52405c647e07 drm/i915: Reuse _hotplug_mask() in .hpd_detection_setup()
[Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915/dsi: fix DSS CTL register offsets for TGL+ (rev2)
== Series Details == Series: drm/i915/dsi: fix DSS CTL register offsets for TGL+ (rev2) URL : https://patchwork.freedesktop.org/series/114522/ State : success == Summary == CI Bug Log - changes from CI_DRM_12807 -> Patchwork_114522v2 Summary --- **SUCCESS** No regressions found. External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114522v2/index.html Participating hosts (36 -> 39) -- Additional (3): bat-atsm-1 fi-apl-guc fi-pnv-d510 Known issues Here are the changes found in Patchwork_114522v2 that come from known issues: ### IGT changes ### Issues hit * igt@fbdev@eof: - bat-atsm-1: NOTRUN -> [SKIP][1] ([i915#2582]) +4 similar issues [1]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114522v2/bat-atsm-1/igt@fb...@eof.html * igt@gem_exec_gttfill@basic: - fi-pnv-d510:NOTRUN -> [FAIL][2] ([i915#7229]) [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114522v2/fi-pnv-d510/igt@gem_exec_gttf...@basic.html * igt@gem_lmem_swapping@basic: - fi-apl-guc: NOTRUN -> [SKIP][3] ([fdo#109271] / [i915#4613]) +3 similar issues [3]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114522v2/fi-apl-guc/igt@gem_lmem_swapp...@basic.html * igt@gem_mmap@basic: - bat-atsm-1: NOTRUN -> [SKIP][4] ([i915#4083]) [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114522v2/bat-atsm-1/igt@gem_m...@basic.html * igt@gem_tiled_fence_blits@basic: - bat-atsm-1: NOTRUN -> [SKIP][5] ([i915#4077]) +2 similar issues [5]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114522v2/bat-atsm-1/igt@gem_tiled_fence_bl...@basic.html * igt@gem_tiled_pread_basic: - bat-atsm-1: NOTRUN -> [SKIP][6] ([i915#4079]) +1 similar issue [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114522v2/bat-atsm-1/igt@gem_tiled_pread_basic.html * igt@i915_pm_rps@basic-api: - bat-atsm-1: NOTRUN -> [SKIP][7] ([i915#6621]) [7]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114522v2/bat-atsm-1/igt@i915_pm_...@basic-api.html * igt@i915_selftest@live@execlists: - fi-bsw-n3050: [PASS][8] -> [ABORT][9] ([i915#7911]) [8]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12807/fi-bsw-n3050/igt@i915_selftest@l...@execlists.html [9]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114522v2/fi-bsw-n3050/igt@i915_selftest@l...@execlists.html * igt@i915_selftest@live@gt_heartbeat: - fi-apl-guc: NOTRUN -> [DMESG-FAIL][10] ([i915#5334]) [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114522v2/fi-apl-guc/igt@i915_selftest@live@gt_heartbeat.html * igt@i915_selftest@live@migrate: - bat-dg2-11: [PASS][11] -> [DMESG-WARN][12] ([i915#7699]) [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12807/bat-dg2-11/igt@i915_selftest@l...@migrate.html [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114522v2/bat-dg2-11/igt@i915_selftest@l...@migrate.html * igt@i915_selftest@live@reset: - bat-rpls-1: [PASS][13] -> [ABORT][14] ([i915#4983]) [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12807/bat-rpls-1/igt@i915_selftest@l...@reset.html [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114522v2/bat-rpls-1/igt@i915_selftest@l...@reset.html * igt@i915_suspend@basic-s3-without-i915: - bat-atsm-1: NOTRUN -> [SKIP][15] ([i915#6645]) [15]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114522v2/bat-atsm-1/igt@i915_susp...@basic-s3-without-i915.html * igt@kms_addfb_basic@size-max: - bat-atsm-1: NOTRUN -> [SKIP][16] ([i915#6077]) +36 similar issues [16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114522v2/bat-atsm-1/igt@kms_addfb_ba...@size-max.html * igt@kms_chamelium_hpd@vga-hpd-fast: - fi-apl-guc: NOTRUN -> [SKIP][17] ([fdo#109271]) +22 similar issues [17]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114522v2/fi-apl-guc/igt@kms_chamelium_...@vga-hpd-fast.html * igt@kms_cursor_legacy@basic-flip-after-cursor-atomic: - bat-atsm-1: NOTRUN -> [SKIP][18] ([i915#6078]) +19 similar issues [18]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114522v2/bat-atsm-1/igt@kms_cursor_leg...@basic-flip-after-cursor-atomic.html * igt@kms_flip@basic-plain-flip: - bat-atsm-1: NOTRUN -> [SKIP][19] ([i915#6166]) +3 similar issues [19]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114522v2/bat-atsm-1/igt@kms_f...@basic-plain-flip.html * igt@kms_force_connector_basic@prune-stale-modes: - bat-atsm-1: NOTRUN -> [SKIP][20] ([i915#6093]) +3 similar issues [20]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114522v2/bat-atsm-1/igt@kms_force_connector_ba...@prune-stale-modes.html * igt@kms_pipe_crc_basic@hang-read-crc: - bat-atsm-1: NOTRUN -> [SKIP][21]
Re: [Intel-gfx] [PATCH v4] drm/i915: add guard page to ggtt->error_capture
On 03.03.2023 13:01, Tvrtko Ursulin wrote: On 02/03/2023 11:00, Andrzej Hajda wrote: On 02.03.2023 11:43, Tvrtko Ursulin wrote: On 08/02/2023 10:51, Andrzej Hajda wrote: Write-combining memory allows speculative reads by CPU. ggtt->error_capture is WC mapped to CPU, so CPU/MMU can try to prefetch memory beyond the error_capture, ie it tries to read memory pointed by next PTE in GGTT. If this PTE points to invalid address DMAR errors will occur. This behaviour was observed on ADL, RPL, DG2 platforms. To avoid it, guard scratch page should be added after error_capture. The patch fixes the most annoying issue with error capture but since WC reads are used also in other places there is a risk similar problem can affect them as well. Signed-off-by: Andrzej Hajda Reviewed-by: Andi Shyti Research tells the explanation is plausible so: Acked-by: Tvrtko Ursulin Thanks for looking at it. On patch details below. What about "simiar risk in other places" - are there any plans to asses the potential impact elsewhere? Yes, merging this patch is the 1st step, as error_capture is responsible for most (maybe even all) regular DMAR read errors. After removing this noise it will be much easier to spot other DMAR read errors. There are also multiple regular DMAR write errors (less frequent, but still), which I hope to debug if time permits. Regards Andrzej --- This patch tries to diminish plague of DMAR read errors present in CI for ADL*, RPL*, DG2 platforms, see for example [1] (grep DMAR). CI is usually tolerant for these errors, so the scale of the problem is not really visible. To show it I have counted lines containing DMAR read errors in dmesgs produced by CI for all three versions of the patch, but in contrast to v2 I have grepped only for lines containing "PTE Read access". Below stats for kernel w/o patch vs patched one. v1: 210 vs 0 v2: 201 vs 0 v3: 214 vs 0 Apparently the patch fixes all common PTE read errors. In previous version there were different numbers due to less exact grepping, "grep DMAR" catched write errors and "DMAR: DRHD: handling fault status reg" lines, anyway the actual number of errors is much bigger - DMAR errors are rate-limited. [1]: http://gfx-ci.igk.intel.com/tree/drm-tip/CI_DRM_12678/bat-adln-1/dmesg0.txt Changelog: v2: - modified commit message (I hope the diagnosis is correct), - added bug checks to ensure scratch is initialized on gen3 platforms. CI produces strange stacktrace for it suggesting scratch[0] is NULL, to be removed after resolving the issue with gen3 platforms. v3: - removed bug checks, replaced with gen check. v4: - change code for scratch page insertion to support all platforms, - add info in commit message there could be more similar issues Regards Andrzej --- drivers/gpu/drm/i915/gt/intel_ggtt.c | 31 1 file changed, 27 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c b/drivers/gpu/drm/i915/gt/intel_ggtt.c index 842e69c7b21e49..6566d2066f1f8b 100644 --- a/drivers/gpu/drm/i915/gt/intel_ggtt.c +++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c @@ -503,6 +503,21 @@ static void cleanup_init_ggtt(struct i915_ggtt *ggtt) mutex_destroy(>error_mutex); } +static void +ggtt_insert_scratch_pages(struct i915_ggtt *ggtt, u64 offset, u64 length) +{ + struct i915_address_space *vm = >vm; + + if (GRAPHICS_VER(ggtt->vm.i915) < 8) + return vm->clear_range(vm, offset, length); + /* clear_range since gen8 is nop */ It would also work to simply drop the loop below, right? If so would that be clearer? Apparently I have missed this comment, loop does not work for gen3, vm->scratch[0] is not initialized, it uses some code from drivers/char/agp/intel-gtt.c for clearing range. Alternatively, if you want to keep the dual path here, perhaps it would be better to replace the gen check with "clear_range !=/== nop_clear_range". That way maybe more of the platform knowledge remains at a central location. There is vm->error_range in internal branch which does the same as clear_range, which 'when upstreamed (?)' should simplify this code. On the other side unstaticising nop_clear_range would allow to drop mock_clear_range, dpt_clear_range. Another thing - are there any suspend-resume considerations (i915_ggtt_resume_vm)? Maybe this padding needs to be restored too. Didn't think about it, I will add code there too. Regards Andrzej Regards, Tvrtko + while (length > 0) { Maybe add a GEM_BUG_ON if length is not aligned? Granted it may be a huge stretch of imagination.. Regards, Tvrtko + vm->insert_page(vm, px_dma(vm->scratch[0]), offset, I915_CACHE_NONE, 0); + offset += I915_GTT_PAGE_SIZE; + length -= I915_GTT_PAGE_SIZE; + } +} + static int init_ggtt(struct i915_ggtt *ggtt) { /* @@ -551,8 +566,12 @@ static int init_ggtt(struct i915_ggtt *ggtt)
Re: [Intel-gfx] [PATCH] drm/i915: avoid flush_scheduled_work() usage
On 2023/03/03 19:11, Tetsuo Handa wrote: > @@ -79,6 +81,7 @@ static int __init i915_init(void) > { > int err, i; > > + i915_wq = alloc_workqueue("i915", 0, 0); Oops. I forgot to add if (!i915_wq) return -ENOMEM; here. But I'd like to wait for your response for a while before submitting v2 patch. > for (i = 0; i < ARRAY_SIZE(init_funcs); i++) { > err = init_funcs[i].init(); > if (err < 0) { > @@ -86,6 +89,7 @@ static int __init i915_init(void) > if (init_funcs[i].exit) > init_funcs[i].exit(); > } > + destroy_workqueue(i915_wq); > return err; > } else if (err > 0) { > /* > @@ -113,6 +117,7 @@ static void __exit i915_exit(void) > if (init_funcs[i].exit) > init_funcs[i].exit(); > } > + destroy_workqueue(i915_wq); > } > > module_init(i915_init);
Re: [Intel-gfx] [PATCH v4] drm/i915: add guard page to ggtt->error_capture
On 02/03/2023 11:00, Andrzej Hajda wrote: On 02.03.2023 11:43, Tvrtko Ursulin wrote: On 08/02/2023 10:51, Andrzej Hajda wrote: Write-combining memory allows speculative reads by CPU. ggtt->error_capture is WC mapped to CPU, so CPU/MMU can try to prefetch memory beyond the error_capture, ie it tries to read memory pointed by next PTE in GGTT. If this PTE points to invalid address DMAR errors will occur. This behaviour was observed on ADL, RPL, DG2 platforms. To avoid it, guard scratch page should be added after error_capture. The patch fixes the most annoying issue with error capture but since WC reads are used also in other places there is a risk similar problem can affect them as well. Signed-off-by: Andrzej Hajda Reviewed-by: Andi Shyti Research tells the explanation is plausible so: Acked-by: Tvrtko Ursulin Thanks for looking at it. On patch details below. What about "simiar risk in other places" - are there any plans to asses the potential impact elsewhere? Yes, merging this patch is the 1st step, as error_capture is responsible for most (maybe even all) regular DMAR read errors. After removing this noise it will be much easier to spot other DMAR read errors. There are also multiple regular DMAR write errors (less frequent, but still), which I hope to debug if time permits. Regards Andrzej --- This patch tries to diminish plague of DMAR read errors present in CI for ADL*, RPL*, DG2 platforms, see for example [1] (grep DMAR). CI is usually tolerant for these errors, so the scale of the problem is not really visible. To show it I have counted lines containing DMAR read errors in dmesgs produced by CI for all three versions of the patch, but in contrast to v2 I have grepped only for lines containing "PTE Read access". Below stats for kernel w/o patch vs patched one. v1: 210 vs 0 v2: 201 vs 0 v3: 214 vs 0 Apparently the patch fixes all common PTE read errors. In previous version there were different numbers due to less exact grepping, "grep DMAR" catched write errors and "DMAR: DRHD: handling fault status reg" lines, anyway the actual number of errors is much bigger - DMAR errors are rate-limited. [1]: http://gfx-ci.igk.intel.com/tree/drm-tip/CI_DRM_12678/bat-adln-1/dmesg0.txt Changelog: v2: - modified commit message (I hope the diagnosis is correct), - added bug checks to ensure scratch is initialized on gen3 platforms. CI produces strange stacktrace for it suggesting scratch[0] is NULL, to be removed after resolving the issue with gen3 platforms. v3: - removed bug checks, replaced with gen check. v4: - change code for scratch page insertion to support all platforms, - add info in commit message there could be more similar issues Regards Andrzej --- drivers/gpu/drm/i915/gt/intel_ggtt.c | 31 1 file changed, 27 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c b/drivers/gpu/drm/i915/gt/intel_ggtt.c index 842e69c7b21e49..6566d2066f1f8b 100644 --- a/drivers/gpu/drm/i915/gt/intel_ggtt.c +++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c @@ -503,6 +503,21 @@ static void cleanup_init_ggtt(struct i915_ggtt *ggtt) mutex_destroy(>error_mutex); } +static void +ggtt_insert_scratch_pages(struct i915_ggtt *ggtt, u64 offset, u64 length) +{ + struct i915_address_space *vm = >vm; + + if (GRAPHICS_VER(ggtt->vm.i915) < 8) + return vm->clear_range(vm, offset, length); + /* clear_range since gen8 is nop */ It would also work to simply drop the below, right? If so would that be clearer? Alternatively, if you want to keep the dual path here, perhaps it would be better to replace the gen check with "clear_range !=/== nop_clear_range". That way maybe more of the platform knowledge remains at a central location. Another thing - are there any suspend-resume considerations (i915_ggtt_resume_vm)? Maybe this padding needs to be restored too. Regards, Tvrtko + while (length > 0) { Maybe add a GEM_BUG_ON if length is not aligned? Granted it may be a huge stretch of imagination.. Regards, Tvrtko + vm->insert_page(vm, px_dma(vm->scratch[0]), offset, I915_CACHE_NONE, 0); + offset += I915_GTT_PAGE_SIZE; + length -= I915_GTT_PAGE_SIZE; + } +} + static int init_ggtt(struct i915_ggtt *ggtt) { /* @@ -551,8 +566,12 @@ static int init_ggtt(struct i915_ggtt *ggtt) * paths, and we trust that 0 will remain reserved. However, * the only likely reason for failure to insert is a driver * bug, which we expect to cause other failures... + * + * Since CPU can perform speculative reads on error capture + * (write-combining allows it) add scratch page after error + * capture to avoid DMAR errors. */ - ggtt->error_capture.size = I915_GTT_PAGE_SIZE; + ggtt->error_capture.size = 2 * I915_GTT_PAGE_SIZE; ggtt->error_capture.color =
Re: [Intel-gfx] [PATCH v3] drm/i915/active: Fix misuse of non-idle barriers as fence trackers
Hi Janusz, Pushed to drm-intel-gt-next. Thanks, Andi On Thu, Mar 02, 2023 at 01:08:20PM +0100, Janusz Krzysztofik wrote: > Users reported oopses on list corruptions when using i915 perf with a > number of concurrently running graphics applications. Root cause analysis > pointed at an issue in barrier processing code -- a race among perf open / > close replacing active barriers with perf requests on kernel context and > concurrent barrier preallocate / acquire operations performed during user > context first pin / last unpin. > > When adding a request to a composite tracker, we try to reuse an existing > fence tracker, already allocated and registered with that composite. The > tracker we obtain may already track another fence, may be an idle barrier, > or an active barrier. > > If the tracker we get occurs a non-idle barrier then we try to delete that > barrier from a list of barrier tasks it belongs to. However, while doing > that we don't respect return value from a function that performs the > barrier deletion. Should the deletion ever fail, we would end up reusing > the tracker still registered as a barrier task. Since the same structure > field is reused with both fence callback lists and barrier tasks list, > list corruptions would likely occur. > > Barriers are now deleted from a barrier tasks list by temporarily removing > the list content, traversing that content with skip over the node to be > deleted, then populating the list back with the modified content. Should > that intentionally racy concurrent deletion attempts be not serialized, > one or more of those may fail because of the list being temporary empty. > > Related code that ignores the results of barrier deletion was initially > introduced in v5.4 by commit d8af05ff38ae ("drm/i915: Allow sharing the > idle-barrier from other kernel requests"). However, all users of the > barrier deletion routine were apparently serialized at that time, then the > issue didn't exhibit itself. Results of git bisect with help of a newly > developed igt@gem_barrier_race@remote-request IGT test indicate that list > corruptions might start to appear after commit 311770173fac ("drm/i915/gt: > Schedule request retirement when timeline idles"), introduced in v5.5. > > Respect results of barrier deletion attempts -- mark the barrier as idle > only if successfully deleted from the list. Then, before proceeding with > setting our fence as the one currently tracked, make sure that the tracker > we've got is not a non-idle barrier. If that check fails then don't use > that tracker but go back and try to acquire a new, usable one. > > v3: use unlikely() to document what outcome we expect (Andi), > - fix bad grammar in commit description. > v2: no code changes, > - blame commit 311770173fac ("drm/i915/gt: Schedule request retirement > when timeline idles"), v5.5, not commit d8af05ff38ae ("drm/i915: Allow > sharing the idle-barrier from other kernel requests"), v5.4, > - reword commit description. > > Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/6333 > Fixes: 311770173fac ("drm/i915/gt: Schedule request retirement when timeline > idles") > Cc: Chris Wilson > Cc: sta...@vger.kernel.org # v5.5 > Cc: Andi Shyti > Signed-off-by: Janusz Krzysztofik > --- > drivers/gpu/drm/i915/i915_active.c | 25 ++--- > 1 file changed, 14 insertions(+), 11 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_active.c > b/drivers/gpu/drm/i915/i915_active.c > index 7412abf166a8c..a9fea115f2d26 100644 > --- a/drivers/gpu/drm/i915/i915_active.c > +++ b/drivers/gpu/drm/i915/i915_active.c > @@ -422,12 +422,12 @@ replace_barrier(struct i915_active *ref, struct > i915_active_fence *active) >* we can use it to substitute for the pending idle-barrer >* request that we want to emit on the kernel_context. >*/ > - __active_del_barrier(ref, node_from_active(active)); > - return true; > + return __active_del_barrier(ref, node_from_active(active)); > } > > int i915_active_add_request(struct i915_active *ref, struct i915_request *rq) > { > + u64 idx = i915_request_timeline(rq)->fence_context; > struct dma_fence *fence = >fence; > struct i915_active_fence *active; > int err; > @@ -437,16 +437,19 @@ int i915_active_add_request(struct i915_active *ref, > struct i915_request *rq) > if (err) > return err; > > - active = active_instance(ref, i915_request_timeline(rq)->fence_context); > - if (!active) { > - err = -ENOMEM; > - goto out; > - } > + do { > + active = active_instance(ref, idx); > + if (!active) { > + err = -ENOMEM; > + goto out; > + } > + > + if (replace_barrier(ref, active)) { > + RCU_INIT_POINTER(active->fence, NULL); > + atomic_dec(>count); > + } > + } while
Re: [Intel-gfx] [PATCH v9 15/15] drm/i915: Add deadline based boost support
On Fri, Mar 03, 2023 at 09:58:36AM +, Tvrtko Ursulin wrote: > > On 03/03/2023 03:21, Rodrigo Vivi wrote: > > On Thu, Mar 02, 2023 at 03:53:37PM -0800, Rob Clark wrote: > > > From: Rob Clark > > > > > > > missing some wording here... > > > > > v2: rebase > > > > > > Signed-off-by: Rob Clark > > > --- > > > drivers/gpu/drm/i915/i915_request.c | 20 > > > 1 file changed, 20 insertions(+) > > > > > > diff --git a/drivers/gpu/drm/i915/i915_request.c > > > b/drivers/gpu/drm/i915/i915_request.c > > > index 7503dcb9043b..44491e7e214c 100644 > > > --- a/drivers/gpu/drm/i915/i915_request.c > > > +++ b/drivers/gpu/drm/i915/i915_request.c > > > @@ -97,6 +97,25 @@ static bool i915_fence_enable_signaling(struct > > > dma_fence *fence) > > > return i915_request_enable_breadcrumb(to_request(fence)); > > > } > > > +static void i915_fence_set_deadline(struct dma_fence *fence, ktime_t > > > deadline) > > > +{ > > > + struct i915_request *rq = to_request(fence); > > > + > > > + if (i915_request_completed(rq)) > > > + return; > > > + > > > + if (i915_request_started(rq)) > > > + return; > > > > why do we skip the boost if already started? > > don't we want to boost the freq anyway? > > I'd wager Rob is just copying the current i915 wait boost logic. > > > > + > > > + /* > > > + * TODO something more clever for deadlines that are in the > > > + * future. I think probably track the nearest deadline in > > > + * rq->timeline and set timer to trigger boost accordingly? > > > + */ > > > > I'm afraid it will be very hard to find some heuristics of what's > > late enough for the boost no? > > I mean, how early to boost the freq on an upcoming deadline for the > > timer? > > We can off load this patch from Rob and deal with it separately, or after > the fact? > > It's a half solution without a smarter scheduler too. Like > https://lore.kernel.org/all/20210208105236.28498-10-ch...@chris-wilson.co.uk/, > or if GuC plans to do something like that at any point. Indeed, we already have the deadline implementation (and not just that), we just need to have some willingness to apply it. Andi > Or bump the priority too if deadline is looming? > > IMO it is not very effective to fiddle with the heuristic on an ad-hoc > basis. For instance I have a new heuristics which improves the problematic > OpenCL cases for further 5% (relative to the current waitboost improvement > from adding missing syncobj waitboost). But I can't really test properly for > regressions over platforms, stacks, workloads.. :( > > Regards, > > Tvrtko > > > > > > + > > > + intel_rps_boost(rq); > > > +} > > > + > > > static signed long i915_fence_wait(struct dma_fence *fence, > > > bool interruptible, > > > signed long timeout) > > > @@ -182,6 +201,7 @@ const struct dma_fence_ops i915_fence_ops = { > > > .signaled = i915_fence_signaled, > > > .wait = i915_fence_wait, > > > .release = i915_fence_release, > > > + .set_deadline = i915_fence_set_deadline, > > > }; > > > static void irq_execute_cb(struct irq_work *wrk) > > > -- > > > 2.39.1 > > >
[Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915/sseu: fix max_subslices array-index-out-of-bounds access (rev4)
== Series Details == Series: drm/i915/sseu: fix max_subslices array-index-out-of-bounds access (rev4) URL : https://patchwork.freedesktop.org/series/114199/ State : success == Summary == CI Bug Log - changes from CI_DRM_12794 -> Patchwork_114199v4 Summary --- **SUCCESS** No regressions found. External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114199v4/index.html Participating hosts (39 -> 37) -- Missing(2): fi-kbl-soraka fi-snb-2520m Known issues Here are the changes found in Patchwork_114199v4 that come from known issues: ### IGT changes ### Issues hit * igt@i915_selftest@live@requests: - bat-rpls-2: [PASS][1] -> [ABORT][2] ([i915#4983] / [i915#7694]) [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12794/bat-rpls-2/igt@i915_selftest@l...@requests.html [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114199v4/bat-rpls-2/igt@i915_selftest@l...@requests.html - bat-rpls-1: [PASS][3] -> [ABORT][4] ([i915#4983] / [i915#7694] / [i915#7911] / [i915#7981]) [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12794/bat-rpls-1/igt@i915_selftest@l...@requests.html [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114199v4/bat-rpls-1/igt@i915_selftest@l...@requests.html * igt@kms_chamelium_hpd@common-hpd-after-suspend: - bat-dg2-11: NOTRUN -> [SKIP][5] ([i915#7828]) [5]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114199v4/bat-dg2-11/igt@kms_chamelium_...@common-hpd-after-suspend.html Possible fixes * igt@gem_exec_gttfill@basic: - fi-pnv-d510:[FAIL][6] ([i915#7229]) -> [PASS][7] [6]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12794/fi-pnv-d510/igt@gem_exec_gttf...@basic.html [7]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114199v4/fi-pnv-d510/igt@gem_exec_gttf...@basic.html * igt@gem_exec_suspend@basic-s3@smem: - bat-dg2-11: [INCOMPLETE][8] ([i915#6311]) -> [PASS][9] [8]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12794/bat-dg2-11/igt@gem_exec_suspend@basic...@smem.html [9]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114199v4/bat-dg2-11/igt@gem_exec_suspend@basic...@smem.html * igt@i915_selftest@live@hangcheck: - fi-skl-guc: [DMESG-WARN][10] ([i915#8073]) -> [PASS][11] [10]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12794/fi-skl-guc/igt@i915_selftest@l...@hangcheck.html [11]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114199v4/fi-skl-guc/igt@i915_selftest@l...@hangcheck.html * igt@i915_selftest@live@migrate: - bat-atsm-1: [DMESG-FAIL][12] ([i915#7699]) -> [PASS][13] [12]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12794/bat-atsm-1/igt@i915_selftest@l...@migrate.html [13]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114199v4/bat-atsm-1/igt@i915_selftest@l...@migrate.html * igt@i915_selftest@live@workarounds: - bat-rpls-2: [DMESG-WARN][14] ([i915#7852]) -> [PASS][15] [14]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12794/bat-rpls-2/igt@i915_selftest@l...@workarounds.html [15]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114199v4/bat-rpls-2/igt@i915_selftest@l...@workarounds.html [i915#4983]: https://gitlab.freedesktop.org/drm/intel/issues/4983 [i915#6311]: https://gitlab.freedesktop.org/drm/intel/issues/6311 [i915#7229]: https://gitlab.freedesktop.org/drm/intel/issues/7229 [i915#7694]: https://gitlab.freedesktop.org/drm/intel/issues/7694 [i915#7699]: https://gitlab.freedesktop.org/drm/intel/issues/7699 [i915#7828]: https://gitlab.freedesktop.org/drm/intel/issues/7828 [i915#7852]: https://gitlab.freedesktop.org/drm/intel/issues/7852 [i915#7911]: https://gitlab.freedesktop.org/drm/intel/issues/7911 [i915#7981]: https://gitlab.freedesktop.org/drm/intel/issues/7981 [i915#8073]: https://gitlab.freedesktop.org/drm/intel/issues/8073 Build changes - * Linux: CI_DRM_12794 -> Patchwork_114199v4 CI-20190529: 20190529 CI_DRM_12794: 09f45ee84b4e66b882286806fb4b2b03907db5dc @ git://anongit.freedesktop.org/gfx-ci/linux IGT_7176: ffe88a907c0fafe6a736f5f17cee8ba8eddd6fa7 @ https://gitlab.freedesktop.org/drm/igt-gpu-tools.git Patchwork_114199v4: 09f45ee84b4e66b882286806fb4b2b03907db5dc @ git://anongit.freedesktop.org/gfx-ci/linux ### Linux commits bb1a3712cbfe drm/i915/sseu: fix max_subslices array-index-out-of-bounds access == Logs == For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114199v4/index.html
[Intel-gfx] ✓ Fi.CI.BAT: success for Enable YCbCr420 format for VDSC (rev2)
== Series Details == Series: Enable YCbCr420 format for VDSC (rev2) URL : https://patchwork.freedesktop.org/series/114246/ State : success == Summary == CI Bug Log - changes from CI_DRM_12805 -> Patchwork_114246v2 Summary --- **SUCCESS** No regressions found. External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114246v2/index.html Participating hosts (3 -> 2) -- Missing(1): fi-snb-2520m Changes --- No changes found Build changes - * Linux: CI_DRM_12805 -> Patchwork_114246v2 CI-20190529: 20190529 CI_DRM_12805: c1dbd5b5c8fdb5efd37a5a5234469b293c89a358 @ git://anongit.freedesktop.org/gfx-ci/linux IGT_7179: fcbbb6ab645cdbd7545c3f96d7b7df7674e620be @ https://gitlab.freedesktop.org/drm/igt-gpu-tools.git Patchwork_114246v2: c1dbd5b5c8fdb5efd37a5a5234469b293c89a358 @ git://anongit.freedesktop.org/gfx-ci/linux ### Linux commits 854d3e83acbf drm/i915/dsc: Add debugfs entry to validate DSC output formats 75d7b64e8731 drm/i915/vdsc: Check slice design requirement df0c32d69a62 drm/i915/display: Fill in native_420 field 95200017860b drm/i915: Enable YCbCr420 for VDSC a5f19b5be267 drm/i915: Adding the new registers for DSC 81922029e90d drm/i915/dp: Check if DSC supports the given output_format 57d743e4928a drm/dp_helper: Add helper to check DSC support with given o/p format == Logs == For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114246v2/index.html
[Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for drm/i915: avoid flush_scheduled_work() usage
== Series Details == Series: drm/i915: avoid flush_scheduled_work() usage URL : https://patchwork.freedesktop.org/series/114608/ State : warning == Summary == Error: dim checkpatch failed 2457aaaeeb44 drm/i915: avoid flush_scheduled_work() usage -:154: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis #154: FILE: drivers/gpu/drm/i915/display/intel_hdcp.c:2132: + queue_delayed_work(i915_wq, >check_work, DRM_HDCP2_CHECK_PERIOD_MS); -:158: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis #158: FILE: drivers/gpu/drm/i915/display/intel_hdcp.c:2135: + queue_delayed_work(i915_wq, >check_work, DRM_HDCP_CHECK_PERIOD_MS); -:342: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis #342: FILE: drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c:94: + queue_delayed_work(i915_wq, >work, round_jiffies_up_relative(HZ)); -:351: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis #351: FILE: drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c:120: + queue_delayed_work(i915_wq, >work, round_jiffies_up_relative(HZ)); -:386: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis #386: FILE: drivers/gpu/drm/i915/gt/intel_gt_requests.c:211: + queue_delayed_work(i915_wq, >requests.retire_work, round_jiffies_up_relative(HZ)); -:395: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis #395: FILE: drivers/gpu/drm/i915/gt/intel_gt_requests.c:228: + queue_delayed_work(i915_wq, >requests.retire_work, round_jiffies_up_relative(HZ)); total: 0 errors, 0 warnings, 6 checks, 410 lines checked
[Intel-gfx] [PATCH] drm/i915: avoid flush_scheduled_work() usage
Like commit c4f135d643823a86 ("workqueue: Wrap flush_workqueue() using a macro") says, flush_scheduled_work() is dangerous and will be forbidden. There was an attempt to remove flush_scheduled_work() from intel_modeset_driver_remove_noirq(), but it went to backlog [1]. Now that i915 is about to become the last flush_scheduled_work() user, for now let's start with blind sed -e 's/system_wq,/i915_wq,/g' -e 's/schedule_work(/queue_work(i915_wq, /g' -e 's/schedule_delayed_work(/queue_delayed_work(i915_wq, /g' -e 's/flush_scheduled_work()/flush_workqueue(i915_wq)/g' conversion inside the whole drivers/gpu/drm/i915/ directory. There will be work items which do not need to be flushed by flush_workqueue(i915_wq), but that cleanup can be done after flush_scheduled_work() is removed. Link: https://lkml.kernel.org/r/20220923142934.29528-1-tvrtko.ursu...@linux.intel.com [1] Signed-off-by: Tetsuo Handa Cc: Tvrtko Ursulin Cc: Jani Nikula Cc: Ville Syrjälä --- drivers/gpu/drm/i915/display/intel_display.c | 4 ++-- drivers/gpu/drm/i915/display/intel_dmc.c | 2 +- drivers/gpu/drm/i915/display/intel_dp.c| 2 +- .../gpu/drm/i915/display/intel_dp_link_training.c | 2 +- drivers/gpu/drm/i915/display/intel_drrs.c | 2 +- drivers/gpu/drm/i915/display/intel_fbc.c | 2 +- drivers/gpu/drm/i915/display/intel_fbdev.c | 2 +- drivers/gpu/drm/i915/display/intel_hdcp.c | 14 +++--- drivers/gpu/drm/i915/display/intel_hotplug.c | 12 ++-- drivers/gpu/drm/i915/display/intel_opregion.c | 2 +- drivers/gpu/drm/i915/display/intel_pps.c | 2 +- drivers/gpu/drm/i915/display/intel_psr.c | 6 +++--- .../gpu/drm/i915/gt/intel_execlists_submission.c | 4 ++-- drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c | 4 ++-- drivers/gpu/drm/i915/gt/intel_gt_irq.c | 2 +- drivers/gpu/drm/i915/gt/intel_gt_requests.c| 6 +++--- drivers/gpu/drm/i915/gt/intel_reset.c | 2 +- drivers/gpu/drm/i915/gt/intel_rps.c| 14 +++--- drivers/gpu/drm/i915/gt/selftest_engine_cs.c | 2 +- drivers/gpu/drm/i915/i915_drv.h| 1 + drivers/gpu/drm/i915/i915_module.c | 5 + drivers/gpu/drm/i915/i915_request.c| 2 +- drivers/gpu/drm/i915/intel_wakeref.c | 4 +++- drivers/gpu/drm/i915/selftests/i915_sw_fence.c | 4 +++- 24 files changed, 56 insertions(+), 46 deletions(-) diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c index d3994e2a7d63..4088f69f68ba 100644 --- a/drivers/gpu/drm/i915/display/intel_display.c +++ b/drivers/gpu/drm/i915/display/intel_display.c @@ -7648,7 +7648,7 @@ intel_atomic_commit_ready(struct i915_sw_fence *fence, _i915(state->base.dev)->display.atomic_helper; if (llist_add(>freed, >free_list)) - schedule_work(>free_work); + queue_work(i915_wq, >free_work); break; } } @@ -8981,7 +8981,7 @@ void intel_modeset_driver_remove_noirq(struct drm_i915_private *i915) intel_unregister_dsm_handler(); /* flush any delayed tasks or pending work */ - flush_scheduled_work(); + flush_workqueue(i915_wq); intel_hdcp_component_fini(i915); diff --git a/drivers/gpu/drm/i915/display/intel_dmc.c b/drivers/gpu/drm/i915/display/intel_dmc.c index 257aa2b7cf20..acf30b64492b 100644 --- a/drivers/gpu/drm/i915/display/intel_dmc.c +++ b/drivers/gpu/drm/i915/display/intel_dmc.c @@ -992,7 +992,7 @@ void intel_dmc_ucode_init(struct drm_i915_private *dev_priv) } drm_dbg_kms(_priv->drm, "Loading %s\n", dmc->fw_path); - schedule_work(_priv->display.dmc.work); + queue_work(i915_wq, _priv->display.dmc.work); } /** diff --git a/drivers/gpu/drm/i915/display/intel_dp.c b/drivers/gpu/drm/i915/display/intel_dp.c index 62cbab7402e9..dd401dce099d 100644 --- a/drivers/gpu/drm/i915/display/intel_dp.c +++ b/drivers/gpu/drm/i915/display/intel_dp.c @@ -5064,7 +5064,7 @@ static void intel_dp_oob_hotplug_event(struct drm_connector *connector) spin_lock_irq(>irq_lock); i915->display.hotplug.event_bits |= BIT(encoder->hpd_pin); spin_unlock_irq(>irq_lock); - queue_delayed_work(system_wq, >display.hotplug.hotplug_work, 0); + queue_delayed_work(i915_wq, >display.hotplug.hotplug_work, 0); } static const struct drm_connector_funcs intel_dp_connector_funcs = { diff --git a/drivers/gpu/drm/i915/display/intel_dp_link_training.c b/drivers/gpu/drm/i915/display/intel_dp_link_training.c index 3d3efcf02011..4449b21ad9bd 100644 --- a/drivers/gpu/drm/i915/display/intel_dp_link_training.c +++ b/drivers/gpu/drm/i915/display/intel_dp_link_training.c @@ -1123,7 +1123,7 @@ static void
Re: [Intel-gfx] [PATCH v9 15/15] drm/i915: Add deadline based boost support
On 03/03/2023 03:21, Rodrigo Vivi wrote: On Thu, Mar 02, 2023 at 03:53:37PM -0800, Rob Clark wrote: From: Rob Clark missing some wording here... v2: rebase Signed-off-by: Rob Clark --- drivers/gpu/drm/i915/i915_request.c | 20 1 file changed, 20 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c index 7503dcb9043b..44491e7e214c 100644 --- a/drivers/gpu/drm/i915/i915_request.c +++ b/drivers/gpu/drm/i915/i915_request.c @@ -97,6 +97,25 @@ static bool i915_fence_enable_signaling(struct dma_fence *fence) return i915_request_enable_breadcrumb(to_request(fence)); } +static void i915_fence_set_deadline(struct dma_fence *fence, ktime_t deadline) +{ + struct i915_request *rq = to_request(fence); + + if (i915_request_completed(rq)) + return; + + if (i915_request_started(rq)) + return; why do we skip the boost if already started? don't we want to boost the freq anyway? I'd wager Rob is just copying the current i915 wait boost logic. + + /* +* TODO something more clever for deadlines that are in the +* future. I think probably track the nearest deadline in +* rq->timeline and set timer to trigger boost accordingly? +*/ I'm afraid it will be very hard to find some heuristics of what's late enough for the boost no? I mean, how early to boost the freq on an upcoming deadline for the timer? We can off load this patch from Rob and deal with it separately, or after the fact? It's a half solution without a smarter scheduler too. Like https://lore.kernel.org/all/20210208105236.28498-10-ch...@chris-wilson.co.uk/, or if GuC plans to do something like that at any point. Or bump the priority too if deadline is looming? IMO it is not very effective to fiddle with the heuristic on an ad-hoc basis. For instance I have a new heuristics which improves the problematic OpenCL cases for further 5% (relative to the current waitboost improvement from adding missing syncobj waitboost). But I can't really test properly for regressions over platforms, stacks, workloads.. :( Regards, Tvrtko + + intel_rps_boost(rq); +} + static signed long i915_fence_wait(struct dma_fence *fence, bool interruptible, signed long timeout) @@ -182,6 +201,7 @@ const struct dma_fence_ops i915_fence_ops = { .signaled = i915_fence_signaled, .wait = i915_fence_wait, .release = i915_fence_release, + .set_deadline = i915_fence_set_deadline, }; static void irq_execute_cb(struct irq_work *wrk) -- 2.39.1
Re: [Intel-gfx] Linux 6.2.1 hits a display driver bug (list_del corruption, ffff88811b4af298->next is NULL)
On Fri, Mar 03, 2023 at 03:46:56AM +0700, Ammar Faizi wrote: > Hi, > > Linux 6.2.1 hits a display driver bug (list_del corruption, > 88811b4af298->next is NULL). > > Unfortunately, I don't know the last good commit and the first bad commit. Can you please try v6.1? -- An old man doll... just what I always wanted! - Clara signature.asc Description: PGP signature