[Intel-gfx] [PATCH] drm/i915: Flush buffer pools on driver remove
In preparation for clean driver release, attempts to drain work queues
and release freed objects are taken at driver remove time.  However, GT
buffer pools are currently not flushed before the driver release phase.
Since unused objects may stay there for up to one second, some may
survive until driver release is attempted.  That can potentially
explain sporadic, and thus hard to reproduce, issues observed at driver
release time, like a non-zero shrink counter or outstanding address
space areas.

Flush buffer pools on GT remove as a potential fix.  Also, don't flush
the pools at driver release again, just assert that the flush was called
and nothing more was added in between (suggested by Chris).

Signed-off-by: Janusz Krzysztofik
Cc: Chris Wilson
---
 drivers/gpu/drm/i915/gt/intel_gt.c             | 2 ++
 drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c | 2 --
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c
index 2161bf01ef8b..c03b399bfaf5 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt.c
@@ -652,6 +652,8 @@ void intel_gt_driver_remove(struct intel_gt *gt)
 	intel_uc_driver_remove(&gt->uc);
 
 	intel_engines_release(gt);
+
+	intel_gt_flush_buffer_pool(gt);
 }
 
 void intel_gt_driver_unregister(struct intel_gt *gt)
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c b/drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c
index aa0a59c5b614..acc49c56a9f3 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c
@@ -245,8 +245,6 @@ void intel_gt_fini_buffer_pool(struct intel_gt *gt)
 	struct intel_gt_buffer_pool *pool = &gt->buffer_pool;
 	int n;
 
-	intel_gt_flush_buffer_pool(gt);
-
 	for (n = 0; n < ARRAY_SIZE(pool->cache_list); n++)
 		GEM_BUG_ON(!list_empty(&pool->cache_list[n]));
 }
-- 
2.25.1
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH RESEND] drm/i915: Flush buffer pools on driver remove
Hi Matt,

Thanks for the review.

On Thursday, 23 September 2021 00:24:29 CEST Matt Roper wrote:
> On Fri, Sep 03, 2021 at 04:23:20PM +0200, Janusz Krzysztofik wrote:
> > In preparation for clean driver release, attempts to drain work queues
> > and release freed objects are taken at driver remove time.  However, GT
> > buffer pools are now not flushed before the driver release phase.
> > Since unused objects may stay there for up to one second, some may
> > survive until driver release is attempted.  That can potentially
> > explain sporadic then hardly reproducible issues observed at driver
> > release time, like non-zero shrink counter or outstanding address space
>
> So just to make sure I'm understanding the description here:
>  - We currently do an explicit flush of the buffer pools within the call
>    path of drm_driver.release(); this removes all buffers, regardless of
>    their age.

And also triggers release of the buffers' underlying resources (objects,
address space areas).

>  - However there may be other code that runs *earlier* within the
>    drm_driver.release() call chain

Yes, within the drm_driver.release() call chain, but not necessarily earlier
-- that's irrelevant, I believe, ...

>    that expects buffer pools have
>    already been flushed and are already empty.

... since that other code expects not just buffer pools but the resource
categories they consume (objects, address space areas) to be flushed already,
while some may still be busy with old buffers not auto-flushed yet.

>  - Since buffer pools auto-flush old buffers once per second in a worker
>    thread, there's a small window where if we remove the driver while
>    there are still buffers with an age of less than one second, the
>    assumptions of the other release code may be violated.

Correct.

> So by moving the flush to driver remove (which executes earlier via the
> pci_driver.remove() flow) you're ensuring that all buffers are flushed
> before _any_ code in drm_driver.release() executes.

And also flushed before some other code in pci_driver.remove() flushes those
other resource categories released on buffer pools flush; completeness of
all those flushes is then checked in drm_driver.release().

May I copy-paste some of your wording while rephrasing my commit description?

Thanks,
Janusz

>
> I found the wording of the commit message here somewhat confusing since
> it's talking about flushes we do in driver release, but mentions
> problems that arise during driver release due to lack of flushing.  You
> might want to reword the commit message somewhat to help clarify.
> Otherwise, the code change itself looks reasonable to me.
>
> BTW, I do notice that drm_driver.release() in general is technically
> deprecated at this point (with a suggestion in the drm_drv.h comments to
> switch to using drmm_add_action(), drmm_kmalloc(), etc. to manage the
> cleanup of resources).  At some point in the future we may want to
> rework the i915 cleanup in general according to that guidance.
>
>
> Matt
>
> > areas.
> >
> > Flush buffer pools on GT remove as a fix.  On driver release, don't
> > flush the pools again, just assert that the flush was called and
> > nothing added more in between.
> >
> > Signed-off-by: Janusz Krzysztofik
> > Cc: Chris Wilson
> > ---
> > Resending with Cc: dri-de...@lists.freedesktop.org as requested, and a
> > typo in commit description fixed.
> >
> > Thanks,
> > Janusz
> >
> >  drivers/gpu/drm/i915/gt/intel_gt.c             | 2 ++
> >  drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c | 2 --
> >  2 files changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c
> > b/drivers/gpu/drm/i915/gt/intel_gt.c
> > index 62d40c986642..8f322a4ecd87 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_gt.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_gt.c
> > @@ -737,6 +737,8 @@ void intel_gt_driver_remove(struct intel_gt *gt)
> >  	intel_uc_driver_remove(&gt->uc);
> >
> >  	intel_engines_release(gt);
> > +
> > +	intel_gt_flush_buffer_pool(gt);
> >  }
> >
> >  void intel_gt_driver_unregister(struct intel_gt *gt)
> > diff --git a/drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c
> > b/drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c
> > index aa0a59c5b614..acc49c56a9f3 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c
> > @@ -245,8 +245,6 @@ void intel_gt_fini_buffer_pool(struct intel_gt *gt)
> >  	struct intel_gt_buffer_pool *pool = &gt->buffer_pool;
> >  	int n;
> >
> > -	intel_gt_flush_buffer_pool(gt);
> > -
> >  	for (n = 0; n < ARRAY_SIZE(pool->cache_list); n++)
> >  		GEM_BUG_ON(!list_empty(&pool->cache_list[n]));
> >  }
> >
[Intel-gfx] [PATCH v2] drm/i915: Flush buffer pools on driver remove
We currently do an explicit flush of the buffer pools within the call
path of drm_driver.release(); this removes all buffers, regardless of
their age, freeing the buffers' associated resources (objects, address
space areas).  However, there is other code that runs within the
drm_driver.release() call chain that expects objects and their
associated address space areas to have already been flushed.

Since buffer pools auto-flush old buffers once per second in a worker
thread, there's a small window where, if we remove the driver while
there are still objects in buffers with an age of less than one second,
the assumptions of the other release code may be violated.

By moving the flush to driver remove (which executes earlier via the
pci_driver.remove() flow) we're ensuring that all buffers are flushed
and their associated objects freed before some other code in
pci_driver.remove() flushes those objects, so they are released before
_any_ code in drm_driver.release() that checks completeness of those
flushes executes.

v2: Reword commit description as suggested by Matt.
Signed-off-by: Janusz Krzysztofik
Cc: Chris Wilson
Cc: Matt Roper
---
 drivers/gpu/drm/i915/gt/intel_gt.c             | 2 ++
 drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c | 2 --
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c
index 4037c3778225..5b3acf2b064e 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt.c
@@ -741,6 +741,8 @@ void intel_gt_driver_remove(struct intel_gt *gt)
 	intel_uc_driver_remove(&gt->uc);
 
 	intel_engines_release(gt);
+
+	intel_gt_flush_buffer_pool(gt);
 }
 
 void intel_gt_driver_unregister(struct intel_gt *gt)
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c b/drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c
index aa0a59c5b614..acc49c56a9f3 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c
@@ -245,8 +245,6 @@ void intel_gt_fini_buffer_pool(struct intel_gt *gt)
 	struct intel_gt_buffer_pool *pool = &gt->buffer_pool;
 	int n;
 
-	intel_gt_flush_buffer_pool(gt);
-
 	for (n = 0; n < ARRAY_SIZE(pool->cache_list); n++)
 		GEM_BUG_ON(!list_empty(&pool->cache_list[n]));
 }
-- 
2.25.1
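The ordering argument of this patch can be illustrated with a small toy model. This is only a sketch under stated assumptions: struct toy_pool and every toy_* function are hypothetical names made up for illustration, not i915 code. The model just shows that flushing in the remove phase lets the release phase get away with a pure emptiness check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model; none of these names exist in i915. */
struct toy_pool {
	size_t cached;       /* buffers still parked in the pool */
	size_t live_objects; /* objects kept alive by those buffers */
};

/* Stand-in for a pool flush: drop every cached buffer, whatever its age. */
static void toy_pool_flush(struct toy_pool *pool)
{
	pool->live_objects -= pool->cached;
	pool->cached = 0;
}

/* Remove phase after the patch: the pool is flushed here. */
static void toy_driver_remove(struct toy_pool *pool)
{
	toy_pool_flush(pool);
}

/* Release phase after the patch: no flush, only a check that nothing is left. */
static bool toy_driver_release_ok(const struct toy_pool *pool)
{
	return pool->cached == 0 && pool->live_objects == 0;
}
```

With three young buffers still cached, toy_driver_release_ok() reports a problem until toy_driver_remove() has run, which mirrors the window described in the commit message.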
Re: [Intel-gfx] [RFC PATCH i-g-t] lib/i915/perf: Fix non-card0 processing
Hi Lionel,

On Monday, 3 May 2021 09:07:09 CEST Lionel Landwerlin wrote:
> On 30/04/2021 19:18, Janusz Krzysztofik wrote:
> > IGT i915/perf library functions now always operate on sysfs perf
> > attributes of card0 device node, no matter which DRM device fd a user
> > passes.  The intention was to always switch to primary device node if
> > a user passes a render device node fd, but that breaks handling of
> > non-card0 devices.
> >
> > Instead of forcibly using DRM device minor number 0 when opening a
> > device sysfs area, convert device minor number of a user passed device
> > fd to the minor number of respective primary (cardX) device node.
> >
> > Signed-off-by: Janusz Krzysztofik
> > ---
> >  lib/i915/perf.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/lib/i915/perf.c b/lib/i915/perf.c
> > index 56d5c0b3a..336824df7 100644
> > --- a/lib/i915/perf.c
> > +++ b/lib/i915/perf.c
> > @@ -376,8 +376,8 @@ open_master_sysfs_dir(int drm_fd)
> >  if (fstat(drm_fd, &st) || !S_ISCHR(st.st_mode))
> >  return -1;
> >
> > -snprintf(path, sizeof(path), "/sys/dev/char/%d:0",
> > - major(st.st_rdev));
> > +snprintf(path, sizeof(path), "/sys/dev/char/%d:%d",
> > + major(st.st_rdev), minor(st.st_rdev) & ~128);
>
>
> Isn't it minor(st.st_rdev) & 0xff ? Did you mean 0x7f?
> or even 0x3f ?
>
> Looks like /dev/dri/controlD64 can exist too.

Not any longer, see commit 0d49f303e8a7 ("drm: remove all control node code").

However, my approach of applying a mask is oversimplified.  Minor numbers for
different node types (primary and render) are handled separately.  I'm going
to propose a method similar to that implemented in igt_debugfs_path().

Thanks,
Janusz

>
>
> -Lionel
>
>
> >
> >  return open(path, O_DIRECTORY);
> >  }
>
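To make the point about masking concrete, here is a minimal sketch. It assumes the traditional DRM minor layout (primary nodes below 64, render nodes starting at 128, which is also what the later patch versions check for); the toy_* names are hypothetical helpers, not IGT or kernel API.

```c
#include <assert.h>

/* Hypothetical classification of DRM minors, assuming the traditional
 * layout: 0..63 primary (cardX), 128..191 render (renderDX). */
enum toy_drm_node {
	TOY_NODE_PRIMARY,
	TOY_NODE_RENDER,
	TOY_NODE_OTHER,
};

static enum toy_drm_node toy_drm_node_type(unsigned int minor_nr)
{
	if (minor_nr < 64)
		return TOY_NODE_PRIMARY;
	if (minor_nr >= 128 && minor_nr < 192)
		return TOY_NODE_RENDER;
	return TOY_NODE_OTHER;
}

/* The masking approach from the RFC patch: clear the "render" bit. */
static unsigned int toy_masked_minor(unsigned int minor_nr)
{
	return minor_nr & ~128u;
}
```

Masking minor 129 yields 1, which is a valid primary minor, but since primary and render minors are handled separately nothing guarantees that it belongs to the same device. That is why the mask alone is not a reliable conversion.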
[Intel-gfx] [PATCH i-g-t v2] lib/i915/perf: Fix non-card0 processing
IGT i915/perf library functions now always operate on sysfs perf attributes of card0 device node, no matter which DRM device fd a user passes. The intention was to always switch to primary device node if a user passes a render device node fd, but that breaks handling of non-card0 devices. If a user passed a render device node fd, find a primary device node of the same device and use it instead of forcibly using the primary device with minor number 0 when opening the device sysfs area. v2: Don't assume primary minor matches render minor with masked type. Signed-off-by: Janusz Krzysztofik Cc: Lionel Landwerlin --- lib/i915/perf.c | 31 --- 1 file changed, 28 insertions(+), 3 deletions(-) diff --git a/lib/i915/perf.c b/lib/i915/perf.c index 56d5c0b3a..d7768468e 100644 --- a/lib/i915/perf.c +++ b/lib/i915/perf.c @@ -372,14 +372,39 @@ open_master_sysfs_dir(int drm_fd) { char path[128]; struct stat st; + int sysfs; if (fstat(drm_fd, &st) || !S_ISCHR(st.st_mode)) return -1; -snprintf(path, sizeof(path), "/sys/dev/char/%d:0", - major(st.st_rdev)); + snprintf(path, sizeof(path), "/sys/dev/char/%d:%d", major(st.st_rdev), minor(st.st_rdev)); + sysfs = open(path, O_DIRECTORY); - return open(path, O_DIRECTORY); + if (sysfs >= 0 && minor(st.st_rdev) >= 128) { + char device[100], cmp[100]; + int device_len, cmp_len, i; + + device_len = readlinkat(sysfs, "device", device, sizeof(device)); + close(sysfs); + if (device_len < 0) + return device_len; + + for (i = 0; i < 128; i++) { + + snprintf(path, sizeof(path), "/sys/dev/char/%d:%d", major(st.st_rdev), i); + sysfs = open(path, O_DIRECTORY); + if (sysfs < 0) + continue; + + cmp_len = readlinkat(sysfs, "device", cmp, sizeof(cmp)); + if (cmp_len == device_len && !memcmp(cmp, device, cmp_len)) + break; + + close(sysfs); + } + } + + return sysfs; } struct intel_perf * -- 2.25.1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH i-g-t v3] lib/i915/perf: Fix non-card0 processing
IGT i915/perf library functions currently always operate on sysfs perf attributes of the card0 device node, no matter which DRM device fd a user passes. The intention was to always switch to the primary device node if a user passes a render device node fd, but that breaks handling of non-card0 devices. If a user passed a render device node fd, find a primary device node of the same device and use it instead of forcibly using the primary device with minor number 0 when opening the device sysfs area. v2: Don't assume primary minor matches render minor with masked type. v3: Reset sysfs dir fd if no match, consequently spell out error paths, add a comment on conversion of renderD* to cardX (Lionel). Signed-off-by: Janusz Krzysztofik Reviewed-by: Lionel Landwerlin --- lib/i915/perf.c | 35 --- 1 file changed, 32 insertions(+), 3 deletions(-) diff --git a/lib/i915/perf.c b/lib/i915/perf.c index 56d5c0b3a..b9e10519e 100644 --- a/lib/i915/perf.c +++ b/lib/i915/perf.c @@ -372,14 +372,43 @@ open_master_sysfs_dir(int drm_fd) { char path[128]; struct stat st; + int sysfs; if (fstat(drm_fd, &st) || !S_ISCHR(st.st_mode)) return -1; -snprintf(path, sizeof(path), "/sys/dev/char/%d:0", - major(st.st_rdev)); + snprintf(path, sizeof(path), "/sys/dev/char/%d:%d", major(st.st_rdev), minor(st.st_rdev)); + sysfs = open(path, O_DIRECTORY); + if (sysfs < 0) + return sysfs; - return open(path, O_DIRECTORY); + if (minor(st.st_rdev) >= 128) { + /* If we were given a renderD* drm_fd, find its associated cardX node. */ + char device[100], cmp[100]; + int device_len, cmp_len, i; + + device_len = readlinkat(sysfs, "device", device, sizeof(device)); + close(sysfs); + if (device_len < 0) + return device_len; + + for (i = 0; i < 128; i++) { + + snprintf(path, sizeof(path), "/sys/dev/char/%d:%d", major(st.st_rdev), i); + sysfs = open(path, O_DIRECTORY); + if (sysfs < 0) + continue; + + cmp_len = readlinkat(sysfs, "device", cmp, sizeof(cmp)); + if (cmp_len == device_len && !memcmp(cmp, device, cmp_len)) + break; + + close(sysfs); + sysfs = -1; + } + } + + return sysfs; } struct intel_perf * -- 2.25.1
[Intel-gfx] [PATCH i-g-t v4] lib/i915/perf: Fix non-card0 processing
IGT i915/perf library functions currently always operate on sysfs perf attributes of the card0 device node, no matter which DRM device fd a user passes. The intention was to always switch to the primary device node if a user passes a render device node fd, but that breaks handling of non-card0 devices. If a user passed a render device node fd, find a primary device node of the same device and use it instead of forcibly using the primary device with minor number 0 when opening the device sysfs area. v2: Don't assume primary minor matches render minor with masked type. v3: Reset sysfs dir fd if no match, consequently spell out error paths, add a comment on conversion of renderD* to cardX (Lionel). v4: Limit primary lookup to minors <64 (Chris) Signed-off-by: Janusz Krzysztofik Reviewed-by: Lionel Landwerlin # v3 Cc: Chris Wilson --- lib/i915/perf.c | 35 --- 1 file changed, 32 insertions(+), 3 deletions(-) diff --git a/lib/i915/perf.c b/lib/i915/perf.c index 56d5c0b3a..5644a3469 100644 --- a/lib/i915/perf.c +++ b/lib/i915/perf.c @@ -372,14 +372,43 @@ open_master_sysfs_dir(int drm_fd) { char path[128]; struct stat st; + int sysfs; if (fstat(drm_fd, &st) || !S_ISCHR(st.st_mode)) return -1; -snprintf(path, sizeof(path), "/sys/dev/char/%d:0", - major(st.st_rdev)); + snprintf(path, sizeof(path), "/sys/dev/char/%d:%d", major(st.st_rdev), minor(st.st_rdev)); + sysfs = open(path, O_DIRECTORY); + if (sysfs < 0) + return sysfs; - return open(path, O_DIRECTORY); + if (minor(st.st_rdev) >= 128) { + /* If we were given a renderD* drm_fd, find its associated cardX node. */ + char device[100], cmp[100]; + int device_len, cmp_len, i; + + device_len = readlinkat(sysfs, "device", device, sizeof(device)); + close(sysfs); + if (device_len < 0) + return device_len; + + for (i = 0; i < 64; i++) { + + snprintf(path, sizeof(path), "/sys/dev/char/%d:%d", major(st.st_rdev), i); + sysfs = open(path, O_DIRECTORY); + if (sysfs < 0) + continue; + + cmp_len = readlinkat(sysfs, "device", cmp, sizeof(cmp)); + if (cmp_len == device_len && !memcmp(cmp, device, cmp_len)) + break; + + close(sysfs); + sysfs = -1; + } + } + + return sysfs; } struct intel_perf * -- 2.25.1
[Intel-gfx] [PATCH] drm/i915/gt: Do release kernel context if breadcrumb measure fails
Commit fb5970da1b42 ("drm/i915/gt: Use the kernel_context to measure the breadcrumb size") reordered some operations inside engine_init_common() and added an error unwind path to that function. In that path, a reference to a kernel context candidate supposed to be released on error was put, but the context, pinned when created, was not unpinned first. Fix it by replacing intel_context_put() with destroy_pinned_context() introduced later by commit b436a5f8b6c8 ("drm/i915/gt: Track all timelines created using the HWSP"). Signed-off-by: Janusz Krzysztofik Cc: Chris Wilson --- drivers/gpu/drm/i915/gt/intel_engine_cs.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c index 6dbdbde00f14..eba2da9679a5 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c @@ -898,7 +898,7 @@ static int engine_init_common(struct intel_engine_cs *engine) return 0; err_context: - intel_context_put(ce); + destroy_pinned_context(ce); return ret; } -- 2.25.1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
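The imbalance described in the commit message can be shown with a toy model. This is a sketch only: struct toy_context and the toy_* functions are hypothetical and deliberately simplified, they do not reproduce the real intel_context or the internals of destroy_pinned_context(). The only thing modelled is that a context created pinned needs an unpin before its reference is dropped.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model, not the i915 intel_context. */
struct toy_context {
	int refcount;  /* references held on the context */
	int pin_count; /* outstanding pins */
};

/* Models creating a pinned kernel context: one reference, one pin. */
static struct toy_context toy_create_pinned_context(void)
{
	struct toy_context ce = { .refcount = 1, .pin_count = 1 };

	return ce;
}

static void toy_context_unpin(struct toy_context *ce)
{
	ce->pin_count--;
}

static void toy_context_put(struct toy_context *ce)
{
	ce->refcount--;
}

/* Stand-in for the fixed unwind path: unpin first, then drop the reference. */
static void toy_destroy_pinned_context(struct toy_context *ce)
{
	toy_context_unpin(ce);
	toy_context_put(ce);
}

/* True when neither a reference nor a pin is left behind. */
static bool toy_context_balanced(const struct toy_context *ce)
{
	return ce->refcount == 0 && ce->pin_count == 0;
}
```

In this model the old unwind path (a bare toy_context_put()) leaves pin_count at 1, while toy_destroy_pinned_context() leaves the context balanced.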
Re: [Intel-gfx] [PATCH] drm/i915/gt: Do release kernel context if breadcrumb measure fails
Hi Tvrtko,

On Monday, 10 May 2021 11:14:46 CEST Tvrtko Ursulin wrote:
>
> On 07/05/2021 15:42, Janusz Krzysztofik wrote:
> > Commit fb5970da1b42 ("drm/i915/gt: Use the kernel_context to measure the
> > breadcrumb size") reordered some operations inside engine_init_common()
> > and added an error unwind path to that function.  In that path, a
> > reference to a kernel context candidate supposed to be released on error
> > was put, but the context, pinned when created, was not unpinned first.
> > Fix it by replacing intel_context_put() with destroy_pinned_context()
> > introduced later by commit b436a5f8b6c8 ("drm/i915/gt: Track all timelines
> > created using the HWSP").
> >
> > Signed-off-by: Janusz Krzysztofik
> > Cc: Chris Wilson
> > ---
> >  drivers/gpu/drm/i915/gt/intel_engine_cs.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> > b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> > index 6dbdbde00f14..eba2da9679a5 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> > @@ -898,7 +898,7 @@ static int engine_init_common(struct intel_engine_cs
> > *engine)
> >  	return 0;
> >
> >  err_context:
> > -	intel_context_put(ce);
> > +	destroy_pinned_context(ce);
> >  	return ret;
> >  }
> >
> >
>
> Reviewed-by: Tvrtko Ursulin
>
> Found by some test or code inspection?

Code inspection.

Thanks,
Janusz

>
> Regards,
>
> Tvrtko
>
Re: [Intel-gfx] [PATCH] drm/i915: Fix wrong name announced on FB driver switching
Hi Jani, On Mon, 3 May 2021 19:38:17 CEST Jani Nikula wrote: > On Thu, 29 Apr 2021, Janusz Krzysztofik wrote: > > Commit 7a0f9ef9703d ("drm/i915: Use drm_fb_helper_fill_info") > > effectively changed our FB driver name from "inteldrmfb" to > > "i915drmfb". However, we are still using the old name when kicking out > > a firmware fbdev driver potentially bound to our device. Use the new > > name to avoid confusion. > > > > Note: since the new name is assigned by a DRM fbdev helper called at > > the DRM driver registration time, that name is not available when we > > kick the other driver out early, hence a hardcoded name must be used > > unless the DRM layer exposes a macro for converting a DRM driver name > > to its associated fbdev driver name. > > > > Signed-off-by: Janusz Krzysztofik > > LGTM, Daniel? > > Reviewed-by: Jani Nikula Thanks for review. What are next steps? Please note I have no push permissions. Thanks, Janusz > > $ dim fixes 7a0f9ef9703d > Fixes: 7a0f9ef9703d ("drm/i915: Use drm_fb_helper_fill_info") > Cc: Noralf Trønnes > Cc: Alex Deucher > Cc: Daniel Vetter > Cc: Jani Nikula > Cc: Joonas Lahtinen > Cc: Rodrigo Vivi > Cc: intel-gfx@lists.freedesktop.org > Cc: # v5.2+ > > > > --- > > drivers/gpu/drm/i915/i915_drv.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/ i915_drv.c > > index 785dcf20c77b..46082490dc9a 100644 > > --- a/drivers/gpu/drm/i915/i915_drv.c > > +++ b/drivers/gpu/drm/i915/i915_drv.c > > @@ -554,7 +554,7 @@ static int i915_driver_hw_probe(struct drm_i915_private *dev_priv) > > if (ret) > > goto err_perf; > > > > - ret = drm_aperture_remove_conflicting_pci_framebuffers(pdev, "inteldrmfb"); > > + ret = drm_aperture_remove_conflicting_pci_framebuffers(pdev, "i915drmfb"); > > if (ret) > > goto err_ggtt; > > ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: Fix wrong name announced on FB driver switching
Hi,

On Monday, 3 May 2021 19:38:17 CEST Jani Nikula wrote:
> On Thu, 29 Apr 2021, Janusz Krzysztofik wrote:
> > Commit 7a0f9ef9703d ("drm/i915: Use drm_fb_helper_fill_info")
> > effectively changed our FB driver name from "inteldrmfb" to
> > "i915drmfb".  However, we are still using the old name when kicking out
> > a firmware fbdev driver potentially bound to our device.  Use the new
> > name to avoid confusion.
> >
> > Note: since the new name is assigned by a DRM fbdev helper called at
> > the DRM driver registration time, that name is not available when we
> > kick the other driver out early, hence a hardcoded name must be used
> > unless the DRM layer exposes a macro for converting a DRM driver name
> > to its associated fbdev driver name.
> >
> > Signed-off-by: Janusz Krzysztofik
>
> LGTM, Daniel?
>
> Reviewed-by: Jani Nikula

Am I supposed to do something to push processing of this patch forward?
Please note I have no push permissions so can't merge it myself.

>
> $ dim fixes 7a0f9ef9703d
> Fixes: 7a0f9ef9703d ("drm/i915: Use drm_fb_helper_fill_info")
> Cc: Noralf Trønnes
> Cc: Alex Deucher
> Cc: Daniel Vetter
> Cc: Jani Nikula
> Cc: Joonas Lahtinen
> Cc: Rodrigo Vivi
> Cc: intel-gfx@lists.freedesktop.org
> Cc: # v5.2+

Should I resubmit with those tags appended?

Thanks,
Janusz

>
>
> > ---
> >  drivers/gpu/drm/i915/i915_drv.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> > index 785dcf20c77b..46082490dc9a 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.c
> > +++ b/drivers/gpu/drm/i915/i915_drv.c
> > @@ -554,7 +554,7 @@ static int i915_driver_hw_probe(struct drm_i915_private *dev_priv)
> >  	if (ret)
> >  		goto err_perf;
> >
> > -	ret = drm_aperture_remove_conflicting_pci_framebuffers(pdev, "inteldrmfb");
> > +	ret = drm_aperture_remove_conflicting_pci_framebuffers(pdev, "i915drmfb");
> >  	if (ret)
> >  		goto err_ggtt;
> >
[Intel-gfx] [PATCH] drm/i915: Mark GPU wedging on driver unregister unrecoverable
The GPU wedged flag, currently set on driver unregister to prevent
further use of the GPU, can then be cleared unintentionally by a call to
__intel_gt_unset_wedged() before the flag is finally marked
unrecoverable.  We need to have it marked unrecoverable earlier.
Implement that by replacing the call to intel_gt_set_wedged() in
intel_gt_driver_unregister() with intel_gt_set_wedged_on_fini().

With the above in place, intel_gt_set_wedged_on_fini() is now called
twice on driver remove, the second time from __intel_gt_disable().  This
seems harmless, while dropping intel_gt_set_wedged_on_fini() from
__intel_gt_disable() proved to break some driver probe error unwind
paths as well as the mock selftest exit path.

Signed-off-by: Janusz Krzysztofik
Cc: Michał Winiarski
---
 drivers/gpu/drm/i915/gt/intel_gt.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c
index 62d40c986642..173b53cb2b47 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt.c
@@ -750,7 +750,7 @@ void intel_gt_driver_unregister(struct intel_gt *gt)
 	 * all in-flight requests so that we can quickly unbind the active
 	 * resources.
 	 */
-	intel_gt_set_wedged(gt);
+	intel_gt_set_wedged_on_fini(gt);
 
 	/* Scrub all HW state upon release */
 	with_intel_runtime_pm(gt->uncore->rpm, wakeref)
-- 
2.25.1
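The difference between a plain wedge and an unrecoverable one can be sketched with two flag bits. This is a toy model under assumptions: the TOY_* bits and toy_* functions are hypothetical names for illustration, and the real i915 reset code is considerably more involved. The sketch only shows that an "on fini" marker makes a later unset attempt fail, and that setting it twice changes nothing.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bits, not the i915 reset flags. */
enum {
	TOY_WEDGED = 1u << 0,         /* GPU declared unusable */
	TOY_WEDGED_ON_FINI = 1u << 1, /* wedging must not be undone */
};

/* Plain wedge: can still be undone later. */
static void toy_set_wedged(unsigned int *flags)
{
	*flags |= TOY_WEDGED;
}

/* Wedge and mark it unrecoverable in one step; idempotent. */
static void toy_set_wedged_on_fini(unsigned int *flags)
{
	toy_set_wedged(flags);
	*flags |= TOY_WEDGED_ON_FINI;
}

/* Try to clear the wedged state; refuse if it was marked unrecoverable. */
static bool toy_unset_wedged(unsigned int *flags)
{
	if (*flags & TOY_WEDGED_ON_FINI)
		return false;

	*flags &= ~TOY_WEDGED;
	return true;
}
```

In the model, a flag set with toy_set_wedged() is cleared by a subsequent toy_unset_wedged(), which corresponds to the unintended clearing the patch is about, while a flag set with toy_set_wedged_on_fini() survives it.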
[Intel-gfx] [PATCH] drm/i915: Flush buffer pools on driver remove
In preparation for clean driver release, attempts to drain work queues and release freed objects are taken at driver remove time. However, GT buffer pools are now not flushed before the driver release phase. Since unused objects may stay there for up to one second, some may survive until driver release is attempted. That can potentially explain sporadic then hardly reproducible issues observed at driver release time, like non-zero shrink counter or outstanding address space areas. Flush buffer pools on GT remove as a fix. On driver release, don't push the pools again, just assert that the flush was called and nothing added more in between. Signed-off-by: Janusz Krzysztofik Cc: Chris Wilson --- drivers/gpu/drm/i915/gt/intel_gt.c | 2 ++ drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c | 2 -- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c index 62d40c986642..8f322a4ecd87 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt.c +++ b/drivers/gpu/drm/i915/gt/intel_gt.c @@ -737,6 +737,8 @@ void intel_gt_driver_remove(struct intel_gt *gt) intel_uc_driver_remove(>->uc); intel_engines_release(gt); + + intel_gt_flush_buffer_pool(gt); } void intel_gt_driver_unregister(struct intel_gt *gt) diff --git a/drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c b/drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c index aa0a59c5b614..acc49c56a9f3 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c +++ b/drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c @@ -245,8 +245,6 @@ void intel_gt_fini_buffer_pool(struct intel_gt *gt) struct intel_gt_buffer_pool *pool = >->buffer_pool; int n; - intel_gt_flush_buffer_pool(gt); - for (n = 0; n < ARRAY_SIZE(pool->cache_list); n++) GEM_BUG_ON(!list_empty(&pool->cache_list[n])); } -- 2.25.1
[Intel-gfx] [PATCH RESEND] drm/i915: Flush buffer pools on driver remove
In preparation for clean driver release, attempts to drain work queues and release freed objects are taken at driver remove time. However, GT buffer pools are now not flushed before the driver release phase. Since unused objects may stay there for up to one second, some may survive until driver release is attempted. That can potentially explain sporadic then hardly reproducible issues observed at driver release time, like non-zero shrink counter or outstanding address space areas. Flush buffer pools on GT remove as a fix. On driver release, don't flush the pools again, just assert that the flush was called and nothing added more in between. Signed-off-by: Janusz Krzysztofik Cc: Chris Wilson --- Resending with Cc: dri-de...@lists.freedesktop.org as requested, and a typo in commit description fixed. Thanks, Janusz drivers/gpu/drm/i915/gt/intel_gt.c | 2 ++ drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c | 2 -- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c index 62d40c986642..8f322a4ecd87 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt.c +++ b/drivers/gpu/drm/i915/gt/intel_gt.c @@ -737,6 +737,8 @@ void intel_gt_driver_remove(struct intel_gt *gt) intel_uc_driver_remove(>->uc); intel_engines_release(gt); + + intel_gt_flush_buffer_pool(gt); } void intel_gt_driver_unregister(struct intel_gt *gt) diff --git a/drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c b/drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c index aa0a59c5b614..acc49c56a9f3 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c +++ b/drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c @@ -245,8 +245,6 @@ void intel_gt_fini_buffer_pool(struct intel_gt *gt) struct intel_gt_buffer_pool *pool = >->buffer_pool; int n; - intel_gt_flush_buffer_pool(gt); - for (n = 0; n < ARRAY_SIZE(pool->cache_list); n++) GEM_BUG_ON(!list_empty(&pool->cache_list[n])); } -- 2.25.1
[Intel-gfx] [PATCH RESEND] drm/i915: Mark GPU wedging on driver unregister unrecoverable
GPU wedged flag now set on driver unregister to prevent from further using the GPU can be then cleared unintentionally when calling __intel_gt_unset_wedged() still before the flag is finally marked unrecoverable. We need to have it marked unrecoverable earlier. Implement that by replacing a call to intel_gt_set_wedged() in intel_gt_driver_unregister() with intel_gt_set_wedged_on_fini(). With the above in place, intel_gt_set_wedged_on_fini() is now called twice on driver remove, second time from __intel_gt_disable(). This seems harmless, while dropping intel_gt_set_wedged_on_fini() from __intel_gt_disable() proved to break some driver probe error unwind paths as well as mock selftest exit path. Signed-off-by: Janusz Krzysztofik Cc: Michał Winiarski --- Resending with Cc: dri-de...@lists.freedesktop.org as requested. Thanks, Janusz drivers/gpu/drm/i915/gt/intel_gt.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c index 62d40c986642..173b53cb2b47 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt.c +++ b/drivers/gpu/drm/i915/gt/intel_gt.c @@ -750,7 +750,7 @@ void intel_gt_driver_unregister(struct intel_gt *gt) * all in-flight requests so that we can quickly unbind the active * resources. */ - intel_gt_set_wedged(gt); + intel_gt_set_wedged_on_fini(gt); /* Scrub all HW state upon release */ with_intel_runtime_pm(gt->uncore->rpm, wakeref) -- 2.25.1
Re: [Intel-gfx] ✗ Fi.CI.IGT: failure for drm/i915: Mark GPU wedging on driver unregister unrecoverable (rev2)
On piątek, 3 września 2021 21:07:00 CEST Patchwork wrote: > == Series Details == > > Series: drm/i915: Mark GPU wedging on driver unregister unrecoverable (rev2) > URL : https://patchwork.freedesktop.org/series/94247/ > State : failure > > == Summary == > > CI Bug Log - changes from CI_DRM_10550_full -> Patchwork_20953_full > > > Summary > --- > > **FAILURE** > > Serious unknown changes coming with Patchwork_20953_full absolutely need to > be > verified manually. > > If you think the reported changes have nothing to do with the changes > introduced in Patchwork_20953_full, please notify your bug team to allow > them > to document this new failure mode, which will reduce false positives in CI. > > > > Possible new issues > --- > > Here are the unknown changes that may have been introduced in > Patchwork_20953_full: > > ### IGT changes ### > > Possible regressions > > * igt@kms_flip_scaled_crc@flip-32bpp-ytile-to-64bpp-ytile: > - shard-iclb: [PASS][1] -> [SKIP][2] >[1]: > https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10550/shard-iclb4/igt@kms_flip_scaled_...@flip-32bpp-ytile-to-64bpp-ytile.html >[2]: > https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20953/shard-iclb2/igt@kms_flip_scaled_...@flip-32bpp-ytile-to-64bpp-ytile.html stdout: No valid pipe/connector/format/mod combination found That doesn't sound like a driver unregister related issue to me. 
Thanks, Janusz > > > Known issues > > > Here are the changes found in Patchwork_20953_full that come from known > issues: > > ### IGT changes ### > > Issues hit > > * igt@feature_discovery@display-2x: > - shard-iclb: NOTRUN -> [SKIP][3] ([i915#1839]) >[3]: > https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20953/shard-iclb8/igt@feature_discov...@display-2x.html > > * igt@gem_create@create-massive: > - shard-snb: NOTRUN -> [DMESG-WARN][4] ([i915#3002]) >[4]: > https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20953/shard-snb6/igt@gem_cre...@create-massive.html > > * igt@gem_ctx_persistence@legacy-engines-hostile: > - shard-snb: NOTRUN -> [SKIP][5] ([fdo#109271] / [i915#1099]) +2 > similar issues >[5]: > https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20953/shard-snb6/igt@gem_ctx_persiste...@legacy-engines-hostile.html > > * igt@gem_eio@in-flight-contexts-10ms: > - shard-tglb: [PASS][6] -> [TIMEOUT][7] ([i915#3063]) >[6]: > https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10550/shard-tglb3/igt@gem_...@in-flight-contexts-10ms.html >[7]: > https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20953/shard-tglb6/igt@gem_...@in-flight-contexts-10ms.html > > * igt@gem_eio@unwedge-stress: > - shard-tglb: [PASS][8] -> [TIMEOUT][9] ([i915#2369] / > [i915#3063] / [i915#3648]) >[8]: > https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10550/shard-tglb8/igt@gem_...@unwedge-stress.html >[9]: > https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20953/shard-tglb7/igt@gem_...@unwedge-stress.html > > * igt@gem_exec_fair@basic-flow@rcs0: > - shard-tglb: [PASS][10] -> [FAIL][11] ([i915#2842]) >[10]: > https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10550/shard-tglb1/igt@gem_exec_fair@basic-f...@rcs0.html >[11]: > https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20953/shard-tglb1/igt@gem_exec_fair@basic-f...@rcs0.html > > * igt@gem_exec_fair@basic-throttle@rcs0: > - shard-iclb: [PASS][12] -> [FAIL][13] ([i915#2849]) >[12]: > 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10550/shard-iclb8/igt@gem_exec_fair@basic-throt...@rcs0.html >[13]: > https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20953/shard-iclb2/igt@gem_exec_fair@basic-throt...@rcs0.html > > * igt@gem_exec_params@secure-non-master: > - shard-iclb: NOTRUN -> [SKIP][14] ([fdo#112283]) >[14]: > https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20953/shard-iclb8/igt@gem_exec_par...@secure-non-master.html > > * igt@gem_pread@exhaustion: > - shard-apl: NOTRUN -> [WARN][15] ([i915#2658]) >[15]: > https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20953/shard-apl6/igt@gem_pr...@exhaustion.html > - shard-skl: NOTRUN -> [WARN][16] ([i915#2658]) >[16]: > https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20953/shard-skl3/igt@gem_pr...@exhaustion.html > > * igt@gem_pwrite@basic-exhaustion: > - shard-kbl: NOTRUN -> [WARN][17] ([i915#2658]) +1 similar issue >[17]: > https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20953/shard-kbl3/igt@gem_pwr...@basic-exhaustion.html > > * igt@gem_render_copy@yf-tiled-to-vebox-x-tiled: > - shard-iclb: NOTRUN -> [SKIP][18] ([i915#768]) >[18]: > https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20953/shard-iclb8/igt@gem_render_c...@yf-tiled-to-vebox-x-tiled
[Intel-gfx] [RFC PATCH i-g-t 1/6] tests/core_hotunplug: Add 'GEM context' variants
Verify if an additional context associated with an open device file descriptor is cleaned up correctly on device hotunbind / hotunplug. Signed-off-by: Janusz Krzysztofik --- tests/core_hotunplug.c | 79 ++ 1 file changed, 79 insertions(+) diff --git a/tests/core_hotunplug.c b/tests/core_hotunplug.c index 56a88fefd..4f6c4f625 100644 --- a/tests/core_hotunplug.c +++ b/tests/core_hotunplug.c @@ -31,6 +31,7 @@ #include #include "i915/gem.h" +#include "i915/gem_context.h" #include "igt.h" #include "igt_device_scan.h" #include "igt_kmod.h" @@ -545,6 +546,60 @@ static void hotreplug_lateclose(struct hotunplug *priv) igt_assert_f(healthcheck(priv, false), "%s\n", priv->failure); } +static void ctx_hotunbind_lateclose(struct hotunplug *priv) +{ + uint32_t ctx; + + igt_require(priv->fd.drm = -1); + priv->fd.drm = local_drm_open_driver(false, "pre-", " for prerequisites check"); + + igt_require_intel(priv->fd.drm); + gem_require_contexts(priv->fd.drm); + priv->fd.drm = close_device(priv->fd.drm, "", "pre-checked "); + + pre_check(priv); + + priv->fd.drm = local_drm_open_driver(false, "", " for hot unbind"); + + local_debug("%s\n", "creating additional GEM user context"); + ctx = gem_context_create(priv->fd.drm); + + driver_unbind(priv, "hot ", 0); + + local_debug("%s\n", "trying to late destroy the context"); + igt_assert_eq(__gem_context_destroy(priv->fd.drm, ctx), -ENODEV); + + priv->fd.drm = close_device(priv->fd.drm, "late ", "unbound "); + igt_assert_eq(priv->fd.drm, -1); +} + +static void ctx_hotunplug_lateclose(struct hotunplug *priv) +{ + uint32_t ctx; + + igt_require(priv->fd.drm = -1); + priv->fd.drm = local_drm_open_driver(false, "pre-", " for prerequisites check"); + + igt_require_intel(priv->fd.drm); + gem_require_contexts(priv->fd.drm); + priv->fd.drm = close_device(priv->fd.drm, "", "pre-checked "); + + pre_check(priv); + + priv->fd.drm = local_drm_open_driver(false, "", " for hot unplug"); + + local_debug("%s\n", "creating additional GEM user context"); + ctx = 
gem_context_create(priv->fd.drm); + + device_unplug(priv, "hot ", 0); + + local_debug("%s\n", "trying to late destroy the context"); + igt_assert_eq(__gem_context_destroy(priv->fd.drm, ctx), -ENODEV); + + priv->fd.drm = close_device(priv->fd.drm, "late ", "removed "); + igt_assert_eq(priv->fd.drm, -1); +} + /* Main */ igt_main @@ -682,6 +737,30 @@ igt_main recover(&priv); } + igt_fixture + post_healthcheck(&priv); + + igt_subtest_group { + igt_describe("Check if the driver can be cleanly unbound for a still open device with extra GEM context, then released"); + igt_subtest("ctx-hotunbind-lateclose") + ctx_hotunbind_lateclose(&priv); + + igt_fixture + recover(&priv); + } + + igt_fixture + post_healthcheck(&priv); + + igt_subtest_group { + igt_describe("Check if a still open device with extra GEM context can be cleanly unplugged, then released"); + igt_subtest("ctx-hotunplug-lateclose") + ctx_hotunplug_lateclose(&priv); + + igt_fixture + recover(&priv); + } + igt_fixture { post_healthcheck(&priv); -- 2.25.1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [RFC PATCH i-g-t 2/6] tests/core_hotunplug: Add 'GEM address space' variants
Verify if an additional address space associated with an open device file descriptor is cleaned up correctly on device hotunbind / hotunplug. Signed-off-by: Janusz Krzysztofik --- tests/core_hotunplug.c | 79 ++ 1 file changed, 79 insertions(+) diff --git a/tests/core_hotunplug.c b/tests/core_hotunplug.c index 4f6c4f625..decfcdfda 100644 --- a/tests/core_hotunplug.c +++ b/tests/core_hotunplug.c @@ -32,6 +32,7 @@ #include "i915/gem.h" #include "i915/gem_context.h" +#include "i915/gem_vm.h" #include "igt.h" #include "igt_device_scan.h" #include "igt_kmod.h" @@ -600,6 +601,60 @@ static void ctx_hotunplug_lateclose(struct hotunplug *priv) igt_assert_eq(priv->fd.drm, -1); } +static void vm_hotunbind_lateclose(struct hotunplug *priv) +{ + int vm; + + igt_require(priv->fd.drm = -1); + priv->fd.drm = local_drm_open_driver(false, "pre-", " for prerequisites check"); + + igt_require_intel(priv->fd.drm); + gem_require_vm(priv->fd.drm); + priv->fd.drm = close_device(priv->fd.drm, "", "pre-checked "); + + pre_check(priv); + + priv->fd.drm = local_drm_open_driver(false, "", " for hot unbind"); + + local_debug("%s\n", "creating additional GEM user address space"); + vm = gem_vm_create(priv->fd.drm); + + driver_unbind(priv, "hot ", 0); + + local_debug("%s\n", "trying to late remove the address space"); + igt_assert_eq(__gem_vm_destroy(priv->fd.drm, vm), -ENODEV); + + priv->fd.drm = close_device(priv->fd.drm, "late ", "removed "); + igt_assert_eq(priv->fd.drm, -1); +} + +static void vm_hotunplug_lateclose(struct hotunplug *priv) +{ + int vm; + + igt_require(priv->fd.drm = -1); + priv->fd.drm = local_drm_open_driver(false, "pre-", " for prerequisites check"); + + igt_require_intel(priv->fd.drm); + gem_require_vm(priv->fd.drm); + priv->fd.drm = close_device(priv->fd.drm, "", "pre-checked "); + + pre_check(priv); + + priv->fd.drm = local_drm_open_driver(false, "", " for hot unplug"); + + local_debug("%s\n", "creating additional GEM user address space"); + vm = 
gem_vm_create(priv->fd.drm); + + device_unplug(priv, "hot ", 0); + + local_debug("%s\n", "trying to late remove the address space"); + igt_assert_eq(__gem_vm_destroy(priv->fd.drm, vm), -ENODEV); + + priv->fd.drm = close_device(priv->fd.drm, "late ", "removed "); + igt_assert_eq(priv->fd.drm, -1); +} + /* Main */ igt_main @@ -761,6 +816,30 @@ igt_main recover(&priv); } + igt_fixture + post_healthcheck(&priv); + + igt_subtest_group { + igt_describe("Check if the driver can be cleanly unbound from a still open device with extra GEM address space, then released"); + igt_subtest("vm-hotunbind-lateclose") + vm_hotunbind_lateclose(&priv); + + igt_fixture + recover(&priv); + } + + igt_fixture + post_healthcheck(&priv); + + igt_subtest_group { + igt_describe("Check if a still open device with extra GEM address space can be cleanly unplugged, then released"); + igt_subtest("vm-hotunplug-lateclose") + vm_hotunplug_lateclose(&priv); + + igt_fixture + recover(&priv); + } + igt_fixture { post_healthcheck(&priv); -- 2.25.1
[Intel-gfx] [RFC PATCH i-g-t 3/6] tests/core_hotunplug: Add 'GEM object' variants
GEM objects belonging to user file descriptors still open on device hotunbind / hotunplug may expose still more driver issues. Add subtests that implement these scenarios. Signed-off-by: Janusz Krzysztofik --- tests/core_hotunplug.c | 85 ++ 1 file changed, 85 insertions(+) diff --git a/tests/core_hotunplug.c b/tests/core_hotunplug.c index decfcdfda..7f61b4446 100644 --- a/tests/core_hotunplug.c +++ b/tests/core_hotunplug.c @@ -433,6 +433,13 @@ static void set_filter_from_device(int fd) igt_assert_eq(igt_device_filter_add(filter), 1); } +static int local_gem_close(int fd, uint32_t handle) +{ + struct drm_gem_close close_bo = { .handle = handle, }; + + return igt_ioctl(fd, DRM_IOCTL_GEM_CLOSE, &close_bo) ? -errno : 0; +} + /* Subtests */ static void unbind_rebind(struct hotunplug *priv) @@ -655,6 +662,60 @@ static void vm_hotunplug_lateclose(struct hotunplug *priv) igt_assert_eq(priv->fd.drm, -1); } +static void gem_hotunbind_lateclose(struct hotunplug *priv) +{ + uint32_t handle; + + igt_require(priv->fd.drm = -1); + priv->fd.drm = local_drm_open_driver(false, "pre-", " for prerequisites check"); + + igt_require_intel(priv->fd.drm); + igt_require_gem(priv->fd.drm); + priv->fd.drm = close_device(priv->fd.drm, "", "pre-checked "); + + pre_check(priv); + + priv->fd.drm = local_drm_open_driver(false, "", " for hot unbind"); + + local_debug("%s\n", "creating a GEM user object"); + handle = gem_create(priv->fd.drm, 4096); + + driver_unbind(priv, "hot ", 0); + + local_debug("%s\n", "trying to late remove the object"); + igt_assert_eq(local_gem_close(priv->fd.drm, handle), -ENODEV); + + priv->fd.drm = close_device(priv->fd.drm, "late ", "unbound "); + igt_assert_eq(priv->fd.drm, -1); +} + +static void gem_hotunplug_lateclose(struct hotunplug *priv) +{ + uint32_t handle; + + igt_require(priv->fd.drm = -1); + priv->fd.drm = local_drm_open_driver(false, "pre-", " for prerequisites check"); + + igt_require_intel(priv->fd.drm); + igt_require_gem(priv->fd.drm); + priv->fd.drm = 
close_device(priv->fd.drm, "", "pre-checked "); + + pre_check(priv); + + priv->fd.drm = local_drm_open_driver(false, "", " for hot unplug"); + + local_debug("%s\n", "creating a GEM user object"); + handle = gem_create(priv->fd.drm, 4096); + + device_unplug(priv, "hot", 0); + + local_debug("%s\n", "trying to late remove the object"); + igt_assert_eq(local_gem_close(priv->fd.drm, handle), -ENODEV); + + priv->fd.drm = close_device(priv->fd.drm, "late ", "removed "); + igt_assert_eq(priv->fd.drm, -1); +} + /* Main */ igt_main @@ -840,6 +901,30 @@ igt_main recover(&priv); } + igt_fixture + post_healthcheck(&priv); + + igt_subtest_group { + igt_describe("Check if the driver can be cleanly unbound from a device with a still open GEM object, then released"); + igt_subtest("gem-hotunbind-lateclose") + gem_hotunbind_lateclose(&priv); + + igt_fixture + recover(&priv); + } + + igt_fixture + post_healthcheck(&priv); + + igt_subtest_group { + igt_describe("Check if a device with a still open GEM object can be cleanly unplugged, then released"); + igt_subtest("gem-hotunplug-lateclose") + gem_hotunplug_lateclose(&priv); + + igt_fixture + recover(&priv); + } + igt_fixture { post_healthcheck(&priv); -- 2.25.1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
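The `local_gem_close()` helper in the patch above folds an ioctl result into "0 on success, negative errno on failure", which is what lets the subtests compare the return value against `-ENODEV` directly. A minimal sketch of the same convention follows, using plain `close()` instead of a DRM ioctl so it runs without a GPU; `model_close()` is a hypothetical name introduced only for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/*
 * Illustrative stand-in for the patch's local_gem_close(): call a
 * libc function that reports failure through errno and return
 * 0 on success or the negated errno value on failure.
 */
static int model_close(int fd)
{
	return close(fd) ? -errno : 0;
}
```

With this shape, closing an invalid descriptor yields `-EBADF`, in the same way the subtests expect `-ENODEV` from a GEM close issued after the driver has been unbound.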
[Intel-gfx] [RFC PATCH i-g-t 5/6] tests/core_hotunplug: Add 'PRIME handle' variants
Even if all device file descriptors are closed on device hotunbind / hotunplug, PRIME exported objects may still exist, referenced by still open dma-buf file descriptors. Add subtests that keep such a descriptor open on device hotunbind / hotunplug. Signed-off-by: Janusz Krzysztofik --- tests/core_hotunplug.c | 104 + 1 file changed, 104 insertions(+) diff --git a/tests/core_hotunplug.c b/tests/core_hotunplug.c index 6f3b3b3d3..0cb1267ae 100644 --- a/tests/core_hotunplug.c +++ b/tests/core_hotunplug.c @@ -782,6 +782,86 @@ static void userptr_hotunplug_lateclose(struct hotunplug *priv) igt_fail_on_f(munmap(ptr, 4096), "Userptr unmap failure!"); } +static void prime_hotunbind_lateclose(struct hotunplug *priv) +{ + uint32_t handle; + int dmabuf, ret; + + igt_require(priv->fd.drm = -1); + priv->fd.drm = local_drm_open_driver(false, "pre-", " for prerequisites check"); + + igt_require_intel(priv->fd.drm); + igt_require_gem(priv->fd.drm); + priv->fd.drm = close_device(priv->fd.drm, "", "pre-checked "); + + pre_check(priv); + + priv->fd.drm = local_drm_open_driver(false, "", " for hot unbind"); + + local_debug("%s\n", "creating and PRIME-exporting a GEM object"); + handle = gem_create(priv->fd.drm, 4096); + dmabuf = prime_handle_to_fd(priv->fd.drm, handle); + + ret = local_gem_close(priv->fd.drm, handle); + priv->fd.drm = close_device(priv->fd.drm, "", "exported "); + + if (priv->fd.drm != -1) { + igt_ignore_warn(close(dmabuf)); + igt_assert_eq(priv->fd.drm, -1); + } + + /* once device close succeeds, take care of open dmabuf as if it were a device fd */ + priv->fd.drm = dmabuf; + igt_assert_f(!ret, "gem_close failed with errno %d\n", ret); + + driver_unbind(priv, "hot ", 0); + + igt_debug("late closing the PRIME file descriptor\n"); + dmabuf = local_close(dmabuf, "PRIME file descriptor late close failure"); + priv->fd.drm = dmabuf; + igt_assert_eq(dmabuf, -1); +} + +static void prime_hotunplug_lateclose(struct hotunplug *priv) +{ + uint32_t handle; + int dmabuf, ret; + + 
igt_require(priv->fd.drm = -1); + priv->fd.drm = local_drm_open_driver(false, "pre-", " for prerequisites check"); + + igt_require_intel(priv->fd.drm); + igt_require_gem(priv->fd.drm); + priv->fd.drm = close_device(priv->fd.drm, "", "pre-checked "); + + pre_check(priv); + + priv->fd.drm = local_drm_open_driver(false, "", " for hot unplug"); + + local_debug("%s\n", "creating and PRIME-exporting a GEM object"); + handle = gem_create(priv->fd.drm, 4096); + dmabuf = prime_handle_to_fd(priv->fd.drm, handle); + + ret = local_gem_close(priv->fd.drm, handle); + priv->fd.drm = close_device(priv->fd.drm, "", "exported "); + + if (priv->fd.drm != -1) { + igt_ignore_warn(close(dmabuf)); + igt_assert_eq(priv->fd.drm, -1); + } + + /* once device close succeeds, take care of open dmabuf like if it was a device fd */ + priv->fd.drm = dmabuf; + igt_assert_f(!ret, "gem_close failed with errno %d\n", ret); + + device_unplug(priv, "hot ", 0); + + igt_debug("late closing the PRIME file descriptor\n"); + dmabuf = local_close(dmabuf, "PRIME file descriptor late close failure"); + priv->fd.drm = dmabuf; + igt_assert_eq(dmabuf, -1); +} + /* Main */ igt_main @@ -1015,6 +1095,30 @@ igt_main recover(&priv); } + igt_fixture + post_healthcheck(&priv); + + igt_subtest_group { + igt_describe("Check if the driver can be cleanly unbound from a device with a still open PRIME-exported object, then released"); + igt_subtest("prime-hotunbind-lateclose") + prime_hotunbind_lateclose(&priv); + + igt_fixture + recover(&priv); + } + + igt_fixture + post_healthcheck(&priv); + + igt_subtest_group { + igt_describe("Check if a device with a still open PRIME-exported object can be cleanly unplugged, then released"); + igt_subtest("prime-hotunplug-lateclose") + prime_hotunplug_lateclose(&priv); + + igt_fixture + recover(&priv); + } + igt_fixture { post_healthcheck(&priv); -- 2.25.1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [RFC PATCH i-g-t 4/6] tests/core_hotunplug: Add 'userptr GEM object' variants
Verify if userptr GEM objects are cleaned up as well as regular GEM objects on device hotunbind / hotunplug. Signed-off-by: Janusz Krzysztofik --- tests/core_hotunplug.c | 90 ++ 1 file changed, 90 insertions(+) diff --git a/tests/core_hotunplug.c b/tests/core_hotunplug.c index 7f61b4446..6f3b3b3d3 100644 --- a/tests/core_hotunplug.c +++ b/tests/core_hotunplug.c @@ -716,6 +716,72 @@ static void gem_hotunplug_lateclose(struct hotunplug *priv) igt_assert_eq(priv->fd.drm, -1); } +static void userptr_hotunbind_lateclose(struct hotunplug *priv) +{ + uint32_t handle; + void *ptr; + + igt_require(priv->fd.drm = -1); + priv->fd.drm = local_drm_open_driver(false, "pre-", " for prerequisites check"); + + igt_require_intel(priv->fd.drm); + igt_require_gem(priv->fd.drm); + igt_assert_eq(posix_memalign(&ptr, 4096, 4096), 0); + igt_require(!__gem_userptr(priv->fd.drm, ptr, 4096, 0, 0, &handle)); + gem_close(priv->fd.drm, handle); + priv->fd.drm = close_device(priv->fd.drm, "", "pre-checked "); + + pre_check(priv); + + priv->fd.drm = local_drm_open_driver(false, "", " for hot unbind"); + + local_debug("%s\n", "creating a userptr GEM object"); + gem_userptr(priv->fd.drm, ptr, 4096, 0, 0, &handle); + + driver_unbind(priv, "hot ", 0); + + local_debug("%s\n", "trying to late remove the object"); + igt_assert_eq(local_gem_close(priv->fd.drm, handle), -ENODEV); + + priv->fd.drm = close_device(priv->fd.drm, "late ", "unbound "); + igt_assert_eq(priv->fd.drm, -1); + + igt_fail_on_f(munmap(ptr, 4096), "Userptr unmap failure!"); +} + +static void userptr_hotunplug_lateclose(struct hotunplug *priv) +{ + uint32_t handle; + void *ptr; + + igt_require(priv->fd.drm = -1); + priv->fd.drm = local_drm_open_driver(false, "pre-", " for prerequisites check"); + + igt_require_intel(priv->fd.drm); + igt_require_gem(priv->fd.drm); + igt_assert_eq(posix_memalign(&ptr, 4096, 4096), 0); + igt_require(!__gem_userptr(priv->fd.drm, ptr, 4096, 0, 0, &handle)); + gem_close(priv->fd.drm, handle); + 
priv->fd.drm = close_device(priv->fd.drm, "", "pre-checked "); + + pre_check(priv); + + priv->fd.drm = local_drm_open_driver(false, "", " for hot unplug"); + + local_debug("%s\n", "creating a userptr GEM object"); + gem_userptr(priv->fd.drm, ptr, 4096, 0, 0, &handle); + + device_unplug(priv, "hot ", 0); + + local_debug("%s\n", "trying to late remove the object"); + igt_assert_eq(local_gem_close(priv->fd.drm, handle), -ENODEV); + + priv->fd.drm = close_device(priv->fd.drm, "late ", "removed "); + igt_assert_eq(priv->fd.drm, -1); + + igt_fail_on_f(munmap(ptr, 4096), "Userptr unmap failure!"); +} + /* Main */ igt_main @@ -925,6 +991,30 @@ igt_main recover(&priv); } + igt_fixture + post_healthcheck(&priv); + + igt_subtest_group { + igt_describe("Check if the driver can be cleanly unbound from a device with a still open userptr GEM object, then released"); + igt_subtest("userptr-hotunbind-lateclose") + userptr_hotunbind_lateclose(&priv); + + igt_fixture + recover(&priv); + } + + igt_fixture + post_healthcheck(&priv); + + igt_subtest_group { + igt_describe("Check if a device with a still open userptr GEM object can be cleanly unplugged, then released"); + igt_subtest("userptr-hotunplug-lateclose") + userptr_hotunplug_lateclose(&priv); + + igt_fixture + recover(&priv); + } + igt_fixture { post_healthcheck(&priv); -- 2.25.1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [RFC PATCH i-g-t 6/6] tests/core_hotunplug: Add 'GEM spin' variants
Verify if a device with a GEM spin batch job still running on a GPU can be hot-unbound/unplugged cleanly and released. Signed-off-by: Janusz Krzysztofik --- tests/core_hotunplug.c | 124 + 1 file changed, 124 insertions(+) diff --git a/tests/core_hotunplug.c b/tests/core_hotunplug.c index 0cb1267ae..f93545402 100644 --- a/tests/core_hotunplug.c +++ b/tests/core_hotunplug.c @@ -35,6 +35,7 @@ #include "i915/gem_vm.h" #include "igt.h" #include "igt_device_scan.h" +#include "igt_dummyload.h" #include "igt_kmod.h" #include "igt_sysfs.h" #include "sw_sync.h" @@ -440,6 +441,37 @@ static int local_gem_close(int fd, uint32_t handle) return igt_ioctl(fd, DRM_IOCTL_GEM_CLOSE, &close_bo) ? -errno : 0; } +static int local_bo_busy(int fd, uint32_t handle) +{ + struct drm_i915_gem_busy busy = { .handle = handle, }; + + return igt_ioctl(fd, DRM_IOCTL_I915_GEM_BUSY, &busy) ? -errno : 0; +} + +static void local_spin_free(struct hotunplug *priv, igt_spin_t *spin) +{ + igt_spin_end(spin); + + spin->poll_handle = 0; + spin->handle = 0; + + if (spin->poll) { + void *ptr = spin->poll; + + spin->poll = NULL; + igt_assert(!gem_munmap(ptr, 4096)); + } + + if (spin->batch) { + void *ptr = spin->batch; + + spin->batch = NULL; + igt_assert(!gem_munmap(ptr, 4096)); + } + + igt_spin_free(priv->fd.drm, spin); +} + /* Subtests */ static void unbind_rebind(struct hotunplug *priv) @@ -862,6 +894,74 @@ static void prime_hotunplug_lateclose(struct hotunplug *priv) igt_assert_eq(dmabuf, -1); } +static void spin_hotunbind_lateclose(struct hotunplug *priv) +{ + igt_spin_t *spin; + + igt_require(priv->fd.drm = -1); + priv->fd.drm = local_drm_open_driver(false, "pre-", " for prerequisites check"); + + igt_require_intel(priv->fd.drm); + igt_require_gem(priv->fd.drm); + priv->fd.drm = close_device(priv->fd.drm, "", "pre-checked "); + + pre_check(priv); + + priv->fd.drm = local_drm_open_driver(false, "", " for hot unbind"); + + local_debug("%s\n", "running dummy load"); + spin = igt_spin_new(priv->fd.drm, 
.flags = IGT_SPIN_POLL_RUN); + igt_spin_busywait_until_started(spin); + + driver_unbind(priv, "hot ", 0); + + local_debug("%s\n", "trying to late query the dummy load related GEM object status"); + igt_assert_eq(local_bo_busy(priv->fd.drm, spin->handle), -ENODEV); + local_debug("%s\n", "trying to late close the dummy load related GEM objects"); + igt_assert_eq(local_gem_close(priv->fd.drm, spin->poll_handle), -ENODEV); + igt_assert_eq(local_gem_close(priv->fd.drm, spin->handle), -ENODEV); + + priv->fd.drm = close_device(priv->fd.drm, "late ", "unbound "); + igt_assert_eq(priv->fd.drm, -1); + + local_debug("%s\n", "trying to late free the dummy load"); + local_spin_free(priv, spin); +} + +static void spin_hotunplug_lateclose(struct hotunplug *priv) +{ + igt_spin_t *spin; + + igt_require(priv->fd.drm = -1); + priv->fd.drm = local_drm_open_driver(false, "pre-", " for prerequisites check"); + + igt_require_intel(priv->fd.drm); + igt_require_gem(priv->fd.drm); + priv->fd.drm = close_device(priv->fd.drm, "", "pre-checked "); + + pre_check(priv); + + priv->fd.drm = local_drm_open_driver(false, "", " for hot unplug"); + + local_debug("%s\n", "running dummy load"); + spin = igt_spin_new(priv->fd.drm, .flags = IGT_SPIN_POLL_RUN); + igt_spin_busywait_until_started(spin); + + device_unplug(priv, "hot ", 0); + + local_debug("%s\n", "trying to late query the dummy load related GEM object status"); + igt_assert_eq(local_bo_busy(priv->fd.drm, spin->handle), -ENODEV); + local_debug("%s\n", "trying to late close the dummy load related GEM objects"); + igt_assert_eq(local_gem_close(priv->fd.drm, spin->poll_handle), -ENODEV); + igt_assert_eq(local_gem_close(priv->fd.drm, spin->handle), -ENODEV); + + priv->fd.drm = close_device(priv->fd.drm, "late ", "removed "); + igt_assert_eq(priv->fd.drm, -1); + + local_debug("%s\n", "trying to late free the dummy load"); + local_spin_free(priv, spin); +} + /* Main */ igt_main @@ -1119,6 +1219,30
[Intel-gfx] [RFC PATCH 1/2] tests/core_hotunplug: Add perf health check
Sometimes CI reports skips of perf subtests when run subsequently after core_hotunplug. That may be an indication of issues with restoring device perf features on driver (hot)rebind. Detect device perf support at test start and check if still available after driver rebind. If that fails, a post-subtest device recovery step restores the device perf support so no subsequently executed tests are affected. Signed-off-by: Janusz Krzysztofik --- tests/core_hotunplug.c | 22 ++ tests/meson.build | 8 +++- 2 files changed, 29 insertions(+), 1 deletion(-) diff --git a/tests/core_hotunplug.c b/tests/core_hotunplug.c index 56a88fefd..06f15d845 100644 --- a/tests/core_hotunplug.c +++ b/tests/core_hotunplug.c @@ -31,6 +31,7 @@ #include #include "i915/gem.h" +#include "i915/perf.h" #include "igt.h" #include "igt_device_scan.h" #include "igt_kmod.h" @@ -50,6 +51,7 @@ struct hotunplug { const char *dev_bus_addr; const char *failure; bool need_healthcheck; + bool has_intel_perf; }; /* Helpers */ @@ -319,6 +321,16 @@ static int local_i915_recover(int i915) return local_i915_healthcheck(i915, "post-"); } +static bool local_i915_perf_healthcheck(int i915) +{ + struct intel_perf *intel_perf; + + intel_perf = intel_perf_for_fd(i915); + if (intel_perf) + intel_perf_free(intel_perf); + return intel_perf; +} + #define FLAG_RENDER(1 << 0) #define FLAG_RECOVER (1 << 1) static void node_healthcheck(struct hotunplug *priv, unsigned flags) @@ -360,6 +372,13 @@ static void node_healthcheck(struct hotunplug *priv, unsigned flags) } } + if (!priv->failure && priv->has_intel_perf) { + local_debug("%s\n", "running i915 device perf healthcheck"); + priv->failure = "Device perf healthckeck failure!"; + if (local_i915_perf_healthcheck(fd_drm)) + priv->failure = NULL; + } + fd_drm = close_device(fd_drm, "", "health checked "); if (closed || fd_drm < -1) /* update status for post_healthcheck */ priv->fd.drm_hc = fd_drm; @@ -553,6 +572,7 @@ igt_main .fd = { .drm = -1, .drm_hc = -1, .sysfs_dev = -1, }, 
.failure= NULL, .need_healthcheck = true, + .has_intel_perf = false, }; igt_fixture { @@ -567,6 +587,8 @@ igt_main gem_quiescent_gpu(fd_drm); igt_require_gem(fd_drm); + priv.has_intel_perf = local_i915_perf_healthcheck(fd_drm); + /** * FIXME: Unbinding the i915 driver on some Haswell * platforms with Azalia audio results in a kernel WARN diff --git a/tests/meson.build b/tests/meson.build index 3e3db7d5b..3f6dc4fe3 100644 --- a/tests/meson.build +++ b/tests/meson.build @@ -3,7 +3,6 @@ test_progs = [ 'core_getclient', 'core_getstats', 'core_getversion', - 'core_hotunplug', 'core_setmaster', 'core_setmaster_vs_auth', 'debugfs_test', @@ -361,6 +360,13 @@ test_executables += executable('perf', install : true) test_list += 'perf' +test_executables += executable('core_hotunplug', 'core_hotunplug.c', + dependencies : test_deps + [ lib_igt_i915_perf ], + install_dir : libexecdir, + install_rpath : libexecdir_rpathdir, + install : true) +test_list += 'core_hotunplug' + executable('testdisplay', ['testdisplay.c', 'testdisplay_hotplug.c'], dependencies : test_deps, install_dir : libexecdir, -- 2.25.1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
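The control flow added to `node_healthcheck()` by the patch above (probe perf only if it was present at test start and no earlier failure was recorded, pre-set the failure message, clear it on success) can be sketched without any i915 dependency. Everything named `model_*` below is hypothetical; the stub probes merely stand in for the patch's `local_i915_perf_healthcheck()`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-in for the test's struct hotunplug. */
struct model_hotunplug {
	const char *failure;	/* NULL while everything is healthy */
	bool has_perf;		/* perf support detected at test start */
};

/* Stub probes standing in for a real "is perf still available?" query. */
static bool model_probe_ok(void)   { return true; }
static bool model_probe_lost(void) { return false; }

/*
 * Run the perf check only when no earlier failure was recorded and
 * perf was available before the rebind; leave a failure message in
 * place if the probe reports that perf is gone.
 */
static void model_perf_healthcheck(struct model_hotunplug *priv,
				   bool (*probe)(void))
{
	if (priv->failure || !priv->has_perf)
		return;

	priv->failure = "Device perf healthcheck failure!";
	if (probe())
		priv->failure = NULL;
}
```

A device that never had perf support is not probed at all, so it is never flagged as a regression; only a device that had perf support before the rebind and lost it afterwards ends up with a failure recorded.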
[Intel-gfx] [RFC PATCH 2/2] intel-ci: Unblock core_hotunplug@*hot*bind* subtests
Commit be529747d8ea ("intel-ci: Broaden core_hotunplug blacklist") blamed issues triggered by hot variants[*] for random failures in subsequently executed tests. According to the issue history[*], the last reported occurrences were not related to core_hotunplug. Remove the *hot*bind* subtests from the CI blocklist. [*] https://gitlab.freedesktop.org/drm/intel/-/issues/2644. Signed-off-by: Janusz Krzysztofik --- tests/intel-ci/blacklist.txt | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/tests/intel-ci/blacklist.txt b/tests/intel-ci/blacklist.txt index 33f92e37f..595fd0ca6 100644 --- a/tests/intel-ci/blacklist.txt +++ b/tests/intel-ci/blacklist.txt @@ -112,10 +112,10 @@ igt@.*@.*pipe-f($|-.*) # Temporary workarounds for CI-impacting bugs ### -# Currently fails and leaves the machine in a very bad state, and -# causes coverage loss for other tests. IOMMU related. -# https://gitlab.freedesktop.org/drm/intel/-/issues/2644 -igt@core_hotunplug@.*(hot|plug).* +# *plug* subtests still fail and leave the +# machine in a very bad state, causing coverage +# loss for other tests. IOMMU related. +igt@core_hotunplug@.*plug.* # hangs several gens of hosts, and has no immediate fix igt@device_reset@reset-bound -- 2.25.1
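To see which subtests the blocklist change actually unblocks, the old and new patterns from the diff above can be tried against a few subtest names. The sketch below uses POSIX extended regular expressions via `regcomp()`; it assumes the CI runner's regex dialect agrees with POSIX ERE for these two simple patterns, and `model_blocked()` is a hypothetical helper, not part of IGT or the CI tooling.

```c
#include <assert.h>
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true if the POSIX extended regex matches somewhere in name. */
static bool model_blocked(const char *pattern, const char *name)
{
	regex_t re;
	bool hit;

	/* Treat an uncompilable pattern as "blocks nothing". */
	if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB))
		return false;

	hit = regexec(&re, name, 0, NULL, 0) == 0;
	regfree(&re);

	return hit;
}
```

Under that assumption, `igt@core_hotunplug@hotunbind-rebind` matches the old pattern `igt@core_hotunplug@.*(hot|plug).*` but not the new `igt@core_hotunplug@.*plug.*`, so it becomes runnable again, while names containing `plug` after the test name, such as `igt@core_hotunplug@hotunplug-rescan`, stay blocked.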
[Intel-gfx] [RFC PATCH 1/2] tests/core_hotunplug: Add perf health check
Sometimes CI reports skips of perf subtests when run subsequently after core_hotunplug. That may be an indication of issues with restoring device perf features on driver (hot)rebind. Detect device perf support at test start and check if still available after driver rebind. If that fails, a post-subtest device recovery step restores the device perf support so no subsequently executed tests are affected. Signed-off-by: Janusz Krzysztofik --- tests/core_hotunplug.c | 22 ++ tests/meson.build | 8 +++- 2 files changed, 29 insertions(+), 1 deletion(-) diff --git a/tests/core_hotunplug.c b/tests/core_hotunplug.c index 56a88fefd..06f15d845 100644 --- a/tests/core_hotunplug.c +++ b/tests/core_hotunplug.c @@ -31,6 +31,7 @@ #include #include "i915/gem.h" +#include "i915/perf.h" #include "igt.h" #include "igt_device_scan.h" #include "igt_kmod.h" @@ -50,6 +51,7 @@ struct hotunplug { const char *dev_bus_addr; const char *failure; bool need_healthcheck; + bool has_intel_perf; }; /* Helpers */ @@ -319,6 +321,16 @@ static int local_i915_recover(int i915) return local_i915_healthcheck(i915, "post-"); } +static bool local_i915_perf_healthcheck(int i915) +{ + struct intel_perf *intel_perf; + + intel_perf = intel_perf_for_fd(i915); + if (intel_perf) + intel_perf_free(intel_perf); + return intel_perf; +} + #define FLAG_RENDER(1 << 0) #define FLAG_RECOVER (1 << 1) static void node_healthcheck(struct hotunplug *priv, unsigned flags) @@ -360,6 +372,13 @@ static void node_healthcheck(struct hotunplug *priv, unsigned flags) } } + if (!priv->failure && priv->has_intel_perf) { + local_debug("%s\n", "running i915 device perf healthcheck"); + priv->failure = "Device perf healthckeck failure!"; + if (local_i915_perf_healthcheck(fd_drm)) + priv->failure = NULL; + } + fd_drm = close_device(fd_drm, "", "health checked "); if (closed || fd_drm < -1) /* update status for post_healthcheck */ priv->fd.drm_hc = fd_drm; @@ -553,6 +572,7 @@ igt_main .fd = { .drm = -1, .drm_hc = -1, .sysfs_dev = -1, }, 
.failure= NULL, .need_healthcheck = true, + .has_intel_perf = false, }; igt_fixture { @@ -567,6 +587,8 @@ igt_main gem_quiescent_gpu(fd_drm); igt_require_gem(fd_drm); + priv.has_intel_perf = local_i915_perf_healthcheck(fd_drm); + /** * FIXME: Unbinding the i915 driver on some Haswell * platforms with Azalia audio results in a kernel WARN diff --git a/tests/meson.build b/tests/meson.build index 3e3db7d5b..3f6dc4fe3 100644 --- a/tests/meson.build +++ b/tests/meson.build @@ -3,7 +3,6 @@ test_progs = [ 'core_getclient', 'core_getstats', 'core_getversion', - 'core_hotunplug', 'core_setmaster', 'core_setmaster_vs_auth', 'debugfs_test', @@ -361,6 +360,13 @@ test_executables += executable('perf', install : true) test_list += 'perf' +test_executables += executable('core_hotunplug', 'core_hotunplug.c', + dependencies : test_deps + [ lib_igt_i915_perf ], + install_dir : libexecdir, + install_rpath : libexecdir_rpathdir, + install : true) +test_list += 'core_hotunplug' + executable('testdisplay', ['testdisplay.c', 'testdisplay_hotplug.c'], dependencies : test_deps, install_dir : libexecdir, -- 2.25.1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [RFC PATCH 2/2] intel-ci: Unblock core_hotunplug@*hot*bind* subtests
Commit be529747d8ea ("intel-ci: Broaden core_hotunplug blacklist") blamed
issues triggered by hot variants[*] as responsible for random failures in
subsequently executed tests.  According to the issue history[*], last
reported occurrences were not related to core_hotunplug.

Remove *hot*bind* subtests from CI blocklist.

[*] https://gitlab.freedesktop.org/drm/intel/-/issues/2644.

Signed-off-by: Janusz Krzysztofik
---
 tests/intel-ci/blacklist.txt | 8
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/tests/intel-ci/blacklist.txt b/tests/intel-ci/blacklist.txt
index 33f92e37f..595fd0ca6 100644
--- a/tests/intel-ci/blacklist.txt
+++ b/tests/intel-ci/blacklist.txt
@@ -112,10 +112,10 @@ igt@.*@.*pipe-f($|-.*)
 # Temporary workarounds for CI-impacting bugs
 ###
-# Currently fails and leaves the machine in a very bad state, and
-# causes coverage loss for other tests. IOMMU related.
-# https://gitlab.freedesktop.org/drm/intel/-/issues/2644
-igt@core_hotunplug@.*(hot|plug).*
+# *plug* subtests still fail and leave the
+# machine in a very bad state, causing coverage
+# loss for other tests. IOMMU related.
+igt@core_hotunplug@.*plug.*
 
 # hangs several gens of hosts, and has no immediate fix
 igt@device_reset@reset-bound
-- 
2.25.1
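To see which subtest names the old and new blocklist patterns cover, a
small helper like the one below can be used.  This is only a sketch: it
assumes the patterns behave like unanchored POSIX extended regular
expressions, which may differ in detail from the regex engine the CI
runner really uses, and is_blocklisted() is a hypothetical name.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Return true if test name `name` matches blocklist pattern `pattern`,
 * treated as an unanchored POSIX extended regex (an assumption, see
 * above).  A pattern that fails to compile matches nothing.
 */
static bool is_blocklisted(const char *pattern, const char *name)
{
	regex_t re;
	bool match;

	if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB))
		return false;

	match = regexec(&re, name, 0, NULL, 0) == 0;
	regfree(&re);

	return match;
}
```

Under that assumption, igt@core_hotunplug@hotrebind matches the old
pattern "igt@core_hotunplug@.*(hot|plug).*" but not the new one
"igt@core_hotunplug@.*plug.*", while any subtest name containing "plug"
is still blocked.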
Re: [Intel-gfx] [RFC PATCH 1/2] tests/core_hotunplug: Add perf health check
Sorry for double submission, I had to resend due to a typo in igt-dev
list address.

Janusz

On Thursday, 8 April 2021 10:30:08 CEST Janusz Krzysztofik wrote:
> Sometimes CI reports skips of perf subtests when run subsequently after
> core_hotunplug.  That may be an indication of issues with restoring
> device perf features on driver (hot)rebind.
>
> Detect device perf support at test start and check if still available
> after driver rebind.  If that fails, a post-subtest device recovery
> step restores the device perf support so no subsequently executed tests
> are affected.
>
> Signed-off-by: Janusz Krzysztofik
Re: [Intel-gfx] ✗ Fi.CI.IGT: failure for series starting with [RFC,1/2] tests/core_hotunplug: Add perf health check
On Thursday, 8 April 2021 16:50:45 CEST Patchwork wrote:
> == Series Details ==
>
> Series: series starting with [RFC,1/2] tests/core_hotunplug: Add perf health check
> URL   : https://patchwork.freedesktop.org/series/88848/
> State : failure
>
> == Summary ==
>
> CI Bug Log - changes from CI_DRM_9934_full -> IGTPW_5718_full
>
> Summary
> ---
>
> **FAILURE**
>
> Serious unknown changes coming with IGTPW_5718_full absolutely need to be
> verified manually.
>
> If you think the reported changes have nothing to do with the changes
> introduced in IGTPW_5718_full, please notify your bug team to allow them
> to document this new failure mode, which will reduce false positives in CI.
>
> External URL: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_5718/index.html
>
> Possible new issues
> ---
>
> Here are the unknown changes that may have been introduced in
> IGTPW_5718_full:
>
> ### IGT changes ###
>
> Possible regressions
>
>   * igt@core_hotunplug@hotrebind:
>     - shard-tglb: NOTRUN -> [FAIL][1] +1 similar issue
>    [1]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_5718/shard-tglb2/igt@core_hotunp...@hotrebind.html
>     - shard-glk: NOTRUN -> [FAIL][2] +1 similar issue
>    [2]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_5718/shard-glk1/igt@core_hotunp...@hotrebind.html
>     - shard-kbl: NOTRUN -> [FAIL][3] +1 similar issue
>    [3]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_5718/shard-kbl4/igt@core_hotunp...@hotrebind.html
>
>   * igt@core_hotunplug@hotrebind-lateclose:
>     - shard-snb: NOTRUN -> [INCOMPLETE][4]
>    [4]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_5718/shard-snb6/igt@core_hotunp...@hotrebind-lateclose.html
>     - shard-iclb: NOTRUN -> [FAIL][5] +1 similar issue
>    [5]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_5718/shard-iclb5/igt@core_hotunp...@hotrebind-lateclose.html
>     - shard-apl: NOTRUN -> [FAIL][6] +1 similar issue
>    [6]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_5718/shard-apl7/igt@core_hotunp...@hotrebind-lateclose.html

Those FAILs are clear indications that there is an issue with restoring
device perf features after hot rebind on some platforms (or an issue with
the IGT lib's ability to detect them).  So that's not a regression, the
test is only bringing the issue to light.  As long as we keep hot*bind*
subtests blocklisted, the issue will not be visible and will persist
silently, I'm afraid.

Regarding the INCOMPLETE, I'm wondering how often similar system crashes
on GPU hangs happen, if they really happen only on GPU hangs after hot
rebind, and if that's still a good reason to keep the hot*bind* subtests
blocklisted.

Chris, can you please comment?

Thanks,
Janusz

> Known issues
>
> Here are the changes found in IGTPW_5718_full that come from known issues:
>
> ### IGT changes ###
>
> Issues hit
>
>   * igt@gem_create@create-massive:
>     - shard-snb: NOTRUN -> [DMESG-WARN][7] ([i915#3002])
>    [7]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_5718/shard-snb7/igt@gem_cre...@create-massive.html
>
>   * igt@gem_ctx_persistence@engines-queued:
>     - shard-snb: NOTRUN -> [SKIP][8] ([fdo#109271] / [i915#1099]) +3 similar issues
>    [8]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_5718/shard-snb6/igt@gem_ctx_persiste...@engines-queued.html
>
>   * igt@gem_ctx_sseu@invalid-args:
>     - shard-tglb: NOTRUN -> [SKIP][9] ([i915#280])
>    [9]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_5718/shard-tglb5/igt@gem_ctx_s...@invalid-args.html
>
>   * igt@gem_exec_fair@basic-deadline:
>     - shard-glk: [PASS][10] -> [FAIL][11] ([i915#2846])
>    [10]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9934/shard-glk2/igt@gem_exec_f...@basic-deadline.html
>    [11]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_5718/shard-glk7/igt@gem_exec_f...@basic-deadline.html
>     - shard-apl: NOTRUN -> [FAIL][12] ([i915#2846])
>    [12]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_5718/shard-apl1/igt@gem_exec_f...@basic-deadline.html
>
>   * igt@gem_exec_fair@basic-none-solo@rcs0:
>     - shard-glk: [PASS][13] -> [FAIL][14] ([i915#2842]) +1 similar issue
>    [13]:
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9934/shard-glk5/igt@gem_exec_fair@basic-none-s...@rcs0.html >[14]: > https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_5718/shard-glk4/igt@gem_exec_fair@basic-none-s...@rcs0.html > > * igt@gem_exec_fair@basic-none@vcs0: > - shard-kbl: NOTRUN -> [FAIL][15] ([i915#2842]) +1 similar issue >[15]: > https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_5718/shard-kbl7/igt@gem_exec_fair@basic-n...@vcs0.html > > * igt@gem_exec_fair@basic-pace@bcs0: > - shard-tglb: [PASS][16] -> [FAIL][17] ([i915#2842]) +1 similar > issu
[Intel-gfx] [RFC PATCH] tests/gem_userptr_blits: Check for banned mmap-offset
Support for mmap-offset to userptr has been obsoleted, then related lockdep splat reported issues are not going to be resolved other than still banning mmap-offset to userptr attempts. Replace "mmap-offset-invalidate-*" and "readonly-mmap-unsync" subtests which now skip with a negative "mmap-offset-banned" that fails if a mmap-offset attempt to a userptr object doesn't return ENODEV. Also, remove mmap-offset to userptr dependent processing paths from other subtest bodies and drop obsolete subtest variants. Signed-off-by: Janusz Krzysztofik --- tests/i915/gem_userptr_blits.c | 324 +++-- 1 file changed, 30 insertions(+), 294 deletions(-) diff --git a/tests/i915/gem_userptr_blits.c b/tests/i915/gem_userptr_blits.c index 7a80c0161..aad5f141b 100644 --- a/tests/i915/gem_userptr_blits.c +++ b/tests/i915/gem_userptr_blits.c @@ -70,52 +70,12 @@ #endif static uint32_t userptr_flags; -static bool *can_mmap; #define WIDTH 512 #define HEIGHT 512 static uint32_t linear[WIDTH*HEIGHT]; -static bool has_mmap(int i915, const struct mmap_offset *t) -{ - void *ptr, *map; - uint32_t handle; - - handle = gem_create(i915, PAGE_SIZE); - map = __gem_mmap_offset(i915, handle, 0, PAGE_SIZE, PROT_WRITE, - t->type); - gem_close(i915, handle); - if (map) { - munmap(map, PAGE_SIZE); - } else { - igt_debug("no HW / kernel support for mmap-offset(%s)\n", - t->name); - return false; - } - map = NULL; - - igt_assert(posix_memalign(&ptr, PAGE_SIZE, PAGE_SIZE) == 0); - - if (__gem_userptr(i915, ptr, 4096, 0, - I915_USERPTR_UNSYNCHRONIZED, &handle)) - goto out_ptr; - igt_assert(handle != 0); - - map = __gem_mmap_offset(i915, handle, 0, 4096, PROT_WRITE, t->type); - if (map) - munmap(map, 4096); - else - igt_debug("mmap-offset(%s) banned, lockdep loop prevention\n", - t->name); - - gem_close(i915, handle); -out_ptr: - free(ptr); - - return map != NULL; -} - static void gem_userptr_test_unsynchronized(void) { userptr_flags = I915_USERPTR_UNSYNCHRONIZED; @@ -914,28 +874,13 @@ static int 
test_invalid_mapping(int fd, const struct mmap_offset *t) } #define PE_BUSY 0x1 -static void test_process_exit(int fd, const struct mmap_offset *mmo, int flags) +static void test_process_exit(int fd, int flags) { - if (mmo) - igt_require_f(can_mmap[mmo->type], - "HW & kernel support for LLC and mmap-offset(%s) over userptr\n", - mmo->name); - igt_fork(child, 1) { uint32_t handle; handle = create_userptr_bo(fd, sizeof(linear)); - if (mmo) { - uint32_t *ptr; - - ptr = __gem_mmap_offset(fd, handle, 0, sizeof(linear), - PROT_READ | PROT_WRITE, - mmo->type); - if (ptr) - *ptr = 0; - } - if (flags & PE_BUSY) igt_assert_eq(copy(fd, handle, handle), 0); } @@ -1064,53 +1009,30 @@ static int test_map_fixed_invalidate(int fd, uint32_t flags, return 0; } -static void test_mmap_offset_invalidate(int fd, - const struct mmap_offset *t, - unsigned int flags) -#define MMOI_ACTIVE (1u << 0) +static void test_mmap_offset_banned(int fd, const struct mmap_offset *t) { - igt_spin_t *spin = NULL; - uint32_t handle; - uint32_t *map; + struct drm_i915_gem_mmap_offset arg; void *ptr; /* check if mmap_offset type is supported by hardware, skip if not */ - handle = gem_create(fd, PAGE_SIZE); - map = __gem_mmap_offset(fd, handle, 0, PAGE_SIZE, - PROT_READ | PROT_WRITE, t->type); - igt_require_f(map, - "HW & kernel support for mmap_offset(%s)\n", t->name); - munmap(map, PAGE_SIZE); - gem_close(fd, handle); + memset(&arg, 0, sizeof(arg)); + arg.flags = t->type; + arg.handle = gem_create(fd, PAGE_SIZE); + igt_skip_on_f(igt_ioctl(fd, DRM_IOCTL_I915_GEM_MMAP_OFFSET, &arg), + "HW & kernel support for mmap_offset(%s)\n", t->name); + gem_close(fd, arg.handle); /* create userptr object */ + memset(&arg, 0, sizeof(arg)); + arg.flags = t->type; igt_assert_eq(posix_memalign(&ptr, PAGE_SIZE, PAGE_SIZE), 0); -
Re: [Intel-gfx] [RFC PATCH] tests/gem_userptr_blits: Check for banned mmap-offset
On Thursday, 15 April 2021 11:47:29 CEST Marcin Bernatowicz wrote:
> On Fri, 2021-04-09 at 10:57 +0200, Janusz Krzysztofik wrote:
> > Support for mmap-offset to userptr has been obsoleted, then related
> > lockdep splat reported issues are not going to be resolved other than
> > still banning mmap-offset to userptr attempts.
> >
> > Replace "mmap-offset-invalidate-*" and "readonly-mmap-unsync" subtests
> > which now skip with a negative "mmap-offset-banned" that fails if a
> > mmap-offset attempt to a userptr object doesn't return ENODEV.  Also,
> > remove mmap-offset to userptr dependent processing paths from other
> > subtest bodies and drop obsolete subtest variants.
> >
> > Signed-off-by: Janusz Krzysztofik
>
> LGTM,
> Reviewed-by: Marcin Bernatowicz

Thank you Marcin, pushed.

Janusz
Re: [Intel-gfx] [RFC PATCH 1/2] tests/core_hotunplug: Add perf health check
On Wednesday, 14 April 2021 11:50:10 CEST Marcin Bernatowicz wrote:
> On Thu, 2021-04-08 at 10:31 +0200, Janusz Krzysztofik wrote:
> > Sometimes CI reports skips of perf subtests when run subsequently after
> > core_hotunplug.  That may be an indication of issues with restoring
> > device perf features on driver (hot)rebind.
> >
> > Detect device perf support at test start and check if still available
> > after driver rebind.  If that fails, a post-subtest device recovery
> > step restores the device perf support so no subsequently executed tests
> > are affected.
> >
> > Signed-off-by: Janusz Krzysztofik
>
> LGTM,
> Acked-by: Marcin Bernatowicz

Thank you Marcin, pushed.

Janusz
[Intel-gfx] [PATCH] drm/i915: Fix wrong name announced on FB driver switching
Commit 7a0f9ef9703d ("drm/i915: Use drm_fb_helper_fill_info")
effectively changed our FB driver name from "inteldrmfb" to
"i915drmfb".  However, we are still using the old name when kicking out
a firmware fbdev driver potentially bound to our device.  Use the new
name to avoid confusion.

Note: since the new name is assigned by a DRM fbdev helper called at
the DRM driver registration time, that name is not available when we
kick the other driver out early, hence a hardcoded name must be used
unless the DRM layer exposes a macro for converting a DRM driver name
to its associated fbdev driver name.

Signed-off-by: Janusz Krzysztofik
---
 drivers/gpu/drm/i915/i915_drv.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 785dcf20c77b..46082490dc9a 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -554,7 +554,7 @@ static int i915_driver_hw_probe(struct drm_i915_private *dev_priv)
 	if (ret)
 		goto err_perf;
 
-	ret = drm_aperture_remove_conflicting_pci_framebuffers(pdev, "inteldrmfb");
+	ret = drm_aperture_remove_conflicting_pci_framebuffers(pdev, "i915drmfb");
 	if (ret)
 		goto err_ggtt;
 
-- 
2.25.1
Re: [Intel-gfx] ✗ Fi.CI.IGT: failure for drm/i915: Fix wrong name announced on FB driver switching
On Friday, 30 April 2021 01:01:38 CEST Patchwork wrote:
> == Series Details ==
>
> Series: drm/i915: Fix wrong name announced on FB driver switching
> URL   : https://patchwork.freedesktop.org/series/89663/
> State : failure
>
> == Summary ==
>
> CI Bug Log - changes from CI_DRM_10027_full -> Patchwork_20039_full
>
> Summary
> ---
>
> **FAILURE**
>
> Serious unknown changes coming with Patchwork_20039_full absolutely need to
> be verified manually.
>
> If you think the reported changes have nothing to do with the changes
> introduced in Patchwork_20039_full, please notify your bug team to allow
> them to document this new failure mode, which will reduce false positives
> in CI.
>
> Possible new issues
> ---
>
> Here are the unknown changes that may have been introduced in
> Patchwork_20039_full:
>
> ### IGT changes ###
>
> Possible regressions
>
>   * igt@kms_plane@plane-panning-bottom-right-pipe-b-planes:
>     - shard-tglb: [PASS][1] -> [DMESG-WARN][2] +4 similar issues
>    [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10027/shard-tglb3/igt@kms_pl...@plane-panning-bottom-right-pipe-b-planes.html
>    [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20039/shard-tglb5/igt@kms_pl...@plane-panning-bottom-right-pipe-b-planes.html

False positive.  The change only affects a notice sent to the kernel log
on FB switching, nothing else, so no error message in the kernel log can
be related to it.

Thanks, Janusz > > > Known issues > > > Here are the changes found in Patchwork_20039_full that come from known > issues: > > ### IGT changes ### > > Issues hit > > * igt@gem_create@create-clear: > - shard-glk: [PASS][3] -> [FAIL][4] ([i915#3160]) >[3]: > https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10027/shard-glk8/igt@gem_cre...@create-clear.html >[4]: > https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20039/shard-glk9/igt@gem_cre...@create-clear.html > - shard-skl: [PASS][5] -> [FAIL][6] ([i915#3160]) >[5]: > https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10027/shard-skl9/igt@gem_cre...@create-clear.html >[6]: > https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20039/shard-skl9/igt@gem_cre...@create-clear.html > > * igt@gem_create@create-massive: > - shard-apl: NOTRUN -> [DMESG-WARN][7] ([i915#3002]) >[7]: > https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20039/shard-apl3/igt@gem_cre...@create-massive.html > > * igt@gem_ctx_persistence@legacy-engines-hostile@render: > - shard-iclb: [PASS][8] -> [FAIL][9] ([i915#2410]) >[8]: > https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10027/shard-iclb6/igt@gem_ctx_persistence@legacy-engines-host...@render.html >[9]: > https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20039/shard-iclb6/igt@gem_ctx_persistence@legacy-engines-host...@render.html > > * igt@gem_ctx_persistence@legacy-engines-queued: > - shard-snb: NOTRUN -> [SKIP][10] ([fdo#109271] / [i915#1099]) > +2 similar issues >[10]: > https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20039/shard-snb5/igt@gem_ctx_persiste...@legacy-engines-queued.html > > * igt@gem_ctx_persistence@many-contexts: > - shard-tglb: [PASS][11] -> [FAIL][12] ([i915#2410]) >[11]: > https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10027/shard-tglb6/igt@gem_ctx_persiste...@many-contexts.html >[12]: > https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20039/shard-tglb2/igt@gem_ctx_persiste...@many-contexts.html > > * igt@gem_ctx_ringsize@active@bcs0: > - shard-skl: NOTRUN -> 
[INCOMPLETE][13] ([i915#3316]) >[13]: > https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20039/shard-skl8/igt@gem_ctx_ringsize@act...@bcs0.html > > * igt@gem_exec_fair@basic-deadline: > - shard-kbl: [PASS][14] -> [FAIL][15] ([i915#2846]) >[14]: > https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10027/shard-kbl2/igt@gem_exec_f...@basic-deadline.html >[15]: > https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20039/shard-kbl4/igt@gem_exec_f...@basic-deadline.html > > * igt@gem_exec_fair@basic-flow@rcs0: > - shard-kbl: [PASS][16] -> [SKIP][17] ([fdo#109271]) >[16]: > https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10027/shard-kbl7/igt@gem_exec_fair@basic-f...@rcs0.html >[17]: > https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20039/shard-kbl6/igt@gem_exec_fair@basic-f...@rcs0.html > > * igt@gem_exec_fair@basic-pace@rcs0: > - shard-kbl: [PASS][18] -> [FAIL][19] ([i915#2842]) +1 similar > issue >[18]: > https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10027/shard-kbl7/igt@gem_exec_fair@basic-p...@rcs0.html >[19]: > https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20039/shard-kbl4/igt@gem_exec_fair@basic-p...@rcs0.html > > * igt
[Intel-gfx] [RFC PATCH i-g-t] lib/i915/perf: Fix non-card0 processing
IGT i915/perf library functions now always operate on sysfs perf
attributes of card0 device node, no matter which DRM device fd a user
passes.  The intention was to always switch to primary device node if
a user passes a render device node fd, but that breaks handling of
non-card0 devices.

Instead of forcibly using DRM device minor number 0 when opening a
device sysfs area, convert device minor number of a user passed device
fd to the minor number of respective primary (cardX) device node.

Signed-off-by: Janusz Krzysztofik
---
 lib/i915/perf.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/i915/perf.c b/lib/i915/perf.c
index 56d5c0b3a..336824df7 100644
--- a/lib/i915/perf.c
+++ b/lib/i915/perf.c
@@ -376,8 +376,8 @@ open_master_sysfs_dir(int drm_fd)
 	if (fstat(drm_fd, &st) || !S_ISCHR(st.st_mode))
 		return -1;
 
-	snprintf(path, sizeof(path), "/sys/dev/char/%d:0",
-		 major(st.st_rdev));
+	snprintf(path, sizeof(path), "/sys/dev/char/%d:%d",
+		 major(st.st_rdev), minor(st.st_rdev) & ~128);
 
 	return open(path, O_DIRECTORY);
 }
-- 
2.25.1
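As an illustration of the minor number conversion used in the patch,
here is a small stand-alone sketch.  It assumes the usual DRM numbering
where a render node renderD(128 + N) is paired with primary node cardN,
so clearing bit 7 of the minor maps a render node to its primary node
and leaves a primary minor unchanged.  The helper names primary_minor()
and sysfs_char_path() are hypothetical, not IGT API.

```c
#include <stdio.h>

/* Assumption: render node minors are primary minors with this bit set. */
#define RENDER_MINOR_BIT 128u

/* Map a primary or render node minor to the primary node minor. */
static unsigned int primary_minor(unsigned int minor)
{
	return minor & ~RENDER_MINOR_BIT;
}

/*
 * Format the sysfs path of the primary node for the given device
 * numbers.  Returns the snprintf() result.
 */
static int sysfs_char_path(char *buf, size_t len,
			   unsigned int maj, unsigned int min)
{
	return snprintf(buf, len, "/sys/dev/char/%u:%u",
			maj, primary_minor(min));
}
```

For example, with the DRM character device major 226, both minor 1
(card1) and minor 129 (renderD129) resolve to /sys/dev/char/226:1,
whereas the code before the patch would have used 226:0 for either.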
Re: [Intel-gfx] Shutdown hooks
On Thursday, May 16, 2019 8:20:18 AM CEST Janusz Krzysztofik wrote:
> On Wednesday, May 15, 2019 5:00:40 PM CEST Chris Wilson wrote:
> > Janus, some old patches that may be of use for shutdown prior to kexec.
> > -Chris
>
> Hi Chris,
>
> Thanks for sharing.
>
> I'm only not sure why you mentioned kexec.  I have an impression someone else
> was talking about kexec recently so maybe I was not the intended recipient.
> But anyway, those patches look to me like they may be helpful by hotunplug so
> I'm going to give them a try with the hotunplug test.

I was wrong.  The shutdown hook has nothing to do with hot unbind / unplug,
and the applicable remove hook already has in its path both calls covered by
those patches.  So it looks like I was indeed not the intended recipient of
those messages.

Thanks,
Janusz

P.S. Sorry for the business disclaimer appended to my last message.
[Intel-gfx] [RFC PATCH] drm/i915: Tolerate file owned GEM contexts on hot unbind
From: Janusz Krzysztofik

During i915_driver_unload(), GEM contexts are verified restrictively
inside i915_gem_fini() if they don't consume shared resources which
should be cleaned up before the driver is released.  If those checks
don't result in kernel panic, one more check is performed at the end of
i915_gem_fini() which issues a WARN_ON() if GEM contexts still exist.

Some GEM contexts are allocated unconditionally on device file open,
one per each file descriptor, and are kept open until those file
descriptors are closed.  Since open file descriptors prevent the driver
module from being unloaded, that protects the driver from being
released while contexts are still open.  However, that's not the case
on driver unbind or device unplug sysfs operations which are executed
regardless of open file descriptors.

To protect kernel resources from being accessed by those open file
descriptors while driver unbind or device unplug operation is in
progress, the driver now calls drm_device_unplug() at the beginning of
that process and relies on the DRM layer to provide such protection.

Taking all above information into account, as soon as shared resources
not associated with specific file descriptors are cleaned up, it should
be safe to postpone completion of driver release until users of those
open file descriptors give up on errors and close them.

When device has been marked unplugged, use WARN_ON() conditionally so
the warning is displayed only if a GEM context not associated with a
file descriptor is still allocated.

Signed-off-by: Janusz Krzysztofik
---
 drivers/gpu/drm/i915/i915_gem.c | 12 +++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 54f27cabae2a..c00b6dbaf4f5 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4670,7 +4670,17 @@ void i915_gem_fini(struct drm_i915_private *dev_priv)
 
 	i915_gem_drain_freed_objects(dev_priv);
 
-	WARN_ON(!list_empty(&dev_priv->contexts.list));
+	if (drm_dev_is_unplugged(&dev_priv->drm)) {
+		struct i915_gem_context *ctx, *cn;
+
+		list_for_each_entry_safe(ctx, cn, &dev_priv->contexts.list,
+					 link) {
+			WARN_ON(IS_ERR_OR_NULL(ctx->file_priv));
+			break;
+		}
+	} else {
+		WARN_ON(!list_empty(&dev_priv->contexts.list));
+	}
 }
 
 void i915_gem_init_mmio(struct drm_i915_private *i915)
-- 
2.21.0
Re: [Intel-gfx] [RFC PATCH] drm/i915: Tolerate file owned GEM contexts on hot unbind
On Friday, May 17, 2019 4:32:35 PM CEST Chris Wilson wrote:
> Quoting Janusz Krzysztofik (2019-05-17 15:06:17)
> > From: Janusz Krzysztofik
> >
> > During i915_driver_unload(), GEM contexts are verified restrictively
> > inside i915_gem_fini() if they don't consume shared resources which
> > should be cleaned up before the driver is released.  If those checks
> > don't result in kernel panic, one more check is performed at the end of
> > i915_gem_fini() which issues a WARN_ON() if GEM contexts still exist.
>
> Just fix the underlying bug of this code being called too early. The
> assumptions we made for unload are clearly invalid when applied to
> unbind, and we need to split the phases.
> -Chris

Thanks Chris, I think I get it finally.

Janusz
[Intel-gfx] [PATCH] drm/i915: Fix context IDs not released on driver hot unbind
From: Janusz Krzysztofik

In case the driver gets unbound while a device is open, kernel panic
may be forced if a list of allocated context IDs is not empty.

When a device is open, the list may happen to be not empty because a
context ID, once allocated by a context ID allocator to a context
associated with that open file descriptor, is released as late as
on device close.

On the other hand, there is a need to release all allocated context IDs
and destroy the context ID allocator on driver unbind, even if a device
is open, in order to free memory resources consumed and prevent memory
leaks.  The purpose of the forced kernel panic was to protect the
context ID allocator from being silently destroyed if not all allocated
IDs had been released.

Before forcing the kernel panic on a non-empty list of allocated context
IDs, force it first on a non-empty list of contexts that should have
been freed by the preceding drain of the work queue (there must be
another bug if that list happens to be not empty).  If it is empty, we
may assume that remaining contexts are idle (not pinned) and their IDs
can be safely released.

Once done, release the context ID of each of those remaining contexts,
unless a context unexpectedly turns out to be pinned.  Force kernel
panic in that case, as there must be yet another bug in the driver code.

Now the kernel panic protecting the allocator should not pop up as the
list it checks should be empty.  In the unlikely case it is not empty,
there must be yet another bug.

Signed-off-by: Janusz Krzysztofik
---
 drivers/gpu/drm/i915/i915_gem_context.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 280813a4bf82..18d004d94e43 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -611,6 +611,8 @@ void i915_gem_contexts_lost(struct drm_i915_private *dev_priv)
 
 void i915_gem_contexts_fini(struct drm_i915_private *i915)
 {
+	struct i915_gem_context *ctx, *cn;
+
 	lockdep_assert_held(&i915->drm.struct_mutex);
 
 	if (i915->preempt_context)
@@ -618,6 +620,14 @@ void i915_gem_contexts_fini(struct drm_i915_private *i915)
 	destroy_kernel_context(&i915->kernel_context);
 
 	/* Must free all deferred contexts (via flush_workqueue) first */
+	GEM_BUG_ON(!llist_empty(&i915->contexts.free_list));
+
+	/* Release all remaining HW IDs before ID allocator is destroyed */
+	list_for_each_entry_safe(ctx, cn, &i915->contexts.hw_id_list,
+				 hw_id_link) {
+		GEM_BUG_ON(atomic_read(&ctx->hw_id_pin_count));
+		release_hw_id(ctx);
+	}
 	GEM_BUG_ON(!list_empty(&i915->contexts.hw_id_list));
 	ida_destroy(&i915->contexts.hw_ida);
 }
-- 
2.20.1
Re: [Intel-gfx] [PATCH] drm/i915: Fix context IDs not released on driver hot unbind
On Thu, 2019-04-04 at 11:28 +0100, Chris Wilson wrote:
> Quoting Janusz Krzysztofik (2019-04-04 11:24:45)
> > From: Janusz Krzysztofik
> >
> > In case the driver gets unbound while a device is open, kernel panic
> > may be forced if a list of allocated context IDs is not empty.
> >
> > When a device is open, the list may happen to be not empty because a
> > context ID, once allocated by a context ID allocator to a context
> > associated with that open file descriptor, is released as late as
> > on device close.
> >
> > On the other hand, there is a need to release all allocated context IDs
> > and destroy the context ID allocator on driver unbind, even if a device
> > is open, in order to free memory resources consumed and prevent from
> > memory leaks.  The purpose of the forced kernel panic was to protect
> > the context ID allocator from being silently destroyed if not all
> > allocated IDs had been released.
>
> Those open fd are still pointing into kernel memory where the driver
> used to be. The panic is entirely correct, we should not be unloading
> the module before those dangling pointers have been made safe.
>
> This is papering over the symptom. How is the module being unloaded with
> open fd?

A user can play with the driver unbind or device remove sysfs interface.

Thanks,
Janusz

> If all the fd have been closed, how have we failed to flush and
> retire all requests (thereby unpinning the contexts and all other
> pointers).
> -Chris
Re: [Intel-gfx] [PATCH] drm/i915: Fix context IDs not released on driver hot unbind
On Thu, 2019-04-04 at 11:43 +0100, Chris Wilson wrote:
> Quoting Janusz Krzysztofik (2019-04-04 11:40:24)
> > On Thu, 2019-04-04 at 11:28 +0100, Chris Wilson wrote:
> > > Quoting Janusz Krzysztofik (2019-04-04 11:24:45)
> > > > From: Janusz Krzysztofik
> > > >
> > > > In case the driver gets unbound while a device is open, kernel
> > > > panic may be forced if a list of allocated context IDs is not
> > > > empty.
> > > >
> > > > When a device is open, the list may happen to be not empty
> > > > because a context ID, once allocated by a context ID allocator
> > > > to a context associated with that open file descriptor, is
> > > > released as late as on device close.
> > > >
> > > > On the other hand, there is a need to release all allocated
> > > > context IDs and destroy the context ID allocator on driver
> > > > unbind, even if a device is open, in order to free memory
> > > > resources consumed and prevent memory leaks. The purpose of the
> > > > forced kernel panic was to protect the context ID allocator from
> > > > being silently destroyed if not all allocated IDs had been
> > > > released.
> > >
> > > Those open fd are still pointing into kernel memory where the
> > > driver used to be. The panic is entirely correct, we should not be
> > > unloading the module before those dangling pointers have been made
> > > safe.
> > >
> > > This is papering over the symptom. How is the module being
> > > unloaded with open fd?
> >
> > A user can play with the driver unbind or device remove sysfs
> > interface.
>
> Sure, but we must still follow all the steps before _unloading_ the
> module or else the user is left pointing into reused kernel memory.

I'm not talking about unloading the module; that is prevented by open
fds. The driver still exists after being unbound from a device and may
just respond with -ENODEV.

Janusz

> -Chris
[Intel-gfx] [PATCH] drm/i915: Use drm_dev_unplug()
From: Janusz Krzysztofik

The driver does not currently support unbinding from a device which is
in use. Since open file descriptors may still be pointing into kernel
memory where the device structures used to be, entirely correct kernel
panics protect the driver from being unbound, as we should not be
unbinding it before those dangling pointers have been made safe.

According to the documentation found inside drivers/gpu/drm/drm_drv.c,
drm_dev_unplug() should be used instead of drm_dev_unregister() in
order to make a device inaccessible to users as soon as it is unplugged.
Follow that advice to make those possibly dangling pointers safe,
protected by the DRM layer from a user who is otherwise left pointing
into possibly reused kernel memory after the driver has been unbound
from the device.

Signed-off-by: Janusz Krzysztofik
---
 drivers/gpu/drm/i915/i915_drv.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 9df65d386d11..66163378c481 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -1596,7 +1596,7 @@ static void i915_driver_unregister(struct drm_i915_private *dev_priv)
 	i915_pmu_unregister(dev_priv);
 
 	i915_teardown_sysfs(dev_priv);
-	drm_dev_unregister(&dev_priv->drm);
+	drm_dev_unplug(&dev_priv->drm);
 
 	i915_gem_shrinker_unregister(dev_priv);
 }
-- 
2.20.1
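The behaviour the patch above is after, as described in the thread (an unbound driver that still exists and "may just respond with -ENODEV"), can be illustrated with a small userspace model. This is only a sketch under assumptions: struct model_dev and the model_*() helpers are hypothetical names, and the single flag is a simplification of whatever the DRM core does internally when a device is marked unplugged.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of a device node with open file descriptors. */
struct model_dev {
	bool unplugged;
	int open_count;
};

/* New opens are refused once the device has been marked unplugged. */
static int model_open(struct model_dev *d)
{
	if (d->unplugged)
		return -ENODEV;
	d->open_count++;
	return 0;
}

/* Requests from already open descriptors are turned away as well. */
static int model_ioctl(struct model_dev *d)
{
	if (d->unplugged)
		return -ENODEV;
	return 0;
}

/* Marking the device unplugged does not close anything by itself. */
static void model_unplug(struct model_dev *d)
{
	d->unplugged = true;
}

/* Closing must keep working after unplug so resources can be dropped. */
static void model_close(struct model_dev *d)
{
	assert(d->open_count > 0);
	d->open_count--;
}
```

The point of the model is the asymmetry: after model_unplug() every entry point that would touch device state fails early, while model_close() still runs so that per-file resources can eventually be released.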
Re: [Intel-gfx] [PATCH] drm/i915: Use drm_dev_unplug()
On Fri, 2019-04-05 at 08:41 +0100, Chris Wilson wrote: > Quoting Janusz Krzysztofik (2019-04-05 08:26:57) > > From: Janusz Krzysztofik > > > > The driver does not currently support unbinding from a device which > > is > > in use. Since open file descriptors may still be pointing into > > kernel > > memory where the device structures used to be, entirely correct > > kernel > > panics protect the driver from being unbound as we should not be > > unbinding it before those dangling pointers have been made safe. > > > > According to the documentation found inside > > drivers/gpu/drm/drm_drv.c, > > drm_dev_unplug() should be used instead of drm_dev_unregister() in > > order to make a device inaccessible to users as soon as it is > > unpluged. > > Follow that advice to make those possibly dangling pointers safe, > > protected by DRM layer from a user who is otherwise left pointing > > into > > possibly reused kernel memory after the driver has been unbound > > from > > the device. > > > > Signed-off-by: Janusz Krzysztofik > > --- > > drivers/gpu/drm/i915/i915_drv.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/drivers/gpu/drm/i915/i915_drv.c > > b/drivers/gpu/drm/i915/i915_drv.c > > index 9df65d386d11..66163378c481 100644 > > --- a/drivers/gpu/drm/i915/i915_drv.c > > +++ b/drivers/gpu/drm/i915/i915_drv.c > > @@ -1596,7 +1596,7 @@ static void i915_driver_unregister(struct > > drm_i915_private *dev_priv) > > i915_pmu_unregister(dev_priv); > > > > i915_teardown_sysfs(dev_priv); > > - drm_dev_unregister(&dev_priv->drm); > > + drm_dev_unplug(&dev_priv->drm); > > I think we may have our onion inverted here. We want to stop the > users > as the first step, then start removing the entries. (That will also > nicely invert the order from register, which is what we typically > expect). > > After calling i915_driver_unregister(); call i915_gem_set_wedged() to > immediately (give or take external fences) cancel inflight > operations. OK, thanks. 
Do you prefer them squashed or as separate patches?

Thanks,
Janusz

> -Chris
Re: [Intel-gfx] [PATCH] drm/i915: Use drm_dev_unplug()
On Fri, 2019-04-05 at 09:24 +0100, Chris Wilson wrote: > Quoting Janusz Krzysztofik (2019-04-05 09:11:54) > > On Fri, 2019-04-05 at 08:41 +0100, Chris Wilson wrote: > > > Quoting Janusz Krzysztofik (2019-04-05 08:26:57) > > > > From: Janusz Krzysztofik > > > > > > > > The driver does not currently support unbinding from a device > > > > which > > > > is > > > > in use. Since open file descriptors may still be pointing into > > > > kernel > > > > memory where the device structures used to be, entirely correct > > > > kernel > > > > panics protect the driver from being unbound as we should not > > > > be > > > > unbinding it before those dangling pointers have been made > > > > safe. > > > > > > > > According to the documentation found inside > > > > drivers/gpu/drm/drm_drv.c, > > > > drm_dev_unplug() should be used instead of drm_dev_unregister() > > > > in > > > > order to make a device inaccessible to users as soon as it is > > > > unpluged. > > > > Follow that advice to make those possibly dangling pointers > > > > safe, > > > > protected by DRM layer from a user who is otherwise left > > > > pointing > > > > into > > > > possibly reused kernel memory after the driver has been unbound > > > > from > > > > the device. > > > > > > > > Signed-off-by: Janusz Krzysztofik > > > > > > > > --- > > > > drivers/gpu/drm/i915/i915_drv.c | 2 +- > > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > > > diff --git a/drivers/gpu/drm/i915/i915_drv.c > > > > b/drivers/gpu/drm/i915/i915_drv.c > > > > index 9df65d386d11..66163378c481 100644 > > > > --- a/drivers/gpu/drm/i915/i915_drv.c > > > > +++ b/drivers/gpu/drm/i915/i915_drv.c > > > > @@ -1596,7 +1596,7 @@ static void i915_driver_unregister(struct > > > > drm_i915_private *dev_priv) > > > > i915_pmu_unregister(dev_priv); > > > > > > > > i915_teardown_sysfs(dev_priv); > > > > - drm_dev_unregister(&dev_priv->drm); > > > > + drm_dev_unplug(&dev_priv->drm); > > > > > > I think we may have our onion inverted here. 
We want to stop the > > > users > > > as the first step, then start removing the entries. (That will > > > also > > > nicely invert the order from register, which is what we typically > > > expect). > > > > > > After calling i915_driver_unregister(); call > > > i915_gem_set_wedged() to > > > immediately (give or take external fences) cancel inflight > > > operations. > > > > OK, thanks. Do you prefer them squashed or as serparate patches? > > Quite happy to do the s/unregister/unplug/ and move in one go. Have a > pre-emptive > Reviewed-by: Chris Wilson > on that as that seems to be the right thing to do. > > And there should be no issues in placing a i915_gem_set_wedged() > immediately after the call to i915_driver_unregister, so if you > include > a line of commentary about why, for example > > /* > * After unregistering the device to prevent any new users, cancel > * all in-flight requests so that we can quickly unbind the active > * resources. > */ > i915_gem_set_wedged(dev_priv); > > Reviewed-by: Chris Wilson I've given it some testing, no side effects with test workloads I've tried, and looks like it at least helps to prevent from making the device actually wedged. With these two patches, plus the one we discussed yesterday, and yet another one I'm going to submit soon, I'm now able to unbind the driver from a device while a workload is running on it, unload the module, reload it and successfully perform basic GEM health checks, all in a quick succession :-). Unfortunately, not 100% reproducible, as well as not the case with device unplug simulated by writing 1 to device/remove sysfs file. Surely that needs the work you describe below to be done first. Thanks for your cooperation, Janusz > > I think overall though, we need to go through i915_driver_unload() > and > push the module cleanup operations to i915_driver_release -- that > will > take a bit of surgery to separate the different phases that are > currently smashed together. 
> -Chris
[Intel-gfx] [PATCH] drm/i915: Mark GEM wedged right after marking device unplugged
As soon as a device is considered unplugged, not only prevent pending
users from accessing the device structures but also cancel all their
pending requests so all consumed resources can be cleaned up as soon as
possible.

Signed-off-by: Janusz Krzysztofik
Reviewed-by: Chris Wilson
---
 drivers/gpu/drm/i915/i915_drv.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 66163378c481..03a563ce7e6b 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -1598,6 +1598,13 @@ static void i915_driver_unregister(struct drm_i915_private *dev_priv)
 	i915_teardown_sysfs(dev_priv);
 	drm_dev_unplug(&dev_priv->drm);
 
+	/*
+	 * After unregistering the device to prevent any new users, cancel
+	 * all in-flight requests so that we can quickly unbind the active
+	 * resources.
+	 */
+	i915_gem_set_wedged(dev_priv);
+
 	i915_gem_shrinker_unregister(dev_priv);
 }
 
-- 
2.20.1
[Intel-gfx] [RFC PATCH] drm/i915: Don't panic on non-empty list of free cachelines
From: Janusz Krzysztofik

If there are active users of a device during driver unbind, the driver
now panics on a non-empty list of free cachelines.

By design, cachelines which are not in use are kept on a list of free
cachelines associated with a timeline and removed from that list either
when in use or when the timeline is destroyed. Timelines in turn are
assigned to open file descriptors. As long as a device file is open,
its associated timeline with its list of free cachelines will hopefully
be destroyed on device close, either while outstanding execlists are
destroyed or on i915_timeline_put() called directly, so as long as
device file descriptors are protected from unwanted user activities by
the device being marked unplugged, there should be no reason to panic.

Moreover, the timeline mutex, which is destroyed right after the check
for emptiness of the free cacheline list succeeds, is never used to
protect that list, only the list of active cachelines, so it can be
freely destroyed even if the former is not empty.

Simply remove the GEM_BUG_ON(!list_empty(&gt->hwsp_free_list)); line
from i915_timelines_fini().

Signed-off-by: Janusz Krzysztofik
---
 drivers/gpu/drm/i915/i915_timeline.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_timeline.c b/drivers/gpu/drm/i915/i915_timeline.c
index b2202d2e58a2..1f23c2dcc0da 100644
--- a/drivers/gpu/drm/i915/i915_timeline.c
+++ b/drivers/gpu/drm/i915/i915_timeline.c
@@ -325,7 +325,6 @@ void i915_timelines_fini(struct drm_i915_private *i915)
 	struct i915_gt_timelines *gt = &i915->gt.timelines;
 
 	GEM_BUG_ON(!list_empty(&gt->active_list));
-	GEM_BUG_ON(!list_empty(&gt->hwsp_free_list));
 
 	mutex_destroy(&gt->mutex);
 }
-- 
2.20.1
Re: [Intel-gfx] [RFC PATCH] drm/i915: Don't panic on non-empty list of free cachelines
On Fri, 2019-04-05 at 13:20 +0100, Chris Wilson wrote: > Quoting Janusz Krzysztofik (2019-04-05 13:13:31) > > From: Janusz Krzysztofik > > > > If there are active users of a device during driver unbind, the > > driver > > now panics on non-empty list of free cachelines. > > This panic is there to say that fini is being called with active > contexts, that it is being called too early. Those requests should be > cleaned up first, unpinning the contexts and resources, and so > letting > the timeline be freed. OK, I see. But why panic? Maybe a WARN() would be enough. Thanks, Janusz > -Chris > ___ > dri-devel mailing list > dri-de...@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/dri-devel ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 1/2] drm/i915: Use drm_dev_unplug()
From: Janusz Krzysztofik The driver does not currently support unbinding from a device which is in use. Since open file descriptors may still be pointing into kernel memory where the device structures used to be, entirely correct kernel panics protect the driver from being unbound as we should not be unbinding it before those dangling pointers have been made safe. According to the documentation found inside drivers/gpu/drm/drm_drv.c, drm_dev_unplug() should be used instead of drm_dev_unregister() in order to make a device inaccessible to users as soon as it is unpluged. Follow that advice to make those possibly dangling pointers safe, protected by DRM layer from a user who is otherwise left pointing into possibly reused kernel memory after the driver has been unbound from the device. Once done, also cancel inflight operations immediately by calling i915_gem_set_wedged(). Signed-off-by: Janusz Krzysztofik Reviewed-by: Chris Wilson --- drivers/gpu/drm/i915/i915_drv.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c index 9df65d386d11..66163378c481 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -1596,7 +1596,7 @@ static void i915_driver_unregister(struct drm_i915_private *dev_priv) i915_pmu_unregister(dev_priv); i915_teardown_sysfs(dev_priv); - drm_dev_unregister(&dev_priv->drm); + drm_dev_unplug(&dev_priv->drm); i915_gem_shrinker_unregister(dev_priv); } -- 2.20.1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 0/2] Stop users from using the device on driver unbind
Use drm_dev_unplug() to have device resources protected from user access by DRM layer as soon as the driver is going to be unbound. Also, cancel all pending work so associated resources can be quickly released. Janusz Krzysztofik (2): drm/i915: Use drm_dev_unplug() drm/i915: Mark GEM wedged right after marking device unplugged drivers/gpu/drm/i915/i915_drv.c | 9 - 1 file changed, 8 insertions(+), 1 deletion(-) I'm resending these two patches together in series to make the robot happy about the second one. Also, I've added the Suggested-by: clause to credit actual Chris' contribution. Thanks, Janusz -- 2.20.1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 2/2] drm/i915: Mark GEM wedged right after marking device unplugged
As soon as a device is considered unplugged, not only prevent pending users from accessing the device structures but also cancel all their pending requests so all consumed resources can be cleaned up as soon as possible. Suggested-by: Chris Wilson Signed-off-by: Janusz Krzysztofik Reviewed-by: Chris Wilson --- drivers/gpu/drm/i915/i915_drv.c | 7 +++ 1 file changed, 7 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c index 66163378c481..03a563ce7e6b 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -1598,6 +1598,13 @@ static void i915_driver_unregister(struct drm_i915_private *dev_priv) i915_teardown_sysfs(dev_priv); drm_dev_unplug(&dev_priv->drm); + /* +* After unregistering the device to prevent any new users, cancel +* all in-flight requests so that we can quickly unbind the active +* resources. +*/ + i915_gem_set_wedged(dev_priv); + i915_gem_shrinker_unregister(dev_priv); } -- 2.20.1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH v2] drm/i915: Don't panic on non-empty list of free cachelines
From: Janusz Krzysztofik

If there are active users of a device during driver unbind, the driver
now panics on a non-empty list of free cachelines.

By design, cachelines which are not in use are kept on a list of free
cachelines associated with a timeline and removed from that list either
when in use or when the timeline is destroyed. Timelines in turn are
assigned to open file descriptors. As long as a device file is open,
its associated timeline with its list of free cachelines will hopefully
be destroyed on device close, either while outstanding execlists are
destroyed or on i915_timeline_put() called directly, so as long as
device file descriptors are protected from unwanted user activities by
the device being marked unplugged, there should be no reason to panic.

Moreover, the timeline mutex, which is destroyed right after the check
for emptiness of the free cacheline list succeeds, is never used to
protect that list, only the list of active cachelines, so it can be
freely destroyed even if the former is not empty.

Since the desired behavior is to clean up active contexts first,
unpinning the contexts and resources, and so letting the timeline be
freed, the panic is there to say that i915_timelines_fini() is called
too early. Don't remove the check completely then, but convert it from
a BUG() to a WARN() so the indication that a long term fix is needed is
still given.

Signed-off-by: Janusz Krzysztofik
---
 drivers/gpu/drm/i915/i915_timeline.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_timeline.c b/drivers/gpu/drm/i915/i915_timeline.c
index b2202d2e58a2..965fd3052b25 100644
--- a/drivers/gpu/drm/i915/i915_timeline.c
+++ b/drivers/gpu/drm/i915/i915_timeline.c
@@ -325,7 +325,7 @@ void i915_timelines_fini(struct drm_i915_private *i915)
 	struct i915_gt_timelines *gt = &i915->gt.timelines;
 
 	GEM_BUG_ON(!list_empty(&gt->active_list));
-	GEM_BUG_ON(!list_empty(&gt->hwsp_free_list));
+	GEM_WARN_ON(!list_empty(&gt->hwsp_free_list));
 
 	mutex_destroy(&gt->mutex);
 }
-- 
2.20.1
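The difference between the two checks in the v2 patch above can be shown with a userspace sketch. It assumes only the general contract of the kernel macros (a BUG-style check stops execution, a WARN-style check reports and lets the caller continue); model_warn_on(), struct model_timelines and model_timelines_fini() are hypothetical names, and assert() stands in for GEM_BUG_ON().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Counts how many warnings were raised, for inspection by callers. */
static int model_warn_count;

/*
 * Report a violated expectation and return the condition, so that a
 * caller may decide to bail out. Execution continues either way.
 */
static bool model_warn_on(bool cond, const char *what)
{
	if (cond) {
		model_warn_count++;
		fprintf(stderr, "WARNING: %s\n", what);
	}
	return cond;
}

/* Hypothetical stand-in for the timelines bookkeeping. */
struct model_timelines {
	int active;	/* entries on the active list */
	int hwsp_free;	/* entries on the free cacheline list */
	bool finalized;
};

static void model_timelines_fini(struct model_timelines *t)
{
	/* Hard check: an active timeline here is a fatal bug. */
	assert(t->active == 0);

	/* Soft check: leftover free cachelines are reported only. */
	model_warn_on(t->hwsp_free != 0, "hwsp_free list not empty");

	t->finalized = true;
}
```

With the soft check, teardown completes and the warning remains as evidence that fini ran earlier than intended, which matches the stated goal of keeping an indication that a long term fix is needed.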
Re: [Intel-gfx] ✗ Fi.CI.BAT: failure for Stop users from using the device on driver unbind
On Friday, April 5, 2019 7:37:04 PM CEST Chris Wilson wrote: > Quoting Chris Wilson (2019-04-05 17:26:46) > > > Quoting Patchwork (2019-04-05 17:20:39) > > > > > == Series Details == > > > > > > Series: Stop users from using the device on driver unbind > > > URL : https://patchwork.freedesktop.org/series/59064/ > > > State : failure > > > > > > == Summary == > > > > > > CI Bug Log - changes from CI_DRM_5881 -> Patchwork_12699 > > > > > > > > > Summary > > > --- > > > > > > **FAILURE** > > > > > > Serious unknown changes coming with Patchwork_12699 absolutely need to > > > be > > > verified manually. > > > > > > If you think the reported changes have nothing to do with the changes > > > introduced in Patchwork_12699, please notify your bug team to allow > > > them > > > to document this new failure mode, which will reduce false positives > > > in CI. > > > > > > External URL: > > > https://patchwork.freedesktop.org/api/1.0/series/59064/revisions/1/mb > > > ox/> > > > > Possible new issues > > > --- > > > > > > Here are the unknown changes that may have been introduced in Patchwork_12699: > > > ### IGT changes ### > > > > > > Possible regressions > > > > > > * igt@i915_module_load@reload: > > 2 issues, it appears: > > > > <4> [271.799080] WARN_ON(dev_priv->mm.object_count) > > <4> [271.799241] WARNING: CPU: 0 PID: 3288 at > > drivers/gpu/drm/i915/i915_gem.c:5145 i915_gem_cleanup_early+0x104/0x110 > > [i915] <4> [271.799249] Modules linked in: vgem snd_hda_codec_hdmi > > snd_hda_codec_realtek snd_hda_codec_generic i915(-) mei_hdcp > > x86_pkg_temp_thermal btusb coretemp btrtl btbcm btintel bluetooth > > crct10dif_pclmul crc32_pclmul snd_hda_codec snd_hwdep ghash_clmulni_intel > > snd_hda_core e1000e ecdh_generic snd_pcm mei_me ptp prime_numbers > > pps_core mei [last unloaded: snd_hda_intel] <4> [271.799302] CPU: 0 PID: > > 3288 Comm: i915_module_loa Tainted: G U > > 5.1.0-rc3-CI-Patchwork_12699+ #1 <4> [271.799307] Hardware name: > > /NUC6i7KYB, BIOS 
KYSKLi70.86A.0059.2018.1122.1431 11/22/2018 <4> > > [271.799406] RIP: 0010:i915_gem_cleanup_early+0x104/0x110 [i915] <4> > > [271.799412] Code: 00 00 48 c7 c2 d0 6b 3d a0 48 c7 c7 ca 5c 2c a0 e8 c1 > > b5 ec e0 0f 0b 48 c7 c6 68 c0 3f a0 48 c7 c7 63 88 42 a0 e8 9c 77 de e0 > > <0f> 0b e9 40 ff ff ff 0f 1f 44 00 00 e8 5b 7e 00 00 31 c0 c3 0f 1f <4> > > [271.799417] RSP: 0018:c9453dd0 EFLAGS: 00010282 > > <4> [271.799423] RAX: RBX: 88849afd RCX: > > <4> [271.799428] RDX: 0006 RSI: > > 88849ee130b8 RDI: 8211dc4d <4> [271.799432] RBP: > > 88849afd7630 R08: 028bc995 R09: <4> > > [271.799436] R10: R11: R12: > > a04a81e0 <4> [271.799440] R13: R14: > > R15: a04a82d0 <4> [271.799446] FS: > > 7f31e8cec980() GS:8884aee0() knlGS: > > <4> [271.799451] CS: 0010 DS: ES: CR0: 80050033 <4> > > [271.799455] CR2: 7ffea58773d8 CR3: 00044cfc6003 CR4: > > 003606f0 <4> [271.799459] Call Trace: > > <4> [271.799531] i915_driver_cleanup_early+0x30/0x70 [i915] > > <4> [271.799603] i915_driver_release+0xa/0x30 [i915] > > <4> [271.799672] i915_driver_unload+0x6a/0x120 [i915] > > <4> [271.799748] i915_pci_remove+0x19/0x30 [i915] > > <4> [271.799765] pci_device_remove+0x36/0xb0 > > So this is the bizarre part. We end up in the final i915_driver_release > because it appears that drm_dev_unplug() drops a reference. I couldn't > see where... 
> > [ 24.960676] WARNING: CPU: 2 PID: 637 at drivers/gpu/drm/drm_drv.c:895 > drm_dev_put+0x8/0x60 [ 24.960735] Modules linked in: nls_ascii nls_cp437 > vfat fat crct10dif_pclmul crc32_pclmul crc32c_intel i915(-) aesni_intel > aes_x86_64 crypto_simd cryptd glue_helper intel_cstate intel_uncore > intel_rapl_perf efivars i2c_i801 intel_gtt drm_kms_helper ahci libahci > video button efivarfs [ 24.960848] CPU: 2 PID: 637 Comm: i915_module_loa > Tainted: GBU5.1.0-rc3+ #526 [ 24.960897] Hardware name: > Intel Corporation NUC7i5BNK/NUC7i5BNB, BIOS > BNKBL357.86A.0052.2017.0918.1346 09/18/2017 [ 24.960952] RIP: > 0010:drm_dev_put+0x8/0x60 > [ 24.960993] Code: 48 8d 7b 60 e8 d9 8b c7 ff 48 8b 7b 60 5b 5d e9 0e 4f > c7 ff 48 89 df e8 06 c2 ff ff e9 3f ff ff ff 90 48 85 ff 75 01 c3 55 53 > <0f> 0b f0 ff 4f 14 0f 88 64 b7 2d 00 74 03 5b 5d c3 48 89 fb 48 8d [ > 24.961066] RSP: 0018:88872587fc80 EFLAGS: 00010286 > [ 24.961107] RAX: RBX: 88873f02 RCX: > 81680444 [ 24.961151] RDX: dc00 RSI: dc00 > RDI: 88873f02 [ 24.961195] RBP: 88873f02ad88 R08: > R09: fbfff04824c5 [ 24.961240] R10: fbfff04824c5 > R11: 8241262b R12: 88
[Intel-gfx] [PATCH v2 0/1] Stop users from using the device on driver unbind
Use drm_dev_unplug() to have device resources protected from user access
by the DRM layer as soon as the driver is going to be unbound.

Janusz Krzysztofik (1):
  drm/i915: Use drm_dev_unplug()

 drivers/gpu/drm/i915/i915_drv.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

This patch should now be safe to use when merged with the current
drm-next or drm-tip branches, which no longer suffer from the
incorrectly resolved merge conflict that was breaking it. That conflict
was finally fixed by commit bd53280ef042 ("drm/drv: Fix incorrect
resolution of merge conflict"), so I'm resending the patch with Daniel's
Reviewed-by: added.

Former patch 2/2 has been dropped as it is already in drm-intel-next as
commit 141f3767e7b8 ("drm/i915: Mark GEM wedged right after marking
device unplugged"). BTW, the version I sent was screwed up, not
reflecting Chris' intention precisely enough, but Chris was vigilant and
fixed it. Sorry Chris.

Thanks,
Janusz

-- 
2.20.1
[Intel-gfx] [PATCH v2 1/1] drm/i915: Use drm_dev_unplug()
From: Janusz Krzysztofik The driver does not currently support unbinding from a device which is in use. Since open file descriptors may still be pointing into kernel memory where the device structures used to be, entirely correct kernel panics protect the driver from being unbound as we should not be unbinding it before those dangling pointers have been made safe. According to the documentation found inside drivers/gpu/drm/drm_drv.c, drm_dev_unplug() should be used instead of drm_dev_unregister() in order to make a device inaccessible to users as soon as it is unpluged. Follow that advice to make those possibly dangling pointers safe, protected by DRM layer from a user who is otherwise left pointing into possibly reused kernel memory after the driver has been unbound from the device. Signed-off-by: Janusz Krzysztofik Reviewed-by: Chris Wilson Reviewed-by: Daniel Vetter --- drivers/gpu/drm/i915/i915_drv.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c index 9df65d386d11..66163378c481 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -1596,7 +1596,7 @@ static void i915_driver_unregister(struct drm_i915_private *dev_priv) i915_pmu_unregister(dev_priv); i915_teardown_sysfs(dev_priv); - drm_dev_unregister(&dev_priv->drm); + drm_dev_unplug(&dev_priv->drm); i915_gem_shrinker_unregister(dev_priv); } -- 2.20.1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [RFC PATCH] iommu/vt-d: Fix IOMMU field not populated on device hot re-plug
Hi Baolu, On Thursday, August 29, 2019 11:08:18 AM CEST Lu Baolu wrote: > Hi, > > On 8/29/19 3:58 PM, Janusz Krzysztofik wrote: > > Hi Baolu, > > > > On Thursday, August 29, 2019 3:43:31 AM CEST Lu Baolu wrote: > >> Hi Janusz, > >> > >> On 8/28/19 10:17 PM, Janusz Krzysztofik wrote: > >>>> We should avoid kernel panic when a intel_unmap() is called against > >>>> a non-existent domain. > >>> Does that mean you suggest to replace > >>> BUG_ON(!domain); > >>> with something like > >>> if (WARN_ON(!domain)) > >>> return; > >>> and to not care of orphaned mappings left allocated? Is there a way to > > inform > >>> users that their active DMA mappings are no longer valid and they > > shouldn't > >>> call dma_unmap_*()? > >>> > >>>> But we shouldn't expect the IOMMU driver not > >>>> cleaning up the domain info when a device remove notification comes and > >>>> wait until all file descriptors being closed, right? > >>> Shouldn't then the IOMMU driver take care of cleaning up resources still > >>> allocated on device remove before it invalidates and forgets their > > pointers? > >>> > >> > >> You are right. We need to wait until all allocated resources (iova and > >> mappings) to be released. > >> > >> How about registering a callback for BUS_NOTIFY_UNBOUND_DRIVER, and > >> removing the domain info when the driver detachment completes? > > > > Device core calls BUS_NOTIFY_UNBOUND_DRIVER on each driver unbind, regardless > > of a device being removed or not. As long as the device is not unplugged and > > the BUS_NOTIFY_REMOVED_DEVICE notification not generated, an unbound driver is > > not a problem here. > > Morever, BUS_NOTIFY_UNBOUND_DRIVER is called even before > > BUS_NOTIFY_REMOVED_DEVICE so that wouldn't help anyway. > > Last but not least, bus events are independent of the IOMMU driver use via > > DMA-API it exposes. > > Fair enough. 
> > > > > If keeping data for unplugged devices and reusing it on device re-plug is not > > acceptable then maybe the IOMMU driver should perform reference counting of > > its internal resources occupied by DMA-API users and perform cleanups on last > > release? > > I am not saying that keeping data is not acceptable. I just want to > check whether there are any other solutions. Then reverting 458b7c8e0dde and applying this patch still resolves the issue for me. No errors appear when mappings are unmapped on device close after the device has been removed, and domain info preserved on device removal is successfully reused on device re-plug. Is there anything else I can do to help? Thanks, Janusz > > Best regards, > Baolu > ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
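The reference-counting idea raised in the message above ("perform cleanups on last release") could look roughly like the following userspace sketch. It is not the intel-iommu implementation and does not reflect how the driver tracks domain info; struct model_domain_info and the model_*() helpers are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-device bookkeeping kept by an IOMMU-like driver. */
struct model_domain_info {
	int mappings;		/* live DMA mappings held by users */
	bool device_present;	/* false once removal was notified */
	bool freed;		/* true once the info was cleaned up */
};

/* Clean up only when the device is gone and no mapping remains. */
static void model_free_if_unused(struct model_domain_info *info)
{
	if (!info->device_present && info->mappings == 0)
		info->freed = true;
}

static void model_map(struct model_domain_info *info)
{
	assert(!info->freed);
	info->mappings++;
}

/* Unmapping after device removal must still find valid bookkeeping. */
static void model_unmap(struct model_domain_info *info)
{
	assert(!info->freed);
	assert(info->mappings > 0);
	info->mappings--;
	model_free_if_unused(info);
}

/* Removal notification: defer cleanup while mappings are outstanding. */
static void model_device_removed(struct model_domain_info *info)
{
	info->device_present = false;
	model_free_if_unused(info);
}
```

In this model a late unmap, such as one issued on device close after the device was removed, still finds its bookkeeping intact, and the cleanup happens exactly once on the last release.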
Re: [Intel-gfx] [RFC PATCH] iommu/vt-d: Fix IOMMU field not populated on device hot re-plug
Hi Baolu,

On Tuesday, September 3, 2019 3:29:40 AM CEST Lu Baolu wrote:
> Hi Janusz,
>
> On 9/2/19 4:37 PM, Janusz Krzysztofik wrote:
> >> I am not saying that keeping data is not acceptable. I just want to
> >> check whether there are any other solutions.
> > Then reverting 458b7c8e0dde and applying this patch still resolves the issue
> > for me. No errors appear when mappings are unmapped on device close after the
> > device has been removed, and domain info preserved on device removal is
> > successfully reused on device re-plug.
>
> This patch doesn't look good to me although I agree that keeping data is
> acceptable. It updates dev->archdata.iommu, but leaves the hardware
> context/pasid table unchanged. This might cause problems somewhere.
>
> >
> > Is there anything else I can do to help?
>
> Can you please tell me how to reproduce the problem?

The simplest way to reproduce the issue, assuming there are no non-Intel
graphics adapters installed, is to run the following shell commands:

#!/bin/sh
# load i915 module
modprobe i915
# open an i915 device and keep it open in background
cat /dev/dri/card0 >/dev/null &
sleep 2
# simulate device unplug
echo 1 >/sys/class/drm/card0/device/remove
# make the background process close the device on exit
kill $!

Thanks,
Janusz

> Keeping the per
> device domain info while device is unplugged is a bit dangerous because
> info->dev might be a wild pointer. We need to work out a clean fix.
>
> >
> > Thanks,
> > Janusz
> >
>
> Best regards,
> Baolu
[Intel-gfx] [PATCH] drm/i915: Don't unwedge if reset is disabled
When trying to reset a device with reset capability disabled or not supported while rings are full of requests, it has been observed when running in execlists submission mode that command stream buffer tail tends to be incremented by apparently still running GPU regardless of all requests being already cancelled and command stream buffer pointers reset. As a result, kernel panic on NULL pointer dereference occurs when a trace_ports() helper is called with command stream buffer tail incremented but request pointers being NULL during final __intel_gt_set_wedged() operation called from intel_gt_reset(). Skip actual reset procedure if reset is disabled or not supported. Suggested-by: Daniele Ceraolo Spurio Signed-off-by: Janusz Krzysztofik --- drivers/gpu/drm/i915/gt/intel_reset.c | 26 ++ 1 file changed, 18 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c index b9d84d52e986..d75da124e280 100644 --- a/drivers/gpu/drm/i915/gt/intel_reset.c +++ b/drivers/gpu/drm/i915/gt/intel_reset.c @@ -932,25 +932,35 @@ void intel_gt_reset(struct intel_gt *gt, GEM_BUG_ON(!test_bit(I915_RESET_BACKOFF, &gt->reset.flags)); mutex_lock(&gt->reset.mutex); - /* Clear any previous failed attempts at recovery. Time to try again. */ - if (!__intel_gt_unset_wedged(gt)) - goto unlock; - if (reason) dev_notice(gt->i915->drm.dev, "Resetting chip for %s\n", reason); - atomic_inc(&gt->i915->gpu_error.reset_count); - - awake = reset_prepare(gt); if (!intel_has_gpu_reset(gt->i915)) { if (i915_modparams.reset) dev_err(gt->i915->drm.dev, "GPU reset not supported\n"); else DRM_DEBUG_DRIVER("GPU reset disabled\n"); - goto error; + + /* +* Don't unwedge if reset is disabled or not supported +* because we can't guarantee what the hardware status is. +*/ + if (intel_gt_is_wedged(gt)) + goto unlock; } + /* Clear any previous failed attempts at recovery. Time to try again. 
*/ + if (!__intel_gt_unset_wedged(gt)) + goto unlock; + + atomic_inc(&gt->i915->gpu_error.reset_count); + + awake = reset_prepare(gt); + + if (!intel_has_gpu_reset(gt->i915)) + goto error; + if (INTEL_INFO(gt->i915)->gpu_reset_clobbers_display) intel_runtime_pm_disable_interrupts(gt->i915); -- 2.21.0
[Intel-gfx] [PATCH] drm/i915/guc: Fix detection of GuC submission in use
The driver always assumes active GuC submission mode if it is supported. That's not true if GuC initialization fails for some reason. That may lead to kernel panics, caused e.g. by execlists fallback submission mode incorrectly detecting GuC submission in use. Fix it by also checking for GuC enabled status. Fixes: 356c484822e6 ("drm/i915/uc: Add explicit DISABLED state for firmware") Signed-off-by: Janusz Krzysztofik --- drivers/gpu/drm/i915/gt/uc/intel_uc.h | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc.h b/drivers/gpu/drm/i915/gt/uc/intel_uc.h index 527995c21196..b28bab64a280 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_uc.h +++ b/drivers/gpu/drm/i915/gt/uc/intel_uc.h @@ -51,7 +51,8 @@ static inline bool intel_uc_supports_guc_submission(struct intel_uc *uc) static inline bool intel_uc_uses_guc_submission(struct intel_uc *uc) { - return intel_guc_is_submission_supported(&uc->guc); + return intel_guc_is_enabled(&uc->guc) && + intel_guc_is_submission_supported(&uc->guc); } static inline bool intel_uc_supports_huc(struct intel_uc *uc) -- 2.21.0 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH i-g-t] lib: Don't use full reset on simulated hardware
If DROP_RESET_ACTIVE is requested while there is a large queue of pending GEM requests, waiting for idle engines performed as a first step of i915_gem_drop_caches debugfs request handler times out and an otherwise healthy device is marked wedged. If that happens while reset capabilities are disabled or not supported, there is no possibility to successfully reset the device after requests are retired. Avoid fake GPU terminally wedged conditions by not requesting DROP_RESET_ACTIVE from exit handler when running on simulated hardware. As a side effect, terminating a very busy test and running a subsequent one may take quite a while. Signed-off-by: Janusz Krzysztofik --- lib/drmtest.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/lib/drmtest.c b/lib/drmtest.c index c379a7b7..b73bc132 100644 --- a/lib/drmtest.c +++ b/lib/drmtest.c @@ -318,7 +318,8 @@ static void __cancel_work_at_exit(int fd) igt_sysfs_set_parameter(fd, "reset", "%x", -1u /* any method */); igt_drop_caches_set(fd, /* cancel everything */ - DROP_RESET_ACTIVE | DROP_RESET_SEQNO | + igt_run_in_simulation() ? 0 : DROP_RESET_ACTIVE | + DROP_RESET_SEQNO | /* cleanup */ DROP_ACTIVE | DROP_RETIRE | DROP_IDLE | DROP_FREED); } -- 2.21.0 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
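A note on how C parses the new expression in the hunk above: the conditional operator binds more loosely than bitwise OR, so `a ? 0 : X | Y | Z` is read as `a ? 0 : (X | Y | Z)`. The sketch below only illustrates that rule. The flag values are made up and are not the real IGT DROP_* bits, and a plain bool stands in for igt_run_in_simulation().

```c
#include <stdbool.h>

/* Hypothetical flag values, for illustration only; not the real IGT DROP_* bits. */
#define X_RESET_ACTIVE 0x01u
#define X_RESET_SEQNO  0x02u
#define X_ACTIVE       0x04u
#define X_RETIRE       0x08u

/* Same shape as the expression in the patch: ?: has lower precedence than |. */
static unsigned int mask_as_written(bool simulation)
{
	return simulation ? 0 : X_RESET_ACTIVE |
	       X_RESET_SEQNO |
	       X_ACTIVE | X_RETIRE;
}

/* Parenthesised so that only the first flag depends on the condition. */
static unsigned int mask_parenthesised(bool simulation)
{
	return (simulation ? 0 : X_RESET_ACTIVE) |
	       X_RESET_SEQNO |
	       X_ACTIVE | X_RETIRE;
}
```

With the condition true, the first form evaluates to 0, so the remaining flags are dropped as well, while the second form keeps them. Whether that difference matters for the exit handler is a question for review, not something this sketch can settle.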
Re: [Intel-gfx] [PATCH] drm/i915/guc: Fix detection of GuC submission in use
Hi Michał, On Thursday, September 5, 2019 2:08:12 PM CEST Michal Wajdeczko wrote: > On Thu, 05 Sep 2019 13:16:31 +0200, Janusz Krzysztofik > wrote: > > > The driver always assumes active GuC submission mode if it is > > supported. That's not true if GuC initialization fails for some > > reason. That may lead to kernel panics, caused e.g. by execlists > > fallback submission mode incorrectly detecting GuC submission in use. > > > > Fix it by also checking for GuC enabled status. > > > > Fixes: 356c484822e6 ("drm/i915/uc: Add explicit DISABLED state for > > firmware") > > Signed-off-by: Janusz Krzysztofik > > --- > > drivers/gpu/drm/i915/gt/uc/intel_uc.h | 3 ++- > > 1 file changed, 2 insertions(+), 1 deletion(-) > > > > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc.h > > b/drivers/gpu/drm/i915/gt/uc/intel_uc.h > > index 527995c21196..b28bab64a280 100644 > > --- a/drivers/gpu/drm/i915/gt/uc/intel_uc.h > > +++ b/drivers/gpu/drm/i915/gt/uc/intel_uc.h > > @@ -51,7 +51,8 @@ static inline bool > > intel_uc_supports_guc_submission(struct intel_uc *uc) > > static inline bool intel_uc_uses_guc_submission(struct intel_uc *uc) > > { > > - return intel_guc_is_submission_supported(&uc->guc); > > + return intel_guc_is_enabled(&uc->guc) && > > + intel_guc_is_submission_supported(&uc->guc); > > This wont fix your original problem (that btw is not possible to > repro on drm-tip) I'm not sure how you force GuC initialization to fail, mine just didn't have new firmware available. On module load, the driver was starting up in execlists submission mode and BUG_ON( was raised from process_csb(). Running on a simulator, I was using current internal tree, based on current drm-tip. > as after any GuC initialization failure we still > treat GuC as "enabled": My bad, I initially used intel_guc_is_running() but that interfered badly with module unload so I switched to intel_guc_is_enabled() and apparently didn't re-test if this still fixes the original issue. 
> intel_guc_is_supported => H/W support (static) > intel_guc_is_enabled => aka not disabled by the user (config) > intel_guc_is_running => no major fw failure (runtime) > > Note that even s/intel_guc_is_enabled/intel_guc_is_running > won't help as GuC may be running but we may fail to correctly > initialize GuC submission. > > Correct fix to original problem must be aligned with new GuC > submission model (coming soon) and it may look like this: > > +static inline bool intel_guc_is_submission_active(struct intel_guc *guc) > +{ > + GEM_BUG_ON(guc->submission_active && !intel_guc_is_running(guc)); > + return guc->submission_active; > +} > > and then > > static inline bool intel_uc_uses_guc_submission(struct intel_uc *uc) > { > - return intel_guc_is_submission_supported(&uc->guc); > + return intel_guc_is_submission_active(&uc->guc); > } > > We may need to revisit all uses/supports/ macros to better > reflect configuration vs runtime differences. Definitely, or we may get into trouble like the one I experienced on module unload. And that can be done in advance, I believe. As long as the unload issue is resolved by not using intel_uc_uses_guc_submission() where it proved inappropriate, using (intel_guc_is_running() && intel_guc_is_submission_supported()) seems a valid fix to me, easy to migrate to intel_guc_is_submission_active() as soon as it is available. I'll revert to intel_guc_is_running(), fix the module unload issue and resubmit to trybot, maybe it can discover more issues with that. Thanks, Janusz > > Thanks, > Michal >
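The static / config / runtime distinction discussed above can be sketched with a toy model. This is not driver code: the struct and the model_* helpers are hypothetical names, chosen only to show why a "uses GuC submission" check based on support or on the enabled state alone can report true when submission was never set up.

```c
#include <stdbool.h>

/* Simplified stand-in for struct intel_guc; all field names are illustrative. */
struct guc_model {
	bool hw_supported;      /* static: the hardware has a GuC */
	bool user_enabled;      /* config: not disabled by the user */
	bool fw_running;        /* runtime: firmware loaded, no major failure */
	bool submission_active; /* runtime: GuC submission was actually set up */
};

static bool model_is_supported(const struct guc_model *g)
{
	return g->hw_supported;
}

static bool model_is_enabled(const struct guc_model *g)
{
	return model_is_supported(g) && g->user_enabled;
}

static bool model_is_running(const struct guc_model *g)
{
	return model_is_enabled(g) && g->fw_running;
}

/* Only the runtime flag says whether submission really goes through the GuC. */
static bool model_uses_submission(const struct guc_model *g)
{
	return model_is_running(g) && g->submission_active;
}
```

In this model a device with missing firmware is still "enabled" but neither running nor using GuC submission, and a running GuC with failed submission setup is still not using it, which mirrors the two cases raised in the thread.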
Re: [Intel-gfx] [PATCH 2/3] drm/i915/uc: Disable GuC submission only if currently enabled
Hi Fernando, On Wednesday, August 28, 2019 2:45:57 AM CEST Fernando Pacheco wrote: > It is not enough to check that uc supports GuC submission now > that we can continue to load the driver after GuC initialization > failure (support != enabled). Instead we should explicitly check > that we enabled GuC submission. What's the status of this patch? I think that having your intel_guc_is_submission_enabled() helper available I would be able to resolve a few related issues, which I've accidentally taken upon myself, without inventing my own version. Thanks, Janusz > Signed-off-by: Fernando Pacheco > Cc: Michal Wajdeczko > Cc: Daniele Ceraolo Spurio > --- > .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 23 +++ > .../gpu/drm/i915/gt/uc/intel_guc_submission.h | 1 + > drivers/gpu/drm/i915/gt/uc/intel_uc.c | 2 +- > 3 files changed, 25 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/ gpu/drm/i915/gt/uc/intel_guc_submission.c > index f325d3dd564f..d4aff9a96c7a 100644 > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c > @@ -191,6 +191,16 @@ static bool __doorbell_valid(struct intel_guc *guc, u16 db_id) > return intel_uncore_read(uncore, GEN8_DRBREGL(db_id)) & GEN8_DRB_VALID; > } > > +static bool __doorbell_enabled(struct intel_guc_client *client) > +{ > + struct guc_doorbell_info *doorbell; > + > + GEM_BUG_ON(!has_doorbell(client)); > + > + doorbell = __get_doorbell(client); > + return doorbell->db_status == GUC_DOORBELL_ENABLED; > +} > + > static void __init_doorbell(struct intel_guc_client *client) > { > struct guc_doorbell_info *doorbell; > @@ -1112,6 +1122,19 @@ static void guc_set_default_submission(struct intel_engine_cs *engine) > GEM_BUG_ON(engine->irq_enable || engine->irq_disable); > } > > +bool intel_guc_is_submission_enabled(struct intel_guc *guc) > +{ > + if (!intel_guc_is_submission_supported(guc)) > + return false; > + > + /* > + * Use the 
fact that we enable the guc execbuf_client > + * and its doorbell when enabling GuC submission as a proxy > + * for the latter. > + */ > + return guc->execbuf_client && __doorbell_enabled(guc->execbuf_client); > +} > + > int intel_guc_submission_enable(struct intel_guc *guc) > { > struct intel_gt *gt = guc_to_gt(guc); > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.h b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.h > index 54d716828352..80b18a2c885a 100644 > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.h > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.h > @@ -58,6 +58,7 @@ struct intel_guc_client { > > void intel_guc_submission_init_early(struct intel_guc *guc); > int intel_guc_submission_init(struct intel_guc *guc); > +bool intel_guc_is_submission_enabled(struct intel_guc *guc); > int intel_guc_submission_enable(struct intel_guc *guc); > void intel_guc_submission_disable(struct intel_guc *guc); > void intel_guc_submission_fini(struct intel_guc *guc); > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc.c b/drivers/gpu/drm/i915/gt/uc/intel_uc.c > index 29a9eec60d2e..b2eb340ce87e 100644 > --- a/drivers/gpu/drm/i915/gt/uc/intel_uc.c > +++ b/drivers/gpu/drm/i915/gt/uc/intel_uc.c > @@ -538,7 +538,7 @@ void intel_uc_fini_hw(struct intel_uc *uc) > if (!intel_guc_is_running(guc)) > return; > > - if (intel_uc_supports_guc_submission(uc)) > + if (intel_guc_is_submission_enabled(guc)) > intel_guc_submission_disable(guc); > > if (guc_communication_enabled(guc)) >
Re: [Intel-gfx] [PATCH 2/2] drm/i915/gt: Only unwedge if we can reset first
Hi Chris, On Tuesday, September 10, 2019 12:55:36 AM CEST Chris Wilson wrote: > Unwedging the GPU requires a successful GPU reset before we restore the > default submission, or else we may see residual context switch events > that we were not expecting. > > Reported-by: Janusz Krzysztofik > Signed-off-by: Chris Wilson > Cc: Janusz Krzysztofik > Cc: Daniele Ceraolo Spurio > --- > drivers/gpu/drm/i915/gt/intel_reset.c | 7 ++- > 1 file changed, 6 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c > index fe57296b790c..5242496a893a 100644 > --- a/drivers/gpu/drm/i915/gt/intel_reset.c > +++ b/drivers/gpu/drm/i915/gt/intel_reset.c > @@ -809,6 +809,7 @@ static bool __intel_gt_unset_wedged(struct intel_gt *gt) > struct intel_gt_timelines *timelines = &gt->timelines; > struct intel_timeline *tl; > unsigned long flags; > + bool ok; > > if (!test_bit(I915_WEDGED, &gt->reset.flags)) > return true; > @@ -854,7 +855,11 @@ static bool __intel_gt_unset_wedged(struct intel_gt *gt) > } > spin_unlock_irqrestore(&timelines->lock, flags); > > - intel_gt_sanitize(gt, false); > + ok = false; > + if (!reset_clobbers_display(gt->i915)) > + ok = __intel_gt_reset(gt, ALL_ENGINES) == 0; > + if (!ok) > + return false; Before your change, that code was executed inside intel_gt_sanitize(gt, false) which unfortunately didn't return any result. The same outcome could be achieved by redefining intel_gt_sanitize() to return that result and saying: if (!intel_gt_sanitize(gt, false)) return false; Is there any specific reason for intel_gt_sanitize() returning void? Thanks, Janusz > > /* >* Undo nop_submit_request. We prevent all new i915 requests from >
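The suggestion above, a sanitize step that reports whether the reset worked so that the caller can refuse to unwedge, might look roughly as follows. This is a toy model under stated assumptions: struct gt_model and the model_* functions are hypothetical and do not reflect the real intel_gt_sanitize() or __intel_gt_unset_wedged() implementations.

```c
#include <stdbool.h>

/* Toy model of a GT; the field names are illustrative, not the driver's. */
struct gt_model {
	bool wedged;
	bool reset_works; /* whether the modelled hardware reset succeeds */
};

/* Hypothetical sanitize variant that returns the reset outcome instead of void. */
static bool model_sanitize(const struct gt_model *gt)
{
	return gt->reset_works;
}

/* Unwedge only after a successful reset, otherwise stay wedged. */
static bool model_unset_wedged(struct gt_model *gt)
{
	if (!gt->wedged)
		return true;

	if (!model_sanitize(gt))
		return false;

	gt->wedged = false;
	return true;
}
```

The point of the sketch is only the control flow: once the inner step returns a result, the caller needs a single check rather than an open-coded reset.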
Re: [Intel-gfx] [PATCH] drm/i915: Don't unwedge if reset is disabled
On Monday, September 9, 2019 11:48:42 PM CEST Chris Wilson wrote: > Quoting Chris Wilson (2019-09-07 09:39:52) > > Quoting Daniele Ceraolo Spurio (2019-09-06 23:28:05) > > > > > > > > > On 9/5/19 2:09 AM, Janusz Krzysztofik wrote: > > > > When trying to reset a device with reset capability disabled or not > > > > supported while rings are full of requests, it has been observed when > > > > running in execlists submission mode that command stream buffer tail > > > > tends to be incremented by apparently still running GPU regardless of > > > > all requests being already cancelled and command stream buffer pointers > > > > reset. As a result, kernel panic on NULL pointer dereference occurs > > > > when a trace_ports() helper is called with command stream buffer tail > > > > incremented but request pointers being NULL during final > > > > __intel_gt_set_wedged() operation called from intel_gt_reset(). > > > > > > > > Skip actual reset procedure if reset is disabled or not supported. > > > > > > This last sentence is a bit confusing. You're not skipping the reset > > > procedure, you're skipping the attempt of unwedging and resetting again > > > after a reset & wedge already happened. > > > > Loss of email over the last week, so jumping in at the end. My gut > > response is that this is still just papering over the bug, as what you > > say above makes no sense. > > So my gut response was to the run-on sentence, when all you needed to > say was that without a successful reset prior to calling > reset_default_submission, the engine may still generate CS events out of > the blue. And I think the patch should be written to require the > successful reset. You are right, successful reset seems the only safe protection. But anyway, while digging deeper waiting for your clarification of that gut response ;-) , I've discovered that symptoms from which the issue can be predicted may sometimes be observed during reset_prepare() as failing intel_engine_stop_cs(). 
Checking for that failure alone may be too weak as it can probably happen to succeed regardless of the uncertain hardware status, but anyway, what do you think about modifying reset_prepare() so it may fail with an error propagated from functions it calls, then calling reset_prepare() at the beginning of intel_gt_reset() and skipping over __intel_gt_unset_wedged() and further steps (do_reset(), ..., reset_finish()) if reset_prepare() fails? Wouldn't that be a useful additional layer of protection? If you think the idea is worth considering, please have a look at my first attempt sent to trybot already before your explanation arrived: https://patchwork.freedesktop.org/patch/329840/?series=66447&rev=1 (don't complain about its commit message making no sense, please ;-) ). Thanks, Janusz > -Chris >
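The idea floated above, letting reset_prepare() fail and skipping the unwedge and reset steps when it does, can be sketched as a toy model. Everything here is an assumption for illustration: the struct, the model_* names and the choice of -ETIMEDOUT as the propagated error are hypothetical and are not taken from the driver or from the trybot patch.

```c
#include <errno.h>
#include <stdbool.h>

/* Toy model of a GT; none of these fields exist in the driver under these names. */
struct reset_model {
	bool wedged;
	bool engines_stop; /* does stopping the command streamers succeed? */
	bool did_unwedge;
	bool did_reset;
};

/* Hypothetical reset_prepare() variant that propagates a failure to its caller. */
static int model_reset_prepare(const struct reset_model *gt)
{
	return gt->engines_stop ? 0 : -ETIMEDOUT; /* error code chosen arbitrarily */
}

static int model_gt_reset(struct reset_model *gt)
{
	int err = model_reset_prepare(gt);

	if (err)
		return err; /* leave the wedged state untouched, skip all later steps */

	gt->wedged = false;     /* stands in for __intel_gt_unset_wedged() */
	gt->did_unwedge = true;
	gt->did_reset = true;   /* stands in for do_reset() ... reset_finish() */
	return 0;
}
```

As noted in the message itself, a successful prepare step does not prove the hardware is in a known state, so a check like this would be an extra layer and not a replacement for requiring a successful reset.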
[Intel-gfx] [RFC PATCH 1/1] drm/i915: Split off pci_driver.remove() tail to drm_driver.release()
In order to support driver hot unbind, some cleanup operations, now performed on PCI driver remove, must be called later, after all device file descriptors are closed. Split out those operations from the tail of pci_driver.remove() callback and put them into drm_driver.release() which is called as soon as all references to the driver are put. As a result, those cleanups will be now run on last drm_dev_put(), either still called from pci_driver.remove() if all device file descriptors are already closed, or on last drm_release() file operation. Signed-off-by: Janusz Krzysztofik --- drivers/gpu/drm/i915/i915_drv.c | 17 + drivers/gpu/drm/i915/i915_drv.h | 1 + drivers/gpu/drm/i915/i915_gem.c | 10 +- 3 files changed, 23 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c index 83d2eb9e74cb..8be69f84eb6d 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -738,6 +738,7 @@ static int i915_load_modeset_init(struct drm_device *dev) cleanup_gem: i915_gem_suspend(dev_priv); + i915_gem_fini_hw(dev_priv); i915_gem_fini(dev_priv); cleanup_modeset: intel_modeset_cleanup(dev); @@ -1685,7 +1686,6 @@ static void i915_driver_cleanup_hw(struct drm_i915_private *dev_priv) pci_disable_msi(pdev); pm_qos_remove_request(&dev_priv->pm_qos); - i915_ggtt_cleanup_hw(dev_priv); } /** @@ -1909,6 +1909,7 @@ int i915_driver_load(struct pci_dev *pdev, const struct pci_device_id *ent) out_cleanup_hw: i915_driver_cleanup_hw(dev_priv); + i915_ggtt_cleanup_hw(dev_priv); out_cleanup_mmio: i915_driver_cleanup_mmio(dev_priv); out_runtime_pm_put: @@ -1960,21 +1961,29 @@ void i915_driver_unload(struct drm_device *dev) cancel_delayed_work_sync(&dev_priv->gpu_error.hangcheck_work); i915_reset_error_state(dev_priv); - i915_gem_fini(dev_priv); + i915_gem_fini_hw(dev_priv); intel_power_domains_fini_hw(dev_priv); i915_driver_cleanup_hw(dev_priv); - i915_driver_cleanup_mmio(dev_priv); enable_rpm_wakeref_asserts(dev_priv); - 
intel_runtime_pm_cleanup(dev_priv); } static void i915_driver_release(struct drm_device *dev) { struct drm_i915_private *dev_priv = to_i915(dev); + disable_rpm_wakeref_asserts(dev_priv); + + i915_gem_fini(dev_priv); + + i915_ggtt_cleanup_hw(dev_priv); + i915_driver_cleanup_mmio(dev_priv); + + enable_rpm_wakeref_asserts(dev_priv); + intel_runtime_pm_cleanup(dev_priv); + i915_driver_cleanup_early(dev_priv); i915_driver_destroy(dev_priv); } diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index a2664ea1395b..d08e7bd83544 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -3047,6 +3047,7 @@ void i915_gem_init_mmio(struct drm_i915_private *i915); int __must_check i915_gem_init(struct drm_i915_private *dev_priv); int __must_check i915_gem_init_hw(struct drm_i915_private *dev_priv); void i915_gem_init_swizzling(struct drm_i915_private *dev_priv); +void i915_gem_fini_hw(struct drm_i915_private *dev_priv); void i915_gem_fini(struct drm_i915_private *dev_priv); int i915_gem_wait_for_idle(struct drm_i915_private *dev_priv, unsigned int flags, long timeout); diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 7cafd5612f71..c6a8e665a6ba 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -4667,7 +4667,7 @@ int i915_gem_init(struct drm_i915_private *dev_priv) return ret; } -void i915_gem_fini(struct drm_i915_private *dev_priv) +void i915_gem_fini_hw(struct drm_i915_private *dev_priv) { GEM_BUG_ON(dev_priv->gt.awake); @@ -4681,6 +4681,14 @@ void i915_gem_fini(struct drm_i915_private *dev_priv) intel_uc_fini_hw(dev_priv); intel_uc_fini(dev_priv); intel_engines_cleanup(dev_priv); + mutex_unlock(&dev_priv->drm.struct_mutex); + + i915_gem_drain_freed_objects(dev_priv); +} + +void i915_gem_fini(struct drm_i915_private *dev_priv) +{ + mutex_lock(&dev_priv->drm.struct_mutex); i915_gem_contexts_fini(dev_priv); i915_gem_fini_scratch(dev_priv); 
mutex_unlock(&dev_priv->drm.struct_mutex); -- 2.21.0 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
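The ordering described in the commit message above, with hardware teardown at remove time and the remaining cleanup on the last reference drop, can be illustrated with a small reference-counting model. This is a sketch under stated assumptions, not DRM code: struct dev_model and the model_* functions are hypothetical stand-ins for the pci_driver.remove(), drm_dev_put() and drm_driver.release() interplay.

```c
#include <stdbool.h>

/* Toy device with a reference count; the names are illustrative only. */
struct dev_model {
	int refcount;      /* one reference held by the bound driver, one per open file */
	bool hw_torn_down; /* set by the remove-time part of the cleanup */
	bool sw_released;  /* set by the release-time part of the cleanup */
};

/* Runs once, when the last reference is dropped. */
static void model_release(struct dev_model *d)
{
	d->sw_released = true;
}

static void model_put(struct dev_model *d)
{
	if (--d->refcount == 0)
		model_release(d);
}

static void model_open(struct dev_model *d)
{
	d->refcount++;
}

static void model_close(struct dev_model *d)
{
	model_put(d);
}

/* Remove tears down the hardware side at once and drops the driver's own reference. */
static void model_remove(struct dev_model *d)
{
	d->hw_torn_down = true;
	model_put(d);
}
```

With one file still open, model_remove() leaves sw_released false and the later model_close() triggers the release. With no open files, the release runs from within model_remove() itself. Those are the two cases the commit message names.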
[Intel-gfx] [RFC PATCH 0/1] drm/i915: Split off pci_driver.remove() tail to drm_driver.release()
Hi, I do realize more work needs to be done to get a clean hotunplug solution, however I need your comments to make sure that I'm going in the right direction. So far I have no good idea how to resolve pm_runtime_get_sync() failures on outstanding device file close after successful driver unbind. Thanks, Janusz Janusz Krzysztofik (1): drm/i915: Split off pci_driver.remove() tail to drm_driver.release() drivers/gpu/drm/i915/i915_drv.c | 17 + drivers/gpu/drm/i915/i915_drv.h | 1 + drivers/gpu/drm/i915/i915_gem.c | 10 +- 3 files changed, 23 insertions(+), 5 deletions(-) -- 2.21.0
[Intel-gfx] [PATCH v2] drm/i915: Split off pci_driver.remove() tail to drm_driver.release()
In order to support driver hot unbind, some cleanup operations, now performed on PCI driver remove, must be called later, after all device file descriptors are closed. Split out those operations from the tail of pci_driver.remove() callback and put them into drm_driver.release() which is called as soon as all references to the driver are put. As a result, those cleanups will be now run on last drm_dev_put(), either still called from pci_driver.remove() if all device file descriptors are already closed, or on last drm_release() file operation. Signed-off-by: Janusz Krzysztofik Reviewed-by: Chris Wilson --- Changelog: v1 -> v2: - defer intel_engines_cleanup() as well. (Chris) drivers/gpu/drm/i915/i915_drv.c | 17 + drivers/gpu/drm/i915/i915_drv.h | 1 + drivers/gpu/drm/i915/i915_gem.c | 10 +- 3 files changed, 23 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c index 83d2eb9e74cb..8be69f84eb6d 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -738,6 +738,7 @@ static int i915_load_modeset_init(struct drm_device *dev) cleanup_gem: i915_gem_suspend(dev_priv); + i915_gem_fini_hw(dev_priv); i915_gem_fini(dev_priv); cleanup_modeset: intel_modeset_cleanup(dev); @@ -1685,7 +1686,6 @@ static void i915_driver_cleanup_hw(struct drm_i915_private *dev_priv) pci_disable_msi(pdev); pm_qos_remove_request(&dev_priv->pm_qos); - i915_ggtt_cleanup_hw(dev_priv); } /** @@ -1909,6 +1909,7 @@ int i915_driver_load(struct pci_dev *pdev, const struct pci_device_id *ent) out_cleanup_hw: i915_driver_cleanup_hw(dev_priv); + i915_ggtt_cleanup_hw(dev_priv); out_cleanup_mmio: i915_driver_cleanup_mmio(dev_priv); out_runtime_pm_put: @@ -1960,21 +1961,29 @@ void i915_driver_unload(struct drm_device *dev) cancel_delayed_work_sync(&dev_priv->gpu_error.hangcheck_work); i915_reset_error_state(dev_priv); - i915_gem_fini(dev_priv); + i915_gem_fini_hw(dev_priv); intel_power_domains_fini_hw(dev_priv); 
i915_driver_cleanup_hw(dev_priv); - i915_driver_cleanup_mmio(dev_priv); enable_rpm_wakeref_asserts(dev_priv); - intel_runtime_pm_cleanup(dev_priv); } static void i915_driver_release(struct drm_device *dev) { struct drm_i915_private *dev_priv = to_i915(dev); + disable_rpm_wakeref_asserts(dev_priv); + + i915_gem_fini(dev_priv); + + i915_ggtt_cleanup_hw(dev_priv); + i915_driver_cleanup_mmio(dev_priv); + + enable_rpm_wakeref_asserts(dev_priv); + intel_runtime_pm_cleanup(dev_priv); + i915_driver_cleanup_early(dev_priv); i915_driver_destroy(dev_priv); } diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index a2664ea1395b..d08e7bd83544 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -3047,6 +3047,7 @@ void i915_gem_init_mmio(struct drm_i915_private *i915); int __must_check i915_gem_init(struct drm_i915_private *dev_priv); int __must_check i915_gem_init_hw(struct drm_i915_private *dev_priv); void i915_gem_init_swizzling(struct drm_i915_private *dev_priv); +void i915_gem_fini_hw(struct drm_i915_private *dev_priv); void i915_gem_fini(struct drm_i915_private *dev_priv); int i915_gem_wait_for_idle(struct drm_i915_private *dev_priv, unsigned int flags, long timeout); diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 7cafd5612f71..20d3f7532cef 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -4667,7 +4667,7 @@ int i915_gem_init(struct drm_i915_private *dev_priv) return ret; } -void i915_gem_fini(struct drm_i915_private *dev_priv) +void i915_gem_fini_hw(struct drm_i915_private *dev_priv) { GEM_BUG_ON(dev_priv->gt.awake); @@ -4680,6 +4680,14 @@ void i915_gem_fini(struct drm_i915_private *dev_priv) mutex_lock(&dev_priv->drm.struct_mutex); intel_uc_fini_hw(dev_priv); intel_uc_fini(dev_priv); + mutex_unlock(&dev_priv->drm.struct_mutex); + + i915_gem_drain_freed_objects(dev_priv); +} + +void i915_gem_fini(struct drm_i915_private *dev_priv) +{ + 
mutex_lock(&dev_priv->drm.struct_mutex); intel_engines_cleanup(dev_priv); i915_gem_contexts_fini(dev_priv); i915_gem_fini_scratch(dev_priv); -- 2.21.0 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH i-g-t v3] lib/i915/gem_mman: Add a helper for obtaining mappable aperture version
From: Janusz Krzysztofik If a test calls a function which depends on availability of a supported mappable aperture, an error may be reported by the kernel on unsupported hardware. That may negatively affect results reported by a test framework even if that test ignores the failure and succeeds. This helper wraps an IOCTL call which returns a version number of a mappable aperture. It may be used by tests which need to adjust their scope depending on availability of specific version of mappable aperture. Signed-off-by: Janusz Krzysztofik Cc: Antonio Argenziano Cc: Michal Wajdeczko --- Changelog: v2 (internal) -> v3: - make the code less obscure, more explicit (Antonio), - reword the helper documentation and commit message. v1 (internal) -> v2 (internal): - minimize future potential conflicts with https://patchwork.freedesktop.org/patch/294053/?series=58551&rev=1 (no progress with that one so not waiting for it any longer): - convert the helper to a drop-in replacement of the one from the above mentioned patch, returning mappable aperture version, not only information on its availability, - drop any other wrappers, - document the helper, - reword commit message. lib/i915/gem_mman.c | 22 ++ lib/i915/gem_mman.h | 1 + 2 files changed, 23 insertions(+) diff --git a/lib/i915/gem_mman.c b/lib/i915/gem_mman.c index 3cf9a6bb..3a3f3e5c 100644 --- a/lib/i915/gem_mman.c +++ b/lib/i915/gem_mman.c @@ -40,6 +40,28 @@ #define VG(x) do {} while (0) #endif +/** + * gem_mmap__gtt_version: + * @fd: open i915 drm file descriptor + * + * This function wraps up an IOCTL to obtain mappable aperture version. + * + * Returns: mappable aperture version, -1 on failure. 
+ */ +int gem_mmap__gtt_version(int fd) +{ + int gtt_version, ret; + struct drm_i915_getparam gp = { + .param = I915_PARAM_MMAP_GTT_VERSION, + .value = &gtt_version, + }; + + ret = ioctl(fd, DRM_IOCTL_I915_GETPARAM, &gp); + if (ret == 0) + ret = gtt_version; + return ret; +} + /** * __gem_mmap__gtt: * @fd: open i915 drm file descriptor diff --git a/lib/i915/gem_mman.h b/lib/i915/gem_mman.h index f7242ed7..ab12e566 100644 --- a/lib/i915/gem_mman.h +++ b/lib/i915/gem_mman.h @@ -25,6 +25,7 @@ #ifndef GEM_MMAN_H #define GEM_MMAN_H +int gem_mmap__gtt_version(int fd); void *gem_mmap__gtt(int fd, uint32_t handle, uint64_t size, unsigned prot); void *gem_mmap__cpu(int fd, uint32_t handle, uint64_t offset, uint64_t size, unsigned prot); -- 2.21.0
Re: [Intel-gfx] [igt-dev] [PATCH i-g-t v3] lib/i915/gem_mman: Add a helper for obtaining mappable aperture version
Hi Chris, On Friday, May 31, 2019 10:39:47 AM CEST Chris Wilson wrote: > Quoting Janusz Krzysztofik (2019-05-31 09:33:38) > > From: Janusz Krzysztofik > > > > If a test calls a function which depends on availabiblity of a > > supported mappable aperture, an error may be reported by the kernel on > > unsupported hardware. That may negatively affect results reported by a > > test framework even if that test ignores the failure and succeedes. > > > > This helper wraps an IOCTL call which returns a version number of a > > mappable aperture. It may be used by tests which need to adjust their > > scope depending on availability of specific version of mappable > > aperture. > > > > Signed-off-by: Janusz Krzysztofik > > Cc: Antonio Argenziano > > Cc: Michal Wajdeczko > > --- > > Changelog: > > v2 (internal) -> v3: > > - make the code less obsucre, more explicit (Antonio), > > - reword the helper documentation and commit message. > > > > v1 (internal) -> v2 (internal): > > - minimize future potential conflicts with > > https://patchwork.freedesktop.org/patch/294053/?series=58551&rev=1 > > (no progress with than one so not waiting for it any longer): > > - convert the helper to a drop-in replacement of the one from the > > above mentioned patch, returning mappable aperture version, not > > only information on its availability, > > - drop any other wrappers, > > - document the helper, > > - reword commit message. > > > > lib/i915/gem_mman.c | 22 ++ > > lib/i915/gem_mman.h | 1 + > > 2 files changed, 23 insertions(+) > > > > diff --git a/lib/i915/gem_mman.c b/lib/i915/gem_mman.c > > index 3cf9a6bb..3a3f3e5c 100644 > > --- a/lib/i915/gem_mman.c > > +++ b/lib/i915/gem_mman.c > > @@ -40,6 +40,28 @@ > > #define VG(x) do {} while (0) > > #endif > > > > +/** > > + * gem_mmap__gtt_version: > > + * @fd: open i915 drm file descriptor > > + * > > + * This functions wraps up an IOCTL to obtain mappable aperture version. > > + * > > + * Returns: mappable aperture version, -1 on failure. 
> > + */ > > +int gem_mmap__gtt_version(int fd) > > +{ > > + int gtt_version, ret; > > + struct drm_i915_getparam gp = { > > + .param = I915_PARAM_MMAP_GTT_VERSION, > > + .value = &gtt_version, > > + }; > > + > > + ret = ioctl(fd, DRM_IOCTL_I915_GETPARAM, &gp); > > + if (ret == 0) > > + ret = gtt_version; > > + return ret; > > Maybe the actual error returned by the kernel and not glibc would be > interesting in the future? errno is not overwritten by the helper so it is available to IGT after it is called and actually reported when a call to the helper is wrapped with igt_require(). Do we need more? Thanks, Janusz > -Chris >
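The two error-reporting styles touched on in this exchange can be compared in a short sketch: returning -1 and leaving errno as set by ioctl(), or folding the code into the return value as -errno. The functions below are hypothetical and use a placeholder request number, not the real DRM_IOCTL_I915_GETPARAM, so they only demonstrate the convention and do not query any driver.

```c
#include <errno.h>
#include <sys/ioctl.h>

/* Placeholder request number for illustration, NOT the real DRM_IOCTL_I915_GETPARAM. */
#define MODEL_REQUEST 0UL

/* Style 1: return the value, or -1 with errno left exactly as ioctl() set it. */
static int model_query(int fd)
{
	int value = 0;
	int ret = ioctl(fd, MODEL_REQUEST, &value);

	if (ret == 0)
		ret = value;
	return ret;
}

/* Style 2: fold the kernel's error code into the return value as -errno. */
static int model_query_errno(int fd)
{
	int value = 0;

	if (ioctl(fd, MODEL_REQUEST, &value))
		return -errno;
	return value;
}
```

Called with an invalid descriptor such as -1, the first style returns -1 and the caller can still read EBADF from errno, provided nothing in between overwrites it. The second style carries the code in the return value, which stays valid regardless of later library calls.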
Re: [Intel-gfx] [igt-dev] [PATCH i-g-t v3] lib/i915/gem_mman: Add a helper for obtaining mappable aperture version
On Friday, May 31, 2019 10:41:36 AM CEST Chris Wilson wrote: > Quoting Janusz Krzysztofik (2019-05-31 09:33:38) > > From: Janusz Krzysztofik > > This is nothing to do with the mappable aperture version. This is the > nee MMAP_GTT interface version. > -Chris > Sorry for my ignorance, I'll reword it. Thanks, Janusz
Re: [Intel-gfx] [igt-dev] [PATCH i-g-t v3] lib/i915/gem_mman: Add a helper for obtaining mappable aperture version
On Friday, May 31, 2019 10:55:46 AM CEST Chris Wilson wrote: > Quoting Janusz Krzysztofik (2019-05-31 09:53:41) > > Hi Chris, > > > > On Friday, May 31, 2019 10:39:47 AM CEST Chris Wilson wrote: > > > Quoting Janusz Krzysztofik (2019-05-31 09:33:38) > > > > From: Janusz Krzysztofik > > > > > > > > If a test calls a function which depends on availabiblity of a > > > > supported mappable aperture, an error may be reported by the kernel on > > > > unsupported hardware. That may negatively affect results reported by a > > > > test framework even if that test ignores the failure and succeedes. > > > > > > > > This helper wraps an IOCTL call which returns a version number of a > > > > mappable aperture. It may be used by tests which need to adjust their > > > > scope depending on availability of specific version of mappable > > > > aperture. > > > > > > > > Signed-off-by: Janusz Krzysztofik > > > > Cc: Antonio Argenziano > > > > Cc: Michal Wajdeczko > > > > --- > > > > Changelog: > > > > v2 (internal) -> v3: > > > > - make the code less obsucre, more explicit (Antonio), > > > > - reword the helper documentation and commit message. > > > > > > > > v1 (internal) -> v2 (internal): > > > > - minimize future potential conflicts with > > > > https://patchwork.freedesktop.org/patch/294053/?series=58551&rev=1 > > > > (no progress with than one so not waiting for it any longer): > > > > - convert the helper to a drop-in replacement of the one from the > > > > above mentioned patch, returning mappable aperture version, not > > > > only information on its availability, > > > > - drop any other wrappers, > > > > - document the helper, > > > > - reword commit message. 
> > > > > > > > lib/i915/gem_mman.c | 22 ++ > > > > lib/i915/gem_mman.h | 1 + > > > > 2 files changed, 23 insertions(+) > > > > > > > > diff --git a/lib/i915/gem_mman.c b/lib/i915/gem_mman.c > > > > index 3cf9a6bb..3a3f3e5c 100644 > > > > --- a/lib/i915/gem_mman.c > > > > +++ b/lib/i915/gem_mman.c > > > > @@ -40,6 +40,28 @@ > > > > #define VG(x) do {} while (0) > > > > #endif > > > > > > > > +/** > > > > + * gem_mmap__gtt_version: > > > > + * @fd: open i915 drm file descriptor > > > > + * > > > > + * This functions wraps up an IOCTL to obtain mappable aperture version. > > > > + * > > > > + * Returns: mappable aperture version, -1 on failure. > > > > + */ > > > > +int gem_mmap__gtt_version(int fd) > > > > +{ > > > > + int gtt_version, ret; > > > > + struct drm_i915_getparam gp = { > > > > + .param = I915_PARAM_MMAP_GTT_VERSION, > > > > + .value = &gtt_version, > > > > + }; > > > > + > > > > + ret = ioctl(fd, DRM_IOCTL_I915_GETPARAM, &gp); > > > > + if (ret == 0) > > > > + ret = gtt_version; > > > > + return ret; > > > > > > Maybe the actual error returned by the kernel and not glibc would be > > > interesting in the future? > > > > errno is not overwritten by the helper so it is available to IGT after it > > is called and actually reported when a call to the helper is wrapped with > > igt_require(). Do we need more? > > Yes, we typically return the error and do not use errno. Imagine if we > just replaced ioctl() with syscall() :) OK. I'll fix it. Thanks, Janusz > -Chris >
[Intel-gfx] [PATCH i-g-t v4] lib/i915/gem_mman: Add a helper for obtaining MMAP_GTT interface version
From: Janusz Krzysztofik

If a test calls a function which depends on availability of a specific
version of MMAP_GTT interface, an error may occur on unsupported hardware.
That may negatively affect results reported by a test framework even if
that test ignores the failure and succeeds.

This helper wraps up an IOCTL call which returns a version number of
MMAP_GTT interface. It may be used by tests which should adjust their
scope depending on availability of a specific version of MMAP_GTT
interface.

Signed-off-by: Janusz Krzysztofik
Cc: Antonio Argenziano
Cc: Michal Wajdeczko
---
Changelog:
v3 -> v4:
- return errno value on failure (Chris - thanks!),
- clear errno before return, as other helpers do,
- reword the helper documentation and commit message again (Chris -
  thanks!).

v2 (internal) -> v3:
- make the code less obscure, more explicit (Antonio - thanks!),
- reword the helper documentation and commit message.

v1 (internal) -> v2 (internal):
- minimize future potential conflicts with
  https://patchwork.freedesktop.org/patch/294053/?series=58551&rev=1
  (no progress with that one so not waiting for it any longer):
  - convert the helper to a drop-in replacement of the one from the
    above mentioned patch, returning mappable aperture version, not
    only information on its availability,
  - drop any other wrappers,
- document the helper,
- reword commit message.

 lib/i915/gem_mman.c | 25 +++++++++++++++++++++++++
 lib/i915/gem_mman.h |  1 +
 2 files changed, 26 insertions(+)

diff --git a/lib/i915/gem_mman.c b/lib/i915/gem_mman.c
index 3cf9a6bb..2c3d6971 100644
--- a/lib/i915/gem_mman.c
+++ b/lib/i915/gem_mman.c
@@ -40,6 +40,31 @@
 #define VG(x) do {} while (0)
 #endif
 
+/**
+ * gem_mmap__gtt_version:
+ * @fd: open i915 drm file descriptor
+ *
+ * This functions wraps up an IOCTL to obtain MMAP_GTT interface version
+ *
+ * Returns: MMAP_GTT interface version, kernel error code on failure.
+ */
+int gem_mmap__gtt_version(int fd)
+{
+	int gtt_version, ret;
+	struct drm_i915_getparam gp = {
+		.param = I915_PARAM_MMAP_GTT_VERSION,
+		.value = &gtt_version,
+	};
+
+	if (ioctl(fd, DRM_IOCTL_I915_GETPARAM, &gp))
+		ret = errno;
+	else
+		ret = gtt_version;
+
+	errno = 0;
+	return ret;
+}
+
 /**
  * __gem_mmap__gtt:
  * @fd: open i915 drm file descriptor
diff --git a/lib/i915/gem_mman.h b/lib/i915/gem_mman.h
index f7242ed7..ab12e566 100644
--- a/lib/i915/gem_mman.h
+++ b/lib/i915/gem_mman.h
@@ -25,6 +25,7 @@
 #ifndef GEM_MMAN_H
 #define GEM_MMAN_H
 
+int gem_mmap__gtt_version(int fd);
 void *gem_mmap__gtt(int fd, uint32_t handle, uint64_t size, unsigned prot);
 void *gem_mmap__cpu(int fd, uint32_t handle, uint64_t offset,
 		    uint64_t size, unsigned prot);
-- 
2.21.0
Re: [Intel-gfx] [PATCH i-g-t v4] lib/i915/gem_mman: Add a helper for obtaining MMAP_GTT interface version
On Friday, May 31, 2019 11:35:39 AM CEST Chris Wilson wrote: > Quoting Janusz Krzysztofik (2019-05-31 10:29:16) > > From: Janusz Krzysztofik > > > > If a test calls a function which depends on availability of a specific > > version of MMAP_GTT interface, an error may occur on unsupported hardware. > > That may negatively affect results reported by a test framework even if > > that test ignores the failure and succeedes. > > > > This helper wraps up an IOCTL call which returns a version number of > > MMAP_GTT interface. It may be used by tests which should adjust their > > scope depending on availability of a specific version of MMAP_GTT > > interface. > > > > Signed-off-by: Janusz Krzysztofik > > Cc: Antonio Argenziano > > Cc: Michal Wajdeczko > > --- > > Changelog: > > v3 -> v4: > > - return errno value on failure (Chris - thanks!), > > - clear errno before return, as other helpers do, > > - reword the helper documentation and commit message again (Chris - > > thanks!). > > > > v2 (internal) -> v3: > > - make the code less obsucre, more explicit (Antonio - thanks!), > > - reword the helper documentation and commit message. > > > > v1 (internal) -> v2 (internal): > > - minimize future potential conflicts with > > https://patchwork.freedesktop.org/patch/294053/?series=58551&rev=1 > > (no progress with than one so not waiting for it any longer): > > - convert the helper to a drop-in replacement of the one from the > > above mentioned patch, returning mappable aperture version, not > > only information on its availability, > > - drop any other wrappers, > > - document the helper, > > - reword commit message. 
> > > > lib/i915/gem_mman.c | 25 + > > lib/i915/gem_mman.h | 1 + > > 2 files changed, 26 insertions(+) > > > > diff --git a/lib/i915/gem_mman.c b/lib/i915/gem_mman.c > > index 3cf9a6bb..2c3d6971 100644 > > --- a/lib/i915/gem_mman.c > > +++ b/lib/i915/gem_mman.c > > @@ -40,6 +40,31 @@ > > #define VG(x) do {} while (0) > > #endif > > > > +/** > > + * gem_mmap__gtt_version: > > + * @fd: open i915 drm file descriptor > > + * > > + * This functions wraps up an IOCTL to obtain MMAP_GTT interface version > > + * > > + * Returns: MMAP_GTT interface version, kernel error code on failure. > > + */ > > +int gem_mmap__gtt_version(int fd) > > +{ > > + int gtt_version, ret; > > + struct drm_i915_getparam gp = { > > + .param = I915_PARAM_MMAP_GTT_VERSION, > > + .value = &gtt_version, > > + }; > > + > > + if (ioctl(fd, DRM_IOCTL_I915_GETPARAM, &gp)) > > + ret = errno; > > ret = -errno; :) Sorry. > Petri also like it when we then say igt_assume(ret); > > Or one could use > > { > int result = -EIO; > struct ... gp = { > .param = I915_PARAM_MMAP_GTT_VERSION, > .value = &result, > }; > > if (ioctl(fd, DRM_IOCTL_I915_GETPARAM, &gp)) { > result = -errno; > igt_assume(result); OK, I'll learn what igt_assume() is first then use it. Thanks, Janusz > } > > errno = 0; > return result; > } > > Now just put it to use somewhere. > -Chris >
[Intel-gfx] [PATCH i-g-t v5] lib/i915/gem_mman: Add a helper for obtaining MMAP_GTT interface version
From: Janusz Krzysztofik

If a test calls a function which depends on availability of a specific
version of MMAP_GTT interface, an error may occur on unsupported hardware.
That may negatively affect results reported by a test framework even if
that test ignores the failure and succeeds.

This helper wraps up an IOCTL call which returns a version number of
MMAP_GTT interface. It may be used by tests which should adjust their
scope depending on availability of a specific version of MMAP_GTT
interface.

Signed-off-by: Janusz Krzysztofik
Cc: Antonio Argenziano
Cc: Michal Wajdeczko
---
Changelog:
v4 -> v5:
- change sign of errno before it is returned (Chris - thanks!),
- validate -errno with igt_assume() (Chris - thanks!),
- follow coding style suggested by Chris - thanks!

To be honest, I think Chris should be somehow officially credited in the
commit tags for his contributions but I'm not sure how. Would a
Suggested-by: clause be OK, or Co-developed-by: maybe?

v3 -> v4:
- return errno value on failure (Chris - thanks!),
- clear errno before return, as other helpers do,
- reword the helper documentation and commit message again (Chris -
  thanks!).

v2 (internal) -> v3:
- make the code less obscure, more explicit (Antonio - thanks!),
- reword the helper documentation and commit message.

v1 (internal) -> v2 (internal):
- minimize future potential conflicts with
  https://patchwork.freedesktop.org/patch/294053/?series=58551&rev=1
  (no progress with that one so not waiting for it any longer):
  - convert the helper to a drop-in replacement of the one from the
    above mentioned patch, returning mappable aperture version, not
    only information on its availability,
  - drop any other wrappers,
- document the helper,
- reword commit message.

 lib/i915/gem_mman.c | 25 +++++++++++++++++++++++++
 lib/i915/gem_mman.h |  1 +
 2 files changed, 26 insertions(+)

diff --git a/lib/i915/gem_mman.c b/lib/i915/gem_mman.c
index 3cf9a6bb..27c437da 100644
--- a/lib/i915/gem_mman.c
+++ b/lib/i915/gem_mman.c
@@ -40,6 +40,31 @@
 #define VG(x) do {} while (0)
 #endif
 
+/**
+ * gem_mmap__gtt_version:
+ * @fd: open i915 drm file descriptor
+ *
+ * This functions wraps up an IOCTL to obtain MMAP_GTT interface version
+ *
+ * Returns: MMAP_GTT interface version, kernel error code on failure.
+ */
+int gem_mmap__gtt_version(int fd)
+{
+	int result = -EIO;
+	struct drm_i915_getparam gp = {
+		.param = I915_PARAM_MMAP_GTT_VERSION,
+		.value = &result,
+	};
+
+	if (ioctl(fd, DRM_IOCTL_I915_GETPARAM, &gp)) {
+		result = -errno;
+		igt_assume(result);
+	}
+
+	errno = 0;
+	return result;
+}
+
 /**
  * __gem_mmap__gtt:
  * @fd: open i915 drm file descriptor
diff --git a/lib/i915/gem_mman.h b/lib/i915/gem_mman.h
index f7242ed7..ab12e566 100644
--- a/lib/i915/gem_mman.h
+++ b/lib/i915/gem_mman.h
@@ -25,6 +25,7 @@
 #ifndef GEM_MMAN_H
 #define GEM_MMAN_H
 
+int gem_mmap__gtt_version(int fd);
 void *gem_mmap__gtt(int fd, uint32_t handle, uint64_t size, unsigned prot);
 void *gem_mmap__cpu(int fd, uint32_t handle, uint64_t offset,
 		    uint64_t size, unsigned prot);
-- 
2.21.0
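The error-reporting convention this thread converges on (return a negated errno from the helper, then clear errno) can be tried outside IGT. What follows is only a minimal sketch, not the IGT helper: query_int_param() is a hypothetical name, and it takes an arbitrary ioctl request number rather than filling a struct drm_i915_getparam, so that it builds without the DRM headers.

```c
#include <errno.h>
#include <sys/ioctl.h>

/*
 * Hypothetical stand-in for a getparam-style helper (not part of IGT).
 * Returns the value written by the ioctl on success, or a negative
 * errno code on failure. errno is cleared before returning, so callers
 * are expected to look at the return value only.
 */
static int query_int_param(int fd, unsigned long request)
{
	int result = -EIO;	/* stays -EIO if the ioctl succeeds without writing */

	if (ioctl(fd, request, &result))
		result = -errno;	/* negative, so never confused with a version */

	errno = 0;
	return result;
}
```

Because failures are always negative, a caller can compare the result directly against a required minimum version, without consulting errno afterwards.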
Re: [Intel-gfx] [RFC PATCH 1/1] drm/i915: Split off pci_driver.remove() tail to drm_driver.release()
On Monday, June 3, 2019 9:28:18 AM CEST Daniel Vetter wrote: > On Thu, May 30, 2019 at 10:40:09AM +0100, Chris Wilson wrote: > > Quoting Janusz Krzysztofik (2019-05-30 10:24:26) > > > In order to support driver hot unbind, some cleanup operations, now > > > performed on PCI driver remove, must be called later, after all device > > > file descriptors are closed. > > > > > > Split out those operations from the tail of pci_driver.remove() > > > callback and put them into drm_driver.release() which is called as soon > > > as all references to the driver are put. As a result, those cleanups > > > will be now run on last drm_dev_put(), either still called from > > > pci_driver.remove() if all device file descriptors are already closed, > > > or on last drm_release() file operation. > > > > > > Signed-off-by: Janusz Krzysztofik > > > --- > > > drivers/gpu/drm/i915/i915_drv.c | 17 + > > > drivers/gpu/drm/i915/i915_drv.h | 1 + > > > drivers/gpu/drm/i915/i915_gem.c | 10 +- > > > 3 files changed, 23 insertions(+), 5 deletions(-) > > > > > > diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/ i915_drv.c > > > index 83d2eb9e74cb..8be69f84eb6d 100644 > > > --- a/drivers/gpu/drm/i915/i915_drv.c > > > +++ b/drivers/gpu/drm/i915/i915_drv.c > > > @@ -738,6 +738,7 @@ static int i915_load_modeset_init(struct drm_device *dev) > > > > > > cleanup_gem: > > > i915_gem_suspend(dev_priv); > > > + i915_gem_fini_hw(dev_priv); > > > i915_gem_fini(dev_priv); > > > cleanup_modeset: > > > intel_modeset_cleanup(dev); > > > @@ -1685,7 +1686,6 @@ static void i915_driver_cleanup_hw(struct drm_i915_private *dev_priv) > > > pci_disable_msi(pdev); > > > > > > pm_qos_remove_request(&dev_priv->pm_qos); > > > - i915_ggtt_cleanup_hw(dev_priv); > > > } > > > > > > /** > > > @@ -1909,6 +1909,7 @@ int i915_driver_load(struct pci_dev *pdev, const struct pci_device_id *ent) > > > > Would it make sense to rename load/unload from the legacy drm stubs over > > to match the pci entry points? 
> > +1 on that rename, load/unload is really terribly confusing and has > horrible semantics in the dri1 shadow attach world ... > -Daniel I've not responded to that comment, sorry, but I agree too. I've assumed that's a candidate for a separate patch or series. I'm willing to work on that as time permits. Thanks, Janusz > > > > > out_cleanup_hw: > > > i915_driver_cleanup_hw(dev_priv); > > > + i915_ggtt_cleanup_hw(dev_priv); > > > out_cleanup_mmio: > > > i915_driver_cleanup_mmio(dev_priv); > > > out_runtime_pm_put: > > > @@ -1960,21 +1961,29 @@ void i915_driver_unload(struct drm_device *dev) > > > cancel_delayed_work_sync(&dev_priv->gpu_error.hangcheck_work); > > > i915_reset_error_state(dev_priv); > > > > > > - i915_gem_fini(dev_priv); > > > + i915_gem_fini_hw(dev_priv); > > > > > > intel_power_domains_fini_hw(dev_priv); > > > > > > i915_driver_cleanup_hw(dev_priv); > > > - i915_driver_cleanup_mmio(dev_priv); > > > > > > enable_rpm_wakeref_asserts(dev_priv); > > > - intel_runtime_pm_cleanup(dev_priv); > > > } > > > > > > static void i915_driver_release(struct drm_device *dev) > > > { > > > struct drm_i915_private *dev_priv = to_i915(dev); > > > > > > + disable_rpm_wakeref_asserts(dev_priv); > > > + > > > + i915_gem_fini(dev_priv); > > > + > > > + i915_ggtt_cleanup_hw(dev_priv); > > > + i915_driver_cleanup_mmio(dev_priv); > > > + > > > + enable_rpm_wakeref_asserts(dev_priv); > > > + intel_runtime_pm_cleanup(dev_priv); > > > > We should really propagate the release nomenclature down and replace our > > mixed fini/cleanup. Consistency is helpful when trying to work out which > > phase the code is in. > > > > > i915_driver_cleanup_early(dev_priv); > > > i915_driver_destroy(dev_priv); > > > } > > > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/ i915_drv.h > > > index a2664ea1395b..d08e7bd83544 100644 > > > --- a/drivers/gpu/drm/i915/i915_d
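The split discussed in this thread (hardware teardown at PCI remove time, final cleanup when the last device reference is put) can be modelled with a plain reference counter. The sketch below is an illustration under simplified assumptions, not i915 or DRM code; struct fake_dev and all function names are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical device model; names are illustrative, not i915 code. */
struct fake_dev {
	int refcount;		/* one for the bound driver, one per open file */
	bool hw_removed;	/* set by the remove phase */
	bool released;		/* set by the release phase */
};

/* Release phase: runs exactly once, when the last reference is dropped. */
static void fake_dev_release(struct fake_dev *dev)
{
	dev->released = true;
}

static void fake_dev_put(struct fake_dev *dev)
{
	if (--dev->refcount == 0)
		fake_dev_release(dev);
}

/* Remove phase: hardware teardown, then drop the driver's own reference. */
static void fake_pci_remove(struct fake_dev *dev)
{
	dev->hw_removed = true;
	fake_dev_put(dev);
}
```

With one file descriptor still open (refcount of 2), fake_pci_remove() completes the remove phase only; the release phase runs later, on the final fake_dev_put() that stands in for closing the file.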
[Intel-gfx] [PATCH i-g-t v11 1/1] tests: Add a new test for device hot unplug
From: Janusz Krzysztofik There is a test which verifies unloading of i915 driver module but no test exists that checks how a driver behaves when it gets unbound from a device or when the device gets unplugged. Provide such a test using sysfs interface. Two minimalistic subtests - "unbind-rebind" and "unplug-rescan" - perform desired operations on a DRM device which is believed to be not in use. A subtest named "drm_open-hotunplug" unplugs a DRM device while keeping a file descriptor open. Changelog: v2: - run a subprocess with dummy_load instead of external command (Antonio). v3: - run dummy_load from the test process directly (Antonio). v4: - run dummy_load from inside subtests (Antonio). v5: - try to restore the device to a working state after each subtest (Petri, Daniel). v6: - run workload inside an igt helper subprocess so resources consumed by the workload are cleaned up automatically on workload subprocess crash, without affecting test results, - move the igt helper with workload back from subtests to initial fixture so workload crash also does not affect test results, - other cleanups suggested by Katarzyna and Chris. v7: - no changes. 
v8: - move workload functions back from fixture to subtests, - register different actions and different workloads in respective tables and iterate over those tables while enumerating subtests, - introduce new subtest flavors by simply omitting module unload step, - instead of simply requesting bus rescan or not, introduce action specific device recovery helpers, required specifically with those new subtests not touching the module, - split workload functions in two parts, one spawning the workload, the other waiting for its completion, - for the new subtests not requiring module unload, run workload functions directly from the test process and use new workload completion wait functions in place of subprocess completion wait, - take more control over logging, longjumps and exit codes in workload subprocesses, - add some debug messages for easy progress watching, - move function API descriptions on top of respective typedefs. v9: All changes after Daniel's comments - thanks! - flatten the code, don't try to create a midlayer (Daniel), - provide minimal subtests that even don't keep device open (Daniel), - don't use driver unbind in more advanced subtests (Daniel), - provide subtests with different level of resources allocated during device unplug (Daniel), - provide subtests which check driver behavior after device hot unplug (Daniel). v10: - rename variables and function arguments to something that indicates they're file descriptors (Daniel), - introduce a data structure that contains various file descriptors and a helper function to set them all (Daniel), - fix strange indenting (Daniel), - limit scope to first three subtests as the first set of tests to merge (Daniel). v11: - fix typos in some comments, - use SPDX license identifier, - include a per-patch changelog in the commit message (Daniel). 
Cc: Antonio Argenziano Cc: Petri Latvala Cc: Daniel Vetter Cc: Katarzyna Dec Cc: Chris Wilson Cc: Michał Wajdeczko Signed-off-by: Janusz Krzysztofik --- tests/Makefile.sources | 1 + tests/core_hotunplug.c | 222 + tests/meson.build | 1 + 3 files changed, 224 insertions(+) create mode 100644 tests/core_hotunplug.c diff --git a/tests/Makefile.sources b/tests/Makefile.sources index 027ed82f..3f24265f 100644 --- a/tests/Makefile.sources +++ b/tests/Makefile.sources @@ -17,6 +17,7 @@ TESTS_progs = \ core_getclient \ core_getstats \ core_getversion \ + core_hotunplug \ core_setmaster_vs_auth \ debugfs_test \ drm_import_export \ diff --git a/tests/core_hotunplug.c b/tests/core_hotunplug.c new file mode 100644 index ..d36a0572 --- /dev/null +++ b/tests/core_hotunplug.c @@ -0,0 +1,222 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright © 2019 Intel Corporation + */ + +#include "igt.h" +#include "igt_device.h" +#include "igt_dummyload.h" +#include "igt_kmod.h" +#include "igt_sysfs.h" + +#include +#include +#include + +struct hotunplug { + int chipset; + struct { + int drm; + int sysfs_dev; + int sysfs_bus; + } fd; +}; + +/* Helpers */ + +static void prepare(struct hotunplug *priv) +{ + /* open the driver */ + priv->fd.drm = __drm_open_driver(priv->chipset); + igt_assert(priv->fd.drm >= 0); + + /* prepare for device unplug */ + priv->fd.sysfs_dev = igt_sysfs_open(priv->fd.drm); + igt_assert(priv->fd.sysfs_dev >= 0); + + /* prepare for bus rescan */ + priv->fd.sysfs_bus = openat(priv->fd.sysfs_dev, "device/subsystem", + O_DIREC
[Intel-gfx] [PATCH i-g-t v11 0/1] tests: Add a new test for device hot unplug
The test should help resolve driver bugs which exhibit themselves when a device is unplugged or a driver is unbound from a device while the device is busy (different from simple module unload which requires device references being put first). A kernel patch resolving kernel panics on driver hot unbind [1] was verified on trybot with v10 of this test before it was submitted upstream. The current version (v11) has also been tested on trybot with the kernel patch already included upstream. Hence, no kernel panics are expected, however some kernel WARNs and driver error messages may still need to be resolved before CI is happy with this new test. [1] https://cgit.freedesktop.org/drm/drm-tip/commit/?id=47bc28d7ee6d8378ba4451c43885cb3241302243 Janusz Krzysztofik (1): tests: Add a new test for device hot unplug tests/Makefile.sources | 1 + tests/core_hotunplug.c | 222 + tests/meson.build | 1 + 3 files changed, 224 insertions(+) create mode 100644 tests/core_hotunplug.c Changelog: v10->v11: - fix typos in some comments, - use SPDX license identifier, - include a per-patch changelog in the commit message (Daniel). v9->v10 (submitted only to trybot): - rename variables and function arguments to something that indicates they're file descriptors (Daniel), - introduce a data structure that contains various file descriptors and a helper function to set them all (Daniel), - fix strange indenting (Daniel), - limit scope to first three subtests as the first set of tests to merge (Daniel). v8->v9: All changes after Daniel's comments - thanks! - flatten the code, don't try to create a midlayer, - provide minimal subtests that even don't keep device open, - don't use driver unbind in more advanced subtests, - provide subtests with different level of resources allocated during device unplug, - provide subtests which check driver behavior after device hot unplug. 
v7->v8: - move workload functions back from fixture to subtests, - register different actions and different workloads in respective tables and iterate over those tables while enumerating subtests, - introduce new subtest flavors by simply omitting module unload step, - instead of simply requesting bus rescan or not, introduce action specific device recovery helpers, required specifically with those new subtests not touching the module, - split workload functions in two parts, one spawning the workload, the other waiting for its completion, - for the new subtests not requiring module unload, run workload functions directly from the test process and use new workload completion wait functions in place of subprocess completion wait, - take more control over logging, longjumps and exit codes in workload subprocesses, - add some debug messages for easy progress watching, - move function API descriptions on top of respective typedefs, - drop patch 2/2 with external workload command again, still nobody likes it. v6->v7: - add missing igt_exit() needed with the second patch. v5->v6 (third public submission, incorrectly marked as v5, sorry): - run workload inside an igt helper subprocess so resources consumed by the workload are cleaned up automatically on workload subprocess crash, without affecting test results, - move the igt helper with workload back from subtests to initial fixture so workload crash also does not affect test results, - re-add the second patch which extends the test with an option for using an external command as a workload, - other cleanups suggested by Kasia and Chris. v4->v5 (second public submission, marked as v2): - try to restore the device to a working state after each subtest (Petri, Daniel). v3->v4 (first public submission, not marked with any version number): - run dummy_load from inside subtests (Antonio). 
v2->v3 (internal submission): - run dummy_load from the test process directly (Antonio), - drop the patch for running external workload (Antonio). v1->v2 (internal submission): - run a subprocess with dummy_load instead of external command (Antonio), - keep use of external workload command as an option, move that to a separate patch. -- 2.21.0 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH i-g-t v11 1/1] tests: Add a new test for device hot unplug
On Monday, June 10, 2019 8:49:38 AM CEST Petri Latvala wrote: > On Fri, Jun 07, 2019 at 01:51:42PM +0200, Janusz Krzysztofik wrote: > > - use SPDX license identifier, > > > Why? We don't use those in IGT. I must have got the idea to change it from somewhere; unfortunately, I'm not able to recall from where, sorry. I'll revert it. > > diff --git a/tests/core_hotunplug.c b/tests/core_hotunplug.c > > new file mode 100644 > > index ..d36a0572 > > --- /dev/null > > +++ b/tests/core_hotunplug.c > > @@ -0,0 +1,222 @@ > > +// SPDX-License-Identifier: GPL-2.0 > > +/* > > + * Copyright © 2019 Intel Corporation > > + */ > > And why GPL-2.0? From the same source as the idea of SPDX, I guess. I'll fix it to be in line with IGT standards. Thanks, Janusz
[Intel-gfx] [PATCH] drm/i915: Fix reporting of size of created GEM object
Commit e163484afa8d ("drm/i915: Update size upon return from GEM_CREATE")
(re)introduced reporting of actual size of created GEM objects, possibly
rounded up on object alignment. Unfortunately, its implementation
resulted in a possible use-after-free bug. The bug has been fixed by
commit 929eec99f5fd ("drm/i915: Avoid use-after-free in reporting
create.size") at the cost of possibly incorrect value being reported as
actual object size.

Safely restore correct reporting by capturing actual size of created GEM
object before a reference to the object is put.

Fixes: 929eec99f5fd ("drm/i915: Avoid use-after-free in reporting create.size")
Signed-off-by: Janusz Krzysztofik
---
 drivers/gpu/drm/i915/i915_gem.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 7ade42b8ec99..16bae5870d6f 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -171,6 +171,7 @@ i915_gem_create(struct drm_file *file,
 	obj = i915_gem_object_create_shmem(dev_priv, size);
 	if (IS_ERR(obj))
 		return PTR_ERR(obj);
+	size = obj->base.size;
 
 	ret = drm_gem_handle_create(file, &obj->base, &handle);
 	/* drop reference from allocate - handle holds it now */
-- 
2.21.0
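The pattern behind this fix, copying a field out of a reference-counted object while a reference is still held and only then putting the reference, can be shown in isolation. The sketch below is a simplified userspace model, not the i915 code: struct fake_obj, PAGE_SZ and both functions are hypothetical, and the object here is freed immediately on put, whereas in the driver the handle may keep it alive for some time.

```c
#include <stddef.h>
#include <stdlib.h>

#define PAGE_SZ 4096u	/* assumed alignment, for illustration only */

struct fake_obj {
	int refcount;
	size_t size;	/* actual size, rounded up to PAGE_SZ */
};

static void fake_obj_put(struct fake_obj *obj)
{
	if (--obj->refcount == 0)
		free(obj);
}

/* Returns the actual size of the created object, or 0 on allocation failure. */
static size_t create_and_report_size(size_t requested)
{
	struct fake_obj *obj = malloc(sizeof(*obj));
	size_t size;

	if (!obj)
		return 0;

	obj->refcount = 1;
	obj->size = (requested + PAGE_SZ - 1) / PAGE_SZ * PAGE_SZ;

	size = obj->size;	/* capture while the reference is still held */
	fake_obj_put(obj);	/* obj must not be dereferenced after this */

	return size;
}
```

Reading obj->size after fake_obj_put() would be the use-after-free; returning the requested size instead would avoid that but report a value that may be smaller than the actual one.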
[Intel-gfx] [RFC PATCH 1/6] drm/i915: Rename "_load"/"_unload" to match PCI entry points
Current names of i915_driver_load/unload() functions originate in legacy DRM stubs. Reduce nomenclature ambiguity by renaming them to match their current use as helpers called from PCI entry points. Suggested by: Chris Wilson Signed-off-by: Janusz Krzysztofik --- drivers/gpu/drm/i915/i915_drv.c | 8 drivers/gpu/drm/i915/i915_drv.h | 4 ++-- drivers/gpu/drm/i915/i915_pci.c | 4 ++-- 3 files changed, 8 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c index 12182d2fc03c..8b72ae7c1f5d 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -1870,17 +1870,17 @@ static void i915_driver_destroy(struct drm_i915_private *i915) } /** - * i915_driver_load - setup chip and create an initial config + * i915_driver_probe - setup chip and create an initial config * @pdev: PCI device * @ent: matching PCI ID entry * - * The driver load routine has to do several things: + * The driver probe routine has to do several things: * - drive output discovery via intel_modeset_init() * - initialize the memory manager * - allocate initial config memory * - setup the DRM framebuffer with the allocated memory */ -int i915_driver_load(struct pci_dev *pdev, const struct pci_device_id *ent) +int i915_driver_probe(struct pci_dev *pdev, const struct pci_device_id *ent) { const struct intel_device_info *match_info = (struct intel_device_info *)ent->driver_data; @@ -1946,7 +1946,7 @@ int i915_driver_load(struct pci_dev *pdev, const struct pci_device_id *ent) return ret; } -void i915_driver_unload(struct drm_device *dev) +void i915_driver_remove(struct drm_device *dev) { struct drm_i915_private *dev_priv = to_i915(dev); struct pci_dev *pdev = dev_priv->drm.pdev; diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index a9381e404fd5..ebb4c09f8817 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -2395,9 +2395,9 @@ extern long i915_compat_ioctl(struct file 
*filp, unsigned int cmd, #endif extern const struct dev_pm_ops i915_pm_ops; -extern int i915_driver_load(struct pci_dev *pdev, +extern int i915_driver_probe(struct pci_dev *pdev, const struct pci_device_id *ent); -extern void i915_driver_unload(struct drm_device *dev); +extern void i915_driver_remove(struct drm_device *dev); extern void intel_engine_init_hangcheck(struct intel_engine_cs *engine); extern void intel_hangcheck_init(struct drm_i915_private *dev_priv); diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c index 94b588e0a1dd..786ca7b3439b 100644 --- a/drivers/gpu/drm/i915/i915_pci.c +++ b/drivers/gpu/drm/i915/i915_pci.c @@ -848,7 +848,7 @@ static void i915_pci_remove(struct pci_dev *pdev) if (!dev) /* driver load aborted, nothing to cleanup */ return; - i915_driver_unload(dev); + i915_driver_remove(dev); drm_dev_put(dev); pci_set_drvdata(pdev, NULL); @@ -923,7 +923,7 @@ static int i915_pci_probe(struct pci_dev *pdev, const struct pci_device_id *ent) if (vga_switcheroo_client_probe_defer(pdev)) return -EPROBE_DEFER; - err = i915_driver_load(pdev, ent); + err = i915_driver_probe(pdev, ent); if (err) return err; -- 2.21.0 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [RFC PATCH 0/6] Rename functions to match their entry points
The need for this was identified while working on splitting the driver unbind path into _remove() and _release() parts. Consistency in function naming has been recognized as helpful when trying to work out which phase the code is in.

What I'm still not sure about is the desired depth of that modification: how far down should we go with renaming so as not to override meaningful function names? Please advise if you think even deeper renaming makes sense.

Thanks,
Janusz

Janusz Krzysztofik (6):
  drm/i915: Rename "_load"/"_unload" to match PCI entry points
  drm/i915: Replace "_load" with "_probe" consequently
  drm/i915: Propagate "_release" function name suffix down
  drm/i915: Propagate "_remove" function name suffix down
  drm/i915: Propagate "_probe" function name suffix down
  drm/i915: Rename "inject_load_failure" module parameter

 drivers/gpu/drm/i915/display/intel_bios.c | 4 +-
 drivers/gpu/drm/i915/display/intel_bios.h | 2 +-
 .../gpu/drm/i915/display/intel_connector.c | 2 +-
 drivers/gpu/drm/i915/display/intel_display.c | 2 +-
 .../drm/i915/display/intel_display_power.c | 6 +-
 .../drm/i915/display/intel_display_power.h | 2 +-
 drivers/gpu/drm/i915/gt/intel_engine_cs.c | 2 +-
 drivers/gpu/drm/i915/i915_drv.c | 111 +-
 drivers/gpu/drm/i915/i915_drv.h | 20 ++--
 drivers/gpu/drm/i915/i915_gem.c | 12 +-
 drivers/gpu/drm/i915/i915_gem_gtt.c | 4 +-
 drivers/gpu/drm/i915/i915_gem_gtt.h | 2 +-
 drivers/gpu/drm/i915/i915_params.c | 2 +-
 drivers/gpu/drm/i915/i915_params.h | 2 +-
 drivers/gpu/drm/i915/i915_pci.c | 6 +-
 drivers/gpu/drm/i915/intel_gvt.c | 7 +-
 drivers/gpu/drm/i915/intel_gvt.h | 4 +-
 drivers/gpu/drm/i915/intel_runtime_pm.c | 2 +-
 drivers/gpu/drm/i915/intel_runtime_pm.h | 2 +-
 drivers/gpu/drm/i915/intel_uncore.c | 2 +-
 drivers/gpu/drm/i915/intel_wopcm.c | 2 +-
 21 files changed, 100 insertions(+), 98 deletions(-)

-- 
2.21.0
[Intel-gfx] [RFC PATCH 4/6] drm/i915: Propagate "_remove" function name suffix down
Similar to the "_release" case, consistently replace mixed "_cleanup"/"_fini"/"_fini_hw" components found in names of functions called from i915_driver_remove() with "_remove" or "_driver_remove" suffixes for better code readability. Signed-off-by: Janusz Krzysztofik --- drivers/gpu/drm/i915/display/intel_bios.c | 4 ++-- drivers/gpu/drm/i915/display/intel_bios.h | 2 +- drivers/gpu/drm/i915/display/intel_display.c | 2 +- .../drm/i915/display/intel_display_power.c| 6 ++--- .../drm/i915/display/intel_display_power.h| 2 +- drivers/gpu/drm/i915/i915_drv.c | 24 +-- drivers/gpu/drm/i915/i915_drv.h | 4 ++-- drivers/gpu/drm/i915/i915_gem.c | 2 +- drivers/gpu/drm/i915/intel_gvt.c | 5 ++-- drivers/gpu/drm/i915/intel_gvt.h | 4 ++-- 10 files changed, 28 insertions(+), 27 deletions(-) diff --git a/drivers/gpu/drm/i915/display/intel_bios.c b/drivers/gpu/drm/i915/display/intel_bios.c index 0c9808132d67..3c725edc79ef 100644 --- a/drivers/gpu/drm/i915/display/intel_bios.c +++ b/drivers/gpu/drm/i915/display/intel_bios.c @@ -1891,10 +1891,10 @@ void intel_bios_init(struct drm_i915_private *dev_priv) } /** - * intel_bios_cleanup - Free any resources allocated by intel_bios_init() + * intel_bios_driver_remove - Free any resources allocated by intel_bios_init() * @dev_priv: i915 device instance */ -void intel_bios_cleanup(struct drm_i915_private *dev_priv) +void intel_bios_driver_remove(struct drm_i915_private *dev_priv) { kfree(dev_priv->vbt.child_dev); dev_priv->vbt.child_dev = NULL; diff --git a/drivers/gpu/drm/i915/display/intel_bios.h b/drivers/gpu/drm/i915/display/intel_bios.h index 0b7be6389a07..4969189e620f 100644 --- a/drivers/gpu/drm/i915/display/intel_bios.h +++ b/drivers/gpu/drm/i915/display/intel_bios.h @@ -228,7 +228,7 @@ struct mipi_pps_data { } __packed; void intel_bios_init(struct drm_i915_private *dev_priv); -void intel_bios_cleanup(struct drm_i915_private *dev_priv); +void intel_bios_driver_remove(struct drm_i915_private *dev_priv); bool intel_bios_is_valid_vbt(const 
void *buf, size_t size); bool intel_bios_is_tv_present(struct drm_i915_private *dev_priv); bool intel_bios_is_lvds_present(struct drm_i915_private *dev_priv, u8 *i2c_pin); diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c index f09eda75711a..47dd682c9a62 100644 --- a/drivers/gpu/drm/i915/display/intel_display.c +++ b/drivers/gpu/drm/i915/display/intel_display.c @@ -17062,7 +17062,7 @@ static void intel_hpd_poll_fini(struct drm_device *dev) drm_connector_list_iter_end(&conn_iter); } -void intel_modeset_cleanup(struct drm_device *dev) +void intel_modeset_driver_remove(struct drm_device *dev) { struct drm_i915_private *dev_priv = to_i915(dev); diff --git a/drivers/gpu/drm/i915/display/intel_display_power.c b/drivers/gpu/drm/i915/display/intel_display_power.c index 7437fc71d289..5f4939a9ca90 100644 --- a/drivers/gpu/drm/i915/display/intel_display_power.c +++ b/drivers/gpu/drm/i915/display/intel_display_power.c @@ -4427,7 +4427,7 @@ static void intel_power_domains_verify_state(struct drm_i915_private *dev_priv); * * It will return with power domains disabled (to be enabled later by * intel_power_domains_enable()) and must be paired with - * intel_power_domains_fini_hw(). + * intel_power_domains_driver_remove(). */ void intel_power_domains_init_hw(struct drm_i915_private *i915, bool resume) { @@ -4479,7 +4479,7 @@ void intel_power_domains_init_hw(struct drm_i915_private *i915, bool resume) } /** - * intel_power_domains_fini_hw - deinitialize hw power domain state + * intel_power_domains_driver_remove - deinitialize hw power domain state * @i915: i915 device instance * * De-initializes the display power domain HW state. It also ensures that the @@ -4489,7 +4489,7 @@ void intel_power_domains_init_hw(struct drm_i915_private *i915, bool resume) * intel_power_domains_disable()) and must be paired with * intel_power_domains_init_hw(). 
*/ -void intel_power_domains_fini_hw(struct drm_i915_private *i915) +void intel_power_domains_driver_remove(struct drm_i915_private *i915) { intel_wakeref_t wakeref __maybe_unused = fetch_and_zero(&i915->power_domains.wakeref); diff --git a/drivers/gpu/drm/i915/display/intel_display_power.h b/drivers/gpu/drm/i915/display/intel_display_power.h index 8f43f7051a16..dbd1f5ef01d1 100644 --- a/drivers/gpu/drm/i915/display/intel_display_power.h +++ b/drivers/gpu/drm/i915/display/intel_display_power.h @@ -214,7 +214,7 @@ void gen9_enable_dc5(struct drm_i915_private *dev_priv); int intel_power_domains_init(struct drm_i915_private *dev_priv); void intel_power_domains_cleanup(struct drm_i915_private *dev_priv); void intel_power_domains_init_hw(struct drm_i9
[Intel-gfx] [RFC PATCH 5/6] drm/i915: Propagate "_probe" function name suffix down
Similar to the "_release" and "_remove" cases, consistently replace "_init" components of names of functions called from i915_driver_probe() with "_probe" suffixes for better code readability. Signed-off-by: Janusz Krzysztofik --- drivers/gpu/drm/i915/i915_drv.c | 26 +- 1 file changed, 13 insertions(+), 13 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c index 6e83fe96d930..7241a7d14e9b 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -675,7 +675,7 @@ static const struct vga_switcheroo_client_ops i915_switcheroo_ops = { .can_switch = i915_switcheroo_can_switch, }; -static int i915_load_modeset_init(struct drm_device *dev) +static int i915_driver_modeset_probe(struct drm_device *dev) { struct drm_i915_private *dev_priv = to_i915(dev); struct pci_dev *pdev = dev_priv->drm.pdev; @@ -884,7 +884,7 @@ static void intel_detect_preproduction_hw(struct drm_i915_private *dev_priv) } /** - * i915_driver_init_early - setup state not requiring device access + * i915_driver_early_probe - setup state not requiring device access * @dev_priv: device private * * Initialize everything that is a "SW-only" state, that is state not @@ -893,7 +893,7 @@ static void intel_detect_preproduction_hw(struct drm_i915_private *dev_priv) * system memory allocation, setting up device specific attributes and * function hooks not requiring accessing the device. 
*/ -static int i915_driver_init_early(struct drm_i915_private *dev_priv) +static int i915_driver_early_probe(struct drm_i915_private *dev_priv) { int ret = 0; @@ -963,7 +963,7 @@ static int i915_driver_init_early(struct drm_i915_private *dev_priv) /** * i915_driver_early_release - cleanup the setup done in - *i915_driver_init_early() + *i915_driver_early_probe() * @dev_priv: device private */ static void i915_driver_early_release(struct drm_i915_private *dev_priv) @@ -980,7 +980,7 @@ static void i915_driver_early_release(struct drm_i915_private *dev_priv) } /** - * i915_driver_init_mmio - setup device MMIO + * i915_driver_mmio_probe - setup device MMIO * @dev_priv: device private * * Setup minimal device state necessary for MMIO accesses later in the @@ -988,7 +988,7 @@ static void i915_driver_early_release(struct drm_i915_private *dev_priv) * side effects or exposing the driver via kernel internal or user space * interfaces. */ -static int i915_driver_init_mmio(struct drm_i915_private *dev_priv) +static int i915_driver_mmio_probe(struct drm_i915_private *dev_priv) { int ret; @@ -1029,7 +1029,7 @@ static int i915_driver_init_mmio(struct drm_i915_private *dev_priv) } /** - * i915_driver_mmio_release - cleanup the setup done in i915_driver_init_mmio() + * i915_driver_mmio_release - cleanup the setup done in i915_driver_mmio_probe() * @dev_priv: device private */ static void i915_driver_mmio_release(struct drm_i915_private *dev_priv) @@ -1525,13 +1525,13 @@ static void edram_detect(struct drm_i915_private *dev_priv) } /** - * i915_driver_init_hw - setup state requiring device access + * i915_driver_hw_probe - setup state requiring device access * @dev_priv: device private * * Setup state that requires accessing the device, but doesn't require * exposing the driver via kernel internal or userspace interfaces. 
*/ -static int i915_driver_init_hw(struct drm_i915_private *dev_priv) +static int i915_driver_hw_probe(struct drm_i915_private *dev_priv) { struct pci_dev *pdev = dev_priv->drm.pdev; int ret; @@ -1900,7 +1900,7 @@ int i915_driver_probe(struct pci_dev *pdev, const struct pci_device_id *ent) if (ret) goto out_fini; - ret = i915_driver_init_early(dev_priv); + ret = i915_driver_early_probe(dev_priv); if (ret < 0) goto out_pci_disable; @@ -1908,15 +1908,15 @@ int i915_driver_probe(struct pci_dev *pdev, const struct pci_device_id *ent) i915_detect_vgpu(dev_priv); - ret = i915_driver_init_mmio(dev_priv); + ret = i915_driver_mmio_probe(dev_priv); if (ret < 0) goto out_runtime_pm_put; - ret = i915_driver_init_hw(dev_priv); + ret = i915_driver_hw_probe(dev_priv); if (ret < 0) goto out_cleanup_mmio; - ret = i915_load_modeset_init(&dev_priv->drm); + ret = i915_driver_modeset_probe(&dev_priv->drm); if (ret < 0) goto out_cleanup_hw; -- 2.21.0
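The probe path touched by this patch follows the usual kernel pattern of a "goto ladder": each phase that succeeds gets a matching release label, and a failure jumps to the label that unwinds only what was already set up. The sketch below illustrates that pairing in isolation. It is a simplified model under stated assumptions, not i915 code: `phase_probe()`, `phase_release()`, `demo_driver_probe()`, `fail_phase` and `live_phases` are all hypothetical names invented for the example.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical test knob: 0 = no failure, N = phase N fails. */
static int fail_phase;
/* Bitmask of phases that are currently set up (bit N = phase N). */
static int live_phases;

/* Set up phase n, or fail with -ENODEV if the knob selects it. */
static int phase_probe(int n)
{
	if (fail_phase == n)
		return -ENODEV;
	live_phases |= 1 << n;
	return 0;
}

/* Tear down phase n; the counterpart of phase_probe(n). */
static void phase_release(int n)
{
	live_phases &= ~(1 << n);
}

/* Three-phase probe with a goto ladder that unwinds in reverse order. */
static int demo_driver_probe(void)
{
	int ret;

	ret = phase_probe(1);	/* compare: early probe */
	if (ret < 0)
		goto out;

	ret = phase_probe(2);	/* compare: MMIO probe */
	if (ret < 0)
		goto out_early_release;

	ret = phase_probe(3);	/* compare: HW probe */
	if (ret < 0)
		goto out_mmio_release;

	return 0;

out_mmio_release:
	phase_release(2);
out_early_release:
	phase_release(1);
out:
	return ret;
}
```

With `fail_phase` set to 3, `demo_driver_probe()` returns `-ENODEV` and leaves `live_phases` at 0, because phases 2 and 1 are released in reverse order. Naming the labels after the release step they perform is the same readability argument the series makes for function names.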
[Intel-gfx] [RFC PATCH 2/6] drm/i915: Replace "_load" with "_probe" consequently
Use the "_probe" nomenclature not only in i915_driver_probe() helper name but also in other related function / variable names for consistency. Only the userspace exposed name of a related module parameter is left untouched. Signed-off-by: Janusz Krzysztofik --- .../gpu/drm/i915/display/intel_connector.c| 2 +- drivers/gpu/drm/i915/gt/intel_engine_cs.c | 2 +- drivers/gpu/drm/i915/i915_drv.c | 20 +-- drivers/gpu/drm/i915/i915_drv.h | 10 +- drivers/gpu/drm/i915/i915_gem.c | 8 drivers/gpu/drm/i915/i915_pci.c | 2 +- drivers/gpu/drm/i915/intel_gvt.c | 2 +- drivers/gpu/drm/i915/intel_uncore.c | 2 +- drivers/gpu/drm/i915/intel_wopcm.c| 2 +- 9 files changed, 25 insertions(+), 25 deletions(-) diff --git a/drivers/gpu/drm/i915/display/intel_connector.c b/drivers/gpu/drm/i915/display/intel_connector.c index 41310f8e5a2a..d0163d86c42a 100644 --- a/drivers/gpu/drm/i915/display/intel_connector.c +++ b/drivers/gpu/drm/i915/display/intel_connector.c @@ -118,7 +118,7 @@ int intel_connector_register(struct drm_connector *connector) if (ret) goto err; - if (i915_inject_load_failure()) { + if (i915_inject_probe_failure()) { ret = -EFAULT; goto err_backlight; } diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c index df5932f5f578..a17f0f812735 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c @@ -426,7 +426,7 @@ int intel_engines_init_mmio(struct drm_i915_private *i915) WARN_ON(engine_mask & GENMASK(BITS_PER_TYPE(mask) - 1, I915_NUM_ENGINES)); - if (i915_inject_load_failure()) + if (i915_inject_probe_failure()) return -ENODEV; for (i = 0; i < ARRAY_SIZE(intel_engines); i++) { diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c index 8b72ae7c1f5d..ad24957ad86d 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -81,14 +81,14 @@ static struct drm_driver driver; #if IS_ENABLED(CONFIG_DRM_I915_DEBUG) -static unsigned int 
i915_load_fail_count; +static unsigned int i915_probe_fail_count; -bool __i915_inject_load_failure(const char *func, int line) +bool __i915_inject_probe_failure(const char *func, int line) { - if (i915_load_fail_count >= i915_modparams.inject_load_failure) + if (i915_probe_fail_count >= i915_modparams.inject_load_failure) return false; - if (++i915_load_fail_count == i915_modparams.inject_load_failure) { + if (++i915_probe_fail_count == i915_modparams.inject_load_failure) { DRM_INFO("Injecting failure at checkpoint %u [%s:%d]\n", i915_modparams.inject_load_failure, func, line); i915_modparams.inject_load_failure = 0; @@ -100,7 +100,7 @@ bool __i915_inject_load_failure(const char *func, int line) bool i915_error_injected(void) { - return i915_load_fail_count && !i915_modparams.inject_load_failure; + return i915_probe_fail_count && !i915_modparams.inject_load_failure; } #endif @@ -681,7 +681,7 @@ static int i915_load_modeset_init(struct drm_device *dev) struct pci_dev *pdev = dev_priv->drm.pdev; int ret; - if (i915_inject_load_failure()) + if (i915_inject_probe_failure()) return -ENODEV; if (HAS_DISPLAY(dev_priv)) { @@ -897,7 +897,7 @@ static int i915_driver_init_early(struct drm_i915_private *dev_priv) { int ret = 0; - if (i915_inject_load_failure()) + if (i915_inject_probe_failure()) return -ENODEV; intel_device_info_subplatform_init(dev_priv); @@ -991,7 +991,7 @@ static int i915_driver_init_mmio(struct drm_i915_private *dev_priv) { int ret; - if (i915_inject_load_failure()) + if (i915_inject_probe_failure()) return -ENODEV; if (i915_get_bridge_dev(dev_priv)) @@ -1535,7 +1535,7 @@ static int i915_driver_init_hw(struct drm_i915_private *dev_priv) struct pci_dev *pdev = dev_priv->drm.pdev; int ret; - if (i915_inject_load_failure()) + if (i915_inject_probe_failure()) return -ENODEV; intel_device_info_runtime_init(dev_priv); @@ -1941,7 +1941,7 @@ int i915_driver_probe(struct pci_dev *pdev, const struct pci_device_id *ent) out_pci_disable: pci_disable_device(pdev); 
out_fini: - i915_load_error(dev_priv, "Device initialization failed (%d)\n", ret); + i915_probe_error(dev_priv, "Device initialization failed (%d)\n", ret); i915_driver_destroy(dev_priv); return ret; } diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/dr
[Intel-gfx] [RFC PATCH 6/6] drm/i915: Rename "inject_load_failure" module parameter
Use the "probe" nomenclature for consistency with internally used names of functions and variables. Requires adjustment of IGT tests and possibly affects other user custom applications. Signed-off-by: Janusz Krzysztofik --- drivers/gpu/drm/i915/i915_drv.c| 10 +- drivers/gpu/drm/i915/i915_params.c | 2 +- drivers/gpu/drm/i915/i915_params.h | 2 +- 3 files changed, 7 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c index 7241a7d14e9b..3bac6be9f37d 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -85,13 +85,13 @@ static unsigned int i915_probe_fail_count; bool __i915_inject_probe_failure(const char *func, int line) { - if (i915_probe_fail_count >= i915_modparams.inject_load_failure) + if (i915_probe_fail_count >= i915_modparams.inject_probe_failure) return false; - if (++i915_probe_fail_count == i915_modparams.inject_load_failure) { + if (++i915_probe_fail_count == i915_modparams.inject_probe_failure) { DRM_INFO("Injecting failure at checkpoint %u [%s:%d]\n", -i915_modparams.inject_load_failure, func, line); - i915_modparams.inject_load_failure = 0; +i915_modparams.inject_probe_failure, func, line); + i915_modparams.inject_probe_failure = 0; return true; } @@ -100,7 +100,7 @@ bool __i915_inject_probe_failure(const char *func, int line) bool i915_error_injected(void) { - return i915_probe_fail_count && !i915_modparams.inject_load_failure; + return i915_probe_fail_count && !i915_modparams.inject_probe_failure; } #endif diff --git a/drivers/gpu/drm/i915/i915_params.c b/drivers/gpu/drm/i915/i915_params.c index 296452f9efe4..59a6586dae15 100644 --- a/drivers/gpu/drm/i915/i915_params.c +++ b/drivers/gpu/drm/i915/i915_params.c @@ -165,7 +165,7 @@ i915_param_named_unsafe(enable_dp_mst, bool, 0600, "Enable multi-stream transport (MST) for new DisplayPort sinks. 
(default: true)"); #if IS_ENABLED(CONFIG_DRM_I915_DEBUG) -i915_param_named_unsafe(inject_load_failure, uint, 0400, +i915_param_named_unsafe(inject_probe_failure, uint, 0400, "Force an error after a number of failure check points (0:disabled (default), N:force failure at the Nth failure check point)"); #endif diff --git a/drivers/gpu/drm/i915/i915_params.h b/drivers/gpu/drm/i915/i915_params.h index d29ade3b7de6..8c887413fc70 100644 --- a/drivers/gpu/drm/i915/i915_params.h +++ b/drivers/gpu/drm/i915/i915_params.h @@ -62,7 +62,7 @@ struct drm_printer; param(int, mmio_debug, -IS_ENABLED(CONFIG_DRM_I915_DEBUG_MMIO)) \ param(int, edp_vswing, 0) \ param(int, reset, 2) \ - param(unsigned int, inject_load_failure, 0) \ + param(unsigned int, inject_probe_failure, 0) \ param(int, fastboot, -1) \ param(int, enable_dpcd_backlight, 0) \ param(char *, force_probe, CONFIG_DRM_I915_FORCE_PROBE) \ -- 2.21.0
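The checkpoint counter renamed by this patch can be modelled outside the driver. The following is a minimal sketch that mirrors the logic visible in the diff, under these assumptions: the globals `inject_probe_failure` and `probe_fail_count` stand in for `i915_modparams.inject_probe_failure` and `i915_probe_fail_count`, the helpers `inject_failure()` and `error_injected()` stand in for the `i915_*` functions, the `INJECT_FAILURE()` macro is a guess at how a call site supplies function name and line, and `printf` replaces `DRM_INFO`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Stand-in for the module parameter: 0 = disabled, N = fail at Nth checkpoint. */
static unsigned int inject_probe_failure;
/* Number of checkpoints passed while injection was armed. */
static unsigned int probe_fail_count;

/* Returns true exactly once: at the checkpoint selected by the parameter. */
static bool inject_failure(const char *func, int line)
{
	/* Disabled, or the selected checkpoint has already fired. */
	if (probe_fail_count >= inject_probe_failure)
		return false;

	if (++probe_fail_count == inject_probe_failure) {
		printf("Injecting failure at checkpoint %u [%s:%d]\n",
		       inject_probe_failure, func, line);
		/* One-shot: disarm so later checkpoints pass. */
		inject_probe_failure = 0;
		return true;
	}

	return false;
}

/* Hypothetical call-site wrapper that records where the checkpoint is. */
#define INJECT_FAILURE() inject_failure(__func__, __LINE__)

/* True once a failure has been injected and the parameter was cleared. */
static bool error_injected(void)
{
	return probe_fail_count && !inject_probe_failure;
}
```

With the parameter set to 2, the first checkpoint passes, the second reports a failure and clears the parameter, and every later checkpoint passes again. Because only the parameter name changes in this patch, test tools that set it by its old name need the adjustment mentioned in the commit message.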
[Intel-gfx] [RFC PATCH 3/6] drm/i915: Propagate "_release" function name suffix down
Replace mixed "_fini"/"_cleanup"/"_cleanup_hw" suffixes found in names of functions called from i915_driver_release() with "_release" suffix consistently. This provides better code readability, especially helpful when trying to work out which phase the code is in. Function names starting with "i915_driver_", i.e., those defined in drivers/gpu/drm/i915/i915_drv.c, just have their "cleanup" or "fini" parts of their names replaced with the "_release" suffix, while names of functions coming from other source files have been suffixed with "_driver_release" to avoid ambiguity with other possible .release entry points. Suggested-by: Chris Wilson Signed-off-by: Janusz Krzysztofik --- drivers/gpu/drm/i915/i915_drv.c | 33 + drivers/gpu/drm/i915/i915_drv.h | 2 +- drivers/gpu/drm/i915/i915_gem.c | 2 +- drivers/gpu/drm/i915/i915_gem_gtt.c | 4 +-- drivers/gpu/drm/i915/i915_gem_gtt.h | 2 +- drivers/gpu/drm/i915/intel_runtime_pm.c | 2 +- drivers/gpu/drm/i915/intel_runtime_pm.h | 2 +- 7 files changed, 24 insertions(+), 23 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c index ad24957ad86d..36c872220f68 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -752,7 +752,7 @@ static int i915_load_modeset_init(struct drm_device *dev) cleanup_gem: i915_gem_suspend(dev_priv); i915_gem_fini_hw(dev_priv); - i915_gem_fini(dev_priv); + i915_gem_driver_release(dev_priv); cleanup_modeset: intel_modeset_cleanup(dev); cleanup_irq: @@ -962,10 +962,11 @@ static int i915_driver_init_early(struct drm_i915_private *dev_priv) } /** - * i915_driver_cleanup_early - cleanup the setup done in i915_driver_init_early() + * i915_driver_early_release - cleanup the setup done in + *i915_driver_init_early() * @dev_priv: device private */ -static void i915_driver_cleanup_early(struct drm_i915_private *dev_priv) +static void i915_driver_early_release(struct drm_i915_private *dev_priv) { intel_irq_fini(dev_priv); 
intel_power_domains_cleanup(dev_priv); @@ -1028,10 +1029,10 @@ static int i915_driver_init_mmio(struct drm_i915_private *dev_priv) } /** - * i915_driver_cleanup_mmio - cleanup the setup done in i915_driver_init_mmio() + * i915_driver_mmio_release - cleanup the setup done in i915_driver_init_mmio() * @dev_priv: device private */ -static void i915_driver_cleanup_mmio(struct drm_i915_private *dev_priv) +static void i915_driver_mmio_release(struct drm_i915_private *dev_priv) { intel_teardown_mchbar(dev_priv); intel_uncore_fini_mmio(&dev_priv->uncore); @@ -1684,7 +1685,7 @@ static int i915_driver_init_hw(struct drm_i915_private *dev_priv) pci_disable_msi(pdev); pm_qos_remove_request(&dev_priv->pm_qos); err_ggtt: - i915_ggtt_cleanup_hw(dev_priv); + i915_ggtt_driver_release(dev_priv); err_perf: i915_perf_fini(dev_priv); return ret; @@ -1929,15 +1930,15 @@ int i915_driver_probe(struct pci_dev *pdev, const struct pci_device_id *ent) out_cleanup_hw: i915_driver_cleanup_hw(dev_priv); - i915_ggtt_cleanup_hw(dev_priv); + i915_ggtt_driver_release(dev_priv); /* Paranoia: make sure we have disabled everything before we exit. */ intel_sanitize_gt_powersave(dev_priv); out_cleanup_mmio: - i915_driver_cleanup_mmio(dev_priv); + i915_driver_mmio_release(dev_priv); out_runtime_pm_put: enable_rpm_wakeref_asserts(&dev_priv->runtime_pm); - i915_driver_cleanup_early(dev_priv); + i915_driver_early_release(dev_priv); out_pci_disable: pci_disable_device(pdev); out_fini: @@ -2000,19 +2001,19 @@ static void i915_driver_release(struct drm_device *dev) disable_rpm_wakeref_asserts(rpm); - i915_gem_fini(dev_priv); + i915_gem_driver_release(dev_priv); - i915_ggtt_cleanup_hw(dev_priv); + i915_ggtt_driver_release(dev_priv); /* Paranoia: make sure we have disabled everything before we exit. 
*/ intel_sanitize_gt_powersave(dev_priv); - i915_driver_cleanup_mmio(dev_priv); + i915_driver_mmio_release(dev_priv); enable_rpm_wakeref_asserts(rpm); - intel_runtime_pm_cleanup(rpm); + intel_runtime_pm_driver_release(rpm); - i915_driver_cleanup_early(dev_priv); + i915_driver_early_release(dev_priv); i915_driver_destroy(dev_priv); } @@ -2205,7 +2206,7 @@ static int i915_drm_suspend_late(struct drm_device *dev, bool hibernation) out: enable_rpm_wakeref_asserts(rpm); if (!dev_priv->uncore.user_forcewake.count) - intel_runtime_pm_cleanup(rpm); + intel_runtime_pm_driver_release(rpm); return ret; } @@ -2969,7 +
Re: [Intel-gfx] [RFC PATCH 0/6] Rename functions to match their entry points
On Wednesday, July 10, 2019 2:47:08 PM CEST Chris Wilson wrote:
> Quoting Janusz Krzysztofik (2019-07-10 13:36:25)
> > Need for this was identified while working on split of driver unbind
> > path into _remove() and _release() parts. Consistency in function
> > naming has been recognized as helpful when trying to work out which
> > phase the code is in.
> >
> > What I'm still not sure about is desired depth of that modification -
> > how deep should we go down with renaming to not override meaningful
> > function names. Please advise if you think still more deep renaming
> > makes sense.
>
> I did a double take over "driver_release" but by the end I was in
> agreement.
>
> The early_release though, that is worth a bit of artistic license to say
> early_probe pairs with late_release.

OK, I'll fix it, as well as the other issues pointed out by dim, and resubmit.

Thanks,
Janusz

> -Chris
[Intel-gfx] [RFC PATCH] drm/i915: Drop extern qualifiers from header function prototypes
Follow dim checkpatch recommendation so it doesn't complain on that now and again on header file modifications. Signed-off-by: Janusz Krzysztofik --- drivers/gpu/drm/i915/gem/i915_gem_object.h | 2 +- drivers/gpu/drm/i915/gvt/gtt.h | 13 +++--- drivers/gpu/drm/i915/i915_drv.h| 48 +++--- drivers/gpu/drm/i915/i915_irq.h| 4 +- drivers/gpu/drm/i915/oa/i915_oa_bdw.h | 2 +- drivers/gpu/drm/i915/oa/i915_oa_bxt.h | 2 +- drivers/gpu/drm/i915/oa/i915_oa_cflgt2.h | 2 +- drivers/gpu/drm/i915/oa/i915_oa_cflgt3.h | 2 +- drivers/gpu/drm/i915/oa/i915_oa_chv.h | 2 +- drivers/gpu/drm/i915/oa/i915_oa_cnl.h | 2 +- drivers/gpu/drm/i915/oa/i915_oa_glk.h | 2 +- drivers/gpu/drm/i915/oa/i915_oa_hsw.h | 2 +- drivers/gpu/drm/i915/oa/i915_oa_icl.h | 2 +- drivers/gpu/drm/i915/oa/i915_oa_kblgt2.h | 2 +- drivers/gpu/drm/i915/oa/i915_oa_kblgt3.h | 2 +- drivers/gpu/drm/i915/oa/i915_oa_sklgt2.h | 2 +- drivers/gpu/drm/i915/oa/i915_oa_sklgt3.h | 2 +- drivers/gpu/drm/i915/oa/i915_oa_sklgt4.h | 2 +- include/drm/i915_drm.h | 10 ++--- 19 files changed, 52 insertions(+), 53 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h index 20754c15412a..67aea07ea019 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h @@ -81,7 +81,7 @@ i915_gem_object_lookup(struct drm_file *file, u32 handle) } __deprecated -extern struct drm_gem_object * +struct drm_gem_object * drm_gem_object_lookup(struct drm_file *file, u32 handle); __attribute__((nonnull)) diff --git a/drivers/gpu/drm/i915/gvt/gtt.h b/drivers/gpu/drm/i915/gvt/gtt.h index 42d0394f0de2..88789316807d 100644 --- a/drivers/gpu/drm/i915/gvt/gtt.h +++ b/drivers/gpu/drm/i915/gvt/gtt.h @@ -205,17 +205,18 @@ struct intel_vgpu_gtt { struct intel_vgpu_scratch_pt scratch_pt[GTT_TYPE_MAX]; }; -extern int intel_vgpu_init_gtt(struct intel_vgpu *vgpu); -extern void intel_vgpu_clean_gtt(struct intel_vgpu *vgpu); +int intel_vgpu_init_gtt(struct intel_vgpu *vgpu); 
+void intel_vgpu_clean_gtt(struct intel_vgpu *vgpu); void intel_vgpu_reset_ggtt(struct intel_vgpu *vgpu, bool invalidate_old); void intel_vgpu_invalidate_ppgtt(struct intel_vgpu *vgpu); -extern int intel_gvt_init_gtt(struct intel_gvt *gvt); +int intel_gvt_init_gtt(struct intel_gvt *gvt); void intel_vgpu_reset_gtt(struct intel_vgpu *vgpu); -extern void intel_gvt_clean_gtt(struct intel_gvt *gvt); +void intel_gvt_clean_gtt(struct intel_gvt *gvt); -extern struct intel_vgpu_mm *intel_gvt_find_ppgtt_mm(struct intel_vgpu *vgpu, - int page_table_level, void *root_entry); +struct intel_vgpu_mm *intel_gvt_find_ppgtt_mm(struct intel_vgpu *vgpu, + int page_table_level, + void *root_entry); struct intel_vgpu_oos_page { struct intel_vgpu_ppgtt_spt *spt; diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index a9381e404fd5..649bebcc0019 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -2388,19 +2388,18 @@ __i915_printk(struct drm_i915_private *dev_priv, const char *level, __i915_printk(dev_priv, KERN_ERR, fmt, ##__VA_ARGS__) #ifdef CONFIG_COMPAT -extern long i915_compat_ioctl(struct file *filp, unsigned int cmd, - unsigned long arg); +long i915_compat_ioctl(struct file *filp, unsigned int cmd, unsigned long arg); #else #define i915_compat_ioctl NULL #endif extern const struct dev_pm_ops i915_pm_ops; +extern const struct dev_pm_ops i915_pm_ops_1; -extern int i915_driver_load(struct pci_dev *pdev, - const struct pci_device_id *ent); -extern void i915_driver_unload(struct drm_device *dev); +int i915_driver_load(struct pci_dev *pdev, const struct pci_device_id *ent); +void i915_driver_unload(struct drm_device *dev); -extern void intel_engine_init_hangcheck(struct intel_engine_cs *engine); -extern void intel_hangcheck_init(struct drm_i915_private *dev_priv); +void intel_engine_init_hangcheck(struct intel_engine_cs *engine); +void intel_hangcheck_init(struct drm_i915_private *dev_priv); int vlv_force_gfx_clock(struct 
drm_i915_private *dev_priv, bool on); u32 intel_calculate_mcr_s_ss_select(struct drm_i915_private *dev_priv); @@ -2670,14 +2669,14 @@ int intel_engine_cmd_parser(struct intel_engine_cs *engine, bool is_master); /* i915_perf.c */ -extern void i915_perf_init(struct drm_i915_private *dev_priv); -extern void i915_perf_fini(struct drm_i915_private *dev_priv); -extern void i915_perf_register(struct drm_i915_private *dev_priv); -extern void i915_perf_unregister(struct drm_i915_private *dev_priv); +void i915_perf_init(struct drm_i915_private *dev_priv); +void i915_perf_fini(struct drm_i915_pr
[Intel-gfx] [RFC PATCH] drm/i915: Join quoted strings and align them with open parenthesis
Follow dim checkpatch recommendations so it doesn't complain now and again on consistent modifications of i915_params.c Signed-off-by: Janusz Krzysztofik --- drivers/gpu/drm/i915/i915_params.c | 96 ++ 1 file changed, 33 insertions(+), 63 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_params.c b/drivers/gpu/drm/i915/i915_params.c index 296452f9efe4..8007fa893869 100644 --- a/drivers/gpu/drm/i915/i915_params.c +++ b/drivers/gpu/drm/i915/i915_params.c @@ -41,141 +41,111 @@ struct i915_params i915_modparams __read_mostly = { }; i915_param_named(modeset, int, 0400, - "Use kernel modesetting [KMS] (0=disable, " - "1=on, -1=force vga console preference [default])"); +"Use kernel modesetting [KMS] (0=disable, 1=on, -1=force vga console preference [default])"); i915_param_named_unsafe(enable_dc, int, 0400, - "Enable power-saving display C-states. " - "(-1=auto [default]; 0=disable; 1=up to DC5; 2=up to DC6)"); + "Enable power-saving display C-states. (-1=auto [default]; 0=disable; 1=up to DC5; 2=up to DC6)"); i915_param_named_unsafe(enable_fbc, int, 0600, - "Enable frame buffer compression for power savings " - "(default: -1 (use per-chip default))"); + "Enable frame buffer compression for power savings (default: -1 (use per-chip default))"); i915_param_named_unsafe(lvds_channel_mode, int, 0400, -"Specify LVDS channel mode " -"(0=probe BIOS [default], 1=single-channel, 2=dual-channel)"); + "Specify LVDS channel mode (0=probe BIOS [default], 1=single-channel, 2=dual-channel)"); i915_param_named_unsafe(panel_use_ssc, int, 0600, - "Use Spread Spectrum Clock with panels [LVDS/eDP] " - "(default: auto from VBT)"); + "Use Spread Spectrum Clock with panels [LVDS/eDP] (default: auto from VBT)"); i915_param_named_unsafe(vbt_sdvo_panel_type, int, 0400, - "Override/Ignore selection of SDVO panel mode in the VBT " - "(-2=ignore, -1=auto [default], index in VBT BIOS table)"); + "Override/Ignore selection of SDVO panel mode in the VBT (-2=ignore, -1=auto [default], index in VBT BIOS 
table)"); i915_param_named_unsafe(reset, int, 0600, - "Attempt GPU resets (0=disabled, 1=full gpu reset, 2=engine reset [default])"); + "Attempt GPU resets (0=disabled, 1=full gpu reset, 2=engine reset [default])"); i915_param_named_unsafe(vbt_firmware, charp, 0400, - "Load VBT from specified file under /lib/firmware"); + "Load VBT from specified file under /lib/firmware"); #if IS_ENABLED(CONFIG_DRM_I915_CAPTURE_ERROR) i915_param_named(error_capture, bool, 0600, - "Record the GPU state following a hang. " - "This information in /sys/class/drm/card/error is vital for " - "triaging and debugging hangs."); +"Record the GPU state following a hang. This information in /sys/class/drm/card/error is vital for triaging and debugging hangs."); #endif i915_param_named_unsafe(enable_hangcheck, bool, 0600, - "Periodically check GPU activity for detecting hangs. " - "WARNING: Disabling this can cause system wide hangs. " - "(default: true)"); + "Periodically check GPU activity for detecting hangs. WARNING: Disabling this can cause system wide hangs. (default: true)"); i915_param_named_unsafe(enable_psr, int, 0600, - "Enable PSR " - "(0=disabled, 1=enabled) " - "Default: -1 (use per-chip default)"); + "Enable PSR (0=disabled, 1=enabled) Default: -1 (use per-chip default)"); i915_param_named_unsafe(force_probe, charp, 0400, - "Force probe the driver for specified devices. " - "See CONFIG_DRM_I915_FORCE_PROBE for details."); + "Force probe the driver for specified devices. See CONFIG_DRM_I915_FORCE_PROBE for details."); i915_param_named_unsafe(alpha_support, bool, 0400, - "Deprecated. See i915.force_probe."); + "Deprecated. See i915.force_probe."); i915_param_named_unsafe(disable_power_well, int, 0400, - "Disable display power wells when possible " - "(-1=auto [default], 0=power wells always on, 1=power wells disabled when possible)"); + "Disable display power wells when possible (-1=auto [default], 0=power wells always on, 1=power wells disabled when possible)"); i915_param_named_
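Joining the quoted strings as this patch does leaves the parameter descriptions unchanged, because C concatenates adjacent string literals at compile time. The sketch below checks that for one description taken from the diff. The array names `split_desc` and `joined_desc` and the helper `descriptions_match()` are made up for illustration and do not appear in the driver.

```c
#include <assert.h>
#include <string.h>

/* Old style: one description written as two adjacent string literals. */
static const char split_desc[] =
	"Use kernel modesetting [KMS] (0=disable, "
	"1=on, -1=force vga console preference [default])";

/* New style: the same description as a single literal. */
static const char joined_desc[] =
	"Use kernel modesetting [KMS] (0=disable, 1=on, -1=force vga console preference [default])";

/* Returns 1 when both spellings produce the same string. */
static int descriptions_match(void)
{
	return sizeof(split_desc) == sizeof(joined_desc) &&
	       strcmp(split_desc, joined_desc) == 0;
}
```

The practical difference is only in the source: a joined string can be found with a single grep for the message text, at the cost of a line longer than 80 columns.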
Re: [Intel-gfx] [RFC PATCH] drm/i915: Drop extern qualifiers from header function prototypes
Hi Chris,

On Wednesday, July 10, 2019 5:01:04 PM CEST Chris Wilson wrote:
> Quoting Janusz Krzysztofik (2019-07-10 15:52:39)
> > Follow dim checkpatch recommendation so it doesn't complain on that now
> > and again on header file modifications.
> >
> > Signed-off-by: Janusz Krzysztofik
>
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -2388,19 +2388,18 @@ __i915_printk(struct drm_i915_private *dev_priv, const char *level,
> > __i915_printk(dev_priv, KERN_ERR, fmt, ##__VA_ARGS__)
> >
> > #ifdef CONFIG_COMPAT
> > -extern long i915_compat_ioctl(struct file *filp, unsigned int cmd,
> > - unsigned long arg);
> > +long i915_compat_ioctl(struct file *filp, unsigned int cmd, unsigned long arg);
> > #else
> > #define i915_compat_ioctl NULL
> > #endif
> > extern const struct dev_pm_ops i915_pm_ops;
> > +extern const struct dev_pm_ops i915_pm_ops_1;
>
> That's novel.

Oh, sorry, that was left over from my testing of how dim checkpatch reacts to extern qualifiers on variables. Thanks for catching this.

Janusz

> -Chris
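The distinction behind this exchange is that `extern` is redundant on a function prototype but meaningful on a variable declaration, which is why the patch keeps it on `i915_pm_ops`. A minimal sketch in plain C follows; `answer()` and `limit` are hypothetical names chosen for the example and have nothing to do with the driver.

```c
#include <assert.h>

/*
 * Function declarations have external linkage by default, so these two
 * prototypes are equivalent and declare the same function.
 */
extern int answer(void);
int answer(void);

/*
 * For objects the keyword matters: with "extern" and no initializer this
 * line is only a declaration, as it would appear in a header file.
 */
extern const int limit;

/* The single definition, as it would appear in exactly one .c file. */
const int limit = 42;

int answer(void)
{
	return limit;
}
```

Dropping `extern` from the variable declaration in a header would turn it into a tentative definition in every file that includes the header, so only the function prototypes are safe to strip.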
[Intel-gfx] [PATCH v2] drm/i915: Drop extern qualifiers from header function prototypes
Follow dim checkpatch recommendation so it doesn't complain on that now and again on header file modifications. v2: Drop testing leftover Signed-off-by: Janusz Krzysztofik --- drivers/gpu/drm/i915/gem/i915_gem_object.h | 2 +- drivers/gpu/drm/i915/gvt/gtt.h | 13 +++--- drivers/gpu/drm/i915/i915_drv.h| 47 ++ drivers/gpu/drm/i915/i915_irq.h| 4 +- drivers/gpu/drm/i915/oa/i915_oa_bdw.h | 2 +- drivers/gpu/drm/i915/oa/i915_oa_bxt.h | 2 +- drivers/gpu/drm/i915/oa/i915_oa_cflgt2.h | 2 +- drivers/gpu/drm/i915/oa/i915_oa_cflgt3.h | 2 +- drivers/gpu/drm/i915/oa/i915_oa_chv.h | 2 +- drivers/gpu/drm/i915/oa/i915_oa_cnl.h | 2 +- drivers/gpu/drm/i915/oa/i915_oa_glk.h | 2 +- drivers/gpu/drm/i915/oa/i915_oa_hsw.h | 2 +- drivers/gpu/drm/i915/oa/i915_oa_icl.h | 2 +- drivers/gpu/drm/i915/oa/i915_oa_kblgt2.h | 2 +- drivers/gpu/drm/i915/oa/i915_oa_kblgt3.h | 2 +- drivers/gpu/drm/i915/oa/i915_oa_sklgt2.h | 2 +- drivers/gpu/drm/i915/oa/i915_oa_sklgt3.h | 2 +- drivers/gpu/drm/i915/oa/i915_oa_sklgt4.h | 2 +- include/drm/i915_drm.h | 10 ++--- 19 files changed, 51 insertions(+), 53 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h index 20754c15412a..67aea07ea019 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h @@ -81,7 +81,7 @@ i915_gem_object_lookup(struct drm_file *file, u32 handle) } __deprecated -extern struct drm_gem_object * +struct drm_gem_object * drm_gem_object_lookup(struct drm_file *file, u32 handle); __attribute__((nonnull)) diff --git a/drivers/gpu/drm/i915/gvt/gtt.h b/drivers/gpu/drm/i915/gvt/gtt.h index 42d0394f0de2..88789316807d 100644 --- a/drivers/gpu/drm/i915/gvt/gtt.h +++ b/drivers/gpu/drm/i915/gvt/gtt.h @@ -205,17 +205,18 @@ struct intel_vgpu_gtt { struct intel_vgpu_scratch_pt scratch_pt[GTT_TYPE_MAX]; }; -extern int intel_vgpu_init_gtt(struct intel_vgpu *vgpu); -extern void intel_vgpu_clean_gtt(struct intel_vgpu *vgpu); +int intel_vgpu_init_gtt(struct 
intel_vgpu *vgpu); +void intel_vgpu_clean_gtt(struct intel_vgpu *vgpu); void intel_vgpu_reset_ggtt(struct intel_vgpu *vgpu, bool invalidate_old); void intel_vgpu_invalidate_ppgtt(struct intel_vgpu *vgpu); -extern int intel_gvt_init_gtt(struct intel_gvt *gvt); +int intel_gvt_init_gtt(struct intel_gvt *gvt); void intel_vgpu_reset_gtt(struct intel_vgpu *vgpu); -extern void intel_gvt_clean_gtt(struct intel_gvt *gvt); +void intel_gvt_clean_gtt(struct intel_gvt *gvt); -extern struct intel_vgpu_mm *intel_gvt_find_ppgtt_mm(struct intel_vgpu *vgpu, - int page_table_level, void *root_entry); +struct intel_vgpu_mm *intel_gvt_find_ppgtt_mm(struct intel_vgpu *vgpu, + int page_table_level, + void *root_entry); struct intel_vgpu_oos_page { struct intel_vgpu_ppgtt_spt *spt; diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index a9381e404fd5..246f9cb625dc 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -2388,19 +2388,17 @@ __i915_printk(struct drm_i915_private *dev_priv, const char *level, __i915_printk(dev_priv, KERN_ERR, fmt, ##__VA_ARGS__) #ifdef CONFIG_COMPAT -extern long i915_compat_ioctl(struct file *filp, unsigned int cmd, - unsigned long arg); +long i915_compat_ioctl(struct file *filp, unsigned int cmd, unsigned long arg); #else #define i915_compat_ioctl NULL #endif extern const struct dev_pm_ops i915_pm_ops; -extern int i915_driver_load(struct pci_dev *pdev, - const struct pci_device_id *ent); -extern void i915_driver_unload(struct drm_device *dev); +int i915_driver_load(struct pci_dev *pdev, const struct pci_device_id *ent); +void i915_driver_unload(struct drm_device *dev); -extern void intel_engine_init_hangcheck(struct intel_engine_cs *engine); -extern void intel_hangcheck_init(struct drm_i915_private *dev_priv); +void intel_engine_init_hangcheck(struct intel_engine_cs *engine); +void intel_hangcheck_init(struct drm_i915_private *dev_priv); int vlv_force_gfx_clock(struct drm_i915_private *dev_priv, bool 
on); u32 intel_calculate_mcr_s_ss_select(struct drm_i915_private *dev_priv); @@ -2670,14 +2668,14 @@ int intel_engine_cmd_parser(struct intel_engine_cs *engine, bool is_master); /* i915_perf.c */ -extern void i915_perf_init(struct drm_i915_private *dev_priv); -extern void i915_perf_fini(struct drm_i915_private *dev_priv); -extern void i915_perf_register(struct drm_i915_private *dev_priv); -extern void i915_perf_unregister(struct drm_i915_private *dev_priv); +void i915_perf_init(struct drm_i915_private *dev_priv); +void i915_perf_fini(struct drm_i915_private *dev_priv);
[Intel-gfx] [PATCH v2 5/5] drm/i915: Propagate "_probe" function name suffix down
Similar to the "_release" and "_remove" cases, consistently replace
"_init" components of names of functions called from i915_driver_probe()
with "_probe" suffixes for better code readability.

Signed-off-by: Janusz Krzysztofik
---
 drivers/gpu/drm/i915/i915_drv.c | 26 +-
 1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 4c4443757a36..ec4bb8038c9b 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -675,7 +675,7 @@ static const struct vga_switcheroo_client_ops i915_switcheroo_ops = {
 	.can_switch = i915_switcheroo_can_switch,
 };

-static int i915_load_modeset_init(struct drm_device *dev)
+static int i915_driver_modeset_probe(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = to_i915(dev);
 	struct pci_dev *pdev = dev_priv->drm.pdev;
@@ -884,7 +884,7 @@ static void intel_detect_preproduction_hw(struct drm_i915_private *dev_priv)
 }

 /**
- * i915_driver_init_early - setup state not requiring device access
+ * i915_driver_early_probe - setup state not requiring device access
  * @dev_priv: device private
  *
  * Initialize everything that is a "SW-only" state, that is state not
@@ -893,7 +893,7 @@ static void intel_detect_preproduction_hw(struct drm_i915_private *dev_priv)
  * system memory allocation, setting up device specific attributes and
  * function hooks not requiring accessing the device.
  */
-static int i915_driver_init_early(struct drm_i915_private *dev_priv)
+static int i915_driver_early_probe(struct drm_i915_private *dev_priv)
 {
 	int ret = 0;

@@ -963,7 +963,7 @@ static int i915_driver_init_early(struct drm_i915_private *dev_priv)

 /**
  * i915_driver_late_release - cleanup the setup done in
- *i915_driver_init_early()
+ *i915_driver_early_probe()
  * @dev_priv: device private
  */
 static void i915_driver_late_release(struct drm_i915_private *dev_priv)
@@ -980,7 +980,7 @@ static void i915_driver_late_release(struct drm_i915_private *dev_priv)
 }

 /**
- * i915_driver_init_mmio - setup device MMIO
+ * i915_driver_mmio_probe - setup device MMIO
  * @dev_priv: device private
  *
  * Setup minimal device state necessary for MMIO accesses later in the
@@ -988,7 +988,7 @@ static void i915_driver_late_release(struct drm_i915_private *dev_priv)
  * side effects or exposing the driver via kernel internal or user space
  * interfaces.
  */
-static int i915_driver_init_mmio(struct drm_i915_private *dev_priv)
+static int i915_driver_mmio_probe(struct drm_i915_private *dev_priv)
 {
 	int ret;

@@ -1029,7 +1029,7 @@ static int i915_driver_init_mmio(struct drm_i915_private *dev_priv)
 }

 /**
- * i915_driver_mmio_release - cleanup the setup done in i915_driver_init_mmio()
+ * i915_driver_mmio_release - cleanup the setup done in i915_driver_mmio_probe()
  * @dev_priv: device private
  */
 static void i915_driver_mmio_release(struct drm_i915_private *dev_priv)
@@ -1525,13 +1525,13 @@ static void edram_detect(struct drm_i915_private *dev_priv)
 }

 /**
- * i915_driver_init_hw - setup state requiring device access
+ * i915_driver_hw_probe - setup state requiring device access
  * @dev_priv: device private
  *
  * Setup state that requires accessing the device, but doesn't require
  * exposing the driver via kernel internal or userspace interfaces.
  */
-static int i915_driver_init_hw(struct drm_i915_private *dev_priv)
+static int i915_driver_hw_probe(struct drm_i915_private *dev_priv)
 {
 	struct pci_dev *pdev = dev_priv->drm.pdev;
 	int ret;
@@ -1900,7 +1900,7 @@ int i915_driver_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 	if (ret)
 		goto out_fini;

-	ret = i915_driver_init_early(dev_priv);
+	ret = i915_driver_early_probe(dev_priv);
 	if (ret < 0)
 		goto out_pci_disable;

@@ -1908,15 +1908,15 @@ int i915_driver_probe(struct pci_dev *pdev, const struct pci_device_id *ent)

 	i915_detect_vgpu(dev_priv);

-	ret = i915_driver_init_mmio(dev_priv);
+	ret = i915_driver_mmio_probe(dev_priv);
 	if (ret < 0)
 		goto out_runtime_pm_put;

-	ret = i915_driver_init_hw(dev_priv);
+	ret = i915_driver_hw_probe(dev_priv);
 	if (ret < 0)
 		goto out_cleanup_mmio;

-	ret = i915_driver_modeset_probe(&dev_priv->drm);
+	ret = i915_driver_modeset_probe(&dev_priv->drm);
 	if (ret < 0)
 		goto out_cleanup_hw;

--
2.21.0
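The value of pairing each "_probe" phase with a "_release" counterpart is easiest to see in the goto-based unwinding that i915_driver_probe() uses. The following is a simplified userspace sketch of that pattern, not the driver's code: the phase bitmask, `demo_fail_at` and every `demo_*` function are hypothetical stand-ins.

```c
/* Hypothetical phases; each one set up by probe is undone by release. */
enum { DEMO_EARLY = 1, DEMO_MMIO = 2, DEMO_HW = 4 };

static unsigned int demo_state;   /* bitmask of phases currently set up */
static unsigned int demo_fail_at; /* phase that should fail, 0 for none */

static int demo_phase_probe(unsigned int phase)
{
	if (demo_fail_at == phase)
		return -1;
	demo_state |= phase;
	return 0;
}

static void demo_phase_release(unsigned int phase)
{
	demo_state &= ~phase;
}

/* Set up phases in order; on failure, unwind in reverse order. */
static int demo_probe_all(void)
{
	int ret;

	ret = demo_phase_probe(DEMO_EARLY);
	if (ret < 0)
		goto out;

	ret = demo_phase_probe(DEMO_MMIO);
	if (ret < 0)
		goto out_late_release;

	ret = demo_phase_probe(DEMO_HW);
	if (ret < 0)
		goto out_mmio_release;

	return 0;

out_mmio_release:
	demo_phase_release(DEMO_MMIO);
out_late_release:
	demo_phase_release(DEMO_EARLY);
out:
	return ret;
}
```

With matching names, each label in the unwind path reads as the mirror image of the probe step that precedes it, which is the readability gain the series is after.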
[Intel-gfx] [PATCH v2 1/5] drm/i915: Rename "_load"/"_unload" to match PCI entry points
Current names of i915_driver_load/unload() functions originate in legacy
DRM stubs. Reduce nomenclature ambiguity by renaming them to match their
current use as helpers called from PCI entry points.

Suggested-by: Chris Wilson
Signed-off-by: Janusz Krzysztofik
---
 drivers/gpu/drm/i915/i915_drv.c | 8
 drivers/gpu/drm/i915/i915_drv.h | 4 ++--
 drivers/gpu/drm/i915/i915_pci.c | 4 ++--
 3 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 12182d2fc03c..8b72ae7c1f5d 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -1870,17 +1870,17 @@ static void i915_driver_destroy(struct drm_i915_private *i915)
 }

 /**
- * i915_driver_load - setup chip and create an initial config
+ * i915_driver_probe - setup chip and create an initial config
  * @pdev: PCI device
  * @ent: matching PCI ID entry
  *
- * The driver load routine has to do several things:
+ * The driver probe routine has to do several things:
  *   - drive output discovery via intel_modeset_init()
  *   - initialize the memory manager
  *   - allocate initial config memory
  *   - setup the DRM framebuffer with the allocated memory
  */
-int i915_driver_load(struct pci_dev *pdev, const struct pci_device_id *ent)
+int i915_driver_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 {
 	const struct intel_device_info *match_info =
 		(struct intel_device_info *)ent->driver_data;
@@ -1946,7 +1946,7 @@ int i915_driver_load(struct pci_dev *pdev, const struct pci_device_id *ent)
 	return ret;
 }

-void i915_driver_unload(struct drm_device *dev)
+void i915_driver_remove(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = to_i915(dev);
 	struct pci_dev *pdev = dev_priv->drm.pdev;
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 246f9cb625dc..7d650475790e 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2394,8 +2394,8 @@ long i915_compat_ioctl(struct file *filp, unsigned int cmd, unsigned long arg);
 #endif
 extern const struct dev_pm_ops i915_pm_ops;

-int i915_driver_load(struct pci_dev *pdev, const struct pci_device_id *ent);
-void i915_driver_unload(struct drm_device *dev);
+int i915_driver_probe(struct pci_dev *pdev, const struct pci_device_id *ent);
+void i915_driver_remove(struct drm_device *dev);

 void intel_engine_init_hangcheck(struct intel_engine_cs *engine);
 void intel_hangcheck_init(struct drm_i915_private *dev_priv);
diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
index 94b588e0a1dd..786ca7b3439b 100644
--- a/drivers/gpu/drm/i915/i915_pci.c
+++ b/drivers/gpu/drm/i915/i915_pci.c
@@ -848,7 +848,7 @@ static void i915_pci_remove(struct pci_dev *pdev)
 	if (!dev) /* driver load aborted, nothing to cleanup */
 		return;

-	i915_driver_unload(dev);
+	i915_driver_remove(dev);
 	drm_dev_put(dev);

 	pci_set_drvdata(pdev, NULL);
@@ -923,7 +923,7 @@ static int i915_pci_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 	if (vga_switcheroo_client_probe_defer(pdev))
 		return -EPROBE_DEFER;

-	err = i915_driver_load(pdev, ent);
+	err = i915_driver_probe(pdev, ent);
 	if (err)
 		return err;

--
2.21.0
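To illustrate the naming relationship this patch establishes, here is a small userspace mock in which the helpers carry the same suffix as the bus entry point that calls them. `struct demo_pci_driver`, `struct demo_dev` and the `demo_*` functions are hypothetical stand-ins and deliberately much simpler than the kernel's `struct pci_driver`.

```c
/* Hypothetical device state: just whether the driver is bound. */
struct demo_dev {
	int bound;
};

/* Hypothetical stand-in for the two PCI entry points of interest. */
struct demo_pci_driver {
	int  (*probe)(struct demo_dev *dev);
	void (*remove)(struct demo_dev *dev);
};

/* Helpers named after the entry point they are called from. */
static int demo_driver_probe(struct demo_dev *dev)
{
	dev->bound = 1;
	return 0;
}

static void demo_driver_remove(struct demo_dev *dev)
{
	dev->bound = 0;
}

static int demo_pci_probe(struct demo_dev *dev)
{
	return demo_driver_probe(dev);
}

static void demo_pci_remove(struct demo_dev *dev)
{
	if (!dev->bound) /* probe aborted, nothing to clean up */
		return;

	demo_driver_remove(dev);
}

static const struct demo_pci_driver demo_driver = {
	.probe  = demo_pci_probe,
	.remove = demo_pci_remove,
};
```

Reading a call chain such as `demo_pci_probe()` to `demo_driver_probe()` makes the phase obvious without looking up where a "_load" helper happens to be called from.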
[Intel-gfx] [PATCH v2 2/5] drm/i915: Replace "_load" with "_probe" consequently
Use the "_probe" nomenclature not only in i915_driver_probe() helper name but also in other related function / variable names for consistency. Only the userspace exposed name of a related module parameter is left untouched. Signed-off-by: Janusz Krzysztofik --- .../gpu/drm/i915/display/intel_connector.c| 2 +- drivers/gpu/drm/i915/gt/intel_engine_cs.c | 2 +- drivers/gpu/drm/i915/i915_drv.c | 20 +-- drivers/gpu/drm/i915/i915_drv.h | 10 +- drivers/gpu/drm/i915/i915_gem.c | 8 drivers/gpu/drm/i915/i915_pci.c | 2 +- drivers/gpu/drm/i915/intel_gvt.c | 2 +- drivers/gpu/drm/i915/intel_uncore.c | 2 +- drivers/gpu/drm/i915/intel_wopcm.c| 2 +- 9 files changed, 25 insertions(+), 25 deletions(-) diff --git a/drivers/gpu/drm/i915/display/intel_connector.c b/drivers/gpu/drm/i915/display/intel_connector.c index 41310f8e5a2a..d0163d86c42a 100644 --- a/drivers/gpu/drm/i915/display/intel_connector.c +++ b/drivers/gpu/drm/i915/display/intel_connector.c @@ -118,7 +118,7 @@ int intel_connector_register(struct drm_connector *connector) if (ret) goto err; - if (i915_inject_load_failure()) { + if (i915_inject_probe_failure()) { ret = -EFAULT; goto err_backlight; } diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c index bdf279fa3b2e..375b0561bd1d 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c @@ -426,7 +426,7 @@ int intel_engines_init_mmio(struct drm_i915_private *i915) WARN_ON(engine_mask & GENMASK(BITS_PER_TYPE(mask) - 1, I915_NUM_ENGINES)); - if (i915_inject_load_failure()) + if (i915_inject_probe_failure()) return -ENODEV; for (i = 0; i < ARRAY_SIZE(intel_engines); i++) { diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c index 8b72ae7c1f5d..ad24957ad86d 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -81,14 +81,14 @@ static struct drm_driver driver; #if IS_ENABLED(CONFIG_DRM_I915_DEBUG) -static unsigned int 
i915_load_fail_count; +static unsigned int i915_probe_fail_count; -bool __i915_inject_load_failure(const char *func, int line) +bool __i915_inject_probe_failure(const char *func, int line) { - if (i915_load_fail_count >= i915_modparams.inject_load_failure) + if (i915_probe_fail_count >= i915_modparams.inject_load_failure) return false; - if (++i915_load_fail_count == i915_modparams.inject_load_failure) { + if (++i915_probe_fail_count == i915_modparams.inject_load_failure) { DRM_INFO("Injecting failure at checkpoint %u [%s:%d]\n", i915_modparams.inject_load_failure, func, line); i915_modparams.inject_load_failure = 0; @@ -100,7 +100,7 @@ bool __i915_inject_load_failure(const char *func, int line) bool i915_error_injected(void) { - return i915_load_fail_count && !i915_modparams.inject_load_failure; + return i915_probe_fail_count && !i915_modparams.inject_load_failure; } #endif @@ -681,7 +681,7 @@ static int i915_load_modeset_init(struct drm_device *dev) struct pci_dev *pdev = dev_priv->drm.pdev; int ret; - if (i915_inject_load_failure()) + if (i915_inject_probe_failure()) return -ENODEV; if (HAS_DISPLAY(dev_priv)) { @@ -897,7 +897,7 @@ static int i915_driver_init_early(struct drm_i915_private *dev_priv) { int ret = 0; - if (i915_inject_load_failure()) + if (i915_inject_probe_failure()) return -ENODEV; intel_device_info_subplatform_init(dev_priv); @@ -991,7 +991,7 @@ static int i915_driver_init_mmio(struct drm_i915_private *dev_priv) { int ret; - if (i915_inject_load_failure()) + if (i915_inject_probe_failure()) return -ENODEV; if (i915_get_bridge_dev(dev_priv)) @@ -1535,7 +1535,7 @@ static int i915_driver_init_hw(struct drm_i915_private *dev_priv) struct pci_dev *pdev = dev_priv->drm.pdev; int ret; - if (i915_inject_load_failure()) + if (i915_inject_probe_failure()) return -ENODEV; intel_device_info_runtime_init(dev_priv); @@ -1941,7 +1941,7 @@ int i915_driver_probe(struct pci_dev *pdev, const struct pci_device_id *ent) out_pci_disable: pci_disable_device(pdev); 
out_fini: - i915_load_error(dev_priv, "Device initialization failed (%d)\n", ret); + i915_probe_error(dev_priv, "Device initialization failed (%d)\n", ret); i915_driver_destroy(dev_priv); return ret; } diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/dr
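The checkpoint counter renamed in this patch can be tried out in isolation. Below is a userspace sketch of the same logic as __i915_inject_probe_failure() and i915_error_injected() shown in the diff, with the DRM_INFO() message dropped; `demo_inject_failure_param` is a hypothetical stand-in for `i915_modparams.inject_load_failure`, and the other `demo_*` names are likewise made up for this example.

```c
#include <stdbool.h>

/* Stand-in for the module parameter: fail at the Nth checkpoint (0 = off). */
static unsigned int demo_inject_failure_param;
/* Number of checkpoints passed so far. */
static unsigned int demo_probe_fail_count;

/* Returns true exactly once, at the checkpoint selected by the parameter. */
static bool demo_inject_probe_failure(void)
{
	if (demo_probe_fail_count >= demo_inject_failure_param)
		return false;

	if (++demo_probe_fail_count == demo_inject_failure_param) {
		demo_inject_failure_param = 0; /* one-shot: disarm after firing */
		return true;
	}

	return false;
}

/* True once a failure has been injected and the parameter was cleared. */
static bool demo_error_injected(void)
{
	return demo_probe_fail_count && !demo_inject_failure_param;
}
```

With the parameter set to 3, the first two checkpoints pass, the third reports a failure and clears the parameter, and later checkpoints pass again.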
[Intel-gfx] [PATCH v2 3/5] drm/i915: Propagate "_release" function name suffix down
Replace mixed "_fini"/"_cleanup"/"_cleanup_hw" suffixes found in names of functions called from i915_driver_release() with "_release" suffix consistently. This provides better code readability, especially helpful when trying to work out which phase the code is in. Function names starting with "i915_driver_", i.e., those defined in drivers/gpu/drm/i915/i915_drv.c, just have the "cleanup" or "fini" parts of their names replaced with the "_release" suffix, while names of functions coming from other source files have been suffixed with "_driver_release" to avoid ambiguity with other possible .release entry points. v2: early_probe pairs better with late_release (Chris) Suggested-by: Chris Wilson Signed-off-by: Janusz Krzysztofik --- drivers/gpu/drm/i915/i915_drv.c | 33 + drivers/gpu/drm/i915/i915_drv.h | 2 +- drivers/gpu/drm/i915/i915_gem.c | 2 +- drivers/gpu/drm/i915/i915_gem_gtt.c | 4 +-- drivers/gpu/drm/i915/i915_gem_gtt.h | 2 +- drivers/gpu/drm/i915/intel_runtime_pm.c | 2 +- drivers/gpu/drm/i915/intel_runtime_pm.h | 2 +- 7 files changed, 24 insertions(+), 23 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c index ad24957ad86d..33bbe74cd441 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -752,7 +752,7 @@ static int i915_load_modeset_init(struct drm_device *dev) cleanup_gem: i915_gem_suspend(dev_priv); i915_gem_fini_hw(dev_priv); - i915_gem_fini(dev_priv); + i915_gem_driver_release(dev_priv); cleanup_modeset: intel_modeset_cleanup(dev); cleanup_irq: @@ -962,10 +962,11 @@ static int i915_driver_init_early(struct drm_i915_private *dev_priv) } /** - * i915_driver_cleanup_early - cleanup the setup done in i915_driver_init_early() + * i915_driver_late_release - cleanup the setup done in + *i915_driver_init_early() * @dev_priv: device private */ -static void i915_driver_cleanup_early(struct drm_i915_private *dev_priv) +static void i915_driver_late_release(struct drm_i915_private *dev_priv) { 
intel_irq_fini(dev_priv); intel_power_domains_cleanup(dev_priv); @@ -1028,10 +1029,10 @@ static int i915_driver_init_mmio(struct drm_i915_private *dev_priv) } /** - * i915_driver_cleanup_mmio - cleanup the setup done in i915_driver_init_mmio() + * i915_driver_mmio_release - cleanup the setup done in i915_driver_init_mmio() * @dev_priv: device private */ -static void i915_driver_cleanup_mmio(struct drm_i915_private *dev_priv) +static void i915_driver_mmio_release(struct drm_i915_private *dev_priv) { intel_teardown_mchbar(dev_priv); intel_uncore_fini_mmio(&dev_priv->uncore); @@ -1684,7 +1685,7 @@ static int i915_driver_init_hw(struct drm_i915_private *dev_priv) pci_disable_msi(pdev); pm_qos_remove_request(&dev_priv->pm_qos); err_ggtt: - i915_ggtt_cleanup_hw(dev_priv); + i915_ggtt_driver_release(dev_priv); err_perf: i915_perf_fini(dev_priv); return ret; @@ -1929,15 +1930,15 @@ int i915_driver_probe(struct pci_dev *pdev, const struct pci_device_id *ent) out_cleanup_hw: i915_driver_cleanup_hw(dev_priv); - i915_ggtt_cleanup_hw(dev_priv); + i915_ggtt_driver_release(dev_priv); /* Paranoia: make sure we have disabled everything before we exit. */ intel_sanitize_gt_powersave(dev_priv); out_cleanup_mmio: - i915_driver_cleanup_mmio(dev_priv); + i915_driver_mmio_release(dev_priv); out_runtime_pm_put: enable_rpm_wakeref_asserts(&dev_priv->runtime_pm); - i915_driver_cleanup_early(dev_priv); + i915_driver_late_release(dev_priv); out_pci_disable: pci_disable_device(pdev); out_fini: @@ -2000,19 +2001,19 @@ static void i915_driver_release(struct drm_device *dev) disable_rpm_wakeref_asserts(rpm); - i915_gem_fini(dev_priv); + i915_gem_driver_release(dev_priv); - i915_ggtt_cleanup_hw(dev_priv); + i915_ggtt_driver_release(dev_priv); /* Paranoia: make sure we have disabled everything before we exit. 
*/ intel_sanitize_gt_powersave(dev_priv); - i915_driver_cleanup_mmio(dev_priv); + i915_driver_mmio_release(dev_priv); enable_rpm_wakeref_asserts(rpm); - intel_runtime_pm_cleanup(rpm); + intel_runtime_pm_driver_release(rpm); - i915_driver_cleanup_early(dev_priv); + i915_driver_late_release(dev_priv); i915_driver_destroy(dev_priv); } @@ -2205,7 +2206,7 @@ static int i915_drm_suspend_late(struct drm_device *dev, bool hibernation) out: enable_rpm_wakeref_asserts(rpm); if (!dev_priv->uncore.user_forcewake.count) - intel_runtime_pm_cleanup(rpm); + intel_runtime_pm_driver
[Intel-gfx] [PATCH v2 0/5] drm/i915: Rename functions to match their entry points
The need for this was identified while working on splitting the driver
unbind path into _remove() and _release() parts. Consistency in function
naming has been recognized as helpful when trying to work out which phase
the code is in.

v2:
  * early_probe pairs better with late_release (Chris),
  * exclude patch 6/6 "drm/i915: Rename "inject_load_failure" module
    parameter" for now, it requires updates on user (IGT) side
  * rebase on top of "drm/i915: Drop extern qualifiers from header
    function prototypes"

Janusz Krzysztofik (5):
  drm/i915: Rename "_load"/"_unload" to match PCI entry points
  drm/i915: Replace "_load" with "_probe" consequently
  drm/i915: Propagate "_release" function name suffix down
  drm/i915: Propagate "_remove" function name suffix down
  drm/i915: Propagate "_probe" function name suffix down

 drivers/gpu/drm/i915/display/intel_bios.c | 4 +-
 drivers/gpu/drm/i915/display/intel_bios.h | 2 +-
 .../gpu/drm/i915/display/intel_connector.c | 2 +-
 drivers/gpu/drm/i915/display/intel_display.c | 2 +-
 .../drm/i915/display/intel_display_power.c | 6 +-
 .../drm/i915/display/intel_display_power.h | 2 +-
 drivers/gpu/drm/i915/gt/intel_engine_cs.c | 2 +-
 drivers/gpu/drm/i915/i915_drv.c | 107 +-
 drivers/gpu/drm/i915/i915_drv.h | 20 ++--
 drivers/gpu/drm/i915/i915_gem.c | 12 +-
 drivers/gpu/drm/i915/i915_gem_gtt.c | 4 +-
 drivers/gpu/drm/i915/i915_gem_gtt.h | 2 +-
 drivers/gpu/drm/i915/i915_pci.c | 6 +-
 drivers/gpu/drm/i915/intel_gvt.c | 7 +-
 drivers/gpu/drm/i915/intel_gvt.h | 5 +-
 drivers/gpu/drm/i915/intel_runtime_pm.c | 2 +-
 drivers/gpu/drm/i915/intel_runtime_pm.h | 2 +-
 drivers/gpu/drm/i915/intel_uncore.c | 2 +-
 drivers/gpu/drm/i915/intel_wopcm.c | 2 +-
 19 files changed, 97 insertions(+), 94 deletions(-)

--
2.21.0
[Intel-gfx] [PATCH v2 4/5] drm/i915: Propagate "_remove" function name suffix down
Similar to the "_release" case, consistently replace mixed "_cleanup"/"_fini"/"_fini_hw" components found in names of functions called from i915_driver_remove() with "_remove" or "_driver_remove" suffixes for better code readability. Signed-off-by: Janusz Krzysztofik --- drivers/gpu/drm/i915/display/intel_bios.c | 4 ++-- drivers/gpu/drm/i915/display/intel_bios.h | 2 +- drivers/gpu/drm/i915/display/intel_display.c | 2 +- .../drm/i915/display/intel_display_power.c| 6 ++--- .../drm/i915/display/intel_display_power.h| 2 +- drivers/gpu/drm/i915/i915_drv.c | 24 +-- drivers/gpu/drm/i915/i915_drv.h | 4 ++-- drivers/gpu/drm/i915/i915_gem.c | 2 +- drivers/gpu/drm/i915/intel_gvt.c | 5 ++-- drivers/gpu/drm/i915/intel_gvt.h | 5 ++-- 10 files changed, 29 insertions(+), 27 deletions(-) diff --git a/drivers/gpu/drm/i915/display/intel_bios.c b/drivers/gpu/drm/i915/display/intel_bios.c index 4fdbb5c35d87..4f709f5ddf07 100644 --- a/drivers/gpu/drm/i915/display/intel_bios.c +++ b/drivers/gpu/drm/i915/display/intel_bios.c @@ -1893,10 +1893,10 @@ void intel_bios_init(struct drm_i915_private *dev_priv) } /** - * intel_bios_cleanup - Free any resources allocated by intel_bios_init() + * intel_bios_driver_remove - Free any resources allocated by intel_bios_init() * @dev_priv: i915 device instance */ -void intel_bios_cleanup(struct drm_i915_private *dev_priv) +void intel_bios_driver_remove(struct drm_i915_private *dev_priv) { kfree(dev_priv->vbt.child_dev); dev_priv->vbt.child_dev = NULL; diff --git a/drivers/gpu/drm/i915/display/intel_bios.h b/drivers/gpu/drm/i915/display/intel_bios.h index 0b7be6389a07..4969189e620f 100644 --- a/drivers/gpu/drm/i915/display/intel_bios.h +++ b/drivers/gpu/drm/i915/display/intel_bios.h @@ -228,7 +228,7 @@ struct mipi_pps_data { } __packed; void intel_bios_init(struct drm_i915_private *dev_priv); -void intel_bios_cleanup(struct drm_i915_private *dev_priv); +void intel_bios_driver_remove(struct drm_i915_private *dev_priv); bool intel_bios_is_valid_vbt(const 
void *buf, size_t size); bool intel_bios_is_tv_present(struct drm_i915_private *dev_priv); bool intel_bios_is_lvds_present(struct drm_i915_private *dev_priv, u8 *i2c_pin); diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c index 0286b97caa22..daf73c2d23c2 100644 --- a/drivers/gpu/drm/i915/display/intel_display.c +++ b/drivers/gpu/drm/i915/display/intel_display.c @@ -17073,7 +17073,7 @@ static void intel_hpd_poll_fini(struct drm_device *dev) drm_connector_list_iter_end(&conn_iter); } -void intel_modeset_cleanup(struct drm_device *dev) +void intel_modeset_driver_remove(struct drm_device *dev) { struct drm_i915_private *dev_priv = to_i915(dev); diff --git a/drivers/gpu/drm/i915/display/intel_display_power.c b/drivers/gpu/drm/i915/display/intel_display_power.c index 7e22a2704843..db89550e3b6b 100644 --- a/drivers/gpu/drm/i915/display/intel_display_power.c +++ b/drivers/gpu/drm/i915/display/intel_display_power.c @@ -4429,7 +4429,7 @@ static void intel_power_domains_verify_state(struct drm_i915_private *dev_priv); * * It will return with power domains disabled (to be enabled later by * intel_power_domains_enable()) and must be paired with - * intel_power_domains_fini_hw(). + * intel_power_domains_driver_remove(). */ void intel_power_domains_init_hw(struct drm_i915_private *i915, bool resume) { @@ -4481,7 +4481,7 @@ void intel_power_domains_init_hw(struct drm_i915_private *i915, bool resume) } /** - * intel_power_domains_fini_hw - deinitialize hw power domain state + * intel_power_domains_driver_remove - deinitialize hw power domain state * @i915: i915 device instance * * De-initializes the display power domain HW state. It also ensures that the @@ -4491,7 +4491,7 @@ void intel_power_domains_init_hw(struct drm_i915_private *i915, bool resume) * intel_power_domains_disable()) and must be paired with * intel_power_domains_init_hw(). 
*/ -void intel_power_domains_fini_hw(struct drm_i915_private *i915) +void intel_power_domains_driver_remove(struct drm_i915_private *i915) { intel_wakeref_t wakeref __maybe_unused = fetch_and_zero(&i915->power_domains.wakeref); diff --git a/drivers/gpu/drm/i915/display/intel_display_power.h b/drivers/gpu/drm/i915/display/intel_display_power.h index 8f43f7051a16..dbd1f5ef01d1 100644 --- a/drivers/gpu/drm/i915/display/intel_display_power.h +++ b/drivers/gpu/drm/i915/display/intel_display_power.h @@ -214,7 +214,7 @@ void gen9_enable_dc5(struct drm_i915_private *dev_priv); int intel_power_domains_init(struct drm_i915_private *dev_priv); void intel_power_domains_cleanup(struct drm_i915_private *dev_priv); void intel_power_domains_init_hw(struct drm_i9
[Intel-gfx] [PATCH 1/2] drm/i915/gem: Avoid taking runtime-pm under the shrinker
From: Chris Wilson Inside the shrinker, we cannot wake the device as that may cause recursion into fs-reclaim, so instead we only unbind vma if the device is currently awake. (In order to provide reclaim while asleep, we do wake the device up during kswapd -- we probably want to limit that wake up if we have anything to shrink though!) To avoid the same fs_reclaim recursion potential during i915_gem_object_unbind, we acquire a wakeref there, see commit 3e817471a34c ("drm/i915/gem: Take runtime-pm wakeref prior to unbinding"). However, we use i915_gem_object_unbind from the shrinker path to make the object available for shrinking and so we must make the wakeref acquisition here conditional. <4> [437.542172] == <4> [437.542174] WARNING: possible circular locking dependency detected <4> [437.542176] 5.19.0-rc6-CI_DRM_11876-g2305e0d00665+ #1 Tainted: G U <4> [437.542179] -- <4> [437.542181] kswapd0/93 is trying to acquire lock: <4> [437.542183] 827a7608 (acpi_wakeup_lock){+.+.}-{3:3}, at: acpi_device_wakeup_disable+0x12/0x50 <4> [437.542191] but task is already holding lock: <4> [437.542194] 8275d360 (fs_reclaim){+.+.}-{0:0}, at: balance_pgdat+0x91/0x5c0 <4> [437.542199] which lock already depends on the new lock. 
<4> [437.542202] the existing dependency chain (in reverse order) is: <4> [437.542204] -> #2 (fs_reclaim){+.+.}-{0:0}: <4> [437.542207]fs_reclaim_acquire+0x9d/0xd0 <4> [437.542211]kmem_cache_alloc_trace+0x2a/0x250 <4> [437.542214]__acpi_device_add+0x263/0x3a0 <4> [437.542217]acpi_add_single_object+0x3ea/0x710 <4> [437.542220]acpi_bus_check_add+0xf7/0x240 <4> [437.54]acpi_bus_scan+0x34/0xf0 <4> [437.542224]acpi_scan_init+0xf5/0x241 <4> [437.542228]acpi_init+0x449/0x4aa <4> [437.542230]do_one_initcall+0x53/0x2e0 <4> [437.542233]kernel_init_freeable+0x18f/0x1dd <4> [437.542236]kernel_init+0x11/0x110 <4> [437.542239]ret_from_fork+0x1f/0x30 <4> [437.542241] -> #1 (acpi_device_lock){+.+.}-{3:3}: <4> [437.542245]__mutex_lock+0x97/0xf20 <4> [437.542246]acpi_enable_wakeup_device_power+0x30/0xf0 <4> [437.542249]__acpi_device_wakeup_enable+0x31/0x110 <4> [437.542252]acpi_pm_set_device_wakeup+0x55/0x100 <4> [437.542254]__pci_enable_wake+0x5e/0xa0 <4> [437.542257]pci_finish_runtime_suspend+0x32/0x70 <4> [437.542259]pci_pm_runtime_suspend+0xa3/0x160 <4> [437.542262]__rpm_callback+0x3d/0x110 <4> [437.542265]rpm_callback+0x54/0x60 <4> [437.542268]rpm_suspend.part.10+0x105/0x5a0 <4> [437.542270]pm_runtime_work+0x7d/0x1e0 <4> [437.542273]process_one_work+0x272/0x5c0 <4> [437.542276]worker_thread+0x37/0x370 <4> [437.542278]kthread+0xed/0x120 <4> [437.542280]ret_from_fork+0x1f/0x30 <4> [437.542282] -> #0 (acpi_wakeup_lock){+.+.}-{3:3}: <4> [437.542285]__lock_acquire+0x15ad/0x2940 <4> [437.542288]lock_acquire+0xd3/0x310 <4> [437.542291]__mutex_lock+0x97/0xf20 <4> [437.542293]acpi_device_wakeup_disable+0x12/0x50 <4> [437.542295]acpi_pm_set_device_wakeup+0x6e/0x100 <4> [437.542297]__pci_enable_wake+0x73/0xa0 <4> [437.542300]pci_pm_runtime_resume+0x45/0x90 <4> [437.542302]__rpm_callback+0x3d/0x110 <4> [437.542304]rpm_callback+0x54/0x60 <4> [437.542307]rpm_resume+0x54f/0x750 <4> [437.542309]__pm_runtime_resume+0x42/0x80 <4> [437.542311]__intel_runtime_pm_get+0x19/0x80 [i915] <4> 
[437.542386]i915_gem_object_unbind+0x8f/0x3b0 [i915] <4> [437.542487]i915_gem_shrink+0x634/0x850 [i915] <4> [437.542584]i915_gem_shrinker_scan+0x3a/0xc0 [i915] <4> [437.542679]shrink_slab.constprop.97+0x1a4/0x4f0 <4> [437.542684]shrink_node+0x21e/0x420 <4> [437.542687]balance_pgdat+0x241/0x5c0 <4> [437.542690]kswapd+0x229/0x4f0 <4> [437.542694]kthread+0xed/0x120 <4> [437.542697]ret_from_fork+0x1f/0x30 <4> [437.542701] other info that might help us debug this: <4> [437.542705] Chain exists of: acpi_wakeup_lock --> acpi_device_lock --> fs_reclaim <4> [437.542713] Possible unsafe locking scenario: <4> [437.542716]CPU0CPU1 <4> [437.542719] <4> [437.542721] lock(fs_reclaim); <4> [437.542725] lock(acpi_device_lock); <4
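The patch body is cut off in this archive, so the following is only a sketch of the idea stated in the commit message (make the wakeref acquisition conditional when called from the shrinker), not the actual change. It is a userspace model: `struct demo_rpm` and all `demo_*` functions are hypothetical, and the real driver's runtime-PM helpers are not reproduced here.

```c
#include <stdbool.h>

/* Hypothetical model of a runtime-PM controlled device. */
struct demo_rpm {
	bool awake;   /* device is currently resumed */
	int wakerefs; /* outstanding wakeful references */
	int resumes;  /* number of times the device had to be woken */
};

/* Unconditional get: may resume the device, so unsafe under fs_reclaim. */
static void demo_rpm_get(struct demo_rpm *rpm)
{
	if (!rpm->awake) {
		rpm->awake = true;
		rpm->resumes++;
	}
	rpm->wakerefs++;
}

/* Conditional get: succeeds only if the device is already awake. */
static bool demo_rpm_get_if_active(struct demo_rpm *rpm)
{
	if (!rpm->awake)
		return false;

	rpm->wakerefs++;
	return true;
}

static void demo_rpm_put(struct demo_rpm *rpm)
{
	rpm->wakerefs--;
}

/* Returns 0 on success, -1 if the object had to be skipped. */
static int demo_object_unbind(struct demo_rpm *rpm, bool from_shrinker)
{
	if (from_shrinker) {
		/* Never wake the device from inside reclaim. */
		if (!demo_rpm_get_if_active(rpm))
			return -1;
	} else {
		demo_rpm_get(rpm);
	}

	/* ... the vma would be unbound here ... */

	demo_rpm_put(rpm);
	return 0;
}
```

In this model a shrinker-path call on a sleeping device skips the object instead of resuming the device, which avoids the resume call chain that the lockdep report above flags while fs_reclaim is held.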
[Intel-gfx] [RFC PATCH 2/2] drm/i915/gem: Perform active shrinking from a background thread
From: Chris Wilson

i915 is very greedy and will retain system pages for as long as the user
requires them; once acquired they will only be returned when the object
is freed. In order to respond to system memory pressure, i915 hooks into
the shrinker subsystem, designed to prune the filesystem caches, to unbind
and return system pages. However, we can only do so if the device is
active at that moment, as we cannot resume the device from inside direct
reclaim to unbind pages from the GPU, nor do we want to delay random
processes with unbound waits trying to reclaim active pages.

To work around that quandary, what we avoided in direct reclaim we
delegated to kswapd, as that is run from process context outside of direct
reclaim and able to sleep and resume the device. In practice, kswapd also
uses fs_reclaim_acquire() around its shrink_slab calls, prohibiting
runtime resume. If we cannot wake the device from idle, we will retain
system memory indefinitely.

As we cannot take advantage of kswapd's decoupled process context to
perform an active reclaim of bound pages, spawn our own kthread to wait
under our wakeref. Similar to kswapd, there is no direct dependency on the
background task to direct reclaim (other than failure to promptly return
pages will implicitly result in oom), as such the task itself does not
inherit the fs-reclaim context.

A page reclaimed by i915 will typically not immediately be available for
re-use, as it will require writeback, and so only a future allocation
attempt may benefit. Concurrent page allocation attempts do not wait for
either kswapd or our own swapper task.

We mark our kthread as a memallocator (allowed to dip into memory
reserves, but not allowed to trigger direct reclaim) and mark up the call
to the shrinker with a fs_reclaim critical section. This should prevent us
from accidentally abusing the background swapper task, and so the swapper
kthread behaves like kswapd with the exception of being allowed to wake
the device up, and being decoupled from the shrinker_rwsem.

Reported-by: Thomas Hellström
Bug: https://gitlab.freedesktop.org/drm/intel/-/issues/6449
Fixes: 178a30c90ac7 ("drm/i915: Unbind objects in shrinker only if device is runtime active")
Signed-off-by: Chris Wilson
Cc: Thomas Hellström
Cc: Matthew Auld
Cc: Tvrtko Ursulin
Cc: sta...@vger.kernel.org # v4.8+
Signed-off-by: Janusz Krzysztofik
---
 drivers/gpu/drm/i915/gem/i915_gem_shrinker.c | 134 +--
 drivers/gpu/drm/i915/i915_drv.h | 15 +++
 2 files changed, 135 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
index 1030053571a2..bc6c1978e64a 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
@@ -310,6 +310,113 @@ i915_gem_shrinker_count(struct shrinker *shrinker, struct shrink_control *sc)
 	return count;
 }

+static unsigned long run_swapper(struct drm_i915_private *i915,
+				 unsigned long target,
+				 unsigned long *nr_scanned)
+{
+	return i915_gem_shrink(NULL, i915,
+			       target, nr_scanned,
+			       I915_SHRINK_ACTIVE |
+			       I915_SHRINK_BOUND |
+			       I915_SHRINK_UNBOUND |
+			       I915_SHRINK_WRITEBACK);
+}
+
+static int swapper(void *arg)
+{
+	struct drm_i915_private *i915 = arg;
+	atomic_long_t *target = &i915->mm.swapper.target;
+	unsigned int noreclaim_state;
+
+	/*
+	 * For us to be running the swapper implies that the system is under
+	 * enough memory pressure to be swapping. At that point, we both want
+	 * to ensure we make forward progress in order to reclaim pages from
+	 * the device and not contribute further to direct reclaim pressure. We
+	 * mark ourselves as a memalloc task in order to not trigger direct
+	 * reclaim ourselves, but dip into the system memory reserves for
+	 * shrinkers.
+	 */
+	noreclaim_state = memalloc_noreclaim_save();
+
+	do {
+		intel_wakeref_t wakeref;
+
+		___wait_var_event(target,
+				  atomic_long_read(target) ||
+				  kthread_should_stop(),
+				  TASK_IDLE, 0, 0, schedule());
+		if (kthread_should_stop())
+			break;
+
+		with_intel_runtime_pm(&i915->runtime_pm, wakeref) {
+			unsigned long nr_scan = atomic_long_xchg(target, 0);
+
+			/*
+			 * Now that we have woken up the device hierarchy,
+			 * act as a normal shrinker. Our shrinker is primarily
+			 * focussed on supporting direct reclaim (low latency,
+			 *
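The swapper loop in the hunk above waits until `target` is non-zero and then takes the whole pending amount with `atomic_long_xchg(target, 0)`. The side that posts requests is not visible in this excerpt, so the following userspace sketch assumes requests are simply added to the counter. `struct swapper_model`, `swapper_request()` and `swapper_claim()` are hypothetical names built on C11 atomics, not the kernel API.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for i915->mm.swapper.target. */
struct swapper_model {
	atomic_long target;	/* pages requested but not yet scanned */
};

/*
 * Producer side (assumed): a reclaim path that may not wake the device
 * records how many pages it would like scanned and returns immediately.
 * Returns nonzero if nothing was pending, i.e. the worker needs a wake-up.
 */
int swapper_request(struct swapper_model *s, long nr_pages)
{
	return atomic_fetch_add(&s->target, nr_pages) == 0;
}

/*
 * Consumer side: mirrors atomic_long_xchg(target, 0) in the patch. The
 * worker takes the whole pending amount in one step, so requests posted
 * while it is busy accumulate for the next pass instead of being lost.
 */
long swapper_claim(struct swapper_model *s)
{
	return atomic_exchange(&s->target, 0L);
}
```

Two requests of 32 and 96 pages followed by one `swapper_claim()` hand the worker 128 pages; a second claim with nothing posted in between returns 0, which in the patch corresponds to the thread going back to sleep in `___wait_var_event()`.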
Re: [Intel-gfx] [PATCH 1/2] drm/i915/gem: Avoid taking runtime-pm under the shrinker
Hi Matthew,

Thanks for the review.

On Tuesday, 26 July 2022 20:14:05 CEST Matthew Auld wrote:
> On 20/07/2022 11:16, Janusz Krzysztofik wrote:
> > From: Chris Wilson
> >
> > Inside the shrinker, we cannot wake the device as that may cause
> > recursion into fs-reclaim, so instead we only unbind vma if the device
> > is currently awake. (In order to provide reclaim while asleep, we do
> > wake the device up during kswapd -- we probably want to limit that wake
> > up if we have anything to shrink though!)
> >
> > To avoid the same fs_reclaim recursion potential during
> > i915_gem_object_unbind, we acquire a wakeref there, see commit
> > 3e817471a34c ("drm/i915/gem: Take runtime-pm wakeref prior to unbinding").
> > However, we use i915_gem_object_unbind from the shrinker path to make the
> > object available for shrinking and so we must make the wakeref acquisition
> > here conditional.
> >
> > <4> [437.542172] ==
> > <4> [437.542174] WARNING: possible circular locking dependency detected
> > <4> [437.542176] 5.19.0-rc6-CI_DRM_11876-g2305e0d00665+ #1 Tainted: G U
> > <4> [437.542179] --
> > <4> [437.542181] kswapd0/93 is trying to acquire lock:
> > <4> [437.542183] 827a7608 (acpi_wakeup_lock){+.+.}-{3:3}, at:
> > acpi_device_wakeup_disable+0x12/0x50
> > <4> [437.542191]
> > but task is already holding lock:
> > <4> [437.542194] 8275d360 (fs_reclaim){+.+.}-{0:0}, at:
> > balance_pgdat+0x91/0x5c0
> > <4> [437.542199]
> > which lock already depends on the new lock.
> > <4> [437.542202]
> > the existing dependency chain (in reverse order) is:
> > <4> [437.542204]
> > -> #2 (fs_reclaim){+.+.}-{0:0}:
> > <4> [437.542207]        fs_reclaim_acquire+0x9d/0xd0
> > <4> [437.542211]        kmem_cache_alloc_trace+0x2a/0x250
> > <4> [437.542214]        __acpi_device_add+0x263/0x3a0
> > <4> [437.542217]        acpi_add_single_object+0x3ea/0x710
> > <4> [437.542220]        acpi_bus_check_add+0xf7/0x240
> > <4> [437.54]        acpi_bus_scan+0x34/0xf0
> > <4> [437.542224]        acpi_scan_init+0xf5/0x241
> > <4> [437.542228]        acpi_init+0x449/0x4aa
> > <4> [437.542230]        do_one_initcall+0x53/0x2e0
> > <4> [437.542233]        kernel_init_freeable+0x18f/0x1dd
> > <4> [437.542236]        kernel_init+0x11/0x110
> > <4> [437.542239]        ret_from_fork+0x1f/0x30
> > <4> [437.542241]
> > -> #1 (acpi_device_lock){+.+.}-{3:3}:
> > <4> [437.542245]        __mutex_lock+0x97/0xf20
> > <4> [437.542246]        acpi_enable_wakeup_device_power+0x30/0xf0
> > <4> [437.542249]        __acpi_device_wakeup_enable+0x31/0x110
> > <4> [437.542252]        acpi_pm_set_device_wakeup+0x55/0x100
> > <4> [437.542254]        __pci_enable_wake+0x5e/0xa0
> > <4> [437.542257]        pci_finish_runtime_suspend+0x32/0x70
> > <4> [437.542259]        pci_pm_runtime_suspend+0xa3/0x160
> > <4> [437.542262]        __rpm_callback+0x3d/0x110
> > <4> [437.542265]        rpm_callback+0x54/0x60
> > <4> [437.542268]        rpm_suspend.part.10+0x105/0x5a0
> > <4> [437.542270]        pm_runtime_work+0x7d/0x1e0
> > <4> [437.542273]        process_one_work+0x272/0x5c0
> > <4> [437.542276]        worker_thread+0x37/0x370
> > <4> [437.542278]        kthread+0xed/0x120
> > <4> [437.542280]        ret_from_fork+0x1f/0x30
> > <4> [437.542282]
> > -> #0 (acpi_wakeup_lock){+.+.}-{3:3}:
> > <4> [437.542285]        __lock_acquire+0x15ad/0x2940
> > <4> [437.542288]        lock_acquire+0xd3/0x310
> > <4> [437.542291]        __mutex_lock+0x97/0xf20
> > <4> [437.542293]        acpi_device_wakeup_disable+0x12/0x50
> > <4> [437.542295]        acpi_pm_set_device_wakeup+0x6e/0x100
> > <4> [437.542297]        __pci_enable_wake+0x73/0xa0
> > <4> [437.542300]        pci_pm_runtime_resume+0x45/0x90
> > <4> [437.542302]        __rpm_callback+0x3d/0x110
> > <4> [437.542304]        rpm_callback+0x54/0x60
> > <4> [437.542307]        rpm_resume+0x54f/0x750
> > <4> [437.542309]        __pm_runtime_resume+0x42/0x80
> > <4> [437.542311]        __intel_runtime_pm_get+0x19/0x80 [i915]
> > <4> [437.542386]        i915_gem_object_unbind+0x8f/0x3b0 [i915]
> > <4> [437.
[Intel-gfx] [RESUBMIT][PATCH 1/2] drm/i915/gem: Avoid taking runtime-pm under the shrinker
From: Chris Wilson

Inside the shrinker, we cannot wake the device as that may cause
recursion into fs-reclaim, so instead we only unbind vma if the device
is currently awake. (In order to provide reclaim while asleep, we do
wake the device up during kswapd -- we probably want to limit that wake
up if we have anything to shrink though!)

To avoid the same fs_reclaim recursion potential during
i915_gem_object_unbind, we acquire a wakeref there, see commit
3e817471a34c ("drm/i915/gem: Take runtime-pm wakeref prior to unbinding").
However, we use i915_gem_object_unbind from the shrinker path to make the
object available for shrinking and so we must make the wakeref acquisition
here conditional.

<4> [437.542172] ==
<4> [437.542174] WARNING: possible circular locking dependency detected
<4> [437.542176] 5.19.0-rc6-CI_DRM_11876-g2305e0d00665+ #1 Tainted: G U
<4> [437.542179] --
<4> [437.542181] kswapd0/93 is trying to acquire lock:
<4> [437.542183] 827a7608 (acpi_wakeup_lock){+.+.}-{3:3}, at: acpi_device_wakeup_disable+0x12/0x50
<4> [437.542191] but task is already holding lock:
<4> [437.542194] 8275d360 (fs_reclaim){+.+.}-{0:0}, at: balance_pgdat+0x91/0x5c0
<4> [437.542199] which lock already depends on the new lock.
<4> [437.542202] the existing dependency chain (in reverse order) is:
<4> [437.542204] -> #2 (fs_reclaim){+.+.}-{0:0}:
<4> [437.542207]        fs_reclaim_acquire+0x9d/0xd0
<4> [437.542211]        kmem_cache_alloc_trace+0x2a/0x250
<4> [437.542214]        __acpi_device_add+0x263/0x3a0
<4> [437.542217]        acpi_add_single_object+0x3ea/0x710
<4> [437.542220]        acpi_bus_check_add+0xf7/0x240
<4> [437.54]        acpi_bus_scan+0x34/0xf0
<4> [437.542224]        acpi_scan_init+0xf5/0x241
<4> [437.542228]        acpi_init+0x449/0x4aa
<4> [437.542230]        do_one_initcall+0x53/0x2e0
<4> [437.542233]        kernel_init_freeable+0x18f/0x1dd
<4> [437.542236]        kernel_init+0x11/0x110
<4> [437.542239]        ret_from_fork+0x1f/0x30
<4> [437.542241] -> #1 (acpi_device_lock){+.+.}-{3:3}:
<4> [437.542245]        __mutex_lock+0x97/0xf20
<4> [437.542246]        acpi_enable_wakeup_device_power+0x30/0xf0
<4> [437.542249]        __acpi_device_wakeup_enable+0x31/0x110
<4> [437.542252]        acpi_pm_set_device_wakeup+0x55/0x100
<4> [437.542254]        __pci_enable_wake+0x5e/0xa0
<4> [437.542257]        pci_finish_runtime_suspend+0x32/0x70
<4> [437.542259]        pci_pm_runtime_suspend+0xa3/0x160
<4> [437.542262]        __rpm_callback+0x3d/0x110
<4> [437.542265]        rpm_callback+0x54/0x60
<4> [437.542268]        rpm_suspend.part.10+0x105/0x5a0
<4> [437.542270]        pm_runtime_work+0x7d/0x1e0
<4> [437.542273]        process_one_work+0x272/0x5c0
<4> [437.542276]        worker_thread+0x37/0x370
<4> [437.542278]        kthread+0xed/0x120
<4> [437.542280]        ret_from_fork+0x1f/0x30
<4> [437.542282] -> #0 (acpi_wakeup_lock){+.+.}-{3:3}:
<4> [437.542285]        __lock_acquire+0x15ad/0x2940
<4> [437.542288]        lock_acquire+0xd3/0x310
<4> [437.542291]        __mutex_lock+0x97/0xf20
<4> [437.542293]        acpi_device_wakeup_disable+0x12/0x50
<4> [437.542295]        acpi_pm_set_device_wakeup+0x6e/0x100
<4> [437.542297]        __pci_enable_wake+0x73/0xa0
<4> [437.542300]        pci_pm_runtime_resume+0x45/0x90
<4> [437.542302]        __rpm_callback+0x3d/0x110
<4> [437.542304]        rpm_callback+0x54/0x60
<4> [437.542307]        rpm_resume+0x54f/0x750
<4> [437.542309]        __pm_runtime_resume+0x42/0x80
<4> [437.542311]        __intel_runtime_pm_get+0x19/0x80 [i915]
<4> [437.542386]        i915_gem_object_unbind+0x8f/0x3b0 [i915]
<4> [437.542487]        i915_gem_shrink+0x634/0x850 [i915]
<4> [437.542584]        i915_gem_shrinker_scan+0x3a/0xc0 [i915]
<4> [437.542679]        shrink_slab.constprop.97+0x1a4/0x4f0
<4> [437.542684]        shrink_node+0x21e/0x420
<4> [437.542687]        balance_pgdat+0x241/0x5c0
<4> [437.542690]        kswapd+0x229/0x4f0
<4> [437.542694]        kthread+0xed/0x120
<4> [437.542697]        ret_from_fork+0x1f/0x30
<4> [437.542701] other info that might help us debug this:
<4> [437.542705] Chain exists of: acpi_wakeup_lock --> acpi_device_lock --> fs_reclaim
<4> [437.542713] Possible unsafe locking scenario:
<4> [437.542716]        CPU0                    CPU1
<4> [437.542719]
<4> [437.542721]   lock(fs_reclaim);
<4> [437.542725]                                lock(acpi_device_lock);
<
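The commit message above asks for the wakeref acquisition in the unbind path to become conditional when the caller is the shrinker. The patch body is not part of this excerpt, so the following is only a userspace sketch of that rule. `struct device_model`, the `model_pm_*` helpers and the `in_shrinker` flag are all hypothetical names and do not reflect the actual i915 change.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a runtime-PM managed device. */
struct device_model {
	bool awake;	/* true while the device is powered up */
	int wakerefs;	/* outstanding references keeping it awake */
	int resumes;	/* how many times a resume was triggered */
};

/* Unconditional get: resumes the device if it is asleep. */
void model_pm_get(struct device_model *dev)
{
	if (!dev->awake) {
		dev->awake = true;
		dev->resumes++;
	}
	dev->wakerefs++;
}

/* Conditional get: succeeds only if the device is already awake. */
bool model_pm_get_if_awake(struct device_model *dev)
{
	if (!dev->awake)
		return false;
	dev->wakerefs++;
	return true;
}

void model_pm_put(struct device_model *dev)
{
	dev->wakerefs--;
}

/*
 * Unbind as seen from the two kinds of caller. From the shrinker we must
 * never trigger a resume, so a sleeping device means the object is skipped.
 * Returns true if the unbind went ahead.
 */
bool model_unbind(struct device_model *dev, bool in_shrinker)
{
	if (in_shrinker) {
		if (!model_pm_get_if_awake(dev))
			return false;	/* skip: no resume under fs_reclaim */
	} else {
		model_pm_get(dev);
	}

	/* ... the actual unbind work would happen here ... */

	model_pm_put(dev);
	return true;
}
```

With a sleeping device, `model_unbind(&dev, true)` returns false and leaves `dev.resumes` at 0, so the resume path that takes acpi_wakeup_lock in the trace is never entered from reclaim. The same call with `in_shrinker` set to false resumes the device first.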
[Intel-gfx] [RESUBMIT][PATCH 2/2] drm/i915/gem: Perform active shrinking from a background thread
From: Chris Wilson

i915 is very greedy and will retain system pages for as long as the user
requires them; once acquired they will only be returned when the object
is freed. In order to respond to system memory pressure, i915 hooks into
the shrinker subsystem, designed to prune the filesystem caches, to unbind
and return system pages. However, we can only do so if the device is
active at that moment, as we cannot resume the device from inside direct
reclaim to unbind pages from the GPU, nor do we want to delay random
processes with unbound waits trying to reclaim active pages.

To work around that quandary, what we avoided in direct reclaim we
delegated to kswapd, as that is run from process context outside of direct
reclaim and able to sleep and resume the device. In practice, kswapd also
uses fs_reclaim_acquire() around its shrink_slab calls, prohibiting
runtime resume. If we cannot wake the device from idle, we will retain
system memory indefinitely.

As we cannot take advantage of kswapd's decoupled process context to
perform an active reclaim of bound pages, spawn our own kthread to wait
under our wakeref. Similar to kswapd, there is no direct dependency on the
background task to direct reclaim (other than failure to promptly return
pages will implicitly result in oom), as such the task itself does not
inherit the fs-reclaim context.

A page reclaimed by i915 will typically not immediately be available for
re-use, as it will require writeback, and so only a future allocation
attempt may benefit. Concurrent page allocation attempts do not wait for
either kswapd or our own swapper task.

We mark our kthread as a memallocator (allowed to dip into memory
reserves, but not allowed to trigger direct reclaim) and mark up the call
to the shrinker with a fs_reclaim critical section. This should prevent us
from accidentally abusing the background swapper task, and so the swapper
kthread behaves like kswapd with the exception of being allowed to wake
the device up, and being decoupled from the shrinker_rwsem.

Reported-by: Thomas Hellström
Bug: https://gitlab.freedesktop.org/drm/intel/-/issues/6449
Fixes: 178a30c90ac7 ("drm/i915: Unbind objects in shrinker only if device is runtime active")
Signed-off-by: Chris Wilson
Cc: Thomas Hellström
Cc: Matthew Auld
Cc: Tvrtko Ursulin
Cc: sta...@vger.kernel.org # v4.8+
Signed-off-by: Janusz Krzysztofik
---
Resubmit reason: drop RFC label.

 drivers/gpu/drm/i915/gem/i915_gem_shrinker.c | 134 +--
 drivers/gpu/drm/i915/i915_drv.h | 15 +++
 2 files changed, 135 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
index 1030053571a2..bc6c1978e64a 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
@@ -310,6 +310,113 @@ i915_gem_shrinker_count(struct shrinker *shrinker, struct shrink_control *sc)
 	return count;
 }

+static unsigned long run_swapper(struct drm_i915_private *i915,
+				 unsigned long target,
+				 unsigned long *nr_scanned)
+{
+	return i915_gem_shrink(NULL, i915,
+			       target, nr_scanned,
+			       I915_SHRINK_ACTIVE |
+			       I915_SHRINK_BOUND |
+			       I915_SHRINK_UNBOUND |
+			       I915_SHRINK_WRITEBACK);
+}
+
+static int swapper(void *arg)
+{
+	struct drm_i915_private *i915 = arg;
+	atomic_long_t *target = &i915->mm.swapper.target;
+	unsigned int noreclaim_state;
+
+	/*
+	 * For us to be running the swapper implies that the system is under
+	 * enough memory pressure to be swapping. At that point, we both want
+	 * to ensure we make forward progress in order to reclaim pages from
+	 * the device and not contribute further to direct reclaim pressure. We
+	 * mark ourselves as a memalloc task in order to not trigger direct
+	 * reclaim ourselves, but dip into the system memory reserves for
+	 * shrinkers.
+	 */
+	noreclaim_state = memalloc_noreclaim_save();
+
+	do {
+		intel_wakeref_t wakeref;
+
+		___wait_var_event(target,
+				  atomic_long_read(target) ||
+				  kthread_should_stop(),
+				  TASK_IDLE, 0, 0, schedule());
+		if (kthread_should_stop())
+			break;
+
+		with_intel_runtime_pm(&i915->runtime_pm, wakeref) {
+			unsigned long nr_scan = atomic_long_xchg(target, 0);
+
+			/*
+			 * Now that we have woken up the device hierarchy,
+			 * act as a normal shrinker. Our shrinker is primarily
+			 * focussed on supportin