[Intel-gfx] [PATCH] drm/i915: Flush buffer pools on driver remove

2021-06-08 Thread Janusz Krzysztofik
In preparation for clean driver release, attempts to drain work queues
and release freed objects are made at driver remove time.  However, GT
buffer pools are currently not flushed before the driver release phase.
Since unused objects may stay there for up to one second, some may
survive until driver release is attempted.  That can potentially
explain sporadic, and thus hard to reproduce, issues observed at driver
release time, like a non-zero shrink counter or outstanding address space
areas.

Flush buffer pools on GT remove as a potential fix.
Also, don't flush the pools at driver release again, just assert that
the flush was called and nothing more was added in between (suggested by
Chris).

Signed-off-by: Janusz Krzysztofik 
Cc: Chris Wilson 
---
 drivers/gpu/drm/i915/gt/intel_gt.c | 2 ++
 drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c | 2 --
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c 
b/drivers/gpu/drm/i915/gt/intel_gt.c
index 2161bf01ef8b..c03b399bfaf5 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt.c
@@ -652,6 +652,8 @@ void intel_gt_driver_remove(struct intel_gt *gt)
intel_uc_driver_remove(&gt->uc);
 
intel_engines_release(gt);
+
+   intel_gt_flush_buffer_pool(gt);
 }
 
 void intel_gt_driver_unregister(struct intel_gt *gt)
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c 
b/drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c
index aa0a59c5b614..acc49c56a9f3 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c
@@ -245,8 +245,6 @@ void intel_gt_fini_buffer_pool(struct intel_gt *gt)
struct intel_gt_buffer_pool *pool = &gt->buffer_pool;
int n;
 
-   intel_gt_flush_buffer_pool(gt);
-
for (n = 0; n < ARRAY_SIZE(pool->cache_list); n++)
GEM_BUG_ON(!list_empty(&pool->cache_list[n]));
 }
-- 
2.25.1

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH RESEND] drm/i915: Flush buffer pools on driver remove

2021-09-23 Thread Janusz Krzysztofik
Hi Matt,

Thanks for review.

On Thursday, 23 September 2021 00:24:29 CEST Matt Roper wrote:
> On Fri, Sep 03, 2021 at 04:23:20PM +0200, Janusz Krzysztofik wrote:
> > In preparation for clean driver release, attempts to drain work queues
> > and release freed objects are made at driver remove time.  However, GT
> > buffer pools are currently not flushed before the driver release phase.
> > Since unused objects may stay there for up to one second, some may
> > survive until driver release is attempted.  That can potentially
> > explain sporadic, and thus hard to reproduce, issues observed at driver
> > release time, like a non-zero shrink counter or outstanding address space
> 
> So just to make sure I'm understanding the description here:
>  - We currently do an explicit flush of the buffer pools within the call
>path of drm_driver.release(); this removes all buffers, regardless of
>their age.

And also triggers release of the buffers' underlying resources (objects, 
address space areas).

>  - However there may be other code that runs *earlier* within the
>drm_driver.release() call chain 

Yes, within the drm_driver.release() call chain, but not necessarily earlier 
-- that's irrelevant, I believe, ...

>that expects buffer pools have
>already been flushed and are already empty.

... since that other code expects not just buffer pools but resource 
categories they consume (objects, address space areas) to be flushed already, 
while some may still be busy with old buffers not auto-flushed yet.

>  - Since buffer pools auto-flush old buffers once per second in a worker
>thread, there's a small window where if we remove the driver while
>there are still buffers with an age of less than one second, the
>assumptions of the other release code may be violated.

Correct.

> So by moving the flush to driver remove (which executes earlier via the
> pci_driver.remove() flow) you're ensuring that all buffers are flushed
> before _any_ code in drm_driver.release() executes.

And also flushed before some other code in pci_driver.remove() flushes those 
other resource categories released on buffer pools flush, then completeness of 
all those flushes is checked in drm_driver.release().

May I copy-paste some of your wording while rephrasing my commit description?

Thanks,
Janusz

> 
> I found the wording of the commit message here somewhat confusing since
> it's talking about flushes we do in driver release, but mentions
> problems that arise during driver release due to lack of flushing.  You
> might want to reword the commit message somewhat to help clarify.
> Otherwise, the code change itself looks reasonable to me.
> 
> BTW, I do notice that drm_driver.release() in general is technically
> deprecated at this point (with a suggestion in the drm_drv.h comments to
> switch to using drmm_add_action(), drmm_kmalloc(), etc. to manage the
> cleanup of resources).  At some point in the future we may want to
> rework the i915 cleanup in general according to that guidance.
> 
> 
> Matt
> 
> > areas.
> > 
> > Flush buffer pools on GT remove as a fix.  On driver release, don't
> > flush the pools again, just assert that the flush was called and
> > nothing more was added in between.
> > 
> > Signed-off-by: Janusz Krzysztofik 
> > Cc: Chris Wilson 
> > ---
> > Resending with Cc: dri-de...@lists.freedesktop.org as requested, and a
> > typo in commit description fixed.
> > 
> > Thanks,
> > Janusz
> > 
> >  drivers/gpu/drm/i915/gt/intel_gt.c | 2 ++
> >  drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c | 2 --
> >  2 files changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c 
> > b/drivers/gpu/drm/i915/gt/intel_gt.c
> > index 62d40c986642..8f322a4ecd87 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_gt.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_gt.c
> > @@ -737,6 +737,8 @@ void intel_gt_driver_remove(struct intel_gt *gt)
> > intel_uc_driver_remove(&gt->uc);
> >  
> > intel_engines_release(gt);
> > +
> > +   intel_gt_flush_buffer_pool(gt);
> >  }
> >  
> >  void intel_gt_driver_unregister(struct intel_gt *gt)
> > diff --git a/drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c 
> > b/drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c
> > index aa0a59c5b614..acc49c56a9f3 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c
> > @@ -245,8 +245,6 @@ void intel_gt_fini_buffer_pool(struct intel_gt *gt)
> > struct intel_gt_buffer_pool *pool = &gt->buffer_pool;
> > int n;
> >  
> > -   intel_gt_flush_buffer_pool(gt);
> > -
> > for (n = 0; n < ARRAY_SIZE(pool->cache_list); n++)
> > GEM_BUG_ON(!list_empty(&pool->cache_list[n]));
> >  }
> 
> 






[Intel-gfx] [PATCH v2] drm/i915: Flush buffer pools on driver remove

2021-09-24 Thread Janusz Krzysztofik
We currently do an explicit flush of the buffer pools within the call path
of drm_driver.release(); this removes all buffers, regardless of their age,
freeing the buffers' associated resources (objects, address space areas).
However there is other code that runs within the drm_driver.release() call
chain that expects objects and their associated address space areas have
already been flushed.

Since buffer pools auto-flush old buffers once per second in a worker
thread, there's a small window where if we remove the driver while there
are still objects in buffers with an age of less than one second, the
assumptions of the other release code may be violated.

By moving the flush to driver remove (which executes earlier via the
pci_driver.remove() flow) we ensure that all buffers are flushed and
their associated objects freed before other code in pci_driver.remove()
flushes those objects.  The objects are thus released before _any_ code
in drm_driver.release() that checks completeness of those flushes
executes.

v2: Reword commit description as suggested by Matt.

Signed-off-by: Janusz Krzysztofik 
Cc: Chris Wilson 
Cc: Matt Roper 
---
 drivers/gpu/drm/i915/gt/intel_gt.c | 2 ++
 drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c | 2 --
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c 
b/drivers/gpu/drm/i915/gt/intel_gt.c
index 4037c3778225..5b3acf2b064e 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt.c
@@ -741,6 +741,8 @@ void intel_gt_driver_remove(struct intel_gt *gt)
intel_uc_driver_remove(&gt->uc);
 
intel_engines_release(gt);
+
+   intel_gt_flush_buffer_pool(gt);
 }
 
 void intel_gt_driver_unregister(struct intel_gt *gt)
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c 
b/drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c
index aa0a59c5b614..acc49c56a9f3 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c
@@ -245,8 +245,6 @@ void intel_gt_fini_buffer_pool(struct intel_gt *gt)
struct intel_gt_buffer_pool *pool = &gt->buffer_pool;
int n;
 
-   intel_gt_flush_buffer_pool(gt);
-
for (n = 0; n < ARRAY_SIZE(pool->cache_list); n++)
GEM_BUG_ON(!list_empty(&pool->cache_list[n]));
 }
-- 
2.25.1
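
The ordering argument in the v2 description can be illustrated with a small user-space model.  This is only a sketch under stated assumptions: `struct mock_pool` and the `mock_*` functions are hypothetical stand-ins invented for illustration, not the i915 structures or functions, and the one-second worker is modelled simply as "buffers may still be cached when remove starts".

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a GT buffer pool; not the i915 structure. */
struct mock_pool {
	size_t cached;       /* buffers parked in the pool, not yet aged out */
	size_t live_objects; /* objects still held by those cached buffers */
};

/* Drop every cached buffer regardless of age, releasing its object. */
void mock_pool_flush(struct mock_pool *pool)
{
	pool->live_objects -= pool->cached;
	pool->cached = 0;
}

/* Remove phase: flush here, before any release-time check can run. */
void mock_driver_remove(struct mock_pool *pool)
{
	mock_pool_flush(pool);
}

/* Release phase: no flushing any more, only verify nothing is left. */
bool mock_driver_release_is_clean(const struct mock_pool *pool)
{
	return pool->cached == 0 && pool->live_objects == 0;
}
```

In this model, a pool that still caches young buffers fails the release-time check unless `mock_driver_remove()` has run first, which mirrors why the patch moves the flush into the remove path and leaves only an assertion in the release path.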



Re: [Intel-gfx] [RFC PATCH i-g-t] lib/i915/perf: Fix non-card0 processing

2021-05-05 Thread Janusz Krzysztofik
Hi Lionel,

On Monday, 3 May 2021 09:07:09 CEST Lionel Landwerlin wrote:
> On 30/04/2021 19:18, Janusz Krzysztofik wrote:
> > IGT i915/perf library functions now always operate on sysfs perf
> > attributes of card0 device node, no matter which DRM device fd a user
> > passes.  The intention was to always switch to primary device node if
> > a user passes a render device node fd, but that breaks handling of
> > non-card0 devices.
> >
> > Instead of forcibly using DRM device minor number 0 when opening a
> > device sysfs area, convert device minor number of a user passed device
> > fd to the minor number of respective primary (cardX) device node.
> >
> > Signed-off-by: Janusz Krzysztofik 
> > ---
> >   lib/i915/perf.c | 4 ++--
> >   1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/lib/i915/perf.c b/lib/i915/perf.c
> > index 56d5c0b3a..336824df7 100644
> > --- a/lib/i915/perf.c
> > +++ b/lib/i915/perf.c
> > @@ -376,8 +376,8 @@ open_master_sysfs_dir(int drm_fd)
> > if (fstat(drm_fd, &st) || !S_ISCHR(st.st_mode))
> >   return -1;
> >   
> > -snprintf(path, sizeof(path), "/sys/dev/char/%d:0",
> > - major(st.st_rdev));
> > +snprintf(path, sizeof(path), "/sys/dev/char/%d:%d",
> > + major(st.st_rdev), minor(st.st_rdev) & ~128);
> 
> 
> Isn't it minor(st.st_rdev) & 0xff ? 

Did you mean 0x7f?

> or even 0x3f ?
> 
> Looks like /dev/dri/controlD64 can exist too.

Not any longer, see commit 0d49f303e8a7 ("drm: remove all control node code").

However, my approach of applying a mask is oversimplified.  Minor numbers for 
different node types (primary and render) are handled separately.  I'm going 
to propose a method similar to that implemented in igt_debugfs_path().

Thanks,
Janusz
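
To make the minor-number discussion above concrete, here is a minimal sketch.  It assumes the layout implied by the patches in this thread (primary nodes at minors below 64, render nodes at 128 and above, and the former control nodes in between); the enum and function names are hypothetical and exist only for this illustration.

```c
/* Assumed layout: each DRM node type owns a block of 64 minor numbers. */
enum node_type {
	NODE_PRIMARY,        /* cardX */
	NODE_LEGACY_CONTROL, /* controlDX, no longer created by the kernel */
	NODE_RENDER,         /* renderDX */
	NODE_UNKNOWN
};

enum node_type classify_drm_minor(unsigned int minor)
{
	if (minor < 64)
		return NODE_PRIMARY;
	if (minor < 128)
		return NODE_LEGACY_CONTROL;
	if (minor < 192)
		return NODE_RENDER;
	return NODE_UNKNOWN;
}

/* The shortcut from the RFC patch: clear bit 7 of the minor number. */
unsigned int masked_minor(unsigned int minor)
{
	return minor & ~128u;
}
```

`masked_minor(129)` yields 1, which is a valid primary minor, but nothing guarantees that it belongs to the same device as the render node with minor 129, since the two ranges are allocated independently.  That is why the mask alone cannot be relied on to find the matching cardX node.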


> 
> 
> -Lionel
> 
> 
> >   
> > return open(path, O_DIRECTORY);
> >   }
> 
> 
> 






[Intel-gfx] [PATCH i-g-t v2] lib/i915/perf: Fix non-card0 processing

2021-05-05 Thread Janusz Krzysztofik
IGT i915/perf library functions now always operate on sysfs perf
attributes of card0 device node, no matter which DRM device fd a user
passes.  The intention was to always switch to primary device node if
a user passes a render device node fd, but that breaks handling of
non-card0 devices.

If a user passed a render device node fd, find a primary device node of
the same device and use it instead of forcibly using the primary device
with minor number 0 when opening the device sysfs area.

v2: Don't assume primary minor matches render minor with masked type.

Signed-off-by: Janusz Krzysztofik 
Cc: Lionel Landwerlin 
---
 lib/i915/perf.c | 31 ---
 1 file changed, 28 insertions(+), 3 deletions(-)

diff --git a/lib/i915/perf.c b/lib/i915/perf.c
index 56d5c0b3a..d7768468e 100644
--- a/lib/i915/perf.c
+++ b/lib/i915/perf.c
@@ -372,14 +372,39 @@ open_master_sysfs_dir(int drm_fd)
 {
char path[128];
struct stat st;
+   int sysfs;
 
if (fstat(drm_fd, &st) || !S_ISCHR(st.st_mode))
 return -1;
 
-snprintf(path, sizeof(path), "/sys/dev/char/%d:0",
- major(st.st_rdev));
+   snprintf(path, sizeof(path), "/sys/dev/char/%d:%d", major(st.st_rdev), 
minor(st.st_rdev));
+   sysfs = open(path, O_DIRECTORY);
 
-   return open(path, O_DIRECTORY);
+   if (sysfs >= 0 && minor(st.st_rdev) >= 128) {
+   char device[100], cmp[100];
+   int device_len, cmp_len, i;
+
+   device_len = readlinkat(sysfs, "device", device, 
sizeof(device));
+   close(sysfs);
+   if (device_len < 0)
+   return device_len;
+
+   for (i = 0; i < 128; i++) {
+
+   snprintf(path, sizeof(path), "/sys/dev/char/%d:%d", 
major(st.st_rdev), i);
+   sysfs = open(path, O_DIRECTORY);
+   if (sysfs < 0)
+   continue;
+
+   cmp_len = readlinkat(sysfs, "device", cmp, sizeof(cmp));
+   if (cmp_len == device_len && !memcmp(cmp, device, 
cmp_len))
+   break;
+
+   close(sysfs);
+   }
+   }
+
+   return sysfs;
 }
 
 struct intel_perf *
-- 
2.25.1



[Intel-gfx] [PATCH i-g-t v3] lib/i915/perf: Fix non-card0 processing

2021-05-05 Thread Janusz Krzysztofik
IGT i915/perf library functions now always operate on sysfs perf
attributes of card0 device node, no matter which DRM device fd a user
passes.  The intention was to always switch to primary device node if
a user passes a render device node fd, but that breaks handling of
non-card0 devices.

If a user passed a render device node fd, find a primary device node of
the same device and use it instead of forcibly using the primary device
with minor number 0 when opening the device sysfs area.

v2: Don't assume primary minor matches render minor with masked type.
v3: Reset sysfs dir fd if no match, consequently spell out error paths,
add a comment on conversion of renderD* to cardX (Lionel).

Signed-off-by: Janusz Krzysztofik 
Reviewed-by: Lionel Landwerlin 
---
 lib/i915/perf.c | 35 ---
 1 file changed, 32 insertions(+), 3 deletions(-)

diff --git a/lib/i915/perf.c b/lib/i915/perf.c
index 56d5c0b3a..b9e10519e 100644
--- a/lib/i915/perf.c
+++ b/lib/i915/perf.c
@@ -372,14 +372,43 @@ open_master_sysfs_dir(int drm_fd)
 {
char path[128];
struct stat st;
+   int sysfs;
 
if (fstat(drm_fd, &st) || !S_ISCHR(st.st_mode))
 return -1;
 
-snprintf(path, sizeof(path), "/sys/dev/char/%d:0",
- major(st.st_rdev));
+   snprintf(path, sizeof(path), "/sys/dev/char/%d:%d", major(st.st_rdev), 
minor(st.st_rdev));
+   sysfs = open(path, O_DIRECTORY);
+   if (sysfs < 0)
+   return sysfs;
 
-   return open(path, O_DIRECTORY);
+   if (minor(st.st_rdev) >= 128) {
+   /* If we were given a renderD* drm_fd, find its associated cardX node. */
+   char device[100], cmp[100];
+   int device_len, cmp_len, i;
+
+   device_len = readlinkat(sysfs, "device", device, 
sizeof(device));
+   close(sysfs);
+   if (device_len < 0)
+   return device_len;
+
+   for (i = 0; i < 128; i++) {
+
+   snprintf(path, sizeof(path), "/sys/dev/char/%d:%d", 
major(st.st_rdev), i);
+   sysfs = open(path, O_DIRECTORY);
+   if (sysfs < 0)
+   continue;
+
+   cmp_len = readlinkat(sysfs, "device", cmp, sizeof(cmp));
+   if (cmp_len == device_len && !memcmp(cmp, device, 
cmp_len))
+   break;
+
+   close(sysfs);
+   sysfs = -1;
+   }
+   }
+
+   return sysfs;
 }
 
 struct intel_perf *
-- 
2.25.1



[Intel-gfx] [PATCH i-g-t v4] lib/i915/perf: Fix non-card0 processing

2021-05-05 Thread Janusz Krzysztofik
IGT i915/perf library functions now always operate on sysfs perf
attributes of card0 device node, no matter which DRM device fd a user
passes.  The intention was to always switch to primary device node if
a user passes a render device node fd, but that breaks handling of
non-card0 devices.

If a user passed a render device node fd, find a primary device node of
the same device and use it instead of forcibly using the primary device
with minor number 0 when opening the device sysfs area.

v2: Don't assume primary minor matches render minor with masked type.
v3: Reset sysfs dir fd if no match, consequently spell out error paths,
add a comment on conversion of renderD* to cardX (Lionel).
v4: Limit primary lookup to minors <64 (Chris)

Signed-off-by: Janusz Krzysztofik 
Reviewed-by: Lionel Landwerlin  # v3
Cc: Chris Wilson 
---
 lib/i915/perf.c | 35 ---
 1 file changed, 32 insertions(+), 3 deletions(-)

diff --git a/lib/i915/perf.c b/lib/i915/perf.c
index 56d5c0b3a..5644a3469 100644
--- a/lib/i915/perf.c
+++ b/lib/i915/perf.c
@@ -372,14 +372,43 @@ open_master_sysfs_dir(int drm_fd)
 {
char path[128];
struct stat st;
+   int sysfs;
 
if (fstat(drm_fd, &st) || !S_ISCHR(st.st_mode))
 return -1;
 
-snprintf(path, sizeof(path), "/sys/dev/char/%d:0",
- major(st.st_rdev));
+   snprintf(path, sizeof(path), "/sys/dev/char/%d:%d", major(st.st_rdev), 
minor(st.st_rdev));
+   sysfs = open(path, O_DIRECTORY);
+   if (sysfs < 0)
+   return sysfs;
 
-   return open(path, O_DIRECTORY);
+   if (minor(st.st_rdev) >= 128) {
+   /* If we were given a renderD* drm_fd, find its associated cardX node. */
+   char device[100], cmp[100];
+   int device_len, cmp_len, i;
+
+   device_len = readlinkat(sysfs, "device", device, 
sizeof(device));
+   close(sysfs);
+   if (device_len < 0)
+   return device_len;
+
+   for (i = 0; i < 64; i++) {
+
+   snprintf(path, sizeof(path), "/sys/dev/char/%d:%d", 
major(st.st_rdev), i);
+   sysfs = open(path, O_DIRECTORY);
+   if (sysfs < 0)
+   continue;
+
+   cmp_len = readlinkat(sysfs, "device", cmp, sizeof(cmp));
+   if (cmp_len == device_len && !memcmp(cmp, device, 
cmp_len))
+   break;
+
+   close(sysfs);
+   sysfs = -1;
+   }
+   }
+
+   return sysfs;
 }
 
 struct intel_perf *
-- 
2.25.1



[Intel-gfx] [PATCH] drm/i915/gt: Do release kernel context if breadcrumb measure fails

2021-05-07 Thread Janusz Krzysztofik
Commit fb5970da1b42 ("drm/i915/gt: Use the kernel_context to measure the
breadcrumb size") reordered some operations inside engine_init_common()
and added an error unwind path to that function.  In that path, a
reference to a kernel context candidate supposed to be released on error
was put, but the context, pinned when created, was not unpinned first.
Fix it by replacing intel_context_put() with destroy_pinned_context()
introduced later by commit b436a5f8b6c8 ("drm/i915/gt: Track all timelines
created using the HWSP").

Signed-off-by: Janusz Krzysztofik 
Cc: Chris Wilson 
---
 drivers/gpu/drm/i915/gt/intel_engine_cs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 6dbdbde00f14..eba2da9679a5 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -898,7 +898,7 @@ static int engine_init_common(struct intel_engine_cs 
*engine)
return 0;
 
 err_context:
-   intel_context_put(ce);
+   destroy_pinned_context(ce);
return ret;
 }
 
-- 
2.25.1
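
The difference between the two error-unwind variants can be sketched with a toy model.  Everything here is an assumption made for illustration: `struct mock_context` and the `mock_*` helpers are hypothetical, a pin is reduced to a plain counter, and the real destroy_pinned_context() may do more than the two steps shown.

```c
#include <stdbool.h>

/* Hypothetical, heavily simplified context: one refcount, one pin count. */
struct mock_context {
	int refcount;
	int pin_count;
	bool freed;
};

/* Drop a reference; the context goes away with the last one. */
void mock_context_put(struct mock_context *ce)
{
	if (--ce->refcount == 0)
		ce->freed = true;
}

/* Undo one earlier pin. */
void mock_context_unpin(struct mock_context *ce)
{
	ce->pin_count--;
}

/* Unwind a context that was pinned when created: unpin first, then put. */
void mock_destroy_pinned_context(struct mock_context *ce)
{
	mock_context_unpin(ce);
	mock_context_put(ce);
}

/* True if the context was released while a pin was still outstanding. */
bool mock_context_leaked_pin(const struct mock_context *ce)
{
	return ce->freed && ce->pin_count > 0;
}
```

In the model, calling only `mock_context_put()` on a freshly created, pinned context leaves the pin unbalanced, while `mock_destroy_pinned_context()` leaves both counters at zero.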



Re: [Intel-gfx] [PATCH] drm/i915/gt: Do release kernel context if breadcrumb measure fails

2021-05-10 Thread Janusz Krzysztofik
Hi Tvrtko,

On Monday, 10 May 2021 11:14:46 CEST Tvrtko Ursulin wrote:
> 
> On 07/05/2021 15:42, Janusz Krzysztofik wrote:
> > Commit fb5970da1b42 ("drm/i915/gt: Use the kernel_context to measure the
> > breadcrumb size") reordered some operations inside engine_init_common()
> > and added an error unwind path to that function.  In that path, a
> > reference to a kernel context candidate supposed to be released on error
> > was put, but the context, pinned when created, was not unpinned first.
> > Fix it by replacing intel_context_put() with destroy_pinned_context()
> > introduced later by commit b436a5f8b6c8 ("drm/i915/gt: Track all timelines
> > created using the HWSP").
> > 
> > Signed-off-by: Janusz Krzysztofik 
> > Cc: Chris Wilson 
> > ---
> >   drivers/gpu/drm/i915/gt/intel_engine_cs.c | 2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
> > b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> > index 6dbdbde00f14..eba2da9679a5 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> > @@ -898,7 +898,7 @@ static int engine_init_common(struct intel_engine_cs 
> > *engine)
> > return 0;
> >   
> >   err_context:
> > -   intel_context_put(ce);
> > +   destroy_pinned_context(ce);
> > return ret;
> >   }
> >   
> > 
> 
> Reviewed-by: Tvrtko Ursulin 
> 
> Found by some test or code inspection?

Code inspection.

Thanks,
Janusz

> 
> Regards,
> 
> Tvrtko
> 






Re: [Intel-gfx] [PATCH] drm/i915: Fix wrong name announced on FB driver switching

2021-05-13 Thread Janusz Krzysztofik
Hi Jani,

On Mon, 3 May 2021 19:38:17 CEST Jani Nikula wrote:
> On Thu, 29 Apr 2021, Janusz Krzysztofik wrote:
> > Commit 7a0f9ef9703d ("drm/i915: Use drm_fb_helper_fill_info")
> > effectively changed our FB driver name from "inteldrmfb" to
> > "i915drmfb".  However, we are still using the old name when kicking out
> > a firmware fbdev driver potentially bound to our device.  Use the new
> > name to avoid confusion.
> >
> > Note: since the new name is assigned by a DRM fbdev helper called at
> > the DRM driver registration time, that name is not available when we
> > kick the other driver out early, hence a hardcoded name must be used
> > unless the DRM layer exposes a macro for converting a DRM driver name
> > to its associated fbdev driver name.
> >
> > Signed-off-by: Janusz Krzysztofik 
> 
> LGTM, Daniel?
> 
> Reviewed-by: Jani Nikula 

Thanks for review.  What are next steps?  Please note I have no push 
permissions.

Thanks,
Janusz

> 
> $ dim fixes 7a0f9ef9703d
> Fixes: 7a0f9ef9703d ("drm/i915: Use drm_fb_helper_fill_info")
> Cc: Noralf Trønnes 
> Cc: Alex Deucher 
> Cc: Daniel Vetter 
> Cc: Jani Nikula 
> Cc: Joonas Lahtinen 
> Cc: Rodrigo Vivi 
> Cc: intel-gfx@lists.freedesktop.org
> Cc:  # v5.2+
> 
> 
> > ---
> >  drivers/gpu/drm/i915/i915_drv.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> > index 785dcf20c77b..46082490dc9a 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.c
> > +++ b/drivers/gpu/drm/i915/i915_drv.c
> > @@ -554,7 +554,7 @@ static int i915_driver_hw_probe(struct drm_i915_private *dev_priv)
> > if (ret)
> > goto err_perf;
> >  
> > -   ret = drm_aperture_remove_conflicting_pci_framebuffers(pdev, "inteldrmfb");
> > +   ret = drm_aperture_remove_conflicting_pci_framebuffers(pdev, "i915drmfb");
> > if (ret)
> > goto err_ggtt;
> 
> 






Re: [Intel-gfx] [PATCH] drm/i915: Fix wrong name announced on FB driver switching

2021-05-26 Thread Janusz Krzysztofik
Hi,

On Monday, 3 May 2021 19:38:17 CEST Jani Nikula wrote:
> On Thu, 29 Apr 2021, Janusz Krzysztofik wrote:
> > Commit 7a0f9ef9703d ("drm/i915: Use drm_fb_helper_fill_info")
> > effectively changed our FB driver name from "inteldrmfb" to
> > "i915drmfb".  However, we are still using the old name when kicking out
> > a firmware fbdev driver potentially bound to our device.  Use the new
> > name to avoid confusion.
> >
> > Note: since the new name is assigned by a DRM fbdev helper called at
> > the DRM driver registration time, that name is not available when we
> > kick the other driver out early, hence a hardcoded name must be used
> > unless the DRM layer exposes a macro for converting a DRM driver name
> > to its associated fbdev driver name.
> >
> > Signed-off-by: Janusz Krzysztofik 
> 
> LGTM, Daniel?
> 
> Reviewed-by: Jani Nikula 

Am I supposed to do something to push processing of this patch forward?  
Please note I have no push permissions so can't merge it myself.

> 
> $ dim fixes 7a0f9ef9703d
> Fixes: 7a0f9ef9703d ("drm/i915: Use drm_fb_helper_fill_info")
> Cc: Noralf Trønnes 
> Cc: Alex Deucher 
> Cc: Daniel Vetter 
> Cc: Jani Nikula 
> Cc: Joonas Lahtinen 
> Cc: Rodrigo Vivi 
> Cc: intel-gfx@lists.freedesktop.org
> Cc:  # v5.2+

Should I resubmit with those tags appended?

Thanks,
Janusz

> 
> 
> > ---
> >  drivers/gpu/drm/i915/i915_drv.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> > index 785dcf20c77b..46082490dc9a 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.c
> > +++ b/drivers/gpu/drm/i915/i915_drv.c
> > @@ -554,7 +554,7 @@ static int i915_driver_hw_probe(struct drm_i915_private *dev_priv)
> > if (ret)
> > goto err_perf;
> >  
> > -   ret = drm_aperture_remove_conflicting_pci_framebuffers(pdev, "inteldrmfb");
> > +   ret = drm_aperture_remove_conflicting_pci_framebuffers(pdev, "i915drmfb");
> > if (ret)
> > goto err_ggtt;
> 
> 






[Intel-gfx] [PATCH] drm/i915: Mark GPU wedging on driver unregister unrecoverable

2021-09-01 Thread Janusz Krzysztofik
The GPU wedged flag, currently set on driver unregister to prevent
further use of the GPU, can then be cleared unintentionally by a call to
__intel_gt_unset_wedged() made before the flag is finally marked
unrecoverable.  We need to have it marked unrecoverable earlier.
Implement that by replacing a call to intel_gt_set_wedged() in
intel_gt_driver_unregister() with intel_gt_set_wedged_on_fini().

With the above in place, intel_gt_set_wedged_on_fini() is now called
twice on driver remove, the second time from __intel_gt_disable().  This
seems harmless, while dropping intel_gt_set_wedged_on_fini() from
__intel_gt_disable() proved to break some driver probe error unwind
paths as well as the mock selftest exit path.

Signed-off-by: Janusz Krzysztofik 
Cc: Michał Winiarski 
---
 drivers/gpu/drm/i915/gt/intel_gt.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c 
b/drivers/gpu/drm/i915/gt/intel_gt.c
index 62d40c986642..173b53cb2b47 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt.c
@@ -750,7 +750,7 @@ void intel_gt_driver_unregister(struct intel_gt *gt)
 * all in-flight requests so that we can quickly unbind the active
 * resources.
 */
-   intel_gt_set_wedged(gt);
+   intel_gt_set_wedged_on_fini(gt);
 
/* Scrub all HW state upon release */
with_intel_runtime_pm(gt->uncore->rpm, wakeref)
-- 
2.25.1
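
The behaviour the commit message relies on can be modelled with two flag bits.  This is a sketch under assumptions: `struct mock_gt`, the `MOCK_*` bits and the `mock_*` functions are hypothetical names made up for this example, and the model only captures the idea that an "on fini" marker makes the wedged state impossible to clear.

```c
#include <stdbool.h>

/* Hypothetical flag bits standing in for the GT reset flags. */
enum {
	MOCK_WEDGED = 1,         /* GPU must not be used */
	MOCK_WEDGED_ON_FINI = 2  /* wedging is final, never to be undone */
};

struct mock_gt {
	unsigned int reset_flags;
};

/* Plain wedging: recoverable, a later unset may clear it. */
void mock_set_wedged(struct mock_gt *gt)
{
	gt->reset_flags |= MOCK_WEDGED;
}

/* Wedging at teardown: also mark the state unrecoverable. */
void mock_set_wedged_on_fini(struct mock_gt *gt)
{
	mock_set_wedged(gt);
	gt->reset_flags |= MOCK_WEDGED_ON_FINI;
}

/* Try to recover; refuse if the wedging was marked unrecoverable. */
bool mock_unset_wedged(struct mock_gt *gt)
{
	if (gt->reset_flags & MOCK_WEDGED_ON_FINI)
		return false;
	gt->reset_flags &= ~(unsigned int)MOCK_WEDGED;
	return true;
}

bool mock_is_wedged(const struct mock_gt *gt)
{
	return (gt->reset_flags & MOCK_WEDGED) != 0;
}
```

In the model, calling `mock_set_wedged_on_fini()` a second time only sets bits that are already set, which is consistent with the observation in the commit message that the duplicate call on driver remove seems harmless.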



[Intel-gfx] [PATCH] drm/i915: Flush buffer pools on driver remove

2021-09-01 Thread Janusz Krzysztofik
In preparation for clean driver release, attempts to drain work queues
and release freed objects are made at driver remove time.  However, GT
buffer pools are currently not flushed before the driver release phase.
Since unused objects may stay there for up to one second, some may
survive until driver release is attempted.  That can potentially
explain sporadic, and thus hard to reproduce, issues observed at driver
release time, like a non-zero shrink counter or outstanding address space
areas.

Flush buffer pools on GT remove as a fix.  On driver release, don't push
the pools again, just assert that the flush was called and nothing added
more in between.

Signed-off-by: Janusz Krzysztofik 
Cc: Chris Wilson 
---
 drivers/gpu/drm/i915/gt/intel_gt.c | 2 ++
 drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c | 2 --
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c 
b/drivers/gpu/drm/i915/gt/intel_gt.c
index 62d40c986642..8f322a4ecd87 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt.c
@@ -737,6 +737,8 @@ void intel_gt_driver_remove(struct intel_gt *gt)
intel_uc_driver_remove(&gt->uc);
 
intel_engines_release(gt);
+
+   intel_gt_flush_buffer_pool(gt);
 }
 
 void intel_gt_driver_unregister(struct intel_gt *gt)
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c 
b/drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c
index aa0a59c5b614..acc49c56a9f3 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c
@@ -245,8 +245,6 @@ void intel_gt_fini_buffer_pool(struct intel_gt *gt)
struct intel_gt_buffer_pool *pool = &gt->buffer_pool;
int n;
 
-   intel_gt_flush_buffer_pool(gt);
-
for (n = 0; n < ARRAY_SIZE(pool->cache_list); n++)
GEM_BUG_ON(!list_empty(&pool->cache_list[n]));
 }
-- 
2.25.1



[Intel-gfx] [PATCH RESEND] drm/i915: Flush buffer pools on driver remove

2021-09-03 Thread Janusz Krzysztofik
In preparation for clean driver release, attempts to drain work queues
and release freed objects are made at driver remove time.  However, GT
buffer pools are currently not flushed before the driver release phase.
Since unused objects may stay there for up to one second, some may
survive until driver release is attempted.  That can potentially
explain sporadic, and thus hard to reproduce, issues observed at driver
release time, like a non-zero shrink counter or outstanding address space
areas.

Flush buffer pools on GT remove as a fix.  On driver release, don't
flush the pools again, just assert that the flush was called and
nothing more was added in between.

Signed-off-by: Janusz Krzysztofik 
Cc: Chris Wilson 
---
Resending with Cc: dri-de...@lists.freedesktop.org as requested, and a
typo in commit description fixed.

Thanks,
Janusz

 drivers/gpu/drm/i915/gt/intel_gt.c | 2 ++
 drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c | 2 --
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c 
b/drivers/gpu/drm/i915/gt/intel_gt.c
index 62d40c986642..8f322a4ecd87 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt.c
@@ -737,6 +737,8 @@ void intel_gt_driver_remove(struct intel_gt *gt)
intel_uc_driver_remove(&gt->uc);
 
intel_engines_release(gt);
+
+   intel_gt_flush_buffer_pool(gt);
 }
 
 void intel_gt_driver_unregister(struct intel_gt *gt)
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c 
b/drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c
index aa0a59c5b614..acc49c56a9f3 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c
@@ -245,8 +245,6 @@ void intel_gt_fini_buffer_pool(struct intel_gt *gt)
struct intel_gt_buffer_pool *pool = &gt->buffer_pool;
int n;
 
-   intel_gt_flush_buffer_pool(gt);
-
for (n = 0; n < ARRAY_SIZE(pool->cache_list); n++)
GEM_BUG_ON(!list_empty(&pool->cache_list[n]));
 }
-- 
2.25.1



[Intel-gfx] [PATCH RESEND] drm/i915: Mark GPU wedging on driver unregister unrecoverable

2021-09-03 Thread Janusz Krzysztofik
The GPU wedged flag, currently set on driver unregister to prevent
further use of the GPU, can then be cleared unintentionally by a call to
__intel_gt_unset_wedged() made before the flag is finally marked
unrecoverable.  We need to have it marked unrecoverable earlier.
Implement that by replacing a call to intel_gt_set_wedged() in
intel_gt_driver_unregister() with intel_gt_set_wedged_on_fini().

With the above in place, intel_gt_set_wedged_on_fini() is now called
twice on driver remove, the second time from __intel_gt_disable().  This
seems harmless, while dropping intel_gt_set_wedged_on_fini() from
__intel_gt_disable() proved to break some driver probe error unwind
paths as well as the mock selftest exit path.

Signed-off-by: Janusz Krzysztofik 
Cc: Michał Winiarski 
---
Resending with Cc: dri-de...@lists.freedesktop.org as requested.

Thanks,
Janusz

 drivers/gpu/drm/i915/gt/intel_gt.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c 
b/drivers/gpu/drm/i915/gt/intel_gt.c
index 62d40c986642..173b53cb2b47 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt.c
@@ -750,7 +750,7 @@ void intel_gt_driver_unregister(struct intel_gt *gt)
 * all in-flight requests so that we can quickly unbind the active
 * resources.
 */
-   intel_gt_set_wedged(gt);
+   intel_gt_set_wedged_on_fini(gt);
 
/* Scrub all HW state upon release */
with_intel_runtime_pm(gt->uncore->rpm, wakeref)
-- 
2.25.1
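
The difference between a recoverable and an unrecoverable wedge, as the commit message describes it, can be sketched with a toy model. The names (`toy_gt`, `TOY_WEDGED`, `TOY_WEDGED_ON_FINI`, `toy_set_wedged()`, `toy_set_wedged_on_fini()`, `toy_unset_wedged()`) are hypothetical and the real driver does considerably more work in each step; the sketch only assumes that unwedging is refused once the "on fini" marker is set.

```c
#include <stdbool.h>

enum {
	TOY_WEDGED         = 1u << 0, /* GPU must not be used */
	TOY_WEDGED_ON_FINI = 1u << 1, /* wedging may no longer be undone */
};

struct toy_gt {
	unsigned int reset_flags;
};

/* Recoverable wedge: a later unwedge attempt is allowed to clear it. */
static void toy_set_wedged(struct toy_gt *gt)
{
	gt->reset_flags |= TOY_WEDGED;
}

/* Unrecoverable wedge: also sets the marker that blocks unwedging. */
static void toy_set_wedged_on_fini(struct toy_gt *gt)
{
	toy_set_wedged(gt);
	gt->reset_flags |= TOY_WEDGED_ON_FINI;
}

/* Returns true if the GT is usable afterwards. */
static bool toy_unset_wedged(struct toy_gt *gt)
{
	if (!(gt->reset_flags & TOY_WEDGED))
		return true;
	if (gt->reset_flags & TOY_WEDGED_ON_FINI)
		return false;	/* marked unrecoverable, leave it wedged */
	gt->reset_flags &= ~TOY_WEDGED;
	return true;
}
```

Under these assumptions, a second `toy_set_wedged_on_fini()` call leaves the flags unchanged, which matches the observation above that calling it twice on driver remove seems harmless.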



Re: [Intel-gfx] ✗ Fi.CI.IGT: failure for drm/i915: Mark GPU wedging on driver unregister unrecoverable (rev2)

2021-09-06 Thread Janusz Krzysztofik
On Friday, 3 September 2021 21:07:00 CEST Patchwork wrote:
> == Series Details ==
> 
> Series: drm/i915: Mark GPU wedging on driver unregister unrecoverable (rev2)
> URL   : https://patchwork.freedesktop.org/series/94247/
> State : failure
> 
> == Summary ==
> 
> CI Bug Log - changes from CI_DRM_10550_full -> Patchwork_20953_full
> 
> 
> Summary
> ---
> 
>   **FAILURE**
> 
>   Serious unknown changes coming with Patchwork_20953_full absolutely need to be
>   verified manually.
>   
>   If you think the reported changes have nothing to do with the changes
>   introduced in Patchwork_20953_full, please notify your bug team to allow them
>   to document this new failure mode, which will reduce false positives in CI.
> 
>   
> 
> Possible new issues
> ---
> 
>   Here are the unknown changes that may have been introduced in Patchwork_20953_full:
> 
> ### IGT changes ###
> 
>  Possible regressions 
> 
>   * igt@kms_flip_scaled_crc@flip-32bpp-ytile-to-64bpp-ytile:
> - shard-iclb: [PASS][1] -> [SKIP][2]
>[1]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10550/shard-iclb4/igt@kms_flip_scaled_...@flip-32bpp-ytile-to-64bpp-ytile.html
>[2]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20953/shard-iclb2/igt@kms_flip_scaled_...@flip-32bpp-ytile-to-64bpp-ytile.html

stdout: No valid pipe/connector/format/mod combination found

That doesn't sound like a driver unregister related issue to me.

Thanks,
Janusz

> 
>   
> Known issues
> 
> 
>   Here are the changes found in Patchwork_20953_full that come from known issues:
> 
> ### IGT changes ###
> 
>  Issues hit 
> 
>   * igt@feature_discovery@display-2x:
> - shard-iclb: NOTRUN -> [SKIP][3] ([i915#1839])
>[3]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20953/shard-iclb8/igt@feature_discov...@display-2x.html
> 
>   * igt@gem_create@create-massive:
> - shard-snb:  NOTRUN -> [DMESG-WARN][4] ([i915#3002])
>[4]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20953/shard-snb6/igt@gem_cre...@create-massive.html
> 
>   * igt@gem_ctx_persistence@legacy-engines-hostile:
> - shard-snb:  NOTRUN -> [SKIP][5] ([fdo#109271] / [i915#1099]) +2 
> similar issues
>[5]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20953/shard-snb6/igt@gem_ctx_persiste...@legacy-engines-hostile.html
> 
>   * igt@gem_eio@in-flight-contexts-10ms:
> - shard-tglb: [PASS][6] -> [TIMEOUT][7] ([i915#3063])
>[6]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10550/shard-tglb3/igt@gem_...@in-flight-contexts-10ms.html
>[7]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20953/shard-tglb6/igt@gem_...@in-flight-contexts-10ms.html
> 
>   * igt@gem_eio@unwedge-stress:
> - shard-tglb: [PASS][8] -> [TIMEOUT][9] ([i915#2369] / 
> [i915#3063] / [i915#3648])
>[8]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10550/shard-tglb8/igt@gem_...@unwedge-stress.html
>[9]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20953/shard-tglb7/igt@gem_...@unwedge-stress.html
> 
>   * igt@gem_exec_fair@basic-flow@rcs0:
> - shard-tglb: [PASS][10] -> [FAIL][11] ([i915#2842])
>[10]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10550/shard-tglb1/igt@gem_exec_fair@basic-f...@rcs0.html
>[11]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20953/shard-tglb1/igt@gem_exec_fair@basic-f...@rcs0.html
> 
>   * igt@gem_exec_fair@basic-throttle@rcs0:
> - shard-iclb: [PASS][12] -> [FAIL][13] ([i915#2849])
>[12]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10550/shard-iclb8/igt@gem_exec_fair@basic-throt...@rcs0.html
>[13]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20953/shard-iclb2/igt@gem_exec_fair@basic-throt...@rcs0.html
> 
>   * igt@gem_exec_params@secure-non-master:
> - shard-iclb: NOTRUN -> [SKIP][14] ([fdo#112283])
>[14]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20953/shard-iclb8/igt@gem_exec_par...@secure-non-master.html
> 
>   * igt@gem_pread@exhaustion:
> - shard-apl:  NOTRUN -> [WARN][15] ([i915#2658])
>[15]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20953/shard-apl6/igt@gem_pr...@exhaustion.html
> - shard-skl:  NOTRUN -> [WARN][16] ([i915#2658])
>[16]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20953/shard-skl3/igt@gem_pr...@exhaustion.html
> 
>   * igt@gem_pwrite@basic-exhaustion:
> - shard-kbl:  NOTRUN -> [WARN][17] ([i915#2658]) +1 similar issue
>[17]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20953/shard-kbl3/igt@gem_pwr...@basic-exhaustion.html
> 
>   * igt@gem_render_copy@yf-tiled-to-vebox-x-tiled:
> - shard-iclb: NOTRUN -> [SKIP][18] ([i915#768])
>[18]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20953/shard-iclb8/igt@gem_render_c...@yf-tiled-to-vebox-x-tiled

[Intel-gfx] [RFC PATCH i-g-t 1/6] tests/core_hotunplug: Add 'GEM context' variants

2021-04-01 Thread Janusz Krzysztofik
Verify if an additional context associated with an open device file
descriptor is cleaned up correctly on device hotunbind / hotunplug.

Signed-off-by: Janusz Krzysztofik 
---
 tests/core_hotunplug.c | 79 ++
 1 file changed, 79 insertions(+)

diff --git a/tests/core_hotunplug.c b/tests/core_hotunplug.c
index 56a88fefd..4f6c4f625 100644
--- a/tests/core_hotunplug.c
+++ b/tests/core_hotunplug.c
@@ -31,6 +31,7 @@
 #include 
 
 #include "i915/gem.h"
+#include "i915/gem_context.h"
 #include "igt.h"
 #include "igt_device_scan.h"
 #include "igt_kmod.h"
@@ -545,6 +546,60 @@ static void hotreplug_lateclose(struct hotunplug *priv)
igt_assert_f(healthcheck(priv, false), "%s\n", priv->failure);
 }
 
+static void ctx_hotunbind_lateclose(struct hotunplug *priv)
+{
+   uint32_t ctx;
+
+   igt_require(priv->fd.drm == -1);
+   priv->fd.drm = local_drm_open_driver(false, "pre-", " for prerequisites check");
+
+   igt_require_intel(priv->fd.drm);
+   gem_require_contexts(priv->fd.drm);
+   priv->fd.drm = close_device(priv->fd.drm, "", "pre-checked ");
+
+   pre_check(priv);
+
+   priv->fd.drm = local_drm_open_driver(false, "", " for hot unbind");
+
+   local_debug("%s\n", "creating additional GEM user context");
+   ctx = gem_context_create(priv->fd.drm);
+
+   driver_unbind(priv, "hot ", 0);
+
+   local_debug("%s\n", "trying to late destroy the context");
+   igt_assert_eq(__gem_context_destroy(priv->fd.drm, ctx), -ENODEV);
+
+   priv->fd.drm = close_device(priv->fd.drm, "late ", "unbound ");
+   igt_assert_eq(priv->fd.drm, -1);
+}
+
+static void ctx_hotunplug_lateclose(struct hotunplug *priv)
+{
+   uint32_t ctx;
+
+   igt_require(priv->fd.drm == -1);
+   priv->fd.drm = local_drm_open_driver(false, "pre-", " for prerequisites check");
+
+   igt_require_intel(priv->fd.drm);
+   gem_require_contexts(priv->fd.drm);
+   priv->fd.drm = close_device(priv->fd.drm, "", "pre-checked ");
+
+   pre_check(priv);
+
+   priv->fd.drm = local_drm_open_driver(false, "", " for hot unplug");
+
+   local_debug("%s\n", "creating additional GEM user context");
+   ctx = gem_context_create(priv->fd.drm);
+
+   device_unplug(priv, "hot ", 0);
+
+   local_debug("%s\n", "trying to late destroy the context");
+   igt_assert_eq(__gem_context_destroy(priv->fd.drm, ctx), -ENODEV);
+
+   priv->fd.drm = close_device(priv->fd.drm, "late ", "removed ");
+   igt_assert_eq(priv->fd.drm, -1);
+}
+
 /* Main */
 
 igt_main
@@ -682,6 +737,30 @@ igt_main
recover(&priv);
}
 
+   igt_fixture
+   post_healthcheck(&priv);
+
+   igt_subtest_group {
+   igt_describe("Check if the driver can be cleanly unbound from a still open device with extra GEM context, then released");
+   igt_subtest("ctx-hotunbind-lateclose")
+   ctx_hotunbind_lateclose(&priv);
+
+   igt_fixture
+   recover(&priv);
+   }
+
+   igt_fixture
+   post_healthcheck(&priv);
+
+   igt_subtest_group {
+   igt_describe("Check if a still open device with extra GEM 
context can be cleanly unplugged, then released");
+   igt_subtest("ctx-hotunplug-lateclose")
+   ctx_hotunplug_lateclose(&priv);
+
+   igt_fixture
+   recover(&priv);
+   }
+
igt_fixture {
post_healthcheck(&priv);
 
-- 
2.25.1

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [RFC PATCH i-g-t 2/6] tests/core_hotunplug: Add 'GEM address space' variants

2021-04-01 Thread Janusz Krzysztofik
Verify if an additional address space associated with an open device
file descriptor is cleaned up correctly on device hotunbind / hotunplug.

Signed-off-by: Janusz Krzysztofik 
---
 tests/core_hotunplug.c | 79 ++
 1 file changed, 79 insertions(+)

diff --git a/tests/core_hotunplug.c b/tests/core_hotunplug.c
index 4f6c4f625..decfcdfda 100644
--- a/tests/core_hotunplug.c
+++ b/tests/core_hotunplug.c
@@ -32,6 +32,7 @@
 
 #include "i915/gem.h"
 #include "i915/gem_context.h"
+#include "i915/gem_vm.h"
 #include "igt.h"
 #include "igt_device_scan.h"
 #include "igt_kmod.h"
@@ -600,6 +601,60 @@ static void ctx_hotunplug_lateclose(struct hotunplug *priv)
igt_assert_eq(priv->fd.drm, -1);
 }
 
+static void vm_hotunbind_lateclose(struct hotunplug *priv)
+{
+   int vm;
+
+   igt_require(priv->fd.drm == -1);
+   priv->fd.drm = local_drm_open_driver(false, "pre-", " for prerequisites check");
+
+   igt_require_intel(priv->fd.drm);
+   gem_require_vm(priv->fd.drm);
+   priv->fd.drm = close_device(priv->fd.drm, "", "pre-checked ");
+
+   pre_check(priv);
+
+   priv->fd.drm = local_drm_open_driver(false, "", " for hot unbind");
+
+   local_debug("%s\n", "creating additional GEM user address space");
+   vm = gem_vm_create(priv->fd.drm);
+
+   driver_unbind(priv, "hot ", 0);
+
+   local_debug("%s\n", "trying to late remove the address space");
+   igt_assert_eq(__gem_vm_destroy(priv->fd.drm, vm), -ENODEV);
+
+   priv->fd.drm = close_device(priv->fd.drm, "late ", "unbound ");
+   igt_assert_eq(priv->fd.drm, -1);
+}
+
+static void vm_hotunplug_lateclose(struct hotunplug *priv)
+{
+   int vm;
+
+   igt_require(priv->fd.drm == -1);
+   priv->fd.drm = local_drm_open_driver(false, "pre-", " for prerequisites check");
+
+   igt_require_intel(priv->fd.drm);
+   gem_require_vm(priv->fd.drm);
+   priv->fd.drm = close_device(priv->fd.drm, "", "pre-checked ");
+
+   pre_check(priv);
+
+   priv->fd.drm = local_drm_open_driver(false, "", " for hot unplug");
+
+   local_debug("%s\n", "creating additional GEM user address space");
+   vm = gem_vm_create(priv->fd.drm);
+
+   device_unplug(priv, "hot ", 0);
+
+   local_debug("%s\n", "trying to late remove the address space");
+   igt_assert_eq(__gem_vm_destroy(priv->fd.drm, vm), -ENODEV);
+
+   priv->fd.drm = close_device(priv->fd.drm, "late ", "removed ");
+   igt_assert_eq(priv->fd.drm, -1);
+}
+
 /* Main */
 
 igt_main
@@ -761,6 +816,30 @@ igt_main
recover(&priv);
}
 
+   igt_fixture
+   post_healthcheck(&priv);
+
+   igt_subtest_group {
+   igt_describe("Check if the driver can be cleanly unbound from a still open device with extra GEM address space, then released");
+   igt_subtest("vm-hotunbind-lateclose")
+   vm_hotunbind_lateclose(&priv);
+
+   igt_fixture
+   recover(&priv);
+   }
+
+   igt_fixture
+   post_healthcheck(&priv);
+
+   igt_subtest_group {
+   igt_describe("Check if a still open device with extra GEM 
address space can be cleanly unplugged, then released");
+   igt_subtest("vm-hotunplug-lateclose")
+   vm_hotunplug_lateclose(&priv);
+
+   igt_fixture
+   recover(&priv);
+   }
+
igt_fixture {
post_healthcheck(&priv);
 
-- 
2.25.1



[Intel-gfx] [RFC PATCH i-g-t 3/6] tests/core_hotunplug: Add 'GEM object' variants

2021-04-01 Thread Janusz Krzysztofik
GEM objects belonging to user file descriptors still open on device
hotunbind / hotunplug may exhibit still more driver issues.  Add
subtests that implement these scenarios.

Signed-off-by: Janusz Krzysztofik 
---
 tests/core_hotunplug.c | 85 ++
 1 file changed, 85 insertions(+)

diff --git a/tests/core_hotunplug.c b/tests/core_hotunplug.c
index decfcdfda..7f61b4446 100644
--- a/tests/core_hotunplug.c
+++ b/tests/core_hotunplug.c
@@ -433,6 +433,13 @@ static void set_filter_from_device(int fd)
igt_assert_eq(igt_device_filter_add(filter), 1);
 }
 
+static int local_gem_close(int fd, uint32_t handle)
+{
+   struct drm_gem_close close_bo = { .handle = handle, };
+
+   return igt_ioctl(fd, DRM_IOCTL_GEM_CLOSE, &close_bo) ? -errno : 0;
+}
+
 /* Subtests */
 
 static void unbind_rebind(struct hotunplug *priv)
@@ -655,6 +662,60 @@ static void vm_hotunplug_lateclose(struct hotunplug *priv)
igt_assert_eq(priv->fd.drm, -1);
 }
 
+static void gem_hotunbind_lateclose(struct hotunplug *priv)
+{
+   uint32_t handle;
+
+   igt_require(priv->fd.drm == -1);
+   priv->fd.drm = local_drm_open_driver(false, "pre-", " for prerequisites check");
+
+   igt_require_intel(priv->fd.drm);
+   igt_require_gem(priv->fd.drm);
+   priv->fd.drm = close_device(priv->fd.drm, "", "pre-checked ");
+
+   pre_check(priv);
+
+   priv->fd.drm = local_drm_open_driver(false, "", " for hot unbind");
+
+   local_debug("%s\n", "creating a GEM user object");
+   handle = gem_create(priv->fd.drm, 4096);
+
+   driver_unbind(priv, "hot ", 0);
+
+   local_debug("%s\n", "trying to late remove the object");
+   igt_assert_eq(local_gem_close(priv->fd.drm, handle), -ENODEV);
+
+   priv->fd.drm = close_device(priv->fd.drm, "late ", "unbound ");
+   igt_assert_eq(priv->fd.drm, -1);
+}
+
+static void gem_hotunplug_lateclose(struct hotunplug *priv)
+{
+   uint32_t handle;
+
+   igt_require(priv->fd.drm == -1);
+   priv->fd.drm = local_drm_open_driver(false, "pre-", " for prerequisites check");
+
+   igt_require_intel(priv->fd.drm);
+   igt_require_gem(priv->fd.drm);
+   priv->fd.drm = close_device(priv->fd.drm, "", "pre-checked ");
+
+   pre_check(priv);
+
+   priv->fd.drm = local_drm_open_driver(false, "", " for hot unplug");
+
+   local_debug("%s\n", "creating a GEM user object");
+   handle = gem_create(priv->fd.drm, 4096);
+
+   device_unplug(priv, "hot ", 0);
+
+   local_debug("%s\n", "trying to late remove the object");
+   igt_assert_eq(local_gem_close(priv->fd.drm, handle), -ENODEV);
+
+   priv->fd.drm = close_device(priv->fd.drm, "late ", "removed ");
+   igt_assert_eq(priv->fd.drm, -1);
+}
+
 /* Main */
 
 igt_main
@@ -840,6 +901,30 @@ igt_main
recover(&priv);
}
 
+   igt_fixture
+   post_healthcheck(&priv);
+
+   igt_subtest_group {
+   igt_describe("Check if the driver can be cleanly unbound from a 
device with a still open GEM object, then released");
+   igt_subtest("gem-hotunbind-lateclose")
+   gem_hotunbind_lateclose(&priv);
+
+   igt_fixture
+   recover(&priv);
+   }
+
+   igt_fixture
+   post_healthcheck(&priv);
+
+   igt_subtest_group {
+   igt_describe("Check if a device with a still open GEM object 
can be cleanly unplugged, then released");
+   igt_subtest("gem-hotunplug-lateclose")
+   gem_hotunplug_lateclose(&priv);
+
+   igt_fixture
+   recover(&priv);
+   }
+
igt_fixture {
post_healthcheck(&priv);
 
-- 
2.25.1



[Intel-gfx] [RFC PATCH i-g-t 5/6] tests/core_hotunplug: Add 'PRIME handle' variants

2021-04-01 Thread Janusz Krzysztofik
Even if all device file descriptors are closed on device hotunbind /
hotunplug, PRIME exported objects may still exist, referenced by still
open dma-buf file descriptors.  Add subtests that keep such a descriptor
open on device hotunbind / hotunplug.

Signed-off-by: Janusz Krzysztofik 
---
 tests/core_hotunplug.c | 104 +
 1 file changed, 104 insertions(+)

diff --git a/tests/core_hotunplug.c b/tests/core_hotunplug.c
index 6f3b3b3d3..0cb1267ae 100644
--- a/tests/core_hotunplug.c
+++ b/tests/core_hotunplug.c
@@ -782,6 +782,86 @@ static void userptr_hotunplug_lateclose(struct hotunplug 
*priv)
igt_fail_on_f(munmap(ptr, 4096), "Userptr unmap failure!");
 }
 
+static void prime_hotunbind_lateclose(struct hotunplug *priv)
+{
+   uint32_t handle;
+   int dmabuf, ret;
+
+   igt_require(priv->fd.drm == -1);
+   priv->fd.drm = local_drm_open_driver(false, "pre-", " for prerequisites check");
+
+   igt_require_intel(priv->fd.drm);
+   igt_require_gem(priv->fd.drm);
+   priv->fd.drm = close_device(priv->fd.drm, "", "pre-checked ");
+
+   pre_check(priv);
+
+   priv->fd.drm = local_drm_open_driver(false, "", " for hot unbind");
+
+   local_debug("%s\n", "creating and PRIME-exporting a GEM object");
+   handle = gem_create(priv->fd.drm, 4096);
+   dmabuf = prime_handle_to_fd(priv->fd.drm, handle);
+
+   ret = local_gem_close(priv->fd.drm, handle);
+   priv->fd.drm = close_device(priv->fd.drm, "", "exported ");
+
+   if (priv->fd.drm != -1) {
+   igt_ignore_warn(close(dmabuf));
+   igt_assert_eq(priv->fd.drm, -1);
+   }
+
+   /* once device close succeeds, take care of the open dmabuf as if it were a device fd */
+   priv->fd.drm = dmabuf;
+   igt_assert_f(!ret, "gem_close failed with errno %d\n", ret);
+
+   driver_unbind(priv, "hot ", 0);
+
+   igt_debug("late closing the PRIME file descriptor\n");
+   dmabuf = local_close(dmabuf, "PRIME file descriptor late close 
failure");
+   priv->fd.drm = dmabuf;
+   igt_assert_eq(dmabuf, -1);
+}
+
+static void prime_hotunplug_lateclose(struct hotunplug *priv)
+{
+   uint32_t handle;
+   int dmabuf, ret;
+
+   igt_require(priv->fd.drm == -1);
+   priv->fd.drm = local_drm_open_driver(false, "pre-", " for prerequisites check");
+
+   igt_require_intel(priv->fd.drm);
+   igt_require_gem(priv->fd.drm);
+   priv->fd.drm = close_device(priv->fd.drm, "", "pre-checked ");
+
+   pre_check(priv);
+
+   priv->fd.drm = local_drm_open_driver(false, "", " for hot unplug");
+
+   local_debug("%s\n", "creating and PRIME-exporting a GEM object");
+   handle = gem_create(priv->fd.drm, 4096);
+   dmabuf = prime_handle_to_fd(priv->fd.drm, handle);
+
+   ret = local_gem_close(priv->fd.drm, handle);
+   priv->fd.drm = close_device(priv->fd.drm, "", "exported ");
+
+   if (priv->fd.drm != -1) {
+   igt_ignore_warn(close(dmabuf));
+   igt_assert_eq(priv->fd.drm, -1);
+   }
+
+   /* once device close succeeds, take care of the open dmabuf as if it were a device fd */
+   priv->fd.drm = dmabuf;
+   igt_assert_f(!ret, "gem_close failed with errno %d\n", ret);
+
+   device_unplug(priv, "hot ", 0);
+
+   igt_debug("late closing the PRIME file descriptor\n");
+   dmabuf = local_close(dmabuf, "PRIME file descriptor late close 
failure");
+   priv->fd.drm = dmabuf;
+   igt_assert_eq(dmabuf, -1);
+}
+
 /* Main */
 
 igt_main
@@ -1015,6 +1095,30 @@ igt_main
recover(&priv);
}
 
+   igt_fixture
+   post_healthcheck(&priv);
+
+   igt_subtest_group {
+   igt_describe("Check if the driver can be cleanly unbound from a 
device with a still open PRIME-exported object, then released");
+   igt_subtest("prime-hotunbind-lateclose")
+   prime_hotunbind_lateclose(&priv);
+
+   igt_fixture
+   recover(&priv);
+   }
+
+   igt_fixture
+   post_healthcheck(&priv);
+
+   igt_subtest_group {
+   igt_describe("Check if a device with a still open 
PRIME-exported object can be cleanly unplugged, then released");
+   igt_subtest("prime-hotunplug-lateclose")
+   prime_hotunplug_lateclose(&priv);
+
+   igt_fixture
+   recover(&priv);
+   }
+
igt_fixture {
post_healthcheck(&priv);
 
-- 
2.25.1



[Intel-gfx] [RFC PATCH i-g-t 4/6] tests/core_hotunplug: Add 'userptr GEM object' variants

2021-04-01 Thread Janusz Krzysztofik
Verify that userptr GEM objects are cleaned up as well as regular
GEM objects on device hotunbind / hotunplug.

Signed-off-by: Janusz Krzysztofik 
---
 tests/core_hotunplug.c | 90 ++
 1 file changed, 90 insertions(+)

diff --git a/tests/core_hotunplug.c b/tests/core_hotunplug.c
index 7f61b4446..6f3b3b3d3 100644
--- a/tests/core_hotunplug.c
+++ b/tests/core_hotunplug.c
@@ -716,6 +716,72 @@ static void gem_hotunplug_lateclose(struct hotunplug *priv)
igt_assert_eq(priv->fd.drm, -1);
 }
 
+static void userptr_hotunbind_lateclose(struct hotunplug *priv)
+{
+   uint32_t handle;
+   void *ptr;
+
+   igt_require(priv->fd.drm == -1);
+   priv->fd.drm = local_drm_open_driver(false, "pre-", " for prerequisites check");
+
+   igt_require_intel(priv->fd.drm);
+   igt_require_gem(priv->fd.drm);
+   igt_assert_eq(posix_memalign(&ptr, 4096, 4096), 0);
+   igt_require(!__gem_userptr(priv->fd.drm, ptr, 4096, 0, 0, &handle));
+   gem_close(priv->fd.drm, handle);
+   priv->fd.drm = close_device(priv->fd.drm, "", "pre-checked ");
+
+   pre_check(priv);
+
+   priv->fd.drm = local_drm_open_driver(false, "", " for hot unbind");
+
+   local_debug("%s\n", "creating a userptr GEM object");
+   gem_userptr(priv->fd.drm, ptr, 4096, 0, 0, &handle);
+
+   driver_unbind(priv, "hot ", 0);
+
+   local_debug("%s\n", "trying to late remove the object");
+   igt_assert_eq(local_gem_close(priv->fd.drm, handle), -ENODEV);
+
+   priv->fd.drm = close_device(priv->fd.drm, "late ", "unbound ");
+   igt_assert_eq(priv->fd.drm, -1);
+
+   igt_fail_on_f(munmap(ptr, 4096), "Userptr unmap failure!");
+}
+
+static void userptr_hotunplug_lateclose(struct hotunplug *priv)
+{
+   uint32_t handle;
+   void *ptr;
+
+   igt_require(priv->fd.drm == -1);
+   priv->fd.drm = local_drm_open_driver(false, "pre-", " for prerequisites check");
+
+   igt_require_intel(priv->fd.drm);
+   igt_require_gem(priv->fd.drm);
+   igt_assert_eq(posix_memalign(&ptr, 4096, 4096), 0);
+   igt_require(!__gem_userptr(priv->fd.drm, ptr, 4096, 0, 0, &handle));
+   gem_close(priv->fd.drm, handle);
+   priv->fd.drm = close_device(priv->fd.drm, "", "pre-checked ");
+
+   pre_check(priv);
+
+   priv->fd.drm = local_drm_open_driver(false, "", " for hot unplug");
+
+   local_debug("%s\n", "creating a userptr GEM object");
+   gem_userptr(priv->fd.drm, ptr, 4096, 0, 0, &handle);
+
+   device_unplug(priv, "hot ", 0);
+
+   local_debug("%s\n", "trying to late remove the object");
+   igt_assert_eq(local_gem_close(priv->fd.drm, handle), -ENODEV);
+
+   priv->fd.drm = close_device(priv->fd.drm, "late ", "removed ");
+   igt_assert_eq(priv->fd.drm, -1);
+
+   igt_fail_on_f(munmap(ptr, 4096), "Userptr unmap failure!");
+}
+
 /* Main */
 
 igt_main
@@ -925,6 +991,30 @@ igt_main
recover(&priv);
}
 
+   igt_fixture
+   post_healthcheck(&priv);
+
+   igt_subtest_group {
+   igt_describe("Check if the driver can be cleanly unbound from a 
device with a still open userptr GEM object, then released");
+   igt_subtest("userptr-hotunbind-lateclose")
+   userptr_hotunbind_lateclose(&priv);
+
+   igt_fixture
+   recover(&priv);
+   }
+
+   igt_fixture
+   post_healthcheck(&priv);
+
+   igt_subtest_group {
+   igt_describe("Check if a device with a still open userptr GEM 
object can be cleanly unplugged, then released");
+   igt_subtest("userptr-hotunplug-lateclose")
+   userptr_hotunplug_lateclose(&priv);
+
+   igt_fixture
+   recover(&priv);
+   }
+
igt_fixture {
post_healthcheck(&priv);
 
-- 
2.25.1



[Intel-gfx] [RFC PATCH i-g-t 6/6] tests/core_hotunplug: Add 'GEM spin' variants

2021-04-01 Thread Janusz Krzysztofik
Verify if a device with a GEM spin batch job still running on a GPU can
be hot-unbound/unplugged cleanly and released.

Signed-off-by: Janusz Krzysztofik 
---
 tests/core_hotunplug.c | 124 +
 1 file changed, 124 insertions(+)

diff --git a/tests/core_hotunplug.c b/tests/core_hotunplug.c
index 0cb1267ae..f93545402 100644
--- a/tests/core_hotunplug.c
+++ b/tests/core_hotunplug.c
@@ -35,6 +35,7 @@
 #include "i915/gem_vm.h"
 #include "igt.h"
 #include "igt_device_scan.h"
+#include "igt_dummyload.h"
 #include "igt_kmod.h"
 #include "igt_sysfs.h"
 #include "sw_sync.h"
@@ -440,6 +441,37 @@ static int local_gem_close(int fd, uint32_t handle)
return igt_ioctl(fd, DRM_IOCTL_GEM_CLOSE, &close_bo) ? -errno : 0;
 }
 
+static int local_bo_busy(int fd, uint32_t handle)
+{
+   struct drm_i915_gem_busy busy = { .handle = handle, };
+
+   return igt_ioctl(fd, DRM_IOCTL_I915_GEM_BUSY, &busy) ? -errno : 0;
+}
+
+static void local_spin_free(struct hotunplug *priv, igt_spin_t *spin)
+{
+   igt_spin_end(spin);
+
+   spin->poll_handle = 0;
+   spin->handle = 0;
+
+   if (spin->poll) {
+   void *ptr = spin->poll;
+
+   spin->poll = NULL;
+   igt_assert(!gem_munmap(ptr, 4096));
+   }
+
+   if (spin->batch) {
+   void *ptr = spin->batch;
+
+   spin->batch = NULL;
+   igt_assert(!gem_munmap(ptr, 4096));
+   }
+
+   igt_spin_free(priv->fd.drm, spin);
+}
+
 /* Subtests */
 
 static void unbind_rebind(struct hotunplug *priv)
@@ -862,6 +894,74 @@ static void prime_hotunplug_lateclose(struct hotunplug 
*priv)
igt_assert_eq(dmabuf, -1);
 }
 
+static void spin_hotunbind_lateclose(struct hotunplug *priv)
+{
+   igt_spin_t *spin;
+
+   igt_require(priv->fd.drm == -1);
+   priv->fd.drm = local_drm_open_driver(false, "pre-", " for prerequisites check");
+
+   igt_require_intel(priv->fd.drm);
+   igt_require_gem(priv->fd.drm);
+   priv->fd.drm = close_device(priv->fd.drm, "", "pre-checked ");
+
+   pre_check(priv);
+
+   priv->fd.drm = local_drm_open_driver(false, "", " for hot unbind");
+
+   local_debug("%s\n", "running dummy load");
+   spin = igt_spin_new(priv->fd.drm, .flags = IGT_SPIN_POLL_RUN);
+   igt_spin_busywait_until_started(spin);
+
+   driver_unbind(priv, "hot ", 0);
+
+   local_debug("%s\n", "trying to late query the dummy load related GEM 
object status");
+   igt_assert_eq(local_bo_busy(priv->fd.drm, spin->handle), -ENODEV);
+   local_debug("%s\n", "trying to late close the dummy load related GEM 
objects");
+   igt_assert_eq(local_gem_close(priv->fd.drm, spin->poll_handle), 
-ENODEV);
+   igt_assert_eq(local_gem_close(priv->fd.drm, spin->handle), -ENODEV);
+
+   priv->fd.drm = close_device(priv->fd.drm, "late ", "unbound ");
+   igt_assert_eq(priv->fd.drm, -1);
+
+   local_debug("%s\n", "trying to late free the dummy load");
+   local_spin_free(priv, spin);
+}
+
+static void spin_hotunplug_lateclose(struct hotunplug *priv)
+{
+   igt_spin_t *spin;
+
+   igt_require(priv->fd.drm == -1);
+   priv->fd.drm = local_drm_open_driver(false, "pre-", " for prerequisites check");
+
+   igt_require_intel(priv->fd.drm);
+   igt_require_gem(priv->fd.drm);
+   priv->fd.drm = close_device(priv->fd.drm, "", "pre-checked ");
+
+   pre_check(priv);
+
+   priv->fd.drm = local_drm_open_driver(false, "", " for hot unplug");
+
+   local_debug("%s\n", "running dummy load");
+   spin = igt_spin_new(priv->fd.drm, .flags = IGT_SPIN_POLL_RUN);
+   igt_spin_busywait_until_started(spin);
+
+   device_unplug(priv, "hot ", 0);
+
+   local_debug("%s\n", "trying to late query the dummy load related GEM 
object status");
+   igt_assert_eq(local_bo_busy(priv->fd.drm, spin->handle), -ENODEV);
+   local_debug("%s\n", "trying to late close the dummy load related GEM 
objects");
+   igt_assert_eq(local_gem_close(priv->fd.drm, spin->poll_handle), 
-ENODEV);
+   igt_assert_eq(local_gem_close(priv->fd.drm, spin->handle), -ENODEV);
+
+   priv->fd.drm = close_device(priv->fd.drm, "late ", "removed ");
+   igt_assert_eq(priv->fd.drm, -1);
+
+   local_debug("%s\n", "trying to late free the dummy load");
+   local_spin_free(priv, spin);
+}
+
 /* Main */
 
 igt_main
@@ -1119,6 +1219,30

[Intel-gfx] [RFC PATCH 1/2] tests/core_hotunplug: Add perf health check

2021-04-08 Thread Janusz Krzysztofik
Sometimes CI reports skips of perf subtests when run subsequently after
core_hotunplug.  That may be an indication of issues with restoring
device perf features on driver (hot)rebind.

Detect device perf support at test start and check if still available
after driver rebind.  If that fails, a post-subtest device recovery
step restores the device perf support so no subsequently executed tests
are affected.

Signed-off-by: Janusz Krzysztofik 
---
 tests/core_hotunplug.c | 22 ++
 tests/meson.build  |  8 +++-
 2 files changed, 29 insertions(+), 1 deletion(-)

diff --git a/tests/core_hotunplug.c b/tests/core_hotunplug.c
index 56a88fefd..06f15d845 100644
--- a/tests/core_hotunplug.c
+++ b/tests/core_hotunplug.c
@@ -31,6 +31,7 @@
 #include 
 
 #include "i915/gem.h"
+#include "i915/perf.h"
 #include "igt.h"
 #include "igt_device_scan.h"
 #include "igt_kmod.h"
@@ -50,6 +51,7 @@ struct hotunplug {
const char *dev_bus_addr;
const char *failure;
bool need_healthcheck;
+   bool has_intel_perf;
 };
 
 /* Helpers */
@@ -319,6 +321,16 @@ static int local_i915_recover(int i915)
return local_i915_healthcheck(i915, "post-");
 }
 
+static bool local_i915_perf_healthcheck(int i915)
+{
+   struct intel_perf *intel_perf;
+
+   intel_perf = intel_perf_for_fd(i915);
+   if (intel_perf)
+   intel_perf_free(intel_perf);
+   return intel_perf;
+}
+
 #define FLAG_RENDER    (1 << 0)
 #define FLAG_RECOVER   (1 << 1)
 static void node_healthcheck(struct hotunplug *priv, unsigned flags)
@@ -360,6 +372,13 @@ static void node_healthcheck(struct hotunplug *priv, 
unsigned flags)
}
}
 
+   if (!priv->failure && priv->has_intel_perf) {
+   local_debug("%s\n", "running i915 device perf healthcheck");
+   priv->failure = "Device perf healthcheck failure!";
+   if (local_i915_perf_healthcheck(fd_drm))
+   priv->failure = NULL;
+   }
+
fd_drm = close_device(fd_drm, "", "health checked ");
if (closed || fd_drm < -1)  /* update status for post_healthcheck */
priv->fd.drm_hc = fd_drm;
@@ -553,6 +572,7 @@ igt_main
.fd = { .drm = -1, .drm_hc = -1, .sysfs_dev = -1, },
.failure= NULL,
.need_healthcheck = true,
+   .has_intel_perf = false,
};
 
igt_fixture {
@@ -567,6 +587,8 @@ igt_main
gem_quiescent_gpu(fd_drm);
igt_require_gem(fd_drm);
 
+   priv.has_intel_perf = local_i915_perf_healthcheck(fd_drm);
+
/**
 * FIXME: Unbinding the i915 driver on some Haswell
 * platforms with Azalia audio results in a kernel WARN
diff --git a/tests/meson.build b/tests/meson.build
index 3e3db7d5b..3f6dc4fe3 100644
--- a/tests/meson.build
+++ b/tests/meson.build
@@ -3,7 +3,6 @@ test_progs = [
'core_getclient',
'core_getstats',
'core_getversion',
-   'core_hotunplug',
'core_setmaster',
'core_setmaster_vs_auth',
'debugfs_test',
@@ -361,6 +360,13 @@ test_executables += executable('perf',
   install : true)
 test_list += 'perf'
 
+test_executables += executable('core_hotunplug', 'core_hotunplug.c',
+  dependencies : test_deps + [ lib_igt_i915_perf ],
+  install_dir : libexecdir,
+  install_rpath : libexecdir_rpathdir,
+  install : true)
+test_list += 'core_hotunplug'
+
 executable('testdisplay', ['testdisplay.c', 'testdisplay_hotplug.c'],
   dependencies : test_deps,
   install_dir : libexecdir,
-- 
2.25.1



[Intel-gfx] [RFC PATCH 2/2] intel-ci: Unblock core_hotunplug@*hot*bind* subtests

2021-04-08 Thread Janusz Krzysztofik
Commit be529747d8ea ("intel-ci: Broaden core_hotunplug blacklist")
blamed issues triggered by hot variants[*] as responsible for random
failures in subsequently executed tests.

According to the issue history[*], the last reported occurrences were
not related to core_hotunplug.  Remove *hot*bind* subtests from the CI
blocklist.

[*] https://gitlab.freedesktop.org/drm/intel/-/issues/2644.

Signed-off-by: Janusz Krzysztofik 
---
 tests/intel-ci/blacklist.txt | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/tests/intel-ci/blacklist.txt b/tests/intel-ci/blacklist.txt
index 33f92e37f..595fd0ca6 100644
--- a/tests/intel-ci/blacklist.txt
+++ b/tests/intel-ci/blacklist.txt
@@ -112,10 +112,10 @@ igt@.*@.*pipe-f($|-.*)
 # Temporary workarounds for CI-impacting bugs
 ###
 
-# Currently fails and leaves the machine in a very bad state, and
-# causes coverage loss for other tests. IOMMU related.
-# https://gitlab.freedesktop.org/drm/intel/-/issues/2644
-igt@core_hotunplug@.*(hot|plug).*
+# *plug* subtests still fail and leave the
+# machine in a very bad state, causing coverage
+# loss for other tests.  IOMMU related.
+igt@core_hotunplug@.*plug.*
 
 # hangs several gens of hosts, and has no immediate fix
 igt@device_reset@reset-bound
-- 
2.25.1
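
For illustration only, the effect of narrowing the blocklist entry can be
checked with a small C sketch.  It assumes the entries behave like POSIX
extended regular expressions applied as an unanchored search, which may
differ from how the CI runner actually evaluates them.  is_blocked() is a
hypothetical helper, and "hotunplug-rescan" is used only as an illustrative
subtest name.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Patterns copied from the removed and added lines of the diff above. */
#define OLD_PATTERN "igt@core_hotunplug@.*(hot|plug).*"
#define NEW_PATTERN "igt@core_hotunplug@.*plug.*"

/*
 * Return 1 if name matches pattern, 0 if it does not, -1 if the pattern
 * fails to compile.
 * Assumption: POSIX extended regex, unanchored search.
 */
int is_blocked(const char *pattern, const char *name)
{
	regex_t re;
	int ret;

	if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB))
		return -1;

	ret = regexec(&re, name, 0, NULL, 0) == 0;
	regfree(&re);

	return ret;
}
```

Under these assumptions, is_blocked(OLD_PATTERN, "igt@core_hotunplug@hotrebind")
returns 1 while the same call with NEW_PATTERN returns 0, so the hotrebind
subtest would no longer be filtered out, whereas names containing "plug"
would still be.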



[Intel-gfx] [RFC PATCH 1/2] tests/core_hotunplug: Add perf health check

2021-04-08 Thread Janusz Krzysztofik
Sometimes CI reports skips of perf subtests when run subsequently after
core_hotunplug.  That may be an indication of issues with restoring
device perf features on driver (hot)rebind.

Detect device perf support at test start and check if still available
after driver rebind.  If that fails, a post-subtest device recovery
step restores the device perf support so no subsequently executed tests
are affected.

Signed-off-by: Janusz Krzysztofik 
---
 tests/core_hotunplug.c | 22 ++++++++++++++++++++++
 tests/meson.build      |  8 +++++++-
 2 files changed, 29 insertions(+), 1 deletion(-)

diff --git a/tests/core_hotunplug.c b/tests/core_hotunplug.c
index 56a88fefd..06f15d845 100644
--- a/tests/core_hotunplug.c
+++ b/tests/core_hotunplug.c
@@ -31,6 +31,7 @@
 #include 
 
 #include "i915/gem.h"
+#include "i915/perf.h"
 #include "igt.h"
 #include "igt_device_scan.h"
 #include "igt_kmod.h"
@@ -50,6 +51,7 @@ struct hotunplug {
const char *dev_bus_addr;
const char *failure;
bool need_healthcheck;
+   bool has_intel_perf;
 };
 
 /* Helpers */
@@ -319,6 +321,16 @@ static int local_i915_recover(int i915)
return local_i915_healthcheck(i915, "post-");
 }
 
+static bool local_i915_perf_healthcheck(int i915)
+{
+   struct intel_perf *intel_perf;
+
+   intel_perf = intel_perf_for_fd(i915);
+   if (intel_perf)
+   intel_perf_free(intel_perf);
+   return intel_perf;
+}
+
 #define FLAG_RENDER(1 << 0)
 #define FLAG_RECOVER   (1 << 1)
 static void node_healthcheck(struct hotunplug *priv, unsigned flags)
@@ -360,6 +372,13 @@ static void node_healthcheck(struct hotunplug *priv, unsigned flags)
}
}
 
+   if (!priv->failure && priv->has_intel_perf) {
+   local_debug("%s\n", "running i915 device perf healthcheck");
+   priv->failure = "Device perf healthcheck failure!";
+   if (local_i915_perf_healthcheck(fd_drm))
+   priv->failure = NULL;
+   }
+
fd_drm = close_device(fd_drm, "", "health checked ");
if (closed || fd_drm < -1)  /* update status for post_healthcheck */
priv->fd.drm_hc = fd_drm;
@@ -553,6 +572,7 @@ igt_main
.fd = { .drm = -1, .drm_hc = -1, .sysfs_dev = -1, },
.failure= NULL,
.need_healthcheck = true,
+   .has_intel_perf = false,
};
 
igt_fixture {
@@ -567,6 +587,8 @@ igt_main
gem_quiescent_gpu(fd_drm);
igt_require_gem(fd_drm);
 
+   priv.has_intel_perf = local_i915_perf_healthcheck(fd_drm);
+
/**
 * FIXME: Unbinding the i915 driver on some Haswell
 * platforms with Azalia audio results in a kernel WARN
diff --git a/tests/meson.build b/tests/meson.build
index 3e3db7d5b..3f6dc4fe3 100644
--- a/tests/meson.build
+++ b/tests/meson.build
@@ -3,7 +3,6 @@ test_progs = [
'core_getclient',
'core_getstats',
'core_getversion',
-   'core_hotunplug',
'core_setmaster',
'core_setmaster_vs_auth',
'debugfs_test',
@@ -361,6 +360,13 @@ test_executables += executable('perf',
   install : true)
 test_list += 'perf'
 
+test_executables += executable('core_hotunplug', 'core_hotunplug.c',
+  dependencies : test_deps + [ lib_igt_i915_perf ],
+  install_dir : libexecdir,
+  install_rpath : libexecdir_rpathdir,
+  install : true)
+test_list += 'core_hotunplug'
+
 executable('testdisplay', ['testdisplay.c', 'testdisplay_hotplug.c'],
   dependencies : test_deps,
   install_dir : libexecdir,
-- 
2.25.1
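
The detect-then-recheck logic described in the commit message can be
sketched in isolation as follows.  This is a minimal standalone model, not
the IGT code: struct hc_state and the hc_*() functions are hypothetical
names, and the boolean arguments stand in for the result of actually
probing the device for perf support.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the test's private state. */
struct hc_state {
	bool has_perf;		/* perf support detected before any unbind */
	const char *failure;	/* NULL while healthy, else a description */
};

/* Record the baseline once, before the driver is ever unbound. */
void hc_record_baseline(struct hc_state *st, bool perf_available)
{
	st->has_perf = perf_available;
}

/*
 * Re-check after rebind.  The check runs only if nothing failed earlier
 * and the device supported perf initially.  Failure is assumed first and
 * cleared only when the probe succeeds.
 */
void hc_check_after_rebind(struct hc_state *st, bool perf_available)
{
	if (st->failure || !st->has_perf)
		return;

	st->failure = "Device perf healthcheck failure!";
	if (perf_available)
		st->failure = NULL;
}

/* Convenience wrapper: true if the device ends up considered healthy. */
bool hc_healthy(bool perf_before, bool perf_after)
{
	struct hc_state st = { .has_perf = false, .failure = NULL };

	hc_record_baseline(&st, perf_before);
	hc_check_after_rebind(&st, perf_after);

	return st.failure == NULL;
}
```

In this model a device that never had perf support is not reported as
unhealthy, which mirrors the has_intel_perf guard in the patch.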



Re: [Intel-gfx] [RFC PATCH 1/2] tests/core_hotunplug: Add perf health check

2021-04-08 Thread Janusz Krzysztofik
Sorry for the double submission; I had to resend due to a typo in the
igt-dev list address.

Janusz

On Thursday, 8 April 2021 10:30:08 CEST Janusz Krzysztofik wrote:
> Sometimes CI reports skips of perf subtests when run subsequently after
> core_hotunplug.  That may be an indication of issues with restoring
> device perf features on driver (hot)rebind.
> 
> Detect device perf support at test start and check if still available
> after driver rebind.  If that fails, a post-subtest device recovery
> step restores the device perf support so no subsequently executed tests
> are affected.
> 
> Signed-off-by: Janusz Krzysztofik 
> ---
>  tests/core_hotunplug.c | 22 ++
>  tests/meson.build  |  8 +++-
>  2 files changed, 29 insertions(+), 1 deletion(-)
> 
> diff --git a/tests/core_hotunplug.c b/tests/core_hotunplug.c
> index 56a88fefd..06f15d845 100644
> --- a/tests/core_hotunplug.c
> +++ b/tests/core_hotunplug.c
> @@ -31,6 +31,7 @@
>  #include 
>  
>  #include "i915/gem.h"
> +#include "i915/perf.h"
>  #include "igt.h"
>  #include "igt_device_scan.h"
>  #include "igt_kmod.h"
> @@ -50,6 +51,7 @@ struct hotunplug {
>   const char *dev_bus_addr;
>   const char *failure;
>   bool need_healthcheck;
> + bool has_intel_perf;
>  };
>  
>  /* Helpers */
> @@ -319,6 +321,16 @@ static int local_i915_recover(int i915)
>   return local_i915_healthcheck(i915, "post-");
>  }
>  
> +static bool local_i915_perf_healthcheck(int i915)
> +{
> + struct intel_perf *intel_perf;
> +
> + intel_perf = intel_perf_for_fd(i915);
> + if (intel_perf)
> + intel_perf_free(intel_perf);
> + return intel_perf;
> +}
> +
>  #define FLAG_RENDER  (1 << 0)
>  #define FLAG_RECOVER (1 << 1)
>  static void node_healthcheck(struct hotunplug *priv, unsigned flags)
> @@ -360,6 +372,13 @@ static void node_healthcheck(struct hotunplug *priv, unsigned flags)
>   }
>   }
>  
> + if (!priv->failure && priv->has_intel_perf) {
> + local_debug("%s\n", "running i915 device perf healthcheck");
> + priv->failure = "Device perf healthcheck failure!";
> + if (local_i915_perf_healthcheck(fd_drm))
> + priv->failure = NULL;
> + }
> +
>   fd_drm = close_device(fd_drm, "", "health checked ");
>   if (closed || fd_drm < -1)  /* update status for post_healthcheck */
>   priv->fd.drm_hc = fd_drm;
> @@ -553,6 +572,7 @@ igt_main
>   .fd = { .drm = -1, .drm_hc = -1, .sysfs_dev = -1, },
>   .failure= NULL,
>   .need_healthcheck = true,
> + .has_intel_perf = false,
>   };
>  
>   igt_fixture {
> @@ -567,6 +587,8 @@ igt_main
>   gem_quiescent_gpu(fd_drm);
>   igt_require_gem(fd_drm);
>  
> + priv.has_intel_perf = local_i915_perf_healthcheck(fd_drm);
> +
>   /**
>* FIXME: Unbinding the i915 driver on some Haswell
>* platforms with Azalia audio results in a kernel WARN
> diff --git a/tests/meson.build b/tests/meson.build
> index 3e3db7d5b..3f6dc4fe3 100644
> --- a/tests/meson.build
> +++ b/tests/meson.build
> @@ -3,7 +3,6 @@ test_progs = [
>   'core_getclient',
>   'core_getstats',
>   'core_getversion',
> - 'core_hotunplug',
>   'core_setmaster',
>   'core_setmaster_vs_auth',
>   'debugfs_test',
> @@ -361,6 +360,13 @@ test_executables += executable('perf',
>  install : true)
>  test_list += 'perf'
>  
> +test_executables += executable('core_hotunplug', 'core_hotunplug.c',
> +dependencies : test_deps + [ lib_igt_i915_perf ],
> +install_dir : libexecdir,
> +install_rpath : libexecdir_rpathdir,
> +install : true)
> +test_list += 'core_hotunplug'
> +
>  executable('testdisplay', ['testdisplay.c', 'testdisplay_hotplug.c'],
>  dependencies : test_deps,
>  install_dir : libexecdir,
> 






Re: [Intel-gfx] ✗ Fi.CI.IGT: failure for series starting with [RFC,1/2] tests/core_hotunplug: Add perf health check

2021-04-09 Thread Janusz Krzysztofik
On Thursday, 8 April 2021 16:50:45 CEST Patchwork wrote:
> == Series Details ==
> 
> Series: series starting with [RFC,1/2] tests/core_hotunplug: Add perf health 
> check
> URL   : https://patchwork.freedesktop.org/series/88848/
> State : failure
> 
> == Summary ==
> 
> CI Bug Log - changes from CI_DRM_9934_full -> IGTPW_5718_full
> 
> 
> Summary
> ---
> 
>   **FAILURE**
> 
>   Serious unknown changes coming with IGTPW_5718_full absolutely need to be
>   verified manually.
>   
>   If you think the reported changes have nothing to do with the changes
>   introduced in IGTPW_5718_full, please notify your bug team to allow them
>   to document this new failure mode, which will reduce false positives in CI.
> 
>   External URL: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_5718/index.html
> 
> Possible new issues
> ---
> 
>   Here are the unknown changes that may have been introduced in 
> IGTPW_5718_full:
> 
> ### IGT changes ###
> 
>  Possible regressions 
> 
>   * igt@core_hotunplug@hotrebind:
> - shard-tglb: NOTRUN -> [FAIL][1] +1 similar issue
>[1]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_5718/shard-tglb2/igt@core_hotunp...@hotrebind.html
> - shard-glk:  NOTRUN -> [FAIL][2] +1 similar issue
>[2]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_5718/shard-glk1/igt@core_hotunp...@hotrebind.html
> - shard-kbl:  NOTRUN -> [FAIL][3] +1 similar issue
>[3]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_5718/shard-kbl4/igt@core_hotunp...@hotrebind.html
> 
>   * igt@core_hotunplug@hotrebind-lateclose:
> - shard-snb:  NOTRUN -> [INCOMPLETE][4]
>[4]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_5718/shard-snb6/igt@core_hotunp...@hotrebind-lateclose.html
> - shard-iclb: NOTRUN -> [FAIL][5] +1 similar issue
>[5]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_5718/shard-iclb5/igt@core_hotunp...@hotrebind-lateclose.html
> - shard-apl:  NOTRUN -> [FAIL][6] +1 similar issue
>[6]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_5718/shard-apl7/igt@core_hotunp...@hotrebind-lateclose.html

Those FAILs clearly indicate an issue with restoring device perf features
after hot rebind on some platforms (or an issue with the IGT library's ability
to detect them), so this is not a regression; the patch only brings the issue
to light.  As long as we keep the hot*bind* subtests blocklisted, the issue
will not be visible and will persist silently, I'm afraid.

Regarding the INCOMPLETE, I'm wondering how often similar system crashes on
GPU hangs happen, whether they really happen only on GPU hangs after hot
rebind, and whether that's still a good reason to keep the hot*bind* subtests
blocklisted.  Chris, can you please comment?

Thanks,
Janusz

> 
>   
> Known issues
> 
> 
>   Here are the changes found in IGTPW_5718_full that come from known issues:
> 
> ### IGT changes ###
> 
>  Issues hit 
> 
>   * igt@gem_create@create-massive:
> - shard-snb:  NOTRUN -> [DMESG-WARN][7] ([i915#3002])
>[7]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_5718/shard-snb7/igt@gem_cre...@create-massive.html
> 
>   * igt@gem_ctx_persistence@engines-queued:
> - shard-snb:  NOTRUN -> [SKIP][8] ([fdo#109271] / [i915#1099]) +3 
> similar issues
>[8]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_5718/shard-snb6/igt@gem_ctx_persiste...@engines-queued.html
> 
>   * igt@gem_ctx_sseu@invalid-args:
> - shard-tglb: NOTRUN -> [SKIP][9] ([i915#280])
>[9]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_5718/shard-tglb5/igt@gem_ctx_s...@invalid-args.html
> 
>   * igt@gem_exec_fair@basic-deadline:
> - shard-glk:  [PASS][10] -> [FAIL][11] ([i915#2846])
>[10]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9934/shard-glk2/igt@gem_exec_f...@basic-deadline.html
>[11]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_5718/shard-glk7/igt@gem_exec_f...@basic-deadline.html
> - shard-apl:  NOTRUN -> [FAIL][12] ([i915#2846])
>[12]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_5718/shard-apl1/igt@gem_exec_f...@basic-deadline.html
> 
>   * igt@gem_exec_fair@basic-none-solo@rcs0:
> - shard-glk:  [PASS][13] -> [FAIL][14] ([i915#2842]) +1 similar 
> issue
>[13]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9934/shard-glk5/igt@gem_exec_fair@basic-none-s...@rcs0.html
>[14]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_5718/shard-glk4/igt@gem_exec_fair@basic-none-s...@rcs0.html
> 
>   * igt@gem_exec_fair@basic-none@vcs0:
> - shard-kbl:  NOTRUN -> [FAIL][15] ([i915#2842]) +1 similar issue
>[15]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_5718/shard-kbl7/igt@gem_exec_fair@basic-n...@vcs0.html
> 
>   * igt@gem_exec_fair@basic-pace@bcs0:
> - shard-tglb: [PASS][16] -> [FAIL][17] ([i915#2842]) +1 similar 
> issu

[Intel-gfx] [RFC PATCH] tests/gem_userptr_blits: Check for banned mmap-offset

2021-04-09 Thread Janusz Krzysztofik
Support for mmap-offset to userptr has been obsoleted, so the related
issues reported as lockdep splats are not going to be resolved other than
by continuing to ban mmap-offset attempts on userptr objects.

Replace the "mmap-offset-invalidate-*" and "readonly-mmap-unsync" subtests,
which now skip, with a negative "mmap-offset-banned" subtest that fails if
an mmap-offset attempt on a userptr object doesn't return ENODEV.  Also,
remove processing paths that depend on mmap-offset to userptr from other
subtest bodies and drop obsolete subtest variants.

Signed-off-by: Janusz Krzysztofik 
---
 tests/i915/gem_userptr_blits.c | 324 +++--
 1 file changed, 30 insertions(+), 294 deletions(-)

diff --git a/tests/i915/gem_userptr_blits.c b/tests/i915/gem_userptr_blits.c
index 7a80c0161..aad5f141b 100644
--- a/tests/i915/gem_userptr_blits.c
+++ b/tests/i915/gem_userptr_blits.c
@@ -70,52 +70,12 @@
 #endif
 
 static uint32_t userptr_flags;
-static bool *can_mmap;
 
 #define WIDTH 512
 #define HEIGHT 512
 
 static uint32_t linear[WIDTH*HEIGHT];
 
-static bool has_mmap(int i915, const struct mmap_offset *t)
-{
-   void *ptr, *map;
-   uint32_t handle;
-
-   handle = gem_create(i915, PAGE_SIZE);
-   map = __gem_mmap_offset(i915, handle, 0, PAGE_SIZE, PROT_WRITE,
-   t->type);
-   gem_close(i915, handle);
-   if (map) {
-   munmap(map, PAGE_SIZE);
-   } else {
-   igt_debug("no HW / kernel support for mmap-offset(%s)\n",
- t->name);
-   return false;
-   }
-   map = NULL;
-
-   igt_assert(posix_memalign(&ptr, PAGE_SIZE, PAGE_SIZE) == 0);
-
-   if (__gem_userptr(i915, ptr, 4096, 0,
- I915_USERPTR_UNSYNCHRONIZED, &handle))
-   goto out_ptr;
-   igt_assert(handle != 0);
-
-   map = __gem_mmap_offset(i915, handle, 0, 4096, PROT_WRITE, t->type);
-   if (map)
-   munmap(map, 4096);
-   else
-   igt_debug("mmap-offset(%s) banned, lockdep loop prevention\n",
- t->name);
-
-   gem_close(i915, handle);
-out_ptr:
-   free(ptr);
-
-   return map != NULL;
-}
-
 static void gem_userptr_test_unsynchronized(void)
 {
userptr_flags = I915_USERPTR_UNSYNCHRONIZED;
@@ -914,28 +874,13 @@ static int test_invalid_mapping(int fd, const struct mmap_offset *t)
 }
 
 #define PE_BUSY 0x1
-static void test_process_exit(int fd, const struct mmap_offset *mmo, int flags)
+static void test_process_exit(int fd, int flags)
 {
-   if (mmo)
-   igt_require_f(can_mmap[mmo->type],
- "HW & kernel support for LLC and mmap-offset(%s) over userptr\n",
- mmo->name);
-
igt_fork(child, 1) {
uint32_t handle;
 
handle = create_userptr_bo(fd, sizeof(linear));
 
-   if (mmo) {
-   uint32_t *ptr;
-
-   ptr = __gem_mmap_offset(fd, handle, 0, sizeof(linear),
-   PROT_READ | PROT_WRITE,
-   mmo->type);
-   if (ptr)
-   *ptr = 0;
-   }
-
if (flags & PE_BUSY)
igt_assert_eq(copy(fd, handle, handle), 0);
}
@@ -1064,53 +1009,30 @@ static int test_map_fixed_invalidate(int fd, uint32_t flags,
return 0;
 }
 
-static void test_mmap_offset_invalidate(int fd,
-   const struct mmap_offset *t,
-   unsigned int flags)
-#define MMOI_ACTIVE (1u << 0)
+static void test_mmap_offset_banned(int fd, const struct mmap_offset *t)
 {
-   igt_spin_t *spin = NULL;
-   uint32_t handle;
-   uint32_t *map;
+   struct drm_i915_gem_mmap_offset arg;
void *ptr;
 
/* check if mmap_offset type is supported by hardware, skip if not */
-   handle = gem_create(fd, PAGE_SIZE);
-   map = __gem_mmap_offset(fd, handle, 0, PAGE_SIZE,
-   PROT_READ | PROT_WRITE, t->type);
-   igt_require_f(map,
- "HW & kernel support for mmap_offset(%s)\n", t->name);
-   munmap(map, PAGE_SIZE);
-   gem_close(fd, handle);
+   memset(&arg, 0, sizeof(arg));
+   arg.flags = t->type;
+   arg.handle = gem_create(fd, PAGE_SIZE);
+   igt_skip_on_f(igt_ioctl(fd, DRM_IOCTL_I915_GEM_MMAP_OFFSET, &arg),
+   "HW & kernel support for mmap_offset(%s)\n", t->name);
+   gem_close(fd, arg.handle);
 
/* create userptr object */
+   memset(&arg, 0, sizeof(arg));
+   arg.flags = t->type;
igt_assert_eq(posix_memalign(&ptr, PAGE_SIZE, PAGE_SIZE), 0);
-   

Re: [Intel-gfx] [RFC PATCH] tests/gem_userptr_blits: Check for banned mmap-offset

2021-04-15 Thread Janusz Krzysztofik
On Thursday, 15 April 2021 11:47:29 CEST Marcin Bernatowicz wrote:
> On Fri, 2021-04-09 at 10:57 +0200, Janusz Krzysztofik wrote:
> > Support for mmap-offset to userptr has been obsoleted, then related
> > lockdep splat reported issues are not going to be resolved other than
> > still banning mmap-offset to userptr attempts.
> > 
> > Replace "mmap-offset-invalidate-*" and "readonly-mmap-unsync"
> > subtests
> > which now skip with a negative "mmap-offset-banned" that fails if a
> > mmap-offset attempt to a userptr object doesn't return ENODEV.  Also,
> > remove mmap-offset to userptr dependent processing paths from other
> > subtest bodies and drop obsolete subtest variants.
> > 
> > Signed-off-by: Janusz Krzysztofik
> 
> LGTM,
> Reviewed-by: Marcin Bernatowicz

Thank you Marcin, pushed.

Janusz






Re: [Intel-gfx] [RFC PATCH 1/2] tests/core_hotunplug: Add perf health check

2021-04-15 Thread Janusz Krzysztofik
On Wednesday, 14 April 2021 11:50:10 CEST Marcin Bernatowicz wrote:
> On Thu, 2021-04-08 at 10:31 +0200, Janusz Krzysztofik wrote:
> > Sometimes CI reports skips of perf subtests when run subsequently
> > after
> > core_hotunplug.  That may be an indication of issues with restoring
> > device perf features on driver (hot)rebind.
> > 
> > Detect device perf support at test start and check if still available
> > after driver rebind.  If that fails, a post-subtest device recovery
> > step restores the device perf support so no subsequently executed
> > tests
> > are affected.
> > 
> > Signed-off-by: Janusz Krzysztofik
> 
> LGTM,
> Acked-by: Marcin Bernatowicz

Thank you Marcin, pushed.

Janusz

> 
> 
> 






[Intel-gfx] [PATCH] drm/i915: Fix wrong name announced on FB driver switching

2021-04-29 Thread Janusz Krzysztofik
Commit 7a0f9ef9703d ("drm/i915: Use drm_fb_helper_fill_info")
effectively changed our FB driver name from "inteldrmfb" to
"i915drmfb".  However, we are still using the old name when kicking out
a firmware fbdev driver potentially bound to our device.  Use the new
name to avoid confusion.

Note: since the new name is assigned by a DRM fbdev helper called at
the DRM driver registration time, that name is not available when we
kick the other driver out early, hence a hardcoded name must be used
unless the DRM layer exposes a macro for converting a DRM driver name
to its associated fbdev driver name.

Signed-off-by: Janusz Krzysztofik 
---
 drivers/gpu/drm/i915/i915_drv.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 785dcf20c77b..46082490dc9a 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -554,7 +554,7 @@ static int i915_driver_hw_probe(struct drm_i915_private *dev_priv)
if (ret)
goto err_perf;
 
-   ret = drm_aperture_remove_conflicting_pci_framebuffers(pdev, "inteldrmfb");
+   ret = drm_aperture_remove_conflicting_pci_framebuffers(pdev, "i915drmfb");
if (ret)
goto err_ggtt;
 
-- 
2.25.1



Re: [Intel-gfx] ✗ Fi.CI.IGT: failure for drm/i915: Fix wrong name announced on FB driver switching

2021-04-30 Thread Janusz Krzysztofik
On Friday, 30 April 2021 01:01:38 CEST Patchwork wrote:
> == Series Details ==
> 
> Series: drm/i915: Fix wrong name announced on FB driver switching
> URL   : https://patchwork.freedesktop.org/series/89663/
> State : failure
> 
> == Summary ==
> 
> CI Bug Log - changes from CI_DRM_10027_full -> Patchwork_20039_full
> 
> 
> Summary
> ---
> 
>   **FAILURE**
> 
>   Serious unknown changes coming with Patchwork_20039_full absolutely need to 
> be
>   verified manually.
>   
>   If you think the reported changes have nothing to do with the changes
>   introduced in Patchwork_20039_full, please notify your bug team to allow 
> them
>   to document this new failure mode, which will reduce false positives in CI.
> 
>   
> 
> Possible new issues
> ---
> 
>   Here are the unknown changes that may have been introduced in 
> Patchwork_20039_full:
> 
> ### IGT changes ###
> 
>  Possible regressions 
> 
>   * igt@kms_plane@plane-panning-bottom-right-pipe-b-planes:
> - shard-tglb: [PASS][1] -> [DMESG-WARN][2] +4 similar issues
>[1]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10027/shard-tglb3/igt@kms_pl...@plane-panning-bottom-right-pipe-b-planes.html
>[2]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20039/shard-tglb5/igt@kms_pl...@plane-panning-bottom-right-pipe-b-planes.html

False positive.  The change affects only a notice sent to the kernel log on FB
switching, so no error messages in the kernel log can be related to it.

Thanks,
Janusz

> 
>   
> Known issues
> 
> 
>   Here are the changes found in Patchwork_20039_full that come from known 
> issues:
> 
> ### IGT changes ###
> 
>  Issues hit 
> 
>   * igt@gem_create@create-clear:
> - shard-glk:  [PASS][3] -> [FAIL][4] ([i915#3160])
>[3]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10027/shard-glk8/igt@gem_cre...@create-clear.html
>[4]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20039/shard-glk9/igt@gem_cre...@create-clear.html
> - shard-skl:  [PASS][5] -> [FAIL][6] ([i915#3160])
>[5]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10027/shard-skl9/igt@gem_cre...@create-clear.html
>[6]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20039/shard-skl9/igt@gem_cre...@create-clear.html
> 
>   * igt@gem_create@create-massive:
> - shard-apl:  NOTRUN -> [DMESG-WARN][7] ([i915#3002])
>[7]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20039/shard-apl3/igt@gem_cre...@create-massive.html
> 
>   * igt@gem_ctx_persistence@legacy-engines-hostile@render:
> - shard-iclb: [PASS][8] -> [FAIL][9] ([i915#2410])
>[8]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10027/shard-iclb6/igt@gem_ctx_persistence@legacy-engines-host...@render.html
>[9]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20039/shard-iclb6/igt@gem_ctx_persistence@legacy-engines-host...@render.html
> 
>   * igt@gem_ctx_persistence@legacy-engines-queued:
> - shard-snb:  NOTRUN -> [SKIP][10] ([fdo#109271] / [i915#1099]) 
> +2 similar issues
>[10]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20039/shard-snb5/igt@gem_ctx_persiste...@legacy-engines-queued.html
> 
>   * igt@gem_ctx_persistence@many-contexts:
> - shard-tglb: [PASS][11] -> [FAIL][12] ([i915#2410])
>[11]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10027/shard-tglb6/igt@gem_ctx_persiste...@many-contexts.html
>[12]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20039/shard-tglb2/igt@gem_ctx_persiste...@many-contexts.html
> 
>   * igt@gem_ctx_ringsize@active@bcs0:
> - shard-skl:  NOTRUN -> [INCOMPLETE][13] ([i915#3316])
>[13]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20039/shard-skl8/igt@gem_ctx_ringsize@act...@bcs0.html
> 
>   * igt@gem_exec_fair@basic-deadline:
> - shard-kbl:  [PASS][14] -> [FAIL][15] ([i915#2846])
>[14]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10027/shard-kbl2/igt@gem_exec_f...@basic-deadline.html
>[15]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20039/shard-kbl4/igt@gem_exec_f...@basic-deadline.html
> 
>   * igt@gem_exec_fair@basic-flow@rcs0:
> - shard-kbl:  [PASS][16] -> [SKIP][17] ([fdo#109271])
>[16]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10027/shard-kbl7/igt@gem_exec_fair@basic-f...@rcs0.html
>[17]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20039/shard-kbl6/igt@gem_exec_fair@basic-f...@rcs0.html
> 
>   * igt@gem_exec_fair@basic-pace@rcs0:
> - shard-kbl:  [PASS][18] -> [FAIL][19] ([i915#2842]) +1 similar 
> issue
>[18]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10027/shard-kbl7/igt@gem_exec_fair@basic-p...@rcs0.html
>[19]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20039/shard-kbl4/igt@gem_exec_fair@basic-p...@rcs0.html
> 
>   * igt

[Intel-gfx] [RFC PATCH i-g-t] lib/i915/perf: Fix non-card0 processing

2021-04-30 Thread Janusz Krzysztofik
IGT i915/perf library functions now always operate on sysfs perf
attributes of the card0 device node, no matter which DRM device fd a user
passes.  The intention was to always switch to the primary device node if
a user passes a render device node fd, but that breaks handling of
non-card0 devices.

Instead of forcibly using DRM device minor number 0 when opening a
device sysfs area, convert the minor number of the user-passed device
fd to the minor number of the respective primary (cardX) device node.

Signed-off-by: Janusz Krzysztofik 
---
 lib/i915/perf.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/i915/perf.c b/lib/i915/perf.c
index 56d5c0b3a..336824df7 100644
--- a/lib/i915/perf.c
+++ b/lib/i915/perf.c
@@ -376,8 +376,8 @@ open_master_sysfs_dir(int drm_fd)
if (fstat(drm_fd, &st) || !S_ISCHR(st.st_mode))
 return -1;
 
-snprintf(path, sizeof(path), "/sys/dev/char/%d:0",
- major(st.st_rdev));
+snprintf(path, sizeof(path), "/sys/dev/char/%d:%d",
+ major(st.st_rdev), minor(st.st_rdev) & ~128);
 
return open(path, O_DIRECTORY);
 }
-- 
2.25.1
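
The minor number conversion used in the diff above can be tried out in
isolation with the sketch below.  It assumes render nodes are numbered from
minor 128 upwards and primary nodes from 0, so that clearing bit 7 maps
renderD(128+N) to cardN.  primary_minor() and primary_sysfs_path() are
hypothetical helper names, not part of the IGT library.

```c
#include <assert.h>
#include <stdio.h>

/* Assumption: render node minors have this bit set, primary ones do not. */
#define DRM_RENDER_MINOR_BIT 128u

/* Map a DRM node minor to the minor of its primary (cardX) node. */
unsigned int primary_minor(unsigned int node_minor)
{
	return node_minor & ~DRM_RENDER_MINOR_BIT;
}

/*
 * Format the sysfs path of the primary node that belongs to the given
 * character device numbers.  Returns what snprintf() returns.
 */
int primary_sysfs_path(char *buf, size_t len,
		       unsigned int dev_major, unsigned int node_minor)
{
	return snprintf(buf, len, "/sys/dev/char/%u:%u",
			dev_major, primary_minor(node_minor));
}
```

For example, with major 226 and minor 129 (a render node), the helper
formats "/sys/dev/char/226:1", while a minor that already belongs to a
primary node is left unchanged.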



Re: [Intel-gfx] Shutdown hooks

2019-05-17 Thread Janusz Krzysztofik
On Thursday, May 16, 2019 8:20:18 AM CEST Janusz Krzysztofik wrote:
> On Wednesday, May 15, 2019 5:00:40 PM CEST Chris Wilson wrote:
> > Janus, some old patches that may be of use for shutdown prior to kexec.
> > -Chris
> 
> Hi Chris,
> 
> Thanks for sharing.
> 
> I'm only not sure why you mentioned kexec.  I have an impression someone else
> was talking about kexec recently so maybe I was not the intended recipient.
> But anyway, those patches look to me like they may be helpful by hotunplug so
> I'm going to give them a try with the hotunplug test.

I was wrong.  The shutdown hook has nothing to do with hot unbind / unplug, and
the applicable remove hook already has both calls covered by those patches in
its path.  So it looks like I was indeed not the intended recipient of those
messages.

Thanks,
Janusz

P.S. Sorry for the business disclaimer appended to my last message.



[Intel-gfx] [RFC PATCH] drm/i915: Tolerate file owned GEM contexts on hot unbind

2019-05-17 Thread Janusz Krzysztofik
From: Janusz Krzysztofik 

During i915_driver_unload(), GEM contexts are verified restrictively
inside i915_gem_fini() if they don't consume shared resources which
should be cleaned up before the driver is released.  If those checks
don't result in kernel panic, one more check is performed at the end of
i915_gem_fini() which issues a WARN_ON() if GEM contexts still exist.

Some GEM contexts are allocated unconditionally on device file open,
one per each file descriptor, and are kept open until those file
descriptors are closed.  Since open file descriptors prevent the driver
module from being unloaded, that protects the driver from being
released while contexts are still open.  However, that's not the case
on driver unbind or device unplug sysfs operations which are executed
regardless of open file descriptors.

To protect kernel resources from being accessed by those open file
descriptors while a driver unbind or device unplug operation is in
progress, the driver now calls drm_dev_unplug() at the beginning of
that process and relies on the DRM layer to provide such protection.

Taking all the above information into account, as soon as shared resources
not associated with specific file descriptors are cleaned up, it should
be safe to postpone completion of driver release until users of those
open file descriptors give up on errors and close them.

When device has been marked unplugged, use WARN_ON() conditionally so
the warning is displayed only if a GEM context not associated with a
file descriptor is still allocated.

Signed-off-by: Janusz Krzysztofik 
---
 drivers/gpu/drm/i915/i915_gem.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 54f27cabae2a..c00b6dbaf4f5 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4670,7 +4670,17 @@ void i915_gem_fini(struct drm_i915_private *dev_priv)
 
i915_gem_drain_freed_objects(dev_priv);
 
-   WARN_ON(!list_empty(&dev_priv->contexts.list));
+   if (drm_dev_is_unplugged(&dev_priv->drm)) {
+   struct i915_gem_context *ctx, *cn;
+
+   list_for_each_entry_safe(ctx, cn, &dev_priv->contexts.list,
+link) {
+   WARN_ON(IS_ERR_OR_NULL(ctx->file_priv));
+   break;
+   }
+   } else {
+   WARN_ON(!list_empty(&dev_priv->contexts.list));
+   }
 }
 
 void i915_gem_init_mmio(struct drm_i915_private *i915)
-- 
2.21.0
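
The condition described in the commit message (warn on an unplugged device
only if some context has no file owner) can be modelled in userspace as
below.  This is a simplified sketch with hypothetical names: struct ctx
stands in for a GEM context, a plain array replaces the kernel list, and
only NULL is treated as "no owner", whereas the patch also treats error
pointers that way.  Note that the loop in the diff above stops after its
first entry, while this sketch walks all entries, which is one possible
reading of the stated intent.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a GEM context. */
struct ctx {
	const void *file_priv;	/* non-NULL when owned by an open file */
};

/* Example data: all contexts owned, and one context without an owner. */
const struct ctx demo_owned[] = { { "fileA" }, { "fileB" } };
const struct ctx demo_mixed[] = { { "fileA" }, { NULL } };

/* Return true if any context in the array lacks a file owner. */
bool has_orphan_context(const struct ctx *ctxs, size_t count)
{
	for (size_t i = 0; i < count; i++)
		if (!ctxs[i].file_priv)
			return true;

	return false;
}

/*
 * Decide whether a warning is due at cleanup time.
 * Unplugged device: warn only about contexts with no file owner.
 * Otherwise: warn about any context that is still allocated.
 */
bool should_warn(const struct ctx *ctxs, size_t count, bool unplugged)
{
	if (unplugged)
		return has_orphan_context(ctxs, count);

	return count != 0;
}
```

With demo_mixed the orphan sits in the second slot, so a check that stopped
at the first entry would miss it.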


Re: [Intel-gfx] [RFC PATCH] drm/i915: Tolerate file owned GEM contexts on hot unbind

2019-05-20 Thread Janusz Krzysztofik
On Friday, May 17, 2019 4:32:35 PM CEST Chris Wilson wrote:
> Quoting Janusz Krzysztofik (2019-05-17 15:06:17)
> > From: Janusz Krzysztofik 
> > 
> > During i915_driver_unload(), GEM contexts are verified restrictively
> > inside i915_gem_fini() if they don't consume shared resources which
> > should be cleaned up before the driver is released.  If those checks
> > don't result in kernel panic, one more check is performed at the end of
> > i915_gem_fini() which issues a WARN_ON() if GEM contexts still exist.
> 
> Just fix the underlying bug of this code being called too early. The
> assumptions we made for unload are clearly invalid when applied to
> unbind, and we need to split the phases.
> -Chris

Thanks Chris, I think I get it finally.

Janusz





[Intel-gfx] [PATCH] drm/i915: Fix context IDs not released on driver hot unbind

2019-04-04 Thread Janusz Krzysztofik
From: Janusz Krzysztofik 

In case the driver gets unbound while a device is open, a kernel panic
may be forced if the list of allocated context IDs is not empty.

When a device is open, the list may be non-empty because a context ID,
once allocated by the context ID allocator to a context associated with
that open file descriptor, is released only on device close.

On the other hand, all allocated context IDs must be released and the
context ID allocator destroyed on driver unbind, even if a device is
open, in order to free the memory consumed and prevent memory leaks.
The purpose of the forced kernel panic was to protect the context ID
allocator from being silently destroyed if not all allocated IDs had
been released.

Before forcing the kernel panic on a non-empty list of allocated context
IDs, first force one in the unlikely case of a non-empty list of contexts
that should have been freed by the preceding drain of the work queue
(there must be another bug if that list is not empty).  If it is empty,
we may assume that the remaining contexts are idle (not pinned) and
their IDs can be safely released.

Once done, release the context ID of each of those remaining contexts
unless, unlikely, a context turns out to be pinned.  Force a kernel panic
in that case, as there must be yet another bug in the driver code.

Now the kernel panic protecting the allocator should not trigger, as the
list it checks should be empty.  If, unlikely, it is not empty, there
must be yet another bug.

Signed-off-by: Janusz Krzysztofik 
---
 drivers/gpu/drm/i915/i915_gem_context.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 280813a4bf82..18d004d94e43 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -611,6 +611,8 @@ void i915_gem_contexts_lost(struct drm_i915_private *dev_priv)
 
 void i915_gem_contexts_fini(struct drm_i915_private *i915)
 {
+   struct i915_gem_context *ctx, *cn;
+
lockdep_assert_held(&i915->drm.struct_mutex);
 
if (i915->preempt_context)
@@ -618,6 +620,14 @@ void i915_gem_contexts_fini(struct drm_i915_private *i915)
destroy_kernel_context(&i915->kernel_context);
 
/* Must free all deferred contexts (via flush_workqueue) first */
+   GEM_BUG_ON(!llist_empty(&i915->contexts.free_list));
+
+   /* Release all remaining HW IDs before ID allocator is destroyed */
+   list_for_each_entry_safe(ctx, cn, &i915->contexts.hw_id_list,
+hw_id_link) {
+   GEM_BUG_ON(atomic_read(&ctx->hw_id_pin_count));
+   release_hw_id(ctx);
+   }
GEM_BUG_ON(!list_empty(&i915->contexts.hw_id_list));
ida_destroy(&i915->contexts.hw_ida);
 }
-- 
2.20.1
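
The ordering argued for in this patch (release every remaining ID, treat
a pinned context as a bug, and only then destroy the allocator) can be
modelled in a few lines of userspace C.  This is a sketch under stated
assumptions: struct model_ida, struct model_ctx and the model_*()
helpers are hypothetical, a fixed-size bitmap replaces the kernel IDA,
and assert() plays the role of GEM_BUG_ON().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_IDS 8

/* Hypothetical ID allocator: in_use[i] is true while ID i is handed out. */
struct model_ida {
	bool in_use[MODEL_MAX_IDS];
};

/* Hypothetical context holding at most one hardware ID. */
struct model_ctx {
	int hw_id;	/* allocated ID, or -1 if none */
	int pin_count;	/* non-zero while the ID may still be in use */
};

/* Hand out the lowest free ID, or return -1 if none is left. */
static int model_assign_hw_id(struct model_ida *ida, struct model_ctx *ctx)
{
	for (int id = 0; id < MODEL_MAX_IDS; id++) {
		if (!ida->in_use[id]) {
			ida->in_use[id] = true;
			ctx->hw_id = id;
			return id;
		}
	}
	return -1;
}

/* Number of IDs currently handed out. */
static size_t model_ids_in_use(const struct model_ida *ida)
{
	size_t n = 0;

	for (int id = 0; id < MODEL_MAX_IDS; id++)
		n += ida->in_use[id];
	return n;
}

/*
 * Release the IDs of all remaining contexts before the allocator goes away.
 * A pinned context at this point would indicate a bug elsewhere.
 */
static size_t model_release_all_hw_ids(struct model_ida *ida,
				       struct model_ctx *ctxs, size_t n)
{
	size_t released = 0;

	for (size_t i = 0; i < n; i++) {
		if (ctxs[i].hw_id < 0)
			continue;
		assert(ctxs[i].pin_count == 0);
		ida->in_use[ctxs[i].hw_id] = false;
		ctxs[i].hw_id = -1;
		released++;
	}
	return released;
}
```

After model_release_all_hw_ids() returns, model_ids_in_use() should be
zero, which corresponds to the final emptiness check that precedes
destroying the allocator.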


Re: [Intel-gfx] [PATCH] drm/i915: Fix context IDs not released on driver hot unbind

2019-04-04 Thread Janusz Krzysztofik
On Thu, 2019-04-04 at 11:28 +0100, Chris Wilson wrote:
> Quoting Janusz Krzysztofik (2019-04-04 11:24:45)
> > From: Janusz Krzysztofik 
> > 
> > In case the driver gets unbound while a device is open, kernel
> > panic
> > may be forced if a list of allocated context IDs is not empty.
> > 
> > When a device is open, the list may happen to be not empty because
> > a
> > context ID, once allocated by a context ID allocator to a context
> > associated with that open file descriptor, is released as late as
> > on device close.
> > 
> > On the other hand, there is a need to release all allocated context
> > IDs
> > and destroy the context ID allocator on driver unbind, even if a
> > device
> > is open, in order to free memory resources consumed and prevent
> > from
> > memory leaks.  The purpose of the forced kernel panic was to
> > protect
> > the context ID allocator from being silently destroyed if not all
> > allocated IDs had been released.
> 
> Those open fd are still pointing into kernel memory where the driver
> used to be. The panic is entirely correct, we should not be unloading
> the module before those dangling pointers have been made safe.
> 
> This is papering over the symptom. How is the module being unloaded
> with
> open fd? 

A user can play with the driver unbind or device remove sysfs
interface.

Thanks,
Janusz

> If all the fd have been closed, how have we failed to flush and
> retire all requests (thereby unpinning the contexts and all other
> pointers).
> -Chris

Re: [Intel-gfx] [PATCH] drm/i915: Fix context IDs not released on driver hot unbind

2019-04-04 Thread Janusz Krzysztofik
On Thu, 2019-04-04 at 11:43 +0100, Chris Wilson wrote:
> Quoting Janusz Krzysztofik (2019-04-04 11:40:24)
> > On Thu, 2019-04-04 at 11:28 +0100, Chris Wilson wrote:
> > > Quoting Janusz Krzysztofik (2019-04-04 11:24:45)
> > > > From: Janusz Krzysztofik 
> > > > 
> > > > In case the driver gets unbound while a device is open, kernel
> > > > panic
> > > > may be forced if a list of allocated context IDs is not empty.
> > > > 
> > > > When a device is open, the list may happen to be not empty
> > > > because
> > > > a
> > > > context ID, once allocated by a context ID allocator to a
> > > > context
> > > > associated with that open file descriptor, is released as late
> > > > as
> > > > on device close.
> > > > 
> > > > On the other hand, there is a need to release all allocated
> > > > context
> > > > IDs
> > > > and destroy the context ID allocator on driver unbind, even if
> > > > a
> > > > device
> > > > is open, in order to free memory resources consumed and prevent
> > > > from
> > > > memory leaks.  The purpose of the forced kernel panic was to
> > > > protect
> > > > the context ID allocator from being silently destroyed if not
> > > > all
> > > > allocated IDs had been released.
> > > 
> > > Those open fd are still pointing into kernel memory where the
> > > driver
> > > used to be. The panic is entirely correct, we should not be
> > > unloading
> > > the module before those dangling pointers have been made safe.
> > > 
> > > This is papering over the symptom. How is the module being
> > > unloaded
> > > with
> > > open fd? 
> > 
> > A user can play with the driver unbind or device remove sysfs
> > interface.
> 
> Sure, but we must still follow all the steps before _unloading_ the
> module or else the user is left pointing into reused kernel memory.

I'm not talking about unloading the module, that is prevented by open
fds.  The driver still exists after being unbound from a device and may
just respond with -ENODEV.

Janusz

> -Chris

[Intel-gfx] [PATCH] drm/i915: Use drm_dev_unplug()

2019-04-05 Thread Janusz Krzysztofik
From: Janusz Krzysztofik 

The driver does not currently support unbinding from a device which is
in use.  Since open file descriptors may still be pointing into kernel
memory where the device structures used to be, entirely correct kernel
panics protect the driver from being unbound as we should not be
unbinding it before those dangling pointers have been made safe.

According to the documentation found inside drivers/gpu/drm/drm_drv.c,
drm_dev_unplug() should be used instead of drm_dev_unregister() in
order to make a device inaccessible to users as soon as it is unplugged.
Follow that advice to make those possibly dangling pointers safe,
protected by DRM layer from a user who is otherwise left pointing into
possibly reused kernel memory after the driver has been unbound from
the device.

Signed-off-by: Janusz Krzysztofik 
---
 drivers/gpu/drm/i915/i915_drv.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 9df65d386d11..66163378c481 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -1596,7 +1596,7 @@ static void i915_driver_unregister(struct drm_i915_private *dev_priv)
i915_pmu_unregister(dev_priv);
 
i915_teardown_sysfs(dev_priv);
-   drm_dev_unregister(&dev_priv->drm);
+   drm_dev_unplug(&dev_priv->drm);
 
i915_gem_shrinker_unregister(dev_priv);
 }
-- 
2.20.1


Re: [Intel-gfx] [PATCH] drm/i915: Use drm_dev_unplug()

2019-04-05 Thread Janusz Krzysztofik
On Fri, 2019-04-05 at 08:41 +0100, Chris Wilson wrote:
> Quoting Janusz Krzysztofik (2019-04-05 08:26:57)
> > From: Janusz Krzysztofik 
> > 
> > The driver does not currently support unbinding from a device which
> > is
> > in use.  Since open file descriptors may still be pointing into
> > kernel
> > memory where the device structures used to be, entirely correct
> > kernel
> > panics protect the driver from being unbound as we should not be
> > unbinding it before those dangling pointers have been made safe.
> > 
> > According to the documentation found inside
> > drivers/gpu/drm/drm_drv.c,
> > drm_dev_unplug() should be used instead of drm_dev_unregister() in
> > order to make a device inaccessible to users as soon as it is
> > unplugged.
> > Follow that advice to make those possibly dangling pointers safe,
> > protected by DRM layer from a user who is otherwise left pointing
> > into
> > possibly reused kernel memory after the driver has been unbound
> > from
> > the device.
> > 
> > Signed-off-by: Janusz Krzysztofik 
> > ---
> >  drivers/gpu/drm/i915/i915_drv.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_drv.c
> > b/drivers/gpu/drm/i915/i915_drv.c
> > index 9df65d386d11..66163378c481 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.c
> > +++ b/drivers/gpu/drm/i915/i915_drv.c
> > @@ -1596,7 +1596,7 @@ static void i915_driver_unregister(struct
> > drm_i915_private *dev_priv)
> > i915_pmu_unregister(dev_priv);
> >  
> > i915_teardown_sysfs(dev_priv);
> > -   drm_dev_unregister(&dev_priv->drm);
> > +   drm_dev_unplug(&dev_priv->drm);
> 
> I think we may have our onion inverted here. We want to stop the
> users
> as the first step, then start removing the entries. (That will also
> nicely invert the order from register, which is what we typically
> expect).
> 
> After calling i915_driver_unregister(); call i915_gem_set_wedged() to
> immediately (give or take external fences) cancel inflight
> operations.

OK, thanks.  Do you prefer them squashed or as separate patches?

Thanks,
Janusz

> -Chris

Re: [Intel-gfx] [PATCH] drm/i915: Use drm_dev_unplug()

2019-04-05 Thread Janusz Krzysztofik
On Fri, 2019-04-05 at 09:24 +0100, Chris Wilson wrote:
> Quoting Janusz Krzysztofik (2019-04-05 09:11:54)
> > On Fri, 2019-04-05 at 08:41 +0100, Chris Wilson wrote:
> > > Quoting Janusz Krzysztofik (2019-04-05 08:26:57)
> > > > From: Janusz Krzysztofik 
> > > > 
> > > > The driver does not currently support unbinding from a device
> > > > which
> > > > is
> > > > in use.  Since open file descriptors may still be pointing into
> > > > kernel
> > > > memory where the device structures used to be, entirely correct
> > > > kernel
> > > > panics protect the driver from being unbound as we should not
> > > > be
> > > > unbinding it before those dangling pointers have been made
> > > > safe.
> > > > 
> > > > According to the documentation found inside
> > > > drivers/gpu/drm/drm_drv.c,
> > > > drm_dev_unplug() should be used instead of drm_dev_unregister()
> > > > in
> > > > order to make a device inaccessible to users as soon as it is
> > > > unplugged.
> > > > Follow that advice to make those possibly dangling pointers
> > > > safe,
> > > > protected by DRM layer from a user who is otherwise left
> > > > pointing
> > > > into
> > > > possibly reused kernel memory after the driver has been unbound
> > > > from
> > > > the device.
> > > > 
> > > > Signed-off-by: Janusz Krzysztofik
> > > > ---
> > > >  drivers/gpu/drm/i915/i915_drv.c | 2 +-
> > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > > 
> > > > diff --git a/drivers/gpu/drm/i915/i915_drv.c
> > > > b/drivers/gpu/drm/i915/i915_drv.c
> > > > index 9df65d386d11..66163378c481 100644
> > > > --- a/drivers/gpu/drm/i915/i915_drv.c
> > > > +++ b/drivers/gpu/drm/i915/i915_drv.c
> > > > @@ -1596,7 +1596,7 @@ static void i915_driver_unregister(struct
> > > > drm_i915_private *dev_priv)
> > > > i915_pmu_unregister(dev_priv);
> > > >  
> > > > i915_teardown_sysfs(dev_priv);
> > > > -   drm_dev_unregister(&dev_priv->drm);
> > > > +   drm_dev_unplug(&dev_priv->drm);
> > > 
> > > I think we may have our onion inverted here. We want to stop the
> > > users
> > > as the first step, then start removing the entries. (That will
> > > also
> > > nicely invert the order from register, which is what we typically
> > > expect).
> > > 
> > > After calling i915_driver_unregister(); call
> > > i915_gem_set_wedged() to
> > > immediately (give or take external fences) cancel inflight
> > > operations.
> > 
> > OK, thanks.  Do you prefer them squashed or as separate patches?
> 
> Quite happy to do the s/unregister/unplug/ and move in one go. Have a
> pre-emptive
> Reviewed-by: Chris Wilson 
> on that as that seems to be the right thing to do.
> 
> And there should be no issues in placing a i915_gem_set_wedged()
> immediately after the call to i915_driver_unregister, so if you
> include
> a line of commentary about why, for example
> 
> /*
>  * After unregistering the device to prevent any new users, cancel
>  * all in-flight requests so that we can quickly unbind the active
>  * resources.
>  */
> i915_gem_set_wedged(dev_priv);
> 
> Reviewed-by: Chris Wilson 

I've given it some testing.  There are no side effects with the test
workloads I've tried, and it looks like it at least helps to prevent the
device from actually becoming wedged.

With these two patches, plus the one we discussed yesterday and yet
another one I'm going to submit soon, I'm now able to unbind the driver
from a device while a workload is running on it, unload the module,
reload it and successfully perform basic GEM health checks, all in
quick succession :-).

Unfortunately, that is not 100% reproducible, and it does not work with
device unplug simulated by writing 1 to the device/remove sysfs file.
Surely that needs the work you describe below to be done first.

Thanks for your cooperation,
Janusz


> 
> I think overall though, we need to go through i915_driver_unload()
> and
> push the module cleanup operations to i915_driver_release -- that
> will
> take a bit of surgery to separate the different phases that are
> currently smashed together.
> -Chris

[Intel-gfx] [PATCH] drm/i915: Mark GEM wedged right after marking device unplugged

2019-04-05 Thread Janusz Krzysztofik
As soon as a device is considered unplugged, not only prevent pending
users from accessing the device structures but also cancel all their
pending requests so all consumed resources can be cleaned up as soon
as possible.

Signed-off-by: Janusz Krzysztofik 
Reviewed-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_drv.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 66163378c481..03a563ce7e6b 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -1598,6 +1598,13 @@ static void i915_driver_unregister(struct drm_i915_private *dev_priv)
i915_teardown_sysfs(dev_priv);
drm_dev_unplug(&dev_priv->drm);
 
+   /*
+* After unregistering the device to prevent any new users, cancel
+* all in-flight requests so that we can quickly unbind the active
+* resources.
+*/
+   i915_gem_set_wedged(dev_priv);
+
i915_gem_shrinker_unregister(dev_priv);
 }
 
-- 
2.20.1
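
What "cancel all in-flight requests" amounts to can be sketched with a
toy model: marking the GPU wedged completes every pending request with
an error and makes new submissions fail.  The structures and helpers
below are hypothetical, and the use of -EIO as the error code is an
assumption made for the sketch rather than something stated in this
patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical request; not the i915 request structure. */
struct model_request {
	bool completed;
	int error;	/* 0 on success, negative errno if cancelled */
};

struct model_gpu {
	bool wedged;
};

/* Mark the GPU wedged and cancel in-flight requests; returns how many. */
static size_t model_set_wedged(struct model_gpu *gpu,
			       struct model_request *reqs, size_t n)
{
	size_t cancelled = 0;

	assert(reqs != NULL || n == 0);
	gpu->wedged = true;

	for (size_t i = 0; i < n; i++) {
		if (reqs[i].completed)
			continue;	/* finished requests keep their result */
		reqs[i].error = -EIO;
		reqs[i].completed = true;
		cancelled++;
	}
	return cancelled;
}

/* New submissions are refused once the GPU is wedged. */
static int model_submit(const struct model_gpu *gpu, struct model_request *rq)
{
	if (gpu->wedged)
		return -EIO;

	rq->completed = false;
	rq->error = 0;
	return 0;
}
```

Because nothing stays pending after model_set_wedged(), whatever the
requests were holding can be released without waiting for the hardware,
which is the point the commit message makes.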


[Intel-gfx] [RFC PATCH] drm/i915: Don't panic on non-empty list of free cachelines

2019-04-05 Thread Janusz Krzysztofik
From: Janusz Krzysztofik 

If there are active users of a device during driver unbind, the driver
now panics on non-empty list of free cachelines.

By design, cachelines which are not in use are kept on a list of free
cachelines associated with a timeline and removed from that list either
when in use or when the timeline is destroyed.  Timelines in turn are
assigned to open file descriptors.

As long as a device file is open, its associated timeline with its list
of free cachelines will be hopefully destroyed on device close, either
while outstanding execlists are destroyed or on i915_timeline_put()
called directly, so as long as device file descriptors are protected
from unwanted user activities by the device being marked unplugged,
there should be no reason to panic.

Moreover, the timeline mutex, which is destroyed right after the check for
emptiness of the free cacheline list succeeds, is never used to protect
that list, only the list of active cachelines, so it can be freely
destroyed even if the former is not empty.

Simply remove the GEM_BUG_ON(!list_empty(&gt->hwsp_free_list)); line
from i915_timelines_fini().

Signed-off-by: Janusz Krzysztofik 
---
 drivers/gpu/drm/i915/i915_timeline.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_timeline.c b/drivers/gpu/drm/i915/i915_timeline.c
index b2202d2e58a2..1f23c2dcc0da 100644
--- a/drivers/gpu/drm/i915/i915_timeline.c
+++ b/drivers/gpu/drm/i915/i915_timeline.c
@@ -325,7 +325,6 @@ void i915_timelines_fini(struct drm_i915_private *i915)
struct i915_gt_timelines *gt = &i915->gt.timelines;
 
GEM_BUG_ON(!list_empty(&gt->active_list));
-   GEM_BUG_ON(!list_empty(&gt->hwsp_free_list));
 
mutex_destroy(&gt->mutex);
 }
-- 
2.20.1


Re: [Intel-gfx] [RFC PATCH] drm/i915: Don't panic on non-empty list of free cachelines

2019-04-05 Thread Janusz Krzysztofik
On Fri, 2019-04-05 at 13:20 +0100, Chris Wilson wrote:
> Quoting Janusz Krzysztofik (2019-04-05 13:13:31)
> > From: Janusz Krzysztofik 
> > 
> > If there are active users of a device during driver unbind, the
> > driver
> > now panics on non-empty list of free cachelines.
> 
> This panic is there to say that fini is being called with active
> contexts, that it is being called too early. Those requests should be
> cleaned up first, unpinning the contexts and resources, and so
> letting
> the timeline be freed.

OK, I see.  But why panic?  Maybe a WARN() would be enough.

Thanks,
Janusz

> -Chris

[Intel-gfx] [PATCH 1/2] drm/i915: Use drm_dev_unplug()

2019-04-05 Thread Janusz Krzysztofik
From: Janusz Krzysztofik 

The driver does not currently support unbinding from a device which is
in use.  Since open file descriptors may still be pointing into kernel
memory where the device structures used to be, entirely correct kernel
panics protect the driver from being unbound as we should not be
unbinding it before those dangling pointers have been made safe.

According to the documentation found inside drivers/gpu/drm/drm_drv.c,
drm_dev_unplug() should be used instead of drm_dev_unregister() in
order to make a device inaccessible to users as soon as it is unplugged.
Follow that advice to make those possibly dangling pointers safe,
protected by DRM layer from a user who is otherwise left pointing into
possibly reused kernel memory after the driver has been unbound from
the device.  Once done, also cancel inflight operations immediately by
calling i915_gem_set_wedged().

Signed-off-by: Janusz Krzysztofik 
Reviewed-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_drv.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 9df65d386d11..66163378c481 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -1596,7 +1596,7 @@ static void i915_driver_unregister(struct drm_i915_private *dev_priv)
i915_pmu_unregister(dev_priv);
 
i915_teardown_sysfs(dev_priv);
-   drm_dev_unregister(&dev_priv->drm);
+   drm_dev_unplug(&dev_priv->drm);
 
i915_gem_shrinker_unregister(dev_priv);
 }
-- 
2.20.1


[Intel-gfx] [PATCH 0/2] Stop users from using the device on driver unbind

2019-04-05 Thread Janusz Krzysztofik
Use drm_dev_unplug() to have device resources protected from user access
by DRM layer as soon as the driver is going to be unbound.  Also, cancel
all pending work so associated resources can be quickly released.

Janusz Krzysztofik (2):
  drm/i915: Use drm_dev_unplug()
  drm/i915: Mark GEM wedged right after marking device unplugged

 drivers/gpu/drm/i915/i915_drv.c | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

I'm resending these two patches together as a series to make the robot
happy about the second one.  Also, I've added a Suggested-by: tag
to credit Chris' actual contribution.

Thanks,
Janusz
-- 
2.20.1


[Intel-gfx] [PATCH 2/2] drm/i915: Mark GEM wedged right after marking device unplugged

2019-04-05 Thread Janusz Krzysztofik
As soon as a device is considered unplugged, not only prevent pending
users from accessing the device structures but also cancel all their
pending requests so all consumed resources can be cleaned up as soon
as possible.

Suggested-by: Chris Wilson 
Signed-off-by: Janusz Krzysztofik 
Reviewed-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_drv.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 66163378c481..03a563ce7e6b 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -1598,6 +1598,13 @@ static void i915_driver_unregister(struct drm_i915_private *dev_priv)
i915_teardown_sysfs(dev_priv);
drm_dev_unplug(&dev_priv->drm);
 
+   /*
+* After unregistering the device to prevent any new users, cancel
+* all in-flight requests so that we can quickly unbind the active
+* resources.
+*/
+   i915_gem_set_wedged(dev_priv);
+
i915_gem_shrinker_unregister(dev_priv);
 }
 
-- 
2.20.1


[Intel-gfx] [PATCH v2] drm/i915: Don't panic on non-empty list of free cachelines

2019-04-05 Thread Janusz Krzysztofik
From: Janusz Krzysztofik 

If there are active users of a device during driver unbind, the driver
now panics on non-empty list of free cachelines.

By design, cachelines which are not in use are kept on a list of free
cachelines associated with a timeline and removed from that list either
when in use or when the timeline is destroyed.  Timelines in turn are
assigned to open file descriptors.

As long as a device file is open, its associated timeline with its list
of free cachelines will be hopefully destroyed on device close, either
while outstanding execlists are destroyed or on i915_timeline_put()
called directly, so as long as device file descriptors are protected
from unwanted user activities by the device being marked unplugged,
there should be no reason to panic.

Moreover, the timeline mutex, which is destroyed right after the check for
emptiness of the free cacheline list succeeds, is never used to protect
that list, only the list of active cachelines, so it can be freely
destroyed even if the former is not empty.

Since the desired behavior is to clean up active contexts first,
unpinning the contexts and resources, and so letting the timeline be
freed, the panic is there to say that i915_timelines_fini() is
called too early.  Don't remove the check completely, then, but convert it
from a BUG() to a WARN() so the indication that a long-term fix is needed
is still given.

Signed-off-by: Janusz Krzysztofik 
---
 drivers/gpu/drm/i915/i915_timeline.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_timeline.c b/drivers/gpu/drm/i915/i915_timeline.c
index b2202d2e58a2..965fd3052b25 100644
--- a/drivers/gpu/drm/i915/i915_timeline.c
+++ b/drivers/gpu/drm/i915/i915_timeline.c
@@ -325,7 +325,7 @@ void i915_timelines_fini(struct drm_i915_private *i915)
struct i915_gt_timelines *gt = &i915->gt.timelines;
 
GEM_BUG_ON(!list_empty(&gt->active_list));
-   GEM_BUG_ON(!list_empty(&gt->hwsp_free_list));
+   GEM_WARN_ON(!list_empty(&gt->hwsp_free_list));
 
mutex_destroy(&gt->mutex);
 }
-- 
2.20.1
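
The difference between the two kinds of check in this hunk can be shown
with a userspace analogue: one condition stays fatal, the other is
reported and teardown continues.  This is a sketch only; model_warn_on(),
struct model_timelines and model_timelines_fini() are hypothetical names,
assert() stands in for the fatal check, and list lengths replace the
kernel lists.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Number of warnings reported so far. */
static unsigned int model_warn_count;

/* Report the problem and hand the condition back to the caller. */
static bool model_warn_on(bool cond, const char *what)
{
	if (cond) {
		model_warn_count++;
		fprintf(stderr, "WARNING: %s\n", what);
	}
	return cond;
}

/* Hypothetical teardown state: list lengths instead of kernel lists. */
struct model_timelines {
	size_t active;		/* entries on the active list */
	size_t hwsp_free;	/* entries on the free cacheline list */
};

/* Returns true if teardown found nothing left over. */
static bool model_timelines_fini(const struct model_timelines *tl)
{
	/* Still fatal: active timelines at this point are a hard bug. */
	assert(tl->active == 0);

	/* Downgraded: leftover free cachelines are reported, not fatal. */
	return !model_warn_on(tl->hwsp_free != 0,
			      "free cacheline list not empty");
}
```

The warning keeps the signal that the function was called too early
while letting the unbind path run to completion.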


Re: [Intel-gfx] ✗ Fi.CI.BAT: failure for Stop users from using the device on driver unbind

2019-04-08 Thread Janusz Krzysztofik
On Friday, April 5, 2019 7:37:04 PM CEST Chris Wilson wrote:
> Quoting Chris Wilson (2019-04-05 17:26:46)
> 
> > Quoting Patchwork (2019-04-05 17:20:39)
> > 
> > > == Series Details ==
> > > 
> > > Series: Stop users from using the device on driver unbind
> > > URL   : https://patchwork.freedesktop.org/series/59064/
> > > State : failure
> > > 
> > > == Summary ==
> > > 
> > > CI Bug Log - changes from CI_DRM_5881 -> Patchwork_12699
> > > 
> > > 
> > > Summary
> > > ---
> > > 
> > >   **FAILURE**
> > >   
> > >   Serious unknown changes coming with Patchwork_12699 absolutely need to be
> > >   verified manually.
> > >   
> > >   If you think the reported changes have nothing to do with the changes
> > >   introduced in Patchwork_12699, please notify your bug team to allow them
> > >   to document this new failure mode, which will reduce false positives in CI.
> > >   
> > >   External URL: https://patchwork.freedesktop.org/api/1.0/series/59064/revisions/1/mbox/
> > > 
> > > Possible new issues
> > > ---
> > > 
> > >   Here are the unknown changes that may have been introduced in Patchwork_12699:
> > > ### IGT changes ###
> > > 
> > >  Possible regressions 
> > > 
> > >   * igt@i915_module_load@reload:
> > 2 issues, it appears:
> > 
> > <4> [271.799080] WARN_ON(dev_priv->mm.object_count)
> > <4> [271.799241] WARNING: CPU: 0 PID: 3288 at drivers/gpu/drm/i915/i915_gem.c:5145 i915_gem_cleanup_early+0x104/0x110 [i915]
> > <4> [271.799249] Modules linked in: vgem snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic i915(-) mei_hdcp x86_pkg_temp_thermal btusb coretemp btrtl btbcm btintel bluetooth crct10dif_pclmul crc32_pclmul snd_hda_codec snd_hwdep ghash_clmulni_intel snd_hda_core e1000e ecdh_generic snd_pcm mei_me ptp prime_numbers pps_core mei [last unloaded: snd_hda_intel]
> > <4> [271.799302] CPU: 0 PID: 3288 Comm: i915_module_loa Tainted: G U 5.1.0-rc3-CI-Patchwork_12699+ #1
> > <4> [271.799307] Hardware name: /NUC6i7KYB, BIOS KYSKLi70.86A.0059.2018.1122.1431 11/22/2018
> > <4> [271.799406] RIP: 0010:i915_gem_cleanup_early+0x104/0x110 [i915]
> > <4> [271.799412] Code: 00 00 48 c7 c2 d0 6b 3d a0 48 c7 c7 ca 5c 2c a0 e8 c1 b5 ec e0 0f 0b 48 c7 c6 68 c0 3f a0 48 c7 c7 63 88 42 a0 e8 9c 77 de e0 <0f> 0b e9 40 ff ff ff 0f 1f 44 00 00 e8 5b 7e 00 00 31 c0 c3 0f 1f
> > <4> [271.799417] RSP: 0018:c9453dd0 EFLAGS: 00010282
> > <4> [271.799423] RAX:  RBX: 88849afd RCX: 
> > <4> [271.799428] RDX: 0006 RSI: 88849ee130b8 RDI: 8211dc4d
> > <4> [271.799432] RBP: 88849afd7630 R08: 028bc995 R09: 
> > <4> [271.799436] R10:  R11:  R12: a04a81e0
> > <4> [271.799440] R13:  R14:  R15: a04a82d0
> > <4> [271.799446] FS:  7f31e8cec980() GS:8884aee0() knlGS:
> > <4> [271.799451] CS:  0010 DS:  ES:  CR0: 80050033
> > <4> [271.799455] CR2: 7ffea58773d8 CR3: 00044cfc6003 CR4: 003606f0
> > <4> [271.799459] Call Trace:
> > <4> [271.799531]  i915_driver_cleanup_early+0x30/0x70 [i915]
> > <4> [271.799603]  i915_driver_release+0xa/0x30 [i915]
> > <4> [271.799672]  i915_driver_unload+0x6a/0x120 [i915]
> > <4> [271.799748]  i915_pci_remove+0x19/0x30 [i915]
> > <4> [271.799765]  pci_device_remove+0x36/0xb0
> 
> So this is the bizarre part. We end up in the final i915_driver_release
> because it appears that drm_dev_unplug() drops a reference. I couldn't
> see where...
> 
> [   24.960676] WARNING: CPU: 2 PID: 637 at drivers/gpu/drm/drm_drv.c:895 drm_dev_put+0x8/0x60
> [   24.960735] Modules linked in: nls_ascii nls_cp437 vfat fat crct10dif_pclmul crc32_pclmul crc32c_intel i915(-) aesni_intel aes_x86_64 crypto_simd cryptd glue_helper intel_cstate intel_uncore intel_rapl_perf efivars i2c_i801 intel_gtt drm_kms_helper ahci libahci video button efivarfs
> [   24.960848] CPU: 2 PID: 637 Comm: i915_module_loa Tainted: GBU5.1.0-rc3+ #526
> [   24.960897] Hardware name: Intel Corporation NUC7i5BNK/NUC7i5BNB, BIOS BNKBL357.86A.0052.2017.0918.1346 09/18/2017
> [   24.960952] RIP: 0010:drm_dev_put+0x8/0x60
> [   24.960993] Code: 48 8d 7b 60 e8 d9 8b c7 ff 48 8b 7b 60 5b 5d e9 0e 4f c7 ff 48 89 df e8 06 c2 ff ff e9 3f ff ff ff 90 48 85 ff 75 01 c3 55 53 <0f> 0b f0 ff 4f 14 0f 88 64 b7 2d 00 74 03 5b 5d c3 48 89 fb 48 8d
> [   24.961066] RSP: 0018:88872587fc80 EFLAGS: 00010286
> [   24.961107] RAX:  RBX: 88873f02 RCX: 81680444
> [   24.961151] RDX: dc00 RSI: dc00 RDI: 88873f02
> [   24.961195] RBP: 88873f02ad88 R08:  R09: fbfff04824c5
> [   24.961240] R10: fbfff04824c5 R11: 8241262b R12: 88

[Intel-gfx] [PATCH v2 0/1] Stop users from using the device on driver unbind

2019-04-18 Thread Janusz Krzysztofik
Use drm_dev_unplug() to have device resources protected from user access
by DRM layer as soon as the driver is going to be unbound.

Janusz Krzysztofik (1):
  drm/i915: Use drm_dev_unplug()

 drivers/gpu/drm/i915/i915_drv.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

This patch should now be safe to use when merged with the current
drm-next or drm-tip branch, which no longer suffers from the incorrectly
resolved merge conflict that was breaking it, finally fixed by commit
bd53280ef042 ("drm/drv: Fix incorrect resolution of merge conflict"),
so I'm resending it with Daniel's Reviewed-by: added.

Former patch 2/2 has been dropped as it is already in drm-intel-next as
commit 141f3767e7b8 ("drm/i915: Mark GEM wedged right after marking
device unplugged").  BTW, the version I sent was screwed up, not
reflecting Chris' intention precisely enough, but Chris was vigilant and
fixed it.  Sorry Chris.

Thanks,
Janusz
-- 
2.20.1

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

[Intel-gfx] [PATCH v2 1/1] drm/i915: Use drm_dev_unplug()

2019-04-18 Thread Janusz Krzysztofik
From: Janusz Krzysztofik 

The driver does not currently support unbinding from a device which is
in use.  Since open file descriptors may still be pointing into kernel
memory where the device structures used to be, entirely correct kernel
panics protect the driver from being unbound as we should not be
unbinding it before those dangling pointers have been made safe.

According to the documentation found inside drivers/gpu/drm/drm_drv.c,
drm_dev_unplug() should be used instead of drm_dev_unregister() in
order to make a device inaccessible to users as soon as it is unplugged.
Follow that advice to make those possibly dangling pointers safe,
protected by DRM layer from a user who is otherwise left pointing into
possibly reused kernel memory after the driver has been unbound from
the device.

Signed-off-by: Janusz Krzysztofik 
Reviewed-by: Chris Wilson 
Reviewed-by: Daniel Vetter 
---
 drivers/gpu/drm/i915/i915_drv.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 9df65d386d11..66163378c481 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -1596,7 +1596,7 @@ static void i915_driver_unregister(struct drm_i915_private *dev_priv)
i915_pmu_unregister(dev_priv);
 
i915_teardown_sysfs(dev_priv);
-   drm_dev_unregister(&dev_priv->drm);
+   drm_dev_unplug(&dev_priv->drm);
 
i915_gem_shrinker_unregister(dev_priv);
 }
-- 
2.20.1
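
The effect described in the commit message can be pictured with a small
user-space model.  This is only a sketch, not DRM code: struct model_dev,
model_dev_unplug() and model_ioctl() are hypothetical names invented for
illustration.  It assumes the behaviour the DRM core provides once
drm_dev_unplug() has been called, where user entry points check the
unplugged state and fail with -ENODEV instead of touching device structures.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a DRM device; not the kernel's struct drm_device. */
struct model_dev {
	bool unplugged;   /* set once the driver starts unbinding */
	int hw_accesses;  /* counts how often the "hardware" was touched */
};

/* Models drm_dev_unplug(): from now on user entry points must bail out. */
static void model_dev_unplug(struct model_dev *dev)
{
	assert(dev != NULL);
	dev->unplugged = true;
}

/* Models a user entry point (e.g. an ioctl) guarded by the unplugged check. */
static int model_ioctl(struct model_dev *dev)
{
	assert(dev != NULL);
	if (dev->unplugged)
		return -ENODEV;  /* device is gone, do not touch its resources */

	dev->hw_accesses++;
	return 0;
}
```

In this model a file descriptor that stays open across the unbind keeps
getting -ENODEV, which is the protection the patch relies on.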


Re: [Intel-gfx] [RFC PATCH] iommu/vt-d: Fix IOMMU field not populated on device hot re-plug

2019-09-02 Thread Janusz Krzysztofik
Hi Baolu,

On Thursday, August 29, 2019 11:08:18 AM CEST Lu Baolu wrote:
> Hi,
> 
> On 8/29/19 3:58 PM, Janusz Krzysztofik wrote:
> > Hi Baolu,
> > 
> > On Thursday, August 29, 2019 3:43:31 AM CEST Lu Baolu wrote:
> >> Hi Janusz,
> >>
> >> On 8/28/19 10:17 PM, Janusz Krzysztofik wrote:
> >>>> We should avoid kernel panic when a intel_unmap() is called against
> >>>> a non-existent domain.
> >>> Does that mean you suggest to replace
> >>>   BUG_ON(!domain);
> >>> with something like
> >>>   if (WARN_ON(!domain))
> >>>   return;
> >>> and to not care of orphaned mappings left allocated?  Is there a way to
> >>> inform users that their active DMA mappings are no longer valid and
> >>> they shouldn't call dma_unmap_*()?
> >>>
> >>>> But we shouldn't expect the IOMMU driver not
> >>>> cleaning up the domain info when a device remove notification comes and
> >>>> wait until all file descriptors being closed, right?
> >>> Shouldn't then the IOMMU driver take care of cleaning up resources still
> >>> allocated on device remove before it invalidates and forgets their
> >>> pointers?
> >>>
> >>
> >> You are right. We need to wait until all allocated resources (iova and
> >> mappings) to be released.
> >>
> >> How about registering a callback for BUS_NOTIFY_UNBOUND_DRIVER, and
> >> removing the domain info when the driver detachment completes?
> > 
> > Device core calls BUS_NOTIFY_UNBOUND_DRIVER on each driver unbind,
> > regardless of a device being removed or not.  As long as the device is
> > not unplugged and the BUS_NOTIFY_REMOVED_DEVICE notification not
> > generated, an unbound driver is not a problem here.
> > Moreover, BUS_NOTIFY_UNBOUND_DRIVER is called even before
> > BUS_NOTIFY_REMOVED_DEVICE so that wouldn't help anyway.
> > Last but not least, bus events are independent of the IOMMU driver use via
> > DMA-API it exposes.
> 
> Fair enough.
> 
> > 
> > If keeping data for unplugged devices and reusing it on device re-plug is
> > not acceptable then maybe the IOMMU driver should perform reference
> > counting of its internal resources occupied by DMA-API users and perform
> > cleanups on last release?
> 
> I am not saying that keeping data is not acceptable. I just want to
> check whether there are any other solutions.

Then reverting 458b7c8e0dde and applying this patch still resolves the issue 
for me.  No errors appear when mappings are unmapped on device close after the 
device has been removed, and domain info preserved on device removal is 
successfully reused on device re-plug.

Is there anything else I can do to help?

Thanks,
Janusz

> 
> Best regards,
> Baolu
> 





Re: [Intel-gfx] [RFC PATCH] iommu/vt-d: Fix IOMMU field not populated on device hot re-plug

2019-09-03 Thread Janusz Krzysztofik
Hi Baolu,

On Tuesday, September 3, 2019 3:29:40 AM CEST Lu Baolu wrote:
> Hi Janusz,
> 
> On 9/2/19 4:37 PM, Janusz Krzysztofik wrote:
> >> I am not saying that keeping data is not acceptable. I just want to
> >> check whether there are any other solutions.
> > Then reverting 458b7c8e0dde and applying this patch still resolves the
> > issue for me.  No errors appear when mappings are unmapped on device
> > close after the device has been removed, and domain info preserved on
> > device removal is successfully reused on device re-plug.
> 
> This patch doesn't look good to me although I agree that keeping data is
> acceptable. It updates dev->archdata.iommu, but leaves the hardware
> context/pasid table unchanged. This might cause problems somewhere.
> 
> > 
> > Is there anything else I can do to help?
> 
> Can you please tell me how to reproduce the problem? 

The simplest way to reproduce the issue, assuming there are no non-Intel 
graphics adapters installed, is to run the following shell commands:

#!/bin/sh
# load i915 module
modprobe i915
# open an i915 device and keep it open in background
cat /dev/dri/card0 >/dev/null &
sleep 2
# simulate device unplug
echo 1 >/sys/class/drm/card0/device/remove
# make the background process close the device on exit
kill $!

Thanks,
Janusz


> Keeping the per
> device domain info while device is unplugged is a bit dangerous because
> info->dev might be a wild pointer. We need to work out a clean fix.
> 
> > 
> > Thanks,
> > Janusz
> > 
> 
> Best regards,
> Baolu
> 





[Intel-gfx] [PATCH] drm/i915: Don't unwedge if reset is disabled

2019-09-05 Thread Janusz Krzysztofik
When trying to reset a device with reset capability disabled or not
supported while rings are full of requests, it has been observed when
running in execlists submission mode that command stream buffer tail
tends to be incremented by apparently still running GPU regardless of
all requests being already cancelled and command stream buffer pointers
reset.  As a result, kernel panic on NULL pointer dereference occurs
when a trace_ports() helper is called with command stream buffer tail
incremented but request pointers being NULL during final
__intel_gt_set_wedged() operation called from intel_gt_reset().

Skip actual reset procedure if reset is disabled or not supported.

Suggested-by: Daniele Ceraolo Spurio 
Signed-off-by: Janusz Krzysztofik 
---
 drivers/gpu/drm/i915/gt/intel_reset.c | 26 ++
 1 file changed, 18 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
index b9d84d52e986..d75da124e280 100644
--- a/drivers/gpu/drm/i915/gt/intel_reset.c
+++ b/drivers/gpu/drm/i915/gt/intel_reset.c
@@ -932,25 +932,35 @@ void intel_gt_reset(struct intel_gt *gt,
GEM_BUG_ON(!test_bit(I915_RESET_BACKOFF, &gt->reset.flags));
mutex_lock(&gt->reset.mutex);
 
-   /* Clear any previous failed attempts at recovery. Time to try again. */
-   if (!__intel_gt_unset_wedged(gt))
-   goto unlock;
-
if (reason)
dev_notice(gt->i915->drm.dev,
   "Resetting chip for %s\n", reason);
-   atomic_inc(&gt->i915->gpu_error.reset_count);
-
-   awake = reset_prepare(gt);
 
if (!intel_has_gpu_reset(gt->i915)) {
if (i915_modparams.reset)
dev_err(gt->i915->drm.dev, "GPU reset not supported\n");
else
DRM_DEBUG_DRIVER("GPU reset disabled\n");
-   goto error;
+
+   /*
+* Don't unwedge if reset is disabled or not supported
+* because we can't guarantee what the hardware status is.
+*/
+   if (intel_gt_is_wedged(gt))
+   goto unlock;
}
 
+   /* Clear any previous failed attempts at recovery. Time to try again. */
+   if (!__intel_gt_unset_wedged(gt))
+   goto unlock;
+
+   atomic_inc(&gt->i915->gpu_error.reset_count);
+
+   awake = reset_prepare(gt);
+
+   if (!intel_has_gpu_reset(gt->i915))
+   goto error;
+
if (INTEL_INFO(gt->i915)->gpu_reset_clobbers_display)
intel_runtime_pm_disable_interrupts(gt->i915);
 
-- 
2.21.0
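
The reordered control flow can be followed more easily in a stripped-down
user-space model.  This is a sketch under stated assumptions, not the driver
code: struct model_gt, model_gt_reset() and the outcome names are
hypothetical, and the "goto error" path is assumed to end by wedging the GT
again, as the commit message describes for __intel_gt_set_wedged().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum model_outcome {
	MODEL_LEFT_WEDGED,   /* bailed out early, wedged state untouched */
	MODEL_RESET_FAILED,  /* reset impossible, GT ends up wedged */
	MODEL_RESET_DONE,    /* unwedged and reset successfully */
};

/* Hypothetical stand-in for struct intel_gt, reduced to what the flow needs. */
struct model_gt {
	bool wedged;
	bool has_gpu_reset;
	unsigned int reset_count;
	unsigned int unwedge_attempts;
};

static enum model_outcome model_gt_reset(struct model_gt *gt)
{
	assert(gt != NULL);

	/* New early exit: never unwedge when no reset can follow. */
	if (!gt->has_gpu_reset && gt->wedged)
		return MODEL_LEFT_WEDGED;

	/* Clear any previous failed attempt at recovery. */
	if (gt->wedged) {
		gt->unwedge_attempts++;
		gt->wedged = false;
	}
	gt->reset_count++;

	if (!gt->has_gpu_reset) {
		/* Models the "goto error" path, assumed to wedge the GT. */
		gt->wedged = true;
		return MODEL_RESET_FAILED;
	}
	return MODEL_RESET_DONE;
}
```

The point of the model is the first branch: an already wedged GT without
reset capability is left alone, so submission is never restored on hardware
whose state cannot be guaranteed.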


[Intel-gfx] [PATCH] drm/i915/guc: Fix detection of GuC submission in use

2019-09-05 Thread Janusz Krzysztofik
The driver always assumes active GuC submission mode if it is
supported.  That's not true if GuC initialization fails for some
reason.  That may lead to kernel panics, caused e.g. by execlists
fallback submission mode incorrectly detecting GuC submission in use.

Fix it by also checking for GuC enabled status.

Fixes: 356c484822e6 ("drm/i915/uc: Add explicit DISABLED state for firmware")
Signed-off-by: Janusz Krzysztofik 
---
 drivers/gpu/drm/i915/gt/uc/intel_uc.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc.h b/drivers/gpu/drm/i915/gt/uc/intel_uc.h
index 527995c21196..b28bab64a280 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_uc.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_uc.h
@@ -51,7 +51,8 @@ static inline bool intel_uc_supports_guc_submission(struct intel_uc *uc)
 
 static inline bool intel_uc_uses_guc_submission(struct intel_uc *uc)
 {
-   return intel_guc_is_submission_supported(&uc->guc);
+   return intel_guc_is_enabled(&uc->guc) &&
+  intel_guc_is_submission_supported(&uc->guc);
 }
 
 static inline bool intel_uc_supports_huc(struct intel_uc *uc)
-- 
2.21.0


[Intel-gfx] [PATCH i-g-t] lib: Don't use full reset on simulated hardware

2019-09-05 Thread Janusz Krzysztofik
If DROP_RESET_ACTIVE is requested while there is a large queue of pending
GEM requests, waiting for idle engines performed as a first step of
i915_gem_drop_caches debugfs request handler times out and an otherwise
healthy device is marked wedged.  If that happens while reset capabilities
are disabled or not supported, there is no possibility to successfully
reset the device after requests are retired.

Avoid fake GPU terminally wedged conditions by not requesting
DROP_RESET_ACTIVE from exit handler when running on simulated hardware.
As a side effect, terminating a very busy test and running a subsequent
one may take quite a while.

Signed-off-by: Janusz Krzysztofik 
---
 lib/drmtest.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/lib/drmtest.c b/lib/drmtest.c
index c379a7b7..b73bc132 100644
--- a/lib/drmtest.c
+++ b/lib/drmtest.c
@@ -318,7 +318,8 @@ static void __cancel_work_at_exit(int fd)
igt_sysfs_set_parameter(fd, "reset", "%x", -1u /* any method */);
igt_drop_caches_set(fd,
/* cancel everything */
-   DROP_RESET_ACTIVE | DROP_RESET_SEQNO |
+   igt_run_in_simulation() ? 0 : DROP_RESET_ACTIVE |
+   DROP_RESET_SEQNO |
/* cleanup */
DROP_ACTIVE | DROP_RETIRE | DROP_IDLE | DROP_FREED);
 }
-- 
2.21.0
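
One detail of the hunk above that may be worth double-checking is operator
precedence.  In C the conditional operator binds more loosely than bitwise
OR, so an unparenthesized "cond ? 0 : A | B | C" selects between 0 and the
whole OR-ed mask.  The sketch below shows the difference with hypothetical
MODEL_* flag values (the real DROP_* constants are defined in IGT and are
not reproduced here).  Which of the two behaviours was intended is not
stated beyond the commit message, so treat this as an illustration only.

```c
#include <assert.h>  /* handy for checking the results by hand */
#include <stdbool.h>

/* Hypothetical flag values; the real DROP_* constants live in IGT headers. */
#define MODEL_RESET_ACTIVE 0x1u
#define MODEL_RESET_SEQNO  0x2u
#define MODEL_CLEANUP      0x4u

/* Same shape as the hunk: the conditional has no parentheses. */
static unsigned int mask_as_written(bool simulation)
{
	return simulation ? 0 : MODEL_RESET_ACTIVE |
	       MODEL_RESET_SEQNO |
	       MODEL_CLEANUP;
}

/* Parenthesized variant: only the first flag depends on the condition. */
static unsigned int mask_parenthesized(bool simulation)
{
	return (simulation ? 0 : MODEL_RESET_ACTIVE) |
	       MODEL_RESET_SEQNO |
	       MODEL_CLEANUP;
}
```

With simulation set, the first function yields an empty mask while the
second one only leaves out MODEL_RESET_ACTIVE.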


Re: [Intel-gfx] [PATCH] drm/i915/guc: Fix detection of GuC submission in use

2019-09-05 Thread Janusz Krzysztofik
Hi Michał,

On Thursday, September 5, 2019 2:08:12 PM CEST Michal Wajdeczko wrote:
> On Thu, 05 Sep 2019 13:16:31 +0200, Janusz Krzysztofik wrote:
> 
> > The driver always assumes active GuC submission mode if it is
> > supported.  That's not true if GuC initialization fails for some
> > reason.  That may lead to kernel panics, caused e.g. by execlists
> > fallback submission mode incorrectly detecting GuC submission in use.
> >
> > Fix it by also checking for GuC enabled status.
> >
> > Fixes: 356c484822e6 ("drm/i915/uc: Add explicit DISABLED state for firmware")
> > Signed-off-by: Janusz Krzysztofik 
> > ---
> >  drivers/gpu/drm/i915/gt/uc/intel_uc.h | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc.h b/drivers/gpu/drm/i915/gt/uc/intel_uc.h
> > index 527995c21196..b28bab64a280 100644
> > --- a/drivers/gpu/drm/i915/gt/uc/intel_uc.h
> > +++ b/drivers/gpu/drm/i915/gt/uc/intel_uc.h
> > @@ -51,7 +51,8 @@ static inline bool intel_uc_supports_guc_submission(struct intel_uc *uc)
> > static inline bool intel_uc_uses_guc_submission(struct intel_uc *uc)
> >  {
> > -   return intel_guc_is_submission_supported(&uc->guc);
> > +   return intel_guc_is_enabled(&uc->guc) &&
> > +  intel_guc_is_submission_supported(&uc->guc);
> 
> This won't fix your original problem (that btw is not possible to
> repro on drm-tip)

I'm not sure how you force GuC initialization to fail, mine just didn't have 
new firmware available.  On module load, the driver was starting up in 
execlists submission mode and a BUG_ON() was raised from process_csb().  Running 
on a simulator, I was using current internal tree, based on current drm-tip.

> as after any GuC initialization failure we still
> treat GuC as "enabled":

My bad, I initially used intel_guc_is_running() but that interfered badly with 
module unload so I switched to intel_guc_is_enabled() and apparently didn't 
re-test if this still fixes the original issue.

> intel_guc_is_supported => H/W support (static)
> intel_guc_is_enabled => aka not disabled by the user (config)
> intel_guc_is_running => no major fw failure (runtime)
> 
> Note that even s/intel_guc_is_enabled/intel_guc_is_running
> won't help as GuC may be running but we may fail to correctly
> initialize GuC submission.
> 
> Correct fix to original problem must be aligned with new GuC
> submission model (coming soon) and it may look as this:
> 
> +static inline bool intel_guc_is_submission_active(struct intel_guc *guc)
> +{
> + GEM_BUG_ON(guc->submission_active && !intel_guc_is_running(guc));
> + return guc->submission_active;
> +}
> 
> and then
> 
>   static inline bool intel_uc_uses_guc_submission(struct intel_uc *uc)
>   {
> - return intel_guc_is_submission_supported(&uc->guc);
> + return intel_guc_is_submission_active(&uc->guc);
>   }
> 
> We may need to revisit all uses/supports/ macros to better
> reflect configuration vs runtime differences.

Definitely, or we may run into trouble like the one I experienced on module 
unload.  And that can be done in advance, I believe.

As long as the unload issue is resolved by not using 
intel_uc_uses_guc_submission() where it turned out to be inappropriate, using 
(intel_guc_is_running() && intel_guc_is_submission_supported()) seems a valid 
fix to me, easy to migrate to intel_guc_is_submission_active() as soon as it is 
available.  I'll revert to intel_guc_is_running(), fix the module unload 
issue and resubmit to trybot, maybe it can discover more issues with that.

Thanks,
Janusz

> 
> Thanks,
> Michal
> 
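
The distinction drawn in the review between configuration and runtime state
can be made concrete with a small user-space model.  This is a sketch only:
struct model_guc and the three helper functions are hypothetical, and
"supported" is a simplification standing in for what
intel_guc_is_submission_supported() reports.  It shows why neither the
original helper nor the posted patch notices a GuC that failed to
initialize, while a check based on a runtime "submission active" flag does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the states listed in the review; not the i915 struct. */
struct model_guc {
	bool supported;          /* H/W support (static) */
	bool enabled;            /* not disabled by the user (config) */
	bool running;            /* no major fw failure (runtime) */
	bool submission_active;  /* GuC submission really set up (runtime) */
};

/* Before the patch: configuration only. */
static bool uses_submission_original(const struct model_guc *guc)
{
	return guc->supported;
}

/* The posted patch: still configuration only. */
static bool uses_submission_patched(const struct model_guc *guc)
{
	return guc->enabled && guc->supported;
}

/* The direction suggested in the review: ask the runtime state. */
static bool uses_submission_active(const struct model_guc *guc)
{
	/* Models the GEM_BUG_ON(): active submission implies a running GuC. */
	assert(!(guc->submission_active && !guc->running));
	return guc->submission_active;
}
```

For a GuC that is supported and enabled but whose firmware failed to load,
the first two helpers still report GuC submission in use and only the third
one reports execlists fallback.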





Re: [Intel-gfx] [PATCH 2/3] drm/i915/uc: Disable GuC submission only if currently enabled

2019-09-09 Thread Janusz Krzysztofik
Hi Fernando,

On Wednesday, August 28, 2019 2:45:57 AM CEST Fernando Pacheco wrote:
> It is not enough to check that uc supports GuC submission now
> that we can continue to load the driver after GuC initialization
> failure (support != enabled). Instead we should explicitly check
> that we enabled GuC submission.

What's the status of this patch?

I think that with your intel_guc_is_submission_enabled() helper available, I 
would be able to resolve a few related issues, which I've accidentally taken 
upon myself, without inventing my own version.

Thanks,
Janusz


> Signed-off-by: Fernando Pacheco 
> Cc: Michal Wajdeczko 
> Cc: Daniele Ceraolo Spurio 
> ---
>  .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 23 +++
>  .../gpu/drm/i915/gt/uc/intel_guc_submission.h |  1 +
>  drivers/gpu/drm/i915/gt/uc/intel_uc.c |  2 +-
>  3 files changed, 25 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> index f325d3dd564f..d4aff9a96c7a 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> @@ -191,6 +191,16 @@ static bool __doorbell_valid(struct intel_guc *guc, u16 db_id)
>   return intel_uncore_read(uncore, GEN8_DRBREGL(db_id)) & GEN8_DRB_VALID;
>  }
>  
> +static bool __doorbell_enabled(struct intel_guc_client *client)
> +{
> + struct guc_doorbell_info *doorbell;
> +
> + GEM_BUG_ON(!has_doorbell(client));
> +
> + doorbell = __get_doorbell(client);
> + return doorbell->db_status == GUC_DOORBELL_ENABLED;
> +}
> +
>  static void __init_doorbell(struct intel_guc_client *client)
>  {
>   struct guc_doorbell_info *doorbell;
> @@ -1112,6 +1122,19 @@ static void guc_set_default_submission(struct intel_engine_cs *engine)
>   GEM_BUG_ON(engine->irq_enable || engine->irq_disable);
>  }
>  
> +bool intel_guc_is_submission_enabled(struct intel_guc *guc)
> +{
> + if (!intel_guc_is_submission_supported(guc))
> + return false;
> +
> + /*
> +  * Use the fact that we enable the guc execbuf_client
> +  * and its doorbell when enabling GuC submission as a proxy
> +  * for the latter.
> +  */
> + return guc->execbuf_client && __doorbell_enabled(guc->execbuf_client);
> +}
> +
>  int intel_guc_submission_enable(struct intel_guc *guc)
>  {
>   struct intel_gt *gt = guc_to_gt(guc);
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.h b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.h
> index 54d716828352..80b18a2c885a 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.h
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.h
> @@ -58,6 +58,7 @@ struct intel_guc_client {
>  
>  void intel_guc_submission_init_early(struct intel_guc *guc);
>  int intel_guc_submission_init(struct intel_guc *guc);
> +bool intel_guc_is_submission_enabled(struct intel_guc *guc);
>  int intel_guc_submission_enable(struct intel_guc *guc);
>  void intel_guc_submission_disable(struct intel_guc *guc);
>  void intel_guc_submission_fini(struct intel_guc *guc);
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc.c b/drivers/gpu/drm/i915/gt/uc/intel_uc.c
> index 29a9eec60d2e..b2eb340ce87e 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_uc.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_uc.c
> @@ -538,7 +538,7 @@ void intel_uc_fini_hw(struct intel_uc *uc)
>   if (!intel_guc_is_running(guc))
>   return;
>  
> - if (intel_uc_supports_guc_submission(uc))
> + if (intel_guc_is_submission_enabled(guc))
>   intel_guc_submission_disable(guc);
>  
>   if (guc_communication_enabled(guc))
> 





Re: [Intel-gfx] [PATCH 2/2] drm/i915/gt: Only unwedge if we can reset first

2019-09-10 Thread Janusz Krzysztofik
Hi Chris,

On Tuesday, September 10, 2019 12:55:36 AM CEST Chris Wilson wrote:
> Unwedging the GPU requires a successful GPU reset before we restore the
> default submission, or else we may see residual context switch events
> that we were not expecting.
> 
> Reported-by: Janusz Krzysztofik 
> Signed-off-by: Chris Wilson 
> Cc: Janusz Krzysztofik 
> Cc: Daniele Ceraolo Spurio 
> ---
>  drivers/gpu/drm/i915/gt/intel_reset.c | 7 ++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
> index fe57296b790c..5242496a893a 100644
> --- a/drivers/gpu/drm/i915/gt/intel_reset.c
> +++ b/drivers/gpu/drm/i915/gt/intel_reset.c
> @@ -809,6 +809,7 @@ static bool __intel_gt_unset_wedged(struct intel_gt *gt)
>   struct intel_gt_timelines *timelines = &gt->timelines;
>   struct intel_timeline *tl;
>   unsigned long flags;
> + bool ok;
>  
>   if (!test_bit(I915_WEDGED, &gt->reset.flags))
>   return true;
> @@ -854,7 +855,11 @@ static bool __intel_gt_unset_wedged(struct intel_gt *gt)
>   }
>   spin_unlock_irqrestore(&timelines->lock, flags);
>  
> - intel_gt_sanitize(gt, false);
> + ok = false;
> + if (!reset_clobbers_display(gt->i915))
> + ok = __intel_gt_reset(gt, ALL_ENGINES) == 0;
> + if (!ok)
> + return false;

Before your change, that code was executed inside intel_gt_sanitize(gt, false) 
which unfortunately didn't return any result.  The same outcome could be 
achieved by redefining intel_gt_sanitize() to return that result and saying:

	if (!intel_gt_sanitize(gt, false))
		return false;

Is there any specific reason for intel_gt_sanitize() returning void?

Thanks,
Janusz

>  
>   /*
>* Undo nop_submit_request. We prevent all new i915 requests from
> 





Re: [Intel-gfx] [PATCH] drm/i915: Don't unwedge if reset is disabled

2019-09-13 Thread Janusz Krzysztofik
On Monday, September 9, 2019 11:48:42 PM CEST Chris Wilson wrote:
> Quoting Chris Wilson (2019-09-07 09:39:52)
> > Quoting Daniele Ceraolo Spurio (2019-09-06 23:28:05)
> > > 
> > > 
> > > On 9/5/19 2:09 AM, Janusz Krzysztofik wrote:
> > > > When trying to reset a device with reset capability disabled or not
> > > > supported while rings are full of requests, it has been observed when
> > > > running in execlists submission mode that command stream buffer tail
> > > > tends to be incremented by apparently still running GPU regardless of
> > > > all requests being already cancelled and command stream buffer pointers
> > > > reset.  As a result, kernel panic on NULL pointer dereference occurs
> > > > when a trace_ports() helper is called with command stream buffer tail
> > > > incremented but request pointers being NULL during final
> > > > __intel_gt_set_wedged() operation called from intel_gt_reset().
> > > > 
> > > > Skip actual reset procedure if reset is disabled or not supported.
> > > 
> > > This last sentence is a bit confusing. You're not skipping the reset 
> > > procedure, you're skipping the attempt of unwedging and resetting again 
> > > after a reset & wedge already happened.
> > 
> > Loss of email over the last week, so jumping in at the end. My gut
> > response is that this is still just papering over the bug, as what you
> > say above makes no sense.
> 
> So my gut response was to the run on sentence, when all you needed to
> say that without a successful reset prior to calling
> reset_default_submission, the engine may still generate CS events out of
> the blue. And I think the patch should be written to require the
> successful reset.

You are right, successful reset seems the only safe protection.

But anyway, while digging deeper waiting for your clarification of that gut 
response ;-) , I've discovered that symptoms from which the issue can be 
predicted may sometimes be observed during reset_prepare() as a failing 
intel_engine_stop_cs().  Checking for that failure alone may be too weak as it 
can probably happen to succeed regardless of the uncertain hardware status, 
but anyway, what do you think about modifying reset_prepare() so it may fail 
with an error propagated from functions it calls, then calling reset_prepare() 
at the beginning of intel_gt_reset() and skipping over 
__intel_gt_unset_wedged() and further steps (do_reset(), ..., reset_finish()) 
if reset_prepare() fails?  Wouldn't that be a useful additional layer of 
protection?

If you think the idea is worth considering, please have a look at my 
first attempt sent to trybot already before your explanation arrived:
https://patchwork.freedesktop.org/patch/329840/?series=66447&rev=1
(don't complain on its commit message making no sense, please ;-) ).

Thanks,
Janusz

> -Chris
> 





[Intel-gfx] [RFC PATCH 1/1] drm/i915: Split off pci_driver.remove() tail to drm_driver.release()

2019-05-30 Thread Janusz Krzysztofik
In order to support driver hot unbind, some cleanup operations, now
performed on PCI driver remove, must be called later, after all device
file descriptors are closed.

Split out those operations from the tail of pci_driver.remove()
callback and put them into drm_driver.release() which is called as soon
as all references to the driver are put.  As a result, those cleanups
will be now run on last drm_dev_put(), either still called from
pci_driver.remove() if all device file descriptors are already closed,
or on last drm_release() file operation.

Signed-off-by: Janusz Krzysztofik 
---
 drivers/gpu/drm/i915/i915_drv.c | 17 +
 drivers/gpu/drm/i915/i915_drv.h |  1 +
 drivers/gpu/drm/i915/i915_gem.c | 10 +-
 3 files changed, 23 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 83d2eb9e74cb..8be69f84eb6d 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -738,6 +738,7 @@ static int i915_load_modeset_init(struct drm_device *dev)
 
 cleanup_gem:
i915_gem_suspend(dev_priv);
+   i915_gem_fini_hw(dev_priv);
i915_gem_fini(dev_priv);
 cleanup_modeset:
intel_modeset_cleanup(dev);
@@ -1685,7 +1686,6 @@ static void i915_driver_cleanup_hw(struct drm_i915_private *dev_priv)
pci_disable_msi(pdev);
 
pm_qos_remove_request(&dev_priv->pm_qos);
-   i915_ggtt_cleanup_hw(dev_priv);
 }
 
 /**
@@ -1909,6 +1909,7 @@ int i915_driver_load(struct pci_dev *pdev, const struct pci_device_id *ent)
 
 out_cleanup_hw:
i915_driver_cleanup_hw(dev_priv);
+   i915_ggtt_cleanup_hw(dev_priv);
 out_cleanup_mmio:
i915_driver_cleanup_mmio(dev_priv);
 out_runtime_pm_put:
@@ -1960,21 +1961,29 @@ void i915_driver_unload(struct drm_device *dev)
cancel_delayed_work_sync(&dev_priv->gpu_error.hangcheck_work);
i915_reset_error_state(dev_priv);
 
-   i915_gem_fini(dev_priv);
+   i915_gem_fini_hw(dev_priv);
 
intel_power_domains_fini_hw(dev_priv);
 
i915_driver_cleanup_hw(dev_priv);
-   i915_driver_cleanup_mmio(dev_priv);
 
enable_rpm_wakeref_asserts(dev_priv);
-   intel_runtime_pm_cleanup(dev_priv);
 }
 
 static void i915_driver_release(struct drm_device *dev)
 {
struct drm_i915_private *dev_priv = to_i915(dev);
 
+   disable_rpm_wakeref_asserts(dev_priv);
+
+   i915_gem_fini(dev_priv);
+
+   i915_ggtt_cleanup_hw(dev_priv);
+   i915_driver_cleanup_mmio(dev_priv);
+
+   enable_rpm_wakeref_asserts(dev_priv);
+   intel_runtime_pm_cleanup(dev_priv);
+
i915_driver_cleanup_early(dev_priv);
i915_driver_destroy(dev_priv);
 }
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index a2664ea1395b..d08e7bd83544 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3047,6 +3047,7 @@ void i915_gem_init_mmio(struct drm_i915_private *i915);
 int __must_check i915_gem_init(struct drm_i915_private *dev_priv);
 int __must_check i915_gem_init_hw(struct drm_i915_private *dev_priv);
 void i915_gem_init_swizzling(struct drm_i915_private *dev_priv);
+void i915_gem_fini_hw(struct drm_i915_private *dev_priv);
 void i915_gem_fini(struct drm_i915_private *dev_priv);
 int i915_gem_wait_for_idle(struct drm_i915_private *dev_priv,
   unsigned int flags, long timeout);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 7cafd5612f71..c6a8e665a6ba 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4667,7 +4667,7 @@ int i915_gem_init(struct drm_i915_private *dev_priv)
return ret;
 }
 
-void i915_gem_fini(struct drm_i915_private *dev_priv)
+void i915_gem_fini_hw(struct drm_i915_private *dev_priv)
 {
GEM_BUG_ON(dev_priv->gt.awake);
 
@@ -4681,6 +4681,14 @@ void i915_gem_fini(struct drm_i915_private *dev_priv)
intel_uc_fini_hw(dev_priv);
intel_uc_fini(dev_priv);
intel_engines_cleanup(dev_priv);
+   mutex_unlock(&dev_priv->drm.struct_mutex);
+
+   i915_gem_drain_freed_objects(dev_priv);
+}
+
+void i915_gem_fini(struct drm_i915_private *dev_priv)
+{
+   mutex_lock(&dev_priv->drm.struct_mutex);
i915_gem_contexts_fini(dev_priv);
i915_gem_fini_scratch(dev_priv);
mutex_unlock(&dev_priv->drm.struct_mutex);
-- 
2.21.0
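
The split described in the commit message boils down to reference counting
with a release callback.  The following user-space sketch illustrates the
two possible orderings.  All names (struct model_dev, model_dev_put(),
model_open(), model_close(), model_pci_remove()) are hypothetical and only
mimic the roles of drm_dev_put(), the file open and release operations and
pci_driver.remove(); the real teardown steps are reduced to two flags.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the DRM device and its reference count. */
struct model_dev {
	int refcount;       /* one reference held by the bound driver */
	bool hw_torn_down;  /* work that stays in pci_driver.remove() */
	bool released;      /* work moved to drm_driver.release() */
};

/* Models drm_dev_put(): the last put triggers the release callback. */
static void model_dev_put(struct model_dev *dev)
{
	assert(dev != NULL && dev->refcount > 0);
	if (--dev->refcount == 0)
		dev->released = true;  /* stands in for drm_driver.release() */
}

/* Each open device file holds one reference. */
static void model_open(struct model_dev *dev)
{
	assert(dev != NULL && !dev->released);
	dev->refcount++;
}

static void model_close(struct model_dev *dev)
{
	model_dev_put(dev);
}

/* Models pci_driver.remove(): immediate cleanup, then drop the driver's ref. */
static void model_pci_remove(struct model_dev *dev)
{
	assert(dev != NULL);
	dev->hw_torn_down = true;  /* hardware-facing cleanup runs right away */
	model_dev_put(dev);        /* deferred cleanup runs on the last put */
}
```

If a file is still open when model_pci_remove() runs, the deferred part only
happens on the final model_close(); with no open files it happens inside
model_pci_remove() itself, matching the two cases named in the commit
message.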


[Intel-gfx] [RFC PATCH 0/1] drm/i915: Split off pci_driver.remove() tail to drm_driver.release()

2019-05-30 Thread Janusz Krzysztofik
Hi,

I do realize more work needs to be done to get a clean hotunplug
solution, however I need your comments to make sure that I'm going in
the right direction.

So far I have no good idea how to resolve pm_runtime_get_sync()
failures on outstanding device file close after successful driver
unbind.

Thanks,
Janusz


Janusz Krzysztofik (1):
  drm/i915: Split off pci_driver.remove() tail to drm_driver.release()

 drivers/gpu/drm/i915/i915_drv.c | 17 +
 drivers/gpu/drm/i915/i915_drv.h |  1 +
 drivers/gpu/drm/i915/i915_gem.c | 10 +-
 3 files changed, 23 insertions(+), 5 deletions(-)

-- 
2.21.0


[Intel-gfx] [PATCH v2] drm/i915: Split off pci_driver.remove() tail to drm_driver.release()

2019-05-30 Thread Janusz Krzysztofik
In order to support driver hot unbind, some cleanup operations, now
performed on PCI driver remove, must be called later, after all device
file descriptors are closed.

Split out those operations from the tail of pci_driver.remove()
callback and put them into drm_driver.release() which is called as soon
as all references to the driver are put.  As a result, those cleanups
will be now run on last drm_dev_put(), either still called from
pci_driver.remove() if all device file descriptors are already closed,
or on last drm_release() file operation.

Signed-off-by: Janusz Krzysztofik 
Reviewed-by: Chris Wilson 
---
Changelog:
v1 -> v2:
- defer intel_engines_cleanup() as well. (Chris)

 drivers/gpu/drm/i915/i915_drv.c | 17 +
 drivers/gpu/drm/i915/i915_drv.h |  1 +
 drivers/gpu/drm/i915/i915_gem.c | 10 +-
 3 files changed, 23 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 83d2eb9e74cb..8be69f84eb6d 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -738,6 +738,7 @@ static int i915_load_modeset_init(struct drm_device *dev)
 
 cleanup_gem:
i915_gem_suspend(dev_priv);
+   i915_gem_fini_hw(dev_priv);
i915_gem_fini(dev_priv);
 cleanup_modeset:
intel_modeset_cleanup(dev);
@@ -1685,7 +1686,6 @@ static void i915_driver_cleanup_hw(struct drm_i915_private *dev_priv)
pci_disable_msi(pdev);
 
pm_qos_remove_request(&dev_priv->pm_qos);
-   i915_ggtt_cleanup_hw(dev_priv);
 }
 
 /**
@@ -1909,6 +1909,7 @@ int i915_driver_load(struct pci_dev *pdev, const struct pci_device_id *ent)
 
 out_cleanup_hw:
i915_driver_cleanup_hw(dev_priv);
+   i915_ggtt_cleanup_hw(dev_priv);
 out_cleanup_mmio:
i915_driver_cleanup_mmio(dev_priv);
 out_runtime_pm_put:
@@ -1960,21 +1961,29 @@ void i915_driver_unload(struct drm_device *dev)
cancel_delayed_work_sync(&dev_priv->gpu_error.hangcheck_work);
i915_reset_error_state(dev_priv);
 
-   i915_gem_fini(dev_priv);
+   i915_gem_fini_hw(dev_priv);
 
intel_power_domains_fini_hw(dev_priv);
 
i915_driver_cleanup_hw(dev_priv);
-   i915_driver_cleanup_mmio(dev_priv);
 
enable_rpm_wakeref_asserts(dev_priv);
-   intel_runtime_pm_cleanup(dev_priv);
 }
 
 static void i915_driver_release(struct drm_device *dev)
 {
struct drm_i915_private *dev_priv = to_i915(dev);
 
+   disable_rpm_wakeref_asserts(dev_priv);
+
+   i915_gem_fini(dev_priv);
+
+   i915_ggtt_cleanup_hw(dev_priv);
+   i915_driver_cleanup_mmio(dev_priv);
+
+   enable_rpm_wakeref_asserts(dev_priv);
+   intel_runtime_pm_cleanup(dev_priv);
+
i915_driver_cleanup_early(dev_priv);
i915_driver_destroy(dev_priv);
 }
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index a2664ea1395b..d08e7bd83544 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3047,6 +3047,7 @@ void i915_gem_init_mmio(struct drm_i915_private *i915);
 int __must_check i915_gem_init(struct drm_i915_private *dev_priv);
 int __must_check i915_gem_init_hw(struct drm_i915_private *dev_priv);
 void i915_gem_init_swizzling(struct drm_i915_private *dev_priv);
+void i915_gem_fini_hw(struct drm_i915_private *dev_priv);
 void i915_gem_fini(struct drm_i915_private *dev_priv);
 int i915_gem_wait_for_idle(struct drm_i915_private *dev_priv,
   unsigned int flags, long timeout);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 7cafd5612f71..20d3f7532cef 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4667,7 +4667,7 @@ int i915_gem_init(struct drm_i915_private *dev_priv)
return ret;
 }
 
-void i915_gem_fini(struct drm_i915_private *dev_priv)
+void i915_gem_fini_hw(struct drm_i915_private *dev_priv)
 {
GEM_BUG_ON(dev_priv->gt.awake);
 
@@ -4680,6 +4680,14 @@ void i915_gem_fini(struct drm_i915_private *dev_priv)
mutex_lock(&dev_priv->drm.struct_mutex);
intel_uc_fini_hw(dev_priv);
intel_uc_fini(dev_priv);
+   mutex_unlock(&dev_priv->drm.struct_mutex);
+
+   i915_gem_drain_freed_objects(dev_priv);
+}
+
+void i915_gem_fini(struct drm_i915_private *dev_priv)
+{
+   mutex_lock(&dev_priv->drm.struct_mutex);
intel_engines_cleanup(dev_priv);
i915_gem_contexts_fini(dev_priv);
i915_gem_fini_scratch(dev_priv);
-- 
2.21.0
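
Note: the split described in the commit message above can be pictured with a
toy reference count.  This is only a minimal sketch under stated assumptions:
struct toy_dev and the toy_*() functions are hypothetical names invented for
illustration, not i915 or DRM core code.  It shows hardware-facing cleanup
running at remove time while the remaining state is released on the last
reference put.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy device, purely illustrative: not the i915 or DRM data structures. */
struct toy_dev {
	int refs;		/* one for the bound driver, one per open file */
	bool hw_active;		/* state torn down at remove time */
	bool sw_state;		/* state torn down at release time */
};

static void toy_init(struct toy_dev *d)
{
	d->refs = 1;		/* reference held by the bound driver */
	d->hw_active = true;
	d->sw_state = true;
}

/* Plays the role of a release callback: runs once, on the last put. */
static void toy_release(struct toy_dev *d)
{
	d->sw_state = false;
}

static void toy_put(struct toy_dev *d)
{
	assert(d->refs > 0);
	if (--d->refs == 0)
		toy_release(d);
}

/* Each open file descriptor pins the device. */
static void toy_open(struct toy_dev *d)
{
	d->refs++;
}

static void toy_close(struct toy_dev *d)
{
	toy_put(d);
}

/* Plays the role of a remove callback: hardware goes away immediately. */
static void toy_remove(struct toy_dev *d)
{
	d->hw_active = false;
	toy_put(d);		/* drops the driver's own reference */
}
```

With one file still open, toy_remove() leaves sw_state alive and only the
later toy_close() triggers toy_release(); with no files open, toy_remove()
drops the last reference itself.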

[Intel-gfx] [PATCH i-g-t v3] lib/i915/gem_mman: Add a helper for obtaining mappable aperture version

2019-05-31 Thread Janusz Krzysztofik
From: Janusz Krzysztofik 

If a test calls a function which depends on availability of a
supported mappable aperture, an error may be reported by the kernel on
unsupported hardware.  That may negatively affect results reported by a
test framework even if that test ignores the failure and succeeds.

This helper wraps an IOCTL call which returns a version number of a
mappable aperture.  It may be used by tests which need to adjust their
scope depending on availability of specific version of mappable
aperture.

Signed-off-by: Janusz Krzysztofik 
Cc: Antonio Argenziano 
Cc: Michal Wajdeczko 
---
Changelog:
v2 (internal) -> v3:
- make the code less obscure, more explicit (Antonio),
- reword the helper documentation and commit message.

v1 (internal) -> v2 (internal):
- minimize future potential conflicts with 
  https://patchwork.freedesktop.org/patch/294053/?series=58551&rev=1
  (no progress with that one so not waiting for it any longer):
  - convert the helper to a drop-in replacement of the one from the
above mentioned patch, returning mappable aperture version, not
only information on its availability,
  - drop any other wrappers,
- document the helper,
- reword commit message.

 lib/i915/gem_mman.c | 22 ++
 lib/i915/gem_mman.h |  1 +
 2 files changed, 23 insertions(+)

diff --git a/lib/i915/gem_mman.c b/lib/i915/gem_mman.c
index 3cf9a6bb..3a3f3e5c 100644
--- a/lib/i915/gem_mman.c
+++ b/lib/i915/gem_mman.c
@@ -40,6 +40,28 @@
 #define VG(x) do {} while (0)
 #endif
 
+/**
+ * gem_mmap__gtt_version:
+ * @fd: open i915 drm file descriptor
+ *
+ * This function wraps up an IOCTL to obtain mappable aperture version.
+ *
+ * Returns: mappable aperture version, -1 on failure.
+ */
+int gem_mmap__gtt_version(int fd)
+{
+   int gtt_version, ret;
+   struct drm_i915_getparam gp = {
+   .param = I915_PARAM_MMAP_GTT_VERSION,
+   .value = &gtt_version,
+   };
+
+   ret = ioctl(fd, DRM_IOCTL_I915_GETPARAM, &gp);
+   if (ret == 0)
+   ret = gtt_version;
+   return ret;
+}
+
 /**
  * __gem_mmap__gtt:
  * @fd: open i915 drm file descriptor
diff --git a/lib/i915/gem_mman.h b/lib/i915/gem_mman.h
index f7242ed7..ab12e566 100644
--- a/lib/i915/gem_mman.h
+++ b/lib/i915/gem_mman.h
@@ -25,6 +25,7 @@
 #ifndef GEM_MMAN_H
 #define GEM_MMAN_H
 
+int gem_mmap__gtt_version(int fd);
 void *gem_mmap__gtt(int fd, uint32_t handle, uint64_t size, unsigned prot);
 void *gem_mmap__cpu(int fd, uint32_t handle, uint64_t offset, uint64_t size, unsigned prot);
 
-- 
2.21.0

Re: [Intel-gfx] [igt-dev] [PATCH i-g-t v3] lib/i915/gem_mman: Add a helper for obtaining mappable aperture version

2019-05-31 Thread Janusz Krzysztofik
Hi Chris,

On Friday, May 31, 2019 10:39:47 AM CEST Chris Wilson wrote:
> Quoting Janusz Krzysztofik (2019-05-31 09:33:38)
> > From: Janusz Krzysztofik 
> > 
> > If a test calls a function which depends on availability of a
> > supported mappable aperture, an error may be reported by the kernel on
> > unsupported hardware.  That may negatively affect results reported by a
> > test framework even if that test ignores the failure and succeeds.
> > 
> > This helper wraps an IOCTL call which returns a version number of a
> > mappable aperture.  It may be used by tests which need to adjust their
> > scope depending on availability of specific version of mappable
> > aperture.
> > 
> > Signed-off-by: Janusz Krzysztofik 
> > Cc: Antonio Argenziano 
> > Cc: Michal Wajdeczko 
> > ---
> > Changelog:
> > v2 (internal) -> v3:
> > - make the code less obscure, more explicit (Antonio),
> > - reword the helper documentation and commit message.
> > 
> > v1 (internal) -> v2 (internal):
> > - minimize future potential conflicts with 
> >   https://patchwork.freedesktop.org/patch/294053/?series=58551&rev=1
> >   (no progress with that one so not waiting for it any longer):
> >   - convert the helper to a drop-in replacement of the one from the
> > above mentioned patch, returning mappable aperture version, not
> > only information on its availability,
> >   - drop any other wrappers,
> > - document the helper,
> > - reword commit message.
> > 
> >  lib/i915/gem_mman.c | 22 ++
> >  lib/i915/gem_mman.h |  1 +
> >  2 files changed, 23 insertions(+)
> > 
> > diff --git a/lib/i915/gem_mman.c b/lib/i915/gem_mman.c
> > index 3cf9a6bb..3a3f3e5c 100644
> > --- a/lib/i915/gem_mman.c
> > +++ b/lib/i915/gem_mman.c
> > @@ -40,6 +40,28 @@
> >  #define VG(x) do {} while (0)
> >  #endif
> >  
> > +/**
> > + * gem_mmap__gtt_version:
> > + * @fd: open i915 drm file descriptor
> > + *
> > + * This function wraps up an IOCTL to obtain mappable aperture version.
> > + *
> > + * Returns: mappable aperture version, -1 on failure.
> > + */
> > +int gem_mmap__gtt_version(int fd)
> > +{
> > +   int gtt_version, ret;
> > +   struct drm_i915_getparam gp = {
> > +   .param = I915_PARAM_MMAP_GTT_VERSION,
> > +   .value = &gtt_version,
> > +   };
> > +
> > +   ret = ioctl(fd, DRM_IOCTL_I915_GETPARAM, &gp);
> > +   if (ret == 0)
> > +   ret = gtt_version;
> > +   return ret;
> 
> Maybe the actual error returned by the kernel and not glibc would be
> interesting in the future?

errno is not overwritten by the helper so it is available to IGT after it is 
called and actually reported when a call to the helper is wrapped with 
igt_require().  Do we need more?

Thanks,
Janusz

> -Chris
> 




Re: [Intel-gfx] [igt-dev] [PATCH i-g-t v3] lib/i915/gem_mman: Add a helper for obtaining mappable aperture version

2019-05-31 Thread Janusz Krzysztofik
On Friday, May 31, 2019 10:41:36 AM CEST Chris Wilson wrote:
> Quoting Janusz Krzysztofik (2019-05-31 09:33:38)
> > From: Janusz Krzysztofik 
> 
> This is nothing to do with the mappable aperture version. This is the
> nee MMAP_GTT interface version.
> -Chris
> 
Sorry for my ignorance, I'll reword it.

Thanks,
Janusz



Re: [Intel-gfx] [igt-dev] [PATCH i-g-t v3] lib/i915/gem_mman: Add a helper for obtaining mappable aperture version

2019-05-31 Thread Janusz Krzysztofik
On Friday, May 31, 2019 10:55:46 AM CEST Chris Wilson wrote:
> Quoting Janusz Krzysztofik (2019-05-31 09:53:41)
> > Hi Chris,
> > 
> > On Friday, May 31, 2019 10:39:47 AM CEST Chris Wilson wrote:
> > > Quoting Janusz Krzysztofik (2019-05-31 09:33:38)
> > > > From: Janusz Krzysztofik 
> > > > 
> > > > If a test calls a function which depends on availability of a
> > > > supported mappable aperture, an error may be reported by the kernel on
> > > > unsupported hardware.  That may negatively affect results reported by a
> > > > test framework even if that test ignores the failure and succeeds.
> > > > 
> > > > This helper wraps an IOCTL call which returns a version number of a
> > > > mappable aperture.  It may be used by tests which need to adjust their
> > > > scope depending on availability of specific version of mappable
> > > > aperture.
> > > > 
> > > > Signed-off-by: Janusz Krzysztofik 
> > > > Cc: Antonio Argenziano 
> > > > Cc: Michal Wajdeczko 
> > > > ---
> > > > Changelog:
> > > > v2 (internal) -> v3:
> > > > - make the code less obscure, more explicit (Antonio),
> > > > - reword the helper documentation and commit message.
> > > > 
> > > > v1 (internal) -> v2 (internal):
> > > > - minimize future potential conflicts with 
> > > >   https://patchwork.freedesktop.org/patch/294053/?series=58551&rev=1
> > > >   (no progress with that one so not waiting for it any longer):
> > > >   - convert the helper to a drop-in replacement of the one from the
> > > > above mentioned patch, returning mappable aperture version, not
> > > > only information on its availability,
> > > >   - drop any other wrappers,
> > > > - document the helper,
> > > > - reword commit message.
> > > > 
> > > >  lib/i915/gem_mman.c | 22 ++
> > > >  lib/i915/gem_mman.h |  1 +
> > > >  2 files changed, 23 insertions(+)
> > > > 
> > > > diff --git a/lib/i915/gem_mman.c b/lib/i915/gem_mman.c
> > > > index 3cf9a6bb..3a3f3e5c 100644
> > > > --- a/lib/i915/gem_mman.c
> > > > +++ b/lib/i915/gem_mman.c
> > > > @@ -40,6 +40,28 @@
> > > >  #define VG(x) do {} while (0)
> > > >  #endif
> > > >  
> > > > +/**
> > > > + * gem_mmap__gtt_version:
> > > > + * @fd: open i915 drm file descriptor
> > > > + *
> > > > + * This function wraps up an IOCTL to obtain mappable aperture version.
> > > > + *
> > > > + * Returns: mappable aperture version, -1 on failure.
> > > > + */
> > > > +int gem_mmap__gtt_version(int fd)
> > > > +{
> > > > +   int gtt_version, ret;
> > > > +   struct drm_i915_getparam gp = {
> > > > +   .param = I915_PARAM_MMAP_GTT_VERSION,
> > > > +   .value = &gtt_version,
> > > > +   };
> > > > +
> > > > +   ret = ioctl(fd, DRM_IOCTL_I915_GETPARAM, &gp);
> > > > +   if (ret == 0)
> > > > +   ret = gtt_version;
> > > > +   return ret;
> > > 
> > > Maybe the actual error returned by the kernel and not glibc would be
> > > interesting in the future?
> > 
> > errno is not overwritten by the helper so it is available to IGT after it is
> > called and actually reported when a call to the helper is wrapped with
> > igt_require().  Do we need more?
> 
> Yes, we typically return the error and do not use errno. Imagine if we
> just replaced ioctl() with syscall() :)

OK. I'll fix it.

Thanks,
Janusz

> -Chris
> 




[Intel-gfx] [PATCH i-g-t v4] lib/i915/gem_mman: Add a helper for obtaining MMAP_GTT interface version

2019-05-31 Thread Janusz Krzysztofik
From: Janusz Krzysztofik 

If a test calls a function which depends on availability of a specific
version of MMAP_GTT interface, an error may occur on unsupported hardware.
That may negatively affect results reported by a test framework even if
that test ignores the failure and succeeds.

This helper wraps up an IOCTL call which returns a version number of
MMAP_GTT interface.  It may be used by tests which should adjust their
scope depending on availability of a specific version of MMAP_GTT
interface.

Signed-off-by: Janusz Krzysztofik 
Cc: Antonio Argenziano 
Cc: Michal Wajdeczko 
---
Changelog:
v3 -> v4:
- return errno value on failure (Chris - thanks!),
- clear errno before return, as other helpers do,
- reword the helper documentation and commit message again (Chris -
  thanks!).

v2 (internal) -> v3:
- make the code less obscure, more explicit (Antonio - thanks!),
- reword the helper documentation and commit message.

v1 (internal) -> v2 (internal):
- minimize future potential conflicts with 
  https://patchwork.freedesktop.org/patch/294053/?series=58551&rev=1
  (no progress with that one so not waiting for it any longer):
  - convert the helper to a drop-in replacement of the one from the
above mentioned patch, returning mappable aperture version, not
only information on its availability,
  - drop any other wrappers,
- document the helper,
- reword commit message.

 lib/i915/gem_mman.c | 25 +
 lib/i915/gem_mman.h |  1 +
 2 files changed, 26 insertions(+)

diff --git a/lib/i915/gem_mman.c b/lib/i915/gem_mman.c
index 3cf9a6bb..2c3d6971 100644
--- a/lib/i915/gem_mman.c
+++ b/lib/i915/gem_mman.c
@@ -40,6 +40,31 @@
 #define VG(x) do {} while (0)
 #endif
 
+/**
+ * gem_mmap__gtt_version:
+ * @fd: open i915 drm file descriptor
+ *
+ * This function wraps up an IOCTL to obtain MMAP_GTT interface version.
+ *
+ * Returns: MMAP_GTT interface version, kernel error code on failure.
+ */
+int gem_mmap__gtt_version(int fd)
+{
+   int gtt_version, ret;
+   struct drm_i915_getparam gp = {
+   .param = I915_PARAM_MMAP_GTT_VERSION,
+   .value = &gtt_version,
+   };
+
+   if (ioctl(fd, DRM_IOCTL_I915_GETPARAM, &gp))
+   ret = errno;
+   else
+   ret = gtt_version;
+
+   errno = 0;
+   return ret;
+}
+
 /**
  * __gem_mmap__gtt:
  * @fd: open i915 drm file descriptor
diff --git a/lib/i915/gem_mman.h b/lib/i915/gem_mman.h
index f7242ed7..ab12e566 100644
--- a/lib/i915/gem_mman.h
+++ b/lib/i915/gem_mman.h
@@ -25,6 +25,7 @@
 #ifndef GEM_MMAN_H
 #define GEM_MMAN_H
 
+int gem_mmap__gtt_version(int fd);
 void *gem_mmap__gtt(int fd, uint32_t handle, uint64_t size, unsigned prot);
 void *gem_mmap__cpu(int fd, uint32_t handle, uint64_t offset, uint64_t size, unsigned prot);
 
-- 
2.21.0

Re: [Intel-gfx] [PATCH i-g-t v4] lib/i915/gem_mman: Add a helper for obtaining MMAP_GTT interface version

2019-05-31 Thread Janusz Krzysztofik
On Friday, May 31, 2019 11:35:39 AM CEST Chris Wilson wrote:
> Quoting Janusz Krzysztofik (2019-05-31 10:29:16)
> > From: Janusz Krzysztofik 
> > 
> > If a test calls a function which depends on availability of a specific
> > version of MMAP_GTT interface, an error may occur on unsupported hardware.
> > That may negatively affect results reported by a test framework even if
> > that test ignores the failure and succeeds.
> > 
> > This helper wraps up an IOCTL call which returns a version number of
> > MMAP_GTT interface.  It may be used by tests which should adjust their
> > scope depending on availability of a specific version of MMAP_GTT
> > interface.
> > 
> > Signed-off-by: Janusz Krzysztofik 
> > Cc: Antonio Argenziano 
> > Cc: Michal Wajdeczko 
> > ---
> > Changelog:
> > v3 -> v4:
> > - return errno value on failure (Chris - thanks!),
> > - clear errno before return, as other helpers do,
> > - reword the helper documentation and commit message again (Chris -
> >   thanks!).
> > 
> > v2 (internal) -> v3:
> > - make the code less obscure, more explicit (Antonio - thanks!),
> > - reword the helper documentation and commit message.
> > 
> > v1 (internal) -> v2 (internal):
> > - minimize future potential conflicts with 
> >   https://patchwork.freedesktop.org/patch/294053/?series=58551&rev=1
> >   (no progress with that one so not waiting for it any longer):
> >   - convert the helper to a drop-in replacement of the one from the
> > above mentioned patch, returning mappable aperture version, not
> > only information on its availability,
> >   - drop any other wrappers,
> > - document the helper,
> > - reword commit message.
> > 
> >  lib/i915/gem_mman.c | 25 +
> >  lib/i915/gem_mman.h |  1 +
> >  2 files changed, 26 insertions(+)
> > 
> > diff --git a/lib/i915/gem_mman.c b/lib/i915/gem_mman.c
> > index 3cf9a6bb..2c3d6971 100644
> > --- a/lib/i915/gem_mman.c
> > +++ b/lib/i915/gem_mman.c
> > @@ -40,6 +40,31 @@
> >  #define VG(x) do {} while (0)
> >  #endif
> >  
> > +/**
> > + * gem_mmap__gtt_version:
> > + * @fd: open i915 drm file descriptor
> > + *
> > + * This function wraps up an IOCTL to obtain MMAP_GTT interface version.
> > + *
> > + * Returns: MMAP_GTT interface version, kernel error code on failure.
> > + */
> > +int gem_mmap__gtt_version(int fd)
> > +{
> > +   int gtt_version, ret;
> > +   struct drm_i915_getparam gp = {
> > +   .param = I915_PARAM_MMAP_GTT_VERSION,
> > +   .value = &gtt_version,
> > +   };
> > +
> > +   if (ioctl(fd, DRM_IOCTL_I915_GETPARAM, &gp))
> > +   ret = errno;
> 
> ret = -errno; :)

Sorry.

> Petri also likes it when we then say igt_assume(ret);
> 
> Or one could use
> 
> {
>   int result = -EIO;
>   struct ... gp = {
>   .param = I915_PARAM_MMAP_GTT_VERSION,
>   .value = &result,
>   };
> 
>   if (ioctl(fd, DRM_IOCTL_I915_GETPARAM, &gp)) {
>   result = -errno;
>   igt_assume(result);

OK, I'll learn what igt_assume() is first then use it.

Thanks,
Janusz

>   }
> 
>   errno = 0;
>   return result;
> }
> 
> Now just put it to use somewhere.
> -Chris
> 




[Intel-gfx] [PATCH i-g-t v5] lib/i915/gem_mman: Add a helper for obtaining MMAP_GTT interface version

2019-05-31 Thread Janusz Krzysztofik
From: Janusz Krzysztofik 

If a test calls a function which depends on availability of a specific
version of MMAP_GTT interface, an error may occur on unsupported hardware.
That may negatively affect results reported by a test framework even if
that test ignores the failure and succeeds.

This helper wraps up an IOCTL call which returns a version number of
MMAP_GTT interface.  It may be used by tests which should adjust their
scope depending on availability of a specific version of MMAP_GTT
interface.

Signed-off-by: Janusz Krzysztofik 
Cc: Antonio Argenziano 
Cc: Michal Wajdeczko 
---
Changelog:
v4 -> v5:
- change sign of errno before it is returned (Chris - thanks!),
- validate -errno with igt_assume() (Chris - thanks!),
- follow coding style suggested by Chris - thanks!
To be honest, I think Chris should be somehow officially credited in
the commit tags for his contributions but I'm not sure how. Would a
Suggested-by: clause be OK, or Co-developed-by: maybe?

v3 -> v4:
- return errno value on failure (Chris - thanks!),
- clear errno before return, as other helpers do,
- reword the helper documentation and commit message again (Chris -
  thanks!).

v2 (internal) -> v3:
- make the code less obscure, more explicit (Antonio - thanks!),
- reword the helper documentation and commit message.

v1 (internal) -> v2 (internal):
- minimize future potential conflicts with 
  https://patchwork.freedesktop.org/patch/294053/?series=58551&rev=1
  (no progress with that one so not waiting for it any longer):
  - convert the helper to a drop-in replacement of the one from the
above mentioned patch, returning mappable aperture version, not
only information on its availability,
  - drop any other wrappers,
- document the helper,
- reword commit message.

 lib/i915/gem_mman.c | 25 +
 lib/i915/gem_mman.h |  1 +
 2 files changed, 26 insertions(+)

diff --git a/lib/i915/gem_mman.c b/lib/i915/gem_mman.c
index 3cf9a6bb..27c437da 100644
--- a/lib/i915/gem_mman.c
+++ b/lib/i915/gem_mman.c
@@ -40,6 +40,31 @@
 #define VG(x) do {} while (0)
 #endif
 
+/**
+ * gem_mmap__gtt_version:
+ * @fd: open i915 drm file descriptor
+ *
+ * This function wraps up an IOCTL to obtain MMAP_GTT interface version.
+ *
+ * Returns: MMAP_GTT interface version, negative kernel error code on failure.
+ */
+int gem_mmap__gtt_version(int fd)
+{
+   int result = -EIO;
+   struct drm_i915_getparam gp = {
+   .param = I915_PARAM_MMAP_GTT_VERSION,
+   .value = &result,
+   };
+
+   if (ioctl(fd, DRM_IOCTL_I915_GETPARAM, &gp)) {
+   result = -errno;
+   igt_assume(result);
+   }
+
+   errno = 0;
+   return result;
+}
+
 /**
  * __gem_mmap__gtt:
  * @fd: open i915 drm file descriptor
diff --git a/lib/i915/gem_mman.h b/lib/i915/gem_mman.h
index f7242ed7..ab12e566 100644
--- a/lib/i915/gem_mman.h
+++ b/lib/i915/gem_mman.h
@@ -25,6 +25,7 @@
 #ifndef GEM_MMAN_H
 #define GEM_MMAN_H
 
+int gem_mmap__gtt_version(int fd);
 void *gem_mmap__gtt(int fd, uint32_t handle, uint64_t size, unsigned prot);
 void *gem_mmap__cpu(int fd, uint32_t handle, uint64_t offset, uint64_t size, unsigned prot);
 
-- 
2.21.0
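
Note: the return convention the helper settled on in v5 (a non-negative value
on success, -errno on failure, errno cleared before returning) can be tried
outside IGT.  This is a minimal sketch, assuming a Linux-like libc:
query_int() is a hypothetical name, the caller supplies the request code, the
request is assumed to write an int through its argument, and a plain assert()
stands in for igt_assume().

```c
#include <assert.h>
#include <errno.h>
#include <sys/ioctl.h>

/*
 * Hypothetical wrapper for an ioctl that stores an int through its argument.
 * Returns the queried value on success or a negative errno value on failure.
 */
static int query_int(int fd, unsigned long request)
{
	int result = -EIO;	/* reported if the call succeeds without writing */

	if (ioctl(fd, request, &result)) {
		result = -errno;	/* kernel error, negated as in the v5 helper */
		assert(result != 0);	/* stand-in for igt_assume(result) */
	}

	errno = 0;		/* do not leak a stale errno to the caller */
	return result;
}
```

A caller can then tell the two cases apart from the return value alone, for
example by treating any negative result as "interface not available", without
ever consulting errno.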

Re: [Intel-gfx] [RFC PATCH 1/1] drm/i915: Split off pci_driver.remove() tail to drm_driver.release()

2019-06-03 Thread Janusz Krzysztofik
On Monday, June 3, 2019 9:28:18 AM CEST Daniel Vetter wrote:
> On Thu, May 30, 2019 at 10:40:09AM +0100, Chris Wilson wrote:
> > Quoting Janusz Krzysztofik (2019-05-30 10:24:26)
> > > In order to support driver hot unbind, some cleanup operations, now
> > > performed on PCI driver remove, must be called later, after all device
> > > file descriptors are closed.
> > > 
> > > Split out those operations from the tail of pci_driver.remove()
> > > callback and put them into drm_driver.release() which is called as soon
> > > as all references to the driver are put.  As a result, those cleanups
> > > will be now run on last drm_dev_put(), either still called from
> > > pci_driver.remove() if all device file descriptors are already closed,
> > > or on last drm_release() file operation.
> > > 
> > > Signed-off-by: Janusz Krzysztofik 
> > > ---
> > >  drivers/gpu/drm/i915/i915_drv.c | 17 +
> > >  drivers/gpu/drm/i915/i915_drv.h |  1 +
> > >  drivers/gpu/drm/i915/i915_gem.c | 10 +-
> > >  3 files changed, 23 insertions(+), 5 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> > > index 83d2eb9e74cb..8be69f84eb6d 100644
> > > --- a/drivers/gpu/drm/i915/i915_drv.c
> > > +++ b/drivers/gpu/drm/i915/i915_drv.c
> > > @@ -738,6 +738,7 @@ static int i915_load_modeset_init(struct drm_device *dev)
> > >  
> > >  cleanup_gem:
> > > i915_gem_suspend(dev_priv);
> > > +   i915_gem_fini_hw(dev_priv);
> > > i915_gem_fini(dev_priv);
> > >  cleanup_modeset:
> > > intel_modeset_cleanup(dev);
> > > @@ -1685,7 +1686,6 @@ static void i915_driver_cleanup_hw(struct drm_i915_private *dev_priv)
> > > pci_disable_msi(pdev);
> > >  
> > > pm_qos_remove_request(&dev_priv->pm_qos);
> > > -   i915_ggtt_cleanup_hw(dev_priv);
> > >  }
> > >  
> > >  /**
> > > @@ -1909,6 +1909,7 @@ int i915_driver_load(struct pci_dev *pdev, const struct pci_device_id *ent)
> > 
> > Would it make sense to rename load/unload from the legacy drm stubs over
> > to match the pci entry points?
> 
> +1 on that rename, load/unload is really terribly confusing and has
> horrible semantics in the dri1 shadow attach world ...
> -Daniel

I've not responded to that comment, sorry, but I agree too.  I've assumed 
that's a candidate for a separate patch or series.  I'm willing to work on 
that as time permits.

Thanks,
Janusz

> > 
> > >  out_cleanup_hw:
> > > i915_driver_cleanup_hw(dev_priv);
> > > +   i915_ggtt_cleanup_hw(dev_priv);
> > >  out_cleanup_mmio:
> > > i915_driver_cleanup_mmio(dev_priv);
> > >  out_runtime_pm_put:
> > > @@ -1960,21 +1961,29 @@ void i915_driver_unload(struct drm_device *dev)
> > > cancel_delayed_work_sync(&dev_priv->gpu_error.hangcheck_work);
> > > i915_reset_error_state(dev_priv);
> > >  
> > > -   i915_gem_fini(dev_priv);
> > > +   i915_gem_fini_hw(dev_priv);
> > >  
> > > intel_power_domains_fini_hw(dev_priv);
> > >  
> > > i915_driver_cleanup_hw(dev_priv);
> > > -   i915_driver_cleanup_mmio(dev_priv);
> > >  
> > > enable_rpm_wakeref_asserts(dev_priv);
> > > -   intel_runtime_pm_cleanup(dev_priv);
> > >  }
> > >  
> > >  static void i915_driver_release(struct drm_device *dev)
> > >  {
> > > struct drm_i915_private *dev_priv = to_i915(dev);
> > >  
> > > +   disable_rpm_wakeref_asserts(dev_priv);
> > > +
> > > +   i915_gem_fini(dev_priv);
> > > +
> > > +   i915_ggtt_cleanup_hw(dev_priv);
> > > +   i915_driver_cleanup_mmio(dev_priv);
> > > +
> > > +   enable_rpm_wakeref_asserts(dev_priv);
> > > +   intel_runtime_pm_cleanup(dev_priv);
> > 
> > We should really propagate the release nomenclature down and replace our
> > mixed fini/cleanup. Consistency is helpful when trying to work out which
> > phase the code is in.
> > 
> > > i915_driver_cleanup_early(dev_priv);
> > > i915_driver_destroy(dev_priv);
> > >  }
> > > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > > index a2664ea1395b..d08e7bd83544 100644
> > > --- a/drivers/gpu/drm/i915/i915_d

[Intel-gfx] [PATCH i-g-t v11 1/1] tests: Add a new test for device hot unplug

2019-06-07 Thread Janusz Krzysztofik
From: Janusz Krzysztofik 

There is a test which verifies unloading of the i915 driver module but no
test exists that checks how a driver behaves when it gets unbound from a
device or when the device gets unplugged.  Provide such a test using the
sysfs interface.

Two minimalistic subtests - "unbind-rebind" and "unplug-rescan" - perform
the desired operations on a DRM device which is believed not to be in use.

A subtest named "drm_open-hotunplug" unplugs a DRM device while keeping
a file descriptor open.

Changelog:
v2:
- run a subprocess with dummy_load instead of external command
  (Antonio).

v3:
- run dummy_load from the test process directly (Antonio).

v4:
- run dummy_load from inside subtests (Antonio).

v5:
- try to restore the device to a working state after each subtest
  (Petri, Daniel).

v6:
- run workload inside an igt helper subprocess so resources consumed
  by the workload are cleaned up automatically on workload subprocess
  crash, without affecting test results,
- move the igt helper with workload back from subtests to initial
  fixture so workload crash also does not affect test results,
- other cleanups suggested by Katarzyna and Chris.

v7:
- no changes.

v8:
- move workload functions back from fixture to subtests,
- register different actions and different workloads in respective
  tables and iterate over those tables while enumerating subtests,
- introduce new subtest flavors by simply omitting module unload step,
- instead of simply requesting bus rescan or not, introduce action
  specific device recovery helpers, required specifically with those
  new subtests not touching the module,
- split workload functions in two parts, one spawning the workload,
  the other waiting for its completion,
- for the new subtests not requiring module unload, run workload
  functions directly from the test process and use new workload
  completion wait functions in place of subprocess completion wait,
- take more control over logging, longjumps and exit codes in
  workload subprocesses,
- add some debug messages for easy progress watching,
- move function API descriptions on top of respective typedefs.

v9:
All changes after Daniel's comments - thanks!
- flatten the code, don't try to create a midlayer (Daniel),
- provide minimal subtests that even don't keep device open (Daniel),
- don't use driver unbind in more advanced subtests (Daniel),
- provide subtests with different level of resources allocated
  during device unplug (Daniel),
- provide subtests which check driver behavior after device hot
  unplug (Daniel).

v10:
- rename variables and function arguments to something that indicates
  they're file descriptors (Daniel),
- introduce a data structure that contains various file descriptors
  and a helper function to set them all (Daniel),
- fix strange indenting (Daniel),
- limit scope to first three subtests as the first set of tests to
  merge (Daniel).

v11:
- fix typos in some comments,
- use SPDX license identifier,
- include a per-patch changelog in the commit message (Daniel).

Cc: Antonio Argenziano 
Cc: Petri Latvala 
Cc: Daniel Vetter 
Cc: Katarzyna Dec 
Cc: Chris Wilson 
Cc: Michał Wajdeczko 
Signed-off-by: Janusz Krzysztofik 
---
 tests/Makefile.sources |   1 +
 tests/core_hotunplug.c | 222 +
 tests/meson.build  |   1 +
 3 files changed, 224 insertions(+)
 create mode 100644 tests/core_hotunplug.c

diff --git a/tests/Makefile.sources b/tests/Makefile.sources
index 027ed82f..3f24265f 100644
--- a/tests/Makefile.sources
+++ b/tests/Makefile.sources
@@ -17,6 +17,7 @@ TESTS_progs = \
core_getclient \
core_getstats \
core_getversion \
+   core_hotunplug \
core_setmaster_vs_auth \
debugfs_test \
drm_import_export \
diff --git a/tests/core_hotunplug.c b/tests/core_hotunplug.c
new file mode 100644
index ..d36a0572
--- /dev/null
+++ b/tests/core_hotunplug.c
@@ -0,0 +1,222 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright © 2019 Intel Corporation
+ */
+
+#include "igt.h"
+#include "igt_device.h"
+#include "igt_dummyload.h"
+#include "igt_kmod.h"
+#include "igt_sysfs.h"
+
+#include 
+#include 
+#include 
+
+struct hotunplug {
+   int chipset;
+   struct {
+   int drm;
+   int sysfs_dev;
+   int sysfs_bus;
+   } fd;
+};
+
+/* Helpers */
+
+static void prepare(struct hotunplug *priv)
+{
+   /* open the driver */
+   priv->fd.drm = __drm_open_driver(priv->chipset);
+   igt_assert(priv->fd.drm >= 0);
+
+   /* prepare for device unplug */
+   priv->fd.sysfs_dev = igt_sysfs_open(priv->fd.drm);
+   igt_assert(priv->fd.sysfs_dev >= 0);
+
+   /* prepare for bus rescan */
+   priv->fd.sysfs_bus = openat(priv->fd.sysfs_dev, "device/subsystem",
+   O_DIREC

[Intel-gfx] [PATCH i-g-t v11 0/1] tests: Add a new test for device hot unplug

2019-06-07 Thread Janusz Krzysztofik
The test should help resolve driver bugs which show up when a device
is unplugged, or a driver is unbound from a device, while the device is
busy (unlike a simple module unload, which requires device references
to be put first).

A kernel patch resolving kernel panics on driver hot unbind [1] was
verified on trybot with v10 of this test before it was submitted
upstream.  The current version (v11) has also been tested on trybot
with the kernel patch already included upstream.  Hence, no kernel
panics are expected, but some kernel WARNs and driver error messages
may still need to be resolved before CI is happy with this new test.

[1] 
https://cgit.freedesktop.org/drm/drm-tip/commit/?id=47bc28d7ee6d8378ba4451c43885cb3241302243

Janusz Krzysztofik (1):
  tests: Add a new test for device hot unplug

 tests/Makefile.sources |   1 +
 tests/core_hotunplug.c | 222 +
 tests/meson.build  |   1 +
 3 files changed, 224 insertions(+)
 create mode 100644 tests/core_hotunplug.c

Changelog:
v10->v11:
- fix typos in some comments,
- use SPDX license identifier,
- include a per-patch changelog in the commit message (Daniel).

v9->v10 (submitted only to trybot):
- rename variables and function arguments to something that indicates
  they're file descriptors (Daniel),
- introduce a data structure that contains various file descriptors
  and a helper function to set them all (Daniel),
- fix strange indenting (Daniel),
- limit scope to first three subtests as the first set of tests to
  merge (Daniel).

v8->v9:
All changes after Daniel's comments - thanks!
- flatten the code, don't try to create a midlayer,
- provide minimal subtests that don't even keep the device open,
- don't use driver unbind in more advanced subtests,
- provide subtests with different levels of resources allocated
  during device unplug,
- provide subtests which check driver behavior after device hot
  unplug.

v7->v8:
- move workload functions back from fixture to subtests,
- register different actions and different workloads in respective
  tables and iterate over those tables while enumerating subtests,
- introduce new subtest flavors by simply omitting the module unload step,
- instead of simply requesting bus rescan or not, introduce action
  specific device recovery helpers, required specifically with those
  new subtests not touching the module,
- split workload functions in two parts, one spawning the workload,
  the other waiting for its completion,
- for the new subtests not requiring module unload, run workload
  functions directly from the test process and use new workload
  completion wait functions in place of subprocess completion wait,
- take more control over logging, longjumps and exit codes in
  workload subprocesses,
- add some debug messages for easy progress watching,
- move function API descriptions on top of respective typedefs,
- drop patch 2/2 with external workload command again, still nobody
  likes it.

v6->v7:
- add missing igt_exit() needed with the second patch.

v5->v6 (third public submission, incorrectly marked as v5, sorry):
- run workload inside an igt helper subprocess so resources consumed
  by the workload are cleaned up automatically on workload subprocess
  crash, without affecting test results,
- move the igt helper with workload back from subtests to initial
  fixture so workload crash also does not affect test results,
- re-add the second patch which extends the test with an option for
  using an external command as a workload,
- other cleanups suggested by Kasia and Chris.

v4->v5 (second public submission, marked as v2):
- try to restore the device to a working state after each subtest
  (Petri, Daniel).

v3->v4 (first public submission, not marked with any version number):
- run dummy_load from inside subtests (Antonio).

v2->v3 (internal submission):
- run dummy_load from the test process directly (Antonio),
- drop the patch for running external workload (Antonio).

v1->v2 (internal submission):
- run a subprocess with dummy_load instead of external command
  (Antonio),
- keep use of external workload command as an option, move that to a
  separate patch.

-- 
2.21.0

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH i-g-t v11 1/1] tests: Add a new test for device hot unplug

2019-06-10 Thread Janusz Krzysztofik
On Monday, June 10, 2019 8:49:38 AM CEST Petri Latvala wrote:
> On Fri, Jun 07, 2019 at 01:51:42PM +0200, Janusz Krzysztofik wrote:
> > - use SPDX license identifier,
> 
> 
> Why? We don't use those in IGT.

I must have picked up the idea of changing it somewhere, but unfortunately
I can't recall where, sorry.  I'll revert it.

> > diff --git a/tests/core_hotunplug.c b/tests/core_hotunplug.c
> > new file mode 100644
> > index ..d36a0572
> > --- /dev/null
> > +++ b/tests/core_hotunplug.c
> > @@ -0,0 +1,222 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * Copyright © 2019 Intel Corporation
> > + */
> 
> And why GPL-2.0?

From the same source as the idea of SPDX, I guess. I'll fix it to be in line 
with IGT standards.

Thanks,
Janusz




[Intel-gfx] [PATCH] drm/i915: Fix reporting of size of created GEM object

2019-07-08 Thread Janusz Krzysztofik
Commit e163484afa8d ("drm/i915: Update size upon return from
GEM_CREATE") (re)introduced reporting of the actual size of created GEM
objects, possibly rounded up to the object alignment.  Unfortunately, its
implementation resulted in a possible use-after-free bug.  The bug was
fixed by commit 929eec99f5fd ("drm/i915: Avoid use-after-free in
reporting create.size") at the cost of a possibly incorrect value being
reported as the actual object size.

Safely restore correct reporting by capturing actual size of created
GEM object before a reference to the object is put.

Fixes: 929eec99f5fd ("drm/i915: Avoid use-after-free in reporting create.size")
Signed-off-by: Janusz Krzysztofik 
---
 drivers/gpu/drm/i915/i915_gem.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 7ade42b8ec99..16bae5870d6f 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -171,6 +171,7 @@ i915_gem_create(struct drm_file *file,
obj = i915_gem_object_create_shmem(dev_priv, size);
if (IS_ERR(obj))
return PTR_ERR(obj);
+   size = obj->base.size;
 
ret = drm_gem_handle_create(file, &obj->base, &handle);
/* drop reference from allocate - handle holds it now */
-- 
2.21.0
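
Illustrative note, not part of the patch: the fix above boils down to
copying a field out of a refcounted object while a reference is still
held, and using only that copy afterwards.  A minimal userspace sketch
of this pattern follows.  All names in it (mock_obj, mock_obj_create(),
mock_obj_put(), mock_create(), MOCK_PAGE_SIZE) are hypothetical
stand-ins rather than i915 or DRM API, and the 4096-byte alignment is
an assumption made for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a refcounted GEM object; not the i915 type. */
struct mock_obj {
	unsigned int refcount;
	uint64_t size;		/* actual size, possibly rounded up */
};

#define MOCK_PAGE_SIZE 4096ull	/* assumed alignment for this sketch */

static struct mock_obj *mock_obj_create(uint64_t requested)
{
	struct mock_obj *obj = malloc(sizeof(*obj));

	if (!obj)
		return NULL;

	obj->refcount = 1;
	/* Round the requested size up to the assumed object alignment. */
	obj->size = (requested + MOCK_PAGE_SIZE - 1) & ~(MOCK_PAGE_SIZE - 1);
	return obj;
}

static void mock_obj_put(struct mock_obj *obj)
{
	if (--obj->refcount == 0)
		free(obj);
}

/* Returns 0 and stores the actual size in *size_out, or -1 on failure. */
static int mock_create(uint64_t requested, uint64_t *size_out)
{
	struct mock_obj *obj;
	uint64_t size;

	assert(size_out != NULL);

	obj = mock_obj_create(requested);
	if (!obj)
		return -1;

	size = obj->size;	/* capture while our reference is still held */
	mock_obj_put(obj);	/* obj must not be dereferenced after this */

	*size_out = size;	/* uses the captured copy, never obj itself */
	return 0;
}
```

In this sketch the put frees the object immediately, which makes the
hazard obvious; reading obj->size after mock_obj_put() would be a
use-after-free, while reporting the requested size instead would lose
the rounding.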


[Intel-gfx] [RFC PATCH 1/6] drm/i915: Rename "_load"/"_unload" to match PCI entry points

2019-07-10 Thread Janusz Krzysztofik
Current names of i915_driver_load/unload() functions originate in
legacy DRM stubs.  Reduce nomenclature ambiguity by renaming them to
match their current use as helpers called from PCI entry points.

Suggested-by: Chris Wilson 
Signed-off-by: Janusz Krzysztofik 
---
 drivers/gpu/drm/i915/i915_drv.c | 8 
 drivers/gpu/drm/i915/i915_drv.h | 4 ++--
 drivers/gpu/drm/i915/i915_pci.c | 4 ++--
 3 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 12182d2fc03c..8b72ae7c1f5d 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -1870,17 +1870,17 @@ static void i915_driver_destroy(struct drm_i915_private *i915)
 }
 
 /**
- * i915_driver_load - setup chip and create an initial config
+ * i915_driver_probe - setup chip and create an initial config
  * @pdev: PCI device
  * @ent: matching PCI ID entry
  *
- * The driver load routine has to do several things:
+ * The driver probe routine has to do several things:
  *   - drive output discovery via intel_modeset_init()
  *   - initialize the memory manager
  *   - allocate initial config memory
  *   - setup the DRM framebuffer with the allocated memory
  */
-int i915_driver_load(struct pci_dev *pdev, const struct pci_device_id *ent)
+int i915_driver_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 {
const struct intel_device_info *match_info =
(struct intel_device_info *)ent->driver_data;
@@ -1946,7 +1946,7 @@ int i915_driver_load(struct pci_dev *pdev, const struct pci_device_id *ent)
return ret;
 }
 
-void i915_driver_unload(struct drm_device *dev)
+void i915_driver_remove(struct drm_device *dev)
 {
struct drm_i915_private *dev_priv = to_i915(dev);
struct pci_dev *pdev = dev_priv->drm.pdev;
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index a9381e404fd5..ebb4c09f8817 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2395,9 +2395,9 @@ extern long i915_compat_ioctl(struct file *filp, unsigned int cmd,
 #endif
 extern const struct dev_pm_ops i915_pm_ops;
 
-extern int i915_driver_load(struct pci_dev *pdev,
+extern int i915_driver_probe(struct pci_dev *pdev,
const struct pci_device_id *ent);
-extern void i915_driver_unload(struct drm_device *dev);
+extern void i915_driver_remove(struct drm_device *dev);
 
 extern void intel_engine_init_hangcheck(struct intel_engine_cs *engine);
 extern void intel_hangcheck_init(struct drm_i915_private *dev_priv);
diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
index 94b588e0a1dd..786ca7b3439b 100644
--- a/drivers/gpu/drm/i915/i915_pci.c
+++ b/drivers/gpu/drm/i915/i915_pci.c
@@ -848,7 +848,7 @@ static void i915_pci_remove(struct pci_dev *pdev)
if (!dev) /* driver load aborted, nothing to cleanup */
return;
 
-   i915_driver_unload(dev);
+   i915_driver_remove(dev);
drm_dev_put(dev);
 
pci_set_drvdata(pdev, NULL);
@@ -923,7 +923,7 @@ static int i915_pci_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
if (vga_switcheroo_client_probe_defer(pdev))
return -EPROBE_DEFER;
 
-   err = i915_driver_load(pdev, ent);
+   err = i915_driver_probe(pdev, ent);
if (err)
return err;
 
-- 
2.21.0


[Intel-gfx] [RFC PATCH 0/6] Rename functions to match their entry points

2019-07-10 Thread Janusz Krzysztofik
The need for this was identified while working on the split of the driver
unbind path into _remove() and _release() parts.  Consistency in function
naming has been recognized as helpful when trying to work out which
phase the code is in.

What I'm still not sure about is the desired depth of that modification:
how far down should we go with renaming without overriding meaningful
function names?  Please advise if you think even deeper renaming
makes sense.

Thanks,
Janusz

Janusz Krzysztofik (6):
  drm/i915: Rename "_load"/"_unload" to match PCI entry points
  drm/i915: Replace "_load" with "_probe" consequently
  drm/i915: Propagate "_release" function name suffix down
  drm/i915: Propagate "_remove" function name suffix down
  drm/i915: Propagate "_probe" function name suffix down
  drm/i915: Rename "inject_load_failure" module parameter

 drivers/gpu/drm/i915/display/intel_bios.c |   4 +-
 drivers/gpu/drm/i915/display/intel_bios.h |   2 +-
 .../gpu/drm/i915/display/intel_connector.c|   2 +-
 drivers/gpu/drm/i915/display/intel_display.c  |   2 +-
 .../drm/i915/display/intel_display_power.c|   6 +-
 .../drm/i915/display/intel_display_power.h|   2 +-
 drivers/gpu/drm/i915/gt/intel_engine_cs.c |   2 +-
 drivers/gpu/drm/i915/i915_drv.c   | 111 +-
 drivers/gpu/drm/i915/i915_drv.h   |  20 ++--
 drivers/gpu/drm/i915/i915_gem.c   |  12 +-
 drivers/gpu/drm/i915/i915_gem_gtt.c   |   4 +-
 drivers/gpu/drm/i915/i915_gem_gtt.h   |   2 +-
 drivers/gpu/drm/i915/i915_params.c|   2 +-
 drivers/gpu/drm/i915/i915_params.h|   2 +-
 drivers/gpu/drm/i915/i915_pci.c   |   6 +-
 drivers/gpu/drm/i915/intel_gvt.c  |   7 +-
 drivers/gpu/drm/i915/intel_gvt.h  |   4 +-
 drivers/gpu/drm/i915/intel_runtime_pm.c   |   2 +-
 drivers/gpu/drm/i915/intel_runtime_pm.h   |   2 +-
 drivers/gpu/drm/i915/intel_uncore.c   |   2 +-
 drivers/gpu/drm/i915/intel_wopcm.c|   2 +-
 21 files changed, 100 insertions(+), 98 deletions(-)

-- 
2.21.0


[Intel-gfx] [RFC PATCH 4/6] drm/i915: Propagate "_remove" function name suffix down

2019-07-10 Thread Janusz Krzysztofik
Similar to the "_release" case, consistently replace mixed
"_cleanup"/"_fini"/"_fini_hw" components found in names of functions
called from i915_driver_remove() with "_remove" or "_driver_remove"
suffixes for better code readability.

Signed-off-by: Janusz Krzysztofik 
---
 drivers/gpu/drm/i915/display/intel_bios.c |  4 ++--
 drivers/gpu/drm/i915/display/intel_bios.h |  2 +-
 drivers/gpu/drm/i915/display/intel_display.c  |  2 +-
 .../drm/i915/display/intel_display_power.c|  6 ++---
 .../drm/i915/display/intel_display_power.h|  2 +-
 drivers/gpu/drm/i915/i915_drv.c   | 24 +--
 drivers/gpu/drm/i915/i915_drv.h   |  4 ++--
 drivers/gpu/drm/i915/i915_gem.c   |  2 +-
 drivers/gpu/drm/i915/intel_gvt.c  |  5 ++--
 drivers/gpu/drm/i915/intel_gvt.h  |  4 ++--
 10 files changed, 28 insertions(+), 27 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_bios.c b/drivers/gpu/drm/i915/display/intel_bios.c
index 0c9808132d67..3c725edc79ef 100644
--- a/drivers/gpu/drm/i915/display/intel_bios.c
+++ b/drivers/gpu/drm/i915/display/intel_bios.c
@@ -1891,10 +1891,10 @@ void intel_bios_init(struct drm_i915_private *dev_priv)
 }
 
 /**
- * intel_bios_cleanup - Free any resources allocated by intel_bios_init()
+ * intel_bios_driver_remove - Free any resources allocated by intel_bios_init()
  * @dev_priv: i915 device instance
  */
-void intel_bios_cleanup(struct drm_i915_private *dev_priv)
+void intel_bios_driver_remove(struct drm_i915_private *dev_priv)
 {
kfree(dev_priv->vbt.child_dev);
dev_priv->vbt.child_dev = NULL;
diff --git a/drivers/gpu/drm/i915/display/intel_bios.h b/drivers/gpu/drm/i915/display/intel_bios.h
index 0b7be6389a07..4969189e620f 100644
--- a/drivers/gpu/drm/i915/display/intel_bios.h
+++ b/drivers/gpu/drm/i915/display/intel_bios.h
@@ -228,7 +228,7 @@ struct mipi_pps_data {
 } __packed;
 
 void intel_bios_init(struct drm_i915_private *dev_priv);
-void intel_bios_cleanup(struct drm_i915_private *dev_priv);
+void intel_bios_driver_remove(struct drm_i915_private *dev_priv);
 bool intel_bios_is_valid_vbt(const void *buf, size_t size);
 bool intel_bios_is_tv_present(struct drm_i915_private *dev_priv);
 bool intel_bios_is_lvds_present(struct drm_i915_private *dev_priv, u8 *i2c_pin);
diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
index f09eda75711a..47dd682c9a62 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -17062,7 +17062,7 @@ static void intel_hpd_poll_fini(struct drm_device *dev)
drm_connector_list_iter_end(&conn_iter);
 }
 
-void intel_modeset_cleanup(struct drm_device *dev)
+void intel_modeset_driver_remove(struct drm_device *dev)
 {
struct drm_i915_private *dev_priv = to_i915(dev);
 
diff --git a/drivers/gpu/drm/i915/display/intel_display_power.c b/drivers/gpu/drm/i915/display/intel_display_power.c
index 7437fc71d289..5f4939a9ca90 100644
--- a/drivers/gpu/drm/i915/display/intel_display_power.c
+++ b/drivers/gpu/drm/i915/display/intel_display_power.c
@@ -4427,7 +4427,7 @@ static void intel_power_domains_verify_state(struct drm_i915_private *dev_priv);
  *
  * It will return with power domains disabled (to be enabled later by
  * intel_power_domains_enable()) and must be paired with
- * intel_power_domains_fini_hw().
+ * intel_power_domains_driver_remove().
  */
 void intel_power_domains_init_hw(struct drm_i915_private *i915, bool resume)
 {
@@ -4479,7 +4479,7 @@ void intel_power_domains_init_hw(struct drm_i915_private *i915, bool resume)
 }
 
 /**
- * intel_power_domains_fini_hw - deinitialize hw power domain state
+ * intel_power_domains_driver_remove - deinitialize hw power domain state
  * @i915: i915 device instance
  *
  * De-initializes the display power domain HW state. It also ensures that the
@@ -4489,7 +4489,7 @@ void intel_power_domains_init_hw(struct drm_i915_private *i915, bool resume)
  * intel_power_domains_disable()) and must be paired with
  * intel_power_domains_init_hw().
  */
-void intel_power_domains_fini_hw(struct drm_i915_private *i915)
+void intel_power_domains_driver_remove(struct drm_i915_private *i915)
 {
intel_wakeref_t wakeref __maybe_unused =
fetch_and_zero(&i915->power_domains.wakeref);
diff --git a/drivers/gpu/drm/i915/display/intel_display_power.h b/drivers/gpu/drm/i915/display/intel_display_power.h
index 8f43f7051a16..dbd1f5ef01d1 100644
--- a/drivers/gpu/drm/i915/display/intel_display_power.h
+++ b/drivers/gpu/drm/i915/display/intel_display_power.h
@@ -214,7 +214,7 @@ void gen9_enable_dc5(struct drm_i915_private *dev_priv);
 int intel_power_domains_init(struct drm_i915_private *dev_priv);
 void intel_power_domains_cleanup(struct drm_i915_private *dev_priv);
 void intel_power_domains_init_hw(struct drm_i9

[Intel-gfx] [RFC PATCH 5/6] drm/i915: Propagate "_probe" function name suffix down

2019-07-10 Thread Janusz Krzysztofik
Similar to the "_release" and "_remove" cases, consistently replace
"_init" components of names of functions called from
i915_driver_probe() with "_probe" suffixes for better code readability.

Signed-off-by: Janusz Krzysztofik 
---
 drivers/gpu/drm/i915/i915_drv.c | 26 +-
 1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 6e83fe96d930..7241a7d14e9b 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -675,7 +675,7 @@ static const struct vga_switcheroo_client_ops i915_switcheroo_ops = {
.can_switch = i915_switcheroo_can_switch,
 };
 
-static int i915_load_modeset_init(struct drm_device *dev)
+static int i915_driver_modeset_probe(struct drm_device *dev)
 {
struct drm_i915_private *dev_priv = to_i915(dev);
struct pci_dev *pdev = dev_priv->drm.pdev;
@@ -884,7 +884,7 @@ static void intel_detect_preproduction_hw(struct drm_i915_private *dev_priv)
 }
 
 /**
- * i915_driver_init_early - setup state not requiring device access
+ * i915_driver_early_probe - setup state not requiring device access
  * @dev_priv: device private
  *
  * Initialize everything that is a "SW-only" state, that is state not
@@ -893,7 +893,7 @@ static void intel_detect_preproduction_hw(struct drm_i915_private *dev_priv)
  * system memory allocation, setting up device specific attributes and
  * function hooks not requiring accessing the device.
  */
-static int i915_driver_init_early(struct drm_i915_private *dev_priv)
+static int i915_driver_early_probe(struct drm_i915_private *dev_priv)
 {
int ret = 0;
 
@@ -963,7 +963,7 @@ static int i915_driver_init_early(struct drm_i915_private *dev_priv)
 
 /**
  * i915_driver_early_release - cleanup the setup done in
- *i915_driver_init_early()
+ *i915_driver_early_probe()
  * @dev_priv: device private
  */
 static void i915_driver_early_release(struct drm_i915_private *dev_priv)
@@ -980,7 +980,7 @@ static void i915_driver_early_release(struct drm_i915_private *dev_priv)
 }
 
 /**
- * i915_driver_init_mmio - setup device MMIO
+ * i915_driver_mmio_probe - setup device MMIO
  * @dev_priv: device private
  *
  * Setup minimal device state necessary for MMIO accesses later in the
@@ -988,7 +988,7 @@ static void i915_driver_early_release(struct drm_i915_private *dev_priv)
  * side effects or exposing the driver via kernel internal or user space
  * interfaces.
  */
-static int i915_driver_init_mmio(struct drm_i915_private *dev_priv)
+static int i915_driver_mmio_probe(struct drm_i915_private *dev_priv)
 {
int ret;
 
@@ -1029,7 +1029,7 @@ static int i915_driver_init_mmio(struct drm_i915_private *dev_priv)
 }
 
 /**
- * i915_driver_mmio_release - cleanup the setup done in i915_driver_init_mmio()
+ * i915_driver_mmio_release - cleanup the setup done in i915_driver_mmio_probe()
  * @dev_priv: device private
  */
 static void i915_driver_mmio_release(struct drm_i915_private *dev_priv)
@@ -1525,13 +1525,13 @@ static void edram_detect(struct drm_i915_private *dev_priv)
 }
 
 /**
- * i915_driver_init_hw - setup state requiring device access
+ * i915_driver_hw_probe - setup state requiring device access
  * @dev_priv: device private
  *
  * Setup state that requires accessing the device, but doesn't require
  * exposing the driver via kernel internal or userspace interfaces.
  */
-static int i915_driver_init_hw(struct drm_i915_private *dev_priv)
+static int i915_driver_hw_probe(struct drm_i915_private *dev_priv)
 {
struct pci_dev *pdev = dev_priv->drm.pdev;
int ret;
@@ -1900,7 +1900,7 @@ int i915_driver_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
if (ret)
goto out_fini;
 
-   ret = i915_driver_init_early(dev_priv);
+   ret = i915_driver_early_probe(dev_priv);
if (ret < 0)
goto out_pci_disable;
 
@@ -1908,15 +1908,15 @@ int i915_driver_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 
i915_detect_vgpu(dev_priv);
 
-   ret = i915_driver_init_mmio(dev_priv);
+   ret = i915_driver_mmio_probe(dev_priv);
if (ret < 0)
goto out_runtime_pm_put;
 
-   ret = i915_driver_init_hw(dev_priv);
+   ret = i915_driver_hw_probe(dev_priv);
if (ret < 0)
goto out_cleanup_mmio;
 
-   ret = i915_load_modeset_init(&dev_priv->drm);
+   ret = i915_driver_modeset_probe(&dev_priv->drm);
if (ret < 0)
goto out_cleanup_hw;
 
-- 
2.21.0
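
Illustrative note, not part of the patch: the point of the "_probe" and
"_release" suffixes is that each setup stage visibly pairs with its
teardown counterpart in the error unwind path.  The userspace sketch
below models that staged goto-unwind shape.  Everything in it is
hypothetical (mock_driver_probe(), stage_probe(), note(), the trace
buffer and the fail_stage knob); it only records the order of steps and
does not call any i915 function.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

static char trace[128];	/* space-separated record of executed steps */
static int fail_stage;	/* 0: no failure, N: make stage N fail */

static void note(const char *step)
{
	/* Guard the fixed-size buffer used by this sketch. */
	assert(strlen(trace) + strlen(step) + 2 <= sizeof(trace));
	strcat(trace, step);
	strcat(trace, " ");
}

/* A stage either fails with -ENODEV or records that it ran. */
static int stage_probe(int stage, const char *name)
{
	if (fail_stage == stage)
		return -ENODEV;
	note(name);
	return 0;
}

static int mock_driver_probe(void)
{
	int ret;

	trace[0] = '\0';

	ret = stage_probe(1, "early_probe");
	if (ret < 0)
		return ret;

	ret = stage_probe(2, "mmio_probe");
	if (ret < 0)
		goto out_early_release;

	ret = stage_probe(3, "hw_probe");
	if (ret < 0)
		goto out_mmio_release;

	return 0;

	/* Unwind in reverse order of setup. */
out_mmio_release:
	note("mmio_release");
out_early_release:
	note("early_release");
	return ret;
}
```

With fail_stage set to 3, the recorded order is early_probe,
mmio_probe, mmio_release, early_release, so every completed stage is
undone by the step that shares its name stem.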


[Intel-gfx] [RFC PATCH 2/6] drm/i915: Replace "_load" with "_probe" consequently

2019-07-10 Thread Janusz Krzysztofik
Use the "_probe" nomenclature not only in i915_driver_probe() helper
name but also in other related function / variable names for
consistency.  Only the userspace exposed name of a related module
parameter is left untouched.

Signed-off-by: Janusz Krzysztofik 
---
 .../gpu/drm/i915/display/intel_connector.c|  2 +-
 drivers/gpu/drm/i915/gt/intel_engine_cs.c |  2 +-
 drivers/gpu/drm/i915/i915_drv.c   | 20 +--
 drivers/gpu/drm/i915/i915_drv.h   | 10 +-
 drivers/gpu/drm/i915/i915_gem.c   |  8 
 drivers/gpu/drm/i915/i915_pci.c   |  2 +-
 drivers/gpu/drm/i915/intel_gvt.c  |  2 +-
 drivers/gpu/drm/i915/intel_uncore.c   |  2 +-
 drivers/gpu/drm/i915/intel_wopcm.c|  2 +-
 9 files changed, 25 insertions(+), 25 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_connector.c b/drivers/gpu/drm/i915/display/intel_connector.c
index 41310f8e5a2a..d0163d86c42a 100644
--- a/drivers/gpu/drm/i915/display/intel_connector.c
+++ b/drivers/gpu/drm/i915/display/intel_connector.c
@@ -118,7 +118,7 @@ int intel_connector_register(struct drm_connector *connector)
if (ret)
goto err;
 
-   if (i915_inject_load_failure()) {
+   if (i915_inject_probe_failure()) {
ret = -EFAULT;
goto err_backlight;
}
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index df5932f5f578..a17f0f812735 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -426,7 +426,7 @@ int intel_engines_init_mmio(struct drm_i915_private *i915)
WARN_ON(engine_mask &
GENMASK(BITS_PER_TYPE(mask) - 1, I915_NUM_ENGINES));
 
-   if (i915_inject_load_failure())
+   if (i915_inject_probe_failure())
return -ENODEV;
 
for (i = 0; i < ARRAY_SIZE(intel_engines); i++) {
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 8b72ae7c1f5d..ad24957ad86d 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -81,14 +81,14 @@
 static struct drm_driver driver;
 
 #if IS_ENABLED(CONFIG_DRM_I915_DEBUG)
-static unsigned int i915_load_fail_count;
+static unsigned int i915_probe_fail_count;
 
-bool __i915_inject_load_failure(const char *func, int line)
+bool __i915_inject_probe_failure(const char *func, int line)
 {
-   if (i915_load_fail_count >= i915_modparams.inject_load_failure)
+   if (i915_probe_fail_count >= i915_modparams.inject_load_failure)
return false;
 
-   if (++i915_load_fail_count == i915_modparams.inject_load_failure) {
+   if (++i915_probe_fail_count == i915_modparams.inject_load_failure) {
DRM_INFO("Injecting failure at checkpoint %u [%s:%d]\n",
 i915_modparams.inject_load_failure, func, line);
i915_modparams.inject_load_failure = 0;
@@ -100,7 +100,7 @@ bool __i915_inject_load_failure(const char *func, int line)
 
 bool i915_error_injected(void)
 {
-   return i915_load_fail_count && !i915_modparams.inject_load_failure;
+   return i915_probe_fail_count && !i915_modparams.inject_load_failure;
 }
 
 #endif
@@ -681,7 +681,7 @@ static int i915_load_modeset_init(struct drm_device *dev)
struct pci_dev *pdev = dev_priv->drm.pdev;
int ret;
 
-   if (i915_inject_load_failure())
+   if (i915_inject_probe_failure())
return -ENODEV;
 
if (HAS_DISPLAY(dev_priv)) {
@@ -897,7 +897,7 @@ static int i915_driver_init_early(struct drm_i915_private *dev_priv)
 {
int ret = 0;
 
-   if (i915_inject_load_failure())
+   if (i915_inject_probe_failure())
return -ENODEV;
 
intel_device_info_subplatform_init(dev_priv);
@@ -991,7 +991,7 @@ static int i915_driver_init_mmio(struct drm_i915_private *dev_priv)
 {
int ret;
 
-   if (i915_inject_load_failure())
+   if (i915_inject_probe_failure())
return -ENODEV;
 
if (i915_get_bridge_dev(dev_priv))
@@ -1535,7 +1535,7 @@ static int i915_driver_init_hw(struct drm_i915_private *dev_priv)
struct pci_dev *pdev = dev_priv->drm.pdev;
int ret;
 
-   if (i915_inject_load_failure())
+   if (i915_inject_probe_failure())
return -ENODEV;
 
intel_device_info_runtime_init(dev_priv);
@@ -1941,7 +1941,7 @@ int i915_driver_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 out_pci_disable:
pci_disable_device(pdev);
 out_fini:
-   i915_load_error(dev_priv, "Device initialization failed (%d)\n", ret);
+   i915_probe_error(dev_priv, "Device initialization failed (%d)\n", ret);
i915_driver_destroy(dev_priv);
return ret;
 }
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/dr

[Intel-gfx] [RFC PATCH 6/6] drm/i915: Rename "inject_load_failure" module parameter

2019-07-10 Thread Janusz Krzysztofik
Use the "probe" nomenclature for consistency with internally used names
of functions and variables.

Requires adjustment of IGT tests and possibly affects other user custom
applications.

Signed-off-by: Janusz Krzysztofik 
---
 drivers/gpu/drm/i915/i915_drv.c| 10 +-
 drivers/gpu/drm/i915/i915_params.c |  2 +-
 drivers/gpu/drm/i915/i915_params.h |  2 +-
 3 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 7241a7d14e9b..3bac6be9f37d 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -85,13 +85,13 @@ static unsigned int i915_probe_fail_count;
 
 bool __i915_inject_probe_failure(const char *func, int line)
 {
-   if (i915_probe_fail_count >= i915_modparams.inject_load_failure)
+   if (i915_probe_fail_count >= i915_modparams.inject_probe_failure)
return false;
 
-   if (++i915_probe_fail_count == i915_modparams.inject_load_failure) {
+   if (++i915_probe_fail_count == i915_modparams.inject_probe_failure) {
DRM_INFO("Injecting failure at checkpoint %u [%s:%d]\n",
-i915_modparams.inject_load_failure, func, line);
-   i915_modparams.inject_load_failure = 0;
+i915_modparams.inject_probe_failure, func, line);
+   i915_modparams.inject_probe_failure = 0;
return true;
}
 
@@ -100,7 +100,7 @@ bool __i915_inject_probe_failure(const char *func, int line)
 
 bool i915_error_injected(void)
 {
-   return i915_probe_fail_count && !i915_modparams.inject_load_failure;
+   return i915_probe_fail_count && !i915_modparams.inject_probe_failure;
 }
 
 #endif
diff --git a/drivers/gpu/drm/i915/i915_params.c b/drivers/gpu/drm/i915/i915_params.c
index 296452f9efe4..59a6586dae15 100644
--- a/drivers/gpu/drm/i915/i915_params.c
+++ b/drivers/gpu/drm/i915/i915_params.c
@@ -165,7 +165,7 @@ i915_param_named_unsafe(enable_dp_mst, bool, 0600,
"Enable multi-stream transport (MST) for new DisplayPort sinks. (default: true)");
 
 #if IS_ENABLED(CONFIG_DRM_I915_DEBUG)
-i915_param_named_unsafe(inject_load_failure, uint, 0400,
+i915_param_named_unsafe(inject_probe_failure, uint, 0400,
"Force an error after a number of failure check points (0:disabled (default), N:force failure at the Nth failure check point)");
 #endif
 
diff --git a/drivers/gpu/drm/i915/i915_params.h b/drivers/gpu/drm/i915/i915_params.h
index d29ade3b7de6..8c887413fc70 100644
--- a/drivers/gpu/drm/i915/i915_params.h
+++ b/drivers/gpu/drm/i915/i915_params.h
@@ -62,7 +62,7 @@ struct drm_printer;
param(int, mmio_debug, -IS_ENABLED(CONFIG_DRM_I915_DEBUG_MMIO)) \
param(int, edp_vswing, 0) \
param(int, reset, 2) \
-   param(unsigned int, inject_load_failure, 0) \
+   param(unsigned int, inject_probe_failure, 0) \
param(int, fastboot, -1) \
param(int, enable_dpcd_backlight, 0) \
param(char *, force_probe, CONFIG_DRM_I915_FORCE_PROBE) \
-- 
2.21.0
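
Illustrative note, not part of the patch: the checkpoint logic touched
by this rename can be modelled in plain C as below.  This is a sketch
that mirrors the control flow visible in the diff, with the module
parameter replaced by an ordinary variable; the mock_ function names
are hypothetical and the DRM_INFO() logging is left out.

```c
#include <stdbool.h>

/* Stand-in for i915_modparams.inject_probe_failure (0 means disabled). */
static unsigned int inject_probe_failure;
/* Number of checkpoints passed so far. */
static unsigned int probe_fail_count;

/* Returns true exactly once, at the Nth checkpoint, when N is non-zero. */
static bool mock_inject_probe_failure(void)
{
	if (probe_fail_count >= inject_probe_failure)
		return false;

	if (++probe_fail_count == inject_probe_failure) {
		inject_probe_failure = 0;	/* disarm after firing */
		return true;
	}

	return false;
}

/* True once a failure has been injected and the knob has been disarmed. */
static bool mock_error_injected(void)
{
	return probe_fail_count && !inject_probe_failure;
}
```

With the variable set to 3, the first two checkpoints pass, the third
reports a failure and resets the variable to 0, and every later
checkpoint passes again because the count is no longer below the
(now zero) threshold.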


[Intel-gfx] [RFC PATCH 3/6] drm/i915: Propagate "_release" function name suffix down

2019-07-10 Thread Janusz Krzysztofik
Replace mixed "_fini"/"_cleanup"/"_cleanup_hw" suffixes found in names
of functions called from i915_driver_release() with "_release" suffix
consistently.  This provides better code readability, especially
helpful when trying to work out which phase the code is in.

Function names starting with "i915_driver_", i.e., those defined in
drivers/gpu/drm/i915/i915_drv.c, just have the "cleanup" or "fini"
parts of their names replaced with the "_release" suffix, while names
of functions coming from other source files have been suffixed with
"_driver_release" to avoid ambiguity with other possible .release entry
points.

Suggested-by: Chris Wilson 
Signed-off-by: Janusz Krzysztofik 
---
 drivers/gpu/drm/i915/i915_drv.c | 33 +
 drivers/gpu/drm/i915/i915_drv.h |  2 +-
 drivers/gpu/drm/i915/i915_gem.c |  2 +-
 drivers/gpu/drm/i915/i915_gem_gtt.c |  4 +--
 drivers/gpu/drm/i915/i915_gem_gtt.h |  2 +-
 drivers/gpu/drm/i915/intel_runtime_pm.c |  2 +-
 drivers/gpu/drm/i915/intel_runtime_pm.h |  2 +-
 7 files changed, 24 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index ad24957ad86d..36c872220f68 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -752,7 +752,7 @@ static int i915_load_modeset_init(struct drm_device *dev)
 cleanup_gem:
i915_gem_suspend(dev_priv);
i915_gem_fini_hw(dev_priv);
-   i915_gem_fini(dev_priv);
+   i915_gem_driver_release(dev_priv);
 cleanup_modeset:
intel_modeset_cleanup(dev);
 cleanup_irq:
@@ -962,10 +962,11 @@ static int i915_driver_init_early(struct drm_i915_private *dev_priv)
 }
 
 /**
- * i915_driver_cleanup_early - cleanup the setup done in i915_driver_init_early()
+ * i915_driver_early_release - cleanup the setup done in
+ *i915_driver_init_early()
  * @dev_priv: device private
  */
-static void i915_driver_cleanup_early(struct drm_i915_private *dev_priv)
+static void i915_driver_early_release(struct drm_i915_private *dev_priv)
 {
intel_irq_fini(dev_priv);
intel_power_domains_cleanup(dev_priv);
@@ -1028,10 +1029,10 @@ static int i915_driver_init_mmio(struct drm_i915_private *dev_priv)
 }
 
 /**
- * i915_driver_cleanup_mmio - cleanup the setup done in i915_driver_init_mmio()
+ * i915_driver_mmio_release - cleanup the setup done in i915_driver_init_mmio()
  * @dev_priv: device private
  */
-static void i915_driver_cleanup_mmio(struct drm_i915_private *dev_priv)
+static void i915_driver_mmio_release(struct drm_i915_private *dev_priv)
 {
intel_teardown_mchbar(dev_priv);
intel_uncore_fini_mmio(&dev_priv->uncore);
@@ -1684,7 +1685,7 @@ static int i915_driver_init_hw(struct drm_i915_private *dev_priv)
pci_disable_msi(pdev);
pm_qos_remove_request(&dev_priv->pm_qos);
 err_ggtt:
-   i915_ggtt_cleanup_hw(dev_priv);
+   i915_ggtt_driver_release(dev_priv);
 err_perf:
i915_perf_fini(dev_priv);
return ret;
@@ -1929,15 +1930,15 @@ int i915_driver_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 
 out_cleanup_hw:
i915_driver_cleanup_hw(dev_priv);
-   i915_ggtt_cleanup_hw(dev_priv);
+   i915_ggtt_driver_release(dev_priv);
 
/* Paranoia: make sure we have disabled everything before we exit. */
intel_sanitize_gt_powersave(dev_priv);
 out_cleanup_mmio:
-   i915_driver_cleanup_mmio(dev_priv);
+   i915_driver_mmio_release(dev_priv);
 out_runtime_pm_put:
enable_rpm_wakeref_asserts(&dev_priv->runtime_pm);
-   i915_driver_cleanup_early(dev_priv);
+   i915_driver_early_release(dev_priv);
 out_pci_disable:
pci_disable_device(pdev);
 out_fini:
@@ -2000,19 +2001,19 @@ static void i915_driver_release(struct drm_device *dev)
 
disable_rpm_wakeref_asserts(rpm);
 
-   i915_gem_fini(dev_priv);
+   i915_gem_driver_release(dev_priv);
 
-   i915_ggtt_cleanup_hw(dev_priv);
+   i915_ggtt_driver_release(dev_priv);
 
/* Paranoia: make sure we have disabled everything before we exit. */
intel_sanitize_gt_powersave(dev_priv);
 
-   i915_driver_cleanup_mmio(dev_priv);
+   i915_driver_mmio_release(dev_priv);
 
enable_rpm_wakeref_asserts(rpm);
-   intel_runtime_pm_cleanup(rpm);
+   intel_runtime_pm_driver_release(rpm);
 
-   i915_driver_cleanup_early(dev_priv);
+   i915_driver_early_release(dev_priv);
i915_driver_destroy(dev_priv);
 }
 
@@ -2205,7 +2206,7 @@ static int i915_drm_suspend_late(struct drm_device *dev, 
bool hibernation)
 out:
enable_rpm_wakeref_asserts(rpm);
if (!dev_priv->uncore.user_forcewake.count)
-   intel_runtime_pm_cleanup(rpm);
+   intel_runtime_pm_driver_release(rpm);
 
return ret;
 }
@@ -2969,7 +

Re: [Intel-gfx] [RFC PATCH 0/6] Rename functions to match their entry points

2019-07-10 Thread Janusz Krzysztofik
On Wednesday, July 10, 2019 2:47:08 PM CEST Chris Wilson wrote:
> Quoting Janusz Krzysztofik (2019-07-10 13:36:25)
> > Need for this was identified while working on split of driver unbind
> > path into _remove() and _release() parts.  Consistency in function
> > naming has been recognized as helpful when trying to work out which
> > phase the code is in.
> > 
> > What I'm still not sure about is the desired depth of that modification -
> > how deep should we go with renaming without overriding meaningful
> > function names.  Please advise if you think even deeper renaming
> > makes sense.
> 
> I did a double take over "driver_release" but by the end I was in
> agreement.
> 
> The early_release though, that is worth a bit of artistic license to say
> early_probe pairs with late_release.

OK, I'll fix it, as well as other issues pointed out by dim, and resubmit.

Thanks,
Janusz

> -Chris
> 




___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

[Intel-gfx] [RFC PATCH] drm/i915: Drop extern qualifiers from header function prototypes

2019-07-10 Thread Janusz Krzysztofik
Follow the dim checkpatch recommendation so that it stops complaining about
extern qualifiers whenever header files are modified.

Signed-off-by: Janusz Krzysztofik 
---
 drivers/gpu/drm/i915/gem/i915_gem_object.h |  2 +-
 drivers/gpu/drm/i915/gvt/gtt.h | 13 +++---
 drivers/gpu/drm/i915/i915_drv.h| 48 +++---
 drivers/gpu/drm/i915/i915_irq.h|  4 +-
 drivers/gpu/drm/i915/oa/i915_oa_bdw.h  |  2 +-
 drivers/gpu/drm/i915/oa/i915_oa_bxt.h  |  2 +-
 drivers/gpu/drm/i915/oa/i915_oa_cflgt2.h   |  2 +-
 drivers/gpu/drm/i915/oa/i915_oa_cflgt3.h   |  2 +-
 drivers/gpu/drm/i915/oa/i915_oa_chv.h  |  2 +-
 drivers/gpu/drm/i915/oa/i915_oa_cnl.h  |  2 +-
 drivers/gpu/drm/i915/oa/i915_oa_glk.h  |  2 +-
 drivers/gpu/drm/i915/oa/i915_oa_hsw.h  |  2 +-
 drivers/gpu/drm/i915/oa/i915_oa_icl.h  |  2 +-
 drivers/gpu/drm/i915/oa/i915_oa_kblgt2.h   |  2 +-
 drivers/gpu/drm/i915/oa/i915_oa_kblgt3.h   |  2 +-
 drivers/gpu/drm/i915/oa/i915_oa_sklgt2.h   |  2 +-
 drivers/gpu/drm/i915/oa/i915_oa_sklgt3.h   |  2 +-
 drivers/gpu/drm/i915/oa/i915_oa_sklgt4.h   |  2 +-
 include/drm/i915_drm.h | 10 ++---
 19 files changed, 52 insertions(+), 53 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h 
b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index 20754c15412a..67aea07ea019 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -81,7 +81,7 @@ i915_gem_object_lookup(struct drm_file *file, u32 handle)
 }
 
 __deprecated
-extern struct drm_gem_object *
+struct drm_gem_object *
 drm_gem_object_lookup(struct drm_file *file, u32 handle);
 
 __attribute__((nonnull))
diff --git a/drivers/gpu/drm/i915/gvt/gtt.h b/drivers/gpu/drm/i915/gvt/gtt.h
index 42d0394f0de2..88789316807d 100644
--- a/drivers/gpu/drm/i915/gvt/gtt.h
+++ b/drivers/gpu/drm/i915/gvt/gtt.h
@@ -205,17 +205,18 @@ struct intel_vgpu_gtt {
struct intel_vgpu_scratch_pt scratch_pt[GTT_TYPE_MAX];
 };
 
-extern int intel_vgpu_init_gtt(struct intel_vgpu *vgpu);
-extern void intel_vgpu_clean_gtt(struct intel_vgpu *vgpu);
+int intel_vgpu_init_gtt(struct intel_vgpu *vgpu);
+void intel_vgpu_clean_gtt(struct intel_vgpu *vgpu);
 void intel_vgpu_reset_ggtt(struct intel_vgpu *vgpu, bool invalidate_old);
 void intel_vgpu_invalidate_ppgtt(struct intel_vgpu *vgpu);
 
-extern int intel_gvt_init_gtt(struct intel_gvt *gvt);
+int intel_gvt_init_gtt(struct intel_gvt *gvt);
 void intel_vgpu_reset_gtt(struct intel_vgpu *vgpu);
-extern void intel_gvt_clean_gtt(struct intel_gvt *gvt);
+void intel_gvt_clean_gtt(struct intel_gvt *gvt);
 
-extern struct intel_vgpu_mm *intel_gvt_find_ppgtt_mm(struct intel_vgpu *vgpu,
-   int page_table_level, void *root_entry);
+struct intel_vgpu_mm *intel_gvt_find_ppgtt_mm(struct intel_vgpu *vgpu,
+ int page_table_level,
+ void *root_entry);
 
 struct intel_vgpu_oos_page {
struct intel_vgpu_ppgtt_spt *spt;
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index a9381e404fd5..649bebcc0019 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2388,19 +2388,18 @@ __i915_printk(struct drm_i915_private *dev_priv, const 
char *level,
__i915_printk(dev_priv, KERN_ERR, fmt, ##__VA_ARGS__)
 
 #ifdef CONFIG_COMPAT
-extern long i915_compat_ioctl(struct file *filp, unsigned int cmd,
- unsigned long arg);
+long i915_compat_ioctl(struct file *filp, unsigned int cmd, unsigned long arg);
 #else
 #define i915_compat_ioctl NULL
 #endif
 extern const struct dev_pm_ops i915_pm_ops;
+extern const struct dev_pm_ops i915_pm_ops_1;
 
-extern int i915_driver_load(struct pci_dev *pdev,
-   const struct pci_device_id *ent);
-extern void i915_driver_unload(struct drm_device *dev);
+int i915_driver_load(struct pci_dev *pdev, const struct pci_device_id *ent);
+void i915_driver_unload(struct drm_device *dev);
 
-extern void intel_engine_init_hangcheck(struct intel_engine_cs *engine);
-extern void intel_hangcheck_init(struct drm_i915_private *dev_priv);
+void intel_engine_init_hangcheck(struct intel_engine_cs *engine);
+void intel_hangcheck_init(struct drm_i915_private *dev_priv);
 int vlv_force_gfx_clock(struct drm_i915_private *dev_priv, bool on);
 
 u32 intel_calculate_mcr_s_ss_select(struct drm_i915_private *dev_priv);
@@ -2670,14 +2669,14 @@ int intel_engine_cmd_parser(struct intel_engine_cs 
*engine,
bool is_master);
 
 /* i915_perf.c */
-extern void i915_perf_init(struct drm_i915_private *dev_priv);
-extern void i915_perf_fini(struct drm_i915_private *dev_priv);
-extern void i915_perf_register(struct drm_i915_private *dev_priv);
-extern void i915_perf_unregister(struct drm_i915_private *dev_priv);
+void i915_perf_init(struct drm_i915_private *dev_priv);
+void i915_perf_fini(struct drm_i915_pr

[Intel-gfx] [RFC PATCH] drm/i915: Join quoted strings and align them with open parenthesis

2019-07-10 Thread Janusz Krzysztofik
Follow dim checkpatch recommendations so that it stops complaining whenever
i915_params.c is modified in keeping with its existing style.

Signed-off-by: Janusz Krzysztofik 
---
 drivers/gpu/drm/i915/i915_params.c | 96 ++
 1 file changed, 33 insertions(+), 63 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_params.c 
b/drivers/gpu/drm/i915/i915_params.c
index 296452f9efe4..8007fa893869 100644
--- a/drivers/gpu/drm/i915/i915_params.c
+++ b/drivers/gpu/drm/i915/i915_params.c
@@ -41,141 +41,111 @@ struct i915_params i915_modparams __read_mostly = {
 };
 
 i915_param_named(modeset, int, 0400,
-   "Use kernel modesetting [KMS] (0=disable, "
-   "1=on, -1=force vga console preference [default])");
+"Use kernel modesetting [KMS] (0=disable, 1=on, -1=force vga console preference [default])");
 
 i915_param_named_unsafe(enable_dc, int, 0400,
-   "Enable power-saving display C-states. "
-   "(-1=auto [default]; 0=disable; 1=up to DC5; 2=up to DC6)");
+   "Enable power-saving display C-states. (-1=auto [default]; 0=disable; 1=up to DC5; 2=up to DC6)");
 
 i915_param_named_unsafe(enable_fbc, int, 0600,
-   "Enable frame buffer compression for power savings "
-   "(default: -1 (use per-chip default))");
+   "Enable frame buffer compression for power savings (default: -1 (use per-chip default))");
 
 i915_param_named_unsafe(lvds_channel_mode, int, 0400,
-"Specify LVDS channel mode "
-"(0=probe BIOS [default], 1=single-channel, 2=dual-channel)");
+   "Specify LVDS channel mode (0=probe BIOS [default], 1=single-channel, 2=dual-channel)");
 
 i915_param_named_unsafe(panel_use_ssc, int, 0600,
-   "Use Spread Spectrum Clock with panels [LVDS/eDP] "
-   "(default: auto from VBT)");
+   "Use Spread Spectrum Clock with panels [LVDS/eDP] (default: auto from VBT)");
 
 i915_param_named_unsafe(vbt_sdvo_panel_type, int, 0400,
-   "Override/Ignore selection of SDVO panel mode in the VBT "
-   "(-2=ignore, -1=auto [default], index in VBT BIOS table)");
+   "Override/Ignore selection of SDVO panel mode in the VBT (-2=ignore, -1=auto [default], index in VBT BIOS table)");
 
 i915_param_named_unsafe(reset, int, 0600,
-   "Attempt GPU resets (0=disabled, 1=full gpu reset, 2=engine reset [default])");
+   "Attempt GPU resets (0=disabled, 1=full gpu reset, 2=engine reset [default])");
 
 i915_param_named_unsafe(vbt_firmware, charp, 0400,
-   "Load VBT from specified file under /lib/firmware");
+   "Load VBT from specified file under /lib/firmware");
 
 #if IS_ENABLED(CONFIG_DRM_I915_CAPTURE_ERROR)
 i915_param_named(error_capture, bool, 0600,
-   "Record the GPU state following a hang. "
-   "This information in /sys/class/drm/card/error is vital for "
-   "triaging and debugging hangs.");
+"Record the GPU state following a hang. This information in /sys/class/drm/card/error is vital for triaging and debugging hangs.");
 #endif
 
 i915_param_named_unsafe(enable_hangcheck, bool, 0600,
-   "Periodically check GPU activity for detecting hangs. "
-   "WARNING: Disabling this can cause system wide hangs. "
-   "(default: true)");
+   "Periodically check GPU activity for detecting hangs. WARNING: Disabling this can cause system wide hangs. (default: true)");
 
 i915_param_named_unsafe(enable_psr, int, 0600,
-   "Enable PSR "
-   "(0=disabled, 1=enabled) "
-   "Default: -1 (use per-chip default)");
+   "Enable PSR (0=disabled, 1=enabled) Default: -1 (use per-chip default)");
 
 i915_param_named_unsafe(force_probe, charp, 0400,
-   "Force probe the driver for specified devices. "
-   "See CONFIG_DRM_I915_FORCE_PROBE for details.");
+   "Force probe the driver for specified devices. See CONFIG_DRM_I915_FORCE_PROBE for details.");
 
 i915_param_named_unsafe(alpha_support, bool, 0400,
-   "Deprecated. See i915.force_probe.");
+   "Deprecated. See i915.force_probe.");
 
 i915_param_named_unsafe(disable_power_well, int, 0400,
-   "Disable display power wells when possible "
-   "(-1=auto [default], 0=power wells always on, 1=power wells disabled when possible)");
+   "Disable display power wells when possible (-1=auto [default], 0=power wells always on, 1=power wells disabled when possible)");
 
 i915_param_named_
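
A note on why joining the quoted strings above cannot change behaviour: in C, adjacent string literals are concatenated at translation time, so the split and the joined descriptions produce identical strings.  The benefit is that the full message becomes greppable in the source.  A minimal sketch follows; the demo_* names are hypothetical and only the description text is taken from the patch.

```c
#include <assert.h>
#include <string.h>

/* Description split across source lines, as before the patch. */
static const char demo_desc_split[] =
	"Enable PSR "
	"(0=disabled, 1=enabled) "
	"Default: -1 (use per-chip default)";

/* The same description as a single literal, as after the patch. */
static const char demo_desc_joined[] =
	"Enable PSR (0=disabled, 1=enabled) Default: -1 (use per-chip default)";

/* Returns non-zero when both spellings yield the same string. */
int demo_descriptions_match(void)
{
	return sizeof(demo_desc_split) == sizeof(demo_desc_joined) &&
	       strcmp(demo_desc_split, demo_desc_joined) == 0;
}
```

Calling demo_descriptions_match() from any test harness is enough to confirm the equivalence.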

Re: [Intel-gfx] [RFC PATCH] drm/i915: Drop extern qualifiers from header function prototypes

2019-07-10 Thread Janusz Krzysztofik
Hi Chris,

On Wednesday, July 10, 2019 5:01:04 PM CEST Chris Wilson wrote:
> Quoting Janusz Krzysztofik (2019-07-10 15:52:39)
> > Follow dim checkpatch recommendation so it doesn't complain on that now
> > and again on header file modifications.
> > 
> > Signed-off-by: Janusz Krzysztofik 
> 
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -2388,19 +2388,18 @@ __i915_printk(struct drm_i915_private *dev_priv, const char *level,
> > __i915_printk(dev_priv, KERN_ERR, fmt, ##__VA_ARGS__)
> >  
> >  #ifdef CONFIG_COMPAT
> > -extern long i915_compat_ioctl(struct file *filp, unsigned int cmd,
> > - unsigned long arg);
> > +long i915_compat_ioctl(struct file *filp, unsigned int cmd, unsigned long arg);
> >  #else
> >  #define i915_compat_ioctl NULL
> >  #endif
> >  extern const struct dev_pm_ops i915_pm_ops;
> > +extern const struct dev_pm_ops i915_pm_ops_1;
> 
> That's novel.

Oh, sorry, that was me testing how dim checkpatch reacts to extern 
qualifiers on variables.  Thanks for catching this.

Janusz

> -Chris
> 
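
For readers wondering why the qualifier can be dropped from function prototypes but not from i915_pm_ops: a function declaration has external linkage by default, so extern adds nothing there, whereas on an object it is what separates a declaration from a (tentative) definition.  A minimal sketch, with hypothetical demo_* names that do not exist in the driver:

```c
#include <assert.h>

/* Function prototypes: external linkage is the default, so these two
 * declarations are equivalent and may coexist in one translation unit. */
extern int demo_add(int a, int b);
int demo_add(int a, int b);

/* Object declaration: here extern matters.  It declares demo_counter
 * without defining it, which is what a header needs. */
extern int demo_counter;

/* Exactly one translation unit then provides the definition. */
int demo_counter = 40;

int demo_add(int a, int b)
{
	return a + b;
}
```

Dropping extern from the demo_counter declaration in a header would turn it into a tentative definition in every file that includes it, which is why the patch leaves i915_pm_ops alone.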





[Intel-gfx] [PATCH v2] drm/i915: Drop extern qualifiers from header function prototypes

2019-07-10 Thread Janusz Krzysztofik
Follow the dim checkpatch recommendation so that it stops complaining about
extern qualifiers whenever header files are modified.

v2: Drop testing leftover

Signed-off-by: Janusz Krzysztofik 
---
 drivers/gpu/drm/i915/gem/i915_gem_object.h |  2 +-
 drivers/gpu/drm/i915/gvt/gtt.h | 13 +++---
 drivers/gpu/drm/i915/i915_drv.h| 47 ++
 drivers/gpu/drm/i915/i915_irq.h|  4 +-
 drivers/gpu/drm/i915/oa/i915_oa_bdw.h  |  2 +-
 drivers/gpu/drm/i915/oa/i915_oa_bxt.h  |  2 +-
 drivers/gpu/drm/i915/oa/i915_oa_cflgt2.h   |  2 +-
 drivers/gpu/drm/i915/oa/i915_oa_cflgt3.h   |  2 +-
 drivers/gpu/drm/i915/oa/i915_oa_chv.h  |  2 +-
 drivers/gpu/drm/i915/oa/i915_oa_cnl.h  |  2 +-
 drivers/gpu/drm/i915/oa/i915_oa_glk.h  |  2 +-
 drivers/gpu/drm/i915/oa/i915_oa_hsw.h  |  2 +-
 drivers/gpu/drm/i915/oa/i915_oa_icl.h  |  2 +-
 drivers/gpu/drm/i915/oa/i915_oa_kblgt2.h   |  2 +-
 drivers/gpu/drm/i915/oa/i915_oa_kblgt3.h   |  2 +-
 drivers/gpu/drm/i915/oa/i915_oa_sklgt2.h   |  2 +-
 drivers/gpu/drm/i915/oa/i915_oa_sklgt3.h   |  2 +-
 drivers/gpu/drm/i915/oa/i915_oa_sklgt4.h   |  2 +-
 include/drm/i915_drm.h | 10 ++---
 19 files changed, 51 insertions(+), 53 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h 
b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index 20754c15412a..67aea07ea019 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -81,7 +81,7 @@ i915_gem_object_lookup(struct drm_file *file, u32 handle)
 }
 
 __deprecated
-extern struct drm_gem_object *
+struct drm_gem_object *
 drm_gem_object_lookup(struct drm_file *file, u32 handle);
 
 __attribute__((nonnull))
diff --git a/drivers/gpu/drm/i915/gvt/gtt.h b/drivers/gpu/drm/i915/gvt/gtt.h
index 42d0394f0de2..88789316807d 100644
--- a/drivers/gpu/drm/i915/gvt/gtt.h
+++ b/drivers/gpu/drm/i915/gvt/gtt.h
@@ -205,17 +205,18 @@ struct intel_vgpu_gtt {
struct intel_vgpu_scratch_pt scratch_pt[GTT_TYPE_MAX];
 };
 
-extern int intel_vgpu_init_gtt(struct intel_vgpu *vgpu);
-extern void intel_vgpu_clean_gtt(struct intel_vgpu *vgpu);
+int intel_vgpu_init_gtt(struct intel_vgpu *vgpu);
+void intel_vgpu_clean_gtt(struct intel_vgpu *vgpu);
 void intel_vgpu_reset_ggtt(struct intel_vgpu *vgpu, bool invalidate_old);
 void intel_vgpu_invalidate_ppgtt(struct intel_vgpu *vgpu);
 
-extern int intel_gvt_init_gtt(struct intel_gvt *gvt);
+int intel_gvt_init_gtt(struct intel_gvt *gvt);
 void intel_vgpu_reset_gtt(struct intel_vgpu *vgpu);
-extern void intel_gvt_clean_gtt(struct intel_gvt *gvt);
+void intel_gvt_clean_gtt(struct intel_gvt *gvt);
 
-extern struct intel_vgpu_mm *intel_gvt_find_ppgtt_mm(struct intel_vgpu *vgpu,
-   int page_table_level, void *root_entry);
+struct intel_vgpu_mm *intel_gvt_find_ppgtt_mm(struct intel_vgpu *vgpu,
+ int page_table_level,
+ void *root_entry);
 
 struct intel_vgpu_oos_page {
struct intel_vgpu_ppgtt_spt *spt;
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index a9381e404fd5..246f9cb625dc 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2388,19 +2388,17 @@ __i915_printk(struct drm_i915_private *dev_priv, const 
char *level,
__i915_printk(dev_priv, KERN_ERR, fmt, ##__VA_ARGS__)
 
 #ifdef CONFIG_COMPAT
-extern long i915_compat_ioctl(struct file *filp, unsigned int cmd,
- unsigned long arg);
+long i915_compat_ioctl(struct file *filp, unsigned int cmd, unsigned long arg);
 #else
 #define i915_compat_ioctl NULL
 #endif
 extern const struct dev_pm_ops i915_pm_ops;
 
-extern int i915_driver_load(struct pci_dev *pdev,
-   const struct pci_device_id *ent);
-extern void i915_driver_unload(struct drm_device *dev);
+int i915_driver_load(struct pci_dev *pdev, const struct pci_device_id *ent);
+void i915_driver_unload(struct drm_device *dev);
 
-extern void intel_engine_init_hangcheck(struct intel_engine_cs *engine);
-extern void intel_hangcheck_init(struct drm_i915_private *dev_priv);
+void intel_engine_init_hangcheck(struct intel_engine_cs *engine);
+void intel_hangcheck_init(struct drm_i915_private *dev_priv);
 int vlv_force_gfx_clock(struct drm_i915_private *dev_priv, bool on);
 
 u32 intel_calculate_mcr_s_ss_select(struct drm_i915_private *dev_priv);
@@ -2670,14 +2668,14 @@ int intel_engine_cmd_parser(struct intel_engine_cs 
*engine,
bool is_master);
 
 /* i915_perf.c */
-extern void i915_perf_init(struct drm_i915_private *dev_priv);
-extern void i915_perf_fini(struct drm_i915_private *dev_priv);
-extern void i915_perf_register(struct drm_i915_private *dev_priv);
-extern void i915_perf_unregister(struct drm_i915_private *dev_priv);
+void i915_perf_init(struct drm_i915_private *dev_priv);
+void i915_perf_fini(struct drm_i915_private *dev_priv);

[Intel-gfx] [PATCH v2 5/5] drm/i915: Propagate "_probe" function name suffix down

2019-07-11 Thread Janusz Krzysztofik
Similar to the "_release" and "_remove" cases, consistently replace
"_init" components of names of functions called from
i915_driver_probe() with "_probe" suffixes for better code readability.

Signed-off-by: Janusz Krzysztofik 
---
 drivers/gpu/drm/i915/i915_drv.c | 26 +-
 1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 4c4443757a36..ec4bb8038c9b 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -675,7 +675,7 @@ static const struct vga_switcheroo_client_ops 
i915_switcheroo_ops = {
.can_switch = i915_switcheroo_can_switch,
 };
 
-static int i915_load_modeset_init(struct drm_device *dev)
+static int i915_driver_modeset_probe(struct drm_device *dev)
 {
struct drm_i915_private *dev_priv = to_i915(dev);
struct pci_dev *pdev = dev_priv->drm.pdev;
@@ -884,7 +884,7 @@ static void intel_detect_preproduction_hw(struct 
drm_i915_private *dev_priv)
 }
 
 /**
- * i915_driver_init_early - setup state not requiring device access
+ * i915_driver_early_probe - setup state not requiring device access
  * @dev_priv: device private
  *
  * Initialize everything that is a "SW-only" state, that is state not
@@ -893,7 +893,7 @@ static void intel_detect_preproduction_hw(struct 
drm_i915_private *dev_priv)
  * system memory allocation, setting up device specific attributes and
  * function hooks not requiring accessing the device.
  */
-static int i915_driver_init_early(struct drm_i915_private *dev_priv)
+static int i915_driver_early_probe(struct drm_i915_private *dev_priv)
 {
int ret = 0;
 
@@ -963,7 +963,7 @@ static int i915_driver_init_early(struct drm_i915_private 
*dev_priv)
 
 /**
  * i915_driver_late_release - cleanup the setup done in
- *i915_driver_init_early()
+ *i915_driver_early_probe()
  * @dev_priv: device private
  */
 static void i915_driver_late_release(struct drm_i915_private *dev_priv)
@@ -980,7 +980,7 @@ static void i915_driver_late_release(struct 
drm_i915_private *dev_priv)
 }
 
 /**
- * i915_driver_init_mmio - setup device MMIO
+ * i915_driver_mmio_probe - setup device MMIO
  * @dev_priv: device private
  *
  * Setup minimal device state necessary for MMIO accesses later in the
@@ -988,7 +988,7 @@ static void i915_driver_late_release(struct 
drm_i915_private *dev_priv)
  * side effects or exposing the driver via kernel internal or user space
  * interfaces.
  */
-static int i915_driver_init_mmio(struct drm_i915_private *dev_priv)
+static int i915_driver_mmio_probe(struct drm_i915_private *dev_priv)
 {
int ret;
 
@@ -1029,7 +1029,7 @@ static int i915_driver_init_mmio(struct drm_i915_private 
*dev_priv)
 }
 
 /**
- * i915_driver_mmio_release - cleanup the setup done in i915_driver_init_mmio()
+ * i915_driver_mmio_release - cleanup the setup done in 
i915_driver_mmio_probe()
  * @dev_priv: device private
  */
 static void i915_driver_mmio_release(struct drm_i915_private *dev_priv)
@@ -1525,13 +1525,13 @@ static void edram_detect(struct drm_i915_private 
*dev_priv)
 }
 
 /**
- * i915_driver_init_hw - setup state requiring device access
+ * i915_driver_hw_probe - setup state requiring device access
  * @dev_priv: device private
  *
  * Setup state that requires accessing the device, but doesn't require
  * exposing the driver via kernel internal or userspace interfaces.
  */
-static int i915_driver_init_hw(struct drm_i915_private *dev_priv)
+static int i915_driver_hw_probe(struct drm_i915_private *dev_priv)
 {
struct pci_dev *pdev = dev_priv->drm.pdev;
int ret;
@@ -1900,7 +1900,7 @@ int i915_driver_probe(struct pci_dev *pdev, const struct 
pci_device_id *ent)
if (ret)
goto out_fini;
 
-   ret = i915_driver_init_early(dev_priv);
+   ret = i915_driver_early_probe(dev_priv);
if (ret < 0)
goto out_pci_disable;
 
@@ -1908,15 +1908,15 @@ int i915_driver_probe(struct pci_dev *pdev, const 
struct pci_device_id *ent)
 
i915_detect_vgpu(dev_priv);
 
-   ret = i915_driver_init_mmio(dev_priv);
+   ret = i915_driver_mmio_probe(dev_priv);
if (ret < 0)
goto out_runtime_pm_put;
 
-   ret = i915_driver_init_hw(dev_priv);
+   ret = i915_driver_hw_probe(dev_priv);
if (ret < 0)
goto out_cleanup_mmio;
 
-   ret = i915_load_modeset_init(&dev_priv->drm);
+   ret = i915_driver_modeset_probe(&dev_priv->drm);
if (ret < 0)
goto out_cleanup_hw;
 
-- 
2.21.0
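
To illustrate the pairing this series aims for, here is a rough, self-contained sketch of a probe function whose stages unwind in reverse order through goto labels.  It only mirrors the stage ordering visible in the i915_driver_probe() hunks; every demo_* name, the trace log and the failure selector are assumptions made for the example and are not driver code.

```c
#include <assert.h>
#include <string.h>

/* All names below are hypothetical; they only mirror the naming scheme. */
enum { DEMO_LOG_SIZE = 128 };
static char demo_log[DEMO_LOG_SIZE];
static int demo_fail_stage;	/* 0 = no failure, N = fail at stage N */

/* Append a stage name to the trace log, never overflowing it. */
static void demo_trace(const char *what)
{
	strncat(demo_log, what, sizeof(demo_log) - strlen(demo_log) - 1);
}

/* Run stage n: fail if selected, otherwise record that it ran. */
static int demo_stage(int n, const char *name)
{
	if (demo_fail_stage == n)
		return -1;
	demo_trace(name);
	return 0;
}

static int demo_early_probe(void) { return demo_stage(1, "early_probe "); }
static int demo_mmio_probe(void) { return demo_stage(2, "mmio_probe "); }
static int demo_hw_probe(void) { return demo_stage(3, "hw_probe "); }
static void demo_mmio_release(void) { demo_trace("mmio_release "); }
static void demo_late_release(void) { demo_trace("late_release "); }

int demo_driver_probe(int fail_stage)
{
	int ret;

	demo_log[0] = '\0';
	demo_fail_stage = fail_stage;

	ret = demo_early_probe();
	if (ret < 0)
		goto out;

	ret = demo_mmio_probe();
	if (ret < 0)
		goto out_late_release;

	ret = demo_hw_probe();
	if (ret < 0)
		goto out_mmio_release;

	return 0;

	/* Unwind in reverse order of setup. */
out_mmio_release:
	demo_mmio_release();
out_late_release:
	demo_late_release();
out:
	return ret;
}
```

With matching suffixes, each label in the unwind ladder names the phase it undoes, which is the readability gain the commit message refers to.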


[Intel-gfx] [PATCH v2 1/5] drm/i915: Rename "_load"/"_unload" to match PCI entry points

2019-07-11 Thread Janusz Krzysztofik
Current names of i915_driver_load/unload() functions originate in
legacy DRM stubs.  Reduce nomenclature ambiguity by renaming them to
match their current use as helpers called from PCI entry points.

Suggested-by: Chris Wilson 
Signed-off-by: Janusz Krzysztofik 
---
 drivers/gpu/drm/i915/i915_drv.c | 8 
 drivers/gpu/drm/i915/i915_drv.h | 4 ++--
 drivers/gpu/drm/i915/i915_pci.c | 4 ++--
 3 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 12182d2fc03c..8b72ae7c1f5d 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -1870,17 +1870,17 @@ static void i915_driver_destroy(struct drm_i915_private 
*i915)
 }
 
 /**
- * i915_driver_load - setup chip and create an initial config
+ * i915_driver_probe - setup chip and create an initial config
  * @pdev: PCI device
  * @ent: matching PCI ID entry
  *
- * The driver load routine has to do several things:
+ * The driver probe routine has to do several things:
  *   - drive output discovery via intel_modeset_init()
  *   - initialize the memory manager
  *   - allocate initial config memory
  *   - setup the DRM framebuffer with the allocated memory
  */
-int i915_driver_load(struct pci_dev *pdev, const struct pci_device_id *ent)
+int i915_driver_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 {
const struct intel_device_info *match_info =
(struct intel_device_info *)ent->driver_data;
@@ -1946,7 +1946,7 @@ int i915_driver_load(struct pci_dev *pdev, const struct 
pci_device_id *ent)
return ret;
 }
 
-void i915_driver_unload(struct drm_device *dev)
+void i915_driver_remove(struct drm_device *dev)
 {
struct drm_i915_private *dev_priv = to_i915(dev);
struct pci_dev *pdev = dev_priv->drm.pdev;
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 246f9cb625dc..7d650475790e 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2394,8 +2394,8 @@ long i915_compat_ioctl(struct file *filp, unsigned int 
cmd, unsigned long arg);
 #endif
 extern const struct dev_pm_ops i915_pm_ops;
 
-int i915_driver_load(struct pci_dev *pdev, const struct pci_device_id *ent);
-void i915_driver_unload(struct drm_device *dev);
+int i915_driver_probe(struct pci_dev *pdev, const struct pci_device_id *ent);
+void i915_driver_remove(struct drm_device *dev);
 
 void intel_engine_init_hangcheck(struct intel_engine_cs *engine);
 void intel_hangcheck_init(struct drm_i915_private *dev_priv);
diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
index 94b588e0a1dd..786ca7b3439b 100644
--- a/drivers/gpu/drm/i915/i915_pci.c
+++ b/drivers/gpu/drm/i915/i915_pci.c
@@ -848,7 +848,7 @@ static void i915_pci_remove(struct pci_dev *pdev)
if (!dev) /* driver load aborted, nothing to cleanup */
return;
 
-   i915_driver_unload(dev);
+   i915_driver_remove(dev);
drm_dev_put(dev);
 
pci_set_drvdata(pdev, NULL);
@@ -923,7 +923,7 @@ static int i915_pci_probe(struct pci_dev *pdev, const 
struct pci_device_id *ent)
if (vga_switcheroo_client_probe_defer(pdev))
return -EPROBE_DEFER;
 
-   err = i915_driver_load(pdev, ent);
+   err = i915_driver_probe(pdev, ent);
if (err)
return err;
 
-- 
2.21.0


[Intel-gfx] [PATCH v2 2/5] drm/i915: Replace "_load" with "_probe" consequently

2019-07-11 Thread Janusz Krzysztofik
Use the "_probe" nomenclature not only in i915_driver_probe() helper
name but also in other related function / variable names for
consistency.  Only the userspace-exposed name of the related module
parameter is left untouched.

Signed-off-by: Janusz Krzysztofik 
---
 .../gpu/drm/i915/display/intel_connector.c|  2 +-
 drivers/gpu/drm/i915/gt/intel_engine_cs.c |  2 +-
 drivers/gpu/drm/i915/i915_drv.c   | 20 +--
 drivers/gpu/drm/i915/i915_drv.h   | 10 +-
 drivers/gpu/drm/i915/i915_gem.c   |  8 
 drivers/gpu/drm/i915/i915_pci.c   |  2 +-
 drivers/gpu/drm/i915/intel_gvt.c  |  2 +-
 drivers/gpu/drm/i915/intel_uncore.c   |  2 +-
 drivers/gpu/drm/i915/intel_wopcm.c|  2 +-
 9 files changed, 25 insertions(+), 25 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_connector.c 
b/drivers/gpu/drm/i915/display/intel_connector.c
index 41310f8e5a2a..d0163d86c42a 100644
--- a/drivers/gpu/drm/i915/display/intel_connector.c
+++ b/drivers/gpu/drm/i915/display/intel_connector.c
@@ -118,7 +118,7 @@ int intel_connector_register(struct drm_connector 
*connector)
if (ret)
goto err;
 
-   if (i915_inject_load_failure()) {
+   if (i915_inject_probe_failure()) {
ret = -EFAULT;
goto err_backlight;
}
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index bdf279fa3b2e..375b0561bd1d 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -426,7 +426,7 @@ int intel_engines_init_mmio(struct drm_i915_private *i915)
WARN_ON(engine_mask &
GENMASK(BITS_PER_TYPE(mask) - 1, I915_NUM_ENGINES));
 
-   if (i915_inject_load_failure())
+   if (i915_inject_probe_failure())
return -ENODEV;
 
for (i = 0; i < ARRAY_SIZE(intel_engines); i++) {
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 8b72ae7c1f5d..ad24957ad86d 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -81,14 +81,14 @@
 static struct drm_driver driver;
 
 #if IS_ENABLED(CONFIG_DRM_I915_DEBUG)
-static unsigned int i915_load_fail_count;
+static unsigned int i915_probe_fail_count;
 
-bool __i915_inject_load_failure(const char *func, int line)
+bool __i915_inject_probe_failure(const char *func, int line)
 {
-   if (i915_load_fail_count >= i915_modparams.inject_load_failure)
+   if (i915_probe_fail_count >= i915_modparams.inject_load_failure)
return false;
 
-   if (++i915_load_fail_count == i915_modparams.inject_load_failure) {
+   if (++i915_probe_fail_count == i915_modparams.inject_load_failure) {
DRM_INFO("Injecting failure at checkpoint %u [%s:%d]\n",
 i915_modparams.inject_load_failure, func, line);
i915_modparams.inject_load_failure = 0;
@@ -100,7 +100,7 @@ bool __i915_inject_load_failure(const char *func, int line)
 
 bool i915_error_injected(void)
 {
-   return i915_load_fail_count && !i915_modparams.inject_load_failure;
+   return i915_probe_fail_count && !i915_modparams.inject_load_failure;
 }
 
 #endif
@@ -681,7 +681,7 @@ static int i915_load_modeset_init(struct drm_device *dev)
struct pci_dev *pdev = dev_priv->drm.pdev;
int ret;
 
-   if (i915_inject_load_failure())
+   if (i915_inject_probe_failure())
return -ENODEV;
 
if (HAS_DISPLAY(dev_priv)) {
@@ -897,7 +897,7 @@ static int i915_driver_init_early(struct drm_i915_private 
*dev_priv)
 {
int ret = 0;
 
-   if (i915_inject_load_failure())
+   if (i915_inject_probe_failure())
return -ENODEV;
 
intel_device_info_subplatform_init(dev_priv);
@@ -991,7 +991,7 @@ static int i915_driver_init_mmio(struct drm_i915_private 
*dev_priv)
 {
int ret;
 
-   if (i915_inject_load_failure())
+   if (i915_inject_probe_failure())
return -ENODEV;
 
if (i915_get_bridge_dev(dev_priv))
@@ -1535,7 +1535,7 @@ static int i915_driver_init_hw(struct drm_i915_private 
*dev_priv)
struct pci_dev *pdev = dev_priv->drm.pdev;
int ret;
 
-   if (i915_inject_load_failure())
+   if (i915_inject_probe_failure())
return -ENODEV;
 
intel_device_info_runtime_init(dev_priv);
@@ -1941,7 +1941,7 @@ int i915_driver_probe(struct pci_dev *pdev, const struct 
pci_device_id *ent)
 out_pci_disable:
pci_disable_device(pdev);
 out_fini:
-   i915_load_error(dev_priv, "Device initialization failed (%d)\n", ret);
+   i915_probe_error(dev_priv, "Device initialization failed (%d)\n", ret);
i915_driver_destroy(dev_priv);
return ret;
 }
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/dr
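
For context on what the renamed helper does, the checkpoint logic visible in the i915_drv.c hunk above can be sketched in isolation as follows.  This is an approximation under stated assumptions: the demo_* names and demo_arm() are hypothetical, the module parameter is replaced by a plain variable, and the return value after the parameter is cleared is assumed to be true because the hunk is cut off at that point.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the module parameter and the counter. */
static unsigned int demo_inject_param;	/* N: fail at Nth checkpoint, 0 = off */
static unsigned int demo_fail_count;	/* checkpoints counted so far */

/* Returns true exactly once, at the Nth checkpoint after arming. */
bool demo_inject_probe_failure(void)
{
	if (demo_fail_count >= demo_inject_param)
		return false;

	if (++demo_fail_count == demo_inject_param) {
		demo_inject_param = 0;	/* one-shot: disarm after firing */
		return true;		/* assumed; not shown in the hunk */
	}

	return false;
}

/* True once a failure has been injected and the parameter was cleared. */
bool demo_error_injected(void)
{
	return demo_fail_count && !demo_inject_param;
}

/* Helper for the example only: arm injection at checkpoint n. */
void demo_arm(unsigned int n)
{
	demo_inject_param = n;
	demo_fail_count = 0;
}
```

Arming with n = 3 makes the first two checkpoints pass and the third report a failure, after which demo_error_injected() lets error paths recognise that the failure was deliberate.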

[Intel-gfx] [PATCH v2 3/5] drm/i915: Propagate "_release" function name suffix down

2019-07-11 Thread Janusz Krzysztofik
Replace the mixed "_fini"/"_cleanup"/"_cleanup_hw" suffixes found in names
of functions called from i915_driver_release() with the "_release" suffix
consistently.  This provides better code readability, especially
helpful when trying to work out which phase the code is in.

Function names starting with "i915_driver_", i.e., those defined in
drivers/gpu/drm/i915/i915_drv.c, just have the "cleanup" or "fini"
parts of their names replaced with the "_release" suffix, while names
of functions coming from other source files have been suffixed with
"_driver_release" to avoid ambiguity with other possible .release entry
points.

v2: early_probe pairs better with late_release (Chris)

Suggested-by: Chris Wilson 
Signed-off-by: Janusz Krzysztofik 
---
 drivers/gpu/drm/i915/i915_drv.c | 33 +
 drivers/gpu/drm/i915/i915_drv.h |  2 +-
 drivers/gpu/drm/i915/i915_gem.c |  2 +-
 drivers/gpu/drm/i915/i915_gem_gtt.c |  4 +--
 drivers/gpu/drm/i915/i915_gem_gtt.h |  2 +-
 drivers/gpu/drm/i915/intel_runtime_pm.c |  2 +-
 drivers/gpu/drm/i915/intel_runtime_pm.h |  2 +-
 7 files changed, 24 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index ad24957ad86d..33bbe74cd441 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -752,7 +752,7 @@ static int i915_load_modeset_init(struct drm_device *dev)
 cleanup_gem:
i915_gem_suspend(dev_priv);
i915_gem_fini_hw(dev_priv);
-   i915_gem_fini(dev_priv);
+   i915_gem_driver_release(dev_priv);
 cleanup_modeset:
intel_modeset_cleanup(dev);
 cleanup_irq:
@@ -962,10 +962,11 @@ static int i915_driver_init_early(struct drm_i915_private 
*dev_priv)
 }
 
 /**
- * i915_driver_cleanup_early - cleanup the setup done in 
i915_driver_init_early()
+ * i915_driver_late_release - cleanup the setup done in
+ *i915_driver_init_early()
  * @dev_priv: device private
  */
-static void i915_driver_cleanup_early(struct drm_i915_private *dev_priv)
+static void i915_driver_late_release(struct drm_i915_private *dev_priv)
 {
intel_irq_fini(dev_priv);
intel_power_domains_cleanup(dev_priv);
@@ -1028,10 +1029,10 @@ static int i915_driver_init_mmio(struct drm_i915_private *dev_priv)
 }
 
 /**
- * i915_driver_cleanup_mmio - cleanup the setup done in i915_driver_init_mmio()
+ * i915_driver_mmio_release - cleanup the setup done in i915_driver_init_mmio()
  * @dev_priv: device private
  */
-static void i915_driver_cleanup_mmio(struct drm_i915_private *dev_priv)
+static void i915_driver_mmio_release(struct drm_i915_private *dev_priv)
 {
intel_teardown_mchbar(dev_priv);
intel_uncore_fini_mmio(&dev_priv->uncore);
@@ -1684,7 +1685,7 @@ static int i915_driver_init_hw(struct drm_i915_private *dev_priv)
pci_disable_msi(pdev);
pm_qos_remove_request(&dev_priv->pm_qos);
 err_ggtt:
-   i915_ggtt_cleanup_hw(dev_priv);
+   i915_ggtt_driver_release(dev_priv);
 err_perf:
i915_perf_fini(dev_priv);
return ret;
@@ -1929,15 +1930,15 @@ int i915_driver_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 
 out_cleanup_hw:
i915_driver_cleanup_hw(dev_priv);
-   i915_ggtt_cleanup_hw(dev_priv);
+   i915_ggtt_driver_release(dev_priv);
 
/* Paranoia: make sure we have disabled everything before we exit. */
intel_sanitize_gt_powersave(dev_priv);
 out_cleanup_mmio:
-   i915_driver_cleanup_mmio(dev_priv);
+   i915_driver_mmio_release(dev_priv);
 out_runtime_pm_put:
enable_rpm_wakeref_asserts(&dev_priv->runtime_pm);
-   i915_driver_cleanup_early(dev_priv);
+   i915_driver_late_release(dev_priv);
 out_pci_disable:
pci_disable_device(pdev);
 out_fini:
@@ -2000,19 +2001,19 @@ static void i915_driver_release(struct drm_device *dev)
 
disable_rpm_wakeref_asserts(rpm);
 
-   i915_gem_fini(dev_priv);
+   i915_gem_driver_release(dev_priv);
 
-   i915_ggtt_cleanup_hw(dev_priv);
+   i915_ggtt_driver_release(dev_priv);
 
/* Paranoia: make sure we have disabled everything before we exit. */
intel_sanitize_gt_powersave(dev_priv);
 
-   i915_driver_cleanup_mmio(dev_priv);
+   i915_driver_mmio_release(dev_priv);
 
enable_rpm_wakeref_asserts(rpm);
-   intel_runtime_pm_cleanup(rpm);
+   intel_runtime_pm_driver_release(rpm);
 
-   i915_driver_cleanup_early(dev_priv);
+   i915_driver_late_release(dev_priv);
i915_driver_destroy(dev_priv);
 }
 
@@ -2205,7 +2206,7 @@ static int i915_drm_suspend_late(struct drm_device *dev, bool hibernation)
 out:
enable_rpm_wakeref_asserts(rpm);
if (!dev_priv->uncore.user_forcewake.count)
-   intel_runtime_pm_cleanup(rpm);
+   intel_runtime_pm_driver

[Intel-gfx] [PATCH v2 0/5] drm/i915: Rename functions to match their entry points

2019-07-11 Thread Janusz Krzysztofik
Need for this was identified while working on split of driver unbind
path into _remove() and _release() parts.  Consistency in function
naming has been recognized as helpful when trying to work out which
phase the code is in.

v2: * early_probe pairs better with late_release (Chris),
* exclude patch 6/6 "drm/i915: Rename "inject_load_failure" module
  parameter" for now, it requires updates on user (IGT) side
* rebase on top of "drm/i915: Drop extern qualifiers from header
      function prototypes"

Janusz Krzysztofik (5):
  drm/i915: Rename "_load"/"_unload" to match PCI entry points
  drm/i915: Replace "_load" with "_probe" consequently
  drm/i915: Propagate "_release" function name suffix down
  drm/i915: Propagate "_remove" function name suffix down
  drm/i915: Propagate "_probe" function name suffix down

 drivers/gpu/drm/i915/display/intel_bios.c |   4 +-
 drivers/gpu/drm/i915/display/intel_bios.h |   2 +-
 .../gpu/drm/i915/display/intel_connector.c|   2 +-
 drivers/gpu/drm/i915/display/intel_display.c  |   2 +-
 .../drm/i915/display/intel_display_power.c|   6 +-
 .../drm/i915/display/intel_display_power.h|   2 +-
 drivers/gpu/drm/i915/gt/intel_engine_cs.c |   2 +-
 drivers/gpu/drm/i915/i915_drv.c   | 107 +-
 drivers/gpu/drm/i915/i915_drv.h   |  20 ++--
 drivers/gpu/drm/i915/i915_gem.c   |  12 +-
 drivers/gpu/drm/i915/i915_gem_gtt.c   |   4 +-
 drivers/gpu/drm/i915/i915_gem_gtt.h   |   2 +-
 drivers/gpu/drm/i915/i915_pci.c   |   6 +-
 drivers/gpu/drm/i915/intel_gvt.c  |   7 +-
 drivers/gpu/drm/i915/intel_gvt.h  |   5 +-
 drivers/gpu/drm/i915/intel_runtime_pm.c   |   2 +-
 drivers/gpu/drm/i915/intel_runtime_pm.h   |   2 +-
 drivers/gpu/drm/i915/intel_uncore.c   |   2 +-
 drivers/gpu/drm/i915/intel_wopcm.c|   2 +-
 19 files changed, 97 insertions(+), 94 deletions(-)

-- 
2.21.0


[Intel-gfx] [PATCH v2 4/5] drm/i915: Propagate "_remove" function name suffix down

2019-07-11 Thread Janusz Krzysztofik
Similar to the "_release" case, consistently replace mixed
"_cleanup"/"_fini"/"_fini_hw" components found in names of functions
called from i915_driver_remove() with "_remove" or "_driver_remove"
suffixes for better code readability.

Signed-off-by: Janusz Krzysztofik 
---
 drivers/gpu/drm/i915/display/intel_bios.c |  4 ++--
 drivers/gpu/drm/i915/display/intel_bios.h |  2 +-
 drivers/gpu/drm/i915/display/intel_display.c  |  2 +-
 .../drm/i915/display/intel_display_power.c|  6 ++---
 .../drm/i915/display/intel_display_power.h|  2 +-
 drivers/gpu/drm/i915/i915_drv.c   | 24 +--
 drivers/gpu/drm/i915/i915_drv.h   |  4 ++--
 drivers/gpu/drm/i915/i915_gem.c   |  2 +-
 drivers/gpu/drm/i915/intel_gvt.c  |  5 ++--
 drivers/gpu/drm/i915/intel_gvt.h  |  5 ++--
 10 files changed, 29 insertions(+), 27 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_bios.c b/drivers/gpu/drm/i915/display/intel_bios.c
index 4fdbb5c35d87..4f709f5ddf07 100644
--- a/drivers/gpu/drm/i915/display/intel_bios.c
+++ b/drivers/gpu/drm/i915/display/intel_bios.c
@@ -1893,10 +1893,10 @@ void intel_bios_init(struct drm_i915_private *dev_priv)
 }
 
 /**
- * intel_bios_cleanup - Free any resources allocated by intel_bios_init()
+ * intel_bios_driver_remove - Free any resources allocated by intel_bios_init()
  * @dev_priv: i915 device instance
  */
-void intel_bios_cleanup(struct drm_i915_private *dev_priv)
+void intel_bios_driver_remove(struct drm_i915_private *dev_priv)
 {
kfree(dev_priv->vbt.child_dev);
dev_priv->vbt.child_dev = NULL;
diff --git a/drivers/gpu/drm/i915/display/intel_bios.h b/drivers/gpu/drm/i915/display/intel_bios.h
index 0b7be6389a07..4969189e620f 100644
--- a/drivers/gpu/drm/i915/display/intel_bios.h
+++ b/drivers/gpu/drm/i915/display/intel_bios.h
@@ -228,7 +228,7 @@ struct mipi_pps_data {
 } __packed;
 
 void intel_bios_init(struct drm_i915_private *dev_priv);
-void intel_bios_cleanup(struct drm_i915_private *dev_priv);
+void intel_bios_driver_remove(struct drm_i915_private *dev_priv);
 bool intel_bios_is_valid_vbt(const void *buf, size_t size);
 bool intel_bios_is_tv_present(struct drm_i915_private *dev_priv);
 bool intel_bios_is_lvds_present(struct drm_i915_private *dev_priv, u8 *i2c_pin);
diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
index 0286b97caa22..daf73c2d23c2 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -17073,7 +17073,7 @@ static void intel_hpd_poll_fini(struct drm_device *dev)
drm_connector_list_iter_end(&conn_iter);
 }
 
-void intel_modeset_cleanup(struct drm_device *dev)
+void intel_modeset_driver_remove(struct drm_device *dev)
 {
struct drm_i915_private *dev_priv = to_i915(dev);
 
diff --git a/drivers/gpu/drm/i915/display/intel_display_power.c b/drivers/gpu/drm/i915/display/intel_display_power.c
index 7e22a2704843..db89550e3b6b 100644
--- a/drivers/gpu/drm/i915/display/intel_display_power.c
+++ b/drivers/gpu/drm/i915/display/intel_display_power.c
@@ -4429,7 +4429,7 @@ static void intel_power_domains_verify_state(struct drm_i915_private *dev_priv);
  *
  * It will return with power domains disabled (to be enabled later by
  * intel_power_domains_enable()) and must be paired with
- * intel_power_domains_fini_hw().
+ * intel_power_domains_driver_remove().
  */
 void intel_power_domains_init_hw(struct drm_i915_private *i915, bool resume)
 {
@@ -4481,7 +4481,7 @@ void intel_power_domains_init_hw(struct drm_i915_private *i915, bool resume)
 }
 
 /**
- * intel_power_domains_fini_hw - deinitialize hw power domain state
+ * intel_power_domains_driver_remove - deinitialize hw power domain state
  * @i915: i915 device instance
  *
  * De-initializes the display power domain HW state. It also ensures that the
@@ -4491,7 +4491,7 @@ void intel_power_domains_init_hw(struct drm_i915_private *i915, bool resume)
  * intel_power_domains_disable()) and must be paired with
  * intel_power_domains_init_hw().
  */
-void intel_power_domains_fini_hw(struct drm_i915_private *i915)
+void intel_power_domains_driver_remove(struct drm_i915_private *i915)
 {
intel_wakeref_t wakeref __maybe_unused =
fetch_and_zero(&i915->power_domains.wakeref);
diff --git a/drivers/gpu/drm/i915/display/intel_display_power.h b/drivers/gpu/drm/i915/display/intel_display_power.h
index 8f43f7051a16..dbd1f5ef01d1 100644
--- a/drivers/gpu/drm/i915/display/intel_display_power.h
+++ b/drivers/gpu/drm/i915/display/intel_display_power.h
@@ -214,7 +214,7 @@ void gen9_enable_dc5(struct drm_i915_private *dev_priv);
 int intel_power_domains_init(struct drm_i915_private *dev_priv);
 void intel_power_domains_cleanup(struct drm_i915_private *dev_priv);
 void intel_power_domains_init_hw(struct drm_i9

[Intel-gfx] [PATCH 1/2] drm/i915/gem: Avoid taking runtime-pm under the shrinker

2022-07-20 Thread Janusz Krzysztofik
From: Chris Wilson 

Inside the shrinker, we cannot wake the device as that may cause
recursion into fs-reclaim, so instead we only unbind vma if the device
is currently awake. (In order to provide reclaim while asleep, we do
wake the device up during kswapd -- we probably want to limit that wake
up if we have anything to shrink though!)

To avoid the same fs_reclaim recursion potential during
i915_gem_object_unbind, we acquire a wakeref there, see commit
3e817471a34c ("drm/i915/gem: Take runtime-pm wakeref prior to unbinding").
However, we use i915_gem_object_unbind from the shrinker path to make the
object available for shrinking and so we must make the wakeref acquisition
here conditional.
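
A minimal userspace sketch of the conditional acquisition, assuming a
simple usage counter stands in for the runtime-pm state.  The demo_*
names and the from_shrinker flag are hypothetical and are not the
functions touched by this patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Toy model of a runtime-pm usage counter; 0 means the device is asleep. */
struct demo_rpm {
	atomic_int usage;
};

/* Unconditional get: in a real driver this may resume the device. */
static void demo_rpm_get(struct demo_rpm *rpm)
{
	atomic_fetch_add(&rpm->usage, 1);
}

/* Conditional get: take a reference only if the device is already in use. */
static bool demo_rpm_get_if_in_use(struct demo_rpm *rpm)
{
	int old = atomic_load(&rpm->usage);

	/* on failure the compare-exchange reloads 'old', so re-check it */
	while (old > 0) {
		if (atomic_compare_exchange_weak(&rpm->usage, &old, old + 1))
			return true;
	}
	return false;
}

static void demo_rpm_put(struct demo_rpm *rpm)
{
	int old = atomic_fetch_sub(&rpm->usage, 1);

	assert(old > 0);	/* a put must always match an earlier get */
	(void)old;
}

/*
 * Hypothetical unbind: from_shrinker selects the conditional path.
 * Returns true if the unbind was performed.
 */
static bool demo_object_unbind(struct demo_rpm *rpm, bool from_shrinker)
{
	if (from_shrinker) {
		if (!demo_rpm_get_if_in_use(rpm))
			return false;	/* asleep: skip rather than wake */
	} else {
		demo_rpm_get(rpm);
	}

	/* ... the actual unbind work would go here ... */

	demo_rpm_put(rpm);
	return true;
}
```

With the counter at zero, the shrinker path backs off without touching
the device, while a regular caller still gets its reference.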

<4> [437.542172] ==
<4> [437.542174] WARNING: possible circular locking dependency detected
<4> [437.542176] 5.19.0-rc6-CI_DRM_11876-g2305e0d00665+ #1 Tainted: G U
<4> [437.542179] --
<4> [437.542181] kswapd0/93 is trying to acquire lock:
<4> [437.542183] 827a7608 (acpi_wakeup_lock){+.+.}-{3:3}, at: acpi_device_wakeup_disable+0x12/0x50
<4> [437.542191]
but task is already holding lock:
<4> [437.542194] 8275d360 (fs_reclaim){+.+.}-{0:0}, at: balance_pgdat+0x91/0x5c0
<4> [437.542199]
which lock already depends on the new lock.
<4> [437.542202]
the existing dependency chain (in reverse order) is:
<4> [437.542204]
-> #2 (fs_reclaim){+.+.}-{0:0}:
<4> [437.542207]fs_reclaim_acquire+0x9d/0xd0
<4> [437.542211]kmem_cache_alloc_trace+0x2a/0x250
<4> [437.542214]__acpi_device_add+0x263/0x3a0
<4> [437.542217]acpi_add_single_object+0x3ea/0x710
<4> [437.542220]acpi_bus_check_add+0xf7/0x240
<4> [437.54]acpi_bus_scan+0x34/0xf0
<4> [437.542224]acpi_scan_init+0xf5/0x241
<4> [437.542228]acpi_init+0x449/0x4aa
<4> [437.542230]do_one_initcall+0x53/0x2e0
<4> [437.542233]kernel_init_freeable+0x18f/0x1dd
<4> [437.542236]kernel_init+0x11/0x110
<4> [437.542239]ret_from_fork+0x1f/0x30
<4> [437.542241]
-> #1 (acpi_device_lock){+.+.}-{3:3}:
<4> [437.542245]__mutex_lock+0x97/0xf20
<4> [437.542246]acpi_enable_wakeup_device_power+0x30/0xf0
<4> [437.542249]__acpi_device_wakeup_enable+0x31/0x110
<4> [437.542252]acpi_pm_set_device_wakeup+0x55/0x100
<4> [437.542254]__pci_enable_wake+0x5e/0xa0
<4> [437.542257]pci_finish_runtime_suspend+0x32/0x70
<4> [437.542259]pci_pm_runtime_suspend+0xa3/0x160
<4> [437.542262]__rpm_callback+0x3d/0x110
<4> [437.542265]rpm_callback+0x54/0x60
<4> [437.542268]rpm_suspend.part.10+0x105/0x5a0
<4> [437.542270]pm_runtime_work+0x7d/0x1e0
<4> [437.542273]process_one_work+0x272/0x5c0
<4> [437.542276]worker_thread+0x37/0x370
<4> [437.542278]kthread+0xed/0x120
<4> [437.542280]ret_from_fork+0x1f/0x30
<4> [437.542282]
-> #0 (acpi_wakeup_lock){+.+.}-{3:3}:
<4> [437.542285]__lock_acquire+0x15ad/0x2940
<4> [437.542288]lock_acquire+0xd3/0x310
<4> [437.542291]__mutex_lock+0x97/0xf20
<4> [437.542293]acpi_device_wakeup_disable+0x12/0x50
<4> [437.542295]acpi_pm_set_device_wakeup+0x6e/0x100
<4> [437.542297]__pci_enable_wake+0x73/0xa0
<4> [437.542300]pci_pm_runtime_resume+0x45/0x90
<4> [437.542302]__rpm_callback+0x3d/0x110
<4> [437.542304]rpm_callback+0x54/0x60
<4> [437.542307]rpm_resume+0x54f/0x750
<4> [437.542309]__pm_runtime_resume+0x42/0x80
<4> [437.542311]__intel_runtime_pm_get+0x19/0x80 [i915]
<4> [437.542386]i915_gem_object_unbind+0x8f/0x3b0 [i915]
<4> [437.542487]i915_gem_shrink+0x634/0x850 [i915]
<4> [437.542584]i915_gem_shrinker_scan+0x3a/0xc0 [i915]
<4> [437.542679]shrink_slab.constprop.97+0x1a4/0x4f0
<4> [437.542684]shrink_node+0x21e/0x420
<4> [437.542687]balance_pgdat+0x241/0x5c0
<4> [437.542690]kswapd+0x229/0x4f0
<4> [437.542694]kthread+0xed/0x120
<4> [437.542697]ret_from_fork+0x1f/0x30
<4> [437.542701]
other info that might help us debug this:
<4> [437.542705] Chain exists of:
  acpi_wakeup_lock --> acpi_device_lock --> fs_reclaim
<4> [437.542713]  Possible unsafe locking scenario:
<4> [437.542716]CPU0CPU1
<4> [437.542719]
<4> [437.542721]   lock(fs_reclaim);
<4> [437.542725]    lock(acpi_device_lock);
<4

[Intel-gfx] [RFC PATCH 2/2] drm/i915/gem: Perform active shrinking from a background thread

2022-07-20 Thread Janusz Krzysztofik
From: Chris Wilson 

i915 is very greedy and will retain system pages for as long as the user
requires them; once acquired they will be only returned when the object
is freed. In order to respond to system memory pressure, i915 hooks into
the shrinker subsystem, designed to prune the filesystem caches, to
unbind and return system pages. However, we can only do so if the device
is active at that moment, as we cannot resume the device from inside
direct reclaim to unbind pages from the GPU, nor do we want to delay
random processes with unbound waits trying to reclaim active pages. To
work around that quandary, what we avoided in direct reclaim we
delegated to kswapd, as that is run from process context outside of
direct reclaim and able to sleep and resume the device.

In practice, kswapd also uses fs_reclaim_acquire() around its
shrink_slab calls, prohibiting runtime resume. If we cannot wake the
device from idle, we will retain system memory indefinitely.

As we cannot take advantage of kswapd's decoupled process context to
perform an active reclaim of bound pages, spawn our own kthread to wait
under our wakeref. Similar to kswapd, there is no direct dependency on
the background task to direct reclaim (other than failure to promptly
return pages will implicitly result in oom), as such the task itself does
not inherit the fs-reclaim context. A page reclaimed by i915 will
typically not immediately be available for re-use, as it will require
writeback, and so only a future allocation attempt may benefit.
Concurrent page allocation attempts do not wait for either kswapd or our
own swapper task.

We mark our kthread as a memallocator (allowed to dip into memory
reserves, but not allowed to trigger direct reclaim) and mark up
the call to the shrinker with a fs_reclaim critical section. This
should prevent us from accidentally abusing the background swapper task,
and so the swapper kthread behaves like kswapd with the exception of
being allowed to wake the device up, and being decoupled from the
shrinker_rwsem.
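
A rough userspace model of the hand-off between reclaim and the
background thread, assuming (as the atomic_long_xchg() in the swapper
loop suggests) that requests accumulate in an atomic target which the
thread claims in one go.  The demo_* names are hypothetical and the
wake-up itself is left out.

```c
#include <assert.h>
#include <stdatomic.h>

/* Pending work shared between the reclaim path and the background thread. */
struct demo_swapper {
	atomic_long target;	/* pages requested but not yet scanned */
};

/* Reclaim side: only record the request, never wake the device here. */
static void demo_swapper_request(struct demo_swapper *s, long nr_pages)
{
	assert(nr_pages > 0);
	atomic_fetch_add(&s->target, nr_pages);
	/* a real driver would wake the background thread at this point */
}

/*
 * Background thread side: claim everything queued so far and reset the
 * target to zero in a single atomic step, so no request is lost or
 * counted twice.
 */
static long demo_swapper_claim(struct demo_swapper *s)
{
	return atomic_exchange(&s->target, 0);
}
```

Two requests made before the thread runs are folded into a single
claim; a second claim with nothing queued returns zero, which is the
condition the thread would sleep on.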

Reported-by: Thomas Hellström 
Bug: https://gitlab.freedesktop.org/drm/intel/-/issues/6449
Fixes: 178a30c90ac7 ("drm/i915: Unbind objects in shrinker only if device is runtime active")
Signed-off-by: Chris Wilson 
Cc: Thomas Hellström 
Cc: Matthew Auld 
Cc: Tvrtko Ursulin 
Cc: sta...@vger.kernel.org # v4.8+
Signed-off-by: Janusz Krzysztofik 
---
 drivers/gpu/drm/i915/gem/i915_gem_shrinker.c | 134 +--
 drivers/gpu/drm/i915/i915_drv.h  |  15 +++
 2 files changed, 135 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
index 1030053571a2..bc6c1978e64a 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
@@ -310,6 +310,113 @@ i915_gem_shrinker_count(struct shrinker *shrinker, struct shrink_control *sc)
return count;
 }
 
+static unsigned long run_swapper(struct drm_i915_private *i915,
+unsigned long target,
+unsigned long *nr_scanned)
+{
+   return i915_gem_shrink(NULL, i915,
+  target, nr_scanned,
+  I915_SHRINK_ACTIVE |
+  I915_SHRINK_BOUND |
+  I915_SHRINK_UNBOUND |
+  I915_SHRINK_WRITEBACK);
+}
+
+static int swapper(void *arg)
+{
+   struct drm_i915_private *i915 = arg;
+   atomic_long_t *target = &i915->mm.swapper.target;
+   unsigned int noreclaim_state;
+
+   /*
+* For us to be running the swapper implies that the system is under
+* enough memory pressure to be swapping. At that point, we both want
+* to ensure we make forward progress in order to reclaim pages from
+* the device and not contribute further to direct reclaim pressure. We
+* mark ourselves as a memalloc task in order to not trigger direct
+* reclaim ourselves, but dip into the system memory reserves for
+* shrinkers.
+*/
+   noreclaim_state = memalloc_noreclaim_save();
+
+   do {
+   intel_wakeref_t wakeref;
+
+   ___wait_var_event(target,
+ atomic_long_read(target) ||
+ kthread_should_stop(),
+ TASK_IDLE, 0, 0, schedule());
+   if (kthread_should_stop())
+   break;
+
+   with_intel_runtime_pm(&i915->runtime_pm, wakeref) {
+   unsigned long nr_scan = atomic_long_xchg(target, 0);
+
+   /*
+* Now that we have woken up the device hierarchy,
+* act as a normal shrinker. Our shrinker is primarily
+* focussed on supporting direct reclaim (low latency,
+*

Re: [Intel-gfx] [PATCH 1/2] drm/i915/gem: Avoid taking runtime-pm under the shrinker

2022-07-29 Thread Janusz Krzysztofik
Hi Matthew,

Thanks for the review.

On Tuesday, 26 July 2022 20:14:05 CEST Matthew Auld wrote:
> On 20/07/2022 11:16, Janusz Krzysztofik wrote:
> > From: Chris Wilson 
> > 
> > Inside the shrinker, we cannot wake the device as that may cause
> > recursion into fs-reclaim, so instead we only unbind vma if the device
> > is currently awake. (In order to provide reclaim while asleep, we do
> > wake the device up during kswapd -- we probably want to limit that wake
> > up if we have anything to shrink though!)
> > 
> > To avoid the same fs_reclaim recursion potential during
> > i915_gem_object_unbind, we acquire a wakeref there, see commit
> > 3e817471a34c ("drm/i915/gem: Take runtime-pm wakeref prior to unbinding").
> > However, we use i915_gem_object_unbind from the shrinker path to make the
> > object available for shrinking and so we must make the wakeref acquisition
> > here conditional.
> > 
> > <4> [437.542172] ==
> > <4> [437.542174] WARNING: possible circular locking dependency detected
> > <4> [437.542176] 5.19.0-rc6-CI_DRM_11876-g2305e0d00665+ #1 Tainted: G U
> > <4> [437.542179] --
> > <4> [437.542181] kswapd0/93 is trying to acquire lock:
> > <4> [437.542183] 827a7608 (acpi_wakeup_lock){+.+.}-{3:3}, at: acpi_device_wakeup_disable+0x12/0x50
> > <4> [437.542191]
> > but task is already holding lock:
> > <4> [437.542194] 8275d360 (fs_reclaim){+.+.}-{0:0}, at: balance_pgdat+0x91/0x5c0
> > <4> [437.542199]
> > which lock already depends on the new lock.
> > <4> [437.542202]
> > the existing dependency chain (in reverse order) is:
> > <4> [437.542204]
> > -> #2 (fs_reclaim){+.+.}-{0:0}:
> > <4> [437.542207]fs_reclaim_acquire+0x9d/0xd0
> > <4> [437.542211]kmem_cache_alloc_trace+0x2a/0x250
> > <4> [437.542214]__acpi_device_add+0x263/0x3a0
> > <4> [437.542217]acpi_add_single_object+0x3ea/0x710
> > <4> [437.542220]acpi_bus_check_add+0xf7/0x240
> > <4> [437.54]acpi_bus_scan+0x34/0xf0
> > <4> [437.542224]acpi_scan_init+0xf5/0x241
> > <4> [437.542228]acpi_init+0x449/0x4aa
> > <4> [437.542230]do_one_initcall+0x53/0x2e0
> > <4> [437.542233]kernel_init_freeable+0x18f/0x1dd
> > <4> [437.542236]kernel_init+0x11/0x110
> > <4> [437.542239]ret_from_fork+0x1f/0x30
> > <4> [437.542241]
> > -> #1 (acpi_device_lock){+.+.}-{3:3}:
> > <4> [437.542245]__mutex_lock+0x97/0xf20
> > <4> [437.542246]acpi_enable_wakeup_device_power+0x30/0xf0
> > <4> [437.542249]__acpi_device_wakeup_enable+0x31/0x110
> > <4> [437.542252]acpi_pm_set_device_wakeup+0x55/0x100
> > <4> [437.542254]__pci_enable_wake+0x5e/0xa0
> > <4> [437.542257]pci_finish_runtime_suspend+0x32/0x70
> > <4> [437.542259]pci_pm_runtime_suspend+0xa3/0x160
> > <4> [437.542262]__rpm_callback+0x3d/0x110
> > <4> [437.542265]rpm_callback+0x54/0x60
> > <4> [437.542268]rpm_suspend.part.10+0x105/0x5a0
> > <4> [437.542270]pm_runtime_work+0x7d/0x1e0
> > <4> [437.542273]process_one_work+0x272/0x5c0
> > <4> [437.542276]worker_thread+0x37/0x370
> > <4> [437.542278]kthread+0xed/0x120
> > <4> [437.542280]ret_from_fork+0x1f/0x30
> > <4> [437.542282]
> > -> #0 (acpi_wakeup_lock){+.+.}-{3:3}:
> > <4> [437.542285]__lock_acquire+0x15ad/0x2940
> > <4> [437.542288]lock_acquire+0xd3/0x310
> > <4> [437.542291]__mutex_lock+0x97/0xf20
> > <4> [437.542293]acpi_device_wakeup_disable+0x12/0x50
> > <4> [437.542295]acpi_pm_set_device_wakeup+0x6e/0x100
> > <4> [437.542297]__pci_enable_wake+0x73/0xa0
> > <4> [437.542300]pci_pm_runtime_resume+0x45/0x90
> > <4> [437.542302]__rpm_callback+0x3d/0x110
> > <4> [437.542304]rpm_callback+0x54/0x60
> > <4> [437.542307]rpm_resume+0x54f/0x750
> > <4> [437.542309]__pm_runtime_resume+0x42/0x80
> > <4> [437.542311]__intel_runtime_pm_get+0x19/0x80 [i915]
> > <4> [437.542386]i915_gem_object_unbind+0x8f/0x3b0 [i915]
> > <4> [437.

[Intel-gfx] [RESUBMIT][PATCH 1/2] drm/i915/gem: Avoid taking runtime-pm under the shrinker

2022-07-29 Thread Janusz Krzysztofik
From: Chris Wilson 

Inside the shrinker, we cannot wake the device as that may cause
recursion into fs-reclaim, so instead we only unbind vma if the device
is currently awake. (In order to provide reclaim while asleep, we do
wake the device up during kswapd -- we probably want to limit that wake
up if we have anything to shrink though!)

To avoid the same fs_reclaim recursion potential during
i915_gem_object_unbind, we acquire a wakeref there, see commit
3e817471a34c ("drm/i915/gem: Take runtime-pm wakeref prior to unbinding").
However, we use i915_gem_object_unbind from the shrinker path to make the
object available for shrinking and so we must make the wakeref acquisition
here conditional.

<4> [437.542172] ==
<4> [437.542174] WARNING: possible circular locking dependency detected
<4> [437.542176] 5.19.0-rc6-CI_DRM_11876-g2305e0d00665+ #1 Tainted: G U
<4> [437.542179] --
<4> [437.542181] kswapd0/93 is trying to acquire lock:
<4> [437.542183] 827a7608 (acpi_wakeup_lock){+.+.}-{3:3}, at: acpi_device_wakeup_disable+0x12/0x50
<4> [437.542191]
but task is already holding lock:
<4> [437.542194] 8275d360 (fs_reclaim){+.+.}-{0:0}, at: balance_pgdat+0x91/0x5c0
<4> [437.542199]
which lock already depends on the new lock.
<4> [437.542202]
the existing dependency chain (in reverse order) is:
<4> [437.542204]
-> #2 (fs_reclaim){+.+.}-{0:0}:
<4> [437.542207]fs_reclaim_acquire+0x9d/0xd0
<4> [437.542211]kmem_cache_alloc_trace+0x2a/0x250
<4> [437.542214]__acpi_device_add+0x263/0x3a0
<4> [437.542217]acpi_add_single_object+0x3ea/0x710
<4> [437.542220]acpi_bus_check_add+0xf7/0x240
<4> [437.54]acpi_bus_scan+0x34/0xf0
<4> [437.542224]acpi_scan_init+0xf5/0x241
<4> [437.542228]acpi_init+0x449/0x4aa
<4> [437.542230]do_one_initcall+0x53/0x2e0
<4> [437.542233]kernel_init_freeable+0x18f/0x1dd
<4> [437.542236]kernel_init+0x11/0x110
<4> [437.542239]ret_from_fork+0x1f/0x30
<4> [437.542241]
-> #1 (acpi_device_lock){+.+.}-{3:3}:
<4> [437.542245]__mutex_lock+0x97/0xf20
<4> [437.542246]acpi_enable_wakeup_device_power+0x30/0xf0
<4> [437.542249]__acpi_device_wakeup_enable+0x31/0x110
<4> [437.542252]acpi_pm_set_device_wakeup+0x55/0x100
<4> [437.542254]__pci_enable_wake+0x5e/0xa0
<4> [437.542257]pci_finish_runtime_suspend+0x32/0x70
<4> [437.542259]pci_pm_runtime_suspend+0xa3/0x160
<4> [437.542262]__rpm_callback+0x3d/0x110
<4> [437.542265]rpm_callback+0x54/0x60
<4> [437.542268]rpm_suspend.part.10+0x105/0x5a0
<4> [437.542270]pm_runtime_work+0x7d/0x1e0
<4> [437.542273]process_one_work+0x272/0x5c0
<4> [437.542276]worker_thread+0x37/0x370
<4> [437.542278]kthread+0xed/0x120
<4> [437.542280]ret_from_fork+0x1f/0x30
<4> [437.542282]
-> #0 (acpi_wakeup_lock){+.+.}-{3:3}:
<4> [437.542285]__lock_acquire+0x15ad/0x2940
<4> [437.542288]lock_acquire+0xd3/0x310
<4> [437.542291]__mutex_lock+0x97/0xf20
<4> [437.542293]acpi_device_wakeup_disable+0x12/0x50
<4> [437.542295]acpi_pm_set_device_wakeup+0x6e/0x100
<4> [437.542297]__pci_enable_wake+0x73/0xa0
<4> [437.542300]pci_pm_runtime_resume+0x45/0x90
<4> [437.542302]__rpm_callback+0x3d/0x110
<4> [437.542304]rpm_callback+0x54/0x60
<4> [437.542307]rpm_resume+0x54f/0x750
<4> [437.542309]__pm_runtime_resume+0x42/0x80
<4> [437.542311]__intel_runtime_pm_get+0x19/0x80 [i915]
<4> [437.542386]i915_gem_object_unbind+0x8f/0x3b0 [i915]
<4> [437.542487]i915_gem_shrink+0x634/0x850 [i915]
<4> [437.542584]i915_gem_shrinker_scan+0x3a/0xc0 [i915]
<4> [437.542679]shrink_slab.constprop.97+0x1a4/0x4f0
<4> [437.542684]shrink_node+0x21e/0x420
<4> [437.542687]balance_pgdat+0x241/0x5c0
<4> [437.542690]kswapd+0x229/0x4f0
<4> [437.542694]kthread+0xed/0x120
<4> [437.542697]ret_from_fork+0x1f/0x30
<4> [437.542701]
other info that might help us debug this:
<4> [437.542705] Chain exists of:
  acpi_wakeup_lock --> acpi_device_lock --> fs_reclaim
<4> [437.542713]  Possible unsafe locking scenario:
<4> [437.542716]CPU0CPU1
<4> [437.542719]
<4> [437.542721]   lock(fs_reclaim);
<4> [437.542725]    lock(acpi_device_lock);
<

[Intel-gfx] [RESUBMIT][PATCH 2/2] drm/i915/gem: Perform active shrinking from a background thread

2022-07-29 Thread Janusz Krzysztofik
From: Chris Wilson 

i915 is very greedy and will retain system pages for as long as the user
requires them; once acquired they will be only returned when the object
is freed. In order to respond to system memory pressure, i915 hooks into
the shrinker subsystem, designed to prune the filesystem caches, to
unbind and return system pages. However, we can only do so if the device
is active at that moment, as we cannot resume the device from inside
direct reclaim to unbind pages from the GPU, nor do we want to delay
random processes with unbound waits trying to reclaim active pages. To
work around that quandary, what we avoided in direct reclaim we
delegated to kswapd, as that is run from process context outside of
direct reclaim and able to sleep and resume the device.

In practice, kswapd also uses fs_reclaim_acquire() around its
shrink_slab calls, prohibiting runtime resume. If we cannot wake the
device from idle, we will retain system memory indefinitely.

As we cannot take advantage of kswapd's decoupled process context to
perform an active reclaim of bound pages, spawn our own kthread to wait
under our wakeref. Similar to kswapd, there is no direct dependency on
the background task to direct reclaim (other than failure to promptly
return pages will implicitly result in oom), as such the task itself does
not inherit the fs-reclaim context. A page reclaimed by i915 will
typically not immediately be available for re-use, as it will require
writeback, and so only a future allocation attempt may benefit.
Concurrent page allocation attempts do not wait for either kswapd or our
own swapper task.

We mark our kthread as a memallocator (allowed to dip into memory
reserves, but not allowed to trigger direct reclaim) and mark up
the call to the shrinker with a fs_reclaim critical section. This
should prevent us from accidentally abusing the background swapper task,
and so the swapper kthread behaves like kswapd with the exception of
being allowed to wake the device up, and being decoupled from the
shrinker_rwsem.

Reported-by: Thomas Hellström 
Bug: https://gitlab.freedesktop.org/drm/intel/-/issues/6449
Fixes: 178a30c90ac7 ("drm/i915: Unbind objects in shrinker only if device is runtime active")
Signed-off-by: Chris Wilson 
Cc: Thomas Hellström 
Cc: Matthew Auld 
Cc: Tvrtko Ursulin 
Cc: sta...@vger.kernel.org # v4.8+
Signed-off-by: Janusz Krzysztofik 
---
Resubmit reason: drop RFC label.

 drivers/gpu/drm/i915/gem/i915_gem_shrinker.c | 134 +--
 drivers/gpu/drm/i915/i915_drv.h  |  15 +++
 2 files changed, 135 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
index 1030053571a2..bc6c1978e64a 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
@@ -310,6 +310,113 @@ i915_gem_shrinker_count(struct shrinker *shrinker, struct shrink_control *sc)
return count;
 }
 
+static unsigned long run_swapper(struct drm_i915_private *i915,
+unsigned long target,
+unsigned long *nr_scanned)
+{
+   return i915_gem_shrink(NULL, i915,
+  target, nr_scanned,
+  I915_SHRINK_ACTIVE |
+  I915_SHRINK_BOUND |
+  I915_SHRINK_UNBOUND |
+  I915_SHRINK_WRITEBACK);
+}
+
+static int swapper(void *arg)
+{
+   struct drm_i915_private *i915 = arg;
+   atomic_long_t *target = &i915->mm.swapper.target;
+   unsigned int noreclaim_state;
+
+   /*
+* For us to be running the swapper implies that the system is under
+* enough memory pressure to be swapping. At that point, we both want
+* to ensure we make forward progress in order to reclaim pages from
+* the device and not contribute further to direct reclaim pressure. We
+* mark ourselves as a memalloc task in order to not trigger direct
+* reclaim ourselves, but dip into the system memory reserves for
+* shrinkers.
+*/
+   noreclaim_state = memalloc_noreclaim_save();
+
+   do {
+   intel_wakeref_t wakeref;
+
+   ___wait_var_event(target,
+ atomic_long_read(target) ||
+ kthread_should_stop(),
+ TASK_IDLE, 0, 0, schedule());
+   if (kthread_should_stop())
+   break;
+
+   with_intel_runtime_pm(&i915->runtime_pm, wakeref) {
+   unsigned long nr_scan = atomic_long_xchg(target, 0);
+
+   /*
+* Now that we have woken up the device hierarchy,
+* act as a normal shrinker. Our shrinker is primarily
+* focussed on supportin
