Re: [Intel-gfx] [PATCH] drm/i915/guc: Initialize GuC submission locks and queues early

2022-02-18 Thread John Harrison
: https://gitlab.freedesktop.org/drm/intel/-/issues/4932 Signed-off-by: Daniele Ceraolo Spurio Cc: Matthew Brost Cc: John Harrison Reviewed-by: John Harrison --- .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 27 ++- 1 file changed, 14 insertions(+), 13 deletions(-) diff

Re: [Intel-gfx] [PATCH] drm/i915/guc: Fix flag query to not modify state

2022-02-08 Thread John Harrison
On 2/8/2022 01:39, Tvrtko Ursulin wrote: On 08/02/2022 02:07, john.c.harri...@intel.com wrote: From: John Harrison A flag query helper was actually writing to the flags word rather than just reading. Fix that. Also update the function's comment as it was out of date. Fixes: 0f7976506de61
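The class of bug described in this thread (a query helper that writes to the flags word it is only meant to read) can be sketched as below. This is a minimal userspace illustration, not the actual i915 code: `struct ctx_flags`, `FLAG_ENABLED`, `flag_query_buggy` and `flag_query_fixed` are hypothetical names chosen for the example.

```c
#include <stdbool.h>
#include <stdint.h>

#define FLAG_ENABLED (1u << 0)	/* hypothetical flag bit */

/* Hypothetical stand-in for an object carrying a flags word. */
struct ctx_flags {
	uint32_t word;
};

/* Buggy: "|=" sets the bit as a side effect of what should be a query. */
static bool flag_query_buggy(struct ctx_flags *f)
{
	return (f->word |= FLAG_ENABLED) != 0;
}

/* Fixed: a pure read; the const pointer lets the compiler enforce that. */
static bool flag_query_fixed(const struct ctx_flags *f)
{
	return (f->word & FLAG_ENABLED) != 0;
}
```

Taking a `const` pointer in the query helper is one way to make an accidental write a compile-time error rather than a silent state change.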

Re: [Intel-gfx] [PATCH v3] drm/i915/dg2: Define GuC firmware version for DG2

2022-02-07 Thread John Harrison
Hmm, this is actually v1 not v3! Had something stale when posting. John. On 2/7/2022 12:36, john.c.harri...@intel.com wrote: From: John Harrison First release of GuC for DG2. Signed-off-by: John Harrison CC: Tomasz Mistat CC: Ramalingam C CC: Daniele Ceraolo Spurio --- drivers/gpu

Re: [Intel-gfx] [PATCH v3 2/2] drm/i915/uapi: Add query for hwconfig table

2022-02-04 Thread John Harrison
FFF, 0x, 0xFF00, }; The attribute ids are defined in a hardware spec. Cc: Tvrtko Ursulin Cc: Kenneth Graunke Cc: Michal Wajdeczko Cc: Slawomir Milczarek Signed-off-by: Rodrigo Vivi Signed-off-by: John Harrison Reviewed-by: Matthew Brost --- drivers/gpu/drm/i915/i915_quer

Re: [Intel-gfx] [PATCH 1/2] drm/i915/pmu: Fix KMD and GuC race on accessing busyness

2022-01-28 Thread John Harrison
On 1/28/2022 01:34, Tvrtko Ursulin wrote: John, What CI results were used to merge this particular single patch? Unless I am not seeing it, it was always set in pair with something else. First with "drm/i915/pmu: Use PM timestamp instead of RING TIMESTAMP for reference", which was merged

Re: [Intel-gfx] [PATCH v3 2/2] drm/i915/uapi: Add query for hwconfig table

2022-01-27 Thread John Harrison
mean that it doesn't exist. John. Cc: Tvrtko Ursulin Cc: Kenneth Graunke Cc: Michal Wajdeczko Cc: Slawomir Milczarek Signed-off-by: Rodrigo Vivi Signed-off-by: John Harrison Reviewed-by: Matthew Brost --- drivers/gpu/drm/i915/i915_query.c | 23 +++ include/uapi/dr

Re: [Intel-gfx] [PATCH 3/4] drm/i915/execlists: Fix execlists request cancellation corner case

2022-01-26 Thread John Harrison
On 1/24/2022 07:01, Matthew Brost wrote: More than 1 request can be submitted to a single ELSP at a time if multiple requests are ready to run on the same context. When a request is canceled it is marked bad, an idle pulse is triggered to the engine (high priority kernel request), the execlists

Re: [Intel-gfx] [PATCH 2/4] drm/i915/guc: Cancel requests immediately

2022-01-26 Thread John Harrison
On 1/24/2022 07:01, Matthew Brost wrote: Change the preemption timeout to the smallest possible value (1 us) when disabling scheduling to cancel a request and restore it after cancellation. This not only cancels the request as fast as possible, it fixes a bug where the preemption timeout is 0

Re: [Intel-gfx] [PATCH][next] drm/i915/guc: fix spelling mistake "notificaion" -> "notification"

2022-01-25 Thread John Harrison
On 1/25/2022 01:13, Colin Ian King wrote: There is a spelling mistake in a drm_err error message. Fix it. Signed-off-by: Colin Ian King Reviewed-by: John Harrison However, note that this message is going to be deleted anyway. Or at least dropped down to an informational. Partly

Re: [Intel-gfx] [PATCH 2/3] drm/i915/guc: Add work queue to trigger a GT reset

2022-01-21 Thread John Harrison
On 1/20/2022 20:31, Matthew Brost wrote: The G2H handler needs to be flushed during a GT reset but a G2H indicating engine reset failure can trigger a GT reset. Add a worker to trigger the GT reset when an engine reset failure is received to break this circular dependency. v2: (John Harrison

Re: [Intel-gfx] [PATCH 3/3] drm/i915/guc: Flush G2H handler during a GT reset

2022-01-20 Thread John Harrison
GuC generated G2H (e.g. engine resets) race with a GT reset. v2: (John Harrison) - Fix typo in commit message (s/is/in) Signed-off-by: Matthew Brost Reviewed-by: John Harrison --- .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 18 +- 1 file changed, 1 insertion(+), 17

Re: [Intel-gfx] [PATCH 2/3] drm/i915/guc: Add work queue to trigger a GT reset

2022-01-20 Thread John Harrison
: (John Harrison) - Store engine reset mask - Fix typo in commit message Signed-off-by: Matthew Brost --- drivers/gpu/drm/i915/gt/uc/intel_guc.h| 9 + .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 37 +-- 2 files changed, 42 insertions(+), 4 deletions

Re: [Intel-gfx] [PATCH] drm/i915: Lock timeline mutex directly in error path of eb_pin_timeline

2022-01-20 Thread John Harrison
On 1/11/2022 08:39, Matthew Brost wrote: Don't use the interruptible version of the timeline mutex lock in the error path of eb_pin_timeline as the cleanup must always happen. v2: (John Harrison) - Don't check for interrupt during mutex lock v3: (Tvrtko) - A comment explaining why

Re: [Intel-gfx] ✗ Fi.CI.IGT: failure for Add support for querying hw info that UMDs need

2022-01-19 Thread John Harrison
On 1/19/2022 16:42, Patchwork wrote: Project List - Patchwork *Patch Details* *Series:* Add support for querying hw info that UMDs need *URL:* https://patchwork.freedesktop.org/series/99060/ *State:* failure *Details:*

Re: [Intel-gfx] [PATCH 1/3] drm/i915: Allocate intel_engine_coredump_alloc with ALLOW_FAIL

2022-01-19 Thread John Harrison
On 1/19/2022 12:47, Matthew Brost wrote: On Tue, Jan 18, 2022 at 05:29:54PM -0800, John Harrison wrote: On 1/18/2022 13:43, Matthew Brost wrote: Allocate intel_engine_coredump_alloc with ALLOW_FAIL rather than GFP_KERNEL do fully decouple the error capture from fence signalling. s/do

Re: [Intel-gfx] [PATCH 3/3] drm/i915/guc: Flush G2H handler during a GT reset

2022-01-18 Thread John Harrison
On 1/18/2022 13:43, Matthew Brost wrote: Now that the error capture is fully decoupled from fence signalling (request retirement to free memory, which is turn depends on resets) we s/is/in/ With that fixed: Reviewed-by: John Harrison John. can safely flush the G2H handler during a GT reset

Re: [Intel-gfx] [PATCH 2/3] drm/i915/guc: Add work queue to trigger a GT reset

2022-01-18 Thread John Harrison
On 1/18/2022 13:43, Matthew Brost wrote: The G2H handler needs to be flushed during a GT reset but a G2H indicating engine reset failure can trigger a GT reset. Add a worker to trigger the GT when a engine reset failure is received to break this s/a/an/ circular dependency. Signed-off-by:

Re: [Intel-gfx] [PATCH 1/3] drm/i915: Allocate intel_engine_coredump_alloc with ALLOW_FAIL

2022-01-18 Thread John Harrison
On 1/18/2022 13:43, Matthew Brost wrote: Allocate intel_engine_coredump_alloc with ALLOW_FAIL rather than GFP_KERNEL do fully decouple the error capture from fence signalling. s/do/to/ Fixes: 8b91cdd4f8649 ("drm/i915: Use __GFP_KSWAPD_RECLAIM in the capture code") Does this really count as a

Re: [Intel-gfx] [PATCH] drm/i915/guc: Ensure multi-lrc fini breadcrumb math is correct

2022-01-18 Thread John Harrison
good. There was a checkpatch warning about blank lines. With that fixed: Reviewed-by: John Harrison --- .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 32 +++ 1 file changed, 26 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b

Re: [Intel-gfx] [igt-dev] [PATCH v3 i-g-t 12/15] tests/i915/gem_exec_fence: Configure correct context

2022-01-13 Thread John Harrison
On 1/13/2022 13:06, Matthew Brost wrote: On Thu, Jan 13, 2022 at 11:59:44AM -0800, john.c.harri...@intel.com wrote: From: John Harrison The update to use intel_ctx_t missed a line that configures the context to allow hanging. Fix that. Fixes: 09c36188b23f83ef9a7b5414e2a10100adc4291f

Re: [Intel-gfx] [igt-dev] [PATCH v3 i-g-t 11/15] tests/i915/i915_hangman: Don't let background contexts cause a ban

2022-01-13 Thread John Harrison
On 1/13/2022 13:01, Matthew Brost wrote: On Thu, Jan 13, 2022 at 11:59:43AM -0800, john.c.harri...@intel.com wrote: From: John Harrison The global context used by all the subtests for causing hangs is marked as unbannable. However, some of the subtests set background spinners running on all

Re: [Intel-gfx] [igt-dev] [PATCH v3 i-g-t 09/15] tests/i915/i915_hangman: Remove reliance on context persistance

2022-01-13 Thread John Harrison
On 1/13/2022 12:30, Matthew Brost wrote: On Thu, Jan 13, 2022 at 11:59:41AM -0800, john.c.harri...@intel.com wrote: From: John Harrison The hang test was relying on context persistence for no particular reason. That is, it would set a bunch of background spinners running then immediately

Re: [Intel-gfx] [igt-dev] [PATCH v3 i-g-t 07/15] lib/store: Refactor common store code into helper function

2022-01-13 Thread John Harrison
On 1/13/2022 12:23, Matthew Brost wrote: On Thu, Jan 13, 2022 at 12:27:00PM -0800, John Harrison wrote: On 1/13/2022 12:10, Matthew Brost wrote: On Thu, Jan 13, 2022 at 11:59:39AM -0800, john.c.harri...@intel.com wrote: From: John Harrison A lot of tests use almost identical code

Re: [Intel-gfx] [igt-dev] [PATCH v3 i-g-t 07/15] lib/store: Refactor common store code into helper function

2022-01-13 Thread John Harrison
On 1/13/2022 12:10, Matthew Brost wrote: On Thu, Jan 13, 2022 at 11:59:39AM -0800, john.c.harri...@intel.com wrote: From: John Harrison A lot of tests use almost identical code for creating a batch buffer which does a single write to memory and another is about to be added. Instead, move

Re: [Intel-gfx] [PATCH 1/2] drm/i915/selftests: Add a cancel request selftest that triggers a reset

2022-01-13 Thread John Harrison
timeout compiled out - Skip test if engine reset isn't supported - Update debug prints to be more descriptive v3: - Add comment explaining test v4: (John Harrison) - Fix typos in comment explaining test - goto out_rq is NOP creation fails Signed-off-by: Matthew Brost --- drivers

Re: [Intel-gfx] [PATCH 1/2] drm/i915/selftests: Add a cancel request selftest that triggers a reset

2022-01-13 Thread John Harrison
On 1/13/2022 09:34, Matthew Brost wrote: On Thu, Jan 13, 2022 at 09:33:12AM -0800, John Harrison wrote: On 1/11/2022 15:11, Matthew Brost wrote: Add a cancel request selftest that results in an engine reset to cancel the request as it is non-preemptable. Also insert a NOP request after

Re: [Intel-gfx] [PATCH 1/2] drm/i915/selftests: Add a cancel request selftest that triggers a reset

2022-01-13 Thread John Harrison
On 1/11/2022 15:11, Matthew Brost wrote: Add a cancel request selftest that results in an engine reset to cancel the request as it is non-preemptable. Also insert a NOP request after the cancelled request and confirm that it completes successfully. v2: (Tvrtko) - Skip test if preemption

Re: [Intel-gfx] [PATCH 2/2] drm/i915/guc: Remove hacks for reset and schedule disable G2H being received out of order

2022-01-13 Thread John Harrison
these hacks. Signed-off-by: Matthew Brost Reviewed-by: John Harrison --- .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 30 ++- 1 file changed, 2 insertions(+), 28 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc

Re: [Intel-gfx] [igt-dev] [PATCH i-g-t 08/11] lib/store: Refactor common store code into helper function

2022-01-12 Thread John Harrison
On 12/26/2021 22:02, Zbigniew Kempczyński wrote: On Tue, Dec 21, 2021 at 06:22:22PM -0800, John Harrison wrote: On 12/20/2021 10:13, Zbigniew Kempczyński wrote: On Thu, Dec 16, 2021 at 02:40:21PM -0800, John Harrison wrote: On 12/15/2021 23:46, Zbigniew Kempczyński wrote: On Mon, Dec 13

Re: [Intel-gfx] [PATCH] drm/i915/guc: Don't error on reset of banned context

2022-01-06 Thread John Harrison
On 1/6/2022 16:31, john.c.harri...@intel.com wrote: From: John Harrison There is a race (already documented in the code) whereby a context can be (re-)queued for submission at the same time as it is being banned due to a hang and reset. That leads to a hang/reset report from GuC for a context

Re: [Intel-gfx] [PATCH] drm/i915: Check return intel_context_timeline_lock of in eb_pin_timeline

2022-01-04 Thread John Harrison
On 1/4/2022 09:31, Matthew Brost wrote: intel_context_timeline_lock can return an error if interrupted by a user when trying to lock the timeline mutex. Check the return value of intel_context_timeline_lock in eb_pin_timeline (execbuf IOCTL). Fixes: 544460c33821 ("drm/i915: Multi-BB execbuf")

Re: [Intel-gfx] [PATCH] drm/i915: Increment composite fence seqno

2021-12-23 Thread John Harrison
eb->context->parallel.seqno, +eb->context->parallel.seqno++, As is, this change looks good. So: Reviewed-by: John Harrison However, just spotted that dma_fence_array_create() takes the seqno value as an 'unsigned' even though it pas

Re: [Intel-gfx] [PATCH] drm/i915/guc: Log engine resets

2021-12-23 Thread John Harrison
On 12/23/2021 02:23, Tvrtko Ursulin wrote: On 22/12/2021 21:58, John Harrison wrote: On 12/22/2021 08:21, Tvrtko Ursulin wrote: On 21/12/2021 22:14, John Harrison wrote: On 12/21/2021 05:37, Tvrtko Ursulin wrote: On 20/12/2021 18:34, John Harrison wrote: On 12/20/2021 07:00, Tvrtko Ursulin

Re: [Intel-gfx] [PATCH] drm/i915/execlists: Weak parallel submission support for execlists

2021-12-22 Thread John Harrison
are not allowed. This is on par with what is there for the existing (hopefully soon deprecated) bonding interface. We perma-pin these execlists contexts to align with GuC implementation. v2: (John Harrison) - Drop siblings array as num_siblings must be 1 v3: (John Harrison) - Drop single submission

Re: [Intel-gfx] [PATCH] drm/i915/guc: Use lockless list for destroyed contexts

2021-12-22 Thread John Harrison
On 12/22/2021 15:29, Matthew Brost wrote: Use a lockless list structure for destroyed contexts to avoid hammering on global submission spin lock. I thought the guidance was that lockless anything without an explanation longer than War And Peace comes with an automatic termination penalty?

Re: [Intel-gfx] [PATCH] drm/i915/guc: Log engine resets

2021-12-22 Thread John Harrison
On 12/22/2021 08:21, Tvrtko Ursulin wrote: On 21/12/2021 22:14, John Harrison wrote: On 12/21/2021 05:37, Tvrtko Ursulin wrote: On 20/12/2021 18:34, John Harrison wrote: On 12/20/2021 07:00, Tvrtko Ursulin wrote: On 17/12/2021 16:22, Matthew Brost wrote: On Fri, Dec 17, 2021 at 12:15:53PM

Re: [Intel-gfx] [igt-dev] [PATCH i-g-t 08/11] lib/store: Refactor common store code into helper function

2021-12-21 Thread John Harrison
On 12/20/2021 10:13, Zbigniew Kempczyński wrote: On Thu, Dec 16, 2021 at 02:40:21PM -0800, John Harrison wrote: On 12/15/2021 23:46, Zbigniew Kempczyński wrote: On Mon, Dec 13, 2021 at 03:29:11PM -0800, john.c.harri...@intel.com wrote: From: John Harrison A lot of tests use almost identical

Re: [Intel-gfx] [igt-dev] [PATCH i-g-t 04/11] tests/i915/i915_hangman: Explicitly test per engine reset vs full GPU reset

2021-12-21 Thread John Harrison
On 12/21/2021 03:28, Dandamudi, Priyanka wrote: Does this test series cover to prove that it can survive killing one without killing all the others except RCS+CCS combination(killing one affects other and shows with the help of reset stats)? That is part of the purpose of i915_hangman - to

Re: [Intel-gfx] [PATCH] drm/i915/guc: Log engine resets

2021-12-21 Thread John Harrison
On 12/21/2021 05:37, Tvrtko Ursulin wrote: On 20/12/2021 18:34, John Harrison wrote: On 12/20/2021 07:00, Tvrtko Ursulin wrote: On 17/12/2021 16:22, Matthew Brost wrote: On Fri, Dec 17, 2021 at 12:15:53PM +, Tvrtko Ursulin wrote: On 14/12/2021 15:07, Tvrtko Ursulin wrote: From: Tvrtko

Re: [Intel-gfx] [PATCH] drm/i915/guc: Log engine resets

2021-12-20 Thread John Harrison
ion itself could be triggered but basically, if GuC resets an active context then it is because it did not pre-empt quickly enough when requested. Regards, Tvrtko Signed-off-by: Tvrtko Ursulin Cc: Matthew Brost Cc: John Harrison --- drivers/gpu/drm/i915/gt/uc/intel_guc_submiss

Re: [Intel-gfx] [igt-dev] [PATCH i-g-t 08/11] lib/store: Refactor common store code into helper function

2021-12-16 Thread John Harrison
On 12/15/2021 23:46, Zbigniew Kempczyński wrote: On Mon, Dec 13, 2021 at 03:29:11PM -0800, john.c.harri...@intel.com wrote: From: John Harrison A lot of tests use almost identical code for creating a batch buffer which does a single write to memory. This patch collects two such instances

Re: [Intel-gfx] [igt-dev] [PATCH i-g-t 07/11] tests/i915/i915_hangman: Add alive-ness test after error capture

2021-12-16 Thread John Harrison
On 12/15/2021 23:23, Zbigniew Kempczyński wrote: On Mon, Dec 13, 2021 at 03:29:10PM -0800, john.c.harri...@intel.com wrote: From: John Harrison Added an extra step to the i915_hangman tests to check that the system is still alive after the hang and recovery. This submits a simple batch

Re: [Intel-gfx] [PATCH] drm/i915/guc: Check for wedged before doing stuff

2021-12-16 Thread John Harrison
On 12/16/2021 00:47, Tvrtko Ursulin wrote: On 15/12/2021 22:45, john.c.harri...@intel.com wrote: From: John Harrison A fault injection probe test hit a BUG_ON in a GuC error path. It showed that the GuC code could potentially attempt to do many things when the device is actually wedged. So

Re: [Intel-gfx] [PATCH 7/7] drm/i915/guc: Selftest for stealing of guc ids

2021-12-14 Thread John Harrison
request which should successfully steal a guc_id. Wait on last request to complete, idle GPU, verify a guc_id was stolen via a counter, and exit the test. Test also artificially reduces the number of guc_ids so the test runs in a timely manner. v2: (John Harrison) - s/stole/stolen - Fix some

Re: [Intel-gfx] [igt-dev] [PATCH i-g-t 01/11] tests/i915/i915_hangman: Add descriptions

2021-12-14 Thread John Harrison
On 12/14/2021 01:47, Petri Latvala wrote: On Mon, Dec 13, 2021 at 03:29:04PM -0800, john.c.harri...@intel.com wrote: From: John Harrison Added descriptions of the various sub-tests and the test as a whole. Signed-off-by: John Harrison --- tests/i915/i915_hangman.c | 11 +-- 1

Re: [Intel-gfx] [PATCH 6/7] drm/i915/guc: Kick G2H tasklet if no credits

2021-12-13 Thread John Harrison
On 12/11/2021 09:35, Matthew Brost wrote: Let's be paranoid and kick the G2H tasklet, which dequeues messages, if G2H credits are exhausted. Signed-off-by: Matthew Brost Reviewed-by: John Harrison --- drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 9 - 1 file changed, 8 insertions

Re: [Intel-gfx] [PATCH 7/7] drm/i915/guc: Selftest for stealing of guc ids

2021-12-13 Thread John Harrison
manner. v2: (John Harrison) - s/stole/stolen - Fix some wording in test description - Rework indexing into context array - Add test description to commit message - Fix typo in commit message (Checkpatch) - s/guc/(guc) in NUMBER_MULTI_LRC_GUC_ID Signed-off-by: Matthew Brost

Re: [Intel-gfx] ✗ Fi.CI.IGT: failure for More fixes/tweaks to GuC support

2021-12-13 Thread John Harrison
On 12/11/2021 18:43, Patchwork wrote: Project List - Patchwork *Patch Details* *Series:* More fixes/tweaks to GuC support *URL:* https://patchwork.freedesktop.org/series/97898/ *State:* failure *Details:* https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21832/index.html CI

Re: [Intel-gfx] [v2] drm/i915/gen11: Moving WAs to icl_gt_workarounds_init()

2021-12-10 Thread John Harrison
On 12/10/2021 21:30, Lucas De Marchi wrote: On Fri, Dec 10, 2021 at 05:41:46PM -0800, John Harrison wrote: On 12/10/2021 17:22, Matt Roper wrote: On Thu, Dec 09, 2021 at 09:21:39AM -0800, Lucas De Marchi wrote: On Fri, Dec 03, 2021 at 08:26:03PM +0530, ravitejax.goud.ta...@intel.com wrote

Re: [Intel-gfx] [PATCH 5/7] drm/i915/guc: Add extra debug on CT deadlock

2021-12-10 Thread John Harrison
On 12/10/2021 17:43, John Harrison wrote: On 12/10/2021 16:56, Matthew Brost wrote: Print CT state (H2G + G2H head / tail pointers, credits) on CT deadlock. Signed-off-by: Matthew Brost Reviewed-by: John Harrison --- drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 9 + 1 file

Re: [Intel-gfx] [PATCH 5/7] drm/i915/guc: Add extra debug on CT deadlock

2021-12-10 Thread John Harrison
On 12/10/2021 16:56, Matthew Brost wrote: Print CT state (H2G + G2H head / tail pointers, credits) on CT deadlock. Signed-off-by: Matthew Brost Reviewed-by: John Harrison --- drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 9 + 1 file changed, 9 insertions(+) diff --git a/drivers

Re: [Intel-gfx] [v2] drm/i915/gen11: Moving WAs to icl_gt_workarounds_init()

2021-12-10 Thread John Harrison
1407352427,Wa_1406680159 to proper function icl_gt_workarounds_init() Which will resolve guc enabling error v2: - Previous patch rev2 was created by email client which caused the Build failure, This v2 is to resolve the previous broken series Reviewed-by: John Harrison Signed-off-by: Raviteja

Re: [Intel-gfx] [PATCH 7/7] drm/i915/guc: Selftest for stealing of guc ids

2021-12-10 Thread John Harrison
On 12/10/2021 16:56, Matthew Brost wrote: Testing the stealing of guc ids is hard from user spaec as we have 64k spaec -> space guc_ids. Add a selftest, which artificially reduces the number of guc ids, and forces a steal. Details of test has comment in code so will not has -> are But would

Re: [Intel-gfx] [PATCH 2/7] drm/i915/guc: Only assign guc_id.id when stealing guc_id

2021-12-10 Thread John Harrison
On 12/10/2021 16:56, Matthew Brost wrote: Previously assigned whole guc_id structure (list, spin lock) which is incorrect, only assign the guc_id.id. Fixes: 0f7976506de61 ("drm/i915/guc: Rework and simplify locking") Signed-off-by: Matthew Brost Reviewed-by: John Harrison --- d
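The fix described here (copy only the numeric id, not the whole structure containing a list link and a lock) can be illustrated with a small userspace sketch. Everything below is a hypothetical stand-in rather than the driver's real types: `struct list_node`, `struct id_slot`, `struct context` and `steal_id` are assumed names, and only the `guc_id.id` field name is taken from the commit message.

```c
#include <stdint.h>

/* Minimal doubly linked list node, standing in for a kernel list_head. */
struct list_node {
	struct list_node *prev, *next;
};

/* Hypothetical id slot: the numeric id plus per-owner bookkeeping. */
struct id_slot {
	uint32_t id;
	struct list_node link;	/* must stay tied to its own owner */
};

struct context {
	struct id_slot guc_id;
};

/*
 * Take over the victim's id. Copying the whole slot
 * (thief->guc_id = victim->guc_id) would also overwrite the thief's
 * list linkage with pointers that belong to the victim, so only the
 * id field is assigned.
 */
static void steal_id(struct context *thief, const struct context *victim)
{
	thief->guc_id.id = victim->guc_id.id;
}
```

A whole-struct assignment compiles without complaint in C, which is why this kind of mistake is easy to miss in review.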

Re: [Intel-gfx] [PATCH 4/7] drm/i915/guc: Don't hog IRQs when destroying contexts

2021-12-10 Thread John Harrison
On 12/10/2021 16:56, Matthew Brost wrote: From: John Harrison While attempting to debug a CT deadlock issue in various CI failures (most easily reproduced with gem_ctx_create/basic-files), I was seeing CPU deadlock errors being reported. This was because the context destroy loop was blocking

Re: [Intel-gfx] ✗ Fi.CI.IGT: failure for Assorted fixes/tweaks to GuC support (rev5)

2021-12-09 Thread John Harrison
On 12/8/2021 20:03, Patchwork wrote: Project List - Patchwork *Patch Details* *Series:* Assorted fixes/tweaks to GuC support (rev5) *URL:* https://patchwork.freedesktop.org/series/97514/ *State:* failure *Details:*

Re: [Intel-gfx] [PATCH] drm/i915/dg2: make GuC FW a requirement for Gen12 and beyond devices

2021-12-09 Thread John Harrison
On 12/9/2021 06:41, Robert Beckett wrote: On 09/12/2021 00:24, John Harrison wrote: On 12/8/2021 09:58, Robert Beckett wrote: On 07/12/2021 23:15, John Harrison wrote: On 12/7/2021 09:53, Adrian Larumbe wrote: Beginning with DG2, all successive devices will require GuC FW to be present

Re: [Intel-gfx] [PATCH] drm/i915/dg2: make GuC FW a requirement for Gen12 and beyond devices

2021-12-07 Thread John Harrison
On 12/7/2021 09:53, Adrian Larumbe wrote: Beginning with DG2, all successive devices will require GuC FW to be present and loaded at probe() time. This change alters error handling in the FW init and load functions so that the driver's probe() function will fail if GuC could not be loaded. We

Re: [Intel-gfx] [PATCH v7] drm/i915: Update error capture code to avoid using the current vma state

2021-12-07 Thread John Harrison
On 11/29/2021 12:22, Thomas Hellström wrote: With asynchronous migrations, the vma state may be several migrations ahead of the state that matches the request we're capturing. Address that by introducing an i915_vma_snapshot structure that can be used to snapshot relevant state at request

Re: [Intel-gfx] [PATCH] drm/i915/execlists: Weak parallel submission support for execlists

2021-12-06 Thread John Harrison
are not allowed. This is on par with what is there for the existing (hopefully soon deprecated) bonding interface. We perma-pin these execlists contexts to align with GuC implementation. v2: (John Harrison) - Drop siblings array as num_siblings must be 1 v3: (John Harrison) - Drop single submission

Re: [Intel-gfx] ✗ Fi.CI.DOCS: warning for Update to GuC version 69.0.0

2021-12-06 Thread John Harrison
Michal, do you know what this is complaining about? John. On 12/3/2021 14:27, Patchwork wrote: == Series Details == Series: Update to GuC version 69.0.0 URL : https://patchwork.freedesktop.org/series/97564/ State : warning == Summary == $ make htmldocs 2>&1 > /dev/null | grep i915

Re: [Intel-gfx] [PATCH] drm/i915/gen11: Moving WAs to icl_gt_workarounds_init()

2021-11-30 Thread John Harrison
On 11/23/2021 06:45, ravitejax.goud.ta...@intel.com wrote: From: raviteja goud talla Bspec page says "Reset: BUS", Accordingly moving w/a's: Wa_1407352427,Wa_1406680159 to proper function icl_gt_workarounds_init() Which will resolve guc enabling error Cc: John Harrison

Re: [Intel-gfx] [PATCH v2 2/2] drm/i915/uapi: Add query for hwconfig table

2021-11-03 Thread John Harrison
On 11/3/2021 14:38, Jordan Justen wrote: John Harrison writes: On 11/1/2021 08:39, Jordan Justen wrote: writes: From: Rodrigo Vivi GuC contains a consolidated table with a bunch of information about the current device. Previously, this information was spread and hardcoded to all

Re: [Intel-gfx] [PATCH i-g-t 4/8] tests/i915/gem_exec_capture: Use contexts and engines properly

2021-11-03 Thread John Harrison
On 11/3/2021 02:36, Petri Latvala wrote: On Tue, Nov 02, 2021 at 06:45:38PM -0700, John Harrison wrote: On 11/2/2021 16:34, Matthew Brost wrote: On Thu, Oct 21, 2021 at 04:40:40PM -0700, john.c.harri...@intel.com wrote: From: John Harrison Some of the capture tests were using explicit

Re: [Intel-gfx] [PATCH i-g-t 5/8] tests/i915/gem_exec_capture: Check for memory allocation failure

2021-11-03 Thread John Harrison
On 11/3/2021 07:00, Tvrtko Ursulin wrote: On 22/10/2021 00:40, john.c.harri...@intel.com wrote: From: John Harrison The sysfs file read helper does not actually report any errors if a realloc fails. It just silently returns a 'valid' but truncated buffer. This then leads to the decode
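The failure mode described here (a read helper that swallows a `realloc` failure and hands back a truncated buffer) can be avoided with a pattern along the following lines. This is only a sketch using standard C library calls; `read_whole_stream` is a hypothetical helper written for illustration and is not the IGT function under discussion.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Read an entire stream into a NUL-terminated heap buffer.
 * Returns NULL on allocation or read failure instead of returning
 * a truncated buffer; any partial data is freed first.
 */
static char *read_whole_stream(FILE *fp, size_t *len_out)
{
	size_t cap = 4096, len = 0;
	char *buf = malloc(cap);

	if (!buf)
		return NULL;

	for (;;) {
		/* Always leave one byte spare for the terminator. */
		size_t n = fread(buf + len, 1, cap - len - 1, fp);

		len += n;
		if (n == 0)
			break;	/* end of file or read error */

		if (len == cap - 1) {
			char *tmp = realloc(buf, cap * 2);

			if (!tmp) {
				free(buf);	/* report failure, do not truncate */
				return NULL;
			}
			buf = tmp;
			cap *= 2;
		}
	}

	if (ferror(fp)) {
		free(buf);
		return NULL;
	}

	buf[len] = '\0';
	if (len_out)
		*len_out = len;
	return buf;
}
```

Callers can then treat a NULL return as a hard test failure, rather than passing a short buffer on to a decoder that will misreport it.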

Re: [Intel-gfx] [PATCH i-g-t 1/8] tests/i915/gem_exec_capture: Remove pointless assert

2021-11-03 Thread John Harrison
On 11/3/2021 06:50, Tvrtko Ursulin wrote: On 22/10/2021 00:40, john.c.harri...@intel.com wrote: From: John Harrison The 'many' test ended with an 'assert(count)', presumably meaning to ensure that some objects were actually captured. However, 'count' is the number of objects created not how

Re: [Intel-gfx] [PATCH i-g-t 4/8] tests/i915/gem_exec_capture: Use contexts and engines properly

2021-11-02 Thread John Harrison
On 11/2/2021 16:34, Matthew Brost wrote: On Thu, Oct 21, 2021 at 04:40:40PM -0700, john.c.harri...@intel.com wrote: From: John Harrison Some of the capture tests were using explicit contexts, some not. Some were poking the per engine pre-emption timeout, some not. This would lead to sporadic

Re: [Intel-gfx] [PATCH v2 2/2] drm/i915/uapi: Add query for hwconfig table

2021-11-02 Thread John Harrison
On 11/1/2021 08:39, Jordan Justen wrote: writes: From: Rodrigo Vivi GuC contains a consolidated table with a bunch of information about the current device. Previously, this information was spread and hardcoded to all the components including GuC, i915 and various UMDs. The goal here is to

Re: [Intel-gfx] [igt-dev] [PATCH i-g-t 2/8] tests/i915/gem_exec_capture: Cope with larger page sizes

2021-10-29 Thread John Harrison
On 10/29/2021 10:39, Matthew Brost wrote: On Thu, Oct 21, 2021 at 04:40:38PM -0700, john.c.harri...@intel.com wrote: From: John Harrison At some point, larger than 4KB page sizes were added to the i915 driver. This included adding an informational line to the buffer entries in error capture

Re: [Intel-gfx] [PATCH] drm/i915/resets: Don't set / test for per-engine reset bits with GuC submission

2021-10-28 Thread John Harrison
these bits, rip the use of these bits out from the reset code. To be clear, there are other tests poking these bits in addition to hangcheck. Not sure if they would suffer from the same problems but I don't see why they wouldn't. Reviewed-by: John Harrison Signed-off-by: Matthew Brost

Re: [Intel-gfx] [PATCH] drm/i915/selftests: Allow engine reset failure to do a GT reset in hangcheck selftest

2021-10-27 Thread John Harrison
On 10/26/2021 23:36, Thomas Hellström wrote: Hi, John, On 10/26/21 21:55, John Harrison wrote: On 10/21/2021 23:23, Thomas Hellström wrote: On 10/21/21 22:37, Matthew Brost wrote: On Thu, Oct 21, 2021 at 08:15:49AM +0200, Thomas Hellström wrote: Hi, Matthew, On Mon, 2021-10-11 at 16:47

Re: [Intel-gfx] [PATCH] drm/i915/execlists: Weak parallel submission support for execlists

2021-10-27 Thread John Harrison
On 10/27/2021 12:17, Matthew Brost wrote: On Tue, Oct 26, 2021 at 02:58:00PM -0700, John Harrison wrote: On 10/20/2021 14:47, Matthew Brost wrote: A weak implementation of parallel submission (multi-bb execbuf IOCTL) for execlists. Doing as little as possible to support this interface

Re: [Intel-gfx] [PATCH] drm/i915/execlists: Weak parallel submission support for execlists

2021-10-26 Thread John Harrison
are not allowed. This is on par with what is there for the existing (hopefully soon deprecated) bonding interface. We perma-pin these execlists contexts to align with GuC implementation. v2: (John Harrison) - Drop siblings array as num_siblings must be 1 Signed-off-by: Matthew Brost --- drivers/gpu

Re: [Intel-gfx] [PATCH] drm/i915/selftests: Allow engine reset failure to do a GT reset in hangcheck selftest

2021-10-26 Thread John Harrison
On 10/21/2021 23:23, Thomas Hellström wrote: On 10/21/21 22:37, Matthew Brost wrote: On Thu, Oct 21, 2021 at 08:15:49AM +0200, Thomas Hellström wrote: Hi, Matthew, On Mon, 2021-10-11 at 16:47 -0700, Matthew Brost wrote: The hangcheck selftest blocks per engine resets by setting magic bits in

Re: [Intel-gfx] [PATCH] drm/i915/selftests: Allow engine reset failure to do a GT reset in hangcheck selftest

2021-10-26 Thread John Harrison
On 10/11/2021 16:47, Matthew Brost wrote: The hangcheck selftest blocks per engine resets by setting magic bits in the reset flags. This is incorrect for GuC submission because if the GuC fails to reset an engine we would like to do a full GT reset. Do not set these magic bits when using GuC

Re: [Intel-gfx] [PATCH] drm/i915/trace: Hide backend specific fields behind Kconfig

2021-10-25 Thread John Harrison
On 10/25/2021 09:34, Matthew Brost wrote: Hide the guc_id and tail fields, for request trace points, behind CONFIG_DRM_I915_LOW_LEVEL_TRACEPOINTS Kconfig option. Trace points are ABI (maybe?) so don't change them without kernel developers Kconfig options. The i915 sw arch team have previously

Re: [Intel-gfx] [PATCH] drm/i915/selftests: Allow engine reset failure to do a GT reset in hangcheck selftest

2021-10-25 Thread John Harrison
On 10/23/2021 11:36, Thomas Hellström wrote: On 10/23/21 20:18, Matthew Brost wrote: On Sat, Oct 23, 2021 at 07:46:48PM +0200, Thomas Hellström wrote: On 10/22/21 20:09, John Harrison wrote: And to be clear, the engine reset is not supposed to fail. Whether issued by GuC or i915, the GDRST

Re: [Intel-gfx] [PATCH 00/47] GuC submission support

2021-10-25 Thread John Harrison
st [1] https://patchwork.freedesktop.org/series/89844/ [2] https://patchwork.freedesktop.org/series/91417/ Daniele Ceraolo Spurio (1): drm/i915/guc: Unblock GuC submission on Gen11+ John Harrison (10): drm/i915/guc: Module load failure test for CT buffer creation drm/i915: Track '

Re: [Intel-gfx] [PATCH] drm/i915/selftests: Allow engine reset failure to do a GT reset in hangcheck selftest

2021-10-22 Thread John Harrison
On 10/22/2021 10:03, Matthew Brost wrote: On Fri, Oct 22, 2021 at 08:23:55AM +0200, Thomas Hellström wrote: On 10/21/21 22:37, Matthew Brost wrote: On Thu, Oct 21, 2021 at 08:15:49AM +0200, Thomas Hellström wrote: Hi, Matthew, On Mon, 2021-10-11 at 16:47 -0700, Matthew Brost wrote: The

Re: [Intel-gfx] [PATCH] drm/i915/selftests: Increase timeout in requests perf selftest

2021-10-20 Thread John Harrison
understand that is ok for contexts to get starved in this scenario. A future patch / cleanup may just delete these micro benchmark tests as they basically mean nothing. We care about real workloads not made up ones. Signed-off-by: Matthew Brost Reviewed-by: John Harrison --- drivers/gpu/drm/i915

Re: [Intel-gfx] [igt-dev] [PATCH v2 i-g-t] tests/i915: Skip gem_exec_fair on GuC based platforms

2021-10-15 Thread John Harrison
On 10/15/2021 07:52, Dixit, Ashutosh wrote: On Thu, 14 Oct 2021 12:42:38 -0700, wrote: + /* +* These tests are for a specific scheduling model which is +* not currently implemented by GuC. So skip on GuC platforms. +*/ +

Re: [Intel-gfx] [PATCH] drm/i915: fix blank screen booting crashes

2021-10-15 Thread John Harrison
d after all? John. commit b0179f0d18dd7e6fb6b1c52c49ac21365257e97e Author: Hugh Dickins AuthorDate: Tue Sep 21 18:50:39 2021 -0700 Commit: John Harrison CommitDate: Thu Oct 14 18:29:01 2021 -0700     drm/i915: fix blank screen booting crashes commit cdc1e6e225e3256d56dc6648411630e71d7c776b Author: Hugh Dickins

Re: [Intel-gfx] [PATCH 25/25] drm/i915/execlists: Weak parallel submission support for execlists

2021-10-14 Thread John Harrison
On 10/14/2021 10:20, Matthew Brost wrote: A weak implementation of parallel submission (multi-bb execbuf IOCTL) for execlists. Doing as little as possible to support this interface for execlists - basically just passing submit fences between each request generated and virtual engines are not

Re: [Intel-gfx] [PATCH 24/25] drm/i915: Enable multi-bb execbuf

2021-10-14 Thread John Harrison
On 10/14/2021 10:20, Matthew Brost wrote: Enable multi-bb execbuf by enabling the set_parallel extension. Signed-off-by: Matthew Brost Reviewed-by: John Harrison --- drivers/gpu/drm/i915/gem/i915_gem_context.c | 3 --- 1 file changed, 3 deletions(-) diff --git a/drivers/gpu/drm/i915

Re: [Intel-gfx] [PATCH 20/25] drm/i915: Multi-BB execbuf

2021-10-14 Thread John Harrison
media UMD: https://github.com/intel/media-driver/pull/1252 v2: (Matthew Brost) - Return proper error value if i915_request_create fails v3: (John Harrison) - Add comment explaining create / add order loops + locking - Update commit message explaining difference in IOCTL behavior

Re: [Intel-gfx] [PATCH 16/25] drm/i915/guc: Connect UAPI to GuC multi-lrc interface

2021-10-14 Thread John Harrison
/1252 v2: (Daniel Vetter) - Add IGT link and placeholder for media UMD link v3: (Kernel test robot) - Fix warning in unpin engines call (John Harrison) - Reword a bunch of the kernel doc v4: (John Harrison) - Add comment why perma-pin is done after setting gem context

Re: [Intel-gfx] [PATCH 08/25] drm/i915/guc: Add multi-lrc context registration

2021-10-14 Thread John Harrison
On 10/14/2021 10:19, Matthew Brost wrote: Add multi-lrc context registration H2G. In addition a workqueue and process descriptor are setup during multi-lrc context registration as these data structures are needed for multi-lrc submission. v2: (John Harrison) - Move GuC specific fields

Re: [Intel-gfx] [PATCH 16/25] drm/i915/guc: Connect UAPI to GuC multi-lrc interface

2021-10-14 Thread John Harrison
On 10/14/2021 09:41, Matthew Brost wrote: On Thu, Oct 14, 2021 at 09:43:36AM -0700, John Harrison wrote: On 10/14/2021 08:32, Matthew Brost wrote: On Wed, Oct 13, 2021 at 06:02:42PM -0700, John Harrison wrote: On 10/13/2021 13:42, Matthew Brost wrote: Introduce 'set parallel submit

Re: [Intel-gfx] [PATCH 11/25] drm/i915/guc: Implement parallel context pin / unpin functions

2021-10-14 Thread John Harrison
scheduling with the GuC / or deregister the context. v2: (Daniel Vetter) - Perma-pin parallel contexts Signed-off-by: Matthew Brost Reviewed-by: John Harrison --- .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 70 +++ 1 file changed, 70 insertions(+) diff --git a/drivers

Re: [Intel-gfx] [PATCH 16/25] drm/i915/guc: Connect UAPI to GuC multi-lrc interface

2021-10-14 Thread John Harrison
On 10/14/2021 08:32, Matthew Brost wrote: On Wed, Oct 13, 2021 at 06:02:42PM -0700, John Harrison wrote: On 10/13/2021 13:42, Matthew Brost wrote: Introduce 'set parallel submit' extension to connect UAPI to GuC multi-lrc interface. Kernel doc in new uAPI should explain it all. IGT: https

Re: [Intel-gfx] [PATCH i-g-t] tests/i915: Skip gem_exec_fair on GuC based platforms

2021-10-13 Thread John Harrison
On 10/13/2021 15:53, Dixit, Ashutosh wrote: On Wed, 13 Oct 2021 15:43:17 -0700, wrote: From: John Harrison The gem_exec_fair test is specifically testing scheduler algorithm performance. However, GuC does not implement the same algorithm as execlist mode and this test is not applicable. So

Re: [Intel-gfx] [PATCH 16/25] drm/i915/guc: Connect UAPI to GuC multi-lrc interface

2021-10-13 Thread John Harrison
/1252 v2: (Daniel Vetter) - Add IGT link and placeholder for media UMD link v3: (Kernel test robot) - Fix warning in unpin engines call (John Harrison) - Reword a bunch of the kernel doc v4: (John Harrison) - Add comment why perma-pin is done after setting gem context

Re: [Intel-gfx] [PATCH 22/25] drm/i915: Make request conflict tracking understand parallel submits

2021-10-13 Thread John Harrison
the request conflict tracking understand this by comparing a parallel submit's parent context and skipping conflict insertion if the values match. v2: (John Harrison) - Reword commit message Signed-off-by: Matthew Brost Reviewed-by: John Harrison --- drivers/gpu/drm/i915/i915_request.c | 43

Re: [Intel-gfx] [PATCH 21/25] drm/i915/guc: Handle errors in multi-lrc requests

2021-10-13 Thread John Harrison
the parent and children to make forward progress. If all the requests are not present this handshake doesn't work. To work around this, if multi-lrc request has an error we skip the handshake but still emit the breadcrumbs seqno. v2: (John Harrison) - Add comment explaining the skipping

Re: [Intel-gfx] [PATCH 20/25] drm/i915: Multi-BB execbuf

2021-10-13 Thread John Harrison
media UMD: https://github.com/intel/media-driver/pull/1252 v2: (Matthew Brost) - Return proper error value if i915_request_create fails v3: (John Harrison) - Add comment explaining create / add order loops + locking - Update commit message explaining difference in IOCTL behavior

Re: [Intel-gfx] [PATCH 19/25] drm/i915/guc: Implement no mid batch preemption for multi-lrc

2021-10-13 Thread John Harrison
v2: (John Harrison) - Fix a few comments wording - Add structure for parent page layout v3: (John Harrison) - A structure for sync semaphore - Use offsetof to calc address - Update commit message Signed-off-by: Matthew Brost --- drivers/gpu/drm/i915/gt/intel_context.c |

Re: [Intel-gfx] [PATCH 14/25] drm/i915/guc: Implement multi-lrc reset

2021-10-13 Thread John Harrison
On 10/13/2021 13:42, Matthew Brost wrote: Update context and full GPU reset to work with multi-lrc. The idea is parent context tracks all the active requests inflight for itself and its children. The parent context owns the reset replaying / canceling requests as needed. v2: (John Harrison

Re: [Intel-gfx] [PATCH 12/25] drm/i915/guc: Implement multi-lrc submission

2021-10-13 Thread John Harrison
. As such, the tasklet and bypass path have been updated to coalesce requests into a single submission. v2: (John Harrison) - s/wqe/wqi - Use FIELD_PREP macros - Add GEM_BUG_ONs to ensure length fits within field - Add comment / white space to intel_guc_write_barrier (Kernel test robot) - Make

Re: [Intel-gfx] [PATCH 08/25] drm/i915/guc: Add multi-lrc context registration

2021-10-13 Thread John Harrison
On 10/13/2021 13:42, Matthew Brost wrote: Add multi-lrc context registration H2G. In addition a workqueue and process descriptor are setup during multi-lrc context registration as these data structures are needed for multi-lrc submission. v2: (John Harrison) - Move GuC specific fields

Re: [Intel-gfx] [PATCH 03/25] drm/i915/guc: Take engine PM when a context is pinned with GuC submission

2021-10-13 Thread John Harrison
annotations to pin / unpin function v3: (CI) - Drop intel_engine_pm_might_put from unpin path as an async put is used v4: (John Harrison) - Make intel_engine_pm_might_get/put work with GuC virtual engines - Update commit message v5: - Update commit message again Signed-off
