Re: [Intel-gfx] [PATCH] drm/i915/guc: Fix recursive lock in GuC submission

2021-10-26 Thread Joonas Lahtinen
Quoting Matthew Brost (2021-10-25 20:13:22)
> On Mon, Oct 25, 2021 at 03:23:00PM +0300, Joonas Lahtinen wrote:
> > Quoting Thomas Hellström (2021-10-21 08:39:48)
> > > On Wed, 2021-10-20 at 12:21 -0700, Matthew Brost wrote:
> > 
> > 
> > 
> > > > Fixes: 1a52faed31311 ("drm/i915/guc: Take engine PM when a context is
> > > > pinned with GuC submission")
> > > > Signed-off-by: Matthew Brost 
> > > > Cc: sta...@vger.kernel.org
> > 
> > This Cc: stable annotation is unnecessary.
> > 
> > Please always use "dim fixes 1a52faed31311" to help decide which
> > Cc's are needed. In this case stable is not needed. If it were, there
> > would be an indication of the kernel version. In this case it is fine
> > for this to be picked up in the drm-intel-next-fixes PR.
> > 
> > Let's pay attention to the right Fixes: annotation while submitting and
> > reviewing patches.
> > 
> 
> Will do. Working on getting push rights. Is there any documentation with
> all the rules for pushing? It seems like there are a lot of rules.

Yes, we have the documentation here:

https://drm.pages.freedesktop.org/maintainer-tools/committer-guidelines.html

And more specifically this topic:

https://drm.pages.freedesktop.org/maintainer-tools/committer-drm-intel.html#labeling-fixes-before-pushing

I would even recommend at least a cursory read through the wider
documentation about how the different trees interact:

https://drm.pages.freedesktop.org/maintainer-tools/index.html

Makes it easier to understand how the tags are used.

Regards, Joonas

> 
> Matt 
> 
> > Regards, Joonas


Re: [Intel-gfx] [PATCH] drm/i915/guc: Fix recursive lock in GuC submission

2021-10-25 Thread Matthew Brost
On Mon, Oct 25, 2021 at 03:23:00PM +0300, Joonas Lahtinen wrote:
> Quoting Thomas Hellström (2021-10-21 08:39:48)
> > On Wed, 2021-10-20 at 12:21 -0700, Matthew Brost wrote:
> 
> 
> 
> > > Fixes: 1a52faed31311 ("drm/i915/guc: Take engine PM when a context is
> > > pinned with GuC submission")
> > > Signed-off-by: Matthew Brost 
> > > Cc: sta...@vger.kernel.org
> 
> This Cc: stable annotation is unnecessary.
> 
> Please always use "dim fixes 1a52faed31311" to help decide which
> Cc's are needed. In this case stable is not needed. If it were, there
> would be an indication of the kernel version. In this case it is fine
> for this to be picked up in the drm-intel-next-fixes PR.
> 
> Let's pay attention to the right Fixes: annotation while submitting and
> reviewing patches.
> 

Will do. Working on getting push rights. Is there any documentation with
all the rules for pushing? It seems like there are a lot of rules.

Matt 

> Regards, Joonas


Re: [Intel-gfx] [PATCH] drm/i915/guc: Fix recursive lock in GuC submission

2021-10-25 Thread Joonas Lahtinen
Quoting Thomas Hellström (2021-10-21 08:39:48)
> On Wed, 2021-10-20 at 12:21 -0700, Matthew Brost wrote:



> > Fixes: 1a52faed31311 ("drm/i915/guc: Take engine PM when a context is
> > pinned with GuC submission")
> > Signed-off-by: Matthew Brost 
> > Cc: sta...@vger.kernel.org

This Cc: stable annotation is unnecessary.

Please always use "dim fixes 1a52faed31311" to help decide which
Cc's are needed. In this case stable is not needed. If it were, there
would be an indication of the kernel version. In this case it is fine
for this to be picked up in the drm-intel-next-fixes PR.

Let's pay attention to the right Fixes: annotation while submitting and
reviewing patches.

Regards, Joonas


Re: [Intel-gfx] [PATCH] drm/i915/guc: Fix recursive lock in GuC submission

2021-10-20 Thread Thomas Hellström
On Wed, 2021-10-20 at 12:21 -0700, Matthew Brost wrote:
> Use __release_guc_id (lock held) rather than release_guc_id (acquires
> lock), add lockdep annotations.
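
Illustrative note (not part of the patch): the difference between a
lock-held helper and a lock-acquiring wrapper can be sketched in
userspace C. Everything below is an assumption made for illustration:
struct submission_state, release_id(), release_id_locked() and
destroy_worker() are hypothetical names that only loosely mirror
release_guc_id / __release_guc_id, a pthread error-checking mutex
stands in for the kernel spinlock, and the lock_held flag is a crude
stand-in for a lockdep assertion such as lockdep_assert_held().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the driver state; not the real i915 layout. */
struct submission_state {
	pthread_mutex_t lock;
	bool lock_held;	/* crude substitute for a lockdep "held" assertion */
	int free_ids;
};

static void state_init(struct submission_state *s)
{
	pthread_mutexattr_t attr;

	pthread_mutexattr_init(&attr);
	/* An error-checking mutex returns EDEADLK when its owner relocks it. */
	pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
	pthread_mutex_init(&s->lock, &attr);
	pthread_mutexattr_destroy(&attr);
	s->lock_held = false;
	s->free_ids = 0;
}

/* Lock-held variant: the caller must already hold s->lock. */
static void release_id_locked(struct submission_state *s)
{
	assert(s->lock_held);
	s->free_ids++;
}

/* Lock-acquiring variant: takes s->lock itself; returns 0 or an errno. */
static int release_id(struct submission_state *s)
{
	int err = pthread_mutex_lock(&s->lock);

	if (err)
		return err;
	s->lock_held = true;
	release_id_locked(s);
	s->lock_held = false;
	pthread_mutex_unlock(&s->lock);
	return 0;
}

/* A worker that holds the lock while it releases an id. */
static int destroy_worker(struct submission_state *s, bool use_locked_variant)
{
	int err = 0;

	pthread_mutex_lock(&s->lock);
	s->lock_held = true;
	if (use_locked_variant)
		release_id_locked(s);	/* fine: the lock is already held */
	else
		err = release_id(s);	/* relock by the owner: EDEADLK */
	s->lock_held = false;
	pthread_mutex_unlock(&s->lock);
	return err;
}
```

In this sketch, destroy_worker(&s, false) reports EDEADLK and leaves
free_ids untouched, while destroy_worker(&s, true) succeeds. With a
real kernel spinlock the first case would not return an error; it is
the kind of recursive acquisition that lockdep warns about.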
> 
> [ 213.280129] i915: Running i915_perf_live_selftests/live_noa_gpr
> [ 213.283459] 
> [ 213.283462] WARNING: possible recursive locking detected
> [ 213.283466] 5.15.0-rc6+ #18 Tainted: G U W
> [ 213.283470] 
> [ 213.283472] kworker/u24:0/8 is trying to acquire lock:
> [ 213.283475] 8ffc4f6cc1e8 (>submission_state.lock){}-
> {2:2}, at: destroyed_worker_func+0x2df/0x350 [i915]
> [ 213.283618]
> but task is already holding lock:
> [ 213.283621] 8ffc4f6cc1e8 (>submission_state.lock){}-
> {2:2}, at: destroyed_worker_func+0x4f/0x350 [i915]
> [ 213.283720]
> other info that might help us debug this:
> [ 213.283724] Possible unsafe locking scenario:
> [ 213.283727] CPU0
> [ 213.283728] 
> [ 213.283730] lock(>submission_state.lock);
> [ 213.283734] lock(>submission_state.lock);
> [ 213.283737]
> *** DEADLOCK ***
> [ 213.283740] May be due to missing lock nesting notation
> [ 213.283744] 3 locks held by kworker/u24:0/8:
> [ 213.283747] #0: 8ffb80059d38
> ((wq_completion)events_unbound){..}-{0:0}, at:
> process_one_work+0x1f3/0x550
> [ 213.283757] #1: b509000e3e78 ((work_completion)(
> >submission_state.destroyed_worker)){..}-{0:0}, at:
> process_one_work+0x1f3/0x550
> [ 213.283766] #2: 8ffc4f6cc1e8 (
> >submission_state.lock){}-{2:2}, at:
> destroyed_worker_func+0x4f/0x350 [i915]
> [ 213.283860]
> stack backtrace:
> [ 213.283863] CPU: 8 PID: 8 Comm: kworker/u24:0 Tainted: G U W
> 5.15.0-rc6+ #18
> [ 213.283868] Hardware name: ASUS System Product Name/PRIME B560M-A
> AC, BIOS 0403 01/26/2021
> [ 213.283873] Workqueue: events_unbound destroyed_worker_func [i915]
> [ 213.283957] Call Trace:
> [ 213.283960] dump_stack_lvl+0x57/0x72
> [ 213.283966] __lock_acquire.cold+0x191/0x2d3
> [ 213.283972] lock_acquire+0xb5/0x2b0
> [ 213.283978] ? destroyed_worker_func+0x2df/0x350 [i915]
> [ 213.284059] ? destroyed_worker_func+0x2d7/0x350 [i915]
> [ 213.284139] ? lock_release+0xb9/0x280
> [ 213.284143] _raw_spin_lock_irqsave+0x48/0x60
> [ 213.284148] ? destroyed_worker_func+0x2df/0x350 [i915]
> [ 213.284226] destroyed_worker_func+0x2df/0x350 [i915]
> [ 213.284310] process_one_work+0x270/0x550
> [ 213.284315] worker_thread+0x52/0x3b0
> [ 213.284319] ? process_one_work+0x550/0x550
> [ 213.284322] kthread+0x135/0x160
> [ 213.284326] ? set_kthread_struct+0x40/0x40
> [ 213.284331] ret_from_fork+0x1f/0x30
> 
> and a bit later in the trace:
> 
> [ 227.499864] do_raw_spin_lock+0x94/0xa0
> [ 227.499868] _raw_spin_lock_irqsave+0x50/0x60
> [ 227.499871] ? guc_flush_destroyed_contexts+0x4f/0xf0 [i915]
> [ 227.45] guc_flush_destroyed_contexts+0x4f/0xf0 [i915]
> [ 227.500104] intel_guc_submission_reset_prepare+0x99/0x4b0 [i915]
> [ 227.500209] ? mark_held_locks+0x49/0x70
> [ 227.500212] intel_uc_reset_prepare+0x46/0x50 [i915]
> [ 227.500320] reset_prepare+0x78/0x90 [i915]
> [ 227.500412] __intel_gt_set_wedged.part.0+0x13/0xe0 [i915]
> [ 227.500485] intel_gt_set_wedged.part.0+0x54/0x100 [i915]
> [ 227.500556] intel_gt_set_wedged_on_fini+0x1a/0x30 [i915]
> [ 227.500622] intel_gt_driver_unregister+0x1e/0x60 [i915]
> [ 227.500694] i915_driver_remove+0x4a/0xf0 [i915]
> [ 227.500767] i915_pci_probe+0x84/0x170 [i915]
> [ 227.500838] local_pci_probe+0x42/0x80
> [ 227.500842] pci_device_probe+0xd9/0x190
> [ 227.500844] really_probe+0x1f2/0x3f0
> [ 227.500847] __driver_probe_device+0xfe/0x180
> [ 227.500848] driver_probe_device+0x1e/0x90
> [ 227.500850] __driver_attach+0xc4/0x1d0
> [ 227.500851] ? __device_attach_driver+0xe0/0xe0
> [ 227.500853] ? __device_attach_driver+0xe0/0xe0
> [ 227.500854] bus_for_each_dev+0x64/0x90
> [ 227.500856] bus_add_driver+0x12e/0x1f0
> [ 227.500857] driver_register+0x8f/0xe0
> [ 227.500859] i915_init+0x1d/0x8f [i915]
> [ 227.500934] ? 0xc144a000
> [ 227.500936] do_one_initcall+0x58/0x2d0
> [ 227.500938] ? rcu_read_lock_sched_held+0x3f/0x80
> [ 227.500940] ? kmem_cache_alloc_trace+0x238/0x2d0
> [ 227.500944] do_init_module+0x5c/0x270
> [ 227.500946] __do_sys_finit_module+0x95/0xe0
> [ 227.500949] do_syscall_64+0x38/0x90
> [ 227.500951] entry_SYSCALL_64_after_hwframe+0x44/0xae
> [ 227.500953] RIP: 0033:0x7ffa59d2ae0d
> [ 227.500954] Code: c8 0c 00 0f 05 eb a9 66 0f 1f 44 00 00 f3 0f 1e
> fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24
> 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 3b 80 0c 00 f7 d8 64
> 89 01 48
> [ 227.500955] RSP: 002b:7fff320bbf48 EFLAGS: 0246 ORIG_RAX:
> 0139
> [ 227.500956] RAX: ffda RBX: 022ea710 RCX:
> 7ffa59d2ae0d
> [ 227.500957] RDX:  RSI: 022e1d90 RDI:
> 0004
> [ 227.500958] RBP: 0020 R08: 7ffa59df3a60 R09:
> 0070
> [ 227.500958] R10: