On Tue, Sep 19, 2023 at 03:23:28PM +0200, Oleksandr Natalenko wrote:
> /cc Bagas as well (see below).
> 
> On úterý 19. září 2023 10:26:42 CEST Oleksandr Natalenko wrote:
> > /cc Matthew Wilcox and Andrew Morton because of folios (please see below).
> > 
> > On sobota 2. září 2023 18:14:12 CEST Oleksandr Natalenko wrote:
> > > Hello.
> > > 
> > > Since v6.5 kernel the following HW:
> > > 
> > > * Lenovo T460s laptop with Skylake GT2 [HD Graphics 520] (rev 07)
> > > * Lenovo T490s laptop with WhiskeyLake-U GT2 [UHD Graphics 620] (rev 02)
> > > 
> > > is affected by the following crash once KDE on either X11 or Wayland is 
> > > started:
> > > 
> > > i915 0000:00:02.0: enabling device (0006 -> 0007)
> > > i915 0000:00:02.0: vgaarb: deactivate vga console
> > > i915 0000:00:02.0: vgaarb: changed VGA decodes: 
> > > olddecodes=io+mem,decodes=io+mem:owns=mem
> > > i915 0000:00:02.0: [drm] Finished loading DMC firmware 
> > > i915/skl_dmc_ver1_27.bin (v1.27)
> > > [drm] Initialized i915 1.6.0 20201103 for 0000:00:02.0 on minor 1
> > > fbcon: i915drmfb (fb0) is primary device
> > > i915 0000:00:02.0: [drm] fb0: i915drmfb frame buffer device
> > > …
> > > memfd_create() without MFD_EXEC nor MFD_NOEXEC_SEAL, pid=674 
> > > 'kwin_wayland'
> > > BUG: unable to handle page fault for address: ffffb422c2800000
> > > #PF: supervisor write access in kernel mode
> > > #PF: error_code(0x0002) - not-present page
> > > PGD 100000067 P4D 100000067 PUD 1001df067 PMD 10d1cf067 PTE 0
> > > Oops: 0002 [#1] PREEMPT SMP PTI
> > > CPU: 1 PID: 674 Comm: kwin_wayland Not tainted 6.5.0-pf1 #1 
> > > a6c58ff41a7b8bb16a19f5af9e0e9bce20f9f38d
> > > Hardware name: LENOVO 20FAS2BM0F/20FAS2BM0F, BIOS N1CET90W (1.58 ) 
> > > 11/15/2022
> > > RIP: 0010:gen8_ggtt_insert_entries+0xc2/0x140 [i915]
> > > …
> > > Call Trace:
> > >  <TASK>
> > >  intel_ggtt_bind_vma+0x3e/0x60 [i915 
> > > a83fdc6539431252dba13053979a8b680af86836]
> > >  i915_vma_bind+0x216/0x4b0 [i915 a83fdc6539431252dba13053979a8b680af86836]
> > >  i915_vma_pin_ww+0x405/0xa80 [i915 
> > > a83fdc6539431252dba13053979a8b680af86836]
> > >  __i915_ggtt_pin+0x5a/0x130 [i915 
> > > a83fdc6539431252dba13053979a8b680af86836]
> > >  i915_ggtt_pin+0x78/0x1f0 [i915 a83fdc6539431252dba13053979a8b680af86836]
> > >  __intel_context_do_pin_ww+0x312/0x700 [i915 
> > > a83fdc6539431252dba13053979a8b680af86836]
> > >  i915_gem_do_execbuffer+0xfc6/0x2720 [i915 
> > > a83fdc6539431252dba13053979a8b680af86836]
> > >  i915_gem_execbuffer2_ioctl+0x111/0x260 [i915 
> > > a83fdc6539431252dba13053979a8b680af86836]
> > >  drm_ioctl_kernel+0xca/0x170
> > >  drm_ioctl+0x30f/0x580
> > >  __x64_sys_ioctl+0x94/0xd0
> > >  do_syscall_64+0x5d/0x90
> > >  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
> > > …
> > > note: kwin_wayland[674] exited with irqs disabled
> > > 
> > > RIP seems to translate into this:
> > > 
> > > $ scripts/faddr2line drivers/gpu/drm/i915/gt/intel_ggtt.o 
> > > gen8_ggtt_insert_entries+0xc2
> > > gen8_ggtt_insert_entries+0xc2/0x150:
> > > writeq at 
> > > /home/pf/work/devel/own/pf-kernel/linux/./arch/x86/include/asm/io.h:99
> > > (inlined by) gen8_set_pte at 
> > > /home/pf/work/devel/own/pf-kernel/linux/drivers/gpu/drm/i915/gt/intel_ggtt.c:257
> > > (inlined by) gen8_ggtt_insert_entries at 
> > > /home/pf/work/devel/own/pf-kernel/linux/drivers/gpu/drm/i915/gt/intel_ggtt.c:300
> > > 
> > > Probably, recent PTE-related changes are relevant:
> > > 
> > > $ git log --oneline --no-merges v6.4..v6.5 -- 
> > > drivers/gpu/drm/i915/gt/intel_ggtt.c
> > > 3532e75dfadcf drm/i915/uc: perma-pin firmwares
> > > 4722e2ebe6f21 drm/i915/gt: Fix second parameter type of pre-gen8 
> > > pte_encode callbacks
> > > 9275277d53248 drm/i915: use pat_index instead of cache_level
> > > 5e352e32aec23 drm/i915: preparation for using PAT index
> > > 341ad0e8e2542 drm/i915/mtl: Add PTE encode function
> > > 
> > > Also note Lenovo T14s laptop with TigerLake-LP GT2 [Iris Xe Graphics] 
> > > (rev 01) is not affected by this issue.
> > > 
> > > Full dmesg with DRM debug enabled is available in the bugreport I've 
> > > reported earlier [1]. I'm sending this email to make the issue more 
> > > visible.
> > > 
> > > Please help.
> > > 
> > > Thanks.
> > > 
> > > [1] https://gitlab.freedesktop.org/drm/intel/-/issues/9256
> > 
> > Matthew,
> > 
> > Andrzej asked me to try to revert commits 0b62af28f249, e0b72c14d8dc and 
> > 1e0877d58b1e, and reverting those fixed the i915 crash for me. The 
> > e0b72c14d8dc and 1e0877d58b1e commits look like just prerequisites, so I 
> > assume 0b62af28f249 ("i915: convert shmem_sg_free_table() to use a 
> > folio_batch") is the culprit here.
> > 
> > Could you please check this?
> > 
> > Our conversation with Andrzej is available at drm-intel GitLab [1].
> > 
> > Thanks.
> > 
> > [1] https://gitlab.freedesktop.org/drm/intel/-/issues/9256
> 
> Bagas,
> 
> would you mind adding this to the regression tracker please?
> 

Will add shortly, thanks!

-- 
An old man doll... just what I always wanted! - Clara

Attachment: signature.asc
Description: PGP signature

Reply via email to