Re: [Regression] 6.9.0: WARNING: workqueue: WQ_MEM_RECLAIM ttm:ttm_bo_delayed_delete [ttm] is flushing !WQ_MEM_RECLAIM events:qxl_gc_work [qxl]

2024-05-13 Thread Greg KH
On Wed, May 08, 2024 at 02:51:10PM +0200, Linux regression tracking (Thorsten 
Leemhuis) wrote:
> On 08.05.24 14:35, Anders Blomdell wrote:
> > On 2024-05-07 07:04, Linux regression tracking (Thorsten Leemhuis) wrote:
> >> On 06.05.24 16:30, David Wang wrote:
> >>>> On 30.04.24 08:13, David Wang wrote:
> >>
> >>>> And confirmed that the warning is caused by
> >>>> 07ed11afb68d94eadd4ffc082b97c2331307c5ea and reverting it can fix.
> >>>
> >>> The kernel warning still shows up in 6.9.0-rc7.
> >>> (I think 4 high load processes on a 2-Core VM could easily trigger
> >>> the kernel warning.)
> >>
> >> Thx for the report. Linus just reverted the commit 07ed11afb68 you
> >> mentioned in your initial mail (I put that quote in again, see above):
> >>
> >> 3628e0383dd349 ("Reapply "drm/qxl: simplify qxl_fence_wait"")
> >> https://git.kernel.org/torvalds/c/3628e0383dd349f02f882e612ab6184e4bb3dc10
> >>
> >> So this hopefully should be history now.
> >>
> > Since this affects the 6.8 series (6.8.7 and onwards), I made a CC to
> > sta...@vger.kernel.org
> 
> Ohh, good idea, I thought Linus had added a stable tag, but that is not
> the case. Adding Greg as well and making things explicit:
> 
> @Greg: you might want to add 3628e0383dd349 ("Reapply "drm/qxl: simplify
> qxl_fence_wait"") to all branches that received 07ed11afb68d94 ("Revert
> "drm/qxl: simplify qxl_fence_wait"") (which afaics went into v6.8.7,
> v6.6.28, v6.1.87, and v5.15.156).

Now queued up, thanks.

greg k-h



Re: [Regression] 6.9.0: WARNING: workqueue: WQ_MEM_RECLAIM ttm:ttm_bo_delayed_delete [ttm] is flushing !WQ_MEM_RECLAIM events:qxl_gc_work [qxl]

2024-05-08 Thread Linux regression tracking (Thorsten Leemhuis)
On 08.05.24 14:35, Anders Blomdell wrote:
> On 2024-05-07 07:04, Linux regression tracking (Thorsten Leemhuis) wrote:
>> On 06.05.24 16:30, David Wang wrote:
>>>> On 30.04.24 08:13, David Wang wrote:
>>
>>>> And confirmed that the warning is caused by
>>>> 07ed11afb68d94eadd4ffc082b97c2331307c5ea and reverting it can fix.
>>>
>>> The kernel warning still shows up in 6.9.0-rc7.
>>> (I think 4 high load processes on a 2-Core VM could easily trigger
>>> the kernel warning.)
>>
>> Thx for the report. Linus just reverted the commit 07ed11afb68 you
>> mentioned in your initial mail (I put that quote in again, see above):
>>
>> 3628e0383dd349 ("Reapply "drm/qxl: simplify qxl_fence_wait"")
>> https://git.kernel.org/torvalds/c/3628e0383dd349f02f882e612ab6184e4bb3dc10
>>
>> So this hopefully should be history now.
>>
> Since this affects the 6.8 series (6.8.7 and onwards), I made a CC to
> sta...@vger.kernel.org

Ohh, good idea, I thought Linus had added a stable tag, but that is not
the case. Adding Greg as well and making things explicit:

@Greg: you might want to add 3628e0383dd349 ("Reapply "drm/qxl: simplify
qxl_fence_wait"") to all branches that received 07ed11afb68d94 ("Revert
"drm/qxl: simplify qxl_fence_wait"") (which afaics went into v6.8.7,
v6.6.28, v6.1.87, and v5.15.156).

Ciao, Thorsten
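
For readers who want to check a kernel version against the list above, here is a minimal sketch. It encodes only the four stable releases named in this message (which were given with an "afaics" caveat). The names `FIRST_WITH_REVERT`, `parse_version` and `received_revert` are made up for illustration, and the sketch says nothing about which later release picked up the reapply.

```python
# First stable release per series that, according to this thread (afaics),
# received 07ed11afb68d94 ('Revert "drm/qxl: simplify qxl_fence_wait"').
# Hypothetical helper table; the numbers are copied from the message above.
FIRST_WITH_REVERT = {
    (6, 8): 7,
    (6, 6): 28,
    (6, 1): 87,
    (5, 15): 156,
}


def parse_version(text):
    """Turn 'v6.8.7' or '6.9.0-rc6' into (major, minor, patch).

    Any '-rcN' style suffix is ignored; a missing patch level counts as 0.
    """
    core = text.lstrip("v").split("-")[0]
    nums = [int(part) for part in core.split(".")]
    while len(nums) < 3:
        nums.append(0)
    return tuple(nums[:3])


def received_revert(text):
    """Return True/False for the four listed series, None for any other."""
    major, minor, patch = parse_version(text)
    first = FIRST_WITH_REVERT.get((major, minor))
    if first is None:
        return None  # series not mentioned in the thread
    return patch >= first
```

A result of `True` only means the release is at or after the point where the revert went in; whether it also carries 3628e0383dd349 has to be checked in the stable tree itself.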


Re: [Regression] 6.9.0: WARNING: workqueue: WQ_MEM_RECLAIM ttm:ttm_bo_delayed_delete [ttm] is flushing !WQ_MEM_RECLAIM events:qxl_gc_work [qxl]

2024-05-08 Thread Anders Blomdell




On 2024-05-07 07:04, Linux regression tracking (Thorsten Leemhuis) wrote:
> On 06.05.24 16:30, David Wang wrote:
>>> On 30.04.24 08:13, David Wang wrote:
>
>>> And confirmed that the warning is caused by
>>> 07ed11afb68d94eadd4ffc082b97c2331307c5ea and reverting it can fix.
>>
>> The kernel warning still shows up in 6.9.0-rc7.
>> (I think 4 high load processes on a 2-Core VM could easily trigger the kernel
>> warning.)
>
> Thx for the report. Linus just reverted the commit 07ed11afb68 you
> mentioned in your initial mail (I put that quote in again, see above):
>
> 3628e0383dd349 ("Reapply "drm/qxl: simplify qxl_fence_wait"")
> https://git.kernel.org/torvalds/c/3628e0383dd349f02f882e612ab6184e4bb3dc10
>
> So this hopefully should be history now.
>
> Ciao, Thorsten

Since this affects the 6.8 series (6.8.7 and onwards), I made a CC to
sta...@vger.kernel.org

/Anders



Re: [Regression] 6.9.0: WARNING: workqueue: WQ_MEM_RECLAIM ttm:ttm_bo_delayed_delete [ttm] is flushing !WQ_MEM_RECLAIM events:qxl_gc_work [qxl]

2024-05-06 Thread Linux regression tracking (Thorsten Leemhuis)



On 06.05.24 16:30, David Wang wrote:
>> On 30.04.24 08:13, David Wang wrote:

>> And confirmed that the warning is caused by
>> 07ed11afb68d94eadd4ffc082b97c2331307c5ea and reverting it can fix.
>
> The kernel warning still shows up in 6.9.0-rc7.
> (I think 4 high load processes on a 2-Core VM could easily trigger the kernel 
> warning.)

Thx for the report. Linus just reverted the commit 07ed11afb68 you
mentioned in your initial mail (I put that quote in again, see above):

3628e0383dd349 ("Reapply "drm/qxl: simplify qxl_fence_wait"")
https://git.kernel.org/torvalds/c/3628e0383dd349f02f882e612ab6184e4bb3dc10

So this hopefully should be history now.

Ciao, Thorsten


Re: [Regression] 6.9.0: WARNING: workqueue: WQ_MEM_RECLAIM ttm:ttm_bo_delayed_delete [ttm] is flushing !WQ_MEM_RECLAIM events:qxl_gc_work [qxl]

2024-05-06 Thread David Wang
The kernel warning still shows up in 6.9.0-rc7.

(I think 4 high load processes on a 2-Core VM could easily trigger the kernel 
warning.)

Thanks
David
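
The warning in the subject line is emitted by a sanity check in the workqueue code (`check_flush_dependency` in the trace of the original report). As a rough illustration of the condition, here is a simplified Python model. It assumes the check fires when a work item running on a WQ_MEM_RECLAIM workqueue flushes a work item that belongs to a workqueue without that flag. The class and function names are hypothetical, the flag value is arbitrary, and the real kernel check has further conditions that are left out here.

```python
from dataclasses import dataclass
from enum import IntFlag


class WqFlags(IntFlag):
    NONE = 0
    MEM_RECLAIM = 1  # arbitrary value for this model, not the kernel's


@dataclass
class Workqueue:
    name: str
    flags: WqFlags = WqFlags.NONE


def flush_dependency_warning(current_wq, current_func, target_wq, target_func):
    """Return the warning text, or None if the flush looks fine.

    current_wq is the workqueue the flushing work item runs on (None when
    the caller is not a workqueue worker); target_wq owns the flushed item.
    """
    # Flushing an item on a reclaim-safe workqueue is always acceptable.
    if target_wq.flags & WqFlags.MEM_RECLAIM:
        return None
    # Only a flusher that itself runs on a reclaim workqueue is a problem.
    if current_wq is None or not (current_wq.flags & WqFlags.MEM_RECLAIM):
        return None
    return (
        f"workqueue: WQ_MEM_RECLAIM {current_wq.name}:{current_func} "
        f"is flushing !WQ_MEM_RECLAIM {target_wq.name}:{target_func}"
    )


# The situation from this thread: ttm's delayed-delete work waits on
# qxl_gc_work, which sits on the "events" workqueue.
ttm = Workqueue("ttm", WqFlags.MEM_RECLAIM)
events = Workqueue("events")
msg = flush_dependency_warning(ttm, "ttm_bo_delayed_delete", events, "qxl_gc_work")
print(msg)
```

The point of such a check is that a reclaim workqueue is expected to make forward progress under memory pressure, so it should not wait on work that has no such guarantee.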




[Regression] 6.9.0: WARNING: workqueue: WQ_MEM_RECLAIM ttm:ttm_bo_delayed_delete [ttm] is flushing !WQ_MEM_RECLAIM events:qxl_gc_work [qxl]

2024-04-30 Thread David Wang
Hi,
I got the following kernel WARNING when my 2-core KVM (6.9.0-rc6) is under high
CPU load.

[Mon Apr 29 21:36:04 2024] ------------[ cut here ]------------
[Mon Apr 29 21:36:04 2024] workqueue: WQ_MEM_RECLAIM ttm:ttm_bo_delayed_delete [ttm] is flushing !WQ_MEM_RECLAIM events:qxl_gc_work [qxl]
[Mon Apr 29 21:36:04 2024] WARNING: CPU: 1 PID: 792 at kernel/workqueue.c:3728 check_flush_dependency+0xfd/0x120
[Mon Apr 29 21:36:04 2024] Modules linked in: xt_conntrack(E) 
nft_chain_nat(E) xt_MASQUERADE(E) nf_nat(E) nf_conntrack_netlink(E) 
xfrm_user(E) xfrm_algo(E) xt_addrtype(E) nft_compat(E) nf_tables(E) 
br_netfilter(E) bridge(E) stp(E) llc(E) ip_set(E) nfnetlink(E) ip_vs_sh(E) 
ip_vs_wrr(E) ip_vs_rr(E) ip_vs(E) nf_conntrack(E) nf_defrag_ipv6(E) 
nf_defrag_ipv4(E) intel_rapl_msr(E) intel_rapl_common(E) crct10dif_pclmul(E) 
ghash_clmulni_intel(E) snd_hda_codec_generic(E) snd_hda_intel(E) 
snd_intel_dspcfg(E) sha512_ssse3(E) snd_hda_codec(E) sha512_generic(E) 
sha256_ssse3(E) overlay(E) sha1_ssse3(E) snd_hda_core(E) snd_hwdep(E) 
aesni_intel(E) snd_pcm(E) crypto_simd(E) pcspkr(E) cryptd(E) joydev(E) qxl(E) 
snd_timer(E) drm_ttm_helper(E) ttm(E) evdev(E) snd(E) iTCO_wdt(E) serio_raw(E) 
sg(E) virtio_balloon(E) virtio_console(E) iTCO_vendor_support(E) soundcore(E) 
qemu_fw_cfg(E) drm_kms_helper(E) button(E) binfmt_misc(E) fuse(E) drm(E) 
configfs(E) virtio_rng(E) rng_core(E) ip_tables(E) x_tables(E) autofs4(E) 
ext4(E) crc16(E) mbcache(E) jbd2(E)
[Mon Apr 29 21:36:04 2024]  hid_generic(E) usbhid(E) hid(E) sr_mod(E) 
cdrom(E) ahci(E) libahci(E) virtio_net(E) net_failover(E) failover(E) 
virtio_blk(E) libata(E) xhci_pci(E) crc32_pclmul(E) crc32c_intel(E) scsi_mod(E) 
scsi_common(E) lpc_ich(E) i2c_i801(E) xhci_hcd(E) psmouse(E) i2c_smbus(E) 
virtio_pci(E) usbcore(E) virtio_pci_legacy_dev(E) virtio_pci_modern_dev(E) 
usb_common(E) virtio(E) mfd_core(E) virtio_ring(E)
[Mon Apr 29 21:36:04 2024] CPU: 1 PID: 792 Comm: kworker/u13:4 Tainted: 
GE  6.9.0-rc6-linan-5 #197
[Mon Apr 29 21:36:04 2024] Hardware name: QEMU Standard PC (Q35 + ICH9, 
2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
[Mon Apr 29 21:36:04 2024] Workqueue: ttm ttm_bo_delayed_delete [ttm]
[Mon Apr 29 21:36:04 2024] RIP: 0010:check_flush_dependency+0xfd/0x120
[Mon Apr 29 21:36:04 2024] Code: 8b 45 18 48 8d b2 c0 00 00 00 49 89 e8 
48 8d 8b c0 00 00 00 48 c7 c7 68 30 a4 a7 c6 05 9b 12 6e 01 01 48 89 c2 e8 53 
b9 fd ff <0f> 0b e9 1e ff ff ff 80 3d 86 12 6e 01 00 75 93 e9 4a ff ff ff 66
[Mon Apr 29 21:36:04 2024] RSP: 0018:9d31805abce8 EFLAGS: 00010086
[Mon Apr 29 21:36:04 2024] RAX:  RBX: 8c8c4004ee00 
RCX: 
[Mon Apr 29 21:36:04 2024] RDX: 0003 RSI: 0027 
RDI: 
[Mon Apr 29 21:36:04 2024] RBP: c0b53570 R08:  
R09: 0003
[Mon Apr 29 21:36:04 2024] R10: 9d31805abb80 R11: a7cc1108 
R12: 8c8c42eb8000
[Mon Apr 29 21:36:04 2024] R13: 8c8c48077900 R14: 8c8cbbd30b80 
R15: 0001
[Mon Apr 29 21:36:04 2024] FS:  () 
GS:8c8cbbd0() knlGS:
[Mon Apr 29 21:36:04 2024] CS:  0010 DS:  ES:  CR0: 
80050033
[Mon Apr 29 21:36:04 2024] CR2: 7ffd38bb3ff8 CR3: 00010217a000 
CR4: 00350ef0
[Mon Apr 29 21:36:04 2024] Call Trace:
[Mon Apr 29 21:36:04 2024]  <TASK>
[Mon Apr 29 21:36:04 2024]  ? __warn+0x7c/0x120
[Mon Apr 29 21:36:04 2024]  ? check_flush_dependency+0xfd/0x120
[Mon Apr 29 21:36:04 2024]  ? report_bug+0x18d/0x1c0
[Mon Apr 29 21:36:04 2024]  ? srso_return_thunk+0x5/0x5f
[Mon Apr 29 21:36:04 2024]  ? handle_bug+0x3c/0x80
[Mon Apr 29 21:36:04 2024]  ? exc_invalid_op+0x13/0x60
[Mon Apr 29 21:36:04 2024]  ? asm_exc_invalid_op+0x16/0x20
[Mon Apr 29 21:36:04 2024]  ? __pfx_qxl_gc_work+0x10/0x10 [qxl]
[Mon Apr 29 21:36:04 2024]  ? check_flush_dependency+0xfd/0x120
[Mon Apr 29 21:36:04 2024]  ? check_flush_dependency+0xfd/0x120
[Mon Apr 29 21:36:04 2024]  __flush_work.isra.0+0xc0/0x270
[Mon Apr 29 21:36:04 2024]  ? srso_return_thunk+0x5/0x5f
[Mon Apr 29 21:36:04 2024]  ? srso_return_thunk+0x5/0x5f
[Mon Apr 29 21:36:04 2024]  ? __queue_work.part.0+0x18b/0x3d0
[Mon Apr 29 21:36:04 2024]  ? srso_return_thunk+0x5/0x5f
[Mon Apr 29 21:36:04 2024]  qxl_queue_garbage_collect+0x7f/0x90 [qxl]
[Mon Apr 29 21:36:04 2024]  qxl_fence_wait+0x9c/0x180 [qxl]
[Mon Apr 29 21:36:04 2024]  dma_fence_wait_timeout+0x61/0x130
[Mon Apr 29 21:36:04 2024]  dma_resv_wait_timeout+0x6d/0xd0
[Mon Apr 29 21:36:04 2024]  ttm_bo_delayed_delete+0x26/0x80 [ttm]
[Mon Apr 29 21:36:04 2024]  process_one_work+0x18c/0x3b0
[Mon Apr 29 21:36:04