Re: Linux 6.7-rc1+: WARNING at drivers/xen/evtchn.c:167 evtchn_interrupt
On Tue, Mar 12, 2024 at 02:10:02PM +0100, Juergen Gross wrote: > On 30.11.23 03:34, Marek Marczykowski-Górecki wrote: > > Hi, > > > > While testing 6.7-rc3 on Qubes OS I found several warning like in the > > subject in dom0 log. I see them when running 6.7-rc1 too. I'm not sure > > what exactly triggers the issue, but my guess would be unbinding an > > event channel from userspace (closing vchan connection). > > > > Specific message: > > > > [ 83.973377] [ cut here ] > > [ 83.975523] Interrupt for port 77, but apparently not enabled; per-user > > a0e9f1d1 > > Finally I think I have a fix (thanks to Demi for finding the problematic patch > through bisecting). > > Could you please try the attached patch? It is based on current upstream, but > I think it should apply to 6.7 or stable 6.6, too. > > > Juergen > From 9d673c37b2d0c9aa274c53f619c4e9e43a419f41 Mon Sep 17 00:00:00 2001 > From: Juergen Gross > To: linux-ker...@vger.kernel.org > Cc: Juergen Gross > Cc: Stefano Stabellini > Cc: Oleksandr Tyshchenko > Cc: xen-devel@lists.xenproject.org > Date: Tue, 12 Mar 2024 13:52:24 +0100 > Subject: [PATCH] xen/evtchn: avoid WARN() when unbinding an event channel > > When unbinding a user event channel, the related handler might be > called a last time in case the kernel was built with > CONFIG_DEBUG_SHIRQ. This might cause a WARN() in the handler. > > Avoid that by adding an "unbinding" flag to struct user_event which > will short circuit the handler. > > Fixes: 9e90e58c11b7 ("xen: evtchn: Allow shared registration of IRQ handers") > Signed-off-by: Juergen Gross > --- > drivers/xen/evtchn.c | 6 ++ > 1 file changed, 6 insertions(+) > > diff --git a/drivers/xen/evtchn.c b/drivers/xen/evtchn.c > index 59717628ca42..f6a2216c2c87 100644 > --- a/drivers/xen/evtchn.c > +++ b/drivers/xen/evtchn.c > @@ -85,6 +85,7 @@ struct user_evtchn { > struct per_user_data *user; > evtchn_port_t port; > bool enabled; > + bool unbinding; > }; > > static void evtchn_free_ring(evtchn_port_t *ring) > @@ -164,6 +165,10 @@ static irqreturn_t evtchn_interrupt(int irq, void *data) > struct per_user_data *u = evtchn->user; > unsigned int prod, cons; > > + /* Handler might be called when tearing down the IRQ. */ > + if (evtchn->unbinding) > + return IRQ_HANDLED; > + > WARN(!evtchn->enabled, >"Interrupt for port %u, but apparently not enabled; per-user %p\n", >evtchn->port, u); > @@ -421,6 +426,7 @@ static void evtchn_unbind_from_user(struct per_user_data > *u, > > BUG_ON(irq < 0); > > + evtchn->unbinding = true; > unbind_from_irqhandler(irq, evtchn); > > del_evtchn(u, evtchn); > -- > 2.35.3 That fixes the problem, thanks! Code looks good to me. Reported-by: Demi Marie Obenour Tested-by: Demi Marie Obenour -- Sincerely, Demi Marie Obenour (she/her/hers) Invisible Things Lab signature.asc Description: PGP signature
Re: Linux 6.7-rc1+: WARNING at drivers/xen/evtchn.c:167 evtchn_interrupt
On 30.11.23 03:34, Marek Marczykowski-Górecki wrote: Hi, While testing 6.7-rc3 on Qubes OS I found several warning like in the subject in dom0 log. I see them when running 6.7-rc1 too. I'm not sure what exactly triggers the issue, but my guess would be unbinding an event channel from userspace (closing vchan connection). Specific message: [ 83.973377] [ cut here ] [ 83.975523] Interrupt for port 77, but apparently not enabled; per-user a0e9f1d1 Finally I think I have a fix (thanks to Demi for finding the problematic patch through bisecting). Could you please try the attached patch? It is based on current upstream, but I think it should apply to 6.7 or stable 6.6, too. Juergen From 9d673c37b2d0c9aa274c53f619c4e9e43a419f41 Mon Sep 17 00:00:00 2001 From: Juergen Gross To: linux-ker...@vger.kernel.org Cc: Juergen Gross Cc: Stefano Stabellini Cc: Oleksandr Tyshchenko Cc: xen-devel@lists.xenproject.org Date: Tue, 12 Mar 2024 13:52:24 +0100 Subject: [PATCH] xen/evtchn: avoid WARN() when unbinding an event channel When unbinding a user event channel, the related handler might be called a last time in case the kernel was built with CONFIG_DEBUG_SHIRQ. This might cause a WARN() in the handler. Avoid that by adding an "unbinding" flag to struct user_event which will short circuit the handler. Fixes: 9e90e58c11b7 ("xen: evtchn: Allow shared registration of IRQ handers") Signed-off-by: Juergen Gross --- drivers/xen/evtchn.c | 6 ++ 1 file changed, 6 insertions(+) diff --git a/drivers/xen/evtchn.c b/drivers/xen/evtchn.c index 59717628ca42..f6a2216c2c87 100644 --- a/drivers/xen/evtchn.c +++ b/drivers/xen/evtchn.c @@ -85,6 +85,7 @@ struct user_evtchn { struct per_user_data *user; evtchn_port_t port; bool enabled; + bool unbinding; }; static void evtchn_free_ring(evtchn_port_t *ring) @@ -164,6 +165,10 @@ static irqreturn_t evtchn_interrupt(int irq, void *data) struct per_user_data *u = evtchn->user; unsigned int prod, cons; + /* Handler might be called when tearing down the IRQ. */ + if (evtchn->unbinding) + return IRQ_HANDLED; + WARN(!evtchn->enabled, "Interrupt for port %u, but apparently not enabled; per-user %p\n", evtchn->port, u); @@ -421,6 +426,7 @@ static void evtchn_unbind_from_user(struct per_user_data *u, BUG_ON(irq < 0); + evtchn->unbinding = true; unbind_from_irqhandler(irq, evtchn); del_evtchn(u, evtchn); -- 2.35.3
Re: Linux 6.7-rc1+: WARNING at drivers/xen/evtchn.c:167 evtchn_interrupt
On Sat, Dec 02, 2023 at 10:46:38PM +0100, Marek Marczykowski-Górecki wrote: > On Fri, Dec 01, 2023 at 09:29:30AM +0100, Juergen Gross wrote: > > Hi Marek, > > > > On 30.11.23 03:34, Marek Marczykowski-Górecki wrote: > > > Hi, > > > > > > While testing 6.7-rc3 on Qubes OS I found several warning like in the > > > subject in dom0 log. I see them when running 6.7-rc1 too. I'm not sure > > > what exactly triggers the issue, but my guess would be unbinding an > > > event channel from userspace (closing vchan connection). > > > > > > Specific message: > > > > > > [ 83.973377] [ cut here ] > > > [ 83.975523] Interrupt for port 77, but apparently not enabled; > > > per-user a0e9f1d1 > > > > Just a guess, but I think this might happen when the vchan connection > > is closed while there is still some traffic. This could result in events > > triggering in parallel to unbinding the event channel. > > > > Could you please test the attached patch (only compile tested)? > > Unfortunately that doesn't help, I get exactly the same traceback. Hi, this is still an issue on 6.7.2, any other ideas? -- Best Regards, Marek Marczykowski-Górecki Invisible Things Lab signature.asc Description: PGP signature
Re: Linux 6.7-rc1+: WARNING at drivers/xen/evtchn.c:167 evtchn_interrupt
On Fri, Dec 01, 2023 at 09:29:30AM +0100, Juergen Gross wrote: > Hi Marek, > > On 30.11.23 03:34, Marek Marczykowski-Górecki wrote: > > Hi, > > > > While testing 6.7-rc3 on Qubes OS I found several warning like in the > > subject in dom0 log. I see them when running 6.7-rc1 too. I'm not sure > > what exactly triggers the issue, but my guess would be unbinding an > > event channel from userspace (closing vchan connection). > > > > Specific message: > > > > [ 83.973377] [ cut here ] > > [ 83.975523] Interrupt for port 77, but apparently not enabled; per-user > > a0e9f1d1 > > Just a guess, but I think this might happen when the vchan connection > is closed while there is still some traffic. This could result in events > triggering in parallel to unbinding the event channel. > > Could you please test the attached patch (only compile tested)? Unfortunately that doesn't help, I get exactly the same traceback. -- Best Regards, Marek Marczykowski-Górecki Invisible Things Lab signature.asc Description: PGP signature
Re: Linux 6.7-rc1+: WARNING at drivers/xen/evtchn.c:167 evtchn_interrupt
Hi Marek, On 30.11.23 03:34, Marek Marczykowski-Górecki wrote: Hi, While testing 6.7-rc3 on Qubes OS I found several warning like in the subject in dom0 log. I see them when running 6.7-rc1 too. I'm not sure what exactly triggers the issue, but my guess would be unbinding an event channel from userspace (closing vchan connection). Specific message: [ 83.973377] [ cut here ] [ 83.975523] Interrupt for port 77, but apparently not enabled; per-user a0e9f1d1 Just a guess, but I think this might happen when the vchan connection is closed while there is still some traffic. This could result in events triggering in parallel to unbinding the event channel. Could you please test the attached patch (only compile tested)? Juergen From e29761a8d5d68d7432659d601af60f556e3136a9 Mon Sep 17 00:00:00 2001 From: Juergen Gross Date: Fri, 1 Dec 2023 08:03:39 +0100 Subject: [PATCH] xen/events: fix race when unbinding an event channel MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit When unbinding an event channel the unbind function should only return to the caller when no interrupt handler can be active any more. When switching from a rwlock to RCU for freeing event channels this assumption was broken. Fix that by replacing the "is_active" flag of an event channel with an atomic used to keep track of event channel freeing AND interrupt handler activity. Rearrange struct irq_info a little bit to avoid adding new padding holes. Fixes: 87797fad6cce ("xen/events: replace evtchn_rwlock with RCU") Reported-by: Marek Marczykowski-Górecki Signed-off-by: Juergen Gross --- drivers/xen/events/events_base.c | 51 ++-- 1 file changed, 35 insertions(+), 16 deletions(-) diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c index f5edb9e27e3c..c480bc498ba8 100644 --- a/drivers/xen/events/events_base.c +++ b/drivers/xen/events/events_base.c @@ -106,7 +106,7 @@ struct irq_info { #define EVT_MASK_REASON_EXPLICIT 0x01 #define EVT_MASK_REASON_TEMPORARY 0x02 #define EVT_MASK_REASON_EOI_PENDING 0x04 - u8 is_active; /* Is event just being handled? */ + bool is_static; /* Is event channel static */ unsigned irq; evtchn_port_t evtchn; /* event channel */ unsigned short cpu; /* cpu bound */ @@ -114,7 +114,23 @@ struct irq_info { unsigned int irq_epoch; /* If eoi_cpu valid: irq_epoch of event */ u64 eoi_time; /* Time in jiffies when to EOI. */ raw_spinlock_t lock; - bool is_static; /* Is event channel static */ + + /* + * When freeing an event channel it is marked with EVT_ACT_FREEING. + * When an IRQ handler is starting the event channel is marked with + * EVT_ACT_HANDLER | EVT_ACT_EOI. + * When leaving the IRQ handler EVT_ACT_HANDLER is cleared. EVT_ACT_EOI + * is cleared either at the same time or in case of a late-EOI event + * when the EOI is signalled. + * A IRQ handler may only be entered if none of the flags is set. + * Freeing an event channel is done via RCU, but the freeing function + * starting the RCU process may only return if EVT_ACT_HANDLER is not + * set (no handler is active any more). + */ + atomic_t active; /* Is event just being handled? */ +#define EVT_ACT_HANDLER 0x0001 /* IRQ handler active */ +#define EVT_ACT_EOI 0x0002 /* EOI pending */ +#define EVT_ACT_FREEING 0x0004 /* even channel being freed */ union { unsigned short virq; @@ -640,8 +656,8 @@ static void xen_irq_lateeoi_locked(struct irq_info *info, bool spurious) info->eoi_time = 0; - /* is_active hasn't been reset yet, do it now. */ - smp_store_release(>is_active, 0); + /* EVT_ACT_EOI hasn't been reset yet, do it now. */ + atomic_fetch_andnot_release(EVT_ACT_EOI, >active); do_unmask(info, EVT_MASK_REASON_EOI_PENDING); } @@ -792,9 +808,9 @@ static void xen_free_irq(struct irq_info *info) } /* Not called for lateeoi events. */ -static void event_handler_exit(struct irq_info *info) +static void event_handler_exit(struct irq_info *info, int flag) { - smp_store_release(>is_active, 0); + atomic_fetch_andnot_release(EVT_ACT_HANDLER | flag, >active); clear_evtchn(info->evtchn); } @@ -819,7 +835,7 @@ static void do_eoi_pirq(struct irq_info *info) if (!VALID_EVTCHN(info->evtchn)) return; - event_handler_exit(info); + event_handler_exit(info, EVT_ACT_EOI); if (pirq_needs_eoi(info)) { rc = HYPERVISOR_physdev_op(PHYSDEVOP_eoi, ); @@ -968,6 +984,8 @@ static void __unbind_from_irq(struct irq_info *info, unsigned int irq) return; } + atomic_fetch_or_acquire(EVT_ACT_FREEING, >active); + evtchn = info->evtchn; if (VALID_EVTCHN(evtchn)) { @@ -997,6 +1015,10 @@ static void __unbind_from_irq(struct irq_info *info, unsigned int irq) xen_irq_info_cleanup(info); } + /* Wait until all IRQ handlers are gone. */ + while (atomic_read_acquire(>active) & EVT_ACT_HANDLER) + cpu_relax(); + xen_free_irq(info);
Linux 6.7-rc1+: WARNING at drivers/xen/evtchn.c:167 evtchn_interrupt
Hi, While testing 6.7-rc3 on Qubes OS I found several warning like in the subject in dom0 log. I see them when running 6.7-rc1 too. I'm not sure what exactly triggers the issue, but my guess would be unbinding an event channel from userspace (closing vchan connection). Specific message: [ 83.973377] [ cut here ] [ 83.975523] Interrupt for port 77, but apparently not enabled; per-user a0e9f1d1 [ 83.979083] WARNING: CPU: 1 PID: 2432 at drivers/xen/evtchn.c:167 evtchn_interrupt+0xb6/0xc0 [xen_evtchn] [ 83.982869] Modules linked in: joydev intel_rapl_msr intel_rapl_common snd_hda_codec_generic ledtrig_audio snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi ppdev snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device snd_pcm parport_pc parport pcspkr snd_timer snd e1000e i2c_piix4 soundcore fuse loop xenfs dm_thin_pool dm_persistent_data dm_bio_prison crct10dif_pclmul crc32_pclmul crc32c_intel polyval_clmulni polyval_generic ghash_clmulni_intel sha512_ssse3 sha256_ssse3 sha1_ssse3 virtio_scsi bochs serio_raw drm_vram_helper xhci_pci xhci_pci_renesas floppy drm_ttm_helper xhci_hcd ttm ata_generic pata_acpi qemu_fw_cfg virtio_console xen_privcmd xen_pciback xen_blkback xen_gntalloc xen_gntdev xen_evtchn scsi_dh_rdac scsi_dh_emc scsi_dh_alua uinput dm_multipath [ 84.019753] CPU: 1 PID: 2432 Comm: qrexec-client Not tainted 6.7.0-0.rc3.1.qubes.1.fc37.x86_64 #1 [ 84.027045] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.2-3-gd478f380-rebuilt.opensuse.org 04/01/2014 [ 84.033828] RIP: e030:evtchn_interrupt+0xb6/0xc0 [xen_evtchn] [ 84.037814] Code: ba 01 00 00 00 be 1d 00 00 00 48 8d bb 88 00 00 00 e8 ce 93 27 c1 eb b4 8b 76 20 48 89 da 48 c7 c7 70 52 21 c0 e8 0a 5c f0 c0 <0f> 0b e9 61 ff ff ff 0f 1f 00 90 90 90 90 90 90 90 90 90 90 90 90 [ 84.048932] RSP: e02b:c90041e7fd60 EFLAGS: 00010082 [ 84.051252] RAX: RBX: 888104fc9a80 RCX: 0027 [ 84.054569] RDX: 88813ff21608 RSI: 0001 RDI: 88813ff21600 [ 84.057606] RBP: 88810522a280 R08: R09: c90041e7fbf8 [ 84.060864] R10: 0003 R11: 82f460c8 R12: 888105d5b6a4 [ 84.064043] R13: 888105d5b760 R14: 88810522a280 R15: 888105d5b600 [ 84.066598] FS: 75ee3eb8ed40() GS:88813ff0() knlGS: [ 84.069251] CS: 1e030 DS: ES: CR0: 80050033 [ 84.071293] CR2: 7b86fa0f0f5c CR3: 000106e18000 CR4: 00040660 [ 84.073911] Call Trace: [ 84.074746] [ 84.075471] ? evtchn_interrupt+0xb6/0xc0 [xen_evtchn] [ 84.077448] ? __warn+0x81/0x130 [ 84.078671] ? evtchn_interrupt+0xb6/0xc0 [xen_evtchn] [ 84.080507] ? report_bug+0x171/0x1a0 [ 84.081830] ? prb_read_valid+0x1b/0x30 [ 84.083682] ? handle_bug+0x41/0x70 [ 84.084952] ? exc_invalid_op+0x17/0x70 [ 84.086314] ? asm_exc_invalid_op+0x1a/0x20 [ 84.088520] ? evtchn_interrupt+0xb6/0xc0 [xen_evtchn] [ 84.090300] __free_irq+0x114/0x2b0 [ 84.091478] free_irq+0x32/0x70 [ 84.092522] unbind_from_irqhandler+0x31/0xb0 [ 84.094035] evtchn_release+0x2b/0xa0 [xen_evtchn] [ 84.095643] __fput+0x9a/0x2c0 [ 84.096734] __x64_sys_close+0x3d/0x80 [ 84.097995] do_syscall_64+0x63/0xe0 [ 84.099188] ? __vm_munmap+0xbb/0x150 [ 84.100431] ? syscall_exit_work+0x103/0x130 [ 84.101854] ? syscall_exit_to_user_mode+0x2b/0x40 [ 84.103509] ? do_syscall_64+0x6f/0xe0 [ 84.104757] ? syscall_exit_to_user_mode+0x2b/0x40 [ 84.106350] ? do_syscall_64+0x6f/0xe0 [ 84.107587] ? exit_to_user_mode_prepare+0xbc/0xd0 [ 84.109180] entry_SYSCALL_64_after_hwframe+0x6e/0x76 [ 84.110863] RIP: 0033:0x75ee3ed15a0a [ 84.111999] Code: 00 00 0f 05 48 3d 00 f0 ff ff 77 44 c3 0f 1f 00 48 83 ec 18 89 7c 24 0c e8 33 bc f8 ff 8b 7c 24 0c 89 c2 b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 36 89 d7 89 44 24 0c e8 93 bc f8 ff 8b 44 24 [ 84.118267] RSP: 002b:70ad0f70 EFLAGS: 0293 ORIG_RAX: 0003 [ 84.120756] RAX: ffda RBX: 55d3d8f70970 RCX: 75ee3ed15a0a [ 84.123675] RDX: RSI: RDI: 000f [ 84.126012] RBP: R08: R09: [ 84.128366] R10: 75ee3ec31468 R11: 0293 R12: [ 84.130699] R13: R14: 70ad2eaa R15: 55d3d8f6f2a0 [ 84.133050] [ 84.133810] ---[ end trace ]--- Full log: https://openqa.qubes-os.org/tests/86647/logfile?filename=serial0.txt -- Best Regards, Marek Marczykowski-Górecki Invisible Things Lab signature.asc Description: PGP signature