Re: [PATCH] e1000e: Prevent crash from legacy interrupt firing after MSI-X enable

Akihiko Odaki Thu, 07 Aug 2025 03:35:01 -0700

On 2025/08/07 19:31, Laurent Vivier wrote:

On 07/08/2025 12:13, Akihiko Odaki wrote:
On 2025/08/07 17:36, Laurent Vivier wrote:
On 06/08/2025 19:44, Akihiko Odaki wrote:
On 2025/08/07 0:29, Laurent Vivier wrote:
A race condition between guest driver actions and QEMU timers can lead
to an assertion failure when the guest switches the e1000e from legacy
interrupt mode to MSI-X. If a legacy interrupt delay timer (TIDV or
RDTR) is active, but the guest enables MSI-X before the timer fires,
the pending interrupt cause can trigger an assert in
`e1000e_intmgr_collect_delayed_causes()`.
The function's assertion (`assert(core->delayed_causes == 0)`)
incorrectly assumes that it's impossible for a legacy delayedinterrupt
to be pending once the device is in MSI-X mode.

This behavior is incorrect. On a physical device, a driver-initiated
mode switch would mask interrupts, reconfigure the hardware, and clear
any stale interrupt states. The legacy delay timers (TIDV/RDTR) arenot
used for moderation in MSI-X mode; the Interrupt Throttle Rate (ITR)
mechanism is used instead. Therefore, any pending interrupt from the
old mode should be ignored.
It is true that triggering assertion is incorrect as per: docs/devel/ secure-coding- practices.rst
However, I don't see statements in the datasheet that says modeswitch will clear stale interrupts.
The datasheet doesn't say "stale interrupts are cleared." but itdescribes two fundamentally separate and mutually exclusive hardwarepaths for generating interrupts:
Intel® 82574 GbE Controller Family Datasheet
https://docs.rs-online.com/96e8/0900766b81384733.pdf
It is an old revision (2.5). More recent revision (3.4) can be found at:
https://courses.cs.washington.edu/courses/cse451/18au/readings/e1000e.pdf
See 7.4.1 Legacy and MSI Interrupt Modes
Legacy/MSI Path: An event (like a packet transmission completing)sets a cause bit in the ICR (Interrupt Cause Register). The legacydelay timers (TIDV/RDTR) can delay the propagation of this cause.When the timer expires, if the corresponding bit in the IMS(Interrupt Mask Set) is enabled, the hardware asserts the INTx pin tosignal the CPU.
Both revisions 2.5 and 3.4 have the following statements, differentfrom what you quoted:
 > In legacy and MSI modes, an interrupt cause is reflected by setting
 > one of the bits in the ICR register, where each bit reflects one or
 > more causes. This description of ICR register provides the mapping of
 > interrupt causes (for example, a specific Rx queue event or a LSC
 > event) to bits in the ICR.
 >
 > Mapping of causes relating to the Tx and Rx queues as well as
 > non-queue causes in this mode is not configurable. Each possible queue
 > interrupt cause (such as, each Rx queue, Tx queue or any other
 > interrupt source) has an entry in the ICR.
 >
 > The following configuration and parameters are involved:
 > • The ICR[31:0] bits are allocated to specific interrupt causes

Please ensure that you refer a correct datasheet and share it for me.
See 7.4.2 MSI-X Mode
MSI-X Path: An event on a specific queue (e.g., Rx Queue 1) is mappedvia the IVAR (Interrupt Vector Allocation Register) to a specificMSI-X vector. The ITR (Interrupt Throttle Rate) register for thatvector then determines if an interrupt should be generated. Ifallowed, the hardware performs a memory write to the addressspecified in the PCI MSI-X table, delivering the message to aspecific CPU core.
Only non-queue causes are reflected in ICR (so not TIDV/RDTR).
The key here is that a TIDV timer expiring is only connected to thelegacy ICR->IMS- >INTx path. There is no described hardware path fora TIDV timer expiration to trigger an MSI-X memory write.
Therefore, if a guest enables MSI-X, it reconfigures its interruptcontroller (APIC) to listen for MSI-X messages, not legacy INTx pinassertions. So, even if the stale TIDV timer fires on the hardware,the interrupt it generates has no configured path to be received bythe guest OS. From the guest's perspective, the interrupt is lost.Our emulation should model this by ignoring/clearing the stale event,which is precisely what the patch does.
The expression "TIDV/RDTR are not used for moderation in MSI-X mode"is also unclear. Behaving drivers may indeed use ITR for thatpurpose, but the question for us is: what will e1000e do when theguest tries to use TIDV/RDTR in MSI-X mode anyway? That defines thebehavior we need to implement.
The TIDV and RDTR registers are part of the device's memory-mapped I/O space. The hardware will almost certainly allow a write to theseregisters at any time.
However, based on the separate hardware paths described above, thewrite would be ineffective. A driver could set the TIDV timer, andthe timer would likely count down, but its expiration event is onlywired to the legacy interrupt generation logic. Since the device isconfigured for MSI-X interrupt delivery, that path is dormant. Thewrite is accepted, but the action is inert.
If the datasheet describes the expected behavior with delayedinterrupts in MSI-X, a reference to the datasheet should be made atleast in the patch message. Otherwise, perhaps this "if(msix_enabled(core->owner))" is just extraneous and should be removed.
The "if (msix_enabled(core->owner))" check is not extraneous and mustbe kept. It correctly separates the emulation logic for these twomutually exclusive hardware modes. Removing it would incorrectlyallow a legacy delayed interrupt to be processed as if it were validin MSI-X mode, which is not how the hardware works.
Moreover it can introduce unexpected behavior in the guest as thiscase could not be managed.
Revision 3.4's 4.6.1 Interrupts During Initialization says:
 > Most drivers disable interrupts during initialization to prevent
 > re-entrancy. Interrupts are disabled by writing to the IMC register.
 > Note that the interrupts need to be disabled also after issuing a
 > global reset, so a typical driver initialization flow is:
 >
 > 1. Disable interrupts
 > 2. Issue a global reset
 > 3. Disable interrupts (again)
 > 4. …
 >
 > After the initialization completes, a typical driver enables the
 > desired interrupts by
 > writing to the IMS register.
Drivers can ensure that old interrupts will not fire by following thisprocedure. The behavior when not following this is undefined (in thedatasheet I'm referring to).
So, do you agree if I only replace the assert() by aqemu_log_mask(LOG_GUEST_ERROR, ...)?

The following "return 0" is extraneous so should be removed along withassert().


Regards,
Akihiko Odaki

Re: [PATCH] e1000e: Prevent crash from legacy interrupt firing after MSI-X enable

Reply via email to