Hi Geoff,

Saitoh-san pointed me at this email. I've been looking at MSI briefly and
should have some work in place to sort this out. About your specific
situation:

Geoff Wing <gcw%pobox.com@localhost> writes:

> Hi,
> brief background:  on an amd64 VM (1 CPU on VMWare ESXi) I had a network
> interface (vmx) failing because it could not get an interrupt slot.  The
> vmx wants 3 interrupts per interface (tx/rx/link-state).  I had a few
> on an admin machine and one started failing when ahcisata was changed to
> use MSI (not ahcisata's fault, obviously).
>
> On i386/amd64 each CPU has a 32 bitmask for interrupts (1 bit per) - but
> 16 of the 32 are reserved for legacy IRQs (on the first CPU).  MSI-X allows
> for 2048 interrupts.  On a physical machine with many CPUs the MSI interrupts
> are farmed out across the different CPUs so would not be apparent to most.
> (and no problem for those 65+ core machines).
>
> For my personal use, I've hacked around by ignoring the reserved legacy IRQ
> region because it's not relevant to me in my VM with MSI/MSI-X.  Other
> people using single CPU VMs may start bumping into this issue so just
> making people aware.  I haven't looked into changing how interrupts are
> handled or if there would be significant performance penalty.
>

As a stopgap, you could switch to a 64-bit mask, with equivalent
supporting data structures, instead of the 32-bit one. You'll probably
also need to look at the spl.S primitives that access the pending
bitmask via assembler instructions and teach them to operate on a
64-bit pending mask.
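
A minimal sketch of what the 64-bit stopgap amounts to, written in C
rather than the real assembler. The struct, field and function names are
made up for illustration and are not the kernel's actual ones, and
__builtin_clzll assumes GCC or Clang:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-CPU state: one pending bit per interrupt slot,
 * widened from 32 to 64 bits. */
struct cpu_intr_state {
	uint64_t ipending;
};

/* Mark a slot as pending. The constant must be 64 bits wide, otherwise
 * the shift is done in int and breaks for slots >= 32. */
static inline void
intr_set_pending(struct cpu_intr_state *ci, unsigned slot)
{
	assert(slot < 64);
	ci->ipending |= UINT64_C(1) << slot;
}

/* Clear a slot once it has been serviced. */
static inline void
intr_clear_pending(struct cpu_intr_state *ci, unsigned slot)
{
	assert(slot < 64);
	ci->ipending &= ~(UINT64_C(1) << slot);
}

/* Return the highest-numbered pending slot that is set in `unmasked`,
 * or -1 if there is none. This is the 64-bit analogue of a bit scan
 * over the 32-bit pending word. */
static inline int
intr_highest_pending(const struct cpu_intr_state *ci, uint64_t unmasked)
{
	uint64_t p = ci->ipending & unmasked;

	if (p == 0)
		return -1;
	return 63 - __builtin_clzll(p);
}
```

This ignores atomicity; the real primitives would still need the same
locking or interrupt-disabled guarantees they have today, and on i386 a
64-bit word cannot be scanned with a single instruction.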

The right thing to do is to stop using a bit mask entirely and use a
more scalable data structure for this. This isn't trivial though - it
will be harder to keep the assembler code correct than with a
straight-up bus-locked bitscan/compare etc.
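
Purely as an illustration of the kind of structure I mean (this is not a
design I've settled on), here is a sketch in C where pending sources sit
on per-priority-level lists, so the number of sources is no longer tied
to a word width. NIPL and all the names are hypothetical:

```c
#include <assert.h>
#include <stddef.h>

#define NIPL 8	/* hypothetical number of priority levels */

/* Hypothetical interrupt source; `queued` prevents double insertion. */
struct intr_source {
	int id;
	int ipl;			/* priority level of this source */
	int queued;
	struct intr_source *next;
};

/* One pending list per priority level instead of one bit per source. */
struct intr_pending {
	struct intr_source *head[NIPL];
};

/* Queue a source as pending; a source already queued is left alone. */
static void
pending_enqueue(struct intr_pending *ip, struct intr_source *is)
{
	assert(is->ipl >= 0 && is->ipl < NIPL);
	if (is->queued)
		return;
	is->queued = 1;
	is->next = ip->head[is->ipl];
	ip->head[is->ipl] = is;
}

/* Remove and return a pending source whose level is strictly above
 * `curlevel`, highest level first, or NULL if nothing is runnable. */
static struct intr_source *
pending_dequeue(struct intr_pending *ip, int curlevel)
{
	for (int l = NIPL - 1; l > curlevel; l--) {
		struct intr_source *is = ip->head[l];

		if (is != NULL) {
			ip->head[l] = is->next;
			is->next = NULL;
			is->queued = 0;
			return is;
		}
	}
	return NULL;
}
```

Note that every enqueue and dequeue here is a multi-step update, which is
exactly where the correctness burden comes from compared with a single
locked bit operation.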

In any case, I'll report back when I get around to this.

Many Thanks,
-- 
~cherry
