This is urgently needed for nxge.
What I don't understand, however, is how IRM comes into play here. Are we
concerned that devices will request more interrupts than are present on
the system?
How does a device exporting this property differ from a driver simply
requesting the interrupts via ddi_intr_alloc()? (In other words, having an
arbitrary limit, with a trivial override, seems somewhat pointless to
me, unless -- and quite possibly -- I'm missing some larger point.)
Note that nxge will suffer even on x86 if the default is less than the
number of cores available. I realize nxge performance on x86 isn't as
interesting, but I'm uncomfortable with the idea of ignoring this. And
future IHV-based products are likely to want/need more MSI-X interrupts
as well. (The default of 2 is woefully inadequate, especially for large
systems like x4600.)
Perhaps the default value for this should be based upon the number of
cores available in the system? (Admittedly, I am not intimately
familiar with how MSI-X interrupts are allocated on x86 hardware...)
-- Garrett
Artem Kachitchkine wrote:
> I am sponsoring this case for Edward Gillett.
> Requested binding is patch/micro, timeout 08/15/2007.
>
> -Artem
>
> Template Version: @(#)sac_nextcase 1.64 07/13/07 SMI
> This information is Copyright 2007 Sun Microsystems
> 1. Introduction
> 1.1. Project/Component Working Name:
> MSI-X interrupt limit override
> 1.2. Name of Document Author/Supplier:
> Author: Edward Gillett
> 1.3 Date of This Document:
> 07 August, 2007
> 4. Technical Description
>
> 4.1. Summary
>
> This proposal provides a short-term solution for device drivers to
> request and receive more MSI-X interrupt resources than the current
> default limit. The long-term solution will be provided by the
> Interrupt Resource Management (IRM) project. The proposed solution will
> be implemented on SPARC only.
>
>
> 4.2. Problem
>
> Even though drivers can request an arbitrary number of interrupts via
> ddi_intr_alloc(9F), the number of MSI-X interrupts returned is currently
> limited to 2. Some high-throughput drivers, such as nxge and qlc,
> are forced to share 2 interrupts among multiple DMA channels, which
> hurts performance.
>
> The reason for the limit of 2 is that the initial phase of the Advanced
> DDI interrupt project (PSARC 2004/253) did not implement any IRM
> interfaces. The IRM project is expected to integrate in early 2008,
> but there is an immediate need to address performance issues on Sun's
> latest platforms.
>
>
> 4.3. Proposal
>
> Provide a boolean device property, "#msix-request", that tells the
> DDI framework to attempt to allocate more interrupts than the
> default limit when the driver requests them.
>
> A device driver can create the proposed property if and only if it
> requires more MSI-X interrupts than the default limit. It should create
> this property before registering any of its interrupts. Sample code:
>
>     (void) ddi_prop_create(DDI_DEV_T_NONE, dip,
>         DDI_PROP_CANSLEEP, "#msix-request", NULL, 0);
>
> The DDI framework will look for the proposed property as part of the
> driver's interrupt registration process and will try to satisfy the
> driver's MSI-X allocation request. The number of MSI-X interrupts
> actually allocated in this case may be less than or equal to what the
> driver has requested. It depends on the availability of MSI-X
> interrupt resources at that moment and on other platform-specific
> limitations. The driver must therefore continue to check the number
> of MSI-X interrupts returned by ddi_intr_alloc(9F) and must not
> assume its requests will be honored in full.
>
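To make the "must check the returned count" point concrete, here is a minimal user-space sketch of what a driver might do when it is granted fewer vectors than it has DMA channels. The helper name map_channels_to_vectors() and the round-robin policy are assumptions for illustration only; they are not part of the DDI or of any existing driver.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of the DDI: assign each DMA channel to one
 * of the vectors actually granted, round-robin. Returns the largest number
 * of channels sharing any single vector, or -1 on bad arguments.
 */
int
map_channels_to_vectors(int nchannels, int nactual, int *vector_of)
{
	int ch;

	if (nchannels <= 0 || nactual <= 0 || vector_of == NULL)
		return (-1);

	for (ch = 0; ch < nchannels; ch++)
		vector_of[ch] = ch % nactual;

	/* Invariant: the last channel landed on a granted vector. */
	assert(vector_of[nchannels - 1] < nactual);

	/* Ceiling division: worst-case channels per vector. */
	return ((nchannels + nactual - 1) / nactual);
}
```

With 8 channels and the default of 2 vectors, each vector serves 4 channels; with 8 vectors granted, each channel gets its own. In a real driver, nactual would be the count reported back by ddi_intr_alloc(9F).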
> Constraints:
>
> 1) This solution is limited to MSI-X interrupts only and it will not be
> extended to MSI interrupts. This means device drivers will continue
> to receive the current default number of MSI interrupts irrespective
> of this special driver property.
>
> 2) Device drivers using MSI-X interrupts without this property will
> only receive the default MSI-X allocation.
>
> 3) At present, there is no plan to implement this feature on non-SPARC
> platforms. The property, if present, will be ignored on x86
> platforms. Business priority is to address performance issues on
> SPARC systems urgently, while the x86 platforms can wait for the
> long-term solution. Also, x86 systems provide much more limited
> interrupt resources and can suffer from allocating too many
> interrupts.
>
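The constraints above can be summarized as a small decision model. This is only a sketch of the policy as the proposal describes it, not the framework's implementation: the function model_granted(), the enums, and the idea of passing the default limit and the available vector count as plain integers are all assumptions made for illustration.

```c
#include <assert.h>

enum intr_type { INTR_MSI, INTR_MSIX };
enum platform { PLAT_SPARC, PLAT_X86 };

static int
min_int(int a, int b)
{
	return (a < b ? a : b);
}

/*
 * Hypothetical model: how many interrupts a request would be granted.
 * default_limit is the platform default (2 for MSI-X per the proposal);
 * available is the number of free vectors at the time of the request.
 */
int
model_granted(enum intr_type type, enum platform plat, int has_prop,
    int requested, int default_limit, int available)
{
	int limit = default_limit;

	assert(requested >= 0 && default_limit >= 0 && available >= 0);

	/* The override applies only to MSI-X on SPARC with the property set. */
	if (type == INTR_MSIX && plat == PLAT_SPARC && has_prop)
		limit = requested;

	/* Never grant more than was requested or than is available. */
	return (min_int(requested, min_int(limit, available)));
}
```

Under this model, a SPARC driver that sets the property and asks for 8 MSI-X vectors gets 8 if they are free and fewer otherwise, while the same request on x86, or for MSI, or without the property, stays at the default.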
> The proposed interface is Contracted Consolidation Private. Each driver
> owner group is expected to contract the interface and to transition to
> the new IRM-based interfaces when they become available.
>
> 4.4. Interfaces
>
> interface     | stability                        | description
> --------------+----------------------------------+----------------------
> #msix-request | Contracted Consolidation Private | boolean property to
>               |                                  | override MSI-X limit
> --------------+----------------------------------+----------------------
>
> Binding: patch/micro
>
> 4.5. References
>
> PSARC 2004/253 Advanced DDI Interrupt Interfaces
>
>
> 6. Resources and Schedule
> 6.4. Steering Committee requested information
> 6.4.1. Consolidation C-team Name:
> ON
> 6.5. ARC review type: FastTrack
> 6.6. ARC Exposure: open
>