On 08/07/07 22:32, Garrett D'Amore wrote:
> This is urgently needed for nxge.
>
> What I don't understand, however, is why or how IRM plays here. Are we
> concerned that there will be devices requesting more interrupts than are
> present on the system?

Yes, that is one of the concerns here. IRM will manage the interrupt
resources, allocating and reclaiming them as necessary. It is basically a
contract between drivers and the DDI interrupt framework: a driver that has
requested additional interrupt resources must be aware that it may need to
return them at a future time as system conditions change. The x86
architecture is much more limited than what we have on SPARC. Some priority
levels have a very limited number of interrupt resources, and without
management it is possible for one or two devices to request everything and
leave nothing for anyone else at that PIL. You can contact the
ddi-intr-iteam alias for more info.

> What is so different from a device exporting this property versus
> requesting them via ddi_intr_alloc()? (In other words, having an
> arbitrary limit, with a trivial override, seems somewhat pointless to
> me, unless -- and quite possibly -- I'm missing some larger point.)

The framework can change the limit on how many interrupts can be allocated
through ddi_intr_alloc(), but that alone establishes no contract between the
drivers and the framework. You'll still need to make the request through
ddi_intr_alloc(), but this property will provide an interim
solution/contract for the drivers.

Edward

> Note that nxge will suffer even on x86 if the default is less than the
> number of cores available. I realize nxge performance on x86 isn't as
> interesting, but I'm uncomfortable with the idea of ignoring this. And
> future IHV-based products are likely to want/need more MSI-X interrupts
> as well. (The default of 2 is woefully inadequate, especially for large
> systems like x4600.)
>
> Perhaps the default value for this should be based upon the number of
> cores available in the system? (Admittedly, I am not intimately
> familiar with how MSI-X interrupts are allocated on x86 hardware...)
>
> -- Garrett
>
> Artem Kachitchkine wrote:
>> I am sponsoring this case for Edward Gillett.
>> Requested binding is patch/micro, timeout 08/15/2007.
>>
>> -Artem
>>
>> Template Version: @(#)sac_nextcase 1.64 07/13/07 SMI
>> This information is Copyright 2007 Sun Microsystems
>> 1. Introduction
>>     1.1. Project/Component Working Name:
>>          MSI-X interrupt limit override
>>     1.2. Name of Document Author/Supplier:
>>          Author: Edward Gillett
>>     1.3. Date of This Document:
>>          07 August, 2007
>> 4. Technical Description
>>
>> 4.1. Summary
>>
>> This proposal provides a short-term solution for device drivers to
>> request and receive more MSI-X interrupt resources than the current
>> default limit. The future long-term solution will be provided by the
>> Interrupt Resource Management (IRM) project. The proposed solution will
>> be implemented on SPARC only.
>>
>>
>> 4.2. Problem
>>
>> Even though drivers can request an arbitrary number of interrupts via
>> ddi_intr_alloc(9F), currently the number of returned MSI-X interrupts
>> is limited to 2. Some high-throughput drivers, such as nxge and qlc,
>> are forced to share 2 interrupts among multiple DMA channels, which
>> hurts performance.
>>
>> The reason for the limit of 2 is that the initial phase of the Advanced
>> DDI interrupt project (PSARC 2004/253) did not implement any IRM
>> interfaces. The IRM project is expected to integrate in early 2008,
>> but there is an immediate need to address performance issues on Sun's
>> latest platforms.
>>
>>
>> 4.3. Proposal
>>
>> Provide a boolean device property, "#msix-request", that tells the DDI
>> framework to attempt to allocate more interrupts than the default
>> limit, should the driver request it.
>>
>> A device driver can create the proposed property if and only if it
>> requires more MSI-X interrupts than the default limit. It should create
>> this property before registering any of its interrupts.
>> Sample code:
>>
>>     (void) ddi_prop_create(DDI_DEV_T_NONE, dip,
>>         DDI_PROP_CANSLEEP, "#msix-request", NULL, 0);
>>
>> The DDI framework will look for the proposed property as part of the
>> driver's interrupt registration process and will try to satisfy the
>> driver's MSI-X allocation request. The number of MSI-X interrupts that
>> can be allocated in this case may be less than or equal to what the
>> driver has requested. This depends on the availability of MSI-X
>> interrupt resources at that moment and on other platform-specific
>> limitations. So, the driver must continue to check the returned number
>> of MSI-X interrupts per ddi_intr_alloc(9F) and must not assume its
>> requests will be honored.
>>
>> Constraints:
>>
>> 1) This solution is limited to MSI-X interrupts only and it will not be
>>    extended to MSI interrupts. This means device drivers will continue
>>    to receive the current default number of MSI interrupts irrespective
>>    of this special driver property.
>>
>> 2) Device drivers using MSI-X interrupts without this property will
>>    only receive the default MSI-X allocation.
>>
>> 3) At present, there is no plan to implement this feature on non-SPARC
>>    platforms. The property, if present, will be ignored on x86
>>    platforms. Business priority is to address performance issues on
>>    SPARC systems urgently, while the x86 platforms can wait for the
>>    long-term solution. Also, x86 systems provide much more limited
>>    interrupt resources and can suffer from allocating too many
>>    interrupts.
>>
>> The proposed interface is Contracted Consolidation Private. Each driver
>> owner group is expected to contract the interfaces and transition to
>> the new IRM based interfaces when they are available.
>>
>> 4.4. Interfaces
>>
>>   interface     | stability                        | description
>>   --------------+----------------------------------+----------------------
>>   #msix-request | Contracted Consolidation Private | boolean property to
>>                 |                                  | override MSI-X limit
>>   --------------+----------------------------------+----------------------
>>
>>   Binding: patch/micro
>>
>> 4.5. References
>>
>>   PSARC 2004/253 Advanced DDI Interrupt Interfaces
>>
>>
>> 6. Resources and Schedule
>>     6.4. Steering Committee requested information
>>          6.4.1. Consolidation C-team Name:
>>                 ON
>>     6.5. ARC review type: FastTrack
>>     6.6. ARC Exposure: open
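
For readers following the thread, the allocation behaviour described in
section 4.3 and its constraints can be modelled roughly as below. This is a
userland sketch, not DDI framework code: msix_grant(), platform_t and the
"available" parameter are hypothetical names introduced for illustration,
and the real framework may apply further platform-specific limits that are
not modelled here. The only figures taken from the case are the default
limit of 2 and the rule that the property is ignored on x86.

```c
#include <assert.h>
#include <stdbool.h>

/* Default per-device MSI-X limit quoted in section 4.2 of the case. */
#define DEFAULT_MSIX_LIMIT 2

/* Hypothetical platform tag: the case honours the property on SPARC only. */
typedef enum { PLATFORM_SPARC, PLATFORM_X86 } platform_t;

/*
 * Hypothetical model of how many MSI-X interrupts a request is granted.
 *   has_property - driver created the boolean "#msix-request" property
 *   requested    - count the driver asks for via ddi_intr_alloc(9F)
 *   available    - interrupts free right now (stand-in for platform limits)
 */
static int
msix_grant(platform_t platform, bool has_property, int requested,
    int available)
{
    int limit = requested;
    int granted;

    assert(requested >= 0 && available >= 0);

    /* No property, or x86 where it is ignored: the default cap applies. */
    if (!has_property || platform != PLATFORM_SPARC)
        limit = DEFAULT_MSIX_LIMIT;

    granted = (requested < limit) ? requested : limit;

    /* Never hand out more than is currently free. */
    if (granted > available)
        granted = available;

    return (granted);
}
```

Under this model, a SPARC driver that sets the property and asks for 8
interrupts gets 8 only if 8 are free. With 5 free it gets 5, which is why
the case insists that drivers keep checking the count returned by
ddi_intr_alloc(9F).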
