Re: [osol-code] kstat_intr considerations

Garrett D'Amore Tue, 26 Aug 2008 22:35:23 -0700

Dan Mick wrote:
> Garrett D'Amore wrote:
>> I'm getting ready to write the kstat intr support for a revised audio 
>> driver, and I noticed that it only reports hard, not spurious 
>> interrupts.  That got me thinking.
>>
>> On an x86 system at least, the interrupt chain logic keeps retrying 
>> the interrupt until all devices report DDI_INTR_UNCLAIMED.  So, a 
>> typical device on an unshared bus will have at least 1 spurious 
>> interrupt for each hard interrupt.  (Ignoring the case where 
>> successive calls into the driver's ISR return DDI_INTR_CLAIMED, of 
>> course.)
>
> Well...there's "I got an interrupt indication, but I can't tell what 
> device it's for" and there's "I'm looping to look for any other 
> claimers because of ancient sharing rules", and I don't think they 
> really both fall into the class of what I'd call "spurious", but...I'm 
> not sure what you mean by spurious, or the need/desire to count them.
>
>> On a system with shared interrupts (more and more common, 
>> *especially* on x86 systems) the spurious interrupt count is likely 
>> to be much higher for some devices, because a busy device will cause 
>> spurious interrupts (or the appearance thereof) in the less busy 
>> device(s) on the same interrupt.
>>
>> The upshot of all this is, it seems to me that drivers cannot 
>> reliably count spurious interrupts.
>>
>> The other observation is that very few drivers bother to use 
>> kstat_intr.  (Some drivers, such as NIC drivers, report interrupts 
>> differently, such as in a named kstat.)
>>
>> There's the intrstat program which uses dtrace, but from what I can 
>> tell, it seems to record entries into the interrupt handler, which 
>> doesn't correctly count hard interrupts (see above for why not!)
>
> By "hard interrupts" you mean "an actual hardware indication of 
> interrupt"?
> Note that, for level-triggered interrupts, you can't ever really count 
> those accurately anyway, but...yeah, it counts ISR activations.  While 
> I'm not saying this is right or wrong, what do you see as a problem 
> with this?
>
>> So what I'm thinking is that the interrupt framework itself could 
>> count interrupts and keep track of this in a kstat.  This is useful 
>> for debug, but it would also be able to provide a more accurate 
>> picture than intrstat does.  (Furthermore, as much as I like dtrace, 
>> it might make it possible to build a portable intrstat, that was able 
>> to report not just PCI interrupts but other kinds of interrupts -- 
>> hard, soft, pci, pciex, isa, usb, firewire, sbus, and maybe other 
>> kinds in the future (for example SDcard can support IO devices that 
>> interrupt, although the framework and nexus drivers I just putback 
>> into 97 don't have support for SDIO yet.))
>>
>> The framework could also count spurious interrupts -- a spurious 
>> interrupt being one where the top-level CPU or nexus specific 
>> interrupt handler dispatched, but didn't find *any* handler that 
>> could service the interrupt.  (This could be counted as spurious 
>> interrupt for *each* device on the bus, or it could be counted in a 
>> separate top-level spurious interrupt counter for the bus.)
>>
>> Thoughts?
>
> I've mused from time to time about trying to clean that code up just 
> from the "wasteful to call everyone again and again" aspect, but it 
> would change semantics; probably in a way we can handle, but I'd have 
> to think harder about it.  I've never thought about deficiencies in 
> the stats, though, and I'm not sure one way is better or worse than 
> the other.
>
> What would you gain by knowing about 'spurious', where 'spurious' 
> really relates to two different possible conditions?  It's a 
> negatively-charged term, but one of the two types is just "the way 
> Solaris does it".  What would be better about counting only 'hard 
> interrupts'?


It's useful to know if a device is interrupting and then not handling the 
interrupts.  It can also give some indication of interrupt latencies 
and "wasted" interrupt-related context switches.

It also gives a good indication of how much your interrupt sharing is 
costing you.
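
To make the sharing cost concrete, here is a rough user-space sketch in C 
(not kernel code) of the retry behaviour described in the quoted text: the 
dispatcher keeps walking the chain until a full pass sees no claims.  All 
names here (sim_dev, sim_isr, sim_dispatch, SIM_CLAIMED) are hypothetical 
and only stand in for the real framework; the one thing borrowed from the 
DDI is the claimed/unclaimed return convention.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for DDI_INTR_CLAIMED / DDI_INTR_UNCLAIMED. */
enum { SIM_UNCLAIMED = 0, SIM_CLAIMED = 1 };

struct sim_dev {
    unsigned pending;    /* events the simulated hardware has raised */
    unsigned claimed;    /* ISR calls that found work to do */
    unsigned unclaimed;  /* ISR calls that found nothing to do */
};

/* Simulated ISR: service one pending event if there is one. */
static int sim_isr(struct sim_dev *d)
{
    if (d->pending > 0) {
        d->pending--;
        d->claimed++;
        return SIM_CLAIMED;
    }
    d->unclaimed++;
    return SIM_UNCLAIMED;
}

/*
 * Simulated dispatch of one interrupt on a shared line: walk the whole
 * chain repeatedly until a full pass sees no claims.  Returns the number
 * of passes made over the chain.
 */
static unsigned sim_dispatch(struct sim_dev *chain, size_t ndev)
{
    unsigned passes = 0;
    int any_claimed;

    assert(chain != NULL && ndev > 0);
    do {
        any_claimed = 0;
        for (size_t i = 0; i < ndev; i++) {
            if (sim_isr(&chain[i]) == SIM_CLAIMED)
                any_claimed = 1;
        }
        passes++;
    } while (any_claimed);
    return passes;
}
```

In this model, if a busy device and an idle device share a line and the 
busy one raises 100 interrupts one at a time, the busy device's ISR sees 
100 claimed and 100 unclaimed calls, and the idle device's ISR sees 200 
unclaimed calls.  That is the kind of number I'd want to be able to see.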

Hard interrupt statistics are *incredibly* useful.  For example, I need 
to know how many times a second a certain audio-related interrupt is 
occurring, so that I know I have configured the device properly.
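
As a sketch of that check: take two snapshots of the hard interrupt count 
and divide the difference by the elapsed time.  The struct and function 
names below are hypothetical; on Solaris the count would presumably come 
from the hard slot of the device's intr kstat, but this sketch just takes 
the numbers as inputs.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical snapshot of a hard-interrupt counter and when it was read. */
struct intr_sample {
    uint64_t hard;   /* cumulative hard interrupt count */
    uint64_t nsec;   /* monotonic timestamp, in nanoseconds */
};

/* Interrupts per second between an earlier sample a and a later sample b. */
static double intr_rate(const struct intr_sample *a,
    const struct intr_sample *b)
{
    assert(b->nsec > a->nsec);   /* samples must be ordered in time */
    assert(b->hard >= a->hard);  /* this sketch ignores counter wraparound */
    return (double)(b->hard - a->hard) * 1e9 /
        (double)(b->nsec - a->nsec);
}
```

For example, a count that goes from 1000 to 1500 over ten seconds works 
out to 50 interrupts per second, which can then be compared against the 
rate the device was programmed for.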

    -- Garrett

