On Tue, Nov 24, 2020 at 05:27:30AM +0200, Jarkko Sakkinen wrote:
> On Thu, Nov 19, 2020 at 03:42:35PM +0100, Hans de Goede wrote:
> > Hi,
> > 
> > On 11/19/20 7:36 AM, Jerry Snitselaar wrote:
> > > 
> > > Matthew Garrett @ 2020-10-15 15:39 MST:
> > > 
> > >> On Thu, Oct 15, 2020 at 2:44 PM Jerry Snitselaar <jsnit...@redhat.com> wrote:
> > >>>
> > >>> There is a misconfiguration in the bios of the gpio pin used for the
> > >>> interrupt in the T490s. When interrupts are enabled in the tpm_tis
> > >>> driver code this results in an interrupt storm. This was initially
> > >>> reported when we attempted to enable the interrupt code in the tpm_tis
> > >>> driver, which previously wasn't setting a flag to enable it. Due to
> > >>> the reports of the interrupt storm that code was reverted and we went back
> > >>> to polling instead of using interrupts. Now that we know the T490s problem
> > >>> is a firmware issue, add code to check if the system is a T490s and
> > >>> disable interrupts if that is the case. This will allow us to enable
> > >>> interrupts for everyone else. If the user has a fixed bios they can
> > >>> force the enabling of interrupts with tpm_tis.interrupts=1 on the
> > >>> kernel command line.
> > >>
> > >> I think an implication of this is that systems haven't been
> > >> well-tested with interrupts enabled. In general when we've found a
> > >> firmware issue in one place it ends up happening elsewhere as well, so
> > >> it wouldn't surprise me if there are other machines that will also be
> > >> unhappy with interrupts enabled. Would it be possible to automatically
> > >> detect this case (eg, if we get more than a certain number of
> > >> interrupts in a certain timeframe immediately after enabling the
> > >> interrupt) and automatically fall back to polling in that case? It
> > >> would also mean that users with fixed firmware wouldn't need to pass a
> > >> parameter.
> > > 
> > > I believe Matthew is correct here. I found another system today
> > > with completely different vendor for both the system and the tpm chip.
> > > In addition another Lenovo model, the L490, has the issue.
> > > 
> > > This initial attempt at a solution like Matthew suggested works on
> > > the system I found today, but I imagine it is all sorts of wrong.
> > > In the 2 systems where I've seen it, there are about 100000 interrupts
> > > in around 1.5 seconds, and then the irq code shuts down the interrupt
> > > because they aren't being handled.
> > 
> > Is that with your patch? The IRQ should be silenced as soon as
> > devm_free_irq(chip->dev.parent, priv->irq, chip); is called.
> > 
> > Depending on if we can get your storm-detection to work or not,
> > we might also choose to just never try to use the IRQ (at least on
> > x86 systems). AFAIK the TPM is never used for high-throughput stuff
> > so the polling overhead should not be a big deal (and I'm getting the feeling
> > that Windows always polls).
> > 
> > Regards,
> > 
> > Hans
> 
> Yeah, this is what I've been wondering for a while. Why could not we
> just strip off IRQ code? Why does it matter?

And we DO NOT use interrupts in tpm_crb and nobody has ever complained.

/Jarkko
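
For reference, the detection Matthew suggests in the quoted text (count
interrupts over a short window right after enabling the IRQ, and fall back to
polling if there are too many) could look roughly like the user-space sketch
below. This is not tpm_tis code: the struct, the function names, the
millisecond timestamps and the threshold are all hypothetical, and a real
driver would take its time from the kernel and free the IRQ when the function
returns true.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical storm detector; every name here is illustrative only. */
struct storm_detector {
	uint64_t window_start_ms; /* start of the current counting window */
	uint64_t window_ms;       /* length of one counting window */
	unsigned int count;       /* interrupts seen in the current window */
	unsigned int threshold;   /* more than this per window is a storm */
	bool use_polling;         /* latched once a storm has been seen */
};

void storm_detector_init(struct storm_detector *d, uint64_t now_ms,
			 uint64_t window_ms, unsigned int threshold)
{
	assert(window_ms > 0 && threshold > 0);
	d->window_start_ms = now_ms;
	d->window_ms = window_ms;
	d->count = 0;
	d->threshold = threshold;
	d->use_polling = false;
}

/*
 * Call once per interrupt with the current time. Returns true when the
 * caller should stop using the interrupt and poll instead.
 */
bool storm_detector_on_irq(struct storm_detector *d, uint64_t now_ms)
{
	if (d->use_polling)
		return true;

	/* Start a fresh window once the current one has expired. */
	if (now_ms - d->window_start_ms >= d->window_ms) {
		d->window_start_ms = now_ms;
		d->count = 0;
	}

	d->count++;
	if (d->count > d->threshold)
		d->use_polling = true; /* stays set; no automatic retry */

	return d->use_polling;
}
```

With the roughly 100000 interrupts in 1.5 seconds that Jerry reports, almost
any reasonable threshold would trip well before the generic IRQ code gives up
on the line, and users with fixed firmware would not need a command-line
parameter.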
