Re: [PATCH-1] A new irq device interface

Junling Ma Sun, 02 Aug 2020 16:45:42 -0700

I guess you misunderstood what I mean. I mean that there seems to be no 
interest in the discussion on IRC yesterday that we introduce a new scheme. If 
so, I will not work on that part. For part I, ie., moving the irq management to 
mach side, it people think it is useful, I will  prepare a series of revised 
patch for that.


Junling

> On Aug 2, 2020, at 4:33 PM, Samuel Thibault <[email protected]> wrote:
> 
> Junling Ma, le dim. 02 août 2020 16:19:39 -0700, a ecrit:
>> I was distracted by using a counter at all in the original code,
>> because an irq cannot be fired more than once before it is acked.
> 
> Right, normally that doesn't happen.
> 
>> What we really need is a boolean_t type per notification,
> 
> And that could be an option.
> 
>>>> IN fact, I was pondering on the idea that each irq* device should be
>>>> opened exclusively,
>>> 
>>> Then things won't work. Shared IRQ is a thing you can't circumvent with
>>> the hardware that people have atm.
>> 
>> You may have missed the part where I say let user side [/dev/irq*] translator
>> do the multiplexing.
> 
> I didn't but that looks like extra complexity, and makes the whole thing
> less robust: if the multiplexer gets to die somehow, everything dies.
> 
>> And when MSI/MSI-X is introduced, irq sharing will not be needed.
> 
> That would still ignore all the hardware that people still have.
> 
>>>> The original way of counting means that the interrupt will be
>>>> unfortunately blocked forever if a message somehow failed to deliver.
>>> 
>>> There is no such thing. Either the message gets delivered, or it does
>>> not. It can take some time to get delivered. Then other translators
>>> using the same IRQ will suffer from the situation. But that's what we
>>> *have* to do. Until the userland translator either acknowledges the irq
>>> or dies, we *have* to keep the irq disabled (and know that we have done
>>> so for the death case).  We can't go around it anyway, so there is no
>>> point in trying to fix it.
>> 
>> The delivery could have returned an error, for example, if ikm_alloc return
>> NULL, as checked in deliver_intr.
> 
> In such a case, I don't think we can do anything better than closing
> the user port, so the user can get notified of the missed IRQ, and
> possibly try to restart the driver (which is fine for network cards for
> instance). So we're back to the "port death" case.
> 
>>>> This patch increases n_unacked only if a notification is delivered,
>>> 
>>> That would mean that the responsibility of ack-ing interrupts belongs to
>>> userland?
>> 
>> We already require that user land ack the interrupt by calling 
>> device_intr_ack,
>> right?
> 
> That it acks, yes. That it tracks how many times __enable_irq() should
> be called on port death, no.
> 
> Again, just checking for port death is a simple way and gets everything
> working as expected. What I can grasp from what you proposed, it is much
> less clear.
> 
>> As I understand, if a handler has not acked, then the n_unacked remain
>> positive, and also __enable_irq will not be called, and thus the irq
>> remains disabled, for all other handlers attached to the same irq.
> 
> Yes. Until the user notices the translator is stuck, and thus kills it,
> and thus the kernel notices the port death, and can unlock the IRQ.
> 
>> That is what I felt from the IRC discussion yesterday, too. And that is fine 
>> if
>> there is no interest in this patch.
> 
> The thing is: I do not see what improvement there is here. It took us
> some time to get to this working state of ack semantic. If you want to
> improve it, then fine, but you have not explained where there is actual
> improvement (and just rewriting the whole thing for the sake of it will
> most probably only bring bugs).
> 
>> But still, new comes may ask the same question as I did, why would
>> any device driver need to implement device_intr_register and
>> device_intr_ack,
> 
> That question is complementely independent of everything we have been
> discussing above. Again a reason why to separate changes into separate
> patches, so we can discuss separately and avoid squashing together
> things that don't actually need to be, thus blocking improvements on
> some side just because some other side poses question.
> 
> My only comment on patch-2 was that it was introducing other unrelated
> changes, making it unreviewable. As discussed on IRC the other day, your
> device IPC change makes sense. But I don't see any relation with the IRQ
> *management* that you have introduced. Make that patch independent from
> the whole rest, and we'll be able to discuss it.
> 
> 
> Really, I'm wondering how it cannot be obvious for newcomers that yes,
> you need to separate out changes so we can discuss things separately,
> otherwise it's just a big mess and we can't work things out.
> 
> Samuel

Re: [PATCH-1] A new irq device interface

Reply via email to