Re: [PATCH gnumach] interrupt: Mask, eoi, unmask

2023-10-01 Thread Damien Zammit
On 2/10/23 10:56, Samuel Thibault wrote:
> So in the end, I'd tend to think that it's up to queue_intr to do the
> unmasking, like it does now, while spl is still high and thus we don't
> risk nesting. That way it's the in-kernel intr handler that knows
> whether to mask/unmask or not. I.e. it'd be like this:
> 
> - interrupt.S raises spl (thus IF cleared)
> - interrupt.S EOI
> - interrupt.S calls the handler
>- for pure in-kernel handlers, they do whatever they want with IF
>  cleared.
>- when a userland handler is registers, queue_intr masks the irq.
> - interrupt.S lowers spl with splx_cli, thus IF still cleared
> - iret, that clears the IF
> 
> - later on, userland acks the IRQ, that unmasks the irq
> 
> Is that not already very close to what we have currently?

Yes this is very close to what we have now.
I think we just need to remove the if >=16 part to do it for
all interrupts and remove the EOI call in irq_acknowledge.

Damien




Re: [PATCH gnumach] interrupt: Mask, eoi, unmask

2023-10-01 Thread Samuel Thibault
Samuel Thibault, le lun. 02 oct. 2023 01:43:48 +0200, a ecrit:
> Also, "don't EOI in the user handlers anymore.", isn't that already the
> case?

Ah, that's the irqtab.irqdev_ack call in irq_acknowledge.

Then I'd say we should try to move it to queue_intr? So that it's done
before the driver manages the hardware (possibly generating another
irq).

Samuel



Re: [PATCH gnumach] interrupt: Mask, eoi, unmask

2023-10-01 Thread Samuel Thibault
Samuel Thibault, le lun. 02 oct. 2023 01:56:51 +0200, a ecrit:
> Samuel Thibault, le lun. 02 oct. 2023 01:50:14 +0200, a ecrit:
> > Samuel Thibault, le lun. 02 oct. 2023 01:43:48 +0200, a ecrit:
> > > Damien Zammit, le dim. 01 oct. 2023 23:26:02 +, a ecrit:
> > > > I think the logic for this should be:
> > > > 
> > > > When we get irq N, first we mask irq N, then EOI irq N.
> > > > Then call the handler. If there is a user handler for irq N, let the 
> > > > irq_ack
> > > > unmask irq N, otherwise we need to unmask irq N now.
> > > > But don't EOI in the user handlers anymore.
> > > > 
> > > > What do you think?
> > > 
> > > That can be a plan, yes, but what about the legacy irqs, and their
> > > in-kernel handlers?  We need to fix them too. One keypoint there is that
> > > we want to unmask with IF cleared, so that we don't get interrupted
> > > again on unmask, but on iret (so we don't nest).
> > 
> > For the legacy irqs and their in-kernel handlers, we don't really need
> > to mask actually, we can just keep IF cleared, i.e. the spl level raise
> > should be enough. But we want to keep IF cleared when lowering it again,
> > IIRC that's already handled so.
> 
> So in the end, I'd tend to think that it's up to queue_intr to do the
> unmasking, like it does now, while spl is still high and thus we don't
> risk nesting. That way it's the in-kernel intr handler that knows
> whether to mask/unmask or not. I.e. it'd be like this:
> 
> - interrupt.S raises spl (thus IF cleared)
> - interrupt.S EOI
> - interrupt.S calls the handler
>   - for pure in-kernel handlers, they do whatever they want with IF
> cleared.
>   - when a userland handler is registers, queue_intr masks the irq.
> - interrupt.S lowers spl with splx_cli, thus IF still cleared
> - iret, that clears the IF
> 
> - later on, userland acks the IRQ, that unmasks the irq
> 
> Is that not already very close to what we have currently?

(of course we want to record that reasoning in the source code)

Samuel



Re: [PATCH gnumach] interrupt: Mask, eoi, unmask

2023-10-01 Thread Samuel Thibault
Samuel Thibault, le lun. 02 oct. 2023 01:50:14 +0200, a ecrit:
> Samuel Thibault, le lun. 02 oct. 2023 01:43:48 +0200, a ecrit:
> > Damien Zammit, le dim. 01 oct. 2023 23:26:02 +, a ecrit:
> > > I think the logic for this should be:
> > > 
> > > When we get irq N, first we mask irq N, then EOI irq N.
> > > Then call the handler. If there is a user handler for irq N, let the 
> > > irq_ack
> > > unmask irq N, otherwise we need to unmask irq N now.
> > > But don't EOI in the user handlers anymore.
> > > 
> > > What do you think?
> > 
> > That can be a plan, yes, but what about the legacy irqs, and their
> > in-kernel handlers?  We need to fix them too. One keypoint there is that
> > we want to unmask with IF cleared, so that we don't get interrupted
> > again on unmask, but on iret (so we don't nest).
> 
> For the legacy irqs and their in-kernel handlers, we don't really need
> to mask actually, we can just keep IF cleared, i.e. the spl level raise
> should be enough. But we want to keep IF cleared when lowering it again,
> IIRC that's already handled so.

So in the end, I'd tend to think that it's up to queue_intr to do the
unmasking, like it does now, while spl is still high and thus we don't
risk nesting. That way it's the in-kernel intr handler that knows
whether to mask/unmask or not. I.e. it'd be like this:

- interrupt.S raises spl (thus IF cleared)
- interrupt.S EOI
- interrupt.S calls the handler
  - for pure in-kernel handlers, they do whatever they want with IF
cleared.
  - when a userland handler is registers, queue_intr masks the irq.
- interrupt.S lowers spl with splx_cli, thus IF still cleared
- iret, that clears the IF

- later on, userland acks the IRQ, that unmasks the irq

Is that not already very close to what we have currently?

Samuel



Re: [PATCH gnumach] interrupt: Mask, eoi, unmask

2023-10-01 Thread Samuel Thibault
Samuel Thibault, le lun. 02 oct. 2023 01:43:48 +0200, a ecrit:
> Damien Zammit, le dim. 01 oct. 2023 23:26:02 +, a ecrit:
> > I think the logic for this should be:
> > 
> > When we get irq N, first we mask irq N, then EOI irq N.
> > Then call the handler. If there is a user handler for irq N, let the irq_ack
> > unmask irq N, otherwise we need to unmask irq N now.
> > But don't EOI in the user handlers anymore.
> > 
> > What do you think?
> 
> That can be a plan, yes, but what about the legacy irqs, and their
> in-kernel handlers?  We need to fix them too. One keypoint there is that
> we want to unmask with IF cleared, so that we don't get interrupted
> again on unmask, but on iret (so we don't nest).

For the legacy irqs and their in-kernel handlers, we don't really need
to mask actually, we can just keep IF cleared, i.e. the spl level raise
should be enough. But we want to keep IF cleared when lowering it again,
IIRC that's already handled so.

Samuel



Re: [PATCH gnumach] interrupt: Mask, eoi, unmask

2023-10-01 Thread Samuel Thibault
Damien Zammit, le dim. 01 oct. 2023 23:26:02 +, a ecrit:
> I think the logic for this should be:
> 
> When we get irq N, first we mask irq N, then EOI irq N.
> Then call the handler. If there is a user handler for irq N, let the irq_ack
> unmask irq N, otherwise we need to unmask irq N now.
> But don't EOI in the user handlers anymore.
> 
> What do you think?

That can be a plan, yes, but what about the legacy irqs, and their
in-kernel handlers?  We need to fix them too. One keypoint there is that
we want to unmask with IF cleared, so that we don't get interrupted
again on unmask, but on iret (so we don't nest).

Also, if there is no user handler, I guess we rather want to keep the
irq masked? Otherwise we risk just getting interrupted again -> hang or
cpu waste.

Also, "don't EOI in the user handlers anymore.", isn't that already the
case?

Samuel



Re: [PATCH gnumach] interrupt: Mask, eoi, unmask

2023-10-01 Thread Damien Zammit
Hi,

On 1/10/23 20:13, Samuel Thibault wrote:
> Sometimes the "why" of a commit is obvious, so it doesn't need to be
> explained, but here it's really not and thus it definitely needs to
> be.  We have had various pings-pongs in the past about whether to EOI
> before/after the interrupt, masking or not, etc. So we really need
> a firm explanation, recorded in the git history, why we believe the
> proposed way is now correct.

Yes, sorry about that.
I think the logic for this should be:

When we get irq N, first we mask irq N, then EOI irq N.
Then call the handler. If there is a user handler for irq N, let the irq_ack
unmask irq N, otherwise we need to unmask irq N now.
But don't EOI in the user handlers anymore.

What do you think?

Damien




Re: [PATCH gnumach] interrupt: Mask, eoi, unmask

2023-10-01 Thread Samuel Thibault
Hello,

Sometimes the "why" of a commit is obvious, so it doesn't need to be
explained, but here it's really not and thus it definitely needs to
be.  We have had various pings-pongs in the past about whether to EOI
before/after the interrupt, masking or not, etc. So we really need
a firm explanation, recorded in the git history, why we believe the
proposed way is now correct.

Samuel

Damien Zammit, le dim. 01 oct. 2023 04:58:35 +, a ecrit:
> ---
>  i386/i386at/interrupt.S | 19 ---
>  x86_64/interrupt.S  | 19 +++
>  2 files changed, 31 insertions(+), 7 deletions(-)
> 
> diff --git a/i386/i386at/interrupt.S b/i386/i386at/interrupt.S
> index 8ae6b97c..ec2fc656 100644
> --- a/i386/i386at/interrupt.S
> +++ b/i386/i386at/interrupt.S
> @@ -98,11 +98,24 @@ ENTRY(interrupt)
>   outb%al,$(PIC_MASTER_OCW)   /* unmask master */
>  2:
>  #else
> - cmpl$16,%ecx/* was this a low ISA intr? */
> - jge _no_eoi /* no, must be PCI (let irq_ack handle 
> EOI) */

So we are now EOI-ing all irqs? The comment was saying we were leaving
that up to irq_ack for IRQs above 16. Was that actually wrong/changed?

>   movl%ecx,(%esp) /* load irq number as 1st arg */
> + movl$IOAPIC_MASK_DISABLED,4(%esp)   /* load disabled flag into 2nd 
> arg */
> + callEXT(ioapic_toggle)  /* ioapic mask irq so upon re-entry we 
> dont nest */
> + movlS_IRQ,%eax
> + movl%eax,(%esp) /* load irq number as 1st arg */
>   callEXT(ioapic_irq_eoi) /* ioapic irq specific EOI */
> -_no_eoi:
> + movlS_IRQ,%eax
> + cmpl$9,%eax /* was this a high PCI interrupt? >= 9 
> except 12 */
> + jge _no_unmask  /* yes, dont unmask until acked */
> +_unmask:
> + movl%eax,(%esp) /* load irq number as 1st arg */
> + movl$IOAPIC_MASK_ENABLED,4(%esp)/* load enabled flag into 2nd 
> arg */
> + callEXT(ioapic_toggle)  /* ioapic unmask irq to re-enable */
> + jmp _done

? I don't see the point of masking while doing the EOI for the non-PCI
irqs? The nested interrupt will just happen later. Perhaps you are
just pushing the issue further for them, where it'll be even harder to
reproduce and thus harder to fix.

> +_no_unmask:
> + cmpl$12,%eax/* was this the mouse interrupt? */
> + je  _unmask /* yes, unmask */

Does IRQ #13 (FPU) not also need special care? (possibly also on
x86_64?)

> +_done:
>  #endif
>  
>   movlS_IPL,%eax
> diff --git a/x86_64/interrupt.S b/x86_64/interrupt.S
> index fcd5a032..85f94da6 100644
> --- a/x86_64/interrupt.S
> +++ b/x86_64/interrupt.S
> @@ -98,11 +98,22 @@ ENTRY(interrupt)
>   outb%al,$(PIC_MASTER_OCW)   /* unmask master */
>  2:
>  #else
> - cmpl$16,%ecx/* was this a low ISA intr? */
> - jge  _no_eoi/* no, must be PCI (let irq_ack handle 
> EOI) */
> - movl%ecx,%edi   /* load irq number as 1st arg */
> + movlS_IRQ,S_ARG0/* load irq number as 1st arg */
> + movl$IOAPIC_MASK_DISABLED,S_ARG1/* load disabled flag into 2nd 
> arg */
> + callEXT(ioapic_toggle)  /* ioapic mask irq so upon re-entry we 
> dont nest */
> + movlS_IRQ,S_ARG0/* load irq number as 1st arg */
>   callEXT(ioapic_irq_eoi) /* ioapic irq specific EOI */
> -_no_eoi:
> + movlS_IRQ,S_ARG0/* load irq number as 1st arg */
> + cmpl$9,S_ARG0   /* was this a high PCI interrupt? >= 9 
> except 12 */
> + jge _no_unmask  /* yes, dont unmask until acked */
> +_unmask:
> + movl$IOAPIC_MASK_ENABLED,S_ARG1 /* load enabled flag into 2nd 
> arg */
> + callEXT(ioapic_toggle)  /* ioapic unmask irq to re-enable */
> + jmp _done
> +_no_unmask:
> + cmpl$12,S_ARG0  /* was this the mouse interrupt? */
> + je  _unmask /* yes, unmask */
> +_done:
>  #endif