Re: [Xen-devel] xen/evtchn and forced threaded irq

2019-02-27 Thread Julien Grall

Hi,

On 2/26/19 11:02 AM, Roger Pau Monné wrote:

On Tue, Feb 26, 2019 at 10:26:21AM +, Julien Grall wrote:

On 26/02/2019 10:17, Roger Pau Monné wrote:

On Tue, Feb 26, 2019 at 10:03:38AM +, Julien Grall wrote:

Hi Roger,

On 26/02/2019 09:44, Roger Pau Monné wrote:

On Tue, Feb 26, 2019 at 09:30:07AM +, Andrew Cooper wrote:

On 26/02/2019 09:14, Roger Pau Monné wrote:

On Mon, Feb 25, 2019 at 01:55:42PM +, Julien Grall wrote:

Hi Oleksandr,

On 25/02/2019 13:24, Oleksandr Andrushchenko wrote:

On 2/22/19 3:33 PM, Julien Grall wrote:

Hi,

On 22/02/2019 12:38, Oleksandr Andrushchenko wrote:

On 2/20/19 10:46 PM, Julien Grall wrote:

Discussing with my team, a solution that came up would be to
introduce one atomic field per event to record the number of
event received. I will explore that solution tomorrow.

How will this help if events have some payload?

What payload? The event channel does not carry any payload. It only
notify you that something happen. Then this is up to the user to
decide what to you with it.

Sorry, I was probably not precise enough. I mean that an event might have
associated payload in the ring buffer, for example [1]. So, counting events
may help somehow, but the ring's data may still be lost

   From my understanding of event channels are edge interrupts. By definition,

IMO event channels are active high level interrupts.

Let's take into account the following situation: you have an event
channel masked and the event channel pending bit (akin to the line on
bare metal) goes from low to high (0 -> 1), then you unmask the
interrupt and you get an event injected. If it was an edge interrupt
you wont get an event injected after unmasking, because you would
have lost the edge. I think the problem here is that Linux treats
event channels as edge interrupts, when they are actually level.


Event channels are edge interrupts.  There are several very subtle bugs
to be had by software which treats them as line interrupts.

Most critically, if you fail to ack them, rebind them to a new vcpu, and
reenable interrupts, you don't get a new interrupt notification.  This
was the source of a 4 month bug when XenServer was moving from
classic-xen to PVOps where using irqbalance would cause dom0 to
occasionally lose interrupts.


I would argue that you need to mask them first, rebind to a new vcpu
and unmask, and then you will get an interrupt notification, or this
should be fixed in Xen to work as you expect: trigger an interrupt
notification when moving an asserted event channel between CPUs.

Is there any document that describes how such non trivial things (like
moving between CPUs) work for event/level interrupts?

Maybe I'm being obtuse, but from the example I gave above it's quite
clear to me event channels don't get triggered based on edge changes,
but rather on the line level.


Your example above is not enough to give the semantics of level. You would
only use the MASK bit if your interrupt handler is threaded to avoid the
interrupt coming up again.

So if you remove the mask from the equation, then the interrupt flow should be:

1) handle interrupt
2) EOI


This is bogus if you don't mask the interrupt source. You should
instead do

1) EOI
2) Handle interrupt

And loop over this.

So that's not a level semantics. It is a edge one :). In the level case, you
would clear the state once you are done with the interrupt.

Also, it would be ACK and not EOI.


For level triggered interrupts you have to somehow signal the device
to stop asserting the line, which doesn't happen for Xen devices
because they just signal interrupts to Xen, but don't have a way to
keep event channels asserted, so I agree that this is different from
traditional level interrupts because devices using event channels
don't have a way to keep lines asserted.

I guess the most similar native interrupt is MSI with masking
support?


I don't know enough about MSI with masking support to be able to draw a 
comparison :).


The flow I have been suggested to re-use in Linux is 
handle_fasteoi_ack_irq. I haven't yet had time to have a try at it.


Cheers,

--
Julien Grall


Re: [Xen-devel] xen/evtchn and forced threaded irq

2019-02-26 Thread Roger Pau Monné
On Tue, Feb 26, 2019 at 10:26:21AM +, Julien Grall wrote:
> Hi,
> 
> On 26/02/2019 10:17, Roger Pau Monné wrote:
> > On Tue, Feb 26, 2019 at 10:03:38AM +, Julien Grall wrote:
> > > Hi Roger,
> > > 
> > > On 26/02/2019 09:44, Roger Pau Monné wrote:
> > > > On Tue, Feb 26, 2019 at 09:30:07AM +, Andrew Cooper wrote:
> > > > > On 26/02/2019 09:14, Roger Pau Monné wrote:
> > > > > > On Mon, Feb 25, 2019 at 01:55:42PM +, Julien Grall wrote:
> > > > > > > Hi Oleksandr,
> > > > > > > 
> > > > > > > On 25/02/2019 13:24, Oleksandr Andrushchenko wrote:
> > > > > > > > On 2/22/19 3:33 PM, Julien Grall wrote:
> > > > > > > > > Hi,
> > > > > > > > > 
> > > > > > > > > On 22/02/2019 12:38, Oleksandr Andrushchenko wrote:
> > > > > > > > > > On 2/20/19 10:46 PM, Julien Grall wrote:
> > > > > > > > > > > Discussing with my team, a solution that came up would be 
> > > > > > > > > > > to
> > > > > > > > > > > introduce one atomic field per event to record the number 
> > > > > > > > > > > of
> > > > > > > > > > > event received. I will explore that solution tomorrow.
> > > > > > > > > > How will this help if events have some payload?
> > > > > > > > > What payload? The event channel does not carry any payload. 
> > > > > > > > > It only
> > > > > > > > > notify you that something happen. Then this is up to the user 
> > > > > > > > > to
> > > > > > > > > decide what to you with it.
> > > > > > > > Sorry, I was probably not precise enough. I mean that an event 
> > > > > > > > might have
> > > > > > > > associated payload in the ring buffer, for example [1]. So, 
> > > > > > > > counting events
> > > > > > > > may help somehow, but the ring's data may still be lost
> > > > > > >   From my understanding of event channels are edge interrupts. By 
> > > > > > > definition,
> > > > > > IMO event channels are active high level interrupts.
> > > > > > 
> > > > > > Let's take into account the following situation: you have an event
> > > > > > channel masked and the event channel pending bit (akin to the line 
> > > > > > on
> > > > > > bare metal) goes from low to high (0 -> 1), then you unmask the
> > > > > > interrupt and you get an event injected. If it was an edge interrupt
> > > > > > you wont get an event injected after unmasking, because you would
> > > > > > have lost the edge. I think the problem here is that Linux treats
> > > > > > event channels as edge interrupts, when they are actually level.
> > > > > 
> > > > > Event channels are edge interrupts.  There are several very subtle 
> > > > > bugs
> > > > > to be had by software which treats them as line interrupts.
> > > > > 
> > > > > Most critically, if you fail to ack them, rebind them to a new vcpu, 
> > > > > and
> > > > > reenable interrupts, you don't get a new interrupt notification.  This
> > > > > was the source of a 4 month bug when XenServer was moving from
> > > > > classic-xen to PVOps where using irqbalance would cause dom0 to
> > > > > occasionally lose interrupts.
> > > > 
> > > > I would argue that you need to mask them first, rebind to a new vcpu
> > > > and unmask, and then you will get an interrupt notification, or this
> > > > should be fixed in Xen to work as you expect: trigger an interrupt
> > > > notification when moving an asserted event channel between CPUs.
> > > > 
> > > > Is there any document that describes how such non trivial things (like
> > > > moving between CPUs) work for event/level interrupts?
> > > > 
> > > > Maybe I'm being obtuse, but from the example I gave above it's quite
> > > > clear to me event channels don't get triggered based on edge changes,
> > > > but rather on the line level.
> > > 
> > > Your example above is not enough to give the semantics of level. You would
> > > only use the MASK bit if your interrupt handler is threaded to avoid the
> > > interrupt coming up again.
> > > 
> > > So if you remove the mask from the equation, then the interrupt flow 
> > > should be:
> > > 
> > > 1) handle interrupt
> > > 2) EOI
> > 
> > This is bogus if you don't mask the interrupt source. You should
> > instead do
> > 
> > 1) EOI
> > 2) Handle interrupt
> > 
> > And loop over this.
> So that's not a level semantics. It is a edge one :). In the level case, you
> would clear the state once you are done with the interrupt.
> 
> Also, it would be ACK and not EOI.

For level triggered interrupts you have to somehow signal the device
to stop asserting the line, which doesn't happen for Xen devices
because they just signal interrupts to Xen, but don't have a way to
keep event channels asserted, so I agree that this is different from
traditional level interrupts because devices using event channels
don't have a way to keep lines asserted.

I guess the most similar native interrupt is MSI with masking
support?

Roger.


Re: [Xen-devel] xen/evtchn and forced threaded irq

2019-02-26 Thread Julien Grall

Hi,

On 26/02/2019 10:17, Roger Pau Monné wrote:

On Tue, Feb 26, 2019 at 10:03:38AM +, Julien Grall wrote:

Hi Roger,

On 26/02/2019 09:44, Roger Pau Monné wrote:

On Tue, Feb 26, 2019 at 09:30:07AM +, Andrew Cooper wrote:

On 26/02/2019 09:14, Roger Pau Monné wrote:

On Mon, Feb 25, 2019 at 01:55:42PM +, Julien Grall wrote:

Hi Oleksandr,

On 25/02/2019 13:24, Oleksandr Andrushchenko wrote:

On 2/22/19 3:33 PM, Julien Grall wrote:

Hi,

On 22/02/2019 12:38, Oleksandr Andrushchenko wrote:

On 2/20/19 10:46 PM, Julien Grall wrote:

Discussing with my team, a solution that came up would be to
introduce one atomic field per event to record the number of
event received. I will explore that solution tomorrow.

How will this help if events have some payload?

What payload? The event channel does not carry any payload. It only
notify you that something happen. Then this is up to the user to
decide what to you with it.

Sorry, I was probably not precise enough. I mean that an event might have
associated payload in the ring buffer, for example [1]. So, counting events
may help somehow, but the ring's data may still be lost

  From my understanding of event channels are edge interrupts. By definition,

IMO event channels are active high level interrupts.

Let's take into account the following situation: you have an event
channel masked and the event channel pending bit (akin to the line on
bare metal) goes from low to high (0 -> 1), then you unmask the
interrupt and you get an event injected. If it was an edge interrupt
you wont get an event injected after unmasking, because you would
have lost the edge. I think the problem here is that Linux treats
event channels as edge interrupts, when they are actually level.


Event channels are edge interrupts.  There are several very subtle bugs
to be had by software which treats them as line interrupts.

Most critically, if you fail to ack them, rebind them to a new vcpu, and
reenable interrupts, you don't get a new interrupt notification.  This
was the source of a 4 month bug when XenServer was moving from
classic-xen to PVOps where using irqbalance would cause dom0 to
occasionally lose interrupts.


I would argue that you need to mask them first, rebind to a new vcpu
and unmask, and then you will get an interrupt notification, or this
should be fixed in Xen to work as you expect: trigger an interrupt
notification when moving an asserted event channel between CPUs.

Is there any document that describes how such non trivial things (like
moving between CPUs) work for event/level interrupts?

Maybe I'm being obtuse, but from the example I gave above it's quite
clear to me event channels don't get triggered based on edge changes,
but rather on the line level.


Your example above is not enough to give the semantics of level. You would
only use the MASK bit if your interrupt handler is threaded to avoid the
interrupt coming up again.

So if you remove the mask from the equation, then the interrupt flow should be:

1) handle interrupt
2) EOI


This is bogus if you don't mask the interrupt source. You should
instead do

1) EOI
2) Handle interrupt

And loop over this.
So that's not a level semantics. It is a edge one :). In the level case, you 
would clear the state once you are done with the interrupt.


Also, it would be ACK and not EOI.




The EOI in our case would be clearing the PENDING state. In a proper level
interrupt, the state would stay PENDING if there were more to come. This is
not the case with the events and you will lose the interrupt.

So I don't think they are proper level interrupts. They have more a
semantics of edge interrupts with some property of level (i.e for the
mask/unmask).


OK, I guess it depends on how you look at it, to me event channels are
maybe quirky level interrupts, but are definitely closer to level than
edge interrupts, specially taking into account the interrupt injection
that happens on unmask of a pending line, there's no such thing at all
with edge interrupts.


Cheers,

--
Julien Grall


Re: [Xen-devel] xen/evtchn and forced threaded irq

2019-02-26 Thread Roger Pau Monné
On Tue, Feb 26, 2019 at 10:03:38AM +, Julien Grall wrote:
> Hi Roger,
> 
> On 26/02/2019 09:44, Roger Pau Monné wrote:
> > On Tue, Feb 26, 2019 at 09:30:07AM +, Andrew Cooper wrote:
> > > On 26/02/2019 09:14, Roger Pau Monné wrote:
> > > > On Mon, Feb 25, 2019 at 01:55:42PM +, Julien Grall wrote:
> > > > > Hi Oleksandr,
> > > > > 
> > > > > On 25/02/2019 13:24, Oleksandr Andrushchenko wrote:
> > > > > > On 2/22/19 3:33 PM, Julien Grall wrote:
> > > > > > > Hi,
> > > > > > > 
> > > > > > > On 22/02/2019 12:38, Oleksandr Andrushchenko wrote:
> > > > > > > > On 2/20/19 10:46 PM, Julien Grall wrote:
> > > > > > > > > Discussing with my team, a solution that came up would be to
> > > > > > > > > introduce one atomic field per event to record the number of
> > > > > > > > > event received. I will explore that solution tomorrow.
> > > > > > > > How will this help if events have some payload?
> > > > > > > What payload? The event channel does not carry any payload. It 
> > > > > > > only
> > > > > > > notify you that something happen. Then this is up to the user to
> > > > > > > decide what to you with it.
> > > > > > Sorry, I was probably not precise enough. I mean that an event 
> > > > > > might have
> > > > > > associated payload in the ring buffer, for example [1]. So, 
> > > > > > counting events
> > > > > > may help somehow, but the ring's data may still be lost
> > > > >  From my understanding of event channels are edge interrupts. By 
> > > > > definition,
> > > > IMO event channels are active high level interrupts.
> > > > 
> > > > Let's take into account the following situation: you have an event
> > > > channel masked and the event channel pending bit (akin to the line on
> > > > bare metal) goes from low to high (0 -> 1), then you unmask the
> > > > interrupt and you get an event injected. If it was an edge interrupt
> > > > you wont get an event injected after unmasking, because you would
> > > > have lost the edge. I think the problem here is that Linux treats
> > > > event channels as edge interrupts, when they are actually level.
> > > 
> > > Event channels are edge interrupts.  There are several very subtle bugs
> > > to be had by software which treats them as line interrupts.
> > > 
> > > Most critically, if you fail to ack them, rebind them to a new vcpu, and
> > > reenable interrupts, you don't get a new interrupt notification.  This
> > > was the source of a 4 month bug when XenServer was moving from
> > > classic-xen to PVOps where using irqbalance would cause dom0 to
> > > occasionally lose interrupts.
> > 
> > I would argue that you need to mask them first, rebind to a new vcpu
> > and unmask, and then you will get an interrupt notification, or this
> > should be fixed in Xen to work as you expect: trigger an interrupt
> > notification when moving an asserted event channel between CPUs.
> > 
> > Is there any document that describes how such non trivial things (like
> > moving between CPUs) work for event/level interrupts?
> > 
> > Maybe I'm being obtuse, but from the example I gave above it's quite
> > clear to me event channels don't get triggered based on edge changes,
> > but rather on the line level.
> 
> Your example above is not enough to give the semantics of level. You would
> only use the MASK bit if your interrupt handler is threaded to avoid the
> interrupt coming up again.
> 
> So if you remove the mask from the equation, then the interrupt flow should 
> be:
> 
> 1) handle interrupt
> 2) EOI

This is bogus if you don't mask the interrupt source. You should
instead do

1) EOI
2) Handle interrupt

And loop over this.

> The EOI in our case would be clearing the PENDING state. In a proper level
> interrupt, the state would stay PENDING if there were more to come. This is
> not the case with the events and you will lose the interrupt.
>
> So I don't think they are proper level interrupts. They have more a
> semantics of edge interrupts with some property of level (i.e for the
> mask/unmask).

OK, I guess it depends on how you look at it, to me event channels are
maybe quirky level interrupts, but are definitely closer to level than
edge interrupts, specially taking into account the interrupt injection
that happens on unmask of a pending line, there's no such thing at all
with edge interrupts.

Thanks, Roger.


Re: [Xen-devel] xen/evtchn and forced threaded irq

2019-02-26 Thread Julien Grall

Hi Roger,

On 26/02/2019 09:44, Roger Pau Monné wrote:

On Tue, Feb 26, 2019 at 09:30:07AM +, Andrew Cooper wrote:

On 26/02/2019 09:14, Roger Pau Monné wrote:

On Mon, Feb 25, 2019 at 01:55:42PM +, Julien Grall wrote:

Hi Oleksandr,

On 25/02/2019 13:24, Oleksandr Andrushchenko wrote:

On 2/22/19 3:33 PM, Julien Grall wrote:

Hi,

On 22/02/2019 12:38, Oleksandr Andrushchenko wrote:

On 2/20/19 10:46 PM, Julien Grall wrote:

Discussing with my team, a solution that came up would be to
introduce one atomic field per event to record the number of
event received. I will explore that solution tomorrow.

How will this help if events have some payload?

What payload? The event channel does not carry any payload. It only
notify you that something happen. Then this is up to the user to
decide what to you with it.

Sorry, I was probably not precise enough. I mean that an event might have
associated payload in the ring buffer, for example [1]. So, counting events
may help somehow, but the ring's data may still be lost

 From my understanding of event channels are edge interrupts. By definition,

IMO event channels are active high level interrupts.

Let's take into account the following situation: you have an event
channel masked and the event channel pending bit (akin to the line on
bare metal) goes from low to high (0 -> 1), then you unmask the
interrupt and you get an event injected. If it was an edge interrupt
you wont get an event injected after unmasking, because you would
have lost the edge. I think the problem here is that Linux treats
event channels as edge interrupts, when they are actually level.


Event channels are edge interrupts.  There are several very subtle bugs
to be had by software which treats them as line interrupts.

Most critically, if you fail to ack them, rebind them to a new vcpu, and
reenable interrupts, you don't get a new interrupt notification.  This
was the source of a 4 month bug when XenServer was moving from
classic-xen to PVOps where using irqbalance would cause dom0 to
occasionally lose interrupts.


I would argue that you need to mask them first, rebind to a new vcpu
and unmask, and then you will get an interrupt notification, or this
should be fixed in Xen to work as you expect: trigger an interrupt
notification when moving an asserted event channel between CPUs.

Is there any document that describes how such non trivial things (like
moving between CPUs) work for event/level interrupts?

Maybe I'm being obtuse, but from the example I gave above it's quite
clear to me event channels don't get triggered based on edge changes,
but rather on the line level.


Your example above is not enough to give the semantics of level. You would only 
use the MASK bit if your interrupt handler is threaded to avoid the interrupt 
coming up again.


So if you remove the mask from the equation, then the interrupt flow should be:

1) handle interrupt
2) EOI

The EOI in our case would be clearing the PENDING state. In a proper level 
interrupt, the state would stay PENDING if there were more to come. This is not 
the case with the events and you will lose the interrupt.


So I don't think they are proper level interrupts. They have more a semantics of 
edge interrupts with some property of level (i.e for the mask/unmask).


Cheers,

--
Julien Grall


RE: [Xen-devel] xen/evtchn and forced threaded irq

2019-02-26 Thread Paul Durrant
> -Original Message-
> From: Xen-devel [mailto:xen-devel-boun...@lists.xenproject.org] On Behalf Of 
> Andrew Cooper
> Sent: 26 February 2019 09:30
> To: Roger Pau Monne ; Julien Grall 
> 
> Cc: Juergen Gross ; Stefano Stabellini 
> ; Oleksandr
> Andrushchenko ; linux-kernel@vger.kernel.org; Jan Beulich 
> ;
> xen-devel ; Boris Ostrovsky 
> ; Dave P
> Martin 
> Subject: Re: [Xen-devel] xen/evtchn and forced threaded irq
> 
> On 26/02/2019 09:14, Roger Pau Monné wrote:
> > On Mon, Feb 25, 2019 at 01:55:42PM +, Julien Grall wrote:
> >> Hi Oleksandr,
> >>
> >> On 25/02/2019 13:24, Oleksandr Andrushchenko wrote:
> >>> On 2/22/19 3:33 PM, Julien Grall wrote:
> >>>> Hi,
> >>>>
> >>>> On 22/02/2019 12:38, Oleksandr Andrushchenko wrote:
> >>>>> On 2/20/19 10:46 PM, Julien Grall wrote:
> >>>>>> Discussing with my team, a solution that came up would be to
> >>>>>> introduce one atomic field per event to record the number of
> >>>>>> event received. I will explore that solution tomorrow.
> >>>>> How will this help if events have some payload?
> >>>> What payload? The event channel does not carry any payload. It only
> >>>> notify you that something happen. Then this is up to the user to
> >>>> decide what to you with it.
> >>> Sorry, I was probably not precise enough. I mean that an event might have
> >>> associated payload in the ring buffer, for example [1]. So, counting 
> >>> events
> >>> may help somehow, but the ring's data may still be lost
> >> From my understanding of event channels are edge interrupts. By definition,
> > IMO event channels are active high level interrupts.
> >
> > Let's take into account the following situation: you have an event
> > channel masked and the event channel pending bit (akin to the line on
> > bare metal) goes from low to high (0 -> 1), then you unmask the
> > interrupt and you get an event injected. If it was an edge interrupt
> > you wont get an event injected after unmasking, because you would
> > have lost the edge. I think the problem here is that Linux treats
> > event channels as edge interrupts, when they are actually level.
> 
> Event channels are edge interrupts.  There are several very subtle bugs
> to be had by software which treats them as line interrupts.

They are more subtle than that are they not? There is a single per-vcpu ACK 
which can cover multiple event channels.

  Paul

> 
> Most critically, if you fail to ack them, rebind them to a new vcpu, and
> reenable interrupts, you don't get a new interrupt notification.  This
> was the source of a 4 month bug when XenServer was moving from
> classic-xen to PVOps where using irqbalance would cause dom0 to
> occasionally lose interrupts.
> 
> ~Andrew
> 
> ___
> Xen-devel mailing list
> xen-de...@lists.xenproject.org
> https://lists.xenproject.org/mailman/listinfo/xen-devel


Re: [Xen-devel] xen/evtchn and forced threaded irq

2019-02-26 Thread Roger Pau Monné
On Tue, Feb 26, 2019 at 09:30:07AM +, Andrew Cooper wrote:
> On 26/02/2019 09:14, Roger Pau Monné wrote:
> > On Mon, Feb 25, 2019 at 01:55:42PM +, Julien Grall wrote:
> >> Hi Oleksandr,
> >>
> >> On 25/02/2019 13:24, Oleksandr Andrushchenko wrote:
> >>> On 2/22/19 3:33 PM, Julien Grall wrote:
>  Hi,
> 
>  On 22/02/2019 12:38, Oleksandr Andrushchenko wrote:
> > On 2/20/19 10:46 PM, Julien Grall wrote:
> >> Discussing with my team, a solution that came up would be to
> >> introduce one atomic field per event to record the number of
> >> event received. I will explore that solution tomorrow.
> > How will this help if events have some payload?
>  What payload? The event channel does not carry any payload. It only
>  notify you that something happen. Then this is up to the user to
>  decide what to you with it.
> >>> Sorry, I was probably not precise enough. I mean that an event might have
> >>> associated payload in the ring buffer, for example [1]. So, counting 
> >>> events
> >>> may help somehow, but the ring's data may still be lost
> >> From my understanding of event channels are edge interrupts. By definition,
> > IMO event channels are active high level interrupts.
> >
> > Let's take into account the following situation: you have an event
> > channel masked and the event channel pending bit (akin to the line on
> > bare metal) goes from low to high (0 -> 1), then you unmask the
> > interrupt and you get an event injected. If it was an edge interrupt
> > you wont get an event injected after unmasking, because you would
> > have lost the edge. I think the problem here is that Linux treats
> > event channels as edge interrupts, when they are actually level.
> 
> Event channels are edge interrupts.  There are several very subtle bugs
> to be had by software which treats them as line interrupts.
> 
> Most critically, if you fail to ack them, rebind them to a new vcpu, and
> reenable interrupts, you don't get a new interrupt notification.  This
> was the source of a 4 month bug when XenServer was moving from
> classic-xen to PVOps where using irqbalance would cause dom0 to
> occasionally lose interrupts.

I would argue that you need to mask them first, rebind to a new vcpu
and unmask, and then you will get an interrupt notification, or this
should be fixed in Xen to work as you expect: trigger an interrupt
notification when moving an asserted event channel between CPUs.

Is there any document that describes how such non trivial things (like
moving between CPUs) work for event/level interrupts?

Maybe I'm being obtuse, but from the example I gave above it's quite
clear to me event channels don't get triggered based on edge changes,
but rather on the line level.

Thanks, Roger.


Re: [Xen-devel] xen/evtchn and forced threaded irq

2019-02-26 Thread Andrew Cooper
On 26/02/2019 09:14, Roger Pau Monné wrote:
> On Mon, Feb 25, 2019 at 01:55:42PM +, Julien Grall wrote:
>> Hi Oleksandr,
>>
>> On 25/02/2019 13:24, Oleksandr Andrushchenko wrote:
>>> On 2/22/19 3:33 PM, Julien Grall wrote:
 Hi,

 On 22/02/2019 12:38, Oleksandr Andrushchenko wrote:
> On 2/20/19 10:46 PM, Julien Grall wrote:
>> Discussing with my team, a solution that came up would be to
>> introduce one atomic field per event to record the number of
>> event received. I will explore that solution tomorrow.
> How will this help if events have some payload?
 What payload? The event channel does not carry any payload. It only
 notify you that something happen. Then this is up to the user to
 decide what to you with it.
>>> Sorry, I was probably not precise enough. I mean that an event might have
>>> associated payload in the ring buffer, for example [1]. So, counting events
>>> may help somehow, but the ring's data may still be lost
>> From my understanding of event channels are edge interrupts. By definition,
> IMO event channels are active high level interrupts.
>
> Let's take into account the following situation: you have an event
> channel masked and the event channel pending bit (akin to the line on
> bare metal) goes from low to high (0 -> 1), then you unmask the
> interrupt and you get an event injected. If it was an edge interrupt
> you wont get an event injected after unmasking, because you would
> have lost the edge. I think the problem here is that Linux treats
> event channels as edge interrupts, when they are actually level.

Event channels are edge interrupts.  There are several very subtle bugs
to be had by software which treats them as line interrupts.

Most critically, if you fail to ack them, rebind them to a new vcpu, and
reenable interrupts, you don't get a new interrupt notification.  This
was the source of a 4 month bug when XenServer was moving from
classic-xen to PVOps where using irqbalance would cause dom0 to
occasionally lose interrupts.

~Andrew


Re: [Xen-devel] xen/evtchn and forced threaded irq

2019-02-26 Thread Roger Pau Monné
On Mon, Feb 25, 2019 at 01:55:42PM +, Julien Grall wrote:
> Hi Oleksandr,
> 
> On 25/02/2019 13:24, Oleksandr Andrushchenko wrote:
> > On 2/22/19 3:33 PM, Julien Grall wrote:
> > > Hi,
> > > 
> > > On 22/02/2019 12:38, Oleksandr Andrushchenko wrote:
> > > > On 2/20/19 10:46 PM, Julien Grall wrote:
> > > > > Discussing with my team, a solution that came up would be to
> > > > > introduce one atomic field per event to record the number of
> > > > > event received. I will explore that solution tomorrow.
> > > > How will this help if events have some payload?
> > > 
> > > What payload? The event channel does not carry any payload. It only
> > > notify you that something happen. Then this is up to the user to
> > > decide what to you with it.
> > Sorry, I was probably not precise enough. I mean that an event might have
> > associated payload in the ring buffer, for example [1]. So, counting events
> > may help somehow, but the ring's data may still be lost
> 
> From my understanding of event channels are edge interrupts. By definition,

IMO event channels are active high level interrupts.

Let's take into account the following situation: you have an event
channel masked and the event channel pending bit (akin to the line on
bare metal) goes from low to high (0 -> 1), then you unmask the
interrupt and you get an event injected. If it was an edge interrupt
you wont get an event injected after unmasking, because you would
have lost the edge. I think the problem here is that Linux treats
event channels as edge interrupts, when they are actually level.

Roger.


Re: [Xen-devel] xen/evtchn and forced threaded irq

2019-02-25 Thread Julien Grall

Hi Oleksandr,

On 25/02/2019 14:08, Oleksandr Andrushchenko wrote:

On 2/25/19 3:55 PM, Julien Grall wrote:

Hi Oleksandr,

On 25/02/2019 13:24, Oleksandr Andrushchenko wrote:

On 2/22/19 3:33 PM, Julien Grall wrote:

Hi,

On 22/02/2019 12:38, Oleksandr Andrushchenko wrote:

On 2/20/19 10:46 PM, Julien Grall wrote:
Discussing with my team, a solution that came up would be to introduce one 
atomic field per event to record the number of event received. I will 
explore that solution tomorrow.

How will this help if events have some payload?


What payload? The event channel does not carry any payload. It only notify 
you that something happen. Then this is up to the user to decide what to you 
with it.

Sorry, I was probably not precise enough. I mean that an event might have
associated payload in the ring buffer, for example [1]. So, counting events
may help somehow, but the ring's data may still be lost


From my understanding of event channels are edge interrupts. By definition, 
they can be merged so you can get a signal notification to the guest for 
multiple "events". So if you rely on the event to have an associated payload, 
then you probably have done something wrong in your driver.


I haven't implemented PV drivers myself, but I would expect either side to 
block if there were no space in the ring.


What do you do in the displif driver when the ring is full?


It is handled by the originator, the display backend in our case: it doesn't 
send
events if it sees that the ring will overflow. But I was worried about
such a generic change with counting number of events received and if this
really helps to recover in general case


Well, I was originally looking at modifying only the /dev/evtchn driver but it 
turns out the event flow for Xen irqchip is not entirely correct.


A lot of Xen PV drivers will thread the handler and set IRQF_ONESHOT expecting 
the event to be masked until the event handler has ran. However, the flow we use 
does not deal with it and actually warn you may receive another event before 
executing the handler for the first event.


I have discussed with Marc Z. (one of the irqchip maintainers) about the issue. 
He suggested a different interrupt flow that I need to try out.


Cheers,

--
Julien Grall


Re: [Xen-devel] xen/evtchn and forced threaded irq

2019-02-25 Thread Oleksandr Andrushchenko

On 2/25/19 3:55 PM, Julien Grall wrote:

Hi Oleksandr,

On 25/02/2019 13:24, Oleksandr Andrushchenko wrote:

On 2/22/19 3:33 PM, Julien Grall wrote:

Hi,

On 22/02/2019 12:38, Oleksandr Andrushchenko wrote:

On 2/20/19 10:46 PM, Julien Grall wrote:
Discussing with my team, a solution that came up would be to 
introduce one atomic field per event to record the number of event 
received. I will explore that solution tomorrow.

How will this help if events have some payload?


What payload? The event channel does not carry any payload. It only 
notify you that something happen. Then this is up to the user to 
decide what to you with it.
Sorry, I was probably not precise enough. I mean that an event might 
have
associated payload in the ring buffer, for example [1]. So, counting 
events

may help somehow, but the ring's data may still be lost


From my understanding of event channels are edge interrupts. By 
definition, they can be merged so you can get a signal notification to 
the guest for multiple "events". So if you rely on the event to have 
an associated payload, then you probably have done something wrong in 
your driver.


I haven't implemented PV drivers myself, but I would expect either 
side to block if there were no space in the ring.


What do you do in the displif driver when the ring is full?

It is handled by the originator, the display backend in our case: it 
doesn't send

events if it sees that the ring will overflow. But I was worried about
such a generic change with counting number of events received and if this
really helps to recover in general case

Cheers,

[1] 
https://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=xen/include/public/io/displif.h;h=cc5de9cb1f35dedc99c866d73d086b19e496852a;hb=HEAD#l756 







Re: [Xen-devel] xen/evtchn and forced threaded irq

2019-02-25 Thread Julien Grall

Hi Oleksandr,

On 25/02/2019 13:24, Oleksandr Andrushchenko wrote:

On 2/22/19 3:33 PM, Julien Grall wrote:

Hi,

On 22/02/2019 12:38, Oleksandr Andrushchenko wrote:

On 2/20/19 10:46 PM, Julien Grall wrote:
Discussing with my team, a solution that came up would be to introduce one 
atomic field per event to record the number of event received. I will 
explore that solution tomorrow.

How will this help if events have some payload?


What payload? The event channel does not carry any payload. It only notify you 
that something happen. Then this is up to the user to decide what to you with it.

Sorry, I was probably not precise enough. I mean that an event might have
associated payload in the ring buffer, for example [1]. So, counting events
may help somehow, but the ring's data may still be lost


From my understanding of event channels are edge interrupts. By definition, 
they can be merged so you can get a signal notification to the guest for 
multiple "events". So if you rely on the event to have an associated payload, 
then you probably have done something wrong in your driver.


I haven't implemented PV drivers myself, but I would expect either side to block 
if there were no space in the ring.


What do you do in the displif driver when the ring is full?

Cheers,

[1] 
https://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=xen/include/public/io/displif.h;h=cc5de9cb1f35dedc99c866d73d086b19e496852a;hb=HEAD#l756 



--
Julien Grall


Re: [Xen-devel] xen/evtchn and forced threaded irq

2019-02-25 Thread Oleksandr Andrushchenko

On 2/22/19 3:33 PM, Julien Grall wrote:

Hi,

On 22/02/2019 12:38, Oleksandr Andrushchenko wrote:

On 2/20/19 10:46 PM, Julien Grall wrote:
Discussing with my team, a solution that came up would be to 
introduce one atomic field per event to record the number of event 
received. I will explore that solution tomorrow.

How will this help if events have some payload?


What payload? The event channel does not carry any payload. It only 
notify you that something happen. Then this is up to the user to 
decide what to you with it.

Sorry, I was probably not precise enough. I mean that an event might have
associated payload in the ring buffer, for example [1]. So, counting events
may help somehow, but the ring's data may still be lost


Cheers,

[1] 
https://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=xen/include/public/io/displif.h;h=cc5de9cb1f35dedc99c866d73d086b19e496852a;hb=HEAD#l756


Re: [Xen-devel] xen/evtchn and forced threaded irq

2019-02-22 Thread Julien Grall

Hi,

On 22/02/2019 12:38, Oleksandr Andrushchenko wrote:

On 2/20/19 10:46 PM, Julien Grall wrote:
Discussing with my team, a solution that came up would be to introduce one 
atomic field per event to record the number of event received. I will explore 
that solution tomorrow.

How will this help if events have some payload?


What payload? The event channel does not carry any payload. It only notify you 
that something happen. Then this is up to the user to decide what to you with it.


Cheers,

--
Julien Grall


Re: [Xen-devel] xen/evtchn and forced threaded irq

2019-02-22 Thread Oleksandr Andrushchenko

On 2/20/19 10:46 PM, Julien Grall wrote:

(+ Andrew and Jan for feedback on the event channel interrupt)

Hi Boris,

Thank you for the your feedback.

On 2/20/19 8:04 PM, Boris Ostrovsky wrote:

On 2/20/19 1:05 PM, Julien Grall wrote:

Hi,

On 20/02/2019 17:07, Boris Ostrovsky wrote:

On 2/20/19 9:15 AM, Julien Grall wrote:

Hi Boris,

Thank you for your answer.

On 20/02/2019 00:02, Boris Ostrovsky wrote:

On Tue, Feb 19, 2019 at 05:31:10PM +, Julien Grall wrote:

Hi all,

I have been looking at using Linux RT in Dom0. Once the guest is
started,
the console is ending to have a lot of warning (see trace below).

After some investigation, this is because the irq handler will now
be threaded.
I can reproduce the same error with the vanilla Linux when passing
the option
'threadirqs' on the command line (the trace below is from 5.0.0-rc7
that has
not RT support).

FWIW, the interrupt for port 6 is used to for the guest to
communicate with
xenstore.

   From my understanding, this is happening because the interrupt
handler is now
run in a thread. So we can have the following happening.

  Interrupt context    | Interrupt thread
   |
  receive interrupt port 6 |
  clear the evtchn port    |
  set IRQF_RUNTHREAD    |
  kick interrupt thread    |
   |    clear IRQF_RUNTHREAD
   |    call evtchn_interrupt
  receive interrupt port 6 |
  clear the evtchn port    |
  set IRQF_RUNTHREAD   |
  kick interrupt thread    |
   |    disable interrupt port 6
   | evtchn->enabled = false
   |    []
   |
   |    *** Handling the second
interrupt ***
   |    clear IRQF_RUNTHREAD
   |    call evtchn_interrupt
   |    WARN(...)

I am not entirely sure how to fix this. I have two solutions in 
mind:


1) Prevent the interrupt handler to be threaded. We would also
need to
switch from spin_lock to raw_spin_lock as the former may sleep on
RT-Linux.

2) Remove the warning


I think access to evtchn->enabled is racy so (with or without the
warning) we can't use it reliably.


Thinking about it, it would not be the only issue. The ring is sized
to contain only one instance of the same event. So if you receive
twice the event, you may overflow the ring.


Hm... That's another argument in favor of "unthreading" the handler.


I first thought it would be possible to unthread it. However,
wake_up_interruptible is using a spin_lock. On RT spin_lock can sleep,
so this cannot be used in an interrupt context.

So I think "unthreading" the handler is not an option here.


That sounds like a different problem. I.e. there are two issues:
* threaded interrupts don't work properly (races, ring overflow)
* evtchn_interrupt() (threaded or not) has spin_lock(), which is not
going to work for RT


I am afraid that's not correct, you can use spin_lock() in threaded 
interrupt handler.



The first can be fixed by using non-threaded handlers.


The two are somewhat related, if you use a non-threaded handler then 
you are not going to help the RT case.


In general, the unthreaded solution should be used in the last resort.









Another alternative could be to queue the irq if !evtchn->enabled 
and

handle it in evtchn_write() (which is where irq is supposed to be
re-enabled).

What do you mean by queue? Is it queueing in the ring?



No, I was thinking about having a new structure for deferred 
interrupts.


Hmmm, I am not entirely sure what would be the structure here. Could
you expand your thinking?


Some sort of a FIFO that stores {irq, data} tuple. It could obviously be
implemented as a ring but not necessarily as Xen shared ring (if that's
what you were referring to).


The underlying question is what happen if you miss an interrupt. Is it 
going to be ok? If no, then we have to record everything and can't 
kill/send an error to the user app because it is not its fault.


This means a FIFO would not be a viable. How do you size it? Static 
(i.e pre-defined) size is not going to be possible because you don't 
know how many interrupt you are going to receive before the thread 
handler runs. You can't make it grow dynamically as it make become 
quite big for the same reason.


Discussing with my team, a solution that came up would be to introduce 
one atomic field per event to record the number of event received. I 
will explore that solution tomorrow.

How will this help if events have some payload?


Cheers,





Re: [Xen-devel] xen/evtchn and forced threaded irq

2019-02-21 Thread Julien Grall
Hi Roger,

On 21/02/2019 09:14, Roger Pau Monné wrote:
> On Thu, Feb 21, 2019 at 08:38:39AM +, Julien Grall wrote:
>> Hi Roger,
>>
>> On Thu, 21 Feb 2019, 08:08 Roger Pau Monné,  wrote:
>>
>>> FWIW, you can also mask the interrupt while waiting for the thread to
>>> execute the interrupt handler. Ie:
>>>
>>
>> Thank you for providing steps, however where would the masking be done? By
>> the irqchip or a custom solution?
>
> I'm not familiar with the irqchip infrastructure in Linux, what I
> proposed below is what FreeBSD does when running interrupt handlers in
> deferred threads IIRC.
>
> If irqchip has a specific handler to dispatch to a thread, then that's
> the place where the masking should happen. Likely, the unmasking
> should be done by the irq handling infrastructure after the thread
> executing the interrupt handler has finished.
>
> Isn't there a similar way to handle interrupts in threads for Linux?

Linux has a flag (IRQF_ONESHOT) to mask interrupt until the interrupt
handler has been run. It is set for all interrupts handler that have
been forced to be threaded.

However, it looks like the flag is been ignored by the irqchip handler
we use (handle_edge_irq). Doing a bit of digging, IRQF_ONESHOT use to be
handled in handle_edge_irq until the following commit from 2009:

commit 4dbc9ca219b0f294332e734528f7b82211700170
Author: Thomas Gleixner 
Date:   Thu Aug 27 09:38:49 2009 +0200

 genirq: Do not mask oneshot edge type interrupts

 Masking oneshot edge type interrupts is wrong as we might lose an
 interrupt which is issued when the threaded handler is handling the
 device. We can keep the irq unmasked safely as with edge type
 interrupts there is no danger of interrupt floods. If the threaded
 handler has not yet finished then IRQTF_RUNTHREAD is set which will
 keep the handler thread active.

 Debugged and verified in preempt-rt.

 Signed-off-by: Thomas Gleixner 

Furthermore, it is pretty clear from the comment on top of
handle_edge_irq that the same interrupt can come-up before the first one
is one is handled by the associated event handler.

I am still trying to figure out whether the issue lies in the evtchn
driver or the Xen irqchip (events_base.c). I will have a closer look and
come back with updates here.

Cheers,

--
Julien Grall
IMPORTANT NOTICE: The contents of this email and any attachments are 
confidential and may also be privileged. If you are not the intended recipient, 
please notify the sender immediately and do not disclose the contents to any 
other person, use it for any purpose, or store or copy the information in any 
medium. Thank you.


Re: [Xen-devel] xen/evtchn and forced threaded irq

2019-02-21 Thread Roger Pau Monné
On Thu, Feb 21, 2019 at 08:38:39AM +, Julien Grall wrote:
> Hi Roger,
> 
> On Thu, 21 Feb 2019, 08:08 Roger Pau Monné,  wrote:
> 
> > FWIW, you can also mask the interrupt while waiting for the thread to
> > execute the interrupt handler. Ie:
> >
> 
> Thank you for providing steps, however where would the masking be done? By
> the irqchip or a custom solution?

I'm not familiar with the irqchip infrastructure in Linux, what I
proposed below is what FreeBSD does when running interrupt handlers in
deferred threads IIRC.

If irqchip has a specific handler to dispatch to a thread, then that's
the place where the masking should happen. Likely, the unmasking
should be done by the irq handling infrastructure after the thread
executing the interrupt handler has finished.

Isn't there a similar way to handle interrupts in threads for Linux?

> 
> > 1. Interrupt injected
> > 2. Execute guest event channel callback
> > 3. Scan for pending interrupts
> > 4. Mask interrupt
> > 5. Clear pending field
> > 6. Queue threaded handler
> > 7. Go to 3 until all interrupts are drained
> > [...]
> > 8. Execute interrupt handler in thread
> > 9. Unmask interrupt
> >
> > That should prevent you from stacking interrupts?
> >
> > Roger.
> >


Re: [Xen-devel] xen/evtchn and forced threaded irq

2019-02-21 Thread Juergen Gross
On 21/02/2019 09:38, Julien Grall wrote:
> Hi Roger,
> 
> On Thu, 21 Feb 2019, 08:08 Roger Pau Monné,  > wrote:
> 
> FWIW, you can also mask the interrupt while waiting for the thread to
> execute the interrupt handler. Ie:
> 
> 
> Thank you for providing steps, however where would the masking be done?
> By the irqchip or a custom solution?

I'd go the irqchip route. This would be the least intrusive change
without the need for handling RT in a special way.


Juergen


Re: [Xen-devel] xen/evtchn and forced threaded irq

2019-02-21 Thread Roger Pau Monné
On Wed, Feb 20, 2019 at 10:03:57PM +, Julien Grall wrote:
> Hi Boris,
> 
> On 2/20/19 9:46 PM, Boris Ostrovsky wrote:
> > On 2/20/19 3:46 PM, Julien Grall wrote:
> > > (+ Andrew and Jan for feedback on the event channel interrupt)
> > > 
> > > Hi Boris,
> > > 
> > > Thank you for the your feedback.
> > > 
> > > On 2/20/19 8:04 PM, Boris Ostrovsky wrote:
> > > > On 2/20/19 1:05 PM, Julien Grall wrote:
> > > > > Hi,
> > > > > 
> > > > > On 20/02/2019 17:07, Boris Ostrovsky wrote:
> > > > > > On 2/20/19 9:15 AM, Julien Grall wrote:
> > > > > > > Hi Boris,
> > > > > > > 
> > > > > > > Thank you for your answer.
> > > > > > > 
> > > > > > > On 20/02/2019 00:02, Boris Ostrovsky wrote:
> > > > > > > > On Tue, Feb 19, 2019 at 05:31:10PM +, Julien Grall wrote:
> > > > > > > > > Hi all,
> > > > > > > > > 
> > > > > > > > > I have been looking at using Linux RT in Dom0. Once the guest 
> > > > > > > > > is
> > > > > > > > > started,
> > > > > > > > > the console is ending to have a lot of warning (see trace 
> > > > > > > > > below).
> > > > > > > > > 
> > > > > > > > > After some investigation, this is because the irq handler 
> > > > > > > > > will now
> > > > > > > > > be threaded.
> > > > > > > > > I can reproduce the same error with the vanilla Linux when 
> > > > > > > > > passing
> > > > > > > > > the option
> > > > > > > > > 'threadirqs' on the command line (the trace below is from 
> > > > > > > > > 5.0.0-rc7
> > > > > > > > > that has
> > > > > > > > > not RT support).
> > > > > > > > > 
> > > > > > > > > FWIW, the interrupt for port 6 is used to for the guest to
> > > > > > > > > communicate with
> > > > > > > > > xenstore.
> > > > > > > > > 
> > > > > > > > >     From my understanding, this is happening because the 
> > > > > > > > > interrupt
> > > > > > > > > handler is now
> > > > > > > > > run in a thread. So we can have the following happening.
> > > > > > > > > 
> > > > > > > > >    Interrupt context    | Interrupt thread
> > > > > > > > >     |
> > > > > > > > >    receive interrupt port 6 |
> > > > > > > > >    clear the evtchn port    |
> > > > > > > > >    set IRQF_RUNTHREAD    |
> > > > > > > > >    kick interrupt thread    |
> > > > > > > > >     |    clear IRQF_RUNTHREAD
> > > > > > > > >     |    call evtchn_interrupt
> > > > > > > > >    receive interrupt port 6 |
> > > > > > > > >    clear the evtchn port    |
> > > > > > > > >    set IRQF_RUNTHREAD   |
> > > > > > > > >    kick interrupt thread    |
> > > > > > > > >     |    disable interrupt 
> > > > > > > > > port 6
> > > > > > > > >     |    evtchn->enabled = 
> > > > > > > > > false
> > > > > > > > >     |    []
> > > > > > > > >     |
> > > > > > > > >     |    *** Handling the 
> > > > > > > > > second
> > > > > > > > > interrupt ***
> > > > > > > > >     |    clear IRQF_RUNTHREAD
> > > > > > > > >     |    call evtchn_interrupt
> > > > > > > > >     |    WARN(...)
> > > > > > > > > 
> > > > > > > > > I am not entirely sure how to fix this. I have two solutions 
> > > > > > > > > in
> > > > > > > > > mind:
> > > > > > > > > 
> > > > > > > > > 1) Prevent the interrupt handler to be threaded. We would also
> > > > > > > > > need to
> > > > > > > > > switch from spin_lock to raw_spin_lock as the former may 
> > > > > > > > > sleep on
> > > > > > > > > RT-Linux.
> > > > > > > > > 
> > > > > > > > > 2) Remove the warning
> > > > > > > > 
> > > > > > > > I think access to evtchn->enabled is racy so (with or without 
> > > > > > > > the
> > > > > > > > warning) we can't use it reliably.
> > > > > > > 
> > > > > > > Thinking about it, it would not be the only issue. The ring is 
> > > > > > > sized
> > > > > > > to contain only one instance of the same event. So if you receive
> > > > > > > twice the event, you may overflow the ring.
> > > > > > 
> > > > > > Hm... That's another argument in favor of "unthreading" the handler.
> > > > > 
> > > > > I first thought it would be possible to unthread it. However,
> > > > > wake_up_interruptible is using a spin_lock. On RT spin_lock can sleep,
> > > > > so this cannot be used in an interrupt context.
> > > > > 
> > > > > So I think "unthreading" the handler is not an option here.
> > > > 
> > > > That sounds like a different problem. I.e. there are two issues:
> > > > * threaded interrupts don't work properly (races, ring overflow)
> > > > * evtchn_interrupt() (threaded or not) has spin_lock(), which is not
> > > > going to work for RT
> > > 
> > > I am afraid that's not correct, you can use spin_lock() in threaded
> > > interrupt handler.
> > 
> > In