Re: [Xen-devel] [RFC] ARM: New (Xen) VGIC design document

2017-11-02 Thread Stefano Stabellini
On Thu, 2 Nov 2017, Andre Przywara wrote:
>  (CC:ing some KVM/ARM folks involved in the VGIC)
> 
>  starting with the addition of the ITS support we were seeing more and
>  more issues with the current implementation of our ARM Generic Interrupt
>  Controller (GIC) emulation, the VGIC.
>  Among other approaches to fix those issues it was proposed to copy the
>  VGIC emulation used in KVM. This one was suffering from very similar
>  issues, and a clean design from scratch led to a very robust and
>  capable re-implementation. Interestingly this implementation is fairly
>  self-contained, so it seems feasible to copy it. Hopefully we only need
>  minor adjustments, possibly we can even copy it verbatim with some
>  additional glue layer code.
> 
>  Stefano asked for a design overview, to assess the feasibility
>  of copying the KVM code without reviewing tons of code in the first
>  place.
>  So to follow Xen rules for new features, this design document below is
>  an attempt to describe the current KVM VGIC design - in a hypervisor
>  agnostic fashion. It is a bit of a retro-fit design description, as it
>  is not strictly forward-looking only, but actually describing the
>  existing implementation [1].
> 
>  Please have a look and let me know:
>  1) if this document has the right scope
>  2) if this document has the right level of detail
>  3) if there are points missing from the document
>  4) if the design in general is a fit
> >>>
> >>> Please read the following statements as genuine questions and concerns.
> >>> Most ideas on this document are good. Some of them I have even suggested
> >>> myself in the context of GIC improvements for Xen. I asked for a
> >>> couple of clarifications.
> >>>
> >>> But I don't see why we cannot implement these ideas on top of the
> >>> existing code, rather than with a separate codebase, ending up with two
> >>> drivers. I would prefer a natural evolution. Specifically, the following
> >>> improvements would be simple and would give us most of the benefits on
> >>> top of the current codebase:
> >>> - adding the irq lock, and the refcount
> >>> - taking both vcpu locks when necessary (on migration code for example
> >>>   it would help a lot), the lower vcpu_id first
> >>> - level irq emulation
> >>
> >> I think some of those points you mentioned are not easily implemented in
> >> the current Xen. For instance I ran into locking order issues with those
> >> *two* inflight and lr_queue lists, when trying to implement the lock and
> >> the refcount.
> >> Also this "put vIRQs into LRs early, but possibly rip them out again" is
> >> really complicating things a lot.
> >>
> >> I believe only level IRQs could be added in a relatively
> >> straightforward manner.
> >>
> >> So the problem with the evolutionary approach is that it generates a lot
> >> of patches, some of them quite invasive, others creating hard-to-read
> >> diffs, which are both hard to review.
> >> And chances are that the actual result would be pretty close to the KVM
> >> code. To be clear: I hacked the Xen VGIC into the KVM direction in a few
> >> days some months ago, but it took me *weeks* to make sane patches of
> >> only the first part of it.
> >> And this would not cover all those general, tedious corner cases that
> >> the VGIC comes with. Those would need to be fixed in a painful process,
> >> which we could avoid by "lifting" the KVM code.
> > 
> > I hear you, but the principal cost here is the review time, not the
> > development time. Julien told me that it would be pretty much the same
> > for him in terms of the time it takes to review the changes; it doesn't
> > matter whether it's a new driver or changes to the existing driver. For me,
> > it wouldn't be the same: I think it would take me far less time to
> > review them if they were against the existing codebase.
> 
> I am not so sure about this. The changes are quite dramatic, and those
> changes tend to produce horrible diffs. We could try to mitigate this, but
> that comes at the cost of having *many* patches, which take a while to
> produce.
> But if we instantiate a new VGIC implementation from scratch, we can
> provide very nice-to-review patches, because the patches can focus on
> logical changes and don't need to care about bisectability.

All right


> > However, as I wrote, this is not my foremost concern. I would be up for
> > committing myself to review this even if we decide to go for a new
> > driver.
> > 
> > 
> >>> If we do end up with a second separate driver for technical or process
> >>> reasons, I would expect the regular Xen submission/review process to be
> >>> followed. The code style will be different, the hooks into the rest of
> >>> the hypervisor will be different and things will be generally changed.
> >>> The new V/GIC might be derived from KVM, but it should end up looking
> >>> and feeling like a 100% genuine Xen component.

Re: [Xen-devel] [RFC] ARM: New (Xen) VGIC design document

2017-11-02 Thread Andre Przywara
Hi,

On 01/11/17 21:54, Stefano Stabellini wrote:
> On Wed, 1 Nov 2017, Andre Przywara wrote:
>> Hi Stefano,
>>
>>
>> On 01/11/17 01:58, Stefano Stabellini wrote:
>>> On Wed, 11 Oct 2017, Andre Przywara wrote:
>>
>> many thanks for going through all of this!
> 
> No problems, and thanks for your work and for caring about doing the
> best thing for the project.
> 
> 
 (CC:ing some KVM/ARM folks involved in the VGIC)

 starting with the addition of the ITS support we were seeing more and
 more issues with the current implementation of our ARM Generic Interrupt
 Controller (GIC) emulation, the VGIC.
 Among other approaches to fix those issues it was proposed to copy the
 VGIC emulation used in KVM. This one was suffering from very similar
 issues, and a clean design from scratch led to a very robust and
 capable re-implementation. Interestingly this implementation is fairly
 self-contained, so it seems feasible to copy it. Hopefully we only need
 minor adjustments, possibly we can even copy it verbatim with some
 additional glue layer code.

 Stefano asked for a design overview, to assess the feasibility
 of copying the KVM code without reviewing tons of code in the first
 place.
 So to follow Xen rules for new features, this design document below is
 an attempt to describe the current KVM VGIC design - in a hypervisor
 agnostic fashion. It is a bit of a retro-fit design description, as it
 is not strictly forward-looking only, but actually describing the
 existing implementation [1].

 Please have a look and let me know:
 1) if this document has the right scope
 2) if this document has the right level of detail
 3) if there are points missing from the document
 4) if the design in general is a fit
>>>
>>> Please read the following statements as genuine questions and concerns.
>>> Most ideas on this document are good. Some of them I have even suggested
>>> myself in the context of GIC improvements for Xen. I asked for a
>>> couple of clarifications.
>>>
>>> But I don't see why we cannot implement these ideas on top of the
>>> existing code, rather than with a separate codebase, ending up with two
>>> drivers. I would prefer a natural evolution. Specifically, the following
>>> improvements would be simple and would give us most of the benefits on
>>> top of the current codebase:
>>> - adding the irq lock, and the refcount
>>> - taking both vcpu locks when necessary (on migration code for example
>>>   it would help a lot), the lower vcpu_id first
>>> - level irq emulation
>>
>> I think some of those points you mentioned are not easily implemented in
>> the current Xen. For instance I ran into locking order issues with those
>> *two* inflight and lr_queue lists, when trying to implement the lock and
>> the refcount.
>> Also this "put vIRQs into LRs early, but possibly rip them out again" is
>> really complicating things a lot.
>>
>> I believe only level IRQs could be added in a relatively
>> straightforward manner.
>>
>> So the problem with the evolutionary approach is that it generates a lot
>> of patches, some of them quite invasive, others creating hard-to-read
>> diffs, which are both hard to review.
>> And chances are that the actual result would be pretty close to the KVM
>> code. To be clear: I hacked the Xen VGIC into the KVM direction in a few
>> days some months ago, but it took me *weeks* to make sane patches of
>> only the first part of it.
>> And this would not cover all those general, tedious corner cases that
>> the VGIC comes with. Those would need to be fixed in a painful process,
>> which we could avoid by "lifting" the KVM code.
> 
> I hear you, but the principal cost here is the review time, not the
> development time. Julien told me that it would be pretty much the same
> for him in terms of the time it takes to review the changes; it doesn't
> matter whether it's a new driver or changes to the existing driver. For me,
> it wouldn't be the same: I think it would take me far less time to
> review them if they were against the existing codebase.

I am not so sure about this. The changes are quite dramatic, and those
changes tend to produce horrible diffs. We could try to mitigate this, but
that comes at the cost of having *many* patches, which take a while to
produce.
But if we instantiate a new VGIC implementation from scratch, we can
provide very nice-to-review patches, because the patches can focus on
logical changes and don't need to care about bisectability.

> However, as I wrote, this is not my foremost concern. I would be up for
> committing myself to review this even if we decide to go for a new
> driver.
> 
> 
>>> If we do end up with a second separate driver for technical or process
>>> reasons, I would expect the regular Xen submission/review process to be
>>> followed. The code style will be different, the hooks into the rest of
>>> the hypervisor will be different and things will be generally changed.

Re: [Xen-devel] [RFC] ARM: New (Xen) VGIC design document

2017-11-02 Thread Christoffer Dall
On Wed, Nov 1, 2017 at 10:54 PM, Stefano Stabellini
 wrote:

[...]

>
>> > The suggestion of using this model in Xen was made in the past already.
>> > I always objected for the reason that we don't actually know how many
>> > LRs the hardware provides, potentially very many, and it is expensive
>> > and needless to read/write them all every time on entry/exit.
>> >
>> > I would prefer to avoid that, but I'll be honest: I can be convinced
>> > that that model of handling LRs is so much simpler that it is worth it.
>> > I am more concerned about the future maintenance of a separate new
>> > driver developed elsewhere.
>>
>> I think this LR topic should have been covered in that other email.
>>
>> Besides being a strong supporter of the KISS principle in general, I
>> believe in case of the GIC emulation we should avoid (premature)
>> optimizations like the plague, as there are quite some corner cases in
>> any VGIC, and handling all of them explicitly with some hacks will not
>> fly (been there, done that).
>> So I can just support Christoffer's point: having an architecture
>> compliant VGIC emulation in a maintainable manner requires a
>> straight-forward and clear design. Everything else should be secondary,
>> and can be evaluated later, if there are good reasons (numbers!).
>
> The reason why I stated the above is that I ran the numbers back in the
> day and reading or writing LRs on an XGene was so slow that it made
> sense to avoid it as much as possible. But maybe things have changed if
> Christoffer also ran the numbers and managed to demonstrate the
> opposite.

Accessing LRs on GICv2 is terrible indeed, and we already optimize it
as much as it makes sense.  It's just that with the current KVM code
base reading/writing the same value almost never happens, so there's
no more room (in practice) for optimization.

Also, note that with GICv3 the cost goes down a lot, and potentially also
for other integrations of GICv2.

-Christoffer



Re: [Xen-devel] [RFC] ARM: New (Xen) VGIC design document

2017-11-02 Thread Christoffer Dall
On Wed, Nov 1, 2017 at 10:15 AM, Andre Przywara
 wrote:
> Hi,
>
> On 01/11/17 04:31, Christoffer Dall wrote:
>> On Wed, Nov 1, 2017 at 9:58 AM, Stefano Stabellini
>>  wrote:
>>
>> []
>
> Christoffer, many thanks for answering this!
> I think we have a lot of assumptions about the whole VGIC life cycle
> floating around, but it would indeed be good to get some numbers behind it.
> I would be all too happy to trace some workloads on Xen again and
> get some metrics, though this sounds time-consuming if done properly.
>
> Do you have any numbers on VGIC performance available somewhere?
>

I ran the full set of workloads and micro numbers that I've used for
my papers with/without the optimization, and couldn't find a
measurable difference.

-Christoffer



Re: [Xen-devel] [RFC] ARM: New (Xen) VGIC design document

2017-11-01 Thread Stefano Stabellini
On Wed, 1 Nov 2017, Andre Przywara wrote:
> Hi Stefano,
> 
> 
> On 01/11/17 01:58, Stefano Stabellini wrote:
> > On Wed, 11 Oct 2017, Andre Przywara wrote:
> 
> many thanks for going through all of this!

No problems, and thanks for your work and for caring about doing the
best thing for the project.


> >> (CC:ing some KVM/ARM folks involved in the VGIC)
> >>
> >> starting with the addition of the ITS support we were seeing more and
> >> more issues with the current implementation of our ARM Generic Interrupt
> >> Controller (GIC) emulation, the VGIC.
> >> Among other approaches to fix those issues it was proposed to copy the
> >> VGIC emulation used in KVM. This one was suffering from very similar
> >> issues, and a clean design from scratch led to a very robust and
> >> capable re-implementation. Interestingly this implementation is fairly
> >> self-contained, so it seems feasible to copy it. Hopefully we only need
> >> minor adjustments, possibly we can even copy it verbatim with some
> >> additional glue layer code.
> >>
> >> Stefano asked for a design overview, to assess the feasibility
> >> of copying the KVM code without reviewing tons of code in the first
> >> place.
> >> So to follow Xen rules for new features, this design document below is
> >> an attempt to describe the current KVM VGIC design - in a hypervisor
> >> agnostic fashion. It is a bit of a retro-fit design description, as it
> >> is not strictly forward-looking only, but actually describing the
> >> existing implementation [1].
> >>
> >> Please have a look and let me know:
> >> 1) if this document has the right scope
> >> 2) if this document has the right level of detail
> >> 3) if there are points missing from the document
> >> 4) if the design in general is a fit
> > 
> > Please read the following statements as genuine questions and concerns.
> > Most ideas on this document are good. Some of them I have even suggested
> > myself in the context of GIC improvements for Xen. I asked for a
> > couple of clarifications.
> > 
> > But I don't see why we cannot implement these ideas on top of the
> > existing code, rather than with a separate codebase, ending up with two
> > drivers. I would prefer a natural evolution. Specifically, the following
> > improvements would be simple and would give us most of the benefits on
> > top of the current codebase:
> > - adding the irq lock, and the refcount
> > - taking both vcpu locks when necessary (on migration code for example
> >   it would help a lot), the lower vcpu_id first
> > - level irq emulation
> 
> I think some of those points you mentioned are not easily implemented in
> the current Xen. For instance I ran into locking order issues with those
> *two* inflight and lr_queue lists, when trying to implement the lock and
> the refcount.
> Also this "put vIRQs into LRs early, but possibly rip them out again" is
> really complicating things a lot.
> 
> I believe only level IRQs could be added in a relatively
> straightforward manner.
> 
> So the problem with the evolutionary approach is that it generates a lot
> of patches, some of them quite invasive, others creating hard-to-read
> diffs, which are both hard to review.
> And chances are that the actual result would be pretty close to the KVM
> code. To be clear: I hacked the Xen VGIC into the KVM direction in a few
> days some months ago, but it took me *weeks* to make sane patches of
> only the first part of it.
> And this would not cover all those general, tedious corner cases that
> the VGIC comes with. Those would need to be fixed in a painful process,
> which we could avoid by "lifting" the KVM code.

I hear you, but the principal cost here is the review time, not the
development time. Julien told me that it would be pretty much the same
for him in terms of the time it takes to review the changes; it doesn't
matter whether it's a new driver or changes to the existing driver. For me,
it wouldn't be the same: I think it would take me far less time to
review them if they were against the existing codebase.

However, as I wrote, this is not my foremost concern. I would be up for
committing myself to review this even if we decide to go for a new
driver.


> > If we do end up with a second separate driver for technical or process
> > reasons, I would expect the regular Xen submission/review process to be
> > followed. The code style will be different, the hooks into the rest of
> > the hypervisor will be different and things will be generally changed.
> > The new V/GIC might be derived from KVM, but it should end up looking
> > and feeling like a 100% genuine Xen component. After all, we'll
> > maintain it going forward. I don't want a copy of a Linux driver with
> > glue code. The Xen community cannot be expected not to review the
> > submission, but if we review it, then we'll ask for changes. Once we
> > change the code, there will be no point in keeping the Linux code
> > separate with glue code. We should fully adapt it to Xen.
> 
> 

Re: [Xen-devel] [RFC] ARM: New (Xen) VGIC design document

2017-11-01 Thread Andre Przywara
Hi Christoffer,

On 12/10/17 13:05, Christoffer Dall wrote:
> Hi Andre,
> 
> On Wed, Oct 11, 2017 at 03:33:03PM +0100, Andre Przywara wrote:
>> Hi,
>>
>> (CC:ing some KVM/ARM folks involved in the VGIC)
> 
> Very nice writeup!
> 
> I added a bunch of comments, mostly for the writing and clarity, I hope
> it helps.

Thank you very much for the response and the comments! I really
appreciate your precise (academic) language here.
I held back the response since Stefano was the actual addressee of this
write-up, so: sorry for the delay.

>> starting with the addition of the ITS support we were seeing more and
>> more issues with the current implementation of our ARM Generic Interrupt
>> Controller (GIC) emulation, the VGIC.
>> Among other approaches to fix those issues it was proposed to copy the
>> VGIC emulation used in KVM. This one was suffering from very similar
>> issues, and a clean design from scratch led to a very robust and
>> capable re-implementation. Interestingly this implementation is fairly
>> self-contained, so it seems feasible to copy it. Hopefully we only need
>> minor adjustments, possibly we can even copy it verbatim with some
>> additional glue layer code.
>> Stefano asked for a design overview, to assess the feasibility
>> of copying the KVM code without reviewing tons of code in the first
>> place.
>> So to follow Xen rules for new features, this design document below is
>> an attempt to describe the current KVM VGIC design - in a hypervisor
>> agnostic fashion. It is a bit of a retro-fit design description, as it
>> is not strictly forward-looking only, but actually describing the
>> existing implementation [1].
>>
>> Please have a look and let me know:
>> 1) if this document has the right scope
>> 2) if this document has the right level of detail
>> 3) if there are points missing from the document
>> 4) if the design in general is a fit
>>
>> Appreciate any feedback!
>>
>> Cheers,
>> Andre.
>>
>> ---
>>
>> VGIC design
>> ===
>>
>> This document describes the design of an ARM Generic Interrupt Controller (GIC)
>> emulation. It is meant to emulate a GIC for a guest in a virtual machine;
>> the common name for that is VGIC (from "virtual GIC").
>>
>> This design was the result of a one-week-long design session with some
>> engineers in a room, triggered by ever-increasing difficulties in maintaining
>> the existing GIC emulation in the KVM hypervisor. The design eventually
>> materialised as an alternative VGIC implementation in the Linux kernel
>> (merged into Linux v4.7). As of Linux v4.8 the previous VGIC implementation
>> was removed, so it is now the current code used by Linux.
>> Although used in KVM, the actual design of this VGIC is rather hypervisor
>> agnostic and can be used by other hypervisors as well, in particular for Xen.
>>
>> GIC hardware virtualization support
>> ---
>>
>> The ARM Generic Interrupt Controller (since v2) supports the virtualization
>> extensions, which allow some parts of the interrupt life cycle to be handled
>> purely inside the guest without exiting into the hypervisor.
>> In the GICv2 and GICv3 architecture this covers mostly the "interrupt
>> acknowledgement", "priority drop" and "interrupt deactivate" actions.
>> So a guest can handle most of the interrupt processing without
>> leaving EL1 and trapping into the hypervisor. To accomplish
>> this, the GIC holds so-called "list registers" (LRs), which shadow the
>> interrupt state for any virtual interrupt. Injecting an interrupt to a guest
>> involves setting up one LR with the interrupt number, its priority and initial
>> state (mostly "pending"), then entering the guest. Any EOI related action
>> from within the guest just acts on those LRs; the hypervisor can later update
>> the virtual interrupt state when the guest exits the next time (for whatever
>> reason).
>> But despite the GIC hardware helping out here, the whole interrupt
>> configuration management is not virtualized at all and needs to be emulated
>> by the hypervisor - or another related software component, for instance a
>> userland emulator. This so-called "distributor" part of the GIC consists of
>> memory mapped registers, which can be trapped by the hypervisor, so any guest
>> access can be emulated in the usual way.
>>
>> VGIC design motivation
>> --
>>
>> A GIC emulation thus needs to take care of those bits:
>>
>> - trap GIC distributor MMIO accesses and shadow the configuration setup
>>   (enabled/disabled, level/edge, priority, affinity) for virtual interrupts
>> - handle incoming hardware and virtual interrupt requests and inject the
>>   associated virtual interrupt by manipulating one of the list registers
>> - track the state of a virtual interrupt by inspecting the LRs after the
>>   guest has exited, possibly adjusting the shadowed virtual interrupt state
>>
>> Despite the distributor MMIO register emulation being a sizeable chunk of
>> the emulation, it is actually not dominant when looking at the frequency at
>> which it is accessed.

Re: [Xen-devel] [RFC] ARM: New (Xen) VGIC design document

2017-11-01 Thread Andre Przywara
Hi Stefano,


On 01/11/17 01:58, Stefano Stabellini wrote:
> On Wed, 11 Oct 2017, Andre Przywara wrote:

many thanks for going through all of this!

>> (CC:ing some KVM/ARM folks involved in the VGIC)
>>
>> starting with the addition of the ITS support we were seeing more and
>> more issues with the current implementation of our ARM Generic Interrupt
>> Controller (GIC) emulation, the VGIC.
>> Among other approaches to fix those issues it was proposed to copy the
>> VGIC emulation used in KVM. This one was suffering from very similar
>> issues, and a clean design from scratch led to a very robust and
>> capable re-implementation. Interestingly this implementation is fairly
>> self-contained, so it seems feasible to copy it. Hopefully we only need
>> minor adjustments, possibly we can even copy it verbatim with some
>> additional glue layer code.
>>
>> Stefano asked for a design overview, to assess the feasibility
>> of copying the KVM code without reviewing tons of code in the first
>> place.
>> So to follow Xen rules for new features, this design document below is
>> an attempt to describe the current KVM VGIC design - in a hypervisor
>> agnostic fashion. It is a bit of a retro-fit design description, as it
>> is not strictly forward-looking only, but actually describing the
>> existing implementation [1].
>>
>> Please have a look and let me know:
>> 1) if this document has the right scope
>> 2) if this document has the right level of detail
>> 3) if there are points missing from the document
>> 4) if the design in general is a fit
> 
> Please read the following statements as genuine questions and concerns.
> Most ideas on this document are good. Some of them I have even suggested
> myself in the context of GIC improvements for Xen. I asked for a
> couple of clarifications.
> 
> But I don't see why we cannot implement these ideas on top of the
> existing code, rather than with a separate codebase, ending up with two
> drivers. I would prefer a natural evolution. Specifically, the following
> improvements would be simple and would give us most of the benefits on
> top of the current codebase:
> - adding the irq lock, and the refcount
> - taking both vcpu locks when necessary (on migration code for example
>   it would help a lot), the lower vcpu_id first
> - level irq emulation

I think some of those points you mentioned are not easily implemented in
the current Xen. For instance I ran into locking order issues with those
*two* inflight and lr_queue lists, when trying to implement the lock and
the refcount.
Also this "put vIRQs into LRs early, but possibly rip them out again" is
really complicating things a lot.

I believe only level IRQs could be added in a relatively
straightforward manner.

So the problem with the evolutionary approach is that it generates a lot
of patches, some of them quite invasive, others creating hard-to-read
diffs, which are both hard to review.
And chances are that the actual result would be pretty close to the KVM
code. To be clear: I hacked the Xen VGIC into the KVM direction in a few
days some months ago, but it took me *weeks* to make sane patches of
only the first part of it.
And this would not cover all those general, tedious corner cases that
the VGIC comes with. Those would need to be fixed in a painful process,
which we could avoid by "lifting" the KVM code.

> If we do end up with a second separate driver for technical or process
> reasons, I would expect the regular Xen submission/review process to be
> followed. The code style will be different, the hooks into the rest of
> the hypervisor will be different and things will be generally changed.
> The new V/GIC might be derived from KVM, but it should end up looking
> and feeling like a 100% genuine Xen component. After all, we'll
> maintain it going forward. I don't want a copy of a Linux driver with
> glue code. The Xen community cannot be expected not to review the
> submission, but if we review it, then we'll ask for changes. Once we
> change the code, there will be no point in keeping the Linux code
> separate with glue code. We should fully adapt it to Xen.

I see your point, and this actually simplifies *my* work, but I am a bit
worried about the effects of having two separate implementations which
then diverge over time.
At the moment we have two separate implementations as well, but they are
quite different, which has the advantage of doing things differently
enough to help in finding bugs in the other one (something we should
actually exploit in testing, I believe).

So how do you feel about some shared "libvgic"? I understand that
people are not too happy about that extra maintenance cost of having a
separate repository, but I am curious what your, Marc's and
Christoffer's take is on this idea.

> That is what was done in the past when KVM took code from Xen (for
> example async shadow pagetables). I am eager to avoid a situation like
the current SMMU driver in Xen, which comes from Linux, and we are not
entirely sure how to maintain it.

Re: [Xen-devel] [RFC] ARM: New (Xen) VGIC design document

2017-11-01 Thread Andre Przywara
Hi,

On 01/11/17 04:31, Christoffer Dall wrote:
> On Wed, Nov 1, 2017 at 9:58 AM, Stefano Stabellini
>  wrote:
> 
> [...]

Christoffer, many thanks for answering this!
I think we have a lot of assumptions about the whole VGIC life cycle
floating around, but it would indeed be good to get some numbers behind it.
I would be all too happy to trace some workloads on Xen again and
get some metrics, though this sounds time-consuming if done properly.

Do you have any numbers on VGIC performance available somewhere?



>>> ### List register management
>>>
>>> A list register (LR) holds the state of a virtual interrupt, which will
>>> be used by the GIC hardware to simulate an IRQ life cycle for a guest.
>>> Each GIC hardware implementation can choose to implement a number of LRs;
>>> having four of them seems to be a common value. This design here does not
>>> try to manage the LRs very cleverly; instead, on every guest exit every LR
>>> in use will be synced to the emulated state, then cleared. Upon guest entry
>>> the top priority virtual IRQs will be inserted into the LRs. If there are
>>> more pending or active IRQs than list registers, the GIC management IRQ
>>> will be configured to notify the hypervisor of a free LR (once the guest
>>> has EOIed one IRQ). This will trigger a normal exit, which will go through
>>> the normal cleanup/repopulate scheme, possibly now queuing the leftover
>>> interrupt(s).
>>> To facilitate quick guest exit and entry times, the VGIC maintains the list
>>> of pending or active interrupts (ap\_list) sorted by their priority. Active
>>> interrupts always go first on the list, since a guest and the hardware GIC
>>> expect those to stay until they have been explicitly deactivated. Failure
>>> to keep active IRQs around will result in error conditions in the GIC.
>>> The second sort criterion for the ap\_list is their priority, so higher
>>> priority pending interrupts always go first into the LRs.
>>
>> The suggestion of using this model in Xen was made in the past already.
>> I always objected for the reason that we don't actually know how many
>> LRs the hardware provides, potentially very many, and it is expensive
>> and needless to read/write them all every time on entry/exit.
>>
>> I would prefer to avoid that, but I'll be honest: I can be convinced
>> that that model of handling LRs is so much simpler that it is worth it.
>> I am more concerned about the future maintenance of a separate new
>> driver developed elsewhere.
> 
> [Having just spent a fair amount of time optimizing KVM/ARM and
> measuring GIC interaction, I'll comment on this and leave it up to
> Andre to drive the rest of the discussion].
> 
> In KVM we currently only ever touch an LR when we absolutely have to.
> For example, if there are no interrupts, we do not touch an LR.

Yes, I think this is a key point. We only touch LRs that we need to
touch: On guest entry we iterate our per-VCPU list of pending IRQs
(the ap_list, which could be empty!), fill the LRs and store the number
of LRs used in a variable. On exit we just sync back that many LRs.
I think the code in KVM explains it quite well:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/virt/kvm/arm/vgic/vgic.c#n677
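
As a rough sketch of that scheme (simplified, with made-up helper
names - not the verbatim KVM code):

/*
 * On entry: fill the LRs from the (sorted) ap_list and remember how
 * many were used. On exit: sync back only that many LRs into the
 * emulated state. nr_hw_lrs, make_lr(), write_lr(), read_lr() and
 * fold_lr_state() are placeholders.
 */
static void vgic_flush(struct vcpu *vcpu)
{
    unsigned int lr = 0;
    struct vgic_irq *irq;

    list_for_each_entry(irq, &vcpu->ap_list_head, ap_list) {
        if (lr == nr_hw_lrs)
            break;                  /* leftovers stay on the ap_list */
        write_lr(lr++, make_lr(irq));
    }
    vcpu->used_lrs = lr;            /* can be 0: then we touch no LR */
}

static void vgic_sync(struct vcpu *vcpu)
{
    unsigned int lr;

    for (lr = 0; lr < vcpu->used_lrs; lr++)
        fold_lr_state(vcpu, read_lr(lr));
    vcpu->used_lrs = 0;
}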

> When you do have an interrupt in flight, and have programmed one or
> more LRs, you have to either read back that LR, or read one of the
> status registers to figure out if the interrupt has become inactive
> (and should potentially be injected again).  I measured both on KVM
> for various workloads and it was faster to never read the status
> registers, but simply read back the LRs that were in use when entering
> the guest.
> 
> You can potentially micro-optimize slightly by remembering the exit
> value of an LR (and not clearing it on guest exit), but you have to
> pay the cost in terms of additional logic during VCPU migration and
> when you enter a VM again, maintaining a mapping of the LR and the
> virtual state, to avoid rewriting the same value to the LR again.  We
> tried that in KVM and could not measure any benefit using either a
> pinned or oversubscribed workload; I speculate that the number of
> times you exit with unprocessed interrupts in the LRs is extremely
> rare.
> 
> In terms of the number of LRs, I still haven't seen an implementation
> with anything other than 4 LRs.

Yes, that is what I know of as well. The fast model has 16, but I guess
this doesn't count - though it's good for testing some code. I can try to
find out the figure for newer hardware.

In the past I traced some workloads and found only a small number of LRs
to be actually used, with 4 or more being extremely rare.

Cheers,
Andre.



Re: [Xen-devel] [RFC] ARM: New (Xen) VGIC design document

2017-10-31 Thread Christoffer Dall
On Wed, Nov 1, 2017 at 9:58 AM, Stefano Stabellini
 wrote:

[...]

>
>> ### List register management
>>
>> A list register (LR) holds the state of a virtual interrupt, which will
>> be used by the GIC hardware to simulate an IRQ life cycle for a guest.
>> Each GIC hardware implementation can choose to implement a number of LRs;
>> having four of them seems to be a common value. This design here does not
>> try to manage the LRs very cleverly; instead, on every guest exit every LR
>> in use will be synced to the emulated state, then cleared. Upon guest entry
>> the top priority virtual IRQs will be inserted into the LRs. If there are
>> more pending or active IRQs than list registers, the GIC management IRQ
>> will be configured to notify the hypervisor of a free LR (once the guest
>> has EOIed one IRQ). This will trigger a normal exit, which will go through
>> the normal cleanup/repopulate scheme, possibly now queuing the leftover
>> interrupt(s).
>> To facilitate quick guest exit and entry times, the VGIC maintains the list
>> of pending or active interrupts (ap\_list) sorted by their priority. Active
>> interrupts always go first on the list, since a guest and the hardware GIC
>> expect those to stay until they have been explicitly deactivated. Failure
>> to keep active IRQs around will result in error conditions in the GIC.
>> The second sort criterion for the ap\_list is their priority, so higher
>> priority pending interrupts always go first into the LRs.
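
As an illustration, that ordering boils down to a comparator along
these lines (a sketch with made-up names, not the verbatim KVM code):

#include <stdbool.h>
#include <stdint.h>

struct vgic_irq {          /* illustrative subset of the per-IRQ state */
    bool active;
    uint8_t priority;      /* GIC convention: lower value = higher prio */
};

/* Active IRQs sort first; among equals, higher priority sorts first. */
static int ap_list_cmp(const struct vgic_irq *a, const struct vgic_irq *b)
{
    if (a->active != b->active)
        return a->active ? -1 : 1;
    return (int)a->priority - (int)b->priority;
}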
>
> The suggestion of using this model in Xen was made in the past already.
> I always objected for the reason that we don't actually know how many
> LRs the hardware provides, potentially very many, and it is expensive
> and needless to read/write them all every time on entry/exit.
>
> I would prefer to avoid that, but I'll be honest: I can be convinced
> that that model of handling LRs is so much simpler that it is worth it.
> I am more concerned about the future maintenance of a separate new
> driver developed elsewhere.

[Having just spent a fair amount of time optimizing KVM/ARM and
measuring GIC interaction, I'll comment on this and leave it up to
Andre to drive the rest of the discussion].

In KVM we currently only ever touch an LR when we absolutely have to.
For example, if there are no interrupts, we do not touch an LR.

When you do have an interrupt in flight, and have programmed one or
more LRs, you have to either read back that LR, or read one of the
status registers to figure out if the interrupt has become inactive
(and should potentially be injected again).  I measured both on KVM
for various workloads and it was faster to never read the status
registers, but simply read back the LRs that were in use when entering
the guest.

You can potentially micro-optimize slightly by remembering the exit
value of an LR (and not clearing it on guest exit), but you have to
pay the cost in terms of additional logic during VCPU migration and
when you enter a VM again, maintaining a mapping of the LR and the
virtual state, to avoid rewriting the same value to the LR again.  We
tried that in KVM and could not measure any benefit using either a
pinned or oversubscribed workload; I speculate that the number of
times you exit with unprocessed interrupts in the LRs is extremely
rare.

In terms of the number of LRs, I still haven't seen an implementation
with anything other than 4 LRs.

-Christoffer



Re: [Xen-devel] [RFC] ARM: New (Xen) VGIC design document

2017-10-31 Thread Stefano Stabellini
On Wed, 11 Oct 2017, Andre Przywara wrote:
> Hi,
> 
> (CC:ing some KVM/ARM folks involved in the VGIC)
> 
> starting with the addition of the ITS support we were seeing more and
> more issues with the current implementation of our ARM Generic Interrupt
> Controller (GIC) emulation, the VGIC.
> Among other approaches to fix those issues it was proposed to copy the
> VGIC emulation used in KVM. This one was suffering from very similar
> issues, and a clean design from scratch led to a very robust and
> capable re-implementation. Interestingly this implementation is fairly
> self-contained, so it seems feasible to copy it. Hopefully we only need
> minor adjustments, possibly we can even copy it verbatim with some
> additional glue layer code.
>
> Stefano asked for a design overview, to assess the feasibility
> of copying the KVM code without reviewing tons of code in the first
> place.
> So to follow Xen rules for new features, this design document below is
> an attempt to describe the current KVM VGIC design - in a hypervisor
> agnostic fashion. It is a bit of a retro-fit design description, as it
> is not strictly forward-looking only, but actually describing the
> existing implementation [1].
> 
> Please have a look and let me know:
> 1) if this document has the right scope
> 2) if this document has the right level of detail
> 3) if there are points missing from the document
> 4) if the design in general is a fit

Please read the following statements as genuine questions and concerns.
Most ideas on this document are good. Some of them I have even suggested
myself in the context of GIC improvements for Xen. I asked for a
couple of clarifications.

But I don't see why we cannot implement these ideas on top of the
existing code, rather than with a separate codebase, ending up with two
drivers. I would prefer a natural evolution. Specifically, the following
improvements would be simple and would give us most of the benefits on
top of the current codebase:
- adding the irq lock, and the refcount
- taking both vcpu locks when necessary (on migration code for example
  it would help a lot), the lower vcpu_id first (see the sketch below)
- level irq emulation
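
To illustrate the ordering I have in mind, a minimal sketch (made-up
names, not actual Xen code):

/*
 * Take the vgic locks of two VCPUs in a fixed global order (lower
 * vcpu_id first), so that two concurrent operations on the same pair
 * of VCPUs cannot ABBA-deadlock.
 */
static void vgic_lock_vcpu_pair(struct vcpu *a, struct vcpu *b)
{
    if ( a->vcpu_id < b->vcpu_id )
    {
        spin_lock(&a->arch.vgic.lock);
        spin_lock(&b->arch.vgic.lock);
    }
    else
    {
        spin_lock(&b->arch.vgic.lock);
        spin_lock(&a->arch.vgic.lock);
    }
}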


If we do end up with a second separate driver for technical or process
reasons, I would expect the regular Xen submission/review process to be
followed. The code style will be different, the hooks into the rest of
the hypervisor will be different and things will be generally changed.
The new V/GIC might be derived from KVM, but it should end up looking
and feeling like a 100% genuine Xen component. After all, we'll
maintain it going forward. I don't want a copy of a Linux driver with
glue code. The Xen community cannot be expected not to review the
submission, but if we review it, then we'll ask for changes. Once we
change the code, there will be no point in keeping the Linux code
separate with glue code. We should fully adapt it to Xen.

That is what was done in the past when KVM took code from Xen (for
example async shadow pagetables). I am eager to avoid a situation like
the current SMMU driver in Xen, which comes from Linux, and we are not
entirely sure how to maintain it.


> Appreciate any feedback!
> 
> Cheers,
> Andre.
> 
> ---
> 
> VGIC design
> ===
> 
> This document describes the design of an ARM Generic Interrupt Controller (GIC)
> emulation. It is meant to emulate a GIC for a guest in a virtual machine;
> the common name for that is VGIC (from "virtual GIC").
> 
> This design was the result of a one-week-long design session with some
> engineers in a room, triggered by ever-increasing difficulties in maintaining
> the existing GIC emulation in the KVM hypervisor. The design eventually
> materialised as an alternative VGIC implementation in the Linux kernel
> (merged into Linux v4.7). As of Linux v4.8 the previous VGIC implementation
> was removed, so it is now the current code used by Linux.
> Although used in KVM, the actual design of this VGIC is rather hypervisor
> agnostic and can be used by other hypervisors as well, in particular for Xen.
> 
> GIC hardware virtualization support
> ---
> 
> The ARM Generic Interrupt Controller (since v2) supports the virtualization
> extensions, which allow some parts of the interrupt life cycle to be handled
> purely inside the guest without exiting into the hypervisor.
> In the GICv2 and GICv3 architecture this covers mostly the "interrupt
> acknowledgement", "priority drop" and "interrupt deactivate" actions.
> So a guest can handle most of the interrupt processing without
> leaving EL1 and trapping into the hypervisor. To accomplish
> this, the GIC holds so-called "list registers" (LRs), which shadow the
> interrupt state for any virtual interrupt. Injecting an interrupt to a guest
> involves setting up one LR with the interrupt number, its priority and initial
> state (mostly "pending"), then entering the guest. Any EOI related action
> from within the guest just acts on those LRs; the hypervisor can later update
> the virtual interrupt state when the guest exits the next time (for whatever
> reason).

Re: [Xen-devel] [RFC] ARM: New (Xen) VGIC design document

2017-10-12 Thread Christoffer Dall
Hi Andre,

On Wed, Oct 11, 2017 at 03:33:03PM +0100, Andre Przywara wrote:
> Hi,
> 
> (CC:ing some KVM/ARM folks involved in the VGIC)

Very nice writeup!

I added a bunch of comments, mostly for the writing and clarity, I hope
it helps.

> 
> starting with the addition of the ITS support we were seeing more and
> more issues with the current implementation of our ARM Generic Interrupt
> Controller (GIC) emulation, the VGIC.
> Among other approaches to fix those issues it was proposed to copy the
> VGIC emulation used in KVM. This one was suffering from very similar
> issues, and a clean design from scratch led to a very robust and
> capable re-implementation. Interestingly this implementation is fairly
> self-contained, so it seems feasible to copy it. Hopefully we only need
> minor adjustments, possibly we can even copy it verbatim with some
> additional glue layer code.
> Stefano asked for a design overview, to assess the feasibility
> of copying the KVM code without reviewing tons of code in the first
> place.
> So to follow Xen rules for new features, this design document below is
> an attempt to describe the current KVM VGIC design - in a hypervisor
> agnostic fashion. It is a bit of a retro-fit design description, as it
> is not strictly forward-looking only, but actually describing the
> existing implementation [1].
> 
> Please have a look and let me know:
> 1) if this document has the right scope
> 2) if this document has the right level of detail
> 3) if there are points missing from the document
> 4) if the design in general is a fit
> 
> Appreciate any feedback!
> 
> Cheers,
> Andre.
> 
> ---
> 
> VGIC design
> ===
> 
> This document describes the design of an ARM Generic Interrupt Controller (GIC)
> emulation. It is meant to emulate a GIC for a guest in a virtual machine;
> the common name for that is VGIC (from "virtual GIC").
> 
> This design was the result of a one-week-long design session with some
> engineers in a room, triggered by ever-increasing difficulties in maintaining
> the existing GIC emulation in the KVM hypervisor. The design eventually
> materialised as an alternative VGIC implementation in the Linux kernel
> (merged into Linux v4.7). As of Linux v4.8 the previous VGIC implementation
> was removed, so it is now the current code used by Linux.
> Although used in KVM, the actual design of this VGIC is rather hypervisor
> agnostic and can be used by other hypervisors as well, in particular for Xen.
> 
> GIC hardware virtualization support
> ---
> 
> The ARM Generic Interrupt Controller (since v2) supports the virtualization
> extensions, which allow some parts of the interrupt life cycle to be handled
> purely inside the guest without exiting into the hypervisor.
> In the GICv2 and GICv3 architecture this covers mostly the "interrupt
> acknowledgement", "priority drop" and "interrupt deactivate" actions.
> So a guest can handle most of the interrupt processing without
> leaving EL1 and trapping into the hypervisor. To accomplish
> this, the GIC holds so-called "list registers" (LRs), which shadow the
> interrupt state for any virtual interrupt. Injecting an interrupt to a guest
> involves setting up one LR with the interrupt number, its priority and initial
> state (mostly "pending"), then entering the guest. Any EOI related action
> from within the guest just acts on those LRs; the hypervisor can later update
> the virtual interrupt state when the guest exits the next time (for whatever
> reason).
> But despite the GIC hardware helping out here, the whole interrupt
> configuration management is not virtualized at all and needs to be emulated
> by the hypervisor - or another related software component, for instance a
> userland emulator. This so-called "distributor" part of the GIC consists of
> memory mapped registers, which can be trapped by the hypervisor, so any guest
> access can be emulated in the usual way.
> 
> VGIC design motivation
> --
> 
> A GIC emulation thus needs to take care of those bits:
> 
> - trap GIC distributor MMIO accesses and shadow the configuration setup
>   (enabled/disabled, level/edge, priority, affinity) for virtual interrupts
> - handle incoming hardware and virtual interrupt requests and inject the
>   associated virtual interrupt by manipulating one of the list registers
> - track the state of a virtual interrupt by inspecting the LRs after the
>   guest has exited, possibly adjusting the shadowed virtual interrupt state
> 
> Despite the distributor MMIO register emulation being a sizeable chunk of
> the emulation, it is actually not dominant when looking at the frequency at
> which it is accessed. Normally the interrupt configuration is done at boot
> time or upon initialising the device (driver), but rarely during the actual
> run time of a system. Injecting and EOI-ing interrupts however happens much
> more often. A good emulation approach should thus focus on tracking the
> virtual interrupt state efficiently, allowing quick handling of incoming
> and EOI-ed interrupts.

Re: [Xen-devel] [RFC] ARM: New (Xen) VGIC design document

2017-10-11 Thread Andre Przywara
Hi,

On 11/10/17 15:33, Andre Przywara wrote:
> Hi,
> 
> (CC:ing some KVM/ARM folks involved in the VGIC)
> 
> starting with the addition of the ITS support we were seeing more and
> more issues with the current implementation of our ARM Generic Interrupt
> Controller (GIC) emulation, the VGIC.
> Among other approaches to fix those issues it was proposed to copy the
> VGIC emulation used in KVM. This one was suffering from very similar
> issues, and a clean design from scratch led to a very robust and
> capable re-implementation. Interestingly this implementation is fairly
> self-contained, so it seems feasible to copy it. Hopefully we only need
> minor adjustments, possibly we can even copy it verbatim with some
> additional glue layer code.
> Stefano asked for a design overview, to assess the feasibility
> of copying the KVM code without reviewing tons of code in the first
> place.
> So to follow Xen rules for new features, this design document below is
> an attempt to describe the current KVM VGIC design - in a hypervisor
> agnostic fashion. It is a bit of a retro-fit design description, as it
> is not strictly forward-looking only, but actually describing the
> existing implementation [1].

and that link should point to:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/virt/kvm/arm/vgic

Cheers,
Andre.



[Xen-devel] [RFC] ARM: New (Xen) VGIC design document

2017-10-11 Thread Andre Przywara
Hi,

(CC:ing some KVM/ARM folks involved in the VGIC)

starting with the addition of the ITS support we were seeing more and
more issues with the current implementation of our ARM Generic Interrupt
Controller (GIC) emulation, the VGIC.
Among other approaches to fix those issues it was proposed to copy the
VGIC emulation used in KVM. This one was suffering from very similar
issues, and a clean design from scratch led to a very robust and
capable re-implementation. Interestingly this implementation is fairly
self-contained, so it seems feasible to copy it. Hopefully we only need
minor adjustments, possibly we can even copy it verbatim with some
additional glue layer code.
Stefano asked for a design overview, to assess the feasibility
of copying the KVM code without reviewing tons of code in the first
place.
So to follow Xen rules for new features, this design document below is
an attempt to describe the current KVM VGIC design - in a hypervisor
agnostic fashion. It is a bit of a retro-fit design description, as it
is not strictly forward-looking only, but actually describing the
existing implementation [1].

Please have a look and let me know:
1) if this document has the right scope
2) if this document has the right level of detail
3) if there are points missing from the document
4) if the design in general is a fit

Appreciate any feedback!

Cheers,
Andre.

---

VGIC design
===

This document describes the design of an ARM Generic Interrupt Controller (GIC)
emulation. It is meant to emulate a GIC for a guest in a virtual machine;
the common name for that is VGIC (from "virtual GIC").

This design was the result of a one-week-long design session with some
engineers in a room, triggered by ever-increasing difficulties in maintaining
the existing GIC emulation in the KVM hypervisor. The design eventually
materialised as an alternative VGIC implementation in the Linux kernel
(merged into Linux v4.7). As of Linux v4.8 the previous VGIC implementation
was removed, so it is now the current code used by Linux.
Although used in KVM, the actual design of this VGIC is rather hypervisor
agnostic and can be used by other hypervisors as well, in particular for Xen.

GIC hardware virtualization support
---

The ARM Generic Interrupt Controller (since v2) supports the virtualization
extensions, which allow some parts of the interrupt life cycle to be handled
purely inside the guest without exiting into the hypervisor.
In the GICv2 and GICv3 architecture this covers mostly the "interrupt
acknowledgement", "priority drop" and "interrupt deactivate" actions.
So a guest can handle most of the interrupt processing without
leaving EL1 and trapping into the hypervisor. To accomplish
this, the GIC holds so-called "list registers" (LRs), which shadow the
interrupt state for any virtual interrupt. Injecting an interrupt to a guest
involves setting up one LR with the interrupt number, its priority and initial
state (mostly "pending"), then entering the guest. Any EOI related action
from within the guest just acts on those LRs; the hypervisor can later update
the virtual interrupt state when the guest exits the next time (for whatever
reason).
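
As an illustration, with a GICv2 this could look roughly like the
following sketch (the GICH_LR bit positions follow the GICv2 spec,
the helper names are made up):

#include <stdint.h>

#define GICH_LR_VIRT_ID(intid)  ((uint32_t)(intid) & 0x3ff)       /* [9:0]   */
#define GICH_LR_PRIO(prio)      ((((uint32_t)(prio)) >> 3) << 23) /* [27:23] */
#define GICH_LR_STATE_PENDING   (1u << 28)                        /* [29:28] */

/* Inject a virtual IRQ as "pending" into list register number <lr>. */
static void inject_virq(unsigned int lr, uint16_t intid, uint8_t prio)
{
    uint32_t val = GICH_LR_VIRT_ID(intid) |
                   GICH_LR_PRIO(prio) |      /* only the top 5 bits fit */
                   GICH_LR_STATE_PENDING;

    gich_write_lr(lr, val);                  /* placeholder for the write */
}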
But despite the GIC hardware helping out here, the whole interrupt
configuration management is not virtualized at all and needs to be emulated
by the hypervisor - or another related software component, for instance a
userland emulator. This so-called "distributor" part of the GIC consists of
memory mapped registers, which can be trapped by the hypervisor, so any guest
access can be emulated in the usual way.
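
A typical way to structure this is a table of register ranges with
read/write handlers, roughly like the following sketch (offsets per
the GICv2 GICD_* register map, all names made up):

struct vgic_register_region {
    unsigned int offset;   /* offset into the distributor frame */
    unsigned int len;      /* length of the register range in bytes */
    unsigned long (*read)(struct vcpu *v, unsigned int off);
    void (*write)(struct vcpu *v, unsigned int off, unsigned long val);
};

static const struct vgic_register_region dist_regions[] = {
    { 0x0000, 0x004, read_ctlr,   write_ctlr   },  /* GICD_CTLR */
    { 0x0100, 0x080, read_enable, write_enable },  /* GICD_ISENABLER<n> */
    { 0x0400, 0x400, read_prio,   write_prio   },  /* GICD_IPRIORITYR<n> */
    /* ... further ranges: config, targets, pending bits, ... */
};

The trap handler looks up the range containing the faulting offset and
dispatches to the matching read or write callback.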

VGIC design motivation
--

A GIC emulation thus needs to take care of those bits:

- trap GIC distributor MMIO accesses and shadow the configuration setup
  (enabled/disabled, level/edge, priority, affinity) for virtual interrupts
- handle incoming hardware and virtual interrupt requests and inject the
  associated virtual interrupt by manipulating one of the list registers
- track the state of a virtual interrupt by inspecting the LRs after the
  guest has exited, possibly adjusting the shadowed virtual interrupt state

Despite the distributor MMIO register emulation being a sizeable chunk of
the emulation, it is actually not dominant when looking at the frequency at
which it is accessed. Normally the interrupt configuration is done at boot
time or upon initialising the device (driver), but rarely during the actual
run time of a system. Injecting and EOI-ing interrupts however happens much
more often. A good emulation approach should thus focus on tracking the virtual
interrupt state efficiently, allowing quick handling of incoming and EOI-ed
interrupts.
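
The state to be tracked per virtual interrupt could be sketched like
this (loosely modelled on KVM's struct vgic_irq; the exact fields and
names are illustrative, kernel-style types assumed):

struct vgic_irq {
    spinlock_t irq_lock;       /* protects everything in this struct */
    unsigned int refcount;     /* needed for dynamically allocated LPIs */
    struct list_head ap_list;  /* links into a VCPU's pending/active list */
    struct vcpu *target_vcpu;  /* shadowed affinity setting */
    uint32_t intid;            /* virtual interrupt number */
    bool enabled;              /* shadowed enable bit */
    bool pending;
    bool active;
    bool level_triggered;      /* shadowed level/edge configuration */
    bool hw;                   /* backed by a hardware interrupt? */
    uint32_t hwintid;          /* the hardware IRQ number, if so */
    uint8_t priority;          /* shadowed priority setting */
};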

The actual interrupt state tracking can be quite tricky in parts. Interrupt
injections can be independent of the guest entry/exit points; also MMIO
configuration accesses could be triggered by any VCPU at any point in