On Wed, Oct 03, 2018 at 03:39:13PM +1000, David Gibson wrote:
> On Tue, Oct 02, 2018 at 09:31:21PM +1000, Paul Mackerras wrote:
> > From: Suraj Jitindar Singh <sjitindarsi...@gmail.com>
> > 
> > Consider a normal (L1) guest running under the main hypervisor (L0),
> > and then a nested guest (L2) running under the L1 guest which is acting
> > as a nested hypervisor. L0 has page tables to map the address space for
> > L1 providing the translation from L1 real address -> L0 real address;
> > 
> >     L1
> >     |
> >     | (L1 -> L0)
> >     |
> >     ----> L0
> > 
> > There are also page tables in L1 used to map the address space for L2
> > providing the translation from L2 real address -> L1 real address. Since
> > the hardware can only walk a single level of page table, we need to
> > maintain in L0 a "shadow_pgtable" for L2 which provides the translation
> > from L2 real address -> L0 real address. Which looks like;
> > 
> >     L2                              L2
> >     |                               |
> >     | (L2 -> L1)                    |
> >     |                               |
> >     ----> L1                        | (L2 -> L0)
> >           |                         |
> >           | (L1 -> L0)              |
> >           |                         |
> >           ----> L0                  --------> L0
> > 
> > When a page fault occurs while running a nested (L2) guest we need to
> > insert a pte into this "shadow_pgtable" for the L2 -> L0 mapping. To
> > do this we need to:
> > 
> > 1. Walk the pgtable in L1 memory to find the L2 -> L1 mapping, and
> >    provide a page fault to L1 if this mapping doesn't exist.
> > 2. Use our L1 -> L0 pgtable to convert this L1 address to an L0 address,
> >    or try to insert a pte for that mapping if it doesn't exist.
> > 3. Now we have a L2 -> L0 mapping, insert this into our shadow_pgtable
> > 
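
For readers following along, the three steps above can be sketched as a
toy model. This is only an illustration under loud assumptions: the flat
arrays stand in for real radix page tables, and every name here
(toy_nested, toy_shadow_fault, TOY_INVALID, the bump allocator for L0
pages) is hypothetical and not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_NPAGES  16          /* pages per toy address space (assumption) */
#define TOY_INVALID UINT32_MAX  /* marks "no pte present" */

enum toy_fault_result {
	TOY_MAPPED,         /* shadow pte was inserted */
	TOY_REFLECT_TO_L1,  /* no L2 -> L1 mapping: L1 must handle the fault */
};

/* Hypothetical flat "page tables", indexed by page number. */
struct toy_nested {
	uint32_t l2_to_l1[TOY_NPAGES];  /* lives in L1 memory, read in step 1 */
	uint32_t l1_to_l0[TOY_NPAGES];  /* maintained by L0 */
	uint32_t shadow[TOY_NPAGES];    /* L2 -> L0, maintained by L0 */
	uint32_t next_free_l0_page;     /* toy allocator for step 2 */
};

static void toy_init(struct toy_nested *n)
{
	for (int i = 0; i < TOY_NPAGES; i++) {
		n->l2_to_l1[i] = TOY_INVALID;
		n->l1_to_l0[i] = TOY_INVALID;
		n->shadow[i] = TOY_INVALID;
	}
	n->next_free_l0_page = 100;
}

static enum toy_fault_result toy_shadow_fault(struct toy_nested *n,
					      uint32_t l2_page)
{
	uint32_t l1_page;

	assert(l2_page < TOY_NPAGES);

	/* Step 1: walk L1's table; reflect the fault if nothing is there. */
	l1_page = n->l2_to_l1[l2_page];
	if (l1_page == TOY_INVALID)
		return TOY_REFLECT_TO_L1;
	assert(l1_page < TOY_NPAGES);

	/* Step 2: translate L1 -> L0, inserting a pte if it is missing. */
	if (n->l1_to_l0[l1_page] == TOY_INVALID)
		n->l1_to_l0[l1_page] = n->next_free_l0_page++;

	/* Step 3: record the combined L2 -> L0 mapping in the shadow table. */
	n->shadow[l2_page] = n->l1_to_l0[l1_page];
	return TOY_MAPPED;
}
```

In this sketch a fault on an L2 page that L1 has not mapped returns
TOY_REFLECT_TO_L1 and leaves the shadow table untouched; once L1 has
mapped it, the same fault fills in both the L1 -> L0 entry and the
shadow entry.
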
> > Once this mapping exists we can take rc faults when hardware is unable
> > to automatically set the reference and change bits in the pte. On these
> > we need to:
> > 
> > 1. Check the rc bits on the L2 -> L1 pte match, and otherwise reflect
> >    the fault down to L1.
> > 2. Set the rc bits in the L1 -> L0 pte which corresponds to the same
> >    host page.
> > 3. Set the rc bits in the L2 -> L0 pte.
> > 
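
The rc fault steps above can be sketched the same way. Again this is a
toy under stated assumptions: the bit values, the toy_ptes struct and
toy_rc_fault() are hypothetical, and "match" is read here as "the
L2 -> L1 pte already has every bit the access needs", which is one
interpretation of step 1 rather than something the description spells
out.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_PTE_R 0x1u  /* reference bit (toy value, not the real layout) */
#define TOY_PTE_C 0x2u  /* change bit (toy value, not the real layout) */

/* The three ptes involved in one L2 access, reduced to their rc bits. */
struct toy_ptes {
	uint32_t l2_to_l1;  /* pte in L1's table */
	uint32_t l1_to_l0;  /* pte in L0's table for L1 */
	uint32_t shadow;    /* pte in the L2 -> L0 shadow table */
};

/*
 * Returns true if L0 handled the rc fault, false if it has to be
 * reflected to L1 so that L1 can update its own pte first.
 */
static bool toy_rc_fault(struct toy_ptes *p, bool writing)
{
	uint32_t want = TOY_PTE_R | (writing ? TOY_PTE_C : 0);

	/* Step 1: L1's pte must already carry the bits this access needs. */
	if ((p->l2_to_l1 & want) != want)
		return false;

	/* Step 2: set the bits in the L1 -> L0 pte for the same host page. */
	p->l1_to_l0 |= want;

	/* Step 3: set the bits in the L2 -> L0 shadow pte. */
	p->shadow |= want;
	return true;
}
```

With only the reference bit set in the L2 -> L1 pte, a read is handled
in L0 while a write is reflected, since L1 has not yet marked the page
changed.
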
> > As we reuse a large number of functions in book3s_64_mmu_radix.c for
> > this we also needed to refactor a number of these functions to take
> > an lpid parameter so that the correct lpid is used for tlb invalidations.
> > The functionality however has remained the same.
> > 
> > Signed-off-by: Suraj Jitindar Singh <sjitindarsi...@gmail.com>
> > Signed-off-by: Paul Mackerras <pau...@ozlabs.org>
> 
> Some comments below, but no showstoppers, so,
> 
> Reviewed-by: David Gibson <da...@gibson.dropbear.id.au>

One more, again not a showstopper:

> > @@ -393,10 +396,20 @@ struct kvm_nested_guest *kvmhv_alloc_nested(struct kvm *kvm, unsigned int lpid)
> >   */
> >  static void kvmhv_release_nested(struct kvm_nested_guest *gp)
> >  {
> > +   struct kvm *kvm = gp->l1_host;
> > +
> > +   if (gp->shadow_pgtable) {
> > +           /*
> > +            * No vcpu is using this struct and no call to
> > +            * kvmhv_remove_nest_rmap can find this struct,

It's kind of dubious that you're referring to kvmhv_remove_nest_rmap()
a patch before it is introduced.

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson
