Re: [PATCH v7 1/7] kvmppc: Driver to manage pages of secure guest

2019-09-06 Thread Christoph Hellwig
On Fri, Sep 06, 2019 at 05:06:39PM +0530, Bharata B Rao wrote:
> > Also is bit 56+ a set of values, so is there 1 << 56 and 3 << 56 as well?
> > Seems like even that other patch doesn't fully define these "pfn" values.
> 
> I realized that the bit numbers have changed, it is no longer bits 60:56,
> but instead the top 8 bits.
> 
> #define KVMPPC_RMAP_UVMEM_PFN   0x0200000000000000
> static inline bool kvmppc_rmap_is_uvmem_pfn(unsigned long *rmap)
> {
> return ((*rmap & 0xff00000000000000) == KVMPPC_RMAP_UVMEM_PFN);
> }

In that overall scheme I'd actually much prefer something like (names
just made up, they should vaguely match the spec this is written to):

static inline unsigned long kvmppc_rmap_type(unsigned long *rmap)
{
return (*rmap & 0xff00000000000000);
}

And then where you check it you can use:

if (kvmppc_rmap_type(*rmap) == KVMPPC_RMAP_UVMEM_PFN)

and where you set it you do:

*rmap |= KVMPPC_RMAP_UVMEM_PFN;

as in the current patch to keep things symmetric.


Re: [PATCH v7 1/7] kvmppc: Driver to manage pages of secure guest

2019-09-06 Thread Bharata B Rao
On Mon, Sep 02, 2019 at 09:53:56AM +0200, Christoph Hellwig wrote:
> On Fri, Aug 30, 2019 at 09:12:59AM +0530, Bharata B Rao wrote:
> > On Thu, Aug 29, 2019 at 10:38:10AM +0200, Christoph Hellwig wrote:
> > > On Thu, Aug 22, 2019 at 03:56:14PM +0530, Bharata B Rao wrote:
> > > > +/*
> > > > + * Bits 60:56 in the rmap entry will be used to identify the
> > > > + * different uses/functions of rmap.
> > > > + */
> > > > +#define KVMPPC_RMAP_DEVM_PFN   (0x2ULL << 56)
> > > 
> > > How did you come up with this specific value?
> > 
> > Different usage types of RMAP array are being defined.
> > https://patchwork.ozlabs.org/patch/1149791/
> > 
> > The above value is reserved for device pfn usage.
> 
> Shouldn't all these definitions go in together in a patch?

Ideally yes, but the above patch is already in Paul's tree, I will sync
up with him about this.

> Also is bit 56+ a set of values, so is there 1 << 56 and 3 << 56 as well?
> Seems like even that other patch doesn't fully define these "pfn" values.

I realized that the bit numbers have changed, it is no longer bits 60:56,
but instead the top 8 bits.

#define KVMPPC_RMAP_UVMEM_PFN   0x0200000000000000
static inline bool kvmppc_rmap_is_uvmem_pfn(unsigned long *rmap)
{
return ((*rmap & 0xff00000000000000) == KVMPPC_RMAP_UVMEM_PFN);
}

> 
> > > No need for !! when returning a bool.  Also the helper seems a little
> > > pointless, just opencoding it would make the code more readable in my
> > > opinion.
> > 
> > I expect similar routines for other usages of RMAP to come up.
> 
> Please drop them all.  Having to wade through a header to check for
> a specific bit that also is set manually elsewhere in related code
> just obfuscates it for the reader.

I am currently using the routine kvmppc_rmap_is_uvmem_pfn() (shown
above) instead of open-coding it at multiple places, but I can drop it if
you prefer.

Regards,
Bharata.



Re: [PATCH v7 1/7] kvmppc: Driver to manage pages of secure guest

2019-09-02 Thread Christoph Hellwig
On Fri, Aug 30, 2019 at 09:12:59AM +0530, Bharata B Rao wrote:
> On Thu, Aug 29, 2019 at 10:38:10AM +0200, Christoph Hellwig wrote:
> > On Thu, Aug 22, 2019 at 03:56:14PM +0530, Bharata B Rao wrote:
> > > +/*
> > > + * Bits 60:56 in the rmap entry will be used to identify the
> > > + * different uses/functions of rmap.
> > > + */
> > > +#define KVMPPC_RMAP_DEVM_PFN (0x2ULL << 56)
> > 
> > How did you come up with this specific value?
> 
> Different usage types of RMAP array are being defined.
> https://patchwork.ozlabs.org/patch/1149791/
> 
> The above value is reserved for device pfn usage.

Shouldn't all these definitions go in together in a patch?  Also is bit
56+ a set of values, so is there 1 << 56 and 3 << 56 as well?  Seems
like even that other patch doesn't fully define these "pfn" values.

> > No need for !! when returning a bool.  Also the helper seems a little
> > pointless, just opencoding it would make the code more readable in my
> > opinion.
> 
> I expect similar routines for other usages of RMAP to come up.

Please drop them all.  Having to wade through a header to check for
a specific bit that also is set manually elsewhere in related code
just obfuscates it for the reader.

> > > + *mig->dst = migrate_pfn(page_to_pfn(dpage)) | MIGRATE_PFN_LOCKED;
> > > + return 0;
> > > +}
> > 
> > I think you can just merge this trivial helper into the only caller.
> 
> Yes I can, but felt it is nicely abstracted out to a function right now.

Not really.  It just fits the old calling conventions before I removed
the indirection.

> > Here we actually have two callers, but they have a fair amount of
> > duplicate code in them.  I think you want to move that common
> > code (including setting up the migrate_vma structure) into this
> > function and maybe also give it a more descriptive name.
> 
> Sure, I will give this a try. The name is already very descriptive, will
> come up with an appropriate name.

I don't think alloc_and_copy is very helpful.  It matches some of the
implementation, but not the intent.  Why not kvmppc_svm_page_in/out
similar to the hypervisor calls calling them?  Yes, for one case it
also gets called from the pagefault handler, but it still performs
these basic page in/out actions.

> BTW this file and the function prefixes in this file started out with
> kvmppc_hmm, switched to kvmppc_devm when HMM routines weren't used anymore.
> Now with the use of only non-dev versions, planning to switch to
> kvmppc_uvmem_

That prefix sounds fine to me as well.


Re: [PATCH v7 1/7] kvmppc: Driver to manage pages of secure guest

2019-08-30 Thread Bharata B Rao
On Thu, Aug 29, 2019 at 12:39:11PM -0700, Sukadev Bhattiprolu wrote:
> Bharata B Rao [bhar...@linux.ibm.com] wrote:
> > On Wed, Aug 28, 2019 at 08:02:19PM -0700, Sukadev Bhattiprolu wrote:
> Where do we serialize two threads attempting to H_SVM_PAGE_IN the same gfn
> at the same time? Or one thread issuing a H_SVM_PAGE_IN and another a
> H_SVM_PAGE_OUT for the same page?

I am not serializing page-in/out calls on the same gfn; I thought you take
care of that in UV, but I guess UV doesn't yet.

I can probably use rmap_lock() and serialize such calls in HV if UV can't
prevent such calls easily.

> > > > +
> > > > +   if (!trylock_page(dpage))
> > > > +   goto out_clear;
> > > > +
> > > > +   *rmap = devm_pfn | KVMPPC_RMAP_DEVM_PFN;
> > > > +   pvt = kzalloc(sizeof(*pvt), GFP_ATOMIC);
> > > > +   if (!pvt)
> > > > +   goto out_unlock;
> 
> If we fail to alloc, we don't clear the KVMPPC_RMAP_DEVM_PFN?

Right, I will move the assignment to *rmap to after kzalloc.

> 
> Also, when/where do we clear this flag on an uv-page-out?
> kvmppc_devm_drop_pages() drops the flag on a local variable but not
> in the rmap? If we don't clear the flag on page-out, would the
> subsequent H_SVM_PAGE_IN of this page fail?

It gets cleared in kvmppc_devm_page_free().

> 
> Ok. Nit: thought we could drop the "_fault" in the function name but it
> would collide with the other "alloc_and_copy" function used during
> H_SVM_PAGE_IN. If the two alloc_and_copy functions are symmetric, maybe
> they could have "page_in" and "page_out" in the (already long) names.

Christoph also suggested to reorganize these two calls. Will take care.

Regards,
Bharata.



Re: [PATCH v7 1/7] kvmppc: Driver to manage pages of secure guest

2019-08-29 Thread Bharata B Rao
On Thu, Aug 29, 2019 at 10:38:10AM +0200, Christoph Hellwig wrote:
> On Thu, Aug 22, 2019 at 03:56:14PM +0530, Bharata B Rao wrote:
> > +/*
> > + * Bits 60:56 in the rmap entry will be used to identify the
> > + * different uses/functions of rmap.
> > + */
> > +#define KVMPPC_RMAP_DEVM_PFN   (0x2ULL << 56)
> 
> How did you come up with this specific value?

Different usage types of RMAP array are being defined.
https://patchwork.ozlabs.org/patch/1149791/

The above value is reserved for device pfn usage.

> 
> > +
> > +static inline bool kvmppc_rmap_is_devm_pfn(unsigned long pfn)
> > +{
> > +   return !!(pfn & KVMPPC_RMAP_DEVM_PFN);
> > +}
> 
> No need for !! when returning a bool.  Also the helper seems a little
> pointless, just opencoding it would make the code more readable in my
> opinion.

I expect similar routines for other usages of RMAP to come up.

> 
> > +#ifdef CONFIG_PPC_UV
> > +extern int kvmppc_devm_init(void);
> > +extern void kvmppc_devm_free(void);
> 
> There is no need for extern in a function declaration.
> 
> > +static int
> > +kvmppc_devm_migrate_alloc_and_copy(struct migrate_vma *mig,
> > +  unsigned long *rmap, unsigned long gpa,
> > +  unsigned int lpid, unsigned long page_shift)
> > +{
> > +   struct page *spage = migrate_pfn_to_page(*mig->src);
> > +   unsigned long pfn = *mig->src >> MIGRATE_PFN_SHIFT;
> > +   struct page *dpage;
> > +
> > +   *mig->dst = 0;
> > +   if (!spage || !(*mig->src & MIGRATE_PFN_MIGRATE))
> > +   return 0;
> > +
> > +   dpage = kvmppc_devm_get_page(rmap, gpa, lpid);
> > +   if (!dpage)
> > +   return -EINVAL;
> > +
> > +   if (spage)
> > +   uv_page_in(lpid, pfn << page_shift, gpa, 0, page_shift);
> > +
> > +   *mig->dst = migrate_pfn(page_to_pfn(dpage)) | MIGRATE_PFN_LOCKED;
> > +   return 0;
> > +}
> 
> I think you can just merge this trivial helper into the only caller.

Yes I can, but felt it is nicely abstracted out to a function right now.

> 
> > +static int
> > +kvmppc_devm_fault_migrate_alloc_and_copy(struct migrate_vma *mig,
> > +unsigned long page_shift)
> > +{
> > +   struct page *dpage, *spage;
> > +   struct kvmppc_devm_page_pvt *pvt;
> > +   unsigned long pfn;
> > +   int ret;
> > +
> > +   spage = migrate_pfn_to_page(*mig->src);
> > +   if (!spage || !(*mig->src & MIGRATE_PFN_MIGRATE))
> > +   return 0;
> > +   if (!is_zone_device_page(spage))
> > +   return 0;
> > +
> > +   dpage = alloc_page_vma(GFP_HIGHUSER, mig->vma, mig->start);
> > +   if (!dpage)
> > +   return -EINVAL;
> > +   lock_page(dpage);
> > +   pvt = spage->zone_device_data;
> > +
> > +   pfn = page_to_pfn(dpage);
> > +   ret = uv_page_out(pvt->lpid, pfn << page_shift, pvt->gpa, 0,
> > + page_shift);
> > +   if (ret == U_SUCCESS)
> > +   *mig->dst = migrate_pfn(pfn) | MIGRATE_PFN_LOCKED;
> > +   else {
> > +   unlock_page(dpage);
> > +   __free_page(dpage);
> > +   }
> > +   return ret;
> > +}
> 
> Here we actually have two callers, but they have a fair amount of
> duplicate code in them.  I think you want to move that common
> code (including setting up the migrate_vma structure) into this
> function and maybe also give it a more descriptive name.

Sure, I will give this a try. The name is already very descriptive, will
come up with an appropriate name.

BTW this file and the function prefixes in this file started out with
kvmppc_hmm, switched to kvmppc_devm when HMM routines weren't used anymore.
Now with the use of only non-dev versions, planning to switch to
kvmppc_uvmem_

> 
> > +static void kvmppc_devm_page_free(struct page *page)
> > +{
> > +   unsigned long pfn = page_to_pfn(page);
> > +   unsigned long flags;
> > +   struct kvmppc_devm_page_pvt *pvt;
> > +
> > > +   spin_lock_irqsave(&kvmppc_devm_pfn_lock, flags);
> > +   pvt = page->zone_device_data;
> > +   page->zone_device_data = NULL;
> > +
> > +   bitmap_clear(kvmppc_devm_pfn_bitmap,
> > +pfn - (kvmppc_devm_pgmap.res.start >> PAGE_SHIFT), 1);
> 
> Nit: I'd just initialize pfn to the value you want from the start.
> That makes the code a little easier to read, and keeps a tiny bit more
> code outside the spinlock.
> 
>   unsigned long pfn = page_to_pfn(page) -
>   (kvmppc_devm_pgmap.res.start >> PAGE_SHIFT);
> 
>   ..
> 
>bitmap_clear(kvmppc_devm_pfn_bitmap, pfn, 1);

Sure.

> 
> 
> > +   kvmppc_devm_pgmap.type = MEMORY_DEVICE_PRIVATE;
> > +   kvmppc_devm_pgmap.res = *res;
> > > +   kvmppc_devm_pgmap.ops = &kvmppc_devm_ops;
> > > +   addr = memremap_pages(&kvmppc_devm_pgmap, -1);
> 
> This -1 should be NUMA_NO_NODE for clarity.

Right.

Regards,
Bharata.



Re: [PATCH v7 1/7] kvmppc: Driver to manage pages of secure guest

2019-08-29 Thread Sukadev Bhattiprolu
Bharata B Rao [bhar...@linux.ibm.com] wrote:
> On Wed, Aug 28, 2019 at 08:02:19PM -0700, Sukadev Bhattiprolu wrote:
> > Some minor comments/questions below. Overall, the patches look
> > fine to me.
> > 
> > > +#include 
> > > +#include 
> > > +#include 
> > > +#include 
> > > +
> > > +static struct dev_pagemap kvmppc_devm_pgmap;
> > > +static unsigned long *kvmppc_devm_pfn_bitmap;
> > > +static DEFINE_SPINLOCK(kvmppc_devm_pfn_lock);
> > 
> > Is this lock protecting just the pfn_bitmap?
> 
> Yes.
> 
> > 
> > > +
> > > +struct kvmppc_devm_page_pvt {
> > > + unsigned long *rmap;
> > > + unsigned int lpid;
> > > + unsigned long gpa;
> > > +};
> > > +
> > > +/*
> > > + * Get a free device PFN from the pool
> > > + *
> > > + * Called when a normal page is moved to secure memory (UV_PAGE_IN). 
> > > Device
> > > + * PFN will be used to keep track of the secure page on HV side.
> > > + *
> > > + * @rmap here is the slot in the rmap array that corresponds to @gpa.
> > > + * Thus a non-zero rmap entry indicates that the corresponding guest
> > > + * page has become secure, and is not mapped on the HV side.
> > > + *
> > > + * NOTE: In this and subsequent functions, we pass around and access
> > > + * individual elements of kvm_memory_slot->arch.rmap[] without any
> > > + * protection. Should we use lock_rmap() here?

Where do we serialize two threads attempting to H_SVM_PAGE_IN the same gfn
at the same time? Or one thread issuing a H_SVM_PAGE_IN and another a
H_SVM_PAGE_OUT for the same page?

> > > + */
> > > +static struct page *kvmppc_devm_get_page(unsigned long *rmap, unsigned 
> > > long gpa,
> > > +  unsigned int lpid)
> > > +{
> > > + struct page *dpage = NULL;
> > > + unsigned long bit, devm_pfn;
> > > + unsigned long flags;
> > > + struct kvmppc_devm_page_pvt *pvt;
> > > + unsigned long pfn_last, pfn_first;
> > > +
> > > + if (kvmppc_rmap_is_devm_pfn(*rmap))
> > > + return NULL;
> > > +
> > > + pfn_first = kvmppc_devm_pgmap.res.start >> PAGE_SHIFT;
> > > + pfn_last = pfn_first +
> > > +(resource_size(&kvmppc_devm_pgmap.res) >> PAGE_SHIFT);
> > > + spin_lock_irqsave(&kvmppc_devm_pfn_lock, flags);
> > 
> > Blank lines around spin_lock() would help.
> 
> You mean blank line before lock and after unlock to clearly see
> where the lock starts and ends?
> 
> > 
> > > + bit = find_first_zero_bit(kvmppc_devm_pfn_bitmap, pfn_last - pfn_first);
> > > + if (bit >= (pfn_last - pfn_first))
> > > + goto out;
> > > +
> > > + bitmap_set(kvmppc_devm_pfn_bitmap, bit, 1);
> > > + devm_pfn = bit + pfn_first;
> > 
> > Can we drop the kvmppc_devm_pfn_lock here or after the trylock_page()?
> > Or does it also protect the ->zone_device_data' assignment below as well?
> > If so, maybe drop the 'pfn_' from the name of the lock?
> > 
> > Besides, we don't seem to hold this lock when accessing ->zone_device_data
> > in kvmppc_share_page(). Maybe kvmppc_devm_pfn_lock just protects the 
> > bitmap?
> 
> Will move the unlock appropriately.
> 
> > 
> > 
> > > + dpage = pfn_to_page(devm_pfn);
> > 
> > Does this code and hence CONFIG_PPC_UV depend on a specific model like
> > CONFIG_SPARSEMEM_VMEMMAP?
> 
> I don't think so. Irrespective of that pfn_to_page() should just work
> for us.
> 
> > > +
> > > + if (!trylock_page(dpage))
> > > + goto out_clear;
> > > +
> > > + *rmap = devm_pfn | KVMPPC_RMAP_DEVM_PFN;
> > > + pvt = kzalloc(sizeof(*pvt), GFP_ATOMIC);
> > > + if (!pvt)
> > > + goto out_unlock;

If we fail to alloc, we don't clear the KVMPPC_RMAP_DEVM_PFN?

Also, when/where do we clear this flag on an uv-page-out?
kvmppc_devm_drop_pages() drops the flag on a local variable but not
in the rmap? If we don't clear the flag on page-out, would the
subsequent H_SVM_PAGE_IN of this page fail?

> > > + pvt->rmap = rmap;
> > > + pvt->gpa = gpa;
> > > + pvt->lpid = lpid;
> > > + dpage->zone_device_data = pvt;
> > 
> > ->zone_device_data is set after locking the dpage here, but in
> > kvmppc_share_page() and kvmppc_devm_fault_migrate_alloc_and_copy()
> > it is accessed without locking the page?
> > 
> > > + spin_unlock_irqrestore(&kvmppc_devm_pfn_lock, flags);
> > > +
> > > + get_page(dpage);
> > > + return dpage;
> > > +
> > > +out_unlock:
> > > + unlock_page(dpage);
> > > +out_clear:
> > > + bitmap_clear(kvmppc_devm_pfn_bitmap, devm_pfn - pfn_first, 1);
> > > +out:
> > > + spin_unlock_irqrestore(&kvmppc_devm_pfn_lock, flags);
> > > + return NULL;
> > > +}
> > > +
> > > +/*
> > > + * Alloc a PFN from private device memory pool and copy page from normal
> > > + * memory to secure memory.
> > > + */
> > > +static int
> > > +kvmppc_devm_migrate_alloc_and_copy(struct migrate_vma *mig,
> > > +unsigned long *rmap, unsigned long gpa,
> > > +unsigned int lpid, unsigned long page_shift)
> > > +{
> > > + struct page *spage = migrate_pfn_to_page(*mig->src);
> > > + unsigned long pfn = *mig->src >> MIGRATE_PFN_SHIFT;
> > > + struct page *dpage;
> > > +
> > > + 

Re: [PATCH v7 1/7] kvmppc: Driver to manage pages of secure guest

2019-08-29 Thread Christoph Hellwig
On Thu, Aug 22, 2019 at 03:56:14PM +0530, Bharata B Rao wrote:
> +/*
> + * Bits 60:56 in the rmap entry will be used to identify the
> + * different uses/functions of rmap.
> + */
> +#define KVMPPC_RMAP_DEVM_PFN (0x2ULL << 56)

How did you come up with this specific value?

> +
> +static inline bool kvmppc_rmap_is_devm_pfn(unsigned long pfn)
> +{
> + return !!(pfn & KVMPPC_RMAP_DEVM_PFN);
> +}

No need for !! when returning a bool.  Also the helper seems a little
pointless, just opencoding it would make the code more readable in my
opinion.

> +#ifdef CONFIG_PPC_UV
> +extern int kvmppc_devm_init(void);
> +extern void kvmppc_devm_free(void);

There is no need for extern in a function declaration.

> +static int
> +kvmppc_devm_migrate_alloc_and_copy(struct migrate_vma *mig,
> +unsigned long *rmap, unsigned long gpa,
> +unsigned int lpid, unsigned long page_shift)
> +{
> + struct page *spage = migrate_pfn_to_page(*mig->src);
> + unsigned long pfn = *mig->src >> MIGRATE_PFN_SHIFT;
> + struct page *dpage;
> +
> + *mig->dst = 0;
> + if (!spage || !(*mig->src & MIGRATE_PFN_MIGRATE))
> + return 0;
> +
> + dpage = kvmppc_devm_get_page(rmap, gpa, lpid);
> + if (!dpage)
> + return -EINVAL;
> +
> + if (spage)
> + uv_page_in(lpid, pfn << page_shift, gpa, 0, page_shift);
> +
> + *mig->dst = migrate_pfn(page_to_pfn(dpage)) | MIGRATE_PFN_LOCKED;
> + return 0;
> +}

I think you can just merge this trivial helper into the only caller.

> +static int
> +kvmppc_devm_fault_migrate_alloc_and_copy(struct migrate_vma *mig,
> +  unsigned long page_shift)
> +{
> + struct page *dpage, *spage;
> + struct kvmppc_devm_page_pvt *pvt;
> + unsigned long pfn;
> + int ret;
> +
> + spage = migrate_pfn_to_page(*mig->src);
> + if (!spage || !(*mig->src & MIGRATE_PFN_MIGRATE))
> + return 0;
> + if (!is_zone_device_page(spage))
> + return 0;
> +
> + dpage = alloc_page_vma(GFP_HIGHUSER, mig->vma, mig->start);
> + if (!dpage)
> + return -EINVAL;
> + lock_page(dpage);
> + pvt = spage->zone_device_data;
> +
> + pfn = page_to_pfn(dpage);
> + ret = uv_page_out(pvt->lpid, pfn << page_shift, pvt->gpa, 0,
> +   page_shift);
> + if (ret == U_SUCCESS)
> + *mig->dst = migrate_pfn(pfn) | MIGRATE_PFN_LOCKED;
> + else {
> + unlock_page(dpage);
> + __free_page(dpage);
> + }
> + return ret;
> +}

Here we actually have two callers, but they have a fair amount of
duplicate code in them.  I think you want to move that common
code (including setting up the migrate_vma structure) into this
function and maybe also give it a more descriptive name.

> +static void kvmppc_devm_page_free(struct page *page)
> +{
> + unsigned long pfn = page_to_pfn(page);
> + unsigned long flags;
> + struct kvmppc_devm_page_pvt *pvt;
> +
> + spin_lock_irqsave(&kvmppc_devm_pfn_lock, flags);
> + pvt = page->zone_device_data;
> + page->zone_device_data = NULL;
> +
> + bitmap_clear(kvmppc_devm_pfn_bitmap,
> +  pfn - (kvmppc_devm_pgmap.res.start >> PAGE_SHIFT), 1);

Nit: I'd just initialize pfn to the value you want from the start.
That makes the code a little easier to read, and keeps a tiny bit more
code outside the spinlock.

unsigned long pfn = page_to_pfn(page) -
(kvmppc_devm_pgmap.res.start >> PAGE_SHIFT);

..

 bitmap_clear(kvmppc_devm_pfn_bitmap, pfn, 1);


> + kvmppc_devm_pgmap.type = MEMORY_DEVICE_PRIVATE;
> + kvmppc_devm_pgmap.res = *res;
> + kvmppc_devm_pgmap.ops = &kvmppc_devm_ops;
> + addr = memremap_pages(&kvmppc_devm_pgmap, -1);

This -1 should be NUMA_NO_NODE for clarity.


Re: [PATCH v7 1/7] kvmppc: Driver to manage pages of secure guest

2019-08-29 Thread Bharata B Rao
On Wed, Aug 28, 2019 at 08:02:19PM -0700, Sukadev Bhattiprolu wrote:
> Some minor comments/questions below. Overall, the patches look
> fine to me.
> 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +
> > +static struct dev_pagemap kvmppc_devm_pgmap;
> > +static unsigned long *kvmppc_devm_pfn_bitmap;
> > +static DEFINE_SPINLOCK(kvmppc_devm_pfn_lock);
> 
> Is this lock protecting just the pfn_bitmap?

Yes.

> 
> > +
> > +struct kvmppc_devm_page_pvt {
> > +   unsigned long *rmap;
> > +   unsigned int lpid;
> > +   unsigned long gpa;
> > +};
> > +
> > +/*
> > + * Get a free device PFN from the pool
> > + *
> > + * Called when a normal page is moved to secure memory (UV_PAGE_IN). Device
> > + * PFN will be used to keep track of the secure page on HV side.
> > + *
> > + * @rmap here is the slot in the rmap array that corresponds to @gpa.
> > + * Thus a non-zero rmap entry indicates that the corresponding guest
> > + * page has become secure, and is not mapped on the HV side.
> > + *
> > + * NOTE: In this and subsequent functions, we pass around and access
> > + * individual elements of kvm_memory_slot->arch.rmap[] without any
> > + * protection. Should we use lock_rmap() here?
> > + */
> > +static struct page *kvmppc_devm_get_page(unsigned long *rmap, unsigned 
> > long gpa,
> > +unsigned int lpid)
> > +{
> > +   struct page *dpage = NULL;
> > +   unsigned long bit, devm_pfn;
> > +   unsigned long flags;
> > +   struct kvmppc_devm_page_pvt *pvt;
> > +   unsigned long pfn_last, pfn_first;
> > +
> > +   if (kvmppc_rmap_is_devm_pfn(*rmap))
> > +   return NULL;
> > +
> > +   pfn_first = kvmppc_devm_pgmap.res.start >> PAGE_SHIFT;
> > +   pfn_last = pfn_first +
> > +  (resource_size(&kvmppc_devm_pgmap.res) >> PAGE_SHIFT);
> > +   spin_lock_irqsave(&kvmppc_devm_pfn_lock, flags);
> 
> Blank lines around spin_lock() would help.

You mean blank line before lock and after unlock to clearly see
where the lock starts and ends?

> 
> > +   bit = find_first_zero_bit(kvmppc_devm_pfn_bitmap, pfn_last - pfn_first);
> > +   if (bit >= (pfn_last - pfn_first))
> > +   goto out;
> > +
> > +   bitmap_set(kvmppc_devm_pfn_bitmap, bit, 1);
> > +   devm_pfn = bit + pfn_first;
> 
> Can we drop the kvmppc_devm_pfn_lock here or after the trylock_page()?
> Or does it also protect the ->zone_device_data' assignment below as well?
> If so, maybe drop the 'pfn_' from the name of the lock?
> 
> Besides, we don't seem to hold this lock when accessing ->zone_device_data
> in kvmppc_share_page(). Maybe kvmppc_devm_pfn_lock just protects the bitmap?

Will move the unlock appropriately.

> 
> 
> > +   dpage = pfn_to_page(devm_pfn);
> 
> Does this code and hence CONFIG_PPC_UV depend on a specific model like
> CONFIG_SPARSEMEM_VMEMMAP?

I don't think so. Irrespective of that pfn_to_page() should just work
for us.

> > +
> > +   if (!trylock_page(dpage))
> > +   goto out_clear;
> > +
> > +   *rmap = devm_pfn | KVMPPC_RMAP_DEVM_PFN;
> > +   pvt = kzalloc(sizeof(*pvt), GFP_ATOMIC);
> > +   if (!pvt)
> > +   goto out_unlock;
> > +   pvt->rmap = rmap;
> > +   pvt->gpa = gpa;
> > +   pvt->lpid = lpid;
> > +   dpage->zone_device_data = pvt;
> 
> ->zone_device_data is set after locking the dpage here, but in
> kvmppc_share_page() and kvmppc_devm_fault_migrate_alloc_and_copy()
> it is accessed without locking the page?
> 
> > > +   spin_unlock_irqrestore(&kvmppc_devm_pfn_lock, flags);
> > +
> > +   get_page(dpage);
> > +   return dpage;
> > +
> > +out_unlock:
> > +   unlock_page(dpage);
> > +out_clear:
> > +   bitmap_clear(kvmppc_devm_pfn_bitmap, devm_pfn - pfn_first, 1);
> > +out:
> > > +   spin_unlock_irqrestore(&kvmppc_devm_pfn_lock, flags);
> > +   return NULL;
> > +}
> > +
> > +/*
> > + * Alloc a PFN from private device memory pool and copy page from normal
> > + * memory to secure memory.
> > + */
> > +static int
> > +kvmppc_devm_migrate_alloc_and_copy(struct migrate_vma *mig,
> > +  unsigned long *rmap, unsigned long gpa,
> > +  unsigned int lpid, unsigned long page_shift)
> > +{
> > +   struct page *spage = migrate_pfn_to_page(*mig->src);
> > +   unsigned long pfn = *mig->src >> MIGRATE_PFN_SHIFT;
> > +   struct page *dpage;
> > +
> > +   *mig->dst = 0;
> > +   if (!spage || !(*mig->src & MIGRATE_PFN_MIGRATE))
> > +   return 0;
> > +
> > +   dpage = kvmppc_devm_get_page(rmap, gpa, lpid);
> > +   if (!dpage)
> > +   return -EINVAL;
> > +
> > +   if (spage)
> > +   uv_page_in(lpid, pfn << page_shift, gpa, 0, page_shift);
> > +
> > +   *mig->dst = migrate_pfn(page_to_pfn(dpage)) | MIGRATE_PFN_LOCKED;
> > +   return 0;
> > +}
> > +
> > +/*
> > + * Move page from normal memory to secure memory.
> > + */
> > +unsigned long
> > +kvmppc_h_svm_page_in(struct kvm *kvm, unsigned long gpa,
> > +unsigned long flags, unsigned long page_shift)
> > +{
> > +   unsigned long addr, end;
> > +   unsigned long src_pfn, 

Re: [PATCH v7 1/7] kvmppc: Driver to manage pages of secure guest

2019-08-28 Thread Sukadev Bhattiprolu
Some minor comments/questions below. Overall, the patches look
fine to me.

> +#include 
> +#include 
> +#include 
> +#include 
> +
> +static struct dev_pagemap kvmppc_devm_pgmap;
> +static unsigned long *kvmppc_devm_pfn_bitmap;
> +static DEFINE_SPINLOCK(kvmppc_devm_pfn_lock);

Is this lock protecting just the pfn_bitmap?

> +
> +struct kvmppc_devm_page_pvt {
> + unsigned long *rmap;
> + unsigned int lpid;
> + unsigned long gpa;
> +};
> +
> +/*
> + * Get a free device PFN from the pool
> + *
> + * Called when a normal page is moved to secure memory (UV_PAGE_IN). Device
> + * PFN will be used to keep track of the secure page on HV side.
> + *
> + * @rmap here is the slot in the rmap array that corresponds to @gpa.
> + * Thus a non-zero rmap entry indicates that the corresponding guest
> + * page has become secure, and is not mapped on the HV side.
> + *
> + * NOTE: In this and subsequent functions, we pass around and access
> + * individual elements of kvm_memory_slot->arch.rmap[] without any
> + * protection. Should we use lock_rmap() here?
> + */
> +static struct page *kvmppc_devm_get_page(unsigned long *rmap, unsigned long 
> gpa,
> +  unsigned int lpid)
> +{
> + struct page *dpage = NULL;
> + unsigned long bit, devm_pfn;
> + unsigned long flags;
> + struct kvmppc_devm_page_pvt *pvt;
> + unsigned long pfn_last, pfn_first;
> +
> + if (kvmppc_rmap_is_devm_pfn(*rmap))
> + return NULL;
> +
> + pfn_first = kvmppc_devm_pgmap.res.start >> PAGE_SHIFT;
> + pfn_last = pfn_first +
> +(resource_size(&kvmppc_devm_pgmap.res) >> PAGE_SHIFT);
> + spin_lock_irqsave(&kvmppc_devm_pfn_lock, flags);

Blank lines around spin_lock() would help.

> + bit = find_first_zero_bit(kvmppc_devm_pfn_bitmap, pfn_last - pfn_first);
> + if (bit >= (pfn_last - pfn_first))
> + goto out;
> +
> + bitmap_set(kvmppc_devm_pfn_bitmap, bit, 1);
> + devm_pfn = bit + pfn_first;

Can we drop the kvmppc_devm_pfn_lock here or after the trylock_page()?
Or does it also protect the ->zone_device_data' assignment below as well?
If so, maybe drop the 'pfn_' from the name of the lock?

Besides, we don't seem to hold this lock when accessing ->zone_device_data
in kvmppc_share_page(). Maybe kvmppc_devm_pfn_lock just protects the bitmap?


> + dpage = pfn_to_page(devm_pfn);

Does this code and hence CONFIG_PPC_UV depend on a specific model like
CONFIG_SPARSEMEM_VMEMMAP?
> +
> + if (!trylock_page(dpage))
> + goto out_clear;
> +
> + *rmap = devm_pfn | KVMPPC_RMAP_DEVM_PFN;
> + pvt = kzalloc(sizeof(*pvt), GFP_ATOMIC);
> + if (!pvt)
> + goto out_unlock;
> + pvt->rmap = rmap;
> + pvt->gpa = gpa;
> + pvt->lpid = lpid;
> + dpage->zone_device_data = pvt;

->zone_device_data is set after locking the dpage here, but in
kvmppc_share_page() and kvmppc_devm_fault_migrate_alloc_and_copy()
it is accessed without locking the page?

> + spin_unlock_irqrestore(&kvmppc_devm_pfn_lock, flags);
> +
> + get_page(dpage);
> + return dpage;
> +
> +out_unlock:
> + unlock_page(dpage);
> +out_clear:
> + bitmap_clear(kvmppc_devm_pfn_bitmap, devm_pfn - pfn_first, 1);
> +out:
> + spin_unlock_irqrestore(&kvmppc_devm_pfn_lock, flags);
> + return NULL;
> +}
> +
> +/*
> + * Alloc a PFN from private device memory pool and copy page from normal
> + * memory to secure memory.
> + */
> +static int
> +kvmppc_devm_migrate_alloc_and_copy(struct migrate_vma *mig,
> +unsigned long *rmap, unsigned long gpa,
> +unsigned int lpid, unsigned long page_shift)
> +{
> + struct page *spage = migrate_pfn_to_page(*mig->src);
> + unsigned long pfn = *mig->src >> MIGRATE_PFN_SHIFT;
> + struct page *dpage;
> +
> + *mig->dst = 0;
> + if (!spage || !(*mig->src & MIGRATE_PFN_MIGRATE))
> + return 0;
> +
> + dpage = kvmppc_devm_get_page(rmap, gpa, lpid);
> + if (!dpage)
> + return -EINVAL;
> +
> + if (spage)
> + uv_page_in(lpid, pfn << page_shift, gpa, 0, page_shift);
> +
> + *mig->dst = migrate_pfn(page_to_pfn(dpage)) | MIGRATE_PFN_LOCKED;
> + return 0;
> +}
> +
> +/*
> + * Move page from normal memory to secure memory.
> + */
> +unsigned long
> +kvmppc_h_svm_page_in(struct kvm *kvm, unsigned long gpa,
> +  unsigned long flags, unsigned long page_shift)
> +{
> + unsigned long addr, end;
> + unsigned long src_pfn, dst_pfn;

These are the host frame numbers correct? Trying to distinguish them
from 'gfn' and 'gpa' used in the function.

> + struct migrate_vma mig;
> + struct vm_area_struct *vma;
> + int srcu_idx;
> + unsigned long gfn = gpa >> page_shift;
> + struct kvm_memory_slot *slot;
> + unsigned long *rmap;
> + int ret;
> +
> + if (page_shift != PAGE_SHIFT)
> + return H_P3;
> +
> + if (flags)
> + return H_P2;
> +
> 

[PATCH v7 1/7] kvmppc: Driver to manage pages of secure guest

2019-08-22 Thread Bharata B Rao
KVMPPC driver to manage page transitions of secure guest
via H_SVM_PAGE_IN and H_SVM_PAGE_OUT hcalls.

H_SVM_PAGE_IN: Move the content of a normal page to secure page
H_SVM_PAGE_OUT: Move the content of a secure page to normal page

Private ZONE_DEVICE memory equal to the amount of secure memory
available in the platform for running secure guests is created.
Whenever a page belonging to the guest becomes secure, a page from
this private device memory is used to represent and track that secure
page on the HV side. The movement of pages between normal and secure
memory is done via migrate_vma_pages() using UV_PAGE_IN and
UV_PAGE_OUT ucalls.

Signed-off-by: Bharata B Rao 
---
 arch/powerpc/include/asm/hvcall.h  |   4 +
 arch/powerpc/include/asm/kvm_book3s_devm.h |  29 ++
 arch/powerpc/include/asm/kvm_host.h|  23 ++
 arch/powerpc/include/asm/ultravisor-api.h  |   2 +
 arch/powerpc/include/asm/ultravisor.h  |  14 +
 arch/powerpc/kvm/Makefile  |   3 +
 arch/powerpc/kvm/book3s_hv.c   |  19 +
 arch/powerpc/kvm/book3s_hv_devm.c  | 438 +
 8 files changed, 532 insertions(+)
 create mode 100644 arch/powerpc/include/asm/kvm_book3s_devm.h
 create mode 100644 arch/powerpc/kvm/book3s_hv_devm.c

diff --git a/arch/powerpc/include/asm/hvcall.h 
b/arch/powerpc/include/asm/hvcall.h
index 463c63a9fcf1..2f6b952deb0f 100644
--- a/arch/powerpc/include/asm/hvcall.h
+++ b/arch/powerpc/include/asm/hvcall.h
@@ -337,6 +337,10 @@
 #define H_TLB_INVALIDATE   0xF808
 #define H_COPY_TOFROM_GUEST0xF80C
 
+/* Platform-specific hcalls used by the Ultravisor */
+#define H_SVM_PAGE_IN  0xEF00
+#define H_SVM_PAGE_OUT 0xEF04
+
 /* Values for 2nd argument to H_SET_MODE */
 #define H_SET_MODE_RESOURCE_SET_CIABR  1
 #define H_SET_MODE_RESOURCE_SET_DAWR   2
diff --git a/arch/powerpc/include/asm/kvm_book3s_devm.h 
b/arch/powerpc/include/asm/kvm_book3s_devm.h
new file mode 100644
index ..9603c2b48d67
--- /dev/null
+++ b/arch/powerpc/include/asm/kvm_book3s_devm.h
@@ -0,0 +1,29 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __POWERPC_KVM_PPC_HMM_H__
+#define __POWERPC_KVM_PPC_HMM_H__
+
+#ifdef CONFIG_PPC_UV
+unsigned long kvmppc_h_svm_page_in(struct kvm *kvm,
+  unsigned long gra,
+  unsigned long flags,
+  unsigned long page_shift);
+unsigned long kvmppc_h_svm_page_out(struct kvm *kvm,
+   unsigned long gra,
+   unsigned long flags,
+   unsigned long page_shift);
+#else
+static inline unsigned long
+kvmppc_h_svm_page_in(struct kvm *kvm, unsigned long gra,
+unsigned long flags, unsigned long page_shift)
+{
+   return H_UNSUPPORTED;
+}
+
+static inline unsigned long
+kvmppc_h_svm_page_out(struct kvm *kvm, unsigned long gra,
+ unsigned long flags, unsigned long page_shift)
+{
+   return H_UNSUPPORTED;
+}
+#endif /* CONFIG_PPC_UV */
+#endif /* __POWERPC_KVM_PPC_HMM_H__ */
diff --git a/arch/powerpc/include/asm/kvm_host.h 
b/arch/powerpc/include/asm/kvm_host.h
index 4bb552d639b8..855d82730f44 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -242,6 +242,17 @@ struct revmap_entry {
 #define KVMPPC_RMAP_PRESENT0x1ul
 #define KVMPPC_RMAP_INDEX  0xul
 
+/*
+ * Bits 60:56 in the rmap entry will be used to identify the
+ * different uses/functions of rmap.
+ */
+#define KVMPPC_RMAP_DEVM_PFN   (0x2ULL << 56)
+
+static inline bool kvmppc_rmap_is_devm_pfn(unsigned long pfn)
+{
+   return !!(pfn & KVMPPC_RMAP_DEVM_PFN);
+}
+
 struct kvm_arch_memory_slot {
 #ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
unsigned long *rmap;
@@ -849,4 +860,16 @@ static inline void kvm_arch_vcpu_blocking(struct kvm_vcpu 
*vcpu) {}
 static inline void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu) {}
 static inline void kvm_arch_vcpu_block_finish(struct kvm_vcpu *vcpu) {}
 
+#ifdef CONFIG_PPC_UV
+extern int kvmppc_devm_init(void);
+extern void kvmppc_devm_free(void);
+#else
+static inline int kvmppc_devm_init(void)
+{
+   return 0;
+}
+
+static inline void kvmppc_devm_free(void) {}
+#endif /* CONFIG_PPC_UV */
+
 #endif /* __POWERPC_KVM_HOST_H__ */
diff --git a/arch/powerpc/include/asm/ultravisor-api.h 
b/arch/powerpc/include/asm/ultravisor-api.h
index 6a0f9c74f959..1cd1f595fd81 100644
--- a/arch/powerpc/include/asm/ultravisor-api.h
+++ b/arch/powerpc/include/asm/ultravisor-api.h
@@ -25,5 +25,7 @@
 /* opcodes */
 #define UV_WRITE_PATE  0xF104
 #define UV_RETURN  0xF11C
+#define UV_PAGE_IN 0xF128
+#define UV_PAGE_OUT0xF12C
 
 #endif /* _ASM_POWERPC_ULTRAVISOR_API_H */
diff --git a/arch/powerpc/include/asm/ultravisor.h 
b/arch/powerpc/include/asm/ultravisor.h
index