Re: [Xen-devel] [PATCH v9 3/3] tools/libxc: use superpages during restore of HVM guest

2017-10-11 Thread Olaf Hering
On Wed, Oct 11, Olaf Hering wrote:

> -#define MAX_BATCH_SIZE 1024   /* up to 1024 pages (4MB) at a time */
> +#define MAX_BATCH_SIZE SUPERPAGE_1GB_NR_PFNS   /* up to 1GB at a time */

Actually the error is something else; I missed this in the debug output:

xc: error: Failed to get types for pfn batch (7 = Argument list too long): Internal error

write_batch() should probably split the requests when filling types[] because
Xen has "1024" hardcoded in XEN_DOMCTL_getpageframeinfo3...


Olaf




Re: [Xen-devel] [PATCH v9 3/3] tools/libxc: use superpages during restore of HVM guest

2017-10-11 Thread Olaf Hering
On Fri, Sep 08, Olaf Hering wrote:

> A related question: is it safe to increase MAX_BATCH_SIZE from 1024 to
> (256*1024) to transfer a whole gigabyte at a time? That way it will be
> easier to handle holes within a 1GB superpage.

To answer my own question:

This change leads to the following error:

-#define MAX_BATCH_SIZE 1024   /* up to 1024 pages (4MB) at a time */
+#define MAX_BATCH_SIZE SUPERPAGE_1GB_NR_PFNS   /* up to 1GB at a time */

...
xc: info: Found x86 HVM domain from Xen 4.10
xc: detail: dom 9 p2m_size fee01 max_pages 100100
xc: info: Restoring domain
xc: error: Failed to read Record Header from stream (0 = Success): Internal error
xc: error: Restore failed (0 = Success): Internal error
...

Olaf




Re: [Xen-devel] [PATCH v9 3/3] tools/libxc: use superpages during restore of HVM guest

2017-09-08 Thread Olaf Hering
On Wed, Sep 06, Andrew Cooper wrote:

> The stream has always been in-order for the first pass (even in the
> legacy days), and I don't foresee that changing.  Reliance on the order
> was suggested by both myself and Jan during the early design.

A related question: is it safe to increase MAX_BATCH_SIZE from 1024 to
(256*1024) to transfer a whole gigabyte at a time? That way it will be
easier to handle holes within a 1GB superpage.
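
For reference, 256*1024 is exactly SUPERPAGE_1GB_NR_PFNS, if I read
xc_private.h correctly:

#define SUPERPAGE_1GB_SHIFT   18
#define SUPERPAGE_1GB_NR_PFNS (1UL << SUPERPAGE_1GB_SHIFT)

/* (1UL << 18) == 262144 == 256 * 1024 pfns; at 4096 bytes per page
 * that is 262144 * 4096 bytes == 1 GiB per batch. */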

Olaf




Re: [Xen-devel] [PATCH v9 3/3] tools/libxc: use superpages during restore of HVM guest

2017-09-06 Thread Olaf Hering
On Wed, Sep 06, Andrew Cooper wrote:

> If a PVH guest has got MTRRs disabled, then it genuinely can run on an
> unshattered 1G superpage at 0.

Ok, the code will detect the holes and will release memory as needed. I
will drop these two lines.

Olaf





Re: [Xen-devel] [PATCH v9 3/3] tools/libxc: use superpages during restore of HVM guest

2017-09-06 Thread Olaf Hering
On Wed, Sep 06, Andrew Cooper wrote:

> I still fail to understand why you need the bitmaps at all?  You can
> calculate everything you need from the pfn list alone, which will also
> let you spot the presence or absence of the VGA hole.

These bitmaps track whether a range has already been allocated as a
superpage. Without them, a later pfn that falls within an already handled
1G or 2M range could trigger a double allocation of that 1G or 2M page.
This is not related to the VGA hole; those two lines are merely a hint
that no superpage can be allocated in that range.
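
Roughly like this, as a sketch of the idea rather than the exact patch
code; it assumes the xc_sr_test_bit() helper from the bitmap patch
earlier in this series:

/* Each pfn maps to one bit per 1G range (and likewise per 2M range). */
xen_pfn_t idx_1g = pfn >> SUPERPAGE_1GB_SHIFT;

if ( !xc_sr_test_bit(idx_1g, &ctx->x86_hvm.restore.attempted_1g) )
{
    /* First pfn seen in this 1G range: attempt the 1G allocation once
     * and record the attempt, so a later pfn in the same range cannot
     * trigger a second allocation of the same 1G page. */
    xc_sr_set_bit(idx_1g, &ctx->x86_hvm.restore.attempted_1g);
    /* ... try an order SUPERPAGE_1GB_SHIFT allocation here, falling
     * back to 2M and 4k allocations if it fails ... */
}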

> You need to track which pfns you've seen so far in the stream, and which
> pfns have been populated.  When you find holes in the pfns in the
> stream, you need to undo the prospective superpage allocation.  Unless
> I've missed something?

This is what's happening: holes are punched as soon as they are seen in
the stream.

> Also, please take care to use 2M decrease reservations wherever
> possible, or you will end up shattering the host superpage as part of
> trying to remove the memory.

This is what Wei suggested: build a list of pfns instead of releasing
each pfn individually. With this new code it should be possible to
decrease the reservation in 2M steps as needed.

Olaf




Re: [Xen-devel] [PATCH v9 3/3] tools/libxc: use superpages during restore of HVM guest

2017-09-06 Thread Andrew Cooper
On 06/09/17 13:17, Olaf Hering wrote:
> On Wed, Sep 06, Andrew Cooper wrote:
>
>> On 01/09/17 17:08, Olaf Hering wrote:
>>> +/* No superpage in 1st 2MB due to VGA hole */
>>> +xc_sr_set_bit(0, &ctx->x86_hvm.restore.attempted_1g);
>>> +xc_sr_set_bit(0, &ctx->x86_hvm.restore.attempted_2m);
>> This is false for PVH guests.
> How can I detect a PVH guest?

You (hopefully) can't, and it would be a layering violation if you could.

The exact set of emulation available to/used by a guest is not relevant
to how we move its memory.

If a PVH guest has got MTRRs disabled, then it genuinely can run on an
unshattered 1G superpage at 0.

~Andrew



Re: [Xen-devel] [PATCH v9 3/3] tools/libxc: use superpages during restore of HVM guest

2017-09-06 Thread Olaf Hering
On Wed, Sep 06, Andrew Cooper wrote:

> On 01/09/17 17:08, Olaf Hering wrote:
> > +/* No superpage in 1st 2MB due to VGA hole */
> > +xc_sr_set_bit(0, &ctx->x86_hvm.restore.attempted_1g);
> > +xc_sr_set_bit(0, &ctx->x86_hvm.restore.attempted_2m);
> This is false for PVH guests.

How can I detect a PVH guest?

Olaf




Re: [Xen-devel] [PATCH v9 3/3] tools/libxc: use superpages during restore of HVM guest

2017-09-06 Thread Andrew Cooper
On 01/09/17 17:08, Olaf Hering wrote:
> +/* No superpage in 1st 2MB due to VGA hole */
> +xc_sr_set_bit(0, &ctx->x86_hvm.restore.attempted_1g);
> +xc_sr_set_bit(0, &ctx->x86_hvm.restore.attempted_2m);

This is false for PVH guests.

I still fail to understand why you need the bitmaps at all?  You can
calculate everything you need from the pfn list alone, which will also
let you spot the presence or absence of the VGA hole.

You need to track which pfns you've seen so far in the stream, and which
pfns have been populated.  When you find holes in the pfns in the
stream, you need to undo the prospective superpage allocation.  Unless
I've missed something?

Also, please take care to use 2M decrease reservations wherever
possible, or you will end up shattering the host superpage as part of
trying to remove the memory.
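
For example, something like this for a fully populated, 2M-aligned
range (sketch only; base_pfn is a hypothetical aligned start pfn):

/* Release a whole 2M-aligned range as a single order-9 extent rather
 * than 512 order-0 calls, which would shatter the host superpage. */
xen_pfn_t base_pfn = pfn & ~(xen_pfn_t)(SUPERPAGE_2MB_NR_PFNS - 1);
int rc = xc_domain_decrease_reservation_exact(xch, ctx->domid, 1,
                                              SUPERPAGE_2MB_SHIFT, &base_pfn);
if ( rc )
    PERROR("Failed to release 2MB range at %" PRI_xen_pfn, base_pfn);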

~Andrew

> +
>  return 0;
> +
> +out:
> +ERROR("Unable to allocate memory for pfn bitmaps");
> +return -1;
>  }
>  
>  /*
>




Re: [Xen-devel] [PATCH v9 3/3] tools/libxc: use superpages during restore of HVM guest

2017-09-06 Thread Olaf Hering
On Wed, 6 Sep 2017 12:34:10 +0100, Wei Liu wrote:

> > +struct x86_hvm_sp {  
> Forgot to ask: what does sp stand for?

superpage. I will check if there is room to expand this string.

> > + * Try to allocate superpages.
> > + * This works without memory map only if the pfns arrive in incremental order.
> > + */  
> I have said several times, one way or another, I don't want to make
> assumption on the stream of pfns. So I'm afraid I can't ack a patch like
> this.

It will work with any order, I think. It is just that with incremental
order the superpages will not be split once they are allocated.

Thanks for the review. I will send another series shortly.

Olaf




Re: [Xen-devel] [PATCH v9 3/3] tools/libxc: use superpages during restore of HVM guest

2017-09-06 Thread Andrew Cooper

>> +if ( rc )
>> +{
>> +PERROR("Failed to release pfn %" PRI_xen_pfn, pfn);
>> +return false;
>> +}
>> +ctx->restore.tot_pages--;
>> +freed++;
>> +}
>> +pfn++;
>> +}
>> +if ( freed )
>> +DPRINTF("freed %u between %" PRI_xen_pfn " %" PRI_xen_pfn "\n",
>> +freed, min_pfn, max_pfn);
>> +return true;
>> +}
>> +
>> +/*
>> + * Try to allocate superpages.
>> + * This works without memory map only if the pfns arrive in incremental order.
>> + */
> I have said several times, one way or another, I don't want to make
> assumption on the stream of pfns. So I'm afraid I can't ack a patch like
> this.
>
> If Ian or Andrew thinks this is OK, I won't stand in the way.

The stream has always been in-order for the first pass (even in the
legacy days), and I don't foresee that changing.  Reliance on the order
was suggested by both myself and Jan during the early design.

It is certainly an acceptable assumption until we put a proper address
map into the head of the stream.

~Andrew



Re: [Xen-devel] [PATCH v9 3/3] tools/libxc: use superpages during restore of HVM guest

2017-09-06 Thread Wei Liu
On Fri, Sep 01, 2017 at 06:08:43PM +0200, Olaf Hering wrote:
[...]
> diff --git a/tools/libxc/xc_sr_common.h b/tools/libxc/xc_sr_common.h
> index 734320947a..93141a6e25 100644
> --- a/tools/libxc/xc_sr_common.h
> +++ b/tools/libxc/xc_sr_common.h
> @@ -139,6 +139,16 @@ struct xc_sr_restore_ops
>   */
>  int (*setup)(struct xc_sr_context *ctx);
>  
> +/**
> + * Populate PFNs
> + *
> + * Given a set of pfns, obtain memory from Xen to fill the physmap for 
> the
> + * unpopulated subset.
> + */
> +int (*populate_pfns)(struct xc_sr_context *ctx, unsigned count,
> + const xen_pfn_t *original_pfns, const uint32_t *types);
> +

One blank line is good enough.

> +
>  /**
>   * Process an individual record from the stream.  The caller shall take
>   * care of processing common records (e.g. END, PAGE_DATA).
> @@ -224,6 +234,8 @@ struct xc_sr_context
>  
>  int send_back_fd;
>  unsigned long p2m_size;
> +unsigned long max_pages;
> +unsigned long tot_pages;
>  xc_hypercall_buffer_t dirty_bitmap_hbuf;
>  
>  /* From Image Header. */
> @@ -336,6 +348,12 @@ struct xc_sr_context
>  /* HVM context blob. */
>  void *context;
>  size_t contextsz;
> +
> +/* Bitmap of currently allocated PFNs during restore. */
> +struct xc_sr_bitmap attempted_1g;
> +struct xc_sr_bitmap attempted_2m;
> +struct xc_sr_bitmap allocated_pfns;
> +xen_pfn_t idx1G_prev, idx2M_prev;
>  } restore;
>  };
>  } x86_hvm;
> @@ -459,14 +477,6 @@ static inline int write_record(struct xc_sr_context *ctx,
>   */
>  int read_record(struct xc_sr_context *ctx, int fd, struct xc_sr_record *rec);
>  
> -/*
> - * This would ideally be private in restore.c, but is needed by
> - * x86_pv_localise_page() if we receive pagetables frames ahead of the
> - * contents of the frames they point at.
> - */
> -int populate_pfns(struct xc_sr_context *ctx, unsigned count,
> -  const xen_pfn_t *original_pfns, const uint32_t *types);
> -
>  #endif
>  /*
>   * Local variables:
[...]
>  
> +struct x86_hvm_sp {

Forgot to ask: what does sp stand for?

> +static bool x86_hvm_punch_hole(struct xc_sr_context *ctx, xen_pfn_t max_pfn)
> +{
> +xc_interface *xch = ctx->xch;
> +struct xc_sr_bitmap *bm;
> +xen_pfn_t _pfn, pfn, min_pfn;
> +uint32_t domid, freed = 0, order;

unsigned int / long for freed and order.

> +int rc = -1;
> +
> +/*
> + * Scan the entire superpage because several batches will fit into
> + * a superpage, and it is unknown which pfn triggered the allocation.
> + */
> +order = SUPERPAGE_1GB_SHIFT;
> +pfn = min_pfn = (max_pfn >> order) << order;
> +

min_pfn -> start_pfn?

> +while ( pfn <= max_pfn )
> +{

bm can be defined here.

> +bm = &ctx->x86_hvm.restore.allocated_pfns;
> +if ( !xc_sr_bitmap_resize(bm, pfn) )
> +{
> +PERROR("Failed to realloc allocated_pfns %" PRI_xen_pfn, pfn);
> +return false;
> +}
> +if ( !pfn_is_populated(ctx, pfn) &&
> +xc_sr_test_and_clear_bit(pfn, bm) ) {

domid and _pfn can be defined here.

> +domid = ctx->domid;
> +_pfn = pfn;
> +rc = xc_domain_decrease_reservation_exact(xch, domid, 1, 0, &_pfn);

Please batch the requests otherwise it is going to be very slow.

It should be feasible to construct an array of pfns here and issue a
single decrease_reservation outside of this loop.
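
Something along these lines (untested; pfns[] and nr are hypothetical
locals sized for the range being scanned):

/* Inside the loop: only collect the candidates. */
if ( !pfn_is_populated(ctx, pfn) &&
     xc_sr_test_and_clear_bit(pfn, bm) )
    pfns[nr++] = pfn;

/* After the loop: release the whole set with a single hypercall. */
if ( nr )
{
    rc = xc_domain_decrease_reservation_exact(xch, ctx->domid, nr, 0, pfns);
    if ( rc )
    {
        PERROR("Failed to release %u pfns", nr);
        return false;
    }
    ctx->restore.tot_pages -= nr;
}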

> +if ( rc )
> +{
> +PERROR("Failed to release pfn %" PRI_xen_pfn, pfn);
> +return false;
> +}
> +ctx->restore.tot_pages--;
> +freed++;
> +}
> +pfn++;
> +}
> +if ( freed )
> +DPRINTF("freed %u between %" PRI_xen_pfn " %" PRI_xen_pfn "\n",
> +freed, min_pfn, max_pfn);
> +return true;
> +}
> +
> +/*
> + * Try to allocate superpages.
> + * This works without memory map only if the pfns arrive in incremental order.
> + */

I have said several times, one way or another, I don't want to make
assumption on the stream of pfns. So I'm afraid I can't ack a patch like
this.

If Ian or Andrew thinks this is OK, I won't stand in the way.

> +static int x86_hvm_populate_pfns(struct xc_sr_context *ctx, unsigned count,
> + const xen_pfn_t *original_pfns,

original_pfns -> pfns?

The list is not copied and/or altered in any way afaict.

(I skipped the rest)



[Xen-devel] [PATCH v9 3/3] tools/libxc: use superpages during restore of HVM guest

2017-09-01 Thread Olaf Hering
During creation of an HVM domU, meminit_hvm() tries to map superpages.
After save/restore or migration this mapping is lost and everything is
allocated in single pages. This causes a performance degradation after
migration.

Add the necessary code to preallocate a superpage for the chunk of pfns
that is received. In case a pfn was not populated on the sending side it
must be freed on the receiving side to avoid over-allocation.

The existing code for x86_pv is moved unmodified into its own file.

Signed-off-by: Olaf Hering 
---
 tools/libxc/xc_sr_common.h  |  26 ++-
 tools/libxc/xc_sr_restore.c |  75 +---
 tools/libxc/xc_sr_restore_x86_hvm.c | 341 
 tools/libxc/xc_sr_restore_x86_pv.c  |  72 +++-
 4 files changed, 436 insertions(+), 78 deletions(-)

diff --git a/tools/libxc/xc_sr_common.h b/tools/libxc/xc_sr_common.h
index 734320947a..93141a6e25 100644
--- a/tools/libxc/xc_sr_common.h
+++ b/tools/libxc/xc_sr_common.h
@@ -139,6 +139,16 @@ struct xc_sr_restore_ops
  */
 int (*setup)(struct xc_sr_context *ctx);
 
+/**
+ * Populate PFNs
+ *
+ * Given a set of pfns, obtain memory from Xen to fill the physmap for the
+ * unpopulated subset.
+ */
+int (*populate_pfns)(struct xc_sr_context *ctx, unsigned count,
+ const xen_pfn_t *original_pfns, const uint32_t *types);
+
+
 /**
  * Process an individual record from the stream.  The caller shall take
  * care of processing common records (e.g. END, PAGE_DATA).
@@ -224,6 +234,8 @@ struct xc_sr_context
 
 int send_back_fd;
 unsigned long p2m_size;
+unsigned long max_pages;
+unsigned long tot_pages;
 xc_hypercall_buffer_t dirty_bitmap_hbuf;
 
 /* From Image Header. */
@@ -336,6 +348,12 @@ struct xc_sr_context
 /* HVM context blob. */
 void *context;
 size_t contextsz;
+
+/* Bitmap of currently allocated PFNs during restore. */
+struct xc_sr_bitmap attempted_1g;
+struct xc_sr_bitmap attempted_2m;
+struct xc_sr_bitmap allocated_pfns;
+xen_pfn_t idx1G_prev, idx2M_prev;
 } restore;
 };
 } x86_hvm;
@@ -459,14 +477,6 @@ static inline int write_record(struct xc_sr_context *ctx,
  */
 int read_record(struct xc_sr_context *ctx, int fd, struct xc_sr_record *rec);
 
-/*
- * This would ideally be private in restore.c, but is needed by
- * x86_pv_localise_page() if we receive pagetables frames ahead of the
- * contents of the frames they point at.
- */
-int populate_pfns(struct xc_sr_context *ctx, unsigned count,
-  const xen_pfn_t *original_pfns, const uint32_t *types);
-
 #endif
 /*
  * Local variables:
diff --git a/tools/libxc/xc_sr_restore.c b/tools/libxc/xc_sr_restore.c
index d53948e1a6..8cd9289d1a 100644
--- a/tools/libxc/xc_sr_restore.c
+++ b/tools/libxc/xc_sr_restore.c
@@ -68,74 +68,6 @@ static int read_headers(struct xc_sr_context *ctx)
 return 0;
 }
 
-/*
- * Given a set of pfns, obtain memory from Xen to fill the physmap for the
- * unpopulated subset.  If types is NULL, no page type checking is performed
- * and all unpopulated pfns are populated.
- */
-int populate_pfns(struct xc_sr_context *ctx, unsigned count,
-  const xen_pfn_t *original_pfns, const uint32_t *types)
-{
-xc_interface *xch = ctx->xch;
-xen_pfn_t *mfns = malloc(count * sizeof(*mfns)),
-*pfns = malloc(count * sizeof(*pfns));
-unsigned i, nr_pfns = 0;
-int rc = -1;
-
-if ( !mfns || !pfns )
-{
-ERROR("Failed to allocate %zu bytes for populating the physmap",
-  2 * count * sizeof(*mfns));
-goto err;
-}
-
-for ( i = 0; i < count; ++i )
-{
-if ( (!types || (types &&
- (types[i] != XEN_DOMCTL_PFINFO_XTAB &&
-  types[i] != XEN_DOMCTL_PFINFO_BROKEN))) &&
- !pfn_is_populated(ctx, original_pfns[i]) )
-{
-rc = pfn_set_populated(ctx, original_pfns[i]);
-if ( rc )
-goto err;
-pfns[nr_pfns] = mfns[nr_pfns] = original_pfns[i];
-++nr_pfns;
-}
-}
-
-if ( nr_pfns )
-{
-rc = xc_domain_populate_physmap_exact(
-xch, ctx->domid, nr_pfns, 0, 0, mfns);
-if ( rc )
-{
-PERROR("Failed to populate physmap");
-goto err;
-}
-
-for ( i = 0; i < nr_pfns; ++i )
-{
-if ( mfns[i] == INVALID_MFN )
-{
-ERROR("Populate physmap failed for pfn %u", i);
-rc = -1;
-goto err;
-}
-
-ctx->restore.ops.set_gfn(ctx, pfns[i], mfns[i]);
-}
-}
-
-rc = 0;
-
- err:
-free(pfns);
-free(mfns);