Re: [Xen-devel] [PATCH v11 1/3] xen/mem_sharing: VM forking

2020-03-17 Thread Tamas K Lengyel
On Tue, Mar 17, 2020 at 10:35 AM Jan Beulich  wrote:
>
> On 17.03.2020 17:23, Tamas K Lengyel wrote:
> > On Tue, Mar 17, 2020 at 10:02 AM Jan Beulich  wrote:
> >> On 28.02.2020 19:40, Tamas K Lengyel wrote:
> >>> @@ -588,7 +594,8 @@ struct page_info *p2m_get_page_from_gfn(
> >>>  return page;
> >>>
> >>>  /* Error path: not a suitable GFN at all */
> >>> -if ( !p2m_is_ram(*t) && !p2m_is_paging(*t) && !p2m_is_pod(*t) )
> >>> +if ( !p2m_is_ram(*t) && !p2m_is_paging(*t) && !p2m_is_pod(*t) &&
> >>> + !mem_sharing_is_fork(p2m->domain) )
> >>>  return NULL;
> >>
> >> This looks like a pretty broad condition, i.e. all possible types would
> >> make it through here for a fork. Wouldn't it make sense to limit
> >> it to p2m_is_hole() page types, like you check for in
> >> __get_gfn_type_access()?
> >
> > No need to put that check here. By allowing it to go further when we have
> > a forked VM, this code-path will call get_gfn_type_access, which does
> > that check. It's better to have that check in one place instead of all
> > over unnecessarily.
>
> Well, if worse performance (due to more cases where the lock will
> be taken) is not of concern - so be it.
>
> >>> --- a/xen/include/asm-x86/mem_sharing.h
> >>> +++ b/xen/include/asm-x86/mem_sharing.h
> >>> @@ -39,6 +39,9 @@ struct mem_sharing_domain
> >>>
> >>>  #define mem_sharing_enabled(d) ((d)->arch.hvm.mem_sharing.enabled)
> >>>
> >>> +#define mem_sharing_is_fork(d) \
> >>> +(mem_sharing_enabled(d) && !!((d)->parent))
> >>
> >> Again no need for !! (for a different reason).
> >
> > Which is?
>
> Further up the reason was that you pass the value as an argument
> for a boolean function parameter. Here the reason is that it is an
> operand of &&.
>
> >> Also, would the build break if you made this an inline function
> >> (as we generally prefer)?
> >
> > Any particular reason for that (inline vs define)?
>
> Inline functions add type safety for the arguments, which
> #define-s don't do.

Ack.

>
> >>> @@ -532,6 +533,10 @@ struct xen_mem_sharing_op {
> >>>  uint32_t gref; /* IN: gref to debug */
> >>>  } u;
> >>>  } debug;
> >>> +struct mem_sharing_op_fork {  /* OP_FORK */
> >>> +domid_t parent_domain;/* IN: parent's domain id */
> >>> +uint16_t _pad[3]; /* Must be set to 0 */
> >>
> >> Especially in the public interface - no new name space
> >> violations please. I.e. please drop the leading underscore.
> >> I also struggle to see why this is an array of three
> >> elements. In fact I don't see why the padding field would be
> >> needed at all - one other union member only gets padded to
> >> its alignment (which is what I'd expect), while others
> >> (presumably older ones) don't have any padding at all. Here
> >> there's no implicit structure-alignment padding that wants
> >> making explicit.
> >
> > I don't know what you are asking. Drop the padding? I prefer each
> > union member to be padded to 64 bits; it reduces the cognitive load of
> > figuring out what the size and alignment of each member struct would be.
>
> Personally I'd suggest dropping the padding, as it actually
> grows the size of the structure. But if you feel strongly
> about keeping it, then I'll be okay with just the field's
> name changed.

It grows the structure's size to 64 bits, yes, but it doesn't grow the
size of the union, as the other members are much larger. I'll remove the
underscore from the pad name, but I still prefer it aligned.
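
For reference, the agreed shape would then look roughly as follows (a
sketch only; domid_t is 16 bits wide, so the three pad elements round the
member up to a 64-bit size):

    struct mem_sharing_op_fork {      /* OP_FORK */
        domid_t parent_domain;        /* IN: parent's domain id */
        uint16_t pad[3];              /* Must be set to 0 */
    };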

Tamas


Re: [Xen-devel] [PATCH v11 1/3] xen/mem_sharing: VM forking

2020-03-17 Thread Jan Beulich
On 17.03.2020 17:23, Tamas K Lengyel wrote:
> On Tue, Mar 17, 2020 at 10:02 AM Jan Beulich  wrote:
>> On 28.02.2020 19:40, Tamas K Lengyel wrote:
>>> @@ -588,7 +594,8 @@ struct page_info *p2m_get_page_from_gfn(
>>>  return page;
>>>
>>>  /* Error path: not a suitable GFN at all */
>>> -if ( !p2m_is_ram(*t) && !p2m_is_paging(*t) && !p2m_is_pod(*t) )
>>> +if ( !p2m_is_ram(*t) && !p2m_is_paging(*t) && !p2m_is_pod(*t) &&
>>> + !mem_sharing_is_fork(p2m->domain) )
>>>  return NULL;
>>
>> This looks like a pretty broad condition, i.e. all possible types would
>> make it through here for a fork. Wouldn't it make sense to limit
>> it to p2m_is_hole() page types, like you check for in
>> __get_gfn_type_access()?
> 
> No need to put that check here. By allowing it to go further when we have
> a forked VM, this code-path will call get_gfn_type_access, which does
> that check. It's better to have that check in one place instead of all
> over unnecessarily.

Well, if worse performance (due to more cases where the lock will
be taken) is not of concern - so be it.
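
For reference, a minimal sketch of the narrower form being suggested (an
illustration only; the patch as posted keeps the broader check and relies
on __get_gfn_type_access() to do the filtering):

    /* Only let hole-type entries through on the error path for forks. */
    if ( !p2m_is_ram(*t) && !p2m_is_paging(*t) && !p2m_is_pod(*t) &&
         !(p2m_is_hole(*t) && mem_sharing_is_fork(p2m->domain)) )
        return NULL;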

>>> --- a/xen/include/asm-x86/mem_sharing.h
>>> +++ b/xen/include/asm-x86/mem_sharing.h
>>> @@ -39,6 +39,9 @@ struct mem_sharing_domain
>>>
>>>  #define mem_sharing_enabled(d) ((d)->arch.hvm.mem_sharing.enabled)
>>>
>>> +#define mem_sharing_is_fork(d) \
>>> +(mem_sharing_enabled(d) && !!((d)->parent))
>>
>> Again no need for !! (for a different reason).
> 
> Which is?

Further up the reason was that you pass the value as an argument
for a boolean function parameter. Here the reason is that it is an
operand of &&.
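
To illustrate (a standalone C sketch with hypothetical names, not code from
the patch): both a bool parameter and the && operator already normalize
their operands, so an extra !! changes nothing:

    #include <stdbool.h>

    struct domain;                             /* opaque, for illustration */

    static void takes_bool(bool unshare) { (void)unshare; }

    static void example(unsigned int q, const struct domain *parent)
    {
        /* A bool parameter converts any non-zero value to true anyway. */
        takes_bool(q & 0x2);                   /* same effect as !!(q & 0x2) */

        /* && itself yields 0 or 1, so !! on its operand adds nothing. */
        bool is_fork = (q & 0x1) && parent;    /* same as ... && !!parent */
        (void)is_fork;
    }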

>> Also, would the build break if you made this an inline function
>> (as we generally prefer)?
> 
> Any particular reason for that (inline vs define)?

Inline functions add type safety for the arguments, which
#define-s don't do.
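
For instance, the macro quoted above could become something along these
lines (a sketch only, not necessarily the exact form of any later revision):

    static inline bool mem_sharing_is_fork(const struct domain *d)
    {
        return mem_sharing_enabled(d) && d->parent;
    }

A caller passing anything other than a struct domain pointer then gets a
compile-time error instead of whatever the macro happens to expand to.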

>>> @@ -532,6 +533,10 @@ struct xen_mem_sharing_op {
>>>  uint32_t gref; /* IN: gref to debug */
>>>  } u;
>>>  } debug;
>>> +struct mem_sharing_op_fork {  /* OP_FORK */
>>> +domid_t parent_domain;/* IN: parent's domain id */
>>> +uint16_t _pad[3]; /* Must be set to 0 */
>>
>> Especially in the public interface - no new name space
>> violations please. I.e. please drop the leading underscore.
>> I also struggle to see why this is an array of three
>> elements. In fact I don't see why the padding field would be
>> needed at all - one other union member only gets padded to
>> its alignment (which is what I'd expect), while others
>> (presumably older ones) don't have any padding at all. Here
>> there's no implicit structure-alignment padding that wants
>> making explicit.
> 
> I don't know what you are asking. Drop the padding? I prefer each
> union member to be padded to 64 bits; it reduces the cognitive load of
> figuring out what the size and alignment of each member struct would be.

Personally I'd suggest dropping the padding, as it actually
grows the size of the structure. But if you feel strongly
about keeping it, then I'll be okay with just the field's
name changed.

Jan


Re: [Xen-devel] [PATCH v11 1/3] xen/mem_sharing: VM forking

2020-03-17 Thread Tamas K Lengyel
On Tue, Mar 17, 2020 at 10:02 AM Jan Beulich  wrote:
>
> On 28.02.2020 19:40, Tamas K Lengyel wrote:
> > --- a/xen/arch/x86/mm/p2m.c
> > +++ b/xen/arch/x86/mm/p2m.c
> > @@ -509,6 +509,12 @@ mfn_t __get_gfn_type_access(struct p2m_domain *p2m, unsigned long gfn_l,
> >
> >  mfn = p2m->get_entry(p2m, gfn, t, a, q, page_order, NULL);
> >
> > +/* Check if we need to fork the page */
> > +if ( (q & P2M_ALLOC) && p2m_is_hole(*t) &&
> > + !mem_sharing_fork_page(p2m->domain, gfn, !!(q & P2M_UNSHARE)) )
>
> No need for !! here.

I don't think it really matters, but sure.

>
> > @@ -588,7 +594,8 @@ struct page_info *p2m_get_page_from_gfn(
> >  return page;
> >
> >  /* Error path: not a suitable GFN at all */
> > -if ( !p2m_is_ram(*t) && !p2m_is_paging(*t) && !p2m_is_pod(*t) )
> > +if ( !p2m_is_ram(*t) && !p2m_is_paging(*t) && !p2m_is_pod(*t) &&
> > + !mem_sharing_is_fork(p2m->domain) )
> >  return NULL;
>
> > This looks like a pretty broad condition, i.e. all possible types would
> make it through here for a fork. Wouldn't it make sense to limit
> > it to p2m_is_hole() page types, like you check for in
> __get_gfn_type_access()?

No need to put that check here. By allowing it to go further when we have
a forked VM, this code-path will call get_gfn_type_access, which does
that check. It's better to have that check in one place instead of all
over unnecessarily.

>
> > --- a/xen/common/domain.c
> > +++ b/xen/common/domain.c
> > @@ -1269,6 +1269,9 @@ int map_vcpu_info(struct vcpu *v, unsigned long gfn, unsigned offset)
> >
> >  v->vcpu_info = new_info;
> >  v->vcpu_info_mfn = page_to_mfn(page);
> > +#ifdef CONFIG_MEM_SHARING
> > +v->vcpu_info_offset = offset;
> > +#endif
>
> Seeing something like this makes me wonder whether forking shouldn't
> have its own Kconfig control.

For now I think it's fine to have it under mem_sharing.

>
> > --- a/xen/include/asm-x86/mem_sharing.h
> > +++ b/xen/include/asm-x86/mem_sharing.h
> > @@ -39,6 +39,9 @@ struct mem_sharing_domain
> >
> >  #define mem_sharing_enabled(d) ((d)->arch.hvm.mem_sharing.enabled)
> >
> > +#define mem_sharing_is_fork(d) \
> > +(mem_sharing_enabled(d) && !!((d)->parent))
>
> > Again no need for !! (for a different reason).

Which is?

>
> Also, would the build break if you made this an inline function
> (as we generally prefer)?

Any particular reason for that (inline vs define)?

>
> > @@ -141,6 +148,16 @@ static inline int mem_sharing_notify_enomem(struct domain *d, unsigned long gfn,
> >  return -EOPNOTSUPP;
> >  }
> >
> > +static inline int mem_sharing_fork(struct domain *d, struct domain *cd, bool vcpu)
> > +{
> > +return -EOPNOTSUPP;
> > +}

This actually is no longer needed at all.

> > +
> > +static inline int mem_sharing_fork_page(struct domain *d, gfn_t gfn, bool lock)
> > +{
> > +return -EOPNOTSUPP;
> > +}
>
> Can these be reached? If not, please add ASSERT_UNREACHABLE().

This can be reached.

>
> > @@ -532,6 +533,10 @@ struct xen_mem_sharing_op {
> >  uint32_t gref; /* IN: gref to debug */
> >  } u;
> >  } debug;
> > +struct mem_sharing_op_fork {  /* OP_FORK */
> > +domid_t parent_domain;/* IN: parent's domain id */
> > +uint16_t _pad[3]; /* Must be set to 0 */
>
> Especially in the public interface - no new name space
> violations please. I.e. please drop the leading underscore.
> I also struggle to see why this is an array of three
> elements. In fact I don't see why the padding field would be
> needed at all - one other union member only gets padded to
> its alignment (which is what I'd expect), while others
> (presumably older ones) don't have any padding at all. Here
> > there's no implicit structure-alignment padding that wants
> making explicit.

I don't know what you are asking. Drop the padding? I prefer each
union member to be padded to 64 bits; it reduces the cognitive load of
figuring out what the size and alignment of each member struct would be.

>
> > --- a/xen/include/xen/sched.h
> > +++ b/xen/include/xen/sched.h
> > @@ -248,6 +248,9 @@ struct vcpu
> >
> >  /* Guest-specified relocation of vcpu_info. */
> >  mfn_tvcpu_info_mfn;
> > +#ifdef CONFIG_MEM_SHARING
> > +uint32_t vcpu_info_offset;
>
> There's no need for a fixed width type here afaics - unsigned
> int and probably even unsigned short would seem to both be
> fine.

OK.

Thanks,
Tamas


Re: [Xen-devel] [PATCH v11 1/3] xen/mem_sharing: VM forking

2020-03-17 Thread Jan Beulich
On 28.02.2020 19:40, Tamas K Lengyel wrote:
> --- a/xen/arch/x86/mm/p2m.c
> +++ b/xen/arch/x86/mm/p2m.c
> @@ -509,6 +509,12 @@ mfn_t __get_gfn_type_access(struct p2m_domain *p2m, unsigned long gfn_l,
>  
>  mfn = p2m->get_entry(p2m, gfn, t, a, q, page_order, NULL);
>  
> +/* Check if we need to fork the page */
> +if ( (q & P2M_ALLOC) && p2m_is_hole(*t) &&
> + !mem_sharing_fork_page(p2m->domain, gfn, !!(q & P2M_UNSHARE)) )

No need for !! here.

> @@ -588,7 +594,8 @@ struct page_info *p2m_get_page_from_gfn(
>  return page;
>  
>  /* Error path: not a suitable GFN at all */
> -if ( !p2m_is_ram(*t) && !p2m_is_paging(*t) && !p2m_is_pod(*t) )
> +if ( !p2m_is_ram(*t) && !p2m_is_paging(*t) && !p2m_is_pod(*t) &&
> + !mem_sharing_is_fork(p2m->domain) )
>  return NULL;

This looks like a pretty broad condition, i.e. all possible types would
make it through here for a fork. Wouldn't it make sense to limit
it to p2m_is_hole() page types, like you check for in
__get_gfn_type_access()?

> --- a/xen/common/domain.c
> +++ b/xen/common/domain.c
> @@ -1269,6 +1269,9 @@ int map_vcpu_info(struct vcpu *v, unsigned long gfn, unsigned offset)
>  
>  v->vcpu_info = new_info;
>  v->vcpu_info_mfn = page_to_mfn(page);
> +#ifdef CONFIG_MEM_SHARING
> +v->vcpu_info_offset = offset;
> +#endif

Seeing something like this makes me wonder whether forking shouldn't
have its own Kconfig control.

> --- a/xen/include/asm-x86/mem_sharing.h
> +++ b/xen/include/asm-x86/mem_sharing.h
> @@ -39,6 +39,9 @@ struct mem_sharing_domain
>  
>  #define mem_sharing_enabled(d) ((d)->arch.hvm.mem_sharing.enabled)
>  
> +#define mem_sharing_is_fork(d) \
> +(mem_sharing_enabled(d) && !!((d)->parent))

Again no need for !! (for a different reason).

Also, would the build break if you made this an inline function
(as we generally prefer)?

> @@ -141,6 +148,16 @@ static inline int mem_sharing_notify_enomem(struct domain *d, unsigned long gfn,
>  return -EOPNOTSUPP;
>  }
>  
> +static inline int mem_sharing_fork(struct domain *d, struct domain *cd, bool vcpu)
> +{
> +return -EOPNOTSUPP;
> +}
> +
> +static inline int mem_sharing_fork_page(struct domain *d, gfn_t gfn, bool lock)
> +{
> +return -EOPNOTSUPP;
> +}

Can these be reached? If not, please add ASSERT_UNREACHABLE().

> @@ -532,6 +533,10 @@ struct xen_mem_sharing_op {
>  uint32_t gref; /* IN: gref to debug */
>  } u;
>  } debug;
> +struct mem_sharing_op_fork {  /* OP_FORK */
> +domid_t parent_domain;/* IN: parent's domain id */
> +uint16_t _pad[3]; /* Must be set to 0 */

Especially in the public interface - no new name space
violations please. I.e. please drop the leading underscore.
I also struggle to see why this is an array of three
elements. In fact I don't see why the padding field would be
needed at all - one other union member only gets padded to
its alignment (which is what I'd expect), while others
(presumably older ones) don't have any padding at all. Here
there's no implicit structure-alignment padding that wants
making explicit.

> --- a/xen/include/xen/sched.h
> +++ b/xen/include/xen/sched.h
> @@ -248,6 +248,9 @@ struct vcpu
>  
>  /* Guest-specified relocation of vcpu_info. */
>  mfn_tvcpu_info_mfn;
> +#ifdef CONFIG_MEM_SHARING
> +uint32_t vcpu_info_offset;

There's no need for a fixed width type here afaics - unsigned
int and probably even unsigned short would seem to both be
fine.

Jan


[Xen-devel] [PATCH v11 1/3] xen/mem_sharing: VM forking

2020-02-28 Thread Tamas K Lengyel
VM forking is the process of creating a domain with an empty memory space and a
specified parent domain from which to populate the memory when necessary. For
the new domain to be functional, the VM state is copied over as part of the fork
operation (HVM params, HAP allocation, etc.).
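
As a rough illustration of how a caller might drive the new sub-op (a hedged
sketch: the sub-op constant and the hypercall wrapper below are assumptions
for illustration, and the union member is assumed to be named 'fork'; only
the parent_domain field and the zeroed pad come from the interface added by
this patch):

    static int fork_domain_sketch(domid_t child, domid_t parent)
    {
        xen_mem_sharing_op_t mso = { 0 };     /* pad[] zeroed, as required */

        mso.op = XENMEM_sharing_op_fork;      /* assumed constant name */
        mso.domain = child;                   /* the empty domain to populate */
        mso.u.fork.parent_domain = parent;    /* IN: parent's domain id */

        return memshr_op_sketch(&mso);        /* hypothetical wrapper */
    }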

Signed-off-by: Tamas K Lengyel 
---
v11: Fully copy vcpu_info pages
 Set up vcpu_runstate for forks
 Added TODO note for PV timers
 Copy shared_info page
 Add copy_settings function, to be shared with fork_reset in the next patch
---
 xen/arch/x86/domain.c |  11 +
 xen/arch/x86/hvm/hvm.c|   4 +-
 xen/arch/x86/mm/hap/hap.c |   3 +-
 xen/arch/x86/mm/mem_sharing.c | 368 ++
 xen/arch/x86/mm/p2m.c |   9 +-
 xen/common/domain.c   |   3 +
 xen/include/asm-x86/hap.h |   1 +
 xen/include/asm-x86/hvm/hvm.h |   2 +
 xen/include/asm-x86/mem_sharing.h |  17 ++
 xen/include/public/memory.h   |   5 +
 xen/include/xen/sched.h   |   5 +
 11 files changed, 423 insertions(+), 5 deletions(-)

diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index fe63c23676..1ab0ca0942 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -2203,6 +2203,17 @@ int domain_relinquish_resources(struct domain *d)
 ret = relinquish_shared_pages(d);
 if ( ret )
 return ret;
+
+/*
+ * If the domain is forked, decrement the parent's pause count
+ * and release the domain.
+ */
+if ( d->parent )
+{
+domain_unpause(d->parent);
+put_domain(d->parent);
+d->parent = NULL;
+}
 }
 #endif
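
Aside: a hedged sketch, not part of this hunk, of the acquire side that the
teardown above balances; names and error handling are assumptions, and the
exact placement in the fork path is not shown in this excerpt:

    /* cd: the new fork, parent: the domain being forked (names assumed). */
    if ( !get_domain(parent) )          /* keep the parent alive for the fork */
        return -EBUSY;                  /* error value is an assumption */
    domain_pause(parent);               /* balanced by domain_unpause() above */
    cd->parent = parent;                /* cleared again by the hunk above */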
 
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index db5d7b4d30..c551cc31f3 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -1916,7 +1916,7 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
 }
 #endif
 
-/* Spurious fault? PoD and log-dirty also take this path. */
+/* Spurious fault? PoD, log-dirty and VM forking also take this path. */
 if ( p2m_is_ram(p2mt) )
 {
 rc = 1;
@@ -4430,7 +4430,7 @@ static int hvm_allow_get_param(struct domain *d,
 return rc;
 }
 
-static int hvm_get_param(struct domain *d, uint32_t index, uint64_t *value)
+int hvm_get_param(struct domain *d, uint32_t index, uint64_t *value)
 {
 int rc;
 
diff --git a/xen/arch/x86/mm/hap/hap.c b/xen/arch/x86/mm/hap/hap.c
index 3d93f3451c..c7c7ff6e99 100644
--- a/xen/arch/x86/mm/hap/hap.c
+++ b/xen/arch/x86/mm/hap/hap.c
@@ -321,8 +321,7 @@ static void hap_free_p2m_page(struct domain *d, struct page_info *pg)
 }
 
 /* Return the size of the pool, rounded up to the nearest MB */
-static unsigned int
-hap_get_allocation(struct domain *d)
+unsigned int hap_get_allocation(struct domain *d)
 {
 unsigned int pg = d->arch.paging.hap.total_pages
 + d->arch.paging.hap.p2m_pages;
diff --git a/xen/arch/x86/mm/mem_sharing.c b/xen/arch/x86/mm/mem_sharing.c
index 3835bc928f..4a840d09dd 100644
--- a/xen/arch/x86/mm/mem_sharing.c
+++ b/xen/arch/x86/mm/mem_sharing.c
@@ -22,6 +22,7 @@
 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -36,6 +37,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 #include 
 
 #include "mm-locks.h"
@@ -1444,6 +1447,334 @@ static inline int mem_sharing_control(struct domain *d, bool enable)
 return 0;
 }
 
+/*
+ * Forking a page only gets called when the VM faults due to there being
+ * no entry in the EPT for the access. Depending on the type of access, we
+ * either populate the physmap with a shared entry for a read-only access
+ * or fork the page if it's a write access.
+ *
+ * The client p2m is already locked, so we only need to lock
+ * the parent's here.
+ */
+int mem_sharing_fork_page(struct domain *d, gfn_t gfn, bool unsharing)
+{
+int rc = -ENOENT;
+shr_handle_t handle;
+struct domain *parent = d->parent;
+struct p2m_domain *p2m;
+unsigned long gfn_l = gfn_x(gfn);
+mfn_t mfn, new_mfn;
+p2m_type_t p2mt;
+struct page_info *page;
+
+if ( !mem_sharing_is_fork(d) )
+return -ENOENT;
+
+if ( !unsharing )
+{
+/* For read-only accesses we just add a shared entry to the physmap */
+while ( parent )
+{
+if ( !(rc = nominate_page(parent, gfn, 0, &handle)) )
+break;
+
+parent = parent->parent;
+}
+
+if ( !rc )
+{
+/* The client's p2m is already locked */
+struct p2m_domain *pp2m = p2m_get_hostp2m(parent);
+
+p2m_lock(pp2m);
+rc = add_to_physmap(parent, gfn_l, handle, d, gfn_l, false);
+p2m_unlock(pp2m);
+
+if ( !rc )
+return 0;
+}
+}
+
+/*
+ * If it's a write access (i.e. unsharing) or if adding a shared entry to
+ * the physmap failed, we'll