Re: [Xen-devel] [PATCH v11 1/3] xen/mem_sharing: VM forking
On Tue, Mar 17, 2020 at 10:35 AM Jan Beulich wrote: > > On 17.03.2020 17:23, Tamas K Lengyel wrote: > > On Tue, Mar 17, 2020 at 10:02 AM Jan Beulich wrote: > >> On 28.02.2020 19:40, Tamas K Lengyel wrote: > >>> @@ -588,7 +594,8 @@ struct page_info *p2m_get_page_from_gfn( > >>> return page; > >>> > >>> /* Error path: not a suitable GFN at all */ > >>> -if ( !p2m_is_ram(*t) && !p2m_is_paging(*t) && !p2m_is_pod(*t) ) > >>> +if ( !p2m_is_ram(*t) && !p2m_is_paging(*t) && !p2m_is_pod(*t) && > >>> + !mem_sharing_is_fork(p2m->domain) ) > >>> return NULL; > >> > >> This looks pretty broad a condition, i.e. all possible types would > >> make it through here for a fork. Wouldn't it make sense to limit > >> to to p2m_is_hole() page types, like you check for in > >> __get_gfn_type_access()? > > > > No need to put that check here. By allowing to go further when we have > > a forked VM, this code-path will call get_gfn_type_access, which does > > that check. It's better to have that check at one place instead of all > > over unnecessarily. > > Well, if worse performance (due to more cases where the lock will > be taken) is not of concern - so be it. > > >>> --- a/xen/include/asm-x86/mem_sharing.h > >>> +++ b/xen/include/asm-x86/mem_sharing.h > >>> @@ -39,6 +39,9 @@ struct mem_sharing_domain > >>> > >>> #define mem_sharing_enabled(d) ((d)->arch.hvm.mem_sharing.enabled) > >>> > >>> +#define mem_sharing_is_fork(d) \ > >>> +(mem_sharing_enabled(d) && !!((d)->parent)) > >> > >> Again not need for !! (for a different reason). > > > > Which is? > > Further up the reason was that you pass the value as argument > for a boolean function parameter. Here the reason is that is an > operand of &&. > > >> Also, does the build break if you made this an inline function > >> (as we generally prefer)? > > > > Any particular reason for that (inline vs define)? > > Inline functions add type safety for the arguments, which > #define-s don't do. Ack. > > >>> @@ -532,6 +533,10 @@ struct xen_mem_sharing_op { > >>> uint32_t gref; /* IN: gref to debug */ > >>> } u; > >>> } debug; > >>> +struct mem_sharing_op_fork { /* OP_FORK */ > >>> +domid_t parent_domain;/* IN: parent's domain id */ > >>> +uint16_t _pad[3]; /* Must be set to 0 */ > >> > >> Especially in the public interface - no new name space > >> violations please. I.e. please drop the leading underscore. > >> I also struggle to see why this is an array of three > >> elements. In fact I don't see why the padding field would be > >> needed at all - one other union member only gets padded to > >> its alignment (which is what I'd expect), while others > >> (presumably older ones) don't have any padding at all. Here > >> there's no implicit structure's alignment padding that wants > >> making explicit. > > > > I don't know what you are asking. Drop the padding? I prefer each > > union member to be padded to 64-bit, reduces cognitive load trying to > > figure out what the size and alginment of each member struct would be. > > Personally I'd suggest to drop the padding, as it actually > grows the size of the structure. But if you feel strongly > about keeping it, then I'll be okay with just the field's > name changed. It grows the structure size to 64-bit, yes, but it doesn't grow the size of union as other members are much larger. I'll remove the underscore from the pad name but I still prefer it aligned. Tamas ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v11 1/3] xen/mem_sharing: VM forking
On 17.03.2020 17:23, Tamas K Lengyel wrote: > On Tue, Mar 17, 2020 at 10:02 AM Jan Beulich wrote: >> On 28.02.2020 19:40, Tamas K Lengyel wrote: >>> @@ -588,7 +594,8 @@ struct page_info *p2m_get_page_from_gfn( >>> return page; >>> >>> /* Error path: not a suitable GFN at all */ >>> -if ( !p2m_is_ram(*t) && !p2m_is_paging(*t) && !p2m_is_pod(*t) ) >>> +if ( !p2m_is_ram(*t) && !p2m_is_paging(*t) && !p2m_is_pod(*t) && >>> + !mem_sharing_is_fork(p2m->domain) ) >>> return NULL; >> >> This looks pretty broad a condition, i.e. all possible types would >> make it through here for a fork. Wouldn't it make sense to limit >> to to p2m_is_hole() page types, like you check for in >> __get_gfn_type_access()? > > No need to put that check here. By allowing to go further when we have > a forked VM, this code-path will call get_gfn_type_access, which does > that check. It's better to have that check at one place instead of all > over unnecessarily. Well, if worse performance (due to more cases where the lock will be taken) is not of concern - so be it. >>> --- a/xen/include/asm-x86/mem_sharing.h >>> +++ b/xen/include/asm-x86/mem_sharing.h >>> @@ -39,6 +39,9 @@ struct mem_sharing_domain >>> >>> #define mem_sharing_enabled(d) ((d)->arch.hvm.mem_sharing.enabled) >>> >>> +#define mem_sharing_is_fork(d) \ >>> +(mem_sharing_enabled(d) && !!((d)->parent)) >> >> Again not need for !! (for a different reason). > > Which is? Further up the reason was that you pass the value as argument for a boolean function parameter. Here the reason is that is an operand of &&. >> Also, does the build break if you made this an inline function >> (as we generally prefer)? > > Any particular reason for that (inline vs define)? Inline functions add type safety for the arguments, which #define-s don't do. >>> @@ -532,6 +533,10 @@ struct xen_mem_sharing_op { >>> uint32_t gref; /* IN: gref to debug */ >>> } u; >>> } debug; >>> +struct mem_sharing_op_fork { /* OP_FORK */ >>> +domid_t parent_domain;/* IN: parent's domain id */ >>> +uint16_t _pad[3]; /* Must be set to 0 */ >> >> Especially in the public interface - no new name space >> violations please. I.e. please drop the leading underscore. >> I also struggle to see why this is an array of three >> elements. In fact I don't see why the padding field would be >> needed at all - one other union member only gets padded to >> its alignment (which is what I'd expect), while others >> (presumably older ones) don't have any padding at all. Here >> there's no implicit structure's alignment padding that wants >> making explicit. > > I don't know what you are asking. Drop the padding? I prefer each > union member to be padded to 64-bit, reduces cognitive load trying to > figure out what the size and alginment of each member struct would be. Personally I'd suggest to drop the padding, as it actually grows the size of the structure. But if you feel strongly about keeping it, then I'll be okay with just the field's name changed. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v11 1/3] xen/mem_sharing: VM forking
On Tue, Mar 17, 2020 at 10:02 AM Jan Beulich wrote: > > On 28.02.2020 19:40, Tamas K Lengyel wrote: > > --- a/xen/arch/x86/mm/p2m.c > > +++ b/xen/arch/x86/mm/p2m.c > > @@ -509,6 +509,12 @@ mfn_t __get_gfn_type_access(struct p2m_domain *p2m, > > unsigned long gfn_l, > > > > mfn = p2m->get_entry(p2m, gfn, t, a, q, page_order, NULL); > > > > +/* Check if we need to fork the page */ > > +if ( (q & P2M_ALLOC) && p2m_is_hole(*t) && > > + !mem_sharing_fork_page(p2m->domain, gfn, !!(q & P2M_UNSHARE)) ) > > No need for !! here. I don't think it really matters but sure. > > > @@ -588,7 +594,8 @@ struct page_info *p2m_get_page_from_gfn( > > return page; > > > > /* Error path: not a suitable GFN at all */ > > -if ( !p2m_is_ram(*t) && !p2m_is_paging(*t) && !p2m_is_pod(*t) ) > > +if ( !p2m_is_ram(*t) && !p2m_is_paging(*t) && !p2m_is_pod(*t) && > > + !mem_sharing_is_fork(p2m->domain) ) > > return NULL; > > This looks pretty broad a condition, i.e. all possible types would > make it through here for a fork. Wouldn't it make sense to limit > to to p2m_is_hole() page types, like you check for in > __get_gfn_type_access()? No need to put that check here. By allowing to go further when we have a forked VM, this code-path will call get_gfn_type_access, which does that check. It's better to have that check at one place instead of all over unnecessarily. > > > --- a/xen/common/domain.c > > +++ b/xen/common/domain.c > > @@ -1269,6 +1269,9 @@ int map_vcpu_info(struct vcpu *v, unsigned long gfn, > > unsigned offset) > > > > v->vcpu_info = new_info; > > v->vcpu_info_mfn = page_to_mfn(page); > > +#ifdef CONFIG_MEM_SHARING > > +v->vcpu_info_offset = offset; > > +#endif > > Seeing something like this makes me wonder whether forking shouldn't > have its own Kconfig control. For now I think it's fine to have it under mem_sharing. > > > --- a/xen/include/asm-x86/mem_sharing.h > > +++ b/xen/include/asm-x86/mem_sharing.h > > @@ -39,6 +39,9 @@ struct mem_sharing_domain > > > > #define mem_sharing_enabled(d) ((d)->arch.hvm.mem_sharing.enabled) > > > > +#define mem_sharing_is_fork(d) \ > > +(mem_sharing_enabled(d) && !!((d)->parent)) > > Again not need for !! (for a different reason). Which is? > > Also, does the build break if you made this an inline function > (as we generally prefer)? Any particular reason for that (inline vs define)? > > > @@ -141,6 +148,16 @@ static inline int mem_sharing_notify_enomem(struct > > domain *d, unsigned long gfn, > > return -EOPNOTSUPP; > > } > > > > +static inline int mem_sharing_fork(struct domain *d, struct domain *cd, > > bool vcpu) > > +{ > > +return -EOPNOTSUPP; > > +} This actually is no longer needed at all. > > + > > +static inline int mem_sharing_fork_page(struct domain *d, gfn_t gfn, bool > > lock) > > +{ > > +return -EOPNOTSUPP; > > +} > > Can these be reached? If not, please add ASSERT_UNREACHABLE(). This can be reached. > > > @@ -532,6 +533,10 @@ struct xen_mem_sharing_op { > > uint32_t gref; /* IN: gref to debug */ > > } u; > > } debug; > > +struct mem_sharing_op_fork { /* OP_FORK */ > > +domid_t parent_domain;/* IN: parent's domain id */ > > +uint16_t _pad[3]; /* Must be set to 0 */ > > Especially in the public interface - no new name space > violations please. I.e. please drop the leading underscore. > I also struggle to see why this is an array of three > elements. In fact I don't see why the padding field would be > needed at all - one other union member only gets padded to > its alignment (which is what I'd expect), while others > (presumably older ones) don't have any padding at all. Here > there's no implicit structure's alignment padding that wants > making explicit. I don't know what you are asking. Drop the padding? I prefer each union member to be padded to 64-bit, reduces cognitive load trying to figure out what the size and alginment of each member struct would be. > > > --- a/xen/include/xen/sched.h > > +++ b/xen/include/xen/sched.h > > @@ -248,6 +248,9 @@ struct vcpu > > > > /* Guest-specified relocation of vcpu_info. */ > > mfn_tvcpu_info_mfn; > > +#ifdef CONFIG_MEM_SHARING > > +uint32_t vcpu_info_offset; > > There's no need for a fixed width type here afaics - unsigned > int and probably even unsigned short would seem to both be > fine. OK. Thanks, Tamas ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v11 1/3] xen/mem_sharing: VM forking
On 28.02.2020 19:40, Tamas K Lengyel wrote: > --- a/xen/arch/x86/mm/p2m.c > +++ b/xen/arch/x86/mm/p2m.c > @@ -509,6 +509,12 @@ mfn_t __get_gfn_type_access(struct p2m_domain *p2m, > unsigned long gfn_l, > > mfn = p2m->get_entry(p2m, gfn, t, a, q, page_order, NULL); > > +/* Check if we need to fork the page */ > +if ( (q & P2M_ALLOC) && p2m_is_hole(*t) && > + !mem_sharing_fork_page(p2m->domain, gfn, !!(q & P2M_UNSHARE)) ) No need for !! here. > @@ -588,7 +594,8 @@ struct page_info *p2m_get_page_from_gfn( > return page; > > /* Error path: not a suitable GFN at all */ > -if ( !p2m_is_ram(*t) && !p2m_is_paging(*t) && !p2m_is_pod(*t) ) > +if ( !p2m_is_ram(*t) && !p2m_is_paging(*t) && !p2m_is_pod(*t) && > + !mem_sharing_is_fork(p2m->domain) ) > return NULL; This looks pretty broad a condition, i.e. all possible types would make it through here for a fork. Wouldn't it make sense to limit to to p2m_is_hole() page types, like you check for in __get_gfn_type_access()? > --- a/xen/common/domain.c > +++ b/xen/common/domain.c > @@ -1269,6 +1269,9 @@ int map_vcpu_info(struct vcpu *v, unsigned long gfn, > unsigned offset) > > v->vcpu_info = new_info; > v->vcpu_info_mfn = page_to_mfn(page); > +#ifdef CONFIG_MEM_SHARING > +v->vcpu_info_offset = offset; > +#endif Seeing something like this makes me wonder whether forking shouldn't have its own Kconfig control. > --- a/xen/include/asm-x86/mem_sharing.h > +++ b/xen/include/asm-x86/mem_sharing.h > @@ -39,6 +39,9 @@ struct mem_sharing_domain > > #define mem_sharing_enabled(d) ((d)->arch.hvm.mem_sharing.enabled) > > +#define mem_sharing_is_fork(d) \ > +(mem_sharing_enabled(d) && !!((d)->parent)) Again not need for !! (for a different reason). Also, does the build break if you made this an inline function (as we generally prefer)? > @@ -141,6 +148,16 @@ static inline int mem_sharing_notify_enomem(struct > domain *d, unsigned long gfn, > return -EOPNOTSUPP; > } > > +static inline int mem_sharing_fork(struct domain *d, struct domain *cd, bool > vcpu) > +{ > +return -EOPNOTSUPP; > +} > + > +static inline int mem_sharing_fork_page(struct domain *d, gfn_t gfn, bool > lock) > +{ > +return -EOPNOTSUPP; > +} Can these be reached? If not, please add ASSERT_UNREACHABLE(). > @@ -532,6 +533,10 @@ struct xen_mem_sharing_op { > uint32_t gref; /* IN: gref to debug */ > } u; > } debug; > +struct mem_sharing_op_fork { /* OP_FORK */ > +domid_t parent_domain;/* IN: parent's domain id */ > +uint16_t _pad[3]; /* Must be set to 0 */ Especially in the public interface - no new name space violations please. I.e. please drop the leading underscore. I also struggle to see why this is an array of three elements. In fact I don't see why the padding field would be needed at all - one other union member only gets padded to its alignment (which is what I'd expect), while others (presumably older ones) don't have any padding at all. Here there's no implicit structure's alignment padding that wants making explicit. > --- a/xen/include/xen/sched.h > +++ b/xen/include/xen/sched.h > @@ -248,6 +248,9 @@ struct vcpu > > /* Guest-specified relocation of vcpu_info. */ > mfn_tvcpu_info_mfn; > +#ifdef CONFIG_MEM_SHARING > +uint32_t vcpu_info_offset; There's no need for a fixed width type here afaics - unsigned int and probably even unsigned short would seem to both be fine. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH v11 1/3] xen/mem_sharing: VM forking
VM forking is the process of creating a domain with an empty memory space and a parent domain specified from which to populate the memory when necessary. For the new domain to be functional the VM state is copied over as part of the fork operation (HVM params, hap allocation, etc). Signed-off-by: Tamas K Lengyel --- v11: Fully copy vcpu_info pages Setup vcpu_runstate for forks Added TODO note for PV timers Copy shared_info page Add copy_settings function, to be shared with fork_reset in the next patch --- xen/arch/x86/domain.c | 11 + xen/arch/x86/hvm/hvm.c| 4 +- xen/arch/x86/mm/hap/hap.c | 3 +- xen/arch/x86/mm/mem_sharing.c | 368 ++ xen/arch/x86/mm/p2m.c | 9 +- xen/common/domain.c | 3 + xen/include/asm-x86/hap.h | 1 + xen/include/asm-x86/hvm/hvm.h | 2 + xen/include/asm-x86/mem_sharing.h | 17 ++ xen/include/public/memory.h | 5 + xen/include/xen/sched.h | 5 + 11 files changed, 423 insertions(+), 5 deletions(-) diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c index fe63c23676..1ab0ca0942 100644 --- a/xen/arch/x86/domain.c +++ b/xen/arch/x86/domain.c @@ -2203,6 +2203,17 @@ int domain_relinquish_resources(struct domain *d) ret = relinquish_shared_pages(d); if ( ret ) return ret; + +/* + * If the domain is forked, decrement the parent's pause count + * and release the domain. + */ +if ( d->parent ) +{ +domain_unpause(d->parent); +put_domain(d->parent); +d->parent = NULL; +} } #endif diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c index db5d7b4d30..c551cc31f3 100644 --- a/xen/arch/x86/hvm/hvm.c +++ b/xen/arch/x86/hvm/hvm.c @@ -1916,7 +1916,7 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla, } #endif -/* Spurious fault? PoD and log-dirty also take this path. */ +/* Spurious fault? PoD, log-dirty and VM forking also take this path. */ if ( p2m_is_ram(p2mt) ) { rc = 1; @@ -4430,7 +4430,7 @@ static int hvm_allow_get_param(struct domain *d, return rc; } -static int hvm_get_param(struct domain *d, uint32_t index, uint64_t *value) +int hvm_get_param(struct domain *d, uint32_t index, uint64_t *value) { int rc; diff --git a/xen/arch/x86/mm/hap/hap.c b/xen/arch/x86/mm/hap/hap.c index 3d93f3451c..c7c7ff6e99 100644 --- a/xen/arch/x86/mm/hap/hap.c +++ b/xen/arch/x86/mm/hap/hap.c @@ -321,8 +321,7 @@ static void hap_free_p2m_page(struct domain *d, struct page_info *pg) } /* Return the size of the pool, rounded up to the nearest MB */ -static unsigned int -hap_get_allocation(struct domain *d) +unsigned int hap_get_allocation(struct domain *d) { unsigned int pg = d->arch.paging.hap.total_pages + d->arch.paging.hap.p2m_pages; diff --git a/xen/arch/x86/mm/mem_sharing.c b/xen/arch/x86/mm/mem_sharing.c index 3835bc928f..4a840d09dd 100644 --- a/xen/arch/x86/mm/mem_sharing.c +++ b/xen/arch/x86/mm/mem_sharing.c @@ -22,6 +22,7 @@ #include #include +#include #include #include #include @@ -36,6 +37,8 @@ #include #include #include +#include +#include #include #include "mm-locks.h" @@ -1444,6 +1447,334 @@ static inline int mem_sharing_control(struct domain *d, bool enable) return 0; } +/* + * Forking a page only gets called when the VM faults due to no entry being + * in the EPT for the access. Depending on the type of access we either + * populate the physmap with a shared entry for read-only access or + * fork the page if its a write access. + * + * The client p2m is already locked so we only need to lock + * the parent's here. + */ +int mem_sharing_fork_page(struct domain *d, gfn_t gfn, bool unsharing) +{ +int rc = -ENOENT; +shr_handle_t handle; +struct domain *parent = d->parent; +struct p2m_domain *p2m; +unsigned long gfn_l = gfn_x(gfn); +mfn_t mfn, new_mfn; +p2m_type_t p2mt; +struct page_info *page; + +if ( !mem_sharing_is_fork(d) ) +return -ENOENT; + +if ( !unsharing ) +{ +/* For read-only accesses we just add a shared entry to the physmap */ +while ( parent ) +{ +if ( !(rc = nominate_page(parent, gfn, 0, )) ) +break; + +parent = parent->parent; +} + +if ( !rc ) +{ +/* The client's p2m is already locked */ +struct p2m_domain *pp2m = p2m_get_hostp2m(parent); + +p2m_lock(pp2m); +rc = add_to_physmap(parent, gfn_l, handle, d, gfn_l, false); +p2m_unlock(pp2m); + +if ( !rc ) +return 0; +} +} + +/* + * If it's a write access (ie. unsharing) or if adding a shared entry to + * the physmap failed we'll