[Xen-devel] Ping: [PATCH RFC v2] x86/domctl: Don't pause the whole domain if only getting vcpu state
Any thoughts appreciated. On Vi, 2017-10-06 at 13:02 +0300, Alexandru Isaila wrote: > This patch adds the hvm_save_one_cpu_ctxt() function. > It optimizes by only pausing the vcpu on all HVMSR_PER_VCPU save > callbacks where only data for one VCPU is required. > > Signed-off-by: Alexandru Isaila > > --- > Changes since V1: > - Integrated the vcpu check into all the save callbacks > --- > tools/tests/vhpet/emul.h | 3 +- > tools/tests/vhpet/main.c | 2 +- > xen/arch/x86/cpu/mcheck/vmce.c | 16 ++- > xen/arch/x86/domctl.c | 2 - > xen/arch/x86/hvm/hpet.c| 2 +- > xen/arch/x86/hvm/hvm.c | 280 ++- > -- > xen/arch/x86/hvm/i8254.c | 2 +- > xen/arch/x86/hvm/irq.c | 6 +- > xen/arch/x86/hvm/mtrr.c| 32 - > xen/arch/x86/hvm/pmtimer.c | 2 +- > xen/arch/x86/hvm/rtc.c | 2 +- > xen/arch/x86/hvm/save.c| 71 --- > xen/arch/x86/hvm/vioapic.c | 2 +- > xen/arch/x86/hvm/viridian.c| 17 ++- > xen/arch/x86/hvm/vlapic.c | 23 +++- > xen/arch/x86/hvm/vpic.c| 2 +- > xen/include/asm-x86/hvm/hvm.h | 2 + > xen/include/asm-x86/hvm/save.h | 5 +- > 18 files changed, 324 insertions(+), 147 deletions(-) > > diff --git a/tools/tests/vhpet/emul.h b/tools/tests/vhpet/emul.h > index 383acff..99d5bbd 100644 > --- a/tools/tests/vhpet/emul.h > +++ b/tools/tests/vhpet/emul.h > @@ -296,7 +296,8 @@ struct hvm_hw_hpet > }; > > typedef int (*hvm_save_handler)(struct domain *d, > -hvm_domain_context_t *h); > +hvm_domain_context_t *h, > +unsigned int instance); > typedef int (*hvm_load_handler)(struct domain *d, > hvm_domain_context_t *h); > > diff --git a/tools/tests/vhpet/main.c b/tools/tests/vhpet/main.c > index 6fe65ea..3d8e7f5 100644 > --- a/tools/tests/vhpet/main.c > +++ b/tools/tests/vhpet/main.c > @@ -177,7 +177,7 @@ void __init hvm_register_savevm(uint16_t > typecode, > > int do_save(uint16_t typecode, struct domain *d, > hvm_domain_context_t *h) > { > -return hvm_sr_handlers[typecode].save(d, h); > +return hvm_sr_handlers[typecode].save(d, h, d->max_vcpus); > } > > int do_load(uint16_t typecode, struct domain 
*d, > hvm_domain_context_t *h) > diff --git a/xen/arch/x86/cpu/mcheck/vmce.c > b/xen/arch/x86/cpu/mcheck/vmce.c > index e07cd2f..a1a12a5 100644 > --- a/xen/arch/x86/cpu/mcheck/vmce.c > +++ b/xen/arch/x86/cpu/mcheck/vmce.c > @@ -349,12 +349,24 @@ int vmce_wrmsr(uint32_t msr, uint64_t val) > return ret; > } > > -static int vmce_save_vcpu_ctxt(struct domain *d, > hvm_domain_context_t *h) > +static int vmce_save_vcpu_ctxt(struct domain *d, > hvm_domain_context_t *h, unsigned int instance) > { > struct vcpu *v; > int err = 0; > > -for_each_vcpu ( d, v ) > +if( instance < d->max_vcpus ) > +{ > +struct hvm_vmce_vcpu ctxt; > + > +v = d->vcpu[instance]; > +ctxt.caps = v->arch.vmce.mcg_cap; > +ctxt.mci_ctl2_bank0 = v->arch.vmce.bank[0].mci_ctl2; > +ctxt.mci_ctl2_bank1 = v->arch.vmce.bank[1].mci_ctl2; > +ctxt.mcg_ext_ctl = v->arch.vmce.mcg_ext_ctl; > + > +err = hvm_save_entry(VMCE_VCPU, v->vcpu_id, h, &ctxt); > +} > +else for_each_vcpu ( d, v ) > { > struct hvm_vmce_vcpu ctxt = { > .caps = v->arch.vmce.mcg_cap, > diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c > index 540ba08..d3c4e14 100644 > --- a/xen/arch/x86/domctl.c > +++ b/xen/arch/x86/domctl.c > @@ -624,12 +624,10 @@ long arch_do_domctl( > !is_hvm_domain(d) ) > break; > > -domain_pause(d); > ret = hvm_save_one(d, domctl->u.hvmcontext_partial.type, > domctl->u.hvmcontext_partial.instance, > domctl->u.hvmcontext_partial.buffer, > &domctl->u.hvmcontext_partial.bufsz); > -domain_unpause(d); > > if ( !ret ) > copyback = true; > diff --git a/xen/arch/x86/hvm/hpet.c b/xen/arch/x86/hvm/hpet.c > index 3ea895a..56f4691 100644 > --- a/xen/arch/x86/hvm/hpet.c > +++ b/xen/arch/x86/hvm/hpet.c > @@ -509,7 +509,7 @@ static const struct hvm_mmio_ops hpet_mmio_ops = > { > }; > > > -static int hpet_save(struct domain *d, hvm_domain_context_t *h) > +static int hpet_save(struct domain *d, hvm_domain_context_t *h, > unsigned int instance) > { > HPETState *hp = domain_vhpet(d); > struct vcpu *v = pt_global_vcpu_target(d); > diff 
--git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c > index 205b4cb..140f2c3 100644 > --- a/xen/arch/x86/hvm/hvm.c > +++ b/xen/arch/x86/hvm/hvm.c > @@ -728,13 +728,19 @@ void hvm_domain_destroy(struct domain *d) > } > } > > -static int hvm_save_tsc_adjust(struct domain *d, > hvm_domain_context_t *h) > +static int hvm_save_tsc_adjust(struct domain *d, > hvm_
Re: [Xen-devel] [PATCH v9] x86/hvm: Implement hvmemul_write() using real mappings
> I'd be fine taking care of all the comments while committing (and > then adding my R-b), provided you (and ideally also Andrew) > agree, and of course assuming Paul would ack the patch, plus > no-one else finds yet another problem which once again I may > have overlooked. > Hi Jan, Thank you for your help and I agree with the suggested changes. Regards, Alex This email was scanned by Bitdefender ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v5] x86/hvm: Implement hvmemul_write() using real mappings
On Wed, 2017-09-27 at 09:38 +0100, Andrew Cooper wrote: > On 27/09/2017 09:04, Alexandru Isaila wrote: > > > > From: Andrew Cooper > > > > > > -return X86EMUL_EXCEPTION; > > -case HVMTRANS_bad_gfn_to_mfn: > > -return hvmemul_linear_mmio_write(addr, bytes, p_data, > > pfec, hvmemul_ctxt, 0); > Where has the if ( !mapping ) test gone? The HVMTRANS_bad_gfn_to_mfn > case needs handling. There was a comment from Jan on V2. NOTE: v1 comment: 'Pointless "else".'
Re: [Xen-devel] [PATCH v4 3/3] x86/hvm: Implement hvmemul_write() using real mappings
On Mi, 2017-09-20 at 14:37 +, Paul Durrant wrote: > > > > -Original Message- > > From: Jan Beulich [mailto:jbeul...@suse.com] > > Sent: 20 September 2017 13:24 > > To: Alexandru Isaila > > Cc: suravee.suthikulpa...@amd.com; Andrew Cooper > > ; Paul Durrant > > ; > > Wei Liu ; George Dunlap > com>; > > Ian Jackson ; jun.nakaj...@intel.com; Kevin > > Tian > > ; sstabell...@kernel.org; xen-de...@lists.xen > > .org; > > boris.ostrov...@oracle.com; konrad.w...@oracle.com; Tim (Xen.org) > > > > Subject: Re: [PATCH v4 3/3] x86/hvm: Implement hvmemul_write() > > using > > real mappings > > > > > > > > > > > > > > > > > > > On 20.09.17 at 11:22, wrote: > > > +static void *hvmemul_map_linear_addr( > > > +unsigned long linear, unsigned int bytes, uint32_t pfec, > > > +struct hvm_emulate_ctxt *hvmemul_ctxt) > > > +{ > > > +struct vcpu *curr = current; > > > +void *err, *mapping; > > > + > > > +/* First and final gfns which need mapping. */ > > > +unsigned long frame = linear >> PAGE_SHIFT, first = frame; > > > +unsigned long final = (linear + bytes - !!bytes) >> > > > PAGE_SHIFT; > > > + > > > +/* > > > + * mfn points to the next free slot. All used slots have a > > > page reference > > > + * held on them. > > > + */ > > > +mfn_t *mfn = &hvmemul_ctxt->mfn[0]; > > > + > > > +/* > > > + * The caller has no legitimate reason for trying a zero- > > > byte write, but > > > + * final is calculate to fail safe in release builds. > > > + * > > > + * The maximum write size depends on the number of adjacent > > > mfns[] > > which > > > > > > + * can be vmap()'d, accouting for possible misalignment > > > within the > > region. > > > > > > + * The higher level emulation callers are responsible for > > > ensuring that > > > + * mfns[] is large enough for the requested write size. 
> > > + */ > > > +if ( bytes == 0 || > > > + final - first >= ARRAY_SIZE(hvmemul_ctxt->mfn) ) > > > +{ > > > +ASSERT_UNREACHABLE(); > > > +goto unhandleable; > > > +} > > > + > > > +do { > > > +enum hvm_translation_result res; > > > +struct page_info *page; > > > +pagefault_info_t pfinfo; > > > +p2m_type_t p2mt; > > > + > > > +/* Error checking. Confirm that the current slot is > > > clean. */ > > > +ASSERT(mfn_x(*mfn) == 0); > > > + > > > +res = hvm_translate_get_page(curr, frame << PAGE_SHIFT, > > > true, > > pfec, > > > > > > + &pfinfo, &page, NULL, > > > &p2mt); > > > + > > > +switch ( res ) > > > +{ > > > +case HVMTRANS_okay: > > > +break; > > > + > > > +case HVMTRANS_bad_linear_to_gfn: > > > +x86_emul_pagefault(pfinfo.ec, pfinfo.linear, > > > &hvmemul_ctxt- > > > ctxt); > > > +err = ERR_PTR(~X86EMUL_EXCEPTION); > > > +goto out; > > > + > > > +case HVMTRANS_bad_gfn_to_mfn: > > > +err = NULL; > > > +goto out; > > > + > > > +case HVMTRANS_gfn_paged_out: > > > +case HVMTRANS_gfn_shared: > > > +err = ERR_PTR(~X86EMUL_RETRY); > > > +goto out; > > > + > > > +default: > > > +goto unhandleable; > > > +} > > > + > > > +if ( p2m_is_discard_write(p2mt) ) > > > +{ > > > +err = ERR_PTR(~X86EMUL_OKAY); > > > +goto out; > > > +} > > > + > > > +*mfn++ = _mfn(page_to_mfn(page)); > > > + > > > +} while ( ++frame < final ); > > Interesting - I had specifically pointed out in a reply to v3 that > > the > > increment of mfn _cannot_ be moved down here: You're now > > leaking a page ref on the p2m_is_discard_write() error path afaict. > It could be left here if a put_page() is added to the above error > path, which I'd clearly deluded myself was already there. I think it's clearer to move it back. Alex
Re: [Xen-devel] [PATCH v4 3/3] x86/hvm: Implement hvmemul_write() using real mappings
On Mi, 2017-09-20 at 06:24 -0600, Jan Beulich wrote: > > > > > > > > > > > > > On 20.09.17 at 11:22, wrote: > > +static void *hvmemul_map_linear_addr( > > +unsigned long linear, unsigned int bytes, uint32_t pfec, > > +struct hvm_emulate_ctxt *hvmemul_ctxt) > > +{ > > +struct vcpu *curr = current; > > +void *err, *mapping; > > + > > +/* First and final gfns which need mapping. */ > > +unsigned long frame = linear >> PAGE_SHIFT, first = frame; > > +unsigned long final = (linear + bytes - !!bytes) >> > > PAGE_SHIFT; > > + > > +/* > > + * mfn points to the next free slot. All used slots have a > > page reference > > + * held on them. > > + */ > > +mfn_t *mfn = &hvmemul_ctxt->mfn[0]; > > + > > +/* > > + * The caller has no legitimate reason for trying a zero-byte > > write, but > > + * final is calculate to fail safe in release builds. > > + * > > + * The maximum write size depends on the number of adjacent > > mfns[] which > > + * can be vmap()'d, accouting for possible misalignment within > > the region. > > + * The higher level emulation callers are responsible for > > ensuring that > > + * mfns[] is large enough for the requested write size. > > + */ > > +if ( bytes == 0 || > > + final - first >= ARRAY_SIZE(hvmemul_ctxt->mfn) ) > > +{ > > +ASSERT_UNREACHABLE(); > > +goto unhandleable; > > +} > > + > > +do { > > +enum hvm_translation_result res; > > +struct page_info *page; > > +pagefault_info_t pfinfo; > > +p2m_type_t p2mt; > > + > > +/* Error checking. Confirm that the current slot is > > clean. 
*/ > > +ASSERT(mfn_x(*mfn) == 0); > > + > > +res = hvm_translate_get_page(curr, frame << PAGE_SHIFT, > > true, pfec, > > + &pfinfo, &page, NULL, &p2mt); > > + > > +switch ( res ) > > +{ > > +case HVMTRANS_okay: > > +break; > > + > > +case HVMTRANS_bad_linear_to_gfn: > > +x86_emul_pagefault(pfinfo.ec, pfinfo.linear, > > &hvmemul_ctxt->ctxt); > > +err = ERR_PTR(~X86EMUL_EXCEPTION); > > +goto out; > > + > > +case HVMTRANS_bad_gfn_to_mfn: > > +err = NULL; > > +goto out; > > + > > +case HVMTRANS_gfn_paged_out: > > +case HVMTRANS_gfn_shared: > > +err = ERR_PTR(~X86EMUL_RETRY); > > +goto out; > > + > > +default: > > +goto unhandleable; > > +} > > + > > +if ( p2m_is_discard_write(p2mt) ) > > +{ > > +err = ERR_PTR(~X86EMUL_OKAY); > > +goto out; > > +} > > + > > +*mfn++ = _mfn(page_to_mfn(page)); > > + > > +} while ( ++frame < final ); > Interesting - I had specifically pointed out in a reply to v3 that > the > increment of mfn _cannot_ be moved down here: You're now > leaking a page ref on the p2m_is_discard_write() error path afaict. > Sorry about that, I realized the error after reading it again. I'll wait for all the comments before doing the final version. Alex
Re: [Xen-devel] [PATCH] x86/domctl: Don't pause the whole domain if only getting vcpu state
On Tue, 2017-09-19 at 00:11 -0600, Jan Beulich wrote: > >>> Razvan Cojocaru 09/18/17 7:05 PM >>> > > On 09/18/2017 06:35 PM, Jan Beulich wrote: > > > >>> On 12.09.17 at 15:53, wrote: > > > > --- a/xen/arch/x86/domctl.c > > > > +++ b/xen/arch/x86/domctl.c > > > > @@ -625,6 +625,26 @@ long arch_do_domctl( > > > > !is_hvm_domain(d) ) > > > > break; > > > > > > > > +if ( domctl->u.hvmcontext_partial.type == > > > > HVM_SAVE_CODE(CPU) && > > > > + domctl->u.hvmcontext_partial.instance < d->max_vcpus ) > > > I have to admit that I'm not in favor of such special casing, even > > > less so without any code comment saying why this is so special. > > > What if someone else wanted some other piece of vCPU state > > > without pausing the entire domain? Wouldn't it be possible to > > > generalize this to cover all such state elements? > > There's no reason why all the other cases where this would be possible > > shouldn't be optimized. What has made this one stand out for us is that > > we're using it a lot with introspection, and the optimization counts. > > > > But judging by the code reorganization (the addition of > > hvm_save_one_cpu_ctxt()), the changes would need to be done on a > > one-by-one case anyway (different queries may require different ways of > > changing the code). > But this function addition is precisely what I'd like to avoid in favor of > an extension to the existing mechanism using the registered function pointers. > What would be a suitable extension of the current callback system? Regards, Alex
Re: [Xen-devel] [PATCH v2 4/4] x86/hvm: Implement hvmemul_write() using real mappings
On Lu, 2017-09-18 at 07:43 -0600, Jan Beulich wrote: > > > > > > > > > > > > > On 08.09.17 at 18:05, wrote: > > Changes since V1: > > - Moved ASSERT to the begining of the loop > > - Corrected the decrement on mfn int the while statement > > - Modified the comment to PAGE_SIZE+1 > While several of my v1 comments were taken care of verbally, some > haven't been addressed here or during the discussion. Sorry about that, I must have lost some or some emails have not been indexed. I'll address all from now on. > > > > > While the maximum size of linear mapping is capped at 1 page, the > > logic > > in the helpers is written to work properly as hvmemul_ctxt->mfn[] > > gets > > longer, > > specifically with XSAVE instruction emulation in mind. > > > > This has only had light testing so far. > Has this changed in the meantime? This has not changed so far. > > > > > +static void *hvmemul_map_linear_addr( > > +unsigned long linear, unsigned int bytes, uint32_t pfec, > > +struct hvm_emulate_ctxt *hvmemul_ctxt) > > +{ > > +struct vcpu *curr = current; > > +void *err, *mapping; > > + > > +/* First and final gfns which need mapping. */ > > +unsigned long frame = linear >> PAGE_SHIFT, first = frame; > > +unsigned long final = (linear + bytes - !!bytes) >> > > PAGE_SHIFT; > > + > > +/* > > + * mfn points to the next free slot. All used slots have a > > page reference > > + * held on them. > > + */ > > +mfn_t *mfn = &hvmemul_ctxt->mfn[0]; > > + > > +/* > > + * The caller has no legitimate reason for trying a zero-byte > > write, but > > + * final is calculate to fail safe in release builds. > > + * > > + * The maximum write size depends on the number of adjacent > > mfns[] which > > + * can be vmap()'d, accouting for possible misalignment within > > the region. > > + * The higher level emulation callers are responsible for > > ensuring that > > + * mfns[] is large enough for the requested write size. 
> > + */ > > +if ( bytes == 0 || > > + final - first > ARRAY_SIZE(hvmemul_ctxt->mfn) - 1 ) > > +{ > > +ASSERT_UNREACHABLE(); > > +goto unhandleable; > > +} > > + > > +do { > > +enum hvm_translation_result res; > > +struct page_info *page; > > +pagefault_info_t pfinfo; > > +p2m_type_t p2mt; > > + > > +/* Error checking. Confirm that the current slot is > > clean. */ > > +ASSERT(mfn_x(*mfn) == 0); > > + > > +res = hvm_translate_get_page(curr, frame << PAGE_SHIFT, > > true, pfec, > > + &pfinfo, &page, NULL, &p2mt); > > + > > +switch ( res ) > > +{ > > +case HVMTRANS_okay: > > +break; > > + > > +case HVMTRANS_bad_linear_to_gfn: > > +x86_emul_pagefault(pfinfo.ec, pfinfo.linear, > > &hvmemul_ctxt->ctxt); > > +err = ERR_PTR(~(long)X86EMUL_EXCEPTION); > Why the casts to long here and further down? > > > > > +goto out; > > + > > +case HVMTRANS_bad_gfn_to_mfn: > > +err = NULL; > > +goto out; > > + > > +case HVMTRANS_gfn_paged_out: > > +case HVMTRANS_gfn_shared: > > +err = ERR_PTR(~(long)X86EMUL_RETRY); > > +goto out; > > + > > +default: > > +goto unhandleable; > > +} > > + > > +*mfn++ = _mfn(page_to_mfn(page)); > > +frame++; > > + > > +if ( p2m_is_discard_write(p2mt) ) > > +{ > > +err = ERR_PTR(~(long)X86EMUL_OKAY); > > +goto out; > > +} > > + > > +} while ( frame < final ); > > + > > +/* Entire access within a single frame? */ > > +if ( first == final ) > > +mapping = map_domain_page(hvmemul_ctxt->mfn[0]) + (linear > > & ~PAGE_MASK); > > +/* Multiple frames? Need to vmap(). */ > > +else if ( (mapping = vmap(hvmemul_ctxt->mfn, > > + mfn - hvmemul_ctxt->mfn)) == NULL ) > v1 comment was "final - first + 1 would likely yield better code." will do. > > > > > +goto unhandleable; > > + > > +#ifndef NDEBUG /* Poision unused mfn[]s with INVALID_MFN. 
*/ > > +while ( mfn < hvmemul_ctxt->mfn + ARRAY_SIZE(hvmemul_ctxt- > > >mfn) ) > > +{ > > +ASSERT(mfn_x(*mfn) == 0); > > +*mfn++ = INVALID_MFN; > > +} > > +#endif > > + > > +return mapping; > > + > > + unhandleable: > > +err = ERR_PTR(~(long)X86EMUL_UNHANDLEABLE); > > + > > + out: > > +/* Drop all held references. */ > > +while ( mfn-- > hvmemul_ctxt->mfn ) > > +put_page(mfn_to_page(mfn_x(*mfn))); > > + > > +return err; > > +} > > + > > +static void hvmemul_unmap_linear_addr( > > +void *mapping, unsigned long linear, unsigned int bytes, > While this was discussed in response to v1, I still think "mapping" > should be const void *, or a prereq patch (which I would object > to) should be submitted to drop the const from vun
Re: [Xen-devel] [PATCH v5] common/vm_event: Initialize vm_event lists on domain creation
> > > > -static int vm_event_disable(struct domain *d, struct > > vm_event_domain *ved) > > +static int vm_event_disable(struct domain *d, struct > > vm_event_domain **ved) > > { > A lot of the code churn here and above could be avoided by changing > ved > in parameter list to something else (vedp?) and having a local > variable > called > > struct vm_event_domain *ved = *vedp; > > (I don't feel very strongly about this though) > I don't think it's necessary, but I'll leave the decision to the maintainers. Regards, Alex
Re: [Xen-devel] [PATCH v5] common/vm_event: Initialize vm_event lists on domain creation
On Tue, 2017-08-29 at 09:14 -0600, Tamas K Lengyel wrote: > On Tue, Aug 29, 2017 at 8:17 AM, Alexandru Isaila > wrote: > > > > The patch splits the vm_event into three structures: vm_event_share, > > vm_event_paging, vm_event_monitor. The allocation for the > > structure is moved to vm_event_enable so that it can be > > allocated/init when needed and freed in vm_event_disable. > > > > Signed-off-by: Alexandru Isaila > > > > --- > > Changes since V4: > > - Replaced all NULL checks with vm_event_check_ring > > > > Note: Did not test on arm; compiled on arm and x86. > This looks good to me as is but could you at least check with the > xen-access test tool on x86 that it still works as expected? > > I've just tested it now and it works fine. Thanks, Alex
Re: [Xen-devel] [PATCH v6] x86/hvm: Allow guest_request vm_events coming from userspace
On Fri, 2017-08-25 at 06:13 -0600, Jan Beulich wrote: > >>> On 17.08.17 at 13:50, wrote: > > --- a/xen/common/monitor.c > > +++ b/xen/common/monitor.c > > @@ -75,6 +75,7 @@ int monitor_domctl(struct domain *d, struct > > xen_domctl_monitor_op *mop) > > domain_pause(d); > > d->monitor.guest_request_sync = mop->u.guest_request.sync; > > d->monitor.guest_request_enabled = requested_status; > > +d->arch.monitor.guest_request_userspace_enabled = mop->u.guest_request.allow_userspace; > This breaks the build on ARM. There are two solutions: I can move the case into the arch_monitor_domctl_event function in x86/monitor.c, or I can add an arch-specific function that does the assignment on x86 and does nothing on ARM. Which approach do you prefer? > > Jan Thanks, Alex
Re: [Xen-devel] [PATCH v2] common/vm_event: Initialize vm_event lists on domain creation
On Jo, 2017-08-24 at 07:24 -0600, Jan Beulich wrote: > > > > > > > > > > > > > On 24.08.17 at 13:48, wrote: > > The patch splits the vm_event into three structures:vm_event_share, > > vm_event_paging, vm_event_monitor. The allocation for the > > structure is moved to vm_event_enable so that it can be > > allocated/init when needed and freed in vm_event_disable. > > > > Signed-off-by: Alexandru Isaila > Missing brief description of changes from v1 here. > > > > > @@ -50,32 +50,37 @@ static int vm_event_enable( > > int rc; > > unsigned long ring_gfn = d->arch.hvm_domain.params[param]; > > > > +if ( !(*ved) ) > > +(*ved) = xzalloc(struct vm_event_domain); > > +if ( !(*ved) ) > In none of the three cases you really need the parentheses around > *ved. > > > > > @@ -187,39 +194,45 @@ void vm_event_wake(struct domain *d, struct > > vm_event_domain *ved) > > vm_event_wake_blocked(d, ved); > > } > > > > -static int vm_event_disable(struct domain *d, struct > > vm_event_domain *ved) > > +static int vm_event_disable(struct domain *d, struct > > vm_event_domain **ved) > > { > > -if ( ved->ring_page ) > > +if ( !*ved ) > > +return 0; > > + > > +if ( (*ved)->ring_page ) > > { > > [...] > > +xfree(*ved); > > +*ved = NULL; > > } > If both if()-s above are really useful, you are leaking *ved when it > is non-NULL but ->ring_page is NULL. > > > > > @@ -500,6 +519,9 @@ bool_t vm_event_check_ring(struct > > vm_event_domain *ved) > > int __vm_event_claim_slot(struct domain *d, struct vm_event_domain > > *ved, > > bool_t allow_sleep) > > { > > +if ( !ved ) > > +return -ENOSYS; > I don't think ENOSYS is correct here. Can you tell me what is the preferred return value here? > > > > > @@ -510,24 +532,24 @@ int __vm_event_claim_slot(struct domain *d, > > struct vm_event_domain *ved, > > /* Registered with Xen-bound event channel for incoming > > notifications. 
*/ > > static void mem_paging_notification(struct vcpu *v, unsigned int > > port) > > { > > -if ( likely(v->domain->vm_event->paging.ring_page != NULL) ) > > -vm_event_resume(v->domain, &v->domain->vm_event->paging); > > +if ( likely(v->domain->vm_event_paging->ring_page != NULL) ) > > +vm_event_resume(v->domain, v->domain->vm_event_paging); > > } > > #endif > > > > /* Registered with Xen-bound event channel for incoming > > notifications. */ > > static void monitor_notification(struct vcpu *v, unsigned int > > port) > > { > > -if ( likely(v->domain->vm_event->monitor.ring_page != NULL) ) > > -vm_event_resume(v->domain, &v->domain->vm_event->monitor); > > +if ( likely(v->domain->vm_event_monitor->ring_page != NULL) ) > > +vm_event_resume(v->domain, v->domain->vm_event_monitor); > > } > > > > #ifdef CONFIG_HAS_MEM_SHARING > > /* Registered with Xen-bound event channel for incoming > > notifications. */ > > static void mem_sharing_notification(struct vcpu *v, unsigned int > > port) > > { > > -if ( likely(v->domain->vm_event->share.ring_page != NULL) ) > > -vm_event_resume(v->domain, &v->domain->vm_event->share); > > +if ( likely(v->domain->vm_event_share->ring_page != NULL) ) > > +vm_event_resume(v->domain, v->domain->vm_event_share); > > } > > #endif > For all three a local variable holding v->domain would certain help; > eventually the functions should even be passed struct domain * > instead of struct vcpu *, I think. > > > > > @@ -599,7 +621,6 @@ int vm_event_domctl(struct domain *d, > > xen_domctl_vm_event_op_t *vec, > > #ifdef CONFIG_HAS_MEM_PAGING > > case XEN_DOMCTL_VM_EVENT_OP_PAGING: > > { > > -struct vm_event_domain *ved = &d->vm_event->paging; > Dropping this local variable (and similar ones below) pointlessly > increases the size of this patch. I have dropped the local var because in case of a XEN_VM_EVENT_ENABLE d->vm_event_... is allocated in the vm_event_enable function below so it will allocate mem for the local variable. 
> > > > > --- a/xen/drivers/passthrough/pci.c > > +++ b/xen/drivers/passthrough/pci.c > > @@ -1363,9 +1363,11 @@ static int assign_device(struct domain *d, > > u16 seg, u8 bus, u8 devfn, u32 flag) > > > > /* Prevent device assign if mem paging or mem sharing have > > been > > * enabled for this domain */ > > +if( !d->vm_event_paging ) > > +return -EXDEV; > Is this check the wrong way round? And why can't it be combined > with ... > > > > > if ( unlikely(!need_iommu(d) && > > (d->arch.hvm_domain.mem_sharing_enabled || > > - d->vm_event->paging.ring_page || > > + d->vm_event_paging->ring_page || > > p2m_get_hostp2m(d)->global_logdirty)) ) > > return -EXDEV; > ... this set? > > Jan Alex
Re: [Xen-devel] [PATCH v5] x86/hvm: Allow guest_request vm_events coming from userspace
On Ma, 2017-08-08 at 12:27 +0100, Wei Liu wrote: > On Tue, Aug 08, 2017 at 11:27:35AM +0300, Alexandru Isaila wrote: > > > > In some introspection usecases, an in-guest agent needs to > > communicate > > with the external introspection agent. An existing mechanism is > > HVMOP_guest_request_vm_event, but this is restricted to kernel > > usecases > > like all other hypercalls. > > > > Introduce a mechanism whereby the introspection agent can whitelist > > the > > use of HVMOP_guest_request_vm_event directly from userspace. > > > > Signed-off-by: Alexandru Isaila > > > > --- > > Changes since V4: > > - Changed function mane from xc_allow_guest_userspace_event > > to xc_monitor_guest_userspace_event > > - Fixed guest_request_enabled check > > - Delete the guest_request_sync > > - Changed guest_request_userspace_event to > > guest_request_userspace_enabled > > - Moved guest_request_userspace_enabled flag from sched.h to > > domain.h > > --- > > tools/libxc/include/xenctrl.h | 1 + > > tools/libxc/xc_monitor.c | 14 ++ > > xen/arch/x86/hvm/hypercall.c | 5 + > > xen/common/monitor.c | 13 + > > xen/include/asm-x86/domain.h | 19 ++- > > xen/include/public/domctl.h | 21 +++-- > > 6 files changed, 54 insertions(+), 19 deletions(-) > > > > diff --git a/tools/libxc/include/xenctrl.h > > b/tools/libxc/include/xenctrl.h > > index bde8313..c72e12d 100644 > > --- a/tools/libxc/include/xenctrl.h > > +++ b/tools/libxc/include/xenctrl.h > > @@ -2022,6 +2022,7 @@ int xc_monitor_descriptor_access(xc_interface > > *xch, domid_t domain_id, > > bool enable); > > int xc_monitor_guest_request(xc_interface *xch, domid_t domain_id, > > bool enable, bool sync); > > +int xc_monitor_guest_userspace_event(xc_interface *xch, domid_t > > domain_id, bool enable); > > int xc_monitor_debug_exceptions(xc_interface *xch, domid_t > > domain_id, > > bool enable, bool sync); > > int xc_monitor_cpuid(xc_interface *xch, domid_t domain_id, bool > > enable); > > diff --git a/tools/libxc/xc_monitor.c 
b/tools/libxc/xc_monitor.c > > index b44ce93..bd8cbcf 100644 > > --- a/tools/libxc/xc_monitor.c > > +++ b/tools/libxc/xc_monitor.c > > @@ -161,6 +161,20 @@ int xc_monitor_guest_request(xc_interface > > *xch, domid_t domain_id, bool enable, > > return do_domctl(xch, &domctl); > > } > > > > +int xc_monitor_guest_userspace_event(xc_interface *xch, domid_t > > domain_id, bool enable) > > +{ > > +DECLARE_DOMCTL; > > + > > +domctl.cmd = XEN_DOMCTL_monitor_op; > > +domctl.domain = domain_id; > > +domctl.u.monitor_op.op = enable ? XEN_DOMCTL_MONITOR_OP_ENABLE > > +: > > XEN_DOMCTL_MONITOR_OP_DISABLE; > > +domctl.u.monitor_op.event = > > XEN_DOMCTL_MONITOR_EVENT_GUEST_USERSPACE_EVENT; > > + > > +return do_domctl(xch, &domctl); > > +} > > + > > + > For this bit: > > Acked-by: Wei Liu > > Some nits below. > > > > > diff --git a/xen/include/asm-x86/domain.h b/xen/include/asm- > > x86/domain.h > > index c10522b..de02507 100644 > > --- a/xen/include/asm-x86/domain.h > > +++ b/xen/include/asm-x86/domain.h > > @@ -396,15 +396,16 @@ struct arch_domain > > > > /* Arch-specific monitor options */ > > struct { > > -unsigned int write_ctrlreg_enabled : 4; > > -unsigned int write_ctrlreg_sync : 4; > > -unsigned int write_ctrlreg_onchangeonly : 4; > > -unsigned int singlestep_enabled : 1; > > -unsigned int software_breakpoint_enabled : 1; > > -unsigned int debug_exception_enabled : 1; > > -unsigned int debug_exception_sync: 1; > > -unsigned int cpuid_enabled : 1; > > -unsigned int descriptor_access_enabled : 1; > > +unsigned int > > write_ctrlreg_enabled : 4; > > +unsigned int > > write_ctrlreg_sync: 4; > > +unsigned int > > write_ctrlreg_onchangeonly: 4; > > +unsigned int > > singlestep_enabled: 1; > > +unsigned int > > software_breakpoint_enabled : 1; > > +unsigned int > > debug_exception_enabled : 1; > > +unsigned int > > debug_exception_sync : 1; > > +unsigned int > > cpuid_enabled : 1; > > +unsigned int > > descriptor_access_enabled : 1; > > +unsigned int > > 
guest_request_userspace_enabled : 1; > Indentation here and below seems rather excessive. This indentation was a suggestion made by Jan Beulich on Patch V3.
Re: [Xen-devel] [PATCH v4] x86/hvm: Allow guest_request vm_events coming from userspace
On Fri, 2017-08-04 at 19:32 -0600, Tamas K Lengyel wrote:
On Fri, Aug 4, 2017 at 5:32 AM, Alexandru Isaila <aisa...@bitdefender.com> wrote:

> In some introspection use cases, an in-guest agent needs to
> communicate with the external introspection agent. An existing
> mechanism is HVMOP_guest_request_vm_event, but this is restricted
> to kernel use cases like all other hypercalls.
>
> Introduce a mechanism whereby the introspection agent can whitelist
> the use of HVMOP_guest_request_vm_event directly from userspace.
>
> Signed-off-by: Alexandru Isaila <aisa...@bitdefender.com>
>
> ---
> Changes since V3:
>         - Changed commit message
>         - Added new lines
>         - Indent the maximum space on the defines
>         - Changed the name of the define/function name/struct member
>           from vmcall to event
> ---
>  tools/libxc/include/xenctrl.h | 1 +
>  tools/libxc/xc_monitor.c      | 14 ++
>  xen/arch/x86/hvm/hypercall.c  | 5 +
>  xen/common/monitor.c          | 14 ++
>  xen/include/public/domctl.h   | 21 +++--
>  xen/include/xen/sched.h       | 5 +++--
>  6 files changed, 48 insertions(+), 12 deletions(-)
>
> diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
> index bde8313..90a056f 100644
> --- a/tools/libxc/include/xenctrl.h
> +++ b/tools/libxc/include/xenctrl.h
> @@ -2022,6 +2022,7 @@ int xc_monitor_descriptor_access(xc_interface *xch, domid_t domain_id,
>                                   bool enable);
>  int xc_monitor_guest_request(xc_interface *xch, domid_t domain_id,
>                               bool enable, bool sync);
> +int xc_allow_guest_userspace_event(xc_interface *xch, domid_t domain_id, bool enable);
>  int xc_monitor_debug_exceptions(xc_interface *xch, domid_t domain_id,
>                                  bool enable, bool sync);
>  int xc_monitor_cpuid(xc_interface *xch, domid_t domain_id, bool enable);
>
> diff --git a/tools/libxc/xc_monitor.c b/tools/libxc/xc_monitor.c
> index b44ce93..6064c39 100644
> --- a/tools/libxc/xc_monitor.c
> +++ b/tools/libxc/xc_monitor.c
> @@ -161,6 +161,20 @@ int xc_monitor_guest_request(xc_interface *xch, domid_t domain_id, bool enable,
>      return do_domctl(xch, &domctl);
>  }
>
> +int xc_allow_guest_userspace_event(xc_interface *xch, domid_t domain_id, bool enable)

This function should be prefixed with "xc_monitor_" like all the rest of the functions here.

> +{
> +    DECLARE_DOMCTL;
> +
> +    domctl.cmd = XEN_DOMCTL_monitor_op;
> +    domctl.domain = domain_id;
> +    domctl.u.monitor_op.op = enable ? XEN_DOMCTL_MONITOR_OP_ENABLE
> +                                    : XEN_DOMCTL_MONITOR_OP_DISABLE;
> +    domctl.u.monitor_op.event = XEN_DOMCTL_MONITOR_EVENT_GUEST_USERSPACE_EVENT;
> +
> +    return do_domctl(xch, &domctl);
> +}
> +
>  int xc_monitor_emulate_each_rep(xc_interface *xch, domid_t domain_id,
>                                  bool enable)
>  {
>
> diff --git a/xen/arch/x86/hvm/hypercall.c b/xen/arch/x86/hvm/hypercall.c
> index e7238ce..8eb5f49 100644
> --- a/xen/arch/x86/hvm/hypercall.c
> +++ b/xen/arch/x86/hvm/hypercall.c
> @@ -155,6 +155,11 @@ int hvm_hypercall(struct cpu_user_regs *regs)
>          /* Fallthrough to permission check. */
>      case 4:
>      case 2:
> +        if ( currd->monitor.guest_request_userspace_event &&
> +             eax == __HYPERVISOR_hvm_op &&
> +             (mode == 8 ? regs->rdi : regs->ebx) == HVMOP_guest_request_vm_event )
> +            break;
> +
>          if ( unlikely(hvm_get_cpl(curr)) )
>          {
>          default:
>
> diff --git a/xen/common/monitor.c b/xen/common/monitor.c
> index 451f42f..21a1457 100644
> --- a/xen/common/monitor.c
> +++ b/xen/common/monitor.c
> @@ -79,6 +79,20 @@ int monitor_domctl(struct domain *d, struct xen_domctl_monitor_op *mop)
>          break;
>      }
>
> +    case XEN_DOMCTL_MONITOR_EVENT_GUEST_USERSPACE_EVENT:
> +    {
> +        bool_t old_status = d->monitor.guest_request_enabled;

You are checking guest_request_enabled here while later setting guest_request_userspace_event. These are two separate monitor options; adjust accordingly.

> +
> +        if ( unlikely(old_status == requested_status) )
> +            return -EEXIST;
> +
> +        domain_pause(d);
> +        d->monitor.guest_request_sync = mop->u.guest_request.sync;

You are setting guest_request_sync here, which is a setting belonging to guest_request_enabled. If you need a sync/async option for the userspace guest request it should be a separate bit. Since the toolstack side you add in this patch never sets the sync option, I assume that is not the case, so remove this.

> +        d->monitor.guest_request_userspace_event = requested_status;
> +        domain_unpause(d);
> +        break;
> +    }
> +
>      default:
>          /* Give arch-side the chance to handle this event */
>          return arch_monitor_domctl_event(d, mop);
>
> diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
> index ff39762..870495c 100644
> --- a/xen/include/public/domctl.h
> +++ b/xen/include/public/domctl.h
> @@ -1073,16 +1073,17 @@ DEFINE_XEN_GUEST_HANDLE(xen_domctl_psr_cmt_op_t);
> #define XEN_DO
Re: [Xen-devel] [PATCH v4] x86/hvm: Allow guest_request vm_events coming from userspace
From: Tamas K Lengyel
Sent: Saturday, August 5, 2017 4:32 AM
To: Alexandru Stefan ISAILA
Cc: Xen-devel; wei.l...@citrix.com; Tim Deegan; Stefano Stabellini; Konrad Rzeszutek Wilk; Jan Beulich; Ian Jackson; George Dunlap; Andrew Cooper; Razvan Cojocaru
Subject: Re: [PATCH v4] x86/hvm: Allow guest_request vm_events coming from userspace
Re: [Xen-devel] [PATCH v2] x86/hvm: Allow guest_request vm_events coming from userspace
I'm sure we can do this and use a monitor op together with the HVMOP_guest_request_vm_event event. We have discussed this and have a good idea on how to do it.

~Alex

From: Andrew Cooper
Sent: Tuesday, August 1, 2017 1:30 PM
To: Alexandru Stefan ISAILA; xen-devel@lists.xen.org
Cc: jbeul...@suse.com; Razvan Cojocaru; Tamas K Lengyel
Subject: Re: [PATCH v2] x86/hvm: Allow guest_request vm_events coming from userspace

On 01/08/17 10:46, Alexandru Isaila wrote:
> Allow guest userspace code to request that a vm_event be sent out
> via VMCALL. This functionality seems to be handy for a number of
> Xen developers, as stated on the mailing list (thread "[Xen-devel]
> HVMOP_guest_request_vm_event only works from guest in ring0").
> This is a use case in communication between a userspace application
> in the guest and the introspection application in dom0.
>
> Signed-off-by: Alexandru Isaila

This issue has been argued several times before, and while I am in favour of the change, there is a legitimate argument that it breaks one of our security boundaries.

One intermediate option comes to mind, however. Could we introduce a new monitor op which permits the use of HVMOP_guest_request_vm_event from userspace? This way, it requires a positive action on behalf of the introspection agent to relax the CPL check, rather than having the CPL check unconditionally relaxed. I think this would be sufficient to alleviate the previous objections.
~Andrew

> ---
> Changes since V1:
>         - Added Fallthrough check on mode == 2
> ---
>  xen/arch/x86/hvm/hypercall.c | 6 ++
>  1 file changed, 6 insertions(+)
>
> diff --git a/xen/arch/x86/hvm/hypercall.c b/xen/arch/x86/hvm/hypercall.c
> index e7238ce..1c067c3 100644
> --- a/xen/arch/x86/hvm/hypercall.c
> +++ b/xen/arch/x86/hvm/hypercall.c
> @@ -152,9 +152,15 @@ int hvm_hypercall(struct cpu_user_regs *regs)
>      {
>      case 8:
>          eax = regs->rax;
> +        if ( eax == __HYPERVISOR_hvm_op &&
> +             regs->rdi == HVMOP_guest_request_vm_event )
> +            break;
>          /* Fallthrough to permission check. */
>      case 4:
>      case 2:
> +        if ( mode != 8 && eax == __HYPERVISOR_hvm_op &&
> +             regs->ebx == HVMOP_guest_request_vm_event )
> +            break;
>          if ( unlikely(hvm_get_cpl(curr)) )
>          {
>          default:

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel