Re: [Xen-devel] Xen 4.6.1 crash with altp2m enabled by default
Hi guys

I have found the problem (after hours and hours of gruesome debugging with the almighty print), and it could have quite a bit of impact whenever altp2m is enabled for a guest domain (even if the functionality is never actively used), since destroying any vcpu of such a guest can lead to a hypervisor panic. A malicious user could simply destroy and restart his VM(s) in order to DoS the VMs of other users by killing the hypervisor. Granted, this is not very effective, but, depending on the environment, it is extremely easy to implement. The bug persists in Xen 4.7 and I do not think that it was fixed in the current master branch.

The following happens. The call

    void hvm_vcpu_destroy(struct vcpu *v)
    {
        hvm_all_ioreq_servers_remove_vcpu(v->domain, v);

        if ( hvm_altp2m_supported() )
            altp2m_vcpu_destroy(v);

at some point reaches vmx_vcpu_update_eptp, which ends with a vmx_vmcs_exit(v);. There vmx_clear_vmcs(v); -> __vmx_clear_vmcs is called, where the current VMCS is invalidated if the VMCS currently loaded in the CPU is the same as virt_to_maddr(v->arch.hvm_vmx->vmcs):

    __vmpclear(virt_to_maddr(arch_vmx->vmcs));

( http://www.tptp.cc/mirrors/siyobik.info/instruction/VMCLEAR.html )

To check this assumption I implemented a basic __vmptrst ( http://www.tptp.cc/mirrors/siyobik.info/instruction/VMPTRST.html ) and added the result to the debug output.

    (XEN) vmcs.c:519:IDLEv4 __vmx_clear_vmcs: realVMCS BEFORE __vmpclear 82415a000
    (XEN) vmcs.c:522:IDLEv4 __vmx_clear_vmcs: realVMCS AFTER __vmpclear

After that no vmcs_load / enter is executed, so the VMCS in the CPU remains invalidated. For the next function in hvm_vcpu_destroy, nestedhvm_vcpu_destroy(v), the missing VMCS is no problem (at least in our use case), but free_compat_arg_xlat crashes.
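As a rough illustration of the mechanism described above, here is a toy user-space model of a CPU's current-VMCS pointer. It is not Xen code and does not execute any VMX instruction. All names (`toy_cpu`, `toy_vmptrld`, `toy_vmclear`, `toy_vmptrst`, `toy_vmwrite`) are made up for this sketch. It only mirrors the architectural behaviour that VMCLEAR on the current VMCS leaves no current VMCS, after which VMWRITE fails.

```c
#include <stdint.h>
#include <stdbool.h>

/* Toy model only: the pointer value reported when no VMCS is current. */
#define TOY_VMCS_INVALID UINT64_MAX

/* One logical CPU, reduced to its current-VMCS pointer. */
struct toy_cpu {
    uint64_t current_vmcs; /* physical address of the current VMCS */
};

/* Models VMPTRLD: make the given VMCS the current one. */
static void toy_vmptrld(struct toy_cpu *cpu, uint64_t vmcs_pa)
{
    cpu->current_vmcs = vmcs_pa;
}

/* Models VMCLEAR: clearing the current VMCS invalidates the pointer. */
static void toy_vmclear(struct toy_cpu *cpu, uint64_t vmcs_pa)
{
    if (cpu->current_vmcs == vmcs_pa)
        cpu->current_vmcs = TOY_VMCS_INVALID;
}

/* Models VMPTRST: read back the current-VMCS pointer. */
static uint64_t toy_vmptrst(const struct toy_cpu *cpu)
{
    return cpu->current_vmcs;
}

/* Models VMWRITE: returns false (VMfailInvalid) without a current VMCS. */
static bool toy_vmwrite(const struct toy_cpu *cpu)
{
    return cpu->current_vmcs != TOY_VMCS_INVALID;
}
```

In this model, loading a VMCS, clearing the same address, and then calling `toy_vmwrite` yields a failure, which corresponds to the sequence reported in the debug output (a valid pointer before `__vmpclear`, none afterwards, and no reload before the next write).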
The callstack is as follows:

    hvm_vcpu_destroy
    free_compat_arg_xlat
    destroy_perdomain_mapping
    map_domain_page (probably inlined)
    mapcache_current_vcpu
    sync_local_execstate
    __sync_local_execstate
    __context_switch (with function pointer v->arch.ctxt_switch_from = vmx_ctxt_switch_from)
    vmx_ctxt_switch_from (probably inlined)
    vmx_fpu_leave

There a vmwrite is attempted if either ( !(v->arch.hvm_vmx.host_cr0 & X86_CR0_TS) ) or ( !(v->arch.hvm_vcpu.guest_cr[0] & X86_CR0_TS) ) is true. The executed vmwrite then crashes. As my knowledge of Xen is not that comprehensive, could you tell me when the TS bits are set / cleared and what they are used for?

    static void vmx_fpu_leave(struct vcpu *v)
    {
        ASSERT(!v->fpu_dirtied);
        ASSERT(read_cr0() & X86_CR0_TS);

        if ( !(v->arch.hvm_vmx.host_cr0 & X86_CR0_TS) )
        {
            v->arch.hvm_vmx.host_cr0 |= X86_CR0_TS;
            __vmwrite(HOST_CR0, v->arch.hvm_vmx.host_cr0);
        }

        if ( !(v->arch.hvm_vcpu.guest_cr[0] & X86_CR0_TS) )
        {
            v->arch.hvm_vcpu.hw_cr[0] |= X86_CR0_TS;
            __vmwrite(GUEST_CR0, v->arch.hvm_vcpu.hw_cr[0]);
            v->arch.hvm_vmx.exception_bitmap |= (1u << TRAP_no_device);
            vmx_update_exception_bitmap(v);
        }
    }

In the crash dump the additional debug output shows that at least one __vmwrite will be attempted and that the VMCS in the CPU is invalidated:

    (XEN) vmx.c:698:IDLEv4 vmx_fpu_leave: vcpu 8300defae000 vmcs 8301586c9000 host_cr0-case FALSE guest_cr[0]-case TRUE curr 8300df2fb000 curr->arch.hvm_vmx.vmcs realVMCS

As a quick fix I patched vmx_fpu_leave to only allow the __vmwrite when the realVMCS is valid. This seems to work fine, but requires a call to __vmptrst every time vmx_fpu_leave is called. Also I do not know if an ignored TS has any negative consequences when destroying a vcpu. I assume that this is not the case. In our tests nothing pointed to any problems.

I attached the patch to enable altp2m unconditionally and a patch which evades the panic in vmx_fpu_leave. They are not pretty or anywhere near production ready, but I think you will get the idea.
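The decision logic of the quick fix can be sketched in isolation. The following is a hedged user-space sketch, not the attached patch: `plan_fpu_leave` and `struct fpu_leave_plan` are hypothetical names, and the `vmcs_valid` flag stands in for the result of the `__vmptrst` check described above. It only computes which of the two writes from vmx_fpu_leave would be attempted.

```c
#include <stdint.h>
#include <stdbool.h>

/* CR0 bit 3 (Task Switched), the same value Xen uses for X86_CR0_TS. */
#define X86_CR0_TS 0x00000008u

/* Hypothetical result type: which VMCS writes would be issued. */
struct fpu_leave_plan {
    bool write_host_cr0;
    bool write_guest_cr0;
};

/*
 * Mirrors the two TS checks in vmx_fpu_leave, but skips both writes
 * when no valid VMCS is loaded (the guard added by the quick fix).
 */
static struct fpu_leave_plan plan_fpu_leave(uint64_t host_cr0,
                                            uint64_t guest_cr0,
                                            bool vmcs_valid)
{
    struct fpu_leave_plan plan = { false, false };

    if (!vmcs_valid)
        return plan; /* assumption: ignoring TS is harmless on destroy */

    plan.write_host_cr0 = !(host_cr0 & X86_CR0_TS);
    plan.write_guest_cr0 = !(guest_cr0 & X86_CR0_TS);
    return plan;
}
```

With a host CR0 of 0x8005003b (TS set) and a guest CR0 that has TS clear, only the guest write is planned, which matches the "host_cr0-case FALSE guest_cr[0]-case TRUE" debug line. The guest value used for such a check is illustrative and does not come from the dump.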
I tried to implement the __vmptrst with the #ifdef HAVE_GAS_VM parts (analogous to the other functions in vmx.h) but failed miserably since I lack the required knowledge about the OPCODE definitions. :-D

Cheers
Kevin

> -----Original Message-----
> From: Andrew Cooper [mailto:andrew.coop...@citrix.com]
> Sent: Monday, 22 August 2016 13:58
> To: Mayer, Kevin; jbeul...@suse.com
> Cc: xen-devel@lists.xen.org
> Subject: Re: AW: [Xen-devel] Xen 4.6.1 crash with altp2m enabled by default
>
> On 19/08/16 11:01, kevin.ma...@gdata.de wrote:
> > Hi
> >
> > I took another look at Xen and a new crashdump.
> > The last successful __vmwrite should be in static void
> > vmx_vcpu_update_vmfunc_ve(struct vcpu *v) [...]
> > __vmwrite(SECONDARY_VM_EXEC_CONTROL,
> > v->arch.hvm_vmx.secondary_exec_control);
> > [...]
> > After this the altp2m_vcpu_destroy wakes up the vcpu and is then finished.
> >
> > In nestedhvm_vcpu_destroy (nvmx_vcpu_destroy) the vmcs can overwritten (but is not reached in our case as
Re: [Xen-devel] Xen 4.6.1 crash with altp2m enabled by default
Hi

I took the time to write a small script which restores and destroys domains from provided state files. Just apply the patch to a Xen 4.6.1, provide some images + state files and start the script.

    python VmStarter.py -FILE /path/to/domU-0.state -FILE /path/to/domU-1.state --loggingLevel DEBUG

You can provide an arbitrary number of state files and the script will start an additional thread for each one. Each thread restores one guest domain from the provided state file, waits for a random time between 20 and 30 seconds ( sleepTime = random.randint(20,30) ), destroys the domain and then starts the process again. The guest domains and the corresponding state files need to have the same name, since the script extracts the domain name from the state file name.

When starting about one guest domain for every physical core of the CPU, the crash should occur in 5 to 10 minutes. Since the crashes are pretty random, the hypervisor sometimes panics almost instantly and sometimes it takes a while, but it seems to correlate with the number of started guest domains. More domains => faster crash.

Kevin

> -----Original Message-----
> From: Andrew Cooper [mailto:andrew.coop...@citrix.com]
> Sent: Monday, 22 August 2016 13:58
> To: Mayer, Kevin; jbeul...@suse.com
> Cc: xen-devel@lists.xen.org
> Subject: Re: AW: [Xen-devel] Xen 4.6.1 crash with altp2m enabled by default
>
> On 19/08/16 11:01, kevin.ma...@gdata.de wrote:
> > Hi
> >
> > I took another look at Xen and a new crashdump.
> > The last successful __vmwrite should be in static void
> > vmx_vcpu_update_vmfunc_ve(struct vcpu *v) [...]
> > __vmwrite(SECONDARY_VM_EXEC_CONTROL,
> > v->arch.hvm_vmx.secondary_exec_control);
> > [...]
> > After this the altp2m_vcpu_destroy wakes up the vcpu and is then finished.
> >
> > In nestedhvm_vcpu_destroy (nvmx_vcpu_destroy) the vmcs can overwritten (but is not reached in our case as far as I can see):
> > if ( nvcpu->nv_n1vmcx )
> > v->arch.hvm_vmx.vmcs = nvcpu->nv_n1vmcx;
> >
> > In conclusion:
> > When destroying a domain the altp2m_vcpu_destroy(v); path seems to mess up the vmcs which ( only ) sometimes leads to a failed __vmwrite in vmx_fpu_leave.
> > That is as far as I can get with my understanding of the Xen code.
> >
> > Do you guys have any additional ideas what I could test / analyse?
>
> Do you have easy reproduction instructions you could share? Sadly, this is looking like an issue which isn't viable to debug over email.
>
> ~Andrew

Virus checked by G Data MailSecurity Version: AVA 25.8183 dated 07.09.2016 Virus news: www.antiviruslab.com

Attachments:
xen-altp2menable.patch (Description: xen-altp2menable.patch)
VmStarter.py (Description: VmStarter.py)

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
Re: [Xen-devel] Xen 4.6.1 crash with altp2m enabled by default
Hi

The reproduction should be pretty simple. Apply the patch to enable altp2m unconditionally:

     d->arch.hvm_domain.params[HVM_PARAM_HPET_ENABLED] = 1;
     d->arch.hvm_domain.params[HVM_PARAM_TRIPLE_FAULT_REASON] = SHUTDOWN_reboot;
    +d->arch.hvm_domain.params[HVM_PARAM_ALTP2M] = 1;
    +
     vpic_init(d);
     rc = vioapic_init(d);

For the guest we use one state file (Windows 10) from which the guests are restored with libvirt. Simply restore and destroy several guests (5-7 in our current setup) in fast succession (every guest has about 1-2 minutes of runtime). The number of guest VMs seems to correlate with the time until the crash occurs, but other, random factors seem to be more important. More VMs => the crash happens faster.

Is the following debug setup possible?

    L0: Xen / VMWare
    L1: Xen with altp2m enabled
    L2: Several guest-VMs being constantly restored / destroyed

Then periodically take snapshots until the hypervisor panics and try to debug from the latest snapshot on.

> -----Original Message-----
> From: Andrew Cooper [mailto:andrew.coop...@citrix.com]
> Sent: Monday, 22 August 2016 13:58
> To: Mayer, Kevin; jbeul...@suse.com
> Cc: xen-devel@lists.xen.org
> Subject: Re: AW: [Xen-devel] Xen 4.6.1 crash with altp2m enabled by default
>
> On 19/08/16 11:01, kevin.ma...@gdata.de wrote:
> > Hi
> >
> > I took another look at Xen and a new crashdump.
> > The last successful __vmwrite should be in static void
> > vmx_vcpu_update_vmfunc_ve(struct vcpu *v) [...]
> > __vmwrite(SECONDARY_VM_EXEC_CONTROL,
> > v->arch.hvm_vmx.secondary_exec_control);
> > [...]
> > After this the altp2m_vcpu_destroy wakes up the vcpu and is then finished.
> >
> > In nestedhvm_vcpu_destroy (nvmx_vcpu_destroy) the vmcs can overwritten (but is not reached in our case as far as I can see):
> > if ( nvcpu->nv_n1vmcx )
> > v->arch.hvm_vmx.vmcs = nvcpu->nv_n1vmcx;
> >
> > In conclusion:
> > When destroying a domain the altp2m_vcpu_destroy(v); path seems to mess up the vmcs which ( only ) sometimes leads to a failed __vmwrite in vmx_fpu_leave.
> > That is as far as I can get with my understanding of the Xen code.
> >
> > Do you guys have any additional ideas what I could test / analyse?
>
> Do you have easy reproduction instructions you could share? Sadly, this is looking like an issue which isn't viable to debug over email.
>
> ~Andrew

Virus checked by G Data MailSecurity Version: AVA 25.7981 dated 22.08.2016 Virus news: www.antiviruslab.com
Re: [Xen-devel] Xen 4.6.1 crash with altp2m enabled by default
Hi

I took another look at Xen and a new crashdump. The last successful __vmwrite should be in

    static void vmx_vcpu_update_vmfunc_ve(struct vcpu *v)
    [...]
    __vmwrite(SECONDARY_VM_EXEC_CONTROL, v->arch.hvm_vmx.secondary_exec_control);
    [...]

After this the altp2m_vcpu_destroy wakes up the vcpu and is then finished.

In nestedhvm_vcpu_destroy (nvmx_vcpu_destroy) the vmcs can be overwritten (but this is not reached in our case as far as I can see):

    if ( nvcpu->nv_n1vmcx )
        v->arch.hvm_vmx.vmcs = nvcpu->nv_n1vmcx;

In conclusion: When destroying a domain, the altp2m_vcpu_destroy(v); path seems to mess up the vmcs, which (only) sometimes leads to a failed __vmwrite in vmx_fpu_leave. That is as far as I can get with my understanding of the Xen code.

Do you guys have any additional ideas what I could test / analyse?

> -----Original Message-----
> From: Jan Beulich [mailto:jbeul...@suse.com]
> Sent: Monday, 8 August 2016 12:29
> To: Mayer, Kevin
> Cc: andrew.coop...@citrix.com; xen-devel@lists.xen.org
> Subject: Re: [Xen-devel] Xen 4.6.1 crash with altp2m enabled by default
>
> >>> On 08.08.16 at 11:48, wrote:
> > vmx_vmenter_helper is not part of the call stack. The address is simply the location of the ud2 to which the __vmwrite(HOST_CR0, v->arch.hvm_vmx.host_cr0); In static void vmx_fpu_leave(struct vcpu *v) jumps.
> > There are two vmwrites in vmx_vcpu_update_eptp (called by altp2m_vcpu_destroy):
> > __vmwrite(EPT_POINTER, ept_get_eptp(ept));
> > __vmwrite(EPTP_INDEX, vcpu_altp2m(v).p2midx);
> >
> > And four in vmx_vcpu_update_vmfunc_ve (also called by altp2m_vcpu_destroy)
> > __vmwrite(VM_FUNCTION_CONTROL, VMX_VMFUNC_EPTP_SWITCHING);
> > __vmwrite(EPTP_LIST_ADDR, virt_to_maddr(d->arch.altp2m_eptp));
> > __vmwrite(VIRT_EXCEPTION_INFO, mfn_x(mfn) << PAGE_SHIFT);
> > __vmwrite(SECONDARY_VM_EXEC_CONTROL, v->arch.hvm_vmx.secondary_exec_control);
> >
> > After the altp2m-part hvm_vcpu_destroy also calls nestedhvm_vcpu_destroy(v), but this code path is executed unconditionally so I assume that the error lies somewhere in the altp2m_vcpu_destroy(v).
> >
> > What exactly are the vmx_vmcs_enter / exit required for? I often see the vmx_vmcs_enter; __vmwrite; vmx_vmcs_exit combination. Need the __vmwrites be guarded by an enter / exit ( which Is not the case in the static void vmx_fpu_leave(struct vcpu *v) )?
>
> On code paths where the correct VMCS may not be the current one it is necessary to frame vmread / vmwrite accordingly.
>
> > Is it possible that the altp2m_vcpu_destroy->vmx_vcpu_update_eptp->vmx_vmcs_exit->vmx_clear_vmcs invalidates the vmcs for the current vcpu?
>
> I certainly can't exclude this possibility.
>
> Jan

Virus checked by G Data MailSecurity Version: AVA 25.7943 dated 19.08.2016 Virus news: www.antiviruslab.com
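Jan's remark about framing vmread / vmwrite can be illustrated with a toy user-space model. This is a deliberately simplified sketch, not how Xen's vmx_vmcs_enter / vmx_vmcs_exit are implemented (the real functions also pause the target vcpu and clear and reload VMCSs). The names `toy_vmcs`, `toy_cpu`, `toy_vmwrite_host_cr0` and `toy_framed_write` are invented for this example.

```c
#include <stdint.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy VMCS holding a single field. */
struct toy_vmcs {
    uint64_t host_cr0;
};

/* Toy CPU: points at the VMCS that is current, or NULL if none is. */
struct toy_cpu {
    struct toy_vmcs *current;
};

/* Unframed write: goes to whatever VMCS is current, fails if none. */
static bool toy_vmwrite_host_cr0(struct toy_cpu *cpu, uint64_t val)
{
    if (cpu->current == NULL)
        return false;
    cpu->current->host_cr0 = val;
    return true;
}

/* Framed write: make the target current, write, restore the old state. */
static bool toy_framed_write(struct toy_cpu *cpu, struct toy_vmcs *target,
                             uint64_t val)
{
    struct toy_vmcs *saved = cpu->current; /* "enter" */
    bool ok;

    cpu->current = target;
    ok = toy_vmwrite_host_cr0(cpu, val);
    cpu->current = saved;                  /* "exit" */
    return ok;
}
```

In the model, an unframed write either lands in the wrong VMCS or fails outright when nothing is current, while the framed variant always reaches the intended VMCS and leaves the CPU state as it found it. vmx_fpu_leave performs unframed writes, so it relies on the right VMCS already being loaded.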
Re: [Xen-devel] Xen 4.6.1 crash with altp2m enabled by default
vmx_vmenter_helper is not part of the call stack. The address is simply the location of the ud2 to which the __vmwrite(HOST_CR0, v->arch.hvm_vmx.host_cr0); in static void vmx_fpu_leave(struct vcpu *v) jumps.

There are two vmwrites in vmx_vcpu_update_eptp (called by altp2m_vcpu_destroy):

    __vmwrite(EPT_POINTER, ept_get_eptp(ept));
    __vmwrite(EPTP_INDEX, vcpu_altp2m(v).p2midx);

And four in vmx_vcpu_update_vmfunc_ve (also called by altp2m_vcpu_destroy):

    __vmwrite(VM_FUNCTION_CONTROL, VMX_VMFUNC_EPTP_SWITCHING);
    __vmwrite(EPTP_LIST_ADDR, virt_to_maddr(d->arch.altp2m_eptp));
    __vmwrite(VIRT_EXCEPTION_INFO, mfn_x(mfn) << PAGE_SHIFT);
    __vmwrite(SECONDARY_VM_EXEC_CONTROL, v->arch.hvm_vmx.secondary_exec_control);

After the altp2m part, hvm_vcpu_destroy also calls nestedhvm_vcpu_destroy(v), but this code path is executed unconditionally, so I assume that the error lies somewhere in altp2m_vcpu_destroy(v).

What exactly are the vmx_vmcs_enter / exit required for? I often see the vmx_vmcs_enter; __vmwrite; vmx_vmcs_exit combination. Do the __vmwrites need to be guarded by an enter / exit (which is not the case in static void vmx_fpu_leave(struct vcpu *v))?

Is it possible that altp2m_vcpu_destroy->vmx_vcpu_update_eptp->vmx_vmcs_exit->vmx_clear_vmcs invalidates the vmcs for the current vcpu?

Cheers
Kevin

> -----Original Message-----
> From: Jan Beulich [mailto:jbeul...@suse.com]
> Sent: Friday, 5 August 2016 16:49
> To: Mayer, Kevin
> Cc: andrew.coop...@citrix.com; xen-devel@lists.xen.org
> Subject: Re: AW: AW: AW: [Xen-devel] Xen 4.6.1 crash with altp2m enabled by default
>
> >>> On 05.08.16 at 14:51, wrote:
> > According to the xen dmesg
> >
> > (XEN) RIP:e008:[] vmx_vmenter_helper+0x27e/0x30a
> > (XEN) RFLAGS: 00010003 CONTEXT: hypervisor
> > (XEN) rax: 8005003b rbx: 8300e72fc000 rcx:
> > (XEN) rdx: 6c00 rsi: 830617fd7fc0 rdi: 8300e6fc
> > (XEN) rbp: 830617fd7c40 rsp: 830617fd7c30 r8:
> > (XEN) r9: 830be8dc9310 r10: r11: 3475e9cf85d0
> > (XEN) r12: 0006 r13: 830c14ee1000 r14: 8300e6fc
> > (XEN) r15: 830617fd cr0: 8005003b cr4: 26e0
> > (XEN) cr3: 0001bd665000 cr2: 0451
> > (XEN) ds: es: fs: gs: ss: cs: e008
> >
> > 0x82d0801fa0c3 : mov $0x6c00,%edx
> > 0x82d0801fa0c8 : vmwrite %rax,%rdx
> >
> > The vmwrite tries to write 0x8005003b to 0x6c00.
> > But the active VCPU has a 0-vmcs-pointer.
>
> Which likely means altp2m manages to confuse some of VMX'es VMCS management - vmx_vmenter_helper() being on the path back to the guest, it should be impossible for the VMCS pointer to be zero here. Can you perhaps identify the most recent vmread or vmwrite which worked fine? There ought to be many on that path, and the state corruption could then perhaps be narrowed to quite small a range of code.
>
> Jan

Virus checked by G Data MailSecurity Version: AVA 25.7794 dated 08.08.2016 Virus news: www.antiviruslab.com
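The register dump quoted above can be read mechanically. The sketch below is a hedged illustration with helper names of my own (`vmx_result_from_rflags`, `vmcs_field_width`, `vmcs_field_type`, `vmcs_field_index`). It follows the VMX convention that CF=1 signals VMfailInvalid and ZF=1 signals VMfailValid, and the standard bit layout of VMCS field encodings.

```c
#include <stdint.h>

/* Outcome of a VMX instruction as reported through RFLAGS. */
enum vmx_result {
    VMX_SUCCEED,
    VMX_FAIL_INVALID, /* CF = 1: no valid current VMCS */
    VMX_FAIL_VALID    /* ZF = 1: error number stored in the current VMCS */
};

#define X86_EFLAGS_CF 0x00000001u /* bit 0 */
#define X86_EFLAGS_ZF 0x00000040u /* bit 6 */

static enum vmx_result vmx_result_from_rflags(uint64_t rflags)
{
    if (rflags & X86_EFLAGS_CF)
        return VMX_FAIL_INVALID;
    if (rflags & X86_EFLAGS_ZF)
        return VMX_FAIL_VALID;
    return VMX_SUCCEED;
}

/* Bits 14:13 of a field encoding: 0=16-bit, 1=64-bit, 2=32-bit, 3=natural. */
static unsigned int vmcs_field_width(uint32_t enc)
{
    return (enc >> 13) & 0x3u;
}

/* Bits 11:10: 0=control, 1=read-only data, 2=guest state, 3=host state. */
static unsigned int vmcs_field_type(uint32_t enc)
{
    return (enc >> 10) & 0x3u;
}

/* Bits 9:1: index of the field within its width/type group. */
static unsigned int vmcs_field_index(uint32_t enc)
{
    return (enc >> 1) & 0x1ffu;
}
```

Applied to the dump, RFLAGS 00010003 has CF set and ZF clear, so the failure is VMfailInvalid. The encoding 0x6c00 in rdx decodes to a natural-width host-state field with index 0, which is HOST_CR0, and rax holds the value 0x8005003b being written.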
Re: [Xen-devel] Xen 4.6.1 crash with altp2m enabled by default
According to the xen dmesg (XEN) RIP:e008:[] vmx_vmenter_helper+0x27e/0x30a (XEN) RFLAGS: 00010003 CONTEXT: hypervisor (XEN) rax: 8005003b rbx: 8300e72fc000 rcx: (XEN) rdx: 6c00 rsi: 830617fd7fc0 rdi: 8300e6fc (XEN) rbp: 830617fd7c40 rsp: 830617fd7c30 r8: (XEN) r9: 830be8dc9310 r10: r11: 3475e9cf85d0 (XEN) r12: 0006 r13: 830c14ee1000 r14: 8300e6fc (XEN) r15: 830617fd cr0: 8005003b cr4: 26e0 (XEN) cr3: 0001bd665000 cr2: 0451 (XEN) ds: es: fs: gs: ss: cs: e008 0x82d0801fa0c3:mov$0x6c00,%edx 0x82d0801fa0c8 :vmwrite %rax,%rdx The vmwrite tries to write 0x8005003b to 0x6c00. But the active VCPU has a 0-vmcs-pointer. > -Ursprüngliche Nachricht- > Von: Jan Beulich [mailto:jbeul...@suse.com] > Gesendet: Donnerstag, 4. August 2016 17:36 > An: Mayer, Kevin > Cc: andrew.coop...@citrix.com; xen-devel@lists.xen.org > Betreff: Re: AW: AW: [Xen-devel] Xen 4.6.1 crash with altp2m enabled by > default > > >>> On 04.08.16 at 17:08, wrote: > > crash> x /130x 0x830bd0da1000 > > 0x830bd0da1000: 0x000e 0x > > 0x830bd0da1010: 0x 0x > > 0x830bd0da1020: 0x 0x > > 0x830bd0da1030: 0x 0x > > 0x830bd0da1040: 0x 0x > > 0x830bd0da1050: 0x 0x > > 0x830bd0da1060: 0x 0x > > 0x830bd0da1070: 0x 0x000bd0da3000 > > 0x830bd0da1080: 0x000c17e36000 0x > > 0x830bd0da1090: 0x 0x > > 0x830bd0da10a0: 0xe7512000 0xe7513000 > > 0x830bd0da10b0: 0x000bd0da 0x > > 0x830bd0da10c0: 0x 0x > > 0x830bd0da10d0: 0x 0x006fedea809b > > 0x830bd0da10e0: 0x0001a379e000 0x000610f9101e > > 0x830bd0da10f0: 0x 0x > > 0x830bd0da1100: 0x 0x0007010600070106 > > 0x830bd0da1110: 0x 0x > > 0x830bd0da1120: 0x006bb6a075fa 0x00060042003f > > 0x830bd0da1130: 0x 0x000fefff > > 0x830bd0da1140: 0x 0x51ff > > 0x830bd0da1150: 0x0041 0x > > 0x830bd0da1160: 0x 0x000c > > 0x830bd0da1170: 0x 0x > > 0x830bd0da1180: 0x0001 0x > > 0x830bd0da1190: 0x0008 0x > > 0x830bd0da11a0: 0x0001 0x0096 > > 0x830bd0da11b0: 0x82d0802bc208 0x806f6dbc > > 0x830bd0da11c0: 0x 0x0400 > > 0x830bd0da11d0: 0x80550f34 0xf0e48161 > > 0x830bd0da11e0: 0x0246 0x > > 
0x830bd0da11f0: 0xf79c3000 0x804de6f0 > > 0x830bd0da1200: 0x0023 0x > > 0x830bd0da1210: 0x00c0f300 0x0008 > > 0x830bd0da1220: 0x 0x00c09b00 > > 0x830bd0da1230: 0x0010 0x > > 0x830bd0da1240: 0x00c09300 0x0023 > > 0x830bd0da1250: 0x 0x00c0f300 > > 0x830bd0da1260: 0x0030 0xffdff000 > > 0x830bd0da1270: 0x00c093001fff 0x > > 0x830bd0da1280: 0x 0x01c0 > > 0x830bd0da1290: 0x 0x > > 0x830bd0da12a0: 0x01c0 0x0028 > > 0x830bd0da12b0: 0x80042000 0x8b0020ab > > 0x830bd0da12c0: 0x8003f000 0x8003f400 > > 0x830bd0da12d0: 0x07ff03ff 0x8001003b > > 0x830bd0da12e0: 0x00039000 0x26d9 > > 0x830bd0da12f0: 0xdc3c 0x > > 0x830bd0da1300: 0xe008 0x > > 0x830bd0da1310: 0x 0xe040 > > 0x830bd0da1320: 0x050100070406 0x > >
Re: [Xen-devel] Xen 4.6.1 crash with altp2m enabled by default
According to the crash dump (output of vcpu) the v->arch.hvm_vmx.host_cr0 is "0". This cannot be the correct result because of

    if ( !(v->arch.hvm_vmx.host_cr0 & X86_CR0_TS) )
    {
        v->arch.hvm_vmx.host_cr0 |= X86_CR0_TS;
        __vmwrite(HOST_CR0, v->arch.hvm_vmx.host_cr0);
    }

It should at least be 0x8. Also the v->arch.hvm_vmx.vmcs is "0", which I assume leads to the crash.

Since I assumed that somehow the wrong VCPU is used, I tried to find the correct one. vcpus gives

        VCID  PCID  VCPU          ST  T  DOMID  DOMAIN
    >   0     0     8300e7557000  RU  I  32767  830c14ee1000
    >   1     1     8300e75f2000  RU  I  32767  830c14ee1000
        2     2     8300e72fe000  RU  I  32767  830c14ee1000
    >   3     3     8300e75f1000  RU  I  32767  830c14ee1000
    >   4     4     8300e75f      RU  I  32767  830c14ee1000
    >   5     5     8300e72fd000  RU  I  32767  830c14ee1000
    >*  6     6     8300e72fc000  RU  I  32767  830c14ee1000
    >   7     7     8300e72fb000  RU  I  32767  830c14ee1000
    >   0     2     8300e72f9000  RU  0  0      830c17e32000
        1     3     8300e72f8000  BL  0  0      830c17e32000
        2     5     8300e755f000  BL  0  0      830c17e32000
        3     0     8300e755e000  BL  0  0      830c17e32000
        4     6     8300e755d000  BL  0  0      830c17e32000
        5     4     8300e755c000  BL  0  0      830c17e32000
        6     7     8300e755b000  BL  0  0      830c17e32000
        7     5     8300e755a000  BL  0  0      830c17e32000
        0     1     8300e6fc7000  BL  U  162    830bdee8f000
        0     3     8300e6fc9000  BL  U  163    830be20d3000
        0     6     8300e6fc      BL  U  164    830be8dc9000
        0     0     8300e6fc6000  BL  U  165    830bd0cc

Since I see the domain 830be8dc9000 all over the xen dmesg, this should be the correct VCPU. On this VCPU the v->arch.hvm_vmx.host_cr0 is 2147811387 (0x8005003B), which corresponds to the cr0 in the xen dmesg. v->arch.hvm_vmx.vmcs is 0x830bd0da1000.

    crash> x /10x 0x830bd0da1000
    0x830bd0da1000: 0x000e 0x
    0x830bd0da1010: 0x 0x
    0x830bd0da1020: 0x 0x
    0x830bd0da1030: 0x 0x
    0x830bd0da1040: 0x 0x

So the VMCS revision ID is 0xe. rdmsr 0x480 (the IA32_VMX_BASIC MSR) gives da040e, which confirms the revision ID. Size should be 0x400 bytes.
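The two facts read from the MSR (revision ID and VMCS region size) come from fixed bit ranges of IA32_VMX_BASIC: bits 30:0 hold the revision identifier and bits 44:32 the region size in bytes. A minimal decoding sketch follows. The helper names are mine, and any full 64-bit value fed to it for checking is an assumption chosen to agree with the revision ID 0xe and the size 0x400 stated above, not a value copied from the dump, where the rdmsr output appears in abbreviated form.

```c
#include <stdint.h>

/* MSR index of IA32_VMX_BASIC, as used with rdmsr above. */
#define MSR_IA32_VMX_BASIC 0x480u

/* Bits 30:0: VMCS revision identifier. */
static uint32_t vmx_basic_revision_id(uint64_t basic)
{
    return (uint32_t)(basic & 0x7fffffffu);
}

/* Bits 44:32: number of bytes to allocate for a VMCS region. */
static uint32_t vmx_basic_vmcs_size(uint64_t basic)
{
    return (uint32_t)((basic >> 32) & 0x1fffu);
}
```

For an assumed MSR value of 0x00da04000000000e, the helpers return 0xe and 0x400 respectively. The first value is what the first 32-bit word of the dumped VMCS region is expected to contain.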
crash> x /130x 0x830bd0da1000 0x830bd0da1000: 0x000e 0x 0x830bd0da1010: 0x 0x 0x830bd0da1020: 0x 0x 0x830bd0da1030: 0x 0x 0x830bd0da1040: 0x 0x 0x830bd0da1050: 0x 0x 0x830bd0da1060: 0x 0x 0x830bd0da1070: 0x 0x000bd0da3000 0x830bd0da1080: 0x000c17e36000 0x 0x830bd0da1090: 0x 0x 0x830bd0da10a0: 0xe7512000 0xe7513000 0x830bd0da10b0: 0x000bd0da 0x 0x830bd0da10c0: 0x 0x 0x830bd0da10d0: 0x 0x006fedea809b 0x830bd0da10e0: 0x0001a379e000 0x000610f9101e 0x830bd0da10f0: 0x 0x 0x830bd0da1100: 0x 0x0007010600070106 0x830bd0da1110: 0x 0x 0x830bd0da1120: 0x006bb6a075fa 0x00060042003f 0x830bd0da1130: 0x 0x000fefff 0x830bd0da1140: 0x 0x51ff 0x830bd0da1150: 0x0041 0x 0x830bd0da1160: 0x 0x000c 0x830bd0da1170: 0x 0x 0x830bd0da1180: 0x0001 0x 0x830bd0da1190: 0x0008 0x 0x830bd0da11a0: 0x0001 0x0096 0x830bd0da11b0: 0x82d0802bc208 0x806f6dbc 0x830bd0da11c0: 0x 0x0400 0x830bd0da11d0: 0x80550f34 0xf0e48161 0x830bd0da11e0: 0x0246 0x 0x830bd0da11f0: 0xf79c3000 0x804de6f0 0x830bd0da1200: 0x0023 0x 0x830bd0da1210: 0x00c0f300 0x0008 0x830bd0da1220: 0x 0x00c09b00 0x830bd0da1230: 0x0010 0x 0x830bd0da1240:
Re: [Xen-devel] Xen 4.6.1 crash with altp2m enabled by default
Hi guys I got around to take a closer look at the crash dump today. tl;dr: You were right, vmx_vmenter_helper is not called at all in the call stack. The real reason behind the [] vmx_vmenter_helper+0x27e/0x30a should be a failed __vmwrite(HOST_CR0, v->arch.hvm_vmx.host_cr0); in static void vmx_fpu_leave(struct vcpu *v). Long story in "Chapter1". Concerning the stray vmx_vcpu_update_eptp: This seems to be leftovers (either due to a corrupted stack or simply uninitialized local variables) of previous function calls originating in hvm_vcpu_destroy. More precisely: if ( hvm_altp2m_supported() ) altp2m_vcpu_destroy(v); being called BEFORE free_compat_arg_xlat. I assume some kind of error in the altp2m_vcpu_destroy-path to be responsible for the crash, but I have no idea where and how to start investigating. Long story in "Chapter2". Chapter1: I started with a function in the callstack and followed the assembly code to deduce where the (XEN)[] vmx_vmenter_helper+0x27e/0x30a comes from: sync_local_execstate: 0x82d080178c36: mov%rsp,%rbp 0x82d080178c39 : callq 0x82d080178bbb <__sync_local_execstate> 0x82d080178c3e : pop%rbp __sync_local_execstate: 0x82d080178c09 <__sync_local_execstate+78>: cmp%rsi,0x7fe8(%rax) 0x82d080178c10 <__sync_local_execstate+85>: je 0x82d080178c14 <__sync_local_execstate+89> 0x82d080178c12 <__sync_local_execstate+87>: ud2 0x82d080178c13 <__sync_local_execstate+88>: or %eax,%ebp 0x82d080178c15 <__sync_local_execstate+90>: popfq 0x82d080178c16 <__sync_local_execstate+91>: retq $0x 0x82d080178c19 <__sync_local_execstate+94>: and$0x200,%ebx Here crash / gdb seem to get confused with the je. 
crash> x /3i __sync_local_execstate+89 0x82d080178c14 <__sync_local_execstate+89>: callq 0x82d080174eb6 <__context_switch> 0x82d080178c19 <__sync_local_execstate+94>: and$0x200,%ebx 0x82d080178c1f <__sync_local_execstate+100>: pushfq It seems this code calls the __context_switch: switch_required = (this_cpu(curr_vcpu) != current); if ( switch_required ) { ASSERT(current == idle_vcpu[smp_processor_id()]); __context_switch(); } Up to the __context_switch everything seems to be running as it should. Except for the stray [830617fd7d38] vmx_vcpu_update_eptp at 82d0801f7c6b which can be found in the "crash> bt" output but not in the "dmesg". __context_switch: 0x82d080174f7f <__context_switch+201>: mov%r14,%rdi 0x82d080174f82 <__context_switch+204>: callq 0x82d08017c474 0x82d080174f87 <__context_switch+209>: mov%r14,%rdi 0x82d080174f8a <__context_switch+212>: callq *0x3a8(%r14) Following r14 / rdi ( 0x8300e6fc ) as given in the crash dump seemingly leads to a vtable with a function pointer at the offset 0x3a8: 0x82d0801fa06e crash> x /i 0x82d0801fa06e 0x82d0801fa06e : push %rbp This call, which does not show up in the backtrace, is expected at this position when looking at the C-code: static void __context_switch(void) [...] if ( !is_idle_domain(pd) ) { memcpy(>arch.user_regs, stack_regs, CTXT_SWITCH_STACK_BYTES); vcpu_save_fpu(p); p->arch.ctxt_switch_from(p); [...] as it is set in: static int vmx_vcpu_initialise(struct vcpu *v) [...] v->arch.ctxt_switch_from = vmx_ctxt_switch_from; [...] Finally at: 0x82d0801fa0c3 :mov$0x6c00,%edx 0x82d0801fa0c8 :vmwrite %rax,%rdx 0x82d0801fa0cb :jbe0x82d0801fd23a The jump to [] vmx_vmenter_helper+0x27e/0x30a (ud2 following vmx_vmenter_helper) is done. vmx_ctxt_switch_from is rather short in C and the called static functions are inlined. 
static void vmx_ctxt_switch_from(struct vcpu *v) { /* * Return early if trying to do a context switch without VMX enabled, * this can happen when the hypervisor shuts down with HVM guests * still running. */ if ( unlikely(!this_cpu(vmxon)) ) return; vmx_fpu_leave(v); vmx_save_guest_msrs(v); vmx_restore_host_msrs(); vmx_save_dr(v); } The unlikely path is not taken and the two ud2 (I assume the ud2 are the ASSERTs in vmx_fpu_leave?) are not reached either: 0x82d0801fa077 : lea 0x15c692(%rip),%rax# 0x82d080356710 0x82d0801fa07e :mov%rsp,%rdx 0x82d0801fa081 :and $0x8000,%rdx 0x82d0801fa088 :mov 0x7ff0(%rdx),%rdx 0x82d0801fa08f :cmpb $0x0,(%rdx,%rax,1) 0x82d0801fa093 :je 0x82d0801fa1d9
Re: [Xen-devel] Xen 4.6.1 crash with altp2m enabled by default
Thanks for your reply. I installed the debug hypervisor and got a new crash dump now. I must confess that I have little to no experience debugging crash dumps, but this seems to be a different kind of error, or at least the way the error is reached is different. The pattern with “page number X invalid” and the “restore” repeats for all preceding domains visible in the dump. […] (XEN) memory.c:269:d164v0 Domain 164 page number 54fc invalid (XEN) memory.c:269:d164v0 Domain 164 page number 54fd invalid (XEN) grant_table.c:1491:d164v0 Expanding dom (164) grant table from (4) to (32) frames. (XEN) Dom164 callback via changed to GSI 28 (XEN) HVM165 restore: VM saved on one CPU (0x206c2) and restored on another (0x106a5). (XEN) HVM165 restore: CPU 0 (XEN) HVM165 restore: PIC 0 (XEN) HVM165 restore: PIC 1 (XEN) HVM165 restore: IOAPIC 0 (XEN) HVM165 restore: LAPIC 0 (XEN) HVM165 restore: LAPIC_REGS 0 (XEN) HVM165 restore: PCI_IRQ 0 (XEN) HVM165 restore: ISA_IRQ 0 (XEN) HVM165 restore: PCI_LINK 0 (XEN) HVM165 restore: PIT 0 (XEN) HVM165 restore: RTC 0 (XEN) HVM165 restore: HPET 0 (XEN) HVM165 restore: PMTIMER 0 (XEN) HVM165 restore: MTRR 0 (XEN) HVM165 restore: VMCE_VCPU 0 (XEN) HVM165 restore: TSC_ADJUST 0 (XEN) memory.c:269:d165v0 Domain 165 page number 54de invalid (XEN) memory.c:269:d165v0 Domain 165 page number 54df invalid (XEN) memory.c:269:d165v0 Domain 165 page number 54e0 invalid (XEN) memory.c:269:d165v0 Domain 165 page number 54e1 invalid (XEN) memory.c:269:d165v0 Domain 165 page number 54e2 invalid (XEN) memory.c:269:d165v0 Domain 165 page number 54e3 invalid (XEN) memory.c:269:d165v0 Domain 165 page number 54e4 invalid (XEN) memory.c:269:d165v0 Domain 165 page number 54e5 invalid (XEN) memory.c:269:d165v0 Domain 165 page number 54e6 invalid (XEN) memory.c:269:d165v0 Domain 165 page number 54e7 invalid (XEN) memory.c:269:d165v0 Domain 165 page number 54e8 invalid (XEN) memory.c:269:d165v0 Domain 165 page number 54e9 invalid (XEN) memory.c:269:d165v0 Domain 165 
page number 54ea invalid (XEN) memory.c:269:d165v0 Domain 165 page number 54eb invalid (XEN) memory.c:269:d165v0 Domain 165 page number 54ec invalid (XEN) memory.c:269:d165v0 Domain 165 page number 54ed invalid (XEN) memory.c:269:d165v0 Domain 165 page number 54ee invalid (XEN) memory.c:269:d165v0 Domain 165 page number 54ef invalid (XEN) memory.c:269:d165v0 Domain 165 page number 54f0 invalid (XEN) memory.c:269:d165v0 Domain 165 page number 54f1 invalid (XEN) memory.c:269:d165v0 Domain 165 page number 54f2 invalid (XEN) memory.c:269:d165v0 Domain 165 page number 54f3 invalid (XEN) memory.c:269:d165v0 Domain 165 page number 54f4 invalid (XEN) memory.c:269:d165v0 Domain 165 page number 54f5 invalid (XEN) memory.c:269:d165v0 Domain 165 page number 54f6 invalid (XEN) memory.c:269:d165v0 Domain 165 page number 54f7 invalid (XEN) memory.c:269:d165v0 Domain 165 page number 54f8 invalid (XEN) memory.c:269:d165v0 Domain 165 page number 54f9 invalid (XEN) memory.c:269:d165v0 Domain 165 page number 54fa invalid (XEN) memory.c:269:d165v0 Domain 165 page number 54fb invalid (XEN) memory.c:269:d165v0 Domain 165 page number 54fc invalid (XEN) memory.c:269:d165v0 Domain 165 page number 54fd invalid (XEN) grant_table.c:1491:d165v0 Expanding dom (165) grant table from (4) to (32) frames. (XEN) Dom165 callback via changed to GSI 28 (XEN) Debugging connection not set up. 
(XEN) [ Xen-4.6.1 x86_64 debug=y Not tainted ] (XEN) CPU:6 (XEN) RIP:e008:[] vmx_vmenter_helper+0x27e/0x30a (XEN) RFLAGS: 00010003 CONTEXT: hypervisor (XEN) rax: 8005003b rbx: 8300e72fc000 rcx: (XEN) rdx: 6c00 rsi: 830617fd7fc0 rdi: 8300e6fc (XEN) rbp: 830617fd7c40 rsp: 830617fd7c30 r8: (XEN) r9: 830be8dc9310 r10: r11: 3475e9cf85d0 (XEN) r12: 0006 r13: 830c14ee1000 r14: 8300e6fc (XEN) r15: 830617fd cr0: 8005003b cr4: 26e0 (XEN) cr3: 0001bd665000 cr2: 0451 (XEN) ds: es: fs: gs: ss: cs: e008 (XEN) Xen stack trace from rsp=830617fd7c30: (XEN)830617fd7c40 8300e72fc000 830617fd7ca0 82d080174f91 (XEN)830617fd7f18 830be8dc9000 0286 830617fd7c90 (XEN)0206 0246 0001 830617e91250 (XEN)8300e72fc000 830be8dc9000 830617fd7cc0 82d080178c19 (XEN)00bdeeae 8300e72fc000 830617fd7cd0 82d080178c3e (XEN)830617fd7d20 82d080179740 8300e6fc2000 830c17e38e80 (XEN)830617e91250 82008000 0002 830617e91250 (XEN)830617e91240 830be8dc9000 830617fd7d70 82d080196152 (XEN)830617fd7d50 82d0801f7c6b 8300e6fc2000 830617e91250 (XEN)8300e6fc2000 830617e91250 830617e91240 830be8dc9000 (XEN)830617fd7d80
[Xen-devel] Xen 4.6.1 crash with altp2m enabled by default
Hi guys

We are using Xen 4.6.1 to manage our virtual machines on x86-64 servers. We start dozens of VMs and destroy them again after 60 seconds, which works fine as it is, but the next step in our approach requires the altp2m functionality. Since libvirt does not pass the altp2m-enable flag to the hypervisor, we enabled altp2m unconditionally by patching hvm.c. Since all of our machines support altp2m this seemed to be OK.

     d->arch.hvm_domain.params[HVM_PARAM_HPET_ENABLED] = 1;
     d->arch.hvm_domain.params[HVM_PARAM_TRIPLE_FAULT_REASON] = SHUTDOWN_reboot;
    +d->arch.hvm_domain.params[HVM_PARAM_ALTP2M] = 1;
    +
     vpic_init(d);
     rc = vioapic_init(d);

Since applying this patch the hypervisor crashes after several hundred restarted VMs (without us using any altp2m functionality) with the following dmesg:

(XEN) [ Xen-4.6.1 x86_64 debug=n Not tainted ]
(XEN) CPU:7
(XEN) RIP:e008:[] vmx_vmenter_helper+0x2b5/0x340
(XEN) RFLAGS: 00010003 CONTEXT: hypervisor (d0v3)
(XEN) rax: 8005003b rbx: 8300e7038000 rcx: 0008
(XEN) rdx: 6c00 rsi: 83062eb5e000 rdi: 8300e7038000
(XEN) rbp: 830c17e3f000 rsp: 830617fc7d70 r8:
(XEN) r9: 83014f8d7028 r10: 02700f858000 r11: 2201be6861f0
(XEN) r12: 83062eb5e000 r13: 8300e752f000 r14: 82d08030ea40
(XEN) r15: 0007 cr0: 8005003b cr4: 26e0
(XEN) cr3: 0001bf4da000 cr2: dd840c00
(XEN) ds: es: fs: gs: ss: cs: e008
(XEN) Xen stack trace from rsp=830617fc7d70:
(XEN) 8300e7038000 82d080170c04 000780109f6a
(XEN) 830617fc7f18 831e 8300e752f19c
(XEN) 0286 8300e752f000 8300e72fc000 0007
(XEN) 830c17e3f000 830c14ee1000 82d08030ea40 82d080173d6a
(XEN)
(XEN) 82d08030ea40 8300e72fc000 02700f481091 0001
(XEN) 82d080324560 82d08030ea40 8300e752f000 82d080128004
(XEN) 0001 01c9c380 830c14ef60e8 17fce600
(XEN) 0001 82d0801bd18b 82d0801d9e88 8300e752f000
(XEN) 01c9c380 82d08012e700 006e0171
(XEN) 830617fc 82d0802f8f80 83062eb5e000
(XEN) 82d08030ea40 82d08012b040 8300e7038000 830617fc
(XEN) 8300e7038000 830c14ee1000 82d080170970
(XEN) 8300e72fc000
(XEN) 80550f50 ffdffc70
(XEN) 2fcffe19
(XEN) ffdffc70 ffdffc50 853b0918
(XEN) 00fa f0e48162 0246
(XEN) 80550f34
(XEN) 0007 8300e752f000
(XEN) Xen call trace:
(XEN) [] vmx_vmenter_helper+0x2b5/0x340
(XEN) [] __context_switch+0xb4/0x350
(XEN) [] context_switch+0xca/0xef0
(XEN) [] schedule+0x264/0x5f0
(XEN) [] mwait_idle+0x25b/0x3a0
(XEN) [] hvm_vcpu_has_pending_irq+0x58/0xc0
(XEN) [] timer_softirq_action+0x80/0x250
(XEN) [] __do_softirq+0x60/0x90
(XEN) [] idle_loop+0x20/0x50
(XEN)
(XEN)
(XEN)
(XEN) Panic on CPU 7:
(XEN) FATAL TRAP: vector = 6 (invalid opcode)
(XEN)
(XEN)
(XEN) Reboot in five seconds...
(XEN) Executing kexec image on cpu7
(XEN) Shot down all CPUs

The RIP points to a ud2:

0x82d0801f5a55: ud2

From the RFLAGS we concluded that the vmwrite failed due to an invalid VMCS pointer (CF = 1), but this is where we are stuck, since we have no idea how the pointer could have gotten corrupted.

crash> vcpu gives vmcs = 0x817cbc20 for vcpu_id = 7, and vcpus gives

  VCID PCID VCPU         ST T DOMID DOMAIN
  0    0    8300e75f2000 RU I 32767 830c14ee1000
  1    1    8300e72fe000 RU I 32767 830c14ee1000
  2    2    8300e7527000 RU I 32767 830c14ee1000
> 3    3    8300e7526000 RU I 32767 830c14ee1000
  4    4    8300e75f1000 RU I 32767 830c14ee1000
> 5    5    8300e75f     RU I 32767 830c14ee1000
> 6    6    8300e72fd000 RU I 32767 830c14ee1000
  7    7    8300e72fc000 RU I 32767 830c14ee1000
  0    0    8300e72fa000 BL 0 0     830c17e3f000
  1    6    8300e72f9000 BL 0 0     830c17e3f000
  2    3    8300e72f8000 BL 0 0     830c17e3f000
> 3    7    8300e752f000 RU 0 0     830c17e3f000
  4    5    8300e752e000 RU 0 0     830c17e3f000
>
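A note on the RFLAGS reasoning in the message above: VMX instructions such as vmwrite report failure through RFLAGS, with CF = 1 meaning there is no valid current VMCS and ZF = 1 meaning the failure has an error code stored in the current VMCS. The following is only a minimal sketch of that decoding, not Xen code; the enum and the function name are made up for illustration, and only the CF/ZF bit positions come from the architecture.

```c
#include <assert.h>
#include <stdint.h>

/* RFLAGS bits that VMX instructions use to report their outcome. */
#define X86_EFLAGS_CF (1u << 0)   /* carry flag */
#define X86_EFLAGS_ZF (1u << 6)   /* zero flag */

static_assert(X86_EFLAGS_ZF == 0x40, "ZF is bit 6 of RFLAGS");

/* Hypothetical names, used only for this illustration. */
enum vmx_result {
    VMX_SUCCEED,        /* CF = 0 and ZF = 0 */
    VMX_FAIL_INVALID,   /* CF = 1: no valid current VMCS */
    VMX_FAIL_VALID      /* ZF = 1: error code is in the current VMCS */
};

/* Classify the outcome of a VMX instruction from the RFLAGS it left behind. */
static enum vmx_result vmx_result_from_rflags(uint64_t rflags)
{
    if (rflags & X86_EFLAGS_CF)
        return VMX_FAIL_INVALID;
    if (rflags & X86_EFLAGS_ZF)
        return VMX_FAIL_VALID;
    return VMX_SUCCEED;
}
```

Feeding in the value from the dump, 0x00010003, selects the CF branch, which matches the conclusion that the VMCS pointer was invalid when the vmwrite ran.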
Re: [Xen-devel] Branch Trace Storage for guests and VPMU initialization
-----Original Message-----
From: Boris Ostrovsky [mailto:boris.ostrov...@oracle.com]
Sent: Thursday, 26 February 2015 17:35
To: Dietmar Hahn; xen-devel@lists.xen.org
Cc: Mayer, Kevin
Subject: Re: [Xen-devel] Branch Trace Storage for guests and VPMU initialization

On 02/26/2015 03:56 AM, Dietmar Hahn wrote:
On Wednesday 25 February 2015, 11:31:31, Boris Ostrovsky wrote:
On 02/25/2015 10:12 AM, kevin.ma...@gdata.de wrote:

-----Original Message-----
From: Boris Ostrovsky [mailto:boris.ostrov...@oracle.com]
Sent: Tuesday, 24 February 2015 18:13
To: Mayer, Kevin; xen-devel@lists.xen.org
Subject: Re: [Xen-devel] Branch Trace Storage for guests and VPMU initialization

On 02/24/2015 10:27 AM, kevin.ma...@gdata.de wrote:

Hi guys

I'm trying to set up the BTS so that I can log the branches taken in the guest using Xen 4.4.1 with a WinXP SP3 guest on a Core i7 Sandy Bridge. I added the vpmu=bts boot parameter to my grub2 configuration and extended libxl, libxc, domctl, ... with a command of my own so that I can trigger the activation of the BTS whenever I want.

I am not sure why you are doing all these changes to Xen code. BTS is supposed to be managed from the guest. For example, a Fedora HVM guest will produce this:

[root@dhcp-burlington7-2nd-B-east-10-152-55-140 ~]# perf record -e branches:u -c 1 -d sleep 1
[ perf record: Woken up 3838 times to write data ]
[ perf record: Captured and wrote 0.704 MB perf.data (~30756 samples) ]
[root@dhcp-burlington7-2nd-B-east-10-152-55-140 ~]# perf script -f ip,addr,sym,dso,symoff --show-kernel-path
8167c347 native_irq_return_iret+0x0 (/proc/kcore) = 328c001590 [unknown] (/proc/kcore)
8167c347 native_irq_return_iret+0x0 (/proc/kcore) = 328c001590 [unknown] ([unknown])
328c001593 [unknown] ([unknown]) = 328c004b70 [unknown] ([unknown])
...

I want to be able to log the taken branches (of the guest) without the need to modify the guest at all. This means I have to do all the logic in the hypervisor, or am I wrong?

In that case, yes. But then you have to make sure that at least

* you don't load guest's VPMU (or, at least, BTS-related registers) on context switch

But you need to modify PMU registers when switching to/from the guest context to get PMU running.

I was thinking that all BTS stuff can be controlled from dom0 and so we can use dom0's version of these registers. I didn't realize that DS_AREA would have to be accessed in guest's address space (and that DEBUGCTL is loaded from VMCS). Which is what I think I said in response to this message (which didn't show up on the list because Kevin accidentally dropped xen-devel).

-boris

Terribly sorry about that...

So the VPMU doesn't get loaded when there is a VMENTER? I thought I could set the domU-vcpu-vpmu to enable BTS while in dom0 (with modified versions of msr_write_intercept, vpmu_do_wrmsr and core2_vpmu_do_wrmsr of course, since the built-in ones use the current vcpu, which would be the dom0 vcpu) and as soon as there is a context switch to domU the vpmu gets loaded and the guest starts logging. If the described behavior is correct, the only problem I can see is with allocating memory in dom0 in a way that the guest can access it. But if I got it wrong please explain how the vpmu really works.

Cheers
Kevin

I didn't think of using the VPMU stuff with modifying the context from outside the guest.

* You don't send the interrupt to the guest (meaning that you will need to somehow inform dom0 of the BTS interrupt) and probably more.

Essentially, you want dom0 to profile the guest. I have been working on patches that would allow that but they are still under review.

In this command I do the following: I set up the memory region for the BTS Buffer and the DS Buffer Management Area using xzalloc_bytes

I don't think you should be allocating BTS buffers in the hypervisor, they are in guest's memory.

I agree. As I said I think this is where my main problem is at the moment. Is there any way I can allocate memory in the hypervisor in a way the guest can access it?

I am not sure this is what you want since you seem to *not* want the guest to process the samples, right? But yes, you can. E.g. something like what map_vcpu_info() does. (I have no idea how you'd do this from Windows.)

The DS buffer has to be mapped within the guest's address space so the CPU running in guest context can access this area. Otherwise you get this triple fault. So I would think you need a mixture of writing some stuff in Windows and patching the hypervisor.

Dietmar.

Of course the guest must not be able to use this memory in its normal operations but just for BTS. Is this even possible? I am rather confused at the moment. :-D

Then I write the pointer to the BTS Buffer into the DS Buffer
Re: [Xen-devel] Branch Trace Storage for guests and VPMU initialization
-----Original Message-----
From: Boris Ostrovsky [mailto:boris.ostrov...@oracle.com]
Sent: Tuesday, 24 February 2015 18:13
To: Mayer, Kevin; xen-devel@lists.xen.org
Subject: Re: [Xen-devel] Branch Trace Storage for guests and VPMU initialization

On 02/24/2015 10:27 AM, kevin.ma...@gdata.de wrote:

Hi guys

I'm trying to set up the BTS so that I can log the branches taken in the guest using Xen 4.4.1 with a WinXP SP3 guest on a Core i7 Sandy Bridge. I added the vpmu=bts boot parameter to my grub2 configuration and extended libxl, libxc, domctl, ... with a command of my own so that I can trigger the activation of the BTS whenever I want.

I am not sure why you are doing all these changes to Xen code. BTS is supposed to be managed from the guest. For example, a Fedora HVM guest will produce this:

[root@dhcp-burlington7-2nd-B-east-10-152-55-140 ~]# perf record -e branches:u -c 1 -d sleep 1
[ perf record: Woken up 3838 times to write data ]
[ perf record: Captured and wrote 0.704 MB perf.data (~30756 samples) ]
[root@dhcp-burlington7-2nd-B-east-10-152-55-140 ~]# perf script -f ip,addr,sym,dso,symoff --show-kernel-path
8167c347 native_irq_return_iret+0x0 (/proc/kcore) = 328c001590 [unknown] (/proc/kcore)
8167c347 native_irq_return_iret+0x0 (/proc/kcore) = 328c001590 [unknown] ([unknown])
328c001593 [unknown] ([unknown]) = 328c004b70 [unknown] ([unknown])
...

I want to be able to log the taken branches (of the guest) without the need to modify the guest at all. This means I have to do all the logic in the hypervisor, or am I wrong?

In this command I do the following: I set up the memory region for the BTS Buffer and the DS Buffer Management Area using xzalloc_bytes

I don't think you should be allocating BTS buffers in the hypervisor, they are in guest's memory.

I agree. As I said I think this is where my main problem is at the moment. Is there any way I can allocate memory in the hypervisor in a way the guest can access it? Of course the guest must not be able to use this memory in its normal operations but just for BTS. Is this even possible? I am rather confused at the moment. :-D

Then I write the pointer to the BTS Buffer into the DS Buffer Management Area at +0x0 and +0x8 (BTS Buffer Base and BTS Index). When I use vmx_msr_write_intercept to store the value in MSR_IA32_DS_AREA the host reboots (my guess is that it tries to access a vpmu struct that isn't there in the current vcpu and panics).

Can you post hypervisor log? (hard to say how helpful it will be without seeing your code changes though)

Right after enabling the BTS I get a triple fault.

hvm.c:1357:d2 Triple fault on VCPU0 - invoking HVM shutdown action 1.

When I use a modified version of vmx_msr_write_intercept I don't get any crashes as long as I don't enable BTS and TR in the GUEST_IA32_DEBUGCTL (BTR works). When I enable the BTS (and TR) the guest crashes. I suppose it gets killed by the hypervisor for accessing forbidden memory.

Possibly because DS area points to hypervisor memory. Having said all this, I am not sure how well BTS works. You did notice this in the hypervisor log:

(XEN) **
(XEN) ** WARNING: Emulation of BTS Feature is switched on **
(XEN) ** Using this processor feature in a virtualized **
(XEN) ** environment is not 100% safe. **
(XEN) ** Setting the DS buffer address with wrong values **
(XEN) ** may lead to hypervisor hangs or crashes. **
(XEN) ** It is NOT recommended for production use! **
(XEN) **

Yes, I saw that. It doesn't state that BTS is not working at all, just that it is not that safe to use. As I understand it, as long as I set the DS buffer address correctly I should be fine, right? Since I don't want to use it for production that is fine with me. At least for now.

Kevin

-boris

The modified version of vmx_msr_write_intercept takes a vcpu struct as a parameter and uses this instead of the current vcpu. Instead of

    static int vmx_msr_write_intercept(unsigned int msr, uint64_t msr_content)
    {
        struct vcpu *v = current;

I just have

    static int own_vmx_msr_write_intercept(unsigned int msr, uint64_t msr_content, struct vcpu *v)

I get this vcpu via d->vcpu[0], as I have limited my guest domain to one vcpu atm. Of course I also use similarly modified versions of the called functions (vpmu_do_wrmsr, ...).

I'm pretty sure that my problem is with a wrong scope/usage of the vcpus/memory, but I have no idea how to fix this. I can see a potential problem with the memory allocation (in the host) into which the CPU in guest mode is supposed to write. Or maybe I got the principle of a vcpu/vpmu all wrong.

Since I couldn't find any project that uses the BTS for the guest, I am wondering if anyone has ever
[Xen-devel] Branch Trace Storage for guests and VPMU initialization
Hi guys

I'm trying to set up the BTS so that I can log the branches taken in the guest using Xen 4.4.1 with a WinXP SP3 guest on a Core i7 Sandy Bridge. I added the vpmu=bts boot parameter to my grub2 configuration and extended libxl, libxc, domctl, ... with a command of my own so that I can trigger the activation of the BTS whenever I want.

In this command I do the following:

* I set up the memory region for the BTS Buffer and the DS Buffer Management Area using xzalloc_bytes.
* Then I write the pointer to the BTS Buffer into the DS Buffer Management Area at +0x0 and +0x8 (BTS Buffer Base and BTS Index).

When I use vmx_msr_write_intercept to store the value in MSR_IA32_DS_AREA the host reboots (my guess is that it tries to access a vpmu struct that isn't there in the current vcpu and panics). When I use a modified version of vmx_msr_write_intercept I don't get any crashes as long as I don't enable BTS and TR in the GUEST_IA32_DEBUGCTL (BTR works). When I enable the BTS (and TR) the guest crashes. I suppose it gets killed by the hypervisor for accessing forbidden memory.

The modified version of vmx_msr_write_intercept takes a vcpu struct as a parameter and uses this instead of the current vcpu. Instead of

    static int vmx_msr_write_intercept(unsigned int msr, uint64_t msr_content)
    {
        struct vcpu *v = current;

I just have

    static int own_vmx_msr_write_intercept(unsigned int msr, uint64_t msr_content, struct vcpu *v)

I get this vcpu via d->vcpu[0], as I have limited my guest domain to one vcpu atm. Of course I also use similarly modified versions of the called functions (vpmu_do_wrmsr, ...).

I'm pretty sure that my problem is with a wrong scope/usage of the vcpus/memory, but I have no idea how to fix this. I can see a potential problem with the memory allocation (in the host) into which the CPU in guest mode is supposed to write. Or maybe I got the principle of a vcpu/vpmu all wrong.

Since I couldn't find any project that uses the BTS for the guest, I am wondering if anyone has ever done this and if it is possible at all. Any input is welcome as I am pretty much stuck atm...

Cheers
Kevin

Virus checked by G Data MailSecurity Version: AVA 25.404 dated 24.02.2015 Virus news: www.antiviruslab.com
___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
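For readers following the setup described in this message, here is a rough sketch of the BTS part of the DS Buffer Management Area in its 64-bit layout, with the two fields the message mentions at +0x0 and +0x8. The struct and function names are invented for illustration and are not Xen or Linux code; the threshold handling is just one plausible choice. Note that every field holds a linear address that the CPU dereferences in whatever address space is active while it writes the records, which is why a pointer returned by xzalloc_bytes is not usable while the guest is running.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One BTS record in the 64-bit format. */
struct bts_record {
    uint64_t from;    /* linear address of the branch instruction */
    uint64_t to;      /* linear address of the branch target */
    uint64_t flags;   /* flag bits, e.g. branch predicted */
};

/* BTS part of the DS Buffer Management Area (64-bit layout). */
struct ds_area64 {
    uint64_t bts_buffer_base;          /* +0x00: start of the BTS buffer */
    uint64_t bts_index;                /* +0x08: where the next record goes */
    uint64_t bts_absolute_maximum;     /* +0x10: first byte past the buffer */
    uint64_t bts_interrupt_threshold;  /* +0x18: index that raises the PMI */
};

static_assert(sizeof(struct bts_record) == 24, "record is three quadwords");
static_assert(offsetof(struct ds_area64, bts_index) == 0x08, "BTS Index at +0x8");
static_assert(offsetof(struct ds_area64, bts_interrupt_threshold) == 0x18,
              "threshold at +0x18");

/* IA32_DEBUGCTL bits involved in branch tracing. */
#define DEBUGCTL_TR    (1ull << 6)   /* enable branch trace messages */
#define DEBUGCTL_BTS   (1ull << 7)   /* store them in the BTS buffer */
#define DEBUGCTL_BTINT (1ull << 8)   /* interrupt at the threshold */

/*
 * Hypothetical helper: describe a buffer of nrecords records starting at
 * linear address buf_va, with the threshold headroom records before the end.
 */
static void ds_area_init(struct ds_area64 *ds, uint64_t buf_va,
                         size_t nrecords, size_t headroom)
{
    ds->bts_buffer_base = buf_va;
    ds->bts_index = buf_va;    /* empty buffer: index equals base */
    ds->bts_absolute_maximum = buf_va + nrecords * sizeof(struct bts_record);
    ds->bts_interrupt_threshold =
        ds->bts_absolute_maximum - headroom * sizeof(struct bts_record);
}
```

The address of the populated structure is what would go into MSR_IA32_DS_AREA, and it has to be valid in the guest's address space for the same reason as the buffer itself.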
[Xen-devel] Tracking guest code execution with EPT violations
Hi all

I'm trying to track code execution with page granularity by setting the access rights in the EPT to not executable on Xen 4.4.1. The idea is as follows:

According to the Intel manual:

    A reference using a guest-physical address whose translation encounters an EPT paging-structure that is not present causes an EPT violation.

So whenever a nonexistent memory page gets requested, an EPT violation is caused (and handled by ept_handle_violation). By extending the EXIT_REASON_EPT_VIOLATION handling I should be able to set the access rights for every new page to access_rw (by using the p2m->get_entry and p2m->set_entry functions right after the violation was handled), leading to a new EPT violation every time an instruction is fetched from this page.

There are several problems with my approach so far:

* I get too few unique GFNs (derived from the gpa by PAGE_SHIFT) in the EPT violations when booting a WinXP guest. I get about 250 EPT_VIOLATIONS with unique GFNs when booting the guest OS and none when starting new programs in the guest. So something seems to be wrong there. I also read the access rights of the pages back after setting them. Most of the time the access rights are access_n before and still access_n after I tried setting them to access_rw (this happens when the type is p2m_mmio_dm; when the type is p2m_ram_rw the setting works temporarily).
* I never get an EPT violation with the EPT_EXEC_VIOLATION flag set in the exit qualification, even for the pages where setting the access rights did succeed.
* Later, when checking the access rights of the GFNs (I simply save the GFNs in an array and use p2m->get_entry in my own call to domctl.c from xl), they all have access right access_n and type p2m_mmio_dm, even for the pages where setting the access rights did succeed or the type was different before.

This all tells me that there is something fundamentally wrong with my approach so far, leading me to the following questions:

1. Every time a new page in memory is allocated by the guest I get an EPT_VIOLATION, right?
   a. If this is the case, then why don't I get new violations after Windows has finished booting?
2. What is the difference between the types p2m_mmio_dm and p2m_ram_rw? (I have a feeling that part of the problem lies here.)
3. Are the p2m->get_entry/p2m->set_entry functions the right tools for this purpose?
   a. If they are, then why do they sometimes fail?
4. To get the domain I use struct vcpu *curr = current; and struct p2m_domain *p2m = p2m_get_hostp2m(curr->domain); before using the get/set_entry functions. Am I getting confused with wrong domains or something like that?
5. Because I just set the access rights to rw every time EXIT_REASON_EPT_VIOLATION is handled, the whole domain should freeze/crash as soon as the first page tries to execute an instruction, right? It doesn't, because I get no execution attempts on the pages I set to access_rw, but why don't I get an execution attempt?

I hope it is clear what I am trying to achieve.

Thanks
Kevin

Virus checked by G Data MailSecurity Version: AVA 24.6111 dated 16.01.2015 Virus news: www.antiviruslab.com
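As background for the questions above, the following sketch shows how the low bits of the EPT violation exit qualification can be decoded and how the GFN is derived from the guest-physical address. It is not taken from Xen: the macros are defined locally (the first three merely mirror the EPT_EXEC_VIOLATION naming used in the message), and the struct and function are hypothetical. Bits 0 to 2 describe the access that faulted, bits 3 to 5 describe what the EPT entry permitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT 12   /* 4 KiB pages */

/* What kind of access caused the violation (exit qualification bits 0-2). */
#define EPT_READ_VIOLATION   (1ull << 0)
#define EPT_WRITE_VIOLATION  (1ull << 1)
#define EPT_EXEC_VIOLATION   (1ull << 2)
/* What the EPT entry allowed at the time (bits 3-5). */
#define EPT_ENTRY_READABLE   (1ull << 3)
#define EPT_ENTRY_WRITABLE   (1ull << 4)
#define EPT_ENTRY_EXECUTABLE (1ull << 5)

/* Hypothetical container for the decoded information. */
struct ept_violation_info {
    uint64_t gfn;                  /* guest frame number of the access */
    bool read, write, exec;        /* attempted access */
    bool page_r, page_w, page_x;   /* permissions in the EPT entry */
};

/* Decode an exit qualification together with the guest-physical address. */
static struct ept_violation_info decode_ept_violation(uint64_t qual,
                                                      uint64_t gpa)
{
    struct ept_violation_info info = {
        .gfn    = gpa >> PAGE_SHIFT,
        .read   = (qual & EPT_READ_VIOLATION) != 0,
        .write  = (qual & EPT_WRITE_VIOLATION) != 0,
        .exec   = (qual & EPT_EXEC_VIOLATION) != 0,
        .page_r = (qual & EPT_ENTRY_READABLE) != 0,
        .page_w = (qual & EPT_ENTRY_WRITABLE) != 0,
        .page_x = (qual & EPT_ENTRY_EXECUTABLE) != 0,
    };
    return info;
}
```

Under these assumptions, an instruction fetch from a page whose entry really is read/write but not executable would be expected to show exec, page_r and page_w set with page_x clear; logging the decoded fields for each violation may help tell whether the access_rw setting ever reached the EPT entry.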