Re: [Xen-devel] Xen 4.6.1 crash with altp2m enabled by default

2016-09-21 Thread Kevin.Mayer
Hi guys

I have found the problem (after hours and hours of gruesome
debugging with the almighty print), and it seems that it could potentially
have quite a bit of impact if altp2m is enabled for a guest domain (even if
the functionality is never actively used), since destroying any vcpu of this
guest could lead to a hypervisor panic.
So a malicious user could simply destroy and restart his VM(s) in order to
DoS the VMs of other users by killing the hypervisor.
Granted, this is not very effective, but, depending on the environment, it
is extremely easy to implement.
The bug persists in Xen 4.7, and I do not think that it was fixed in the
current master branch.

The following happens.
The call
void hvm_vcpu_destroy(struct vcpu *v)
{
    hvm_all_ioreq_servers_remove_vcpu(v->domain, v);
    if ( hvm_altp2m_supported() )
        altp2m_vcpu_destroy(v);

at some point reaches vmx_vcpu_update_eptp, which ends with a
vmx_vmcs_exit(v);.
There vmx_clear_vmcs(v); -> __vmx_clear_vmcs is called, where the
current_vmcs is invalidated if the current vmcs in the CPU is the same as
virt_to_maddr(v->arch.hvm_vmx.vmcs):

__vmpclear(virt_to_maddr(arch_vmx->vmcs)); (
http://www.tptp.cc/mirrors/siyobik.info/instruction/VMCLEAR.html )

To check this assumption I implemented a basic __vmptrst (
http://www.tptp.cc/mirrors/siyobik.info/instruction/VMPTRST.html ) and added
the result to the debug output.
(XEN) vmcs.c:519:IDLEv4 __vmx_clear_vmcs: realVMCS BEFORE __vmpclear
82415a000 
(XEN) vmcs.c:522:IDLEv4 __vmx_clear_vmcs: realVMCS AFTER __vmpclear


After that no vmcs_load / enter is executed so the vmcs in the CPU remains
invalidated.
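
To make the sequence above easier to follow, here is a minimal toy model of
the per-CPU current-VMCS pointer. It is not Xen code: the class, its method
names and the address 0x1000 are hypothetical and only mirror the instruction
names. It assumes the architectural behaviour that clearing the VMCS which is
currently loaded makes the current pointer invalid, after which a vmwrite
fails until another VMCS is loaded.

```python
# Toy model of the per-CPU "current VMCS pointer". This is NOT Xen code;
# the class and method names are hypothetical and only mirror the
# instruction names (VMPTRLD, VMCLEAR, VMPTRST, VMWRITE).

INVALID_VMCS = 0xFFFFFFFFFFFFFFFF  # pointer value when no VMCS is current


class VmFail(Exception):
    """Raised where real hardware would report a VM-instruction failure."""


class ToyCpu:
    def __init__(self):
        self.current_vmcs = INVALID_VMCS
        self.fields = {}  # (vmcs address, field encoding) -> value

    def vmptrld(self, addr):
        self.current_vmcs = addr

    def vmclear(self, addr):
        # Clearing the VMCS that is current also invalidates the pointer.
        if self.current_vmcs == addr:
            self.current_vmcs = INVALID_VMCS

    def vmptrst(self):
        return self.current_vmcs

    def vmwrite(self, field, value):
        if self.current_vmcs == INVALID_VMCS:
            raise VmFail("vmwrite with no current VMCS")
        self.fields[(self.current_vmcs, field)] = value


HOST_CR0 = 0x6C00
cpu = ToyCpu()
vmcs = 0x1000                        # hypothetical machine address
cpu.vmptrld(vmcs)
cpu.vmwrite(HOST_CR0, 0x8005003B)    # works: a VMCS is current
cpu.vmclear(vmcs)                    # the step __vmx_clear_vmcs performs
try:
    cpu.vmwrite(HOST_CR0, 0x8005003B)
    outcome = "ok"
except VmFail:
    outcome = "failed"
print(outcome)
```

In this model the second vmwrite fails because nothing reloads the VMCS
between the clear and the write, which is the situation described above.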

For the next function in hvm_vcpu_destroy, nestedhvm_vcpu_destroy(v), the
missing vmcs is no problem (at least in our use case), but
free_compat_arg_xlat crashes.
The callstack is as follows:
hvm_vcpu_destroy
free_compat_arg_xlat
destroy_perdomain_mapping
map_domain_page
(probably inlined) mapcache_current_vcpu
sync_local_execstate
__sync_local_execstate
__context_switch
(with function pointer v->arch.ctxt_switch_from = vmx_ctxt_switch_from)
vmx_ctxt_switch_from 
(probably inlined) vmx_fpu_leave

There a vmwrite is tried if either ( !(v->arch.hvm_vmx.host_cr0 &
X86_CR0_TS) ) or ( !(v->arch.hvm_vcpu.guest_cr[0] & X86_CR0_TS) ) is true.
The executed vmwrite then crashes.
As my knowledge of Xen is not that comprehensive, could you tell me when the
TS-bits are set / cleared and what they are used for?

static void vmx_fpu_leave(struct vcpu *v)
{
    ASSERT(!v->fpu_dirtied);
    ASSERT(read_cr0() & X86_CR0_TS);

    if ( !(v->arch.hvm_vmx.host_cr0 & X86_CR0_TS) )
    {
        v->arch.hvm_vmx.host_cr0 |= X86_CR0_TS;
        __vmwrite(HOST_CR0, v->arch.hvm_vmx.host_cr0);
    }

    if ( !(v->arch.hvm_vcpu.guest_cr[0] & X86_CR0_TS) )
    {
        v->arch.hvm_vcpu.hw_cr[0] |= X86_CR0_TS;
        __vmwrite(GUEST_CR0, v->arch.hvm_vcpu.hw_cr[0]);
        v->arch.hvm_vmx.exception_bitmap |= (1u << TRAP_no_device);
        vmx_update_exception_bitmap(v);
    }
}

In the crash dump the additional debug output shows that at least one
__vmwrite will be tried and that the VMCS in the CPU is invalidated:
(XEN) vmx.c:698:IDLEv4 vmx_fpu_leave: vcpu 8300defae000 vmcs
8301586c9000 host_cr0-case FALSE guest_cr[0]-case TRUE curr
8300df2fb000 curr->arch.hvm_vmx.vmcs  realVMCS


As a quick fix I patched vmx_fpu_leave to only allow the __vmwrite when the
realVMCS is valid.
This seems to work fine, but requires a call to __vmptrst every time
vmx_fpu_leave is called. Also I do not know if an ignored TS has any
negative consequences when destroying a vcpu. I assume that this is not
the case. In our tests nothing pointed to any problems.
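
The idea of the guard can be sketched as a small decision function. This is
only an illustration of the logic, not the attached patch: the function name
fpu_leave_writes and the parameter real_vmcs (standing for whatever a
__vmptrst would report) are hypothetical, and the exception-bitmap update of
the guest branch is left out.

```python
X86_CR0_TS = 0x8                   # task-switched bit in CR0
HOST_CR0 = 0x6C00                  # VMCS field encodings
GUEST_CR0 = 0x6800
INVALID_VMCS = 0xFFFFFFFFFFFFFFFF  # pointer value when no VMCS is current


def fpu_leave_writes(host_cr0, guest_cr0, hw_cr0, real_vmcs):
    """Return the (field, value) pairs the guarded variant would write.

    real_vmcs stands for the value a VMPTRST would report; the guard
    skips every write when no VMCS is current.
    """
    if real_vmcs == INVALID_VMCS:
        return []
    writes = []
    if not host_cr0 & X86_CR0_TS:
        writes.append((HOST_CR0, host_cr0 | X86_CR0_TS))
    if not guest_cr0 & X86_CR0_TS:
        writes.append((GUEST_CR0, hw_cr0 | X86_CR0_TS))
    return writes


# Host TS already set, guest TS clear (hypothetical guest value), as in the
# "host_cr0-case FALSE guest_cr[0]-case TRUE" debug line.
print(fpu_leave_writes(0x8005003B, 0x80050033, 0x80050033, INVALID_VMCS))
print(fpu_leave_writes(0x8005003B, 0x80050033, 0x80050033, 0x1000))
```

With an invalid pointer nothing is written. With a valid one only the
GUEST_CR0 write is attempted, since the host value already has TS set.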

I have attached the patch to enable altp2m unconditionally and a patch which
avoids the panic in vmx_fpu_leave.
They are not pretty or anywhere near production ready, but I think you will
get the idea.
I tried to implement the __vmptrst with the #ifdef HAVE_GAS_VMX parts
(analogous to the other functions in vmx.h) but failed miserably since I lack
the required knowledge about the OPCODE definitions. :-D

Cheers

Kevin

> -----Original Message-----
> From: Andrew Cooper [mailto:andrew.coop...@citrix.com]
> Sent: Monday, 22 August 2016 13:58
> To: Mayer, Kevin ; jbeul...@suse.com
> Cc: xen-devel@lists.xen.org
> Subject: Re: AW: [Xen-devel] Xen 4.6.1 crash with altp2m enabled by default
> 
> On 19/08/16 11:01, kevin.ma...@gdata.de wrote:
> > Hi
> >
> > I took another look at Xen and a new crashdump.
> > The last successful __vmwrite should be in static void
> > vmx_vcpu_update_vmfunc_ve(struct vcpu *v) [...]
> > __vmwrite(SECONDARY_VM_EXEC_CONTROL,
> >   v->arch.hvm_vmx.secondary_exec_control);
> > [...]
> > After this the altp2m_vcpu_destroy wakes up the vcpu and is then
> finished.
> >
> > In nestedhvm_vcpu_destroy (nvmx_vcpu_destroy) the vmcs can
> overwritten (but is not reached in our case as 

Re: [Xen-devel] Xen 4.6.1 crash with altp2m enabled by default

2016-09-07 Thread Kevin.Mayer
Hi

I took the time to write a small script which restores and destroys domains 
from provided state files.
Just apply the patch to Xen 4.6.1, provide some images + state files and 
start the script.

python VmStarter.py -FILE /path/to/domU-0.state -FILE /path/to/domU-1.state 
--loggingLevel DEBUG

You can provide an arbitrary number of state files and the script will start an 
additional thread for each one.
Each thread restores one guest domain from the provided state file, waits for a 
random time between 20 and 30 seconds (sleepTime = random.randint(20,30)), 
destroys the domain and then starts the process again.

The guest domains and the corresponding state files need to have the same name 
since the script extracts the domain name from the state file name.
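
For readers without the attachment, the loop described above can be sketched
roughly as follows. This is not VmStarter.py itself: the function names are
hypothetical, the restore, destroy and sleep actions are passed in as
callables because the exact toolstack calls are not shown in this mail, and
the cycles parameter is an assumption added so that the sketch terminates
(the script repeats indefinitely).

```python
import os
import random
import threading


def domain_name_from_state_file(path):
    """'/path/to/domU-0.state' -> 'domU-0' (name taken from the file name)."""
    return os.path.splitext(os.path.basename(path))[0]


def restore_destroy_cycle(state_file, restore, destroy, sleep, cycles):
    """Restore a domain, wait 20 to 30 seconds, destroy it, and repeat."""
    name = domain_name_from_state_file(state_file)
    for _ in range(cycles):
        restore(state_file)
        sleep(random.randint(20, 30))
        destroy(name)


def run_all(state_files, restore, destroy, sleep, cycles):
    """Start one worker thread per state file and wait for all of them."""
    threads = [
        threading.Thread(target=restore_destroy_cycle,
                         args=(f, restore, destroy, sleep, cycles))
        for f in state_files
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

In a real run, sleep would be time.sleep, and restore / destroy would call
into whatever toolstack is in use.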

When starting about one guest domain for every physical core of the CPU the 
crash should occur in 5 to 10 minutes. Since the crashes are pretty random the 
hypervisor sometimes panics almost instantly and sometimes it takes a while, 
but it seems to correlate with the number of started guest domains.
More domains => faster crash

Kevin

> -----Original Message-----
> From: Andrew Cooper [mailto:andrew.coop...@citrix.com]
> Sent: Monday, 22 August 2016 13:58
> To: Mayer, Kevin ; jbeul...@suse.com
> Cc: xen-devel@lists.xen.org
> Subject: Re: AW: [Xen-devel] Xen 4.6.1 crash with altp2m enabled by default
> 
> On 19/08/16 11:01, kevin.ma...@gdata.de wrote:
> > Hi
> >
> > I took another look at Xen and a new crashdump.
> > The last successful __vmwrite should be in static void
> > vmx_vcpu_update_vmfunc_ve(struct vcpu *v) [...]
> > __vmwrite(SECONDARY_VM_EXEC_CONTROL,
> >   v->arch.hvm_vmx.secondary_exec_control);
> > [...]
> > After this the altp2m_vcpu_destroy wakes up the vcpu and is then
> finished.
> >
> > In nestedhvm_vcpu_destroy (nvmx_vcpu_destroy) the vmcs can
> overwritten (but is not reached in our case as far as I can see):
> > if ( nvcpu->nv_n1vmcx )
> > v->arch.hvm_vmx.vmcs = nvcpu->nv_n1vmcx;
> >
> > In conclusion:
> > When destroying a domain the altp2m_vcpu_destroy(v); path seems to
> mess up the vmcs which ( only ) sometimes leads to a failed __vmwrite in
> vmx_fpu_leave.
> > That is as far as I can get with my understanding of the Xen code.
> >
> > Do you guys have any additional ideas what I could test / analyse?
> 
> Do you have easy reproduction instructions you could share?  Sadly, this is
> looking like an issue which isn't viable to debug over email.
> 
> ~Andrew


Virus checked by G Data MailSecurity
Version: AVA 25.8183 dated 07.09.2016
Virus news: www.antiviruslab.com

xen-altp2menable.patch
Description: xen-altp2menable.patch


VmStarter.py
Description: VmStarter.py
___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] Xen 4.6.1 crash with altp2m enabled by default

2016-08-22 Thread Kevin.Mayer
Hi

The reproduction should be pretty simple:

Apply the patch to enable altp2m unconditionally:
 d->arch.hvm_domain.params[HVM_PARAM_HPET_ENABLED] = 1;
 d->arch.hvm_domain.params[HVM_PARAM_TRIPLE_FAULT_REASON] = SHUTDOWN_reboot;
+d->arch.hvm_domain.params[HVM_PARAM_ALTP2M] = 1;
+
 vpic_init(d);
 rc = vioapic_init(d);

For the guest we use one state file (Windows 10) from which the guests are 
restored with libvirt.
Simply restore and destroy several guests (5-7 in our current setup) in quick 
succession (every guest has about 1-2 minutes of runtime).
The number of guest VMs seems to correlate with the time until the crash 
occurs, but other, random factors seem to be more important.
More VMs => the crash happens faster.


Is the following debug-setup possible?
L0: Xen / VMWare
L1: Xen with altp2m enabled
L2: Several guest-VMs being constantly restored / destroyed

Then periodically take snapshots until the hypervisor panics and try to debug 
from the latest snapshot on.

> -----Original Message-----
> From: Andrew Cooper [mailto:andrew.coop...@citrix.com]
> Sent: Monday, 22 August 2016 13:58
> To: Mayer, Kevin ; jbeul...@suse.com
> Cc: xen-devel@lists.xen.org
> Subject: Re: AW: [Xen-devel] Xen 4.6.1 crash with altp2m enabled by default
> 
> On 19/08/16 11:01, kevin.ma...@gdata.de wrote:
> > Hi
> >
> > I took another look at Xen and a new crashdump.
> > The last successful __vmwrite should be in static void
> > vmx_vcpu_update_vmfunc_ve(struct vcpu *v) [...]
> > __vmwrite(SECONDARY_VM_EXEC_CONTROL,
> >   v->arch.hvm_vmx.secondary_exec_control);
> > [...]
> > After this the altp2m_vcpu_destroy wakes up the vcpu and is then
> finished.
> >
> > In nestedhvm_vcpu_destroy (nvmx_vcpu_destroy) the vmcs can
> overwritten (but is not reached in our case as far as I can see):
> > if ( nvcpu->nv_n1vmcx )
> > v->arch.hvm_vmx.vmcs = nvcpu->nv_n1vmcx;
> >
> > In conclusion:
> > When destroying a domain the altp2m_vcpu_destroy(v); path seems to
> mess up the vmcs which ( only ) sometimes leads to a failed __vmwrite in
> vmx_fpu_leave.
> > That is as far as I can get with my understanding of the Xen code.
> >
> > Do you guys have any additional ideas what I could test / analyse?
> 
> Do you have easy reproduction instructions you could share?  Sadly, this is
> looking like an issue which isn't viable to debug over email.
> 
> ~Andrew



Re: [Xen-devel] Xen 4.6.1 crash with altp2m enabled by default

2016-08-19 Thread Kevin.Mayer
Hi

I took another look at Xen and a new crashdump.
The last successful __vmwrite should be in 
static void vmx_vcpu_update_vmfunc_ve(struct vcpu *v)
[...]
__vmwrite(SECONDARY_VM_EXEC_CONTROL,
  v->arch.hvm_vmx.secondary_exec_control);
[...]
After this the altp2m_vcpu_destroy wakes up the vcpu and is then finished.

In nestedhvm_vcpu_destroy (nvmx_vcpu_destroy) the vmcs can be overwritten (but 
this is not reached in our case as far as I can see):
if ( nvcpu->nv_n1vmcx )
v->arch.hvm_vmx.vmcs = nvcpu->nv_n1vmcx;

In conclusion:
When destroying a domain the altp2m_vcpu_destroy(v); path seems to mess up the 
vmcs which ( only ) sometimes leads to a failed __vmwrite in vmx_fpu_leave.
That is as far as I can get with my understanding of the Xen code.

Do you guys have any additional ideas what I could test / analyse?

> -----Original Message-----
> From: Jan Beulich [mailto:jbeul...@suse.com]
> Sent: Monday, 8 August 2016 12:29
> To: Mayer, Kevin 
> Cc: andrew.coop...@citrix.com; xen-devel@lists.xen.org
> Subject: Re: [Xen-devel] Xen 4.6.1 crash with altp2m enabled by default
> 
> >>> On 08.08.16 at 11:48,  wrote:
> > vmx_vmenter_helper is not part of the call stack. The address is
> > simply the location of the ud2 to which the __vmwrite(HOST_CR0,
> > v->arch.hvm_vmx.host_cr0); In static void vmx_fpu_leave(struct vcpu
> > *v) jumps.
> > There are two vmwrites in vmx_vcpu_update_eptp (called by
> > altp2m_vcpu_destroy):
> > __vmwrite(EPT_POINTER, ept_get_eptp(ept)); __vmwrite(EPTP_INDEX,
> > vcpu_altp2m(v).p2midx);
> >
> > And four in vmx_vcpu_update_vmfunc_ve (also called by
> > altp2m_vcpu_destroy) __vmwrite(VM_FUNCTION_CONTROL,
> > VMX_VMFUNC_EPTP_SWITCHING); __vmwrite(EPTP_LIST_ADDR,
> > virt_to_maddr(d->arch.altp2m_eptp));
> > __vmwrite(VIRT_EXCEPTION_INFO, mfn_x(mfn) << PAGE_SHIFT);
> > __vmwrite(SECONDARY_VM_EXEC_CONTROL,
> > v->arch.hvm_vmx.secondary_exec_control);
> >
> > After the altp2m-part hvm_vcpu_destroy also calls
> > nestedhvm_vcpu_destroy(v), but this code path is executed
> > unconditionally so I assume that the error lies somewhere in the
> altp2m_vcpu_destroy(v).
> >
> > What exactly are the vmx_vmcs_enter / exit required for? I often see
> > the vmx_vmcs_enter; __vmwrite; vmx_vmcs_exit combination. Need the
> > __vmwrites be guarded by an enter / exit ( which Is not the case in
> > the static void vmx_fpu_leave(struct vcpu *v) )?
> 
> On code paths where the correct VMCS may not be the current one it is
> necessary to frame vmread / vmwrite accordingly.
> 
> > Is it possible that the
> > altp2m_vcpu_destroy->vmx_vcpu_update_eptp->vmx_vmcs_exit-
> >vmx_clear_vm
> > cs invalidates the vmcs for the current vcpu?
> 
> I certainly can't exclude this possibility.
> 
> Jan



Re: [Xen-devel] Xen 4.6.1 crash with altp2m enabled by default

2016-08-08 Thread Kevin.Mayer
vmx_vmenter_helper is not part of the call stack. The address is simply the 
location of the ud2 that is jumped to when the 
__vmwrite(HOST_CR0, v->arch.hvm_vmx.host_cr0);
in
static void vmx_fpu_leave(struct vcpu *v)
fails.
There are two vmwrites in vmx_vcpu_update_eptp (called by altp2m_vcpu_destroy):
__vmwrite(EPT_POINTER, ept_get_eptp(ept));
__vmwrite(EPTP_INDEX, vcpu_altp2m(v).p2midx);

And four in vmx_vcpu_update_vmfunc_ve (also called by altp2m_vcpu_destroy)
__vmwrite(VM_FUNCTION_CONTROL, VMX_VMFUNC_EPTP_SWITCHING);
__vmwrite(EPTP_LIST_ADDR, virt_to_maddr(d->arch.altp2m_eptp));
__vmwrite(VIRT_EXCEPTION_INFO, mfn_x(mfn) << PAGE_SHIFT);
__vmwrite(SECONDARY_VM_EXEC_CONTROL,  v->arch.hvm_vmx.secondary_exec_control);

After the altp2m-part hvm_vcpu_destroy also calls nestedhvm_vcpu_destroy(v), 
but this code path is executed unconditionally so I assume that the error lies 
somewhere in the altp2m_vcpu_destroy(v).

What exactly are the vmx_vmcs_enter / exit required for? I often see the 
vmx_vmcs_enter; __vmwrite; vmx_vmcs_exit combination. Do the __vmwrites need 
to be guarded by an enter / exit (which is not the case in the static void 
vmx_fpu_leave(struct vcpu *v))?
Is it possible that the 
altp2m_vcpu_destroy->vmx_vcpu_update_eptp->vmx_vmcs_exit->vmx_clear_vmcs 
invalidates the vmcs for the current vcpu?

Cheers

Kevin

> -----Original Message-----
> From: Jan Beulich [mailto:jbeul...@suse.com]
> Sent: Friday, 5 August 2016 16:49
> To: Mayer, Kevin 
> Cc: andrew.coop...@citrix.com; xen-devel@lists.xen.org
> Subject: Re: AW: AW: AW: [Xen-devel] Xen 4.6.1 crash with altp2m enabled
> by default
> 
> >>> On 05.08.16 at 14:51,  wrote:
> > According to the xen dmesg
> >
> > (XEN) RIP:e008:[]
> vmx_vmenter_helper+0x27e/0x30a
> > (XEN) RFLAGS: 00010003   CONTEXT: hypervisor
> > (XEN) rax: 8005003b   rbx: 8300e72fc000   rcx:
> 
> > (XEN) rdx: 6c00   rsi: 830617fd7fc0   rdi: 8300e6fc
> > (XEN) rbp: 830617fd7c40   rsp: 830617fd7c30   r8:  
> > (XEN) r9:  830be8dc9310   r10:    r11: 3475e9cf85d0
> > (XEN) r12: 0006   r13: 830c14ee1000   r14: 8300e6fc
> > (XEN) r15: 830617fd   cr0: 8005003b   cr4:
> 26e0
> > (XEN) cr3: 0001bd665000   cr2: 0451
> > (XEN) ds:    es:    fs:    gs:    ss:    cs: e008
> >
> > 0x82d0801fa0c3 :mov$0x6c00,%edx
> > 0x82d0801fa0c8 :vmwrite %rax,%rdx
> >
> > The vmwrite tries to write 0x8005003b   to 0x6c00.
> > But the active VCPU has a 0-vmcs-pointer.
> 
> Which likely means altp2m manages to confuse some of VMX'es VMCS
> management - vmx_vmenter_helper() being on the path back to the guest,
> it should be impossible for the VMCS pointer to be zero here. Can you
> perhaps identify the most recent vmread or vmwrite which worked fine?
> There ought to be many on that path, and the state corruption could then
> perhaps be narrowed to quite small a range of code.
> 
> Jan



Re: [Xen-devel] Xen 4.6.1 crash with altp2m enabled by default

2016-08-05 Thread Kevin.Mayer
According to the xen dmesg

(XEN) RIP:e008:[] vmx_vmenter_helper+0x27e/0x30a
(XEN) RFLAGS: 00010003   CONTEXT: hypervisor
(XEN) rax: 8005003b   rbx: 8300e72fc000   rcx: 
(XEN) rdx: 6c00   rsi: 830617fd7fc0   rdi: 8300e6fc
(XEN) rbp: 830617fd7c40   rsp: 830617fd7c30   r8:  
(XEN) r9:  830be8dc9310   r10:    r11: 3475e9cf85d0
(XEN) r12: 0006   r13: 830c14ee1000   r14: 8300e6fc
(XEN) r15: 830617fd   cr0: 8005003b   cr4: 26e0
(XEN) cr3: 0001bd665000   cr2: 0451
(XEN) ds:    es:    fs:    gs:    ss:    cs: e008

0x82d0801fa0c3 :    mov    $0x6c00,%edx
0x82d0801fa0c8 :    vmwrite %rax,%rdx

The vmwrite tries to write 0x8005003b   to 0x6c00.
But the active VCPU has a 0-vmcs-pointer.
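
As a cross-check of which field 0x6c00 refers to, a VMCS field encoding can
be split into its parts. The sketch below assumes the architectural layout of
the encoding (bit 0 access type, bits 9:1 index, bits 11:10 field type, bits
14:13 width); the function name and the label strings are my own.

```python
WIDTHS = {0: "16-bit", 1: "64-bit", 2: "32-bit", 3: "natural-width"}
TYPES = {0: "control", 1: "VM-exit information", 2: "guest state", 3: "host state"}


def decode_vmcs_field(encoding):
    """Split a VMCS field encoding into its architectural components."""
    return {
        "access": "high" if encoding & 1 else "full",
        "index": (encoding >> 1) & 0x1FF,
        "type": TYPES[(encoding >> 10) & 0x3],
        "width": WIDTHS[(encoding >> 13) & 0x3],
    }


print(decode_vmcs_field(0x6C00))
```

0x6c00 decodes to the natural-width host-state field with index 0, which
matches a write to HOST_CR0 with the cr0 value held in rax.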



> -----Original Message-----
> From: Jan Beulich [mailto:jbeul...@suse.com]
> Sent: Thursday, 4 August 2016 17:36
> To: Mayer, Kevin 
> Cc: andrew.coop...@citrix.com; xen-devel@lists.xen.org
> Subject: Re: AW: AW: [Xen-devel] Xen 4.6.1 crash with altp2m enabled by
> default
> 
> >>> On 04.08.16 at 17:08,  wrote:
> > crash> x /130x 0x830bd0da1000
> > 0x830bd0da1000: 0x000e  0x
> > 0x830bd0da1010: 0x  0x
> > 0x830bd0da1020: 0x  0x
> > 0x830bd0da1030: 0x  0x
> > 0x830bd0da1040: 0x  0x
> > 0x830bd0da1050: 0x  0x
> > 0x830bd0da1060: 0x  0x
> > 0x830bd0da1070: 0x  0x000bd0da3000
> > 0x830bd0da1080: 0x000c17e36000  0x
> > 0x830bd0da1090: 0x  0x
> > 0x830bd0da10a0: 0xe7512000  0xe7513000
> > 0x830bd0da10b0: 0x000bd0da  0x
> > 0x830bd0da10c0: 0x  0x
> > 0x830bd0da10d0: 0x  0x006fedea809b
> > 0x830bd0da10e0: 0x0001a379e000  0x000610f9101e
> > 0x830bd0da10f0: 0x  0x
> > 0x830bd0da1100: 0x  0x0007010600070106
> > 0x830bd0da1110: 0x  0x
> > 0x830bd0da1120: 0x006bb6a075fa  0x00060042003f
> > 0x830bd0da1130: 0x  0x000fefff
> > 0x830bd0da1140: 0x  0x51ff
> > 0x830bd0da1150: 0x0041  0x
> > 0x830bd0da1160: 0x  0x000c
> > 0x830bd0da1170: 0x  0x
> > 0x830bd0da1180: 0x0001  0x
> > 0x830bd0da1190: 0x0008  0x
> > 0x830bd0da11a0: 0x0001  0x0096
> > 0x830bd0da11b0: 0x82d0802bc208  0x806f6dbc
> > 0x830bd0da11c0: 0x  0x0400
> > 0x830bd0da11d0: 0x80550f34  0xf0e48161
> > 0x830bd0da11e0: 0x0246  0x
> > 0x830bd0da11f0: 0xf79c3000  0x804de6f0
> > 0x830bd0da1200: 0x0023  0x
> > 0x830bd0da1210: 0x00c0f300  0x0008
> > 0x830bd0da1220: 0x  0x00c09b00
> > 0x830bd0da1230: 0x0010  0x
> > 0x830bd0da1240: 0x00c09300  0x0023
> > 0x830bd0da1250: 0x  0x00c0f300
> > 0x830bd0da1260: 0x0030  0xffdff000
> > 0x830bd0da1270: 0x00c093001fff  0x
> > 0x830bd0da1280: 0x  0x01c0
> > 0x830bd0da1290: 0x  0x
> > 0x830bd0da12a0: 0x01c0  0x0028
> > 0x830bd0da12b0: 0x80042000  0x8b0020ab
> > 0x830bd0da12c0: 0x8003f000  0x8003f400
> > 0x830bd0da12d0: 0x07ff03ff  0x8001003b
> > 0x830bd0da12e0: 0x00039000  0x26d9
> > 0x830bd0da12f0: 0xdc3c  0x
> > 0x830bd0da1300: 0xe008  0x
> > 0x830bd0da1310: 0x  0xe040
> > 0x830bd0da1320: 0x050100070406  0x
> > 

Re: [Xen-devel] Xen 4.6.1 crash with altp2m enabled by default

2016-08-04 Thread Kevin.Mayer
According to the crash-dump ( output of vcpu ) the v->arch.hvm_vmx.host_cr0 is 
" 0 ".
This cannot be the correct result because of

if ( !(v->arch.hvm_vmx.host_cr0 & X86_CR0_TS) )
{
v->arch.hvm_vmx.host_cr0 |= X86_CR0_TS;
__vmwrite(HOST_CR0, v->arch.hvm_vmx.host_cr0);
}
It should at least be 0x8.
Also the v->arch.hvm_vmx.vmcs is " 0 " which I assume leads to the crash.


Since I assumed that somehow the wrong VCPU is used I tried to find the correct 
one:

vcpus gives
   VCID  PCID   VCPU   ST T DOMID  DOMAIN 
> 0 0 8300e7557000 RU I 32767 830c14ee1000
> 1 1 8300e75f2000 RU I 32767 830c14ee1000
  2 2 8300e72fe000 RU I 32767 830c14ee1000
> 3 3 8300e75f1000 RU I 32767 830c14ee1000
> 4 4 8300e75f RU I 32767 830c14ee1000
> 5 5 8300e72fd000 RU I 32767 830c14ee1000
>*6 6 8300e72fc000 RU I 32767 830c14ee1000
> 7 7 8300e72fb000 RU I 32767 830c14ee1000
> 0 2 8300e72f9000 RU 0 0 830c17e32000
  1 3 8300e72f8000 BL 0 0 830c17e32000
  2 5 8300e755f000 BL 0 0 830c17e32000
  3 0 8300e755e000 BL 0 0 830c17e32000
  4 6 8300e755d000 BL 0 0 830c17e32000
  5 4 8300e755c000 BL 0 0 830c17e32000
  6 7 8300e755b000 BL 0 0 830c17e32000
  7 5 8300e755a000 BL 0 0 830c17e32000
  0 1 8300e6fc7000 BL U   162 830bdee8f000
  0 3 8300e6fc9000 BL U   163 830be20d3000
  0 6 8300e6fc BL U   164 830be8dc9000
  0 0 8300e6fc6000 BL U   165 830bd0cc

Since I see the domain 830be8dc9000 all over the xen dmesg this should be 
the correct VCPU.
On this VCPU the v->arch.hvm_vmx.host_cr0 is 2147811387 (0x8005003B) which 
corresponds to the cr0 in the xen dmesg.
v->arch.hvm_vmx.vmcs is 0x830bd0da1000
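
The decimal and hex forms of that host_cr0 value, and whether TS is set in
it, can be checked with a few lines. The bit table follows the architectural
CR0 layout; the helper name cr0_flags is just for illustration.

```python
CR0_BITS = {
    0: "PE", 1: "MP", 2: "EM", 3: "TS", 4: "ET", 5: "NE",
    16: "WP", 18: "AM", 29: "NW", 30: "CD", 31: "PG",
}
X86_CR0_TS = 1 << 3


def cr0_flags(value):
    """Return the names of the architectural CR0 flags set in value."""
    return [name for bit, name in sorted(CR0_BITS.items()) if (value >> bit) & 1]


host_cr0 = 2147811387
print(hex(host_cr0), cr0_flags(host_cr0))
```

TS (0x8) is among the set flags, which is what one would expect once the
host_cr0 |= X86_CR0_TS line in vmx_fpu_leave has run for this vcpu.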

crash> x /10x 0x830bd0da1000
0x830bd0da1000: 0x000e  0x
0x830bd0da1010: 0x  0x
0x830bd0da1020: 0x  0x
0x830bd0da1030: 0x  0x
0x830bd0da1040: 0x  0x

So the vmcs revision id is 0xe.
rdmsr 0x480 (the IA32_VMX_BASIC MSR ) gives da040e which confirms the 
revision ID.
Size should be 0x400 bytes.
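
For reference, this is how the two fields can be pulled out of
IA32_VMX_BASIC, assuming the architectural layout (revision identifier in
bits 30:0, VMCS region size in bits 44:32). The example value is assembled
only from the revision ID and size stated above and is not the full MSR
content; the function name is hypothetical.

```python
def decode_vmx_basic(msr):
    """Extract the revision identifier and VMCS region size from IA32_VMX_BASIC."""
    return {
        "revision_id": msr & 0x7FFFFFFF,            # bits 30:0
        "vmcs_region_size": (msr >> 32) & 0x1FFF,   # bits 44:32, in bytes
    }


# Illustrative value assembled from the two fields quoted in the message;
# the remaining bits of the real MSR are deliberately left at zero here.
example = (0x400 << 32) | 0xE
print(decode_vmx_basic(example))
```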

crash> x /130x 0x830bd0da1000
0x830bd0da1000: 0x000e  0x
0x830bd0da1010: 0x  0x
0x830bd0da1020: 0x  0x
0x830bd0da1030: 0x  0x
0x830bd0da1040: 0x  0x
0x830bd0da1050: 0x  0x
0x830bd0da1060: 0x  0x
0x830bd0da1070: 0x  0x000bd0da3000
0x830bd0da1080: 0x000c17e36000  0x
0x830bd0da1090: 0x  0x
0x830bd0da10a0: 0xe7512000  0xe7513000
0x830bd0da10b0: 0x000bd0da  0x
0x830bd0da10c0: 0x  0x
0x830bd0da10d0: 0x  0x006fedea809b
0x830bd0da10e0: 0x0001a379e000  0x000610f9101e
0x830bd0da10f0: 0x  0x
0x830bd0da1100: 0x  0x0007010600070106
0x830bd0da1110: 0x  0x
0x830bd0da1120: 0x006bb6a075fa  0x00060042003f
0x830bd0da1130: 0x  0x000fefff
0x830bd0da1140: 0x  0x51ff
0x830bd0da1150: 0x0041  0x
0x830bd0da1160: 0x  0x000c
0x830bd0da1170: 0x  0x
0x830bd0da1180: 0x0001  0x
0x830bd0da1190: 0x0008  0x
0x830bd0da11a0: 0x0001  0x0096
0x830bd0da11b0: 0x82d0802bc208  0x806f6dbc
0x830bd0da11c0: 0x  0x0400
0x830bd0da11d0: 0x80550f34  0xf0e48161
0x830bd0da11e0: 0x0246  0x
0x830bd0da11f0: 0xf79c3000  0x804de6f0
0x830bd0da1200: 0x0023  0x
0x830bd0da1210: 0x00c0f300  0x0008
0x830bd0da1220: 0x  0x00c09b00
0x830bd0da1230: 0x0010  0x
0x830bd0da1240: 

Re: [Xen-devel] Xen 4.6.1 crash with altp2m enabled by default

2016-08-03 Thread Kevin.Mayer
Hi guys

I got around to taking a closer look at the crash dump today.

tl;dr:
You were right, vmx_vmenter_helper is not called at all in the call stack.
The real reason behind the [] vmx_vmenter_helper+0x27e/0x30a 
should be a failed
__vmwrite(HOST_CR0, v->arch.hvm_vmx.host_cr0); in static void 
vmx_fpu_leave(struct vcpu *v).
Long story in "Chapter1".

Concerning the stray vmx_vcpu_update_eptp:
These seem to be leftovers (either due to a corrupted stack or simply 
uninitialized local variables) of previous function calls originating in 
hvm_vcpu_destroy.
More precisely:
if ( hvm_altp2m_supported() )
altp2m_vcpu_destroy(v);
being called BEFORE free_compat_arg_xlat.
I assume some kind of error in the altp2m_vcpu_destroy-path to be responsible 
for the crash, but I have no idea where and how to start investigating.
Long story in "Chapter2".



Chapter1:
I started with a function in the callstack and followed the assembly code to 
deduce where the (XEN)[] vmx_vmenter_helper+0x27e/0x30a 
comes from:

sync_local_execstate:
0x82d080178c36 : mov%rsp,%rbp
0x82d080178c39 : callq  0x82d080178bbb 
<__sync_local_execstate>
0x82d080178c3e : pop%rbp

__sync_local_execstate:
0x82d080178c09 <__sync_local_execstate+78>:  cmp%rsi,0x7fe8(%rax)
0x82d080178c10 <__sync_local_execstate+85>:  je 0x82d080178c14 
<__sync_local_execstate+89>
0x82d080178c12 <__sync_local_execstate+87>:  ud2
0x82d080178c13 <__sync_local_execstate+88>:  or %eax,%ebp
0x82d080178c15 <__sync_local_execstate+90>:  popfq
0x82d080178c16 <__sync_local_execstate+91>:  retq   $0x
0x82d080178c19 <__sync_local_execstate+94>:  and$0x200,%ebx

Here crash / gdb seem to get confused with the je.

crash> x /3i __sync_local_execstate+89
   0x82d080178c14 <__sync_local_execstate+89>:  callq  
0x82d080174eb6 <__context_switch>
   0x82d080178c19 <__sync_local_execstate+94>:  and$0x200,%ebx
   0x82d080178c1f <__sync_local_execstate+100>: pushfq
   
It seems this code calls the __context_switch:
switch_required = (this_cpu(curr_vcpu) != current);

if ( switch_required )
{
ASSERT(current == idle_vcpu[smp_processor_id()]);
__context_switch();
}

Up to the __context_switch everything seems to be running as it should. Except 
for the stray
[830617fd7d38] vmx_vcpu_update_eptp at 82d0801f7c6b
which can be found in the "crash> bt" output but not in the "dmesg".

__context_switch:
0x82d080174f7f <__context_switch+201>:   mov%r14,%rdi
0x82d080174f82 <__context_switch+204>:   callq  0x82d08017c474 

0x82d080174f87 <__context_switch+209>:   mov%r14,%rdi
0x82d080174f8a <__context_switch+212>:   callq  *0x3a8(%r14)

Following r14 / rdi ( 0x8300e6fc ) as given in the crash dump seemingly 
leads to a vtable with a function pointer at the offset 0x3a8:
0x82d0801fa06e
crash> x /i 0x82d0801fa06e
   0x82d0801fa06e :   push   %rbp

This call, which does not show up in the backtrace, is expected at this 
position when looking at the C-code:
static void __context_switch(void)
[...]
if ( !is_idle_domain(pd) )
{
memcpy(&p->arch.user_regs, stack_regs, CTXT_SWITCH_STACK_BYTES);
vcpu_save_fpu(p);
p->arch.ctxt_switch_from(p);
[...]

as it is set in:
static int vmx_vcpu_initialise(struct vcpu *v)
[...]
v->arch.ctxt_switch_from = vmx_ctxt_switch_from;
[...]

Finally at:

0x82d0801fa0c3 :mov$0x6c00,%edx
0x82d0801fa0c8 :vmwrite %rax,%rdx
0x82d0801fa0cb :jbe0x82d0801fd23a

The jump to [] vmx_vmenter_helper+0x27e/0x30a (ud2 following 
vmx_vmenter_helper) is done.
vmx_ctxt_switch_from is rather short in C and the called static functions are 
inlined.
static void vmx_ctxt_switch_from(struct vcpu *v)
{
    /*
     * Return early if trying to do a context switch without VMX enabled,
     * this can happen when the hypervisor shuts down with HVM guests
     * still running.
     */
    if ( unlikely(!this_cpu(vmxon)) )
        return;

    vmx_fpu_leave(v);
    vmx_save_guest_msrs(v);
    vmx_restore_host_msrs();
    vmx_save_dr(v);
}

The unlikely path is not taken and the two ud2 (I assume the ud2 are the 
ASSERTs in vmx_fpu_leave?) are not reached either:
0x82d0801fa077 : lea 0x15c692(%rip),%rax# 
0x82d080356710 
0x82d0801fa07e :mov%rsp,%rdx
0x82d0801fa081 :and
$0x8000,%rdx
0x82d0801fa088 :mov 0x7ff0(%rdx),%rdx
0x82d0801fa08f :cmpb   $0x0,(%rdx,%rax,1)
0x82d0801fa093 :je  0x82d0801fa1d9 


Re: [Xen-devel] Xen 4.6.1 crash with altp2m enabled by default

2016-08-02 Thread Kevin.Mayer
Thanks for your reply.
I installed the debug hypervisor and got a new crash dump now.
I must confess that I have little to no experience debugging crash dumps, but 
this seems to be a different kind of error, or at least the way the error is 
reached is different.

The pattern with “page number X invalid” and the “restore” repeats for all 
preceding domains visible in the dump.

[…]
(XEN) memory.c:269:d164v0 Domain 164 page number 54fc invalid
(XEN) memory.c:269:d164v0 Domain 164 page number 54fd invalid
(XEN) grant_table.c:1491:d164v0 Expanding dom (164) grant table from (4) to 
(32) frames.
(XEN) Dom164 callback via changed to GSI 28
(XEN) HVM165 restore: VM saved on one CPU (0x206c2) and restored on another 
(0x106a5).
(XEN) HVM165 restore: CPU 0
(XEN) HVM165 restore: PIC 0
(XEN) HVM165 restore: PIC 1
(XEN) HVM165 restore: IOAPIC 0
(XEN) HVM165 restore: LAPIC 0
(XEN) HVM165 restore: LAPIC_REGS 0
(XEN) HVM165 restore: PCI_IRQ 0
(XEN) HVM165 restore: ISA_IRQ 0
(XEN) HVM165 restore: PCI_LINK 0
(XEN) HVM165 restore: PIT 0
(XEN) HVM165 restore: RTC 0
(XEN) HVM165 restore: HPET 0
(XEN) HVM165 restore: PMTIMER 0
(XEN) HVM165 restore: MTRR 0
(XEN) HVM165 restore: VMCE_VCPU 0
(XEN) HVM165 restore: TSC_ADJUST 0
(XEN) memory.c:269:d165v0 Domain 165 page number 54de invalid
(XEN) memory.c:269:d165v0 Domain 165 page number 54df invalid
(XEN) memory.c:269:d165v0 Domain 165 page number 54e0 invalid
(XEN) memory.c:269:d165v0 Domain 165 page number 54e1 invalid
(XEN) memory.c:269:d165v0 Domain 165 page number 54e2 invalid
(XEN) memory.c:269:d165v0 Domain 165 page number 54e3 invalid
(XEN) memory.c:269:d165v0 Domain 165 page number 54e4 invalid
(XEN) memory.c:269:d165v0 Domain 165 page number 54e5 invalid
(XEN) memory.c:269:d165v0 Domain 165 page number 54e6 invalid
(XEN) memory.c:269:d165v0 Domain 165 page number 54e7 invalid
(XEN) memory.c:269:d165v0 Domain 165 page number 54e8 invalid
(XEN) memory.c:269:d165v0 Domain 165 page number 54e9 invalid
(XEN) memory.c:269:d165v0 Domain 165 page number 54ea invalid
(XEN) memory.c:269:d165v0 Domain 165 page number 54eb invalid
(XEN) memory.c:269:d165v0 Domain 165 page number 54ec invalid
(XEN) memory.c:269:d165v0 Domain 165 page number 54ed invalid
(XEN) memory.c:269:d165v0 Domain 165 page number 54ee invalid
(XEN) memory.c:269:d165v0 Domain 165 page number 54ef invalid
(XEN) memory.c:269:d165v0 Domain 165 page number 54f0 invalid
(XEN) memory.c:269:d165v0 Domain 165 page number 54f1 invalid
(XEN) memory.c:269:d165v0 Domain 165 page number 54f2 invalid
(XEN) memory.c:269:d165v0 Domain 165 page number 54f3 invalid
(XEN) memory.c:269:d165v0 Domain 165 page number 54f4 invalid
(XEN) memory.c:269:d165v0 Domain 165 page number 54f5 invalid
(XEN) memory.c:269:d165v0 Domain 165 page number 54f6 invalid
(XEN) memory.c:269:d165v0 Domain 165 page number 54f7 invalid
(XEN) memory.c:269:d165v0 Domain 165 page number 54f8 invalid
(XEN) memory.c:269:d165v0 Domain 165 page number 54f9 invalid
(XEN) memory.c:269:d165v0 Domain 165 page number 54fa invalid
(XEN) memory.c:269:d165v0 Domain 165 page number 54fb invalid
(XEN) memory.c:269:d165v0 Domain 165 page number 54fc invalid
(XEN) memory.c:269:d165v0 Domain 165 page number 54fd invalid
(XEN) grant_table.c:1491:d165v0 Expanding dom (165) grant table from (4) to 
(32) frames.
(XEN) Dom165 callback via changed to GSI 28
(XEN) Debugging connection not set up.
(XEN) [ Xen-4.6.1  x86_64  debug=y  Not tainted ]
(XEN) CPU:6
(XEN) RIP:e008:[] vmx_vmenter_helper+0x27e/0x30a
(XEN) RFLAGS: 00010003   CONTEXT: hypervisor
(XEN) rax: 8005003b   rbx: 8300e72fc000   rcx: 
(XEN) rdx: 6c00   rsi: 830617fd7fc0   rdi: 8300e6fc
(XEN) rbp: 830617fd7c40   rsp: 830617fd7c30   r8:  
(XEN) r9:  830be8dc9310   r10:    r11: 3475e9cf85d0
(XEN) r12: 0006   r13: 830c14ee1000   r14: 8300e6fc
(XEN) r15: 830617fd   cr0: 8005003b   cr4: 26e0
(XEN) cr3: 0001bd665000   cr2: 0451
(XEN) ds:    es:    fs:    gs:    ss:    cs: e008
(XEN) Xen stack trace from rsp=830617fd7c30:
(XEN)830617fd7c40 8300e72fc000 830617fd7ca0 82d080174f91
(XEN)830617fd7f18 830be8dc9000 0286 830617fd7c90
(XEN)0206 0246 0001 830617e91250
(XEN)8300e72fc000 830be8dc9000 830617fd7cc0 82d080178c19
(XEN)00bdeeae 8300e72fc000 830617fd7cd0 82d080178c3e
(XEN)830617fd7d20 82d080179740 8300e6fc2000 830c17e38e80
(XEN)830617e91250 82008000 0002 830617e91250
(XEN)830617e91240 830be8dc9000 830617fd7d70 82d080196152
(XEN)830617fd7d50 82d0801f7c6b 8300e6fc2000 830617e91250
(XEN)8300e6fc2000 830617e91250 830617e91240 830be8dc9000
(XEN)830617fd7d80 

[Xen-devel] Xen 4.6.1 crash with altp2m enabled by default

2016-07-29 Thread Kevin.Mayer
Hi guys

We are using Xen 4.6.1 to manage our virtual machines on x86-64 servers.
We start dozens of VMs and destroy them again after 60 seconds, which works 
fine as it is, but the next step in our approach requires the use of the altp2m 
functionality.
Since libvirt does not pass the altp2m-enable flag to the hypervisor, we enabled 
altp2m unconditionally by patching hvm.c. Since all of our machines 
support altp2m, this seemed to be OK.

 d->arch.hvm_domain.params[HVM_PARAM_HPET_ENABLED] = 1;
 d->arch.hvm_domain.params[HVM_PARAM_TRIPLE_FAULT_REASON] = SHUTDOWN_reboot;
+d->arch.hvm_domain.params[HVM_PARAM_ALTP2M] = 1;
+
 vpic_init(d);
 rc = vioapic_init(d);

Since applying this patch, the hypervisor crashes after several hundred VM 
restarts (even though we do not use any altp2m functionality ourselves) with 
the following dmesg:

(XEN) [ Xen-4.6.1  x86_64  debug=n  Not tainted ]
(XEN) CPU:7
(XEN) RIP:e008:[] vmx_vmenter_helper+0x2b5/0x340
(XEN) RFLAGS: 00010003   CONTEXT: hypervisor (d0v3)
(XEN) rax: 8005003b   rbx: 8300e7038000   rcx: 0008
(XEN) rdx: 6c00   rsi: 83062eb5e000   rdi: 8300e7038000
(XEN) rbp: 830c17e3f000   rsp: 830617fc7d70   r8:  
(XEN) r9:  83014f8d7028   r10: 02700f858000   r11: 2201be6861f0
(XEN) r12: 83062eb5e000   r13: 8300e752f000   r14: 82d08030ea40
(XEN) r15: 0007   cr0: 8005003b   cr4: 26e0
(XEN) cr3: 0001bf4da000   cr2: dd840c00
(XEN) ds:    es:    fs:    gs:    ss:    cs: e008
(XEN) Xen stack trace from rsp=830617fc7d70:
(XEN)8300e7038000 82d080170c04  000780109f6a
(XEN)830617fc7f18 831e  8300e752f19c
(XEN)0286 8300e752f000 8300e72fc000 0007
(XEN)830c17e3f000 830c14ee1000 82d08030ea40 82d080173d6a
(XEN)   
(XEN)82d08030ea40 8300e72fc000 02700f481091 0001
(XEN)82d080324560 82d08030ea40 8300e752f000 82d080128004
(XEN)0001 01c9c380 830c14ef60e8 17fce600
(XEN)0001 82d0801bd18b 82d0801d9e88 8300e752f000
(XEN)01c9c380 82d08012e700 006e0171 
(XEN)830617fc 82d0802f8f80  83062eb5e000
(XEN)82d08030ea40 82d08012b040 8300e7038000 830617fc
(XEN)8300e7038000  830c14ee1000 82d080170970
(XEN)8300e72fc000   
(XEN) 80550f50 ffdffc70 
(XEN)   2fcffe19
(XEN)ffdffc70  ffdffc50 853b0918
(XEN)00fa f0e48162  0246
(XEN)80550f34   
(XEN)  0007 8300e752f000
(XEN) Xen call trace:
(XEN)[] vmx_vmenter_helper+0x2b5/0x340
(XEN)[] __context_switch+0xb4/0x350
(XEN)[] context_switch+0xca/0xef0
(XEN)[] schedule+0x264/0x5f0
(XEN)[] mwait_idle+0x25b/0x3a0
(XEN)[] hvm_vcpu_has_pending_irq+0x58/0xc0
(XEN)[] timer_softirq_action+0x80/0x250
(XEN)[] __do_softirq+0x60/0x90
(XEN)[] idle_loop+0x20/0x50
(XEN)
(XEN)
(XEN) 
(XEN) Panic on CPU 7:
(XEN) FATAL TRAP: vector = 6 (invalid opcode)
(XEN) 
(XEN)
(XEN) Reboot in five seconds...
(XEN) Executing kexec image on cpu7
(XEN) Shot down all CPUs

The RIP points to a ud2 instruction:
0x82d0801f5a55:  ud2
From the RFLAGS we concluded that the vmwrite failed due to an invalid 
VMCS pointer (CF = 1), but this is where we are stuck, since we have no idea 
how the pointer could have gotten corrupted.
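
For reference, a minimal sketch of how the RFLAGS value in the dump above can be read. It relies on the Intel SDM convention that a failing VMX instruction sets CF when there is no valid current VMCS (VMfailInvalid) and ZF when there is one (VMfailValid). The enum and function names are made up for illustration and are not Xen's.

```c
#include <assert.h>
#include <stdint.h>

/* Architectural RFLAGS bits that VMX instructions use to report status. */
#define X86_EFLAGS_CF (UINT64_C(1) << 0)   /* carry flag */
#define X86_EFLAGS_ZF (UINT64_C(1) << 6)   /* zero flag  */

static_assert(X86_EFLAGS_ZF == 0x40, "ZF is bit 6 of RFLAGS");

/* Illustrative names only; these are not Xen identifiers. */
enum vmx_insn_status {
    VMX_SUCCEED,        /* CF = 0 and ZF = 0 */
    VMX_FAIL_INVALID,   /* CF = 1: no valid current VMCS */
    VMX_FAIL_VALID      /* ZF = 1: error number is in the VM-instruction error field */
};

/* Classify the outcome of a VMX instruction from the RFLAGS it left behind. */
enum vmx_insn_status vmx_status_from_rflags(uint64_t rflags)
{
    if (rflags & X86_EFLAGS_CF)
        return VMX_FAIL_INVALID;
    if (rflags & X86_EFLAGS_ZF)
        return VMX_FAIL_VALID;
    return VMX_SUCCEED;
}
```

With the RFLAGS value from the panic (00010003), bit 0 (CF) is set and bit 6 (ZF) is clear, which matches the conclusion that the vmwrite ran without a valid current VMCS.
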
crash> vcpu
gives vmcs = 0x817cbc20 for vcpu_id = 7,

and vcpus gives

   VCID  PCID   VCPU   ST T DOMID  DOMAIN
  0 0 8300e75f2000 RU I 32767 830c14ee1000
  1 1 8300e72fe000 RU I 32767 830c14ee1000
  2 2 8300e7527000 RU I 32767 830c14ee1000
> 3 3 8300e7526000 RU I 32767 830c14ee1000
  4 4 8300e75f1000 RU I 32767 830c14ee1000
> 5 5 8300e75f RU I 32767 830c14ee1000
> 6 6 8300e72fd000 RU I 32767 830c14ee1000
  7 7 8300e72fc000 RU I 32767 830c14ee1000
  0 0 8300e72fa000 BL 0 0 830c17e3f000
  1 6 8300e72f9000 BL 0 0 830c17e3f000
  2 3 8300e72f8000 BL 0 0 830c17e3f000
> 3 7 8300e752f000 RU 0 0 830c17e3f000
  4 5 8300e752e000 RU 0 0 830c17e3f000
> 

Re: [Xen-devel] Branch Trace Storage for guests and VPMU initialization

2015-02-26 Thread Kevin.Mayer


 -Original Message-
 From: Boris Ostrovsky [mailto:boris.ostrov...@oracle.com]
 Sent: Thursday, 26 February 2015 17:35
 To: Dietmar Hahn; xen-devel@lists.xen.org
 Cc: Mayer, Kevin
 Subject: Re: [Xen-devel] Branch Trace Storage for guests and
 VPMU initialization
 
 On 02/26/2015 03:56 AM, Dietmar Hahn wrote:
  On Wednesday, 25 February 2015, 11:31:31, Boris Ostrovsky wrote:
  On 02/25/2015 10:12 AM, kevin.ma...@gdata.de wrote:
  -Original Message-
  From: Boris Ostrovsky [mailto:boris.ostrov...@oracle.com]
  Sent: Tuesday, 24 February 2015 18:13
  To: Mayer, Kevin; xen-devel@lists.xen.org
  Subject: Re: [Xen-devel] Branch Trace Storage for guests and VPMU
  initialization
 
  On 02/24/2015 10:27 AM, kevin.ma...@gdata.de wrote:
  Hi guys
 
  I'm trying to set up the BTS so that I can log the branches taken
  in the guest using Xen 4.4.1 with a WinXP SP3 guest on a Core i7
  Sandy Bridge.
 
  I added the vpmu=bts boot parameter to my grub2 configuration and
  extended the libxl,libxc,domctl,… with an own command so that I
  can trigger the activation of the BTS whenever I want.
 
  I am not sure why you are doing all these changes to Xen code. BTS
  is supposed to be managed from the guest. For example, a Fedora HVM
  guest will produce this:
 
  [root@dhcp-burlington7-2nd-B-east-10-152-55-140 ~]# perf record -e branches:u -c 1 -d sleep 1
  [ perf record: Woken up 3838 times to write data ]
  [ perf record: Captured and wrote 0.704 MB perf.data (~30756 samples) ]
  [root@dhcp-burlington7-2nd-B-east-10-152-55-140 ~]# perf script -f ip,addr,sym,dso,symoff --show-kernel-path
  8167c347 native_irq_return_iret+0x0 (/proc/kcore) => 328c001590 [unknown] (/proc/kcore)
  8167c347 native_irq_return_iret+0x0 (/proc/kcore) => 328c001590 [unknown] ([unknown])
  328c001593 [unknown] ([unknown]) => 328c004b70 [unknown] ([unknown])
  ...
 
  I want to be able to log the taken branches (of the guest) without the
 need to modify the guest at all.
  This means I have to do all the logic in the hypervisor, or am I wrong?
  In that case, yes. But then you have to make sure that at least
 * you don't load guest's VPMU (or, at least, BTS-related
  registers) on context switch
  But you need to modify PMU registers when switching to/from the guest
  context to get PMU running.
 
 
 
 I was thinking that all BTS stuff can be controlled from dom0 and so we can
 use dom0's version of these registers. I didn't realize that DS_AREA would
 have to be accessed in guest's address space (and that DEBUGCTL is loaded
 from VMCS).
 
 Which is what I think I said in response to this message (which didn't show up
 on the list because Kevin accidentally dropped xen-devel).
 
 -boris
 
Terribly sorry about that...

So the VPMU doesn't get loaded when there is a VMENTER?
I thought I could set up the vpmu of the domU vcpu to enable BTS while in dom0 
(with modified versions of msr_write_intercept, vpmu_do_wrmsr and 
core2_vpmu_do_wrmsr, of course, since the built-in ones use the current vcpu, 
which would be the dom0 vcpu), 
and that as soon as there is a context switch to domU the vpmu gets loaded and 
the guest starts logging.
If the described behavior is correct, the only problem I can see is 
allocating memory in dom0 in a way that the guest can access it.
But if I got it wrong, please explain how the vpmu really works.

Cheers

Kevin


 
 
  I didn't think of using the VPMU stuff with modifying the context from
  outside the guest.
 
 * You don't send the interrupt to the guest (meaning that you will
  need to somehow inform dom0 of the BTS interrupt)
 
  and probably more.
 
  Essentially, you want dom0 to profile the guest. I have been working
  on patches that would allow that but they are still under review.
 
 
  In this command I do the following:
 
  I set up the memory region for the BTS Buffer and the DS Buffer
  Management Area using xzalloc_bytes
 
  I don't think you should be allocating BTS buffers in the
  hypervisor, they are in guest's memory.
  I agree. As I said I think this is where my main problem is at the moment.
  Is there any way I can allocate memory in the hypervisor in a way the
 guest can access it?
  I am not sure this is what you want since you seem to *not* want the
  guest to process the samples, right?
 
  But yes, you can. E.g. something like what map_vcpu_info() does. (I
  have no idea how you'd do this from Windows.)
  The DS buffer has to be mapped within the guests address space so the
  CPU running in guest context can access this area. Otherwise you get
  this triple fault.
  So I would think you need a mixture of writing some stuff in Windows
  and patching the hypervisor.
 
  Dietmar.
 
 
  Of course the guest must not be able to use this memory in its normal
 operations but just for BTS.
  Is this even possible? I am rather confused at the moment. :-D
 
  Then I write the pointer to the BTS Buffer into the DS Buffer
  

Re: [Xen-devel] Branch Trace Storage for guests and VPMU initialization

2015-02-25 Thread Kevin.Mayer
 -Original Message-
 From: Boris Ostrovsky [mailto:boris.ostrov...@oracle.com]
 Sent: Tuesday, 24 February 2015 18:13
 To: Mayer, Kevin; xen-devel@lists.xen.org
 Subject: Re: [Xen-devel] Branch Trace Storage for guests and VPMU
 initialization
 
 On 02/24/2015 10:27 AM, kevin.ma...@gdata.de wrote:
 
  Hi guys
 
  I'm trying to set up the BTS so that I can log the branches taken in
  the guest using Xen 4.4.1 with a WinXP SP3 guest on a Core i7 Sandy
  Bridge.
 
  I added the vpmu=bts boot parameter to my grub2 configuration and
  extended the libxl,libxc,domctl,… with an own command so that I can
  trigger the activation of the BTS whenever I want.
 
 
 
 I am not sure why you are doing all these changes to Xen code. BTS is
 supposed to be managed from the guest. For example, a Fedora HVM guest
 will produce this:
 
 [root@dhcp-burlington7-2nd-B-east-10-152-55-140 ~]# perf record -e branches:u -c 1 -d sleep 1
 [ perf record: Woken up 3838 times to write data ]
 [ perf record: Captured and wrote 0.704 MB perf.data (~30756 samples) ]
 [root@dhcp-burlington7-2nd-B-east-10-152-55-140 ~]# perf script -f ip,addr,sym,dso,symoff --show-kernel-path
   8167c347 native_irq_return_iret+0x0 (/proc/kcore) => 328c001590 [unknown] (/proc/kcore)
   8167c347 native_irq_return_iret+0x0 (/proc/kcore) => 328c001590 [unknown] ([unknown])
   328c001593 [unknown] ([unknown]) => 328c004b70 [unknown] ([unknown])
 ...
 

I want to be able to log the taken branches (of the guest) without the need to 
modify the guest at all.
This means I have to do all the logic in the hypervisor, or am I wrong?

  In this command I do the following:
 
  I set up the memory region for the BTS Buffer and the DS Buffer
  Management Area using xzalloc_bytes
 
 
 
 I don't think you should be allocating BTS buffers in the hypervisor, they are
 in guest's memory.

I agree. As I said I think this is where my main problem is at the moment. 
Is there any way I can allocate memory in the hypervisor in a way the guest can 
access it?
Of course the guest must not be able to use this memory in its normal 
operations but just for BTS.
Is this even possible? I am rather confused at the moment. :-D

  Then I write the pointer to the BTS Buffer into the DS Buffer
  Management Area at +0x0 and +0x8 (BTS Buffer Base and BTS Index)
 
  When I use vmx_msr_write_intercept to store the value in
  MSR_IA32_DS_AREA the host reboots (my guess is that it tries to access a
  vpmu struct that isn't there in the current vcpu and panics).
 
 
 Can you post hypervisor log? (hard to say how helpful it will be without
 seeing your code changes though)
 

Right after enabling the BTS I get a triple fault.
hvm.c:1357:d2 Triple fault on VCPU0 - invoking HVM shutdown action 1. 

  When I use a modified version of vmx_msr_write_intercept I don’t get
  any crashes as long as I don’t enable BTS and TR in the
  GUEST_IA32_DEBUGCTL (BTR works). When I enable the BTS (and TR) the
  guest crashes. I suppose he gets killed by the hypervisor for
  accessing forbidden memory.
 
 
 Possibly because the DS area points to hypervisor memory.
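
To make the BTS/TR observation above concrete, here is a small sketch of the relevant IA32_DEBUGCTL bits as documented in the Intel SDM. The macro and function names are illustrative assumptions, not Xen's definitions. As far as the SDM describes it, only the combination of TR and BTS makes the CPU store branch records through the DS area, which would fit the guest crashing only in that configuration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* IA32_DEBUGCTL bit positions per the Intel SDM (names are illustrative). */
#define DEBUGCTL_LBR    (UINT64_C(1) << 0)  /* last-branch-record stack      */
#define DEBUGCTL_BTF    (UINT64_C(1) << 1)  /* single-step on branches       */
#define DEBUGCTL_TR     (UINT64_C(1) << 6)  /* enable branch trace messages  */
#define DEBUGCTL_BTS    (UINT64_C(1) << 7)  /* store the messages in memory  */
#define DEBUGCTL_BTINT  (UINT64_C(1) << 8)  /* interrupt when buffer fills   */

static_assert(DEBUGCTL_BTS == 0x80, "BTS is bit 7 of IA32_DEBUGCTL");

/*
 * True when the CPU will write branch records to the BTS buffer that the
 * DS area points at. This is the only case in which the addresses stored
 * in the DS area are dereferenced for branch tracing.
 */
bool debugctl_writes_bts_buffer(uint64_t debugctl)
{
    const uint64_t need = DEBUGCTL_TR | DEBUGCTL_BTS;

    return (debugctl & need) == need;
}
```

With only LBR set, no memory referenced by the DS area is touched, so a wrong DS area address would go unnoticed until TR and BTS are both enabled.
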
 
 
 Having said all this, I am not sure how well BTS works. You did notice
 this in the hypervisor log:
 
 (XEN) **
 (XEN) ** WARNING: Emulation of BTS Feature is switched on **
 (XEN) ** Using this processor feature in a virtualized **
 (XEN) ** environment is not 100% safe. **
 (XEN) ** Setting the DS buffer address with wrong values **
 (XEN) ** may lead to hypervisor hangs or crashes. **
 (XEN) ** It is NOT recommended for production use! **
 (XEN) **
 

Yes, I saw that. It doesn't state that BTS is not working at all, just that it 
is not that safe to use.
As I understand it, as long as I set the DS buffer address correctly I should be 
fine, right?
Since I don't want to use it in production, that is fine with me. At least for now.


Kevin
 
 -boris
 
 
  The modified version of vmx_msr_write_intercept takes a vcpu-struct as
  a parameter and uses this instead of the current vcpu.
 
  Instead of
 
  static int vmx_msr_write_intercept(unsigned int msr, uint64_t
 msr_content)
 
  {
 
  struct vcpu *v = current;
 
  I just have
 
  static int own_vmx_msr_write_intercept(unsigned int msr, uint64_t
  msr_content, struct vcpu *v)
 
  I get this vcpu by d->vcpu[0] as I have limited my guest domain to one
  vcpu atm.
 
  Of course I also use similarly modified version of the called
  functions(vpmu_do_wrmsr,…).
 
  I'm pretty sure that my problem is with a wrong scope/usage of the
  vcpus/memory, but I have no idea how to fix this.
 
  I can see a potential problem with the memory allocation (in the host)
  into which the cpu in guest-mode is supposed to write.
 
  Or maybe I got the principle of a vcpu/vpmu all wrong.
 
  Since I couldn’t find any project that uses the BTS for the guest, I
  am wondering if anyone has ever 

[Xen-devel] Branch Trace Storage for guests and VPMU initialization

2015-02-24 Thread Kevin.Mayer
Hi guys

I'm trying to set up the BTS so that I can log the branches taken in the guest 
using Xen 4.4.1 with a WinXP SP3 guest on a Core i7 Sandy Bridge.
I added the vpmu=bts boot parameter to my grub2 configuration and extended the 
libxl,libxc,domctl,... with an own command so that I can trigger the activation 
of the BTS whenever I want.
In this command I do the following:
I set up the memory region for the BTS Buffer and the DS Buffer Management Area 
using xzalloc_bytes
Then I write the pointer to the BTS Buffer into the DS Buffer Management Area 
at +0x0 and +0x8 (BTS Buffer Base and BTS Index)
When I use vmx_msr_write_intercept to store the value in MSR_IA32_DS_AREA the 
host reboots (my guess is that it tries to access a vpmu struct that isn't 
there in the current vcpu and panics).
When I use a modified version of vmx_msr_write_intercept I don't get any 
crashes as long as I don't enable BTS and TR in the GUEST_IA32_DEBUGCTL (BTR 
works). When I enable the BTS (and TR) the guest crashes. I suppose it gets 
killed by the hypervisor for accessing forbidden memory.
The modified version of vmx_msr_write_intercept takes a vcpu-struct as a 
parameter and uses this instead of the current vcpu.
Instead of
static int vmx_msr_write_intercept(unsigned int msr, uint64_t msr_content)
{
struct vcpu *v = current;
I just have
static int own_vmx_msr_write_intercept(unsigned int msr, uint64_t msr_content, 
struct vcpu *v)

I get this vcpu by d->vcpu[0] as I have limited my guest domain to one vcpu atm.
Of course I also use similarly modified versions of the called 
functions (vpmu_do_wrmsr, ...).
I'm pretty sure that my problem is with a wrong scope/usage of the 
vcpus/memory, but I have no idea how to fix this.
I can see a potential problem with the memory allocation (in the host) into 
which the cpu in guest-mode is supposed to write.
Or maybe I got the principle of a vcpu/vpmu all wrong.
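
The DS Buffer Management Area setup described above can be sketched as follows. This is only an illustration under assumptions: it uses the 64-bit DS save area layout from the Intel SDM (which matches the +0x0 and +0x8 offsets mentioned), omits the PEBS fields, and the struct and function names are hypothetical. The addresses written here are linear addresses that the CPU dereferences in whatever context runs with BTS enabled, which is why a pointer obtained from xzalloc_bytes in the hypervisor is not usable from guest context.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* BTS part of the 64-bit DS save area (PEBS fields omitted in this sketch). */
struct ds_area64 {
    uint64_t bts_buffer_base;          /* +0x00: first byte of the BTS buffer   */
    uint64_t bts_index;                /* +0x08: where the next record is put   */
    uint64_t bts_absolute_maximum;     /* +0x10: first byte past the buffer     */
    uint64_t bts_interrupt_threshold;  /* +0x18: only relevant with BTINT set   */
};

/* One branch record in the 64-bit format: source, destination, flags. */
struct bts_record64 {
    uint64_t from;
    uint64_t to;
    uint64_t flags;
};

static_assert(offsetof(struct ds_area64, bts_buffer_base) == 0x00, "base at +0x0");
static_assert(offsetof(struct ds_area64, bts_index) == 0x08, "index at +0x8");
static_assert(sizeof(struct bts_record64) == 24, "64-bit BTS record is 24 bytes");

/*
 * Fill in the BTS fields for a buffer of nr_records records starting at
 * buf_base. buf_base must be a linear address valid in the context that
 * will run with BTS enabled (hypothetical helper, not a Xen function).
 */
void ds_area_init_bts(struct ds_area64 *ds, uint64_t buf_base,
                      uint64_t nr_records)
{
    ds->bts_buffer_base = buf_base;
    ds->bts_index = buf_base;   /* the index starts at the base */
    ds->bts_absolute_maximum =
        buf_base + nr_records * sizeof(struct bts_record64);
    /* Placeholder: no interrupt is requested in this sketch. */
    ds->bts_interrupt_threshold = ds->bts_absolute_maximum;
}
```

The address of the filled-in structure is what would then be written to MSR_IA32_DS_AREA.
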

Since I couldn't find any project that uses the BTS for the guest, I am 
wondering if anyone has ever done this and if it is possible at all.
Any input is welcome as I am pretty much stuck atm...

Cheers

Kevin


Virus checked by G Data MailSecurity
Version: AVA 25.404 dated 24.02.2015
Virus news: www.antiviruslab.com
___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] Tracking guest code execution with EPT violations

2015-01-16 Thread Kevin.Mayer
Hi all

I'm trying to track code execution with page granularity on Xen 4.4.1 by 
setting the access rights in the EPT to not executable.
The idea is as follows:

According to the Intel manual, "A reference using a guest-physical address 
whose translation encounters an EPT paging-structure that is not present causes 
an EPT violation."
So whenever a non-present memory page is requested, an EPT violation is caused 
(and handled by ept_handle_violation). By extending the 
EXIT_REASON_EPT_VIOLATION handling, I should be able to set the access rights 
for every new page to access_rw (using the p2m->get_entry and p2m->set_entry 
functions right after the violation was handled), leading to a new EPT 
violation every time an instruction is fetched from this page.

There are several problems with my approach so far:

* I get too few unique GFNs (derived from the gpa by PAGE_SHIFT) in the 
EPT violations when booting a WinXP guest. I get about 250 EPT_VIOLATIONS with 
unique GFNs when booting the guest OS and none when starting new programs in 
the guest, so something seems to be wrong there. I also read the access rights 
of the pages back after setting them. Most of the time the access rights are 
access_n before and still access_n after I tried setting them to access_rw 
(this happens when the type is p2m_mmio_dm; when the type is p2m_ram_rw the 
setting works temporarily).

* I never get an EPT violation with the EPT_EXEC_VIOLATION flag set in 
the exit qualification, even for the pages where setting the access 
rights succeeded.

* Later, when checking the access rights of the GFNs (I simply save the 
GFNs in an array and use p2m->get_entry in my own call added to domctl.c, 
invoked from xl), they all have access right access_n and type p2m_mmio_dm, 
even for the pages where setting the access rights succeeded or the type was 
different before.
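
For readers following along, the two pieces of arithmetic used in the points above (deriving the GFN from the gpa, and testing the exit qualification for an instruction fetch) might look like the sketch below. The bit positions are the ones the Intel SDM gives for EPT-violation exit qualifications; the macro and function names are illustrative assumptions rather than Xen's identifiers, and 4 KiB pages are assumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT 12   /* 4 KiB pages assumed */

/* Exit qualification bits for EPT violations (names are illustrative). */
#define EPT_QUAL_READ      (UINT64_C(1) << 0)  /* access was a data read        */
#define EPT_QUAL_WRITE     (UINT64_C(1) << 1)  /* access was a data write       */
#define EPT_QUAL_FETCH     (UINT64_C(1) << 2)  /* access was an instr. fetch    */
#define EPT_QUAL_READABLE  (UINT64_C(1) << 3)  /* translation allowed reads     */
#define EPT_QUAL_WRITABLE  (UINT64_C(1) << 4)  /* translation allowed writes    */
#define EPT_QUAL_EXECUTABLE (UINT64_C(1) << 5) /* translation allowed fetches   */

static_assert(EPT_QUAL_FETCH == 0x4, "instruction fetch is bit 2");

/* Guest frame number of the faulting guest-physical address. */
uint64_t gpa_to_gfn(uint64_t gpa)
{
    return gpa >> PAGE_SHIFT;
}

/* True when the violation was caused by an instruction fetch. */
bool ept_violation_is_exec(uint64_t qual)
{
    return (qual & EPT_QUAL_FETCH) != 0;
}

/* True when the translation granted no permission at all (entry not present). */
bool ept_entry_not_present(uint64_t qual)
{
    return (qual & (EPT_QUAL_READABLE | EPT_QUAL_WRITABLE |
                    EPT_QUAL_EXECUTABLE)) == 0;
}
```

A fetch from a page whose entry is readable and writable but not executable would be expected to report the fetch bit together with the readable and writable bits, while a first touch of an unpopulated entry reports none of the three permission bits.
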

This all tells me that there is something fundamentally wrong with my approach 
so far, leading me to the following questions:


1.   Every time a new page in memory is allocated by the guest I get an 
EPT_VIOLATION, right?

a.   If this is the case, then why don't I get new violations after Windows 
has finished booting?

2.   What is the difference between types p2m_mmio_dm and p2m_ram_rw? (I have 
a feeling that part of the problem lies here.)

3.   Are the p2m->get_entry/p2m->set_entry functions the right tools for 
this purpose?

a.   If they are, then why do they sometimes fail?

4.   To get the domain I use struct vcpu *curr = current; and struct 
p2m_domain *p2m = p2m_get_hostp2m(curr->domain); before using the 
get/set_entry functions. Am I mixing up domains or something like 
that?

5.   Because I just set the access rights to rw every time 
EXIT_REASON_EPT_VIOLATION is handled, the whole domain should freeze/crash as 
soon as the first page tries to execute an instruction, right? It doesn't, 
because I get no execution attempts on the pages I set to access_rw, but why 
don't I get an execution attempt?

I hope it is clear what I am trying to achieve.

Thanks

Kevin

