Re: Panic starting a bhyve guest after resume

2013-12-23 Thread Neel Natu
Hi John,

On Fri, Dec 20, 2013 at 2:23 PM, John Baldwin  wrote:
> On Friday, December 13, 2013 9:28:29 pm Neel Natu wrote:
>> Hi John,
>>
>> On Fri, Dec 13, 2013 at 2:09 PM, John Baldwin  wrote:
>> > On Thursday, December 12, 2013 4:00:08 pm Neel Natu wrote:
>> >> Hi John,
>> >>
>> >> On Thu, Dec 12, 2013 at 12:11 PM, John Baldwin  wrote:
>> >> > If I suspend and resume my laptop and then try to start a guest
>> >> > after the resume, I get an odd panic.  It generates a privileged
>> >> > instruction fault (in kernel mode) for 'vmclear'.  I've checked CR4
>> >> > and it claims that VMXE is set.  I don't have any other ideas off
>> >> > the top of my head on what I should be poking at.  It looks like we
>> >> > read a bunch of MSRs in vmx_init(), but we don't write to them, and
>> >> > all vmx_enable() does on each CPU is set VMXE in CR4 from what I
>> >> > can tell.
>> >> >
>> >>
>> >> It also does a "vmxon" on each logical cpu which may also need to be
>> >> done after a resume.
>> >
>> > Ah, yes it does.  That was sufficient both for starting a new guest after
>> > resume and even doing a suspend/resume while a guest was active (and the
>> > guest continued to run fine).  I have a hacky patch for this.  One, it
>> > includes both a suspend and resume hook for VMM, though for my testing I
>> > only needed a resume hook to invoke vmxon.  Second, the name of
>> > vmx_resume2() is a total hack (because vmx_resume() was already taken).
>> > I think for now if I were to commit this, I'd just add the resume hook
>> > and maybe call the Intel method vmx_reset() or vmx_restore()?
>> >
>> > http://people.freebsd.org/~jhb/patches/bhyve_resume.patch
>> >
>>
>> There seems to be a race after the APs are restarted and before
>> 'vmm_resume_p()' where it would be problematic to execute a VMX
>> instruction.
>>
>> Perhaps we should enable VMX on each cpu before they return to the
>> interrupted code?
>
> I've updated the patch at the URL above to do just that.  This also works
> in my testing.
>

Looks great!

best
Neel

> --
> John Baldwin
___
freebsd-virtualization@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-virtualization
To unsubscribe, send any mail to 
"freebsd-virtualization-unsubscr...@freebsd.org"


Re: Panic starting a bhyve guest after resume

2013-12-23 Thread John Baldwin
On Friday, December 13, 2013 9:28:29 pm Neel Natu wrote:
> Hi John,
> 
> On Fri, Dec 13, 2013 at 2:09 PM, John Baldwin  wrote:
> > On Thursday, December 12, 2013 4:00:08 pm Neel Natu wrote:
> >> Hi John,
> >>
> >> On Thu, Dec 12, 2013 at 12:11 PM, John Baldwin  wrote:
> >> > If I suspend and resume my laptop and then try to start a guest
> >> > after the resume, I get an odd panic.  It generates a privileged
> >> > instruction fault (in kernel mode) for 'vmclear'.  I've checked CR4
> >> > and it claims that VMXE is set.  I don't have any other ideas off
> >> > the top of my head on what I should be poking at.  It looks like we
> >> > read a bunch of MSRs in vmx_init(), but we don't write to them, and
> >> > all vmx_enable() does on each CPU is set VMXE in CR4 from what I
> >> > can tell.
> >> >
> >>
> >> It also does a "vmxon" on each logical cpu which may also need to be
> >> done after a resume.
> >
> > Ah, yes it does.  That was sufficient both for starting a new guest after
> > resume and even doing a suspend/resume while a guest was active (and the
> > guest continued to run fine).  I have a hacky patch for this.  One, it
> > includes both a suspend and resume hook for VMM, though for my testing I
> > only needed a resume hook to invoke vmxon.  Second, the name of
> > vmx_resume2() is a total hack (because vmx_resume() was already taken).
> > I think for now if I were to commit this, I'd just add the resume hook
> > and maybe call the Intel method vmx_reset() or vmx_restore()?
> >
> > http://people.freebsd.org/~jhb/patches/bhyve_resume.patch
> >
> 
> There seems to be a race after the APs are restarted and before
> 'vmm_resume_p()' where it would be problematic to execute a VMX
> instruction.
> 
> Perhaps we should enable VMX on each cpu before they return to the
> interrupted code?

I've updated the patch at the URL above to do just that.  This also works
in my testing.

-- 
John Baldwin


Re: Panic starting a bhyve guest after resume

2013-12-19 Thread Roger Pau Monné
On 18/12/13 21:02, John Baldwin wrote:
> On Saturday, December 14, 2013 3:15:45 am Roger Pau Monné wrote:
>> On 14/12/13 03:28, Neel Natu wrote:
>>> Hi John,
>>>
>>> On Fri, Dec 13, 2013 at 2:09 PM, John Baldwin  wrote:
>>>> On Thursday, December 12, 2013 4:00:08 pm Neel Natu wrote:
>>>>> Hi John,
>>>>>
>>>>> On Thu, Dec 12, 2013 at 12:11 PM, John Baldwin  wrote:
>>>>>> If I suspend and resume my laptop and then try to start a guest
>>>>>> after the resume, I get an odd panic.  It generates a privileged
>>>>>> instruction fault (in kernel mode) for 'vmclear'.  I've checked CR4
>>>>>> and it claims that VMXE is set.  I don't have any other ideas off
>>>>>> the top of my head on what I should be poking at.  It looks like we
>>>>>> read a bunch of MSRs in vmx_init(), but we don't write to them, and
>>>>>> all vmx_enable() does on each CPU is set VMXE in CR4 from what I
>>>>>> can tell.
>>>>>>
>>>>>
>>>>> It also does a "vmxon" on each logical cpu which may also need to be
>>>>> done after a resume.
>>>>
>>>> Ah, yes it does.  That was sufficient both for starting a new guest after
>>>> resume and even doing a suspend/resume while a guest was active (and the
>>>> guest continued to run fine).  I have a hacky patch for this.  One, it
>>>> includes both a suspend and resume hook for VMM, though for my testing I
>>>> only needed a resume hook to invoke vmxon.  Second, the name of
>>>> vmx_resume2() is a total hack (because vmx_resume() was already taken).
>>>> I think for now if I were to commit this, I'd just add the resume hook
>>>> and maybe call the Intel method vmx_reset() or vmx_restore()?
>>>>
>>>> http://people.freebsd.org/~jhb/patches/bhyve_resume.patch
>>>>
>>>
>>> There seems to be a race after the APs are restarted and before
>>> 'vmm_resume_p()' where it would be problematic to execute a VMX
>>> instruction.
>>>
>>> Perhaps we should enable VMX on each cpu before they return to the
>>> interrupted code?
>>
>> Can you use the hook in cpususpend_handler? It's cpu_ops.cpu_resume, and
>> gets called on each CPU before returning from the handler.
> 
> That is the right place, yes.  However, I'm worried about collisions.
> Can you run nested VMMs under Xen?  That is, can a xenhvm guest start a
> bhyve vmm?

In theory yes, but AFAIK nested virtualization support in Xen is not
really stable (it's still marked as experimental).

> If so, then you would need to run both cpu_resume handlers.  Also, cpu_resume 
> isn't run on the CPU that initiates the suspend.  For now I will stick with a
> dedicated vmm_resume_p hook, but we may want to revisit that at some point.

Using something like an event handler would be the best solution, but
that doesn't work in interrupt context due to the use of mutexes.
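
To make the "run both handlers without taking a mutex" idea concrete, here is a minimal userland sketch. It is only an illustration under stated assumptions: the table, `resume_hook_register()`, `resume_hooks_run()` and the two `*_like_resume()` callbacks are hypothetical names, not FreeBSD interfaces. Hooks are registered ahead of time, and the resume path only reads a fixed array, so nothing on that path needs to sleep or lock.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_RESUME_HOOKS 4          /* hypothetical fixed table size */

typedef void (*resume_hook_t)(int cpu);

static resume_hook_t resume_hooks[MAX_RESUME_HOOKS];
static size_t nresume_hooks;

/*
 * Register a hook.  Meant to be called at init time, well before any
 * suspend, from a context where ordinary locking would be fine.
 * Returns 0 on success, -1 if the hook is NULL or the table is full.
 */
static int
resume_hook_register(resume_hook_t fn)
{
	if (fn == NULL || nresume_hooks == MAX_RESUME_HOOKS)
		return (-1);
	resume_hooks[nresume_hooks++] = fn;
	return (0);
}

/*
 * Run every registered hook for one CPU.  This only reads the table
 * and takes no locks, which is the property the resume path needs.
 */
static void
resume_hooks_run(int cpu)
{
	assert(cpu >= 0);
	for (size_t i = 0; i < nresume_hooks; i++)
		resume_hooks[i](cpu);
}

/* Stand-ins for two independent subsystems that both want a callback. */
static int xen_like_calls;
static int vmm_like_calls;

static void
xen_like_resume(int cpu)
{
	(void)cpu;
	xen_like_calls++;
}

static void
vmm_like_resume(int cpu)
{
	(void)cpu;
	vmm_like_calls++;
}
```

After registering both stand-in hooks, calling `resume_hooks_run(cpu)` once per CPU invokes each of them once per CPU. The sketch does not address the other point raised above, that the CPU initiating the suspend takes a different path; in this model that CPU would simply have to call `resume_hooks_run()` itself.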



Re: Panic starting a bhyve guest after resume

2013-12-18 Thread John Baldwin
On Saturday, December 14, 2013 3:15:45 am Roger Pau Monné wrote:
> On 14/12/13 03:28, Neel Natu wrote:
> > Hi John,
> > 
> > On Fri, Dec 13, 2013 at 2:09 PM, John Baldwin  wrote:
> >> On Thursday, December 12, 2013 4:00:08 pm Neel Natu wrote:
> >>> Hi John,
> >>>
> >>> On Thu, Dec 12, 2013 at 12:11 PM, John Baldwin  wrote:
> >>>> If I suspend and resume my laptop and then try to start a guest
> >>>> after the resume, I get an odd panic.  It generates a privileged
> >>>> instruction fault (in kernel mode) for 'vmclear'.  I've checked CR4
> >>>> and it claims that VMXE is set.  I don't have any other ideas off
> >>>> the top of my head on what I should be poking at.  It looks like we
> >>>> read a bunch of MSRs in vmx_init(), but we don't write to them, and
> >>>> all vmx_enable() does on each CPU is set VMXE in CR4 from what I
> >>>> can tell.
> >>>>
> >>>
> >>> It also does a "vmxon" on each logical cpu which may also need to be
> >>> done after a resume.
> >>
> >> Ah, yes it does.  That was sufficient both for starting a new guest after
> >> resume and even doing a suspend/resume while a guest was active (and the
> >> guest continued to run fine).  I have a hacky patch for this.  One, it
> >> includes both a suspend and resume hook for VMM, though for my testing I
> >> only needed a resume hook to invoke vmxon.  Second, the name of
> >> vmx_resume2() is a total hack (because vmx_resume() was already taken).
> >> I think for now if I were to commit this, I'd just add the resume hook
> >> and maybe call the Intel method vmx_reset() or vmx_restore()?
> >>
> >> http://people.freebsd.org/~jhb/patches/bhyve_resume.patch
> >>
> > 
> > There seems to be a race after the APs are restarted and before
> > 'vmm_resume_p()' where it would be problematic to execute a VMX
> > instruction.
> > 
> > Perhaps we should enable VMX on each cpu before they return to the
> > interrupted code?
> 
> Can you use the hook in cpususpend_handler? It's cpu_ops.cpu_resume, and
> gets called on each CPU before returning from the handler.

That is the right place, yes.  However, I'm worried about collisions.  Can you
run nested VMMs under Xen?  That is, can a xenhvm guest start a bhyve vmm?
If so, then you would need to run both cpu_resume handlers.  Also, cpu_resume 
isn't run on the CPU that initiates the suspend.  For now I will stick with a
dedicated vmm_resume_p hook, but we may want to revisit that at some point.

-- 
John Baldwin


Re: Panic starting a bhyve guest after resume

2013-12-14 Thread Roger Pau Monné
On 14/12/13 03:28, Neel Natu wrote:
> Hi John,
> 
> On Fri, Dec 13, 2013 at 2:09 PM, John Baldwin  wrote:
>> On Thursday, December 12, 2013 4:00:08 pm Neel Natu wrote:
>>> Hi John,
>>>
>>> On Thu, Dec 12, 2013 at 12:11 PM, John Baldwin  wrote:
>>>> If I suspend and resume my laptop and then try to start a guest
>>>> after the resume, I get an odd panic.  It generates a privileged
>>>> instruction fault (in kernel mode) for 'vmclear'.  I've checked CR4
>>>> and it claims that VMXE is set.  I don't have any other ideas off
>>>> the top of my head on what I should be poking at.  It looks like we
>>>> read a bunch of MSRs in vmx_init(), but we don't write to them, and
>>>> all vmx_enable() does on each CPU is set VMXE in CR4 from what I
>>>> can tell.
>>>>
>>>
>>> It also does a "vmxon" on each logical cpu which may also need to be
>>> done after a resume.
>>
>> Ah, yes it does.  That was sufficient both for starting a new guest after
>> resume and even doing a suspend/resume while a guest was active (and the
>> guest continued to run fine).  I have a hacky patch for this.  One, it
>> includes both a suspend and resume hook for VMM, though for my testing I only
>> needed a resume hook to invoke vmxon.  Second, the name of vmx_resume2()
>> is a total hack (because vmx_resume() was already taken).  I think for now
>> if I were to commit this, I'd just add the resume hook and maybe call the
>> Intel method vmx_reset() or vmx_restore()?
>>
>> http://people.freebsd.org/~jhb/patches/bhyve_resume.patch
>>
> 
> There seems to be a race after the APs are restarted and before
> 'vmm_resume_p()' where it would be problematic to execute a VMX
> instruction.
> 
> Perhaps we should enable VMX on each cpu before they return to the
> interrupted code?

Can you use the hook in cpususpend_handler? It's cpu_ops.cpu_resume, and
gets called on each CPU before returning from the handler.



Re: Panic starting a bhyve guest after resume

2013-12-13 Thread Neel Natu
Hi John,

On Fri, Dec 13, 2013 at 2:09 PM, John Baldwin  wrote:
> On Thursday, December 12, 2013 4:00:08 pm Neel Natu wrote:
>> Hi John,
>>
>> On Thu, Dec 12, 2013 at 12:11 PM, John Baldwin  wrote:
>> > If I suspend and resume my laptop and then try to start a guest
>> > after the resume, I get an odd panic.  It generates a privileged
>> > instruction fault (in kernel mode) for 'vmclear'.  I've checked CR4
>> > and it claims that VMXE is set.  I don't have any other ideas off
>> > the top of my head on what I should be poking at.  It looks like we
>> > read a bunch of MSRs in vmx_init(), but we don't write to them, and
>> > all vmx_enable() does on each CPU is set VMXE in CR4 from what I
>> > can tell.
>> >
>>
>> It also does a "vmxon" on each logical cpu which may also need to be
>> done after a resume.
>
> Ah, yes it does.  That was sufficient both for starting a new guest after
> resume and even doing a suspend/resume while a guest was active (and the
> guest continued to run fine).  I have a hacky patch for this.  One, it
> includes both a suspend and resume hook for VMM, though for my testing I only
> needed a resume hook to invoke vmxon.  Second, the name of vmx_resume2()
> is a total hack (because vmx_resume() was already taken).  I think for now
> if I were to commit this, I'd just add the resume hook and maybe call the
> Intel method vmx_reset() or vmx_restore()?
>
> http://people.freebsd.org/~jhb/patches/bhyve_resume.patch
>

There seems to be a race after the APs are restarted and before
'vmm_resume_p()' where it would be problematic to execute a VMX
instruction.

Perhaps we should enable VMX on each cpu before they return to the
interrupted code?

best
Neel
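
The ordering concern above can be modelled in a few lines of userland C. This is a toy sketch, not bhyve or kernel code: `struct ap_model`, the `model_*` functions and the `hook_order` enum are invented for illustration, and it assumes that a CPU is no longer in VMX operation when it comes back from suspend. It compares re-running vmxon before the AP returns to the interrupted code against re-running it later from a separate resume hook.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* When vmxon is re-executed relative to resuming the interrupted code. */
enum hook_order {
	HOOK_AFTER_RETURN,	/* later, from a separate resume hook */
	HOOK_BEFORE_RETURN	/* on the AP itself, before it returns */
};

/* Toy model of one application processor (AP). */
struct ap_model {
	bool in_vmx_operation;	/* true once vmxon has been re-executed */
	bool interrupted_in_vmx;/* interrupted code will issue a VMX insn */
	bool faulted;		/* a VMX insn ran outside VMX operation */
};

static void
model_vmxon(struct ap_model *ap)
{
	ap->in_vmx_operation = true;
}

/* The interrupted code continues; it may execute a VMX instruction. */
static void
model_run_interrupted_code(struct ap_model *ap)
{
	if (ap->interrupted_in_vmx && !ap->in_vmx_operation)
		ap->faulted = true;
}

/* Returns true if the AP resumed without a fault. */
static bool
model_ap_resume(struct ap_model *ap, enum hook_order order)
{
	assert(ap != NULL);
	/* Assumption: VMX operation does not survive the suspend. */
	ap->in_vmx_operation = false;
	ap->faulted = false;

	if (order == HOOK_BEFORE_RETURN)
		model_vmxon(ap);
	model_run_interrupted_code(ap);
	if (order == HOOK_AFTER_RETURN)
		model_vmxon(ap);	/* too late for the code above */
	return (!ap->faulted);
}
```

With `interrupted_in_vmx` set, `model_ap_resume()` reports a fault for `HOOK_AFTER_RETURN` and none for `HOOK_BEFORE_RETURN`, which is the window described above. An AP that was not about to run a VMX instruction resumes cleanly either way, so the race would only show up intermittently.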

> --
> John Baldwin


Re: Panic starting a bhyve guest after resume

2013-12-13 Thread John Baldwin
On Thursday, December 12, 2013 4:00:08 pm Neel Natu wrote:
> Hi John,
> 
> On Thu, Dec 12, 2013 at 12:11 PM, John Baldwin  wrote:
> > If I suspend and resume my laptop and then try to start a guest
> > after the resume, I get an odd panic.  It generates a privileged
> > instruction fault (in kernel mode) for 'vmclear'.  I've checked CR4
> > and it claims that VMXE is set.  I don't have any other ideas off
> > the top of my head on what I should be poking at.  It looks like we
> > read a bunch of MSRs in vmx_init(), but we don't write to them, and
> > all vmx_enable() does on each CPU is set VMXE in CR4 from what I
> > can tell.
> >
> 
> It also does a "vmxon" on each logical cpu which may also need to be
> done after a resume.

Ah, yes it does.  That was sufficient both for starting a new guest after
resume and even doing a suspend/resume while a guest was active (and the
guest continued to run fine).  I have a hacky patch for this.  One, it
includes both a suspend and resume hook for VMM, though for my testing I only
needed a resume hook to invoke vmxon.  Second, the name of vmx_resume2()
is a total hack (because vmx_resume() was already taken).  I think for now
if I were to commit this, I'd just add the resume hook and maybe call the
Intel method vmx_reset() or vmx_restore()?

http://people.freebsd.org/~jhb/patches/bhyve_resume.patch

-- 
John Baldwin


Re: Panic starting a bhyve guest after resume

2013-12-12 Thread Neel Natu
Hi John,

On Thu, Dec 12, 2013 at 12:11 PM, John Baldwin  wrote:
> If I suspend and resume my laptop and then try to start a guest after the
> resume, I get an odd panic.  It generates a privileged instruction fault (in
> kernel mode) for 'vmclear'.  I've checked CR4 and it claims that VMXE is set.
> I don't have any other ideas off the top of my head on what I should be poking
> at.  It looks like we read a bunch of MSRs in vmx_init(), but we don't write
> to them, and all vmx_enable() does on each CPU is set VMXE in CR4 from what I
> can tell.
>

It also does a "vmxon" on each logical cpu which may also need to be
done after a resume.

best
Neel
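
For anyone following along, the point above (VMXE set in CR4 is not enough on its own) can be modelled in userland C. This is only a toy sketch, not bhyve code: `struct cpu_model`, the `model_*` functions and `NCPU` are made up for illustration. It assumes, as the thread suggests, that after resume CR4 still shows VMXE set while the CPU is no longer in VMX operation, so vmclear faults until vmxon is executed again.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CR4_VMXE	(1u << 13)	/* CR4.VMXE is bit 13 on x86 */
#define NCPU		4		/* hypothetical CPU count */

/* Toy per-CPU state; not the real bhyve data structures. */
struct cpu_model {
	uint32_t cr4;		/* what reading CR4 would show */
	bool in_vmx_operation;	/* set by vmxon; not visible in CR4 */
};

static struct cpu_model cpus[NCPU];

/* Models the two steps discussed above: set VMXE, then vmxon. */
static void
model_vmx_enable(int cpu)
{
	assert(cpu >= 0 && cpu < NCPU);
	cpus[cpu].cr4 |= CR4_VMXE;
	cpus[cpu].in_vmx_operation = true;
}

/*
 * Assumption: suspend/resume leaves CR4 looking the same but drops
 * every CPU out of VMX operation.
 */
static void
model_suspend_resume(void)
{
	for (int i = 0; i < NCPU; i++)
		cpus[i].in_vmx_operation = false;
}

/* True if a VMX instruction such as vmclear would run without faulting. */
static bool
model_vmclear_ok(int cpu)
{
	assert(cpu >= 0 && cpu < NCPU);
	return ((cpus[cpu].cr4 & CR4_VMXE) != 0 &&
	    cpus[cpu].in_vmx_operation);
}

/* Hypothetical resume hook: redo vmxon wherever VMX had been enabled. */
static void
model_vmm_resume(void)
{
	for (int i = 0; i < NCPU; i++) {
		if ((cpus[i].cr4 & CR4_VMXE) != 0)
			cpus[i].in_vmx_operation = true;
	}
}
```

Enabling every CPU with `model_vmx_enable()` and then calling `model_suspend_resume()` leaves `model_vmclear_ok()` false even though `CR4_VMXE` is still set, matching the symptom in the original report. It becomes true again only after `model_vmm_resume()`.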

> --
> John Baldwin