Re: [PATCH 3/4 V10] Add ioctl for KVMCLOCK_GUEST_STOPPED

2012-01-30 Thread Avi Kivity
On 01/17/2012 08:40 PM, Eric B Munson wrote:
 Now that we have a flag that will tell the guest it was suspended, create an
 interface for that communication using a KVM ioctl.


 diff --git a/Documentation/virtual/kvm/api.txt 
 b/Documentation/virtual/kvm/api.txt
 index e1d94bf..1931e5c 100644
 --- a/Documentation/virtual/kvm/api.txt
 +++ b/Documentation/virtual/kvm/api.txt
 @@ -1491,6 +1491,19 @@ following algorithm:
  Some guests configure the LINT1 NMI input to cause a panic, aiding in
  debugging.
  
 +4.65 KVMCLOCK_GUEST_PAUSED
 +
 +Capability: KVM_CAP_GUEST_PAUSED
 +Architechtures: Any that implement pvclocks (currently x86 only)
 +Type: vcpu ioctl

vm ioctl.

 +Parameters: None
 +Returns: 0 on success, -1 on error
 +
 +This signals to the host kernel that the specified guest is being paused by
 +userspace.  The host will set a flag in the pvclock structure that is checked
 +from the soft lockup watchdog.  This ioctl can be called during pause or
 +unpause.
 +
  5. The kvm_run structure
  
  
 +/*
 + * kvm_set_guest_paused() indicates to the guest kernel that it has been
 + * stopped by the hypervisor.  This function will be called from the host 
 only.
 + */
 +static int kvm_set_guest_paused(struct kvm *kvm)
 +{
 + struct kvm_vcpu *vcpu;
 + struct pvclock_vcpu_time_info *src;
 + int i;
 +
 + kvm_for_each_vcpu(i, vcpu, kvm) {
 + if (!vcpu-arch.time_page)
 + continue;
 + src = vcpu-arch.hv_clock;
 + src-flags |= PVCLOCK_GUEST_STOPPED;

This looks racy.  The vcpu can remove its kvmclock concurrently with
this access, and src will be NULL.

Can you point me to the discussion that moved this to be a vm ioctl?  In
general vm ioctls that do things for all vcpus are racy, like here. 
You're accessing variables that are protected by the vcpu mutex, and not
taking the mutex (nor can you, since it is held while the guest is
running, unlike most kernel mutexes).

 + mark_page_dirty(vcpu-kvm, vcpu-arch.time  PAGE_SHIFT);
 + }
 + return 0;
 +}
 +
  long kvm_arch_vm_ioctl(struct file *filp,
  unsigned int ioctl, unsigned long arg)
  {
 @@ -3351,6 +3372,10 @@ long kvm_arch_vm_ioctl(struct file *filp,
   r = 0;
   break;
   }
 + case KVMCLOCK_GUEST_PAUSED: {
 + r = kvm_set_guest_paused(kvm);
 + break;
 + }
  
   default:
   ;
 diff --git a/include/linux/kvm.h b/include/linux/kvm.h
 index 68e67e5..4ffe0df 100644
 --- a/include/linux/kvm.h
 +++ b/include/linux/kvm.h
 @@ -558,6 +558,7 @@ struct kvm_ppc_pvinfo {
  #define KVM_CAP_PPC_PAPR 68
  #define KVM_CAP_S390_GMAP 71
  #define KVM_CAP_TSC_DEADLINE_TIMER 72
 +#define KVM_CAP_GUEST_PAUSED 73
  
  #ifdef KVM_CAP_IRQ_ROUTING
  
 @@ -763,6 +764,8 @@ struct kvm_clock_data {
  #define KVM_CREATE_SPAPR_TCE   _IOW(KVMIO,  0xa8, struct 
 kvm_create_spapr_tce)
  /* Available with KVM_CAP_RMA */
  #define KVM_ALLOCATE_RMA   _IOR(KVMIO,  0xa9, struct kvm_allocate_rma)
 +/* VM is being stopped by host */
 +#define KVMCLOCK_GUEST_PAUSED  _IO(KVMIO,   0xaa)
  
  #define KVM_DEV_ASSIGN_ENABLE_IOMMU  (1  0)
  


-- 
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/4 V10] Add ioctl for KVMCLOCK_GUEST_STOPPED

2012-01-30 Thread Avi Kivity
On 01/30/2012 05:07 PM, Avi Kivity wrote:
  +Parameters: None
  +Returns: 0 on success, -1 on error
  +
  +This signals to the host kernel that the specified guest is being paused by
  +userspace.  The host will set a flag in the pvclock structure that is 
  checked
  +from the soft lockup watchdog.  This ioctl can be called during pause or
  +unpause.
  +
   5. The kvm_run structure
   
   
  +/*
  + * kvm_set_guest_paused() indicates to the guest kernel that it has been
  + * stopped by the hypervisor.  This function will be called from the host 
  only.
  + */
  +static int kvm_set_guest_paused(struct kvm *kvm)
  +{
  +   struct kvm_vcpu *vcpu;
  +   struct pvclock_vcpu_time_info *src;
  +   int i;
  +
  +   kvm_for_each_vcpu(i, vcpu, kvm) {
  +   if (!vcpu-arch.time_page)
  +   continue;
  +   src = vcpu-arch.hv_clock;
  +   src-flags |= PVCLOCK_GUEST_STOPPED;

 This looks racy.  The vcpu can remove its kvmclock concurrently with
 this access, and src will be NULL.

 Can you point me to the discussion that moved this to be a vm ioctl?  In
 general vm ioctls that do things for all vcpus are racy, like here. 
 You're accessing variables that are protected by the vcpu mutex, and not
 taking the mutex (nor can you, since it is held while the guest is
 running, unlike most kernel mutexes).


Note, the standard way to fix this race is to
kvm_make_request(KVM_REQ_KVMCLOCK_STOP) and kvm_vcpu_kick(vcpu), and do
the update in vcpu_enter_guest().  But I think this is needless
complexity and want to understand what motivated the move to a vm ioctl.

-- 
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/4 V10] Add ioctl for KVMCLOCK_GUEST_STOPPED

2012-01-30 Thread Eric B Munson
On Mon, 30 Jan 2012, Avi Kivity wrote:

 On 01/17/2012 08:40 PM, Eric B Munson wrote:
  Now that we have a flag that will tell the guest it was suspended, create an
  interface for that communication using a KVM ioctl.
 
 
  diff --git a/Documentation/virtual/kvm/api.txt 
  b/Documentation/virtual/kvm/api.txt
  index e1d94bf..1931e5c 100644
  --- a/Documentation/virtual/kvm/api.txt
  +++ b/Documentation/virtual/kvm/api.txt
  @@ -1491,6 +1491,19 @@ following algorithm:
   Some guests configure the LINT1 NMI input to cause a panic, aiding in
   debugging.
   
  +4.65 KVMCLOCK_GUEST_PAUSED
  +
  +Capability: KVM_CAP_GUEST_PAUSED
  +Architechtures: Any that implement pvclocks (currently x86 only)
  +Type: vcpu ioctl
 
 vm ioctl.
 
  +Parameters: None
  +Returns: 0 on success, -1 on error
  +
  +This signals to the host kernel that the specified guest is being paused by
  +userspace.  The host will set a flag in the pvclock structure that is 
  checked
  +from the soft lockup watchdog.  This ioctl can be called during pause or
  +unpause.
  +
   5. The kvm_run structure
   
   
  +/*
  + * kvm_set_guest_paused() indicates to the guest kernel that it has been
  + * stopped by the hypervisor.  This function will be called from the host 
  only.
  + */
  +static int kvm_set_guest_paused(struct kvm *kvm)
  +{
  +   struct kvm_vcpu *vcpu;
  +   struct pvclock_vcpu_time_info *src;
  +   int i;
  +
  +   kvm_for_each_vcpu(i, vcpu, kvm) {
  +   if (!vcpu-arch.time_page)
  +   continue;
  +   src = vcpu-arch.hv_clock;
  +   src-flags |= PVCLOCK_GUEST_STOPPED;
 
 This looks racy.  The vcpu can remove its kvmclock concurrently with
 this access, and src will be NULL.
 
 Can you point me to the discussion that moved this to be a vm ioctl?  In
 general vm ioctls that do things for all vcpus are racy, like here. 
 You're accessing variables that are protected by the vcpu mutex, and not
 taking the mutex (nor can you, since it is held while the guest is
 running, unlike most kernel mutexes).
 

Jan Kiszka suggested that becuase there isn't a use case for notifying
individual vcpus (can vcpu's be paused individually?) that it makes more sense
to have a vm ioctl.

http://thread.gmane.org/gmane.comp.emulators.qemu/131624

If the per vcpu ioctl is the right choice I can resend those patches.

Eric


signature.asc
Description: Digital signature


Re: [PATCH 3/4 V10] Add ioctl for KVMCLOCK_GUEST_STOPPED

2012-01-30 Thread Eric B Munson
On Mon, 30 Jan 2012, Avi Kivity wrote:

 On 01/30/2012 05:07 PM, Avi Kivity wrote:
   +Parameters: None
   +Returns: 0 on success, -1 on error
   +
   +This signals to the host kernel that the specified guest is being paused 
   by
   +userspace.  The host will set a flag in the pvclock structure that is 
   checked
   +from the soft lockup watchdog.  This ioctl can be called during pause or
   +unpause.
   +
5. The kvm_run structure


   +/*
   + * kvm_set_guest_paused() indicates to the guest kernel that it has been
   + * stopped by the hypervisor.  This function will be called from the 
   host only.
   + */
   +static int kvm_set_guest_paused(struct kvm *kvm)
   +{
   + struct kvm_vcpu *vcpu;
   + struct pvclock_vcpu_time_info *src;
   + int i;
   +
   + kvm_for_each_vcpu(i, vcpu, kvm) {
   + if (!vcpu-arch.time_page)
   + continue;
   + src = vcpu-arch.hv_clock;
   + src-flags |= PVCLOCK_GUEST_STOPPED;
 
  This looks racy.  The vcpu can remove its kvmclock concurrently with
  this access, and src will be NULL.
 
  Can you point me to the discussion that moved this to be a vm ioctl?  In
  general vm ioctls that do things for all vcpus are racy, like here. 
  You're accessing variables that are protected by the vcpu mutex, and not
  taking the mutex (nor can you, since it is held while the guest is
  running, unlike most kernel mutexes).
 
 
 Note, the standard way to fix this race is to
 kvm_make_request(KVM_REQ_KVMCLOCK_STOP) and kvm_vcpu_kick(vcpu), and do
 the update in vcpu_enter_guest().  But I think this is needless
 complexity and want to understand what motivated the move to a vm ioctl.
 

I will hold off on fixing this race until we settle the vm or vcpu ioctl
question.

Eric


signature.asc
Description: Digital signature


Re: [PATCH 3/4 V10] Add ioctl for KVMCLOCK_GUEST_STOPPED

2012-01-30 Thread Avi Kivity
On 01/30/2012 05:32 PM, Eric B Munson wrote:
  
  Can you point me to the discussion that moved this to be a vm ioctl?  In
  general vm ioctls that do things for all vcpus are racy, like here. 
  You're accessing variables that are protected by the vcpu mutex, and not
  taking the mutex (nor can you, since it is held while the guest is
  running, unlike most kernel mutexes).
  

 Jan Kiszka suggested that becuase there isn't a use case for notifying
 individual vcpus (can vcpu's be paused individually?

They can, though the guest will grind to a halt very soon.

 ) that it makes more sense
 to have a vm ioctl.

 http://thread.gmane.org/gmane.comp.emulators.qemu/131624

 If the per vcpu ioctl is the right choice I can resend those patches.

The races are solvable but I think it's easier in userspace.  It's also
more flexible, though I don't really see a use for this flexibility.

-- 
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/4 V10] Add ioctl for KVMCLOCK_GUEST_STOPPED

2012-01-30 Thread Jan Kiszka
On 2012-01-30 16:07, Avi Kivity wrote:
 On 01/17/2012 08:40 PM, Eric B Munson wrote:
 Now that we have a flag that will tell the guest it was suspended, create an
 interface for that communication using a KVM ioctl.


 diff --git a/Documentation/virtual/kvm/api.txt 
 b/Documentation/virtual/kvm/api.txt
 index e1d94bf..1931e5c 100644
 --- a/Documentation/virtual/kvm/api.txt
 +++ b/Documentation/virtual/kvm/api.txt
 @@ -1491,6 +1491,19 @@ following algorithm:
  Some guests configure the LINT1 NMI input to cause a panic, aiding in
  debugging.
  
 +4.65 KVMCLOCK_GUEST_PAUSED
 +
 +Capability: KVM_CAP_GUEST_PAUSED
 +Architechtures: Any that implement pvclocks (currently x86 only)
 +Type: vcpu ioctl
 
 vm ioctl.
 
 +Parameters: None
 +Returns: 0 on success, -1 on error
 +
 +This signals to the host kernel that the specified guest is being paused by
 +userspace.  The host will set a flag in the pvclock structure that is 
 checked
 +from the soft lockup watchdog.  This ioctl can be called during pause or
 +unpause.
 +
  5. The kvm_run structure
  
  
 +/*
 + * kvm_set_guest_paused() indicates to the guest kernel that it has been
 + * stopped by the hypervisor.  This function will be called from the host 
 only.
 + */
 +static int kvm_set_guest_paused(struct kvm *kvm)
 +{
 +struct kvm_vcpu *vcpu;
 +struct pvclock_vcpu_time_info *src;
 +int i;
 +
 +kvm_for_each_vcpu(i, vcpu, kvm) {
 +if (!vcpu-arch.time_page)
 +continue;
 +src = vcpu-arch.hv_clock;
 +src-flags |= PVCLOCK_GUEST_STOPPED;
 
 This looks racy.  The vcpu can remove its kvmclock concurrently with
 this access, and src will be NULL.

There is no race here (src is member of the vcpu), but arch.time might
have become invalid. KVM_REQ_CLOCK_UPDATE instead of mark_page_dirty
would indeed be the way to go. Trivial solution, I would say.

However, the concept of guest stopped has VM, not VCPU scope. That
makes the call more appropriate as a VM ioctl. If that thing should
really become per-vcpu, at least call it KVMCLOCK_VCPU_STOPPED.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/4 V10] Add ioctl for KVMCLOCK_GUEST_STOPPED

2012-01-30 Thread Avi Kivity
On 01/30/2012 06:18 PM, Jan Kiszka wrote:
  
  This looks racy.  The vcpu can remove its kvmclock concurrently with
  this access, and src will be NULL.

 There is no race here (src is member of the vcpu), but arch.time might
 have become invalid. KVM_REQ_CLOCK_UPDATE instead of mark_page_dirty
 would indeed be the way to go. Trivial solution, I would say.

 However, the concept of guest stopped has VM, not VCPU scope. 

We're not stopping the guest here, just setting a flag in kvmclock,
which certainly is a per-vcpu thing.

 That
 makes the call more appropriate as a VM ioctl. If that thing should
 really become per-vcpu, at least call it KVMCLOCK_VCPU_STOPPED.


All current ioctls start with KVM_. Maybe KVM_KVMCLOCK_CTRL?

-- 
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html