Re: [PATCH v4 1/6] kvm: add device control API

2013-04-26 Thread Gleb Natapov
On Thu, Apr 25, 2013 at 01:59:20PM -0500, Scott Wood wrote:
 On 04/25/2013 01:22:04 PM, Gleb Natapov wrote:
 On Thu, Apr 25, 2013 at 11:51:08AM -0500, Scott Wood wrote:
  On 04/25/2013 05:47:39 AM, Alexander Graf wrote:
  
  On 25.04.2013, at 11:43, Gleb Natapov wrote:
  
   +void kvm_device_put(struct kvm_device *dev)
   +{
   +  if (atomic_dec_and_test(dev-users))
   +  dev-ops-destroy(dev);
   +}
   +
   +static int kvm_device_release(struct inode *inode, struct file
  *filp)
   +{
   +  struct kvm_device *dev = filp-private_data;
   +  struct kvm *kvm = dev-kvm;
   +
   +  kvm_device_put(dev);
   +  kvm_put_kvm(kvm);
   We may put kvm only if users goes to zero, otherwise kvm can be
   freed while something holds a reference to a device. Why not make
   kvm_device_put() do it?
  
  Nice catch. I'll change the patch so it does the kvm_put_kvm
  inside kvm_device_put's destroy branch.
 
  No, please don't.  The KVM reference being put here is associated
  with the file descriptor, not with the MPIC object.
 Is it so? Device holds a pointer to kvm, so it increments kvm
 reference
 to make sure the pointer is valid. What prevents kvm from been
 destroyed
 while device is still in use in current code?
 
 Where will that kvm pointer be used, after all the file descriptors
 go away and the vcpus stop running?  mmio_mapped guards against
 unmapping the MMIO if it's already been unmapped due to KVM
 destruction.  We don't have any timers or other delayed work.
 
MPIC does not, but timer device will have one.

 Well, I do see one place, that Alex added -- the NULLing out of
 dev-kvm-arch.mpic, which didn't exist in my patchset.
 
  that change I think you'll have circular references and thus a
  memory leak, because the vcpus can hold a reference to the MPIC
  object.
 
 How circular reference can be created?
 
 MPIC holds reference on KVM, vcpu holds reference on MPIC, and vcpu
 is not destroyed until KVM is destroyed.
 
Yes, you are right. So we need to think about how to fix it in a
different way. What about holding all devices in kvm-devices[] array
and destroy them during kvm destruction, like we do for vcpus?

--
Gleb.
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 1/6] kvm: add device control API

2013-04-26 Thread Alexander Graf

On 26.04.2013, at 11:53, Gleb Natapov wrote:

 On Thu, Apr 25, 2013 at 01:59:20PM -0500, Scott Wood wrote:
 On 04/25/2013 01:22:04 PM, Gleb Natapov wrote:
 On Thu, Apr 25, 2013 at 11:51:08AM -0500, Scott Wood wrote:
 On 04/25/2013 05:47:39 AM, Alexander Graf wrote:
 
 On 25.04.2013, at 11:43, Gleb Natapov wrote:
 
 +void kvm_device_put(struct kvm_device *dev)
 +{
 +   if (atomic_dec_and_test(dev-users))
 +   dev-ops-destroy(dev);
 +}
 +
 +static int kvm_device_release(struct inode *inode, struct file
 *filp)
 +{
 +   struct kvm_device *dev = filp-private_data;
 +   struct kvm *kvm = dev-kvm;
 +
 +   kvm_device_put(dev);
 +   kvm_put_kvm(kvm);
 We may put kvm only if users goes to zero, otherwise kvm can be
 freed while something holds a reference to a device. Why not make
 kvm_device_put() do it?
 
 Nice catch. I'll change the patch so it does the kvm_put_kvm
 inside kvm_device_put's destroy branch.
 
 No, please don't.  The KVM reference being put here is associated
 with the file descriptor, not with the MPIC object.
 Is it so? Device holds a pointer to kvm, so it increments kvm
 reference
 to make sure the pointer is valid. What prevents kvm from been
 destroyed
 while device is still in use in current code?
 
 Where will that kvm pointer be used, after all the file descriptors
 go away and the vcpus stop running?  mmio_mapped guards against
 unmapping the MMIO if it's already been unmapped due to KVM
 destruction.  We don't have any timers or other delayed work.
 
 MPIC does not, but timer device will have one.
 
 Well, I do see one place, that Alex added -- the NULLing out of
 dev-kvm-arch.mpic, which didn't exist in my patchset.
 
 that change I think you'll have circular references and thus a
 memory leak, because the vcpus can hold a reference to the MPIC
 object.
 
 How circular reference can be created?
 
 MPIC holds reference on KVM, vcpu holds reference on MPIC, and vcpu
 is not destroyed until KVM is destroyed.
 
 Yes, you are right. So we need to think about how to fix it in a
 different way. What about holding all devices in kvm-devices[] array
 and destroy them during kvm destruction, like we do for vcpus?

You should really look at your patches in LIFO order :). A patch doing that was 
already sent by Scott last night and is in v4 of my patch set.


Alex

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 1/6] kvm: add device control API

2013-04-26 Thread Gleb Natapov
On Fri, Apr 26, 2013 at 11:55:27AM +0200, Alexander Graf wrote:
 
 On 26.04.2013, at 11:53, Gleb Natapov wrote:
 
  On Thu, Apr 25, 2013 at 01:59:20PM -0500, Scott Wood wrote:
  On 04/25/2013 01:22:04 PM, Gleb Natapov wrote:
  On Thu, Apr 25, 2013 at 11:51:08AM -0500, Scott Wood wrote:
  On 04/25/2013 05:47:39 AM, Alexander Graf wrote:
  
  On 25.04.2013, at 11:43, Gleb Natapov wrote:
  
  +void kvm_device_put(struct kvm_device *dev)
  +{
  + if (atomic_dec_and_test(dev-users))
  + dev-ops-destroy(dev);
  +}
  +
  +static int kvm_device_release(struct inode *inode, struct file
  *filp)
  +{
  + struct kvm_device *dev = filp-private_data;
  + struct kvm *kvm = dev-kvm;
  +
  + kvm_device_put(dev);
  + kvm_put_kvm(kvm);
  We may put kvm only if users goes to zero, otherwise kvm can be
  freed while something holds a reference to a device. Why not make
  kvm_device_put() do it?
  
  Nice catch. I'll change the patch so it does the kvm_put_kvm
  inside kvm_device_put's destroy branch.
  
  No, please don't.  The KVM reference being put here is associated
  with the file descriptor, not with the MPIC object.
  Is it so? Device holds a pointer to kvm, so it increments kvm
  reference
  to make sure the pointer is valid. What prevents kvm from been
  destroyed
  while device is still in use in current code?
  
  Where will that kvm pointer be used, after all the file descriptors
  go away and the vcpus stop running?  mmio_mapped guards against
  unmapping the MMIO if it's already been unmapped due to KVM
  destruction.  We don't have any timers or other delayed work.
  
  MPIC does not, but timer device will have one.
  
  Well, I do see one place, that Alex added -- the NULLing out of
  dev-kvm-arch.mpic, which didn't exist in my patchset.
  
  that change I think you'll have circular references and thus a
  memory leak, because the vcpus can hold a reference to the MPIC
  object.
  
  How circular reference can be created?
  
  MPIC holds reference on KVM, vcpu holds reference on MPIC, and vcpu
  is not destroyed until KVM is destroyed.
  
  Yes, you are right. So we need to think about how to fix it in a
  different way. What about holding all devices in kvm-devices[] array
  and destroy them during kvm destruction, like we do for vcpus?
 
 You should really look at your patches in LIFO order :). A patch doing that 
 was already sent by Scott last night and is in v4 of my patch set.
 
 
I tried! This causes starvation for some patches. I need better algorithm :)

--
Gleb.
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 1/6] kvm: add device control API

2013-04-26 Thread Gleb Natapov
On Thu, Apr 25, 2013 at 01:59:20PM -0500, Scott Wood wrote:
 On 04/25/2013 01:22:04 PM, Gleb Natapov wrote:
 On Thu, Apr 25, 2013 at 11:51:08AM -0500, Scott Wood wrote:
  On 04/25/2013 05:47:39 AM, Alexander Graf wrote:
  
  On 25.04.2013, at 11:43, Gleb Natapov wrote:
  
   +void kvm_device_put(struct kvm_device *dev)
   +{
   +  if (atomic_dec_and_test(dev-users))
   +  dev-ops-destroy(dev);
   +}
   +
   +static int kvm_device_release(struct inode *inode, struct file
  *filp)
   +{
   +  struct kvm_device *dev = filp-private_data;
   +  struct kvm *kvm = dev-kvm;
   +
   +  kvm_device_put(dev);
   +  kvm_put_kvm(kvm);
   We may put kvm only if users goes to zero, otherwise kvm can be
   freed while something holds a reference to a device. Why not make
   kvm_device_put() do it?
  
  Nice catch. I'll change the patch so it does the kvm_put_kvm
  inside kvm_device_put's destroy branch.
 
  No, please don't.  The KVM reference being put here is associated
  with the file descriptor, not with the MPIC object.
 Is it so? Device holds a pointer to kvm, so it increments kvm
 reference
 to make sure the pointer is valid. What prevents kvm from been
 destroyed
 while device is still in use in current code?
 
 Where will that kvm pointer be used, after all the file descriptors
 go away and the vcpus stop running?  mmio_mapped guards against
 unmapping the MMIO if it's already been unmapped due to KVM
 destruction.  We don't have any timers or other delayed work.
 
MPIC does not, but timer device will have one.

 Well, I do see one place, that Alex added -- the NULLing out of
 dev-kvm-arch.mpic, which didn't exist in my patchset.
 
  that change I think you'll have circular references and thus a
  memory leak, because the vcpus can hold a reference to the MPIC
  object.
 
 How circular reference can be created?
 
 MPIC holds reference on KVM, vcpu holds reference on MPIC, and vcpu
 is not destroyed until KVM is destroyed.
 
Yes, you are right. So we need to think about how to fix it in a
different way. What about holding all devices in kvm-devices[] array
and destroy them during kvm destruction, like we do for vcpus?

--
Gleb.
--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 1/6] kvm: add device control API

2013-04-26 Thread Gleb Natapov
On Fri, Apr 26, 2013 at 11:55:27AM +0200, Alexander Graf wrote:
 
 On 26.04.2013, at 11:53, Gleb Natapov wrote:
 
  On Thu, Apr 25, 2013 at 01:59:20PM -0500, Scott Wood wrote:
  On 04/25/2013 01:22:04 PM, Gleb Natapov wrote:
  On Thu, Apr 25, 2013 at 11:51:08AM -0500, Scott Wood wrote:
  On 04/25/2013 05:47:39 AM, Alexander Graf wrote:
  
  On 25.04.2013, at 11:43, Gleb Natapov wrote:
  
  +void kvm_device_put(struct kvm_device *dev)
  +{
  + if (atomic_dec_and_test(dev-users))
  + dev-ops-destroy(dev);
  +}
  +
  +static int kvm_device_release(struct inode *inode, struct file
  *filp)
  +{
  + struct kvm_device *dev = filp-private_data;
  + struct kvm *kvm = dev-kvm;
  +
  + kvm_device_put(dev);
  + kvm_put_kvm(kvm);
  We may put kvm only if users goes to zero, otherwise kvm can be
  freed while something holds a reference to a device. Why not make
  kvm_device_put() do it?
  
  Nice catch. I'll change the patch so it does the kvm_put_kvm
  inside kvm_device_put's destroy branch.
  
  No, please don't.  The KVM reference being put here is associated
  with the file descriptor, not with the MPIC object.
  Is it so? Device holds a pointer to kvm, so it increments kvm
  reference
  to make sure the pointer is valid. What prevents kvm from been
  destroyed
  while device is still in use in current code?
  
  Where will that kvm pointer be used, after all the file descriptors
  go away and the vcpus stop running?  mmio_mapped guards against
  unmapping the MMIO if it's already been unmapped due to KVM
  destruction.  We don't have any timers or other delayed work.
  
  MPIC does not, but timer device will have one.
  
  Well, I do see one place, that Alex added -- the NULLing out of
  dev-kvm-arch.mpic, which didn't exist in my patchset.
  
  that change I think you'll have circular references and thus a
  memory leak, because the vcpus can hold a reference to the MPIC
  object.
  
  How circular reference can be created?
  
  MPIC holds reference on KVM, vcpu holds reference on MPIC, and vcpu
  is not destroyed until KVM is destroyed.
  
  Yes, you are right. So we need to think about how to fix it in a
  different way. What about holding all devices in kvm-devices[] array
  and destroy them during kvm destruction, like we do for vcpus?
 
 You should really look at your patches in LIFO order :). A patch doing that 
 was already sent by Scott last night and is in v4 of my patch set.
 
 
I tried! This causes starvation for some patches. I need better algorithm :)

--
Gleb.
--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 1/6] kvm: add device control API

2013-04-25 Thread Gleb Natapov
On Fri, Apr 12, 2013 at 07:08:42PM -0500, Scott Wood wrote:
 Currently, devices that are emulated inside KVM are configured in a
 hardcoded manner based on an assumption that any given architecture
 only has one way to do it.  If there's any need to access device state,
 it is done through inflexible one-purpose-only IOCTLs (e.g.
 KVM_GET/SET_LAPIC).  Defining new IOCTLs for every little thing is
 cumbersome and depletes a limited numberspace.
 
 This API provides a mechanism to instantiate a device of a certain
 type, returning an ID that can be used to set/get attributes of the
 device.  Attributes may include configuration parameters (e.g.
 register base address), device state, operational commands, etc.  It
 is similar to the ONE_REG API, except that it acts on devices rather
 than vcpus.
 
 Both device types and individual attributes can be tested without having
 to create the device or get/set the attribute, without the need for
 separately managing enumerated capabilities.
 
 Signed-off-by: Scott Wood scottw...@freescale.com
 ---
 v4:
  - Move some boilerplate back into generic code, as requested by Gleb.
File descriptor management and reference counting is no longer the
concern of the device implementation.
 
  - Don't hold kvm-lock during create.  The original reasons
for doing so have vanished as for as MPIC is concerned, and
this avoids needing to answer the question of whether to
hold the lock during destroy as well.
 
Paul, you may need to acquire the lock yourself in kvm_create_xics()
to protect the -EEXIST check.
 
 v3: remove some changes that were merged into this patch by accident,
 and fix the error documentation for KVM_CREATE_DEVICE.
 ---
  Documentation/virtual/kvm/api.txt|   70 
  Documentation/virtual/kvm/devices/README |1 +
  include/linux/kvm_host.h |   35 
  include/uapi/linux/kvm.h |   27 +++
  virt/kvm/kvm_main.c  |  129 
 ++
  5 files changed, 262 insertions(+)
  create mode 100644 Documentation/virtual/kvm/devices/README
 
 diff --git a/Documentation/virtual/kvm/api.txt 
 b/Documentation/virtual/kvm/api.txt
 index 976eb65..d52f3f9 100644
 --- a/Documentation/virtual/kvm/api.txt
 +++ b/Documentation/virtual/kvm/api.txt
 @@ -2173,6 +2173,76 @@ header; first `n_valid' valid entries with contents 
 from the data
  written, then `n_invalid' invalid entries, invalidating any previously
  valid entries found.
  
 +4.79 KVM_CREATE_DEVICE
 +
 +Capability: KVM_CAP_DEVICE_CTRL
 +Type: vm ioctl
 +Parameters: struct kvm_create_device (in/out)
 +Returns: 0 on success, -1 on error
 +Errors:
 +  ENODEV: The device type is unknown or unsupported
 +  EEXIST: Device already created, and this type of device may not
 +  be instantiated multiple times
 +
 +  Other error conditions may be defined by individual device types or
 +  have their standard meanings.
 +
 +Creates an emulated device in the kernel.  The file descriptor returned
 +in fd can be used with KVM_SET/GET/HAS_DEVICE_ATTR.
 +
 +If the KVM_CREATE_DEVICE_TEST flag is set, only test whether the
 +device type is supported (not necessarily whether it can be created
 +in the current vm).
 +
 +Individual devices should not define flags.  Attributes should be used
 +for specifying any behavior that is not implied by the device type
 +number.
 +
 +struct kvm_create_device {
 + __u32   type;   /* in: KVM_DEV_TYPE_xxx */
 + __u32   fd; /* out: device handle */
 + __u32   flags;  /* in: KVM_CREATE_DEVICE_xxx */
 +};
Should we add __u32 padding here to make struct size multiple of u64?

 +
 +4.80 KVM_SET_DEVICE_ATTR/KVM_GET_DEVICE_ATTR
 +
 +Capability: KVM_CAP_DEVICE_CTRL
 +Type: device ioctl
 +Parameters: struct kvm_device_attr
 +Returns: 0 on success, -1 on error
 +Errors:
 +  ENXIO:  The group or attribute is unknown/unsupported for this device
 +  EPERM:  The attribute cannot (currently) be accessed this way
 +  (e.g. read-only attribute, or attribute that only makes
 +  sense when the device is in a different state)
 +
 +  Other error conditions may be defined by individual device types.
 +
 +Gets/sets a specified piece of device configuration and/or state.  The
 +semantics are device-specific.  See individual device documentation in
 +the devices directory.  As with ONE_REG, the size of the data
 +transferred is defined by the particular attribute.
 +
 +struct kvm_device_attr {
 + __u32   flags;  /* no flags currently defined */
 + __u32   group;  /* device-defined */
 + __u64   attr;   /* group-defined */
 + __u64   addr;   /* userspace address of attr data */
 +};
 +
 +4.81 KVM_HAS_DEVICE_ATTR
 +
 +Capability: KVM_CAP_DEVICE_CTRL
 +Type: device ioctl
 +Parameters: struct kvm_device_attr
 +Returns: 0 on success, -1 on error
 +Errors:
 +  ENXIO:  The group or attribute is unknown/unsupported for this device
 

Re: [PATCH v4 1/6] kvm: add device control API

2013-04-25 Thread Alexander Graf

On 25.04.2013, at 11:43, Gleb Natapov wrote:

 On Fri, Apr 12, 2013 at 07:08:42PM -0500, Scott Wood wrote:
 Currently, devices that are emulated inside KVM are configured in a
 hardcoded manner based on an assumption that any given architecture
 only has one way to do it.  If there's any need to access device state,
 it is done through inflexible one-purpose-only IOCTLs (e.g.
 KVM_GET/SET_LAPIC).  Defining new IOCTLs for every little thing is
 cumbersome and depletes a limited numberspace.
 
 This API provides a mechanism to instantiate a device of a certain
 type, returning an ID that can be used to set/get attributes of the
 device.  Attributes may include configuration parameters (e.g.
 register base address), device state, operational commands, etc.  It
 is similar to the ONE_REG API, except that it acts on devices rather
 than vcpus.
 
 Both device types and individual attributes can be tested without having
 to create the device or get/set the attribute, without the need for
 separately managing enumerated capabilities.
 
 Signed-off-by: Scott Wood scottw...@freescale.com
 ---
 v4:
 - Move some boilerplate back into generic code, as requested by Gleb.
   File descriptor management and reference counting is no longer the
   concern of the device implementation.
 
 - Don't hold kvm-lock during create.  The original reasons
   for doing so have vanished as for as MPIC is concerned, and
   this avoids needing to answer the question of whether to
   hold the lock during destroy as well.
 
   Paul, you may need to acquire the lock yourself in kvm_create_xics()
   to protect the -EEXIST check.
 
 v3: remove some changes that were merged into this patch by accident,
 and fix the error documentation for KVM_CREATE_DEVICE.
 ---
 Documentation/virtual/kvm/api.txt|   70 
 Documentation/virtual/kvm/devices/README |1 +
 include/linux/kvm_host.h |   35 
 include/uapi/linux/kvm.h |   27 +++
 virt/kvm/kvm_main.c  |  129 
 ++
 5 files changed, 262 insertions(+)
 create mode 100644 Documentation/virtual/kvm/devices/README
 
 diff --git a/Documentation/virtual/kvm/api.txt 
 b/Documentation/virtual/kvm/api.txt
 index 976eb65..d52f3f9 100644
 --- a/Documentation/virtual/kvm/api.txt
 +++ b/Documentation/virtual/kvm/api.txt
 @@ -2173,6 +2173,76 @@ header; first `n_valid' valid entries with contents 
 from the data
 written, then `n_invalid' invalid entries, invalidating any previously
 valid entries found.
 
 +4.79 KVM_CREATE_DEVICE
 +
 +Capability: KVM_CAP_DEVICE_CTRL
 +Type: vm ioctl
 +Parameters: struct kvm_create_device (in/out)
 +Returns: 0 on success, -1 on error
 +Errors:
 +  ENODEV: The device type is unknown or unsupported
 +  EEXIST: Device already created, and this type of device may not
 +  be instantiated multiple times
 +
 +  Other error conditions may be defined by individual device types or
 +  have their standard meanings.
 +
 +Creates an emulated device in the kernel.  The file descriptor returned
 +in fd can be used with KVM_SET/GET/HAS_DEVICE_ATTR.
 +
 +If the KVM_CREATE_DEVICE_TEST flag is set, only test whether the
 +device type is supported (not necessarily whether it can be created
 +in the current vm).
 +
 +Individual devices should not define flags.  Attributes should be used
 +for specifying any behavior that is not implied by the device type
 +number.
 +
 +struct kvm_create_device {
 +__u32   type;   /* in: KVM_DEV_TYPE_xxx */
 +__u32   fd; /* out: device handle */
 +__u32   flags;  /* in: KVM_CREATE_DEVICE_xxx */
 +};
 Should we add __u32 padding here to make struct size multiple of u64?

Do you know of any arch that pads structs to u64 boundaries? x86_64 doesn't and 
ppc64 doesn't either.

 
 +
 +4.80 KVM_SET_DEVICE_ATTR/KVM_GET_DEVICE_ATTR
 +
 +Capability: KVM_CAP_DEVICE_CTRL
 +Type: device ioctl
 +Parameters: struct kvm_device_attr
 +Returns: 0 on success, -1 on error
 +Errors:
 +  ENXIO:  The group or attribute is unknown/unsupported for this device
 +  EPERM:  The attribute cannot (currently) be accessed this way
 +  (e.g. read-only attribute, or attribute that only makes
 +  sense when the device is in a different state)
 +
 +  Other error conditions may be defined by individual device types.
 +
 +Gets/sets a specified piece of device configuration and/or state.  The
 +semantics are device-specific.  See individual device documentation in
 +the devices directory.  As with ONE_REG, the size of the data
 +transferred is defined by the particular attribute.
 +
 +struct kvm_device_attr {
 +__u32   flags;  /* no flags currently defined */
 +__u32   group;  /* device-defined */
 +__u64   attr;   /* group-defined */
 +__u64   addr;   /* userspace address of attr data */
 +};
 +
 +4.81 KVM_HAS_DEVICE_ATTR
 +
 +Capability: KVM_CAP_DEVICE_CTRL
 +Type: device ioctl
 +Parameters: struct 

Re: [PATCH v4 1/6] kvm: add device control API

2013-04-25 Thread Gleb Natapov
On Thu, Apr 25, 2013 at 12:47:39PM +0200, Alexander Graf wrote:
 
 On 25.04.2013, at 11:43, Gleb Natapov wrote:
 
  On Fri, Apr 12, 2013 at 07:08:42PM -0500, Scott Wood wrote:
  Currently, devices that are emulated inside KVM are configured in a
  hardcoded manner based on an assumption that any given architecture
  only has one way to do it.  If there's any need to access device state,
  it is done through inflexible one-purpose-only IOCTLs (e.g.
  KVM_GET/SET_LAPIC).  Defining new IOCTLs for every little thing is
  cumbersome and depletes a limited numberspace.
  
  This API provides a mechanism to instantiate a device of a certain
  type, returning an ID that can be used to set/get attributes of the
  device.  Attributes may include configuration parameters (e.g.
  register base address), device state, operational commands, etc.  It
  is similar to the ONE_REG API, except that it acts on devices rather
  than vcpus.
  
  Both device types and individual attributes can be tested without having
  to create the device or get/set the attribute, without the need for
  separately managing enumerated capabilities.
  
  Signed-off-by: Scott Wood scottw...@freescale.com
  ---
  v4:
  - Move some boilerplate back into generic code, as requested by Gleb.
File descriptor management and reference counting is no longer the
concern of the device implementation.
  
  - Don't hold kvm-lock during create.  The original reasons
for doing so have vanished as for as MPIC is concerned, and
this avoids needing to answer the question of whether to
hold the lock during destroy as well.
  
Paul, you may need to acquire the lock yourself in kvm_create_xics()
to protect the -EEXIST check.
  
  v3: remove some changes that were merged into this patch by accident,
  and fix the error documentation for KVM_CREATE_DEVICE.
  ---
  Documentation/virtual/kvm/api.txt|   70 
  Documentation/virtual/kvm/devices/README |1 +
  include/linux/kvm_host.h |   35 
  include/uapi/linux/kvm.h |   27 +++
  virt/kvm/kvm_main.c  |  129 
  ++
  5 files changed, 262 insertions(+)
  create mode 100644 Documentation/virtual/kvm/devices/README
  
  diff --git a/Documentation/virtual/kvm/api.txt 
  b/Documentation/virtual/kvm/api.txt
  index 976eb65..d52f3f9 100644
  --- a/Documentation/virtual/kvm/api.txt
  +++ b/Documentation/virtual/kvm/api.txt
  @@ -2173,6 +2173,76 @@ header; first `n_valid' valid entries with contents 
  from the data
  written, then `n_invalid' invalid entries, invalidating any previously
  valid entries found.
  
  +4.79 KVM_CREATE_DEVICE
  +
  +Capability: KVM_CAP_DEVICE_CTRL
  +Type: vm ioctl
  +Parameters: struct kvm_create_device (in/out)
  +Returns: 0 on success, -1 on error
  +Errors:
  +  ENODEV: The device type is unknown or unsupported
  +  EEXIST: Device already created, and this type of device may not
  +  be instantiated multiple times
  +
  +  Other error conditions may be defined by individual device types or
  +  have their standard meanings.
  +
  +Creates an emulated device in the kernel.  The file descriptor returned
  +in fd can be used with KVM_SET/GET/HAS_DEVICE_ATTR.
  +
  +If the KVM_CREATE_DEVICE_TEST flag is set, only test whether the
  +device type is supported (not necessarily whether it can be created
  +in the current vm).
  +
  +Individual devices should not define flags.  Attributes should be used
  +for specifying any behavior that is not implied by the device type
  +number.
  +
  +struct kvm_create_device {
  +  __u32   type;   /* in: KVM_DEV_TYPE_xxx */
  +  __u32   fd; /* out: device handle */
  +  __u32   flags;  /* in: KVM_CREATE_DEVICE_xxx */
  +};
  Should we add __u32 padding here to make struct size multiple of u64?
 
 Do you know of any arch that pads structs to u64 boundaries? x86_64 doesn't 
 and ppc64 doesn't either.
 
Not really. I just notices that we pad some structures to that effect.

  
  +
  +4.80 KVM_SET_DEVICE_ATTR/KVM_GET_DEVICE_ATTR
  +
  +Capability: KVM_CAP_DEVICE_CTRL
  +Type: device ioctl
  +Parameters: struct kvm_device_attr
  +Returns: 0 on success, -1 on error
  +Errors:
  +  ENXIO:  The group or attribute is unknown/unsupported for this device
  +  EPERM:  The attribute cannot (currently) be accessed this way
  +  (e.g. read-only attribute, or attribute that only makes
  +  sense when the device is in a different state)
  +
  +  Other error conditions may be defined by individual device types.
  +
  +Gets/sets a specified piece of device configuration and/or state.  The
  +semantics are device-specific.  See individual device documentation in
  +the devices directory.  As with ONE_REG, the size of the data
  +transferred is defined by the particular attribute.
  +
  +struct kvm_device_attr {
  +  __u32   flags;  /* no flags currently defined */
  +  __u32   group;  /* 

Re: [PATCH v4 1/6] kvm: add device control API

2013-04-25 Thread Alexander Graf

On 25.04.2013, at 14:07, Gleb Natapov wrote:

 On Thu, Apr 25, 2013 at 12:47:39PM +0200, Alexander Graf wrote:
 
 On 25.04.2013, at 11:43, Gleb Natapov wrote:
 
 On Fri, Apr 12, 2013 at 07:08:42PM -0500, Scott Wood wrote:
 Currently, devices that are emulated inside KVM are configured in a
 hardcoded manner based on an assumption that any given architecture
 only has one way to do it.  If there's any need to access device state,
 it is done through inflexible one-purpose-only IOCTLs (e.g.
 KVM_GET/SET_LAPIC).  Defining new IOCTLs for every little thing is
 cumbersome and depletes a limited numberspace.
 
 This API provides a mechanism to instantiate a device of a certain
 type, returning an ID that can be used to set/get attributes of the
 device.  Attributes may include configuration parameters (e.g.
 register base address), device state, operational commands, etc.  It
 is similar to the ONE_REG API, except that it acts on devices rather
 than vcpus.
 
 Both device types and individual attributes can be tested without having
 to create the device or get/set the attribute, without the need for
 separately managing enumerated capabilities.
 
 Signed-off-by: Scott Wood scottw...@freescale.com
 ---
 v4:
 - Move some boilerplate back into generic code, as requested by Gleb.
  File descriptor management and reference counting is no longer the
  concern of the device implementation.
 
 - Don't hold kvm-lock during create.  The original reasons
  for doing so have vanished as for as MPIC is concerned, and
  this avoids needing to answer the question of whether to
  hold the lock during destroy as well.
 
  Paul, you may need to acquire the lock yourself in kvm_create_xics()
  to protect the -EEXIST check.
 
 v3: remove some changes that were merged into this patch by accident,
 and fix the error documentation for KVM_CREATE_DEVICE.
 ---
 Documentation/virtual/kvm/api.txt|   70 
 Documentation/virtual/kvm/devices/README |1 +
 include/linux/kvm_host.h |   35 
 include/uapi/linux/kvm.h |   27 +++
 virt/kvm/kvm_main.c  |  129 
 ++
 5 files changed, 262 insertions(+)
 create mode 100644 Documentation/virtual/kvm/devices/README
 
 diff --git a/Documentation/virtual/kvm/api.txt 
 b/Documentation/virtual/kvm/api.txt
 index 976eb65..d52f3f9 100644
 --- a/Documentation/virtual/kvm/api.txt
 +++ b/Documentation/virtual/kvm/api.txt
 @@ -2173,6 +2173,76 @@ header; first `n_valid' valid entries with contents 
 from the data
 written, then `n_invalid' invalid entries, invalidating any previously
 valid entries found.
 
 +4.79 KVM_CREATE_DEVICE
 +
 +Capability: KVM_CAP_DEVICE_CTRL
 +Type: vm ioctl
 +Parameters: struct kvm_create_device (in/out)
 +Returns: 0 on success, -1 on error
 +Errors:
 +  ENODEV: The device type is unknown or unsupported
 +  EEXIST: Device already created, and this type of device may not
 +  be instantiated multiple times
 +
 +  Other error conditions may be defined by individual device types or
 +  have their standard meanings.
 +
 +Creates an emulated device in the kernel.  The file descriptor returned
 +in fd can be used with KVM_SET/GET/HAS_DEVICE_ATTR.
 +
 +If the KVM_CREATE_DEVICE_TEST flag is set, only test whether the
 +device type is supported (not necessarily whether it can be created
 +in the current vm).
 +
 +Individual devices should not define flags.  Attributes should be used
 +for specifying any behavior that is not implied by the device type
 +number.
 +
 +struct kvm_create_device {
 +  __u32   type;   /* in: KVM_DEV_TYPE_xxx */
 +  __u32   fd; /* out: device handle */
 +  __u32   flags;  /* in: KVM_CREATE_DEVICE_xxx */
 +};
 Should we add __u32 padding here to make struct size multiple of u64?
 
 Do you know of any arch that pads structs to u64 boundaries? x86_64 doesn't 
 and ppc64 doesn't either.
 
 Not really. I just notices that we pad some structures to that effect.

I don't think we really need to :).

 
 
 +
 +4.80 KVM_SET_DEVICE_ATTR/KVM_GET_DEVICE_ATTR
 +
 +Capability: KVM_CAP_DEVICE_CTRL
 +Type: device ioctl
 +Parameters: struct kvm_device_attr
 +Returns: 0 on success, -1 on error
 +Errors:
 +  ENXIO:  The group or attribute is unknown/unsupported for this device
 +  EPERM:  The attribute cannot (currently) be accessed this way
 +  (e.g. read-only attribute, or attribute that only makes
 +  sense when the device is in a different state)
 +
 +  Other error conditions may be defined by individual device types.
 +
 +Gets/sets a specified piece of device configuration and/or state.  The
 +semantics are device-specific.  See individual device documentation in
 +the devices directory.  As with ONE_REG, the size of the data
 +transferred is defined by the particular attribute.
 +
 +struct kvm_device_attr {
 +  __u32   flags;  /* no flags currently defined */
 +  __u32   group;  /* device-defined */
 +  __u64   attr; 

Re: [PATCH v4 1/6] kvm: add device control API

2013-04-25 Thread Gleb Natapov
On Thu, Apr 25, 2013 at 03:45:14PM +0200, Alexander Graf wrote:
  Please move struct definitions and KVM_CREATE_DEVICE_TEST define out
  from ioctl definition block.
  
  Let me change that in my tree...
  
  So are you sending this via your tree and I should not apply it directly?
 
 I was hoping to have things ready very soon for you to just pull...
 
Make sense since there are PPC patches that depend on this one. 3.10 merge 
windows
will very likely open next week though...

--
Gleb.
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 1/6] kvm: add device control API

2013-04-25 Thread Scott Wood

On 04/25/2013 05:47:39 AM, Alexander Graf wrote:


On 25.04.2013, at 11:43, Gleb Natapov wrote:

 +void kvm_device_put(struct kvm_device *dev)
 +{
 +  if (atomic_dec_and_test(dev-users))
 +  dev-ops-destroy(dev);
 +}
 +
 +static int kvm_device_release(struct inode *inode, struct file  
*filp)

 +{
 +  struct kvm_device *dev = filp-private_data;
 +  struct kvm *kvm = dev-kvm;
 +
 +  kvm_device_put(dev);
 +  kvm_put_kvm(kvm);
 We may put kvm only if users goes to zero, otherwise kvm can be
 freed while something holds a reference to a device. Why not make
 kvm_device_put() do it?

Nice catch. I'll change the patch so it does the kvm_put_kvm inside  
kvm_device_put's destroy branch.


No, please don't.  The KVM reference being put here is associated  
with the file descriptor, not with the MPIC object.  If you make that  
change I think you'll have circular references and thus a memory leak,  
because the vcpus can hold a reference to the MPIC object.


-Scott
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 1/6] kvm: add device control API

2013-04-25 Thread Gleb Natapov
On Thu, Apr 25, 2013 at 11:51:08AM -0500, Scott Wood wrote:
 On 04/25/2013 05:47:39 AM, Alexander Graf wrote:
 
 On 25.04.2013, at 11:43, Gleb Natapov wrote:
 
  +void kvm_device_put(struct kvm_device *dev)
  +{
  + if (atomic_dec_and_test(dev-users))
  + dev-ops-destroy(dev);
  +}
  +
  +static int kvm_device_release(struct inode *inode, struct file
 *filp)
  +{
  + struct kvm_device *dev = filp-private_data;
  + struct kvm *kvm = dev-kvm;
  +
  + kvm_device_put(dev);
  + kvm_put_kvm(kvm);
  We may put kvm only if users goes to zero, otherwise kvm can be
  freed while something holds a reference to a device. Why not make
  kvm_device_put() do it?
 
 Nice catch. I'll change the patch so it does the kvm_put_kvm
 inside kvm_device_put's destroy branch.
 
 No, please don't.  The KVM reference being put here is associated
 with the file descriptor, not with the MPIC object.
Is it so? Device holds a pointer to kvm, so it increments kvm reference
to make sure the pointer is valid. What prevents kvm from been destroyed
while device is still in use in current code?
 

If you make
 that change I think you'll have circular references and thus a
 memory leak, because the vcpus can hold a reference to the MPIC
 object.
 
How circular reference can be created?

--
Gleb.
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 1/6] kvm: add device control API

2013-04-25 Thread Scott Wood

On 04/25/2013 01:22:04 PM, Gleb Natapov wrote:

On Thu, Apr 25, 2013 at 11:51:08AM -0500, Scott Wood wrote:
 On 04/25/2013 05:47:39 AM, Alexander Graf wrote:
 
 On 25.04.2013, at 11:43, Gleb Natapov wrote:
 
  +void kvm_device_put(struct kvm_device *dev)
  +{
  + if (atomic_dec_and_test(dev-users))
  + dev-ops-destroy(dev);
  +}
  +
  +static int kvm_device_release(struct inode *inode, struct file
 *filp)
  +{
  + struct kvm_device *dev = filp-private_data;
  + struct kvm *kvm = dev-kvm;
  +
  + kvm_device_put(dev);
  + kvm_put_kvm(kvm);
  We may put kvm only if users goes to zero, otherwise kvm can be
  freed while something holds a reference to a device. Why not make
  kvm_device_put() do it?
 
 Nice catch. I'll change the patch so it does the kvm_put_kvm
 inside kvm_device_put's destroy branch.

 No, please don't.  The KVM reference being put here is associated
 with the file descriptor, not with the MPIC object.
Is it so? Device holds a pointer to kvm, so it increments kvm  
reference
to make sure the pointer is valid. What prevents kvm from been  
destroyed

while device is still in use in current code?


Where will that kvm pointer be used, after all the file descriptors go  
away and the vcpus stop running?  mmio_mapped guards against unmapping  
the MMIO if it's already been unmapped due to KVM destruction.  We  
don't have any timers or other delayed work.


Well, I do see one place, that Alex added -- the NULLing out of  
dev-kvm-arch.mpic, which didn't exist in my patchset.



 that change I think you'll have circular references and thus a
 memory leak, because the vcpus can hold a reference to the MPIC
 object.

How circular reference can be created?


MPIC holds reference on KVM, vcpu holds reference on MPIC, and vcpu is  
not destroyed until KVM is destroyed.


-Scott
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 1/6] kvm: add device control API

2013-04-25 Thread Gleb Natapov
On Fri, Apr 12, 2013 at 07:08:42PM -0500, Scott Wood wrote:
 Currently, devices that are emulated inside KVM are configured in a
 hardcoded manner based on an assumption that any given architecture
 only has one way to do it.  If there's any need to access device state,
 it is done through inflexible one-purpose-only IOCTLs (e.g.
 KVM_GET/SET_LAPIC).  Defining new IOCTLs for every little thing is
 cumbersome and depletes a limited numberspace.
 
 This API provides a mechanism to instantiate a device of a certain
 type, returning an ID that can be used to set/get attributes of the
 device.  Attributes may include configuration parameters (e.g.
 register base address), device state, operational commands, etc.  It
 is similar to the ONE_REG API, except that it acts on devices rather
 than vcpus.
 
 Both device types and individual attributes can be tested without having
 to create the device or get/set the attribute, without the need for
 separately managing enumerated capabilities.
 
 Signed-off-by: Scott Wood scottw...@freescale.com
 ---
 v4:
  - Move some boilerplate back into generic code, as requested by Gleb.
File descriptor management and reference counting is no longer the
concern of the device implementation.
 
  - Don't hold kvm-lock during create.  The original reasons
for doing so have vanished as for as MPIC is concerned, and
this avoids needing to answer the question of whether to
hold the lock during destroy as well.
 
Paul, you may need to acquire the lock yourself in kvm_create_xics()
to protect the -EEXIST check.
 
 v3: remove some changes that were merged into this patch by accident,
 and fix the error documentation for KVM_CREATE_DEVICE.
 ---
  Documentation/virtual/kvm/api.txt|   70 
  Documentation/virtual/kvm/devices/README |1 +
  include/linux/kvm_host.h |   35 
  include/uapi/linux/kvm.h |   27 +++
  virt/kvm/kvm_main.c  |  129 
 ++
  5 files changed, 262 insertions(+)
  create mode 100644 Documentation/virtual/kvm/devices/README
 
 diff --git a/Documentation/virtual/kvm/api.txt 
 b/Documentation/virtual/kvm/api.txt
 index 976eb65..d52f3f9 100644
 --- a/Documentation/virtual/kvm/api.txt
 +++ b/Documentation/virtual/kvm/api.txt
 @@ -2173,6 +2173,76 @@ header; first `n_valid' valid entries with contents 
 from the data
  written, then `n_invalid' invalid entries, invalidating any previously
  valid entries found.
  
 +4.79 KVM_CREATE_DEVICE
 +
 +Capability: KVM_CAP_DEVICE_CTRL
 +Type: vm ioctl
 +Parameters: struct kvm_create_device (in/out)
 +Returns: 0 on success, -1 on error
 +Errors:
 +  ENODEV: The device type is unknown or unsupported
 +  EEXIST: Device already created, and this type of device may not
 +  be instantiated multiple times
 +
 +  Other error conditions may be defined by individual device types or
 +  have their standard meanings.
 +
 +Creates an emulated device in the kernel.  The file descriptor returned
 +in fd can be used with KVM_SET/GET/HAS_DEVICE_ATTR.
 +
 +If the KVM_CREATE_DEVICE_TEST flag is set, only test whether the
 +device type is supported (not necessarily whether it can be created
 +in the current vm).
 +
 +Individual devices should not define flags.  Attributes should be used
 +for specifying any behavior that is not implied by the device type
 +number.
 +
 +struct kvm_create_device {
 + __u32   type;   /* in: KVM_DEV_TYPE_xxx */
 + __u32   fd; /* out: device handle */
 + __u32   flags;  /* in: KVM_CREATE_DEVICE_xxx */
 +};
Should we add __u32 padding here to make struct size multiple of u64?

 +
 +4.80 KVM_SET_DEVICE_ATTR/KVM_GET_DEVICE_ATTR
 +
 +Capability: KVM_CAP_DEVICE_CTRL
 +Type: device ioctl
 +Parameters: struct kvm_device_attr
 +Returns: 0 on success, -1 on error
 +Errors:
 +  ENXIO:  The group or attribute is unknown/unsupported for this device
 +  EPERM:  The attribute cannot (currently) be accessed this way
 +  (e.g. read-only attribute, or attribute that only makes
 +  sense when the device is in a different state)
 +
 +  Other error conditions may be defined by individual device types.
 +
 +Gets/sets a specified piece of device configuration and/or state.  The
 +semantics are device-specific.  See individual device documentation in
 +the devices directory.  As with ONE_REG, the size of the data
 +transferred is defined by the particular attribute.
 +
 +struct kvm_device_attr {
 + __u32   flags;  /* no flags currently defined */
 + __u32   group;  /* device-defined */
 + __u64   attr;   /* group-defined */
 + __u64   addr;   /* userspace address of attr data */
 +};
 +
 +4.81 KVM_HAS_DEVICE_ATTR
 +
 +Capability: KVM_CAP_DEVICE_CTRL
 +Type: device ioctl
 +Parameters: struct kvm_device_attr
 +Returns: 0 on success, -1 on error
 +Errors:
 +  ENXIO:  The group or attribute is unknown/unsupported for this device
 

Re: [PATCH v4 1/6] kvm: add device control API

2013-04-25 Thread Alexander Graf

On 25.04.2013, at 11:43, Gleb Natapov wrote:

 On Fri, Apr 12, 2013 at 07:08:42PM -0500, Scott Wood wrote:
 Currently, devices that are emulated inside KVM are configured in a
 hardcoded manner based on an assumption that any given architecture
 only has one way to do it.  If there's any need to access device state,
 it is done through inflexible one-purpose-only IOCTLs (e.g.
 KVM_GET/SET_LAPIC).  Defining new IOCTLs for every little thing is
 cumbersome and depletes a limited numberspace.
 
 This API provides a mechanism to instantiate a device of a certain
 type, returning an ID that can be used to set/get attributes of the
 device.  Attributes may include configuration parameters (e.g.
 register base address), device state, operational commands, etc.  It
 is similar to the ONE_REG API, except that it acts on devices rather
 than vcpus.
 
 Both device types and individual attributes can be tested without having
 to create the device or get/set the attribute, without the need for
 separately managing enumerated capabilities.
 
 Signed-off-by: Scott Wood scottw...@freescale.com
 ---
 v4:
 - Move some boilerplate back into generic code, as requested by Gleb.
   File descriptor management and reference counting is no longer the
   concern of the device implementation.
 
 - Don't hold kvm-lock during create.  The original reasons
   for doing so have vanished as for as MPIC is concerned, and
   this avoids needing to answer the question of whether to
   hold the lock during destroy as well.
 
   Paul, you may need to acquire the lock yourself in kvm_create_xics()
   to protect the -EEXIST check.
 
 v3: remove some changes that were merged into this patch by accident,
 and fix the error documentation for KVM_CREATE_DEVICE.
 ---
 Documentation/virtual/kvm/api.txt|   70 
 Documentation/virtual/kvm/devices/README |1 +
 include/linux/kvm_host.h |   35 
 include/uapi/linux/kvm.h |   27 +++
 virt/kvm/kvm_main.c  |  129 
 ++
 5 files changed, 262 insertions(+)
 create mode 100644 Documentation/virtual/kvm/devices/README
 
 diff --git a/Documentation/virtual/kvm/api.txt 
 b/Documentation/virtual/kvm/api.txt
 index 976eb65..d52f3f9 100644
 --- a/Documentation/virtual/kvm/api.txt
 +++ b/Documentation/virtual/kvm/api.txt
 @@ -2173,6 +2173,76 @@ header; first `n_valid' valid entries with contents 
 from the data
 written, then `n_invalid' invalid entries, invalidating any previously
 valid entries found.
 
 +4.79 KVM_CREATE_DEVICE
 +
 +Capability: KVM_CAP_DEVICE_CTRL
 +Type: vm ioctl
 +Parameters: struct kvm_create_device (in/out)
 +Returns: 0 on success, -1 on error
 +Errors:
 +  ENODEV: The device type is unknown or unsupported
 +  EEXIST: Device already created, and this type of device may not
 +  be instantiated multiple times
 +
 +  Other error conditions may be defined by individual device types or
 +  have their standard meanings.
 +
 +Creates an emulated device in the kernel.  The file descriptor returned
 +in fd can be used with KVM_SET/GET/HAS_DEVICE_ATTR.
 +
 +If the KVM_CREATE_DEVICE_TEST flag is set, only test whether the
 +device type is supported (not necessarily whether it can be created
 +in the current vm).
 +
 +Individual devices should not define flags.  Attributes should be used
 +for specifying any behavior that is not implied by the device type
 +number.
 +
 +struct kvm_create_device {
 +__u32   type;   /* in: KVM_DEV_TYPE_xxx */
 +__u32   fd; /* out: device handle */
 +__u32   flags;  /* in: KVM_CREATE_DEVICE_xxx */
 +};
 Should we add __u32 padding here to make struct size multiple of u64?

Do you know of any arch that pads structs to u64 boundaries? x86_64 doesn't and 
ppc64 doesn't either.

 
 +
 +4.80 KVM_SET_DEVICE_ATTR/KVM_GET_DEVICE_ATTR
 +
 +Capability: KVM_CAP_DEVICE_CTRL
 +Type: device ioctl
 +Parameters: struct kvm_device_attr
 +Returns: 0 on success, -1 on error
 +Errors:
 +  ENXIO:  The group or attribute is unknown/unsupported for this device
 +  EPERM:  The attribute cannot (currently) be accessed this way
 +  (e.g. read-only attribute, or attribute that only makes
 +  sense when the device is in a different state)
 +
 +  Other error conditions may be defined by individual device types.
 +
 +Gets/sets a specified piece of device configuration and/or state.  The
 +semantics are device-specific.  See individual device documentation in
 +the devices directory.  As with ONE_REG, the size of the data
 +transferred is defined by the particular attribute.
 +
 +struct kvm_device_attr {
 +__u32   flags;  /* no flags currently defined */
 +__u32   group;  /* device-defined */
 +__u64   attr;   /* group-defined */
 +__u64   addr;   /* userspace address of attr data */
 +};
 +
 +4.81 KVM_HAS_DEVICE_ATTR
 +
 +Capability: KVM_CAP_DEVICE_CTRL
 +Type: device ioctl
 +Parameters: struct 

Re: [PATCH v4 1/6] kvm: add device control API

2013-04-25 Thread Gleb Natapov
On Thu, Apr 25, 2013 at 12:47:39PM +0200, Alexander Graf wrote:
 
 On 25.04.2013, at 11:43, Gleb Natapov wrote:
 
  On Fri, Apr 12, 2013 at 07:08:42PM -0500, Scott Wood wrote:
  Currently, devices that are emulated inside KVM are configured in a
  hardcoded manner based on an assumption that any given architecture
  only has one way to do it.  If there's any need to access device state,
  it is done through inflexible one-purpose-only IOCTLs (e.g.
  KVM_GET/SET_LAPIC).  Defining new IOCTLs for every little thing is
  cumbersome and depletes a limited numberspace.
  
  This API provides a mechanism to instantiate a device of a certain
  type, returning an ID that can be used to set/get attributes of the
  device.  Attributes may include configuration parameters (e.g.
  register base address), device state, operational commands, etc.  It
  is similar to the ONE_REG API, except that it acts on devices rather
  than vcpus.
  
  Both device types and individual attributes can be tested without having
  to create the device or get/set the attribute, without the need for
  separately managing enumerated capabilities.
  
  Signed-off-by: Scott Wood scottw...@freescale.com
  ---
  v4:
  - Move some boilerplate back into generic code, as requested by Gleb.
File descriptor management and reference counting is no longer the
concern of the device implementation.
  
  - Don't hold kvm-lock during create.  The original reasons
for doing so have vanished as for as MPIC is concerned, and
this avoids needing to answer the question of whether to
hold the lock during destroy as well.
  
Paul, you may need to acquire the lock yourself in kvm_create_xics()
to protect the -EEXIST check.
  
  v3: remove some changes that were merged into this patch by accident,
  and fix the error documentation for KVM_CREATE_DEVICE.
  ---
  Documentation/virtual/kvm/api.txt|   70 
  Documentation/virtual/kvm/devices/README |1 +
  include/linux/kvm_host.h |   35 
  include/uapi/linux/kvm.h |   27 +++
  virt/kvm/kvm_main.c  |  129 
  ++
  5 files changed, 262 insertions(+)
  create mode 100644 Documentation/virtual/kvm/devices/README
  
  diff --git a/Documentation/virtual/kvm/api.txt 
  b/Documentation/virtual/kvm/api.txt
  index 976eb65..d52f3f9 100644
  --- a/Documentation/virtual/kvm/api.txt
  +++ b/Documentation/virtual/kvm/api.txt
  @@ -2173,6 +2173,76 @@ header; first `n_valid' valid entries with contents 
  from the data
  written, then `n_invalid' invalid entries, invalidating any previously
  valid entries found.
  
  +4.79 KVM_CREATE_DEVICE
  +
  +Capability: KVM_CAP_DEVICE_CTRL
  +Type: vm ioctl
  +Parameters: struct kvm_create_device (in/out)
  +Returns: 0 on success, -1 on error
  +Errors:
  +  ENODEV: The device type is unknown or unsupported
  +  EEXIST: Device already created, and this type of device may not
  +  be instantiated multiple times
  +
  +  Other error conditions may be defined by individual device types or
  +  have their standard meanings.
  +
  +Creates an emulated device in the kernel.  The file descriptor returned
  +in fd can be used with KVM_SET/GET/HAS_DEVICE_ATTR.
  +
  +If the KVM_CREATE_DEVICE_TEST flag is set, only test whether the
  +device type is supported (not necessarily whether it can be created
  +in the current vm).
  +
  +Individual devices should not define flags.  Attributes should be used
  +for specifying any behavior that is not implied by the device type
  +number.
  +
  +struct kvm_create_device {
  +  __u32   type;   /* in: KVM_DEV_TYPE_xxx */
  +  __u32   fd; /* out: device handle */
  +  __u32   flags;  /* in: KVM_CREATE_DEVICE_xxx */
  +};
  Should we add __u32 padding here to make struct size multiple of u64?
 
 Do you know of any arch that pads structs to u64 boundaries? x86_64 doesn't 
 and ppc64 doesn't either.
 
Not really. I just notices that we pad some structures to that effect.

  
  +
  +4.80 KVM_SET_DEVICE_ATTR/KVM_GET_DEVICE_ATTR
  +
  +Capability: KVM_CAP_DEVICE_CTRL
  +Type: device ioctl
  +Parameters: struct kvm_device_attr
  +Returns: 0 on success, -1 on error
  +Errors:
  +  ENXIO:  The group or attribute is unknown/unsupported for this device
  +  EPERM:  The attribute cannot (currently) be accessed this way
  +  (e.g. read-only attribute, or attribute that only makes
  +  sense when the device is in a different state)
  +
  +  Other error conditions may be defined by individual device types.
  +
  +Gets/sets a specified piece of device configuration and/or state.  The
  +semantics are device-specific.  See individual device documentation in
  +the devices directory.  As with ONE_REG, the size of the data
  +transferred is defined by the particular attribute.
  +
  +struct kvm_device_attr {
  +  __u32   flags;  /* no flags currently defined */
  +  __u32   group;  /* 

Re: [PATCH v4 1/6] kvm: add device control API

2013-04-25 Thread Alexander Graf

On 25.04.2013, at 14:07, Gleb Natapov wrote:

 On Thu, Apr 25, 2013 at 12:47:39PM +0200, Alexander Graf wrote:
 
 On 25.04.2013, at 11:43, Gleb Natapov wrote:
 
 On Fri, Apr 12, 2013 at 07:08:42PM -0500, Scott Wood wrote:
 Currently, devices that are emulated inside KVM are configured in a
 hardcoded manner based on an assumption that any given architecture
 only has one way to do it.  If there's any need to access device state,
 it is done through inflexible one-purpose-only IOCTLs (e.g.
 KVM_GET/SET_LAPIC).  Defining new IOCTLs for every little thing is
 cumbersome and depletes a limited numberspace.
 
 This API provides a mechanism to instantiate a device of a certain
 type, returning an ID that can be used to set/get attributes of the
 device.  Attributes may include configuration parameters (e.g.
 register base address), device state, operational commands, etc.  It
 is similar to the ONE_REG API, except that it acts on devices rather
 than vcpus.
 
 Both device types and individual attributes can be tested without having
 to create the device or get/set the attribute, without the need for
 separately managing enumerated capabilities.
 
 Signed-off-by: Scott Wood scottw...@freescale.com
 ---
 v4:
 - Move some boilerplate back into generic code, as requested by Gleb.
  File descriptor management and reference counting is no longer the
  concern of the device implementation.
 
 - Don't hold kvm-lock during create.  The original reasons
  for doing so have vanished as for as MPIC is concerned, and
  this avoids needing to answer the question of whether to
  hold the lock during destroy as well.
 
  Paul, you may need to acquire the lock yourself in kvm_create_xics()
  to protect the -EEXIST check.
 
 v3: remove some changes that were merged into this patch by accident,
 and fix the error documentation for KVM_CREATE_DEVICE.
 ---
 Documentation/virtual/kvm/api.txt|   70 
 Documentation/virtual/kvm/devices/README |1 +
 include/linux/kvm_host.h |   35 
 include/uapi/linux/kvm.h |   27 +++
 virt/kvm/kvm_main.c  |  129 
 ++
 5 files changed, 262 insertions(+)
 create mode 100644 Documentation/virtual/kvm/devices/README
 
 diff --git a/Documentation/virtual/kvm/api.txt 
 b/Documentation/virtual/kvm/api.txt
 index 976eb65..d52f3f9 100644
 --- a/Documentation/virtual/kvm/api.txt
 +++ b/Documentation/virtual/kvm/api.txt
 @@ -2173,6 +2173,76 @@ header; first `n_valid' valid entries with contents 
 from the data
 written, then `n_invalid' invalid entries, invalidating any previously
 valid entries found.
 
 +4.79 KVM_CREATE_DEVICE
 +
 +Capability: KVM_CAP_DEVICE_CTRL
 +Type: vm ioctl
 +Parameters: struct kvm_create_device (in/out)
 +Returns: 0 on success, -1 on error
 +Errors:
 +  ENODEV: The device type is unknown or unsupported
 +  EEXIST: Device already created, and this type of device may not
 +  be instantiated multiple times
 +
 +  Other error conditions may be defined by individual device types or
 +  have their standard meanings.
 +
 +Creates an emulated device in the kernel.  The file descriptor returned
 +in fd can be used with KVM_SET/GET/HAS_DEVICE_ATTR.
 +
 +If the KVM_CREATE_DEVICE_TEST flag is set, only test whether the
 +device type is supported (not necessarily whether it can be created
 +in the current vm).
 +
 +Individual devices should not define flags.  Attributes should be used
 +for specifying any behavior that is not implied by the device type
 +number.
 +
 +struct kvm_create_device {
 +  __u32   type;   /* in: KVM_DEV_TYPE_xxx */
 +  __u32   fd; /* out: device handle */
 +  __u32   flags;  /* in: KVM_CREATE_DEVICE_xxx */
 +};
 Should we add __u32 padding here to make struct size multiple of u64?
 
 Do you know of any arch that pads structs to u64 boundaries? x86_64 doesn't 
 and ppc64 doesn't either.
 
 Not really. I just notices that we pad some structures to that effect.

I don't think we really need to :).

 
 
 +
 +4.80 KVM_SET_DEVICE_ATTR/KVM_GET_DEVICE_ATTR
 +
 +Capability: KVM_CAP_DEVICE_CTRL
 +Type: device ioctl
 +Parameters: struct kvm_device_attr
 +Returns: 0 on success, -1 on error
 +Errors:
 +  ENXIO:  The group or attribute is unknown/unsupported for this device
 +  EPERM:  The attribute cannot (currently) be accessed this way
 +  (e.g. read-only attribute, or attribute that only makes
 +  sense when the device is in a different state)
 +
 +  Other error conditions may be defined by individual device types.
 +
 +Gets/sets a specified piece of device configuration and/or state.  The
 +semantics are device-specific.  See individual device documentation in
 +the devices directory.  As with ONE_REG, the size of the data
 +transferred is defined by the particular attribute.
 +
 +struct kvm_device_attr {
 +  __u32   flags;  /* no flags currently defined */
 +  __u32   group;  /* device-defined */
 +  __u64   attr; 

Re: [PATCH v4 1/6] kvm: add device control API

2013-04-25 Thread Gleb Natapov
On Thu, Apr 25, 2013 at 03:45:14PM +0200, Alexander Graf wrote:
  Please move struct definitions and KVM_CREATE_DEVICE_TEST define out
  from ioctl definition block.
  
  Let me change that in my tree...
  
  So are you sending this via your tree and I should not apply it directly?
 
 I was hoping to have things ready very soon for you to just pull...
 
Make sense since there are PPC patches that depend on this one. 3.10 merge 
windows
will very likely open next week though...

--
Gleb.
--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 1/6] kvm: add device control API

2013-04-25 Thread Scott Wood

On 04/25/2013 05:47:39 AM, Alexander Graf wrote:


On 25.04.2013, at 11:43, Gleb Natapov wrote:

 +void kvm_device_put(struct kvm_device *dev)
 +{
 +  if (atomic_dec_and_test(dev-users))
 +  dev-ops-destroy(dev);
 +}
 +
 +static int kvm_device_release(struct inode *inode, struct file  
*filp)

 +{
 +  struct kvm_device *dev = filp-private_data;
 +  struct kvm *kvm = dev-kvm;
 +
 +  kvm_device_put(dev);
 +  kvm_put_kvm(kvm);
 We may put kvm only if users goes to zero, otherwise kvm can be
 freed while something holds a reference to a device. Why not make
 kvm_device_put() do it?

Nice catch. I'll change the patch so it does the kvm_put_kvm inside  
kvm_device_put's destroy branch.


No, please don't.  The KVM reference being put here is associated  
with the file descriptor, not with the MPIC object.  If you make that  
change I think you'll have circular references and thus a memory leak,  
because the vcpus can hold a reference to the MPIC object.


-Scott
--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 1/6] kvm: add device control API

2013-04-25 Thread Scott Wood

On 04/25/2013 01:22:04 PM, Gleb Natapov wrote:

On Thu, Apr 25, 2013 at 11:51:08AM -0500, Scott Wood wrote:
 On 04/25/2013 05:47:39 AM, Alexander Graf wrote:
 
 On 25.04.2013, at 11:43, Gleb Natapov wrote:
 
  +void kvm_device_put(struct kvm_device *dev)
  +{
  + if (atomic_dec_and_test(dev-users))
  + dev-ops-destroy(dev);
  +}
  +
  +static int kvm_device_release(struct inode *inode, struct file
 *filp)
  +{
  + struct kvm_device *dev = filp-private_data;
  + struct kvm *kvm = dev-kvm;
  +
  + kvm_device_put(dev);
  + kvm_put_kvm(kvm);
  We may put kvm only if users goes to zero, otherwise kvm can be
  freed while something holds a reference to a device. Why not make
  kvm_device_put() do it?
 
 Nice catch. I'll change the patch so it does the kvm_put_kvm
 inside kvm_device_put's destroy branch.

 No, please don't.  The KVM reference being put here is associated
 with the file descriptor, not with the MPIC object.
Is it so? Device holds a pointer to kvm, so it increments kvm  
reference
to make sure the pointer is valid. What prevents kvm from been  
destroyed

while device is still in use in current code?


Where will that kvm pointer be used, after all the file descriptors go  
away and the vcpus stop running?  mmio_mapped guards against unmapping  
the MMIO if it's already been unmapped due to KVM destruction.  We  
don't have any timers or other delayed work.


Well, I do see one place, that Alex added -- the NULLing out of  
dev-kvm-arch.mpic, which didn't exist in my patchset.



 that change I think you'll have circular references and thus a
 memory leak, because the vcpus can hold a reference to the MPIC
 object.

How circular reference can be created?


MPIC holds reference on KVM, vcpu holds reference on MPIC, and vcpu is  
not destroyed until KVM is destroyed.


-Scott
--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html