Re: [PATCH v4 1/6] kvm: add device control API
On Thu, Apr 25, 2013 at 01:59:20PM -0500, Scott Wood wrote: On 04/25/2013 01:22:04 PM, Gleb Natapov wrote: On Thu, Apr 25, 2013 at 11:51:08AM -0500, Scott Wood wrote: On 04/25/2013 05:47:39 AM, Alexander Graf wrote: On 25.04.2013, at 11:43, Gleb Natapov wrote: +void kvm_device_put(struct kvm_device *dev) +{ + if (atomic_dec_and_test(dev-users)) + dev-ops-destroy(dev); +} + +static int kvm_device_release(struct inode *inode, struct file *filp) +{ + struct kvm_device *dev = filp-private_data; + struct kvm *kvm = dev-kvm; + + kvm_device_put(dev); + kvm_put_kvm(kvm); We may put kvm only if users goes to zero, otherwise kvm can be freed while something holds a reference to a device. Why not make kvm_device_put() do it? Nice catch. I'll change the patch so it does the kvm_put_kvm inside kvm_device_put's destroy branch. No, please don't. The KVM reference being put here is associated with the file descriptor, not with the MPIC object. Is it so? Device holds a pointer to kvm, so it increments kvm reference to make sure the pointer is valid. What prevents kvm from been destroyed while device is still in use in current code? Where will that kvm pointer be used, after all the file descriptors go away and the vcpus stop running? mmio_mapped guards against unmapping the MMIO if it's already been unmapped due to KVM destruction. We don't have any timers or other delayed work. MPIC does not, but timer device will have one. Well, I do see one place, that Alex added -- the NULLing out of dev-kvm-arch.mpic, which didn't exist in my patchset. that change I think you'll have circular references and thus a memory leak, because the vcpus can hold a reference to the MPIC object. How circular reference can be created? MPIC holds reference on KVM, vcpu holds reference on MPIC, and vcpu is not destroyed until KVM is destroyed. Yes, you are right. So we need to think about how to fix it in a different way. What about holding all devices in kvm-devices[] array and destroy them during kvm destruction, like we do for vcpus? -- Gleb. -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v4 1/6] kvm: add device control API
On 26.04.2013, at 11:53, Gleb Natapov wrote: On Thu, Apr 25, 2013 at 01:59:20PM -0500, Scott Wood wrote: On 04/25/2013 01:22:04 PM, Gleb Natapov wrote: On Thu, Apr 25, 2013 at 11:51:08AM -0500, Scott Wood wrote: On 04/25/2013 05:47:39 AM, Alexander Graf wrote: On 25.04.2013, at 11:43, Gleb Natapov wrote: +void kvm_device_put(struct kvm_device *dev) +{ + if (atomic_dec_and_test(dev-users)) + dev-ops-destroy(dev); +} + +static int kvm_device_release(struct inode *inode, struct file *filp) +{ + struct kvm_device *dev = filp-private_data; + struct kvm *kvm = dev-kvm; + + kvm_device_put(dev); + kvm_put_kvm(kvm); We may put kvm only if users goes to zero, otherwise kvm can be freed while something holds a reference to a device. Why not make kvm_device_put() do it? Nice catch. I'll change the patch so it does the kvm_put_kvm inside kvm_device_put's destroy branch. No, please don't. The KVM reference being put here is associated with the file descriptor, not with the MPIC object. Is it so? Device holds a pointer to kvm, so it increments kvm reference to make sure the pointer is valid. What prevents kvm from been destroyed while device is still in use in current code? Where will that kvm pointer be used, after all the file descriptors go away and the vcpus stop running? mmio_mapped guards against unmapping the MMIO if it's already been unmapped due to KVM destruction. We don't have any timers or other delayed work. MPIC does not, but timer device will have one. Well, I do see one place, that Alex added -- the NULLing out of dev-kvm-arch.mpic, which didn't exist in my patchset. that change I think you'll have circular references and thus a memory leak, because the vcpus can hold a reference to the MPIC object. How circular reference can be created? MPIC holds reference on KVM, vcpu holds reference on MPIC, and vcpu is not destroyed until KVM is destroyed. Yes, you are right. So we need to think about how to fix it in a different way. What about holding all devices in kvm-devices[] array and destroy them during kvm destruction, like we do for vcpus? You should really look at your patches in LIFO order :). A patch doing that was already sent by Scott last night and is in v4 of my patch set. Alex -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v4 1/6] kvm: add device control API
On Fri, Apr 26, 2013 at 11:55:27AM +0200, Alexander Graf wrote: On 26.04.2013, at 11:53, Gleb Natapov wrote: On Thu, Apr 25, 2013 at 01:59:20PM -0500, Scott Wood wrote: On 04/25/2013 01:22:04 PM, Gleb Natapov wrote: On Thu, Apr 25, 2013 at 11:51:08AM -0500, Scott Wood wrote: On 04/25/2013 05:47:39 AM, Alexander Graf wrote: On 25.04.2013, at 11:43, Gleb Natapov wrote: +void kvm_device_put(struct kvm_device *dev) +{ + if (atomic_dec_and_test(dev-users)) + dev-ops-destroy(dev); +} + +static int kvm_device_release(struct inode *inode, struct file *filp) +{ + struct kvm_device *dev = filp-private_data; + struct kvm *kvm = dev-kvm; + + kvm_device_put(dev); + kvm_put_kvm(kvm); We may put kvm only if users goes to zero, otherwise kvm can be freed while something holds a reference to a device. Why not make kvm_device_put() do it? Nice catch. I'll change the patch so it does the kvm_put_kvm inside kvm_device_put's destroy branch. No, please don't. The KVM reference being put here is associated with the file descriptor, not with the MPIC object. Is it so? Device holds a pointer to kvm, so it increments kvm reference to make sure the pointer is valid. What prevents kvm from been destroyed while device is still in use in current code? Where will that kvm pointer be used, after all the file descriptors go away and the vcpus stop running? mmio_mapped guards against unmapping the MMIO if it's already been unmapped due to KVM destruction. We don't have any timers or other delayed work. MPIC does not, but timer device will have one. Well, I do see one place, that Alex added -- the NULLing out of dev-kvm-arch.mpic, which didn't exist in my patchset. that change I think you'll have circular references and thus a memory leak, because the vcpus can hold a reference to the MPIC object. How circular reference can be created? MPIC holds reference on KVM, vcpu holds reference on MPIC, and vcpu is not destroyed until KVM is destroyed. Yes, you are right. So we need to think about how to fix it in a different way. What about holding all devices in kvm-devices[] array and destroy them during kvm destruction, like we do for vcpus? You should really look at your patches in LIFO order :). A patch doing that was already sent by Scott last night and is in v4 of my patch set. I tried! This causes starvation for some patches. I need better algorithm :) -- Gleb. -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v4 1/6] kvm: add device control API
On Thu, Apr 25, 2013 at 01:59:20PM -0500, Scott Wood wrote: On 04/25/2013 01:22:04 PM, Gleb Natapov wrote: On Thu, Apr 25, 2013 at 11:51:08AM -0500, Scott Wood wrote: On 04/25/2013 05:47:39 AM, Alexander Graf wrote: On 25.04.2013, at 11:43, Gleb Natapov wrote: +void kvm_device_put(struct kvm_device *dev) +{ + if (atomic_dec_and_test(dev-users)) + dev-ops-destroy(dev); +} + +static int kvm_device_release(struct inode *inode, struct file *filp) +{ + struct kvm_device *dev = filp-private_data; + struct kvm *kvm = dev-kvm; + + kvm_device_put(dev); + kvm_put_kvm(kvm); We may put kvm only if users goes to zero, otherwise kvm can be freed while something holds a reference to a device. Why not make kvm_device_put() do it? Nice catch. I'll change the patch so it does the kvm_put_kvm inside kvm_device_put's destroy branch. No, please don't. The KVM reference being put here is associated with the file descriptor, not with the MPIC object. Is it so? Device holds a pointer to kvm, so it increments kvm reference to make sure the pointer is valid. What prevents kvm from been destroyed while device is still in use in current code? Where will that kvm pointer be used, after all the file descriptors go away and the vcpus stop running? mmio_mapped guards against unmapping the MMIO if it's already been unmapped due to KVM destruction. We don't have any timers or other delayed work. MPIC does not, but timer device will have one. Well, I do see one place, that Alex added -- the NULLing out of dev-kvm-arch.mpic, which didn't exist in my patchset. that change I think you'll have circular references and thus a memory leak, because the vcpus can hold a reference to the MPIC object. How circular reference can be created? MPIC holds reference on KVM, vcpu holds reference on MPIC, and vcpu is not destroyed until KVM is destroyed. Yes, you are right. So we need to think about how to fix it in a different way. What about holding all devices in kvm-devices[] array and destroy them during kvm destruction, like we do for vcpus? -- Gleb. -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v4 1/6] kvm: add device control API
On Fri, Apr 26, 2013 at 11:55:27AM +0200, Alexander Graf wrote: On 26.04.2013, at 11:53, Gleb Natapov wrote: On Thu, Apr 25, 2013 at 01:59:20PM -0500, Scott Wood wrote: On 04/25/2013 01:22:04 PM, Gleb Natapov wrote: On Thu, Apr 25, 2013 at 11:51:08AM -0500, Scott Wood wrote: On 04/25/2013 05:47:39 AM, Alexander Graf wrote: On 25.04.2013, at 11:43, Gleb Natapov wrote: +void kvm_device_put(struct kvm_device *dev) +{ + if (atomic_dec_and_test(dev-users)) + dev-ops-destroy(dev); +} + +static int kvm_device_release(struct inode *inode, struct file *filp) +{ + struct kvm_device *dev = filp-private_data; + struct kvm *kvm = dev-kvm; + + kvm_device_put(dev); + kvm_put_kvm(kvm); We may put kvm only if users goes to zero, otherwise kvm can be freed while something holds a reference to a device. Why not make kvm_device_put() do it? Nice catch. I'll change the patch so it does the kvm_put_kvm inside kvm_device_put's destroy branch. No, please don't. The KVM reference being put here is associated with the file descriptor, not with the MPIC object. Is it so? Device holds a pointer to kvm, so it increments kvm reference to make sure the pointer is valid. What prevents kvm from been destroyed while device is still in use in current code? Where will that kvm pointer be used, after all the file descriptors go away and the vcpus stop running? mmio_mapped guards against unmapping the MMIO if it's already been unmapped due to KVM destruction. We don't have any timers or other delayed work. MPIC does not, but timer device will have one. Well, I do see one place, that Alex added -- the NULLing out of dev-kvm-arch.mpic, which didn't exist in my patchset. that change I think you'll have circular references and thus a memory leak, because the vcpus can hold a reference to the MPIC object. How circular reference can be created? MPIC holds reference on KVM, vcpu holds reference on MPIC, and vcpu is not destroyed until KVM is destroyed. Yes, you are right. So we need to think about how to fix it in a different way. What about holding all devices in kvm-devices[] array and destroy them during kvm destruction, like we do for vcpus? You should really look at your patches in LIFO order :). A patch doing that was already sent by Scott last night and is in v4 of my patch set. I tried! This causes starvation for some patches. I need better algorithm :) -- Gleb. -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v4 1/6] kvm: add device control API
On Fri, Apr 12, 2013 at 07:08:42PM -0500, Scott Wood wrote: Currently, devices that are emulated inside KVM are configured in a hardcoded manner based on an assumption that any given architecture only has one way to do it. If there's any need to access device state, it is done through inflexible one-purpose-only IOCTLs (e.g. KVM_GET/SET_LAPIC). Defining new IOCTLs for every little thing is cumbersome and depletes a limited numberspace. This API provides a mechanism to instantiate a device of a certain type, returning an ID that can be used to set/get attributes of the device. Attributes may include configuration parameters (e.g. register base address), device state, operational commands, etc. It is similar to the ONE_REG API, except that it acts on devices rather than vcpus. Both device types and individual attributes can be tested without having to create the device or get/set the attribute, without the need for separately managing enumerated capabilities. Signed-off-by: Scott Wood scottw...@freescale.com --- v4: - Move some boilerplate back into generic code, as requested by Gleb. File descriptor management and reference counting is no longer the concern of the device implementation. - Don't hold kvm-lock during create. The original reasons for doing so have vanished as for as MPIC is concerned, and this avoids needing to answer the question of whether to hold the lock during destroy as well. Paul, you may need to acquire the lock yourself in kvm_create_xics() to protect the -EEXIST check. v3: remove some changes that were merged into this patch by accident, and fix the error documentation for KVM_CREATE_DEVICE. --- Documentation/virtual/kvm/api.txt| 70 Documentation/virtual/kvm/devices/README |1 + include/linux/kvm_host.h | 35 include/uapi/linux/kvm.h | 27 +++ virt/kvm/kvm_main.c | 129 ++ 5 files changed, 262 insertions(+) create mode 100644 Documentation/virtual/kvm/devices/README diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt index 976eb65..d52f3f9 100644 --- a/Documentation/virtual/kvm/api.txt +++ b/Documentation/virtual/kvm/api.txt @@ -2173,6 +2173,76 @@ header; first `n_valid' valid entries with contents from the data written, then `n_invalid' invalid entries, invalidating any previously valid entries found. +4.79 KVM_CREATE_DEVICE + +Capability: KVM_CAP_DEVICE_CTRL +Type: vm ioctl +Parameters: struct kvm_create_device (in/out) +Returns: 0 on success, -1 on error +Errors: + ENODEV: The device type is unknown or unsupported + EEXIST: Device already created, and this type of device may not + be instantiated multiple times + + Other error conditions may be defined by individual device types or + have their standard meanings. + +Creates an emulated device in the kernel. The file descriptor returned +in fd can be used with KVM_SET/GET/HAS_DEVICE_ATTR. + +If the KVM_CREATE_DEVICE_TEST flag is set, only test whether the +device type is supported (not necessarily whether it can be created +in the current vm). + +Individual devices should not define flags. Attributes should be used +for specifying any behavior that is not implied by the device type +number. + +struct kvm_create_device { + __u32 type; /* in: KVM_DEV_TYPE_xxx */ + __u32 fd; /* out: device handle */ + __u32 flags; /* in: KVM_CREATE_DEVICE_xxx */ +}; Should we add __u32 padding here to make struct size multiple of u64? + +4.80 KVM_SET_DEVICE_ATTR/KVM_GET_DEVICE_ATTR + +Capability: KVM_CAP_DEVICE_CTRL +Type: device ioctl +Parameters: struct kvm_device_attr +Returns: 0 on success, -1 on error +Errors: + ENXIO: The group or attribute is unknown/unsupported for this device + EPERM: The attribute cannot (currently) be accessed this way + (e.g. read-only attribute, or attribute that only makes + sense when the device is in a different state) + + Other error conditions may be defined by individual device types. + +Gets/sets a specified piece of device configuration and/or state. The +semantics are device-specific. See individual device documentation in +the devices directory. As with ONE_REG, the size of the data +transferred is defined by the particular attribute. + +struct kvm_device_attr { + __u32 flags; /* no flags currently defined */ + __u32 group; /* device-defined */ + __u64 attr; /* group-defined */ + __u64 addr; /* userspace address of attr data */ +}; + +4.81 KVM_HAS_DEVICE_ATTR + +Capability: KVM_CAP_DEVICE_CTRL +Type: device ioctl +Parameters: struct kvm_device_attr +Returns: 0 on success, -1 on error +Errors: + ENXIO: The group or attribute is unknown/unsupported for this device
Re: [PATCH v4 1/6] kvm: add device control API
On 25.04.2013, at 11:43, Gleb Natapov wrote: On Fri, Apr 12, 2013 at 07:08:42PM -0500, Scott Wood wrote: Currently, devices that are emulated inside KVM are configured in a hardcoded manner based on an assumption that any given architecture only has one way to do it. If there's any need to access device state, it is done through inflexible one-purpose-only IOCTLs (e.g. KVM_GET/SET_LAPIC). Defining new IOCTLs for every little thing is cumbersome and depletes a limited numberspace. This API provides a mechanism to instantiate a device of a certain type, returning an ID that can be used to set/get attributes of the device. Attributes may include configuration parameters (e.g. register base address), device state, operational commands, etc. It is similar to the ONE_REG API, except that it acts on devices rather than vcpus. Both device types and individual attributes can be tested without having to create the device or get/set the attribute, without the need for separately managing enumerated capabilities. Signed-off-by: Scott Wood scottw...@freescale.com --- v4: - Move some boilerplate back into generic code, as requested by Gleb. File descriptor management and reference counting is no longer the concern of the device implementation. - Don't hold kvm-lock during create. The original reasons for doing so have vanished as for as MPIC is concerned, and this avoids needing to answer the question of whether to hold the lock during destroy as well. Paul, you may need to acquire the lock yourself in kvm_create_xics() to protect the -EEXIST check. v3: remove some changes that were merged into this patch by accident, and fix the error documentation for KVM_CREATE_DEVICE. --- Documentation/virtual/kvm/api.txt| 70 Documentation/virtual/kvm/devices/README |1 + include/linux/kvm_host.h | 35 include/uapi/linux/kvm.h | 27 +++ virt/kvm/kvm_main.c | 129 ++ 5 files changed, 262 insertions(+) create mode 100644 Documentation/virtual/kvm/devices/README diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt index 976eb65..d52f3f9 100644 --- a/Documentation/virtual/kvm/api.txt +++ b/Documentation/virtual/kvm/api.txt @@ -2173,6 +2173,76 @@ header; first `n_valid' valid entries with contents from the data written, then `n_invalid' invalid entries, invalidating any previously valid entries found. +4.79 KVM_CREATE_DEVICE + +Capability: KVM_CAP_DEVICE_CTRL +Type: vm ioctl +Parameters: struct kvm_create_device (in/out) +Returns: 0 on success, -1 on error +Errors: + ENODEV: The device type is unknown or unsupported + EEXIST: Device already created, and this type of device may not + be instantiated multiple times + + Other error conditions may be defined by individual device types or + have their standard meanings. + +Creates an emulated device in the kernel. The file descriptor returned +in fd can be used with KVM_SET/GET/HAS_DEVICE_ATTR. + +If the KVM_CREATE_DEVICE_TEST flag is set, only test whether the +device type is supported (not necessarily whether it can be created +in the current vm). + +Individual devices should not define flags. Attributes should be used +for specifying any behavior that is not implied by the device type +number. + +struct kvm_create_device { +__u32 type; /* in: KVM_DEV_TYPE_xxx */ +__u32 fd; /* out: device handle */ +__u32 flags; /* in: KVM_CREATE_DEVICE_xxx */ +}; Should we add __u32 padding here to make struct size multiple of u64? Do you know of any arch that pads structs to u64 boundaries? x86_64 doesn't and ppc64 doesn't either. + +4.80 KVM_SET_DEVICE_ATTR/KVM_GET_DEVICE_ATTR + +Capability: KVM_CAP_DEVICE_CTRL +Type: device ioctl +Parameters: struct kvm_device_attr +Returns: 0 on success, -1 on error +Errors: + ENXIO: The group or attribute is unknown/unsupported for this device + EPERM: The attribute cannot (currently) be accessed this way + (e.g. read-only attribute, or attribute that only makes + sense when the device is in a different state) + + Other error conditions may be defined by individual device types. + +Gets/sets a specified piece of device configuration and/or state. The +semantics are device-specific. See individual device documentation in +the devices directory. As with ONE_REG, the size of the data +transferred is defined by the particular attribute. + +struct kvm_device_attr { +__u32 flags; /* no flags currently defined */ +__u32 group; /* device-defined */ +__u64 attr; /* group-defined */ +__u64 addr; /* userspace address of attr data */ +}; + +4.81 KVM_HAS_DEVICE_ATTR + +Capability: KVM_CAP_DEVICE_CTRL +Type: device ioctl +Parameters: struct
Re: [PATCH v4 1/6] kvm: add device control API
On Thu, Apr 25, 2013 at 12:47:39PM +0200, Alexander Graf wrote: On 25.04.2013, at 11:43, Gleb Natapov wrote: On Fri, Apr 12, 2013 at 07:08:42PM -0500, Scott Wood wrote: Currently, devices that are emulated inside KVM are configured in a hardcoded manner based on an assumption that any given architecture only has one way to do it. If there's any need to access device state, it is done through inflexible one-purpose-only IOCTLs (e.g. KVM_GET/SET_LAPIC). Defining new IOCTLs for every little thing is cumbersome and depletes a limited numberspace. This API provides a mechanism to instantiate a device of a certain type, returning an ID that can be used to set/get attributes of the device. Attributes may include configuration parameters (e.g. register base address), device state, operational commands, etc. It is similar to the ONE_REG API, except that it acts on devices rather than vcpus. Both device types and individual attributes can be tested without having to create the device or get/set the attribute, without the need for separately managing enumerated capabilities. Signed-off-by: Scott Wood scottw...@freescale.com --- v4: - Move some boilerplate back into generic code, as requested by Gleb. File descriptor management and reference counting is no longer the concern of the device implementation. - Don't hold kvm-lock during create. The original reasons for doing so have vanished as for as MPIC is concerned, and this avoids needing to answer the question of whether to hold the lock during destroy as well. Paul, you may need to acquire the lock yourself in kvm_create_xics() to protect the -EEXIST check. v3: remove some changes that were merged into this patch by accident, and fix the error documentation for KVM_CREATE_DEVICE. --- Documentation/virtual/kvm/api.txt| 70 Documentation/virtual/kvm/devices/README |1 + include/linux/kvm_host.h | 35 include/uapi/linux/kvm.h | 27 +++ virt/kvm/kvm_main.c | 129 ++ 5 files changed, 262 insertions(+) create mode 100644 Documentation/virtual/kvm/devices/README diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt index 976eb65..d52f3f9 100644 --- a/Documentation/virtual/kvm/api.txt +++ b/Documentation/virtual/kvm/api.txt @@ -2173,6 +2173,76 @@ header; first `n_valid' valid entries with contents from the data written, then `n_invalid' invalid entries, invalidating any previously valid entries found. +4.79 KVM_CREATE_DEVICE + +Capability: KVM_CAP_DEVICE_CTRL +Type: vm ioctl +Parameters: struct kvm_create_device (in/out) +Returns: 0 on success, -1 on error +Errors: + ENODEV: The device type is unknown or unsupported + EEXIST: Device already created, and this type of device may not + be instantiated multiple times + + Other error conditions may be defined by individual device types or + have their standard meanings. + +Creates an emulated device in the kernel. The file descriptor returned +in fd can be used with KVM_SET/GET/HAS_DEVICE_ATTR. + +If the KVM_CREATE_DEVICE_TEST flag is set, only test whether the +device type is supported (not necessarily whether it can be created +in the current vm). + +Individual devices should not define flags. Attributes should be used +for specifying any behavior that is not implied by the device type +number. + +struct kvm_create_device { + __u32 type; /* in: KVM_DEV_TYPE_xxx */ + __u32 fd; /* out: device handle */ + __u32 flags; /* in: KVM_CREATE_DEVICE_xxx */ +}; Should we add __u32 padding here to make struct size multiple of u64? Do you know of any arch that pads structs to u64 boundaries? x86_64 doesn't and ppc64 doesn't either. Not really. I just notices that we pad some structures to that effect. + +4.80 KVM_SET_DEVICE_ATTR/KVM_GET_DEVICE_ATTR + +Capability: KVM_CAP_DEVICE_CTRL +Type: device ioctl +Parameters: struct kvm_device_attr +Returns: 0 on success, -1 on error +Errors: + ENXIO: The group or attribute is unknown/unsupported for this device + EPERM: The attribute cannot (currently) be accessed this way + (e.g. read-only attribute, or attribute that only makes + sense when the device is in a different state) + + Other error conditions may be defined by individual device types. + +Gets/sets a specified piece of device configuration and/or state. The +semantics are device-specific. See individual device documentation in +the devices directory. As with ONE_REG, the size of the data +transferred is defined by the particular attribute. + +struct kvm_device_attr { + __u32 flags; /* no flags currently defined */ + __u32 group; /*
Re: [PATCH v4 1/6] kvm: add device control API
On 25.04.2013, at 14:07, Gleb Natapov wrote: On Thu, Apr 25, 2013 at 12:47:39PM +0200, Alexander Graf wrote: On 25.04.2013, at 11:43, Gleb Natapov wrote: On Fri, Apr 12, 2013 at 07:08:42PM -0500, Scott Wood wrote: Currently, devices that are emulated inside KVM are configured in a hardcoded manner based on an assumption that any given architecture only has one way to do it. If there's any need to access device state, it is done through inflexible one-purpose-only IOCTLs (e.g. KVM_GET/SET_LAPIC). Defining new IOCTLs for every little thing is cumbersome and depletes a limited numberspace. This API provides a mechanism to instantiate a device of a certain type, returning an ID that can be used to set/get attributes of the device. Attributes may include configuration parameters (e.g. register base address), device state, operational commands, etc. It is similar to the ONE_REG API, except that it acts on devices rather than vcpus. Both device types and individual attributes can be tested without having to create the device or get/set the attribute, without the need for separately managing enumerated capabilities. Signed-off-by: Scott Wood scottw...@freescale.com --- v4: - Move some boilerplate back into generic code, as requested by Gleb. File descriptor management and reference counting is no longer the concern of the device implementation. - Don't hold kvm-lock during create. The original reasons for doing so have vanished as for as MPIC is concerned, and this avoids needing to answer the question of whether to hold the lock during destroy as well. Paul, you may need to acquire the lock yourself in kvm_create_xics() to protect the -EEXIST check. v3: remove some changes that were merged into this patch by accident, and fix the error documentation for KVM_CREATE_DEVICE. --- Documentation/virtual/kvm/api.txt| 70 Documentation/virtual/kvm/devices/README |1 + include/linux/kvm_host.h | 35 include/uapi/linux/kvm.h | 27 +++ virt/kvm/kvm_main.c | 129 ++ 5 files changed, 262 insertions(+) create mode 100644 Documentation/virtual/kvm/devices/README diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt index 976eb65..d52f3f9 100644 --- a/Documentation/virtual/kvm/api.txt +++ b/Documentation/virtual/kvm/api.txt @@ -2173,6 +2173,76 @@ header; first `n_valid' valid entries with contents from the data written, then `n_invalid' invalid entries, invalidating any previously valid entries found. +4.79 KVM_CREATE_DEVICE + +Capability: KVM_CAP_DEVICE_CTRL +Type: vm ioctl +Parameters: struct kvm_create_device (in/out) +Returns: 0 on success, -1 on error +Errors: + ENODEV: The device type is unknown or unsupported + EEXIST: Device already created, and this type of device may not + be instantiated multiple times + + Other error conditions may be defined by individual device types or + have their standard meanings. + +Creates an emulated device in the kernel. The file descriptor returned +in fd can be used with KVM_SET/GET/HAS_DEVICE_ATTR. + +If the KVM_CREATE_DEVICE_TEST flag is set, only test whether the +device type is supported (not necessarily whether it can be created +in the current vm). + +Individual devices should not define flags. Attributes should be used +for specifying any behavior that is not implied by the device type +number. + +struct kvm_create_device { + __u32 type; /* in: KVM_DEV_TYPE_xxx */ + __u32 fd; /* out: device handle */ + __u32 flags; /* in: KVM_CREATE_DEVICE_xxx */ +}; Should we add __u32 padding here to make struct size multiple of u64? Do you know of any arch that pads structs to u64 boundaries? x86_64 doesn't and ppc64 doesn't either. Not really. I just notices that we pad some structures to that effect. I don't think we really need to :). + +4.80 KVM_SET_DEVICE_ATTR/KVM_GET_DEVICE_ATTR + +Capability: KVM_CAP_DEVICE_CTRL +Type: device ioctl +Parameters: struct kvm_device_attr +Returns: 0 on success, -1 on error +Errors: + ENXIO: The group or attribute is unknown/unsupported for this device + EPERM: The attribute cannot (currently) be accessed this way + (e.g. read-only attribute, or attribute that only makes + sense when the device is in a different state) + + Other error conditions may be defined by individual device types. + +Gets/sets a specified piece of device configuration and/or state. The +semantics are device-specific. See individual device documentation in +the devices directory. As with ONE_REG, the size of the data +transferred is defined by the particular attribute. + +struct kvm_device_attr { + __u32 flags; /* no flags currently defined */ + __u32 group; /* device-defined */ + __u64 attr;
Re: [PATCH v4 1/6] kvm: add device control API
On Thu, Apr 25, 2013 at 03:45:14PM +0200, Alexander Graf wrote: Please move struct definitions and KVM_CREATE_DEVICE_TEST define out from ioctl definition block. Let me change that in my tree... So are you sending this via your tree and I should not apply it directly? I was hoping to have things ready very soon for you to just pull... Make sense since there are PPC patches that depend on this one. 3.10 merge windows will very likely open next week though... -- Gleb. -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v4 1/6] kvm: add device control API
On 04/25/2013 05:47:39 AM, Alexander Graf wrote: On 25.04.2013, at 11:43, Gleb Natapov wrote: +void kvm_device_put(struct kvm_device *dev) +{ + if (atomic_dec_and_test(dev-users)) + dev-ops-destroy(dev); +} + +static int kvm_device_release(struct inode *inode, struct file *filp) +{ + struct kvm_device *dev = filp-private_data; + struct kvm *kvm = dev-kvm; + + kvm_device_put(dev); + kvm_put_kvm(kvm); We may put kvm only if users goes to zero, otherwise kvm can be freed while something holds a reference to a device. Why not make kvm_device_put() do it? Nice catch. I'll change the patch so it does the kvm_put_kvm inside kvm_device_put's destroy branch. No, please don't. The KVM reference being put here is associated with the file descriptor, not with the MPIC object. If you make that change I think you'll have circular references and thus a memory leak, because the vcpus can hold a reference to the MPIC object. -Scott -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v4 1/6] kvm: add device control API
On Thu, Apr 25, 2013 at 11:51:08AM -0500, Scott Wood wrote: On 04/25/2013 05:47:39 AM, Alexander Graf wrote: On 25.04.2013, at 11:43, Gleb Natapov wrote: +void kvm_device_put(struct kvm_device *dev) +{ + if (atomic_dec_and_test(dev-users)) + dev-ops-destroy(dev); +} + +static int kvm_device_release(struct inode *inode, struct file *filp) +{ + struct kvm_device *dev = filp-private_data; + struct kvm *kvm = dev-kvm; + + kvm_device_put(dev); + kvm_put_kvm(kvm); We may put kvm only if users goes to zero, otherwise kvm can be freed while something holds a reference to a device. Why not make kvm_device_put() do it? Nice catch. I'll change the patch so it does the kvm_put_kvm inside kvm_device_put's destroy branch. No, please don't. The KVM reference being put here is associated with the file descriptor, not with the MPIC object. Is it so? Device holds a pointer to kvm, so it increments kvm reference to make sure the pointer is valid. What prevents kvm from been destroyed while device is still in use in current code? If you make that change I think you'll have circular references and thus a memory leak, because the vcpus can hold a reference to the MPIC object. How circular reference can be created? -- Gleb. -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v4 1/6] kvm: add device control API
On 04/25/2013 01:22:04 PM, Gleb Natapov wrote: On Thu, Apr 25, 2013 at 11:51:08AM -0500, Scott Wood wrote: On 04/25/2013 05:47:39 AM, Alexander Graf wrote: On 25.04.2013, at 11:43, Gleb Natapov wrote: +void kvm_device_put(struct kvm_device *dev) +{ + if (atomic_dec_and_test(dev-users)) + dev-ops-destroy(dev); +} + +static int kvm_device_release(struct inode *inode, struct file *filp) +{ + struct kvm_device *dev = filp-private_data; + struct kvm *kvm = dev-kvm; + + kvm_device_put(dev); + kvm_put_kvm(kvm); We may put kvm only if users goes to zero, otherwise kvm can be freed while something holds a reference to a device. Why not make kvm_device_put() do it? Nice catch. I'll change the patch so it does the kvm_put_kvm inside kvm_device_put's destroy branch. No, please don't. The KVM reference being put here is associated with the file descriptor, not with the MPIC object. Is it so? Device holds a pointer to kvm, so it increments kvm reference to make sure the pointer is valid. What prevents kvm from been destroyed while device is still in use in current code? Where will that kvm pointer be used, after all the file descriptors go away and the vcpus stop running? mmio_mapped guards against unmapping the MMIO if it's already been unmapped due to KVM destruction. We don't have any timers or other delayed work. Well, I do see one place, that Alex added -- the NULLing out of dev-kvm-arch.mpic, which didn't exist in my patchset. that change I think you'll have circular references and thus a memory leak, because the vcpus can hold a reference to the MPIC object. How circular reference can be created? MPIC holds reference on KVM, vcpu holds reference on MPIC, and vcpu is not destroyed until KVM is destroyed. -Scott -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v4 1/6] kvm: add device control API
On Fri, Apr 12, 2013 at 07:08:42PM -0500, Scott Wood wrote: Currently, devices that are emulated inside KVM are configured in a hardcoded manner based on an assumption that any given architecture only has one way to do it. If there's any need to access device state, it is done through inflexible one-purpose-only IOCTLs (e.g. KVM_GET/SET_LAPIC). Defining new IOCTLs for every little thing is cumbersome and depletes a limited numberspace. This API provides a mechanism to instantiate a device of a certain type, returning an ID that can be used to set/get attributes of the device. Attributes may include configuration parameters (e.g. register base address), device state, operational commands, etc. It is similar to the ONE_REG API, except that it acts on devices rather than vcpus. Both device types and individual attributes can be tested without having to create the device or get/set the attribute, without the need for separately managing enumerated capabilities. Signed-off-by: Scott Wood scottw...@freescale.com --- v4: - Move some boilerplate back into generic code, as requested by Gleb. File descriptor management and reference counting is no longer the concern of the device implementation. - Don't hold kvm-lock during create. The original reasons for doing so have vanished as for as MPIC is concerned, and this avoids needing to answer the question of whether to hold the lock during destroy as well. Paul, you may need to acquire the lock yourself in kvm_create_xics() to protect the -EEXIST check. v3: remove some changes that were merged into this patch by accident, and fix the error documentation for KVM_CREATE_DEVICE. --- Documentation/virtual/kvm/api.txt| 70 Documentation/virtual/kvm/devices/README |1 + include/linux/kvm_host.h | 35 include/uapi/linux/kvm.h | 27 +++ virt/kvm/kvm_main.c | 129 ++ 5 files changed, 262 insertions(+) create mode 100644 Documentation/virtual/kvm/devices/README diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt index 976eb65..d52f3f9 100644 --- a/Documentation/virtual/kvm/api.txt +++ b/Documentation/virtual/kvm/api.txt @@ -2173,6 +2173,76 @@ header; first `n_valid' valid entries with contents from the data written, then `n_invalid' invalid entries, invalidating any previously valid entries found. +4.79 KVM_CREATE_DEVICE + +Capability: KVM_CAP_DEVICE_CTRL +Type: vm ioctl +Parameters: struct kvm_create_device (in/out) +Returns: 0 on success, -1 on error +Errors: + ENODEV: The device type is unknown or unsupported + EEXIST: Device already created, and this type of device may not + be instantiated multiple times + + Other error conditions may be defined by individual device types or + have their standard meanings. + +Creates an emulated device in the kernel. The file descriptor returned +in fd can be used with KVM_SET/GET/HAS_DEVICE_ATTR. + +If the KVM_CREATE_DEVICE_TEST flag is set, only test whether the +device type is supported (not necessarily whether it can be created +in the current vm). + +Individual devices should not define flags. Attributes should be used +for specifying any behavior that is not implied by the device type +number. + +struct kvm_create_device { + __u32 type; /* in: KVM_DEV_TYPE_xxx */ + __u32 fd; /* out: device handle */ + __u32 flags; /* in: KVM_CREATE_DEVICE_xxx */ +}; Should we add __u32 padding here to make struct size multiple of u64? + +4.80 KVM_SET_DEVICE_ATTR/KVM_GET_DEVICE_ATTR + +Capability: KVM_CAP_DEVICE_CTRL +Type: device ioctl +Parameters: struct kvm_device_attr +Returns: 0 on success, -1 on error +Errors: + ENXIO: The group or attribute is unknown/unsupported for this device + EPERM: The attribute cannot (currently) be accessed this way + (e.g. read-only attribute, or attribute that only makes + sense when the device is in a different state) + + Other error conditions may be defined by individual device types. + +Gets/sets a specified piece of device configuration and/or state. The +semantics are device-specific. See individual device documentation in +the devices directory. As with ONE_REG, the size of the data +transferred is defined by the particular attribute. + +struct kvm_device_attr { + __u32 flags; /* no flags currently defined */ + __u32 group; /* device-defined */ + __u64 attr; /* group-defined */ + __u64 addr; /* userspace address of attr data */ +}; + +4.81 KVM_HAS_DEVICE_ATTR + +Capability: KVM_CAP_DEVICE_CTRL +Type: device ioctl +Parameters: struct kvm_device_attr +Returns: 0 on success, -1 on error +Errors: + ENXIO: The group or attribute is unknown/unsupported for this device
Re: [PATCH v4 1/6] kvm: add device control API
On 25.04.2013, at 11:43, Gleb Natapov wrote: On Fri, Apr 12, 2013 at 07:08:42PM -0500, Scott Wood wrote: Currently, devices that are emulated inside KVM are configured in a hardcoded manner based on an assumption that any given architecture only has one way to do it. If there's any need to access device state, it is done through inflexible one-purpose-only IOCTLs (e.g. KVM_GET/SET_LAPIC). Defining new IOCTLs for every little thing is cumbersome and depletes a limited numberspace. This API provides a mechanism to instantiate a device of a certain type, returning an ID that can be used to set/get attributes of the device. Attributes may include configuration parameters (e.g. register base address), device state, operational commands, etc. It is similar to the ONE_REG API, except that it acts on devices rather than vcpus. Both device types and individual attributes can be tested without having to create the device or get/set the attribute, without the need for separately managing enumerated capabilities. Signed-off-by: Scott Wood scottw...@freescale.com --- v4: - Move some boilerplate back into generic code, as requested by Gleb. File descriptor management and reference counting is no longer the concern of the device implementation. - Don't hold kvm-lock during create. The original reasons for doing so have vanished as for as MPIC is concerned, and this avoids needing to answer the question of whether to hold the lock during destroy as well. Paul, you may need to acquire the lock yourself in kvm_create_xics() to protect the -EEXIST check. v3: remove some changes that were merged into this patch by accident, and fix the error documentation for KVM_CREATE_DEVICE. --- Documentation/virtual/kvm/api.txt| 70 Documentation/virtual/kvm/devices/README |1 + include/linux/kvm_host.h | 35 include/uapi/linux/kvm.h | 27 +++ virt/kvm/kvm_main.c | 129 ++ 5 files changed, 262 insertions(+) create mode 100644 Documentation/virtual/kvm/devices/README diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt index 976eb65..d52f3f9 100644 --- a/Documentation/virtual/kvm/api.txt +++ b/Documentation/virtual/kvm/api.txt @@ -2173,6 +2173,76 @@ header; first `n_valid' valid entries with contents from the data written, then `n_invalid' invalid entries, invalidating any previously valid entries found. +4.79 KVM_CREATE_DEVICE + +Capability: KVM_CAP_DEVICE_CTRL +Type: vm ioctl +Parameters: struct kvm_create_device (in/out) +Returns: 0 on success, -1 on error +Errors: + ENODEV: The device type is unknown or unsupported + EEXIST: Device already created, and this type of device may not + be instantiated multiple times + + Other error conditions may be defined by individual device types or + have their standard meanings. + +Creates an emulated device in the kernel. The file descriptor returned +in fd can be used with KVM_SET/GET/HAS_DEVICE_ATTR. + +If the KVM_CREATE_DEVICE_TEST flag is set, only test whether the +device type is supported (not necessarily whether it can be created +in the current vm). + +Individual devices should not define flags. Attributes should be used +for specifying any behavior that is not implied by the device type +number. + +struct kvm_create_device { +__u32 type; /* in: KVM_DEV_TYPE_xxx */ +__u32 fd; /* out: device handle */ +__u32 flags; /* in: KVM_CREATE_DEVICE_xxx */ +}; Should we add __u32 padding here to make struct size multiple of u64? Do you know of any arch that pads structs to u64 boundaries? x86_64 doesn't and ppc64 doesn't either. + +4.80 KVM_SET_DEVICE_ATTR/KVM_GET_DEVICE_ATTR + +Capability: KVM_CAP_DEVICE_CTRL +Type: device ioctl +Parameters: struct kvm_device_attr +Returns: 0 on success, -1 on error +Errors: + ENXIO: The group or attribute is unknown/unsupported for this device + EPERM: The attribute cannot (currently) be accessed this way + (e.g. read-only attribute, or attribute that only makes + sense when the device is in a different state) + + Other error conditions may be defined by individual device types. + +Gets/sets a specified piece of device configuration and/or state. The +semantics are device-specific. See individual device documentation in +the devices directory. As with ONE_REG, the size of the data +transferred is defined by the particular attribute. + +struct kvm_device_attr { +__u32 flags; /* no flags currently defined */ +__u32 group; /* device-defined */ +__u64 attr; /* group-defined */ +__u64 addr; /* userspace address of attr data */ +}; + +4.81 KVM_HAS_DEVICE_ATTR + +Capability: KVM_CAP_DEVICE_CTRL +Type: device ioctl +Parameters: struct
Re: [PATCH v4 1/6] kvm: add device control API
On Thu, Apr 25, 2013 at 12:47:39PM +0200, Alexander Graf wrote: On 25.04.2013, at 11:43, Gleb Natapov wrote: On Fri, Apr 12, 2013 at 07:08:42PM -0500, Scott Wood wrote: Currently, devices that are emulated inside KVM are configured in a hardcoded manner based on an assumption that any given architecture only has one way to do it. If there's any need to access device state, it is done through inflexible one-purpose-only IOCTLs (e.g. KVM_GET/SET_LAPIC). Defining new IOCTLs for every little thing is cumbersome and depletes a limited numberspace. This API provides a mechanism to instantiate a device of a certain type, returning an ID that can be used to set/get attributes of the device. Attributes may include configuration parameters (e.g. register base address), device state, operational commands, etc. It is similar to the ONE_REG API, except that it acts on devices rather than vcpus. Both device types and individual attributes can be tested without having to create the device or get/set the attribute, without the need for separately managing enumerated capabilities. Signed-off-by: Scott Wood scottw...@freescale.com --- v4: - Move some boilerplate back into generic code, as requested by Gleb. File descriptor management and reference counting is no longer the concern of the device implementation. - Don't hold kvm-lock during create. The original reasons for doing so have vanished as for as MPIC is concerned, and this avoids needing to answer the question of whether to hold the lock during destroy as well. Paul, you may need to acquire the lock yourself in kvm_create_xics() to protect the -EEXIST check. v3: remove some changes that were merged into this patch by accident, and fix the error documentation for KVM_CREATE_DEVICE. --- Documentation/virtual/kvm/api.txt| 70 Documentation/virtual/kvm/devices/README |1 + include/linux/kvm_host.h | 35 include/uapi/linux/kvm.h | 27 +++ virt/kvm/kvm_main.c | 129 ++ 5 files changed, 262 insertions(+) create mode 100644 Documentation/virtual/kvm/devices/README diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt index 976eb65..d52f3f9 100644 --- a/Documentation/virtual/kvm/api.txt +++ b/Documentation/virtual/kvm/api.txt @@ -2173,6 +2173,76 @@ header; first `n_valid' valid entries with contents from the data written, then `n_invalid' invalid entries, invalidating any previously valid entries found. +4.79 KVM_CREATE_DEVICE + +Capability: KVM_CAP_DEVICE_CTRL +Type: vm ioctl +Parameters: struct kvm_create_device (in/out) +Returns: 0 on success, -1 on error +Errors: + ENODEV: The device type is unknown or unsupported + EEXIST: Device already created, and this type of device may not + be instantiated multiple times + + Other error conditions may be defined by individual device types or + have their standard meanings. + +Creates an emulated device in the kernel. The file descriptor returned +in fd can be used with KVM_SET/GET/HAS_DEVICE_ATTR. + +If the KVM_CREATE_DEVICE_TEST flag is set, only test whether the +device type is supported (not necessarily whether it can be created +in the current vm). + +Individual devices should not define flags. Attributes should be used +for specifying any behavior that is not implied by the device type +number. + +struct kvm_create_device { + __u32 type; /* in: KVM_DEV_TYPE_xxx */ + __u32 fd; /* out: device handle */ + __u32 flags; /* in: KVM_CREATE_DEVICE_xxx */ +}; Should we add __u32 padding here to make struct size multiple of u64? Do you know of any arch that pads structs to u64 boundaries? x86_64 doesn't and ppc64 doesn't either. Not really. I just notices that we pad some structures to that effect. + +4.80 KVM_SET_DEVICE_ATTR/KVM_GET_DEVICE_ATTR + +Capability: KVM_CAP_DEVICE_CTRL +Type: device ioctl +Parameters: struct kvm_device_attr +Returns: 0 on success, -1 on error +Errors: + ENXIO: The group or attribute is unknown/unsupported for this device + EPERM: The attribute cannot (currently) be accessed this way + (e.g. read-only attribute, or attribute that only makes + sense when the device is in a different state) + + Other error conditions may be defined by individual device types. + +Gets/sets a specified piece of device configuration and/or state. The +semantics are device-specific. See individual device documentation in +the devices directory. As with ONE_REG, the size of the data +transferred is defined by the particular attribute. + +struct kvm_device_attr { + __u32 flags; /* no flags currently defined */ + __u32 group; /*
Re: [PATCH v4 1/6] kvm: add device control API
On 25.04.2013, at 14:07, Gleb Natapov wrote: On Thu, Apr 25, 2013 at 12:47:39PM +0200, Alexander Graf wrote: On 25.04.2013, at 11:43, Gleb Natapov wrote: On Fri, Apr 12, 2013 at 07:08:42PM -0500, Scott Wood wrote: Currently, devices that are emulated inside KVM are configured in a hardcoded manner based on an assumption that any given architecture only has one way to do it. If there's any need to access device state, it is done through inflexible one-purpose-only IOCTLs (e.g. KVM_GET/SET_LAPIC). Defining new IOCTLs for every little thing is cumbersome and depletes a limited numberspace. This API provides a mechanism to instantiate a device of a certain type, returning an ID that can be used to set/get attributes of the device. Attributes may include configuration parameters (e.g. register base address), device state, operational commands, etc. It is similar to the ONE_REG API, except that it acts on devices rather than vcpus. Both device types and individual attributes can be tested without having to create the device or get/set the attribute, without the need for separately managing enumerated capabilities. Signed-off-by: Scott Wood scottw...@freescale.com --- v4: - Move some boilerplate back into generic code, as requested by Gleb. File descriptor management and reference counting is no longer the concern of the device implementation. - Don't hold kvm-lock during create. The original reasons for doing so have vanished as for as MPIC is concerned, and this avoids needing to answer the question of whether to hold the lock during destroy as well. Paul, you may need to acquire the lock yourself in kvm_create_xics() to protect the -EEXIST check. v3: remove some changes that were merged into this patch by accident, and fix the error documentation for KVM_CREATE_DEVICE. --- Documentation/virtual/kvm/api.txt| 70 Documentation/virtual/kvm/devices/README |1 + include/linux/kvm_host.h | 35 include/uapi/linux/kvm.h | 27 +++ virt/kvm/kvm_main.c | 129 ++ 5 files changed, 262 insertions(+) create mode 100644 Documentation/virtual/kvm/devices/README diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt index 976eb65..d52f3f9 100644 --- a/Documentation/virtual/kvm/api.txt +++ b/Documentation/virtual/kvm/api.txt @@ -2173,6 +2173,76 @@ header; first `n_valid' valid entries with contents from the data written, then `n_invalid' invalid entries, invalidating any previously valid entries found. +4.79 KVM_CREATE_DEVICE + +Capability: KVM_CAP_DEVICE_CTRL +Type: vm ioctl +Parameters: struct kvm_create_device (in/out) +Returns: 0 on success, -1 on error +Errors: + ENODEV: The device type is unknown or unsupported + EEXIST: Device already created, and this type of device may not + be instantiated multiple times + + Other error conditions may be defined by individual device types or + have their standard meanings. + +Creates an emulated device in the kernel. The file descriptor returned +in fd can be used with KVM_SET/GET/HAS_DEVICE_ATTR. + +If the KVM_CREATE_DEVICE_TEST flag is set, only test whether the +device type is supported (not necessarily whether it can be created +in the current vm). + +Individual devices should not define flags. Attributes should be used +for specifying any behavior that is not implied by the device type +number. + +struct kvm_create_device { + __u32 type; /* in: KVM_DEV_TYPE_xxx */ + __u32 fd; /* out: device handle */ + __u32 flags; /* in: KVM_CREATE_DEVICE_xxx */ +}; Should we add __u32 padding here to make struct size multiple of u64? Do you know of any arch that pads structs to u64 boundaries? x86_64 doesn't and ppc64 doesn't either. Not really. I just notices that we pad some structures to that effect. I don't think we really need to :). + +4.80 KVM_SET_DEVICE_ATTR/KVM_GET_DEVICE_ATTR + +Capability: KVM_CAP_DEVICE_CTRL +Type: device ioctl +Parameters: struct kvm_device_attr +Returns: 0 on success, -1 on error +Errors: + ENXIO: The group or attribute is unknown/unsupported for this device + EPERM: The attribute cannot (currently) be accessed this way + (e.g. read-only attribute, or attribute that only makes + sense when the device is in a different state) + + Other error conditions may be defined by individual device types. + +Gets/sets a specified piece of device configuration and/or state. The +semantics are device-specific. See individual device documentation in +the devices directory. As with ONE_REG, the size of the data +transferred is defined by the particular attribute. + +struct kvm_device_attr { + __u32 flags; /* no flags currently defined */ + __u32 group; /* device-defined */ + __u64 attr;
Re: [PATCH v4 1/6] kvm: add device control API
On Thu, Apr 25, 2013 at 03:45:14PM +0200, Alexander Graf wrote: Please move struct definitions and KVM_CREATE_DEVICE_TEST define out from ioctl definition block. Let me change that in my tree... So are you sending this via your tree and I should not apply it directly? I was hoping to have things ready very soon for you to just pull... Make sense since there are PPC patches that depend on this one. 3.10 merge windows will very likely open next week though... -- Gleb. -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v4 1/6] kvm: add device control API
On 04/25/2013 05:47:39 AM, Alexander Graf wrote: On 25.04.2013, at 11:43, Gleb Natapov wrote: +void kvm_device_put(struct kvm_device *dev) +{ + if (atomic_dec_and_test(dev-users)) + dev-ops-destroy(dev); +} + +static int kvm_device_release(struct inode *inode, struct file *filp) +{ + struct kvm_device *dev = filp-private_data; + struct kvm *kvm = dev-kvm; + + kvm_device_put(dev); + kvm_put_kvm(kvm); We may put kvm only if users goes to zero, otherwise kvm can be freed while something holds a reference to a device. Why not make kvm_device_put() do it? Nice catch. I'll change the patch so it does the kvm_put_kvm inside kvm_device_put's destroy branch. No, please don't. The KVM reference being put here is associated with the file descriptor, not with the MPIC object. If you make that change I think you'll have circular references and thus a memory leak, because the vcpus can hold a reference to the MPIC object. -Scott -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v4 1/6] kvm: add device control API
On 04/25/2013 01:22:04 PM, Gleb Natapov wrote: On Thu, Apr 25, 2013 at 11:51:08AM -0500, Scott Wood wrote: On 04/25/2013 05:47:39 AM, Alexander Graf wrote: On 25.04.2013, at 11:43, Gleb Natapov wrote: +void kvm_device_put(struct kvm_device *dev) +{ + if (atomic_dec_and_test(dev-users)) + dev-ops-destroy(dev); +} + +static int kvm_device_release(struct inode *inode, struct file *filp) +{ + struct kvm_device *dev = filp-private_data; + struct kvm *kvm = dev-kvm; + + kvm_device_put(dev); + kvm_put_kvm(kvm); We may put kvm only if users goes to zero, otherwise kvm can be freed while something holds a reference to a device. Why not make kvm_device_put() do it? Nice catch. I'll change the patch so it does the kvm_put_kvm inside kvm_device_put's destroy branch. No, please don't. The KVM reference being put here is associated with the file descriptor, not with the MPIC object. Is it so? Device holds a pointer to kvm, so it increments kvm reference to make sure the pointer is valid. What prevents kvm from been destroyed while device is still in use in current code? Where will that kvm pointer be used, after all the file descriptors go away and the vcpus stop running? mmio_mapped guards against unmapping the MMIO if it's already been unmapped due to KVM destruction. We don't have any timers or other delayed work. Well, I do see one place, that Alex added -- the NULLing out of dev-kvm-arch.mpic, which didn't exist in my patchset. that change I think you'll have circular references and thus a memory leak, because the vcpus can hold a reference to the MPIC object. How circular reference can be created? MPIC holds reference on KVM, vcpu holds reference on MPIC, and vcpu is not destroyed until KVM is destroyed. -Scott -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html