RE: [PATCH 086/117] Staging: hv: storvsc: Leverage the spinlock to manage ref_cnt
-----Original Message-----
From: Greg KH [mailto:g...@kroah.com]
Sent: Tuesday, August 23, 2011 10:58 PM
To: KY Srinivasan
Cc: gre...@suse.de; linux-ker...@vger.kernel.org; de...@linuxdriverproject.org; virtualizat...@lists.osdl.org; Haiyang Zhang
Subject: Re: [PATCH 086/117] Staging: hv: storvsc: Leverage the spinlock to manage ref_cnt

> On Wed, Aug 24, 2011 at 12:58:36AM +0000, KY Srinivasan wrote:
> > -----Original Message-----
> > From: Greg KH [mailto:g...@kroah.com]
> > Sent: Tuesday, August 23, 2011 7:10 PM
> > To: KY Srinivasan
> > Cc: gre...@suse.de; linux-ker...@vger.kernel.org;
> > de...@linuxdriverproject.org; virtualizat...@lists.osdl.org; Haiyang Zhang
> > Subject: Re: [PATCH 086/117] Staging: hv: storvsc: Leverage the spinlock
> > to manage ref_cnt
> >
> > On Fri, Jul 15, 2011 at 10:47:14AM -0700, K. Y. Srinivasan wrote:
> > > Now that we have a spin lock protecting access to the stor device
> > > pointer, use it to manage the reference count as well.
> > >
> > > Signed-off-by: K. Y. Srinivasan k...@microsoft.com
> > > Signed-off-by: Haiyang Zhang haiya...@microsoft.com
> > > ---
> > >  drivers/staging/hv/hyperv_storage.h |    8 ++++----
> > >  drivers/staging/hv/storvsc.c        |   10 +++++-----
> > >  2 files changed, 9 insertions(+), 9 deletions(-)
> > >
> > > diff --git a/drivers/staging/hv/hyperv_storage.h b/drivers/staging/hv/hyperv_storage.h
> > > index 53b65be..d946211 100644
> > > --- a/drivers/staging/hv/hyperv_storage.h
> > > +++ b/drivers/staging/hv/hyperv_storage.h
> > > @@ -265,7 +265,7 @@ struct storvsc_device {
> > >  	struct hv_device *device;
> > >
> > >  	/* 0 indicates the device is being destroyed */
> > > -	atomic_t ref_count;
> > > +	int ref_count;
> >
> > Is this really needed?  Can't you rely on the reference count of the
> > hv_device itself?
>
> > We don't have a reference count on the hv_device
>
> Wait, why not?  You sure better have a reference count on that device if
> you have a pointer to it; if not, you have a bug, and that needs to be
> fixed.  Please reread Documentation/CodingStyle for details.

Greg, I am confused here.  The model we have is identical to other
device/bus abstractions in the kernel.  For instance, neither struct
pci_dev nor struct virtio_device have an explicit reference count.
However, they both embed struct device, which has the kobject structure
embedded to manage the object life cycle.  This is exactly the model we
have for the vmbus devices - struct hv_device embeds struct device, which
embeds the struct kobject for managing the lifecycle.

Like other bus-specific devices in the kernel (pci devices, virtio
devices, ...), class-specific vmbus devices - struct storvsc_device and
struct netvsc_device - embed a pointer to the underlying struct hv_device.
Furthermore, a pointer to the class-specific device structure is stashed
in the struct hv_device (the ext pointer).  This is identical to what is
done in the virtio blk device - look at the priv element in struct
virtio_device.

As I noted in a different email, I did not introduce this reference
counting; I just fixed some gaping holes in the logic.  The reason I fixed
the logic and kept the reference counting is that I can see cases where we
could end up de-referencing a NULL pointer from the interrupt side that is
trying to deliver a vmbus packet to a device that is being destroyed.

Regards,

K. Y

> > and this count is taken to deal with racing unloads and incoming
> > traffic on the channel from the host.
>
> Is this something that all other storage drivers have to do?  If not,
> then you shouldn't be doing that as well.
>
> greg k-h

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/virtualization
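The lifecycle model described above - a class-specific structure pointing at a bus device whose lifetime is managed by an embedded, reference-counted base object - can be sketched in plain userspace C. This is a hedged analogue, not the kernel code: the names `fake_kobject`, `fake_device`, and `fake_hv_device` are invented stand-ins, a plain int replaces the kernel's kref, and the `container_of` pattern is reproduced inline.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Analogue of the kernel's container_of(): recover the outer
 * structure from a pointer to one of its embedded members. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Innermost lifecycle object, standing in for struct kobject. */
struct fake_kobject {
	int refcount;
	void (*release)(struct fake_kobject *kobj);
};

/* Middle layer, standing in for struct device. */
struct fake_device {
	struct fake_kobject kobj;
};

/* Bus-specific device, standing in for struct hv_device: it embeds
 * the generic device rather than carrying a refcount of its own. */
struct fake_hv_device {
	struct fake_device device;
	void *ext;	/* class-specific state, like hv_device->ext */
};

static void fake_kobject_get(struct fake_kobject *kobj)
{
	kobj->refcount++;
}

static void fake_kobject_put(struct fake_kobject *kobj)
{
	if (--kobj->refcount == 0)
		kobj->release(kobj);
}

/* Release walks back out through the embedded members to free the
 * outermost allocation. */
static void fake_hv_release(struct fake_kobject *kobj)
{
	struct fake_device *dev = container_of(kobj, struct fake_device, kobj);
	struct fake_hv_device *hv = container_of(dev, struct fake_hv_device, device);

	free(hv);
}

static struct fake_hv_device *fake_hv_device_alloc(void)
{
	struct fake_hv_device *hv = calloc(1, sizeof(*hv));

	if (!hv)
		return NULL;
	hv->device.kobj.refcount = 1;
	hv->device.kobj.release = fake_hv_release;
	return hv;
}
```

Taking an extra reference with the get helper keeps the object alive until the matching put, at which point the release callback unwinds the embedding and frees the outer structure.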
Re: [PATCH 086/117] Staging: hv: storvsc: Leverage the spinlock to manage ref_cnt
On Wed, Aug 24, 2011 at 04:25:18PM +0000, KY Srinivasan wrote:
> > > > On Fri, Jul 15, 2011 at 10:47:14AM -0700, K. Y. Srinivasan wrote:
> > > > > Now that we have a spin lock protecting access to the stor device
> > > > > pointer, use it to manage the reference count as well.
> > > > >
> > > > > Signed-off-by: K. Y. Srinivasan k...@microsoft.com
> > > > > Signed-off-by: Haiyang Zhang haiya...@microsoft.com
> > > > > ---
> > > > >  drivers/staging/hv/hyperv_storage.h |    8 ++++----
> > > > >  drivers/staging/hv/storvsc.c        |   10 +++++-----
> > > > >  2 files changed, 9 insertions(+), 9 deletions(-)
> > > > >
> > > > > diff --git a/drivers/staging/hv/hyperv_storage.h b/drivers/staging/hv/hyperv_storage.h
> > > > > index 53b65be..d946211 100644
> > > > > --- a/drivers/staging/hv/hyperv_storage.h
> > > > > +++ b/drivers/staging/hv/hyperv_storage.h
> > > > > @@ -265,7 +265,7 @@ struct storvsc_device {
> > > > >  	struct hv_device *device;
> > > > >
> > > > >  	/* 0 indicates the device is being destroyed */
> > > > > -	atomic_t ref_count;
> > > > > +	int ref_count;
> > > >
> > > > Is this really needed?  Can't you rely on the reference count of
> > > > the hv_device itself?
> > >
> > > We don't have a reference count on the hv_device
> >
> > Wait, why not?  You sure better have a reference count on that device
> > if you have a pointer to it; if not, you have a bug, and that needs to
> > be fixed.  Please reread Documentation/CodingStyle for details.
>
> Greg, I am confused here.  The model we have is identical to other
> device/bus abstractions in the kernel.  For instance, neither struct
> pci_dev nor struct virtio_device have an explicit reference count.
> However, they both embed struct device, which has the kobject structure
> embedded to manage the object life cycle.  This is exactly the model we
> have for the vmbus devices - struct hv_device embeds struct device,
> which embeds the struct kobject for managing the lifecycle.

Yes, that is correct.

> Like other bus-specific devices in the kernel (pci devices, virtio
> devices, ...), class-specific vmbus devices - struct storvsc_device and
> struct netvsc_device - embed a pointer to the underlying struct
> hv_device.

And when you save that pointer, you ARE incrementing the reference count
properly, right?  If not, you just caused a bug.

> Furthermore, a pointer to the class-specific device structure is stashed
> in the struct hv_device (the ext pointer).  This is identical to what is
> done in the virtio blk device - look at the priv element in struct
> virtio_device.

Yes, but the base structure of virtio_device does not contain a lock that
the users of that priv pointer are using to control access to data
_within_ the priv pointer, right?  That's up to the users within the priv
pointer.

Now I see how you were trying to move the lock deeper as it seemed that
everyone needed it, but you just moved the lock away from the thing that
it was trying to protect, which might cause problems, and at the least, is
confusing as heck because you are mixing two different layers here, in
ways that should not be mixed.

If you really need a lock to protect some private data within the priv
pointer, then put it somewhere else, like in the priv pointer, as almost
all other subsystems do.

greg k-h
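Greg's recommendation - keep the lock next to the data it protects, inside the private structure, rather than in the shared base device - can be sketched as follows. This is an illustrative userspace analogue only: `fake_virtio_device` and `fake_blk_priv` are invented names echoing the virtio_blk layering, and a pthread mutex stands in for the kernel spinlock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Base bus device: carries only the opaque priv pointer and no lock
 * for the private data (mirroring struct virtio_device). */
struct fake_virtio_device {
	void *priv;
};

/* Class-specific state: the lock lives HERE, beside the fields it
 * protects, which is the layering Greg is asking for. */
struct fake_blk_priv {
	pthread_mutex_t lock;
	int inflight;	/* example field guarded by the lock */
};

static struct fake_blk_priv *fake_blk_probe(struct fake_virtio_device *vdev)
{
	struct fake_blk_priv *priv = calloc(1, sizeof(*priv));

	if (!priv)
		return NULL;
	pthread_mutex_init(&priv->lock, NULL);
	vdev->priv = priv;	/* like vdev->priv = vblk in virtblk_probe() */
	return priv;
}

static void fake_blk_submit(struct fake_virtio_device *vdev)
{
	struct fake_blk_priv *priv = vdev->priv;

	/* Private data is touched only under the private lock. */
	pthread_mutex_lock(&priv->lock);
	priv->inflight++;
	pthread_mutex_unlock(&priv->lock);
}
```

The design point is that the base device stays a pure container: any user of `priv` already knows its concrete type, so the lock guarding that type's fields belongs inside it.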
RE: [PATCH 086/117] Staging: hv: storvsc: Leverage the spinlock to manage ref_cnt
-----Original Message-----
From: Greg KH [mailto:g...@kroah.com]
Sent: Wednesday, August 24, 2011 4:17 PM
To: KY Srinivasan
Cc: gre...@suse.de; linux-ker...@vger.kernel.org; de...@linuxdriverproject.org; virtualizat...@lists.osdl.org; Haiyang Zhang
Subject: Re: [PATCH 086/117] Staging: hv: storvsc: Leverage the spinlock to manage ref_cnt

> On Wed, Aug 24, 2011 at 04:25:18PM +0000, KY Srinivasan wrote:
> > > > > On Fri, Jul 15, 2011 at 10:47:14AM -0700, K. Y. Srinivasan wrote:
> > > > > > Now that we have a spin lock protecting access to the stor
> > > > > > device pointer, use it to manage the reference count as well.
> > > > > >
> > > > > > Signed-off-by: K. Y. Srinivasan k...@microsoft.com
> > > > > > Signed-off-by: Haiyang Zhang haiya...@microsoft.com
> > > > > > ---
> > > > > >  drivers/staging/hv/hyperv_storage.h |    8 ++++----
> > > > > >  drivers/staging/hv/storvsc.c        |   10 +++++-----
> > > > > >  2 files changed, 9 insertions(+), 9 deletions(-)
> > > > > >
> > > > > > diff --git a/drivers/staging/hv/hyperv_storage.h b/drivers/staging/hv/hyperv_storage.h
> > > > > > index 53b65be..d946211 100644
> > > > > > --- a/drivers/staging/hv/hyperv_storage.h
> > > > > > +++ b/drivers/staging/hv/hyperv_storage.h
> > > > > > @@ -265,7 +265,7 @@ struct storvsc_device {
> > > > > >  	struct hv_device *device;
> > > > > >
> > > > > >  	/* 0 indicates the device is being destroyed */
> > > > > > -	atomic_t ref_count;
> > > > > > +	int ref_count;
> > > > >
> > > > > Is this really needed?  Can't you rely on the reference count of
> > > > > the hv_device itself?
> > > >
> > > > We don't have a reference count on the hv_device
> > >
> > > Wait, why not?  You sure better have a reference count on that
> > > device if you have a pointer to it; if not, you have a bug, and that
> > > needs to be fixed.  Please reread Documentation/CodingStyle for
> > > details.
> >
> > Greg, I am confused here.  The model we have is identical to other
> > device/bus abstractions in the kernel.  For instance, neither struct
> > pci_dev nor struct virtio_device have an explicit reference count.
> > However, they both embed struct device, which has the kobject
> > structure embedded to manage the object life cycle.  This is exactly
> > the model we have for the vmbus devices - struct hv_device embeds
> > struct device, which embeds the struct kobject for managing the
> > lifecycle.
>
> Yes, that is correct.

> > Like other bus-specific devices in the kernel (pci devices, virtio
> > devices, ...), class-specific vmbus devices - struct storvsc_device
> > and struct netvsc_device - embed a pointer to the underlying struct
> > hv_device.
>
> And when you save that pointer, you ARE incrementing the reference count
> properly, right?  If not, you just caused a bug.

Why do you say that?  This assignment is done in the probe function where
the driver core is invoking the driver-specific probe function.  In the
driver-specific probe function, I allocate class-specific device state and
embed the bus-specific device pointer.  So, I would think the driver core
is taking the appropriate reference count.  What I am doing is exactly
what other PCI and virtio drivers are doing.  For instance, in the
virtblk_probe() function in virtio_blk.c, the virtio_device pointer is
stashed away in the virtio_blk structure in exactly the same way I am
doing.  I suspect the assumption here is that if probe succeeded, then the
remove() function would be called to let the driver clean up the state.

> > Furthermore, a pointer to the class-specific device structure is
> > stashed in the struct hv_device (the ext pointer).  This is identical
> > to what is done in the virtio blk device - look at the priv element in
> > struct virtio_device.
>
> Yes, but the base structure of virtio_device does not contain a lock
> that the users of that priv pointer are using to control access to data
> _within_ the priv pointer, right?

True.  As I noted in an earlier email, the current hyper-v driver code has
logic to deal with the race conditions I have described.  It is just that
the current implementation is completely bogus - look at, for instance,
the function must_get_stor_device() in storvsc.c.  This function is
invoked in the channel callback code path to process messages coming from
the host.  I added this lock to close the window in getting the ext
pointer.  Clearly, the lock protecting the ext pointer must be in a
structure whose persistence is guaranteed, and that is the reason I put
the lock in the struct hv_device.

> That's up to the users within the priv pointer.
>
> Now I see how you were trying to move the lock deeper as it seemed that
> everyone needed it, but you just moved the lock away from the thing that
> it was trying to protect, which might cause problems, and at the least,
> is confusing as heck because you are mixing two different layers here,
> in ways that should not be mixed.
>
> If you really need a lock to protect some private data within the priv
> pointer, then put it somewhere else, like in the priv pointer, as almost
> all other subsystems do.

What is being protected is the ext pointer that points to the
class-specific device state
Re: [PATCH 086/117] Staging: hv: storvsc: Leverage the spinlock to manage ref_cnt
On Wed, Aug 24, 2011 at 10:57:18PM +0000, KY Srinivasan wrote:
> > > Like other bus-specific devices in the kernel (pci devices, virtio
> > > devices, ...), class-specific vmbus devices - struct storvsc_device
> > > and struct netvsc_device - embed a pointer to the underlying struct
> > > hv_device.
> >
> > And when you save that pointer, you ARE incrementing the reference
> > count properly, right?  If not, you just caused a bug.
>
> Why do you say that?  This assignment is done in the probe function
> where the driver core is invoking the driver-specific probe function.

But if you save a pointer, you HAVE to increment the reference count.

> In the driver-specific probe function, I allocate class-specific device
> state and embed the bus-specific device pointer.  So, I would think the
> driver core is taking the appropriate reference count.  What I am doing
> is exactly what other PCI and virtio drivers are doing.  For instance,
> in the virtblk_probe() function in virtio_blk.c, the virtio_device
> pointer is stashed away in the virtio_blk structure in exactly the same
> way I am doing.  I suspect the assumption here is that if probe
> succeeded, then the remove() function would be called to let the driver
> clean up the state.

Yes, but that's a bug; the pointer reference count should be incremented.

Look at drivers/usb/usb-skeleton.c for a well-documented way to handle a
driver that works with a bus that has devices that could go away at any
point in time.  It handles the reference count, and the locking,
correctly, and has been audited by lots of people.

> > Furthermore, a pointer to the class-specific device structure is
> > stashed in the struct hv_device (the ext pointer).  This is identical
> > to what is done in the virtio blk device - look at the priv element in
> > struct virtio_device.
>
> Yes, but the base structure of virtio_device does not contain a lock
> that the users of that priv pointer are using to control access to data
> _within_ the priv pointer, right?

> True.
>
> As I noted in an earlier email, the current hyper-v driver code has
> logic to deal with the race conditions I have described.  It is just
> that the current implementation is completely bogus - look at, for
> instance, the function must_get_stor_device() in storvsc.c.  This
> function is invoked in the channel callback code path to process
> messages coming from the host.  I added this lock to close the window in
> getting the ext pointer.  Clearly, the lock protecting the ext pointer
> must be in a structure whose persistence is guaranteed, and that is the
> reason I put the lock in the struct hv_device.

But no, that's not the way to do it; put it in the structure that has the
data you are trying to protect.

greg k-h
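The rule Greg is applying - whenever you save a pointer to a refcounted device, take a reference for as long as you hold that pointer, the pattern drivers/usb/usb-skeleton.c demonstrates with usb_get_dev()/usb_put_dev() - looks roughly like the following userspace sketch. All names here are invented stand-ins, and a plain int replaces the real kref.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in for a bus device with a kref-style count. */
struct fake_usb_device {
	int refcount;
};

static struct fake_usb_device *fake_usb_get_dev(struct fake_usb_device *udev)
{
	udev->refcount++;
	return udev;
}

static void fake_usb_put_dev(struct fake_usb_device *udev)
{
	udev->refcount--;	/* real code would free at zero */
}

/* Driver-private state that stashes a device pointer. */
struct fake_skel {
	struct fake_usb_device *udev;
};

/* probe(): saving the pointer and taking the reference go together,
 * so the device cannot vanish while the driver still points at it. */
static struct fake_skel *fake_probe(struct fake_usb_device *udev)
{
	struct fake_skel *skel = calloc(1, sizeof(*skel));

	if (!skel)
		return NULL;
	skel->udev = fake_usb_get_dev(udev);
	return skel;
}

/* disconnect(): the reference is dropped exactly when the saved
 * pointer is discarded. */
static void fake_disconnect(struct fake_skel *skel)
{
	fake_usb_put_dev(skel->udev);
	free(skel);
}
```

With this discipline the device's memory outlives every saved pointer, which is what makes the separate per-driver refcount unnecessary.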
Re: [PATCH 086/117] Staging: hv: storvsc: Leverage the spinlock to manage ref_cnt
On Fri, Jul 15, 2011 at 10:47:14AM -0700, K. Y. Srinivasan wrote:
> Now that we have a spin lock protecting access to the stor device
> pointer, use it to manage the reference count as well.
>
> Signed-off-by: K. Y. Srinivasan k...@microsoft.com
> Signed-off-by: Haiyang Zhang haiya...@microsoft.com
> ---
>  drivers/staging/hv/hyperv_storage.h |    8 ++++----
>  drivers/staging/hv/storvsc.c        |   10 +++++-----
>  2 files changed, 9 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/staging/hv/hyperv_storage.h b/drivers/staging/hv/hyperv_storage.h
> index 53b65be..d946211 100644
> --- a/drivers/staging/hv/hyperv_storage.h
> +++ b/drivers/staging/hv/hyperv_storage.h
> @@ -265,7 +265,7 @@ struct storvsc_device {
>  	struct hv_device *device;
>
>  	/* 0 indicates the device is being destroyed */
> -	atomic_t ref_count;
> +	int ref_count;

Is this really needed?  Can't you rely on the reference count of the
hv_device itself?

greg k-h
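For context, the change under review replaces an atomic_t reference count with a plain int that is manipulated only while holding the spinlock already guarding the device pointer. A hedged userspace sketch of that pattern follows; a pthread mutex stands in for the spinlock, and `fake_stor_device`, `ext`, and the get/put helpers are invented names modeled on the get_stor_device()-style lookup the thread discusses.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

struct fake_stor_device {
	int ref_count;	/* plain int: only ever touched under the lock */
};

static pthread_mutex_t ext_lock = PTHREAD_MUTEX_INITIALIZER;
static struct fake_stor_device *ext;	/* like hv_device->ext */

/* Take a reference only if the device is still present; return NULL
 * once teardown has cleared the pointer.  Reading the pointer and
 * bumping the count happen atomically with respect to teardown
 * because both sides take the same lock. */
static struct fake_stor_device *get_stor_device(void)
{
	struct fake_stor_device *dev;

	pthread_mutex_lock(&ext_lock);
	dev = ext;
	if (dev)
		dev->ref_count++;
	pthread_mutex_unlock(&ext_lock);
	return dev;
}

static void put_stor_device(struct fake_stor_device *dev)
{
	pthread_mutex_lock(&ext_lock);
	dev->ref_count--;
	pthread_mutex_unlock(&ext_lock);
}
```

The trade-off being debated: once every access is serialized by the lock anyway, atomic_t buys nothing, but the count itself only exists because the driver is not pinning the hv_device with a bus-level reference.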
RE: [PATCH 086/117] Staging: hv: storvsc: Leverage the spinlock to manage ref_cnt
-----Original Message-----
From: Greg KH [mailto:g...@kroah.com]
Sent: Tuesday, August 23, 2011 7:10 PM
To: KY Srinivasan
Cc: gre...@suse.de; linux-ker...@vger.kernel.org; de...@linuxdriverproject.org; virtualizat...@lists.osdl.org; Haiyang Zhang
Subject: Re: [PATCH 086/117] Staging: hv: storvsc: Leverage the spinlock to manage ref_cnt

> On Fri, Jul 15, 2011 at 10:47:14AM -0700, K. Y. Srinivasan wrote:
> > Now that we have a spin lock protecting access to the stor device
> > pointer, use it to manage the reference count as well.
> >
> > Signed-off-by: K. Y. Srinivasan k...@microsoft.com
> > Signed-off-by: Haiyang Zhang haiya...@microsoft.com
> > ---
> >  drivers/staging/hv/hyperv_storage.h |    8 ++++----
> >  drivers/staging/hv/storvsc.c        |   10 +++++-----
> >  2 files changed, 9 insertions(+), 9 deletions(-)
> >
> > diff --git a/drivers/staging/hv/hyperv_storage.h b/drivers/staging/hv/hyperv_storage.h
> > index 53b65be..d946211 100644
> > --- a/drivers/staging/hv/hyperv_storage.h
> > +++ b/drivers/staging/hv/hyperv_storage.h
> > @@ -265,7 +265,7 @@ struct storvsc_device {
> >  	struct hv_device *device;
> >
> >  	/* 0 indicates the device is being destroyed */
> > -	atomic_t ref_count;
> > +	int ref_count;
>
> Is this really needed?  Can't you rely on the reference count of the
> hv_device itself?

We don't have a reference count on the hv_device, and this count is taken
to deal with racing unloads and incoming traffic on the channel from the
host.

Regards,

K. Y
Re: [PATCH 086/117] Staging: hv: storvsc: Leverage the spinlock to manage ref_cnt
On Wed, Aug 24, 2011 at 12:58:36AM +0000, KY Srinivasan wrote:
> -----Original Message-----
> From: Greg KH [mailto:g...@kroah.com]
> Sent: Tuesday, August 23, 2011 7:10 PM
> To: KY Srinivasan
> Cc: gre...@suse.de; linux-ker...@vger.kernel.org;
> de...@linuxdriverproject.org; virtualizat...@lists.osdl.org; Haiyang Zhang
> Subject: Re: [PATCH 086/117] Staging: hv: storvsc: Leverage the spinlock
> to manage ref_cnt
>
> > On Fri, Jul 15, 2011 at 10:47:14AM -0700, K. Y. Srinivasan wrote:
> > > Now that we have a spin lock protecting access to the stor device
> > > pointer, use it to manage the reference count as well.
> > >
> > > Signed-off-by: K. Y. Srinivasan k...@microsoft.com
> > > Signed-off-by: Haiyang Zhang haiya...@microsoft.com
> > > ---
> > >  drivers/staging/hv/hyperv_storage.h |    8 ++++----
> > >  drivers/staging/hv/storvsc.c        |   10 +++++-----
> > >  2 files changed, 9 insertions(+), 9 deletions(-)
> > >
> > > diff --git a/drivers/staging/hv/hyperv_storage.h b/drivers/staging/hv/hyperv_storage.h
> > > index 53b65be..d946211 100644
> > > --- a/drivers/staging/hv/hyperv_storage.h
> > > +++ b/drivers/staging/hv/hyperv_storage.h
> > > @@ -265,7 +265,7 @@ struct storvsc_device {
> > >  	struct hv_device *device;
> > >
> > >  	/* 0 indicates the device is being destroyed */
> > > -	atomic_t ref_count;
> > > +	int ref_count;
> >
> > Is this really needed?  Can't you rely on the reference count of the
> > hv_device itself?
>
> We don't have a reference count on the hv_device

Wait, why not?  You sure better have a reference count on that device if
you have a pointer to it; if not, you have a bug, and that needs to be
fixed.  Please reread Documentation/CodingStyle for details.

> and this count is taken to deal with racing unloads and incoming traffic
> on the channel from the host.

Is this something that all other storage drivers have to do?  If not, then
you shouldn't be doing that as well.

greg k-h