Re: [Qemu-devel] [PATCH v2] vfio/pci: Support error recovery
On Thu, 19 Jan 2017 11:04:37 +0800
Cao jin wrote:

> On 01/19/2017 05:32 AM, Alex Williamson wrote:
> > On Tue, 10 Jan 2017 17:11:01 +0200
> > "Michael S. Tsirkin" wrote:
> >
> >> On Tue, Jan 10, 2017 at 07:46:17PM +0800, Cao jin wrote:
> >>>
> >>> On 01/10/2017 07:04 AM, Michael S. Tsirkin wrote:
> >>>> On Sat, Dec 31, 2016 at 05:15:36PM +0800, Cao jin wrote:
> >>>>> Support serious device error recovery
> >>>>
> >>>> serious?
> >>>
> >>> Sorry for my poor vocabulary if it confuses people. I wanted to express
> >>> the meaning that: vfio-pci actually cannot do a real recovery for device
> >>> even if it provides the callbacks, it relies on the user to do a
> >>> effective(or word "serious"?) recovery.
> >>>
> >>> Welcome the amendment on the commit log.
> >>
> >> It's up to Alex, maybe he's able to figure it all out from
> >> code, but the rest of us could benefit from a description
> >> of what the patch does from userspace point of view.
> >>
> >> Also, is it a pre-requisite of the userspace patches you posted?
> >
> > This is the same blocking user accesses while the device is in recovery
> > that you thought was ineffective/wrong before. Why do we still need it
> > if QEMU isn't trying to handle fatal errors? If the kernel is doing a
> > reset shouldn't the user consider the device dead? A commit log
> > explaining this is absolutely necessary. Thanks,
> >
> > Alex
>
> Yes, it is the same blocking user access as before, and I did said it is
> not effective as we expected, and I drew the figure to illustrate my
> analysis. I think the blocking is right, maybe just not enough to work
> fine, because it is possible that vfio's blocking is over, while
> hardware reset is not done, results in inaccessible device.
>
> Leave the blocking here is no harm for now, and could be useful in
> future(when we handle fatal error).

If you want this in the kernel, you're going to need to invest the
effort to make it work. I'm not going to put in code that is
ineffective at what it intends to do.

> We don't forward fatal error events to guest, why would guest kernel do
> a reset? Or do you mean some device driver would do hardware reset on
> non-fatal error?

The question is if we're only trying to recover from non-fatal events,
what is the scenario where the user is attempting to access the device
while it's in reset? Do we need to consider the existing notifier to
be a fatal event where access to the device should stop immediately and
add a new notifier for non-fatal events where it's safe for the user to
access the device? Trying to use the eventfd to push status through a
single notifier seems flawed. Thanks,

Alex
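[Editor's note] One possible reading of the concern about pushing status through the eventfd: an eventfd is a 64-bit counter, so each signalled value is added to whatever is already pending, and a reader that arrives late sees the sum. The user-space sketch below illustrates that counter behaviour with the Linux eventfd(2) interface. It is not taken from the patch or from QEMU, and demo_eventfd_drain is a hypothetical helper name.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of VFIO or QEMU): write every value in
 * vals[] to a fresh eventfd, then read the counter once, as a slow
 * consumer would. Returns the value read, or 0 on any failure.
 */
static uint64_t demo_eventfd_drain(const uint64_t *vals, size_t n)
{
	uint64_t seen = 0;
	int fd = eventfd(0, 0);	/* counter starts at 0, no semaphore mode */

	if (fd < 0)
		return 0;

	for (size_t i = 0; i < n; i++) {
		/* Each write ADDS vals[i] to the counter; it does not replace it. */
		if (write(fd, &vals[i], sizeof(vals[i])) != (ssize_t)sizeof(vals[i])) {
			close(fd);
			return 0;
		}
	}

	/* A single read returns the accumulated sum and resets the counter. */
	if (read(fd, &seen, sizeof(seen)) != (ssize_t)sizeof(seen))
		seen = 0;

	close(fd);
	return seen;
}
```

Under these assumptions, two pending signals of 0x1000 are read back as 0x2000, which no longer identifies the original status value. Signalling a plain count of 1, as the code before the patch did, does not have this ambiguity because the sum is then simply the number of events.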
Re: [Qemu-devel] [PATCH v2] vfio/pci: Support error recovery
On 01/19/2017 05:32 AM, Alex Williamson wrote:
> On Tue, 10 Jan 2017 17:11:01 +0200
> "Michael S. Tsirkin" wrote:
>
>> On Tue, Jan 10, 2017 at 07:46:17PM +0800, Cao jin wrote:
>>>
>>> On 01/10/2017 07:04 AM, Michael S. Tsirkin wrote:
>>>> On Sat, Dec 31, 2016 at 05:15:36PM +0800, Cao jin wrote:
>>>>> Support serious device error recovery
>>>>
>>>> serious?
>>>
>>> Sorry for my poor vocabulary if it confuses people. I wanted to express
>>> the meaning that: vfio-pci actually cannot do a real recovery for device
>>> even if it provides the callbacks, it relies on the user to do a
>>> effective(or word "serious"?) recovery.
>>>
>>> Welcome the amendment on the commit log.
>>
>> It's up to Alex, maybe he's able to figure it all out from
>> code, but the rest of us could benefit from a description
>> of what the patch does from userspace point of view.
>>
>> Also, is it a pre-requisite of the userspace patches you posted?
>
> This is the same blocking user accesses while the device is in recovery
> that you thought was ineffective/wrong before. Why do we still need it
> if QEMU isn't trying to handle fatal errors? If the kernel is doing a
> reset shouldn't the user consider the device dead? A commit log
> explaining this is absolutely necessary. Thanks,
>
> Alex

Yes, it is the same blocking user access as before, and I did said it is
not effective as we expected, and I drew the figure to illustrate my
analysis. I think the blocking is right, maybe just not enough to work
fine, because it is possible that vfio's blocking is over, while
hardware reset is not done, results in inaccessible device.

Leave the blocking here is no harm for now, and could be useful in
future(when we handle fatal error).

We don't forward fatal error events to guest, why would guest kernel do
a reset? Or do you mean some device driver would do hardware reset on
non-fatal error?
-- Sincerely, Cao jin > > Signed-off-by: Cao jin > --- > drivers/vfio/pci/vfio_pci.c | 70 > +++-- > drivers/vfio/pci/vfio_pci_private.h | 2 ++ > 2 files changed, 70 insertions(+), 2 deletions(-) > > diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c > index 712a849..752af20 100644 > --- a/drivers/vfio/pci/vfio_pci.c > +++ b/drivers/vfio/pci/vfio_pci.c > @@ -534,6 +534,15 @@ static long vfio_pci_ioctl(void *device_data, > { > struct vfio_pci_device *vdev = device_data; > unsigned long minsz; > + int ret; > + > + if (vdev->aer_recovering && (cmd == VFIO_DEVICE_SET_IRQS || > + cmd == VFIO_DEVICE_RESET || cmd == VFIO_DEVICE_PCI_HOT_RESET)) { > + ret = wait_for_completion_interruptible( > + >aer_completion); don't split it like that. > + if (ret) > + return ret; > + } > > if (cmd == VFIO_DEVICE_GET_INFO) { > struct vfio_device_info info; > @@ -953,6 +962,15 @@ static ssize_t vfio_pci_rw(void *device_data, char > __user *buf, > { > unsigned int index = VFIO_PCI_OFFSET_TO_INDEX(*ppos); > struct vfio_pci_device *vdev = device_data; > + int ret; > + > + /* block all kinds of access during host recovery */ > + if (vdev->aer_recovering) { > + ret = wait_for_completion_interruptible( > + >aer_completion); > + if (ret) > + return ret; > + } > > if (index >= VFIO_PCI_NUM_REGIONS + vdev->num_regions) > return -EINVAL; > @@ -1117,6 +1135,7 @@ static int vfio_pci_probe(struct pci_dev *pdev, > const struct pci_device_id *id) > vdev->irq_type = VFIO_PCI_NUM_IRQS; > mutex_init(>igate); > spin_lock_init(>irqlock); > + init_completion(>aer_completion); > > ret = vfio_add_group_dev(>dev, _pci_ops, vdev); > if (ret) { > @@ -1176,6 +1195,9 @@ static pci_ers_result_t > vfio_pci_aer_err_detected(struct pci_dev *pdev, > { > struct vfio_pci_device *vdev; > struct vfio_device *device; > + u32 uncor_status; > + unsigned int aer_cap_offset; > + int ret; > > device = vfio_device_get_from_dev(>dev); > if (device == NULL) > @@ -1187,10 +1209,29 @@ static pci_ers_result_t > 
vfio_pci_aer_err_detected(struct pci_dev *pdev, > return PCI_ERS_RESULT_DISCONNECT; > } > > + /* > + * get device's uncorrectable error status as soon as possible, should be "Get". > + * and signal it to user space. The later we read it, the possibility > + * the register value is mangled grows. > + */ > + aer_cap_offset = pci_find_ext_capability(vdev->pdev, > PCI_EXT_CAP_ID_ERR); > + ret = pci_read_config_dword(vdev->pdev, aer_cap_offset + > +
Re: [Qemu-devel] [PATCH v2] vfio/pci: Support error recovery
On Tue, 10 Jan 2017 17:11:01 +0200
"Michael S. Tsirkin" wrote:

> On Tue, Jan 10, 2017 at 07:46:17PM +0800, Cao jin wrote:
> >
> > On 01/10/2017 07:04 AM, Michael S. Tsirkin wrote:
> > > On Sat, Dec 31, 2016 at 05:15:36PM +0800, Cao jin wrote:
> > >> Support serious device error recovery
> > >
> > > serious?
> >
> > Sorry for my poor vocabulary if it confuses people. I wanted to express
> > the meaning that: vfio-pci actually cannot do a real recovery for device
> > even if it provides the callbacks, it relies on the user to do a
> > effective(or word "serious"?) recovery.
> >
> > Welcome the amendment on the commit log.
>
> It's up to Alex, maybe he's able to figure it all out from
> code, but the rest of us could benefit from a description
> of what the patch does from userspace point of view.
>
> Also, is it a pre-requisite of the userspace patches you posted?

This is the same blocking user accesses while the device is in recovery
that you thought was ineffective/wrong before. Why do we still need it
if QEMU isn't trying to handle fatal errors? If the kernel is doing a
reset shouldn't the user consider the device dead? A commit log
explaining this is absolutely necessary.
Thanks, Alex > > >> > > >> Signed-off-by: Cao jin > > >> --- > > >> drivers/vfio/pci/vfio_pci.c | 70 > > >> +++-- > > >> drivers/vfio/pci/vfio_pci_private.h | 2 ++ > > >> 2 files changed, 70 insertions(+), 2 deletions(-) > > >> > > >> diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c > > >> index 712a849..752af20 100644 > > >> --- a/drivers/vfio/pci/vfio_pci.c > > >> +++ b/drivers/vfio/pci/vfio_pci.c > > >> @@ -534,6 +534,15 @@ static long vfio_pci_ioctl(void *device_data, > > >> { > > >> struct vfio_pci_device *vdev = device_data; > > >> unsigned long minsz; > > >> +int ret; > > >> + > > >> +if (vdev->aer_recovering && (cmd == VFIO_DEVICE_SET_IRQS || > > >> +cmd == VFIO_DEVICE_RESET || cmd == > > >> VFIO_DEVICE_PCI_HOT_RESET)) { > > >> +ret = wait_for_completion_interruptible( > > >> +>aer_completion); > > > > > > don't split it like that. > > > > > >> +if (ret) > > >> +return ret; > > >> +} > > >> > > >> if (cmd == VFIO_DEVICE_GET_INFO) { > > >> struct vfio_device_info info; > > >> @@ -953,6 +962,15 @@ static ssize_t vfio_pci_rw(void *device_data, char > > >> __user *buf, > > >> { > > >> unsigned int index = VFIO_PCI_OFFSET_TO_INDEX(*ppos); > > >> struct vfio_pci_device *vdev = device_data; > > >> +int ret; > > >> + > > >> +/* block all kinds of access during host recovery */ > > >> +if (vdev->aer_recovering) { > > >> +ret = wait_for_completion_interruptible( > > >> +>aer_completion); > > >> +if (ret) > > >> +return ret; > > >> +} > > >> > > >> if (index >= VFIO_PCI_NUM_REGIONS + vdev->num_regions) > > >> return -EINVAL; > > >> @@ -1117,6 +1135,7 @@ static int vfio_pci_probe(struct pci_dev *pdev, > > >> const struct pci_device_id *id) > > >> vdev->irq_type = VFIO_PCI_NUM_IRQS; > > >> mutex_init(>igate); > > >> spin_lock_init(>irqlock); > > >> +init_completion(>aer_completion); > > >> > > >> ret = vfio_add_group_dev(>dev, _pci_ops, vdev); > > >> if (ret) { > > >> @@ -1176,6 +1195,9 @@ static pci_ers_result_t > > >> 
vfio_pci_aer_err_detected(struct pci_dev *pdev, > > >> { > > >> struct vfio_pci_device *vdev; > > >> struct vfio_device *device; > > >> +u32 uncor_status; > > >> +unsigned int aer_cap_offset; > > >> +int ret; > > >> > > >> device = vfio_device_get_from_dev(>dev); > > >> if (device == NULL) > > >> @@ -1187,10 +1209,29 @@ static pci_ers_result_t > > >> vfio_pci_aer_err_detected(struct pci_dev *pdev, > > >> return PCI_ERS_RESULT_DISCONNECT; > > >> } > > >> > > >> +/* > > >> + * get device's uncorrectable error status as soon as possible, > > >> > > > > > > should be "Get". > > > > > >> + * and signal it to user space. The later we read it, the > > >> possibility > > >> + * the register value is mangled grows. > > >> + */ > > >> +aer_cap_offset = pci_find_ext_capability(vdev->pdev, > > >> PCI_EXT_CAP_ID_ERR); > > >> +ret = pci_read_config_dword(vdev->pdev, aer_cap_offset + > > >> +PCI_ERR_UNCOR_STATUS, > > >> _status); > > >> +if (ret) > > >> +return PCI_ERS_RESULT_DISCONNECT; > > >> + > > >> +pr_info("device %d got AER detect notification. uncorrectable > > >> error status =
Re: [Qemu-devel] [PATCH v2] vfio/pci: Support error recovery
Alex, Do you have any comments on this version & and the qemu parts? -- Sincerely, Cao jin On 12/31/2016 05:15 PM, Cao jin wrote: > Support serious device error recovery > > Signed-off-by: Cao jin> --- > drivers/vfio/pci/vfio_pci.c | 70 > +++-- > drivers/vfio/pci/vfio_pci_private.h | 2 ++ > 2 files changed, 70 insertions(+), 2 deletions(-) > > diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c > index 712a849..752af20 100644 > --- a/drivers/vfio/pci/vfio_pci.c > +++ b/drivers/vfio/pci/vfio_pci.c > @@ -534,6 +534,15 @@ static long vfio_pci_ioctl(void *device_data, > { > struct vfio_pci_device *vdev = device_data; > unsigned long minsz; > + int ret; > + > + if (vdev->aer_recovering && (cmd == VFIO_DEVICE_SET_IRQS || > + cmd == VFIO_DEVICE_RESET || cmd == VFIO_DEVICE_PCI_HOT_RESET)) { > + ret = wait_for_completion_interruptible( > + >aer_completion); > + if (ret) > + return ret; > + } > > if (cmd == VFIO_DEVICE_GET_INFO) { > struct vfio_device_info info; > @@ -953,6 +962,15 @@ static ssize_t vfio_pci_rw(void *device_data, char > __user *buf, > { > unsigned int index = VFIO_PCI_OFFSET_TO_INDEX(*ppos); > struct vfio_pci_device *vdev = device_data; > + int ret; > + > + /* block all kinds of access during host recovery */ > + if (vdev->aer_recovering) { > + ret = wait_for_completion_interruptible( > + >aer_completion); > + if (ret) > + return ret; > + } > > if (index >= VFIO_PCI_NUM_REGIONS + vdev->num_regions) > return -EINVAL; > @@ -1117,6 +1135,7 @@ static int vfio_pci_probe(struct pci_dev *pdev, const > struct pci_device_id *id) > vdev->irq_type = VFIO_PCI_NUM_IRQS; > mutex_init(>igate); > spin_lock_init(>irqlock); > + init_completion(>aer_completion); > > ret = vfio_add_group_dev(>dev, _pci_ops, vdev); > if (ret) { > @@ -1176,6 +1195,9 @@ static pci_ers_result_t > vfio_pci_aer_err_detected(struct pci_dev *pdev, > { > struct vfio_pci_device *vdev; > struct vfio_device *device; > + u32 uncor_status; > + unsigned int aer_cap_offset; > + int ret; > > 
device = vfio_device_get_from_dev(>dev); > if (device == NULL) > @@ -1187,10 +1209,29 @@ static pci_ers_result_t > vfio_pci_aer_err_detected(struct pci_dev *pdev, > return PCI_ERS_RESULT_DISCONNECT; > } > > + /* > + * get device's uncorrectable error status as soon as possible, > + * and signal it to user space. The later we read it, the possibility > + * the register value is mangled grows. > + */ > + aer_cap_offset = pci_find_ext_capability(vdev->pdev, > PCI_EXT_CAP_ID_ERR); > + ret = pci_read_config_dword(vdev->pdev, aer_cap_offset + > +PCI_ERR_UNCOR_STATUS, _status); > +if (ret) > +return PCI_ERS_RESULT_DISCONNECT; > + > + pr_info("device %d got AER detect notification. uncorrectable error > status = 0x%x\n", pdev->devfn, uncor_status);//to be removed > mutex_lock(>igate); > > - if (vdev->err_trigger) > - eventfd_signal(vdev->err_trigger, 1); > + vdev->aer_recovering = true; > + reinit_completion(>aer_completion); > + > + if (vdev->err_trigger && uncor_status) { > + pr_info("device %d signal uncor status 0x%x to user", > + pdev->devfn, uncor_status); > + /* signal uncorrectable error status to user space */ > + eventfd_signal(vdev->err_trigger, uncor_status); > +} > > mutex_unlock(>igate); > > @@ -1199,8 +1240,33 @@ static pci_ers_result_t > vfio_pci_aer_err_detected(struct pci_dev *pdev, > return PCI_ERS_RESULT_CAN_RECOVER; > } > > +static void vfio_pci_aer_resume(struct pci_dev *pdev) > +{ > + struct vfio_pci_device *vdev; > + struct vfio_device *device; > + > + device = vfio_device_get_from_dev(>dev); > + if (device == NULL) > + return; > + > + vdev = vfio_device_data(device); > + if (vdev == NULL) { > + vfio_device_put(device); > + return; > + } > + > + mutex_lock(>igate); > + vdev->aer_recovering = false; > + mutex_unlock(>igate); > + > + complete_all(>aer_completion); > + > + vfio_device_put(device); > +} > + > static const struct pci_error_handlers vfio_err_handlers = { > .error_detected = vfio_pci_aer_err_detected, > + .resume = vfio_pci_aer_resume, > 
}; > > static struct pci_driver vfio_pci_driver = { > diff --git a/drivers/vfio/pci/vfio_pci_private.h > b/drivers/vfio/pci/vfio_pci_private.h > index 8a7d546..ba8471f 100644 > --- a/drivers/vfio/pci/vfio_pci_private.h > +++ b/drivers/vfio/pci/vfio_pci_private.h
Re: [Qemu-devel] [PATCH v2] vfio/pci: Support error recovery
On Wed, Jan 11, 2017 at 09:53:25AM +0800, Cao jin wrote:
>
> On 01/10/2017 11:11 PM, Michael S. Tsirkin wrote:
> > On Tue, Jan 10, 2017 at 07:46:17PM +0800, Cao jin wrote:
> >>
> >> On 01/10/2017 07:04 AM, Michael S. Tsirkin wrote:
> >>> On Sat, Dec 31, 2016 at 05:15:36PM +0800, Cao jin wrote:
> >>>> Support serious device error recovery
> >>>
> >>> serious?
> >>
> >> Sorry for my poor vocabulary if it confuses people. I wanted to express
> >> the meaning that: vfio-pci actually cannot do a real recovery for device
> >> even if it provides the callbacks, it relies on the user to do a
> >> effective(or word "serious"?) recovery.
> >>
> >> Welcome the amendment on the commit log.
> >
> > It's up to Alex, maybe he's able to figure it all out from
> > code, but the rest of us could benefit from a description
> > of what the patch does from userspace point of view.
> >
> > Also, is it a pre-requisite of the userspace patches you posted?
>
> Yes, it is.

Looks like it's time for another design document :)
Re: [Qemu-devel] [PATCH v2] vfio/pci: Support error recovery
On 01/10/2017 11:11 PM, Michael S. Tsirkin wrote: > On Tue, Jan 10, 2017 at 07:46:17PM +0800, Cao jin wrote: >> >> >> On 01/10/2017 07:04 AM, Michael S. Tsirkin wrote: >>> On Sat, Dec 31, 2016 at 05:15:36PM +0800, Cao jin wrote: Support serious device error recovery >>> >>> serious? >>> >> >> Sorry for my poor vocabulary if it confuses people. I wanted to express >> the meaning that: vfio-pci actually cannot do a real recovery for device >> even if it provides the callbacks, it relies on the user to do a >> effective(or word "serious"?) recovery. >> >> Welcome the amendment on the commit log. > > It's up to Alex, maybe he's able to figure it all out from > code, but the rest of us could benefit from a description > of what the patch does from userspace point of view. > > Also, is it a pre-requisite of the userspace patches you posted? > Yes, it is. -- Sincerely, Cao jin Signed-off-by: Cao jin--- drivers/vfio/pci/vfio_pci.c | 70 +++-- drivers/vfio/pci/vfio_pci_private.h | 2 ++ 2 files changed, 70 insertions(+), 2 deletions(-) diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c index 712a849..752af20 100644 --- a/drivers/vfio/pci/vfio_pci.c +++ b/drivers/vfio/pci/vfio_pci.c @@ -534,6 +534,15 @@ static long vfio_pci_ioctl(void *device_data, { struct vfio_pci_device *vdev = device_data; unsigned long minsz; + int ret; + + if (vdev->aer_recovering && (cmd == VFIO_DEVICE_SET_IRQS || + cmd == VFIO_DEVICE_RESET || cmd == VFIO_DEVICE_PCI_HOT_RESET)) { + ret = wait_for_completion_interruptible( + >aer_completion); >>> >>> don't split it like that. 
>>> + if (ret) + return ret; + } if (cmd == VFIO_DEVICE_GET_INFO) { struct vfio_device_info info; @@ -953,6 +962,15 @@ static ssize_t vfio_pci_rw(void *device_data, char __user *buf, { unsigned int index = VFIO_PCI_OFFSET_TO_INDEX(*ppos); struct vfio_pci_device *vdev = device_data; + int ret; + + /* block all kinds of access during host recovery */ + if (vdev->aer_recovering) { + ret = wait_for_completion_interruptible( + >aer_completion); + if (ret) + return ret; + } if (index >= VFIO_PCI_NUM_REGIONS + vdev->num_regions) return -EINVAL; @@ -1117,6 +1135,7 @@ static int vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id) vdev->irq_type = VFIO_PCI_NUM_IRQS; mutex_init(>igate); spin_lock_init(>irqlock); + init_completion(>aer_completion); ret = vfio_add_group_dev(>dev, _pci_ops, vdev); if (ret) { @@ -1176,6 +1195,9 @@ static pci_ers_result_t vfio_pci_aer_err_detected(struct pci_dev *pdev, { struct vfio_pci_device *vdev; struct vfio_device *device; + u32 uncor_status; + unsigned int aer_cap_offset; + int ret; device = vfio_device_get_from_dev(>dev); if (device == NULL) @@ -1187,10 +1209,29 @@ static pci_ers_result_t vfio_pci_aer_err_detected(struct pci_dev *pdev, return PCI_ERS_RESULT_DISCONNECT; } + /* + * get device's uncorrectable error status as soon as possible, >>> >>> should be "Get". >>> + * and signal it to user space. The later we read it, the possibility + * the register value is mangled grows. + */ + aer_cap_offset = pci_find_ext_capability(vdev->pdev, PCI_EXT_CAP_ID_ERR); + ret = pci_read_config_dword(vdev->pdev, aer_cap_offset + +PCI_ERR_UNCOR_STATUS, _status); +if (ret) +return PCI_ERS_RESULT_DISCONNECT; + + pr_info("device %d got AER detect notification. uncorrectable error status = 0x%x\n", pdev->devfn, uncor_status);//to be removed >>> >>> Pls drop this. 
>>> mutex_lock(>igate); - if (vdev->err_trigger) - eventfd_signal(vdev->err_trigger, 1); + vdev->aer_recovering = true; + reinit_completion(>aer_completion); + + if (vdev->err_trigger && uncor_status) { + pr_info("device %d signal uncor status 0x%x to user", + pdev->devfn, uncor_status); + /* signal uncorrectable error status to user space */ + eventfd_signal(vdev->err_trigger, uncor_status); +} mutex_unlock(>igate); @@ -1199,8 +1240,33 @@ static pci_ers_result_t vfio_pci_aer_err_detected(struct pci_dev *pdev, return PCI_ERS_RESULT_CAN_RECOVER; } +static void vfio_pci_aer_resume(struct pci_dev
Re: [Qemu-devel] [PATCH v2] vfio/pci: Support error recovery
On Tue, Jan 10, 2017 at 07:46:17PM +0800, Cao jin wrote: > > > On 01/10/2017 07:04 AM, Michael S. Tsirkin wrote: > > On Sat, Dec 31, 2016 at 05:15:36PM +0800, Cao jin wrote: > >> Support serious device error recovery > > > > serious? > > > > Sorry for my poor vocabulary if it confuses people. I wanted to express > the meaning that: vfio-pci actually cannot do a real recovery for device > even if it provides the callbacks, it relies on the user to do a > effective(or word "serious"?) recovery. > > Welcome the amendment on the commit log. It's up to Alex, maybe he's able to figure it all out from code, but the rest of us could benefit from a description of what the patch does from userspace point of view. Also, is it a pre-requisite of the userspace patches you posted? > -- > Sincerely, > Cao jin > > >> > >> Signed-off-by: Cao jin> >> --- > >> drivers/vfio/pci/vfio_pci.c | 70 > >> +++-- > >> drivers/vfio/pci/vfio_pci_private.h | 2 ++ > >> 2 files changed, 70 insertions(+), 2 deletions(-) > >> > >> diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c > >> index 712a849..752af20 100644 > >> --- a/drivers/vfio/pci/vfio_pci.c > >> +++ b/drivers/vfio/pci/vfio_pci.c > >> @@ -534,6 +534,15 @@ static long vfio_pci_ioctl(void *device_data, > >> { > >>struct vfio_pci_device *vdev = device_data; > >>unsigned long minsz; > >> + int ret; > >> + > >> + if (vdev->aer_recovering && (cmd == VFIO_DEVICE_SET_IRQS || > >> + cmd == VFIO_DEVICE_RESET || cmd == VFIO_DEVICE_PCI_HOT_RESET)) { > >> + ret = wait_for_completion_interruptible( > >> + >aer_completion); > > > > don't split it like that. 
> > > >> + if (ret) > >> + return ret; > >> + } > >> > >>if (cmd == VFIO_DEVICE_GET_INFO) { > >>struct vfio_device_info info; > >> @@ -953,6 +962,15 @@ static ssize_t vfio_pci_rw(void *device_data, char > >> __user *buf, > >> { > >>unsigned int index = VFIO_PCI_OFFSET_TO_INDEX(*ppos); > >>struct vfio_pci_device *vdev = device_data; > >> + int ret; > >> + > >> + /* block all kinds of access during host recovery */ > >> + if (vdev->aer_recovering) { > >> + ret = wait_for_completion_interruptible( > >> + >aer_completion); > >> + if (ret) > >> + return ret; > >> + } > >> > >>if (index >= VFIO_PCI_NUM_REGIONS + vdev->num_regions) > >>return -EINVAL; > >> @@ -1117,6 +1135,7 @@ static int vfio_pci_probe(struct pci_dev *pdev, > >> const struct pci_device_id *id) > >>vdev->irq_type = VFIO_PCI_NUM_IRQS; > >>mutex_init(>igate); > >>spin_lock_init(>irqlock); > >> + init_completion(>aer_completion); > >> > >>ret = vfio_add_group_dev(>dev, _pci_ops, vdev); > >>if (ret) { > >> @@ -1176,6 +1195,9 @@ static pci_ers_result_t > >> vfio_pci_aer_err_detected(struct pci_dev *pdev, > >> { > >>struct vfio_pci_device *vdev; > >>struct vfio_device *device; > >> + u32 uncor_status; > >> + unsigned int aer_cap_offset; > >> + int ret; > >> > >>device = vfio_device_get_from_dev(>dev); > >>if (device == NULL) > >> @@ -1187,10 +1209,29 @@ static pci_ers_result_t > >> vfio_pci_aer_err_detected(struct pci_dev *pdev, > >>return PCI_ERS_RESULT_DISCONNECT; > >>} > >> > >> + /* > >> + * get device's uncorrectable error status as soon as possible, > > > > should be "Get". > > > >> + * and signal it to user space. The later we read it, the possibility > >> + * the register value is mangled grows. 
> >> + */ > >> + aer_cap_offset = pci_find_ext_capability(vdev->pdev, > >> PCI_EXT_CAP_ID_ERR); > >> + ret = pci_read_config_dword(vdev->pdev, aer_cap_offset + > >> +PCI_ERR_UNCOR_STATUS, _status); > >> +if (ret) > >> +return PCI_ERS_RESULT_DISCONNECT; > >> + > >> + pr_info("device %d got AER detect notification. uncorrectable error > >> status = 0x%x\n", pdev->devfn, uncor_status);//to be removed > > > > Pls drop this. > > > >>mutex_lock(>igate); > >> > >> - if (vdev->err_trigger) > >> - eventfd_signal(vdev->err_trigger, 1); > >> + vdev->aer_recovering = true; > >> + reinit_completion(>aer_completion); > >> + > >> + if (vdev->err_trigger && uncor_status) { > >> + pr_info("device %d signal uncor status 0x%x to user", > >> + pdev->devfn, uncor_status); > >> + /* signal uncorrectable error status to user space */ > >> + eventfd_signal(vdev->err_trigger, uncor_status); > >> +} > >> > >>mutex_unlock(>igate); > >> > >> @@ -1199,8 +1240,33 @@ static pci_ers_result_t > >> vfio_pci_aer_err_detected(struct pci_dev *pdev, > >>return PCI_ERS_RESULT_CAN_RECOVER; > >> } > >> > >> +static void vfio_pci_aer_resume(struct pci_dev *pdev) > >> +{ > >> + struct vfio_pci_device *vdev; > >> + struct
Re: [Qemu-devel] [PATCH v2] vfio/pci: Support error recovery
On 01/10/2017 07:04 AM, Michael S. Tsirkin wrote: > On Sat, Dec 31, 2016 at 05:15:36PM +0800, Cao jin wrote: >> Support serious device error recovery > > serious? > Sorry for my poor vocabulary if it confuses people. I wanted to express the meaning that: vfio-pci actually cannot do a real recovery for device even if it provides the callbacks, it relies on the user to do a effective(or word "serious"?) recovery. Welcome the amendment on the commit log. -- Sincerely, Cao jin >> >> Signed-off-by: Cao jin>> --- >> drivers/vfio/pci/vfio_pci.c | 70 >> +++-- >> drivers/vfio/pci/vfio_pci_private.h | 2 ++ >> 2 files changed, 70 insertions(+), 2 deletions(-) >> >> diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c >> index 712a849..752af20 100644 >> --- a/drivers/vfio/pci/vfio_pci.c >> +++ b/drivers/vfio/pci/vfio_pci.c >> @@ -534,6 +534,15 @@ static long vfio_pci_ioctl(void *device_data, >> { >> struct vfio_pci_device *vdev = device_data; >> unsigned long minsz; >> +int ret; >> + >> +if (vdev->aer_recovering && (cmd == VFIO_DEVICE_SET_IRQS || >> +cmd == VFIO_DEVICE_RESET || cmd == VFIO_DEVICE_PCI_HOT_RESET)) { >> +ret = wait_for_completion_interruptible( >> +>aer_completion); > > don't split it like that. 
> >> +if (ret) >> +return ret; >> +} >> >> if (cmd == VFIO_DEVICE_GET_INFO) { >> struct vfio_device_info info; >> @@ -953,6 +962,15 @@ static ssize_t vfio_pci_rw(void *device_data, char >> __user *buf, >> { >> unsigned int index = VFIO_PCI_OFFSET_TO_INDEX(*ppos); >> struct vfio_pci_device *vdev = device_data; >> +int ret; >> + >> +/* block all kinds of access during host recovery */ >> +if (vdev->aer_recovering) { >> +ret = wait_for_completion_interruptible( >> +>aer_completion); >> +if (ret) >> +return ret; >> +} >> >> if (index >= VFIO_PCI_NUM_REGIONS + vdev->num_regions) >> return -EINVAL; >> @@ -1117,6 +1135,7 @@ static int vfio_pci_probe(struct pci_dev *pdev, const >> struct pci_device_id *id) >> vdev->irq_type = VFIO_PCI_NUM_IRQS; >> mutex_init(>igate); >> spin_lock_init(>irqlock); >> +init_completion(>aer_completion); >> >> ret = vfio_add_group_dev(>dev, _pci_ops, vdev); >> if (ret) { >> @@ -1176,6 +1195,9 @@ static pci_ers_result_t >> vfio_pci_aer_err_detected(struct pci_dev *pdev, >> { >> struct vfio_pci_device *vdev; >> struct vfio_device *device; >> +u32 uncor_status; >> +unsigned int aer_cap_offset; >> +int ret; >> >> device = vfio_device_get_from_dev(>dev); >> if (device == NULL) >> @@ -1187,10 +1209,29 @@ static pci_ers_result_t >> vfio_pci_aer_err_detected(struct pci_dev *pdev, >> return PCI_ERS_RESULT_DISCONNECT; >> } >> >> +/* >> + * get device's uncorrectable error status as soon as possible, > > should be "Get". > >> + * and signal it to user space. The later we read it, the possibility >> + * the register value is mangled grows. >> + */ >> +aer_cap_offset = pci_find_ext_capability(vdev->pdev, >> PCI_EXT_CAP_ID_ERR); >> +ret = pci_read_config_dword(vdev->pdev, aer_cap_offset + >> +PCI_ERR_UNCOR_STATUS, _status); >> +if (ret) >> +return PCI_ERS_RESULT_DISCONNECT; >> + >> +pr_info("device %d got AER detect notification. uncorrectable error >> status = 0x%x\n", pdev->devfn, uncor_status);//to be removed > > Pls drop this. 
> >> mutex_lock(>igate); >> >> -if (vdev->err_trigger) >> -eventfd_signal(vdev->err_trigger, 1); >> +vdev->aer_recovering = true; >> +reinit_completion(>aer_completion); >> + >> +if (vdev->err_trigger && uncor_status) { >> +pr_info("device %d signal uncor status 0x%x to user", >> +pdev->devfn, uncor_status); >> +/* signal uncorrectable error status to user space */ >> +eventfd_signal(vdev->err_trigger, uncor_status); >> +} >> >> mutex_unlock(>igate); >> >> @@ -1199,8 +1240,33 @@ static pci_ers_result_t >> vfio_pci_aer_err_detected(struct pci_dev *pdev, >> return PCI_ERS_RESULT_CAN_RECOVER; >> } >> >> +static void vfio_pci_aer_resume(struct pci_dev *pdev) >> +{ >> +struct vfio_pci_device *vdev; >> +struct vfio_device *device; >> + >> +device = vfio_device_get_from_dev(>dev); >> +if (device == NULL) >> +return; >> + >> +vdev = vfio_device_data(device); >> +if (vdev == NULL) { >> +vfio_device_put(device); >> +return; >> +} >> + >> +mutex_lock(>igate); >> +vdev->aer_recovering = false; >> +mutex_unlock(>igate); >> + >> +complete_all(>aer_completion); >> + >> +
Re: [Qemu-devel] [PATCH v2] vfio/pci: Support error recovery
On Sat, Dec 31, 2016 at 05:15:36PM +0800, Cao jin wrote:
> Support serious device error recovery

serious?

>
> Signed-off-by: Cao jin
> ---
>  drivers/vfio/pci/vfio_pci.c         | 70 +++--
>  drivers/vfio/pci/vfio_pci_private.h |  2 ++
>  2 files changed, 70 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
> index 712a849..752af20 100644
> --- a/drivers/vfio/pci/vfio_pci.c
> +++ b/drivers/vfio/pci/vfio_pci.c
> @@ -534,6 +534,15 @@ static long vfio_pci_ioctl(void *device_data,
>  {
>  	struct vfio_pci_device *vdev = device_data;
>  	unsigned long minsz;
> +	int ret;
> +
> +	if (vdev->aer_recovering && (cmd == VFIO_DEVICE_SET_IRQS ||
> +	    cmd == VFIO_DEVICE_RESET || cmd == VFIO_DEVICE_PCI_HOT_RESET)) {
> +		ret = wait_for_completion_interruptible(
> +			&vdev->aer_completion);

don't split it like that.

> +		if (ret)
> +			return ret;
> +	}
>
>  	if (cmd == VFIO_DEVICE_GET_INFO) {
>  		struct vfio_device_info info;
> @@ -953,6 +962,15 @@ static ssize_t vfio_pci_rw(void *device_data, char __user *buf,
>  {
>  	unsigned int index = VFIO_PCI_OFFSET_TO_INDEX(*ppos);
>  	struct vfio_pci_device *vdev = device_data;
> +	int ret;
> +
> +	/* block all kinds of access during host recovery */
> +	if (vdev->aer_recovering) {
> +		ret = wait_for_completion_interruptible(
> +			&vdev->aer_completion);
> +		if (ret)
> +			return ret;
> +	}
>
>  	if (index >= VFIO_PCI_NUM_REGIONS + vdev->num_regions)
>  		return -EINVAL;
> @@ -1117,6 +1135,7 @@ static int vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>  	vdev->irq_type = VFIO_PCI_NUM_IRQS;
>  	mutex_init(&vdev->igate);
>  	spin_lock_init(&vdev->irqlock);
> +	init_completion(&vdev->aer_completion);
>
>  	ret = vfio_add_group_dev(&pdev->dev, &vfio_pci_ops, vdev);
>  	if (ret) {
> @@ -1176,6 +1195,9 @@ static pci_ers_result_t vfio_pci_aer_err_detected(struct pci_dev *pdev,
>  {
>  	struct vfio_pci_device *vdev;
>  	struct vfio_device *device;
> +	u32 uncor_status;
> +	unsigned int aer_cap_offset;
> +	int ret;
>
>  	device = vfio_device_get_from_dev(&pdev->dev);
>  	if (device == NULL)
> @@ -1187,10 +1209,29 @@ static pci_ers_result_t vfio_pci_aer_err_detected(struct pci_dev *pdev,
>  		return PCI_ERS_RESULT_DISCONNECT;
>  	}
>
> +	/*
> +	 * get device's uncorrectable error status as soon as possible,

should be "Get".

> +	 * and signal it to user space. The later we read it, the possibility
> +	 * the register value is mangled grows.
> +	 */
> +	aer_cap_offset = pci_find_ext_capability(vdev->pdev, PCI_EXT_CAP_ID_ERR);
> +	ret = pci_read_config_dword(vdev->pdev, aer_cap_offset +
> +				    PCI_ERR_UNCOR_STATUS, &uncor_status);
> +	if (ret)
> +		return PCI_ERS_RESULT_DISCONNECT;
> +
> +	pr_info("device %d got AER detect notification. uncorrectable error status = 0x%x\n", pdev->devfn, uncor_status);//to be removed

Pls drop this.

>  	mutex_lock(&vdev->igate);
>
> -	if (vdev->err_trigger)
> -		eventfd_signal(vdev->err_trigger, 1);
> +	vdev->aer_recovering = true;
> +	reinit_completion(&vdev->aer_completion);
> +
> +	if (vdev->err_trigger && uncor_status) {
> +		pr_info("device %d signal uncor status 0x%x to user",
> +			pdev->devfn, uncor_status);
> +		/* signal uncorrectable error status to user space */
> +		eventfd_signal(vdev->err_trigger, uncor_status);
> +	}
>
>  	mutex_unlock(&vdev->igate);
>
> @@ -1199,8 +1240,33 @@ static pci_ers_result_t vfio_pci_aer_err_detected(struct pci_dev *pdev,
>  	return PCI_ERS_RESULT_CAN_RECOVER;
>  }
>
> +static void vfio_pci_aer_resume(struct pci_dev *pdev)
> +{
> +	struct vfio_pci_device *vdev;
> +	struct vfio_device *device;
> +
> +	device = vfio_device_get_from_dev(&pdev->dev);
> +	if (device == NULL)
> +		return;
> +
> +	vdev = vfio_device_data(device);
> +	if (vdev == NULL) {
> +		vfio_device_put(device);
> +		return;
> +	}
> +
> +	mutex_lock(&vdev->igate);
> +	vdev->aer_recovering = false;
> +	mutex_unlock(&vdev->igate);
> +
> +	complete_all(&vdev->aer_completion);
> +
> +	vfio_device_put(device);
> +}
> +
>  static const struct pci_error_handlers vfio_err_handlers = {
>  	.error_detected = vfio_pci_aer_err_detected,
> +	.resume = vfio_pci_aer_resume,
>  };
>
>  static struct pci_driver vfio_pci_driver = {
> diff --git a/drivers/vfio/pci/vfio_pci_private.h b/drivers/vfio/pci/vfio_pci_private.h
> index 8a7d546..ba8471f 100644
> --- a/drivers/vfio/pci/vfio_pci_private.h
> +++ b/drivers/vfio/pci/vfio_pci_private.h
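For anyone following the eventfd part of the patch: an eventfd holds a single 64-bit counter, and every signal adds to it, so passing uncor_status as the signal value is only unambiguous if the listener reads between notifications. Below is a minimal userspace sketch of that behaviour, assuming Linux and eventfd(2). The plain write() calls stand in for the kernel's eventfd_signal(), and the helper names signal_status(), read_pending() and two_signals_then_read() are hypothetical, not part of the patch or of any VFIO API.

```c
/* Userspace sketch (Linux only). The write() calls emulate what
 * eventfd_signal() does to the counter; none of this is patch code. */
#include <stdint.h>
#include <unistd.h>
#include <sys/eventfd.h>

/* Hypothetical helper: emulate one notification carrying `status`. */
int signal_status(int efd, uint32_t status)
{
	uint64_t v = status;

	if (!status)		/* the patch only signals a nonzero status */
		return 0;
	return write(efd, &v, sizeof(v)) == (ssize_t)sizeof(v) ? 0 : -1;
}

/* Hypothetical helper: drain the counter as a listener would.
 * Returns 0 when nothing is pending (non-blocking read fails). */
uint64_t read_pending(int efd)
{
	uint64_t v = 0;

	if (read(efd, &v, sizeof(v)) != (ssize_t)sizeof(v))
		return 0;
	return v;
}

/* Two notifications that arrive before the listener reads are summed
 * into one value; the individual status words are not recoverable. */
uint64_t two_signals_then_read(uint32_t first, uint32_t second)
{
	int efd = eventfd(0, EFD_NONBLOCK);
	uint64_t seen;

	if (efd < 0)
		return 0;
	signal_status(efd, first);
	signal_status(efd, second);
	seen = read_pending(efd);
	close(efd);
	return seen;
}
```

With this sketch, two_signals_then_read(0x10, 0x10) yields 0x20: two reports of the same status bit (0x10, PCI_ERR_UNC_DLP in pci_regs.h) are indistinguishable from a single report of a different bit (0x20, PCI_ERR_UNC_SURPDN). Whether back-to-back notifications can actually occur before user space reads depends on the recovery flow and is not something this sketch establishes.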
[Qemu-devel] [PATCH v2] vfio/pci: Support error recovery
Support serious device error recovery

Signed-off-by: Cao jin
---
 drivers/vfio/pci/vfio_pci.c         | 70 +++--
 drivers/vfio/pci/vfio_pci_private.h |  2 ++
 2 files changed, 70 insertions(+), 2 deletions(-)

diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
index 712a849..752af20 100644
--- a/drivers/vfio/pci/vfio_pci.c
+++ b/drivers/vfio/pci/vfio_pci.c
@@ -534,6 +534,15 @@ static long vfio_pci_ioctl(void *device_data,
 {
 	struct vfio_pci_device *vdev = device_data;
 	unsigned long minsz;
+	int ret;
+
+	if (vdev->aer_recovering && (cmd == VFIO_DEVICE_SET_IRQS ||
+	    cmd == VFIO_DEVICE_RESET || cmd == VFIO_DEVICE_PCI_HOT_RESET)) {
+		ret = wait_for_completion_interruptible(
+			&vdev->aer_completion);
+		if (ret)
+			return ret;
+	}

 	if (cmd == VFIO_DEVICE_GET_INFO) {
 		struct vfio_device_info info;
@@ -953,6 +962,15 @@ static ssize_t vfio_pci_rw(void *device_data, char __user *buf,
 {
 	unsigned int index = VFIO_PCI_OFFSET_TO_INDEX(*ppos);
 	struct vfio_pci_device *vdev = device_data;
+	int ret;
+
+	/* block all kinds of access during host recovery */
+	if (vdev->aer_recovering) {
+		ret = wait_for_completion_interruptible(
+			&vdev->aer_completion);
+		if (ret)
+			return ret;
+	}

 	if (index >= VFIO_PCI_NUM_REGIONS + vdev->num_regions)
 		return -EINVAL;
@@ -1117,6 +1135,7 @@ static int vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 	vdev->irq_type = VFIO_PCI_NUM_IRQS;
 	mutex_init(&vdev->igate);
 	spin_lock_init(&vdev->irqlock);
+	init_completion(&vdev->aer_completion);

 	ret = vfio_add_group_dev(&pdev->dev, &vfio_pci_ops, vdev);
 	if (ret) {
@@ -1176,6 +1195,9 @@ static pci_ers_result_t vfio_pci_aer_err_detected(struct pci_dev *pdev,
 {
 	struct vfio_pci_device *vdev;
 	struct vfio_device *device;
+	u32 uncor_status;
+	unsigned int aer_cap_offset;
+	int ret;

 	device = vfio_device_get_from_dev(&pdev->dev);
 	if (device == NULL)
@@ -1187,10 +1209,29 @@ static pci_ers_result_t vfio_pci_aer_err_detected(struct pci_dev *pdev,
 		return PCI_ERS_RESULT_DISCONNECT;
 	}

+	/*
+	 * get device's uncorrectable error status as soon as possible,
+	 * and signal it to user space. The later we read it, the possibility
+	 * the register value is mangled grows.
+	 */
+	aer_cap_offset = pci_find_ext_capability(vdev->pdev, PCI_EXT_CAP_ID_ERR);
+	ret = pci_read_config_dword(vdev->pdev, aer_cap_offset +
+				    PCI_ERR_UNCOR_STATUS, &uncor_status);
+	if (ret)
+		return PCI_ERS_RESULT_DISCONNECT;
+
+	pr_info("device %d got AER detect notification. uncorrectable error status = 0x%x\n", pdev->devfn, uncor_status);//to be removed
 	mutex_lock(&vdev->igate);

-	if (vdev->err_trigger)
-		eventfd_signal(vdev->err_trigger, 1);
+	vdev->aer_recovering = true;
+	reinit_completion(&vdev->aer_completion);
+
+	if (vdev->err_trigger && uncor_status) {
+		pr_info("device %d signal uncor status 0x%x to user",
+			pdev->devfn, uncor_status);
+		/* signal uncorrectable error status to user space */
+		eventfd_signal(vdev->err_trigger, uncor_status);
+	}

 	mutex_unlock(&vdev->igate);

@@ -1199,8 +1240,33 @@ static pci_ers_result_t vfio_pci_aer_err_detected(struct pci_dev *pdev,
 	return PCI_ERS_RESULT_CAN_RECOVER;
 }

+static void vfio_pci_aer_resume(struct pci_dev *pdev)
+{
+	struct vfio_pci_device *vdev;
+	struct vfio_device *device;
+
+	device = vfio_device_get_from_dev(&pdev->dev);
+	if (device == NULL)
+		return;
+
+	vdev = vfio_device_data(device);
+	if (vdev == NULL) {
+		vfio_device_put(device);
+		return;
+	}
+
+	mutex_lock(&vdev->igate);
+	vdev->aer_recovering = false;
+	mutex_unlock(&vdev->igate);
+
+	complete_all(&vdev->aer_completion);
+
+	vfio_device_put(device);
+}
+
 static const struct pci_error_handlers vfio_err_handlers = {
 	.error_detected = vfio_pci_aer_err_detected,
+	.resume = vfio_pci_aer_resume,
 };

 static struct pci_driver vfio_pci_driver = {
diff --git a/drivers/vfio/pci/vfio_pci_private.h b/drivers/vfio/pci/vfio_pci_private.h
index 8a7d546..ba8471f 100644
--- a/drivers/vfio/pci/vfio_pci_private.h
+++ b/drivers/vfio/pci/vfio_pci_private.h
@@ -83,6 +83,8 @@ struct vfio_pci_device {
 	bool			bardirty;
 	bool			has_vga;
 	bool			needs_reset;
+	bool			aer_recovering;
+	struct completion