RE: [PATCH 1/2 v2] pci-hyperv: properly handle pci bus remove
> -----Original Message-----
> From: Bjorn Helgaas [mailto:helg...@kernel.org]
> Sent: Tuesday, September 27, 2016 12:30 PM
> To: Long Li
> Cc: KY Srinivasan; Haiyang Zhang; Bjorn Helgaas;
> de...@linuxdriverproject.org; linux-...@vger.kernel.org;
> linux-ker...@vger.kernel.org; Long Li
> Subject: Re: [PATCH 1/2 v2] pci-hyperv: properly handle pci bus remove
>
> On Wed, Sep 14, 2016 at 07:10:01PM -0700, Long Li wrote:
> > From: Long Li
> >
> > hv_pci_devices_present is called in hv_pci_remove when we remove a PCI
> > device from host (e.g. by disabling SRIOV on a device). In hv_pci_remove,
> > the bus is already removed before the call, so we don't need to rescan the
> > bus in the workqueue scheduled from hv_pci_devices_present. By
> > introducing status hv_pcibus_removed, we can avoid this situation.
> >
> > The patch fixes the following kernel panic.
> >
> > [ 383.853124] Workqueue: events pci_devices_present_work [pci_hyperv]
> > [ 383.853124] task: 88007f5f8000 ti: 88007f60 task.ti: 88007f60
> > [ 383.853124] RIP: 0010:[] [] pci_is_pcie+0x6/0x20
> > [ 383.853124] RSP: 0018:88007f603d38 EFLAGS: 00010206
> > [ 383.853124] RAX: 88007f5f8000 RBX: 642f3d4854415056 RCX: 88007f603fd8
> > [ 383.853124] RDX: RSI: RDI: 642f3d4854415056
> > [ 383.853124] RBP: 88007f603d68 R08: 0246 R09: a045eb9e
> > [ 383.853124] R10: 88007b419a80 R11: ea0001c0ef40 R12: 880003ee1c00
> > [ 383.853124] R13: 63702f30303a3137 R14: R15: 0246
> > [ 383.853124] FS: () GS:88007b40() knlGS:
> > [ 383.853124] CS: 0010 DS: ES: CR0: 80050033
> > [ 383.853124] CR2: 7f68b3f52350 CR3: 03546000 CR4: 000406f0
> > [ 383.853124] DR0: DR1: DR2:
> > [ 383.853124] DR3: DR6: 0ff0 DR7: 0400
> > [ 383.853124] Stack:
> > [ 383.853124] 88007f603d68 8134db17 0008 880003ee1c00
> > [ 383.853124] 63702f30303a3137 880003d8edb8 88007f603da0 8134ee2d
> > [ 383.853124] 880003d8ed00 88007f603dd8 880075fec320 880003d8edb8
> > [ 383.853124] Call Trace:
> > [ 383.853124] [] ? pci_scan_slot+0x27/0x140
> > [ 383.853124] [] pci_scan_child_bus+0x3d/0x150
> > [ 383.853124] [] pci_devices_present_work+0x3ea/0x400 [pci_hyperv]
> > [ 383.853124] [] process_one_work+0x17b/0x470
> > [ 383.853124] [] worker_thread+0x126/0x410
> > [ 383.853124] [] ? rescuer_thread+0x460/0x460
> > [ 383.853124] [] kthread+0xcf/0xe0
> > [ 383.853124] [] ? kthread_create_on_node+0x140/0x140
> > [ 383.853124] [] ret_from_fork+0x58/0x90
> > [ 383.853124] [] ? kthread_create_on_node+0x140/0x140
> > [ 383.853124] Code: 89 e5 5d 25 f0 00 00 00 c1 f8 04 c3 66 0f 1f 84 00
> > 00 00 00 00 66 66 66 66 90 55 0f b6 47 4a 48 89 e5 5d c3 90 66 66 66 66
> > 90 55 <80> 7f 4a 00 48 89 e5 5d 0f 95 c0 c3 0f 1f 40 00 66 2e 0f 1f 84
> > [ 383.853124] RIP [] pci_is_pcie+0x6/0x20
> > [ 383.853124] RSP
>
> Personally, I would remove the timestamps and addresses from this trace
> because I don't think they contribute to diagnosing the problem.
>
> > Signed-off-by: Long Li

Acked-by: KY Srinivasan

Thanks,

K. Y

>
> I'm ready to apply these but am waiting for an ack from the maintainers
> listed in MAINTAINERS (feel free to update that if it's out of date).
>
> > ---
> >  drivers/pci/host/pci-hyperv.c | 20 +---
> >  1 file changed, 17 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/pci/host/pci-hyperv.c b/drivers/pci/host/pci-hyperv.c
> > index a8deeca..4a37598 100644
> > --- a/drivers/pci/host/pci-hyperv.c
> > +++ b/drivers/pci/host/pci-hyperv.c
> > @@ -348,6 +348,7 @@ enum hv_pcibus_state {
> >          hv_pcibus_init = 0,
> >          hv_pcibus_probed,
> >          hv_pcibus_installed,
> > +        hv_pcibus_removed,
> >          hv_pcibus_maximum
> >  };
> >
> > @@ -1481,13 +1482,24 @@ static void pci_devices_present_work(struct work_struct *work)
> >                  put_pcichild(hpdev, hv_pcidev_ref_initial);
> >          }
> >
> > -        /* Tell the core to rescan bus because there may have been changes. */
> > -        if (hbus->state == hv
RE: [PATCH 1/2 v2] pci-hyperv: properly handle pci bus remove
> -----Original Message-----
> From: Bjorn Helgaas [mailto:helg...@kernel.org]
> Sent: Tuesday, September 27, 2016 12:30 PM
> To: Long Li
> Cc: KY Srinivasan; Haiyang Zhang; Bjorn Helgaas;
> de...@linuxdriverproject.org; linux-...@vger.kernel.org;
> linux-ker...@vger.kernel.org; Long Li
> Subject: Re: [PATCH 1/2 v2] pci-hyperv: properly handle pci bus remove
>
> On Wed, Sep 14, 2016 at 07:10:01PM -0700, Long Li wrote:
> > From: Long Li
> >
> > hv_pci_devices_present is called in hv_pci_remove when we remove a PCI
> > device from host (e.g. by disabling SRIOV on a device). In hv_pci_remove,
> > the bus is already removed before the call, so we don't need to rescan the
> > bus in the workqueue scheduled from hv_pci_devices_present. By
> > introducing status hv_pcibus_removed, we can avoid this situation.
> >
> > The patch fixes the following kernel panic.
> >
> > [ 383.853124] Workqueue: events pci_devices_present_work [pci_hyperv]
> > [ 383.853124] task: 88007f5f8000 ti: 88007f60 task.ti: 88007f60
> > [ 383.853124] RIP: 0010:[] [] pci_is_pcie+0x6/0x20
> > [ 383.853124] RSP: 0018:88007f603d38 EFLAGS: 00010206
> > [ 383.853124] RAX: 88007f5f8000 RBX: 642f3d4854415056 RCX: 88007f603fd8
> > [ 383.853124] RDX: RSI: RDI: 642f3d4854415056
> > [ 383.853124] RBP: 88007f603d68 R08: 0246 R09: a045eb9e
> > [ 383.853124] R10: 88007b419a80 R11: ea0001c0ef40 R12: 880003ee1c00
> > [ 383.853124] R13: 63702f30303a3137 R14: R15: 0246
> > [ 383.853124] FS: () GS:88007b40() knlGS:
> > [ 383.853124] CS: 0010 DS: ES: CR0: 80050033
> > [ 383.853124] CR2: 7f68b3f52350 CR3: 03546000 CR4: 000406f0
> > [ 383.853124] DR0: DR1: DR2:
> > [ 383.853124] DR3: DR6: 0ff0 DR7: 0400
> > [ 383.853124] Stack:
> > [ 383.853124] 88007f603d68 8134db17 0008 880003ee1c00
> > [ 383.853124] 63702f30303a3137 880003d8edb8 88007f603da0 8134ee2d
> > [ 383.853124] 880003d8ed00 88007f603dd8 880075fec320 880003d8edb8
> > [ 383.853124] Call Trace:
> > [ 383.853124] [] ? pci_scan_slot+0x27/0x140
> > [ 383.853124] [] pci_scan_child_bus+0x3d/0x150
> > [ 383.853124] [] pci_devices_present_work+0x3ea/0x400 [pci_hyperv]
> > [ 383.853124] [] process_one_work+0x17b/0x470
> > [ 383.853124] [] worker_thread+0x126/0x410
> > [ 383.853124] [] ? rescuer_thread+0x460/0x460
> > [ 383.853124] [] kthread+0xcf/0xe0
> > [ 383.853124] [] ? kthread_create_on_node+0x140/0x140
> > [ 383.853124] [] ret_from_fork+0x58/0x90
> > [ 383.853124] [] ? kthread_create_on_node+0x140/0x140
> > [ 383.853124] Code: 89 e5 5d 25 f0 00 00 00 c1 f8 04 c3 66 0f 1f 84 00
> > 00 00 00 00 66 66 66 66 90 55 0f b6 47 4a 48 89 e5 5d c3 90 66 66 66 66
> > 90 55 <80> 7f 4a 00 48 89 e5 5d 0f 95 c0 c3 0f 1f 40 00 66 2e 0f 1f 84
> > [ 383.853124] RIP [] pci_is_pcie+0x6/0x20
> > [ 383.853124] RSP
>
> Personally, I would remove the timestamps and addresses from this trace
> because I don't think they contribute to diagnosing the problem.

Thanks Bjorn. I will remove those kernel traces and send a v3 patch.

>
> > Signed-off-by: Long Li
>
> I'm ready to apply these but am waiting for an ack from the maintainers
> listed in MAINTAINERS (feel free to update that if it's out of date).
>
> > ---
> >  drivers/pci/host/pci-hyperv.c | 20 +---
> >  1 file changed, 17 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/pci/host/pci-hyperv.c b/drivers/pci/host/pci-hyperv.c
> > index a8deeca..4a37598 100644
> > --- a/drivers/pci/host/pci-hyperv.c
> > +++ b/drivers/pci/host/pci-hyperv.c
> > @@ -348,6 +348,7 @@ enum hv_pcibus_state {
> >          hv_pcibus_init = 0,
> >          hv_pcibus_probed,
> >          hv_pcibus_installed,
> > +        hv_pcibus_removed,
> >          hv_pcibus_maximum
> >  };
> >
> > @@ -1481,13 +1482,24 @@ static void pci_devices_present_work(struct work_struct *work)
> >                  put_pcichild(hpdev, hv_pcidev_ref_initial);
> >          }
> >
> > -        /* Tell the core to rescan bus because there may have been changes. */
Re: [PATCH 1/2 v2] pci-hyperv: properly handle pci bus remove
On Wed, Sep 14, 2016 at 07:10:01PM -0700, Long Li wrote:
> From: Long Li
>
> hv_pci_devices_present is called in hv_pci_remove when we remove a PCI device
> from host (e.g. by disabling SRIOV on a device). In hv_pci_remove, the bus is
> already removed before the call, so we don't need to rescan the bus in the
> workqueue scheduled from hv_pci_devices_present. By introducing status
> hv_pcibus_removed, we can avoid this situation.
>
> The patch fixes the following kernel panic.
>
> [ 383.853124] Workqueue: events pci_devices_present_work [pci_hyperv]
> [ 383.853124] task: 88007f5f8000 ti: 88007f60 task.ti: 88007f60
> [ 383.853124] RIP: 0010:[] [] pci_is_pcie+0x6/0x20
> [ 383.853124] RSP: 0018:88007f603d38 EFLAGS: 00010206
> [ 383.853124] RAX: 88007f5f8000 RBX: 642f3d4854415056 RCX: 88007f603fd8
> [ 383.853124] RDX: RSI: RDI: 642f3d4854415056
> [ 383.853124] RBP: 88007f603d68 R08: 0246 R09: a045eb9e
> [ 383.853124] R10: 88007b419a80 R11: ea0001c0ef40 R12: 880003ee1c00
> [ 383.853124] R13: 63702f30303a3137 R14: R15: 0246
> [ 383.853124] FS: () GS:88007b40() knlGS:
> [ 383.853124] CS: 0010 DS: ES: CR0: 80050033
> [ 383.853124] CR2: 7f68b3f52350 CR3: 03546000 CR4: 000406f0
> [ 383.853124] DR0: DR1: DR2:
> [ 383.853124] DR3: DR6: 0ff0 DR7: 0400
> [ 383.853124] Stack:
> [ 383.853124] 88007f603d68 8134db17 0008 880003ee1c00
> [ 383.853124] 63702f30303a3137 880003d8edb8 88007f603da0 8134ee2d
> [ 383.853124] 880003d8ed00 88007f603dd8 880075fec320 880003d8edb8
> [ 383.853124] Call Trace:
> [ 383.853124] [] ? pci_scan_slot+0x27/0x140
> [ 383.853124] [] pci_scan_child_bus+0x3d/0x150
> [ 383.853124] [] pci_devices_present_work+0x3ea/0x400 [pci_hyperv]
> [ 383.853124] [] process_one_work+0x17b/0x470
> [ 383.853124] [] worker_thread+0x126/0x410
> [ 383.853124] [] ? rescuer_thread+0x460/0x460
> [ 383.853124] [] kthread+0xcf/0xe0
> [ 383.853124] [] ? kthread_create_on_node+0x140/0x140
> [ 383.853124] [] ret_from_fork+0x58/0x90
> [ 383.853124] [] ? kthread_create_on_node+0x140/0x140
> [ 383.853124] Code: 89 e5 5d 25 f0 00 00 00 c1 f8 04 c3 66 0f 1f 84 00
> 00 00 00 00 66 66 66 66 90 55 0f b6 47 4a 48 89 e5 5d c3 90 66 66 66 66
> 90 55 <80> 7f 4a 00 48 89 e5 5d 0f 95 c0 c3 0f 1f 40 00 66 2e 0f 1f 84
> [ 383.853124] RIP [] pci_is_pcie+0x6/0x20
> [ 383.853124] RSP

Personally, I would remove the timestamps and addresses from this trace
because I don't think they contribute to diagnosing the problem.

> Signed-off-by: Long Li

I'm ready to apply these but am waiting for an ack from the maintainers
listed in MAINTAINERS (feel free to update that if it's out of date).

> ---
>  drivers/pci/host/pci-hyperv.c | 20 +---
>  1 file changed, 17 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/pci/host/pci-hyperv.c b/drivers/pci/host/pci-hyperv.c
> index a8deeca..4a37598 100644
> --- a/drivers/pci/host/pci-hyperv.c
> +++ b/drivers/pci/host/pci-hyperv.c
> @@ -348,6 +348,7 @@ enum hv_pcibus_state {
>          hv_pcibus_init = 0,
>          hv_pcibus_probed,
>          hv_pcibus_installed,
> +        hv_pcibus_removed,
>          hv_pcibus_maximum
>  };
>
> @@ -1481,13 +1482,24 @@ static void pci_devices_present_work(struct work_struct *work)
>                  put_pcichild(hpdev, hv_pcidev_ref_initial);
>          }
>
> -        /* Tell the core to rescan bus because there may have been changes. */
> -        if (hbus->state == hv_pcibus_installed) {
> +        switch (hbus->state) {
> +        case hv_pcibus_installed:
> +                /*
> +                 * Tell the core to rescan bus
> +                 * because there may have been changes.
> +                 */
>                  pci_lock_rescan_remove();
>                  pci_scan_child_bus(hbus->pci_bus);
>                  pci_unlock_rescan_remove();
> -        } else {
> +                break;
> +
> +        case hv_pcibus_init:
> +        case hv_pcibus_probed:
>                  survey_child_resources(hbus);
> +                break;
> +
> +        default:
> +                break;
>          }
>
>          up(&hbus->enum_sem);
> @@ -2163,6 +2175,7 @@ static int hv_pci_probe(struct hv_device *hdev,
>          hbus = kzalloc(sizeof(*hbus), GFP_KERNEL);
>          if (!hbus)
>                  return -ENOMEM;
> +        hbus->state = hv_pcibus_init;
>
>          /*
>           * The PCI bus "domain" is what is called "segment" in ACPI and
> @@ -2305,6 +2318,7 @@ static int hv_pci_remove(struct hv_device *hdev)
>          pci_stop_root_bus(hbus->pci_bus);
>          pci_remove_root_bus(hbus->pci_bus);
>          pci_unlock_rescan_remove();
> +        hbus->state =
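
For anyone following the state handling in the quoted hunk, here is a small userspace sketch of the decision that the new switch in pci_devices_present_work() makes. It is not the driver code: the enum mirrors hv_pcibus_state from the patch with shortened names, while enum work_action and present_work_action() are hypothetical names made up for illustration. The point is only that a bus in the removed state falls through to "do nothing" instead of being rescanned.

```c
#include <assert.h>

/* Mirrors enum hv_pcibus_state from the patch; names shortened for this sketch. */
enum bus_state {
    BUS_INIT = 0,
    BUS_PROBED,
    BUS_INSTALLED,
    BUS_REMOVED,    /* the state the patch introduces */
    BUS_MAXIMUM
};

/* Hypothetical type: what the work item would do next. Not a kernel type. */
enum work_action {
    ACTION_NONE,    /* bus already removed: neither rescan nor survey */
    ACTION_SURVEY,  /* stands in for survey_child_resources() */
    ACTION_RESCAN   /* stands in for pci_scan_child_bus() under the rescan lock */
};

/* Hypothetical helper modelling the switch added to the work function. */
enum work_action present_work_action(enum bus_state state)
{
    assert(state < BUS_MAXIMUM);

    switch (state) {
    case BUS_INSTALLED:
        return ACTION_RESCAN;
    case BUS_INIT:
    case BUS_PROBED:
        return ACTION_SURVEY;
    default:
        /* BUS_REMOVED lands here, so the freed bus is never touched. */
        return ACTION_NONE;
    }
}
```

With the old if/else, any state other than installed took the survey path; the explicit cases plus a default are what let the removed state skip both paths.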