Re: [PATCH] nvme-pci: Fix NULL ptr deref in EEH code
On Tue, Mar 20, 2018 at 11:22:42AM +1100, Michael Neuling wrote:
> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> index b6f43b738f..404b346e3c 100644
> --- a/drivers/nvme/host/pci.c
> +++ b/drivers/nvme/host/pci.c
> @@ -2626,6 +2626,9 @@ static pci_ers_result_t nvme_error_detected(struct pci_dev *pdev,
>  {
>  	struct nvme_dev *dev = pci_get_drvdata(pdev);
>
> +	if (!dev)
> +		return PCI_ERS_RESULT_NEED_RESET;

This implies the method has been called before ->probe has been finished
or after ->remove has been called. That would be fundamentally racy
and needs to be fixed in the PCI layer, not papered over in drivers.
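To illustrate the window being discussed, here is a minimal userspace sketch in C. It is not kernel code: every mock_* name and MOCK_ERS_* value is a hypothetical stand-in for pci_set_drvdata(), pci_get_drvdata() and the pci_ers_result_t values, and it assumes the driver data pointer stays NULL until probe stores it, which is what the check in the patch relies on. It only shows what the guard does when the handler runs inside that window; it does not model or close the race itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's pci_ers_result_t values. */
enum mock_ers_result {
	MOCK_ERS_CAN_RECOVER,
	MOCK_ERS_NEED_RESET,
};

/* Hypothetical, heavily reduced versions of the driver and PCI structs. */
struct mock_nvme_dev {
	int online;
};

struct mock_pci_dev {
	void *drvdata;	/* assumed NULL until "probe" stores a pointer */
};

/* Stand-in for pci_set_drvdata(): what probe does once setup is done. */
void mock_set_drvdata(struct mock_pci_dev *pdev, void *data)
{
	assert(pdev != NULL);
	pdev->drvdata = data;
}

/* Stand-in for pci_get_drvdata(). */
void *mock_get_drvdata(const struct mock_pci_dev *pdev)
{
	return pdev->drvdata;
}

/*
 * Simplified error handler with the guard from the patch. Without the
 * NULL check, the write to dev->online below would dereference NULL
 * whenever the handler runs before probe has stored the driver data.
 */
enum mock_ers_result mock_error_detected(struct mock_pci_dev *pdev,
					 int frozen)
{
	struct mock_nvme_dev *dev = mock_get_drvdata(pdev);

	if (!dev)
		return MOCK_ERS_NEED_RESET;	/* the guard under discussion */

	if (!frozen)
		return MOCK_ERS_CAN_RECOVER;

	dev->online = 0;	/* stands in for quiescing the controller */
	return MOCK_ERS_NEED_RESET;
}
```

Calling mock_error_detected() on a device whose drvdata is still NULL returns MOCK_ERS_NEED_RESET without touching any driver state. That avoids the oops, but the handler still ran on a device the driver does not yet (or no longer) own, which is the objection raised in the reply.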
[PATCH] nvme-pci: Fix NULL ptr deref in EEH code
On powerpc on boot we can take an EEH event which results in this oops.

cpu 0x23: Vector: 300 (Data Access) at [c00ff50f3800]
    pc: c008089a0eb0: nvme_error_detected+0x4c/0x90 [nvme]
    lr: c0026564: eeh_report_error+0xe0/0x110
    sp: c00ff50f3a80
   msr: 90009033
   dar: 400
 dsisr: 4000
  current = 0xc00ff507c000
  paca    = 0xcfdc9d80   softe: 0   irq_happened: 0x01
    pid   = 782, comm = eehd
Linux version 4.15.6-openpower1 (smc@smc-desktop) (gcc version 6.4.0 (Buildroot 2017.11.2-8-g4b6188e)) #2 SMP Tue Feb 27 12:33:27 PST 2018
enter ? for help
[c00ff50f3af0] c0026564 eeh_report_error+0xe0/0x110
[c00ff50f3b30] c0025520 eeh_pe_dev_traverse+0xc0/0xdc
[c00ff50f3bc0] c0026bd0 eeh_handle_normal_event+0x184/0x4c4
[c00ff50f3c70] c0026ff4 eeh_handle_event+0x30/0x288
[c00ff50f3d10] c002758c eeh_event_handler+0x124/0x170
[c00ff50f3dc0] c008fed0 kthread+0x14c/0x154
[c00ff50f3e30] c000b594 ret_from_kernel_thread+0x5c/0xc8

This fixes the NULL ptr deref.

Signed-off-by: Michael Neuling
---
 drivers/nvme/host/pci.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index b6f43b738f..404b346e3c 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -2626,6 +2626,9 @@ static pci_ers_result_t nvme_error_detected(struct pci_dev *pdev,
 {
 	struct nvme_dev *dev = pci_get_drvdata(pdev);
 
+	if (!dev)
+		return PCI_ERS_RESULT_NEED_RESET;
+
 	/*
 	 * A frozen channel requires a reset. When detected, this method will
 	 * shutdown the controller to quiesce. The controller will be restarted
-- 
2.14.1