Re: [PATCH] nvme-pci: Fix NULL ptr deref in EEH code

2018-03-20 Thread Christoph Hellwig
On Tue, Mar 20, 2018 at 11:22:42AM +1100, Michael Neuling wrote:
> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> index b6f43b738f..404b346e3c 100644
> --- a/drivers/nvme/host/pci.c
> +++ b/drivers/nvme/host/pci.c
> @@ -2626,6 +2626,9 @@ static pci_ers_result_t nvme_error_detected(struct pci_dev *pdev,
>  {
>  	struct nvme_dev *dev = pci_get_drvdata(pdev);
>  
> +	if (!dev)
> +		return PCI_ERS_RESULT_NEED_RESET;

This implies the method has been called before ->probe has finished
or after ->remove has been called.  That would be fundamentally racy
and needs to be fixed in the PCI layer, not papered over in drivers.
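
To illustrate the point, here is a minimal userspace sketch, not kernel
code. Every model_* name below is a hypothetical stand-in for the real
pci_dev, nvme_dev, pci_get_drvdata() and pci_set_drvdata(). It only shows
the three lifecycle states in which the handler could run. Being
single-threaded, it cannot reproduce the race itself; the comment in the
handler marks where a concurrent ->remove would hurt.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures. */
struct model_nvme_dev { int online; };
struct model_pci_dev  { void *driver_data; };

enum model_ers_result { MODEL_ERS_NEED_RESET, MODEL_ERS_CAN_RECOVER };

static void *model_get_drvdata(const struct model_pci_dev *pdev)
{
	return pdev->driver_data;
}

static void model_set_drvdata(struct model_pci_dev *pdev, void *data)
{
	pdev->driver_data = data;
}

/* Models ->probe: driver data is published only at the end of setup. */
static void model_probe(struct model_pci_dev *pdev, struct model_nvme_dev *dev)
{
	dev->online = 1;
	model_set_drvdata(pdev, dev);
}

/* Models ->remove: driver data is cleared again. */
static void model_remove(struct model_pci_dev *pdev)
{
	model_set_drvdata(pdev, NULL);
}

/* Models the error handler with the NULL check from the patch. */
static enum model_ers_result model_error_detected(struct model_pci_dev *pdev)
{
	struct model_nvme_dev *dev = model_get_drvdata(pdev);

	if (!dev)
		return MODEL_ERS_NEED_RESET;

	/*
	 * In a concurrent setting, remove could run right here, after the
	 * check has passed. The check narrows the window but cannot close
	 * it, which is why serialization has to come from the caller.
	 */
	return dev->online ? MODEL_ERS_CAN_RECOVER : MODEL_ERS_NEED_RESET;
}
```

Calling model_error_detected() before model_probe() or after
model_remove() takes the NULL branch; calling it in between reaches the
dereference.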


[PATCH] nvme-pci: Fix NULL ptr deref in EEH code

2018-03-19 Thread Michael Neuling
On powerpc, we can take an EEH event at boot, which results in this oops:

cpu 0x23: Vector: 300 (Data Access) at [c00ff50f3800]
pc: c008089a0eb0: nvme_error_detected+0x4c/0x90 [nvme]
lr: c0026564: eeh_report_error+0xe0/0x110
sp: c00ff50f3a80
msr: 90009033
dar: 400
dsisr: 4000
current = 0xc00ff507c000
paca = 0xcfdc9d80 softe: 0 irq_happened: 0x01
pid = 782, comm = eehd
Linux version 4.15.6-openpower1 (smc@smc-desktop) (gcc version 6.4.0 (Buildroot 2017.11.2-8-g4b6188e)) #2 SMP Tue Feb 27 12:33:27 PST 2018
enter ? for help
[c00ff50f3af0] c0026564 eeh_report_error+0xe0/0x110
[c00ff50f3b30] c0025520 eeh_pe_dev_traverse+0xc0/0xdc
[c00ff50f3bc0] c0026bd0 eeh_handle_normal_event+0x184/0x4c4
[c00ff50f3c70] c0026ff4 eeh_handle_event+0x30/0x288
[c00ff50f3d10] c002758c eeh_event_handler+0x124/0x170
[c00ff50f3dc0] c008fed0 kthread+0x14c/0x154
[c00ff50f3e30] c000b594 ret_from_kernel_thread+0x5c/0xc8

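Fix the NULL pointer dereference by returning PCI_ERS_RESULT_NEED_RESET
from nvme_error_detected() when the device has no driver data set.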

Signed-off-by: Michael Neuling 
---
 drivers/nvme/host/pci.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index b6f43b738f..404b346e3c 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -2626,6 +2626,9 @@ static pci_ers_result_t nvme_error_detected(struct pci_dev *pdev,
 {
 	struct nvme_dev *dev = pci_get_drvdata(pdev);
 
+	if (!dev)
+		return PCI_ERS_RESULT_NEED_RESET;
+
 	/*
 	 * A frozen channel requires a reset. When detected, this method will
 	 * shutdown the controller to quiesce. The controller will be restarted
-- 
2.14.1


