Re: fix nvme(4): NULL deref. and empty device attachments

2021-02-24 Thread David Gwynne
ok

> On 25 Feb 2021, at 02:34, Jan Klemkow  wrote:
> 
> Hi,
> 
> While attaching the following disks, the nvme driver runs into a NULL
> dereference in nvme_scsi_capacity16() and nvme_scsi_capacity().
> 
> nvme0 at pci1 dev 0 function 0 vendor "Intel", unknown product 0x0a54 rev 
> 0x00: msix, NVMe 1.2
> nvme0: INTEL SSDPE2KX040T8, firmware VDV10170, serial PHLJ0413002P4P0DGN
> scsibus1 at nvme0: 129 targets, initiator 0
> sd0 at scsibus1 targ 1 lun 0: 
> sd0: 3815447MB, 512 bytes/sector, 7814037168 sectors
> sd1 at scsibus1 targ 2 lun 0: 
> uvm_fault(0x821d00e8, 0x0, 0, 1) -> e
> kernel: page fault trap, code=0
> Stopped at  nvme_scsi_capacity16+0x39:  movq0(%rax),%rcx
> ddb{0}>
> 
> "ns" in both functions will be NULL, if "identify" is not allocated in
> nvme_scsi_probe().  Thus, its better to just not attach this empty
> disks/LUNs.
> 
> nvme0 at pci1 dev 0 function 0 vendor "Intel", unknown product 0x0a54 rev 
> 0x00: msix, NVMe 1.2
> nvme0: INTEL SSDPE2KX040T8, firmware VDV10170, serial PHLJ0413002P4P0DGN
> scsibus1 at nvme0: 129 targets, initiator 0
> sd0 at scsibus1 targ 1 lun 0: 
> sd0: 3815447MB, 512 bytes/sector, 7814037168 sectors
> ppb1 at pci0 dev 3 function 2 "AMD 17h PCIE" rev 0x00: msi
> pci2 at ppb1 bus 98
> nvme1 at pci2 dev 0 function 0 vendor "Intel", unknown product 0x0a54 rev 
> 0x00: msix, NVMe 1.2
> nvme1: INTEL SSDPE2KX040T8, firmware VDV10170, serial PHLJ041500C34P0DGN
> scsibus2 at nvme1: 129 targets, initiator 0
> sd1 at scsibus2 targ 1 lun 0: 
> sd1: 3815447MB, 512 bytes/sector, 7814037168 sectors
> ppb2 at pci0 dev 3 function 3 "AMD 17h PCIE" rev 0x00: msi
> pci3 at ppb2 bus 99
> nvme2 at pci3 dev 0 function 0 vendor "Intel", unknown product 0x0a54 rev 
> 0x00: msix, NVMe 1.2
> nvme2: INTEL SSDPE2KX040T8, firmware VDV10170, serial PHLJ041402Z64P0DGN
> scsibus3 at nvme2: 129 targets, initiator 0
> sd2 at scsibus3 targ 1 lun 0: 
> sd2: 3815447MB, 512 bytes/sector, 7814037168 sectors
> ppb3 at pci0 dev 3 function 4 "AMD 17h PCIE" rev 0x00: msi
> pci4 at ppb3 bus 100
> nvme3 at pci4 dev 0 function 0 vendor "Intel", unknown product 0x0a54 rev 
> 0x00: msix, NVMe 1.2
> nvme3: INTEL SSDPE2KX040T8, firmware VDV10170, serial PHLJ041403134P0DGN
> scsibus4 at nvme3: 129 targets, initiator 0
> sd3 at scsibus4 targ 1 lun 0: 
> sd3: 3815447MB, 512 bytes/sector, 7814037168 sectors
> 
> The following diff signals an error for the upper probing function in
> the SCSI layer to prevents further function calls in nvme(4) which would
> just leads to the upper described error and hundreds of not configured
> devices.
> 
> OK?
> 
> bye,
> Jan
> 
> Index: dev/ic/nvme.c
> ===
> RCS file: /cvs//src/sys/dev/ic/nvme.c,v
> retrieving revision 1.90
> diff -u -p -r1.90 nvme.c
> --- dev/ic/nvme.c 9 Feb 2021 01:50:10 -   1.90
> +++ dev/ic/nvme.c 24 Feb 2021 16:01:48 -
> @@ -463,11 +463,16 @@ nvme_scsi_probe(struct scsi_link *link)
>   scsi_io_put(&sc->sc_iopool, ccb);
> 
>   identify = NVME_DMA_KVA(mem);
> - if (rv == 0 && lemtoh64(&identify->nsze) > 0) {
> - /* Commit namespace if it has a size greater than zero. */
> - identify = malloc(sizeof(*identify), M_DEVBUF, M_WAITOK);
> - memcpy(identify, NVME_DMA_KVA(mem), sizeof(*identify));
> - sc->sc_namespaces[link->target].ident = identify;
> + if (rv == 0) {
> + if (lemtoh64(&identify->nsze) > 0) {
> + /* Commit namespace if it has a size greater than zero. 
> */
> + identify = malloc(sizeof(*identify), M_DEVBUF, 
> M_WAITOK);
> + memcpy(identify, NVME_DMA_KVA(mem), sizeof(*identify));
> + sc->sc_namespaces[link->target].ident = identify;
> + } else {
> + /* Don't attach a namespace if its size is zero. */
> + rv = ENXIO;
> + }
>   }
> 
>   nvme_dmamem_free(sc, mem);
> 



Re: fix nvme(4): NULL deref. and empty device attachments

2021-02-24 Thread Patrick Wildt
Am Wed, Feb 24, 2021 at 05:34:48PM +0100 schrieb Jan Klemkow:
> Hi,
> 
> While attaching the following disks, the nvme driver runs into a NULL
> dereference in nvme_scsi_capacity16() and nvme_scsi_capacity().
> 
> nvme0 at pci1 dev 0 function 0 vendor "Intel", unknown product 0x0a54 rev 
> 0x00: msix, NVMe 1.2
> nvme0: INTEL SSDPE2KX040T8, firmware VDV10170, serial PHLJ0413002P4P0DGN
> scsibus1 at nvme0: 129 targets, initiator 0
> sd0 at scsibus1 targ 1 lun 0: 
> sd0: 3815447MB, 512 bytes/sector, 7814037168 sectors
> sd1 at scsibus1 targ 2 lun 0: 
> uvm_fault(0x821d00e8, 0x0, 0, 1) -> e
> kernel: page fault trap, code=0
> Stopped at  nvme_scsi_capacity16+0x39:  movq0(%rax),%rcx
> ddb{0}>
> 
> "ns" in both functions will be NULL, if "identify" is not allocated in
> nvme_scsi_probe().  Thus, its better to just not attach this empty
> disks/LUNs.
> 
> nvme0 at pci1 dev 0 function 0 vendor "Intel", unknown product 0x0a54 rev 
> 0x00: msix, NVMe 1.2
> nvme0: INTEL SSDPE2KX040T8, firmware VDV10170, serial PHLJ0413002P4P0DGN
> scsibus1 at nvme0: 129 targets, initiator 0
> sd0 at scsibus1 targ 1 lun 0: 
> sd0: 3815447MB, 512 bytes/sector, 7814037168 sectors
> ppb1 at pci0 dev 3 function 2 "AMD 17h PCIE" rev 0x00: msi
> pci2 at ppb1 bus 98
> nvme1 at pci2 dev 0 function 0 vendor "Intel", unknown product 0x0a54 rev 
> 0x00: msix, NVMe 1.2
> nvme1: INTEL SSDPE2KX040T8, firmware VDV10170, serial PHLJ041500C34P0DGN
> scsibus2 at nvme1: 129 targets, initiator 0
> sd1 at scsibus2 targ 1 lun 0: 
> sd1: 3815447MB, 512 bytes/sector, 7814037168 sectors
> ppb2 at pci0 dev 3 function 3 "AMD 17h PCIE" rev 0x00: msi
> pci3 at ppb2 bus 99
> nvme2 at pci3 dev 0 function 0 vendor "Intel", unknown product 0x0a54 rev 
> 0x00: msix, NVMe 1.2
> nvme2: INTEL SSDPE2KX040T8, firmware VDV10170, serial PHLJ041402Z64P0DGN
> scsibus3 at nvme2: 129 targets, initiator 0
> sd2 at scsibus3 targ 1 lun 0: 
> sd2: 3815447MB, 512 bytes/sector, 7814037168 sectors
> ppb3 at pci0 dev 3 function 4 "AMD 17h PCIE" rev 0x00: msi
> pci4 at ppb3 bus 100
> nvme3 at pci4 dev 0 function 0 vendor "Intel", unknown product 0x0a54 rev 
> 0x00: msix, NVMe 1.2
> nvme3: INTEL SSDPE2KX040T8, firmware VDV10170, serial PHLJ041403134P0DGN
> scsibus4 at nvme3: 129 targets, initiator 0
> sd3 at scsibus4 targ 1 lun 0: 
> sd3: 3815447MB, 512 bytes/sector, 7814037168 sectors
> 
> The following diff signals an error for the upper probing function in
> the SCSI layer to prevents further function calls in nvme(4) which would
> just leads to the upper described error and hundreds of not configured
> devices.
> 
> OK?

I think this is the correct way to fix it.  The issue essentially is
that we still return "ok" even though the size is zero.  And we should
probably fail similarly as if it didn't exist.  FreeBSD has a similar
check here, which explains it a little:

/*
 * If the size of is zero, chances are this isn't a valid
 * namespace (eg one that's not been configured yet). The
 * standard says the entire id will be zeros, so this is a
 * cheap way to test for that.
 */
if (ns->data.nsze == 0)
return (ENXIO);

I wondered if this block of code could be written a bit differently, but
I couldn't make it look any better.

ok patrick@ fwiw

> bye,
> Jan
> 
> Index: dev/ic/nvme.c
> ===
> RCS file: /cvs//src/sys/dev/ic/nvme.c,v
> retrieving revision 1.90
> diff -u -p -r1.90 nvme.c
> --- dev/ic/nvme.c 9 Feb 2021 01:50:10 -   1.90
> +++ dev/ic/nvme.c 24 Feb 2021 16:01:48 -
> @@ -463,11 +463,16 @@ nvme_scsi_probe(struct scsi_link *link)
>   scsi_io_put(&sc->sc_iopool, ccb);
>  
>   identify = NVME_DMA_KVA(mem);
> - if (rv == 0 && lemtoh64(&identify->nsze) > 0) {
> - /* Commit namespace if it has a size greater than zero. */
> - identify = malloc(sizeof(*identify), M_DEVBUF, M_WAITOK);
> - memcpy(identify, NVME_DMA_KVA(mem), sizeof(*identify));
> - sc->sc_namespaces[link->target].ident = identify;
> + if (rv == 0) {
> + if (lemtoh64(&identify->nsze) > 0) {
> + /* Commit namespace if it has a size greater than zero. 
> */
> + identify = malloc(sizeof(*identify), M_DEVBUF, 
> M_WAITOK);
> + memcpy(identify, NVME_DMA_KVA(mem), sizeof(*identify));
> + sc->sc_namespaces[link->target].ident = identify;
> + } else {
> + /* Don't attach a namespace if its size is zero. */
> + rv = ENXIO;
> + }
>   }
>  
>   nvme_dmamem_free(sc, mem);
> 



fix nvme(4): NULL deref. and empty device attachments

2021-02-24 Thread Jan Klemkow
Hi,

While attaching the following disks, the nvme driver runs into a NULL
dereference in nvme_scsi_capacity16() and nvme_scsi_capacity().

nvme0 at pci1 dev 0 function 0 vendor "Intel", unknown product 0x0a54 rev 0x00: 
msix, NVMe 1.2
nvme0: INTEL SSDPE2KX040T8, firmware VDV10170, serial PHLJ0413002P4P0DGN
scsibus1 at nvme0: 129 targets, initiator 0
sd0 at scsibus1 targ 1 lun 0: 
sd0: 3815447MB, 512 bytes/sector, 7814037168 sectors
sd1 at scsibus1 targ 2 lun 0: 
uvm_fault(0x821d00e8, 0x0, 0, 1) -> e
kernel: page fault trap, code=0
Stopped at  nvme_scsi_capacity16+0x39:  movq0(%rax),%rcx
ddb{0}>

"ns" in both functions will be NULL, if "identify" is not allocated in
nvme_scsi_probe().  Thus, its better to just not attach this empty
disks/LUNs.

nvme0 at pci1 dev 0 function 0 vendor "Intel", unknown product 0x0a54 rev 0x00: 
msix, NVMe 1.2
nvme0: INTEL SSDPE2KX040T8, firmware VDV10170, serial PHLJ0413002P4P0DGN
scsibus1 at nvme0: 129 targets, initiator 0
sd0 at scsibus1 targ 1 lun 0: 
sd0: 3815447MB, 512 bytes/sector, 7814037168 sectors
ppb1 at pci0 dev 3 function 2 "AMD 17h PCIE" rev 0x00: msi
pci2 at ppb1 bus 98
nvme1 at pci2 dev 0 function 0 vendor "Intel", unknown product 0x0a54 rev 0x00: 
msix, NVMe 1.2
nvme1: INTEL SSDPE2KX040T8, firmware VDV10170, serial PHLJ041500C34P0DGN
scsibus2 at nvme1: 129 targets, initiator 0
sd1 at scsibus2 targ 1 lun 0: 
sd1: 3815447MB, 512 bytes/sector, 7814037168 sectors
ppb2 at pci0 dev 3 function 3 "AMD 17h PCIE" rev 0x00: msi
pci3 at ppb2 bus 99
nvme2 at pci3 dev 0 function 0 vendor "Intel", unknown product 0x0a54 rev 0x00: 
msix, NVMe 1.2
nvme2: INTEL SSDPE2KX040T8, firmware VDV10170, serial PHLJ041402Z64P0DGN
scsibus3 at nvme2: 129 targets, initiator 0
sd2 at scsibus3 targ 1 lun 0: 
sd2: 3815447MB, 512 bytes/sector, 7814037168 sectors
ppb3 at pci0 dev 3 function 4 "AMD 17h PCIE" rev 0x00: msi
pci4 at ppb3 bus 100
nvme3 at pci4 dev 0 function 0 vendor "Intel", unknown product 0x0a54 rev 0x00: 
msix, NVMe 1.2
nvme3: INTEL SSDPE2KX040T8, firmware VDV10170, serial PHLJ041403134P0DGN
scsibus4 at nvme3: 129 targets, initiator 0
sd3 at scsibus4 targ 1 lun 0: 
sd3: 3815447MB, 512 bytes/sector, 7814037168 sectors

The following diff signals an error for the upper probing function in
the SCSI layer to prevents further function calls in nvme(4) which would
just leads to the upper described error and hundreds of not configured
devices.

OK?

bye,
Jan

Index: dev/ic/nvme.c
===
RCS file: /cvs//src/sys/dev/ic/nvme.c,v
retrieving revision 1.90
diff -u -p -r1.90 nvme.c
--- dev/ic/nvme.c   9 Feb 2021 01:50:10 -   1.90
+++ dev/ic/nvme.c   24 Feb 2021 16:01:48 -
@@ -463,11 +463,16 @@ nvme_scsi_probe(struct scsi_link *link)
scsi_io_put(&sc->sc_iopool, ccb);
 
identify = NVME_DMA_KVA(mem);
-   if (rv == 0 && lemtoh64(&identify->nsze) > 0) {
-   /* Commit namespace if it has a size greater than zero. */
-   identify = malloc(sizeof(*identify), M_DEVBUF, M_WAITOK);
-   memcpy(identify, NVME_DMA_KVA(mem), sizeof(*identify));
-   sc->sc_namespaces[link->target].ident = identify;
+   if (rv == 0) {
+   if (lemtoh64(&identify->nsze) > 0) {
+   /* Commit namespace if it has a size greater than zero. 
*/
+   identify = malloc(sizeof(*identify), M_DEVBUF, 
M_WAITOK);
+   memcpy(identify, NVME_DMA_KVA(mem), sizeof(*identify));
+   sc->sc_namespaces[link->target].ident = identify;
+   } else {
+   /* Don't attach a namespace if its size is zero. */
+   rv = ENXIO;
+   }
}
 
nvme_dmamem_free(sc, mem);