Re: [PATCH] scsi: lpfc: Uninitialized variable in lpfc_debugfs_nodelist_data()
On Mon, 2018-10-22 at 09:50 +0300, Dan Carpenter wrote: > There was a merge problem and we accidentally removed the "nrport" > initialization. > > Fixes: 77c5bf5647b5 ("Merge branch 'misc' into for-next") > Signed-off-by: Dan Carpenter > --- > drivers/scsi/lpfc/lpfc_debugfs.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/scsi/lpfc/lpfc_debugfs.c > b/drivers/scsi/lpfc/lpfc_debugfs.c > index c6912394b56d..0c8005bb0f53 100644 > --- a/drivers/scsi/lpfc/lpfc_debugfs.c > +++ b/drivers/scsi/lpfc/lpfc_debugfs.c > @@ -550,7 +550,7 @@ lpfc_debugfs_nodelist_data(struct lpfc_vport > *vport, char *buf, int size) > struct lpfc_nodelist *ndlp; > unsigned char *statep; > struct nvme_fc_local_port *localport; > - struct nvme_fc_remote_port *nrport; > + struct nvme_fc_remote_port *nrport = NULL; Really? that's not the way I did the merge in my for-next: https://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git/commit/?h=for-next=1f86cc958e2a60cef07546b15bdce90713a05e80 77c5bf5647b5 looks to be the orphaned residue of a failed merge ... how did you find it because it's not in any of the public branches? James
[PATCH] scsi: myrs: fix build failure on 32 bit
For 32 bit versions we have to be careful about divisions of 64 bit quantities so use do_div() instead of a direct division. This fixes a warning about _uldivmod being undefined in certain configurations Fixes: 77266186397c ("scsi: myrs: Add Mylex RAID controller") Reported-by: kbuild test robot Signed-off-by: James Bottomley --- drivers/scsi/myrs.c | 7 --- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/drivers/scsi/myrs.c b/drivers/scsi/myrs.c index b02ee0b0dd55..a9f9c77e889f 100644 --- a/drivers/scsi/myrs.c +++ b/drivers/scsi/myrs.c @@ -1978,7 +1978,8 @@ myrs_get_resync(struct device *dev) struct scsi_device *sdev = to_scsi_device(dev); struct myrs_hba *cs = shost_priv(sdev->host); struct myrs_ldev_info *ldev_info = sdev->hostdata; - u8 percent_complete = 0, status; + u64 percent_complete = 0; + u8 status; if (sdev->channel < cs->ctlr_info->physchan_present || !ldev_info) return; @@ -1986,8 +1987,8 @@ myrs_get_resync(struct device *dev) unsigned short ldev_num = ldev_info->ldev_num; status = myrs_get_ldev_info(cs, ldev_num, ldev_info); - percent_complete = ldev_info->rbld_lba * 100 / - ldev_info->cfg_devsize; + percent_complete = ldev_info->rbld_lba * 100; + do_div(percent_complete, ldev_info->cfg_devsize); } raid_set_resync(myrs_raid_template, dev, percent_complete); } -- 2.16.4
Re: [scsi:misc 194/233] ERROR: "__aeabi_uldivmod" [drivers/scsi/myrs.ko] undefined!
On Thu, 2018-10-18 at 10:28 -0700, James Bottomley wrote: > On Fri, 2018-10-19 at 01:18 +0800, kbuild test robot wrote: > > tree: https://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.g > > it > > misc > > head: 4d5b4ac1eae471bcd0fa381ab4099cc33e94e15d > > commit: 77266186397c6c782a3f670d32808a9671806ec5 [194/233] scsi: > > myrs: Add Mylex RAID controller (SCSI interface) > > config: arm-allmodconfig (attached as .config) > > compiler: arm-linux-gnueabi-gcc (Debian 7.2.0-11) 7.2.0 > > reproduce: > > wget https://raw.githubusercontent.com/intel/lkp-tests/mast > > er > > /sbin/make.cross -O ~/bin/make.cross > > chmod +x ~/bin/make.cross > > git checkout 77266186397c6c782a3f670d32808a9671806ec5 > > # save the attached .config to linux build tree > > GCC_VERSION=7.2.0 make.cross ARCH=arm > > > > All errors (new ones prefixed by >>): > > > > > > ERROR: "__aeabi_uldivmod" [drivers/scsi/myrs.ko] undefined! > > I think this is the fix, can someone with an arm build check? This fix turned out to be bogus, but now I've dusted off my arm32 build environment (had to fix a bug with arm32 emulation in qemu 2.11.2 would you believe) and actually tried compiling it, the next patch is the fix (build tested but not boot tested). James
Re: [scsi:misc 194/233] ERROR: "__aeabi_uldivmod" [drivers/scsi/myrs.ko] undefined!
On Fri, 2018-10-19 at 01:18 +0800, kbuild test robot wrote: > tree: https://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git > misc > head: 4d5b4ac1eae471bcd0fa381ab4099cc33e94e15d > commit: 77266186397c6c782a3f670d32808a9671806ec5 [194/233] scsi: > myrs: Add Mylex RAID controller (SCSI interface) > config: arm-allmodconfig (attached as .config) > compiler: arm-linux-gnueabi-gcc (Debian 7.2.0-11) 7.2.0 > reproduce: > wget https://raw.githubusercontent.com/intel/lkp-tests/master > /sbin/make.cross -O ~/bin/make.cross > chmod +x ~/bin/make.cross > git checkout 77266186397c6c782a3f670d32808a9671806ec5 > # save the attached .config to linux build tree > GCC_VERSION=7.2.0 make.cross ARCH=arm > > All errors (new ones prefixed by >>): > > > > ERROR: "__aeabi_uldivmod" [drivers/scsi/myrs.ko] undefined! I think this is the fix, can someone with an arm build check? Thanks, James --- diff --git a/drivers/scsi/myrs.c b/drivers/scsi/myrs.c index b02ee0b0dd55..6bf3cb416505 100644 --- a/drivers/scsi/myrs.c +++ b/drivers/scsi/myrs.c @@ -1986,8 +1986,8 @@ myrs_get_resync(struct device *dev) unsigned short ldev_num = ldev_info->ldev_num; status = myrs_get_ldev_info(cs, ldev_num, ldev_info); - percent_complete = ldev_info->rbld_lba * 100 / - ldev_info->cfg_devsize; + percent_complete = do_div(ldev_info->rbld_lba * 100, + ldev_info->cfg_devsize); } raid_set_resync(myrs_raid_template, dev, percent_complete); }
Re: [possible bug] critical target error (out of range) when sending UNMAP on lsi2308 (mpt3sas) for the last sector of the drive
On Wed, 2018-10-10 at 16:55 +0200, Michal Soltys wrote: > Hi, > > I have a server with old server with lsi2308 ("it" mode) and sas2x36 > expander in front. I've been testing how it handles ssd drives (among > those if/how it copes with unmap). And it looks like > stomped on a bug (it's 100% reproducible in 4.14.74 and 4.18.12). > > The hardware in question: > > mpt2sas_cm0: LSISAS2308: FWVersion(20.00.07.00), ChipRevision(0x05), > BiosVersion(07.39.02.00) > scsi host6: Fusion MPT SAS Host > scsi 6:0:19:0: Enclosure LSI SAS2X36 0e12 PQ: 0 > ANSI: 5 > > Tested with old Samsung 840 pro drive. > > Initially I was just doing a simple blkdiscard (which on the drive in > question works just fine in plain sata controller), which greeted me > with unexpected: > > 16:00 # blkdiscard /dev/sdt > blkdiscard: /dev/sdt: BLKDISCARD ioctl failed: Remote I/O error > > Log in dmesg: > > [ 391.870979] sd 7:0:20:0: [sdt] 250069680 512-byte logical blocks: > (128 GB/119 GiB) > [ 391.872702] sd 7:0:20:0: [sdt] Write Protect is off > [ 391.872705] sd 7:0:20:0: [sdt] Mode Sense: 7f 00 10 08 > [ 391.872962] sd 7:0:20:0: [sdt] Write cache: enabled, read cache: > enabled, supports DPO and FUA > ... > [ 529.907273] sd 7:0:20:0: [sdt] tag#3625 FAILED Result: > hostbyte=DID_OK driverbyte=DRIVER_SENSE > [ 529.907277] sd 7:0:20:0: [sdt] tag#3625 Sense Key : Illegal > Request [current] > [ 529.907279] sd 7:0:20:0: [sdt] tag#3625 Add. Sense: Logical block > address out of range > [ 529.907281] sd 7:0:20:0: [sdt] tag#3625 CDB: Unmap/Read sub- > channel 42 00 00 00 00 00 00 00 18 00 > [ 529.907283] print_req_error: critical target error, dev sdt, > sector 247463877 > > Then I started testing it more manually and it turned out that if I > try to discard any range that includes the last sector, the command > fail in this fashion. For example: > > 2 sectors from the end, 1 sector (works): > blkdiscard -o $(( 250069678*512 )) -l 512 /dev/sdt > > 1 sectors from the end, 1 sector (fails): > blkdiscard -o $(( 250069679*512 )) -l 512 /dev/sdt > blkdiscard: /dev/sdt: BLKDISCARD ioctl failed: Remote I/O error > > [ 702.139612] sd 7:0:20:0: [sdt] tag#2511 FAILED Result: > hostbyte=DID_OK driverbyte=DRIVER_SENSE > [ 702.139618] sd 7:0:20:0: [sdt] tag#2511 Sense Key : Illegal > Request [current] > [ 702.139622] sd 7:0:20:0: [sdt] tag#2511 Add. Sense: Logical block > address out of range > [ 702.139627] sd 7:0:20:0: [sdt] tag#2511 CDB: Unmap/Read sub- > channel 42 00 00 00 00 00 00 00 18 00 > [ 702.139631] print_req_error: critical target error, dev sdt, > sector 250069679 > > Similar results with sg_unmap: > > 16:36 # sg_unmap -f -l 250069678 -n 1 /dev/sdt > 16:36 # sg_unmap -f -l 250069679 -n 1 /dev/sdt > sg_unmap failed: LBA out of range > > The last sector on its own is of course normally readable/writable > otherwise. > > This looks like some +/-1 division/reminder issue somewhere. This looks like a firmware error either on the drive or the mpt3sas. I'm afraid the only ways to fix are either: 1. Don't use discard at all and use the entire disk 2. Use discard but shorten the disk in the partition map by one sector so the entire visible disk can be discarded I'm afraid if you can't change the partition table, you can only use option one (turn off discard). James
Re: [PATCH] mpt3sas: Fix for regression caused due to cf6bf9710c patch
On Tue, 2018-07-03 at 22:49 +0900, David Miller wrote: > From: Sreekanth Reddy > Date: Tue, 3 Jul 2018 17:48:49 +0530 > > > Any suggestion/update over my previous mail. I am using 4.13 > kernel. > > I think the issue is that if you are reading a 32-bit word and then > interpreting it as a struct full of individual bytes, you have to > order the bytes in the structure appropriately for the cpu > endianness. This is undoubtedly it. The point being if you read from a structure using readX, you have to read every element at its correct length for the endian swaps to work. You can't do a readq on 2 32 bit words and expect the endianness to be correct (you'll find they come out in the wrong order). I think you're using a shared (device and cpu) memory mapped structured data with a doorbell register, which is pretty identical to how the qla1280 does it. We went through several iterations of fixing that driver for big endian but finally settled on putting __le annotations on all the structures and doing cpu_to_leX() swaps as we wrote them (and obviously leX_to_cpu() swaps to read them), meaning the structure in memory is always correct for the device. Then we used a writeX to poke the doorbell and the device just picked up the correct information. The rule you want to be following is: memory mapped structure, you're responsible for annotation and swapping; readX/writeX to correctly sized data, the API will swap for you. So, can we just revert the original patch which is clearly now a regression and try to get this fixed in the merge window? I think the actual bug is simply you're missing __leX annotations on the shared memory mapped structure to fix sparse, but otherwise everything is working. James
Re: [PATCH] mpt3sas: Fix for regression caused due to cf6bf9710c patch
On Fri, 2018-06-29 at 10:58 -0400, Chaitra P B wrote: > "scsi: mpt3sas: Bug fix for big endian systems" > > Above patch with commit id "cf6bf9710cabba1fe94a4349f4eb8db623c77ebc" > was posted to fix sparse warnings. While posting this patch it was > assumed that readl() & writel() APIs internally calls le32_to_cpu() & > cpu_to_le32() APIs respectively. Looks like it is not true for all > architecture Just a minute, it damn well should be. The definition of readl/writel is barriers and little endian (you can see this in asm-generic/io.h). Which architecture is getting this wrong? Because it sounds like that's what we need to fix rather than doing something like this in all drivers. Sparc (and parisc) definitely do the little endian thing, so if this code is what it takes to get them working again, it looks like you're double swapping somewhere. I really think cf6bf9710c needs to be reverted and you should look again at your sparse warnings. Since the driver is using the non-raw readX/writeX it should be cpu endian internally which cf6bf9710c upsets. James
Re: [PATCH v2] scsi: qla2xxx: Spinlock recursion in qla_target
On Wed, 2018-06-13 at 16:13 +, Madhani, Himanshu wrote: > > On Jun 13, 2018, at 6:05 AM, Mikhail Malygin > > wrote: > > > > Here is the patch used for verification: > > > > [PATCH] scsi: qla2xxx: Fixup spinlock recursion in qla_target > > > > The patch reverts changes done in qlt_schedule_sess_for_deletion() > > To avoid spinlock recursion sess->vha->work_lock should be used > > instead of ha->tgt.sess_lock, that can be locked in > > callers: qlt_reset() or qlt_handle_login() > > > > Thanks for testing out the change. > > > Fixes: 1c6cacf4ea6c04 ("scsi: qla2xxx: Fixup locking for session > > deletion") > > > > Signed-off-by: Mikhail Malygin > > I want to see following tags for correctness > > Fixes: 1c6cacf4ea6c04 ("scsi: qla2xxx: Fixup locking for session > deletion”) > Cc: #4.17 > Reported-by: Mikhail Malygin > Tested-by: Mikhail Malygin > Signed-off-by: Himanshu Madhani No on this last one: he can't add your signoff tag. The Signed-off-by: tags track the patch submission path and has meaning under the DCO, so if Mikhail sends it to you and you send it to the list (or you pick it off the list and resend it) *you* can add your signoff but not if it goes straight from Mikhail to the scsi tree (for this case we have an Acked-by: tag instead if that works for you?). James
Re: sd 6:0:0:0: [sdb] Unaligned partial completion
On Mon, 2018-06-11 at 14:59 -0700, Ted Cabeen wrote: > On 06/11/2018 02:40 PM, James Bottomley wrote: > > On Mon, 2018-06-11 at 12:20 -0400, Douglas Gilbert wrote: > > > I have also seen Aborted Command sense when doing heavy testing > > > on one or more SAS disks behind a SAS expander. I put it down to > > > a temporary lack of paths available (on the link between the > > > host's HBA and the expander) when one of those SAS disks tries to > > > get a connection back to the host with the data (data-in > > > transfer) from an earlier READ command. > > > > > > In my code (ddpt and sg_dd) I treat it as a "retry" type error > > > and in my experience that works. IOW a follow-up READ with the > > > same parameters is successful. > > > > We do treat ABORTED_COMMAND as a retry. However, it will tick down > > the retry count (usually 3) and then fail if it still occurs. How > > long does this condition persist for? because if it's long lived we > > could treat it as ADD_TO_MLQUEUE which would mean we'd retry until > > the timeout condition was reached. > > On my system, it's a bit hard to tell, as as soon as ZFS sees the > read error, it starts resilvering to repair the sector that reported > the I/O error. Without the scrub, it happened once over a 5-day > window. During the scrub, it was usually 10s of minutes between > occurrences that failed all the retries, but I had some occasions > where it happened about 5-10 minutes apart. It definitely seems to > be load-related, so how long and hard the load stays elevated is a > factor. OK, try this: it will print a rate limited warning if it triggers (showing it is this problem) and return ADD_TO_MLQUEUE for all the SAS errors (we'll likely narrow this if it works, but for now let's do the lot). James --- diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c index 8932ae81a15a..94aa5cb94064 100644 --- a/drivers/scsi/scsi_error.c +++ b/drivers/scsi/scsi_error.c @@ -531,6 +531,11 @@ int scsi_check_sense(struct scsi_cmnd *scmd) if (sshdr.asc == 0xc1 && sshdr.ascq == 0x01 && sdev->sdev_bflags & BLIST_RETRY_ASC_C1) return ADD_TO_MLQUEUE; + if (sshdr.asc == 0x4b) { + printk_ratelimited(KERN_WARNING "SAS/SATA link retry\n"); + return ADD_TO_MLQUEUE; + } + return NEEDS_RETRY; case NOT_READY:
Re: RAID6: "Bad block number requested"
On Mon, 2018-06-11 at 17:56 -0400, Bryan Gurney wrote: > On Mon, Jun 11, 2018 at 1:00 PM, Anthony Youngman > wrote: > > On 11/06/18 16:06, James Bottomley wrote: > > > Well, this is the problem: a 4k logical (presumably 4k physical) > > > drive cannot be addressed in block sectors that are not divisible > > > by 8. This type of drive configuration is very unusual (although > > > it was something we tested years ago before the industry realised > > > it had to ship drives with 4k physical but 512 byte logical > > > sectors because of the legacy problem). > > > > I understood these drives were now becoming much more common, > > especially enterprise-grade drives. I know there were problems > > switching from 512/512 drives to 512/4096, but as you say I thought > > they were pretty much addressed. > > As soon as I saw the model number "HGST HUH721010AL", and did a > search, I said, "Oh, it's _this_ drive." > > The HGST Ultrastar He10 has both "512e Format" and "4K Native Format" > part numbers, so it's easy to potentially buy the wrong type of drive > (e.g.: accidentally buy a 4K Native drive, and discover some obscure > I/O failures). > > FYI, in my experience, when an application sends a > smaller-than-4096-bytes I/O to a 4096-bytes block device, the usual > error code that's sent by the driver is EINVAL (or "Invalid > argument"), so see if there's a log message citing that error code. We've done the work to make this function. However, it was a while ago and I don't believe anyone tests regularly now (particularly with the corner cases) so errors can creep back into the stack. > > I think it must be a couple of years ago now though, that I heard > > (on LWN) enterprise drives were apparently switching over to > > 4096/4096. With NO 512 emulation fall-back. > > Some drive manufacturers seem to be more eager than others, but > there's still work to be done. For example, try this with a 4K- > native drive: > > 1. Write an ISO image to the drive with the command "dd > if=isofile.iso of=/dev/testdevice bs=4096 oflag=direct" > > 2. Create a test directory (for example, "/mnt/testdir"), then > attempt to mount the device with "mount /dev/testdevice /mnt/testdir" This is a textbook case of something that can never work: The requirement for a 4k drive is that the stack must be aligned, meaning 4k or multiple of 4k block size all the way up and down. The isofs you're copying only has a 2k block size. You get the same failure with any non 4k multiple filesystem block size. Fortunately most modern filesystems have had 4k, or multiple thereof, block sizes for a while now, so you're unlikely to see this on your old ext4 devices but, in principle, it could happen. James > When I tried it on RHEL 7.5, I saw this: "kernel: isofs_fill_super: > bread failed, dev=testdevice, iso_blknum=17, block=-2147483648" > > Note that ISO filesystems have a 2048-byte block size (maximum), but > in this test, it's stored on a block device with a block size of 4096 > bytes. > > There may be more issues out there, but they have to be found first. > And finding the issues is difficult, due to the obscurity of the > error messages seen. > > > Thanks, > > Bryan >
Re: sd 6:0:0:0: [sdb] Unaligned partial completion
[readd linux-scsi] On Mon, 2018-06-11 at 14:43 -0700, Ted Cabeen wrote: > On 06/11/2018 02:40 PM, James Bottomley wrote: > > > I have also seen Aborted Command sense when doing heavy testing > > > on one or more SAS disks behind a SAS expander. I put it down to > > > a temporary lack of paths available (on the link between the > > > host's HBA and the expander) when one of those SAS disks tries to > > > get a connection back to the host with the data (data-in > > > transfer) from an earlier READ command. > > > > > > In my code (ddpt and sg_dd) I treat it as a "retry" type error > > > and in my experience that works. IOW a follow-up READ with the > > > same parameters is successful. > > > > We do treat ABORTED_COMMAND as a retry. However, it will tick down > > the retry count (usually 3) and then fail if it still occurs. How > > long does this condition persist for? because if it's long lived we > > could treat it as ADD_TO_MLQUEUE which would mean we'd retry until > > the timeout condition was reached. > > When you retry, should that result in additional kernel messages, or > does the kernel message only appear after the 3 retrys have all > failed? The latter: without enabling logging, we don't print anything for successfully retried commands. James
Re: sd 6:0:0:0: [sdb] Unaligned partial completion
On Mon, 2018-06-11 at 12:20 -0400, Douglas Gilbert wrote: > On 2018-06-11 12:07 PM, Ted Cabeen wrote: > > I'm seeing a similar behavior on my system, but across multiple > > devices on a SAS drive array (front bays on a Supermicro-based > > system with onboard mpt3sas card). > > The Sense Key here doesn't show a medium error, and the multiple- > > drive behavior makes me think it's more likely either a controller > > or cable problem. Interestingly, the issue only shows up under > > heavy load (specifically a ZFS scrub). > > > > During my next downtime window, I'm going to try to re-create the > > problem while capturing a blktrace. Any other things to try at > > that time, or a filter-mask I should apply? > > > > [Wed Jun 6 14:30:19 2018] blk_update_request: I/O error, dev sdn, > > sector > > 1757633640 > > [Wed Jun 6 14:37:10 2018] sd 15:0:5:0: unaligned partial > > completion avoided > > (xfer_cnt=3072, sector_sz=4096) > > [Wed Jun 6 14:37:10 2018] sd 15:0:5:0: [sdr] FAILED Result: > > hostbyte=DID_OK > > driverbyte=DRIVER_SENSE > > [Wed Jun 6 14:37:10 2018] sd 15:0:5:0: [sdr] Sense Key : Aborted > > Command > > [current] [descriptor] > > [Wed Jun 6 14:37:10 2018] sd 15:0:5:0: [sdr] Add. Sense: Nak > > received > > [Wed Jun 6 14:37:10 2018] sd 15:0:5:0: [sdr] CDB: Read(10) 28 00 > > 07 8a c1 ca 00 > > 00 01 00 > > [Wed Jun 6 14:37:10 2018] blk_update_request: I/O error, dev sdr, > > sector > > 1012272720 > > [Wed Jun 6 15:20:43 2018] sd 15:0:8:0: unaligned partial > > completion avoided > > (xfer_cnt=52224, sector_sz=4096) > > [Wed Jun 6 15:20:43 2018] sd 15:0:8:0: [sdu] FAILED Result: > > hostbyte=DID_OK > > driverbyte=DRIVER_SENSE > > [Wed Jun 6 15:20:43 2018] sd 15:0:8:0: [sdu] Sense Key : Aborted > > Command > > [current] [descriptor] > > [Wed Jun 6 15:20:43 2018] sd 15:0:8:0: [sdu] Add. Sense: Nak > > received > > [Wed Jun 6 15:20:43 2018] sd 15:0:8:0: [sdu] CDB: Read(10) 28 00 > > 12 ab dc 52 00 > > 00 19 00 > > [Wed Jun 6 15:20:43 2018] blk_update_request: I/O error, dev sdu, > > sector > > 2506023568 > > [Wed Jun 6 15:46:20 2018] sd 15:0:2:0: unaligned partial > > completion avoided > > (xfer_cnt=11264, sector_sz=4096) > > [Wed Jun 6 15:46:20 2018] sd 15:0:2:0: [sdo] FAILED Result: > > hostbyte=DID_OK > > driverbyte=DRIVER_SENSE > > [Wed Jun 6 15:46:20 2018] sd 15:0:2:0: [sdo] Sense Key : Aborted > > Command > > [current] [descriptor] > > [Wed Jun 6 15:46:20 2018] sd 15:0:2:0: [sdo] Add. Sense: Nak > > received > > [Wed Jun 6 15:46:20 2018] sd 15:0:2:0: [sdo] CDB: Read(10) 28 00 > > 40 a8 ef b5 00 > > 00 03 00 > > [Wed Jun 6 15:46:20 2018] blk_update_request: I/O error, dev sdo, > > sector > > 8678505896 > > > > I have also seen Aborted Command sense when doing heavy testing on > one or more SAS disks behind a SAS expander. I put it down to a > temporary lack of paths available (on the link between the host's HBA > and the expander) when one of those SAS disks tries to get a > connection back to the host with the data (data-in transfer) from an > earlier READ command. > > In my code (ddpt and sg_dd) I treat it as a "retry" type error and in > my experience that works. IOW a follow-up READ with the same > parameters is successful. We do treat ABORTED_COMMAND as a retry. However, it will tick down the retry count (usually 3) and then fail if it still occurs. How long does this condition persist for? because if it's long lived we could treat it as ADD_TO_MLQUEUE which would mean we'd retry until the timeout condition was reached. James
Re: RAID6: "Bad block number requested"
On Mon, 2018-06-11 at 11:18 -0400, Douglas Gilbert wrote: > On 2018-06-11 11:06 AM, James Bottomley wrote: > > On Mon, 2018-06-11 at 16:24 +0200, Sebastian Hegler wrote: > > > Dear all, > > > > > > First off: sorry for cross-posting. I don't know if this is a > > > RAID > > > issue or a SCSI issue, so I'll just ask y'all. > > > > > > > > > For a RAID6 capacity upgrade (higher capacity drives), we bought > > > some > > > 10TB disks: > > > == > > > Apr 17 11:16:05 kuiper kernel: [12795386.862031] scsi 6:0:36:0: > > > Direct-Access ATA HGST HUH721010AL T21D PQ: 0 ANSI: 6 > > > Apr 17 11:16:05 kuiper kernel: [12795386.919904] scsi 6:0:36:0: > > > atapi(n), ncq(y), asyn_notify(n), smart(y), fua(y), > > > sw_preserve(y) > > > Apr 17 11:16:05 kuiper kernel: [12795386.974186] sd 6:0:36:0: > > > [sdl] > > > 2441609216 4096-byte logical blocks: (10.0 TB/9.10 TiB) > > > > Well, this is the problem: a 4k logical (presumably 4k physical) > > drive > > cannot be addressed in block sectors that are not divisible by > > 8. This > > type of drive configuration is very unusual (although it was > > something > > we tested years ago before the industry realised it had to ship > > drives > > with 4k physical but 512 byte logical sectors because of the legacy > > problem). > > > > > Apr 17 11:16:05 kuiper kernel: [12795386.998016] sd 6:0:36:0: > > > [sdl] > > > Write Protect is off > > > Apr 17 11:16:05 kuiper kernel: [12795387.000625] sd 6:0:36:0: > > > Attached scsi generic sg12 type 0 > > > Apr 17 11:16:05 kuiper kernel: [12795387.035341] sd 6:0:36:0: > > > [sdl] > > > Mode Sense: 7f 00 10 08 > > > Apr 17 11:16:05 kuiper kernel: [12795387.035679] sd 6:0:36:0: > > > [sdl] > > > Write cache: enabled, read cache: enabled, supports DPO and FUA > > > Apr 17 11:16:05 kuiper kernel: [12795387.098315] sd 6:0:36:0: > > > [sdl] > > > Attached SCSI disk > > > == > > > > > > RAID add and rebuild operations went fine. However, some minutes > > > after rebuild completion, several hundreds of these error > > > messages > > > started to appear: > > > == > > > Apr 20 03:37:29 kuiper kernel: [13027072.454811] sd 6:0:36:0: > > > [sdl] > > > Bad block number requested > > > > This means that somehow, something sent a non 4k aligned 4k sized > > request. SCSI here is just the messenger. However, if you apply > > this > > patch, it will capture the stack trace of what above it triggered > > this, > > which may help us in debugging. It could be we may also want to > > see > > what the values of block and blk_rq_sectors(rq) actually are, but > > lets > > begin with the stack trace. > > > > James > > > > --- > > > > diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c > > index 9421d9877730..ac865e048533 100644 > > --- a/drivers/scsi/sd.c > > +++ b/drivers/scsi/sd.c > > @@ -1109,6 +1109,7 @@ static int sd_setup_read_write_cmnd(struct > > scsi_cmnd *SCpnt) > > if ((block & 7) || (blk_rq_sectors(rq) & 7)) { > > scmd_printk(KERN_ERR, SCpnt, > > "Bad block number > > requested\n"); > > Not a very informative error message. How about a quasi SCSI one > like: > Logical Block out of range, due to different block sizes Well, this is supposed to be an impossible condition: if someone wants to use non-512 byte logical drives, they're supposed to align the stack to ensure it works. In this case, it looks like the stack is aligned but one component isn't completely 4k safe. I don't think there's any message we can print in SCSI that would help with debugging that. I agree the message we print could be more informative about what went wrong, but it's unlikely to be helpful to the user who sees it. James
Re: RAID6: "Bad block number requested"
On Mon, 2018-06-11 at 16:24 +0200, Sebastian Hegler wrote: > Dear all, > > First off: sorry for cross-posting. I don't know if this is a RAID > issue or a SCSI issue, so I'll just ask y'all. > > > For a RAID6 capacity upgrade (higher capacity drives), we bought some > 10TB disks: > == > Apr 17 11:16:05 kuiper kernel: [12795386.862031] scsi 6:0:36:0: > Direct-Access ATA HGST HUH721010AL T21D PQ: 0 ANSI: 6 > Apr 17 11:16:05 kuiper kernel: [12795386.919904] scsi 6:0:36:0: > atapi(n), ncq(y), asyn_notify(n), smart(y), fua(y), sw_preserve(y) > Apr 17 11:16:05 kuiper kernel: [12795386.974186] sd 6:0:36:0: [sdl] > 2441609216 4096-byte logical blocks: (10.0 TB/9.10 TiB) Well, this is the problem: a 4k logical (presumably 4k physical) drive cannot be addressed in block sectors that are not divisible by 8. This type of drive configuration is very unusual (although it was something we tested years ago before the industry realised it had to ship drives with 4k physical but 512 byte logical sectors because of the legacy problem). > Apr 17 11:16:05 kuiper kernel: [12795386.998016] sd 6:0:36:0: [sdl] > Write Protect is off > Apr 17 11:16:05 kuiper kernel: [12795387.000625] sd 6:0:36:0: > Attached scsi generic sg12 type 0 > Apr 17 11:16:05 kuiper kernel: [12795387.035341] sd 6:0:36:0: [sdl] > Mode Sense: 7f 00 10 08 > Apr 17 11:16:05 kuiper kernel: [12795387.035679] sd 6:0:36:0: [sdl] > Write cache: enabled, read cache: enabled, supports DPO and FUA > Apr 17 11:16:05 kuiper kernel: [12795387.098315] sd 6:0:36:0: [sdl] > Attached SCSI disk > == > > RAID add and rebuild operations went fine. However, some minutes > after rebuild completion, several hundreds of these error messages > started to appear: > == > Apr 20 03:37:29 kuiper kernel: [13027072.454811] sd 6:0:36:0: [sdl] > Bad block number requested This means that somehow, something sent a non 4k aligned 4k sized request. SCSI here is just the messenger. However, if you apply this patch, it will capture the stack trace of what above it triggered this, which may help us in debugging. It could be we may also want to see what the values of block and blk_rq_sectors(rq) actually are, but lets begin with the stack trace. James --- diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c index 9421d9877730..ac865e048533 100644 --- a/drivers/scsi/sd.c +++ b/drivers/scsi/sd.c @@ -1109,6 +1109,7 @@ static int sd_setup_read_write_cmnd(struct scsi_cmnd *SCpnt) if ((block & 7) || (blk_rq_sectors(rq) & 7)) { scmd_printk(KERN_ERR, SCpnt, "Bad block number requested\n"); + WARN_ON_ONCE(1); goto out; } else { block = block >> 3;
Re: [PATCH] mpt3sas: Add an i/o barrier
On Thu, 2018-05-24 at 17:31 +0200, Tomas Henzl wrote: > On 05/24/2018 05:19 PM, James Bottomley wrote: > > On Thu, 2018-05-24 at 17:12 +0200, Tomas Henzl wrote: > > > A barrier should be added to ensure proper ordering of memory > > > mapped > > > writes. > > > > > > Signed-off-by: Tomas Henzl <the...@redhat.com> > > > --- > > > drivers/scsi/mpt3sas/mpt3sas_base.c | 1 + > > > 1 file changed, 1 insertion(+) > > > > > > diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.c > > > b/drivers/scsi/mpt3sas/mpt3sas_base.c > > > index bf04fa90f..569392d0d 100644 > > > --- a/drivers/scsi/mpt3sas/mpt3sas_base.c > > > +++ b/drivers/scsi/mpt3sas/mpt3sas_base.c > > > @@ -3348,6 +3348,7 @@ _base_mpi_ep_writeq(__u64 b, volatile void > > > __iomem *addr, > > > spin_lock_irqsave(writeq_lock, flags); > > > writel((u32)(data_out), addr); > > > writel((u32)(data_out >> 32), (addr + 4)); > > > + mmiowb(); > > > spin_unlock_irqrestore(writeq_lock, flags); > > > } > > > > I thought, assuming mpt3sas has this right, that this construction > > is only used on 32 bit platforms that don't have a writeq > > instruction? I don't believe there's any overlap with the NUMA > > systems that need io and memory domain synchronization, so either > > this problem is purely theoretical or mpt3sas doesn't have the use > > of writeq correct and if the latter case it should be fixed > > correctly. > > The _base_mpi_ep_writeq is used regardless to 32/64 bit arch > for example in _base_put_smid_mpi_ep_scsi_io, > mpt3sas_base_put_smid_hi_priority > and so on. So it's the latter ... but my point is that that should be fixed rather than adding barriers to what should be a corner case work around. James
Re: [PATCH] mpt3sas: Add an i/o barrier
On Thu, 2018-05-24 at 17:12 +0200, Tomas Henzl wrote: > A barrier should be added to ensure proper ordering of memory mapped > writes. > > Signed-off-by: Tomas Henzl> --- > drivers/scsi/mpt3sas/mpt3sas_base.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.c > b/drivers/scsi/mpt3sas/mpt3sas_base.c > index bf04fa90f..569392d0d 100644 > --- a/drivers/scsi/mpt3sas/mpt3sas_base.c > +++ b/drivers/scsi/mpt3sas/mpt3sas_base.c > @@ -3348,6 +3348,7 @@ _base_mpi_ep_writeq(__u64 b, volatile void > __iomem *addr, > spin_lock_irqsave(writeq_lock, flags); > writel((u32)(data_out), addr); > writel((u32)(data_out >> 32), (addr + 4)); > + mmiowb(); > spin_unlock_irqrestore(writeq_lock, flags); > } I thought, assuming mpt3sas has this right, that this construction is only used on 32 bit platforms that don't have a writeq instruction? I don't believe there's any overlap with the NUMA systems that need io and memory domain synchronization, so either this problem is purely theoretical or mpt3sas doesn't have the use of writeq correct and if the latter case it should be fixed correctly. James
[GIT PULL] SCSI fixes for 4.17-rc6
Two driver fixes (zfcp and target core), one information leak in sg and one build clean up. The patch is available here: git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-fixes The short changelog is: Alexander Potapenko (1): scsi: sg: allocate with __GFP_ZERO in sg_build_indirect() Jens Remus (1): scsi: zfcp: fix infinite iteration on ERP ready list Prasanna Kumar Kalever (1): scsi: target: tcmu: fix error resetting qfull_time_out to default Randy Dunlap (1): scsi: core: clean up generated file scsi_devinfo_tbl.c And the diffstat: drivers/s390/scsi/zfcp_dbf.c | 23 ++- drivers/s390/scsi/zfcp_ext.h | 5 - drivers/s390/scsi/zfcp_scsi.c | 14 +++--- drivers/scsi/Makefile | 2 +- drivers/scsi/sg.c | 2 +- drivers/target/target_core_user.c | 2 ++ 6 files changed, 37 insertions(+), 11 deletions(-) With full diff below. James --- diff --git a/drivers/s390/scsi/zfcp_dbf.c b/drivers/s390/scsi/zfcp_dbf.c index a8b831000b2d..18c4f933e8b9 100644 --- a/drivers/s390/scsi/zfcp_dbf.c +++ b/drivers/s390/scsi/zfcp_dbf.c @@ -4,7 +4,7 @@ * * Debug traces for zfcp. * - * Copyright IBM Corp. 2002, 2017 + * Copyright IBM Corp. 2002, 2018 */ #define KMSG_COMPONENT "zfcp" @@ -308,6 +308,27 @@ void zfcp_dbf_rec_trig(char *tag, struct zfcp_adapter *adapter, spin_unlock_irqrestore(>rec_lock, flags); } +/** + * zfcp_dbf_rec_trig_lock - trace event related to triggered recovery with lock + * @tag: identifier for event + * @adapter: adapter on which the erp_action should run + * @port: remote port involved in the erp_action + * @sdev: scsi device involved in the erp_action + * @want: wanted erp_action + * @need: required erp_action + * + * The adapter->erp_lock must not be held. + */ +void zfcp_dbf_rec_trig_lock(char *tag, struct zfcp_adapter *adapter, + struct zfcp_port *port, struct scsi_device *sdev, + u8 want, u8 need) +{ + unsigned long flags; + + read_lock_irqsave(>erp_lock, flags); + zfcp_dbf_rec_trig(tag, adapter, port, sdev, want, need); + read_unlock_irqrestore(>erp_lock, flags); +} /** * zfcp_dbf_rec_run_lvl - trace event related to running recovery diff --git a/drivers/s390/scsi/zfcp_ext.h b/drivers/s390/scsi/zfcp_ext.h index bf8ea4df2bb8..e5eed8aac0ce 100644 --- a/drivers/s390/scsi/zfcp_ext.h +++ b/drivers/s390/scsi/zfcp_ext.h @@ -4,7 +4,7 @@ * * External function declarations. * - * Copyright IBM Corp. 2002, 2016 + * Copyright IBM Corp. 2002, 2018 */ #ifndef ZFCP_EXT_H @@ -35,6 +35,9 @@ extern int zfcp_dbf_adapter_register(struct zfcp_adapter *); extern void zfcp_dbf_adapter_unregister(struct zfcp_adapter *); extern void zfcp_dbf_rec_trig(char *, struct zfcp_adapter *, struct zfcp_port *, struct scsi_device *, u8, u8); +extern void zfcp_dbf_rec_trig_lock(char *tag, struct zfcp_adapter *adapter, + struct zfcp_port *port, + struct scsi_device *sdev, u8 want, u8 need); extern void zfcp_dbf_rec_run(char *, struct zfcp_erp_action *); extern void zfcp_dbf_rec_run_lvl(int level, char *tag, struct zfcp_erp_action *erp); diff --git a/drivers/s390/scsi/zfcp_scsi.c b/drivers/s390/scsi/zfcp_scsi.c index 4d2ba5682493..22f9562f415c 100644 --- a/drivers/s390/scsi/zfcp_scsi.c +++ b/drivers/s390/scsi/zfcp_scsi.c @@ -4,7 +4,7 @@ * * Interface to Linux SCSI midlayer. * - * Copyright IBM Corp. 2002, 2017 + * Copyright IBM Corp. 2002, 2018 */ #define KMSG_COMPONENT "zfcp" @@ -618,9 +618,9 @@ static void zfcp_scsi_rport_register(struct zfcp_port *port) ids.port_id = port->d_id; ids.roles = FC_RPORT_ROLE_FCP_TARGET; - zfcp_dbf_rec_trig("scpaddy", port->adapter, port, NULL, - ZFCP_PSEUDO_ERP_ACTION_RPORT_ADD, - ZFCP_PSEUDO_ERP_ACTION_RPORT_ADD); + zfcp_dbf_rec_trig_lock("scpaddy", port->adapter, port, NULL, + ZFCP_PSEUDO_ERP_ACTION_RPORT_ADD, + ZFCP_PSEUDO_ERP_ACTION_RPORT_ADD); rport = fc_remote_port_add(port->adapter->scsi_host, 0, ); if (!rport) { dev_err(>adapter->ccw_device->dev, @@ -642,9 +642,9 @@ static void zfcp_scsi_rport_block(struct zfcp_port *port) struct fc_rport *rport = port->rport; if (rport) { - zfcp_dbf_rec_trig("scpdely", port->adapter, port, NULL, - ZFCP_PSEUDO_ERP_ACTION_RPORT_DEL, - ZFCP_PSEUDO_ERP_ACTION_RPORT_DEL); + zfcp_dbf_rec_trig_lock("scpdely", port->adapter, port, NULL, + ZFCP_PSEUDO_ERP_ACTION_RPORT_DEL, + ZFCP_PSEUDO_ERP_ACTION_RPORT_DEL);
[GIT PULL] SCSI fixes for 4.17-rc5
Two small driver fixes: aacraid to fix an unknown IU type on task management functions which causes a firmware fault and vmw_pvscsi to change a return code to retry the operation instead of causing an immediate error The patch is available here: git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-fixes The short changelog is: Dave Carroll (1): scsi: aacraid: Correct hba_send to include iu_type Jim Gill (1): scsi: vmw-pvscsi: return DID_BUS_BUSY for adapter-initated aborts And the diffstat: drivers/scsi/aacraid/commsup.c | 8 drivers/scsi/vmw_pvscsi.c | 2 +- 2 files changed, 5 insertions(+), 5 deletions(-) With full diff below. James --- diff --git a/drivers/scsi/aacraid/commsup.c b/drivers/scsi/aacraid/commsup.c index 0156c9623c35..d62ddd63f4fe 100644 --- a/drivers/scsi/aacraid/commsup.c +++ b/drivers/scsi/aacraid/commsup.c @@ -724,6 +724,8 @@ int aac_hba_send(u8 command, struct fib *fibptr, fib_callback callback, int wait; unsigned long flags = 0; unsigned long mflags = 0; + struct aac_hba_cmd_req *hbacmd = (struct aac_hba_cmd_req *) + fibptr->hw_fib_va; fibptr->flags = (FIB_CONTEXT_FLAG | FIB_CONTEXT_FLAG_NATIVE_HBA); if (callback) { @@ -734,11 +736,9 @@ int aac_hba_send(u8 command, struct fib *fibptr, fib_callback callback, wait = 1; - if (command == HBA_IU_TYPE_SCSI_CMD_REQ) { - struct aac_hba_cmd_req *hbacmd = - (struct aac_hba_cmd_req *)fibptr->hw_fib_va; + hbacmd->iu_type = command; - hbacmd->iu_type = command; + if (command == HBA_IU_TYPE_SCSI_CMD_REQ) { /* bit1 of request_id must be 0 */ hbacmd->request_id = cpu_to_le32u32)(fibptr - dev->fibs)) << 2) + 1); diff --git a/drivers/scsi/vmw_pvscsi.c b/drivers/scsi/vmw_pvscsi.c index c374e3b5c678..777e5f1e52d1 100644 --- a/drivers/scsi/vmw_pvscsi.c +++ b/drivers/scsi/vmw_pvscsi.c @@ -609,7 +609,7 @@ static void pvscsi_complete_request(struct pvscsi_adapter *adapter, break; case BTSTAT_ABORTQUEUE: - cmd->result = (DID_ABORT << 16); + cmd->result = (DID_BUS_BUSY << 16); break; case BTSTAT_SCSIPARITY:
[GIT PULL] SCSI fixes for 4.17-rc3
Three small bug fixes: an illegally overlapping memcmp in target code, a potential infinite loop in isci under certain rare phy conditions and an ATA queue depth (performance) correction for storvsc. The patch is available here: git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-fixes The short changelog is: Bryant G Ly (1): scsi: target: Fix fortify_panic kernel exception Colin Ian King (1): scsi: isci: Fix infinite loop in while loop Long Li (1): scsi: storvsc: Set up correct queue depth values for IDE devices And the diffstat: drivers/scsi/isci/port_config.c | 3 +-- drivers/scsi/storvsc_drv.c | 7 +-- drivers/target/target_core_iblock.c | 8 3 files changed, 10 insertions(+), 8 deletions(-) With full diff below. James --- diff --git a/drivers/scsi/isci/port_config.c b/drivers/scsi/isci/port_config.c index edb7be786c65..9e8de1462593 100644 --- a/drivers/scsi/isci/port_config.c +++ b/drivers/scsi/isci/port_config.c @@ -291,7 +291,7 @@ sci_mpc_agent_validate_phy_configuration(struct isci_host *ihost, * Note: We have not moved the current phy_index so we will actually * compare the startting phy with itself. * This is expected and required to add the phy to the port. */ - while (phy_index < SCI_MAX_PHYS) { + for (; phy_index < SCI_MAX_PHYS; phy_index++) { if ((phy_mask & (1 << phy_index)) == 0) continue; sci_phy_get_sas_address(>phys[phy_index], @@ -311,7 +311,6 @@ sci_mpc_agent_validate_phy_configuration(struct isci_host *ihost, >phys[phy_index]); assigned_phy_mask |= (1 << phy_index); - phy_index++; } } diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c index 8c51d628b52e..a2ec0bc9e9fa 100644 --- a/drivers/scsi/storvsc_drv.c +++ b/drivers/scsi/storvsc_drv.c @@ -1722,11 +1722,14 @@ static int storvsc_probe(struct hv_device *device, max_targets = STORVSC_MAX_TARGETS; max_channels = STORVSC_MAX_CHANNELS; /* -* On Windows8 and above, we support sub-channels for storage. +* On Windows8 and above, we support sub-channels for storage +* on SCSI and FC controllers. * The number of sub-channels offerred is based on the number of * VCPUs in the guest. */ - max_sub_channels = (num_cpus / storvsc_vcpus_per_sub_channel); + if (!dev_is_ide) + max_sub_channels = + (num_cpus - 1) / storvsc_vcpus_per_sub_channel; } scsi_driver.can_queue = (max_outstanding_req_per_channel * diff --git a/drivers/target/target_core_iblock.c b/drivers/target/target_core_iblock.c index 07c814c42648..60429011292a 100644 --- a/drivers/target/target_core_iblock.c +++ b/drivers/target/target_core_iblock.c @@ -427,8 +427,8 @@ iblock_execute_zero_out(struct block_device *bdev, struct se_cmd *cmd) { struct se_device *dev = cmd->se_dev; struct scatterlist *sg = >t_data_sg[0]; - unsigned char *buf, zero = 0x00, *p = - int rc, ret; + unsigned char *buf, *not_zero; + int ret; buf = kmap(sg_page(sg)) + sg->offset; if (!buf) @@ -437,10 +437,10 @@ iblock_execute_zero_out(struct block_device *bdev, struct se_cmd *cmd) * Fall back to block_execute_write_same() slow-path if * incoming WRITE_SAME payload does not contain zeros. */ - rc = memcmp(buf, p, cmd->data_length); + not_zero = memchr_inv(buf, 0x00, cmd->data_length); kunmap(sg_page(sg)); - if (rc) + if (not_zero) return TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE; ret = blkdev_issue_zeroout(bdev,
[GIT PULL] SCSI fixes for 4.17-rc2
8 bug fixes, one spelling update and one tracepoint addition. The most serious is probably the mpt3sas write same fix because it means anyone using these controllers sees errors when modern filesystems try to issue discards (if the drive supports the WRITE SAME variant). The patch is available here: git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-fixes The short changelog is: Bart Van Assche (1): scsi: sd_zbc: Avoid that resetting a zone fails sporadically Chris Leech (1): scsi: iscsi: respond to netlink with unicast when appropriate Colin Ian King (1): scsi: fnic: fix spelling mistake in fnic stats "Abord" -> "Abort" Douglas Gilbert (1): scsi: scsi_debug: IMMED related delay adjustments John Pittman (1): scsi: core: remove reference to scsi_show_extd_sense() Mahesh Rajashekhara (1): scsi: sd: Defer spinning up drive while SANITIZE is in progress Martin K. Petersen (1): scsi: mptsas: Disable WRITE SAME Ming Lei (1): scsi: target: fix crash with iscsi target and dvd Ohad Sharabi (1): scsi: ufs: add trace event for ufs upiu Vinson Lee (1): scsi: megaraid_sas: Do not log an error if FW successfully initializes. And the diffstat: drivers/message/fusion/mptsas.c | 1 + drivers/scsi/fnic/fnic_trace.c | 2 +- drivers/scsi/megaraid/megaraid_sas_fusion.c | 6 +- drivers/scsi/scsi_debug.c | 33 +-- drivers/scsi/scsi_transport_iscsi.c | 29 +++--- drivers/scsi/sd.c | 2 + drivers/scsi/sd_zbc.c | 140 drivers/scsi/ufs/ufshcd.c | 40 drivers/target/target_core_pscsi.c | 2 + include/linux/blkdev.h | 5 + include/scsi/scsi_dbg.h | 2 - include/trace/events/ufs.h | 27 ++ 12 files changed, 205 insertions(+), 84 deletions(-) With full diff below James --- diff --git a/drivers/message/fusion/mptsas.c b/drivers/message/fusion/mptsas.c index 231f3a1e27bf..86503f60468f 100644 --- a/drivers/message/fusion/mptsas.c +++ b/drivers/message/fusion/mptsas.c @@ -1994,6 +1994,7 @@ static struct scsi_host_template mptsas_driver_template = { .cmd_per_lun= 7, .use_clustering = ENABLE_CLUSTERING, .shost_attrs= mptscsih_host_attrs, + .no_write_same = 1, }; static int mptsas_get_linkerrors(struct sas_phy *phy) diff --git a/drivers/scsi/fnic/fnic_trace.c b/drivers/scsi/fnic/fnic_trace.c index abddde11982b..98597b59c12a 100644 --- a/drivers/scsi/fnic/fnic_trace.c +++ b/drivers/scsi/fnic/fnic_trace.c @@ -296,7 +296,7 @@ int fnic_get_stats_data(struct stats_debug_info *debug, "Number of Abort FW Timeouts: %lld\n" "Number of Abort IO NOT Found: %lld\n" - "Abord issued times: \n" + "Abort issued times: \n" "< 6 sec : %lld\n" " 6 sec - 20 sec : %lld\n" "20 sec - 30 sec : %lld\n" diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c b/drivers/scsi/megaraid/megaraid_sas_fusion.c index ce97cde3b41c..f4d988dd1e9d 100644 --- a/drivers/scsi/megaraid/megaraid_sas_fusion.c +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c @@ -1124,12 +1124,12 @@ megasas_ioc_init_fusion(struct megasas_instance *instance) goto fail_fw_init; } - ret = 0; + return 0; fail_fw_init: dev_err(>pdev->dev, - "Init cmd return status %s for SCSI host %d\n", - ret ? "FAILED" : "SUCCESS", instance->host->host_no); + "Init cmd return status FAILED for SCSI host %d\n", + instance->host->host_no); return ret; } diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c index 9ef5e3b810f6..656c98e116a9 100644 --- a/drivers/scsi/scsi_debug.c +++ b/drivers/scsi/scsi_debug.c @@ -234,11 +234,13 @@ static const char *sdebug_version_date = "20180128"; #define F_INV_OP 0x200 #define F_FAKE_RW 0x400 #define F_M_ACCESS 0x800 /* media access */ -#define F_LONG_DELAY 0x1000 +#define F_SSU_DELAY0x1000 +#define F_SYNC_DELAY 0x2000 #define FF_RESPOND (F_RL_WLUN_OK | F_SKIP_UA | F_DELAY_OVERR) #define FF_MEDIA_IO (F_M_ACCESS | F_FAKE_RW) #define FF_SA (F_SA_HIGH | F_SA_LOW) +#define F_LONG_DELAY (F_SSU_DELAY | F_SYNC_DELAY) #define SDEBUG_MAX_PARTS 4 @@ -510,7 +512,7 @@ static const struct opcode_info_t release_iarr[] = { }; static const struct opcode_info_t sync_cache_iarr[] = { - {0, 0x91, 0, F_LONG_DELAY | F_M_ACCESS, resp_sync_cache, NULL, + {0, 0x91, 0, F_SYNC_DELAY | F_M_ACCESS, resp_sync_cache, NULL, {16, 0x6, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
Re: [PATCH] bsg referencing bus driver module
On Fri, 2018-04-20 at 16:44 -0600, Anatoliy Glagolev wrote: > > > This patch isn't applyable because your mailer has changed all the > > tabs to spaces. > > > > I also think there's no need to do it this way. I think what we > > need is for fc_bsg_remove() to wait until the bsg queue is > > drained. It does look like the author thought this happened > > otherwise the code wouldn't have the note. If we fix it that way > > we can do the same thing in all the other transport classes that > > use bsg (which all have a similar issue). > > > > James > > > > Thanks, James. Sorry about the tabs; re-sending. > > On fc_bsg_remove()...: are you suggesting to implement the whole fix > in scsi_transport_fc.c? Yes, but it's not just scsi_transport_fc, scsi_transport_sas has the same issue. I think it's probably just the one liner addition of blk_drain_queue() that fixes this. There should probably be a block primitive that does the correct queue reference dance and calls blk_cleanup_queue() and blk_drain_queue() in order. > That would be nice, but I do not see how that > is possible. Even with the queue drained bsg still holds a reference > to the Scsi_Host via bsg_class_device; bsg_class_device itself is > referenced on bsg_open and kept around while a user-mode process > keeps a handle to bsg. Once you've called bsg_unregister_queue(), the queue will be destroyed and the reference released once the last job is drained, meaning the user can keep the bsg device open, but it will just return errors because of the lack of queue. This scenario allows removal to proceed without being held hostage by open devices. > Even if we somehow implement the waiting the call may be stuck > forever if the user-mode process keeps the handle. No it won't: after blk_cleanup_queue(), the queue is in bypass mode: no requests queued after this do anything other than complete with error, so they never make it into SCSI. > I think handling it via a rererence to the module is more consistent > with the way things are done in Linux. You suggested the approach > youself back in "Waiting for scsi_host_template release" discussion. That was before I analyzed the code paths. Module release is tricky, because the module exit won't be called until the references drop to zero, so you have to be careful about not creating a situation where module exit never gets called and module exit code should force stuff to detach and wait for the forcing to complete to make up for the reference circularity problem. If you do it purely by refcounting, the module actually may never release (that's why scsi_remove_host works the way it does, for instance). James
Re: [PATCH] bsg referencing bus driver module
On Thu, 2018-04-19 at 15:10 -0700, Anatoliy Glagolev wrote: > Updated: rebased on recent Linux, cc-ed maintainers per instructions > in MAINTAINERS file > > From df939b80d02bf37b21efaaef8ede86cfd39b0cb8 Mon Sep 17 00:00:00 > 2001 > From: Anatoliy Glagolev> Date: Thu, 19 Apr 2018 15:06:06 -0600 > Subject: [PATCH] bsg referencing parent module > > Fixing a bug when bsg layer holds the last reference to device > when the device's module has been unloaded. Upon dropping the > reference the device's release function may touch memory of > the unloaded module. > --- > block/bsg-lib.c | 24 ++-- > block/bsg.c | 29 ++--- > drivers/scsi/scsi_transport_fc.c | 8 ++-- > include/linux/bsg-lib.h | 4 > include/linux/bsg.h | 5 + > 5 files changed, 63 insertions(+), 7 deletions(-) > > diff --git a/block/bsg-lib.c b/block/bsg-lib.c > index fc2e5ff..90c28fd 100644 > --- a/block/bsg-lib.c > +++ b/block/bsg-lib.c > @@ -309,6 +309,25 @@ struct request_queue *bsg_setup_queue(struct > device *dev, const char *name, > bsg_job_fn *job_fn, int dd_job_size, > void (*release)(struct device *)) > { > + return bsg_setup_queue_ex(dev, name, job_fn, dd_job_size, release, > + NULL); > +} > +EXPORT_SYMBOL_GPL(bsg_setup_queue); > + > +/** > + * bsg_setup_queue - Create and add the bsg hooks so we can receive > requests > + * @dev: device to attach bsg device to > + * @name: device to give bsg device > + * @job_fn: bsg job handler > + * @dd_job_size: size of LLD data needed for each job > + * @release: @dev release function > + * @dev_module: @dev's module > + */ > +struct request_queue *bsg_setup_queue_ex(struct device *dev, const > char *name, > + bsg_job_fn *job_fn, int dd_job_size, > + void (*release)(struct device *), > + struct module *dev_module) This patch isn't applyable because your mailer has changed all the tabs to spaces. I also think there's no need to do it this way. I think what we need is for fc_bsg_remove() to wait until the bsg queue is drained. It does look like the author thought this happened otherwise the code wouldn't have the note. If we fix it that way we can do the same thing in all the other transport classes that use bsg (which all have a similar issue). James
Re: [PATCH] isci: Fix infinite loop in while loop
On Fri, 2018-04-20 at 10:03 +0100, Colin King wrote: > From: Colin Ian King> > In the case when the phy_mask is bitwise anded with the > phy_index bit is zero the continue statement currently jumps > to the next iteration of the while loop and phy_index is > never actually incremented, potentially causing an infinite > loop if phy_index is less than SCI_MAX_PHS. Fix this by > jumping to the increment of phy_index. > > [ The goto is used to save one more level of nesting that > makes the code far wider than 80 columns. ] what's wrong with replacing the while() with a for() that just works (removing the increment at the end). This is effectively open coding a for loop anyway, which is a pattern we wouldn't want replicated. James
Re: usercopy whitelist woe in scsi_sense_cache
On Mon, 2018-04-16 at 20:12 -0700, Kees Cook wrote: > I still haven't figured this out, though... any have a moment to look > at this? Just to let you know you're not alone ... but I can't make any sense of this either. The bfdq is the elevator_data, which is initialised when the scheduler is attached, so it shouldn't change. Is it possible to set a data break point on elevator_data after it's initialised and see if it got changed by something? James
Re: MPT Fusion - ext4 delayed allocation failed errors on 4.14!
On Mon, 2018-04-16 at 23:25 -0400, Martin K. Petersen wrote: > Nikola, > > > thanks for explanation. but disabling write same for now is safe, > > right? > > I was hoping we'd be able to disable it for SATA devices only. > > > root@siv-70140:~ # sg_vpd /dev/sda > > Supported VPD pages VPD page: > > Supported VPD pages [sv] > > Device identification [di] > > Supported VPD pages [sv] > > Supported VPD pages [sv] > > However, it doesn't look like we have much to work with since there > is no ATA Information VPD page exposed. it's the fat firmware problem again. However, the driver does know, so could we persuade a modification of mpt3sas_scsih.c:_scsih_display_sata_capabilities() to populate a VPD page? James
[GIT PULL] final round of SCSI updates for the 4.16+ merge window
This is a set of minor (and safe changes) that didn't make the initial pull request plus some bug fixes. The status handling code is actually a running regression from the previous merge window which had an incomplete fix (now reverted) and most of the remaining bug fixes are for problems older than the current merge window. The patch is available here: git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-fixes The short changelog is: Bart Van Assche (3): scsi: core: Make scsi_result_to_blk_status() recognize CONDITION MET scsi: core: Rename __scsi_error_from_host_byte() into scsi_result_to_blk_status() Revert "scsi: core: return BLK_STS_OK for DID_OK in __scsi_error_from_host_byte()" Ben Hutchings (2): scsi: qla2xxx: Fix race condition between iocb timeout and initialisation scsi: qla2xxx: Avoid double completion of abort command Bill Kuzeja (1): scsi: qla2xxx: Fix small memory leak in qla2x00_probe_one on probe failure Colin Ian King (1): scsi: core: remove redundant assignment to shost->use_blk_mq Dan Carpenter (2): scsi: cxgb4i: silence overflow warning in t4_uld_rx_handler() scsi: dpt_i2o: Use after free in I2ORESETCMD ioctl Dave Carroll (1): scsi: aacraid: Insure command thread is not recursively stopped Johannes Thumshirn (3): scsi: qla2xxx: Correct setting of SAM_STAT_CHECK_CONDITION scsi: qla2xxx: correctly shift host byte scsi: scsi_dh: Don't look for NULL devices handlers by name Ross Lagerwall (1): scsi: devinfo: Add Microsoft iSCSI target to 1024 sector blacklist And the diffstat: drivers/scsi/aacraid/commsup.c | 4 +++- drivers/scsi/aacraid/linit.c | 1 + drivers/scsi/cxgbi/cxgb4i/cxgb4i.c | 8 +++ drivers/scsi/dpt_i2o.c | 13 ++- drivers/scsi/hosts.c | 1 - drivers/scsi/qla2xxx/qla_gs.c | 12 ++- drivers/scsi/qla2xxx/qla_init.c| 41 +-- drivers/scsi/qla2xxx/qla_inline.h | 2 +- drivers/scsi/qla2xxx/qla_iocb.c| 8 --- drivers/scsi/qla2xxx/qla_isr.c | 8 +++ drivers/scsi/qla2xxx/qla_mbx.c | 8 +++ drivers/scsi/qla2xxx/qla_mid.c | 2 +- drivers/scsi/qla2xxx/qla_mr.c | 5 +++-- drivers/scsi/qla2xxx/qla_os.c | 44 ++ drivers/scsi/qla2xxx/qla_target.c | 2 +- drivers/scsi/scsi_devinfo.c| 2 +- drivers/scsi/scsi_dh.c | 3 +++ drivers/scsi/scsi_lib.c| 27 ++- 18 files changed, 109 insertions(+), 82 deletions(-) With full diff below. James --- diff --git a/drivers/scsi/aacraid/commsup.c b/drivers/scsi/aacraid/commsup.c index 84858d5c8257..0156c9623c35 100644 --- a/drivers/scsi/aacraid/commsup.c +++ b/drivers/scsi/aacraid/commsup.c @@ -1502,9 +1502,10 @@ static int _aac_reset_adapter(struct aac_dev *aac, int forced, u8 reset_type) host = aac->scsi_host_ptr; scsi_block_requests(host); aac_adapter_disable_int(aac); - if (aac->thread->pid != current->pid) { + if (aac->thread && aac->thread->pid != current->pid) { spin_unlock_irq(host->host_lock); kthread_stop(aac->thread); + aac->thread = NULL; jafo = 1; } @@ -1591,6 +1592,7 @@ static int _aac_reset_adapter(struct aac_dev *aac, int forced, u8 reset_type) aac->name); if (IS_ERR(aac->thread)) { retval = PTR_ERR(aac->thread); + aac->thread = NULL; goto out; } } diff --git a/drivers/scsi/aacraid/linit.c b/drivers/scsi/aacraid/linit.c index 2664ea0df35f..f24fb942065d 100644 --- a/drivers/scsi/aacraid/linit.c +++ b/drivers/scsi/aacraid/linit.c @@ -1562,6 +1562,7 @@ static void __aac_shutdown(struct aac_dev * aac) up(>event_wait); } kthread_stop(aac->thread); + aac->thread = NULL; } aac_send_shutdown(aac); diff --git a/drivers/scsi/cxgbi/cxgb4i/cxgb4i.c b/drivers/scsi/cxgbi/cxgb4i/cxgb4i.c index 406e94312d4e..211da1d5a869 100644 --- a/drivers/scsi/cxgbi/cxgb4i/cxgb4i.c +++ b/drivers/scsi/cxgbi/cxgb4i/cxgb4i.c @@ -2108,12 +2108,12 @@ static int t4_uld_rx_handler(void *handle, const __be64 *rsp, log_debug(1 << CXGBI_DBG_TOE, "cdev %p, opcode 0x%x(0x%x,0x%x), skb %p.\n", cdev, opc, rpl->ot.opcode_tid, ntohl(rpl->ot.opcode_tid), skb); - if (cxgb4i_cplhandlers[opc]) - cxgb4i_cplhandlers[opc](cdev, skb); - else { + if (opc >= ARRAY_SIZE(cxgb4i_cplhandlers) || !cxgb4i_cplhandlers[opc]) { pr_err("No handler for opcode 0x%x.\n", opc); __kfree_skb(skb); - } + } else + cxgb4i_cplhandlers[opc](cdev, skb); + return 0; nomem:
Re: [PATCH] Waiting for scsi_host_template release
On Wed, 2018-04-11 at 12:22 -0700, Anatoliy Glagolev wrote: > Hannes, James, thanks a lot for taking a look! > > On the problem the patch is solving: it is in the "Description" part > of my initial e-mail. If you agree that a Scsi_Host may be around > after a driver has unloaded, the problem applies to any driver > creating a new Scsi_Host. No, I don't agree: as I said, the template is part of the module and the module should be reference counted. Any use after free of the template means there's a refcounting bug somewhere. > I fixed it in qla2xxx to illustrate the usage of the new function > and scsi_host_template's flag; also, qla2xxx is where I actually > observe crashes. Other drivers may do the same if they want to > address the problem. > > Here are details on the qla2xxx crash repro, if that is what you were > asking about. If I run "qaucli" utility that retrieves some info from > the driver via SCSI mid-layer, and unload the driver in parallel, the > kernel crashes with the following stack: > > [16834.636216,07] Call Trace: > ... > scsi_proc_hostdir_rm > [16834.641944,07] [] > scsi_host_dev_release+0x3f/0x130 > [16834.647740,07] [] device_release+0x32/0xa0 > [16834.653423,07] [] kobject_cleanup+0x77/0x190 > [16834.659002,07] [] kobject_put+0x25/0x50 > [16834.664430,07] [] put_device+0x17/0x20 > [16834.669740,07] [] > bsg_kref_release_function+0x24/0x30 > [16834.675007,07] [] bsg_release+0x166/0x1d0 > [16834.680148,07] [] __fput+0xcb/0x1d0 > [16834.685156,07] [] fput+0xe/0x10 > [16834.690017,07] [] task_work_run+0x86/0xb0 > [16834.694781,07] [] > exit_to_usermode_loop+0x6b/0x9a > [16834.699466,07] [] > syscall_return_slowpath+0x55/0x60 > [16834.704110,07] [] > int_ret_from_sys_call+0x25/0x9f This one's a bit baffling: open of the bsg device should have already taken the module reference. What was the actual error: NULL deref? The thing which is supposed to hold the module is the device open/close which does scsi_device_put on sd_release ... unless this is some sort of non-scsi device and qlogic forgot how to refcount? > On refcount for scsi_host_template: valid point, I did consider it. > Existing drivers allocate scsi_host_template statically. We cannot > change them all at once. So we have to allow 2 ways of allocating > scsi_host_template: the dynamic one with refcounts and the static one > for legacy driver support. That is kind of ugly, too. In addition, > having a refcounted scsi_host_template after driver unload is > confusing: the memory of scsi_host_template is OK, but any attempt to > call a method from the template still causes a crash. No, the static template already is part of the module so it should be refcounted as a module reference. James
Re: [PATCH] Waiting for scsi_host_template release
On Wed, 2018-04-11 at 16:11 +0200, Hannes Reinecke wrote: > On Mon, 9 Apr 2018 23:23:51 -0700 > Anatoliy Glagolevwrote: > > > Description: > > SCSI mid-layer may hold references to Scsi_Host structs when > > the owning module has already unloaded. Scsi_Host release path > > touches scsi_host_template struct that is usually allocated > > in the unloaded module's memory. That results in a crash. > > To work around the problem, this change implements > > scsi_host_template_release API to be called at driver unload > > path to make sure all Scsi_Host structs are gone before > > releasing scsi_host_template memory. > > > > --- > > drivers/scsi/hosts.c | 2 ++ > > drivers/scsi/qla2xxx/qla_os.c | 2 ++ > > drivers/scsi/scsi_priv.h | 1 + > > drivers/scsi/scsi_proc.c | 64 > > +++ > > include/scsi/scsi_host.h | 17 5 files changed, 80 > > insertions(+), 6 deletions(-) > > > > Whee, that is ugly. And what's the actual problem it's solving? It looks to be something in qla2xxx module removal? > Any particular reason why we can't do refcounting here? We can ... the template is module data and any reference to the dev or the host will increment the module reference. We could even have a dummy template reference that only incremented the module refcount. However, knowing what to do involves knowing what the problem is and how it is triggered. James
[GIT PULL] first round of SCSI updates for the 4.16+ merge window
This is mostly updates of the usual drivers: arcmsr, qla2xx, lpfc, ufs, mpt3sas, hisi_sas. In addition we have removed several really old drivers: sym53c416, NCR53c406a, fdomain, fdomain_cs and removed the old scsi_module.c initialization from all remaining drivers. Plus an assortment of bug fixes, initialization errors and other minor fixes. This time there's a really nasty merge between the fixes and misc branches of the SCSI tree which I've resolved in the merge commit: the non obvious part is that you have to remove some lines from qla2xxx/qla_gs.c to avoid a compile failure which would cause bisection problems, so I've done that and documented it in the merge commit. If you'd like to do it yourself, let me know and I'll send the two branches separately. The patch is available here: git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-for-linus The short changelog is: Adrian Hunter (1): scsi: ufs: Add support for Auto-Hibernate Idle Timer Arnd Bergmann (7): scsi: mpt3sas: clarify mmio pointer types scsi: qedi: fix build regression scsi: sym53c416: avoid section mismatch with LTO scsi: NCR53c406a: avoid section mismatch with LTO scsi: qedf: use correct strncpy() size scsi: qedf: fix LTO-enabled build scsi: qedi: fix building with LTO Bart Van Assche (15): scsi: ufs: Fix kernel-doc errors and warnings scsi: sd_zbc: Fix sd_zbc_get_seq_zones() kernel-doc header scsi: libsas: Fix kernel-doc headers scsi: core: Reduce number of scsi_test_unit_ready() retries scsi: core: Move the eh_deadline module parameter definition scsi: core: scmd_eh_abort_handler(): Add a comment scsi: pmcraid: Use sgl_alloc_order() and sgl_free_order() scsi: pmcraid: Remove an unused structure member scsi: ipr: Use sgl_alloc_order() and sgl_free_order() scsi: scsi_debug: Simplify request tag decoding scsi: qla2xxx: Fix function argument descriptions scsi: qla4xxx: Move an array from a .h into a .c file scsi: qla4xxx: Remove unused symbols scsi: qla2xxx: Remove unused symbols scsi: qla2xxx: Use %p for printing pointers Ching Huang (4): scsi: arcmsr: Change driver version to v1.40.00.05-20180309 scsi: arcmsr: Sleep to avoid CPU stuck too long for waiting adapter ready scsi: arcmsr: Handle adapter removed due to thunderbolt cable disconnection. scsi: arcmsr: Rename ACB_F_BUS_HANG_ON to ACB_F_ADAPTER_REMOVED for adapter hot-plug Christoph Hellwig (11): scsi: remove the old scsi_module.c initialization model scsi: remove the sym53c416 driver scsi: remove the NCR53c406a driver scsi: remove the fdomain and fdomain_cs drivers scsi: mvme147: stop using scsi_module.c scsi: esas2r: remove initialization / cleanup dead wood scsi: core: unexport scsi_host_set_state scsi: documentation: remove ChangeLog.1992-1997 scsi: aha1740: stop using scsi_unregister scsi: ips: don't set .detect and .release in the host template scsi: dpt_i2o: stop using scsi_unregister Colin Ian King (7): scsi: qla2xxx: fix spelling mistake: "existant" -> "existent" scsi: lpfc: make several unions static, fix non-ANSI prototype scsi: scsi_transport_spi: make two const arrays static, shrinks object size scsi: pmcraid: remove redundant initializations of pointer 'ioadl' scsi: isci: remove redundant initialization to 'bit' scsi: libfc: remove redundant initialization of 'disc' scsi: qedf: remove redundant initialization of 'fcport' Dan Carpenter (3): scsi: dpt_i2o: use after free in __adpt_reset() scsi: dpt_i2o: use after free in adpt_release() scsi: atp870u: 64 bit bug in atp885_init() Darren Trapp (10): scsi: qla2xxx: Cleanup code to improve FC-NVMe error handling scsi: qla2xxx: Fix FC-NVMe IO abort during driver reset scsi: qla2xxx: Fix retry for PRLI RJT with reason of BUSY scsi: qla2xxx: Remove nvme_done_list scsi: qla2xxx: Return busy if rport going away scsi: qla2xxx: Fix n2n_ae flag to prevent dev_loss on PDB change scsi: qla2xxx: Add FC-NVMe abort processing scsi: qla2xxx: Add changes for devloss timeout in driver scsi: qla2xxx: Set IIDMA and fcport state before qla_nvme_register_remote() scsi: qla2xxx: Restore ZIO threshold setting Don Brace (1): scsi: smartpqi: update driver version Douglas Gilbert (2): scsi: core: Make SCSI Status CONDITION MET equivalent to GOOD scsi: scsi_debug: implement IMMED bit Finn Thain (1): scsi: jazz_esp, sun3x_esp: Pass struct device pointer in dma calls Geert Uytterhoeven (1): scsi: hisi_sas: Remove depends on HAS_DMA in case of platform dependency Hannes Reinecke (1): scsi: raid_class: Add 'JBOD' RAID level James Smart (43): scsi: lpfc: Change Copyright of 12.0.0.1 modified files to 2018 scsi: lpfc: update driver
[GIT PULL] SCSI fixes for 4.16-rc7
Two driver fixes (ibmvfc, iscsi_tcp) and a USB fix for devices that give the wrong return to Read Capacity and cause a huge log spew. The remaining 5 patches all try to fix commit 84676c1f21e8ff5 "genirq/affinity: assign vectors to all possible CPUs") which broke the non-mq I/O path. The patch is available here: git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-fixes The short changelog is: Brian King (1): scsi: ibmvfc: Avoid unnecessary port relogin Jianchao Wang (1): scsi: iscsi_tcp: set BDI_CAP_STABLE_WRITES when data digest enabled Martin K. Petersen (1): scsi: sd: Remember that READ CAPACITY(16) succeeded Ming Lei (5): scsi: virtio_scsi: unify scsi_host_template scsi: virtio_scsi: fix IO hang caused by automatic irq vector affinity scsi: core: introduce force_blk_mq scsi: megaraid_sas: fix selection of reply queue scsi: hpsa: fix selection of reply queue And the diffstat: drivers/scsi/hosts.c| 1 + drivers/scsi/hpsa.c | 73 drivers/scsi/hpsa.h | 1 + drivers/scsi/ibmvscsi/ibmvfc.c | 6 +- drivers/scsi/iscsi_tcp.c| 8 ++ drivers/scsi/megaraid/megaraid_sas.h| 1 + drivers/scsi/megaraid/megaraid_sas_base.c | 39 - drivers/scsi/megaraid/megaraid_sas_fusion.c | 12 +-- drivers/scsi/sd.c | 2 + drivers/scsi/virtio_scsi.c | 129 include/scsi/scsi_host.h| 3 + 11 files changed, 128 insertions(+), 147 deletions(-) With full diff below. James --- diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c index dd9464920456..ef22b275d050 100644 --- a/drivers/scsi/hosts.c +++ b/drivers/scsi/hosts.c @@ -474,6 +474,7 @@ struct Scsi_Host *scsi_host_alloc(struct scsi_host_template *sht, int privsize) shost->dma_boundary = 0x; shost->use_blk_mq = scsi_use_blk_mq; + shost->use_blk_mq = scsi_use_blk_mq || shost->hostt->force_blk_mq; device_initialize(>shost_gendev); dev_set_name(>shost_gendev, "host%d", shost->host_no); diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c index 87b260e403ec..31423b6dc26d 100644 --- a/drivers/scsi/hpsa.c +++ b/drivers/scsi/hpsa.c @@ -1045,11 +1045,7 @@ static void set_performant_mode(struct ctlr_info *h, struct CommandList *c, c->busaddr |= 1 | (h->blockFetchTable[c->Header.SGList] << 1); if (unlikely(!h->msix_vectors)) return; - if (likely(reply_queue == DEFAULT_REPLY_QUEUE)) - c->Header.ReplyQueue = - raw_smp_processor_id() % h->nreply_queues; - else - c->Header.ReplyQueue = reply_queue % h->nreply_queues; + c->Header.ReplyQueue = reply_queue; } } @@ -1063,10 +1059,7 @@ static void set_ioaccel1_performant_mode(struct ctlr_info *h, * Tell the controller to post the reply to the queue for this * processor. This seems to give the best I/O throughput. */ - if (likely(reply_queue == DEFAULT_REPLY_QUEUE)) - cp->ReplyQueue = smp_processor_id() % h->nreply_queues; - else - cp->ReplyQueue = reply_queue % h->nreply_queues; + cp->ReplyQueue = reply_queue; /* * Set the bits in the address sent down to include: * - performant mode bit (bit 0) @@ -1087,10 +1080,7 @@ static void set_ioaccel2_tmf_performant_mode(struct ctlr_info *h, /* Tell the controller to post the reply to the queue for this * processor. This seems to give the best I/O throughput. */ - if (likely(reply_queue == DEFAULT_REPLY_QUEUE)) - cp->reply_queue = smp_processor_id() % h->nreply_queues; - else - cp->reply_queue = reply_queue % h->nreply_queues; + cp->reply_queue = reply_queue; /* Set the bits in the address sent down to include: * - performant mode bit not used in ioaccel mode 2 * - pull count (bits 0-3) @@ -1109,10 +1099,7 @@ static void set_ioaccel2_performant_mode(struct ctlr_info *h, * Tell the controller to post the reply to the queue for this * processor. This seems to give the best I/O throughput. */ - if (likely(reply_queue == DEFAULT_REPLY_QUEUE)) - cp->reply_queue = smp_processor_id() % h->nreply_queues; - else - cp->reply_queue = reply_queue % h->nreply_queues; + cp->reply_queue = reply_queue; /* * Set the bits in the address sent down to include: * - performant mode bit not used in ioaccel mode 2 @@ -1157,6 +1144,8 @@ static void __enqueue_cmd_and_start_io(struct ctlr_info *h, { dial_down_lockup_detection_during_fw_flash(h, c);
[GIT PULL] SCSI fixes for 4.16-rc6
One driver patch (qla2xxx) which fixes a problem caused by an existing regression fix (FCP discovery is failing) and one generic fix to a longstanding bug in libsas that causes I/O eventually to hang to the device in the face of ATA error recovery. The patch is available here: git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-fixes The short changelog is: Himanshu Madhani (1): scsi: qla2xxx: Remove FC_NO_LOOP_ID for FCP and FC-NVMe Discovery Jason Yan (1): scsi: libsas: defer ata device eh commands to libata And the diffstat: drivers/scsi/libsas/sas_scsi_host.c | 33 + drivers/scsi/qla2xxx/qla_init.c | 1 - 2 files changed, 13 insertions(+), 21 deletions(-) With full diff below. James --- diff --git a/drivers/scsi/libsas/sas_scsi_host.c b/drivers/scsi/libsas/sas_scsi_host.c index 626727207889..a372af68d9a9 100644 --- a/drivers/scsi/libsas/sas_scsi_host.c +++ b/drivers/scsi/libsas/sas_scsi_host.c @@ -223,6 +223,7 @@ int sas_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *cmd) static void sas_eh_finish_cmd(struct scsi_cmnd *cmd) { struct sas_ha_struct *sas_ha = SHOST_TO_SAS_HA(cmd->device->host); + struct domain_device *dev = cmd_to_domain_dev(cmd); struct sas_task *task = TO_SAS_TASK(cmd); /* At this point, we only get called following an actual abort @@ -231,6 +232,14 @@ static void sas_eh_finish_cmd(struct scsi_cmnd *cmd) */ sas_end_task(cmd, task); + if (dev_is_sata(dev)) { + /* defer commands to libata so that libata EH can +* handle ata qcs correctly +*/ + list_move_tail(>eh_entry, _ha->eh_ata_q); + return; + } + /* now finish the command and move it on to the error * handler done list, this also takes it off the * error handler pending list. @@ -238,22 +247,6 @@ static void sas_eh_finish_cmd(struct scsi_cmnd *cmd) scsi_eh_finish_cmd(cmd, _ha->eh_done_q); } -static void sas_eh_defer_cmd(struct scsi_cmnd *cmd) -{ - struct domain_device *dev = cmd_to_domain_dev(cmd); - struct sas_ha_struct *ha = dev->port->ha; - struct sas_task *task = TO_SAS_TASK(cmd); - - if (!dev_is_sata(dev)) { - sas_eh_finish_cmd(cmd); - return; - } - - /* report the timeout to libata */ - sas_end_task(cmd, task); - list_move_tail(>eh_entry, >eh_ata_q); -} - static void sas_scsi_clear_queue_lu(struct list_head *error_q, struct scsi_cmnd *my_cmd) { struct scsi_cmnd *cmd, *n; @@ -261,7 +254,7 @@ static void sas_scsi_clear_queue_lu(struct list_head *error_q, struct scsi_cmnd list_for_each_entry_safe(cmd, n, error_q, eh_entry) { if (cmd->device->sdev_target == my_cmd->device->sdev_target && cmd->device->lun == my_cmd->device->lun) - sas_eh_defer_cmd(cmd); + sas_eh_finish_cmd(cmd); } } @@ -618,12 +611,12 @@ static void sas_eh_handle_sas_errors(struct Scsi_Host *shost, struct list_head * case TASK_IS_DONE: SAS_DPRINTK("%s: task 0x%p is done\n", __func__, task); - sas_eh_defer_cmd(cmd); + sas_eh_finish_cmd(cmd); continue; case TASK_IS_ABORTED: SAS_DPRINTK("%s: task 0x%p is aborted\n", __func__, task); - sas_eh_defer_cmd(cmd); + sas_eh_finish_cmd(cmd); continue; case TASK_IS_AT_LU: SAS_DPRINTK("task 0x%p is at LU: lu recover\n", task); @@ -634,7 +627,7 @@ static void sas_eh_handle_sas_errors(struct Scsi_Host *shost, struct list_head * "recovered\n", SAS_ADDR(task->dev), cmd->device->lun); - sas_eh_defer_cmd(cmd); + sas_eh_finish_cmd(cmd); sas_scsi_clear_queue_lu(work_q, cmd); goto Again; } diff --git a/drivers/scsi/qla2xxx/qla_init.c b/drivers/scsi/qla2xxx/qla_init.c index 00329dda6179..8d7fab3cd01d 100644 --- a/drivers/scsi/qla2xxx/qla_init.c +++ b/drivers/scsi/qla2xxx/qla_init.c @@ -1719,7 +1719,6 @@ qla24xx_handle_plogi_done_event(struct scsi_qla_host *vha, struct event_arg *ea) set_bit(ea->fcport->loop_id, vha->hw->loop_id_map); spin_lock_irqsave(>hw->tgt.sess_lock, flags); - ea->fcport->loop_id = FC_NO_LOOP_ID; ea->fcport->chip_reset = vha->hw->base_qpair->chip_reset;
Re: How to Locate drive directly attached to mpt3sas HBA
On Mon, 2018-03-19 at 17:30 +0100, Jack Wang wrote: > > > > > > > > > > And I think either mpt2sas and/or mpt3sas HBAs (I don't have my > > > hardware > > > nearby) have a SMP target hidden away somewhere. Perhaps someone > > > from > > > LSI/Avago/Broadcom could supply more information about that. > > Indeed, mpt3sas has internal enclosure some how. > snip from "sas3ircu 0 display" output: > Device is a Hard disk > Enclosure # : 1 > Slot # : 4 > SAS Address : 4433221-1-0400- > State : Ready (RDY) > > And it's possible to locate it using "sas3ircu 0 locate 1:4 ON" to > locate the drive. The fat firmware eats pretty much all trace of the enclosure. I think it could be exposed as a device, but that would disrupt the mpt3sas internal handling if our Linux enclosure service also binds to it. James
Re: How to Locate drive directly attached to mpt3sas HBA
On Mon, 2018-03-19 at 15:20 +0100, Douglas Gilbert wrote: > On 2018-03-19 11:40 AM, Jack Wang wrote: > > > > Hi list, > > > > Any one knows how can I locate a HDD directly attached to mpt3sas, > > sas3ircu only supports LOCATE commd to locates driver installed in > > a > > disk enclosure, but not directly attached. > > > > I know microsemi/PMCs supports SGPIO interface to locate drive eg: > > Adp80xxapp sgpio 0 set 0 1 > > > > I searched in latest upstream mpt3sas code, didn't find such > > interface. Do I miss something? > > Hi, > If its a SAS disk, it might have an onboard LED. > > In the sdparm package there is a script called "sas_disk_blink" > that will flash that LED. > > > And I think either mpt2sas and/or mpt3sas HBAs (I don't have my > hardware nearby) have a SMP target hidden away somewhere. Perhaps > someone from LSI/Avago/Broadcom could supply more information about > that. And if mpt3sas just used the standard Linux SES infrastructure, you'd find the disk in sysfs linked to the slot along with a nice sysfs file you can write to to flash the locator light. James
Re: [PATCH v2] scsi: Avoid that .queuecommand() gets called for a quiesced SCSI device
On Fri, 2018-03-16 at 23:10 +, Bart Van Assche wrote: > On Fri, 2018-03-16 at 15:49 -0700, James Bottomley wrote: > > > > In your new code you have > > > > + if (sdev->sdev_state != SDEV_QUIESCE) > > + rtn = shost->hostt->queuecommand(shost, scmd); > > + else > > + rtn = SCSI_MLQUEUE_DEVICE_BUSY; > > > > That sets rtn instead of calling queuecommand > > > > Then you drop through to this code below: > > > > if (rtn) { > > if (timeleft > stall_for) { > > scsi_eh_restore_cmnd(scmd, ); > > timeleft -= stall_for; > > msleep(jiffies_to_msecs(stall_for)); > > goto retry; > > } > > > > Which causes a msleep which is equivalent to the while loop. > > Hello James, > > Thanks for the clarification - apparently we were each looking at > another part of the code. > > If the "rtn = SCSI_MLQUEUE_DEVICE_BUSY" statement is executed that > means that the if-statement that controls that statement noticed that > sdev->sdev_state == SDEV_QUIESCE. Since the while loop above that > statement only finishes if either sdev->sdev_state != SDEV_QUIESCE or > timeleft <= 0, if the "rtn = SCSI_MLQUEUE_DEVICE_BUSY" statement is > executed that implies that timeleft <= 0. Since stall_for > 0, the > expression timeleft > stall_for will evaluate to false. In other > words, the msleep() shown in your e-mail will be skipped. Not if you remove the while loop from the patch which was the original request. James
Re: [PATCH v2] scsi: Avoid that .queuecommand() gets called for a quiesced SCSI device
On Fri, 2018-03-16 at 22:40 +, Bart Van Assche wrote: > On Fri, 2018-03-16 at 15:31 -0700, James Bottomley wrote: > > > > On Fri, 2018-03-16 at 22:21 +, Bart Van Assche wrote: > > > > > > On Fri, 2018-03-16 at 15:00 -0700, James Bottomley wrote: > > > > > > > > > > > > On Fri, 2018-03-16 at 10:40 -0700, Bart Van Assche wrote: > > > > > > > > > > > > > > > @@ -1050,7 +1050,22 @@ static int scsi_send_eh_cmnd(struct > > > > > scsi_cmnd > > > > > *scmd, unsigned char *cmnd, > > > > > > > > > > scsi_log_send(scmd); > > > > > scmd->scsi_done = scsi_eh_done; > > > > > - rtn = shost->hostt->queuecommand(shost, scmd); > > > > > + mutex_lock(>state_mutex); > > > > > + while (sdev->sdev_state == SDEV_QUIESCE && timeleft > > > > > > 0) > > > > > { > > > > > + mutex_unlock(>state_mutex); > > > > > + SCSI_LOG_ERROR_RECOVERY(5, > > > > > sdev_printk(KERN_DEBUG, > > > > > sdev, > > > > > + "%s: state %d <> %d\n", __func__, > > > > > sdev- > > > > > > > > > > > > > > > > > > sdev_state, > > > > > > > > > > + SDEV_QUIESCE)); > > > > > + delay = min(timeleft, stall_for); > > > > > + timeleft -= delay; > > > > > + msleep(jiffies_to_msecs(delay)); > > > > > + mutex_lock(>state_mutex); > > > > > + } > > > > > > > > What's the point of this loop? if you eliminate it, you still > > > > get > > > > exactly the same msleep from the stall_for retry processing. > > > > > > Hello James, > > > > > > The purpose of that loop is to check the SCSI device state every > > > "stall_for" jiffies and to avoid that more than "timeleft" > > > jiffies is > > > spent on waiting. > > > > I know what the loop does; the the question I was asking is doesn't > > setting rtn instead of calling ->queuecommand() achieve the same > > thing? > > Sorry but I don't understand that last question. How could setting > rtn be an alternative for calling ->queuecommand() since we really > want to call ->queuecommand if the SCSI device leaves the quiesced > state before the timeout has expired? Can you rephrase that last > question? In your new code you have + if (sdev->sdev_state != SDEV_QUIESCE) + rtn = shost->hostt->queuecommand(shost, scmd); + else + rtn = SCSI_MLQUEUE_DEVICE_BUSY; That sets rtn instead of calling queuecommand Then you drop through to this code below: if (rtn) { if (timeleft > stall_for) { scsi_eh_restore_cmnd(scmd, ); timeleft -= stall_for; msleep(jiffies_to_msecs(stall_for)); goto retry; } Which causes a msleep which is equivalent to the while loop. James
Re: [PATCH v2] scsi: Avoid that .queuecommand() gets called for a quiesced SCSI device
On Fri, 2018-03-16 at 22:21 +, Bart Van Assche wrote: > On Fri, 2018-03-16 at 15:00 -0700, James Bottomley wrote: > > > > On Fri, 2018-03-16 at 10:40 -0700, Bart Van Assche wrote: > > > > > > @@ -1050,7 +1050,22 @@ static int scsi_send_eh_cmnd(struct > > > scsi_cmnd > > > *scmd, unsigned char *cmnd, > > > > > > scsi_log_send(scmd); > > > scmd->scsi_done = scsi_eh_done; > > > - rtn = shost->hostt->queuecommand(shost, scmd); > > > + mutex_lock(>state_mutex); > > > + while (sdev->sdev_state == SDEV_QUIESCE && timeleft > 0) > > > { > > > + mutex_unlock(>state_mutex); > > > + SCSI_LOG_ERROR_RECOVERY(5, > > > sdev_printk(KERN_DEBUG, > > > sdev, > > > + "%s: state %d <> %d\n", __func__, sdev- > > > > > > > > sdev_state, > > > > > > + SDEV_QUIESCE)); > > > + delay = min(timeleft, stall_for); > > > + timeleft -= delay; > > > + msleep(jiffies_to_msecs(delay)); > > > + mutex_lock(>state_mutex); > > > + } > > > > What's the point of this loop? if you eliminate it, you still get > > exactly the same msleep from the stall_for retry processing. > > Hello James, > > The purpose of that loop is to check the SCSI device state every > "stall_for" jiffies and to avoid that more than "timeleft" jiffies is > spent on waiting. I know what the loop does; the the question I was asking is doesn't setting rtn instead of calling ->queuecommand() achieve the same thing? James
Re: [PATCH v2] scsi: Avoid that .queuecommand() gets called for a quiesced SCSI device
On Fri, 2018-03-16 at 10:40 -0700, Bart Van Assche wrote: > @@ -1050,7 +1050,22 @@ static int scsi_send_eh_cmnd(struct scsi_cmnd > *scmd, unsigned char *cmnd, > > scsi_log_send(scmd); > scmd->scsi_done = scsi_eh_done; > - rtn = shost->hostt->queuecommand(shost, scmd); > + mutex_lock(>state_mutex); > + while (sdev->sdev_state == SDEV_QUIESCE && timeleft > 0) { > + mutex_unlock(>state_mutex); > + SCSI_LOG_ERROR_RECOVERY(5, sdev_printk(KERN_DEBUG, > sdev, > + "%s: state %d <> %d\n", __func__, sdev- > >sdev_state, > + SDEV_QUIESCE)); > + delay = min(timeleft, stall_for); > + timeleft -= delay; > + msleep(jiffies_to_msecs(delay)); > + mutex_lock(>state_mutex); > + } What's the point of this loop? if you eliminate it, you still get exactly the same msleep from the stall_for retry processing. Plus I really don't think you want to call ->queuecommand() with the state mutex held. You don't even need to hold the state mutex to read sdev->state because the read is atomic and the mutex doesn't mediate anything. The check to queuecommand race is the same for every consumer. James
[GIT PULL] SCSI fixes for 4.16-rc5
This is four patches, consisting of one regression from the merge window (qla2xxx) one lonstanding memory leak (sd_zbc) one event queue mislabelling which we want to eliminate to discourage the pattern (mpt3sas) and one behaviour change because re-reading the partition table shouldn't clear the ro flag. The patch is available here: git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-fixes The short changelog is: Bill Kuzeja (1): scsi: qla2xxx: Fix crashes in qla2x00_probe_one on probe failure Damien Le Moal (1): scsi: sd_zbc: Fix potential memory leak Hannes Reinecke (1): scsi: mpt3sas: Do not mark fw_event workqueue as WQ_MEM_RECLAIM Jeremy Cline (1): scsi: sd: Keep disk read-only when re-reading partition And the diffstat: drivers/scsi/mpt3sas/mpt3sas_scsih.c | 2 +- drivers/scsi/qla2xxx/qla_os.c| 59 ++-- drivers/scsi/sd.c| 3 +- drivers/scsi/sd_zbc.c| 35 + 4 files changed, 55 insertions(+), 44 deletions(-) With full diff below. James --- diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c index c2ea13c7e37e..a1cb0236c550 100644 --- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c +++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c @@ -10558,7 +10558,7 @@ _scsih_probe(struct pci_dev *pdev, const struct pci_device_id *id) snprintf(ioc->firmware_event_name, sizeof(ioc->firmware_event_name), "fw_event_%s%d", ioc->driver_name, ioc->id); ioc->firmware_event_thread = alloc_ordered_workqueue( - ioc->firmware_event_name, WQ_MEM_RECLAIM); + ioc->firmware_event_name, 0); if (!ioc->firmware_event_thread) { pr_err(MPT3SAS_FMT "failure at %s:%d/%s()!\n", ioc->name, __FILE__, __LINE__, __func__); diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c index 285911e81728..5c5dcca4d1da 100644 --- a/drivers/scsi/qla2xxx/qla_os.c +++ b/drivers/scsi/qla2xxx/qla_os.c @@ -454,7 +454,7 @@ static int qla2x00_alloc_queues(struct qla_hw_data *ha, struct req_que *req, ha->req_q_map[0] = req; set_bit(0, ha->rsp_qid_map); set_bit(0, ha->req_qid_map); - return 1; + return 0; fail_qpair_map: kfree(ha->base_qpair); @@ -471,6 +471,9 @@ static int qla2x00_alloc_queues(struct qla_hw_data *ha, struct req_que *req, static void qla2x00_free_req_que(struct qla_hw_data *ha, struct req_que *req) { + if (!ha->req_q_map) + return; + if (IS_QLAFX00(ha)) { if (req && req->ring_fx00) dma_free_coherent(>pdev->dev, @@ -481,14 +484,17 @@ static void qla2x00_free_req_que(struct qla_hw_data *ha, struct req_que *req) (req->length + 1) * sizeof(request_t), req->ring, req->dma); - if (req) + if (req) { kfree(req->outstanding_cmds); - - kfree(req); + kfree(req); + } } static void qla2x00_free_rsp_que(struct qla_hw_data *ha, struct rsp_que *rsp) { + if (!ha->rsp_q_map) + return; + if (IS_QLAFX00(ha)) { if (rsp && rsp->ring) dma_free_coherent(>pdev->dev, @@ -499,7 +505,8 @@ static void qla2x00_free_rsp_que(struct qla_hw_data *ha, struct rsp_que *rsp) (rsp->length + 1) * sizeof(response_t), rsp->ring, rsp->dma); } - kfree(rsp); + if (rsp) + kfree(rsp); } static void qla2x00_free_queues(struct qla_hw_data *ha) @@ -1723,6 +1730,8 @@ __qla2x00_abort_all_cmds(struct qla_qpair *qp, int res) struct qla_tgt_cmd *cmd; uint8_t trace = 0; + if (!ha->req_q_map) + return; spin_lock_irqsave(qp->qp_lock_ptr, flags); req = qp->req; for (cnt = 1; cnt < req->num_outstanding_cmds; cnt++) { @@ -3095,14 +3104,14 @@ qla2x00_probe_one(struct pci_dev *pdev, const struct pci_device_id *id) /* Set up the irqs */ ret = qla2x00_request_irqs(ha, rsp); if (ret) - goto probe_hw_failed; + goto probe_failed; /* Alloc arrays of request and response ring ptrs */ - if (!qla2x00_alloc_queues(ha, req, rsp)) { + if (qla2x00_alloc_queues(ha, req, rsp)) { ql_log(ql_log_fatal, base_vha, 0x003d, "Failed to allocate memory for queue pointers..." "aborting.\n"); - goto probe_init_failed; + goto probe_failed; } if (ha->mqenable && shost_use_blk_mq(host)) { @@ -3387,15 +3396,6 @@ qla2x00_probe_one(struct pci_dev *pdev, const struct pci_device_id *id) return 0; -probe_init_failed: - qla2x00_free_req_que(ha, req); - ha->req_q_map[0] = NULL; - clear_bit(0, ha->req_qid_map); - qla2x00_free_rsp_que(ha, rsp); -
Re: [PATCH, resend] scsi: Avoid that .queuecommand() gets called for a quiesced SCSI device
On Wed, 2018-03-14 at 15:45 -0700, Bart Van Assche wrote: > Several SCSI transport and LLD drivers surround code that does not > tolerate concurrent calls of .queuecommand() with scsi_target_block() > / > scsi_target_unblock(). These last two functions use > blk_mq_quiesce_queue() / blk_mq_unquiesce_queue() for scsi-mq request > queues to prevent concurrent .queuecommand() calls. However, that is > not sufficient to prevent .queuecommand() calls from > scsi_send_eh_cmnd(). > Hence surround the .queuecommand() call from the SCSI error handler > with > code that avoids that .queuecommand() gets called in the quiesced > state. > > Notes: > - Converting the .queuecommand() call in scsi_send_eh_cmnd() into > code that calls blk_get_request() + blk_execute_rq() is not an > option since scsi_send_eh_cmnd() must be able to make forward > progress > even if all requests are allocated. > - Converting the .queuecommand() call in scsi_send_eh_cmnd() into a > blk_execute_rq() or blk_mq_requeue_request() call is not an option > either > because that would require to change every individual function in > the I/O > path. Each function in the I/O path would have to be modified such > that it > handles requests received from the block layer core and request > received > from the SCSI EH differently. Since struct scsi_cmnd is not > initialized by > the block layer for filesystem requests, it is not possible to > determine > in scsi_queue_rq() whether or not a request has been submitted by > the > SCSI EH without modifying the block layer. > > Signed-off-by: Bart Van Assche> Cc: Hannes Reinecke > Cc: Johannes Thumshirn > --- > drivers/scsi/scsi_error.c | 13 + > drivers/scsi/scsi_lib.c| 2 ++ > drivers/scsi/scsi_scan.c | 1 + > include/scsi/scsi_device.h | 1 + > 4 files changed, 17 insertions(+) > > diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c > index 946039117bf4..cfc805851a2a 100644 > --- a/drivers/scsi/scsi_error.c > +++ b/drivers/scsi/scsi_error.c > @@ -1042,6 +1042,7 @@ static int scsi_send_eh_cmnd(struct scsi_cmnd > *scmd, unsigned char *cmnd, > unsigned long timeleft = timeout; > struct scsi_eh_save ses; > const unsigned long stall_for = msecs_to_jiffies(100); > + DEFINE_WAIT(wait); > int rtn; > > retry: > @@ -1050,7 +1051,19 @@ static int scsi_send_eh_cmnd(struct scsi_cmnd > *scmd, unsigned char *cmnd, > > scsi_log_send(scmd); > scmd->scsi_done = scsi_eh_done; > + mutex_lock(>state_mutex); > + while (sdev->sdev_state == SDEV_QUIESCE) { > + prepare_to_wait(>state_wq, , > TASK_INTERRUPTIBLE); > + mutex_unlock(>state_mutex); > + SCSI_LOG_ERROR_RECOVERY(5, sdev_printk(KERN_DEBUG, > sdev, > + "%s: state %d <> %d\n", __func__, sdev- > >sdev_state, > + SDEV_QUIESCE)); > + schedule(); > + mutex_lock(>state_mutex); > + } > + finish_wait(>state_wq, ); > rtn = shost->hostt->queuecommand(shost, scmd); > + mutex_unlock(>state_mutex); > if (rtn) { > if (timeleft > stall_for) { This has got to be minutely rare: why not just use the existing stall_for timeout infrastructure instead of adding a waitqueue to every device? James
Re: [PATCH] scsi: resolve COMMAND_SIZE at compile time
On Sat, 2018-03-10 at 14:29 +0100, Stephen Kitt wrote: > Hi Bart, > > On Fri, 9 Mar 2018 22:47:12 +, Bart Van Asschec.com> > wrote: > > > > On Fri, 2018-03-09 at 23:33 +0100, Stephen Kitt wrote: > > > > > > +/* > > > + * SCSI command sizes are as follows, in bytes, for fixed size > > > commands, > > > per > > > + * group: 6, 10, 10, 12, 16, 12, 10, 10. The top three bits of > > > an opcode > > > + * determine its group. > > > + * The size table is encoded into a 32-bit value by subtracting > > > each > > > value > > > + * from 16, resulting in a value of 1715488362 > > > + * (6 << 28 + 6 << 24 + 4 << 20 + 0 << 16 + 4 << 12 + 6 << 8 + 6 > > > << 4 + > > > 10). > > > + * Command group 3 is reserved and should never be used. > > > + */ > > > +#define COMMAND_SIZE(opcode) \ > > > + (16 - (15 & (1715488362 >> (4 * (((opcode) >> 5) & > > > 7) > > > > To me this seems hard to read and hard to verify. Could this have > > been > > written as a combination of ternary expressions, e.g. using a gcc > > statement > > expression to ensure that opcode is evaluated once? > > That’s what I’d tried initially, e.g. > > #define COMMAND_SIZE(opcode) ({ \ > int index = ((opcode) >> 5) & 7; \ > index == 0 ? 6 : (index == 4 ? 16 : index == 3 || index == 5 ? 12 : > 10); \ > }) > > But gcc still reckons that results in a VLA, defeating the initial > purpose of > the exercise. > > Does it help if I make the magic value construction clearer? > > #define SCSI_COMMAND_SIZE_TBL ( \ > (16 - 6)\ > + ((16 - 10) << 4) \ > + ((16 - 10) << 8) \ > + ((16 - 12) << 12) \ > + ((16 - 16) << 16) \ > + ((16 - 12) << 20) \ > + ((16 - 10) << 24) \ > + ((16 - 10) << 28)) > > #define COMMAND_SIZE(opcode) > \ > (16 - (15 & (SCSI_COMMAND_SIZE_TBL >> (4 * (((opcode) >> 5) & > 7) Couldn't we do the less clever thing of making the array a static const and moving it to a header? That way the compiler should be able to work it out at compile time. James signature.asc Description: This is a digitally signed message part
Re: [PATCH] lpfc: Resolve static check error in lpfc_nvmet.c
Really, the commit message has to be descriptive. This makes me think this is about some annoying policy of making everything static rather than a critical bug. What's wrong with lpfc: add missing unlock on defer WQFULL path ? You can then expand on the static checker discovery in the main message. Since this looks to be a pretty common occurrence, perhaps this should be folded with a rebase? James
Re: [PATCH 6/6] lib/scatterlist: Drop order argument from sgl_free_n_order
On Wed, 2018-03-07 at 12:47 +, Tvrtko Ursulin wrote: > From: Tvrtko UrsulinFirstly, I don't see any justifiable benefit to churning this API, so why bother? but secondly this: > We can derive the order from sg->length and so do not need to pass it > in explicitly. Is wrong. I can have a length 2 scatterlist that crosses a page boundary, but I can also have one within a single page, so the order cannot be deduced from the length. James
[GIT PULL] SCSI fixes for 4.16-rc4
This patch is mostly fixes for driver specific issues (nine of them) and the storvsc performance improvement with interrupt handling which was dropped from the previous fixes pull request. We also have two regressions: one is a double call_rcu() in ATA error handling and the other is a missed conversion to BLK_STS_OK in __scsi_error_from_host_byte(). The patch is available here: git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-fixes The short changelog is: Bart Van Assche (1): scsi: core: Avoid that ATA error handling can trigger a kernel hang or oops Darren Trapp (1): scsi: qla2xxx: Fix FC-NVMe LUN discovery Hannes Reinecke (4): scsi: core: return BLK_STS_OK for DID_OK in __scsi_error_from_host_byte() scsi: qla2xxx: ensure async flags are reset correctly scsi: qla2xxx: do not check login_state if no loop id is assigned scsi: qla2xxx: Fixup locking for session deletion Manish Rangankar (1): scsi: qedi: Fix kernel crash during port toggle Mauricio Faria de Oliveira (1): scsi: mpt3sas: fix oops in error handlers after shutdown/unload Michael Kelley (EOSG) (1): scsi: storvsc: Spread interrupts when picking a channel for I/O requests Shivasharan S (1): scsi: megaraid_sas: Do not use 32-bit atomic request descriptor for Ventura controllers Sreekanth Reddy (1): scsi: mpt3sas: wait for and flush running commands on shutdown/unload himanshu.madh...@cavium.com (1): scsi: qla2xxx: Fix NULL pointer crash due to active timer for ABTS And the diffstat: drivers/scsi/hosts.c| 3 -- drivers/scsi/megaraid/megaraid_sas_fusion.c | 42 drivers/scsi/mpt3sas/mpt3sas_base.c | 8 ++--- drivers/scsi/mpt3sas/mpt3sas_base.h | 3 ++ drivers/scsi/mpt3sas/mpt3sas_scsih.c| 21 +--- drivers/scsi/qedi/qedi_fw.c | 5 +++ drivers/scsi/qla2xxx/qla_def.h | 5 +-- drivers/scsi/qla2xxx/qla_gs.c | 5 +++ drivers/scsi/qla2xxx/qla_init.c | 51 - drivers/scsi/qla2xxx/qla_os.c | 14 ++-- drivers/scsi/qla2xxx/qla_target.c | 17 -- drivers/scsi/scsi_error.c | 5 +-- drivers/scsi/scsi_lib.c | 4 +++ drivers/scsi/storvsc_drv.c | 3 +- include/scsi/scsi_cmnd.h| 3 ++ include/scsi/scsi_host.h| 2 -- 16 files changed, 115 insertions(+), 76 deletions(-) With full diff below. James --- diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c index 57bf43e34863..dd9464920456 100644 --- a/drivers/scsi/hosts.c +++ b/drivers/scsi/hosts.c @@ -328,8 +328,6 @@ static void scsi_host_dev_release(struct device *dev) if (shost->work_q) destroy_workqueue(shost->work_q); - destroy_rcu_head(>rcu); - if (shost->shost_state == SHOST_CREATED) { /* * Free the shost_dev device name here if scsi_host_alloc() @@ -404,7 +402,6 @@ struct Scsi_Host *scsi_host_alloc(struct scsi_host_template *sht, int privsize) INIT_LIST_HEAD(>starved_list); init_waitqueue_head(>host_wait); mutex_init(>scan_mutex); - init_rcu_head(>rcu); index = ida_simple_get(_index_ida, 0, 0, GFP_KERNEL); if (index < 0) diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c b/drivers/scsi/megaraid/megaraid_sas_fusion.c index 073ced07e662..dc8e850fbfd2 100644 --- a/drivers/scsi/megaraid/megaraid_sas_fusion.c +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c @@ -216,36 +216,30 @@ inline void megasas_return_cmd_fusion(struct megasas_instance *instance, /** * megasas_fire_cmd_fusion - Sends command to the FW * @instance: Adapter soft state - * @req_desc: 32bit or 64bit Request descriptor + * @req_desc: 64bit Request descriptor * - * Perform PCI Write. Ventura supports 32 bit Descriptor. - * Prior to Ventura (12G) MR controller supports 64 bit Descriptor. + * Perform PCI Write. */ static void megasas_fire_cmd_fusion(struct megasas_instance *instance, union MEGASAS_REQUEST_DESCRIPTOR_UNION *req_desc) { - if (instance->adapter_type == VENTURA_SERIES) - writel(le32_to_cpu(req_desc->u.low), - >reg_set->inbound_single_queue_port); - else { #if defined(writeq) && defined(CONFIG_64BIT) - u64 req_data = (((u64)le32_to_cpu(req_desc->u.high) << 32) | - le32_to_cpu(req_desc->u.low)); + u64 req_data = (((u64)le32_to_cpu(req_desc->u.high) << 32) | + le32_to_cpu(req_desc->u.low)); - writeq(req_data, >reg_set->inbound_low_queue_port); + writeq(req_data, >reg_set->inbound_low_queue_port); #else - unsigned long flags; - spin_lock_irqsave(>hba_lock, flags); -
Re: [PATCH] lpfc: use __raw_writeX on DPP copies
On Mon, 2018-03-05 at 09:03 -0800, James Smart wrote: > This patch replaces: > https://www.spinics.net/lists/linux-scsi/msg117838.html > > A prior lpfc patch: > scsi: lpfc: Add push-to-adapter support to sli4 > commitid=1351e69fc6db30e186295f1c9495d03cef6a01a2 > > Fails compilation on some 32-bit systems as writeq() is not supported > on all architectures. Additionally, it was pointed out that as > writeX() > does byteswapping if necessary for pci vs the cpu endianness, the > code > was broken on BE PPC. > > After discussions with Arnd Bergmann, we've resolved the issue > to the following: > When 32-bit, use 32-bit accesses. > Instead of writeX(), use __raw_writeX() - which writes to io > space while preserving byte order. To use this, the code > was changed to use a different buffer that lpfc prepped > via sli_pcimem_bcopy() that was set to the bytestream to > be written. > > Signed-off-by: Dick Kennedy> Signed-off-by: James Smart > --- > drivers/scsi/lpfc/lpfc_sli.c | 11 +-- > 1 file changed, 9 insertions(+), 2 deletions(-) > > diff --git a/drivers/scsi/lpfc/lpfc_sli.c > b/drivers/scsi/lpfc/lpfc_sli.c > index 4ce3ca6f4b79..675914cc8ab8 100644 > --- a/drivers/scsi/lpfc/lpfc_sli.c > +++ b/drivers/scsi/lpfc/lpfc_sli.c > @@ -140,9 +140,16 @@ lpfc_sli4_wq_put(struct lpfc_queue *q, union > lpfc_wqe *wqe) > lpfc_sli_pcimem_bcopy(wqe, temp_wqe, q->entry_size); > if (q->dpp_enable && q->phba->cfg_enable_dpp) { > /* write to DPP aperture taking advatage of Combined > Writes */ > - tmp = (uint8_t *)wqe; > + tmp = (uint8_t *)temp_wqe; > +#ifdef CONFIG_64BIT Whether a platform has a functioning writeq isn't related to its 64 bitness, there are a few 32 bit platforms which have it. As long as you don't get an include of the lo-hi helpers which really confused this situation, the define you're looking for is #ifdef __raw_writeq James
Re: [PATCH] qla2xxx: Fix NULL pointer crash due to active timer for ABTS
On Mon, 2018-02-12 at 10:28 -0800, Himanshu Madhani wrote: [...] > Cc:#4.4+ This is the wrong stable tag, which would lead to stable not picking up the patch automatically. The correct stable address is Cc: #4.4+ ^^ Please be more careful in future. Martin, can you fix this up with a rebase and I'll resync my tree? Thanks, James
Re: [GIT PULL] SCSI fixes for 4.16-rc2
On Thu, 2018-02-22 at 16:37 -0800, Linus Torvalds wrote: > On Thu, Feb 22, 2018 at 11:28 AM, James Bottomley > <james.bottom...@hansenpartnership.com> wrote: > > > > > > git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi- > > fixes > > When I pull that, I get something completely different from what you > claim I should get. Much bigger, and a lot more commits. > > Hmm? > > Dropped. OK, I see the problem, but I don't currently have an explanation for how it happened. The scsi-fixes tag somehow became attached to a push of the misc tree for the next merge window instead of the fixes branch. I've reset it to where it should be commit 1bc5ad3a6acdcf56f83272f2de1cd2389ea9e9e2 Author: Manish Rangankar <manish.rangan...@cavium.com> Date: Sun Feb 11 22:48:41 2018 -0800 scsi: qla4xxx: skip error recovery in case of register disconnect. So it should all be correct if you re-pull. I suppose it's good that the checks and balances of changelog + diffstat caught this; I'll poke about in my current preparation scripts to see if I can find out what went wrong. James
[GIT PULL] SCSI fixes for 4.16-rc2
These are mostly fixes for problems with merge window code. In addition we have one doc update (alua) and two dead code removals (aiclib and octogon) a spurious assignment removal (csiostor) and a performance improvement for storvsc involving better interrupt spreading and increasing the command per lun handling. [fixup over previous pull to drop one of the storvsc patches] The patch is available here: git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-fixes The short changelog is: Andrew Vasquez (1): scsi: qedi: Fix truncation of CHAP name and secret Bart Van Assche (3): scsi: qla2xxx: Fix a locking imbalance in qlt_24xx_handle_els() scsi: scsi_dh: Document alua_rtpg_queue() arguments scsi: qla2xxx: Avoid triggering undefined behavior in qla2x00_mbx_completion() Chad Dupuis (1): scsi: bnx2fc: Fix check in SCSI completion handler for timed out request Colin Ian King (1): scsi: csiostor: remove redundant assignment to pointer 'ln' Corentin Labbe (2): scsi: Remove Makefile entry for oktagon files scsi: aic7xxx: remove aiclib.c Dan Carpenter (2): scsi: mptfusion: Add bounds check in mptctl_hp_targetinfo() scsi: sym53c8xx_2: iterator underflow in sym_getsync() Himanshu Madhani (1): scsi: qla2xxx: Fix incorrect handle for abort IOCB Manish Rangankar (1): scsi: qla4xxx: skip error recovery in case of register disconnect. Meelis Roos (1): scsi: aacraid: fix shutdown crash when init fails Michael Kelley (EOSG) (1): scsi: storvsc: Increase cmd_per_lun for higher speed devices Nilesh Javali (1): scsi: qedi: Cleanup local str variable Quinn Tran (2): scsi: qla2xxx: Fix double free bug after firmware timeout scsi: qla2xxx: Fix memory corruption during hba reset test Sujit Reddy Thumma (1): scsi: ufs: Enable quirk to ignore sending WRITE_SAME command Tomas Henzl (1): scsi: mpt3sas: fix an out of bound write Tyrel Datwyler (1): scsi: ibmvfc: fix misdefined reserved field in ibmvfc_fcp_rsp_info And the diffstat: drivers/message/fusion/mptctl.c| 2 ++ drivers/scsi/Makefile | 1 - drivers/scsi/aacraid/linit.c | 4 ++- drivers/scsi/aic7xxx/aiclib.c | 34 -- drivers/scsi/bnx2fc/bnx2fc_io.c| 1 + drivers/scsi/csiostor/csio_lnode.c | 2 +- drivers/scsi/device_handler/scsi_dh_alua.c | 5 +++ drivers/scsi/ibmvscsi/ibmvfc.h | 2 +- drivers/scsi/mpt3sas/mpt3sas_base.c| 5 ++- drivers/scsi/qedi/qedi_main.c | 55 ++ drivers/scsi/qla2xxx/qla_init.c| 23 ++--- drivers/scsi/qla2xxx/qla_iocb.c| 7 ++-- drivers/scsi/qla2xxx/qla_isr.c | 6 ++-- drivers/scsi/qla2xxx/qla_os.c | 2 ++ drivers/scsi/qla2xxx/qla_target.c | 2 -- drivers/scsi/qla4xxx/ql4_def.h | 2 ++ drivers/scsi/qla4xxx/ql4_os.c | 46 + drivers/scsi/storvsc_drv.c | 2 +- drivers/scsi/sym53c8xx_2/sym_hipd.c| 2 +- drivers/scsi/ufs/ufshcd.c | 2 ++ 20 files changed, 107 insertions(+), 98 deletions(-) delete mode 100644 drivers/scsi/aic7xxx/aiclib.c With full diff below. James --- diff --git a/drivers/message/fusion/mptctl.c b/drivers/message/fusion/mptctl.c index 8d12017b9893..4470630dd545 100644 --- a/drivers/message/fusion/mptctl.c +++ b/drivers/message/fusion/mptctl.c @@ -2687,6 +2687,8 @@ mptctl_hp_targetinfo(unsigned long arg) __FILE__, __LINE__, iocnum); return -ENODEV; } + if (karg.hdr.id >= MPT_MAX_FC_DEVICES) + return -EINVAL; dctlprintk(ioc, printk(MYIOC_s_DEBUG_FMT "mptctl_hp_targetinfo called.\n", ioc->name)); diff --git a/drivers/scsi/Makefile b/drivers/scsi/Makefile index fcfd28d2884c..de1b3fce936d 100644 --- a/drivers/scsi/Makefile +++ b/drivers/scsi/Makefile @@ -185,7 +185,6 @@ ncr53c8xx-flags-$(CONFIG_SCSI_ZALON) \ CFLAGS_ncr53c8xx.o := $(ncr53c8xx-flags-y) $(ncr53c8xx-flags-m) zalon7xx-objs := zalon.o ncr53c8xx.o NCR_Q720_mod-objs := NCR_Q720.o ncr53c8xx.o -oktagon_esp_mod-objs := oktagon_esp.o oktagon_io.o # Files generated that shall be removed upon make clean clean-files := 53c700_d.h 53c700_u.h diff --git a/drivers/scsi/aacraid/linit.c b/drivers/scsi/aacraid/linit.c index ad6ec573cc87..b730e8edb8b3 100644 --- a/drivers/scsi/aacraid/linit.c +++ b/drivers/scsi/aacraid/linit.c @@ -1690,8 +1690,10 @@ static int aac_probe_one(struct pci_dev *pdev, const struct pci_device_id *id) * Map in the registers from the adapter. */ aac->base_size = AAC_MIN_FOOTPRINT_SIZE; - if ((*aac_drivers[index].init)(aac)) + if ((*aac_drivers[index].init)(aac)) { + error = -ENODEV; goto out_unmap; + }
Re: [GIT PULL] SCSI fixes for 4.16-rc1
On Wed, 2018-02-14 at 12:19 -0800, James Bottomley wrote: > These are mostly fixes for problems with merge window code. In > addition we have one doc update (alua) and two dead code removals > (aiclib and octogon) a spurious assignment removal (csiostor) and a > performance improvement for storvsc involving better interrupt > spreading and increasing the command per lun handling. Actually, can I cancel this. We've had a tree rebase between mine and Martin's which means I need to do a reset. I'll do that now and submit a new pull request next week. Thanks, James
[GIT PULL] SCSI fixes for 4.16-rc1
These are mostly fixes for problems with merge window code. In addition we have one doc update (alua) and two dead code removals (aiclib and octogon) a spurious assignment removal (csiostor) and a performance improvement for storvsc involving better interrupt spreading and increasing the command per lun handling. The patch is available here: git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-fixes The short changelog is: Bart Van Assche (3): scsi: qla2xxx: Fix a locking imbalance in qlt_24xx_handle_els() scsi: scsi_dh: Document alua_rtpg_queue() arguments scsi: qla2xxx: Avoid triggering undefined behavior in qla2x00_mbx_completion() Chad Dupuis (1): scsi: bnx2fc: Fix check in SCSI completion handler for timed out request Colin Ian King (1): scsi: csiostor: remove redundant assignment to pointer 'ln' Corentin Labbe (2): scsi: Remove Makefile entry for oktagon files scsi: aic7xxx: remove aiclib.c Dan Carpenter (2): scsi: mptfusion: Add bounds check in mptctl_hp_targetinfo() scsi: sym53c8xx_2: iterator underflow in sym_getsync() Himanshu Madhani (1): scsi: qla2xxx: Fix incorrect handle for abort IOCB Michael Kelley (EOSG) (2): scsi: storvsc: Increase cmd_per_lun for higher speed devices scsi: storvsc: Spread interrupts when picking a channel for I/O requests Quinn Tran (2): scsi: qla2xxx: Fix double free bug after firmware timeout scsi: qla2xxx: Fix memory corruption during hba reset test Sujit Reddy Thumma (1): scsi: ufs: Enable quirk to ignore sending WRITE_SAME command Tomas Henzl (1): scsi: mpt3sas: fix an out of bound write Tyrel Datwyler (1): scsi: ibmvfc: fix misdefined reserved field in ibmvfc_fcp_rsp_info and diffstat: drivers/message/fusion/mptctl.c| 2 ++ drivers/scsi/Makefile | 1 - drivers/scsi/aic7xxx/aiclib.c | 34 -- drivers/scsi/bnx2fc/bnx2fc_io.c| 1 + drivers/scsi/csiostor/csio_lnode.c | 2 +- drivers/scsi/device_handler/scsi_dh_alua.c | 5 + drivers/scsi/ibmvscsi/ibmvfc.h | 2 +- drivers/scsi/mpt3sas/mpt3sas_base.c| 5 - drivers/scsi/qla2xxx/qla_init.c| 23 +++- drivers/scsi/qla2xxx/qla_iocb.c| 7 +++--- drivers/scsi/qla2xxx/qla_isr.c | 6 -- drivers/scsi/qla2xxx/qla_os.c | 2 ++ drivers/scsi/qla2xxx/qla_target.c | 2 -- drivers/scsi/storvsc_drv.c | 5 +++-- drivers/scsi/sym53c8xx_2/sym_hipd.c| 2 +- drivers/scsi/ufs/ufshcd.c | 2 ++ 16 files changed, 32 insertions(+), 69 deletions(-) delete mode 100644 drivers/scsi/aic7xxx/aiclib.c With full diff below. James --- diff --git a/drivers/message/fusion/mptctl.c b/drivers/message/fusion/mptctl.c index 8d12017b9893..4470630dd545 100644 --- a/drivers/message/fusion/mptctl.c +++ b/drivers/message/fusion/mptctl.c @@ -2687,6 +2687,8 @@ mptctl_hp_targetinfo(unsigned long arg) __FILE__, __LINE__, iocnum); return -ENODEV; } + if (karg.hdr.id >= MPT_MAX_FC_DEVICES) + return -EINVAL; dctlprintk(ioc, printk(MYIOC_s_DEBUG_FMT "mptctl_hp_targetinfo called.\n", ioc->name)); diff --git a/drivers/scsi/Makefile b/drivers/scsi/Makefile index fcfd28d2884c..de1b3fce936d 100644 --- a/drivers/scsi/Makefile +++ b/drivers/scsi/Makefile @@ -185,7 +185,6 @@ ncr53c8xx-flags-$(CONFIG_SCSI_ZALON) \ CFLAGS_ncr53c8xx.o := $(ncr53c8xx-flags-y) $(ncr53c8xx-flags-m) zalon7xx-objs := zalon.o ncr53c8xx.o NCR_Q720_mod-objs := NCR_Q720.o ncr53c8xx.o -oktagon_esp_mod-objs := oktagon_esp.o oktagon_io.o # Files generated that shall be removed upon make clean clean-files := 53c700_d.h 53c700_u.h diff --git a/drivers/scsi/aic7xxx/aiclib.c b/drivers/scsi/aic7xxx/aiclib.c deleted file mode 100644 index 828ae3d9a510.. --- a/drivers/scsi/aic7xxx/aiclib.c +++ /dev/null @@ -1,34 +0,0 @@ -/* - * Implementation of Utility functions for all SCSI device types. - * - * Copyright (c) 1997, 1998, 1999 Justin T. Gibbs. - * Copyright (c) 1997, 1998 Kenneth D. Merry. - * All rights reserved. - * - * Redistribution and use in source and binary forms, with or without - * modification, are permitted provided that the following conditions - * are met: - * 1. Redistributions of source code must retain the above copyright - *notice, this list of conditions, and the following disclaimer, - *without modification, immediately at the beginning of the file. - * 2. The name of the author may not be used to endorse or promote products - *derived from this software without specific prior written permission. - * - * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND - * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE - * IMPLIED WARRANTIES OF
Re: [PATCH v1] scsi: mpt3sas: Use lo_hi_writeq() helper
On Wed, 2018-02-14 at 21:48 +0200, Andy Shevchenko wrote: > On Wed, 2018-02-14 at 11:40 -0800, James Bottomley wrote: > > > > On Wed, 2018-02-14 at 20:10 +0200, Andy Shevchenko wrote: > > > > > > Since we have a writeq() for 32-bit architectures as provided by > > > IO > > > non-atomic helpers, there is no need to open code it. > > > > > > Moreover sparse complains about this: > > > > > > drivers/scsi/mpt3sas/mpt3sas_base.c:2975:16: expected unsigned > > > long > > > long val > > > drivers/scsi/mpt3sas/mpt3sas_base.c:2975:16: got restricted > > > __le64 > > > > > > > > > Fixing this by replacing custom writeq() with one provided by > > > io-64-nonatomic-lo-hi.h header. > > > > > > > > > > > > -#if defined(writeq) && defined(CONFIG_64BIT) > > > -static inline void > > > -_base_writeq(__u64 b, volatile void __iomem *addr, spinlock_t > > > *writeq_lock) > > > -{ > > > - writeq(cpu_to_le64(b), addr); > > > -} > > > -#else > > > static inline void > > > _base_writeq(__u64 b, volatile void __iomem *addr, spinlock_t > > > *writeq_lock) > > > { > > > @@ -2985,11 +2979,9 @@ _base_writeq(__u64 b, volatile void > > > __iomem > > > *addr, spinlock_t *writeq_lock) > > > __u64 data_out = cpu_to_le64(b); > > > > > > spin_lock_irqsave(writeq_lock, flags); > > > - writel((u32)(data_out), addr); > > > - writel((u32)(data_out >> 32), (addr + 4)); > > > + writeq(data_out, addr); > > > spin_unlock_irqrestore(writeq_lock, flags); > > > } > > > -#endif > > > > This would rather defeat the purpose of the original code, I think. > > James, TBH, I don't see any value of that lock. What it's protecting > from? simultaneous thread doing writeq()? But this is pointless if at > the same time you will have writel() to the device. The lock prevents two threads doing an interleaved write to this specific address which could be caused by two threads writing to the same address. If I remember correctly the firmware hangs in that case. > For my opinion perhaps complete removal of the custom writeq() and > replacing it by just writeq() with enabled non-atomic helpers will do > the job. > > The code is very old, and I have no idea why it's done this (strange) > way. The write seems to trigger starting the engine on the HBA if you look at the code, which is why it must be written completely and in order. It's equivalent to a mailbox post. James
Re: [PATCH v1] scsi: mpt3sas: Use lo_hi_writeq() helper
On Wed, 2018-02-14 at 20:10 +0200, Andy Shevchenko wrote: > Since we have a writeq() for 32-bit architectures as provided by IO > non-atomic helpers, there is no need to open code it. > > Moreover sparse complains about this: > > drivers/scsi/mpt3sas/mpt3sas_base.c:2975:16: expected unsigned long > long val > drivers/scsi/mpt3sas/mpt3sas_base.c:2975:16: got restricted __le64 > > > Fixing this by replacing custom writeq() with one provided by > io-64-nonatomic-lo-hi.h header. > > Reported-by: kbuild test robot> Signed-off-by: Andy Shevchenko > --- > drivers/scsi/mpt3sas/mpt3sas_base.c | 14 +++--- > 1 file changed, 3 insertions(+), 11 deletions(-) > > diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.c > b/drivers/scsi/mpt3sas/mpt3sas_base.c > index 59a87ca328d3..a92ab4a801d7 100644 > --- a/drivers/scsi/mpt3sas/mpt3sas_base.c > +++ b/drivers/scsi/mpt3sas/mpt3sas_base.c > @@ -56,6 +56,7 @@ > #include > #include > #include > +#include > #include > #include > #include > @@ -2968,16 +2969,9 @@ mpt3sas_base_free_smid(struct MPT3SAS_ADAPTER > *ioc, u16 smid) > * @writeq_lock: spin lock > * > * Glue for handling an atomic 64 bit word to MMIO. This special > handling takes > - * care of 32 bit environment where its not quarenteed to send the > entire word > + * care of 32 bit environment where its not guaranteed to send the > entire word > * in one transfer. > */ > -#if defined(writeq) && defined(CONFIG_64BIT) > -static inline void > -_base_writeq(__u64 b, volatile void __iomem *addr, spinlock_t > *writeq_lock) > -{ > - writeq(cpu_to_le64(b), addr); > -} > -#else > static inline void > _base_writeq(__u64 b, volatile void __iomem *addr, spinlock_t > *writeq_lock) > { > @@ -2985,11 +2979,9 @@ _base_writeq(__u64 b, volatile void __iomem > *addr, spinlock_t *writeq_lock) > __u64 data_out = cpu_to_le64(b); > > spin_lock_irqsave(writeq_lock, flags); > - writel((u32)(data_out), addr); > - writel((u32)(data_out >> 32), (addr + 4)); > + writeq(data_out, addr); > spin_unlock_irqrestore(writeq_lock, flags); > } > -#endif This would rather defeat the purpose of the original code, I think. The coding seems to indicate that if the platform can do a writeq atomically (i.e. it's 64 bit and has the primitive) then it should and not take the hit of using a lock. Otherwise the 64 bit value is updated by two 32 bit writes protected by a lock to ensure we don't get interleaving of the write. So I think you can replace the double writel with a lo_hi_writeq but you have to lose the lock if the platform supports the atomic 64 bit write for performance reasons. James
[GIT PULL] SCSI postmerge updates for the 4.15+ merge window
This is a set of three patches that depended on mq and zone changes in the block tree (now upstream). The patch is available here: git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-postmerge The short changelog is: Bart Van Assche (1): scsi: scsi-mq-debugfs: Show more information Damien Le Moal (2): scsi: sd: Remove zone write locking scsi: sd_zbc: Initialize device request queue zoned data And the diffstat: drivers/scsi/scsi_debugfs.c | 40 +++- drivers/scsi/sd.c | 41 +--- drivers/scsi/sd.h | 11 --- drivers/scsi/sd_zbc.c | 235 +++- include/scsi/scsi_cmnd.h| 3 +- 5 files changed, 187 insertions(+), 143 deletions(-) James
Re: [GIT PULL] first round of SCSI updates for the 4.14+ merge window
On Wed, 2018-01-31 at 11:29 -0800, Linus Torvalds wrote: > On Wed, Jan 31, 2018 at 9:42 AM, James Bottomley > <james.bottom...@hansenpartnership.com> wrote: > > > > > > git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi- > > misc > > Ok, now I did indeed get that > > gpg: Can't check signature: unknown pubkey algorithm > > because of your fancy new key. But making git use gpg2 indeed fixes > it, since I have a recent enough version. > > Let's see if anybody else even notices. I don't think lots of people > look at the signatures. Great! thanks for being the guinea pig. At least we know the theory is now tested. Regards, James
Sorry about all the patch dropped email noise
If you received it, just ignore it. I didn't initialize my fixes tree correctly before pulling in Martin's branch, so it wrongly sent a patch dropped email for every patch in the misc tree (which has already been routed to Linus). James
[GIT PULL] first round of SCSI updates for the 4.14+ merge window
This is mostly updates of the usual driver suspects: arcmsr, scsi_debug, mpt3sas, lpfc, cxlflash, qla2xxx, aacraid, megaraid_sas, hisi_sas. We also have a rework of the libsas hotplug handling to make it more robust, a slew of 32 bit time conversions and fixes, and a host of the usual minor updates and style changes. The biggest potential for regressions is the libsas hotplug changes, but so far they seem stable under testing. The patch is available here: git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-misc The short changelog is: Andy Shevchenko (3): scsi: mptfusion: Use snprintf() instead of open coded divisions scsi: hpsa: Use vsnprintf extension %phN scsi: libsas: remove private hex2bin() implementation Anil Gurumurthy (1): scsi: qla2xxx: Add XCB counters to debugfs Arnd Bergmann (16): scsi: arcmsr: avoid do_gettimeofday scsi: fas216: fix sense buffer initialization scsi: megaraid: use ktime_get_real for firmware time scsi: fnic: use 64-bit timestamps scsi: bfa: convert to strlcpy/strlcat scsi: scsi_debug: remove jiffies_to_timespec scsi: 3w-9xxx: rework lock timeouts scsi: 3ware: use 64-bit times for FW time sync scsi: 3ware: fix 32-bit time calculations scsi: bfa: use 64-bit times in bfa_aen_entry_s ABI scsi: bfa: try to sanitize vendor netlink events scsi: bfa: replace bfa_get_log_time() with ktime_get_real_seconds() scsi: bfa: document overflow of io_profile_start_time scsi: bfa: improve bfa_ioc_send_enable/disable data scsi: bfa: use proper time accessor for stats_reset_time scsi: bfa: use ktime_get_real_ts64 for firmware timestamp Bart Van Assche (9): scsi: core: Change third __scsi_queue_insert() argument from int to bool scsi: qla2xxx: Suppress gcc 7 fall-through warnings scsi: scsi_debug: Add support for injecting SCSI_MLQUEUE_HOST_BUSY scsi: dh: Remove scsi_dh_remove_device() scsi: core: Unexport scsi_initialize_rq() scsi: core: Introduce scsi_devinfo_key enumeration type scsi: core: scsi_get_device_flags_keyed(): Always return device flags scsi: core: Convert a source code comment into a runtime check scsi: core: Ensure that the SCSI error handler gets woken up Bryant G. Ly (1): scsi: ibmvscsis: add DRC indices to debug statements Chaitra P B (1): scsi: mpt3sas: Proper handling of set/clear of "ATA command pending" flag. Ching Huang (23): scsi: arcmsr: simplify arcmsr_request_device_map routine scsi: arcmsr: simplify all arcmsr_hbaX_get_config routine by call a new get_adapter_config function scsi: arcmsr: simplify arcmsr_hbaE_get_config function scsi: arcmsr: waiting for iop firmware ready before issue get_config command to iop scsi: arcmsr: simplify arcmsr_hbaC_get_config function scsi: arcmsr: Fix command result for CHECK_CONDITION scsi: arcmsr: Update driver version to v1.40.00.04-20171130 scsi: arcmsr: Add driver module parameter msix_enable scsi: arcmsr: Add driver module parameter msi_enable scsi: arcmsr: Fix grammar scsi: arcmsr: Adjust whitespace scsi: arcmsr: Spin off duplicate code scsi: arcmsr: Fix clear doorbell queue on ACB_ADAPTER_TYPE_B scsi: arcmsr: Add a function to set date and time to firmware scsi: arcmsr: Add ACB_F_MSG_GET_CONFIG to acb->acb_flags scsi: arcmsr: Add driver option cmd_per_lun scsi: arcmsr: Replace constant ARCMSR_MAX_OUTSTANDING_CMD scsi: arcmsr: Add driver option host_can_queue scsi: arcmsr: replace constant ARCMSR_MAX_FREECCB_NUM scsi: arcmsr: Increase host controller command queue depth scsi: arcmsr: Add code for ACB_ADAPTER_TYPE_E scsi: arcmsr: simplify arcmsr_iop_init function scsi: arcmsr: Redefine ACB_ADAPTER_TYPE_A, _B, _C, _D Christopher Díaz Riveros (1): scsi: ibmvfc: Remove unneeded semicolons Colin Ian King (15): scsi: mptsas: remove duplicated assignment to pointer head scsi: mpt3sas: make function _get_st_from_smid static scsi: bfa: use ARRAY_SIZE for array sizing calculation on array __pciids scsi: qla2xxx: remove redundant assignment of d scsi: aacraid: remove redundant setting of variable c scsi: lpfc: fix a couple of minor indentation issues scsi: lpfc: don't dereference localport before it has been null checked scsi: arcmsr: remove redundant check for secs < 0 scsi: fusion: clean up some indentations scsi: ipr: fix incorrect indentation of assignment statement scsi: csiostor: fix spelling mistake: "Couldnt" -> "Couldn't" scsi: bnx2fc: fix spelling mistake: "Couldnt" -> "Couldn't" scsi: wd719x: make card_types static const, shrinks object size scsi: bfa: remove unused pointer 'port' scsi: aacraid: remove unused variable managed_request_id Dan Carpenter (2): scsi: storvsc: missing error
Re: [LSF/MM TOPIC] Two blk-mq related topics
On Mon, 2018-01-29 at 14:00 -0700, Jens Axboe wrote: > On 1/29/18 1:56 PM, James Bottomley wrote: > > > > On Mon, 2018-01-29 at 23:46 +0800, Ming Lei wrote: > > [...] > > > > > > 2. When to enable SCSI_MQ at default again? > > > > I'm not sure there's much to discuss ... I think the basic answer > > is as soon as Christoph wants to try it again. > > FWIW, internally I've been running various IO intensive workloads on > what is essentially 4.12 upstream with scsi-mq the default (with > mq-deadline as the scheduler) and comparing IO workloads with a > previous 4.6 kernel (without scsi-mq), and things are looking > great. > > We're never going to iron out the last kinks with it being off > by default, I think we should attempt to flip the switch again > for 4.16. Absolutely, I agree we turn it on ASAP. I just don't want to be on the receiving end of Linus' flamethrower because a bug we already had reported against scsi-mq caused problems. Get confirmation from the original reporters (or as close to it as you can) that their problems are fixed and we're good to go; he won't kick us nearly as hard for new bugs that turn up. James
Re: [LSF/MM TOPIC] Two blk-mq related topics
On Mon, 2018-01-29 at 23:46 +0800, Ming Lei wrote: [...] > 2. When to enable SCSI_MQ at default again? I'm not sure there's much to discuss ... I think the basic answer is as soon as Christoph wants to try it again. > SCSI_MQ is enabled on V3.17 firstly, but disabled at default. In > V4.13-rc1, it is enabled at default, but later the patch is reverted > in V4.13-rc7, and becomes disabled at default too. > > Now both the original reported PM issue(actually SCSI quiesce) and > the sequential IO performance issue have been addressed. Is the blocker bug just not closed because no-one thought to do it: https://bugzilla.kernel.org/show_bug.cgi?id=178381 (we have confirmed that this issue is now fixed with the original reporter?) And did the Huawei guy (Jonathan Cameron) confirm his performance issue was fixed (I don't think I saw email that he did)? James
Re: [PATCH resend 0/6] Fix cdrom autoclose
On Fri, 2018-01-26 at 17:58 +0100, Michal Suchanek wrote: > First time I did not get any feedback for the patches. This is likely because no-one who might inspect the code saw the patches ... what list are they going to? I'm on the block, scsi and ide mailing lists and I only saw a doc patch the last time. James
Re: [LSF/MM TOPIC] Patch Submission process and Handling Internal Conflict
On Wed, 2018-01-24 at 19:26 +, Bart Van Assche wrote: > On Wed, 2018-01-24 at 11:05 -0800, James Bottomley wrote: > > > > 2. Handling Internal Conflict > > > > My observation here is that actually most conflict is generated by > > the review process (I know, if we increase reviews as I propose in > > 1. we'll increase conflict on the lists on the basis of this > > observation), so I've been thinking about ways to de-escalate > > it. The principle issue is that a review which doesn't just say > > the patch is fine (or fine except for nitpicks) can be taken as > > criticism and criticism is often processed personally. The way you > > phrase criticism can have a great bearing on the amount of personal > > insult taken by the other party. Corny as it sounds, the 0day bot > > response "Hi Z, I love your patch! Perhaps something to improve:" > > is specifically targetted at this problem and seems actually to > > work. I think we could all benefit from discussing how to give and > > receive criticism in the form of patch reviews responsibly, > > especially as not everyone's native language in English and certain > > common linguistic phrasings in other languages can come off as rude > > when directly translated to English (Russian springs immediately to > > mind for some reason here). Also Note, I think fixing the review > > problem would solve most of the issues, so I'm not proposing > > anything more formal like the code of conflict stuff in the main > > kernel. > > > > We could lump both of these under a single "Community Discussion" > > topic if the organizers prefer ... especially if anyone has any > > other community type issues they'd like to bring up. > > Hello James, > > How about discussing the following two additional topics during the > same or another session: > * We all want a concensus about the code and the algorithms in the > Linux > kernel. However, some contributors are not interested in trying to > strive > towards a concensus. If some contributors e.g. receive a request to > rework > their patches, if they don't like that request and if the reviewer > is > working for the same employer sometimes they try to use the > corporate > hierarchy to make the reviewer shut up. I think this is behavior > that works > against the long-term interests of the Linux kernel. Trying to intimidate people into doing what you want is a well known social tool. The particular problem here is that it's the corporate power structure that's used to effect the intimidation. That's pretty much outside of our control (unless we work for the same company), so there's not much we can do to stop it except say you shouldn't try to intimidate reviewers. I suppose one practical policy could be demanding reviews from independent (meaning outside the corporate power structure) people. > * Some other contributors are not interested in achieving a consensus > and do > not attempt to address reviewer feedback but instead keep arguing > or do what > they can to insult the reviewer. Well, I think this fits neatly in the giving and receiving and receiving criticism bucket. The first rule of interacting with others is "never attribute to malice something which could possibly be accidental". In the end it's up to the maintainer to arbitrate based on their opinion of the merits of the review. James
Re: [LSF/MM TOPIC] Patch Submission process and Handling Internal Conflict
On Wed, 2018-01-24 at 11:20 -0800, Mike Kravetz wrote: > On 01/24/2018 11:05 AM, James Bottomley wrote: > > > > I've got two community style topics, which should probably be > > discussed > > in the plenary > > > > 1. Patch Submission Process > > > > Today we don't have a uniform patch submission process across > > Storage, Filesystems and MM. The question is should we (or at > > least should we adhere to some minimal standards). The standard > > we've been trying to hold to in SCSI is one review per accepted > > non-trivial patch. For us, it's useful because it encourages > > driver writers to review each other's patches rather than just > > posting and then complaining their patch hasn't gone in. I can > > certainly think of a couple of bugs I've had to chase in mm where > > the underlying patches would have benefited from review, so I'd > > like to discuss making the one review per non-trival patch our base > > minimum standard across the whole of LSF/MM; it would certainly > > serve to improve our Reviewed-by statistics. > > Well, the mm track at least has some discussion of this last year: > https://lwn.net/Articles/718212/ The pushback in your session was mandating reviews would mean slowing patch acceptance or possibly causing the dropping of patches that couldn't get reviewed. Michal did say that XFS didn't have the problem, however there not being XFS people in the room, discussion stopped there. Having this as a plenary would allow people outside mm to describe their experiences and for us to look at process based solutions using our shared experience. James
[LSF/MM TOPIC] Patch Submission process and Handling Internal Conflict
I've got two community style topics, which should probably be discussed in the plenary 1. Patch Submission Process Today we don't have a uniform patch submission process across Storage, Filesystems and MM. The question is should we (or at least should we adhere to some minimal standards). The standard we've been trying to hold to in SCSI is one review per accepted non-trivial patch. For us, it's useful because it encourages driver writers to review each other's patches rather than just posting and then complaining their patch hasn't gone in. I can certainly think of a couple of bugs I've had to chase in mm where the underlying patches would have benefited from review, so I'd like to discuss making the one review per non-trival patch our base minimum standard across the whole of LSF/MM; it would certainly serve to improve our Reviewed-by statistics. 2. Handling Internal Conflict My observation here is that actually most conflict is generated by the review process (I know, if we increase reviews as I propose in 1. we'll increase conflict on the lists on the basis of this observation), so I've been thinking about ways to de-escalate it. The principle issue is that a review which doesn't just say the patch is fine (or fine except for nitpicks) can be taken as criticism and criticism is often processed personally. The way you phrase criticism can have a great bearing on the amount of personal insult taken by the other party. Corny as it sounds, the 0day bot response "Hi Z, I love your patch! Perhaps something to improve:" is specifically targetted at this problem and seems actually to work. I think we could all benefit from discussing how to give and receive criticism in the form of patch reviews responsibly, especially as not everyone's native language in English and certain common linguistic phrasings in other languages can come off as rude when directly translated to English (Russian springs immediately to mind for some reason here). Also Note, I think fixing the review problem would solve most of the issues, so I'm not proposing anything more formal like the code of conflict stuff in the main kernel. We could lump both of these under a single "Community Discussion" topic if the organizers prefer ... especially if anyone has any other community type issues they'd like to bring up. James
[GIT PULL] SCSI fixes for 4.15-rc8
One fix for SAS attached SATA CD-ROMs. It turns out that the libata handling of CD devices relies on the SCSI error handler, so disable async aborts (which don't start the error handler) for these devices. The patch is available here: git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-fixes The short changelog is: Hannes Reinecke (1): scsi: libsas: Disable asynchronous aborts for SATA devices And the diffstat: drivers/scsi/libsas/sas_scsi_host.c | 17 +++-- 1 file changed, 15 insertions(+), 2 deletions(-) With full diff below. James --- diff --git a/drivers/scsi/libsas/sas_scsi_host.c b/drivers/scsi/libsas/sas_scsi_host.c index 91795eb56206..eea94aa4091c 100644 --- a/drivers/scsi/libsas/sas_scsi_host.c +++ b/drivers/scsi/libsas/sas_scsi_host.c @@ -486,15 +486,28 @@ static int sas_queue_reset(struct domain_device *dev, int reset_type, int sas_eh_abort_handler(struct scsi_cmnd *cmd) { - int res; + int res = TMF_RESP_FUNC_FAILED; struct sas_task *task = TO_SAS_TASK(cmd); struct Scsi_Host *host = cmd->device->host; + struct domain_device *dev = cmd_to_domain_dev(cmd); struct sas_internal *i = to_sas_internal(host->transportt); + unsigned long flags; if (!i->dft->lldd_abort_task) return FAILED; - res = i->dft->lldd_abort_task(task); + spin_lock_irqsave(host->host_lock, flags); + /* We cannot do async aborts for SATA devices */ + if (dev_is_sata(dev) && !host->host_eh_scheduled) { + spin_unlock_irqrestore(host->host_lock, flags); + return FAILED; + } + spin_unlock_irqrestore(host->host_lock, flags); + + if (task) + res = i->dft->lldd_abort_task(task); + else + SAS_DPRINTK("no task to abort\n"); if (res == TMF_RESP_FUNC_SUCC || res == TMF_RESP_FUNC_COMPLETE) return SUCCESS;
Re: [PATCH] sd: succeed check_event if device is not removable
On Thu, 2018-01-18 at 17:22 +0100, Jack Wang wrote: > From: Jack Wang> > The check_events interface was added for check if device changes, > mainly for device is removable eg. CDROM > > In sd_open, it checks if device is removable then check_disk_change. > > when the device is not removable, we can simple succeeds the call > without send TUR. > > Signed-off-by: Jack Wang > --- > drivers/scsi/sd.c | 4 > 1 file changed, 4 insertions(+) > > diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c > index ab75ebd..773ce81 100644 > --- a/drivers/scsi/sd.c > +++ b/drivers/scsi/sd.c > @@ -1576,6 +1576,10 @@ static unsigned int sd_check_events(struct > gendisk *disk, unsigned int clearing) > sdp = sdkp->device; > SCSI_LOG_HLQUEUE(3, sd_printk(KERN_INFO, sdkp, > "sd_check_events\n")); > > + if (!sdp->removable) { > + retval = 0; > + goto out; > + } This looks like very much the wrong place to fix whatever problem you're seeing is. We could simply avoid setting up the events work function in genhd.c:device_add_disk(), which would be way more efficient. However, I think some devices do require being probed occasionally with a TUR because its the only way SCSI gets AENs (not that we make much use of them). So first of all, what's the actual problem? James
Re: [PATCH v2] scsi: sd: add new match array for cache_type
On Thu, 2018-01-18 at 22:19 +0800, weiping zhang wrote: > - return sprintf(buf, "%s\n", sd_cache_types[ct]); > + return sprintf(buf, "%s\n%s\nwrite:%s, read:%s\n", > sd_cache_types[ct], > + sd_wce_rcd[ct], sdkp->WCE ? "on" : "off", > + sdkp->RCD ? "off" : "on"); I don't think we can do this. The output of the cache type in sysfs is a user exported ABI. We'd potentially break it if we add extra stuff. James
Re: [PATCH] scsi: handle ABORTED_COMMAND on Fujitsu ETERNUS
On Thu, 2018-01-18 at 11:17 +0100, Hannes Reinecke wrote: > On 01/18/2018 03:43 AM, Martin K. Petersen wrote: > > > > > > Martin, > > > > > > > > You'd like to spend a precious BLIST bit for this single device > > > which uses vendor-specific ASC/Q? > > > > I really don't want string comparisons in the regular code paths. > > Also not a fan of vendor-specific ASCs. But if you must use them, > > please add a flag and trigger on that. > > > > Since this is a bit of an unusual check condition combo, we could > > entertain whether we should simply always ADD_TO_MLQUEUE on > > 0xb/0xc1/0x1. I wonder what would break? > > > That's the usual problem I have with vendor-specific sense codes. > In general the risk is quite low of different vendors actually using > the same code; I guess we can get away with just adding a debug > message here and enable it without any vendor check. > Then we can reconsider the whole thing once we notice several vendors > using the same sense codes. Murphy's law says that if we rely on Vendors to chose non-overlapping numbers we'll end up with a clash fairly quickly ... Can't we find a way of doing this in the device_handler? That way we can use vendor specific codes only when we know who the vendor is? James
Re: [Lsf-pc] [LSF/MM TOPIC] A high-performance userspace block driver
On Tue, 2018-01-16 at 18:23 -0500, Theodore Ts'o wrote: > On Tue, Jan 16, 2018 at 06:52:40AM -0800, Matthew Wilcox wrote: > > > > > > I see the improvements that Facebook have been making to the nbd > > driver, and I think that's a wonderful thing. Maybe the outcome of > > this topic is simply: "Shut up, Matthew, this is good enough". > > > > It's clear that there's an appetite for userspace block devices; > > not for swap devices or the root device, but for accessing data > > that's stored in that silo over there, and I really don't want to > > bring that entire mess of CORBA / Go / Rust / whatever into the > > kernel to get to it, but it would be really handy to present it as > > a block device. > > ... and using iSCSI was too painful and heavyweight. >From what I've seen a reasonable number of storage over IP cloud implementations are actually using AoE. The argument goes that the protocol is about ideal (at least as compared to iSCSI or FCoE) and the company behind it doesn't seem to want to add any more features that would bloat it. James
Re: [PATCH 10/18] qla2xxx: prevent bounds-check bypass via speculative execution
On Fri, 2018-01-12 at 08:27 +0100, Greg KH wrote: > On Thu, Jan 11, 2018 at 02:15:12PM -0800, Dan Williams wrote: > > > > On Sat, Jan 6, 2018 at 1:03 AM, Greg KH> > wrote: > > > > > > On Fri, Jan 05, 2018 at 05:10:48PM -0800, Dan Williams wrote: > > > > > > > > Static analysis reports that 'handle' may be a user controlled > > > > value that is used as a data dependency to read 'sp' from the > > > > 'req->outstanding_cmds' array. In order to avoid potential > > > > leaks of kernel memory values, block speculative execution of > > > > the instruction stream that could issue reads based on an > > > > invalid value of 'sp'. In this case 'sp' is directly > > > > dereferenced later in the function. > > > > > > I'm pretty sure that 'handle' comes from the hardware, not from > > > userspace, from what I can tell here. If we want to start > > > auditing __iomem data sources, great! But that's a bigger task, > > > and one I don't think we are ready to tackle... > > > > I think it falls in the hygiene bucket of shutting off an array > > index from a source that could be under attacker control. Should we > > leave this one un-patched while we decide if we generally have a > > problem with trusting completion 'tags' from hardware? My vote is > > patch it for now. > > Hah, if you are worried about "tags" from hardware, we have a lot > more auditing to do, right? We'd also have a lot more to do: the assumption would have to be malicious hardware and most hardware has access to fairly vital stuff directly. I really don't think we have to worry about side channel attacks from hardware until the direct attack vector is closed. James
Re: [PATCH v2 17/19] qla2xxx: prevent bounds-check bypass via speculative execution
On Thu, 2018-01-11 at 21:38 -0800, Dan Williams wrote: > On Thu, Jan 11, 2018 at 5:19 PM, James Bottomley > <j...@linux.vnet.ibm.com> wrote: > > > > On Thu, 2018-01-11 at 16:47 -0800, Dan Williams wrote: > > > > > > Static analysis reports that 'handle' may be a user controlled > > > value that is used as a data dependency to read 'sp' from the > > > 'req->outstanding_cmds' array. > > > > Greg already told you it comes from hardware, specifically the > > hardware response queue. If you don't believe him, I can confirm > > it's quite definitely all copied from the iomem where the mailbox > > response is, so it can't be a user controlled value (well, unless > > the user has some influence over the firmware of the > > qla2xxx controller, which probably means you have other things to > > worry about than speculative information leaks). > > I do believe him, and I still submitted this. I'm trying to probe at > the meta question of where do we draw the line with these especially > when it costs us relatively little to apply a few line patch? We fix > theoretical lockdep races, why not theoretical data leak paths? I think I've lost the thread of what you're after. I thought you were asking for the domain experts to look and see if there is the potential for attack; if there's no theoretical way for a user to influence the value what's the point of killing speculation? Furthermore, if the user could affect that 32 bit value, what they'd actually do is extract information via the que variable which you didn't fix and which could be used to compromise the kernel without resorting to side channel attacks. What's most puzzling to me is the inconsistency of the positions: if it doesn't cost that much to turn off speculation, just do it on kernel entry as Jiří suggested; we can make it a dynamic option and the cloud providers can do it and the rest of us don't need to bother. If it does cost a lot to turn it off as Alan said, then you need us to identify the cases above where there's no need to disrupt the speculation pipeline and not turn it off there. Which is it? James
Re: [PATCH v2 17/19] qla2xxx: prevent bounds-check bypass via speculative execution
On Thu, 2018-01-11 at 16:47 -0800, Dan Williams wrote: > Static analysis reports that 'handle' may be a user controlled value > that is used as a data dependency to read 'sp' from the > 'req->outstanding_cmds' array. Greg already told you it comes from hardware, specifically the hardware response queue. If you don't believe him, I can confirm it's quite definitely all copied from the iomem where the mailbox response is, so it can't be a user controlled value (well, unless the user has some influence over the firmware of the qla2xxx controller, which probably means you have other things to worry about than speculative information leaks). I think what it actually is is a handle returned in the mailbox that's used to find other mailbox entries in different queues (the packet handle actually contains an entry in the lower 16 bits and a queue designation in the upper). Perhaps the qla2xxx people should comment on this because the code seems to check and print an error if there's a problem with the handle being too big, but we don't check the que value and use it blindly to read into the req_q_map: if handle could be wrong, couldn't que? James
[GIT PULL] SCSI fixes for 4.15-rc5
Two simple fixes, both of which cause I/O hangs. The storvsc one is from the hyper-v which can hang under certain hot add/remove conditions and the other is generally, where removing a target and a device in close proximity can result in the release method being executed twice (and subsequent list and other corruption and an eventual panic). The patch is available here: git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-fixes The short changelog is: Cathy Avery (1): scsi: storvsc: Fix scsi_cmd error assignments in storvsc_handle_error Hannes Reinecke (1): scsi: core: check for device state in __scsi_remove_target() And the diffstat: drivers/scsi/scsi_sysfs.c | 5 - drivers/scsi/storvsc_drv.c | 3 ++- 2 files changed, 6 insertions(+), 2 deletions(-) With full diff below. James --- diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c index a9996c16f4ae..26ce17178401 100644 --- a/drivers/scsi/scsi_sysfs.c +++ b/drivers/scsi/scsi_sysfs.c @@ -1415,7 +1415,10 @@ static void __scsi_remove_target(struct scsi_target *starget) * check. */ if (sdev->channel != starget->channel || - sdev->id != starget->id || + sdev->id != starget->id) + continue; + if (sdev->sdev_state == SDEV_DEL || + sdev->sdev_state == SDEV_CANCEL || !get_device(>sdev_gendev)) continue; spin_unlock_irqrestore(shost->host_lock, flags); diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c index 1b06cf0375dc..3b3d1d050cac 100644 --- a/drivers/scsi/storvsc_drv.c +++ b/drivers/scsi/storvsc_drv.c @@ -953,10 +953,11 @@ static void storvsc_handle_error(struct vmscsi_request *vm_srb, case TEST_UNIT_READY: break; default: - set_host_byte(scmnd, DID_TARGET_FAILURE); + set_host_byte(scmnd, DID_ERROR); } break; case SRB_STATUS_INVALID_LUN: + set_host_byte(scmnd, DID_NO_CONNECT); do_work = true; process_err_fn = storvsc_remove_lun; break;
Re: [PATCH 2/3] ses: skip error messages for invalid LUNs
On Thu, 2017-12-21 at 12:22 +0100, Hannes Reinecke wrote: > Some storage array set the 'Embedded enclosure' bit even though > no LUN is present, causing the first RECEIVE DIAGNOSTIC call to > be returned with sense code 'LOGICAL UNIT NOT SUPPORTED'. > This patch skips the annoying 'Failed to get diagnostic page 0x1' > messages for those cases. What disagnostic pages does this thing support? Can you do a receive diagnostic on page 0 to find out? I suspect a lot of embedded enclosure services are simple and support page 7 only. If it really refuses all diagnostic page requests (which would be a gross standards violation), then it should probably be blacklisted by inquiry string as unusable. James
Re: [PATCH 1/3] ses: make initial allocation size configurable
On Thu, 2017-12-21 at 12:22 +0100, Hannes Reinecke wrote: > Some storage arrays have an incorrect SES implementation which will > always return the allocation length of the CDB instead of the true > length of the requested page. That's a pretty gross standards violation, is this common? When the buffer is > than the data to return, does it get set then? Because if not, we're going to be processing bogus data and no module parameter can fix this because the returned length depends on the number of elements in the enclosure making this parameter really unsafe unless you get it exactly right. > With this patch one can modify the initial allocation size to > get the full output of those arrays. This really doesn't look like the right way to do it. Shouldn't we rather have a blacklist for these devices and simply do a page for each allocation for them (assuming they return the correct length when over subscribed). James
Re: [PATCH 3/3] ses: Retry Power-on-reset check condition
On Thu, 2017-12-21 at 12:22 +0100, Hannes Reinecke wrote: > During startup any SCSI request might encounter a 'Power-on/reset' > sense code, which just can be retried. > In the case of ses it needs to be retried, otherwise ses will > errorneously detect this as a failure and not attach the driver. > > Signed-off-by: Hannes Reinecke> --- > drivers/scsi/ses.c | 14 ++ > 1 file changed, 10 insertions(+), 4 deletions(-) > > diff --git a/drivers/scsi/ses.c b/drivers/scsi/ses.c > index c1f96b0..9c8b3db 100644 > --- a/drivers/scsi/ses.c > +++ b/drivers/scsi/ses.c > @@ -110,14 +110,20 @@ static int ses_recv_diag(struct scsi_device > *sdev, int page_code, > 0 > }; > unsigned char recv_page_code; > + int retries = SES_RETRIES; > struct scsi_sense_hdr sshdr; > > +retry: > ret = scsi_execute_req(sdev, cmd, DMA_FROM_DEVICE, buf, > bufflen, > - , SES_TIMEOUT, SES_RETRIES, > NULL); > + , SES_TIMEOUT, retries, NULL); > if (unlikely(ret)) { > - if (status_byte(ret) == CHECK_CONDITION && > - sshdr.asc == 0x25 && sshdr.ascq == 0x00) { > - ret = -ENODEV; > + if (status_byte(ret) == CHECK_CONDITION) { > + if (sshdr.asc == 0x29) { > + retries--; Nothing ever checks this, meaning the loop potentially never terminates. We already have two templates for how to do this in sd.c ... on the other hand I can't see any use of scsi_execute/scsi_execute_req that actually want to see the ASC 29 conditions, so it might be better to integrate it into scsi_execute(). James
[GIT PULL] SCSI fixes for 4.15-rc4
Two simple fixes: one for sparse warnings that were introduced by the merge window conversion to blist_flags_t and the other to fix dropped I/O during reset in aacraid. The patch is available here: git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-fixes The short changelog is: Bart Van Assche (1): scsi: core: Use blist_flags_t consistently Prasad B Munirathnam (1): scsi: aacraid: Fix I/O drop during reset And the diffstat: drivers/scsi/aacraid/aacraid.h| 1 + drivers/scsi/aacraid/linit.c | 2 +- drivers/scsi/scsi_devinfo.c | 6 ++ drivers/scsi/scsi_scan.c | 13 +++-- drivers/scsi/scsi_sysfs.c | 5 +++-- drivers/scsi/scsi_transport_spi.c | 12 +++- 6 files changed, 21 insertions(+), 18 deletions(-) With full diff below. James --- diff --git a/drivers/scsi/aacraid/aacraid.h b/drivers/scsi/aacraid/aacraid.h index 6e3d81969a77..d52265416da2 100644 --- a/drivers/scsi/aacraid/aacraid.h +++ b/drivers/scsi/aacraid/aacraid.h @@ -1725,6 +1725,7 @@ struct aac_dev #define FIB_CONTEXT_FLAG_NATIVE_HBA(0x0010) #define FIB_CONTEXT_FLAG_NATIVE_HBA_TMF(0x0020) #define FIB_CONTEXT_FLAG_SCSI_CMD (0x0040) +#define FIB_CONTEXT_FLAG_EH_RESET (0x0080) /* * Define the command values diff --git a/drivers/scsi/aacraid/linit.c b/drivers/scsi/aacraid/linit.c index bdf127aaab41..d55332de08f9 100644 --- a/drivers/scsi/aacraid/linit.c +++ b/drivers/scsi/aacraid/linit.c @@ -1037,7 +1037,7 @@ static int aac_eh_bus_reset(struct scsi_cmnd* cmd) info = >hba_map[bus][cid]; if (bus >= AAC_MAX_BUSES || cid >= AAC_MAX_TARGETS || info->devtype != AAC_DEVTYPE_NATIVE_RAW) { - fib->flags |= FIB_CONTEXT_FLAG_TIMED_OUT; + fib->flags |= FIB_CONTEXT_FLAG_EH_RESET; cmd->SCp.phase = AAC_OWNER_ERROR_HANDLER; } } diff --git a/drivers/scsi/scsi_devinfo.c b/drivers/scsi/scsi_devinfo.c index 449ef5adbb2b..dfb8da83fa50 100644 --- a/drivers/scsi/scsi_devinfo.c +++ b/drivers/scsi/scsi_devinfo.c @@ -374,10 +374,8 @@ int scsi_dev_info_list_add_keyed(int compatible, char *vendor, char *model, model, compatible); if (strflags) - devinfo->flags = simple_strtoul(strflags, NULL, 0); - else - devinfo->flags = flags; - + flags = (__force blist_flags_t)simple_strtoul(strflags, NULL, 0); + devinfo->flags = flags; devinfo->compatible = compatible; if (compatible) diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c index be5e919db0e8..0880d975eed3 100644 --- a/drivers/scsi/scsi_scan.c +++ b/drivers/scsi/scsi_scan.c @@ -770,7 +770,7 @@ static int scsi_probe_lun(struct scsi_device *sdev, unsigned char *inq_result, * SCSI_SCAN_LUN_PRESENT: a new scsi_device was allocated and initialized **/ static int scsi_add_lun(struct scsi_device *sdev, unsigned char *inq_result, - int *bflags, int async) + blist_flags_t *bflags, int async) { int ret; @@ -1049,14 +1049,15 @@ static unsigned char *scsi_inq_str(unsigned char *buf, unsigned char *inq, * - SCSI_SCAN_LUN_PRESENT: a new scsi_device was allocated and initialized **/ static int scsi_probe_and_add_lun(struct scsi_target *starget, - u64 lun, int *bflagsp, + u64 lun, blist_flags_t *bflagsp, struct scsi_device **sdevp, enum scsi_scan_mode rescan, void *hostdata) { struct scsi_device *sdev; unsigned char *result; - int bflags, res = SCSI_SCAN_NO_RESPONSE, result_len = 256; + blist_flags_t bflags; + int res = SCSI_SCAN_NO_RESPONSE, result_len = 256; struct Scsi_Host *shost = dev_to_shost(starget->dev.parent); /* @@ -1201,7 +1202,7 @@ static int scsi_probe_and_add_lun(struct scsi_target *starget, * Modifies sdevscan->lun. **/ static void scsi_sequential_lun_scan(struct scsi_target *starget, -int bflags, int scsi_level, +blist_flags_t bflags, int scsi_level, enum scsi_scan_mode rescan) { uint max_dev_lun; @@ -1292,7 +1293,7 @@ static void scsi_sequential_lun_scan(struct scsi_target *starget, * 0: scan completed (or no memory, so further scanning is futile) * 1: could not scan with REPORT LUN **/ -static int scsi_report_lun_scan(struct scsi_target *starget, int bflags, +static int scsi_report_lun_scan(struct scsi_target *starget, blist_flags_t bflags, enum scsi_scan_mode rescan) { unsigned char
[GIT PULL] SCSI fixes for 4.15-rc3
The most important one is the bfa fix because it's easy to oops the kernel with this driver (this includes the commit that corrects the compiler warning in the original), a regression in the new timespec conversion in aacraid and a regression in the Fibre Channel ELS handling patch. The other three are a theoretical problem with termination in the vendor/host matching code and a use after free in lpfc. The additional patches are a fix for an I/O hang in the mq code under certain circumstances and a rare oops in some debugging code. The patch is available here: git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-fixes The short changelog is: Arnd Bergmann (2): scsi: bfa: fix type conversion warning scsi: aacraid: address UBSAN warning regression Bart Van Assche (1): scsi: core: Fix a scsi_show_rq() NULL pointer dereference Dan Carpenter (1): scsi: lpfc: Use after free in lpfc_rq_buf_free() Jason Yan (1): scsi: libsas: fix length error in sas_smp_handler() Johannes Thumshirn (2): scsi: MAINTAINERS: change FCoE list to linux-scsi scsi: bfa: fix access to bfad_im_port_s Martin Wilck (3): scsi: scsi_devinfo: cleanly zero-pad devinfo strings scsi: scsi_devinfo: handle non-terminated strings scsi: libfc: fix ELS request handling Ming Lei (1): scsi: core: run queue if SCSI device queue isn't ready and queue is idle And the diffstat: MAINTAINERS| 2 +- drivers/scsi/aacraid/commsup.c | 8 ++-- drivers/scsi/bfa/bfad_bsg.c| 6 -- drivers/scsi/bfa/bfad_im.c | 6 -- drivers/scsi/bfa/bfad_im.h | 10 ++ drivers/scsi/libfc/fc_lport.c | 4 drivers/scsi/libsas/sas_expander.c | 10 +- drivers/scsi/lpfc/lpfc_mem.c | 2 +- drivers/scsi/scsi_debugfs.c| 6 -- drivers/scsi/scsi_devinfo.c| 27 ++- drivers/scsi/scsi_lib.c| 2 ++ drivers/scsi/sd.c | 4 +++- 12 files changed, 54 insertions(+), 33 deletions(-) With full diff below. James --- diff --git a/MAINTAINERS b/MAINTAINERS index cd7e12dc6af4..37841b52a5b6 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -5352,7 +5352,7 @@ F:drivers/media/tuners/fc2580* FCOE SUBSYSTEM (libfc, libfcoe, fcoe) M: Johannes Thumshirn-L: fcoe-de...@open-fcoe.org +L: linux-scsi@vger.kernel.org W: www.Open-FCoE.org S: Supported F: drivers/scsi/libfc/ diff --git a/drivers/scsi/aacraid/commsup.c b/drivers/scsi/aacraid/commsup.c index bec9f3193f60..80a8cb26cdea 100644 --- a/drivers/scsi/aacraid/commsup.c +++ b/drivers/scsi/aacraid/commsup.c @@ -2482,8 +2482,8 @@ int aac_command_thread(void *data) /* Synchronize our watches */ if (((NSEC_PER_SEC - (NSEC_PER_SEC / HZ)) > now.tv_nsec) && (now.tv_nsec > (NSEC_PER_SEC / HZ))) - difference = (((NSEC_PER_SEC - now.tv_nsec) * HZ) - + NSEC_PER_SEC / 2) / NSEC_PER_SEC; + difference = HZ + HZ / 2 - +now.tv_nsec / (NSEC_PER_SEC / HZ); else { if (now.tv_nsec > NSEC_PER_SEC / 2) ++now.tv_sec; @@ -2507,6 +2507,10 @@ int aac_command_thread(void *data) if (kthread_should_stop()) break; + /* +* we probably want usleep_range() here instead of the +* jiffies computation +*/ schedule_timeout(difference); if (kthread_should_stop()) diff --git a/drivers/scsi/bfa/bfad_bsg.c b/drivers/scsi/bfa/bfad_bsg.c index 72ca2a2e08e2..b2fa195adc7a 100644 --- a/drivers/scsi/bfa/bfad_bsg.c +++ b/drivers/scsi/bfa/bfad_bsg.c @@ -3135,7 +3135,8 @@ bfad_im_bsg_vendor_request(struct bsg_job *job) struct fc_bsg_request *bsg_request = job->request; struct fc_bsg_reply *bsg_reply = job->reply; uint32_t vendor_cmd = bsg_request->rqst_data.h_vendor.vendor_cmd[0]; - struct bfad_im_port_s *im_port = shost_priv(fc_bsg_to_shost(job)); + struct Scsi_Host *shost = fc_bsg_to_shost(job); + struct bfad_im_port_s *im_port = bfad_get_im_port(shost); struct bfad_s *bfad = im_port->bfad; void *payload_kbuf; int rc = -EINVAL; @@ -3350,7 +3351,8 @@ int bfad_im_bsg_els_ct_request(struct bsg_job *job) { struct bfa_bsg_data *bsg_data; - struct bfad_im_port_s *im_port = shost_priv(fc_bsg_to_shost(job)); + struct Scsi_Host *shost = fc_bsg_to_shost(job); + struct bfad_im_port_s *im_port = bfad_get_im_port(shost); struct bfad_s *bfad = im_port->bfad; bfa_bsg_fcpt_t *bsg_fcpt; struct bfad_fcxp*drv_fcxp; diff --git a/drivers/scsi/bfa/bfad_im.c
Re: [BUG] scsi/qla2xxx: a possible sleep-in-atomic bug in qlt_get_tag
On Wed, 2017-12-13 at 11:18 +0800, Jia-Ju Bai wrote: > The driver may sleep under a spinlock. > The function call paths are: > qlt_handle_abts_recv_work (acquire the spinlock) > qlt_response_pkt_all_vps > qlt_response_pkt > qlt_handle_cmd_for_atio > qlt_get_tag > percpu_ida_alloc --> may sleep > > qla82xx_msix_rsp_q (acquire the spinlock) > qla24xx_process_response_queue > qlt_handle_abts_recv > qlt_response_pkt_all_vps > qlt_response_pkt > qlt_handle_cmd_for_atio > qlt_get_tag > percpu_ida_alloc --> may sleep-in-atomic > > qla24xx_intr_handler (acquire the spinlock) > qla24xx_process_response_queue > qlt_handle_abts_recv > qlt_response_pkt > qlt_handle_cmd_for_atio > qlt_get_tag > percpu_ida_alloc --> may sleep > > I do not find a good way to fix it, so I only report. > This possible bug is found by my static analysis tool (DSAC) and > checked by my code review. The report is incorrect: percpu_ida_alloc with state==TASK_RUNNING is atomic (and interrupt) safe which appears to be the case here. James
Re: [GIT PULL] SCSI fixes for 4.15-rc3
On Tue, 2017-12-12 at 09:32 -0800, Linus Torvalds wrote: > On Tue, Dec 12, 2017 at 9:22 AM, Martin K. Petersen >wrote: > > > > > > Arnd and Johannes fixed this up right away: > > The commit you point to _is_ the probnlem. It does: > > struct bfad_im_port_s *im_port = shost->hostdata[0]; > > which is utter bullshit crap. > > Notice? It's assigning a pointer (im_port), from an integer value > ("hostdata[0]" is "unsigned long"). > > The code is garbage. he means this one: https://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git/commit/?h=fixes=48d83282db077f93b2cf40de120f4d6f29eb293b Which properly encapsulates the reference in a function which does the correct conversions. James
Re: [GIT PULL] SCSI fixes for 4.15-rc3
On Tue, 2017-12-12 at 12:22 -0500, Martin K. Petersen wrote: > Linus, > > > > > This is utter shite, and doesn't even compile cleanly. > > > > Sure, it's "just" a warning, and the code works. But no, I'm not > > pulling crap like this. If you save a pointer in an integer > > "hostdata[0]" field, then you damn well do the proper casts or > > helper > > functions or something, you don't just ignore the compiler when it > > very reasonably warns about it. > > > > What the hell is going on? Nobody compiled this stuff at all? Or > > nobody cares about new build warnings? > > Arnd and Johannes fixed this up right away: > > https://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git/commit/? > h=fixes=45349821ab3a8d378b8f37e52c6fe1aa1b870c47 Yes, I have that in the current testing batch, how about I push the combined thing on wednesday. James
[GIT PULL] SCSI fixes for 4.15-rc3
The most important one is the bfa fix because it's easy to oops the kernel with this driver, a regression in the new timespec conversion in aacraid and a regression in the Fibre Channel ELS handling patch. The other three are a theoretical problem with termination in the vendor/host matching code and a use after free in lpfc. The patch is available here: git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-fixes The short changelog is: Arnd Bergmann (1): scsi: aacraid: address UBSAN warning regression Dan Carpenter (1): scsi: lpfc: Use after free in lpfc_rq_buf_free() Johannes Thumshirn (1): scsi: bfa: fix access to bfad_im_port_s Martin Wilck (3): scsi: scsi_devinfo: cleanly zero-pad devinfo strings scsi: scsi_devinfo: handle non-terminated strings scsi: libfc: fix ELS request handling With the diffstat: drivers/scsi/aacraid/commsup.c | 8 ++-- drivers/scsi/bfa/bfad_bsg.c| 6 -- drivers/scsi/libfc/fc_lport.c | 4 drivers/scsi/lpfc/lpfc_mem.c | 2 +- drivers/scsi/scsi_devinfo.c| 27 ++- 5 files changed, 25 insertions(+), 22 deletions(-) And full diff below James --- diff --git a/drivers/scsi/aacraid/commsup.c b/drivers/scsi/aacraid/commsup.c index bec9f3193f60..80a8cb26cdea 100644 --- a/drivers/scsi/aacraid/commsup.c +++ b/drivers/scsi/aacraid/commsup.c @@ -2482,8 +2482,8 @@ int aac_command_thread(void *data) /* Synchronize our watches */ if (((NSEC_PER_SEC - (NSEC_PER_SEC / HZ)) > now.tv_nsec) && (now.tv_nsec > (NSEC_PER_SEC / HZ))) - difference = (((NSEC_PER_SEC - now.tv_nsec) * HZ) - + NSEC_PER_SEC / 2) / NSEC_PER_SEC; + difference = HZ + HZ / 2 - +now.tv_nsec / (NSEC_PER_SEC / HZ); else { if (now.tv_nsec > NSEC_PER_SEC / 2) ++now.tv_sec; @@ -2507,6 +2507,10 @@ int aac_command_thread(void *data) if (kthread_should_stop()) break; + /* +* we probably want usleep_range() here instead of the +* jiffies computation +*/ schedule_timeout(difference); if (kthread_should_stop()) diff --git a/drivers/scsi/bfa/bfad_bsg.c b/drivers/scsi/bfa/bfad_bsg.c index 72ca2a2e08e2..09ef68c8225f 100644 --- a/drivers/scsi/bfa/bfad_bsg.c +++ b/drivers/scsi/bfa/bfad_bsg.c @@ -3135,7 +3135,8 @@ bfad_im_bsg_vendor_request(struct bsg_job *job) struct fc_bsg_request *bsg_request = job->request; struct fc_bsg_reply *bsg_reply = job->reply; uint32_t vendor_cmd = bsg_request->rqst_data.h_vendor.vendor_cmd[0]; - struct bfad_im_port_s *im_port = shost_priv(fc_bsg_to_shost(job)); + struct Scsi_Host *shost = fc_bsg_to_shost(job); + struct bfad_im_port_s *im_port = shost->hostdata[0]; struct bfad_s *bfad = im_port->bfad; void *payload_kbuf; int rc = -EINVAL; @@ -3350,7 +3351,8 @@ int bfad_im_bsg_els_ct_request(struct bsg_job *job) { struct bfa_bsg_data *bsg_data; - struct bfad_im_port_s *im_port = shost_priv(fc_bsg_to_shost(job)); + struct Scsi_Host *shost = fc_bsg_to_shost(job); + struct bfad_im_port_s *im_port = shost->hostdata[0]; struct bfad_s *bfad = im_port->bfad; bfa_bsg_fcpt_t *bsg_fcpt; struct bfad_fcxp*drv_fcxp; diff --git a/drivers/scsi/libfc/fc_lport.c b/drivers/scsi/libfc/fc_lport.c index 5da46052e179..21be672679fb 100644 --- a/drivers/scsi/libfc/fc_lport.c +++ b/drivers/scsi/libfc/fc_lport.c @@ -904,10 +904,14 @@ static void fc_lport_recv_els_req(struct fc_lport *lport, case ELS_FLOGI: if (!lport->point_to_multipoint) fc_lport_recv_flogi_req(lport, fp); + else + fc_rport_recv_req(lport, fp); break; case ELS_LOGO: if (fc_frame_sid(fp) == FC_FID_FLOGI) fc_lport_recv_logo_req(lport, fp); + else + fc_rport_recv_req(lport, fp); break; case ELS_RSCN: lport->tt.disc_recv_req(lport, fp); diff --git a/drivers/scsi/lpfc/lpfc_mem.c b/drivers/scsi/lpfc/lpfc_mem.c index 56faeb049b4a..87c08ff37ddd 100644 --- a/drivers/scsi/lpfc/lpfc_mem.c +++ b/drivers/scsi/lpfc/lpfc_mem.c @@ -753,12 +753,12 @@ lpfc_rq_buf_free(struct lpfc_hba *phba, struct lpfc_dmabuf *mp) drqe.address_hi = putPaddrHigh(rqb_entry->dbuf.phys); rc = lpfc_sli4_rq_put(rqb_entry->hrq, rqb_entry->drq, , ); if (rc < 0) { -
Re: [PATCH] bfa: fix access to bfad_im_port_s
On Wed, 2017-12-06 at 09:07 +0100, Johannes Thumshirn wrote: > James Bottomley <james.bottom...@hansenpartnership.com> writes: > > > > On Tue, 2017-11-28 at 16:26 +0100, Johannes Thumshirn wrote: > > > > > > Commit 'cd21c605b2cf ("scsi: fc: provide fc_bsg_to_shost() > > > helper")' > > > changed access to bfa's 'struct bfad_im_port_s' by using > > > shost_priv() > > > instead of shost->hostdata[0]. > > > > > > This lead to crashes like in the following back-trace: > > > > > > task: 880046375300 ti: 8800a2ef8000 task.ti: > > > 8800a2ef8000 > > > RIP: e030:[] [] > > > bfa_fcport_get_attr+0x82/0x260 [bfa] > > > RSP: e02b:8800a2efba10 EFLAGS: 00010046 > > > RAX: 575f415441536432 RBX: 8800a2efba28 RCX: > > > RDX: RSI: 8800a2efba28 RDI: 880004dc31d8 > > > RBP: 880004dc31d8 R08: R09: 0001 > > > R10: 88011fadc468 R11: 0001 R12: 880004dc31f0 > > > R13: 0200 R14: 880004dc61d0 R15: 880004947a10 > > > FS: 7feb1e489700() GS:88011fac() > > > knlGS: > > > CS: e033 DS: ES: CR0: 8005003b > > > CR2: 7ffe14e46c10 CR3: 957b8000 CR4: 0660 > > > Stack: > > > 88001d4da000 880004dc31c0 a048a9df > > > 81e56380 > > > > > > > > > [] bfad_iocmd_ioc_get_info+0x4f/0x220 [bfa] > > > [] bfad_iocmd_handler+0xa00/0xd40 [bfa] > > > [] bfad_im_bsg_request+0xee/0x1b0 [bfa] > > > [] fc_bsg_dispatch+0x10b/0x1b0 [scsi_transport_fc] > > > [] bsg_request_fn+0x11d/0x1c0 > > > [] __blk_run_queue+0x2f/0x40 > > > [] blk_execute_rq_nowait+0xa8/0x160 > > > [] blk_execute_rq+0x77/0x120 > > > [] bsg_ioctl+0x1b6/0x200 > > > [] do_vfs_ioctl+0x2cd/0x4a0 > > > [] SyS_ioctl+0x74/0x80 > > > [] entry_SYSCALL_64_fastpath+0x12/0x6d > > > > > > Fixes: cd21c605b2cf ("scsi: fc: provide fc_bsg_to_shost() > > > helper") > > > Signed-off-by: Johannes Thumshirn <jthumsh...@suse.de> > > > Cc: Michal Koutný <mkou...@suse.com> > > > --- > > > drivers/scsi/bfa/bfad_bsg.c | 6 -- > > > 1 file changed, 4 insertions(+), 2 deletions(-) > > > > > > diff --git a/drivers/scsi/bfa/bfad_bsg.c > > > b/drivers/scsi/bfa/bfad_bsg.c > > > index 72ca2a2e08e2..09ef68c8225f 100644 > > > --- a/drivers/scsi/bfa/bfad_bsg.c > > > +++ b/drivers/scsi/bfa/bfad_bsg.c > > > @@ -3135,7 +3135,8 @@ bfad_im_bsg_vendor_request(struct bsg_job > > > *job) > > > struct fc_bsg_request *bsg_request = job->request; > > > struct fc_bsg_reply *bsg_reply = job->reply; > > > uint32_t vendor_cmd = bsg_request- > > > > > > > > rqst_data.h_vendor.vendor_cmd[0]; > > > - struct bfad_im_port_s *im_port = > > > shost_priv(fc_bsg_to_shost(job)); > > > + struct Scsi_Host *shost = fc_bsg_to_shost(job); > > > + struct bfad_im_port_s *im_port = shost->hostdata[0]; > > > struct bfad_s *bfad = im_port->bfad; > > > void *payload_kbuf; > > > int rc = -EINVAL; > > > @@ -3350,7 +3351,8 @@ int > > > bfad_im_bsg_els_ct_request(struct bsg_job *job) > > > { > > > struct bfa_bsg_data *bsg_data; > > > - struct bfad_im_port_s *im_port = > > > shost_priv(fc_bsg_to_shost(job)); > > > + struct Scsi_Host *shost = fc_bsg_to_shost(job); > > > + struct bfad_im_port_s *im_port = shost->hostdata[0]; > > > struct bfad_s *bfad = im_port->bfad; > > > bfa_bsg_fcpt_t *bsg_fcpt; > > > struct bfad_fcxp*drv_fcxp; > > > > OK, so we had a linux next failure with this: > > > > After merging the scsi tree, today's linux-next build (x86_64 > > allmodconfig) produced these warnings: > > > > drivers/scsi/bfa/bfad_bsg.c: In function > > 'bfad_im_bsg_vendor_request': > > drivers/scsi/bfa/bfad_bsg.c:3137:35: warning: initialization > > makes pointer from integer without a cast [-Wint-conversion] > > struct bfad_im_port_s *im_port = shost->hostdata[0]; > > ^ > > drivers/scsi/bfa/bfad_bsg.c: In function > > 'bfad_im_bsg_els_ct_request': > > drivers/scsi/bfa/bfad_bsg.c:3353:35: warning: i
Re: [PATCH] bfa: fix access to bfad_im_port_s
On Tue, 2017-11-28 at 16:26 +0100, Johannes Thumshirn wrote: > Commit 'cd21c605b2cf ("scsi: fc: provide fc_bsg_to_shost() helper")' > changed access to bfa's 'struct bfad_im_port_s' by using shost_priv() > instead of shost->hostdata[0]. > > This lead to crashes like in the following back-trace: > > task: 880046375300 ti: 8800a2ef8000 task.ti: 8800a2ef8000 > RIP: e030:[] [] > bfa_fcport_get_attr+0x82/0x260 [bfa] > RSP: e02b:8800a2efba10 EFLAGS: 00010046 > RAX: 575f415441536432 RBX: 8800a2efba28 RCX: > RDX: RSI: 8800a2efba28 RDI: 880004dc31d8 > RBP: 880004dc31d8 R08: R09: 0001 > R10: 88011fadc468 R11: 0001 R12: 880004dc31f0 > R13: 0200 R14: 880004dc61d0 R15: 880004947a10 > FS: 7feb1e489700() GS:88011fac() > knlGS: > CS: e033 DS: ES: CR0: 8005003b > CR2: 7ffe14e46c10 CR3: 957b8000 CR4: 0660 > Stack: > 88001d4da000 880004dc31c0 a048a9df 81e56380 > > [] bfad_iocmd_ioc_get_info+0x4f/0x220 [bfa] > [] bfad_iocmd_handler+0xa00/0xd40 [bfa] > [] bfad_im_bsg_request+0xee/0x1b0 [bfa] > [] fc_bsg_dispatch+0x10b/0x1b0 [scsi_transport_fc] > [] bsg_request_fn+0x11d/0x1c0 > [] __blk_run_queue+0x2f/0x40 > [] blk_execute_rq_nowait+0xa8/0x160 > [] blk_execute_rq+0x77/0x120 > [] bsg_ioctl+0x1b6/0x200 > [] do_vfs_ioctl+0x2cd/0x4a0 > [] SyS_ioctl+0x74/0x80 > [] entry_SYSCALL_64_fastpath+0x12/0x6d > > Fixes: cd21c605b2cf ("scsi: fc: provide fc_bsg_to_shost() helper") > Signed-off-by: Johannes Thumshirn> Cc: Michal Koutný > --- > drivers/scsi/bfa/bfad_bsg.c | 6 -- > 1 file changed, 4 insertions(+), 2 deletions(-) > > diff --git a/drivers/scsi/bfa/bfad_bsg.c > b/drivers/scsi/bfa/bfad_bsg.c > index 72ca2a2e08e2..09ef68c8225f 100644 > --- a/drivers/scsi/bfa/bfad_bsg.c > +++ b/drivers/scsi/bfa/bfad_bsg.c > @@ -3135,7 +3135,8 @@ bfad_im_bsg_vendor_request(struct bsg_job *job) > struct fc_bsg_request *bsg_request = job->request; > struct fc_bsg_reply *bsg_reply = job->reply; > uint32_t vendor_cmd = bsg_request- > >rqst_data.h_vendor.vendor_cmd[0]; > - struct bfad_im_port_s *im_port = > shost_priv(fc_bsg_to_shost(job)); > + struct Scsi_Host *shost = fc_bsg_to_shost(job); > + struct bfad_im_port_s *im_port = shost->hostdata[0]; > struct bfad_s *bfad = im_port->bfad; > void *payload_kbuf; > int rc = -EINVAL; > @@ -3350,7 +3351,8 @@ int > bfad_im_bsg_els_ct_request(struct bsg_job *job) > { > struct bfa_bsg_data *bsg_data; > - struct bfad_im_port_s *im_port = > shost_priv(fc_bsg_to_shost(job)); > + struct Scsi_Host *shost = fc_bsg_to_shost(job); > + struct bfad_im_port_s *im_port = shost->hostdata[0]; > struct bfad_s *bfad = im_port->bfad; > bfa_bsg_fcpt_t *bsg_fcpt; > struct bfad_fcxp*drv_fcxp; OK, so we had a linux next failure with this: After merging the scsi tree, today's linux-next build (x86_64 allmodconfig) produced these warnings: drivers/scsi/bfa/bfad_bsg.c: In function 'bfad_im_bsg_vendor_request': drivers/scsi/bfa/bfad_bsg.c:3137:35: warning: initialization makes pointer from integer without a cast [-Wint-conversion] struct bfad_im_port_s *im_port = shost->hostdata[0]; ^ drivers/scsi/bfa/bfad_bsg.c: In function 'bfad_im_bsg_els_ct_request': drivers/scsi/bfa/bfad_bsg.c:3353:35: warning: initialization makes pointer from integer without a cast [-Wint-conversion] struct bfad_im_port_s *im_port = shost->hostdata[0]; ^ Introduced by commit 45349821ab3a ("scsi: bfa: fix access to bfad_im_port_s") -- Cheers, Stephen Rothwell The reason is you should be using shost_priv(shost) not shost->hostdata[0]. James
Re: [PATCH] scsi: fix race condition when removing target
On Wed, 2017-12-06 at 08:41 +0800, Jason Yan wrote: > On 2017/12/5 23:37, James Bottomley wrote: > > > > On Tue, 2017-12-05 at 20:37 +0800, Jason Yan wrote: > > > > > > > > > On 2017/12/1 23:35, James Bottomley wrote: > > > > > > > > > > > > On Fri, 2017-12-01 at 16:40 +0800, Jason Yan wrote: > > > > > > > > > > > > > > > On 2017/12/1 7:56, James Bottomley wrote: > > > > > > > > > > > > > > > > > > b/include/scsi/scsi_device.h > > > > > > index 571ddb49b926..2e4d48d8cd68 100644 > > > > > > --- a/include/scsi/scsi_device.h > > > > > > +++ b/include/scsi/scsi_device.h > > > > > > @@ -380,6 +380,23 @@ extern struct scsi_device > > > > > > *__scsi_iterate_devices(struct Scsi_Host *, > > > > > > #define __shost_for_each_device(sdev, shost) \ > > > > > > list_for_each_entry((sdev), &((shost)- > > > > > > >__devices), > > > > > > siblings) > > > > > > > > > > > > > > > > Seems that __shost_for_each_device() is still not safe. scsi > > > > > device > > > > > been deleted stays in the list and put_device() can be called > > > > > anywhere out of the host lock. > > > > > > > > Not if it's used with scsi_get_device(). As I said, I only did > > > > a cursory inspectiont, so if I've missed a loop, please > > > > specify. > > > > > > > > The point was more a demonstration of how we could fix the > > > > problem if we don't change get_device(). > > > > > > > > James > > > > > > > > > > Yes, it's OK now. __shost_for_each_device() is not used with > > > scsi_get_device() yet. > > > > > > Another problem is that put_device() cannot be called while > > > holding the host lock, > > > > Yes it can. That's one of the design goals of the execute in > > process context: you can call it from interrupt context and you can > > call it with locks held and we'll return immediately and delay all > > the dangerous stuff until we have a process context. > > > > To get the process context to be acquired, the in_interrupt() test > > must pass (so the spin lock must be acquired irqsave) ; is that > > condition missing anywhere? > > > > James > > > > > > Call it from interrupt context is ok. I'm talking about calling it > from process context. > > Think about this in a process context: > scsi_device_lookup() > ->spin_lock_irqsave(shost->host_lock, flags); > ->__scsi_device_lookup() > ->iterate and kobject_get_unless_zero() > ->put_device() > ->scsi_device_dev_release() if the last put > ->scsi_device_dev_release_usercontext() > ->acquire the host lock = deadlock execute_in_process_context() is supposed to produce us a context whenever the local context isn't available, and that's supposed to include when interrupts are disabled as in spin_lock_irqsave(). So let me ask this another way: have you seen this deadlock (which would mean we have a bug in execute_process_context())? James
[GIT PULL] SCSI fixes for 4.15-rc2
We have a bunch of fixes for aacraid, a set of coherency fixes that only affect non-coherent platforms and one coccinelle detected null check after use. The patch is available here: git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-fixes The short changelog is: Christoph Hellwig (1): scsi: dma-mapping: always provide dma_get_cache_alignment Guilherme G. Piccoli (3): scsi: aacraid: Prevent crash in case of free interrupt during scsi EH path scsi: aacraid: Perform initialization reset only once scsi: aacraid: Check for PCI state of device in a generic way Gustavo A. R. Silva (1): scsi: ufs: ufshcd: fix potential NULL pointer dereference in ufshcd_config_vreg Huacai Chen (2): scsi: libsas: align sata_device's rps_resp on a cacheline scsi: use dma_get_cache_alignment() as minimum DMA alignment And the diffstat: drivers/scsi/aacraid/aacraid.h | 1 + drivers/scsi/aacraid/commsup.c | 35 +++ drivers/scsi/aacraid/linit.c | 3 +++ drivers/scsi/aacraid/rx.c | 15 ++- drivers/scsi/aacraid/src.c | 20 ++-- drivers/scsi/scsi_lib.c| 10 ++ drivers/scsi/ufs/ufshcd.c | 7 +-- include/linux/dma-mapping.h| 2 -- include/scsi/libsas.h | 2 +- 9 files changed, 43 insertions(+), 52 deletions(-) With full diff below. James --- diff --git a/drivers/scsi/aacraid/aacraid.h b/drivers/scsi/aacraid/aacraid.h index 403a639574e5..6e3d81969a77 100644 --- a/drivers/scsi/aacraid/aacraid.h +++ b/drivers/scsi/aacraid/aacraid.h @@ -1673,6 +1673,7 @@ struct aac_dev struct aac_hba_map_info hba_map[AAC_MAX_BUSES][AAC_MAX_TARGETS]; u8 adapter_shutdown; u32 handle_pci_error; + boolinit_reset; }; #define aac_adapter_interrupt(dev) \ diff --git a/drivers/scsi/aacraid/commsup.c b/drivers/scsi/aacraid/commsup.c index 525a652dab48..bec9f3193f60 100644 --- a/drivers/scsi/aacraid/commsup.c +++ b/drivers/scsi/aacraid/commsup.c @@ -467,35 +467,6 @@ int aac_queue_get(struct aac_dev * dev, u32 * index, u32 qid, struct hw_fib * hw return 0; } -#ifdef CONFIG_EEH -static inline int aac_check_eeh_failure(struct aac_dev *dev) -{ - /* Check for an EEH failure for the given -* device node. Function eeh_dev_check_failure() -* returns 0 if there has not been an EEH error -* otherwise returns a non-zero value. -* -* Need to be called before any PCI operation, -* i.e.,before aac_adapter_check_health() -*/ - struct eeh_dev *edev = pci_dev_to_eeh_dev(dev->pdev); - - if (eeh_dev_check_failure(edev)) { - /* The EEH mechanisms will handle this -* error and reset the device if -* necessary. -*/ - return 1; - } - return 0; -} -#else -static inline int aac_check_eeh_failure(struct aac_dev *dev) -{ - return 0; -} -#endif - /* * Define the highest level of host to adapter communication routines. * These routines will support host to adapter FS commuication. These @@ -701,7 +672,7 @@ int aac_fib_send(u16 command, struct fib *fibptr, unsigned long size, return -ETIMEDOUT; } - if (aac_check_eeh_failure(dev)) + if (unlikely(pci_channel_offline(dev->pdev))) return -EFAULT; if ((blink = aac_adapter_check_health(dev)) > 0) { @@ -801,7 +772,7 @@ int aac_hba_send(u8 command, struct fib *fibptr, fib_callback callback, spin_unlock_irqrestore(>event_lock, flags); - if (aac_check_eeh_failure(dev)) + if (unlikely(pci_channel_offline(dev->pdev))) return -EFAULT; fibptr->flags |= FIB_CONTEXT_FLAG_WAIT; @@ -1583,6 +1554,7 @@ static int _aac_reset_adapter(struct aac_dev *aac, int forced, u8 reset_type) * will ensure that i/o is queisced and the card is flushed in that * case. */ + aac_free_irq(aac); aac_fib_map_free(aac); dma_free_coherent(>pdev->dev, aac->comm_size, aac->comm_addr, aac->comm_phys); @@ -1590,7 +1562,6 @@ static int _aac_reset_adapter(struct aac_dev *aac, int forced, u8 reset_type) aac->comm_phys = 0; kfree(aac->queues); aac->queues = NULL; - aac_free_irq(aac); kfree(aac->fsa_dev); aac->fsa_dev = NULL; diff --git a/drivers/scsi/aacraid/linit.c b/drivers/scsi/aacraid/linit.c index c9252b138c1f..bdf127aaab41 100644 --- a/drivers/scsi/aacraid/linit.c +++ b/drivers/scsi/aacraid/linit.c @@ -1680,6 +1680,9 @@ static int aac_probe_one(struct pci_dev *pdev, const struct pci_device_id *id)
Re: [PATCH 1/2] scsi-mq: Only show the CDB if available
On Wed, 2017-12-06 at 00:38 +0800, Ming Lei wrote: > On Tue, Dec 05, 2017 at 04:22:33PM +, Bart Van Assche wrote: > > > > On Tue, 2017-12-05 at 13:00 +0800, Ming Lei wrote: > > > > > > No, do not mix two different things in one patch, especially the > > > fix part need to be backported to stable. > > > > > > The fix part should aim at V4.15, and the other part can be a > > > V4.16 stuff. > > > > Does this mean that you do not plan to post a v5 of your patch and > > that you want me to rework this patch series? I can do that. > > I believe V4 has been OK for merge, actually the only concern from > James is that 'set the cmnd to NULL *before* calling free so we > narrow the race window.', but that isn't required as I explained, > even though you don't do that in this patch too. > > https://marc.info/?t=15103046433=1=2 > > I will work on V5 if Martin/James thinks it is needed. I don't buy that it isn't needed. The point (and the pattern) is for a destructive action set the signal *before* you execute the action not after. The reason should be obvious: if you set it after you invite a race where the check says OK but the object has gone. Even if the race is highly unlikely, the pattern point still holds. James
Re: [PATCH] scsi: fix race condition when removing target
On Tue, 2017-12-05 at 20:37 +0800, Jason Yan wrote: > > On 2017/12/1 23:35, James Bottomley wrote: > > > > On Fri, 2017-12-01 at 16:40 +0800, Jason Yan wrote: > > > > > > On 2017/12/1 7:56, James Bottomley wrote: > > > > > > > > b/include/scsi/scsi_device.h > > > > index 571ddb49b926..2e4d48d8cd68 100644 > > > > --- a/include/scsi/scsi_device.h > > > > +++ b/include/scsi/scsi_device.h > > > > @@ -380,6 +380,23 @@ extern struct scsi_device > > > > *__scsi_iterate_devices(struct Scsi_Host *, > > > > #define __shost_for_each_device(sdev, shost) \ > > > > list_for_each_entry((sdev), &((shost)->__devices), > > > > siblings) > > > > > > > > > > Seems that __shost_for_each_device() is still not safe. scsi > > > device > > > been deleted stays in the list and put_device() can be called > > > anywhere out of the host lock. > > > > Not if it's used with scsi_get_device(). As I said, I only did a > > cursory inspectiont, so if I've missed a loop, please specify. > > > > The point was more a demonstration of how we could fix the problem > > if we don't change get_device(). > > > > James > > > > Yes, it's OK now. __shost_for_each_device() is not used with > scsi_get_device() yet. > > Another problem is that put_device() cannot be called while holding > the host lock, Yes it can. That's one of the design goals of the execute in process context: you can call it from interrupt context and you can call it with locks held and we'll return immediately and delay all the dangerous stuff until we have a process context. To get the process context to be acquired, the in_interrupt() test must pass (so the spin lock must be acquired irqsave) ; is that condition missing anywhere? James
Re: [PATCH] scsi: fix race condition when removing target
On Fri, 2017-12-01 at 16:40 +0800, Jason Yan wrote: > On 2017/12/1 7:56, James Bottomley wrote: > > b/include/scsi/scsi_device.h > > index 571ddb49b926..2e4d48d8cd68 100644 > > --- a/include/scsi/scsi_device.h > > +++ b/include/scsi/scsi_device.h > > @@ -380,6 +380,23 @@ extern struct scsi_device > > *__scsi_iterate_devices(struct Scsi_Host *, > > #define __shost_for_each_device(sdev, shost) \ > > list_for_each_entry((sdev), &((shost)->__devices), > > siblings) > > > > Seems that __shost_for_each_device() is still not safe. scsi device > been deleted stays in the list and put_device() can be called > anywhere out of the host lock. Not if it's used with scsi_get_device(). As I said, I only did a cursory inspectiont, so if I've missed a loop, please specify. The point was more a demonstration of how we could fix the problem if we don't change get_device(). James
Re: [PATCH] scsi: fix race condition when removing target
On Thu, 2017-11-30 at 16:08 +, Bart Van Assche wrote: > On Thu, 2017-11-30 at 09:18 +0800, Jason Yan wrote: > > > > Hi Bart, I chose the approach in my patch because it has been used > > in scsi_device_get() for years and been proved safe. I think using > > kobject_get_unless_zero() is safe here and can fix this issue too. > > And this approach is beneficial to all users. > > Hello Jason, > > A possible approach is that we start with your patch and defer any > get_device() changes until after your patch has been applied. It's possible, but not quite good enough: the same race can be produced with any of our sdev lists that are deleted in the release callback, because there could be a released device on any one of them. The only way to mediate it properly is to get a reference in the iterator using kobject_get_unless_zero(). It's a bit like a huge can of worms, there's another problem every time I look. However, this is something like the mechanism that could work (and if get_device() ever gets fixed, we can put it in place of kobject_get_unless_zero()). James --- diff --git a/drivers/scsi/53c700.c b/drivers/scsi/53c700.c index 6be77b3aa8a5..c3246f26c02c 100644 --- a/drivers/scsi/53c700.c +++ b/drivers/scsi/53c700.c @@ -1169,6 +1169,7 @@ process_script_interrupt(__u32 dsps, __u32 dsp, struct scsi_cmnd *SCp, } + put_device(>sdev_gendev); } else if(dsps == A_RESELECTED_DURING_SELECTION) { /* This section is full of debugging code because I've diff --git a/drivers/scsi/esp_scsi.c b/drivers/scsi/esp_scsi.c index c3fc34b9964d..7736f3fb2501 100644 --- a/drivers/scsi/esp_scsi.c +++ b/drivers/scsi/esp_scsi.c @@ -1198,6 +1198,7 @@ static int esp_reconnect(struct esp *esp) goto do_reset; } lp = dev->hostdata; + put_device(>sdev_gendev); ent = lp->non_tagged_cmd; if (!ent) { diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c index a7e4fba724b7..c96c11716152 100644 --- a/drivers/scsi/scsi.c +++ b/drivers/scsi/scsi.c @@ -677,11 +677,10 @@ struct scsi_device *__scsi_device_lookup_by_target(struct scsi_target *starget, { struct scsi_device *sdev; - list_for_each_entry(sdev, >devices, same_target_siblings) { - if (sdev->sdev_state == SDEV_DEL) - continue; - if (sdev->lun ==lun) + __sdev_for_each_get(sdev, >devices, same_target_siblings) { + if (sdev->sdev_state != SDEV_DEL && sdev->lun ==lun) return sdev; + put_device(>sdev_gendev); } return NULL; @@ -700,15 +699,16 @@ EXPORT_SYMBOL(__scsi_device_lookup_by_target); struct scsi_device *scsi_device_lookup_by_target(struct scsi_target *starget, u64 lun) { - struct scsi_device *sdev; + struct scsi_device *sdev, *sdev_copy; struct Scsi_Host *shost = dev_to_shost(starget->dev.parent); unsigned long flags; spin_lock_irqsave(shost->host_lock, flags); - sdev = __scsi_device_lookup_by_target(starget, lun); + sdev_copy = sdev = __scsi_device_lookup_by_target(starget, lun); + spin_unlock_irqrestore(shost->host_lock, flags); if (sdev && scsi_device_get(sdev)) sdev = NULL; - spin_unlock_irqrestore(shost->host_lock, flags); + put_device(_copy->sdev_gendev); return sdev; } @@ -735,12 +735,12 @@ struct scsi_device *__scsi_device_lookup(struct Scsi_Host *shost, { struct scsi_device *sdev; - list_for_each_entry(sdev, >__devices, siblings) { - if (sdev->sdev_state == SDEV_DEL) - continue; - if (sdev->channel == channel && sdev->id == id && - sdev->lun ==lun) + __sdev_for_each_get(sdev, >__devices, siblings) { + if (sdev->sdev_state != SDEV_DEL && + sdev->channel == channel && sdev->id == id && + sdev->lun ==lun) return sdev; + put_device(>sdev_gendev); } return NULL; @@ -761,14 +761,15 @@ EXPORT_SYMBOL(__scsi_device_lookup); struct scsi_device *scsi_device_lookup(struct Scsi_Host *shost, uint channel, uint id, u64 lun) { - struct scsi_device *sdev; + struct scsi_device *sdev, *sdev_copy; unsigned long flags; spin_lock_irqsave(shost->host_lock, flags); - sdev = __scsi_device_lookup(shost, channel, id, lun); + sdev_copy = sdev = __scsi_device_lookup(shost, channel, id, lun); + spin_unlock_irqrestore(shost->host_lock, flags); if (sdev && scsi_device_get(sdev)) sdev = NULL; - spin_unlock_irqrestore(shost->host_lock, flags); + put_device(_copy->sdev_gendev); return sdev; } diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c index
Re: [PATCH] scsi: fix race condition when removing target
On Wed, 2017-11-29 at 17:34 +0100, Christoph Hellwig wrote: > On Wed, Nov 29, 2017 at 08:31:48AM -0800, James Bottomley wrote: > > > > This analysis fails here: get_device() on something with refcount 0 > > returns NULL. That triggers the if clause to ignore this device. > > No, it doesn't. Take a look at the get_device and kobject_get > implementations, Hm, so why doesn't get_device use kref_get_unless_zero()? James
Re: [PATCH] scsi: fix race condition when removing target
On Wed, 2017-11-29 at 11:05 +0800, Jason Yan wrote: > In commit fbce4d97fd43 ("scsi: fixup kernel warning during rmmod()"), > we > removed scsi_device_get() and directly called get_device() to > increase > the refcount of the device. But actullay scsi_device_get() will fail > in > three cases: > 1. the scsi device is in SDEV_DEL or SDEV_CANCEL state > 2. get_device() fail > 3. the module is not alive > > The intended purpose was to remove the check of the module alive. > Unfortunately the check of the device state was droped too. And this > introduced a race condition like this: > > CPU0 CPU1 > __scsi_remove_target() > ->iterate shost->__devices > ->scsi_remove_device() > ->put_device() > someone still hold a refcount > sd_release() > - > >scsi_disk_put() > ->put_device() > last put and trigger the device release > > ->goto restart > ->iterate shost->__devices and got the same device > ->get_device() while refcount is 0 This analysis fails here: get_device() on something with refcount 0 returns NULL. That triggers the if clause to ignore this device. We may have a more complex way of triggering a dual put race as the trace implies, but I don't think this is it. [...] > diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c > index 50e7d7e..d398894 100644 > --- a/drivers/scsi/scsi_sysfs.c > +++ b/drivers/scsi/scsi_sysfs.c > @@ -1398,6 +1398,15 @@ void scsi_remove_device(struct scsi_device > *sdev) > } > EXPORT_SYMBOL(scsi_remove_device); > > +static int scsi_device_get_not_deleted(struct scsi_device *sdev) > +{ > + if (sdev->sdev_state == SDEV_DEL || sdev->sdev_state == > SDEV_CANCEL) > + return -ENXIO; > + if (!get_device(>sdev_gendev)) > + return -ENXIO; > + return 0; > +} This is pretty much scsi_device_get() without the try_module get, so they should probably be combined. James > static void __scsi_remove_target(struct scsi_target *starget) > { > struct Scsi_Host *shost = dev_to_shost(starget->dev.parent); > @@ -1415,7 +1424,7 @@ static void __scsi_remove_target(struct > scsi_target *starget) > */ > if (sdev->channel != starget->channel || > sdev->id != starget->id || > - !get_device(>sdev_gendev)) > + scsi_device_get_not_deleted(sdev)) > continue; > spin_unlock_irqrestore(shost->host_lock, flags); > scsi_remove_device(sdev);
Re: UFS utilities
On Wed, 2017-11-29 at 15:39 +, Bean Huo (beanhuo) wrote: > Thread-Topic: UFS utilities > Thread-Index: AdNpKDYJWTPjczfxQHC/pD9WNnnl4w== Could you fix your mail transfer agent? It's using these non standard headers instead of the RFC mandated In-Reply-To: and References: The problem is that this breaks threading on the public mailing lists and means most of us can't tie this email back to the original thread. > > On Mon, Nov 27, 2017 at 11:25:47AM +, Bean Huo (beanhuo) wrote: > > > > > > Hi, all > > > Is there someone knows if exists one utilis dedicated to UFS > > > device, rather than SCSI utils? > > > > > > I have tried sg3-utils, but it is not convenient for the embedded > > > ARM-based system. > > > > > > And also it doesn't support several UFS special command. > > > > What specific UFS commands do you need to make to the device that > > the current driver does not support? > > > There are some UFS/vendor native commands. They are not SCSI based. Can you list them? The reason for asking is that SCSI-3 is pretty extensive in terms of management utilities, so any function you do with vendor native commands that matches an sg3utils function could be done by the latter instead via a SCSI translation layer. James
[GIT PULL] final round of SCSI updates for the 4.14+ merge window
Two basic fixes: one for the sparse problem with the blacklist flags and another for a hang forever in bnx2i. The patch is available here: git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-fixes The short changelog is: Chad Dupuis (1): scsi: bnx2fc: Fix hung task messages when a cleanup response is not received during abort Hannes Reinecke (1): scsi: Use 'blist_flags_t' for scsi_devinfo flags And the diffstat: drivers/scsi/bnx2fc/bnx2fc_io.c | 40 ++--- drivers/scsi/scsi_devinfo.c | 18 +++ drivers/scsi/scsi_priv.h| 15 +++-- drivers/scsi/scsi_scan.c| 2 +- include/scsi/scsi_device.h | 4 +++- include/scsi/scsi_devinfo.h | 50 - 6 files changed, 78 insertions(+), 51 deletions(-) With full diff below. James --- diff --git a/drivers/scsi/bnx2fc/bnx2fc_io.c b/drivers/scsi/bnx2fc/bnx2fc_io.c index 5b6153f23f01..8e2f767147cb 100644 --- a/drivers/scsi/bnx2fc/bnx2fc_io.c +++ b/drivers/scsi/bnx2fc/bnx2fc_io.c @@ -1084,24 +1084,35 @@ static int bnx2fc_abts_cleanup(struct bnx2fc_cmd *io_req) { struct bnx2fc_rport *tgt = io_req->tgt; int rc = SUCCESS; + unsigned int time_left; io_req->wait_for_comp = 1; bnx2fc_initiate_cleanup(io_req); spin_unlock_bh(>tgt_lock); - wait_for_completion(_req->tm_done); - + /* +* Can't wait forever on cleanup response lest we let the SCSI error +* handler wait forever +*/ + time_left = wait_for_completion_timeout(_req->tm_done, + BNX2FC_FW_TIMEOUT); io_req->wait_for_comp = 0; + if (!time_left) + BNX2FC_IO_DBG(io_req, "%s(): Wait for cleanup timed out.\n", + __func__); + /* -* release the reference taken in eh_abort to allow the -* target to re-login after flushing IOs +* Release reference held by SCSI command the cleanup completion +* hits the BNX2FC_CLEANUP case in bnx2fc_process_cq_compl() and +* thus the SCSI command is not returnedi by bnx2fc_scsi_done(). */ kref_put(_req->refcount, bnx2fc_cmd_release); spin_lock_bh(>tgt_lock); return rc; } + /** * bnx2fc_eh_abort - eh_abort_handler api to abort an outstanding * SCSI command @@ -1118,6 +1129,7 @@ int bnx2fc_eh_abort(struct scsi_cmnd *sc_cmd) struct fc_lport *lport; struct bnx2fc_rport *tgt; int rc; + unsigned int time_left; rc = fc_block_scsi_eh(sc_cmd); if (rc) @@ -1194,6 +1206,11 @@ int bnx2fc_eh_abort(struct scsi_cmnd *sc_cmd) if (cancel_delayed_work(_req->timeout_work)) kref_put(_req->refcount, bnx2fc_cmd_release); /* drop timer hold */ + /* +* We don't want to hold off the upper layer timer so simply +* cleanup the command and return that I/O was successfully +* aborted. +*/ rc = bnx2fc_abts_cleanup(io_req); /* This only occurs when an task abort was requested while ABTS is in progress. Setting the IO_CLEANUP flag will skip the @@ -1201,7 +1218,7 @@ int bnx2fc_eh_abort(struct scsi_cmnd *sc_cmd) was a result from the ABTS request rather than the CLEANUP request */ set_bit(BNX2FC_FLAG_IO_CLEANUP, _req->req_flags); - goto out; + goto done; } /* Cancel the current timer running on this io_req */ @@ -1221,7 +1238,11 @@ int bnx2fc_eh_abort(struct scsi_cmnd *sc_cmd) } spin_unlock_bh(>tgt_lock); - wait_for_completion(_req->tm_done); + /* Wait 2 * RA_TOV + 1 to be sure timeout function hasn't fired */ + time_left = wait_for_completion_timeout(_req->tm_done, + (2 * rp->r_a_tov + 1) * HZ); + if (time_left) + BNX2FC_IO_DBG(io_req, "Timed out in eh_abort waiting for tm_done"); spin_lock_bh(>tgt_lock); io_req->wait_for_comp = 0; @@ -1233,8 +1254,12 @@ int bnx2fc_eh_abort(struct scsi_cmnd *sc_cmd) /* Let the scsi-ml try to recover this command */ printk(KERN_ERR PFX "abort failed, xid = 0x%x\n", io_req->xid); + /* +* Cleanup firmware residuals before returning control back +* to SCSI ML. +*/ rc = bnx2fc_abts_cleanup(io_req); - goto out; + goto done; } else { /* * We come here even when there was a race condition @@ -1249,7 +1274,6 @@ int bnx2fc_eh_abort(struct scsi_cmnd *sc_cmd) done: /* release the reference taken in eh_abort */
Re: Seagate External SMR drive USB resets (was: Re: [PATCH] uas: Add US_FL_NO_ATA_1X quirk for one more Seagate device)
On Wed, 2017-11-15 at 17:02 -0500, Alan Stern wrote: > On Wed, 15 Nov 2017, Jérôme Carretero wrote: > > > Because with several of these drives / lots of activity / > occasional > > > issues, it looks like it will be hard to catch (yes I can use > > > usbmon). > > > > > > - It looks like there is no configurable timeout for USB MSC > requests. > > > Perhaps the device is not responding in time and this is why > it's > > > reset? > > Timeouts are set by the SCSI layer. I believe they are rather long > (30 seconds, by default). Presumably they are configurable, although > I would have to do some digging to figure out how. They're in /sys/class/scsi_device//device/timeout jejb@bedivere:~> ls -l /sys/class/scsi_device/0\:0\:0\:0/device/timeout -rw-r--r-- 1 root root 4096 Nov 15 14:37 /sys/class/scsi_device/0:0:0:0/device/timeout jejb@bedivere:~> cat /sys/class/scsi_device/0\:0\:0\:0/device/timeout 30 You can actually have a udev rule adjust them on a per device id basis. James
Re: [PATCH V4] scsi_debugfs: fix crash in scsi_show_rq()
On Wed, 2017-11-15 at 18:09 +0800, Ming Lei wrote: > On Tue, Nov 14, 2017 at 10:14:52AM -0800, James Bottomley wrote: > > > > On Tue, 2017-11-14 at 08:55 +0800, Ming Lei wrote: > > > > > > Hi James, > > > > > > On Mon, Nov 13, 2017 at 10:55:52AM -0800, James Bottomley wrote: > > > > > > > > > > > > On Sat, 2017-11-11 at 10:43 +0800, Ming Lei wrote: > > > > > > > > > > > > > > > So from CPU1's review, cmd->cmnd is in a remote NUMA node, > > > > > __scsi_format_command() is executed much slower than > > > > > mempool_free(). > > > > > So when mempool_free() returns, __scsi_format_command() may > > > > > not fetched the buffer in L1 cache yet, then use-after-free > > > > > is still triggered. > > > > > > > > > > That is why I say this use-after-free is inevitable no matter > > > > > 'setting SCpnt->cmnd to NULL before calling mempool_free()' > > > > > or not. > > > > > > > > The bottom line is that there are several creative ways around > > > > this but the proposed code is currently broken and simply > > > > putting a comment in saying so doesn't make it acceptable. > > > > > > As I explained above, I didn't see one really workable way. Or > > > please correct it if I am wrong. > > > > I simply can't believe it's beyond the wit of man to solve a use > > after free race. About 40% of kernel techniques are devoted to > > this. All I really care about is not losing the PI information we > > previously had. I agree with Bart that NULL cmnd is a good > > indicator, so it seems reasonable to use it. If you have another > > mechanism, feel free to propose it. > > Hi James, > > This patch is my proposal, no others thought of yet. > > We can fix the use-after-free easily via lock, rcu and ..., but some > cost has to pay. In this case, we can't wait too long in show_rq(), > otherwise we may lose important debug info, so I do not have better > way. > > IMO this use-after-free is actually no harm, I don't think we have to > fix it, but it should be better to not let utility warn on this case. Fine, so lose the snide comment and set the cmnd to NULL *before* calling free so we narrow the race window. James
Re: [GIT PULL] first round of SCSI updates for the 4.14+ merge window
On Tue, 2017-11-14 at 16:33 -0800, Linus Torvalds wrote: > On Tue, Nov 14, 2017 at 8:36 AM, James Bottomley > <james.bottom...@hansenpartnership.com> wrote: > > > > > > Hannes Reinecke (14): > > scsi: scsi_devinfo: Reformat blacklist flags > > Ugh, that's just really ugly, but it's also wrong. > > Just having long lines would probably have been much preferable, and > would mean that the commit that explains the bit would show up when > you grep for the bit. > > Having a small helper macro like > > #define BLIST_n(x) ((__force __u32 __bitwise)(1 << (n))) > > woiuld also likely have made it more legible. > > But that only takes care of the ugliness and the greppability. > > It's not right for sparse even _with_ those changes. > > Why? Because "__bitwise" actually creates a new type. So what those > BLIST defines should do is to use a special type something like > > typedef unsigned int __bitwise blist_flags_t; > > and now you have _one_ type thanks to that typedef, that is different > from all the other bitwise types. Then you force all the constants > and the field that implements to have that type, and you have type- > safety: you can use those constants together, and you can assign the > result to the blist flags, but you can't mix it with other __bitwise > types. > > That's why things like this work: > > typedef __u16 __bitwise __le16; > typedef __u16 __bitwise __be16; > > where __le16 and __be16 are actually different types, even though > their underlying _storage_ is the same (a 16-bit unsigned). > > Anyway, I've pulled, because clearly this only matters for sparse, > but I would hope that this gets fixed up, ok? It will, boss; I'll make sure of it. James
Re: [PATCH V4] scsi_debugfs: fix crash in scsi_show_rq()
On Tue, 2017-11-14 at 08:55 +0800, Ming Lei wrote: > Hi James, > > On Mon, Nov 13, 2017 at 10:55:52AM -0800, James Bottomley wrote: > > > > On Sat, 2017-11-11 at 10:43 +0800, Ming Lei wrote: > > > > > > So from CPU1's review, cmd->cmnd is in a remote NUMA node, > > > __scsi_format_command() is executed much slower than > > > mempool_free(). > > > So when mempool_free() returns, __scsi_format_command() may not > > > fetched the buffer in L1 cache yet, then use-after-free > > > is still triggered. > > > > > > That is why I say this use-after-free is inevitable no matter > > > 'setting SCpnt->cmnd to NULL before calling mempool_free()' or > > > not. > > > > The bottom line is that there are several creative ways around this > > but the proposed code is currently broken and simply putting a > > comment in saying so doesn't make it acceptable. > > As I explained above, I didn't see one really workable way. Or please > correct it if I am wrong. I simply can't believe it's beyond the wit of man to solve a use after free race. About 40% of kernel techniques are devoted to this. All I really care about is not losing the PI information we previously had. I agree with Bart that NULL cmnd is a good indicator, so it seems reasonable to use it. If you have another mechanism, feel free to propose it. James
[GIT PULL] first round of SCSI updates for the 4.14+ merge window
This is mostly updates of the usual suspects: lpfc, qla2xxx, hisi_sas, megaraid_sas, pm80xx, mpt3sas, be2iscsi, hpsa. and a host of minor updates. There's no major behaviour change or additions to the core in all of this, so the potential for regressions should be small (biggest potential being in the scsi error handler changes). The patch is available here: git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-misc The short changelog is: Alim Akhtar (4): scsi: ufs: Remove unused UFS_BIT() macro scsi: ufs: Remove unused #defines scsi: ufs-qcom: Remove uses of UFS_BIT() macro scsi: ufs: Change HCI macro to actual bit position Arnd Bergmann (3): scsi: aacraid: use timespec64 instead of timeval scsi: mpt3sas: fix dma_addr_t casts scsi: nsp32: fix logic bug in error handling Arvind Yadav (1): scsi: scsi_transport_iscsi: fix spelling mistake: 'Cound' -> 'Could' Bader Ali Saleh (1): scsi: hpsa: update discovery polling Bart Van Assche (1): scsi: qla2xxx: Suppress a kernel complaint in qla_init_base_qpair() Cathy Avery (1): scsi: storvsc: Allow only one remove lun work item to be issued per lun Christoph Hellwig (4): scsi: qla2xxx: don't break the bsg-lib abstractions scsi: scsi_transport_sas: check reply payload length instead of bidi request scsi: libfc: don't assign resid_len in fc_lport_bsg_request scsi: bfa: don't reset max_segments for every bsg request Christos Gkekas (2): scsi: qedi: Delete redundant variables scsi: bnx2i: Clean up unused pointers in bnx2i_hwi Colin Ian King (6): scsi: megaraid_sas: fix spelling mistake: "thershold" -> "threshold" scsi: aic7xxx: make a couple of functions static scsi: libsas: remove unused variable sas_ha scsi: libcxgbi: remove redundant check and close on csk scsi: ufs: tc-dwc-g210: make arrays static, reduces object code size scsi: lpfc: remove redundant null check on eqe Damien Le Moal (6): scsi: sd_zbc: Fix sd_zbc_read_zoned_characteristics() scsi: sd_zbc: Use well defined macros scsi: sd_zbc: Rearrange code scsi: sd_zbc: Fix comments and indentation scsi: sd_zbc: Move ZBC declarations to scsi_proto.h scsi: sd: Align maximum write same blocks to physical block size Dan Carpenter (4): scsi: mpt3sas: remove a stray KERN_INFO scsi: mpt3sas: cleanup _scsih_pcie_enumeration_event() scsi: lpfc: Fix a precedence bug in lpfc_nvme_io_cmd_wqe_cmpl() scsi: bfa: integer overflow in debugfs Dick Kennedy (19): scsi: lpfc: Fix hard lock up NMI in els timeout handling. scsi: lpfc: change version to 11.4.0.4 scsi: lpfc: Fix oops of nvme host during driver unload. scsi: lpfc: Extend RDP support scsi: lpfc: Ensure io aborts interlocked with the target. scsi: lpfc: Fix secure firmware updates scsi: lpfc: Fix crash in lpfc_nvme_fcp_io_submit during LIP scsi: lpfc: Disable NPIV support if NVME is enabled scsi: lpfc: Fix oops if nvmet_fc_register_targetport fails scsi: lpfc: Revise NVME module parameter descriptions for better clarity scsi: lpfc: Fix FCP hba_wqidx assignment scsi: lpfc: Move CQ processing to a soft IRQ scsi: lpfc: Make ktime sampling more accurate scsi: lpfc: PLOGI failures during NPIV testing scsi: lpfc: Fix warning messages when NVME_TARGET_FC not defined scsi: lpfc: Fix lpfc nvme host rejecting IO with Not Ready message scsi: lpfc: Fix crash receiving ELS while detaching driver scsi: lpfc: fix pci hot plug crash in list_add call scsi: lpfc: fix pci hot plug crash in timer management routines Don Brace (10): scsi: hpsa: bump driver version scsi: hpsa: add enclosure logical identifier scsi: hpsa: correct logical volume removal scsi: hpsa: reduce warning messages on device removal scsi: hpsa: update queue depth for externals scsi: hpsa: correct smart path enabled scsi: hpsa: change timeout for internal cmds scsi: hpsa: add controller checkpoint scsi: smartpqi: correct spelling error in documentation scsi: smartpqi: update driver version to 1.1.2-126 Douglas Gilbert (1): scsi: scsi_debug: write_same: fix error report Duane Grigsby (2): scsi: qla2xxx: Changes to support N2N logins scsi: qla2xxx: Allow MBC_GET_PORT_DATABASE to query and save the port states Finn Thain (1): scsi: NCR5380: Suppress SDTR and WDTR message logging Giridhar Malavali (1): scsi: qla2xxx: Query FC4 type during RSCN processing Hannes Reinecke (14): scsi: scsi_error: Handle power-on reset unit attention scsi: scsi_error: Do not retry illegal function error scsi: scsi_devinfo: Add TRY_VPD_PAGES to HITACHI OPEN-V blacklist entry scsi: scsi_devinfo: Add 'AIX VDASD' to blacklist scsi: scsi_devinfo: fixup string compare scsi: scsi_devinfo:
Re: [PATCH V4] scsi_debugfs: fix crash in scsi_show_rq()
On Sat, 2017-11-11 at 10:43 +0800, Ming Lei wrote: > On Fri, Nov 10, 2017 at 08:51:58AM -0800, James Bottomley wrote: > > > > On Fri, 2017-11-10 at 17:01 +0800, Ming Lei wrote: > > > > > > cmd->cmnd can be allocated/freed dynamically in case of > > > T10_PI_TYPE2_PROTECTION, so we should check it in scsi_show_rq() > > > because this request may have been freed already here, and cmd- > > > >cmnd has been set as null. > > > > > > We choose to accept read-after-free and dump request data as far > > > as possible. > > > > > > This patch fixs the following kernel crash when dumping request > > > via block's debugfs interface: > > > > > > [ 252.962045] BUG: unable to handle kernel NULL pointer > > > dereference > > > at (null) > > > [ 252.963007] IP: scsi_format_opcode_name+0x1a/0x1c0 > > > [ 252.963007] PGD 25e75a067 P4D 25e75a067 PUD 25e75b067 PMD 0 > > > [ 252.963007] Oops: [#1] PREEMPT SMP > > > [ 252.963007] Dumping ftrace buffer: > > > [ 252.963007](ftrace buffer empty) > > > [ 252.963007] Modules linked in: scsi_debug ebtable_filter > > > ebtables > > > ip6table_filter ip6_tables xt_CHECKSUM iptable_mangle > > > ipt_MASQUERADE > > > nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 > > > nf_defrag_ipv4 > > > nf_nat_ipv4 nf_nat nf_conntrack libcrc32c bridge stp llc > > > iptable_filter fuse ip_tables sd_mod sg mptsas mptscsih mptbase > > > crc32c_intel ahci libahci nvme serio_raw scsi_transport_sas > > > libata > > > lpc_ich nvme_core virtio_scsi binfmt_misc dm_mod iscsi_tcp > > > libiscsi_tcp libiscsi scsi_transport_iscsi null_blk configs > > > [ 252.963007] CPU: 1 PID: 1881 Comm: cat Not tainted 4.14.0- > > > rc2.blk_mq_io_hang+ #516 > > > [ 252.963007] Hardware name: QEMU Standard PC (Q35 + ICH9, > > > 2009), > > > BIOS 1.9.3-1.fc25 04/01/2014 > > > [ 252.963007] task: 88025e6f6000 task.stack: > > > c90001bd > > > [ 252.963007] RIP: 0010:scsi_format_opcode_name+0x1a/0x1c0 > > > [ 252.963007] RSP: 0018:c90001bd3c50 EFLAGS: 00010286 > > > [ 252.963007] RAX: 4843 RBX: 0050 RCX: > > > > > > [ 252.963007] RDX: RSI: 0050 RDI: > > > c90001bd3cd8 > > > [ 252.963007] RBP: c90001bd3c88 R08: 1000 R09: > > > > > > [ 252.963007] R10: 880275134000 R11: 88027513406c R12: > > > 0050 > > > [ 252.963007] R13: c90001bd3cd8 R14: R15: > > > > > > [ 252.963007] FS: 7f4d11762700() > > > GS:88027fc4() > > > knlGS: > > > [ 252.963007] CS: 0010 DS: ES: CR0: 80050033 > > > [ 252.963007] CR2: CR3: 00025e789003 CR4: > > > 003606e0 > > > [ 252.963007] DR0: DR1: DR2: > > > > > > [ 252.963007] DR3: DR6: fffe0ff0 DR7: > > > 0400 > > > [ 252.963007] Call Trace: > > > [ 252.963007] __scsi_format_command+0x27/0xc0 > > > [ 252.963007] scsi_show_rq+0x5c/0xc0 > > > [ 252.963007] ? seq_printf+0x4e/0x70 > > > [ 252.963007] ? blk_flags_show+0x5b/0xf0 > > > [ 252.963007] __blk_mq_debugfs_rq_show+0x116/0x130 > > > [ 252.963007] blk_mq_debugfs_rq_show+0xe/0x10 > > > [ 252.963007] seq_read+0xfe/0x3b0 > > > [ 252.963007] ? __handle_mm_fault+0x631/0x1150 > > > [ 252.963007] full_proxy_read+0x54/0x90 > > > [ 252.963007] __vfs_read+0x37/0x160 > > > [ 252.963007] ? security_file_permission+0x9b/0xc0 > > > [ 252.963007] vfs_read+0x96/0x130 > > > [ 252.963007] SyS_read+0x55/0xc0 > > > [ 252.963007] entry_SYSCALL_64_fastpath+0x1a/0xa5 > > > [ 252.963007] RIP: 0033:0x7f4d1127e9b0 > > > [ 252.963007] RSP: 002b:7ffd27082568 EFLAGS: 0246 > > > ORIG_RAX: > > > > > > [ 252.963007] RAX: ffda RBX: 7f4d1154bb20 RCX: > > > 7f4d1127e9b0 > > > [ 252.963007] RDX: 0002 RSI: 7f4d115a7000 RDI: > > > 0003 > > > [ 252.963007] RBP: 00021010 R08: R09: > > > > > > [ 252.963007] R10: 037b R11: 0246 R12: > > >
Re: [PATCH V4] scsi_debugfs: fix crash in scsi_show_rq()
On Fri, 2017-11-10 at 17:01 +0800, Ming Lei wrote: > cmd->cmnd can be allocated/freed dynamically in case of > T10_PI_TYPE2_PROTECTION, > so we should check it in scsi_show_rq() because this request may have > been freed already here, and cmd->cmnd has been set as null. > > We choose to accept read-after-free and dump request data as far as > possible. > > This patch fixs the following kernel crash when dumping request via > block's > debugfs interface: > > [ 252.962045] BUG: unable to handle kernel NULL pointer dereference > at (null) > [ 252.963007] IP: scsi_format_opcode_name+0x1a/0x1c0 > [ 252.963007] PGD 25e75a067 P4D 25e75a067 PUD 25e75b067 PMD 0 > [ 252.963007] Oops: [#1] PREEMPT SMP > [ 252.963007] Dumping ftrace buffer: > [ 252.963007](ftrace buffer empty) > [ 252.963007] Modules linked in: scsi_debug ebtable_filter ebtables > ip6table_filter ip6_tables xt_CHECKSUM iptable_mangle ipt_MASQUERADE > nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 > nf_nat_ipv4 nf_nat nf_conntrack libcrc32c bridge stp llc > iptable_filter fuse ip_tables sd_mod sg mptsas mptscsih mptbase > crc32c_intel ahci libahci nvme serio_raw scsi_transport_sas libata > lpc_ich nvme_core virtio_scsi binfmt_misc dm_mod iscsi_tcp > libiscsi_tcp libiscsi scsi_transport_iscsi null_blk configs > [ 252.963007] CPU: 1 PID: 1881 Comm: cat Not tainted 4.14.0- > rc2.blk_mq_io_hang+ #516 > [ 252.963007] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), > BIOS 1.9.3-1.fc25 04/01/2014 > [ 252.963007] task: 88025e6f6000 task.stack: c90001bd > [ 252.963007] RIP: 0010:scsi_format_opcode_name+0x1a/0x1c0 > [ 252.963007] RSP: 0018:c90001bd3c50 EFLAGS: 00010286 > [ 252.963007] RAX: 4843 RBX: 0050 RCX: > > [ 252.963007] RDX: RSI: 0050 RDI: > c90001bd3cd8 > [ 252.963007] RBP: c90001bd3c88 R08: 1000 R09: > > [ 252.963007] R10: 880275134000 R11: 88027513406c R12: > 0050 > [ 252.963007] R13: c90001bd3cd8 R14: R15: > > [ 252.963007] FS: 7f4d11762700() GS:88027fc4() > knlGS: > [ 252.963007] CS: 0010 DS: ES: CR0: 80050033 > [ 252.963007] CR2: CR3: 00025e789003 CR4: > 003606e0 > [ 252.963007] DR0: DR1: DR2: > > [ 252.963007] DR3: DR6: fffe0ff0 DR7: > 0400 > [ 252.963007] Call Trace: > [ 252.963007] __scsi_format_command+0x27/0xc0 > [ 252.963007] scsi_show_rq+0x5c/0xc0 > [ 252.963007] ? seq_printf+0x4e/0x70 > [ 252.963007] ? blk_flags_show+0x5b/0xf0 > [ 252.963007] __blk_mq_debugfs_rq_show+0x116/0x130 > [ 252.963007] blk_mq_debugfs_rq_show+0xe/0x10 > [ 252.963007] seq_read+0xfe/0x3b0 > [ 252.963007] ? __handle_mm_fault+0x631/0x1150 > [ 252.963007] full_proxy_read+0x54/0x90 > [ 252.963007] __vfs_read+0x37/0x160 > [ 252.963007] ? security_file_permission+0x9b/0xc0 > [ 252.963007] vfs_read+0x96/0x130 > [ 252.963007] SyS_read+0x55/0xc0 > [ 252.963007] entry_SYSCALL_64_fastpath+0x1a/0xa5 > [ 252.963007] RIP: 0033:0x7f4d1127e9b0 > [ 252.963007] RSP: 002b:7ffd27082568 EFLAGS: 0246 ORIG_RAX: > > [ 252.963007] RAX: ffda RBX: 7f4d1154bb20 RCX: > 7f4d1127e9b0 > [ 252.963007] RDX: 0002 RSI: 7f4d115a7000 RDI: > 0003 > [ 252.963007] RBP: 00021010 R08: R09: > > [ 252.963007] R10: 037b R11: 0246 R12: > 00022000 > [ 252.963007] R13: 7f4d1154bb78 R14: 1000 R15: > 0002 > [ 252.963007] Code: c6 e8 1b ca 24 00 eb 8c e8 74 2c ae ff 0f 1f 40 > 00 0f 1f 44 00 00 55 48 89 e5 41 56 41 55 41 54 53 49 89 fd 49 89 f4 > 48 83 ec 18 <44> 0f b6 32 48 c7 45 c8 00 00 00 00 65 48 8b 04 25 28 > 00 00 00 > [ 252.963007] RIP: scsi_format_opcode_name+0x1a/0x1c0 RSP: > c90001bd3c50 > [ 252.963007] CR2: > [ 252.963007] ---[ end trace 83c5bddfbaa6573c ]--- > [ 252.963007] Kernel panic - not syncing: Fatal exception > [ 252.963007] Dumping ftrace buffer: > [ 252.963007](ftrace buffer empty) > [ 252.963007] Kernel Offset: disabled > [ 252.963007] ---[ end Kernel panic - not syncing: Fatal exception > > Fixes: 0eebd005dd07(scsi: Implement blk_mq_ops.show_rq()) > Cc: Bart Van Assche <bart.vanass...@sandisk.com> > Cc: Omar Sandoval <osan...@fb.com> > Cc: Martin K. Petersen <martin.peter...@oracle.com> > Cc: James Bottomley <
Re: target-pending/for-next patches
On Wed, 2017-11-08 at 01:37 -0800, Nicholas A. Bellinger wrote: > On Sun, 2017-11-05 at 08:05 -0800, James Bottomley wrote: > > > > On Sat, 2017-11-04 at 18:14 -0700, Nicholas A. Bellinger wrote: > > > > > > Hi all, > > > > > > Just a friendly email after catching up on patches this week, the > > > majority of those outstanding on the list have been merged into > > > target-pending/for-next. Please see below. > > > > > > For those who submitted patches, please have a look and let me > > > know if anything is else missing. Note there are two exceptions > > > that have been left out for now that I'll be following up with > > > separately. > > > > > > Thus far it's all been either straight-forward bug-fixes, minor > > > cleanups, or small miscellaneous enhancements. AFAICT, nothing > > > looks particularly concerning. > > > > The concern would be you dumping a tree on the eve of a merge > > window, which you're presumably going to send to Linus in a week or > > so, when the last time you appeared was a fixes pull in 12 August, > > because it suggests this lot is just some randomly chosen selection > > to try to keep the tree alive. > > No. The tree contains outstanding bug-fixes and very minor > improvements. > > Do you have an objection to a particular patch..? Just the CommitDate on all of them as I said. The main problem is that it's 4 November and there's no other commit activity for an earlier date in this merge window cycle. > > I really wouldn't do it like this: I know Linus doesn't care too > > much for SCSI stuff and if you're lucky he may be too busy yelling > > at Jens to notice, but if not, you'll find yourself on the > > receiving end of his ire and that will damage the reputation of > > your tree a lot. > > I'd rather target-pending/for-next be judged on patches, and not > preconceived fear of including bug-fixes that address end-user issues > close to a merge window. A tree is judged on the trustworthiness of its maintainer. Linus uses process, feedback, regression handling and other markers to make that assessment. Having a three month gap in the CommitDate indicates a haphazard approach to the process of maintaining the tree. > > If the work of running the target tree has got too much, get a > > patch wrangler who can help with the process stuff you're > > completely lacking, like reviews and testing and long incubation in > > linux-next for exposure to 0day. > > The past release has been exceedingly slow in drivers/target/, with > the rate of incoming patches is down to ~2 per week. Those queued > are specific to drivers/target/ and don't run afoul of linux-next and > 0-day builds. Neither has actually caught up fully to your tree yet, if you check so the results should start coming in soon. > Personally, I'm still focused on bug-fixing and ensuring patches make > it back to v4.x.y linux-stable.git. However, there are still few > people working on bugs specific to host <-> fabric <-> target-core <- > > backend configurations, beyond individual components fixes. > > > > > I'm sure we can find several volunteers. > > Sure, a patch wrangler would be helpful. > > Where things tend to run into trouble is when someone starts merging > subsystem changes, but aren't directly responsible for distro > releases or production users running linux-stable.git code. > > That said, I'd prefer someone with 'skin in the game' and end-users > they support. OK, so did you have anyone in mind? James
Re: [PATCH V3] scsi_debugfs: fix crash in scsi_show_rq()
On Wed, 2017-11-08 at 21:21 +0800, Ming Lei wrote: > cmd->cmnd can be allocated/freed dynamically in case of > T10_PI_TYPE2_PROTECTION, > so we should check it in scsi_show_rq() because this request may have > been freed in scsi_show_rq(). > > This patch fixs the following kernel crash when dumping request via > block's > debugfs interface: > > [ 252.962045] BUG: unable to handle kernel NULL pointer dereference > at (null) > [ 252.963007] IP: scsi_format_opcode_name+0x1a/0x1c0 > [ 252.963007] PGD 25e75a067 P4D 25e75a067 PUD 25e75b067 PMD 0 > [ 252.963007] Oops: [#1] PREEMPT SMP > [ 252.963007] Dumping ftrace buffer: > [ 252.963007](ftrace buffer empty) > [ 252.963007] Modules linked in: scsi_debug ebtable_filter ebtables > ip6table_filter ip6_tables xt_CHECKSUM iptable_mangle ipt_MASQUERADE > nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 > nf_nat_ipv4 nf_nat nf_conntrack libcrc32c bridge stp llc > iptable_filter fuse ip_tables sd_mod sg mptsas mptscsih mptbase > crc32c_intel ahci libahci nvme serio_raw scsi_transport_sas libata > lpc_ich nvme_core virtio_scsi binfmt_misc dm_mod iscsi_tcp > libiscsi_tcp libiscsi scsi_transport_iscsi null_blk configs > [ 252.963007] CPU: 1 PID: 1881 Comm: cat Not tainted 4.14.0- > rc2.blk_mq_io_hang+ #516 > [ 252.963007] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), > BIOS 1.9.3-1.fc25 04/01/2014 > [ 252.963007] task: 88025e6f6000 task.stack: c90001bd > [ 252.963007] RIP: 0010:scsi_format_opcode_name+0x1a/0x1c0 > [ 252.963007] RSP: 0018:c90001bd3c50 EFLAGS: 00010286 > [ 252.963007] RAX: 4843 RBX: 0050 RCX: > > [ 252.963007] RDX: RSI: 0050 RDI: > c90001bd3cd8 > [ 252.963007] RBP: c90001bd3c88 R08: 1000 R09: > > [ 252.963007] R10: 880275134000 R11: 88027513406c R12: > 0050 > [ 252.963007] R13: c90001bd3cd8 R14: R15: > > [ 252.963007] FS: 7f4d11762700() GS:88027fc4() > knlGS: > [ 252.963007] CS: 0010 DS: ES: CR0: 80050033 > [ 252.963007] CR2: CR3: 00025e789003 CR4: > 003606e0 > [ 252.963007] DR0: DR1: DR2: > > [ 252.963007] DR3: DR6: fffe0ff0 DR7: > 0400 > [ 252.963007] Call Trace: > [ 252.963007] __scsi_format_command+0x27/0xc0 > [ 252.963007] scsi_show_rq+0x5c/0xc0 > [ 252.963007] ? seq_printf+0x4e/0x70 > [ 252.963007] ? blk_flags_show+0x5b/0xf0 > [ 252.963007] __blk_mq_debugfs_rq_show+0x116/0x130 > [ 252.963007] blk_mq_debugfs_rq_show+0xe/0x10 > [ 252.963007] seq_read+0xfe/0x3b0 > [ 252.963007] ? __handle_mm_fault+0x631/0x1150 > [ 252.963007] full_proxy_read+0x54/0x90 > [ 252.963007] __vfs_read+0x37/0x160 > [ 252.963007] ? security_file_permission+0x9b/0xc0 > [ 252.963007] vfs_read+0x96/0x130 > [ 252.963007] SyS_read+0x55/0xc0 > [ 252.963007] entry_SYSCALL_64_fastpath+0x1a/0xa5 > [ 252.963007] RIP: 0033:0x7f4d1127e9b0 > [ 252.963007] RSP: 002b:7ffd27082568 EFLAGS: 0246 ORIG_RAX: > > [ 252.963007] RAX: ffda RBX: 7f4d1154bb20 RCX: > 7f4d1127e9b0 > [ 252.963007] RDX: 0002 RSI: 7f4d115a7000 RDI: > 0003 > [ 252.963007] RBP: 00021010 R08: R09: > > [ 252.963007] R10: 037b R11: 0246 R12: > 00022000 > [ 252.963007] R13: 7f4d1154bb78 R14: 1000 R15: > 0002 > [ 252.963007] Code: c6 e8 1b ca 24 00 eb 8c e8 74 2c ae ff 0f 1f 40 > 00 0f 1f 44 00 00 55 48 89 e5 41 56 41 55 41 54 53 49 89 fd 49 89 f4 > 48 83 ec 18 <44> 0f b6 32 48 c7 45 c8 00 00 00 00 65 48 8b 04 25 28 > 00 00 00 > [ 252.963007] RIP: scsi_format_opcode_name+0x1a/0x1c0 RSP: > c90001bd3c50 > [ 252.963007] CR2: > [ 252.963007] ---[ end trace 83c5bddfbaa6573c ]--- > [ 252.963007] Kernel panic - not syncing: Fatal exception > [ 252.963007] Dumping ftrace buffer: > [ 252.963007](ftrace buffer empty) > [ 252.963007] Kernel Offset: disabled > [ 252.963007] ---[ end Kernel panic - not syncing: Fatal exception > > Fixes: 0eebd005dd07(scsi: Implement blk_mq_ops.show_rq()) > Cc: Bart Van Assche <bart.vanass...@sandisk.com> > Cc: Omar Sandoval <osan...@fb.com> > Cc: Martin K. Petersen <martin.peter...@oracle.com> > Cc: James Bottomley <james.bottom...@hansenpartnership.com> > Cc: Hannes Reinecke <h...@suse.com> > Signed-off-by: Ming Lei <ming.