Re: [PATCH] scsi: aacraid: fix io drop during the reset
Prasad, > "FIB_CONTEXT_FLAG_TIMEDOUT" flag is set in aac_eh_abort to > indicate command timeout, using the same flag in reset handler > causes the command to timeout and the IO's were droped. Applied to 4.15/scsi-fixes. Thank you! -- Martin K. Petersen Oracle Linux Engineering
Re: [PATCH v3] Use blist_flags_t consistently
Bart, > Use the type blist_flags_t for all variables that represent blacklist > flags. Additionally, suppress recently introduced sparse warnings > related to blacklist flags. Applied to 4.15/scsi-fixes, thanks! PS. Had to tweak the commit ID. -- Martin K. Petersen Oracle Linux Engineering
Re: bloated workload of SCSI commands at attachment
Doug, > What seems to be to be new is the three repeated sequence involving > the "a3 0c" command. That is REPORT SUPPORTED OPERATION CODES. > Why is the same sequence repeated 3 times? One invocation is probably partition scanning. sd_probe calls revalidate twice. It used to be a somewhat tricky dance of doing it once to get basic device properties, then allocating the gendisk, and finally doing another revalidate pass to fill out the remaining fields that had a dependency on a gendisk being instantiated. Given some of the recent block changes, I am not entirely sure two calls are required. Would be interesting to test... -- Martin K. Petersen Oracle Linux Engineering
Re: [PATCH 5/4] scsi: arcmsr: simplify arcmsr_request_device_map routine
Ching, > simplify arcmsr_request_device_map routine Applied to 4.16/scsi-queue. Thanks! -- Martin K. Petersen Oracle Linux Engineering
Re: [PATCH v2 5/5] scsi_debug: add resp_write_scat function
On 2017-12-14 05:57 PM, Bart Van Assche wrote: On Wed, 2017-12-13 at 20:40 -0500, Douglas Gilbert wrote: + {0, 0x7f, 0x11, F_SA_HIGH | F_D_OUT | FF_MEDIA_IO, resp_write_scat, + NULL, {32, 0xc7, 0, 0, 0, 0, 0x3f, 0x18, 0x0, 0x11, 0xf8, + 0, 0xff, 0xff, 0x0, 0x0} }, /* WRITE SCATTERED(32) */ I know this is consistent with other command declarations in the scsi_debug driver, but it seems weird to me to report to the initiator that only some bits of the ADDITIONAL CDB LENGTH fields are supported (0x18)? From SPC-5: The CDB USAGE DATA field contains information about the CDB for the command being queried. The first byte of the CDB USAGE DATA field shall contain the operation code for the command being queried. If the command being queried contains a service action, then that service action code shall be placed in the CDB USAGE DATA field in the same location as the SERVICE ACTION field of the command CDB. All other bytes of the CDB USAGE DATA field shall contain a usage map for bits in the CDB for the command being queried. Given the opcode and service action for WRITE SCATTERED(32), if the ADDITIONAL CDB LENGTH is other than 0x18, then the command is in error. The mask with the best coverage on that is 0x18. It will catch any bits set that are clear in the corresponding mask position but not catch bits that are clear where the mask is set. So 0x19 will get caught as invalid but 0x10 won't. Attached is some output from a Seagate SAS SSD using the sg_opcodes utility. Note that the 5 variable length cdb_s are all 32 bytes long and have a mask of 0x18 for the ADDITIONAL CDB LENGTH. + lrdp = kzalloc(lbdof_blen, GFP_ATOMIC); + if (lrdp == NULL) { + mk_sense_buffer(scp, ILLEGAL_REQUEST, INSUFF_RES_ASC, + INSUFF_RES_ASCQ); + return illegal_condition_result; + } It seems weird to me to report a fatal error to the initiator if a memory allocation fails? I think other SCSI LLDs return SCSI_MLQUEUE_HOST_BUSY if a memory allocation fails. The scsi_debug driver plays the role of SCSI initiator and target connected via a virtual transport. If the underlying OS reports a resource error than I believe it is the implementer's choice whether to treat it as a initiator error or a target (logical unit) error. I chose the latter because the failure is in code that is roughly equivalent to the device server (inside a LU inside a target). The scsi_debug driver reports SCSI_MLQUEUE_HOST_BUSY in other contexts, both real out-of-resources situations and via error injection. Doug Gilbert SEAGATE ST200FM0073 CB02 Peripheral device type: disk Opcode ServiceCDBName (hex) action(h) size --- 00 6Test Unit Ready cdb usage: 00 00 00 00 00 00 01 6Rezero Unit cdb usage: 01 00 01 00 ff 00 03 6Request Sense cdb usage: 03 e1 00 00 ff 00 04 6Format Unit cdb usage: 04 ff 00 ff ff 00 07 6Reassign Blocks cdb usage: 07 03 00 00 00 00 08 6Read(6) cdb usage: 08 1f ff ff ff 00 0a 6Write(6) cdb usage: 0a 1f ff ff ff 00 0b 6Seek(6) cdb usage: 0b 1f ff ff 00 00 12 6Inquiry cdb usage: 12 e1 ff ff ff 00 15 6Mode select(6) cdb usage: 15 11 00 00 ff 00 16 6Reserve(6) cdb usage: 16 1e ff ff ff 00 17 6Release(6) cdb usage: 17 1e ff 00 00 00 1a 6Mode sense(6) cdb usage: 1a 08 ff ff ff 00 1b 6Start stop unit cdb usage: 1b 01 00 0f f5 00 1c 6Receive diagnostic results cdb usage: 1c 01 ff ff ff 00 1d 6Send diagnostic cdb usage: 1d f7 00 ff ff 00 25 10Read capacity(10) cdb usage: 25 00 7f ff ff ff 00 00 01 00 28 10Read(10) cdb usage: 28 fa ff ff ff ff 00 ff ff 00 2a 10Write(10) cdb usage: 2a fa ff ff ff ff 00 ff ff 00 2b 10Seek(10) cdb usage: 2b 00 ff ff ff ff 00 00 00 00 2e 10Write and verify(10) cdb usage: 2e f2 ff ff ff ff 00 ff ff 00 2f 10Verify(10) cdb usage: 2f f2 ff ff ff ff 00 ff ff 00 35 10Synchronize cache(10) cdb usage: 35 06 ff ff ff ff 00 ff ff 00 37 10Read defect data(10) cdb usage: 37 00 1f 00 00 00 00 ff ff 00 3b010Write buffer, combined header and data [or multiple modes] cdb usage: 3b 00 ff ff ff ff ff ff ff 00 3b210Write buffer, data cdb usage: 3b 02 ff ff ff ff ff ff ff
Re: [PATCH 0/4] scsi: arcmsr: simplify hba_get_config routine
Ching, > patch 1: simplify arcmsr_hbaC_get_config function. > > patch 2: wait iop firmware ready before issue get_config command to iop. > > patch 3: simplify arcmsr_hbaE_get_config function. > > patch 4: simplify all arcmsr_hbaX_get_config routine by call a new > get_adapter_config function Applied to 4.16/scsi-queue, thank you! -- Martin K. Petersen Oracle Linux Engineering
Re: [PATCH 0/9] lpfc updates for 11.4.0.6
James, > This patch set provides a number of bug fixes and 1 addition to > the driver. Applied to 4.16/scsi-queue. Thank you! -- Martin K. Petersen Oracle Linux Engineering
Re: [PATCH 00/19] hisi_sas: PM, RAS, and other misc changes
John, > This patchset contains support for some new > features, and also some modifications and other > fixes. > > Headline changes include: > - v3 hw Suspend and Resume support > - v3 hw RAS (PCI AER) support > - v2 hw HW port error handling support > - other misc fixes and tidy-up Applied to 4.16/scsi-queue. Thank you! -- Martin K. Petersen Oracle Linux Engineering
Re: [PATCH v2 5/5] scsi_debug: add resp_write_scat function
On Wed, 2017-12-13 at 20:40 -0500, Douglas Gilbert wrote: > + {0, 0x7f, 0x11, F_SA_HIGH | F_D_OUT | FF_MEDIA_IO, resp_write_scat, > + NULL, {32, 0xc7, 0, 0, 0, 0, 0x3f, 0x18, 0x0, 0x11, 0xf8, > +0, 0xff, 0xff, 0x0, 0x0} }, /* WRITE SCATTERED(32) */ I know this is consistent with other command declarations in the scsi_debug driver, but it seems weird to me to report to the initiator that only some bits of the ADDITIONAL CDB LENGTH fields are supported (0x18)? From SPC-5: The CDB USAGE DATA field contains information about the CDB for the command being queried. The first byte of the CDB USAGE DATA field shall contain the operation code for the command being queried. If the command being queried contains a service action, then that service action code shall be placed in the CDB USAGE DATA field in the same location as the SERVICE ACTION field of the command CDB. All other bytes of the CDB USAGE DATA field shall contain a usage map for bits in the CDB for the command being queried. > + lrdp = kzalloc(lbdof_blen, GFP_ATOMIC); > + if (lrdp == NULL) { > + mk_sense_buffer(scp, ILLEGAL_REQUEST, INSUFF_RES_ASC, > + INSUFF_RES_ASCQ); > + return illegal_condition_result; > + } It seems weird to me to report a fatal error to the initiator if a memory allocation fails? I think other SCSI LLDs return SCSI_MLQUEUE_HOST_BUSY if a memory allocation fails. Thanks, Bart.
Re: [PATCH v2 4/5] scsi_debug: ARRAY_SIZE() and FF_MEDIA_IO
On Wed, 2017-12-13 at 20:40 -0500, Douglas Gilbert wrote: > Reviewer suggested using the ARRAY_SIZE macro. That reduced one of the subtle > inter-dependencies in the parser's tables. > > It is important that commands which simulate media access, indicate this in > the > flags for that command. The flag to do that was FF_DIRECT_IO. On reflection > FF_MEDIA_IO seems a more accurate description. Thanks! Reviewed-by: Bart Van Assche
Re: [PATCH v2 3/5] scsi_debug: do_device_access, add sg offset argument
On Wed, 2017-12-13 at 20:40 -0500, Douglas Gilbert wrote: > WRITE SCATTERED needs to take several "bites" out of the data-out buffer. > Expand the do_device_access() function to take a sg_skip argument. Reviewed-by: Bart Van Assche
Re: [PATCH v2 2/5] scsi_debug: fix group_number mask
On Wed, 2017-12-13 at 20:40 -0500, Douglas Gilbert wrote: > static const struct opcode_info_t reserve_iarr[1] = { > @@ -508,11 +508,11 @@ static const struct opcode_info_t > opcode_info_arr[SDEB_I_LAST_ELEMENT + 1] = { >0, 0} }, > {3, 0x88, 0, F_D_IN | FF_DIRECT_IO, resp_read_dt0, read_iarr, > {16, 0xfe, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, > - 0xff, 0xff, 0xff, 0x9f, 0xc7} }, /* READ(16) */ > + 0xff, 0xff, 0xff, 0xff, 0xc7} }, /* READ(16) */ > /* 10 */ > {3, 0x8a, 0, F_D_OUT | FF_DIRECT_IO, resp_write_dt0, write_iarr, > {16, 0xfa, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, > - 0xff, 0xff, 0xff, 0x9f, 0xc7} }, /* WRITE(16) */ > + 0xff, 0xff, 0xff, 0xff, 0xc7} }, /* WRITE(16) */ Please mention all changes in the patch description. I think the READ(16) and WRITE(16) changes not only change the GROUP NUMBER field from 5 to 6 bits but also declare the DLD0 bit as supported. Anyway: Reviewed-by: Bart Van Assche
Re: [PATCH v2 1/5] scsi_debug: tab, kstrto changes
On Wed, 2017-12-13 at 20:40 -0500, Douglas Gilbert wrote: > Some of my development tools tend to add spaces (my preference) rather > than tabs (kernel convention). Running unexpand to clean these spaces > up found more of them than checkpatch.pl did. Then checkpatch.pl > complained about other style violations in those newly tabbed lines. Reviewed-by: Bart Van Assche
Re: [PATCH 4/4] sd: use async_probe cookie to avoid deadlocks
On Tue, 2017-12-12 at 09:57 +0100, Hannes Reinecke wrote: > With the current design we're waiting for all async probes to > finish when removing any sd device. > This might lead to a livelock where the 'remove' call is blocking > for any probe calls to finish, and the probe calls are waiting for > a response, which will never be processes as the thread handling > the responses is waiting for the remove call to finish. > Which is completely pointless as we only _really_ care for the > probe on _this_ device to be completed; any other probing can > happily continue for all we care. > So save the async probing cookie in the structure and only wait > if this specific probe is still active. From async_synchronize_cookie_domain(): wait_event(async_done, lowest_in_progress(domain) >= cookie); So async_synchronize_cookie_domain() also waits for multiple asynchronous probes to finish. Does this patch have any advantages over the patch I posted (https://marc.info/?l=linux-scsi&m=151275368714540)? Thanks, Bart.
Re: [PATCH] scsi: check for device state in __scsi_remove_target()
On Thu, 2017-12-14 at 10:02 +0100, Hannes Reinecke wrote: > On 12/14/2017 09:05 AM, Jason Yan wrote: > > > > On 2017/12/14 6:23, Bart Van Assche wrote: > >> On Wed, 2017-12-13 at 14:21 +0100, Hannes Reinecke wrote: > >>> As it turned out device_get() doesn't use kref_get_unless_zero(), > >>> so we will be always getting a device pointer. > >>> So we need to check for the device state in __scsi_remove_target() > >>> to avoid tripping over deleted objects. > >>> > >>> Fixes: fbce4d9 ("scsi: fixup kernel warning during rmmod()") > >> > >> How about adding Reported-by: Jason Yan? See also > >> https://www.spinics.net/lists/linux-scsi/msg115295.html > >> > >> Anyway: > >> > >> Reviewed-by: Bart Van Assche > >> > > > > Seems the same as my patch.So how do we plan to fix this issue, > > pick this approach up or the approach James Bottomley suggested? > > I have sent a patch to change get_device() but Greg seems do not > > like this way. > > > This is actually a real regression, which can be trivially exercised by > eg logging out from two connections to an iSCSI target. > (Our QA tripped across that one). > So I'd rather have to have it fixed reasonably soon. > > While 'get_device' is IMO the 'correct' solution it surely warrants a > broader discussion, plus one would need to audit all callers to check > the return value. If we were going down that route we should probably > add a __must_check to get_device(), too. > But again, this will probably drag out for quite some time, and I'd > prefer to have the fix in the meantime. > > Cheers, > > Hannes We have 2 reproducible test cases, this patch fixes one of them, which was a continually oscillating FC target port w/short dev_loss_tmo. I'm still waiting for a report on the iSCSI test. The code looks good. We need to get some kind of fix for this sooner rather than later. Reviewed-by: Ewan D. Milne
Re: [PATCH 3/4] scsi: add missing get_device() return value checks
On Tue, 2017-12-12 at 09:57 +0100, Hannes Reinecke wrote: > sdev->sdev_gendev.parent = get_device(&starget->dev); > + if (!sdev->sdev_gendev.parent) { > + kfree(sdev); > + return NULL; > + } Are you sure that get_device() can return NULL? Thanks, Bart.
Re: [PATCH 2/4] sd: Check if parent is still present before proceeding with probing
On Tue, 2017-12-12 at 09:57 +0100, Hannes Reinecke wrote: > diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c > index ab75ebd..228b0b62 100644 > --- a/drivers/scsi/sd.c > +++ b/drivers/scsi/sd.c > @@ -3399,6 +3399,10 @@ static int sd_probe(struct device *dev) > } > > device_initialize(&sdkp->dev); > + > + if (!get_device(dev)) > + goto out_free_index; get_device(dev) returns the value of the 'dev' argument. Are you sure the if-test is useful? Did you perhaps want to check sdp->sdev_state like scsi_device_get() does? > sdkp->dev.parent = dev; > sdkp->dev.class = &sd_disk_class; > dev_set_name(&sdkp->dev, "%s", dev_name(dev)); > @@ -3407,7 +3411,6 @@ static int sd_probe(struct device *dev) > if (error) > goto out_free_index; > > - get_device(dev); > dev_set_drvdata(dev, sdkp); > > get_device(&sdkp->dev); /* prevent release before async_schedule */ Since the get_device() call has been moved up, are there any error paths that have to be adjusted? Bart.
Re: [PATCH 1/4] block: check for GENHD_FL_UP in del_gendisk()
On Wed, 2017-12-13 at 08:36 +0100, Hannes Reinecke wrote: > On 12/12/2017 05:57 PM, Bart Van Assche wrote: > > On Tue, 2017-12-12 at 09:57 +0100, Hannes Reinecke wrote: > > > From: Hannes Reinecke > > > > > > When a device is probed asynchronously del_gendisk() might be called > > > before the async probing was run, causing del_gendisk() to crash > > > due to uninitialized sysfs objects. > > > > > > Signed-off-by: Hannes Reinecke > > > --- > > > block/genhd.c | 3 +++ > > > 1 file changed, 3 insertions(+) > > > > > > diff --git a/block/genhd.c b/block/genhd.c > > > index c2223f1..cc40d95 100644 > > > --- a/block/genhd.c > > > +++ b/block/genhd.c > > > @@ -697,6 +697,9 @@ void del_gendisk(struct gendisk *disk) > > > struct disk_part_iter piter; > > > struct hd_struct *part; > > > > > > + if (!(disk->flags & GENHD_FL_UP)) > > > + return; > > > + > > > blk_integrity_del(disk); > > > disk_del_events(disk); > > > > Hello Hannes, > > > > Thank you for having published your approach for increasing disk probing > > concurrency. Your approach looks interesting to me. However, I don't think > > that patches 1/4..3/4 are sufficient to avoid races between e.g. > > device_add_disk() and del_gendisk(). As far as I know no locks are held > > around the device_add_disk() and del_gendisk() calls. Does that mean that > > del_gendisk() can call e.g. blk_integrity_del() before device_add_disk() has > > called blk_integrity_add()? > > In principle, yes. However, the overall idea here is that device_add_disk() > and del_gendisk() are enclosed within upper layer procedures, which themselves > provide additional locking. In our case the sd driver provided synchronisation > guarantees ensuring that device_add_disk() and del_gendisk() doesn't run > concurrently. > > if one is really concerned we could convert disk->flags to a bitmask, and use > atomic bitmask modification; that should avoid any concurrency issues. Hello Hannes, Regarding the scenario explained in a previous e-mail: what guarantees that the device_add_disk() call in sd_probe_async() does not happen concurrently with the device_unregister() call from __scsi_remove_device()? Thanks, Bart.
[PATCH 5/6] Documentetion: cdrom: introduce CDS_DRIVE_ERROR
CDS_DRIVE_NOT_READY is used for the state in which CDROM is 'becoming ready' (typically analyzing the disc) but also as the fallback when nothing else applies. Introduce CDS_DRIVE_ERROR for the fallback case. Signed-off-by: Michal Suchanek --- Documentation/cdrom/cdrom-standard.tex | 8 +++- Documentation/cdrom/ide-cd | 6 ++ Documentation/ioctl/cdrom.txt | 1 + 3 files changed, 14 insertions(+), 1 deletion(-) diff --git a/Documentation/cdrom/cdrom-standard.tex b/Documentation/cdrom/cdrom-standard.tex index 8f85b0e41046..018284ba696a 100644 --- a/Documentation/cdrom/cdrom-standard.tex +++ b/Documentation/cdrom/cdrom-standard.tex @@ -371,11 +371,17 @@ $$ CDS_NO_INFO& no information available\cr CDS_NO_DISC& no disc is inserted, tray is closed\cr CDS_TRAY_OPEN& tray is opened\cr -CDS_DRIVE_NOT_READY& something is wrong, tray is moving?\cr +CDS_DRIVE_NOT_READY& tray just closed?\cr CDS_DISC_OK& a disc is loaded and everything is fine\cr +CDS_DRIVE_ERROR& something is wrong\cr } $$ +Note: The IDE and SCSI cdroms have a status code 'drive becoming ready' which +is typically returned when the drive has just closed and is analyzing the disc. +For other cdrom types this state is not reported by the hardware or not +implemented by the driver. + \subsection{$Int\ media_changed(struct\ cdrom_device_info * cdi, int\ disc_nr)$} This function is very similar to the original function in $struct\ diff --git a/Documentation/cdrom/ide-cd b/Documentation/cdrom/ide-cd index a5f2a7f1ff46..9324a8fd9a39 100644 --- a/Documentation/cdrom/ide-cd +++ b/Documentation/cdrom/ide-cd @@ -455,6 +455,9 @@ main (int argc, char **argv) case CDS_DRIVE_NOT_READY: printf ("Drive Not Ready.\n"); break; + case CDS_DRIVE_ERROR: + printf ("Drive problem.\n"); + break; default: printf ("This Should not happen!\n"); break; @@ -481,6 +484,9 @@ main (int argc, char **argv) case CDS_NO_INFO: printf ("No Information available."); break; + case CDS_DRIVE_ERROR: + printf ("Drive problem.\n"); + break; default: printf ("This Should not happen!\n"); break; diff --git a/Documentation/ioctl/cdrom.txt b/Documentation/ioctl/cdrom.txt index a4d62a9d6771..7720d11807c3 100644 --- a/Documentation/ioctl/cdrom.txt +++ b/Documentation/ioctl/cdrom.txt @@ -700,6 +700,7 @@ CDROM_DRIVE_STATUS Get tray position, etc. CDS_TRAY_OPEN CDS_DRIVE_NOT_READY CDS_DISC_OK + CDS_DRIVE_ERROR -1 error error returns: -- 2.13.6
[PATCH 0/6] Fix cdrom autoclose
Hello, there is cdrom autoclose feature that is supposed to close the tray, wait for the disc to become ready, and then open the device. This used to work in ancient times. Then in old times there was a hack in util-linux which worked around the breakage which probably resulted from switching to scsi emulation. Currently util-linux maintainer refuses to merge another hack on the basis that kernel still has the feature so it should be fixed there. Indeed, to implement this feature effectively from userspace one would need to know when the CD-ROM is in the "drive becoming ready" state which is knowledge that never leaves the hardware-specific driver and is passed neither to userspace nor the generic cdrom driver. So this patchset fixes the kernel autoclose implementation in cdrom.c and to do so reports the "drive becoming ready" state from the harware specific drivers. Michal Suchanek (6): delay: add poll_event_interruptible cdrom: factor out common open_for_* code cdrom: wait for tray to close cdrom: introduce CDS_DRIVE_ERROR Documentetion: cdrom: introduce CDS_DRIVE_ERROR cdrom: wait for drive to become ready Documentation/cdrom/cdrom-standard.tex | 8 ++- Documentation/cdrom/ide-cd | 6 ++ Documentation/ioctl/cdrom.txt | 1 + drivers/block/paride/pcd.c | 2 +- drivers/cdrom/cdrom.c | 124 - drivers/cdrom/gdrom.c | 2 +- drivers/ide/ide-cd_ioctl.c | 12 ++-- drivers/scsi/sr_ioctl.c| 2 +- include/linux/delay.h | 12 include/uapi/linux/cdrom.h | 1 + 10 files changed, 99 insertions(+), 71 deletions(-) -- 2.13.6
[PATCH 3/3] virtio_scsi: Implement 'native LUN' feature
The 'native LUN' feature allows virtio-scsi to pass in the LUN numbers from the underlying storage directly, without having to modify the LUN number itself. It works by shifting the existing LUN number down by 8 bytes, and add the virtio-specific 8-byte LUN steering header. With that virtio doesn't have to mangle the LUN number, allowing us to pass the 'real' LUN number to the guest. Of course, we do cut off the last 8 bytes of the 'real' LUN number, but I'm not aware of any array utilizing that, so the impact should be negligible. Signed-off-by: Hannes Reinecke --- drivers/scsi/virtio_scsi.c | 62 ++-- include/uapi/linux/virtio_scsi.h | 1 + 2 files changed, 48 insertions(+), 15 deletions(-) diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c index f925fbd..63c2c85 100644 --- a/drivers/scsi/virtio_scsi.c +++ b/drivers/scsi/virtio_scsi.c @@ -356,8 +356,12 @@ static void virtscsi_handle_transport_reset(struct virtio_scsi *vscsi, struct scsi_device *sdev; struct Scsi_Host *shost = virtio_scsi_host(vscsi->vdev); unsigned int target = event->lun[1]; - unsigned int lun = (event->lun[2] << 8) | event->lun[3]; + u64 lun; + if (virtio_has_feature(vscsi->vdev, VIRTIO_SCSI_F_NATIVE_LUN)) + lun = scsilun_to_int((struct scsi_lun *)event->lun) >> 16; + else + lun = (event->lun[2] << 8) | event->lun[3]; switch (virtio32_to_cpu(vscsi->vdev, event->reason)) { case VIRTIO_SCSI_EVT_RESET_RESCAN: scsi_add_device(shost, 0, target, lun); @@ -368,7 +372,7 @@ static void virtscsi_handle_transport_reset(struct virtio_scsi *vscsi, scsi_remove_device(sdev); scsi_device_put(sdev); } else { - pr_err("SCSI device %d 0 %d %d not found\n", + pr_err("SCSI device %d 0 %d %llu not found\n", shost->host_no, target, lun); } break; @@ -383,13 +387,17 @@ static void virtscsi_handle_param_change(struct virtio_scsi *vscsi, struct scsi_device *sdev; struct Scsi_Host *shost = virtio_scsi_host(vscsi->vdev); unsigned int target = event->lun[1]; - unsigned int lun = (event->lun[2] << 8) | event->lun[3]; + u64 lun; u8 asc = virtio32_to_cpu(vscsi->vdev, event->reason) & 255; u8 ascq = virtio32_to_cpu(vscsi->vdev, event->reason) >> 8; + if (virtio_has_feature(vscsi->vdev, VIRTIO_SCSI_F_NATIVE_LUN)) + lun = scsilun_to_int((struct scsi_lun *)event->lun) >> 16; + else + lun = (event->lun[2] << 8) | event->lun[3]; sdev = scsi_device_lookup(shost, 0, target, lun); if (!sdev) { - pr_err("SCSI device %d 0 %d %d not found\n", + pr_err("SCSI device %d 0 %d %llu not found\n", shost->host_no, target, lun); return; } @@ -524,10 +532,16 @@ static void virtio_scsi_init_hdr(struct virtio_device *vdev, int target_id, struct scsi_cmnd *sc) { - cmd->lun[0] = 1; - cmd->lun[1] = target_id; - cmd->lun[2] = (sc->device->lun >> 8) | 0x40; - cmd->lun[3] = sc->device->lun & 0xff; + if (virtio_has_feature(vdev, VIRTIO_SCSI_F_NATIVE_LUN)) { + u64 lun = sc->device->lun << 16; + lun |= ((u64)1 << 8) | (u64)target_id; + int_to_scsilun(lun, (struct scsi_lun *)&cmd->lun); + } else { + cmd->lun[0] = 1; + cmd->lun[1] = target_id; + cmd->lun[2] = (sc->device->lun >> 8) | 0x40; + cmd->lun[3] = sc->device->lun & 0xff; + } cmd->tag = cpu_to_virtio64(vdev, (unsigned long)sc); cmd->task_attr = VIRTIO_SCSI_S_SIMPLE; cmd->prio = 0; @@ -776,11 +790,17 @@ static int virtscsi_device_reset(struct scsi_cmnd *sc) .type = VIRTIO_SCSI_T_TMF, .subtype = cpu_to_virtio32(vscsi->vdev, VIRTIO_SCSI_T_TMF_LOGICAL_UNIT_RESET), - .lun[0] = 1, - .lun[1] = target_id, - .lun[2] = (sc->device->lun >> 8) | 0x40, - .lun[3] = sc->device->lun & 0xff, }; + if (virtio_has_feature(vscsi->vdev, VIRTIO_SCSI_F_NATIVE_LUN)) { + u64 lun = sc->device->lun << 16; + lun |= ((u64)1 << 8) | (u64)target_id; + int_to_scsilun(lun, (struct scsi_lun *)&cmd->req.tmf.lun); + } else { + cmd->req.tmf.lun[0] = 1; + cmd->req.tmf.lun[1] = target_id; + cmd->req.tmf.lun[2] = (sc->device->lun >> 8) | 0x40; + cmd->req.tmf.lun[3] = sc->device->lun & 0xff; + } return virtscsi_tmf(vscsi, cmd); } @@ -851,10 +871,18 @@ static int virtsc
[PATCH 2/3] virtio-scsi: Add FC transport class
When a device announces an 'FC' protocol we should be pulling in the FC transport class to have the rports etc setup correctly. Signed-off-by: Hannes Reinecke --- drivers/scsi/virtio_scsi.c | 323 ++--- 1 file changed, 277 insertions(+), 46 deletions(-) diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c index a561e90..f925fbd 100644 --- a/drivers/scsi/virtio_scsi.c +++ b/drivers/scsi/virtio_scsi.c @@ -25,11 +25,13 @@ #include #include #include +#include #include #include #include #include #include +#include #include #include @@ -91,6 +93,12 @@ struct virtio_scsi_vq { * an atomic_t. */ struct virtio_scsi_target_state { + struct list_head list; + struct fc_rport *rport; + struct virtio_scsi *vscsi; + int target_id; + bool removed; + seqcount_t tgt_seq; /* Count of outstanding requests. */ @@ -117,8 +125,12 @@ struct virtio_scsi { /* Protected by event_vq lock */ bool stop_events; + int protocol; int next_target_id; + u64 wwnn; + u64 wwpn; struct work_struct rescan_work; + struct list_head target_list; spinlock_t rescan_lock; struct virtio_scsi_vq ctrl_vq; @@ -128,6 +140,7 @@ struct virtio_scsi { static struct kmem_cache *virtscsi_cmd_cache; static mempool_t *virtscsi_cmd_pool; +static struct scsi_transport_template *virtio_transport_template; static inline struct Scsi_Host *virtio_scsi_host(struct virtio_device *vdev) { @@ -156,15 +169,21 @@ static void virtscsi_compute_resid(struct scsi_cmnd *sc, u32 resid) static void virtscsi_complete_cmd(struct virtio_scsi *vscsi, void *buf) { struct virtio_scsi_cmd *cmd = buf; + struct fc_rport *rport; struct scsi_cmnd *sc = cmd->sc; struct virtio_scsi_cmd_resp *resp = &cmd->resp.cmd; - struct virtio_scsi_target_state *tgt = - scsi_target(sc->device)->hostdata; + struct virtio_scsi_target_state *tgt; dev_dbg(&sc->device->sdev_gendev, "cmd %p response %u status %#02x sense_len %u\n", sc, resp->response, resp->status, resp->sense_len); + rport = starget_to_rport(scsi_target(sc->device)); + if (!rport) + tgt = scsi_target(sc->device)->hostdata; + else + tgt = rport->dd_data; + sc->result = resp->status; virtscsi_compute_resid(sc, virtio32_to_cpu(vscsi->vdev, resp->resid)); switch (resp->response) { @@ -502,10 +521,11 @@ static int virtscsi_kick_cmd(struct virtio_scsi_vq *vq, static void virtio_scsi_init_hdr(struct virtio_device *vdev, struct virtio_scsi_cmd_req *cmd, +int target_id, struct scsi_cmnd *sc) { cmd->lun[0] = 1; - cmd->lun[1] = sc->device->id; + cmd->lun[1] = target_id; cmd->lun[2] = (sc->device->lun >> 8) | 0x40; cmd->lun[3] = sc->device->lun & 0xff; cmd->tag = cpu_to_virtio64(vdev, (unsigned long)sc); @@ -517,12 +537,14 @@ static void virtio_scsi_init_hdr(struct virtio_device *vdev, #ifdef CONFIG_BLK_DEV_INTEGRITY static void virtio_scsi_init_hdr_pi(struct virtio_device *vdev, struct virtio_scsi_cmd_req_pi *cmd_pi, + int target_id, struct scsi_cmnd *sc) { struct request *rq = sc->request; struct blk_integrity *bi; - virtio_scsi_init_hdr(vdev, (struct virtio_scsi_cmd_req *)cmd_pi, sc); + virtio_scsi_init_hdr(vdev, (struct virtio_scsi_cmd_req *)cmd_pi, +target_id, sc); if (!rq || !scsi_prot_sg_count(sc)) return; @@ -542,6 +564,7 @@ static void virtio_scsi_init_hdr_pi(struct virtio_device *vdev, static int virtscsi_queuecommand(struct virtio_scsi *vscsi, struct virtio_scsi_vq *req_vq, +int target_id, struct scsi_cmnd *sc) { struct Scsi_Host *shost = virtio_scsi_host(vscsi->vdev); @@ -564,13 +587,15 @@ static int virtscsi_queuecommand(struct virtio_scsi *vscsi, #ifdef CONFIG_BLK_DEV_INTEGRITY if (virtio_has_feature(vscsi->vdev, VIRTIO_SCSI_F_T10_PI)) { - virtio_scsi_init_hdr_pi(vscsi->vdev, &cmd->req.cmd_pi, sc); + virtio_scsi_init_hdr_pi(vscsi->vdev, &cmd->req.cmd_pi, + target_id, sc); memcpy(cmd->req.cmd_pi.cdb, sc->cmnd, sc->cmd_len); req_size = sizeof(cmd->req.cmd_pi); } else #endif { - virtio_scsi_init_hdr(vscsi->vdev, &cmd->req.cmd, sc); + virtio_scsi_init_hdr(vscsi->vdev, &cmd->req.cmd, +target_id, sc); memc
[PATCH 0/3] virtio-vfc implementation
Hi all, here's my attempt to implement a 'Virtual FC' emulation for virtio-scsi, based on the presentation at the KVM Forum in Prague. This doesn't so much implement the FC protocol per se, it just uses the port information to hook in the FC transport class, making the device to look like a FC device. This patchset has a complementary patchset for the qemu virtio-scsi driver, which will be posted to the qemu-devel list. As usual, comments and reviews are welcome. Hannes Reinecke (3): virtio-scsi: implement target rescan virtio-scsi: Add FC transport class virtio_scsi: Implement 'native LUN' feature drivers/scsi/virtio_scsi.c | 582 --- include/uapi/linux/virtio_scsi.h | 16 ++ 2 files changed, 554 insertions(+), 44 deletions(-) -- 1.8.5.6
[PATCH 1/3] virtio-scsi: implement target rescan
Implement the 'rescan' virtio-scsi feature. Rescanning works by sending a 'rescan' virtio-scsi command with the next requested target id to the backend. The backend will respond with the next used target id or '-1' if no more targets are found. This avoids scanning all possible targets. Signed-off-by: Hannes Reinecke --- drivers/scsi/virtio_scsi.c | 239 ++- include/uapi/linux/virtio_scsi.h | 15 +++ 2 files changed, 250 insertions(+), 4 deletions(-) diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c index 7c28e8d..a561e90 100644 --- a/drivers/scsi/virtio_scsi.c +++ b/drivers/scsi/virtio_scsi.c @@ -46,12 +46,14 @@ struct virtio_scsi_cmd { struct virtio_scsi_cmd_req_picmd_pi; struct virtio_scsi_ctrl_tmf_req tmf; struct virtio_scsi_ctrl_an_req an; + struct virtio_scsi_rescan_reqrescan; } req; union { struct virtio_scsi_cmd_resp cmd; struct virtio_scsi_ctrl_tmf_resp tmf; struct virtio_scsi_ctrl_an_resp an; struct virtio_scsi_event evt; + struct virtio_scsi_rescan_resp rescan; } resp; } cacheline_aligned_in_smp; @@ -115,6 +117,10 @@ struct virtio_scsi { /* Protected by event_vq lock */ bool stop_events; + int next_target_id; + struct work_struct rescan_work; + spinlock_t rescan_lock; + struct virtio_scsi_vq ctrl_vq; struct virtio_scsi_vq event_vq; struct virtio_scsi_vq req_vqs[]; @@ -318,6 +324,11 @@ static void virtscsi_cancel_event_work(struct virtio_scsi *vscsi) for (i = 0; i < VIRTIO_SCSI_EVENT_LEN; i++) cancel_work_sync(&vscsi->event_list[i].work); + + spin_lock_irq(&vscsi->rescan_lock); + vscsi->next_target_id = -1; + spin_unlock_irq(&vscsi->rescan_lock); + cancel_work_sync(&vscsi->rescan_work); } static void virtscsi_handle_transport_reset(struct virtio_scsi *vscsi, @@ -805,6 +816,168 @@ static enum blk_eh_timer_return virtscsi_eh_timed_out(struct scsi_cmnd *scmnd) return BLK_EH_RESET_TIMER; } +static void virtscsi_rescan_work(struct work_struct *work) +{ + struct virtio_scsi *vscsi = + container_of(work, struct virtio_scsi, rescan_work); + struct Scsi_Host *sh = virtio_scsi_host(vscsi->vdev); + int target_id, ret; + struct virtio_scsi_cmd *cmd; + DECLARE_COMPLETION_ONSTACK(comp); + + spin_lock_irq(&vscsi->rescan_lock); + target_id = vscsi->next_target_id; + if (target_id == -1) { + shost_printk(KERN_INFO, sh, "rescan: terminated\n"); + spin_unlock_irq(&vscsi->rescan_lock); + return; + } + spin_unlock_irq(&vscsi->rescan_lock); + + cmd = mempool_alloc(virtscsi_cmd_pool, GFP_NOIO); + if (!cmd) { + shost_printk(KERN_INFO, sh, "rescan: no memory\n"); + goto scan_host; + } + shost_printk(KERN_INFO, sh, "rescan: next target %d\n", target_id); + memset(cmd, 0, sizeof(*cmd)); + cmd->comp = ∁ + cmd->sc = NULL; + cmd->req.rescan = (struct virtio_scsi_rescan_req){ + .type = VIRTIO_SCSI_T_RESCAN, + .next_id = cpu_to_virtio32(vscsi->vdev, target_id), + }; + + ret = virtscsi_kick_cmd(&vscsi->ctrl_vq, cmd, sizeof(cmd->req.rescan), + sizeof(cmd->resp.rescan)); + if (ret < 0) { + mempool_free(cmd, virtscsi_cmd_pool); + goto scan_host; + } + + wait_for_completion(&comp); + target_id = virtio32_to_cpu(vscsi->vdev, cmd->resp.rescan.id); + if (target_id != -1) { + int transport = virtio32_to_cpu(vscsi->vdev, + cmd->resp.rescan.transport); + spin_lock_irq(&vscsi->rescan_lock); + vscsi->next_target_id = target_id + 1; + spin_unlock_irq(&vscsi->rescan_lock); + shost_printk(KERN_INFO, sh, +"found %s target %d (WWN %*phN)\n", +transport == SCSI_PROTOCOL_FCP ? "FC" : "SAS", +target_id, 8, +cmd->resp.rescan.port_wwn); + scsi_scan_target(&sh->shost_gendev, 0, target_id, +SCAN_WILD_CARD, SCSI_SCAN_INITIAL); + queue_work(system_freezable_wq, &vscsi->rescan_work); + } else { + shost_printk(KERN_INFO, sh, +"rescan: no more targets\n"); + spin_lock_irq(&vscsi->rescan_lock); + vscsi->next_target_id = -1; + spin_unlock_irq(&vscsi->rescan_lock); + } + mempool_free(cmd, virtscsi_cmd_pool); + return; +scan_host: + spin_lock_irq(&vscsi->rescan_lock); +
Re: [PATCH] driver core: Make it safe to use get_device() if the reference count is zero
On 2017/12/14 16:56, Greg KH wrote: On Thu, Dec 14, 2017 at 04:27:56PM +0800, Jason Yan wrote: On 2017/12/14 16:10, Greg KH wrote: On Thu, Dec 14, 2017 at 03:56:46PM +0800, Jason Yan wrote: On 2017/12/14 15:42, Greg KH wrote: On Thu, Dec 14, 2017 at 11:39:36AM +0800, Jason Yan wrote: Some driviers may have the chance to increase a reference count that has dropped to zero when using get_device() because of their design. Then those drivers are broken :) We have met such a issue with scsi: https://www.spinics.net/lists/linux-scsi/msg115295.html The scsi core will keep the scsi device object in the host list after it has been deleted and the iterator can still find it. All of the places where need iterating have to check the state of the scsi device and this makes a lot of code redundancy and complexity. Provide a safe mechanism in get_device() by using kobject_get_unless_zero(). Suggested-by: Bart Van Assche Signed-off-by: Jason Yan CC: Greg Kroah-Hartman CC: Bart Van Assche CC: Ewan D. Milne CC: James E.J. Bottomley CC: Christoph Hellwig --- drivers/base/core.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/base/core.c b/drivers/base/core.c index 12ebd05..cc74810 100644 --- a/drivers/base/core.c +++ b/drivers/base/core.c @@ -1916,7 +1916,7 @@ EXPORT_SYMBOL_GPL(device_register); */ struct device *get_device(struct device *dev) { - return dev ? kobj_to_dev(kobject_get(&dev->kobj)) : NULL; + return dev && kobject_get_unless_zero(&dev->kobj) ? dev : NULL; I really don't want to do this, the bus the device is on should prevent this from happening. Also, once that reference count drops to zero, the memory should be freed, so you really have a stale pointer here, and this code would fail if you had slab debugging enabled anyway. Actually I don't want this either. But the design of scsi core will leave the scsi device on the host list after it is deleted, and it can be found later and the refcount have a very big chance to increase from zero again. And after a lot of discussion it seems that the scsi layer is difficult to change the situation in the near future. Keeping a 'struct device' reference counted chunk of memory on a list that has a different lifetime rule from that device itself, is crazy. The lifetime rule is the same. That device itself will delete from the host list in the destructor, scsi_device_dev_release_usercontext(), and the memory will be freed after that. That's why this issue came out. And yes, I remember how all of this came about, but I really don't have the time to work on it myself... So I don't even think this fixes the issue you think it fixes :) This issue is very easy to reproduce on my machine and I have tested the patch and it really fixes the issue. Even with slab debugging enabled? If so, what is keeping that memory from being freed once the reference count drops to 0? As above, the memory is not freed yet when we increasing the refconunt from zero, so it's nothing to do with slab debugging enabled or not. If we want to free it, we have to grab host lock first to delete it from the list, so if we are grabing the host lock, we can increase the refcount safely from 0 to 1. Wait, what? Once that refcount goes to 0, the release callback will be called, and the memory had better be freed. Any pointer you might still have to the structure is now invalid. The fact that you are somehow still keeping that pointer around is not ok, and slab debugging should have caused the memory to be overwritten and garbage would result if you tried to make this call again. I don't know why scsi have this design. Anyone who is familiar with the history of this design can explain this? thanks, greg k-h .
[Bug 198161] incremental patch-4.9.51-52 on sg.c break Nero 4 nero burning rom application
https://bugzilla.kernel.org/show_bug.cgi?id=198161 Johannes Thumshirn (jthumsh...@suse.de) changed: What|Removed |Added CC||jthumsh...@suse.de --- Comment #2 from Johannes Thumshirn (jthumsh...@suse.de) --- (In reply to Jean-Luc from comment #0) > Hi > > Nero 4 Linux can no longer detect devices since the changes in code in sg.c. > It displays the following message: "Warning: no device detected"... > > Regards Can you please re-test with a more recent kernel (e.g. v4.14) ? I have had submitted a fix for exactly this issue with Nero and believe it is gone. -- You are receiving this mail because: You are watching the assignee of the bug.
[PATCH] scsi: ufs: add trace event for ufs upiu
Add UFS Protocol Information Units(upiu) trace events for ufs driver, used to trace various ufs transaction types- command, task-management and device management. The trace-point format is generic and can be easily adapted to trace other upius if needed. Currently tracing ufs transaction of type 'device management', which this patch introduce, cannot be obtained from any other trace. Device management transactions are used for communication with the device such as reading and writing descriptor or attributes etc. Signed-off-by: Ohad Sharabi --- drivers/scsi/ufs/ufshcd.c | 48 ++ include/trace/events/ufs.h | 28 +++ 2 files changed, 76 insertions(+) diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c index a355d98..6d79c99 100644 --- a/drivers/scsi/ufs/ufshcd.c +++ b/drivers/scsi/ufs/ufshcd.c @@ -274,6 +274,47 @@ static inline void ufshcd_remove_non_printable(char *val) *val = ' '; } +static void ufshcd_add_upiu_trace(struct ufs_hba *hba, unsigned int tag, + const char *str) +{ + struct utp_task_req_desc *descp; + struct utp_upiu_task_req *task_req; + struct utp_upiu_req *rq; + u8 tx_code, *hdr, *tsf; + int off; + + if (!trace_ufshcd_upiu_enabled()) + return; + + off = (int)tag - hba->nutrs; + if (off < 0) { + rq = hba->lrb[tag].ucd_req_ptr; + hdr = (u8 *)&rq->header; + } else { + descp = &hba->utmrdl_base_addr[off]; + task_req = (struct utp_upiu_task_req *)descp->task_req_upiu; + hdr = (u8 *)&task_req->header; + } + + tx_code = hdr[0] & 0x3f; + switch (hdr[0] & 0x3f) { + case UPIU_TRANSACTION_COMMAND: + tsf = (u8 *)&rq->sc.cdb; + break; + case UPIU_TRANSACTION_TASK_REQ: + tsf = (u8 *)&task_req->input_param1; + break; + case UPIU_TRANSACTION_QUERY_REQ: + tsf = (u8 *)&rq->qr; + break; + default: + return; + } + + /* trace UPIU header and Transaction Specific Fields (TSF) */ + trace_ufshcd_upiu(dev_name(hba->dev), str, hdr, tsf); +} + static void ufshcd_add_command_trace(struct ufs_hba *hba, unsigned int tag, const char *str) { @@ -283,6 +324,9 @@ static void ufshcd_add_command_trace(struct ufs_hba *hba, struct ufshcd_lrb *lrbp; int transfer_len = -1; + /* trace UPIU also */ + ufshcd_add_upiu_trace(hba, tag, str); + if (!trace_ufshcd_command_enabled()) return; @@ -5462,11 +5506,14 @@ static int ufshcd_issue_tm_cmd(struct ufs_hba *hba, int lun_id, int task_id, spin_unlock_irqrestore(host->host_lock, flags); + ufshcd_add_upiu_trace(hba, task_tag, "tm_send"); + /* wait until the task management command is completed */ err = wait_event_timeout(hba->tm_wq, test_bit(free_slot, &hba->tm_condition), msecs_to_jiffies(TM_CMD_TIMEOUT)); if (!err) { + ufshcd_add_upiu_trace(hba, task_tag, "tm_complete_err"); dev_err(hba->dev, "%s: task management cmd 0x%.2x timed-out\n", __func__, tm_function); if (ufshcd_clear_tm_cmd(hba, free_slot)) @@ -5475,6 +5522,7 @@ static int ufshcd_issue_tm_cmd(struct ufs_hba *hba, int lun_id, int task_id, err = -ETIMEDOUT; } else { err = ufshcd_task_req_compl(hba, free_slot, tm_response); + ufshcd_add_upiu_trace(hba, task_tag, "tm_complete"); } clear_bit(free_slot, &hba->tm_condition); diff --git a/include/trace/events/ufs.h b/include/trace/events/ufs.h index bf6f826..0b2ff5d 100644 --- a/include/trace/events/ufs.h +++ b/include/trace/events/ufs.h @@ -257,6 +257,34 @@ ) ); +TRACE_EVENT(ufshcd_upiu, + TP_PROTO(const char *dev_name, const char *str, unsigned char *hdr, + unsigned char *tsf), + + TP_ARGS(dev_name, str, hdr, tsf), + + TP_STRUCT__entry( + __string(dev_name, dev_name) + __string(str, str) + __array(unsigned char, hdr, 12) + __array(unsigned char, tsf, 16) + ), + + TP_fast_assign( + __assign_str(dev_name, dev_name); + __assign_str(str, str); + memcpy(__entry->hdr, hdr, sizeof(__entry->hdr)); + memcpy(__entry->tsf, tsf, sizeof(__entry->tsf)); + ), + + TP_printk( + "%s: %s: HDR:%s, CDB:%s", + __get_str(str), __get_str(dev_name), + __print_hex(__entry->hdr, sizeof(__entry->hdr)), + __print_hex(__entry->tsf, sizeof(__entry->tsf)) + ) +); + #endif /* if !defined(_TRACE_UFS_H) || defined(TRACE_HEADER_MULTI_READ) */ /* This part must
Re: [PATCH] scsi: check for device state in __scsi_remove_target()
On 12/14/2017 09:05 AM, Jason Yan wrote: > > On 2017/12/14 6:23, Bart Van Assche wrote: >> On Wed, 2017-12-13 at 14:21 +0100, Hannes Reinecke wrote: >>> As it turned out device_get() doesn't use kref_get_unless_zero(), >>> so we will be always getting a device pointer. >>> So we need to check for the device state in __scsi_remove_target() >>> to avoid tripping over deleted objects. >>> >>> Fixes: fbce4d9 ("scsi: fixup kernel warning during rmmod()") >> >> How about adding Reported-by: Jason Yan? See also >> https://www.spinics.net/lists/linux-scsi/msg115295.html >> >> Anyway: >> >> Reviewed-by: Bart Van Assche >> > > Seems the same as my patch.So how do we plan to fix this issue, > pick this approach up or the approach James Bottomley suggested? > I have sent a patch to change get_device() but Greg seems do not > like this way. > This is actually a real regression, which can be trivially exercised by eg logging out from two connections to an iSCSI target. (Our QA tripped across that one). So I'd rather have to have it fixed reasonably soon. While 'get_device' is IMO the 'correct' solution it surely warrants a broader discussion, plus one would need to audit all callers to check the return value. If we were going down that route we should probably add a __must_check to get_device(), too. But again, this will probably drag out for quite some time, and I'd prefer to have the fix in the meantime. Cheers, Hannes -- Dr. Hannes ReineckezSeries & Storage h...@suse.com +49 911 74053 688 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: F. Imendörffer, J. Smithard, D. Upmanyu, G. Norton HRB 21284 (AG Nürnberg)
[bug report] scsi: qla2xxx: Fix memory leak in dual/target mode
Hello himanshu.madh...@cavium.com, This is a semi-automatic email about new static checker warnings. The patch 7867b98dceb7: "scsi: qla2xxx: Fix memory leak in dual/target mode" from Dec 4, 2017, leads to the following Smatch complaint: drivers/scsi/qla2xxx/qla_mid.c:586 qla25xx_delete_req_que() error: we previously assumed 'req' could be null (see line 580) drivers/scsi/qla2xxx/qla_mid.c:602 qla25xx_delete_rsp_que() error: we previously assumed 'rsp' could be null (see line 596) drivers/scsi/qla2xxx/qla_mid.c 579 580 if (req && vha->flags.qpairs_req_created) { ^^^ Check for NULL 581 req->options |= BIT_0; 582 ret = qla25xx_init_req_que(vha, req); 583 if (ret != QLA_SUCCESS) 584 return QLA_FUNCTION_FAILED; 585 } 586 qla25xx_free_req_que(vha, req); ^^^ Unchecked dereference inside function. 587 588 return ret; [ snip ] 595 596 if (rsp && vha->flags.qpairs_rsp_created) { ^^^ 597 rsp->options |= BIT_0; 598 ret = qla25xx_init_rsp_que(vha, rsp); 599 if (ret != QLA_SUCCESS) 600 return QLA_FUNCTION_FAILED; 601 } 602 qla25xx_free_rsp_que(vha, rsp); ^^^ 603 604 return ret; regards, dan carpenter regards, dan carpenter
Re: [PATCH] driver core: Make it safe to use get_device() if the reference count is zero
On Thu, Dec 14, 2017 at 04:27:56PM +0800, Jason Yan wrote: > > > On 2017/12/14 16:10, Greg KH wrote: > > On Thu, Dec 14, 2017 at 03:56:46PM +0800, Jason Yan wrote: > > > > > > On 2017/12/14 15:42, Greg KH wrote: > > > > On Thu, Dec 14, 2017 at 11:39:36AM +0800, Jason Yan wrote: > > > > > Some driviers may have the chance to increase a reference count that > > > > > has dropped to zero when using get_device() because of their design. > > > > Then those drivers are broken :) > > > > > > > > > We have met such a issue with scsi: > > > > > https://www.spinics.net/lists/linux-scsi/msg115295.html > > > > > > > > > > The scsi core will keep the scsi device object in the host list after > > > > > it has been deleted and the iterator can still find it. All of the > > > > > places where need iterating have to check the state of the scsi device > > > > > and this makes a lot of code redundancy and complexity. > > > > > > > > > > Provide a safe mechanism in get_device() by using > > > > > kobject_get_unless_zero(). > > > > > > > > > > Suggested-by: Bart Van Assche > > > > > Signed-off-by: Jason Yan > > > > > CC: Greg Kroah-Hartman > > > > > CC: Bart Van Assche > > > > > CC: Ewan D. Milne > > > > > CC: James E.J. Bottomley > > > > > CC: Christoph Hellwig > > > > > --- > > > > >drivers/base/core.c | 2 +- > > > > >1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > > > > > diff --git a/drivers/base/core.c b/drivers/base/core.c > > > > > index 12ebd05..cc74810 100644 > > > > > --- a/drivers/base/core.c > > > > > +++ b/drivers/base/core.c > > > > > @@ -1916,7 +1916,7 @@ EXPORT_SYMBOL_GPL(device_register); > > > > > */ > > > > >struct device *get_device(struct device *dev) > > > > >{ > > > > > - return dev ? kobj_to_dev(kobject_get(&dev->kobj)) : NULL; > > > > > + return dev && kobject_get_unless_zero(&dev->kobj) ? dev : NULL; > > > > I really don't want to do this, the bus the device is on should prevent > > > > this from happening. > > > > > > > > Also, once that reference count drops to zero, the memory should be > > > > freed, so you really have a stale pointer here, and this code would fail > > > > if you had slab debugging enabled anyway. > > > > > > Actually I don't want this either. But the design of scsi core will leave > > > the scsi device on the host list after it is deleted, and it can be > > > found later and the refcount have a very big chance to increase from > > > zero again. And after a lot of discussion it seems that the scsi layer > > > is difficult to change the situation in the near future. > > > > Keeping a 'struct device' reference counted chunk of memory on a list > > that has a different lifetime rule from that device itself, is crazy. > > > > The lifetime rule is the same. That device itself will delete from the > host list in the destructor, scsi_device_dev_release_usercontext(), and > the memory will be freed after that. That's why this issue came out. > > > And yes, I remember how all of this came about, but I really don't have > > the time to work on it myself... > > > > > > So I don't even think this fixes the issue you think it fixes :) > > > > > > This issue is very easy to reproduce on my machine and I have tested the > > > patch and it really fixes the issue. > > > > Even with slab debugging enabled? If so, what is keeping that memory > > from being freed once the reference count drops to 0? > > > > As above, the memory is not freed yet when we increasing the refconunt from > zero, so it's nothing to do with slab debugging enabled or not. If > we want to free it, we have to grab host lock first to delete it from > the list, so if we are grabing the host lock, we can increase the > refcount safely from 0 to 1. Wait, what? Once that refcount goes to 0, the release callback will be called, and the memory had better be freed. Any pointer you might still have to the structure is now invalid. The fact that you are somehow still keeping that pointer around is not ok, and slab debugging should have caused the memory to be overwritten and garbage would result if you tried to make this call again. thanks, greg k-h
Re: [PATCH] driver core: Make it safe to use get_device() if the reference count is zero
On 2017/12/14 16:10, Greg KH wrote: On Thu, Dec 14, 2017 at 03:56:46PM +0800, Jason Yan wrote: On 2017/12/14 15:42, Greg KH wrote: On Thu, Dec 14, 2017 at 11:39:36AM +0800, Jason Yan wrote: Some driviers may have the chance to increase a reference count that has dropped to zero when using get_device() because of their design. Then those drivers are broken :) We have met such a issue with scsi: https://www.spinics.net/lists/linux-scsi/msg115295.html The scsi core will keep the scsi device object in the host list after it has been deleted and the iterator can still find it. All of the places where need iterating have to check the state of the scsi device and this makes a lot of code redundancy and complexity. Provide a safe mechanism in get_device() by using kobject_get_unless_zero(). Suggested-by: Bart Van Assche Signed-off-by: Jason Yan CC: Greg Kroah-Hartman CC: Bart Van Assche CC: Ewan D. Milne CC: James E.J. Bottomley CC: Christoph Hellwig --- drivers/base/core.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/base/core.c b/drivers/base/core.c index 12ebd05..cc74810 100644 --- a/drivers/base/core.c +++ b/drivers/base/core.c @@ -1916,7 +1916,7 @@ EXPORT_SYMBOL_GPL(device_register); */ struct device *get_device(struct device *dev) { - return dev ? kobj_to_dev(kobject_get(&dev->kobj)) : NULL; + return dev && kobject_get_unless_zero(&dev->kobj) ? dev : NULL; I really don't want to do this, the bus the device is on should prevent this from happening. Also, once that reference count drops to zero, the memory should be freed, so you really have a stale pointer here, and this code would fail if you had slab debugging enabled anyway. Actually I don't want this either. But the design of scsi core will leave the scsi device on the host list after it is deleted, and it can be found later and the refcount have a very big chance to increase from zero again. And after a lot of discussion it seems that the scsi layer is difficult to change the situation in the near future. Keeping a 'struct device' reference counted chunk of memory on a list that has a different lifetime rule from that device itself, is crazy. The lifetime rule is the same. That device itself will delete from the host list in the destructor, scsi_device_dev_release_usercontext(), and the memory will be freed after that. That's why this issue came out. And yes, I remember how all of this came about, but I really don't have the time to work on it myself... So I don't even think this fixes the issue you think it fixes :) This issue is very easy to reproduce on my machine and I have tested the patch and it really fixes the issue. Even with slab debugging enabled? If so, what is keeping that memory from being freed once the reference count drops to 0? As above, the memory is not freed yet when we increasing the refconunt from zero, so it's nothing to do with slab debugging enabled or not. If we want to free it, we have to grab host lock first to delete it from the list, so if we are grabing the host lock, we can increase the refcount safely from 0 to 1. I think you are just papering over the real issue here, which is one reason I really do not like the get_unless_zero() function at all. thanks, greg k-h .
Re: [PATCH] driver core: Make it safe to use get_device() if the reference count is zero
On Thu, Dec 14, 2017 at 03:56:46PM +0800, Jason Yan wrote: > > On 2017/12/14 15:42, Greg KH wrote: > > On Thu, Dec 14, 2017 at 11:39:36AM +0800, Jason Yan wrote: > > > Some driviers may have the chance to increase a reference count that > > > has dropped to zero when using get_device() because of their design. > > Then those drivers are broken :) > > > > > We have met such a issue with scsi: > > > https://www.spinics.net/lists/linux-scsi/msg115295.html > > > > > > The scsi core will keep the scsi device object in the host list after > > > it has been deleted and the iterator can still find it. All of the > > > places where need iterating have to check the state of the scsi device > > > and this makes a lot of code redundancy and complexity. > > > > > > Provide a safe mechanism in get_device() by using > > > kobject_get_unless_zero(). > > > > > > Suggested-by: Bart Van Assche > > > Signed-off-by: Jason Yan > > > CC: Greg Kroah-Hartman > > > CC: Bart Van Assche > > > CC: Ewan D. Milne > > > CC: James E.J. Bottomley > > > CC: Christoph Hellwig > > > --- > > > drivers/base/core.c | 2 +- > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > diff --git a/drivers/base/core.c b/drivers/base/core.c > > > index 12ebd05..cc74810 100644 > > > --- a/drivers/base/core.c > > > +++ b/drivers/base/core.c > > > @@ -1916,7 +1916,7 @@ EXPORT_SYMBOL_GPL(device_register); > > >*/ > > > struct device *get_device(struct device *dev) > > > { > > > - return dev ? kobj_to_dev(kobject_get(&dev->kobj)) : NULL; > > > + return dev && kobject_get_unless_zero(&dev->kobj) ? dev : NULL; > > I really don't want to do this, the bus the device is on should prevent > > this from happening. > > > > Also, once that reference count drops to zero, the memory should be > > freed, so you really have a stale pointer here, and this code would fail > > if you had slab debugging enabled anyway. > > Actually I don't want this either. But the design of scsi core will leave > the scsi device on the host list after it is deleted, and it can be > found later and the refcount have a very big chance to increase from > zero again. And after a lot of discussion it seems that the scsi layer > is difficult to change the situation in the near future. Keeping a 'struct device' reference counted chunk of memory on a list that has a different lifetime rule from that device itself, is crazy. And yes, I remember how all of this came about, but I really don't have the time to work on it myself... > > So I don't even think this fixes the issue you think it fixes :) > > This issue is very easy to reproduce on my machine and I have tested the > patch and it really fixes the issue. Even with slab debugging enabled? If so, what is keeping that memory from being freed once the reference count drops to 0? I think you are just papering over the real issue here, which is one reason I really do not like the get_unless_zero() function at all. thanks, greg k-h
[bug report] scsi: lpfc: Correct driver deregistrations with host nvme transport
Hello James Smart, This is a semi-automatic email about new static checker warnings. The patch add9d6be3d65: "scsi: lpfc: Correct driver deregistrations with host nvme transport" from Nov 20, 2017, leads to the following Smatch complaint: drivers/scsi/lpfc/lpfc_nvme.c:969 lpfc_nvme_io_cmd_wqe_cmpl() error: we previously assumed 'ndlp' could be null (see line 938) drivers/scsi/lpfc/lpfc_nvme.c 937 938 if (ndlp && NLP_CHK_NODE_ACT(ndlp)) Existing code assumes ndlp can be NULL. 939 atomic_dec(&ndlp->cmd_pending); 940 941 /* Update stats and complete the IO. There is 942 * no need for dma unprep because the nvme_transport 943 * owns the dma address. 944 */ 945 #ifdef CONFIG_SCSI_LPFC_DEBUG_FS 946 if (lpfc_ncmd->ts_cmd_start) { 947 lpfc_ncmd->ts_isr_cmpl = pwqeIn->isr_timestamp; 948 lpfc_ncmd->ts_data_nvme = ktime_get_ns(); 949 phba->ktime_last_cmd = lpfc_ncmd->ts_data_nvme; 950 lpfc_nvme_ktime(phba, lpfc_ncmd); 951 } 952 if (phba->cpucheck_on & LPFC_CHECK_NVME_IO) { 953 if (lpfc_ncmd->cpu != smp_processor_id()) 954 lpfc_printf_vlog(vport, KERN_ERR, LOG_NVME_IOERR, 955 "6701 CPU Check cmpl: " 956 "cpu %d expect %d\n", 957 smp_processor_id(), lpfc_ncmd->cpu); 958 if (lpfc_ncmd->cpu < LPFC_CHECK_CPU_CNT) 959 phba->cpucheck_cmpl_io[lpfc_ncmd->cpu]++; 960 } 961 #endif 962 freqpriv = nCmd->private; 963 freqpriv->nvme_buf = NULL; 964 965 /* NVME targets need completion held off until the abort exchange 966 * completes unless the NVME Rport is getting unregistered. 967 */ 968 if (!(lpfc_ncmd->flags & LPFC_SBUF_XBUSY) || 969 ndlp->upcall_flags & NLP_WAIT_FOR_UNREG) { ^ The patch adds an unchecked dereference. 970 /* Clear the XBUSY flag to prevent double completions. 971 * The nvme rport is getting unregistered and there is regards, dan carpenter
Re: [PATCH] scsi: check for device state in __scsi_remove_target()
On 2017/12/14 6:23, Bart Van Assche wrote: On Wed, 2017-12-13 at 14:21 +0100, Hannes Reinecke wrote: As it turned out device_get() doesn't use kref_get_unless_zero(), so we will be always getting a device pointer. So we need to check for the device state in __scsi_remove_target() to avoid tripping over deleted objects. Fixes: fbce4d9 ("scsi: fixup kernel warning during rmmod()") How about adding Reported-by: Jason Yan? See also https://www.spinics.net/lists/linux-scsi/msg115295.html Anyway: Reviewed-by: Bart Van Assche Seems the same as my patch.So how do we plan to fix this issue, pick this approach up or the approach James Bottomley suggested? I have sent a patch to change get_device() but Greg seems do not like this way.