Re: [PATCH v2 1/8] qla2xxx: kill sessions/log out initiator on RSCN and port down events
On 07/27/2015 08:09 PM, Roland Dreier wrote: On Mon, Jul 27, 2015 at 1:09 AM, Hannes Reinecke h...@suse.de wrote: Hmm? What happened to the original FIXME? Is it not required anymore? Not sure I follow the question. The original FIXME was /* FIXME: Re-enable Global event handling.. */ and this patch indeed re-enables global event handling. I don't think it's a separate patch, it's all part of the same work. Unless I'm misunderstanding... No, not really. If the patch indeed enables global event handling everything's fine. Wasn't just not quite obvious from the description nor the patch itself. So: Reviewed-by: Hannes Reinecke h...@suse.com Cheers, Hannes -- Dr. Hannes ReineckezSeries Storage h...@suse.de +49 911 74053 688 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton HRB 21284 (AG Nürnberg) -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [for 4.1 PATCH resend] libsas: fix sysfs group not found warnings at port teardown time
On Wed, 2015-07-22 at 14:08 -0700, Dan Williams wrote: On Wed, Jul 22, 2015 at 11:28 AM, James Bottomley james.bottom...@hansenpartnership.com wrote: } void sas_device_set_phy(struct domain_device *dev, struct sas_port *port) diff --git a/drivers/scsi/libsas/sas_port.c b/drivers/scsi/libsas/sas_port.c index d3c5297c6c89..9a25ae3a52a4 100644 --- a/drivers/scsi/libsas/sas_port.c +++ b/drivers/scsi/libsas/sas_port.c @@ -219,7 +219,6 @@ void sas_deform_port(struct asd_sas_phy *phy, int gone) if (port-num_phys == 1) { sas_unregister_domain_devices(port, gone); - sas_port_delete(port-port); port-port = NULL; } else { sas_port_delete_phy(port-port, phy-phy); This should become if (port-num_phys == 1) sas_unregister_domain_device(port, gone); sas_port_delete_phy(port-port, phy-phy); So we end up with a port scheduled for destruction with no phys rather than making the last phy association hang around until the DISCE workqueue runs. Sounds ok in theory. It's not really a choice. The specific problem you've introduced with this patch is failure to cope with link flutter: a deform and form event queued sequentially. In the new scheme you're trying to introduce, the destruct event gets queued from the deform but behind the form and the link flutter results in a dead link. I thought just forcing a zero phy port would fix this, but it won't, either the destruct has to run in the context of the deform event or the form has to be queued later than the destruct. I think coupled with the changes above, there needs to be if (port-port) { /* dying port, requeue form event */ resend the PORTE_BYTES_DMAED event return } inside the unmatched port loop in sas_port_form() if nothing is found as well to close this. I don't have a libsas environment handy, I worked with Praveen to validate the version as submitted if you want to re-work it. A couple of days ago, this was so urgent as to have to go outside the usual patch process ... now it's not important enough for you to bother working on it; which is it? James -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [for 4.1 PATCH resend] libsas: fix sysfs group not found warnings at port teardown time
On Mon, Jul 27, 2015 at 8:17 AM, James Bottomley james.bottom...@hansenpartnership.com wrote: On Wed, 2015-07-22 at 14:08 -0700, Dan Williams wrote: On Wed, Jul 22, 2015 at 11:28 AM, James Bottomley james.bottom...@hansenpartnership.com wrote: } void sas_device_set_phy(struct domain_device *dev, struct sas_port *port) diff --git a/drivers/scsi/libsas/sas_port.c b/drivers/scsi/libsas/sas_port.c index d3c5297c6c89..9a25ae3a52a4 100644 --- a/drivers/scsi/libsas/sas_port.c +++ b/drivers/scsi/libsas/sas_port.c @@ -219,7 +219,6 @@ void sas_deform_port(struct asd_sas_phy *phy, int gone) if (port-num_phys == 1) { sas_unregister_domain_devices(port, gone); - sas_port_delete(port-port); port-port = NULL; } else { sas_port_delete_phy(port-port, phy-phy); This should become if (port-num_phys == 1) sas_unregister_domain_device(port, gone); sas_port_delete_phy(port-port, phy-phy); So we end up with a port scheduled for destruction with no phys rather than making the last phy association hang around until the DISCE workqueue runs. Sounds ok in theory. It's not really a choice. The specific problem you've introduced with this patch is failure to cope with link flutter: a deform and form event queued sequentially. In the new scheme you're trying to introduce, the destruct event gets queued from the deform but behind the form and the link flutter results in a dead link. I thought just forcing a zero phy port would fix this, but it won't, either the destruct has to run in the context of the deform event or the form has to be queued later than the destruct. I think coupled with the changes above, there needs to be if (port-port) { /* dying port, requeue form event */ resend the PORTE_BYTES_DMAED event return } inside the unmatched port loop in sas_port_form() if nothing is found as well to close this. I think it's too late. Once the lldd has triggered libsas to start tear down I seem to recall the lldd has the expectation that a new PORTE_BYTES_DMAED triggers the creation of a new port instance for that phy. Once the flutter reaches libsas the race is already lost and the port needs to be torn down, but I would need to take a closer look. I don't have a libsas environment handy, I worked with Praveen to validate the version as submitted if you want to re-work it. A couple of days ago, this was so urgent as to have to go outside the usual patch process ... now it's not important enough for you to bother working on it; which is it? Neither, it was a reviewed patch that was idling in the process. I'm still of the opinion that pinging Andrew in a case like this *is* the expected process, unless there's a place I can check that a patch is still in the application queue? -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH] scsi: fix hang in scsi error handling
-Original Message- From: Hannes Reinecke [mailto:h...@suse.de] Sent: July-27-15 6:39 AM On 07/16/2015 08:55 PM, Kevin Groeneveld wrote: -Original Message- From: Hannes Reinecke [mailto:h...@suse.de] Sent: July-16-15 7:11 AM When the hang occurs shost-host_busy == 2 and shost-host_failed == 1 in the scsi_eh_wakeup function. However this function only wakes the error handler if host_busy == host_failed. Which just means that one command is still outstanding, and we need to wait for it to complete. But see below... So the root cause of the hang is maybe that the second command never completes? Maybe host_failed being non zero is blocking something in the port multiplier code? Hmm. I am really not sure about this. I wasn't sure either, that is one reason why I posted the patch. 'host_busy' indicates the number of outstanding commands, and 'host_failed' is the number of commands which have failed (on the ground that failed commands are considered outstanding, too). So the first hunk would change the behaviour from 'start SCSI EH once all commands are completed or failed' to 'start SCSI EH for _any_ command if scsi_eh_wakeup is called' (note that shost_failed might be '0'...). Which doesn't sound right. So could the patch create any problems by starting the EH any time scsi_eh_wakeup is called? Or is it is just inefficient? SCSI EH _relies_ on the fact that no other commands are outstanding on that SCSI host, hence the contents of eh_entry list won't change. Your patch breaks this assumption, causing some I/O to be lost. I guess this needs further debugging to get to the bottom of it. Any suggestions on things I could try? The fact that the problem goes away when I only enable one CPU core makes me think there is a race happening somewhere. Not sure here. You're effectively creating an endless loop with your patch, assuming that each ioctl will be However, you are effectively creating an endless loop with you testcase, assuming that 'ioctl' finishes all I/O before returning. Which _actually_ is not a requirement; the I/O itself needs to be finished by the time the ioctl returns (obviously), but the _structures_ associated with the ioctl might linger on a bit longer (delayed freeing and whatnot). Yet this is a bit far-fetched, and definitely needs some more analysis. For debugging I would suggest looking at the lifetime of each scsi command, figuring out if by the time the ioctl returns the scsi command is indeed freed up. Thanks for the further feedback on this. I haven't had a lot of time to debug this further. Last week I did tried enabling SCSI logging as you suggested in your previous post. I tried many different combinations of setting /proc/sys/dev/scsi/logging_level to enable different types and levels of debugging. However everything I tried either resulted in not being able to trigger the problem or nothing useful in the log. I was thinking of looking into the SCSI trace functionality to see if that would give more useful results. One thing I did notice which may be a small clue is the following values each time after the hang: /sys/class/scsi_device/0:0:0:0/device/device_busy = 1 (CD-ROM) /sys/class/scsi_device/0:1:0:0/device/device_busy = 0 (HDD) Before the hang the HDD busy value varies from 0 to 31. After the hang the HDD busy value is always 0. Also you might want to play around with the 'usleep' a bit; my assumption is that at one point for a large enough wait the problem goes away. (And, incidentally, we might actually getting more than one pending commands if the sleep is small enough; but this is just conjecture :-) I tried a 10 second usleep. On the first attempt the third ioctl never returned. After a reboot and a second attempt the 10th ioctl never returned. I also tried getting rid of the usleep entirely. If I avoid HDD access at the same time I can get about 100 ioctl calls per second and /sys/.../device_busy never seems to go above 1. As soon as I access the HDD all SCSI access hangs. Kevin -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [for 4.1 PATCH resend] libsas: fix sysfs group not found warnings at port teardown time
On Mon, 2015-07-27 at 08:48 -0700, Dan Williams wrote: On Mon, Jul 27, 2015 at 8:17 AM, James Bottomley james.bottom...@hansenpartnership.com wrote: On Wed, 2015-07-22 at 14:08 -0700, Dan Williams wrote: On Wed, Jul 22, 2015 at 11:28 AM, James Bottomley james.bottom...@hansenpartnership.com wrote: } void sas_device_set_phy(struct domain_device *dev, struct sas_port *port) diff --git a/drivers/scsi/libsas/sas_port.c b/drivers/scsi/libsas/sas_port.c index d3c5297c6c89..9a25ae3a52a4 100644 --- a/drivers/scsi/libsas/sas_port.c +++ b/drivers/scsi/libsas/sas_port.c @@ -219,7 +219,6 @@ void sas_deform_port(struct asd_sas_phy *phy, int gone) if (port-num_phys == 1) { sas_unregister_domain_devices(port, gone); - sas_port_delete(port-port); port-port = NULL; } else { sas_port_delete_phy(port-port, phy-phy); This should become if (port-num_phys == 1) sas_unregister_domain_device(port, gone); sas_port_delete_phy(port-port, phy-phy); So we end up with a port scheduled for destruction with no phys rather than making the last phy association hang around until the DISCE workqueue runs. Sounds ok in theory. It's not really a choice. The specific problem you've introduced with this patch is failure to cope with link flutter: a deform and form event queued sequentially. In the new scheme you're trying to introduce, the destruct event gets queued from the deform but behind the form and the link flutter results in a dead link. I thought just forcing a zero phy port would fix this, but it won't, either the destruct has to run in the context of the deform event or the form has to be queued later than the destruct. I think coupled with the changes above, there needs to be if (port-port) { /* dying port, requeue form event */ resend the PORTE_BYTES_DMAED event return } inside the unmatched port loop in sas_port_form() if nothing is found as well to close this. I think it's too late. Once the lldd has triggered libsas to start tear down I seem to recall the lldd has the expectation that a new PORTE_BYTES_DMAED triggers the creation of a new port instance for that phy. Once the flutter reaches libsas the race is already lost and the port needs to be torn down, but I would need to take a closer look. I don't understand your reasoning. The expectation is that PORTE_BYTES_DMAED leads to port formation. The proposal detects that this event precedes DISCE_DESTRUCT for the port and requeues the event, now after DISC_DESTRUCT, so it gets acted on. Where is the problem? James -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [for 4.1 PATCH resend] libsas: fix sysfs group not found warnings at port teardown time
On Mon, Jul 27, 2015 at 10:11 AM, James Bottomley james.bottom...@hansenpartnership.com wrote: On Mon, 2015-07-27 at 08:48 -0700, Dan Williams wrote: On Mon, Jul 27, 2015 at 8:17 AM, James Bottomley james.bottom...@hansenpartnership.com wrote: On Wed, 2015-07-22 at 14:08 -0700, Dan Williams wrote: On Wed, Jul 22, 2015 at 11:28 AM, James Bottomley james.bottom...@hansenpartnership.com wrote: } void sas_device_set_phy(struct domain_device *dev, struct sas_port *port) diff --git a/drivers/scsi/libsas/sas_port.c b/drivers/scsi/libsas/sas_port.c index d3c5297c6c89..9a25ae3a52a4 100644 --- a/drivers/scsi/libsas/sas_port.c +++ b/drivers/scsi/libsas/sas_port.c @@ -219,7 +219,6 @@ void sas_deform_port(struct asd_sas_phy *phy, int gone) if (port-num_phys == 1) { sas_unregister_domain_devices(port, gone); - sas_port_delete(port-port); port-port = NULL; } else { sas_port_delete_phy(port-port, phy-phy); This should become if (port-num_phys == 1) sas_unregister_domain_device(port, gone); sas_port_delete_phy(port-port, phy-phy); So we end up with a port scheduled for destruction with no phys rather than making the last phy association hang around until the DISCE workqueue runs. Sounds ok in theory. It's not really a choice. The specific problem you've introduced with this patch is failure to cope with link flutter: a deform and form event queued sequentially. In the new scheme you're trying to introduce, the destruct event gets queued from the deform but behind the form and the link flutter results in a dead link. I thought just forcing a zero phy port would fix this, but it won't, either the destruct has to run in the context of the deform event or the form has to be queued later than the destruct. I think coupled with the changes above, there needs to be if (port-port) { /* dying port, requeue form event */ resend the PORTE_BYTES_DMAED event return } inside the unmatched port loop in sas_port_form() if nothing is found as well to close this. I think it's too late. Once the lldd has triggered libsas to start tear down I seem to recall the lldd has the expectation that a new PORTE_BYTES_DMAED triggers the creation of a new port instance for that phy. Once the flutter reaches libsas the race is already lost and the port needs to be torn down, but I would need to take a closer look. I don't understand your reasoning. The expectation is that PORTE_BYTES_DMAED leads to port formation. The proposal detects that this event precedes DISCE_DESTRUCT for the port and requeues the event, now after DISC_DESTRUCT, so it gets acted on. Where is the problem? Ah, I read flutter and mistakenly thought debounce. You're addressing the case where we're fully committed to the port going down, but a port up event is injected between the port down and destruct events. Yes, I agree re-queuing needs to happen, however don't we now have queue re-order problem with respect to a new port down event? It seems events need to be held off in-order while the subordinate DISCE events are processed. -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 1/8] qla2xxx: kill sessions/log out initiator on RSCN and port down events
On Mon, Jul 27, 2015 at 1:09 AM, Hannes Reinecke h...@suse.de wrote: Hmm? What happened to the original FIXME? Is it not required anymore? Not sure I follow the question. The original FIXME was /* FIXME: Re-enable Global event handling.. */ and this patch indeed re-enables global event handling. I don't think it's a separate patch, it's all part of the same work. Unless I'm misunderstanding... -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [for 4.1 PATCH resend] libsas: fix sysfs group not found warnings at port teardown time
On Mon, 2015-07-27 at 08:48 -0700, Dan Williams wrote: On Mon, Jul 27, 2015 at 8:17 AM, James Bottomley james.bottom...@hansenpartnership.com wrote: On Wed, 2015-07-22 at 14:08 -0700, Dan Williams wrote: I don't have a libsas environment handy, I worked with Praveen to validate the version as submitted if you want to re-work it. A couple of days ago, this was so urgent as to have to go outside the usual patch process ... now it's not important enough for you to bother working on it; which is it? Neither, it was a reviewed patch that was idling in the process. I'm still of the opinion that pinging Andrew in a case like this *is* the expected process, unless there's a place I can check that a patch is still in the application queue? I didn't ask you to justify your process, I asked you how important you thought the patch was mainly because of the conflicting signals you've sent. I get that you think I should treat all your patches as important whether you do or not, but the world doesn't quite work like that: patch application is a process of triage. Patches, like this, which have timing related issues potentially leading to races get looked at by me as the last reviewer. The speed of review depends on several factors, but one of which is what type of user visible issue is this causing. The user visible effects of this are a nasty warning message and nothing more, I believe? A useful indicator in this triage is how important the submitter thinks the patch is, which was originally why I asked. James -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] scsi: fix memory leak with scsi-mq
On Thu, 2015-07-16 at 11:40 -0400, Tony Battersby wrote: Fix a memory leak with scsi-mq triggered by commands with large data transfer length. __sg_alloc_table() sets both table-nents and table-orig_nents to the same value. When the scatterlist is DMA-mapped, table-nents is overwritten with the (possibly smaller) size of the DMA-mapped scatterlist, while table-orig_nents retains the original size of the allocated scatterlist. scsi_free_sgtable() should therefore check orig_nents instead of nents, and all code that initializes sdb-table without calling __sg_alloc_table() should set both nents and orig_nents. Fixes: d285203cf647 (scsi: add support for a blk-mq based I/O path.) Cc: sta...@vger.kernel.org # 3.17+ Signed-off-by: Tony Battersby to...@cybernetics.com --- For immediate inclusion. diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c index 106884a..cfadcce 100644 --- a/drivers/scsi/scsi_error.c +++ b/drivers/scsi/scsi_error.c @@ -944,7 +944,7 @@ void scsi_eh_prep_cmnd(struct scsi_cmnd *scmd, struct scsi_eh_save *ses, scmd-sdb.length); scmd-sdb.table.sgl = ses-sense_sgl; scmd-sc_data_direction = DMA_FROM_DEVICE; - scmd-sdb.table.nents = 1; + scmd-sdb.table.nents = scmd-sdb.table.orig_nents = 1; scmd-cmnd[0] = REQUEST_SENSE; scmd-cmnd[4] = scmd-sdb.length; scmd-cmd_len = COMMAND_SIZE(scmd-cmnd[0]); diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index b1a2631..448ebda 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -583,7 +583,7 @@ static struct scatterlist *scsi_sg_alloc(unsigned int nents, gfp_t gfp_mask) static void scsi_free_sgtable(struct scsi_data_buffer *sdb, bool mq) { - if (mq sdb-table.nents = SCSI_MAX_SG_SEGMENTS) + if (mq sdb-table.orig_nents = SCSI_MAX_SG_SEGMENTS) return; __sg_free_table(sdb-table, SCSI_MAX_SG_SEGMENTS, mq, scsi_sg_free); } @@ -597,8 +597,8 @@ static int scsi_alloc_sgtable(struct scsi_data_buffer *sdb, int nents, bool mq) if (mq) { if (nents = SCSI_MAX_SG_SEGMENTS) { - sdb-table.nents = nents; - sg_init_table(sdb-table.sgl, sdb-table.nents); + sdb-table.nents = sdb-table.orig_nents = nents; + sg_init_table(sdb-table.sgl, nents); return 0; } first_chunk = sdb-table.sgl; -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Looks good. James, can we get this in please? Reviewed-by: Ewan D. Milne emi...@redhat.com -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/3] scsi: rescan device if an invalid capacity had been reported
On 07/08/2015 12:41 AM, Hannes Reinecke wrote: Device paths in ALUA state 'standby' do not necessarily support the READ_CAPACITY command. This patch adds a new flag 'invalid_capacity' to the scsi device, and rescans the device if an ALUA state change occurred. Signed-off-by: Hannes Reinecke h...@suse.de --- drivers/scsi/scsi_lib.c| 4 drivers/scsi/sd.c | 19 ++- include/scsi/scsi_device.h | 1 + 3 files changed, 19 insertions(+), 5 deletions(-) diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index c005e42..d4245f0 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -2701,6 +2701,7 @@ static void scsi_evt_emit(struct scsi_device *sdev, struct scsi_event *evt) envp[idx++] = SDEV_UA=INQUIRY_DATA_HAS_CHANGED; break; case SDEV_EVT_CAPACITY_CHANGE_REPORTED: + scsi_rescan_device(sdev-sdev_gendev); envp[idx++] = SDEV_UA=CAPACITY_DATA_HAS_CHANGED; break; case SDEV_EVT_SOFT_THRESHOLD_REACHED_REPORTED: @@ -2713,6 +2714,9 @@ static void scsi_evt_emit(struct scsi_device *sdev, struct scsi_event *evt) envp[idx++] = SDEV_UA=REPORTED_LUNS_DATA_HAS_CHANGED; break; case SDEV_EVT_ALUA_STATE_CHANGE_REPORTED: + if (sdev-invalid_capacity) + scsi_rescan_device(sdev-sdev_gendev); + envp[idx++] = SDEV_UA=ASYMMETRIC_ACCESS_STATE_CHANGED; break; default: diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c index 03fdfa9..7c0bdaa 100644 --- a/drivers/scsi/sd.c +++ b/drivers/scsi/sd.c @@ -1983,14 +1983,18 @@ static int read_capacity_16(struct scsi_disk *sdkp, struct scsi_device *sdp, continue; if (sense_valid sshdr.sense_key == NOT_READY - sshdr.asc == 0x04 sshdr.ascq == 0x0a) + sshdr.asc == 0x04 sshdr.ascq == 0x0a) { /* Target port in transition */ + sdp-invalid_capacity = 1; return 0; + } if (sense_valid sshdr.sense_key == NOT_READY - sshdr.asc == 0x04 sshdr.ascq == 0x0b) + sshdr.asc == 0x04 sshdr.ascq == 0x0b) { /* Target port in standy state */ + sdp-invalid_capacity = 1; return 0; + } } retries--; @@ -2075,14 +2079,18 @@ static int read_capacity_10(struct scsi_disk *sdkp, struct scsi_device *sdp, continue; if (sense_valid sshdr.sense_key == NOT_READY - sshdr.asc == 0x04 sshdr.ascq == 0x0a) + sshdr.asc == 0x04 sshdr.ascq == 0x0a) { /* Target port in transition */ + sdp-invalid_capacity = 1; return 0; + } if (sense_valid sshdr.sense_key == NOT_READY - sshdr.asc == 0x04 sshdr.ascq == 0x0b) + sshdr.asc == 0x04 sshdr.ascq == 0x0b) { /* Target port in standy state */ + sdp-invalid_capacity = 1; return 0; + } } retries--; @@ -2199,7 +2207,8 @@ got_data: assuming 512.\n); if (!sdkp-physical_block_size) sdkp-physical_block_size = sector_size; - } + } else + sdp-invalid_capacity = 0; if (sector_size != 512 sector_size != 1024 diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h index 50c2a36..99bde5a 100644 --- a/include/scsi/scsi_device.h +++ b/include/scsi/scsi_device.h @@ -175,6 +175,7 @@ struct scsi_device { unsigned no_dif:1; /* T10 PI (DIF) should be disabled */ unsigned broken_fua:1; /* Don't set FUA bit */ unsigned lun_in_cdb:1; /* Store LUN bits in CDB[1] */ + unsigned invalid_capacity:1;/* READ_CAPACITY not supported */ Not sure I like the comment for the new field. How about something like current capacity unknown? atomic_t disk_events_disable_depth; /* disable depth for disk events */ -- Lee Duncan SUSE Labs -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [for 4.1 PATCH resend] libsas: fix sysfs group not found warnings at port teardown time
On Mon, 2015-07-27 at 10:55 -0700, Dan Williams wrote: On Mon, Jul 27, 2015 at 10:11 AM, James Bottomley james.bottom...@hansenpartnership.com wrote: On Mon, 2015-07-27 at 08:48 -0700, Dan Williams wrote: On Mon, Jul 27, 2015 at 8:17 AM, James Bottomley james.bottom...@hansenpartnership.com wrote: On Wed, 2015-07-22 at 14:08 -0700, Dan Williams wrote: On Wed, Jul 22, 2015 at 11:28 AM, James Bottomley james.bottom...@hansenpartnership.com wrote: } void sas_device_set_phy(struct domain_device *dev, struct sas_port *port) diff --git a/drivers/scsi/libsas/sas_port.c b/drivers/scsi/libsas/sas_port.c index d3c5297c6c89..9a25ae3a52a4 100644 --- a/drivers/scsi/libsas/sas_port.c +++ b/drivers/scsi/libsas/sas_port.c @@ -219,7 +219,6 @@ void sas_deform_port(struct asd_sas_phy *phy, int gone) if (port-num_phys == 1) { sas_unregister_domain_devices(port, gone); - sas_port_delete(port-port); port-port = NULL; } else { sas_port_delete_phy(port-port, phy-phy); This should become if (port-num_phys == 1) sas_unregister_domain_device(port, gone); sas_port_delete_phy(port-port, phy-phy); So we end up with a port scheduled for destruction with no phys rather than making the last phy association hang around until the DISCE workqueue runs. Sounds ok in theory. It's not really a choice. The specific problem you've introduced with this patch is failure to cope with link flutter: a deform and form event queued sequentially. In the new scheme you're trying to introduce, the destruct event gets queued from the deform but behind the form and the link flutter results in a dead link. I thought just forcing a zero phy port would fix this, but it won't, either the destruct has to run in the context of the deform event or the form has to be queued later than the destruct. I think coupled with the changes above, there needs to be if (port-port) { /* dying port, requeue form event */ resend the PORTE_BYTES_DMAED event return } inside the unmatched port loop in sas_port_form() if nothing is found as well to close this. I think it's too late. Once the lldd has triggered libsas to start tear down I seem to recall the lldd has the expectation that a new PORTE_BYTES_DMAED triggers the creation of a new port instance for that phy. Once the flutter reaches libsas the race is already lost and the port needs to be torn down, but I would need to take a closer look. I don't understand your reasoning. The expectation is that PORTE_BYTES_DMAED leads to port formation. The proposal detects that this event precedes DISCE_DESTRUCT for the port and requeues the event, now after DISC_DESTRUCT, so it gets acted on. Where is the problem? Ah, I read flutter and mistakenly thought debounce. You're addressing the case where we're fully committed to the port going down, but a port up event is injected between the port down and destruct events. Yes, I agree re-queuing needs to happen, however don't we now have queue re-order problem with respect to a new port down event? It seems events need to be held off in-order while the subordinate DISCE events are processed. No, that seems to be the intent of the prior code. The reason port visibility goes immediately (along with all associated phys), is that the port is ready for reuse as soon as sas_deform_port() returns. Destroying the subtree is left to DISCE_DESTRUCT, but since the subtree is not visible, a new port can form even as the elements associated with the old port are being destroyed. the DISCE_DISCOVER that populates it will queue behind the destruct, so everything just works today (modulo the new warning). It would be ideal if we could detect in the destruct queue that the visibility is already gone and make use of it, but we can't. The patch that causes this warning induces a mismatch between the state of the kernfs tree and the kobjects ... we still have to call device_del which leads to kobject_del (which triggers the warning) to bring this state back into alignment. So the disconnected subtree we originally used to make all this work is what's suddenly been rendered invalid. James -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [for 4.1 PATCH resend] libsas: fix sysfs group not found warnings at port teardown time
On Mon, Jul 27, 2015 at 11:07 AM, James Bottomley james.bottom...@hansenpartnership.com wrote: On Mon, 2015-07-27 at 08:48 -0700, Dan Williams wrote: On Mon, Jul 27, 2015 at 8:17 AM, James Bottomley james.bottom...@hansenpartnership.com wrote: On Wed, 2015-07-22 at 14:08 -0700, Dan Williams wrote: I don't have a libsas environment handy, I worked with Praveen to validate the version as submitted if you want to re-work it. A couple of days ago, this was so urgent as to have to go outside the usual patch process ... now it's not important enough for you to bother working on it; which is it? Neither, it was a reviewed patch that was idling in the process. I'm still of the opinion that pinging Andrew in a case like this *is* the expected process, unless there's a place I can check that a patch is still in the application queue? I didn't ask you to justify your process, I asked you how important you thought the patch was mainly because of the conflicting signals you've sent. I get that you think I should treat all your patches as important whether you do or not, but the world doesn't quite work like that: patch application is a process of triage. Patches, like this, which have timing related issues potentially leading to races get looked at by me as the last reviewer. The speed of review depends on several factors, but one of which is what type of user visible issue is this causing. The user visible effects of this are a nasty warning message and nothing more, I believe? A useful indicator in this triage is how important the submitter thinks the patch is, which was originally why I asked. That would be a question to Praveen. It wasn't clear to me whether this sysfs backtrace was a simply a warning or eventually fatal to the box. -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [for 4.1 PATCH resend] libsas: fix sysfs group not found warnings at port teardown time
-Original Message- From: Dan Williams [mailto:dan.j.willi...@intel.com] Sent: Monday, July 27, 2015 11:25 AM To: James Bottomley Cc: Christoph Hellwig; Praveen Murali; linux-scsi; sta...@vger.kernel.org Subject: Re: [for 4.1 PATCH resend] libsas: fix sysfs group not found warnings at port teardown time I didn't ask you to justify your process, I asked you how important you thought the patch was mainly because of the conflicting signals you've sent. I get that you think I should treat all your patches as important whether you do or not, but the world doesn't quite work like that: patch application is a process of triage. Patches, like this, which have timing related issues potentially leading to races get looked at by me as the last reviewer. The speed of review depends on several factors, but one of which is what type of user visible issue is this causing. The user visible effects of this are a nasty warning message and nothing more, I believe? A useful indicator in this triage is how important the submitter thinks the patch is, which was originally why I asked. That would be a question to Praveen. It wasn't clear to me whether this sysfs backtrace was a simply a warning or eventually fatal to the box. As far as I remember, the issue was mostly with the sysfs backtraces. I don’t remember it causing a fatal error; but that could very well be because I did not run it long enough. N�r��yb�X��ǧv�^�){.n�+{{ay�ʇڙ�,j��f���h���z��w��� ���j:+v���w�j�mzZ+�ݢj��!�i
Re: [for 4.1 PATCH resend] libsas: fix sysfs group not found warnings at port teardown time
On Mon, Jul 27, 2015 at 11:38 AM, James Bottomley james.bottom...@hansenpartnership.com wrote: No, that seems to be the intent of the prior code. The reason port visibility goes immediately (along with all associated phys), is that the port is ready for reuse as soon as sas_deform_port() returns. Destroying the subtree is left to DISCE_DESTRUCT, but since the subtree is not visible, a new port can form even as the elements associated with the old port are being destroyed. the DISCE_DISCOVER that populates it will queue behind the destruct, so everything just works today (modulo the new warning). It would be ideal if we could detect in the destruct queue that the visibility is already gone and make use of it, but we can't. The patch that causes this warning induces a mismatch between the state of the kernfs tree and the kobjects ... we still have to call device_del which leads to kobject_del (which triggers the warning) to bring this state back into alignment. So the disconnected subtree we originally used to make all this work is what's suddenly been rendered invalid. Looking at the original report this seemed new for the 3.16 kernel. However, looking closer, that warning in sysfs_remove_group() has been present for a long time before that. I don't recall seeing it back when we were doing focused hotplug testing with the isci driver for the 3.5 / 3.6 kernels, so I believe it is newly uncovered since then. -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] qla2xxx: Return the fabric command state for non-task management requests
Christoph, Currently, the command state within TCM (se_cmd.t_state) only track command states from the point of new to the Back End and from Back End up to QLA driver. From the debug perspective, that¹s only half the picture. The other half comes from qla2xxx¹s private command state provided by this patch. Having another trace point that happens 5~10 seconds ago might be difficult to back track. For task management debugging, dumping the current states (se_cmd + qla_tgt_cmd) of each command when the TMR is received provides some benefit. If you feel uncomfortable with the driver defined binary, one alternative would be to translate QLA¹s private command state into se_cmd¹s new field. This new file would be modify by the fabric layer only. This would limit any regression with existing se_cmd field modification. Regards, Quinn Tran On 7/24/15, 11:29 PM, target-devel-ow...@vger.kernel.org on behalf of Christoph Hellwig target-devel-ow...@vger.kernel.org on behalf of h...@infradead.org wrote: Which debug printk do you care about? I'd much prefer having a trace point inside the driver which could even pretty print it instead of the hack where a driver defined binary value is printed by the core. -- To unsubscribe from this list: send the line unsubscribe target-devel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html attachment: winmail.dat
Re: [PATCH] scsi: fix hang in scsi error handling
On 07/16/2015 08:55 PM, Kevin Groeneveld wrote: -Original Message- From: Hannes Reinecke [mailto:h...@suse.de] Sent: July-16-15 7:11 AM When the hang occurs shost-host_busy == 2 and shost-host_failed == 1 in the scsi_eh_wakeup function. However this function only wakes the error handler if host_busy == host_failed. Which just means that one command is still outstanding, and we need to wait for it to complete. But see below... So the root cause of the hang is maybe that the second command never completes? Maybe host_failed being non zero is blocking something in the port multiplier code? Hmm. I am really not sure about this. I wasn't sure either, that is one reason why I posted the patch. 'host_busy' indicates the number of outstanding commands, and 'host_failed' is the number of commands which have failed (on the ground that failed commands are considered outstanding, too). So the first hunk would change the behaviour from 'start SCSI EH once all commands are completed or failed' to 'start SCSI EH for _any_ command if scsi_eh_wakeup is called' (note that shost_failed might be '0'...). Which doesn't sound right. So could the patch create any problems by starting the EH any time scsi_eh_wakeup is called? Or is it is just inefficient? SCSI EH _relies_ on the fact that no other commands are outstanding on that SCSI host, hence the contents of eh_entry list won't change. Your patch breaks this assumption, causing some I/O to be lost. I guess this needs further debugging to get to the bottom of it. Any suggestions on things I could try? The fact that the problem goes away when I only enable one CPU core makes me think there is a race happening somewhere. Not sure here. You're effectively creating an endless loop with your patch, assuming that each ioctl will be However, you are effectively creating an endless loop with you testcase, assuming that 'ioctl' finishes all I/O before returning. Which _actually_ is not a requirement; the I/O itself needs to be finished by the time the ioctl returns (obviously), but the _structures_ associated with the ioctl might linger on a bit longer (delayed freeing and whatnot). Yet this is a bit far-fetched, and definitely needs some more analysis. For debugging I would suggest looking at the lifetime of each scsi command, figuring out if by the time the ioctl returns the scsi command is indeed freed up. Also you might want to play around with the 'usleep' a bit; my assumption is that at one point for a large enough wait the problem goes away. (And, incidentally, we might actually getting more than one pending commands if the sleep is small enough; but this is just conjecture :-) Cheers, Hannes -- Dr. Hannes ReineckezSeries Storage h...@suse.de +49 911 74053 688 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton HRB 21284 (AG Nürnberg) -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 1/8] qla2xxx: kill sessions/log out initiator on RSCN and port down events
On 07/14/2015 10:00 PM, Himanshu Madhani wrote: From: Roland Dreier rol...@purestorage.com To fix some issues talking to ESX, this patch modifies the qla2xxx driver so that it never logs into remote ports. This has the side effect of getting rid of the rports entirely, which means we never log out of initiators and never tear down sessions when an initiator goes away. This is mostly OK, except that we can run into trouble if we have initiator A assigned FC address X:Y:Z by the fabric talking to us, and then initiator A goes away. Some time (could be a long time) later, initiator B comes along and also gets FC address X:Y:Z (which is available again, because initiator A is gone). If initiator B starts talking to us, then we'll still have the session for initiator A, and since we look up incoming IO based on the FC address X:Y:Z, initiator B will end up using ACLs for initiator A. Fix this by: 1. Handling RSCN events somewhat differently; instead of completely skipping the processing of fcports, we look through the list, and if an fcport disappears, we tell the target code the tear down the session and tell the HBA FW to release the N_Port handle. 2. Handling port down events by flushing all of our sessions. The firmware was already releasing the N_Port handle but we want the target code to drop all the sessions too. Cc: sta...@vger.kernel.org Signed-off-by: Roland Dreier rol...@purestorage.com Signed-off-by: Alexei Potashnik ale...@purestorage.com Acked-by: Quinn Tran quinn.t...@qlogic.com Signed-off-by: Himanshu Madhani himanshu.madh...@qlogic.com --- drivers/scsi/qla2xxx/qla_dbg.c|2 +- drivers/scsi/qla2xxx/qla_init.c | 137 ++--- drivers/scsi/qla2xxx/qla_target.c |9 ++- 3 files changed, 117 insertions(+), 31 deletions(-) diff --git a/drivers/scsi/qla2xxx/qla_dbg.c b/drivers/scsi/qla2xxx/qla_dbg.c index 0e6ee3c..e9ae6b9 100644 --- a/drivers/scsi/qla2xxx/qla_dbg.c +++ b/drivers/scsi/qla2xxx/qla_dbg.c @@ -68,7 +68,7 @@ * | || 0xd101-0xd1fe | * | || 0xd214-0xd2fe | * | Target Mode | 0xe079 || - * | Target Mode Management| 0xf072 | 0xf002 | + * | Target Mode Management| 0xf080 | 0xf002 | * | || 0xf046-0xf049 | * | Target Mode Task Management |0x1000b || * -- diff --git a/drivers/scsi/qla2xxx/qla_init.c b/drivers/scsi/qla2xxx/qla_init.c index 766fdfc..4895a4a 100644 --- a/drivers/scsi/qla2xxx/qla_init.c +++ b/drivers/scsi/qla2xxx/qla_init.c @@ -3462,20 +3462,43 @@ qla2x00_configure_fabric(scsi_qla_host_t *vha) if ((fcport-flags FCF_FABRIC_DEVICE) == 0) continue; - if (fcport-scan_state == QLA_FCPORT_SCAN - atomic_read(fcport-state) == FCS_ONLINE) { - qla2x00_mark_device_lost(vha, fcport, - ql2xplogiabsentdevice, 0); - if (fcport-loop_id != FC_NO_LOOP_ID - (fcport-flags FCF_FCP2_DEVICE) == 0 - fcport-port_type != FCT_INITIATOR - fcport-port_type != FCT_BROADCAST) { - ha-isp_ops-fabric_logout(vha, - fcport-loop_id, - fcport-d_id.b.domain, - fcport-d_id.b.area, - fcport-d_id.b.al_pa); - qla2x00_clear_loop_id(fcport); + if (fcport-scan_state == QLA_FCPORT_SCAN) { + if (qla_ini_mode_enabled(base_vha) + atomic_read(fcport-state) == FCS_ONLINE) { + qla2x00_mark_device_lost(vha, fcport, + ql2xplogiabsentdevice, 0); + if (fcport-loop_id != FC_NO_LOOP_ID + (fcport-flags FCF_FCP2_DEVICE) == 0 + fcport-port_type != FCT_INITIATOR + fcport-port_type != FCT_BROADCAST) { + ha-isp_ops-fabric_logout(vha, + fcport-loop_id, + fcport-d_id.b.domain, + fcport-d_id.b.area, +
Re: [PATCH v2 5/8] qla2xxx: added sess generations to detect RSCN update races
On 07/14/2015 10:00 PM, Himanshu Madhani wrote: From: Alexei Potashnik ale...@purestorage.com RSCN processing in qla2xxx driver can run in parallel with ELS/IO processing. As such the decision to remove disappeared fc port's session could be stale, because a new login sequence has occurred since and created a brand new session. Previous mechanism of dealing with this by delaying deletion request was prone to erroneous deletions if the event that was supposed to cancel the deletion never arrived or has been delayed in processing. New mechanism relies on a time-like generation counter to serialize RSCN updates relative to ELS/IO updates. Cc: sta...@vger.kernel.org Signed-off-by: Alexei Potashnik ale...@purestorage.com Signed-off-by: Himanshu Madhani himanshu.madh...@qlogic.com --- Reviewed-by: Hannes Reinecke h...@suse.com Cheers, Hannes -- Dr. Hannes ReineckezSeries Storage h...@suse.de +49 911 74053 688 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton HRB 21284 (AG Nürnberg) -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 6/8] qla2xxx: disable scsi_transport_fc registration in target mode
On 07/14/2015 10:00 PM, Himanshu Madhani wrote: From: Alexei Potashnik ale...@purestorage.com There are multiple reasons for disabling this: 1. It provides no functional benefit. We pretty much only get a few more sysfs entries for each port, but all that information is already available from /sys/kernel/debug/target/qla-session-X 2. It already only works in private-loop mode. By disabling we'll be getting more uniform behavior with fabric mode. 3. It creates complications for the new PLOGI handling mechanism: scsi_transport_fc port deletion timer could race with new session from initiator and cause logout after successful login. Cc: sta...@vger.kernel.org Signed-off-by: Alexei Potashnik ale...@purestorage.com Signed-off-by: Himanshu Madhani himanshu.madh...@qlogic.com --- Reviewed-by: Hannes Reinecke h...@suse.com Cheers, Hannes -- Dr. Hannes ReineckezSeries Storage h...@suse.de +49 911 74053 688 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton HRB 21284 (AG Nürnberg) -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 7/8] qla2xxx: drop cmds/tmrs arrived while session is being deleted
On 07/14/2015 10:00 PM, Himanshu Madhani wrote: From: Alexei Potashnik ale...@purestorage.com If a new initiator (different WWN) shows up on the same fcport, old initiator's session is scheduled for deletion. But there is a small window between it being marked with QLA_SESS_DELETION_IN_PROGRESS and qlt_unret_sess getting called when new session's commands will keep finding old session in the fcport map. This patch drops cmds/tmrs if they find session in the progress of being deleted. Cc: sta...@vger.kernel.org Signed-off-by: Alexei Potashnik ale...@purestorage.com Acked-by: Quinn Tran quinn.t...@qlogic.com Signed-off-by: Himanshu Madhani himanshu.madh...@qlogic.com --- Reviewed-by: Hannes Reinecke h...@suse.com Cheers, Hannes -- Dr. Hannes ReineckezSeries Storage h...@suse.de +49 911 74053 688 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton HRB 21284 (AG Nürnberg) -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 2/8] qla2xxx: cleanup cmd in qla workqueue before processing TMR
On 07/14/2015 10:00 PM, Himanshu Madhani wrote: From: Swapnil Nagle swapnil.na...@purestorage.com Since cmds go into qla_tgt_wq and TMRs don't, it's possible that TMR like TASK_ABORT can be queued over the cmd for which it was meant. To avoid this race, use a per-port list to keep track of cmds that are enqueued to qla_tgt_wq but not yet processed. When a TMR arrives, iterate through this list and remove any cmds that match the TMR. This patch supports TASK_ABORT and LUN_RESET. Cc: sta...@vger.kernel.org Signed-off-by: Swapnil Nagle swapnil.na...@purestorage.com Signed-off-by: Alexei Potashnik ale...@purestorage.com Acked-by: Quinn Tran quinn.t...@qlogic.com Signed-off-by: Himanshu Madhani himanshu.madh...@qlogic.com --- [ .. ] @@ -3374,14 +3462,25 @@ static void qlt_create_sess_from_atio(struct work_struct *work) unsigned long flags; uint8_t *s_id = op-atio.u.isp24.fcp_hdr.s_id; + spin_lock_irqsave(vha-cmd_list_lock, flags); + list_del(op-cmd_list); + spin_unlock_irqrestore(vha-cmd_list_lock, flags); + + if (op-aborted) { + ql_dbg(ql_dbg_tgt_mgt, vha, 0xf083, + sess_op with tag %u is aborted\n, + op-atio.u.isp24.exchange_addr); + goto out_term; + } + ql_dbg(ql_dbg_tgt_mgt, vha, 0xf022, - qla_target(%d): Unable to find wwn login - (s_id %x:%x:%x), trying to create it manually\n, - vha-vp_idx, s_id[0], s_id[1], s_id[2]); + qla_target(%d): Unable to find wwn login + (s_id %x:%x:%x), trying to create it manually\n, + vha-vp_idx, s_id[0], s_id[1], s_id[2]); if (op-atio.u.raw.entry_count 1) { ql_dbg(ql_dbg_tgt_mgt, vha, 0xf023, - Dropping multy entry atio %p\n, op-atio); + Dropping multy entry atio %p\n, op-atio); goto out_term; } Pointless indentation change. Other than that: Reviewed-by: Hannes Reinecke h...@suse.com Cheers, Hannes -- Dr. Hannes ReineckezSeries Storage h...@suse.de +49 911 74053 688 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton HRB 21284 (AG Nürnberg) -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 4/8] qla2xxx: Abort stale cmds on qla_tgt_wq when plogi arrives
On 07/14/2015 10:00 PM, Himanshu Madhani wrote: From: Alexei Potashnik ale...@purestorage.com cancel any commands from initiator's s_id that are still waiting on qla_tgt_wq when PLOGI arrives. Cc: sta...@vger.kernel.org Signed-off-by: Alexei Potashnik ale...@purestorage.com Acked-by: Quinn Tran quinn.t...@qlogic.com Signed-off-by: Himanshu Madhani himanshu.madh...@qlogic.com --- Reviewed-by: Hannes Reinecke h...@suse.com Cheers, Hannes -- Dr. Hannes ReineckezSeries Storage h...@suse.de +49 911 74053 688 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton HRB 21284 (AG Nürnberg) -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 3/8] qla2xxx: delay plogi/prli ack until existing sessions are deleted
On 07/14/2015 10:00 PM, Himanshu Madhani wrote: From: Alexei Potashnik ale...@purestorage.com - keep qla_tgt_sess object on the session list until it's freed - modify use of sess-deleted flag to differentiate delayed session deletion that can be cancelled from irreversible one: QLA_SESS_DELETION_PENDING vs QLA_SESS_DELETION_IN_PROGRESS - during IN_PROGRESS deletion all newly arrived commands and TMRs will be rejected, existing commands and TMRs will be terminated when given by the core to the fabric or simply dropped if session logout has already happened (logout terminates all existing exchanges) - new PLOGI will initiate deletion of the following sessions (unless deletion is already IN_PROGRESS): - with the same port_name (with logout) - different port_name, different loop_id but the same port_id (with logout) - different port_name, different port_id, but the same loop_id (without logout) - additionally each new PLOGI will store imm notify iocb in the same port_name session being deleted. When deletion process completes this iocb will be acked. Only the most recent PLOGI iocb is stored. The older ones will be terminated when replaced. - new PRLI will initiate deletion of the following sessions (unless deletion is already IN_PROGRESS): - different port_name, different port_id, but the same loop_id (without logout) Cc: sta...@vger.kernel.org Signed-off-by: Alexei Potashnik ale...@purestorage.com Acked-by: Quinn Tran quinn.t...@qlogic.com Signed-off-by: Himanshu Madhani himanshu.madh...@qlogic.com --- Reviewed-by: Hannes Reinecke h...@suse.com Cheers, Hannes -- Dr. Hannes ReineckezSeries Storage h...@suse.de +49 911 74053 688 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton HRB 21284 (AG Nürnberg) -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 8/8] qla2xxx: terminate exchange when command is aborted by LIO
On 07/14/2015 10:00 PM, Himanshu Madhani wrote: From: Alexei Potashnik ale...@purestorage.com The newly introduced aborted_task TFO callback has to terminate exchange with QLogic driver, since command is being deleted and no status will be queued to the driver at a later point. This patch also moves the burden of releasing one cmd refcount to the aborted_task handler. Changed iSCSI aborted_task logic to satisfy the above requirement. Cc: sta...@vger.kernel.org Signed-off-by: Alexei Potashnik ale...@purestorage.com Acked-by: Quinn Tran quinn.t...@qlogic.com Signed-off-by: Himanshu Madhani himanshu.madh...@qlogic.com --- Reviewed-by: Hannes Reinecke h...@suse.com Cheers, Hannes -- Dr. Hannes ReineckezSeries Storage h...@suse.de +49 911 74053 688 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton HRB 21284 (AG Nürnberg) -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3] [SCSI] mpt2sas, mpt3sas: Abort initialization if no memory I/O resources detected
On 07/15/2015 06:49 AM, Sreekanth Reddy wrote: Driver crashes if the BIOS do not set up at least one memory I/O resource. This failure can happen if the device is too slow to respond during POST and is missed by the BIOS, but Linux then detects the device later in the boot process. Changes in v3: Rearranged the code to remove the code redundancy Signed-off-by: Sreekanth Reddy sreekanth.re...@avagotech.com --- Reviewed-by: Hannes Reinecke h...@suse.com Cheers, Hannes -- Dr. Hannes ReineckezSeries Storage h...@suse.de +49 911 74053 688 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton HRB 21284 (AG Nürnberg) -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] MAINTAINERS: Update LSILOGIC MPT FUSION DRIVERS (FC/SAS/SPI) maintainers list
On 07/15/2015 06:50 AM, Sreekanth Reddy wrote: Updating maintainers list for the entry LSILOGIC MPT FUSION DRIVERS in MAINTAINERS file Signed-off-by: Sreekanth Reddy sreekanth.re...@avagotech.com --- MAINTAINERS | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/MAINTAINERS b/MAINTAINERS index 2d3d55c..c3cd5c9 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -6348,10 +6348,9 @@ S: Maintained F: arch/arm/mach-lpc32xx/ LSILOGIC MPT FUSION DRIVERS (FC/SAS/SPI) -M: Nagalakshmi Nandigama nagalakshmi.nandig...@avagotech.com -M: Praveen Krishnamoorthy praveen.krishnamoor...@avagotech.com +M: Sathya Prakash sathya.prak...@avagotech.com M: Sreekanth Reddy sreekanth.re...@avagotech.com -M: Abhijit Mahajan abhijit.maha...@avagotech.com +M: Chaitra Basappa chaitra.basa...@avagotech.com L: mpt-fusionlinux@avagotech.com L: linux-scsi@vger.kernel.org W: http://www.lsilogic.com/support Reviewed-by: Hannes Reinecke h...@suse.com Cheers, Hannes -- Dr. Hannes ReineckezSeries Storage h...@suse.de +49 911 74053 688 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton HRB 21284 (AG Nürnberg) -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 00/52] atp870u: Major rework
On 07/26/2015 10:24 PM, Ondrej Zary wrote: This patch series removes the first level of crap from atp870u - the worst (in-tree) SCSI driver [1] and then cleans up the code a bit. First the driver is converted to C from something that looks like a (bad) assembler code: use of tmport for all I/O accesses and gotos all over the code. Then I/O access wrappers are introduced. Finally, is870, is880 and is885 are unified and two of them removed. After that, some cleanups are done. There's still a lot of work left to do. Tested on AEC-6710D (ATP870IU-C). [1] http://www.spinics.net/lists/linux-scsi/msg81142.html atp870u.c | 3707 +- atp870u.h |4 2 files changed, 1064 insertions(+), 2647 deletions(-) Partially (up to patch 34) Acked-by: Christoph Hellwig h...@lst.de For this patchset: Reviewed-by: Hannes Reinecke h...@suse.com Cheers, Hannes -- Dr. Hannes ReineckezSeries Storage h...@suse.de +49 911 74053 688 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton HRB 21284 (AG Nürnberg) -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html