Re: [LSF/MM TOPIC] Reducing the SRP initiator failover time

2013-02-08 Thread Sagi Grimberg

On 2/8/2013 12:42 AM, Vu Pham wrote:





It is known that it takes about two to three minutes before the 
upstream SRP initiator fails over from a failed path to a working 
path. This is not only considered longer than acceptable but is also 
longer than other Linux SCSI initiators (e.g. iSCSI and FC). Progress 
so far with improving the fail-over SRP initiator has been slow. This 
is because the discussion about candidate patches occurred at two 
different levels: not only the patches itself were discussed but also 
the approach that should be followed. That last aspect is easier to 
discuss in a meeting than over a mailing list. Hence the proposal to 
discuss SRP initiator failover behavior during the LSF/MM summit. The 
topics that need further discussion are:

* If a path fails, remove the entire SCSI host or preserve the SCSI
  host and only remove the SCSI devices associated with that host ?
* Which software component should test the state of a path and should
  reconnect to an SRP target if a path is restored ? Should that be
  done by the user space process srp_daemon or by the SRP initiator
  kernel module ?
* How should the SRP initiator behave after a path failure has been
  detected ? Should the behavior be similar to the FC initiator with
  its fast_io_fail_tmo and dev_loss_tmo parameters ?

Dave, if this topic gets accepted, I really hope you will be able to 
attend the LSF/MM summit.


Bart.


Hello Bart,

Thank you for taking the initiative.
Mellanox think that this should be discussed. We'd be happy to attend.

We also would like to discuss:
* How and how fast does SRP detect a path failure besides RC error?
* Role of srp_daemon, how often srp_daemon scan fabric for new/old 
targets, how-to scale srp_daemon discovery, traps.


-vu

Hey Bart,

I agree with Vu that this issue should be discussed. We'd be happy to 
attend.


--
Sagi
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [LSF/MM TOPIC][ATTEND] protection information and userspace

2013-02-08 Thread Joel Becker
On Thu, Feb 07, 2013 at 02:12:57PM -0500, Martin K. Petersen wrote:
  Joel == Joel Becker jl...@evilplan.org writes:
 
 Joel I'm happy to chat about it.  Unfortunately, like Darrick says,
 Joel sys_dio() coding hasn't happened.  I do think we're better off
 Joel with some kind of explicit API than some magic state on the file.
 Joel I mean, even something like:
 
 Joel ssize_t write_with_pi(int fd, const void *buf, size_t count,
 Joel   const void *pi, size_t pi_count);
 
 Joel It's not as nice as a non-historical API (eg sys_dio), but it also
 Joel probably plays nicer with buffered I/O.
 
 Pretty much everyone I have talked to that are interested in explicitly
 attaching PI (as opposed to relying on the kernel doing it) are using
 Linux aio.
 
 I am not opposed to having more read()/write() like interface as
 well. But I think it's important to cater to the I/O paradigm used by
 the applications interested in this. It's a lot easier to tweak a few
 IOCB fields than it is to rewrite how an application does I/O.

You know I'm not going to argue with this.  I was merely stating that
I'm flexible in how we start :-)

Joel

 
 -- 
 Martin K. PetersenOracle Linux Engineering
 --
 To unsubscribe from this list: send the line unsubscribe linux-scsi in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html

-- 

Depend on the rabbit's foot if you will, but remember, it didn't
 help the rabbit.
- R. E. Shay

http://www.jlbec.org/
jl...@evilplan.org
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[LSF/MM TOPIC][ATTEND] protection information batched I/O interfaces.

2013-02-08 Thread Joel Becker
I'm definitely interested in attending to discuss PI injection from
userspace, batched I/O interfaces, and potential O_DIRECT cleanups.

Joel

--
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [LSF/MM TOPIC][ATTEND] protection information and userspace

2013-02-08 Thread Joel Becker
On Thu, Feb 07, 2013 at 04:04:36PM -0500, J. Bruce Fields wrote:
 On Thu, Feb 07, 2013 at 09:36:39AM -0800, Joel Becker wrote:
  Dear LSF committee,
  I'd like to explicitly request attendance for this discussion
  :-)
 
 http://marc.info/?l=linux-fsdevelm=135894412908342w=2
 
   Also, the way I compile the list of requests is from thread
   heads ...  that means don't send your attendee request as a
   reply to something else either otherwise it might get missed.

Ack.  Send as such.

Thanks,
Joel

 
 --b.
 
  
  Joel
  
  On Thu, Feb 07, 2013 at 09:27:35AM -0800, Zach Brown wrote:
   On Thu, Feb 07, 2013 at 11:19:59AM -0500, Jeff Moyer wrote:
Boaz Harrosh bharr...@panasas.com writes:
 
 For aio we just need to add additional fields to an existing 
 structure.
 
 So yeah, I'd be interested in that discussion as well.

Sure, it's easy to start there, but then you eventually end up having to
add a non-aio interface as well.  Let's not take the latter off the
table.
   
   I agree that a sync variant should't be ignored, but needing a sync
   interface with PI arguments also shouldn't get in the way of adding
   support to the aio+dio path.  Simply because it's what people use :/.
   
I'm not sure how that's directly related to aio, but ok.  If we're going
to rewrite the aio code, I think Zach's acall would be a good start, at
least on the API front:
  http://lwn.net/Articles/316806/
   
   Yeah, I'm happy to chat about this stuff if people are interested.  I
   think I'd do things differently today than what was done in that aged
   acall prototype.
   
   - z
   --
   To unsubscribe from this list: send the line unsubscribe linux-scsi in
   the body of a message to majord...@vger.kernel.org
   More majordomo info at  http://vger.kernel.org/majordomo-info.html
  
  -- 
  
  You can get more with a kind word and a gun than you can with
   a kind word alone.
   - Al Capone
  
  http://www.jlbec.org/
  jl...@evilplan.org
  --
  To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
  the body of a message to majord...@vger.kernel.org
  More majordomo info at  http://vger.kernel.org/majordomo-info.html
 --
 To unsubscribe from this list: send the line unsubscribe linux-scsi in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html

-- 

You look in her eyes, the music begins to play.
 Hopeless romantics, here we go again.

http://www.jlbec.org/
jl...@evilplan.org
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [LSF/MM TOPIC] Reducing the SRP initiator failover time

2013-02-08 Thread Sebastian Riemer
On 08.02.2013 10:24, Sagi Grimberg wrote:
 On 2/8/2013 12:42 AM, Vu Pham wrote:
 Hello Bart,

 Thank you for taking the initiative.
 Mellanox think that this should be discussed. We'd be happy to attend.

 We also would like to discuss:
 * How and how fast does SRP detect a path failure besides RC error?
 * Role of srp_daemon, how often srp_daemon scan fabric for new/old
 targets, how-to scale srp_daemon discovery, traps.

 -vu
 Hey Bart,
 
 I agree with Vu that this issue should be discussed. We'd be happy to
 attend.
 
 -- 
 Sagi

Wow, also thanks to Mellanox for spending resources on SRP as well! Last
year in June we came across a very different situation.

Cheers,
Sebastian and the ProfitBricks storage team
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/3] target: Fix sense data for out-of-bounds IO operations

2013-02-08 Thread Roland Dreier
From: Roland Dreier rol...@purestorage.com

We're supposed to return LOGICAL BLOCK ADDRESS OUT OF RANGE, not
INVALID FIELD IN CDB.

Signed-off-by: Roland Dreier rol...@purestorage.com
---
 drivers/target/target_core_sbc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/target/target_core_sbc.c b/drivers/target/target_core_sbc.c
index a664c66..170f1f7 100644
--- a/drivers/target/target_core_sbc.c
+++ b/drivers/target/target_core_sbc.c
@@ -486,7 +486,7 @@ sbc_parse_cdb(struct se_cmd *cmd, struct sbc_ops *ops)
 */
if (cmd-t_task_lba || sectors) {
if (sbc_check_valid_sectors(cmd)  0)
-   return TCM_INVALID_CDB_FIELD;
+   return TCM_ADDRESS_OUT_OF_RANGE;
}
cmd-execute_cmd = ops-execute_sync_cache;
break;
-- 
1.8.1.2

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/3] target: Fix error checking for UNMAP commands

2013-02-08 Thread Roland Dreier
From: Roland Dreier rol...@purestorage.com

SBC-3 (revision 35) says:

The PARAMETER LIST LENGTH field specifies the length in bytes of the
UNMAP parameter list that is available to be transferred from the
Data-Out Buffer. If the parameter list length is greater than zero
and less than 0008h (i.e., eight), then the device server shall
terminate the command with CHECK CONDITION status with the sense key
set to ILLEGAL REQUEST and the additional sense code set to
PARAMETER LIST LENGTH ERROR. A PARAMETER LIST LENGTH set to zero
specifies that no data shall be sent.

so our sense code for too-short descriptors was wrong, and we were
incorrectly failing commands that didn't transfer any descriptors.

While we're at it, also handle the UNMAP check:

If the ANCHOR bit is set to one, and the ANC_SUP bit in the Logical
Block Provisioning VPD page (see 6.6.4) is set to zero, then the
device server shall terminate the command with CHECK CONDITION
status with the sense key set to ILLEGAL REQUEST and the additional
sense code set to INVALID FIELD IN CDB.

Signed-off-by: Roland Dreier rol...@purestorage.com
---
 drivers/target/target_core_iblock.c| 11 ++-
 drivers/target/target_core_transport.c | 10 ++
 include/target/target_core_base.h  |  1 +
 3 files changed, 21 insertions(+), 1 deletion(-)

diff --git a/drivers/target/target_core_iblock.c 
b/drivers/target/target_core_iblock.c
index b526d23..b72ca5b 100644
--- a/drivers/target/target_core_iblock.c
+++ b/drivers/target/target_core_iblock.c
@@ -390,10 +390,19 @@ iblock_execute_unmap(struct se_cmd *cmd)
sense_reason_t ret = 0;
int dl, bd_dl, err;
 
+   /* We never set ANC_SUP */
+   if (cdb[1])
+   return TCM_INVALID_CDB_FIELD;
+
+   if (cmd-data_length == 0) {
+   target_complete_cmd(cmd, SAM_STAT_GOOD);
+   return 0;
+   }
+
if (cmd-data_length  8) {
pr_warn(UNMAP parameter list length %u too small\n,
cmd-data_length);
-   return TCM_INVALID_PARAMETER_LIST;
+   return TCM_PARAMETER_LIST_LENGTH_ERROR;
}
 
buf = transport_kmap_data_sg(cmd);
diff --git a/drivers/target/target_core_transport.c 
b/drivers/target/target_core_transport.c
index bd587b7..82c4204 100644
--- a/drivers/target/target_core_transport.c
+++ b/drivers/target/target_core_transport.c
@@ -1514,6 +1514,7 @@ void transport_generic_request_failure(struct se_cmd *cmd,
case TCM_UNSUPPORTED_SCSI_OPCODE:
case TCM_INVALID_CDB_FIELD:
case TCM_INVALID_PARAMETER_LIST:
+   case TCM_PARAMETER_LIST_LENGTH_ERROR:
case TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE:
case TCM_UNKNOWN_MODE_PAGE:
case TCM_WRITE_PROTECTED:
@@ -2674,6 +2675,15 @@ transport_send_check_condition_and_sense(struct se_cmd 
*cmd,
/* INVALID FIELD IN PARAMETER LIST */
buffer[SPC_ASC_KEY_OFFSET] = 0x26;
break;
+   case TCM_PARAMETER_LIST_LENGTH_ERROR:
+   /* CURRENT ERROR */
+   buffer[0] = 0x70;
+   buffer[SPC_ADD_SENSE_LEN_OFFSET] = 10;
+   /* ILLEGAL REQUEST */
+   buffer[SPC_SENSE_KEY_OFFSET] = ILLEGAL_REQUEST;
+   /* INVALID FIELD IN PARAMETER LIST */
+   buffer[SPC_ASC_KEY_OFFSET] = 0x1a;
+   break;
case TCM_UNEXPECTED_UNSOLICITED_DATA:
/* CURRENT ERROR */
buffer[0] = 0x70;
diff --git a/include/target/target_core_base.h 
b/include/target/target_core_base.h
index 4fa0f10..37e3baa 100644
--- a/include/target/target_core_base.h
+++ b/include/target/target_core_base.h
@@ -193,6 +193,7 @@ enum tcm_sense_reason_table {
TCM_RESERVATION_CONFLICT= R(0x10),
TCM_ADDRESS_OUT_OF_RANGE= R(0x11),
TCM_OUT_OF_RESOURCES= R(0x12),
+   TCM_PARAMETER_LIST_LENGTH_ERROR = R(0x13),
 #undef R
 };
 
-- 
1.8.1.2

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 3/3] target: Fix parameter list length checking in MODE SELECT

2013-02-08 Thread Roland Dreier
From: Roland Dreier rol...@purestorage.com

An empty parameter list (length == 0) is not an error, so succeed MODE
SELECT in this case.  If the parameter list length is too small,
return the correct sense code of PARAMETER LIST LENGTH ERROR.

Signed-off-by: Roland Dreier rol...@purestorage.com
---
 drivers/target/target_core_spc.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/drivers/target/target_core_spc.c b/drivers/target/target_core_spc.c
index 2d88f08..73c5d53 100644
--- a/drivers/target/target_core_spc.c
+++ b/drivers/target/target_core_spc.c
@@ -983,6 +983,14 @@ static sense_reason_t spc_emulate_modeselect(struct se_cmd 
*cmd)
int ret = 0;
int i;
 
+   if (!cmd-data_length) {
+   target_complete_cmd(cmd, GOOD);
+   return 0;
+   }
+
+   if (cmd-data_length  off + 2)
+   return TCM_PARAMETER_LIST_LENGTH_ERROR;
+
buf = transport_kmap_data_sg(cmd);
if (!buf)
return TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE;
@@ -1007,6 +1015,11 @@ static sense_reason_t spc_emulate_modeselect(struct 
se_cmd *cmd)
goto out;
 
 check_contents:
+   if (cmd-data_length  off + length) {
+   ret = TCM_PARAMETER_LIST_LENGTH_ERROR;
+   goto out;
+   }
+
if (memcmp(buf + off, tbuf, length))
ret = TCM_INVALID_PARAMETER_LIST;
 
-- 
1.8.1.2

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v8 0/10] More device removal fixes

2013-02-08 Thread Joe Lawrence
On Thu, 7 Feb 2013, Bart Van Assche wrote:

 On 02/06/13 23:31, Joe Lawrence wrote:
  crash list scsi_device.siblings -H 0x8808513a4290 -s scsi_device
  
  880851232520
  struct scsi_device {
 is_visible = 0x1,
 sdev_state = SDEV_DEL,
  }
  880851235388
  struct scsi_device {
 is_visible = 0x1,
 sdev_state = SDEV_DEL,
  }
 
 This is interesting. This probably means that one or more threads got stuck in
 __scsi_remove_device(). If you still have the crash dump available it would be
 appreciated if you could verify whether this is correct. If so, there might be
 an issue in the mpt2sas driver where scsi_done() does not get invoked for all
 outstanding commands after a surprise removal.

Hi Bart,

I haven't had time to rerun the test without the two patches that wait in 
scsi_remove_host(), however I did rerun the test and verify the same 
behavior as in my earlier mail.  I didn't see any __scsi_remove_device() 
instances running.

Some more investigation revealed that MD RAID was holding a reference to 
the removed device.  (In short, mdadm --remove had failed and left the 
device as a faulty member of the array.)  When I did finally manage to 
kick that disk from the MD device, scsi host/device removal continued to 
completion as expected.  

There's a bit more context to the MD situation that I'll post to the raid 
list once I get the details together for Neil.  I will CC you if you are 
interested in following.

Regards,

-- Joe
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html