from:"Laurence Oberman"

Re: [dm-devel] [PATCH] scsi-dh-emc: fix activate vs set_params race

2013-04-04 Thread Laurence Oberman

I can test it; I have a CLARiiON CX3.
I will get to it next week, as I am traveling tomorrow.
Laurence

Sent from my iPhone

On Apr 4, 2013, at 7:11 PM, Mike Christie micha...@cs.wisc.edu wrote:

 On 04/02/2013 07:09 PM, Mikulas Patocka wrote:
 Hi
 
 This fixes a possible race in scsi_dh_emc. It is untested because I don't 
 have the hardware. It could happen when we reload a multipath device and 
 path failure happens at the same time.
 
 
 I think this patch is OK. I do not have the hardware to test it anymore.
 
 If you want to test it just to make sure it is safe, you should bug Rob
 Evers. He can help you find a machine in the Westford lab that has it.
 --
 To unsubscribe from this list: send the line unsubscribe linux-scsi in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html

Debug flag parameter for SCSI tape driver

2014-06-10 Thread Laurence Oberman

Hello

I am tired of building modules to enable SCSI tape driver debug so I am hoping 
this patch is acceptable.
Tested using kernel 3.14.6

Usage example:
modprobe st debug_flag=1

diff -Nur a/st.c b/st.c
--- a/st.c  2014-06-10 16:45:18.522354105 -0400
+++ b/st.c  2014-06-10 16:45:33.953765908 -0400
@@ -80,6 +80,7 @@
 static int try_direct_io = TRY_DIRECT_IO;
 static int try_rdio = 1;
 static int try_wdio = 1;
+static int debug_flag = 0;
 
 static struct class st_sysfs_class;
 static const struct attribute_group *st_dev_groups[];
@@ -100,6 +101,9 @@
 MODULE_PARM_DESC(max_sg_segs, "Maximum number of scatter/gather segments to use (256)");
 module_param_named(try_direct_io, try_direct_io, int, 0);
 MODULE_PARM_DESC(try_direct_io, "Try direct I/O between user buffer and tape drive (1)");
+module_param_named(debug_flag, debug_flag, int, 0);
+MODULE_PARM_DESC(debug_flag, "Enable DEBUG, same as setting DEBUG 1 in source");
+
 
 /* Extra parameters for testing */
 module_param_named(try_rdio, try_rdio, int, 0);
@@ -124,6 +128,9 @@
 	},
 	{
 		"try_direct_io", &try_direct_io
+	},
+	{
+		"debug_flag", &debug_flag
 	}
 };
 #endif
@@ -4277,7 +4284,9 @@
 static int __init init_st(void)
 {
 	int err;
-
+	debugging = (debug_flag > 0) ? debug_flag : DEBUG;
+	if (debugging)
+		printk(KERN_INFO "st: Debugging enabled debug_flag = %d\n", debugging);
 	validate_options();
 
 	printk(KERN_INFO "st: Version %s, fixed bufsize %d, s/g segs %d\n",

Thanks
Laurence

[PATCH]: add debug flag parameter for SCSI tape driver

2014-06-10 Thread Laurence Oberman

Hello

Take 2 of this patch, with the module description and subject line changed.

This patch adds a debug_flag parameter that can be set at module load time and 
enables the DEBUG facility without recompiling the module.
Usage: modprobe st debug_flag=1

Signed-off-by: Laurence Oberman lober...@redhat.com

diff -Nur a/st.c b/st.c
--- a/st.c  2014-06-10 16:45:18.522354105 -0400
+++ b/st.c  2014-06-10 19:40:39.774387990 -0400
@@ -80,6 +80,7 @@
 static int try_direct_io = TRY_DIRECT_IO;
 static int try_rdio = 1;
 static int try_wdio = 1;
+static int debug_flag = 0;
 
 static struct class st_sysfs_class;
 static const struct attribute_group *st_dev_groups[];
@@ -100,6 +101,9 @@
 MODULE_PARM_DESC(max_sg_segs, "Maximum number of scatter/gather segments to use (256)");
 module_param_named(try_direct_io, try_direct_io, int, 0);
 MODULE_PARM_DESC(try_direct_io, "Try direct I/O between user buffer and tape drive (1)");
+module_param_named(debug_flag, debug_flag, int, 0);
+MODULE_PARM_DESC(debug_flag, "Enable DEBUG, same as setting debugging=1");
+
 
 /* Extra parameters for testing */
 module_param_named(try_rdio, try_rdio, int, 0);
@@ -124,6 +128,9 @@
 	},
 	{
 		"try_direct_io", &try_direct_io
+	},
+	{
+		"debug_flag", &debug_flag
 	}
 };
 #endif
@@ -4277,7 +4284,9 @@
 static int __init init_st(void)
 {
 	int err;
-
+	debugging = (debug_flag > 0) ? debug_flag : DEBUG;
+	if (debugging)
+		printk(KERN_INFO "st: Debugging enabled debug_flag = %d\n", debugging);
 	validate_options();
 
 	printk(KERN_INFO "st: Version %s, fixed bufsize %d, s/g segs %d\n",


Thanks
Laurence
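
For readers unfamiliar with the name/pointer table that the third hunk extends, here is a minimal userspace sketch of the idea: each entry pairs an option name with the address of the variable it controls, so a parser can set options by name. The struct name parm_sketch and the helpers set_option() and get_debug_flag() are hypothetical and are not taken from st.c; how the real driver consumes its table is not shown in the patch and is assumed here.

```c
#include <stddef.h>
#include <string.h>

/* Option variables, named after the ones that appear in the patch. */
static int try_direct_io = 1;
static int debug_flag = 0;

/* Hypothetical table entry: option name plus the variable it controls. */
struct parm_sketch {
	const char *name;
	int *val;
};

static struct parm_sketch parms[] = {
	{ "try_direct_io", &try_direct_io },
	{ "debug_flag", &debug_flag },
};

/* Set the option called name to value; returns 0 on success, -1 if unknown. */
int set_option(const char *name, int value)
{
	for (size_t i = 0; i < sizeof(parms) / sizeof(parms[0]); i++) {
		if (strcmp(parms[i].name, name) == 0) {
			*parms[i].val = value;
			return 0;
		}
	}
	return -1;
}

/* Accessor so callers can observe the effect of set_option(). */
int get_debug_flag(void)
{
	return debug_flag;
}
```

With this layout, adding a new option only needs a new variable and one more table entry, which is what the patch does for debug_flag.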

Re: [PATCH]: add debug flag parameter for SCSI tape driver

2014-06-11 Thread Laurence Oberman

Kai,
Thank you for considering this.

With #define DEBUG 0
we still include

#define DEB(a)
#define DEBC(a)

With debug_flag, we then provide the debug output I am looking for at module 
load time.
But I agree that it enables it for all devices, and that may not be optimal.
I don't change the default; I just allow the parameter to control it.

In the last few issues I have worked on, I have had to recompile and provide 
the st module to capture what I needed for debugging, so I decided to try 
submitting the patch.

Thank You
Laurence
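
A small userspace sketch may help show how a compile-time switch and a runtime flag interact in the DEB(a)/DEBC(a) pattern quoted above. The empty expansions are the ones from the message; the non-empty expansions under "#if DEBUG" are an assumption about how the driver defines them, and trace_event(), emit_with_flag() and messages_emitted are hypothetical names used only for illustration.

```c
#include <stdio.h>

/* Compile-time switch. Set it to 0 and both macros expand to nothing,
 * so no runtime flag can bring the debug code back. */
#define DEBUG 1

/* Runtime switch; stand-in for the driver's "debugging" variable. */
int debugging = 0;

/* Counts emitted messages so the effect is observable (illustration only). */
int messages_emitted = 0;

#if DEBUG
#define DEB(a) a
#define DEBC(a) if (debugging) { a ; }
#else
#define DEB(a)
#define DEBC(a)
#endif

/* Hypothetical helper: prints one line only if both switches allow it. */
void trace_event(const char *what)
{
	DEBC(printf("st-sketch: %s\n", what); messages_emitted++)
}

/* Sets the runtime flag, traces once, and reports how many lines came out. */
int emit_with_flag(int flag, const char *what)
{
	int before = messages_emitted;

	debugging = flag;
	trace_event(what);
	return messages_emitted - before;
}
```

With DEBUG set to 1, emit_with_flag(0, ...) prints nothing and emit_with_flag(1, ...) prints one line. With DEBUG set to 0, neither call prints anything, which is why a load-time flag alone cannot help when the debug code is compiled out.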

- Original Message -
From: Kai Mäkisara (Kolumbus) kai.makis...@kolumbus.fi
To: Laurence Oberman lober...@redhat.com
Cc: linux-scsi@vger.kernel.org
Sent: Wednesday, June 11, 2014 2:03:15 PM
Subject: Re: [PATCH]: add debug flag parameter for SCSI tape driver


On 11.6.2014, at 2.48, Laurence Oberman lober...@redhat.com wrote:

 Hello
 
 Take 2 of this patch, changed module description and subject line.
 
 This patch adds a debug_flag parameter that can be set on module load, and 
 allows the DEBUG facility without a module recompile.
 Usage: modprobe st debug_flag=1
 
 Signed-off-by: Laurence Oberman lober...@redhat.com
 

What is wrong with the existing methods to control debugging? You can enable 
and disable debugging for each device with ioctl() (as described in the driver 
documentation). You can use mt-st to do this from the command line.

Your patch just allows one to change the default for all devices. The real 
problem may be that the distributions don’t compile the debugging code into the 
drivers, but your patch does not change this.

Kai


Re: [PATCH]: add debug flag parameter for SCSI tape driver

2014-06-11 Thread Laurence Oberman

Kai,

It's likely not worth doing this; I cross-checked and indeed many distros have 
this compiled out.
So let's leave it as is.

Thanks
Laurence

Re: [PATCH]: add debug flag parameter for SCSI tape driver - 2nd request

2014-10-17 Thread Laurence Oberman

Hello Kai

You have seen this patch before. The first time around, given that we don't 
enable DEBUG by default, I let it go.
However, we have been looking into defining DEBUG 1 by default here at Red Hat 
and then setting the runtime default to disabled.

Are you open to considering changing the driver based on this patch,
i.e. defaulting to #define DEBUG 1 and adding this code with debugging off by default?

Note that with DEBUG 0, as you know, you need to change that and recompile. 
That is exactly what I am trying to avoid with Enterprise customers.

This patch adds a debug_flag parameter that can be set on module load, and 
allows the DEBUG facility without a module recompile.
Note that now DEBUG 1 is the default with this patch.

Usage: modprobe st debug_flag=1

Signed-off-by: Laurence Oberman lober...@redhat.com

diff -Nur a/st.c b/st.c
--- a/st.c  2014-10-17 16:15:54.103810627 -0400
+++ b/st.c  2014-10-17 16:22:12.303810392 -0400
@@ -56,7 +56,7 @@
 
 /* The driver prints some debugging information on the console if DEBUG
    is defined and non-zero. */
-#define DEBUG 0
+#define DEBUG 1
 
 #define ST_DEB_MSG  KERN_NOTICE
 #if DEBUG
@@ -80,6 +80,7 @@
 static int try_direct_io = TRY_DIRECT_IO;
 static int try_rdio = 1;
 static int try_wdio = 1;
+static int debug_flag = 0;
 
 static struct class st_sysfs_class;
 static const struct attribute_group *st_dev_groups[];
@@ -100,6 +101,9 @@
 MODULE_PARM_DESC(max_sg_segs, "Maximum number of scatter/gather segments to use (256)");
 module_param_named(try_direct_io, try_direct_io, int, 0);
 MODULE_PARM_DESC(try_direct_io, "Try direct I/O between user buffer and tape drive (1)");
+module_param_named(debug_flag, debug_flag, int, 0);
+MODULE_PARM_DESC(debug_flag, "Enable DEBUG, same as setting debugging=1");
+
 
 /* Extra parameters for testing */
 module_param_named(try_rdio, try_rdio, int, 0);
@@ -124,6 +128,9 @@
 	},
 	{
 		"try_direct_io", &try_direct_io
+	},
+	{
+		"debug_flag", &debug_flag
 	}
 };
 #endif
@@ -4306,6 +4313,11 @@
 
 	validate_options();
 
+	debugging = (debug_flag > 0) ? debug_flag : DEBUG;
+	if (debugging)
+		printk(KERN_INFO "st: Debugging enabled debug_flag = %d\n", debugging);
+
+
 	printk(KERN_INFO "st: Version %s, fixed bufsize %d, s/g segs %d\n",
 		verstr, st_fixed_buffer_size, st_max_sg_segs);

Re: [PATCH]: add debug flag parameter for SCSI tape driver - 2nd request

2014-10-17 Thread Laurence Oberman

Oops, patch was defaulting to 1.

Here is v2 properly defining DEBUG 1 and defaulting to 0 unless debug_flag=1

This patch adds a debug_flag parameter that can be set on module load, and 
allows the DEBUG facility without a module recompile.
Note that now DEBUG 1 is the default with this patch.

Usage: modprobe st 

Signed-off-by: Laurence Oberman lober...@redhat.com


diff -Nur a/st.c b/st.c
--- a/st.c  2014-10-17 16:15:54.103810627 -0400
+++ b/st.c  2014-10-17 16:42:45.992809531 -0400
@@ -56,7 +56,8 @@
 
 /* The driver prints some debugging information on the console if DEBUG
    is defined and non-zero. */
-#define DEBUG 0
+#define DEBUG 1
+#define NO_DEBUG 0
 
 #define ST_DEB_MSG  KERN_NOTICE
 #if DEBUG
@@ -80,6 +81,7 @@
 static int try_direct_io = TRY_DIRECT_IO;
 static int try_rdio = 1;
 static int try_wdio = 1;
+static int debug_flag = 0;
 
 static struct class st_sysfs_class;
 static const struct attribute_group *st_dev_groups[];
@@ -100,6 +102,9 @@
 MODULE_PARM_DESC(max_sg_segs, "Maximum number of scatter/gather segments to use (256)");
 module_param_named(try_direct_io, try_direct_io, int, 0);
 MODULE_PARM_DESC(try_direct_io, "Try direct I/O between user buffer and tape drive (1)");
+module_param_named(debug_flag, debug_flag, int, 0);
+MODULE_PARM_DESC(debug_flag, "Enable DEBUG, same as setting debugging=1");
+
 
 /* Extra parameters for testing */
 module_param_named(try_rdio, try_rdio, int, 0);
@@ -124,6 +129,9 @@
 	},
 	{
 		"try_direct_io", &try_direct_io
+	},
+	{
+		"debug_flag", &debug_flag
 	}
 };
 #endif
@@ -4306,6 +4314,11 @@
 
 	validate_options();
 
+	debugging = (debug_flag > 0) ? debug_flag : NO_DEBUG;
+	if (debugging)
+		printk(KERN_INFO "st: Debugging enabled debug_flag = %d\n", debugging);
+
+
 	printk(KERN_INFO "st: Version %s, fixed bufsize %d, s/g segs %d\n",
 		verstr, st_fixed_buffer_size, st_max_sg_segs);
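
The difference between the first attempt and v2 comes down to the fallback value in the conditional expression. The sketch below models both in userspace, assuming the comparison in the patch is "debug_flag > 0"; the function names initial_debugging_v1() and initial_debugging_v2() are hypothetical and exist only for this illustration.

```c
/* Values mirroring the macros in the patch. */
#define DEBUG 1      /* debug code is compiled in */
#define NO_DEBUG 0   /* runtime default: debugging off */

/* First attempt: falls back to DEBUG, so with DEBUG defined as 1
 * debugging ends up enabled even when debug_flag is left at 0. */
int initial_debugging_v1(int debug_flag)
{
	return (debug_flag > 0) ? debug_flag : DEBUG;
}

/* v2: falls back to NO_DEBUG, so debugging stays off unless a
 * positive debug_flag is passed at module load time. */
int initial_debugging_v2(int debug_flag)
{
	return (debug_flag > 0) ? debug_flag : NO_DEBUG;
}
```

In both versions a zero or negative debug_flag selects the fallback, so only the choice of fallback decides whether a plain "modprobe st" starts with debugging on or off.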


Re: [PATCH]: add debug flag parameter for SCSI tape driver - 2nd request

2014-10-19 Thread Laurence Oberman

Hello Kai

Thanks. 

Here is v3

This patch adds a debug_flag parameter that can be set on module load, and 
allows the DEBUG facility without a module recompile.
Note that now DEBUG 1 is the default with this patch.

Usage: modprobe st debug_flag=1

Signed-off-by: Laurence Oberman lober...@redhat.com

diff -Nur a/Documentation/scsi/st.txt b/Documentation/scsi/st.txt
--- a/Documentation/scsi/st.txt 2014-10-19 09:36:52.243863716 -0400
+++ b/Documentation/scsi/st.txt 2014-10-19 09:43:19.222863447 -0400
@@ -506,9 +506,11 @@
 
 DEBUGGING HINTS
 
-To enable debugging messages, edit st.c and #define DEBUG 1. As seen
-above, debugging can be switched off with an ioctl if debugging is
-compiled into the driver. The debugging output is not voluminous.
+Debugging code is now compiled in by default but debugging is turned off with 
+the kernel module parameter debug_flag defaulting to 0.
+Debugging can still be switched on and off with an ioctl.
+To enable debug at module load time add debug_flag=1 to the module load 
+options, the debugging output is not voluminous.
 
 If the tape seems to hang, I would be very interested to hear where
 the driver is waiting. With the command 'ps -l' you can see the state

diff -Nur a/drivers/scsi/st.c b/drivers/scsi/st.c
--- a/drivers/scsi/st.c 2014-10-19 09:35:45.673863756 -0400
+++ b/drivers/scsi/st.c 2014-10-19 09:35:30.621863483 -0400
@@ -56,7 +56,8 @@
 
 /* The driver prints some debugging information on the console if DEBUG
    is defined and non-zero. */
-#define DEBUG 0
+#define DEBUG 1
+#define NO_DEBUG 0
 
 #define ST_DEB_MSG  KERN_NOTICE
 #if DEBUG
@@ -80,6 +81,7 @@
 static int try_direct_io = TRY_DIRECT_IO;
 static int try_rdio = 1;
 static int try_wdio = 1;
+static int debug_flag = 0;
 
 static struct class st_sysfs_class;
 static const struct attribute_group *st_dev_groups[];
@@ -100,6 +102,9 @@
 MODULE_PARM_DESC(max_sg_segs, "Maximum number of scatter/gather segments to use (256)");
 module_param_named(try_direct_io, try_direct_io, int, 0);
 MODULE_PARM_DESC(try_direct_io, "Try direct I/O between user buffer and tape drive (1)");
+module_param_named(debug_flag, debug_flag, int, 0);
+MODULE_PARM_DESC(debug_flag, "Enable DEBUG, same as setting debugging=1");
+
 
 /* Extra parameters for testing */
 module_param_named(try_rdio, try_rdio, int, 0);
@@ -124,6 +129,9 @@
 	},
 	{
 		"try_direct_io", &try_direct_io
+	},
+	{
+		"debug_flag", &debug_flag
 	}
 };
 #endif
@@ -4309,6 +4317,10 @@
 	printk(KERN_INFO "st: Version %s, fixed bufsize %d, s/g segs %d\n",
 		verstr, st_fixed_buffer_size, st_max_sg_segs);
 
+	debugging = (debug_flag > 0) ? debug_flag : NO_DEBUG;
+	if (debugging)
+		printk(KERN_INFO "st: Debugging enabled debug_flag = %d\n", debugging);
+
 	err = class_register(&st_sysfs_class);
 	if (err) {
 		pr_err("Unable register sysfs class for SCSI tapes\n");


- Original Message -
From: Kai Mäkisara (Kolumbus) kai.makis...@kolumbus.fi
To: Laurence Oberman lober...@redhat.com
Cc: Rob Evers rev...@redhat.com, linux-scsi@vger.kernel.org
Sent: Sunday, October 19, 2014 4:54:10 AM
Subject: Re: [PATCH]: add debug flag parameter for SCSI tape driver - 2nd 
request

Hello,

I am responding to this, but noticed your next, fixed version.

 On 17.10.2014, at 23.20, Laurence Oberman lober...@redhat.com wrote:
 
 Hello Kai
 
 You have seen this patch before. The first time around, given that we don't 
 enable DEBUG by default, I let it go.
 However we have been looking into defining DEBUG 1 by default here at Redhat 
 and then setting the default to disabled.
 
 Are you open to considering changing the driver based on this patch.
 i.e. default DEFINE 1 and adding this code with default set to off.
 
Yes. I certainly think defining DEBUG 1 and changing the default to zero should 
be done if it is useful for supporting users. The runtime overhead is 
negligible and the extra code does not matter nowadays (it did matter, at least 
theoretically, years ago).

I am not so sure about the module option. When the debugging code is compiled 
in, debugging can be enabled and disabled for each device by the MTIOCTOP ioctl 
(e.g., mtst -f tape_device stsetoptions debug). The module option enables 
debugging for all tape devices. However, if you think this additional module 
option is useful, I am not against it. It does not remove the possibility for 
controlling debugging for each device for those who want to do it that way.
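
As a rough illustration of the per-device route mentioned here, the sketch below builds an MTIOCTOP request that sets or clears the debugging option, using the MTSETDRVBUFFER operation and the MT_ST_* constants from linux/mtio.h. The helper names make_debug_op() and set_tape_debug() are hypothetical, the device path is whatever tape node the caller passes in, and the call is expected to need root privileges and a driver built with debugging compiled in; treat it as a sketch under those assumptions rather than a tested tool.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/mtio.h>

/* Build the MTIOCTOP argument that sets or clears the debugging boolean. */
struct mtop make_debug_op(int enable)
{
	struct mtop op;

	op.mt_op = MTSETDRVBUFFER;
	op.mt_count = (enable ? MT_ST_SETBOOLEANS : MT_ST_CLEARBOOLEANS) |
		      MT_ST_DEBUGGING;
	return op;
}

/* Apply the request to one tape device node.
 * Returns 0 on success and -1 if the open or the ioctl fails. */
int set_tape_debug(const char *device, int enable)
{
	struct mtop op = make_debug_op(enable);
	int fd = open(device, O_RDONLY | O_NONBLOCK);
	int rc;

	if (fd < 0) {
		perror("open");
		return -1;
	}
	rc = ioctl(fd, MTIOCTOP, &op);
	if (rc < 0)
		perror("MTIOCTOP");
	close(fd);
	return rc < 0 ? -1 : 0;
}
```

A caller would invoke set_tape_debug() once per device it wants to trace, which is the per-device granularity that a single module option cannot offer.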

Anyway, you should modify the documentation (Documentation/scsi/st.txt) 
according to the changes.

 Note that with DEBUG 0, as you know you need to change that and recompile. 
 That is exactly what I am trying to avoid with Enterprise customers.
 
I have also noticed this when someone has asked me about some tape problems.

 This patch adds a debug_flag parameter that can be set on module load, and 
 allows the DEBUG facility without a module

Re: [PATCH] st: implement sysfs based tape statistics

2014-11-20 Thread Laurence Oberman

Hi Shane,
I was actually about to pull this patch and test it.
There are lots of changes and it is a big patch, so I am going to create another
driver as a tape stats driver for now for testing.
I will exercise this fully and provide feedback to the list.
Regards
Laurence Oberman


On Thu, Nov 20, 2014 at 6:49 PM, Seymour, Shane M shane.seym...@hp.com wrote:
 I was wondering if anyone had a chance to review the patch? Comments are 
 appreciated and I'm more than happy to make changes that will allow it to be 
 accepted.

[PATCH qla2xxx] Race in handling rport deletion in Qlogic driver during recovery causes panic

2014-11-25 Thread Laurence Oberman

When an rport disconnects, rport deletion races with re-connection, resulting 
in a panic.
In this path we call fc_remote_port_del() just before the calls that 
re-establish the session with the FC transport, fc_remote_port_add() and then 
fc_remote_port_rolechg().

Removing the call to fc_remote_port_del() before re-establishing the 
connection prevents the race.
This patch has resolved the issue for multiple customers via test kernels.

Suggested by Chad Dupuis, implemented and tested by Laurence Oberman.

Signed-off-by: Laurence Oberman lober...@redhat.com

diff -Nur a/drivers/scsi/qla2xxx/qla_init.c b/drivers/scsi/qla2xxx/qla_init.c
--- a/drivers/scsi/qla2xxx/qla_init.c   2014-10-14 18:07:48.313648535 -0400
+++ b/drivers/scsi/qla2xxx/qla_init.c   2014-11-25 09:08:17.108814261 -0500
@@ -3237,8 +3237,6 @@
 	struct fc_rport *rport;
 	unsigned long flags;
 
-	qla2x00_rport_del(fcport);
-
 	rport_ids.node_name = wwn_to_u64(fcport->node_name);
 	rport_ids.port_name = wwn_to_u64(fcport->port_name);
 	rport_ids.port_id = fcport->d_id.b.domain << 16 |


Supporting traces

qla2xxx :06:00.1: scsi(1:4:0): Abort command issued -- 1 2002.
qla2xxx :06:00.1: scsi(1:4:0): BUS RESET ISSUED.
qla2xxx :06:00.1: qla2xxx_eh_bus_reset: reset succeded
qla2xxx :06:00.1: scsi(1:4:0): Abort command issued -- 1 2002.
qla2xxx :06:00.1: scsi(1:4:0): ADAPTER RESET ISSUED.
qla2xxx :06:00.1: Performing ISP error recovery - ha= 880bd5b55000.
qla2xxx :06:00.1: FW: Loading via request-firmware...
qla2xxx :06:00.1: LOOP UP detected (4 Gbps).
qla2xxx :06:00.1: qla2xxx_eh_host_reset: reset succeded
qla2xxx :09:00.1: scsi(3:3:0): Abort command issued -- 1 2002.
qla2xxx :09:00.1: scsi(3:3:0): Abort command issued -- 1 2002.
qla2xxx :09:00.1: scsi(3:3:0): DEVICE RESET ISSUED.
qla2xxx :09:00.1: scsi(3:3:0): DEVICE RESET SUCCEEDED.
qla2xxx :06:00.1: scsi(1:4:0): Abort command issued -- 1 2002.
scsi 1:0:4:0: Device offlined - not ready after error recovery
..
..
scsi 3:0:2:0: Device offlined - not ready after error recovery
qla2xxx :06:00.1: scsi(1:8:0): Abort command issued -- 1 2002.
qla2xxx :06:00.1: scsi(1:8:0): Abort command issued -- 1 2002.
qla2xxx :06:00.1: scsi(1:8:0): DEVICE RESET ISSUED.
qla2xxx :06:00.1: scsi(1:8:0): DEVICE RESET SUCCEEDED.
qla2xxx :06:00.1: scsi(1:8:0): Abort command issued -- 1 2002.
qla2xxx :06:00.1: scsi(1:8:0): TARGET RESET ISSUED.
qla2xxx :06:00.1: scsi(1:8:0): TARGET RESET SUCCEEDED.
qla2xxx :09:00.1: scsi(3:3:0): Abort command issued -- 1 2002.

BUG: unable to handle kernel NULL pointer dereference at 0058
IP: [8134fa1b] scsi_is_host_device+0xb/0x20
PGD b80681067 PUD b833ca067 PMD 0 
Oops:  [#1] SMP 
last sysfs file: /sys/devices/system/cpu/cpu2/cpufreq/scaling_setspeed
CPU 9 
Modules linked in: nfs fscache xfs ext3 jbd ext2 iptable_mangle iptable_nat 
nf_nat nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 iptable_filter ip_tables 
mptctl mptbase vxodm(P)(U) amf(P)(U) vxfen(P)(U) gab(P)(U) llt(P)(U) nfsd lockd 
nfs_acl auth_rpcgss autofs4 sunrpc dmpjbod(P)(U) dmpap(P)(U) dmpaa(P)(U) 
vxspec(P)(U) vxio(P)(U) vxdmp(P)(U) pcc_cpufreq bonding ipv6 vxportal(P)(U) 
fdd(P)(U) vxfs(P)(U) exportfs emcpvlumd(P)(U) emcpxcrypt(P)(U) emcpdm(P)(U) 
emcpgpx(P)(U) emcpmpx(P)(U) emcp(P)(U) dm_mirror dm_region_hash dm_log hpilo 
hpwdt microcode serio_raw iTCO_wdt iTCO_vendor_support i7core_edac edac_core 
ses enclosure sg power_meter hwmon be2net shpchp ext4 mbcache jbd2 sd_mod 
crc_t10dif hpsa(U) qla2xxx scsi_transport_fc scsi_tgt dm_mod [last unloaded: 
emcpioc]

Modules linked in: nfs fscache xfs ext3 jbd ext2 iptable_mangle iptable_nat 
nf_nat nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 iptable_filter ip_tables 
mptctl mptbase vxodm(P)(U) amf(P)(U) vxfen(P)(U) gab(P)(U) llt(P)(U) nfsd lockd 
nfs_acl auth_rpcgss autofs4 sunrpc dmpjbod(P)(U) dmpap(P)(U) dmpaa(P)(U) 
vxspec(P)(U) vxio(P)(U) vxdmp(P)(U) pcc_cpufreq bonding ipv6 vxportal(P)(U) 
fdd(P)(U) vxfs(P)(U) exportfs emcpvlumd(P)(U) emcpxcrypt(P)(U) emcpdm(P)(U) 
emcpgpx(P)(U) emcpmpx(P)(U) emcp(P)(U) dm_mirror dm_region_hash dm_log hpilo 
hpwdt microcode serio_raw iTCO_wdt iTCO_vendor_support i7core_edac edac_core 
ses enclosure sg power_meter hwmon be2net shpchp ext4 mbcache jbd2 sd_mod 
crc_t10dif hpsa(U) qla2xxx scsi_transport_fc scsi_tgt dm_mod [last unloaded: 
emcpioc]
Pid: 641, comm: qla2xxx_3_dpc Tainted: P   M      
2.6.32-131.26.1.el6.x86_64 #1 ProLiant BL460c G7
RIP: 0010:[8134fa1b]  [8134fa1b] 
scsi_is_host_device+0xb/0x20
RSP: 0018:8817d15d5c80  EFLAGS: 00010246
RAX:  RBX: 880bcf094000 RCX: 5ee0
RDX: 880bd5b37850 RSI: 0297 RDI: 
RBP: 8817d15d5c80 R08: 0006 R09: 880bd5b39210
R10: 8817d15d5d18 R11:  R12: 
R13

Re: [PATCH] st: implement sysfs based tape statistics

2014-11-30 Thread Laurence Oberman

Hello Shane,

So far so good on upstream, built as a new driver.
I need to run some more tests to capture stats and validate the
numbers; so far I have only done functional tests and read the sysfs numbers.
I then need to backport to RHEL6 and RHEL7 kernels, as we have two BZs
out there for this.
I will be working on that this coming week.
I needed to get some real tape hardware ready, so it took a while to
get that staged; I started out using mhvtl and that seemed fine.

Thanks
Laurence

Re: Crash when copying from broken external hdd

2014-11-30 Thread Laurence Oberman

I need more of the stack if you have it; the screenshot is not attached.
Thanks
Laurence

On Sun, Nov 30, 2014 at 6:11 AM, Richard Weinberger
richard.weinber...@gmail.com wrote:
 On Sat, Nov 29, 2014 at 11:52 AM, Simon Danner danner.si...@gmail.com wrote:
 Hello,
 i get the following crash after i try to copy files from a broken
 external hdd to another external hdd.
 It happens after a few minutes, with latest git and 3.17.4 from Arch.
 Attached screenshot is from latest mainline git.

 i hope this can be fixed somehow,
 Regards Simon Danner

 Can you decode scsi_requed_end+0x122?
 CC'ing block and scsi folks.

 --
 Thanks,
 //richard

Re: Crash when copying from broken external hdd

2014-11-30 Thread Laurence Oberman

The BUG_ON is taken because
blk_queued_rq(req) returns true, which means req->queuelist is not
empty, i.e. the request is still linked on a list by the time it is finished.

Can I get the messages file entries (last 100 lines) from just prior to the
panic, not the whole file?
If it's large, attach a .gz and email it to me.
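
To make the condition behind that BUG_ON concrete, here is a minimal userspace model of an embedded list node and a "still queued" test. It assumes blk_queued_rq() amounts to the negation of a list-empty check on the request's queuelist; that is an assumption about the kernel in question, and the types and helpers below (list_node, request_sketch, request_is_queued() and friends) are hypothetical names, not kernel code.

```c
#include <stdbool.h>

/* Circular doubly linked list node, in the style of the kernel's list_head. */
struct list_node {
	struct list_node *next, *prev;
};

/* An initialised node points at itself, which is the "empty" state. */
void list_init(struct list_node *n)
{
	n->next = n;
	n->prev = n;
}

bool list_is_empty(const struct list_node *head)
{
	return head->next == head;
}

/* Link entry right after head. */
void list_add_after(struct list_node *entry, struct list_node *head)
{
	entry->next = head->next;
	entry->prev = head;
	head->next->prev = entry;
	head->next = entry;
}

/* Unlink entry and return it to the self-pointing "empty" state. */
void list_remove(struct list_node *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
	list_init(entry);
}

/* Hypothetical request carrying an embedded list node. */
struct request_sketch {
	struct list_node queuelist;
};

/* True while the request is still linked on some queue. */
bool request_is_queued(const struct request_sketch *rq)
{
	return !list_is_empty(&rq->queuelist);
}
```

In this model, finishing a request while request_is_queued() is still true corresponds to the situation the BUG_ON guards against: the request should have been unlinked before completion.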

Thanks

On Sun, Nov 30, 2014 at 11:42 AM, Richard Weinberger rich...@nod.at wrote:
 Am 30.11.2014 um 17:36 schrieb Simon Danner:
 Hi,
 here the two screenshots i could take, from 3.17.4 and 3.18 git.

 You're hitting BUG_ON(blk_queued_rq(req)); in blk_finish_request()

 Thanks,
 //richard

 Thanks
 Simon

 On Sun, 2014-11-30 at 10:58 -0500, Laurence Oberman wrote:
 I need more of the stack if you have it, the screenshot is not attached.
 Thanks
 Laurence

 On Sun, Nov 30, 2014 at 6:11 AM, Richard Weinberger
 richard.weinber...@gmail.com wrote:
 On Sat, Nov 29, 2014 at 11:52 AM, Simon Danner danner.si...@gmail.com 
 wrote:
 Hello,
 i get the following crash after i try to copy files from a broken
 external hdd to another external hdd.
 It happens after a few minutes, with latest git and 3.17.4 from Arch.
 Attached screenshot is from latest mainline git.

 i hope this can be fixed somehow,
 Regards Simon Danner

 Can you decode scsi_requed_end+0x122?
 CC'ing block and scsi folks.

 --
 Thanks,
 //richard

Re: Drivers: scsi: FLUSH timeout

2013-09-20 Thread Laurence Oberman

I am thinking Srini meant in the sd_mod driver module.
#define SD_FLUSH_TIMEOUT (60 * HZ)

Laurence


On Fri, Sep 20, 2013 at 4:32 PM, Greg KH gre...@linuxfoundation.org wrote:
 On Fri, Sep 20, 2013 at 12:32:27PM -0700, K. Y. Srinivasan wrote:
 The SD_FLUSH_TIMEOUT value is currently hardcoded.

 Hardcoded where?  Please, more context.

 On our cloud, we sometimes hit this timeout. I was wondering if we
 could make this a module parameter. If this is acceptable, I can send
 you a patch for this.

 A module parameter doesn't make sense for a per-device value, does it?

 greg k-h

Re: Persistent reservation behaviour/compliance with redundant controllers

2014-01-06 Thread Laurence Oberman

I reached out to a contact at HP and he shared this with me. Not sure if it's 
helpful.

3PAR responds to these commands differently depending on the host OS mode, or 
Persona, that is set for the host OS type in use. The main aspects of this 
question come from how an active/passive controller model would work. On 3PAR, 
however, all controllers (nodes) are equal, so all paths are active. The 3PAR 
implementation of S2R and S3PGR is intended to comply with SPC-3. The scope of 
reservations is limited to a full logical unit; element scope is not supported. 
SCSI-3 reservations allow each host/array path to have a key registered against 
it. Typically a host will register the same key on all of the paths it sees to 
the array, and each host will have its own unique key. Access to the volume can 
then be restricted to those hosts that have registered keys. Should a host be 
determined to have gone rogue, its key can be revoked by any of the still 
active hosts, causing the rogue host to lose access to the volume.
 
They need to register the same key to all paths of the same lun.
 
Once the host has taken appropriate action to become healthy again it can 
register a new key and regain access.
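
As a rough illustration of the key handling described above, here is a minimal
user-space C sketch. It is a toy model, not the 3PAR or SPC-3 implementation:
the pr_table type, the function names, the fixed path count and the use of 0
to mean "no key registered" are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PR_MAX_PATHS 16   /* hypothetical limit, for the sketch only */

/* One registration slot per host/array path; 0 means no key registered. */
struct pr_table {
	uint64_t key[PR_MAX_PATHS];
};

/* Register (or replace) the key for one path. Returns 0 on success. */
static int pr_register(struct pr_table *t, size_t path, uint64_t key)
{
	assert(t != NULL);
	if (path >= PR_MAX_PATHS || key == 0)
		return -1;
	t->key[path] = key;
	return 0;
}

/* A path may access the volume only while it holds a registered key. */
static int pr_has_access(const struct pr_table *t, size_t path)
{
	assert(t != NULL);
	return path < PR_MAX_PATHS && t->key[path] != 0;
}

/*
 * Revoke every registration that carries victim_key. The requester must
 * itself be registered. Returns the number of paths that lost access,
 * or -1 if the request is refused.
 */
static int pr_revoke(struct pr_table *t, size_t requester_path,
		     uint64_t victim_key)
{
	size_t i;
	int removed = 0;

	assert(t != NULL);
	if (!pr_has_access(t, requester_path) || victim_key == 0)
		return -1;
	for (i = 0; i < PR_MAX_PATHS; i++) {
		if (t->key[i] == victim_key) {
			t->key[i] = 0;
			removed++;
		}
	}
	return removed;
}
```

In this model a host that registered one key on all of its paths loses all of
them in a single revoke, and regains access only by registering a new key.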
 
For 3PAR use the showrsv command to view things from the 3PAR array:
 
showrsv - Show information about scsi reservations of virtual volumes (VVs).
 
SYNTAX
showrsv [options arg] [VV_name]
 
DESCRIPTION
The showrsv command displays SCSI reservation and registration information
for VLUNs bound for a specified port.
 
AUTHORITY
Any role in the system
 
OPTIONS
-l scsi3|scsi2

 On Jan 6, 2014, at 6:35 PM, Matthias Eble psychotr...@gmail.com wrote:
 
 2014/1/7 James Bottomley james.bottom...@hansenpartnership.com:
 On Mon, 2014-01-06 at 23:53 +0100, Matthias Eble wrote:
 
 Can sdg and sdl be the same I_T_Nexus at a time?
 Right now, they are handled like that.
 In my understanding, every scsi disk device represents an I_T_Nexus.
 
 No, every SCSI disk is an I_T_L nexus.  There's no actual device object
 in Linux for an I_T nexus.
 
 So, PR registrations are made for an I_T nexus using an I_T_L nexus.
 Probably my previous systems had a 1:1 relation between I_T and I_T_L.
 
 Is there a way to identify which I_T_L nexuses belong to the same I_T nexus?

scsi_debug driver puzzle

2014-03-31 Thread Laurence Oberman

Hello

I have what to me is a puzzle, but is likely a stupid question, about
the queuecommand interface in the scsi_debug driver.

I see the host template set to scsi_debug_queuecommand, but in the
driver the function is declared as int scsi_debug_queuecommand_lck.
So how is this working?

Egrep pattern: scsi_debug_queuecommand

  File Line
0 scsi_debug.c 3551 int scsi_debug_queuecommand_lck(struct scsi_cmnd
*SCpnt, done_funct_t done)
1 scsi_debug.c 3900 static DEF_SCSI_QCMD(scsi_debug_queuecommand)
2 scsi_debug.c 3912 .queuecommand =  scsi_debug_queuecommand,

Thanks
Laurence
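
A likely answer, sketched outside the kernel: DEF_SCSI_QCMD (defined, as far
as I can tell, in include/scsi/scsi_host.h) is a macro that generates the
scsi_debug_queuecommand() wrapper by pasting _lck onto the name it is given.
The version below is a simplified user-space model. The struct layouts and the
host_lock_held counter are assumptions for illustration; the real macro takes
shost->host_lock with spin_lock_irqsave() around the call.

```c
#include <assert.h>
#include <stddef.h>

struct scsi_cmnd;
typedef void (*done_funct_t)(struct scsi_cmnd *);

/* Minimal stand-ins for the kernel structures; not the real layouts. */
struct Scsi_Host {
	int unused;
};

struct scsi_cmnd {
	done_funct_t scsi_done;
	int saw_lock_held;	/* recorded by the _lck function below */
};

struct scsi_host_template {
	int (*queuecommand)(struct Scsi_Host *, struct scsi_cmnd *);
};

static int host_lock_held;	/* models shost->host_lock */

/*
 * Simplified DEF_SCSI_QCMD: "##" pastes "_lck" onto the given name, so
 * DEF_SCSI_QCMD(foo) defines foo() as a locked wrapper around foo_lck().
 * The counter updates stand in for spin_lock_irqsave() and
 * spin_unlock_irqrestore().
 */
#define DEF_SCSI_QCMD(func_name)					\
	int func_name(struct Scsi_Host *shost, struct scsi_cmnd *cmd)	\
	{								\
		int rc;							\
		assert(shost != NULL && cmd != NULL);			\
		host_lock_held++;					\
		rc = func_name##_lck(cmd, cmd->scsi_done);		\
		host_lock_held--;					\
		return rc;						\
	}

/* The function the driver source actually spells out. */
static int scsi_debug_queuecommand_lck(struct scsi_cmnd *cmd, done_funct_t done)
{
	(void)done;
	cmd->saw_lock_held = host_lock_held;
	return 0;
}

/* Expands to: static int scsi_debug_queuecommand(shost, cmd) { ... } */
static DEF_SCSI_QCMD(scsi_debug_queuecommand)

static struct scsi_host_template sdebug_driver_template = {
	.queuecommand = scsi_debug_queuecommand,
};
```

So the name used in the host template does exist, it is just never written out
in full in the source; egrep only finds the macro invocation.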

Re: [PATCH] lpfc: correct device removal deadlock after link bounce

2014-12-30 Thread Laurence Oberman

This patch was tested in house at Red Hat and is running in test
kernels at a couple of Red Hat customers.
James, thanks for sending it upstream.
Laurence

On Tue, Dec 30, 2014 at 12:08 PM, James Smart james.sm...@emulex.com wrote:
 This patch, applicable to 8G/4G/2G adapters, adds a call that
 resumes transmit operations after a link bounce. Without it, targets
 that tried to suspend exchanges after a link bounce (such as tape devices
 using sequence level error recovery) would never resume io operation,
 causing scan failures, and eventually deadlocks if a device removal
 request is made.

 The patches were cut against Christoph's scsi-queue.git,
 branch drivers-for-3.18.  The driver rev cut against is 10.4.8000.0

 -- james s


 Signed-off-by: James Smart james.sm...@emulex.com
 Signed-off-by: Dick Kennedy dick.kenn...@emulex.com
 ---
  lpfc_els.c |9 +
  1 file changed, 9 insertions(+)

 diff -upNr a/drivers/scsi/lpfc/lpfc_els.c b/drivers/scsi/lpfc/lpfc_els.c
 --- a/drivers/scsi/lpfc/lpfc_els.c  2014-12-29 12:48:08.0 -0500
 +++ b/drivers/scsi/lpfc/lpfc_els.c  2014-12-30 11:23:04.344426606 -0500
 @@ -2225,6 +2225,15 @@ lpfc_adisc_done(struct lpfc_vport *vport
 if ((phba->sli3_options & LPFC_SLI3_NPIV_ENABLED) &&
 !(vport->fc_flag & FC_RSCN_MODE) &&
 (phba->sli_rev < LPFC_SLI_REV4)) {
 +   /* The ADISCs are complete.  Doesn't matter if they
 +* succeeded or failed because the ADISC completion
 +* routine guarantees to call the state machine and
 +* the RPI is either unregistered (failed ADISC response)
 +* or the RPI is still valid and the node is marked
 +* mapped for a target.  The exchanges should be in the
 +* correct state. This code is specific to SLI3.
 +*/
 +   lpfc_issue_clear_la(phba, vport);
 lpfc_issue_reg_vpi(phba, vport);
 return;
 }



Re: [PATCH] st: implement sysfs based tape statistics v2

2015-02-02 Thread Laurence Oberman

I pulled this this morning and will be testing. The prior version was
stable for me on the upstream and RHEL 6.5 kernel without exhaustive
testing.
We also just received more requests to get this into RHEL from HP /
Red Hat customers.

Kai, what are your thoughts. I realize this is a large amount of
additional code. I am not keen to create a driver just for stats as we
would have to keep the rest of the st driver changes always in sync.

Thanks
Laurence

On Mon, Jan 12, 2015 at 10:43 PM, Seymour, Shane M shane.seym...@hp.com wrote:
 Some small changes since the last version - this version removes two files 
 from sysfs compared to the last version (read and write block counts since 
 they're derived from the byte counts they can be calculated in user space) 
 but that's the only change. This version has been rebased to 
 3.19.0-rc3-next-20150108.

 Since posting the last version an email was received by Kai and myself from 
 an AT&T employee who has a need for tape statistics to be implemented (who 
 gave permission to quote their email), I've included part of the email here:

 There are over 20,000 tape devices managed by our operations group zoned to 
 tens of thousands of servers.

 My challenge is that I cannot provide operations a solution that gives them 
 visibility into the tape drive performance metrics when that platform is 
 linux. Our legacy platforms (Solaris,HPUX,AIX) provide facilities to use 
 iostat and sar to determine the write speed of the tape drives. We took for 
 granted that this would be available in linux and its absence has been very 
 troublesome. Because operations cannot measure tape drive performance in this 
 way they cannot easily determine when there is a tape drive performance 
 problem and whether the change improved or worsened the performance problem.
 ...
 I have followed the debate https://lkml.org/lkml/2013/3/20/696 and from a 
 service provide perspective we would expect the same maturity and 
 functionality that we have from the traditional unix platform in regards to 
 iostat/sar. This issue is important and urgent because tape drive performance 
 issues are common and I am unable to provide standards and processes to 
 identify and remediate these issues.

 Another HP customer has also requested the same functionality (but hasn't 
 given permission to be named), they own tape drives numbering in the 1000s 
 and also need the ability to investigate performance issues.

 Signed-off-by: shane.seym...@hp.com
 Tested-by: shane.seym...@hp.com
 ---
 diff -uprN a/drivers/scsi/st.c b/drivers/scsi/st.c
 --- a/drivers/scsi/st.c 2015-01-11 14:46:00.243814755 -0600
 +++ b/drivers/scsi/st.c 2015-01-12 13:54:52.549117333 -0600
 @@ -20,6 +20,7 @@
  static const char *verstr = "20101219";

  #include <linux/module.h>
 +#include <linux/kobject.h>

  #include <linux/fs.h>
  #include <linux/kernel.h>
 @@ -226,6 +227,20 @@ static DEFINE_SPINLOCK(st_index_lock);
  static DEFINE_SPINLOCK(st_use_lock);
  static DEFINE_IDR(st_index_idr);

 +static inline void st_stats_remove_files(struct scsi_tape *);
 +static inline void st_stats_create_files(struct scsi_tape *);
 +static ssize_t st_tape_attr_show(struct kobject *, struct attribute *, char 
 *);
 +static ssize_t st_tape_attr_store(struct kobject *, struct attribute *,
 +   const char *, size_t);
 +static void st_release_stats_kobj(struct kobject *);
 +static const struct sysfs_ops st_stats_sysfs_ops = {
 +   .show   = st_tape_attr_show,
 +   .store  = st_tape_attr_store,
 +};
 +static struct kobj_type st_stats_ktype = {
 +   .release = st_release_stats_kobj,
 +   .sysfs_ops = &st_stats_sysfs_ops,
 +};



  #include "osst_detect.h"
 @@ -476,10 +491,22 @@ static void st_scsi_execute_end(struct r
 struct st_request *SRpnt = req->end_io_data;
 struct scsi_tape *STp = SRpnt->stp;
 struct bio *tmp;
 +   u64 ticks;

 STp->buffer->cmdstat.midlevel_result = SRpnt->result = req->errors;
 STp->buffer->cmdstat.residual = req->resid_len;

 +   if (STp->stats != NULL) {
 +   ticks = get_jiffies_64();
 +   STp->stats->in_flight--;
 +   ticks -= STp->stats->stamp;
 +   STp->stats->io_ticks += ticks;
 +   if (req->cmd[0] == WRITE_6)
 +   STp->stats->write_ticks += ticks;
 +   else if (req->cmd[0] == READ_6)
 +   STp->stats->read_ticks += ticks;
 +   }
 +
 tmp = SRpnt->bio;
 if (SRpnt->waiting)
 complete(SRpnt->waiting);
 @@ -496,6 +523,7 @@ static int st_scsi_execute(struct st_req
 struct rq_map_data *mdata = &SRpnt->stp->buffer->map_data;
 int err = 0;
 int write = (data_direction == DMA_TO_DEVICE);
 +   struct scsi_tape *STp = SRpnt->stp;

 req = blk_get_request(SRpnt->stp->device->request_queue, write,
   GFP_KERNEL);
 @@ -516,6 +544,20 @@ static int st_scsi_execute(struct st_req
 }
 }

 +
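
The completion-side accounting in the st_scsi_execute_end hunk above can be
modelled in user space roughly as follows. This is only a sketch: the
tape_stats struct and function names are made up for the example, and the
start-side function is an assumption, since the quoted patch is cut off before
the submission path is shown. Timestamps are passed in rather than read from
get_jiffies_64().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define READ_6  0x08	/* SCSI READ(6) opcode */
#define WRITE_6 0x0a	/* SCSI WRITE(6) opcode */

/* Hypothetical stand-in for the per-tape statistics in the patch. */
struct tape_stats {
	int      in_flight;
	uint64_t stamp;		/* time the current command was issued */
	uint64_t io_ticks;	/* time spent in all commands */
	uint64_t read_ticks;	/* time spent in READ_6 commands */
	uint64_t write_ticks;	/* time spent in WRITE_6 commands */
};

/* Assumed submission side: remember when the command was issued. */
static void tape_stats_start(struct tape_stats *s, uint64_t now)
{
	assert(s != NULL);
	s->stamp = now;
	s->in_flight++;
}

/* Completion side, mirroring the added lines in st_scsi_execute_end(). */
static void tape_stats_end(struct tape_stats *s, uint8_t opcode, uint64_t now)
{
	uint64_t ticks;

	assert(s != NULL);
	ticks = now - s->stamp;
	s->in_flight--;
	s->io_ticks += ticks;
	if (opcode == WRITE_6)
		s->write_ticks += ticks;
	else if (opcode == READ_6)
		s->read_ticks += ticks;
}
```

Commands other than READ_6 and WRITE_6 only add to io_ticks, which is why the
total can be larger than the sum of the read and write counters.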

Re: [PATCH] st: implement sysfs based tape statistics v2

2015-02-05 Thread Laurence Oberman

- Original Message -
From: Kai Mäkisara (Kolumbus) kai.makis...@kolumbus.fi
To: Laurence Oberman oberma...@gmail.com
Cc: Shane M Seymour shane.seym...@hp.com, lober...@redhat.com, 
linux-scsi@vger.kernel.org, James E.J. Bottomley (jbottom...@parallels.com) 
jbottom...@parallels.com, je...@suse.com
Sent: Thursday, February 5, 2015 12:03:29 PM
Subject: Re: [PATCH] st: implement sysfs based tape statistics v2

 On 2.2.2015, at 17.16, Laurence Oberman oberma...@gmail.com wrote:

 I pulled this this morning and will be testing. The prior version was
 stable for me on the upstream and RHEL 6.5 kernel without exhaustive
 testing.
 We also just received more requests to get this into RHEL from HP /
 Red Hat customers.

 Kai, what are your thoughts. I realize this is a large amount of
 additional code. I am not keen to create a driver just for stats as we
 would have to keep the rest of the st driver changes always in sync.

I still think that the tape statistics should be exported like the statistics 
of “real” block devices, i.e., one sysfs file exporting on a single line the 
statistics that temporally belong together. James rejected this approach. I am 
leaving the decision about this code to him. I will neither ack nor nak this 
code.

Thanks,
Kai

Hello Kai,

I missed the earlier conversations with James, I will go search for them.
Do you mean add them so they are similar to the /proc/diskstats

cat /proc/diskstats
..
   8   0 sda 2258346 152801 291907067 5263795 388817 1518048 15013833 
4542062 0 4794931 9803495
   8   1 sda1 717 102 26154 1179 8 2 80 76 0 1172 1254
   8   2 sda2 328 31 2872 1554 0 0 0 0 0 1554 1554
   8   3 sda3 2195205 151617 290898283 5203627 355053 1518046 15009528 
4370598 0 4594137 9571937
   8   4 sda4 61921 1050 978350 57218 18 0 4225 34 0 56384 57185
  11   0 sr0 0 0 0 0 0 0 0 0 0 0 0
..

Laurence Oberman
Red Hat Global Support Service
SEG Team


Re: [RFC] implementing tape statistics single file vs multi-file in sysfs

2015-02-07 Thread Laurence Oberman

Hello
It's not going to be tens of thousands of devices. That count was an
aggregate based on 1000's of servers.
In reality it's unlikely to ever be more than 100 tape drives per
individual Linux kernel instance.
Therefore sysfs will be a valid way to do this and make the data
available to user space.

Thanks
Laurence


 On Feb 6, 2015, at 11:07 PM, Greg KH gre...@linuxfoundation.org wrote:
 
 On Fri, Feb 06, 2015 at 03:41:58PM +, Bryn M. Reeves wrote:
 On Fri, Feb 06, 2015 at 04:59:16AM -0800, Greg KH wrote:
 On Fri, Feb 06, 2015 at 12:20:53AM +, Seymour, Shane M wrote:
 The current patch that implements tape statistics is here:
 
 http://marc.info/?l=linux-scsi&m=142112067313723&w=2
 
 Aside from the do we want to do this all in a single file issue that I
 will say more on below, this patch has issues.  Please don't use a
 kobject for _ANYTHING_ in sysfs that has a struct device as a parent.
 If you do that, it can't be seen by userspace tools very well, if at
 all.
 
 I can't speak for Shane but wouldn't spend too much time looking at the
 current v2 patch: it's the result of a pretty ugly compromise suggested
 on linux-scsi.
 
 Fair enough, but please feel free to cc: me on the patch that you do
 feel is correct to get a sysfs-related review.
 
 Recently there was another discussion here about one file vs a 
 collection of files for tape statistics:
 
 http://marc.info/?l=linux-scsi&m=142316255501550&w=2
 
 The result was that I should ask here what method I should use. I
 would like to get feedback in relation to tape statistics and one file
 vs multi-file in sysfs. I'm happy to keep the existing code or change
 to a single file approach.
 
 One of the primary reasons we created sysfs and the one value per file
 rule is that multi-value files just do not work well.  Yes, you get an
 atomic snapshot, and you save some open/read/close syscall roundtrips,
 but you do so at the expense of forcing userspace to know what the
 format of the file is.  And once you create it, you can NEVER CHANGE IT
 AGAIN.
 
 I am not convinced this is a concern for tape statistics: they are pretty
 much a solved problem. The commercial *nixes have had this for decades.
 
 Likewise for disk stats: although fluff like maj:min/name etc. has been
 shuffled a few times the basic fields have remained unchanged for a very
 long time and sysfs already removes the need to include an identity
 field.
 
 We already handle i/o stats just fine, why create a special sysfs
 interface for just a tape device interface?  What makes them so special?
 
 Yes, that's right, if you come up with some new statistic in the future,
 or realize that one of the ones you have now is wrong, you can't change
 it, you have to make a whole new file, otherwise you could break
 userspace tools.
 
 I understand the fact that you can't change them; I just don't think it's
 a big problem in this specific case (and much less than some of the
 more imaginative sysfs content - 2d int arrays with column headers
 anyone?).
 
 What sysfs file is a 2d int array?  I'll be glad to fix it.
 
 Also, everyone doesn't think their solution will ever need to be
 changed.  Until later when it does :)
 
 And yes, open/read/close does take a few extra cycles, but you
 can't really measure it for a virtual filesystem like this on any modern
 system.
 
 I'll try to get some numbers when I get back home next week - Shane is
 talking about use cases involving tens of thousands of tape devices. I
 am not certain that the overhead would be unmeasurable in that case: the
 additional context switching & TLB flushes alone seem like they would
 add up.
 
 If you want to measure tens of thousands of tape devices then you
 shouldn't be using sysfs in the first place as it is not designed for
 speed at all.  Use the existing i/o rate interfaces instead, don't try
 to cram something into sysfs that doesn't belong there.
 
 thanks,
 
 greg k-h

Re: [PATCH] st: implement sysfs based tape statistics v2

2015-02-05 Thread Laurence Oberman

- Original Message -
From: Kai Mäkisara (Kolumbus) kai.makis...@kolumbus.fi
To: Laurence Oberman lober...@redhat.com
Cc: Laurence Oberman oberma...@gmail.com, Shane M Seymour 
shane.seym...@hp.com, linux-scsi@vger.kernel.org, James E.J. Bottomley 
(jbottom...@parallels.com) jbottom...@parallels.com, je...@suse.com
Sent: Thursday, February 5, 2015 12:46:32 PM
Subject: Re: [PATCH] st: implement sysfs based tape statistics v2

 On 5.2.2015, at 19.40, Laurence Oberman lober...@redhat.com wrote:

 - Original Message -
 From: Kai Mäkisara (Kolumbus) kai.makis...@kolumbus.fi
 To: Laurence Oberman oberma...@gmail.com
 Cc: Shane M Seymour shane.seym...@hp.com, lober...@redhat.com, 
 linux-scsi@vger.kernel.org, James E.J. Bottomley (jbottom...@parallels.com) 
 jbottom...@parallels.com, je...@suse.com
 Sent: Thursday, February 5, 2015 12:03:29 PM
 Subject: Re: [PATCH] st: implement sysfs based tape statistics v2

 On 2.2.2015, at 17.16, Laurence Oberman oberma...@gmail.com wrote:

 I pulled this this morning and will be testing. The prior version was
 stable for me on the upstream and RHEL 6.5 kernel without exhaustive
 testing.
 We also just received more requests to get this into RHEL from HP /
 Red Hat customers.

 Kai, what are your thoughts. I realize this is a large amount of
 additional code. I am not keen to create a driver just for stats as we
 would have to keep the rest of the st driver changes always in sync.

 I still think that the tape statistics should be exported like the statistics 
 of “real” block devices, i.e., one sysfs file exporting on a single line the 
 statistics that temporally belong together. James rejected this approach. I 
 am leaving the decision about this code to him. I will neither ack nor nak 
 this code.

 Thanks,
 Kai

 Hello Kai,

 I missed the earlier conversations with James, I will go search for them.
 Do you mean add them so they are similar to the /proc/diskstats

 cat /proc/diskstats
 ..
   8   0 sda 2258346 152801 291907067 5263795 388817 1518048 15013833 
 4542062 0 4794931 9803495
   8   1 sda1 717 102 26154 1179 8 2 80 76 0 1172 1254
   8   2 sda2 328 31 2872 1554 0 0 0 0 0 1554 1554
   8   3 sda3 2195205 151617 290898283 5203627 355053 1518046 15009528 
 4370598 0 4594137 9571937
   8   4 sda4 61921 1050 978350 57218 18 0 4225 34 0 56384 57185
  11   0 sr0 0 0 0 0 0 0 0 0 0 0 0
 ..

Not exactly. I mean the data exported in sysfs, for example:

 cat /sys/block/sda/sda1/stat
  159740 9006 5941506 64461 124724 55907 12772208 3598677
0 299875 3663235

Kai
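
For reference, a single-line stat file of this kind can be read with one
sscanf() call, which is the main appeal of the one-file approach. The sketch
below assumes the 11-field layout that, as far as I recall, the kernel
documents in Documentation/block/stat.txt; the blk_stat struct and
parse_blk_stat() are names invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* The 11 counters, in the order they appear on the line (assumed layout). */
struct blk_stat {
	unsigned long long read_ios, read_merges, read_sectors, read_ticks;
	unsigned long long write_ios, write_merges, write_sectors, write_ticks;
	unsigned long long in_flight, io_ticks, time_in_queue;
};

/* Returns 0 on success, -1 if the line does not start with 11 numbers. */
static int parse_blk_stat(const char *line, struct blk_stat *st)
{
	int n;

	assert(line != NULL && st != NULL);
	n = sscanf(line,
		   "%llu %llu %llu %llu %llu %llu %llu %llu %llu %llu %llu",
		   &st->read_ios, &st->read_merges, &st->read_sectors,
		   &st->read_ticks, &st->write_ios, &st->write_merges,
		   &st->write_sectors, &st->write_ticks, &st->in_flight,
		   &st->io_ticks, &st->time_in_queue);
	return n == 11 ? 0 : -1;
}
```

The trade-off is visible in the format string: user space has to know the
field order in advance, which is the objection raised against multi-value
sysfs files.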

Ok, Thanks, got it now. Let me circle back with Shane

Laurence Oberman
Red Hat Global Support Service
SEG Team

[PATCH ] tcm_qla2xxx Add SCSI command jammer/discard capabilty to the tcm_qla2xxx module

2015-03-07 Thread Laurence Oberman

Hello

I use target LIO for all my storage array test targets and customer problem 
resolution here at Red Hat.
This patch resulted from a requirement to mimic behaviour of an expensive 
hardware jammer for a customer.
I have used this for some time with good success to simulate and reproduce 
latency and slow drain fabric issues and
for testing and validating error handling behaviour in the Emulex, Qlogic and 
other F/C drivers.

It works by checking jammer_flag == 1 and the host number, and discards matching 
SCSI commands. It is controlled by echoing values to the module parameters in /sys.

I decided to share the patch, in the hope it may be useful for others but I do 
understand this is a special use case.
If this is useful and Nab wants to include it I will create a proper 
documentation patch as well.

filename:   
/lib/modules/3.17.7-200.jammer.fc20.x86_64/kernel/drivers/scsi/qla2xxx/tcm_qla2xxx.ko
license:GPL
description:TCM QLA2XXX series NPIV enabled fabric driver
depends:target_core_mod,qla2xxx,scsi_transport_fc
intree: Y
vermagic:   3.17.7-200.jammer.fc20.x86_64 SMP mod_unload 
parm:   jammer_flag:Set to 1: Enable jammer (int)
parm:   host_flag:host number to match on (int)


Enable host 6 to be jammed
echo 6 > /sys/module/tcm_qla2xxx/parameters/host_flag

Usage example script:

#!/bin/bash
host=`cat /sys/module/tcm_qla2xxx/parameters/host_flag`
sleep_time=120  ### Time to jam for
echo "We start to discard commands on SCSI host $host"
logger "Jammer started"
echo 1 > /sys/module/tcm_qla2xxx/parameters/jammer_flag
sleep $sleep_time
echo 0 > /sys/module/tcm_qla2xxx/parameters/jammer_flag
echo "We stopped the jammer"
logger "Jammer stopped"

This patch is a diff against 3.19.1

Tested by: Laurence Oberman lober...@redhat.com
Signed-off-by: Laurence Oberman lober...@redhat.com

diff -Nurp a/drivers/scsi/qla2xxx/tcm_qla2xxx.c 
b/drivers/scsi/qla2xxx/tcm_qla2xxx.c
--- a/drivers/scsi/qla2xxx/tcm_qla2xxx.c2015-03-07 18:35:15.246737589 
-0500
+++ b/drivers/scsi/qla2xxx/tcm_qla2xxx.c2015-03-07 18:35:40.168599630 
-0500
@@ -50,6 +50,14 @@
 #include "qla_target.h"
 #include "tcm_qla2xxx.h"
 
+int message_flag=0;
+int jammer_flag = 0;
+module_param(jammer_flag, int,0644);
+MODULE_PARM_DESC(jammer_flag, "If set to 1: Enable jammer");
+int host_flag=0;
+module_param(host_flag, int,0644);
+MODULE_PARM_DESC(host_flag, "host number to match on");
+
 static struct workqueue_struct *tcm_qla2xxx_free_wq;
 static struct workqueue_struct *tcm_qla2xxx_cmd_wq;
 
@@ -570,6 +578,22 @@ static int tcm_qla2xxx_handle_cmd(scsi_q
pr_err("Unable to locate active struct se_session\n");
return -EINVAL;
}
+   
+   // Control messaging here
+   message_flag += jammer_flag;
+   if(message_flag == 1)
+   printk("tcm_qla2xx:SCSI Jammer enabled on host %d\n",host_flag);
+   if((jammer_flag == 0) && (message_flag >=0)) {
+   printk("tcm_qla2xx:SCSI Jammer stopped, %d SCSI commands discarded for host %d\n",message_flag,host_flag);
+   message_flag=-1;
+   }
+   
+   if ((vha->host_no == host_flag) && (jammer_flag == 1))
+   {
+   // return, and don't run target_submit_cmd, effectively 
discarding command
+   return 0;
+   }
+
 
return target_submit_cmd(se_cmd, se_sess, cdb, &cmd->sense_buffer[0],
cmd->unpacked_lun, data_length, fcp_task_attr,
@@ -2165,6 +2189,7 @@ static void tcm_qla2xxx_deregister_confi
 static int __init tcm_qla2xxx_init(void)
 {
int ret;
+   jammer_flag = 0;
 
ret = tcm_qla2xxx_register_configfs();
if (ret < 0)



Re: [PATCH ] qla2xxx Add SCSI command jammer/discard capabilty to the qla2xxx target module - revision3

2015-03-12 Thread Laurence Oberman

Hello Bart, Quinn Tran

Thanks for the feedback.

Revision3
Moved the discard into __qlt_do_work to prevent the memory leak; going through 
out_term cleans up the allocations.
I will look at how best this can be done for the other transports, or in the 
core, but for me the most useful case has been F/C.
I wanted to get feedback on what I have so far, and suggest that we start with 
this as the initial jamming patch as it's the least risky change for now.
I did test this by running the same set of tests I normally use this error 
injection for, and it looks good.


Patch notes
---
I use target LIO for all my storage array test targets and customer problem 
resolution here at Red Hat.
This patch resulted from a requirement to mimic behaviour of an expensive 
hardware jammer for a customer.
I have used this for some time with good success to simulate and reproduce 
latency and slow drain fabric issues and
for testing and validating error handling behaviour in the Emulex, Qlogic and 
other F/C drivers.
While the jammer is enabled SCSI commands are discarded for the selected host 
and this allows all the multipath error recovery and other
LLD driver error recovery and timeout code to be debugged and tested.

It works by checking the new jam_host parameter: if it is >= 0 and matches 
vha->host_no, commands for that host are discarded.
If the parameter is set to -1 (the default), no jamming is done.
I decided to share the patch, in the hope it may be useful for others but I do 
understand this is a special use case.
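
The decision itself is small enough to model in user-space C. The sketch below
is illustrative only: jam_should_discard() and jam_count_submitted() are
hypothetical names, and the explicit jam_host >= 0 test is added for clarity
(the patch relies on a plain equality check, which behaves the same because
real host numbers are never negative).

```c
#include <assert.h>

#define JAM_HOST_DISABLED (-1)	/* default: no host is jammed */

/* Returns 1 if a command arriving on host_no should be discarded. */
static int jam_should_discard(int host_no, int jam_host)
{
	assert(host_no >= 0);
	return jam_host >= 0 && host_no == jam_host;
}

/* Count how many of n commands on host_no would reach the target core. */
static int jam_count_submitted(int host_no, int jam_host, int n)
{
	int i, submitted = 0;

	for (i = 0; i < n; i++)
		if (!jam_should_discard(host_no, jam_host))
			submitted++;
	return submitted;
}
```

With jam_host set to 6, every command on host 6 is dropped while other hosts
are untouched, so the initiators on that fabric path see timeouts and go into
error recovery.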

Tested by: Laurence Oberman lober...@redhat.com
Signed-off-by: Laurence Oberman lober...@redhat.com

diff -Nurp a/Documentation/scsi/qla2xxx.txt b/Documentation/scsi/qla2xxx.txt
--- a/Documentation/scsi/qla2xxx.txt1969-12-31 19:00:00.0 -0500
+++ b/Documentation/scsi/qla2xxx.txt2015-03-12 21:42:49.828788582 -0400
@@ -0,0 +1,34 @@
+qla2xxx target mode parameters
+--
+parm:   qlini_mode:Determines when initiator mode will be enabled. 
Possible values: exclusive - initiator mode will be enabled on load, disabled 
on enabling target mode and then on disabling target mode enabled back; 
disabled - initiator mode will never be enabled; enabled (default) - 
initiator mode will always stay enabled. (charp)
+
+Enables qla2xxx target mode by setting to disabled on module load
+
+There is now a new module parameter added to the qla2xxx module
+parm:   jam_host:Host to jam >=0 Enable jammer (int)
+
+Use this parameter to control the discarding of SCSI commands to a selected 
host.
+This may be useful for testing error handling and simulating slow drain and 
other
+fabric issues.
+
+Any value >=0 that matches a fc_host # will discard the commands for that host.
+Reset back to -1 to stop the jamming.
+
+Enable host 6 to be jammed
+echo 6 > /sys/module/qla2xxx/parameters/jam_host
+
+Disable jamming on host 6
+echo -1 > /sys/module/qla2xxx/parameters/jam_host
+
+Usage example script:
+
+#!/bin/bash
+sleep_time=120  ### Time to jam for
+echo 6 > /sys/module/qla2xxx/parameters/jam_host
+host=`cat /sys/module/qla2xxx/parameters/jam_host`
+echo "We start to discard commands on SCSI host $host"
+logger "Jammer started"
+sleep $sleep_time
+echo -1 > /sys/module/qla2xxx/parameters/jam_host
+echo "We stopped the jammer"
+logger "Jammer stopped"

diff -Nurp a/drivers/scsi/qla2xxx/qla_target.c 
b/drivers/scsi/qla2xxx/qla_target.c
--- a/drivers/scsi/qla2xxx/qla_target.c 2015-03-12 21:44:04.691314527 -0400
+++ b/drivers/scsi/qla2xxx/qla_target.c 2015-03-12 21:52:27.551557133 -0400
@@ -59,6 +59,11 @@ MODULE_PARM_DESC(qlini_mode,
 
 int ql2x_ini_mode = QLA2XXX_INI_MODE_EXCLUSIVE;
 
+int jam_host = -1;
+module_param(jam_host, int, 0644);
+MODULE_PARM_DESC(jam_host, "Host to jam >=0 Enable jammer");
+
+
 static int temp_sam_status = SAM_STAT_BUSY;
 
 /*
@@ -3264,6 +3269,11 @@ static void __qlt_do_work(struct qla_tgt
cmd->cmd_flags |= BIT_1;
if (tgt->tgt_stop)
goto out_term;
+   /*
+   * If jam_host >=0, goto out_term discarding command for matching host
+   */
+   if (unlikely(vha->host_no == jam_host))
+   goto out_term;
 
cdb = &atio->u.isp24.fcp_cmnd.cdb[0];
cmd->tag = atio->u.isp24.exchange_addr;


Laurence Oberman
Red Hat Global Support Service
SEG Team

- Original Message -
From: Bart Van Assche bart.vanass...@sandisk.com
To: Laurence Oberman lober...@redhat.com
Cc: Andy Grover agro...@redhat.com, linux-scsi@vger.kernel.org, 
n...@daterainc.com, Laurence Oberman oberma...@gmail.com
Sent: Thursday, March 12, 2015 9:13:28 AM
Subject: Re: [PATCH ] tcm_qla2xxx  Add SCSI command jammer/discard capabilty to 
the tcm_qla2xxx module - revision2

On 03/08/2015 11:38 AM, Laurence Oberman wrote:
 Here is revision2

 I added unlikely and removed messaging control as it not necessary and adds 
 overhead.

 I use target LIO for all my storage array test targets and customer problem 
 resolution here at Red Hat.
 This patch resulted from a requirement

Re: [PATCH ] tcm_qla2xxx Add SCSI command jammer/discard capabilty to the tcm_qla2xxx module

2015-03-12 Thread Laurence Oberman

Hello Quinn Tran

Thank you for the feedback. There is a revision2 of this patch, which I sent as 
a follow-on to Bart, that is much cleaner, but it's still exposed to the memory 
leaks.
The newer version has a single jam_host parameter as suggested by Bart and the 
messaging removed. Have a look for it.
Bart also suggested moving the discard to a higher layer in his most recent 
response to allow other transports to benefit as well.

I have used this a lot and it's been extremely useful, but I have never used it 
for extended periods, only specifically to test servers connected via F/C to 
the LIO host.
I was concerned that we had a dangling allocation after discard but never saw 
the leak show up significantly in my testing, 
mostly because my test servers are in error recovery and waiting on timeouts.
Where I placed the discard seemed to be the safest place for my particular use 
case. 
I did use other options, like zeroing the cdb and passing the command on to 
avoid the dangling allocation, to force lots of underruns on the host during 
testing.

Let me revisit my most recent version and take care of the memory leak exposure 
and look into your other suggestions.
I will reply in that latest thread with a new version.

Many Thanks for the consideration

Laurence

Laurence Oberman
Red Hat Global Support Service
SEG Team

- Original Message -
From: Quinn Tran quinn.t...@qlogic.com
To: Laurence Oberman lober...@redhat.com, Andy Grover 
agro...@redhat.com, linux-scsi linux-scsi@vger.kernel.org, 
n...@daterainc.com
Cc: Laurence Oberman oberma...@gmail.com
Sent: Thursday, March 12, 2015 6:07:08 PM
Subject: Re: [PATCH ] tcm_qla2xxx  Add SCSI command jammer/discard capabilty to 
the tcm_qla2xxx module

This idea definitely helps flush out additional interaction issues between
fabric drivers and TCM.

However, the current spot where the error injection is placed will cause
a memory leak.  The error injection tries to drop the command before
submission to TCM.  The TCM and QLA drivers will lose track of this command.
The test will be short-lived if enough memory has been leaked.  Maybe the
command should be dropped before memory allocation.

In addition, it would be nice if the other spots could be included, such as:
queue_status(), queue_data_in, aborted_task(), queue_tm_rsp() and
target_submit_tmr().

If the intent is to test all adapters, then the error injection needs to be
moved higher up into the TCM driver.


Regards,
Quinn Tran




On 3/7/15, 8:26 PM, Laurence Oberman lober...@redhat.com wrote:

Hello

I use target LIO for all my storage array test targets and customer
problem resolution here at Red Hat.
This patch resulted from a requirement to mimic behaviour of an expensive
hardware jammer for a customer.
I have used this for some time with good success to simulate and
reproduce latency and slow drain fabric issues and
for testing and validating error handling behaviour in the Emulex, Qlogic
and other F/C drivers.

Works by checking jammer_flag==1 and host # and discards SCSI command,
controlled using echo to sys parameter.

I decided to share the patch, in the hope it may be useful for others but
I do understand this is a special use case.
If this is useful and Nab wants to include it I will create a proper
documentation patch as well.

filename:   
/lib/modules/3.17.7-200.jammer.fc20.x86_64/kernel/drivers/scsi/qla2xxx/tcm
_qla2xxx.ko
license:GPL
description:TCM QLA2XXX series NPIV enabled fabric driver
depends:target_core_mod,qla2xxx,scsi_transport_fc
intree: Y
vermagic:   3.17.7-200.jammer.fc20.x86_64 SMP mod_unload
parm:   jammer_flag:Set to 1: Enable jammer (int)
parm:   host_flag:host number to match on (int)


Enable host 6 to be jammed
echo 6 > /sys/module/tcm_qla2xxx/parameters/host_flag

Usage example script:

#!/bin/bash
host=`cat /sys/module/tcm_qla2xxx/parameters/host_flag`
sleep_time=120  ### Time to jam for
echo We start to discard commands on SCSI host $host
logger Jammer started
echo 1 > /sys/module/tcm_qla2xxx/parameters/jammer_flag
sleep $sleep_time
echo 0 > /sys/module/tcm_qla2xxx/parameters/jammer_flag
echo We stopped the jammer
logger Jammer stopped

This Patch diff against 3.19.1

Tested by: Laurence Oberman lober...@redhat.com
Signed-off-by: Laurence Oberman lober...@redhat.com

diff -Nurp a/drivers/scsi/qla2xxx/tcm_qla2xxx.c
b/drivers/scsi/qla2xxx/tcm_qla2xxx.c
--- a/drivers/scsi/qla2xxx/tcm_qla2xxx.c   2015-03-07 18:35:15.246737589
-0500
+++ b/drivers/scsi/qla2xxx/tcm_qla2xxx.c   2015-03-07 18:35:40.168599630
-0500
@@ -50,6 +50,14 @@
 #include "qla_target.h"
 #include "tcm_qla2xxx.h"
 
+int message_flag=0;
+int jammer_flag = 0;
+module_param(jammer_flag, int,0644);
+MODULE_PARM_DESC(jammer_flag, "If set to 1: Enable jammer");
+int host_flag=0;
+module_param(host_flag, int,0644);
+MODULE_PARM_DESC(host_flag, "host number to match on");
+
 static struct workqueue_struct *tcm_qla2xxx_free_wq;
 static struct workqueue_struct *tcm_qla2xxx_cmd_wq;
 
@@ -570,6

Resend: [PATCH ] qla2xxx Add SCSI command jammer/discard capabilty to the qla2xxx target module - revision3

2015-03-30 Thread Laurence Oberman

Hello Bart, Quinn Tran,

I have been using this jammer facility since I posted the updated patch below, 
with no memory leaks and no issues.
Is there any interest in taking this patch? It has certainly been critical for me 
in some of the error recovery testing I have been doing.

Thanks

Laurence Oberman
Red Hat Global Support Service
SEG Team

- Original Message -
From: Laurence Oberman lober...@redhat.com
To: Bart Van Assche bart.vanass...@sandisk.com, Quinn Tran 
quinn.t...@qlogic.com
Cc: Andy Grover agro...@redhat.com, linux-scsi@vger.kernel.org, 
n...@daterainc.com, Laurence Oberman oberma...@gmail.com
Sent: Thursday, March 12, 2015 10:13:57 PM
Subject: Re: [PATCH ] qla2xxx  Add SCSI command jammer/discard capabilty to the 
qla2xxx target module - revision3

Hello Bart, Quinn Tran

Thanks for the feedback.

Revision3
Moved the discard to the __qlt_do_work code to prevent the memory leak; this 
cleans up the allocations.
I will look at how best this can be done for the other transports, or in 
the core, but for me the most useful case has been F/C.
I wanted to get feedback so far, and suggest that we should start with this as 
the initial jamming patch as it's the least risky change for now.
I did test this and ran the same set of tests I normally use this error 
injection for and it looks good.


Patch notes
---
I use target LIO for all my storage array test targets and customer problem 
resolution here at Red Hat.
This patch resulted from a requirement to mimic behaviour of an expensive 
hardware jammer for a customer.
I have used this for some time with good success to simulate and reproduce 
latency and slow drain fabric issues and
for testing and validating error handling behaviour in the Emulex, Qlogic and 
other F/C drivers.
While the jammer is enabled SCSI commands are discarded for the selected host 
and this allows all the multipath error recovery and other
LLD driver error recovery and timeout code to be debugged and tested.

Works by checking the new parameter jam_host: if it is >= 0 and matches 
vha->host_no, the command is discarded. Jamming is enabled when jam_host >=0.
If the parameter is set to -1 (default), no jamming is enabled.
I decided to share the patch, in the hope it may be useful for others but I do 
understand this is a special use case.
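
As a rough sketch of how the jam_host parameter described above might be driven
from a script, the helper below writes a host number, waits, and always writes
-1 back when it finishes, even if it is interrupted. The function name
jam_window and the optional third argument (an alternate file to write to, handy
for a dry run without the module loaded) are illustrative assumptions and not
part of the patch; the default path is the one given in the patch notes.

```shell
#!/bin/bash
# Sketch only: jam one SCSI host for a fixed window, then restore -1.
# usage: jam_window HOST SECONDS [PARAM_FILE]
jam_window() {
    local host=$1 seconds=$2
    local param=${3:-/sys/module/qla2xxx/parameters/jam_host}

    # Only non-negative host numbers enable jamming; -1 means "off".
    case $host in
        ''|*[!0-9]*)
            echo "jam_window: host must be >= 0, got '$host'" >&2
            return 1 ;;
    esac

    (
        # Run in a subshell so the traps do not leak into the caller.
        trap 'echo -1 > "$param"' EXIT     # always switch jamming off
        trap 'exit 130' INT TERM           # make signals go through EXIT
        echo "$host" > "$param" || exit 1
        sleep "$seconds"
    )
}
```

For example, jam_window 6 120 would jam host 6 for two minutes, which is the
same window the usage script in the documentation patch uses.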

Tested by: Laurence Oberman lober...@redhat.com
Signed-off-by: Laurence Oberman lober...@redhat.com

diff -Nurp a/Documentation/scsi/qla2xxx.txt b/Documentation/scsi/qla2xxx.txt
--- a/Documentation/scsi/qla2xxx.txt1969-12-31 19:00:00.0 -0500
+++ b/Documentation/scsi/qla2xxx.txt2015-03-12 21:42:49.828788582 -0400
@@ -0,0 +1,34 @@
+qla2xxx target mode parameters
+--
+parm:   qlini_mode:Determines when initiator mode will be enabled. 
Possible values: exclusive - initiator mode will be enabled on load, disabled 
on enabling target mode and then on disabling target mode enabled back; 
disabled - initiator mode will never be enabled; enabled (default) - 
initiator mode will always stay enabled. (charp)
+
+Enables qla2xxx target mode by setting to disabled on module load
+
+There is now a new module parameter added to the qla2xxx module
+parm:   jam_host:Host to jam >=0 Enable jammer (int)
+
+Use this parameter to control the discarding of SCSI commands to a selected host.
+This may be useful for testing error handling and simulating slow drain and other
+fabric issues.
+
+Any value >=0 that matches a fc_host # will discard the commands for that host.
+Reset back to -1 to stop the jamming.
+
+Enable host 6 to be jammed
+echo 6 > /sys/module/qla2xxx/parameters/jam_host
+
+Disable jamming on host 6
+echo -1 > /sys/module/qla2xxx/parameters/jam_host
+
+Usage example script:
+
+#!/bin/bash
+sleep_time=120  ### Time to jam for
+echo 6 > /sys/module/qla2xxx/parameters/jam_host
+host=`cat /sys/module/qla2xxx/parameters/jam_host`
+echo We start to discard commands on SCSI host $host
+logger Jammer started
+sleep $sleep_time
+echo -1 > /sys/module/qla2xxx/parameters/jam_host
+echo We stopped the jammer
+logger Jammer stopped

diff -Nurp a/drivers/scsi/qla2xxx/qla_target.c 
b/drivers/scsi/qla2xxx/qla_target.c
--- a/drivers/scsi/qla2xxx/qla_target.c 2015-03-12 21:44:04.691314527 -0400
+++ b/drivers/scsi/qla2xxx/qla_target.c 2015-03-12 21:52:27.551557133 -0400
@@ -59,6 +59,11 @@ MODULE_PARM_DESC(qlini_mode,
 
 int ql2x_ini_mode = QLA2XXX_INI_MODE_EXCLUSIVE;
 
+int jam_host = -1;
+module_param(jam_host, int, 0644);
+MODULE_PARM_DESC(jam_host, "Host to jam >=0 Enable jammer");
+
+
 static int temp_sam_status = SAM_STAT_BUSY;
 
 /*
@@ -3264,6 +3269,11 @@ static void __qlt_do_work(struct qla_tgt
cmd->cmd_flags |= BIT_1;
if (tgt->tgt_stop)
goto out_term;
+   /*
+   * If jam_host >=0, goto out_term discarding command for matching host
+   */
+   if (unlikely(vha->host_no == jam_host))
+   goto out_term;
 
cdb = &atio->u.isp24.fcp_cmnd.cdb[0];
cmd->tag = atio->u.isp24

Re: [PATCH v6] st implement tape statistics

2015-02-25 Thread Laurence Oberman

Hello,
I pulled the latest revision of this patch and tested it. I can vouch
for it working as expected without any obvious impact on the existing
st driver.
Is there any way we can move this along?
Thanks

Tested-by:Laurence Oberman lober...@redhat.com

On Thu, Feb 12, 2015 at 6:15 AM, Seymour, Shane M shane.seym...@hp.com wrote:
 The following patch exposes statistics for the st driver via sysfs.
 There is a need for companies with large numbers of tape drives
 numbering in the tens of thousands to track the performance of
 those tape drives (e.g. when a backup exceeds its window). The
 statistics provided should allow the calculation of throughput,
 average block sizes for read and write, and time spent waiting
 for I/O to complete or doing tape movement.
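
The calculations mentioned above reduce to simple ratios. As a sketch only: the
helpers below take counter values as plain arguments rather than reading sysfs,
because the attribute paths are not shown in this excerpt. The function names
are made up for illustration, and the tick-to-second conversion assumes the tick
counters are jiffies (the patch samples get_jiffies_64()), so the caller has to
supply the kernel's HZ value.

```shell
#!/bin/bash
# Sketch only: derive figures from st statistics counters.

# avg_block_size BYTE_CNT REQ_CNT -> average bytes per request
avg_block_size() {
    if [ "$2" -le 0 ]; then echo 0; return; fi
    echo $(( $1 / $2 ))
}

# throughput_bps BYTE_CNT TICKS HZ -> bytes per second
# TICKS / HZ is the elapsed time in seconds, so bytes/sec = BYTES * HZ / TICKS.
# Integer arithmetic; very large byte counts could overflow the multiply.
throughput_bps() {
    if [ "$2" -le 0 ]; then echo 0; return; fi
    echo $(( $1 * $3 / $2 ))
}
```

With write_byte_cnt, write_cnt and write_ticks from the patch as inputs, the
first helper gives the average write size and the second the write throughput;
the read counters work the same way.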

 Signed-off-by: Shane Seymour shane.seym...@hp.com
 Tested-by: Shane Seymour shane.seym...@hp.com
 ---
 - Removed comment
 - Found an issue where read and write sizes were over reported
 (fixed) In all the test cases I have the stats now report what I
 expect to be the correct value. Some of the values to be used
 with statistics are now stored in temporary variables and used
 to calculate the stats when the I/O ends. Separated out the
 timestamp into 3 since I found it was possible for other tape
 I/O to happen during writes updating the stamp value causing
 the time tracked to be wrong.
 - Moved the end statistics into a separate function because it
 had made the function that it was in too large.
 - Added a new statistic - A count of the number of times we had
 a residual greater than 0.
 --- a/drivers/scsi/st.c 2015-01-11 14:46:00.243814755 -0600
 +++ b/drivers/scsi/st.c 2015-02-11 22:37:01.382243090 -0600
 @@ -471,6 +471,47 @@ static void st_release_request(struct st
 kfree(streq);
  }

 +static void st_do_stats(struct scsi_tape *STp, struct request *req)
 +{
 +   u64 ticks;
 +
 +   ticks = get_jiffies_64();
 +   STp->stats->in_flight--;
 +   if (req->cmd[0] == WRITE_6) {
 +   ticks -= STp->stats->write_stamp;
 +   STp->stats->write_ticks += ticks;
 +   STp->stats->io_ticks += ticks;
 +   STp->stats->write_cnt++;
 +   if (req->errors) {
 +   STp->stats->write_byte_cnt +=
 +   STp->stats->last_write_size -
 +   STp->buffer->cmdstat.residual;
 +   if (STp->buffer->cmdstat.residual > 0)
 +   STp->stats->resid_cnt++;
 +   } else
 +   STp->stats->write_byte_cnt +=
 +   STp->stats->last_write_size;
 +   } else if (req->cmd[0] == READ_6) {
 +   ticks -= STp->stats->read_stamp;
 +   STp->stats->read_ticks += ticks;
 +   STp->stats->io_ticks += ticks;
 +   STp->stats->read_cnt++;
 +   if (req->errors)
 +   STp->stats->read_byte_cnt +=
 +   STp->stats->last_read_size -
 +   STp->buffer->cmdstat.residual;
 +   if (STp->buffer->cmdstat.residual > 0)
 +   STp->stats->resid_cnt++;
 +   else
 +   STp->stats->read_byte_cnt +=
 +   STp->stats->last_read_size;
 +   } else {
 +   ticks -= STp->stats->other_stamp;
 +   STp->stats->io_ticks += ticks;
 +   STp->stats->other_cnt++;
 +   }
 +}
 +
  static void st_scsi_execute_end(struct request *req, int uptodate)
  {
 struct st_request *SRpnt = req->end_io_data;
 @@ -480,6 +521,8 @@ static void st_scsi_execute_end(struct r
 STp->buffer->cmdstat.midlevel_result = SRpnt->result = req->errors;
 STp->buffer->cmdstat.residual = req->resid_len;

 +   st_do_stats(STp, req);
 +
 tmp = SRpnt->bio;
 if (SRpnt->waiting)
 complete(SRpnt->waiting);
 @@ -496,6 +539,7 @@ static int st_scsi_execute(struct st_req
 struct rq_map_data *mdata = &SRpnt->stp->buffer->map_data;
 int err = 0;
 int write = (data_direction == DMA_TO_DEVICE);
 +   struct scsi_tape *STp = SRpnt->stp;

 req = blk_get_request(SRpnt->stp->device->request_queue, write,
    GFP_KERNEL);
 @@ -516,6 +560,17 @@ static int st_scsi_execute(struct st_req
 }
 }

 +   if (cmd[0] == WRITE_6) {
 +   STp->stats->last_write_size = bufflen;
 +   STp->stats->write_stamp = get_jiffies_64();
 +   } else if (cmd[0] == READ_6) {
 +   STp->stats->last_read_size = bufflen;
 +   STp->stats->read_stamp = get_jiffies_64();
 +   } else {
 +   STp->stats->other_stamp = get_jiffies_64();
 +   }
 +   STp->stats->in_flight++;
 +
 SRpnt->bio = req->bio;
 req->cmd_len = COMMAND_SIZE(cmd[0]);
 memset(req->cmd, 0, BLK_MAX_CDB);
 @@ -4222,6 +4277,12 @@ static int st_probe(struct device *dev)

Re: [PATCH ] tcm_qla2xxx Add SCSI command jammer/discard capabilty to the tcm_qla2xxx module - revision2

2015-03-08 Thread Laurence Oberman

Hello Bart,
Thanks

Here is revision2

I added unlikely and removed messaging control as it is not necessary and adds 
overhead.

I use target LIO for all my storage array test targets and customer problem 
resolution here at Red Hat.
This patch resulted from a requirement to mimic behaviour of an expensive 
hardware jammer for a customer.
I have used this for some time with good success to simulate and reproduce 
latency and slow drain fabric issues and
for testing and validating error handling behaviour in the Emulex, Qlogic and 
other F/C drivers.

Works by checking the new parameter jam_host: if it is >= 0 and matches 
vha->host_no, the command is discarded. Jamming is enabled when jam_host >=0.
If the parameter is set to -1 (default), no jamming is enabled. 
I decided to share the patch, in the hope it may be useful for others but I do 
understand this is a special use case.
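
Because the parameter is only compared against a host number, writing a number
that does not belong to any FC host on the machine jams nothing. A small guard
along the following lines could catch that before a test run. It assumes FC
hosts show up as /sys/class/fc_host/hostN; the function name set_jam_host and
its two override arguments are hypothetical, added only so the check can be
tried against scratch directories.

```shell
#!/bin/bash
# Sketch only: refuse to jam a host number that is not an FC host.
# usage: set_jam_host HOST [PARAM_FILE] [FC_CLASS_DIR]
set_jam_host() {
    local host=$1
    local param=${2:-/sys/module/tcm_qla2xxx/parameters/jam_host}
    local fc_class=${3:-/sys/class/fc_host}

    # -1 switches jamming off and needs no host lookup.
    if [ "$host" != "-1" ] && [ ! -d "$fc_class/host$host" ]; then
        echo "set_jam_host: host$host not found under $fc_class" >&2
        return 1
    fi
    echo "$host" > "$param"
}
```

set_jam_host 6 enables jamming on host 6 if it exists, and set_jam_host -1
switches it off again.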


This Patch diff against 3.19.1

$ linux-3.19.1/scripts/checkpatch.pl latest-upstream-jammer-path 
total: 0 errors, 0 warnings, 60 lines checked

latest-upstream-jammer-path has no obvious style problems and is ready for 
submission.

Tested by: Laurence Oberman lober...@redhat.com
Signed-off-by: Laurence Oberman lober...@redhat.com

diff -Nurp a/Documentation/scsi/tcm_qla2xxx.txt 
b/Documentation/scsi/tcm_qla2xxx.txt
--- a/Documentation/scsi/tcm_qla2xxx.txt1969-12-31 19:00:00.0 
-0500
+++ b/Documentation/scsi/tcm_qla2xxx.txt2015-03-08 11:32:42.262181821 
-0400
@@ -0,0 +1,30 @@
+tcm_qla2xxx jammer parameter usage
+--
+There is now a new module parameter added to the tcm_qla2xxx module
+parm:   jam_host:Host to jam >=0 Enable jammer (int)
+
+Use this parameter to control the discarding of SCSI commands to a selected host.
+This may be useful for testing error handling and simulating slow drain and other
+fabric issues.
+
+Any value >=0 that matches a fc_host # will discard the commands for that host.
+Reset back to -1 to stop the jamming.
+
+Enable host 6 to be jammed
+echo 6 > /sys/module/tcm_qla2xxx/parameters/jam_host
+
+Disable jamming on host 6
+echo -1 > /sys/module/tcm_qla2xxx/parameters/jam_host
+
+Usage example script:
+
+#!/bin/bash
+sleep_time=120  ### Time to jam for
+echo 6 > /sys/module/tcm_qla2xxx/parameters/jam_host
+host=`cat /sys/module/tcm_qla2xxx/parameters/jam_host`
+echo We start to discard commands on SCSI host $host
+logger Jammer started
+sleep $sleep_time
+echo -1 > /sys/module/tcm_qla2xxx/parameters/jam_host
+echo We stopped the jammer
+logger Jammer stopped
diff -Nurp a/drivers/scsi/qla2xxx/tcm_qla2xxx.c 
b/drivers/scsi/qla2xxx/tcm_qla2xxx.c
--- a/drivers/scsi/qla2xxx/tcm_qla2xxx.c2015-03-08 10:13:31.798400426 
-0400
+++ b/drivers/scsi/qla2xxx/tcm_qla2xxx.c2015-03-08 11:00:53.002419568 
-0400
@@ -50,6 +50,10 @@
 #include "qla_target.h"
 #include "tcm_qla2xxx.h"
 
+int jam_host = -1;
+module_param(jam_host, int, 0644);
+MODULE_PARM_DESC(jam_host, "Host to jam >=0 Enable jammer");
+
 static struct workqueue_struct *tcm_qla2xxx_free_wq;
 static struct workqueue_struct *tcm_qla2xxx_cmd_wq;
 
@@ -571,6 +575,13 @@ static int tcm_qla2xxx_handle_cmd(scsi_q
return -EINVAL;
}
 
+   if (unlikely(vha->host_no == jam_host)) {
+   /*
+   return, and dont run target_submit_cmd, discarding command
+   */
+   return 0;
+   }
+
return target_submit_cmd(se_cmd, se_sess, cdb, &cmd->sense_buffer[0],
cmd->unpacked_lun, data_length, fcp_task_attr,
data_dir, flags);
@@ -2165,6 +2176,7 @@ static void tcm_qla2xxx_deregister_confi
 static int __init tcm_qla2xxx_init(void)
 {
int ret;
+   jam_host = -1;
 
ret = tcm_qla2xxx_register_configfs();
if (ret < 0)





Thank you for the consideration

Laurence Oberman
Red Hat Global Support Service
SEG Team

- Original Message -
From: Bart Van Assche bart.vanass...@sandisk.com
To: Laurence Oberman lober...@redhat.com, Andy Grover 
agro...@redhat.com, linux-scsi@vger.kernel.org, n...@daterainc.com
Cc: Laurence Oberman oberma...@gmail.com
Sent: Sunday, March 8, 2015 4:10:34 AM
Subject: Re: [PATCH ] tcm_qla2xxx  Add SCSI command jammer/discard capabilty to 
the tcm_qla2xxx module


On 03/08/2015 04:26 AM, Laurence Oberman wrote:
 Hello

 I use target LIO for all my storage array test targets and customer problem 
 resolution here at Red Hat.
 This patch resulted from a requirement to mimic behaviour of an expensive 
 hardware jammer for a customer.
 I have used this for some time with good success to simulate and reproduce 
 latency and slow drain fabric issues and
 for testing and validating error handling behaviour in the Emulex, Qlogic and 
 other F/C drivers.

 Works by checking jammer_flag==1 and host # and discards SCSI command, 
 controlled using echo to sys parameter.

 I decided to share the patch, in the hope it may be useful for others but I 
 do understand this is a special

Re: mvsas panics and dies when attached to a port extender on newer kernels

2015-04-14 Thread Laurence Oberman

Any chance you can capture a vmcore (kernel-only pages)? I will
provide an upload location.
Thanks
Laurence
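
If kdump with makedumpfile is the capture mechanism (an assumption; the thread
does not say which tool is in use), a "kernel only pages" dump corresponds to
filtering out the page types that makedumpfile can exclude through its -d dump
level. The sketch below just adds up the bit values documented for that option;
the function name dump_level and the page-type keywords are illustrative.

```shell
#!/bin/bash
# Sketch only: build a makedumpfile -d dump level from page types to exclude.
# Bit values follow the makedumpfile documentation:
#   1 zero pages, 2 non-private cache, 4 private cache, 8 user data, 16 free
dump_level() {
    local level=0 kind
    for kind in "$@"; do
        case $kind in
            zero)          level=$(( level | 1 )) ;;
            cache)         level=$(( level | 2 )) ;;
            cache_private) level=$(( level | 4 )) ;;
            user)          level=$(( level | 8 )) ;;
            free)          level=$(( level | 16 )) ;;
            *) echo "dump_level: unknown page type '$kind'" >&2; return 1 ;;
        esac
    done
    echo "$level"
}
```

Excluding all five types yields 31, the level usually passed as
makedumpfile -d 31 to keep the vmcore small enough to upload.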

On Tue, Apr 14, 2015 at 5:16 PM, James Bottomley
james.bottom...@hansenpartnership.com wrote:
 On Tue, 2015-04-14 at 14:03 -0700, Adam Talbot wrote:
 To make a very long debugging story short, I think there is an issue/bug
 with the mvsas driver. It works with older kernels and breaks on
 newer kernels.

 My Debian Jessie system was running great on a 3.18 kernel.  Changed
 cases to a newer Supermicro case with a SAS expander backplane (SAS933EL).
 That was the only hardware change.  Now, whenever I boot, the system kernel
 panics.

 3.2.65-1+deb7u2 works
 3.9.0 Gentoo CD works
 3.16+ all fail
 Attached are 3 kernel panics on 3.16+ kernels.

 Motherboard is a Supermicro X8SIE, with a Marvell Technology Group Ltd.
 88SE6440 SAS/SATA PCIe controller

 Is this a known bug?

 Well, you're the only person that's reported it so far.

 I think based on the above is that your configuration is a single
 expander attached SATA device ... and if you move it to be non expander
 attached it works fine?

 At this point I have two options:
 Stick with the old kernel (yuck)
 Buy a new card running a better supported chipset

 Any help would be greatly appreciated
 Thanks

 You didn't specify: does 3.15 work?  At least the highest working kernel
 version would help me narrow down potential problems.

 James



Re: [PATCH] st: trivial: remove form feed characters

2015-11-04 Thread Laurence Oberman

Reviewed-by: Laurence Oberman <lober...@redhat.com>

On Wed, Nov 4, 2015 at 4:52 AM, Maurizio Lombardi <mlomb...@redhat.com> wrote:
> Signed-off-by: Maurizio Lombardi <mlomb...@redhat.com>
> ---
>  drivers/scsi/st.c | 24 
>  1 file changed, 8 insertions(+), 16 deletions(-)
>
> diff --git a/drivers/scsi/st.c b/drivers/scsi/st.c
> index b37b9b0..7c4e518 100644
> --- a/drivers/scsi/st.c
> +++ b/drivers/scsi/st.c
> @@ -226,7 +226,6 @@ static DEFINE_SPINLOCK(st_use_lock);
>  static DEFINE_IDR(st_index_idr);
>
>
> -
>  #include "osst_detect.h"
>  #ifndef SIGS_FROM_OSST
>  #define SIGS_FROM_OSST \
> @@ -305,7 +304,6 @@ static char * st_incompatible(struct scsi_device* SDp)
> }
> return NULL;
>  }
> -
>
>  static inline char *tape_name(struct scsi_tape *tape)
>  {
> @@ -877,7 +875,7 @@ static int flush_buffer(struct scsi_tape *STp, int 
> seek_next)
> return result;
>
>  }
> -
> +
>  /* Set the mode parameters */
>  static int set_mode_densblk(struct scsi_tape * STp, struct st_modedef * STm)
>  {
> @@ -952,7 +950,7 @@ static void reset_state(struct scsi_tape *STp)
> STp->new_partition = STp->partition;
> }
>  }
> -
> +
>  /* Test if the drive is ready. Returns either one of the codes below or a 
> negative system
> error code. */
>  #define CHKRES_READY   0
> @@ -1241,7 +1239,7 @@ static int check_tape(struct scsi_tape *STp, struct 
> file *filp)
>  }
>
>
> - /* Open the device. Needs to take the BKL only because of incrementing the 
> SCSI host
> +/* Open the device. Needs to take the BKL only because of incrementing the 
> SCSI host
> module count. */
>  static int st_open(struct inode *inode, struct file *filp)
>  {
> @@ -1334,7 +1332,6 @@ static int st_open(struct inode *inode, struct file 
> *filp)
> return retval;
>
>  }
> -
>
>  /* Flush the tape buffer before close */
>  static int st_flush(struct file *filp, fl_owner_t id)
> @@ -1470,7 +1467,7 @@ static int st_release(struct inode *inode, struct file 
> *filp)
>
> return result;
>  }
> -
> +
>  /* The checks common to both reading and writing */
>  static ssize_t rw_checks(struct scsi_tape *STp, struct file *filp, size_t 
> count)
>  {
> @@ -1889,7 +1886,7 @@ st_write(struct file *filp, const char __user *buf, 
> size_t count, loff_t * ppos)
>
> return retval;
>  }
> -
> +
>  /* Read data from the tape. Returns zero in the normal case, one if the
> eof status has changed, and the negative error code in case of a
> fatal error. Otherwise updates the buffer and the eof state.
> @@ -2085,7 +2082,6 @@ static long read_tape(struct scsi_tape *STp, long count,
> }
> return retval;
>  }
> -
>
>  /* Read command */
>  static ssize_t
> @@ -2233,7 +2229,6 @@ st_read(struct file *filp, char __user *buf, size_t 
> count, loff_t * ppos)
>
> return retval;
>  }
> -
>
>
>  DEB(
> @@ -2447,7 +2442,7 @@ static int st_set_options(struct scsi_tape *STp, long 
> options)
>
> return 0;
>  }
> -
> +
>  #define MODE_HEADER_LENGTH  4
>
>  /* Mode header and page byte offsets */
> @@ -2665,7 +2660,7 @@ static int do_load_unload(struct scsi_tape *STp, struct 
> file *filp, int load_cod
>
> return retval;
>  }
> -
> +
>  #if DEBUG
>  #define ST_DEB_FORWARD  0
>  #define ST_DEB_BACKWARD 1
> @@ -3091,7 +3086,6 @@ static int st_int_ioctl(struct scsi_tape *STp, unsigned 
> int cmd_in, unsigned lon
>
> return ioctl_result;
>  }
> -
>
>  /* Get the tape position. If bt == 2, arg points into a kernel space mt_loc
> structure. */
> @@ -3283,7 +3277,7 @@ static int switch_partition(struct scsi_tape *STp)
> STps->last_block_visited = 0;
> return set_location(STp, STps->last_block_visited, 
> STp->new_partition, 1);
>  }
> -
> +
>  /* Functions for reading and writing the medium partition mode page. */
>
>  #define PART_PAGE   0x11
> @@ -3396,7 +3390,6 @@ static int partition_tape(struct scsi_tape *STp, int 
> size)
>
> return result;
>  }
> -
>
>
>  /* The ioctl command */
> @@ -3766,7 +3759,6 @@ static long st_compat_ioctl(struct file *file, unsigned 
> int cmd, unsigned long a
>  }
>  #endif
>
> -
>
>  /* Try to allocate a new tape buffer. Calling function must not hold
> dev_arr_lock. */
> --
> Maurizio Lombardi
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] st: allow debug output to be enabled or disabled via sysfs

2015-10-12 Thread Laurence Oberman

I support this addition as it can be done without the module reload
provided by my prior patch.

Reviewed-by: Laurence Oberman <oberma...@gmail.com>
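
For reference, the sysfs toggle that the quoted patch adds could be wrapped
roughly as follows. This is a sketch under the patch's own description: the
file is /sys/bus/scsi/drivers/st/debug_flag, it accepts '0' or '1', and it only
exists when st.c is built with DEBUG non-zero. The function name st_debug and
the optional file argument (for trying the helper against a scratch file) are
assumptions made for illustration.

```shell
#!/bin/bash
# Sketch only: switch st driver debug output on or off through sysfs.
# usage: st_debug on|off [FLAG_FILE]
st_debug() {
    local flag=${2:-/sys/bus/scsi/drivers/st/debug_flag}

    # The attribute is absent when the driver was built with DEBUG set to 0.
    if [ ! -e "$flag" ]; then
        echo "st_debug: $flag not present (debug support not compiled in?)" >&2
        return 1
    fi

    case $1 in
        on)  echo 1 > "$flag" ;;
        off) echo 0 > "$flag" ;;
        *)   echo "usage: st_debug on|off [file]" >&2; return 2 ;;
    esac
}
```

Unlike the module parameter, this can be flipped while a tape drive is open and
in use, which is the case the patch description calls out.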

On Mon, Oct 12, 2015 at 12:31 AM, Seymour, Shane M
<shane.seym...@hpe.com> wrote:
>
> Change st driver to allow enabling or disabling debug output
> via sysfs file /sys/bus/scsi/drivers/st/debug_flag.
>
> Previously the only way to enable debug output was:
>
> 1. loading the driver with the module parameter debug_flag=1
> 2. an ioctl call (this method was also the only way to dynamically
> disable debug output).
>
> To use the ioctl you need a second tape drive (if you are
> actively testing the first tape drive) since a second process
> cannot open the first tape drive if it is in use.
>
> The this change is only functional if the value of the macro
> DEBUG in st.c is a non-zero value (which it is by default).
>
> Signed-off-by: Shane Seymour <shane.seym...@hpe.com>
> ---
> --- a/drivers/scsi/st.c 2015-10-06 17:11:16.299801789 -0500
> +++ b/drivers/scsi/st.c 2015-10-11 14:45:10.595060995 -0500
> @@ -4452,11 +4452,41 @@ static ssize_t version_show(struct devic
>  }
>  static DRIVER_ATTR_RO(version);
>
> +#if DEBUG
> +static ssize_t debug_flag_store(struct device_driver *ddp,
> +   const char *buf, size_t count)
> +{
> +/* We only care what the first byte of the data is the rest is unused.
> + * if it's a '1' we turn on debug and if it's a '0' we disable it. All
> + * other values have -EINVAL returned if they are passed in.
> + */
> +   if (count > 0) {
> +   if (buf[0] == '0') {
> +   debugging = NO_DEBUG;
> +   return count;
> +   } else if (buf[0] == '1') {
> +   debugging = 1;
> +   return count;
> +   }
> +   }
> +   return -EINVAL;
> +}
> +
> +static ssize_t debug_flag_show(struct device_driver *ddp, char *buf)
> +{
> +   return scnprintf(buf, PAGE_SIZE, "%d\n", debugging);
> +}
> +static DRIVER_ATTR_RW(debug_flag);
> +#endif
> +
>  static struct attribute *st_drv_attrs[] = {
> _attr_try_direct_io.attr,
> _attr_fixed_buffer_size.attr,
> _attr_max_sg_segs.attr,
> _attr_version.attr,
> +#if DEBUG
> +   _attr_debug_flag.attr,
> +#endif
> NULL,
>  };
>  ATTRIBUTE_GROUPS(st_drv);
> diff -uprN a/Documentation/ABI/testing/sysfs-driver-st 
> b/Documentation/ABI/testing/sysfs-driver-st
> --- a/Documentation/ABI/testing/sysfs-driver-st 1969-12-31 18:00:00.0 
> -0600
> +++ b/Documentation/ABI/testing/sysfs-driver-st 2015-10-11 14:28:43.537128220 
> -0500
> @@ -0,0 +1,12 @@
> +What:  /sys/bus/scsi/drivers/st/debug_flag
> +Date:  October 2015
> +Kernel Version:?.?
> +Contact:   shane.seym...@hpe.com
> +Description:
> +   This file allows you to turn debug output from the st driver
> +   off if you write a '0' to the file or on if you write a '1'.
> +   Note that debug output requires that the module be compiled
> +   with the #define DEBUG set to a non-zero value (this is the
> +   default). If DEBUG is set to 0 then this file will not
> +   appear in sysfs as its presence is conditional upon debug
> +   output support being compiled into the module.
> --- a/Documentation/scsi/st.txt 2015-10-06 17:11:12.323802060 -0500
> +++ b/Documentation/scsi/st.txt 2015-10-11 14:19:48.176164681 -0500
> @@ -569,7 +569,9 @@ Debugging code is now compiled in by def
>  with the kernel module parameter debug_flag defaulting to 0.  Debugging
>  can still be switched on and off with an ioctl.  To enable debug at
>  module load time add debug_flag=1 to the module load options, the
> -debugging output is not voluminous.
> +debugging output is not voluminous. Debugging can also be enabled
> +and disabled by writing a '0' (disable) or '1' (enable) to the sysfs
> +file /sys/bus/scsi/drivers/st/debug_flag.
>
>  If the tape seems to hang, I would be very interested to hear where
>  the driver is waiting. With the command 'ps -l' you can see the state
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: st driver doesn't seem to grok LTO partitioning

2016-01-06 Thread Laurence Oberman

Thanks Doug
Trying that now


Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services

- Original Message -
From: "Douglas Gilbert" <dgilb...@interlog.com>
To: "Laurence Oberman" <lober...@redhat.com>, "Emmanuel Florac" 
<eflo...@intellique.com>
Cc: "Laurence Oberman" <oberma...@gmail.com>, "Kai Makisara" 
<kai.makis...@kolumbus.fi>, linux-scsi@vger.kernel.org
Sent: Wednesday, January 6, 2016 10:48:44 AM
Subject: Re: st driver doesn't seem to grok LTO partitioning

On 16-01-06 10:32 AM, Laurence Oberman wrote:
> Firmware update fails as follows:
>
> Still researching. This is the only LTO5 I have access to so unless Shane has 
> one I may not be able to make progress.
> (Its way long out of warranty and support)
>
> We mostly use this for generic st driver and changer testing for RHEL and it 
> has not been updated for at least two years.
>
> Performing FUP operation...
>
> Checking image file (/root/V3210A011-E00.IMG)
>
> Checking device readiness
>
> Sending image file to the device
>
> Redetecting device
> Fup drive command failed: Unknown error! (status = -100)
>
> Host adapter status = 0x00
> Driver status = 0x08
> Error buffer = 'MSG: FupDrive() - Error committing image file to drive 
> (/root/V3210A011-E00.IMG) 1584236 of 1584236 bytes written.
> SCSI: WriteBuffer()::DevIo() - ErrorCode (0x70h) ,Sense Key (0x05h) ILLEGAL 
> REQUEST, INVALID FIELD IN PARAMETER LIST. ASC(0x26h), ASCQ(0x00h) - )
> '
>
> Unable to perform FUP operation.

The 1584236 byte firmware image might be too big for a single
WRITE BUFFER command. You might try getting a recent version of
sg3_utils and doing something like:
sg_write_buffer -b 4k -I V3210A011-E00.IMG -m 7 /dev/sg3

where /dev/sg3 corresponds to your tape drive. 'lsscsi -g' will
show you the mapping.
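
The two steps above (finding the sg node and splitting the image into chunks)
can be sketched as below. This assumes lsscsi -g prints the device type in the
second column and the sg node in the last one, and that 4k means 4096 bytes per
WRITE BUFFER command; both helper names are made up for illustration.

```shell
#!/bin/bash
# Sketch only: helpers around the sg_write_buffer suggestion.

# tape_sg_nodes: read `lsscsi -g` output on stdin and print the sg node
# of every device whose type column is "tape".
tape_sg_nodes() {
    awk '$2 == "tape" { print $NF }'
}

# chunk_count IMAGE_BYTES CHUNK_BYTES -> number of WRITE BUFFER commands
# needed to send the whole image (ceiling division).
chunk_count() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

# Typical use on a live system (not run here):
#   lsscsi -g | tape_sg_nodes
```

For the 1584236-byte image in the log, chunk_count 1584236 4096 gives 387
transfers instead of one large one.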

The above technique works fine for recent Seagate SAS disks (with
".LOD" firmware images).

Doug Gilbert


Re: st driver doesn't seem to grok LTO partitioning

2016-01-06 Thread Laurence Oberman

Hello Emmanuel

I am using this device; it's an Ultrium 5 (LTO5).
It's an older changer and I am unable to update the firmware; I am still working on 
that.

What version of mt are you using? I am testing on a RHEL 7.2 base with the 
patched upstream kernel.

Linux example.redhat.com 4.3.3 #1 SMP Tue Jan 5 15:58:47 EST 2016 x86_64 x86_64 
x86_64 GNU/Linux

# tapeinfo -f /dev/st0
Product Type: Tape Drive
Vendor ID: 'QUANTUM '
Product ID: 'ULTRIUM 5   '
Revision: '3060'
Attached Changer API: No
SerialNumber: 'HU1023AKHE'
MinBlock: 1
MaxBlock: 16777215
SCSI ID: 0
SCSI LUN: 0
Ready: yes
BufferedMode: yes
Medium Type: Not Loaded
Density Code: 0x58
BlockSize: 512
DataCompEnabled: yes
DataCompCapable: yes
DataDeCompEnabled: yes
CompType: 0x1
DeCompType: 0x1
BOP: yes
Block Position: 0
Partition 0 Remaining Kbytes: 1541692
Partition 0 Size in Kbytes: 1541692
ActivePartition: 0
EarlyWarningSize: 0
NumPartitions: 0
MaxPartitions: 0

The drive is working fine:

# mt -f /dev/st0 status
SCSI 2 tape drive:
File number=0, block number=0, partition=0.
Tape block size 512 bytes. Density code 0x58 (no translation).
Soft error count since last status=0
General status bits on (4101):
 BOT ONLINE IM_REP_EN

This is what I get when I try to partition. I believe this may be a 
firmware issue on my side.

mt -f /dev/st0  stsetoption can-partitions

[ 5343.620005] st 0:0:0:0: [st0] Block limits 1 - 16777215 bytes.
[ 5343.621424] st 0:0:0:0: [st0] Mode sense. Length 11, medium 0, WBS 10, BLL 8
[ 5343.622005] st 0:0:0:0: [st0] Density 58, tape length: 0, drv buffer: 1
[ 5343.622606] st 0:0:0:0: [st0] Block size: 512, buffer size: 4096 (8 blocks).
[ 5343.623208] st 0:0:0:0: [st0] Mode 0 options: buffer writes: 1, async 
writes: 1, read ahead: 1
[ 5343.623810] st 0:0:0:0: [st0] can bsr: 1, two FMs: 0, fast mteom: 0, 
auto lock: 0,
[ 5343.624413] st 0:0:0:0: [st0] defs for wr: 0, no block limits: 0, 
partitions: 1, s2 log: 0
[ 5343.625011] st 0:0:0:0: [st0] sysv: 0 nowait: 0 sili: 0 nowait_filemark: 0
[ 5343.625623] st 0:0:0:0: [st0] debugging: 1
[ 5343.626222] st 0:0:0:0: [st0] Rewinding tape.

# mt -f /dev/nst0  mkpartition 1
/dev/nst0: Input/output error





Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services

- Original Message -
From: "Emmanuel Florac" <eflo...@intellique.com>
To: "Laurence Oberman" <lober...@redhat.com>
Cc: "Laurence Oberman" <oberma...@gmail.com>, "Kai Makisara" 
<kai.makis...@kolumbus.fi>, linux-scsi@vger.kernel.org
Sent: Wednesday, January 6, 2016 10:10:49 AM
Subject: Re: st driver doesn't seem to grok LTO partitioning

Le Tue, 5 Jan 2016 16:55:04 -0500 (EST)
Laurence Oberman <lober...@redhat.com> écrivait:

> mt -f /dev/nst0  mkpartition 1
> 

What is the type of drive exactly? I still couldn't test with the LTO-5
drive as the machine it's connected to is being reinstalled.

-- 

Emmanuel Florac |   Direction technique
|   Intellique
|   <eflo...@intellique.com>
|   +33 1 78 94 84 02

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: st driver doesn't seem to grok LTO partitioning

2016-01-06 Thread Laurence Oberman

The firmware update fails as shown further down.

Still researching. This is the only LTO5 I have access to, so unless Shane has 
one I may not be able to make progress.
(It is long out of warranty and support.)

We mostly use this for generic st driver and changer testing for RHEL, and it 
has not been updated for at least two years.

Performing FUP operation...

Checking image file (/root/V3210A011-E00.IMG)

Checking device readiness

Sending image file to the device

Redetecting device
Fup drive command failed: Unknown error! (status = -100)

Host adapter status = 0x00
Driver status = 0x08
Error buffer = 'MSG: FupDrive() - Error committing image file to drive 
(/root/V3210A011-E00.IMG) 1584236 of 1584236 bytes written.
SCSI: WriteBuffer()::DevIo() - ErrorCode (0x70h) ,Sense Key (0x05h) ILLEGAL 
REQUEST, INVALID FIELD IN PARAMETER LIST. ASC(0x26h), ASCQ(0x00h) - )
'

Unable to perform FUP operation.


Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services


Re: st driver doesn't seem to grok LTO partitioning

2016-01-06 Thread Laurence Oberman

I left out the log of the partitioning failure.

Here it is:

# mt -f /dev/nst0  mkpartition 1
/dev/nst0: Input/output error

[ 5499.341648] st 0:0:0:0: [st0] Block limits 1 - 16777215 bytes.
[ 5499.342903] st 0:0:0:0: [st0] Mode sense. Length 11, medium 0, WBS 10, BLL 8
[ 5499.343523] st 0:0:0:0: [st0] Density 58, tape length: 0, drv buffer: 1
[ 5499.344114] st 0:0:0:0: [st0] Block size: 512, buffer size: 4096 (8 blocks).
[ 5499.344702] st 0:0:0:0: [st0] Loading tape.
[ 5499.359733] st 0:0:0:0: [st0] Block limits 1 - 16777215 bytes.
[ 5499.360970] st 0:0:0:0: [st0] Mode sense. Length 11, medium 0, WBS 10, BLL 8
[ 5499.361584] st 0:0:0:0: [st0] Density 58, tape length: 0, drv buffer: 1
[ 5499.362165] st 0:0:0:0: [st0] Block size: 512, buffer size: 4096 (8 blocks).
[ 5499.363851] st 0:0:0:0: [st0] Partition page length is 10 bytes.
[ 5499.364468] st 0:0:0:0: [st0] PP: max 0, add 0, xdp 0, psum 03, pofmetc 
0,rec 03, units 09, sizes: 1541 65535
[ 5499.365074] st 0:0:0:0: [st0] MP: 11 08 00 00 18 03 09 00 06 05 ff ff
[ 5499.365658] st 0:0:0:0: [st0] psd_cnt 2, max.parts 0, nbr_parts 0
[ 5499.366246] st 0:0:0:0: [st0] Formatting tape with two partitions (FDP).
[ 5499.366826] st 0:0:0:0: [st0] Sent partition page length is 12 bytes.  
needs_format: 0
[ 5499.367424] st 0:0:0:0: [st0] PP: max 0, add 1, xdp 4, psum 03, pofmetc 0 
rec 03, units 00, sizes: 65535 65535
[ 5499.368024] st 0:0:0:0: [st0] MP: 11 0a 00 01 98 03 00 00 ff ff ff ff
[ 5499.369842] st 0:0:0:0: [st0] Error: 802, cmd: 15 10 0 0 18 0
[ 5499.370495] st 0:0:0:0: [st0] Sense Key : Illegal Request [current] 
[ 5499.371109] st 0:0:0:0: [st0] Add. Sense: Invalid field in parameter list
[ 5499.371714] st 0:0:0:0: [st0] Partitioning of tape failed.
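
For anyone following the hex dumps: the "MP:" lines can be decoded by hand into the fields shown on the "PP:" lines. The sketch below does that in Python. The byte offsets are an assumption, inferred from matching the two dumps in this log against their "PP:" summaries, and decode_partition_page is a hypothetical helper, not part of st or mt.

```python
# Sketch: decode the 12-byte partition mode page dumps ("MP:" lines)
# into the fields that the st debug "PP:" lines report.
# Offsets are assumed from comparing the dumps with the log output.

def decode_partition_page(hex_dump):
    """Decode a space-separated hex dump of the partition mode page."""
    b = bytes.fromhex(hex_dump.replace(" ", ""))
    flags = b[4]
    return {
        "page_code": b[0] & 0x3F,       # 0x11 for this page
        "page_length": b[1],            # bytes following the 2-byte header
        "max": b[2],                    # max additional partitions
        "add": b[3],                    # additional partitions defined
        "xdp": flags >> 5,              # top three flag bits
        "psum": (flags >> 3) & 0x3,     # partition size unit of measure
        "pofmetc": flags & 0x7,         # low three flag bits
        "rec": b[5],                    # medium format recognition
        "units": b[6],                  # partition units
        # Remaining bytes are 16-bit big-endian partition sizes.
        "sizes": [int.from_bytes(b[i:i + 2], "big")
                  for i in range(8, len(b), 2)],
    }


read_back = decode_partition_page("11 08 00 00 18 03 09 00 06 05 ff ff")
sent = decode_partition_page("11 0a 00 01 98 03 00 00 ff ff ff ff")
print("read back:", read_back)
print("sent:     ", sent)
```

Decoded this way, the page read back from the drive reports max 0 and add 0, while the page sent asks for add 1 with xdp 4. That appears consistent with the drive rejecting the request as an invalid field in the parameter list, although the log alone does not say which field it objected to.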

Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services


Re: st driver doesn't seem to grok LTO partitioning

2016-01-05 Thread Laurence Oberman

I am testing the patch here in the lab. It seems my firmware will need to be 
updated to support more than one partition.
Looking into that now.

[  193.647807] st: Version 20160104, fixed bufsize 32768, s/g segs 256
[  193.648992] st: Debugging enabled debug_flag = 1
[  193.650907] st 0:0:0:0: Attached scsi tape st0
[  193.652046] st 0:0:0:0: st0: try direct i/o: yes (alignment 4 B)

[  280.069260] st 0:0:0:0: [st0] Block limits 1 - 16777215 bytes.
[  280.070543] st 0:0:0:0: [st0] Mode sense. Length 11, medium 0, WBS 10, BLL 8
[  280.073068] st 0:0:0:0: [st0] Density 58, tape length: 0, drv buffer: 1
[  280.073725] st 0:0:0:0: [st0] Block size: 0, buffer size: 4096 (1 blocks).

 mt -f /dev/st0  stsetoption can-partitions

[  676.835972] st 0:0:0:0: [st0] Block limits 1 - 16777215 bytes.
[  676.837403] st 0:0:0:0: [st0] Mode sense. Length 11, medium 0, WBS 10, BLL 8
[  676.838404] st 0:0:0:0: [st0] Density 58, tape length: 0, drv buffer: 1
[  676.838880] st 0:0:0:0: [st0] Block size: 0, buffer size: 4096 (1 blocks).
[  676.840383] st 0:0:0:0: [st0] Mode 0 options: buffer writes: 1, async 
writes: 1, read ahead: 1
[  676.840880] st 0:0:0:0: [st0] can bsr: 1, two FMs: 0, fast mteom: 0, 
auto lock: 0,
[  676.842424] st 0:0:0:0: [st0] defs for wr: 0, no block limits: 0, 
partitions: 1, s2 log: 0
[  676.842937] st 0:0:0:0: [st0] sysv: 0 nowait: 0 sili: 0 nowait_filemark: 0
[  676.844524] st 0:0:0:0: [st0] debugging: 1
[  676.845042] st 0:0:0:0: [st0] Rewinding tape.

mt -f /dev/nst0  mkpartition 1

[  798.711408] st 0:0:0:0: [st0] Block limits 1 - 16777215 bytes.
[  798.712799] st 0:0:0:0: [st0] Mode sense. Length 11, medium 0, WBS 10, BLL 8
[  798.713948] st 0:0:0:0: [st0] Density 58, tape length: 0, drv buffer: 1
[  798.714504] st 0:0:0:0: [st0] Block size: 0, buffer size: 4096 (1 blocks).
[  798.716227] st 0:0:0:0: [st0] Loading tape.
[  798.731230] st 0:0:0:0: [st0] Block limits 1 - 16777215 bytes.
[  798.732874] st 0:0:0:0: [st0] Mode sense. Length 11, medium 0, WBS 10, BLL 8
[  798.734269] st 0:0:0:0: [st0] Density 58, tape length: 0, drv buffer: 1
[  798.734971] st 0:0:0:0: [st0] Block size: 0, buffer size: 4096 (1 blocks).
[  798.737572] st 0:0:0:0: [st0] Partition page length is 10 bytes.
[  798.739162] st 0:0:0:0: [st0] PP: max 0, add 0, xdp 0, psum 03, pofmetc 
0,rec 03, units 09, sizes: 1541 65535
[  798.739974] st 0:0:0:0: [st0] MP: 11 08 00 00 18 03 09 00 06 05 ff ff
[  798.740810] st 0:0:0:0: [st0] psd_cnt 2, max.parts 0, nbr_parts 0
[  798.744194] st 0:0:0:0: [st0] Formatting tape with two partitions (FDP).
[  798.745045] st 0:0:0:0: [st0] Sent partition page length is 12 bytes.  
needs_format: 0
[  798.747718] st 0:0:0:0: [st0] PP: max 0, add 1, xdp 4, psum 03, pofmetc 0 
rec 03, units 00, sizes: 65535 65535
[  798.748558] st 0:0:0:0: [st0] MP: 11 0a 00 01 98 03 00 00 ff ff ff ff
[  798.752622] st 0:0:0:0: [st0] Error: 802, cmd: 15 10 0 0 18 0
[  798.753465] st 0:0:0:0: [st0] Sense Key : Illegal Request [current] 
[  798.754289] st 0:0:0:0: [st0] Add. Sense: Invalid field in parameter list
[  798.757546] st 0:0:0:0: [st0] Partitioning of tape failed.



Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services


From: Emmanuel Florac <eflo...@intellique.com>
Date: Mon, Jan 4, 2016 at 6:46 AM
Subject: Re: st driver doesn't seem to grok LTO partitioning
To: Kai Makisara <kai.makis...@kolumbus.fi>
Cc: linux-scsi@vger.kernel.org


Le Mon, 4 Jan 2016 12:22:34 +0200 (EET)
Kai Makisara <kai.makis...@kolumbus.fi> écrivait:

> Here is again a new version of the patch. This does load before
> partitioning. The code performing default partitioning (FDP=1) has
> also been slightly modified (two more bits of the original mode page
> retained).
>
> The patch has been tested with my DDS-4 drive.

That works fine for me. I'm going to do some testing with other drives
I have (LTO-3 -- should fail -- and LTO-5).

# modprobe st

Jan  4 12:31:53 shakuhachi kernel: st: Version 20160104, fixed bufsize
32768, s/g segs 256
Jan  4 12:31:53 shakuhachi kernel: st 7:0:0:0: Attached scsi tape st0
Jan  4 12:31:53 shakuhachi kernel: st 7:0:0:0: st0: try direct i/o: yes
(alignment 512 B)
Jan  4 12:31:53 shakuhachi kernel: st 7:0:0:0: st0: Block limits 1 -
16777215 bytes.
Jan  4 12:31:53 shakuhachi kernel: st 7:0:0:0: st0: Mode sense. Length 11,
medium 0, WBS 10, BLL 8
Jan  4 12:31:53 shakuhachi kernel: st 7:0:0:0: st0: Density 5a, tape
length: 0, drv buffer: 1
Jan  4 12:31:53 shakuhachi kernel: st 7:0:0:0: st0: Block size: 0, buffer
size: 4096 (1 blocks).
Jan  4 12:31:53 shakuhachi kernel: st 7:0:0:0: st0: Block limits 1 -
16777215 bytes.
Jan  4 12:31:53 shakuhachi kernel: st 7:0:0:0: st0: Mode sense. Length 11,
medium 0, WBS 10, BLL 8
Jan  4 12:31:53 shakuhachi kernel: st 7:0:0:0: st0: Density 5a, tape
length: 0, drv buffer: 1
Jan  4 12:31:53 shakuhachi kernel: st 7:0:0:0: st0: Block size: 0, buffer
size: 4096 (1 blocks).


# mt -f /dev/st0

Re: st driver doesn't seem to grok LTO partitioning

2015-12-22 Thread Laurence Oberman

I am just waiting on some LTO5 tape cartridges and will then start
working on this.
I only have LTO cartridges, so I had to order a couple of LTO5s.

On Tue, Dec 22, 2015 at 5:04 AM, Emmanuel Florac <eflo...@intellique.com> wrote:
> Le Tue, 22 Dec 2015 02:20:31 -0500
> Laurence Oberman <oberma...@gmail.com> écrivait:
>
>> I also have access to newer hardware if needed. I have started
>> reviewing all of this and will post back to this thread.
>> Emmanuel can you summarize what you would like to achieve and we will
>> all work on this together.
>
> I'd like to be able to partition LTO media through standard commands,
> like "mt mkpartition", mostly to be able to create LTFS tapes without
> relying on hard to compile code from IBM/HP/Quantum/Oracle.
>
> --
> 
> Emmanuel Florac |   Direction technique
> |   Intellique
> |   <eflo...@intellique.com>
> |   +33 1 78 94 84 02
> 

Re: [PATCH] bnx2i: fix spelling mistake "complection" -> "completion"

2016-06-04 Thread Laurence Oberman



- Original Message -
> From: "Colin King" <colin.k...@canonical.com>
> To: qlogic-storage-upstr...@qlogic.com, "James E . J . Bottomley" 
> <j...@linux.vnet.ibm.com>, "Martin K . Petersen"
> <martin.peter...@oracle.com>, linux-scsi@vger.kernel.org
> Cc: linux-ker...@vger.kernel.org
> Sent: Saturday, June 4, 2016 3:14:30 PM
> Subject: [PATCH] bnx2i: fix spelling mistake "complection" -> "completion"
> 
> From: Colin Ian King <colin.k...@canonical.com>
> 
> trivial fix to spelling mistake in printk message
> 
> Signed-off-by: Colin Ian King <colin.k...@canonical.com>
> ---
>  drivers/scsi/bnx2i/bnx2i_hwi.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/scsi/bnx2i/bnx2i_hwi.c b/drivers/scsi/bnx2i/bnx2i_hwi.c
> index fb072cc..42921db 100644
> --- a/drivers/scsi/bnx2i/bnx2i_hwi.c
> +++ b/drivers/scsi/bnx2i/bnx2i_hwi.c
> @@ -2417,7 +2417,7 @@ static void bnx2i_process_conn_destroy_cmpl(struct
> bnx2i_hba *hba,
>   ep = bnx2i_find_ep_in_destroy_list(hba, conn_destroy->iscsi_conn_id);
>   if (!ep) {
>   printk(KERN_ALERT "bnx2i_conn_destroy_cmpl: no pending "
> -   "offload request, unexpected complection\n");
> +   "offload request, unexpected completion\n");
>   return;
>   }
>  
> --
> 2.8.1
> 
> 

Simple fix.
Reviewed-by: Laurence Oberman <lober...@redhat.com>

Re: [PREEMPT-RT] [PATCH v2] scsi/fcoe: convert to kworker

2016-06-09 Thread Laurence Oberman



- Original Message -
> From: "Sebastian Andrzej Siewior" <bige...@linutronix.de>
> To: "Laurence Oberman" <lober...@redhat.com>, "James Bottomley" 
> <j...@linux.vnet.ibm.com>
> Cc: "Christoph Hellwig" <h...@infradead.org>, linux-scsi@vger.kernel.org, 
> "Martin K. Petersen"
> <martin.peter...@oracle.com>, "Vasu Dev" <vasu@intel.com>, 
> r...@linutronix.de, fcoe-de...@open-fcoe.org, "Chad
> Dupuis" <chad.dup...@qlogic.com>
> Sent: Thursday, June 9, 2016 9:09:37 AM
> Subject: Re: [PREEMPT-RT] [PATCH v2] scsi/fcoe: convert to kworker
> 
> On 04/22/2016 06:39 PM, Laurence Oberman wrote:
> > I have fcoe for testing.
> > I will pull this in next week and test it.
> 
> any update?
> 
> > 
> > Laurence Oberman
> > Principal Software Maintenance Engineer
> > Red Hat Global Support Services
> 
> Sebastian
> 
> 
Hello
Apologies, somehow this fell off my radar.
I will get the FCOE test bed up and get it done ASAP.

Regards
Laurence

Re: [PATCH] tcm_qla2xxx Add SCSI command jammer/discard capability to the tcm_qla2xxx module

2016-05-25 Thread Laurence Oberman



- Original Message -
> From: "Himanshu Madhani" <himanshu.madh...@qlogic.com>
> To: "Laurence Oberman" <lober...@redhat.com>, "Nicholas A. Bellinger" 
> <n...@linux-iscsi.org>
> Cc: "Bart Van Assche" <bart.vanass...@sandisk.com>, "linux-scsi" 
> <linux-scsi@vger.kernel.org>, "target-devel"
> <target-de...@vger.kernel.org>, "Quinn Tran" <quinn.t...@qlogic.com>
> Sent: Monday, May 9, 2016 1:08:36 PM
> Subject: Re: [PATCH]  tcm_qla2xxx Add SCSI command jammer/discard capability 
> to the tcm_qla2xxx module
> 
> On 5/9/16, 7:56 AM, "Laurence Oberman" <lober...@redhat.com> wrote:
> 
> 
> 
> >
> >
> >- Original Message -
> >> From: "Laurence Oberman" <lober...@redhat.com>
> >> To: "Nicholas A. Bellinger" <n...@linux-iscsi.org>
> >> Cc: "Himanshu Madhani" <himanshu.madh...@qlogic.com>, "Bart Van Assche"
> >> <bart.vanass...@sandisk.com>, "linux-scsi"
> >> <linux-scsi@vger.kernel.org>, "target-devel"
> >> <target-de...@vger.kernel.org>, "Quinn Tran" <quinn.t...@qlogic.com>
> >> Sent: Monday, April 4, 2016 6:50:03 PM
> >> Subject: Re: [PATCH]  tcm_qla2xxx Add SCSI command jammer/discard
> >> capability to the tcm_qla2xxx module
> >> 
> >> Hello Nicholas
> >> 
> >> Its fixed now.
> >> Many Thanks.
> >> 
> >> $ scripts/checkpatch.pl
> >> 0001-tcm_qla2xxx-Add-SCSI-command-jammer-discard-capabili.patch
> >> WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
> >> #12:
> >> new file mode 100644
> >> 
> >> total: 0 errors, 1 warnings, 91 lines checked
> >> 
> >> 0001-tcm_qla2xxx-Add-SCSI-command-jammer-discard-capabili.patch has style
> >> problems, please review.
> >> 
> >> NOTE: If any of the errors are false positives, please report
> >>   them to the maintainer, see CHECKPATCH in MAINTAINERS.
> >> 
> >> 
> >> 
> >> Tested by: Laurence Oberman <lober...@redhat.com>
> >> Signed-off-by: Laurence Oberman <lober...@redhat.com>
> >> ---
> >>  Documentation/scsi/tcm_qla2xxx.txt |   22 ++
> >>  drivers/scsi/qla2xxx/Kconfig   |9 +
> >>  drivers/scsi/qla2xxx/tcm_qla2xxx.c |   20 
> >>  drivers/scsi/qla2xxx/tcm_qla2xxx.h |1 +
> >>  4 files changed, 52 insertions(+), 0 deletions(-)
> >>  create mode 100644 Documentation/scsi/tcm_qla2xxx.txt
> >> 
> >> diff --git a/Documentation/scsi/tcm_qla2xxx.txt
> >> b/Documentation/scsi/tcm_qla2xxx.txt
> >> new file mode 100644
> >> index 000..c3a670a
> >> --- /dev/null
> >> +++ b/Documentation/scsi/tcm_qla2xxx.txt
> >> @@ -0,0 +1,22 @@
> >> +tcm_qla2xxx jam_host attribute
> >> +--
> >> +There is now a new module endpoint attribute called jam_host
> >> +attribute: jam_host: boolean=0/1
> >> +This attribute and accompanying code is only included if the
> >> +Kconfig parameter TCM_QLA2XXX_DEBUG is set to Y
> >> +By default this jammer code and functionality is disabled
> >> +
> >> +Use this attribute to control the discarding of SCSI commands to a
> >> +selected host.
> >> +This may be useful for testing error handling and simulating slow drain
> >> +and other fabric issues.
> >> +
> >> +Setting a boolean of 1 for the jam_host attribute for a particular host
> >> + will discard the commands for that host.
> >> +Reset back to 0 to stop the jamming.
> >> +
> >> +Enable host 4 to be jammed
> >> +echo 1 >
> >> /sys/kernel/config/target/qla2xxx/21:00:00:24:ff:27:8f:ae/tpgt_1/attrib/jam_host
> >> +
> >> +Disable jamming on host 4
> >> +echo 0 >
> >> /sys/kernel/config/target/qla2xxx/21:00:00:24:ff:27:8f:ae/tpgt_1/attrib/jam_host
> >> diff --git a/drivers/scsi/qla2xxx/Kconfig b/drivers/scsi/qla2xxx/Kconfig
> >> index 10aa18b..67c0d5a 100644
> >> --- a/drivers/scsi/qla2xxx/Kconfig
> >> +++ b/drivers/scsi/qla2xxx/Kconfig
> >> @@ -36,3 +36,12 @@ config TCM_QLA2XXX
> >>default n
> >>---help---
> >>Say Y here to enable the TCM_QLA2XXX fabric module for QLogic 24xx+
> >>series
> >>target mode HBAs
> >> +

Re: [PATCH] aic7xxx: fix wrong return values

2016-06-08 Thread Laurence Oberman

  }
>  
>   if ((ahc->features & AHC_TWIN) != 0) {
>   if (ahc_alloc_tstate(ahc, ahc->our_id_b, 'B') == NULL) {
>   printk("%s: unable to allocate ahc_tmode_tstate.  "
>  "Failing attach\n", ahc_name(ahc));
> - return (ENOMEM);
> + return -ENOMEM;
>   }
>   }
>  
> @@ -5660,7 +5660,7 @@ ahc_suspend(struct ahc_softc *ahc)
>  
>   if (LIST_FIRST(&ahc->pending_scbs) != NULL) {
>   ahc_unpause(ahc);
> - return (EBUSY);
> + return -EBUSY;
>   }
>  
>  #ifdef AHC_TARGET_MODE
> @@ -5671,7 +5671,7 @@ ahc_suspend(struct ahc_softc *ahc)
>*/
>   if (ahc->pending_device != NULL) {
>   ahc_unpause(ahc);
> - return (EBUSY);
> + return -EBUSY;
>   }
>  #endif
>   ahc_shutdown(ahc);
> @@ -6908,7 +6908,7 @@ ahc_loadseq(struct ahc_softc *ahc)
>   printk("\n%s: Program too large for instruction memory "
>  "size of %d!\n", ahc_name(ahc),
>  ahc->instruction_ram_size);
> - return (ENOMEM);
> + return -ENOMEM;
>   }
>  
>   /*
> diff --git a/drivers/scsi/aic7xxx/aic7xxx_osm.c
> b/drivers/scsi/aic7xxx/aic7xxx_osm.c
> index fc6a831..78433f6 100644
> --- a/drivers/scsi/aic7xxx/aic7xxx_osm.c
> +++ b/drivers/scsi/aic7xxx/aic7xxx_osm.c
> @@ -835,7 +835,7 @@ ahc_dma_tag_create(struct ahc_softc *ahc, bus_dma_tag_t
> parent,
>  
>   dmat = kmalloc(sizeof(*dmat), GFP_ATOMIC);
>   if (dmat == NULL)
> - return (ENOMEM);
> + return -ENOMEM;
>  
>   /*
>* Linux is very simplistic about DMA memory.  For now don't
> @@ -864,7 +864,7 @@ ahc_dmamem_alloc(struct ahc_softc *ahc, bus_dma_tag_t
> dmat, void** vaddr,
>   *vaddr = pci_alloc_consistent(ahc->dev_softc,
> dmat->maxsize, mapp);
>   if (*vaddr == NULL)
> - return ENOMEM;
> + return -ENOMEM;
>   return 0;
>  }
>  
> @@ -1096,7 +1096,7 @@ ahc_linux_register_host(struct ahc_softc *ahc, struct
> scsi_host_template *templa
>   template->name = ahc->description;
>   host = scsi_host_alloc(template, sizeof(struct ahc_softc *));
>   if (host == NULL)
> - return (ENOMEM);
> + return -ENOMEM;
>  
>   *((struct ahc_softc **)host->hostdata) = ahc;
>   ahc->platform_data->host = host;
> @@ -1215,7 +1215,7 @@ ahc_platform_alloc(struct ahc_softc *ahc, void
> *platform_arg)
>   ahc->platform_data =
>   kzalloc(sizeof(struct ahc_platform_data), GFP_ATOMIC);
>   if (ahc->platform_data == NULL)
> - return (ENOMEM);
> + return -ENOMEM;
>   ahc->platform_data->irq = AHC_LINUX_NOIRQ;
>   ahc_lockinit(ahc);
>   ahc->seltime = (aic7xxx_seltime & 0x3) << 4;
> diff --git a/drivers/scsi/aic7xxx/aic7xxx_osm_pci.c
> b/drivers/scsi/aic7xxx/aic7xxx_osm_pci.c
> index 0fc14da..8bca7f4 100644
> --- a/drivers/scsi/aic7xxx/aic7xxx_osm_pci.c
> +++ b/drivers/scsi/aic7xxx/aic7xxx_osm_pci.c
> @@ -346,13 +346,13 @@ static int
>  ahc_linux_pci_reserve_io_region(struct ahc_softc *ahc, resource_size_t
>  *base)
>  {
>   if (aic7xxx_allow_memio == 0)
> - return (ENOMEM);
> + return -ENOMEM;
>  
>   *base = pci_resource_start(ahc->dev_softc, 0);
>   if (*base == 0)
> - return (ENOMEM);
> + return -ENOMEM;
>   if (!request_region(*base, 256, "aic7xxx"))
> - return (ENOMEM);
> + return -ENOMEM;
>   return (0);
>  }
>  
> @@ -369,16 +369,16 @@ ahc_linux_pci_reserve_mem_region(struct ahc_softc *ahc,
>   if (start != 0) {
>   *bus_addr = start;
>   if (!request_mem_region(start, 0x1000, "aic7xxx"))
> - error = ENOMEM;
> + error = -ENOMEM;
>   if (error == 0) {
>   *maddr = ioremap_nocache(start, 256);
>   if (*maddr == NULL) {
> - error = ENOMEM;
> + error = -ENOMEM;
>   release_mem_region(start, 0x1000);
>   }
>   }
>   } else
> - error = ENOMEM;
> + error = -ENOMEM;
>   return (error);
>  }
>  
> diff --git a/drivers/scsi/aic7xxx/aic7xxx_pci.c
> b/drivers/scsi/aic7xxx/aic7xxx_pci.c
> index 22d5a94..40e1c9b 100644
> --- a/drivers/scsi/aic7xxx/aic7xxx_pci.c
> +++ b/drivers/scsi/aic7xxx/aic7xxx_pci.c
> @@ -806,7 +806,7 @@ ahc_pci_config(struct ahc_softc *ahc, const struct
> ahc_pci_identity *entry)
>  
>   error = ahc_reset(ahc, /*reinit*/FALSE);
>   if (error != 0)
> - return (ENXIO);
> + return -ENXIO;
>  
>   if ((ahc->features & AHC_DT) != 0) {
>   u_int sfunct;
> @@ -2387,7 +2387,7 @@ static int
>  ahc_raid_setup(struct ahc_softc *ahc)
>  {
>   printk("RAID functionality unsupported\n");
> - return (ENXIO);
> + return -ENXIO;
>  }
>  
>  static int
> --
> 2.5.1
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

The patch looks simple and the change is straightforward.
However, can you make the code consistent? Some returns have parentheses and
some do not.
How did this work before, though, if it was returning a non-negative value to
the caller or upper layer?
Has this been tested to work with the changes?

Reviewed-by: Laurence Oberman <lober...@redhat.com>
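
One possible answer to the "how did this work before" question, as a
user-space sketch only: a caller that merely tests for non-zero treats a
positive ENOMEM and a negative -ENOMEM the same way, while a caller that
follows the kernel convention of testing for a negative value would miss the
positive one. The function names below are hypothetical stand-ins, not taken
from the aic7xxx driver.

```c
#include <errno.h>

/* Hypothetical stand-ins for the old and new return style in the patch. */
static int reserve_old_style(int fail) { return fail ? ENOMEM : 0; }   /* positive errno */
static int reserve_new_style(int fail) { return fail ? -ENOMEM : 0; }  /* negative errno */

/* A caller that only tests for non-zero sees both styles as a failure. */
static int caller_checks_nonzero(int ret) { return ret != 0; }

/* A caller that tests for a negative value misses a positive errno. */
static int caller_checks_negative(int ret) { return ret < 0; }
```

Whether any caller in the driver actually checks for a negative value is not
established here; that would need to be confirmed against the source.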
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Connect-IB not performing as well as ConnectX-3 with iSER

2016-06-22 Thread Laurence Oberman



- Original Message -
> From: "Bart Van Assche" 
> To: "Robert LeBlanc" , "Sagi Grimberg" 
> 
> Cc: linux-r...@vger.kernel.org, linux-scsi@vger.kernel.org, "Max Gurtovoy" 
> 
> Sent: Wednesday, June 22, 2016 4:18:31 AM
> Subject: Re: Connect-IB not performing as well as ConnectX-3 with iSER
> 
> On 06/21/2016 10:26 PM, Robert LeBlanc wrote:
> > Srpt keeps crashing couldn't test
> 
> If this is reproducible with the latest rc kernel or with any of the
> stable kernels please report this in a separate e-mail, together with
> the crash call stack and information about how to reproduce this.
> 
> Thanks,
> 
> Bart.
> 
Robert

I am exercising ib_srpt, configured via targetlio, very heavily in 4.7.0-rc1.
I have no crashes or issues.
I also had 4.5 running ib_srpt with no crashes, although I had some other
timeouts etc. depending on the load.

What sort of crashes are you talking about?
Does the system crash, or does ib_srpt dump a stack trace?

Thanks
Laurence

Re: [PATCH] snic: Fix use-after-free in case of a dma mapping error

2016-06-23 Thread Laurence Oberman



- Original Message -
> From: "Johannes Thumshirn" <jthumsh...@suse.de>
> To: "Martin K . Petersen" <martin.peter...@oracle.com>, "James Bottomley" 
> <j...@linux.vnet.ibm.com>
> Cc: "Linux SCSI Mailinglist" <linux-scsi@vger.kernel.org>, "Linux Kernel 
> Mailinglist" <linux-ker...@vger.kernel.org>,
> "Narsimhulu Musini" <nmus...@cisco.com>, "Sesidhar Baddela" 
> <sebad...@cisco.com>, "Johannes Thumshirn"
> <jthumsh...@suse.de>
> Sent: Thursday, June 23, 2016 8:37:20 AM
> Subject: [PATCH] snic: Fix use-after-free in case of a dma mapping error
> 
> If there is a dma mapping error snic kfree()s buf right before printing it.
> Change the order to not accidently trip on memory that's not owned by us
> anymore.
> 
> Signed-off-by: Johannes Thumshirn <jthumsh...@suse.de>
> ---
>  drivers/scsi/snic/snic_disc.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/scsi/snic/snic_disc.c b/drivers/scsi/snic/snic_disc.c
> index b0fefd6..b106596 100644
> --- a/drivers/scsi/snic/snic_disc.c
> +++ b/drivers/scsi/snic/snic_disc.c
> @@ -113,11 +113,11 @@ snic_queue_report_tgt_req(struct snic *snic)
>  
>   pa = pci_map_single(snic->pdev, buf, buf_len, PCI_DMA_FROMDEVICE);
>   if (pci_dma_mapping_error(snic->pdev, pa)) {
> - kfree(buf);
> - snic_req_free(snic, rqi);
>   SNIC_HOST_ERR(snic->shost,
> "Rpt-tgt rspbuf %p: PCI DMA Mapping Failed\n",
> buf);
> + kfree(buf);
> + snic_req_free(snic, rqi);
>   ret = -EINVAL;
>  
>   goto error;
> --
> 2.8.4
> 

Looks fine to me.
Reviewed-by: Laurence Oberman <lober...@redhat.com>
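
The ordering that the patch restores can be illustrated in user space: report
on the buffer first, release it afterwards. This is a minimal sketch with a
hypothetical helper name and plain malloc/free standing in for the kernel
allocator; it is not the snic code.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: build the error message while buf is still owned
 * by the caller, and only then free it. */
static int fail_mapping(void *buf, char *msg, size_t msg_len)
{
    /* Report first, while the buffer still belongs to us... */
    snprintf(msg, msg_len, "rspbuf %p: mapping failed", buf);
    /* ...and release it afterwards. */
    free(buf);
    return -EINVAL;
}
```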

Re: [PATCH] tcm_qla2xxx: fix spelling mistake: "seperator" -> "separator"

2016-06-23 Thread Laurence Oberman



- Original Message -
> From: "Colin King" <colin.k...@canonical.com>
> To: "James E . J . Bottomley" <j...@linux.vnet.ibm.com>, "Martin K . 
> Petersen" <martin.peter...@oracle.com>,
> linux-scsi@vger.kernel.org
> Cc: linux-ker...@vger.kernel.org
> Sent: Thursday, June 23, 2016 1:12:25 PM
> Subject: [PATCH] tcm_qla2xxx: fix spelling mistake: "seperator" -> "separator"
> 
> From: Colin Ian King <colin.k...@canonical.com>
> 
> trivial fix to spelling mistake in pr_err message
> 
> Signed-off-by: Colin Ian King <colin.k...@canonical.com>
> ---
>  drivers/scsi/qla2xxx/tcm_qla2xxx.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/scsi/qla2xxx/tcm_qla2xxx.c
> b/drivers/scsi/qla2xxx/tcm_qla2xxx.c
> index 6643f6f..46fe6f4 100644
> --- a/drivers/scsi/qla2xxx/tcm_qla2xxx.c
> +++ b/drivers/scsi/qla2xxx/tcm_qla2xxx.c
> @@ -1738,7 +1738,7 @@ static struct se_wwn *tcm_qla2xxx_npiv_make_lport(
>  
>   p = strchr(tmp, '@');
>   if (!p) {
> - pr_err("Unable to locate NPIV '@' seperator\n");
> + pr_err("Unable to locate NPIV '@' separator\n");
>   return ERR_PTR(-EINVAL);
>   }
>   *p++ = '\0';
> --
> 2.8.1
> 

Simple change, and it's fine.
Reviewed-by: Laurence Oberman <lober...@redhat.com>

Re: What partition should the MTMKPART argument specify? Was: Re: st driver doesn't seem to grok LTO partitioning

2016-02-04 Thread Laurence Oberman


Kai's latest patch passes all my tests on the DAT DDS drive.
It fails on the older LTO3, as it should (un-partitionable).
I don't have the new LTO5 yet; it arrives at the end of the week, I am told.

Testing log
---
[root@srp-server ~]# uname -a
Linux srp-server 4.4.0 #1 SMP Thu Jan 28 15:06:45 EST 2016 x86_64 x86_64 x86_64 
GNU/Linux

Storage Changer /dev/sg3:1 Drives, 6 Slots ( 0 Import/Export )
Data Transfer Element 0:Full (Storage Element 2 Loaded)
  Storage Element 1:Full
  Storage Element 2:Empty
  Storage Element 3:Full
  Storage Element 4:Full
  Storage Element 5:Full
  Storage Element 6:Empty

[root@srp-server home]# mtx -f /dev/sg3 unload 2 0
Unloading drive 0 into Storage Element 2...done

[root@srp-server home]# mtx -f /dev/sg3 load 3 0
Loading media from Storage Element 3 into drive 0...done

[root@srp-server home]# sg_map -st -i
/dev/sg2  /dev/nst0  HPDAT72X6   B409
/dev/sg3  HPDAT72X6   B409

[root@srp-server home]# mt -f /dev/st0 stsetoption can-partitions

[root@srp-server home]# mt -f /dev/st0 mkpartition 1

Tape screen shows Format

Completed with no errors and I can set to a specific partition

Feb 04 13:42:27 srp-server kernel: st: Unloaded.
Feb 04 13:43:57 srp-server kernel: st: Version 20160203, fixed bufsize 32768, 
s/g segs 256
Feb 04 13:43:57 srp-server kernel: st: Debugging enabled debug_flag = 1
Feb 04 13:43:57 srp-server kernel: st 6:0:1:0: Attached scsi tape st0
Feb 04 13:43:57 srp-server kernel: st 6:0:1:0: st0: try direct i/o: yes 
(alignment 4 B)

Feb 04 13:48:30 srp-server kernel: st 6:0:1:0: [st0] Block limits 1 - 16777215 
bytes.
Feb 04 13:48:30 srp-server kernel: st 6:0:1:0: [st0] Mode sense. Length 11, 
medium 0, WBS 10, BLL 8
Feb 04 13:48:30 srp-server kernel: st 6:0:1:0: [st0] Density 47, tape length: 
0, drv buffer: 1
Feb 04 13:48:30 srp-server kernel: st 6:0:1:0: [st0] Block size: 0, buffer 
size: 4096 (1 blocks).
Feb 04 13:48:30 srp-server kernel: st 6:0:1:0: [st0] Updating partition number 
in status.
Feb 04 13:48:30 srp-server kernel: st 6:0:1:0: [st0] Got tape pos. blk 0 part 0.
Feb 04 13:48:30 srp-server kernel: st 6:0:1:0: [st0] Mode 0 options: buffer 
writes: 1, async writes: 1, read ahead: 1
Feb 04 13:48:30 srp-server kernel: st 6:0:1:0: [st0] can bsr: 1, two FMs: 
0, fast mteom: 0, auto lock: 0,
Feb 04 13:48:30 srp-server kernel: st 6:0:1:0: [st0] defs for wr: 0, no 
block limits: 0, partitions: 1, s2 log: 0
Feb 04 13:48:30 srp-server kernel: st 6:0:1:0: [st0] sysv: 0 nowait: 0 
sili: 0 nowait_filemark: 0
Feb 04 13:48:30 srp-server kernel: st 6:0:1:0: [st0] debugging: 1
Feb 04 13:48:30 srp-server kernel: st 6:0:1:0: [st0] Rewinding tape.

Feb 04 13:48:42 srp-server kernel: st 6:0:1:0: [st0] Block limits 1 - 16777215 
bytes.
Feb 04 13:48:42 srp-server kernel: st 6:0:1:0: [st0] Mode sense. Length 11, 
medium 0, WBS 10, BLL 8
Feb 04 13:48:42 srp-server kernel: st 6:0:1:0: [st0] Density 47, tape length: 
0, drv buffer: 1
Feb 04 13:48:42 srp-server kernel: st 6:0:1:0: [st0] Block size: 0, buffer 
size: 4096 (1 blocks).
Feb 04 13:48:42 srp-server kernel: st 6:0:1:0: [st0] Loading tape.
Feb 04 13:48:42 srp-server kernel: st 6:0:1:0: [st0] Error: 802, cmd: 0 0 0 
0 0 0
Feb 04 13:48:42 srp-server kernel: st 6:0:1:0: [st0] Sense Key : Unit Attention 
[current]
Feb 04 13:48:42 srp-server kernel: st 6:0:1:0: [st0] Add. Sense: Not ready to 
ready change, medium may have changed
Feb 04 13:48:42 srp-server kernel: st 6:0:1:0: [st0] Block limits 1 - 16777215 
bytes.
Feb 04 13:48:42 srp-server kernel: st 6:0:1:0: [st0] Mode sense. Length 11, 
medium 0, WBS 10, BLL 8
Feb 04 13:48:42 srp-server kernel: st 6:0:1:0: [st0] Density 47, tape length: 
0, drv buffer: 1
Feb 04 13:48:42 srp-server kernel: st 6:0:1:0: [st0] Block size: 0, buffer 
size: 4096 (1 blocks).
Feb 04 13:48:42 srp-server kernel: st 6:0:1:0: [st0] Partition page length is 
10 bytes.
Feb 04 13:48:42 srp-server kernel: st 6:0:1:0: [st0] PP: max 1, add 0, xdp 0, 
psum 02, pofmetc 0, rec 03, units 00, sizes: 0 65535
Feb 04 13:48:42 srp-server kernel: st 6:0:1:0: [st0] MP: 11 08 01 00 10 03 00 
00 00 00 ff ff
Feb 04 13:48:42 srp-server kernel: st 6:0:1:0: [st0] psd_cnt 1, max.parts 1, 
nbr_parts 0
Feb 04 13:48:42 srp-server kernel: st 6:0:1:0: [st0] Formatting tape with two 
partitions (1 = 1 MB).
Feb 04 13:48:42 srp-server kernel: st 6:0:1:0: [st0] Sent partition page length 
is 10 bytes. needs_format: 0
Feb 04 13:48:42 srp-server kernel: st 6:0:1:0: [st0] PP: max 1, add 1, xdp 1, 
psum 02, pofmetc 0, rec 03, units 00, sizes: 1 65535
Feb 04 13:48:42 srp-server kernel: st 6:0:1:0: [st0] MP: 11 08 01 01 30 03 00 
00 27 10 ff ff


Tested-by: Laurence Oberman <lober...@redhat.com>

Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services

- Original Message -
From: "Douglas Gilbert" <dgilb...@int

Re: What partition should the MTMKPART argument specify? Was: Re: st driver doesn't seem to grok LTO partitioning

2016-01-28 Thread Laurence Oberman

Hi Kai

Which kernel was the last patch you attached made against?

Thanks

Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services

- Original Message -
From: "Kai Mäkisara (Kolumbus)" <kai.makis...@kolumbus.fi>
To: "Shane M Seymour" <shane.seym...@hpe.com>
Cc: "Laurence Oberman" <lober...@redhat.com>, "Emmanuel Florac" 
<eflo...@intellique.com>, "Laurence Oberman" <oberma...@gmail.com>, 
linux-scsi@vger.kernel.org
Sent: Thursday, January 28, 2016 12:04:20 PM
Subject: Re: What partition should the MTMKPART argument specify? Was: Re: st 
driver doesn't seem to grok LTO partitioning


> On 28.1.2016, at 9.36, Seymour, Shane M <shane.seym...@hpe.com> wrote:
> 
> Hi Kai,
> 
> With the changes I get a failure partitioning a HP DAT72 drive (DDS-5):
> 
> # ./mt -f /dev/st1 stsetoption debug
> # ./mt -f /dev/st1 stsetoption can-partitions
> # ./mt -f /dev/st1 mkpartition 1000
> /dev/st1: Input/output error
> 
...
> [ 3976.389605] st 6:0:3:0: [st1] Partition page length is 10 bytes.
> [ 3976.389610] st 6:0:3:0: [st1] PP: max 1, add 0, xdp 0, psum 02, pofmetc 0, 
> rec 03, units 00, sizes: 0 65535
> [ 3976.389614] st 6:0:3:0: [st1] MP: 11 08 01 00 10 03 00 00 00 00 ff ff
> [ 3976.389618] st 6:0:3:0: [st1] psd_cnt 2, max.parts 1, nbr_parts 0
 ^
The problem is here

...
> Using a slightly older kernel to partition the DAT72 drive works (same 3 
> commands as above):
...
> [  351.584906] st 6:0:3:0: [st1] Partition page length is 10 bytes.
> [  351.584908] st 6:0:3:0: [st1] psd_cnt 1, max.parts 1, nbr_parts 0

The old driver computes the psd_cnt from the returned page length. The same
applies to the patched driver if the SCSI level of the device < SCSI_3. This
works correctly with my drive that reports SCSI_2. So, the question is: what
SCSI level does your device report?

Kai
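
The page-length calculation described above can be sketched as follows. The
layout is an assumption inferred from the logged "MP:" bytes in this thread
(byte 1 is the page length excluding the two header bytes, the fixed part is 8
bytes, each partition size descriptor is 2 bytes); the constant and function
names are illustrative and not copied from st.c.

```c
#include <stdint.h>

/* Assumed layout, inferred from the logged "MP:" bytes rather than from st.c. */
#define PAGE_LENGTH_OFFSET 1   /* page length byte, excludes the first two bytes */
#define FIXED_PART_BYTES   8   /* bytes before the first size descriptor */
#define DESCRIPTOR_BYTES   2   /* one partition size descriptor */

/* Number of partition size descriptors implied by the page length byte. */
static int psd_cnt_from_page(const uint8_t *page)
{
    /* Matches the "Partition page length is N bytes" value in the log. */
    int total = page[PAGE_LENGTH_OFFSET] + 2;

    return (total - FIXED_PART_BYTES) / DESCRIPTOR_BYTES;
}
```

With the page the DAT72 returned (11 08 ...) this gives 1, which agrees with
the psd_cnt printed by the older kernel.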


Re: What partition should the MTMKPART argument specify? Was: Re: st driver doesn't seem to grok LTO partitioning

2016-01-28 Thread Laurence Oberman

I meant to mention that I am still waiting for my new LTO5. Also, this is the
first time I am testing the DAT72.

Shane, have you had the DAT working before this last patch? If so, with which
patch?

Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services

- Original Message -
From: "Laurence Oberman" <lober...@redhat.com>
To: "Shane M Seymour" <shane.seym...@hpe.com>
Cc: "Kai Mäkisara (Kolumbus)" <kai.makis...@kolumbus.fi>, "Emmanuel Florac" 
<eflo...@intellique.com>, "Laurence Oberman" <oberma...@gmail.com>, 
linux-scsi@vger.kernel.org
Sent: Thursday, January 28, 2016 6:23:13 PM
Subject: Re: What partition should the MTMKPART argument specify? Was: Re: st 
driver doesn't seem to grok LTO partitioning

On my DAT tape drive with the latest patch:


[root@srp-server ~]# cat /sys/class/scsi_tape/st0/device/scsi_level
4

[root@srp-server ~]# mt -f /dev/st0 stsetoption can-partitions

Jan 28 18:17:49 srp-server kernel: st 6:0:1:0: [st0] Block limits 1 - 16777215 
bytes.
Jan 28 18:17:49 srp-server kernel: st 6:0:1:0: [st0] Mode sense. Length 11, 
medium 0, WBS 10, BLL 8
Jan 28 18:17:49 srp-server kernel: st 6:0:1:0: [st0] Density 47, tape length: 
0, drv buffer: 1
Jan 28 18:17:49 srp-server kernel: st 6:0:1:0: [st0] Block size: 0, buffer 
size: 4096 (1 blocks).
Jan 28 18:17:49 srp-server kernel: st 6:0:1:0: [st0] Mode 0 options: buffer 
writes: 1, async writes: 1, read ahead: 1
Jan 28 18:17:49 srp-server kernel: st 6:0:1:0: [st0] can bsr: 1, two FMs: 
0, fast mteom: 0, auto lock: 0,
Jan 28 18:17:49 srp-server kernel: st 6:0:1:0: [st0] defs for wr: 0, no 
block limits: 0, partitions: 1, s2 log: 0
Jan 28 18:17:49 srp-server kernel: st 6:0:1:0: [st0] sysv: 0 nowait: 0 
sili: 0 nowait_filemark: 0
Jan 28 18:17:49 srp-server kernel: st 6:0:1:0: [st0] debugging: 1
Jan 28 18:17:49 srp-server kernel: st 6:0:1:0: [st0] Rewinding tape.

[root@srp-server ~]# mt -f /dev/st0 mkpartition 1000

Jan 28 18:18:01 srp-server kernel: st 6:0:1:0: [st0] Block limits 1 - 16777215 
bytes.
Jan 28 18:18:01 srp-server kernel: st 6:0:1:0: [st0] Mode sense. Length 11, 
medium 0, WBS 10, BLL 8
Jan 28 18:18:01 srp-server kernel: st 6:0:1:0: [st0] Density 47, tape length: 
0, drv buffer: 1
Jan 28 18:18:01 srp-server kernel: st 6:0:1:0: [st0] Block size: 0, buffer 
size: 4096 (1 blocks).
Jan 28 18:18:01 srp-server kernel: st 6:0:1:0: [st0] Loading tape.
Jan 28 18:18:01 srp-server kernel: st 6:0:1:0: [st0] Error: 802, cmd: 0 0 0 
0 0 0
Jan 28 18:18:01 srp-server kernel: st 6:0:1:0: [st0] Sense Key : Unit Attention 
[current]
Jan 28 18:18:01 srp-server kernel: st 6:0:1:0: [st0] Add. Sense: Not ready to 
ready change, medium may have changed
Jan 28 18:18:01 srp-server kernel: st 6:0:1:0: [st0] Block limits 1 - 16777215 
bytes.
Jan 28 18:18:01 srp-server kernel: st 6:0:1:0: [st0] Mode sense. Length 11, 
medium 0, WBS 10, BLL 8
Jan 28 18:18:01 srp-server kernel: st 6:0:1:0: [st0] Density 47, tape length: 
0, drv buffer: 1
Jan 28 18:18:01 srp-server kernel: st 6:0:1:0: [st0] Block size: 0, buffer 
size: 4096 (1 blocks).
Jan 28 18:18:01 srp-server kernel: st 6:0:1:0: [st0] Partition page length is 
10 bytes.
Jan 28 18:18:01 srp-server kernel: st 6:0:1:0: [st0] PP: max 1, add 0, xdp 0, 
psum 02, pofmetc 0, rec 03, units 00, sizes: 0 65535
Jan 28 18:18:01 srp-server kernel: st 6:0:1:0: [st0] MP: 11 08 01 00 10 03 00 
00 00 00 ff ff
Jan 28 18:18:01 srp-server kernel: st 6:0:1:0: [st0] psd_cnt 2, max.parts 1, 
nbr_parts 0
Jan 28 18:18:01 srp-server kernel: st 6:0:1:0: [st0] Formatting tape with two 
partitions (1 = 1000 MB).
Jan 28 18:18:01 srp-server kernel: st 6:0:1:0: [st0] Sent partition page length 
is 12 bytes. needs_format: 0
Jan 28 18:18:01 srp-server kernel: st 6:0:1:0: [st0] PP: max 1, add 1, xdp 1, 
psum 02, pofmetc 0, rec 03, units 00, sizes: 65535 1000
Jan 28 18:18:01 srp-server kernel: st 6:0:1:0: [st0] MP: 11 0a 01 01 30 03 00 
00 ff ff 03 e8
Jan 28 18:18:01 srp-server kernel: st 6:0:1:0: [st0] Error: 802, cmd: 15 10 
0 0 18 0
Jan 28 18:18:01 srp-server kernel: st 6:0:1:0: [st0] Sense Key : Illegal 
Request [current]
Jan 28 18:18:01 srp-server kernel: st 6:0:1:0: [st0] Add. Sense: Invalid field 
in parameter list
Jan 28 18:18:01 srp-server kernel: st 6:0:1:0: [st0] Partitioning of tape 
failed.
Jan 28 18:18:01 srp-server kernel: st 6:0:1:0: [st0] Rewinding tape.
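
For readers matching the "PP:" and "MP:" lines in the log above, the following
sketch decodes the twelve logged bytes into the fields named on the "PP:"
line. The byte and bit positions are assumptions inferred from the log output
itself, and the struct and function names are hypothetical.

```c
#include <stdint.h>

/* Fields of the partition mode page (page code 0x11) as named in the st
 * debug "PP:" line. Offsets are inferred from the logged "MP:" bytes. */
struct part_page {
    unsigned max_add;   /* byte 2: maximum additional partitions */
    unsigned add;       /* byte 3: additional partitions defined */
    unsigned xdp;       /* byte 4, bits 7..5 */
    unsigned psum;      /* byte 4, bits 4..3 */
    unsigned pofmetc;   /* byte 4, bits 2..0 */
    unsigned rec;       /* byte 5 */
    unsigned units;     /* byte 6 */
    unsigned size0;     /* bytes 8..9, big-endian */
    unsigned size1;     /* bytes 10..11, big-endian */
};

/* Decode the 12 bytes printed on an "MP:" line. */
static struct part_page decode_part_page(const uint8_t *mp)
{
    struct part_page p;

    p.max_add = mp[2];
    p.add     = mp[3];
    p.xdp     = (mp[4] >> 5) & 0x7;
    p.psum    = (mp[4] >> 3) & 0x3;
    p.pofmetc = mp[4] & 0x7;
    p.rec     = mp[5];
    p.units   = mp[6];
    p.size0   = ((unsigned)mp[8] << 8) | mp[9];
    p.size1   = ((unsigned)mp[10] << 8) | mp[11];
    return p;
}
```

Feeding it the sent page (11 0a 01 01 30 03 00 00 ff ff 03 e8) yields sizes
65535 and 1000, the same values the driver printed before the drive rejected
the request.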



Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services

- Original Message -
From: "Shane M Seymour" <shane.seym...@hpe.com>
To: "Kai Mäkisara (Kolumbus)" <kai.makis...@kolumbus.fi>
Cc: "Laurence Oberman" <lober...@redhat.com>, "Emmanuel Florac" 
<eflo...@intellique.com>, "Laurence Oberman" <oberma...@gmail.com>, 
linux-scsi@vger.kernel.org
Sent: Thursday, January 28, 2016 6:12:41 PM
Subject: RE: What partition should the MTMKP

Re: What partition should the MTMKPART argument specify? Was: Re: st driver doesn't seem to grok LTO partitioning

2016-01-28 Thread Laurence Oberman

On my DAT tape drive with the latest patch:


[root@srp-server ~]# cat /sys/class/scsi_tape/st0/device/scsi_level
4

[root@srp-server ~]# mt -f /dev/st0 stsetoption can-partitions

Jan 28 18:17:49 srp-server kernel: st 6:0:1:0: [st0] Block limits 1 - 16777215 
bytes.
Jan 28 18:17:49 srp-server kernel: st 6:0:1:0: [st0] Mode sense. Length 11, 
medium 0, WBS 10, BLL 8
Jan 28 18:17:49 srp-server kernel: st 6:0:1:0: [st0] Density 47, tape length: 
0, drv buffer: 1
Jan 28 18:17:49 srp-server kernel: st 6:0:1:0: [st0] Block size: 0, buffer 
size: 4096 (1 blocks).
Jan 28 18:17:49 srp-server kernel: st 6:0:1:0: [st0] Mode 0 options: buffer 
writes: 1, async writes: 1, read ahead: 1
Jan 28 18:17:49 srp-server kernel: st 6:0:1:0: [st0] can bsr: 1, two FMs: 
0, fast mteom: 0, auto lock: 0,
Jan 28 18:17:49 srp-server kernel: st 6:0:1:0: [st0] defs for wr: 0, no 
block limits: 0, partitions: 1, s2 log: 0
Jan 28 18:17:49 srp-server kernel: st 6:0:1:0: [st0] sysv: 0 nowait: 0 
sili: 0 nowait_filemark: 0
Jan 28 18:17:49 srp-server kernel: st 6:0:1:0: [st0] debugging: 1
Jan 28 18:17:49 srp-server kernel: st 6:0:1:0: [st0] Rewinding tape.

[root@srp-server ~]# mt -f /dev/st0 mkpartition 1000

Jan 28 18:18:01 srp-server kernel: st 6:0:1:0: [st0] Block limits 1 - 16777215 
bytes.
Jan 28 18:18:01 srp-server kernel: st 6:0:1:0: [st0] Mode sense. Length 11, 
medium 0, WBS 10, BLL 8
Jan 28 18:18:01 srp-server kernel: st 6:0:1:0: [st0] Density 47, tape length: 
0, drv buffer: 1
Jan 28 18:18:01 srp-server kernel: st 6:0:1:0: [st0] Block size: 0, buffer 
size: 4096 (1 blocks).
Jan 28 18:18:01 srp-server kernel: st 6:0:1:0: [st0] Loading tape.
Jan 28 18:18:01 srp-server kernel: st 6:0:1:0: [st0] Error: 802, cmd: 0 0 0 
0 0 0
Jan 28 18:18:01 srp-server kernel: st 6:0:1:0: [st0] Sense Key : Unit Attention 
[current]
Jan 28 18:18:01 srp-server kernel: st 6:0:1:0: [st0] Add. Sense: Not ready to 
ready change, medium may have changed
Jan 28 18:18:01 srp-server kernel: st 6:0:1:0: [st0] Block limits 1 - 16777215 
bytes.
Jan 28 18:18:01 srp-server kernel: st 6:0:1:0: [st0] Mode sense. Length 11, 
medium 0, WBS 10, BLL 8
Jan 28 18:18:01 srp-server kernel: st 6:0:1:0: [st0] Density 47, tape length: 
0, drv buffer: 1
Jan 28 18:18:01 srp-server kernel: st 6:0:1:0: [st0] Block size: 0, buffer 
size: 4096 (1 blocks).
Jan 28 18:18:01 srp-server kernel: st 6:0:1:0: [st0] Partition page length is 
10 bytes.
Jan 28 18:18:01 srp-server kernel: st 6:0:1:0: [st0] PP: max 1, add 0, xdp 0, 
psum 02, pofmetc 0, rec 03, units 00, sizes: 0 65535
Jan 28 18:18:01 srp-server kernel: st 6:0:1:0: [st0] MP: 11 08 01 00 10 03 00 
00 00 00 ff ff
Jan 28 18:18:01 srp-server kernel: st 6:0:1:0: [st0] psd_cnt 2, max.parts 1, 
nbr_parts 0
Jan 28 18:18:01 srp-server kernel: st 6:0:1:0: [st0] Formatting tape with two 
partitions (1 = 1000 MB).
Jan 28 18:18:01 srp-server kernel: st 6:0:1:0: [st0] Sent partition page length 
is 12 bytes. needs_format: 0
Jan 28 18:18:01 srp-server kernel: st 6:0:1:0: [st0] PP: max 1, add 1, xdp 1, 
psum 02, pofmetc 0, rec 03, units 00, sizes: 65535 1000
Jan 28 18:18:01 srp-server kernel: st 6:0:1:0: [st0] MP: 11 0a 01 01 30 03 00 
00 ff ff 03 e8
Jan 28 18:18:01 srp-server kernel: st 6:0:1:0: [st0] Error: 802, cmd: 15 10 
0 0 18 0
Jan 28 18:18:01 srp-server kernel: st 6:0:1:0: [st0] Sense Key : Illegal 
Request [current]
Jan 28 18:18:01 srp-server kernel: st 6:0:1:0: [st0] Add. Sense: Invalid field 
in parameter list
Jan 28 18:18:01 srp-server kernel: st 6:0:1:0: [st0] Partitioning of tape 
failed.
Jan 28 18:18:01 srp-server kernel: st 6:0:1:0: [st0] Rewinding tape.



Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services

- Original Message -
From: "Shane M Seymour" <shane.seym...@hpe.com>
To: "Kai Mäkisara (Kolumbus)" <kai.makis...@kolumbus.fi>
Cc: "Laurence Oberman" <lober...@redhat.com>, "Emmanuel Florac" 
<eflo...@intellique.com>, "Laurence Oberman" <oberma...@gmail.com>, 
linux-scsi@vger.kernel.org
Sent: Thursday, January 28, 2016 6:12:41 PM
Subject: RE: What partition should the MTMKPART argument specify? Was: Re: st 
driver doesn't seem to grok LTO partitioning

Hi Kai,

$ pwd
/sys/class/scsi_tape/st1/device
$ cat scsi_level
4

Thanks
Shane

> -Original Message-
> From: "Kai Mäkisara (Kolumbus)" [mailto:kai.makis...@kolumbus.fi]
> Sent: Friday, January 29, 2016 4:04 AM
> To: Seymour, Shane M
> Cc: Laurence Oberman; Emmanuel Florac; Laurence Oberman; linux-
> s...@vger.kernel.org
> Subject: Re: What partition should the MTMKPART argument specify? Was:
> Re: st driver doesn't seem to grok LTO partitioning
> 
> 
> > On 28.1.2016, at 9.36, Seymour, Shane M <shane.seym...@hpe.com>
> wrote:
> >
> > Hi Kai,
> >
> > With the changes I get a failure partitioning a HP DAT72 drive (DDS-5):
> >

Re: What partition should the MTMKPART argument specify? Was: Re: st driver doesn't seem to grok LTO partitioning

2016-02-01 Thread Laurence Oberman

The new patch did not work for me, but I chatted with Shane and I now have his
mt version.
I will update my DAT to the same firmware as his, or newer, and provide a
second Tested-by.
I also expect my LTO5 to show up this week, so I will be ready for that.

Thanks everyone for keeping tapes alive

Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services

- Original Message -
From: "Kai Mäkisara (Kolumbus)" <kai.makis...@kolumbus.fi>
To: "Shane M Seymour" <shane.seym...@hpe.com>
Cc: "Laurence Oberman" <lober...@redhat.com>, "Emmanuel Florac" 
<eflo...@intellique.com>, "Laurence Oberman" <oberma...@gmail.com>, 
linux-scsi@vger.kernel.org
Sent: Monday, February 1, 2016 1:43:26 PM
Subject: Re: What partition should the MTMKPART argument specify? Was: Re: st 
driver doesn't seem to grok LTO partitioning


> On 1.2.2016, at 8.31, Seymour, Shane M <shane.seym...@hpe.com> wrote:
> 
> Hi Kai,
> 
> Thanks for the changes; the HPE DAT72 DDS5 drive now works as expected:
> 
Good. Thanks for testing.

...
> 
> I'm asking around again one final time to see if I can lay my hands on a LTO5 
> or greater drive so I can test LTO partitioning as well.
> 
> The only other thing I can think of (I'm not sure if this is an improvement 
> or not) is if bp[pgo + PP_OFF_MAX_ADD_PARTS] + bp[pgo + PP_OFF_NBR_ADD_PARTS] 
> (max.parts and nbr_parts in the debug message) is zero just return -EINVAL 
> unless you know of any tape drives that report them both as 0 but can be 
> partitioned? That is after this:
> 
>     DEBC_printk(STp, "psd_cnt %d, max.parts %d, nbr_parts %d\n",
>                 psd_cnt, bp[pgo + PP_OFF_MAX_ADD_PARTS],
>                 bp[pgo + PP_OFF_NBR_ADD_PARTS]);
> 
> add (and also turn off the can-partitions option):
> 
>     if ((bp[pgo + PP_OFF_MAX_ADD_PARTS] + bp[pgo + PP_OFF_NBR_ADD_PARTS]) == 0) {
>         DEBC_printk(STp, "Drive not partitionable - max.parts+nbr_parts is 0\n");
>         STp->can_partitions = 0;
>         return -EINVAL;
>     }
> 
> I'm not especially fussed if you don't want to add that though.
> 
I thought about a test like this (only test maximum number) but decided not to
add it. The reason was that I did not want to change anything that has worked
before. I quite trust that the current drives return sense data instead of
crashing and the end result for the user would be the same. However, one can
argue that returning EINVAL is better than EIO but does the user notice? If the
common opinion is that a test like this should be added, I am not against it.
It can be added to the code for SCSI >=3 where it does not risk anything for
the old drives.

IMHO, can_partitions should not be cleared based on the test. For example,
trying to partition a LTO-4 tape in a LTO-5 drive should not disable
partitioning. (The mode page should return zero as maximum number of partitions
when a LTO-4 tape is inserted.)

Thanks,
Kai
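
The test under discussion can be sketched in user space as below. It follows
the suggestion to return -EINVAL when both counts are zero, but, in line with
the reply, it does not touch the can-partitions setting. The PP_OFF_* names
come from the quoted code; their values (2 and 3) are assumptions inferred
from the logged "MP:" bytes, and the function name is hypothetical.

```c
#include <errno.h>
#include <stdint.h>

/* Offsets assumed from the logged mode page bytes, not copied from st.c. */
#define PP_OFF_MAX_ADD_PARTS 2
#define PP_OFF_NBR_ADD_PARTS 3

/* Return -EINVAL when the page reports no partitioning capability at all,
 * 0 otherwise. The partitioning option itself is left unchanged. */
static int check_partitionable(const uint8_t *page)
{
    if (page[PP_OFF_MAX_ADD_PARTS] + page[PP_OFF_NBR_ADD_PARTS] == 0)
        return -EINVAL;
    return 0;
}
```

Because the option is left alone, inserting a partitionable tape later would
not require re-enabling anything.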


Re: What partition should the MTMKPART argument specify? Was: Re: st driver doesn't seem to grok LTO partitioning

2016-02-02 Thread Laurence Oberman

Hello

I finally got the firmware on my DAT updated.
Using Kai's latest patch, I validated the patch on my DAT drive as well.
Thanks to Shane for providing the correct mt code, as that was also one of my
problems besides the firmware.

[root@srp-server mt-st-1.1-patched]# ./mt -f /dev/st0 stsetoption can-partitions
[root@srp-server mt-st-1.1-patched]# ./mt -f /dev/st0 mkpartition 1000

Took almost 6 minutes to partition this old DDS

Feb 02 22:25:10 srp-server kernel: st 6:0:1:0: [st0] Mode sense. Length 11, 
medium 0, WBS 10, BLL 8
Feb 02 22:25:10 srp-server kernel: st 6:0:1:0: [st0] Density 47, tape length: 
0, drv buffer: 1
Feb 02 22:25:10 srp-server kernel: st 6:0:1:0: [st0] Block size: 0, buffer 
size: 4096 (1 blocks).
Feb 02 22:25:10 srp-server kernel: st 6:0:1:0: [st0] Mode 0 options: buffer 
writes: 1, async writes: 1, read ahead: 1
Feb 02 22:25:10 srp-server kernel: st 6:0:1:0: [st0] can bsr: 1, two FMs: 
0, fast mteom: 0, auto lock: 0,
Feb 02 22:25:10 srp-server kernel: st 6:0:1:0: [st0] defs for wr: 0, no 
block limits: 0, partitions: 1, s2 log: 0
Feb 02 22:25:10 srp-server kernel: st 6:0:1:0: [st0] sysv: 0 nowait: 0 
sili: 0 nowait_filemark: 0
Feb 02 22:25:10 srp-server kernel: st 6:0:1:0: [st0] debugging: 1
Feb 02 22:25:10 srp-server kernel: st 6:0:1:0: [st0] Rewinding tape.

Feb 02 22:25:24 srp-server kernel: st 6:0:1:0: [st0] Block limits 1 - 16777215 
bytes.
Feb 02 22:25:24 srp-server kernel: st 6:0:1:0: [st0] Mode sense. Length 11, 
medium 0, WBS 10, BLL 8
Feb 02 22:25:24 srp-server kernel: st 6:0:1:0: [st0] Density 47, tape length: 
0, drv buffer: 1
Feb 02 22:25:24 srp-server kernel: st 6:0:1:0: [st0] Block size: 0, buffer 
size: 4096 (1 blocks).
Feb 02 22:25:24 srp-server kernel: st 6:0:1:0: [st0] Updating partition number 
in status.
Feb 02 22:25:24 srp-server kernel: st 6:0:1:0: [st0] Got tape pos. blk 0 part 0.
Feb 02 22:25:24 srp-server kernel: st 6:0:1:0: [st0] Loading tape.
Feb 02 22:25:24 srp-server kernel: st 6:0:1:0: [st0] Error: 802, cmd: 0 0 0 
0 0 0
Feb 02 22:25:24 srp-server kernel: st 6:0:1:0: [st0] Sense Key : Unit Attention 
[current] 
Feb 02 22:25:24 srp-server kernel: st 6:0:1:0: [st0] Add. Sense: Not ready to 
ready change, medium may have changed
Feb 02 22:25:24 srp-server kernel: st 6:0:1:0: [st0] Block limits 1 - 16777215 
bytes.
Feb 02 22:25:24 srp-server kernel: st 6:0:1:0: [st0] Mode sense. Length 11, 
medium 0, WBS 10, BLL 8
Feb 02 22:25:24 srp-server kernel: st 6:0:1:0: [st0] Density 47, tape length: 
0, drv buffer: 1
Feb 02 22:25:24 srp-server kernel: st 6:0:1:0: [st0] Block size: 0, buffer 
size: 4096 (1 blocks).
Feb 02 22:25:24 srp-server kernel: st 6:0:1:0: [st0] Partition page length is 
10 bytes.
Feb 02 22:25:24 srp-server kernel: st 6:0:1:0: [st0] PP: max 1, add 0, xdp 0, 
psum 02, pofmetc 0, rec 03, units 00, sizes: 0 65535
Feb 02 22:25:24 srp-server kernel: st 6:0:1:0: [st0] MP: 11 08 01 00 10 03 00 
00 00 00 ff ff
Feb 02 22:25:24 srp-server kernel: st 6:0:1:0: [st0] psd_cnt 1, max.parts 1, 
nbr_parts 0
Feb 02 22:25:24 srp-server kernel: st 6:0:1:0: [st0] Formatting tape with two 
partitions (1 = 1000 MB).
Feb 02 22:25:24 srp-server kernel: st 6:0:1:0: [st0] Sent partition page length 
is 10 bytes. needs_format: 0
Feb 02 22:25:24 srp-server kernel: st 6:0:1:0: [st0] PP: max 1, add 1, xdp 1, 
psum 02, pofmetc 0, rec 03, units 00, sizes: 1000 65535
Feb 02 22:31:45 srp-server kernel: st 6:0:1:0: [st0] Rewinding tape.

I will retest with Shane's latest additions, which he just sent, after first
testing with Kai's latest patch on my LTO5.
(Here's hoping I don't have to update the f/w on that one.)


Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services

- Original Message -
From: "Kai Mäkisara (Kolumbus)" <kai.makis...@kolumbus.fi>
To: "Shane M Seymour" <shane.seym...@hpe.com>
Cc: "Laurence Oberman" <lober...@redhat.com>, "Emmanuel Florac" 
<eflo...@intellique.com>, "Laurence Oberman" <oberma...@gmail.com>, 
linux-scsi@vger.kernel.org
Sent: Monday, February 1, 2016 1:43:26 PM
Subject: Re: What partition should the MTMKPART argument specify? Was: Re: st 
driver doesn't seem to grok LTO partitioning


> On 1.2.2016, at 8.31, Seymour, Shane M <shane.seym...@hpe.com> wrote:
> 
> Hi Kai,
> 
> Thanks for the changes; the HPE DAT72 DDS5 drive now works as expected:
> 
Good. Thanks for testing.

...
> 
> I'm asking around again one final time to see if I can lay my hands on a LTO5 
> or greater drive so I can test LTO partitioning as well.
> 
> The only other thing I can think of (I'm not sure if this is an improvement 
> or not) is if bp[pgo + PP_OFF_MAX_ADD_PARTS] + bp[pgo + PP_OFF_NBR_ADD_PARTS] 
> (max.parts and nbr_parts in the debug message) is zero just return -EINVAL 
> unless you know of any tape drives that report them both as 0 but can be 
> partitioned? That is after this:

Re: What partition should the MTMKPART argument specify? Was: Re: st driver doesn't seem to grok LTO partitioning

2016-01-21 Thread Laurence Oberman

Given what we see at customers, I am leaning towards the SCSI level <=2 to
ensure the older LTO5s are supported.
The newer ones should be backwards compatible.
I may have an older LTO5 showing up that won't need a F/W update to work, and
I will be able to add a "tested by" once I get it.

But let's see what the others have to say.

Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services

- Original Message -
From: "Kai Mäkisara (Kolumbus)" <kai.makis...@kolumbus.fi>
To: "Shane M Seymour" <shane.seym...@hpe.com>
Cc: "Laurence Oberman" <lober...@redhat.com>, "Emmanuel Florac" 
<eflo...@intellique.com>, "Laurence Oberman" <oberma...@gmail.com>, 
linux-scsi@vger.kernel.org
Sent: Thursday, January 21, 2016 3:58:46 PM
Subject: What partition should the MTMKPART argument specify? Was: Re: st 
driver doesn't seem to grok LTO partitioning

> On 15.1.2016, at 2.21, Seymour, Shane M <shane.seym...@hpe.com> wrote:
> 
> Unfortunately I'm unable to lay my hands on an LTO 5 tape drive so I'm not 
> able to test that it works either. If it helps at all I can test in the 
> negative and make sure that for an LTO 3 drive it fails gracefully but that's 
> about it at the moment.

Thanks to all testers and those who attempted to test. The latest patch 
applies the standard quite strictly and I think it should work with most 
drives. The implementation can be fixed later if problems are found.

However, before making the final patch, we should decide which partition the 
specified size should apply to. For the SCSI level <=2 it applies to partition 
1. For other drives we may have some freedom to “tune” the definition. The size 
should apply to the partition the users expect it to apply. 

The current documentation says "the argument gives in megabytes the size of 
partition 1 that is physically the first partition of the tape". The 
documentation I have found for current drives (HP and IBM LTO, IBM 3592, 
Storagetek T1000) all number the partitions sequentially from the start of the 
tape. The access time for any partition is probably about the same when 
wrapwise partitioning is used. It does matter with linear partitioning. 
Unfortunately, the standards leave the numbering to the implementor.

Partitioning with two partitions is used to store an index in a small partition
and to use the rest of the tape for data. In this case, it is probably natural to
specify the size of the index. The LTFS definition supports an index in any
partition. The open source code I have seen seems to default to an index in
partition 0.

The HP and IBM LTO default partitioning (FDP=1) specifies two wraps (minimum) 
to partition 1 and the rest to 0.

There seem to be a lot of arguments supporting both possible choices. Should we 
use the existing definition (1) or change it for the drives supporting SCSI 
level >= 3 (or supporting FORMAT MEDIUM)? The definition can’t be changed 
later. This is why we should make a good decision.

Opinions?

Thanks,
Kai

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] scsi_sysfs: Fix typo in is_bin_visible()

2016-03-10 Thread Laurence Oberman

Reviewed-by: Laurence Oberman <lober...@redhat.com>

Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services

- Original Message -
From: "Ewan Milne" <emi...@redhat.com>
To: "Hannes Reinecke" <h...@suse.de>
Cc: "Martin K. Petersen" <martin.peter...@oracle.com>, "Christoph Hellwig" 
<h...@lst.de>, "Johannes Thumshirn" <jthumsh...@suse.com>, "James Bottomley" 
<james.bottom...@hansenpartnership.com>, linux-scsi@vger.kernel.org, "Hannes 
Reinecke" <h...@suse.com>
Sent: Thursday, March 10, 2016 10:25:08 AM
Subject: Re: [PATCH] scsi_sysfs: Fix typo in is_bin_visible()

On Thu, 2016-03-10 at 11:25 +0100, Hannes Reinecke wrote:
> The test for the existence vpd_pg83 is inverted.
> 
> Fixes: 7e47976bcff ("scsi_sysfs: add 'is_bin_visible' callback")
> Signed-off-by: Hannes Reinecke <h...@suse.com>
> ---
>  drivers/scsi/scsi_sysfs.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
> index 58ac9c1..d805d55 100644
> --- a/drivers/scsi/scsi_sysfs.c
> +++ b/drivers/scsi/scsi_sysfs.c
> @@ -1105,7 +1105,7 @@ static umode_t scsi_sdev_bin_attr_is_visible(struct 
> kobject *kobj,
>   if (attr == &dev_attr_vpd_pg80 && !sdev->vpd_pg80)
>   return 0;
>  
> - if (attr == &dev_attr_vpd_pg83 && sdev->vpd_pg83)
> + if (attr == &dev_attr_vpd_pg83 && !sdev->vpd_pg83)
>   return 0;
>  
>   return S_IRUGO;

Reviewed-by: Ewan D. Milne <emi...@redhat.com>
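
The intent of the one-character fix can be illustrated with a small userspace model. This is only a sketch: struct sdev_model, enum vpd_attr and vpd_attr_mode() are hypothetical stand-ins rather than the kernel's own types, and the read-only mode is written out as octal 0444 (the value of S_IRUGO). An attribute is hidden (mode 0) exactly when its backing VPD page is absent.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the two VPD page pointers in struct scsi_device. */
struct sdev_model {
	const unsigned char *vpd_pg80;	/* NULL when page 0x80 was not retrieved */
	const unsigned char *vpd_pg83;	/* NULL when page 0x83 was not retrieved */
};

/* Hypothetical identifiers for the two binary sysfs attributes. */
enum vpd_attr { ATTR_VPD_PG80, ATTR_VPD_PG83 };

/* Return the sysfs mode for an attribute: 0 hides it, 0444 makes it readable. */
static unsigned int vpd_attr_mode(const struct sdev_model *sdev,
				  enum vpd_attr attr)
{
	assert(sdev != NULL);

	if (attr == ATTR_VPD_PG80 && !sdev->vpd_pg80)
		return 0;

	/* Corrected test: hide the attribute only when the page is missing. */
	if (attr == ATTR_VPD_PG83 && !sdev->vpd_pg83)
		return 0;

	return 0444;
}
```

With the inverted test, a device that does have page 0x83 would have had its attribute hidden, and a device without it would have exposed an attribute with nothing behind it.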



Re: [PATCH v2 0/2] Update SCSI target removal path

2016-03-30 Thread Laurence Oberman

I can test this next week.
I can test pre and then post patch.
Will update when it's validated.

Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services

- Original Message -
From: "Ewan D. Milne" <emi...@redhat.com>
To: "jthumshirn" <jthumsh...@suse.de>
Cc: "Martin K. Petersen" <martin.peter...@oracle.com>, "James E.J. Bottomley" 
<j...@linux.vnet.ibm.com>, "Hannes Reinecke" <h...@suse.de>, "Christoph 
Hellwig" <h...@infradead.org>, linux-scsi@vger.kernel.org
Sent: Wednesday, March 30, 2016 12:30:27 PM
Subject: Re: [PATCH v2 0/2] Update SCSI target removal path

On Wed, 2016-03-30 at 13:01 +0200, jthumshirn wrote:
> [+Cc linux-scsi back]
> On 2016-03-30 02:59, Martin K. Petersen wrote:
> >>>>>> "Ewan" == Ewan D Milne <emi...@redhat.com> writes:
> > 
> > Ewan> I would probably use an APCON or other physical layer switch to
> > Ewan> drop the FC link and test the error recovery/device loss.  But we
> > Ewan> don't have one.
> > 
> > They go for a couple of hundred bucks on eBay. I had one of these in a
> > previous life and it was awesome.
> 
> Though this would work (like any other FC/FCoE switch) I thought more of 
> a simulated environment. scsi_debug, Qemu, something like that.

You might be able to do something with that but it doesn't quite do the
same thing in terms of how a HBA/driver will react to a fault.  You also
need to be sure there is enough entropy in the timing of when the target
goes away.  (Otherwise, you could just rmmod scsi_debug...)

> 
> I've had a look at scsi_debug but it seems like it's quite some 
> refactoring needed to get it to a point where one can simulate target 
> errors. I kinda like the idea of having something in 
> tools/testing/selftest but it'll probably end up with a FC switch.
> 
> Johannes



Cant write to max_sectors_kb on 4.5.0 SRP target

2016-04-07 Thread Laurence Oberman

ation descriptor number 5, descriptor length: 8
designator_type: Logical unit group,  code_set: Binary
associated with the addressed logical unit
  Logical unit group: 0x0
  Designation descriptor number 6, descriptor length: 48
transport: SCSI RDMA Protocol (SRP)
designator_type: SCSI name string,  code_set: UTF-8
associated with the target port
  SCSI name string:
  0xfe807cfe900300726e4e,t,0x0001
  Designation descriptor number 7, descriptor length: 40
transport: SCSI RDMA Protocol (SRP)
designator_type: SCSI name string,  code_set: UTF-8
associated with the target device that contains addressed lu
  SCSI name string:
  0xfe807cfe900300726e4e



Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services


Re: Cant write to max_sectors_kb on 4.5.0 SRP target

2016-04-11 Thread Laurence Oberman

As a follow up to this issue.

I looked at modifying the LIO target code to allow a larger max_sectors_kb 
exported to the initiator for the nvme devices but had some issues.
In the end I created 15 fileio devices using 200GB of ramdisk and exported 
those so I could test 4MB I/O from the initiator.

These allow the 4MB setting on the upstream kernel.

[root@srptest ~]# sg_inq -p 0xb0 /dev/sdk
VPD INQUIRY: Block limits page (SBC)
  Maximum compare and write length: 1 blocks
  Optimal transfer length granularity: 1 blocks
  Maximum transfer length: 16384 blocks
  Optimal transfer length: 16384 blocks
  Maximum prefetch, xdread, xdwrite transfer length: 0 blocks
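
For reference, the block counts in the sg_inq output can be converted to the KB units used by max_sectors_kb. The helper below is a minimal sketch; blocks_to_kb() is a hypothetical name, and the 512-byte logical block size used in the usage note is an assumption, since the block size is not stated in this thread.

```c
#include <assert.h>

/*
 * Convert a transfer length in logical blocks to KB.
 * block_size is in bytes; 512 is only an assumption here.
 */
static unsigned long blocks_to_kb(unsigned long blocks,
				  unsigned long block_size)
{
	assert(block_size > 0);
	return blocks * block_size / 1024;
}
```

Under that assumption, blocks_to_kb(16384, 512) gives 8192, so a 4096 KB max_sectors_kb fits below the reported maximum. A LUN reporting 256 blocks, like the block-backed ones discussed elsewhere in this thread, works out to 128 KB.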

The sg_map issues I am having on the RHEL kernel are likely due to the "proper"
max sector size being ignored.
I am now testing the latest upstream, 4.5.0, with all the sg-related patches to
see if that's stable.

Thanks

Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services

- Original Message -----
From: "Laurence Oberman" <lober...@redhat.com>
To: emi...@redhat.com
Cc: "Martin K. Petersen" <martin.peter...@oracle.com>, "linux-scsi" 
<linux-scsi@vger.kernel.org>, linux-r...@vger.kernel.org
Sent: Friday, April 8, 2016 9:11:19 AM
Subject: Re: Cant write to max_sectors_kb on 4.5.0  SRP target

Hi Ewan, 

OK, that makes sense.
After everybody's responses, I suspected that RHEL was somehow ignoring the
array-imposed limit here.
I actually got lucky, because I needed to be able to issue 4MB I/Os to
reproduce the failures seen at the customer on the initiator side.

Looking at the target-LIO array now, it's clamped to 1MB I/O sizes, which
makes sense.
I really was not focusing on the array at the time, expecting it might chop
the I/O up as many do.

Knowing what's up now I can continue to test and figure out what patches I need 
to pull in to SRP on RHEL to make progress.

Thank you to all that responded.

Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services

- Original Message -----
From: "Ewan D. Milne" <emi...@redhat.com>
To: "Laurence Oberman" <lober...@redhat.com>
Cc: "Martin K. Petersen" <martin.peter...@oracle.com>, "linux-scsi" 
<linux-scsi@vger.kernel.org>, linux-r...@vger.kernel.org
Sent: Friday, April 8, 2016 8:39:52 AM
Subject: Re: Cant write to max_sectors_kb on 4.5.0  SRP target

The version of RHEL you are using does not have:

commit ca369d51b3e1649be4a72addd6d6a168cfb3f537
Author: Martin K. Petersen <martin.peter...@oracle.com>
Date:   Fri Nov 13 16:46:48 2015 -0500

block/sd: Fix device-imposed transfer length limits

(which will be added during the next update).

In the upstream kernel queue_max_sectors_store() does not
permit you to set a value larger than the device-imposed
limit.  This value, stored in q->limits.max_dev_sectors,
is not visible via the block queue sysfs interface.

The code that sets q->limits.max_sectors and q->limits.io_opt
in sd.c does not take the device limit into account, but
the sysfs code to change max_sectors ("max_sectors_kb") does.

So there are a couple of problems here, one is that RHEL
is not clamping to the device limit, and the other one is
that neither RHEL nor upstream kernels take the device limit
into account when setting q->limits.io_opt.  This only seems
to be a problem for you because your target is reporting
an optimal I/O size in VPD page B0 that is *larger* than
the reported maximum I/O size.

The target is clearly reporting inconsistent data, the
question is whether we should change the code to clamp the
optimal I/O size, or whether we should assume the value
the target is reporting is wrong.

So the question is:  does the target actually process
requests that are larger than the VPD page B0 reported
maximum size?  If so, maybe we should just issue a warning
message rather than reducing the optimal I/O size.

-Ewan

On Fri, 2016-04-08 at 04:31 -0400, Laurence Oberman wrote:
> Hello Martin
> 
> Yes, Ewan also noticed that.
> 
> This started out as me testing the SRP stack on RHEL 7.2 and baselining 
> against upstream.
> We have a customer that requires 4MB I/O.
> I bumped into a number of SRP issues including sg_map failures so started 
> reviewing upstream changes to the SRP code and patches.
> 
> The RHEL kernel is ignoring this so perhaps we have an issue on our side 
> (RHEL kernel) and upstream is behaving as it should.
> 
> What is interesting is that I cannot change the max_sectors_kb at all on the 
> upstream for the SRP LUNS.
> 
> Here is an HP SmartArray LUN
> 
> [root@srptest ~]#  sg_inq --p 0xb0 /dev/sda
> VPD INQUIRY: page=0xb0
> inquiry: field in cdb illegal (page not supported)    Known that its 
> not supported
> 
> However
> 
> /sys/block/sda/queue
> 
> [root@s

Re: Cant write to max_sectors_kb on 4.5.0 SRP target

2016-04-08 Thread Laurence Oberman

Hi Ewan, 

OK, that makes sense.
After everybody's responses, I suspected that RHEL was somehow ignoring the
array-imposed limit here.
I actually got lucky, because I needed to be able to issue 4MB I/Os to
reproduce the failures seen at the customer on the initiator side.

Looking at the target-LIO array now, it's clamped to 1MB I/O sizes, which
makes sense.
I really was not focusing on the array at the time, expecting it might chop
the I/O up as many do.

Knowing what's up now I can continue to test and figure out what patches I need 
to pull in to SRP on RHEL to make progress.

Thank you to all that responded.

Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services

- Original Message -
From: "Ewan D. Milne" <emi...@redhat.com>
To: "Laurence Oberman" <lober...@redhat.com>
Cc: "Martin K. Petersen" <martin.peter...@oracle.com>, "linux-scsi" 
<linux-scsi@vger.kernel.org>, linux-r...@vger.kernel.org
Sent: Friday, April 8, 2016 8:39:52 AM
Subject: Re: Cant write to max_sectors_kb on 4.5.0  SRP target

The version of RHEL you are using does not have:

commit ca369d51b3e1649be4a72addd6d6a168cfb3f537
Author: Martin K. Petersen <martin.peter...@oracle.com>
Date:   Fri Nov 13 16:46:48 2015 -0500

block/sd: Fix device-imposed transfer length limits

(which will be added during the next update).

In the upstream kernel queue_max_sectors_store() does not
permit you to set a value larger than the device-imposed
limit.  This value, stored in q->limits.max_dev_sectors,
is not visible via the block queue sysfs interface.

The code that sets q->limits.max_sectors and q->limits.io_opt
in sd.c does not take the device limit into account, but
the sysfs code to change max_sectors ("max_sectors_kb") does.
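
The sysfs check described above can be modeled in userspace roughly as follows. This is a sketch based on the description in this message, not code copied from the kernel: struct limits_model and store_max_sectors_kb() are hypothetical names, all sector counts are in 512-byte units, and the minimum of one page is passed in as page_kb (4 in the usage note, assuming 4 KB pages).

```c
#include <assert.h>
#include <errno.h>

/* Simplified model of the queue limits consulted by the sysfs store path. */
struct limits_model {
	unsigned long max_hw_sectors;	/* HBA/driver limit, 512-byte sectors */
	unsigned long max_dev_sectors;	/* device-imposed limit, 0 if none reported */
	unsigned long max_sectors;	/* current soft limit, 512-byte sectors */
};

/*
 * Try to apply a new max_sectors_kb value.
 * Returns 0 on success or -EINVAL if the request is out of range.
 */
static int store_max_sectors_kb(struct limits_model *lim,
				unsigned long req_kb, unsigned long page_kb)
{
	unsigned long cap_kb = lim->max_hw_sectors >> 1;
	unsigned long dev_kb = lim->max_dev_sectors >> 1;

	assert(page_kb > 0);

	/* A non-zero device limit lowers the cap; zero means "no limit". */
	if (dev_kb != 0 && dev_kb < cap_kb)
		cap_kb = dev_kb;

	if (req_kb > cap_kb || req_kb < page_kb)
		return -EINVAL;

	lim->max_sectors = req_kb << 1;
	return 0;
}
```

With max_hw_sectors = 8192 (4096 KB) and max_dev_sectors = 256 (128 KB), requests for 512 and 256 are rejected with -EINVAL while 128 is accepted, which matches the behaviour reported for the SRP LUNs. With max_dev_sectors = 0, as for a device that does not support VPD page B0, a request for 4096 is accepted.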

So there are a couple of problems here, one is that RHEL
is not clamping to the device limit, and the other one is
that neither RHEL nor upstream kernels take the device limit
into account when setting q->limits.io_opt.  This only seems
to be a problem for you because your target is reporting
an optimal I/O size in VPD page B0 that is *larger* than
the reported maximum I/O size.

The target is clearly reporting inconsistent data, the
question is whether we should change the code to clamp the
optimal I/O size, or whether we should assume the value
the target is reporting is wrong.

So the question is:  does the target actually process
requests that are larger than the VPD page B0 reported
maximum size?  If so, maybe we should just issue a warning
message rather than reducing the optimal I/O size.
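
The "clamp" alternative raised above could look something like the sketch below. It is only an illustration of the idea under discussion, not a proposed patch: struct block_limits_model and usable_opt_xfer_blocks() are hypothetical names, and a maximum of 0 is treated as "not reported". The "warn only" alternative would instead return the optimal value unchanged and log a message when the flag is set.

```c
#include <assert.h>
#include <stddef.h>

/* The two Block Limits (VPD page B0) fields relevant to this discussion. */
struct block_limits_model {
	unsigned int max_xfer_blocks;	/* Maximum transfer length, 0 if not reported */
	unsigned int opt_xfer_blocks;	/* Optimal transfer length */
};

/*
 * Return the optimal transfer length to use, clamped to the reported
 * maximum. *inconsistent is set to 1 when the device reported an optimal
 * length larger than its own maximum, and to 0 otherwise.
 */
static unsigned int usable_opt_xfer_blocks(const struct block_limits_model *bl,
					   int *inconsistent)
{
	assert(bl != NULL && inconsistent != NULL);

	*inconsistent = bl->max_xfer_blocks != 0 &&
			bl->opt_xfer_blocks > bl->max_xfer_blocks;

	return *inconsistent ? bl->max_xfer_blocks : bl->opt_xfer_blocks;
}
```

For the /dev/sdb values quoted in this thread (maximum 256 blocks, optimal 768 blocks), the helper flags the data as inconsistent and returns 256.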

-Ewan

On Fri, 2016-04-08 at 04:31 -0400, Laurence Oberman wrote:
> Hello Martin
> 
> Yes, Ewan also noticed that.
> 
> This started out as me testing the SRP stack on RHEL 7.2 and baselining 
> against upstream.
> We have a customer that requires 4MB I/O.
> I bumped into a number of SRP issues including sg_map failures so started 
> reviewing upstream changes to the SRP code and patches.
> 
> The RHEL kernel is ignoring this so perhaps we have an issue on our side 
> (RHEL kernel) and upstream is behaving as it should.
> 
> What is interesting is that I cannot change the max_sectors_kb at all on the 
> upstream for the SRP LUNS.
> 
> Here is an HP SmartArray LUN
> 
> [root@srptest ~]#  sg_inq --p 0xb0 /dev/sda
> VPD INQUIRY: page=0xb0
> inquiry: field in cdb illegal (page not supported)    Known that its 
> not supported
> 
> However
> 
> /sys/block/sda/queue
> 
> [root@srptest queue]# cat max_hw_sectors_kb max_sectors_kb
> 4096
> 1280
> [root@srptest queue]# echo 4096 > max_sectors_kb
> [root@srptest queue]# cat max_hw_sectors_kb max_sectors_kb
> 4096
> 4096
> 
> On the SRP LUNS I am unable to change to a lower value than  max_sectors_kb 
> unless I change it to 128
> So perhaps the size on the array is the issue here as Nicholas said and the 
> RHEL kernel has a bug and ignores it.
> 
> /sys/block/sdc/queue
> 
> [root@srptest queue]# cat max_hw_sectors_kb max_sectors_kb
> 4096
> 1280
> 
> [root@srptest queue]# echo 512 > max_sectors_kb
> -bash: echo: write error: Invalid argument
> 
> [root@srptest queue]# echo 256 > max_sectors_kb
> -bash: echo: write error: Invalid argument
> 
> 128 works
> [root@srptest queue]# echo 128 > max_sectors_kb
> 
> 
> 
> 
> Laurence Oberman
> Principal Software Maintenance Engineer
> Red Hat Global Support Services
> 
> - Original Message -
> From: "Martin K. Petersen" <martin.peter...@oracle.com>
> To: "Laurence Oberman" <lober...@redhat.com>
> Cc: "linux-scsi" <linux-scsi@vger.kernel.org>, linux-r...@vger.kernel.org
> Sent: Thursday, April 7, 2016 11:00:16 PM
> Subject: Re: Cant writ

Re: Cant write to max_sectors_kb on 4.5.0 SRP target

2016-04-08 Thread Laurence Oberman

vcTim Util
03:56:57 sdc    0   0   0   0  1092608   0  1067  1024  1024   3   2   0   74
03:56:57 dm-4   0   0   0   0  1092608   0  1067  1024  1024   3   2   0   79
03:56:58 sdc    0   0   0   0  1070080   0  1045  1024  1024   3   2   0   73
03:56:58 dm-4   0   0   0   0  1070080   0  1045  1024  1024   3   2   0   78
03:56:59 sdc    0   0   0   0  1101824   0  1076  1024  1024   3   2   0   72
03:56:59 dm-4   0   0   0   0  1101824   0  1076  1024  1024   3   2   0   77


Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services


- Original Message -----
From: "Bart Van Assche" <bart.vanass...@sandisk.com>
To: "Laurence Oberman" <lober...@redhat.com>, "linux-scsi" 
<linux-scsi@vger.kernel.org>, linux-r...@vger.kernel.org
Sent: Thursday, April 7, 2016 10:49:58 PM
Subject: Re: Cant write to max_sectors_kb on 4.5.0 SRP target

On 04/07/16 14:16, Laurence Oberman wrote:
> I have been testing the SRP initiator code to an LIO array here and
> part of the testing requires me to set the max_sectors_kb size to
> get 4k I/O's.
.
Hello Laurence,

Have you already tried to set the max_sect parameter in 
/etc/srp_daemon.conf (assuming you are using srptools >= 1.0.3 for SRP 
login) ? Additionally, writing something like "options ib_srp 
cmd_sg_entries=255" into /etc/modprobe.d/ib_srp.conf will increase the 
maximum SRP transfer size.

Bart.

Re: Cant write to max_sectors_kb on 4.5.0 SRP target

2016-04-08 Thread Laurence Oberman

o- mapped_lun28 ... [lun28 block/block-29 (rw)]
  | | | | o- mapped_lun29 ... [lun29 block/block-30 (rw)]
  | | | o- ib.4f6e72000390fe7c7cfe900300726ed3 ... [Mapped LUNs: 30]
  | | |   o- mapped_lun0 ... [lun0 block/block-1 (rw)]
  | | |   o- mapped_lun1 ... [lun1 block/block-2 (rw)]
  | | |   o- mapped_lun2 ... [lun2 block/block-3 (rw)]
  | | |   o- mapped_lun3 ... [lun3 block/block-4 (rw)]
  | | |   o- mapped_lun4 ... [lun4 block/block-5 (rw)]
..
,,

Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services

- Original Message -
From: "Nicholas A. Bellinger" <n...@linux-iscsi.org>
To: "Laurence Oberman" <lober...@redhat.com>
Cc: "linux-scsi" <linux-scsi@vger.kernel.org>, linux-r...@vger.kernel.org, 
"target-devel" <target-de...@vger.kernel.org>
Sent: Friday, April 8, 2016 1:30:28 AM
Subject: Re: Cant write to max_sectors_kb on 4.5.0  SRP target

Hi Laurence,

On Thu, 2016-04-07 at 17:15 -0400, Laurence Oberman wrote:
> Hello
> 
> I have been testing the SRP initiator code to an LIO array here and
> part of the testing requires me to set the max_sectors_kb size to get
> 4k I/O's.
> This has been due to me having to debug various sg_map issues.
> 
> Linux srptest 4.5.0 #2 SMP Thu Apr 7 16:14:38 EDT 2016 x86_64 x86_64
> x86_64 GNU/Linux
> This kernel has the scan patch from Hannes, as well as the "[PATCH]
> IB/mlx5: Expose correct max_sge_rd limit" patch. 
> However, I also tested with vanilla 4.5.0 and it's the same
> issue.
> 
> For some reason I cannot change the max_sectors_kb size on 4.5.0 here.
> 
> I chatted with Ewan about it as well and he reminded me about Martin's
> changes, so I am wondering if that's playing into this.
> 
> Take /dev/sdb as an example
> 
> [root@srptest queue]# sg_inq --p 0xb0 /dev/sdb
> VPD INQUIRY: Block limits page (SBC)
>   Maximum compare and write length: 1 blocks
>   Optimal transfer length granularity: 256 blocks
>   Maximum transfer length: 256 blocks
>   Optimal transfer length: 768 blocks
>   Maximum prefetch, xdread, xdwrite transfer length: 0 blocks
> 

Just curious what target backend this is with..?

Specifically the optimal transfer length granularity and optimal
transfer length may be reported by underlying backend device (eg:
IBLOCK) in spc_emulate_evpd_b0(). 

What does 'head /sys/kernel/config/target/core/$HBA/$DEV/attrib/*'
of the backend device in question look like..?


Re: [PATCH 1/2] scsi: add a max_segment_size limitation to struct Scsi_Host

2016-04-11 Thread Laurence Oberman

Thanks Bart
Good catch, I completely missed it.

Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services

- Original Message -
From: "Bart Van Assche" <bart.vanass...@sandisk.com>
To: "Christoph Hellwig" <h...@lst.de>, linux-r...@vger.kernel.org, 
linux-scsi@vger.kernel.org
Sent: Monday, April 11, 2016 7:32:16 PM
Subject: Re: [PATCH 1/2] scsi: add a max_segment_size limitation to struct 
Scsi_Host

On 04/11/2016 03:47 PM, Christoph Hellwig wrote:
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index 8106515..04c660d 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -2120,7 +2120,8 @@ static void __scsi_init_queue(struct Scsi_Host *shost, 
> struct request_queue *q)
>   blk_queue_segment_boundary(q, shost->dma_boundary);
>   dma_set_seg_boundary(dev, shost->dma_boundary);
>
> - blk_queue_max_segment_size(q, dma_get_max_seg_size(dev));
> + blk_queue_max_segment_size(q,
> + min(shost->max_segment_size, dma_get_max_seg_size(dev)));
>
>   if (!shost->use_clustering)
>   q->limits.cluster = 0;

Hello Christoph,

Since Scsi_Host.max_segment_size is initialized to zero, shouldn't min() 
be changed into min_not_zero()?
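
The difference matters because a plain minimum treats an unset (zero) host limit as the smallest value. The sketch below uses userspace stand-ins restricted to unsigned int; min_uint(), min_not_zero_uint() and effective_max_segment_size() are hypothetical names, not the kernel macros, and the non-zero DMA limit is an assumption made for the illustration.

```c
#include <assert.h>

/* Plain minimum: a zero argument always wins. */
static unsigned int min_uint(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/* Minimum that treats zero as "not set" and ignores it. */
static unsigned int min_not_zero_uint(unsigned int a, unsigned int b)
{
	if (a == 0)
		return b;
	if (b == 0)
		return a;
	return min_uint(a, b);
}

/*
 * host_limit models Scsi_Host.max_segment_size (0 when the LLD left it
 * unset); dma_limit models the DMA layer's limit, assumed non-zero here.
 */
static unsigned int effective_max_segment_size(unsigned int host_limit,
					       unsigned int dma_limit)
{
	assert(dma_limit != 0);
	return min_not_zero_uint(host_limit, dma_limit);
}
```

With an unset host limit and a DMA limit of 65536 (an example value), min_uint(0, 65536) yields 0, whereas effective_max_segment_size(0, 65536) yields 65536. A host that does set a limit, say 4096, still gets the smaller of the two.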

Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 1/2] scsi: add a max_segment_size limitation to struct Scsi_Host

2016-04-12 Thread Laurence Oberman

Modified patch to use min_not_zero()
Ran a number of tests overnight on F/C, SCSI/SAS and SRP (RDMA) and no issues 
found.

Tested-by: Laurence Oberman <lober...@redhat.com>

Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services

- Original Message -
From: "Laurence Oberman" <lober...@redhat.com>
To: "Bart Van Assche" <bart.vanass...@sandisk.com>
Cc: "Christoph Hellwig" <h...@lst.de>, linux-r...@vger.kernel.org, 
linux-scsi@vger.kernel.org
Sent: Monday, April 11, 2016 7:44:24 PM
Subject: Re: [PATCH 1/2] scsi: add a max_segment_size limitation to struct 
Scsi_Host

Thanks Bart
Good catch, I completely missed it.

Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services

- Original Message -
From: "Bart Van Assche" <bart.vanass...@sandisk.com>
To: "Christoph Hellwig" <h...@lst.de>, linux-r...@vger.kernel.org, 
linux-scsi@vger.kernel.org
Sent: Monday, April 11, 2016 7:32:16 PM
Subject: Re: [PATCH 1/2] scsi: add a max_segment_size limitation to struct 
Scsi_Host

On 04/11/2016 03:47 PM, Christoph Hellwig wrote:
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index 8106515..04c660d 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -2120,7 +2120,8 @@ static void __scsi_init_queue(struct Scsi_Host *shost, 
> struct request_queue *q)
>   blk_queue_segment_boundary(q, shost->dma_boundary);
>   dma_set_seg_boundary(dev, shost->dma_boundary);
>
> - blk_queue_max_segment_size(q, dma_get_max_seg_size(dev));
> + blk_queue_max_segment_size(q,
> + min(shost->max_segment_size, dma_get_max_seg_size(dev)));
>
>   if (!shost->use_clustering)
>   q->limits.cluster = 0;

Hello Christoph,

Since Scsi_Host.max_segment_size is initialized to zero, shouldn't min() 
be changed into min_not_zero()?

Bart.

Re: [PATCH 1/2] scsi: add a max_segment_size limitation to struct Scsi_Host

2016-04-12 Thread Laurence Oberman

Other than adding the patch and rebuilding the kernel and testing regular 
stuff, which I had to do anyway, that was the extent of testing.
I did not see where it was used to be honest other than adding the structure 
member.
I wanted to test the simple change because it was in scsi_lib.c which has many 
dependencies of course.

Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services

- Original Message -
From: "Bart Van Assche" <bart.vanass...@sandisk.com>
To: "Christoph Hellwig" <h...@lst.de>, linux-r...@vger.kernel.org, 
linux-scsi@vger.kernel.org
Sent: Tuesday, April 12, 2016 11:19:20 AM
Subject: Re: [PATCH 1/2] scsi: add a max_segment_size limitation to struct 
Scsi_Host

On 04/12/2016 07:13 AM, Christoph Hellwig wrote:
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index 8106515..ad79372 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -2120,7 +2120,8 @@ static void __scsi_init_queue(struct Scsi_Host *shost, 
> struct request_queue *q)
>   blk_queue_segment_boundary(q, shost->dma_boundary);
>   dma_set_seg_boundary(dev, shost->dma_boundary);
>
> - blk_queue_max_segment_size(q, dma_get_max_seg_size(dev));
> + blk_queue_max_segment_size(q, min_not_zero(shost->max_segment_size,
> +dma_get_max_seg_size(dev)));
>
>   if (!shost->use_clustering)
>   q->limits.cluster = 0;
> diff --git a/include/scsi/scsi_host.h b/include/scsi/scsi_host.h
> index fcfa3d7..f11d3fe 100644
> --- a/include/scsi/scsi_host.h
> +++ b/include/scsi/scsi_host.h
> @@ -621,6 +621,7 @@ struct Scsi_Host {
>   short unsigned int sg_tablesize;
>   short unsigned int sg_prot_tablesize;
>   unsigned int max_sectors;
> + unsigned int max_segment_size;
>   unsigned long dma_boundary;
>   /*
>* In scsi-mq mode, the number of hardware queues supported by the LLD.

Hello Christoph,

The value zero has another meaning for Scsi_Host.max_segment_size than 
for queue_limits.max_segment_size. Shouldn't that be documented somewhere?

Thanks,

Bart.

Re: [PATCH] drivers/scsi/fnic/fnic_scsi.c: Deinline fnic_queue_abort_io_req, save 1792 bytes

2016-04-08 Thread Laurence Oberman

Simple change, looks fine to me.

Reviewed-by: Laurence Oberman <lober...@redhat.com>

Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services

- Original Message -
From: "Denys Vlasenko" <dvlas...@redhat.com>
To: "James Bottomley" <james.bottom...@hansenpartnership.com>
Cc: "Denys Vlasenko" <dvlas...@redhat.com>, "Hiral Patel" <hiral...@cisco.com>, 
"Suma Ramars" <sram...@cisco.com>, "Brian Uchino" <buch...@cisco.com>, 
linux-scsi@vger.kernel.org, linux-ker...@vger.kernel.org
Sent: Friday, April 8, 2016 2:58:43 PM
Subject: [PATCH] drivers/scsi/fnic/fnic_scsi.c: Deinline 
fnic_queue_abort_io_req, save 1792 bytes

This function compiles to 511 bytes of machine code.

Abort commands are not time-critical at all.

Signed-off-by: Denys Vlasenko <dvlas...@redhat.com>
CC: James Bottomley <james.bottom...@hansenpartnership.com>
CC: Hiral Patel <hiral...@cisco.com>
CC: Suma Ramars <sram...@cisco.com>
CC: Brian Uchino <buch...@cisco.com>
CC: linux-scsi@vger.kernel.org
CC: linux-ker...@vger.kernel.org
---
 drivers/scsi/fnic/fnic_scsi.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/fnic/fnic_scsi.c b/drivers/scsi/fnic/fnic_scsi.c
index 266b909..0a3edee 100644
--- a/drivers/scsi/fnic/fnic_scsi.c
+++ b/drivers/scsi/fnic/fnic_scsi.c
@@ -1435,7 +1435,7 @@ wq_copy_cleanup_scsi_cmd:
 	}
 }
 
-static inline int fnic_queue_abort_io_req(struct fnic *fnic, int tag,
+static int fnic_queue_abort_io_req(struct fnic *fnic, int tag,
 					  u32 task_req, u8 *fc_lun,
 					  struct fnic_io_req *io_req)
 {
-- 
2.1.0


Re: [PATCH 1/2] scsi: add a max_segment_size limitation to struct Scsi_Host

2016-04-11 Thread Laurence Oberman

This looks fine to me.
I am pulling this in to my SRP initiator and target testing ongoing at the 
moment so will be testing.
Up to now this has likely not affected me but I am pulling in all RDMA patches 
to test.

Reviewed-by: Laurence Oberman <lober...@redhat.com>

Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services

- Original Message -
From: "Christoph Hellwig" <h...@lst.de>
To: linux-r...@vger.kernel.org, linux-scsi@vger.kernel.org
Sent: Monday, April 11, 2016 6:47:25 PM
Subject: [PATCH 1/2] scsi: add a max_segment_size limitation to struct Scsi_Host

RDMA drivers need segments that aren't larger than a single HCA page
for memory registrations to work properly, so wire up this limitation
in the host.

While we could just call blk_queue_max_segment_size from ->slave_configure,
that would override the global limit based on the DMA device, so let's do
it the traditional way by adding a field to the Scsi_Host structure.

Signed-off-by: Christoph Hellwig <h...@lst.de>
---
 drivers/scsi/scsi_lib.c  | 3 ++-
 include/scsi/scsi_host.h | 1 +
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 8106515..04c660d 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -2120,7 +2120,8 @@ static void __scsi_init_queue(struct Scsi_Host *shost, struct request_queue *q)
 	blk_queue_segment_boundary(q, shost->dma_boundary);
 	dma_set_seg_boundary(dev, shost->dma_boundary);
 
-	blk_queue_max_segment_size(q, dma_get_max_seg_size(dev));
+	blk_queue_max_segment_size(q,
+		min(shost->max_segment_size, dma_get_max_seg_size(dev)));
 
 	if (!shost->use_clustering)
 		q->limits.cluster = 0;
diff --git a/include/scsi/scsi_host.h b/include/scsi/scsi_host.h
index fcfa3d7..f11d3fe 100644
--- a/include/scsi/scsi_host.h
+++ b/include/scsi/scsi_host.h
@@ -621,6 +621,7 @@ struct Scsi_Host {
short unsigned int sg_tablesize;
short unsigned int sg_prot_tablesize;
unsigned int max_sectors;
+   unsigned int max_segment_size;
unsigned long dma_boundary;
/*
 * In scsi-mq mode, the number of hardware queues supported by the LLD.
-- 
2.1.4

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCHv3] scsi: disable automatic target scan

2016-03-19 Thread Laurence Oberman

Hi Hannes,

Please share those dracut patches because I want to test this patch series.
Which kernel are the scan patches diffed against?

Thanks 

Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services

- Original Message -
From: "Hannes Reinecke" <h...@suse.de>
To: "Bart Van Assche" <bart.vanass...@sandisk.com>, "Martin K. Petersen" 
<martin.peter...@oracle.com>
Cc: "Christoph Hellwig" <h...@lst.de>, "James Bottomley" 
<james.bottom...@hansenpartnership.com>, linux-scsi@vger.kernel.org
Sent: Saturday, March 19, 2016 11:18:09 AM
Subject: Re: [PATCHv3] scsi: disable automatic target scan

On 03/18/2016 10:56 PM, Bart Van Assche wrote:
> On 03/17/2016 12:39 AM, Hannes Reinecke wrote:
>> On larger installations it is useful to disable automatic LUN
>> scanning, and only add the required LUNs via udev rules.
>> This can speed up bootup dramatically.
>>
>> This patch introduces a new scan module parameter value 'manual',
>> which works like 'none', but can be overridden by setting the 'rescan'
>> value from scsi_scan_target to 'SCSI_SCAN_MANUAL'.
>> And it updates all relevant callers to set the 'rescan' value
>> to 'SCSI_SCAN_MANUAL' if invoked via the 'scan' option in sysfs.
> 
> Hello Hannes,
> 
> Will setting scsi_scan_type to 'manual' allow a system to boot from a
> SCSI disk? If not, are there alternatives to this approach? Would it be
> a valid alternative to e.g. introduce a new threshold parameter such
> that only LUN numbers below this threshold are scanned during boot?
> 
I have a patch for dracut, which will generate udev rules for all
devices required for mounting the root fs.
Once the system is booted properly I've got another patch for systemd
which switches back to 'normal' scanning (ie by writing 'sync' into
/sys/module/scsi_mod/parameters/scan) and rescans all scsi hosts.

With that there's no need to have any arbitrary limits; only the
necessary devices are enabled during boot.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke   zSeries & Storage
h...@suse.de  +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)
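
The 'manual' scan mode described in the quoted patch text can be sketched as a small decision function. This is a userspace illustration only: the enum and function names are assumptions made for the example (only SCSI_SCAN_MANUAL and the 'none', 'manual' and 'sync' values appear in the thread), and the real decision lives inside the SCSI midlayer.

```c
#include <assert.h>
#include <string.h>

/* Assumed names for the 'rescan' argument; only SCSI_SCAN_MANUAL is from the thread. */
enum scan_request {
    SCAN_REQ_AUTOMATIC,  /* hypothetical: scan triggered by the kernel itself */
    SCAN_REQ_MANUAL      /* models SCSI_SCAN_MANUAL: user wrote to 'scan' in sysfs */
};

/* Return 1 if a target scan should proceed for the given scsi_mod.scan value. */
int scan_allowed(const char *scan_type, enum scan_request req)
{
    assert(scan_type != NULL);
    if (strcmp(scan_type, "none") == 0)
        return 0;                       /* never scan */
    if (strcmp(scan_type, "manual") == 0)
        return req == SCAN_REQ_MANUAL;  /* like 'none' unless explicitly requested */
    return 1;                           /* other modes such as 'sync': scan as usual */
}
```

With scsi_mod.scan=manual, only scans requested through sysfs (for example from a udev rule) would get through in this model, which is what lets the boot path enable just the LUNs it needs.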

Re: [PATCH] fnic: move printk()s outside of the critical code section.

2016-03-20 Thread Laurence Oberman

Reviewed-by: Laurence Oberman <lober...@redhat.com>

Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services

- Original Message -
From: "Maurizio Lombardi" <mlomb...@redhat.com>
To: linux-scsi@vger.kernel.org
Cc: hiral...@cisco.com, sram...@cisco.com, buch...@cisco.com, 
j...@linux.vnet.ibm.com
Sent: Wednesday, March 16, 2016 9:44:08 AM
Subject: [PATCH] fnic: move printk()s outside of the critical code section.

This patch moves a printk() outside of the code section where
interrupts are disabled. In some cases a flood of error messages may
cause a kernel panic.
It also removes one of the printk()s because the same error
message was printed twice.

[709686.317197] Kernel panic - not syncing: Watchdog detected hard LOCKUP on 
cpu 12
[709686.317200] CPU: 12 PID: 1963 Comm: systemd-journal Tainted: GF  
O--   3.10.0-229.el7.x86_64 #1
[709686.317201] Hardware name: Cisco Systems Inc UCSB-B200-M3/UCSB-B200-M3, 
BIOS B200M3.2.2.3.6.030620151309 03/06/2015
[709686.317206]  8182b2e8 392722ba 88046fcc5c48 
81603f36
[709686.317209]  88046fcc5cc8 815fd7da 0010 
88046fcc5cd8
[709686.317211]  88046fcc5c78 392722ba 88046fcc5c88 
000c
[709686.317212] Call Trace:
[709686.317221][] dump_stack+0x19/0x1b
[709686.317223]  [] panic+0xd8/0x1e7
[709686.317227]  [] ? 
watchdog_enable_all_cpus.part.2+0x40/0x40
[709686.317229]  [] watchdog_overflow_callback+0xc2/0xd0
[709686.317233]  [] __perf_event_overflow+0xa1/0x250
[709686.317235]  [] perf_event_overflow+0x14/0x20
[709686.317239]  [] intel_pmu_handle_irq+0x1fd/0x410
[709686.317242]  [] ? unmap_kernel_range_noflush+0x11/0x20
[709686.317246]  [] ? ghes_copy_tofrom_phys+0x124/0x210
[709686.317249]  [] perf_event_nmi_handler+0x2b/0x50
[709686.317251]  [] nmi_handle.isra.0+0x69/0xb0
[709686.317252]  [] do_nmi+0xd0/0x340
[709686.317256]  [] end_repeat_nmi+0x1e/0x2e
[709686.317260]  [] ? memcpy+0xd/0x110
[709686.317263]  [] ? memcpy+0xd/0x110
[709686.317265]  [] ? memcpy+0xd/0x110
[709686.317269]  <>  [] ? vgacon_scroll+0x2d7/0x330
[709686.317273]  [] scrup+0xfc/0x110
[709686.317275]  [] lf+0xa0/0xb0
[709686.317278]  [] vt_console_print+0x2d2/0x420
[709686.317283]  [] 
call_console_drivers.constprop.15+0x91/0xf0
[709686.317287]  [] console_unlock+0x3bf/0x400
[709686.317291]  [] vprintk_emit+0x2b6/0x530
[709686.317294]  [] printk_emit+0x44/0x5b
[709686.317297]  [] devkmsg_writev+0x158/0x1d0
[709686.317303]  [] do_sync_readv_writev+0x79/0xd0
[709686.317307]  [] do_readv_writev+0xce/0x260
[709686.317310]  [] ? __sb_start_write+0x58/0x110
[709686.317314]  [] vfs_writev+0x35/0x60
[709686.317318]  [] SyS_writev+0x5c/0xd0
[709686.317322]  [] system_call_fastpath+0x16/0x1b

Signed-off-by: Maurizio Lombardi <mlomb...@redhat.com>
---
 drivers/scsi/fnic/fnic_scsi.c | 13 ++---
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/drivers/scsi/fnic/fnic_scsi.c b/drivers/scsi/fnic/fnic_scsi.c
index 266b909..f3032ca 100644
--- a/drivers/scsi/fnic/fnic_scsi.c
+++ b/drivers/scsi/fnic/fnic_scsi.c
@@ -958,23 +958,22 @@ static void fnic_fcpio_icmnd_cmpl_handler(struct fnic 
*fnic,
case FCPIO_INVALID_PARAM:/* some parameter in request invalid */
case FCPIO_REQ_NOT_SUPPORTED:/* request type is not supported */
default:
-   shost_printk(KERN_ERR, fnic->lport->host, "hdr status = %s\n",
-fnic_fcpio_status_to_str(hdr_status));
sc->result = (DID_ERROR << 16) | icmnd_cmpl->scsi_status;
break;
}
 
-   if (hdr_status != FCPIO_SUCCESS) {
-   atomic64_inc(&fnic_stats->io_stats.io_failures);
-   shost_printk(KERN_ERR, fnic->lport->host, "hdr status = %s\n",
-fnic_fcpio_status_to_str(hdr_status));
-   }
/* Break link with the SCSI command */
CMD_SP(sc) = NULL;
CMD_FLAGS(sc) |= FNIC_IO_DONE;
 
spin_unlock_irqrestore(io_lock, flags);
 
+   if (hdr_status != FCPIO_SUCCESS) {
+   atomic64_inc(&fnic_stats->io_stats.io_failures);
+   shost_printk(KERN_ERR, fnic->lport->host, "hdr status = %s\n",
+fnic_fcpio_status_to_str(hdr_status));
+   }
+
fnic_release_ioreq_buf(fnic, io_req, sc);
 
mempool_free(io_req, fnic->io_req_pool);
-- 
Maurizio Lombardi
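
The idea behind the patch above (do the state change under the lock, do the slow logging after the lock is dropped) can be sketched in userspace C. Everything here is a hypothetical model: a C11 atomic_flag stands in for io_lock with interrupts disabled, snprintf() stands in for shost_printk(), and status 0 stands in for FCPIO_SUCCESS.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdio.h>

/* Hypothetical stand-in for the driver state protected by io_lock. */
struct io_ctx {
    atomic_flag lock;        /* tiny spinlock modelling io_lock */
    int lock_held;           /* bookkeeping so the sketch can check itself */
    int done;                /* models FNIC_IO_DONE */
    atomic_ulong failures;   /* models io_stats.io_failures */
    int logged_under_lock;   /* set if logging ever happens with the lock held */
    char last_msg[64];
};

void io_ctx_init(struct io_ctx *c)
{
    assert(c != NULL);
    atomic_flag_clear(&c->lock);
    atomic_init(&c->failures, 0);
    c->lock_held = 0;
    c->done = 0;
    c->logged_under_lock = 0;
    c->last_msg[0] = '\0';
}

static void ctx_lock(struct io_ctx *c)
{
    while (atomic_flag_test_and_set(&c->lock))
        ;                    /* spin until the flag is free */
    c->lock_held = 1;
}

static void ctx_unlock(struct io_ctx *c)
{
    c->lock_held = 0;
    atomic_flag_clear(&c->lock);
}

/* Slow path: stands in for shost_printk(). */
static void log_status(struct io_ctx *c, int status)
{
    if (c->lock_held)
        c->logged_under_lock = 1;
    snprintf(c->last_msg, sizeof(c->last_msg), "hdr status = %d", status);
}

/* Completion handler shaped like the patched code. */
void complete_io(struct io_ctx *c, int status)
{
    ctx_lock(c);
    c->done = 1;             /* state changes stay inside the critical section */
    ctx_unlock(c);

    if (status != 0) {       /* count and log only after the lock is dropped */
        atomic_fetch_add(&c->failures, 1);
        log_status(c, status);
    }
}
```

A caller would run io_ctx_init() once and then complete_io() per completion; logged_under_lock staying 0 is the property the patch is after.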


Re: [PATCHv3] scsi: disable automatic target scan

2016-03-21 Thread Laurence Oberman

Hello Hannes

Please share the latest scripts and an example of how you are using them.
I have some scripts that you posted last November, but I am sure they have 
changed.
If not, I will modify them as appropriate; just let me know.

I have added the patches and booted the system set to async, so before I boot 
with scsi_mod.scan=manual I want to prepare my test system.
This may be a very useful feature that we would want to include in RHEL, as 
we struggle with large LUN boot configurations all the time.
When you have time, and thanks.

Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services

- Original Message -
From: "Hannes Reinecke" <h...@suse.de>
To: "Bart Van Assche" <bart.vanass...@sandisk.com>, "Martin K. Petersen" 
<martin.peter...@oracle.com>
Cc: "Christoph Hellwig" <h...@lst.de>, "James Bottomley" 
<james.bottom...@hansenpartnership.com>, linux-scsi@vger.kernel.org
Sent: Monday, March 21, 2016 3:15:10 AM
Subject: Re: [PATCHv3] scsi: disable automatic target scan

On 03/21/2016 02:24 AM, Bart Van Assche wrote:
> On 03/19/16 08:18, Hannes Reinecke wrote:
>> On 03/18/2016 10:56 PM, Bart Van Assche wrote:
>>> On 03/17/2016 12:39 AM, Hannes Reinecke wrote:
>>>> On larger installations it is useful to disable automatic LUN
>>>> scanning, and only add the required LUNs via udev rules.
>>>> This can speed up bootup dramatically.
>>>>
>>>> This patch introduces a new scan module parameter value 'manual',
>>>> which works like 'none', but can be overridden by setting the
>>>> 'rescan'
>>>> value from scsi_scan_target to 'SCSI_SCAN_MANUAL'.
>>>> And it updates all relevant callers to set the 'rescan' value
>>>> to 'SCSI_SCAN_MANUAL' if invoked via the 'scan' option in sysfs.
>>>
>>> Hello Hannes,
>>>
>>> Will setting scsi_scan_type to 'manual' allow a system to boot
>>> from a
>>> SCSI disk? If not, are there alternatives to this approach? Would
>>> it be
>>> a valid alternative to e.g. introduce a new threshold parameter such
>>> that only LUN numbers below this threshold are scanned during boot?
>>>
>> I have a patch for dracut, which will generate udev rules for all
>> devices required for mounting the root fs.
>> Once the system is booted properly I've got another patch for systemd
>> which switches back to 'normal' scanning (ie by writing 'sync' into
>> /sys/module/scsi_mod/parameters/scan) and rescan all scsi hosts.
>>
>> With that there's no need to have any arbitrary limits; only the
>> necessary devices are enabled during boot.
> 
> Hello Hannes,
> 
> That sounds like a really interesting approach. Will this approach
> also work if the SCSI host and/or LUN numbers change during a reboot?
> 
It's independent of the SCSI host as it just looks for the rport ID
(FC WWPN, SAS ID, or iSCSI target name). The LUN number, however, is
fixed; the whole point of this exercise is that you want to blank
out individual LUNs behind a given target.
Hence you need to be able to address the LUNs in the first place.

Cheers,

Hannes
-- 
Dr. Hannes ReineckeTeamlead Storage & Networking
h...@suse.de   +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)
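
The matching rule Hannes describes (keyed on the remote port identifier and the LUN, not on the SCSI host number) can be sketched as follows. The struct and function names are hypothetical and exist only for this example; the real matching is done by udev rules generated from the dracut patch, which is not shown in this thread.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical rule: which LUN behind which remote port should be enabled. */
struct lun_rule {
    const char *rport_id;   /* FC WWPN, SAS ID, or iSCSI target name */
    unsigned int lun;
};

/* Hypothetical description of a LUN as seen during boot. */
struct seen_lun {
    int host_no;            /* may change across reboots; deliberately ignored */
    const char *rport_id;
    unsigned int lun;
};

/* Return 1 if the rule applies to the device, regardless of host number. */
int rule_matches(const struct lun_rule *r, const struct seen_lun *d)
{
    assert(r != NULL && d != NULL);
    return strcmp(r->rport_id, d->rport_id) == 0 && r->lun == d->lun;
}
```

Because host_no never takes part in the comparison, a renumbered host still matches, while a changed LUN number does not.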

Re: [PATCHv3] scsi: disable automatic target scan

2016-03-23 Thread Laurence Oberman

Hello

I tested Hannes's scan disable patch (subject above) together with the hpsa 
dracut module patch below.
Because of the way hpsa works, I created a dracut module that forces the scan 
when all scans are manual.
I also tested Hannes's patch with boot-from-SAN via F/C, and validated the patch 
in the subject using Hannes's original dracut lunmask patch.

For Hannes's scan disable patch 
Tested-by: Laurence Oberman <lober...@redhat.com>

linux16 /vmlinuz-4.4.5scan root=/dev/mapper/rhel-root ro crashkernel=512M@64M 
rd.lvm.lv=rhel/root rd.lvm.lv=rhel/swap console=ttyS0,115200n8 
scsi_mod.scan=manual rd.hpsa=0 rdloaddriver=hpsa

The additional HPSA module for dracut needs to be cleaned up and reviewed 
internally here at Red Hat for a separate submission later.
It is included for others who want to test this.
We need this for hpsa, as it is by far the most popular boot controller we 
face.

diff -Nurp modules.d.orig/06hpsa/hpsa.sh modules.d/06hpsa/hpsa.sh
--- modules.d.orig/06hpsa/hpsa.sh   1969-12-31 19:00:00.0 -0500
+++ modules.d/06hpsa/hpsa.sh2016-03-23 21:38:33.157233465 -0400
@@ -0,0 +1,6 @@
+#!/bin/sh
+### hpsa.sh: Called by the parse-hpsa.sh script to create the scan script ###
+### Laurence Oberman lober...@redhat.com
+. /lib/dracut-lib.sh
+### The actual script that scans the hpsa for LUNS 
+/bin/sh /sbin/hpsa_scan.sh
diff -Nurp modules.d.orig/06hpsa/module-setup.sh 
modules.d/06hpsa/module-setup.sh
--- modules.d.orig/06hpsa/module-setup.sh   1969-12-31 19:00:00.0 
-0500
+++ modules.d/06hpsa/module-setup.sh2016-03-23 21:40:36.994767642 -0400
@@ -0,0 +1,14 @@
+#!/bin/sh
+### Test the hpsa driver load with scan ###
+### Laurence Oberman lober...@redhat.com
+### module-setup.sh - Required for every module
+### Standard script invocations required
+check() {
+   return 0
+}
+
+### Install the hpsa.sh in the module directory
+install() {
+   inst_hook cmdline 20 "$moddir/parse-hpsa.sh"
+   inst_simple "$moddir/hpsa.sh" /sbin/hpsa.sh
+}
diff -Nurp modules.d.orig/06hpsa/parse-hpsa.sh modules.d/06hpsa/parse-hpsa.sh
--- modules.d.orig/06hpsa/parse-hpsa.sh 1969-12-31 19:00:00.0 -0500
+++ modules.d/06hpsa/parse-hpsa.sh  2016-03-23 21:42:28.141856121 -0400
@@ -0,0 +1,18 @@
+#!/bin/bash
+### Laurence Oberman lober...@redhat.com
+### parse-hpsa.sh 
+### Parses the rd.hpsa=x to get the host number
+### Using rdloaddriver=hpsa will enforce hpsa becoming scsi0
+
+for p in $(getargs rd.hpsa=); do
+(
+ echo "echo 1 > /sys/class/scsi_host/host$p/rescan" > /sbin/hpsa_scan.sh
+_do_hpsa=1
+)
+done
+
+### Standard way to call the script from udev
+/sbin/initqueue --settled --unique --onetime /bin/sh /sbin/hpsa.sh
+#/bin/sh /sbin/hpsa.sh
+unset _do_hpsa
+
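
What parse-hpsa.sh does with the rd.hpsa= argument can be sketched in C: find the option on the kernel command line and build the sysfs path that the generated scan script writes to. This is an illustration only. The function name is hypothetical, the path format is copied from the script above, and the real module relies on dracut's getargs rather than parsing the command line itself.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Find "rd.hpsa=<N>" in a kernel command line and build the matching
 * sysfs rescan path. Returns 0 on success, -1 if the option is absent
 * or malformed.
 */
int hpsa_rescan_path(const char *cmdline, char *out, size_t outlen)
{
    static const char key[] = "rd.hpsa=";
    const size_t klen = sizeof(key) - 1;
    const char *p = cmdline;

    assert(cmdline != NULL && out != NULL);

    while ((p = strstr(p, key)) != NULL) {
        /* Accept the key only at the start or right after a space. */
        if (p == cmdline || p[-1] == ' ') {
            const char *val = p + klen;
            char *end;
            unsigned long host;
            int n;

            if (!isdigit((unsigned char)*val))
                return -1;
            host = strtoul(val, &end, 10);
            if (*end != ' ' && *end != '\0')
                return -1;
            n = snprintf(out, outlen,
                         "/sys/class/scsi_host/host%lu/rescan", host);
            return (n > 0 && (size_t)n < outlen) ? 0 : -1;
        }
        p += klen;
    }
    return -1;
}
```

For the boot line shown earlier in this message (rd.hpsa=0), the resulting path is the host0 rescan attribute, which is where the generated hpsa_scan.sh echoes 1.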


Test log


[5.591817] HP HPSA Driver (v 3.4.14-0)
[5.593799] hpsa :05:00.0: can't disable ASPM; OS doesn't have ASPM 
control
[5.597423] hpsa :05:00.0: MSI-X capable controller
[5.600181] hpsa :05:00.0: only 16 MSI-X vectors available
[5.602995] hpsa :05:00.0: Logical aborts not supported
[5.606011] hpsa :05:00.0: HP SSD Smart Path aborts not supported
[5.631300] scsi host0: hpsa

[  OK  ] Started dracut pre-udev hook.
 Starting udev Kernel Device Manager...
[  OK  ] Started udev Kernel Device Manager.
 Starting udev Coldplug all Devices...
[5.676569] clocksource: Switched to clocksource tsc
 Mounting Configuration File System...
[  OK  ] Mounted Configuration File System.
[  OK  ] Started udev Coldplug all Devices.
 Starting dracut initqueue hook...
 Starting Show Plymouth Boot Screen...
[  OK  ] Reached target System Initialization.
[5.708890] bnx2: QLogic bnx2 Gigabit Ethernet Driver v2.2.6 (January 29, 
2014)
[  OK  ] Started Show Plymouth Boot Screen.
[5.749275] bnx2 :03:00.0 eth0: Broadcom NetXtreme II BCM5709 1000Base-T 
(C0) PCI Express found at mem f400, IRQ 16, node addr e4:11:5b:b8:ea:6a
[  OK  ] Reached target Paths.
[  OK  ] Reached target Basic System.
[5.828145] bnx2 :03:00.1 eth1: Broadcom NetXtreme II BCM5709 1000Base-T 
(C0) PCI Express found at mem f200, IRQ 17, node addr e4:11:5b:b8:ea:6c
[5.905874] bnx2 :04:00.0 eth2: Broadcom NetXtreme II BCM5709 1000Base-T 
(C0) PCI Express found at mem f800, IRQ 18, node addr e4:11:5b:b8:ea:6e
[5.906632] bnx2 :04:00.1 eth3: Broadcom NetXtreme II BCM5709 1000Base-T 
(C0) PCI Express found at mem f600, IRQ 19, node addr e4:11:5b:b8:ea:70
[6.061847] bnx2 :04:00.1 enp4s0f1: renamed from eth3
[6.098914] mlx5_core :08:00.0: firmware version: 12.14.2036
[6.147046] Emulex LightPulse Fibre Channel SCSI driver 11.0.0.0.
[6.147054] [drm] Initialized drm 1.1.0 20060810
[6.147148] qla2xxx [:00:00.0]-0005: : QLogic Fibre Channel HBA Driver: 
8.07.00.26-k.
[6.147278] qla2xxx [:0e:00.0]-001d: : Found an ISP2432 irq 27 iobase 
0xc900192b8000.
[6.14

Re: [PATCH] mpt3sas - remove unused fw_event_work delayed_work

2016-04-01 Thread Laurence Oberman

Looks fine to me.

Reviewed-by: Laurence Oberman <lober...@redhat.com>

Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services

- Original Message -
From: "Joe Lawrence" <joe.lawre...@stratus.com>
To: linux-scsi@vger.kernel.org
Cc: "Sathya Prakash" <sathya.prak...@broadcom.com>, "Chaitra P B" 
<chaitra.basa...@broadcom.com>, "Suganath Prabu Subramani" 
<suganath-prabu.subram...@broadcom.com>, "Calvin Owens" <calvinow...@fb.com>, 
"Joe Lawrence" <joe.lawre...@stratus.com>
Sent: Friday, April 1, 2016 1:56:29 PM
Subject: [PATCH] mpt3sas - remove unused fw_event_work delayed_work

The driver's fw events are queued up using the fw_event_work's
struct work, not its delayed_work member.  The latter appears to be
unused and may provoke CONFIG_DEBUG_OBJECTS_TIMERS "assert_init not
available" false warnings in _scsih_fw_event_cleanup_queue.  Remove it
and update _scsih_fw_event_cleanup_queue accordingly.

Signed-off-by: Joe Lawrence <joe.lawre...@stratus.com>
---

I think this goes all the way back to the introduction of the mpt3sas
driver.  The previous generation mpt2sas driver uses delayed_work, so
perhaps it was simply copied and pasted into the mpt3sas but never
updated.

 drivers/scsi/mpt3sas/mpt3sas_scsih.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c 
b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
index e0e4920d0fa6..67643602efbc 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
@@ -189,7 +189,6 @@ struct fw_event_work {
struct list_headlist;
struct work_struct  work;
u8  cancel_pending_work;
-   struct delayed_work delayed_work;
 
struct MPT3SAS_ADAPTER *ioc;
u16 device_handle;
@@ -2804,12 +2803,12 @@ _scsih_fw_event_cleanup_queue(struct MPT3SAS_ADAPTER 
*ioc)
/*
 * Wait on the fw_event to complete. If this returns 1, then
 * the event was never executed, and we need a put for the
-* reference the delayed_work had on the fw_event.
+* reference the work had on the fw_event.
 *
 * If it did execute, we wait for it to finish, and the put will
 * happen from _firmware_event_work()
 */
-   if (cancel_delayed_work_sync(&fw_event->delayed_work))
+   if (cancel_work_sync(&fw_event->work))
fw_event_work_put(fw_event);
 
fw_event_work_put(fw_event);
-- 
1.8.3.1
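
The reference counting that the comment in the patch describes can be modelled in a few lines of userspace C. All names here are hypothetical; event_cancel_sync() only imitates the return value of cancel_work_sync() (nonzero when the work was still pending), and nothing is actually run asynchronously.

```c
#include <assert.h>

/* Hypothetical userspace model of a refcounted, queueable event. */
struct fw_event {
    int refcount;
    int pending;    /* 1 while queued and not yet started */
};

void event_get(struct fw_event *e) { e->refcount++; }

void event_put(struct fw_event *e)
{
    assert(e->refcount > 0);
    e->refcount--;
}

/* Queueing takes a reference on behalf of the work item. */
void event_queue(struct fw_event *e)
{
    event_get(e);
    e->pending = 1;
}

/* Worker: runs the event and drops the work item's reference. */
void event_run(struct fw_event *e)
{
    e->pending = 0;
    event_put(e);
}

/* Imitates cancel_work_sync(): returns 1 if the work was still pending. */
int event_cancel_sync(struct fw_event *e)
{
    int was_pending = e->pending;

    e->pending = 0;
    return was_pending;
}

/* Cleanup path, following the shape of the patched code. */
void event_cleanup(struct fw_event *e)
{
    if (event_cancel_sync(e))
        event_put(e);   /* work never ran: drop the reference it held */
    event_put(e);       /* drop the reference held by the cleanup path */
}
```

Whether or not the work ran before cleanup, the count ends at zero in this model, which is the invariant the extra put preserves.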


Re: tcm_qla2xxx Add SCSI command jammer/discard capability to the tcm_qla2xxx module - revision3

2016-04-01 Thread Laurence Oberman

Himanshu

I looked at using the attribute for this, but because of where I have to discard 
the command I don't want to have to fetch the attribute each time in the same 
code path.
It's significant overhead to fetch the attribute value each time: I allow 
dynamic on/off control via the module parameter, so I have to check it on each 
command.
With the module parameter it's a simple compare, and a module parameter is 
globally accessible and imposes virtually no overhead.

Are you OK with me using #ifdef on a CONFIG_TCM_QLA2XXX_DEBUG .config 
parameter, which I will add, to include the module parameter and code only if 
it is set to "yes"?
The default, unless explicitly set, will be no change.

Thanks

Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services
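
The approach proposed in this message (a module parameter compared in the fast path, compiled in only under a debug config option) can be sketched in userspace C. This is a model, not the patch: the #define below is an assumption that stands in for the Kconfig option being enabled, the global variable stands in for the module parameter, and should_discard() is a hypothetical name.

```c
#include <assert.h>

/* Assumption for this sketch: the debug option is enabled at build time. */
#define CONFIG_TCM_QLA2XXX_DEBUG 1

#ifdef CONFIG_TCM_QLA2XXX_DEBUG
/* Models the module parameter: -1 disables jamming, >= 0 names the host. */
int jam_host = -1;
#endif

/* Return 1 if a command arriving on host_no should be discarded. */
int should_discard(int host_no)
{
    assert(host_no >= 0);
#ifdef CONFIG_TCM_QLA2XXX_DEBUG
    return jam_host >= 0 && jam_host == host_no;
#else
    return 0;   /* jammer code not compiled in */
#endif
}
```

The per-command cost in this model is two integer compares, which is the "simple compare" argument made above; without the config option the check disappears entirely.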

- Original Message -
From: "Himanshu Madhani" <himanshu.madh...@qlogic.com>
To: "Nicholas A. Bellinger" <n...@linux-iscsi.org>, "Bart Van Assche" 
<bart.vanass...@sandisk.com>
Cc: "Laurence Oberman" <lober...@redhat.com>, "linux-scsi" 
<linux-scsi@vger.kernel.org>, "target-devel" <target-de...@vger.kernel.org>, 
"Quinn Tran" <quinn.t...@qlogic.com>
Sent: Thursday, March 31, 2016 8:20:56 PM
Subject: Re: tcm_qla2xxx Add SCSI command jammer/discard capability to the 
tcm_qla2xxx module - revision3

Hi Nic, Laurence, 



On 3/30/16, 10:34 PM, "Nicholas A. Bellinger" <n...@linux-iscsi.org> wrote:

>(Adding target-devel + Qlogic target folks)
>
>On Tue, 2016-03-29 at 22:05 -0700, Bart Van Assche wrote:
>> On 03/29/16 07:42, Laurence Oberman wrote:
>> > I have been using this jammer functionality to continue testing the SCSI 
>> > F/C drivers and recovery for over a year now.
>> > Any chance you would agree to ack this so I can get it in now.
>> > I last posted to the list last March and it was not picked up.
>> >
>> > I did look into moving this to upper layers but I find I use it primarily 
>> > for fiber channel target testing.
>> > Attempting to add this functionality to upper layers led to complexities 
>> > and this is very solid.
>> >
>> > This Patch diff against 4.5
>> >
>> > I use target LIO for all my storage array test targets and customer 
>> > problem resolution here at Red Hat.
>> > This patch resulted from a requirement to mimic behavior of an expensive 
>> > hardware jammer for a customer.
>> > I have used this for some time with good success to simulate and reproduce 
>> > latency and slow drain fabric issues and
>> > for testing and validating error handling behavior
>> >   in the Emulex, Qlogic and other F/C drivers.
>> >
>> > Works by checking new parameter jam_host if its >= 0 and matches 
>> > vha->host_no , jamming is enabled when jam_host >=0
>> > If parameter set to -1 (default) no jamming is enabled.
>> 
>> Hello Laurence,
>> 
>> Nic Bellinger is the maintainer of LIO so my recommendation is to ask 
>> Nic first about his opinion (I have CC'd Nic). I'm not sure what Nic 
>> thinks about this but in my opinion such functionality belongs in the 
>> target core instead of in a target driver. But please wait until Nic has 
>> provided his opinion before spending more time on this. The mailing list 
>> for SCSI target patches is target-de...@vger.kernel.org.
>> 
>
>So really it's Himanshu's + Quinn's call if they would like to include
>something like this in mainline.
>
>If so, then I'd prefer to do it with a per tcm_qla2xxx_tpg endpoint
>attribute instead of a new module parameter, and add a new kernel config
>option (CONFIG_TCM_QLA2XXX_DEBUG) to disable (by default) so end users
>don't inadvertently play with it via targetcli + friends.
>

I agree here with Nic. The patch does provide benefit and is a good addition,
but we don’t want to enable it by default.

Laurence, 

Would you be kind enough to rework the patch with the changes Nic suggested and post it?

Thanks, 
Himanshu

[PATCH] tcm_qla2xxx Add SCSI command jammer/discard capability to the tcm_qla2xxx module

2016-04-03 Thread Laurence Oberman

Hi Nicholas

Apologies for the top posting, that was in my haste to correct the prior patch 
that had the typo.
When I investigated the attributes it looked like I would have had to create a 
store and a check function and call the check function each time.
That was my lack of understanding of the functionality.

I also looked at your example and in my case I needed a way to set the 
attribute to a number matching the host#.
When I tested this I was only able to set boolean values of 1 or 0 for the 
attributes and the definition of
tcm_qla2xxx_tpg_attrib_##name##_store validates that only booleans of 1 or 0 
are supported.

However after your email I then realized using a boolean on the endpoints below 
will work here.
Thank you for taking the time to show me, it was very helpful.

sys]# find . -name jam_host
./kernel/config/target/qla2xxx/21:00:00:24:ff:27:8f:ae/tpgt_1/attrib/jam_host
./kernel/config/target/qla2xxx/21:00:00:24:ff:27:8f:af/tpgt_1/attrib/jam_host
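
The per-endpoint boolean check described above relies on going from the generic portal group pointer back to the driver's own structure. A userspace sketch of that container_of() step follows; the structures are trimmed-down, hypothetical stand-ins for the ones in the patch, and the macro is a plain offsetof()-based equivalent of the kernel's.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace equivalent of the kernel's container_of() macro. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical, trimmed-down stand-ins for the structures in the patch. */
struct se_portal_group {
    int dummy;
};

struct tpg_attrib {
    int jam_host;                    /* boolean: 1 = discard commands */
};

struct tcm_tpg {
    struct tpg_attrib tpg_attrib;
    struct se_portal_group se_tpg;   /* embedded, as in tcm_qla2xxx_tpg */
};

/* Return 1 if commands for this portal group should be discarded. */
int tpg_should_jam(struct se_portal_group *se_tpg)
{
    struct tcm_tpg *tpg;

    assert(se_tpg != NULL);
    tpg = container_of(se_tpg, struct tcm_tpg, se_tpg);
    return tpg->tpg_attrib.jam_host != 0;
}
```

Each endpoint listed by the find output carries its own jam_host flag in this model, so jamming one port leaves the other untouched.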

I tested this and here are the patches in the format you require.
Hopefully this new functionality will be useful for others.
I am not set up to email directly from git.

Tested-by: Laurence Oberman <lober...@redhat.com>
Signed-off-by: Laurence Oberman <lober...@redhat.com>
---
 drivers/scsi/qla2xxx/Kconfig   |   11 +++
 drivers/scsi/qla2xxx/tcm_qla2xxx.c |   20 
 drivers/scsi/qla2xxx/tcm_qla2xxx.h |1 +
 3 files changed, 32 insertions(+), 0 deletions(-)

diff --git a/drivers/scsi/qla2xxx/Kconfig b/drivers/scsi/qla2xxx/Kconfig
index 10aa18b..5110fab 100644
--- a/drivers/scsi/qla2xxx/Kconfig
+++ b/drivers/scsi/qla2xxx/Kconfig
@@ -36,3 +36,14 @@ config TCM_QLA2XXX
default n
---help---
Say Y here to enable the TCM_QLA2XXX fabric module for QLogic 24xx+ 
series target mode HBAs
+
+config TCM_QLA2XXX_DEBUG
+   bool "TCM_QLA2XXX fabric module DEBUG mode for QLogic 24xx+ series 
target mode HBAs"
+   depends on SCSI_QLA_FC && TARGET_CORE
+   depends on LIBFC
+   select BTREE
+   default n
+   ---help---
+   Say Y here to enable the TCM_QLA2XXX fabric module DEBUG for QLogic 
24xx+ series target mode HBAs
+   This will include code to enable the SCSI command jammer
+
diff --git a/drivers/scsi/qla2xxx/tcm_qla2xxx.c 
b/drivers/scsi/qla2xxx/tcm_qla2xxx.c
index 1808a01..411a450 100644
--- a/drivers/scsi/qla2xxx/tcm_qla2xxx.c
+++ b/drivers/scsi/qla2xxx/tcm_qla2xxx.c
@@ -457,6 +457,10 @@ static int tcm_qla2xxx_handle_cmd(scsi_qla_host_t *vha, 
struct qla_tgt_cmd *cmd,
struct se_cmd *se_cmd = &cmd->se_cmd;
struct se_session *se_sess;
struct qla_tgt_sess *sess;
+#ifdef CONFIG_TCM_QLA2XXX_DEBUG
+struct se_portal_group *se_tpg;
+struct tcm_qla2xxx_tpg *tpg;
+#endif
int flags = TARGET_SCF_ACK_KREF;
 
if (bidi)
@@ -476,6 +480,15 @@ static int tcm_qla2xxx_handle_cmd(scsi_qla_host_t *vha, 
struct qla_tgt_cmd *cmd,
pr_err("Unable to locate active struct se_session\n");
return -EINVAL;
}
+ 
+#ifdef CONFIG_TCM_QLA2XXX_DEBUG
+   se_tpg = se_sess->se_tpg;
+   tpg = container_of(se_tpg,struct tcm_qla2xxx_tpg, se_tpg);
+   if (unlikely(tpg->tpg_attrib.jam_host)) {
+   /* return, and dont run target_submit_cmd,discarding command */
+return 0;
+   }
+#endif
 
cmd->vha->tgt_counters.qla_core_sbt_cmd++;
return target_submit_cmd(se_cmd, se_sess, cdb, &cmd->sense_buffer[0],
@@ -844,6 +857,9 @@ DEF_QLA_TPG_ATTRIB(cache_dynamic_acls);
 DEF_QLA_TPG_ATTRIB(demo_mode_write_protect);
 DEF_QLA_TPG_ATTRIB(prod_mode_write_protect);
 DEF_QLA_TPG_ATTRIB(demo_mode_login_only);
+#ifdef CONFIG_TCM_QLA2XXX_DEBUG
+DEF_QLA_TPG_ATTRIB(jam_host);
+#endif
 
 static struct configfs_attribute *tcm_qla2xxx_tpg_attrib_attrs[] = {
&tcm_qla2xxx_tpg_attrib_attr_generate_node_acls,
@@ -851,6 +867,9 @@ static struct configfs_attribute 
*tcm_qla2xxx_tpg_attrib_attrs[] = {
&tcm_qla2xxx_tpg_attrib_attr_demo_mode_write_protect,
&tcm_qla2xxx_tpg_attrib_attr_prod_mode_write_protect,
&tcm_qla2xxx_tpg_attrib_attr_demo_mode_login_only,
+#ifdef CONFIG_TCM_QLA2XXX_DEBUG
+&tcm_qla2xxx_tpg_attrib_attr_jam_host,
+#endif
NULL,
 };
 
@@ -1023,6 +1042,7 @@ static struct se_portal_group *tcm_qla2xxx_make_tpg(
tpg->tpg_attrib.demo_mode_write_protect = 1;
tpg->tpg_attrib.cache_dynamic_acls = 1;
tpg->tpg_attrib.demo_mode_login_only = 1;
+   tpg->tpg_attrib.jam_host = 0;
 
ret = core_tpg_register(wwn, &tpg->se_tpg, SCSI_PROTOCOL_FCP);
if (ret < 0) {
diff --git a/drivers/scsi/qla2xxx/tcm_qla2xxx.h 
b/drivers/scsi/qla2xxx/tcm_qla2xxx.h
index 3bbf4cb..37e026a 100644
--- a/drivers/scsi/qla2xxx/tcm_qla2xxx.h
+++ b/drivers/scsi/qla2xxx/tcm_qla2xxx.h
@@ -34,6 +34,7 @@ struct tcm_qla2xxx_tpg_attrib {
int prod_mode_write_protect;
int demo_mode_l

Re: tcm_qla2xxx Add SCSI command jammer/discard capability to the tcm_qla2xxx module - revision3

2016-03-31 Thread Laurence Oberman

Hello Himanshu

Thanks, I will rework and post back to the thread.

Thank you

Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services

- Original Message -
From: "Himanshu Madhani" <himanshu.madh...@qlogic.com>
To: "Nicholas A. Bellinger" <n...@linux-iscsi.org>, "Bart Van Assche" 
<bart.vanass...@sandisk.com>
Cc: "Laurence Oberman" <lober...@redhat.com>, "linux-scsi" 
<linux-scsi@vger.kernel.org>, "target-devel" <target-de...@vger.kernel.org>, 
"Quinn Tran" <quinn.t...@qlogic.com>
Sent: Thursday, March 31, 2016 8:20:56 PM
Subject: Re: tcm_qla2xxx Add SCSI command jammer/discard capability to the 
tcm_qla2xxx module - revision3

Hi Nic, Laurence, 



On 3/30/16, 10:34 PM, "Nicholas A. Bellinger" <n...@linux-iscsi.org> wrote:

>(Adding target-devel + Qlogic target folks)
>
>On Tue, 2016-03-29 at 22:05 -0700, Bart Van Assche wrote:
>> On 03/29/16 07:42, Laurence Oberman wrote:
>> > I have been using this jammer functionality to continue testing the SCSI 
>> > F/C drivers and recovery for over a year now.
>> > Any chance you would agree to ack this so I can get it in now.
>> > I last posted to the list last March and it was not picked up.
>> >
>> > I did look into moving this to upper layers but I find I use it primarily 
>> > for fiber channel target testing.
>> > Attempting to add this functionality to upper layers led to complexities 
>> > and this is very solid.
>> >
>> > This Patch diff against 4.5
>> >
>> > I use target LIO for all my storage array test targets and customer 
>> > problem resolution here at Red Hat.
>> > This patch resulted from a requirement to mimic behavior of an expensive 
>> > hardware jammer for a customer.
>> > I have used this for some time with good success to simulate and reproduce 
>> > latency and slow drain fabric issues and
>> > for testing and validating error handling behavior
>> >   in the Emulex, Qlogic and other F/C drivers.
>> >
>> > Works by checking new parameter jam_host if its >= 0 and matches 
>> > vha->host_no , jamming is enabled when jam_host >=0
>> > If parameter set to -1 (default) no jamming is enabled.
>> 
>> Hello Laurence,
>> 
>> Nic Bellinger is the maintainer of LIO so my recommendation is to ask 
>> Nic first about his opinion (I have CC'd Nic). I'm not sure what Nic 
>> thinks about this but in my opinion such functionality belongs in the 
>> target core instead of in a target driver. But please wait until Nic has 
>> provided his opinion before spending more time on this. The mailing list 
>> for SCSI target patches is target-de...@vger.kernel.org.
>> 
>
>So really it's Himanshu's + Quinn's call if they would like to include
>something like this in mainline.
>
>If so, then I'd prefer to do it with a per tcm_qla2xxx_tpg endpoint
>attribute instead of a new module parameter, and add a new kernel config
>option (CONFIG_TCM_QLA2XXX_DEBUG) to disable (by default) so end users
>don't inadvertently play with it via targetcli + friends.
>

I agree here with Nic. The patch does provide benefit and is a good addition,
but we don’t want to enable it by default.

Laurence, 

Would you be kind enough to rework the patch with the changes Nic suggested and post it?

Thanks, 
Himanshu

Re: tcm_qla2xxx Add SCSI command jammer/discard capability to the tcm_qla2xxx module - revision4

2016-04-02 Thread Laurence Oberman

Hello Himanshu

This patch was reworked to only include the jammer code if the Kconfig option
TCM_QLA2XXX_DEBUG=Y is set.
The default is to not provide this functionality at all.
I looked at using attributes, but this code is in the fastpath and the overhead
of fetching the attribute each time is not a good idea.
Control of this needs to be dynamic, and the module parameter allows a simple
compare in the fastpath.

Patch notes

I use target LIO for all my storage array test targets and customer problem 
resolution here at Red Hat.
This patch resulted from a requirement to mimic behavior of an expensive 
hardware jammer for a customer.
I have used this for some time with good success to simulate and reproduce
latency and slow drain fabric issues, and for testing and validating error
handling behavior in the Emulex, Qlogic and other F/C drivers.

It works by checking a new parameter, jam_host: if it is >= 0 and matches
vha->host_no, commands for that host are discarded.
If the parameter is set to -1 (the default), no jamming is enabled.
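
The jam_host check described above can be sketched as a small user-space
predicate. This is a sketch under assumptions, not the driver code:
demo_jam_host is a hypothetical stand-in for the module parameter, and
the host number is assumed to be an unsigned value, so the sketch tests
for the "disabled" case explicitly before comparing.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the jam_host module parameter.
 * -1 (the default) means that no host is jammed. */
static int demo_jam_host = -1;

/* Return true when a command arriving on host_no should be discarded. */
static bool demo_should_discard(unsigned long host_no)
{
	if (demo_jam_host < 0)
		return false;	/* jammer disabled */

	return host_no == (unsigned long)demo_jam_host;
}
```

Only one host can be selected at a time with this scheme, which keeps
the fastpath cost to a single integer compare.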

Tested-by: Laurence Oberman <lober...@redhat.com>
Signed-off-by: Laurence Oberman <lober...@redhat.com>

diff -Nurp linux-4.5/Documentation/scsi/tcm_qla2xxx.txt 
linux-4.5.new/Documentation/scsi/tcm_qla2xxx.txt
--- linux-4.5/Documentation/scsi/tcm_qla2xxx.txt1969-12-31 
19:00:00.0 -0500
+++ linux-4.5.new/Documentation/scsi/tcm_qla2xxx.txt2016-04-02 
11:36:42.693081232 -0400
@@ -0,0 +1,34 @@
+tcm_qla2xxx jammer parameter usage
+--
+There is now a new module parameter added to the tcm_qla2xxx module
+parm:   jam_host:Host to jam >=0 Enable jammer (int)
+This parameter and accompanying code is only included if the 
+Kconfig parameter TCM_QLA2XXX_DEBUG is set to Y
+By default this jammer code and functionality is disabled
+
+Use this parameter to control the discarding of SCSI commands to a selected
+host.
+This may be useful for testing error handling and simulating slow drain
+and other fabric issues.
+
+Any value >=0 that matches a fc_host # will discard the commands for that host.
+Reset back to -1 to stop the jamming.
+
+Enable host 6 to be jammed
+echo 6 > /sys/module/tcm_qla2xxx/parameters/jam_host
+
+Disable jamming on host 6
+echo -1 > /sys/module/tcm_qla2xxx/parameters/jam_host
+
+Usage example script:
+
+#!/bin/bash
+sleep_time=120  ### Time to jam for
+echo 6 >  /sys/module/tcm_qla2xxx/parameters/jam_host
+host=`cat /sys/module/tcm_qla2xxx/parameters/jam_host`
+echo "We start to discard commands on SCSI host $host"
+logger "Jammer started"
+sleep $sleep_time
+echo -1 >  /sys/module/tcm_qla2xxx/parameters/jam_host
+echo "We stopped the jammer"
+logger "Jammer stopped"
diff -Nurp linux-4.5/drivers/scsi/qla2xxx/Kconfig 
linux-4.5.new/drivers/scsi/qla2xxx/Kconfig
--- linux-4.5/drivers/scsi/qla2xxx/Kconfig  2016-03-14 00:28:54.0 
-0400
+++ linux-4.5.new/drivers/scsi/qla2xxx/Kconfig  2016-04-02 11:31:15.302516676 
-0400
@@ -36,3 +36,13 @@ config TCM_QLA2XXX
default n
---help---
Say Y here to enable the TCM_QLA2XXX fabric module for QLogic 24xx+ 
series target mode HBAs
+
+config TCM_QLA2XXX_DEBUG
+   bool "TCM_QLA2XXX fabric module DEBUG mode for QLogic 24xx+ series 
target mode HBAs"
+   depends on SCSI_QLA_FC && TARGET_CORE
+   depends on LIBFC
+   select BTREE
+   default n
+   ---help---
+   Say Y here to enable the TCM_QLA2XXX fabric module DEBUG for QLogic 
24xx+ series target mode HBAs
+   This will include code to enable the SCSI command jammer
diff -Nurp linux-4.5/drivers/scsi/qla2xxx/tcm_qla2xxx.c 
linux-4.5.new/drivers/scsi/qla2xxx/tcm_qla2xxx.c
--- linux-4.5/drivers/scsi/qla2xxx/tcm_qla2xxx.c2016-03-14 
00:28:54.0 -0400
+++ linux-4.5.new/drivers/scsi/qla2xxx/tcm_qla2xxx.c2016-04-02 
11:32:35.317410249 -0400
@@ -48,6 +48,12 @@
 #include "qla_target.h"
 #include "tcm_qla2xxx.h"
 
+#ifdef TCM_QLA2XXX_DEBUG
+int jam_host = -1;
+module_param(jam_host, int, 0644);
+MODULE_PARM_DESC(jam_host, "Host to jam >=0 Enable jammer");
+#endif
+
 static struct workqueue_struct *tcm_qla2xxx_free_wq;
 static struct workqueue_struct *tcm_qla2xxx_cmd_wq;
 
@@ -477,6 +483,13 @@ static int tcm_qla2xxx_handle_cmd(scsi_q
return -EINVAL;
}
 
+#ifdef TCM_QLA2XXX_DEBUG
+   if (unlikely(vha->host_no == jam_host)) {
+   /* return, and dont run target_submit_cmd,discarding command */
+   return 0;
+   }
+#endif
+
cmd->vha->tgt_counters.qla_core_sbt_cmd++;
return target_submit_cmd(se_cmd, se_sess, cdb, &cmd->sense_buffer[0],
cmd->unpacked_lun, data_length, fcp_task_attr,
@@ -1967,6 +1980,9 @@ static void tcm_qla2xxx_deregister_confi
 static int __init tcm_qla2xxx_init(void)
 {
    int ret;
+#ifdef TCM_QLA2XXX

Re: tcm_qla2xxx Add SCSI command jammer/discard capability to the tcm_qla2xxx module - revision4

2016-04-02 Thread Laurence Oberman

Hello Himanshu

I noticed a typo in the patch I submitted; here is the corrected patch.
Please ignore the prior patch.

I was missing the full CONFIG name in the #ifdef check.

Corrected Patch

[root@localhost home]# linux-4.5/scripts/checkpatch.pl jammer_patch.v4
total: 0 errors, 0 warnings, 81 lines checked

jammer_patch.v4 has no obvious style problems and is ready for submission.


This patch was reworked to only include the jammer code if the Kconfig option
TCM_QLA2XXX_DEBUG=Y is set.
The default is to not provide this functionality at all.
I looked at using attributes, but this code is in the fastpath and the overhead
of fetching the attribute each time is not a good idea.
Control of this needs to be dynamic, and the module parameter allows a simple
compare in the fastpath.

Patch notes

I use target LIO for all my storage array test targets and customer problem 
resolution here at Red Hat.
This patch resulted from a requirement to mimic behavior of an expensive 
hardware jammer for a customer.
I have used this for some time with good success to simulate and reproduce
latency and slow drain fabric issues, and for testing and validating error
handling behavior in the Emulex, Qlogic and other F/C drivers.

It works by checking a new parameter, jam_host: if it is >= 0 and matches
vha->host_no, commands for that host are discarded.
If the parameter is set to -1 (the default), no jamming is enabled.

Tested-by: Laurence Oberman <lober...@redhat.com>
Signed-off-by: Laurence Oberman <lober...@redhat.com>

diff -Nurp linux-4.5/Documentation/scsi/tcm_qla2xxx.txt 
linux-4.5.new/Documentation/scsi/tcm_qla2xxx.txt
--- linux-4.5/Documentation/scsi/tcm_qla2xxx.txt1969-12-31 
19:00:00.0 -0500
+++ linux-4.5.new/Documentation/scsi/tcm_qla2xxx.txt2016-04-02 
11:36:42.693081232 -0400
@@ -0,0 +1,34 @@
+tcm_qla2xxx jammer parameter usage
+--
+There is now a new module parameter added to the tcm_qla2xxx module
+parm:   jam_host:Host to jam >=0 Enable jammer (int)
+This parameter and accompanying code is only included if the
+Kconfig parameter TCM_QLA2XXX_DEBUG is set to Y
+By default this jammer code and functionality is disabled
+
+Use this parameter to control the discarding of SCSI commands to a selected
+host.
+This may be useful for testing error handling and simulating slow drain
+and other fabric issues.
+
+Any value >=0 that matches a fc_host # will discard the commands for that host.
+Reset back to -1 to stop the jamming.
+
+Enable host 6 to be jammed
+echo 6 > /sys/module/tcm_qla2xxx/parameters/jam_host
+
+Disable jamming on host 6
+echo -1 > /sys/module/tcm_qla2xxx/parameters/jam_host
+
+Usage example script:
+
+#!/bin/bash
+sleep_time=120  ### Time to jam for
+echo 6 >  /sys/module/tcm_qla2xxx/parameters/jam_host
+host=`cat /sys/module/tcm_qla2xxx/parameters/jam_host`
+echo "We start to discard commands on SCSI host $host"
+logger "Jammer started"
+sleep $sleep_time
+echo -1 >  /sys/module/tcm_qla2xxx/parameters/jam_host
+echo "We stopped the jammer"
+logger "Jammer stopped"
diff -Nurp linux-4.5/drivers/scsi/qla2xxx/Kconfig 
linux-4.5.new/drivers/scsi/qla2xxx/Kconfig
--- linux-4.5/drivers/scsi/qla2xxx/Kconfig  2016-03-14 00:28:54.0 
-0400
+++ linux-4.5.new/drivers/scsi/qla2xxx/Kconfig  2016-04-02 11:31:15.302516676 
-0400
@@ -36,3 +36,13 @@ config TCM_QLA2XXX
default n
---help---
Say Y here to enable the TCM_QLA2XXX fabric module for QLogic 24xx+ 
series target mode HBAs
+
+config TCM_QLA2XXX_DEBUG
+   bool "TCM_QLA2XXX fabric module DEBUG mode for QLogic 24xx+ series 
target mode HBAs"
+   depends on SCSI_QLA_FC && TARGET_CORE
+   depends on LIBFC
+   select BTREE
+   default n
+   ---help---
+   Say Y here to enable the TCM_QLA2XXX fabric module DEBUG for QLogic 
24xx+ series target mode HBAs
+   This will include code to enable the SCSI command jammer
diff -Nurp linux-4.5/drivers/scsi/qla2xxx/tcm_qla2xxx.c 
linux-4.5.new/drivers/scsi/qla2xxx/tcm_qla2xxx.c
--- linux-4.5/drivers/scsi/qla2xxx/tcm_qla2xxx.c2016-03-14 
00:28:54.0 -0400
+++ linux-4.5.new/drivers/scsi/qla2xxx/tcm_qla2xxx.c2016-04-02 
11:32:35.317410249 -0400
@@ -48,6 +48,12 @@
 #include "qla_target.h"
 #include "tcm_qla2xxx.h"
 
+#ifdef CONFIG_TCM_QLA2XXX_DEBUG
+int jam_host = -1;
+module_param(jam_host, int, 0644);
+MODULE_PARM_DESC(jam_host, "Host to jam >=0 Enable jammer");
+#endif
+
 static struct workqueue_struct *tcm_qla2xxx_free_wq;
 static struct workqueue_struct *tcm_qla2xxx_cmd_wq;
 
@@ -477,6 +483,13 @@ static int tcm_qla2xxx_handle_cmd(scsi_q
return -EINVAL;
}
 
+#ifdef CONFIG_TCM_QLA2XXX_DEBUG
+   if (unlikely(vha->host_no == jam_host)) {
+   /* return, and dont run target_submit_cmd,discarding command */
+   r

Re: [PATCH] tcm_qla2xxx Add SCSI command jammer/discard capability to the tcm_qla2xxx module

2016-04-04 Thread Laurence Oberman

Hello Nicholas

It's fixed now.
Many Thanks.

$ scripts/checkpatch.pl 
0001-tcm_qla2xxx-Add-SCSI-command-jammer-discard-capabili.patch
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#12: 
new file mode 100644

total: 0 errors, 1 warnings, 91 lines checked

0001-tcm_qla2xxx-Add-SCSI-command-jammer-discard-capabili.patch has style 
problems, please review.

NOTE: If any of the errors are false positives, please report
  them to the maintainer, see CHECKPATCH in MAINTAINERS.



Tested-by: Laurence Oberman <lober...@redhat.com>
Signed-off-by: Laurence Oberman <lober...@redhat.com>
---
 Documentation/scsi/tcm_qla2xxx.txt |   22 ++
 drivers/scsi/qla2xxx/Kconfig   |9 +
 drivers/scsi/qla2xxx/tcm_qla2xxx.c |   20 
 drivers/scsi/qla2xxx/tcm_qla2xxx.h |1 +
 4 files changed, 52 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/scsi/tcm_qla2xxx.txt

diff --git a/Documentation/scsi/tcm_qla2xxx.txt 
b/Documentation/scsi/tcm_qla2xxx.txt
new file mode 100644
index 000..c3a670a
--- /dev/null
+++ b/Documentation/scsi/tcm_qla2xxx.txt
@@ -0,0 +1,22 @@
+tcm_qla2xxx jam_host attribute
+--
+There is now a new module endpoint attribute called jam_host
+attribute: jam_host: boolean=0/1
+This attribute and accompanying code is only included if the
+Kconfig parameter TCM_QLA2XXX_DEBUG is set to Y
+By default this jammer code and functionality is disabled
+
+Use this attribute to control the discarding of SCSI commands to a
+selected host.
+This may be useful for testing error handling and simulating slow drain
+and other fabric issues.
+
+Setting a boolean of 1 for the jam_host attribute for a particular host
+ will discard the commands for that host.
+Reset back to 0 to stop the jamming.
+
+Enable host 4 to be jammed
+echo 1 > 
/sys/kernel/config/target/qla2xxx/21:00:00:24:ff:27:8f:ae/tpgt_1/attrib/jam_host
+
+Disable jamming on host 4
+echo 0 > 
/sys/kernel/config/target/qla2xxx/21:00:00:24:ff:27:8f:ae/tpgt_1/attrib/jam_host
diff --git a/drivers/scsi/qla2xxx/Kconfig b/drivers/scsi/qla2xxx/Kconfig
index 10aa18b..67c0d5a 100644
--- a/drivers/scsi/qla2xxx/Kconfig
+++ b/drivers/scsi/qla2xxx/Kconfig
@@ -36,3 +36,12 @@ config TCM_QLA2XXX
default n
---help---
Say Y here to enable the TCM_QLA2XXX fabric module for QLogic 24xx+ 
series target mode HBAs
+
+if TCM_QLA2XXX
+config TCM_QLA2XXX_DEBUG
+   bool "TCM_QLA2XXX fabric module DEBUG mode for QLogic 24xx+ series 
target mode HBAs"
+   default n
+   ---help---
+   Say Y here to enable the TCM_QLA2XXX fabric module DEBUG for QLogic 
24xx+ series target mode HBAs
+   This will include code to enable the SCSI command jammer
+endif
diff --git a/drivers/scsi/qla2xxx/tcm_qla2xxx.c 
b/drivers/scsi/qla2xxx/tcm_qla2xxx.c
index 1808a01..948224e 100644
--- a/drivers/scsi/qla2xxx/tcm_qla2xxx.c
+++ b/drivers/scsi/qla2xxx/tcm_qla2xxx.c
@@ -457,6 +457,10 @@ static int tcm_qla2xxx_handle_cmd(scsi_qla_host_t *vha, 
struct qla_tgt_cmd *cmd,
struct se_cmd *se_cmd = &cmd->se_cmd;
struct se_session *se_sess;
struct qla_tgt_sess *sess;
+#ifdef CONFIG_TCM_QLA2XXX_DEBUG
+   struct se_portal_group *se_tpg;
+   struct tcm_qla2xxx_tpg *tpg;
+#endif
int flags = TARGET_SCF_ACK_KREF;
 
if (bidi)
@@ -477,6 +481,15 @@ static int tcm_qla2xxx_handle_cmd(scsi_qla_host_t *vha, 
struct qla_tgt_cmd *cmd,
return -EINVAL;
}
 
+#ifdef CONFIG_TCM_QLA2XXX_DEBUG
+   se_tpg = se_sess->se_tpg;
+   tpg = container_of(se_tpg, struct tcm_qla2xxx_tpg, se_tpg);
+   if (unlikely(tpg->tpg_attrib.jam_host)) {
+   /* return, and dont run target_submit_cmd,discarding command */
+   return 0;
+   }
+#endif
+
cmd->vha->tgt_counters.qla_core_sbt_cmd++;
return target_submit_cmd(se_cmd, se_sess, cdb, &cmd->sense_buffer[0],
cmd->unpacked_lun, data_length, fcp_task_attr,
@@ -844,6 +857,9 @@ DEF_QLA_TPG_ATTRIB(cache_dynamic_acls);
 DEF_QLA_TPG_ATTRIB(demo_mode_write_protect);
 DEF_QLA_TPG_ATTRIB(prod_mode_write_protect);
 DEF_QLA_TPG_ATTRIB(demo_mode_login_only);
+#ifdef CONFIG_TCM_QLA2XXX_DEBUG
+DEF_QLA_TPG_ATTRIB(jam_host);
+#endif
 
 static struct configfs_attribute *tcm_qla2xxx_tpg_attrib_attrs[] = {
&tcm_qla2xxx_tpg_attrib_attr_generate_node_acls,
@@ -851,6 +867,9 @@ static struct configfs_attribute 
*tcm_qla2xxx_tpg_attrib_attrs[] = {
&tcm_qla2xxx_tpg_attrib_attr_demo_mode_write_protect,
&tcm_qla2xxx_tpg_attrib_attr_prod_mode_write_protect,
&tcm_qla2xxx_tpg_attrib_attr_demo_mode_login_only,
+#ifdef CONFIG_TCM_QLA2XXX_DEBUG
+   &tcm_qla2xxx_tpg_attrib_attr_jam_host,
+#endif
NULL,
 };
 
@@ -1023,6 +1042,7 @@ static struct se_portal_group *tcm_qla2xxx_make_tpg(
tpg->tpg_attrib.demo_mode_write_protect = 1;

tcm_qla2xxx Add SCSI command jammer/discard capability to the tcm_qla2xxx module - revision3

2016-03-29 Thread Laurence Oberman

Hello Bart,

I have been using this jammer functionality to continue testing the SCSI F/C
drivers and recovery for over a year now.
Any chance you would agree to ack this so I can get it in now?
I last posted to the list last March and it was not picked up.

I did look into moving this to upper layers, but I find I use it primarily for
Fibre Channel target testing.
Attempting to add this functionality to upper layers led to complexities, and
this approach is very solid.

This patch is a diff against 4.5.

I use target LIO for all my storage array test targets and customer problem 
resolution here at Red Hat.
This patch resulted from a requirement to mimic behavior of an expensive 
hardware jammer for a customer.
I have used this for some time with good success to simulate and reproduce
latency and slow drain fabric issues, and for testing and validating error
handling behavior in the Emulex, Qlogic and other F/C drivers.

It works by checking a new parameter, jam_host: if it is >= 0 and matches
vha->host_no, commands for that host are discarded.
If the parameter is set to -1 (the default), no jamming is enabled.

Tested-by: Laurence Oberman <lober...@redhat.com>
Signed-off-by: Laurence Oberman <lober...@redhat.com>

diff -Nurp linux-4.5.orig/Documentation/scsi/tcm_qla2xxx.txt 
linux-4.5/Documentation/scsi/tcm_qla2xxx.txt
--- linux-4.5.orig/Documentation/scsi/tcm_qla2xxx.txt   1969-12-31 
19:00:00.0 -0500
+++ linux-4.5/Documentation/scsi/tcm_qla2xxx.txt2016-03-29 
10:08:57.455761389 -0400
@@ -0,0 +1,31 @@
+tcm_qla2xxx jammer parameter usage
+--
+There is now a new module parameter added to the tcm_qla2xxx module
+parm:   jam_host:Host to jam >=0 Enable jammer (int)
+
+Use this parameter to control the discarding of SCSI commands to a selected
+host.
+This may be useful for testing error handling and simulating slow drain
+and other fabric issues.
+
+Any value >=0 that matches a fc_host # will discard the commands for that host.
+Reset back to -1 to stop the jamming.
+
+Enable host 6 to be jammed
+echo 6 > /sys/module/tcm_qla2xxx/parameters/jam_host
+
+Disable jamming on host 6
+echo -1 > /sys/module/tcm_qla2xxx/parameters/jam_host
+
+Usage example script:
+
+#!/bin/bash
+sleep_time=120  ### Time to jam for
+echo 6 >  /sys/module/tcm_qla2xxx/parameters/jam_host
+host=`cat /sys/module/tcm_qla2xxx/parameters/jam_host`
+echo "We start to discard commands on SCSI host $host"
+logger "Jammer started"
+sleep $sleep_time
+echo -1 >  /sys/module/tcm_qla2xxx/parameters/jam_host
+echo "We stopped the jammer"
+logger "Jammer stopped"
diff -Nurp linux-4.5.orig/drivers/scsi/qla2xxx/tcm_qla2xxx.c 
linux-4.5/drivers/scsi/qla2xxx/tcm_qla2xxx.c
--- linux-4.5.orig/drivers/scsi/qla2xxx/tcm_qla2xxx.c   2016-03-14 
00:28:54.0 -0400
+++ linux-4.5/drivers/scsi/qla2xxx/tcm_qla2xxx.c2016-03-29 
10:10:09.677298099 -0400
@@ -48,6 +48,10 @@
 #include "qla_target.h"
 #include "tcm_qla2xxx.h"
 
+int jam_host = -1;
+module_param(jam_host, int, 0644);
+MODULE_PARM_DESC(jam_host, "Host to jam >=0 Enable jammer");
+
 static struct workqueue_struct *tcm_qla2xxx_free_wq;
 static struct workqueue_struct *tcm_qla2xxx_cmd_wq;
 
@@ -477,6 +481,11 @@ static int tcm_qla2xxx_handle_cmd(scsi_q
return -EINVAL;
}
 
+   if (unlikely(vha->host_no == jam_host)) {
+   /* return, and dont run target_submit_cmd,discarding command */
+   return 0;
+   }
+
cmd->vha->tgt_counters.qla_core_sbt_cmd++;
return target_submit_cmd(se_cmd, se_sess, cdb, &cmd->sense_buffer[0],
cmd->unpacked_lun, data_length, fcp_task_attr,
@@ -1967,6 +1976,7 @@ static void tcm_qla2xxx_deregister_confi
 static int __init tcm_qla2xxx_init(void)
 {
    int ret;
+   jam_host = -1;
 
ret = tcm_qla2xxx_register_configfs();
if (ret < 0)


Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services


Re: [PATCH] qla1280: Reduce can_queue to 32

2016-04-22 Thread Laurence Oberman

The change looks fine. I see it's hard-coded to 32 in qla1280_set_defaults().
Would it be better to create a #define like other drivers and use that in both?
Also, did the below patch resolve this for the bug reporter?
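
The #define suggestion could look roughly like the sketch below. It is
only an illustration: the macro name and the demo_* structures are
hypothetical and are not taken from qla1280.c.

```c
/* Hypothetical macro name; the driver may pick a different one. */
#define DEMO_QLA1280_MAX_QUEUE_DEPTH 32

/* Hypothetical stand-ins for the host template and the per-target
 * defaults that currently carry separate literal values. */
struct demo_host_template {
	int can_queue;
};

struct demo_target_defaults {
	int max_queue_depth;
};

/* The template takes its limit from the shared macro ... */
static const struct demo_host_template demo_template = {
	.can_queue = DEMO_QLA1280_MAX_QUEUE_DEPTH,
};

/* ... and so does the defaults routine, so the two cannot drift apart. */
static void demo_set_defaults(struct demo_target_defaults *t)
{
	t->max_queue_depth = DEMO_QLA1280_MAX_QUEUE_DEPTH;
}
```

Changing the limit later then means touching one line instead of two.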

I ask because if I check 4.3 it was also set to the same value of 0xfffff and
that is reported as working.
So other changes in 4.4 must be "abusing" this high value.

Reviewed-by: Laurence Oberman <lober...@redhat.com>

Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services

- Original Message -
From: "Johannes Thumshirn" <jthumsh...@suse.de>
To: "Martin K . Petersen" <martin.peter...@oracle.com>, "James E . J . 
Bottomley" <j...@linux.vnet.ibm.com>
Cc: linux-scsi@vger.kernel.org, linux-ker...@vger.kernel.org, "Johannes 
Thumshirn" <jthumsh...@suse.de>, "Laura Abbott" <labb...@redhat.com>, "Michael 
Reed" <m...@sgi.com>, sta...@vger.kernel.org
Sent: Friday, April 22, 2016 3:31:10 AM
Subject: [PATCH] qla1280: Reduce can_queue to 32

It was reported in https://bugzilla.redhat.com/show_bug.cgi?id=1321033,
that the qla1280 driver sets the scsi_host_template's can_queue field
to 0xfffff which results in an allocation failure when allocating the
block layer tags for the driver's queues like the one shown below:

[4.804166] scsi host0: QLogic QLA1040 PCI to SCSI Host Adapter Firmware 
version:  7.65.06, Driver version 3.27.1
[4.804174] [ cut here ]
[4.804184] WARNING: CPU: 2 PID: 305 at mm/page_alloc.c:2989 
alloc_pages_nodemask+0xae8/0xbc0()
[4.804186] Modules linked in: amdkfd amd_iommu_v2 radeon i2c_algo_bit 
m_kms_helper ttm drm megaraid_sas serio_raw 8021q garp bnx2 stp llc mrp nhme 
qla1280(+) fjes
[4.804208] CPU: 2 PID: 305 Comm: systemd-udevd Not tainted 
4.6-201.fc22.x86_64 #1
[4.804210] Hardware name: Google Enterprise Search Appliance/0DT021, OS 
1.1.2 08/14/2006
[4.804212]  0286 2f01064c 88042985b710 
ff813b542e
[4.804216]   81a75024 88042985b748 
ff810a40f2
[4.804220]    000b 
00
[4.804223] Call Trace:
[4.804231]  [] dump_stack+0x63/0x85
[4.804236]  [] warn_slowpath_common+0x82/0xc0
[4.804239]  [] warn_slowpath_null+0x1a/0x20
[4.804242]  [] __alloc_pages_nodemask+0xae8/0xbc0
[4.804247]  [] ? _raw_spin_unlock_irqrestore+0xe/0x10
[4.804251]  [] ? irq_work_queue+0x8e/0xa0
[4.804256]  [] ? console_unlock+0x20a/0x540
[4.804262]  [] alloc_pages_current+0x8c/0x110
[4.804265]  [] alloc_kmem_pages+0x19/0x90
[4.804268]  [] kmalloc_order_trace+0x2e/0xe0
[4.804272]  [] __kmalloc+0x232/0x260
[4.804277]  [] init_tag_map+0x3d/0xc0
[4.804290]  [] __blk_queue_init_tags+0x45/0x80
[4.804293]  [] blk_init_tags+0x14/0x20
[4.804298]  [] scsi_add_host_with_dma+0x80/0x300
[4.804305]  [] qla1280_probe_one+0x683/0x9ef [qla1280]
[4.804309]  [] local_pci_probe+0x45/0xa0
[4.804312]  [] pci_device_probe+0xfd/0x140
[4.804316]  [] driver_probe_device+0x222/0x490
[4.804319]  [] __driver_attach+0x84/0x90
[4.804321]  [] ? driver_probe_device+0x490/0x490
[4.804324]  [] bus_for_each_dev+0x6c/0xc0
[4.804326]  [] driver_attach+0x1e/0x20
[4.804328]  [] bus_add_driver+0x1eb/0x280
[4.804331]  [] ? 0xa0015000
[4.804333]  [] driver_register+0x60/0xe0
[4.804336]  [] __pci_register_driver+0x4c/0x50
[4.804339]  [] qla1280_init+0x1ce/0x1000 [qla1280]
[4.804341]  [] ? 0xa0015000
[4.804345]  [] do_one_initcall+0xb3/0x200
[4.804348]  [] ? kmem_cache_alloc_trace+0x196/0x210
[4.804352]  [] ? do_init_module+0x27/0x1cb
[4.804354]  [] do_init_module+0x5f/0x1cb
[4.804358]  [] load_module+0x2040/0x2680
[4.804360]  [] ? __symbol_put+0x60/0x60
[4.804363]  [] SYSC_init_module+0x149/0x190
[4.804366]  [] SyS_init_module+0xe/0x10
[4.804369]  [] entry_SYSCALL_64_fastpath+0x12/0x71
[4.804371] ---[ end trace 0ea3b625f86705f7 ]---
[4.804581] qla1280: probe of :11:04.0 failed with error -12

In qla1280_set_defaults() the maximum queue depth is
set to 32 so adopt the scsi_host_template to it as well.

Signed-off-by: Johannes Thumshirn <jthumsh...@suse.de>
Cc: Laura Abbott <labb...@redhat.com>
Cc: Michael Reed <m...@sgi.com>
Cc: sta...@vger.kernel.org
---
 drivers/scsi/qla1280.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/qla1280.c b/drivers/scsi/qla1280.c
index 5d0ec42..6bd748e 100644
--- a/drivers/scsi/qla1280.c
+++ b/drivers/scsi/qla1280.c
@@ -4214,7 +4214,7 @@ static struct scsi_host_template qla1280_driver_template 
= {
.eh_bus_reset_handler   = qla1280_eh_bus_reset,
.eh_host_reset_handler  = qla1280_eh_adapter_reset,
.bios_param = qla128

Re: [PATCH] qla1280: Reduce can_queue to 32

2016-04-22 Thread Laurence Oberman

Johannes
OK, yes, thanks for pointing out the commit.

Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services

- Original Message -
From: "Johannes Thumshirn" <jthumsh...@suse.de>
To: "Laurence Oberman" <lober...@redhat.com>
Cc: "Johannes Thumshirn" <jthumsh...@suse.de>, "Martin K . Petersen" 
<martin.peter...@oracle.com>, "James E . J . Bottomley" 
<j...@linux.vnet.ibm.com>, linux-scsi@vger.kernel.org, 
linux-ker...@vger.kernel.org, "Laura Abbott" <labb...@redhat.com>, "Michael 
Reed" <m...@sgi.com>, sta...@vger.kernel.org
Sent: Friday, April 22, 2016 10:01:48 AM
Subject: Re: [PATCH] qla1280: Reduce can_queue to 32

On Fri, Apr 22, 2016 at 08:16:44AM -0400, Laurence Oberman wrote:
> The change looks fine, I see its hard-coded to 32 in qla1280_set_defaults()
> Would it be better to create a #define like other drivers and use that in 
> both.
> Also did the below patch resolve this for the bug reporter.

Yes, that's probably a reasonable idea, I'll re-send.

> I ask because if I check 4.3 it was also set to the same value of 0xfffff and 
> that is reported as working.
> So other changes in 4.4 must be "abusing" this high value.

I think it was introduced with commit 64d513ac31b - "scsi: use host wide
tags by default". Since this commit scsi_add_host_with_dma() directly calls
blk_init_tags() instead of scsi_init_shared_tag_map(). The qla1280 driver
has never set up the block tags though, so the bogus value was not a problem.
That at least is my analysis, feel free to correct my assumptions.
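
A rough sizing sketch may help show why the old value only became a
problem once the tag map was actually allocated. It assumes that the
legacy block-layer tag map holds one request pointer per tag plus a
one-bit-per-tag bitmap, and that the original can_queue value was
0xfffff (the archived text appears to show it truncated). The function
names are hypothetical.

```c
#include <stddef.h>

/* Bytes for the per-tag pointer array: one pointer per tag. */
static size_t demo_tag_index_bytes(size_t depth, size_t ptr_size)
{
	return depth * ptr_size;
}

/* Bytes for the tag bitmap: one bit per tag, rounded up to whole longs. */
static size_t demo_tag_bitmap_bytes(size_t depth, size_t long_size)
{
	size_t bits_per_long = long_size * 8;
	size_t nr_longs = (depth + bits_per_long - 1) / bits_per_long;

	return nr_longs * long_size;
}
```

With 8-byte pointers, a depth of 0xfffff needs 8388600 bytes (about
8 MiB) for the pointer array alone, which is typically more than a
single kmalloc() can provide, while a depth of 32 needs only 256 bytes.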

> 
> Reviewed-by Laurence Oberman <lober...@redhat.com>
> 
> Laurence Oberman
> Principal Software Maintenance Engineer
> Red Hat Global Support Services
> 
> - Original Message -
> From: "Johannes Thumshirn" <jthumsh...@suse.de>
> To: "Martin K . Petersen" <martin.peter...@oracle.com>, "James E . J . 
> Bottomley" <j...@linux.vnet.ibm.com>
> Cc: linux-scsi@vger.kernel.org, linux-ker...@vger.kernel.org, "Johannes 
> Thumshirn" <jthumsh...@suse.de>, "Laura Abbott" <labb...@redhat.com>, 
> "Michael Reed" <m...@sgi.com>, sta...@vger.kernel.org
> Sent: Friday, April 22, 2016 3:31:10 AM
> Subject: [PATCH] qla1280: Reduce can_queue to 32
> 
> It was reported in https://bugzilla.redhat.com/show_bug.cgi?id=1321033,
> that the qla1280 driver sets the scsi_host_template's can_queue field
> to 0xfffff which results in an allocation failure when allocating the
> block layer tags for the driver's queues like the one shown below:
> 
> [4.804166] scsi host0: QLogic QLA1040 PCI to SCSI Host Adapter Firmware 
> version:  7.65.06, Driver version 3.27.1
> [4.804174] [ cut here ]
> [4.804184] WARNING: CPU: 2 PID: 305 at mm/page_alloc.c:2989 
> alloc_pages_nodemask+0xae8/0xbc0()
> [4.804186] Modules linked in: amdkfd amd_iommu_v2 radeon i2c_algo_bit 
> m_kms_helper ttm drm megaraid_sas serio_raw 8021q garp bnx2 stp llc mrp nhme 
> qla1280(+) fjes
> [4.804208] CPU: 2 PID: 305 Comm: systemd-udevd Not tainted 
> 4.6-201.fc22.x86_64 #1
> [4.804210] Hardware name: Google Enterprise Search Appliance/0DT021, OS 
> 1.1.2 08/14/2006
> [4.804212]  0286 2f01064c 88042985b710 
> ff813b542e
> [4.804216]   81a75024 88042985b748 
> ff810a40f2
> [4.804220]    000b 
> 00
> [4.804223] Call Trace:
> [4.804231]  [] dump_stack+0x63/0x85
> [4.804236]  [] warn_slowpath_common+0x82/0xc0
> [4.804239]  [] warn_slowpath_null+0x1a/0x20
> [4.804242]  [] __alloc_pages_nodemask+0xae8/0xbc0
> [4.804247]  [] ? _raw_spin_unlock_irqrestore+0xe/0x10
> [4.804251]  [] ? irq_work_queue+0x8e/0xa0
> [4.804256]  [] ? console_unlock+0x20a/0x540
> [4.804262]  [] alloc_pages_current+0x8c/0x110
> [4.804265]  [] alloc_kmem_pages+0x19/0x90
> [4.804268]  [] kmalloc_order_trace+0x2e/0xe0
> [4.804272]  [] __kmalloc+0x232/0x260
> [4.804277]  [] init_tag_map+0x3d/0xc0
> [4.804290]  [] __blk_queue_init_tags+0x45/0x80
> [4.804293]  [] blk_init_tags+0x14/0x20
> [4.804298]  [] scsi_add_host_with_dma+0x80/0x300
> [4.804305]  [] qla1280_probe_one+0x683/0x9ef [qla1280]
> [4.804309]  [] local_pci_probe+0x45/0xa0
> [4.804312]  [] pci_device_probe+0xfd/0x140
> [4.804316]  [] driver_probe_device+0x222/0x490
> [4.804319]  [] __driver_attach+0x84/0x90
> [4.804321]  [] ? driver_probe_device+0x490/0x490
> [4.804324]  [] bus_for_each_dev+0x6c/0xc0
> [4

Re: [PREEMPT-RT] [PATCH v2] scsi/fcoe: convert to kworker

2016-04-22 Thread Laurence Oberman

I have fcoe for testing.
I will pull this in next week and test it.

Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services

- Original Message -
From: "James Bottomley" <j...@linux.vnet.ibm.com>
To: "Sebastian Andrzej Siewior" <bige...@linutronix.de>, "Christoph Hellwig" 
<h...@infradead.org>
Cc: linux-scsi@vger.kernel.org, "Martin K. Petersen" 
<martin.peter...@oracle.com>, "Vasu Dev" <vasu@intel.com>, 
r...@linutronix.de, fcoe-de...@open-fcoe.org, "Chad Dupuis" 
<chad.dup...@qlogic.com>
Sent: Friday, April 22, 2016 11:49:45 AM
Subject: Re: [PREEMPT-RT] [PATCH v2] scsi/fcoe: convert to kworker

On Fri, 2016-04-22 at 17:27 +0200, Sebastian Andrzej Siewior wrote:
> On 04/12/2016 05:16 PM, Sebastian Andrzej Siewior wrote:
> > The driver creates its own per-CPU threads which are updated based
> > on
> > CPU hotplug events. It is also possible to use kworkers and remove
> > some
> > of the kthread infrastructure.
> > 
> > The code checked ->thread to decide if there is an active per-CPU
> > thread. By using the kworker infrastructure this is no longer
> > possible (or
> > required). The thread pointer is saved in `kthread' instead of
> > `thread' so
> > anything trying to use thread is caught by the compiler. Currently
> > only the
> > bnx2fc driver is using struct fcoe_percpu_s and the kthread member.
> > 
> > After a CPU went offline, we may still enqueue items on the
> > "offline"
> > CPU. This isn't much of a problem. The work will be done on a
> > random
> > CPU. The allocated crc_eof_page page won't be cleaned up. It is
> > probably
> > expected that the CPU comes up at some point so it should not be a
> > problem. The crc_eof_page memory is released of course once the
> > module is
> > removed.
> > 
> > This patch was only compile-tested due to -ENODEV.
> > 
> > Cc: Vasu Dev <vasu@intel.com>
> > Cc: "James E.J. Bottomley" <j...@linux.vnet.ibm.com>
> > Cc: "Martin K. Petersen" <martin.peter...@oracle.com>
> > Cc: Christoph Hellwig <h...@lst.de>
> > Cc: fcoe-de...@open-fcoe.org
> > Cc: linux-scsi@vger.kernel.org
> > Signed-off-by: Sebastian Andrzej Siewior <bige...@linutronix.de>
> > ---
> > v1…v2: use kworker instead of smbthread as per hch
> > 
> > If you want this I would do the same for the two bnx drivers.
> 
> *ping*

Ping what?  You've sent in an untested patch that looks to be a big
change.  It's definitely not going in until it's tested.  Why don't you
see if you can recruit an FCoE person to your cause and get them to
test it.

It looks like you're looking for testing on bnx2fc, correct?  In which
case cc'ing a bnx2fc person might have been helpful (cc added).

James



Re: [dm-devel] [Lsf] Notes from the four separate IO track sessions at LSF/MM

2016-04-29 Thread Laurence Oberman

Hello Bart

It took around 300s before the paths were declared hard failed and the devices
offlined.
This is when I/O restarts.
The remaining paths on the second Qlogic port (that are not jammed) will not be
used until the error handler activity completes.

Until we get messages such as these, and device-mapper starts declaring paths
down, we are blocked.
Apr 29 17:20:51 localhost kernel: sd 1:0:1:0: Device offlined - not ready after 
error recovery
Apr 29 17:20:51 localhost kernel: sd 1:0:1:13: Device offlined - not ready 
after error recovery

Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services

- Original Message -
From: "Bart Van Assche" <bart.vanass...@sandisk.com>
To: "Laurence Oberman" <lober...@redhat.com>
Cc: "James Bottomley" <james.bottom...@hansenpartnership.com>, "linux-scsi" 
<linux-scsi@vger.kernel.org>, "Mike Snitzer" <snit...@redhat.com>, 
linux-bl...@vger.kernel.org, "device-mapper development" <dm-de...@redhat.com>, 
l...@lists.linux-foundation.org
Sent: Friday, April 29, 2016 8:36:22 PM
Subject: Re: [dm-devel] [Lsf] Notes from the four separate IO track sessions at 
LSF/MM

On 04/29/2016 02:47 PM, Laurence Oberman wrote:
> Recovery with 21 LUNs that have in-flight commands to abort takes 300s.
> [ ... ]
> eh_deadline is set to 10 on the 2 qlogic ports, eh_timeout is set
> to 10 for all devices. In multipath fast_io_fail_tmo=5
>
> I jam one of the target array ports and discard the commands,
> effectively black-holing the commands, and leave it that way until
> we recover, and I watch the I/O. The recovery takes around 300s even
> with all the tuning, and this effectively ends up in Oracle cluster
> evictions.

Hello Laurence,

This discussion started as a discussion about the time needed to fail 
over from one path to another. How long did it take in your test before 
I/O failed over from the jammed port to another port?

Thanks,

Bart.

Re: [dm-devel] [Lsf] Notes from the four separate IO track sessions at LSF/MM

2016-04-28 Thread Laurence Oberman

Hello Bart, This is when a subset of the paths fails.
As you know, the remaining paths won't be used until the eh_handler is either 
done or is short circuited.

What I will do is set this up via my jammer and capture a test using latest 
upstream.

Of course my customer pain points are all in the RHEL kernels so I need to 
capture a recovery trace
on the latest upstream kernel.

When the SCSI commands for a path are black-holed and remain that way, even 
with eh_deadline and the short circuited adapter resets 
we simply try again and get back in the wait loop until we finally declare the 
device offline.

This can take a while and differs depending on Qlogic, Emulex or fnic etc.

First thing tomorrow I will set this up and show you what I mean.

Thanks!!

Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services

- Original Message -
From: "Bart Van Assche" <bart.vanass...@sandisk.com>
To: "Laurence Oberman" <lober...@redhat.com>
Cc: linux-bl...@vger.kernel.org, "linux-scsi" <linux-scsi@vger.kernel.org>, 
"Mike Snitzer" <snit...@redhat.com>, "James Bottomley" 
<james.bottom...@hansenpartnership.com>, "device-mapper development" 
<dm-de...@redhat.com>, l...@lists.linux-foundation.org
Sent: Thursday, April 28, 2016 12:41:26 PM
Subject: Re: [dm-devel] [Lsf] Notes from the four separate IO track sessions at 
LSF/MM

On 04/28/2016 09:23 AM, Laurence Oberman wrote:
> We still suffer from periodic complaints in our large customer base
> regarding the long recovery times for dm-multipath.
> Most of the time this is when we have something like a switch
> back-plane issue or an issue where RSCN'S are blocked coming back up
> the fabric. Corner cases still bite us often.
>
> Most of the complaints originate from customers for example seeing
> Oracle cluster evictions where during the waiting on the mid-layer
> all mpath I/O is blocked until recovery.
>
> We have to tune eh_deadline, eh_timeout and fast_io_fail_tmo but
> even tuning those we have to wait on serial recovery even if we
> set the timeouts low.
>
> Lately we have been living with
> eh_deadline=10
> eh_timeout=5
> fast_io_fail_tmo=10
> leaving default sd timeout at 30s
>
> So this continues to be an issue and I have specific examples using
> the jammer I can provide showing the serial recovery times here.

Hello Laurence,

The long recovery times you refer to, is that for a scenario where all 
paths failed or for a scenario where some paths failed and other paths 
are still working? In the latter case, how long does it take before 
dm-multipath fails over to another path?

Thanks,

Bart.

Re: [Lsf] Notes from the four separate IO track sessions at LSF/MM

2016-04-28 Thread Laurence Oberman

Hello Folks,

We still suffer from periodic complaints in our large customer base regarding 
the long recovery times for dm-multipath.
Most of the time this is when we have something like a switch back-plane issue 
or an issue where RSCNs are blocked coming back up the fabric.
Corner cases still bite us often.

Most of the complaints originate from customers who, for example, see Oracle 
cluster evictions because all mpath I/O is blocked while waiting on the 
mid-layer until recovery completes.

We have to tune eh_deadline, eh_timeout and fast_io_fail_tmo, but even with 
those tuned and the timeouts set low we still have to wait on serial recovery.

Lately we have been living with
eh_deadline=10
eh_timeout=5
fast_io_fail_tmo=10
leaving the default sd timeout at 30s

So this continues to be an issue, and I can provide specific examples captured 
with the jammer showing the serial recovery times.
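
For readers who want to see where these knobs live, here is a minimal dry-run 
sketch. It only builds the list of sysfs writes and does not touch the system. 
The sysfs locations are the usual ones on recent kernels as far as I know, and 
the host, device and rport names are hypothetical examples, not taken from 
this setup.

```python
# Dry-run sketch: build the sysfs writes for the timeouts discussed above.
# Nothing is written; host/device/rport names passed in are hypothetical.

def tuning_writes(hosts, devices, rports,
                  eh_deadline=10, eh_timeout=5, fast_io_fail_tmo=10):
    """Return a list of (sysfs_path, value) pairs, one per tunable."""
    writes = []
    for host in hosts:
        # Per-host limit on how long the SCSI error handler may run
        writes.append((f"/sys/class/scsi_host/{host}/eh_deadline",
                       eh_deadline))
    for dev in devices:
        # Per-device timeout for commands issued by the error handler
        writes.append((f"/sys/block/{dev}/device/eh_timeout", eh_timeout))
    for rport in rports:
        # FC remote port: seconds after a transport failure is seen
        # before outstanding I/O on that rport is failed
        writes.append((f"/sys/class/fc_remote_ports/{rport}/fast_io_fail_tmo",
                       fast_io_fail_tmo))
    return writes


if __name__ == "__main__":
    for path, value in tuning_writes(["host1"], ["sdbh"], ["rport-1:0-1"]):
        print(f"echo {value} > {path}")
```

fast_io_fail_tmo can also be set in multipath.conf, in which case multipathd 
applies it to the rports itself.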

Thanks

Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services

- Original Message -
From: "Bart Van Assche" <bart.vanass...@sandisk.com>
To: "James Bottomley" <james.bottom...@hansenpartnership.com>, "Mike Snitzer" 
<snit...@redhat.com>
Cc: linux-bl...@vger.kernel.org, l...@lists.linux-foundation.org, 
"device-mapper development" <dm-de...@redhat.com>, "linux-scsi" 
<linux-scsi@vger.kernel.org>
Sent: Thursday, April 28, 2016 11:53:50 AM
Subject: Re: [Lsf] Notes from the four separate IO track sessions at LSF/MM

On 04/28/2016 08:40 AM, James Bottomley wrote:
> Well, the entire room, that's vendors, users and implementors
> complained that path failover takes far too long.  I think in their
> minds this is enough substance to go on.

The only complaints I heard about path failover taking too long came 
from people working on FC drivers. Aren't SCSI transport layer 
implementations expected to fail I/O after fast_io_fail_tmo expired 
instead of waiting until the SCSI error handler has finished? If so, why 
is it considered an issue that error handling for the FC protocol can 
take very long (hours)?

Thanks,

Bart.

Re: [PATCH] scsi: use spinlock instead of mutex for RCU-protected VPD inquiry data

2016-05-20 Thread Laurence Oberman



- Original Message -
> From: "Ewan D. Milne" <emi...@redhat.com>
> To: linux-scsi@vger.kernel.org
> Sent: Friday, May 20, 2016 8:56:14 AM
> Subject: [PATCH] scsi: use spinlock instead of mutex for RCU-protected VPD 
> inquiry data
> 
> From: "Ewan D. Milne" <emi...@redhat.com>
> 
> A spinlock is sufficient for this purpose, and much smaller.
> 
> Signed-off-by: Ewan D. Milne <emi...@redhat.com>
> ---
>  drivers/scsi/scsi.c| 8 
>  drivers/scsi/scsi_scan.c   | 2 +-
>  include/scsi/scsi_device.h | 2 +-
>  3 files changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
> index 1deb6ad..330d807 100644
> --- a/drivers/scsi/scsi.c
> +++ b/drivers/scsi/scsi.c
> @@ -829,11 +829,11 @@ retry_pg80:
>   kfree(vpd_buf);
>   goto retry_pg80;
>   }
> - mutex_lock(&sdev->inquiry_mutex);
> + spin_lock(&sdev->inquiry_lock);
>   orig_vpd_buf = sdev->vpd_pg80;
>   sdev->vpd_pg80_len = result;
>   rcu_assign_pointer(sdev->vpd_pg80, vpd_buf);
> - mutex_unlock(&sdev->inquiry_mutex);
> + spin_unlock(&sdev->inquiry_lock);
>   synchronize_rcu();
>   if (orig_vpd_buf) {
>   kfree(orig_vpd_buf);
> @@ -858,11 +858,11 @@ retry_pg83:
>   kfree(vpd_buf);
>   goto retry_pg83;
>   }
> - mutex_lock(&sdev->inquiry_mutex);
> + spin_lock(&sdev->inquiry_lock);
>   orig_vpd_buf = sdev->vpd_pg83;
>   sdev->vpd_pg83_len = result;
>   rcu_assign_pointer(sdev->vpd_pg83, vpd_buf);
> - mutex_unlock(&sdev->inquiry_mutex);
> + spin_unlock(&sdev->inquiry_lock);
>   synchronize_rcu();
>   if (orig_vpd_buf)
>   kfree(orig_vpd_buf);
> diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
> index e0a78f5..f445615 100644
> --- a/drivers/scsi/scsi_scan.c
> +++ b/drivers/scsi/scsi_scan.c
> @@ -240,7 +240,7 @@ static struct scsi_device *scsi_alloc_sdev(struct
> scsi_target *starget,
>   INIT_LIST_HEAD(&sdev->starved_entry);
>   INIT_LIST_HEAD(&sdev->event_list);
>   spin_lock_init(&sdev->list_lock);
> - mutex_init(&sdev->inquiry_mutex);
> + spin_lock_init(&sdev->inquiry_lock);
>   INIT_WORK(&sdev->event_work, scsi_evt_thread);
>   INIT_WORK(&sdev->requeue_work, scsi_requeue_run_queue);
>  
> diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h
> index a6c346d..0410ed8 100644
> --- a/include/scsi/scsi_device.h
> +++ b/include/scsi/scsi_device.h
> @@ -115,7 +115,7 @@ struct scsi_device {
>   char type;
>   char scsi_level;
>   char inq_periph_qual;   /* PQ from INQUIRY data */
> - struct mutex inquiry_mutex;
> + spinlock_t inquiry_lock;
>   unsigned char inquiry_len;  /* valid bytes in 'inquiry' */
>   unsigned char * inquiry;/* INQUIRY response data */
>   const char * vendor;/* [back_compat] point into 'inquiry' 
> ... */
> --
> 2.1.0
> 
> 

Looks fine to me.
Reviewed-by: Laurence Oberman <lober...@redhat.com>

Re: [dm-devel] [Lsf] Notes from the four separate IO track sessions at LSF/MM

2016-04-29 Thread Laurence Oberman

st multipathd: mpathi: sdbb - path offline
Apr 29 17:21:18 localhost multipathd: checker failed path 67:80 in map mpathi
Apr 29 17:21:18 localhost multipathd: mpathi: remaining active paths: 2
Apr 29 17:21:18 localhost multipathd: mpatho: sdbr - path offline
Apr 29 17:21:18 localhost multipathd: checker failed path 68:80 in map mpatho
Apr 29 17:21:18 localhost multipathd: mpatho: remaining active paths: 2
Apr 29 17:21:18 localhost multipathd: mpathq: sdbp - path offline
Apr 29 17:21:18 localhost multipathd: checker failed path 68:48 in map mpathq
Apr 29 17:21:18 localhost multipathd: mpathq: remaining active paths: 2
Apr 29 17:21:18 localhost multipathd: mpathv: sdbz - path offline
Apr 29 17:21:18 localhost multipathd: checker failed path 68:208 in map mpathv
Apr 29 17:21:18 localhost multipathd: mpathv: remaining active paths: 2
Apr 29 17:21:18 localhost multipathd: mpatht: sdbl - path offline
Apr 29 17:21:18 localhost multipathd: checker failed path 67:240 in map mpatht
Apr 29 17:21:18 localhost multipathd: mpatht: remaining active paths: 2
Apr 29 17:21:18 localhost kernel: device-mapper: multipath: Failing path 66:224.
Apr 29 17:21:18 localhost kernel: device-mapper: multipath: Failing path 68:176.
Apr 29 17:21:18 localhost kernel: device-mapper: multipath: Failing path 67:208.
Apr 29 17:21:18 localhost kernel: device-mapper: multipath: Failing path 67:176.
Apr 29 17:21:18 localhost kernel: device-mapper: multipath: Failing path 67:144.
Apr 29 17:21:18 localhost kernel: device-mapper: multipath: Failing path 67:112.
Apr 29 17:21:18 localhost kernel: device-mapper: multipath: Failing path 67:80.
Apr 29 17:21:18 localhost kernel: device-mapper: multipath: Failing path 68:80.
Apr 29 17:21:18 localhost kernel: device-mapper: multipath: Failing path 68:48.
Apr 29 17:21:18 localhost kernel: device-mapper: multipath: Failing path 68:208.
Apr 29 17:21:18 localhost kernel: device-mapper: multipath: Failing path 67:240.
Apr 29 17:21:18 localhost kernel: blk_update_request: I/O error, dev sdaw, sector 0
Apr 29 17:21:18 localhost kernel: sd 0:0:1:8: [sdbn] tag#25 FAILED Result: hostbyte=DID_RESET driverbyte=DRIVER_OK
Apr 29 17:21:18 localhost kernel: sd 0:0:1:8: [sdbn] tag#25 CDB: Read(10) 28 00 00 00 00 00 00 00 08 00
Apr 29 17:21:18 localhost kernel: blk_update_request: I/O error, dev sdbn, sector 0
Apr 29 17:21:18 localhost kernel: sd 0:0:1:8: rejecting I/O to offline device
Apr 29 17:21:18 localhost kernel: sd 0:0:1:8: [sdbn] killing request


Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services

- Original Message -
From: "Laurence Oberman" <lober...@redhat.com>
To: "Bart Van Assche" <bart.vanass...@sandisk.com>
Cc: linux-bl...@vger.kernel.org, "linux-scsi" <linux-scsi@vger.kernel.org>, 
"Mike Snitzer" <snit...@redhat.com>, "James Bottomley" 
<james.bottom...@hansenpartnership.com>, "device-mapper development" 
<dm-de...@redhat.com>, l...@lists.linux-foundation.org
Sent: Thursday, April 28, 2016 12:47:24 PM
Subject: Re: [dm-devel] [Lsf] Notes from the four separate IO track sessions at 
LSF/MM

Hello Bart, This is when a subset of the paths fails.
As you know, the remaining paths won't be used until the eh_handler is either 
done or is short circuited.

What I will do is set this up via my jammer and capture a test using latest 
upstream.

Of course my customer pain points are all in the RHEL kernels so I need to 
capture a recovery trace
on the latest upstream kernel.

When the SCSI commands for a path are black-holed and remain that way, even 
with eh_deadline and the short circuited adapter resets 
we simply try again and get back in the wait loop until we finally declare the 
device offline.

This can take a while and differs depending on Qlogic, Emulex or fnic etc.

First thing tomorrow I will set this up and show you what I mean.

Thanks!!

Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services

- Original Message -
From: "Bart Van Assche" <bart.vanass...@sandisk.com>
To: "Laurence Oberman" <lober...@redhat.com>
Cc: linux-bl...@vger.kernel.org, "linux-scsi" <linux-scsi@vger.kernel.org>, 
"Mike Snitzer" <snit...@redhat.com>, "James Bottomley" 
<james.bottom...@hansenpartnership.com>, "device-mapper development" 
<dm-de...@redhat.com>, l...@lists.linux-foundation.org
Sent: Thursday, April 28, 2016 12:41:26 PM
Subject: Re: [dm-devel] [Lsf] Notes from the four separate IO track sessions at 
LSF/MM

On 04/28/2016 09:23 AM, Laurence Oberman wrote:
> We still suffer from periodic complaints in our large customer base
> regarding the long recovery times for dm-multipath.
> Most of the time this is when we have something like a switch
> back-plane issue or an issue where RSCN'S are blocked coming back up
> the fabric. Corne

Re: [dm-devel] [Lsf] Notes from the four separate IO track sessions at LSF/MM

2016-04-29 Thread Laurence Oberman

One small correction

The mpath timing was as shown below. I had a cut and paste error in my prior 
message.

Fri Apr 29 17:16:14 EDT 2016
mpathe (360014052a6f5f9f256d4c1097eedfd95) dm-2 LIO-ORG ,block-13
size=20G features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
  |- 0:0:0:12 sds  65:32  active ready running
  |- 0:0:1:12 sdbh 67:176 active ready running
  |- 1:0:0:12 sdr  65:16  active ready running
  `- 1:0:1:12 sdbi 67:192 active ready running

Started again here, so it's the same 5 minutes while we are in the error handler

Fri Apr 29 17:21:26 EDT 2016
mpathe (360014052a6f5f9f256d4c1097eedfd95) dm-2 LIO-ORG ,block-13
size=20G features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
  |- 0:0:0:12 sds  65:32  active ready  running
  |- 0:0:1:12 sdbh 67:176 failed faulty offline
  |- 1:0:0:12 sdr  65:16  active ready  running
  `- 1:0:1:12 sdbi 67:192 failed faulty offline
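
As a quick check of the "same 5 minutes" above, the gap between the two 
snapshot timestamps can be computed with a small sketch like the one below. 
The helper name is made up for illustration, and the timezone token is simply 
dropped because both stamps are in EDT.

```python
from datetime import datetime


def snapshot_gap_seconds(before, after):
    """Seconds between two `date`-style stamps in the same timezone."""
    def parse(stamp):
        parts = stamp.split()
        del parts[4]  # drop the timezone token, e.g. "EDT"
        return datetime.strptime(" ".join(parts), "%a %b %d %H:%M:%S %Y")
    return int((parse(after) - parse(before)).total_seconds())


gap = snapshot_gap_seconds("Fri Apr 29 17:16:14 EDT 2016",
                           "Fri Apr 29 17:21:26 EDT 2016")
print(gap)  # 312 seconds, i.e. a little over 5 minutes
```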

Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services

- Original Message -
From: "Laurence Oberman" <lober...@redhat.com>
To: "Bart Van Assche" <bart.vanass...@sandisk.com>
Cc: linux-bl...@vger.kernel.org, "linux-scsi" <linux-scsi@vger.kernel.org>, 
"Mike Snitzer" <snit...@redhat.com>, "James Bottomley" 
<james.bottom...@hansenpartnership.com>, "device-mapper development" 
<dm-de...@redhat.com>, l...@lists.linux-foundation.org, "Benjamin Marzinski" 
<bmarz...@redhat.com>
Sent: Friday, April 29, 2016 5:47:07 PM
Subject: Re: [dm-devel] [Lsf] Notes from the four separate IO track sessions at 
LSF/MM

Hello Bart

I will email the entire log just to you. Below is only a summary of the 
pertinent log messages.
I don't think the whole list will be interested in all the log messages.
When I send the full log to you I will include SCSI debug for the error handler 
stuff.


I ran the tests. This is a worst case test with 21 LUNs and jammed commands.
Typical failures like a port switch failure or link down won't be like this.
Also, where we get RSCNs and can react quicker, we will.

In this case I simulated a hung switch issue like a backplane/mesh problem, and 
believe me, I see a lot of these 
black-holed SCSI command situations in real life.
Recovery takes 300s with 21 LUNs that have in-flight commands to abort.

This configuration is a multibus configuration for multipath. 
Two qla2xxx ports are connected to a switch and the 2 array ports are connected 
to the same switch.
This gives me 4 active/active paths for 21 mpath devices.

I start I/O to all 21, reading 64k blocks using dd and iflag=direct

Example mpath device
mpathf (360014056a5be89021364a4c90556bfbb) dm-7 LIO-ORG ,block-14
size=20G features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
  |- 0:0:0:13 sdp  8:240  active ready running
  |- 0:0:1:13 sdbf 67:144 active ready running
  |- 1:0:0:13 sdo  8:224  active ready running
  `- 1:0:1:13 sdbg 67:160 active ready running

eh_deadline is set to 10 on the 2 qlogic ports, eh_timeout is set to 10 for all 
devices
In multipath fast_io_fail_tmo=5

I jam one of the target array ports and discard the commands effectively 
black-holing the commands and leave it that way until we recover and I watch 
the I/O.
The recovery takes around 300s even with all the tuning and this effectively 
lands up in Oracle cluster evictions.

Watching multipath -ll mpathe, it blocks as expected while in recovery

Blocked here
Fri Apr 29 17:16:14 EDT 2016
mpathe (360014052a6f5f9f256d4c1097eedfd95) dm-2 LIO-ORG ,block-13
size=20G features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
  |- 0:0:0:12 sds  65:32  active ready running
  |- 0:0:1:12 sdbh 67:176 active ready running
  |- 1:0:0:12 sdr  65:16  active ready running
  `- 1:0:1:12 sdbi 67:192 active ready running

Started again here
Fri Apr 29 17:16:26 EDT 2016
mpathe (360014052a6f5f9f256d4c1097eedfd95) dm-2 LIO-ORG ,block-13
size=20G features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
  |- 0:0:0:12 sds  65:32  active ready  running
  |- 0:0:1:12 sdbh 67:176 failed faulty offline
  |- 1:0:0:12 sdr  65:16  active ready  running
  `- 1:0:1:12 sdbi 67:192 failed faulty offline

Tracking I/O
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu----- -----timestamp-----
 r  b   swpd     free   buff  cache   si   so    bi    bo   in   cs us sy id wa st                 EDT
 0 21      0 15409656  25580 452056    0    0 13740     0  367 2523  0  1 41 59  0 2016-04-29 17:16:17
 0 21      0 15408904  25580 452336    0    0 15872     0  378 2779  0  1 42 57  0 2016-04-29 17:16:18
 2 20      0 15408096  25580 452624    0    0 17612     0  399 3310  0  0 26 73  0 2016-04-29 17:16:19
 0 21      0 15407188  2558

Re: [dm-devel] [Lsf] Notes from the four separate IO track sessions at LSF/MM

2016-05-03 Thread Laurence Oberman



- Original Message -
From: "Bart Van Assche" <bart.vanass...@sandisk.com>
To: "Laurence Oberman" <lober...@redhat.com>
Cc: "James Bottomley" <james.bottom...@hansenpartnership.com>, "linux-scsi" 
<linux-scsi@vger.kernel.org>, "Mike Snitzer" <snit...@redhat.com>, 
linux-bl...@vger.kernel.org, "device-mapper development" <dm-de...@redhat.com>, 
l...@lists.linux-foundation.org
Sent: Monday, May 2, 2016 6:28:16 PM
Subject: Re: [dm-devel] [Lsf] Notes from the four separate IO track sessions at 
LSF/MM

On 05/02/2016 12:28 PM, Laurence Oberman wrote:
> Even in the case of the ib_srp, don't we also have to still run the
> eh_timeout for each of the devices that has inflight requiring error
> handling serially. This means we will still have to wait to get a
> path failover until all are through the timeout.

Hello Laurence,

It depends. If a transport layer error (e.g. a cable pull) has been 
observed by the ib_srp driver then fast_io_fail_tmo seconds later the 
ib_srp driver will terminate all outstanding SCSI commands without 
waiting for the error handler to finish. If no transport layer error has 
been observed then at most (SCSI timeout) + (number of pending commands 
+ 1) * 5 seconds later srp_reset_device() will have finished terminating 
all pending SCSI commands.

Bart.
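
To put a number on the bound described above, here is a small sketch of the 
arithmetic: (SCSI timeout) + (number of pending commands + 1) * 5 seconds. The 
function name and the sample inputs are illustrative only. The 30s value is 
the default sd timeout mentioned earlier in the thread, and 21 pending 
commands assumes one in-flight command per LUN, which is an assumption and not 
a measured figure.

```python
def srp_worst_case_seconds(scsi_timeout, pending_commands, per_command=5):
    """Upper bound quoted in the message above for the case where no
    transport layer error has been observed by ib_srp."""
    return scsi_timeout + (pending_commands + 1) * per_command


# 30s sd timeout, 21 pending commands (assumed: one per LUN)
print(srp_worst_case_seconds(30, 21))  # 140
```

The bound grows linearly with the number of pending commands, which is why the 
black-hole case is so much slower than a link-down event that the transport 
layer notices.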

Hello Bart

OK, yes, that lines up with my testing here with Qlogic and Emulex.
I am about to test srp but I need to add some jammer code first.
The link down and other interruptions will always be fast. 
It's always going to be the black-hole events that are troublesome.

Thanks
Laurence

Re: [dm-devel] [Lsf] Notes from the four separate IO track sessions at LSF/MM

2016-05-02 Thread Laurence Oberman



Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services

- Original Message -
From: "Bart Van Assche" <bart.vanass...@sandisk.com>
To: "Laurence Oberman" <lober...@redhat.com>
Cc: linux-bl...@vger.kernel.org, "linux-scsi" <linux-scsi@vger.kernel.org>, 
"Mike Snitzer" <snit...@redhat.com>, "James Bottomley" 
<james.bottom...@hansenpartnership.com>, "device-mapper development" 
<dm-de...@redhat.com>, l...@lists.linux-foundation.org
Sent: Monday, May 2, 2016 2:49:54 PM
Subject: Re: [dm-devel] [Lsf] Notes from the four separate IO track sessions at 
LSF/MM

On 04/29/2016 05:47 PM, Laurence Oberman wrote:
> From: "Bart Van Assche" <bart.vanass...@sandisk.com>
> To: "Laurence Oberman" <lober...@redhat.com>
> Cc: "James Bottomley" <james.bottom...@hansenpartnership.com>, "linux-scsi" 
> <linux-scsi@vger.kernel.org>, "Mike Snitzer" <snit...@redhat.com>, 
> linux-bl...@vger.kernel.org, "device-mapper development" 
> <dm-de...@redhat.com>, l...@lists.linux-foundation.org
> Sent: Friday, April 29, 2016 8:36:22 PM
> Subject: Re: [dm-devel] [Lsf] Notes from the four separate IO track sessions 
> at LSF/MM
>
>> On 04/29/2016 02:47 PM, Laurence Oberman wrote:
>>> Recovery with 21 LUNS is 300s that have in-flights to abort.
>>> [ ... ]
>>> eh_deadline is set to 10 on the 2 qlogic ports, eh_timeout is set
>>> to 10 for all devices. In multipath fast_io_fail_tmo=5
>>>
>>> I jam one of the target array ports and discard the commands
>>> effectively black-holing the commands and leave it that way until
>>> we recover and I watch the I/O. The recovery takes around 300s even
>>> with all the tuning and this effectively lands up in Oracle cluster
>>> evictions.
>>
>> This discussion started as a discussion about the time needed to fail
>> over from one path to another. How long did it take in your test before
>> I/O failed over from the jammed port to another port?
>
> Around 300s before the paths were declared hard failed and the
> devices offlined. This is when I/O restarts.
> The remaining paths on the second Qlogic port (that are not jammed)
> will not be used until the error handler activity completes.
>
> Until we get these for example, and device-mapper starts declaring
> paths down we are blocked.
> Apr 29 17:20:51 localhost kernel: sd 1:0:1:0: Device offlined - not
> ready after error recovery
> Apr 29 17:20:51 localhost kernel: sd 1:0:1:13: Device offlined - not
> ready after error recovery

Hello Laurence,

Everyone else on all mailing lists to which this message has been posted 
replies below the message. Please follow this convention.

Regarding the fail-over time: the ib_srp driver guarantees that 
scsi_done() is invoked from inside its terminate_rport_io() function. 
Apparently the lpfc and the qla2xxx drivers behave differently. Please 
work with the maintainers of these drivers to reduce fail-over time.

Bart.

Hello Bart

Even in the case of ib_srp, don't we still have to run the eh_timeout serially 
for each of the devices that has in-flight commands requiring error handling?
This means we will still have to wait to get a path failover until all are 
through the timeout.

Thanks
Laurence

Re: [PATCH] st: clear ILI if Medium Error

2016-04-18 Thread Laurence Oberman

Looks good
Reviewed-by: Laurence Oberman <lober...@redhat.com>

Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services

- Original Message -
From: "Kai Makisara" <kai.makis...@kolumbus.fi>
To: linux-scsi@vger.kernel.org
Cc: mlomb...@redhat.com
Sent: Monday, April 18, 2016 1:47:18 AM
Subject: [PATCH] st: clear ILI if Medium Error

Some drives set the ILI flag together with MEDIUM ERROR
sense code. Clear the ILI flag in this case so that the
medium error will be handled. The problem was reported by
Maurizio Lombardi.

Signed-off-by: Kai Mäkisara <kai.makis...@kolumbus.fi>
---
 drivers/scsi/st.c |9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

--- a/drivers/scsi/st.c 2016-04-17 21:22:15.671897001 +0300
+++ b/drivers/scsi/st.c 2016-04-17 22:25:39.234321293 +0300
@@ -1974,9 +1974,12 @@ static long read_tape(struct scsi_tape *
 				transfer = (int)cmdstatp->uremainder64;
 			else
 				transfer = 0;
-			if (STp->block_size == 0 &&
-			    cmdstatp->sense_hdr.sense_key == MEDIUM_ERROR)
-				transfer = bytes;
+			if (cmdstatp->sense_hdr.sense_key == MEDIUM_ERROR) {
+				if (STp->block_size == 0)
+					transfer = bytes;
+				/* Some drives set ILI with MEDIUM ERROR */
+				cmdstatp->flags &= ~SENSE_ILI;
+			}
 
 			if (cmdstatp->flags & SENSE_ILI) {	/* ILI */
 				if (STp->block_size == 0 &&

Re: [PATCH] tcm_qla2xxx Add SCSI command jammer/discard capability to the tcm_qla2xxx module

2016-05-09 Thread Laurence Oberman



- Original Message -
> From: "Laurence Oberman" <lober...@redhat.com>
> To: "Nicholas A. Bellinger" <n...@linux-iscsi.org>
> Cc: "Himanshu Madhani" <himanshu.madh...@qlogic.com>, "Bart Van Assche" 
> <bart.vanass...@sandisk.com>, "linux-scsi"
> <linux-scsi@vger.kernel.org>, "target-devel" <target-de...@vger.kernel.org>, 
> "Quinn Tran" <quinn.t...@qlogic.com>
> Sent: Monday, April 4, 2016 6:50:03 PM
> Subject: Re: [PATCH]  tcm_qla2xxx Add SCSI command jammer/discard capability 
> to the tcm_qla2xxx module
> 
> Hello Nicholas
> 
> Its fixed now.
> Many Thanks.
> 
> $ scripts/checkpatch.pl
> 0001-tcm_qla2xxx-Add-SCSI-command-jammer-discard-capabili.patch
> WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
> #12:
> new file mode 100644
> 
> total: 0 errors, 1 warnings, 91 lines checked
> 
> 0001-tcm_qla2xxx-Add-SCSI-command-jammer-discard-capabili.patch has style
> problems, please review.
> 
> NOTE: If any of the errors are false positives, please report
>   them to the maintainer, see CHECKPATCH in MAINTAINERS.
> 
> 
> 
> Tested by: Laurence Oberman <lober...@redhat.com>
> Signed-off-by: Laurence Oberman <lober...@redhat.com>
> ---
>  Documentation/scsi/tcm_qla2xxx.txt |   22 ++
>  drivers/scsi/qla2xxx/Kconfig   |9 +
>  drivers/scsi/qla2xxx/tcm_qla2xxx.c |   20 
>  drivers/scsi/qla2xxx/tcm_qla2xxx.h |1 +
>  4 files changed, 52 insertions(+), 0 deletions(-)
>  create mode 100644 Documentation/scsi/tcm_qla2xxx.txt
> 
> diff --git a/Documentation/scsi/tcm_qla2xxx.txt
> b/Documentation/scsi/tcm_qla2xxx.txt
> new file mode 100644
> index 000..c3a670a
> --- /dev/null
> +++ b/Documentation/scsi/tcm_qla2xxx.txt
> @@ -0,0 +1,22 @@
> +tcm_qla2xxx jam_host attribute
> +--
> +There is now a new module endpoint atribute called jam_host
> +attribute: jam_host: boolean=0/1
> +This attribute and accompanying code is only included if the
> +Kconfig parameter TCM_QLA2XXX_DEBUG is set to Y
> +By default this jammer code and functionality is disabled
> +
> +Use this attribute to control the discarding of SCSI commands to a
> +selected host.
> +This may be useful for testing error handling and simulating slow drain
> +and other fabric issues.
> +
> +Setting a boolean of 1 for the jam_host attribute for a particular host
> + will discard the commands for that host.
> +Reset back to 0 to stop the jamming.
> +
> +Enable host 4 to be jammed
> +echo 1 >
> /sys/kernel/config/target/qla2xxx/21:00:00:24:ff:27:8f:ae/tpgt_1/attrib/jam_host
> +
> +Disable jamming on host 4
> +echo 0 >
> /sys/kernel/config/target/qla2xxx/21:00:00:24:ff:27:8f:ae/tpgt_1/attrib/jam_host
> diff --git a/drivers/scsi/qla2xxx/Kconfig b/drivers/scsi/qla2xxx/Kconfig
> index 10aa18b..67c0d5a 100644
> --- a/drivers/scsi/qla2xxx/Kconfig
> +++ b/drivers/scsi/qla2xxx/Kconfig
> @@ -36,3 +36,12 @@ config TCM_QLA2XXX
>   default n
>   ---help---
>   Say Y here to enable the TCM_QLA2XXX fabric module for QLogic 24xx+ 
> series
>   target mode HBAs
> +
> +if TCM_QLA2XXX
> +config TCM_QLA2XXX_DEBUG
> + bool "TCM_QLA2XXX fabric module DEBUG mode for QLogic 24xx+ series 
> target
> mode HBAs"
> + default n
> + ---help---
> + Say Y here to enable the TCM_QLA2XXX fabric module DEBUG for QLogic 
> 24xx+
> series target mode HBAs
> + This will include code to enable the SCSI command jammer
> +endif
> diff --git a/drivers/scsi/qla2xxx/tcm_qla2xxx.c
> b/drivers/scsi/qla2xxx/tcm_qla2xxx.c
> index 1808a01..948224e 100644
> --- a/drivers/scsi/qla2xxx/tcm_qla2xxx.c
> +++ b/drivers/scsi/qla2xxx/tcm_qla2xxx.c
> @@ -457,6 +457,10 @@ static int tcm_qla2xxx_handle_cmd(scsi_qla_host_t *vha,
> struct qla_tgt_cmd *cmd,
>   struct se_cmd *se_cmd = &cmd->se_cmd;
>   struct se_session *se_sess;
>   struct qla_tgt_sess *sess;
> +#ifdef CONFIG_TCM_QLA2XXX_DEBUG
> + struct se_portal_group *se_tpg;
> + struct tcm_qla2xxx_tpg *tpg;
> +#endif
>   int flags = TARGET_SCF_ACK_KREF;
>  
>   if (bidi)
> @@ -477,6 +481,15 @@ static int tcm_qla2xxx_handle_cmd(scsi_qla_host_t *vha,
> struct qla_tgt_cmd *cmd,
>   return -EINVAL;
>   }
>  
> +#ifdef CONFIG_TCM_QLA2XXX_DEBUG
> + se_tpg = se_sess->se_tpg;
> + tpg = container_of(se_tpg, struct tcm_qla2xxx_tpg, se_tpg);
> + if (unlikely(tpg->tpg_attrib.jam_host)) {
> + /* return, and dont run target_submit_cmd,discarding command

Re: [PATCH] scsi: Delete an unnecessary check before the function call "kfree"

2016-07-24 Thread Laurence Oberman



- Original Message -
> From: "SF Markus Elfring" <elfr...@users.sourceforge.net>
> To: linux-scsi@vger.kernel.org, "Christoph Hellwig" <h...@lst.de>, "Hannes 
> Reinecke" <h...@suse.de>, "James E. J.
> Bottomley" <j...@linux.vnet.ibm.com>, "Martin K. Petersen" 
> <martin.peter...@oracle.com>
> Cc: "LKML" <linux-ker...@vger.kernel.org>, kernel-janit...@vger.kernel.org, 
> "Julia Lawall" <julia.law...@lip6.fr>
> Sent: Sunday, July 24, 2016 8:30:35 AM
> Subject: [PATCH] scsi: Delete an unnecessary check before the function call 
> "kfree"
> 
> From: Markus Elfring <elfr...@users.sourceforge.net>
> Date: Sun, 24 Jul 2016 14:20:21 +0200
> 
> The kfree() function tests whether its argument is NULL and then
> returns immediately. Thus the test around the call is not needed.
> 
> This issue was detected by using the Coccinelle software.
> 
> Signed-off-by: Markus Elfring <elfr...@users.sourceforge.net>
> ---
>  drivers/scsi/scsi.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
> index 1f36aca..1794c0c 100644
> --- a/drivers/scsi/scsi.c
> +++ b/drivers/scsi/scsi.c
> @@ -864,8 +864,7 @@ retry_pg83:
>   rcu_assign_pointer(sdev->vpd_pg83, vpd_buf);
>   mutex_unlock(&sdev->inquiry_mutex);
>   synchronize_rcu();
> - if (orig_vpd_buf)
> - kfree(orig_vpd_buf);
> + kfree(orig_vpd_buf);
>   }
>  }
>  
> --
> 2.9.2
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

Looks fine. One small comment: with the check removed, the function call is
made first and sets up variables etc. before kfree() tests for NULL, so it is
slightly more expensive than a simple NULL check beforehand.

Reviewed-by: Laurence Oberman <lober...@redhat.com>

Re: [PATCH 0082/1285] Replace numeric parameter like 0444 with macro

2016-08-02 Thread Laurence Oberman



- Original Message -
> From: "Baole Ni" <baolex...@intel.com>
> To: "don brace" <don.br...@microsemi.com>, "len brown" <len.br...@intel.com>, 
> pa...@ucw.cz,
> gre...@linuxfoundation.org, h...@zytor.com, x...@kernel.org
> Cc: "iss storagedev" <iss_storage...@hp.com>, "esc storagedev" 
> <esc.storage...@microsemi.com>,
> linux-scsi@vger.kernel.org, linux-ker...@vger.kernel.org, "chuansheng liu" 
> <chuansheng@intel.com>, "baolex ni"
> <baolex...@intel.com>
> Sent: Tuesday, August 2, 2016 6:39:14 AM
> Subject: [PATCH 0082/1285] Replace numeric parameter like 0444 with macro
> 
> I find that the developers often just specified the numeric value
> when calling a macro which is defined with a parameter for access permission.
> As we know, these numeric value for access permission have had the
> corresponding macro,
> and that using macro can improve the robustness and readability of the code,
> thus, I suggest replacing the numeric parameter with the macro.
> 
> Signed-off-by: Chuansheng Liu <chuansheng@intel.com>
> Signed-off-by: Baole Ni <baolex...@intel.com>
> ---
>  drivers/block/cciss.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/block/cciss.c b/drivers/block/cciss.c
> index 63c2064..05dc1bd 100644
> --- a/drivers/block/cciss.c
> +++ b/drivers/block/cciss.c
> @@ -67,7 +67,7 @@ MODULE_SUPPORTED_DEVICE("HP Smart Array Controllers");
>  MODULE_VERSION("3.6.26");
>  MODULE_LICENSE("GPL");
>  static int cciss_tape_cmds = 6;
> -module_param(cciss_tape_cmds, int, 0644);
> +module_param(cciss_tape_cmds, int, S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH);
>  MODULE_PARM_DESC(cciss_tape_cmds,
>   "number of commands to allocate for tape devices (default: 6)");
>  static int cciss_simple_mode;
> --
> 2.9.2
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
Looks fine.
Reviewed-by: Laurence Oberman <lober...@redhat.com>
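
As a quick sanity check on the patch above, the macro expression should
evaluate to the same mode as the literal 0644. A minimal sketch in shell
arithmetic, assuming the standard POSIX octal values for the four S_I* bits
(the variable names mirror the C macros; macro_mode is a hypothetical helper,
not part of the patch):

```shell
# Standard POSIX permission bit values, as defined for <sys/stat.h>.
S_IRUSR=0400   # owner read
S_IWUSR=0200   # owner write
S_IRGRP=0040   # group read
S_IROTH=0004   # others read

# Print the OR of the four bits as a four-digit octal mode.
macro_mode() {
    printf '%04o\n' "$(( S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH ))"
}
```

Calling macro_mode prints the mode that the replacement expression in the
diff stands for, which can be compared by eye against the 0644 it replaces.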

Re: dm-mq and end_clone_request()

2016-08-02 Thread Laurence Oberman

Hi Bart

I simplified the test to two simple scripts, running against only one XFS file
system.
Can you validate these and tell me if it's enough to emulate what you are doing?
Perhaps our test suite is too simple.

Start the test

# cat run_test.sh
#!/bin/bash
logger "Starting Bart's test"
#for i in `seq 1 10`
for i in 1
do
fio --verify=md5 -rw=randwrite --size=10M --bs=4K --loops=$((10**6)) \
--iodepth=64 --group_reporting --sync=1 --direct=1 --ioengine=libaio \
--directory="/data-$i" --name=data-integrity-test --thread --numjobs=16 \
--runtime=600 --output=fio-output.txt >/dev/null &
done

Delete the hosts. I wait 10s between host deletions,
but I also tested with 3s and it is still stable with Mike's patches.

#!/bin/bash
for i in /sys/class/srp_remote_ports/*
do
 echo "Deleting host $i, it will re-connect via srp_daemon" 
 echo 1 > $i/delete
 sleep 10
done

Check for I/O errors affecting XFS: we now have none with the patches Mike
provided.
After recovery I can create files in the XFS mount with no issues.

Can you use my scripts with one mount and see if it still fails for you?

Thanks
Laurence

- Original Message -
> From: "Mike Snitzer" <snit...@redhat.com>
> To: "Bart Van Assche" <bart.vanass...@sandisk.com>
> Cc: dm-de...@redhat.com, "Laurence Oberman" <lober...@redhat.com>, 
> linux-scsi@vger.kernel.org
> Sent: Tuesday, August 2, 2016 8:40:14 PM
> Subject: Re: dm-mq and end_clone_request()
> 
> On Tue, Aug 02 2016 at  8:19pm -0400,
> Bart Van Assche <bart.vanass...@sandisk.com> wrote:
> 
> > On 08/02/2016 10:45 AM, Mike Snitzer wrote:
> > > Please do these same tests against a v4.7 kernel with the 4 patches from
> > > this branch applied (no need for your other debug patches):
> > > https://git.kernel.org/cgit/linux/kernel/git/device-mapper/linux-dm.git/log/?h=dm-4.7-mpath-fixes
> > > 
> > > I've had good results with my blk-mq SRP based testing.
> > 
> > Hello Mike,
> > 
> > Thanks again for having made these patches available. The results of my
> > tests are as follows:
> 
> Disappointing.  But I asked you to run the v4.7 kernel patches I
> pointed to _without_ any of your debug patches.
> 
> I cannot reproduce on our SRP testbed with the fixes I provided.  We're
> now in a place where there would appear to be something very unique to
> your environment causing these failures.
> 

Re: dm-mq and end_clone_request()

2016-08-02 Thread Laurence Oberman



- Original Message -
> From: "Mike Snitzer" <snit...@redhat.com>
> To: "Laurence Oberman" <lober...@redhat.com>
> Cc: "Bart Van Assche" <bart.vanass...@sandisk.com>, dm-de...@redhat.com, 
> linux-scsi@vger.kernel.org
> Sent: Tuesday, August 2, 2016 10:10:12 PM
> Subject: Re: dm-mq and end_clone_request()
> 
> On Tue, Aug 02 2016 at  9:33pm -0400,
> Laurence Oberman <lober...@redhat.com> wrote:
> 
> > Hi Bart
> > 
> > I simplified the test to 2 simple scripts and only running against one XFS
> > file system.
> > Can you validate these and tell me if its enough to emulate what you are
> > doing.
> > Perhaps our test-suite is too simple.
> > 
> > Start the test
> > 
> > # cat run_test.sh
> > #!/bin/bash
> > logger "Starting Bart's test"
> > #for i in `seq 1 10`
> > for i in 1
> > do
> > fio --verify=md5 -rw=randwrite --size=10M --bs=4K --loops=$((10**6)) \
> > --iodepth=64 --group_reporting --sync=1 --direct=1
> > --ioengine=libaio \
> > --directory="/data-$i" --name=data-integrity-test --thread
> > --numjobs=16 \
> > --runtime=600 --output=fio-output.txt >/dev/null &
> > done
> > 
> > Delete the host, I wait 10s in between host deletions.
> > But I also tested with 3s and still its stable with Mike's patches.
> > 
> > #!/bin/bash
> > for i in /sys/class/srp_remote_ports/*
> > do
> >  echo "Deleting host $i, it will re-connect via srp_daemon"
> >  echo 1 > $i/delete
> >  sleep 10
> > done
> > 
> > Check for I/O errors affecting XFS and we now have none with the patches
> > Mike provided.
> > After recovery I can create files in the xfs mount with no issues.
> > 
> > Can you use my scripts and 1 mount and see if it still fails for you.
> 
> In parallel we can try Bart's testsuite that he shared earlier in this
> thread: https://github.com/bvanassche/srp-test
> 
> README.md says:
> "Running these tests manually is tedious. Hence this test suite that
> tests the SRP initiator and target drivers by loading both drivers on
> the same server, by logging in using the IB loopback functionality and
> by sending I/O through the SRP initiator driver to a RAM disk exported
> by the SRP target driver."
> 
> This could explain why Bart is still seeing issues.  He isn't testing
> real hardware -- as such he is using ramdisk to expose races, etc.
> 
> Mike
> 

Hi Mike,

I looked at Bart's scripts; they looked fine, but I wanted a simpler way
to bring the error out.
Using a ramdisk is not uncommon as an LIO backend via ib_srpt to serve LUNs.
That is the same way I do it when I am not connected to a large array, as it is
the only way I can get EDR-like speeds.

I don't think it's racing due to the ramdisk back-end, but maybe we need to
ramp ours up to run more in parallel in a loop.

I will run 21 parallel runs tonight to see if it makes a difference, and report
back tomorrow.
Clearly, prior to your final patches we were escaping back to the FS layer with
errors, but since your patches, at least in our test harness, that is resolved.

Thanks
Laurence
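
One way to ramp the test up to more parallel runs is to turn the commented-out
seq loop in run_test.sh into a parameter. A rough sketch, assuming one mount
point per run named /data-1 ... /data-N as in the script earlier in this
thread. start_runs, the FIO override and the per-run output file name are
illustrative additions, not part of the original script:

```shell
# Binary to launch; overridable so the loop can be dry-run with a stub.
FIO=${FIO:-fio}

# Start $1 fio jobs in the background, one per /data-N mount point.
# The fio flags are copied from run_test.sh; only the output file name
# is made unique per run (an assumption, to avoid overwriting results).
start_runs() {
    count=$1
    i=1
    while [ "$i" -le "$count" ]; do
        "$FIO" --verify=md5 -rw=randwrite --size=10M --bs=4K --loops=1000000 \
            --iodepth=64 --group_reporting --sync=1 --direct=1 \
            --ioengine=libaio --directory="/data-$i" \
            --name=data-integrity-test --thread --numjobs=16 \
            --runtime=600 --output="fio-output-$i.txt" >/dev/null &
        i=$((i + 1))
    done
}
```

For example, "start_runs 21; wait" would launch 21 jobs and block until all of
them have finished or hit their runtime limit.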

Re: dm-mq and end_clone_request()

2016-08-02 Thread Laurence Oberman



- Original Message -
> From: "Laurence Oberman" <lober...@redhat.com>
> To: "Mike Snitzer" <snit...@redhat.com>
> Cc: "Bart Van Assche" <bart.vanass...@sandisk.com>, dm-de...@redhat.com, 
> linux-scsi@vger.kernel.org
> Sent: Tuesday, August 2, 2016 10:18:30 PM
> Subject: Re: dm-mq and end_clone_request()
> 
> 
> 
> - Original Message -
> > From: "Mike Snitzer" <snit...@redhat.com>
> > To: "Laurence Oberman" <lober...@redhat.com>
> > Cc: "Bart Van Assche" <bart.vanass...@sandisk.com>, dm-de...@redhat.com,
> > linux-scsi@vger.kernel.org
> > Sent: Tuesday, August 2, 2016 10:10:12 PM
> > Subject: Re: dm-mq and end_clone_request()
> > 
> > On Tue, Aug 02 2016 at  9:33pm -0400,
> > Laurence Oberman <lober...@redhat.com> wrote:
> > 
> > > Hi Bart
> > > 
> > > I simplified the test to 2 simple scripts and only running against one
> > > XFS
> > > file system.
> > > Can you validate these and tell me if its enough to emulate what you are
> > > doing.
> > > Perhaps our test-suite is too simple.
> > > 
> > > Start the test
> > > 
> > > # cat run_test.sh
> > > #!/bin/bash
> > > logger "Starting Bart's test"
> > > #for i in `seq 1 10`
> > > for i in 1
> > > do
> > >   fio --verify=md5 -rw=randwrite --size=10M --bs=4K --loops=$((10**6)) \
> > > --iodepth=64 --group_reporting --sync=1 --direct=1
> > > --ioengine=libaio \
> > > --directory="/data-$i" --name=data-integrity-test --thread
> > > --numjobs=16 \
> > > --runtime=600 --output=fio-output.txt >/dev/null &
> > > done
> > > 
> > > Delete the host, I wait 10s in between host deletions.
> > > But I also tested with 3s and still its stable with Mike's patches.
> > > 
> > > #!/bin/bash
> > > for i in /sys/class/srp_remote_ports/*
> > > do
> > >  echo "Deleting host $i, it will re-connect via srp_daemon"
> > >  echo 1 > $i/delete
> > >  sleep 10
> > > done
> > > 
> > > Check for I/O errors affecting XFS and we now have none with the patches
> > > Mike provided.
> > > After recovery I can create files in the xfs mount with no issues.
> > > 
> > > Can you use my scripts and 1 mount and see if it still fails for you.
> > 
> > In parallel we can try Bart's testsuite that he shared earlier in this
> > thread: https://github.com/bvanassche/srp-test
> > 
> > README.md says:
> > "Running these tests manually is tedious. Hence this test suite that
> > tests the SRP initiator and target drivers by loading both drivers on
> > the same server, by logging in using the IB loopback functionality and
> > by sending I/O through the SRP initiator driver to a RAM disk exported
> > by the SRP target driver."
> > 
> > This could explain why Bart is still seeing issues.  He isn't testing
> > real hardware -- as such he is using ramdisk to expose races, etc.
> > 
> > Mike
> > 
> 
> Hi Mike,
> 
> I looked at Bart's scripts, they looked fine but I wanted a more simplified
> way to bring the error out.
> Using ramdisk is not uncommon as an LIO backend via ib_srpt to serve LUNS.
> That is the same way I do it when I am not connected to a large array as it
> is the only way I can get EDR like speeds.
> 
> I don't thinks its racing due to the ramdisk back-end but  maybe we need to
> ramp ours up to run more in parallel in a loop.
> 
> I will run 21 parallel runs and see if it makes a difference tonight and
> report back tomorrow.
> Clearly prior to your final patches we were escaping back to the FS layer
> with errors but since your patches, at least in out test harness that is
> resolved.
> 
> Thanks
> Laurence
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

Hello

I ran 20 parallel runs with 3 loops through host deletion, and in each case fio
survived with no hard error escaping to the FS layer.
It's solid in our test bed.
Keep in mind we have no ib_srpt loaded, as we have a hardware-based array and
are connected directly to the array with EDR 100.
I am also not removing and reloading modules as happens in Bart's
scripts, and I am not trying to delete mpath maps etc.

I focused only on the I/O error that was escaping up to the FS layer.
I will check in with Bart tomorrow.

Thanks
Laurence

Re: [dm-devel] dm-mq and end_clone_request()

2016-08-11 Thread Laurence Oberman



- Original Message -
> From: "Laurence Oberman" <lober...@redhat.com>
> To: "Bart Van Assche" <bart.vanass...@sandisk.com>
> Cc: dm-de...@redhat.com, linux-scsi@vger.kernel.org, "Mike Snitzer" 
> <snit...@redhat.com>, "Johannes Thumshirn"
> <jthumsh...@suse.de>
> Sent: Wednesday, August 10, 2016 5:38:16 PM
> Subject: Re: [dm-devel] dm-mq and end_clone_request()
> 
> 
> 
> - Original Message -
> > From: "Laurence Oberman" <lober...@redhat.com>
> > To: "Bart Van Assche" <bart.vanass...@sandisk.com>
> > Cc: dm-de...@redhat.com, linux-scsi@vger.kernel.org, "Mike Snitzer"
> > <snit...@redhat.com>, "Johannes Thumshirn"
> > <jthumsh...@suse.de>
> > Sent: Tuesday, August 9, 2016 1:21:15 PM
> > Subject: Re: [dm-devel] dm-mq and end_clone_request()
> > 
> > 
> > 
> > - Original Message -
> > > From: "Bart Van Assche" <bart.vanass...@sandisk.com>
> > > To: "Laurence Oberman" <lober...@redhat.com>
> > > Cc: dm-de...@redhat.com, linux-scsi@vger.kernel.org, "Mike Snitzer"
> > > <snit...@redhat.com>, "Johannes Thumshirn"
> > > <jthumsh...@suse.de>
> > > Sent: Tuesday, August 9, 2016 1:16:52 PM
> > > Subject: Re: [dm-devel] dm-mq and end_clone_request()
> > > 
> > > On 08/09/2016 10:12 AM, Laurence Oberman wrote:
> > > > I was talking about this patch
> > > >
> > > > --- a/drivers/scsi/scsi_scan.c
> > > > +++ b/drivers/scsi/scsi_scan.c
> > > > @@ -1890,10 +1890,11 @@ void scsi_forget_host(struct Scsi_Host *shost)
> > > >   restart:
> > > >  spin_lock_irqsave(shost->host_lock, flags);
> > > >  list_for_each_entry(sdev, >__devices, siblings) {
> > > > -if (sdev->sdev_state == SDEV_DEL)
> > > > +if (sdev->sdev_state == SDEV_DEL ||
> > > > scsi_device_get(sdev)
> > > > < 0)
> > > >  continue;
> > > >  spin_unlock_irqrestore(shost->host_lock, flags);
> > > >  __scsi_remove_device(sdev);
> > > > +scsi_device_put(sdev);
> > > >  goto restart;
> > > >  }
> > > >  spin_unlock_irqrestore(shost->host_lock, flags);
> > > 
> > > Hello Laurence,
> > > 
> > > Did you run your tests with that patch applied? If so, it would help if
> > > you could rerun your tests without that patch. If the above patch makes
> > > a difference it means that it can happen that __scsi_remove_device()
> > > does not change the device state into SDEV_DEL. That's a bug and we need
> > > to know whether or not __scsi_remove_device() behaves correctly.
> > > 
> > > Thanks,
> > > 
> > > Bart.
> > > 
> > Yes Sir, I ran all yesterdays tests on your kernel with that patch applied.
> > Of course it may well just be luck/coincidence that the host delete race is
> > no longer happening
> > so I agree we need to re-run the tests so I will revert and re-run.
> > I will probably only get back to you tomorrow with the results.
> > 
> > Thanks
> > Laurence
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> > the body of a message to majord...@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > 
> Hello Bart
> 
> I only just got time now to revert that patch and build a kernel.
> Will test this tonight and let you know if I am back to seeing panics
> sporadically without the patch.
> As already mentioned, this is a different configuration to what I had when I
> was able to reproduce the panic.
> This means the lack of hitting this stack trace and panic may turn out to
> have nothing to do with the patch I applied.
> 
> Thanks
> Laurence
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
Hello Bart

I can no longer reproduce the stack trace even with the patch reverted, so it's
behaving as you expected and the patch is, as you already said, not valid.
I ran about 30 fio tests with your kernel and multiple host deletions, and
experienced only one hard fio error.
My tests now produce the same results as you are seeing.

That single fio error came only after many more executions of the test, so it's
not easy to get these fio errors.

I am away on vacation for 10 days from tomorrow.

Thanks
Laurence

Re: [dm-devel] dm-mq and end_clone_request()

2016-08-09 Thread Laurence Oberman



- Original Message -
> From: "Bart Van Assche" <bart.vanass...@sandisk.com>
> To: "Laurence Oberman" <lober...@redhat.com>
> Cc: dm-de...@redhat.com, linux-scsi@vger.kernel.org, "Mike Snitzer" 
> <snit...@redhat.com>, "Johannes Thumshirn"
> <jthumsh...@suse.de>
> Sent: Tuesday, August 9, 2016 1:16:52 PM
> Subject: Re: [dm-devel] dm-mq and end_clone_request()
> 
> On 08/09/2016 10:12 AM, Laurence Oberman wrote:
> > I was talking about this patch
> >
> > --- a/drivers/scsi/scsi_scan.c
> > +++ b/drivers/scsi/scsi_scan.c
> > @@ -1890,10 +1890,11 @@ void scsi_forget_host(struct Scsi_Host *shost)
> >   restart:
> >  spin_lock_irqsave(shost->host_lock, flags);
> >  list_for_each_entry(sdev, &shost->__devices, siblings) {
> > -if (sdev->sdev_state == SDEV_DEL)
> > +if (sdev->sdev_state == SDEV_DEL || scsi_device_get(sdev) < 0)
> >  continue;
> >  spin_unlock_irqrestore(shost->host_lock, flags);
> >  __scsi_remove_device(sdev);
> > +scsi_device_put(sdev);
> >  goto restart;
> >  }
> >  spin_unlock_irqrestore(shost->host_lock, flags);
> 
> Hello Laurence,
> 
> Did you run your tests with that patch applied? If so, it would help if
> you could rerun your tests without that patch. If the above patch makes
> a difference it means that it can happen that __scsi_remove_device()
> does not change the device state into SDEV_DEL. That's a bug and we need
> to know whether or not __scsi_remove_device() behaves correctly.
> 
> Thanks,
> 
> Bart.
> 
Yes Sir, I ran all of yesterday's tests on your kernel with that patch applied.
Of course, it may well just be luck/coincidence that the host delete race is no
longer happening,
so I agree we need to re-run the tests. I will revert and re-run.
I will probably only get back to you tomorrow with the results.

Thanks
Laurence

Re: [dm-devel] dm-mq and end_clone_request()

2016-08-09 Thread Laurence Oberman



- Original Message -
> From: "Bart Van Assche" <bart.vanass...@sandisk.com>
> To: "Laurence Oberman" <lober...@redhat.com>
> Cc: dm-de...@redhat.com, "Mike Snitzer" <snit...@redhat.com>, 
> linux-scsi@vger.kernel.org, "Johannes Thumshirn"
> <jthumsh...@suse.de>
> Sent: Tuesday, August 9, 2016 11:51:00 AM
> Subject: Re: [dm-devel] dm-mq and end_clone_request()
> 
> On 08/08/2016 05:09 PM, Laurence Oberman wrote:
> > So now back to a 10 LUN dual path (ramdisk backed) two-server
> > configuration I am unable to reproduce the dm issue.
> > Recovery is very fast with the servers connected back to back.
> > This is using your kernel and this multipath.conf
> > 
> > [ ... ]
> > 
> > Mikes patches have definitely stabilized this issue for me on this
> > configuration.
> > 
> > I will see if I can move to a larger target server that has more
> > memory and allocate more mpath devices. I feel this issue in large
> > configurations is now rooted in multipath not bringing back maps
> > sometimes even when the actual paths are back via srp_daemon.
> > I am still tracking that down.
> > 
> > If you recall, last week I caused some of our own issues by
> > forgetting I had a no_path_retry 12 hiding in my multipath.conf.
> > Since removing that and spending most of the weekend testing on
> > the DDN array (had to give that back today), most of my issues
> > were either the sporadic host delete race or multipath not
> > re-instantiating paths.
> > 
> > I dont know if this helps, but since applying your latest patch I
> > have not seen the host delete race.
> 
> Hello Laurence,
> 
> My latest SCSI core patch adds additional instrumentation to the SCSI
> core but does not change the behavior of the SCSI core. So it cannot
> fix the scsi_forget_host() crash you had reported.
> 
> On my setup, with the kernel code from the srp-initiator-for-next
> branch and with CONFIG_DM_MQ_DEFAULT=n, I still see that when I run the
> srp-test software that fio reports I/O errors every now and then. What
> I see in syslog seems to indicate that these I/O errors are generated
> by dm-mpath:
> 
> Aug  9 08:45:39 ion-dev-ib-ini kernel: mpath 254:1: queue_if_no_path 1 -> 0
> Aug  9 08:45:39 ion-dev-ib-ini kernel: must_push_back: 107 callbacks
> suppressed
> Aug  9 08:45:39 ion-dev-ib-ini kernel: device-mapper: multipath:
> must_push_back: queue_if_no_path=0 suspend_active=1 suspending=0
> Aug  9 08:45:39 ion-dev-ib-ini kernel: __multipath_map(): (a) returning -5
> Aug  9 08:45:39 ion-dev-ib-ini kernel: map_request(): clone_and_map_rq()
> returned -5
> Aug  9 08:45:39 ion-dev-ib-ini kernel: dm_complete_request: error = -5
> Aug  9 08:45:39 ion-dev-ib-ini kernel: dm_softirq_done: dm-1 tio->error = -5
> 
> Bart.
> 
> 
Hello Bart

I was talking about this patch

--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -1890,10 +1890,11 @@ void scsi_forget_host(struct Scsi_Host *shost)
  restart:
 spin_lock_irqsave(shost->host_lock, flags);
 list_for_each_entry(sdev, &shost->__devices, siblings) {
-if (sdev->sdev_state == SDEV_DEL)
+if (sdev->sdev_state == SDEV_DEL || scsi_device_get(sdev) < 0)
 continue;
 spin_unlock_irqrestore(shost->host_lock, flags);
 __scsi_remove_device(sdev);
+scsi_device_put(sdev);
 goto restart;
 }
 spin_unlock_irqrestore(shost->host_lock, flags);
-- 

This is the one I applied. That's not just instrumentation, right?

Thanks
Laurence

Re: dm-mq and end_clone_request()

2016-08-03 Thread Laurence Oberman



- Original Message -
> From: "Laurence Oberman" <lober...@redhat.com>
> To: "Mike Snitzer" <snit...@redhat.com>
> Cc: "Bart Van Assche" <bart.vanass...@sandisk.com>, dm-de...@redhat.com, 
> linux-scsi@vger.kernel.org
> Sent: Tuesday, August 2, 2016 10:55:59 PM
> Subject: Re: dm-mq and end_clone_request()
> 
> 
> 
> - Original Message -
> > From: "Laurence Oberman" <lober...@redhat.com>
> > To: "Mike Snitzer" <snit...@redhat.com>
> > Cc: "Bart Van Assche" <bart.vanass...@sandisk.com>, dm-de...@redhat.com,
> > linux-scsi@vger.kernel.org
> > Sent: Tuesday, August 2, 2016 10:18:30 PM
> > Subject: Re: dm-mq and end_clone_request()
> > 
> > 
> > 
> > - Original Message -
> > > From: "Mike Snitzer" <snit...@redhat.com>
> > > To: "Laurence Oberman" <lober...@redhat.com>
> > > Cc: "Bart Van Assche" <bart.vanass...@sandisk.com>, dm-de...@redhat.com,
> > > linux-scsi@vger.kernel.org
> > > Sent: Tuesday, August 2, 2016 10:10:12 PM
> > > Subject: Re: dm-mq and end_clone_request()
> > > 
> > > On Tue, Aug 02 2016 at  9:33pm -0400,
> > > Laurence Oberman <lober...@redhat.com> wrote:
> > > 
> > > > Hi Bart
> > > > 
> > > > I simplified the test to 2 simple scripts and only running against one
> > > > XFS
> > > > file system.
> > > > Can you validate these and tell me if its enough to emulate what you
> > > > are
> > > > doing.
> > > > Perhaps our test-suite is too simple.
> > > > 
> > > > Start the test
> > > > 
> > > > # cat run_test.sh
> > > > #!/bin/bash
> > > > logger "Starting Bart's test"
> > > > #for i in `seq 1 10`
> > > > for i in 1
> > > > do
> > > > fio --verify=md5 -rw=randwrite --size=10M --bs=4K 
> > > > --loops=$((10**6)) \
> > > > --iodepth=64 --group_reporting --sync=1 --direct=1
> > > > --ioengine=libaio \
> > > > --directory="/data-$i" --name=data-integrity-test --thread
> > > > --numjobs=16 \
> > > > --runtime=600 --output=fio-output.txt >/dev/null &
> > > > done
> > > > 
> > > > Delete the host, I wait 10s in between host deletions.
> > > > But I also tested with 3s and still its stable with Mike's patches.
> > > > 
> > > > #!/bin/bash
> > > > for i in /sys/class/srp_remote_ports/*
> > > > do
> > > >  echo "Deleting host $i, it will re-connect via srp_daemon"
> > > >  echo 1 > $i/delete
> > > >  sleep 10
> > > > done
> > > > 
> > > > Check for I/O errors affecting XFS and we now have none with the
> > > > patches
> > > > Mike provided.
> > > > After recovery I can create files in the xfs mount with no issues.
> > > > 
> > > > Can you use my scripts and 1 mount and see if it still fails for you.
> > > 
> > > In parallel we can try Bart's testsuite that he shared earlier in this
> > > thread: https://github.com/bvanassche/srp-test
> > > 
> > > README.md says:
> > > "Running these tests manually is tedious. Hence this test suite that
> > > tests the SRP initiator and target drivers by loading both drivers on
> > > the same server, by logging in using the IB loopback functionality and
> > > by sending I/O through the SRP initiator driver to a RAM disk exported
> > > by the SRP target driver."
> > > 
> > > This could explain why Bart is still seeing issues.  He isn't testing
> > > real hardware -- as such he is using ramdisk to expose races, etc.
> > > 
> > > Mike
> > > 
> > 
> > Hi Mike,
> > 
> > I looked at Bart's scripts, they looked fine but I wanted a more simplified
> > way to bring the error out.
> > Using ramdisk is not uncommon as an LIO backend via ib_srpt to serve LUNS.
> > That is the same way I do it when I am not connected to a large array as it
> > is the only way I can get EDR like speeds.
> > 
> > I don't thinks its racing due to the ramdisk back-end but  maybe we need to
> > ramp ours up to run more in parallel in a loop.
> > 
> > I will run 21 parallel runs and see if it makes a difference tonight and

Re: [dm-devel] dm-mq and end_clone_request()

2016-08-03 Thread Laurence Oberman



- Original Message -
> From: "Bart Van Assche" <bart.vanass...@sandisk.com>
> To: "Laurence Oberman" <lober...@redhat.com>, "Mike Snitzer" 
> <snit...@redhat.com>
> Cc: dm-de...@redhat.com, linux-scsi@vger.kernel.org
> Sent: Wednesday, August 3, 2016 12:06:17 PM
> Subject: Re: [dm-devel] dm-mq and end_clone_request()
> 
> On 08/02/2016 06:33 PM, Laurence Oberman wrote:
> > #!/bin/bash
> > for i in /sys/class/srp_remote_ports/*
> > do
> >  echo "Deleting host $i, it will re-connect via srp_daemon"
> >  echo 1 > $i/delete
> >  sleep 10
> > done
> 
> Hello Laurence,
> 
> Sorry but the above looks wrong to me. There should be a second loop
> around this loop and the sleep statement should be moved from the inner
> loop to the outer loop. The above code logs out one (initiator, target)
> port pair at a time instead of logging out all paths at once.
> 
> Bart.
> 

Hi Bart

Latest tests are still good on our side.
I am now taking both paths out at the same time, but we still seem stable here.
The first test removed the sleep and we still had a delay; the second test
backgrounded the deletes so they ran as close as possible to the same time.
Both tests passed.
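
The second test described above (all paths removed at nearly the same time,
repeated in an outer loop as Bart suggested) might look roughly like this. It
is a sketch, not the exact script that was run: delete_all_ports, run_rounds
and the SRP_PORTS_DIR override are hypothetical names. Only the sysfs path and
the write of 1 to each port's delete attribute come from the original script.

```shell
# Root of the SRP remote-port sysfs entries; overridable for dry runs.
SRP_PORTS_DIR=${SRP_PORTS_DIR:-/sys/class/srp_remote_ports}

# Log out every remote port at (nearly) the same time: each write runs in
# the background, and we wait for all of them before returning.
delete_all_ports() {
    for port in "$SRP_PORTS_DIR"/*; do
        [ -d "$port" ] || continue   # skip the literal glob if nothing matched
        echo "Deleting host $port, it will re-connect via srp_daemon"
        echo 1 > "$port/delete" &
    done
    wait
}

# Outer loop: $1 rounds of "all paths down", sleeping $2 seconds after each
# round so srp_daemon has time to log back in.
run_rounds() {
    rounds=$1
    pause=$2
    n=0
    while [ "$n" -lt "$rounds" ]; do
        delete_all_ports
        sleep "$pause"
        n=$((n + 1))
    done
}
```

For example, "run_rounds 3 10" would take all paths down three times with a
10s pause after each round. The sleep sits in the outer loop only, so no delay
is added between the individual port deletions.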

I will email the messages log just to you.

With no sleep we still have a gap of 9s when we delete paths, and we are good.

Aug  3 13:41:21 jumpclient multipathd: 360001ff0b035d0008d71: remaining active paths: 1
Aug  3 13:41:22 jumpclient multipathd: 360001ff0b035d0028d720003: remaining active paths: 1
Aug  3 13:41:22 jumpclient multipathd: 360001ff0b035d0048d740005: remaining active paths: 1
Aug  3 13:41:22 jumpclient multipathd: 360001ff0b035d0068d760007: remaining active paths: 1
Aug  3 13:41:23 jumpclient multipathd: 360001ff0b035d00b8d7b000c: remaining active paths: 1
Aug  3 13:41:23 jumpclient multipathd: 360001ff0b035d00d8d7d000e: remaining active paths: 1
Aug  3 13:41:23 jumpclient multipathd: 360001ff0b035d0118d810012: remaining active paths: 1
Aug  3 13:41:24 jumpclient multipathd: 360001ff0b035d0138d830014: remaining active paths: 1
Aug  3 13:41:24 jumpclient multipathd: 360001ff0b035d0158d850016: remaining active paths: 1
Aug  3 13:41:25 jumpclient multipathd: 360001ff0b035d0178d870018: remaining active paths: 1
Aug  3 13:41:25 jumpclient multipathd: 360001ff0b035d0198d89001a: remaining active paths: 1
Aug  3 13:41:25 jumpclient multipathd: 360001ff0b035d01a8d8a001b: remaining active paths: 1
Aug  3 13:41:25 jumpclient multipathd: 360001ff0b035d01c8d8c001d: remaining active paths: 1
Aug  3 13:41:26 jumpclient multipathd: 360001ff0b035d01e8d8e001f: remaining active paths: 1
Aug  3 13:41:26 jumpclient multipathd: 360001ff0b035d01f8d8f0020: remaining active paths: 1
Aug  3 13:41:26 jumpclient multipathd: 360001ff0b035d0208d900021: remaining active paths: 1
Aug  3 13:41:26 jumpclient multipathd: 360001ff0b035d0228d920023: remaining active paths: 1
Aug  3 13:41:28 jumpclient multipathd: 360001ff0b035d0248d940025: remaining active paths: 1
Aug  3 13:41:29 jumpclient multipathd: 360001ff0b035d0268d960027: remaining active paths: 1
Aug  3 13:41:29 jumpclient multipathd: 360001ff0b035d0278d970028: remaining active paths: 1
Aug  3 13:41:30 jumpclient multipathd: 360001ff0b035d0288d980029: remaining active paths: 1
Aug  3 13:41:35 jumpclient multipathd: 360001ff0b035d0008d71: remaining active paths: 0
Aug  3 13:41:36 jumpclient multipathd: 360001ff0b035d0028d720003: remaining active paths: 0
Aug  3 13:41:37 jumpclient multipathd: 360001ff0b035d0048d740005: remaining active paths: 0
Aug  3 13:41:37 jumpclient multipathd: 360001ff0b035d0068d760007: remaining active paths: 0
Aug  3 13:41:38 jumpclient multipathd: 360001ff0b035d00b8d7b000c: remaining active paths: 0
Aug  3 13:41:38 jumpclient multipathd: 360001ff0b035d00d8d7d000e: remaining active paths: 0
Aug  3 13:41:38 jumpclient multipathd: 360001ff0b035d0108d800011: remaining active paths: 0
Aug  3 13:41:38 jumpclient multipathd: 360001ff0b035d0118d810012: remaining active paths: 0
Aug  3 13:41:38 jumpclient multipathd: 360001ff0b035d0138d830014: remaining active paths: 0
Aug  3 13:41:39 jumpclient multipathd: 360001ff0b035d0158d850016: remaining active paths: 0
Aug  3 13:41:39 jumpclient multipathd: 360001ff0b035d0178d870018: remaining active paths: 0
Aug  3 13:41:39 jumpclient multipathd: 360001ff0b035d0198d89001a: remaining active paths: 0
Aug  3 13:41:39 jumpclient multipathd: 360001ff0b035d01a8d8a001b: remaining active paths: 0
Aug  3 13:41:39 jumpclient multipathd: 360001ff0b035d01c8d8c001d: remaining active paths: 0
Aug  3 13:41:39

Re: [dm-devel] dm-mq and end_clone_request()

2016-08-03 Thread Laurence Oberman



- Original Message -
> From: "Bart Van Assche" <bart.vanass...@sandisk.com>
> To: "Laurence Oberman" <lober...@redhat.com>, "Mike Snitzer" 
> <snit...@redhat.com>
> Cc: dm-de...@redhat.com, linux-scsi@vger.kernel.org
> Sent: Wednesday, August 3, 2016 12:06:17 PM
> Subject: Re: [dm-devel] dm-mq and end_clone_request()
> 
> On 08/02/2016 06:33 PM, Laurence Oberman wrote:
> > #!/bin/bash
> > for i in /sys/class/srp_remote_ports/*
> > do
> >  echo "Deleting host $i, it will re-connect via srp_daemon"
> >  echo 1 > $i/delete
> >  sleep 10
> > done
> 
> Hello Laurence,
> 
> Sorry but the above looks wrong to me. There should be a second loop
> around this loop and the sleep statement should be moved from the inner
> loop to the outer loop. The above code logs out one (initiator, target)
> port pair at a time instead of logging out all paths at once.
> 
> Bart.
> 

Hi Bart

It logs out each host in turn with a 10s sleep in between.
I actually reduced the sleep to 3s last night.
We do land up with all paths lost but not at precisely the same second.

Are you saying we have to lose all paths at the same time?
That is easy to fix; I was running it that way in the beginning. I will re-test.
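
For reference, a minimal sketch of the restructured loop Bart describes: the inner loop logs out every remote port back to back, and the sleep moves to an outer loop. The SRP_ROOT, ROUNDS and PAUSE variables and the two function names are hypothetical, added only for illustration; the sysfs path and the delete attribute are the ones used in the script quoted above.

```shell
#!/bin/bash
# Sketch only: SRP_ROOT, ROUNDS and PAUSE are hypothetical knobs, not part of
# the original script. Defaults match the original path and 10s sleep.
SRP_ROOT="${SRP_ROOT:-/sys/class/srp_remote_ports}"
ROUNDS="${ROUNDS:-1}"
PAUSE="${PAUSE:-10}"

# Inner loop: log out every remote port with no sleep in between,
# so all paths are lost at (nearly) the same moment.
logout_all_ports() {
    local port
    for port in "$SRP_ROOT"/*
    do
        # Skip the unexpanded glob or entries without a delete attribute.
        [ -e "$port/delete" ] || continue
        echo "Deleting host $port, it will re-connect via srp_daemon"
        echo 1 > "$port/delete"
    done
}

# Outer loop: one full logout per round, then wait for srp_daemon
# to re-establish the connections before the next round.
run_rounds() {
    local round
    for ((round = 1; round <= ROUNDS; round++))
    do
        logout_all_ports
        sleep "$PAUSE"
    done
}
```

Calling run_rounds with, for example, ROUNDS=100 PAUSE=10 would repeat the full logout 100 times; whether 10 seconds is long enough for all paths to come back depends on the setup.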

Thanks
Laurence

Re: [dm-devel] dm-mq and end_clone_request()

2016-08-10 Thread Laurence Oberman



- Original Message -
> From: "Laurence Oberman" <lober...@redhat.com>
> To: "Bart Van Assche" <bart.vanass...@sandisk.com>
> Cc: dm-de...@redhat.com, linux-scsi@vger.kernel.org, "Mike Snitzer" 
> <snit...@redhat.com>, "Johannes Thumshirn"
> <jthumsh...@suse.de>
> Sent: Tuesday, August 9, 2016 1:21:15 PM
> Subject: Re: [dm-devel] dm-mq and end_clone_request()
> 
> 
> 
> - Original Message -
> > From: "Bart Van Assche" <bart.vanass...@sandisk.com>
> > To: "Laurence Oberman" <lober...@redhat.com>
> > Cc: dm-de...@redhat.com, linux-scsi@vger.kernel.org, "Mike Snitzer"
> > <snit...@redhat.com>, "Johannes Thumshirn"
> > <jthumsh...@suse.de>
> > Sent: Tuesday, August 9, 2016 1:16:52 PM
> > Subject: Re: [dm-devel] dm-mq and end_clone_request()
> > 
> > On 08/09/2016 10:12 AM, Laurence Oberman wrote:
> > > I was talking about this patch
> > >
> > > --- a/drivers/scsi/scsi_scan.c
> > > +++ b/drivers/scsi/scsi_scan.c
> > > @@ -1890,10 +1890,11 @@ void scsi_forget_host(struct Scsi_Host *shost)
> > >   restart:
> > >  spin_lock_irqsave(shost->host_lock, flags);
> > >  list_for_each_entry(sdev, &shost->__devices, siblings) {
> > > -if (sdev->sdev_state == SDEV_DEL)
> > > +if (sdev->sdev_state == SDEV_DEL || scsi_device_get(sdev) < 0)
> > >  continue;
> > >  spin_unlock_irqrestore(shost->host_lock, flags);
> > >  __scsi_remove_device(sdev);
> > > +scsi_device_put(sdev);
> > >  goto restart;
> > >  }
> > >  spin_unlock_irqrestore(shost->host_lock, flags);
> > 
> > Hello Laurence,
> > 
> > Did you run your tests with that patch applied? If so, it would help if
> > you could rerun your tests without that patch. If the above patch makes
> > a difference it means that it can happen that __scsi_remove_device()
> > does not change the device state into SDEV_DEL. That's a bug and we need
> > to know whether or not __scsi_remove_device() behaves correctly.
> > 
> > Thanks,
> > 
> > Bart.
> > 
> Yes Sir, I ran all of yesterday's tests on your kernel with that patch applied.
> Of course it may well just be luck/coincidence that the host delete race is
> no longer happening, so I agree we need to re-run the tests.
> I will revert the patch and re-run.
> I will probably only get back to you tomorrow with the results.
> 
> Thanks
> Laurence
> 
Hello Bart

I only just got time now to revert that patch and build a kernel.
Will test this tonight and let you know if I am back to seeing panics 
sporadically without the patch.
As already mentioned, this is a different configuration to what I had when I 
was able to reproduce the panic.
This means the lack of hitting this stack trace and panic may turn out to have 
nothing to do with the patch I applied.

Thanks
Laurence 
