Re: [dm-devel] [PATCH] scsi-dh-emc: fix activate vs set_params race
I can test it. I have a CLARiiON CX3.
Will get to it next week, traveling tomorrow.

Laurence

Sent from my iPhone

On Apr 4, 2013, at 7:11 PM, Mike Christie <micha...@cs.wisc.edu> wrote:

> On 04/02/2013 07:09 PM, Mikulas Patocka wrote:
>> Hi
>>
>> This fixes a possible race in scsi_dh_emc. It is untested because I
>> don't have the hardware. It could happen when we reload a multipath
>> device and path failure happens at the same time.
>
> I think this patch is ok. I do not have the hw to test it anymore. If
> you wanted to test just to make sure it is safe you should bug Rob
> Evers. He can help you find a machine in the westford lab that has it.

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Debug flag parameter for SCSI tape driver
Hello

I am tired of building modules to enable SCSI tape driver debug, so I am
hoping this patch is acceptable.
Tested using kernel 3.14.6.

Usage example: modprobe st debug_flag=1

diff -Nur a/st.c b/st.c
--- a/st.c	2014-06-10 16:45:18.522354105 -0400
+++ b/st.c	2014-06-10 16:45:33.953765908 -0400
@@ -80,6 +80,7 @@
 static int try_direct_io = TRY_DIRECT_IO;
 static int try_rdio = 1;
 static int try_wdio = 1;
+static int debug_flag = 0;
 
 static struct class st_sysfs_class;
 static const struct attribute_group *st_dev_groups[];
@@ -100,6 +101,9 @@
 MODULE_PARM_DESC(max_sg_segs, "Maximum number of scatter/gather segments to use (256)");
 module_param_named(try_direct_io, try_direct_io, int, 0);
 MODULE_PARM_DESC(try_direct_io, "Try direct I/O between user buffer and tape drive (1)");
+module_param_named(debug_flag, debug_flag, int, 0);
+MODULE_PARM_DESC(debug_flag, "Enable DEBUG, same as setting DEBUG 1 in source");
+
 
 /* Extra parameters for testing */
 module_param_named(try_rdio, try_rdio, int, 0);
@@ -124,6 +128,9 @@
 	},
 	{
 		"try_direct_io", &try_direct_io
+	},
+	{
+		"debug_flag", &debug_flag
 	}
 };
 #endif
@@ -4277,7 +4284,9 @@
 static int __init init_st(void)
 {
 	int err;
-
+	debugging = (debug_flag > 0) ? debug_flag : DEBUG;
+if (debugging)
+	printk(KERN_INFO "st: Debugging enabled debug_flag = %d\n",debugging);
 	validate_options();
 
 	printk(KERN_INFO "st: Version %s, fixed bufsize %d, s/g segs %d\n",

Thanks
Laurence
[PATCH]: add debug flag parameter for SCSI tape driver
Hello

Take 2 of this patch, changed module description and subject line.

This patch adds a debug_flag parameter that can be set on module load, and
allows the DEBUG facility without a module recompile.

Usage: modprobe st debug_flag=1

Signed-off-by: Laurence Oberman <lober...@redhat.com>

diff -Nur a/st.c b/st.c
--- a/st.c	2014-06-10 16:45:18.522354105 -0400
+++ b/st.c	2014-06-10 19:40:39.774387990 -0400
@@ -80,6 +80,7 @@
 static int try_direct_io = TRY_DIRECT_IO;
 static int try_rdio = 1;
 static int try_wdio = 1;
+static int debug_flag = 0;
 
 static struct class st_sysfs_class;
 static const struct attribute_group *st_dev_groups[];
@@ -100,6 +101,9 @@
 MODULE_PARM_DESC(max_sg_segs, "Maximum number of scatter/gather segments to use (256)");
 module_param_named(try_direct_io, try_direct_io, int, 0);
 MODULE_PARM_DESC(try_direct_io, "Try direct I/O between user buffer and tape drive (1)");
+module_param_named(debug_flag, debug_flag, int, 0);
+MODULE_PARM_DESC(debug_flag, "Enable DEBUG, same as setting debugging=1");
+
 
 /* Extra parameters for testing */
 module_param_named(try_rdio, try_rdio, int, 0);
@@ -124,6 +128,9 @@
 	},
 	{
 		"try_direct_io", &try_direct_io
+	},
+	{
+		"debug_flag", &debug_flag
 	}
 };
 #endif
@@ -4277,7 +4284,9 @@
 static int __init init_st(void)
 {
 	int err;
-
+	debugging = (debug_flag > 0) ? debug_flag : DEBUG;
+if (debugging)
+	printk(KERN_INFO "st: Debugging enabled debug_flag = %d\n",debugging);
 	validate_options();
 
 	printk(KERN_INFO "st: Version %s, fixed bufsize %d, s/g segs %d\n",

Thanks
Laurence
Re: [PATCH]: add debug flag parameter for SCSI tape driver
Kai,

Thank you for considering this.

With

#define DEBUG 0

we still include

#define DEB(a)
#define DEBC(a)

With the debug_flag we then provide the debug output I am looking for at module load time.
But I agree that it enables it for all devices, and that may not be optimal.
I don't change the default, I just allow the parameter to control it.

In the last few issues I have been working on, I have had to recompile and provide the st module to capture what I needed for debugging, so I decided to try the patch submission.

Thank You
Laurence

----- Original Message -----
From: Kai Mäkisara (Kolumbus) <kai.makis...@kolumbus.fi>
To: Laurence Oberman <lober...@redhat.com>
Cc: linux-scsi@vger.kernel.org
Sent: Wednesday, June 11, 2014 2:03:15 PM
Subject: Re: [PATCH]: add debug flag parameter for SCSI tape driver

On 11.6.2014, at 2.48, Laurence Oberman <lober...@redhat.com> wrote:
> Hello
>
> Take 2 of this patch, changed module description and subject line.
>
> This patch adds a debug_flag parameter that can be set on module load, and
> allows the DEBUG facility without a module recompile.
>
> Usage: modprobe st debug_flag=1
>
> Signed-off-by: Laurence Oberman <lober...@redhat.com>

What is wrong with the existing methods to control debugging? You can enable and disable debugging for each device with ioctl() (as described in the driver documentation). You can use mt-st to do this from the command line. Your patch just allows one to change the default for all devices.

The real problem may be that the distributions don't compile the debugging code into the drivers, but your patch does not change this.

Kai
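The ioctl route Kai mentions can be sketched from user space roughly as follows. This is a minimal sketch, not code from the thread: the helper names `st_debug_op()` and `st_set_debug()` are made up for illustration, and it assumes the `MTSETDRVBUFFER` operation and the `MT_ST_*` option bits from `<linux/mtio.h>`. It only has an effect when the debugging code is compiled into the driver, and setting driver options normally needs root.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/mtio.h>

/* Build the MTIOCTOP request that sets or clears the st debugging boolean.
 * (Hypothetical helper, named here for illustration only.) */
static struct mtop st_debug_op(int enable)
{
	struct mtop op;

	op.mt_op = MTSETDRVBUFFER;
	op.mt_count = (enable ? MT_ST_SETBOOLEANS : MT_ST_CLEARBOOLEANS) |
		      MT_ST_DEBUGGING;
	return op;
}

/* Send the request to one tape device node, e.g. "/dev/nst0".
 * Returns 0 on success and -1 if the open or the ioctl fails. */
static int st_set_debug(const char *dev, int enable)
{
	struct mtop op = st_debug_op(enable);
	int fd, ret;

	assert(dev != NULL);

	/* O_NONBLOCK so the open does not wait for a loaded tape. */
	fd = open(dev, O_RDONLY | O_NONBLOCK);
	if (fd < 0)
		return -1;

	ret = ioctl(fd, MTIOCTOP, &op);
	close(fd);
	return ret < 0 ? -1 : 0;
}
```

A caller would use `st_set_debug("/dev/nst0", 1)` to switch debugging on and pass 0 to switch it off again; the device path is only an example.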
Re: [PATCH]: add debug flag parameter for SCSI tape driver
Kai,

It's likely not worth doing this. I cross-checked and indeed many distros have this compiled out.
So let's leave it as is.

Thanks
Laurence
Re: [PATCH]: add debug flag parameter for SCSI tape driver - 2nd request
Hello Kai

You have seen this patch before.
The first time around, given that we don't enable DEBUG by default, I let it go.

However, we have been looking into defining DEBUG 1 by default here at Red Hat and then setting the default to disabled.
Are you open to considering changing the driver based on this patch, i.e. defining DEBUG 1 by default and adding this code with the default set to off?

Note that with DEBUG 0, as you know, you need to change that and recompile. That is exactly what I am trying to avoid with Enterprise customers.

This patch adds a debug_flag parameter that can be set on module load, and
allows the DEBUG facility without a module recompile.
Note that now DEBUG 1 is the default with this patch.

Usage: modprobe st debug_flag=1

Signed-off-by: Laurence Oberman <lober...@redhat.com>

diff -Nur a/st.c b/st.c
--- a/st.c	2014-10-17 16:15:54.103810627 -0400
+++ b/st.c	2014-10-17 16:22:12.303810392 -0400
@@ -56,7 +56,7 @@
 
 /* The driver prints some debugging information on the console if DEBUG
    is defined and non-zero. */
-#define DEBUG 0
+#define DEBUG 1
 
 #define ST_DEB_MSG  KERN_NOTICE
 #if DEBUG
@@ -80,6 +80,7 @@
 static int try_direct_io = TRY_DIRECT_IO;
 static int try_rdio = 1;
 static int try_wdio = 1;
+static int debug_flag = 0;
 
 static struct class st_sysfs_class;
 static const struct attribute_group *st_dev_groups[];
@@ -100,6 +101,9 @@
 MODULE_PARM_DESC(max_sg_segs, "Maximum number of scatter/gather segments to use (256)");
 module_param_named(try_direct_io, try_direct_io, int, 0);
 MODULE_PARM_DESC(try_direct_io, "Try direct I/O between user buffer and tape drive (1)");
+module_param_named(debug_flag, debug_flag, int, 0);
+MODULE_PARM_DESC(debug_flag, "Enable DEBUG, same as setting debugging=1");
+
 
 /* Extra parameters for testing */
 module_param_named(try_rdio, try_rdio, int, 0);
@@ -124,6 +128,9 @@
 	},
 	{
 		"try_direct_io", &try_direct_io
+	},
+	{
+		"debug_flag", &debug_flag
 	}
 };
 #endif
@@ -4306,6 +4313,11 @@
 
 	validate_options();
 
+	debugging = (debug_flag > 0) ? debug_flag : DEBUG;
+	if (debugging)
+		printk(KERN_INFO "st: Debugging enabled debug_flag = %d\n",debugging);
+
+
 	printk(KERN_INFO "st: Version %s, fixed bufsize %d, s/g segs %d\n",
 		verstr, st_fixed_buffer_size, st_max_sg_segs);
Re: [PATCH]: add debug flag parameter for SCSI tape driver - 2nd request
Oops, patch was defaulting to 1.
Here is v2, properly defining DEBUG 1 and defaulting to 0 unless debug_flag=1.

This patch adds a debug_flag parameter that can be set on module load, and
allows the DEBUG facility without a module recompile.
Note that now DEBUG 1 is the default with this patch.

Usage: modprobe st

Signed-off-by: Laurence Oberman <lober...@redhat.com>

diff -Nur a/st.c b/st.c
--- a/st.c	2014-10-17 16:15:54.103810627 -0400
+++ b/st.c	2014-10-17 16:42:45.992809531 -0400
@@ -56,7 +56,8 @@
 
 /* The driver prints some debugging information on the console if DEBUG
    is defined and non-zero. */
-#define DEBUG 0
+#define DEBUG 1
+#define NO_DEBUG 0
 
 #define ST_DEB_MSG  KERN_NOTICE
 #if DEBUG
@@ -80,6 +81,7 @@
 static int try_direct_io = TRY_DIRECT_IO;
 static int try_rdio = 1;
 static int try_wdio = 1;
+static int debug_flag = 0;
 
 static struct class st_sysfs_class;
 static const struct attribute_group *st_dev_groups[];
@@ -100,6 +102,9 @@
 MODULE_PARM_DESC(max_sg_segs, "Maximum number of scatter/gather segments to use (256)");
 module_param_named(try_direct_io, try_direct_io, int, 0);
 MODULE_PARM_DESC(try_direct_io, "Try direct I/O between user buffer and tape drive (1)");
+module_param_named(debug_flag, debug_flag, int, 0);
+MODULE_PARM_DESC(debug_flag, "Enable DEBUG, same as setting debugging=1");
+
 
 /* Extra parameters for testing */
 module_param_named(try_rdio, try_rdio, int, 0);
@@ -124,6 +129,9 @@
 	},
 	{
 		"try_direct_io", &try_direct_io
+	},
+	{
+		"debug_flag", &debug_flag
 	}
 };
 #endif
@@ -4306,6 +4314,11 @@
 
 	validate_options();
 
+	debugging = (debug_flag > 0) ? debug_flag : NO_DEBUG;
+	if (debugging)
+		printk(KERN_INFO "st: Debugging enabled debug_flag = %d\n",debugging);
+
+
 	printk(KERN_INFO "st: Version %s, fixed bufsize %d, s/g segs %d\n",
 		verstr, st_fixed_buffer_size, st_max_sg_segs);
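The "defaulting to 1" problem comes down to the fallback arm of the conditional expression. The user-space sketch below isolates that one expression; the function names `debugging_v1()` and `debugging_v2()` and the `self_check()` helper are made up for illustration and are not part of the driver.

```c
#include <assert.h>

#define DEBUG    1	/* debugging code is compiled in */
#define NO_DEBUG 0	/* runtime default: debugging off */

/* First posting of the 2nd request: with DEBUG now defined as 1, the
 * fallback arm switches debugging on even when debug_flag is unset. */
static int debugging_v1(int debug_flag)
{
	return (debug_flag > 0) ? debug_flag : DEBUG;
}

/* v2: the fallback is NO_DEBUG, so debugging stays off unless the
 * module is loaded with debug_flag=1 (or any positive value). */
static int debugging_v2(int debug_flag)
{
	return (debug_flag > 0) ? debug_flag : NO_DEBUG;
}

/* Walk through the cases that differ and the ones that do not. */
static void self_check(void)
{
	assert(debugging_v1(0) == 1);	/* plain "modprobe st": on, unintended */
	assert(debugging_v2(0) == 0);	/* plain "modprobe st": off */
	assert(debugging_v1(1) == 1);	/* "modprobe st debug_flag=1" */
	assert(debugging_v2(1) == 1);
}
```

Both versions agree whenever debug_flag is positive; they differ only in what an unset or non-positive debug_flag falls back to.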
Re: [PATCH]: add debug flag parameter for SCSI tape driver - 2nd request
Hello Kai

Thanks. Here is v3.

This patch adds a debug_flag parameter that can be set on module load, and
allows the DEBUG facility without a module recompile.
Note that now DEBUG 1 is the default with this patch.

Usage: modprobe st debug_flag=1

Signed-off-by: Laurence Oberman <lober...@redhat.com>

diff -Nur a/Documentation/scsi/st.txt b/Documentation/scsi/st.txt
--- a/Documentation/scsi/st.txt	2014-10-19 09:36:52.243863716 -0400
+++ b/Documentation/scsi/st.txt	2014-10-19 09:43:19.222863447 -0400
@@ -506,9 +506,11 @@
 
 DEBUGGING HINTS
 
-To enable debugging messages, edit st.c and #define DEBUG 1. As seen
-above, debugging can be switched off with an ioctl if debugging is
-compiled into the driver. The debugging output is not voluminous.
+Debugging code is now compiled in by default but debugging is turned off with
+the kernel module parameter debug_flag defaulting to 0.
+Debugging can still be switched on and off with an ioctl.
+To enable debug at module load time add debug_flag=1 to the module load
+options, the debugging output is not voluminous.
 
 If the tape seems to hang, I would be very interested to hear where
 the driver is waiting. With the command 'ps -l' you can see the state
diff -Nur a/drivers/scsi/st.c b/drivers/scsi/st.c
--- a/drivers/scsi/st.c	2014-10-19 09:35:45.673863756 -0400
+++ b/drivers/scsi/st.c	2014-10-19 09:35:30.621863483 -0400
@@ -56,7 +56,8 @@
 
 /* The driver prints some debugging information on the console if DEBUG
    is defined and non-zero. */
-#define DEBUG 0
+#define DEBUG 1
+#define NO_DEBUG 0
 
 #define ST_DEB_MSG  KERN_NOTICE
 #if DEBUG
@@ -80,6 +81,7 @@
 static int try_direct_io = TRY_DIRECT_IO;
 static int try_rdio = 1;
 static int try_wdio = 1;
+static int debug_flag = 0;
 
 static struct class st_sysfs_class;
 static const struct attribute_group *st_dev_groups[];
@@ -100,6 +102,9 @@
 MODULE_PARM_DESC(max_sg_segs, "Maximum number of scatter/gather segments to use (256)");
 module_param_named(try_direct_io, try_direct_io, int, 0);
 MODULE_PARM_DESC(try_direct_io, "Try direct I/O between user buffer and tape drive (1)");
+module_param_named(debug_flag, debug_flag, int, 0);
+MODULE_PARM_DESC(debug_flag, "Enable DEBUG, same as setting debugging=1");
+
 
 /* Extra parameters for testing */
 module_param_named(try_rdio, try_rdio, int, 0);
@@ -124,6 +129,9 @@
 	},
 	{
 		"try_direct_io", &try_direct_io
+	},
+	{
+		"debug_flag", &debug_flag
 	}
 };
 #endif
@@ -4309,6 +4317,10 @@
 	printk(KERN_INFO "st: Version %s, fixed bufsize %d, s/g segs %d\n",
 		verstr, st_fixed_buffer_size, st_max_sg_segs);
 
+	debugging = (debug_flag > 0) ? debug_flag : NO_DEBUG;
+	if (debugging)
+		printk(KERN_INFO "st: Debugging enabled debug_flag = %d\n",debugging);
+
 	err = class_register(&st_sysfs_class);
 	if (err) {
 		pr_err("Unable register sysfs class for SCSI tapes\n");

----- Original Message -----
From: Kai Mäkisara (Kolumbus) <kai.makis...@kolumbus.fi>
To: Laurence Oberman <lober...@redhat.com>
Cc: Rob Evers <rev...@redhat.com>, linux-scsi@vger.kernel.org
Sent: Sunday, October 19, 2014 4:54:10 AM
Subject: Re: [PATCH]: add debug flag parameter for SCSI tape driver - 2nd request

Hello,

I am responding to this, but noticed your next, fixed version.

On 17.10.2014, at 23.20, Laurence Oberman <lober...@redhat.com> wrote:
> Hello Kai
>
> You have seen this patch before.
> The first time around, given that we don't enable DEBUG by default, I let it go.
>
> However we have been looking into defining DEBUG 1 by default here at Redhat and then setting the default to disabled.
> Are you open to considering changing the driver based on this patch.
> i.e. default DEFINE 1 and adding this code with default set to off.

Yes. I certainly think defining DEBUG 1 and changing the default to zero should be done if it is useful for supporting users. The runtime overhead is negligible and the extra code does not matter nowadays (it did matter, at least theoretically, years ago).

I am not so sure about the module option. When the debugging code is compiled in, debugging can be enabled and disabled for each device by the MTIOCTOP ioctl (e.g., mtst -f tape_device stsetoptions debug). The module option enables debugging for all tape devices. However, if you think this additional module option is useful, I am not against it. It does not remove the possibility of controlling debugging for each device for those who want to do it that way.

Anyway, you should modify the documentation (Documentation/scsi/st.txt) according to the changes.

> Note that with DEBUG 0, as you know you need to change that and recompile. That is exactly what I am trying to avoid with Enterprise customers.

I have also noticed this when someone has asked me about some tape problems.

> This patch adds a debug_flag parameter that can be set on module load, and allows the DEBUG facility without a module
Re: [PATCH] st: implement sysfs based tape statistics
Hi Shane,

I was actually about to pull this patch and test it.
Lots of changes and a big patch, so I am going to create another driver as a tape stats driver for now for testing.
Will exercise this fully and provide feedback to the list.

Regards
Laurence Oberman

On Thu, Nov 20, 2014 at 6:49 PM, Seymour, Shane M <shane.seym...@hp.com> wrote:
> I was wondering if anyone had a chance to review the patch?
>
> Comments are appreciated and I'm more than happy to make changes that will allow it to be accepted.
[PATCH qla2xxx] Race in handling rport deletion in Qlogic driver during recovery causes panic
When we have an rport disconnect we race during rport deletion and re-connection, resulting in a panic.
When we do this, we call fc_remote_port_del() just before we do the calls to re-establish the session with the FC transport with fc_remote_port_add() and then fc_remote_port_rolechg().
If we remove the call to fc_remote_port_del() before re-establishing the connection, this prevents the race.

This patch has resolved this for multiple customers via test kernels.
Suggested by Chad Dupuis, implemented and tested by Laurence Oberman.

Signed-off-by: Laurence Oberman <lober...@redhat.com>

diff -Nur a/drivers/scsi/qla2xxx/qla_init.c b/drivers/scsi/qla2xxx/qla_init.c
--- a/drivers/scsi/qla2xxx/qla_init.c	2014-10-14 18:07:48.313648535 -0400
+++ b/drivers/scsi/qla2xxx/qla_init.c	2014-11-25 09:08:17.108814261 -0500
@@ -3237,8 +3237,6 @@
 	struct fc_rport *rport;
 	unsigned long flags;
 
-	qla2x00_rport_del(fcport);
-
 	rport_ids.node_name = wwn_to_u64(fcport->node_name);
 	rport_ids.port_name = wwn_to_u64(fcport->port_name);
 	rport_ids.port_id = fcport->d_id.b.domain << 16 |

Supporting traces

qla2xxx :06:00.1: scsi(1:4:0): Abort command issued -- 1 2002.
qla2xxx :06:00.1: scsi(1:4:0): BUS RESET ISSUED.
qla2xxx :06:00.1: qla2xxx_eh_bus_reset: reset succeded
qla2xxx :06:00.1: scsi(1:4:0): Abort command issued -- 1 2002.
qla2xxx :06:00.1: scsi(1:4:0): ADAPTER RESET ISSUED.
qla2xxx :06:00.1: Performing ISP error recovery - ha= 880bd5b55000.
qla2xxx :06:00.1: FW: Loading via request-firmware...
qla2xxx :06:00.1: LOOP UP detected (4 Gbps).
qla2xxx :06:00.1: qla2xxx_eh_host_reset: reset succeded
qla2xxx :09:00.1: scsi(3:3:0): Abort command issued -- 1 2002.
qla2xxx :09:00.1: scsi(3:3:0): Abort command issued -- 1 2002.
qla2xxx :09:00.1: scsi(3:3:0): DEVICE RESET ISSUED.
qla2xxx :09:00.1: scsi(3:3:0): DEVICE RESET SUCCEEDED.
qla2xxx :06:00.1: scsi(1:4:0): Abort command issued -- 1 2002.
scsi 1:0:4:0: Device offlined - not ready after error recovery
..
..
scsi 3:0:2:0: Device offlined - not ready after error recovery
qla2xxx :06:00.1: scsi(1:8:0): Abort command issued -- 1 2002.
qla2xxx :06:00.1: scsi(1:8:0): Abort command issued -- 1 2002.
qla2xxx :06:00.1: scsi(1:8:0): DEVICE RESET ISSUED.
qla2xxx :06:00.1: scsi(1:8:0): DEVICE RESET SUCCEEDED.
qla2xxx :06:00.1: scsi(1:8:0): Abort command issued -- 1 2002.
qla2xxx :06:00.1: scsi(1:8:0): TARGET RESET ISSUED.
qla2xxx :06:00.1: scsi(1:8:0): TARGET RESET SUCCEEDED.
qla2xxx :09:00.1: scsi(3:3:0): Abort command issued -- 1 2002.
BUG: unable to handle kernel NULL pointer dereference at 0058
IP: [8134fa1b] scsi_is_host_device+0xb/0x20
PGD b80681067 PUD b833ca067 PMD 0
Oops: [#1] SMP
last sysfs file: /sys/devices/system/cpu/cpu2/cpufreq/scaling_setspeed
CPU 9
Modules linked in: nfs fscache xfs ext3 jbd ext2 iptable_mangle iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 iptable_filter ip_tables mptctl mptbase vxodm(P)(U) amf(P)(U) vxfen(P)(U) gab(P)(U) llt(P)(U) nfsd lockd nfs_acl auth_rpcgss autofs4 sunrpc dmpjbod(P)(U) dmpap(P)(U) dmpaa(P)(U) vxspec(P)(U) vxio(P)(U) vxdmp(P)(U) pcc_cpufreq bonding ipv6 vxportal(P)(U) fdd(P)(U) vxfs(P)(U) exportfs emcpvlumd(P)(U) emcpxcrypt(P)(U) emcpdm(P)(U) emcpgpx(P)(U) emcpmpx(P)(U) emcp(P)(U) dm_mirror dm_region_hash dm_log hpilo hpwdt microcode serio_raw iTCO_wdt iTCO_vendor_support i7core_edac edac_core ses enclosure sg power_meter hwmon be2net shpchp ext4 mbcache jbd2 sd_mod crc_t10dif hpsa(U) qla2xxx scsi_transport_fc scsi_tgt dm_mod [last unloaded: emcpioc]

Modules linked in: nfs fscache xfs ext3 jbd ext2 iptable_mangle iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 iptable_filter ip_tables mptctl mptbase vxodm(P)(U) amf(P)(U) vxfen(P)(U) gab(P)(U) llt(P)(U) nfsd lockd nfs_acl auth_rpcgss autofs4 sunrpc dmpjbod(P)(U) dmpap(P)(U) dmpaa(P)(U) vxspec(P)(U) vxio(P)(U) vxdmp(P)(U) pcc_cpufreq bonding ipv6 vxportal(P)(U) fdd(P)(U) vxfs(P)(U) exportfs emcpvlumd(P)(U) emcpxcrypt(P)(U) emcpdm(P)(U) emcpgpx(P)(U) emcpmpx(P)(U) emcp(P)(U) dm_mirror dm_region_hash dm_log hpilo hpwdt microcode serio_raw iTCO_wdt iTCO_vendor_support i7core_edac edac_core ses enclosure sg power_meter hwmon be2net shpchp ext4 mbcache jbd2 sd_mod crc_t10dif hpsa(U) qla2xxx scsi_transport_fc scsi_tgt dm_mod [last unloaded: emcpioc]
Pid: 641, comm: qla2xxx_3_dpc Tainted: P M 2.6.32-131.26.1.el6.x86_64 #1 ProLiant BL460c G7
RIP: 0010:[8134fa1b] [8134fa1b] scsi_is_host_device+0xb/0x20
RSP: 0018:8817d15d5c80 EFLAGS: 00010246
RAX: RBX: 880bcf094000 RCX: 5ee0
RDX: 880bd5b37850 RSI: 0297 RDI:
RBP: 8817d15d5c80 R08: 0006 R09: 880bd5b39210
R10: 8817d15d5d18 R11: R12:
R13
Re: [PATCH] st: implement sysfs based tape statistics
Hello Shane,

So far so good on the upstream kernel, built as a new driver.
I need to run some more tests to capture stats and validate the numbers; so far I have only run functional tests and read the sysfs numbers.
I then need to backport to the RHEL6 and RHEL7 kernels, as we have two BZs out there for this.
Will be working on that this coming week.
I needed to get some real tape hardware ready, so it took a while to get that staged. I started out using mhvtl and that seemed fine.

Thanks
Laurence
Re: Crash when copying from broken external hdd
I need more of the stack if you have it, the screenshot is not attached.

Thanks
Laurence

On Sun, Nov 30, 2014 at 6:11 AM, Richard Weinberger <richard.weinber...@gmail.com> wrote:
> On Sat, Nov 29, 2014 at 11:52 AM, Simon Danner <danner.si...@gmail.com> wrote:
>> Hello,
>>
>> i get the following crash after i try to copy files from a broken
>> external hdd to another external hdd. It happens after a few minutes,
>> with latest git and 3.17.4 from Arch.
>> Attached screenshot is from latest mainline git.
>>
>> i hope this can be fixed somehow,
>>
>> Regards
>> Simon Danner
>
> Can you decode scsi_requed_end+0x122?
>
> CC'ing block and scsi folks.
>
> --
> Thanks,
> //richard
Re: Crash when copying from broken external hdd
The BUG_ON is taken because blk_queued_rq(req) returns true, which means request->queuelist is not empty, i.e. the request is still linked on a queue list at the point where it is being finished.

Can I get the messages file entries (the last 100 lines) from just prior to the panic, not the whole file?
If it's large, attach a .gz and email it to me.

Thanks

On Sun, Nov 30, 2014 at 11:42 AM, Richard Weinberger <rich...@nod.at> wrote:
> On 30.11.2014 at 17:36, Simon Danner wrote:
>> Hi,
>>
>> here the two screenshots i could take, from 3.17.4 and 3.18 git.
>
> You're hitting BUG_ON(blk_queued_rq(req)); in blk_finish_request()
>
> Thanks,
> //richard
>
>> Thanks
>> Simon
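What the check tests can be shown with a small user-space model. This is a sketch under assumptions: `struct list_head` and the list helpers below are simplified stand-ins for the kernel's, `struct request` is reduced to the one field that matters here, and blk_queued_rq() is assumed to be `!list_empty(&rq->queuelist)`, as it is defined in the block layer headers of that era.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-in for the kernel's circular doubly linked list. */
struct list_head {
	struct list_head *next, *prev;
};

static void init_list_head(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

/* A node that points at itself is on no list. */
static bool list_empty(const struct list_head *h)
{
	return h->next == h;
}

static void list_add_tail(struct list_head *n, struct list_head *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Unlink the entry and leave it pointing at itself again. */
static void list_del_init(struct list_head *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
	init_list_head(e);
}

/* Reduced model of a block request: only the queue linkage. */
struct request {
	struct list_head queuelist;
};

/* True while the request is still linked on some queue list. */
static bool blk_queued_rq(const struct request *rq)
{
	return !list_empty(&rq->queuelist);
}
```

In this model BUG_ON(blk_queued_rq(req)) fires for a request that was added with list_add_tail() and never unlinked, and stays quiet once list_del_init() has run, which matches a request being completed while something still holds it on a list.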
Re: Drivers: scsi: FLUSH timeout
I am thinking Srini meant in the sd_mod driver module:

#define SD_FLUSH_TIMEOUT (60 * HZ)

Laurence

On Fri, Sep 20, 2013 at 4:32 PM, Greg KH <gre...@linuxfoundation.org> wrote:
> On Fri, Sep 20, 2013 at 12:32:27PM -0700, K. Y. Srinivasan wrote:
>> The SD_FLUSH_TIMEOUT value is currently hardcoded.
>
> Hardcoded where? Please, more context.
>
>> On our cloud, we sometimes hit this timeout. I was wondering if we
>> could make this a module parameter. If this is acceptable, I can send
>> you a patch for this.
>
> A module parameter doesn't make sense for a per-device value, does it?
>
> greg k-h
Re: Persistent reservation behaviour/compliance with redundant controllers
I reached out to a contact at HP and he shared this with me. Not sure if it's helpful.

3PAR does something different based on the host OS mode or Persona that is set for the host OS type being used as to how we respond with these commands. The main aspects of this question derive from how an active/passive controller model would work; however, because on 3PAR all controllers or nodes are equal, all paths are active.

The 3Par implementation of S2R and S3PGR is intended to comply with SPC-3. The scope of reservations is limited to a full logical unit, element scope is not supported.

SCSI-3 reservations allow each host/array path to have a key registered against it. Typically a host will register the same key upon all of the paths it sees to the array and each host will have its own unique key. Access to the volume can then be restricted to those hosts who have registered keys. Should a host be determined to have gone rogue, its key can be revoked by any of the still active hosts, causing the rogue host to lose access to the volume. They need to register the same key to all paths of the same lun. Once the host has taken appropriate action to become healthy again it can register a new key and regain access.

For 3PAR use the showrsv command to view things from the 3PAR array:

showrsv - Show information about scsi reservations of virtual volumes (VVs).

SYNTAX
    showrsv [options arg] [VV_name]

DESCRIPTION
    The showrsv command displays SCSI reservation and registration information for VLUNs bound for a specified port.

AUTHORITY
    Any role in the system

OPTIONS
    -l scsi3|scsi2

On Jan 6, 2014, at 6:35 PM, Matthias Eble <psychotr...@gmail.com> wrote:
> 2014/1/7 James Bottomley <james.bottom...@hansenpartnership.com>:
>> On Mon, 2014-01-06 at 23:53 +0100, Matthias Eble wrote:
>>> Can sdg and sdl be the same I_T_Nexus at a time? Right now, they are handled like that. In my understanding, every scsi disk device represents an I_T_Nexus.
>>
>> No, every SCSI disk is an I_T_L nexus. There's no actual device object in Linux for an I_T nexus.
>
> So, PR registrations are made for an I_T nexus using an I_T_L nexus. Probably my previous systems had a 1:1 relation between I_T and I_T_L.
> Is there a way to identify which I_T_L nexuses belong to the same I_T nexus?
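One possible way to approach Matthias's last question from user space is the host:channel:target:lun (H:C:T:L) address that Linux assigns to each SCSI device: two devices that differ only in the LUN sit under the same target on the same host. The sketch below is only an approximation of "same I_T nexus". It assumes each SCSI host corresponds to one initiator port, and the names struct scsi_addr, parse_hctl() and same_it_nexus() are made up for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Parsed form of a Linux SCSI address such as "6:0:1:3" (H:C:T:L). */
struct scsi_addr {
	unsigned int host, channel, target;
	unsigned long long lun;
};

/* Parse "H:C:T:L"; returns 0 on success, -1 if the string is malformed. */
int parse_hctl(const char *s, struct scsi_addr *a)
{
	assert(s != NULL && a != NULL);
	return sscanf(s, "%u:%u:%u:%llu",
		      &a->host, &a->channel, &a->target, &a->lun) == 4 ? 0 : -1;
}

/*
 * Two I_T_L nexuses are treated as sharing an I_T nexus when host,
 * channel and target match; only the LUN may differ.
 */
int same_it_nexus(const struct scsi_addr *a, const struct scsi_addr *b)
{
	return a->host == b->host &&
	       a->channel == b->channel &&
	       a->target == b->target;
}
```

Under this assumption, 6:0:1:3 and 6:0:1:7 would share an I_T nexus, while 7:0:1:3 (a different host, hence a different initiator port) would not.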
scsi_debug driver puzzle
Hello

I have what to me is a puzzle, but is likely a stupid question, about the queuecommand interface in the scsi_debug driver.

I see the host template set for scsi_debug_queuecommand, but in the driver we have the function declared as int scsi_debug_queuecommand_lck.

So how is this working?

Egrep pattern: scsi_debug_queuecommand

  File          Line
0 scsi_debug.c  3551 int scsi_debug_queuecommand_lck(struct scsi_cmnd *SCpnt, done_funct_t done)
1 scsi_debug.c  3900 static DEF_SCSI_QCMD(scsi_debug_queuecommand)
2 scsi_debug.c  3912 .queuecommand = scsi_debug_queuecommand,

Thanks
Laurence
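The missing link is line 3900: DEF_SCSI_QCMD (from include/scsi/scsi_host.h in kernels of this era) is a macro that uses token pasting to define scsi_debug_queuecommand() as a wrapper which takes the host lock and then calls scsi_debug_queuecommand_lck(). The following is a simplified userspace imitation, not the kernel macro itself: DEMO_SCSI_QCMD, the demo_* names and the mock structs are hypothetical, and the spinlock is replaced by plain counters.

```c
#include <assert.h>
#include <stddef.h>

/* Mock stand-ins for the kernel types; only what the sketch needs. */
struct scsi_cmnd;
typedef void (*done_funct_t)(struct scsi_cmnd *);

struct scsi_cmnd {
	done_funct_t scsi_done;
	int result;
};

struct Scsi_Host {
	int lock_depth; /* stands in for the host_lock spinlock */
	int lock_takes; /* how many times the "lock" was taken */
};

/*
 * Imitation of DEF_SCSI_QCMD: the ## operator pastes "_lck" onto the
 * name, so the macro defines func_name() as a locked wrapper around
 * func_name_lck().
 */
#define DEMO_SCSI_QCMD(func_name)                                        \
	int func_name(struct Scsi_Host *shost, struct scsi_cmnd *cmd)    \
	{                                                                \
		int rc;                                                  \
		shost->lock_depth++;   /* spin_lock_irqsave() */         \
		shost->lock_takes++;                                     \
		rc = func_name##_lck(cmd, cmd->scsi_done);               \
		shost->lock_depth--;   /* spin_unlock_irqrestore() */    \
		return rc;                                               \
	}

/* Completion callback: marks the command as finished successfully. */
static void demo_done(struct scsi_cmnd *cmd)
{
	cmd->result = 0;
}

/* The "_lck" variant is the only one written out by hand. */
int demo_queuecommand_lck(struct scsi_cmnd *cmd, done_funct_t done)
{
	assert(cmd != NULL && done != NULL);
	done(cmd);
	return 0;
}

/* Expands to: static int demo_queuecommand(shost, cmd) { ... } */
static DEMO_SCSI_QCMD(demo_queuecommand)

/* Mock host template pointing at the generated wrapper. */
struct demo_host_template {
	int (*queuecommand)(struct Scsi_Host *, struct scsi_cmnd *);
};

static struct demo_host_template demo_template = {
	.queuecommand = demo_queuecommand,
};
```

So an egrep for the plain name finds no hand-written definition: the function the template points at only exists after the preprocessor has expanded the macro.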
Re: [PATCH] lpfc: correct device removal deadlock after link bounce
This patch was tested in house at Red Hat and is running in test kernels at a couple of Red Hat customers.
James, thanks for sending it upstream.

Laurence

On Tue, Dec 30, 2014 at 12:08 PM, James Smart <james.sm...@emulex.com> wrote:
> This patch, applicable to 8G/4G/2G adapters, adds a call that resumes transmit operations after a link bounce. Without it, targets that tried to suspend exchanges after a link bounce (such as tape devices using sequence level error recovery) would never resume io operation, causing scan failures, and eventually deadlocks if a device removal request is made.
>
> The patches were cut against Christoph's scsi-queue.git, branch drivers-for-3.18. The driver rev cut against is 10.4.8000.0
>
> -- james s
>
> Signed-off-by: James Smart <james.sm...@emulex.com>
> Signed-off-by: Dick Kennedy <dick.kenn...@emulex.com>
>
> ---
>
>  lpfc_els.c |    9 +++++++++
>  1 file changed, 9 insertions(+)
>
> diff -upNr a/drivers/scsi/lpfc/lpfc_els.c b/drivers/scsi/lpfc/lpfc_els.c
> --- a/drivers/scsi/lpfc/lpfc_els.c	2014-12-29 12:48:08.0 -0500
> +++ b/drivers/scsi/lpfc/lpfc_els.c	2014-12-30 11:23:04.344426606 -0500
> @@ -2225,6 +2225,15 @@ lpfc_adisc_done(struct lpfc_vport *vport
>  	if ((phba->sli3_options & LPFC_SLI3_NPIV_ENABLED) &&
>  	    !(vport->fc_flag & FC_RSCN_MODE) &&
>  	    (phba->sli_rev < LPFC_SLI_REV4)) {
> +		/* The ADISCs are complete. Doesn't matter if they
> +		 * succeeded or failed because the ADISC completion
> +		 * routine guarantees to call the state machine and
> +		 * the RPI is either unregistered (failed ADISC response)
> +		 * or the RPI is still valid and the node is marked
> +		 * mapped for a target. The exchanges should be in the
> +		 * correct state. This code is specific to SLI3.
> +		 */
> +		lpfc_issue_clear_la(phba, vport);
>  		lpfc_issue_reg_vpi(phba, vport);
>  		return;
>  	}
Re: [PATCH] st: implement sysfs based tape statistics v2
I pulled this this morning and will be testing. The prior version was stable for me on the upstream and RHEL 6.5 kernel without exhaustive testing.

We also just received more requests to get this into RHEL from HP / Red Hat customers.

Kai, what are your thoughts? I realize this is a large amount of additional code. I am not keen to create a driver just for stats, as we would have to keep the rest of the st driver changes always in sync.

Thanks
Laurence

On Mon, Jan 12, 2015 at 10:43 PM, Seymour, Shane M <shane.seym...@hp.com> wrote:
> Some small changes since the last version - this version removes two files from sysfs compared to the last version (read and write block counts since they're derived from the byte counts they can be calculated in user space) but that's the only change. This version has been rebased to 3.19.0-rc3-next-20150108.
>
> Since posting the last version an email was received by Kai and myself from an AT&T employee who has a need for tape statistics to be implemented (who gave permission to quote their email), I've included part of the email here:
>
> There are over 20,000 tape devices managed by our operations group zoned to tens of thousands of servers. My challenge is that I cannot provide operations a solution that gives them visibility into the tape drive performance metrics when that platform is linux. Our legacy platforms (Solaris,HPUX,AIX) provide facilities to use iostat and sar to determine the write speed of the tape drives. We took for granted that this would be available in linux and its absence has been very troublesome. Because operations cannot measure tape drive performance in this way they cannot easily determine when there is a tape drive performance problem and whether the change improved or worsened the performance problem.
>
> ...
>
> I have followed the debate https://lkml.org/lkml/2013/3/20/696 and from a service provider perspective we would expect the same maturity and functionality that we have from the traditional unix platform in regards to iostat/sar. This issue is important and urgent because tape drive performance issues are common and I am unable to provide standards and processes to identify and remediate these issues.
>
> Another HP customer has also requested the same functionality (but hasn't given permission to be named), they own tape drives numbering in the 1000s and also need the ability to investigate performance issues.
>
> Signed-off-by: shane.seym...@hp.com
> Tested-by: shane.seym...@hp.com
> ---
> diff -uprN a/drivers/scsi/st.c b/drivers/scsi/st.c
> --- a/drivers/scsi/st.c	2015-01-11 14:46:00.243814755 -0600
> +++ b/drivers/scsi/st.c	2015-01-12 13:54:52.549117333 -0600
> @@ -20,6 +20,7 @@
>  static const char *verstr = "20101219";
>
>  #include <linux/module.h>
>
> +#include <linux/kobject.h>
>  #include <linux/fs.h>
>  #include <linux/kernel.h>
> @@ -226,6 +227,20 @@ static DEFINE_SPINLOCK(st_index_lock);
>  static DEFINE_SPINLOCK(st_use_lock);
>  static DEFINE_IDR(st_index_idr);
>
> +static inline void st_stats_remove_files(struct scsi_tape *);
> +static inline void st_stats_create_files(struct scsi_tape *);
> +static ssize_t st_tape_attr_show(struct kobject *, struct attribute *, char *);
> +static ssize_t st_tape_attr_store(struct kobject *, struct attribute *,
> +	const char *, size_t);
> +static void st_release_stats_kobj(struct kobject *);
> +static const struct sysfs_ops st_stats_sysfs_ops = {
> +	.show = st_tape_attr_show,
> +	.store = st_tape_attr_store,
> +};
> +static struct kobj_type st_stats_ktype = {
> +	.release = st_release_stats_kobj,
> +	.sysfs_ops = &st_stats_sysfs_ops,
> +};
>
>  #include "osst_detect.h"
> @@ -476,10 +491,22 @@ static void st_scsi_execute_end(struct r
>  	struct st_request *SRpnt = req->end_io_data;
>  	struct scsi_tape *STp = SRpnt->stp;
>  	struct bio *tmp;
> +	u64 ticks;
>
>  	STp->buffer->cmdstat.midlevel_result = SRpnt->result = req->errors;
>  	STp->buffer->cmdstat.residual = req->resid_len;
>
> +	if (STp->stats != NULL) {
> +		ticks = get_jiffies_64();
> +		STp->stats->in_flight--;
> +		ticks -= STp->stats->stamp;
> +		STp->stats->io_ticks += ticks;
> +		if (req->cmd[0] == WRITE_6)
> +			STp->stats->write_ticks += ticks;
> +		else if (req->cmd[0] == READ_6)
> +			STp->stats->read_ticks += ticks;
> +	}
> +
>  	tmp = SRpnt->bio;
>  	if (SRpnt->waiting)
>  		complete(SRpnt->waiting);
> @@ -496,6 +523,7 @@ static int st_scsi_execute(struct st_req
>  	struct rq_map_data *mdata = &SRpnt->stp->buffer->map_data;
>  	int err = 0;
>  	int write = (data_direction == DMA_TO_DEVICE);
> +	struct scsi_tape *STp = SRpnt->stp;
>
>  	req = blk_get_request(SRpnt->stp->device->request_queue, write,
>  			      GFP_KERNEL);
> @@ -516,6 +544,20 @@ static int st_scsi_execute(struct st_req
>  		}
>  	}
>
> +
Re: [PATCH] st: implement sysfs based tape statistics v2
----- Original Message -----
From: Kai Mäkisara (Kolumbus) <kai.makis...@kolumbus.fi>
To: Laurence Oberman <oberma...@gmail.com>
Cc: Shane M Seymour <shane.seym...@hp.com>, lober...@redhat.com, linux-scsi@vger.kernel.org, James E.J. Bottomley (jbottom...@parallels.com) <jbottom...@parallels.com>, je...@suse.com
Sent: Thursday, February 5, 2015 12:03:29 PM
Subject: Re: [PATCH] st: implement sysfs based tape statistics v2

On 2.2.2015, at 17.16, Laurence Oberman <oberma...@gmail.com> wrote:
> I pulled this this morning and will be testing. The prior version was stable for me on the upstream and RHEL 6.5 kernel without exhaustive testing.
>
> We also just received more requests to get this into RHEL from HP / Red Hat customers.
>
> Kai, what are your thoughts. I realize this is a large amount of additional code. I am not keen to create a driver just for stats as we would have to keep the rest of the st driver changes always in sync.

I still think that the tape statistics should be exported like the statistics of “real” block devices, i.e., one sysfs file exporting on a single line the statistics that temporally belong together. James rejected this approach. I am leaving the decision about this code to him. I will neither ack nor nak this code.

Thanks,
Kai

Hello Kai,

I missed the earlier conversations with James, I will go search for them.
Do you mean add them so they are similar to /proc/diskstats?

cat /proc/diskstats
..
   8 0 sda 2258346 152801 291907067 5263795 388817 1518048 15013833 4542062 0 4794931 9803495
   8 1 sda1 717 102 26154 1179 8 2 80 76 0 1172 1254
   8 2 sda2 328 31 2872 1554 0 0 0 0 0 1554 1554
   8 3 sda3 2195205 151617 290898283 5203627 355053 1518046 15009528 4370598 0 4594137 9571937
   8 4 sda4 61921 1050 978350 57218 18 0 4225 34 0 56384 57185
  11 0 sr0 0 0 0 0 0 0 0 0 0 0 0
..

Laurence Oberman
Red Hat Global Support Service
SEG Team
Re: [RFC] implementing tape statistics single file vs multi-file in sysfs
Hello

It's not going to be tens of thousands of devices. That count was an aggregate based on 1000's of servers. In reality it's unlikely to ever be more than 100 tape drives per individual Linux kernel instance.

Therefore sysfs will be a valid way to do this and make the data available to user space.

Thanks
Laurence

On Feb 6, 2015, at 11:07 PM, Greg KH <gre...@linuxfoundation.org> wrote:
> On Fri, Feb 06, 2015 at 03:41:58PM +, Bryn M. Reeves wrote:
>> On Fri, Feb 06, 2015 at 04:59:16AM -0800, Greg KH wrote:
>>> On Fri, Feb 06, 2015 at 12:20:53AM +, Seymour, Shane M wrote:
>>>> The current patch that implements tape statistics is here:
>>>>
>>>> http://marc.info/?l=linux-scsi&m=142112067313723&w=2
>>>
>>> Aside from the "do we want to do this all in a single file" issue that I will say more on below, this patch has issues. Please don't use a kobject for _ANYTHING_ in sysfs that has a struct device as a parent. If you do that, it can't be seen by userspace tools very well, if at all.
>>
>> I can't speak for Shane but wouldn't spend too much time looking at the current v2 patch: it's the result of a pretty ugly compromise suggested on linux-scsi.
>
> Fair enough, but please feel free to cc: me on the patch that you do feel is correct to get a sysfs-related review.
>
>>>> Recently there was another discussion here about one file vs a collection of files for tape statistics:
>>>>
>>>> http://marc.info/?l=linux-scsi&m=142316255501550&w=2
>>>>
>>>> The result was that I should ask here what method I should use. I would like to get feedback in relation to tape statistics and one file vs multi-file in sysfs. I'm happy to keep the existing code or change to a single file approach.
>>>
>>> One of the primary reasons we created sysfs and the one value per file rule is that multi-value files just do not work well. Yes, you get an atomic snapshot, and you save some open/read/close syscall roundtrips, but you do so at the expense of forcing userspace to know what the format of the file is. And once you create it, you can NEVER CHANGE IT AGAIN.
>>
>> I am not convinced this is a concern for tape statistics: they are pretty much a solved problem. The commercial *nixes have had this for decades.
>>
>> Likewise for disk stats: although fluff like maj:min/name etc. has been shuffled a few times the basic fields have remained unchanged for a very long time and sysfs already removes the need to include an identity field.
>
> We already handle i/o stats just fine, why create a special sysfs interface for just a tape device interface? What makes them so special?
>
>>> Yes, that's right, if you come up with some new statistic in the future, or realize that one of the ones you have now is wrong, you can't change it, you have to make a whole new file, otherwise you could break userspace tools.
>>
>> I understand the fact that you can't change them; I just don't think it's a big problem in this specific case (and much less than some of the more imaginative sysfs content - 2d int arrays with column headers anyone?).
>
> What sysfs file is a 2d int array? I'll be glad to fix it.
>
> Also, everyone doesn't think their solution will ever need to be changed. Until later when it does :)
>
>>> And yes, open/read/close does take a few extra cycles, but you can't really measure it for a virtual filesystem like this on any modern system.
>>
>> I'll try to get some numbers when I get back home next week - Shane is talking about use cases involving tens of thousands of tape devices. I am not certain that the overhead would be unmeasurable in that case: the additional context switching and TLB flushes alone seem like they would add up.
>
> If you want to measure tens of thousands of tape devices then you shouldn't be using sysfs in the first place as it is not designed for speed at all. Use the existing i/o rate interfaces instead, don't try to cram something into sysfs that doesn't belong there.
>
> thanks,
>
> greg k-h
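The one-value-per-file rule Greg describes means a userspace reader does one open/read/close per statistic, but never has to know a multi-field layout. A minimal userspace sketch of such a reader follows; read_attr_u64() is a hypothetical helper name and no particular st sysfs path is implied.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Read a single unsigned decimal value from a one-value-per-file
 * attribute. Returns 0 on success, -1 if the file cannot be opened
 * or does not start with a number.
 */
int read_attr_u64(const char *path, unsigned long long *val)
{
	FILE *f;
	int n;

	assert(path != NULL && val != NULL);
	f = fopen(path, "r");
	if (f == NULL)
		return -1;
	n = fscanf(f, "%llu", val);
	fclose(f);
	return n == 1 ? 0 : -1;
}
```

The cost being debated in the thread is visible here: every attribute read is a separate open/read/close cycle, and values read from different files are not an atomic snapshot.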
Re: [PATCH] st: implement sysfs based tape statistics v2
----- Original Message -----
From: Kai Mäkisara (Kolumbus) <kai.makis...@kolumbus.fi>
To: Laurence Oberman <lober...@redhat.com>
Cc: Laurence Oberman <oberma...@gmail.com>, Shane M Seymour <shane.seym...@hp.com>, linux-scsi@vger.kernel.org, James E.J. Bottomley (jbottom...@parallels.com) <jbottom...@parallels.com>, je...@suse.com
Sent: Thursday, February 5, 2015 12:46:32 PM
Subject: Re: [PATCH] st: implement sysfs based tape statistics v2

On 5.2.2015, at 19.40, Laurence Oberman <lober...@redhat.com> wrote:
> ----- Original Message -----
> From: Kai Mäkisara (Kolumbus) <kai.makis...@kolumbus.fi>
> To: Laurence Oberman <oberma...@gmail.com>
> Cc: Shane M Seymour <shane.seym...@hp.com>, lober...@redhat.com, linux-scsi@vger.kernel.org, James E.J. Bottomley (jbottom...@parallels.com) <jbottom...@parallels.com>, je...@suse.com
> Sent: Thursday, February 5, 2015 12:03:29 PM
> Subject: Re: [PATCH] st: implement sysfs based tape statistics v2
>
> On 2.2.2015, at 17.16, Laurence Oberman <oberma...@gmail.com> wrote:
>> I pulled this this morning and will be testing. The prior version was stable for me on the upstream and RHEL 6.5 kernel without exhaustive testing.
>>
>> We also just received more requests to get this into RHEL from HP / Red Hat customers.
>>
>> Kai, what are your thoughts. I realize this is a large amount of additional code. I am not keen to create a driver just for stats as we would have to keep the rest of the st driver changes always in sync.
>
> I still think that the tape statistics should be exported like the statistics of “real” block devices, i.e., one sysfs file exporting on a single line the statistics that temporally belong together. James rejected this approach. I am leaving the decision about this code to him. I will neither ack nor nak this code.
>
> Thanks,
> Kai
>
> Hello Kai,
>
> I missed the earlier conversations with James, I will go search for them.
> Do you mean add them so they are similar to the /proc/diskstats
>
> cat /proc/diskstats
> ..
>    8 0 sda 2258346 152801 291907067 5263795 388817 1518048 15013833 4542062 0 4794931 9803495
>    8 1 sda1 717 102 26154 1179 8 2 80 76 0 1172 1254
>    8 2 sda2 328 31 2872 1554 0 0 0 0 0 1554 1554
>    8 3 sda3 2195205 151617 290898283 5203627 355053 1518046 15009528 4370598 0 4594137 9571937
>    8 4 sda4 61921 1050 978350 57218 18 0 4225 34 0 56384 57185
>   11 0 sr0 0 0 0 0 0 0 0 0 0 0 0
> ..

Not exactly. I mean the data exported in sysfs, for example:

cat /sys/block/sda/sda1/stat
159740 9006 594150664461 12472455907 12772208 3598677 0 299875 3663235

Kai

Ok, thanks, got it now. Let me circle back with Shane.

Laurence Oberman
Red Hat Global Support Service
SEG Team
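For reference, a block device stat file of the kind Kai points to is a single line of counters. In kernels of this era it carries 11 fields: read I/Os, read merges, read sectors, read ticks, write I/Os, write merges, write sectors, write ticks, in_flight, io_ticks and time_in_queue. The sketch below parses such a line in user space; struct blk_stat and parse_blk_stat() are illustrative names, and the input used to exercise it should be a made-up line rather than the sample quoted above.

```c
#include <assert.h>
#include <stdio.h>

/* The 11 counters of a /sys/block/<dev>/stat line, in file order. */
struct blk_stat {
	unsigned long long read_ios, read_merges, read_sectors, read_ticks;
	unsigned long long write_ios, write_merges, write_sectors, write_ticks;
	unsigned long long in_flight, io_ticks, time_in_queue;
};

/* Parse one stat line; returns 0 on success, -1 if fields are missing. */
int parse_blk_stat(const char *line, struct blk_stat *st)
{
	int n;

	assert(line != NULL && st != NULL);
	n = sscanf(line,
		   "%llu %llu %llu %llu %llu %llu %llu %llu %llu %llu %llu",
		   &st->read_ios, &st->read_merges,
		   &st->read_sectors, &st->read_ticks,
		   &st->write_ios, &st->write_merges,
		   &st->write_sectors, &st->write_ticks,
		   &st->in_flight, &st->io_ticks, &st->time_in_queue);
	return n == 11 ? 0 : -1;
}
```

This is the trade-off from the single-file side: one read yields a consistent set of counters, but the parser has to hard-code the field order.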
[PATCH ] tcm_qla2xxx Add SCSI command jammer/discard capabilty to the tcm_qla2xxx module
Hello

I use target LIO for all my storage array test targets and customer problem resolution here at Red Hat. This patch resulted from a requirement to mimic behaviour of an expensive hardware jammer for a customer. I have used this for some time with good success to simulate and reproduce latency and slow drain fabric issues and for testing and validating error handling behaviour in the Emulex, Qlogic and other F/C drivers.

Works by checking jammer_flag==1 and host # and discards SCSI command, controlled using echo to sys parameter.

I decided to share the patch, in the hope it may be useful for others but I do understand this is a special use case. If this is useful and Nab wants to include it I will create a proper documentation patch as well.

filename:       /lib/modules/3.17.7-200.jammer.fc20.x86_64/kernel/drivers/scsi/qla2xxx/tcm_qla2xxx.ko
license:        GPL
description:    TCM QLA2XXX series NPIV enabled fabric driver
depends:        target_core_mod,qla2xxx,scsi_transport_fc
intree:         Y
vermagic:       3.17.7-200.jammer.fc20.x86_64 SMP mod_unload
parm:           jammer_flag:Set to 1: Enable jammer (int)
parm:           host_flag:host number to match on (int)

Enable host 6 to be jammed
echo 6 > /sys/module/tcm_qla2xxx/parameters/host_flag

Usage example script:

#!/bin/bash
host=`cat /sys/module/tcm_qla2xxx/parameters/host_flag`
sleep_time=120 ### Time to jam for
echo "We start to discard commands on SCSI host $host"
logger "Jammer started"
echo 1 > /sys/module/tcm_qla2xxx/parameters/jammer_flag
sleep $sleep_time
echo 0 > /sys/module/tcm_qla2xxx/parameters/jammer_flag
echo "We stopped the jammer"
logger "Jammer stopped"

This Patch diff against 3.19.1

Tested by: Laurence Oberman <lober...@redhat.com>
Signed-off-by: Laurence Oberman <lober...@redhat.com>

diff -Nurp a/drivers/scsi/qla2xxx/tcm_qla2xxx.c b/drivers/scsi/qla2xxx/tcm_qla2xxx.c
--- a/drivers/scsi/qla2xxx/tcm_qla2xxx.c	2015-03-07 18:35:15.246737589 -0500
+++ b/drivers/scsi/qla2xxx/tcm_qla2xxx.c	2015-03-07 18:35:40.168599630 -0500
@@ -50,6 +50,14 @@
 #include "qla_target.h"
 #include "tcm_qla2xxx.h"

+int message_flag=0;
+int jammer_flag = 0;
+module_param(jammer_flag, int,0644);
+MODULE_PARM_DESC(jammer_flag, "If set to 1: Enable jammer");
+int host_flag=0;
+module_param(host_flag, int,0644);
+MODULE_PARM_DESC(host_flag, "host number to match on");
+
 static struct workqueue_struct *tcm_qla2xxx_free_wq;
 static struct workqueue_struct *tcm_qla2xxx_cmd_wq;

@@ -570,6 +578,22 @@ static int tcm_qla2xxx_handle_cmd(scsi_q
 		pr_err("Unable to locate active struct se_session\n");
 		return -EINVAL;
 	}
+
+	// Control messaging here
+	message_flag += jammer_flag;
+	if(message_flag == 1)
+		printk("tcm_qla2xx:SCSI Jammer enabled on host %d\n",host_flag);
+	if((jammer_flag == 0) && (message_flag >=0)) {
+		printk("tcm_qla2xx:SCSI Jammer stopped, %d SCSI commands discarded for host %d\n",message_flag,host_flag);
+		message_flag=-1;
+	}
+
+	if ((vha->host_no == host_flag) && (jammer_flag == 1))
+	{
+		// return, and don't run target_submit_cmd, effectively discarding command
+		return 0;
+	}
+
 	return target_submit_cmd(se_cmd, se_sess, cdb, &cmd->sense_buffer[0],
 				cmd->unpacked_lun, data_length, fcp_task_attr,
@@ -2165,6 +2189,7 @@ static void tcm_qla2xxx_deregister_confi
 static int __init tcm_qla2xxx_init(void)
 {
 	int ret;
+	jammer_flag = 0;

 	ret = tcm_qla2xxx_register_configfs();
 	if (ret < 0)
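The discard decision in this first version of the patch boils down to one predicate on two module parameters. It can be exercised in user space with the small sketch below. This is not driver code: struct jam_cfg and jam_should_discard() are hypothetical names that only mirror jammer_flag and host_flag from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace mirror of the two module parameters in the patch. */
struct jam_cfg {
	int jammer_flag; /* 1 enables the jammer */
	int host_flag;   /* SCSI host number to match on */
};

/*
 * Returns 1 when a command arriving on host_no should be discarded,
 * 0 when it should be submitted as usual.
 */
int jam_should_discard(const struct jam_cfg *cfg, int host_no)
{
	assert(cfg != NULL);
	return cfg->jammer_flag == 1 && host_no == cfg->host_flag;
}
```

Commands on any other host, or on the selected host while jammer_flag is 0, go down the normal submit path.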
Re: [PATCH ] qla2xxx Add SCSI command jammer/discard capabilty to the qla2xxx target module - revision3
Hello Bart, Quinn Tran

Thanks for the feedback.

Revision3: moved the discard to the __qlt_do_work code to prevent the memory leak; this cleans up the allocations.

I will look at seeing how best this can be done for the other transports, or in the core, but for me the most useful case has been F/C. I wanted to get feedback so far, and suggest that we should start with this as the initial jamming patch as it's the least risky change for now. I did test this and ran the same set of tests I normally use this error injection for and it looks good.

Patch notes
---
I use target LIO for all my storage array test targets and customer problem resolution here at Red Hat. This patch resulted from a requirement to mimic behaviour of an expensive hardware jammer for a customer. I have used this for some time with good success to simulate and reproduce latency and slow drain fabric issues and for testing and validating error handling behaviour in the Emulex, Qlogic and other F/C drivers.

While the jammer is enabled SCSI commands are discarded for the selected host and this allows all the multipath error recovery and other LLD driver error recovery and timeout code to be debugged and tested.

Works by checking the new parameter jam_host: if it is >= 0 and matches vha->host_no, commands for that host are discarded. If the parameter is set to -1 (default), no jamming is enabled.

I decided to share the patch, in the hope it may be useful for others but I do understand this is a special use case.

Tested by: Laurence Oberman <lober...@redhat.com>
Signed-off-by: Laurence Oberman <lober...@redhat.com>

diff -Nurp a/Documentation/scsi/qla2xxx.txt b/Documentation/scsi/qla2xxx.txt
--- a/Documentation/scsi/qla2xxx.txt	1969-12-31 19:00:00.0 -0500
+++ b/Documentation/scsi/qla2xxx.txt	2015-03-12 21:42:49.828788582 -0400
@@ -0,0 +1,34 @@
+qla2xxx target mode parameters
+--
+parm: qlini_mode:Determines when initiator mode will be enabled. Possible values: exclusive - initiator mode will be enabled on load, disabled on enabling target mode and then on disabling target mode enabled back; disabled - initiator mode will never be enabled; enabled (default) - initiator mode will always stay enabled. (charp)
+
+Enables qla2xxx target mode by setting to disabled on module load
+
+There is now a new module parameter added to the qla2xxx module
+parm: jam_host:Host to jam >=0 Enable jammer (int)
+
+Use this parameter to control the discarding of SCSI commands to a selected host.
+This may be useful for testing error handling and simulating slow drain and other
+fabric issues.
+
+Any value >=0 that matches a fc_host # will discard the commands for that host.
+Reset back to -1 to stop the jamming.
+
+Enable host 6 to be jammed
+echo 6 > /sys/module/qla2xxx/parameters/jam_host
+
+Disable jamming on host 6
+echo -1 > /sys/module/qla2xxx/parameters/jam_host
+
+Usage example script:
+
+#!/bin/bash
+sleep_time=120 ### Time to jam for
+echo 6 > /sys/module/qla2xxx/parameters/jam_host
+host=`cat /sys/module/qla2xxx/parameters/jam_host`
+echo "We start to discard commands on SCSI host $host"
+logger "Jammer started"
+sleep $sleep_time
+echo -1 > /sys/module/qla2xxx/parameters/jam_host
+echo "We stopped the jammer"
+logger "Jammer stopped"
diff -Nurp a/drivers/scsi/qla2xxx/qla_target.c b/drivers/scsi/qla2xxx/qla_target.c
--- a/drivers/scsi/qla2xxx/qla_target.c	2015-03-12 21:44:04.691314527 -0400
+++ b/drivers/scsi/qla2xxx/qla_target.c	2015-03-12 21:52:27.551557133 -0400
@@ -59,6 +59,11 @@ MODULE_PARM_DESC(qlini_mode,

 int ql2x_ini_mode = QLA2XXX_INI_MODE_EXCLUSIVE;

+int jam_host = -1;
+module_param(jam_host, int, 0644);
+MODULE_PARM_DESC(jam_host, "Host to jam >=0 Enable jammer");
+
+
 static int temp_sam_status = SAM_STAT_BUSY;

 /*
@@ -3264,6 +3269,11 @@ static void __qlt_do_work(struct qla_tgt
 	cmd->cmd_flags |= BIT_1;
 	if (tgt->tgt_stop)
 		goto out_term;
+	/*
+	 * If jam_host >=0, goto out_term discarding command for matching host
+	 */
+	if (unlikely(vha->host_no == jam_host))
+		goto out_term;

 	cdb = &atio->u.isp24.fcp_cmnd.cdb[0];
 	cmd->tag = atio->u.isp24.exchange_addr;

Laurence Oberman
Red Hat Global Support Service
SEG Team

----- Original Message -----
From: Bart Van Assche <bart.vanass...@sandisk.com>
To: Laurence Oberman <lober...@redhat.com>
Cc: Andy Grover <agro...@redhat.com>, linux-scsi@vger.kernel.org, n...@daterainc.com, Laurence Oberman <oberma...@gmail.com>
Sent: Thursday, March 12, 2015 9:13:28 AM
Subject: Re: [PATCH ] tcm_qla2xxx Add SCSI command jammer/discard capabilty to the tcm_qla2xxx module - revision2

On 03/08/2015 11:38 AM, Laurence Oberman wrote:
> Here is revision2
> I added unlikely and removed messaging control as it is not necessary and adds overhead.
>
> I use target LIO for all my storage array test targets and customer problem resolution here at Red Hat. This patch resulted from a requirement
Re: [PATCH ] tcm_qla2xxx Add SCSI command jammer/discard capabilty to the tcm_qla2xxx module
Hello Quinn Tran

Thank you for the feedback.

There is a revision2 of this patch I sent as a follow on to Bart that is much cleaner, but it's still exposed to the memory leaks. The newer version has a single jam_host parameter as suggested by Bart and the messaging removed. Have a look for it. Bart also suggested moving the discard to a higher layer in his most recent response to allow other transports to benefit as well.

I have used this a lot and it's been extremely useful, but I never used it for extended periods, and only specifically to test servers connected via F/C to the LIO host. I was concerned that we had a dangling allocation after discard but never saw the leak show up significantly in my testing, mostly because my test servers are in error recovery and waiting on timeouts.

Where I placed the discard seemed to be the safest place for my particular use case. I did use other options like zeroing the cdb and passing the command on to avoid the dangling allocation, to force lots of underruns on the host during testing.

Let me revisit my most recent version and take care of the memory leak exposure and look into your other suggestions. I will reply in that latest thread with a new version.

Many Thanks for the consideration
Laurence

Laurence Oberman
Red Hat Global Support Service
SEG Team

----- Original Message -----
From: Quinn Tran <quinn.t...@qlogic.com>
To: Laurence Oberman <lober...@redhat.com>, Andy Grover <agro...@redhat.com>, linux-scsi <linux-scsi@vger.kernel.org>, n...@daterainc.com
Cc: Laurence Oberman <oberma...@gmail.com>
Sent: Thursday, March 12, 2015 6:07:08 PM
Subject: Re: [PATCH ] tcm_qla2xxx Add SCSI command jammer/discard capabilty to the tcm_qla2xxx module

This idea definitely helps flush out additional interaction issues between fabric drivers and TCM. However, the current spot where the error injection is placed will cause a memory leak. The error injection tries to drop the command before submission to TCM. TCM QLA drivers will lose track of this command. The test will be short-lived if enough memory has been leaked. Maybe the command should be dropped before memory allocation.

In addition, it would be nice if the other spots could be included, such as: queue_status(), queue_data_in(), aborted_task(), queue_tm_rsp(), target_submit_tmr().

If the intent is to test all adapters, then the error injection needs to be moved higher up into the TCM driver.

Regards,
Quinn Tran

On 3/7/15, 8:26 PM, Laurence Oberman <lober...@redhat.com> wrote:
> Hello
>
> I use target LIO for all my storage array test targets and customer problem resolution here at Red Hat. This patch resulted from a requirement to mimic behaviour of an expensive hardware jammer for a customer. I have used this for some time with good success to simulate and reproduce latency and slow drain fabric issues and for testing and validating error handling behaviour in the Emulex, Qlogic and other F/C drivers.
>
> Works by checking jammer_flag==1 and host # and discards SCSI command, controlled using echo to sys parameter.
>
> I decided to share the patch, in the hope it may be useful for others but I do understand this is a special use case. If this is useful and Nab wants to include it I will create a proper documentation patch as well.
>
> filename:       /lib/modules/3.17.7-200.jammer.fc20.x86_64/kernel/drivers/scsi/qla2xxx/tcm_qla2xxx.ko
> license:        GPL
> description:    TCM QLA2XXX series NPIV enabled fabric driver
> depends:        target_core_mod,qla2xxx,scsi_transport_fc
> intree:         Y
> vermagic:       3.17.7-200.jammer.fc20.x86_64 SMP mod_unload
> parm:           jammer_flag:Set to 1: Enable jammer (int)
> parm:           host_flag:host number to match on (int)
>
> Enable host 6 to be jammed
> echo 6 > /sys/module/tcm_qla2xxx/parameters/host_flag
>
> Usage example script:
>
> #!/bin/bash
> host=`cat /sys/module/tcm_qla2xxx/parameters/host_flag`
> sleep_time=120 ### Time to jam for
> echo "We start to discard commands on SCSI host $host"
> logger "Jammer started"
> echo 1 > /sys/module/tcm_qla2xxx/parameters/jammer_flag
> sleep $sleep_time
> echo 0 > /sys/module/tcm_qla2xxx/parameters/jammer_flag
> echo "We stopped the jammer"
> logger "Jammer stopped"
>
> This Patch diff against 3.19.1
>
> Tested by: Laurence Oberman <lober...@redhat.com>
> Signed-off-by: Laurence Oberman <lober...@redhat.com>
>
> diff -Nurp a/drivers/scsi/qla2xxx/tcm_qla2xxx.c b/drivers/scsi/qla2xxx/tcm_qla2xxx.c
> --- a/drivers/scsi/qla2xxx/tcm_qla2xxx.c	2015-03-07 18:35:15.246737589 -0500
> +++ b/drivers/scsi/qla2xxx/tcm_qla2xxx.c	2015-03-07 18:35:40.168599630 -0500
> @@ -50,6 +50,14 @@
>  #include "qla_target.h"
>  #include "tcm_qla2xxx.h"
>
> +int message_flag=0;
> +int jammer_flag = 0;
> +module_param(jammer_flag, int,0644);
> +MODULE_PARM_DESC(jammer_flag, "If set to 1: Enable jammer");
> +int host_flag=0;
> +module_param(host_flag, int,0644);
> +MODULE_PARM_DESC(host_flag, "host number to match on");
> +
>  static struct workqueue_struct *tcm_qla2xxx_free_wq;
>  static struct workqueue_struct *tcm_qla2xxx_cmd_wq;
>
> @@ -570,6
Resend: [PATCH ] qla2xxx Add SCSI command jammer/discard capabilty to the qla2xxx target module - revision3
Hello Bart, Quinn Tran,

I have been using this jammer facility since I posted the updated patch below, with no memory leaks and no issues. Is there any interest in taking this patch in? It has certainly been critical for me in some of the error recovery testing I have been doing.

Thanks
Laurence Oberman
Red Hat Global Support Service SEG Team

----- Original Message -----
From: Laurence Oberman lober...@redhat.com
To: Bart Van Assche bart.vanass...@sandisk.com, Quinn Tran quinn.t...@qlogic.com
Cc: Andy Grover agro...@redhat.com, linux-scsi@vger.kernel.org, n...@daterainc.com, Laurence Oberman oberma...@gmail.com
Sent: Thursday, March 12, 2015 10:13:57 PM
Subject: Re: [PATCH ] qla2xxx Add SCSI command jammer/discard capabilty to the qla2xxx target module - revision3

Hello Bart, Quinn Tran

Thanks for the feedback.

Revision3: moved the discard to the __qlt_do_work code to prevent the memory leak; this cleans up the allocations.

I will look at how best this can be done for the other transports, or in the core, but for me the most useful case has been F/C. I wanted to get feedback so far, and suggest that we start with this as the initial jamming patch as it is the least risky change for now. I did test this and ran the same set of tests I normally use this error injection for, and it looks good.

Patch notes
---
I use target LIO for all my storage array test targets and customer problem resolution here at Red Hat. This patch resulted from a requirement to mimic the behaviour of an expensive hardware jammer for a customer. I have used this for some time with good success to simulate and reproduce latency and slow drain fabric issues, and for testing and validating error handling behaviour in the Emulex, Qlogic and other F/C drivers.

While the jammer is enabled, SCSI commands are discarded for the selected host. This allows all the multipath error recovery and other LLD driver error recovery and timeout code to be debugged and tested.

It works by checking the new parameter jam_host: if it is >= 0 and matches vha->host_no, commands for that host are discarded. Jamming is enabled when jam_host >= 0. If the parameter is set to -1 (default) no jamming is enabled.

I decided to share the patch in the hope it may be useful for others, but I do understand this is a special use case.

Tested by: Laurence Oberman lober...@redhat.com
Signed-off-by: Laurence Oberman lober...@redhat.com

diff -Nurp a/Documentation/scsi/qla2xxx.txt b/Documentation/scsi/qla2xxx.txt
--- a/Documentation/scsi/qla2xxx.txt 1969-12-31 19:00:00.0 -0500
+++ b/Documentation/scsi/qla2xxx.txt 2015-03-12 21:42:49.828788582 -0400
@@ -0,0 +1,34 @@
+qla2xxx target mode parameters
+--
+parm: qlini_mode:Determines when initiator mode will be enabled. Possible values: exclusive - initiator mode will be enabled on load, disabled on enabling target mode and then on disabling target mode enabled back; disabled - initiator mode will never be enabled; enabled (default) - initiator mode will always stay enabled. (charp)
+
+Enables qla2xxx target mode by setting to disabled on module load
+
+There is now a new module parameter added to the qla2xxx module
+parm: jam_host:Host to jam >=0 Enable jammer (int)
+
+Use this parameter to control the discarding of SCSI commands to a selected host.
+This may be useful for testing error handling and simulating slow drain and other
+fabric issues.
+
+Any value >=0 that matches a fc_host # will discard the commands for that host.
+Reset back to -1 to stop the jamming.
+
+Enable host 6 to be jammed
+echo 6 > /sys/module/qla2xxx/parameters/jam_host
+
+Disable jamming on host 6
+echo -1 > /sys/module/qla2xxx/parameters/jam_host
+
+Usage example script:
+
+#!/bin/bash
+sleep_time=120 ### Time to jam for
+echo 6 > /sys/module/qla2xxx/parameters/jam_host
+host=`cat /sys/module/qla2xxx/parameters/jam_host`
+echo We start to discard commands on SCSI host $host
+logger Jammer started
+sleep $sleep_time
+echo -1 > /sys/module/qla2xxx/parameters/jam_host
+echo We stopped the jammer
+logger Jammer stopped
diff -Nurp a/drivers/scsi/qla2xxx/qla_target.c b/drivers/scsi/qla2xxx/qla_target.c
--- a/drivers/scsi/qla2xxx/qla_target.c 2015-03-12 21:44:04.691314527 -0400
+++ b/drivers/scsi/qla2xxx/qla_target.c 2015-03-12 21:52:27.551557133 -0400
@@ -59,6 +59,11 @@ MODULE_PARM_DESC(qlini_mode,
 
 int ql2x_ini_mode = QLA2XXX_INI_MODE_EXCLUSIVE;
 
+int jam_host = -1;
+module_param(jam_host, int, 0644);
+MODULE_PARM_DESC(jam_host, "Host to jam >=0 Enable jammer");
+
+
 static int temp_sam_status = SAM_STAT_BUSY;
 
 /*
@@ -3264,6 +3269,11 @@ static void __qlt_do_work(struct qla_tgt
 	cmd->cmd_flags |= BIT_1;
 	if (tgt->tgt_stop)
 		goto out_term;
+	/*
+	 * If jam_host >=0, goto out_term discarding command for matching host
+	 */
+	if (unlikely(vha->host_no == jam_host))
+		goto out_term;
 
 	cdb = &atio->u.isp24.fcp_cmnd.cdb[0];
 	cmd->tag = atio->u.isp24
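
The matching rule described in the patch notes (discard only when jam_host is >= 0 and equals the host number, never when it is -1) can be modelled in userspace. This is only a sketch: should_discard() is a hypothetical helper that does not appear in the patch, and it assumes the driver keeps the host number in an unsigned type, which has not been checked against the source.

```c
#include <stdbool.h>

/* Userspace model only; jam_host mirrors the module parameter in the patch. */
static int jam_host = -1;	/* -1 (default): no jamming */

/*
 * Hypothetical helper, not part of the patch: returns true when a command
 * arriving on host_no should be discarded. The jam_host >= 0 guard makes
 * the "off" value explicit instead of relying on integer conversions.
 */
static bool should_discard(unsigned int host_no)
{
	return jam_host >= 0 && host_no == (unsigned int)jam_host;
}
```

With jam_host left at -1 the helper returns false for every host; after setting it to 6 only host 6 matches.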
Re: [PATCH v6] st implement tape statistics
Hello,

I pulled the latest revision of this patch and tested it. I can vouch for it working as expected without any obvious impact on the existing st driver.

Is there any way we can move this along?

Thanks

Tested-by: Laurence Oberman lober...@redhat.com

On Thu, Feb 12, 2015 at 6:15 AM, Seymour, Shane M shane.seym...@hp.com wrote:

The following patch exposes statistics for the st driver via sysfs. There is a need for companies with large numbers of tape drives numbering in the tens of thousands to track the performance of those tape drives (e.g. when a backup exceeds its window). The statistics provided should allow the calculation of throughput, average block sizes for read and write, and time spent waiting for I/O to complete or doing tape movement.

Signed-off-by: Shane Seymour shane.seym...@hp.com
Tested-by: Shane Seymour shane.seym...@hp.com
---
- Removed comment
- Found an issue where read and write sizes were over reported (fixed)
  In all the test cases I have the stats now report what I expect to be
  the correct value. Some of the values to be used with statistics are now
  stored in temporary variables and used to calculate the stats when the
  I/O ends. Separated out the timestamp into 3 since I found it was
  possible for other tape I/O to happen during writes updating the stamp
  value causing the time tracked to be wrong.
- Moved the end statistics into a separate function because it had made
  the function that it was in too large.
- Added a new statistic - A count of the number of times we had a residual
  greater than 0.

--- a/drivers/scsi/st.c 2015-01-11 14:46:00.243814755 -0600
+++ b/drivers/scsi/st.c 2015-02-11 22:37:01.382243090 -0600
@@ -471,6 +471,47 @@ static void st_release_request(struct st
 	kfree(streq);
 }
 
+static void st_do_stats(struct scsi_tape *STp, struct request *req)
+{
+	u64 ticks;
+
+	ticks = get_jiffies_64();
+	STp->stats->in_flight--;
+	if (req->cmd[0] == WRITE_6) {
+		ticks -= STp->stats->write_stamp;
+		STp->stats->write_ticks += ticks;
+		STp->stats->io_ticks += ticks;
+		STp->stats->write_cnt++;
+		if (req->errors) {
+			STp->stats->write_byte_cnt +=
+				STp->stats->last_write_size -
+				STp->buffer->cmdstat.residual;
+			if (STp->buffer->cmdstat.residual > 0)
+				STp->stats->resid_cnt++;
+		} else
+			STp->stats->write_byte_cnt +=
+				STp->stats->last_write_size;
+	} else if (req->cmd[0] == READ_6) {
+		ticks -= STp->stats->read_stamp;
+		STp->stats->read_ticks += ticks;
+		STp->stats->io_ticks += ticks;
+		STp->stats->read_cnt++;
+		if (req->errors)
+			STp->stats->read_byte_cnt +=
+				STp->stats->last_read_size -
+				STp->buffer->cmdstat.residual;
+			if (STp->buffer->cmdstat.residual > 0)
+				STp->stats->resid_cnt++;
+		else
+			STp->stats->read_byte_cnt +=
+				STp->stats->last_read_size;
+	} else {
+		ticks -= STp->stats->other_stamp;
+		STp->stats->io_ticks += ticks;
+		STp->stats->other_cnt++;
+	}
+}
+
 static void st_scsi_execute_end(struct request *req, int uptodate)
 {
 	struct st_request *SRpnt = req->end_io_data;
@@ -480,6 +521,8 @@ static void st_scsi_execute_end(struct r
 	STp->buffer->cmdstat.midlevel_result = SRpnt->result = req->errors;
 	STp->buffer->cmdstat.residual = req->resid_len;
 
+	st_do_stats(STp, req);
+
 	tmp = SRpnt->bio;
 	if (SRpnt->waiting)
 		complete(SRpnt->waiting);
@@ -496,6 +539,7 @@ static int st_scsi_execute(struct st_req
 	struct rq_map_data *mdata = &SRpnt->stp->buffer->map_data;
 	int err = 0;
 	int write = (data_direction == DMA_TO_DEVICE);
+	struct scsi_tape *STp = SRpnt->stp;
 
 	req = blk_get_request(SRpnt->stp->device->request_queue, write,
 			      GFP_KERNEL);
@@ -516,6 +560,17 @@ static int st_scsi_execute(struct st_req
 		}
 	}
 
+	if (cmd[0] == WRITE_6) {
+		STp->stats->last_write_size = bufflen;
+		STp->stats->write_stamp = get_jiffies_64();
+	} else if (cmd[0] == READ_6) {
+		STp->stats->last_read_size = bufflen;
+		STp->stats->read_stamp = get_jiffies_64();
+	} else {
+		STp->stats->other_stamp = get_jiffies_64();
+	}
+	STp->stats->in_flight++;
+
 	SRpnt->bio = req->bio;
 	req->cmd_len = COMMAND_SIZE(cmd[0]);
 	memset(req->cmd, 0, BLK_MAX_CDB);
@@ -4222,6 +4277,12 @@ static int st_probe(struct device *dev)
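
The commit message says the counters should allow throughput and average block size to be calculated. A minimal userspace sketch of that arithmetic follows, assuming the byte counts, command counts and tick counts have already been read from wherever the driver exposes them. The helper names and the hz argument (ticks per second) are illustrative assumptions, not part of the patch.

```c
#include <stdint.h>

/* Average bytes per command; returns 0 when no commands were counted. */
static uint64_t avg_block_size(uint64_t byte_cnt, uint64_t cmd_cnt)
{
	return cmd_cnt ? byte_cnt / cmd_cnt : 0;
}

/*
 * Throughput in bytes per second, given a tick count accumulated at
 * hz ticks per second. Returns 0.0 when no time was recorded.
 */
static double throughput_bps(uint64_t byte_cnt, uint64_t ticks,
			     unsigned int hz)
{
	if (ticks == 0 || hz == 0)
		return 0.0;
	return (double)byte_cnt * hz / (double)ticks;
}
```

For example, a write_byte_cnt of 1048576 over a write_cnt of 4 gives an average block size of 262144 bytes. Floating point is used for the rate so that large byte counts multiplied by hz cannot overflow a 64-bit integer.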
Re: [PATCH ] tcm_qla2xxx Add SCSI command jammer/discard capabilty to the tcm_qla2xxx module - revision2
Hello Bart,

Thanks. Here is revision2. I added unlikely() and removed the messaging control as it is not necessary and adds overhead.

I use target LIO for all my storage array test targets and customer problem resolution here at Red Hat. This patch resulted from a requirement to mimic the behaviour of an expensive hardware jammer for a customer. I have used this for some time with good success to simulate and reproduce latency and slow drain fabric issues, and for testing and validating error handling behaviour in the Emulex, Qlogic and other F/C drivers.

It works by checking the new parameter jam_host: if it is >= 0 and matches vha->host_no, commands for that host are discarded. Jamming is enabled when jam_host >= 0. If the parameter is set to -1 (default) no jamming is enabled.

I decided to share the patch in the hope it may be useful for others, but I do understand this is a special use case.

This Patch diff against 3.19.1

$ linux-3.19.1/scripts/checkpatch.pl latest-upstream-jammer-path
total: 0 errors, 0 warnings, 60 lines checked

latest-upstream-jammer-path has no obvious style problems and is ready for submission.

Tested by: Laurence Oberman lober...@redhat.com
Signed-off-by: Laurence Oberman lober...@redhat.com

diff -Nurp a/Documentation/scsi/tcm_qla2xxx.txt b/Documentation/scsi/tcm_qla2xxx.txt
--- a/Documentation/scsi/tcm_qla2xxx.txt 1969-12-31 19:00:00.0 -0500
+++ b/Documentation/scsi/tcm_qla2xxx.txt 2015-03-08 11:32:42.262181821 -0400
@@ -0,0 +1,30 @@
+tcm_qla2xxx jammer parameter usage
+--
+There is now a new module parameter added to the tcm_qla2xx module
+parm: jam_host:Host to jam >=0 Enable jammer (int)
+
+Use this parameter to control the discarding of SCSI commands to a selected host.
+This may be useful for testing error handling and simulating slow drain and other
+fabric issues.
+
+Any value >=0 that matches a fc_host # will discard the commands for that host.
+Reset back to -1 to stop the jamming.
+
+Enable host 6 to be jammed
+echo 6 > /sys/module/tcm_qla2xxx/parameters/jam_host
+
+Disable jamming on host 6
+echo -1 > /sys/module/tcm_qla2xxx/parameters/jam_host
+
+Usage example script:
+
+#!/bin/bash
+sleep_time=120 ### Time to jam for
+echo 6 > /sys/module/tcm_qla2xxx/parameters/jam_host
+host=`cat /sys/module/tcm_qla2xxx/parameters/jam_host`
+echo We start to discard commands on SCSI host $host
+logger Jammer started
+sleep $sleep_time
+echo -1 > /sys/module/tcm_qla2xxx/parameters/jam_host
+echo We stopped the jammer
+logger Jammer stopped
diff -Nurp a/drivers/scsi/qla2xxx/tcm_qla2xxx.c b/drivers/scsi/qla2xxx/tcm_qla2xxx.c
--- a/drivers/scsi/qla2xxx/tcm_qla2xxx.c 2015-03-08 10:13:31.798400426 -0400
+++ b/drivers/scsi/qla2xxx/tcm_qla2xxx.c 2015-03-08 11:00:53.002419568 -0400
@@ -50,6 +50,10 @@
 #include "qla_target.h"
 #include "tcm_qla2xxx.h"
 
+int jam_host = -1;
+module_param(jam_host, int, 0644);
+MODULE_PARM_DESC(jam_host, "Host to jam >=0 Enable jammer");
+
 static struct workqueue_struct *tcm_qla2xxx_free_wq;
 static struct workqueue_struct *tcm_qla2xxx_cmd_wq;
 
@@ -571,6 +575,13 @@ static int tcm_qla2xxx_handle_cmd(scsi_q
 		return -EINVAL;
 	}
 
+	if (unlikely(vha->host_no == jam_host)) {
+		/*
+		 return, and dont run target_submit_cmd, discarding command
+		 */
+		return 0;
+	}
+
 	return target_submit_cmd(se_cmd, se_sess, cdb, &cmd->sense_buffer[0],
 				cmd->unpacked_lun, data_length, fcp_task_attr,
 				data_dir, flags);
@@ -2165,6 +2176,7 @@ static void tcm_qla2xxx_deregister_confi
 static int __init tcm_qla2xxx_init(void)
 {
 	int ret;
+	jam_host = -1;
 
 	ret = tcm_qla2xxx_register_configfs();
 	if (ret < 0)

Thank you for the consideration

Laurence Oberman
Red Hat Global Support Service SEG Team

----- Original Message -----
From: Bart Van Assche bart.vanass...@sandisk.com
To: Laurence Oberman lober...@redhat.com, Andy Grover agro...@redhat.com, linux-scsi@vger.kernel.org, n...@daterainc.com
Cc: Laurence Oberman oberma...@gmail.com
Sent: Sunday, March 8, 2015 4:10:34 AM
Subject: Re: [PATCH ] tcm_qla2xxx Add SCSI command jammer/discard capabilty to the tcm_qla2xxx module

On 03/08/2015 04:26 AM, Laurence Oberman wrote:

Hello

I use target LIO for all my storage array test targets and customer problem resolution here at Red Hat. This patch resulted from a requirement to mimic behaviour of an expensive hardware jammer for a customer. I have used this for some time with good success to simulate and reproduce latency and slow drain fabric issues and for testing and validating error handling behaviour in the Emulex, Qlogic and other F/C drivers.

Works by checking jammer_flag==1 and host # and discards SCSI command, controlled using echo to sys parameter.

I decided to share the patch, in the hope it may be useful for others but I do understand this is a special
Re: mvsas panics and dies when attached to a port extender on newer kernels
Any chance you can capture a vmcore (kernel-only pages)? I will provide an upload location.

Thanks
Laurence

On Tue, Apr 14, 2015 at 5:16 PM, James Bottomley james.bottom...@hansenpartnership.com wrote:
> On Tue, 2015-04-14 at 14:03 -0700, Adam Talbot wrote:
> > To make a very long debugging story short, I think there is an
> > issue/bug with the mvsas driver. It works with older kernels, and
> > breaks on newer kernels.
> >
> > My Debian Jessie system was running great on a 3.18 kernel. Changed
> > cases to a newer supermicro case with a SAS expander backplane
> > (SAS933EL). That was the only hardware change. Now, whenever I boot,
> > the system kernel panics.
> >
> > 3.2.65-1+deb7u2 works
> > 3.9.0 Gentoo CD works
> > 3.16+ all fail
> >
> > Attached are 3 kernel panics on 3.16+ kernels.
> >
> > Motherboard is a Supermicro X8SIE, with a Marvell Technology Group
> > Ltd. 88SE6440 SAS/SATA PCIe controller
> >
> > Is this a known bug?
>
> Well, you're the only person that's reported it so far. I think, based
> on the above, that your configuration is a single expander attached
> SATA device ... and if you move it to be non expander attached it works
> fine?
>
> > At this point I have two options:
> > Stick with the old kernel (yuck)
> > Buy a new card running a better supported chipset
> >
> > Any help would be greatly appreciated
> > Thanks
>
> You didn't specify: does 3.15 work? At least the highest working kernel
> version would help me narrow down potential problems.
>
> James

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] st: trivial: remove form feed characters
Reviewed-by: Laurence Oberman <lober...@redhat.com>

On Wed, Nov 4, 2015 at 4:52 AM, Maurizio Lombardi <mlomb...@redhat.com> wrote:
> Signed-off-by: Maurizio Lombardi <mlomb...@redhat.com>
> ---
>  drivers/scsi/st.c | 24
>  1 file changed, 8 insertions(+), 16 deletions(-)
>
> diff --git a/drivers/scsi/st.c b/drivers/scsi/st.c
> index b37b9b0..7c4e518 100644
> --- a/drivers/scsi/st.c
> +++ b/drivers/scsi/st.c
> @@ -226,7 +226,6 @@ static DEFINE_SPINLOCK(st_use_lock);
>  static DEFINE_IDR(st_index_idr);
>
>
> -
>  #include "osst_detect.h"
>  #ifndef SIGS_FROM_OSST
>  #define SIGS_FROM_OSST \
> @@ -305,7 +304,6 @@ static char * st_incompatible(struct scsi_device* SDp)
>  }
>  return NULL;
>  }
> -
>
>  static inline char *tape_name(struct scsi_tape *tape)
>  {
> @@ -877,7 +875,7 @@ static int flush_buffer(struct scsi_tape *STp, int seek_next)
>  return result;
>
>  }
> -
> +
>  /* Set the mode parameters */
>  static int set_mode_densblk(struct scsi_tape * STp, struct st_modedef * STm)
>  {
> @@ -952,7 +950,7 @@ static void reset_state(struct scsi_tape *STp)
>  STp->new_partition = STp->partition;
>  }
>  }
> -
> +
>  /* Test if the drive is ready. Returns either one of the codes below or a negative system
>  error code. */
>  #define CHKRES_READY 0
> @@ -1241,7 +1239,7 @@ static int check_tape(struct scsi_tape *STp, struct file *filp)
>  }
>
>
> - /* Open the device. Needs to take the BKL only because of incrementing the SCSI host
> +/* Open the device. Needs to take the BKL only because of incrementing the SCSI host
>  module count. */
>  static int st_open(struct inode *inode, struct file *filp)
>  {
> @@ -1334,7 +1332,6 @@ static int st_open(struct inode *inode, struct file *filp)
>  return retval;
>
>  }
> -
>
>  /* Flush the tape buffer before close */
>  static int st_flush(struct file *filp, fl_owner_t id)
> @@ -1470,7 +1467,7 @@ static int st_release(struct inode *inode, struct file *filp)
>
>  return result;
>  }
> -
> +
>  /* The checks common to both reading and writing */
>  static ssize_t rw_checks(struct scsi_tape *STp, struct file *filp, size_t count)
>  {
> @@ -1889,7 +1886,7 @@ st_write(struct file *filp, const char __user *buf, size_t count, loff_t * ppos)
>
>  return retval;
>  }
> -
> +
>  /* Read data from the tape. Returns zero in the normal case, one if the
>  eof status has changed, and the negative error code in case of a
>  fatal error. Otherwise updates the buffer and the eof state.
> @@ -2085,7 +2082,6 @@ static long read_tape(struct scsi_tape *STp, long count,
>  }
>  return retval;
>  }
> -
>
>  /* Read command */
>  static ssize_t
> @@ -2233,7 +2229,6 @@ st_read(struct file *filp, char __user *buf, size_t count, loff_t * ppos)
>
>  return retval;
>  }
> -
>
>
>  DEB(
> @@ -2447,7 +2442,7 @@ static int st_set_options(struct scsi_tape *STp, long options)
>
>  return 0;
>  }
> -
> +
>  #define MODE_HEADER_LENGTH 4
>
>  /* Mode header and page byte offsets */
> @@ -2665,7 +2660,7 @@ static int do_load_unload(struct scsi_tape *STp, struct file *filp, int load_cod
>
>  return retval;
>  }
> -
> +
>  #if DEBUG
>  #define ST_DEB_FORWARD 0
>  #define ST_DEB_BACKWARD 1
> @@ -3091,7 +3086,6 @@ static int st_int_ioctl(struct scsi_tape *STp, unsigned int cmd_in, unsigned lon
>
>  return ioctl_result;
>  }
> -
>
>  /* Get the tape position. If bt == 2, arg points into a kernel space mt_loc
>  structure. */
> @@ -3283,7 +3277,7 @@ static int switch_partition(struct scsi_tape *STp)
>  STps->last_block_visited = 0;
>  return set_location(STp, STps->last_block_visited, STp->new_partition, 1);
>  }
> -
> +
>  /* Functions for reading and writing the medium partition mode page. */
>
>  #define PART_PAGE 0x11
> @@ -3396,7 +3390,6 @@ static int partition_tape(struct scsi_tape *STp, int size)
>
>  return result;
>  }
> -
>
>
>  /* The ioctl command */
> @@ -3766,7 +3759,6 @@ static long st_compat_ioctl(struct file *file, unsigned int cmd, unsigned long a
>  }
>  #endif
>
> -
>
>  /* Try to allocate a new tape buffer. Calling function must not hold
>  dev_arr_lock. */
> --
> Maurizio Lombardi
Re: [PATCH] st: allow debug output to be enabled or disabled via sysfs
I support this addition as it can be done without the module reload required by my prior patch.

Reviewed-by: Laurence Oberman <oberma...@gmail.com>

On Mon, Oct 12, 2015 at 12:31 AM, Seymour, Shane M <shane.seym...@hpe.com> wrote:
>
> Change st driver to allow enabling or disabling debug output
> via sysfs file /sys/bus/scsi/drivers/st/debug_flag.
>
> Previously the only way to enable debug output was:
>
> 1. loading the driver with the module parameter debug_flag=1
> 2. an ioctl call (this method was also the only way to dynamically
> disable debug output).
>
> To use the ioctl you need a second tape drive (if you are
> actively testing the first tape drive) since a second process
> cannot open the first tape drive if it is in use.
>
> This change is only functional if the value of the macro
> DEBUG in st.c is a non-zero value (which it is by default).
>
> Signed-off-by: Shane Seymour <shane.seym...@hpe.com>
> ---
> --- a/drivers/scsi/st.c 2015-10-06 17:11:16.299801789 -0500
> +++ b/drivers/scsi/st.c 2015-10-11 14:45:10.595060995 -0500
> @@ -4452,11 +4452,41 @@ static ssize_t version_show(struct devic
>  }
>  static DRIVER_ATTR_RO(version);
>
> +#if DEBUG
> +static ssize_t debug_flag_store(struct device_driver *ddp,
> +	const char *buf, size_t count)
> +{
> +/* We only care what the first byte of the data is the rest is unused.
> + * if it's a '1' we turn on debug and if it's a '0' we disable it. All
> + * other values have -EINVAL returned if they are passed in.
> + */
> +	if (count > 0) {
> +		if (buf[0] == '0') {
> +			debugging = NO_DEBUG;
> +			return count;
> +		} else if (buf[0] == '1') {
> +			debugging = 1;
> +			return count;
> +		}
> +	}
> +	return -EINVAL;
> +}
> +
> +static ssize_t debug_flag_show(struct device_driver *ddp, char *buf)
> +{
> +	return scnprintf(buf, PAGE_SIZE, "%d\n", debugging);
> +}
> +static DRIVER_ATTR_RW(debug_flag);
> +#endif
> +
>  static struct attribute *st_drv_attrs[] = {
>  	&driver_attr_try_direct_io.attr,
>  	&driver_attr_fixed_buffer_size.attr,
>  	&driver_attr_max_sg_segs.attr,
>  	&driver_attr_version.attr,
> +#if DEBUG
> +	&driver_attr_debug_flag.attr,
> +#endif
>  	NULL,
>  };
>  ATTRIBUTE_GROUPS(st_drv);
> diff -uprN a/Documentation/ABI/testing/sysfs-driver-st b/Documentation/ABI/testing/sysfs-driver-st
> --- a/Documentation/ABI/testing/sysfs-driver-st 1969-12-31 18:00:00.0 -0600
> +++ b/Documentation/ABI/testing/sysfs-driver-st 2015-10-11 14:28:43.537128220 -0500
> @@ -0,0 +1,12 @@
> +What: /sys/bus/scsi/drivers/st/debug_flag
> +Date: October 2015
> +Kernel Version:?.?
> +Contact: shane.seym...@hpe.com
> +Description:
> + This file allows you to turn debug output from the st driver
> + off if you write a '0' to the file or on if you write a '1'.
> + Note that debug output requires that the module be compiled
> + with the #define DEBUG set to a non-zero value (this is the
> + default). If DEBUG is set to 0 then this file will not
> + appear in sysfs as its presence is conditional upon debug
> + output support being compiled into the module.
> --- a/Documentation/scsi/st.txt 2015-10-06 17:11:12.323802060 -0500
> +++ b/Documentation/scsi/st.txt 2015-10-11 14:19:48.176164681 -0500
> @@ -569,7 +569,9 @@ Debugging code is now compiled in by def
>  with the kernel module parameter debug_flag defaulting to 0. Debugging
>  can still be switched on and off with an ioctl. To enable debug at
>  module load time add debug_flag=1 to the module load options, the
> -debugging output is not voluminous.
> +debugging output is not voluminous. Debugging can also be enabled
> +and disabled by writing a '0' (disable) or '1' (enable) to the sysfs
> +file /sys/bus/scsi/drivers/st/debug_flag.
>
>  If the tape seems to hang, I would be very interested to hear where
>  the driver is waiting. With the command 'ps -l' you can see the state
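
The parsing rule in debug_flag_store() (only the first byte matters, '0' or '1' is accepted, anything else gets -EINVAL) can be tried out in userspace. The following is a sketch under that reading of the quoted patch: debug_flag_parse() is a hypothetical stand-in for the sysfs store handler, and NO_DEBUG is assumed to be 0.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

#define NO_DEBUG 0	/* assumed value, for illustration only */

static int debugging = NO_DEBUG;

/*
 * Userspace stand-in for the store handler: looks only at buf[0],
 * returns count on success and -EINVAL for any other input.
 */
static ssize_t debug_flag_parse(const char *buf, size_t count)
{
	if (count > 0) {
		if (buf[0] == '0') {
			debugging = NO_DEBUG;
			return (ssize_t)count;
		}
		if (buf[0] == '1') {
			debugging = 1;
			return (ssize_t)count;
		}
	}
	return -EINVAL;
}
```

One consequence of checking only the first byte is that a write of "10" is treated the same as "1", and the trailing newline that echo appends is simply ignored.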
Re: st driver doesn't seem to grok LTO partitioning
Thanks Doug

Trying that now

Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services

----- Original Message -----
From: "Douglas Gilbert" <dgilb...@interlog.com>
To: "Laurence Oberman" <lober...@redhat.com>, "Emmanuel Florac" <eflo...@intellique.com>
Cc: "Laurence Oberman" <oberma...@gmail.com>, "Kai Makisara" <kai.makis...@kolumbus.fi>, linux-scsi@vger.kernel.org
Sent: Wednesday, January 6, 2016 10:48:44 AM
Subject: Re: st driver doesn't seem to grok LTO partitioning

On 16-01-06 10:32 AM, Laurence Oberman wrote:
> Firmware update fails as follows:
>
> Still researching. This is the only LTO5 I have access to so unless Shane has
> one I may not be able to make progress.
> (Its way long out of warranty and support)
>
> We mostly use this for generic st driver and changer testing for RHEL and it
> has not been updated for at least two years.
>
> Performing FUP operation...
>
> Checking image file (/root/V3210A011-E00.IMG)
>
> Checking device readiness
>
> Sending image file to the device
>
> Redetecting device
> Fup drive command failed: Unknown error! (status = -100)
>
> Host adapter status = 0x00
> Driver status = 0x08
> Error buffer = 'MSG: FupDrive() - Error committing image file to drive
> (/root/V3210A011-E00.IMG) 1584236 of 1584236 bytes written.
> SCSI: WriteBuffer()::DevIo() - ErrorCode (0x70h) ,Sense Key (0x05h) ILLEGAL
> REQUEST, INVALID FIELD IN PARAMETER LIST. ASC(0x26h), ASCQ(0x00h) - )
> '
>
> Unable to perform FUP operation.

The 1584236 byte firmware image might be too big for a single
WRITE BUFFER command. You might try getting a recent version
of sg3_utils and doing something like:

   sg_write_buffer -b 4k -I V3210A011-E00.IMG -m 7 /dev/sg3

where /dev/sg3 corresponds to your tape drive. 'lsscsi -g' will
show you the mapping.

The above technique works fine for recent Seagate SAS disks (with
".LOD" firmware images).

Doug Gilbert

> ----- Original Message -----
> From: "Laurence Oberman" <lober...@redhat.com>
> To: "Emmanuel Florac" <eflo...@intellique.com>
> Cc: "Laurence Oberman" <oberma...@gmail.com>, "Kai Makisara"
> <kai.makis...@kolumbus.fi>, linux-scsi@vger.kernel.org
> Sent: Wednesday, January 6, 2016 10:25:37 AM
> Subject: Re: st driver doesn't seem to grok LTO partitioning
>
> I left the log of the failure to partition out
>
> Here it is
>
> # mt -f /dev/nst0 mkpartition 1
> /dev/nst0: Input/output error
>
> [ 5499.341648] st 0:0:0:0: [st0] Block limits 1 - 16777215 bytes.
> [ 5499.342903] st 0:0:0:0: [st0] Mode sense. Length 11, medium 0, WBS 10, BLL 8
> [ 5499.343523] st 0:0:0:0: [st0] Density 58, tape length: 0, drv buffer: 1
> [ 5499.344114] st 0:0:0:0: [st0] Block size: 512, buffer size: 4096 (8 blocks).
> [ 5499.344702] st 0:0:0:0: [st0] Loading tape.
> [ 5499.359733] st 0:0:0:0: [st0] Block limits 1 - 16777215 bytes.
> [ 5499.360970] st 0:0:0:0: [st0] Mode sense. Length 11, medium 0, WBS 10, BLL 8
> [ 5499.361584] st 0:0:0:0: [st0] Density 58, tape length: 0, drv buffer: 1
> [ 5499.362165] st 0:0:0:0: [st0] Block size: 512, buffer size: 4096 (8 blocks).
> [ 5499.363851] st 0:0:0:0: [st0] Partition page length is 10 bytes.
> [ 5499.364468] st 0:0:0:0: [st0] PP: max 0, add 0, xdp 0, psum 03, pofmetc 0,rec 03, units 09, sizes: 1541 65535
> [ 5499.365074] st 0:0:0:0: [st0] MP: 11 08 00 00 18 03 09 00 06 05 ff ff
> [ 5499.365658] st 0:0:0:0: [st0] psd_cnt 2, max.parts 0, nbr_parts 0
> [ 5499.366246] st 0:0:0:0: [st0] Formatting tape with two partitions (FDP).
> [ 5499.366826] st 0:0:0:0: [st0] Sent partition page length is 12 bytes. needs_format: 0
> [ 5499.367424] st 0:0:0:0: [st0] PP: max 0, add 1, xdp 4, psum 03, pofmetc 0 rec 03, units 00, sizes: 65535 65535
> [ 5499.368024] st 0:0:0:0: [st0] MP: 11 0a 00 01 98 03 00 00 ff ff ff ff
> [ 5499.369842] st 0:0:0:0: [st0] Error: 802, cmd: 15 10 0 0 18 0
> [ 5499.370495] st 0:0:0:0: [st0] Sense Key : Illegal Request [current]
> [ 5499.371109] st 0:0:0:0: [st0] Add. Sense: Invalid field in parameter list
> [ 5499.371714] st 0:0:0:0: [st0] Partitioning of tape failed.
>
> Laurence Oberman
> Principal Software Maintenance Engineer
> Red Hat Global Support Services
>
> ----- Original Message -----
> From: "Laurence Oberman" <lober...@redhat.com>
> To: "Emmanuel Florac" <eflo...@intellique.com>
> Cc: "Laurence Oberman" <oberma...@gmail.com>, "Kai Makisara"
> <kai.makis...@kolumbus.fi>, linux-scsi@vger.kernel.org
> Sent: Wednesday, January 6, 2016 10:23:34 AM
> Subject: Re: st driver doesn't seem to g
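
Doug's suggestion of sending the 1584236 byte image in 4k pieces comes down to simple arithmetic, sketched here in userspace C. The helpers below are illustrative only and say nothing about how sg_write_buffer itself is implemented; they just show how many pieces a given size implies and how long the final one is.

```c
#include <stddef.h>

/* Number of pieces needed to send total bytes in pieces of at most bs bytes. */
static size_t chunk_count(size_t total, size_t bs)
{
	return bs ? (total + bs - 1) / bs : 0;
}

/* Length of the final piece (equal to bs when total divides evenly). */
static size_t last_chunk_len(size_t total, size_t bs)
{
	if (bs == 0 || total == 0)
		return 0;
	return total % bs ? total % bs : bs;
}
```

For the image in this thread, chunk_count(1584236, 4096) is 387 and last_chunk_len(1584236, 4096) is 3180, i.e. 386 full 4096 byte pieces followed by one short piece.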
Re: st driver doesn't seem to grok LTO partitioning
Hello Emmanuel

I am using this device, it's an Ultrium 5 (LTO5). It's an older changer and I am unable to update the firmware, still working on that.

What version of mt are you using? I am testing using a RHEL7.2 base and the upstream patched kernel.

Linux example.redhat.com 4.3.3 #1 SMP Tue Jan 5 15:58:47 EST 2016 x86_64 x86_64 x86_64 GNU/Linux

# tapeinfo -f /dev/st0
Product Type: Tape Drive
Vendor ID: 'QUANTUM '
Product ID: 'ULTRIUM 5 '
Revision: '3060'
Attached Changer API: No
SerialNumber: 'HU1023AKHE'
MinBlock: 1
MaxBlock: 16777215
SCSI ID: 0
SCSI LUN: 0
Ready: yes
BufferedMode: yes
Medium Type: Not Loaded
Density Code: 0x58
BlockSize: 512
DataCompEnabled: yes
DataCompCapable: yes
DataDeCompEnabled: yes
CompType: 0x1
DeCompType: 0x1
BOP: yes
Block Position: 0
Partition 0 Remaining Kbytes: 1541692
Partition 0 Size in Kbytes: 1541692
ActivePartition: 0
EarlyWarningSize: 0
NumPartitions: 0
MaxPartitions: 0

Drive is working fine,

# mt -f /dev/st0 status
SCSI 2 tape drive:
File number=0, block number=0, partition=0.
Tape block size 512 bytes. Density code 0x58 (no translation).
Soft error count since last status=0
General status bits on (4101):
 BOT ONLINE IM_REP_EN

This is what I get when I try to partition, and I believe this may be a firmware issue for me.

mt -f /dev/st0 stsetoption can-partitions

[ 5343.620005] st 0:0:0:0: [st0] Block limits 1 - 16777215 bytes.
[ 5343.621424] st 0:0:0:0: [st0] Mode sense. Length 11, medium 0, WBS 10, BLL 8
[ 5343.622005] st 0:0:0:0: [st0] Density 58, tape length: 0, drv buffer: 1
[ 5343.622606] st 0:0:0:0: [st0] Block size: 512, buffer size: 4096 (8 blocks).
[ 5343.623208] st 0:0:0:0: [st0] Mode 0 options: buffer writes: 1, async writes: 1, read ahead: 1
[ 5343.623810] st 0:0:0:0: [st0] can bsr: 1, two FMs: 0, fast mteom: 0, auto lock: 0,
[ 5343.624413] st 0:0:0:0: [st0] defs for wr: 0, no block limits: 0, partitions: 1, s2 log: 0
[ 5343.625011] st 0:0:0:0: [st0] sysv: 0 nowait: 0 sili: 0 nowait_filemark: 0
[ 5343.625623] st 0:0:0:0: [st0] debugging: 1
[ 5343.626222] st 0:0:0:0: [st0] Rewinding tape.

# mt -f /dev/nst0 mkpartition 1
/dev/nst0: Input/output error

Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services

----- Original Message -----
From: "Emmanuel Florac" <eflo...@intellique.com>
To: "Laurence Oberman" <lober...@redhat.com>
Cc: "Laurence Oberman" <oberma...@gmail.com>, "Kai Makisara" <kai.makis...@kolumbus.fi>, linux-scsi@vger.kernel.org
Sent: Wednesday, January 6, 2016 10:10:49 AM
Subject: Re: st driver doesn't seem to grok LTO partitioning

On Tue, 5 Jan 2016 16:55:04 -0500 (EST)
Laurence Oberman <lober...@redhat.com> wrote:

> mt -f /dev/nst0 mkpartition 1
>

What is the type of drive exactly? I still couldn't test with the LTO-5 drive as the machine it's connected to is being reinstalled.

--
Emmanuel Florac | Direction technique | Intellique | <eflo...@intellique.com> | +33 1 78 94 84 02
Re: st driver doesn't seem to grok LTO partitioning
Firmware update fails as follows:

Still researching. This is the only LTO5 I have access to, so unless Shane has one I may not be able to make progress.
(It's way long out of warranty and support)

We mostly use this for generic st driver and changer testing for RHEL and it has not been updated for at least two years.

Performing FUP operation...

Checking image file (/root/V3210A011-E00.IMG)

Checking device readiness

Sending image file to the device

Redetecting device
Fup drive command failed: Unknown error! (status = -100)

Host adapter status = 0x00
Driver status = 0x08
Error buffer = 'MSG: FupDrive() - Error committing image file to drive (/root/V3210A011-E00.IMG) 1584236 of 1584236 bytes written.
SCSI: WriteBuffer()::DevIo() - ErrorCode (0x70h) ,Sense Key (0x05h) ILLEGAL REQUEST, INVALID FIELD IN PARAMETER LIST. ASC(0x26h), ASCQ(0x00h) - )
'

Unable to perform FUP operation.

Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services

----- Original Message -----
From: "Laurence Oberman" <lober...@redhat.com>
To: "Emmanuel Florac" <eflo...@intellique.com>
Cc: "Laurence Oberman" <oberma...@gmail.com>, "Kai Makisara" <kai.makis...@kolumbus.fi>, linux-scsi@vger.kernel.org
Sent: Wednesday, January 6, 2016 10:25:37 AM
Subject: Re: st driver doesn't seem to grok LTO partitioning

I left out the log of the failure to partition.

Here it is

# mt -f /dev/nst0 mkpartition 1
/dev/nst0: Input/output error

[ 5499.341648] st 0:0:0:0: [st0] Block limits 1 - 16777215 bytes.
[ 5499.342903] st 0:0:0:0: [st0] Mode sense. Length 11, medium 0, WBS 10, BLL 8
[ 5499.343523] st 0:0:0:0: [st0] Density 58, tape length: 0, drv buffer: 1
[ 5499.344114] st 0:0:0:0: [st0] Block size: 512, buffer size: 4096 (8 blocks).
[ 5499.344702] st 0:0:0:0: [st0] Loading tape.
[ 5499.359733] st 0:0:0:0: [st0] Block limits 1 - 16777215 bytes.
[ 5499.360970] st 0:0:0:0: [st0] Mode sense. Length 11, medium 0, WBS 10, BLL 8
[ 5499.361584] st 0:0:0:0: [st0] Density 58, tape length: 0, drv buffer: 1
[ 5499.362165] st 0:0:0:0: [st0] Block size: 512, buffer size: 4096 (8 blocks).
[ 5499.363851] st 0:0:0:0: [st0] Partition page length is 10 bytes.
[ 5499.364468] st 0:0:0:0: [st0] PP: max 0, add 0, xdp 0, psum 03, pofmetc 0,rec 03, units 09, sizes: 1541 65535
[ 5499.365074] st 0:0:0:0: [st0] MP: 11 08 00 00 18 03 09 00 06 05 ff ff
[ 5499.365658] st 0:0:0:0: [st0] psd_cnt 2, max.parts 0, nbr_parts 0
[ 5499.366246] st 0:0:0:0: [st0] Formatting tape with two partitions (FDP).
[ 5499.366826] st 0:0:0:0: [st0] Sent partition page length is 12 bytes. needs_format: 0
[ 5499.367424] st 0:0:0:0: [st0] PP: max 0, add 1, xdp 4, psum 03, pofmetc 0 rec 03, units 00, sizes: 65535 65535
[ 5499.368024] st 0:0:0:0: [st0] MP: 11 0a 00 01 98 03 00 00 ff ff ff ff
[ 5499.369842] st 0:0:0:0: [st0] Error: 802, cmd: 15 10 0 0 18 0
[ 5499.370495] st 0:0:0:0: [st0] Sense Key : Illegal Request [current]
[ 5499.371109] st 0:0:0:0: [st0] Add. Sense: Invalid field in parameter list
[ 5499.371714] st 0:0:0:0: [st0] Partitioning of tape failed.

Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services

----- Original Message -----
From: "Laurence Oberman" <lober...@redhat.com>
To: "Emmanuel Florac" <eflo...@intellique.com>
Cc: "Laurence Oberman" <oberma...@gmail.com>, "Kai Makisara" <kai.makis...@kolumbus.fi>, linux-scsi@vger.kernel.org
Sent: Wednesday, January 6, 2016 10:23:34 AM
Subject: Re: st driver doesn't seem to grok LTO partitioning

Hello Emmanuel

I am using this device, it's an Ultrium 5 (LTO5). It's an older changer and I am unable to update the firmware, still working on that.

What version of mt are you using? I am testing using a RHEL7.2 base and the upstream patched kernel.

Linux example.redhat.com 4.3.3 #1 SMP Tue Jan 5 15:58:47 EST 2016 x86_64 x86_64 x86_64 GNU/Linux

# tapeinfo -f /dev/st0
Product Type: Tape Drive
Vendor ID: 'QUANTUM '
Product ID: 'ULTRIUM 5 '
Revision: '3060'
Attached Changer API: No
SerialNumber: 'HU1023AKHE'
MinBlock: 1
MaxBlock: 16777215
SCSI ID: 0
SCSI LUN: 0
Ready: yes
BufferedMode: yes
Medium Type: Not Loaded
Density Code: 0x58
BlockSize: 512
DataCompEnabled: yes
DataCompCapable: yes
DataDeCompEnabled: yes
CompType: 0x1
DeCompType: 0x1
BOP: yes
Block Position: 0
Partition 0 Remaining Kbytes: 1541692
Partition 0 Size in Kbytes: 1541692
ActivePartition: 0
EarlyWarningSize: 0
NumPartitions: 0
MaxPartitions: 0

Drive is working fine,

# mt -f /dev/st0 status
SCSI 2 tape drive:
File number=0, block number=0, partition=0.
Tape block size 512 bytes. Density code 0x58 (no translation).
Soft error count since last status=0
General status bits on (4101):
 BOT ONLINE IM_REP_EN

This is what I get when I try to partition, and I believe this may be a firmware issue for me.

mt -f /dev/st0 stsetoption can-partitions

[ 5343.620005] st
Re: st driver doesn't seem to grok LTO partitioning
I left the log of the failure to partition out. Here it is:

# mt -f /dev/nst0 mkpartition 1
/dev/nst0: Input/output error

[ 5499.341648] st 0:0:0:0: [st0] Block limits 1 - 16777215 bytes.
[ 5499.342903] st 0:0:0:0: [st0] Mode sense. Length 11, medium 0, WBS 10, BLL 8
[ 5499.343523] st 0:0:0:0: [st0] Density 58, tape length: 0, drv buffer: 1
[ 5499.344114] st 0:0:0:0: [st0] Block size: 512, buffer size: 4096 (8 blocks).
[ 5499.344702] st 0:0:0:0: [st0] Loading tape.
[ 5499.359733] st 0:0:0:0: [st0] Block limits 1 - 16777215 bytes.
[ 5499.360970] st 0:0:0:0: [st0] Mode sense. Length 11, medium 0, WBS 10, BLL 8
[ 5499.361584] st 0:0:0:0: [st0] Density 58, tape length: 0, drv buffer: 1
[ 5499.362165] st 0:0:0:0: [st0] Block size: 512, buffer size: 4096 (8 blocks).
[ 5499.363851] st 0:0:0:0: [st0] Partition page length is 10 bytes.
[ 5499.364468] st 0:0:0:0: [st0] PP: max 0, add 0, xdp 0, psum 03, pofmetc 0,rec 03, units 09, sizes: 1541 65535
[ 5499.365074] st 0:0:0:0: [st0] MP: 11 08 00 00 18 03 09 00 06 05 ff ff
[ 5499.365658] st 0:0:0:0: [st0] psd_cnt 2, max.parts 0, nbr_parts 0
[ 5499.366246] st 0:0:0:0: [st0] Formatting tape with two partitions (FDP).
[ 5499.366826] st 0:0:0:0: [st0] Sent partition page length is 12 bytes. needs_format: 0
[ 5499.367424] st 0:0:0:0: [st0] PP: max 0, add 1, xdp 4, psum 03, pofmetc 0 rec 03, units 00, sizes: 65535 65535
[ 5499.368024] st 0:0:0:0: [st0] MP: 11 0a 00 01 98 03 00 00 ff ff ff ff
[ 5499.369842] st 0:0:0:0: [st0] Error: 802, cmd: 15 10 0 0 18 0
[ 5499.370495] st 0:0:0:0: [st0] Sense Key : Illegal Request [current]
[ 5499.371109] st 0:0:0:0: [st0] Add. Sense: Invalid field in parameter list
[ 5499.371714] st 0:0:0:0: [st0] Partitioning of tape failed.
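For readers following the PP:/MP: pairs in the log above, here is a minimal sketch of how the decoded "PP:" fields appear to map onto the raw "MP:" bytes. The struct and function names are hypothetical and not taken from the st driver; the byte positions are inferred from the two PP:/MP: pairs in this log and from the usual layout of the Medium Partition mode page (page code 0x11), so treat it as an illustration rather than the driver's code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the fields shown on the "PP:" log lines. */
struct part_page {
    uint8_t  max_add;  /* byte 2: maximum additional partitions */
    uint8_t  add;      /* byte 3: additional partitions defined */
    uint8_t  xdp;      /* byte 4, bits 7..5 (4 means only the top bit, FDP, is set) */
    uint8_t  psum;     /* byte 4, bits 4..3: partition size unit of measure */
    uint8_t  pofmetc;  /* byte 4, bits 2..0 */
    uint8_t  rec;      /* byte 5: medium format recognition */
    uint8_t  units;    /* byte 6: partition units */
    uint16_t size[2];  /* big-endian words at bytes 8..9 and 10..11 */
};

/*
 * Decode the 12 bytes printed on an "MP:" line.
 * Returns 0 on success, -1 if the buffer is too short or is not page 0x11.
 * Note: when the page length byte is 0x08 the last two bytes lie past the
 * page itself; the log prints them anyway, so this sketch decodes them too.
 */
int decode_part_page(const uint8_t *mp, size_t len, struct part_page *out)
{
    assert(mp != NULL && out != NULL);
    if (len < 12 || (mp[0] & 0x3f) != 0x11)
        return -1;

    out->max_add = mp[2];
    out->add     = mp[3];
    out->xdp     = (mp[4] & 0xe0) >> 5;
    out->psum    = (mp[4] & 0x18) >> 3;
    out->pofmetc = mp[4] & 0x07;
    out->rec     = mp[5];
    out->units   = mp[6];
    out->size[0] = (uint16_t)((mp[8] << 8) | mp[9]);
    out->size[1] = (uint16_t)((mp[10] << 8) | mp[11]);
    return 0;
}
```

Feeding it the bytes `11 08 00 00 18 03 09 00 06 05 ff ff` gives psum 3, units 9 and sizes 1541 and 65535, which matches the first PP: line; the bytes sent with MODE SELECT (`11 0a 00 01 98 ...`) decode to add 1 and xdp 4, which matches the second.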
Laurence Oberman Principal Software Maintenance Engineer Red Hat Global Support Services - Original Message - From: "Laurence Oberman" <lober...@redhat.com> To: "Emmanuel Florac" <eflo...@intellique.com> Cc: "Laurence Oberman" <oberma...@gmail.com>, "Kai Makisara" <kai.makis...@kolumbus.fi>, linux-scsi@vger.kernel.org Sent: Wednesday, January 6, 2016 10:23:34 AM Subject: Re: st driver doesn't seem to grok LTO partitioning Hello Emanuel I am using this device, its an Ultrium 5 (LTO5) Its an older changer and I am unable to update the firmware, still working on that. What version of mt are you using, as I am testing using a RHEL7.2 base and the upstream patched kernel. Linux example.redhat.com 4.3.3 #1 SMP Tue Jan 5 15:58:47 EST 2016 x86_64 x86_64 x86_64 GNU/Linux # tapeinfo -f /dev/st0 Product Type: Tape Drive Vendor ID: 'QUANTUM ' Product ID: 'ULTRIUM 5 ' Revision: '3060' Attached Changer API: No SerialNumber: 'HU1023AKHE' MinBlock: 1 MaxBlock: 16777215 SCSI ID: 0 SCSI LUN: 0 Ready: yes BufferedMode: yes Medium Type: Not Loaded Density Code: 0x58 BlockSize: 512 DataCompEnabled: yes DataCompCapable: yes DataDeCompEnabled: yes CompType: 0x1 DeCompType: 0x1 BOP: yes Block Position: 0 Partition 0 Remaining Kbytes: 1541692 Partition 0 Size in Kbytes: 1541692 ActivePartition: 0 EarlyWarningSize: 0 NumPartitions: 0 MaxPartitions: 0 Drive is working fine, # mt -f /dev/st0 status SCSI 2 tape drive: File number=0, block number=0, partition=0. Tape block size 512 bytes. Density code 0x58 (no translation). Soft error count since last status=0 General status bits on (4101): BOT ONLINE IM_REP_EN This is what I get when I try and partition and I believe this may be a firmware issue for me. mt -f /dev/st0 stsetoption can-partitions [ 5343.620005] st 0:0:0:0: [st0] Block limits 1 - 16777215 bytes. [ 5343.621424] st 0:0:0:0: [st0] Mode sense. 
Length 11, medium 0, WBS 10, BLL 8 [ 5343.622005] st 0:0:0:0: [st0] Density 58, tape length: 0, drv buffer: 1 [ 5343.622606] st 0:0:0:0: [st0] Block size: 512, buffer size: 4096 (8 blocks). [ 5343.623208] st 0:0:0:0: [st0] Mode 0 options: buffer writes: 1, async writes: 1, read ahead: 1 [ 5343.623810] st 0:0:0:0: [st0] can bsr: 1, two FMs: 0, fast mteom: 0, auto lock: 0, [ 5343.624413] st 0:0:0:0: [st0] defs for wr: 0, no block limits: 0, partitions: 1, s2 log: 0 [ 5343.625011] st 0:0:0:0: [st0] sysv: 0 nowait: 0 sili: 0 nowait_filemark: 0 [ 5343.625623] st 0:0:0:0: [st0] debugging: 1 [ 5343.626222] st 0:0:0:0: [st0] Rewinding tape. # mt -f /dev/nst0 mkpartition 1 /dev/nst0: Input/output error Laurence Oberman Principal Software Maintenance Engineer Red Hat Global Support Services - Original Message - From: "Emmanuel Florac" <eflo...@intellique.com> To: "Laurence Oberman" <lober...@redhat.com> Cc: "Laurence Oberman" <oberma...@gmail.com>, "Kai Makisara" <kai.makis...@kolumbus.fi>, linux-scsi@vger.kernel.org Sent: Wednesday, January 6, 2016 10:10:49 AM Subject: Re: st driver doesn't seem to grok LTO partitioning Le Tue, 5 Jan 2016 16:55:04 -0500 (EST) Laurence Oberman <lober...@redhat
Re: st driver doesn't seem to grok LTO partitioning
Testing the patch here in the lab, it seems my firmware will need to be updated to support more than 1 partition. Looking into that now. [ 193.647807] st: Version 20160104, fixed bufsize 32768, s/g segs 256 [ 193.648992] st: Debugging enabled debug_flag = 1 [ 193.650907] st 0:0:0:0: Attached scsi tape st0 [ 193.652046] st 0:0:0:0: st0: try direct i/o: yes (alignment 4 B) [ 280.069260] st 0:0:0:0: [st0] Block limits 1 - 16777215 bytes. [ 280.070543] st 0:0:0:0: [st0] Mode sense. Length 11, medium 0, WBS 10, BLL 8 [ 280.073068] st 0:0:0:0: [st0] Density 58, tape length: 0, drv buffer: 1 [ 280.073725] st 0:0:0:0: [st0] Block size: 0, buffer size: 4096 (1 blocks). mt -f /dev/st0 stsetoption can-partitions [ 676.835972] st 0:0:0:0: [st0] Block limits 1 - 16777215 bytes. [ 676.837403] st 0:0:0:0: [st0] Mode sense. Length 11, medium 0, WBS 10, BLL 8 [ 676.838404] st 0:0:0:0: [st0] Density 58, tape length: 0, drv buffer: 1 [ 676.838880] st 0:0:0:0: [st0] Block size: 0, buffer size: 4096 (1 blocks). [ 676.840383] st 0:0:0:0: [st0] Mode 0 options: buffer writes: 1, async writes: 1, read ahead: 1 [ 676.840880] st 0:0:0:0: [st0] can bsr: 1, two FMs: 0, fast mteom: 0, auto lock: 0, [ 676.842424] st 0:0:0:0: [st0] defs for wr: 0, no block limits: 0, partitions: 1, s2 log: 0 [ 676.842937] st 0:0:0:0: [st0] sysv: 0 nowait: 0 sili: 0 nowait_filemark: 0 [ 676.844524] st 0:0:0:0: [st0] debugging: 1 [ 676.845042] st 0:0:0:0: [st0] Rewinding tape. mt -f /dev/nst0 mkpartition 1 [ 798.711408] st 0:0:0:0: [st0] Block limits 1 - 16777215 bytes. [ 798.712799] st 0:0:0:0: [st0] Mode sense. Length 11, medium 0, WBS 10, BLL 8 [ 798.713948] st 0:0:0:0: [st0] Density 58, tape length: 0, drv buffer: 1 [ 798.714504] st 0:0:0:0: [st0] Block size: 0, buffer size: 4096 (1 blocks). [ 798.716227] st 0:0:0:0: [st0] Loading tape. [ 798.731230] st 0:0:0:0: [st0] Block limits 1 - 16777215 bytes. [ 798.732874] st 0:0:0:0: [st0] Mode sense. 
Length 11, medium 0, WBS 10, BLL 8 [ 798.734269] st 0:0:0:0: [st0] Density 58, tape length: 0, drv buffer: 1 [ 798.734971] st 0:0:0:0: [st0] Block size: 0, buffer size: 4096 (1 blocks). [ 798.737572] st 0:0:0:0: [st0] Partition page length is 10 bytes. [ 798.739162] st 0:0:0:0: [st0] PP: max 0, add 0, xdp 0, psum 03, pofmetc 0,rec 03, units 09, sizes: 1541 65535 [ 798.739974] st 0:0:0:0: [st0] MP: 11 08 00 00 18 03 09 00 06 05 ff ff [ 798.740810] st 0:0:0:0: [st0] psd_cnt 2, max.parts 0, nbr_parts 0 [ 798.744194] st 0:0:0:0: [st0] Formatting tape with two partitions (FDP). [ 798.745045] st 0:0:0:0: [st0] Sent partition page length is 12 bytes. needs_format: 0 [ 798.747718] st 0:0:0:0: [st0] PP: max 0, add 1, xdp 4, psum 03, pofmetc 0 rec 03, units 00, sizes: 65535 65535 [ 798.748558] st 0:0:0:0: [st0] MP: 11 0a 00 01 98 03 00 00 ff ff ff ff [ 798.752622] st 0:0:0:0: [st0] Error: 802, cmd: 15 10 0 0 18 0 [ 798.753465] st 0:0:0:0: [st0] Sense Key : Illegal Request [current] [ 798.754289] st 0:0:0:0: [st0] Add. Sense: Invalid field in parameter list [ 798.757546] st 0:0:0:0: [st0] Partitioning of tape failed. Laurence Oberman Principal Software Maintenance Engineer Red Hat Global Support Services From: Emmanuel Florac <eflo...@intellique.com> Date: Mon, Jan 4, 2016 at 6:46 AM Subject: Re: st driver doesn't seem to grok LTO partitioning To: Kai Makisara <kai.makis...@kolumbus.fi> Cc: linux-scsi@vger.kernel.org Le Mon, 4 Jan 2016 12:22:34 +0200 (EET) Kai Makisara <kai.makis...@kolumbus.fi> écrivait: > Here is again a new version of the patch. This does load before > partitioning. The code performing default partitioning (FDP=1) has > also been slightly modified (two more bits of the original mode page > retained). > > The patch has been tested with my DDS-4 drive. That works fine for me. I'm going to do some testing with other drives I have (LTO-3 -- should fail -- and LTO-5). 
# modprobe st Jan 4 12:31:53 shakuhachi kernel: st: Version 20160104, fixed bufsize 32768, s/g segs 256 Jan 4 12:31:53 shakuhachi kernel: st 7:0:0:0: Attached scsi tape st0 Jan 4 12:31:53 shakuhachi kernel: st 7:0:0:0: st0: try direct i/o: yes (alignment 512 B) Jan 4 12:31:53 shakuhachi kernel: st 7:0:0:0: st0: Block limits 1 - 16777215 bytes. Jan 4 12:31:53 shakuhachi kernel: st 7:0:0:0: st0: Mode sense. Length 11, medium 0, WBS 10, BLL 8 Jan 4 12:31:53 shakuhachi kernel: st 7:0:0:0: st0: Density 5a, tape length: 0, drv buffer: 1 Jan 4 12:31:53 shakuhachi kernel: st 7:0:0:0: st0: Block size: 0, buffer size: 4096 (1 blocks). Jan 4 12:31:53 shakuhachi kernel: st 7:0:0:0: st0: Block limits 1 - 16777215 bytes. Jan 4 12:31:53 shakuhachi kernel: st 7:0:0:0: st0: Mode sense. Length 11, medium 0, WBS 10, BLL 8 Jan 4 12:31:53 shakuhachi kernel: st 7:0:0:0: st0: Density 5a, tape length: 0, drv buffer: 1 Jan 4 12:31:53 shakuhachi kernel: st 7:0:0:0: st0: Block size: 0, buffer size: 4096 (1 blocks). # mt -f /dev/st0
Re: st driver doesn't seem to grok LTO partitioning
I am just waiting on some LTO5 tape cartridges and then will start working on this.
I only have LTO cartridges so had to order a couple of LTO5's.

On Tue, Dec 22, 2015 at 5:04 AM, Emmanuel Florac <eflo...@intellique.com> wrote:
> On Tue, 22 Dec 2015 02:20:31 -0500
> Laurence Oberman <oberma...@gmail.com> wrote:
>
>> I also have access to newer hardware if needed. I have started
>> reviewing all of this and will post back to this thread.
>> Emmanuel can you summarize what you would like to achieve and we will
>> all work on this together.
>
> I'd like to be able to partition LTO media through standard commands,
> like "mt mkpartition", mostly to be able to create LTFS tapes without
> relying on hard to compile code from IBM/HP/Quantum/Oracle.
>
> --
> 
> Emmanuel Florac | Direction technique
> | Intellique
> | <eflo...@intellique.com>
> | +33 1 78 94 84 02
> 
Re: [PATCH] bnx2i: fix spelling mistake "complection" -> "completion"
- Original Message -
> From: "Colin King" <colin.k...@canonical.com>
> To: qlogic-storage-upstr...@qlogic.com, "James E . J . Bottomley" <j...@linux.vnet.ibm.com>, "Martin K . Petersen" <martin.peter...@oracle.com>, linux-scsi@vger.kernel.org
> Cc: linux-ker...@vger.kernel.org
> Sent: Saturday, June 4, 2016 3:14:30 PM
> Subject: [PATCH] bnx2i: fix spelling mistake "complection" -> "completion"
>
> From: Colin Ian King <colin.k...@canonical.com>
>
> trivial fix to spelling mistake in printk message
>
> Signed-off-by: Colin Ian King <colin.k...@canonical.com>
> ---
>  drivers/scsi/bnx2i/bnx2i_hwi.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/scsi/bnx2i/bnx2i_hwi.c b/drivers/scsi/bnx2i/bnx2i_hwi.c
> index fb072cc..42921db 100644
> --- a/drivers/scsi/bnx2i/bnx2i_hwi.c
> +++ b/drivers/scsi/bnx2i/bnx2i_hwi.c
> @@ -2417,7 +2417,7 @@ static void bnx2i_process_conn_destroy_cmpl(struct bnx2i_hba *hba,
>  	ep = bnx2i_find_ep_in_destroy_list(hba, conn_destroy->iscsi_conn_id);
>  	if (!ep) {
>  		printk(KERN_ALERT "bnx2i_conn_destroy_cmpl: no pending "
> -			"offload request, unexpected complection\n");
> +			"offload request, unexpected completion\n");
>  		return;
>  	}
>
> --
> 2.8.1

Simple fix.
Reviewed-by: Laurence Oberman <lober...@redhat.com>
Re: [PREEMPT-RT] [PATCH v2] scsi/fcoe: convert to kworker
- Original Message -
> From: "Sebastian Andrzej Siewior" <bige...@linutronix.de>
> To: "Laurence Oberman" <lober...@redhat.com>, "James Bottomley" <j...@linux.vnet.ibm.com>
> Cc: "Christoph Hellwig" <h...@infradead.org>, linux-scsi@vger.kernel.org, "Martin K. Petersen" <martin.peter...@oracle.com>, "Vasu Dev" <vasu@intel.com>, r...@linutronix.de, fcoe-de...@open-fcoe.org, "Chad Dupuis" <chad.dup...@qlogic.com>
> Sent: Thursday, June 9, 2016 9:09:37 AM
> Subject: Re: [PREEMPT-RT] [PATCH v2] scsi/fcoe: convert to kworker
>
> On 04/22/2016 06:39 PM, Laurence Oberman wrote:
> > I have fcoe for testing.
> > I will pull this in next week and test it.
>
> any update?
>
> > Laurence Oberman
> > Principal Software Maintenance Engineer
> > Red Hat Global Support Services
>
> Sebastian

Hello

Apologies, somehow this fell off my radar.
I will get the FCOE test bed up and get it done ASAP.

Regards
Laurence
Re: [PATCH] tcm_qla2xxx Add SCSI command jammer/discard capability to the tcm_qla2xxx module
- Original Message - > From: "Himanshu Madhani" <himanshu.madh...@qlogic.com> > To: "Laurence Oberman" <lober...@redhat.com>, "Nicholas A. Bellinger" > <n...@linux-iscsi.org> > Cc: "Bart Van Assche" <bart.vanass...@sandisk.com>, "linux-scsi" > <linux-scsi@vger.kernel.org>, "target-devel" > <target-de...@vger.kernel.org>, "Quinn Tran" <quinn.t...@qlogic.com> > Sent: Monday, May 9, 2016 1:08:36 PM > Subject: Re: [PATCH] tcm_qla2xxx Add SCSI command jammer/discard capability > to the tcm_qla2xxx module > > On 5/9/16, 7:56 AM, "Laurence Oberman" <lober...@redhat.com> wrote: > > > > > > > > >- Original Message - > >> From: "Laurence Oberman" <lober...@redhat.com> > >> To: "Nicholas A. Bellinger" <n...@linux-iscsi.org> > >> Cc: "Himanshu Madhani" <himanshu.madh...@qlogic.com>, "Bart Van Assche" > >> <bart.vanass...@sandisk.com>, "linux-scsi" > >> <linux-scsi@vger.kernel.org>, "target-devel" > >> <target-de...@vger.kernel.org>, "Quinn Tran" <quinn.t...@qlogic.com> > >> Sent: Monday, April 4, 2016 6:50:03 PM > >> Subject: Re: [PATCH] tcm_qla2xxx Add SCSI command jammer/discard > >> capability to the tcm_qla2xxx module > >> > >> Hello Nicholas > >> > >> Its fixed now. > >> Many Thanks. > >> > >> $ scripts/checkpatch.pl > >> 0001-tcm_qla2xxx-Add-SCSI-command-jammer-discard-capabili.patch > >> WARNING: added, moved or deleted file(s), does MAINTAINERS need updating? > >> #12: > >> new file mode 100644 > >> > >> total: 0 errors, 1 warnings, 91 lines checked > >> > >> 0001-tcm_qla2xxx-Add-SCSI-command-jammer-discard-capabili.patch has style > >> problems, please review. > >> > >> NOTE: If any of the errors are false positives, please report > >> them to the maintainer, see CHECKPATCH in MAINTAINERS. 
> >> > >> > >> > >> Tested by: Laurence Oberman <lober...@redhat.com> > >> Signed-off-by: Laurence Oberman <lober...@redhat.com> > >> --- > >> Documentation/scsi/tcm_qla2xxx.txt | 22 ++ > >> drivers/scsi/qla2xxx/Kconfig |9 + > >> drivers/scsi/qla2xxx/tcm_qla2xxx.c | 20 > >> drivers/scsi/qla2xxx/tcm_qla2xxx.h |1 + > >> 4 files changed, 52 insertions(+), 0 deletions(-) > >> create mode 100644 Documentation/scsi/tcm_qla2xxx.txt > >> > >> diff --git a/Documentation/scsi/tcm_qla2xxx.txt > >> b/Documentation/scsi/tcm_qla2xxx.txt > >> new file mode 100644 > >> index 000..c3a670a > >> --- /dev/null > >> +++ b/Documentation/scsi/tcm_qla2xxx.txt > >> @@ -0,0 +1,22 @@ > >> +tcm_qla2xxx jam_host attribute > >> +-- > >> +There is now a new module endpoint atribute called jam_host > >> +attribute: jam_host: boolean=0/1 > >> +This attribute and accompanying code is only included if the > >> +Kconfig parameter TCM_QLA2XXX_DEBUG is set to Y > >> +By default this jammer code and functionality is disabled > >> + > >> +Use this attribute to control the discarding of SCSI commands to a > >> +selected host. > >> +This may be useful for testing error handling and simulating slow drain > >> +and other fabric issues. > >> + > >> +Setting a boolean of 1 for the jam_host attribute for a particular host > >> + will discard the commands for that host. > >> +Reset back to 0 to stop the jamming. 
> >> + > >> +Enable host 4 to be jammed > >> +echo 1 > > >> /sys/kernel/config/target/qla2xxx/21:00:00:24:ff:27:8f:ae/tpgt_1/attrib/jam_host > >> + > >> +Disable jamming on host 4 > >> +echo 0 > > >> /sys/kernel/config/target/qla2xxx/21:00:00:24:ff:27:8f:ae/tpgt_1/attrib/jam_host > >> diff --git a/drivers/scsi/qla2xxx/Kconfig b/drivers/scsi/qla2xxx/Kconfig > >> index 10aa18b..67c0d5a 100644 > >> --- a/drivers/scsi/qla2xxx/Kconfig > >> +++ b/drivers/scsi/qla2xxx/Kconfig > >> @@ -36,3 +36,12 @@ config TCM_QLA2XXX > >>default n > >>---help--- > >>Say Y here to enable the TCM_QLA2XXX fabric module for QLogic 24xx+ > >>series > >>target mode HBAs > >> + &
Re: [PATCH] aic7xxx: fix wrong return values
> 	}
>
> 	if ((ahc->features & AHC_TWIN) != 0) {
> 		if (ahc_alloc_tstate(ahc, ahc->our_id_b, 'B') == NULL) {
> 			printk("%s: unable to allocate ahc_tmode_tstate. "
> 			       "Failing attach\n", ahc_name(ahc));
> -			return (ENOMEM);
> +			return -ENOMEM;
> 		}
> 	}
>
> @@ -5660,7 +5660,7 @@ ahc_suspend(struct ahc_softc *ahc)
>
> 	if (LIST_FIRST(&ahc->pending_scbs) != NULL) {
> 		ahc_unpause(ahc);
> -		return (EBUSY);
> +		return -EBUSY;
> 	}
>
>  #ifdef AHC_TARGET_MODE
> @@ -5671,7 +5671,7 @@ ahc_suspend(struct ahc_softc *ahc)
> 	 */
> 	if (ahc->pending_device != NULL) {
> 		ahc_unpause(ahc);
> -		return (EBUSY);
> +		return -EBUSY;
> 	}
>  #endif
> 	ahc_shutdown(ahc);
> @@ -6908,7 +6908,7 @@ ahc_loadseq(struct ahc_softc *ahc)
> 		printk("\n%s: Program too large for instruction memory "
> 		       "size of %d!\n", ahc_name(ahc),
> 		       ahc->instruction_ram_size);
> -		return (ENOMEM);
> +		return -ENOMEM;
> 	}
>
> 	/*
> diff --git a/drivers/scsi/aic7xxx/aic7xxx_osm.c b/drivers/scsi/aic7xxx/aic7xxx_osm.c
> index fc6a831..78433f6 100644
> --- a/drivers/scsi/aic7xxx/aic7xxx_osm.c
> +++ b/drivers/scsi/aic7xxx/aic7xxx_osm.c
> @@ -835,7 +835,7 @@ ahc_dma_tag_create(struct ahc_softc *ahc, bus_dma_tag_t parent,
>
> 	dmat = kmalloc(sizeof(*dmat), GFP_ATOMIC);
> 	if (dmat == NULL)
> -		return (ENOMEM);
> +		return -ENOMEM;
>
> 	/*
> 	 * Linux is very simplistic about DMA memory. For now don't
> @@ -864,7 +864,7 @@ ahc_dmamem_alloc(struct ahc_softc *ahc, bus_dma_tag_t dmat, void** vaddr,
> 	*vaddr = pci_alloc_consistent(ahc->dev_softc,
> 				      dmat->maxsize, mapp);
> 	if (*vaddr == NULL)
> -		return ENOMEM;
> +		return -ENOMEM;
> 	return 0;
>  }
>
> @@ -1096,7 +1096,7 @@ ahc_linux_register_host(struct ahc_softc *ahc, struct scsi_host_template *templa
> 	template->name = ahc->description;
> 	host = scsi_host_alloc(template, sizeof(struct ahc_softc *));
> 	if (host == NULL)
> -		return (ENOMEM);
> +		return -ENOMEM;
>
> 	*((struct ahc_softc **)host->hostdata) = ahc;
> 	ahc->platform_data->host = host;
> @@ -1215,7 +1215,7 @@ ahc_platform_alloc(struct ahc_softc *ahc, void *platform_arg)
> 	ahc->platform_data =
> 	    kzalloc(sizeof(struct ahc_platform_data), GFP_ATOMIC);
> 	if (ahc->platform_data == NULL)
> -		return (ENOMEM);
> +		return -ENOMEM;
> 	ahc->platform_data->irq = AHC_LINUX_NOIRQ;
> 	ahc_lockinit(ahc);
> 	ahc->seltime = (aic7xxx_seltime & 0x3) << 4;
> diff --git a/drivers/scsi/aic7xxx/aic7xxx_osm_pci.c b/drivers/scsi/aic7xxx/aic7xxx_osm_pci.c
> index 0fc14da..8bca7f4 100644
> --- a/drivers/scsi/aic7xxx/aic7xxx_osm_pci.c
> +++ b/drivers/scsi/aic7xxx/aic7xxx_osm_pci.c
> @@ -346,13 +346,13 @@ static int
>  ahc_linux_pci_reserve_io_region(struct ahc_softc *ahc, resource_size_t *base)
>  {
> 	if (aic7xxx_allow_memio == 0)
> -		return (ENOMEM);
> +		return -ENOMEM;
>
> 	*base = pci_resource_start(ahc->dev_softc, 0);
> 	if (*base == 0)
> -		return (ENOMEM);
> +		return -ENOMEM;
> 	if (!request_region(*base, 256, "aic7xxx"))
> -		return (ENOMEM);
> +		return -ENOMEM;
> 	return (0);
>  }
>
> @@ -369,16 +369,16 @@ ahc_linux_pci_reserve_mem_region(struct ahc_softc *ahc,
> 	if (start != 0) {
> 		*bus_addr = start;
> 		if (!request_mem_region(start, 0x1000, "aic7xxx"))
> -			error = ENOMEM;
> +			error = -ENOMEM;
> 		if (error == 0) {
> 			*maddr = ioremap_nocache(start, 256);
> 			if (*maddr == NULL) {
> -				error = ENOMEM;
> +				error = -ENOMEM;
> 				release_mem_region(start, 0x1000);
> 			}
> 		}
> 	} else
> -		error = ENOMEM;
> +		error = -ENOMEM;
> 	return (error);
>  }
>
> diff --git a/drivers/scsi/aic7xxx/aic7xxx_pci.c b/drivers/scsi/aic7xxx/aic7xxx_pci.c
> index 22d5a94..40e1c9b 100644
> --- a/drivers/scsi/aic7xxx/aic7xxx_pci.c
> +++ b/drivers/scsi/aic7xxx/aic7xxx_pci.c
> @@ -806,7 +806,7 @@ ahc_pci_config(struct ahc_softc *ahc, const struct ahc_pci_identity *entry)
>
> 	error = ahc_reset(ahc, /*reinit*/FALSE);
> 	if (error != 0)
> -		return (ENXIO);
> +		return -ENXIO;
>
> 	if ((ahc->features & AHC_DT) != 0) {
> 		u_int sfunct;
> @@ -2387,7 +2387,7 @@ static int
>  ahc_raid_setup(struct ahc_softc *ahc)
>  {
> 	printk("RAID functionality unsupported\n");
> -	return (ENXIO);
> +	return -ENXIO;
>  }
>
>  static int
> --
> 2.5.1

Patch looks simple as the change is straightforward.
However, can you make the code consistent? Some returns have parentheses, some do not.
How did this work before, though, if it was returning a non-negative value to the caller or upper layer?
Has this been tested to work with the changes?

Reviewed-by: Laurence Oberman <lober...@redhat.com>
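The question of how positive return values could have worked before can be illustrated with a small user-space sketch. All function names below are hypothetical stand-ins, not aic7xxx code, and the quoted hunks do not show how the driver's callers actually test the result; the sketch only shows the general point that a caller testing for non-zero sees a positive errno as a failure, while a caller following the kernel convention of testing for a negative value does not.

```c
#include <errno.h>

/* Old style: positive errno on failure, as in the lines the patch removes. */
int alloc_old_style(int fail)
{
    return fail ? ENOMEM : 0;
}

/* New style: negative errno on failure, as in the lines the patch adds. */
int alloc_new_style(int fail)
{
    return fail ? -ENOMEM : 0;
}

/* A caller that treats any non-zero value as an error works with both styles. */
int failed_nonzero(int ret)
{
    return ret != 0;
}

/* A caller that only treats negative values as errors misses the old style. */
int failed_negative(int ret)
{
    return ret < 0;
}
```

Under these assumptions, `failed_nonzero(alloc_old_style(1))` reports a failure but `failed_negative(alloc_old_style(1))` does not, which is one way the old code could have appeared to work on some paths and silently ignored errors on others.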
Re: Connect-IB not performing as well as ConnectX-3 with iSER
- Original Message -
> From: "Bart Van Assche"
> To: "Robert LeBlanc", "Sagi Grimberg"
> Cc: linux-r...@vger.kernel.org, linux-scsi@vger.kernel.org, "Max Gurtovoy"
> Sent: Wednesday, June 22, 2016 4:18:31 AM
> Subject: Re: Connect-IB not performing as well as ConnectX-3 with iSER
>
> On 06/21/2016 10:26 PM, Robert LeBlanc wrote:
> > Srpt keeps crashing couldn't test
>
> If this is reproducible with the latest rc kernel or with any of the
> stable kernels please report this in a separate e-mail, together with
> the crash call stack and information about how to reproduce this.
>
> Thanks,
>
> Bart.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html

Robert

I am exercising the ib_srpt configured via targetlio very heavily in 4.7.0-rc1.
I have no crashes or issues.
I also had 4.5 running ib_srpt with no crashes, although I had some other timeouts etc. depending on the load.

What sort of crashes are you talking about?
Does the system crash, or does ib_srpt dump stack?

Thanks
Laurence
Re: [PATCH] snic: Fix use-after-free in case of a dma mapping error
- Original Message -
> From: "Johannes Thumshirn" <jthumsh...@suse.de>
> To: "Martin K . Petersen" <martin.peter...@oracle.com>, "James Bottomley" <j...@linux.vnet.ibm.com>
> Cc: "Linux SCSI Mailinglist" <linux-scsi@vger.kernel.org>, "Linux Kernel Mailinglist" <linux-ker...@vger.kernel.org>, "Narsimhulu Musini" <nmus...@cisco.com>, "Sesidhar Baddela" <sebad...@cisco.com>, "Johannes Thumshirn" <jthumsh...@suse.de>
> Sent: Thursday, June 23, 2016 8:37:20 AM
> Subject: [PATCH] snic: Fix use-after-free in case of a dma mapping error
>
> If there is a dma mapping error snic kfree()s buf right before printing it.
> Change the order to not accidently trip on memory that's not owned by us
> anymore.
>
> Signed-off-by: Johannes Thumshirn <jthumsh...@suse.de>
> ---
>  drivers/scsi/snic/snic_disc.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/scsi/snic/snic_disc.c b/drivers/scsi/snic/snic_disc.c
> index b0fefd6..b106596 100644
> --- a/drivers/scsi/snic/snic_disc.c
> +++ b/drivers/scsi/snic/snic_disc.c
> @@ -113,11 +113,11 @@ snic_queue_report_tgt_req(struct snic *snic)
>
>  	pa = pci_map_single(snic->pdev, buf, buf_len, PCI_DMA_FROMDEVICE);
>  	if (pci_dma_mapping_error(snic->pdev, pa)) {
> -		kfree(buf);
> -		snic_req_free(snic, rqi);
>  		SNIC_HOST_ERR(snic->shost,
>  			      "Rpt-tgt rspbuf %p: PCI DMA Mapping Failed\n",
>  			      buf);
> +		kfree(buf);
> +		snic_req_free(snic, rqi);
>  		ret = -EINVAL;
>
>  		goto error;
> --
> 2.8.4

Looks fine to me.
Reviewed-by: Laurence Oberman <lober...@redhat.com>
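The ordering the patch establishes (report first, release afterwards) can be shown with a small user-space sketch. This is not the snic code: `fail_mapping` is a hypothetical stand-in, `malloc`/`free` replace the kernel allocator, and the sketch prints the buffer's contents rather than its address (the driver uses `%p`) so that the last use of the buffer is obvious.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for the error path in the patch.
 * buf must be a NUL-terminated, heap-allocated string; ownership passes
 * to this function. The diagnostic is formatted while buf is still ours,
 * and only then is buf released.
 */
int fail_mapping(char *buf, char *msg, size_t msg_len)
{
    assert(buf != NULL && msg != NULL);

    /* Last use of buf: build the message before giving the memory back. */
    snprintf(msg, msg_len, "rspbuf '%s': DMA mapping failed", buf);

    /* Release only after the last use. */
    free(buf);

    return -EINVAL;
}
```

Swapping the `snprintf` and `free` calls would reproduce the pattern the patch removes: the message would be built from memory the function no longer owns.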
Re: [PATCH] tcm_qla2xxx: fix spelling mistake: "seperator" -> "separator"
- Original Message -
> From: "Colin King" <colin.k...@canonical.com>
> To: "James E . J . Bottomley" <j...@linux.vnet.ibm.com>, "Martin K . Petersen" <martin.peter...@oracle.com>, linux-scsi@vger.kernel.org
> Cc: linux-ker...@vger.kernel.org
> Sent: Thursday, June 23, 2016 1:12:25 PM
> Subject: [PATCH] tcm_qla2xxx: fix spelling mistake: "seperator" -> "separator"
>
> From: Colin Ian King <colin.k...@canonical.com>
>
> trivial fix to spelling mistake in pr_err message
>
> Signed-off-by: Colin Ian King <colin.k...@canonical.com>
> ---
>  drivers/scsi/qla2xxx/tcm_qla2xxx.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/scsi/qla2xxx/tcm_qla2xxx.c b/drivers/scsi/qla2xxx/tcm_qla2xxx.c
> index 6643f6f..46fe6f4 100644
> --- a/drivers/scsi/qla2xxx/tcm_qla2xxx.c
> +++ b/drivers/scsi/qla2xxx/tcm_qla2xxx.c
> @@ -1738,7 +1738,7 @@ static struct se_wwn *tcm_qla2xxx_npiv_make_lport(
>
>  	p = strchr(tmp, '@');
>  	if (!p) {
> -		pr_err("Unable to locate NPIV '@' seperator\n");
> +		pr_err("Unable to locate NPIV '@' separator\n");
>  		return ERR_PTR(-EINVAL);
>  	}
>  	*p++ = '\0';
> --
> 2.8.1

Simple change, and it's fine.
Reviewed-by: Laurence Oberman <lober...@redhat.com>
Re: What partition should the MTMKPART argument specify? Was: Re: st driver doesn't seem to grok LTO partitioning
Kai's latest patch passes all my tests on the DAT DDS drive.
It fails on the older LTO3, as it should (un-partitionable).
I don't have the new LTO5 yet; it arrives end of week, I am told.

Testing log
---
[root@srp-server ~]# uname -a
Linux srp-server 4.4.0 #1 SMP Thu Jan 28 15:06:45 EST 2016 x86_64 x86_64 x86_64 GNU/Linux

Storage Changer /dev/sg3:1 Drives, 6 Slots ( 0 Import/Export )
Data Transfer Element 0:Full (Storage Element 2 Loaded)
Storage Element 1:Full
Storage Element 2:Empty
Storage Element 3:Full
Storage Element 4:Full
Storage Element 5:Full
Storage Element 6:Empty

[root@srp-server home]# mtx -f /dev/sg3 unload 2 0
Unloading drive 0 into Storage Element 2...done
[root@srp-server home]# mtx -f /dev/sg3 load 3 0
Loading media from Storage Element 3 into drive 0...done

[root@srp-server home]# sg_map -st -i
/dev/sg2 /dev/nst0 HPDAT72X6 B409
/dev/sg3 HPDAT72X6 B409

[root@srp-server home]# mt -f /dev/st0 stsetoption can-partitions
[root@srp-server home]# mt -f /dev/st0 mkpartition 1

Tape screen shows Format Completed with no errors and I can set to a specific partition.

Feb 04 13:42:27 srp-server kernel: st: Unloaded.
Feb 04 13:43:57 srp-server kernel: st: Version 20160203, fixed bufsize 32768, s/g segs 256
Feb 04 13:43:57 srp-server kernel: st: Debugging enabled debug_flag = 1
Feb 04 13:43:57 srp-server kernel: st 6:0:1:0: Attached scsi tape st0
Feb 04 13:43:57 srp-server kernel: st 6:0:1:0: st0: try direct i/o: yes (alignment 4 B)
Feb 04 13:48:30 srp-server kernel: st 6:0:1:0: [st0] Block limits 1 - 16777215 bytes.
Feb 04 13:48:30 srp-server kernel: st 6:0:1:0: [st0] Mode sense. Length 11, medium 0, WBS 10, BLL 8
Feb 04 13:48:30 srp-server kernel: st 6:0:1:0: [st0] Density 47, tape length: 0, drv buffer: 1
Feb 04 13:48:30 srp-server kernel: st 6:0:1:0: [st0] Block size: 0, buffer size: 4096 (1 blocks).
Feb 04 13:48:30 srp-server kernel: st 6:0:1:0: [st0] Updating partition number in status.
Feb 04 13:48:30 srp-server kernel: st 6:0:1:0: [st0] Got tape pos. blk 0 part 0.
Feb 04 13:48:30 srp-server kernel: st 6:0:1:0: [st0] Mode 0 options: buffer writes: 1, async writes: 1, read ahead: 1 Feb 04 13:48:30 srp-server kernel: st 6:0:1:0: [st0] can bsr: 1, two FMs: 0, fast mteom: 0, auto lock: 0, Feb 04 13:48:30 srp-server kernel: st 6:0:1:0: [st0] defs for wr: 0, no block limits: 0, partitions: 1, s2 log: 0 Feb 04 13:48:30 srp-server kernel: st 6:0:1:0: [st0] sysv: 0 nowait: 0 sili: 0 nowait_filemark: 0 Feb 04 13:48:30 srp-server kernel: st 6:0:1:0: [st0] debugging: 1 Feb 04 13:48:30 srp-server kernel: st 6:0:1:0: [st0] Rewinding tape. Feb 04 13:48:42 srp-server kernel: st 6:0:1:0: [st0] Block limits 1 - 16777215 bytes. Feb 04 13:48:42 srp-server kernel: st 6:0:1:0: [st0] Mode sense. Length 11, medium 0, WBS 10, BLL 8 Feb 04 13:48:42 srp-server kernel: st 6:0:1:0: [st0] Density 47, tape length: 0, drv buffer: 1 Feb 04 13:48:42 srp-server kernel: st 6:0:1:0: [st0] Block size: 0, buffer size: 4096 (1 blocks). Feb 04 13:48:42 srp-server kernel: st 6:0:1:0: [st0] Loading tape. Feb 04 13:48:42 srp-server kernel: st 6:0:1:0: [st0] Error: 802, cmd: 0 0 0 0 0 0 Feb 04 13:48:42 srp-server kernel: st 6:0:1:0: [st0] Sense Key : Unit Attention [current] Feb 04 13:48:42 srp-server kernel: st 6:0:1:0: [st0] Add. Sense: Not ready to ready change, medium may have changed Feb 04 13:48:42 srp-server kernel: st 6:0:1:0: [st0] Block limits 1 - 16777215 bytes. Feb 04 13:48:42 srp-server kernel: st 6:0:1:0: [st0] Mode sense. Length 11, medium 0, WBS 10, BLL 8 Feb 04 13:48:42 srp-server kernel: st 6:0:1:0: [st0] Density 47, tape length: 0, drv buffer: 1 Feb 04 13:48:42 srp-server kernel: st 6:0:1:0: [st0] Block size: 0, buffer size: 4096 (1 blocks). Feb 04 13:48:42 srp-server kernel: st 6:0:1:0: [st0] Partition page length is 10 bytes. 
Feb 04 13:48:42 srp-server kernel: st 6:0:1:0: [st0] PP: max 1, add 0, xdp 0, psum 02, pofmetc 0, rec 03, units 00, sizes: 0 65535 Feb 04 13:48:42 srp-server kernel: st 6:0:1:0: [st0] MP: 11 08 01 00 10 03 00 00 00 00 ff ff Feb 04 13:48:42 srp-server kernel: st 6:0:1:0: [st0] psd_cnt 1, max.parts 1, nbr_parts 0 Feb 04 13:48:42 srp-server kernel: st 6:0:1:0: [st0] Formatting tape with two partitions (1 = 1 MB). Feb 04 13:48:42 srp-server kernel: st 6:0:1:0: [st0] Sent partition page length is 10 bytes. needs_format: 0 Feb 04 13:48:42 srp-server kernel: st 6:0:1:0: [st0] PP: max 1, add 1, xdp 1, psum 02, pofmetc 0, rec 03, units 00, sizes: 1 65535 Feb 04 13:48:42 srp-server kernel: st 6:0:1:0: [st0] MP: 11 08 01 01 30 03 00 00 27 10 ff ff Tested-by: Laurence Oberman <lober...@redhat.com> Laurence Oberman Principal Software Maintenance Engineer Red Hat Global Support Services Laurence Oberman Principal Software Maintenance Engineer Red Hat Global Support Services - Original Message - From: "Douglas Gilbert" <dgilb...@int
Re: What partition should the MTMKPART argument specify? Was: Re: st driver doesn't seem to grok LTO partitioning
Hi Kai

Which kernel was the last patch you attached made against?

Thanks
Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services

----- Original Message -----
From: "Kai Mäkisara (Kolumbus)" <kai.makis...@kolumbus.fi>
To: "Shane M Seymour" <shane.seym...@hpe.com>
Cc: "Laurence Oberman" <lober...@redhat.com>, "Emmanuel Florac" <eflo...@intellique.com>, "Laurence Oberman" <oberma...@gmail.com>, linux-scsi@vger.kernel.org
Sent: Thursday, January 28, 2016 12:04:20 PM
Subject: Re: What partition should the MTMKPART argument specify? Was: Re: st driver doesn't seem to grok LTO partitioning

> On 28.1.2016, at 9.36, Seymour, Shane M <shane.seym...@hpe.com> wrote:
>
> Hi Kai,
>
> With the changes I get a failure partitioning a HP DAT72 drive (DDS-5):
>
> # ./mt -f /dev/st1 stsetoption debug
> # ./mt -f /dev/st1 stsetoption can-partitions
> # ./mt -f /dev/st1 mkpartition 1000
> /dev/st1: Input/output error
>
...
> [ 3976.389605] st 6:0:3:0: [st1] Partition page length is 10 bytes.
> [ 3976.389610] st 6:0:3:0: [st1] PP: max 1, add 0, xdp 0, psum 02, pofmetc 0, rec 03, units 00, sizes: 0 65535
> [ 3976.389614] st 6:0:3:0: [st1] MP: 11 08 01 00 10 03 00 00 00 00 ff ff
> [ 3976.389618] st 6:0:3:0: [st1] psd_cnt 2, max.parts 1, nbr_parts 0
                                           ^ The problem is here
...
> Using a slightly older kernel to partition the DAT72 drive works (same 3
> commands as above):
...
> [ 351.584906] st 6:0:3:0: [st1] Partition page length is 10 bytes.
> [ 351.584908] st 6:0:3:0: [st1] psd_cnt 1, max.parts 1, nbr_parts 0

The old driver computes the psd_cnt from the returned page length. The same applies to the patched driver if the SCSI level of the device < SCSI_3. This works correctly with my drive that reports SCSI_2.

So, the question is: what SCSI level does your device report?

Kai
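Kai's point about deriving psd_cnt from the returned page length can be checked against the "MP:" dumps in this thread. The following is only a sketch in Python, not the st.c code: the offset names, the 8-byte fixed part, and the reading of the flags byte are assumptions taken from the debug output and from the usual layout of the medium partition mode page (0x11).

```python
# Offsets within the medium partition mode page, as read off the "MP:" dump.
# These names are illustrative assumptions, not copied from st.c.
OFF_PAGE_LENGTH = 1
OFF_MAX_ADD_PARTS = 2
OFF_NBR_ADD_PARTS = 3
OFF_FLAGS = 4
FIXED_LENGTH = 8  # assumed bytes before the first 2-byte partition size descriptor


def parse_partition_page(mp: bytes) -> dict:
    """Decode the fields that the st debug output prints for page 0x11."""
    total_len = mp[OFF_PAGE_LENGTH] + 2        # length byte excludes the 2-byte header
    psd_cnt = (total_len - FIXED_LENGTH) // 2  # descriptors implied by the page length
    sizes = [
        int.from_bytes(mp[FIXED_LENGTH + 2 * i:FIXED_LENGTH + 2 * i + 2], "big")
        for i in range(psd_cnt)
    ]
    return {
        "page_code": mp[0] & 0x3F,
        "total_len": total_len,
        "max_add_parts": mp[OFF_MAX_ADD_PARTS],
        "nbr_add_parts": mp[OFF_NBR_ADD_PARTS],
        "psum": (mp[OFF_FLAGS] >> 3) & 0x03,
        "psd_cnt": psd_cnt,
        "sizes": sizes,
    }


# Page returned by the DAT72 (the "MP:" line quoted above).
sensed = parse_partition_page(bytes.fromhex("11 08 01 00 10 03 00 00 00 00 ff ff"))
# Page sent in the failing run on a SCSI level 4 drive
# (bytes taken from Laurence's log elsewhere in this thread).
sent = parse_partition_page(bytes.fromhex("11 0a 01 01 30 03 00 00 ff ff 03 e8"))

print(sensed["total_len"], sensed["psd_cnt"], sensed["sizes"])
print(sent["total_len"], sent["psd_cnt"], sent["sizes"])
```

Under these assumptions the sensed page is 10 bytes long and carries one size descriptor, which matches the "psd_cnt 1" of the older kernel; the 12-byte page with two descriptors is the one the drive rejected with "Invalid field in parameter list".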
Re: What partition should the MTMKPART argument specify? Was: Re: st driver doesn't seem to grok LTO partitioning
Meant to mention: I am still waiting for my new LTO5, and this is the first time I am testing the DAT72.
Shane, have you had the DAT working before this last patch? If so, with which patch?

Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services

----- Original Message -----
From: "Laurence Oberman" <lober...@redhat.com>
To: "Shane M Seymour" <shane.seym...@hpe.com>
Cc: "Kai Mäkisara (Kolumbus)" <kai.makis...@kolumbus.fi>, "Emmanuel Florac" <eflo...@intellique.com>, "Laurence Oberman" <oberma...@gmail.com>, linux-scsi@vger.kernel.org
Sent: Thursday, January 28, 2016 6:23:13 PM
Subject: Re: What partition should the MTMKPART argument specify? Was: Re: st driver doesn't seem to grok LTO partitioning

On My DAT tape with the latest patch
[...]
Re: What partition should the MTMKPART argument specify? Was: Re: st driver doesn't seem to grok LTO partitioning
On my DAT tape with the latest patch:

[root@srp-server ~]# cat /sys/class/scsi_tape/st0/device/scsi_level
4

[root@srp-server ~]# mt -f /dev/st0 stsetoption can-partitions

Jan 28 18:17:49 srp-server kernel: st 6:0:1:0: [st0] Block limits 1 - 16777215 bytes.
Jan 28 18:17:49 srp-server kernel: st 6:0:1:0: [st0] Mode sense. Length 11, medium 0, WBS 10, BLL 8
Jan 28 18:17:49 srp-server kernel: st 6:0:1:0: [st0] Density 47, tape length: 0, drv buffer: 1
Jan 28 18:17:49 srp-server kernel: st 6:0:1:0: [st0] Block size: 0, buffer size: 4096 (1 blocks).
Jan 28 18:17:49 srp-server kernel: st 6:0:1:0: [st0] Mode 0 options: buffer writes: 1, async writes: 1, read ahead: 1
Jan 28 18:17:49 srp-server kernel: st 6:0:1:0: [st0] can bsr: 1, two FMs: 0, fast mteom: 0, auto lock: 0,
Jan 28 18:17:49 srp-server kernel: st 6:0:1:0: [st0] defs for wr: 0, no block limits: 0, partitions: 1, s2 log: 0
Jan 28 18:17:49 srp-server kernel: st 6:0:1:0: [st0] sysv: 0 nowait: 0 sili: 0 nowait_filemark: 0
Jan 28 18:17:49 srp-server kernel: st 6:0:1:0: [st0] debugging: 1
Jan 28 18:17:49 srp-server kernel: st 6:0:1:0: [st0] Rewinding tape.

[root@srp-server ~]# mt -f /dev/st0 mkpartition 1000

Jan 28 18:18:01 srp-server kernel: st 6:0:1:0: [st0] Block limits 1 - 16777215 bytes.
Jan 28 18:18:01 srp-server kernel: st 6:0:1:0: [st0] Mode sense. Length 11, medium 0, WBS 10, BLL 8
Jan 28 18:18:01 srp-server kernel: st 6:0:1:0: [st0] Density 47, tape length: 0, drv buffer: 1
Jan 28 18:18:01 srp-server kernel: st 6:0:1:0: [st0] Block size: 0, buffer size: 4096 (1 blocks).
Jan 28 18:18:01 srp-server kernel: st 6:0:1:0: [st0] Loading tape.
Jan 28 18:18:01 srp-server kernel: st 6:0:1:0: [st0] Error: 802, cmd: 0 0 0 0 0 0
Jan 28 18:18:01 srp-server kernel: st 6:0:1:0: [st0] Sense Key : Unit Attention [current]
Jan 28 18:18:01 srp-server kernel: st 6:0:1:0: [st0] Add. Sense: Not ready to ready change, medium may have changed
Jan 28 18:18:01 srp-server kernel: st 6:0:1:0: [st0] Block limits 1 - 16777215 bytes.
Jan 28 18:18:01 srp-server kernel: st 6:0:1:0: [st0] Mode sense. Length 11, medium 0, WBS 10, BLL 8
Jan 28 18:18:01 srp-server kernel: st 6:0:1:0: [st0] Density 47, tape length: 0, drv buffer: 1
Jan 28 18:18:01 srp-server kernel: st 6:0:1:0: [st0] Block size: 0, buffer size: 4096 (1 blocks).
Jan 28 18:18:01 srp-server kernel: st 6:0:1:0: [st0] Partition page length is 10 bytes.
Jan 28 18:18:01 srp-server kernel: st 6:0:1:0: [st0] PP: max 1, add 0, xdp 0, psum 02, pofmetc 0, rec 03, units 00, sizes: 0 65535
Jan 28 18:18:01 srp-server kernel: st 6:0:1:0: [st0] MP: 11 08 01 00 10 03 00 00 00 00 ff ff
Jan 28 18:18:01 srp-server kernel: st 6:0:1:0: [st0] psd_cnt 2, max.parts 1, nbr_parts 0
Jan 28 18:18:01 srp-server kernel: st 6:0:1:0: [st0] Formatting tape with two partitions (1 = 1000 MB).
Jan 28 18:18:01 srp-server kernel: st 6:0:1:0: [st0] Sent partition page length is 12 bytes. needs_format: 0
Jan 28 18:18:01 srp-server kernel: st 6:0:1:0: [st0] PP: max 1, add 1, xdp 1, psum 02, pofmetc 0, rec 03, units 00, sizes: 65535 1000
Jan 28 18:18:01 srp-server kernel: st 6:0:1:0: [st0] MP: 11 0a 01 01 30 03 00 00 ff ff 03 e8
Jan 28 18:18:01 srp-server kernel: st 6:0:1:0: [st0] Error: 802, cmd: 15 10 0 0 18 0
Jan 28 18:18:01 srp-server kernel: st 6:0:1:0: [st0] Sense Key : Illegal Request [current]
Jan 28 18:18:01 srp-server kernel: st 6:0:1:0: [st0] Add. Sense: Invalid field in parameter list
Jan 28 18:18:01 srp-server kernel: st 6:0:1:0: [st0] Partitioning of tape failed.
Jan 28 18:18:01 srp-server kernel: st 6:0:1:0: [st0] Rewinding tape.

Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services

----- Original Message -----
From: "Shane M Seymour" <shane.seym...@hpe.com>
To: "Kai Mäkisara (Kolumbus)" <kai.makis...@kolumbus.fi>
Cc: "Laurence Oberman" <lober...@redhat.com>, "Emmanuel Florac" <eflo...@intellique.com>, "Laurence Oberman" <oberma...@gmail.com>, linux-scsi@vger.kernel.org
Sent: Thursday, January 28, 2016 6:12:41 PM
Subject: RE: What partition should the MTMKPART argument specify? Was: Re: st driver doesn't seem to grok LTO partitioning

Hi Kai,

$ pwd
/sys/class/scsi_tape/st1/device
$ cat scsi_level
4

Thanks
Shane

> -----Original Message-----
> From: "Kai Mäkisara (Kolumbus)" [mailto:kai.makis...@kolumbus.fi]
> Sent: Friday, January 29, 2016 4:04 AM
> To: Seymour, Shane M
> Cc: Laurence Oberman; Emmanuel Florac; Laurence Oberman; linux-s...@vger.kernel.org
> Subject: Re: What partition should the MTMKPART argument specify? Was: Re: st driver doesn't seem to grok LTO partitioning
>
> > On 28.1.2016, at 9.36, Seymour, Shane M <shane.seym...@hpe.com> wrote:
> >
> > Hi Kai,
> >
> > With the changes I get a failure partitioning a HP DAT72 drive (DDS-5):
> >
Re: What partition should the MTMKPART argument specify? Was: Re: st driver doesn't seem to grok LTO partitioning
The new patch did not work for me, but I chatted with Shane and I have his mt version. I will update my DAT to the same firmware as his, or newer, and provide a second Tested-by.
I also expect my LTO5 to show up this week, so I will be ready for that.

Thanks, everyone, for keeping tapes alive.

Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services

----- Original Message -----
From: "Kai Mäkisara (Kolumbus)" <kai.makis...@kolumbus.fi>
To: "Shane M Seymour" <shane.seym...@hpe.com>
Cc: "Laurence Oberman" <lober...@redhat.com>, "Emmanuel Florac" <eflo...@intellique.com>, "Laurence Oberman" <oberma...@gmail.com>, linux-scsi@vger.kernel.org
Sent: Monday, February 1, 2016 1:43:26 PM
Subject: Re: What partition should the MTMKPART argument specify? Was: Re: st driver doesn't seem to grok LTO partitioning

> On 1.2.2016, at 8.31, Seymour, Shane M <shane.seym...@hpe.com> wrote:
>
> Hi Kai,
>
> Thanks for the changes the HPE DAT72 DDS5 drive now works as expected:
>

Good. Thanks for testing.

...
>
> I'm asking around again one final time to see if I can lay my hands on a LTO5
> or greater drive so I can test LTO partitioning as well.
>
> The only other thing I can think of (I'm not sure if this is an improvement
> or not) is if bp[pgo + PP_OFF_MAX_ADD_PARTS] + bp[pgo + PP_OFF_NBR_ADD_PARTS]
> (max.parts and nbr_parts in the debug message) is zero just return -EINVAL
> unless you know of any tape drives that report them both as 0 but can be
> partitioned? That is after this:
>
>	DEBC_printk(STp, "psd_cnt %d, max.parts %d, nbr_parts %d\n",
>		    psd_cnt, bp[pgo + PP_OFF_MAX_ADD_PARTS],
>		    bp[pgo + PP_OFF_NBR_ADD_PARTS]);
>
> add (and also turn off the can-partitions option):
>
>	if ((bp[pgo + PP_OFF_MAX_ADD_PARTS] + bp[pgo + PP_OFF_NBR_ADD_PARTS]) == 0) {
>		DEBC_printk(STp, "Drive not partitionable - max.parts+nbr_parts is 0\n");
>		STp->can_partitions = 0;
>		return -EINVAL;
>	}
>
> I'm not especially fussed if you don't want to add that though.
>

I thought about a test like this (only testing the maximum number) but decided not to add it. The reason was that I did not want to change anything that has worked before. I quite trust that the current drives return sense data instead of crashing, and the end result for the user would be the same. However, one can argue that returning EINVAL is better than EIO, but does the user notice?

If the common opinion is that a test like this should be added, I am not against it. It can be added to the code for SCSI >=3, where it does not risk anything for the old drives.

IMHO, can_partitions should not be cleared based on the test. For example, trying to partition a LTO-4 tape in a LTO-5 drive should not disable partitioning. (The mode page should return zero as the maximum number of partitions when a LTO-4 tape is inserted.)

Thanks,
Kai
Re: What partition should the MTMKPART argument specify? Was: Re: st driver doesn't seem to grok LTO partitioning
Hello

I finally got the firmware on my DAT updated. Using Kai's latest patch, I validated the patch on my DAT drive as well.
Thanks to Shane for providing the correct mt code, as that was also one of my problems besides firmware.

[root@srp-server mt-st-1.1-patched]# ./mt -f /dev/st0 stsetoption can-partitions
[root@srp-server mt-st-1.1-patched]# ./mt -f /dev/st0 mkpartition 1000

Took almost 6 minutes to partition this old DDS.

Feb 02 22:25:10 srp-server kernel: st 6:0:1:0: [st0] Mode sense. Length 11, medium 0, WBS 10, BLL 8
Feb 02 22:25:10 srp-server kernel: st 6:0:1:0: [st0] Density 47, tape length: 0, drv buffer: 1
Feb 02 22:25:10 srp-server kernel: st 6:0:1:0: [st0] Block size: 0, buffer size: 4096 (1 blocks).
Feb 02 22:25:10 srp-server kernel: st 6:0:1:0: [st0] Mode 0 options: buffer writes: 1, async writes: 1, read ahead: 1
Feb 02 22:25:10 srp-server kernel: st 6:0:1:0: [st0] can bsr: 1, two FMs: 0, fast mteom: 0, auto lock: 0,
Feb 02 22:25:10 srp-server kernel: st 6:0:1:0: [st0] defs for wr: 0, no block limits: 0, partitions: 1, s2 log: 0
Feb 02 22:25:10 srp-server kernel: st 6:0:1:0: [st0] sysv: 0 nowait: 0 sili: 0 nowait_filemark: 0
Feb 02 22:25:10 srp-server kernel: st 6:0:1:0: [st0] debugging: 1
Feb 02 22:25:10 srp-server kernel: st 6:0:1:0: [st0] Rewinding tape.
Feb 02 22:25:24 srp-server kernel: st 6:0:1:0: [st0] Block limits 1 - 16777215 bytes.
Feb 02 22:25:24 srp-server kernel: st 6:0:1:0: [st0] Mode sense. Length 11, medium 0, WBS 10, BLL 8
Feb 02 22:25:24 srp-server kernel: st 6:0:1:0: [st0] Density 47, tape length: 0, drv buffer: 1
Feb 02 22:25:24 srp-server kernel: st 6:0:1:0: [st0] Block size: 0, buffer size: 4096 (1 blocks).
Feb 02 22:25:24 srp-server kernel: st 6:0:1:0: [st0] Updating partition number in status.
Feb 02 22:25:24 srp-server kernel: st 6:0:1:0: [st0] Got tape pos. blk 0 part 0.
Feb 02 22:25:24 srp-server kernel: st 6:0:1:0: [st0] Loading tape.
Feb 02 22:25:24 srp-server kernel: st 6:0:1:0: [st0] Error: 802, cmd: 0 0 0 0 0 0
Feb 02 22:25:24 srp-server kernel: st 6:0:1:0: [st0] Sense Key : Unit Attention [current]
Feb 02 22:25:24 srp-server kernel: st 6:0:1:0: [st0] Add. Sense: Not ready to ready change, medium may have changed
Feb 02 22:25:24 srp-server kernel: st 6:0:1:0: [st0] Block limits 1 - 16777215 bytes.
Feb 02 22:25:24 srp-server kernel: st 6:0:1:0: [st0] Mode sense. Length 11, medium 0, WBS 10, BLL 8
Feb 02 22:25:24 srp-server kernel: st 6:0:1:0: [st0] Density 47, tape length: 0, drv buffer: 1
Feb 02 22:25:24 srp-server kernel: st 6:0:1:0: [st0] Block size: 0, buffer size: 4096 (1 blocks).
Feb 02 22:25:24 srp-server kernel: st 6:0:1:0: [st0] Partition page length is 10 bytes.
Feb 02 22:25:24 srp-server kernel: st 6:0:1:0: [st0] PP: max 1, add 0, xdp 0, psum 02, pofmetc 0, rec 03, units 00, sizes: 0 65535
Feb 02 22:25:24 srp-server kernel: st 6:0:1:0: [st0] MP: 11 08 01 00 10 03 00 00 00 00 ff ff
Feb 02 22:25:24 srp-server kernel: st 6:0:1:0: [st0] psd_cnt 1, max.parts 1, nbr_parts 0
Feb 02 22:25:24 srp-server kernel: st 6:0:1:0: [st0] Formatting tape with two partitions (1 = 1000 MB).
Feb 02 22:25:24 srp-server kernel: st 6:0:1:0: [st0] Sent partition page length is 10 bytes. needs_format: 0
Feb 02 22:25:24 srp-server kernel: st 6:0:1:0: [st0] PP: max 1, add 1, xdp 1, psum 02, pofmetc 0, rec 03, units 00, sizes: 1000 65535
Feb 02 22:31:45 srp-server kernel: st 6:0:1:0: [st0] Rewinding tape.

I will retest with Shane's latest additions, which he just sent, after first testing with Kai's latest patch on my LTO5.
(Here's hoping I don't have to update the f/w on that one.)

Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services
Re: What partition should the MTMKPART argument specify? Was: Re: st driver doesn't seem to grok LTO partitioning
Given what we see at customers, I am leaning towards the SCSI level <=2 definition, to ensure the older LTO5s are supported. The newer ones should be backwards compatible.
I may have an older LTO5 showing up that won't need a F/W update to work, and I will be able to add a "Tested-by" once I get it.
But let's see what the others have to say.

Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services

----- Original Message -----
From: "Kai Mäkisara (Kolumbus)" <kai.makis...@kolumbus.fi>
To: "Shane M Seymour" <shane.seym...@hpe.com>
Cc: "Laurence Oberman" <lober...@redhat.com>, "Emmanuel Florac" <eflo...@intellique.com>, "Laurence Oberman" <oberma...@gmail.com>, linux-scsi@vger.kernel.org
Sent: Thursday, January 21, 2016 3:58:46 PM
Subject: What partition should the MTMKPART argument specify? Was: Re: st driver doesn't seem to grok LTO partitioning

> On 15.1.2016, at 2.21, Seymour, Shane M <shane.seym...@hpe.com> wrote:
>
> Unfortunately I'm unable to lay my hands on an LTO 5 tape drive so I'm not
> able to test that it works either. If it helps at all I can test in the
> negative and make sure that for an LTO 3 drive it fails gracefully but that's
> about it at the moment.

Thanks to all testers and those who attempted to test. The latest patch applies the standard quite strictly and I think it should work with most drives. The implementation can be fixed later if problems are found.

However, before making the final patch, we should decide which partition the specified size should apply to. For SCSI level <=2 it applies to partition 1. For other drives we may have some freedom to "tune" the definition. The size should apply to the partition the users expect it to apply to.

The current documentation says "the argument gives in megabytes the size of partition 1 that is physically the first partition of the tape".

The documentation I have found for current drives (HP and IBM LTO, IBM 3592, Storagetek T1000) all number the partitions sequentially from the start of the tape. The access time for any partition is probably about the same when wrapwise partitioning is used. It does matter with linear partitioning. Unfortunately, the standards leave the numbering to the implementor.

Partitioning with two partitions is used to store an index in a small partition and to use the rest of the tape for data. In this case, it is probably natural to specify the size of the index. The LTFS definition supports the index in any partition. The open source code I have seen seems to default to the index in partition 0.

The HP and IBM LTO default partitioning (FDP=1) specifies two wraps (minimum) to partition 1 and the rest to 0.

There seem to be a lot of arguments supporting both possible choices. Should we use the existing definition (1) or change it for the drives supporting SCSI level >= 3 (or supporting FORMAT MEDIUM)? The definition can't be changed later. This is why we should make a good decision.

Opinions?

Thanks,
Kai
Re: [PATCH] scsi_sysfs: Fix typo in is_bin_visible()
Reviewed-by: Laurence Oberman <lober...@redhat.com>

Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services

----- Original Message -----
From: "Ewan Milne" <emi...@redhat.com>
To: "Hannes Reinecke" <h...@suse.de>
Cc: "Martin K. Petersen" <martin.peter...@oracle.com>, "Christoph Hellwig" <h...@lst.de>, "Johannes Thumshirn" <jthumsh...@suse.com>, "James Bottomley" <james.bottom...@hansenpartnership.com>, linux-scsi@vger.kernel.org, "Hannes Reinecke" <h...@suse.com>
Sent: Thursday, March 10, 2016 10:25:08 AM
Subject: Re: [PATCH] scsi_sysfs: Fix typo in is_bin_visible()

On Thu, 2016-03-10 at 11:25 +0100, Hannes Reinecke wrote:
> The test for the existence vpd_pg83 is inverted.
>
> Fixes: 7e47976bcff ("scsi_sysfs: add 'is_bin_visible' callback")
> Signed-off-by: Hannes Reinecke <h...@suse.com>
> ---
>  drivers/scsi/scsi_sysfs.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
> index 58ac9c1..d805d55 100644
> --- a/drivers/scsi/scsi_sysfs.c
> +++ b/drivers/scsi/scsi_sysfs.c
> @@ -1105,7 +1105,7 @@ static umode_t scsi_sdev_bin_attr_is_visible(struct kobject *kobj,
>  	if (attr == &dev_attr_vpd_pg80 && !sdev->vpd_pg80)
>  		return 0;
>
> -	if (attr == &dev_attr_vpd_pg83 && sdev->vpd_pg83)
> +	if (attr == &dev_attr_vpd_pg83 && !sdev->vpd_pg83)
>  		return 0;
>
>  	return S_IRUGO;

Reviewed-by: Ewan D. Milne <emi...@redhat.com>
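The effect of the one-character fix above can be modelled outside the kernel. This is a minimal sketch in Python rather than the kernel code: the function name bin_attr_mode, the string attribute names, and the fixed flag are all hypothetical, and only the visibility rule (return 0 to hide an attribute, S_IRUGO to show it) is taken from the quoted diff.

```python
import stat

# World-readable mode, the userspace equivalent of the kernel's S_IRUGO (0444).
S_IRUGO = stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH


def bin_attr_mode(attr: str, has_pg80: bool, has_pg83: bool, fixed: bool = True) -> int:
    """Return the sysfs mode for a VPD attribute; 0 means the file is hidden."""
    if attr == "vpd_pg80" and not has_pg80:
        return 0
    if attr == "vpd_pg83":
        # Pre-patch, the test was inverted: the file was hidden when the page existed.
        hide = (not has_pg83) if fixed else has_pg83
        if hide:
            return 0
    return S_IRUGO


# A device that does have VPD page 0x83 cached:
print(oct(bin_attr_mode("vpd_pg83", True, True, fixed=False)))
print(oct(bin_attr_mode("vpd_pg83", True, True, fixed=True)))
```

With the inverted test, a device that has page 0x83 loses its vpd_pg83 file while a device without the page gets an attribute with nothing behind it; the patched test makes pg83 behave like pg80.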
Re: [PATCH v2 0/2] Update SCSI target removal path
I can test this next week, both pre- and post-patch.
I will update when it's validated.

Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services

----- Original Message -----
From: "Ewan D. Milne" <emi...@redhat.com>
To: "jthumshirn" <jthumsh...@suse.de>
Cc: "Martin K. Petersen" <martin.peter...@oracle.com>, "James E.J. Bottomley" <j...@linux.vnet.ibm.com>, "Hannes Reinecke" <h...@suse.de>, "Christoph Hellwig" <h...@infradead.org>, linux-scsi@vger.kernel.org
Sent: Wednesday, March 30, 2016 12:30:27 PM
Subject: Re: [PATCH v2 0/2] Update SCSI target removal path

On Wed, 2016-03-30 at 13:01 +0200, jthumshirn wrote:
> [+Cc linux-scsi back]
> On 2016-03-30 02:59, Martin K. Petersen wrote:
> >>>>>> "Ewan" == Ewan D Milne <emi...@redhat.com> writes:
> >
> > Ewan> I would probably use an APCON or other physical layer switch to
> > Ewan> drop the FC link and test the error recovery/device loss. But we
> > Ewan> don't have one.
> >
> > They go for a couple of hundred bucks on eBay. I had one of these in a
> > previous life and it was awesome.
>
> Though this would work (like any other FC/FCoE switch) I thought more of
> a simulated environment. scsi_debug, Qemu, something like that.

You might be able to do something with that but it doesn't quite do the same thing in terms of how a HBA/driver will react to a fault. You also need to be sure there is enough entropy in the timing of when the target goes away. (Otherwise, you could just rmmod scsi_debug...)

>
> I've had a look at scsi_debug but it seems like it's quite some
> refactoring needed to get it to a point where one can simulate target
> errors. I kinda like the idea of having something in
> tools/testing/selftest but it'll probably end up with a FC switch.
>
> Johannes
Cant write to max_sectors_kb on 4.5.0 SRP target
ation descriptor number 5, descriptor length: 8
    designator_type: Logical unit group, code_set: Binary
    associated with the addressed logical unit
    Logical unit group: 0x0
  Designation descriptor number 6, descriptor length: 48
    transport: SCSI RDMA Protocol (SRP)
    designator_type: SCSI name string, code_set: UTF-8
    associated with the target port
    SCSI name string: 0xfe807cfe900300726e4e,t,0x0001
  Designation descriptor number 7, descriptor length: 40
    transport: SCSI RDMA Protocol (SRP)
    designator_type: SCSI name string, code_set: UTF-8
    associated with the target device that contains addressed lu
    SCSI name string: 0xfe807cfe900300726e4e

Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services
Re: Cant write to max_sectors_kb on 4.5.0 SRP target
As a follow-up to this issue:

I looked at modifying the LIO target code to export a larger max_sectors_kb to the initiator for the NVMe devices, but had some issues.
In the end I created 15 fileio devices using 200GB of ramdisk and exported those so I could test 4MB I/O from the initiator.
These allow the 4MB setting on the upstream kernel.

[root@srptest ~]# sg_inq -p 0xb0 /dev/sdk
VPD INQUIRY: Block limits page (SBC)
  Maximum compare and write length: 1 blocks
  Optimal transfer length granularity: 1 blocks
  Maximum transfer length: 16384 blocks
  Optimal transfer length: 16384 blocks
  Maximum prefetch, xdread, xdwrite transfer length: 0 blocks

The sg_map issues I am having on the RHEL kernel are likely due to the "proper" max sector size being ignored.
I am now testing the latest upstream, 4.5.0, with all the sg-related patches to see if that's stable.

Thanks
Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services

----- Original Message -----
From: "Laurence Oberman" <lober...@redhat.com>
To: emi...@redhat.com
Cc: "Martin K. Petersen" <martin.peter...@oracle.com>, "linux-scsi" <linux-scsi@vger.kernel.org>, linux-r...@vger.kernel.org
Sent: Friday, April 8, 2016 9:11:19 AM
Subject: Re: Cant write to max_sectors_kb on 4.5.0 SRP target

Hi Ewan,

OK, that makes sense.
I suspected after everybody's responses that RHEL was somehow ignoring the array-imposed limit here.
I actually got lucky, because I needed to be able to issue 4MB I/Os to reproduce the failures seen at the customer on the initiator side.

Looking at the target-LIO array now, it's clamped to 1MB I/O sizes, which makes sense.
I really was not focusing on the array at the time, expecting it may chop the I/O up as many do.

Knowing what's up now, I can continue to test and figure out what patches I need to pull in to SRP on RHEL to make progress.

Thank you to all that responded.
Laurence Oberman Principal Software Maintenance Engineer Red Hat Global Support Services - Original Message ----- From: "Ewan D. Milne" <emi...@redhat.com> To: "Laurence Oberman" <lober...@redhat.com> Cc: "Martin K. Petersen" <martin.peter...@oracle.com>, "linux-scsi" <linux-scsi@vger.kernel.org>, linux-r...@vger.kernel.org Sent: Friday, April 8, 2016 8:39:52 AM Subject: Re: Cant write to max_sectors_kb on 4.5.0 SRP target The version of RHEL you are using does not have: commit ca369d51b3e1649be4a72addd6d6a168cfb3f537 Author: Martin K. Petersen <martin.peter...@oracle.com> Date: Fri Nov 13 16:46:48 2015 -0500 block/sd: Fix device-imposed transfer length limits (which will be added during the next update). In the upstream kernel queue_max_sectors_store() does not permit you to set a value larger than the device-imposed limit. This value, stored in q->limits.max_dev_sectors, is not visible via the block queue sysfs interface. The code that sets q->limits.max_sectors and q->limits.io_opt in sd.c does not take the device limit into account, but the sysfs code to change max_sectors ("max_sectors_kb") does. So there are a couple of problems here, one is that RHEL is not clamping to the device limit, and the other one is that neither RHEL nor upstream kernels take the device limit into account when setting q->limits.io_opt. This only seems to be a problem for you because your target is reporting an optimal I/O size in VPD page B0 that is *smaller* than the reported maximum I/O size. The target is clearly reporting inconsistent data, the question is whether we should change the code to clamp the optimal I/O size, or whether we should assume the value the target is reporting is wrong. So the question is: does the target actually process requests that are larger than the VPD page B0 reported maximum size? If so, maybe we should just issue a warning message rather than reducing the optimal I/O size. 
-Ewan On Fri, 2016-04-08 at 04:31 -0400, Laurence Oberman wrote: > Hello Martin > > Yes, Ewan also noticed that. > > This started out as me testing the SRP stack on RHEL 7.2 and baselining > against upstream. > We have a customer that requires 4MB I/O. > I bumped into a number of SRP issues including sg_map failures so started > reviewing upstream changes to the SRP code and patches. > > The RHEL kernel is ignoring this so perhaps we have an issue on our side > (RHEL kernel) and upstream is behaving as it should. > > What is intersting is that I cannot change the max_sectors_kb at all on the > upstream for the SRP LUNS. > > Here is an HP SmartArray LUN > > [root@srptest ~]# sg_inq --p 0xb0 /dev/sda > VPD INQUIRY: page=0xb0 > inquiry: field in cdb illegal (page not supported) Known that its > not supported > > However > > /sys/block/sda/queue > > [root@s
Re: Cant write to max_sectors_kb on 4.5.0 SRP target
Hi Ewan,

OK, that makes sense.
I suspected after everybody's responses that RHEL was somehow ignoring the array-imposed limit here.
I actually got lucky, because I needed to be able to issue 4MB I/Os to reproduce the failures seen at the customer on the initiator side.

Looking at the target-LIO array now, it's clamped to 1MB I/O sizes, which makes sense.
I really was not focusing on the array at the time, expecting it might chop the I/O up as many do.

Knowing what's up now, I can continue to test and figure out what patches I need to pull in to SRP on RHEL to make progress.

Thank you to all that responded.

Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services

----- Original Message -----
From: "Ewan D. Milne" <emi...@redhat.com>
To: "Laurence Oberman" <lober...@redhat.com>
Cc: "Martin K. Petersen" <martin.peter...@oracle.com>, "linux-scsi" <linux-scsi@vger.kernel.org>, linux-r...@vger.kernel.org
Sent: Friday, April 8, 2016 8:39:52 AM
Subject: Re: Cant write to max_sectors_kb on 4.5.0 SRP target

The version of RHEL you are using does not have:

commit ca369d51b3e1649be4a72addd6d6a168cfb3f537
Author: Martin K. Petersen <martin.peter...@oracle.com>
Date:   Fri Nov 13 16:46:48 2015 -0500

    block/sd: Fix device-imposed transfer length limits

(which will be added during the next update).

In the upstream kernel queue_max_sectors_store() does not permit you to set a value larger than the device-imposed limit. This value, stored in q->limits.max_dev_sectors, is not visible via the block queue sysfs interface.

The code that sets q->limits.max_sectors and q->limits.io_opt in sd.c does not take the device limit into account, but the sysfs code to change max_sectors ("max_sectors_kb") does.

So there are a couple of problems here: one is that RHEL is not clamping to the device limit, and the other is that neither RHEL nor upstream kernels take the device limit into account when setting q->limits.io_opt. This only seems to be a problem for you because your target is reporting an optimal I/O size in VPD page B0 that is *smaller* than the reported maximum I/O size.

The target is clearly reporting inconsistent data. The question is whether we should change the code to clamp the optimal I/O size, or whether we should assume the value the target is reporting is wrong.

So the question is: does the target actually process requests that are larger than the VPD page B0 reported maximum size? If so, maybe we should just issue a warning message rather than reducing the optimal I/O size.

-Ewan

On Fri, 2016-04-08 at 04:31 -0400, Laurence Oberman wrote:
> Hello Martin
>
> Yes, Ewan also noticed that.
>
> This started out as me testing the SRP stack on RHEL 7.2 and baselining
> against upstream.
> We have a customer that requires 4MB I/O.
> I bumped into a number of SRP issues including sg_map failures so started
> reviewing upstream changes to the SRP code and patches.
>
> The RHEL kernel is ignoring this so perhaps we have an issue on our side
> (RHEL kernel) and upstream is behaving as it should.
>
> What is interesting is that I cannot change the max_sectors_kb at all on the
> upstream for the SRP LUNs.
>
> Here is an HP SmartArray LUN
>
> [root@srptest ~]# sg_inq --p 0xb0 /dev/sda
> VPD INQUIRY: page=0xb0
> inquiry: field in cdb illegal (page not supported)   Known that it's not supported
>
> However
>
> /sys/block/sda/queue
>
> [root@srptest queue]# cat max_hw_sectors_kb max_sectors_kb
> 4096
> 1280
> [root@srptest queue]# echo 4096 > max_sectors_kb
> [root@srptest queue]# cat max_hw_sectors_kb max_sectors_kb
> 4096
> 4096
>
> On the SRP LUNs I am unable to change to a lower value than max_sectors_kb
> unless I change it to 128.
> So perhaps the size on the array is the issue here as Nicholas said and the
> RHEL kernel has a bug and ignores it.
>
> /sys/block/sdc/queue
>
> [root@srptest queue]# cat max_hw_sectors_kb max_sectors_kb
> 4096
> 1280
>
> [root@srptest queue]# echo 512 > max_sectors_kb
> -bash: echo: write error: Invalid argument
>
> [root@srptest queue]# echo 256 > max_sectors_kb
> -bash: echo: write error: Invalid argument
>
> 128 works
> [root@srptest queue]# echo 128 > max_sectors_kb
>
> Laurence Oberman
> Principal Software Maintenance Engineer
> Red Hat Global Support Services
>
> ----- Original Message -----
> From: "Martin K. Petersen" <martin.peter...@oracle.com>
> To: "Laurence Oberman" <lober...@redhat.com>
> Cc: "linux-scsi" <linux-scsi@vger.kernel.org>, linux-r...@vger.kernel.org
> Sent: Thursday, April 7, 2016 11:00:16 PM
> Subject: Re: Cant writ
Re: Cant write to max_sectors_kb on 4.5.0 SRP target
vcTim Util
03:56:57 sdc    0 0 0 0  1092608 0 1067 1024  1024 3 2 0 74
03:56:57 dm-4   0 0 0 0  1092608 0 1067 1024  1024 3 2 0 79
03:56:58 sdc    0 0 0 0  1070080 0 1045 1024  1024 3 2 0 73
03:56:58 dm-4   0 0 0 0  1070080 0 1045 1024  1024 3 2 0 78
03:56:59 sdc    0 0 0 0  1101824 0 1076 1024  1024 3 2 0 72
03:56:59 dm-4   0 0 0 0  1101824 0 1076 1024  1024 3 2 0 77

Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services

----- Original Message -----
From: "Bart Van Assche" <bart.vanass...@sandisk.com>
To: "Laurence Oberman" <lober...@redhat.com>, "linux-scsi" <linux-scsi@vger.kernel.org>, linux-r...@vger.kernel.org
Sent: Thursday, April 7, 2016 10:49:58 PM
Subject: Re: Cant write to max_sectors_kb on 4.5.0 SRP target

On 04/07/16 14:16, Laurence Oberman wrote:
> I have been testing the SRP initiator code to an LIO array here and
> part of the testing requires me to set the max_sectors_kb size to
> get 4k I/O's.

Hello Laurence,

Have you already tried to set the max_sect parameter in /etc/srp_daemon.conf (assuming you are using srptools >= 1.0.3 for SRP login)? Additionally, writing something like "options ib_srp cmd_sg_entries=255" into /etc/modprobe.d/ib_srp.conf will increase the maximum SRP transfer size.

Bart.
Re: Cant write to max_sectors_kb on 4.5.0 SRP target
o- mapped_lun28 [lun28 block/block-29 (rw)]
| | | | o- mapped_lun29 [lun29 block/block-30 (rw)]
| | | o- ib.4f6e72000390fe7c7cfe900300726ed3 . [Mapped LUNs: 30]
| | |   o- mapped_lun0 ... [lun0 block/block-1 (rw)]
| | |   o- mapped_lun1 ... [lun1 block/block-2 (rw)]
| | |   o- mapped_lun2 ... [lun2 block/block-3 (rw)]
| | |   o- mapped_lun3 ... [lun3 block/block-4 (rw)]
| | |   o- mapped_lun4 ... [lun4 block/block-5 (rw)]
..

Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services

----- Original Message -----
From: "Nicholas A. Bellinger" <n...@linux-iscsi.org>
To: "Laurence Oberman" <lober...@redhat.com>
Cc: "linux-scsi" <linux-scsi@vger.kernel.org>, linux-r...@vger.kernel.org, "target-devel" <target-de...@vger.kernel.org>
Sent: Friday, April 8, 2016 1:30:28 AM
Subject: Re: Cant write to max_sectors_kb on 4.5.0 SRP target

Hi Laurence,

On Thu, 2016-04-07 at 17:15 -0400, Laurence Oberman wrote:
> Hello
>
> I have been testing the SRP initiator code to an LIO array here and
> part of the testing requires me to set the max_sectors_kb size to get
> 4k I/O's.
> This has been due to me having to debug various sg_map issues.
>
> Linux srptest 4.5.0 #2 SMP Thu Apr 7 16:14:38 EDT 2016 x86_64 x86_64
> x86_64 GNU/Linux
> This kernel has the scan patch from Hannes, as well as the "[PATCH]
> IB/mlx5: Expose correct max_sge_rd limit" patch.
> However, I also tested with vanilla 4.5.0 as well and it's the same
> issue.
>
> For some reason I cannot change the max_sectors_kb size on 4.5.0 here.
>
> I chatted with Ewan about it as well and he reminded me about Martin's
> changes so wondering if that's playing into this.
>
> Take /dev/sdb as an example
>
> [root@srptest queue]# sg_inq --p 0xb0 /dev/sdb
> VPD INQUIRY: Block limits page (SBC)
>   Maximum compare and write length: 1 blocks
>   Optimal transfer length granularity: 256 blocks
>   Maximum transfer length: 256 blocks
>   Optimal transfer length: 768 blocks
>   Maximum prefetch, xdread, xdwrite transfer length: 0 blocks
>

Just curious what target backend this is with..?

Specifically the optimal transfer length granularity and optimal transfer length may be reported by the underlying backend device (eg: IBLOCK) in spc_emulate_evpd_b0().

What does 'head /sys/kernel/config/target/core/$HBA/$DEV/attrib/*' of the backend device in question look like..?
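A small worked sketch may help connect the Block limits page quoted above with the "Invalid argument" errors reported elsewhere in this thread. It is only a user-space model of the behaviour Ewan describes for queue_max_sectors_store(), not kernel source. The function names are hypothetical, and the 512-byte logical block size and 4 KiB page size are assumptions that are not stated in the thread.

```c
#include <errno.h>

/* Convert a VPD B0 length given in logical blocks to KiB.
 * block_size is in bytes; 512 is an assumption for this thread. */
static unsigned long blocks_to_kb(unsigned long blocks, unsigned long block_size)
{
    return blocks * block_size / 1024;
}

/* Smaller of two values, treating 0 as "no limit reported". */
static unsigned long min_not_zero_ul(unsigned long a, unsigned long b)
{
    if (a == 0)
        return b;
    if (b == 0)
        return a;
    return a < b ? a : b;
}

/* Hypothetical model of the sysfs store check: a requested max_sectors_kb
 * is rejected if it exceeds the smaller of the hardware limit and the
 * device-reported limit, or if it is smaller than one page. */
static int check_max_sectors_kb(unsigned long req_kb, unsigned long max_hw_kb,
                                unsigned long max_dev_kb, unsigned long page_kb)
{
    unsigned long limit = min_not_zero_ul(max_hw_kb, max_dev_kb);

    if (req_kb > limit || req_kb < page_kb)
        return -EINVAL;
    return 0;
}
```

Under those assumptions, a maximum transfer length of 256 blocks is 128 KiB, so with max_hw_sectors_kb at 4096 a write of 512 or 256 would be rejected and 128 accepted, which matches what was seen on the SRP LUNs if they report the same limits as /dev/sdb. A device that reports no Block limits page (modelled here as a device limit of 0) would still accept 4096.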
Re: [PATCH 1/2] scsi: add a max_segment_size limitation to struct Scsi_Host
Thanks Bart

Good catch, I completely missed it.

Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services

----- Original Message -----
From: "Bart Van Assche" <bart.vanass...@sandisk.com>
To: "Christoph Hellwig" <h...@lst.de>, linux-r...@vger.kernel.org, linux-scsi@vger.kernel.org
Sent: Monday, April 11, 2016 7:32:16 PM
Subject: Re: [PATCH 1/2] scsi: add a max_segment_size limitation to struct Scsi_Host

On 04/11/2016 03:47 PM, Christoph Hellwig wrote:
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index 8106515..04c660d 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -2120,7 +2120,8 @@ static void __scsi_init_queue(struct Scsi_Host *shost, struct request_queue *q)
>          blk_queue_segment_boundary(q, shost->dma_boundary);
>          dma_set_seg_boundary(dev, shost->dma_boundary);
>
> -        blk_queue_max_segment_size(q, dma_get_max_seg_size(dev));
> +        blk_queue_max_segment_size(q,
> +                min(shost->max_segment_size, dma_get_max_seg_size(dev)));
>
>          if (!shost->use_clustering)
>                  q->limits.cluster = 0;

Hello Christoph,

Since Scsi_Host.max_segment_size is initialized to zero, shouldn't min() be changed into min_not_zero()?

Bart.
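To make Bart's point concrete, here is a minimal user-space sketch of the difference between the two helpers. The functions below are simplified stand-ins written for illustration, not the kernel's min() and min_not_zero() macros, and 65536 is only an illustrative value for what dma_get_max_seg_size() might return.

```c
/* Plain minimum: a zero argument always wins. */
static unsigned int min_uint(unsigned int a, unsigned int b)
{
    return a < b ? a : b;
}

/* Minimum that treats 0 as "not set" and falls back to the other value. */
static unsigned int min_not_zero_uint(unsigned int a, unsigned int b)
{
    if (a == 0)
        return b;
    if (b == 0)
        return a;
    return min_uint(a, b);
}
```

With a host that never sets max_segment_size (so the field is still 0), min_uint(0, 65536) yields 0 and would hand a zero segment size to the queue, while min_not_zero_uint(0, 65536) yields 65536 and leaves the DMA device limit in place. A host that does set a value, say 4096, gets 4096 from either helper.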
Re: [PATCH 1/2] scsi: add a max_segment_size limitation to struct Scsi_Host
Modified patch to use min_not_zero().
Ran a number of tests overnight on F/C, SCSI/SAS and SRP (RDMA) and no issues found.

Tested-by: Laurence Oberman <lober...@redhat.com>

Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services

----- Original Message -----
From: "Laurence Oberman" <lober...@redhat.com>
To: "Bart Van Assche" <bart.vanass...@sandisk.com>
Cc: "Christoph Hellwig" <h...@lst.de>, linux-r...@vger.kernel.org, linux-scsi@vger.kernel.org
Sent: Monday, April 11, 2016 7:44:24 PM
Subject: Re: [PATCH 1/2] scsi: add a max_segment_size limitation to struct Scsi_Host

Thanks Bart

Good catch, I completely missed it.

Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services

----- Original Message -----
From: "Bart Van Assche" <bart.vanass...@sandisk.com>
To: "Christoph Hellwig" <h...@lst.de>, linux-r...@vger.kernel.org, linux-scsi@vger.kernel.org
Sent: Monday, April 11, 2016 7:32:16 PM
Subject: Re: [PATCH 1/2] scsi: add a max_segment_size limitation to struct Scsi_Host

On 04/11/2016 03:47 PM, Christoph Hellwig wrote:
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index 8106515..04c660d 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -2120,7 +2120,8 @@ static void __scsi_init_queue(struct Scsi_Host *shost, struct request_queue *q)
>          blk_queue_segment_boundary(q, shost->dma_boundary);
>          dma_set_seg_boundary(dev, shost->dma_boundary);
>
> -        blk_queue_max_segment_size(q, dma_get_max_seg_size(dev));
> +        blk_queue_max_segment_size(q,
> +                min(shost->max_segment_size, dma_get_max_seg_size(dev)));
>
>          if (!shost->use_clustering)
>                  q->limits.cluster = 0;

Hello Christoph,

Since Scsi_Host.max_segment_size is initialized to zero, shouldn't min() be changed into min_not_zero()?

Bart.
Re: [PATCH 1/2] scsi: add a max_segment_size limitation to struct Scsi_Host
Other than adding the patch, rebuilding the kernel and running my regular tests, which I had to do anyway, that was the extent of the testing.
To be honest, I did not see where the new field was used, other than the structure member being added.
I wanted to test the simple change because it was in scsi_lib.c, which of course has many dependencies.

Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services

----- Original Message -----
From: "Bart Van Assche" <bart.vanass...@sandisk.com>
To: "Christoph Hellwig" <h...@lst.de>, linux-r...@vger.kernel.org, linux-scsi@vger.kernel.org
Sent: Tuesday, April 12, 2016 11:19:20 AM
Subject: Re: [PATCH 1/2] scsi: add a max_segment_size limitation to struct Scsi_Host

On 04/12/2016 07:13 AM, Christoph Hellwig wrote:
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index 8106515..ad79372 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -2120,7 +2120,8 @@ static void __scsi_init_queue(struct Scsi_Host *shost, struct request_queue *q)
>          blk_queue_segment_boundary(q, shost->dma_boundary);
>          dma_set_seg_boundary(dev, shost->dma_boundary);
>
> -        blk_queue_max_segment_size(q, dma_get_max_seg_size(dev));
> +        blk_queue_max_segment_size(q, min_not_zero(shost->max_segment_size,
> +                        dma_get_max_seg_size(dev)));
>
>          if (!shost->use_clustering)
>                  q->limits.cluster = 0;
> diff --git a/include/scsi/scsi_host.h b/include/scsi/scsi_host.h
> index fcfa3d7..f11d3fe 100644
> --- a/include/scsi/scsi_host.h
> +++ b/include/scsi/scsi_host.h
> @@ -621,6 +621,7 @@ struct Scsi_Host {
>          short unsigned int sg_tablesize;
>          short unsigned int sg_prot_tablesize;
>          unsigned int max_sectors;
> +        unsigned int max_segment_size;
>          unsigned long dma_boundary;
>          /*
>           * In scsi-mq mode, the number of hardware queues supported by the LLD.

Hello Christoph,

The value zero has another meaning for Scsi_Host.max_segment_size than for queue_limits.max_segment_size. Shouldn't that be documented somewhere?

Thanks,

Bart.
Re: [PATCH] drivers/scsi/fnic/fnic_scsi.c: Deinline fnic_queue_abort_io_req, save 1792 bytes
Simple change, looks fine to me.

Reviewed-by: Laurence Oberman <lober...@redhat.com>

Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services

----- Original Message -----
From: "Denys Vlasenko" <dvlas...@redhat.com>
To: "James Bottomley" <james.bottom...@hansenpartnership.com>
Cc: "Denys Vlasenko" <dvlas...@redhat.com>, "Hiral Patel" <hiral...@cisco.com>, "Suma Ramars" <sram...@cisco.com>, "Brian Uchino" <buch...@cisco.com>, linux-scsi@vger.kernel.org, linux-ker...@vger.kernel.org
Sent: Friday, April 8, 2016 2:58:43 PM
Subject: [PATCH] drivers/scsi/fnic/fnic_scsi.c: Deinline fnic_queue_abort_io_req, save 1792 bytes

This function compiles to 511 bytes of machine code.

Abort commands are not time-critical at all.

Signed-off-by: Denys Vlasenko <dvlas...@redhat.com>
CC: James Bottomley <james.bottom...@hansenpartnership.com>
CC: Hiral Patel <hiral...@cisco.com>
CC: Suma Ramars <sram...@cisco.com>
CC: Brian Uchino <buch...@cisco.com>
CC: linux-scsi@vger.kernel.org
CC: linux-ker...@vger.kernel.org
---
 drivers/scsi/fnic/fnic_scsi.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/fnic/fnic_scsi.c b/drivers/scsi/fnic/fnic_scsi.c
index 266b909..0a3edee 100644
--- a/drivers/scsi/fnic/fnic_scsi.c
+++ b/drivers/scsi/fnic/fnic_scsi.c
@@ -1435,7 +1435,7 @@ wq_copy_cleanup_scsi_cmd:
         }
 }

-static inline int fnic_queue_abort_io_req(struct fnic *fnic, int tag,
+static int fnic_queue_abort_io_req(struct fnic *fnic, int tag,
                                           u32 task_req, u8 *fc_lun,
                                           struct fnic_io_req *io_req)
 {
--
2.1.0
Re: [PATCH 1/2] scsi: add a max_segment_size limitation to struct Scsi_Host
This looks fine to me.
I am pulling this in to my ongoing SRP initiator and target testing, so I will be testing it.
Up to now this has likely not affected me, but I am pulling in all RDMA patches to test.

Reviewed-by: Laurence Oberman <lober...@redhat.com>

Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services

----- Original Message -----
From: "Christoph Hellwig" <h...@lst.de>
To: linux-r...@vger.kernel.org, linux-scsi@vger.kernel.org
Sent: Monday, April 11, 2016 6:47:25 PM
Subject: [PATCH 1/2] scsi: add a max_segment_size limitation to struct Scsi_Host

RDMA drivers need segments that aren't larger than a single HCA page for memory registrations to work properly, so wire up this limitation in the host.

While we could just call blk_queue_max_segment_size from ->slave_configure, that would override the global limit based on the DMA device, so let's do it the traditional way by adding a field to the Scsi_Host structure.

Signed-off-by: Christoph Hellwig <h...@lst.de>
---
 drivers/scsi/scsi_lib.c  | 3 ++-
 include/scsi/scsi_host.h | 1 +
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 8106515..04c660d 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -2120,7 +2120,8 @@ static void __scsi_init_queue(struct Scsi_Host *shost, struct request_queue *q)
         blk_queue_segment_boundary(q, shost->dma_boundary);
         dma_set_seg_boundary(dev, shost->dma_boundary);

-        blk_queue_max_segment_size(q, dma_get_max_seg_size(dev));
+        blk_queue_max_segment_size(q,
+                min(shost->max_segment_size, dma_get_max_seg_size(dev)));

         if (!shost->use_clustering)
                 q->limits.cluster = 0;
diff --git a/include/scsi/scsi_host.h b/include/scsi/scsi_host.h
index fcfa3d7..f11d3fe 100644
--- a/include/scsi/scsi_host.h
+++ b/include/scsi/scsi_host.h
@@ -621,6 +621,7 @@ struct Scsi_Host {
         short unsigned int sg_tablesize;
         short unsigned int sg_prot_tablesize;
         unsigned int max_sectors;
+        unsigned int max_segment_size;
         unsigned long dma_boundary;
         /*
          * In scsi-mq mode, the number of hardware queues supported by the LLD.
--
2.1.4
Re: [PATCHv3] scsi: disable automatic target scan
Hi Hannes,

Please share those dracut patches, because I want to test this patch series.
Which kernel is the diff against for the scan patches?

Thanks
Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services

----- Original Message -----
From: "Hannes Reinecke" <h...@suse.de>
To: "Bart Van Assche" <bart.vanass...@sandisk.com>, "Martin K. Petersen" <martin.peter...@oracle.com>
Cc: "Christoph Hellwig" <h...@lst.de>, "James Bottomley" <james.bottom...@hansenpartnership.com>, linux-scsi@vger.kernel.org
Sent: Saturday, March 19, 2016 11:18:09 AM
Subject: Re: [PATCHv3] scsi: disable automatic target scan

On 03/18/2016 10:56 PM, Bart Van Assche wrote:
> On 03/17/2016 12:39 AM, Hannes Reinecke wrote:
>> On larger installations it is useful to disable automatic LUN
>> scanning, and only add the required LUNs via udev rules.
>> This can speed up bootup dramatically.
>>
>> This patch introduces a new scan module parameter value 'manual',
>> which works like 'none', but can be overridden by setting the 'rescan'
>> value from scsi_scan_target to 'SCSI_SCAN_MANUAL'.
>> And it updates all relevant callers to set the 'rescan' value
>> to 'SCSI_SCAN_MANUAL' if invoked via the 'scan' option in sysfs.
>
> Hello Hannes,
>
> Will setting scsi_scan_type to 'manual' allow a system to boot from a
> SCSI disk? If not, are there alternatives to this approach? Would it be
> a valid alternative to e.g. introduce a new threshold parameter such
> that only LUN numbers below this threshold are scanned during boot?
>
I have a patch for dracut, which will generate udev rules for all devices required for mounting the root fs.
Once the system is booted properly I've got another patch for systemd which switches back to 'normal' scanning (ie by writing 'sync' into /sys/module/scsi_mod/parameters/scan) and rescans all scsi hosts.

With that there's no need to have any arbitrary limits; only the necessary devices are enabled during boot.

Cheers,

Hannes
--
Dr. Hannes Reinecke                   zSeries & Storage
h...@suse.de                          +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)
Re: [PATCH] fnic: move printk()s outside of the critical code section.
Reviewed-by: Laurence Oberman <lober...@redhat.com>

Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services

----- Original Message -----
From: "Maurizio Lombardi" <mlomb...@redhat.com>
To: linux-scsi@vger.kernel.org
Cc: hiral...@cisco.com, sram...@cisco.com, buch...@cisco.com, j...@linux.vnet.ibm.com
Sent: Wednesday, March 16, 2016 9:44:08 AM
Subject: [PATCH] fnic: move printk()s outside of the critical code section.

This patch moves a printk() outside of the code section where interrupts are disabled. In some cases a flood of error messages may cause a kernel panic.
It also removes one of the printk()s because the same error message was printed twice.

[709686.317197] Kernel panic - not syncing: Watchdog detected hard LOCKUP on cpu 12
[709686.317200] CPU: 12 PID: 1963 Comm: systemd-journal Tainted: GF O-- 3.10.0-229.el7.x86_64 #1
[709686.317201] Hardware name: Cisco Systems Inc UCSB-B200-M3/UCSB-B200-M3, BIOS B200M3.2.2.3.6.030620151309 03/06/2015
[709686.317206] 8182b2e8 392722ba 88046fcc5c48 81603f36
[709686.317209] 88046fcc5cc8 815fd7da 0010 88046fcc5cd8
[709686.317211] 88046fcc5c78 392722ba 88046fcc5c88 000c
[709686.317212] Call Trace:
[709686.317221] [] dump_stack+0x19/0x1b
[709686.317223] [] panic+0xd8/0x1e7
[709686.317227] [] ? watchdog_enable_all_cpus.part.2+0x40/0x40
[709686.317229] [] watchdog_overflow_callback+0xc2/0xd0
[709686.317233] [] __perf_event_overflow+0xa1/0x250
[709686.317235] [] perf_event_overflow+0x14/0x20
[709686.317239] [] intel_pmu_handle_irq+0x1fd/0x410
[709686.317242] [] ? unmap_kernel_range_noflush+0x11/0x20
[709686.317246] [] ? ghes_copy_tofrom_phys+0x124/0x210
[709686.317249] [] perf_event_nmi_handler+0x2b/0x50
[709686.317251] [] nmi_handle.isra.0+0x69/0xb0
[709686.317252] [] do_nmi+0xd0/0x340
[709686.317256] [] end_repeat_nmi+0x1e/0x2e
[709686.317260] [] ? memcpy+0xd/0x110
[709686.317263] [] ? memcpy+0xd/0x110
[709686.317265] [] ? memcpy+0xd/0x110
[709686.317269] <> [] ? vgacon_scroll+0x2d7/0x330
[709686.317273] [] scrup+0xfc/0x110
[709686.317275] [] lf+0xa0/0xb0
[709686.317278] [] vt_console_print+0x2d2/0x420
[709686.317283] [] call_console_drivers.constprop.15+0x91/0xf0
[709686.317287] [] console_unlock+0x3bf/0x400
[709686.317291] [] vprintk_emit+0x2b6/0x530
[709686.317294] [] printk_emit+0x44/0x5b
[709686.317297] [] devkmsg_writev+0x158/0x1d0
[709686.317303] [] do_sync_readv_writev+0x79/0xd0
[709686.317307] [] do_readv_writev+0xce/0x260
[709686.317310] [] ? __sb_start_write+0x58/0x110
[709686.317314] [] vfs_writev+0x35/0x60
[709686.317318] [] SyS_writev+0x5c/0xd0
[709686.317322] [] system_call_fastpath+0x16/0x1b

Signed-off-by: Maurizio Lombardi <mlomb...@redhat.com>
---
 drivers/scsi/fnic/fnic_scsi.c | 13 ++++++-------
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/drivers/scsi/fnic/fnic_scsi.c b/drivers/scsi/fnic/fnic_scsi.c
index 266b909..f3032ca 100644
--- a/drivers/scsi/fnic/fnic_scsi.c
+++ b/drivers/scsi/fnic/fnic_scsi.c
@@ -958,23 +958,22 @@ static void fnic_fcpio_icmnd_cmpl_handler(struct fnic *fnic,
         case FCPIO_INVALID_PARAM:    /* some parameter in request invalid */
         case FCPIO_REQ_NOT_SUPPORTED:/* request type is not supported */
         default:
-                shost_printk(KERN_ERR, fnic->lport->host, "hdr status = %s\n",
-                             fnic_fcpio_status_to_str(hdr_status));
                 sc->result = (DID_ERROR << 16) | icmnd_cmpl->scsi_status;
                 break;
         }

-        if (hdr_status != FCPIO_SUCCESS) {
-                atomic64_inc(&fnic_stats->io_stats.io_failures);
-                shost_printk(KERN_ERR, fnic->lport->host, "hdr status = %s\n",
-                             fnic_fcpio_status_to_str(hdr_status));
-        }
         /* Break link with the SCSI command */
         CMD_SP(sc) = NULL;
         CMD_FLAGS(sc) |= FNIC_IO_DONE;

         spin_unlock_irqrestore(io_lock, flags);

+        if (hdr_status != FCPIO_SUCCESS) {
+                atomic64_inc(&fnic_stats->io_stats.io_failures);
+                shost_printk(KERN_ERR, fnic->lport->host, "hdr status = %s\n",
+                             fnic_fcpio_status_to_str(hdr_status));
+        }
+
         fnic_release_ioreq_buf(fnic, io_req, sc);

         mempool_free(io_req, fnic->io_req_pool);
--
Maurizio Lombardi
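The pattern in the patch above (do the state update under the lock, report the error only after the lock is dropped) can be sketched in user space as follows. This is an illustration under stated assumptions, not fnic code: a pthread mutex stands in for the driver's spinlock with interrupts disabled, fprintf() stands in for shost_printk(), and complete_io(), io_lock, io_failures and io_done are hypothetical names.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

static pthread_mutex_t io_lock = PTHREAD_MUTEX_INITIALIZER;
static atomic_ulong io_failures;   /* failure counter, updated atomically */

/* Mark an I/O as done under the lock, then report a failure (if any)
 * only after the lock has been released, so that slow console output
 * never runs inside the critical section.
 * Returns 1 if a failure was reported, 0 otherwise. */
static int complete_io(int hdr_status, int *io_done)
{
    pthread_mutex_lock(&io_lock);
    *io_done = 1;                  /* the only work that needs the lock */
    pthread_mutex_unlock(&io_lock);

    if (hdr_status != 0) {
        atomic_fetch_add(&io_failures, 1);
        fprintf(stderr, "hdr status = %d\n", hdr_status);
        return 1;
    }
    return 0;
}
```

The status value is captured before the lock is taken and does not change afterwards, which is why it is safe to test it and log outside the critical section.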
Re: [PATCHv3] scsi: disable automatic target scan
Hello Hannes

Please share the latest scripts and an example of how you are using them.
I have some scripts from last November that you posted, but I am sure they have changed.
If not, then I will modify them as appropriate; just let me know.

I have added the patches and booted the system set to async, so before I boot with scsi_mod.scan=manual I want to prepare my test system.

This may be a very useful feature that we would want to include in RHEL, as we struggle with large LUN boot configurations all the time.

When you have time, and thanks

Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services

----- Original Message -----
From: "Hannes Reinecke" <h...@suse.de>
To: "Bart Van Assche" <bart.vanass...@sandisk.com>, "Martin K. Petersen" <martin.peter...@oracle.com>
Cc: "Christoph Hellwig" <h...@lst.de>, "James Bottomley" <james.bottom...@hansenpartnership.com>, linux-scsi@vger.kernel.org
Sent: Monday, March 21, 2016 3:15:10 AM
Subject: Re: [PATCHv3] scsi: disable automatic target scan

On 03/21/2016 02:24 AM, Bart Van Assche wrote:
> On 03/19/16 08:18, Hannes Reinecke wrote:
>> On 03/18/2016 10:56 PM, Bart Van Assche wrote:
>>> On 03/17/2016 12:39 AM, Hannes Reinecke wrote:
>>>> On larger installations it is useful to disable automatic LUN
>>>> scanning, and only add the required LUNs via udev rules.
>>>> This can speed up bootup dramatically.
>>>>
>>>> This patch introduces a new scan module parameter value 'manual',
>>>> which works like 'none', but can be overridden by setting the
>>>> 'rescan'
>>>> value from scsi_scan_target to 'SCSI_SCAN_MANUAL'.
>>>> And it updates all relevant callers to set the 'rescan' value
>>>> to 'SCSI_SCAN_MANUAL' if invoked via the 'scan' option in sysfs.
>>>
>>> Hello Hannes,
>>>
>>> Will setting scsi_scan_type to 'manual' allow a system to boot
>>> from a
>>> SCSI disk? If not, are there alternatives to this approach? Would
>>> it be
>>> a valid alternative to e.g. introduce a new threshold parameter such
>>> that only LUN numbers below this threshold are scanned during boot?
>>>
>> I have a patch for dracut, which will generate udev rules for all
>> devices required for mounting the root fs.
>> Once the system is booted properly I've got another patch for systemd
>> which switches back to 'normal' scanning (ie by writing 'sync' into
>> /sys/module/scsi_mod/parameters/scan) and rescan all scsi hosts.
>>
>> With that there's no need to have any arbitrary limits; only the
>> necessary devices are enabled during boot.
>
> Hello Hannes,
>
> That sounds like a really interesting approach. Will this approach
> also work if the SCSI host and/or LUN numbers change during a reboot?
>
It's independent of the SCSI host as it just looks for the rport ID (FC WWPN, SAS ID, or iSCSI target name).
The LUN number, however, is fixed; the whole point of this exercise is that you want to blank out individual LUNs behind a given target. Hence you need to be able to address the LUNs in the first place.

Cheers,

Hannes
--
Dr. Hannes Reinecke                   Teamlead Storage & Networking
h...@suse.de                          +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)
Re: [PATCHv3] scsi: disable automatic target scan
Hello Tested Hannes's scan disable patch (subject above) with hpsa module patch below. Because of the way the hpsa works I created a module that will force the scan when all scans are manual. I also tested Hannes's patch with boot-from-san via F/C and validated the patch in subject using Hannes's original dracut lunmask patch. For Hannes's scan disable patch Tested-by: Laurence Oberman <lober...@redhat.com> linux16 /vmlinuz-4.4.5scan root=/dev/mapper/rhel-root ro crashkernel=512M@64M rd.lvm.lv=rhel/root rd.lvm.lv=rhel/swap console=ttyS0,115200n8 scsi_mod.scan=manual rd.hpsa=0 rdloaddriver=hpsa Additional HPSA module for dracut, needs to be cleaned up and reviewed internally here at Red Hat for a separate submission later. Included for others who want to test this. We need this for hpsa as this is by far the most popular boot controller we face. diff -Nurp modules.d.orig/06hpsa/hpsa.sh modules.d/06hpsa/hpsa.sh --- modules.d.orig/06hpsa/hpsa.sh 1969-12-31 19:00:00.0 -0500 +++ modules.d/06hpsa/hpsa.sh2016-03-23 21:38:33.157233465 -0400 @@ -0,0 +1,6 @@ +#!/bin/sh +### hpsa.sh: Called by the parse-hpsa.sh script to create the scan script ### +### Laurence Oberman lober...@redhat.com +. 
/lib/dracut-lib.sh +### The actual script that scans the hpsa for LUNS +/bin/sh /sbin/hpsa_scan.sh diff -Nurp modules.d.orig/06hpsa/module-setup.sh modules.d/06hpsa/module-setup.sh --- modules.d.orig/06hpsa/module-setup.sh 1969-12-31 19:00:00.0 -0500 +++ modules.d/06hpsa/module-setup.sh2016-03-23 21:40:36.994767642 -0400 @@ -0,0 +1,14 @@ +#!/bin/sh + Test the hpsa driver load with scan # + Laurence Oberman lober...@redhat.com +### module-setup.sh - Required for every module +### Standard script invocations required +check() { + return 0 +} + +### Install the hpsa.sh in the module directory +install() { + inst_hook cmdline 20 "$moddir/parse-hpsa.sh" + inst_simple "$moddir/hpsa.sh" /sbin/hpsa.sh +} diff -Nurp modules.d.orig/06hpsa/parse-hpsa.sh modules.d/06hpsa/parse-hpsa.sh --- modules.d.orig/06hpsa/parse-hpsa.sh 1969-12-31 19:00:00.0 -0500 +++ modules.d/06hpsa/parse-hpsa.sh 2016-03-23 21:42:28.141856121 -0400 @@ -0,0 +1,18 @@ +#!/bin.bash +### Laurence Oberman lober...@redhat.com +### parse-hpsa.sh +### Parses the rd.hpsa=x tp get the host number +### Using rdloaddriver=hpsa will enforce hpsa becoming scsi0 + +for p in $(getargs rd.hpsa=); do +( + echo "echo 1 > /sys/class/scsi_host/host$p/rescan" > /sbin/hpsa_scan.sh +_do_hpsa=1 +) +done + +### Standard way to call the script from udev +/sbin/initqueue --settled --unique --onetime /bin/sh /sbin/hpsa.sh +#/bin/sh /sbin/hpsa.sh +unset _do_hpsa + Test log [5.591817] HP HPSA Driver (v 3.4.14-0) [5.593799] hpsa :05:00.0: can't disable ASPM; OS doesn't have ASPM control [5.597423] hpsa :05:00.0: MSI-X capable controller [5.600181] hpsa :05:00.0: only 16 MSI-X vectors available [5.602995] hpsa :05:00.0: Logical aborts not supported [5.606011] hpsa :05:00.0: HP SSD Smart Path aborts not supported [5.631300] scsi host0: hpsa [ OK ] Started dracut pre-udev hook. Starting udev Kernel Device Manager... [ OK ] Started udev Kernel Device Manager. Starting udev Coldplug all Devices... 
[5.676569] clocksource: Switched to clocksource tsc Mounting Configuration File System... [ OK ] Mounted Configuration File System. [ OK ] Started udev Coldplug all Devices. Starting dracut initqueue hook... Starting Show Plymouth Boot Screen... [ OK ] Reached target System Initialization. [5.708890] bnx2: QLogic bnx2 Gigabit Ethernet Driver v2.2.6 (January 29, 2014) [ OK ] Started Show Plymouth Boot Screen. [5.749275] bnx2 :03:00.0 eth0: Broadcom NetXtreme II BCM5709 1000Base-T (C0) PCI Express found at mem f400, IRQ 16, node addr e4:11:5b:b8:ea:6a [ OK ] Reached target Paths. [ OK ] Reached target Basic System. [5.828145] bnx2 :03:00.1 eth1: Broadcom NetXtreme II BCM5709 1000Base-T (C0) PCI Express found at mem f200, IRQ 17, node addr e4:11:5b:b8:ea:6c [5.905874] bnx2 :04:00.0 eth2: Broadcom NetXtreme II BCM5709 1000Base-T (C0) PCI Express found at mem f800, IRQ 18, node addr e4:11:5b:b8:ea:6e [5.906632] bnx2 :04:00.1 eth3: Broadcom NetXtreme II BCM5709 1000Base-T (C0) PCI Express found at mem f600, IRQ 19, node addr e4:11:5b:b8:ea:70 [6.061847] bnx2 :04:00.1 enp4s0f1: renamed from eth3 [6.098914] mlx5_core :08:00.0: firmware version: 12.14.2036 [6.147046] Emulex LightPulse Fibre Channel SCSI driver 11.0.0.0. [6.147054] [drm] Initialized drm 1.1.0 20060810 [6.147148] qla2xxx [:00:00.0]-0005: : QLogic Fibre Channel HBA Driver: 8.07.00.26-k. [6.147278] qla2xxx [:0e:00.0]-001d: : Found an ISP2432 irq 27 iobase 0xc900192b8000. [6.14
Re: [PATCH] mpt3sas - remove unused fw_event_work delayed_work
Looks fine to me. Reviewed-by: Laurence Oberman <lober...@redhat.com> Laurence Oberman Principal Software Maintenance Engineer Red Hat Global Support Services - Original Message - From: "Joe Lawrence" <joe.lawre...@stratus.com> To: linux-scsi@vger.kernel.org Cc: "Sathya Prakash" <sathya.prak...@broadcom.com>, "Chaitra P B" <chaitra.basa...@broadcom.com>, "Suganath Prabu Subramani" <suganath-prabu.subram...@broadcom.com>, "Calvin Owens" <calvinow...@fb.com>, "Joe Lawrence" <joe.lawre...@stratus.com> Sent: Friday, April 1, 2016 1:56:29 PM Subject: [PATCH] mpt3sas - remove unused fw_event_work delayed_work The driver's fw events are queued up using the the fw_event_work's struct work, not its delayed_work member. The latter appears to be unused and may provoke CONFIG_DEBUG_OBJECTS_TIMERS "assert_init not available" false warnings in _scsih_fw_event_cleanup_queue. Remove it and update _scsih_fw_event_cleanup_queue accordingly. Signed-off-by: Joe Lawrence <joe.lawre...@stratus.com> --- I think this goes all the way back to the introduction of the mpt3sas driver. The previous generation mpt2sas driver uses delayed_work, so perhaps it was simply copied and pasted into the mpt3sas but never updated. drivers/scsi/mpt3sas/mpt3sas_scsih.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c index e0e4920d0fa6..67643602efbc 100644 --- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c +++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c @@ -189,7 +189,6 @@ struct fw_event_work { struct list_headlist; struct work_struct work; u8 cancel_pending_work; - struct delayed_work delayed_work; struct MPT3SAS_ADAPTER *ioc; u16 device_handle; @@ -2804,12 +2803,12 @@ _scsih_fw_event_cleanup_queue(struct MPT3SAS_ADAPTER *ioc) /* * Wait on the fw_event to complete. If this returns 1, then * the event was never executed, and we need a put for the -* reference the delayed_work had on the fw_event. 
+* reference the work had on the fw_event. * * If it did execute, we wait for it to finish, and the put will * happen from _firmware_event_work() */ - if (cancel_delayed_work_sync(&fw_event->delayed_work)) + if (cancel_work_sync(&fw_event->work)) fw_event_work_put(fw_event); fw_event_work_put(fw_event); -- 1.8.3.1
Re: tcm_qla2xxx Add SCSI command jammer/discard capabilty to the tcm_qla2xxx module - revision3
Himanshu I looked at using the attribute for this but because of where I have to discard the command I dont want to have to go fetch the attribute each time in the same code path. Its significant overhead to have to go fetch the attribute value each time as I allow for a dynamic on off via the module parameter so I have to check it each command. With the module parameter its a simple compare and by having this as a module parameter its globally accessible and imposes virtually no overhead. Are you OK with me using #ifdef on the CONFIG_TCM_QLA2XXX_DEBUG .config parameter I will add here to include the module parameter and code only if set to "yes" The default unless expicitly set will be no change. Thanks Laurence Oberman Principal Software Maintenance Engineer Red Hat Global Support Services - Original Message - From: "Himanshu Madhani" <himanshu.madh...@qlogic.com> To: "Nicholas A. Bellinger" <n...@linux-iscsi.org>, "Bart Van Assche" <bart.vanass...@sandisk.com> Cc: "Laurence Oberman" <lober...@redhat.com>, "linux-scsi" <linux-scsi@vger.kernel.org>, "target-devel" <target-de...@vger.kernel.org>, "Quinn Tran" <quinn.t...@qlogic.com> Sent: Thursday, March 31, 2016 8:20:56 PM Subject: Re: tcm_qla2xxx Add SCSI command jammer/discard capabilty to the tcm_qla2xxx module - revision3 Hi Nic, Laurence, On 3/30/16, 10:34 PM, "Nicholas A. Bellinger" <n...@linux-iscsi.org> wrote: >(Adding target-devel + Qlogic target folks) > >On Tue, 2016-03-29 at 22:05 -0700, Bart Van Assche wrote: >> On 03/29/16 07:42, Laurence Oberman wrote: >> > I have been using this jammer functionality to continue testing the SCSI >> > F/C drivers and recovery for over a year now. >> > Any chance you would agree to ack this so I can get it in now. >> > I last posted to the list last March and it was not picked up. >> > >> > I did look into moving this to upper layers but I find I use it primarily >> > for fiber channel target testing. 
>> > Attempting to add this functionality to upper layers led to complexities >> > and this is very solid. >> > >> > This Patch diff against 4.5 >> > >> > I use target LIO for all my storage array test targets and customer >> > problem resolution here at Red Hat. >> > This patch resulted from a requirement to mimic behavior of an expensive >> > hardware jammer for a customer. >> > I have used this for some time with good success to simulate and reproduce >> > latency and slow drain fabric issues and >> > for testing and validating error handling behavior >> > in the Emulex, Qlogic and other F/C drivers. >> > >> > Works by checking new parameter jam_host if its >= 0 and matches >> > vha->host_no , jamming is enabled when jam_host >=0 >> > If parameter set to -1 (default) no jamming is enabled. >> >> Hello Laurence, >> >> Nic Bellinger is the maintainer of LIO so my recommendation is to ask >> Nic first about his opinion (I have CC'd Nic). I'm not sure what Nic >> thinks about this but in my opinion such functionality belongs in the >> target core instead of in a target driver. But please wait until Nic has >> provided his opinion before spending more time on this. The mailing list >> for SCSI target patches is target-de...@vger.kernel.org. >> > >So really it's Himanshu's + Quinn's call if they would like to include >something like this in mainline. > >If so, then I'd prefer to do it with a per tcm_qla2xxx_tpg endpoint >attribute instead a new module parameter, and add a new kernel config >option (CONFIG_TCM_QLA2XXX_DEBUG) to disable (by default) so end users >don't inadvertently play with it via targetcli + friends. > I agree here with Nic. The patch does provides benefit and is good addition, but we don’t want to enable it by default. Laurence, Would you be kind to rework patch with suggested changes from Nic and post it. 
Thanks, Himanshu
[PATCH] tcm_qla2xxx Add SCSI command jammer/discard capability to the tcm_qla2xxx module
Hi Nicholas Apologies for the top posting, that was in my haste to correct the prior patch that had the typo. When I investigated the attributes it looked like I would have had to create a store and a check function and call the check function each time. That was my lack of understanding of the functionality. I also looked at your example and in my case I needed a way to set the attribute to a number matching the host#. When I tested this I was only able to set boolean values of 1 or 0 for the attributes and the definition of tcm_qla2xxx_tpg_attrib_##name##_store validates that only booleans of 1 or 0 are supported. However after your email I then realized using a boolean on the endpoints below will work here. Thank you for taking the time to show me, it was very helpful. sys]# find . -name jam_host ./kernel/config/target/qla2xxx/21:00:00:24:ff:27:8f:ae/tpgt_1/attrib/jam_host ./kernel/config/target/qla2xxx/21:00:00:24:ff:27:8f:af/tpgt_1/attrib/jam_host I tested this and here are the patches in the format you require. Hopefully this new functionality will be useful for others. I am not set for emailing directly from git. 
Tested by: Laurence Oberman <lober...@redhat.com> Signed-off-by: Laurence Oberman <lober...@redhat.com> --- drivers/scsi/qla2xxx/Kconfig | 11 +++ drivers/scsi/qla2xxx/tcm_qla2xxx.c | 20 drivers/scsi/qla2xxx/tcm_qla2xxx.h |1 + 3 files changed, 32 insertions(+), 0 deletions(-) diff --git a/drivers/scsi/qla2xxx/Kconfig b/drivers/scsi/qla2xxx/Kconfig index 10aa18b..5110fab 100644 --- a/drivers/scsi/qla2xxx/Kconfig +++ b/drivers/scsi/qla2xxx/Kconfig @@ -36,3 +36,14 @@ config TCM_QLA2XXX default n ---help--- Say Y here to enable the TCM_QLA2XXX fabric module for QLogic 24xx+ series target mode HBAs + +config TCM_QLA2XXX_DEBUG + bool "TCM_QLA2XXX fabric module DEBUG mode for QLogic 24xx+ series target mode HBAs" + depends on SCSI_QLA_FC && TARGET_CORE + depends on LIBFC + select BTREE + default n + ---help--- + Say Y here to enable the TCM_QLA2XXX fabric module DEBUG for QLogic 24xx+ series target mode HBAs + This will include code to enable the SCSI command jammer + diff --git a/drivers/scsi/qla2xxx/tcm_qla2xxx.c b/drivers/scsi/qla2xxx/tcm_qla2xxx.c index 1808a01..411a450 100644 --- a/drivers/scsi/qla2xxx/tcm_qla2xxx.c +++ b/drivers/scsi/qla2xxx/tcm_qla2xxx.c @@ -457,6 +457,10 @@ static int tcm_qla2xxx_handle_cmd(scsi_qla_host_t *vha, struct qla_tgt_cmd *cmd, struct se_cmd *se_cmd = &cmd->se_cmd; struct se_session *se_sess; struct qla_tgt_sess *sess; +#ifdef CONFIG_TCM_QLA2XXX_DEBUG +struct se_portal_group *se_tpg; +struct tcm_qla2xxx_tpg *tpg; +#endif int flags = TARGET_SCF_ACK_KREF; if (bidi) @@ -476,6 +480,15 @@ static int tcm_qla2xxx_handle_cmd(scsi_qla_host_t *vha, struct qla_tgt_cmd *cmd, pr_err("Unable to locate active struct se_session\n"); return -EINVAL; } + +#ifdef CONFIG_TCM_QLA2XXX_DEBUG + se_tpg = se_sess->se_tpg; + tpg = container_of(se_tpg,struct tcm_qla2xxx_tpg, se_tpg); + if (unlikely(tpg->tpg_attrib.jam_host)) { + /* return, and dont run target_submit_cmd,discarding command */ +return 0; + } +#endif cmd->vha->tgt_counters.qla_core_sbt_cmd++; return 
target_submit_cmd(se_cmd, se_sess, cdb, &cmd->sense_buffer[0], @@ -844,6 +857,9 @@ DEF_QLA_TPG_ATTRIB(cache_dynamic_acls); DEF_QLA_TPG_ATTRIB(demo_mode_write_protect); DEF_QLA_TPG_ATTRIB(prod_mode_write_protect); DEF_QLA_TPG_ATTRIB(demo_mode_login_only); +#ifdef CONFIG_TCM_QLA2XXX_DEBUG +DEF_QLA_TPG_ATTRIB(jam_host); +#endif static struct configfs_attribute *tcm_qla2xxx_tpg_attrib_attrs[] = { &tcm_qla2xxx_tpg_attrib_attr_generate_node_acls, @@ -851,6 +867,9 @@ static struct configfs_attribute *tcm_qla2xxx_tpg_attrib_attrs[] = { &tcm_qla2xxx_tpg_attrib_attr_demo_mode_write_protect, &tcm_qla2xxx_tpg_attrib_attr_prod_mode_write_protect, &tcm_qla2xxx_tpg_attrib_attr_demo_mode_login_only, +#ifdef CONFIG_TCM_QLA2XXX_DEBUG +&tcm_qla2xxx_tpg_attrib_attr_jam_host, +#endif NULL, }; @@ -1023,6 +1042,7 @@ static struct se_portal_group *tcm_qla2xxx_make_tpg( tpg->tpg_attrib.demo_mode_write_protect = 1; tpg->tpg_attrib.cache_dynamic_acls = 1; tpg->tpg_attrib.demo_mode_login_only = 1; + tpg->tpg_attrib.jam_host = 0; ret = core_tpg_register(wwn, &tpg->se_tpg, SCSI_PROTOCOL_FCP); if (ret < 0) { diff --git a/drivers/scsi/qla2xxx/tcm_qla2xxx.h b/drivers/scsi/qla2xxx/tcm_qla2xxx.h index 3bbf4cb..37e026a 100644 --- a/drivers/scsi/qla2xxx/tcm_qla2xxx.h +++ b/drivers/scsi/qla2xxx/tcm_qla2xxx.h @@ -34,6 +34,7 @@ struct tcm_qla2xxx_tpg_attrib { int prod_mode_write_protect; int demo_mode_l
Re: tcm_qla2xxx Add SCSI command jammer/discard capabilty to the tcm_qla2xxx module - revision3
Hello Himanshu Thanks, I will rework and post back to the thread. Thank you Laurence Oberman Principal Software Maintenance Engineer Red Hat Global Support Services - Original Message - From: "Himanshu Madhani" <himanshu.madh...@qlogic.com> To: "Nicholas A. Bellinger" <n...@linux-iscsi.org>, "Bart Van Assche" <bart.vanass...@sandisk.com> Cc: "Laurence Oberman" <lober...@redhat.com>, "linux-scsi" <linux-scsi@vger.kernel.org>, "target-devel" <target-de...@vger.kernel.org>, "Quinn Tran" <quinn.t...@qlogic.com> Sent: Thursday, March 31, 2016 8:20:56 PM Subject: Re: tcm_qla2xxx Add SCSI command jammer/discard capabilty to the tcm_qla2xxx module - revision3 Hi Nic, Laurence, On 3/30/16, 10:34 PM, "Nicholas A. Bellinger" <n...@linux-iscsi.org> wrote: >(Adding target-devel + Qlogic target folks) > >On Tue, 2016-03-29 at 22:05 -0700, Bart Van Assche wrote: >> On 03/29/16 07:42, Laurence Oberman wrote: >> > I have been using this jammer functionality to continue testing the SCSI >> > F/C drivers and recovery for over a year now. >> > Any chance you would agree to ack this so I can get it in now. >> > I last posted to the list last March and it was not picked up. >> > >> > I did look into moving this to upper layers but I find I use it primarily >> > for fiber channel target testing. >> > Attempting to add this functionality to upper layers led to complexities >> > and this is very solid. >> > >> > This Patch diff against 4.5 >> > >> > I use target LIO for all my storage array test targets and customer >> > problem resolution here at Red Hat. >> > This patch resulted from a requirement to mimic behavior of an expensive >> > hardware jammer for a customer. >> > I have used this for some time with good success to simulate and reproduce >> > latency and slow drain fabric issues and >> > for testing and validating error handling behavior >> > in the Emulex, Qlogic and other F/C drivers. 
>> > >> > Works by checking new parameter jam_host if its >= 0 and matches >> > vha->host_no , jamming is enabled when jam_host >=0 >> > If parameter set to -1 (default) no jamming is enabled. >> >> Hello Laurence, >> >> Nic Bellinger is the maintainer of LIO so my recommendation is to ask >> Nic first about his opinion (I have CC'd Nic). I'm not sure what Nic >> thinks about this but in my opinion such functionality belongs in the >> target core instead of in a target driver. But please wait until Nic has >> provided his opinion before spending more time on this. The mailing list >> for SCSI target patches is target-de...@vger.kernel.org. >> > >So really it's Himanshu's + Quinn's call if they would like to include >something like this in mainline. > >If so, then I'd prefer to do it with a per tcm_qla2xxx_tpg endpoint >attribute instead a new module parameter, and add a new kernel config >option (CONFIG_TCM_QLA2XXX_DEBUG) to disable (by default) so end users >don't inadvertently play with it via targetcli + friends. > I agree here with Nic. The patch does provides benefit and is good addition, but we don’t want to enable it by default. Laurence, Would you be kind to rework patch with suggested changes from Nic and post it. Thanks, Himanshu -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: tcm_qla2xxx Add SCSI command jammer/discard capabilty to the tcm_qla2xxx module - revision4
Hello Himanshu This patch was reworked to only include the jammer code if the parameter TCM_QLA2XXX_DEBUG=Y is set. The default is to not provide this functionality at all. I looked at using attributes but this code is in the fastpath and the overhead or fetching the attribute each time is not a good idea. Control of this needs to be dynamic and the module parameter allows a simple compare in the fastpath. Patch notes I use target LIO for all my storage array test targets and customer problem resolution here at Red Hat. This patch resulted from a requirement to mimic behavior of an expensive hardware jammer for a customer. I have used this for some time with good success to simulate and reproduce latency and slow drain fabric issues and for testing and validating error handling behavior in the Emulex, Qlogic and other F/C drivers. Works by checking new parameter jam_host if its >= 0 and matches vha->host_no , jamming is enabled when jam_host >=0 If parameter set to -1 (default) no jamming is enabled. Tested by: Laurence Oberman <lober...@redhat.com> Signed-off-by: Laurence Oberman <lober...@redhat.com> diff -Nurp linux-4.5/Documentation/scsi/tcm_qla2xxx.txt linux-4.5.new/Documentation/scsi/tcm_qla2xxx.txt --- linux-4.5/Documentation/scsi/tcm_qla2xxx.txt1969-12-31 19:00:00.0 -0500 +++ linux-4.5.new/Documentation/scsi/tcm_qla2xxx.txt2016-04-02 11:36:42.693081232 -0400 @@ -0,0 +1,34 @@ +tcm_qla2xxx jammer parameter usage +-- +There is now a new module parameter added to the tcm_qla2xx module +parm: jam_host:Host to jam >=0 Enable jammer (int) +This parameter and accompanying code is only included if the +Kconfig parameter TCM_QLA2XXX_DEBUG is set to Y +By default this jammer code and functionality is disabled + +Use this parameter to control the discarding of SCSI commands to a selected +host. +This may be useful for testing error handling and simulating slow drain +and other fabric issues. 
+ +Any value >=0 that matches a fc_host # will discard the commands for that host. +Reset back to -1 to stop the jamming. + +Enable host 6 to be jammed +echo 6 > /sys/module/tcm_qla2xxx/parameters/jam_host + +Disable jamming on host 6 +echo -1 > /sys/module/tcm_qla2xxx/parameters/jam_host + +Usage example script: + +#!/bin/bash +sleep_time=120 ### Time to jam for +echo 6 > /sys/module/tcm_qla2xxx/parameters/jam_host +host=`cat /sys/module/tcm_qla2xxx/parameters/jam_host` +echo "We start to discard commands on SCSI host $host" +logger "Jammer started" +sleep $sleep_time +echo -1 > /sys/module/tcm_qla2xxx/parameters/jam_host +echo "We stopped the jammer" +logger "Jammer stopped" diff -Nurp linux-4.5/drivers/scsi/qla2xxx/Kconfig linux-4.5.new/drivers/scsi/qla2xxx/Kconfig --- linux-4.5/drivers/scsi/qla2xxx/Kconfig 2016-03-14 00:28:54.0 -0400 +++ linux-4.5.new/drivers/scsi/qla2xxx/Kconfig 2016-04-02 11:31:15.302516676 -0400 @@ -36,3 +36,13 @@ config TCM_QLA2XXX default n ---help--- Say Y here to enable the TCM_QLA2XXX fabric module for QLogic 24xx+ series target mode HBAs + +config TCM_QLA2XXX_DEBUG + bool "TCM_QLA2XXX fabric module DEBUG mode for QLogic 24xx+ series target mode HBAs" + depends on SCSI_QLA_FC && TARGET_CORE + depends on LIBFC + select BTREE + default n + ---help--- + Say Y here to enable the TCM_QLA2XXX fabric module DEBUG for QLogic 24xx+ series target mode HBAs + This will include code to enable the SCSI command jammer diff -Nurp linux-4.5/drivers/scsi/qla2xxx/tcm_qla2xxx.c linux-4.5.new/drivers/scsi/qla2xxx/tcm_qla2xxx.c --- linux-4.5/drivers/scsi/qla2xxx/tcm_qla2xxx.c2016-03-14 00:28:54.0 -0400 +++ linux-4.5.new/drivers/scsi/qla2xxx/tcm_qla2xxx.c2016-04-02 11:32:35.317410249 -0400 @@ -48,6 +48,12 @@ #include "qla_target.h" #include "tcm_qla2xxx.h" +#ifdef TCM_QLA2XXX_DEBUG +int jam_host = -1; +module_param(jam_host, int, 0644); +MODULE_PARM_DESC(jam_host, "Host to jam >=0 Enable jammer"); +#endif + static struct workqueue_struct 
*tcm_qla2xxx_free_wq; static struct workqueue_struct *tcm_qla2xxx_cmd_wq; @@ -477,6 +483,13 @@ static int tcm_qla2xxx_handle_cmd(scsi_q return -EINVAL; } +#ifdef TCM_QLA2XXX_DEBUG + if (unlikely(vha->host_no == jam_host)) { + /* return, and dont run target_submit_cmd,discarding command */ + return 0; + } +#endif + cmd->vha->tgt_counters.qla_core_sbt_cmd++; return target_submit_cmd(se_cmd, se_sess, cdb, &cmd->sense_buffer[0], cmd->unpacked_lun, data_length, fcp_task_attr, @@ -1967,6 +1980,9 @@ static void tcm_qla2xxx_deregister_confi static int __init tcm_qla2xxx_init(void) { int ret; +#ifdef TCM_QLA2XXX
Re: tcm_qla2xxx Add SCSI command jammer/discard capabilty to the tcm_qla2xxx module - revision4
Hello Himanshu I noticed a typo in the patch I submitted here is the corrected patch. Please ignore the prior patch I was missing the full CONFIG name in the #ifdef check Corrected Patch [root@localhost home]# linux-4.5/scripts/checkpatch.pl jammer_patch.v4 total: 0 errors, 0 warnings, 81 lines checked jammer_patch.v4 has no obvious style problems and is ready for submission. This patch was reworked to only include the jammer code if the parameter TCM_QLA2XXX_DEBUG=Y is set. The default is to not provide this functionality at all. I looked at using attributes but this code is in the fastpath and the overhead or fetching the attribute each time is not a good idea. Control of this needs to be dynamic and the module parameter allows a simple compare in the fastpath. Patch notes I use target LIO for all my storage array test targets and customer problem resolution here at Red Hat. This patch resulted from a requirement to mimic behavior of an expensive hardware jammer for a customer. I have used this for some time with good success to simulate and reproduce latency and slow drain fabric issues and for testing and validating error handling behavior in the Emulex, Qlogic and other F/C drivers. Works by checking new parameter jam_host if its >= 0 and matches vha->host_no , jamming is enabled when jam_host >=0 If parameter set to -1 (default) no jamming is enabled. 
Tested by: Laurence Oberman <lober...@redhat.com> Signed-off-by: Laurence Oberman <lober...@redhat.com> diff -Nurp linux-4.5/Documentation/scsi/tcm_qla2xxx.txt linux-4.5.new/Documentation/scsi/tcm_qla2xxx.txt --- linux-4.5/Documentation/scsi/tcm_qla2xxx.txt1969-12-31 19:00:00.0 -0500 +++ linux-4.5.new/Documentation/scsi/tcm_qla2xxx.txt2016-04-02 11:36:42.693081232 -0400 @@ -0,0 +1,34 @@ +tcm_qla2xxx jammer parameter usage +-- +There is now a new module parameter added to the tcm_qla2xx module +parm: jam_host:Host to jam >=0 Enable jammer (int) +This parameter and accompanying code is only included if the +Kconfig parameter TCM_QLA2XXX_DEBUG is set to Y +By default this jammer code and functionality is disabled + +Use this parameter to control the discarding of SCSI commands to a selected +host. +This may be useful for testing error handling and simulating slow drain +and other fabric issues. + +Any value >=0 that matches a fc_host # will discard the commands for that host. +Reset back to -1 to stop the jamming. 
+ +Enable host 6 to be jammed +echo 6 > /sys/module/tcm_qla2xxx/parameters/jam_host + +Disable jamming on host 6 +echo -1 > /sys/module/tcm_qla2xxx/parameters/jam_host + +Usage example script: + +#!/bin/bash +sleep_time=120 ### Time to jam for +echo 6 > /sys/module/tcm_qla2xxx/parameters/jam_host +host=`cat /sys/module/tcm_qla2xxx/parameters/jam_host` +echo "We start to discard commands on SCSI host $host" +logger "Jammer started" +sleep $sleep_time +echo -1 > /sys/module/tcm_qla2xxx/parameters/jam_host +echo "We stopped the jammer" +logger "Jammer stopped" diff -Nurp linux-4.5/drivers/scsi/qla2xxx/Kconfig linux-4.5.new/drivers/scsi/qla2xxx/Kconfig --- linux-4.5/drivers/scsi/qla2xxx/Kconfig 2016-03-14 00:28:54.0 -0400 +++ linux-4.5.new/drivers/scsi/qla2xxx/Kconfig 2016-04-02 11:31:15.302516676 -0400 @@ -36,3 +36,13 @@ config TCM_QLA2XXX default n ---help--- Say Y here to enable the TCM_QLA2XXX fabric module for QLogic 24xx+ series target mode HBAs + +config TCM_QLA2XXX_DEBUG + bool "TCM_QLA2XXX fabric module DEBUG mode for QLogic 24xx+ series target mode HBAs" + depends on SCSI_QLA_FC && TARGET_CORE + depends on LIBFC + select BTREE + default n + ---help--- + Say Y here to enable the TCM_QLA2XXX fabric module DEBUG for QLogic 24xx+ series target mode HBAs + This will include code to enable the SCSI command jammer diff -Nurp linux-4.5/drivers/scsi/qla2xxx/tcm_qla2xxx.c linux-4.5.new/drivers/scsi/qla2xxx/tcm_qla2xxx.c --- linux-4.5/drivers/scsi/qla2xxx/tcm_qla2xxx.c2016-03-14 00:28:54.0 -0400 +++ linux-4.5.new/drivers/scsi/qla2xxx/tcm_qla2xxx.c2016-04-02 11:32:35.317410249 -0400 @@ -48,6 +48,12 @@ #include "qla_target.h" #include "tcm_qla2xxx.h" +#ifdef CONFIG_TCM_QLA2XXX_DEBUG +int jam_host = -1; +module_param(jam_host, int, 0644); +MODULE_PARM_DESC(jam_host, "Host to jam >=0 Enable jammer"); +#endif + static struct workqueue_struct *tcm_qla2xxx_free_wq; static struct workqueue_struct *tcm_qla2xxx_cmd_wq; @@ -477,6 +483,13 @@ static int tcm_qla2xxx_handle_cmd(scsi_q 
return -EINVAL; } +#ifdef CONFIG_TCM_QLA2XXX_DEBUG + if (unlikely(vha->host_no == jam_host)) { + /* return, and dont run target_submit_cmd,discarding command */ + r
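The usage script from the documentation hunk leaves the jammer enabled if it is interrupted during the sleep. A minimal sketch of a variant that always restores -1 is shown here, assuming the module-parameter form of the patch in this revision. The function name `jam_for` and the `JAM_PARAM` override are hypothetical and exist only so the logic can be exercised against an ordinary file.

```shell
#!/bin/sh
# Sketch only: jam one SCSI host for a fixed time and always restore -1.
# JAM_PARAM is a hypothetical override for dry runs; the default is the
# module parameter introduced by the patch in this thread.
jam_for() {
    host="$1"
    secs="$2"
    param="${JAM_PARAM:-/sys/module/tcm_qla2xxx/parameters/jam_host}"
    case "$host" in
        ''|*[!0-9]*) echo "host must be a non-negative integer" >&2; return 1 ;;
    esac
    # Run in a subshell so the traps cannot leak into the caller.
    (
        # Restore the "no jamming" default however the subshell ends.
        trap 'echo -1 > "$param"' EXIT
        trap 'exit 1' INT TERM
        echo "$host" > "$param" || exit 1
        sleep "$secs"
    )
}
```

For example, `jam_for 6 120` would discard commands on host 6 for two minutes and then stop the jammer, including when the script is interrupted with Ctrl-C.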
Re: [PATCH] tcm_qla2xxx Add SCSI command jammer/discard capability to the tcm_qla2xxx module
Hello Nicholas Its fixed now. Many Thanks. $ scripts/checkpatch.pl 0001-tcm_qla2xxx-Add-SCSI-command-jammer-discard-capabili.patch WARNING: added, moved or deleted file(s), does MAINTAINERS need updating? #12: new file mode 100644 total: 0 errors, 1 warnings, 91 lines checked 0001-tcm_qla2xxx-Add-SCSI-command-jammer-discard-capabili.patch has style problems, please review. NOTE: If any of the errors are false positives, please report them to the maintainer, see CHECKPATCH in MAINTAINERS. Tested by: Laurence Oberman <lober...@redhat.com> Signed-off-by: Laurence Oberman <lober...@redhat.com> --- Documentation/scsi/tcm_qla2xxx.txt | 22 ++ drivers/scsi/qla2xxx/Kconfig |9 + drivers/scsi/qla2xxx/tcm_qla2xxx.c | 20 drivers/scsi/qla2xxx/tcm_qla2xxx.h |1 + 4 files changed, 52 insertions(+), 0 deletions(-) create mode 100644 Documentation/scsi/tcm_qla2xxx.txt diff --git a/Documentation/scsi/tcm_qla2xxx.txt b/Documentation/scsi/tcm_qla2xxx.txt new file mode 100644 index 000..c3a670a --- /dev/null +++ b/Documentation/scsi/tcm_qla2xxx.txt @@ -0,0 +1,22 @@ +tcm_qla2xxx jam_host attribute +-- +There is now a new module endpoint atribute called jam_host +attribute: jam_host: boolean=0/1 +This attribute and accompanying code is only included if the +Kconfig parameter TCM_QLA2XXX_DEBUG is set to Y +By default this jammer code and functionality is disabled + +Use this attribute to control the discarding of SCSI commands to a +selected host. +This may be useful for testing error handling and simulating slow drain +and other fabric issues. + +Setting a boolean of 1 for the jam_host attribute for a particular host + will discard the commands for that host. +Reset back to 0 to stop the jamming. 
+ +Enable host 4 to be jammed +echo 1 > /sys/kernel/config/target/qla2xxx/21:00:00:24:ff:27:8f:ae/tpgt_1/attrib/jam_host + +Disable jamming on host 4 +echo 0 > /sys/kernel/config/target/qla2xxx/21:00:00:24:ff:27:8f:ae/tpgt_1/attrib/jam_host diff --git a/drivers/scsi/qla2xxx/Kconfig b/drivers/scsi/qla2xxx/Kconfig index 10aa18b..67c0d5a 100644 --- a/drivers/scsi/qla2xxx/Kconfig +++ b/drivers/scsi/qla2xxx/Kconfig @@ -36,3 +36,12 @@ config TCM_QLA2XXX default n ---help--- Say Y here to enable the TCM_QLA2XXX fabric module for QLogic 24xx+ series target mode HBAs + +if TCM_QLA2XXX +config TCM_QLA2XXX_DEBUG + bool "TCM_QLA2XXX fabric module DEBUG mode for QLogic 24xx+ series target mode HBAs" + default n + ---help--- + Say Y here to enable the TCM_QLA2XXX fabric module DEBUG for QLogic 24xx+ series target mode HBAs + This will include code to enable the SCSI command jammer +endif diff --git a/drivers/scsi/qla2xxx/tcm_qla2xxx.c b/drivers/scsi/qla2xxx/tcm_qla2xxx.c index 1808a01..948224e 100644 --- a/drivers/scsi/qla2xxx/tcm_qla2xxx.c +++ b/drivers/scsi/qla2xxx/tcm_qla2xxx.c @@ -457,6 +457,10 @@ static int tcm_qla2xxx_handle_cmd(scsi_qla_host_t *vha, struct qla_tgt_cmd *cmd, struct se_cmd *se_cmd = &cmd->se_cmd; struct se_session *se_sess; struct qla_tgt_sess *sess; +#ifdef CONFIG_TCM_QLA2XXX_DEBUG + struct se_portal_group *se_tpg; + struct tcm_qla2xxx_tpg *tpg; +#endif int flags = TARGET_SCF_ACK_KREF; if (bidi) @@ -477,6 +481,15 @@ static int tcm_qla2xxx_handle_cmd(scsi_qla_host_t *vha, struct qla_tgt_cmd *cmd, return -EINVAL; } +#ifdef CONFIG_TCM_QLA2XXX_DEBUG + se_tpg = se_sess->se_tpg; + tpg = container_of(se_tpg, struct tcm_qla2xxx_tpg, se_tpg); + if (unlikely(tpg->tpg_attrib.jam_host)) { + /* return, and dont run target_submit_cmd,discarding command */ + return 0; + } +#endif + cmd->vha->tgt_counters.qla_core_sbt_cmd++; return target_submit_cmd(se_cmd, se_sess, cdb, &cmd->sense_buffer[0], cmd->unpacked_lun, data_length, fcp_task_attr, @@ -844,6 +857,9 @@ 
DEF_QLA_TPG_ATTRIB(cache_dynamic_acls); DEF_QLA_TPG_ATTRIB(demo_mode_write_protect); DEF_QLA_TPG_ATTRIB(prod_mode_write_protect); DEF_QLA_TPG_ATTRIB(demo_mode_login_only); +#ifdef CONFIG_TCM_QLA2XXX_DEBUG +DEF_QLA_TPG_ATTRIB(jam_host); +#endif static struct configfs_attribute *tcm_qla2xxx_tpg_attrib_attrs[] = { _qla2xxx_tpg_attrib_attr_generate_node_acls, @@ -851,6 +867,9 @@ static struct configfs_attribute *tcm_qla2xxx_tpg_attrib_attrs[] = { _qla2xxx_tpg_attrib_attr_demo_mode_write_protect, _qla2xxx_tpg_attrib_attr_prod_mode_write_protect, _qla2xxx_tpg_attrib_attr_demo_mode_login_only, +#ifdef CONFIG_TCM_QLA2XXX_DEBUG + _qla2xxx_tpg_attrib_attr_jam_host, +#endif NULL, }; @@ -1023,6 +1042,7 @@ static struct se_portal_group *tcm_qla2xxx_make_tpg( tpg->tpg_attrib.demo_mode_write_protect = 1;
tcm_qla2xxx Add SCSI command jammer/discard capability to the tcm_qla2xxx module - revision3
Hello Bart,

I have been using this jammer functionality to continue testing the SCSI F/C drivers and recovery for over a year now. Any chance you would agree to ack this so I can get it in now? I last posted it to the list last March and it was not picked up.

I did look into moving this to the upper layers, but I find I use it primarily for Fibre Channel target testing. Attempting to add this functionality to the upper layers led to complexities, and this version is very solid.

This patch is a diff against 4.5.

I use target LIO for all my storage array test targets and customer problem resolution here at Red Hat. This patch resulted from a requirement to mimic the behavior of an expensive hardware jammer for a customer. I have used it for some time with good success to simulate and reproduce latency and slow-drain fabric issues, and for testing and validating error handling behavior in the Emulex, QLogic and other F/C drivers.

It works by checking the new parameter jam_host: jamming is enabled when jam_host is >= 0 and matches vha->host_no. If the parameter is set to -1 (the default), no jamming is enabled.

Tested-by: Laurence Oberman <lober...@redhat.com>
Signed-off-by: Laurence Oberman <lober...@redhat.com>

diff -Nurp linux-4.5.orig/Documentation/scsi/tcm_qla2xxx.txt linux-4.5/Documentation/scsi/tcm_qla2xxx.txt
--- linux-4.5.orig/Documentation/scsi/tcm_qla2xxx.txt 1969-12-31 19:00:00.0 -0500
+++ linux-4.5/Documentation/scsi/tcm_qla2xxx.txt 2016-03-29 10:08:57.455761389 -0400
@@ -0,0 +1,31 @@
+tcm_qla2xxx jammer parameter usage
+--
+There is now a new module parameter added to the tcm_qla2xxx module
+parm: jam_host:Host to jam >=0 Enable jammer (int)
+
+Use this parameter to control the discarding of SCSI commands to a selected
+host.
+This may be useful for testing error handling and simulating slow drain
+and other fabric issues.
+
+Any value >=0 that matches a fc_host # will discard the commands for that host.
+Reset back to -1 to stop the jamming.
+
+Enable host 6 to be jammed
+echo 6 > /sys/module/tcm_qla2xxx/parameters/jam_host
+
+Disable jamming on host 6
+echo -1 > /sys/module/tcm_qla2xxx/parameters/jam_host
+
+Usage example script:
+
+#!/bin/bash
+sleep_time=120 ### Time to jam for
+echo 6 > /sys/module/tcm_qla2xxx/parameters/jam_host
+host=`cat /sys/module/tcm_qla2xxx/parameters/jam_host`
+echo "We start to discard commands on SCSI host $host"
+logger "Jammer started"
+sleep $sleep_time
+echo -1 > /sys/module/tcm_qla2xxx/parameters/jam_host
+echo "We stopped the jammer"
+logger "Jammer stopped"
diff -Nurp linux-4.5.orig/drivers/scsi/qla2xxx/tcm_qla2xxx.c linux-4.5/drivers/scsi/qla2xxx/tcm_qla2xxx.c
--- linux-4.5.orig/drivers/scsi/qla2xxx/tcm_qla2xxx.c 2016-03-14 00:28:54.0 -0400
+++ linux-4.5/drivers/scsi/qla2xxx/tcm_qla2xxx.c 2016-03-29 10:10:09.677298099 -0400
@@ -48,6 +48,10 @@
 #include "qla_target.h"
 #include "tcm_qla2xxx.h"
 
+int jam_host = -1;
+module_param(jam_host, int, 0644);
+MODULE_PARM_DESC(jam_host, "Host to jam >=0 Enable jammer");
+
 static struct workqueue_struct *tcm_qla2xxx_free_wq;
 static struct workqueue_struct *tcm_qla2xxx_cmd_wq;
 
@@ -477,6 +481,11 @@ static int tcm_qla2xxx_handle_cmd(scsi_q
 		return -EINVAL;
 	}
 
+	if (unlikely(vha->host_no == jam_host)) {
+		/* return, and don't run target_submit_cmd, discarding command */
+		return 0;
+	}
+
 	cmd->vha->tgt_counters.qla_core_sbt_cmd++;
 	return target_submit_cmd(se_cmd, se_sess, cdb, &cmd->sense_buffer[0],
 				cmd->unpacked_lun, data_length, fcp_task_attr,
@@ -1967,6 +1976,7 @@ static void tcm_qla2xxx_deregister_confi
 static int __init tcm_qla2xxx_init(void)
 {
 	int ret;
+	jam_host = -1;
 
 	ret = tcm_qla2xxx_register_configfs();
 	if (ret < 0)

Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
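The check this patch adds to tcm_qla2xxx_handle_cmd() can be modelled in userspace roughly as follows. This is only a sketch, not kernel code: the helper name jammer_should_discard() is hypothetical, and treating the host number as unsigned long is an assumption made here. The explicit test for negative values spells out what the patch appears to rely on implicitly, namely that the default of -1 never equals a real host number.

```c
#include <stdbool.h>

/*
 * Userspace model of the jammer check (hypothetical helper, not kernel code).
 * host_no  : SCSI host number of the port that received the command
 * jam_host : value of the jam_host module parameter, -1 meaning "disabled"
 */
bool jammer_should_discard(unsigned long host_no, int jam_host)
{
	/* A negative parameter disables the jammer for every host. */
	if (jam_host < 0)
		return false;

	/* Discard only commands that arrived on the selected host. */
	return host_no == (unsigned long)jam_host;
}
```

With jam_host set to 6, a command arriving on host 6 would be dropped before target_submit_cmd() is reached, while commands on any other host would be submitted as usual.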
Re: [PATCH] qla1280: Reduce can_queue to 32
The change looks fine. I see it's hard-coded to 32 in qla1280_set_defaults().
Would it be better to create a #define, like other drivers do, and use that in both places?
Also, did the patch below resolve this for the bug reporter?
I ask because when I check 4.3, it was also set to the same value of 0xf, and that is reported as working.
So other changes in 4.4 must be "abusing" this high value.

Reviewed-by: Laurence Oberman <lober...@redhat.com>

Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services

----- Original Message -----
From: "Johannes Thumshirn" <jthumsh...@suse.de>
To: "Martin K . Petersen" <martin.peter...@oracle.com>, "James E . J . Bottomley" <j...@linux.vnet.ibm.com>
Cc: linux-scsi@vger.kernel.org, linux-ker...@vger.kernel.org, "Johannes Thumshirn" <jthumsh...@suse.de>, "Laura Abbott" <labb...@redhat.com>, "Michael Reed" <m...@sgi.com>, sta...@vger.kernel.org
Sent: Friday, April 22, 2016 3:31:10 AM
Subject: [PATCH] qla1280: Reduce can_queue to 32

It was reported in https://bugzilla.redhat.com/show_bug.cgi?id=1321033,
that the qla1280 driver sets the scsi_host_template's can_queue field
to 0xf which results in an allocation failure when allocating the
block layer tags for the driver's queues like the one shown below:

[4.804166] scsi host0: QLogic QLA1040 PCI to SCSI Host Adapter Firmware version: 7.65.06, Driver version 3.27.1
[4.804174] [ cut here ]
[4.804184] WARNING: CPU: 2 PID: 305 at mm/page_alloc.c:2989 alloc_pages_nodemask+0xae8/0xbc0()
[4.804186] Modules linked in: amdkfd amd_iommu_v2 radeon i2c_algo_bit m_kms_helper ttm drm megaraid_sas serio_raw 8021q garp bnx2 stp llc mrp nhme qla1280(+) fjes
[4.804208] CPU: 2 PID: 305 Comm: systemd-udevd Not tainted 4.6-201.fc22.x86_64 #1
[4.804210] Hardware name: Google Enterprise Search Appliance/0DT021, OS 1.1.2 08/14/2006
[4.804212] 0286 2f01064c 88042985b710 ff813b542e
[4.804216] 81a75024 88042985b748 ff810a40f2
[4.804220] 000b 00
[4.804223] Call Trace:
[4.804231] [] dump_stack+0x63/0x85
[4.804236] [] warn_slowpath_common+0x82/0xc0 [4.804239] [] warn_slowpath_null+0x1a/0x20 [4.804242] [] __alloc_pages_nodemask+0xae8/0xbc0 [4.804247] [] ? _raw_spin_unlock_irqrestore+0xe/0x10 [4.804251] [] ? irq_work_queue+0x8e/0xa0 [4.804256] [] ? console_unlock+0x20a/0x540 [4.804262] [] alloc_pages_current+0x8c/0x110 [4.804265] [] alloc_kmem_pages+0x19/0x90 [4.804268] [] kmalloc_order_trace+0x2e/0xe0 [4.804272] [] __kmalloc+0x232/0x260 [4.804277] [] init_tag_map+0x3d/0xc0 [4.804290] [] __blk_queue_init_tags+0x45/0x80 [4.804293] [] blk_init_tags+0x14/0x20 [4.804298] [] scsi_add_host_with_dma+0x80/0x300 [4.804305] [] qla1280_probe_one+0x683/0x9ef [qla1280] [4.804309] [] local_pci_probe+0x45/0xa0 [4.804312] [] pci_device_probe+0xfd/0x140 [4.804316] [] driver_probe_device+0x222/0x490 [4.804319] [] __driver_attach+0x84/0x90 [4.804321] [] ? driver_probe_device+0x490/0x490 [4.804324] [] bus_for_each_dev+0x6c/0xc0 [4.804326] [] driver_attach+0x1e/0x20 [4.804328] [] bus_add_driver+0x1eb/0x280 [4.804331] [] ? 0xa0015000 [4.804333] [] driver_register+0x60/0xe0 [4.804336] [] __pci_register_driver+0x4c/0x50 [4.804339] [] qla1280_init+0x1ce/0x1000 [qla1280] [4.804341] [] ? 0xa0015000 [4.804345] [] do_one_initcall+0xb3/0x200 [4.804348] [] ? kmem_cache_alloc_trace+0x196/0x210 [4.804352] [] ? do_init_module+0x27/0x1cb [4.804354] [] do_init_module+0x5f/0x1cb [4.804358] [] load_module+0x2040/0x2680 [4.804360] [] ? __symbol_put+0x60/0x60 [4.804363] [] SYSC_init_module+0x149/0x190 [4.804366] [] SyS_init_module+0xe/0x10 [4.804369] [] entry_SYSCALL_64_fastpath+0x12/0x71 [4.804371] ---[ end trace 0ea3b625f86705f7 ]--- [4.804581] qla1280: probe of :11:04.0 failed with error -12 In qla1280_set_defaults() the maximum queue depth is set to 32 so adopt the scsi_host_template to it as well. 
Signed-off-by: Johannes Thumshirn <jthumsh...@suse.de> Cc: Laura Abbott <labb...@redhat.com> Cc: Michael Reed <m...@sgi.com> Cc: sta...@vger.kernel.org --- drivers/scsi/qla1280.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/scsi/qla1280.c b/drivers/scsi/qla1280.c index 5d0ec42..6bd748e 100644 --- a/drivers/scsi/qla1280.c +++ b/drivers/scsi/qla1280.c @@ -4214,7 +4214,7 @@ static struct scsi_host_template qla1280_driver_template = { .eh_bus_reset_handler = qla1280_eh_bus_reset, .eh_host_reset_handler = qla1280_eh_adapter_reset, .bios_param = qla128
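The suggestion made in the reply at the top of this message, a single #define shared by the host template and qla1280_set_defaults(), could look roughly like the sketch below. It is a userspace model only: the macro name QLA1280_MAX_QUEUE_DEPTH and both struct types are hypothetical stand-ins, since the thread does not say what the driver would actually call them.

```c
/* Hypothetical macro name; the thread only proposes "a #define". */
#define QLA1280_MAX_QUEUE_DEPTH 32

/* Stand-in for the one scsi_host_template field discussed in the thread. */
struct host_template_model {
	int can_queue;
};

/* Stand-in for the per-adapter defaults set in qla1280_set_defaults(). */
struct adapter_defaults_model {
	int max_queue_depth;
};

/* The template takes its limit from the shared constant ... */
const struct host_template_model qla1280_template_model = {
	.can_queue = QLA1280_MAX_QUEUE_DEPTH,
};

/* ... and so do the defaults, so the two values cannot drift apart. */
void set_defaults_model(struct adapter_defaults_model *d)
{
	d->max_queue_depth = QLA1280_MAX_QUEUE_DEPTH;
}
```

The point of the shared constant is that a later change to the queue depth only has to be made in one place.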
Re: [PATCH] qla1280: Reduce can_queue to 32
Johannes,

OK, yes, thanks for pointing out the commit.

Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services

----- Original Message -----
From: "Johannes Thumshirn" <jthmsh...@suse.de>
To: "Laurence Oberman" <lober...@redhat.com>
Cc: "Johannes Thumshirn" <jthumsh...@suse.de>, "Martin K . Petersen" <martin.peter...@oracle.com>, "James E . J . Bottomley" <j...@linux.vnet.ibm.com>, linux-scsi@vger.kernel.org, linux-ker...@vger.kernel.org, "Laura Abbott" <labb...@redhat.com>, "Michael Reed" <m...@sgi.com>, sta...@vger.kernel.org
Sent: Friday, April 22, 2016 10:01:48 AM
Subject: Re: [PATCH] qla1280: Reduce can_queue to 32

On Fri, Apr 22, 2016 at 08:16:44AM -0400, Laurence Oberman wrote:
> The change looks fine, I see its hard-coded to 32 in qla1280_set_defaults()
> Would it be better to create a #define like other drivers and use that in
> both.
> Also did the below patch resolve this for the bug reporter.

Yes, that's probably a reasonable idea, I'll re-send.

> I ask because if I check 4.3 it was also set to the same value of 0xf and
> that is reported as working.
> So other changes in 4.4 must be "abusing" this high value.

I think it was introduced with commit 64d513ac31b - "scsi: use host wide tags by default". Since this commit scsi_add_host_with_dma() directly calls blk_init_tags() instead of scsi_init_shared_tag_map(). The qla1280 driver has never set up the block tags though, so the bogus value was not a problem.

That at least is my analysis, feel free to correct my assumptions.

>
> Reviewed-by Laurence Oberman <lober...@redhat.com>
>
> Laurence Oberman
> Principal Software Maintenance Engineer
> Red Hat Global Support Services
>
> - Original Message -
> From: "Johannes Thumshirn" <jthumsh...@suse.de>
> To: "Martin K . Petersen" <martin.peter...@oracle.com>, "James E . J .
> Bottomley" <j...@linux.vnet.ibm.com> > Cc: linux-scsi@vger.kernel.org, linux-ker...@vger.kernel.org, "Johannes > Thumshirn" <jthumsh...@suse.de>, "Laura Abbott" <labb...@redhat.com>, > "Michael Reed" <m...@sgi.com>, sta...@vger.kernel.org > Sent: Friday, April 22, 2016 3:31:10 AM > Subject: [PATCH] qla1280: Reduce can_queue to 32 > > It was reported in https://bugzilla.redhat.com/show_bug.cgi?id=1321033, > that the qla1280 driver sets the scsi_host_template's can_queue field > to 0xf which results in an allocation failure when allocating the > block layer tags for the driver's queues like the one shown below: > > [4.804166] scsi host0: QLogic QLA1040 PCI to SCSI Host Adapter Firmware > version: 7.65.06, Driver version 3.27.1 > [4.804174] [ cut here ] > [4.804184] WARNING: CPU: 2 PID: 305 at mm/page_alloc.c:2989 > alloc_pages_nodemask+0xae8/0xbc0() > [4.804186] Modules linked in: amdkfd amd_iommu_v2 radeon i2c_algo_bit > m_kms_helper ttm drm megaraid_sas serio_raw 8021q garp bnx2 stp llc mrp nhme > qla1280(+) fjes > [4.804208] CPU: 2 PID: 305 Comm: systemd-udevd Not tainted > 4.6-201.fc22.x86_64 #1 > [4.804210] Hardware name: Google Enterprise Search Appliance/0DT021, OS > 1.1.2 08/14/2006 > [4.804212] 0286 2f01064c 88042985b710 > ff813b542e > [4.804216] 81a75024 88042985b748 > ff810a40f2 > [4.804220] 000b > 00 > [4.804223] Call Trace: > [4.804231] [] dump_stack+0x63/0x85 > [4.804236] [] warn_slowpath_common+0x82/0xc0 > [4.804239] [] warn_slowpath_null+0x1a/0x20 > [4.804242] [] __alloc_pages_nodemask+0xae8/0xbc0 > [4.804247] [] ? _raw_spin_unlock_irqrestore+0xe/0x10 > [4.804251] [] ? irq_work_queue+0x8e/0xa0 > [4.804256] [] ? 
console_unlock+0x20a/0x540 > [4.804262] [] alloc_pages_current+0x8c/0x110 > [4.804265] [] alloc_kmem_pages+0x19/0x90 > [4.804268] [] kmalloc_order_trace+0x2e/0xe0 > [4.804272] [] __kmalloc+0x232/0x260 > [4.804277] [] init_tag_map+0x3d/0xc0 > [4.804290] [] __blk_queue_init_tags+0x45/0x80 > [4.804293] [] blk_init_tags+0x14/0x20 > [4.804298] [] scsi_add_host_with_dma+0x80/0x300 > [4.804305] [] qla1280_probe_one+0x683/0x9ef [qla1280] > [4.804309] [] local_pci_probe+0x45/0xa0 > [4.804312] [] pci_device_probe+0xfd/0x140 > [4.804316] [] driver_probe_device+0x222/0x490 > [4.804319] [] __driver_attach+0x84/0x90 > [4.804321] [] ? driver_probe_device+0x490/0x490 > [4.804324] [] bus_for_each_dev+0x6c/0xc0 > [4
Re: [PREEMPT-RT] [PATCH v2] scsi/fcoe: convert to kworker
I have FCoE for testing. I will pull this in next week and test it.

Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services

----- Original Message -----
From: "James Bottomley" <j...@linux.vnet.ibm.com>
To: "Sebastian Andrzej Siewior" <bige...@linutronix.de>, "Christoph Hellwig" <h...@infradead.org>
Cc: linux-scsi@vger.kernel.org, "Martin K. Petersen" <martin.peter...@oracle.com>, "Vasu Dev" <vasu@intel.com>, r...@linutronix.de, fcoe-de...@open-fcoe.org, "Chad Dupuis" <chad.dup...@qlogic.com>
Sent: Friday, April 22, 2016 11:49:45 AM
Subject: Re: [PREEMPT-RT] [PATCH v2] scsi/fcoe: convert to kworker

On Fri, 2016-04-22 at 17:27 +0200, Sebastian Andrzej Siewior wrote:
> On 04/12/2016 05:16 PM, Sebastian Andrzej Siewior wrote:
> > The driver creates its own per-CPU threads which are updated based on
> > CPU hotplug events. It is also possible to use kworkers and remove some
> > of the kthread infrastructure.
> >
> > The code checked ->thread to decide if there is an active per-CPU
> > thread. By using the kworker infrastructure this is no longer possible (or
> > required). The thread pointer is saved in `kthread' instead of `thread' so
> > anything trying to use thread is caught by the compiler. Currently only the
> > bnx2fc driver is using struct fcoe_percpu_s and the kthread member.
> >
> > After a CPU went offline, we may still enqueue items on the "offline"
> > CPU. This isn't much of a problem. The work will be done on a random
> > CPU. The allocated crc_eof_page page won't be cleaned up. It is probably
> > expected that the CPU comes up at some point so it should not be a
> > problem. The crc_eof_page memory is released of course once the module is
> > removed.
> >
> > This patch was only compile-tested due to -ENODEV.
> >
> > Cc: Vasu Dev <vasu@intel.com>
> > Cc: "James E.J. Bottomley" <j...@linux.vnet.ibm.com>
> > Cc: "Martin K. Petersen" <martin.peter...@oracle.com>
> > Cc: Christoph Hellwig <h...@lst.de>
> > Cc: fcoe-de...@open-fcoe.org
> > Cc: linux-scsi@vger.kernel.org
> > Signed-off-by: Sebastian Andrzej Siewior <bige...@linutronix.de>
> > ---
> > v1…v2: use kworker instead of smbthread as per hch
> >
> > If you want this I would do the same for the two bnx drivers.
>
> *ping*

Ping what? You've sent in an untested patch that looks to be a big change. It's definitely not going in until it's tested. Why don't you see if you can recruit an FCoE person to your cause and get them to test it.

It looks like you're looking for testing on bnx2fc, correct? In which case cc'ing a bnx2fc person might have been helpful (cc added).

James
Re: [dm-devel] [Lsf] Notes from the four separate IO track sessions at LSF/MM
Hello Bart,

It took around 300s before the paths were declared hard failed and the devices were offlined. That is when I/O restarts. The remaining paths on the second QLogic port (the ones that are not jammed) will not be used until the error handler activity completes.

Until we get messages like these, for example, and device-mapper starts declaring paths down, we are blocked.

Apr 29 17:20:51 localhost kernel: sd 1:0:1:0: Device offlined - not ready after error recovery
Apr 29 17:20:51 localhost kernel: sd 1:0:1:13: Device offlined - not ready after error recovery

Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services

----- Original Message -----
From: "Bart Van Assche" <bart.vanass...@sandisk.com>
To: "Laurence Oberman" <lober...@redhat.com>
Cc: "James Bottomley" <james.bottom...@hansenpartnership.com>, "linux-scsi" <linux-scsi@vger.kernel.org>, "Mike Snitzer" <snit...@redhat.com>, linux-bl...@vger.kernel.org, "device-mapper development" <dm-de...@redhat.com>, l...@lists.linux-foundation.org
Sent: Friday, April 29, 2016 8:36:22 PM
Subject: Re: [dm-devel] [Lsf] Notes from the four separate IO track sessions at LSF/MM

On 04/29/2016 02:47 PM, Laurence Oberman wrote:
> Recovery with 21 LUNS is 300s that have in-flights to abort.
> [ ... ]
> eh_deadline is set to 10 on the 2 qlogic ports, eh_timeout is set
> to 10 for all devices. In multipath fast_io_fail_tmo=5
>
> I jam one of the target array ports and discard the commands
> effectively black-holing the commands and leave it that way until
> we recover and I watch the I/O. The recovery takes around 300s even
> with all the tuning and this effectively lands up in Oracle cluster
> evictions.

Hello Laurence,

This discussion started as a discussion about the time needed to fail over from one path to another. How long did it take in your test before I/O failed over from the jammed port to another port?

Thanks,

Bart.
Re: [dm-devel] [Lsf] Notes from the four separate IO track sessions at LSF/MM
Hello Bart,

This is when a subset of the paths fails. As you know, the remaining path won't be used until the eh_handler is either done or is short-circuited.

What I will do is set this up via my jammer and capture a test using the latest upstream. Of course my customer pain points are all in the RHEL kernels, so I need to capture a recovery trace on the latest upstream kernel.

When the SCSI commands for a path are black-holed and remain that way, even with eh_deadline and the short-circuited adapter resets we simply try again and get back into the wait loop until we finally declare the device offline. This can take a while and differs depending on QLogic, Emulex, fnic, etc.

First thing tomorrow I will set this up and show you what I mean.

Thanks!!

Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services

----- Original Message -----
From: "Bart Van Assche" <bart.vanass...@sandisk.com>
To: "Laurence Oberman" <lober...@redhat.com>
Cc: linux-bl...@vger.kernel.org, "linux-scsi" <linux-scsi@vger.kernel.org>, "Mike Snitzer" <snit...@redhat.com>, "James Bottomley" <james.bottom...@hansenpartnership.com>, "device-mapper development" <dm-de...@redhat.com>, l...@lists.linux-foundation.org
Sent: Thursday, April 28, 2016 12:41:26 PM
Subject: Re: [dm-devel] [Lsf] Notes from the four separate IO track sessions at LSF/MM

On 04/28/2016 09:23 AM, Laurence Oberman wrote:
> We still suffer from periodic complaints in our large customer base
> regarding the long recovery times for dm-multipath.
> Most of the time this is when we have something like a switch
> back-plane issue or an issue where RSCN'S are blocked coming back up
> the fabric. Corner cases still bite us often.
>
> Most of the complaints originate from customers for example seeing
> Oracle cluster evictions where during the waiting on the mid-layer
> all mpath I/O is blocked until recovery.
>
> We have to tune eh_deadline, eh_timeout and fast_io_fail_tmo but
> even tuning those we have to wait on serial recovery even if we
> set the timeouts low.
>
> Lately we have been living with
> eh_deadline=10
> eh_timeout=5
> fast_io_fail_tmo=10
> leaving default sd timeout at 30s
>
> So this continues to be an issue and I have specific examples using
> the jammer I can provide showing the serial recovery times here.

Hello Laurence,

The long recovery times you refer to, is that for a scenario where all paths failed or for a scenario where some paths failed and other paths are still working? In the latter case, how long does it take before dm-multipath fails over to another path?

Thanks,

Bart.
Re: [Lsf] Notes from the four separate IO track sessions at LSF/MM
Hello Folks,

We still suffer from periodic complaints in our large customer base regarding the long recovery times for dm-multipath. Most of the time this is when we have something like a switch back-plane issue or an issue where RSCNs are blocked coming back up the fabric. Corner cases still bite us often.

Most of the complaints originate from customers seeing, for example, Oracle cluster evictions where, while waiting on the mid-layer, all mpath I/O is blocked until recovery.

We have to tune eh_deadline, eh_timeout and fast_io_fail_tmo, but even with those tuned we have to wait on serial recovery, even if we set the timeouts low.

Lately we have been living with
eh_deadline=10
eh_timeout=5
fast_io_fail_tmo=10
leaving the default sd timeout at 30s.

So this continues to be an issue, and I can provide specific examples using the jammer that show the serial recovery times.

Thanks

Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services

----- Original Message -----
From: "Bart Van Assche" <bart.vanass...@sandisk.com>
To: "James Bottomley" <james.bottom...@hansenpartnership.com>, "Mike Snitzer" <snit...@redhat.com>
Cc: linux-bl...@vger.kernel.org, l...@lists.linux-foundation.org, "device-mapper development" <dm-de...@redhat.com>, "linux-scsi" <linux-scsi@vger.kernel.org>
Sent: Thursday, April 28, 2016 11:53:50 AM
Subject: Re: [Lsf] Notes from the four separate IO track sessions at LSF/MM

On 04/28/2016 08:40 AM, James Bottomley wrote:
> Well, the entire room, that's vendors, users and implementors
> complained that path failover takes far too long. I think in their
> minds this is enough substance to go on.

The only complaints I heard about path failover taking too long came from people working on FC drivers. Aren't SCSI transport layer implementations expected to fail I/O after fast_io_fail_tmo expired instead of waiting until the SCSI error handler has finished? If so, why is it considered an issue that error handling for the FC protocol can take very long (hours)?

Thanks,

Bart.
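The error-handler timeouts listed at the top of this message are normally applied through sysfs. The helper below is a minimal sketch of writing one such value; the function name write_sysfs_attr() is hypothetical, and the paths mentioned afterwards assume the usual sysfs layout with placeholder host and device names.

```c
#include <stdio.h>

/*
 * Write a decimal value to a sysfs-style attribute file.
 * Returns 0 on success, -1 if the file cannot be opened or written.
 */
int write_sysfs_attr(const char *path, int value)
{
	FILE *f = fopen(path, "w");
	int ok;

	if (!f)
		return -1;

	ok = fprintf(f, "%d\n", value) > 0;

	/* stdio buffers the write, so a rejected value may only show up here. */
	if (fclose(f) != 0)
		ok = 0;

	return ok ? 0 : -1;
}
```

Under those assumptions, eh_deadline=10 would correspond to write_sysfs_attr("/sys/class/scsi_host/host1/eh_deadline", 10) and eh_timeout=5 to write_sysfs_attr("/sys/block/sdb/device/eh_timeout", 5), run as root for each host and device. fast_io_fail_tmo is usually set in multipath.conf or on the FC remote port, so it is left out here.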
Re: [PATCH] scsi: use spinlock instead of mutex for RCU-protected VPD inquiry data
----- Original Message -----
> From: "Ewan D. Milne" <emi...@redhat.com>
> To: linux-scsi@vger.kernel.org
> Sent: Friday, May 20, 2016 8:56:14 AM
> Subject: [PATCH] scsi: use spinlock instead of mutex for RCU-protected VPD inquiry data
>
> From: "Ewan D. Milne" <emi...@redhat.com>
>
> A spinlock is sufficient for this purpose, and much smaller.
>
> Signed-off-by: Ewan D. Milne <emi...@redhat.com>
> ---
>  drivers/scsi/scsi.c| 8
>  drivers/scsi/scsi_scan.c | 2 +-
>  include/scsi/scsi_device.h | 2 +-
>  3 files changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
> index 1deb6ad..330d807 100644
> --- a/drivers/scsi/scsi.c
> +++ b/drivers/scsi/scsi.c
> @@ -829,11 +829,11 @@ retry_pg80:
>  			kfree(vpd_buf);
>  			goto retry_pg80;
>  		}
> -		mutex_lock(&sdev->inquiry_mutex);
> +		spin_lock(&sdev->inquiry_lock);
>  		orig_vpd_buf = sdev->vpd_pg80;
>  		sdev->vpd_pg80_len = result;
>  		rcu_assign_pointer(sdev->vpd_pg80, vpd_buf);
> -		mutex_unlock(&sdev->inquiry_mutex);
> +		spin_unlock(&sdev->inquiry_lock);
>  		synchronize_rcu();
>  		if (orig_vpd_buf) {
>  			kfree(orig_vpd_buf);
> @@ -858,11 +858,11 @@ retry_pg83:
>  			kfree(vpd_buf);
>  			goto retry_pg83;
>  		}
> -		mutex_lock(&sdev->inquiry_mutex);
> +		spin_lock(&sdev->inquiry_lock);
>  		orig_vpd_buf = sdev->vpd_pg83;
>  		sdev->vpd_pg83_len = result;
>  		rcu_assign_pointer(sdev->vpd_pg83, vpd_buf);
> -		mutex_unlock(&sdev->inquiry_mutex);
> +		spin_unlock(&sdev->inquiry_lock);
>  		synchronize_rcu();
>  		if (orig_vpd_buf)
>  			kfree(orig_vpd_buf);
> diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
> index e0a78f5..f445615 100644
> --- a/drivers/scsi/scsi_scan.c
> +++ b/drivers/scsi/scsi_scan.c
> @@ -240,7 +240,7 @@ static struct scsi_device *scsi_alloc_sdev(struct scsi_target *starget,
>  	INIT_LIST_HEAD(&sdev->starved_entry);
>  	INIT_LIST_HEAD(&sdev->event_list);
>  	spin_lock_init(&sdev->list_lock);
> -	mutex_init(&sdev->inquiry_mutex);
> +	spin_lock_init(&sdev->inquiry_lock);
>  	INIT_WORK(&sdev->event_work, scsi_evt_thread);
>  	INIT_WORK(&sdev->requeue_work, scsi_requeue_run_queue);
>
> diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h
> index a6c346d..0410ed8 100644
> --- a/include/scsi/scsi_device.h
> +++ b/include/scsi/scsi_device.h
> @@ -115,7 +115,7 @@ struct scsi_device {
>  	char type;
>  	char scsi_level;
>  	char inq_periph_qual;	/* PQ from INQUIRY data */
> -	struct mutex inquiry_mutex;
> +	spinlock_t inquiry_lock;
>  	unsigned char inquiry_len;	/* valid bytes in 'inquiry' */
>  	unsigned char * inquiry;	/* INQUIRY response data */
>  	const char * vendor;		/* [back_compat] point into 'inquiry' ... */
> --
> 2.1.0
>

Looks fine to me.

Reviewed-by: Laurence Oberman <lober...@redhat.com>
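The quoted patch keeps only a pointer swap and a length update inside the locked region, and calls synchronize_rcu() after the unlock, which is presumably why a non-sleeping spinlock is enough. The userspace sketch below illustrates that shape with C11 atomics. It is an analogy rather than kernel code: struct vpd_slot and both functions are hypothetical, an atomic_flag stands in for spinlock_t, and a release store stands in for rcu_assign_pointer().

```c
#include <stdatomic.h>
#include <stddef.h>

struct vpd_slot {
	atomic_flag lock;              /* stand-in for spinlock_t inquiry_lock */
	_Atomic(unsigned char *) buf;  /* stand-in for the RCU-protected pointer */
	int len;                       /* written only while holding the lock */
};

void vpd_slot_init(struct vpd_slot *s)
{
	atomic_flag_clear(&s->lock);
	atomic_init(&s->buf, NULL);
	s->len = 0;
}

/*
 * Publish new_buf and return the previous buffer. Nothing in the locked
 * region can block, so spinning is acceptable. The caller must free the
 * returned buffer only once no reader can still hold it (the kernel code
 * waits with synchronize_rcu() before kfree()).
 */
unsigned char *vpd_slot_swap(struct vpd_slot *s, unsigned char *new_buf, int new_len)
{
	unsigned char *old;

	while (atomic_flag_test_and_set_explicit(&s->lock, memory_order_acquire))
		; /* spin until the lock is free */

	old = atomic_load_explicit(&s->buf, memory_order_relaxed);
	s->len = new_len;
	atomic_store_explicit(&s->buf, new_buf, memory_order_release);

	atomic_flag_clear_explicit(&s->lock, memory_order_release);
	return old;
}
```

The lock here only serializes writers against each other; readers never take it, which matches the role the lock appears to play in the patch.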
Re: [dm-devel] [Lsf] Notes from the four separate IO track sessions at LSF/MM
st multipathd: mpathi: sdbb - path offline Apr 29 17:21:18 localhost multipathd: checker failed path 67:80 in map mpathi Apr 29 17:21:18 localhost multipathd: mpathi: remaining active paths: 2 Apr 29 17:21:18 localhost multipathd: mpatho: sdbr - path offline Apr 29 17:21:18 localhost multipathd: checker failed path 68:80 in map mpatho Apr 29 17:21:18 localhost multipathd: mpatho: remaining active paths: 2 Apr 29 17:21:18 localhost multipathd: mpathq: sdbp - path offline Apr 29 17:21:18 localhost multipathd: checker failed path 68:48 in map mpathq Apr 29 17:21:18 localhost multipathd: mpathq: remaining active paths: 2 Apr 29 17:21:18 localhost multipathd: mpathv: sdbz - path offline Apr 29 17:21:18 localhost multipathd: checker failed path 68:208 in map mpathv Apr 29 17:21:18 localhost multipathd: mpathv: remaining active paths: 2 Apr 29 17:21:18 localhost multipathd: mpatht: sdbl - path offline Apr 29 17:21:18 localhost multipathd: checker failed path 67:240 in map mpatht Apr 29 17:21:18 localhost multipathd: mpatht: remaining active paths: 2 Apr 29 17:21:18 localhost kernel: device-mapper: multipath: Failing path 66:224. Apr 29 17:21:18 localhost kernel: device-mapper: multipath: Failing path 68:176. Apr 29 17:21:18 localhost kernel: device-mapper: multipath: Failing path 67:208. Apr 29 17:21:18 localhost kernel: device-mapper: multipath: Failing path 67:176. Apr 29 17:21:18 localhost kernel: device-mapper: multipath: Failing path 67:144. Apr 29 17:21:18 localhost kernel: device-mapper: multipath: Failing path 67:112. Apr 29 17:21:18 localhost kernel: device-mapper: multipath: Failing path 67:80. Apr 29 17:21:18 localhost kernel: device-mapper: multipath: Failing path 68:80. Apr 29 17:21:18 localhost kernel: device-mapper: multipath: Failing path 68:48. Apr 29 17:21:18 localhost kernel: device-mapper: multipath: Failing path 68:208. Apr 29 17:21:18 localhost kernel: device-mapper: multipath: Failing path 67:240. 
Apr 29 17:21:18 localhost kernel: blk_update_request: I/O error, dev sdaw, sector 0 Apr 29 17:21:18 localhost kernel: sd 0:0:1:8: [sdbn] tag#25 FAILED Result: hostbyte=DID_RESET driverbyte=DRIVER_OK Apr 29 17:21:18 localhost kernel: sd 0:0:1:8: [sdbn] tag#25 CDB: Read(10) 28 00 00 00 00 00 00 00 08 00 Apr 29 17:21:18 localhost kernel: blk_update_request: I/O error, dev sdbn, sector 0 Apr 29 17:21:18 localhost kernel: sd 0:0:1:8: rejecting I/O to offline device Apr 29 17:21:18 localhost kernel: sd 0:0:1:8: [sdbn] killing request Laurence Oberman Principal Software Maintenance Engineer Red Hat Global Support Services - Original Message - From: "Laurence Oberman" <lober...@redhat.com> To: "Bart Van Assche" <bart.vanass...@sandisk.com> Cc: linux-bl...@vger.kernel.org, "linux-scsi" <linux-scsi@vger.kernel.org>, "Mike Snitzer" <snit...@redhat.com>, "James Bottomley" <james.bottom...@hansenpartnership.com>, "device-mapper development" <dm-de...@redhat.com>, l...@lists.linux-foundation.org Sent: Thursday, April 28, 2016 12:47:24 PM Subject: Re: [dm-devel] [Lsf] Notes from the four separate IO track sessions at LSF/MM Hello Bart, This is when we have a subset of the paths fails. As you know the remaining path wont be used until the eh_handler is either done or is short circuited. What I will do is set this up via my jammer and capture a test using latest upstream. Of course my customer pain points are all in the RHEL kernels so I need to capture a recovery trace on the latest upstream kernel. When the SCSI commands for a path are black-holed and remain that way, even with eh_deadline and the short circuited adapter resets we simply try again and get back in the wait loop until we finally declare the device offline. This can take a while and differs depending on Qlogic, Emulex or fnic etc. First thing tomorrow will set this up and show you what I mean. Thanks!! 
Laurence Oberman Principal Software Maintenance Engineer Red Hat Global Support Services - Original Message - From: "Bart Van Assche" <bart.vanass...@sandisk.com> To: "Laurence Oberman" <lober...@redhat.com> Cc: linux-bl...@vger.kernel.org, "linux-scsi" <linux-scsi@vger.kernel.org>, "Mike Snitzer" <snit...@redhat.com>, "James Bottomley" <james.bottom...@hansenpartnership.com>, "device-mapper development" <dm-de...@redhat.com>, l...@lists.linux-foundation.org Sent: Thursday, April 28, 2016 12:41:26 PM Subject: Re: [dm-devel] [Lsf] Notes from the four separate IO track sessions at LSF/MM On 04/28/2016 09:23 AM, Laurence Oberman wrote: > We still suffer from periodic complaints in our large customer base > regarding the long recovery times for dm-multipath. > Most of the time this is when we have something like a switch > back-plane issue or an issue where RSCN'S are blocked coming back up > the fabric. Corne
Re: [dm-devel] [Lsf] Notes from the four separate IO track sessions at LSF/MM
One small correction.
In the cut and paste, the mpath timing was this. I had a cut-and-paste error in my prior message.

Fri Apr 29 17:16:14 EDT 2016
mpathe (360014052a6f5f9f256d4c1097eedfd95) dm-2 LIO-ORG ,block-13
size=20G features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
  |- 0:0:0:12 sds  65:32  active ready running
  |- 0:0:1:12 sdbh 67:176 active ready running
  |- 1:0:0:12 sdr  65:16  active ready running
  `- 1:0:1:12 sdbi 67:192 active ready running

Start again here, so it's the same 5 minutes while we are in the error_handler

Fri Apr 29 17:21:26 EDT 2016
mpathe (360014052a6f5f9f256d4c1097eedfd95) dm-2 LIO-ORG ,block-13
size=20G features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
  |- 0:0:0:12 sds  65:32  active ready running
  |- 0:0:1:12 sdbh 67:176 failed faulty offline
  |- 1:0:0:12 sdr  65:16  active ready running
  `- 1:0:1:12 sdbi 67:192 failed faulty offline

Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services

- Original Message -
From: "Laurence Oberman" <lober...@redhat.com>
To: "Bart Van Assche" <bart.vanass...@sandisk.com>
Cc: linux-bl...@vger.kernel.org, "linux-scsi" <linux-scsi@vger.kernel.org>, "Mike Snitzer" <snit...@redhat.com>, "James Bottomley" <james.bottom...@hansenpartnership.com>, "device-mapper development" <dm-de...@redhat.com>, l...@lists.linux-foundation.org, "Benjamin Marzinski" <bmarz...@redhat.com>
Sent: Friday, April 29, 2016 5:47:07 PM
Subject: Re: [dm-devel] [Lsf] Notes from the four separate IO track sessions at LSF/MM

Hello Bart

I will email the entire log just to you. Below is only a summary of the pertinent log messages.
I don't think the whole list will have an interest in all the log messages.
When I send the full log to you I will include SCSI debug for the error handler stuff.

I ran the tests. This is a worst-case test with 21 LUNs and jammed commands.
Typical failures like a port switch failure or link down won't be like this. Also, where we get RSCNs and can react quicker, we will.
In this case I simulated a hung switch issue like a backplane/mesh problem, and believe me, I see a lot of these black-holed SCSI command situations in real life.
Recovery with 21 LUNs that have in-flights to abort takes 300s.

This configuration is a multibus configuration for multipath.
Two qla2xx ports are connected to a switch and the 2 array ports are connected to the same switch.
This gives me 4 active/active paths for 21 mpath devices.

I start I/O to all 21, reading 64k blocks using dd and iflag=direct.

Example mpath device
mpathf (360014056a5be89021364a4c90556bfbb) dm-7 LIO-ORG ,block-14
size=20G features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
  |- 0:0:0:13 sdp  8:240  active ready running
  |- 0:0:1:13 sdbf 67:144 active ready running
  |- 1:0:0:13 sdo  8:224  active ready running
  `- 1:0:1:13 sdbg 67:160 active ready running

eh_deadline is set to 10 on the 2 qlogic ports, eh_timeout is set to 10 for all devices.
In multipath fast_io_fail_tmo=5

I jam one of the target array ports and discard the commands, effectively black-holing them, and leave it that way until we recover while I watch the I/O.
The recovery takes around 300s even with all the tuning, and this effectively lands up in Oracle cluster evictions.
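For readers who want to reproduce the tuning described above, here is a minimal sketch of how the two SCSI error-handler values could be written through sysfs. The function name apply_eh_tuning and the idea of passing the sysfs root as an argument are assumptions made for illustration (the root argument only makes the function easy to try against a scratch directory). Unlike the test above, which set eh_deadline on the two qlogic ports only, this sketch writes it to every SCSI host it finds. fast_io_fail_tmo is normally set in multipath.conf and is not touched here.

```shell
#!/bin/bash
# Hypothetical helper: write eh_deadline to every SCSI host and eh_timeout
# to every SCSI device found under the given sysfs root (normally /sys).
apply_eh_tuning() {
    local root=$1 deadline=$2 timeout=$3 f

    # Per-host error handler deadline, in seconds.
    for f in "$root"/class/scsi_host/host*/eh_deadline; do
        [ -w "$f" ] && echo "$deadline" > "$f"
    done

    # Per-device error handler command timeout, in seconds.
    for f in "$root"/class/scsi_device/*/device/eh_timeout; do
        [ -w "$f" ] && echo "$timeout" > "$f"
    done

    return 0
}

# Usage on a live system (as root), matching the values in the message:
#   apply_eh_tuning /sys 10 10
```

Files that do not exist or are not writable are skipped silently, so the function is safe to call on a host that has no SCSI devices.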
Watching multipath -ll mpathe, it will block as expected while in recovery

Blocked here

Fri Apr 29 17:16:14 EDT 2016
mpathe (360014052a6f5f9f256d4c1097eedfd95) dm-2 LIO-ORG ,block-13
size=20G features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
  |- 0:0:0:12 sds  65:32  active ready running
  |- 0:0:1:12 sdbh 67:176 active ready running
  |- 1:0:0:12 sdr  65:16  active ready running
  `- 1:0:1:12 sdbi 67:192 active ready running

Started again here

Fri Apr 29 17:16:26 EDT 2016
mpathe (360014052a6f5f9f256d4c1097eedfd95) dm-2 LIO-ORG ,block-13
size=20G features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
  |- 0:0:0:12 sds  65:32  active ready running
  |- 0:0:1:12 sdbh 67:176 failed faulty offline
  |- 1:0:0:12 sdr  65:16  active ready running
  `- 1:0:1:12 sdbi 67:192 failed faulty offline

Tracking I/O

procs -----------memory----------- ---swap-- -----io---- -system-- ------cpu----- -----timestamp-----
 r  b   swpd     free   buff  cache   si   so    bi    bo   in   cs us sy id wa st                 EDT
 0 21      0 15409656  25580 452056    0    0 13740     0  367 2523  0  1 41 59  0 2016-04-29 17:16:17
 0 21      0 15408904  25580 452336    0    0 15872     0  378 2779  0  1 42 57  0 2016-04-29 17:16:18
 2 20      0 15408096  25580 452624    0    0 17612     0  399 3310  0  0 26 73  0 2016-04-29 17:16:19
 0 21      0 15407188 2558
Re: [dm-devel] [Lsf] Notes from the four separate IO track sessions at LSF/MM
- Original Message - From: "Bart Van Assche" <bart.vanass...@sandisk.com> To: "Laurence Oberman" <lober...@redhat.com> Cc: "James Bottomley" <james.bottom...@hansenpartnership.com>, "linux-scsi" <linux-scsi@vger.kernel.org>, "Mike Snitzer" <snit...@redhat.com>, linux-bl...@vger.kernel.org, "device-mapper development" <dm-de...@redhat.com>, l...@lists.linux-foundation.org Sent: Monday, May 2, 2016 6:28:16 PM Subject: Re: [dm-devel] [Lsf] Notes from the four separate IO track sessions at LSF/MM On 05/02/2016 12:28 PM, Laurence Oberman wrote: > Even in the case of the ib_srp, don't we also have to still run the > eh_timeout for each of the devices that has inflight requiring error > handling serially. This means we will still have to wait to get a > path failover until all are through the timeout. Hello Laurence, It depends. If a transport layer error (e.g. a cable pull) has been observed by the ib_srp driver then fast_io_fail_tmo seconds later the ib_srp driver will terminate all outstanding SCSI commands without waiting for the error handler to finish. If no transport layer error has been observed then at most (SCSI timeout) + (number of pending commands + 1) * 5 seconds later srp_reset_device() will have finished terminating all pending SCSI commands. Bart. -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Hello Bart OK, Yes, that lines up with my testing here with Qlogic and Emulex. I am about to test srp but I need to add some jammer code first. The link down and other interruptions will always be fast. Its always going to be the black-hole events that are troublesome. Thanks Laurence -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
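The upper bound Bart gives above for the case where no transport layer error is observed can be worked through with a small calculation. This is only a sketch of the arithmetic in his message; the function name srp_reset_upper_bound and the example inputs (a 30 second SCSI command timeout and 20 pending commands) are assumptions chosen for illustration, not values taken from the thread.

```shell
#!/bin/bash
# Upper bound, in seconds, as stated in the message above:
#   (SCSI timeout) + (number of pending commands + 1) * 5
# Hypothetical helper name; arguments are the SCSI timeout and the number
# of pending commands.
srp_reset_upper_bound() {
    local scsi_timeout=$1 pending=$2
    echo $(( scsi_timeout + (pending + 1) * 5 ))
}

# Example with assumed inputs: 30 s timeout, 20 pending commands.
# 30 + (20 + 1) * 5 = 135
srp_reset_upper_bound 30 20
```

The bound grows linearly with the number of pending commands, which is consistent with the observation in this thread that black-hole events with many in-flight commands are the slow case.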
Re: [dm-devel] [Lsf] Notes from the four separate IO track sessions at LSF/MM
Laurence Oberman Principal Software Maintenance Engineer Red Hat Global Support Services - Original Message - From: "Bart Van Assche" <bart.vanass...@sandisk.com> To: "Laurence Oberman" <lober...@redhat.com> Cc: linux-bl...@vger.kernel.org, "linux-scsi" <linux-scsi@vger.kernel.org>, "Mike Snitzer" <snit...@redhat.com>, "James Bottomley" <james.bottom...@hansenpartnership.com>, "device-mapper development" <dm-de...@redhat.com>, l...@lists.linux-foundation.org Sent: Monday, May 2, 2016 2:49:54 PM Subject: Re: [dm-devel] [Lsf] Notes from the four separate IO track sessions at LSF/MM On 04/29/2016 05:47 PM, Laurence Oberman wrote: > From: "Bart Van Assche" <bart.vanass...@sandisk.com> > To: "Laurence Oberman" <lober...@redhat.com> > Cc: "James Bottomley" <james.bottom...@hansenpartnership.com>, "linux-scsi" > <linux-scsi@vger.kernel.org>, "Mike Snitzer" <snit...@redhat.com>, > linux-bl...@vger.kernel.org, "device-mapper development" > <dm-de...@redhat.com>, l...@lists.linux-foundation.org > Sent: Friday, April 29, 2016 8:36:22 PM > Subject: Re: [dm-devel] [Lsf] Notes from the four separate IO track sessions > at LSF/MM > >> On 04/29/2016 02:47 PM, Laurence Oberman wrote: >>> Recovery with 21 LUNS is 300s that have in-flights to abort. >>> [ ... ] >>> eh_deadline is set to 10 on the 2 qlogic ports, eh_timeout is set >>> to 10 for all devices. In multipath fast_io_fail_tmo=5 >>> >>> I jam one of the target array ports and discard the commands >>> effectively black-holing the commands and leave it that way until >>> we recover and I watch the I/O. The recovery takes around 300s even >>> with all the tuning and this effectively lands up in Oracle cluster >>> evictions. >> >> This discussion started as a discussion about the time needed to fail >> over from one path to another. How long did it take in your test before >> I/O failed over from the jammed port to another port? > > Around 300s before the paths were declared hard failed and the > devices offlined. 
This is when I/O restarts. > The remaining paths on the second Qlogic port (that are not jammed) > will not be used until the error handler activity completes. > > Until we get these for example, and device-mapper starts declaring > paths down we are blocked. > Apr 29 17:20:51 localhost kernel: sd 1:0:1:0: Device offlined - not > ready after error recovery > Apr 29 17:20:51 localhost kernel: sd 1:0:1:13: Device offlined - not > ready after error recovery Hello Laurence, Everyone else on all mailing lists to which this message has been posted replies below the message. Please follow this convention. Regarding the fail-over time: the ib_srp driver guarantees that scsi_done() is invoked from inside its terminate_rport_io() function. Apparently the lpfc and the qla2xxx drivers behave differently. Please work with the maintainers of these drivers to reduce fail-over time. Bart. -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Hello Bart Even in the case of the ib_srp, don't we also have to still run the eh_timeout for each of the devices that has inflight requiring error handling serially. This means we will still have to wait to get a path failover until all are through the timeout. Thanks Laurence -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] st: clear ILI if Medium Error
Looks good

Reviewed-by: Laurence Oberman <lober...@redhat.com>

Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services

- Original Message -
From: "Kai Makisara" <kai.makis...@kolumbus.fi>
To: linux-scsi@vger.kernel.org
Cc: mlomb...@redhat.com
Sent: Monday, April 18, 2016 1:47:18 AM
Subject: [PATCH] st: clear ILI if Medium Error

Some drives set the ILI flag together with MEDIUM ERROR sense code. Clear
the ILI flag in this case so that the medium error will be handled.

The problem was reported by Maurizio Lombardi.

Signed-off-by: Kai Mäkisara <kai.makis...@kolumbus.fi>
---
 drivers/scsi/st.c |    9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

--- a/drivers/scsi/st.c	2016-04-17 21:22:15.671897001 +0300
+++ b/drivers/scsi/st.c	2016-04-17 22:25:39.234321293 +0300
@@ -1974,9 +1974,12 @@ static long read_tape(struct scsi_tape *
 				transfer = (int)cmdstatp->uremainder64;
 			else
 				transfer = 0;
-			if (STp->block_size == 0 &&
-			    cmdstatp->sense_hdr.sense_key == MEDIUM_ERROR)
-				transfer = bytes;
+			if (cmdstatp->sense_hdr.sense_key == MEDIUM_ERROR) {
+				if (STp->block_size == 0)
+					transfer = bytes;
+				/* Some drives set ILI with MEDIUM ERROR */
+				cmdstatp->flags &= ~SENSE_ILI;
+			}
 
 			if (cmdstatp->flags & SENSE_ILI) {	/* ILI */
 				if (STp->block_size == 0 &&
Re: [PATCH] tcm_qla2xxx Add SCSI command jammer/discard capability to the tcm_qla2xxx module
- Original Message - > From: "Laurence Oberman" <lober...@redhat.com> > To: "Nicholas A. Bellinger" <n...@linux-iscsi.org> > Cc: "Himanshu Madhani" <himanshu.madh...@qlogic.com>, "Bart Van Assche" > <bart.vanass...@sandisk.com>, "linux-scsi" > <linux-scsi@vger.kernel.org>, "target-devel" <target-de...@vger.kernel.org>, > "Quinn Tran" <quinn.t...@qlogic.com> > Sent: Monday, April 4, 2016 6:50:03 PM > Subject: Re: [PATCH] tcm_qla2xxx Add SCSI command jammer/discard capability > to the tcm_qla2xxx module > > Hello Nicholas > > Its fixed now. > Many Thanks. > > $ scripts/checkpatch.pl > 0001-tcm_qla2xxx-Add-SCSI-command-jammer-discard-capabili.patch > WARNING: added, moved or deleted file(s), does MAINTAINERS need updating? > #12: > new file mode 100644 > > total: 0 errors, 1 warnings, 91 lines checked > > 0001-tcm_qla2xxx-Add-SCSI-command-jammer-discard-capabili.patch has style > problems, please review. > > NOTE: If any of the errors are false positives, please report > them to the maintainer, see CHECKPATCH in MAINTAINERS. 
> > > > Tested by: Laurence Oberman <lober...@redhat.com> > Signed-off-by: Laurence Oberman <lober...@redhat.com> > --- > Documentation/scsi/tcm_qla2xxx.txt | 22 ++ > drivers/scsi/qla2xxx/Kconfig |9 + > drivers/scsi/qla2xxx/tcm_qla2xxx.c | 20 > drivers/scsi/qla2xxx/tcm_qla2xxx.h |1 + > 4 files changed, 52 insertions(+), 0 deletions(-) > create mode 100644 Documentation/scsi/tcm_qla2xxx.txt > > diff --git a/Documentation/scsi/tcm_qla2xxx.txt > b/Documentation/scsi/tcm_qla2xxx.txt > new file mode 100644 > index 000..c3a670a > --- /dev/null > +++ b/Documentation/scsi/tcm_qla2xxx.txt > @@ -0,0 +1,22 @@ > +tcm_qla2xxx jam_host attribute > +-- > +There is now a new module endpoint atribute called jam_host > +attribute: jam_host: boolean=0/1 > +This attribute and accompanying code is only included if the > +Kconfig parameter TCM_QLA2XXX_DEBUG is set to Y > +By default this jammer code and functionality is disabled > + > +Use this attribute to control the discarding of SCSI commands to a > +selected host. > +This may be useful for testing error handling and simulating slow drain > +and other fabric issues. > + > +Setting a boolean of 1 for the jam_host attribute for a particular host > + will discard the commands for that host. > +Reset back to 0 to stop the jamming. 
> + > +Enable host 4 to be jammed > +echo 1 > > /sys/kernel/config/target/qla2xxx/21:00:00:24:ff:27:8f:ae/tpgt_1/attrib/jam_host > + > +Disable jamming on host 4 > +echo 0 > > /sys/kernel/config/target/qla2xxx/21:00:00:24:ff:27:8f:ae/tpgt_1/attrib/jam_host > diff --git a/drivers/scsi/qla2xxx/Kconfig b/drivers/scsi/qla2xxx/Kconfig > index 10aa18b..67c0d5a 100644 > --- a/drivers/scsi/qla2xxx/Kconfig > +++ b/drivers/scsi/qla2xxx/Kconfig > @@ -36,3 +36,12 @@ config TCM_QLA2XXX > default n > ---help--- > Say Y here to enable the TCM_QLA2XXX fabric module for QLogic 24xx+ > series > target mode HBAs > + > +if TCM_QLA2XXX > +config TCM_QLA2XXX_DEBUG > + bool "TCM_QLA2XXX fabric module DEBUG mode for QLogic 24xx+ series > target > mode HBAs" > + default n > + ---help--- > + Say Y here to enable the TCM_QLA2XXX fabric module DEBUG for QLogic > 24xx+ > series target mode HBAs > + This will include code to enable the SCSI command jammer > +endif > diff --git a/drivers/scsi/qla2xxx/tcm_qla2xxx.c > b/drivers/scsi/qla2xxx/tcm_qla2xxx.c > index 1808a01..948224e 100644 > --- a/drivers/scsi/qla2xxx/tcm_qla2xxx.c > +++ b/drivers/scsi/qla2xxx/tcm_qla2xxx.c > @@ -457,6 +457,10 @@ static int tcm_qla2xxx_handle_cmd(scsi_qla_host_t *vha, > struct qla_tgt_cmd *cmd, > struct se_cmd *se_cmd = >se_cmd; > struct se_session *se_sess; > struct qla_tgt_sess *sess; > +#ifdef CONFIG_TCM_QLA2XXX_DEBUG > + struct se_portal_group *se_tpg; > + struct tcm_qla2xxx_tpg *tpg; > +#endif > int flags = TARGET_SCF_ACK_KREF; > > if (bidi) > @@ -477,6 +481,15 @@ static int tcm_qla2xxx_handle_cmd(scsi_qla_host_t *vha, > struct qla_tgt_cmd *cmd, > return -EINVAL; > } > > +#ifdef CONFIG_TCM_QLA2XXX_DEBUG > + se_tpg = se_sess->se_tpg; > + tpg = container_of(se_tpg, struct tcm_qla2xxx_tpg, se_tpg); > + if (unlikely(tpg->tpg_attrib.jam_host)) { > + /* return, and dont run target_submit_cmd,discarding command
Re: [PATCH] scsi: Delete an unnecessary check before the function call "kfree"
- Original Message -
> From: "SF Markus Elfring" <elfr...@users.sourceforge.net>
> To: linux-scsi@vger.kernel.org, "Christoph Hellwig" <h...@lst.de>, "Hannes
> Reinecke" <h...@suse.de>, "James E. J.
> Bottomley" <j...@linux.vnet.ibm.com>, "Martin K. Petersen"
> <martin.peter...@oracle.com>
> Cc: "LKML" <linux-ker...@vger.kernel.org>, kernel-janit...@vger.kernel.org,
> "Julia Lawall" <julia.law...@lip6.fr>
> Sent: Sunday, July 24, 2016 8:30:35 AM
> Subject: [PATCH] scsi: Delete an unnecessary check before the function call
> "kfree"
>
> From: Markus Elfring <elfr...@users.sourceforge.net>
> Date: Sun, 24 Jul 2016 14:20:21 +0200
>
> The kfree() function tests whether its argument is NULL and then
> returns immediately. Thus the test around the call is not needed.
>
> This issue was detected by using the Coccinelle software.
>
> Signed-off-by: Markus Elfring <elfr...@users.sourceforge.net>
> ---
>  drivers/scsi/scsi.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
> index 1f36aca..1794c0c 100644
> --- a/drivers/scsi/scsi.c
> +++ b/drivers/scsi/scsi.c
> @@ -864,8 +864,7 @@ retry_pg83:
>  		rcu_assign_pointer(sdev->vpd_pg83, vpd_buf);
>  		mutex_unlock(&sdev->inquiry_mutex);
>  		synchronize_rcu();
> -		if (orig_vpd_buf)
> -			kfree(orig_vpd_buf);
> +		kfree(orig_vpd_buf);
>  	}
>  }
>
> --
> 2.9.2
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>

Looks fine. One small comment: the function call sets up variables etc. before it reaches the check inside the function, so it is more expensive than a simple NULL check beforehand.

Reviewed-by: Laurence Oberman <lober...@redhat.com>
Re: [PATCH 0082/1285] Replace numeric parameter like 0444 with macro
- Original Message - > From: "Baole Ni" <baolex...@intel.com> > To: "don brace" <don.br...@microsemi.com>, "len brown" <len.br...@intel.com>, > pa...@ucw.cz, > gre...@linuxfoundation.org, h...@zytor.com, x...@kernel.org > Cc: "iss storagedev" <iss_storage...@hp.com>, "esc storagedev" > <esc.storage...@microsemi.com>, > linux-scsi@vger.kernel.org, linux-ker...@vger.kernel.org, "chuansheng liu" > <chuansheng@intel.com>, "baolex ni" > <baolex...@intel.com> > Sent: Tuesday, August 2, 2016 6:39:14 AM > Subject: [PATCH 0082/1285] Replace numeric parameter like 0444 with macro > > I find that the developers often just specified the numeric value > when calling a macro which is defined with a parameter for access permission. > As we know, these numeric value for access permission have had the > corresponding macro, > and that using macro can improve the robustness and readability of the code, > thus, I suggest replacing the numeric parameter with the macro. > > Signed-off-by: Chuansheng Liu <chuansheng@intel.com> > Signed-off-by: Baole Ni <baolex...@intel.com> > --- > drivers/block/cciss.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/block/cciss.c b/drivers/block/cciss.c > index 63c2064..05dc1bd 100644 > --- a/drivers/block/cciss.c > +++ b/drivers/block/cciss.c > @@ -67,7 +67,7 @@ MODULE_SUPPORTED_DEVICE("HP Smart Array Controllers"); > MODULE_VERSION("3.6.26"); > MODULE_LICENSE("GPL"); > static int cciss_tape_cmds = 6; > -module_param(cciss_tape_cmds, int, 0644); > +module_param(cciss_tape_cmds, int, S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH); > MODULE_PARM_DESC(cciss_tape_cmds, > "number of commands to allocate for tape devices (default: 6)"); > static int cciss_simple_mode; > -- > 2.9.2 > > -- > To unsubscribe from this list: send the line "unsubscribe linux-scsi" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html >
Looks fine. Reviewed-by: Laurence Oberman <lober...@redhat.com>
Re: dm-mq and end_clone_request()
Hi Bart

I simplified the test to 2 simple scripts, running against only one XFS file system.
Can you validate these and tell me if it's enough to emulate what you are doing?
Perhaps our test suite is too simple.

Start the test

# cat run_test.sh
#!/bin/bash
logger "Starting Bart's test"
#for i in `seq 1 10`
for i in 1
do
    fio --verify=md5 -rw=randwrite --size=10M --bs=4K --loops=$((10**6)) \
        --iodepth=64 --group_reporting --sync=1 --direct=1 --ioengine=libaio \
        --directory="/data-$i" --name=data-integrity-test --thread --numjobs=16 \
        --runtime=600 --output=fio-output.txt >/dev/null &
done

Delete the host. I wait 10s in between host deletions.
I also tested with 3s and it is still stable with Mike's patches.

#!/bin/bash
for i in /sys/class/srp_remote_ports/*
do
    echo "Deleting host $i, it will re-connect via srp_daemon"
    echo 1 > $i/delete
    sleep 10
done

Check for I/O errors affecting XFS; we now have none with the patches Mike provided.
After recovery I can create files in the xfs mount with no issues.

Can you use my scripts and 1 mount and see if it still fails for you?

Thanks
Laurence

- Original Message -
> From: "Mike Snitzer" <snit...@redhat.com>
> To: "Bart Van Assche" <bart.vanass...@sandisk.com>
> Cc: dm-de...@redhat.com, "Laurence Oberman" <lober...@redhat.com>,
> linux-scsi@vger.kernel.org
> Sent: Tuesday, August 2, 2016 8:40:14 PM
> Subject: Re: dm-mq and end_clone_request()
>
> On Tue, Aug 02 2016 at 8:19pm -0400,
> Bart Van Assche <bart.vanass...@sandisk.com> wrote:
>
> > On 08/02/2016 10:45 AM, Mike Snitzer wrote:
> > > Please do these same tests against a v4.7 kernel with the 4 patches from
> > > this branch applied (no need for your other debug patches):
> > > https://git.kernel.org/cgit/linux/kernel/git/device-mapper/linux-dm.git/log/?h=dm-4.7-mpath-fixes
> > >
> > > I've had good results with my blk-mq SRP based testing.
> >
> > Hello Mike,
> >
> > Thanks again for having made these patches available. The results of my
> > tests are as follows:
>
> Disappointing. But I asked you to run the v4.7 kernel patches I
> pointed to _without_ any of your debug patches.
>
> I cannot reproduce on our SRP testbed with the fixes I provided. We're
> now in a place where there would appear to be something very unique to
> your environment causing these failures.
>
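The "check for I/O errors affecting XFS" step in the message above is done by hand. As a rough sketch, it could be scripted by counting suspicious lines in a saved kernel log. The function name count_io_errors and the match patterns are assumptions for illustration only; the "I/O error" pattern is modelled on the blk_update_request messages quoted earlier in this archive, and the XFS pattern is a loose guess that should be adjusted to the messages actually seen.

```shell
#!/bin/bash
# Hypothetical helper: count kernel log lines that look like block-layer
# or XFS I/O errors. The patterns are assumptions; tune them as needed.
count_io_errors() {
    local logfile=$1
    # grep -c prints the number of matching lines; it exits non-zero when
    # the count is 0, so "|| true" keeps the function's status clean.
    grep -E -c 'I/O error|XFS .*(error|shut ?down)' "$logfile" || true
}

# Usage after a test run (not executed here):
#   dmesg > /tmp/kern.log
#   count_io_errors /tmp/kern.log
```

A result of 0 after the host deletion loop corresponds to the "we now have none" outcome reported in the message; any other value is a cue to read the log itself.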
Re: dm-mq and end_clone_request()
- Original Message - > From: "Mike Snitzer" <snit...@redhat.com> > To: "Laurence Oberman" <lober...@redhat.com> > Cc: "Bart Van Assche" <bart.vanass...@sandisk.com>, dm-de...@redhat.com, > linux-scsi@vger.kernel.org > Sent: Tuesday, August 2, 2016 10:10:12 PM > Subject: Re: dm-mq and end_clone_request() > > On Tue, Aug 02 2016 at 9:33pm -0400, > Laurence Oberman <lober...@redhat.com> wrote: > > > Hi Bart > > > > I simplified the test to 2 simple scripts and only running against one XFS > > file system. > > Can you validate these and tell me if its enough to emulate what you are > > doing. > > Perhaps our test-suite is too simple. > > > > Start the test > > > > # cat run_test.sh > > #!/bin/bash > > logger "Starting Bart's test" > > #for i in `seq 1 10` > > for i in 1 > > do > > fio --verify=md5 -rw=randwrite --size=10M --bs=4K --loops=$((10**6)) \ > > --iodepth=64 --group_reporting --sync=1 --direct=1 > > --ioengine=libaio \ > > --directory="/data-$i" --name=data-integrity-test --thread > > --numjobs=16 \ > > --runtime=600 --output=fio-output.txt >/dev/null & > > done > > > > Delete the host, I wait 10s in between host deletions. > > But I also tested with 3s and still its stable with Mike's patches. > > > > #!/bin/bash > > for i in /sys/class/srp_remote_ports/* > > do > > echo "Deleting host $i, it will re-connect via srp_daemon" > > echo 1 > $i/delete > > sleep 10 > > done > > > > Check for I/O errors affecting XFS and we now have none with the patches > > Mike provided. > > After recovery I can create files in the xfs mount with no issues. > > > > Can you use my scripts and 1 mount and see if it still fails for you. > > In parallel we can try Bart's testsuite that he shared earlier in this > thread: https://github.com/bvanassche/srp-test > > README.md says: > "Running these tests manually is tedious. 
Hence this test suite that > tests the SRP initiator and target drivers by loading both drivers on > the same server, by logging in using the IB loopback functionality and > by sending I/O through the SRP initiator driver to a RAM disk exported > by the SRP target driver." > > This could explain why Bart is still seeing issues. He isn't testing > real hardware -- as such he is using ramdisk to expose races, etc. > > Mike >
Hi Mike, I looked at Bart's scripts; they looked fine, but I wanted a simpler way to bring the error out. Using a ramdisk is not uncommon as an LIO backend via ib_srpt to serve LUNs. That is the same way I do it when I am not connected to a large array, as it is the only way I can get EDR-like speeds. I don't think it's racing due to the ramdisk back-end, but maybe we need to ramp ours up to run more in parallel in a loop. I will run 21 parallel runs tonight to see if it makes a difference and report back tomorrow. Clearly, prior to your final patches we were escaping back to the FS layer with errors, but since your patches that is resolved, at least in our test harness. Thanks Laurence
Re: dm-mq and end_clone_request()
- Original Message - > From: "Laurence Oberman" <lober...@redhat.com> > To: "Mike Snitzer" <snit...@redhat.com> > Cc: "Bart Van Assche" <bart.vanass...@sandisk.com>, dm-de...@redhat.com, > linux-scsi@vger.kernel.org > Sent: Tuesday, August 2, 2016 10:18:30 PM > Subject: Re: dm-mq and end_clone_request() > > > > - Original Message - > > From: "Mike Snitzer" <snit...@redhat.com> > > To: "Laurence Oberman" <lober...@redhat.com> > > Cc: "Bart Van Assche" <bart.vanass...@sandisk.com>, dm-de...@redhat.com, > > linux-scsi@vger.kernel.org > > Sent: Tuesday, August 2, 2016 10:10:12 PM > > Subject: Re: dm-mq and end_clone_request() > > > > On Tue, Aug 02 2016 at 9:33pm -0400, > > Laurence Oberman <lober...@redhat.com> wrote: > > > > > Hi Bart > > > > > > I simplified the test to 2 simple scripts and only running against one > > > XFS > > > file system. > > > Can you validate these and tell me if its enough to emulate what you are > > > doing. > > > Perhaps our test-suite is too simple. > > > > > > Start the test > > > > > > # cat run_test.sh > > > #!/bin/bash > > > logger "Starting Bart's test" > > > #for i in `seq 1 10` > > > for i in 1 > > > do > > > fio --verify=md5 -rw=randwrite --size=10M --bs=4K --loops=$((10**6)) \ > > > --iodepth=64 --group_reporting --sync=1 --direct=1 > > > --ioengine=libaio \ > > > --directory="/data-$i" --name=data-integrity-test --thread > > > --numjobs=16 \ > > > --runtime=600 --output=fio-output.txt >/dev/null & > > > done > > > > > > Delete the host, I wait 10s in between host deletions. > > > But I also tested with 3s and still its stable with Mike's patches. > > > > > > #!/bin/bash > > > for i in /sys/class/srp_remote_ports/* > > > do > > > echo "Deleting host $i, it will re-connect via srp_daemon" > > > echo 1 > $i/delete > > > sleep 10 > > > done > > > > > > Check for I/O errors affecting XFS and we now have none with the patches > > > Mike provided. > > > After recovery I can create files in the xfs mount with no issues. 
> > > > > > Can you use my scripts and 1 mount and see if it still fails for you. > > > > In parallel we can try Bart's testsuite that he shared earlier in this > > thread: https://github.com/bvanassche/srp-test > > > > README.md says: > > "Running these tests manually is tedious. Hence this test suite that > > tests the SRP initiator and target drivers by loading both drivers on > > the same server, by logging in using the IB loopback functionality and > > by sending I/O through the SRP initiator driver to a RAM disk exported > > by the SRP target driver." > > > > This could explain why Bart is still seeing issues. He isn't testing > > real hardware -- as such he is using ramdisk to expose races, etc. > > > > Mike > > > > Hi Mike, > > I looked at Bart's scripts, they looked fine but I wanted a more simplified > way to bring the error out. > Using ramdisk is not uncommon as an LIO backend via ib_srpt to serve LUNS. > That is the same way I do it when I am not connected to a large array as it > is the only way I can get EDR like speeds. > > I don't thinks its racing due to the ramdisk back-end but maybe we need to > ramp ours up to run more in parallel in a loop. > > I will run 21 parallel runs and see if it makes a difference tonight and > report back tomorrow. > Clearly prior to your final patches we were escaping back to the FS layer > with errors but since your patches, at least in out test harness that is > resolved. > > Thanks > Laurence > -- > To unsubscribe from this list: send the line "unsubscribe linux-scsi" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Hello I ran 20 parallel runs with 3 loops through host deletion and in each case fio survived with no hard error escaping to the FS layer. Its solid in our test bed, Keep in mind we have no ib_srpt loaded as we have a hardware based array and are connected directly to the array with EDR 100. 
I am also not removing and reloading modules as is happening in Bart's scripts, and I am not trying to delete mpath maps etc. I focused only on the I/O error that was escaping up to the FS layer. I will check in with Bart tomorrow. Thanks Laurence
Re: [dm-devel] dm-mq and end_clone_request()
- Original Message - > From: "Laurence Oberman" <lober...@redhat.com> > To: "Bart Van Assche" <bart.vanass...@sandisk.com> > Cc: dm-de...@redhat.com, linux-scsi@vger.kernel.org, "Mike Snitzer" > <snit...@redhat.com>, "Johannes Thumshirn" > <jthumsh...@suse.de> > Sent: Wednesday, August 10, 2016 5:38:16 PM > Subject: Re: [dm-devel] dm-mq and end_clone_request() > > > > - Original Message - > > From: "Laurence Oberman" <lober...@redhat.com> > > To: "Bart Van Assche" <bart.vanass...@sandisk.com> > > Cc: dm-de...@redhat.com, linux-scsi@vger.kernel.org, "Mike Snitzer" > > <snit...@redhat.com>, "Johannes Thumshirn" > > <jthumsh...@suse.de> > > Sent: Tuesday, August 9, 2016 1:21:15 PM > > Subject: Re: [dm-devel] dm-mq and end_clone_request() > > > > > > > > - Original Message - > > > From: "Bart Van Assche" <bart.vanass...@sandisk.com> > > > To: "Laurence Oberman" <lober...@redhat.com> > > > Cc: dm-de...@redhat.com, linux-scsi@vger.kernel.org, "Mike Snitzer" > > > <snit...@redhat.com>, "Johannes Thumshirn" > > > <jthumsh...@suse.de> > > > Sent: Tuesday, August 9, 2016 1:16:52 PM > > > Subject: Re: [dm-devel] dm-mq and end_clone_request() > > > > > > On 08/09/2016 10:12 AM, Laurence Oberman wrote: > > > > I was talking about this patch > > > > > > > > --- a/drivers/scsi/scsi_scan.c > > > > +++ b/drivers/scsi/scsi_scan.c > > > > @@ -1890,10 +1890,11 @@ void scsi_forget_host(struct Scsi_Host *shost) > > > > restart: > > > > spin_lock_irqsave(shost->host_lock, flags); > > > > list_for_each_entry(sdev, &shost->__devices, siblings) { > > > > -if (sdev->sdev_state == SDEV_DEL) > > > > +if (sdev->sdev_state == SDEV_DEL || > > > > scsi_device_get(sdev) > > > > < 0) > > > > continue; > > > > spin_unlock_irqrestore(shost->host_lock, flags); > > > > __scsi_remove_device(sdev); > > > > +scsi_device_put(sdev); > > > > goto restart; > > > > } > > > > spin_unlock_irqrestore(shost->host_lock, flags); > > > > > > Hello Laurence, > > > > > > Did you run your tests with that patch applied? 
If so, it would help if > > > you could rerun your tests without that patch. If the above patch makes > > > a difference it means that it can happen that __scsi_remove_device() > > > does not change the device state into SDEV_DEL. That's a bug and we need > > > to know whether or not __scsi_remove_device() behaves correctly. > > > > > > Thanks, > > > > > > Bart. > > > > > Yes Sir, I ran all yesterdays tests on your kernel with that patch applied. > > Of course it may well just be luck/coincidence that the host delete race is > > no longer happening > > so I agree we need to re-run the tests so I will revert and re-run. > > I will probably only get back to you tomorrow with the results. > > > > Thanks > > Laurence > > -- > > To unsubscribe from this list: send the line "unsubscribe linux-scsi" in > > the body of a message to majord...@vger.kernel.org > > More majordomo info at http://vger.kernel.org/majordomo-info.html > > > Hello Bart > > I only just got time now to revert that patch and build a kernel. > Will test this tonight and let you know if I am back to seeing panics > sporadically without the patch. > As already mentioned, this is a different configuration to what I had when I > was able to reproduce the panic. > This means the lack of hitting this stack trace and panic may turn out to > have nothing to do with the patch I applied. > > Thanks > Laurence > -- > To unsubscribe from this list: send the line "unsubscribe linux-scsi" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Hello Bart I can no longer reproduce the stack even with the patch reverted so its behaving as you expected and the patch is as you already said, not valid. I ran about 30 fio tests with your kernel and multiple host deletions and and did experience only one hard fio error. My tests now produce the same results as you are seeing. 
The single fio errors was with many more executions of the test so its not easy to get these fio errors. Away from tomorrow on vacation for 10 days Thanks Laurence -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [dm-devel] dm-mq and end_clone_request()
----- Original Message -----
> From: "Bart Van Assche" <bart.vanass...@sandisk.com>
> To: "Laurence Oberman" <lober...@redhat.com>
> Cc: dm-de...@redhat.com, linux-scsi@vger.kernel.org, "Mike Snitzer" <snit...@redhat.com>, "Johannes Thumshirn" <jthumsh...@suse.de>
> Sent: Tuesday, August 9, 2016 1:16:52 PM
> Subject: Re: [dm-devel] dm-mq and end_clone_request()
>
> On 08/09/2016 10:12 AM, Laurence Oberman wrote:
> > I was talking about this patch
> >
> > --- a/drivers/scsi/scsi_scan.c
> > +++ b/drivers/scsi/scsi_scan.c
> > @@ -1890,10 +1890,11 @@ void scsi_forget_host(struct Scsi_Host *shost)
> >  restart:
> >         spin_lock_irqsave(shost->host_lock, flags);
> >         list_for_each_entry(sdev, &shost->__devices, siblings) {
> > -               if (sdev->sdev_state == SDEV_DEL)
> > +               if (sdev->sdev_state == SDEV_DEL || scsi_device_get(sdev) < 0)
> >                         continue;
> >                 spin_unlock_irqrestore(shost->host_lock, flags);
> >                 __scsi_remove_device(sdev);
> > +               scsi_device_put(sdev);
> >                 goto restart;
> >         }
> >         spin_unlock_irqrestore(shost->host_lock, flags);
>
> Hello Laurence,
>
> Did you run your tests with that patch applied? If so, it would help if
> you could rerun your tests without that patch. If the above patch makes
> a difference it means that it can happen that __scsi_remove_device()
> does not change the device state into SDEV_DEL. That's a bug and we need
> to know whether or not __scsi_remove_device() behaves correctly.
>
> Thanks,
>
> Bart.
>
Yes Sir, I ran all of yesterday's tests on your kernel with that patch applied.
Of course it may well just be luck/coincidence that the host delete race is no longer happening, so I agree we need to re-run the tests. I will revert and re-run.
I will probably only get back to you tomorrow with the results.

Thanks
Laurence
Re: [dm-devel] dm-mq and end_clone_request()
----- Original Message -----
> From: "Bart Van Assche" <bart.vanass...@sandisk.com>
> To: "Laurence Oberman" <lober...@redhat.com>
> Cc: dm-de...@redhat.com, "Mike Snitzer" <snit...@redhat.com>, linux-scsi@vger.kernel.org, "Johannes Thumshirn" <jthumsh...@suse.de>
> Sent: Tuesday, August 9, 2016 11:51:00 AM
> Subject: Re: [dm-devel] dm-mq and end_clone_request()
>
> On 08/08/2016 05:09 PM, Laurence Oberman wrote:
> > So now back to a 10 LUN dual path (ramdisk backed) two-server
> > configuration I am unable to reproduce the dm issue.
> > Recovery is very fast with the servers connected back to back.
> > This is using your kernel and this multipath.conf
> >
> > [ ... ]
> >
> > Mike's patches have definitely stabilized this issue for me on this
> > configuration.
> >
> > I will see if I can move to a larger target server that has more
> > memory and allocate more mpath devices. I feel this issue in large
> > configurations is now rooted in multipath not bringing back maps
> > sometimes even when the actual paths are back via srp_daemon.
> > I am still tracking that down.
> >
> > If you recall, last week I caused some of our own issues by
> > forgetting I had a no_path_retry 12 hiding in my multipath.conf.
> > Since removing that and spending most of the weekend testing on
> > the DDN array (had to give that back today), most of my issues
> > were either the sporadic host delete race or multipath not
> > re-instantiating paths.
> >
> > I don't know if this helps, but since applying your latest patch I
> > have not seen the host delete race.
>
> Hello Laurence,
>
> My latest SCSI core patch adds additional instrumentation to the SCSI
> core but does not change the behavior of the SCSI core. So it cannot
> fix the scsi_forget_host() crash you had reported.
>
> On my setup, with the kernel code from the srp-initiator-for-next
> branch and with CONFIG_DM_MQ_DEFAULT=n, I still see that when I run the
> srp-test software that fio reports I/O errors every now and then. What
> I see in syslog seems to indicate that these I/O errors are generated
> by dm-mpath:
>
> Aug 9 08:45:39 ion-dev-ib-ini kernel: mpath 254:1: queue_if_no_path 1 -> 0
> Aug 9 08:45:39 ion-dev-ib-ini kernel: must_push_back: 107 callbacks suppressed
> Aug 9 08:45:39 ion-dev-ib-ini kernel: device-mapper: multipath: must_push_back: queue_if_no_path=0 suspend_active=1 suspending=0
> Aug 9 08:45:39 ion-dev-ib-ini kernel: __multipath_map(): (a) returning -5
> Aug 9 08:45:39 ion-dev-ib-ini kernel: map_request(): clone_and_map_rq() returned -5
> Aug 9 08:45:39 ion-dev-ib-ini kernel: dm_complete_request: error = -5
> Aug 9 08:45:39 ion-dev-ib-ini kernel: dm_softirq_done: dm-1 tio->error = -5
>
> Bart.
>
Hello Bart

I was talking about this patch

--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -1890,10 +1890,11 @@ void scsi_forget_host(struct Scsi_Host *shost)
 restart:
        spin_lock_irqsave(shost->host_lock, flags);
        list_for_each_entry(sdev, &shost->__devices, siblings) {
-               if (sdev->sdev_state == SDEV_DEL)
+               if (sdev->sdev_state == SDEV_DEL || scsi_device_get(sdev) < 0)
                        continue;
                spin_unlock_irqrestore(shost->host_lock, flags);
                __scsi_remove_device(sdev);
+               scsi_device_put(sdev);
                goto restart;
        }
        spin_unlock_irqrestore(shost->host_lock, flags);
--

This is the one I applied. That's not just instrumentation, right?

Thanks
Laurence
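A rough way to tally the dm-mpath I/O errors Bart shows in syslog, offered only as a sketch. It assumes the log comes from a kernel that prints the same "dm_complete_request: error = -5" debug line quoted above; count_dm_eio is a made-up helper name and the log path in the usage note is a guess.

```shell
#!/bin/bash
# Sketch only. count_dm_eio is an illustrative helper name, not from the thread.
# Prints the number of lines in the given log file that record a dm request
# completing with error -5 (EIO). Prints 0 when there are none.
count_dm_eio() {
    grep -c 'dm_complete_request: error = -5' "$1"
}
```

Something like `count_dm_eio /var/log/messages` after each test run would give one number to compare between kernels, for example with and without the scsi_forget_host() change.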
Re: dm-mq and end_clone_request()
----- Original Message -----
> From: "Laurence Oberman" <lober...@redhat.com>
> To: "Mike Snitzer" <snit...@redhat.com>
> Cc: "Bart Van Assche" <bart.vanass...@sandisk.com>, dm-de...@redhat.com, linux-scsi@vger.kernel.org
> Sent: Tuesday, August 2, 2016 10:55:59 PM
> Subject: Re: dm-mq and end_clone_request()
>
> ----- Original Message -----
> > From: "Laurence Oberman" <lober...@redhat.com>
> > To: "Mike Snitzer" <snit...@redhat.com>
> > Cc: "Bart Van Assche" <bart.vanass...@sandisk.com>, dm-de...@redhat.com, linux-scsi@vger.kernel.org
> > Sent: Tuesday, August 2, 2016 10:18:30 PM
> > Subject: Re: dm-mq and end_clone_request()
> >
> > ----- Original Message -----
> > > From: "Mike Snitzer" <snit...@redhat.com>
> > > To: "Laurence Oberman" <lober...@redhat.com>
> > > Cc: "Bart Van Assche" <bart.vanass...@sandisk.com>, dm-de...@redhat.com, linux-scsi@vger.kernel.org
> > > Sent: Tuesday, August 2, 2016 10:10:12 PM
> > > Subject: Re: dm-mq and end_clone_request()
> > >
> > > On Tue, Aug 02 2016 at 9:33pm -0400,
> > > Laurence Oberman <lober...@redhat.com> wrote:
> > >
> > > > Hi Bart
> > > >
> > > > I simplified the test to 2 simple scripts and only running against one
> > > > XFS file system.
> > > > Can you validate these and tell me if it's enough to emulate what you
> > > > are doing.
> > > > Perhaps our test-suite is too simple.
> > > >
> > > > Start the test
> > > >
> > > > # cat run_test.sh
> > > > #!/bin/bash
> > > > logger "Starting Bart's test"
> > > > #for i in `seq 1 10`
> > > > for i in 1
> > > > do
> > > >   fio --verify=md5 -rw=randwrite --size=10M --bs=4K --loops=$((10**6)) \
> > > >     --iodepth=64 --group_reporting --sync=1 --direct=1 --ioengine=libaio \
> > > >     --directory="/data-$i" --name=data-integrity-test --thread --numjobs=16 \
> > > >     --runtime=600 --output=fio-output.txt >/dev/null &
> > > > done
> > > >
> > > > Delete the host, I wait 10s in between host deletions.
> > > > But I also tested with 3s and it's still stable with Mike's patches.
> > > >
> > > > #!/bin/bash
> > > > for i in /sys/class/srp_remote_ports/*
> > > > do
> > > >   echo "Deleting host $i, it will re-connect via srp_daemon"
> > > >   echo 1 > $i/delete
> > > >   sleep 10
> > > > done
> > > >
> > > > Check for I/O errors affecting XFS and we now have none with the
> > > > patches Mike provided.
> > > > After recovery I can create files in the xfs mount with no issues.
> > > >
> > > > Can you use my scripts and 1 mount and see if it still fails for you.
> > >
> > > In parallel we can try Bart's testsuite that he shared earlier in this
> > > thread: https://github.com/bvanassche/srp-test
> > >
> > > README.md says:
> > > "Running these tests manually is tedious. Hence this test suite that
> > > tests the SRP initiator and target drivers by loading both drivers on
> > > the same server, by logging in using the IB loopback functionality and
> > > by sending I/O through the SRP initiator driver to a RAM disk exported
> > > by the SRP target driver."
> > >
> > > This could explain why Bart is still seeing issues. He isn't testing
> > > real hardware -- as such he is using ramdisk to expose races, etc.
> > >
> > > Mike
> > >
> >
> > Hi Mike,
> >
> > I looked at Bart's scripts, they looked fine but I wanted a more simplified
> > way to bring the error out.
> > Using ramdisk is not uncommon as an LIO backend via ib_srpt to serve LUNs.
> > That is the same way I do it when I am not connected to a large array as it
> > is the only way I can get EDR like speeds.
> >
> > I don't think it's racing due to the ramdisk back-end but maybe we need to
> > ramp ours up to run more in parallel in a loop.
> >
> > I will run 21 parallel runs and see if it makes a difference tonight and &
Re: [dm-devel] dm-mq and end_clone_request()
----- Original Message -----
> From: "Bart Van Assche" <bart.vanass...@sandisk.com>
> To: "Laurence Oberman" <lober...@redhat.com>, "Mike Snitzer" <snit...@redhat.com>
> Cc: dm-de...@redhat.com, linux-scsi@vger.kernel.org
> Sent: Wednesday, August 3, 2016 12:06:17 PM
> Subject: Re: [dm-devel] dm-mq and end_clone_request()
>
> On 08/02/2016 06:33 PM, Laurence Oberman wrote:
> > #!/bin/bash
> > for i in /sys/class/srp_remote_ports/*
> > do
> >   echo "Deleting host $i, it will re-connect via srp_daemon"
> >   echo 1 > $i/delete
> >   sleep 10
> > done
>
> Hello Laurence,
>
> Sorry but the above looks wrong to me. There should be a second loop
> around this loop and the sleep statement should be moved from the inner
> loop to the outer loop. The above code logs out one (initiator, target)
> port pair at a time instead of logging out all paths at once.
>
> Bart.
>
Hi Bart

Latest tests are still good on our side.
I am now taking both paths out at the same time but still we seem stable here.
The first test removed the sleep and we still had a delay; the second test ran the deletions in the background so they ran as close as possible to the same time.
Both tests passed.
I will email the messages log just to you.

With no sleep we still have a gap of 9s when we delete paths, and we are good.

Aug 3 13:41:21 jumpclient multipathd: 360001ff0b035d0008d71: remaining active paths: 1
Aug 3 13:41:22 jumpclient multipathd: 360001ff0b035d0028d720003: remaining active paths: 1
Aug 3 13:41:22 jumpclient multipathd: 360001ff0b035d0048d740005: remaining active paths: 1
Aug 3 13:41:22 jumpclient multipathd: 360001ff0b035d0068d760007: remaining active paths: 1
Aug 3 13:41:23 jumpclient multipathd: 360001ff0b035d00b8d7b000c: remaining active paths: 1
Aug 3 13:41:23 jumpclient multipathd: 360001ff0b035d00d8d7d000e: remaining active paths: 1
Aug 3 13:41:23 jumpclient multipathd: 360001ff0b035d0118d810012: remaining active paths: 1
Aug 3 13:41:24 jumpclient multipathd: 360001ff0b035d0138d830014: remaining active paths: 1
Aug 3 13:41:24 jumpclient multipathd: 360001ff0b035d0158d850016: remaining active paths: 1
Aug 3 13:41:25 jumpclient multipathd: 360001ff0b035d0178d870018: remaining active paths: 1
Aug 3 13:41:25 jumpclient multipathd: 360001ff0b035d0198d89001a: remaining active paths: 1
Aug 3 13:41:25 jumpclient multipathd: 360001ff0b035d01a8d8a001b: remaining active paths: 1
Aug 3 13:41:25 jumpclient multipathd: 360001ff0b035d01c8d8c001d: remaining active paths: 1
Aug 3 13:41:26 jumpclient multipathd: 360001ff0b035d01e8d8e001f: remaining active paths: 1
Aug 3 13:41:26 jumpclient multipathd: 360001ff0b035d01f8d8f0020: remaining active paths: 1
Aug 3 13:41:26 jumpclient multipathd: 360001ff0b035d0208d900021: remaining active paths: 1
Aug 3 13:41:26 jumpclient multipathd: 360001ff0b035d0228d920023: remaining active paths: 1
Aug 3 13:41:28 jumpclient multipathd: 360001ff0b035d0248d940025: remaining active paths: 1
Aug 3 13:41:29 jumpclient multipathd: 360001ff0b035d0268d960027: remaining active paths: 1
Aug 3 13:41:29 jumpclient multipathd: 360001ff0b035d0278d970028: remaining active paths: 1
Aug 3 13:41:30 jumpclient multipathd: 360001ff0b035d0288d980029: remaining active paths: 1
Aug 3 13:41:35 jumpclient multipathd: 360001ff0b035d0008d71: remaining active paths: 0
Aug 3 13:41:36 jumpclient multipathd: 360001ff0b035d0028d720003: remaining active paths: 0
Aug 3 13:41:37 jumpclient multipathd: 360001ff0b035d0048d740005: remaining active paths: 0
Aug 3 13:41:37 jumpclient multipathd: 360001ff0b035d0068d760007: remaining active paths: 0
Aug 3 13:41:38 jumpclient multipathd: 360001ff0b035d00b8d7b000c: remaining active paths: 0
Aug 3 13:41:38 jumpclient multipathd: 360001ff0b035d00d8d7d000e: remaining active paths: 0
Aug 3 13:41:38 jumpclient multipathd: 360001ff0b035d0108d800011: remaining active paths: 0
Aug 3 13:41:38 jumpclient multipathd: 360001ff0b035d0118d810012: remaining active paths: 0
Aug 3 13:41:38 jumpclient multipathd: 360001ff0b035d0138d830014: remaining active paths: 0
Aug 3 13:41:39 jumpclient multipathd: 360001ff0b035d0158d850016: remaining active paths: 0
Aug 3 13:41:39 jumpclient multipathd: 360001ff0b035d0178d870018: remaining active paths: 0
Aug 3 13:41:39 jumpclient multipathd: 360001ff0b035d0198d89001a: remaining active paths: 0
Aug 3 13:41:39 jumpclient multipathd: 360001ff0b035d01a8d8a001b: remaining active paths: 0
Aug 3 13:41:39 jumpclient multipathd: 360001ff0b035d01c8d8c001d: remaining active paths: 0
Aug 3 13:41:39
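The second test described in this message (deletions pushed into the background so they start as close together as possible) might look roughly like the sketch below. The actual script was not posted, so this is a guess at its shape: SRP_PORTS_DIR and logout_all_ports_parallel are made-up names, and it assumes each write to the delete attribute can safely run as its own background job.

```shell
#!/bin/bash
# Sketch only. SRP_PORTS_DIR and logout_all_ports_parallel are illustrative
# names, not taken from the thread.
SRP_PORTS_DIR=${SRP_PORTS_DIR:-/sys/class/srp_remote_ports}

logout_all_ports_parallel() {
    for port in "$SRP_PORTS_DIR"/*; do
        # Skip anything that does not expose a delete attribute.
        [ -e "$port/delete" ] || continue
        echo "Deleting host $port, it will re-connect via srp_daemon"
        # Each write runs as its own background job, so one slow
        # logout does not hold up the next one.
        echo 1 > "$port/delete" &
    done
    # Block until every background write has returned.
    wait
}
```

Calling `logout_all_ports_parallel` once per test iteration takes all paths down without a sleep in between; whether that closes the gap seen in the multipathd log is something only a run on real hardware can show.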
Re: [dm-devel] dm-mq and end_clone_request()
----- Original Message -----
> From: "Bart Van Assche" <bart.vanass...@sandisk.com>
> To: "Laurence Oberman" <lober...@redhat.com>, "Mike Snitzer" <snit...@redhat.com>
> Cc: dm-de...@redhat.com, linux-scsi@vger.kernel.org
> Sent: Wednesday, August 3, 2016 12:06:17 PM
> Subject: Re: [dm-devel] dm-mq and end_clone_request()
>
> On 08/02/2016 06:33 PM, Laurence Oberman wrote:
> > #!/bin/bash
> > for i in /sys/class/srp_remote_ports/*
> > do
> >   echo "Deleting host $i, it will re-connect via srp_daemon"
> >   echo 1 > $i/delete
> >   sleep 10
> > done
>
> Hello Laurence,
>
> Sorry but the above looks wrong to me. There should be a second loop
> around this loop and the sleep statement should be moved from the inner
> loop to the outer loop. The above code logs out one (initiator, target)
> port pair at a time instead of logging out all paths at once.
>
> Bart.
>
Hi Bart

It logs out each host in turn with a 10s sleep in between.
I actually reduced the sleep to 3s last night.
We do land up with all paths lost, but not at precisely the same second.
Are you saying we have to lose all paths at the same time?
That is easy to fix and I was running it that way in the beginning. I will re-test.

Thanks
Laurence
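Bart's suggestion (a second loop around the existing one, with the sleep moved to the outer loop) could be sketched as follows. This is one reading of his description, not a script from the thread: SRP_PORTS_DIR, logout_all_ports and run_rounds are made-up names, and the round count and pause length are left as arguments because neither was specified.

```shell
#!/bin/bash
# Sketch only. SRP_PORTS_DIR, logout_all_ports and run_rounds are
# illustrative names, not taken from the thread.
SRP_PORTS_DIR=${SRP_PORTS_DIR:-/sys/class/srp_remote_ports}

# Inner loop: log out every remote port back to back, with no sleep.
logout_all_ports() {
    for port in "$SRP_PORTS_DIR"/*; do
        # Skip anything that does not expose a delete attribute.
        [ -e "$port/delete" ] || continue
        echo "Deleting host $port, it will re-connect via srp_daemon"
        echo 1 > "$port/delete"
    done
}

# Outer loop: repeat the full logout, sleeping once per round so that
# srp_daemon has time to log back in before the next round.
run_rounds() {
    rounds=$1
    pause=$2
    round=1
    while [ "$round" -le "$rounds" ]; do
        logout_all_ports
        sleep "$pause"
        round=$((round + 1))
    done
}
```

For example, `run_rounds 10 10` would drop all paths ten times with 10s between rounds; both numbers are placeholders. Compared with the original script, all ports go away within one pass of the inner loop instead of one port per sleep interval.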
Re: [dm-devel] dm-mq and end_clone_request()
----- Original Message -----
> From: "Laurence Oberman" <lober...@redhat.com>
> To: "Bart Van Assche" <bart.vanass...@sandisk.com>
> Cc: dm-de...@redhat.com, linux-scsi@vger.kernel.org, "Mike Snitzer" <snit...@redhat.com>, "Johannes Thumshirn" <jthumsh...@suse.de>
> Sent: Tuesday, August 9, 2016 1:21:15 PM
> Subject: Re: [dm-devel] dm-mq and end_clone_request()
>
> ----- Original Message -----
> > From: "Bart Van Assche" <bart.vanass...@sandisk.com>
> > To: "Laurence Oberman" <lober...@redhat.com>
> > Cc: dm-de...@redhat.com, linux-scsi@vger.kernel.org, "Mike Snitzer" <snit...@redhat.com>, "Johannes Thumshirn" <jthumsh...@suse.de>
> > Sent: Tuesday, August 9, 2016 1:16:52 PM
> > Subject: Re: [dm-devel] dm-mq and end_clone_request()
> >
> > On 08/09/2016 10:12 AM, Laurence Oberman wrote:
> > > I was talking about this patch
> > >
> > > --- a/drivers/scsi/scsi_scan.c
> > > +++ b/drivers/scsi/scsi_scan.c
> > > @@ -1890,10 +1890,11 @@ void scsi_forget_host(struct Scsi_Host *shost)
> > >  restart:
> > >         spin_lock_irqsave(shost->host_lock, flags);
> > >         list_for_each_entry(sdev, &shost->__devices, siblings) {
> > > -               if (sdev->sdev_state == SDEV_DEL)
> > > +               if (sdev->sdev_state == SDEV_DEL || scsi_device_get(sdev) < 0)
> > >                         continue;
> > >                 spin_unlock_irqrestore(shost->host_lock, flags);
> > >                 __scsi_remove_device(sdev);
> > > +               scsi_device_put(sdev);
> > >                 goto restart;
> > >         }
> > >         spin_unlock_irqrestore(shost->host_lock, flags);
> >
> > Hello Laurence,
> >
> > Did you run your tests with that patch applied? If so, it would help if
> > you could rerun your tests without that patch. If the above patch makes
> > a difference it means that it can happen that __scsi_remove_device()
> > does not change the device state into SDEV_DEL. That's a bug and we need
> > to know whether or not __scsi_remove_device() behaves correctly.
> >
> > Thanks,
> >
> > Bart.
> >
> Yes Sir, I ran all of yesterday's tests on your kernel with that patch applied.
> Of course it may well just be luck/coincidence that the host delete race is
> no longer happening, so I agree we need to re-run the tests. I will revert and re-run.
> I will probably only get back to you tomorrow with the results.
>
> Thanks
> Laurence
>
Hello Bart

I only just got time now to revert that patch and build a kernel.
Will test this tonight and let you know if I am back to seeing panics sporadically without the patch.
As already mentioned, this is a different configuration to what I had when I was able to reproduce the panic.
This means the lack of hitting this stack trace and panic may turn out to have nothing to do with the patch I applied.

Thanks
Laurence