[patch] [SCSI] bnx2fc: zero out sense buffer properly

2012-08-18 Thread Dan Carpenter
-sense_buffer used to be an array but it changed to pointer in
de25deb180 [SCSI] use dynamically allocated sense buffer.  This call
to memset() needs to be updated as well.

Signed-off-by: Dan Carpenter dan.carpen...@oracle.com

diff --git a/drivers/scsi/bnx2fc/bnx2fc_io.c b/drivers/scsi/bnx2fc/bnx2fc_io.c
index 73f231c..8d4626c 100644
--- a/drivers/scsi/bnx2fc/bnx2fc_io.c
+++ b/drivers/scsi/bnx2fc/bnx2fc_io.c
@@ -1807,7 +1807,7 @@ static void bnx2fc_parse_fcp_rsp(struct bnx2fc_cmd 
*io_req,
fcp_sns_len = SCSI_SENSE_BUFFERSIZE;
}
 
-   memset(sc_cmd-sense_buffer, 0, sizeof(sc_cmd-sense_buffer));
+   memset(sc_cmd-sense_buffer, 0, SCSI_SENSE_BUFFERSIZE);
if (fcp_sns_len)
memcpy(sc_cmd-sense_buffer, rq_data, fcp_sns_len);
 
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [patch] [SCSI] bnx2fc: zero out sense buffer properly

2012-08-18 Thread FUJITA Tomonori
On Sat, 18 Aug 2012 11:46:37 +0300
Dan Carpenter dan.carpen...@oracle.com wrote:

 -sense_buffer used to be an array but it changed to pointer in
 de25deb180 [SCSI] use dynamically allocated sense buffer.  This call
 to memset() needs to be updated as well.
 
 Signed-off-by: Dan Carpenter dan.carpen...@oracle.com
 
 diff --git a/drivers/scsi/bnx2fc/bnx2fc_io.c b/drivers/scsi/bnx2fc/bnx2fc_io.c
 index 73f231c..8d4626c 100644
 --- a/drivers/scsi/bnx2fc/bnx2fc_io.c
 +++ b/drivers/scsi/bnx2fc/bnx2fc_io.c
 @@ -1807,7 +1807,7 @@ static void bnx2fc_parse_fcp_rsp(struct bnx2fc_cmd 
 *io_req,
   fcp_sns_len = SCSI_SENSE_BUFFERSIZE;
   }
  
 - memset(sc_cmd-sense_buffer, 0, sizeof(sc_cmd-sense_buffer));
 + memset(sc_cmd-sense_buffer, 0, SCSI_SENSE_BUFFERSIZE);

I guess that you can remove the line instead. IIRC, scsi-ml does it
for LLDs.
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 17/18] srp_transport: Add transport layer recovery support

2012-08-18 Thread Bart Van Assche
On 07/16/12 22:28, David Dillow wrote:
 On Mon, 2012-07-16 at 18:07 -0400, Mike Christie wrote:
 On 01/14/2012 05:56 AM, Bart Van Assche wrote:
 Add the necessary functions in the SRP transport module to allow
 an SRP initiator driver to implement transport layer recovery.

 I was updating my iscsi dev loss patch when I saw this is still not merged.
 
 Yes, I got part way into doing a rework of Bart's code before I got
 sidetracked on another project. Cognizant of the looming merge window,
 I'm desperately trying to get back to it. 
 
 Not sure about the ping code, but I think the dev loss tmo and fast io
 fail related stuff should go to scsi_transport_template. We can then add
 some common code to start the timers/delayed_work_queues, stop them,
 set/get the value from sysfs, etc that all the transport classes can
 then call.
 
 Good idea, I'll look into doing this as part of it.

(replying to an e-mail of one month ago)

Hello Dave,

Have you already been able to make some progress with this work ? If the
patches for moving the shared parts of dev_loss_tmo / fast_io_fail_tmo
handling from scsi_transport_fc to scsi_transport_template would be
available in a reasonable timeframe I could adapt my ib_srp patch series
to take advantage of that work.

Bart.

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH RESEND] remove the queue unlock in scsi_requset_fn

2012-08-18 Thread Bart Van Assche
On 08/16/12 07:52, Bart Van Assche wrote:
 On 08/16/12 01:35, Chanho Min wrote:
 functions will occur in line.  I also don't see why the sdev reference
 couldn't drop to zero here.
 scsi_request_fn is called under the lock of request_queue-queue_lock.
 If we drop the sdev reference to zero here,
 scsi_device_dev_release_usercontext is
 invoked and make request_queue to NULL. When caller of scsi_request_fn try to
 unlock request_queue-queue_lock, the oops is occurred.
 
 Whether or not your patch is applied, if the put_device() call in
 scsi_request_fn() decreases the sdev reference count to zero, the
 scsi_request_fn() caller will still try to unlock the queue lock after
 scsi_request_fn() finished and hence will trigger a use-after-free. I'm
 afraid the only real solution is to modify the SCSI and/or block layers
 such that scsi_remove_device() can't finish while scsi_request_fn() is
 in progress. And once that is guaranteed the get_device() / put_device()
 pair can be left out from scsi_request_fn().

(replying to my own e-mail)

How about the patch below ?

[PATCH] Fix device removal race

If the put_device() call in scsi_request_fn() drops the sdev refcount
to zero then the spin_lock_irq() call after the put_device() call
triggers a use-after-free. Avoid that by making sure that blk_cleanup_queue()
can only finish after all active scsi_request_fn() calls have returned.
---
 block/blk-core.c|1 +
 drivers/scsi/scsi_lib.c |   10 ++
 include/linux/blkdev.h  |5 +
 3 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 4b4dbdf..0e4da3b 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -388,6 +388,7 @@ void blk_drain_queue(struct request_queue *q, bool 
drain_all)
__blk_run_queue(q);
 
drain |= q-nr_rqs_elvpriv;
+   drain |= q-request_fn_active;
 
/*
 * Unfortunately, requests are queued at and tracked from
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index ffd7773..10bb348 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1514,9 +1514,7 @@ static void scsi_request_fn(struct request_queue *q)
struct scsi_cmnd *cmd;
struct request *req;
 
-   if(!get_device(sdev-sdev_gendev))
-   /* We must be tearing the block queue down already */
-   return;
+   q-request_fn_active++;
 
/*
 * To start with, we keep looping until the queue is empty, or until
@@ -1626,11 +1624,7 @@ out_delay:
if (sdev-device_busy == 0)
blk_delay_queue(q, SCSI_QUEUE_DELAY);
 out:
-   /* must be careful here...if we trigger the -remove() function
-* we cannot be holding the q lock */
-   spin_unlock_irq(q-queue_lock);
-   put_device(sdev-sdev_gendev);
-   spin_lock_irq(q-queue_lock);
+   q-request_fn_active--;
 }
 
 u64 scsi_calculate_bounce_limit(struct Scsi_Host *shost)
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 4e72a9d..11c1987 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -377,6 +377,11 @@ struct request_queue {
 
unsigned intnr_sorted;
unsigned intin_flight[2];
+   /*
+* Number of active request_fn() calls for those request_fn()
+* implementations that unlock the queue_lock, e.g. scsi_request_fn().
+*/
+   unsigned intrequest_fn_active;
 
unsigned intrq_timeout;
struct timer_list   timeout;
-- 
1.7.7


--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


drivers/scsi/ipr.c:8448:2-19: alloc with no test

2012-08-18 Thread Fengguang Wu
Hi Brian,

FYI, coccinelle warns about

  drivers/scsi/ipr.c:8448:2-19: alloc with no test, possible model on line 8457

vi +8448 drivers/scsi/ipr.c

  8443 if (ioa_cfg-sis64) {
  8444 ioa_cfg-target_ids = kzalloc(sizeof(unsigned long) *
  8445   
BITS_TO_LONGS(ioa_cfg-max_devs_supported), GFP_KERNEL);
  8446 ioa_cfg-array_ids = kzalloc(sizeof(unsigned long) *
  8447  
BITS_TO_LONGS(ioa_cfg-max_devs_supported), GFP_KERNEL);
 8448 ioa_cfg-vset_ids = kzalloc(sizeof(unsigned long) *
  8449 
BITS_TO_LONGS(ioa_cfg-max_devs_supported), GFP_KERNEL);
  8450 }
  8451   

Thanks,
Fengguang
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] target: Remove unused se_cmd.cmd_spdtl

2012-08-18 Thread Roland Dreier
On Fri, Aug 17, 2012 at 6:02 PM, Nicholas A. Bellinger
n...@linux-iscsi.org wrote:
 No, or at least that is not what happens anymore with current target
 core + iscsi-target code..

 The se_cmd-data_length re-assignment here is what will be used by
 iscsi-target fabric code for all iSCSI descriptor related allocations
 +ransfer, instead of the original fabric 'expected transfer length' that
 declares the size of the SCSI initiator's available buffer at the other
 end.

Not sure I follow this.  Isn't cmd-data_length also what we use to
allocate the buffer used to store the data we're going to transfer?

I guess my example with READ(10) actually works, because
iblock_execute_rw() just uses the buffer allocated, rather than
paying attention to the sector count in the command.

But what if an initiator sends, say, REPORT LUNS or PR OUT
with an allocation length of 8192, but a transport-level length
of only 4096?  If the REPORT LUNS or PR OUT response is
bigger than 4096 bytes, we'll overflow the allocated buffer with
your patch, right?

 - R.
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[patch 5/5] [PATCH 5/5] st: raise device limit

2012-08-18 Thread Jeff Mahoney
The device limit of 128 tape drives was established in 2003 as a
significant increase from the 8 tape drives allowed previously.

We're seeing customer sites that between a large number of drives
and multipath are discovering more than 128 devices and running
into problems.

Now that we're not stuck having to store a pointer in array
and aren't limited by kmalloc failing on higher order allocs we can
lift the limit to fill the entire minor range based on the number
of modes.

Based on the current code, that's 2^17 devices.

Reviewed-by: Lee Duncan ldun...@suse.com
Signed-off-by: Jeff Mahoney je...@suse.com
---
 Documentation/scsi/st.txt |6 ++
 drivers/scsi/st.h |2 +-
 2 files changed, 3 insertions(+), 5 deletions(-)

--- a/Documentation/scsi/st.txt
+++ b/Documentation/scsi/st.txt
@@ -112,10 +112,8 @@ attempted).
 
 MINOR NUMBERS
 
-The tape driver currently supports 128 drives by default. This number
-can be increased by editing st.h and recompiling the driver if
-necessary. The upper limit is 2^17 drives if 4 modes for each drive
-are used.
+The tape driver currently supports up to 2^17 drives if 4 modes for
+each drive are used.
 
 The minor numbers consist of the following bit fields:
 
--- a/drivers/scsi/st.h
+++ b/drivers/scsi/st.h
@@ -78,7 +78,7 @@ struct st_modedef {
 #define ST_MODE_SHIFT (7 - ST_NBR_MODE_BITS)
 #define ST_MODE_MASK ((ST_NBR_MODES - 1)  ST_MODE_SHIFT)
 
-#define ST_MAX_TAPES 128
+#define ST_MAX_TAPES (1  (20 - (ST_NBR_MODE_BITS + 1)))
 #define ST_MAX_TAPE_ENTRIES  (ST_MAX_TAPES  (ST_NBR_MODE_BITS + 1))
 
 /* The status related to each partition */


--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[patch 0/5] st: Clean up and raise max device limit (v4)

2012-08-18 Thread Jeff Mahoney
This patchset cleans up the SCSI tape device handling code and leverages it
to lift the limitation of the number of tape drives from the previous
arbitrary limit of 128 to the maximum supported by a device node that
creates 8 character devices per physical device. Since minors are 20 bits,
that means 2^17 tape drives can be supported.

Changed in this version: A previous revision introduced a regression where
tape drives would not be shown as tape devices in lsscsi. This was due to
the missing tape symlink in sysfs. That issue has been addressed and
lsscsi works properly again. Also, it passes checkpatch with no errors.

Please apply.

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[patch 2/5] [PATCH 2/5] st: clean up dev cleanup in st_probe

2012-08-18 Thread Jeff Mahoney
st_probe leaves a cdev pointer hanging around that is compared
during the error path and freed later. There's no need for the pointer
to hang around at all. So we free it immediately and simplify the error
handling.

Reviewed-by: Lee Duncan ldun...@suse.com
Signed-off-by: Jeff Mahoney je...@suse.com
---
 drivers/scsi/st.c |   11 ---
 1 file changed, 4 insertions(+), 7 deletions(-)

--- a/drivers/scsi/st.c
+++ b/drivers/scsi/st.c
@@ -4156,6 +4156,7 @@ static int st_probe(struct device *dev)
printk(KERN_ERR
   st%d: out of memory. Device not 
attached.\n,
   dev_num);
+   cdev_del(cdev);
goto out_free_tape;
}
cdev-owner = THIS_MODULE;
@@ -4194,17 +4195,13 @@ out_free_tape:
  tape);
for (j=0; j  2; j++) {
if (STm-cdevs[j]) {
-   if (cdev == STm-cdevs[j])
-   cdev = NULL;
-   device_destroy(st_sysfs_class,
-  MKDEV(SCSI_TAPE_MAJOR,
-TAPE_MINOR(i, 
mode, j)));
+   device_destroy(st_sysfs_class,
+  MKDEV(SCSI_TAPE_MAJOR,
+TAPE_MINOR(i, mode, j)));
cdev_del(STm-cdevs[j]);
}
}
}
-   if (cdev)
-   cdev_del(cdev);
write_lock(st_dev_arr_lock);
scsi_tapes[dev_num] = NULL;
st_nr_dev--;


--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[patch 1/5] [PATCH 1/5] st: Use static class attributes

2012-08-18 Thread Jeff Mahoney
st currently sets up and tears down class attributes manually for
every tape drive in the system. This patch uses a statically defined
class with class attributes to let the device core do it for us.

Reviewed-by: Lee Duncan ldun...@suse.com
Signed-off-by: Jeff Mahoney je...@suse.com
---
 drivers/scsi/st.c |   74 --
 1 file changed, 33 insertions(+), 41 deletions(-)

--- a/drivers/scsi/st.c
+++ b/drivers/scsi/st.c
@@ -84,7 +84,8 @@ static int try_wdio = 1;
 static int st_dev_max;
 static int st_nr_dev;
 
-static struct class *st_sysfs_class;
+static struct class st_sysfs_class;
+static struct device_attribute st_dev_attrs[];
 
 MODULE_AUTHOR(Kai Makisara);
 MODULE_DESCRIPTION(SCSI tape (st) driver);
@@ -4195,7 +4196,7 @@ out_free_tape:
if (STm-cdevs[j]) {
if (cdev == STm-cdevs[j])
cdev = NULL;
-   device_destroy(st_sysfs_class,
+   device_destroy(st_sysfs_class,
   MKDEV(SCSI_TAPE_MAJOR,
 TAPE_MINOR(i, 
mode, j)));
cdev_del(STm-cdevs[j]);
@@ -4236,7 +4237,7 @@ static int st_remove(struct device *dev)
  tape);
for (mode = 0; mode  ST_NBR_MODES; ++mode) {
for (j=0; j  2; j++) {
-   device_destroy(st_sysfs_class,
+   device_destroy(st_sysfs_class,
   MKDEV(SCSI_TAPE_MAJOR,
 TAPE_MINOR(i, 
mode, j)));
cdev_del(tpnt-modes[mode].cdevs[j]);
@@ -4283,6 +4284,11 @@ static void scsi_tape_release(struct kre
return;
 }
 
+static struct class st_sysfs_class = {
+   .name = scsi_tape,
+   .dev_attrs = st_dev_attrs,
+};
+
 static int __init init_st(void)
 {
int err;
@@ -4292,10 +4298,10 @@ static int __init init_st(void)
printk(KERN_INFO st: Version %s, fixed bufsize %d, s/g segs %d\n,
verstr, st_fixed_buffer_size, st_max_sg_segs);
 
-   st_sysfs_class = class_create(THIS_MODULE, scsi_tape);
-   if (IS_ERR(st_sysfs_class)) {
-   printk(KERN_ERR Unable create sysfs class for SCSI tapes\n);
-   return PTR_ERR(st_sysfs_class);
+   err = class_register(st_sysfs_class);
+   if (err) {
+   pr_err(Unable register sysfs class for SCSI tapes\n);
+   return err;
}
 
err = register_chrdev_region(MKDEV(SCSI_TAPE_MAJOR, 0),
@@ -4322,7 +4328,7 @@ err_chrdev:
unregister_chrdev_region(MKDEV(SCSI_TAPE_MAJOR, 0),
 ST_MAX_TAPE_ENTRIES);
 err_class:
-   class_destroy(st_sysfs_class);
+   class_unregister(st_sysfs_class);
return err;
 }
 
@@ -4332,7 +4338,7 @@ static void __exit exit_st(void)
scsi_unregister_driver(st_template.gendrv);
unregister_chrdev_region(MKDEV(SCSI_TAPE_MAJOR, 0),
 ST_MAX_TAPE_ENTRIES);
-   class_destroy(st_sysfs_class);
+   class_unregister(st_sysfs_class);
kfree(scsi_tapes);
printk(KERN_INFO st: Unloaded.\n);
 }
@@ -4405,10 +4411,9 @@ static void do_remove_sysfs_files(void)
driver_remove_file(sysfs, driver_attr_try_direct_io);
 }
 
-
 /* The sysfs simple class interface */
 static ssize_t
-st_defined_show(struct device *dev, struct device_attribute *attr, char *buf)
+defined_show(struct device *dev, struct device_attribute *attr, char *buf)
 {
struct st_modedef *STm = dev_get_drvdata(dev);
ssize_t l = 0;
@@ -4417,10 +4422,9 @@ st_defined_show(struct device *dev, stru
return l;
 }
 
-DEVICE_ATTR(defined, S_IRUGO, st_defined_show, NULL);
-
 static ssize_t
-st_defblk_show(struct device *dev, struct device_attribute *attr, char *buf)
+default_blksize_show(struct device *dev, struct device_attribute *attr,
+char *buf)
 {
struct st_modedef *STm = dev_get_drvdata(dev);
ssize_t l = 0;
@@ -4429,10 +4433,10 @@ st_defblk_show(struct device *dev, struc
return l;
 }
 
-DEVICE_ATTR(default_blksize, S_IRUGO, st_defblk_show, NULL);
 
 static ssize_t
-st_defdensity_show(struct device *dev, struct device_attribute *attr, char 
*buf)
+default_density_show(struct device *dev, struct device_attribute *attr,
+char *buf)
 {
struct st_modedef *STm = dev_get_drvdata(dev);
ssize_t l = 0;
@@ -4443,11 +4447,9 @@ st_defdensity_show(struct device *dev, s
return l;
 }
 
-DEVICE_ATTR(default_density, S_IRUGO, st_defdensity_show, NULL);
-
 static ssize_t
-st_defcompression_show(struct device *dev, struct 

Re: [GIT PULL] tcm_vhost: Initial merge of vhost level target fabric driver

2012-08-18 Thread Michael S. Tsirkin
Hi Nicholas,
I just noticed this problem in the interface:

+#include linux/vhost.h
+
+/*
+ * Used by QEMU userspace to ensure a consistent vhost-scsi ABI.
+ *
+ * ABI Rev 0: July 2012 version starting point for v3.6-rc merge
candidate +
+ *RFC-v2 vhost-scsi userspace.  Add GET_ABI_VERSION ioctl
usage
+ */
+
+#define VHOST_SCSI_ABI_VERSION 0
+
+struct vhost_scsi_target {
+   int abi_version;
+   unsigned char vhost_wwpn[TRANSPORT_IQN_LEN];
+   unsigned short vhost_tpgt;
+};
+

Here TRANSPORT_IQN_LEN is 224, which is a multiple of 4.
Since vhost_tpgt is 2 bytes and abi_version is 4, the total size would
be 230.  But gcc needs struct size be aligned to first field size, which
is 4 bytes, so it pads the structure by extra 2 bytes to the total of
232.

This padding is very undesirable in an ABI:
- it can not be initialized easily
- it can not be checked easily
- it can leak information between kernel and userspace

Simplest solution is probably just to make the padding
explicit:

+struct vhost_scsi_target {
+   int abi_version;
+   unsigned char vhost_wwpn[TRANSPORT_IQN_LEN];
+   unsigned short vhost_tpgt;
+   unsigned short reserved;
+};
+

I think we should fix this buglet before it goes out to users.

-- 
MST
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PULL] tcm_vhost: Initial merge of vhost level target fabric driver

2012-08-18 Thread Nicholas A. Bellinger
On Sat, 2012-08-18 at 23:04 +0300, Michael S. Tsirkin wrote:
 Hi Nicholas,
 I just noticed this problem in the interface:
 
 +#include linux/vhost.h
 +
 +/*
 + * Used by QEMU userspace to ensure a consistent vhost-scsi ABI.
 + *
 + * ABI Rev 0: July 2012 version starting point for v3.6-rc merge
 candidate +
 + *RFC-v2 vhost-scsi userspace.  Add GET_ABI_VERSION ioctl
 usage
 + */
 +
 +#define VHOST_SCSI_ABI_VERSION 0
 +
 +struct vhost_scsi_target {
 +   int abi_version;
 +   unsigned char vhost_wwpn[TRANSPORT_IQN_LEN];
 +   unsigned short vhost_tpgt;
 +};
 +
 
 Here TRANSPORT_IQN_LEN is 224, which is a multiple of 4.
 Since vhost_tpgt is 2 bytes and abi_version is 4, the total size would
 be 230.  But gcc needs struct size be aligned to first field size, which
 is 4 bytes, so it pads the structure by extra 2 bytes to the total of
 232.
 
 This padding is very undesirable in an ABI:
 - it can not be initialized easily
 - it can not be checked easily
 - it can leak information between kernel and userspace
 

H, yes.  Very good reasons to avoid ABI ambiguity  ..

 Simplest solution is probably just to make the padding
 explicit:
 
 +struct vhost_scsi_target {
 +   int abi_version;
 +   unsigned char vhost_wwpn[TRANSPORT_IQN_LEN];
 +   unsigned short vhost_tpgt;
 +   unsigned short reserved;
 +};
 +
 
 I think we should fix this buglet before it goes out to users.
 

nod, fixing this up in target-pending/master now w/ your reported-by
+signoff, and will change vhost-scsi's copy of these defs for next
week's RFC-v3 posting.

Thanks MST!

--nab

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Odd behavior of a SAS-2 backplane with SGPIO commands

2012-08-18 Thread Rich
I actually have a lot of information I can share about this since I
wrote this email.

To be brief, the sas2ircu toolset works perfectly fine with IT mode
HBAs - just not with passive backplanes AFAICS [e.g. the 846A, versus
the 846EL2].

Also, the bug I'm experiencing is not present with the SAS2308
generation of the HBAs (the 9217 and friends) at the same F/W version
as the 9211, so this may be a bug - on the controller chip, driver, or
HBA firmware.

- Rich

On Sat, Aug 18, 2012 at 7:52 PM, Pasi Kärkkäinen pa...@iki.fi wrote:
 On Tue, Jul 03, 2012 at 09:54:58PM -0400, Rich wrote:
 [My apologies if this is the wrong mailing list for this, but it
 seemed to be the most relevant. Please direct me elsewhere and discard
 if this is OT.]


 This is a perfectly fine list, hopefully :)

 Hi all,


 Hello,

 I've been beating my head against SGPIO for a bit now, and I can't
 tell if I'm just misreading SFF-8485, or this backplane is just not
 doing something sane.


 I've been struggling with SGPIO LED control aswell!

 I have here a Supermicro X8DAH-F+ with a backplane of model
 BPN-SAS-836A - which means it has 3 MG9072 controller chips on it.
 Supports speaking in-band or out-of-band to read data, control lights,
 c.

 With two LSI 9201-16i HBA attached, manipulating it via
 /dev/bsg/sas_hostN and smp_utils 0.97, SFF-8485 section 8.4.4 seems to
 think that changing this:
  00 41 02 00 00 a0 a0 a0 a0  a0 a0 a0 a0 a0 a0 a0 a0
  10 a0 a0 a0 a0 00 00 00 00  00 00 00 00 00 00 00 00
  20 00 00 00 00

 By doing this:
 # smp_write_gpio -d a1,a0,a0,a0 -t 3 -i 0 -c /dev/bsg/sas_hostN

 Resulting in:
  00 41 02 00 00 a1 a0 a0 a0  a0 a0 a0 a0 a0 a0 a0 a0
  10 a0 a0 a0 a0 00 00 00 00  00 00 00 00 00 00 00 00
  20 00 00 00 00

 Should enable a single drive's fault LED (drive m+3, by spec above).


 Wow, I've been trying to figure out how to control SGPIO from Linux,
 and for some reason I didn't find anything about smp_write_gpio earlier.

 I need to try that. Thanks a lot for this information! :)

 (earlier I tried getting LSI sas2ircu to do something clever
 with the LEDS but that didn't work at all - I guess LSI only supports
 SGPIO LED control in IR mode - not IT mode).


 In practice, I am in possession of ~90 identical machines, and on
 every single one, this controls the fault LED on 3 drives - I have 3
 SAS ports wired to the backplane (via iPASS cable) per HBA, so I would
 suspect the HBA of setting bit m+3 on each SAS port [and having each
 of its three ports running to an input controlled by a different SGPIO
 controller on the backplane]. No permutation of the register bits
 beyond the 4 bytes at index 0 changes the behavior of the LEDs that I
 can see, and the output in the beginning [a0 a0 ... 00 00 00 00] is
 what is returned on clean cold boot of the machine.

 In contrast, when I have an LSI 9240-8i attached, I can control
 per-drive LEDs, though the mechanism it uses for this is opaque to me
 (the megaraid_sas driver doesn't expose a virtual SAS host port like
 mpt2sas, so I am manipulating this using LSI's closed-source MegaCli
 blob); so I know it is possible, through some method, to control
 per-drive LEDs.

 I cannot tell, however, whether I am using smp_write_gpio incorrectly
 (either by misreading the spec or somehow abusing the virtual SAS
 port), or it's a bug somewhere (be it in smp_write_gpio, the kernel,
 the HBA firmware, or the upstream MG9072's firmware, that is somehow
 magically avoided by the 9240-8i), and there is almost no public
 documentation of people using smp_write_gpio.

 I've tried this with kernel 2.6.32-220.7.1.el6, 3.4.4, and 3.5.0-rc5
 (the last because it includes mpt2sas 13.XX.XX.XX, while 3.4.4 has
 12.XX.XX.XX, and I was curious if anything had changed), and smp_utils
 0.97.

 I wanted to see what the register state was when the 9240-8i was in
 use, but I couldn't coax the megaraid_sas driver into disgorging this
 information, and upon unplugging the iPASS cable from the backplane to
 connect it to the HBA, the backplane appears to reset its internal
 state; the LEDs cease blinking in any pattern, and consequent querying
 with the HBA returns expected default initial state.

 Am I insane, or is this backplane crazy? Does anyone have any
 experience with cajoling such things into submission?


 Hmm, I have some servers with LSI 9211-8i SAS2 HBAs (mpt2sas driver)
 and Supermicro BPN-SAS-846A backplanes, which have 3x MG9072 chips in them.

 I'll let you know how that stuff works for me!

 Thanks,

 -- Pasi

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html