[PATCH 0/3] MidLayer updates

2005-01-29 Thread James . Smart

The following 3 patches extend the scsi midlayer. Changes were prompted
by recent discussions with Christoph as part of the Emulex driver thread.

Patch 1: 
  Updates the recent transport container patch for :
  - allow device driver-specific attributes to be added to class objects
  - exposes classdev to transport via setup function

Patch 2: 
  Adds io statistics (requests, completions, error count) as generic
  attributes for scsi devices.

Patch 3: 
  This patch extends scsi_target support:
  - Allows for driver-specific data to be allocated along with the
target structure and accessible via the starget-hostdata pointer.
  - Adds scsi target alloc/configure/destory callbacks to the
scsi host template.
  - Rearranges the calling sequences for scsi targets so that the
target and slave alloc/configure/destory callbacks are in
order (target before slave on alloc/configure).

Although bad form, as there's so much churn in this area, the following
dependencies exist:
- Patches were created against scsi-rc-fixes-2.6 bitkeeper tree.
- Patch 1 depends on Mike Christies patch 
  http://marc.theaimsgroup.com/?l=linux-scsim=110678937715985w=2


-- James S
-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/3] MidLayer updates - extending transport/attribute container changes

2005-01-29 Thread James . Smart

Patch 1: 
  Updates the recent transport container patch for :
  - allow device driver-specific attributes to be added to class objects
  - exposes classdev to transport via setup function

-- James S


p1_xpt.patch
Description: p1_xpt.patch


[PATCH 3/3] MidLayer updates - extending scsi_target support

2005-01-29 Thread James . Smart

Patch 3: 
  This patch extends scsi_target support:
  - Allows for driver-specific data to be allocated along with the
target structure and accessible via the starget-hostdata pointer.
  - Adds scsi target alloc/configure/destory callbacks to the
scsi host template.
  - Rearranges the calling sequences for scsi targets so that the
target and slave alloc/configure/destory callbacks are in
order (target before slave on alloc/configure).

-- James S


p3_hostdata.patch
Description: p3_hostdata.patch


Re: Ooops unmounting a defect DVD

2005-01-29 Thread James Bottomley
I wouldn't have noticed this at all since you didn't send it to the scsi
list, but fortunately, Al Viro drew it politely to my attention as
another example of SCSI refcounting problems.

The issue seems to be that we have a spurious scsi_cd_put() on the error
path of sr_open().  The sr_block_..() functions are the real block
opens and should be refcounted, the sr_...() are the pseudo cdrom opens
and should not be refcounted.

Could you try this and see if it fixes the problem?

James

= drivers/scsi/sr.c 1.124 vs edited =
--- 1.124/drivers/scsi/sr.c 2005-01-29 08:30:34 -06:00
+++ edited/drivers/scsi/sr.c2005-01-29 08:39:09 -06:00
@@ -544,7 +544,6 @@
return 0;
 
 error_out:
-   scsi_cd_put(cd);
return retval;  
 }
 


-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Mid-layer handling of NOT_READY conditions...

2005-01-29 Thread James Bottomley
On Fri, 2005-01-28 at 21:46 -0800, Andrew Vasquez wrote:
 Returning back DID_IMM_RETRY for these 'transport' related conditions
 would of course help in this issue -- but at the same time bring with it
 several side-effects which may not be desirable.
 
 So, beyond this particular circumstance, what would be considered a
 'proper' return status for this type of event? 

Well, the correct return, since this is a condition from the storage, is
simply the check condition and the sense code (rather than having the
driver interpret it).

  Would this be an approach to consider?  Or should we tackle the problem
  by addressing the quirky (cmd-retries  cmd-allowed) state?

That's what I think the correct approach should bewe have a few
other quirky devices that aren't pleased with our current NOT_READY
handling.  Were you going to look into coding up a patch for this?

James


-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Ooops unmounting a defect DVD

2005-01-29 Thread Oliver Neukum
Am Samstag, 29. Januar 2005 15:46 schrieb James Bottomley:
 I wouldn't have noticed this at all since you didn't send it to the scsi
 list, but fortunately, Al Viro drew it politely to my attention as
 another example of SCSI refcounting problems.

Sorry, it happening in cdrom_release fooled me into considering it
a generic cdrom problem.

 The issue seems to be that we have a spurious scsi_cd_put() on the error
 path of sr_open().  The sr_block_..() functions are the real block
 opens and should be refcounted, the sr_...() are the pseudo cdrom opens
 and should not be refcounted.
 
 Could you try this and see if it fixes the problem?

It fully fixes the problem. Thank you.

Regards
Oliver
-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Mid-layer handling of NOT_READY conditions...

2005-01-29 Thread Patrick Mansfield
On Sat, Jan 29, 2005 at 10:44:41AM -0600, James Bottomley wrote:
 On Fri, 2005-01-28 at 21:46 -0800, Andrew Vasquez wrote:
  Returning back DID_IMM_RETRY for these 'transport' related conditions
  would of course help in this issue -- but at the same time bring with it
  several side-effects which may not be desirable.
  
  So, beyond this particular circumstance, what would be considered a
  'proper' return status for this type of event? 
 
 Well, the correct return, since this is a condition from the storage, is
 simply the check condition and the sense code (rather than having the
 driver interpret it).

But the transport hit a failure, not the storage device.

I thought Andrew hit this sequence:

- pull / replace cable

- IO resumes but gets NOT_READY (the device could be logging back
  into the fibre or such)

- a FC transport problem is hit, DID_BUSY_BUSY is returned, but
  scmd-retries has already been exhausted by the NOT_READY

Did I misread something?

   Would this be an approach to consider?  Or should we tackle the problem
   by addressing the quirky (cmd-retries  cmd-allowed) state?
 
 That's what I think the correct approach should bewe have a few
 other quirky devices that aren't pleased with our current NOT_READY
 handling.  Were you going to look into coding up a patch for this?

We don't track what errors caused a retry (doing so is too painful), or
reset the retries. In scsi_decide_disposition() if we get a few retry
cases for one or multiple errors, and then a different error that should
reasonably be a retry case, we return SUCCESS instead of NEEDS_RETRY.

Why not just set scmd-retries to zero in scsi_requeue_command()?

All callers are cases that we want to keep retrying if other errors are hit,
and would fix other potential retry problems, not only the NOT_READY case.

[There is one bad looking scsi_requeue_command() for UNIT_ATTENTION that
looks like it could retry forever, independent of this problem.]

Fixing the NOT_READY case to quiesce (and not incrementing retries) would
fix the problem or make it much less likely, and is still a good idea.

And as a long term goal, losing the retry count and moving to allowing all
retries for a period of time would avoid other potential problems, and not
be tied to the speed of the system.

-- Patrick Mansfield
-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] fix multiple HBA problem with transport classes

2005-01-29 Thread James Bottomley
All of the transport class patches contain a thinko in device matching
(and, unfortunately, one I exhorted everyone not to make in the generic
transport class comments): The match matches every container in the
class instead of the specific container belonging to the HBA.  This
causes a oops when there are two or more HBAs in the system.

James

= drivers/scsi/scsi_transport_fc.c 1.14 vs edited =
--- 1.14/drivers/scsi/scsi_transport_fc.c   2005-01-18 13:15:07 -06:00
+++ edited/drivers/scsi/scsi_transport_fc.c 2005-01-29 18:53:19 -06:00
@@ -728,6 +728,7 @@
  struct device *dev)
 {
struct Scsi_Host *shost;
+   struct fc_internal *i;
 
if (!scsi_is_host_device(dev))
return 0;
@@ -736,13 +737,17 @@
if (!shost-transportt  || shost-transportt-host_attrs.class
!= fc_host_class.class)
return 0;
-   return 1;
+
+   i = to_fc_internal(shost-transportt);
+   
+   return i-t.host_attrs == cont;
 }
 
 static int fc_target_match(struct attribute_container *cont,
struct device *dev)
 {
struct Scsi_Host *shost;
+   struct fc_internal *i;
 
if (!scsi_is_target_device(dev))
return 0;
@@ -751,7 +756,10 @@
if (!shost-transportt  || shost-transportt-host_attrs.class
!= fc_host_class.class)
return 0;
-   return 1;
+
+   i = to_fc_internal(shost-transportt);
+   
+   return i-t.host_attrs == cont;
 }
 
 
= drivers/scsi/scsi_transport_iscsi.c 1.2 vs edited =
--- 1.2/drivers/scsi/scsi_transport_iscsi.c 2005-01-18 13:15:07 -06:00
+++ edited/drivers/scsi/scsi_transport_iscsi.c  2005-01-29 18:56:01 -06:00
@@ -258,6 +258,7 @@
  struct device *dev)
 {
struct Scsi_Host *shost;
+   struct iscsi_internal *i;
 
if (!scsi_is_host_device(dev))
return 0;
@@ -266,13 +267,17 @@
if (!shost-transportt  || shost-transportt-host_attrs.class
!= iscsi_host_class.class)
return 0;
-   return 1;
+
+   i = to_iscsi_internal(shost-transportt);
+   
+   return i-t.host_attrs == cont;
 }
 
 static int iscsi_target_match(struct attribute_container *cont,
struct device *dev)
 {
struct Scsi_Host *shost;
+   struct iscsi_internal *i;
 
if (!scsi_is_target_device(dev))
return 0;
@@ -281,7 +286,10 @@
if (!shost-transportt  || shost-transportt-host_attrs.class
!= iscsi_host_class.class)
return 0;
-   return 1;
+
+   i = to_iscsi_internal(shost-transportt);
+   
+   return i-t.host_attrs == cont;
 }
 
 struct scsi_transport_template *
= drivers/scsi/scsi_transport_spi.c 1.23 vs edited =
--- 1.23/drivers/scsi/scsi_transport_spi.c  2005-01-18 13:15:07 -06:00
+++ edited/drivers/scsi/scsi_transport_spi.c2005-01-29 18:33:18 -06:00
@@ -136,6 +136,7 @@
  struct device *dev)
 {
struct Scsi_Host *shost;
+   struct spi_internal *i;
 
if (!scsi_is_host_device(dev))
return 0;
@@ -144,7 +145,10 @@
if (!shost-transportt  || shost-transportt-host_attrs.class
!= spi_host_class.class)
return 0;
-   return 1;
+
+   i = to_spi_internal(shost-transportt);
+   
+   return i-t.target_attrs == cont;
 }
 
 static int spi_device_configure(struct device *dev)
@@ -831,6 +835,7 @@
struct device *dev)
 {
struct Scsi_Host *shost;
+   struct spi_internal *i;
 
if (!scsi_is_target_device(dev))
return 0;
@@ -839,7 +844,10 @@
if (!shost-transportt  || shost-transportt-host_attrs.class
!= spi_host_class.class)
return 0;
-   return 1;
+
+   i = to_spi_internal(shost-transportt);
+   
+   return i-t.target_attrs == cont;
 }
 
 static DECLARE_TRANSPORT_CLASS(spi_transport_class,


-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Mid-layer handling of NOT_READY conditions...

2005-01-29 Thread James Bottomley
On Sat, 2005-01-29 at 11:34 -0800, Patrick Mansfield wrote:
 But the transport hit a failure, not the storage device.
 
 I thought Andrew hit this sequence:
 
   - pull / replace cable
 
   - IO resumes but gets NOT_READY (the device could be logging back
 into the fibre or such)
 
   - a FC transport problem is hit, DID_BUSY_BUSY is returned, but
 scmd-retries has already been exhausted by the NOT_READY
 
 Did I misread something?

Erm, not sure.  Perhaps I'm confused.  I thought it was the *device*
that had responded NOT_READY.  Obviously, if it's the driver
manufacturing NOT_READY sense because of some transport condition, then
it needs to be correctly reported as a DID_...

James


-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] add missing class_device_del to transport classes

2005-01-29 Thread James Bottomley
On Wed, 2005-01-26 at 15:59 -0800, Mike Christie wrote:
 It appears there is a missing class_device_del.
 The comments for transport_remove_device indicate
 that transport_remove_classdev should call it
 (which the attached patch does), but the comment in
 attribute_container_remove_device:

OK, try this.  I think it corrects my architectural cockup and puts the
attribute additions and removals in the container class where they
should have been in the first place.

James

= drivers/base/attribute_container.c 1.1 vs edited =
--- 1.1/drivers/base/attribute_container.c  2005-01-18 12:54:21 -06:00
+++ edited/drivers/base/attribute_container.c   2005-01-29 10:00:15 -06:00
@@ -150,7 +150,7 @@
if (fn)
fn(cont, dev, ic-classdev);
else
-   class_device_add(ic-classdev);
+   attribute_container_add_class_device(ic-classdev);
list_add_tail(ic-node, cont-containers);
}
up(attribute_container_mutex);
@@ -195,8 +195,10 @@
list_del(ic-node);
if (fn)
fn(cont, dev, ic-classdev);
-   else
+   else {
+   attribute_container_remove_attrs(ic-classdev);
class_device_unregister(ic-classdev);
+   }
}
}
up(attribute_container_mutex);
@@ -264,7 +266,107 @@
up(attribute_container_mutex);
 }
 EXPORT_SYMBOL_GPL(attribute_container_trigger);
- 
+
+/**
+ * attribute_container_add_attrs - add attributes
+ *
+ * @classdev: The class device
+ *
+ * This simply creates all the class device sysfs files from the
+ * attributes listed in the container
+ */
+int
+attribute_container_add_attrs(struct class_device *classdev)
+{
+   struct attribute_container *cont =
+   attribute_container_classdev_to_container(classdev);
+   struct class_device_attribute **attrs = cont-attrs;
+   int i, error;
+
+   if (!attrs)
+   return 0;
+
+   for (i = 0; attrs[i]; i++) {
+   error = class_device_create_file(classdev, attrs[i]);
+   if (error)
+   return error;
+   }
+
+   return 0;
+}
+EXPORT_SYMBOL_GPL(attribute_container_add_attrs);
+
+/**
+ * attribute_container_add_class_device - same function as class_device_add
+ *
+ * @classdev:  the class device to add
+ *
+ * This performs essentially the same function as class_device_add except for
+ * attribute containers, namely add the classdev to the system and then
+ * create the attribute files
+ */
+int
+attribute_container_add_class_device(struct class_device *classdev)
+{
+   int error = class_device_add(classdev);
+   if (error)
+   return error;
+   return attribute_container_add_attrs(classdev);
+}
+EXPORT_SYMBOL_GPL(attribute_container_add_class_device);
+
+/**
+ * attribute_container_add_class_device_adapter - simple adapter for triggers
+ *
+ * This function is identical to attribute_container_add_class_device except
+ * that it is designed to be called from the triggers
+ */
+int
+attribute_container_add_class_device_adapter(struct attribute_container *cont,
+struct device *dev,
+struct class_device *classdev)
+{
+   return attribute_container_add_class_device(classdev);
+}
+EXPORT_SYMBOL_GPL(attribute_container_add_class_device_adapter);
+
+/**
+ * attribute_container_remove_attrs - remove any attribute files
+ *
+ * @classdev: The class device to remove the files from
+ *
+ */
+void
+attribute_container_remove_attrs(struct class_device *classdev)
+{
+   struct attribute_container *cont =
+   attribute_container_classdev_to_container(classdev);
+   struct class_device_attribute **attrs = cont-attrs;
+   int i;
+
+   if (!attrs)
+   return;
+
+   for (i = 0; attrs[i]; i++)
+   class_device_remove_file(classdev, attrs[i]);
+}
+EXPORT_SYMBOL_GPL(attribute_container_remove_attrs);
+
+/**
+ * attribute_container_class_device_del - equivalent of class_device_del
+ *
+ * @classdev: the class device
+ *
+ * This function simply removes all the attribute files and then calls
+ * class_device_del.
+ */
+void
+attribute_container_class_device_del(struct class_device *classdev)
+{
+   attribute_container_remove_attrs(classdev);
+   class_device_del(classdev);
+}
+EXPORT_SYMBOL_GPL(attribute_container_class_device_del);
 
 int __init
 attribute_container_init(void)
= drivers/base/transport_class.c 1.1 vs edited =
--- 1.1/drivers/base/transport_class.c  2005-01-18 13:03:32 -06:00
+++ edited/drivers/base/transport_class.c   2005-01-29 09:50:15 -06:00
@@ -146,25 +146,6 @@
 EXPORT_SYMBOL_GPL(transport_setup_device);
 
 
-static int 

Re: Mid-layer handling of NOT_READY conditions...

2005-01-29 Thread Douglas Gilbert
Patrick Mansfield wrote:
On Sat, Jan 29, 2005 at 10:44:41AM -0600, James Bottomley wrote:
On Fri, 2005-01-28 at 21:46 -0800, Andrew Vasquez wrote:
Returning back DID_IMM_RETRY for these 'transport' related conditions
would of course help in this issue -- but at the same time bring with it
several side-effects which may not be desirable.
So, beyond this particular circumstance, what would be considered a
'proper' return status for this type of event? 
Well, the correct return, since this is a condition from the storage, is
simply the check condition and the sense code (rather than having the
driver interpret it).

But the transport hit a failure, not the storage device.
I thought Andrew hit this sequence:
- pull / replace cable
- IO resumes but gets NOT_READY (the device could be logging back
  into the fibre or such)
- a FC transport problem is hit, DID_BUSY_BUSY is returned, but
  scmd-retries has already been exhausted by the NOT_READY
Did I misread something?
Patrick,
I was also thinking of commenting on this. It depends on
where the failure is:
  a) between the device server (target) and a logical unit (lu)
  b) in the service delivery subsystem between the
 initiator (port) and the target (port).
James's explanation covers case a) (i.e. the device server
should constuct appropriate sense data and a SCSI status
in response to the current and future SCSI commands.
In case b) the reponse is transport dependent.
For example, in the case of SAS there are two further
situations:
   1) the failure occurs on a direct connect between the
  initiator (port) and the target (port) [e.g. between
  a HBA port and a target port on a disk].
  Then a low level state machine (phy/link layer) on
  the HBA will notice the problem
   2) the failure occurs between an expander and an end
  device (e.g. a tape drive). Then the expander issues
  a BROADCAST(CHANGE) link layer primitive which the
  initiator(s) will receive. In reponse to this the
  initiator(s) should do another discovery process
  to find the new topology (via SMP).
Also both of these situations are detected in real time
(more or less), not when the next command is issued.
New SCSI commands will fail relatively quickly when
the SAS HBA fails to open a connection to the target.
SCSI commands in flight to an effected target should
trigger connection timeouts in the initiator.
Doug Gilbert
-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Should multipath detect changed path UIDs?

2005-01-29 Thread Tim Pepper
Having the checker check for this change isn't a solution...it
shouldn't prevent any data corruption.  Even if you check prior to
every IO, you still have an obvious race between the check and the
subsequent IO.  You gain the ability to recognise the problem and
somehow magically warn the user (eg: disallow all subsequent IO to
catch the their attention).  So then what is a reasonable frequency
for the check?  How much corruption will you tolerate, because a lot
of IO can go out an FC connection in a short time.

Also, I'd think your #1 pre-requisite isn't quite right.  I'd say the
hole is wider and that you probably could have IO running heavily, as
long as none of it is failed by the lower layers in response to the
missing cable (and the default retries/timeouts could easily mask even
a clumsy cable swap).

Personally I'm also not inclined to think triggering hot plug events
on the cable removal/replacement is ideal.  I think fibre channel
isn't exactly intended as a dynamic environment and the SAN topology
is static or expected to be static from a given host's perspective. 
(I could be wrong.)  If this is the case, it would be helpful for a
host (or it's admin) to be able to know that topology and it's health
over time instead of only knowing the healthy portion of the topology
because the unhealthy parts are removed from its mappings.  Problem
determination seems like it's easier if you have a list of bad parts
instead of just a shrinking list of good parts.

Shouldn't the hba hardware/software be able to recognise and discern
between different classes of fabric failures, hba port link loss and
the return of a link that puts the port in a different place in the
SAN.  Does the HBA API give a common way to get detailed info out to
userspace daemons?
-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html