Re: [PATCH v2 3/5] ata: Add APM X-Gene SATA driver

2013-11-11 Thread Arnd Bergmann
On Sunday 10 November 2013, Olof Johansson wrote:

  diff --git a/drivers/ata/Makefile b/drivers/ata/Makefile
  index 46518c6..022f9d1 100644
  --- a/drivers/ata/Makefile
  +++ b/drivers/ata/Makefile
  @@ -11,6 +11,8 @@ obj-$(CONFIG_SATA_SIL24)  += sata_sil24.o
   obj-$(CONFIG_SATA_DWC) += sata_dwc_460ex.o
   obj-$(CONFIG_SATA_HIGHBANK)+= sata_highbank.o libahci.o
   obj-$(CONFIG_AHCI_IMX) += ahci_imx.o
  +sata-xgene-objs := sata_xgene.o sata_xgene_serdes.o
  +obj-$(CONFIG_SATA_XGENE)   += sata-xgene.o
 
 Why not just doing obj-$(CONFIG_SATA_XGENE)   += sata_xgene.o
 sata_xgene_serdes.o
 ?
 

That wouldn't create a single module built from two files. However, if
the serdes part is moved to the more appropriate drivers/phy directory
and changed to use generic interfaces (I guess they are merged now,
need to check), then it would be two modules anyay.

 
  +/* Flush the IOB to ensure all SATA controller writes completed before
  +   servicing the completed command. */
  +static int xgene_ahci_iob_flush(struct xgene_ahci_context *ctx)
  +{
  +   if (ctx-ahbc_io_base == NULL) {
  +   void *ahbc_base;
  +   u32 val;
  +
  +   /* The AHBC address is fixed in X-Gene */
  +   ahbc_base = devm_ioremap(ctx-dev, 0x1F2A, 0x8);
 
 Even if fixed, having a defined constant makes sense here and below.

I would still insist on having the address be part of the DT and described
in the binding. You never know if the HW designers change their minds
on the next generation, or if the part is actually licensed from some
other company that also licensed the same thing to someone else.

I think this ought to be put into a proper device driver. It's not clear
from the comment why this is required here, but it seems to be either
working around a bug in MSI signalling that could go away entirely with
a fixed chip revision, or it's something that would be required by every
single DMA master in the system and should not be open-coded in the
individual device drivers.

 This doesn't quite make sense for me. In the case of ACPI firmware on
 server, the firmware can setup SERDES on its own. And if you want to
 provide new override values, you need to rebuild the firmware anyway,
 so there's no way to supply the overrides separately. Thus it really
 makes no sense to do these in the ACPI case.

Agreed.
 
 For DT the case is slightly different since the DT is supplied
 separate from the firmware image, so it's possible to ship newer
 settings. Still, even there there's no reason to not have firmware do
 the setup in most cases.

I'd argue that you shouldn't have to ship a fixed DT to change those
values, but instead put the values into the device driver and fix the
kernel when you need to change them.

Arnd
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 11/11] pm80xx : gpio feature support for motherboard controllers

2013-11-11 Thread Jack Wang
Hi James,

About this gpio feature, do you think it's OK to implement with IOCTL or
we can use exist bsg interface in libsas and add another function call?

Regards,
Jack

On 11/11/2013 06:57 AM, Viswas G wrote:
 Hi Jack,
 
 The GPIO feature we implemented here is for controlling and configuring the 
 GPIO pins present in the HBA and it is not related to the GPIO registers 
 present in the SGPIO. Following is one of the GPIO operations we do from 
 application.
 
 When application wants to know the insertion/removal of a SAS cable in any of 
 the port, it configures the GPIO for corresponding port to generate event for 
 SAS cable insertion or removal using the IOCTL to the HBA driver and waits by 
 calling poll function. When driver receives the GPIO event for the SAS cable 
 insertion or removal then it intimates the application.
 
 We are using IOCTL instead of sysfs interface since we have to pass 
 structures between user space and kernel space. Again, in the kernel space, 
 we have to parse user buffer from application and convert it to the 
 corresponding data structure. We wanted to avoid the parsing complexity by 
 using ioctl interface.
 
 Regards,
 Viswas G
 
 
 -Original Message-
 From: Jack Wang [mailto:xjtu...@gmail.com] 
 Sent: Monday, November 04, 2013 4:00 PM
 To: Viswas G
 Cc: linux-scsi@vger.kernel.org; Sangeetha Gnanasekaran; Nikith 
 Ganigarakoppal; Anand Kumar Santhanam; Suresh Thiagarajan
 Subject: Re: [PATCH 11/11] pm80xx : gpio feature support for motherboard 
 controllers
 
 On 11/04/2013 11:13 AM, Viswas G wrote:
 Hi Jack,

 We wanted to control the GPIO in the HBA only. Bsg interface gets created 
 only for enclosure or expander. 

 For our HBA, bsg interface will not be created since it does not have an 
 enclosure target inside. That's why we wanted to use IOCTL. Please advise.

 Best Regards,
 Viswas G

 Hi Viswas,
 
 No, bsg interface also support HBA.
 Here is two example output from LSI mpt2sas:
 
 smp_rep_manufacturer /dev/bsg/sas_host0
 Report manufacturer response:
   SAS-1.1 format: 0
   vendor identification: LSI
   product identification: Virtual SMP Port
   product revision level:
 smp_read_gpio /dev/bsg/sas_host0
 Read GPIO register response:
   GPIO_CFG[0]:
 version: 0
 GPIO enable: 1
 cfg register count: 2
 gp register count: 1
 supported drive count: 16
 
 Regards,
 Jack
 
 -Original Message-
 From: Jack Wang [mailto:xjtu...@gmail.com] 
 Sent: Tuesday, October 29, 2013 3:49 PM
 To: Viswas G
 Cc: linux-scsi@vger.kernel.org; Sangeetha Gnanasekaran; Nikith 
 Ganigarakoppal; Anand Kumar Santhanam
 Subject: Re: [PATCH 11/11] pm80xx : gpio feature support for motherboard 
 controllers

 Hi Viswas,

 As ioctl interface is not welcome for new feature, that's why we removed 
 ioctl interface when pm8001 accepted into mainline.

 I suggest you use bsg interface for this, see sas_host_smp.c for details.

 Regards,
 Jack

 On 10/22/2013 02:20 PM, Viswas G wrote:

 Signed-off-by: Viswas G viswa...@pmcs.com
 ---
  drivers/scsi/pm8001/pm8001_ctl.c  |  248
 -
  drivers/scsi/pm8001/pm8001_ctl.h  |   55 
  drivers/scsi/pm8001/pm8001_init.c |   37 ++
  drivers/scsi/pm8001/pm8001_sas.h  |   30 +
  drivers/scsi/pm8001/pm80xx_hwi.c  |  106 
  drivers/scsi/pm8001/pm80xx_hwi.h  |   49 +++
  6 files changed, 524 insertions(+), 1 deletions(-)

 diff --git a/drivers/scsi/pm8001/pm8001_ctl.c
 b/drivers/scsi/pm8001/pm8001_ctl.c
 index 247cb1c..3c9ca21 100644
 --- a/drivers/scsi/pm8001/pm8001_ctl.c
 +++ b/drivers/scsi/pm8001/pm8001_ctl.c
 @@ -40,7 +40,8 @@
  #include linux/firmware.h
  #include linux/slab.h
  #include pm8001_sas.h
 -#include pm8001_ctl.h
 +
 +int pm8001_major = -1;

  /* scsi host attributes */

 @@ -759,3 +760,248 @@ struct device_attribute *pm8001_host_attrs[] = {
 NULL,
  };

 +/**
 + * pm8001_open - open the configuration file
 + * @inode: inode being opened
 + * @file: file handle attached
 + *
 + * Called when the configuration device is opened. Does the needed
 + * set up on the handle and then returns
 + *
 + */
 +static int pm8001_open(struct inode *inode, struct file *file) {
 +   struct pm8001_hba_info *pm8001_ha;
 +   unsigned minor_number = iminor(inode);
 +   int err = -ENODEV;
 +
 +   list_for_each_entry(pm8001_ha, hba_list, list) {
 +   if (pm8001_ha-id == minor_number) {
 +   file-private_data = pm8001_ha;
 +   err = 0;
 +   break;
 +   }
 +   }
 +
 +   return err;
 +}
 +
 +/**
 + * pm8001_close - close the configuration file
 + * @inode: inode being opened
 + * @file: file handle attached
 + *
 + * Called when the configuration device is closed. Does the needed
 + * set up on the handle and then returns
 + *
 + */
 +static int pm8001_close(struct inode *inode, struct file *file) {
 +   return 0;
 +}
 +
 +static long 

[PATCH 2/5] scsi: improved eh timeout handler

2013-11-11 Thread Hannes Reinecke
When a command runs into a timeout we need to send an 'ABORT TASK'
TMF. This is typically done by the 'eh_abort_handler' LLDD callback.

Conceptually, however, this function is a normal SCSI command, so
there is no need to enter the error handler.

This patch implements a new scsi_abort_command() function which
invokes an asynchronous function scsi_eh_abort_handler() to
abort the commands via the usual 'eh_abort_handler'.

If abort succeeds the command is either retried or terminated,
depending on the number of allowed retries. However, 'eh_eflags'
records the abort, so if the retry would fail again the
command is pushed onto the error handler without trying to
abort it (again); it'll be cleared up from SCSI EH.

Signed-off-by: Hannes Reinecke h...@suse.de
Cc: Ren Mingxin re...@cn.fujitsu.com
Cc: Christoph Hellwig h...@infradead.org
---
 drivers/scsi/hosts.c  |  14 -
 drivers/scsi/scsi.c   |   3 +
 drivers/scsi/scsi_error.c | 151 ++
 drivers/scsi/scsi_priv.h  |   2 +
 include/scsi/scsi_cmnd.h  |   1 +
 include/scsi/scsi_host.h  |  10 +++
 6 files changed, 167 insertions(+), 14 deletions(-)

diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c
index f334859..3b619819 100644
--- a/drivers/scsi/hosts.c
+++ b/drivers/scsi/hosts.c
@@ -169,6 +169,7 @@ void scsi_remove_host(struct Scsi_Host *shost)
spin_unlock_irqrestore(shost-host_lock, flags);
 
scsi_autopm_get_host(shost);
+   flush_workqueue(shost-tmf_work_q);
scsi_forget_host(shost);
mutex_unlock(shost-scan_mutex);
scsi_proc_host_rm(shost);
@@ -294,6 +295,8 @@ static void scsi_host_dev_release(struct device *dev)
 
scsi_proc_hostdir_rm(shost-hostt);
 
+   if (shost-tmf_work_q)
+   destroy_workqueue(shost-tmf_work_q);
if (shost-ehandler)
kthread_stop(shost-ehandler);
if (shost-work_q)
@@ -360,7 +363,6 @@ struct Scsi_Host *scsi_host_alloc(struct scsi_host_template 
*sht, int privsize)
INIT_LIST_HEAD(shost-eh_cmd_q);
INIT_LIST_HEAD(shost-starved_list);
init_waitqueue_head(shost-host_wait);
-
mutex_init(shost-scan_mutex);
 
/*
@@ -443,9 +445,19 @@ struct Scsi_Host *scsi_host_alloc(struct 
scsi_host_template *sht, int privsize)
goto fail_kfree;
}
 
+   shost-tmf_work_q = alloc_workqueue(scsi_tmf_%d,
+   WQ_UNBOUND | WQ_MEM_RECLAIM,
+  1, shost-host_no);
+   if (!shost-tmf_work_q) {
+   printk(KERN_WARNING scsi%d: failed to create tmf workq\n,
+  shost-host_no);
+   goto fail_kthread;
+   }
scsi_proc_hostdir_add(shost-hostt);
return shost;
 
+ fail_kthread:
+   kthread_stop(shost-ehandler);
  fail_kfree:
kfree(shost);
return NULL;
diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
index fe0bcb1..2b04a57 100644
--- a/drivers/scsi/scsi.c
+++ b/drivers/scsi/scsi.c
@@ -297,6 +297,7 @@ struct scsi_cmnd *scsi_get_command(struct scsi_device *dev, 
gfp_t gfp_mask)
 
cmd-device = dev;
INIT_LIST_HEAD(cmd-list);
+   INIT_DELAYED_WORK(cmd-abort_work, scmd_eh_abort_handler);
spin_lock_irqsave(dev-list_lock, flags);
list_add_tail(cmd-list, dev-cmd_list);
spin_unlock_irqrestore(dev-list_lock, flags);
@@ -353,6 +354,8 @@ void scsi_put_command(struct scsi_cmnd *cmd)
list_del_init(cmd-list);
spin_unlock_irqrestore(cmd-device-list_lock, flags);
 
+   cancel_delayed_work(cmd-abort_work);
+
__scsi_put_command(cmd-device-host, cmd, sdev-sdev_gendev);
 }
 EXPORT_SYMBOL(scsi_put_command);
diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index 67c0014..cab3c9b 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -53,6 +53,8 @@ static void scsi_eh_done(struct scsi_cmnd *scmd);
 #define HOST_RESET_SETTLE_TIME  (10)
 
 static int scsi_eh_try_stu(struct scsi_cmnd *scmd);
+static int scsi_try_to_abort_cmd(struct scsi_host_template *,
+struct scsi_cmnd *);
 
 /* called with shost-host_lock held */
 void scsi_eh_wakeup(struct Scsi_Host *shost)
@@ -100,6 +102,116 @@ static int scsi_host_eh_past_deadline(struct Scsi_Host 
*shost)
 }
 
 /**
+ * scmd_eh_abort_handler - Handle command aborts
+ * @work:  command to be aborted.
+ */
+void
+scmd_eh_abort_handler(struct work_struct *work)
+{
+   struct scsi_cmnd *scmd =
+   container_of(work, struct scsi_cmnd, abort_work.work);
+   struct scsi_device *sdev = scmd-device;
+   unsigned long flags;
+   int rtn;
+
+   spin_lock_irqsave(sdev-host-host_lock, flags);
+   if (scsi_host_eh_past_deadline(sdev-host)) {
+   spin_unlock_irqrestore(sdev-host-host_lock, flags);
+   SCSI_LOG_ERROR_RECOVERY(3,
+ 

[PATCH 4/5] scsi: Set the minimum valid value of 'eh_deadline' as 0

2013-11-11 Thread Hannes Reinecke
From: Ren Mingxin re...@cn.fujitsu.com

The former minimum valid value of 'eh_deadline' is 1s, which means
the earliest occasion to shorten EH is 1 second later since a
command is failed or timed out. But if we want to skip EH steps
ASAP, we have to wait until the first EH step is finished. If the
duration of the first EH step is long, this waiting time is
excruciating. So, it is necessary to accept 0 as the minimum valid
value for 'eh_deadline'.

According to my test, with Hannes' patchset 'New EH command timeout
handler' as well, the minimum IO time is improved from 73s
(eh_deadline = 1) to 43s(eh_deadline = 0) when commands are timed
out by disabling RSCN and target port.

Signed-off-by: Ren Mingxin re...@cn.fujitsu.com
Signed-off-by: Hannes Reinecke h...@suse.de
---
 drivers/scsi/hosts.c  | 16 
 drivers/scsi/scsi_error.c | 34 +-
 drivers/scsi/scsi_sysfs.c | 36 +---
 3 files changed, 58 insertions(+), 28 deletions(-)

diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c
index 3b619819..6966def 100644
--- a/drivers/scsi/hosts.c
+++ b/drivers/scsi/hosts.c
@@ -319,11 +319,11 @@ static void scsi_host_dev_release(struct device *dev)
kfree(shost);
 }
 
-static unsigned int shost_eh_deadline;
+static int shost_eh_deadline = -1;
 
-module_param_named(eh_deadline, shost_eh_deadline, uint, S_IRUGO|S_IWUSR);
+module_param_named(eh_deadline, shost_eh_deadline, int, S_IRUGO|S_IWUSR);
 MODULE_PARM_DESC(eh_deadline,
-SCSI EH timeout in seconds (should be between 1 and 2^32-1));
+SCSI EH timeout in seconds (should be between 0 and 2^31-1));
 
 static struct device_type scsi_host_type = {
.name = scsi_host,
@@ -396,7 +396,15 @@ struct Scsi_Host *scsi_host_alloc(struct 
scsi_host_template *sht, int privsize)
shost-unchecked_isa_dma = sht-unchecked_isa_dma;
shost-use_clustering = sht-use_clustering;
shost-ordered_tag = sht-ordered_tag;
-   shost-eh_deadline = shost_eh_deadline * HZ;
+   if (shost_eh_deadline == -1)
+   shost-eh_deadline = -1;
+   else if ((ulong) shost_eh_deadline * HZ  INT_MAX) {
+   shost_printk(KERN_WARNING, shost,
+eh_deadline %u too large, setting to %u\n,
+shost_eh_deadline, INT_MAX / HZ);
+   shost-eh_deadline = INT_MAX;
+   } else
+   shost-eh_deadline = shost_eh_deadline * HZ;
 
if (sht-supported_mode == MODE_UNKNOWN)
/* means we didn't set it ... default to INITIATOR */
diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index 1d27625..04c2ce8 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -91,18 +91,18 @@ EXPORT_SYMBOL_GPL(scsi_schedule_eh);
 
 static int scsi_host_eh_past_deadline(struct Scsi_Host *shost)
 {
-   if (!shost-last_reset || !shost-eh_deadline)
+   if (!shost-last_reset || shost-eh_deadline == -1)
return 0;
 
/*
 * 32bit accesses are guaranteed to be atomic
 * (on all supported architectures), so instead
 * of using a spinlock we can as well double check
-* if eh_deadline has been unset during the
+* if eh_deadline has been set to 'off' during the
 * time_before call.
 */
if (time_before(jiffies, shost-last_reset + shost-eh_deadline) 
-   shost-eh_deadline != 0)
+   shost-eh_deadline  -1)
return 0;
 
return 1;
@@ -132,26 +132,34 @@ scmd_eh_abort_handler(struct work_struct *work)
rtn = scsi_try_to_abort_cmd(sdev-host-hostt, scmd);
if (rtn == SUCCESS) {
scmd-result |= DID_TIME_OUT  16;
-   if (!scsi_noretry_cmd(scmd) 
+   if (scsi_host_eh_past_deadline(sdev-host)) {
+   SCSI_LOG_ERROR_RECOVERY(3,
+   scmd_printk(KERN_INFO, scmd,
+   scmd %p eh timeout, 
+   not retrying aborted 
+   command\n, scmd));
+   } else if (!scsi_noretry_cmd(scmd) 
(++scmd-retries = scmd-allowed)) {
SCSI_LOG_ERROR_RECOVERY(3,
scmd_printk(KERN_WARNING, scmd,
scmd %p retry 
aborted command\n, scmd));
scsi_queue_insert(scmd, SCSI_MLQUEUE_EH_RETRY);
+   return;
} else {
SCSI_LOG_ERROR_RECOVERY(3,
scmd_printk(KERN_WARNING, scmd,
 

[PATCH 3/5] scsi: Unlock accesses to eh_deadline

2013-11-11 Thread Hannes Reinecke
32bit accesses are guaranteed to be atomic, so we can remove
the spinlock when checking for eh_deadline. We only need to
make sure to catch any updates which might happened during
the call to time_before(); if so we just recheck with the
correct value.

Signed-off-by: Hannes Reinecke h...@suse.de
---
 drivers/scsi/scsi_error.c | 44 +---
 1 file changed, 9 insertions(+), 35 deletions(-)

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index cab3c9b..1d27625 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -94,8 +94,15 @@ static int scsi_host_eh_past_deadline(struct Scsi_Host 
*shost)
if (!shost-last_reset || !shost-eh_deadline)
return 0;
 
-   if (time_before(jiffies,
-   shost-last_reset + shost-eh_deadline))
+   /*
+* 32bit accesses are guaranteed to be atomic
+* (on all supported architectures), so instead
+* of using a spinlock we can as well double check
+* if eh_deadline has been unset during the
+* time_before call.
+*/
+   if (time_before(jiffies, shost-last_reset + shost-eh_deadline) 
+   shost-eh_deadline != 0)
return 0;
 
return 1;
@@ -111,18 +118,14 @@ scmd_eh_abort_handler(struct work_struct *work)
struct scsi_cmnd *scmd =
container_of(work, struct scsi_cmnd, abort_work.work);
struct scsi_device *sdev = scmd-device;
-   unsigned long flags;
int rtn;
 
-   spin_lock_irqsave(sdev-host-host_lock, flags);
if (scsi_host_eh_past_deadline(sdev-host)) {
-   spin_unlock_irqrestore(sdev-host-host_lock, flags);
SCSI_LOG_ERROR_RECOVERY(3,
scmd_printk(KERN_INFO, scmd,
scmd %p eh timeout, not aborting\n,
scmd));
} else {
-   spin_unlock_irqrestore(sdev-host-host_lock, flags);
SCSI_LOG_ERROR_RECOVERY(3,
scmd_printk(KERN_INFO, scmd,
aborting command %p\n, scmd));
@@ -1132,7 +1135,6 @@ int scsi_eh_get_sense(struct list_head *work_q,
struct scsi_cmnd *scmd, *next;
struct Scsi_Host *shost;
int rtn;
-   unsigned long flags;
 
list_for_each_entry_safe(scmd, next, work_q, eh_entry) {
if ((scmd-eh_eflags  SCSI_EH_CANCEL_CMD) ||
@@ -1140,16 +1142,13 @@ int scsi_eh_get_sense(struct list_head *work_q,
continue;
 
shost = scmd-device-host;
-   spin_lock_irqsave(shost-host_lock, flags);
if (scsi_host_eh_past_deadline(shost)) {
-   spin_unlock_irqrestore(shost-host_lock, flags);
SCSI_LOG_ERROR_RECOVERY(3,
shost_printk(KERN_INFO, shost,
skip %s, past eh deadline\n,
 __func__));
break;
}
-   spin_unlock_irqrestore(shost-host_lock, flags);
SCSI_LOG_ERROR_RECOVERY(2, scmd_printk(KERN_INFO, scmd,
  %s: requesting sense\n,
  current-comm));
@@ -1235,26 +1234,21 @@ static int scsi_eh_test_devices(struct list_head 
*cmd_list,
struct scsi_cmnd *scmd, *next;
struct scsi_device *sdev;
int finish_cmds;
-   unsigned long flags;
 
while (!list_empty(cmd_list)) {
scmd = list_entry(cmd_list-next, struct scsi_cmnd, eh_entry);
sdev = scmd-device;
 
if (!try_stu) {
-   spin_lock_irqsave(sdev-host-host_lock, flags);
if (scsi_host_eh_past_deadline(sdev-host)) {
/* Push items back onto work_q */
list_splice_init(cmd_list, work_q);
-   spin_unlock_irqrestore(sdev-host-host_lock,
-  flags);
SCSI_LOG_ERROR_RECOVERY(3,
shost_printk(KERN_INFO, sdev-host,
 skip %s, past eh 
deadline,
 __func__));
break;
}
-   spin_unlock_irqrestore(sdev-host-host_lock, flags);
}
 
finish_cmds = !scsi_device_online(scmd-device) ||
@@ -1295,15 +1289,12 @@ static int scsi_eh_abort_cmds(struct list_head *work_q,
LIST_HEAD(check_list);
int rtn;
struct Scsi_Host *shost;
-   unsigned long flags;
 
list_for_each_entry_safe(scmd, next, work_q, eh_entry) {
  

[PATCH 1/5] scsi: Fix erratic device offline during EH

2013-11-11 Thread Hannes Reinecke
From: James Bottomley jbottom...@parallels.com

Commit 18a4d0a22ed6c54b67af7718c305cd010f09ddf8
(Handle disk devices which can not process medium access commands)
was introduced to offline any device which cannot process medium
access commands.
However, commit 3eef6257de48ff84a5d98ca533685df8a3beaeb8
(Reduce error recovery time by reducing use of TURs) reduced
the number of TURs by sending it only on the first failing
command, which might or might not be a medium access command.
So in combination this results in an erratic device offlining
during EH; if the command where the TUR was sent upon happens
to be a medium access command the device will be set offline,
if not everything proceeds as normal.

This patch moves the check to the final test, eliminating
this problem.

Signed-off-by: Hannes Reinecke h...@suse.de
Signed-off-by: James Bottomley jbottom...@parallels.com
---
 drivers/scsi/scsi_error.c  | 26 +-
 drivers/scsi/sd.c  | 26 --
 include/scsi/scsi_driver.h |  2 +-
 3 files changed, 30 insertions(+), 24 deletions(-)

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index e8bee9f..67c0014 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -941,12 +941,6 @@ retry:
 
scsi_eh_restore_cmnd(scmd, ses);
 
-   if (scmd-request-cmd_type != REQ_TYPE_BLOCK_PC) {
-   struct scsi_driver *sdrv = scsi_cmd_to_driver(scmd);
-   if (sdrv-eh_action)
-   rtn = sdrv-eh_action(scmd, cmnd, cmnd_size, rtn);
-   }
-
return rtn;
 }
 
@@ -964,6 +958,16 @@ static int scsi_request_sense(struct scsi_cmnd *scmd)
return scsi_send_eh_cmnd(scmd, NULL, 0, scmd-device-eh_timeout, ~0);
 }
 
+static int scsi_eh_action(struct scsi_cmnd *scmd, int rtn)
+{
+   if (scmd-request-cmd_type != REQ_TYPE_BLOCK_PC) {
+   struct scsi_driver *sdrv = scsi_cmd_to_driver(scmd);
+   if (sdrv-eh_action)
+   rtn = sdrv-eh_action(scmd, rtn);
+   }
+   return rtn;
+}
+
 /**
  * scsi_eh_finish_cmd - Handle a cmd that eh is finished with.
  * @scmd:  Original SCSI cmd that eh has finished.
@@ -1142,7 +1146,9 @@ static int scsi_eh_test_devices(struct list_head 
*cmd_list,
 
list_for_each_entry_safe(scmd, next, cmd_list, eh_entry)
if (scmd-device == sdev) {
-   if (finish_cmds)
+   if (finish_cmds 
+   (try_stu ||
+scsi_eh_action(scmd, SUCCESS) == SUCCESS))
scsi_eh_finish_cmd(scmd, done_q);
else
list_move_tail(scmd-eh_entry, work_q);
@@ -1283,7 +1289,8 @@ static int scsi_eh_stu(struct Scsi_Host *shost,
!scsi_eh_tur(stu_scmd)) {
list_for_each_entry_safe(scmd, next,
  work_q, eh_entry) {
-   if (scmd-device == sdev)
+   if (scmd-device == sdev 
+   scsi_eh_action(scmd, SUCCESS) == 
SUCCESS)
scsi_eh_finish_cmd(scmd, 
done_q);
}
}
@@ -1350,7 +1357,8 @@ static int scsi_eh_bus_device_reset(struct Scsi_Host 
*shost,
!scsi_eh_tur(bdr_scmd)) {
list_for_each_entry_safe(scmd, next,
 work_q, eh_entry) {
-   if (scmd-device == sdev)
+   if (scmd-device == sdev 
+   scsi_eh_action(scmd, rtn) != FAILED)
scsi_eh_finish_cmd(scmd,
   done_q);
}
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index fd874b8..1929838 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -110,7 +110,7 @@ static int sd_suspend_runtime(struct device *);
 static int sd_resume(struct device *);
 static void sd_rescan(struct device *);
 static int sd_done(struct scsi_cmnd *);
-static int sd_eh_action(struct scsi_cmnd *, unsigned char *, int, int);
+static int sd_eh_action(struct scsi_cmnd *, int);
 static void sd_read_capacity(struct scsi_disk *sdkp, unsigned char *buffer);
 static void scsi_disk_release(struct device *cdev);
 static void sd_print_sense_hdr(struct scsi_disk *, struct scsi_sense_hdr *);
@@ -1551,23 +1551,23 @@ static const struct block_device_operations sd_fops = {
 /**
  * sd_eh_action - error handling callback
  * @scmd:  sd-issued command that has failed
- * @eh_cmnd:   

Re: [PATCH 2/6] libata: avoid waking disk to check power

2013-11-11 Thread Sergei Shtylyov

Hello.

On 10-11-2013 1:03, Phillip Susi wrote:


When a disk is in SLEEP mode it can not respond to commands,
including the CHECK POWER command.  Instead of waking up the
sleeping disk, fake the reply to the CHECK POWER command to
indicate the disk is in standby mode.  This prevents udisks
from waking up sleeping disks when it polls to see if they
are awake or not before trying to read their smart status.
---
  drivers/ata/libata-core.c | 8 
  1 file changed, 8 insertions(+)



diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index 83b1a9f..686c441 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -5084,6 +5084,14 @@ void ata_qc_issue(struct ata_queued_cmd *qc)

/* if device is sleeping, schedule reset and abort the link */
if (unlikely(qc-dev-flags  ATA_DFLAG_SLEEPING)) {
+   if (unlikely(qc-tf.command == ATA_CMD_CHK_POWER))
+   {


   Keep { on the same line as *if* please.

WBR, Sergei

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/6] sd: don't bother spinning up disks on resume

2013-11-11 Thread Sergei Shtylyov

Hello.

On 10-11-2013 1:03, Phillip Susi wrote:


Don't bother forcing disks to spin up on resume, as they
will do so automatically when accessed, and forcing them
to spin up slows down the resume.  Add a second bit to the
manage_start_stop flag to restore the previous behavior.
---
  drivers/scsi/sd.c  | 6 +++---
  include/scsi/scsi_device.h | 2 +-
  2 files changed, 4 insertions(+), 4 deletions(-)



diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index e62d17d..3143311 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c

[...]

diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h
index d65fbec..1c46d2d 100644
--- a/include/scsi/scsi_device.h
+++ b/include/scsi/scsi_device.h
@@ -152,7 +152,7 @@ struct scsi_device {
unsigned use_192_bytes_for_3f:1; /* ask for 192 bytes from page 0x3f */
unsigned no_start_on_add:1; /* do not issue start on add */
unsigned allow_restart:1; /* issue START_UNIT in error handler */
-   unsigned manage_start_stop:1;   /* Let HLD (sd) manage start/stop */
+   unsigned manage_start_stop:2;   /* Let HLD (sd) manage start/stop */


   I think you should better document this 2-bit field, or better still, make 
it 2 1-bit fields.


WBR, Sergei

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 4/6] libata: resume in the background

2013-11-11 Thread Sergei Shtylyov

On 10-11-2013 1:03, Phillip Susi wrote:


Don't block the resume path waiting for the disk to
spin up.
---
  drivers/ata/libata-core.c | 10 --
  1 file changed, 4 insertions(+), 6 deletions(-)



diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index 686c441..128ce0d 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -5421,20 +5421,18 @@ static int __ata_port_resume_common(struct ata_port 
*ap, pm_message_t mesg,
  static int ata_port_resume_common(struct device *dev, pm_message_t mesg)
  {
struct ata_port *ap = to_ata_port(dev);
+   static int dontcare;

-   return __ata_port_resume_common(ap, mesg, NULL);
+   return __ata_port_resume_common(ap, mesg, dontcare);
  }

  static int ata_port_resume(struct device *dev)
  {
int rc;

+   if (pm_runtime_suspended(dev))
+   return 0;
rc = ata_port_resume_common(dev, PMSG_RESUME);
-   if (!rc) {
-   pm_runtime_disable(dev);
-   pm_runtime_set_active(dev);
-   pm_runtime_enable(dev);
-   }


With this modification, you don't need 'rc' anymore.


return rc;
  }



MBR, Sergei

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/6] sd: don't bother spinning up disks on resume

2013-11-11 Thread Phillip Susi
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 11/11/2013 8:08 AM, Sergei Shtylyov wrote:
 -unsigned manage_start_stop:1;/* Let HLD (sd) manage 
 start/stop */ +unsigned manage_start_stop:2;/* Let HLD
 (sd) manage start/stop */
 
 I think you should better document this 2-bit field, or better
 still, make it 2 1-bit fields.

Not a bad idea, though I'm really not sure that it shouldn't just be
removed entirely.  It is a bad idea to not stop the disk on
suspend/shutdown since it leaves the disk to take an emergency head
retract, which isn't good for it.  At the very least it should park
the heads, though currently libata does not handle the scsi cmd for that.

-BEGIN PGP SIGNATURE-
Version: GnuPG v2.0.17 (MingW32)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQEcBAEBAgAGBQJSgOmiAAoJEJrBOlT6nu755CcH/2ckSBdvCtMHFe1ech1XFOdJ
PUZc5VEHk/rzPqXqxS26H9eR5FIbgu437yVizJ/w5Fy4MAX0rVKyz61FK/HavGL7
aqNYgKWIjucg7panlWZsPIraDl/bVPYVS2PnthVItabC+GskYRR0g92xcDqDPSeX
kmFIG0JOx2bU6JYWtFWxECpdjsBJBQxAnoLtzI+SONmlhp2GxH9Bc9Msa5hqwBoW
vmUJmggBaPoL3ZTQUFGZyYWU8/7n2d5lhXpJdcsvcrd+PJyrsfv9GGKqgQW/kLmL
sYAFwO83/5YJUK03l68S9xHt48NW2rdFm2ph035/CpbsNib2fKTaOsyfIgf/Idg=
=y73A
-END PGP SIGNATURE-
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 0/3]: pm8001 driver bug fixes.

2013-11-11 Thread Anand
From c3bcd7c02e1fa487edbab4c1d5182daca066db61 Mon Sep 17 00:00:00 2001
From: Nikith Ganigarakoppal nikith.ganigarakop...@pmcs.com
Date: Mon, 11 Nov 2013 19:41:51 +0530
Subject: [PATCH 0/3]: pm8001 driver bug fixes.


Nikith Ganigarakoppal (3):
  pm80xx: Fix for direct attached device.
  pm80xx: Resetting the phy state.
  pm80xx: Tasklets synchronization fix.

 drivers/scsi/pm8001/pm8001_hwi.c  |2 +
 drivers/scsi/pm8001/pm8001_hwi.h  |4 ++
 drivers/scsi/pm8001/pm8001_init.c |   91 +---
 drivers/scsi/pm8001/pm8001_sas.c  |4 +-
 drivers/scsi/pm8001/pm8001_sas.h  |9 +++-
 drivers/scsi/pm8001/pm80xx_hwi.c  |2 +
 drivers/scsi/pm8001/pm80xx_hwi.h  |2 +
 7 files changed, 72 insertions(+), 42 deletions(-)

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH V1 1/3] pm80xx: Fix for direct attached device.

2013-11-11 Thread Anand
From cf6a06ddf571464571f826109bb1e5a0667c7751 Mon Sep 17 00:00:00 2001
From: Nikith Ganigarakoppal nikith.ganigarakop...@pmcs.com
Date: Wed, 30 Oct 2013 16:13:22 +0530
Subject: [PATCH V1 1/3] pm80xx: Fix for direct attached device.

In case of direct attached SATA device delay is not enough.
It will give crash for set device state command response and
wait_for_completion is the best solution for this.

Updation of module author.

Signed-off-by: anandkumar.santha...@pmcs.com
Signed-off-by: nikith.ganigarakop...@pmcs.com
---
 drivers/scsi/pm8001/pm8001_init.c |1 +
 drivers/scsi/pm8001/pm8001_sas.c  |4 +++-
 2 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/drivers/scsi/pm8001/pm8001_init.c 
b/drivers/scsi/pm8001/pm8001_init.c
index 7b57fcd..17daaa4 100644
--- a/drivers/scsi/pm8001/pm8001_init.c
+++ b/drivers/scsi/pm8001/pm8001_init.c
@@ -1170,6 +1170,7 @@ module_exit(pm8001_exit);
 MODULE_AUTHOR(Jack Wang jack_w...@usish.com);
 MODULE_AUTHOR(Anand Kumar Santhanam anandkumar.santha...@pmcs.com);
 MODULE_AUTHOR(Sangeetha Gnanasekaran sangeetha.gnanaseka...@pmcs.com);
+MODULE_AUTHOR(Nikith Ganigarakoppal nikith.ganigarakop...@pmcs.com);
 MODULE_DESCRIPTION(
PMC-Sierra PM8001/8081/8088/8089/8074/8076/8077 
SAS/SATA controller driver);
diff --git a/drivers/scsi/pm8001/pm8001_sas.c b/drivers/scsi/pm8001/pm8001_sas.c
index f4eb18e..f50ac44 100644
--- a/drivers/scsi/pm8001/pm8001_sas.c
+++ b/drivers/scsi/pm8001/pm8001_sas.c
@@ -1098,15 +1098,17 @@ int pm8001_lu_reset(struct domain_device *dev, u8 *lun)
struct pm8001_tmf_task tmf_task;
struct pm8001_device *pm8001_dev = dev-lldd_dev;
struct pm8001_hba_info *pm8001_ha = pm8001_find_ha_by_dev(dev);
+   DECLARE_COMPLETION_ONSTACK(completion_setstate);
if (dev_is_sata(dev)) {
struct sas_phy *phy = sas_get_local_phy(dev);
rc = pm8001_exec_internal_task_abort(pm8001_ha, pm8001_dev ,
dev, 1, 0);
rc = sas_phy_reset(phy, 1);
sas_put_local_phy(phy);
+   pm8001_dev-setds_completion = completion_setstate;
rc = PM8001_CHIP_DISP-set_dev_state_req(pm8001_ha,
pm8001_dev, 0x01);
-   msleep(2000);
+   wait_for_completion(completion_setstate);
} else {
tmf_task.tmf = TMF_LU_RESET;
rc = pm8001_issue_ssp_tmf(dev, lun, tmf_task);
-- 
1.7.1

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH V1 2/3] pm80xx: Resetting the phy state.

2013-11-11 Thread Anand
From 800d934dd22f3ef5d7f52d900295d371d17004bd Mon Sep 17 00:00:00 2001
From: Nikith Ganigarakoppal nikith.ganigarakop...@pmcs.com
Date: Wed, 30 Oct 2013 16:23:47 +0530
Subject: [PATCH V1 2/3] pm80xx: Resetting the phy state.

Setting the phy state for hard reset response.
After sending hard reset for a device ,phy down event sets
the phy state to zero but for phy up event it will not set
the phy state again.This will cause problem to successive
hard resets.


Signed-off-by: anandkumar.santha...@pmcs.com
Signed-off-by: nikith.ganigarakop...@pmcs.com
---
 drivers/scsi/pm8001/pm8001_hwi.c |2 ++
 drivers/scsi/pm8001/pm8001_hwi.h |4 
 drivers/scsi/pm8001/pm80xx_hwi.c |2 ++
 drivers/scsi/pm8001/pm80xx_hwi.h |2 ++
 4 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/drivers/scsi/pm8001/pm8001_hwi.c b/drivers/scsi/pm8001/pm8001_hwi.c
index 9961616..b23f49d 100644
--- a/drivers/scsi/pm8001/pm8001_hwi.c
+++ b/drivers/scsi/pm8001/pm8001_hwi.c
@@ -3403,6 +3403,7 @@ hw_event_sas_phy_up(struct pm8001_hba_info *pm8001_ha, 
void *piomb)
unsigned long flags;
u8 deviceType = pPayload-sas_identify.dev_type;
port-port_state =  portstate;
+   phy-phy_state = PHY_STATE_LINK_UP_SPC;
PM8001_MSG_DBG(pm8001_ha,
pm8001_printk(HW_EVENT_SAS_PHY_UP port id = %d, phy id = %d\n,
port_id, phy_id));
@@ -3483,6 +3484,7 @@ hw_event_sata_phy_up(struct pm8001_hba_info *pm8001_ha, 
void *piomb)
pm8001_printk(HW_EVENT_SATA_PHY_UP port id = %d,
 phy id = %d\n, port_id, phy_id));
port-port_state =  portstate;
+   phy-phy_state = PHY_STATE_LINK_UP_SPC;
port-port_attached = 1;
pm8001_get_lrate_mode(phy, link_rate);
phy-phy_type |= PORT_TYPE_SATA;
diff --git a/drivers/scsi/pm8001/pm8001_hwi.h b/drivers/scsi/pm8001/pm8001_hwi.h
index 6d91e24..e4867e6 100644
--- a/drivers/scsi/pm8001/pm8001_hwi.h
+++ b/drivers/scsi/pm8001/pm8001_hwi.h
@@ -131,6 +131,10 @@
 #define LINKRATE_30(0x02  8)
 #define LINKRATE_60(0x04  8)
 
+/* for phy state */
+
+#define PHY_STATE_LINK_UP_SPC  0x1
+
 /* for new SPC controllers MEMBASE III is shared between BIOS and DATA */
 #define GSM_SM_BASE0x4F
 struct mpi_msg_hdr{
diff --git a/drivers/scsi/pm8001/pm80xx_hwi.c b/drivers/scsi/pm8001/pm80xx_hwi.c
index 4ebc79b..41bee62 100644
--- a/drivers/scsi/pm8001/pm80xx_hwi.c
+++ b/drivers/scsi/pm8001/pm80xx_hwi.c
@@ -2894,6 +2894,7 @@ hw_event_sas_phy_up(struct pm8001_hba_info *pm8001_ha, 
void *piomb)
unsigned long flags;
u8 deviceType = pPayload-sas_identify.dev_type;
port-port_state = portstate;
+   phy-phy_state = PHY_STATE_LINK_UP_SPCV;
PM8001_MSG_DBG(pm8001_ha, pm8001_printk(
portid:%d; phyid:%d; linkrate:%d; 
portstate:%x; devicetype:%x\n,
@@ -2978,6 +2979,7 @@ hw_event_sata_phy_up(struct pm8001_hba_info *pm8001_ha, 
void *piomb)
port_id, phy_id, link_rate, portstate));
 
port-port_state = portstate;
+   phy-phy_state = PHY_STATE_LINK_UP_SPCV;
port-port_attached = 1;
pm8001_get_lrate_mode(phy, link_rate);
phy-phy_type |= PORT_TYPE_SATA;
diff --git a/drivers/scsi/pm8001/pm80xx_hwi.h b/drivers/scsi/pm8001/pm80xx_hwi.h
index c86816b..9970a38 100644
--- a/drivers/scsi/pm8001/pm80xx_hwi.h
+++ b/drivers/scsi/pm8001/pm80xx_hwi.h
@@ -215,6 +215,8 @@
 #define SAS_DOPNRJT_RTRY_TMO128
 #define SAS_COPNRJT_RTRY_TMO128
 
+/* for phy state */
+#define PHY_STATE_LINK_UP_SPCV 0x2
 /*
   Making ORR bigger than IT NEXUS LOSS which is 200us = 2 second.
   Assuming a bigger value 3 second, 300/128 = 23437.5 where 128
-- 
1.7.1

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH V1 3/3] pm80xx: Tasklets synchronization fix.

2013-11-11 Thread Anand
From c3bcd7c02e1fa487edbab4c1d5182daca066db61 Mon Sep 17 00:00:00 2001
From: Nikith Ganigarakoppal nikith.ganigarakop...@pmcs.com
Date: Mon, 11 Nov 2013 15:28:14 +0530
Subject: [PATCH V1 3/3] pm80xx: Tasklets synchronization fix.

When multiple vectors are used, the vector variable is over written, 
resulting in unhandled operation for those vectors. 
This fix prevents the problem by maitaining HBA instance and 
vector values for each irq.
 
Signed-off-by: anandkumar.santha...@pmcs.com
Signed-off-by: nikith.ganigarakop...@pmcs.com

---
 drivers/scsi/pm8001/pm8001_init.c |   90 +---
 drivers/scsi/pm8001/pm8001_sas.h  |9 +++-
 2 files changed, 58 insertions(+), 41 deletions(-)

diff --git a/drivers/scsi/pm8001/pm8001_init.c 
b/drivers/scsi/pm8001/pm8001_init.c
index 17daaa4..83feae8 100644
--- a/drivers/scsi/pm8001/pm8001_init.c
+++ b/drivers/scsi/pm8001/pm8001_init.c
@@ -175,20 +175,16 @@ static void pm8001_free(struct pm8001_hba_info *pm8001_ha)
 static void pm8001_tasklet(unsigned long opaque)
 {
struct pm8001_hba_info *pm8001_ha;
-   u32 vec;
-   pm8001_ha = (struct pm8001_hba_info *)opaque;
+   struct isr_param *irq_vector;
+
+   irq_vector = (struct isr_param *)opaque;
+   pm8001_ha = irq_vector-drv_inst;
if (unlikely(!pm8001_ha))
BUG_ON(1);
-   vec = pm8001_ha-int_vector;
-   PM8001_CHIP_DISP-isr(pm8001_ha, vec);
+   PM8001_CHIP_DISP-isr(pm8001_ha, irq_vector-irq_id);
 }
 #endif
 
-static struct  pm8001_hba_info *outq_to_hba(u8 *outq)
-{
-   return container_of((outq - *outq), struct pm8001_hba_info, outq[0]);
-}
-
 /**
  * pm8001_interrupt_handler_msix - main MSIX interrupt handler.
  * It obtains the vector number and calls the equivalent bottom
@@ -198,18 +194,20 @@ static struct  pm8001_hba_info *outq_to_hba(u8 *outq)
  */
 static irqreturn_t pm8001_interrupt_handler_msix(int irq, void *opaque)
 {
-   struct pm8001_hba_info *pm8001_ha = outq_to_hba(opaque);
-   u8 outq = *(u8 *)opaque;
+   struct isr_param *irq_vector;
+   struct pm8001_hba_info *pm8001_ha;
irqreturn_t ret = IRQ_HANDLED;
+   irq_vector = (struct isr_param *)opaque;
+   pm8001_ha = irq_vector-drv_inst;
+
if (unlikely(!pm8001_ha))
return IRQ_NONE;
if (!PM8001_CHIP_DISP-is_our_interupt(pm8001_ha))
return IRQ_NONE;
-   pm8001_ha-int_vector = outq;
 #ifdef PM8001_USE_TASKLET
-   tasklet_schedule(pm8001_ha-tasklet);
+   tasklet_schedule(pm8001_ha-tasklet[irq_vector-irq_id]);
 #else
-   ret = PM8001_CHIP_DISP-isr(pm8001_ha, outq);
+   ret = PM8001_CHIP_DISP-isr(pm8001_ha, irq_vector-irq_id);
 #endif
return ret;
 }
@@ -230,9 +228,8 @@ static irqreturn_t pm8001_interrupt_handler_intx(int irq, 
void *dev_id)
if (!PM8001_CHIP_DISP-is_our_interupt(pm8001_ha))
return IRQ_NONE;
 
-   pm8001_ha-int_vector = 0;
 #ifdef PM8001_USE_TASKLET
-   tasklet_schedule(pm8001_ha-tasklet);
+   tasklet_schedule(pm8001_ha-tasklet[0]);
 #else
ret = PM8001_CHIP_DISP-isr(pm8001_ha, 0);
 #endif
@@ -457,7 +454,7 @@ static struct pm8001_hba_info *pm8001_pci_alloc(struct 
pci_dev *pdev,
 {
struct pm8001_hba_info *pm8001_ha;
struct sas_ha_struct *sha = SHOST_TO_SAS_HA(shost);
-
+   int j;
 
pm8001_ha = sha-lldd_ha;
if (!pm8001_ha)
@@ -480,12 +477,14 @@ static struct pm8001_hba_info *pm8001_pci_alloc(struct 
pci_dev *pdev,
pm8001_ha-iomb_size = IOMB_SIZE_SPC;
 
 #ifdef PM8001_USE_TASKLET
-   /**
-   * default tasklet for non msi-x interrupt handler/first msi-x
-   * interrupt handler
-   **/
-   tasklet_init(pm8001_ha-tasklet, pm8001_tasklet,
-   (unsigned long)pm8001_ha);
+   /* Tasklet for non msi-x interrupt */
+   if ((!pdev-msix_cap) || (pm8001_ha-chip_id == chip_8001))
+   tasklet_init(pm8001_ha-tasklet[0], pm8001_tasklet,
+   (unsigned long)(pm8001_ha-irq_vector[0]));
+   else
+   for (j = 0; j  PM8001_MAX_MSIX_VEC ; j++)
+   tasklet_init(pm8001_ha-tasklet[j], pm8001_tasklet,
+   (unsigned long)(pm8001_ha-irq_vector[j]));
 #endif
pm8001_ioremap(pm8001_ha);
if (!pm8001_alloc(pm8001_ha, ent))
@@ -733,19 +732,20 @@ static u32 pm8001_setup_msix(struct pm8001_hba_info 
*pm8001_ha)
pci_enable_msix request ret:%d no of intr %d\n,
rc, pm8001_ha-number_of_intr));
 
-   for (i = 0; i  number_of_intr; i++)
-   pm8001_ha-outq[i] = i;
 
for (i = 0; i  number_of_intr; i++) {
snprintf(intr_drvname[i], sizeof(intr_drvname[0]),
DRV_NAME%d, i);
+   pm8001_ha-irq_vector[i].irq_id = i;
+   

[PATCH 0/2] target: I/O topology for block backstores

2013-11-11 Thread Andy Grover
Hi Nab,

Here's a patch from October, along with a trivial cleanup. hch had
concerns about the four function pointers added to struct
se_subsystem_api, but was OK with it for now.

Please apply -- Andy

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/2] target: Pass through I/O topology for block backstores

2013-11-11 Thread Andy Grover
In addition to block size (already implemented), passing through
alignment offset, logical-to-phys block exponent, I/O granularity and
optimal I/O length will allow initiators to properly handle layout on
LUNs with 4K block sizes.

Tested with various weird values via scsi_debug module.

One thing to look at with this patch is the new block limits values --
instead of granularity 1 optimal 8192, Lio will now be returning whatever
the block device says, which may affect performance.

Signed-off-by: Andy Grover agro...@redhat.com
---
 drivers/target/target_core_iblock.c  |   43 ++
 drivers/target/target_core_sbc.c |   12 -
 drivers/target/target_core_spc.c |   11 +++-
 include/target/target_core_backend.h |5 
 4 files changed, 68 insertions(+), 3 deletions(-)

diff --git a/drivers/target/target_core_iblock.c 
b/drivers/target/target_core_iblock.c
index b9a3394..c87959f 100644
--- a/drivers/target/target_core_iblock.c
+++ b/drivers/target/target_core_iblock.c
@@ -710,6 +710,45 @@ static sector_t iblock_get_blocks(struct se_device *dev)
return iblock_emulate_read_cap_with_block_size(dev, bd, q);
 }
 
+static sector_t iblock_get_alignment_offset_lbas(struct se_device *dev)
+{
+   struct iblock_dev *ib_dev = IBLOCK_DEV(dev);
+   struct block_device *bd = ib_dev-ibd_bd;
+   int ret;
+
+   ret = bdev_alignment_offset(bd);
+   if (ret == -1)
+   return 0;
+
+   /* convert offset-bytes to offset-lbas */
+   return ret / bdev_logical_block_size(bd);
+}
+
+static unsigned int iblock_get_lbppbe(struct se_device *dev)
+{
+   struct iblock_dev *ib_dev = IBLOCK_DEV(dev);
+   struct block_device *bd = ib_dev-ibd_bd;
+   int logs_per_phys = bdev_physical_block_size(bd) / 
bdev_logical_block_size(bd);
+
+   return ilog2(logs_per_phys);
+}
+
+static unsigned int iblock_get_io_min(struct se_device *dev)
+{
+   struct iblock_dev *ib_dev = IBLOCK_DEV(dev);
+   struct block_device *bd = ib_dev-ibd_bd;
+
+   return bdev_io_min(bd);
+}
+
+static unsigned int iblock_get_io_opt(struct se_device *dev)
+{
+   struct iblock_dev *ib_dev = IBLOCK_DEV(dev);
+   struct block_device *bd = ib_dev-ibd_bd;
+
+   return bdev_io_opt(bd);
+}
+
 static struct sbc_ops iblock_sbc_ops = {
.execute_rw = iblock_execute_rw,
.execute_sync_cache = iblock_execute_sync_cache,
@@ -749,6 +788,10 @@ static struct se_subsystem_api iblock_template = {
.show_configfs_dev_params = iblock_show_configfs_dev_params,
.get_device_type= sbc_get_device_type,
.get_blocks = iblock_get_blocks,
+   .get_alignment_offset_lbas = iblock_get_alignment_offset_lbas,
+   .get_lbppbe = iblock_get_lbppbe,
+   .get_io_min = iblock_get_io_min,
+   .get_io_opt = iblock_get_io_opt,
.get_write_cache= iblock_get_write_cache,
 };
 
diff --git a/drivers/target/target_core_sbc.c b/drivers/target/target_core_sbc.c
index 6c17295..61a30f0 100644
--- a/drivers/target/target_core_sbc.c
+++ b/drivers/target/target_core_sbc.c
@@ -105,12 +105,22 @@ sbc_emulate_readcapacity_16(struct se_cmd *cmd)
buf[9] = (dev-dev_attrib.block_size  16)  0xff;
buf[10] = (dev-dev_attrib.block_size  8)  0xff;
buf[11] = dev-dev_attrib.block_size  0xff;
+
+   if (dev-transport-get_lbppbe)
+   buf[13] = dev-transport-get_lbppbe(dev)  0x0f;
+
+   if (dev-transport-get_alignment_offset_lbas) {
+   u16 lalba = dev-transport-get_alignment_offset_lbas(dev);
+   buf[14] = (lalba  8)  0x3f;
+   buf[15] = lalba  0xff;
+   }
+
/*
 * Set Thin Provisioning Enable bit following sbc3r22 in section
 * READ CAPACITY (16) byte 14 if emulate_tpu or emulate_tpws is enabled.
 */
if (dev-dev_attrib.emulate_tpu || dev-dev_attrib.emulate_tpws)
-   buf[14] = 0x80;
+   buf[14] |= 0x80;
 
rbuf = transport_kmap_data_sg(cmd);
if (rbuf) {
diff --git a/drivers/target/target_core_spc.c b/drivers/target/target_core_spc.c
index 0745395..f89a86f 100644
--- a/drivers/target/target_core_spc.c
+++ b/drivers/target/target_core_spc.c
@@ -452,6 +452,7 @@ spc_emulate_evpd_b0(struct se_cmd *cmd, unsigned char *buf)
struct se_device *dev = cmd-se_dev;
u32 max_sectors;
int have_tp = 0;
+   int opt, min;
 
/*
 * Following spc3r22 section 6.5.3 Block Limits VPD page, when
@@ -475,7 +476,10 @@ spc_emulate_evpd_b0(struct se_cmd *cmd, unsigned char *buf)
/*
 * Set OPTIMAL TRANSFER LENGTH GRANULARITY
 */
-   put_unaligned_be16(1, buf[6]);
+   if (dev-transport-get_io_min  (min = 
dev-transport-get_io_min(dev)))
+   put_unaligned_be16(min / dev-dev_attrib.block_size, buf[6]);
+   else
+   put_unaligned_be16(1, buf[6]);
 
  

[PATCH 2/2] target: Core does not need blkdev.h

2013-11-11 Thread Andy Grover
Target core does not depend on the block layer, only backstores that
use the block layer do.

Signed-off-by: Andy Grover agro...@redhat.com
---
 drivers/target/target_core_rd.c|1 -
 drivers/target/target_core_stat.c  |1 -
 drivers/target/target_core_transport.c |1 -
 3 files changed, 0 insertions(+), 3 deletions(-)

diff --git a/drivers/target/target_core_rd.c b/drivers/target/target_core_rd.c
index 131327a..4ffe5f2 100644
--- a/drivers/target/target_core_rd.c
+++ b/drivers/target/target_core_rd.c
@@ -27,7 +27,6 @@
 #include linux/string.h
 #include linux/parser.h
 #include linux/timer.h
-#include linux/blkdev.h
 #include linux/slab.h
 #include linux/spinlock.h
 #include scsi/scsi.h
diff --git a/drivers/target/target_core_stat.c 
b/drivers/target/target_core_stat.c
index 9c642e0..632665c 100644
--- a/drivers/target/target_core_stat.c
+++ b/drivers/target/target_core_stat.c
@@ -32,7 +32,6 @@
 #include linux/utsname.h
 #include linux/proc_fs.h
 #include linux/seq_file.h
-#include linux/blkdev.h
 #include linux/configfs.h
 #include scsi/scsi.h
 #include scsi/scsi_device.h
diff --git a/drivers/target/target_core_transport.c 
b/drivers/target/target_core_transport.c
index e0669ac..71f79f6 100644
--- a/drivers/target/target_core_transport.c
+++ b/drivers/target/target_core_transport.c
@@ -28,7 +28,6 @@
 #include linux/string.h
 #include linux/timer.h
 #include linux/slab.h
-#include linux/blkdev.h
 #include linux/spinlock.h
 #include linux/kthread.h
 #include linux/in.h
-- 
1.7.1

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH/RESEND v2 0/2] SATA disk resume time optimization

2013-11-11 Thread Todd E Brandt
500ms is actually quite alot when you compare it with android or iOS,
or even with other subsystems like USB. USB is the second worst offender 
after SATA disks and it tends to take around 600ms for most standard 
systems with an onboard webcam, and with other devices plugged in ever 
longer. I think we should optimize out every millisecond we can.

You mentioned that you tuned your system to remove all cases that trigger
non-essential disk wakeups, and that works great for you, but it won't 
help anyone using fedora or OpenSUSE or Yocto, etc. And even if you
could formallize your changes to Ubuntu to create a fully optimized 
user space, you're still at the mercy of the user's install choices, which
can quickly throw a wrench in the works.

On Sat, Nov 09, 2013 at 03:59:04PM -0500, Phillip Susi wrote:
 -BEGIN PGP SIGNED MESSAGE-
 Hash: SHA512
 
 On 11/08/2013 08:20 PM, Todd E Brandt wrote:
  I tested your patches and they do function. We tried a similar
  approach a few months back where instead of waking the scsi disks
  we just set them all to runtime_suspended and skipped the resume.
  Then we let them be awakened later by read/write access just as you
  have. It's a really tempting approach, in theory, since you're
  saving both time and power by only waking those disks you know you
  need. But in practice I've found that userspace doesn't play nice.
 
 I've just about gotten rid of all instances of user space waking the
 disks on my system.  The one I have left to do is to fake the IDENTIFY
 DEVICE command using the cached identity info to satisfy a script that
 tries to set the APM mode of the disk after resume.  If I disable that
 script, my extra disks stay asleep and Xwindows fires up nice and fast.
 
  In my experience the user layer almost always manages to wake up
  every mounted disk after resume, even if you didn't deliberately
  use them prior to suspend. The accesses can come from the file
  manager doing a scan after resume, or from any number of apps
  running on the system that decide they need to get even the
  smallest piece of information from the disks. A simple space check
  will wake them up.
 
 By simple space check I assume you mean df?  The superblocks easily
 fit in the cache so that shouldn't generate any IO.
 
  Thus when you leave all the disks stopped, user space ends up
  triggering a traffic jam when the OS wakes back up, which makes
  disk access take even longer.
  
  My patch works very similarly to yours but just triggers an
  asynchronous wakeup to all the disks in anticipation of userspace's
  needs. We've tested it pretty heavily on ubuntu machines of all
  types and it's done well.
 
 I don't think the wakeup is needed since ( ata ) disks normally wake
 up on their own unless you enable power up in standby and pro-actively
 initiating the wakeup only buys you maybe 500ms or less?  What is that
 compared to a typical 10 second startup time?
 
 -BEGIN PGP SIGNATURE-
 Version: GnuPG v1.4.14 (GNU/Linux)
 Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/
 
 iQEcBAEBCgAGBQJSfqIUAAoJEJrBOlT6nu75v3sH/R374d/Sgx3DB0DVGgDch3jA
 ZlR4eb5x5umw2CApGE0jbbj91/330Z5uxgr76tn6/nSRftDJ5ZgLc6dBTF1VwX4q
 fqxKgNY1euIARiCL4jLxiK9JfX7hB0GtknJaMRvG4JHaSP4d0Cvhr0sbd5mpmJp7
 P0TMVslJJHyIFVk0QjvisDBcFgo1onBkbVnX6B5Z6mPZXhAd+WCA3CJfiHnAK7t+
 mINmlTBXnZQFXLXY2rDrmZEUCLFfTqtlprkAuGdlfXsMVYBTD31notuZ74Xbv7C7
 vBJLiQ6b7dyF8eqcHoc49qqNO1n38nhRmYhIOSYgsyRFhECjVms5/mfEj9UBkiY=
 =iDTY
 -END PGP SIGNATURE-
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH/RESEND v2 0/2] SATA disk resume time optimization

2013-11-11 Thread Phillip Susi
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 11/11/2013 11:59 AM, Todd E Brandt wrote:
 500ms is actually quite alot when you compare it with android or
 iOS, or even with other subsystems like USB. USB is the second
 worst offender after SATA disks and it tends to take around 600ms
 for most standard systems with an onboard webcam, and with other
 devices plugged in ever longer. I think we should optimize out
 every millisecond we can.

I think you missed my point: if the disk is going to be started in
9000ms vs 8500ms, it doesn't make much difference, and I think in
practice the difference will be less than that.  I don't think it is
worth requiring that all disks start up to save a few percent in the
cases where the disk is required ( and I'm not clear that there even
are any such cases ).

 You mentioned that you tuned your system to remove all cases that
 trigger non-essential disk wakeups, and that works great for you,
 but it won't help anyone using fedora or OpenSUSE or Yocto, etc.
 And even if you could formallize your changes to Ubuntu to create a
 fully optimized user space, you're still at the mercy of the user's
 install choices, which can quickly throw a wrench in the works.

No, I did not tune my system; I fixed the kernel so that userspace's
activities do not start those disks.

-BEGIN PGP SIGNATURE-
Version: GnuPG v2.0.17 (MingW32)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQEcBAEBAgAGBQJSgQ8hAAoJEJrBOlT6nu75c04H/2ej9NzdyrWkuPv2SLGgb51Y
iBmeAkJEoFXwgu0UCZ2Ty3dpUkW7jcRiKKv5q8d1Anj4eGFIpx+CjUptCE3MCs1w
PWQm7uB6+DgOhj3aqF9DulBx0Yp3+gISwOTXYLAIirIReE4Vk904ZuYHo4IvJmug
vbk4p6n5iGLMqV/QFsymPoOFtUzKQPHj4YDL3Piu0fse8AOIHk5BJtmmtj1YbAck
oAsMIYz6UQZ2tXpzxcy1ISgZYUO8k6CcrXg8WE2ws9gZLNJtBHniLC1t2VCE10KF
BSzP+tgN+Glbk+HyJUuwjALa/CkFk9FKQJT8LJQfn0rRhjku/vZXsZoFgCQiIx0=
=THLl
-END PGP SIGNATURE-
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 5/5] Documentation: Add documentation for APM X-Gene SATA DTS binding

2013-11-11 Thread Loc Ho
Hi Arnd,

 ---
  .../devicetree/bindings/ata/apm-xgene.txt  |   84 
 

 Please Cc the devicetree-discuss mailing list for the binding submission.
[Loc Ho]
I did cc on the first version. But this email
'devicetree-disc...@lists.ozlabs.org' bounced on me.

 +- interrupt-parent   : Interrupt controller
 +- interrupts : Interrupt mapping for SATA IRQ
 +- #clock-cells   : Shall be value of 1

 Why is there a #clock-cells entry here?
[Loc Ho]
Okay... Let me see if I can get rip of this.


 +- clocks : Reference to the clock entry
 +- clock-names: Shall be eth01clk, eth23clk, or 
 eth45clk.

 This makes no sense. The clock-names property needs to have a fixed
 string according to the name of the clock input signal of the hardware,
 not a name of the clock provider.
[Loc Ho]
The clock eth01clk, eth23clk, and eth45clk are the internal
divider clock. They are not the physical input clock. It sounds like
we don't need the clock-names are all.


 +- serdes-diff-clk: Shall be 0 for external, 1 internal differential,
 +   or 2 internal single ended clock. Default is 0.
 +- gen-sel: Shall be 1 (force Gen1), 2 (Force Gen2, or 3 Gen3).
 +   Default is 3.
 +- EQA1   : Serdes EQ parameter for A1 chip. Default is 
 9.
 +- EQ : Serdes EQ parameter for non-A1 chip. Default is 2.
 +- GENAVG : Enable averaging Serdes calculation. Default is 0 for
 +   A1 chip and 1 for non-A1 chip.
 +- LBA1   : Serdes loopback buffer for A1 chip. Default 
 is 1;
 +- LB : Serdes loopback buffer for non-A1 chip. Default is 0;
 +- LCA1   : Serdes loopback enable control for A1 chip. 
 Default
 +   is 1;
 +- LC : Serdes loopback enable control for non-A1 chip.
 +   Default is 0;
 +- CDRA1  : Serdes SPD select CDR for A1 chip. Default 
 is 5.
 +- CDR: Serdes SPD select CDR for non-A1 chip. 
 Default is 5.
 +- PQA1   : Serdes PQ for A1 chip. Default is 8.
 +- PQ : Serdes PQ for non-A1 chip. Default is 0xA.
 +- coherent   : Enable coherent (1 = enable, 0 = disable).
 +   Default is 1.

 This looks like a really bad binding. I would suggest that instead of having
 individual register values in here, you hardwire the settings in the driver
 based on the compatible string. It's pretty crazy to put register-level 
 configuration
 in the DT like this.
[Loc Ho]
If I hardwire them in the driver, it will NOT scale across multiple
board. I guess if I moved it out to the PHY driver, then we can
discuss in that driver.


 Further, you should probably use the generic PHY binding to create a separate 
 driver
 for the serdes PHY,

[Loc Ho]
I will look into this.

-Loc
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 5/5] Documentation: Add documentation for APM X-Gene SATA DTS binding

2013-11-11 Thread Arnd Bergmann
On Monday 11 November 2013, Loc Ho wrote:
 Hi Arnd,
 
  ---
   .../devicetree/bindings/ata/apm-xgene.txt  |   84 
  
 
  Please Cc the devicetree-discuss mailing list for the binding submission.
 [Loc Ho]
 I did cc on the first version. But this email
 'devicetree-disc...@lists.ozlabs.org' bounced on me.

It has recently moved to devicet...@vger.kernel.org

  +- clocks : Reference to the clock entry
  +- clock-names: Shall be eth01clk, eth23clk, or 
  eth45clk.
 
  This makes no sense. The clock-names property needs to have a fixed
  string according to the name of the clock input signal of the hardware,
  not a name of the clock provider.
 [Loc Ho]
 The clock eth01clk, eth23clk, and eth45clk are the internal
 divider clock. They are not the physical input clock. It sounds like
 we don't need the clock-names are all.

Ok.

 
 
  +- serdes-diff-clk: Shall be 0 for external, 1 internal differential,
  +   or 2 internal single ended clock. Default is 0.
  +- gen-sel: Shall be 1 (force Gen1), 2 (Force Gen2, or 3 Gen3).
  +   Default is 3.
  +- EQA1   : Serdes EQ parameter for A1 chip. Default 
  is 9.
  +- EQ : Serdes EQ parameter for non-A1 chip. Default is 2.
  +- GENAVG : Enable averaging Serdes calculation. Default is 0 
  for
  +   A1 chip and 1 for non-A1 chip.
  +- LBA1   : Serdes loopback buffer for A1 chip. 
  Default is 1;
  +- LB : Serdes loopback buffer for non-A1 chip. Default is 
  0;
  +- LCA1   : Serdes loopback enable control for A1 
  chip. Default
  +   is 1;
  +- LC : Serdes loopback enable control for non-A1 chip.
  +   Default is 0;
  +- CDRA1  : Serdes SPD select CDR for A1 chip. Default 
  is 5.
  +- CDR: Serdes SPD select CDR for non-A1 chip. 
  Default is 5.
  +- PQA1   : Serdes PQ for A1 chip. Default is 8.
  +- PQ : Serdes PQ for non-A1 chip. Default is 0xA.
  +- coherent   : Enable coherent (1 = enable, 0 = disable).
  +   Default is 1.
 
  This looks like a really bad binding. I would suggest that instead of having
  individual register values in here, you hardwire the settings in the driver
  based on the compatible string. It's pretty crazy to put register-level 
  configuration
  in the DT like this.
 [Loc Ho]
 If I hardwire them in the driver, it will NOT scale across multiple
 board. I guess if I moved it out to the PHY driver, then we can
 discuss in that driver.

Yes, makes sense. If you need the values to change per board, it's probably 
best to
come up with a somewhat higher-level representation of the same contents. 
Ideally
we should be able to use the same properties for any SerDes PHY, regarless of
how the register level interface is implemented.

Arnd
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[RFC PATCH 1/2] eeprom-93cx6: Add (read-only) support for 8-bit mode

2013-11-11 Thread Ondrej Zary
Add read-only support for EEPROMs configured in 8-bit mode (ORG pin connected
to GND).
This will be used by wd719x driver.

Signed-off-by: Ondrej Zary li...@rainbow-software.org
---
 drivers/misc/eeprom/eeprom_93cx6.c |   62 +++-
 include/linux/eeprom_93cx6.h   |4 +++
 2 files changed, 65 insertions(+), 1 deletion(-)

diff --git a/drivers/misc/eeprom/eeprom_93cx6.c 
b/drivers/misc/eeprom/eeprom_93cx6.c
index 0ff4b02..0cf2c9d 100644
--- a/drivers/misc/eeprom/eeprom_93cx6.c
+++ b/drivers/misc/eeprom/eeprom_93cx6.c
@@ -170,7 +170,7 @@ static void eeprom_93cx6_read_bits(struct eeprom_93cx6 
*eeprom,
 }
 
 /**
- * eeprom_93cx6_read - Read multiple words from eeprom
+ * eeprom_93cx6_read - Read a word from eeprom
  * @eeprom: Pointer to eeprom structure
  * @word: Word index from where we should start reading
  * @data: target pointer where the information will have to be stored
@@ -235,6 +235,66 @@ void eeprom_93cx6_multiread(struct eeprom_93cx6 *eeprom, 
const u8 word,
 EXPORT_SYMBOL_GPL(eeprom_93cx6_multiread);
 
 /**
+ * eeprom_93cx6_readb - Read a byte from eeprom
+ * @eeprom: Pointer to eeprom structure
+ * @word: Byte index from where we should start reading
+ * @data: target pointer where the information will have to be stored
+ *
+ * This function will read a byte of the eeprom data
+ * into the given data pointer.
+ */
+void eeprom_93cx6_readb(struct eeprom_93cx6 *eeprom, const u8 byte,
+   u8 *data)
+{
+   u16 command;
+   u16 tmp;
+
+   /*
+* Initialize the eeprom register
+*/
+   eeprom_93cx6_startup(eeprom);
+
+   /*
+* Select the read opcode and the byte to be read.
+*/
+   command = (PCI_EEPROM_READ_OPCODE  (eeprom-width + 1)) | byte;
+   eeprom_93cx6_write_bits(eeprom, command,
+   PCI_EEPROM_WIDTH_OPCODE + eeprom-width + 1);
+
+   /*
+* Read the requested 8 bits.
+*/
+   eeprom_93cx6_read_bits(eeprom, tmp, 8);
+   *data = tmp  0xff;
+
+   /*
+* Cleanup eeprom register.
+*/
+   eeprom_93cx6_cleanup(eeprom);
+}
+EXPORT_SYMBOL_GPL(eeprom_93cx6_readb);
+
+/**
+ * eeprom_93cx6_multireadb - Read multiple bytes from eeprom
+ * @eeprom: Pointer to eeprom structure
+ * @byte: Index from where we should start reading
+ * @data: target pointer where the information will have to be stored
+ * @words: Number of bytes that should be read.
+ *
+ * This function will read all requested bytes from the eeprom,
+ * this is done by calling eeprom_93cx6_readb() multiple times.
+ */
+void eeprom_93cx6_multireadb(struct eeprom_93cx6 *eeprom, const u8 byte,
+   u8 *data, const u16 bytes)
+{
+   unsigned int i;
+
+   for (i = 0; i  bytes; i++)
+   eeprom_93cx6_readb(eeprom, byte + i, data[i]);
+}
+EXPORT_SYMBOL_GPL(eeprom_93cx6_multireadb);
+
+/**
  * eeprom_93cx6_wren - set the write enable state
  * @eeprom: Pointer to eeprom structure
  * @enable: true to enable writes, otherwise disable writes
diff --git a/include/linux/eeprom_93cx6.h b/include/linux/eeprom_93cx6.h
index e50f98b..eb0b198 100644
--- a/include/linux/eeprom_93cx6.h
+++ b/include/linux/eeprom_93cx6.h
@@ -75,6 +75,10 @@ extern void eeprom_93cx6_read(struct eeprom_93cx6 *eeprom,
const u8 word, u16 *data);
 extern void eeprom_93cx6_multiread(struct eeprom_93cx6 *eeprom,
const u8 word, __le16 *data, const u16 words);
+extern void eeprom_93cx6_readb(struct eeprom_93cx6 *eeprom,
+   const u8 byte, u8 *data);
+extern void eeprom_93cx6_multireadb(struct eeprom_93cx6 *eeprom,
+   const u8 byte, u8 *data, const u16 bytes);
 
 extern void eeprom_93cx6_wren(struct eeprom_93cx6 *eeprom, bool enable);
 
-- 
Ondrej Zary
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[RFC PATCH 2/2] wd719x: Introduce Western Digital WD7193/7197/7296 PCI SCSI card driver

2013-11-11 Thread Ondrej Zary
Introduce wd719x, a driver for Western Digital WD7193, WD7197 and WD7296 PCI
SCSI controllers based on WD33C296A chip.
Tested with WD7193 card.

Signed-off-by: Ondrej Zary li...@rainbow-software.org
---
 drivers/scsi/Kconfig  |8 +
 drivers/scsi/Makefile |1 +
 drivers/scsi/wd719x.c | 1008 +
 drivers/scsi/wd719x.h |  249 
 4 files changed, 1266 insertions(+)

diff --git a/drivers/scsi/Kconfig b/drivers/scsi/Kconfig
index 48b2918..713c207 100644
--- a/drivers/scsi/Kconfig
+++ b/drivers/scsi/Kconfig
@@ -1518,6 +1518,14 @@ config SCSI_NSP32
  To compile this driver as a module, choose M here: the
  module will be called nsp32.
 
+config SCSI_WD719X
+   tristate Western Digital WD7193/7197/7296 support
+   depends on PCI  SCSI
+   select EEPROM_93CX6
+   ---help---
+ This is a driver for Western Digital WD7193, WD7197 and WD7296 PCI
+ SCSI controllers (based on WD33C296A chip).
+
 config SCSI_DEBUG
tristate SCSI debugging host simulator
depends on SCSI
diff --git a/drivers/scsi/Makefile b/drivers/scsi/Makefile
index b607ba4..9cf61c4 100644
--- a/drivers/scsi/Makefile
+++ b/drivers/scsi/Makefile
@@ -145,6 +145,7 @@ obj-$(CONFIG_SCSI_PMCRAID)  += pmcraid.o
 obj-$(CONFIG_SCSI_VIRTIO)  += virtio_scsi.o
 obj-$(CONFIG_VMWARE_PVSCSI)+= vmw_pvscsi.o
 obj-$(CONFIG_HYPERV_STORAGE)   += hv_storvsc.o
+obj-$(CONFIG_SCSI_WD719X)  += wd719x.o
 
 obj-$(CONFIG_ARM)  += arm/
 
diff --git a/drivers/scsi/wd719x.c b/drivers/scsi/wd719x.c
new file mode 100644
index 000..ce6dafd
--- /dev/null
+++ b/drivers/scsi/wd719x.c
@@ -0,0 +1,1008 @@
+/*
+ * Driver for Western Digital WD7193, WD7197 and WD7296 SCSI cards
+ * Copyright 2013 Ondrej Zary
+ *
+ * Original driver by
+ * Aaron Dewell dew...@woods.net
+ * Gaerti juergen.gaert...@mbox.si.uni-hannover.de
+ *
+ * HW documentation available in book:
+ *
+ * SPIDER Command Protocol
+ * by Chandru M. Sippy
+ * SCSI Storage Products (MCP)
+ * Western Digital Corporation
+ * 09-15-95
+ *
+ * 
http://web.archive.org/web/20070717175254/http://sun1.rrzn.uni-hannover.de/gaertner.juergen/wd719x/Linux/Docu/Spider/
+ */
+
+/*
+ * Driver workflow:
+ * 1. SCSI command is transformed to SCB (Spider Control Block) by the
+ *queuecommand function.
+ * 2. The address of the SCB is stored in a list to be able to access it, if
+ *something goes wrong.
+ * 3. The address of the SCB is written to the Controller, which loads the SCB
+ *via BM-DMA and processes it.
+ * 4. After it has finished, it generates an interrupt, and sets registers.
+ *
+ * flaws:
+ *  - abort/reset functions
+ *
+ * ToDo:
+ *  - tagged queueing
+ */
+
+#include linux/module.h
+#include linux/delay.h
+#include linux/pci.h
+#include linux/firmware.h
+#include linux/eeprom_93cx6.h
+#include scsi/scsi_cmnd.h
+#include scsi/scsi_device.h
+#include scsi/scsi_host.h
+#include wd719x.h
+
+/* low-level register access */
+static inline u8 wd719x_readb(struct wd719x *wd, u8 reg)
+{
+   return ioread8(wd-base + reg);
+}
+
+static inline u32 wd719x_readl(struct wd719x *wd, u8 reg)
+{
+   return ioread32(wd-base + reg);
+}
+
+static inline void wd719x_writeb(struct wd719x *wd, u8 reg, u8 val)
+{
+   iowrite8(val, wd-base + reg);
+}
+
+static inline void wd719x_writew(struct wd719x *wd, u8 reg, u16 val)
+{
+   iowrite16(val, wd-base + reg);
+}
+
+static inline void wd719x_writel(struct wd719x *wd, u8 reg, u32 val)
+{
+   iowrite32(val, wd-base + reg);
+}
+
+/* wait until the command register is ready */
+static inline int wd719x_wait_ready(struct wd719x *wd)
+{
+   int i = 0;
+
+   do {
+   if (wd719x_readb(wd, WD719X_AMR_COMMAND) == WD719X_CMD_READY)
+   return 0;
+   udelay(1);
+   } while (i++  WD719X_WAIT_FOR_CMD_READY);
+
+   dev_err(wd-pdev-dev, command register is not ready: 0x%02x\n,
+   wd719x_readb(wd, WD719X_AMR_COMMAND));
+
+   return -ETIMEDOUT;
+}
+
+/* poll interrupt status register until command finishes */
+static inline int wd719x_wait_done(struct wd719x *wd, int timeout)
+{
+   u8 status;
+
+   while (timeout  0) {
+   status = wd719x_readb(wd, WD719X_AMR_INT_STATUS);
+   if (status)
+   break;
+   timeout--;
+   udelay(1);
+   }
+
+   if (timeout = 0) {
+   dev_err(wd-pdev-dev, direct command timed out\n);
+   return -ETIMEDOUT;
+   }
+
+   if (status != WD719X_INT_NOERRORS) {
+   dev_err(wd-pdev-dev, direct command failed, status 0x%02x, 
SUE 0x%02x\n,
+   status, wd719x_readb(wd, WD719X_AMR_SCB_ERROR));
+   return -EIO;
+   }
+
+   return 0;
+}
+
+static int wd719x_direct_cmd(struct wd719x *wd, u8 opcode, u8 dev, u8 lun,
+u8 tag, dma_addr_t data, int timeout)
+{
+   int 

[RFC PATCH 0/2] wd719x: Introduce Western Digital WD7193/7197/7296 PCI SCSI card driver

2013-11-11 Thread Ondrej Zary
Hello,
this is an attempt to provide a new driver for Western Digital WD7193, WD7197 
and WD7296 PCI SCSI controllers based on WD33C296A chip.

It's based on old and ugly wd719x driver written back in 2.0 days, then hacked 
to 2.2 and finally to 2.4 kernels. Most of the code is rewritten: from ~4100 
lines to ~1200.

I've tested the driver on a WD7193 card with some hard drives and CD-ROMs:
QUANTUM  LP240S GM240S01X 4.6
IBM  0663L12  e g
IBM  DORS-32160   WA0A
SONY CD-ROM CDU-55S   1.0t
SONY CD-ROM CDU-415   1.1g

The card supports TCQ and linked commands (for cmd_per_lun  1?) but I don't 
know how it should be implemented in a driver.

Device/bus/host resets seem to work fine when tested by sg_reset.
But don't know how to test command abort.

The card requires firmware that can be cut out of the Windows NT driver that 
can be downloaded from WD at: 
http://support.wdc.com/product/download.asp?groupid=801sid=27lang=en
There is no license anywhere in the file or on the page - so I guess that the 
firmware cannot be added to linux-firmware.

This script downloads and extracts the firmware:

#!/bin/sh
wget http://support.wdc.com/download/archive/pciscsi.exe
lha xi pciscsi.exe pci-scsi.exe
lha xi pci-scsi.exe nt/wd7296a.sys
rm pci-scsi.exe
dd if=wd7296a.sys of=wd719x-risc.bin bs=1 skip=5760 count=14336
dd if=wd7296a.sys of=wd719x-wcs.bin bs=1 skip=20096 count=514
rm wd7296a.sys


-- 
Ondrej Zary
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: CmdSN greather than MaxCmdSN protocol error in LIO Iser

2013-11-11 Thread Nicholas A. Bellinger
On Mon, 2013-11-11 at 16:48 -0800, Nicholas A. Bellinger wrote:
 On Mon, 2013-11-11 at 16:42 -0800, Nicholas A. Bellinger wrote:
  On Mon, 2013-11-11 at 13:17 -0800, Nicholas A. Bellinger wrote:
   Hi Moussa  Co,
   
   On Mon, 2013-11-11 at 19:05 +, Moussa Ba (moussaba) wrote:
My system setup is as follows:

Target: CentOS 6.4 - LIO Target running 3.12 kernel, 8 PCIe SSD in one 
volume group, 8 logical volumes, 8 targets, 1 LUN/target.
Initiator: CentOS 6.4 running 3.11 Kernel (Also ran 2.6.32-358), ISER 
based initiator over Mellanox 40Gbit ConnectX HCA

When running performance tests on the initiator, I am running into
   fio timeouts that lead to ABORT_TASK commands on the target.  In other
   words, fio fails to reap all the io threads almost as if it is
   waiting for lost IOs to complete. This is happening on random write IO
   operations.  Some context, we are generating about 576KIOPS 4KB block
   sizes using 8 LUNS.  The PCIe SSD have a write buffer that can absorb
   writes with an end to end latency on the initiator of 44us.  We are
   not currently seeing any errors on read IOs, which tend to have 2X+
   the latency of writes.  

See below for the dmesg on the target side.
Timeout Condition occurs at 154 which is the Protocol Error after fio 
is interrupted or runs to completion.  
[  154.453663] Received CmdSN: 0x000fcbb7 is greater than MaxCmdSN: 
0x000fcbb6, protocol error.
[  154.453673] Received CmdSN: 0x000fcbb8 is greater than MaxCmdSN: 
0x000fcbb6, protocol error.

   
   (CC'ing Mike)
   
   As mentioned off-list, this would tend to indicate some manner of
   open-iscsi bug, as it's never legal for an initiator to send a CmdSN
   greater than the MaxCmdSN that's currently being sent in target outgoing
   response PDUs.
   
   Mike, any idea as to how this might be happening on the initiator
   side..?
   
  

SNIP

M, just noticed a bit of very suspicious logic in open-iscsi.
(CC'ing linux-scsi)

Namely, how drivers/scsi/libiscsi.c:iscsi_data_xmit() attempts to drain
conn-cmdqueue and hence call iscsi_prep_scsi_cmd_pdu() + increment
session-cmdsn, without ever checking to see if the CmdSN window may
have already closed..

The only CmdSN window check that I see in the I/O dispatch codepath is
in iscsi_queuecommand() - iscsi_check_cmdsn_window_closed(), but once
iscsi_conn_queue_work() is invoked and process context in
iscsi_xmitworker() - iscsi_data_xmit() starts to execute, there is
AFAICT no further CmdSN window open checks to ensure the initiator does
not overflow MaxCmdSN..

This would very much line up with the type of bug that is being
reported.  Before trying to determine what a possible fix might look
like, can you try the following libiscsi patch to see if your able to
hit either of the BUG's below..?

Thanks,

--nab

diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
index e399561..9106f63 100644
--- a/drivers/scsi/libiscsi.c
+++ b/drivers/scsi/libiscsi.c
@@ -1483,6 +1483,12 @@ check_mgmt:
fail_scsi_task(conn-task, DID_IMM_RETRY);
continue;
}
+
+   if (iscsi_check_cmdsn_window_closed(conn)) {
+   printk(CmdSN window already closed in iscsi_data_xmit 
#1\n);
+   BUG();
+   }
+
rc = iscsi_prep_scsi_cmd_pdu(conn-task);
if (rc) {
if (rc == -ENOMEM || rc == -EACCES) {
@@ -1518,6 +1524,11 @@ check_mgmt:
if (iscsi_check_tmf_restrictions(task, ISCSI_OP_SCSI_DATA_OUT))
break;
 
+   if (iscsi_check_cmdsn_window_closed(conn)) {
+   printk(CmdSN window already closed in iscsi_data_xmit 
#2\n);
+   BUG();
+   }
+
conn-task = task;
list_del_init(conn-task-running);
conn-task-state = ISCSI_TASK_RUNNING;


--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] target: Pass through I/O topology for block backstores

2013-11-11 Thread Martin K. Petersen
 Andy == Andy Grover agro...@redhat.com writes:

Andy In addition to block size (already implemented), passing through
Andy alignment offset, logical-to-phys block exponent, I/O granularity
Andy and optimal I/O length will allow initiators to properly handle
Andy layout on LUNs with 4K block sizes.

Looks good to me, Andy.

Acked-by: Martin K. Petersen martin.peter...@oracle.com

-- 
Martin K. Petersen  Oracle Linux Engineering
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: CmdSN greather than MaxCmdSN protocol error in LIO Iser

2013-11-11 Thread Nicholas A. Bellinger
On Mon, 2013-11-11 at 17:31 -0800, Nicholas A. Bellinger wrote:
 On Mon, 2013-11-11 at 16:48 -0800, Nicholas A. Bellinger wrote:
  On Mon, 2013-11-11 at 16:42 -0800, Nicholas A. Bellinger wrote:
   On Mon, 2013-11-11 at 13:17 -0800, Nicholas A. Bellinger wrote:
Hi Moussa  Co,

On Mon, 2013-11-11 at 19:05 +, Moussa Ba (moussaba) wrote:
 My system setup is as follows:
 
 Target: CentOS 6.4 - LIO Target running 3.12 kernel, 8 PCIe SSD in 
 one volume group, 8 logical volumes, 8 targets, 1 LUN/target.
 Initiator: CentOS 6.4 running 3.11 Kernel (Also ran 2.6.32-358), ISER 
 based initiator over Mellanox 40Gbit ConnectX HCA
 
 When running performance tests on the initiator, I am running into
fio timeouts that lead to ABORT_TASK commands on the target.  In other
words, fio fails to reap all the io threads almost as if it is
waiting for lost IOs to complete. This is happening on random write IO
operations.  Some context, we are generating about 576KIOPS 4KB block
sizes using 8 LUNS.  The PCIe SSD have a write buffer that can absorb
writes with an end to end latency on the initiator of 44us.  We are
not currently seeing any errors on read IOs, which tend to have 2X+
the latency of writes.  
 
 See below for the dmesg on the target side.
 Timeout Condition occurs at 154 which is the Protocol Error after fio 
 is interrupted or runs to completion.  
 [  154.453663] Received CmdSN: 0x000fcbb7 is greater than MaxCmdSN: 
 0x000fcbb6, protocol error.
 [  154.453673] Received CmdSN: 0x000fcbb8 is greater than MaxCmdSN: 
 0x000fcbb6, protocol error.
 

(CC'ing Mike)

As mentioned off-list, this would tend to indicate some manner of
open-iscsi bug, as it's never legal for an initiator to send a CmdSN
greater than the MaxCmdSN that's currently being sent in target outgoing
response PDUs.

Mike, any idea as to how this might be happening on the initiator
side..?

   
 
 SNIP
 
 M, just noticed a bit of very suspicious logic in open-iscsi.
 (CC'ing linux-scsi)
 
 Namely, how drivers/scsi/libiscsi.c:iscsi_data_xmit() attempts to drain
 conn-cmdqueue and hence call iscsi_prep_scsi_cmd_pdu() + increment
 session-cmdsn, without ever checking to see if the CmdSN window may
 have already closed..
 
 The only CmdSN window check that I see in the I/O dispatch codepath is
 in iscsi_queuecommand() - iscsi_check_cmdsn_window_closed(), but once
 iscsi_conn_queue_work() is invoked and process context in
 iscsi_xmitworker() - iscsi_data_xmit() starts to execute, there is
 AFAICT no further CmdSN window open checks to ensure the initiator does
 not overflow MaxCmdSN..
 
 This would very much line up with the type of bug that is being
 reported.  Before trying to determine what a possible fix might look
 like, can you try the following libiscsi patch to see if your able to
 hit either of the BUG's below..?
 

One more time with a different CmdSN check against the currently
allocated task-cmdsn in iscsi_data_xmit(), as the previous patch using
iscsi_check_cmdsn_window_closed() only checks against the value in
session-queued_cmdsn, that will have already been incremented before
returning from iscsi_queuecommand().

--nab

diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
index e399561..760f2cf 100644
--- a/drivers/scsi/libiscsi.c
+++ b/drivers/scsi/libiscsi.c
@@ -1494,6 +1494,11 @@ check_mgmt:
fail_scsi_task(conn-task, DID_ABORT);
continue;
}
+   if (!iscsi_sna_lte(conn-task-cmdsn, 
conn-session-max_cmdsn)) {
+   printk(CmdSN window already closed in iscsi_data_xmit 
#1 0x%08x/0x%08x\n,
+   conn-task-cmdsn, 
conn-session-max_cmdsn);
+   BUG();
+   }
rc = iscsi_xmit_task(conn);
if (rc)
goto done;
@@ -1518,6 +1523,12 @@ check_mgmt:
if (iscsi_check_tmf_restrictions(task, ISCSI_OP_SCSI_DATA_OUT))
break;
 
+   if (!iscsi_sna_lte(conn-task-cmdsn, 
conn-session-max_cmdsn)) {
+   printk(CmdSN window already closed in iscsi_data_xmit 
#2 0x%08x/0x%08x\n,
+   conn-task-cmdsn, 
conn-session-max_cmdsn);
+   BUG();
+   }
+
conn-task = task;
list_del_init(conn-task-running);
conn-task-state = ISCSI_TASK_RUNNING;

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: CmdSN greather than MaxCmdSN protocol error in LIO Iser

2013-11-11 Thread Michael Christie


 On Nov 11, 2013, at 7:31 PM, Nicholas A. Bellinger n...@linux-iscsi.org 
 wrote:
 
 On Mon, 2013-11-11 at 16:48 -0800, Nicholas A. Bellinger wrote:
 On Mon, 2013-11-11 at 16:42 -0800, Nicholas A. Bellinger wrote:
 On Mon, 2013-11-11 at 13:17 -0800, Nicholas A. Bellinger wrote:
 Hi Moussa  Co,
 
 On Mon, 2013-11-11 at 19:05 +, Moussa Ba (moussaba) wrote:
 My system setup is as follows:
 
 Target: CentOS 6.4 - LIO Target running 3.12 kernel, 8 PCIe SSD in one 
 volume group, 8 logical volumes, 8 targets, 1 LUN/target.
 Initiator: CentOS 6.4 running 3.11 Kernel (Also ran 2.6.32-358), ISER 
 based initiator over Mellanox 40Gbit ConnectX HCA
 
 When running performance tests on the initiator, I am running into
 fio timeouts that lead to ABORT_TASK commands on the target.  In other
 words, fio fails to reap all the io threads almost as if it is
 waiting for lost IOs to complete. This is happening on random write IO
 operations.  Some context, we are generating about 576KIOPS 4KB block
 sizes using 8 LUNS.  The PCIe SSD have a write buffer that can absorb
 writes with an end to end latency on the initiator of 44us.  We are
 not currently seeing any errors on read IOs, which tend to have 2X+
 the latency of writes.  
 
 See below for the dmesg on the target side.
 Timeout Condition occurs at 154 which is the Protocol Error after fio is 
 interrupted or runs to completion.  
 [  154.453663] Received CmdSN: 0x000fcbb7 is greater than MaxCmdSN: 
 0x000fcbb6, protocol error.
 [  154.453673] Received CmdSN: 0x000fcbb8 is greater than MaxCmdSN: 
 0x000fcbb6, protocol error.
 
 (CC'ing Mike)
 
 As mentioned off-list, this would tend to indicate some manner of
 open-iscsi bug, as it's never legal for an initiator to send a CmdSN
 greater than the MaxCmdSN that's currently being sent in target outgoing
 response PDUs.
 
 Mike, any idea as to how this might be happening on the initiator
 side..?
 
 SNIP
 
 M, just noticed a bit of very suspicious logic in open-iscsi.
 (CC'ing linux-scsi)
 
 Namely, how drivers/scsi/libiscsi.c:iscsi_data_xmit() attempts to drain
 conn-cmdqueue and hence call iscsi_prep_scsi_cmd_pdu() + increment
 session-cmdsn, without ever checking to see if the CmdSN window may
 have already closed..
 
 The only CmdSN window check that I see in the I/O dispatch codepath is
 in iscsi_queuecommand() - iscsi_check_cmdsn_window_closed(), but once
 iscsi_conn_queue_work() is invoked and process context in
 iscsi_xmitworker() - iscsi_data_xmit() starts to execute, there is
 AFAICT no further CmdSN window open checks to ensure the initiator does
 not overflow MaxCmdSN..
 
 This would very much line up with the type of bug that is being
 reported.  Before trying to determine what a possible fix might look
 like, can you try the following libiscsi patch to see if your able to
 hit either of the BUG's below..?
 
 Thanks,
 
 --nab
 
 diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
 index e399561..9106f63 100644
 --- a/drivers/scsi/libiscsi.c
 +++ b/drivers/scsi/libiscsi.c
 @@ -1483,6 +1483,12 @@ check_mgmt:
fail_scsi_task(conn-task, DID_IMM_RETRY);
continue;
}
 +
 +   if (iscsi_check_cmdsn_window_closed(conn)) {
 +   printk(CmdSN window already closed in 
 iscsi_data_xmit #1\n);
 +   BUG();
 +   }
 +
rc = iscsi_prep_scsi_cmd_pdu(conn-task);
if (rc) {
if (rc == -ENOMEM || rc == -EACCES) {
 @@ -1518,6 +1524,11 @@ check_mgmt:
if (iscsi_check_tmf_restrictions(task, ISCSI_OP_SCSI_DATA_OUT))
break;
 
 +   if (iscsi_check_cmdsn_window_closed(conn)) {
 +   printk(CmdSN window already closed in 
 iscsi_data_xmit #2\n);
 +   BUG();
 +   }
 +
conn-task = task;
list_del_init(conn-task-running);
conn-task-state = ISCSI_TASK_RUNNING;
 
 

If you hit a bug there then it is probably the target or if you are using the 
new libiscsi back/frwd lock patches it could be related to them. We should not 
need to check above because we never queue more cmds from the scsi layer than 
the window will allow at the time. If the window is 10, the queued_cmdsn check 
should make sure we have only accepted 10 commands into the iscsi layer so we 
should not need to recheck above.

You should just enable libiscsi debug printk in iscsi_check_cmdsn_window so we 
can see the sn related values at the time of queuing. You should enable that 
printk whithout your patch.


--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: CmdSN greather than MaxCmdSN protocol error in LIO Iser

2013-11-11 Thread Michael Christie


 On Nov 11, 2013, at 8:32 PM, Michael Christie micha...@cs.wisc.edu wrote:
 
 
 
 On Nov 11, 2013, at 7:31 PM, Nicholas A. Bellinger n...@linux-iscsi.org 
 wrote:
 
 On Mon, 2013-11-11 at 16:48 -0800, Nicholas A. Bellinger wrote:
 On Mon, 2013-11-11 at 16:42 -0800, Nicholas A. Bellinger wrote:
 On Mon, 2013-11-11 at 13:17 -0800, Nicholas A. Bellinger wrote:
 Hi Moussa  Co,
 
 On Mon, 2013-11-11 at 19:05 +, Moussa Ba (moussaba) wrote:
 My system setup is as follows:
 
 Target: CentOS 6.4 - LIO Target running 3.12 kernel, 8 PCIe SSD in one 
 volume group, 8 logical volumes, 8 targets, 1 LUN/target.
 Initiator: CentOS 6.4 running 3.11 Kernel (Also ran 2.6.32-358), ISER 
 based initiator over Mellanox 40Gbit ConnectX HCA
 
 When running performance tests on the initiator, I am running into
 fio timeouts that lead to ABORT_TASK commands on the target.  In other
 words, fio fails to reap all the io threads almost as if it is
 waiting for lost IOs to complete. This is happening on random write IO
 operations.  Some context, we are generating about 576KIOPS 4KB block
 sizes using 8 LUNS.  The PCIe SSD have a write buffer that can absorb
 writes with an end to end latency on the initiator of 44us.  We are
 not currently seeing any errors on read IOs, which tend to have 2X+
 the latency of writes.  
 
 See below for the dmesg on the target side.
 Timeout Condition occurs at 154 which is the Protocol Error after fio is 
 interrupted or runs to completion.  
 [  154.453663] Received CmdSN: 0x000fcbb7 is greater than MaxCmdSN: 
 0x000fcbb6, protocol error.
 [  154.453673] Received CmdSN: 0x000fcbb8 is greater than MaxCmdSN: 
 0x000fcbb6, protocol error.
 
 (CC'ing Mike)
 
 As mentioned off-list, this would tend to indicate some manner of
 open-iscsi bug, as it's never legal for an initiator to send a CmdSN
 greater than the MaxCmdSN that's currently being sent in target outgoing
 response PDUs.
 
 Mike, any idea as to how this might be happening on the initiator
 side..?
 
 SNIP
 
 M, just noticed a bit of very suspicious logic in open-iscsi.
 (CC'ing linux-scsi)
 
 Namely, how drivers/scsi/libiscsi.c:iscsi_data_xmit() attempts to drain
 conn-cmdqueue and hence call iscsi_prep_scsi_cmd_pdu() + increment
 session-cmdsn, without ever checking to see if the CmdSN window may
 have already closed..
 
 The only CmdSN window check that I see in the I/O dispatch codepath is
 in iscsi_queuecommand() - iscsi_check_cmdsn_window_closed(), but once
 iscsi_conn_queue_work() is invoked and process context in
 iscsi_xmitworker() - iscsi_data_xmit() starts to execute, there is
 AFAICT no further CmdSN window open checks to ensure the initiator does
 not overflow MaxCmdSN..
 
 This would very much line up with the type of bug that is being
 reported.  Before trying to determine what a possible fix might look
 like, can you try the following libiscsi patch to see if your able to
 hit either of the BUG's below..?
 
 Thanks,
 
 --nab
 
 diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
 index e399561..9106f63 100644
 --- a/drivers/scsi/libiscsi.c
 +++ b/drivers/scsi/libiscsi.c
 @@ -1483,6 +1483,12 @@ check_mgmt:
   fail_scsi_task(conn-task, DID_IMM_RETRY);
   continue;
   }
 +
 +   if (iscsi_check_cmdsn_window_closed(conn)) {
 +   printk(CmdSN window already closed in 
 iscsi_data_xmit #1\n);
 +   BUG();
 +   }
 +
   rc = iscsi_prep_scsi_cmd_pdu(conn-task);
   if (rc) {
   if (rc == -ENOMEM || rc == -EACCES) {
 @@ -1518,6 +1524,11 @@ check_mgmt:
   if (iscsi_check_tmf_restrictions(task, ISCSI_OP_SCSI_DATA_OUT))
   break;
 
 +   if (iscsi_check_cmdsn_window_closed(conn)) {
 +   printk(CmdSN window already closed in 
 iscsi_data_xmit #2\n);
 +   BUG();
 +   }
 +
   conn-task = task;
   list_del_init(conn-task-running);
   conn-task-state = ISCSI_TASK_RUNNING;
 
 If you hit a bug there then it is probably the target or if you are using the 
 new libiscsi back/frwd lock patches it could be related to them. We should 
 not need to check above because we never queue more cmds from the scsi layer 
 than the window will allow at the time. If the window is 10, the queued_cmdsn 
 check should make sure we have only accepted 10 commands into the iscsi layer 
 so we should not need to recheck above.
 
 You should just enable libiscsi debug printk in iscsi_check_cmdsn_window so 
 we can see the sn related values at the time of queuing. You should enable 
 that printk whithout your patch.

Actually that will not help because we always have the queued cmdsn lower than 
the max in that path. I would do your patch but then also print the queued 
cmdsn, cmdsn and also the maxcmdsn in the check function.


I am not by my computer so I cannot send a 

Re: CmdSN greather than MaxCmdSN protocol error in LIO Iser

2013-11-11 Thread Nicholas A. Bellinger
On Mon, 2013-11-11 at 20:39 -0600, Michael Christie wrote:
 
  On Nov 11, 2013, at 8:32 PM, Michael Christie micha...@cs.wisc.edu wrote:
  

SNIP

  Mike, any idea as to how this might be happening on the initiator
  side..?
  
  SNIP
  
  M, just noticed a bit of very suspicious logic in open-iscsi.
  (CC'ing linux-scsi)
  
  Namely, how drivers/scsi/libiscsi.c:iscsi_data_xmit() attempts to drain
  conn-cmdqueue and hence call iscsi_prep_scsi_cmd_pdu() + increment
  session-cmdsn, without ever checking to see if the CmdSN window may
  have already closed..
  
  The only CmdSN window check that I see in the I/O dispatch codepath is
  in iscsi_queuecommand() - iscsi_check_cmdsn_window_closed(), but once
  iscsi_conn_queue_work() is invoked and process context in
  iscsi_xmitworker() - iscsi_data_xmit() starts to execute, there is
  AFAICT no further CmdSN window open checks to ensure the initiator does
  not overflow MaxCmdSN..
  
  This would very much line up with the type of bug that is being
  reported.  Before trying to determine what a possible fix might look
  like, can you try the following libiscsi patch to see if your able to
  hit either of the BUG's below..?
  
  Thanks,
  
  --nab
  
  diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
  index e399561..9106f63 100644
  --- a/drivers/scsi/libiscsi.c
  +++ b/drivers/scsi/libiscsi.c
  @@ -1483,6 +1483,12 @@ check_mgmt:
fail_scsi_task(conn-task, DID_IMM_RETRY);
continue;
}
  +
  +   if (iscsi_check_cmdsn_window_closed(conn)) {
  +   printk(CmdSN window already closed in 
  iscsi_data_xmit #1\n);
  +   BUG();
  +   }
  +
rc = iscsi_prep_scsi_cmd_pdu(conn-task);
if (rc) {
if (rc == -ENOMEM || rc == -EACCES) {
  @@ -1518,6 +1524,11 @@ check_mgmt:
if (iscsi_check_tmf_restrictions(task, 
  ISCSI_OP_SCSI_DATA_OUT))
break;
  
  +   if (iscsi_check_cmdsn_window_closed(conn)) {
  +   printk(CmdSN window already closed in 
  iscsi_data_xmit #2\n);
  +   BUG();
  +   }
  +
conn-task = task;
list_del_init(conn-task-running);
conn-task-state = ISCSI_TASK_RUNNING;
  
  If you hit a bug there then it is probably the target or if you are
 using the new libiscsi back/frwd lock patches it could be related to
 them. 

FYI, this is with v3.11 and RHEL 6.x open-iscsi code..

 We should not need to check above because we never queue more cmds
 from the scsi layer than the window will allow at the time. If the
 window is 10, the queued_cmdsn check should make sure we have only
 accepted 10 commands into the iscsi layer so we should not need to
 recheck above.
  
  You should just enable libiscsi debug printk in
 iscsi_check_cmdsn_window so we can see the sn related values at the
 time of queuing. You should enable that printk whithout your patch.
 
 Actually that will not help because we always have the queued cmdsn
 lower than the max in that path. I would do your patch but then also
 print the queued cmdsn, cmdsn and also the maxcmdsn in the check
 function.
 
 

Yes, notice the second test patch is doing the explicit check of
task-cmdsn vs. sess-max_cmdsn, instead of the one that checks
-queued_cmdsn vs. -max_cmdsn in iscsi_check_cmdsn_window_closed().

Btw, I'm surmising the bug in question does not happen from the
iscsi_queuecommand() submission path, but rather from the
iscsi_update_cmdsn() - __iscsi_update_cmdsn() completion path, where
iscsi_conn_queue_work() gets called when the CmdSN window was
previously closed, eg:

if (max_cmdsn != session-max_cmdsn 
!iscsi_sna_lt(max_cmdsn, session-max_cmdsn)) {
session-max_cmdsn = max_cmdsn;
/*
 * if the window closed with IO queued, then kick the
 * xmit thread
 */
if (!list_empty(session-leadconn-cmdqueue) ||
!list_empty(session-leadconn-mgmtqueue))
iscsi_conn_queue_work(session-leadconn);
}
}

Once iscsi_conn_queue_work() is invoked here to start process context
execution of iscsi_xmitworker() - iscsi_data_xmit() code, AFAICT there
is no logic in place within iscsi_data_xmit() to honor the last received
MaxCmdSN.

Or to put it another way: what is preventing iscsi_data_xmit() from
completely draining both conn-cmdqueue + conn-requeue, even when the
CmdSN window has potentially been closed again..?

--nab

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 3/5] ata: Add APM X-Gene SATA driver

2013-11-11 Thread Loc Ho
Hi Arnd/Olof,

I looked over the phy code for USB and NET. There isn't such PHY
infrastructure for SATA from what I can tell. It seems like I will
need to put this all together. I am thinking about porting the USB
version over (with changes for SATA) and put it under
./drivers/ata/phy. Any suggestion?

-Loc
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: CmdSN greather than MaxCmdSN protocol error in LIO Iser

2013-11-11 Thread Or Gerlitz

On 12/11/2013 06:34, Nicholas A. Bellinger wrote:

Once iscsi_conn_queue_work() is invoked here to start process context
execution of iscsi_xmitworker() - iscsi_data_xmit() code, AFAICT there
is no logic in place within iscsi_data_xmit() to honor the last received
MaxCmdSN.

Or to put it another way: what is preventing iscsi_data_xmit() from
completely draining both conn-cmdqueue + conn-requeue, even when the
CmdSN window has potentially been closed again..?


Guys,

Note that the iser initiator transport uses the pass-through command 
submission mode of libiscsi, that is

iscsi_conn_queue_work isn't called from queuecommand at all.

This is b/c we call iscsi_host_allocwith xmit_can_sleep = 0. Hence no 
workqueue is used for the command processing/submission over the wire, 
just a call toiscsi_prep_scsi_cmd_pdu and following that to iser's 
xmit_task callbackwhich isiscsi_iser_task_xmit that calls 
iser_send_command, etc.


Mike, Nic is not using the new locking framework patches for libiscsi, 
as you know they are not upstream

yet...

Or.


--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html