Re: usage of max_sectors in scsi_host_template

2007-11-07 Thread Stefan Richter
Erez Zilber wrote:
 I'm running sgp_dd (on RHAS 4 up4 - kernel version is 2.6.9), so it 
 calls scsi-ml directly (without going through ll_rw_blk).
...
 I guess that the max_sectors value is never used. Am I right?

I have no idea.  However you might be able to track how max_sectors
trickles through the layers via LXR:
http://lxr.free-electrons.com/ident?i=max_sectors
(Not all of the LXR sites out there support search for struct members
but free-electrons' does.)
-- 
Stefan Richter
-=-=-=== =-== --===
http://arcgraph.de/sr/
-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: LTO-3 read performance issues

2007-11-07 Thread Kai Makisara
On Wed, 7 Nov 2007, James Pearson wrote:

 I have two LTO-3 (QUANTUM ULTRIUM 3) drives attached to a dual Adaptec U160
 controller (one per SCSI host) on a Dell PE2850 running a RHEL4 based kernel
 (2.6.9 based).
 
 I'm trying to read (with tar) LTO-3 tapes written on another system (possibly
 an SGI IRIX box), but I'm getting extremely variable read rates - from a few
 Kb/s to tens of Mb/s - while reading the same tape
 
 After a bit of trial and error, it looks like the tapes have been written in
 variable block mode with a block size of 16Kb
 
 To list the tapes, I need to set the block size to 0 (mt setblk 0) and run:
 
 tar tvfb $TAPE 32
 
 Running strace on the tar process shows that it does a number of read()'s then
 'sticks' on a read() for a number of seconds, and then does a burst of
 read()'s - the number of reads it does in these bursts and the time it waits
 on a particular read vary.
 
 My guess is that this has something to do with the drive having to reposition the tape
 between reads, which breaks the tape streaming ...
 
The Quantum LTO-3 sustained transfer rate is 68 MB/s, and compression 
increases this. To avoid stopping the tape, your system has to be able to 
process the data continuously at this rate. If it cannot, you can't avoid 
the stops.
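
As a rough illustration of what that rate means per read() in variable block mode, here is a back-of-the-envelope sketch. The helper name is made up, and it assumes one tape block per read() and 68 MB/s taken as 68,000,000 bytes per second:

```c
#include <stdint.h>

/*
 * Hypothetical helper: number of read() calls per second needed to keep
 * up with a tape streaming at rate_bytes_per_sec when each read() returns
 * exactly one tape block of block_bytes.
 */
static uint64_t reads_per_second(uint64_t rate_bytes_per_sec,
				 uint64_t block_bytes)
{
	if (block_bytes == 0)
		return 0;	/* avoid dividing by zero */
	/* round up: a partial block still costs a whole read() */
	return (rate_bytes_per_sec + block_bytes - 1) / block_bytes;
}
```

With 16 KB (16384-byte) blocks this comes to a little over 4000 reads per second, while 1 MB transfers would need well under 100, which is why larger SCSI reads make streaming easier to sustain.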

 I get the same issue on both drives with different tapes from the same source.
 
 I am using the default st module options and not doing anything other than
 using 'mt setblk 0'.
 
 Is there anything I can do to get a decent, sustained read rate from these
 tapes?
 
If your system can process the data fast enough and the tape blocks are of 
the same size, you can try using fixed block mode and buffered transfers. 
This allows the driver to use larger SCSI reads. You have to load the st 
module with parameter buffer_kbs=xxx, where xxx is, for instance, 1024 (1 
MB buffer). You also have to disable direct i/o with the module parameter
try_direct_io=0 (otherwise buffered transfers are disabled).

-- 
Kai


[PATCH] [SCSI] iscsi: return data transfer residual for data-out commands

2007-11-07 Thread Tony Battersby
Currently, the iSCSI driver returns the data transfer residual for
data-in commands (e.g. read) but not data-out commands (e.g. write).
This patch makes it return the data transfer residual for both types of
commands.

Signed-off-by: Tony Battersby [EMAIL PROTECTED]
---
--- linux-2.6.24-rc2/drivers/scsi/libiscsi.c.orig	2007-11-07 12:52:20.0 -0500
+++ linux-2.6.24-rc2/drivers/scsi/libiscsi.c	2007-11-07 12:52:27.0 -0500
@@ -291,9 +291,6 @@ invalid_datalen:
 			       min_t(uint16_t, senselen, SCSI_SENSE_BUFFERSIZE));
 	}
 
-	if (sc->sc_data_direction == DMA_TO_DEVICE)
-		goto out;
-
 	if (rhdr->flags & ISCSI_FLAG_CMD_UNDERFLOW) {
 		int res_count = be32_to_cpu(rhdr->residual_count);
 




Re: [PATCH] gdth: scp timeout clean up (try #2)

2007-11-07 Thread Christoph Hellwig
On Tue, Nov 06, 2007 at 02:34:53PM -0800, [EMAIL PROTECTED] wrote:
 gdth driver is modified NOT to use scp->eh_timeout. Now, it has
 eh_timed_out (gdth_timed_out) to handle command timeouts for locked
 I/O's. Have not tested as I don't have needed hardware! Patch is
 against 2.6.23-mm1.

Looks good to me except for some tiny whitespace damage in the
host template.

It would be really useful if we could get some gdth devices for testing.



LTO-3 read performance issues

2007-11-07 Thread James Pearson
I have two LTO-3 (QUANTUM ULTRIUM 3) drives attached to a dual Adaptec 
U160 controller (one per SCSI host) on a Dell PE2850 running a RHEL4 
based kernel (2.6.9 based).


I'm trying to read (with tar) LTO-3 tapes written on another system 
(possibly an SGI IRIX box), but I'm getting extremely variable read 
rates - from a few Kb/s to tens of Mb/s - while reading the same tape


After a bit of trial and error, it looks like the tapes have been 
written in variable block mode with a block size of 16Kb


To list the tapes, I need to set the block size to 0 (mt setblk 0) and run:

tar tvfb $TAPE 32

Running strace on the tar process shows that it does a number of 
read()'s then 'sticks' on a read() for a number of seconds, and then 
does a burst of read()'s - the number of reads it does in these bursts 
and the time it waits on a particular read vary.


My guess is that this has something to do with the drive having to reposition the 
tape between reads, which breaks the tape streaming ...


I get the same issue on both drives with different tapes from the same 
source.


I am using the default st module options and not doing anything other 
than using 'mt setblk 0'.


Is there anything I can do to get a decent, sustained read rate from 
these tapes?


Thanks

James Pearson



Re: usage of max_sectors in scsi_host_template

2007-11-07 Thread Erez Zilber

Stefan Richter wrote:


Erez Zilber wrote:
 I'm not sure that I understand the meaning of max_sectors in
 scsi_host_template.

Did you have a look at scsi_mid_low_api.txt?
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob;f=Documentation/scsi/scsi_mid_low_api.txt;h=6f70f2b9327e1f0db7bc05bdbf2d6ce3b2fcbdcf#l1232



I will go over it. Thanks for the link.


 Is it the maximum data length of a single SCSI command?

Yes.

 Is it in bytes?

No, it is in units of 512 bytes.

 What's the size of a sector?

Usually 512 bytes according to above doc.  Always 512 bytes from the
point of view of block/ll_rw_blk.c::blk_queue_max_sectors().
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob;f=block/ll_rw_blk.c;h=75c98d58f4ddf7252e2717e0924b9d6a8925b4e5#l590



So, ll_rw_blk actually uses the max_sectors value to chop requests 
larger than max_sectors. Am I right? If yes, I have a problem:


I'm running sgp_dd (on RHAS 4 up4 - kernel version is 2.6.9), so it 
calls scsi-ml directly (without going through ll_rw_blk). I ran it with 
the following parameters:


sgp_dd bs=512 of=/dev/null if=/dev/sg1 bpt=2048 thr=4 time=1 count=100k 
deb=9


I see that a single 1MB command is generated. Here's the debug info from 
sgp_dd:


sgp_dd: if=/dev/sg1 skip=0 of=/dev/null seek=0 count=102400
Start of loop, count=102400, in_num_sect=0, out_num_sect=0
Starting worker thread k=0
sg_start_io: SCSI READ, blk=0 num_blks=2048
Read (10) [28 00 00 00 00 00 00 08 00 00 ]
dir=-3, len=1048576, dxfrp=0x2a9558a000, cmd_len=10

Now, the low-level driver below scsi-ml is open-iscsi over iSER. 
max_sectors is set to 1024 (i.e. 512 kB). Still, the iSER driver 
receives a 1MB command. I guess that the max_sectors value is never 
used. Am I right?


Thanks,
Erez


Re: [PATCH] gdth: scp timeout clean up (try #2)

2007-11-07 Thread Jens Axboe
On Tue, Nov 06 2007, [EMAIL PROTECTED] wrote:
 gdth driver is modified NOT to use scp->eh_timeout. Now, it has
 eh_timed_out (gdth_timed_out) to handle command timeouts for locked
 I/O's. Have not tested as I don't have needed hardware! Patch is
 against 2.6.23-mm1.

I updated the timeout patch to current kernel and fixed some fallout. I
included your gdth patch.

Completely untested, patch is below. It's also in the #timeout branch of
the block git repo, to keep track of it.

diff --git a/block/ll_rw_blk.c b/block/ll_rw_blk.c
index 75c98d5..79ed268 100644
--- a/block/ll_rw_blk.c
+++ b/block/ll_rw_blk.c
@@ -46,6 +46,7 @@ static struct io_context *current_io_context(gfp_t gfp_flags, int node);
 static void blk_recalc_rq_segments(struct request *rq);
 static void blk_rq_bio_prep(struct request_queue *q, struct request *rq,
 			    struct bio *bio);
+static void blk_rq_timed_out_timer(unsigned long data);
 
 /*
  * For the allocated request tables
@@ -181,6 +182,18 @@ void blk_queue_softirq_done(struct request_queue *q, softirq_done_fn *fn)
 
 EXPORT_SYMBOL(blk_queue_softirq_done);
 
+void blk_queue_rq_timeout(struct request_queue *q, unsigned int timeout)
+{
+	q->rq_timeout = timeout;
+}
+EXPORT_SYMBOL_GPL(blk_queue_rq_timeout);
+
+void blk_queue_rq_timed_out(struct request_queue *q, rq_timed_out_fn *fn)
+{
+	q->rq_timed_out_fn = fn;
+}
+EXPORT_SYMBOL_GPL(blk_queue_rq_timed_out);
+
 /**
  * blk_queue_make_request - define an alternate make_request function for a device
  * @q:  the request queue for the device to be affected
@@ -243,7 +256,9 @@ static void rq_init(struct request_queue *q, struct request *rq)
 {
 	INIT_LIST_HEAD(&rq->queuelist);
 	INIT_LIST_HEAD(&rq->donelist);
+	INIT_LIST_HEAD(&rq->timeout_list);
 
+	rq->timeout = 0;
 	rq->errors = 0;
 	rq->bio = rq->biotail = NULL;
 	INIT_HLIST_NODE(&rq->hash);
@@ -1868,6 +1883,8 @@ struct request_queue *blk_alloc_queue_node(gfp_t gfp_mask, int node_id)
 	}
 
 	init_timer(&q->unplug_timer);
+	setup_timer(&q->timeout, blk_rq_timed_out_timer, (unsigned long) q);
+	INIT_LIST_HEAD(&q->timeout_list);
 
 	kobject_set_name(&q->kobj, "%s", "queue");
 	q->kobj.ktype = &queue_ktype;
@@ -2285,6 +2302,7 @@ EXPORT_SYMBOL(blk_start_queueing);
  */
 void blk_requeue_request(struct request_queue *q, struct request *rq)
 {
+   blk_delete_timer(rq);
blk_add_trace_rq(q, rq, BLK_TA_REQUEUE);
 
if (blk_rq_tagged(rq))
@@ -3622,12 +3640,145 @@ static int __cpuinit blk_cpu_notify(struct notifier_block *self, unsigned long a
return NOTIFY_OK;
 }
 
-
 static struct notifier_block blk_cpu_notifier __cpuinitdata = {
.notifier_call  = blk_cpu_notify,
 };
 
 /**
+ * blk_delete_timer - Delete/cancel timer for a given function.
+ * @req:   request that we are canceling timer for
+ *
+ * Return value:
+ * 1 if we were able to detach the timer.  0 if we blew it, and the
+ * timer function has already started to run. Caller must hold queue lock.
+ */
+int blk_delete_timer(struct request *req)
+{
+	struct request_queue *q = req->q;
+
+	/*
+	 * Nothing to detach
+	 */
+	if (!q->rq_timed_out_fn)
+		return 1;
+
+	/*
+	 * Not on the list, must have already been scheduled (or never added)
+	 */
+	if (list_empty(&req->timeout_list))
+		return 0;
+
+	list_del_init(&req->timeout_list);
+
+	if (list_empty(&q->timeout_list))
+		del_timer(&q->timeout);
+
+	return 1;
+}
+EXPORT_SYMBOL_GPL(blk_delete_timer);
+
+static void blk_rq_timed_out(struct request *req)
+{
+	struct request_queue *q = req->q;
+	enum blk_eh_timer_return ret;
+
+	ret = q->rq_timed_out_fn(req);
+	switch (ret) {
+	case BLK_EH_HANDLED:
+		__blk_complete_request(req);
+		break;
+	case BLK_EH_RESET_TIMER:
+		blk_add_timer(req);
+		break;
+	case BLK_EH_NOT_HANDLED:
+		/*
+		 * LLD handles this for now but in the future
+		 * we can send a request msg to abort the command
+		 * and we can move more of the generic scsi eh code to
+		 * the blk layer.
+		 */
+		break;
+	default:
+		printk(KERN_ERR "block: bad eh return: %d\n", ret);
+		break;
+	}
+}
+
+static void blk_rq_timed_out_timer(unsigned long data)
+{
+	struct request_queue *q = (struct request_queue *) data;
+	unsigned long flags, next = 0;
+	struct request *rq, *tmp;
+
+	spin_lock_irqsave(q->queue_lock, flags);
+
+	list_for_each_entry_safe(rq, tmp, &q->timeout_list, timeout_list) {
+		if (!next || time_before(next, rq->timeout))
+			next = rq->timeout;
+		if (time_after_eq(jiffies, rq->timeout)) {
+			list_del_init(&rq->timeout_list);
+  

Re: [PATCH] [SCSI] iscsi: return data transfer residual for data-out commands

2007-11-07 Thread Boaz Harrosh
On Wed, Nov 07 2007 at 20:06 +0200, Tony Battersby [EMAIL PROTECTED] wrote:
 Currently, the iSCSI driver returns the data transfer residual for
 data-in commands (e.g. read) but not data-out commands (e.g. write).
 This patch makes it return the data transfer residual for both types of
 commands.
All types of commands, also good for BIDI ;)

 
 Signed-off-by: Tony Battersby [EMAIL PROTECTED]
 ---
 --- linux-2.6.24-rc2/drivers/scsi/libiscsi.c.orig	2007-11-07 12:52:20.0 -0500
 +++ linux-2.6.24-rc2/drivers/scsi/libiscsi.c	2007-11-07 12:52:27.0 -0500
 @@ -291,9 +291,6 @@ invalid_datalen:
  			       min_t(uint16_t, senselen, SCSI_SENSE_BUFFERSIZE));
  	}
  
 -	if (sc->sc_data_direction == DMA_TO_DEVICE)
 -		goto out;
 -
  	if (rhdr->flags & ISCSI_FLAG_CMD_UNDERFLOW) {
  		int res_count = be32_to_cpu(rhdr->residual_count);
  
 
 
Thanks, this looks right to me. (And good catch.)
I have gone through the code and it looks like the right thing to do.
svn blame attributes this check to the patch (r527) in which libiscsi was 
split out of iscsi_tcp, so perhaps it made sense then, but it no longer does.
It is also needed for the bidi patches, as bidi commands currently
have sc_data_direction == DMA_TO_DEVICE.
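
For readers following along, the behaviour after the patch can be modelled in user space roughly as below. This is only a sketch: the struct, the function name and the flag value are stand-ins invented for illustration, and the range check on the residual is an assumption of the sketch, not something shown in the hunk above.

```c
#include <stdint.h>

/* Illustrative stand-in; not the kernel's definition. */
#define FLAG_UNDERFLOW 0x02u	/* plays the role of ISCSI_FLAG_CMD_UNDERFLOW */

struct cmd_model {
	uint32_t bufflen;	/* bytes the initiator asked to transfer */
	uint32_t resid;		/* bytes reported as not transferred */
};

/*
 * Record the residual from a response header regardless of whether the
 * command was data-in or data-out (there is no early exit for writes).
 * Returns 0 on success, -1 if the reported residual is zero or larger
 * than the request (assumed sanity check).
 */
static int record_residual(struct cmd_model *cmd, uint8_t flags,
			   uint32_t residual_count)
{
	if (!(flags & FLAG_UNDERFLOW))
		return 0;		/* full transfer, nothing to record */
	if (residual_count == 0 || residual_count > cmd->bufflen)
		return -1;		/* implausible residual */
	cmd->resid = residual_count;
	return 0;
}
```

The point is simply that a short write now reports its residual the same way a short read does.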

Please accept this patch.
Boaz



RE: [PATCH 1/1] aacraid: don't assign cpu_to_le32(int) to u8

2007-11-07 Thread Salyzyn, Mark
Christoph Hellwig [mailto:[EMAIL PROTECTED] sez:
 Did anyone run the driver through sparse to see if we have 
 more issues like this?

There are some warnings from sparse, none like this one. I will deal
with the warnings ...

Sincerely -- Mark Salyzyn



Re: [PATCH 3/8] scsi: megaraid_sas - add module param max_sectors, cmd_per_lun

2007-11-07 Thread Christoph Hellwig
On Tue, Nov 06, 2007 at 12:06:39PM -0700, Yang, Bo wrote:
 The fast_load parameter is for the user to decide at driver load time if
 (s)he wants to skip scan of devices in PD channels.
 After driver is loaded the user cannot be permitted to modify this
 value. If the user needs to see the devices in the PD channels, (s)he
 may initiate a scan via sysfs/proc based on the kernel being used. Once
 the user has done the scan, the fast_load value does not have any
 significance and thus not exposed for reading.

The issue here is that this should really be a per-HBA setting, and
as HBAs can appear at any time due to PCI hotplug, a module parameter is not
enough.  Then again, I still don't see why we need to spend so much effort
on this, as you could trivially just fail the PD scanning commands in the
firmware without messing up the driver.

 cmd_per_lun & max_sectors are also intended to be provided by user only
 at driver load time. In the current implementation both these do appear
 as read-only values under host# in sysfs. The current design is not to
 allow these values to be modified on the fly. 

Same argument here about being per-HBA.  And they really should be changeable
at runtime, at least for HBAs that don't have commands in flight.



Re: [PATCH] [SCSI] sym53c8xx: fix free_irq() regression in 2.6.24

2007-11-07 Thread Christoph Hellwig
On Tue, Nov 06, 2007 at 02:40:54PM -0500, Tony Battersby wrote:
 The following commit changed the pointer passed to request_irq(), but
 failed to change the pointer passed to free_irq():

Looks good.


Re: PATCH [2/8] scsi: megaraid_sas - add module param fast_load

2007-11-07 Thread Christoph Hellwig
On Tue, Nov 06, 2007 at 12:04:31PM -0700, Yang, Bo wrote:
 I see that scsi_scan_host_selected is in scsi_priv.h and currently is
 not used by any other driver. The scsi_priv.h is not part of the include
 dir (/include/scsi). One of the major Linux distro's don't even include
 this file in /usr/src/kernels. Also it looks like at this time this
 function may not be available (not exported?) for driver modules to use.
 Even if I include scsi_priv.h I get unknown symbol for this function
 while loading.

Yes, it would have to be exported and moved to a public header.

 May be in the long run we can solve all these issues to call
 scsi_scan_host_selected. However, the current implementation works fine
 and has been tested by LSI and others. This implementation doesn't break
 any protocol nor does it adversely affect any driver functionality.

Your implementation adds state to scanning that could easily break, and it
makes the driver complex for things that don't belong in a driver,
so there's a clear NACK for this from me.


[PATCH 1/1] aacraid: don't assign cpu_to_le32(int) to u8

2007-11-07 Thread Salyzyn, Mark
Good point, thanks. The management applications use this AIF report
to observe the LSB of the integer value in BlinkLED. Applying
cpu_to_le32() actually breaks this and reports the wrong content on
byte-swapped (big-endian) architectures.
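
To make the failure mode concrete, here is a small user-space model. It assumes data[] is an array of u8, as the subject line implies; the function names are invented for this sketch, and swab32_model() stands in for what cpu_to_le32() does on a big-endian CPU.

```c
#include <stdint.h>

/* Byte-swap a 32-bit value, as cpu_to_le32() does on a big-endian CPU. */
static uint32_t swab32_model(uint32_t x)
{
	return ((x & 0x000000ffu) << 24) |
	       ((x & 0x0000ff00u) << 8)  |
	       ((x & 0x00ff0000u) >> 8)  |
	       ((x & 0xff000000u) >> 24);
}

/* Old code on big-endian: swap first, then truncate to one byte. */
static uint8_t store_swapped(uint32_t blink_led)
{
	return (uint8_t)swab32_model(blink_led);
}

/* Patched code: plain truncation keeps the least significant byte. */
static uint8_t store_plain(uint32_t blink_led)
{
	return (uint8_t)blink_led;
}
```

For a BlinkLED value that fits in one byte, the swapped variant always stores 0, whereas the plain assignment stores the LSB the management applications expect.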

This attached follow-up patch is against current scsi-misc-2.6 *after*
the application of the 'don't assign cpu_to_le32(constant) to u8' patch
submitted by Stephen Rothwell which has already been taken by the -mm
tree. Inspection of other areas of the aacraid driver came up blank for
similar style bugs.

ObligatoryDisclaimer: Please accept my condolences regarding Outlook's
handling of patch attachments (inline gets damaged, use attachment).

Signed-off-by: Mark Salyzyn [EMAIL PROTECTED]

 drivers/scsi/aacraid/commsup.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff -ru a/drivers/scsi/aacraid/commsup.c b/drivers/scsi/aacraid/commsup.c
--- a/drivers/scsi/aacraid/commsup.c	2007-11-07 10:35:16.603727464 -0500
+++ b/drivers/scsi/aacraid/commsup.c	2007-11-07 10:37:50.540311107 -0500
@@ -1342,7 +1342,7 @@
 		aif->data[0] = AifEnExpEvent;
 		aif->data[1] = AifExeFirmwarePanic;
 		aif->data[2] = AifHighPriority;
-		aif->data[3] = cpu_to_le32(BlinkLED);
+		aif->data[3] = BlinkLED;

/*
 * Put the FIB onto the

Sincerely -- Mark Salyzyn

 -Original Message-
 From: Andreas Schwab [mailto:[EMAIL PROTECTED] 
 Sent: Thursday, November 01, 2007 9:31 AM
 To: Stephen Rothwell
 Cc: AACRAID; linux-scsi@vger.kernel.org; LKML
 Subject: Re: [PATCHv2] aacraid: don't assign 
 cpu_to_le32(constant) to u8
 
 Stephen Rothwell [EMAIL PROTECTED] writes:
 
  diff --git a/drivers/scsi/aacraid/commsup.c b/drivers/scsi/aacraid/commsup.c
  index 240a0bb..3c2dbc0 100644
  --- a/drivers/scsi/aacraid/commsup.c
  +++ b/drivers/scsi/aacraid/commsup.c
  @@ -1339,9 +1339,9 @@ int aac_check_health(struct aac_dev * aac)
   		aif = (struct aac_aifcmd *)hw_fib->data;
   		aif->command = cpu_to_le32(AifCmdEventNotify);
   		aif->seqnum = cpu_to_le32(0x);
  -		aif->data[0] = cpu_to_le32(AifEnExpEvent);
  -		aif->data[1] = cpu_to_le32(AifExeFirmwarePanic);
  -		aif->data[2] = cpu_to_le32(AifHighPriority);
  +		aif->data[0] = AifEnExpEvent;
  +		aif->data[1] = AifExeFirmwarePanic;
  +		aif->data[2] = AifHighPriority;
   		aif->data[3] = cpu_to_le32(BlinkLED);
 
 What about the last line?
 
 Andreas.


aacraid_BlinkLED.patch
Description: aacraid_BlinkLED.patch


Re: usage of max_sectors in scsi_host_template

2007-11-07 Thread Mike Christie

Erez Zilber wrote:

Stefan Richter wrote:

Erez Zilber wrote:

I'm not sure that I understand the meaning of max_sectors in
scsi_host_template.

Did you have a look at scsi_mid_low_api.txt?
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob;f=Documentation/scsi/scsi_mid_low_api.txt;h=6f70f2b9327e1f0db7bc05bdbf2d6ce3b2fcbdcf#l1232



I will go over it. Thanks for the link.


Is it the maximum data length of a single SCSI command?

Yes.


Is it in bytes?

No, it is in units of 512 bytes.


What's the size of a sector?

Usually 512 bytes according to above doc.  Always 512 bytes from the
point of view of block/ll_rw_blk.c::blk_queue_max_sectors().
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob;f=block/ll_rw_blk.c;h=75c98d58f4ddf7252e2717e0924b9d6a8925b4e5#l590



So, ll_rw_blk actually uses the max_sectors value to chop requests 


Well, there is q->max_sectors and q->max_hw_sectors. In current kernels 
q->max_hw_sectors is scsi_host_template->max_sectors. And q->max_sectors 
is sort of a block layer limit to make sure commands do not get too big.




larger than max_sectors. Am I right? If yes, I have a problem:

I'm running sgp_dd (on RHAS 4 up4 - kernel version is 2.6.9), so it 
calls scsi-ml directly (without going through ll_rw_blk). I ran it with 
the following parameters:


RHEL4's sg.c does not take into account q->max_sectors or q->max_hw_sectors.

In later kernels like in RHEL5 (probably upstream 2.6.16+), sg.c and 
st.c go through ll_rw_blk.c and obey the sector limit. For pass-through 
like sg and block layer sg, the SCSI command is limited by 
q->max_hw_sectors, which as I said above is 
scsi_host_template->max_sectors. Normal FS commands are limited by 
min(q->max_hw_sectors, q->max_sectors).
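
Put as code, the rule for those later kernels looks roughly like this. It is a user-space sketch only: the struct and function names are made up, and the 512-byte unit is the one discussed earlier in the thread.

```c
#include <stdbool.h>
#include <stdint.h>

#define SECTOR_BYTES 512u	/* max_sectors counts 512-byte units */

/* Simplified stand-in for the two per-queue limits described above. */
struct queue_limits_model {
	uint32_t max_sectors;		/* block layer soft limit */
	uint32_t max_hw_sectors;	/* from scsi_host_template->max_sectors */
};

/* Largest command, in bytes, for a pass-through or a filesystem request. */
static uint64_t max_command_bytes(const struct queue_limits_model *q,
				  bool passthrough)
{
	uint32_t sectors = q->max_hw_sectors;

	/* filesystem I/O is additionally capped by the soft limit */
	if (!passthrough && q->max_sectors < sectors)
		sectors = q->max_sectors;
	return (uint64_t)sectors * SECTOR_BYTES;
}
```

With max_hw_sectors = 1024 the pass-through limit works out to 512 kB, so a 1 MB READ(10) like the one sgp_dd issued would exceed it on a kernel that enforces the limit.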



Re: [PATCH -mm 0/3] convert IOMMUs to use iova

2007-11-07 Thread FUJITA Tomonori
On Fri, 2 Nov 2007 19:12:27 +0200
Muli Ben-Yehuda [EMAIL PROTECTED] wrote:

 On Sat, Nov 03, 2007 at 02:05:39AM +0900, FUJITA Tomonori wrote:
 
  This patchset convert the PPC64 IOMMU to use the iova code for free
  area management.
  
  The IOMMUs ignores low level drivers' restrictions, the maximum
  segment size and segment boundary.
  
  I fixed the former:
  
  http://thread.gmane.org/gmane.linux.scsi/35602
  
  The latter makes the free area management complicated. I'd like to
  convert IOMMUs to use the iova code (that intel-iommu introduced)
  for free area management and enable iova to handle segment boundary
  restrictions, rather than fixing all the IOMMUs' free area
  management,
 
 In general it sounds like a great idea, but have you looked at what
 impact this has on the performance of the IO path?

I converted swiotlb to use iova and compared it with the original
algorithm (which is better than the simple bitmap one that most of the
IOMMUs use, I think).

I used the 'swiotlb=force' boot option and ran netperf with -m 128 -D
(which leads to tons of dma_map_single calls).

The original produced 281.8 Mb/s and the iova version produced 77.2 Mb/s.

It seems that it would be better to generalize the swiotlb algorithm
(at least for small I/O areas)? Or my patch might have bugs.

Here's the patch to convert swiotlb to use iova against my iova
patchset:

http://marc.info/?l=linux-kernel&m=119402340801254&w=2

diff --git a/arch/x86/Kconfig.x86_64 b/arch/x86/Kconfig.x86_64
index c10d3f0..8735822 100644
--- a/arch/x86/Kconfig.x86_64
+++ b/arch/x86/Kconfig.x86_64
@@ -524,6 +524,7 @@ config CALGARY_IOMMU_ENABLED_BY_DEFAULT
 # need this always selected by IOMMU for the VIA workaround
 config SWIOTLB
bool
+   select IOVA
help
  Support for software bounce buffers used on x86-64 systems
  which don't have a hardware IOMMU (e.g. the current generation
diff --git a/arch/x86/kernel/pci-dma_64.c b/arch/x86/kernel/pci-dma_64.c
index aa805b1..8507402 100644
--- a/arch/x86/kernel/pci-dma_64.c
+++ b/arch/x86/kernel/pci-dma_64.c
@@ -325,6 +325,9 @@ static int __init pci_iommu_init(void)
gart_iommu_init();
 #endif
 
+#ifdef CONFIG_SWIOTLB
+   swiotlb_alloc();
+#endif
no_iommu_init();
return 0;
 }
diff --git a/include/asm-x86/swiotlb.h b/include/asm-x86/swiotlb.h
index f9c5895..f00d20c 100644
--- a/include/asm-x86/swiotlb.h
+++ b/include/asm-x86/swiotlb.h
@@ -40,6 +40,7 @@ extern void swiotlb_free_coherent (struct device *hwdev, size_t size,
 				   void *vaddr, dma_addr_t dma_handle);
 extern int swiotlb_dma_supported(struct device *hwdev, u64 mask);
 extern void swiotlb_init(void);
+extern void swiotlb_alloc(void);
 
 extern int swiotlb_force;
 
diff --git a/lib/swiotlb.c b/lib/swiotlb.c
index 1a8050a..54ecb87 100644
--- a/lib/swiotlb.c
+++ b/lib/swiotlb.c
@@ -24,6 +24,7 @@
 #include <linux/string.h>
 #include <linux/types.h>
 #include <linux/ctype.h>
+#include <linux/iova.h>
 
 #include <asm/io.h>
 #include <asm/dma.h>
@@ -103,10 +104,7 @@ static unsigned int io_tlb_index;
  */
 static unsigned char **io_tlb_orig_addr;
 
-/*
- * Protect the above data structures in the map and unmap calls
- */
-static DEFINE_SPINLOCK(io_tlb_lock);
+static struct iova_domain swiotlb_iovad;
 
 static int __init
 setup_io_tlb_npages(char *str)
@@ -272,6 +270,19 @@ cleanup1:
return -ENOMEM;
 }
 
+static struct kmem_cache *iova_cachep;
+
+void __init
+swiotlb_alloc(void)
+{
+   if (!swiotlb)
+   return;
+
+   iova_cachep = KMEM_CACHE(iova, 0);
+	init_iova_domain(&swiotlb_iovad, DMA_32BIT_MASK >> IO_TLB_SHIFT,
+			 iova_cachep);
+}
+
 static int
 address_needs_mapping(struct device *hwdev, dma_addr_t addr)
 {
@@ -288,70 +299,20 @@ address_needs_mapping(struct device *hwdev, dma_addr_t addr)
 static void *
 map_single(struct device *hwdev, char *buffer, size_t size, int dir)
 {
-	unsigned long flags;
 	char *dma_addr;
-	unsigned int nslots, stride, index, wrap;
+	unsigned int nslots, index;
 	int i;
+	struct iova *iova;
 
-	/*
-	 * For mappings greater than a page, we limit the stride (and
-	 * hence alignment) to a page size.
-	 */
 	nslots = ALIGN(size, 1 << IO_TLB_SHIFT) >> IO_TLB_SHIFT;
-	if (size > PAGE_SIZE)
-		stride = (1 << (PAGE_SHIFT - IO_TLB_SHIFT));
-	else
-		stride = 1;
-
 	BUG_ON(!nslots);
 
-	/*
-	 * Find suitable number of IO TLB entries size that will fit this
-	 * request and allocate a buffer from that IO TLB pool.
-	 */
-	spin_lock_irqsave(&io_tlb_lock, flags);
-	{
-		wrap = index = ALIGN(io_tlb_index, stride);
-
-		if (index >= io_tlb_nslabs)
-   wrap = index = 0;
-
-   do {
-   /*
-* If we find a slot that indicates we have 'nslots'
-* number of contiguous 

[PATCH] [SCSI] sym53c8xx: don't flood syslog with negotiation messages

2007-11-07 Thread Tony Battersby
sym53c8xx prints a negotiation message after every check condition.
This can add up to a lot of messages for removable-medium devices
(CD-ROM, tape drives, etc.) that are being polled, since they return
check condition when no medium is present.  This patch suppresses the
negotiation message if it would be the same as the last one printed.

Signed-off-by: Tony Battersby [EMAIL PROTECTED]
---
diff -urpN linux-2.6.24-rc2/drivers/scsi/sym53c8xx_2/sym_hipd.c linux-2.6.24-rc2-sym2/drivers/scsi/sym53c8xx_2/sym_hipd.c
--- linux-2.6.24-rc2/drivers/scsi/sym53c8xx_2/sym_hipd.c	2007-11-07 15:47:58.0 -0500
+++ linux-2.6.24-rc2-sym2/drivers/scsi/sym53c8xx_2/sym_hipd.c	2007-11-07 15:49:03.0 -0500
@@ -2040,6 +2040,29 @@ static void sym_settrans(struct sym_hcb 
 	}
 }
 
+static void sym_announce_transfer_rate(struct sym_tcb *tp)
+{
+	struct scsi_target *starget = tp->starget;
+
+	if (tp->tprint.period != spi_period(starget) ||
+	    tp->tprint.offset != spi_offset(starget) ||
+	    tp->tprint.width != spi_width(starget) ||
+	    tp->tprint.iu != spi_iu(starget) ||
+	    tp->tprint.dt != spi_dt(starget) ||
+	    tp->tprint.qas != spi_qas(starget) ||
+	    !tp->tprint.check_nego) {
+		tp->tprint.period = spi_period(starget);
+		tp->tprint.offset = spi_offset(starget);
+		tp->tprint.width = spi_width(starget);
+		tp->tprint.iu = spi_iu(starget);
+		tp->tprint.dt = spi_dt(starget);
+		tp->tprint.qas = spi_qas(starget);
+		tp->tprint.check_nego = 1;
+
+		spi_display_xfer_agreement(starget);
+	}
+}
+
 /*
  *  We received a WDTR.
  *  Let everything be aware of the changes.
@@ -2063,7 +2086,7 @@ static void sym_setwide(struct sym_hcb *
 	spi_qas(starget) = 0;
 
 	if (sym_verbose >= 3)
-		spi_display_xfer_agreement(starget);
+		sym_announce_transfer_rate(tp);
 }
 
 /*
@@ -2090,7 +2113,7 @@ sym_setsync(struct sym_hcb *np, int targ
 		tp->tgoal.check_nego = 0;
 	}
 
-	spi_display_xfer_agreement(starget);
+	sym_announce_transfer_rate(tp);
 }
 
 /*
@@ -2114,7 +2137,7 @@ sym_setpprot(struct sym_hcb *np, int tar
 	spi_qas(starget) = tp->tgoal.qas = !!(opts & PPR_OPT_QAS);
 	tp->tgoal.check_nego = 0;
 
-	spi_display_xfer_agreement(starget);
+	sym_announce_transfer_rate(tp);
 }
 
 /*
diff -urpN linux-2.6.24-rc2/drivers/scsi/sym53c8xx_2/sym_hipd.h linux-2.6.24-rc2-sym2/drivers/scsi/sym53c8xx_2/sym_hipd.h
--- linux-2.6.24-rc2/drivers/scsi/sym53c8xx_2/sym_hipd.h	2007-11-07 15:47:58.0 -0500
+++ linux-2.6.24-rc2-sym2/drivers/scsi/sym53c8xx_2/sym_hipd.h	2007-11-07 15:49:03.0 -0500
@@ -419,6 +419,9 @@ struct sym_tcb {
/* Transfer goal */
struct sym_trans tgoal;
 
+   /* Last printed transfer speed */
+   struct sym_trans tprint;
+
/*
 * Keep track of the CCB used for the negotiation in order
 * to ensure that only 1 negotiation is queued at a time.




Re: [PATCH] SCSI: Use is_power_of_2() macro for simplicity.

2007-11-07 Thread Mike Christie

Robert P. J. Day wrote:

Signed-off-by: Robert P. J. Day [EMAIL PROTECTED]

---


Thanks for the iscsi bits. We have the libiscsi part from 
[EMAIL PROTECTED] queued for 2.6.25. The iscsi_tcp parts you 
are patching over are broken, and I am going to fix that for 2.6.25. We 
only support 1 r2t right now, so we cannot hit the bug: if userspace 
negotiated X r2ts, we have to use X r2ts and cannot round up.
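
For reference, the test that is_power_of_2() performs can be written in user space as below (the _model suffix is only there to mark it as a stand-alone copy for illustration):

```c
#include <stdbool.h>

/* User-space equivalent of the test the kernel's is_power_of_2() performs. */
static bool is_power_of_2_model(unsigned long n)
{
	/* a power of two has exactly one bit set; zero has none */
	return n != 0 && (n & (n - 1)) == 0;
}
```

A negotiated value such as 48 fails this test, which is the case where rounding up would change what was negotiated.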



[PATCH] [SCSI] sym53c8xx: fix setflag user command to control disconnects

2007-11-07 Thread Tony Battersby
This patch fixes the sym53c8xx setflag user command to control
disconnect privilege, which has been broken for a long time.

Signed-off-by: Tony Battersby [EMAIL PROTECTED]
---

NOTE regarding the following change:

	can_disconnect = (cp->tag != NO_TAG) ||
-		(lp && (lp->curr_flags & SYM_DISC_ENABLED));
+		(lp && (tp->usrflags & SYM_DISC_ENABLED));

In 2.4 kernels, lp == NULL when scanning for devices, and allowing
disconnect when lp == NULL would confuse the code that handles
reselection.  So with 2.4 kernels, the check for lp != NULL had to be
left in even if lp wasn't being dereferenced.  In 2.6 kernels,
lp != NULL when scanning for devices.  In fact, I didn't encounter any
cases where lp == NULL during my testing with 2.6 kernels, so the check
for lp != NULL may be superfluous now.  However, the same check is
performed in other places in the same function, so I left it in to be
safe.
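
As a rough userspace sketch of the behaviour described above (not the driver code itself): the structs below are hypothetical, pared-down stand-ins for the driver's tcb/lcb/ccb, and the flag and `NO_TAG` values are illustrative assumptions. It shows the setflag command replacing only the disconnect bit in the per-target flags, and the disconnect decision reading that bit directly.

```c
#include <stdbool.h>
#include <stddef.h>

/* Illustrative values; treat them as assumptions, not driver facts. */
#define SYM_DISC_ENABLED  (1u << 0)
#define SYM_TAGS_ENABLED  (1u << 1)
#define NO_TAG            256

/* Hypothetical, reduced stand-ins for the driver structures. */
struct tcb { unsigned char usrflags; };
struct lcb { int unused; };
struct ccb { int tag; };

/* Models UC_SETFLAG after the patch: only the disconnect bit changes. */
void set_disc_flag(struct tcb *tp, unsigned long data)
{
	tp->usrflags = (unsigned char)
		((tp->usrflags & ~SYM_DISC_ENABLED) | data);
}

/* Tagged commands may always disconnect; untagged ones need the flag. */
bool can_disconnect(const struct tcb *tp, const struct lcb *lp,
		    const struct ccb *cp)
{
	return (cp->tag != NO_TAG) ||
	       (lp != NULL && (tp->usrflags & SYM_DISC_ENABLED));
}
```

Calling `set_disc_flag(&tp, 0)` models a setflag with `no_disc`; calling it with `SYM_DISC_ENABLED` models a setflag with no keyword, which turns disconnects back on.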

diff -urpN linux-2.6.24-rc2/drivers/scsi/sym53c8xx_2/sym_glue.c 
linux-2.6.24-rc2-sym2/drivers/scsi/sym53c8xx_2/sym_glue.c
--- linux-2.6.24-rc2/drivers/scsi/sym53c8xx_2/sym_glue.c2007-11-07 
15:05:22.0 -0500
+++ linux-2.6.24-rc2-sym2/drivers/scsi/sym53c8xx_2/sym_glue.c   2007-11-07 
15:06:41.0 -0500
@@ -785,11 +785,6 @@ static int sym53c8xx_slave_configure(str
int reqtags, depth_to_use;
 
/*
-*  Get user flags.
-*/
-   lp->curr_flags = lp->user_flags;
-
-   /*
 *  Select queue depth from driver setup.
 *  Donnot use more than configured by user.
 *  Use at least 2.
@@ -937,7 +932,9 @@ static void sym_exec_user_command (struc
OUTB(np, nc_istat, SIGP|SEM);
break;
case UC_SETFLAG:
-   tp->usrflags = uc->data;
+   tp->usrflags =
+   (tp->usrflags & ~SYM_DISC_ENABLED) |
+   uc->data;
break;
}
}
@@ -1098,6 +1095,7 @@ printk("sym_user_command: data=%ld\n", u
break;
 #endif /* SYM_LINUX_DEBUG_CONTROL_SUPPORT */
case UC_SETFLAG:
+   uc->data = SYM_DISC_ENABLED;
while (len > 0) {
SKIP_SPACES(ptr, len);
if  ((arg_len = is_keyword(ptr, len, "no_disc")))
diff -urpN linux-2.6.24-rc2/drivers/scsi/sym53c8xx_2/sym_hipd.c 
linux-2.6.24-rc2-sym2/drivers/scsi/sym53c8xx_2/sym_hipd.c
--- linux-2.6.24-rc2/drivers/scsi/sym53c8xx_2/sym_hipd.c2007-11-07 
15:05:22.0 -0500
+++ linux-2.6.24-rc2-sym2/drivers/scsi/sym53c8xx_2/sym_hipd.c   2007-11-07 
15:06:41.0 -0500
@@ -4984,11 +4984,6 @@ struct sym_lcb *sym_alloc_lcb (struct sy
 */
lp->head.resel_sa = cpu_to_scr(SCRIPTB_BA(np, resel_bad_lun));
 
-   /*
-*  Set user capabilities.
-*/
-   lp->user_flags = tp->usrflags & (SYM_DISC_ENABLED | SYM_TAGS_ENABLED);
-
 #ifdef SYM_OPT_HANDLE_DEVICE_QUEUEING
/*
 *  Initialize device queueing.
@@ -5077,7 +5072,7 @@ int sym_queue_scsiio(struct sym_hcb *np,
lp = sym_lp(tp, sdev->lun);
 
can_disconnect = (cp->tag != NO_TAG) ||
-   (lp && (lp->curr_flags & SYM_DISC_ENABLED));
+   (lp && (tp->usrflags & SYM_DISC_ENABLED));
 
msgptr = cp->scsi_smsg;
msglen = 0;
diff -urpN linux-2.6.24-rc2/drivers/scsi/sym53c8xx_2/sym_hipd.h 
linux-2.6.24-rc2-sym2/drivers/scsi/sym53c8xx_2/sym_hipd.h
--- linux-2.6.24-rc2/drivers/scsi/sym53c8xx_2/sym_hipd.h2007-11-07 
15:05:22.0 -0500
+++ linux-2.6.24-rc2-sym2/drivers/scsi/sym53c8xx_2/sym_hipd.h   2007-11-07 
15:06:41.0 -0500
@@ -536,12 +536,6 @@ struct sym_lcb {
 *  Set when we want to clear all tasks.
 */
u_char to_clear;
-
-   /*
-*  Capabilities.
-*/
-   u_char  user_flags;
-   u_char  curr_flags;
 };
 
 /*




Re: [PATCH] [SCSI] iscsi: return data transfer residual for data-out commands

2007-11-07 Thread Mike Christie

Boaz Harrosh wrote:

On Wed, Nov 07 2007 at 20:06 +0200, Tony Battersby [EMAIL PROTECTED] wrote:

Currently, the iSCSI driver returns the data transfer residual for
data-in commands (e.g. read) but not data-out commands (e.g. write).
This patch makes it return the data transfer residual for both types of
commands.

All types of commands, also good for BIDI ;)


Signed-off-by: Tony Battersby [EMAIL PROTECTED]
---
--- linux-2.6.24-rc2/drivers/scsi/libiscsi.c.orig   2007-11-07 
12:52:20.0 -0500
+++ linux-2.6.24-rc2/drivers/scsi/libiscsi.c2007-11-07 12:52:27.0 
-0500
@@ -291,9 +291,6 @@ invalid_datalen:
   min_t(uint16_t, senselen, SCSI_SENSE_BUFFERSIZE));
}
 
-	if (sc->sc_data_direction == DMA_TO_DEVICE)
-   goto out;
-
if (rhdr->flags & ISCSI_FLAG_CMD_UNDERFLOW) {
int res_count = be32_to_cpu(rhdr->residual_count);
 




Thanks, this looks right to me. (And good catch)
I have gone through the code and it looks like the right thing to do.
svn blame attributes this code to the patch (r527) in which libiscsi 
was cut out of iscsi_tcp, so perhaps it made sense then.


Actually I think we always had it in one way or another. Maybe svn 
blame goofed because of the code moves.


I think it was originally done because we probably thought we could not 
get underflow or overflow on a write. I do not know for sure though, but 
it may have been a workaround for some old firmware on another target 
that was fixed a long time ago.
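
To make the effect concrete, here is a minimal userspace sketch of deriving the residual from a response header without looking at the data direction. The struct is a hypothetical, reduced view of the header, the flag values are as I recall them from iscsi_proto.h and should be treated as assumptions, and the range check on the underflow count is an assumption about the surrounding code that the quoted hunk does not show.

```c
#include <stdint.h>

/* Assumed flag values; verify against iscsi_proto.h before relying on them. */
#define ISCSI_FLAG_CMD_UNDERFLOW 0x02
#define ISCSI_FLAG_CMD_OVERFLOW  0x04

/* Hypothetical, reduced view of a SCSI response header. */
struct rsp_hdr {
	uint8_t flags;
	uint8_t residual_count[4];	/* big-endian on the wire */
};

/* Decode a 32-bit big-endian field (stands in for be32_to_cpu()). */
uint32_t be32_decode(const uint8_t b[4])
{
	return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
	       ((uint32_t)b[2] << 8) | (uint32_t)b[3];
}

/*
 * Return the residual for a command of bufflen bytes, or -1 when the
 * target reports an underflow count that cannot be right.  There is no
 * data-direction test, so reads and writes are handled identically.
 */
int64_t transfer_residual(const struct rsp_hdr *rhdr, uint32_t bufflen)
{
	uint32_t res_count = be32_decode(rhdr->residual_count);

	if (rhdr->flags & ISCSI_FLAG_CMD_UNDERFLOW) {
		if (res_count > 0 && res_count <= bufflen)
			return res_count;
		return -1;
	}
	if (rhdr->flags & ISCSI_FLAG_CMD_OVERFLOW)
		return res_count;
	return 0;
}
```

A 4096-byte write for which the target reports an underflow of 512 bytes would then come back with a residual of 512, the same as a read would.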



The patch looks ok. Thanks.

James please apply with his other patch

http://marc.info/?l=linux-scsi&m=119334187619290&w=2

for scsi-rc-fixes when you get a chance. Thanks.

Signed-off-by: Mike Christie [EMAIL PROTECTED]