Re: [Bug 9880] dma_free_coherent in arcmsr when calling areca tools

2008-02-04 Thread Russell King
On Sun, Feb 03, 2008 at 09:55:37PM -0600, James Bottomley wrote:
 I've cc'd the people responsible for this apparent bit of idiocy.  Since
 the API addition to dma_alloc_coherent() was the GFP flags so you could
 call it from interrupt context with GFP_ATOMIC if so desired,
 (pci_dma_alloc_consistent always has GFP_ATOMIC semantics), why on earth
 would the corresponding free routine require non-atomic semantics?

For the N-th time, when tearing down the MMU mappings which we need
to setup for coherent mappings, we need to invalidate the TLB.  On
SMP systems, that means doing an IPI to the other CPUs to tell them
to invalidate their TLBs as well.

Now, look at the restrictions on calling smp_call_function_on_cpu()
or equivalent function which is used to do the IPIs.  I don't care if
you look that up for x86 or ARM, because the restrictions are the same
- see the last two lines:

x86:
/**
 * smp_call_function_mask(): Run a function on a set of other CPUs.
 * @mask: The set of cpus to run on.  Must not include the current cpu.
 * @func: The function to run. This must be fast and non-blocking.
 * @info: An arbitrary pointer to pass to the function.
 * @wait: If true, wait (atomically) until function has completed on other CPUs.
 *
 * Returns 0 on success, else a negative status code.
 *
 * If @wait is true, then returns once @func has returned; otherwise
 * it returns just before the target cpu calls @func.
 *
 * You must not call this function with disabled interrupts or from a
 * hardware interrupt handler or from a bottom half handler.
 */

arm:
/*
 * You must not call this function with disabled interrupts, from a
 * hardware interrupt handler, nor from a bottom half handler.
 */


So, if the architecture requires you to IPI the TLB flush to other CPUs
this restriction has to propagate to dma_free_coherent.  Or we just don't
provide *any* DMA coherent memory and dictate that such platforms can
never use DMA.

Which is more idiotic?

 Is it seriously true that you can call dma_alloc_coherent() from atomic
 context on arm, but not dma_free_coherent()?

Yes.
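
A common way for a driver to live with this restriction is to record the buffer in the atomic path and release it later from process context. The sketch below is a userspace model of that pattern only: struct deferred_free, queue_deferred_free() and drain_deferred_frees() are hypothetical names, free() stands in for dma_free_coherent(), and a real driver would additionally need locking around the list and something like a workqueue to run the drain step.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace model: one pending buffer awaiting release. */
struct deferred_free {
    void *vaddr;                  /* buffer to release later */
    struct deferred_free *next;   /* singly linked pending list */
};

static struct deferred_free *pending_head; /* buffers awaiting release */
static unsigned int freed_count;           /* buffers actually released */

/*
 * Models the call made from "atomic" context: it only links the node
 * into a list and never performs the expensive teardown itself.
 */
static void queue_deferred_free(struct deferred_free *node, void *vaddr)
{
    assert(node != NULL);
    node->vaddr = vaddr;
    node->next = pending_head;
    pending_head = node;
}

/*
 * Models the later call from "process" context, where the real teardown
 * (dma_free_coherent() in a driver) would be permitted.
 * Returns the number of buffers released by this call.
 */
static unsigned int drain_deferred_frees(void)
{
    unsigned int n = 0;

    while (pending_head != NULL) {
        struct deferred_free *node = pending_head;

        pending_head = node->next;
        free(node->vaddr);        /* stand-in for the real free routine */
        node->vaddr = NULL;
        freed_count++;
        n++;
    }
    return n;
}
```

The design choice here is simply that the atomic path does constant-time bookkeeping, and everything that may need cross-CPU work happens in the drain step.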

-- 
Russell King
 Linux kernel2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:
-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Bug 9880] dma_free_coherent in arcmsr when calling areca tools

2008-02-04 Thread bugme-daemon
http://bugzilla.kernel.org/show_bug.cgi?id=9880





--- Comment #3 from [EMAIL PROTECTED]  2008-02-04 00:20 ---



Re: [PATCH] [SCSI] sd: make error handling more robust (v2)

2008-02-04 Thread Luben Tuikov
--- On Sun, 2/3/08, Mike Snitzer [EMAIL PROTECTED] wrote:

 From: Mike Snitzer [EMAIL PROTECTED]
 Subject: Re: [PATCH] [SCSI] sd: make error handling more robust (v2)
 To: James Bottomley [EMAIL PROTECTED]
 Cc: Tony Battersby [EMAIL PROTECTED], linux-scsi@vger.kernel.org 
 linux-scsi@vger.kernel.org, Luben Tuikov [EMAIL PROTECTED], Salyzyn, 
 Mark [EMAIL PROTECTED]
 Date: Sunday, February 3, 2008, 7:14 AM
 On Feb 2, 2008 5:06 PM, James Bottomley [EMAIL PROTECTED] wrote:
 
  On Fri, 2008-02-01 at 12:03 -0500, Tony Battersby wrote:
   This patch fixes a problem with some out-of-spec SCSI disks that report
   hardware or medium errors incorrectly.  Without the patch, the kernel
   may silently ignore a failed write command or return corrupted data on a
   failed read command.
  
   Signed-off-by: Tony Battersby [EMAIL PROTECTED]
   ---
  
   This is a simplified version of the original patch that fixes just the
   problem at hand, without trying to handle other theoretical out-of-spec
   cases.
 
  Actually, to restore the original check, this is what we want, isn't it?
  Ok, so I also made the sector division logic futureproof for the day we
  have > 4096 sector devices ...
 
  James
 
  ---
 
  From 5ae2e4a8ff095aab5997f17068d3e4212c33f039 Mon Sep 17 00:00:00 2001
  From: Tony Battersby [EMAIL PROTECTED]
  Date: Fri, 1 Feb 2008 12:03:27 -0500
  Subject: [SCSI] sd: make error handling more robust
 
  This patch fixes a problem with some out-of-spec SCSI disks that report
  hardware or medium errors incorrectly.  Without the patch, the kernel
  may silently ignore a failed write command or return corrupted data on a
  failed read command.
 
  Signed-off-by: Tony Battersby [EMAIL PROTECTED]
  Cc: Stable Tree [EMAIL PROTECTED]
  Signed-off-by: James Bottomley [EMAIL PROTECTED]
 
 I've verified that this patch fixes the 2.6.22.16 SCSI IO error
 propagation issue I had when physically pulling a drive from an
 enclosure connected to an aacraid controller.

Just as in your case and Tony's case, which I presume
uses the same RAID firmware vendor, it would've
probably been better if the RAID firmware vendor
fixed the firmware to not set the VALID bit if the
INFORMATION field is not valid.

   Luben



Re: [PATCH] bugfix for an underflow condition in usb storage isd200.c

2008-02-04 Thread Boaz Harrosh
On Sun, Feb 03 2008 at 21:23 +0200, Matthew Dharm [EMAIL PROTECTED] wrote:
 On Sun, Feb 03, 2008 at 06:28:48PM +0200, Boaz Harrosh wrote:
 From 3610cfa93c990bbbafb296134ac01ef6d426eb8d Mon Sep 17 00:00:00 2001
 From: Boaz Harrosh [EMAIL PROTECTED]
 Date: Thu, 31 Jan 2008 21:31:31 +0200
 Subject: [PATCH] bugfix for an overflow condition in usb storage & isd200.c

   scsi_scan is issuing a 36-byte INQUIRY request to llds. isd200 would
   volunteer 96 bytes of INQUIRY. This caused an overflow condition in
   protocol.c usb_stor_access_xfer_buf(). So first fix is to
   usb_stor_access_xfer_buf() to properly handle overflow/underflow conditions.
   Then usb_stor_set_xfer_buf() should report this condition as
   cmnd->result == (DID_BAD_TARGET << 16).

   Then also isd200.c is fixed to only return the type of INQUIRY & SENSE
   the upper layer asked for.

 Signed-off-by: Boaz Harrosh [EMAIL PROTECTED]
 ---
  drivers/usb/storage/isd200.c   |7 +--
  drivers/usb/storage/protocol.c |   20 
  2 files changed, 21 insertions(+), 6 deletions(-)
 
 Looking at this again, I think I see Alan's point.  The modifications to
 ISD200 code aren't really needed.
 
 Or, put another way, if you're going to modify the ISD200 code, you should
 fix up all the other users of usb_stor_access_xfer_buf() -- there are over
 a dozen or so, and it looks like none of them have sanity checks for length
 (but they happen to work right now).
 
 But, the modifications to usb_stor_access_xfer_buf() look good -- no
 request from a sub-driver should be allowed to scribble into memory.  The
 current code does make the implicit assumption that there is enough
 storage, and will walk right off the end of the sg list if there isn't.
 
 I'm not sure I like the mods to usb_stor_set_xfer_buf().  Any place we set
 a status that we know is going to be thrown away is an invitation for a
 problem later if someone changes the code to preserve that status.  It's a
 jack-in-the-box, waiting to spring open in our face later.  The limit check
 (which mirrors the usb_stor_access_xfer_buf modification) and WARN_ON() are
 probably good.
 
If you want the WARN_ON() then the isd200 code modification must stay,
otherwise the WARN_ON will trigger regularly.
You will find that most other places are naturally bound by other factors
and will not overflow. I think that those places that can, like INQUIRY,
should be fixed, and the WARN_ON should stay. If you don't fix them the
WARN_ON must go.

 In a strictly technical sense, the change to protocol.c are sufficient.
 That is, they will prevent a serious error.  There is a justification tho
 to fix all of the users of usb_stor_access_buf() to not attempt to use more
 SCSI buffer than exists.
 
 My opinion is this:  Let's make the protocol.c mods (modulo my comments
 about setting useless status bits) now.  Then, let's decide if we're going
 to patch all the other users of the usb_stor_*_xfer_buf() functions as a
 separate discussion.
 
 Matt
 
I'm removing the set of result. I don't see any danger in it but it's your
code. But if the WARN_ON stays then so do the isd200 fixes.
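
The bounds handling discussed in this thread can be illustrated with a small userspace model. This is only a sketch under assumptions: struct seg, copy_to_segs() and clamp_len() are hypothetical stand-ins, not the usb-storage code. The point is that the copy loop stops at the end of the segment list and reports how much it copied, and that a caller can clamp its length up front so a limit check never fires.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one scatter-gather segment. */
struct seg {
    unsigned char *base;   /* start of the segment's memory */
    size_t len;            /* bytes available in the segment */
};

/*
 * Copy up to buflen bytes from buf into the segments, stopping when
 * either the source or the segment list is exhausted.
 * Returns the number of bytes actually copied.
 */
static size_t copy_to_segs(const unsigned char *buf, size_t buflen,
                           struct seg *segs, size_t nsegs)
{
    size_t done = 0;
    size_t i;

    assert(buf != NULL && segs != NULL);
    for (i = 0; i < nsegs && done < buflen; i++) {
        size_t chunk = buflen - done;

        if (chunk > segs[i].len)
            chunk = segs[i].len;   /* never write past this segment */
        memcpy(segs[i].base, buf + done, chunk);
        done += chunk;
    }
    return done;
}

/* Caller-side clamp: offer no more data than the requester asked for. */
static size_t clamp_len(size_t have, size_t wanted)
{
    return have < wanted ? have : wanted;
}
```

With 96 bytes of data and a 36-byte destination, copy_to_segs() copies 36 bytes and returns; a caller that first applies clamp_len() never asks for more than 36.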

Boaz

---
From cd66d4d4a4a239e580714e926e9635f3426dd7fd Mon Sep 17 00:00:00 2001
From: Boaz Harrosh [EMAIL PROTECTED]
Date: Thu, 31 Jan 2008 21:31:31 +0200
Subject: [PATCH] bugfix for an overflow condition in usb storage & isd200.c

  scsi_scan is issuing a 36-byte INQUIRY request to llds. isd200 would
  volunteer 96 bytes of INQUIRY. This caused an overflow condition in
  protocol.c usb_stor_access_xfer_buf(). So first fix is to
  usb_stor_access_xfer_buf() to properly handle overflow/underflow conditions.
  Then put a WARN_ON in usb_stor_set_xfer_buf().

  isd200.c is fixed to only return the type of INQUIRY & SENSE
  the upper layer asked for.

Signed-off-by: Boaz Harrosh [EMAIL PROTECTED]
---
 drivers/usb/storage/isd200.c   |7 +--
 drivers/usb/storage/protocol.c |   12 
 2 files changed, 13 insertions(+), 6 deletions(-)

diff --git a/drivers/usb/storage/isd200.c b/drivers/usb/storage/isd200.c
index 49ba6c0..8186e93 100644
--- a/drivers/usb/storage/isd200.c
+++ b/drivers/usb/storage/isd200.c
@@ -1238,6 +1238,7 @@ static int isd200_scsi_to_ata(struct scsi_cmnd *srb, struct us_data *us,
unsigned long lba;
unsigned long blockCount;
unsigned char senseData[8] = { 0, 0, 0, 0, 0, 0, 0, 0 };
+   unsigned xfer_len;
 
memset(ataCdb, 0, sizeof(union ata_cdb));
 
@@ -1247,8 +1248,9 @@ static int isd200_scsi_to_ata(struct scsi_cmnd *srb, struct us_data *us,
US_DEBUGP("   ATA OUT - INQUIRY\n");
 
/* copy InquiryData */
+   xfer_len = min(sizeof(info->InquiryData), scsi_bufflen(srb));
usb_stor_set_xfer_buf((unsigned char *) info->InquiryData,
-   sizeof(info->InquiryData), srb);
+   xfer_len, srb);
srb->result = SAM_STAT_GOOD;

Re: [PATCH] [SCSI] sd: make error handling more robust (v2)

2008-02-04 Thread Luben Tuikov
--- On Sat, 2/2/08, James Bottomley [EMAIL PROTECTED] wrote:

 From: James Bottomley [EMAIL PROTECTED]
 Subject: Re: [PATCH] [SCSI] sd: make error handling more robust (v2)
 To: Tony Battersby [EMAIL PROTECTED]
 Cc: linux-scsi@vger.kernel.org linux-scsi@vger.kernel.org, Luben Tuikov 
 [EMAIL PROTECTED], Salyzyn, Mark [EMAIL PROTECTED]
 Date: Saturday, February 2, 2008, 2:06 PM
 On Fri, 2008-02-01 at 12:03 -0500, Tony Battersby wrote:
  This patch fixes a problem with some out-of-spec SCSI disks that report
  hardware or medium errors incorrectly.  Without the patch, the kernel
  may silently ignore a failed write command or return corrupted data on a
  failed read command.
  
  Signed-off-by: Tony Battersby [EMAIL PROTECTED]
  ---
  
  This is a simplified version of the original patch that fixes just the
  problem at hand, without trying to handle other theoretical out-of-spec
  cases.
 
 Actually, to restore the original check, this is what we want, isn't it?
 Ok, so I also made the sector division logic futureproof for the day we
 have > 4096 sector devices ...
 
 James
 
 ---
 
 From 5ae2e4a8ff095aab5997f17068d3e4212c33f039 Mon Sep 17 00:00:00 2001
 From: Tony Battersby [EMAIL PROTECTED]
 Date: Fri, 1 Feb 2008 12:03:27 -0500
 Subject: [SCSI] sd: make error handling more robust
 
 This patch fixes a problem with some out-of-spec SCSI disks that report
 hardware or medium errors incorrectly.  Without the patch, the kernel
 may silently ignore a failed write command or return corrupted data on a
 failed read command.
 
 Signed-off-by: Tony Battersby [EMAIL PROTECTED]
 Cc: Stable Tree [EMAIL PROTECTED]
 Signed-off-by: James Bottomley [EMAIL PROTECTED]
 ---
  drivers/scsi/sd.c |5 +
  1 files changed, 5 insertions(+), 0 deletions(-)
 
 Index: BUILD-2.6/drivers/scsi/sd.c
 ===
 --- BUILD-2.6.orig/drivers/scsi/sd.c  2008-02-02 15:43:20.0 -0600
 +++ BUILD-2.6/drivers/scsi/sd.c   2008-02-02 16:01:24.0 -0600
 @@ -929,6 +929,7 @@
   unsigned int xfer_size = scsi_bufflen(SCpnt);
   unsigned int good_bytes = result ? 0 : xfer_size;
   u64 start_lba = SCpnt->request->sector;
 + u64 end_lba = SCpnt->request->sector + (xfer_size / 512);
   u64 bad_lba;
   struct scsi_sense_hdr sshdr;
   int sense_valid = 0;
 @@ -967,26 +968,23 @@
   goto out;
   if (xfer_size <= SCpnt->device->sector_size)
   goto out;
 - switch (SCpnt->device->sector_size) {
 - case 256:
 + if (SCpnt->device->sector_size < 512) {
 + /* only legitimate sector_size here is 256 */
   start_lba <<= 1;
 - break;
 - case 512:
 - break;
 - case 1024:
 - start_lba >>= 1;
 - break;
 - case 2048:
 - start_lba >>= 2;
 - break;
 - case 4096:
 - start_lba >>= 3;
 - break;
 - default:
 - /* Print something here with limiting frequency. */
 - goto out;
 - break;
 + end_lba <<= 1;
 + } else {
 + /* be careful ... don't want any overflows */
 + u64 factor = SCpnt->device->sector_size / 512;
 + do_div(start_lba, factor);
 + do_div(end_lba, factor);
   }
 +
 + if (bad_lba < start_lba  || bad_lba >= end_lba)
 + /* the bad lba was reported incorrectly, we have
 +  * no idea where the error is
 +  */
 + goto out;
 +
   /* This computation should always be done in terms of
    * the resolution of the device's medium.
    */

Looks good except that End LBA is usually defined
to be something of the sort of the LBA of the last
logical block accessed by the command or the LBA
of the logical block on which the command failed.

A spec savvy editor of this code would be
pleasantly surprised if they had to use end_lba,
and didn't pay attention that it was actually
End LBA + 1.
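
One way to avoid the off-by-one reading described above is to keep an inclusive last LBA and compare against that. The following is a minimal userspace sketch, not the sd.c code: bad_lba_in_range() is a hypothetical helper, and it assumes the sector size is either below 512 (in practice 256) or a multiple of 512.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: decide whether a reported bad LBA falls inside
 * the range a command touched.
 *   start_512   - first sector of the request, in 512-byte units
 *   xfer_bytes  - transfer length in bytes
 *   sector_size - device logical block size in bytes (256, 512, 4096, ...)
 *   bad_lba     - LBA reported in the sense data, in device blocks
 * Returns 1 when bad_lba lies within [first_lba, last_lba], else 0.
 */
static int bad_lba_in_range(uint64_t start_512, uint32_t xfer_bytes,
                            uint32_t sector_size, uint64_t bad_lba)
{
    uint64_t first_lba, last_lba;
    uint32_t nblocks;

    assert(sector_size != 0);
    nblocks = xfer_bytes / sector_size;
    if (nblocks == 0)
        return 0;                 /* command covered no whole block */

    /* Scale by a small factor instead of multiplying by 512 first,
     * so large sector numbers cannot overflow. */
    if (sector_size < 512)
        first_lba = start_512 * (512 / sector_size);
    else
        first_lba = start_512 / (sector_size / 512);

    last_lba = first_lba + nblocks - 1;   /* inclusive End LBA */
    return bad_lba >= first_lba && bad_lba <= last_lba;
}
```

Naming the bound last_lba and making it inclusive keeps it aligned with the usual meaning of End LBA, at the cost of one extra subtraction.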

   Luben



Re: [PATCH RESEND number 2] libata: eliminate the home grown dma padding in favour of that provided by the block layer

2008-02-04 Thread Tejun Heo
Tejun Heo wrote:
 Some ATA controllers including SFF BMDMA and libata PIO HSM need the
 number of bytes mapped in the sg table.  Yeah, it can be calculated with
 a simple macro but it also is a fundamentally confusing dual-sizing
 which should be made as clear as possible.  Plus, it can be difficult to
 find out when somebody used the wrong thing, so what I'm saying is that
 we need to make it easy.  Anyways, please lemme work on it a bit.  I'll
 get back to you guys soon.

Okay, here's first draft combined patch.  Only compile tested (expect
it to be broken) but it should be functionally equivalent to
ata_sg_setup_extra() based implementation albeit with shorter drain
buffer size.  Several things to note...

* fsl last sg check isn't included here.  Will split it out and post
  it separately.

* rq->raw_data_len added.  Rationales...

  - All these padding and draining are to prevent controllers from
    crapping themselves when data buffer is shorter than it likes it
    to be.  Any controller which talks MMC (or SPC for that matter)
    should be ready for transfers shorter than buffer so feeding
    enlarged buffer size is inherently safer than feeding the length,
    so the primary data length field, rq->data_len, contains the
    adjusted length.

  - raw_data_len can't be easily deduced from data_len.  The other way
    is possible but with both aligning and draining and command
    filtering, calculating it later is messy.

* Draining configuration is done in sr as it's the driver for MMC.  It
  can move both ways - either into SCSI midlayer as SPC and other
  commands do variable length responses too or into libata if all
  non-ATA controllers are happy without such workarounds.  If you ask
  me, I'm inclined to move it into SCSI midlayer as the added overhead
  is insignificant (especially with drain_needed added) and it won't
  break anything (well, theoretically, at least).

* Padding via aligning seems a bit too hacky to me.  It doesn't even
  cover all sg cases.  I think we'll need improvements there, well,
  but for the time being, this should do.
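
The relationship between the two length fields can be sketched in plain C. This is a userspace illustration under assumptions, not the block layer code: struct xfer_len, pad_len() and make_xfer_len() are hypothetical names, and align_mask is assumed to be (alignment - 1) for a power-of-two alignment, which is how a DMA alignment mask is usually expressed.

```c
#include <assert.h>

/* Hypothetical model of the two lengths discussed above. */
struct xfer_len {
    unsigned int raw_data_len;  /* length the caller asked for */
    unsigned int data_len;      /* length after padding to alignment */
};

/* Bytes needed to round len up to the next multiple of (align_mask + 1). */
static unsigned int pad_len(unsigned int len, unsigned int align_mask)
{
    /* align_mask + 1 must be a power of two */
    assert(((align_mask + 1) & align_mask) == 0);
    if ((len & align_mask) == 0)
        return 0;                       /* already aligned */
    return (align_mask & ~len) + 1;
}

/* Record both the requested length and the padded length. */
static struct xfer_len make_xfer_len(unsigned int len, unsigned int align_mask)
{
    struct xfer_len x;

    x.raw_data_len = len;
    x.data_len = len + pad_len(len, align_mask);
    return x;
}
```

For example, a 13-byte request with a mask of 3 keeps raw_data_len at 13 while data_len becomes 16. Going from data_len back to raw_data_len is not possible without storing it, which is the argument for the extra field.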

I'll test and report in a few hours.

Thanks.

Index: work/block/blk-core.c
===
--- work.orig/block/blk-core.c
+++ work/block/blk-core.c
@@ -116,6 +116,7 @@ void rq_init(struct request_queue *q, st
rq->ref_count = 1;
rq->q = q;
rq->special = NULL;
+   rq->raw_data_len = 0;
rq->data_len = 0;
rq->data = NULL;
rq->nr_phys_segments = 0;
@@ -1982,6 +1983,7 @@ void blk_rq_bio_prep(struct request_queu
rq->hard_cur_sectors = rq->current_nr_sectors;
rq->hard_nr_sectors = rq->nr_sectors = bio_sectors(bio);
rq->buffer = bio_data(bio);
+   rq->raw_data_len = bio->bi_size;
rq->data_len = bio->bi_size;
 
rq->bio = rq->biotail = bio;
Index: work/block/blk-map.c
===
--- work.orig/block/blk-map.c
+++ work/block/blk-map.c
@@ -19,6 +19,7 @@ int blk_rq_append_bio(struct request_que
rq->biotail->bi_next = bio;
rq->biotail = bio;
 
+   rq->raw_data_len += bio->bi_size;
rq->data_len += bio->bi_size;
}
return 0;
@@ -56,8 +57,10 @@ static int __blk_rq_map_user(struct requ
if (!(uaddr & queue_dma_alignment(q)) &&
!(len & queue_dma_alignment(q)))
bio = bio_map_user(q, NULL, uaddr, len, reading);
-   else
+   else {
bio = bio_copy_user(q, uaddr, len, reading);
+   rq->data_len = roundup(len, queue_dma_alignment(q));
+   }
 
if (IS_ERR(bio))
return PTR_ERR(bio);
Index: work/include/linux/blkdev.h
===
--- work.orig/include/linux/blkdev.h
+++ work/include/linux/blkdev.h
@@ -214,6 +214,7 @@ struct request {
unsigned int cmd_len;
unsigned char cmd[BLK_MAX_CDB];
 
+   unsigned int raw_data_len;
unsigned int data_len;
unsigned int sense_len;
void *data;
@@ -256,6 +257,7 @@ struct bio_vec;
 typedef int (merge_bvec_fn) (struct request_queue *, struct bio *, struct bio_vec *);
 typedef void (prepare_flush_fn) (struct request_queue *, struct request *);
 typedef void (softirq_done_fn)(struct request *);
+typedef int (dma_drain_needed_fn)(struct request *);
 
 enum blk_queue_state {
Queue_down,
@@ -292,6 +294,7 @@ struct request_queue
merge_bvec_fn   *merge_bvec_fn;
prepare_flush_fn*prepare_flush_fn;
softirq_done_fn *softirq_done_fn;
+   dma_drain_needed_fn *dma_drain_needed;
 
/*
 * Dispatch queue sorting
@@ -696,8 +699,9 @@ extern void blk_queue_max_hw_segments(st
 extern void blk_queue_max_segment_size(struct request_queue *, unsigned int);
 extern void blk_queue_hardsect_size(struct request_queue *, unsigned short);
 extern void 

Re: [build bug] drivers/scsi/NCR53C9x.c:913: error: 'Scsi_Cmnd' has no member named 'use_sg'

2008-02-04 Thread Maciej W. Rozycki
On Thu, 31 Jan 2008, Boaz Harrosh wrote:

 On Thu, Jan 31 2008 at 19:29 +0200, Ingo Molnar [EMAIL PROTECTED] wrote:
  FYI, automated testing found the following build breakage:
  
  drivers/scsi/NCR53C9x.c: In function 'esp_get_dmabufs':
  drivers/scsi/NCR53C9x.c:913: error: 'Scsi_Cmnd' has no member named 'use_sg'
  drivers/scsi/NCR53C9x.c:914: error: 'Scsi_Cmnd' has no member named 
  'request_bufflen'
  
  config attached.
  
  Ingo
  
 Cc linux-scsi mailing list.
 
 This driver and others are scheduled to be removed in the scsi-pending tree
 and are awaiting ACKs from - disappeared - maintainers.

 Well, buried in some other activities.  You certainly have had my ack to 
remove dec_esp.c already and I started to work on replacement front-end 
drivers long ago.

 Unfortunately one of the three hardware configurations to be supported by 
the front-ends does not fit the interrupt handling model implemented by 
the esp_scsi.c core.  Or it is actually the other way round as you cannot 
adjust hardware to fit the driver, so I am in a process of rewriting the 
core a little bit in this respect -- the core switches to interrupt 
polling under some circumstances and does not expect a higher-priority 
interrupt of a different kind to arrive from the bus master controller the 
SCSI chips are attached to instead.

 Being no SCSI expert though I have to study the SCSI spec well enough to 
understand some bits and I am somewhat distracted these days, so it may 
take a while yet.  It is at the highest priority on my to-do list though.

  Maciej


[Fwd: 2.6.24 kernel and LIO Target memory mapping]

2008-02-04 Thread Nicholas A. Bellinger
Sorry, resend..
---BeginMessage---
On Sat, 2008-02-02 at 05:26 -0800, Nicholas A. Bellinger wrote: 
 Hi Bart,
 
 Ok, I have 2.6.24 running on ppc64 doing iSCSI/HD on PS3-Linux.  The
 changes for struct scatterlist->page moving to struct
 scatterlist->page_link were pretty straightforward, considering the LIO
 storage engine does not depend on struct scatterlist.  I am going to
 take another look at the diffs later today and make sure everything
 looks correct, and will make the commits then.
 
 If you could test these on your setup with 2.6.24 in the non IPoIB case
 (that from my previous emails I am guessing will be fine on your setup)
 I would really appreciate it.  I will put 2.6.24 in VM on the
 Linux-iSCSI.org fabric in the upcoming days, and do some additional
 testing.  Getting the CentOS v5u1 x86_64 builds released is a bit higher
 priority than 2.6.24, but I think that the 2.6.24 changes are reasonable
 and do not cause concern with the LIO v2.9 codebase.
 
 Have you made any further progress on debugging the LIO Target issue with
 IPoIB..?
 
 Many thanks for your most valuable of time,
 
 --nab
 
 On Fri, 2008-02-01 at 06:50 -0800, Nicholas A. Bellinger wrote:
  Hi There,
  
  I will be doing a 2.6.24 build for LIO on PS3-Linux this weekend now that
  ps3rom.c is reporting the proper struct scsi_host_template->max_sectors.
  Also in the queue are releasing updated builds for CentOS v5.1 x86_64.
  I have recently upgraded one pair of core nodes on the Linux-iSCSI.org
  fabric to the newest CentOS release, and minus a few minor issues with
  the upgrade process, everything is looking very stable.  The few issues
  that I had (an lvremove issue on v4.5, and qemu-dm performance with HVM
  on v5.0) have been resolved and the fabric is running much smoother for
  the VMs that are providing OCFS2 cluster storage to the LIO debian and
  ubuntu repositories.  
  
  I will let you know when I get 2.6.24 building.  Until then, feel free
  to post the build failure log here.
  
  Many thanks for your most valuable of time,
  
  --nab
  
  On Fri, 2008-02-01 at 15:09 +0100, Bart Van Assche wrote:
   Hello,
   
   The Linux-iSCSI target (kernel module) does not compile on the 2.6.24
   kernel because of changes in the scatterlist API. Has a target date
   for a version that compiles with the 2.6.24 kernel headers already
   been set ?
   
   Bart Van Assche.
   
 

Here are the changes to transport_memcpy_[write,read]_[contig,sg]()
respectively.  These functions are legacy within v2.9 LIO SE, and are
currently unused in kernel mode because the SE core does not rely on
struct scatterlist.

Index: target/iscsi_target_transport.c
===
--- target/iscsi_target_transport.c (revision 205)
+++ target/iscsi_target_transport.c (revision 206)
@@ -4181,7 +4181,7 @@
if (length > total_length)
length = total_length;
 
-   src = GET_ADDR_SG(sg_s, i);
+   src = GET_ADDR_SG(sg_s[i]);
 
memcpy(dst, src, length);
 
@@ -4211,12 +4211,12 @@
if (length > total_length)
length = total_length;
 
-   dst = GET_ADDR_SG(sg_d, i) + dst_offset;
+   dst = GET_ADDR_SG(sg_d[i]) + dst_offset;
if (!dst)
BUG();
i++;

-   src = GET_ADDR_SG(sg_s, j) + src_offset;
+   src = GET_ADDR_SG(sg_s[j]) + src_offset;
if (!src)
BUG();
 
@@ -4228,7 +4228,7 @@
if (length > total_length)
length = total_length;
 
-   dst = GET_ADDR_SG(sg_d, i) + dst_offset;
+   dst = GET_ADDR_SG(sg_d[i]) + dst_offset;
if (!dst)
BUG();
 
@@ -4238,7 +4238,7 @@
} else
dst_offset = length;
 
-   src = GET_ADDR_SG(sg_s, j) + src_offset;
+   src = GET_ADDR_SG(sg_s[j]) + src_offset;
if (!src)
BUG();
j++;
@@ -4269,7 +4269,7 @@
if (length > total_length)
length = total_length;
 
-   dst = GET_ADDR_SG(sg_d, i);
+   dst = GET_ADDR_SG(sg_d[i]);
 
memcpy(dst, src, length);


These changes are followed up by transport_map_sg_to_mem()
and transport_map_mem_to_sg()..  The latter is the default path for
LIO SE v2.9 to v2.6 Linux storage subsystems along with the other completely
virtual subsystem drivers.  Note the former does reverse contiguous
scatterlist array mapping to SE linked list memory which is then handed to
the SCSI transport, in the LIO case, traditional 

Re: [PATCH RESEND number 2] libata: eliminate the home grown dma padding in favour of that provided by the block layer

2008-02-04 Thread Tejun Heo
And, here's the working version.  I'll split and post them tomorrow.

Thanks.

Index: work/block/blk-core.c
===
--- work.orig/block/blk-core.c
+++ work/block/blk-core.c
@@ -116,6 +116,7 @@ void rq_init(struct request_queue *q, st
rq->ref_count = 1;
rq->q = q;
rq->special = NULL;
+   rq->raw_data_len = 0;
rq->data_len = 0;
rq->data = NULL;
rq->nr_phys_segments = 0;
@@ -1982,6 +1983,7 @@ void blk_rq_bio_prep(struct request_queu
rq->hard_cur_sectors = rq->current_nr_sectors;
rq->hard_nr_sectors = rq->nr_sectors = bio_sectors(bio);
rq->buffer = bio_data(bio);
+   rq->raw_data_len = bio->bi_size;
rq->data_len = bio->bi_size;
 
rq->bio = rq->biotail = bio;
Index: work/block/blk-map.c
===
--- work.orig/block/blk-map.c
+++ work/block/blk-map.c
@@ -19,6 +19,7 @@ int blk_rq_append_bio(struct request_que
rq->biotail->bi_next = bio;
rq->biotail = bio;
 
+   rq->raw_data_len += bio->bi_size;
rq->data_len += bio->bi_size;
}
return 0;
@@ -139,6 +140,25 @@ int blk_rq_map_user(struct request_queue
ubuf += ret;
}
 
+   /*
+* __blk_rq_map_user() copies the buffers if starting address
+* or length aren't aligned.  As the copied buffer is always
+* page aligned, we know for a fact that there's enough room
+* for padding.  Extend the last bio and update rq->data_len
+* accordingly.
+*
+* On unmap, bio_uncopy_user() will use unmodified
+* bio_map_data pointed to by bio->bi_private.
+*/
+   if (len & queue_dma_alignment(q)) {
+   unsigned int pad_len = (queue_dma_alignment(q) & ~len) + 1;
+   struct bio *bio = rq->biotail;
+
+   bio->bi_io_vec[bio->bi_vcnt - 1].bv_len += pad_len;
+   bio->bi_size += pad_len;
+   rq->data_len += pad_len;
+   }
+
rq->buffer = rq->data = NULL;
return 0;
 unmap_rq:
Index: work/include/linux/blkdev.h
===
--- work.orig/include/linux/blkdev.h
+++ work/include/linux/blkdev.h
@@ -214,6 +214,7 @@ struct request {
unsigned int cmd_len;
unsigned char cmd[BLK_MAX_CDB];
 
+   unsigned int raw_data_len;
unsigned int data_len;
unsigned int sense_len;
void *data;
@@ -256,6 +257,7 @@ struct bio_vec;
 typedef int (merge_bvec_fn) (struct request_queue *, struct bio *, struct bio_vec *);
 typedef void (prepare_flush_fn) (struct request_queue *, struct request *);
 typedef void (softirq_done_fn)(struct request *);
+typedef int (dma_drain_needed_fn)(struct request *);
 
 enum blk_queue_state {
Queue_down,
@@ -292,6 +294,7 @@ struct request_queue
merge_bvec_fn   *merge_bvec_fn;
prepare_flush_fn*prepare_flush_fn;
softirq_done_fn *softirq_done_fn;
+   dma_drain_needed_fn *dma_drain_needed;
 
/*
 * Dispatch queue sorting
@@ -696,8 +699,9 @@ extern void blk_queue_max_hw_segments(st
 extern void blk_queue_max_segment_size(struct request_queue *, unsigned int);
 extern void blk_queue_hardsect_size(struct request_queue *, unsigned short);
 extern void blk_queue_stack_limits(struct request_queue *t, struct request_queue *b);
-extern int blk_queue_dma_drain(struct request_queue *q, void *buf,
-  unsigned int size);
+extern int blk_queue_dma_drain(struct request_queue *q,
+  dma_drain_needed_fn *dma_drain_needed,
+  void *buf, unsigned int size);
 extern void blk_queue_segment_boundary(struct request_queue *, unsigned long);
 extern void blk_queue_prep_rq(struct request_queue *, prep_rq_fn *pfn);
 extern void blk_queue_merge_bvec(struct request_queue *, merge_bvec_fn *);
Index: work/block/blk-merge.c
===
--- work.orig/block/blk-merge.c
+++ work/block/blk-merge.c
@@ -220,7 +220,7 @@ new_segment:
bvprv = bvec;
} /* segments in rq */
 
-   if (q->dma_drain_size) {
+   if (q->dma_drain_size && q->dma_drain_needed(rq)) {
sg->page_link &= ~0x02;
sg = sg_next(sg);
sg_set_page(sg, virt_to_page(q->dma_drain_buffer),
@@ -228,6 +228,7 @@ new_segment:
((unsigned long)q->dma_drain_buffer) &
(PAGE_SIZE - 1));
nsegs++;
+   rq->data_len += q->dma_drain_size;
}
 
if (sg)
Index: work/block/bsg.c
===
--- work.orig/block/bsg.c
+++ work/block/bsg.c
@@ -437,14 +437,14 @@ static int blk_complete_sgv4_hdr_rq(stru
}
 
if (rq->next_rq) {
-  

Re: [PATCH] [RFC] sd: make error handling more robust

2008-02-04 Thread Tony Battersby
Luben Tuikov wrote:
 You then took hunk #2 (2 lines) of the patch I sent
 you and submitted it as your own, and then I acked
 your patch.
   
I _really_ _really_ hope that you don't believe that I am trying to take
credit for your work. If you take another look, my original patch had
the following hunk:

+
+   /* Make sure that bad_lba is one of the sectors that the
+* command was trying to access.
+*/
+   if (bad_lba < start_lba ||
+   bad_lba >= start_lba + xfer_size / sector_size)
+   goto out;
+


Your response patch had the following hunk:

+   if (bad_lba < start_lba)
+   goto out;


So I don't feel that it was dishonest for me to submit this as my
work. If you were offended, then I apologize.

> I think it would've been much clearer if you had
> singled out the problems you were seeing with your
> HW and sent a single problem with a single patch per
> single email.

Agreed. Sometimes it is difficult to predict when something that seems
so straightforward will generate so much controversy.

Tony

-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Integration of SCST in the mainstream Linux kernel

2008-02-04 Thread Bart Van Assche
On Feb 4, 2008 1:27 PM, Vladislav Bolkhovitin [EMAIL PROTECTED] wrote:

> So, James, what is your opinion on the above? Or the overall SCSI target
> project simplicity doesn't matter much for you and you think it's fine
> to duplicate Linux page cache in the user space to keep the in-kernel
> part of the project as small as possible?

It's too early to draw conclusions about performance. I'm currently
performing more measurements, and the results are not easy to
interpret. My plan is to measure the following:
* Setup: target with RAM disk of 2 GB as backing storage.
* Throughput reported by dd and xdd (direct I/O).
* Transfers with dd/xdd in units of 1 KB to 1 GB (the smallest
transfer size that can be specified to xdd is 1 KB).
* Target SCSI software to be tested: IETD iSCSI via IPoIB, STGT iSCSI
via IPoIB, STGT iSER, SCST iSCSI via IPoIB, SCST SRP, LIO iSCSI via
IPoIB.

The reason I chose dd/xdd for these tests is that I want to measure
the performance of the communication protocols, and I am assuming
that this performance can be modeled by the following formula:

(transfer time in s) = (transfer setup latency in s) +
                       (transfer size in MB) / (bandwidth in MB/s)

Measuring the time needed for transfers with varying block sizes makes
it possible to compute the constants in the above formula via linear
regression.
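
To make the regression step concrete, here is a rough userspace sketch of an
ordinary least-squares fit of the two constants. The names
fit_transfer_model() and struct transfer_fit are made up for this
illustration and are not part of dd, xdd or any of the targets under test;
the sketch also assumes the simple two-constant model holds for the
measured samples.

```c
#include <stddef.h>

/* Result of fitting: time = latency + size / bandwidth. */
struct transfer_fit {
	double latency_s;	/* intercept: transfer setup latency in s */
	double bandwidth_mb_s;	/* 1 / slope: bandwidth in MB/s */
};

/*
 * Hypothetical helper: ordinary least-squares fit of
 * time_s[i] = a + b * size_mb[i] over n measurements.
 * Returns 0 on success, -1 if the fit is degenerate (fewer than two
 * samples, all sizes identical, or a non-positive slope).
 */
static int fit_transfer_model(const double *size_mb, const double *time_s,
			      size_t n, struct transfer_fit *out)
{
	double sx = 0.0, sy = 0.0, sxx = 0.0, sxy = 0.0;
	double denom, slope;
	size_t i;

	if (n < 2)
		return -1;

	for (i = 0; i < n; i++) {
		sx += size_mb[i];
		sy += time_s[i];
		sxx += size_mb[i] * size_mb[i];
		sxy += size_mb[i] * time_s[i];
	}

	denom = (double)n * sxx - sx * sx;
	if (denom == 0.0)
		return -1;

	slope = ((double)n * sxy - sx * sy) / denom;
	if (slope <= 0.0)
		return -1;

	out->latency_s = (sy - slope * sx) / (double)n;
	out->bandwidth_mb_s = 1.0 / slope;
	return 0;
}
```

The intercept of the fitted line is the setup latency and the inverse of the
slope is the bandwidth, so each protocol under test reduces to two numbers
that can be compared directly.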

One difficulty I already encountered is that the performance of the
Linux IPoIB implementation varies a lot under high load
(http://bugzilla.kernel.org/show_bug.cgi?id=9883).

Another issue I have to look further into is that dd and xdd report
different results for very large block sizes (> 1 MB).

Bart Van Assche.


LIO Target iSCSI/SE PS3-Linux / FC8 builds

2008-02-04 Thread Nicholas A. Bellinger
Greetings all,

I have updated the wiki at:

http://linux-iscsi.org/index.php/Playstation3/iSCSI

and posted the first LIO target builds on the LIO Cluster:

http://linux-iscsi.org/builds/ps3-linux/

I will be adding the documentation for both LIO SE on PS3-Linux and the
FC8 upgrade process, as the latter can still be a bit challenging for
new users.

Here is the info from the README:

LIO Target iSCSI/SE for PS3-Linux v2.9.0.209 

I) Kernel module package

iscsi-target-module-2.6.24-2.9.0.209-1.powerpc.rpm

This module is built for ppc64 with the toolkit for Fedora Core 8 ppc.
The compiler is gcc (GCC) 4.1.2 20070925 (Red Hat 4.1.2-33).

This module has been tested with the latest ps3-linux.git and built against
arch/powerpc/configs/ps3_defconfig.  To use the BD-ROM, your 2.6.24 kernel
must contain ps3rom-use-128-max-sector.diff.  Please see:

http://git.kernel.org/?p=linux/kernel/git/geoff/ps3-linux.git;a=commit;h=e82112af66a39d11bcb484de9cfa45f0d214c97f

This module package should work with kernel-2.6.24-20080131.ppc64.rpm from
CELL-Linux-CL_20080201-ADDON.iso, but the BD-ROM will throw an exception without
ps3rom-use-128-max-sector.diff.  Please see the following link for more
information about the ADDON CD, and watch for an updated kernel package soon.

http://www.kernel.org/pub/linux/kernel/people/geoff/cell/

II) Userspace packages

iscsi-target-tools-2.9.0.209-1.ppc.rpm
sbe-mibs-2.9.0.209-1.ppc.rpm

Note that these are 32-bit and built on Fedora Core 8.

Have fun!

--nab



Re: [PATCH 6/24][RFC] gdth: Use of scsi_eh API and sense accessors

2008-02-04 Thread Boaz Harrosh
On Mon, Feb 04 2008 at 18:11 +0200, Jeff Garzik [EMAIL PROTECTED] wrote:
> Boaz Harrosh wrote:
>>   Use of new scsi_eh API for setting sense information into
>>   the scsi command.
>>
>> Signed-off-by: Boaz Harrosh [EMAIL PROTECTED]
>> ---
>>  drivers/scsi/gdth.c |   47 ++-
>>  drivers/scsi/gdth.h |1 +
>>  2 files changed, 27 insertions(+), 21 deletions(-)
>>
>> diff --git a/drivers/scsi/gdth.c b/drivers/scsi/gdth.c
>> index c825239..9fdd5ef 100644
>> --- a/drivers/scsi/gdth.c
>> +++ b/drivers/scsi/gdth.c
>> @@ -2098,6 +2098,16 @@ static void gdth_putq(gdth_ha_str *ha, Scsi_Cmnd *scp, unchar priority)
>>  #endif
>>  }
>>
>> +static void gdth_set_4byte_sense(struct scsi_cmnd *scp, u8 sense_code)
>> +{
>> +	u8 sense[4];
>> +
>> +	memset(sense, 0, sizeof(sense));
>> +	sense[0] = 0x70;
>> +	sense[2] = sense_code;
>> +	scsi_eh_cpy_sense(scp, sense, sizeof(sense));
>> +}
>
> IMO, setting 0x70 and 0x72 is highly common, and worthy of some simple
> helper functions.  See ata_scsi_set_sense() in libata-scsi.c or
> stex_set_sense() in stex.c, which is a copy of the former.
>
>    Jeff

Thanks. Yes, I was thinking of a more general sense-formatting helper, but
I'm not yet sure of its API. If you think so too, that motivates me
to define one and use it in the many places that do such formatting.
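
For illustration only, one possible shape for such a helper, written as
plain userspace C. The name build_fixed_sense() and its signature are
hypothetical and are not an existing kernel API; the byte layout is the
fixed-format sense data layout (response code 0x70, sense key in byte 2,
additional sense length in byte 7, ASC/ASCQ in bytes 12 and 13).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SENSE_FIXED_CURRENT	0x70	/* fixed format, current error */
#define SENSE_FIXED_LEN		18	/* bytes 0..17 */

/*
 * Hypothetical helper: fill buf with fixed-format sense data.
 * Returns the number of bytes written, or 0 if buf is too small.
 */
static size_t build_fixed_sense(uint8_t *buf, size_t buf_len,
				uint8_t sense_key, uint8_t asc, uint8_t ascq)
{
	if (buf_len < SENSE_FIXED_LEN)
		return 0;

	memset(buf, 0, SENSE_FIXED_LEN);
	buf[0] = SENSE_FIXED_CURRENT;
	buf[2] = sense_key & 0x0f;	/* sense key is the low nibble */
	buf[7] = SENSE_FIXED_LEN - 8;	/* additional sense length */
	buf[12] = asc;			/* additional sense code */
	buf[13] = ascq;			/* additional sense code qualifier */
	return SENSE_FIXED_LEN;
}
```

A driver-side wrapper could then build the data in a local buffer and hand
it to whatever copy routine the midlayer provides, instead of open-coding
the 0x70 and sense-key stores in each driver.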

Boaz


[PATCH 5/24][RFC] dpt_i2o: Use new scsi_eh_cpy_sense()

2008-02-04 Thread Boaz Harrosh
  - Abstract away scsi_cmnd->sense_buffer for later removal.

  - Removed a filtering out of a REQUEST_SENSE at .queuecommand
    in the case of sense being clean. This is no longer relevant
    since scsi-ml will always send a zeroed-out sense buffer, even
    on a resend, so this means outside REQUEST_SENSE would never
    go through. If this is intended then comment and check should
    change.

Signed-off-by: Boaz Harrosh [EMAIL PROTECTED]
---
 drivers/scsi/dpt_i2o.c |   25 +++--
 1 files changed, 7 insertions(+), 18 deletions(-)

diff --git a/drivers/scsi/dpt_i2o.c b/drivers/scsi/dpt_i2o.c
index c9dd839..6ee6fcd 100644
--- a/drivers/scsi/dpt_i2o.c
+++ b/drivers/scsi/dpt_i2o.c
@@ -71,6 +71,7 @@ MODULE_DESCRIPTION("Adaptec I2O RAID Driver");
 #include <scsi/scsi_device.h>
 #include <scsi/scsi_host.h>
 #include <scsi/scsi_tcq.h>
+#include <scsi/scsi_eh.h>
 
 #include "dpt/dptsig.h"
 #include "dpti.h"
@@ -385,18 +386,6 @@ static int adpt_queue(struct scsi_cmnd * cmd, void (*done) (struct scsi_cmnd *))
 	struct adpt_device* pDev = NULL;	/* dpt per device information */
 
 	cmd->scsi_done = done;
-	/*
-	 * SCSI REQUEST_SENSE commands will be executed automatically by the
-	 * Host Adapter for any errors, so they should not be executed
-	 * explicitly unless the Sense Data is zero indicating that no error
-	 * occurred.
-	 */
-
-	if ((cmd->cmnd[0] == REQUEST_SENSE) && (cmd->sense_buffer[0] != 0)) {
-		cmd->result = (DID_OK << 16);
-		cmd->scsi_done(cmd);
-		return 0;
-	}
 
 	pHba = (adpt_hba*)cmd->device->host->hostdata[0];
 	if (!pHba) {
@@ -2226,8 +2215,6 @@ static s32 adpt_i2o_to_scsi(void __iomem *reply, struct scsi_cmnd* cmd)
 
 	pHba = (adpt_hba*) cmd->device->host->hostdata[0];
 
-	cmd->sense_buffer[0] = '\0';  // initialize sense valid flag to false
-
 	if(!(reply_flags & MSG_FAIL)) {
 		switch(detailed_status & I2O_SCSI_DSC_MASK) {
 		case I2O_SCSI_DSC_SUCCESS:
@@ -2297,11 +2284,13 @@ static s32 adpt_i2o_to_scsi(void __iomem *reply, struct scsi_cmnd* cmd)
 		// copy over the request sense data if it was a check
 		// condition status
 		if (dev_status == SAM_STAT_CHECK_CONDITION) {
-			u32 len = min(SCSI_SENSE_BUFFERSIZE, 40);
+			u8 sense_buffer[40];
+			u32 len = sizeof(sense_buffer);
 			// Copy over the sense data
-			memcpy_fromio(cmd->sense_buffer, (reply+28) , len);
-			if(cmd->sense_buffer[0] == 0x70 /* class 7 */ &&
-			   cmd->sense_buffer[2] == DATA_PROTECT ){
+			memcpy_fromio(sense_buffer, (reply+28) , len);
+			scsi_eh_cpy_sense(cmd, sense_buffer, len);
+			if (sense_buffer[0] == 0x70 /* class 7 */ &&
+			    sense_buffer[2] == DATA_PROTECT){
 				/* This is to handle an array failed */
 				cmd->result = (DID_TIME_OUT << 16);
 				printk(KERN_WARNING"%s: SCSI Data Protect-Device (%d,%d,%d) hba_status=0x%x, dev_status=0x%x, cmd=0x%x\n",
-- 
1.5.3.3



[PATCH 2/24][RFC] scsi-drivers: Move to new sense API. The Trevial case

2008-02-04 Thread Boaz Harrosh
  All these drivers are trivially converted from a memcpy into the command's
  sense_buffer from a driver-private area to the new scsi_eh_cpy_sense()
  API. Some also do amateurish sense editing or printing.

  FIXME: weed out the drivers for which this patch is a bugfix
    (they copy more than what they have in the private area).

  The list of converted drivers:
   arch/ia64/hp/sim/simscsi.c
   drivers/block/cciss_scsi.c
   drivers/infiniband/ulp/srp/ib_srp.c
   drivers/message/fusion/mptscsih.c
   drivers/message/i2o/i2o_scsi.c
   drivers/scsi/3w-9xxx.c
   drivers/scsi/a100u2w.c
   drivers/scsi/aha1542.c
   drivers/scsi/aha1740.c
   drivers/scsi/atp870u.c
   drivers/scsi/hptiop.c
   drivers/scsi/ibmvscsi/ibmvscsi.c
   drivers/scsi/ibmvscsi/ibmvstgt.c
   drivers/scsi/ide-scsi.c
   drivers/scsi/libiscsi.c
   drivers/scsi/libsas/sas_scsi_host.c
   drivers/scsi/lpfc/lpfc_scsi.c
   drivers/scsi/megaraid.c
   drivers/scsi/megaraid/megaraid_sas.c
   drivers/scsi/ncr53c8xx.c
   drivers/scsi/qlogicpti.c
   drivers/scsi/scsi_debug.c
   drivers/scsi/sym53c8xx_2/sym_glue.c

Signed-off-by: Boaz Harrosh [EMAIL PROTECTED]
---
 arch/ia64/hp/sim/simscsi.c   |8 ++--
 drivers/block/cciss_scsi.c   |8 +++-
 drivers/infiniband/ulp/srp/ib_srp.c  |6 +++---
 drivers/message/fusion/mptscsih.c|9 +
 drivers/message/i2o/i2o_scsi.c   |4 ++--
 drivers/scsi/3w-9xxx.c   |5 -
 drivers/scsi/a100u2w.c   |4 ++--
 drivers/scsi/aha1542.c   |4 ++--
 drivers/scsi/aha1740.c   |4 ++--
 drivers/scsi/atp870u.c   |1 -
 drivers/scsi/hptiop.c|6 +++---
 drivers/scsi/ibmvscsi/ibmvscsi.c |4 +---
 drivers/scsi/ibmvscsi/ibmvstgt.c |2 +-
 drivers/scsi/ide-scsi.c  |4 +++-
 drivers/scsi/libiscsi.c  |6 ++
 drivers/scsi/libsas/sas_scsi_host.c  |3 +--
 drivers/scsi/lpfc/lpfc_scsi.c|9 -
 drivers/scsi/megaraid.c  |   14 +-
 drivers/scsi/megaraid/megaraid_sas.c |6 +++---
 drivers/scsi/ncr53c8xx.c |9 -
 drivers/scsi/qlogicpti.c |4 ++--
 drivers/scsi/scsi_debug.c|5 ++---
 drivers/scsi/sym53c8xx_2/sym_glue.c  |5 ++---
 23 files changed, 66 insertions(+), 64 deletions(-)

diff --git a/arch/ia64/hp/sim/simscsi.c b/arch/ia64/hp/sim/simscsi.c
index 7661bb0..c33c4b4 100644
--- a/arch/ia64/hp/sim/simscsi.c
+++ b/arch/ia64/hp/sim/simscsi.c
@@ -326,9 +326,13 @@ simscsi_queuecommand (struct scsi_cmnd *sc, void (*done)(struct scsi_cmnd *))
 		}
 	}
 	if (sc->result == DID_BAD_TARGET) {
+		u8 sense_buffer[3];
 		sc->result |= DRIVER_SENSE << 24;
-		sc->sense_buffer[0] = 0x70;
-		sc->sense_buffer[2] = 0x00;
+
+		sense_buffer[0] = 0x70;
+		sense_buffer[1] = 0;
+		sense_buffer[2] = 0x00;
+		scsi_eh_cpy_sense(sc, sense_buffer, sizeof(sense_buffer));
 	}
 	if (atomic_read(&num_reqs) >= SIMSCSI_REQ_QUEUE_LEN) {
 		panic("Attempt to queue command while command is pending!!");
diff --git a/drivers/block/cciss_scsi.c b/drivers/block/cciss_scsi.c
index 63ee6c0..1ccd225 100644
--- a/drivers/block/cciss_scsi.c
+++ b/drivers/block/cciss_scsi.c
@@ -37,7 +37,8 @@
 
 #include <scsi/scsi_cmnd.h>
 #include <scsi/scsi_device.h>
-#include <scsi/scsi_host.h> 
+#include <scsi/scsi_host.h>
+#include <scsi/scsi_eh.h>
 
 #include "cciss_scsi.h"
 
@@ -579,10 +580,7 @@ complete_scsi_command( CommandList_struct *cp, int timeout, __u32 tag)
 
 	/* copy the sense data whether we need to or not. */
 
-	memcpy(cmd->sense_buffer, ei->SenseInfo,
-		ei->SenseLen > SCSI_SENSE_BUFFERSIZE ?
-			SCSI_SENSE_BUFFERSIZE :
-			ei->SenseLen);
+	scsi_eh_cpy_sense(cmd, ei->SenseInfo, ei->SenseLen);
 	scsi_set_resid(cmd, ei->ResidualCnt);
 
 	if(ei->CommandStatus != 0)
diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index 195ce7c..0437585 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -48,6 +48,7 @@
 #include <scsi/scsi_dbg.h>
 #include <scsi/srp.h>
 #include <scsi/scsi_transport_srp.h>
+#include <scsi/scsi_eh.h>
 
 #include <rdma/ib_cache.h>
 
@@ -798,10 +799,9 @@ static void srp_process_rsp(struct srp_target_port *target, struct srp_rsp *rsp)
 	scmnd->result = rsp->status;
 
 	if (rsp->flags & SRP_RSP_FLAG_SNSVALID) {
-		memcpy(scmnd->sense_buffer, rsp->data +
+		scsi_eh_cpy_sense(scmnd, rsp->data +
 		       be32_to_cpu(rsp->resp_data_len),
-

Re: [patch] pci: pci_enable_device_bars() fix

2008-02-04 Thread Jeff Garzik

Ingo Molnar wrote:
> * Jeff Garzik [EMAIL PROTECTED] wrote:
>
>> Ingo Molnar wrote:
>>> so please tell me Jeff. If Greg, who is the super-maintainer of your
>>> code area, and who deals with your code every day and changes it
>>> every minute and hour, simply did not Cc: the SCSI list - how am i, a
>>> largely outside party in this matter, supposed to notice that 3
>>> maintainers and 3 mailing lists in the Cc: were somehow not enough
>>> and that i was supposed to grow the already sizable Cc: list even
>>> more?
>>
>> Because, regardless of the situation, it's both common courtesy and
>> wise practice to CC relevant driver maintainers, when you touch a
>> driver.
>>
>> And it's just common sense: Greg simply does not know the intimate
>> details of every PCI driver.  Nor do I.  Nor you.
>>
>> In the case of lpfc here, we have an active driver maintainer, and an
>> up-to-date MAINTAINERS entry.  Even if you are too slack to read
>> MAINTAINERS, 'git log' would have given you the same info.
>>
>> Don't pretend there is some benefit here to ignoring the people that
>> best know the driver.  I don't buy that; it simply makes no
>> engineering sense whatsoever.
>
> what you _STILL_ do not realize is the following: you still attribute
> the lack of Cc:s to some intention of mine. No, it was not my intention.

I was never speaking to intent.

I was noting that, having been in the kernel community for years, both
of you guys should know that you should always CC a driver author, when
touching their driver.

Even after this thread, I have not even heard a "yes, I agree, I should
have CC'd the driver author since they know the most about the driver"
from either of you, which is quite disappointing.

Instead, I get this long thread in response...

>   is just super fragile and does not serve users at all. Even Greg and i
>   got it wrong accidentally. If _we_ get it wrong, who will get it


Sure.  But... do you agree the CC list should have included the driver 
author?  Do you agree that a mistake was made in this case?


Jeff




[PATCH 16/24][RFC] Add .sense_buffsize to drivers that use scsi_eh_prep_cmnd

2008-02-04 Thread Boaz Harrosh
  With the new sense handling, drivers that use scsi_eh_prep_cmnd()
  should set .sense_buffsize to non-zero in the host template.

Signed-off-by: Boaz Harrosh [EMAIL PROTECTED]
---
 drivers/scsi/aha152x.c |1 +
 drivers/scsi/arm/cumana_1.c|1 +
 drivers/scsi/arm/oak.c |1 +
 drivers/scsi/atari_scsi.c  |1 +
 drivers/scsi/dmx3191d.c|1 +
 drivers/scsi/dtc.c |1 +
 drivers/scsi/g_NCR5380.c   |1 +
 drivers/scsi/mac_scsi.c|1 +
 drivers/scsi/pas16.c   |1 +
 drivers/scsi/sun3_scsi.c   |3 ++-
 drivers/scsi/sun3_scsi_vme.c   |1 +
 drivers/scsi/t128.c|1 +
 drivers/usb/storage/scsiglue.c |3 +++
 13 files changed, 16 insertions(+), 1 deletions(-)

diff --git a/drivers/scsi/aha152x.c b/drivers/scsi/aha152x.c
index 6ccdc96..2df7600 100644
--- a/drivers/scsi/aha152x.c
+++ b/drivers/scsi/aha152x.c
@@ -3477,6 +3477,7 @@ static struct scsi_host_template aha152x_driver_template 
= {
.cmd_per_lun= 1,
.use_clustering = DISABLE_CLUSTERING,
.slave_alloc= aha152x_adjust_queue,
+   .sense_buffsize = SCSI_SENSE_BUFFERSIZE,
 };
 
 #if !defined(PCMCIA)
diff --git a/drivers/scsi/arm/cumana_1.c b/drivers/scsi/arm/cumana_1.c
index 49d838e..5866bb0 100644
--- a/drivers/scsi/arm/cumana_1.c
+++ b/drivers/scsi/arm/cumana_1.c
@@ -225,6 +225,7 @@ static struct scsi_host_template cumanascsi_template = {
.unchecked_isa_dma  = 0,
.use_clustering = DISABLE_CLUSTERING,
.proc_name  = "CumanaSCSI-1",
+   .sense_buffsize = SCSI_SENSE_BUFFERSIZE,
 };
 
 static int __devinit
diff --git a/drivers/scsi/arm/oak.c b/drivers/scsi/arm/oak.c
index 849cdf8..1646983 100644
--- a/drivers/scsi/arm/oak.c
+++ b/drivers/scsi/arm/oak.c
@@ -127,6 +127,7 @@ static struct scsi_host_template oakscsi_template = {
.cmd_per_lun= 2,
.use_clustering = DISABLE_CLUSTERING,
.proc_name  = "oakscsi",
+   .sense_buffsize = SCSI_SENSE_BUFFERSIZE,
 };
 
 static int __devinit
diff --git a/drivers/scsi/atari_scsi.c b/drivers/scsi/atari_scsi.c
index f5732d8..7324f49 100644
--- a/drivers/scsi/atari_scsi.c
+++ b/drivers/scsi/atari_scsi.c
@@ -1140,6 +1140,7 @@ static struct scsi_host_template driver_template = {
.sg_tablesize   = 0, /* initialized at run-time */
.cmd_per_lun= 0, /* initialized at run-time */
.use_clustering = DISABLE_CLUSTERING
+   .sense_buffsize = SCSI_SENSE_BUFFERSIZE,
 };
 
 
diff --git a/drivers/scsi/dmx3191d.c b/drivers/scsi/dmx3191d.c
index fa738ec..1ee5264 100644
--- a/drivers/scsi/dmx3191d.c
+++ b/drivers/scsi/dmx3191d.c
@@ -66,6 +66,7 @@ static struct scsi_host_template dmx3191d_driver_template = {
.sg_tablesize   = SG_ALL,
.cmd_per_lun= 2,
.use_clustering = DISABLE_CLUSTERING,
+   .sense_buffsize = SCSI_SENSE_BUFFERSIZE,
 };
 
 static int __devinit dmx3191d_probe_one(struct pci_dev *pdev,
diff --git a/drivers/scsi/dtc.c b/drivers/scsi/dtc.c
index c2677ba..7d84259 100644
--- a/drivers/scsi/dtc.c
+++ b/drivers/scsi/dtc.c
@@ -482,5 +482,6 @@ static struct scsi_host_template driver_template = {
.sg_tablesize   = SG_ALL,
.cmd_per_lun= CMD_PER_LUN,
.use_clustering = DISABLE_CLUSTERING,
+   .sense_buffsize = SCSI_SENSE_BUFFERSIZE,
 };
 #include "scsi_module.c"
diff --git a/drivers/scsi/g_NCR5380.c b/drivers/scsi/g_NCR5380.c
index 75585a5..d47a62e 100644
--- a/drivers/scsi/g_NCR5380.c
+++ b/drivers/scsi/g_NCR5380.c
@@ -925,6 +925,7 @@ static struct scsi_host_template driver_template = {
 .sg_tablesize  = SG_ALL,
.cmd_per_lun= CMD_PER_LUN,
 .use_clustering= DISABLE_CLUSTERING,
+   .sense_buffsize = SCSI_SENSE_BUFFERSIZE,
 };
 #include <linux/module.h>
 #include "scsi_module.c"
diff --git a/drivers/scsi/mac_scsi.c b/drivers/scsi/mac_scsi.c
index 3b09ab2..abec17f 100644
--- a/drivers/scsi/mac_scsi.c
+++ b/drivers/scsi/mac_scsi.c
@@ -594,6 +594,7 @@ static struct scsi_host_template driver_template = {
.cmd_per_lun= CMD_PER_LUN,
.unchecked_isa_dma  = 0,
.use_clustering = DISABLE_CLUSTERING
+   .sense_buffsize = SCSI_SENSE_BUFFERSIZE,
 };
 
 
diff --git a/drivers/scsi/pas16.c b/drivers/scsi/pas16.c
index f2018b4..90ac61f 100644
--- a/drivers/scsi/pas16.c
+++ b/drivers/scsi/pas16.c
@@ -628,6 +628,7 @@ static struct scsi_host_template driver_template = {
.sg_tablesize   = SG_ALL,
.cmd_per_lun= CMD_PER_LUN,
.use_clustering = DISABLE_CLUSTERING,
+   .sense_buffsize = SCSI_SENSE_BUFFERSIZE,
 };
 #include "scsi_module.c"
 
diff --git 

[PATCH 14/24][RFC]] dc395x: Use scsi_eh API for REQUEST_SENSE invocation

2008-02-04 Thread Boaz Harrosh
  - Use scsi_eh_{prep,restore}_cmnd() for synchronous
    REQUEST_SENSE invocation. This simplifies the code a lot, because
    it can now use the regular command invocation code path.
  - Use new sense accessors where needed.
  - Use scsi_print_sense() (which has been there for ages) in place
    of the driver's home-made one. (Is that still needed?)

Signed-off-by: Boaz Harrosh [EMAIL PROTECTED]
---
 drivers/scsi/dc395x.c |  140 
 1 files changed, 24 insertions(+), 116 deletions(-)

diff --git a/drivers/scsi/dc395x.c b/drivers/scsi/dc395x.c
index 22ef371..5e92fcc 100644
--- a/drivers/scsi/dc395x.c
+++ b/drivers/scsi/dc395x.c
@@ -64,6 +64,8 @@
 #include <scsi/scsi_cmnd.h>
 #include <scsi/scsi_device.h>
 #include <scsi/scsi_host.h>
+#include <scsi/scsi_eh.h>
+#include <scsi/scsi_dbg.h>
 
 #include "dc395x.h"
 
@@ -236,16 +238,8 @@ struct ScsiReqBlk {
 	u8 sg_index;			/* Index of HW sg entry for this request */
 	size_t total_xfer_length;	/* Total number of bytes remaining to be transfered */
 	size_t request_length;		/* Total number of bytes in this request */
-	/*
-	 * The sense buffer handling function, request_sense, uses
-	 * the first hw sg entry (segment_x[0]) and the transfer
-	 * length (total_xfer_length). While doing this it stores the
-	 * original values into the last sg hw list
-	 * (srb->segment_x[DC395x_MAX_SG_LISTENTRY - 1] and the
-	 * total_xfer_length in xferred. These values are restored in
-	 * pci_unmap_srb_sense. This is the only place xferred is used.
-	 */
-	size_t xferred;			/* Saved copy of total_xfer_length */
+
+	struct scsi_eh_save ses;
 
 	u16 state;
 
@@ -1624,18 +1618,11 @@ static u8 start_scsi(struct AdapterCtlBlk* acb, struct DeviceCtlBlk* dcb,
 	dprintkdbg(DBG_KG, "start_scsi: (pid#%li) <%02i-%i> cmnd=0x%02x tag=%i\n",
 		srb->cmd->serial_number, srb->cmd->device->id, srb->cmd->device->lun,
 		srb->cmd->cmnd[0], srb->tag_number);
-	if (srb->flag & AUTO_REQSENSE) {
-		DC395x_write8(acb, TRM_S1040_SCSI_FIFO, REQUEST_SENSE);
-		DC395x_write8(acb, TRM_S1040_SCSI_FIFO, (dcb->target_lun << 5));
-		DC395x_write8(acb, TRM_S1040_SCSI_FIFO, 0);
-		DC395x_write8(acb, TRM_S1040_SCSI_FIFO, 0);
-		DC395x_write8(acb, TRM_S1040_SCSI_FIFO, SCSI_SENSE_BUFFERSIZE);
-		DC395x_write8(acb, TRM_S1040_SCSI_FIFO, 0);
-	} else {
-		ptr = (u8 *)srb->cmd->cmnd;
-		for (i = 0; i < srb->cmd->cmd_len; i++)
-			DC395x_write8(acb, TRM_S1040_SCSI_FIFO, *ptr++);
-	}
+	ptr = (u8 *)srb->cmd->cmnd;
+
+	for (i = 0; i < srb->cmd->cmd_len; i++)
+		DC395x_write8(acb, TRM_S1040_SCSI_FIFO, *ptr++);
+
    no_cmd:
 	DC395x_write16(acb, TRM_S1040_SCSI_CONTROL,
 		       DO_HWRESELECT | DO_DATALATCH);
@@ -1894,29 +1881,19 @@ static void command_phase0(struct AdapterCtlBlk *acb, struct ScsiReqBlk *srb,
 static void command_phase1(struct AdapterCtlBlk *acb, struct ScsiReqBlk *srb,
 		u16 *pscsi_status)
 {
-	struct DeviceCtlBlk *dcb;
 	u8 *ptr;
 	u16 i;
 	dprintkdbg(DBG_0, "command_phase1: (pid#%li)\n", srb->cmd->serial_number);
 
 	clear_fifo(acb, "command_phase1");
 	DC395x_write16(acb, TRM_S1040_SCSI_CONTROL, DO_CLRATN);
-	if (!(srb->flag & AUTO_REQSENSE)) {
-		ptr = (u8 *)srb->cmd->cmnd;
-		for (i = 0; i < srb->cmd->cmd_len; i++) {
-			DC395x_write8(acb, TRM_S1040_SCSI_FIFO, *ptr);
-			ptr++;
-		}
-	} else {
-		DC395x_write8(acb, TRM_S1040_SCSI_FIFO, REQUEST_SENSE);
-		dcb = acb->active_dcb;
-		/* target id */
-		DC395x_write8(acb, TRM_S1040_SCSI_FIFO, (dcb->target_lun << 5));
-		DC395x_write8(acb, TRM_S1040_SCSI_FIFO, 0);
-		DC395x_write8(acb, TRM_S1040_SCSI_FIFO, 0);
-		DC395x_write8(acb, TRM_S1040_SCSI_FIFO, SCSI_SENSE_BUFFERSIZE);
-		DC395x_write8(acb, TRM_S1040_SCSI_FIFO, 0);
+
+	ptr = (u8 *)srb->cmd->cmnd;
+	for (i = 0; i < srb->cmd->cmd_len; i++) {
+		DC395x_write8(acb, TRM_S1040_SCSI_FIFO, *ptr);
+		ptr++;
 	}
+
 	srb->state |= SRB_COMMAND;
 	/* it's important for atn stop */
 	DC395x_write16(acb, TRM_S1040_SCSI_CONTROL, DO_DATALATCH);
@@ -3290,17 +3267,9 @@ static void pci_unmap_srb_sense(struct AdapterCtlBlk *acb,
 {
 	if (!(srb->flag & AUTO_REQSENSE))
 		return;
-	/* Unmap sense buffer */
-	dprintkdbg(DBG_SG, "pci_unmap_srb_sense: buffer=%08x\n",
-	       srb->segment_x[0].address);
-	pci_unmap_single(acb->dev, srb->segment_x[0].address,
-			 srb->segment_x[0].length, PCI_DMA_FROMDEVICE);
-	/* Restore SG stuff */
-   

[PATCH 12/24][RFC] 53c700: Use scsi_eh API for REQUEST_SENSE invocation

2008-02-04 Thread Boaz Harrosh
  - Use scsi_eh_{prep,restore}_cmnd() for synchronous
    REQUEST_SENSE invocation.
  - Refactor some code that is now commonly used in 2
    places.

Signed-off-by: Boaz Harrosh [EMAIL PROTECTED]
---
 drivers/scsi/53c700.c |  134 +
 drivers/scsi/53c700.h |   20 +++-
 2 files changed, 54 insertions(+), 100 deletions(-)

diff --git a/drivers/scsi/53c700.c b/drivers/scsi/53c700.c
index f5a9add..9b5c8d1 100644
--- a/drivers/scsi/53c700.c
+++ b/drivers/scsi/53c700.c
@@ -336,6 +336,8 @@ NCR_700_detect(struct scsi_host_template *tpnt,
 	if(tpnt->proc_name == NULL)
 		tpnt->proc_name = "53c700";
 
+	tpnt->sense_buffsize = SCSI_SENSE_BUFFERSIZE;
+
 	host = scsi_host_alloc(tpnt, 4);
 	if (!host)
 		return NULL;
@@ -578,6 +580,34 @@ save_for_reselection(struct NCR_700_Host_Parameters *hostdata,
 	hostdata->cmd = NULL;
 }
 
+STATIC void
+NCR_700_map(struct NCR_700_Host_Parameters *hostdata, struct scsi_cmnd *SCp,
+	    struct NCR_700_command_slot *slot, int move_ins)
+{
+	int i;
+	int sg_count;
+	struct scatterlist *sg;
+
+	sg_count = scsi_dma_map(SCp);
+	BUG_ON(sg_count < 0);
+
+	scsi_for_each_sg(SCp, sg, sg_count, i) {
+		dma_addr_t vPtr = sg_dma_address(sg);
+		__u32 count = sg_dma_len(sg);
+
+		slot->SG[i].ins = bS_to_host(move_ins | count);
+		DEBUG((" scatter block %d: move %d[%08x] from 0x%lx\n",
+		       i, count, slot->SG[i].ins, (unsigned long)vPtr));
+		slot->SG[i].pAddr = bS_to_host(vPtr);
+	}
+	slot->SG[i].ins = bS_to_host(SCRIPT_RETURN);
+	slot->SG[i].pAddr = 0;
+	dma_cache_sync(hostdata->dev, slot->SG, sizeof(slot->SG), DMA_TO_DEVICE);
+	DEBUG((" SETTING %08lx to %x\n",
+	       (&slot->pSG[i].ins),
+	       slot->SG[i].ins));
+}
+
 STATIC inline void
 NCR_700_unmap(struct NCR_700_Host_Parameters *hostdata, struct scsi_cmnd *SCp,
 	      struct NCR_700_command_slot *slot)
@@ -598,26 +628,18 @@ NCR_700_scsi_done(struct NCR_700_Host_Parameters *hostdata,
 	struct NCR_700_command_slot *slot =
 		(struct NCR_700_command_slot *)SCp->host_scribble;
 
-	dma_unmap_single(hostdata->dev, slot->pCmd,
-			 MAX_COMMAND_SIZE, DMA_TO_DEVICE);
+	NCR_700_unmap(hostdata, SCp, slot);
 	if (slot->flags == NCR_700_FLAG_AUTOSENSE) {
-		char *cmnd = NCR_700_get_sense_cmnd(SCp->device);
+		struct NCR_700_Device_Parameters *ndp =
+			NCR_700_Device_Parameters(SCp->device);
 #ifdef NCR_700_DEBUG
 		printk(" ORIGINAL CMD %p RETURNED %d, new return is %d sense is\n",
 		       SCp, SCp->cmnd[7], result);
 		scsi_print_sense("53c700", SCp);
 
 #endif
-		dma_unmap_single(hostdata->dev, slot->dma_handle,
-				 SCSI_SENSE_BUFFERSIZE, DMA_FROM_DEVICE);
-		/* restore the old result if the request sense was
-		 * successful */
-		if (result == 0)
-			result = cmnd[7];
-		/* restore the original length */
-		SCp->cmd_len = cmnd[8];
-	} else
-		NCR_700_unmap(hostdata, SCp, slot);
+		scsi_eh_restore_cmnd(SCp, &ndp->ses);
+	}
 
 	free_slot(slot, hostdata);
 #ifdef NCR_700_DEBUG
@@ -988,8 +1010,8 @@ process_script_interrupt(__u32 dsps, __u32 dsp, struct scsi_cmnd *SCp,
 				       "broken device is looping in contingent allegiance: ignoring\n");
 				NCR_700_scsi_done(hostdata, SCp, hostdata->status[0]);
 			} else {
-				char *cmnd =
-					NCR_700_get_sense_cmnd(SCp->device);
+				struct NCR_700_Device_Parameters *ndp =
+					NCR_700_Device_Parameters(SCp->device);
 #ifdef NCR_DEBUG
 				scsi_print_command(SCp);
 				printk("  cmd %p has status %d, requesting sense\n",
@@ -1007,32 +1029,14 @@ process_script_interrupt(__u32 dsps, __u32 dsp, struct scsi_cmnd *SCp,
 						 MAX_COMMAND_SIZE,
 						 DMA_TO_DEVICE);
 
-				cmnd[0] = REQUEST_SENSE;
-				cmnd[1] = (SCp->device->lun & 0x7) << 5;
-				cmnd[2] = 0;
-				cmnd[3] = 0;
-				cmnd[4] = SCSI_SENSE_BUFFERSIZE;
-				cmnd[5] = 0;
-

Re: Integration of SCST in the mainstream Linux kernel

2008-02-04 Thread Vladislav Bolkhovitin

Bart Van Assche wrote:
> On Feb 4, 2008 1:27 PM, Vladislav Bolkhovitin [EMAIL PROTECTED] wrote:
>
>> So, James, what is your opinion on the above? Or the overall SCSI target
>> project simplicity doesn't matter much for you and you think it's fine
>> to duplicate Linux page cache in the user space to keep the in-kernel
>> part of the project as small as possible?
>
> It's too early to draw conclusions about performance. I'm currently
> performing more measurements, and the results are not easy to
> interpret. My plan is to measure the following:
> * Setup: target with RAM disk of 2 GB as backing storage.
> * Throughput reported by dd and xdd (direct I/O).
> * Transfers with dd/xdd in units of 1 KB to 1 GB (the smallest
> transfer size that can be specified to xdd is 1 KB).
> * Target SCSI software to be tested: IETD iSCSI via IPoIB, STGT iSCSI
> via IPoIB, STGT iSER, SCST iSCSI via IPoIB, SCST SRP, LIO iSCSI via
> IPoIB.
>
> The reason I chose dd/xdd for these tests is that I want to measure
> the performance of the communication protocols, and that I am assuming
> that this performance can be modeled by the following formula:
> (transfer time in s) = (transfer setup latency in s) + (transfer size
> in MB) / (bandwidth in MB/s).


It isn't fully correct; you forgot about link latency. A more accurate
formula is:

(transfer time) = (transfer setup latency on both initiator and target,
consisting of software processing time, including memory copy if
necessary, and PCI setup/transfer time) + (transfer size)/(bandwidth) +
(link latency to deliver the request for READs or the status for WRITEs) +
(2*(link latency) to deliver the R2T/XFER_READY request in the case of
WRITEs, if necessary (e.g. iSER for small transfers might not need it,
but SRP most likely always needs it)).

Also note that this is correct only for single-threaded workloads with
one outstanding command at a time. For other workloads it depends on how
well they manage to keep the link full, in the interval from
(transfer size)/(transfer time) to bandwidth.
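
As a rough sketch of the extended model for the single-outstanding-command
case, in userspace C: the struct and function names are invented for this
illustration, and the three parameters are assumed to have been measured or
estimated separately.

```c
#include <stdbool.h>

/* Hypothetical parameter set for one initiator/target/link combination. */
struct transfer_params {
	double setup_latency_s;	/* initiator + target software and PCI setup */
	double link_latency_s;	/* one-way link latency */
	double bandwidth_mb_s;	/* link bandwidth in MB/s */
};

/*
 * Estimated time for one transfer with a single outstanding command.
 * needs_ready_exchange models the R2T/XFER_READY round trip that some
 * protocols require before a WRITE can send its data.
 */
static double estimate_transfer_time(const struct transfer_params *p,
				     double size_mb, bool is_write,
				     bool needs_ready_exchange)
{
	double t = p->setup_latency_s + size_mb / p->bandwidth_mb_s;

	/* Request delivery (READ) or status delivery (WRITE). */
	t += p->link_latency_s;

	/* Extra round trip for the ready-to-transfer handshake. */
	if (is_write && needs_ready_exchange)
		t += 2.0 * p->link_latency_s;

	return t;
}
```

With these extra terms the intercept of a simple straight-line fit is no
longer a single latency: it mixes setup time with one or three link
latencies depending on direction and protocol.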



> Measuring the time needed for transfers
> with varying block size allows to compute the constants in the above
> formula via linear regression.


Unfortunately, it isn't so easy, see above.


> One difficulty I already encountered is that the performance of the
> Linux IPoIB implementation varies a lot under high load
> (http://bugzilla.kernel.org/show_bug.cgi?id=9883).
>
> Another issue I have to look further into is that dd and xdd report
> different results for very large block sizes (> 1 MB).


Look at /proc/scsi_tgt/sgv (for SCST) and you will see which transfer
sizes are actually used. Initiators don't like sending big requests and
often split them into smaller ones.


Look at this message as well, it might be helpful: 
http://lkml.org/lkml/2007/5/16/223



> Bart Van Assche.


[PATCH 6/24][RFC] gdth: Use of scsi_eh API and sense accessors

2008-02-04 Thread Boaz Harrosh
  Use of new scsi_eh API for setting sense information into
  the scsi command.

Signed-off-by: Boaz Harrosh [EMAIL PROTECTED]
---
 drivers/scsi/gdth.c |   47 ++-
 drivers/scsi/gdth.h |1 +
 2 files changed, 27 insertions(+), 21 deletions(-)

diff --git a/drivers/scsi/gdth.c b/drivers/scsi/gdth.c
index c825239..9fdd5ef 100644
--- a/drivers/scsi/gdth.c
+++ b/drivers/scsi/gdth.c
@@ -2098,6 +2098,16 @@ static void gdth_putq(gdth_ha_str *ha, Scsi_Cmnd *scp, unchar priority)
 #endif
 }
 
+static void gdth_set_4byte_sense(struct scsi_cmnd *scp, u8 sense_code)
+{
+   u8 sense[4];
+
+   memset(sense, 0, sizeof(sense));
+   sense[0] = 0x70;
+   sense[2] = sense_code;
+   scsi_eh_cpy_sense(scp, sense, sizeof(sense));
+}
+
 static void gdth_next(gdth_ha_str *ha)
 {
 register Scsi_Cmnd *pscp;
@@ -2199,9 +2209,7 @@ static void gdth_next(gdth_ha_str *ha)
 this_cmd = FALSE;
 next_cmd = FALSE;
 } else {
-memset((char*)nscp-sense_buffer,0,16);
-nscp-sense_buffer[0] = 0x70;
-nscp-sense_buffer[2] = NOT_READY;
+   gdth_set_4byte_sense(nscp, NOT_READY);
 nscp-result = (DID_OK  16) | (CHECK_CONDITION  1);
 if (!nscp_cmndinfo-wait_for_completion)
 nscp_cmndinfo-wait_for_completion++;
@@ -2244,9 +2252,7 @@ static void gdth_next(gdth_ha_str *ha)
 TRACE2(("cmd 0x%x target %d: UNIT_ATTENTION\n",
  nscp->cmnd[0], t));
 ha->hdr[t].media_changed = FALSE;
-memset((char*)nscp->sense_buffer,0,16);
-nscp->sense_buffer[0] = 0x70;
-nscp->sense_buffer[2] = UNIT_ATTENTION;
+gdth_set_4byte_sense(nscp, UNIT_ATTENTION);
 nscp->result = (DID_OK << 16) | (CHECK_CONDITION << 1);
 if (!nscp_cmndinfo->wait_for_completion)
 nscp_cmndinfo->wait_for_completion++;
@@ -2263,7 +2269,7 @@ static void gdth_next(gdth_ha_str *ha)
 if ( (nscp->cmnd[4]&1) && !(ha->hdr[t].devtype&1) ) {
 TRACE(("Prevent r. nonremov. drive->do nothing\n"));
 nscp->result = DID_OK << 16;
-nscp->sense_buffer[0] = 0;
+scsi_eh_reset_sense(nscp);
 if (!nscp_cmndinfo->wait_for_completion)
 nscp_cmndinfo->wait_for_completion++;
 else
@@ -2296,9 +2302,7 @@ static void gdth_next(gdth_ha_str *ha)
 TRACE2(("cmd 0x%x target %d: UNIT_ATTENTION\n",
  nscp->cmnd[0], t));
 ha->hdr[t].media_changed = FALSE;
-memset((char*)nscp->sense_buffer,0,16);
-nscp->sense_buffer[0] = 0x70;
-nscp->sense_buffer[2] = UNIT_ATTENTION;
+gdth_set_4byte_sense(nscp, UNIT_ATTENTION);
 nscp->result = (DID_OK << 16) | (CHECK_CONDITION << 1);
 if (!nscp_cmndinfo->wait_for_completion)
 nscp_cmndinfo->wait_for_completion++;
@@ -2410,7 +2414,6 @@ static int gdth_internal_cache_cmd(gdth_ha_str *ha, Scsi_Cmnd *scp)
scp->cmnd[0],t));
 
 scp->result = DID_OK << 16;
-scp->sense_buffer[0] = 0;
 
 switch (scp-cmnd[0]) {
   case TEST_UNIT_READY:
@@ -2726,8 +2729,8 @@ static int gdth_fill_raw_cmd(gdth_ha_str *ha, Scsi_Cmnd *scp, unchar b)
 }
 
 } else {
-page = virt_to_page(scp->sense_buffer);
-offset = (ulong)scp->sense_buffer & ~PAGE_MASK;
+page = virt_to_page(cmndinfo->sense);
+offset = (ulong)cmndinfo->sense & ~PAGE_MASK;
 sense_paddr = pci_map_page(ha->pdev,page,offset,
 16,PCI_DMA_FROMDEVICE);
 
@@ -3395,9 +3398,14 @@ static int gdth_sync_event(gdth_ha_str *ha, int service, unchar index,
 pci_unmap_sg(ha->pdev, gdth_sglist(scp), gdth_sg_count(scp),
  cmndinfo->dma_dir);
 
-if (cmndinfo->sense_paddr)
+if (cmndinfo->sense_paddr) {
 pci_unmap_page(ha->pdev, cmndinfo->sense_paddr, 16,
 PCI_DMA_FROMDEVICE);
+/* this here is called before gdth_next so it will not
+ * overwrite fake sense returned there.
+ */
+scsi_eh_cpy_sense(scp, cmndinfo->sense, 16);
+   }
 
 if (ha->status == S_OK) {
 cmndinfo->status = S_OK;
@@ -3441,7 +3449,7 @@ static int gdth_sync_event(gdth_ha_str *ha, int service, unchar index,
 ha->hdr[t].cluster_type &= ~CLUSTER_RESERVED;
 }   
 scp->result = DID_OK << 16;
-scp->sense_buffer[0] = 0;
+scsi_eh_reset_sense(scp);
 }
 } else {
 cmndinfo-status = 
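
For readers following the sense layout, the 4-byte block that gdth_set_4byte_sense() builds can be reproduced in a small userspace sketch. This is not the driver code: struct demo_cmd and demo_copy_sense() are hypothetical stand-ins for struct scsi_cmnd and the proposed scsi_eh_cpy_sense(), and whether the real helper zeroes the rest of the buffer is an assumption made here. The response code 0x70 and the sense key values (NOT READY = 0x02, UNIT ATTENTION = 0x06) are the standard fixed-format ones.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_SENSE_SIZE		96	/* assumed buffer size, illustration only */
#define DEMO_NOT_READY		0x02	/* SCSI sense key: NOT READY */
#define DEMO_UNIT_ATTENTION	0x06	/* SCSI sense key: UNIT ATTENTION */

/* Hypothetical stand-in for the sense area of a SCSI command. */
struct demo_cmd {
	uint8_t sense[DEMO_SENSE_SIZE];
};

/* Hypothetical accessor: copy sense bytes in and zero the remainder. */
static void demo_copy_sense(struct demo_cmd *cmd, const uint8_t *src, size_t len)
{
	assert(cmd != NULL && src != NULL);
	if (len > sizeof(cmd->sense))
		len = sizeof(cmd->sense);
	memset(cmd->sense, 0, sizeof(cmd->sense));
	memcpy(cmd->sense, src, len);
}

/* Mirrors the shape of the helper in the patch: 0x70 plus a sense key. */
static void demo_set_4byte_sense(struct demo_cmd *cmd, uint8_t sense_key)
{
	uint8_t sense[4];

	memset(sense, 0, sizeof(sense));
	sense[0] = 0x70;	/* fixed format, current error */
	sense[2] = sense_key;	/* sense key lives in the low nibble of byte 2 */
	demo_copy_sense(cmd, sense, sizeof(sense));
}
```

The point of routing everything through one accessor is that callers stop touching the buffer directly, which is what lets the field be moved or removed later.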

[PATCH 10/24][RFC] usb/microtek: No special handling for REQUEST_SENSE command please

2008-02-04 Thread Boaz Harrosh
  - Request sense command now comes like any other command with sg-list
and regular dma mapping. So just remove that special handling.
No thanks!
  - Some left-over cleanup from the accessors and not use_sg patch
Signed-off-by: Boaz Harrosh [EMAIL PROTECTED]
---
 drivers/usb/image/microtek.c |   20 +---
 1 files changed, 5 insertions(+), 15 deletions(-)

diff --git a/drivers/usb/image/microtek.c b/drivers/usb/image/microtek.c
index bc207e3..3ff0474 100644
--- a/drivers/usb/image/microtek.c
+++ b/drivers/usb/image/microtek.c
@@ -480,23 +480,13 @@ static void mts_command_done( struct urb *transfer )
return;
}
 
-   if (context->srb->cmnd[0] == REQUEST_SENSE) {
-   mts_int_submit_urb(transfer,
-  context->data_pipe,
-  context->srb->sense_buffer,
+   if (context->data)
+   mts_int_submit_urb(transfer, context->data_pipe, context->data,
   context->data_length,
-  mts_data_done);
-   } else { if ( context->data ) {
-   mts_int_submit_urb(transfer,
-  context->data_pipe,
-  context->data,
-  context->data_length,
-  scsi_sg_count(context->srb) > 1 ?
-  mts_do_sg : mts_data_done);
-   } else {
+  scsi_sg_count(context->srb) > 1 ?
+mts_do_sg : mts_data_done);
+   else
mts_get_status(transfer);
-   }
-   }
 
return;
 }
-- 
1.5.3.3



[PATCH 4/24][RFC] firewire ieee1394: Simple convert to new scsi_eh_cpy_sense.

2008-02-04 Thread Boaz Harrosh
  Abstract away scsi_cmnd->sense_buffer for later removal.

Signed-off-by: Boaz Harrosh [EMAIL PROTECTED]
---
 drivers/firewire/fw-sbp2.c |7 +--
 drivers/ieee1394/sbp2.c|9 +++--
 2 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/drivers/firewire/fw-sbp2.c b/drivers/firewire/fw-sbp2.c
index 1d9602b..0404650 100644
--- a/drivers/firewire/fw-sbp2.c
+++ b/drivers/firewire/fw-sbp2.c
@@ -46,6 +46,7 @@
 #include <scsi/scsi_cmnd.h>
 #include <scsi/scsi_device.h>
 #include <scsi/scsi_host.h>
+#include <scsi/scsi_eh.h>
 
 #include "fw-transaction.h"
 #include "fw-topology.h"
@@ -1016,8 +1017,9 @@ static struct fw_driver sbp2_driver = {
 };
 
 static unsigned int
-sbp2_status_to_sense_data(u8 *sbp2_status, u8 *sense_data)
+sbp2_status_to_sense_data(u8 *sbp2_status, struct scsi_cmnd *srb)
 {
+   u8 sense_data[16];
int sam_status;
 
sense_data[0] = 0x70;
@@ -1036,6 +1038,7 @@ sbp2_status_to_sense_data(u8 *sbp2_status, u8 *sense_data)
sense_data[13] = sbp2_status[3];
sense_data[14] = sbp2_status[12];
sense_data[15] = sbp2_status[13];
+   scsi_eh_cpy_sense(srb, sense_data, sizeof(sense_data));
 
sam_status = sbp2_status[0] & 0x3f;
 
@@ -1081,7 +1084,7 @@ complete_command_orb(struct sbp2_orb *base_orb, struct sbp2_status *status)
 
if (result == DID_OK << 16 && STATUS_GET_LEN(*status) > 1)
result = sbp2_status_to_sense_data(STATUS_GET_DATA(*status),
-  orb->cmd->sense_buffer);
+  orb->cmd);
} else {
/*
 * If the orb completes with status == NULL, something
diff --git a/drivers/ieee1394/sbp2.c b/drivers/ieee1394/sbp2.c
index 2b889d9..ed54c54 100644
--- a/drivers/ieee1394/sbp2.c
+++ b/drivers/ieee1394/sbp2.c
@@ -89,6 +89,7 @@
 #include <scsi/scsi_dbg.h>
 #include <scsi/scsi_device.h>
 #include <scsi/scsi_host.h>
+#include <scsi/scsi_eh.h>
 
 #include "csr1212.h"
 #include "highlevel.h"
@@ -1672,8 +1673,11 @@ static int sbp2_send_command(struct sbp2_lu *lu, struct scsi_cmnd *SCpnt,
  * Translates SBP-2 status into SCSI sense data for check conditions
  */
 static unsigned int sbp2_status_to_sense_data(unchar *sbp2_status,
- unchar *sense_data)
+ struct scsi_cmnd *SCpnt)
 {
+   u8 sense_data[16];
+
+   memset(sense_data, 0, sizeof(sense_data));
/* OK, it's pretty ugly... ;-) */
sense_data[0] = 0x70;
sense_data[1] = 0x0;
@@ -1691,6 +1695,7 @@ static unsigned int sbp2_status_to_sense_data(unchar *sbp2_status,
sense_data[13] = sbp2_status[11];
sense_data[14] = sbp2_status[20];
sense_data[15] = sbp2_status[21];
+   scsi_eh_cpy_sense(SCpnt, sense_data, sizeof(sense_data));
 
return sbp2_status[8] & 0x3f;
 }
@@ -1784,7 +1789,7 @@ static int sbp2_handle_status_write(struct hpsb_host *host, int nodeid,
 
if (STATUS_GET_LEN(h) > 1)
scsi_status = sbp2_status_to_sense_data(
-   (unchar *)sb, SCpnt->sense_buffer);
+   (unchar *)sb, SCpnt);
 
if (STATUS_TEST_DEAD(h))
 sbp2_agent_reset(lu, 0);
-- 
1.5.3.3



[PATCH 20/24][RFC] u14-34f: Use of scsi_make_sense() API for DMA-able sense buffer

2008-02-04 Thread Boaz Harrosh
  - Use a pre-allocated, DMA-mapped sense buffer for each command,
    obtained through the scsi_make_sense() API and handed back with
    scsi_return_sense() when done.
  - Set .pre_allocate_sense & .sense_buffsize at host template.

Signed-off-by: Boaz Harrosh [EMAIL PROTECTED]
---
 drivers/scsi/u14-34f.c |   22 ++
 1 files changed, 14 insertions(+), 8 deletions(-)

diff --git a/drivers/scsi/u14-34f.c b/drivers/scsi/u14-34f.c
index 662c004..ec66899 100644
--- a/drivers/scsi/u14-34f.c
+++ b/drivers/scsi/u14-34f.c
@@ -429,6 +429,7 @@
 #include <scsi/scsi_host.h>
 #include <scsi/scsi_tcq.h>
 #include <scsi/scsicam.h>
+#include <scsi/scsi_eh.h>
 
 static int u14_34f_detect(struct scsi_host_template *);
 static int u14_34f_release(struct Scsi_Host *);
@@ -451,6 +452,8 @@ static struct scsi_host_template driver_template = {
 .this_id = 7,
 .unchecked_isa_dma   = 1,
 .use_clustering  = ENABLE_CLUSTERING,
+   .pre_allocate_sense  = 1,
+   .sense_buffsize  = SCSI_SENSE_BUFFERSIZE,
 };
 
 #if !defined(__BIG_ENDIAN_BITFIELD) && !defined(__LITTLE_ENDIAN_BITFIELD)
@@ -577,6 +580,7 @@ struct mscp {
 
/* Additional fields begin here. */
struct scsi_cmnd *SCpnt;
+   u8 *sense_buffer;
unsigned int cpp_index;  /* cp index */
 
/* All the cp structure is zero filled by queuecommand except the
@@ -1118,8 +1122,9 @@ static void map_dma(unsigned int i, unsigned int j) {
cpp = &HD(j)->cp[i]; SCpnt = cpp->SCpnt;
pci_dir = SCpnt->sc_data_direction;
 
-   if (SCpnt->sense_buffer)
-  cpp->sense_addr = H2DEV(pci_map_single(HD(j)->pdev, SCpnt->sense_buffer,
+   cpp->sense_buffer = scsi_make_sense(SCpnt);
+   BUG_ON(!cpp->sense_buffer);
+   cpp->sense_addr = H2DEV(pci_map_single(HD(j)->pdev, cpp->sense_buffer,
SCSI_SENSE_BUFFERSIZE, PCI_DMA_FROMDEVICE));
 
cpp->sense_len = SCSI_SENSE_BUFFERSIZE;
@@ -1144,7 +1149,7 @@ static void map_dma(unsigned int i, unsigned int j) {
 
} else {
   pci_dir = PCI_DMA_BIDIRECTIONAL;
-  cpp->data_len = H2DEV(scsi_bufflen(SCpnt));
+  cpp->data_len = 0;
}
 }
 
@@ -1156,9 +1161,9 @@ static void unmap_dma(unsigned int i, unsigned int j) {
cpp = &HD(j)->cp[i]; SCpnt = cpp->SCpnt;
pci_dir = SCpnt->sc_data_direction;
 
-   if (DEV2H(cpp->sense_addr))
-  pci_unmap_single(HD(j)->pdev, DEV2H(cpp->sense_addr),
+   pci_unmap_single(HD(j)->pdev, DEV2H(cpp->sense_addr),
DEV2H(cpp->sense_len), PCI_DMA_FROMDEVICE);
+   scsi_return_sense(SCpnt, cpp->sense_buffer);
 
scsi_dma_unmap(SCpnt);
 
@@ -1298,6 +1303,7 @@ static int u14_34f_queuecommand(struct scsi_cmnd *SCpnt, void (*done)(struct scs
/* Use data transfer direction SCpnt->sc_data_direction */
scsi_to_dev_dir(i, j);
 
+   /* FIXME:map_dma can fail */
/* Map DMA buffers and SG list */
map_dma(i, j);
 
@@ -1821,7 +1827,7 @@ static irqreturn_t ihdlr(int irq, unsigned int j) {
  /* Works around a flaw in scsi.c */
  else if (tstatus == CHECK_CONDITION
 && SCpnt->device->type == TYPE_DISK
-   && (SCpnt->sense_buffer[2] & 0xf) == RECOVERED_ERROR)
+   && (spp->sense_buffer[2] & 0xf) == RECOVERED_ERROR)
 status = DID_BUS_BUSY << 16;
 
  else
@@ -1832,11 +1838,11 @@ static irqreturn_t ihdlr(int irq, unsigned int j) {
 
  if (spp->target_status && SCpnt->device->type == TYPE_DISK &&
  (!(tstatus == CHECK_CONDITION && HD(j)->iocount <= 1000 &&
-   (SCpnt->sense_buffer[2] & 0xf) == NOT_READY)))
+   (spp->sense_buffer[2] & 0xf) == NOT_READY)))
 scmd_printk(KERN_INFO, SCpnt,
"ihdlr, pid %ld, target_status 0x%x, sense key 0x%x.\n",
SCpnt->serial_number, spp->target_status,
-   SCpnt->sense_buffer[2]);
+   spp->sense_buffer[2]);
 
  HD(j)->target_to[scmd_id(SCpnt)][scmd_channel(SCpnt)] = 0;
 
-- 
1.5.3.3



[PATCH 15/24][RFC] tmscsim: Use scsi_eh API for REQUEST_SENSE invocation

2008-02-04 Thread Boaz Harrosh
  - Use scsi_eh_prep/restore_cmnd() for synchronous
REQUEST_SENSE invocation.
  - Use new sense accessors where needed.
  - Cleanup of no longer used bits

Signed-off-by: Boaz Harrosh [EMAIL PROTECTED]
---
 drivers/scsi/tmscsim.c |   72 ++-
 drivers/scsi/tmscsim.h |   11 +--
 2 files changed, 12 insertions(+), 71 deletions(-)

diff --git a/drivers/scsi/tmscsim.c b/drivers/scsi/tmscsim.c
index 5b04ddf..2447da9 100644
--- a/drivers/scsi/tmscsim.c
+++ b/drivers/scsi/tmscsim.c
@@ -242,7 +242,6 @@
 #include <scsi/scsicam.h>
 #include <scsi/scsi_tcq.h>
 
-
 #define DC390_BANNER "Tekram DC390/AM53C974"
 #define DC390_VERSION "2.1d 2004-05-27"
 
@@ -428,33 +427,13 @@ static __inline__ void dc390_Going_remove (struct dc390_dcb* pDCB, struct dc390_
pDCB->GoingSRBCnt--;
 }
 
-static struct scatterlist* dc390_sg_build_single(struct scatterlist *sg, void *addr, unsigned int length)
-{
-   sg_init_one(sg, addr, length);
-   return sg;
-}
-
 /* Create pci mapping */
 static int dc390_pci_map (struct dc390_srb* pSRB)
 {
int error = 0;
struct scsi_cmnd *pcmd = pSRB->pcmd;
-   struct pci_dev *pdev = pSRB->pSRBDCB->pDCBACB->pdev;
-   dc390_cmd_scp_t* cmdp = ((dc390_cmd_scp_t*)(&pcmd->SCp));
-
-   /* Map sense buffer */
-   if (pSRB->SRBFlag & AUTO_REQSENSE) {
-   pSRB->pSegmentList  = dc390_sg_build_single(&pSRB->Segmentx, pcmd->sense_buffer, SCSI_SENSE_BUFFERSIZE);
-   pSRB->SGcount   = pci_map_sg(pdev, pSRB->pSegmentList, 1,
-DMA_FROM_DEVICE);
-   cmdp->saved_dma_handle  = sg_dma_address(pSRB->pSegmentList);
 
-   /* TODO: error handling */
-   if (pSRB->SGcount != 1)
-   error = 1;
-   DEBUG1(printk("%s(): Mapped sense buffer %p at %x\n", __FUNCTION__, pcmd->sense_buffer, cmdp->saved_dma_handle));
-   /* Map SG list */
-   } else if (scsi_sg_count(pcmd)) {
+   if (scsi_sg_count(pcmd)) {
int nseg;
 
nseg = scsi_dma_map(pcmd);
@@ -478,17 +457,10 @@ static int dc390_pci_map (struct dc390_srb* pSRB)
 static void dc390_pci_unmap (struct dc390_srb* pSRB)
 {
struct scsi_cmnd *pcmd = pSRB->pcmd;
-   struct pci_dev *pdev = pSRB->pSRBDCB->pDCBACB->pdev;
-   DEBUG1(dc390_cmd_scp_t* cmdp = ((dc390_cmd_scp_t*)(&pcmd->SCp)));
 
-   if (pSRB->SRBFlag) {
-   pci_unmap_sg(pdev, &pSRB->Segmentx, 1, DMA_FROM_DEVICE);
-   DEBUG1(printk("%s(): Unmapped sense buffer at %x\n", __FUNCTION__, cmdp->saved_dma_handle));
-   } else {
-   scsi_dma_unmap(pcmd);
-   DEBUG1(printk("%s(): Unmapped SG at %p with %d elements\n",
- __FUNCTION__, scsi_sglist(pcmd), scsi_sg_count(pcmd)));
-   }
+   scsi_dma_unmap(pcmd);
+   DEBUG1(printk("%s(): Unmapped SG at %p with %d elements\n",
+ __FUNCTION__, scsi_sglist(pcmd), scsi_sg_count(pcmd)));
 }
 
 static void __inline__
@@ -593,23 +565,10 @@ dc390_StartSCSI( struct dc390_acb* pACB, struct dc390_dcb* pDCB, struct dc390_sr
 /* Command is written in CommandPhase, if SEL_W_ATN_STOP ... */
 if (cmd != SEL_W_ATN_STOP)
   {
-   if( pSRB->SRBFlag & AUTO_REQSENSE )
- {
-   DC390_write8 (ScsiFifo, REQUEST_SENSE);
-   DC390_write8 (ScsiFifo, pDCB->TargetLUN << 5);
-   DC390_write8 (ScsiFifo, 0);
-   DC390_write8 (ScsiFifo, 0);
-   DC390_write8 (ScsiFifo, SCSI_SENSE_BUFFERSIZE);
-   DC390_write8 (ScsiFifo, 0);
-   DEBUG1(printk (KERN_DEBUG "DC390: AutoReqSense !\n"));
- }
-   else/* write cmnd to bus */ 
- {
u8 *ptr; u8 i;
ptr = (u8 *)scmd->cmnd;
for (i = 0; i < scmd->cmd_len; i++)
  DC390_write8 (ScsiFifo, *(ptr++));
- }
   }
 DEBUG0(if (pACB->pActiveDCB)   \
   printk (KERN_WARNING "DC390: ActiveDCB != 0\n"));
@@ -1369,30 +1328,17 @@ dc390_DataInPhase( struct dc390_acb* pACB, struct dc390_srb* pSRB, u8 *psstatus)
 static void
 dc390_CommandPhase( struct dc390_acb* pACB, struct dc390_srb* pSRB, u8 *psstatus)
 {
-struct dc390_dcb*   pDCB;
 u8  i, cnt;
 u8 *ptr;
 
 DC390_write8 (ScsiCmd, RESET_ATN_CMD);
 DC390_write8 (ScsiCmd, CLEAR_FIFO_CMD);
-if( !(pSRB->SRBFlag & AUTO_REQSENSE) )
-{
+
cnt = (u8) pSRB->pcmd->cmd_len;
ptr = (u8 *) pSRB->pcmd->cmnd;
for(i=0; i < cnt; i++)
DC390_write8 (ScsiFifo, *(ptr++));
-}
-else
-{
-   DC390_write8 (ScsiFifo, REQUEST_SENSE);
-   pDCB = pACB->pActiveDCB;
-   DC390_write8 (ScsiFifo, pDCB->TargetLUN << 5);
-   DC390_write8 (ScsiFifo, 0);
-   DC390_write8 (ScsiFifo, 0);
-   DC390_write8 (ScsiFifo, SCSI_SENSE_BUFFERSIZE);
-   DC390_write8 (ScsiFifo, 0);
-   DEBUG0(printk(KERN_DEBUG "DC390: AutoReqSense (CmndPhase)!\n"));
-}
+
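
The six bytes the removed code pushed into the FIFO by hand are an ordinary REQUEST SENSE CDB, which is why a generic command path can replace them. A minimal userspace sketch of that CDB follows. The function name and the DEMO_ constant are hypothetical, and reading the archived "TargetLUN  5" as a left shift by 5 (the legacy LUN field in byte 1) is an assumption, since the archive dropped the operator.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEMO_REQUEST_SENSE 0x03	/* SCSI opcode for REQUEST SENSE */

/*
 * Build a 6-byte REQUEST SENSE CDB the way the removed driver code did:
 * opcode, legacy LUN in the top three bits of byte 1, allocation length
 * in byte 4, everything else zero.
 */
static void demo_build_request_sense(uint8_t cdb[6], uint8_t lun,
				     uint8_t alloc_len)
{
	assert(cdb != NULL);
	assert(lun < 8);		/* only three bits are available */

	memset(cdb, 0, 6);
	cdb[0] = DEMO_REQUEST_SENSE;
	cdb[1] = (uint8_t)(lun << 5);
	cdb[4] = alloc_len;		/* how many sense bytes to return */
}
```

For LUN 2 and a 96-byte buffer this yields 03 40 00 00 60 00.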
 

Re: Integration of SCST in the mainstream Linux kernel

2008-02-04 Thread James Bottomley

On Mon, 2008-02-04 at 20:16 +0300, Vladislav Bolkhovitin wrote:
 James Bottomley wrote:
 So, James, what is your opinion on the above? Or the overall SCSI target 
 project simplicity doesn't matter much for you and you think it's fine 
 to duplicate Linux page cache in the user space to keep the in-kernel 
 part of the project as small as possible?
 
 
 The answers were pretty much contained here
 
 http://marc.info/?l=linux-scsi&m=120164008302435
 
 and here:
 
 http://marc.info/?l=linux-scsi&m=120171067107293
 
 Weren't they?
 
 No, sorry, it doesn't look so for me. They are about performance, but 
 I'm asking about the overall project's architecture, namely about one 
 part of it: simplicity. Particularly, what do you think about 
 duplicating Linux page cache in the user space to have zero-copy cached 
 I/O? Or can you suggest another architectural solution for that problem 
 in the STGT's approach?
  
  
  Isn't that an advantage of a user space solution?  It simply uses the
  backing store of whatever device supplies the data.  That means it takes
  advantage of the existing mechanisms for caching.
 
 No, please reread this thread, especially this message: 
 http://marc.info/?l=linux-kernel&m=120169189504361&w=2. This is one of 
 the advantages of the kernel space implementation. The user space 
 implementation has to have data copied between the cache and user space 
 buffer, but the kernel space one can use pages in the cache directly, 
 without extra copy.

Well, you've said it thrice (the bellman cried) but that doesn't make it
true.

The way a user space solution should work is to schedule mmapped I/O
from the backing store and then send this mmapped region off for target
I/O.  For reads, the page gather will ensure that the pages are up to
date from the backing store to the cache before sending the I/O out.
For writes, you actually have to do a msync on the region to get the
data secured to the backing store.  You also have to pull tricks with
the mmap region in the case of writes to prevent useless data being read
in from the backing store.  However, none of this involves data copies.
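
The write half of that scheme can be sketched in plain POSIX C. This is only an illustration under stated assumptions: demo_mmap_write() is a hypothetical name, the memcpy() merely stands in for the target hardware depositing write data into the mapped region (so it is not the zero-copy path itself), and the tricks for avoiding read-in of stale data are not shown.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: place len bytes into the backing file at `path`
 * through a shared mapping, then msync() so the data is secured to the
 * backing store.  Returns 0 on success, -1 on any failure.
 */
static int demo_mmap_write(const char *path, const void *data, size_t len)
{
	void *map;
	int fd, rc = -1;

	assert(path != NULL && data != NULL);
	if (len == 0)
		return -1;		/* mmap() rejects zero-length mappings */

	fd = open(path, O_RDWR | O_CREAT, 0600);
	if (fd < 0)
		return -1;
	if (ftruncate(fd, (off_t)len) < 0) {
		close(fd);
		return -1;
	}

	map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	close(fd);			/* the mapping outlives the descriptor */
	if (map == MAP_FAILED)
		return -1;

	/* Stand-in for the target I/O filling the mapped region. */
	memcpy(map, data, len);

	/* For writes: flush the dirty pages to the backing store. */
	if (msync(map, len, MS_SYNC) == 0)
		rc = 0;

	if (munmap(map, len) < 0)
		rc = -1;
	return rc;
}
```

MS_SYNC is used so the call returns only after the write-back has completed; MS_ASYNC would merely schedule it.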

James




[PATCH 13/24][RFC] aic7xxx_old: Use scsi_eh API for REQUEST_SENSE invocation

2008-02-04 Thread Boaz Harrosh
  - Use scsi_eh_prep/restore_cmnd() for synchronous
REQUEST_SENSE invocation.
  - Use new sense accessors where needed.

Signed-off-by: Boaz Harrosh [EMAIL PROTECTED]
---
 drivers/scsi/aic7xxx_old.c |   74 ---
 1 files changed, 21 insertions(+), 53 deletions(-)

diff --git a/drivers/scsi/aic7xxx_old.c b/drivers/scsi/aic7xxx_old.c
index 3bfd929..520d46a 100644
--- a/drivers/scsi/aic7xxx_old.c
+++ b/drivers/scsi/aic7xxx_old.c
@@ -786,10 +786,7 @@ struct aic7xxx_scb {
struct hw_scatterlist   *sg_list;   /* SG list in adapter format */
unsigned char   tag_action;
unsigned char   sg_count;
-   unsigned char   *sense_cmd; /*
-* Allocate 6 characters for
-* sense command.
-*/
+   struct scsi_eh_save ses;
unsigned char   *cmnd;
unsigned intsg_length;  /*
 * We init this during
@@ -823,9 +820,6 @@ static struct {
   { CIOPARERR, "CIOBUS Parity Error" }
 };
 
-static unsigned char
-generic_sense[] = { REQUEST_SENSE, 0, 0, 0, 255, 0 };
-
 typedef struct {
   scb_queue_type free_scbs;/*
 * SCBs assigned to free slot on
@@ -1277,6 +1271,8 @@ static void aic7xxx_print_sequencer(struct aic7xxx_host *p, int downloaded);
 #ifdef AIC7XXX_VERBOSE_DEBUGGING
 static void aic7xxx_check_scbs(struct aic7xxx_host *p, char *buffer);
 #endif
+static void aic7xxx_buildscb(struct aic7xxx_host *p, struct scsi_cmnd *cmd,
+struct aic7xxx_scb *scb);
 
 /
  *
@@ -2528,7 +2524,7 @@ static int
 aic7xxx_allocate_scb(struct aic7xxx_host *p)
 {
   struct aic7xxx_scb   *scbp = NULL;
-  int scb_size = (sizeof (struct hw_scatterlist) * AIC7XXX_MAX_SG) + 12 + 6;
+  int scb_size = (sizeof(struct hw_scatterlist) * AIC7XXX_MAX_SG) + 12;
   int i;
   int step = PAGE_SIZE / 1024;
   unsigned long scb_count = 0;
@@ -2598,9 +2594,8 @@ aic7xxx_allocate_scb(struct aic7xxx_host *p)
   scbp = &scb_ap[i];
   scbp->hscb = &p->scb_data->hscbs[p->scb_data->numscbs];
   scbp->sg_list = &hsgp[i * AIC7XXX_MAX_SG];
-  scbp->sense_cmd = bufs;
-  scbp->cmnd = bufs + 6;
-  bufs += 12 + 6;
+  scbp->cmnd = bufs;
+  bufs += 12;
   scbp->scb_dma = scb_dma;
   memset(scbp->hscb, 0, sizeof(struct aic7xxx_hwscb));
   scbp->hscb->tag = p->scb_data->numscbs;
@@ -2694,10 +2689,7 @@ aic7xxx_done(struct aic7xxx_host *p, struct aic7xxx_scb *scb)
 
   if (scb->flags & SCB_SENSE)
   {
-pci_unmap_single(p->pdev,
- le32_to_cpu(scb->sg_list[0].address),
- SCSI_SENSE_BUFFERSIZE,
- PCI_DMA_FROMDEVICE);
+   scsi_eh_restore_cmnd(cmd, &scb->ses);
   }
   if (scb->flags & SCB_RECOVERY_SCB)
   {
@@ -2720,8 +2712,8 @@ aic7xxx_done(struct aic7xxx_host *p, struct aic7xxx_scb *scb)
  * after failing to negotiate a wide or sync transfer message.
  */
 if ((scb->flags & SCB_SENSE) && 
-  ((scb->cmd->sense_buffer[12] == 0x43) ||  /* INVALID_MESSAGE */
-  (scb->cmd->sense_buffer[12] == 0x49))) /* MESSAGE_ERROR  */
+  ((scsi_sense(scb->cmd)[12] == 0x43) ||  /* INVALID_MESSAGE */
+  (scsi_sense(scb->cmd)[12] == 0x49))) /* MESSAGE_ERROR  */
 {
   message_error = TRUE;
 }
@@ -4263,18 +4255,8 @@ aic7xxx_handle_seqint(struct aic7xxx_host *p, unsigned char intstat)
  * Send a sense command to the requesting target.
  * XXX - revisit this and get rid of the memcopys.
  */
-memcpy(scb->sense_cmd, &generic_sense[0],
-   sizeof(generic_sense));
-
-scb->sense_cmd[1] = (cmd->device->lun << 5);
-scb->sense_cmd[4] = SCSI_SENSE_BUFFERSIZE;
-
-scb->sg_list[0].length = 
-  cpu_to_le32(SCSI_SENSE_BUFFERSIZE);
-   scb->sg_list[0].address =
-cpu_to_le32(pci_map_single(p->pdev, cmd->sense_buffer,
-   SCSI_SENSE_BUFFERSIZE,
-   PCI_DMA_FROMDEVICE));
+   scsi_eh_prep_cmnd(cmd, &scb->ses, NULL, 0, ~0);
+   aic7xxx_buildscb(p, cmd, scb);
 
 /*
  * XXX - We should allow disconnection, but can't as it
@@ -4283,21 +4265,6 @@ aic7xxx_handle_seqint(struct aic7xxx_host *p, unsigned char intstat)
 /* hscb->control &= DISCENB; */
 hscb->control = 0;
 hscb->target_status = 0;
-hscb->SG_list_pointer = 
- cpu_to_le32(SCB_DMA_ADDR(scb, scb->sg_list));
-hscb->SCSI_cmd_pointer = 
-  

Re: Integration of SCST in the mainstream Linux kernel

2008-02-04 Thread James Bottomley

On Mon, 2008-02-04 at 19:25 +0300, Vladislav Bolkhovitin wrote:
 James Bottomley wrote:
 Vladislav Bolkhovitin wrote:
 So, James, what is your opinion on the above? Or the overall SCSI target 
 project simplicity doesn't matter much for you and you think it's fine 
 to duplicate Linux page cache in the user space to keep the in-kernel 
 part of the project as small as possible?
  
  
  The answers were pretty much contained here
  
  http://marc.info/?l=linux-scsi&m=120164008302435
  
  and here:
  
  http://marc.info/?l=linux-scsi&m=120171067107293
  
  Weren't they?
 
 No, sorry, it doesn't look so for me. They are about performance, but 
 I'm asking about the overall project's architecture, namely about one 
 part of it: simplicity. Particularly, what do you think about 
 duplicating Linux page cache in the user space to have zero-copy cached 
 I/O? Or can you suggest another architectural solution for that problem 
 in the STGT's approach?

Isn't that an advantage of a user space solution?  It simply uses the
backing store of whatever device supplies the data.  That means it takes
advantage of the existing mechanisms for caching.

James




[PATCH 19/24][RFC] initio: Use of scsi_make_sense() API for DMA-able sense buffer

2008-02-04 Thread Boaz Harrosh
  - Use a pre-allocated, DMA-mapped sense buffer for each command,
    obtained through the scsi_make_sense() API and handed back with
    scsi_return_sense() when done.
  - Set .pre_allocate_sense & .sense_buffsize at host template.

Signed-off-by: Boaz Harrosh [EMAIL PROTECTED]
---
 drivers/scsi/initio.c |   12 +---
 drivers/scsi/initio.h |1 +
 2 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/initio.c b/drivers/scsi/initio.c
index 40e9875..b32a145 100644
--- a/drivers/scsi/initio.c
+++ b/drivers/scsi/initio.c
@@ -102,6 +102,7 @@
 #include <scsi/scsi_device.h>
 #include <scsi/scsi_host.h>
 #include <scsi/scsi_tcq.h>
+#include <scsi/scsi_eh.h>
 
 #include "initio.h"
 
@@ -2578,8 +2579,9 @@ static void initio_build_scb(struct initio_host * host, struct scsi_ctrl_blk * c
 
cblk->flags |= SCF_SENSE;   /* Turn on auto request sense   */
 
+   cblk->sense_buffer = scsi_make_sense(cmnd);
/* Map the sense buffer into bus memory */
-   dma_addr = dma_map_single(&host->pci_dev->dev, cmnd->sense_buffer,
+   dma_addr = dma_map_single(&host->pci_dev->dev, cblk->sense_buffer,
  SENSE_SIZE, DMA_FROM_DEVICE);
cblk->senseptr = cpu_to_le32((u32)dma_addr);
cblk->senselen = cpu_to_le32(SENSE_SIZE);
@@ -2733,7 +2735,8 @@ static int i91u_biosparam(struct scsi_device *sdev, struct block_device *dev,
  * was mapped originally as part of initio_build_scb
  */
 
-static void i91u_unmap_scb(struct pci_dev *pci_dev, struct scsi_cmnd *cmnd)
+static void i91u_unmap_scb(struct pci_dev *pci_dev, struct scsi_ctrl_blk *cblk,
+   struct scsi_cmnd *cmnd)
 {
/* auto sense buffer */
if (cmnd->SCp.ptr) {
@@ -2741,6 +2744,7 @@ static void i91u_unmap_scb(struct pci_dev *pci_dev, struct scsi_cmnd *cmnd)
 (dma_addr_t)((unsigned long)cmnd->SCp.ptr),
 SENSE_SIZE, DMA_FROM_DEVICE);
cmnd->SCp.ptr = NULL;
+   scsi_return_sense(cmnd, cblk->sense_buffer);
}
 
/* request buffer */
@@ -2817,7 +2821,7 @@ static void i91uSCBPost(u8 * host_mem, u8 * cblk_mem)
 
cmnd->result = cblk->tastat | (cblk->hastat << 16);
WARN_ON(cmnd == NULL);
-   i91u_unmap_scb(host->pci_dev, cmnd);
+   i91u_unmap_scb(host->pci_dev, cblk, cmnd);
cmnd->scsi_done(cmnd);  /* Notify system DONE   */
initio_release_scb(host, cblk); /* Release SCB for current channel */
 }
@@ -2833,6 +2837,8 @@ static struct scsi_host_template initio_template = {
.sg_tablesize   = SG_ALL,
.cmd_per_lun= 1,
.use_clustering = ENABLE_CLUSTERING,
+   .pre_allocate_sense = 1,
+   .sense_buffsize = SCSI_SENSE_BUFFERSIZE,
 };
 
 static int initio_probe_one(struct pci_dev *pdev,
diff --git a/drivers/scsi/initio.h b/drivers/scsi/initio.h
index cb48efa..9e4a79d 100644
--- a/drivers/scsi/initio.h
+++ b/drivers/scsi/initio.h
@@ -384,6 +384,7 @@ struct scsi_ctrl_blk {
void (*post) (u8 *, u8 *);  /*4C POST routine */
struct scsi_cmnd *srb;  /*50 SRB Pointer */
struct sg_entry sglist[TOTAL_SG_ENTRY]; /*54 Start of SG list */
+   u8 *sense_buffer;
 };
 
 /* Bit Definition for status */
-- 
1.5.3.3



Re: [PATCH RESEND number 2] libata: eliminate the home grown dma padding in favour of that provided by the block layer

2008-02-04 Thread James Bottomley
On Mon, 2008-02-04 at 23:43 +0900, Tejun Heo wrote:
 And, here's a working version.  I'll split and post them tomorrow.
 
 Thanks.
 
 Index: work/block/blk-core.c
 ===
 --- work.orig/block/blk-core.c
 +++ work/block/blk-core.c
 @@ -116,6 +116,7 @@ void rq_init(struct request_queue *q, st
   rq->ref_count = 1;
   rq->q = q;
   rq->special = NULL;
 + rq->raw_data_len = 0;
   rq->data_len = 0;
   rq->data = NULL;
   rq->nr_phys_segments = 0;
 @@ -1982,6 +1983,7 @@ void blk_rq_bio_prep(struct request_queu
   rq->hard_cur_sectors = rq->current_nr_sectors;
   rq->hard_nr_sectors = rq->nr_sectors = bio_sectors(bio);
   rq->buffer = bio_data(bio);
 + rq->raw_data_len = bio->bi_size;
   rq->data_len = bio->bi_size;
  
   rq->bio = rq->biotail = bio;
 Index: work/block/blk-map.c
 ===
 --- work.orig/block/blk-map.c
 +++ work/block/blk-map.c
 @@ -19,6 +19,7 @@ int blk_rq_append_bio(struct request_que
   rq->biotail->bi_next = bio;
   rq->biotail = bio;
  
 + rq->raw_data_len += bio->bi_size;
   rq->data_len += bio->bi_size;
   }
   return 0;
 @@ -139,6 +140,25 @@ int blk_rq_map_user(struct request_queue
   ubuf += ret;
   }
  
 + /*
 +  * __blk_rq_map_user() copies the buffers if starting address
 +  * or length aren't aligned.  As the copied buffer is always
 +  * page aligned, we know for a fact that there's enough room
 +  * for padding.  Extend the last bio and update rq->data_len
 +  * accordingly.
 +  *
 +  * On unmap, bio_uncopy_user() will use unmodified
 +  * bio_map_data pointed to by bio->bi_private.
 +  */
 + if (len & queue_dma_alignment(q)) {
 + unsigned int pad_len = (queue_dma_alignment(q) & ~len) + 1;
 + struct bio *bio = rq->biotail;
 +
 + bio->bi_io_vec[bio->bi_vcnt - 1].bv_len += pad_len;
 + bio->bi_size += pad_len;
 + rq->data_len += pad_len;
 + }
 +
   rq->buffer = rq->data = NULL;
   return 0;
  unmap_rq:
 Index: work/include/linux/blkdev.h
 ===
 --- work.orig/include/linux/blkdev.h
 +++ work/include/linux/blkdev.h
 @@ -214,6 +214,7 @@ struct request {
   unsigned int cmd_len;
   unsigned char cmd[BLK_MAX_CDB];
  
 + unsigned int raw_data_len;
   unsigned int data_len;
   unsigned int sense_len;
   void *data;

Please, no ... none of this is necessary.  You're wasting four bytes in
every request from a quantity we already know in the queue.

The point of the original code is that the drain is always waiting for
you silently in the sg list, but never shown in the length.  That means
it's entirely up to you whether you use it or ignore it.  It also means
that there's no need for the discriminating function in block, you
either do or don't use the drain element.
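
The padding arithmetic in the quoted blk_rq_map_user() hunk is easy to check in isolation. The sketch below assumes that queue_dma_alignment() returns an alignment mask (alignment minus one) and that the operators lost from the archived text are bitwise ANDs; demo_pad_len() is a hypothetical userspace name, not a block-layer function.

```c
#include <assert.h>

/*
 * Bytes needed to round len up to the next multiple of (mask + 1).
 * `mask` is an alignment mask such as 3 for 4-byte alignment.
 * Returns 0 when len is already aligned, matching the guard in the hunk.
 */
static unsigned int demo_pad_len(unsigned int len, unsigned int mask)
{
	assert(((mask + 1) & mask) == 0);	/* mask + 1 must be a power of two */

	if (!(len & mask))
		return 0;
	return (mask & ~len) + 1;
}
```

With a mask of 3, lengths 5, 6 and 7 need 3, 2 and 1 bytes of padding respectively, each reaching 8; a length of 13 with a mask of 511 needs 499 bytes to reach 512.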

 @@ -256,6 +257,7 @@ struct bio_vec;
  typedef int (merge_bvec_fn) (struct request_queue *, struct bio *, struct bio_vec *);
  typedef void (prepare_flush_fn) (struct request_queue *, struct request *);
  typedef void (softirq_done_fn)(struct request *);
 +typedef int (dma_drain_needed_fn)(struct request *);
  
  enum blk_queue_state {
   Queue_down,
 @@ -292,6 +294,7 @@ struct request_queue
   merge_bvec_fn   *merge_bvec_fn;
   prepare_flush_fn*prepare_flush_fn;
   softirq_done_fn *softirq_done_fn;
 + dma_drain_needed_fn *dma_drain_needed;

Like I said, not necessary because the MMC devices are all single
command queues, so for them, you simply apply the drain buffer the whole
time in block and decide whether to use it in the issue layer

   /*
* Dispatch queue sorting
 @@ -696,8 +699,9 @@ extern void blk_queue_max_hw_segments(st
  extern void blk_queue_max_segment_size(struct request_queue *, unsigned int);
  extern void blk_queue_hardsect_size(struct request_queue *, unsigned short);
  extern void blk_queue_stack_limits(struct request_queue *t, struct request_queue *b);
 -extern int blk_queue_dma_drain(struct request_queue *q, void *buf,
 -unsigned int size);
 +extern int blk_queue_dma_drain(struct request_queue *q,
 +dma_drain_needed_fn *dma_drain_needed,
 +void *buf, unsigned int size);
  extern void blk_queue_segment_boundary(struct request_queue *, unsigned long);
  extern void blk_queue_prep_rq(struct request_queue *, prep_rq_fn *pfn);
  extern void blk_queue_merge_bvec(struct request_queue *, merge_bvec_fn *);
 Index: work/block/blk-merge.c
 ===
 --- work.orig/block/blk-merge.c
 +++ work/block/blk-merge.c
 @@ -220,7 +220,7 @@ new_segment:
   bvprv = bvec;
   } /* segments in rq */
  
 - if 

Re: Integration of SCST in the mainstream Linux kernel

2008-02-04 Thread James Bottomley
On Mon, 2008-02-04 at 21:38 +0300, Vladislav Bolkhovitin wrote:
 James Bottomley wrote:
  On Mon, 2008-02-04 at 20:56 +0300, Vladislav Bolkhovitin wrote:
  
 James Bottomley wrote:
 
 On Mon, 2008-02-04 at 20:16 +0300, Vladislav Bolkhovitin wrote:
 
 
 James Bottomley wrote:
 
 
 So, James, what is your opinion on the above? Or the overall SCSI 
 target 
 project simplicity doesn't matter much for you and you think it's 
 fine 
 to duplicate Linux page cache in the user space to keep the in-kernel 
 part of the project as small as possible?
 
 
 The answers were pretty much contained here
 
 http://marc.info/?l=linux-scsi&m=120164008302435
 
 and here:
 
 http://marc.info/?l=linux-scsi&m=120171067107293
 
 Weren't they?
 
 No, sorry, it doesn't look so for me. They are about performance, but 
 I'm asking about the overall project's architecture, namely about one 
 part of it: simplicity. Particularly, what do you think about 
 duplicating Linux page cache in the user space to have zero-copy cached 
 I/O? Or can you suggest another architectural solution for that problem 
 in the STGT's approach?
 
 
 Isn't that an advantage of a user space solution?  It simply uses the
 backing store of whatever device supplies the data.  That means it takes
 advantage of the existing mechanisms for caching.
 
 No, please reread this thread, especially this message: 
 http://marc.info/?l=linux-kernel&m=120169189504361&w=2. This is one of 
 the advantages of the kernel space implementation. The user space 
 implementation has to have data copied between the cache and user space 
 buffer, but the kernel space one can use pages in the cache directly, 
 without extra copy.
 
 
 Well, you've said it thrice (the bellman cried) but that doesn't make it
 true.
 
 The way a user space solution should work is to schedule mmapped I/O
 from the backing store and then send this mmapped region off for target
 I/O.  For reads, the page gather will ensure that the pages are up to
 date from the backing store to the cache before sending the I/O out.
 For writes, you actually have to do a msync on the region to get the
 data secured to the backing store. 
 
 James, have you checked how fast mmapped I/O is if work size >> size of 
 RAM? It's several times slower compared to buffered I/O. It has been 
 discussed many times on LKML and, it seems, VM people consider it unavoidable. 
  
  
  Erm, but if you're using the case of work size >> size of RAM, you'll
  find buffered I/O won't help because you don't have the memory for
  buffers either.
 
 James, just check and you will see, buffered I/O is a lot faster.

So in an out of memory situation the buffers you don't have are a lot
faster than the pages I don't have?

 So, using mmaped IO isn't an option for high performance. Plus, mmaped 
 IO isn't an option for high reliability requirements, since it doesn't 
 provide a practical way to handle I/O errors.
  
  I think you'll find it does ... the page gather returns -EFAULT if
  there's an I/O error in the gathered region. 
 
 Err, to whom return? If you try to read from a mmaped page, which can't 
 be populated due to I/O error, you will get SIGBUS or SIGSEGV, I don't 
 remember exactly. It's quite tricky to get back to the faulted command 
 from the signal handler.
 
 Or do you mean mmap(MAP_POPULATE)/munmap() for each command? Do you 
 think that such mapping/unmapping is good for performance?
 
  msync does something
  similar if there's a write failure.
  
 You also have to pull tricks with
 the mmap region in the case of writes to prevent useless data being read
 in from the backing store.
 
 Can you be more exact and specify what kind of tricks should be done for 
 that?
  
  Actually, just avoid touching it seems to do the trick with a recent
  kernel.
 
 Hmm, how can one write to an mmapped page and not touch it?

I meant from user space ... the writes are done inside the kernel.

However, as Linus has pointed out, this discussion is getting a bit off
topic.  There's no actual evidence that copy problems are causing any
performance issues for STGT.  In fact, there's evidence that
they're not for everything except IB networks.

James
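
For reference, the msync() step discussed in this message can be sketched in user space. This is a minimal illustration only, not STGT code: the function name mmap_write_sync() is hypothetical, and the memcpy() merely stands in for whatever fills the pages (in the scheme described above the kernel would do that, so this sketch does not itself demonstrate zero copy). It only shows that a shared mapping is flushed to the backing store with msync(), and that the flush reports failure through its return value.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: store len bytes at offset 0 of the file at path
 * through a MAP_SHARED mapping, then force them to the backing store.
 * Returns 0 on success and -1 on any failure, including a failed msync().
 */
int mmap_write_sync(const char *path, const void *data, size_t len)
{
    int fd, rc;
    void *map;

    if (len == 0)
        return -1;              /* mmap() rejects zero-length mappings */

    fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return -1;

    /* The file must be at least len bytes long before it is mapped. */
    if (ftruncate(fd, (off_t)len) < 0) {
        close(fd);
        return -1;
    }

    map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);                  /* the mapping stays valid after close() */
    if (map == MAP_FAILED)
        return -1;

    memcpy(map, data, len);     /* placeholder for the real data source */

    /* MS_SYNC blocks until the write-back has completed or failed. */
    rc = msync(map, len, MS_SYNC);
    munmap(map, len);
    return rc;
}
```

A caller would treat a non-zero return as a failed write. How a read fault on an unpopulated page is reported is a separate question and is not covered by this sketch.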


-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/9] scsi_dh: scsi handling of REQ_LB_OP_TRANSITION

2008-02-04 Thread James Bottomley

On Fri, 2008-02-01 at 14:00 -0600, Mike Christie wrote:
 Chandra Seetharaman wrote:
  @@ -1445,9 +1479,24 @@ static void scsi_kill_request(struct req
   static void scsi_softirq_done(struct request *rq)
   {
  struct scsi_cmnd *cmd = rq->completion_data;
  -   unsigned long wait_for = (cmd->allowed + 1) * cmd->timeout_per_command;
  int disposition;
  +   struct request_queue *q;
  +   unsigned long wait_for, flags;
   
  +   if (blk_linux_request(rq)) {
  +   q = rq->q;
  +   spin_lock_irqsave(q->queue_lock, flags);
  +   /*
  +* we always return 1 and the caller should
  +* check rq->errors for the complete status
  +*/
  +   end_that_request_last(rq, 1);
  +   spin_unlock_irqrestore(q->queue_lock, flags);
  +   return;
  +   }
  +
  +
  +   wait_for = (cmd->allowed + 1) * cmd->timeout_per_command;
  INIT_LIST_HEAD(&cmd->eh_entry);
   
 .
 
  +
   /*
* Function:scsi_request_fn()
*
  @@ -1519,7 +1612,23 @@ static void scsi_request_fn(struct reque
   * accept it.
   */
  req = elv_next_request(q);
  -   if (!req || !scsi_dev_queue_ready(q, sdev))
  +   if (!req)
  +   break;
  +
  +   /*
  +* We do not account for linux blk req in the device
  +* or host busy accounting because it is not necessarily
  +* a scsi command that is sent to some object. The lower
  +* level can translate it into a request/scsi_cmnd, if
  +* necessary, and then queue that up using REQ_TYPE_BLOCK_PC.
  +*/
  +   if (blk_linux_request(req)) {
  +   blkdev_dequeue_request(req);
  +   scsi_execute_blk_linux_cmd(req);
  +   continue;
  +   }
  +
  +   if (!scsi_dev_queue_ready(q, sdev))
  break;
 
 I think these two pieces are one of the reasons I have not pushed the 
 patches. I thought the completion and execution pieces here are a little 
 ugly and seem to just wedge themselves in where they want to be.
 
 Is there any way to make the insertion of non-scsi commands more common? 
 Do we have the code for being able to send requests directly to 
 something like a fc rport done? Could we maybe inject these special 
 commands to the hw handler using something similar to how bsg would send 
 non scsi commands to weird objects (objects like rport, sessions, and 
 not devices we traditionally associated with queues like scsi_devices). 
 Just a thought with no code :) that is why the ugly code existed still :)

We sort of do.  The bsg code in scsi_transport_sas to send SMP frames to
expander devices would be an example of non-scsi commands going via a
mechanism other than being encapsulated in SCSI.  I don't know if that's
the complete solution in this case, but you could investigate it.

James




Re: [PATCH 7/9] scsi_dh: Add support for SDEV_PASSIVE

2008-02-04 Thread James Bottomley

On Wed, 2008-01-23 at 16:32 -0800, Chandra Seetharaman wrote:
 Subject: scsi_dh: Add support for SDEV_PASSIVE
 
 From: Chandra Seetharaman [EMAIL PROTECTED]
 
 This patch adds a new device state SDEV_PASSIVE, to correspond to the
 passive side access of an active/passive multipathed device.

Really, no; this isn't right.  The state field of a SCSI device is for
the SCSI state model.  Passive might be a valid device mapper state, but
it's not a valid SCSI state.  If these patches can't work except by
mucking with the SCSI state model, there's some layering problem
elsewhere that needs sorting out.

James




Re: Integration of SCST in the mainstream Linux kernel

2008-02-04 Thread James Bottomley
On Mon, 2008-02-04 at 10:29 -0800, Linus Torvalds wrote:
 
 On Mon, 4 Feb 2008, James Bottomley wrote:
  
  The way a user space solution should work is to schedule mmapped I/O
  from the backing store and then send this mmapped region off for target
  I/O.
 
 mmap'ing may avoid the copy, but the overhead of a mmap operation is 
 quite often much *bigger* than the overhead of a copy operation.
 
 Please do not advocate the use of mmap() as a way to avoid memory copies. 
 It's not realistic. Even if you can do it with a single mmap() system 
 call (which is not at all a given, considering that block devices can 
 easily be much larger than the available virtual memory space), the fact 
 is that page table games along with the fault (and even just TLB miss) 
 overhead is easily more than the cost of copying a page in a nice 
 streaming manner.
 
 Yes, memory is slow, but dammit, so is mmap().
 
  You also have to pull tricks with the mmap region in the case of writes 
  to prevent useless data being read in from the backing store.  However, 
  none of this involves data copies.
 
 data copies is irrelevant. The only thing that matters is performance. 
 And if avoiding data copies is more costly (or even of a similar cost) 
 than the copies themselves would have been, there is absolutely no upside, 
 and only downsides due to extra complexity.
 
 If you want good performance for a service like this, you really generally 
 *do* need to be in kernel space. You can play games in user space, but you're 
 fooling yourself if you think you can do as well as doing it in the 
 kernel. And you're *definitely* fooling yourself if you think mmap() 
 solves performance issues. Zero-copy does not equate to fast. Memory 
 speeds may be slower than core CPU speeds, but not infinitely so!
 
 (That said: there *are* alternatives to mmap, like splice(), that really 
 do potentially solve some issues without the page table and TLB overheads. 
 But while splice() avoids the costs of paging, I strongly suspect it would 
 still have easily measurable latency issues. Switching between user and 
 kernel space multiple times is definitely not going to be free, although 
 it's probably not a huge issue if you have big enough requests).

Sorry ... this is really just a discussion of how something (zero copy)
could be done, rather than an implementation proposal.  (I'm not
actually planning to make the STGT people do anything ... although
investigating splice does sound interesting).

Right at the moment, STGT seems to be performing just fine on
measurements up to gigabit networks.  There are suggestions that there
may be a problem on 8G IB networks, but it's not definitive yet.

I'm already on record as saying I think the best fix for IB networks is
just to reduce the context switches by increasing the transfer size, but
the infrastructure to allow that only just went into git head.

James




RE: [PATCH 3/24][RFC] scsi-drivers: more drivers use new scsi_eh_cpy_sense()

2008-02-04 Thread Salyzyn, Mark
ACK for aacraid and ips with condition that community accepts the RFC's premise.

The code changes appear trivial and make sense.

I had some upstream changes in the set_sense code in aacraid related to a 
recent discussion regarding error propagation that is overlapped directly by 
the changes in this RFC. However, it looks like the changes in the SCSI layer 
to deal with the error propagation will mask the problem in set_sense reducing 
the pressure on me submitting the set_sense code changes; so I can hold on to 
them for a while until this RFC is settled and submit them as janitor-style 
changes later.

Sincerely -- Mark Salyzyn

 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On Behalf Of Boaz Harrosh
 Sent: Monday, February 04, 2008 10:48 AM
 To: James Bottomley; FUJITA Tomonori; Christoph Hellwig; Jens
 Axboe; Jeff Garzik; linux-scsi
 Cc: Andrew Morton
 Subject: [PATCH 3/24][RFC] scsi-drivers: more drivers use new
 scsi_eh_cpy_sense()

   All below drivers had the sense information stored in some
   driver internal structure, which was copied or manipulated
   and set into sense_buffer. So use scsi_eh_cpy_sense() in its
   place. Where the sense data is manipulated, a temporary buffer
   is used, then copied.

   Some places that inspect the sense buffer are converted to use
   scsi_sense() accessor.
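
The conversion pattern described above can be pictured with a small user-space sketch. The real scsi_eh_cpy_sense() is introduced elsewhere in this RFC series and its body is not shown in this message, so everything below is an assumption: struct fake_cmd, FAKE_SENSE_SIZE and cpy_sense_sketch() are hypothetical stand-ins, and the clamping behaviour is only inferred from the min(sns_len, SCSI_SENSE_BUFFERSIZE) lines that the patch removes from the drivers.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical buffer size; the kernel's SCSI_SENSE_BUFFERSIZE may differ. */
#define FAKE_SENSE_SIZE 96

/* Hypothetical stand-in for the command structure that owns the sense data. */
struct fake_cmd {
    unsigned char sense[FAKE_SENSE_SIZE];
};

/*
 * Copy driver-private sense data into the command, clamping the length to
 * the buffer size so that callers no longer need to do it themselves.
 * Bytes past the copied length are zeroed. Returns the number of bytes kept.
 */
size_t cpy_sense_sketch(struct fake_cmd *cmd, const unsigned char *sense,
                        size_t len)
{
    if (len > sizeof(cmd->sense))
        len = sizeof(cmd->sense);

    memset(cmd->sense, 0, sizeof(cmd->sense));
    memcpy(cmd->sense, sense, len);
    return len;
}
```

With a helper of this shape a driver can keep inspecting its own copy of the sense bytes and never touch the command's buffer directly, which is what makes the later removal of the sense_buffer field possible.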

   driver files changed:
 drivers/s390/scsi/zfcp_fsf.c
 drivers/scsi/3w-.c
 drivers/scsi/aacraid/aachba.c
 drivers/scsi/aic7xxx/aic79xx_osm.c
 drivers/scsi/aic7xxx/aic7xxx_osm.c
 drivers/scsi/arcmsr/arcmsr_hba.c
 drivers/scsi/ipr.c
 drivers/scsi/ips.c
 drivers/scsi/megaraid/megaraid_mbox.c
 drivers/scsi/ps3rom.c
 drivers/scsi/qla4xxx/ql4_isr.c
 drivers/scsi/stex.c

 Signed-off-by: Boaz Harrosh [EMAIL PROTECTED]
 ---
  drivers/s390/scsi/zfcp_fsf.c  |   11 ++
  drivers/scsi/3w-.c|   20 +
  drivers/scsi/aacraid/aachba.c |   52
 ++--
  drivers/scsi/aic7xxx/aic79xx_osm.c|   34 +++--
  drivers/scsi/aic7xxx/aic7xxx_osm.c|   22 --
  drivers/scsi/arcmsr/arcmsr_hba.c  |   32 
  drivers/scsi/arm/fas216.c |2 +-
  drivers/scsi/ipr.c|   17 ++-
  drivers/scsi/ips.c|   13 
  drivers/scsi/megaraid/megaraid_mbox.c |   24 ++-
  drivers/scsi/ps3rom.c |   31 +++
  drivers/scsi/qla1280.c|   26 
  drivers/scsi/qla4xxx/ql4_isr.c|   12 ++-
  drivers/scsi/stex.c   |   31 +--
  14 files changed, 147 insertions(+), 180 deletions(-)

 diff --git a/drivers/s390/scsi/zfcp_fsf.c
 b/drivers/s390/scsi/zfcp_fsf.c
 index 1abbac5..388d218 100644
 --- a/drivers/s390/scsi/zfcp_fsf.c
 +++ b/drivers/s390/scsi/zfcp_fsf.c
 @@ -4209,13 +4209,11 @@
 zfcp_fsf_send_fcp_command_task_handler(struct zfcp_fsf_req *fsf_req)

 /* check for sense data */
 if (unlikely(fcp_rsp_iu->validity.bits.fcp_sns_len_valid)) {
 +   u8 *sense;
 sns_len = FSF_FCP_RSP_SIZE -
 sizeof (struct fcp_rsp_iu) +
 fcp_rsp_iu->fcp_rsp_len;
 ZFCP_LOG_TRACE("room for %i bytes sense data in QTCB\n",
 sns_len);
 -   sns_len = min(sns_len, (u32) SCSI_SENSE_BUFFERSIZE);
 -   ZFCP_LOG_TRACE("room for %i bytes sense data in SCSI command\n",
 -  SCSI_SENSE_BUFFERSIZE);
 sns_len = min(sns_len, fcp_rsp_iu->fcp_sns_len);
 ZFCP_LOG_TRACE("scpnt->result =0x%x, command was:\n",
 scpnt->result);
 @@ -4224,10 +4222,9 @@
 zfcp_fsf_send_fcp_command_task_handler(struct zfcp_fsf_req *fsf_req)

 ZFCP_LOG_TRACE("%i bytes sense data provided by FCP\n",
 fcp_rsp_iu->fcp_sns_len);
 -   memcpy(scpnt->sense_buffer,
 -  zfcp_get_fcp_sns_info_ptr(fcp_rsp_iu), sns_len);
 -   ZFCP_HEX_DUMP(ZFCP_LOG_LEVEL_TRACE,
 - (void *)scpnt->sense_buffer, sns_len);
 +   sense = zfcp_get_fcp_sns_info_ptr(fcp_rsp_iu);
 +   scsi_eh_cpy_sense(scpnt, sense, sns_len);
 +   ZFCP_HEX_DUMP(ZFCP_LOG_LEVEL_TRACE, sense, sns_len);
 }

 /* check for overrun */
 diff --git a/drivers/scsi/3w-.c b/drivers/scsi/3w-.c
 index d095321..f5dde3d 100644
 --- a/drivers/scsi/3w-.c
 +++ b/drivers/scsi/3w-.c
 @@ -214,6 +214,7 @@
  #include <scsi/scsi_host.h>
  #include <scsi/scsi_tcq.h>
  #include <scsi/scsi_cmnd.h>
 +#include <scsi/scsi_eh.h>
  #include "3w-.h"

  /* Globals */
 @@ -410,23 +411,30 @@ static int
 tw_decode_sense(TW_Device_Extension *tw_dev, int request_id, int fill
 if ((command->status == 0xc7) ||
 (command->status == 

RE: [PATCH 5/24][RFC] dpt_i2o: Use new scsi_eh_cpy_sense()

2008-02-04 Thread Salyzyn, Mark
ACK with condition that community accepts the RFC's entire premise.

The removed code that shunted the REQUEST_SENSE was based on the assumption 
that the sense data in the current scsi command packet was left over from the 
previous command's execution with a check condition as the scsi command packet 
is reused to issue the REQUEST_SENSE. For a new, or second from the target's 
point of view, request sense to the target issued by these older kernels would 
always return an erased sense. The dpt_i2o driver does not itself maintain the 
sense history, nor does the Firmware. This behavior, I believe, is not the case 
for current kernels so the code fragment made little sense (pun not intended). 
If my historical knowledge is correct, this (now removed) workaround makes no 
more sense because the scsi layer correctly manages adapters that produce 
auto-request sense and does not ever turn around the command and send a second 
request for sense information.

Given this understanding, I have no problem with the removed fragment of 
REQUEST_SENSE shunting. However, I do urge some target error recovery testing, 
tape drives being the likely type of target affected by this change. I have no 
such hardware to confirm...

Sincerely -- Mark Salyzyn

 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On Behalf Of Boaz Harrosh
 Sent: Monday, February 04, 2008 10:59 AM
 To: James Bottomley; FUJITA Tomonori; Christoph Hellwig; Jens
 Axboe; Jeff Garzik; linux-scsi
 Cc: Andrew Morton
 Subject: [PATCH 5/24][RFC] dpt_i2o: Use new scsi_eh_cpy_sense()

   - Abstract away scsi_cmnd->sense_buffer for later removal.

   - Removed the filtering out of REQUEST_SENSE at .queuecommand
 in the case of the sense being clean. This is no longer relevant
 since scsi-ml will always send a zeroed-out sense buffer even
 on a resend, so this means an outside REQUEST_SENSE would never
 go through. If this is intended then the comment and check should
 change.

 Signed-off-by: Boaz Harrosh [EMAIL PROTECTED]
 ---
  drivers/scsi/dpt_i2o.c |   25 +++--
  1 files changed, 7 insertions(+), 18 deletions(-)

 diff --git a/drivers/scsi/dpt_i2o.c b/drivers/scsi/dpt_i2o.c
 index c9dd839..6ee6fcd 100644
 --- a/drivers/scsi/dpt_i2o.c
 +++ b/drivers/scsi/dpt_i2o.c
 @@ -71,6 +71,7 @@ MODULE_DESCRIPTION("Adaptec I2O RAID Driver");
  #include <scsi/scsi_device.h>
  #include <scsi/scsi_host.h>
  #include <scsi/scsi_tcq.h>
 +#include <scsi/scsi_eh.h>

  #include "dpt/dptsig.h"
  #include "dpti.h"
 @@ -385,18 +386,6 @@ static int adpt_queue(struct scsi_cmnd *
 cmd, void (*done) (struct scsi_cmnd *))
 struct adpt_device* pDev = NULL;/* dpt per
 device information */

 cmd->scsi_done = done;
 -   /*
 -* SCSI REQUEST_SENSE commands will be executed
 automatically by the
 -* Host Adapter for any errors, so they should not be executed
 -* explicitly unless the Sense Data is zero
 indicating that no error
 -* occurred.
 -*/
 -
 -   if ((cmd->cmnd[0] == REQUEST_SENSE) &&
 (cmd->sense_buffer[0] != 0)) {
 -   cmd->result = (DID_OK << 16);
 -   cmd-scsi_done(cmd);
 -   return 0;
 -   }

 pHba = (adpt_hba*)cmd->device->host->hostdata[0];
 if (!pHba) {
 @@ -2226,8 +2215,6 @@ static s32 adpt_i2o_to_scsi(void
 __iomem *reply, struct scsi_cmnd* cmd)

 pHba = (adpt_hba*) cmd->device->host->hostdata[0];

 -   cmd->sense_buffer[0] = '\0';  // initialize sense
 valid flag to false
 -
 if(!(reply_flags & MSG_FAIL)) {
 switch(detailed_status & I2O_SCSI_DSC_MASK) {
 case I2O_SCSI_DSC_SUCCESS:
 @@ -2297,11 +2284,13 @@ static s32 adpt_i2o_to_scsi(void
 __iomem *reply, struct scsi_cmnd* cmd)
 // copy over the request sense data if it was a check
 // condition status
 if (dev_status == SAM_STAT_CHECK_CONDITION) {
 -   u32 len = min(SCSI_SENSE_BUFFERSIZE, 40);
 +   u8 sense_buffer[40];
 +   u32 len = sizeof(sense_buffer);
 // Copy over the sense data
 -   memcpy_fromio(cmd->sense_buffer,
 (reply+28) , len);
 -   if(cmd->sense_buffer[0] == 0x70 /*
 class 7 */ &&
 -  cmd->sense_buffer[2] == DATA_PROTECT ){
 +   memcpy_fromio(sense_buffer, (reply+28) , len);
 +   scsi_eh_cpy_sense(cmd, sense_buffer, len);
 +   if (sense_buffer[0] == 0x70 /* class 7 */ &&
 +  sense_buffer[2] == DATA_PROTECT){
 /* This is to handle an array
 failed */
 cmd->result = (DID_TIME_OUT << 16);
 printk(KERN_WARNING"%s: SCSI Data Protect-Device (%d,%d,%d) hba_status=0x%x, dev_status=0x%x, cmd=0x%x\n",
 --
 1.5.3.3


Re: Integration of SCST in the mainstream Linux kernel

2008-02-04 Thread Linus Torvalds


On Mon, 4 Feb 2008, James Bottomley wrote:
 
 The way a user space solution should work is to schedule mmapped I/O
 from the backing store and then send this mmapped region off for target
 I/O.

mmap'ing may avoid the copy, but the overhead of a mmap operation is 
quite often much *bigger* than the overhead of a copy operation.

Please do not advocate the use of mmap() as a way to avoid memory copies. 
It's not realistic. Even if you can do it with a single mmap() system 
call (which is not at all a given, considering that block devices can 
easily be much larger than the available virtual memory space), the fact 
is that page table games along with the fault (and even just TLB miss) 
overhead is easily more than the cost of copying a page in a nice 
streaming manner.

Yes, memory is slow, but dammit, so is mmap().

 You also have to pull tricks with the mmap region in the case of writes 
 to prevent useless data being read in from the backing store.  However, 
 none of this involves data copies.

data copies is irrelevant. The only thing that matters is performance. 
And if avoiding data copies is more costly (or even of a similar cost) 
than the copies themselves would have been, there is absolutely no upside, 
and only downsides due to extra complexity.

If you want good performance for a service like this, you really generally 
*do* need to be in kernel space. You can play games in user space, but you're 
fooling yourself if you think you can do as well as doing it in the 
kernel. And you're *definitely* fooling yourself if you think mmap() 
solves performance issues. Zero-copy does not equate to fast. Memory 
speeds may be slower than core CPU speeds, but not infinitely so!

(That said: there *are* alternatives to mmap, like splice(), that really 
do potentially solve some issues without the page table and TLB overheads. 
But while splice() avoids the costs of paging, I strongly suspect it would 
still have easily measurable latency issues. Switching between user and 
kernel space multiple times is definitely not going to be free, although 
it's probably not a huge issue if you have big enough requests).

Linus


RE: [PATCH 07/12] qla4xxx: add qla4xxx async scan support

2008-02-04 Thread David Somayajulu
Mike Christie wrote:
 qla4xxx has the old school startup/probe where it finds 
 presetup sessions
 in its flash and then attempts to log into them before 
 returning from the
 probe. This however, makes it very simple to add a iscsi 
 class scan finished
 helper which the driver can use.
 
 In future patches Dave or I will rip apart the driver to make it more
 like qla2xxx, but for now this is a very simple two line patch which
 fixes the problem of trying to figure out when the initial sessions
 are done being scanned.
 
 Signed-off-by: Mike Christie [EMAIL PROTECTED]
 ---
  drivers/scsi/qla4xxx/ql4_os.c |4 +++-
  1 files changed, 3 insertions(+), 1 deletions(-)
 
 diff --git a/drivers/scsi/qla4xxx/ql4_os.c 
 b/drivers/scsi/qla4xxx/ql4_os.c
 index d4dd149..c3c59d7 100644
 --- a/drivers/scsi/qla4xxx/ql4_os.c
 +++ b/drivers/scsi/qla4xxx/ql4_os.c
 @@ -89,6 +89,8 @@ static struct scsi_host_template 
 qla4xxx_driver_template = {
   .slave_alloc= qla4xxx_slave_alloc,
   .slave_destroy  = qla4xxx_slave_destroy,
  
 + .scan_finished  = iscsi_scan_finished,
 +
   .this_id= -1,
   .cmd_per_lun= 3,
   .use_clustering = ENABLE_CLUSTERING,
 @@ -1306,7 +1308,7 @@ static int __devinit 
 qla4xxx_probe_adapter(struct pci_dev *pdev,
  qla4xxx_version_str, ha->pdev->device, 
 pci_name(ha->pdev),
  ha->host_no, ha->firmware_version[0], 
 ha->firmware_version[1],
  ha->patch_number, ha->build_number);
 -
 + scsi_scan_host(host);
   return 0;
  
  remove_host:
 -- 
 1.5.2.1
Acked-by: David Somayajulu [EMAIL PROTECTED]


Re: Integration of SCST in the mainstream Linux kernel

2008-02-04 Thread James Bottomley
On Mon, 2008-02-04 at 20:56 +0300, Vladislav Bolkhovitin wrote:
 James Bottomley wrote:
  On Mon, 2008-02-04 at 20:16 +0300, Vladislav Bolkhovitin wrote:
  
 James Bottomley wrote:
 
 So, James, what is your opinion on the above? Or the overall SCSI 
 target 
 project simplicity doesn't matter much for you and you think it's fine 
 to duplicate Linux page cache in the user space to keep the in-kernel 
 part of the project as small as possible?
 
 
 The answers were pretty much contained here
 
 http://marc.info/?l=linux-scsi&m=120164008302435
 
 and here:
 
 http://marc.info/?l=linux-scsi&m=120171067107293
 
 Weren't they?
 
 No, sorry, it doesn't look so for me. They are about performance, but 
 I'm asking about the overall project's architecture, namely about one 
 part of it: simplicity. Particularly, what do you think about 
 duplicating Linux page cache in the user space to have zero-copy cached 
 I/O? Or can you suggest another architectural solution for that problem 
 in the STGT's approach?
 
 
 Isn't that an advantage of a user space solution?  It simply uses the
 backing store of whatever device supplies the data.  That means it takes
 advantage of the existing mechanisms for caching.
 
 No, please reread this thread, especially this message: 
 http://marc.info/?l=linux-kernel&m=120169189504361&w=2. This is one of 
 the advantages of the kernel space implementation. The user space 
 implementation has to have data copied between the cache and user space 
 buffer, but the kernel space one can use pages in the cache directly, 
 without extra copy.
  
  
  Well, you've said it thrice (the bellman cried) but that doesn't make it
  true.
  
  The way a user space solution should work is to schedule mmapped I/O
  from the backing store and then send this mmapped region off for target
  I/O.  For reads, the page gather will ensure that the pages are up to
  date from the backing store to the cache before sending the I/O out.
  For writes, you actually have to do a msync on the region to get the
  data secured to the backing store. 
 
 James, have you checked how fast mmapped I/O is if work size >> size of 
 RAM? It's several times slower compared to buffered I/O. It has been 
 discussed many times on LKML and, it seems, VM people consider it unavoidable. 

Erm, but if you're using the case of work size >> size of RAM, you'll
find buffered I/O won't help because you don't have the memory for
buffers either.

 So, using mmaped IO isn't an option for high performance. Plus, mmaped 
 IO isn't an option for high reliability requirements, since it doesn't 
 provide a practical way to handle I/O errors.

I think you'll find it does ... the page gather returns -EFAULT if
there's an I/O error in the gathered region.  msync does something
similar if there's a write failure.

  You also have to pull tricks with
  the mmap region in the case of writes to prevent useless data being read
  in from the backing store.
 
 Can you be more exact and specify what kind of tricks should be done for 
 that?

Actually, just avoid touching it seems to do the trick with a recent
kernel.

James




[PATCH 11/24][RFC] libata: Use scsi_eh API for REQUEST_SENSE invocation

2008-02-04 Thread Boaz Harrosh
  Both libata-scsi and libata-eh used a cooked-up
  REQUEST_SENSE command to retrieve sense data. Using
  scsi_eh_{prep,restore}_cmnd() simplifies the code
  and insulates it from future changes in the SCSI
  layer.

  Am I right in assuming that ata_exec_internal_sg() executes
  synchronously (called from atapi_eh_request_sense()) and
  once returned contain valid sense data?

  Other places in libata were also converted to the new scsi_eh
  API and accessors.

  Set shost->sense_buffsize in ata_scsi_add_hosts() for all drivers.

Signed-off-by: Boaz Harrosh [EMAIL PROTECTED]
---
 drivers/ata/libata-eh.c   |   29 +++--
 drivers/ata/libata-scsi.c |   44 ++--
 include/linux/libata.h|3 +++
 3 files changed, 36 insertions(+), 40 deletions(-)

diff --git a/drivers/ata/libata-eh.c b/drivers/ata/libata-eh.c
index 4e31071..9f02be4 100644
--- a/drivers/ata/libata-eh.c
+++ b/drivers/ata/libata-eh.c
@@ -1270,29 +1270,19 @@ static int ata_eh_read_log_10h(struct ata_device *dev,
 static unsigned int atapi_eh_request_sense(struct ata_queued_cmd *qc)
 {
struct ata_device *dev = qc->dev;
-   unsigned char *sense_buf = qc->scsicmd->sense_buffer;
struct ata_port *ap = dev->link->ap;
struct ata_taskfile tf;
-   u8 cdb[ATAPI_CDB_LEN];
+   struct scsi_eh_save ses;
+   struct scsi_cmnd *cmd = qc->scsicmd;
+   int ret;
 
DPRINTK("ATAPI request sense\n");
 
-   /* FIXME: is this needed? */
-   memset(sense_buf, 0, SCSI_SENSE_BUFFERSIZE);
-
-   /* initialize sense_buf with the error register,
-* for the case where they are -not- overwritten
-*/
-   sense_buf[0] = 0x70;
-   sense_buf[2] = qc->result_tf.feature >> 4;
-
+   /*?? Is there a maximum size here that ATAPI will confuse if more ??*/
+   scsi_eh_prep_cmnd(cmd, &ses, NULL, 0, ~0);
/* some devices time out if garbage left in tf */
ata_tf_init(dev, &tf);
 
-   memset(cdb, 0, ATAPI_CDB_LEN);
-   cdb[0] = REQUEST_SENSE;
-   cdb[4] = SCSI_SENSE_BUFFERSIZE;
-
tf.flags |= ATA_TFLAG_ISADDR | ATA_TFLAG_DEVICE;
tf.command = ATA_CMD_PACKET;
 
@@ -1302,12 +1292,15 @@ static unsigned int atapi_eh_request_sense(struct 
ata_queued_cmd *qc)
tf.feature |= ATAPI_PKT_DMA;
} else {
tf.protocol = ATAPI_PROT_PIO;
-   tf.lbam = SCSI_SENSE_BUFFERSIZE;
+   tf.lbam = scsi_bufflen(cmd);
tf.lbah = 0;
}
 
-   return ata_exec_internal(dev, &tf, cdb, DMA_FROM_DEVICE,
-sense_buf, SCSI_SENSE_BUFFERSIZE, 0);
+   ret = ata_exec_internal_sg(dev, &tf, cmd->cmnd, DMA_FROM_DEVICE,
+scsi_sglist(cmd), scsi_sg_count(cmd), 0);
+
+   scsi_eh_restore_cmnd(cmd, &ses);
+   return ret;
 }
 
 /**
diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
index c02c490..652fce4 100644
--- a/drivers/ata/libata-scsi.c
+++ b/drivers/ata/libata-scsi.c
@@ -705,11 +705,11 @@ static void ata_gen_passthru_sense(struct ata_queued_cmd 
*qc)
 {
struct scsi_cmnd *cmd = qc->scsicmd;
struct ata_taskfile *tf = &qc->result_tf;
-   unsigned char *sb = cmd->sense_buffer;
+   unsigned char sb[24];
unsigned char *desc = sb + 8;
int verbose = qc->ap->ops->error_handler == NULL;
 
-   memset(sb, 0, SCSI_SENSE_BUFFERSIZE);
+   memset(sb, 0, sizeof(sb));
 
cmd->result = (DRIVER_SENSE << 24) | SAM_STAT_CHECK_CONDITION;
 
@@ -758,6 +758,7 @@ static void ata_gen_passthru_sense(struct ata_queued_cmd 
*qc)
desc[8] = tf->hob_lbam;
desc[10] = tf->hob_lbah;
}
+   scsi_eh_cpy_sense(cmd, sb, sizeof(sb));
 }
 
 /**
@@ -775,12 +776,12 @@ static void ata_gen_ata_sense(struct ata_queued_cmd *qc)
struct ata_device *dev = qc->dev;
struct scsi_cmnd *cmd = qc->scsicmd;
struct ata_taskfile *tf = &qc->result_tf;
-   unsigned char *sb = cmd->sense_buffer;
+   u8 sb[24];
unsigned char *desc = sb + 8;
int verbose = qc->ap->ops->error_handler == NULL;
u64 block;
 
-   memset(sb, 0, SCSI_SENSE_BUFFERSIZE);
+   memset(sb, 0, sizeof(sb));
 
cmd->result = (DRIVER_SENSE << 24) | SAM_STAT_CHECK_CONDITION;
 
@@ -811,6 +812,8 @@ static void ata_gen_ata_sense(struct ata_queued_cmd *qc)
desc[9] = block >> 16;
desc[10] = block >> 8;
desc[11] = block;
+
+   scsi_eh_cpy_sense(cmd, sb, sizeof(sb));
 }
 
 static void ata_scsi_sdev_config(struct scsi_device *sdev)
@@ -2277,13 +2280,17 @@ unsigned int ata_scsiop_report_luns(struct 
ata_scsi_args *args, u8 *rbuf,
 
 void ata_scsi_set_sense(struct scsi_cmnd *cmd, u8 sk, u8 asc, u8 ascq)
 {
+   u8 sb[14];
+
+   memset(sb, 0, sizeof(sb));
cmd->result = (DRIVER_SENSE << 24) | SAM_STAT_CHECK_CONDITION;
 
-   cmd->sense_buffer[0] = 0x70;/* fixed format, current */
-   
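
The hunk above is cut off in the archive just as it starts filling in fixed-format sense data. As a stand-alone illustration (not the patch's code), the sketch below builds a 14-byte fixed-format sense block following the SPC fixed-format layout: response code 0x70 in byte 0, sense key in the low nibble of byte 2, additional length in byte 7, and ASC/ASCQ in bytes 12 and 13. It then reads the key back by masking byte 2 with 0x0f. The names build_fixed_sense(), sense_key() and FIXED_SENSE_LEN are hypothetical; the length of 14 is chosen only to match the `u8 sb[14]` declared in the hunk.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical length, matching the 14-byte local buffer in the hunk. */
enum { FIXED_SENSE_LEN = 14 };

/* Fill sb with fixed-format sense data for a current error. */
void build_fixed_sense(uint8_t sb[FIXED_SENSE_LEN], uint8_t sk,
                       uint8_t asc, uint8_t ascq)
{
    memset(sb, 0, FIXED_SENSE_LEN);
    sb[0]  = 0x70;                  /* fixed format, current error */
    sb[2]  = sk & 0x0f;             /* sense key lives in the low nibble */
    sb[7]  = FIXED_SENSE_LEN - 8;   /* additional sense length */
    sb[12] = asc;                   /* additional sense code */
    sb[13] = ascq;                  /* additional sense code qualifier */
}

/* Extract the sense key from fixed-format sense data. */
uint8_t sense_key(const uint8_t *sb)
{
    return sb[2] & 0x0f;
}
```

Building the block in a local array and handing it over in one call is the same shape the converted functions in this patch use with scsi_eh_cpy_sense().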

[PATCH 18/24][RFC] eata: Use of scsi_make_sense() API for DMA-able sense buffer

2008-02-04 Thread Boaz Harrosh
  - Use a pre allocated, DMA mapped, sense buffer at each command,
Using the scsi_make_sense() API. And scsi_return_sense() when
done.
  - mark this driver as: need for a pre allocated sense buffer
at host template.

Signed-off-by: Boaz Harrosh [EMAIL PROTECTED]
---
 drivers/scsi/eata.c |   20 +---
 1 files changed, 13 insertions(+), 7 deletions(-)

diff --git a/drivers/scsi/eata.c b/drivers/scsi/eata.c
index 8be3d76..2d9e086 100644
--- a/drivers/scsi/eata.c
+++ b/drivers/scsi/eata.c
@@ -501,6 +501,7 @@
 #include <scsi/scsi_host.h>
 #include <scsi/scsi_tcq.h>
 #include <scsi/scsicam.h>
+#include <scsi/scsi_eh.h>
 
 static int eata2x_detect(struct scsi_host_template *);
 static int eata2x_release(struct Scsi_Host *);
@@ -524,6 +525,8 @@ static struct scsi_host_template driver_template = {
.this_id = 7,
.unchecked_isa_dma = 1,
.use_clustering = ENABLE_CLUSTERING,
+   .pre_allocate_sense = 1,
+   .sense_buffsize = SCSI_SENSE_BUFFERSIZE,
 };
 
 #if !defined(__BIG_ENDIAN_BITFIELD) && !defined(__LITTLE_ENDIAN_BITFIELD)
@@ -801,6 +804,7 @@ struct mscp {
u_int32_t data_address; /* If sg=0 Data Address, if sg=1 sglist address 
*/
u_int32_t sp_dma_addr;  /* Address where sp is DMA'ed when cp completes 
*/
u_int32_t sense_addr;   /* Address where Sense Data is DMA'ed on error 
*/
+   u8 *sense_buffer;   /* scsi_{make,return}_sense pointer */
 
/* Additional fields begin here. */
struct scsi_cmnd *SCpnt;
@@ -1619,9 +1623,9 @@ static void map_dma(unsigned int i, struct hostdata *ha)
SCpnt = cpp->SCpnt;
pci_dir = SCpnt->sc_data_direction;
 
-   if (SCpnt->sense_buffer)
-   cpp->sense_addr =
-   H2DEV(pci_map_single(ha->pdev, SCpnt->sense_buffer,
+   cpp->sense_buffer = scsi_make_sense(SCpnt);
+   cpp->sense_addr =
+   H2DEV(pci_map_single(ha->pdev, cpp->sense_buffer,
   SCSI_SENSE_BUFFERSIZE, PCI_DMA_FROMDEVICE));
 
cpp->sense_len = SCSI_SENSE_BUFFERSIZE;
@@ -1651,9 +1655,11 @@ static void unmap_dma(unsigned int i, struct hostdata *ha)
SCpnt = cpp->SCpnt;
pci_dir = SCpnt->sc_data_direction;
 
-   if (DEV2H(cpp->sense_addr))
+   if (DEV2H(cpp->sense_addr)) {
pci_unmap_single(ha->pdev, DEV2H(cpp->sense_addr),
 DEV2H(cpp->sense_len), PCI_DMA_FROMDEVICE);
+   scsi_return_sense(SCpnt, cpp->sense_buffer);
+   }
 
scsi_dma_unmap(SCpnt);
 
@@ -2428,7 +2434,7 @@ static irqreturn_t ihdlr(int irq, struct Scsi_Host *shost)
/* Works around a flaw in scsi.c */
else if (tstatus == CHECK_CONDITION
 && SCpnt->device->type == TYPE_DISK
- && (SCpnt->sense_buffer[2] & 0xf) == RECOVERED_ERROR)
+ && (cpp->sense_buffer[2] & 0xf) == RECOVERED_ERROR)
status = DID_BUS_BUSY << 16;
 
else
@@ -2440,13 +2446,13 @@ static irqreturn_t ihdlr(int irq, struct Scsi_Host *shost)
 
if (spp->target_status && SCpnt->device->type == TYPE_DISK &&
(!(tstatus == CHECK_CONDITION && ha->iocount <= 1000 &&
-  (SCpnt->sense_buffer[2] & 0xf) == NOT_READY)))
+  (cpp->sense_buffer[2] & 0xf) == NOT_READY)))
printk("%s: ihdlr, target %d.%d:%d, pid %ld, "
   "target_status 0x%x, sense key 0x%x.\n",
   ha->board_name,
   SCpnt->device->channel, SCpnt->device->id,
   SCpnt->device->lun, SCpnt->serial_number,
-  spp->target_status, SCpnt->sense_buffer[2]);
+  spp->target_status, cpp->sense_buffer[2]);
 
ha->target_to[SCpnt->device->id][SCpnt->device->channel] = 0;
 
-- 
1.5.3.3

-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 17/24][RFC] BusLogic: Use of scsi_make_sense() API for DMA-able sense buffer

2008-02-04 Thread Boaz Harrosh
  - Use a pre-allocated, DMA mapped sense buffer for each command,
    obtained with the scsi_make_sense() API and released with
    scsi_return_sense() when done.
  - Set .pre_allocate_sense in the host template.

Signed-off-by: Boaz Harrosh [EMAIL PROTECTED]
---
 drivers/scsi/BusLogic.c |   24 +++-
 drivers/scsi/BusLogic.h |1 +
 2 files changed, 12 insertions(+), 13 deletions(-)

diff --git a/drivers/scsi/BusLogic.c b/drivers/scsi/BusLogic.c
index 4d3ebb1..5b076f3 100644
--- a/drivers/scsi/BusLogic.c
+++ b/drivers/scsi/BusLogic.c
@@ -42,17 +42,19 @@
 #include <linux/spinlock.h>
 #include <linux/jiffies.h>
 #include <linux/dma-mapping.h>
-#include <scsi/scsicam.h>
 
 #include <asm/dma.h>
 #include <asm/io.h>
 #include <asm/system.h>
 
+#include <scsi/scsicam.h>
 #include <scsi/scsi.h>
 #include <scsi/scsi_cmnd.h>
 #include <scsi/scsi_device.h>
 #include <scsi/scsi_host.h>
 #include <scsi/scsi_tcq.h>
+#include <scsi/scsi_eh.h>
+
 #include "BusLogic.h"
 #include "FlashPoint.c"
 
@@ -309,6 +311,7 @@ static void BusLogic_DeallocateCCB(struct BusLogic_CCB *CCB)
pci_unmap_single(HostAdapter->PCI_Device, CCB->SenseDataPointer,
 CCB->SenseDataLength, PCI_DMA_FROMDEVICE);
 
+   scsi_return_sense(CCB->Command, CCB->sense_buffer);
CCB->Command = NULL;
CCB->Status = BusLogic_CCB_Free;
CCB->Next = HostAdapter->Free_CCBs;
@@ -2627,7 +2630,7 @@ static void BusLogic_ProcessCompletedCCBs(struct BusLogic_HostAdapter *HostAdapt
BusLogic_Notice("\n", HostAdapter);
BusLogic_Notice("Sense ", HostAdapter);
for (i = 0; i < CCB->SenseDataLength; i++)
-   BusLogic_Notice(" %02X", HostAdapter, Command->sense_buffer[i]);
+   BusLogic_Notice(" %02X", HostAdapter, CCB->sense_buffer[i]);
BusLogic_Notice("\n", HostAdapter);
}
}
@@ -2816,16 +2819,6 @@ static int BusLogic_QueueCommand(struct scsi_cmnd *Command, void (*CompletionRou
int Count;
struct BusLogic_CCB *CCB;
/*
-  SCSI REQUEST_SENSE commands will be executed automatically by the Host
-  Adapter for any errors, so they should not be executed explicitly unless
-  the Sense Data is zero indicating that no error occurred.
-*/
-   if (CDB[0] == REQUEST_SENSE && Command->sense_buffer[0] != 0) {
-   Command->result = DID_OK << 16;
-   CompletionRoutine(Command);
-   return 0;
-   }
-   /*
   Allocate a CCB from the Host Adapter's free list.  In the unlikely event
   that there are none available and memory allocation fails, wait 1 second
   and try again.  If that fails, the Host Adapter is probably hung so signal
@@ -2948,7 +2941,10 @@ static int BusLogic_QueueCommand(struct scsi_cmnd *Command, void (*CompletionRou
}
memcpy(CCB->CDB, CDB, CDB_Length);
CCB->SenseDataLength = SCSI_SENSE_BUFFERSIZE;
-   CCB->SenseDataPointer = pci_map_single(HostAdapter->PCI_Device, Command->sense_buffer, CCB->SenseDataLength, PCI_DMA_FROMDEVICE);
+   CCB->sense_buffer = scsi_make_sense(Command);
+   CCB->SenseDataPointer = pci_map_single(HostAdapter->PCI_Device,
+   CCB->sense_buffer, CCB->SenseDataLength,
+   PCI_DMA_FROMDEVICE);
CCB->Command = Command;
Command->scsi_done = CompletionRoutine;
if (BusLogic_MultiMasterHostAdapterP(HostAdapter)) {
@@ -3575,6 +3571,8 @@ static struct scsi_host_template Bus_Logic_template = {
.unchecked_isa_dma = 1,
.max_sectors = 128,
.use_clustering = ENABLE_CLUSTERING,
+   .pre_allocate_sense = 1,
+   .sense_buffsize = SCSI_SENSE_BUFFERSIZE,
 };
 
 /*
diff --git a/drivers/scsi/BusLogic.h b/drivers/scsi/BusLogic.h
index bfbfb5c..6e0a131 100644
--- a/drivers/scsi/BusLogic.h
+++ b/drivers/scsi/BusLogic.h
@@ -890,6 +890,7 @@ struct BusLogic_CCB {
struct BusLogic_CCB *NextAll;
struct BusLogic_ScatterGatherSegment
 ScatterGatherList[BusLogic_ScatterGatherLimit];
+   u8 *sense_buffer;
 };
 
 /*
-- 
1.5.3.3



[PATCH 3/24][RFC] scsi-drivers: more drivers use new scsi_eh_cpy_sense()

2008-02-04 Thread Boaz Harrosh
  All the drivers below had the sense information stored in some
  driver internal structure, which was copied or manipulated
  and set into sense_buffer. So use scsi_eh_cpy_sense() in its
  place. Where the sense data is manipulated, a temporary buffer
  is used, then copied.

  Some places that inspect the sense buffer are converted to use
  the scsi_sense() accessor.
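
  The temporary-buffer pattern mentioned above can be sketched in plain C.
  This is only an illustration, assuming fixed-format sense data of 14 bytes
  as in the 3w and libata hunks of this series; build_fixed_sense() and
  FIXED_SENSE_LEN are hypothetical names, not part of the proposed API:

```c
#include <stdint.h>
#include <string.h>

#define FIXED_SENSE_LEN 14	/* hypothetical: enough to reach ASC/ASCQ */

/*
 * Fill a temporary buffer with fixed-format sense data. A driver would
 * then hand the buffer to scsi_eh_cpy_sense() instead of writing into
 * the command's sense_buffer directly.
 */
static void build_fixed_sense(uint8_t sense[FIXED_SENSE_LEN],
			      uint8_t key, uint8_t asc, uint8_t ascq)
{
	memset(sense, 0, FIXED_SENSE_LEN);
	sense[0] = 0x70;		/* fixed format, current error */
	sense[2] = key & 0x0f;		/* sense key lives in the low nibble */
	sense[7] = FIXED_SENSE_LEN - 8;	/* additional sense length */
	sense[12] = asc;		/* additional sense code */
	sense[13] = ascq;		/* additional sense code qualifier */
}
```

  With such a helper the driver never touches the midlayer's storage until
  the final copy, which is what keeps it independent of where the sense
  buffer ends up living.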

  driver files changed:
drivers/s390/scsi/zfcp_fsf.c
drivers/scsi/3w-.c
drivers/scsi/aacraid/aachba.c
drivers/scsi/aic7xxx/aic79xx_osm.c
drivers/scsi/aic7xxx/aic7xxx_osm.c
drivers/scsi/arcmsr/arcmsr_hba.c
drivers/scsi/ipr.c
drivers/scsi/ips.c
drivers/scsi/megaraid/megaraid_mbox.c
drivers/scsi/ps3rom.c
drivers/scsi/qla4xxx/ql4_isr.c
drivers/scsi/stex.c

Signed-off-by: Boaz Harrosh [EMAIL PROTECTED]
---
 drivers/s390/scsi/zfcp_fsf.c  |   11 ++
 drivers/scsi/3w-.c|   20 +
 drivers/scsi/aacraid/aachba.c |   52 ++--
 drivers/scsi/aic7xxx/aic79xx_osm.c|   34 +++--
 drivers/scsi/aic7xxx/aic7xxx_osm.c|   22 --
 drivers/scsi/arcmsr/arcmsr_hba.c  |   32 
 drivers/scsi/arm/fas216.c |2 +-
 drivers/scsi/ipr.c|   17 ++-
 drivers/scsi/ips.c|   13 
 drivers/scsi/megaraid/megaraid_mbox.c |   24 ++-
 drivers/scsi/ps3rom.c |   31 +++
 drivers/scsi/qla1280.c|   26 
 drivers/scsi/qla4xxx/ql4_isr.c|   12 ++-
 drivers/scsi/stex.c   |   31 +--
 14 files changed, 147 insertions(+), 180 deletions(-)

diff --git a/drivers/s390/scsi/zfcp_fsf.c b/drivers/s390/scsi/zfcp_fsf.c
index 1abbac5..388d218 100644
--- a/drivers/s390/scsi/zfcp_fsf.c
+++ b/drivers/s390/scsi/zfcp_fsf.c
@@ -4209,13 +4209,11 @@ zfcp_fsf_send_fcp_command_task_handler(struct zfcp_fsf_req *fsf_req)
 
/* check for sense data */
if (unlikely(fcp_rsp_iu->validity.bits.fcp_sns_len_valid)) {
+   u8 *sense;
sns_len = FSF_FCP_RSP_SIZE -
sizeof (struct fcp_rsp_iu) + fcp_rsp_iu->fcp_rsp_len;
ZFCP_LOG_TRACE("room for %i bytes sense data in QTCB\n",
   sns_len);
-   sns_len = min(sns_len, (u32) SCSI_SENSE_BUFFERSIZE);
-   ZFCP_LOG_TRACE("room for %i bytes sense data in SCSI command\n",
-  SCSI_SENSE_BUFFERSIZE);
sns_len = min(sns_len, fcp_rsp_iu->fcp_sns_len);
ZFCP_LOG_TRACE("scpnt->result =0x%x, command was:\n",
   scpnt->result);
@@ -4224,10 +4222,9 @@ zfcp_fsf_send_fcp_command_task_handler(struct zfcp_fsf_req *fsf_req)
 
ZFCP_LOG_TRACE("%i bytes sense data provided by FCP\n",
   fcp_rsp_iu->fcp_sns_len);
-   memcpy(scpnt->sense_buffer,
-  zfcp_get_fcp_sns_info_ptr(fcp_rsp_iu), sns_len);
-   ZFCP_HEX_DUMP(ZFCP_LOG_LEVEL_TRACE,
- (void *)scpnt->sense_buffer, sns_len);
+   sense = zfcp_get_fcp_sns_info_ptr(fcp_rsp_iu);
+   scsi_eh_cpy_sense(scpnt, sense, sns_len);
+   ZFCP_HEX_DUMP(ZFCP_LOG_LEVEL_TRACE, sense, sns_len);
}
 
/* check for overrun */
diff --git a/drivers/scsi/3w-.c b/drivers/scsi/3w-.c
index d095321..f5dde3d 100644
--- a/drivers/scsi/3w-.c
+++ b/drivers/scsi/3w-.c
@@ -214,6 +214,7 @@
 #include <scsi/scsi_host.h>
 #include <scsi/scsi_tcq.h>
 #include <scsi/scsi_cmnd.h>
+#include <scsi/scsi_eh.h>
 #include "3w-.h"
 
 /* Globals */
@@ -410,23 +411,30 @@ static int tw_decode_sense(TW_Device_Extension *tw_dev, int request_id, int fill
if ((command->status == 0xc7) || (command->status == 0xcb)) {
for (i = 0; i < ARRAY_SIZE(tw_sense_table); i++) {
if (command->flags == tw_sense_table[i][0]) {
+   u8 sense[14];
+   struct scsi_cmnd *srb =
+   tw_dev->srb[request_id];
 
+   memset(sense, 0, sizeof(sense));
/* Valid bit and 'current errors' */
-   tw_dev->srb[request_id]->sense_buffer[0] = (0x1 << 7 | 0x70);
+   sense[0] = (0x1 << 7 | 0x70);
 
/* Sense key */
-   tw_dev->srb[request_id]->sense_buffer[2] = tw_sense_table[i][1];
+   sense[2] = tw_sense_table[i][1];
 
/* Additional sense length */
-   tw_dev->srb[request_id]->sense_buffer[7] = 0xa; /* 10 bytes 

new scsi sense handling

2008-02-04 Thread Boaz Harrosh
There are 3 usages of sense handling in drivers:

1. Sense is available in a driver internal structure and is mem-copied to the
   upper level.
2. A CHECK_CONDITION status was returned and the driver uses
   scsi_eh_prep_cmnd() for a REQUEST_SENSE invocation to the target, then
   returns the sense in scsi_eh_restore_cmnd(). A variation on this is when
   the driver does nothing, the queue is frozen and the scsi watchdog timer
   does the above.
3. The underlying host adapter does the REQUEST_SENSE and a pre-allocated and
   DMA mapped sense buffer receives the sense information from HW.

Now since all IO requests come with a sense buffer set at request->sense, the
buffer at scsi_cmnd->sense_buffer can go away, and case 1 above can copy its
bits directly into it.
Inspection of all users of blk_execute_rq_nowait() shows that all users set up
a sense buffer, but the submitted code puts in a BUG_ON to make sure new code
does not break that.

For cases 2 and 3 above, two new members in the scsi host template will tell
the scsi-midlayer what the driver wants to do.

.sense_buffsize     - will instruct a mempool_t to be allocated per host with
                      sense buffers of .sense_buffsize size.

.pre_allocate_sense - If set, tells the midlayer to pre-allocate a sense
                      buffer for each command. When set the code will behave
                      like today. If the sense allocation fails the command
                      allocation also fails. (With one extra sense at mempool)
                      If .pre_allocate_sense is not set, then the mempool_t is
                      resized to accommodate one reserved buffer for each
                      target, so a call to scsi_eh_prep_cmnd() will not fail
                      to allocate a sense buffer even in a low memory
                      condition.

Case 1 drivers call scsi_eh_cpy_sense() to transfer their sense information to
  the upper layer. The drivers are completely abstracted from any future
  changes in the scsi and block sense handling.

Case 3 drivers call the new scsi_make_sense()/scsi_return_sense() API to
  retrieve a pointer to an extra DMA-able sense buffer and to return it to the
  free store. scsi_return_sense() will call scsi_eh_cpy_sense() to set the new
  sense information into the upper layer before freeing the buffer back to the
  mempool. If the driver properly set the .pre_allocate_sense flag and
  .sense_buffsize in the host template then these calls are guaranteed to
  succeed.

Case 2 drivers continue to call scsi_eh_prep_cmnd()/scsi_eh_restore_cmnd() to
  reuse the failing command for REQUEST_SENSE. Inside these functions
  scsi_error.c will call the above scsi_make_sense()/scsi_return_sense() for
  the actual sense buffer allocation. In this case there is a guaranteed one
  reserved buffer per scsi_device (.pre_allocate_sense not set). Drivers need
  to set the proper size at .sense_buffsize in the host template for this to
  work.

Upper layer and ULDs that need to inspect the sense information can get to it
  using the scsi_sense() accessor.
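
To make the case 1 path concrete, here is a rough userspace model of the copy
helper and the accessor. It assumes that scsi_eh_cpy_sense() clamps the length
to the available sense storage and clears the remainder, which the cover
letter does not spell out. struct model_cmnd, model_cpy_sense() and
model_sense() are hypothetical stand-ins, not the kernel API, and the 96-byte
size is assumed to match SCSI_SENSE_BUFFERSIZE:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MODEL_SENSE_BUFFERSIZE 96	/* assumed SCSI_SENSE_BUFFERSIZE */

/* Hypothetical stand-in for struct scsi_cmnd; only sense storage is modelled. */
struct model_cmnd {
	uint8_t sense[MODEL_SENSE_BUFFERSIZE];
	size_t sense_len;
};

/* Model of scsi_eh_cpy_sense(): clamp the length, copy, zero the tail. */
static void model_cpy_sense(struct model_cmnd *cmd, const uint8_t *src,
			    size_t len)
{
	if (len > MODEL_SENSE_BUFFERSIZE)
		len = MODEL_SENSE_BUFFERSIZE;
	memcpy(cmd->sense, src, len);
	memset(cmd->sense + len, 0, MODEL_SENSE_BUFFERSIZE - len);
	cmd->sense_len = len;
}

/* Model of the scsi_sense() accessor used by upper layers and ULDs. */
static const uint8_t *model_sense(const struct model_cmnd *cmd)
{
	return cmd->sense;
}
```

The point of routing every driver through one copy helper and one accessor is
that the storage behind them can later move from the command to the request
without touching the drivers again.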

Submitted for review and comments (RFC) is a patchset to that effect.

- first patch will introduce the new API implemented over existing members:
scsi_eh: Define  new API for sense handling

- first group of patches are the case 1 drivers:
scsi-drivers: Move to new sense API. The Trevial case
scsi-drivers: more drivers use new scsi_eh_cpy_sense()
firewire  ieee1394: Simple convert to new scsi_eh_cpy_sense.
dpt_i2o: Use new scsi_eh_cpy_sense()
gdth: Use of scsi_eh API and sense accessors
qla2xxx: convert to new scsi_eh_cpy_sense API
isd200, transport: Use scsi_eh_cpy_sense API
ultrastor: Use scsi_eh_cpy_sense() API
usb/microtek: No special handling for REQUEST_SENSE command please

- second group are case 2 drivers:
libata: Use scsi_eh API for REQUEST_SENSE invocation
53c700: Use scsi_eh API for REQUEST_SENSE invocation
aic7xxx_old: Use scsi_eh API for REQUEST_SENSE invocation
dc395x: Use scsi_eh API for REQUEST_SENSE invocation
tmscsim: Use scsi_eh API for REQUEST_SENSE invocation
Add .sense_buffsize to drivers that use scsi_eh_prep_cmnd

- third group are case 3:
BusLogic: Use of scsi_make_sense() API for DMA-able sense buffer
eata: Use of scsi_make_sense() API for DMA-able sense buffer
initio: Use of scsi_make_sense() API for DMA-able sense buffer
u14-34f: Use of scsi_make_sense() API for DMA-able sense buffer

- Then, some patches for the upper layer:
scsi_tgt: use of sense accessors
scsi upper layer use of sense accessors

- And finally, block and scsi-midlayer switch to new handling.
block: Minor changes to sense handling
scsi: New sense handling

These patches are over the latest scsi-misc. They are completely *untested*
and are for discussion only. I will debug and update in the near future, but
only if it is accepted in principle first.

Lots of 

[PATCH 3/3] libata: implement drain buffers

2008-02-04 Thread James Bottomley
This just updates the libata slave configure routine to take advantage
of the block layer drain buffers.  It also adjusts the size lengths in
the atapi code to add the drain buffer to the DMA length so the driver
knows it can rely on it.

I suspect I should also be checking for AHCI as well as ATA_DEV_ATAPI,
but I couldn't see how to do that easily.
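
The length adjustment described above can be sketched outside the kernel. This
is a simplified model only: struct model_queue, struct model_request and both
functions are hypothetical stand-ins for the block-layer objects, and the
ATAPI test is reduced to a flag:

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the block-layer objects. */
struct model_queue {
	void *dma_drain_buffer;
	unsigned int dma_drain_size;
};

struct model_request {
	struct model_queue *q;
};

/* Mirrors the idea of blk_rq_drain_size(): report the queue's drain size. */
static unsigned int model_rq_drain_size(const struct model_request *rq)
{
	return rq->q->dma_drain_size;
}

/*
 * DMA length the low-level driver is told about: the payload length,
 * plus the drain area when the device is ATAPI.
 */
static unsigned int model_atapi_nbytes(const struct model_request *rq,
				       unsigned int bufflen, int is_atapi)
{
	unsigned int nbytes = bufflen;

	if (is_atapi)
		nbytes += model_rq_drain_size(rq);
	return nbytes;
}
```

Because the drain buffer hangs off the queue, every request on that queue sees
the same extra length, and a queue with no drain buffer contributes zero.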

Signed-off-by: James Bottomley [EMAIL PROTECTED]
---
 drivers/ata/libata-scsi.c |   30 ++
 1 files changed, 26 insertions(+), 4 deletions(-)

diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
index 844..acf6a8b 100644
--- a/drivers/ata/libata-scsi.c
+++ b/drivers/ata/libata-scsi.c
@@ -826,8 +826,8 @@ static void ata_scsi_sdev_config(struct scsi_device *sdev)
sdev->max_device_blocked = 1;
 }
 
-static void ata_scsi_dev_config(struct scsi_device *sdev,
-   struct ata_device *dev)
+static int ata_scsi_dev_config(struct scsi_device *sdev,
+  struct ata_device *dev)
 {
/* configure max sectors */
blk_queue_max_sectors(sdev->request_queue, dev->max_sectors);
@@ -839,6 +839,16 @@ static void ata_scsi_dev_config(struct scsi_device *sdev,
sdev->manage_start_stop = 1;
}
 
+   if (dev->class == ATA_DEV_ATAPI) {
+   struct request_queue *q = sdev->request_queue;
+   void *buf = kmalloc(ATAPI_MAX_DRAIN, GFP_KERNEL);
+   if (!buf) {
+   sdev_printk(KERN_ERR, sdev, "drain buffer allocation failed\n");
+   return -ENOMEM;
+   }
+   blk_queue_dma_drain(q, buf, ATAPI_MAX_DRAIN);
+   }
+
if (dev->flags & ATA_DFLAG_AN)
set_bit(SDEV_EVT_MEDIA_CHANGE, sdev->supported_events);
 
@@ -849,6 +859,8 @@ static void ata_scsi_dev_config(struct scsi_device *sdev,
depth = min(ATA_MAX_QUEUE - 1, depth);
scsi_adjust_queue_depth(sdev, MSG_SIMPLE_TAG, depth);
}
+
+   return 0;
 }
 
 /**
@@ -867,13 +879,14 @@ int ata_scsi_slave_config(struct scsi_device *sdev)
 {
struct ata_port *ap = ata_shost_to_port(sdev->host);
struct ata_device *dev = __ata_scsi_find_dev(ap, sdev);
+   int rc = 0;
 
ata_scsi_sdev_config(sdev);
 
if (dev)
-   ata_scsi_dev_config(sdev, dev);
+   rc = ata_scsi_dev_config(sdev, dev);
 
-   return 0;
+   return rc;
 }
 
 /**
@@ -895,6 +908,7 @@ void ata_scsi_slave_destroy(struct scsi_device *sdev)
struct ata_port *ap = ata_shost_to_port(sdev->host);
unsigned long flags;
struct ata_device *dev;
+   struct request_queue *q = sdev->request_queue;
 
if (!ap->ops->error_handler)
return;
@@ -908,6 +922,10 @@ void ata_scsi_slave_destroy(struct scsi_device *sdev)
ata_port_schedule_eh(ap);
}
spin_unlock_irqrestore(ap->lock, flags);
+
+   kfree(q->dma_drain_buffer);
+   q->dma_drain_buffer = NULL;
+   q->dma_drain_size = 0;
 }
 
 /**
@@ -2478,6 +2496,8 @@ static unsigned int atapi_xlat(struct ata_queued_cmd *qc)
 
qc->tf.command = ATA_CMD_PACKET;
qc->nbytes = scsi_bufflen(scmd);
+   if (blk_pc_request(scmd->request))
+   qc->nbytes += blk_rq_drain_size(scmd->request);
 
/* check whether ATAPI DMA is safe */
if (!using_pio && ata_check_atapi_dma(qc))
@@ -2814,6 +2834,8 @@ static unsigned int ata_scsi_pass_thru(struct ata_queued_cmd *qc)
 *   cover scatter/gather case.
 */
qc->nbytes = scsi_bufflen(scmd);
+   if (ata_is_atapi(qc->tf.protocol))
+   qc->nbytes += blk_rq_drain_size(scmd->request);
 
/* request result TF and be quiet about device error */
qc->flags |= ATA_QCFLAG_RESULT_TF | ATA_QCFLAG_QUIET;
-- 
1.5.3.8





[PATCH 2/3] block: add blk_rq_drain_size() API

2008-02-04 Thread James Bottomley
Signed-off-by: James Bottomley [EMAIL PROTECTED]
---
 include/linux/blkdev.h |5 +
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 90392a9..a526066 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -676,6 +676,11 @@ extern void blk_complete_request(struct request *);
 extern unsigned int blk_rq_bytes(struct request *rq);
 extern unsigned int blk_rq_cur_bytes(struct request *rq);
 
+static inline int blk_rq_drain_size(struct request *rq)
+{
+   return rq->q->dma_drain_size;
+}
+
 static inline void blkdev_dequeue_request(struct request *req)
 {
elv_dequeue_request(req->q, req);
-- 
1.5.3.8





[PATCH 8/24][RFC] isd200, transport: Use scsi_eh_cpy_sense API

2008-02-04 Thread Boaz Harrosh
  Use of new scsi_eh_cpy_sense() to set sense information into the
  command.

Signed-off-by: Boaz Harrosh [EMAIL PROTECTED]
---
 drivers/usb/storage/isd200.c|   14 +++---
 drivers/usb/storage/transport.c |   26 +-
 2 files changed, 20 insertions(+), 20 deletions(-)

diff --git a/drivers/usb/storage/isd200.c b/drivers/usb/storage/isd200.c
index 8a761b6..9544728 100644
--- a/drivers/usb/storage/isd200.c
+++ b/drivers/usb/storage/isd200.c
@@ -53,6 +53,7 @@
 #include <scsi/scsi.h>
 #include <scsi/scsi_cmnd.h>
 #include <scsi/scsi_device.h>
+#include <scsi/scsi_eh.h>
 
 #include "usb.h"
 #include "transport.h"
@@ -365,8 +366,9 @@ struct sense_data {
 static void isd200_build_sense(struct us_data *us, struct scsi_cmnd *srb)
 {
struct isd200_info *info = (struct isd200_info *)us->extra;
-   struct sense_data *buf = (struct sense_data *) &srb->sense_buffer[0];
-   unsigned char error = info->ATARegs[ATA_REG_ERROR_OFFSET];
+   struct sense_data sense;
+   struct sense_data *buf = &sense;
+   unsigned char error = info->ATARegs[IDE_ERROR_OFFSET];
 
if(error & ATA_ERROR_MEDIA_CHANGE) {
buf->ErrorCode = 0x70 | SENSE_ERRCODE_VALID;
@@ -399,6 +401,9 @@ static void isd200_build_sense(struct us_data *us, struct scsi_cmnd *srb)
buf->AdditionalSenseCode = 0;
buf->AdditionalSenseCodeQualifier = 0;
}
+   scsi_eh_cpy_sense(srb, &sense, sizeof(sense));
+
+   srb->result = buf->Flags ? SAM_STAT_CHECK_CONDITION : SAM_STAT_GOOD;
 }
 
 
@@ -639,11 +644,6 @@ static void isd200_invoke_transport( struct us_data *us,
}
if (result == ISD200_GOOD) {
isd200_build_sense(us, srb);
-   srb->result = SAM_STAT_CHECK_CONDITION;
-
-   /* If things are really okay, then let's show that */
-   if ((srb->sense_buffer[2] & 0xf) == 0x0)
-   srb->result = SAM_STAT_GOOD;
} else {
srb->result = DID_ERROR << 16;
/* Need reset here */
diff --git a/drivers/usb/storage/transport.c b/drivers/usb/storage/transport.c
index d9f4912..03b5e22 100644
--- a/drivers/usb/storage/transport.c
+++ b/drivers/usb/storage/transport.c
@@ -635,15 +635,15 @@ void usb_stor_invoke_transport(struct scsi_cmnd *srb, struct us_data *us)
 
US_DEBUGP("-- Result from auto-sense is %d\n", temp_result);
US_DEBUGP("-- code: 0x%x, key: 0x%x, ASC: 0x%x, ASCQ: 0x%x\n",
- srb->sense_buffer[0],
- srb->sense_buffer[2] & 0xf,
- srb->sense_buffer[12], 
- srb->sense_buffer[13]);
+ scsi_sense(srb)[0],
+ scsi_sense(srb)[2] & 0xf,
+ scsi_sense(srb)[12],
+ scsi_sense(srb)[13]);
 #ifdef CONFIG_USB_STORAGE_DEBUG
usb_stor_show_sense(
- srb->sense_buffer[2] & 0xf,
- srb->sense_buffer[12], 
- srb->sense_buffer[13]);
+ scsi_sense(srb)[2] & 0xf,
+ scsi_sense(srb)[12],
+ scsi_sense(srb)[13]);
 #endif
 
/* set the result so the higher layers expect this data */
@@ -654,12 +654,12 @@ void usb_stor_invoke_transport(struct scsi_cmnd *srb, struct us_data *us)
 * we did an unsolicited auto-sense. */
if (result == USB_STOR_TRANSPORT_GOOD &&
/* Filemark 0, ignore EOM, ILI 0, no sense */
-   (srb->sense_buffer[2] & 0xaf) == 0 &&
+   (scsi_sense(srb)[2] & 0xaf) == 0 &&
/* No ASC or ASCQ */
-   srb->sense_buffer[12] == 0 &&
-   srb->sense_buffer[13] == 0) {
+   scsi_sense(srb)[12] == 0 &&
+   scsi_sense(srb)[13] == 0) {
srb->result = SAM_STAT_GOOD;
-   srb->sense_buffer[0] = 0x0;
+   scsi_eh_reset_sense(srb);
}
}
 
@@ -1056,8 +1056,8 @@ int usb_stor_Bulk_transport(struct scsi_cmnd *srb, struct us_data *us)
case US_BULK_STAT_OK:
/* device babbled -- return fake sense data */
if (fake_sense) {
-   memcpy(srb->sense_buffer, 
-  usb_stor_sense_invalidCDB, 
+   scsi_eh_cpy_sense(srb,
+  usb_stor_sense_invalidCDB,
   sizeof(usb_stor_sense_invalidCDB));
return USB_STOR_TRANSPORT_NO_SENSE;
}
-- 
1.5.3.3


Re: [PATCH 4/24][RFC] firewire ieee1394: Simple convert to new scsi_eh_cpy_sense.

2008-02-04 Thread Stefan Richter
Boaz Harrosh wrote:
 --- a/drivers/ieee1394/sbp2.c
 +++ b/drivers/ieee1394/sbp2.c
 @@ -1672,8 +1673,11 @@ static int sbp2_send_command(struct sbp2_lu *lu, 
 struct scsi_cmnd *SCpnt,
   * Translates SBP-2 status into SCSI sense data for check conditions
   */
  static unsigned int sbp2_status_to_sense_data(unchar *sbp2_status,
 -   unchar *sense_data)
 +   struct scsi_cmnd *SCpnt)
  {
 + u8 sense_data[16];
 +
 + memset(sense_data, 0, sizeof(sense_data));
   /* OK, it's pretty ugly... ;-) */
   sense_data[0] = 0x70;
   sense_data[1] = 0x0;
 @@ -1691,6 +1695,7 @@ static unsigned int sbp2_status_to_sense_data(unchar 
 *sbp2_status,
   sense_data[13] = sbp2_status[11];
   sense_data[14] = sbp2_status[20];
   sense_data[15] = sbp2_status[21];
 + scsi_eh_cpy_sense(SCpnt, sense_data, sizeof(sense_data));
  
    return sbp2_status[8] & 0x3f;
  }

You don't need the memset.

Also, here and in drivers/firewire/fw-sbp2.c, the SCSI sense data could
AFAICS be rewritten in-place in sbp2_status.  But I don't know if this
is a worthwhile optimization; it would reduce readability.
-- 
Stefan Richter
-=-==--- --=- --=--
http://arcgraph.de/sr/


Re: [patch] pci: pci_enable_device_bars() fix

2008-02-04 Thread Andrew Morton
On Mon, 4 Feb 2008 13:57:36 +0100 Ingo Molnar [EMAIL PROTECTED] wrote:

 
 * Jeff Garzik [EMAIL PROTECTED] wrote:
 
  Ingo Molnar wrote:
  so please tell me Jeff. If Greg, who is the super-maintainer of your 
  code area, and who deals with your code every day and changes it 
  every minute and hour, simply did not Cc: the SCSI list - how am i, a 
  largely outside party in this matter, supposed to notice that 3 
  maintainers and 3 mailing lists in the Cc: were somehow not enough 
  and that i was supposed to grow the already sizable Cc: list even 
  more?
 
  Because, regardless of the situation, it's both common courtesy and 
  wise practice to CC relevant driver maintainers, when you touch a 
  driver.
 
  And it's just common sense: Greg simply does not know the intimate 
  details of every PCI driver.  Nor do I.  Nor you.
 
  In the case of lpfc here, we have an active driver maintainer, and an 
  up-to-date MAINTAINERS entry.  Even if you are too slack to read 
  MAINTAINERS, 'git log' would have given you the same info.
 
  Don't pretend there is some benefit here to ignoring the people that 
  best know the driver.  I don't buy that; it simply makes no 
  engineering sense whatsoever.
 
 what you _STILL_ do not realize is the following: you still attribute 
 the lack of Cc:s to some intention of mine. No, it was not my intention. 
 At first glance the Cc: looked large and complete enough in an 
 _existing_ discussion and that was the end of my (brief) attention 
 regarding the Cc: line. Yes, it would have been a bit better had i 
 noticed the lack of Cc:s in an existing discussion, but i didn't.

Actually I (and probably others) generally avoid cc'ing mailing lists on
patch traffic.  I spew out enough script-generated traffic as it is.

 ...
   mailing list aliases to get the 'guaranteed attention' of maintainers 


whoa.  You must know better mailing lists than I do ;)



[PATCH 1/3] libata: eliminate the home grown dma padding in favour of that provided by the block layer

2008-02-04 Thread James Bottomley
ATA requires that all DMA transfers begin and end on word boundaries.
Because of this, a large amount of machinery grew up in ide to adjust
scatterlists on this basis.  However, as of 2.5, the block layer has a
dma_alignment variable which ensures both the beginning and length of a
DMA transfer are aligned on the dma_alignment boundary.  Although the
block layer does adjust the beginning of the transfer to ensure this
happens, it doesn't actually adjust the length, it merely makes sure
that space is allocated for transfers beyond the declared length.  The
upshot of this is that scatterlists may be padded to any size between
the actual length and the length adjusted to the dma_alignment safely
knowing that memory is allocated in this region.

Right at the moment, SCSI takes the default dma_alignment which is on a
512 byte boundary.  Note that this alignment only applies to transfers
coming in from user space.  However, since all kernel allocations are
automatically aligned on a minimum of 32 byte boundaries, it is safe to
adjust them in this manner as well.
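
As a small illustration of the arithmetic involved, the sketch below computes
how many bytes are needed to end a transfer on a 32-bit boundary (the bound
used by the padding code this patch removes) and shows that the padded length
never exceeds the length rounded up to a 512 byte alignment. The helper names
are hypothetical and this is not code from the patch:

```c
#include <stddef.h>

/* Round len up to the next multiple of align; align must be a power of two. */
static size_t round_up_pow2(size_t len, size_t align)
{
	return (len + align - 1) & ~(align - 1);
}

/* Bytes of padding needed so that a transfer ends on a 4-byte boundary. */
static size_t pad_bytes(size_t nbytes)
{
	return round_up_pow2(nbytes, 4) - nbytes;
}
```

For a 13 byte transfer pad_bytes() gives 3, so the padded length is 16, well
inside the 512 bytes the block layer has already guaranteed to be allocated.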

Signed-off-by: James Bottomley [EMAIL PROTECTED]
---
 drivers/ata/ahci.c|5 --
 drivers/ata/libata-core.c |  146 +---
 drivers/ata/libata-scsi.c |   23 +-
 drivers/ata/pata_icside.c |8 --
 drivers/ata/sata_fsl.c|   17 +
 drivers/ata/sata_mv.c |6 +--
 drivers/ata/sata_sil24.c  |5 --
 drivers/scsi/ipr.c|4 +-
 drivers/scsi/libsas/sas_ata.c |4 +-
 include/linux/libata.h|   28 +
 10 files changed, 31 insertions(+), 215 deletions(-)

diff --git a/drivers/ata/ahci.c b/drivers/ata/ahci.c
index 27c8d56..e75966b 100644
--- a/drivers/ata/ahci.c
+++ b/drivers/ata/ahci.c
@@ -1979,16 +1979,11 @@ static int ahci_port_start(struct ata_port *ap)
struct ahci_port_priv *pp;
void *mem;
dma_addr_t mem_dma;
-   int rc;
 
pp = devm_kzalloc(dev, sizeof(*pp), GFP_KERNEL);
if (!pp)
return -ENOMEM;
 
-   rc = ata_pad_alloc(ap, dev);
-   if (rc)
-   return rc;
-
mem = dmam_alloc_coherent(dev, AHCI_PORT_PRIV_DMA_SZ, &mem_dma,
  GFP_KERNEL);
if (!mem)
diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index bdbd55a..679a404 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -60,6 +60,8 @@
 #include <linux/io.h>
 #include <scsi/scsi.h>
 #include <scsi/scsi_cmnd.h>
+#include <scsi/scsi_device.h>
+#include <scsi/scsi_dbg.h>
 #include <scsi/scsi_host.h>
 #include <linux/libata.h>
 #include <asm/semaphore.h>
@@ -4476,30 +4478,13 @@ void ata_sg_clean(struct ata_queued_cmd *qc)
struct ata_port *ap = qc->ap;
struct scatterlist *sg = qc->sg;
int dir = qc->dma_dir;
-   void *pad_buf = NULL;
 
WARN_ON(sg == NULL);
 
-   VPRINTK("unmapping %u sg elements\n", qc->mapped_n_elem);
+   VPRINTK("unmapping %u sg elements\n", qc->n_elem);
 
-   /* if we padded the buffer out to 32-bit bound, and data
-* xfer direction is from-device, we must copy from the
-* pad buffer back into the supplied buffer
-*/
-   if (qc->pad_len && !(qc->tf.flags & ATA_TFLAG_WRITE))
-   pad_buf = ap->pad + (qc->tag * ATA_DMA_PAD_SZ);
-
-   if (qc->mapped_n_elem)
-   dma_unmap_sg(ap->dev, sg, qc->mapped_n_elem, dir);
-   /* restore last sg */
-   if (qc->last_sg)
-   *qc->last_sg = qc->saved_last_sg;
-   if (pad_buf) {
-   struct scatterlist *psg = &qc->extra_sg[1];
-   void *addr = kmap_atomic(sg_page(psg), KM_IRQ0);
-   memcpy(addr + psg->offset, pad_buf, qc->pad_len);
-   kunmap_atomic(addr, KM_IRQ0);
-   }
+   if (qc->n_elem)
+   dma_unmap_sg(ap->dev, sg, qc->n_elem, dir);
 
qc->flags &= ~ATA_QCFLAG_DMAMAP;
qc->sg = NULL;
@@ -4765,97 +4750,6 @@ void ata_sg_init(struct ata_queued_cmd *qc, struct scatterlist *sg,
qc->cursg = qc->sg;
 }
 
-static unsigned int ata_sg_setup_extra(struct ata_queued_cmd *qc,
-  unsigned int *n_elem_extra,
-  unsigned int *nbytes_extra)
-{
-   struct ata_port *ap = qc->ap;
-   unsigned int n_elem = qc->n_elem;
-   struct scatterlist *lsg, *copy_lsg = NULL, *tsg = NULL, *esg = NULL;
-
-   *n_elem_extra = 0;
-   *nbytes_extra = 0;
-
-   /* needs padding? */
-   qc->pad_len = qc->nbytes & 3;
-
-   if (likely(!qc->pad_len))
-   return n_elem;
-
-   /* locate last sg and save it */
-   lsg = sg_last(qc->sg, n_elem);
-   qc->last_sg = lsg;
-   qc->saved_last_sg = *lsg;
-
-   sg_init_table(qc->extra_sg, ARRAY_SIZE(qc->extra_sg));
-
-   if (qc->pad_len) {
-   struct scatterlist *psg = &qc->extra_sg[1];
-   void *pad_buf = ap->pad + (qc->tag * ATA_DMA_PAD_SZ);
-   unsigned int 

Re: [patch] pci: pci_enable_device_bars() fix

2008-02-04 Thread Ingo Molnar

* Jeff Garzik [EMAIL PROTECTED] wrote:

 Ingo Molnar wrote:
 so please tell me Jeff. If Greg, who is the super-maintainer of your 
 code area, and who deals with your code every day and changes it 
 every minute and hour, simply did not Cc: the SCSI list - how am i, a 
 largely outside party in this matter, supposed to notice that 3 
 maintainers and 3 mailing lists in the Cc: were somehow not enough 
 and that i was supposed to grow the already sizable Cc: list even 
 more?

 Because, regardless of the situation, it's both common courtesy and 
 wise practice to CC relevant driver maintainers, when you touch a 
 driver.

 And it's just common sense: Greg simply does not know the intimate 
 details of every PCI driver.  Nor do I.  Nor you.

 In the case of lpfc here, we have an active driver maintainer, and an 
 up-to-date MAINTAINERS entry.  Even if you are too slack to read 
 MAINTAINERS, 'git log' would have given you the same info.

 Don't pretend there is some benefit here to ignoring the people that 
 best know the driver.  I don't buy that; it simply makes no 
 engineering sense whatsoever.

what you _STILL_ do not realize is the following: you still attribute 
the lack of Cc:s to some intention of mine. No, it was not my intention. 
At first glance the Cc: looked large and complete enough in an 
_existing_ discussion and that was the end of my (brief) attention 
regarding the Cc: line. Yes, it would have been a bit better had i 
noticed the lack of Cc:s in an existing discussion, but i didn't.

[ And it might not surprise you if i observe here that i think this
  little mishap is further (incidental) proof that having this many
  mailing list aliases to get the 'guaranteed attention' of maintainers 
  is just super fragile and does not serve users at all. Even Greg and i 
  got it wrong accidentally. If _we_ get it wrong, who will get it 
  right? I see several mis-Cc:ed emails every day and there are over 
  1000 unfixed bugs in bugzilla for 2.6 alone. It does not take a genius 
  to observe that something is fundamentally wrong ;-) That 2.6 works on
  your or my box is a _very_ poor metric - we both fix bugs on our 
  systems super fast so we've got a very biased first-hand experience 
  about the stability and reliability of the kernel. We never really 
  feel the kind of frustration that testers feel when their mail to lkml 
  gets ignored. We never really feel the helplessness that comes from 
  unfixed bugs. ]

Ingo
-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 03/12] qla4xxx: have qla4xxx use iscsi class session state check ready

2008-02-04 Thread David Somayajulu

Mike Christie wrote :
 This has qla4xxx use the iscsi class's check ready function
 in the queue command function, so all iscsi drivers return the
 same error value for common problems.
 
 Signed-off-by: Mike Christie [EMAIL PROTECTED]
 ---
  drivers/scsi/qla4xxx/ql4_os.c |   12 
  1 files changed, 12 insertions(+), 0 deletions(-)
 
 diff --git a/drivers/scsi/qla4xxx/ql4_os.c b/drivers/scsi/qla4xxx/ql4_os.c
 index a87fb9f..437d169 100644
 --- a/drivers/scsi/qla4xxx/ql4_os.c
 +++ b/drivers/scsi/qla4xxx/ql4_os.c
 @@ -398,9 +398,21 @@ static int qla4xxx_queuecommand(struct scsi_cmnd *cmd,
  {
   struct scsi_qla_host *ha = to_qla_host(cmd->device->host);
   struct ddb_entry *ddb_entry = cmd->device->hostdata;
 + struct iscsi_cls_session *sess = ddb_entry->sess;
   struct srb *srb;
   int rval;
  
 + if (!sess) {
 + cmd->result = DID_IMM_RETRY << 16;
 + goto qc_fail_command;
 + }
 +
 + rval = iscsi_session_chkready(sess);
 + if (rval) {
 + cmd->result = rval;
 + goto qc_fail_command;
 + }
 +
   if (atomic_read(&ddb_entry->state) != DDB_STATE_ONLINE) {
   if (atomic_read(&ddb_entry->state) == DDB_STATE_DEAD) {
   cmd->result = DID_NO_CONNECT << 16;
 -- 
 1.5.2.1
Acked-by: David Somayajulu [EMAIL PROTECTED]
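
The ordering of checks that the patch adds to the queue command path can be modeled in user space roughly as follows. This is only a sketch under stated assumptions: `struct session`, `session_chkready()`, `queuecommand()` and the `SESSION_*` states are hypothetical stand-ins for the iscsi class objects, and the numeric `DID_*` values are assumed to match include/scsi/scsi.h of that era. Only the order of the checks and the shift by 16 into the host byte mirror the patch.

```c
#include <stddef.h>

/* Host-byte codes; numeric values assumed from include/scsi/scsi.h. */
#define DID_OK          0x00
#define DID_NO_CONNECT  0x01
#define DID_IMM_RETRY   0x0c

/* Hypothetical session states standing in for the iscsi class state. */
enum session_state { SESSION_LOGGED_IN, SESSION_FAILED, SESSION_FREE };

struct session {
	enum session_state state;
};

/* Stand-in for iscsi_session_chkready(): 0 if ready, else a result value. */
static int session_chkready(const struct session *sess)
{
	switch (sess->state) {
	case SESSION_LOGGED_IN:
		return 0;
	case SESSION_FAILED:
		return DID_IMM_RETRY << 16;   /* recovery pending: retry */
	default:
		return DID_NO_CONNECT << 16;  /* session is gone */
	}
}

/* Returns what a driver would store in cmd->result; 0 means "queued". */
static int queuecommand(const struct session *sess)
{
	int rval;

	if (!sess)                            /* no session attached yet */
		return DID_IMM_RETRY << 16;

	rval = session_chkready(sess);        /* common class-level check */
	if (rval)
		return rval;

	return DID_OK << 16;                  /* driver-specific checks follow */
}
```

The point of the design is that the session-level decision lives in one shared function, so every driver reports the same host byte for the same session condition.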


Re: Integration of SCST in the mainstream Linux kernel

2008-02-04 Thread Vladislav Bolkhovitin

James Bottomley wrote:
So, James, what is your opinion on the above? Or the overall SCSI target 
project simplicity doesn't matter much for you and you think it's fine 
to duplicate Linux page cache in the user space to keep the in-kernel 
part of the project as small as possible?



The answers were pretty much contained here

http://marc.info/?l=linux-scsi&m=120164008302435

and here:

http://marc.info/?l=linux-scsi&m=120171067107293

Weren't they?


No, sorry, it doesn't look so to me. They are about performance, but 
I'm asking about the overall project's architecture, namely about one 
part of it: simplicity. In particular, what do you think about 
duplicating the Linux page cache in user space to have zero-copy cached 
I/O? Or can you suggest another architectural solution for that problem 
in STGT's approach?



Isn't that an advantage of a user space solution?  It simply uses the
backing store of whatever device supplies the data.  That means it takes
advantage of the existing mechanisms for caching.


No, please reread this thread, especially this message: 
http://marc.info/?l=linux-kernel&m=120169189504361&w=2. This is one of 
the advantages of the kernel space implementation. The user space 
implementation has to have data copied between the cache and user space 
buffer, but the kernel space one can use pages in the cache directly, 
without extra copy.


Vlad


Re: Integration of SCST in the mainstream Linux kernel

2008-02-04 Thread Vladislav Bolkhovitin

James Bottomley wrote:

On Mon, 2008-02-04 at 20:16 +0300, Vladislav Bolkhovitin wrote:


James Bottomley wrote:

So, James, what is your opinion on the above? Or the overall SCSI target 
project simplicity doesn't matter much for you and you think it's fine 
to duplicate Linux page cache in the user space to keep the in-kernel 
part of the project as small as possible?



The answers were pretty much contained here

http://marc.info/?l=linux-scsi&m=120164008302435

and here:

http://marc.info/?l=linux-scsi&m=120171067107293

Weren't they?


No, sorry, it doesn't look so to me. They are about performance, but 
I'm asking about the overall project's architecture, namely about one 
part of it: simplicity. In particular, what do you think about 
duplicating the Linux page cache in user space to have zero-copy cached 
I/O? Or can you suggest another architectural solution for that problem 
in STGT's approach?



Isn't that an advantage of a user space solution?  It simply uses the
backing store of whatever device supplies the data.  That means it takes
advantage of the existing mechanisms for caching.


No, please reread this thread, especially this message: 
http://marc.info/?l=linux-kernel&m=120169189504361&w=2. This is one of 
the advantages of the kernel space implementation. The user space 
implementation has to have data copied between the cache and user space 
buffer, but the kernel space one can use pages in the cache directly, 
without extra copy.



Well, you've said it thrice (the bellman cried) but that doesn't make it
true.

The way a user space solution should work is to schedule mmapped I/O
from the backing store and then send this mmapped region off for target
I/O.  For reads, the page gather will ensure that the pages are up to
date from the backing store to the cache before sending the I/O out.
For writes, you actually have to do an msync on the region to get the
data secured to the backing store. 


James, have you checked how fast mmapped I/O is if work size > size of 
RAM? It's several times slower compared to buffered I/O. This has been 
discussed many times on LKML and, it seems, the VM people consider it 
unavoidable. So, using mmapped I/O isn't an option for high performance. 
Plus, mmapped I/O isn't an option for high reliability requirements, 
since it doesn't provide a practical way to handle I/O errors.



You also have to pull tricks with
the mmap region in the case of writes to prevent useless data being read
in from the backing store.


Can you be more exact and specify what kind of tricks should be done for 
that?



 However, none of this involves data copies.

James
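
For reference, the write path described above (map the backing store, let the data land in the mapping, then msync) can be sketched with plain POSIX calls. This is an illustrative sketch, not STGT code: `mmap_write()` is a hypothetical helper, the `memcpy()` merely stands in for the transport placing data into the mapped pages, and the offset is assumed to be page aligned with the file already large enough to cover the range.

```c
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>

/*
 * Write len bytes at a page-aligned offset of the backing file through a
 * shared mapping, then msync() so the data reaches the backing store.
 * Returns 0 on success, -1 on failure.
 */
static int mmap_write(int fd, off_t off, const void *data, size_t len)
{
	void *map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, off);
	int ret;

	if (map == MAP_FAILED)
		return -1;

	memcpy(map, data, len);          /* stands in for the target I/O */
	ret = msync(map, len, MS_SYNC);  /* secure the data to the backing store */
	munmap(map, len);
	return ret;
}
```

Note that only mmap() and msync() report errors through return values here. A failure while touching the mapped pages is delivered as a signal (SIGBUS) rather than an error code, which is the error-handling difficulty raised in the reply above.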







[PATCH 7/24][RFC] qla2xxx: convert to new scsi_eh_cpy_sense API

2008-02-04 Thread Boaz Harrosh
  This driver is special in that it reads sense data in parts
  until done. The same mechanics are kept here, but the data is read
  into a driver-internal buffer, which is then copied into the command
  with scsi_eh_cpy_sense() when done.

Signed-off-by: Boaz Harrosh [EMAIL PROTECTED]
---
 drivers/scsi/qla2xxx/qla_def.h |4 ++--
 drivers/scsi/qla2xxx/qla_isr.c |   17 +++--
 2 files changed, 9 insertions(+), 12 deletions(-)

diff --git a/drivers/scsi/qla2xxx/qla_def.h b/drivers/scsi/qla2xxx/qla_def.h
index b72c7f1..c2109f6 100644
--- a/drivers/scsi/qla2xxx/qla_def.h
+++ b/drivers/scsi/qla2xxx/qla_def.h
@@ -195,8 +195,9 @@ typedef struct srb {
/* Single transfer DMA context */
dma_addr_t dma_handle;
 
-   uint32_t request_sense_length;
+   uint16_t request_sense_length;
uint8_t *request_sense_ptr;
+   uint8_t sense_buffer[SCSI_SENSE_BUFFERSIZE];
 } srb_t;
 
 /*
@@ -2601,7 +2602,6 @@ typedef struct scsi_qla_host {
 #define CMD_COMPL_STATUS(Cmnd)  ((Cmnd)->SCp.this_residual)
 #define CMD_RESID_LEN(Cmnd)((Cmnd)->SCp.buffers_residual)
 #define CMD_SCSI_STATUS(Cmnd)  ((Cmnd)->SCp.Status)
-#define CMD_ACTUAL_SNSLEN(Cmnd)((Cmnd)->SCp.Message)
 #define CMD_ENTRY_STATUS(Cmnd) ((Cmnd)->SCp.have_data_in)
 
 #endif
diff --git a/drivers/scsi/qla2xxx/qla_isr.c b/drivers/scsi/qla2xxx/qla_isr.c
index 642a0c3..611f556 100644
--- a/drivers/scsi/qla2xxx/qla_isr.c
+++ b/drivers/scsi/qla2xxx/qla_isr.c
@@ -8,6 +8,7 @@
 
 #include <linux/delay.h>
 #include <scsi/scsi_tcq.h>
+#include <scsi/scsi_eh.h>
 
 static void qla2x00_mbx_completion(scsi_qla_host_t *, uint16_t);
 static void qla2x00_process_completed_request(struct scsi_qla_host *, uint32_t);
@@ -825,18 +826,15 @@ qla2x00_process_response_queue(struct scsi_qla_host *ha)
 static inline void
 qla2x00_handle_sense(srb_t *sp, uint8_t *sense_data, uint32_t sense_len)
 {
-   struct scsi_cmnd *cp = sp->cmd;
-
if (sense_len >= SCSI_SENSE_BUFFERSIZE)
sense_len = SCSI_SENSE_BUFFERSIZE;
 
-   CMD_ACTUAL_SNSLEN(cp) = sense_len;
sp->request_sense_length = sense_len;
-   sp->request_sense_ptr = cp->sense_buffer;
-   if (sp->request_sense_length > 32)
+   sp->request_sense_ptr = sp->sense_buffer;
+   if (sense_len > 32)
sense_len = 32;
 
-   memcpy(cp->sense_buffer, sense_data, sense_len);
+   memcpy(sp->sense_buffer, sense_data, sense_len);
 
sp->request_sense_ptr += sense_len;
sp->request_sense_length -= sense_len;
@@ -847,8 +845,7 @@ qla2x00_handle_sense(srb_t *sp, uint8_t *sense_data, uint32_t sense_len)
cmd=%p pid=%ld\n, __func__, sp->ha->host_no, cp->device->channel,
cp->device->id, cp->device->lun, cp, cp->serial_number));
if (sense_len)
-   DEBUG5(qla2x00_dump_buffer(cp->sense_buffer,
-   CMD_ACTUAL_SNSLEN(cp)));
+   DEBUG5(qla2x00_dump_buffer(cp->sense_buffer, sense_len));
 }
 
 /**
@@ -1005,7 +1002,6 @@ qla2x00_status_entry(scsi_qla_host_t *ha, void *pkt)
if (lscsi_status != SS_CHECK_CONDITION)
break;
 
-   memset(cp->sense_buffer, 0, SCSI_SENSE_BUFFERSIZE);
if (!(scsi_status & SS_SENSE_LEN_VALID))
break;
 
@@ -1064,7 +1060,6 @@ qla2x00_status_entry(scsi_qla_host_t *ha, void *pkt)
if (lscsi_status != SS_CHECK_CONDITION)
break;
 
-   memset(cp->sense_buffer, 0, SCSI_SENSE_BUFFERSIZE);
if (!(scsi_status & SS_SENSE_LEN_VALID))
break;
 
@@ -1268,6 +1263,8 @@ qla2x00_status_cont_entry(scsi_qla_host_t *ha, sts_cont_entry_t *pkt)
 
/* Place command on done queue. */
if (sp->request_sense_length == 0) {
+   scsi_eh_cpy_sense(cp, sp->sense_buffer,
+   sp->request_sense_ptr - sp->sense_buffer);
ha->status_srb = NULL;
qla2x00_sp_compl(ha, sp);
}
-- 
1.5.3.3
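
The mechanics described in the changelog (gather sense data in parts into a driver-internal buffer, hand it upward only once complete) can be sketched in user space as below. This is a simplified model, not the driver code: `struct sense_acc` and the `sense_*()` helpers are hypothetical names, the 96-byte size is assumed to equal SCSI_SENSE_BUFFERSIZE, and the 32-byte limit on the first chunk mirrors the clamp visible in the patch.

```c
#include <stddef.h>
#include <string.h>

#define SENSE_BUFFERSIZE 96   /* assumed value of SCSI_SENSE_BUFFERSIZE */
#define FIRST_CHUNK_MAX  32   /* the patch clamps the first copy to 32 bytes */

struct sense_acc {
	unsigned char buf[SENSE_BUFFERSIZE]; /* driver-internal buffer */
	unsigned char *ptr;                  /* next write position */
	size_t remaining;                    /* bytes still expected */
};

/* Begin accumulation: clamp the total, then copy the first chunk. */
static void sense_start(struct sense_acc *a, const void *data, size_t total)
{
	size_t n;

	if (total > SENSE_BUFFERSIZE)
		total = SENSE_BUFFERSIZE;
	n = total > FIRST_CHUNK_MAX ? FIRST_CHUNK_MAX : total;

	memcpy(a->buf, data, n);
	a->ptr = a->buf + n;
	a->remaining = total - n;
}

/* Append a continuation chunk; returns 1 once all sense data has arrived. */
static int sense_continue(struct sense_acc *a, const void *data, size_t len)
{
	if (len > a->remaining)
		len = a->remaining;
	memcpy(a->ptr, data, len);
	a->ptr += len;
	a->remaining -= len;
	return a->remaining == 0;
}

/* Bytes gathered so far, i.e. the length that would be handed upward. */
static size_t sense_len(const struct sense_acc *a)
{
	return (size_t)(a->ptr - a->buf);
}
```

When `sense_continue()` reports completion, `sense_len()` gives the same quantity the patch computes as `sp->request_sense_ptr - sp->sense_buffer` before calling scsi_eh_cpy_sense().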



Re: [PATCH 1/24][RFC] scsi_eh: Define new API for sense handling

2008-02-04 Thread James Bottomley
On Mon, 2008-02-04 at 17:30 +0200, Boaz Harrosh wrote:
 This patch defines a new API for sense handling. All drivers will
   be converted to this API, before the sense handling implementation will
   change. API is as follows:
 
 void scsi_eh_cpy_sense(struct scsi_cmnd *cmd, void* sense,
unsigned sense_bytes);
 To be used by drivers, when they have sense-bits
 and wants to send them to upper layer. Max size
 need not be a concern, If upper layer does not have
 enough space it will be automatically truncated.
 
 u8 *scsi_make_sense(struct scsi_cmnd *cmd);
 To be used by drivers, and scsi-midlayer. Returns a DMA-able
 sense buffer. Must be returned by scsi_return_sense(). It should
 never fail if .pre_allocate_sense & .sense_buffsize in host
 template where properly set.
 the buffer is of shost-sense_buffsize long.
 
 void *scsi_return_sense(struct scsi_cmnd *cmd, u8 *sb);
 Frees and returns the sense to the upper layer,
 copying only what's necessary.
 
 void scsi_eh_reset_sense(struct scsi_cmnd *cmd)
 Should not be used or necessary.
 
 const u8 *scsi_sense(struct scsi_cmnd *cmd)
 Used by ULDs and for inspecting the returned sense, can not
 be modified. It is only valid after a call to
 scsi_eh_cpy_sense() or a call to scsi_return_sense(). Before
 that it will/should return an empty buffer.
 
 New members at scsi host template:
 .sense_buffsize - if a driver calls scsi_make_sense() or
   scsi_eh_prep_cmnd(), This value should be none
   zero indicating the max sense size, the driver
   supports. In most cases it should be
   SCSI_SENSE_BUFFERSIZE.
   If this value is zero the driver will only call
   scsi_eh_cpy_sense().
 
 .pre_allocate_sense - if a Driver calls scsi_make_sense()
   in .queuecommand for every cmnd, this
   should be set to true. In which case
   scsi_make_sense() will not fail because
   midlayer will fail the command allocation.
   If the drivers calls scsi_eh_prep_cmnd()
   then sense_buffsize is not Zero but this
   here is set to false.

My initial reaction to this is that you're doing too many contortions to
ensure something we don't particularly care about:  whether we can
allocate a sense buffer atomically or not.

What all this code should be doing is simply allocating the sense buffer
in scsi_eh_prep_cmnd() using tomo's existing slab (and GFP_ATOMIC).  If
that fails, we need a return from scsi_eh_prep_cmnd() telling us.  At
that point, the driver should abandon the auto request sense attempt and
instead just return the CC/UA without the DRIVER_SENSE bit set, which
will trigger the eh to collect the sense for us.

Ideally, doing it this way might mean we could even dump the
sense_buffer pointer from the command (although I don't see that as
necessary).

This solves the 99% case without getting into preallocation contortions.

James
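
The fallback suggested above can be sketched as a small user-space model. Everything here is illustrative: `struct cmd`, `alloc_sense_atomic()`, `prep_cmnd()` and `handle_check_condition()` are hypothetical stand-ins (`prep_cmnd()` plays the role of a scsi_eh_prep_cmnd() that can report failure), `simulate_alloc_failure` exists only to exercise the failure path, and the DRIVER_SENSE value is assumed from scsi.h.

```c
#include <stdlib.h>

#define SENSE_BUFFERSIZE 96     /* assumed value of SCSI_SENSE_BUFFERSIZE */
#define DRIVER_SENSE     0x08   /* driver-byte flag, value assumed from scsi.h */

struct cmd {
	unsigned char *sense;       /* NULL until a buffer is attached */
	unsigned int driver_byte;
};

/* Hypothetical test hook modelling a failed GFP_ATOMIC allocation. */
static int simulate_alloc_failure;

static void *alloc_sense_atomic(void)
{
	if (simulate_alloc_failure)
		return NULL;
	return calloc(1, SENSE_BUFFERSIZE);
}

/* Stand-in for a scsi_eh_prep_cmnd() that reports failure: 0 or -1. */
static int prep_cmnd(struct cmd *c)
{
	c->sense = alloc_sense_atomic();
	return c->sense ? 0 : -1;
}

/*
 * Driver side: try to set up auto request sense. If no buffer can be had,
 * leave DRIVER_SENSE clear so the error handler collects the sense later.
 * Returns 1 if auto sense was set up, 0 if it was abandoned.
 */
static int handle_check_condition(struct cmd *c)
{
	if (prep_cmnd(c) < 0) {
		c->driver_byte &= ~DRIVER_SENSE;
		return 0;
	}
	c->driver_byte |= DRIVER_SENSE;
	return 1;
}
```

The attraction of this shape is that the allocation failure needs no preallocation machinery at all: the rare failing case simply degrades to the slower error-handler path.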




Re: [PATCH 6/24][RFC] gdth: Use of scsi_eh API and sense accessors

2008-02-04 Thread Jeff Garzik

Boaz Harrosh wrote:

  Use of new scsi_eh API for setting sense information into
  the scsi command.

Signed-off-by: Boaz Harrosh [EMAIL PROTECTED]
---
 drivers/scsi/gdth.c |   47 ++-
 drivers/scsi/gdth.h |1 +
 2 files changed, 27 insertions(+), 21 deletions(-)

diff --git a/drivers/scsi/gdth.c b/drivers/scsi/gdth.c
index c825239..9fdd5ef 100644
--- a/drivers/scsi/gdth.c
+++ b/drivers/scsi/gdth.c
@@ -2098,6 +2098,16 @@ static void gdth_putq(gdth_ha_str *ha, Scsi_Cmnd *scp, unchar priority)
 #endif
 }
 
+static void gdth_set_4byte_sense(struct scsi_cmnd *scp, u8 sense_code)

+{
+   u8 sense[4];
+
+   memset(sense, 0, sizeof(sense));
+   sense[0] = 0x70;
+   sense[2] = sense_code;
+   scsi_eh_cpy_sense(scp, sense, sizeof(sense));
+}


IMO, setting 0x70 and 0x72 is highly common, and worthy of some simple 
helper functions.  See ata_scsi_set_sense() in libata-scsi.c or 
stex_set_sense() in stex.c, which is a copy of the former.


Jeff
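
A shared helper of the kind suggested above might look roughly like the following user-space sketch. It is not the libata or stex function: `set_fixed_sense()` is a hypothetical name and it fills a caller-supplied buffer rather than a struct scsi_cmnd. The byte layout follows the fixed sense data format (response code 0x70, sense key in byte 2, additional length in byte 7, ASC and ASCQ in bytes 12 and 13).

```c
#include <stddef.h>
#include <string.h>

/*
 * Fill buf with fixed-format (0x70) sense data.
 * Returns the number of bytes written, or 0 if buf is too small.
 */
static size_t set_fixed_sense(unsigned char *buf, size_t len,
			      unsigned char key, unsigned char asc,
			      unsigned char ascq)
{
	if (len < 18)
		return 0;

	memset(buf, 0, 18);
	buf[0] = 0x70;        /* response code: current error, fixed format */
	buf[2] = key & 0x0f;  /* sense key */
	buf[7] = 18 - 8;      /* additional sense length */
	buf[12] = asc;        /* additional sense code */
	buf[13] = ascq;       /* additional sense code qualifier */
	return 18;
}
```

For example, `set_fixed_sense(buf, sizeof(buf), 0x05, 0x24, 0x00)` would describe ILLEGAL REQUEST / INVALID FIELD IN CDB, and the result could then be passed on through something like scsi_eh_cpy_sense().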






Re: [Cbe-oss-dev] LIO Target iSCSI/SE PS3-Linux / FC8 builds

2008-02-04 Thread Nicholas A. Bellinger
Hi Marc,

You can generate the kernel RPM with 'make kernel ARCH=powerpc'.

Also, while module-assistant is supported on debian/ubuntu,
trunk/buildtools/ currently does not support generating kernel module
source rpms.  If you want to send a patch, I would be more than happy to
take it.

--nab


On Mon, 2008-02-04 at 16:47 +0100, Marc Dietrich wrote:
 Hi Nicholas,
 
 can you please also upload a src.rpm? I'm having troubles compiling the kernel 
 code:
 
 # cd target/ ; ./autoconfig --write-to-file ; cat .make_autoconfig ; make 
 kernel
 /usr/src/linux-iscsi/trunk/target/.make_autoconfig
 ARCH?=ppc
 AUTO_CFLAGS?= -DHAS_UTS_RELEASE -DUSE_SCSI_H 
 -I/lib/modules/2.6.24-06289-g144de36/source/drivers/scsi  -DUSE_MSLEEP 
 -DUSE_COMPAT_IOCTL -Dscsi_execute_async_address  
 -DPYX_ISCSI_VENDOR='Linux-iSCSI.org'  
 -DIQN_PREFIX='iqn.2003-01.org.linux-iscsi'  -DLINUX 
 -DLINUX_SCATTERLIST_HAS_PAGE -DSVN_VSN=\209\
 BASENAME?=FedoraCore-R8-Werewolf.ppc
 DISTRO?=FEDORA
 KERNEL?=26
 KERNEL_DIR?=/lib/modules/2.6.24-06289-g144de36/build
 KERNEL_INCLUDE_DIR?=/lib/modules/2.6.24-06289-g144de36/source/include
 KERNEL_SOURCE_DIR?=/lib/modules/2.6.24-06289-g144de36/source
 KERNEL_VERSION_INFO?=LINUX_KERNEL_26
 OSTYPE?=LINUX
 PYX_ISCSI_VERSION?=2.9.0.209
 RELEASE?=2.6.24-06289-g144de36
 RELEASES?=ARRAY(0x102052d4)
 RPM_DIR?=/usr/src/redhat
 SNMP?=0
 SYSTEM?=FedoraCore-R8-Werewolf
 VERSION_IPYXD?=2.9.0.209
 make -C target clean all
 make[1]: Entering directory `/usr/src/linux-iscsi/trunk/target/target'
 rm -f /usr/src/linux-iscsi/trunk/target/target/../common/iscsi_crc.o 
 /usr/src/linux-iscsi/trunk/target/target/../common/iscsi_debug_opcodes.o 
 /usr/src/linux-iscsi/trunk/target/target/../common/iscsi_parameters.o 
 /usr/src/linux-iscsi/trunk/target/target/../common/iscsi_seq_and_pdu_list.o 
 /usr/src/linux-iscsi/trunk/target/target/../common/iscsi_serial.o 
 /usr/src/linux-iscsi/trunk/target/target/../common/iscsi_thread_queue.o 
 /usr/src/linux-iscsi/trunk/target/target/iscsi_target_datain_values.o 
 /usr/src/linux-iscsi/trunk/target/target/iscsi_target_device.o 
 /usr/src/linux-iscsi/trunk/target/target/iscsi_target_discovery.o 
 /usr/src/linux-iscsi/trunk/target/target/iscsi_target_erl0.o 
 /usr/src/linux-iscsi/trunk/target/target/iscsi_target_erl1.o 
 /usr/src/linux-iscsi/trunk/target/target/iscsi_target_erl2.o 
 /usr/src/linux-iscsi/trunk/target/target/iscsi_target_feature_obj.o 
 /usr/src/linux-iscsi/trunk/target/target/iscsi_target_hba.o 
 /usr/src/linux-iscsi/trunk/target/target/iscsi_target_info.o 
 /usr/src/linux-iscsi/trunk/target/target/iscsi_target_ioctl.o 
 /usr/src/linux-iscsi/trunk/target/target/iscsi_target_linux_proc.o 
 /usr/src/linux-iscsi/trunk/target/target/iscsi_target_login.o 
 /usr/src/linux-iscsi/trunk/target/target/iscsi_target_nego.o 
 /usr/src/linux-iscsi/trunk/target/target/iscsi_target_nodeattrib.o 
 /usr/src/linux-iscsi/trunk/target/target/iscsi_target_plugin.o 
 /usr/src/linux-iscsi/trunk/target/target/iscsi_target_reportluns.o 
 /usr/src/linux-iscsi/trunk/target/target/iscsi_target_scdb.o 
 /usr/src/linux-iscsi/trunk/target/target/iscsi_target_seobj.o 
 /usr/src/linux-iscsi/trunk/target/target/iscsi_target_tmr.o 
 /usr/src/linux-iscsi/trunk/target/target/iscsi_target_tpg.o 
 /usr/src/linux-iscsi/trunk/target/target/iscsi_target_transport.o 
 /usr/src/linux-iscsi/trunk/target/target/iscsi_target_util.o 
 /usr/src/linux-iscsi/trunk/target/target/iscsi_target.o 
 /usr/src/linux-iscsi/trunk/target/target/div64.o 
 /usr/src/linux-iscsi/trunk/target/target/iscsi_target_raid.o 
 /usr/src/linux-iscsi/trunk/target/target/iscsi_target_repl.o 
 /usr/src/linux-iscsi/trunk/target/target/iscsi_target_iblock.o 
 /usr/src/linux-iscsi/trunk/target/target/iscsi_target_pscsi.o 
 /usr/src/linux-iscsi/trunk/target/target/iscsi_target_rd.o 
 /usr/src/linux-iscsi/trunk/target/target/iscsi_target_file.o 
 /usr/src/linux-iscsi/trunk/target/target/iscsi_target_vt.o 
 /usr/src/linux-iscsi/trunk/target/target/iscsi_target_mc.o 
 /usr/src/linux-iscsi/trunk/target/target/iscsi_target_mib.o 
 /usr/src/linux-iscsi/trunk/target/target/iscsi_target_mod.o 
 /usr/src/linux-iscsi/trunk/target/target/iscsi_target_mod.mod.o
 rm -f iscsi_target_mod.ko iscsi_target_mod.mod.c
 rm -f .*.cmd ../common/.*.cmd .make_autoconfig *~
 rm -fr .tmp_versions
 make -C /lib/modules/2.6.24-06289-g144de36/build 
 SUBDIRS=/usr/src/linux-iscsi/trunk/target/target modules 
 CWD=/usr/src/linux-iscsi/trunk/target/target ARCH=ppc KBUILD_VERBOSE=0
 make[2]: Entering directory `/usr/src/ps3-linux'
   CC [M]  /usr/src/linux-iscsi/trunk/target/target/../common/iscsi_crc.o
 In file included from include/asm/mmu.h:7,
  from include/asm/lppaca.h:32,
  from include/asm/paca.h:20,
  from include/asm/hw_irq.h:17,
  from include/asm/system.h:9,
  from include/linux/list.h:9,
  from include/linux/preempt.h:11,
  from include/linux/spinlock.h:49,

Re: [PATCH RESEND number 2] libata: eliminate the home grown dma padding in favour of that provided by the block layer

2008-02-04 Thread James Bottomley
On Mon, 2008-02-04 at 18:25 +0900, Tejun Heo wrote:
 Tejun Heo wrote:
  Some ATA controllers including SFF BMDMA and libata PIO HSM need the
  number of bytes mapped in the sg table.  Yeah, it can be calculated with
  a simple macro but it also is a fundamentally confusing dual-sizing
  which should be made as clear as possible.  Plus, it can be difficult to
  find out when somebody used the wrong thing, so what I'm saying is that
  we need to make it easy.  Anyways, please lemme work on it a bit.  I'll
  get back to you guys soon.
 
 Okay, here's first draft combined patch.  Only compile tested (expect
 it to be broken) but it should be functionally equivalent to
 ata_sg_setup_extra() based implementation albeit with shorter drain
 buffer size.  Several things to note...

I noticed you posted an update ... I'll go over the code in that one.

 * fsl last sg check isn't included here.  Will split it out and post
   it separately.
 
 * rq->raw_data_len added.  Rationales...
 
   - All these padding and draining are to prevent controllers from
 crapping themselves when data buffer is shorter than it likes it
 to be.  Any controller which talks MMC (or SPC for that matter)
 should be ready for transfers shorter than buffer so feeding
 enlarged buffer size is inherently safer than feeding the length,
 so the primary data length field, rq->data_len, contains the
 adjusted length.

Actually, no they don't.  Overrun termination is pretty much standard in
most HBAs that do MMC (including the strange block one).  Allowing an
overrun is simply unsafe and a security risk, so all apart from the ATA
ones allow us to terminate the transaction safely.

   - raw_data_len can't be easily deduced from data_len.  The other way
 is possible but with both aligning and draining and command
 filtering, calculating it later is messy.

Like I said, it's simply the command size plus the drain buffer which
you get from the queue.  I'll redo your patches to show how it can be
done without adding the extra space to every request.

 * Draining configuration is done in sr as it's the driver for MMC.  It
   can move both ways - either into SCSI midlayer as SPC and other
   commands do variable length responses too or into libata if all
   non-ATA controllers are happy without such workarounds.  If you ask
   me, I'm inclined to move it into SCSI midlayer as the added overhead
   is insignificant (especially with drain_needed added) and it won't
   break anything (well, theoretically, at least).

Let me think about this one.  The only consumer of draining is libata.
It might make sense to process the drains in sr.c but it certainly
doesn't make sense to turn them on indiscriminately in there because
*none* of the SCSI users needs them.

 * Padding via aligning seems a bit too hacky to me.  It doesn't even
   cover all sg cases.  I think we'll need improvements there, well,
   but for the time being, this should do.

That's essentially what the dma_alignment parameter was designed for ...
we can always make the way it's handled differently; I just updated
block to use what exists already.

 I'll test and report in a few hours.

Great, thanks.

James




[PATCH 1/24][RFC] scsi_eh: Define new API for sense handling

2008-02-04 Thread Boaz Harrosh
  This patch defines a new API for sense handling. All drivers will
  be converted to this API before the sense handling implementation
  changes. The API is as follows:

void scsi_eh_cpy_sense(struct scsi_cmnd *cmd, void* sense,
 unsigned sense_bytes);
To be used by drivers when they have sense bits
and want to send them to the upper layer. Max size
need not be a concern: if the upper layer does not have
enough space, the data will be automatically truncated.

u8 *scsi_make_sense(struct scsi_cmnd *cmd);
To be used by drivers and the scsi-midlayer. Returns a DMA-able
sense buffer. Must be returned by scsi_return_sense(). It should
never fail if .pre_allocate_sense & .sense_buffsize in the host
template were properly set.
The buffer is shost->sense_buffsize bytes long.

void scsi_return_sense(struct scsi_cmnd *cmd, u8 *sb);
Frees and returns the sense to the upper layer,
copying only what's necessary.

void scsi_eh_reset_sense(struct scsi_cmnd *cmd)
Should not be used or necessary.

const u8 *scsi_sense(struct scsi_cmnd *cmd)
Used by ULDs and for inspecting the returned sense, can not
be modified. It is only valid after a call to
scsi_eh_cpy_sense() or a call to scsi_return_sense(). Before
that it will/should return an empty buffer.

New members at scsi host template:
.sense_buffsize - if a driver calls scsi_make_sense() or
  scsi_eh_prep_cmnd(), this value should be
  non-zero, indicating the max sense size the driver
  supports. In most cases it should be
  SCSI_SENSE_BUFFERSIZE.
  If this value is zero the driver will only call
  scsi_eh_cpy_sense().

.pre_allocate_sense - if a driver calls scsi_make_sense()
  in .queuecommand for every cmnd, this
  should be set to true. In that case
  scsi_make_sense() will not fail, because the
  midlayer will fail the command allocation instead.
  If the driver calls scsi_eh_prep_cmnd(),
  then sense_buffsize is non-zero but this
  flag is set to false.

Signed-off-by: Boaz Harrosh [EMAIL PROTECTED]
---
 drivers/scsi/scsi.c   |   12 
 drivers/scsi/scsi_error.c |7 +++
 include/scsi/scsi_eh.h|   17 +
 include/scsi/scsi_host.h  |   15 +++
 4 files changed, 51 insertions(+), 0 deletions(-)

diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
index 98cba7d..af29ccc 100644
--- a/drivers/scsi/scsi.c
+++ b/drivers/scsi/scsi.c
@@ -290,6 +290,18 @@ void scsi_put_command(struct scsi_cmnd *cmd)
 }
 EXPORT_SYMBOL(scsi_put_command);
 
+u8 *scsi_make_sense(struct scsi_cmnd *cmd)
+{
+   return cmd->sense_buffer;
+}
+EXPORT_SYMBOL(scsi_make_sense);
+
+void scsi_return_sense(struct scsi_cmnd *cmd, u8 *sb)
+{
+   BUG_ON(cmd->sense_buffer != sb);
+}
+EXPORT_SYMBOL(scsi_return_sense);
+
 /**
  * scsi_setup_command_freelist - Setup the command freelist for a scsi host.
  * @shost: host to allocate the freelist for.
diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index 98696ae..dc8cd2b 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -588,6 +588,13 @@ static void scsi_abort_eh_cmnd(struct scsi_cmnd *scmd)
scsi_try_host_reset(scmd);
 }
 
+void scsi_eh_cpy_sense(struct scsi_cmnd *cmd, void *sense, unsigned sense_bytes)
+{
+   unsigned len = min_t(unsigned, sense_bytes, SCSI_SENSE_BUFFERSIZE);
+   memcpy(cmd->sense_buffer, sense, len);
+}
+EXPORT_SYMBOL(scsi_eh_cpy_sense);
+
 /**
  * scsi_eh_prep_cmnd  - Save a scsi command info as part of error recory
  * @scmd:   SCSI command structure to hijack
diff --git a/include/scsi/scsi_eh.h b/include/scsi/scsi_eh.h
index 9438ea1..ce84330 100644
--- a/include/scsi/scsi_eh.h
+++ b/include/scsi/scsi_eh.h
@@ -87,4 +87,21 @@ extern void scsi_eh_prep_cmnd(struct scsi_cmnd *scmd,
 extern void scsi_eh_restore_cmnd(struct scsi_cmnd* scmd,
struct scsi_eh_save *ses);
 
+extern void scsi_eh_cpy_sense(struct scsi_cmnd *cmd, void *sense,
+   unsigned sense_bytes);
+
+extern u8 *scsi_make_sense(struct scsi_cmnd *cmd);
+extern void scsi_return_sense(struct scsi_cmnd *cmd, u8 *sb);
+
+/*FIXME: don't use, it's temporary */
+static inline void scsi_eh_reset_sense(struct scsi_cmnd *cmd)
+{
+   memset(cmd->sense_buffer, 0, SCSI_SENSE_BUFFERSIZE);
+}
+
+static inline const u8 *scsi_sense(struct scsi_cmnd *cmd)
+{
+   return cmd->sense_buffer;
+}
+
 #endif /* _SCSI_SCSI_EH_H */
diff --git a/include/scsi/scsi_host.h b/include/scsi/scsi_host.h
index c2dd31d..f232768 100644
--- a/include/scsi/scsi_host.h
+++ b/include/scsi/scsi_host.h
@@ -443,6 +443,18 @@ struct 
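
The copy-with-truncation contract of scsi_eh_cpy_sense() and the read-only accessor described in the changelog can be tried out in user space with a sketch like this. The names `struct cmd`, `cpy_sense()` and `sense_of()` are hypothetical, the 96-byte size is assumed to equal SCSI_SENSE_BUFFERSIZE, and unlike the kernel function this version returns the number of bytes kept so the truncation is observable.

```c
#include <stddef.h>
#include <string.h>

#define SENSE_BUFFERSIZE 96   /* assumed value of SCSI_SENSE_BUFFERSIZE */

struct cmd {
	unsigned char sense_buffer[SENSE_BUFFERSIZE];
};

/* Copy at most SENSE_BUFFERSIZE bytes; returns the number actually copied. */
static size_t cpy_sense(struct cmd *c, const void *sense, size_t bytes)
{
	size_t len = bytes < SENSE_BUFFERSIZE ? bytes : SENSE_BUFFERSIZE;

	memcpy(c->sense_buffer, sense, len);
	return len;
}

/* Read-only accessor, in the spirit of scsi_sense(). */
static const unsigned char *sense_of(const struct cmd *c)
{
	return c->sense_buffer;
}
```

A driver that holds more sense bytes than the upper layer can store does not need to clamp the length itself; the excess is simply dropped by the copy.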

Re: [patch] pci: pci_enable_device_bars() fix

2008-02-04 Thread Jeff Garzik

Andrew Morton wrote:

Actually I (and probably others) generally avoid cc'ing mailing lists on
patch traffic.  I spew out enough script-generated traffic as it is.


You pretty much always ensure the driver author gets CC'd, which is 
exemplary :)


Jeff





[patch 1/3] cciss: Don't call pci_free_consistent with irqs disabled

2008-02-04 Thread scameron


Don't call pci_free_consistent with irqs disabled
(Was triggering a warning in arch/x86/kernel/pci-dma_32.c
in dma_free_coherent)

Signed-off-by: Stephen M. Cameron [EMAIL PROTECTED]
---

 linux-2.6.24/drivers/block/cciss_scsi.c |2 +-
 1 files changed, 1 insertion(+), 1 deletion(-)

diff -puN linux-2.6.24/drivers/block/cciss_scsi.c~fix_pci_free_irq_bug 
linux-2.6.24/drivers/block/cciss_scsi.c
--- kernel.org2/linux-2.6.24/drivers/block/cciss_scsi.c~fix_pci_free_irq_bug
2008-02-04 07:52:39.0 -0600
+++ kernel.org2-root/linux-2.6.24/drivers/block/cciss_scsi.c2008-02-04 
07:52:39.0 -0600
@@ -1349,9 +1349,9 @@ cciss_unregister_scsi(int ctlr)
/* set scsi_host to NULL so our detect routine will 
   find us on register */
sa->scsi_host = NULL;
+   spin_unlock_irqrestore(CCISS_LOCK(ctlr), flags);
scsi_cmd_stack_free(ctlr);
kfree(sa);
-   spin_unlock_irqrestore(CCISS_LOCK(ctlr), flags);
 }
 
 static int 
_
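
The ordering issue this patch fixes can be illustrated with a toy model in which the free routine asserts that "interrupts" are enabled. None of this is cciss code: `struct irq_lock`, `lock_irqsave()`, `unlock_irqrestore()`, `free_coherent()` and `unregister()` are hypothetical stand-ins, and detaching the pointer under the lock is an addition made for the sketch (the patch itself only moves the unlock ahead of the frees).

```c
#include <assert.h>
#include <stdlib.h>

/* Toy lock that only tracks whether "interrupts" are disabled. */
struct irq_lock {
	int irqs_disabled;
};

static void lock_irqsave(struct irq_lock *l)      { l->irqs_disabled = 1; }
static void unlock_irqrestore(struct irq_lock *l) { l->irqs_disabled = 0; }

struct ctlr {
	struct irq_lock lock;
	void *cmd_stack;    /* stands in for the DMA-coherent command pool */
};

/* Models a free routine that must not run with interrupts disabled. */
static void free_coherent(struct ctlr *c, void *p)
{
	assert(!c->lock.irqs_disabled);
	free(p);
}

static void unregister(struct ctlr *c)
{
	void *stack;

	lock_irqsave(&c->lock);
	stack = c->cmd_stack;       /* detach shared state under the lock */
	c->cmd_stack = NULL;
	unlock_irqrestore(&c->lock);

	free_coherent(c, stack);    /* free only after interrupts are back on */
}
```

Calling `free_coherent()` before `unlock_irqrestore()` in this model trips the assertion, which corresponds to the warning the changelog mentions.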


Re: Integration of SCST in the mainstream Linux kernel

2008-02-04 Thread Vladislav Bolkhovitin

Vladislav Bolkhovitin wrote:

James Bottomley wrote:


The two target architectures perform essentially identical functions, so
there's only really room for one in the kernel.  Right at the moment,
it's STGT.  Problems in STGT come from the user-kernel boundary which
can be mitigated in a variety of ways.  The fact that the figures are
pretty much comparable on non IB networks shows this.

I really need a whole lot more evidence than at worst a 20% performance
difference on IB to pull one implementation out and replace it with
another.  Particularly as there's no real evidence that STGT can't be
tweaked to recover the 20% even on IB.



James,

Although the performance difference between STGT and SCST is apparent, 
this isn't the only reason why SCST is better. I've already written about 
it many times on various mailing lists, but let me summarize it one more 
time here.


As you know, almost all parts of the kernel can be done in user space, 
including all the drivers, networking, I/O management with the block/SCSI 
initiator subsystem and the disk cache manager. But does that mean that 
the current Linux kernel is bad and all of the above should be (re)done in 
user space instead? I believe not. Linux isn't a microkernel for very 
pragmatic reasons: simplicity and performance. So an additional important 
reason why SCST is better is simplicity.


For a SCSI target, especially with a hardware target card, data comes 
from the kernel and is eventually served by the kernel, which does the actual 
I/O or gets/puts the data from/to the cache. Dividing request processing between 
user and kernel space creates unnecessary interface layer(s) and 
effectively makes request processing a distributed job, with all the 
complexity and reliability problems that brings. From my point of view, such a 
split, where user space is the master side and the kernel is the slave, is 
rather wrong, because:


1. It makes the kernel depend on a user program, which services it and 
provides its routines for it, while the regular paradigm is the 
opposite: the kernel services user space applications. A direct 
consequence is that there is no real protection for the kernel from 
faults in the STGT core code without excessive effort, which, no 
surprise, hasn't been made so far and, it seems, is never going to be made. 
So, in practice, debugging and developing under STGT isn't easier than 
if the whole code were in kernel space, but actually harder (see 
below why).


2. It requires a new, complicated interface between kernel and user space, 
which creates additional maintenance and debugging headaches that don't 
exist for kernel-only code. Linus Torvalds some time ago described perfectly 
why this is bad, see http://lkml.org/lkml/2007/4/24/451, 
http://lkml.org/lkml/2006/7/1/41 and http://lkml.org/lkml/2007/4/24/364.


3. It makes it impossible for a SCSI target to use (at least in a simple and 
sane way) many effective optimizations: zero-copy cached I/O, more 
control over read-ahead, device queue unplugging/plugging, etc. One 
example of such a feature that is already implemented is zero-copy network data 
transmission, done in the simple 260-line put_page_callback patch. This 
optimization is especially important for the user space gate (scst_user 
module), see below for details.


The whole argument that development for the kernel is harder than for user 
space is total nonsense nowadays. It's different, yes, and in some ways 
more limited, yes, but not harder. For those who need gdb (I haven't for many 
years) the kernel has kgdb, plus it has many debug facilities that are either 
unavailable in user space or more limited there, like lockdep, lockup 
detection, oprofile, etc. (not to mention a wider choice of more 
effectively implemented synchronization primitives, and not only those).


For people who need complicated target device emulation, e.g. in 
the case of a VTL (Virtual Tape Library), where there is a need to operate 
on large mmap'ed memory areas, SCST provides a gateway to user space 
(the scst_user module), but, in contrast with STGT, it follows the regular 
paradigm of kernel as master and user application as slave, so it's reliable 
and no fault in a user space device emulator can break the kernel or other 
user space applications. Plus, since the SCSI target state machine and 
memory management are in the kernel, it's very efficient and needs only 
one kernel-user space switch per SCSI command.


Also, I should note here that in its current state STGT in many respects 
doesn't fully conform to the SCSI specifications, especially in the area of 
management events, like Unit Attention generation and processing, and 
it doesn't look like anybody cares about it. At the same time, SCST 
pays a lot of attention to full conformance with the SCSI specifications, 
because the price of non-conformance is possible corruption of user data.


Returning to performance, modern SCSI transports, e.g. InfiniBand, have 
link latency as low as 1(!) microsecond. For comparison, the 
inter-thread context switch time on a modern system is about the same, 
syscall time 

Re: dmesg spam

2008-02-04 Thread Bartlomiej Zolnierkiewicz
On Sunday 03 February 2008, Andrew Morton wrote:
 
> With latest -mm, running fc8 I am getting this in the logs,
   ^^^
= SCSI/libata

cc:ing Jeff

> once per second.
> 
> sr0: CDROM not ready.  Make sure there is a disc in the drive.
> sr0: CDROM not ready.  Make sure there is a disc in the drive.
> sr0: CDROM not ready.  Make sure there is a disc in the drive.
> sr0: CDROM not ready.  Make sure there is a disc in the drive.
> sr0: CDROM not ready.  Make sure there is a disc in the drive.
> sr0: CDROM not ready.  Make sure there is a disc in the drive.
> sr0: CDROM not ready.  Make sure there is a disc in the drive.
> sr0: CDROM not ready.  Make sure there is a disc in the drive.
> sr0: CDROM not ready.  Make sure there is a disc in the drive.


Re: Integration of SCST in the mainstream Linux kernel

2008-02-04 Thread James Bottomley
On Mon, 2008-02-04 at 15:27 +0300, Vladislav Bolkhovitin wrote:
> Vladislav Bolkhovitin wrote:
> So, James, what is your opinion on the above? Or the overall SCSI target 
> project simplicity doesn't matter much for you and you think it's fine 
> to duplicate Linux page cache in the user space to keep the in-kernel 
> part of the project as small as possible?

The answers were pretty much contained here

http://marc.info/?l=linux-scsi&m=120164008302435

and here:

http://marc.info/?l=linux-scsi&m=120171067107293

Weren't they?

James




[PATCH 9/24][RFC] ultrastor: Use scsi_eh_cpy_sense() API

2008-02-04 Thread Boaz Harrosh
  - allocate a driver private area sense buffer, and set IO
to transfer sense into that buffer.
  - at end of command execution copy the sense information
using scsi_eh_cpy_sense().

Signed-off-by: Boaz Harrosh [EMAIL PROTECTED]
---
 drivers/scsi/ultrastor.c |   14 +++++++++-----
 1 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/drivers/scsi/ultrastor.c b/drivers/scsi/ultrastor.c
index f385dce..b2ccb48 100644
--- a/drivers/scsi/ultrastor.c
+++ b/drivers/scsi/ultrastor.c
@@ -199,6 +199,7 @@ struct mscp {
   void (*done) (struct scsi_cmnd *);
   struct scsi_cmnd *SCint;
   ultrastor_sg_list sglist[ULTRASTOR_24F_MAX_SG]; /* use larger size for 24F */
+  u8 sense_buffer[SCSI_SENSE_BUFFERSIZE];
 };
 
 
@@ -743,19 +744,20 @@ static int ultrastor_queuecommand(struct scsi_cmnd *SCpnt,
 } else {
/* Unset scatter/gather flag in SCSI command packet */
my_mscp->sg = FALSE;
-   my_mscp->transfer_data = isa_virt_to_bus(scsi_sglist(SCpnt));
-   my_mscp->transfer_data_length = scsi_bufflen(SCpnt);
+   my_mscp->transfer_data = 0;
+   my_mscp->transfer_data_length = 0;
 }
 my_mscp->command_link = 0; /*???*/
 my_mscp->scsi_command_link_id = 0; /*???*/
-my_mscp->length_of_sense_byte = SCSI_SENSE_BUFFERSIZE;
 my_mscp->length_of_scsi_cdbs = SCpnt->cmd_len;
 memcpy(my_mscp->scsi_cdbs, SCpnt->cmnd, my_mscp->length_of_scsi_cdbs);
 my_mscp->adapter_status = 0;
 my_mscp->target_status = 0;
-my_mscp->sense_data = isa_virt_to_bus(SCpnt->sense_buffer);
 my_mscp->done = done;
 my_mscp->SCint = SCpnt;
+memset(my_mscp->sense_buffer, 0, SCSI_SENSE_BUFFERSIZE);
+my_mscp->length_of_sense_byte = SCSI_SENSE_BUFFERSIZE;
+my_mscp->sense_data = isa_virt_to_bus(my_mscp->sense_buffer);
 SCpnt->host_scribble = (unsigned char *)my_mscp;
 
 /* Find free OGM slot.  On 24F, look for OGM status byte == 0.
@@ -1140,7 +1142,9 @@ static void ultrastor_interrupt(void *dev_id)
status = DID_TIME_OUT << 16;
break;
   }
-
+if (mscp->sense_buffer[0])
+   scsi_eh_cpy_sense(SCtmp, mscp->sense_buffer,
+   sizeof(mscp->sense_buffer));
 SCtmp->result = status | mscp->target_status;
 
 SCtmp->host_scribble = NULL;
-- 
1.5.3.3



Re: [Cbe-oss-dev] LIO Target iSCSI/SE PS3-Linux / FC8 builds

2008-02-04 Thread Nicholas A. Bellinger
Hi Again,

Almost forgot, I put up the following RPMs, which the iscsi userspace tools
RPMs depend on for LIO-console.pl and LIO-demo.sh:

perl-Curses-1.15-1.fc6.ppc.rpm
perl-IO-All-0.33-3.fc5.noarch.rpm
perl-IO-String-1.08-1.2.fc5.rf.noarch.rpm

http://linux-iscsi.org/builds/ps3-linux/

These should work, let me know if you are missing any when you install
iscsi-target-tools.

Thanks,

--nab

On Mon, 2008-02-04 at 07:52 -0800, Nicholas A. Bellinger wrote:
 Hi Marc,
 
 You can generate the kernel RPM with 'make kernel ARCH=powerpc'.
 
 Also, while module-assistant is supported on debian/ubuntu,
 trunk/buildtools/ currently does not support generating kernel module
 source rpms.  If you want to send a patch, I would be more than happy to
 take it.
 
 --nab
 
 
 On Mon, 2008-02-04 at 16:47 +0100, Marc Dietrich wrote:
  Hi Nicholas,
  
  can you please also upload a src.rpm? I'm having trouble compiling the 
  kernel 
  code:
  
  # cd target/ ; ./autoconfig --write-to-file ; cat .make_autoconfig ; make 
  kernel
  /usr/src/linux-iscsi/trunk/target/.make_autoconfig
  ARCH?=ppc
  AUTO_CFLAGS?= -DHAS_UTS_RELEASE -DUSE_SCSI_H 
  -I/lib/modules/2.6.24-06289-g144de36/source/drivers/scsi  -DUSE_MSLEEP 
  -DUSE_COMPAT_IOCTL -Dscsi_execute_async_address  
  -DPYX_ISCSI_VENDOR='Linux-iSCSI.org'  
  -DIQN_PREFIX='iqn.2003-01.org.linux-iscsi'  -DLINUX 
  -DLINUX_SCATTERLIST_HAS_PAGE -DSVN_VSN=\209\
  BASENAME?=FedoraCore-R8-Werewolf.ppc
  DISTRO?=FEDORA
  KERNEL?=26
  KERNEL_DIR?=/lib/modules/2.6.24-06289-g144de36/build
  KERNEL_INCLUDE_DIR?=/lib/modules/2.6.24-06289-g144de36/source/include
  KERNEL_SOURCE_DIR?=/lib/modules/2.6.24-06289-g144de36/source
  KERNEL_VERSION_INFO?=LINUX_KERNEL_26
  OSTYPE?=LINUX
  PYX_ISCSI_VERSION?=2.9.0.209
  RELEASE?=2.6.24-06289-g144de36
  RELEASES?=ARRAY(0x102052d4)
  RPM_DIR?=/usr/src/redhat
  SNMP?=0
  SYSTEM?=FedoraCore-R8-Werewolf
  VERSION_IPYXD?=2.9.0.209
  make -C target clean all
  make[1]: Entering directory `/usr/src/linux-iscsi/trunk/target/target'
  rm -f /usr/src/linux-iscsi/trunk/target/target/../common/iscsi_crc.o 
  /usr/src/linux-iscsi/trunk/target/target/../common/iscsi_debug_opcodes.o 
  /usr/src/linux-iscsi/trunk/target/target/../common/iscsi_parameters.o 
  /usr/src/linux-iscsi/trunk/target/target/../common/iscsi_seq_and_pdu_list.o 
  /usr/src/linux-iscsi/trunk/target/target/../common/iscsi_serial.o 
  /usr/src/linux-iscsi/trunk/target/target/../common/iscsi_thread_queue.o 
  /usr/src/linux-iscsi/trunk/target/target/iscsi_target_datain_values.o 
  /usr/src/linux-iscsi/trunk/target/target/iscsi_target_device.o 
  /usr/src/linux-iscsi/trunk/target/target/iscsi_target_discovery.o 
  /usr/src/linux-iscsi/trunk/target/target/iscsi_target_erl0.o 
  /usr/src/linux-iscsi/trunk/target/target/iscsi_target_erl1.o 
  /usr/src/linux-iscsi/trunk/target/target/iscsi_target_erl2.o 
  /usr/src/linux-iscsi/trunk/target/target/iscsi_target_feature_obj.o 
  /usr/src/linux-iscsi/trunk/target/target/iscsi_target_hba.o 
  /usr/src/linux-iscsi/trunk/target/target/iscsi_target_info.o 
  /usr/src/linux-iscsi/trunk/target/target/iscsi_target_ioctl.o 
  /usr/src/linux-iscsi/trunk/target/target/iscsi_target_linux_proc.o 
  /usr/src/linux-iscsi/trunk/target/target/iscsi_target_login.o 
  /usr/src/linux-iscsi/trunk/target/target/iscsi_target_nego.o 
  /usr/src/linux-iscsi/trunk/target/target/iscsi_target_nodeattrib.o 
  /usr/src/linux-iscsi/trunk/target/target/iscsi_target_plugin.o 
  /usr/src/linux-iscsi/trunk/target/target/iscsi_target_reportluns.o 
  /usr/src/linux-iscsi/trunk/target/target/iscsi_target_scdb.o 
  /usr/src/linux-iscsi/trunk/target/target/iscsi_target_seobj.o 
  /usr/src/linux-iscsi/trunk/target/target/iscsi_target_tmr.o 
  /usr/src/linux-iscsi/trunk/target/target/iscsi_target_tpg.o 
  /usr/src/linux-iscsi/trunk/target/target/iscsi_target_transport.o 
  /usr/src/linux-iscsi/trunk/target/target/iscsi_target_util.o 
  /usr/src/linux-iscsi/trunk/target/target/iscsi_target.o 
  /usr/src/linux-iscsi/trunk/target/target/div64.o 
  /usr/src/linux-iscsi/trunk/target/target/iscsi_target_raid.o 
  /usr/src/linux-iscsi/trunk/target/target/iscsi_target_repl.o 
  /usr/src/linux-iscsi/trunk/target/target/iscsi_target_iblock.o 
  /usr/src/linux-iscsi/trunk/target/target/iscsi_target_pscsi.o 
  /usr/src/linux-iscsi/trunk/target/target/iscsi_target_rd.o 
  /usr/src/linux-iscsi/trunk/target/target/iscsi_target_file.o 
  /usr/src/linux-iscsi/trunk/target/target/iscsi_target_vt.o 
  /usr/src/linux-iscsi/trunk/target/target/iscsi_target_mc.o 
  /usr/src/linux-iscsi/trunk/target/target/iscsi_target_mib.o 
  /usr/src/linux-iscsi/trunk/target/target/iscsi_target_mod.o 
  /usr/src/linux-iscsi/trunk/target/target/iscsi_target_mod.mod.o
  rm -f iscsi_target_mod.ko iscsi_target_mod.mod.c
  rm -f .*.cmd ../common/.*.cmd .make_autoconfig *~
  rm -fr .tmp_versions
  make -C /lib/modules/2.6.24-06289-g144de36/build 
  SUBDIRS=/usr/src/linux-iscsi/trunk/target/target modules 
  CWD=/usr/src/linux-iscsi/trunk/target/target 

Re: Integration of SCST in the mainstream Linux kernel

2008-02-04 Thread David Dillow

On Mon, 2008-02-04 at 14:53 +0100, Bart Van Assche wrote:
> Another issue I have to look further into is that dd and xdd report
> different results for very large block sizes (> 1 MB).

Be aware that xdd reports 1 MB as 1000000, not 1048576. Though, it looks
like dd is the same, so that's probably not helpful. Also, make sure
you're passing {i,o}flag=direct to dd if you're using -dio in xdd to be
sure you are comparing apples to apples.
-- 
Dave Dillow
National Center for Computational Science
Oak Ridge National Laboratory
(865) 241-6602 office
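
To make the unit mismatch above concrete, here is a minimal sketch of how the same transfer yields different "MB/s" figures depending on whether 1 MB means 10^6 or 2^20 bytes. The helper name `throughput()` and both macros are made up for this illustration and are not taken from dd or xdd.

```c
#include <stdint.h>

#define MB_DECIMAL 1000000ULL   /* 10^6 bytes */
#define MB_BINARY  1048576ULL   /* 2^20 bytes */

/*
 * Throughput in units of 'unit_bytes' per second, given the number of
 * bytes moved and the elapsed time in seconds.
 */
static double throughput(uint64_t bytes, double seconds, uint64_t unit_bytes)
{
    return (double)bytes / (double)unit_bytes / seconds;
}
```

Moving 1048576000 bytes in one second is 1000 MB/s in binary units but roughly 1048.6 MB/s in decimal units, a difference of about 4.9% that is easy to mistake for a real performance gap when comparing two tools.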




[PATCH 21/24][RFC] scsi_tgt: use of sense accessors

2008-02-04 Thread Boaz Harrosh
  FIXME: I need help with this driver (Pete?)
I used scsi_sense() in a non-const way. But since
scsi_tgt is the ULD here, it can just access its own sense
buffer directly. I did not use scsi_eh_cpy_sense() because
I did not want the extra copy. Pete will want to use a 260
byte buffer here.

Signed-off-by: Boaz Harrosh [EMAIL PROTECTED]
Need-help-from: Pete Wyckoff [EMAIL PROTECTED]
---
 drivers/scsi/scsi_tgt_lib.c |3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/drivers/scsi/scsi_tgt_lib.c b/drivers/scsi/scsi_tgt_lib.c
index e01a985..d42f318 100644
--- a/drivers/scsi/scsi_tgt_lib.c
+++ b/drivers/scsi/scsi_tgt_lib.c
@@ -29,6 +29,7 @@
 #include <scsi/scsi_host.h>
 #include <scsi/scsi_transport.h>
 #include <scsi/scsi_tgt.h>
+#include <scsi/scsi_eh.h>
 
 #include "scsi_tgt_priv.h"
 
@@ -397,7 +398,7 @@ static int scsi_tgt_copy_sense(struct scsi_cmnd *cmd, unsigned long uaddr,
 {
char __user *p = (char __user *) uaddr;
 
-   if (copy_from_user(cmd->sense_buffer, p,
+   if (copy_from_user(scsi_sense(cmd), p,
   min_t(unsigned, SCSI_SENSE_BUFFERSIZE, len))) {
printk(KERN_ERR "Could not copy the sense buffer\n");
return -EIO;
-- 
1.5.3.3



Re: [PATCH 4/24][RFC] firewire ieee1394: Simple convert to new scsi_eh_cpy_sense.

2008-02-04 Thread Boaz Harrosh
On Mon, Feb 04 2008 at 18:37 +0200, Stefan Richter [EMAIL PROTECTED] wrote:
 Boaz Harrosh wrote:
 --- a/drivers/ieee1394/sbp2.c
 +++ b/drivers/ieee1394/sbp2.c
 @@ -1672,8 +1673,11 @@ static int sbp2_send_command(struct sbp2_lu *lu, 
 struct scsi_cmnd *SCpnt,
   * Translates SBP-2 status into SCSI sense data for check conditions
   */
  static unsigned int sbp2_status_to_sense_data(unchar *sbp2_status,
 -  unchar *sense_data)
 +  struct scsi_cmnd *SCpnt)
  {
 +u8 sense_data[16];
 +
 +memset(sense_data, 0, sizeof(sense_data));
  /* OK, it's pretty ugly... ;-) */
  sense_data[0] = 0x70;
  sense_data[1] = 0x0;
 @@ -1691,6 +1695,7 @@ static unsigned int sbp2_status_to_sense_data(unchar 
 *sbp2_status,
  sense_data[13] = sbp2_status[11];
  sense_data[14] = sbp2_status[20];
  sense_data[15] = sbp2_status[21];
 +scsi_eh_cpy_sense(SCpnt, sense_data, sizeof(sense_data));
  
  return sbp2_status[8] & 0x3f;
  }
 
 You don't need the memset.
 
OK I see what you mean now, they are all used. Thanks.

 Also, here and in drivers/firewire/fw-sbp2.c, the SCSI sense data could
 AFAICS be rewritten in-place in sbp2_status.  But I don't know if this
 is a worthwhile optimization; it would reduce readability.
Right, it's a very unlikely code path. Readability is more important
here.

I will fix both places. Maybe I can also use here what Jeff suggested in the
other mail. We'll see how it goes.

Boaz



[PATCH 22/24][RFC] scsi upper layer use of sense accessors

2008-02-04 Thread Boaz Harrosh
  code that inspects the return sense can use the new scsi_sense()
  accessor.

Signed-off-by: Boaz Harrosh [EMAIL PROTECTED]
---
 drivers/scsi/constants.c |    4 ++--
 drivers/scsi/sd.c        |    4 ++--
 drivers/scsi/sr.c        |   15 +++++++--------
 3 files changed, 11 insertions(+), 12 deletions(-)

diff --git a/drivers/scsi/constants.c b/drivers/scsi/constants.c
index 9785d73..0c6ad0e 100644
--- a/drivers/scsi/constants.c
+++ b/drivers/scsi/constants.c
@@ -1349,10 +1349,10 @@ void scsi_print_sense(char *name, struct scsi_cmnd *cmd)
struct scsi_sense_hdr sshdr;
 
scmd_printk(KERN_INFO, cmd, "");
-   scsi_decode_sense_buffer(cmd->sense_buffer, SCSI_SENSE_BUFFERSIZE,
+   scsi_decode_sense_buffer(scsi_sense(cmd), SCSI_SENSE_BUFFERSIZE,
 &sshdr);
scsi_show_sense_hdr(&sshdr);
-   scsi_decode_sense_extras(cmd->sense_buffer, SCSI_SENSE_BUFFERSIZE,
+   scsi_decode_sense_extras(scsi_sense(cmd), SCSI_SENSE_BUFFERSIZE,
 &sshdr);
scmd_printk(KERN_INFO, cmd, "");
scsi_show_extd_sense(sshdr.asc, sshdr.ascq);
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 51a5557..be3cee8 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -960,7 +960,7 @@ static int sd_done(struct scsi_cmnd *SCpnt)
case MEDIUM_ERROR:
if (!blk_fs_request(SCpnt->request))
goto out;
-   info_valid = scsi_get_sense_info_fld(SCpnt->sense_buffer,
+   info_valid = scsi_get_sense_info_fld(scsi_sense(SCpnt),
 SCSI_SENSE_BUFFERSIZE,
 &bad_lba);
if (!info_valid)
@@ -999,7 +999,7 @@ static int sd_done(struct scsi_cmnd *SCpnt)
 */
scsi_print_sense("sd", SCpnt);
SCpnt->result = 0;
-   memset(SCpnt->sense_buffer, 0, SCSI_SENSE_BUFFERSIZE);
+   scsi_eh_reset_sense(SCpnt);
good_bytes = xfer_size;
break;
case ILLEGAL_REQUEST:
diff --git a/drivers/scsi/sr.c b/drivers/scsi/sr.c
index 50ba492..6dcb933 100644
--- a/drivers/scsi/sr.c
+++ b/drivers/scsi/sr.c
@@ -230,6 +230,7 @@ out:
  */
 static int sr_done(struct scsi_cmnd *SCpnt)
 {
+   const u8 *sense = scsi_sense(SCpnt);
int result = SCpnt->result;
int this_count = scsi_bufflen(SCpnt);
int good_bytes = (result == 0 ? this_count : 0);
@@ -248,17 +249,15 @@ static int sr_done(struct scsi_cmnd *SCpnt)
 * memcpy's that could be avoided.
 */
if (driver_byte(result) != 0 && /* An error occurred */
-   (SCpnt->sense_buffer[0] & 0x7f) == 0x70) { /* Sense current */
-   switch (SCpnt->sense_buffer[2]) {
+   (sense[0] & 0x7f) == 0x70) { /* Sense current */
+   switch (sense[2]) {
case MEDIUM_ERROR:
case VOLUME_OVERFLOW:
case ILLEGAL_REQUEST:
-   if (!(SCpnt->sense_buffer[0] & 0x90))
+   if (!(sense[0] & 0x90))
break;
-   error_sector = (SCpnt->sense_buffer[3] << 24) |
-   (SCpnt->sense_buffer[4] << 16) |
-   (SCpnt->sense_buffer[5] << 8) |
-   SCpnt->sense_buffer[6];
+   error_sector = (sense[3] << 24) | (sense[4] << 16) |
+   (sense[5] << 8) | sense[6];
if (SCpnt->request->bio != NULL)
block_sectors =
bio_sectors(SCpnt->request->bio);
@@ -292,7 +291,7 @@ static int sr_done(struct scsi_cmnd *SCpnt)
 */
scsi_print_sense("sr", SCpnt);
SCpnt->result = 0;
-   SCpnt->sense_buffer[0] = 0x0;
+   scsi_eh_reset_sense(SCpnt);
good_bytes = this_count;
break;
 
-- 
1.5.3.3
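
For readers following the sr.c hunk, a small stand-alone sketch of the byte layout it relies on: response code in the low 7 bits of byte 0, the VALID bit in bit 7, and a big-endian INFORMATION field in bytes 3 to 6. The helper names are invented for this example and are not part of the patch series; the explicit `uint32_t` casts are an addition so that shifting a byte with its top bit set cannot overflow a signed int.

```c
#include <stdbool.h>
#include <stdint.h>

/* True for current-error, fixed-format sense data (response code 0x70). */
static bool sense_is_current_fixed(const uint8_t *sense)
{
    return (sense[0] & 0x7f) == 0x70;
}

/* True when the VALID bit says the INFORMATION field is meaningful. */
static bool sense_info_valid(const uint8_t *sense)
{
    return (sense[0] & 0x80) != 0;
}

/* INFORMATION field: bytes 3..6, most significant byte first. */
static uint32_t sense_info(const uint8_t *sense)
{
    return ((uint32_t)sense[3] << 24) | ((uint32_t)sense[4] << 16) |
           ((uint32_t)sense[5] << 8)  |  (uint32_t)sense[6];
}
```

A caller would check `sense_is_current_fixed()` and `sense_info_valid()` first and only then treat `sense_info()` as the failing sector, which mirrors the order of the checks in the hunk.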



Re: Integration of SCST in the mainstream Linux kernel

2008-02-04 Thread Nicholas A. Bellinger
On Mon, 2008-02-04 at 10:29 -0800, Linus Torvalds wrote:
 
> On Mon, 4 Feb 2008, James Bottomley wrote:
> > 
> > The way a user space solution should work is to schedule mmapped I/O
> > from the backing store and then send this mmapped region off for target
> > I/O.
> 
> mmap'ing may avoid the copy, but the overhead of a mmap operation is 
> quite often much *bigger* than the overhead of a copy operation.
> 
> Please do not advocate the use of mmap() as a way to avoid memory copies. 
> It's not realistic. Even if you can do it with a single mmap() system 
> call (which is not at all a given, considering that block devices can 
> easily be much larger than the available virtual memory space), the fact 
> is that page table games along with the fault (and even just TLB miss) 
> overhead is easily more than the cost of copying a page in a nice 
> streaming manner.
> 
> Yes, memory is slow, but dammit, so is mmap().
> 
> > You also have to pull tricks with the mmap region in the case of writes 
> > to prevent useless data being read in from the backing store.  However, 
> > none of this involves data copies.
> 
> "data copies" is irrelevant. The only thing that matters is performance. 
> And if avoiding data copies is more costly (or even of a similar cost) 
> than the copies themselves would have been, there is absolutely no upside, 
> and only downsides due to extra complexity.
 

The iSER spec (RFC-5046) states the following about direct data placement
in the TCP case:

  Out-of-order TCP segments in the Traditional iSCSI model have to be
   stored and reassembled before the iSCSI protocol layer within an end
   node can place the data in the iSCSI buffers.  This reassembly is
   required because not every TCP segment is likely to contain an iSCSI
   header to enable its placement, and TCP itself does not have a
   built-in mechanism for signaling Upper Level Protocol (ULP) message
   boundaries to aid placement of out-of-order segments.  This TCP
   reassembly at high network speeds is quite counter-productive for the
   following reasons: wasted memory bandwidth in data copying, the need
   for reassembly memory, wasted CPU cycles in data copying, and the
   general store-and-forward latency from an application perspective.

While this does not have anything to do directly with the kernel vs. user
discussion for a target mode storage engine, the scaling and latency case is
easy enough to make if we are talking about scaling TCP for 10 Gb/sec storage
fabrics.

> If you want good performance for a service like this, you really generally 
> *do* need to be in kernel space. You can play games in user space, but you're 
> fooling yourself if you think you can do as well as doing it in the 
> kernel. And you're *definitely* fooling yourself if you think mmap() 
> solves performance issues. "Zero-copy" does not equate to "fast". Memory 
> speeds may be slower than core CPU speeds, but not infinitely so!
 

From looking at this problem from a kernel space perspective for a
number of years, I would be inclined to believe this is true for both the
software and hardware data-path cases.  The benefits of moving various
control state machines for something like, say, traditional iSCSI to
userspace have always been debatable.  The most obvious candidate for
userspace is authentication, especially anything more complex than CHAP.
However, I have found recovery from failures of the communication path
(iSCSI connections) or of entire nexuses (iSCSI sessions) very problematic
if it means potentially having to push I/O state down to userspace.

Protocol and/or fabric specific state machines (CSM-E and CSM-I from
connection recovery in iSCSI and iSER are the obvious ones) are the best
candidates for residing in kernel space.

> (That said: there *are* alternatives to mmap, like splice(), that really 
> do potentially solve some issues without the page table and TLB overheads. 
> But while splice() avoids the costs of paging, I strongly suspect it would 
> still have easily measurable latency issues. Switching between user and 
> kernel space multiple times is definitely not going to be free, although 
> it's probably not a huge issue if you have big enough requests).
 

Most of the SCSI OS storage subsystems that I have worked with in the
context of iSCSI have used 256 * 512 byte sector requests, with the
default traditional iSCSI PDU data payload (MRDSL) being 64k to hit the
sweet spot with crc32c checksum calculations.  I am assuming this is
going to be the case for other fabrics as well.
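
As a rough illustration of the numbers above, the sketch below shows a plain bitwise CRC32C and the arithmetic for splitting a request into data PDUs. It assumes "64k" means 65536 bytes; `crc32c()` and `pdus_needed()` are names chosen for this example and are not the kernel's implementation, which uses table-driven or hardware-assisted code.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC32C (Castagnoli), reflected polynomial 0x82F63B78. */
static uint32_t crc32c(const void *buf, size_t len)
{
    const uint8_t *p = buf;
    uint32_t crc = 0xFFFFFFFFu;

    while (len--) {
        crc ^= *p++;
        for (int bit = 0; bit < 8; bit++) {
            /* Mask is all ones when the low bit is set, else zero. */
            uint32_t mask = 0u - (crc & 1u);
            crc = (crc >> 1) ^ (0x82F63B78u & mask);
        }
    }
    return ~crc;
}

/* Data PDUs needed for 'bytes' when each PDU carries at most 'mrdsl' bytes. */
static size_t pdus_needed(size_t bytes, size_t mrdsl)
{
    return (bytes + mrdsl - 1) / mrdsl;
}
```

With these assumptions a 256 * 512 byte request is 128 KiB and fits in exactly two 64 KiB data PDUs, each of which would carry its own data digest when digests are enabled.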

--nab




RE: [PATCH 02/12] qla4xxx: have qla4xxx directly call iscsi recovery functions

2008-02-04 Thread David Somayajulu
 

Mike Christie wrote:
 
> Qla4xxx can just call the iscsi recovery functions directly.
> There is no need for userspace to do this for qla4xxx, because
> we do not use the mutex to iterate over devices anymore and
> iscsi_block/unblock_session can be called from interrupt context or the
> dpc thread.
> And having userspace do this just creates unneeded headaches for qla4xxx root
> situations where the session may experience problems. For example
> during the kernel shutdown the scsi layer wants to send sync caches, but at
> this time userspace is not up (iscsid is not running), so we cannot
> recover from the problem.
> 
> Signed-off-by: Mike Christie [EMAIL PROTECTED]
> ---
>  drivers/scsi/qla4xxx/ql4_init.c |    1 +
>  drivers/scsi/qla4xxx/ql4_os.c   |   40 +++---
>  2 files changed, 5 insertions(+), 36 deletions(-)
> 
> diff --git a/drivers/scsi/qla4xxx/ql4_init.c b/drivers/scsi/qla4xxx/ql4_init.c
> index cbe0a17..03e66cb 100644
> --- a/drivers/scsi/qla4xxx/ql4_init.c
> +++ b/drivers/scsi/qla4xxx/ql4_init.c
> @@ -1306,6 +1306,7 @@ int qla4xxx_process_ddb_changed(struct scsi_qla_host *ha,
>  		atomic_set(&ddb_entry->relogin_timer, 0);
>  		clear_bit(DF_RELOGIN, &ddb_entry->flags);
>  		clear_bit(DF_NO_RELOGIN, &ddb_entry->flags);
> +		iscsi_unblock_session(ddb_entry->sess);
>  		iscsi_session_event(ddb_entry->sess,
>  				    ISCSI_KEVENT_CREATE_SESSION);
>  		/*
> diff --git a/drivers/scsi/qla4xxx/ql4_os.c b/drivers/scsi/qla4xxx/ql4_os.c
> index 2e2b9fe..a87fb9f 100644
> --- a/drivers/scsi/qla4xxx/ql4_os.c
> +++ b/drivers/scsi/qla4xxx/ql4_os.c
> @@ -63,8 +63,6 @@ static int qla4xxx_sess_get_param(struct iscsi_cls_session *sess,
>  				  enum iscsi_param param, char *buf);
>  static int qla4xxx_host_get_param(struct Scsi_Host *shost,
>  				  enum iscsi_host_param param, char *buf);
> -static void qla4xxx_conn_stop(struct iscsi_cls_conn *conn, int flag);
> -static int qla4xxx_conn_start(struct iscsi_cls_conn *conn);
>  static void qla4xxx_recovery_timedout(struct iscsi_cls_session *session);
>  
>  /*
> @@ -116,8 +114,6 @@ static struct iscsi_transport qla4xxx_iscsi_transport = {
>  	.get_conn_param		= qla4xxx_conn_get_param,
>  	.get_session_param	= qla4xxx_sess_get_param,
>  	.get_host_param		= qla4xxx_host_get_param,
> -	.start_conn		= qla4xxx_conn_start,
> -	.stop_conn		= qla4xxx_conn_stop,
>  	.session_recovery_timedout = qla4xxx_recovery_timedout,
>  };
>  
> @@ -140,38 +136,6 @@ static void qla4xxx_recovery_timedout(struct iscsi_cls_session *session)
>  	queue_work(ha->dpc_thread, &ha->dpc_work);
>  }
>  
> -static int qla4xxx_conn_start(struct iscsi_cls_conn *conn)
> -{
> -	struct iscsi_cls_session *session;
> -	struct ddb_entry *ddb_entry;
> -
> -	session = iscsi_dev_to_session(conn->dev.parent);
> -	ddb_entry = session->dd_data;
> -
> -	DEBUG2(printk("scsi%ld: %s: index [%d] starting conn\n",
> -		      ddb_entry->ha->host_no, __func__,
> -		      ddb_entry->fw_ddb_index));
> -	iscsi_unblock_session(session);
> -	return 0;
> -}
> -
> -static void qla4xxx_conn_stop(struct iscsi_cls_conn *conn, int flag)
> -{
> -	struct iscsi_cls_session *session;
> -	struct ddb_entry *ddb_entry;
> -
> -	session = iscsi_dev_to_session(conn->dev.parent);
> -	ddb_entry = session->dd_data;
> -
> -	DEBUG2(printk("scsi%ld: %s: index [%d] stopping conn\n",
> -		      ddb_entry->ha->host_no, __func__,
> -		      ddb_entry->fw_ddb_index));
> -	if (flag == STOP_CONN_RECOVER)
> -		iscsi_block_session(session);
> -	else
> -		printk(KERN_ERR "iscsi: invalid stop flag %d\n", flag);
> -}
> -
>  static int qla4xxx_host_get_param(struct Scsi_Host *shost,
>  				  enum iscsi_host_param param, char *buf)
>  {
> @@ -308,6 +272,9 @@ int qla4xxx_add_sess(struct ddb_entry *ddb_entry)
>  		DEBUG2(printk(KERN_ERR "Could not add connection.\n"));
>  		return -ENOMEM;
>  	}
> +
> +	/* finally ready to go */
> +	iscsi_unblock_session(ddb_entry->sess);
>  	return 0;
>  }
>  
> @@ -364,6 +331,7 @@ void qla4xxx_mark_device_missing(struct scsi_qla_host *ha,
>  	DEBUG3(printk("scsi%d:%d:%d: index [%d] marked MISSING\n",
>  		      ha->host_no, ddb_entry->bus, ddb_entry->target,
>  		      ddb_entry->fw_ddb_index));
> +	iscsi_block_session(ddb_entry->sess);
>  	iscsi_conn_error(ddb_entry->conn, ISCSI_ERR_CONN_FAILED);
>  }
Acked by David Somayajulu [EMAIL PROTECTED]
 


RE: [PATCH 05/12] qla4xxx: fix recovery timer and session unblock race

2008-02-04 Thread David Somayajulu
 
Mike Christie wrote :
> If qla4xxx is resetting up a session and the recovery timer
> fires we do not want to just set it to dead, because
> the dpc thread could have just set it to online and is in the
> middle of resetting it up.
> 
> Signed-off-by: Mike Christie [EMAIL PROTECTED]
> ---
>  drivers/scsi/qla4xxx/ql4_os.c |   19 +++++++++++--------
>  1 files changed, 11 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/scsi/qla4xxx/ql4_os.c b/drivers/scsi/qla4xxx/ql4_os.c
> index 437d169..d4dd149 100644
> --- a/drivers/scsi/qla4xxx/ql4_os.c
> +++ b/drivers/scsi/qla4xxx/ql4_os.c
> @@ -124,16 +124,19 @@ static void qla4xxx_recovery_timedout(struct iscsi_cls_session *session)
>  	struct ddb_entry *ddb_entry = session->dd_data;
>  	struct scsi_qla_host *ha = ddb_entry->ha;
>  
> -	DEBUG2(printk("scsi%ld: %s: index [%d] port down retry count of (%d) "
> -		      "secs exhausted, marking device DEAD.\n", ha->host_no,
> -		      __func__, ddb_entry->fw_ddb_index,
> -		      ha->port_down_retry_count));
> +	if (atomic_read(&ddb_entry->state) != DDB_STATE_ONLINE) {
> +		atomic_set(&ddb_entry->state, DDB_STATE_DEAD);
>  
> -	atomic_set(&ddb_entry->state, DDB_STATE_DEAD);
> +		DEBUG2(printk("scsi%ld: %s: index [%d] port down retry count "
> +			      "of (%d) secs exhausted, marking device DEAD.\n",
> +			      ha->host_no, __func__, ddb_entry->fw_ddb_index,
> +			      ha->port_down_retry_count));
>  
> -	DEBUG2(printk("scsi%ld: %s: scheduling dpc routine - dpc flags = "
> -		      "0x%lx\n", ha->host_no, __func__, ha->dpc_flags));
> -	queue_work(ha->dpc_thread, &ha->dpc_work);
> +		DEBUG2(printk("scsi%ld: %s: scheduling dpc routine - dpc "
> +			      "flags = 0x%lx\n",
> +			      ha->host_no, __func__, ha->dpc_flags));
> +		queue_work(ha->dpc_thread, &ha->dpc_work);
> +	}
>  }
>  
>  static int qla4xxx_host_get_param(struct Scsi_Host *shost,
> -- 
> 1.5.2.1

Acked by David Somayajulu [EMAIL PROTECTED]
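
The guard in the quoted patch ("only mark the entry dead if it is not already back online") can be mimicked in user space with C11 atomics. This is only a sketch under assumptions: the enum values and `recovery_timed_out()` are hypothetical, and it uses a compare-and-swap loop rather than the separate atomic_read()/atomic_set() pair in the patch, so the check and the update happen as one step.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical device states; the driver's real values are not shown here. */
enum ddb_state { DDB_ONLINE, DDB_MISSING, DDB_DEAD };

/*
 * Mark the entry dead unless it has already been brought back online.
 * Returns true if the entry ended up marked dead by this call.
 */
static bool recovery_timed_out(atomic_int *state)
{
    int cur = atomic_load(state);

    while (cur != DDB_ONLINE) {
        /* On failure 'cur' is refreshed and the online check is repeated. */
        if (atomic_compare_exchange_weak(state, &cur, DDB_DEAD))
            return true;
    }
    return false;
}
```

If another thread flips the state to online between the load and the exchange, the exchange fails, `cur` becomes `DDB_ONLINE`, and the loop exits without touching the state.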


Re: [PATCH 7/9] scsi_dh: Add support for SDEV_PASSIVE

2008-02-04 Thread James Bottomley

On Mon, 2008-02-04 at 12:15 -0800, Chandra Seetharaman wrote:
> On Mon, 2008-02-04 at 12:58 -0600, James Bottomley wrote:
> > On Wed, 2008-01-23 at 16:32 -0800, Chandra Seetharaman wrote:
> > > Subject: scsi_dh: Add support for SDEV_PASSIVE
> > > 
> > > From: Chandra Seetharaman [EMAIL PROTECTED]
> > > 
> > > This patch adds a new device state SDEV_PASSIVE, to correspond to the
> > > passive side access of an active/passive multipathed device.
> > 
> > Really, no; this isn't right.  The state field of a SCSI device is for
> > the SCSI state model.  Passive might be a valid device mapper state, but
> 
> Hi James,
> 
> It is not the device mapper state, it is the state of the device
> itself. These devices have active/passive paths, the passive paths will
> be represented by SDEV_PASSIVE device state in SCSI.

Yes, it is .. you're killing commands on the basis of being in this
state, which nothing in SCSI ever sets.

A proper return from a passive path is the SCSI standard NOT_READY
LOGICAL UNIT NOT READY, INITIALIZING COMMAND REQUIRED.  We expect to see
this, not the command being killed.

James
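
For readers who want to see what that return looks like on the wire, here
is a minimal sketch, assuming fixed-format sense data (response code 0x70
or 0x71).  NOT READY is sense key 0x02, and LOGICAL UNIT NOT READY,
INITIALIZING COMMAND REQUIRED is ASC 0x04 / ASCQ 0x02 in the SCSI
standard.  The helper name sense_is_passive_path() is hypothetical and is
not a kernel API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Values from the SCSI standard (SPC): sense key and additional sense codes. */
#define SK_NOT_READY            0x02
#define ASC_LUN_NOT_READY       0x04
#define ASCQ_INIT_CMD_REQUIRED  0x02

/*
 * Hypothetical helper (not a kernel API): returns true when a fixed-format
 * sense buffer reports NOT READY / LOGICAL UNIT NOT READY, INITIALIZING
 * COMMAND REQUIRED.
 */
static bool sense_is_passive_path(const uint8_t *sense, size_t len)
{
	assert(sense != NULL);

	if (len < 14)
		return false;	/* too short to hold ASC/ASCQ */
	if ((sense[0] & 0x7f) != 0x70 && (sense[0] & 0x7f) != 0x71)
		return false;	/* not fixed-format sense data */

	return (sense[2] & 0x0f) == SK_NOT_READY &&
	       sense[12] == ASC_LUN_NOT_READY &&
	       sense[13] == ASCQ_INIT_CMD_REQUIRED;
}
```

A caller would hand this the sense bytes returned with CHECK CONDITION and
treat a true result as "path is passive", rather than relying on a device
state that nothing in SCSI sets.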




Re: Integration of SCST in the mainstream Linux kernel

2008-02-04 Thread Nicholas A. Bellinger
On Mon, 2008-02-04 at 11:44 -0800, Linus Torvalds wrote:
 
 On Mon, 4 Feb 2008, Nicholas A. Bellinger wrote:
  
  While this does not have anything to do directly with the kernel vs. 
  user discussion for target mode storage engine, the scaling and latency 
  case is easy enough to make if we are talking about scaling TCP for 10 
  Gb/sec storage fabrics.
 
 I would like to point out that while I think there is no question that the 
 basic data transfer engine would perform better in kernel space, there 
 still *are* questions whether
 
  - iSCSI is relevant enough for us to even care ...
 
  - ... and the complexity is actually worth it.
 
 That said, I also tend to believe that trying to split things up between 
 kernel and user space is often more complex than just keeping things in 
 one place, because the trade-offs of which part goes where will inevitably 
 be wrong in *some* area, and then you're really screwed.
 
 So from a purely personal standpoint, I'd like to say that I'm not really 
 interested in iSCSI (and I don't quite know why I've been cc'd on this 
 whole discussion)

The generic target mode storage engine discussion quickly goes to
transport-specific scenarios.  With so much interest in the SCSI
transports, in particular iSCSI, there are lots of devs, users, and
vendors who would like to see Linux improve in this respect.

  and think that other approaches are potentially *much* 
 better. So for example, I personally suspect that ATA-over-ethernet is way 
 better than some crazy SCSI-over-TCP crap,

Having the non-SCSI target mode transports use the same data IO path as
the SCSI ones to the SCSI, BIO, and FILE subsystems is something that can
easily be agreed on.  Also, having to emulate the non-SCSI control paths
in a non-generic manner for a target mode engine has to suck (I don't
know what AoE does for that now, considering that this is going down to
libata or real SCSI hardware in some cases).  Some of the more arcane
task management functionality in SCSI (ACA anyone?) is not used even by
generic SCSI target mode engines, and only seems to make things
endlessly complex to implement and emulate.

But aside from those very SCSI hardware specific cases, having a generic
method to use something like ABORT_TASK or LUN_RESET for a target mode
engine (along with the data path to all of the subsystems) would be
beneficial for any fabric.

 but I'm biased for simple and 
 low-level, and against those crazy SCSI people to begin with.

Well, having no obvious preconception (well, aside from the email
address), I am of the mindset that the iSCSI people are the LEAST crazy
of said crazy SCSI people.  Some people (usually least crazy iSCSI
standards folks) say that FCoE people are crazy.  Being one of the iSCSI
people I am kinda obligated to agree, but the technical points are
really solid, and have been so for over a decade.  They are listed here
for those who are interested:

http://www.ietf.org/mail-archive/web/ips/current/msg02325.html

 
 So take any utterances of mine with a big pinch of salt.
 
 Historically, the only split that has worked pretty well is connection 
 initiation/setup in user space, actual data transfers in kernel space. 
 
 Pure user-space solutions work, but tend to eventually be turned into 
 kernel-space if they are simple enough and really do have throughput and 
 latency considerations (eg nfsd), and aren't quite complex and crazy 
 enough to have a large impedance-matching problem even for basic IO stuff 
 (eg samba).
 
 And totally pure kernel solutions work only if there are very stable 
 standards and no major authentication or connection setup issues (eg local 
 disks).
 
 So just going by what has happened in the past, I'd assume that iSCSI 
 would eventually turn into connecting/authentication in user space with 
 data transfers in kernel space. But only if it really does end up 
 mattering enough. We had a totally user-space NFS daemon for a long time, 
 and it was perfectly fine until people really started caring.

Thanks for putting this into a historical perspective.  Also it is
interesting to note that the iSCSI spec (RFC-3720) was ratified in April
2004, so it will be going on 4 years soon, with pre-RFC products first
going out in 2001 (yikes!).  In my experience, the iSCSI interop
amongst implementations (especially between different OSes) has been
stable since about late 2004, early 2005, with interop between OS SCSI
subsystems (especially talking to non-SCSI hardware) being the slower of
the two.

--nab




Re: dmesg spam

2008-02-04 Thread Jeff Garzik

James Bottomley wrote:

It's here in sr_ioctl.c:


Ah, indeed.  My grep-fu sucks today.



I'm not averse to simply nuking the printk ... it's probably valueless
in a modern kernel, since something dbussy is supposed to tell you to
put a CD in the drive, not something in the kernel.


The reverse...  dbussy/HAL is implementing autodetection of media 
insertion, by polling ad infinitum.


Jeff




Re: [PATCH 3/9] scsi_dh: scsi handling of REQ_LB_OP_TRANSITION

2008-02-04 Thread Chandra Seetharaman
On Fri, 2008-02-01 at 14:00 -0600, Mike Christie wrote:
 Chandra Seetharaman wrote:
  @@ -1445,9 +1479,24 @@ static void scsi_kill_request(struct req
   static void scsi_softirq_done(struct request *rq)
   {
   	struct scsi_cmnd *cmd = rq->completion_data;
  -	unsigned long wait_for = (cmd->allowed + 1) * cmd->timeout_per_command;
   	int disposition;
  +	struct request_queue *q;
  +	unsigned long wait_for, flags;
   
  +	if (blk_linux_request(rq)) {
  +		q = rq->q;
  +		spin_lock_irqsave(q->queue_lock, flags);
  +		/*
  +		 * we always return 1 and the caller should
  +		 * check rq->errors for the complete status
  +		 */
  +		end_that_request_last(rq, 1);
  +		spin_unlock_irqrestore(q->queue_lock, flags);
  +		return;
  +	}
  +
  +
  +	wait_for = (cmd->allowed + 1) * cmd->timeout_per_command;
   	INIT_LIST_HEAD(&cmd->eh_entry);
   
 .
 
  +
   /*
    * Function:    scsi_request_fn()
    *
  @@ -1519,7 +1612,23 @@ static void scsi_request_fn(struct reque
   		 * accept it.
   		 */
   		req = elv_next_request(q);
  -		if (!req || !scsi_dev_queue_ready(q, sdev))
  +		if (!req)
  +			break;
  +
  +		/*
  +		 * We do not account for linux blk req in the device
  +		 * or host busy accounting because it is not necessarily
  +		 * a scsi command that is sent to some object. The lower
  +		 * level can translate it into a request/scsi_cmnd, if
  +		 * necessary, and then queue that up using REQ_TYPE_BLOCK_PC.
  +		 */
  +		if (blk_linux_request(req)) {
  +			blkdev_dequeue_request(req);
  +			scsi_execute_blk_linux_cmd(req);
  +			continue;
  +		}
  +
  +		if (!scsi_dev_queue_ready(q, sdev))
   			break;
 
 I think these two pieces are one of the reasons I have not pushed the 
 patches. I thought the completion and execution pieces here are a little 
 ugly and seem to just wedge themselves in where they want to be.
 
 Is there any way to make the insertion of non-scsi commands more common? 
 Do we have the code for being able to send requests directly to 
 something like a fc rport done? Could we maybe inject these special 
 commands to the hw handler using something similar to how bsg would send 
 non scsi commands to weird objects (objects like rport, sessions, and 
 not devices we traditionally associated with queues like scsi_devices). 
 Just a thought with no code :) that is why the ugly code existed still :)

Can't it be done with this code itself ?

If the underlying functionality is going to be provided by the hardware
handler, then can't we add additional commands (like transition) when we
need them ?

Or am I missing something ?

-- 

--
Chandra Seetharaman   | Be careful what you choose
  - [EMAIL PROTECTED]   |  ...you may get it.
--




[PATCH 24/24][RFC] scsi: New sense handling

2008-02-04 Thread Boaz Harrosh
  Current code allocates a DMA-able sense buffer for every SCSI command.
  In 99.9% of cases this is a waste because:
    - Most commands end successfully and do not need any sense space.
    - Most LLDs already have the sense information in their private data
      and do not DMA this information. The sense can go directly to the
      ULD's buffer.
    - Drivers that do a synchronous REQUEST_SENSE through the
      scsi_eh_prep_cmnd mechanism need only one DMA buffer per target
      (contingent allegiance condition).

  This leaves us with very few drivers that need a DMA-able sense buffer
  pre-allocated per command. These drivers set the .pre_allocate_sense
  flag in their host template, and the mid-layer will make sure a sense
  buffer is allocated for them.

  So:
  - Remove the global sense mem_cache and the per-command sense allocation.
  - Add a per-host sense mempool_t with the buffer size specified by the
    host template in .sense_buffsize. If .sense_buffsize is not set then
    no mempool_t is allocated.
    (Note: This is true for the majority of drivers.)
  - If the host does not have .pre_allocate_sense set then a reserved
    sense buffer is allocated per scsi-target (LUN).
  - If the host has .pre_allocate_sense set then a sense buffer is
    allocated in get_command() and the code behaves like today.
  - Drivers that need a DMA sense buffer call
    scsi_make_sense()/scsi_return_sense(). These drivers were already
    converted. Here the implementation will allocate the sense buffer
    from the above per-host pool. Subsequent calls to scsi_make_sense()
    will return the same buffer. The first call to scsi_return_sense()
    will do the scsi_eh_cpy_sense() and return the buffer to the free
    store. Subsequent calls will do nothing.
  - scsi_eh_prep_cmnd()/scsi_eh_return_cmnd() are converted to use
    scsi_make_sense()/scsi_return_sense(). Since these drivers set
    .sense_buffsize but leave .pre_allocate_sense unset, a pre-allocated
    sense buffer is guaranteed per target even in a low-memory situation.
  - scsi_cmnd->sense_buffer is removed as it is no longer needed, and
    scsi_eh_cpy_sense() will copy directly to the ULD's supplied buffer
    at request->sense, not exceeding ->sense_max_len and setting
    ->sense_len to the copied count. So the scsi_lib sense handling code
    can also be removed.
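
The make/return pairing described above can be illustrated with a small
user-space sketch.  This is not the patch's code: struct cmd_sketch,
make_sense() and return_sense() are hypothetical stand-ins, calloc()
stands in for the per-host mempool, and the 96-byte size is only an
assumed .sense_buffsize.  It shows the two properties the description
relies on: repeated "make" calls hand back the same buffer, and only the
first "return" copies (clamped to the ULD buffer size) and frees.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define SENSE_BUFFSIZE 96	/* assumed stand-in for .sense_buffsize */

/* Hypothetical, simplified stand-in for a command; not the kernel's scsi_cmnd. */
struct cmd_sketch {
	unsigned char *dma_sense;	/* borrowed buffer, NULL when none is held */
	unsigned char *uld_sense;	/* upper-layer buffer (request->sense) */
	unsigned short sense_max_len;	/* capacity of uld_sense */
	unsigned short sense_len;	/* bytes actually copied */
};

/* First call borrows a buffer; later calls return the same one. */
static unsigned char *make_sense(struct cmd_sketch *cmd)
{
	assert(cmd != NULL);
	if (!cmd->dma_sense)
		cmd->dma_sense = calloc(1, SENSE_BUFFSIZE);	/* pool in the real patch */
	return cmd->dma_sense;
}

/* First call copies to the ULD buffer and frees; later calls do nothing. */
static void return_sense(struct cmd_sketch *cmd)
{
	size_t n;

	assert(cmd != NULL);
	if (!cmd->dma_sense)
		return;

	/* Never copy more than the ULD buffer can hold. */
	n = cmd->sense_max_len < SENSE_BUFFSIZE ? cmd->sense_max_len
						: SENSE_BUFFSIZE;
	memcpy(cmd->uld_sense, cmd->dma_sense, n);
	cmd->sense_len = (unsigned short)n;

	free(cmd->dma_sense);
	cmd->dma_sense = NULL;
}
```

Making both calls idempotent means an error-handling path and a normal
completion path can each call them without tracking who got there first.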

Signed-off-by: Boaz Harrosh [EMAIL PROTECTED]
---
 drivers/scsi/scsi.c   |  207 ++---
 drivers/scsi/scsi_error.c |   45 +++---
 drivers/scsi/scsi_lib.c   |   15 +---
 drivers/scsi/scsi_priv.h  |3 +-
 drivers/scsi/scsi_scan.c  |6 ++
 include/scsi/scsi_cmnd.h  |5 +-
 include/scsi/scsi_eh.h|   14 +++-
 include/scsi/scsi_host.h  |3 +
 8 files changed, 194 insertions(+), 104 deletions(-)

diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
index af29ccc..3f1bb9d 100644
--- a/drivers/scsi/scsi.c
+++ b/drivers/scsi/scsi.c
@@ -132,24 +132,20 @@ const char * scsi_device_type(unsigned type)
 EXPORT_SYMBOL(scsi_device_type);
 
 struct scsi_host_cmd_pool {
-	struct kmem_cache	*cmd_slab;
-	struct kmem_cache	*sense_slab;
-	unsigned int		users;
-	char			*cmd_name;
-	char			*sense_name;
-	unsigned int		slab_flags;
-	gfp_t			gfp_mask;
+	struct kmem_cache	*slab;
+	unsigned int		users;
+	char			*name;
+	unsigned int		slab_flags;
+	gfp_t			gfp_mask;
 };
 
 static struct scsi_host_cmd_pool scsi_cmd_pool = {
-	.cmd_name	= "scsi_cmd_cache",
-	.sense_name	= "scsi_sense_cache",
+	.name		= "scsi_cmd_cache",
 	.slab_flags	= SLAB_HWCACHE_ALIGN,
 };
 
 static struct scsi_host_cmd_pool scsi_cmd_dma_pool = {
-	.cmd_name	= "scsi_cmd_cache(DMA)",
-	.sense_name	= "scsi_sense_cache(DMA)",
+	.name		= "scsi_cmd_cache(DMA)",
 	.slab_flags	= SLAB_HWCACHE_ALIGN|SLAB_CACHE_DMA,
 	.gfp_mask	= __GFP_DMA,
 };
@@ -167,10 +163,9 @@ static DEFINE_MUTEX(host_cmd_pool_mutex);
 struct scsi_cmnd *__scsi_get_command(struct Scsi_Host *shost, gfp_t gfp_mask)
 {
 	struct scsi_cmnd *cmd;
-	unsigned char *buf;
 
-	cmd = kmem_cache_alloc(shost->cmd_pool->cmd_slab,
-			       gfp_mask | shost->cmd_pool->gfp_mask);
+	cmd = kmem_cache_alloc(shost->cmd_pool->slab,
+				gfp_mask | shost->cmd_pool->gfp_mask);
 
 	if (unlikely(!cmd)) {
 		unsigned long flags;
@@ -182,22 +177,6 @@ struct scsi_cmnd *__scsi_get_command(struct Scsi_Host *shost, gfp_t gfp_mask)
 			list_del_init(&cmd->list);
 		}
 		spin_unlock_irqrestore(&shost->free_list_lock, flags);
-
-		if (cmd) {
-			buf = cmd->sense_buffer;
-			memset(cmd, 0, sizeof(*cmd));
-			cmd->sense_buffer = 

[PATCH 23/24][RFC] block: Minor changes to sense handling

2008-02-04 Thread Boaz Harrosh
  - It is no longer allowed to call blk_execute_rq_nowait() without
    a req->sense buffer. This is not a problem because grepping all users
    shows that this does not happen.

  - Add a sense_max_len which indicates the buffer size at req->sense. If
    zero then SCSI_SENSE_BUFFERSIZE is assumed. (As before)

  - SCSI_SENSE_BUFFERSIZE is moved to scsi.h (from scsi_cmnd.h)

Signed-off-by: Boaz Harrosh [EMAIL PROTECTED]
---
 block/blk-core.c |1 +
 block/blk-exec.c |5 +
 block/scsi_ioctl.c   |1 -
 include/linux/blkdev.h   |3 ++-
 include/scsi/scsi.h  |6 ++
 include/scsi/scsi_cmnd.h |1 -
 include/scsi/scsi_eh.h   |1 +
 7 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 1c5cfa7..de973ab 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -120,6 +120,7 @@ void rq_init(struct request_queue *q, struct request *rq)
 	rq->data = NULL;
 	rq->nr_phys_segments = 0;
 	rq->sense = NULL;
+	rq->sense_max_len = 0;
 	rq->end_io = NULL;
 	rq->end_io_data = NULL;
 	rq->completion_data = NULL;
diff --git a/block/blk-exec.c b/block/blk-exec.c
index 391dd62..3c14a7b 100644
--- a/block/blk-exec.c
+++ b/block/blk-exec.c
@@ -51,6 +51,10 @@ void blk_execute_rq_nowait(struct request_queue *q, struct gendisk *bd_disk,
 {
 	int where = at_head ? ELEVATOR_INSERT_FRONT : ELEVATOR_INSERT_BACK;
 
+	BUG_ON(!rq->sense);
+	if (!rq->sense_max_len)
+		rq->sense_max_len = SCSI_SENSE_BUFFERSIZE;
+
 	rq->rq_disk = bd_disk;
 	rq->cmd_flags |= REQ_NOMERGE;
 	rq->end_io = done;
@@ -90,6 +94,7 @@ int blk_execute_rq(struct request_queue *q, struct gendisk *bd_disk,
 		memset(sense, 0, sizeof(sense));
 		rq->sense = sense;
 		rq->sense_len = 0;
+		rq->sense_max_len = SCSI_SENSE_BUFFERSIZE;
 	}
 
 	rq->end_io_data = &wait;
diff --git a/block/scsi_ioctl.c b/block/scsi_ioctl.c
index a1d7070..9da2505 100644
--- a/block/scsi_ioctl.c
+++ b/block/scsi_ioctl.c
@@ -30,7 +30,6 @@
 
 #include <scsi/scsi.h>
 #include <scsi/scsi_ioctl.h>
-#include <scsi/scsi_cmnd.h>
 
 /* Command group 3 is reserved and should never be used.  */
 const unsigned char scsi_command_size_tbl[8] =
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index a8a6c20..29fb039 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -222,7 +222,8 @@ struct request {
 	};
 
 	unsigned int data_len;
-	unsigned int sense_len;
+	unsigned short sense_len;
+	unsigned short sense_max_len;
 	void *data;
 	void *sense;
 
diff --git a/include/scsi/scsi.h b/include/scsi/scsi.h
index 9c36800..91e65cf 100644
--- a/include/scsi/scsi.h
+++ b/include/scsi/scsi.h
@@ -415,6 +415,12 @@ struct scsi_lun {
 #define sense_valid(sense)  ((sense) & 0x80);
 
 /*
+ * Some scsi sense constants
+ */
+#define SCSI_SENSE_BUFFERSIZE	96
+#define SCSI_SENSE_MAX_SIZE	260
+
+/*
  * default timeouts
 */
 #define FORMAT_UNIT_TIMEOUT		(2 * 60 * 60 * HZ)
diff --git a/include/scsi/scsi_cmnd.h b/include/scsi/scsi_cmnd.h
index c32d0da..000a544 100644
--- a/include/scsi/scsi_cmnd.h
+++ b/include/scsi/scsi_cmnd.h
@@ -88,7 +88,6 @@ struct scsi_cmnd {
 	struct request *request;	/* The command we are
 					   working on */
 
-#define SCSI_SENSE_BUFFERSIZE	96
 	unsigned char *sense_buffer;
 				/* obtained by REQUEST SENSE when
 				 * CHECK CONDITION is received on original
diff --git a/include/scsi/scsi_eh.h b/include/scsi/scsi_eh.h
index ce84330..97a6180 100644
--- a/include/scsi/scsi_eh.h
+++ b/include/scsi/scsi_eh.h
@@ -4,6 +4,7 @@
 #include <linux/scatterlist.h>
 
 #include <scsi/scsi_cmnd.h>
+#include <scsi/scsi.h>
 struct scsi_device;
 struct Scsi_Host;
 
-- 
1.5.3.3



Re: Integration of SCST in the mainstream Linux kernel

2008-02-04 Thread Linus Torvalds


On Mon, 4 Feb 2008, J. Bruce Fields wrote:
 
 I'd assumed the move was primarily because of the difficulty of getting
 correct semantics on a shared filesystem

.. not even shared. It was hard to get correct semantics full stop. 

Which is a traditional problem. The thing is, the kernel always has some 
internal state, and it's hard to expose all the semantics that the kernel 
knows about to user space.

So no, performance is not the only reason to move to kernel space. It can 
easily be things like needing direct access to internal data queues (for an 
iSCSI target, this could be things like barriers or just tagged commands - 
yes, you can probably emulate things like that without access to the 
actual IO queues, but are you sure the semantics will be entirely right?).

The kernel/userland boundary is not just a performance boundary, it's an 
abstraction boundary too, and these kinds of protocols tend to break 
abstractions. NFS broke it by having file handles (which is not 
something that really exists in user space, and is almost impossible to 
emulate correctly), and I bet the same thing happens when emulating a SCSI 
target in user space.

Maybe not. I _really_ haven't looked into iSCSI, I'm just guessing there 
would be things like ordering issues.

Linus


Re: [PATCH 7/9] scsi_dh: Add support for SDEV_PASSIVE

2008-02-04 Thread Chandra Seetharaman
On Mon, 2008-02-04 at 14:28 -0600, James Bottomley wrote:
 On Mon, 2008-02-04 at 12:15 -0800, Chandra Seetharaman wrote:
  On Mon, 2008-02-04 at 12:58 -0600, James Bottomley wrote:
   On Wed, 2008-01-23 at 16:32 -0800, Chandra Seetharaman wrote:
Subject: scsi_dh: Add support for SDEV_PASSIVE

From: Chandra Seetharaman [EMAIL PROTECTED]

This patch adds a new device state SDEV_PASSIVE, to correspond to the
passive side access of an active/passive multipathed device.
   
   Really, no; this isn't right.  The state field of a SCSI device is for
   the SCSI state model.  Passive might be a valid device mapper state, but
  
  Hi James,
  
  It is not the device mapper state, it is the state of the device
  itself. These devices have active/passive paths, the passive paths will
  be represented by SDEV_PASSIVE device state in SCSI.
 
 Yes, it is .. you're killing commands on the basis of being in this
 state, which nothing in SCSI ever sets.
 
 A proper return from a passive path is the SCSI standard NOT_READY
 LOGICAL UNIT NOT READY, INITIALIZING COMMAND REQUIRED.  We expect to see
 this, not the command being killed.

The device does send these error messages currently, but it takes some
time to get the check condition back, which adds to the boot time,
especially when the number of LUNs is huge.

For example, in my test configuration I had 40 LUNs, and the boot time
was 171 seconds with this patch and 1426 seconds without it.

We thought we would short-circuit it so as to return the failure
faster.

Also, we only short-circuit REQ_TYPE_FS.


 
 James
 
 
-- 

--
Chandra Seetharaman   | Be careful what you choose
  - [EMAIL PROTECTED]   |  ...you may get it.
--




Re: [PATCH] [RFC] sd: make error handling more robust

2008-02-04 Thread Luben Tuikov
--- On Mon, 2/4/08, Tony Battersby [EMAIL PROTECTED] wrote:
 I _really_ _really_ hope that you don't believe that I
 am trying to take
 credit for your work. If you take another look, my original
 patch had
 the following hunk:
 
 +
 + /* Make sure that bad_lba is one of the sectors that the
 +  * command was trying to access.
 +  */
 + if (bad_lba < start_lba ||
 + bad_lba >= start_lba + xfer_size / sector_size)
 + goto out;
 +
 
 
 Your response patch had the following hunk:
 
 + if (bad_lba < start_lba)
 + goto out;
 
 
 So I don't feel that it was dishonest for me to submit
 this as my
 work. If you were offended, then I apologize.
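
For reference, the range check quoted above, combined with the VALID-bit
rule that comes up later in this thread, can be sketched in user space as
follows.  This is only an illustration under stated assumptions: it
handles fixed-format sense data only (VALID is bit 7 of byte 0, the
INFORMATION field is bytes 3..6, big-endian), and bad_lba_in_range() is a
hypothetical helper, not the sd driver's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not the sd driver's code): extract the failing LBA
 * from fixed-format sense data and accept it only if the VALID bit is set
 * and the LBA lies inside the range the command was accessing.
 */
static bool bad_lba_in_range(const uint8_t *sense, size_t len,
			     uint64_t start_lba, uint32_t xfer_size,
			     uint32_t sector_size, uint64_t *bad_lba)
{
	uint64_t lba;

	assert(sense != NULL && bad_lba != NULL);

	if (len < 7 || sector_size == 0)
		return false;
	if (!(sense[0] & 0x80))
		return false;	/* VALID clear: INFORMATION is meaningless */

	/* INFORMATION field: bytes 3..6, big-endian */
	lba = ((uint64_t)sense[3] << 24) | ((uint64_t)sense[4] << 16) |
	      ((uint64_t)sense[5] << 8) | sense[6];

	if (lba < start_lba || lba >= start_lba + xfer_size / sector_size)
		return false;	/* reported LBA is outside the transfer */

	*bad_lba = lba;
	return true;
}
```

Checking both bounds guards against firmware that sets VALID but reports
an LBA the command never touched, in either direction.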

Oh, no, of course not.  The most important thing is
if it works for you and fixes your problem and makes
your customers happy (or you if you're a customer).

  I think it would've been much clearer if you had
  singled out the problems you were seeing with your
  HW and sent a single problem with a single patch per
  single email.
 

 Agreed. Sometimes it is difficult to predict when something
 that seems
 so straightforward will generate so much controversy.

Nah, maybe a couple of misunderstandings (email tends to
do that), but it's all good.

I think it would've been so much better for everyone if
the RAID vendor had simply fixed their code to not
set VALID when INFORMATION is not valid (spec behaviour).
Since the bug lies in their code, that would've been
the proper course of action.  Instead, every other OS
which uses that RAID HW would have to adjust to this
RAID FW bug (if they haven't already).  Oh, well.

   Luben



Re: dmesg spam

2008-02-04 Thread James Bottomley

On Mon, 2008-02-04 at 15:14 -0500, Jeff Garzik wrote:
 Andrew Morton wrote:
  On Mon, 4 Feb 2008 15:24:55 +0100 Bartlomiej Zolnierkiewicz [EMAIL PROTECTED] wrote:
  
  On Sunday 03 February 2008, Andrew Morton wrote:
  With latest -mm, running fc8 I am getting this in the logs,
 ^^^
  = SCSI/libata
 
  cc:ing Jeff
 
  once per second.
 
  sr0: CDROM not ready.  Make sure there is a disc in the drive.
  sr0: CDROM not ready.  Make sure there is a disc in the drive.
  sr0: CDROM not ready.  Make sure there is a disc in the drive.
  sr0: CDROM not ready.  Make sure there is a disc in the drive.
  sr0: CDROM not ready.  Make sure there is a disc in the drive.
  sr0: CDROM not ready.  Make sure there is a disc in the drive.
  sr0: CDROM not ready.  Make sure there is a disc in the drive.
  sr0: CDROM not ready.  Make sure there is a disc in the drive.
  sr0: CDROM not ready.  Make sure there is a disc in the drive.
  
  Well..  it's coming out of the kernel.  Presumably it's that cdrom polling
  thing in KDE.  James recently made changes to sr_ioctl.c but I've been
  buried in more terminal regressions than this one.
 
 I don't see this in upstream...  can you isolate it to a particular git 
 tree?

It's here in sr_ioctl.c:

int sr_do_ioctl(Scsi_CD *cd, struct packet_command *cgc)
{
[...]
		case NOT_READY:	/* This happens if there is no disc in drive */
[...]
			if (!cgc->quiet)
				printk(KERN_INFO "%s: CDROM not ready.  Make sure there is a disc in the drive.\n", cd->cdi.name);
#ifdef DEBUG
			scsi_print_sense_hdr("sr", &sshdr);
#endif


 Clearly userland is initiating a once-per-second poll.  That is quite 
 normal for 99% of CDROMs, which do not support async notification.
 
 But also clearly that message is printk'd way too much in your case.

I'm not averse to simply nuking the printk ... it's probably valueless
in a modern kernel, since something dbussy is supposed to tell you to
put a CD in the drive, not something in the kernel.

I am however interested to see if it's a symptom of something else that
might be a bigger problem.

James




Re: [PATCH 7/9] scsi_dh: Add support for SDEV_PASSIVE

2008-02-04 Thread Chandra Seetharaman
On Mon, 2008-02-04 at 12:58 -0600, James Bottomley wrote:
 On Wed, 2008-01-23 at 16:32 -0800, Chandra Seetharaman wrote:
  Subject: scsi_dh: Add support for SDEV_PASSIVE
  
  From: Chandra Seetharaman [EMAIL PROTECTED]
  
  This patch adds a new device state SDEV_PASSIVE, to correspond to the
  passive side access of an active/passive multipathed device.
 
 Really, no; this isn't right.  The state field of a SCSI device is for
 the SCSI state model.  Passive might be a valid device mapper state, but

Hi James,

It is not the device mapper state, it is the state of the device
itself. These devices have active/passive paths, the passive paths will
be represented by SDEV_PASSIVE device state in SCSI.

chandra
 it's not a valid SCSI state.  If these patches can't work except by
 mucking with the SCSI state model, there's some layering problem
 elsewhere that needs sorting out.
 
 James
 
 
-- 

--
Chandra Seetharaman   | Be careful what you choose
  - [EMAIL PROTECTED]   |  ...you may get it.
--




Re: dmesg spam

2008-02-04 Thread James Bottomley
On Mon, 2008-02-04 at 12:05 -0800, Andrew Morton wrote:
 On Mon, 4 Feb 2008 15:24:55 +0100 Bartlomiej Zolnierkiewicz [EMAIL PROTECTED] wrote:
 
  On Sunday 03 February 2008, Andrew Morton wrote:
   
   With latest -mm, running fc8 I am getting this in the logs,
 ^^^
  = SCSI/libata
  
  cc:ing Jeff
  
   once per second.
   
   sr0: CDROM not ready.  Make sure there is a disc in the drive.
   sr0: CDROM not ready.  Make sure there is a disc in the drive.
   sr0: CDROM not ready.  Make sure there is a disc in the drive.
   sr0: CDROM not ready.  Make sure there is a disc in the drive.
   sr0: CDROM not ready.  Make sure there is a disc in the drive.
   sr0: CDROM not ready.  Make sure there is a disc in the drive.
   sr0: CDROM not ready.  Make sure there is a disc in the drive.
   sr0: CDROM not ready.  Make sure there is a disc in the drive.
   sr0: CDROM not ready.  Make sure there is a disc in the drive.
 
 Well..  it's coming out of the kernel.  Presumably it's that cdrom polling
 thing in KDE.  James recently made changes to sr_ioctl.c but I've been
 buried in more terminal regressions than this one.

You're thinking of this one?

commit 210ba1d1724f5c4ed87a2ab1a21ca861a915f734
Author: James Bottomley [EMAIL PROTECTED]
Date:   Sat Jan 5 10:39:51 2008 -0600

[SCSI] sr: update to follow tray status correctly

Based on an original patch from: David Martin [EMAIL PROTECTED]
 
You could try reversing it if you want, but I'm not certain that's the
problem (the patch only affected sr_do_status, which is a cdrom internal
thing).

The message comes from sr_ioctl.c:sr_do_ioctl().  Which means some user
level application is poking the drive with a command that's returning
NOT_READY.  Apparently it will shut up if quiet is set in the packet
command structure.

It could be that the application is getting the wrong idea of the status
from sr_do_status(), which leads it to send commands which require a
medium.  But we'll need a bit of debugging to determine this.

James




Re: [PATCH] bugfix for an underflow condition in usb storage isd200.c

2008-02-04 Thread Alan Stern
On Sun, 3 Feb 2008, Matthew Dharm wrote:

 But, the modifications to usb_stor_access_xfer_buf() look good -- no
 request from a sub-driver should be allowed to scribble into memory.  The
 current code does make the implicit assumption that there is enough
 storage, and will walk right off the end of the sg list if there isn't.
 
 I'm not sure I like the mods to usb_stor_set_xfer_buf().  Any place we set
 a status that we know is going to be thrown away is an invitation for a
 problem later if someone changes the code to preserve that status.  It's a
 jack-in-the-box, waiting to spring open in our face later.  The limit check
 (which mirrors the usb_stor_access_xfer_buf modification) and WARN_ON() are
 probably good.
 
 In a strictly technical sense, the changes to protocol.c are sufficient.
 That is, they will prevent a serious error.  There is a justification,
 though, to fix all of the users of usb_stor_access_buf() to not attempt
 to use more SCSI buffer than exists.
 
 My opinion is this:  Let's make the protocol.c mods (modulo my comments
 about setting useless status bits) now.  Then, let's decide if we're going
 to patch all the other users of the usb_stor_*_xfer_buf() functions as a
 separate discussion.

I think the correct approach is to modify those routines so that they 
will never overrun the s-g buffer (like Boaz has done), and _document_ 
this behavior.  Then the callers can feel free to try and transfer as 
much as they want, knowing that an overrun can't occur.  There won't 
be any need for a WARN_ON or anything else.

However the interface to usb_stor_access_xfer_buf() will have to change
slightly.  Right now if it sees that *sgptr is NULL, it assumes this
means it should start at the beginning of the s-g buffer.  But with 
Boaz's change, *sgptr == NULL means the transfer has reached the end of 
the buffer.  So I'll have to go through and audit all the callers.

Alan Stern
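
To make the "never overrun, and document it" idea concrete, here is a
minimal user-space sketch.  It is not the usb-storage code: struct seg,
struct sg_cursor and sg_copy_bounded() are hypothetical names, and the
scatter list is modelled as a plain array.  The cursor keeps an explicit
end-of-list state (idx == nsegs), which sidesteps the ambiguity of a NULL
pointer meaning both "start" and "end".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified scatter list: an array of (pointer, length). */
struct seg {
	unsigned char *base;
	size_t len;
};

/* Cursor into the list; idx == nsegs means the end has been reached. */
struct sg_cursor {
	size_t idx;
	size_t off;
};

/*
 * Copy up to cnt bytes from buf into the segments, starting at *cur.
 * Stops at the end of the list instead of overrunning it, and returns
 * the number of bytes actually copied.
 */
static size_t sg_copy_bounded(const unsigned char *buf, size_t cnt,
			      struct seg *segs, size_t nsegs,
			      struct sg_cursor *cur)
{
	size_t done = 0;

	assert(buf != NULL && segs != NULL && cur != NULL);

	while (done < cnt && cur->idx < nsegs) {
		struct seg *s = &segs[cur->idx];
		size_t room = s->len - cur->off;
		size_t n = cnt - done < room ? cnt - done : room;

		memcpy(s->base + cur->off, buf + done, n);
		done += n;
		cur->off += n;
		if (cur->off == s->len) {	/* segment full: advance */
			cur->idx++;
			cur->off = 0;
		}
	}
	return done;
}
```

With this contract a caller may ask for more than the list can hold and
simply compare the return value with the requested count; no WARN_ON is
needed at the call site.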



Re: Integration of SCST in the mainstream Linux kernel

2008-02-04 Thread Vladislav Bolkhovitin

James Bottomley wrote:

On Mon, 2008-02-04 at 20:56 +0300, Vladislav Bolkhovitin wrote:


James Bottomley wrote:


On Mon, 2008-02-04 at 20:16 +0300, Vladislav Bolkhovitin wrote:



James Bottomley wrote:


So, James, what is your opinion on the above? Or the overall SCSI target 
project simplicity doesn't matter much for you and you think it's fine 
to duplicate Linux page cache in the user space to keep the in-kernel 
part of the project as small as possible?



The answers were pretty much contained here

http://marc.info/?l=linux-scsi&m=120164008302435

and here:

http://marc.info/?l=linux-scsi&m=120171067107293

Weren't they?


No, sorry, it doesn't look that way to me. They are about performance, but 
I'm asking about the overall project's architecture, namely about one 
part of it: simplicity. In particular, what do you think about 
duplicating the Linux page cache in user space to have zero-copy cached 
I/O? Or can you suggest another architectural solution for that problem 
in STGT's approach?



Isn't that an advantage of a user space solution?  It simply uses the
backing store of whatever device supplies the data.  That means it takes
advantage of the existing mechanisms for caching.


No, please reread this thread, especially this message: 
http://marc.info/?l=linux-kernel&m=120169189504361&w=2. This is one of 
the advantages of the kernel space implementation. The user space 
implementation has to copy data between the cache and the user space 
buffer, but the kernel space one can use pages in the cache directly, 
without an extra copy.



Well, you've said it thrice (the bellman cried) but that doesn't make it
true.

The way a user space solution should work is to schedule mmapped I/O
from the backing store and then send this mmapped region off for target
I/O.  For reads, the page gather will ensure that the pages are up to
date from the backing store to the cache before sending the I/O out.
For writes, you actually have to do a msync on the region to get the
data secured to the backing store. 
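For reference, the write path described above (mmap the backing store, modify the pages, msync to secure the data) can be sketched roughly as follows. This is a sketch under assumptions, not STGT code: mapped_write() is a hypothetical helper, off must be a multiple of the page size, and the file must already be at least off + len bytes long.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: store len bytes at page-aligned offset off of
 * the file at path through a shared mapping, then msync() so the data
 * reaches the backing store. Returns 0 on success, -1 on failure. */
int mapped_write(const char *path, off_t off, const void *data, size_t len)
{
    int fd, rc;
    void *p;

    assert(len > 0);
    fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;
    p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, off);
    close(fd);                 /* the mapping stays valid after close */
    if (p == MAP_FAILED)
        return -1;

    memcpy(p, data, len);              /* dirty the mapped pages */
    rc = msync(p, len, MS_SYNC);       /* write failures are reported here */
    if (munmap(p, len) != 0)
        rc = -1;
    return rc;
}
```

Note that the only error reporting point in this sketch is the msync() return value; a fault while touching the mapping itself arrives as a signal instead.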


James, have you checked how fast mmaped I/O is if work size > size of 
RAM? It's several times slower compared to buffered I/O. It has been 
discussed many times on LKML and, it seems, VM people consider it unavoidable. 



Erm, but if you're using the case of work size > size of RAM, you'll
find buffered I/O won't help because you don't have the memory for
buffers either.


James, just check and you will see, buffered I/O is a lot faster.

So, using mmaped IO isn't an option for high performance. Plus, mmaped 
IO isn't an option for high reliability requirements, since it doesn't 
provide a practical way to handle I/O errors.


I think you'll find it does ... the page gather returns -EFAULT if
there's an I/O error in the gathered region. 


Err, returned to whom? If you try to read from a mmaped page which can't 
be populated due to an I/O error, you will get SIGBUS or SIGSEGV, I don't 
remember exactly. It's quite tricky to get back to the faulted command 
from the signal handler.
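A rough sketch of what recovering from such a fault involves, to show why it is awkward. guarded_copy() and map_file() are hypothetical helpers; the sketch assumes a single-threaded process and jumps out of the SIGBUS handler with siglongjmp(), which is fragile in real code (the jump buffer is global, and nothing in the handler identifies which command faulted).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <setjmp.h>
#include <signal.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf fault_env;   /* global: one guarded access at a time */

static void on_sigbus(int sig)
{
    (void)sig;
    siglongjmp(fault_env, 1);  /* unwind back into guarded_copy() */
}

/* Hypothetical helper: map the first len bytes of path read-only. */
void *map_file(const char *path, size_t len)
{
    int fd = open(path, O_RDONLY);
    void *p;

    assert(len > 0);
    if (fd < 0)
        return NULL;
    p = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);
    return p == MAP_FAILED ? NULL : p;
}

/* Copy len bytes out of a mapping. Returns 0 on success, or -1 if
 * touching the mapping raised SIGBUS. Not thread-safe. */
int guarded_copy(void *dst, const void *mapped, size_t len)
{
    struct sigaction sa, old;
    int rc = 0;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigbus;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGBUS, &sa, &old) != 0)
        return -1;

    if (sigsetjmp(fault_env, 1) == 0) {
        /* volatile so that the loads from the mapping are really issued */
        const volatile unsigned char *src = mapped;
        unsigned char *out = dst;

        for (size_t i = 0; i < len; i++)
            out[i] = src[i];
    } else {
        rc = -1;               /* arrived here via the signal handler */
    }
    sigaction(SIGBUS, &old, NULL);
    return rc;
}
```

Even in this small form the caller only learns "something in this region faulted"; mapping that back to a particular in-flight command needs extra bookkeeping that the sketch does not attempt.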


Or do you mean mmap(MAP_POPULATE)/munmap() for each command? Do you 
think that such mapping/unmapping is good for performance?



msync does something
similar if there's a write failure.


You also have to pull tricks with
the mmap region in the case of writes to prevent useless data being read
in from the backing store.


Can you be more exact and specify what kind of tricks should be done for 
that?


Actually, just avoiding touching it seems to do the trick with a recent
kernel.


Hmm, how can one write to an mmaped page without touching it?


James







Re: [PATCH] [SCSI] sd: make error handling more robust (v2)

2008-02-04 Thread Luben Tuikov
--- On Mon, 2/4/08, James Bottomley [EMAIL PROTECTED] wrote:
 On Mon, 2008-02-04 at 01:11 -0800, Luben Tuikov wrote:
  Looks good except that End LBA is usually defined
  to be something of the sort of the LBA of the last
  logical block accessed by the command or the LBA
  of the logical block on which the command failed.
  
  A spec savvy editor of this code would be
  pleasantly surprised if they had to use end_lba,
  and didn't pay attention that it was actually
  End LBA + 1.
 
 Heh, well, that's where spec people and programmers
 part company.  The
 universal expectation of a programmer in looping is
 
 for (a = beginning; a < end; a++)
 
 rather than <= if end were actually to point to last
 rather than last + 1.

For loop invariants that's true, although I didn't see
a loop in sd_done().
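The two conventions under discussion can be put side by side in a small sketch. The function names are hypothetical and nothing here is taken from sd_done(); it only shows that an exclusive end (last + 1) pairs with `<`, while the spec-style inclusive End LBA pairs with `<=`.

```c
#include <assert.h>
#include <stdint.h>

/* Count blocks when end is exclusive, i.e. one past the last block. */
uint64_t blocks_exclusive(uint64_t start, uint64_t end)
{
    uint64_t n = 0;

    assert(start <= end);
    for (uint64_t lba = start; lba < end; lba++)
        n++;
    return n;
}

/* Count blocks when last is the spec-style inclusive End LBA. */
uint64_t blocks_inclusive(uint64_t start, uint64_t last)
{
    uint64_t n = 0;

    /* with last == UINT64_MAX the <= loop below would never terminate */
    assert(start <= last && last < UINT64_MAX);
    for (uint64_t lba = start; lba <= last; lba++)
        n++;
    return n;
}
```

Both describe the same range when end == last + 1; mixing them up gives the off-by-one that the quoted review comment warns about.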

   Luben



Re: Integration of SCST in the mainstream Linux kernel

2008-02-04 Thread Nicholas A. Bellinger
On Mon, 2008-02-04 at 17:00 -0600, James Bottomley wrote:
 On Mon, 2008-02-04 at 22:43 +0000, Alan Cox wrote:
   better. So for example, I personally suspect that ATA-over-ethernet is way
   better than some crazy SCSI-over-TCP crap, but I'm biased for simple and 
   low-level, and against those crazy SCSI people to begin with.
  
  Current ATAoE isn't. It can't support NCQ. A variant that did NCQ and IP
  would probably trash iSCSI for latency if nothing else.
 
 Actually, there's also FCoE now ... which is essentially SCSI
 encapsulated in Fibre Channel Protocols (FCP) running over ethernet with
 Jumbo frames.  It does the standard SCSI TCQ, so should answer all the
 latency pieces.  Intel even has an implementation:
 
 http://www.open-fcoe.org/
 
 I tend to prefer the low levels as well.  The whole disadvantage for IP
 as regards iSCSI was the layers of protocols on top of it for
 addressing, authenticating, encrypting and finding any iSCSI device
 anywhere in the connected universe.

Btw, while simple in-band discovery of iSCSI exists, the standards based
IP storage deployments (iSCSI and iFCP) use iSNS (RFC-4171) for
discovery and network fabric management, for things like sending state
change notifications when a particular network portal is going away so
that the initiator can bring up a different communication path to a
different network portal, etc.

 
 I tend to see loss of routing from operating at the MAC level to be a
 nicely justifiable tradeoff (most storage networks tend to be hubbed or
 switched anyway).  Plus an ethernet MAC with jumbo frames is a large
 framed nearly lossless medium, which is practically what FCP is
 expecting.  If you really have to connect large remote sites ... well
 that's what tunnelling bridges are for.
 

Some of the points by Julo on the IPS TWG iSCSI vs. FCoE thread:

  * the network is limited in physical span and logical span (number
of switches)
  * flow-control/congestion control is achieved with a mechanism
adequate for a limited span network (credits). The packet loss
rate is almost nil and that allows FCP to avoid using a
transport (end-to-end) layer
  * FCP switches are simple (addresses are local and the memory
requirements can be limited through the credit mechanism)
  * The credit mechanism is highly unstable for large networks
(check switch vendors planning docs for the network diameter
limits) – the scaling argument
  * Ethernet has no credit mechanism and any mechanism with a
similar effect increases the end point cost. Building a
transport layer in the protocol stack has always been the
preferred choice of the networking community – the community
argument
  * The performance penalty of a complete protocol stack has
always been overstated (and overrated). Advances in protocol
stack implementation and finer tuning of the congestion control
mechanisms make conventional TCP/IP perform well even at 10
Gb/s and over. Moreover the multicore processors that become
dominant on the computing scene have enough compute cycles
available to make any offloading possible as a mere code
restructuring exercise (see the stack reports from Intel, IBM
etc.)
  * Building on a complete stack makes available a wealth of
operational and management mechanisms built over the years by
the networking community (routing, provisioning, security,
service location etc.) – the community argument
  * Higher level storage access over an IP network is widely
available and having both block and file served over the same
connection with the same support and management structure is
compelling – the community argument
  * Highly efficient networks are easy to build over IP with optimal
(shortest path) routing while Layer 2 networks use bridging and
are limited by the logical tree structure that bridges must
follow. The effort to combine routers and bridges (rbridges) is
promising to change that but it will take some time to finalize
(and we don't know exactly how it will operate). Until then the
scale of Layer 2 networks is going to be seriously limited – the
scaling argument

Perhaps it would be worthwhile to get some more linux-net guys in on the
discussion.  :-)

--nab


 James
 
 
 --
 To unsubscribe from this list: send the line unsubscribe linux-kernel in
 the body of a message to [EMAIL PROTECTED]
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
 Please read the FAQ at  http://www.tux.org/lkml/
 



Re: Integration of SCST in the mainstream Linux kernel

2008-02-04 Thread Jeff Garzik

Alan Cox wrote:
better. So for example, I personally suspect that ATA-over-ethernet is way 
better than some crazy SCSI-over-TCP crap, but I'm biased for simple and 
low-level, and against those crazy SCSI people to begin with.


Current ATAoE isn't. It can't support NCQ. A variant that did NCQ and IP
would probably trash iSCSI for latency if nothing else.


AoE is truly a thing of beauty.  It has a two/three page RFC (say no more!).

But quite so...  AoE is limited to MTU size, which really hurts.  Can't 
really do tagged queueing, etc.



iSCSI is way, way too complicated.  It's an Internet protocol designed 
by storage designers, what do you expect?


For years I have been hoping that someone will invent a simple protocol 
(w/ strong auth) that can transit ATA and SCSI commands and responses. 
Heck, it would be almost trivial if the kernel had a TLS/SSL implementation.


Jeff





Re: Integration of SCST in the mainstream Linux kernel

2008-02-04 Thread James Bottomley

On Mon, 2008-02-04 at 22:43 +0000, Alan Cox wrote:
  better. So for example, I personally suspect that ATA-over-ethernet is way 
  better than some crazy SCSI-over-TCP crap, but I'm biased for simple and 
  low-level, and against those crazy SCSI people to begin with.
 
 Current ATAoE isn't. It can't support NCQ. A variant that did NCQ and IP
 would probably trash iSCSI for latency if nothing else.

Actually, there's also FCoE now ... which is essentially SCSI
encapsulated in Fibre Channel Protocols (FCP) running over ethernet with
Jumbo frames.  It does the standard SCSI TCQ, so should answer all the
latency pieces.  Intel even has an implementation:

http://www.open-fcoe.org/

I tend to prefer the low levels as well.  The whole disadvantage for IP
as regards iSCSI was the layers of protocols on top of it for
addressing, authenticating, encrypting and finding any iSCSI device
anywhere in the connected universe.

I tend to see loss of routing from operating at the MAC level to be a
nicely justifiable tradeoff (most storage networks tend to be hubbed or
switched anyway).  Plus an ethernet MAC with jumbo frames is a large
framed nearly lossless medium, which is practically what FCP is
expecting.  If you really have to connect large remote sites ... well
that's what tunnelling bridges are for.

James




Re: (fwd) Bug#11922: I/O error on blank tapes

2008-02-04 Thread James Bottomley

On Mon, 2008-02-04 at 22:28 +0100, Borislav Petkov wrote:
 On Mon, Feb 04, 2008 at 03:22:06PM +0100, maximilian attems wrote:
 
 (Added Bart to CC)
 
  hello borislav,
  
  may i forward you that *old* Debian kernel bug,
  have seen you working on ide-tape:
  http://bugs.debian.org/11922
  no we don't carry any ide patches anymore.
  
  maybe you've already fixed it in latest?
  
  thanks
  
  -- 
  maks
  
  - Forwarded message from Stephen Kitt [EMAIL PROTECTED] -
  
  Subject: Bug#11922: I/O error on blank tapes
  Date: Sat, 1 Dec 2007 19:06:18 +0100
  From: Stephen Kitt [EMAIL PROTECTED]
  To: [EMAIL PROTECTED]
  
  Hi,
  
  This does still occur with 2.6.22; with a blank tape in my HP DDS-4 drive:
  
  $ tar tzvf /dev/nst0
  tar: /dev/nst0: Cannot read: Input/output error

That's a SCSI tape, not an IDE one.  I cc'd the SCSI list

James

  tar: At beginning of tape, quitting now
  tar: Error is not recoverable: exiting now
  
  gzip: stdin: unexpected end of file
  tar: Child returned status 2
  tar: Error exit delayed from previous errors
  
  Nothing gets logged anywhere, which fits the original bug description.
  
  This is a well-known issue: see for example
  http://www.sibbald.com/bacula/html-manual/Bacula_Console.html (search for
  blank tape).
  
  Regards,
  
  Stephen
  
  
  
  -- 
  To UNSUBSCRIBE, email to [EMAIL PROTECTED]
  with a subject of unsubscribe. Trouble? Contact [EMAIL PROTECTED]
  
  
  - End forwarded message -
 
 Hi Maks,
 
 we're currently in the process of aggressively cleaning up ide-tape. However,
 this brings (almost) no functional changes to the driver and we haven't looked
 at any bugs that might exist. Actually, I wanted to probe the community to see
 whether anyone is using ide-tape at all, and if not, to remove it completely.
 
 Since i don't have the hardware, i'm gonna have to ask you (or Stephen) to wait
 until all changes have entered mainline and then to try to reproduce the bug again
 after having enabled debugging (IDETAPE_DEBUG_LOG) and send me the syslog
 output.
 
 Thanks.
 



Re: dmesg spam

2008-02-04 Thread Jeff Garzik

Andrew Morton wrote:

On Mon, 4 Feb 2008 15:24:55 +0100 Bartlomiej Zolnierkiewicz [EMAIL PROTECTED] 
wrote:


On Sunday 03 February 2008, Andrew Morton wrote:

With latest -mm, running fc8 I am getting this in the logs,

   ^^^
= SCSI/libata

cc:ing Jeff


once per second.

sr0: CDROM not ready.  Make sure there is a disc in the drive.
sr0: CDROM not ready.  Make sure there is a disc in the drive.
sr0: CDROM not ready.  Make sure there is a disc in the drive.
sr0: CDROM not ready.  Make sure there is a disc in the drive.
sr0: CDROM not ready.  Make sure there is a disc in the drive.
sr0: CDROM not ready.  Make sure there is a disc in the drive.
sr0: CDROM not ready.  Make sure there is a disc in the drive.
sr0: CDROM not ready.  Make sure there is a disc in the drive.
sr0: CDROM not ready.  Make sure there is a disc in the drive.


Well..  it's coming out of the kernel.  Presumably it's that cdrom polling
thing in KDE.  James recently made changes to sr_ioctl.c but I've been
buried in more terminal regressions than this one.


I don't see this in upstream...  can you isolate it to a particular git 
tree?


Clearly userland is initiating a once-per-second poll.  That is quite 
normal for 99% of CDROMs, which do not support async notification.


But also clearly that message is printk'd way too much in your case.

Jeff






Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel

2008-02-04 Thread 4news
On lunedì 4 febbraio 2008, Linus Torvalds wrote:
 So from a purely personal standpoint, I'd like to say that I'm not really
 interested in iSCSI (and I don't quite know why I've been cc'd on this
 whole discussion) and think that other approaches are potentially *much*
 better. So for example, I personally suspect that ATA-over-ethernet is way
 better than some crazy SCSI-over-TCP crap, but I'm biased for simple and
 low-level, and against those crazy SCSI people to begin with.

surely aoe is better than iscsi, at least on performance, because of the smaller 
protocol stack:
iscsi -  scsi - ip - eth
aoe - ata - eth

but surely iscsi is more of a standard than aoe and is more actively used in 
the real world.

Other really useful features are:
- iscsi can move scsi devices onto an ip based san by routing them (i have 
some tape changers routed by scst to some systems that don't have any other 
way to see a tape).
- because it works at the ip layer it can be routed over long distances, so 
given the needed bandwidth you can have a really remote block device speaking a 
standard protocol between heterogeneous systems.
- iscsi is now the cheapest san available.

bye,
marco.



Re: dmesg spam

2008-02-04 Thread Andrew Morton
On Mon, 4 Feb 2008 15:24:55 +0100 Bartlomiej Zolnierkiewicz [EMAIL PROTECTED] 
wrote:

 On Sunday 03 February 2008, Andrew Morton wrote:
  
  With latest -mm, running fc8 I am getting this in the logs,
^^^
 = SCSI/libata
 
 cc:ing Jeff
 
  once per second.
  
  sr0: CDROM not ready.  Make sure there is a disc in the drive.
  sr0: CDROM not ready.  Make sure there is a disc in the drive.
  sr0: CDROM not ready.  Make sure there is a disc in the drive.
  sr0: CDROM not ready.  Make sure there is a disc in the drive.
  sr0: CDROM not ready.  Make sure there is a disc in the drive.
  sr0: CDROM not ready.  Make sure there is a disc in the drive.
  sr0: CDROM not ready.  Make sure there is a disc in the drive.
  sr0: CDROM not ready.  Make sure there is a disc in the drive.
  sr0: CDROM not ready.  Make sure there is a disc in the drive.

Well..  it's coming out of the kernel.  Presumably it's that cdrom polling
thing in KDE.  James recently made changes to sr_ioctl.c but I've been
buried in more terminal regressions than this one.


Re: [PATCH 7/9] scsi_dh: Add support for SDEV_PASSIVE

2008-02-04 Thread Mike Anderson
James Bottomley [EMAIL PROTECTED] wrote:
 
 On Wed, 2008-01-23 at 16:32 -0800, Chandra Seetharaman wrote:
  Subject: scsi_dh: Add support for SDEV_PASSIVE
  
  From: Chandra Seetharaman [EMAIL PROTECTED]
  
  This patch adds a new device state SDEV_PASSIVE, to correspond to the
  passive side access of an active/passive multipathed device.
 
 Really, no; this isn't right.  The state field of a SCSI device is for
 the SCSI state model.  Passive might be a valid device mapper state, but
 it's not a valid SCSI state.  If these patches can't work except by
 mucking with the SCSI state model, there's some layering problem
 elsewhere that needs sorting out.
 

It is actually a valid state for this device and a number of other
devices that have passive / active controllers. There are differences in
response capability (i.e., media access commands) on certain sds until a
fail over command is given. The response behavior difference along with
all the partition scanning and other commands that get generated during
the probing of a device are what leads to the long boot times previously
mentioned by Chandra.

Since we have created a policy to remove the vendor specific multipath
drivers that handled the aggregation of the paths into a single device we
need some method to handle devices that are not fully capable, but are
still exposed to the upper layers.

The patches are also addressing a long standing issue of sense data
processing, but that is not related to the SDEV_* state comment.

-andmike
--
Michael Anderson
[EMAIL PROTECTED]


Re: dmesg spam

2008-02-04 Thread Andrew Morton
On Mon, 04 Feb 2008 15:21:54 -0500
Jeff Garzik [EMAIL PROTECTED] wrote:

 James Bottomley wrote:
  The message comes from sr_ioctl.c:sr_do_ioctl().  Which means some user
  level application is poking the drive with a command that's returning
  NOT_READY.  Apparently it will shut up if quiet is set in the packet
  command structure.
  
  It could be the application is getting the wrong idea of the status from
  sr_do_staus() which leads it to send commands which require a medium?
  But we'll need a bit of debugging to determine this.
 
 
 Userland polling of the cdrom is quite normal (if unfortunate), 
 regardless of medium presence.  Probably HAL or dbus.
 
 In theory, the userland app should (a) set quiet and (b) handle 
 not-ready condition just fine.
 
 I presume that (b) is ok, since not-ready just means to continue polling 
 the cdrom ad infinitum, until media appears.
 
 A useful experiment, if only to confirm the obvious, would be to insert 
 some media.
 
 What controller and device is in use?
 

It's the thinkpad t61p.  Currently five miles away, powered off.  It's all
new Intel stuff iirc.

http://userweb.kernel.org/~akpm/dmesg-t61p.txt has some info but not the
right info afaict.

Bisection time I guess.  That'll be a new experience.


Re: dmesg spam

2008-02-04 Thread James Bottomley

On Mon, 2008-02-04 at 15:24 -0500, Jeff Garzik wrote:
 James Bottomley wrote:
  It's here in sr_ioctl.c:
 
 Ah, indeed.  My grep-fu sucks today.
 
 
  I'm not averse to simply nuking the printk ... it's probably valueless
  in a modern kernel, since something dbussy is supposed to tell you to
  put a CD in the drive, not something in the kernel.
 
 The reverse...  dbussy/HAL is implementing autodetection of media 
 insertion, by polling ad infinitum.

Understood ... I meant the day of the user relying on a message from a
kernel printk to tell them they need a CD in the drive is long over.

James




Re: Integration of SCST in the mainstream Linux kernel

2008-02-04 Thread Douglas Gilbert

Alan Cox wrote:
better. So for example, I personally suspect that ATA-over-ethernet is way 
better than some crazy SCSI-over-TCP crap, but I'm biased for simple and 
low-level, and against those crazy SCSI people to begin with.


Current ATAoE isn't. It can't support NCQ. A variant that did NCQ and IP
would probably trash iSCSI for latency if nothing else.


And a variant that doesn't do ATA or IP:
http://www.fcoe.com/

