Re: [GIT PATCH] scsi bug fixes for 2.6.23-rc2

2007-08-07 Thread Andrew Morton
On Mon, 06 Aug 2007 22:55:41 -0500 James Bottomley [EMAIL PROTECTED] wrote:

 The real root cause of all of this is that there's no tree I can
 persuade all the interested parties to test that includes all of these
 features.  In spite of the fact they've all been incubating in -mm for
 at least 3 months, no-one apparently tested all the features together
 until 2.6.23-rc1 was released, so then we're scrambling to address the
 issues as they arise.

I pulled git-scsi-misc on July 19 and there was no bsg code in there at
all.  I pulled again on July 20 and all the bsg code was in mainline.  So
it appears that the bsg code went mailing-list -> mainline in less than 24
hours, so there wasn't a lot of opportunity for -mm testing there.

A lot of the stupid it-doesn't-compile stuff would have been fixed in -mm,
but more substantial problems might not have been picked up.  But one can
say that about anything.

 I really, *really* think we need a pre-release tree that consists of all
 the upstream targeted features (i.e. all of the for the next merge
 window git trees) and nothing else.

That *is* -mm.  The vast majority of -mm is the 75-odd subsystem trees. 
What you're suggesting amounts to omitting some of those trees for test
purposes (I think).  If so, which ones?

Now it could be argued that subsystem maintainers should run two trees in
the last 2.6.x-rcN phase: one tree for 2.6.x+1 and one tree for 2.6.x+2. 
Then someone could pull all that together as the Linus tree in a month,
minus insufficiently baked stuff tree.  But frankly, I don't expect that
people will want to do that, nor will they be able to do it reliably.

Plus, an *amazing* amount of stuff turns up in the git trees which was
committed just a few days prior to the merge window opening, or even after
it opening.  eg, bsg which was, afaict, first committed to the scsi tree
eleven days after the 2.6.22 release.

  -mm doesn't really satisfy this,
 because it has so much other stuff that the people I need to get testing
 this don't trust it.

Right.  75-odd developers need to stop committing bugs to their devel
trees.  Interesting project ;)

  The lack of a tree like this that we could have
 persuaded people to test for the last month is what's causing us to
 scramble like this at the closure of the merge window.

Nope.  The scramble is caused by subsystem maintainers jamming stuff into
mainline at the last minute so they don't have to sit on it for the next
two months.

Look.  If we're serious about this then the rule needs to be something like

  If it wasn't committed to your tree *at least* two weeks prior to the
  2.6.x merge window opening, it shouldn't go into 2.6.x.

People are not presently observing this sort of discipline by a metric
mile.  And I'm not sure that we should, really.


I don't think it's terribly bad to whack half-baked things (bsg ;)) into
mainline during the merge window, as long as a) we're sure that we want the
feature in Linux and b) we're confident that we can get it fixed up within
a couple of months.  Two months is a long time.

But that's just me, and it is not the approach which Linus wants taken.

-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHSET 0/5] Peaceful co-existence of scsi_sgtable and Large IO sg-chaining

2007-08-07 Thread Jens Axboe
On Mon, Aug 06 2007, FUJITA Tomonori wrote:
 On Tue, 31 Jul 2007 23:12:26 +0300
 Boaz Harrosh [EMAIL PROTECTED] wrote:
 
  The tested Kernels:
  
  1. Jens's sglist-arch
I was not able to pass all tests with this Kernel. For some reason, when
commands bigger than 256 pages are queued, the machine runs out
of memory and kills the test. After the test is killed the system
is left with 10M of memory and can hardly reboot.
I have done some prints at the queuecommand entry in scsi_debug.c
and I can see that I receive the expected large sg_count and bufflen,
but unlike other tests I get a different pointer at scsi_sglist().
In other tests, since nothing else is happening on this machine during
the test, the sglist pointer is always the same: a command comes in,
allocates memory, does nothing in scsi_debug, is freed, and returns.
I suspect an sglist leak or allocation bug.
 
 Ok, I found the leak.
 
 
 From 011c05c2e514d1db4834147ed83526473711b0a3 Mon Sep 17 00:00:00 2001
 From: FUJITA Tomonori [EMAIL PROTECTED]
 Date: Mon, 6 Aug 2007 16:16:24 +0900
 Subject: [PATCH] fix sg chaining leak
 
 Signed-off-by: FUJITA Tomonori [EMAIL PROTECTED]
 ---
  drivers/scsi/scsi_lib.c |1 -
  1 files changed, 0 insertions(+), 1 deletions(-)
 
 diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
 index 5884b1b..25988b9 100644
 --- a/drivers/scsi/scsi_lib.c
 +++ b/drivers/scsi/scsi_lib.c
 @@ -48,7 +48,6 @@ static struct scsi_host_sg_pool scsi_sg_pools[] = {
   SP(32),
   SP(64),
   SP(128),
 - SP(256),
  };
  #undef SP

Thanks Tomo! Trying to catch up with mails, will apply this one right
away.

-- 
Jens Axboe



Re: An MCA ESP driver

2007-08-07 Thread David Miller
From: Matthew Wilcox [EMAIL PROTECTED]
Date: Mon, 6 Aug 2007 17:24:58 -0600

 @@ -514,11 +514,14 @@ struct esp {
  
   struct completion   *eh_reset;
  
 - struct sbus_dma *dma;
 + union {
 + struct sbus_dma *sbus_dma;
 + unsigned int x86_dma;
 + };
  };

Feel free to make this a void *dma_cookie or similar.
It's private to the bus front-end.
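
A minimal sketch of the opaque-cookie idea suggested above, assuming a trimmed-down stand-in for struct esp. The names esp_sketch, sbus_dma_sketch, mca_dma_sketch and the helper functions are hypothetical and are not taken from the driver; only the notion of a front-end-private void * field comes from the message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct esp; only the field under discussion. */
struct esp_sketch {
	void *dma_cookie;	/* owned and interpreted by the bus front-end only */
};

/* Hypothetical per-bus private state. */
struct sbus_dma_sketch {
	int regs_base;
};

struct mca_dma_sketch {
	unsigned int channel;
};

/* Each front-end stores its own state behind the same opaque pointer. */
void sbus_attach(struct esp_sketch *esp, struct sbus_dma_sketch *dma)
{
	esp->dma_cookie = dma;
}

void mca_attach(struct esp_sketch *esp, struct mca_dma_sketch *dma)
{
	esp->dma_cookie = dma;
}

/* Only the MCA front-end knows that the cookie is a struct mca_dma_sketch. */
unsigned int mca_dma_channel(const struct esp_sketch *esp)
{
	const struct mca_dma_sketch *dma = esp->dma_cookie;

	assert(dma != NULL);
	return dma->channel;
}
```

The core structure then needs no union that grows with every new bus type; the cost is that the compiler no longer checks which front-end type sits behind the pointer.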


[PATCH] fc4: convert to use the data buffer accessors

2007-08-07 Thread FUJITA Tomonori
- remove the unnecessary map_single path.

- convert to use the new accessors for the sg lists and the
parameters.

Signed-off-by: FUJITA Tomonori [EMAIL PROTECTED]
---
 drivers/fc4/fc.c |   41 +++++++++++++++--------------------------
 1 files changed, 15 insertions(+), 26 deletions(-)

diff --git a/drivers/fc4/fc.c b/drivers/fc4/fc.c
index 22b62b3..82de9e1 100644
--- a/drivers/fc4/fc.c
+++ b/drivers/fc4/fc.c
@@ -427,15 +427,10 @@ static inline void fcp_scsi_receive(fc_channel *fc, int token, int status, fc_hd
 				memcpy(SCpnt->sense_buffer, ((char *)(rsp+1)), sense_len);
 			}
 
-			if (fcmd->data) {
-				if (SCpnt->use_sg)
-					dma_unmap_sg(fc->dev, (struct scatterlist *)SCpnt->request_buffer,
-							SCpnt->use_sg,
-							SCpnt->sc_data_direction);
-				else
-					dma_unmap_single(fc->dev, fcmd->data, SCpnt->request_bufflen,
-							 SCpnt->sc_data_direction);
-			}
+			if (fcmd->data)
+				dma_unmap_sg(fc->dev, scsi_sglist(SCpnt),
+					     scsi_sg_count(SCpnt),
+					     SCpnt->sc_data_direction);
 			break;
 		default:
 			host_status=DID_ERROR; /* FIXME */
@@ -793,10 +788,14 @@ static int fcp_scsi_queue_it(fc_channel *fc, struct scsi_cmnd *SCpnt,
 			fcp_cntl = FCP_CNTL_QTYPE_SIMPLE;
 		} else
 			fcp_cntl = FCP_CNTL_QTYPE_UNTAGGED;
-		if (!SCpnt->request_bufflen && !SCpnt->use_sg) {
+
+		if (!scsi_bufflen(SCpnt)) {
 			cmd->fcp_cntl = fcp_cntl;
 			fcmd->data = (dma_addr_t)NULL;
 		} else {
+			struct scatterlist *sg;
+			int nents;
+
 			switch (SCpnt->cmnd[0]) {
 			case WRITE_6:
 			case WRITE_10:
@@ -805,22 +804,12 @@ static int fcp_scsi_queue_it(fc_channel *fc, struct scsi_cmnd *SCpnt,
 			default:
 				cmd->fcp_cntl = (FCP_CNTL_READ | fcp_cntl); break;
 			}
-			if (!SCpnt->use_sg) {
-				cmd->fcp_data_len = SCpnt->request_bufflen;
-				fcmd->data = dma_map_single (fc->dev, (char *)SCpnt->request_buffer,
-							     SCpnt->request_bufflen,
-							     SCpnt->sc_data_direction);
-			} else {
-				struct scatterlist *sg = (struct scatterlist *)SCpnt->request_buffer;
-				int nents;
-
-				FCD(("XXX: Use_sg %d %d\n", SCpnt->use_sg, sg->length))
-				nents = dma_map_sg (fc->dev, sg, SCpnt->use_sg,
-						    SCpnt->sc_data_direction);
-				if (nents > 1) printk ("%s: SG for nents %d (use_sg %d) not handled yet\n", fc->name, nents, SCpnt->use_sg);
-				fcmd->data = sg_dma_address(sg);
-				cmd->fcp_data_len = sg_dma_len(sg);
-			}
+
+			sg = scsi_sglist(SCpnt);
+			nents = dma_map_sg(fc->dev, sg, scsi_sg_count(SCpnt),
+					   SCpnt->sc_data_direction);
+			fcmd->data = sg_dma_address(sg);
+			cmd->fcp_data_len = sg_dma_len(sg);
 		}
 		memcpy (cmd->fcp_cdb, SCpnt->cmnd, SCpnt->cmd_len);
 		memset (cmd->fcp_cdb+SCpnt->cmd_len, 0, sizeof(cmd->fcp_cdb)-SCpnt->cmd_len);
-- 
1.5.2.4
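
The accessor conversion in the patch above can be illustrated with a small userspace sketch. Everything here is an assumption made for illustration: sg_sketch, cmd_sketch and the cmd_*() helpers are hypothetical stand-ins written in the spirit of scsi_sglist(), scsi_sg_count() and scsi_bufflen(), not the kernel definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real structures live in the SCSI mid layer. */
struct sg_sketch {
	unsigned int length;
};

struct cmd_sketch {
	struct sg_sketch *sgl;		/* scatter-gather list */
	unsigned int sg_count;		/* number of entries in sgl */
	unsigned int bufflen;		/* total transfer length in bytes */
};

/* Accessors: the only place that knows how the command stores its buffer. */
struct sg_sketch *cmd_sglist(struct cmd_sketch *c)
{
	return c->sgl;
}

unsigned int cmd_sg_count(struct cmd_sketch *c)
{
	return c->sg_count;
}

unsigned int cmd_bufflen(struct cmd_sketch *c)
{
	return c->bufflen;
}

/*
 * Driver-side check written only against the accessors:
 * returns 1 when the command carries data, 0 when it does not.
 */
int cmd_has_data(struct cmd_sketch *c)
{
	assert(c != NULL);
	return cmd_bufflen(c) != 0;
}
```

Because the driver never touches the fields directly, the layout of the command structure can change later without another pass over every driver.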



Re: [PATCHSET 0/5] Peaceful co-existence of scsi_sgtable and Large IO sg-chaining

2007-08-07 Thread FUJITA Tomonori
On Tue, 7 Aug 2007 08:55:49 +0200
Jens Axboe [EMAIL PROTECTED] wrote:

 On Mon, Aug 06 2007, FUJITA Tomonori wrote:
  On Tue, 31 Jul 2007 23:12:26 +0300
  Boaz Harrosh [EMAIL PROTECTED] wrote:
  
   The tested Kernels:
   
   1. Jens's sglist-arch
 I was not able to pass all tests with this Kernel. For some reason, when
 commands bigger than 256 pages are queued, the machine runs out
 of memory and kills the test. After the test is killed the system
 is left with 10M of memory and can hardly reboot.
 I have done some prints at the queuecommand entry in scsi_debug.c
 and I can see that I receive the expected large sg_count and bufflen,
 but unlike other tests I get a different pointer at scsi_sglist().
 In other tests, since nothing else is happening on this machine during
 the test, the sglist pointer is always the same: a command comes in,
 allocates memory, does nothing in scsi_debug, is freed, and returns.
 I suspect an sglist leak or allocation bug.
  
  Ok, I found the leak.
  
  
  From 011c05c2e514d1db4834147ed83526473711b0a3 Mon Sep 17 00:00:00 2001
  From: FUJITA Tomonori [EMAIL PROTECTED]
  Date: Mon, 6 Aug 2007 16:16:24 +0900
  Subject: [PATCH] fix sg chaining leak
  
  Signed-off-by: FUJITA Tomonori [EMAIL PROTECTED]
  ---
   drivers/scsi/scsi_lib.c |1 -
   1 files changed, 0 insertions(+), 1 deletions(-)
  
  diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
  index 5884b1b..25988b9 100644
  --- a/drivers/scsi/scsi_lib.c
  +++ b/drivers/scsi/scsi_lib.c
  @@ -48,7 +48,6 @@ static struct scsi_host_sg_pool scsi_sg_pools[] = {
  SP(32),
  SP(64),
  SP(128),
  -   SP(256),
   };
   #undef SP
 
 Thanks Tomo! Trying to catch up with mails, will apply this one right
 away.

You can add the following patch to your sglist branches:


From abd73c05d5f08ee307776150e1deecac7a709b60 Mon Sep 17 00:00:00 2001
From: FUJITA Tomonori [EMAIL PROTECTED]
Date: Mon, 30 Jul 2007 23:01:32 +0900
Subject: [PATCH] zfcp: sg chaining support

Signed-off-by: FUJITA Tomonori [EMAIL PROTECTED]
---
 drivers/s390/scsi/zfcp_def.h  |    1 +
 drivers/s390/scsi/zfcp_qdio.c |    6 ++----
 2 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/s390/scsi/zfcp_def.h b/drivers/s390/scsi/zfcp_def.h
index b36dfc4..0d80150 100644
--- a/drivers/s390/scsi/zfcp_def.h
+++ b/drivers/s390/scsi/zfcp_def.h
@@ -34,6 +34,7 @@
 #include <linux/slab.h>
 #include <linux/mempool.h>
 #include <linux/syscalls.h>
+#include <linux/scatterlist.h>
 #include <linux/ioctl.h>
 #include <scsi/scsi.h>
 #include <scsi/scsi_tcq.h>
diff --git a/drivers/s390/scsi/zfcp_qdio.c b/drivers/s390/scsi/zfcp_qdio.c
index 81daa82..60bc269 100644
--- a/drivers/s390/scsi/zfcp_qdio.c
+++ b/drivers/s390/scsi/zfcp_qdio.c
@@ -591,7 +591,7 @@ zfcp_qdio_sbals_from_segment(struct zfcp_fsf_req *fsf_req, unsigned long sbtype,
  */
 int
 zfcp_qdio_sbals_from_sg(struct zfcp_fsf_req *fsf_req, unsigned long sbtype,
-			struct scatterlist *sg, int sg_count, int max_sbals)
+			struct scatterlist *sgl, int sg_count, int max_sbals)
 {
 	int sg_index;
 	struct scatterlist *sg_segment;
@@ -607,9 +607,7 @@ zfcp_qdio_sbals_from_sg(struct zfcp_fsf_req *fsf_req, unsigned long sbtype,
 	sbale->flags |= sbtype;
 
 	/* process all segements of scatter-gather list */
-	for (sg_index = 0, sg_segment = sg, bytes = 0;
-	     sg_index < sg_count;
-	     sg_index++, sg_segment++) {
+	for_each_sg(sgl, sg_segment, sg_count, sg_index) {
 		retval = zfcp_qdio_sbals_from_segment(
 				fsf_req,
 				sbtype,
-- 
1.5.2.4
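
Why the patch above replaces the open-coded sg_segment++ loop with an iterator can be sketched in userspace. This is an illustration under stated assumptions, not the kernel implementation: struct seg, seg_next() and total_length() are hypothetical, and the link entry is modelled as an ordinary array slot with a non-NULL chain pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical scatterlist entry: either a payload entry or a link. */
struct seg {
	unsigned int length;	/* payload bytes; unused for a link entry */
	struct seg *chain;	/* non-NULL marks a link to the next array */
};

/* Step to the next payload entry, following a chain link if one is hit. */
struct seg *seg_next(struct seg *s)
{
	s++;
	if (s->chain)
		s = s->chain;
	return s;
}

/*
 * Sum the payload of nr entries that may be spread over chained arrays.
 * A plain s++ would treat the link slot as data and then run off the
 * end of the first array.
 */
unsigned int total_length(struct seg *list, int nr)
{
	struct seg *s = list;
	unsigned int bytes = 0;
	int i;

	for (i = 0; i < nr; i++) {
		bytes += s->length;
		if (i + 1 < nr)		/* never step past the final entry */
			s = seg_next(s);
	}
	return bytes;
}
```

With two chained arrays holding 100, 200 and then 300, 400 bytes, total_length() walks all four payload entries and skips the link slot between them.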







Re: [Cbe-oss-dev] Playstation 3 BD-ROM access and LV1_DENIED_BY_POLICY

2007-08-07 Thread Nicholas A. Bellinger
On Mon, 2007-08-06 at 16:19 +0200, Geert Uytterhoeven wrote:
 On Mon, 6 Aug 2007, James Bottomley wrote:
  On Mon, 2007-08-06 at 15:38 +0200, Geert Uytterhoeven wrote:
   On Fri, 3 Aug 2007, Geoff Levand wrote:
Nicholas A. Bellinger wrote:
 Thank you for this information.  I have since been able to resolve my issue
 on 2.6.16 (which ended up being my fault), and was able to determine
 that the issue on 2.6.23-rc1 is due to
 drivers/scsi/scsi_lib.c:scsi_execute_async() rejecting READ_10 and
 TEST_UNIT_READY commands in certain cases (perhaps a race in
 drivers/scsi/ps3rom.c..?) using this API that was causing the win32 
 side
 to throw exceptions.

If you get more info on what was happening here, please report it to 
Geert
so he can investigate.  He should return next week.
   
   Indeed.
   
   Perhaps because ps3rom cannot queue more than 1 command?
   I'm CCing the SCSI guys, just in case this rings a bell.
  
  Without details, it's really hard to speculate.  The problem description
  is manifestly strange for two reasons
  

My apologies for the delay as things have been busy of late.

On the kernel side, the setup is:

Sector Size:

2048 bytes for TYPE_ROM

Max Sectors:

32 from struct scsi_host->max_sectors.  The iSCSI/HD client software is
requesting single sector READ_10s at various LBAs of the media.

iSCSI TCQ:

Setting ExpCmdSn/MaxCmdSn Window == 1 has had no effect.

ATAPI Transport Level TCQ:

A single TCQ slot is detected from struct scsi_host and struct scsi_device,
and the lowest of either is set and enforced.  Both scsi_execute_async()
and legacy scsi-request APIs are working elsewhere (outside of PS3
BD-ROM) with single ATAPI Transport level TCQ for SATA + USB and single
or many iSCSI TCQs value settings.

PS3-Linux BD-ROM support:

Also of interest is that both implementations of
drivers/block/ps3pf_storage.c and drivers/scsi/ps3rom.c using
scsi_execute_async() are able to trigger the exception scenario in
question.

PS3 System Software Revision:

The PS3 System Software revision has no effect on the problem.

   1. READ_10 should never be issued via scsi_execute_async.  There's
  no ULD in the current kernel that does this.  The READ_X/WRITE_X
  commands are issued through the filesystem path.

Gotcha.  The filesystem path with scsi_execute_async() for SG_IO CDBs is
where I will move towards supporting ps3-linux git latest.   Also, as
you mention, TUR exceptions are what eventually cause the software
player to stop.  Only the READ_10s appear to be affected by the scenario in
question.

Thanks for this pointer, I will take another look at the wireshark logs
to verify this is indeed the case.

 Nicholas is using the PS3 as an iSCSI target for watching BD-ROM content on
 other machines. That's probably where the weird command submission comes from.
 
 He will hopefully fill in the rest...
 

On my side, the goal has been successful export of Linux-iSCSI Targets
with both formats of commercial HD with win32 iSCSI Initiator(s) and
commercial software decoding the many HD discs I have purchased.  These
iSCSI targets include a Linux/ppc64 with PS3 BD-ROM and a Xbox 360
HD-DVD USB, and Linux/x86 and Linux/Alpha export of Philips SPD7000P BD
Writer over GB/sec Ethernet and IPv4/IPv6.  I have been very pleased with the
progress so far, and will be posting more info for a HOWTO in the
upcoming weeks.

 With kind regards,
  
 Geert Uytterhoeven
 Software Architect
 

Many thanks for your most valuable time.

--nab




Re: Mptlinux crashes on kernel 2.6.22.1

2007-08-07 Thread Rolf Eike Beer
Hommel, Thomas (GE Indust, GE Fanuc) wrote:
 Here's a record of driver initialization with debugging enabled. I can't
 figure out what goes wrong, but maybe somebody else can...

 Any help is appreciated

[...]
 mptbase: ioc0: WARNING - mpt_timer_expired complete!
 Unable to handle kernel paging request for data at address 0x0542
 Faulting instruction address: 0xa01d93b8
 Oops: Kernel access of bad area, sig: 11 [#1]
 SBS CM6
 NIP: a01d93b8 LR: a01d93b8 CTR: a000c2ac
 REGS: bffcbea0 TRAP: 0300   Not tainted  (2.6.22.1)
 MSR: 9032 EE,ME,IR,DR  CR: 82004028  XER: 
 DAR: 0542, DSISR: 4000
 TASK = bffc0030[5] 'events/0' THREAD: bffca000
 GPR00: a01d93b8 bffcbf50 bffc0030 bfffd0c0 bfff7800 0001 004971e0
 
 GPR08: 0001c7d0 0010 a039c000 bfff783c  ff9f6b57 0fffbd00
 
 GPR16: 0001  007fff00   007ffeb0 
 a034dd74
 GPR24: a035 a034dd74 a003 a034dd74 a02e bffca000 a01d93a0
 02c4
 NIP [a01d93b8] mptspi_dv_renegotiate_work+0x18/0x120
 LR [a01d93b8] mptspi_dv_renegotiate_work+0x18/0x120
 Call Trace:
 [bffcbf50] [a01d93b8] mptspi_dv_renegotiate_work+0x18/0x120 (unreliable)
 [bffcbf80] [a002d33c] run_workqueue+0xac/0x158
 [bffcbfa0] [a002d7a8] worker_thread+0x6c/0xd0
 [bffcbfd0] [a0030e74] kthread+0x84/0x8c
 [bffcbff0] [a00115c4] kernel_thread+0x44/0x60
 Instruction dump:
 4bffd6d1 80010024 83e1001c 38210020 7c0803a6 4e800020 7c0802a6 9421ffd0
 bf810020 90010034 83e30010 4be83521 a01f027e 2f80 419e009c
 813f
 mptbase: ioc0: Sending Config request type 4, page 1 and action 0
 mptbase: ioc0: mf_dma_addr=1fe82922 req_idx=3 RequestNB=2
 mptbase: ioc0: WARNING - mpt_timer_expired!
 mptbase: IOC setup_reset routed to MPT base driver!
 mptbase: Initiating ioc0 recovery
 mptbase::MakeIocReady, ioc0 [raw] state=2400
 mptbase: ioc0: IOC operational unexpected
 mptbase: whoinit 0x4 statefault 0 force 1

Looks like a NULL deref.

Find your mptspi.o, fusion.o or fusion.ko (all of them should work), and do

gdb fusion.o
l *mptspi_dv_renegotiate_work+0x18

That should give you the faulting line.
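
Why a fault at a small address such as 0x0542 reads as a NULL dereference can be shown with a short sketch. The structure below is purely hypothetical: the real structure behind the oops is not known from this report, and hd_sketch, its padding and fault_address() exist only to show how a member offset becomes the faulting address when the base pointer is NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout with a 16-bit member sitting at offset 0x542. */
struct hd_sketch {
	char pad[0x542];
	uint16_t field;
};

/*
 * Address the CPU would try to load when reading `field` through a
 * structure pointer whose value is `base`.
 */
uintptr_t fault_address(uintptr_t base)
{
	return base + offsetof(struct hd_sketch, field);
}
```

With a NULL base the access lands at the member offset itself, which is why the data address in such an oops is a small number rather than zero.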

Eike




Re: [GIT PATCH] scsi bug fixes for 2.6.23-rc2

2007-08-07 Thread James Smart

In defense of my maintainer, who was working on my behalf! ...

The lpfc mods were the bulk of the +/- counts.  We batch our bug fixes
together and then push to James as a large lump. Unfortunately, we had
a change that changed logging from a base object to a subobject. Although
not risky, it did account for a lot of +/- changes.  The way we pushed
to James did not allow for him to easily segment one set of changes
from the other. Emulex will change this behavior, hopefully making it
easier for James to keep you happy.

However, I take issue with looking at line counts as the sole basis
for what's appropriate or not. It can be argued that some bug fixes may be
larger in scope than others, or that batching patches so that the bug fix
count is higher will skew this perception. I also believe that more of the
lesser bugfixes should be allowed in an earlier -rc than in a later one, so a
hard-and-fast rule for line counts seems odd.  Also - what's a bug fix?  There
are many things which are not features but are necessities for diagnosis or
support of the larger change. Some of these you simply don't find in time to
make sure they are in place for the -rc1 merge. Do you hold off on them, or do
you make a choice based on risk/reward and on where the -rc is? I vote for the
latter. I realize that the Linux kernel is such a beast overall that you must
have some simple guidelines, but basing it solely on numbers is a very bad pitfall.

-- james s


Linus Torvalds wrote:


On Mon, 6 Aug 2007, James Bottomley wrote:

Confused ... you did get the first pull request in the first week.


Here's the problem. Let me repeat it again:


And after -rc1, I don't want to see crap like this:

 46 files changed, 2837 insertions(+), 2050 deletions(-)


It DOES NOT MATTER if I get a first pull request in the first week, if 
that pull request is purely cosmetic, and is followed by stuff that 
*should* have been in the merge window four weeks afterwards.



OK ... that's arguable.


There's nothing arguable at all about it.

If you have 5000 lines of changes, that's not a bugfix any more. That's 
a big damn change, and it should have happened in the merge window. Or if 
it doesn't make it in time, in the *next* merge window.


Linus


Re: [GIT PATCH] scsi bug fixes for 2.6.23-rc2

2007-08-07 Thread FUJITA Tomonori
On Tue, 7 Aug 2007 00:14:29 -0700
Andrew Morton [EMAIL PROTECTED] wrote:

 On Mon, 06 Aug 2007 22:55:41 -0500 James Bottomley [EMAIL PROTECTED] wrote:
 
  The real root cause of all of this is that there's no tree I can
  persuade all the interested parties to test that includes all of these
  features.  In spite of the fact they've all been incubating in -mm for
  at least 3 months, no-one apparently tested all the features together
  until 2.6.23-rc1 was released, so then we're scrambling to address the
  issues as they arise.
 
 I pulled git-scsi-misc on July 19 and there was no bsg code in there at
 all.  I pulled again on July 20 and all the bsg code was in mainline.  So
 it appears that the bsg code went mailing-list -> mainline in less than 24
 hours, so there wasn't a lot of opportunity for -mm testing there.

bsg was merged via Jens' branch. After that, I asked James to send
some fixes via the scsi-rc-fixes.


 A lot of the stupid it-doesn't-compile stuff would have been fixed in -mm,
 but more substantial problems might not have been picked up.  But one can
 say that about anything.

My mistake. I should have sent bsg to -mm. Sorry about that.


Re: [GIT PATCH] scsi bug fixes for 2.6.23-rc2

2007-08-07 Thread Jeff Garzik

FUJITA Tomonori wrote:

On Tue, 7 Aug 2007 00:14:29 -0700
Andrew Morton [EMAIL PROTECTED] wrote:


On Mon, 06 Aug 2007 22:55:41 -0500 James Bottomley [EMAIL PROTECTED] wrote:


The real root cause of all of this is that there's no tree I can
persuade all the interested parties to test that includes all of these
features.  In spite of the fact they've all been incubating in -mm for
at least 3 months, no-one apparently tested all the features together
until 2.6.23-rc1 was released, so then we're scrambling to address the
issues as they arise.

I pulled git-scsi-misc on July 19 and there was no bsg code in there at
all.  I pulled again on July 20 and all the bsg code was in mainline.  So
it appears that the bsg code went mailing-list -> mainline in less than 24
hours, so there wasn't a lot of opportunity for -mm testing there.


bsg was merged via Jens' branch. After that, I asked James to send
some fixes via the scsi-rc-fixes.


ISTR that Jens doesn't regularly push / get picked up by -mm?  That 
seems like an easy problem to solve.


Jeff






Re: [GIT PATCH] scsi bug fixes for 2.6.23-rc2

2007-08-07 Thread James Bottomley
On Tue, 2007-08-07 at 00:14 -0700, Andrew Morton wrote:
 On Mon, 06 Aug 2007 22:55:41 -0500 James Bottomley [EMAIL PROTECTED] wrote:
 
  The real root cause of all of this is that there's no tree I can
  persuade all the interested parties to test that includes all of these
  features.  In spite of the fact they've all been incubating in -mm for
  at least 3 months, no-one apparently tested all the features together
  until 2.6.23-rc1 was released, so then we're scrambling to address the
  issues as they arise.
 
 I pulled git-scsi-misc on July 19 and there was no bsg code in there at
 all.  I pulled again on July 20 and all the bsg code was in mainline.  So
 it appears that the bsg code went mailing-list -> mainline in less than 24
 hours, so there wasn't a lot of opportunity for -mm testing there.

The initial bsg submit went via the block git tree ... which I believe
you have in -mm.  We only started taking the updates via the scsi tree
when it became evident that they were entangling both scsi and bsg too
deeply to be split between trees.

 A lot of the stupid it-doesn't-compile stuff would have been fixed in -mm,
 but more substantial problems might not have been picked up.  But one can
 say that about anything.

Actually, it was fixed ... just in a fashion I found to be unacceptable:
making SCSI built in if bsg was selected.

  I really, *really* think we need a pre-release tree that consists of all
  the upstream targeted features (i.e. all of the for the next merge
  window git trees) and nothing else.
 
 That *is* -mm.  The vast majority of -mm is the 75-odd subsystem trees. 
 What you're suggesting amounts to omitting some of those trees for test
 purposes (I think).  If so, which ones?

The problem is that there's too much other stuff in -mm.  Whenever
anyone asks where they can get scsi-misc (which is my tree for the next
merge window) from without constructing it themselves in git, I always
say use -mm.  Unfortunately, the attrition rate after telling them this
seems to be really high.

 Now it could be argued that subsystem maintainers should run two trees in
 the last 2.6.x-rcN phase: one tree for 2.6.x+1 and one tree for 2.6.x+2. 
 Then someone could pull all that together as the Linus tree in a month,
 minus insufficiently baked stuff tree.  But frankly, I don't expect that
 people will want to do that, nor will they be able to do it reliably.

A sort of pre merge window freeze point?

 Plus, an *amazing* amount of stuff turns up in the git trees which was
 committed just a few days prior to the merge window opening, or even after
 it opening.  eg, bsg which was, afaict, first committed to the scsi tree
 eleven days after the 2.6.22 release.

Yes ... particularly in large trees like SCSI, there's the maintainer
bugger if I don't mail it out now I don't get it in for another three
months factor.

bsg had actually been sitting in the block tree since 2.6.21, so it had
followed the delayed merge rule ... it just seems that it didn't get
enough integration testing in that six months.  This is what I consider
the real problem to be.

   -mm doesn't really satisfy this,
  because it has so much other stuff that the people I need to get testing
  this don't trust it.
 
 Right.  75-odd developers need to stop committing bugs to their devel
 trees.  Interesting project ;)
 
   The lack of a tree like this that we could have
  persuaded people to test for the last month is what's causing us to
  scramble like this at the closure of the merge window.
 
 Nope.  The scramble is caused by subsystem maintainers jamming stuff into
 mainline at the last minute so they don't have to sit on it for the next
 two months.
 
 Look.  If we're serious about this then the rule needs to be something like
 
   If it wasn't committed to your tree *at least* two weeks prior to the
   2.6.x merge window opening, it shouldn't go into 2.6.x.
 
 People are not presently observing this sort of discipline by a metric
 mile.  And I'm not sure that we should, really.

I don't disagree; my point is that bsg did follow this rule (in fact it
tried to stabilise itself outside of mainline for far longer than this
rule implies)  the problem is that it didn't get sufficient integration
testing.

 I don't think it's terribly bad to whack half-baked things (bsg ;)) into

I wouldn't call bsg half baked ... it was very carefully matured.  There
were just a few integration issues.

 mainline during the merge window, as long as a) we're sure that we want the
 feature in Linux and b) we're confident that we can get it fixed up within
 a couple of months.  Two months is a long time.
 
 But that's just me, and it is not the approach which Linus wants taken.

I agree with this approach too ... that's what I've been doing.  It
means that feature stabilisation must finish at around -rc3.  The
relative quiet from SCSI over the last two releases was because I didn't
have any new features.

I fully agree, and firmly believe that the current 

Re: [GIT PATCH] scsi bug fixes for 2.6.23-rc2

2007-08-07 Thread James Bottomley
On Mon, 2007-08-06 at 21:01 -0700, Linus Torvalds wrote:
 
 On Mon, 6 Aug 2007, James Bottomley wrote:
 
  Confused ... you did get the first pull request in the first week.
 
 Here's the problem. Let me repeat it again:
 
   And after -rc1, I don't want to see crap like this:
   
  46 files changed, 2837 insertions(+), 2050 deletions(-)
 
 It DOES NOT MATTER if I get a first pull request in the first week, if 
 that pull request is purely cosmetic, and is followed by stuff that 
 *should* have been in the merge window four weeks afterwards.
 
  OK ... that's arguable.
 
 There's nothing arguable at all about it.
 
 If you have 5000 lines of changes, that's not a bugfix any more. That's 
 a big damn change, and it should have happened in the merge window. Or if 
 it doesn't make it in time, in the *next* merge window.

I'm not arguing that the bug fix piece wasn't too big (although
realistically, line counts are only a guide not a rule.  If we discover
something like a calling convention bug in SCSI [reversed kmalloc
arguments, say], I could see a huge patch to fix all of the call
sites) ... I've said I'll take responsibility for that and fix it.

I'm arguing that a too strict an interpretation of bugfix only post -rc1
will damage feature stabilisation.  Please think carefully about this.
If we go out in a released kernel with a problematic user space ABI, we
end up being committed to it forever.

James




Re: [GIT PATCH] scsi bug fixes for 2.6.23-rc2

2007-08-07 Thread Jeff Garzik

Alan Cox wrote:

I fully agree, and firmly believe that the current stabilisation works
incredibly well for shaking out bugs.  My problem is that it doesn't
work for stabilising features.  Either we have to get far more people
doing feature integration testing before the merge window, or we have to
accept feature updates after the merge window (for existing features
that are having stability issues).


The other alternative is that if Linus won't take updates you ask him to
revert bsg so that you don't get a half baked merge as a result of this.
I'm not sure that is a good path to follow either however.


Like everything else in life, it's a balance.  If something is clearly 
half-baked and requires a bunch of post-rc1 changes just to be usable, a 
revert might make a lot of sense.


It's questions of: how much further change is required, how invasive are 
those changes, how half-baked and incomplete is the feature really, what 
is the downside of a revert, ...


Jeff





Re: [PATCH] move ULD attachment into the prep function

2007-08-07 Thread Boaz Harrosh
James Bottomley wrote:
 One of the intents of the block prep function was to allow ULDs to use
 it for preprocessing.  The original SCSI model was to have a single prep
 function and add a pointer indirect filter to build the necessary
 commands.  This patch reverses that, does away with the init_command
 field of the scsi_driver structure and makes ULDs attach directly to the
 prep function instead.  The value is really that it allows us to begin
 to separate the ULDs from the SCSI mid layer (as long as they don't use
 any core functions---which is hard at the moment---a ULD doesn't even
 need SCSI to bind).
 
 James
 
 Index: BUILD-2.6/drivers/scsi/scsi_lib.c

It turns out this patch depends on the previous one:
sd: disentangle barriers in SCSI (02)

and that one depends on the one before it:
block: add protocol discriminators to requests and queues. (01)

But the middle one (02) does not apply; it looks like there is a missing
hunk for scsi_lib.c in the first (01).

<sd: disentangle barriers in SCSI (02)>
@@ -1596,7 +1580,6 @@ struct request_queue *scsi_alloc_queue(s
 		return NULL;
 
 	blk_queue_prep_rq(q, scsi_prep_fn);
-	blk_queue_issue_flush_fn(q, scsi_issue_flush_fn);
 	blk_queue_softirq_done(q, scsi_softirq_done);
 	blk_queue_protocol(q, BLK_PROTOCOL_SCSI);
 	return q;
</sd: disentangle barriers in SCSI (02)>

The second-to-last context line:
blk_queue_protocol(q, BLK_PROTOCOL_SCSI);
is missing from (01). Anything else I need?

So I guess my first complaint is that these should have been
a series to denote the dependency. Also, I think an email with a
deeper explanation of where you are going with these, and
what the motivation is, would be nice.

Apart from that:

Ouch! ;) That patch hurts.

What is the time frame for these changes? Are they for immediate
inclusion into scsi-misc and into the 2.6.24 merge window? Before
scsi_data_buff, sglist, bidi, Mike's execute_async_cleanup ... ?

I do not like this patch. I think that if your motivation was
to make sd/sr and other ULDs more independent of scsi-ml, then
you achieved the opposite: 5 exported functions and intimate
knowledge of scsi_lib internals, lots of identical cut-and-paste code
in ULDs, and interdependence of scsi_lib.c with its ULDs. That will
make it hard for scsi_lib to change without touching ULDs.
(And there are lots of scheduled changes :-))

What about the approach below?
What I tried to do is keep scsi_lib private, export a
simpler API for ULDs, and keep common code common.
The code was compiled and booted, but I did not do any error
injection and/or low memory condition testing.

[PATCH 3/3] move ULD attachment into the prep function


  - scsi_lib.c prep_fn will only support blk_pc_commands.
  - Let ULDs that support blk_fs_request() overload prep_fn.
    sd.c and sr.c will do so.
  - scsi_lib exports a scsi_prep_cmnd() helper that will take
    a request and allocate and return a struct scsi_cmnd.
  - If a ULD decides it wants to fail the command allocated above,
    it must call a new export, scsi_prep_return(), to cancel the
    request and return the command to the free store.
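
The calling convention described in the list above can be modelled in plain userspace C. This is only a rough sketch under stated assumptions: scsi_prep_cmnd(), scsi_prep_return() and the BLKPREP_* codes are the names used in the patch, but the structures, the one-slot command pool, toy_uld_prep_fn() and the choice to route every exit path through scsi_prep_return() are stand-ins invented for illustration, not the kernel implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Values mirror the block layer's BLKPREP_* codes. */
enum { BLKPREP_OK = 0, BLKPREP_KILL = 1, BLKPREP_DEFER = 2 };

/* Toy stand-ins; the real kernel structures carry far more state. */
struct scsi_cmnd { int in_use; };
struct request { int is_fs; int sectors; struct scsi_cmnd *special; };
struct request_queue { struct scsi_cmnd pool; };	/* one-slot pool */

/* Stand-in for the proposed helper: allocate a command for the request. */
static struct scsi_cmnd *scsi_prep_cmnd(struct request_queue *q,
					struct request *req, int *pRet)
{
	if (q->pool.in_use) {		/* no free command: retry later */
		*pRet = BLKPREP_DEFER;
		return NULL;
	}
	q->pool.in_use = 1;
	req->special = &q->pool;
	*pRet = BLKPREP_OK;
	return req->special;
}

/* Stand-in for the proposed helper: on failure, give the command back. */
static int scsi_prep_return(struct request_queue *q, struct request *req,
			    struct scsi_cmnd *cmd, int ret)
{
	(void)q;
	if (ret != BLKPREP_OK && cmd) {
		cmd->in_use = 0;
		req->special = NULL;
	}
	return ret;
}

/* Hypothetical ULD prep function following the list above. */
static int toy_uld_prep_fn(struct request_queue *q, struct request *req)
{
	int ret;
	struct scsi_cmnd *cmd;

	assert(q && req);
	cmd = scsi_prep_cmnd(q, req, &ret);
	if (ret != BLKPREP_OK)
		return scsi_prep_return(q, req, cmd, ret);

	/* ULD-specific setup; reject filesystem requests with no data. */
	if (req->is_fs && req->sectors == 0)
		ret = BLKPREP_KILL;

	return scsi_prep_return(q, req, cmd, ret);
}
```

The point of the shape is that the ULD never touches command allocation or release directly; it only decides the BLKPREP_* verdict and hands it back to the core.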

git-diff
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 60cbe37..c8ed932 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1128,8 +1128,6 @@ static int scsi_setup_blk_pc_cmnd(struct scsi_device *sdev, struct request *req)
 static int scsi_setup_fs_cmnd(struct scsi_device *sdev, struct request *req)
 {
 	struct scsi_cmnd *cmd;
-	struct scsi_driver *drv;
-	int ret;
 
 	/*
 	 * Filesystem requests must transfer data.
@@ -1140,24 +1138,11 @@ static int scsi_setup_fs_cmnd(struct scsi_device *sdev, struct request *req)
 	if (unlikely(!cmd))
 		return BLKPREP_DEFER;
 
-	ret = scsi_init_io(cmd);
-	if (unlikely(ret))
-		return ret;
-
-	/*
-	 * Initialize the actual SCSI command for this request.
-	 */
-	drv = *(struct scsi_driver **)req->rq_disk->private_data;
-	if (unlikely(!drv->init_command(cmd))) {
-		scsi_release_buffers(cmd);
-		scsi_put_command(cmd);
-		return BLKPREP_KILL;
-	}
-
-	return BLKPREP_OK;
+	return scsi_init_io(cmd);
 }
 
-static int scsi_prep_fn(struct request_queue *q, struct request *req)
+struct scsi_cmnd *scsi_prep_cmnd(struct request_queue *q, struct request *req,
+				 int *pRet)
 {
 	struct scsi_device *sdev = q->queuedata;
 	int ret = BLKPREP_OK;
@@ -1231,6 +1216,16 @@ static int scsi_prep_fn(struct request_queue *q, struct request *req)
 	}
 
  out:
+	*pRet = ret;
+	return req->special;
+}
+EXPORT_SYMBOL(scsi_prep_cmnd);
+
+int scsi_prep_return(struct request_queue *q, struct request *req,
+		     struct scsi_cmnd *cmd, int ret)
+{
+	struct scsi_device *sdev = q->queuedata;
+
 	switch (ret) {
 	case BLKPREP_KILL:
  

Re: [GIT PATCH] scsi bug fixes for 2.6.23-rc2

2007-08-07 Thread Rene Herman

On 08/07/2007 05:55 AM, James Bottomley wrote:

 I really, *really* think we need a pre-release tree that consists of all
 the upstream targetted features (i.e. all of the for the next merge
 window git trees) and nothing else.  -mm doesn't really satisfy this,
 because it has so much other stuff that the people I need to get testing
 this don't trust it.


I very much agree with this. I used to run -mm at least somewhat frequently
if it included stuff I was interested in (random features, or for example
the latest ALSA), but gave that up after it broke on something unrelated and
(to me) uninteresting a few times too many. There's so much in there that,
before you know it, you end up spending time not on what you wanted to work
on but on something completely unrelated, which, given a fixed amount of
available total time (and level/spread of knowledge/interest), gets to be a
problem.


For latest ALSA, I now sometimes look at the ALSA repositories directly but 
otherwise it seems I'm not testing much of anything before late -rc's.


Not everything going to Linus gets there by way of -mm, but perhaps it could 
help if Andrew split off -merge from -mm, where -merge contained only stuff 
(from -mm) expected to go upstream in the next merge window.


Not sure if this is proposing that other people do more work -- wouldn't
want to do anything of the sort...


Rene.


Re: [GIT PATCH] scsi bug fixes for 2.6.23-rc2

2007-08-07 Thread James Bottomley
On Tue, 2007-08-07 at 11:11 -0400, Jeff Garzik wrote:
 James Bottomley wrote:
  The initial bsg submit went via the block git tree ... which I believe
  you have in -mm.  We only started taking the updates via the scsi tree
 
 Seven hours before you posted this, in 
 [EMAIL PROTECTED], Andrew already 
 noted it was not in -mm.
 
 A trivial examination of the broken-out mm patches backs up the absence
 of Jens' block tree, too.
 
 So let's put this myth / bad assumption to rest, shall we?

Sorry ... I just assumed from the fact that it had been in the block git
tree for six months that it was also in -mm.

  Yes ... particularly in large trees like SCSI, there's the maintainer
  "bugger, if I don't mail it out now I don't get it in for another three
  months" factor.
 
 That factor always exists.  It's not confined to SCSI or large trees.
 It's basically the nature of the merge window.  Nothing new or shocking here.
 
 
  bsg had actually been sitting in the block tree since 2.6.21, so it had
  followed the delayed merge rule ... it just seems that it didn't get
  enough integration testing in that six months.  This is what I consider
 
 It didn't get integration testing, at least in part, because it did not 
 hit our official pre-release tree.  Quoth Andrew:
  I pulled git-scsi-misc on July 19 and there was no bsg code in there at
  all.  I pulled again on July 20 and all the bsg code was in mainline.
 
 
 
  I don't disagree; my point is that bsg did follow this rule (in fact it
 
 Evidence says otherwise.

It followed the rule of trying to stabilise outside mainline ... it just
didn't get sufficient integration testing.

  I wouldn't call bsg half baked ... it was very carefully matured.  There
  were just a few integration issues.
 
 I wouldn't call bsg carefully matured, if in addition to not really 
 gracing -mm with its presence, the userland API structure is still 
 getting changes on July 29, 2007 (0c6a89ba640d28e1dcd7fd1a217d2cfb92ae4953).

This would be the ABI change I talked about in the previous emails.

So would this problem have been fixed simply by adding the missing block
tree to -mm?

James




Re: [GIT PATCH] scsi bug fixes for 2.6.23-rc2

2007-08-07 Thread Jeff Garzik

James Bottomley wrote:
 OK ... that's arguable.  This one is larger than I like because of the
 lpfc bug fix patch ... I accept I need to do a better job getting these
 into the merge window via the scsi-misc tree.  So I will accept the "too
 big" criticism and try to manage the driver maintainers better.
 
 However, I won't accept the "not bug fixes only" criticism at -rc1.  The
 problem is that we're trying to stabilise a new feature: bsg.

Just so we don't lose the forest for the trees...

Not trying to put words in Linus's mouth, but it seems to me he wasn't
complaining specifically about bsg.  Style cleanups, cosmetic
cleanups, ancient ISA driver polishing (1542, my gdth patch) are
definitely not bug fix only material.


The lpfc update was probably the biggest thing, LOC-wise.  And even 
though that was mostly bug fixes -- and notably NOT 100% fixes -- it is 
big enough to warrant integration testing and exposure prior to 
mainline.  Definitely merge-window-open material AFAICS.


Jeff




Re: [GIT PATCH] scsi bug fixes for 2.6.23-rc2

2007-08-07 Thread Jeff Garzik

James Smart wrote:
 However, I take issue with looking at line counts as the sole basis
 for what's appropriate or not. It can be argued that some bug fixes may be
 larger in scope than others, or patch batching so that the bug fix count is
 higher will skew this perception. I also believe that more lesser bugfixes
 should be allowed in an earlier -rc? than later, so a hard-and-fast rule for
 line counts seems odd.  Also - what's a bug fix?  There are many things
 which are not features but are necessities for diagnosis or support of the
 larger change. Some of these you simply don't find in time to make sure they
 are in place for the -rc1 merge. Do you hold off on them, or do you make a
 risk/reward choice based on where the -rc is? I vote for the latter.
 I realize that the Linux kernel is such a beast overall that you must have
 some simple guidelines, but basing it solely on numbers is a very bad pitfall.



It's straightforward engineering math:  the more LOC that changed, the 
more important it is to /not/ stuff it into a stabilization release, 
because of the greater potential for breaking stuff and negating all the 
existing testing so far.


Once -rc1 is out there, that means the focus should be on stabilizing
the existing codebase.  Pushing a big driver update means that effort
must restart from scratch.  We just don't want to go down that road,
which is a big reason for the merge window in general.


If you miss the merge window, tough cookies :)  You gotta deal with it 
just like I do, and everyone else does.


Remember -- the more disciplined we all are with the merge window, the 
more likely it is that a release can be stabilized quickly, and thus, 
the more quickly we will reach the next merge window.


In contrast, increasing violations of the merge window mean increasing 
time between releases.


Jeff




Re: [GIT PATCH] scsi bug fixes for 2.6.23-rc2

2007-08-07 Thread Jeff Garzik

James Bottomley wrote:

 I'm arguing that too strict an interpretation of "bugfix only" post -rc1
 will damage feature stabilisation.  Please think carefully about this.
 If we go out in a released kernel with a problematic user space ABI, we
 end up being committed to it forever.



IMO you're going off on your own tangent.  Linus never singled out bsg 
(far from it, in fact, since bsg was not a major LOC contributor) or 
declared ABI-related fixes verboten.


I don't think anyone wants to release a userspace ABI with problems,
since we all know that's basically locked in stone once it's in a
mainline release.


AFAICS his main complaint was he felt your push was a big honking huge
change, late in the game, that included obvious non-fixes.  And it was.
lpfc was probably the biggest part of that, not bsg, and it's pretty
clear such a big lpfc update should have gone in when the merge window
was open.  The [non-lpfc] cleanups were also not -rc2 material.


Jeff




Re: [GIT PATCH] scsi bug fixes for 2.6.23-rc2

2007-08-07 Thread James Smart

Jeff Garzik wrote:
 The lpfc update was probably the biggest thing, LOC-wise.  And even
 though that was mostly bug fixes -- and notably NOT 100% fixes -- it is
 big enough to warrant integration testing and exposure prior to
 mainline.  Definitely merge-window-open material AFAICS.

FYI - it is integrated and tested prior to mainline, by Emulex (and who
else *really* tests it close to the degree we do?). We do so, as a whole,
weeks ahead of the submit to the maintainer. Usually, there's only a couple
of small API changes that are picked up when we merge into the maintainer's
pool.  And most of these are caught by us beforehand anyway, as we package the
patchsets and ensure the integration into the maintainer's pool is smooth.

-- james s



Re: [GIT PATCH] scsi bug fixes for 2.6.23-rc2

2007-08-07 Thread Jeff Garzik

James Smart wrote:
 Jeff Garzik wrote:
  The lpfc update was probably the biggest thing, LOC-wise.  And even
  though that was mostly bug fixes -- and notably NOT 100% fixes -- it
  is big enough to warrant integration testing and exposure prior to
  mainline.  Definitely merge-window-open material AFAICS.
 
 FYI - it is integrated and tested prior to mainline, by Emulex (and who
 else *really* tests it close to the degree we do?). We do so, as a whole,
 weeks ahead of the submit to the maintainer. Usually, there's only a couple
 of small API changes that are picked up when we merge into the maintainer's
 pool.  And most of these are caught by us beforehand anyway, as we package the
 patchsets and ensure the integration into the maintainer's pool is smooth.


This is a highly common pattern, and unfortunately you get the highly 
common Linux response:


In Linux we never ever assume a driver is working simply because the 
hardware vendor tested it.  A decade of real world experience PROVES 
precisely the opposite -- getting code out into the world early and 
often repeatedly turned up problems not seen in hardware vendor's testing.


Take a lesson from when I was on Linus's shit-list... twice:  Twice,
Intel submitted an e1000 update after the merge window closed.  Twice,
they claimed the driver passed their quite-exhaustive internal testing.
And twice, the most popular network driver broke for large masses of
users because I took a hardware vendor's word on testing rather than
rely on the testing PROVEN to flush out problems:  public linux kernel
testing.


I'm not singling out Intel, there are plenty of other hardware vendors 
that repeat the exact same pattern.


It's quite simply impossible for a hardware vendor to test all the weird 
combinations in the field.  Our test lab -- the Internet -- is the one 
we trust.


Jeff




Re: BUG in SCSI async scanning

2007-08-07 Thread James Bottomley
On Tue, 2007-08-07 at 12:54 -0400, Alan Stern wrote:
 Can somebody explain the reason for calling a separate 
 scsi_sysfs_add_devices() routine in the async scanning code instead of 
 just calling scsi_sysfs_add_sdev() normally from within scsi_add_lun()?

Matthew's away at the moment, so I'll speak for him.

The reason is so that the order of device enumeration remains
approximately constant.   We didn't necessarily want completely random
and timing related enumerations with the async scanning patch.

 This peculiar delayed approach has introduced a bug.  It evades the
 protection provided by shost->scan_mutex and as a result, if a host is
 hot-unplugged in the middle of an async scan, the SCSI core will
 attempt to unregister the host's devices before they have been
 registered!  Obviously this is not good.

Does this patch fix it:

http://marc.info/?l=linux-scsi&m=118289275414202

? it's one of a number designed to address the problems.

James


 There are at least two reports filed in the kernel.org bugzilla showing
 this bug: #8840 and #8846.
 
 Alan Stern



Re: [PATCH] sg: increase sglist_len of the sg_scatter_hold structure

2007-08-07 Thread Mike Christie

FUJITA Tomonori wrote:
 Allocating 64K contiguous memory is not good so the next thing to do
 is converting sg to use the sg chaining support fully. Or it might be

For LLDs like aic7xxx, I think we are stuck with a small
scsi_host_template->sg_tablesize, so to continue to get large requests
like before, will we still have to allocate large segments?

Is block/scsi_ioctl.c converted to sg chaining in any tree yet? Is that
in your tree or one of Jens' branches?

 time to finish the overdue task, to convert sg to use the block layer
 functions.


Re: [GIT PATCH] scsi bug fixes for 2.6.23-rc2

2007-08-07 Thread Andrew Morton
On Tue, 07 Aug 2007 10:21:18 -0400 Jeff Garzik [EMAIL PROTECTED] wrote:

 FUJITA Tomonori wrote:
  On Tue, 7 Aug 2007 00:14:29 -0700
  Andrew Morton [EMAIL PROTECTED] wrote:
  
  On Mon, 06 Aug 2007 22:55:41 -0500 James Bottomley [EMAIL PROTECTED] 
  wrote:
 
  The real root cause of all of this is that there's no tree I can
  persuade all the interested parties to test that includes all of these
  features.  In spite of the fact they've all been incubating in -mm for
  at least 3 months, no-one apparently tested all the features together
  until 2.6.23-rc1 was released, so then we're scrambling to address the
  issues as they arise.
  I pulled git-scsi-misc on July 19 and there was no bsg code in there at
  all.  I pulled again on July 20 and all the bsg code was in mainline.  So
  it appears that the bsg code went mailing-list -> mainline in less than 24
  hours, so there wasn't a lot of opportunity for -mm testing there.
  
  bsg was merged via Jens' branch. After that, I asked James to send
  some fixes via the scsi-rc-fixes.
 
 ISTR that Jens doesn't regularly push / get picked up by -mm?  That 
 seems like an easy problem to solve.
 

-mm includes
git+ssh://master.kernel.org/pub/scm/linux/kernel/git/axboe/linux-2.6-block.git#for-akpm,
but it's up to Jens to choose what goes in there.

git-block has been dropped from -mm since 2.6.23-rc1-mm1 (I think) because
it has something in there which breaks sata on the Vaio.  Prior to that (in
2.6.22-rc-late) there was nothing in that tree.



Re: [GIT PATCH] scsi bug fixes for 2.6.23-rc2

2007-08-07 Thread Andrew Morton
On Tue, 07 Aug 2007 10:38:44 -0500 James Bottomley [EMAIL PROTECTED] wrote:

 On Tue, 2007-08-07 at 11:11 -0400, Jeff Garzik wrote:
  James Bottomley wrote:
   The initial bsg submit went via the block git tree ... which I believe
   you have in -mm.  We only started taking the updates via the scsi tree
  
  Seven hours before you posted this, in 
  [EMAIL PROTECTED], Andrew already 
  noted it was not in -mm.
  
  A trivial examination of the broken-out mm patches backs up the absence 
  of Jens' block tree, too.
  
  So let's put this myth / bad assumption to rest, shall we?
 
 Sorry ... I just assumed from the fact that it had been in the block git
 tree for six months that it was also in -mm.

bsg was never in the #for-akpm branch of git-block.  So I assume that
Jens had it in some other branch and for some reason never pulled it
across into #for-akpm.

It was most reasonable of you to expect that bsg had received a decent
run in -mm.


Re: BUG in SCSI async scanning

2007-08-07 Thread Alan Stern
On Tue, 7 Aug 2007, James Bottomley wrote:

 On Tue, 2007-08-07 at 12:54 -0400, Alan Stern wrote:
  Can somebody explain the reason for calling a separate 
  scsi_sysfs_add_devices() routine in the async scanning code instead of 
  just calling scsi_sysfs_add_sdev() normally from within scsi_add_lun()?
 
 Matthew's away at the moment, so I'll speak for him.
 
 The reason is so that the order of device enumeration remains
 approximately constant.   We didn't necessarily want completely random
 and timing related enumerations with the async scanning patch.

If you say so.  But aren't synchronous enumerations just about as 
random and timing-related?  I guess you're concerned about cases where 
multiple SCSI buses are scanned in parallel.

  This peculiar delayed approach has introduced a bug.  It evades the
  protection provided by shost->scan_mutex and as a result, if a host is
  hot-unplugged in the middle of an async scan, the SCSI core will
  attempt to unregister the host's devices before they have been
  registered!  Obviously this is not good.
 
 Does this patch fix it:
 
 http://marc.info/?l=linux-scsi&m=118289275414202
 
 ? it's one of a number designed to address the problems.

Yes it does, thank you very much.  I urge you to send the patch 
upstream in time for 2.6.23 and also to submit it for the 2.6.22-stable 
branch (if you haven't done so already).

Alan Stern



[PATCH 1/1] aacraid: default timeout for arrays too short

2007-08-07 Thread Salyzyn, Mark
The default SCSI timeout is 30 seconds for a logical device. The aacraid
based controllers currently have a 35 second timeout for the array. We
are bumping up the default SCSI timeout for array devices, which
typically manage many physical disks, to 45 seconds to provide a small
margin to permit the controller to do what it is designed for. We have
not observed any bad side effects either way, because the aacraid
timeout handler takes no significant action except to take
advantage of the quiesced state to allow completion of all outstanding
commands in the controller, providing a poor man's guarantee of delivery.
This is merely a preferential decision to reduce the number of timeout
reports in the system logs to only the more serious conditions.

This attached patch is against current scsi-misc-2.6.

ObligatoryDisclaimer: Please accept my condolences regarding Outlook's
handling of patch attachments.

Signed-off-by: Mark Salyzyn [EMAIL PROTECTED]

 drivers/scsi/aacraid/linit.c |    6 ++++++
 1 file changed, 6 insertions(+)

Sincerely -- Mark Salyzyn


aacraid_array_timeout_too_short.patch
Description: aacraid_array_timeout_too_short.patch
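
Since the attachment itself is not reproduced in this archive, the timeout reasoning in the description can be sketched as follows. This is a userspace illustration only, not the contents of the patch: the 30 and 45 second figures come from the description, while struct toy_device, toy_configure_timeout() and the decision to raise the timeout only when it is below 45 seconds (so that a larger administrator-chosen value survives) are assumptions made for the example.

```c
#include <assert.h>

#define DEFAULT_SCSI_TIMEOUT_SECS	30	/* default quoted in the description */
#define ARRAY_TIMEOUT_SECS		45	/* new value for array devices */

/* Hypothetical stand-in for the per-device state a driver would inspect. */
struct toy_device {
	int is_array;		/* non-zero for a controller-managed array */
	int timeout_secs;	/* current command timeout */
};

/*
 * Raise the timeout for array devices so it exceeds the controller's own
 * 35 second array timeout. Plain devices and devices that already have a
 * longer timeout are left untouched (an assumption of this sketch).
 */
static void toy_configure_timeout(struct toy_device *dev)
{
	assert(dev);
	if (dev->is_array && dev->timeout_secs < ARRAY_TIMEOUT_SECS)
		dev->timeout_secs = ARRAY_TIMEOUT_SECS;
}
```

With these numbers, the SCSI layer's timer fires 10 seconds after the controller's internal 35 second limit rather than 5 seconds before it.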


Re: [PATCH] sg: increase sglist_len of the sg_scatter_hold structure

2007-08-07 Thread FUJITA Tomonori
On Tue, 07 Aug 2007 12:13:41 -0500
Mike Christie [EMAIL PROTECTED] wrote:

 FUJITA Tomonori wrote:
  Allocating 64K contiguous memory is not good so the next thing to do
  is converting sg to use the sg chaining support fully. Or it might be
 
 For LLDs like aic7xxx, I think we are stuck with a small
 scsi_host_template->sg_tablesize, so to continue to get large requests
 like before will we have to still allocate large segments?

No. sg.c has:

sizeof(struct scatterlist) * min(q->max_hw_segments, q->max_phys_segments)

If an LLD has a small max_hw_segments, it doesn't allocate big contiguous
memory.
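
To make the sizing rule concrete, here is a small userspace sketch of the same arithmetic. The field names max_hw_segments and max_phys_segments come from the expression above; struct toy_scatterlist and struct toy_queue are simplified stand-ins, and the real sizeof(struct scatterlist) depends on architecture and kernel configuration, so no absolute byte counts are claimed.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in; the kernel's struct scatterlist differs by arch. */
struct toy_scatterlist {
	void *page;
	unsigned int offset;
	unsigned int length;
};

/* Only the two queue limits that the sizing expression reads. */
struct toy_queue {
	unsigned short max_hw_segments;
	unsigned short max_phys_segments;
};

/* Bytes needed for the sg list: entry size times the smaller limit. */
static size_t toy_sglist_bytes(const struct toy_queue *q)
{
	unsigned short n;

	assert(q);
	n = q->max_hw_segments < q->max_phys_segments ?
		q->max_hw_segments : q->max_phys_segments;
	return sizeof(struct toy_scatterlist) * n;
}
```

A queue limited to 32 hardware segments therefore needs only 32 entries regardless of how large max_phys_segments is, which is why a small sg_tablesize keeps the allocation small.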


 Is block/scsi_ioctl.c converted to sg chaining in any tree yet? Is that
 in your tree or one of Jens' branches?

block/scsi_ioctl.c uses the standard block layer functions, so there is
nothing to convert in it. sglist doesn't change the standard block
layer functions much, since it doesn't allocate the sg list. It changes
only blk_rq_map_sg.

For now, only scsi-ml has been changed to allocate a chained sg list
properly. Others like cciss are not converted yet, I think. It might
make sense to have the standard block layer functions allocate
chained sg lists properly. Then we could convert potential consumers
(scsi-ml, sg, cciss, etc.) to use them, though I'm not sure how many
non-scsi-ml users need chained sg lists.


[PATCH] drivers/scsi/ips.c: fix scsi_add_host warning

2007-08-07 Thread Eugene Teo
This patch fixes the following warning:

drivers/scsi/ips.c: In function 'ips_register_scsi':
drivers/scsi/ips.c:6867: warning: ignoring return value of
'scsi_add_host', declared with attribute warn_unused_result

Signed-off-by: Eugene Teo [EMAIL PROTECTED]
---
 drivers/scsi/ips.c |   16 ++++++++++++----
 1 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/drivers/scsi/ips.c b/drivers/scsi/ips.c
index 492a51b..b04c42f 100644
--- a/drivers/scsi/ips.c
+++ b/drivers/scsi/ips.c
@@ -6824,13 +6824,14 @@ ips_order_controllers(void)
 static int
 ips_register_scsi(int index)
 {
+	int rc = -1;
 	struct Scsi_Host *sh;
 	ips_ha_t *ha, *oldha = ips_ha[index];
 	sh = scsi_host_alloc(&ips_driver_template, sizeof (ips_ha_t));
 	if (!sh) {
 		IPS_PRINTK(KERN_WARNING, oldha->pcidev,
 			   "Unable to register controller with SCSI subsystem\n");
-		return -1;
+		return rc;
 	}
 	ha = IPS_HA(sh);
 	memcpy(ha, oldha, sizeof (ips_ha_t));
@@ -6839,8 +6840,7 @@ ips_register_scsi(int index)
 	if (request_irq(ha->irq, do_ipsintr, IRQF_SHARED, ips_name, ha)) {
 		IPS_PRINTK(KERN_WARNING, ha->pcidev,
 			   "Unable to install interrupt handler\n");
-		scsi_host_put(sh);
-		return -1;
+		goto err_put_sh;
 	}
 
 	kfree(oldha);
@@ -6864,10 +6864,18 @@ ips_register_scsi(int index)
 	sh->max_channel = ha->nbus - 1;
 	sh->can_queue = ha->max_cmds - 1;
 
-	scsi_add_host(sh, NULL);
+	rc = scsi_add_host(sh, NULL);
+	if (rc)
+		goto err_free_irq;
 	scsi_scan_host(sh);
 
 	return 0;
+
+err_free_irq:
+	free_irq(ha->irq, ha);
+err_put_sh:
+	scsi_host_put(sh);
+	return rc;
 }
 
 /*---*/
