from:"Andrew Morton"

Re: simplify procfs code for seq_file instances

2018-04-24 Thread Andrew Morton

On Tue, 24 Apr 2018 16:23:04 +0200 Christoph Hellwig  wrote:

> On Thu, Apr 19, 2018 at 09:57:50PM +0300, Alexey Dobriyan wrote:
> > > git://git.infradead.org/users/hch/misc.git proc_create
> > 
> > 
> > I want to ask if it is time to start using poorman function overloading
> > with _b_c_e(). There are millions of allocation functions for example,
> > all slightly difference, and people will add more. Seeing /proc interfaces
> > doubled like this is painful.
> 
> Function overloading is totally unacceptable.
> 
> And I very much disagree with a tradeoff that keeps 5000 lines of 
> code vs a few new helpers.

OK, the curiosity and suspense are killing me.  What the heck is
"function overloading with _b_c_e()"?

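For readers with the same question: assuming `_b_c_e()` abbreviates GCC's `__builtin_choose_expr()`, the idea is to pick one of several functions at compile time from the static type of an argument. The sketch below is only an illustration of that technique; `describe()`, `describe_int()` and `describe_str()` are hypothetical names, not kernel or procfs interfaces.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helpers, one per argument type (not kernel APIs). */
static int describe_int(int v, char *out, size_t n)
{
	return snprintf(out, n, "int:%d", v);
}

static int describe_str(const char *s, char *out, size_t n)
{
	return snprintf(out, n, "str:%s", s);
}

/*
 * "Poor man's overloading": __builtin_types_compatible_p() is a
 * compile-time constant, so __builtin_choose_expr() selects exactly
 * one callee and the other one is never called.
 */
#define describe(x, out, n)						\
	__builtin_choose_expr(						\
		__builtin_types_compatible_p(__typeof__(x), int),	\
		describe_int, describe_str)((x), (out), (n))
```

With this in place `describe(42, buf, sizeof(buf))` resolves to `describe_int()` and `describe("proc", buf, sizeof(buf))` to `describe_str()`. Both builtins are GCC extensions (clang accepts them too), which is part of why the approach is contentious.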
Re: [PATCH 1/2] wd719x: Remove last declaration using DEFINE_PCI_DEVICE_TABLE

2016-09-02 Thread Andrew Morton

On Fri, 02 Sep 2016 06:36:05 -0400 "Martin K. Petersen" 
 wrote:

> > "Joe" == Joe Perches  writes:
> 
> Joe> Convert it to the preferred const struct pci_device_id instead.
> 
> Applied to 4.9/scsi-queue.

That creates an ordering dependency between the scsi tree and -mm's
"treewide: remove references to the now unnecessary
DEFINE_PCI_DEVICE_TABLE".

So an ack would be preferred, please.
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Dirty/Writeback fields in /proc/meminfo affected by 20d74bf29c

2016-08-04 Thread Andrew Morton

On Mon, 1 Aug 2016 04:36:28 +0200 Tomas Vondra  wrote:

> Hi,
> 
> While investigating a strange OOM issue on the 3.18.x branch (which 
> turned out to be already fixed by 52c84a95), I've noticed a strange 
> difference in Dirty/Writeback fields in /proc/meminfo depending on 
> kernel version. I'm wondering whether this is expected ...
> 
> I've bisected the change to 20d74bf29c, added in 3.18.22 (upstream 
> commit 4f258a46):
> 
>  sd: Fix maximum I/O size for BLOCK_PC requests
> 
> With /etc/sysctl.conf containing
> 
>  vm.dirty_background_bytes = 67108864
>  vm.dirty_bytes = 1073741824
> 
> a simple "dd" example writing 10GB file
> 
>  dd if=/dev/zero of=ssd.test.file bs=1M count=10240
> 
> results in about this on 3.18.21:
> 
>  Dirty:740856 kB
>  Writeback: 12400 kB
> 
> but on 3.18.22:
> 
>  Dirty: 49244 kB
>  Writeback:656396 kB
> 
> I.e. it seems to revert the relationship. I haven't identified any 
> performance impact, and apparently for random writes the behavior did 
> not change at all (or at least I haven't managed to reproduce it).
> 
> But it's unclear to me why setting a maximum I/O size should affect 
> this, and perhaps it has impact that I don't see.

So what appears to be happening here is that background writeback is
cutting in earlier - the amount of pending writeback ("Dirty") is
reduced while the amount of active writeback ("Writeback") is
correspondingly increased.

4f258a46 had the effect of permitting larger requests into the request
queue.  It's unclear to me why larger requests would cause background
writeback to cut in earlier - the writeback code doesn't even care
about individual request sizes, it only cares about aggregate pagecache
state.

Less Dirty and more Writeback isn't necessarily a bad thing at all, but
I don't like mysteries.  cc linux-mm to see if anyone else can
spot-the-difference.
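
For anyone wanting to watch the two counters while repeating the dd test, a minimal sketch follows. It assumes the usual "Name:   value kB" layout of /proc/meminfo; `meminfo_field_kb()` is a hypothetical helper, and the caller is expected to have read the file into a NUL-terminated buffer.

```c
#include <stdio.h>
#include <string.h>

/*
 * Look up one "Name:   <value> kB" field in a /proc/meminfo-style
 * buffer.  Returns the value in kB, or -1 if the field is absent.
 * Requiring ':' right after the name keeps "Writeback" from matching
 * "WritebackTmp".
 */
static long meminfo_field_kb(const char *text, const char *name)
{
	size_t len = strlen(name);
	const char *line = text;

	while (line && *line) {
		if (strncmp(line, name, len) == 0 && line[len] == ':') {
			long kb;

			if (sscanf(line + len + 1, "%ld", &kb) == 1)
				return kb;
			return -1;
		}
		line = strchr(line, '\n');
		if (line)
			line++;		/* step past the newline */
	}
	return -1;
}
```

Sampling "Dirty" and "Writeback" once a second on 3.18.21 and 3.18.22 should make the reversal described above easy to plot.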

Re: [PATCH v2] byteswap: try to avoid __builtin_constant_p gcc bug

2016-05-02 Thread Andrew Morton

On Tue, 03 May 2016 01:10:16 +0200 Arnd Bergmann <a...@arndb.de> wrote:

> On Monday 02 May 2016 16:02:18 Andrew Morton wrote:
> > On Mon, 02 May 2016 23:48:19 +0200 Arnd Bergmann <a...@arndb.de> wrote:
> > 
> > > This is another attempt to avoid a regression in wwn_to_u64() after
> > > that started using get_unaligned_be64(), which in turn ran into a
> > > bug on gcc-4.9 through 6.1.
> > 
> > I'm still getting a couple screenfuls of things like
> > 
> > net/tipc/name_distr.c: In function 'tipc_named_process_backlog':
> > net/tipc/name_distr.c:330: warning: format '%u' expects type 'unsigned 
> > int', but argument 3 has type 'unsigned int'
> > net/tipc/name_distr.c:330: warning: format '%u' expects type 'unsigned 
> > int', but argument 4 has type 'unsigned int'
> > net/tipc/name_distr.c:330: warning: format '%u' expects type 'unsigned 
> > int', but argument 5 has type 'unsigned int'
> > net/tipc/name_distr.c:330: warning: format '%u' expects type 'unsigned 
> > int', but argument 7 has type 'unsigned int'
> 
> I've built a few thousand kernels (arm32 with gcc-6.1) with the patch applied,
> but didn't see this one. What target architecture and compiler version 
> produced
> this? Does it go away if you add a (__u32) cast? I don't even know what the
> warning is trying to tell me.

heh, I didn't actually read it.

Hopefully we can write this off as a gcc-4.4.4 glitch. 4.8.4 is OK.

Re: [PATCH v2] byteswap: try to avoid __builtin_constant_p gcc bug

2016-05-02 Thread Andrew Morton

On Mon, 02 May 2016 23:48:19 +0200 Arnd Bergmann  wrote:

> This is another attempt to avoid a regression in wwn_to_u64() after
> that started using get_unaligned_be64(), which in turn ran into a
> bug on gcc-4.9 through 6.1.

I'm still getting a couple screenfuls of things like

net/tipc/name_distr.c: In function 'tipc_named_process_backlog':
net/tipc/name_distr.c:330: warning: format '%u' expects type 'unsigned int', 
but argument 3 has type 'unsigned int'
net/tipc/name_distr.c:330: warning: format '%u' expects type 'unsigned int', 
but argument 4 has type 'unsigned int'
net/tipc/name_distr.c:330: warning: format '%u' expects type 'unsigned int', 
but argument 5 has type 'unsigned int'
net/tipc/name_distr.c:330: warning: format '%u' expects type 'unsigned int', 
but argument 7 has type 'unsigned int'


Re: mm: VM_BUG_ON_PAGE(PageTail(page)) in mbind

2016-01-26 Thread Andrew Morton

On Tue, 26 Jan 2016 22:28:29 +0200 "Kirill A. Shutemov"  
wrote:

> The patch below fixes the issue for me, but this bug makes me wonder how
> many bugs like this we have in kernel... :-/
> 
> Looks like we are too permissive about which VMA is migratable:
> vma_migratable() filters out VMA by VM_IO and VM_PFNMAP.
> I think VM_DONTEXPAND also correlate with VMA which cannot be migrated.
> 
> $ git grep VM_DONTEXPAND drivers | grep -v '\(VM_IO\|VM_PFNMAN\)' | wc -l 
> 33
> 
> Hm.. :-|
> 
> It worth looking on them closely... And I wouldn't be surprised if some
> VMAs without all of these flags are not migratable too.
> 
> Sigh.. Any thoughts?

Sigh indeed.  I think that both VM_DONTEXPAND and VM_DONTDUMP are
pretty good signs that mbind() should not be mucking with this vma.  If
such a policy sometimes results in mbind failing to set a policy then
that's not a huge loss - something runs a bit slower maybe.

I mean, we only really expect mbind() to operate against regular old
anon/pagecache memory, yes?
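
The policy being discussed can be sketched in userspace as below. This is an illustration only: the flag values are written out so the example is self-contained and should be treated as assumptions, and `migratable_current()` / `migratable_strict()` are hypothetical names rather than the kernel's vma_migratable().

```c
#include <stdbool.h>

/* Illustrative values; the real definitions live in the kernel's mm headers. */
#define VM_PFNMAP	0x00000400UL
#define VM_IO		0x00004000UL
#define VM_DONTEXPAND	0x00040000UL
#define VM_DONTDUMP	0x04000000UL

/* The check as described in the thread: only VM_IO and VM_PFNMAP are filtered. */
static bool migratable_current(unsigned long vm_flags)
{
	return !(vm_flags & (VM_IO | VM_PFNMAP));
}

/* Hypothetical stricter policy: also refuse VM_DONTEXPAND and VM_DONTDUMP. */
static bool migratable_strict(unsigned long vm_flags)
{
	return !(vm_flags & (VM_IO | VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP));
}
```

A mapping flagged only VM_DONTEXPAND | VM_DONTDUMP passes the first check and fails the second, which is the case that makes mbind() touch it today.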

Re: mm: VM_BUG_ON_PAGE(PageTail(page)) in mbind

2016-01-26 Thread Andrew Morton

On Tue, 26 Jan 2016 22:28:29 +0200 "Kirill A. Shutemov"  
wrote:

> Let's mark the VMA as VM_IO to indicate to mm core that the VMA is
> not migratable.
> 
> ...
>
> --- a/drivers/scsi/sg.c
> +++ b/drivers/scsi/sg.c
> @@ -1261,7 +1261,7 @@ sg_mmap(struct file *filp, struct vm_area_struct *vma)
>   }
>  
>   sfp->mmap_called = 1;
> - vma->vm_flags |= VM_DONTEXPAND | VM_DONTDUMP;
> + vma->vm_flags |= VM_IO | VM_DONTEXPAND | VM_DONTDUMP;
>   vma->vm_private_data = sfp;
>   vma->vm_ops = &sg_mmap_vm_ops;
>   return 0;

I'll put cc:stable on this - I don't think we recently did anything to make
this happen?

Re: [PATCH] scsi: debug: fix type mismatch warning for sg_pcopy_from_buffer

2015-05-20 Thread Andrew Morton

On Tue, 19 May 2015 23:22:39 +0200 Arnd Bergmann <a...@arndb.de> wrote:

> The recent change to mark the input argument of sg_pcopy_from_buffer
> had the unfortunate side-effect to cause a new warning in the
> scsi_debug code:
> 
> drivers/scsi/scsi_debug.c: In function 'do_device_access':
> drivers/scsi/scsi_debug.c:2376:8: warning: assignment from incompatible 
> pointer type [-Wincompatible-pointer-types]
>    func = sg_pcopy_from_buffer;
> 
> This patch attempts to avoid that warning without adding
> evil type casts, but unfortunately makes the do_device_access
> function a lot uglier in the process.
> 
> Signed-off-by: Arnd Bergmann <a...@arndb.de>
> Fixes: 5250326459 ("lib/scatterlist: mark input buffer parameters as 'const'")
> ---
> 
> I can't decide if this is actually a good idea, or if we should rather drop
> the sg_pcopy_from_buffer() patch. Maybe someone else sees a better solution.

Could make do_device_access() call sg_copy_buffer() directly.

But yes, dropping the sg_pcopy_from/to_buffer changes is reasonable.
sg_copy_buffer() is bidirectional and that won't be changing, so
putting constified wrappers around it is kinda fake.
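
To spell out the "kinda fake" point, here is a userspace sketch under stated assumptions: `copy_buffer()` and `copy_from_buffer()` are hypothetical stand-ins, not the scatterlist API. The const in the wrapper's signature holds only because the wrapper happens to pass the read-only direction; the underlying helper still takes a writable pointer and the wrapper has to cast const away to call it.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical bidirectional helper: buf is written when to_buffer is
 * true and only read otherwise, so it cannot be declared const.
 */
static size_t copy_buffer(char *store, size_t store_len,
			  void *buf, size_t buflen, bool to_buffer)
{
	size_t n = buflen < store_len ? buflen : store_len;

	if (to_buffer)
		memcpy(buf, store, n);
	else
		memcpy(store, buf, n);
	return n;
}

/*
 * "Constified" wrapper: the promise is kept only by passing
 * to_buffer == false, and the cast discards the qualifier anyway.
 */
static size_t copy_from_buffer(char *store, size_t store_len,
			       const void *buf, size_t buflen)
{
	return copy_buffer(store, store_len, (void *)buf, buflen, false);
}
```

A caller that picks the direction at run time, as do_device_access() does, gains nothing from the wrapper's const and can call the bidirectional helper directly.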


Re: [PATCH] scatterlist: enable sg chaining for all architectures

2015-04-28 Thread Andrew Morton

On Sat, 25 Apr 2015 23:56:16 +0900 Akinobu Mita <akinobu.m...@gmail.com> wrote:

> Some architectures enable sg chaining option while others do not.
> 
> The requirement to enable sg chaining is that pages must be aligned
> at a 32-bit boundary in order to overload the LSB of the pointer.
> Regardless of whether ARCH_HAS_SG_CHAIN is defined or not, the above
> requirement is always checked by BUG_ON() in sg_assign_page.  So
> all architectures can enable sg chaining.
> 
> As you can see from the changes in drivers/target/target_core_rd.c,
> enabling SG chaining for all architectures allows us to allocate
> discontiguous scatterlist tables which can be traversed throughout
> by sg_next() without a special handling for some architectures.

Thanks, I'll grab this.  If anyone has concerns, speak now or hold both
pieces!


Re: [patch 069/104] lib/string_helpers.c:string_get_size(): remove redundant prefixes

2015-02-13 Thread Andrew Morton

On Fri, 13 Feb 2015 16:05:54 -0800 James Bottomley 
<james.bottom...@hansenpartnership.com> wrote:

> @@ -42,31 +44,60 @@ void string_get_size(u64 size, const enum string_size_units units,
>  		[STRING_UNITS_2] = 1024,
>  	};
>  	int i, j;
> -	u32 remainder = 0, sf_cap;
> +	u32 remainder = 0, sf_cap, exp;
>  	char tmp[8];
> +	const char *unit;
>  
>  	tmp[0] = '\0';
>  	i = 0;
> +	if (!size)
> +	  goto out;

whitespace wart.

> +	if (blk_size >= divisor[units]) {
> +		while (blk_size >= divisor[units]) {
> +			remainder = do_div(blk_size, divisor[units]);
> +			i++;
> +		}
> +	}

The `if' doesn't do anything.

> +	exp = divisor[units];
> +	do_div(exp, blk_size);
> +	if (size >= exp) {
> +		remainder = do_div(size, divisor[units]);
> +		remainder *= blk_size;
> +		i++;
> +	} else {
> +		remainder *= size;
> +	}
> +	size *= blk_size;
> +	size += (remainder/divisor[units]);
> +	remainder %= divisor[units];
> +
>  	if (size >= divisor[units]) {
>  		while (size >= divisor[units]) {
>  			remainder = do_div(size, divisor[units]);
>  			i++;
>  		}
> +	}

Here too.
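
The point about the redundant `if' can be checked with a small userspace sketch. The function names are hypothetical and plain `/` and `%` stand in for do_div(); the divisor is assumed to be greater than 1, as it is for the 1000 and 1024 used by string_get_size().

```c
#include <stdint.h>

/* Shape of the quoted hunk: a while loop guarded by the same test. */
static unsigned int scale_with_if(uint64_t *v, uint32_t divisor, uint32_t *rem)
{
	unsigned int steps = 0;

	if (*v >= divisor) {
		while (*v >= divisor) {
			*rem = (uint32_t)(*v % divisor);
			*v /= divisor;
			steps++;
		}
	}
	return steps;
}

/* Same loop without the guard: the while already tests on entry. */
static unsigned int scale_loop_only(uint64_t *v, uint32_t divisor, uint32_t *rem)
{
	unsigned int steps = 0;

	while (*v >= divisor) {
		*rem = (uint32_t)(*v % divisor);
		*v /= divisor;
		steps++;
	}
	return steps;
}
```

Both variants return the same step count and leave the same value and remainder for every input, including values below the divisor, where neither loop body runs.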



Re: [patch 069/104] lib/string_helpers.c:string_get_size(): remove redundant prefixes

2015-02-12 Thread Andrew Morton

On Thu, 12 Feb 2015 15:25:08 -0800 James Bottomley 
<james.bottom...@hansenpartnership.com> wrote:

> On Thu, 2015-02-12 at 15:01 -0800, a...@linux-foundation.org wrote:
> > From: Rasmus Villemoes <li...@rasmusvillemoes.dk>
> > Subject: lib/string_helpers.c:string_get_size(): remove redundant prefixes
> > 
> > While 3c9f3681d0b4 "[SCSI] lib: add generic helper to print sizes rounded
> > to the correct SI range" says that Z and Y are included in preparation for
> > 128 bit computers, they just waste .text currently.  If and when we get
> > u128, string_get_size needs updating anyway (and ISO needs to come up with
> > four more prefixes).
> 
> This is rubbish. It's nothing to do with 128 bits.  This is to do with
> disk sizes linux gets attached to.  The current largest device clusters
> are Petabytes ... I think we may have some exabyte ones somewhere in the
> Academic community, so it's by no means inconceivable we'll have
> Zettabyte ones within a few years.  The SCSI standard, with 4k blocks
> supports up to 2^76, which is well into Zettabytes.  We obviously run
> off the mmap possibilities a lot sooner, because of the byte offsets,
> but that's fixable.  Someone will probably start first by passing blocks
> into that interface not bytes, so we'd like it not to be based on
> assumptions that think 2^64 is the largest possible value.

I don't get it.  As the man says, this is presently dead code and
string_get_size() will need to be changed to work for disks larger than
2^64 bytes.  That change may be to take a u128 or it may be as you
suggest: replace the `u64 size' with `u64 size, u64 units' which is
effectively the same thing.

> > Also there's no need to include and test for the NULL sentinel; once we
> > reach E size is at most 18.  [The test is also wrong; it should be
> > units_str[units][i+1]; if we've reached NULL we're already doomed.]
> 
> So fix the bug, don't set us up to run off the end of the array.  And
> please consult the community which keeps track of this rather than
> trying to get it into Linux without review.

That seems a bit harsh - you've been cc'ed on this every step of the way.

Re: [patch 069/104] lib/string_helpers.c:string_get_size(): remove redundant prefixes

2015-02-12 Thread Andrew Morton

On Thu, 12 Feb 2015 15:45:29 -0800 James Bottomley 
<james.bottom...@hansenpartnership.com> wrote:

> ...
> 
> > I don't get it.  As the man says, this is presently dead code and
> > string_get_size() will need to be changed to work for disks larger than
> > 2^64 bytes.  That change may be to take a u128 or it may be as you
> > suggest: replace the `u64 size' with `u64 size, u64 units' which is
> > effectively the same thing.
> 
> The first thing someone's going to do is pass in blocks, because that's
> the way the rest of block functions. If we're lucky they add ZB too,
> but if not we run off the end in some obscure large cluster somewhere.
> Don't set people up to make mistakes.

Well maybe.  A little bit.  But it assumes that someone is going to
make a change then not test it.

> > > > Also there's no need to include and test for the NULL sentinel; once we
> > > > reach E size is at most 18.  [The test is also wrong; it should be
> > > > units_str[units][i+1]; if we've reached NULL we're already doomed.]
> > > 
> > > So fix the bug, don't set us up to run off the end of the array.  And
> > > please consult the community which keeps track of this rather than
> > > trying to get it into Linux without review.
> > 
> > That seems a bit harsh - you've been cc'ed on this every step of the way.
> 
> I think you need to check your scripts.  This is the first time I've
> seen this patch, which is why I'm reacting this way.

No, james.bottom...@hansenpartnership.com was cc'ed on the original
email and on the -mm spam.

Perhaps Rasmus should also have cc'ed linux-scsi - practice
seems to vary a lot.  But he did cc the scsi maintainer and the author
of the patch he was modifying (yourself).


So I think the patch is reasonable and the way Rasmus and I handled it
is also reasonable.  Going nuts at us over it isn't reasonable!

Re: [Lsf-pc] [LSF/MM TOPIC] really large storage sectors - going beyond 4096 bytes

2014-01-22 Thread Andrew Morton

On Wed, 22 Jan 2014 11:30:19 -0800 James Bottomley 
<james.bottom...@hansenpartnership.com> wrote:

> But this, I think, is the fundamental point for debate.  If we can pull
> alignment and other tricks to solve 99% of the problem is there a need
> for radical VM surgery?  Is there anything coming down the pipe in the
> future that may move the devices ahead of the tricks?

I expect it would be relatively simple to get large blocksizes working
on powerpc with 64k PAGE_SIZE.  So before diving in and doing huge
amounts of work, perhaps someone can do a proof-of-concept on powerpc
(or ia64) with 64k blocksize.

That way we'll at least have an understanding of what the potential
gains will be.  If the answer is 1.5% then poof - go off and do
something else.

(And the gains on powerpc would be an upper bound - unlike powerpc, x86
still has to fiddle around with 16x as many pages and perhaps order-4
allocations(?))


Re: [PATCH 1/1] remove cpqarray from mainline kernel

2013-10-17 Thread Andrew Morton

On Thu, 17 Oct 2013 12:52:26 -0500 Mike Miller <mike.mil...@hp.com> wrote:

> cpqarray hasn't been used in over 12 years. It's doubtful that anyone still
> uses the board. It's time the driver was removed from the mainline kernel.
> The only updates these days are minor and mostly done by people outside of HP.

It's amazing the weird stuff people get up to.  Perhaps we should disable
it in config for a cycle or two, see if that flushes anyone out?

Re: [PATCH] block: Fix possible sleep in invalid context

2013-07-01 Thread Andrew Morton

On Mon,  1 Jul 2013 20:58:35 +0530 Sujit Reddy Thumma <sthu...@codeaurora.org> 
wrote:

> When block runtime PM is enabled following warning is seen
> while resuming the device.
> 
> BUG: sleeping function called from invalid context at
> .../drivers/base/power/runtime.c:923
> in_atomic(): 1, irqs_disabled(): 128, pid: 12, name: kworker/0:1
> [c0014448] (unwind_backtrace+0x0/0x120) from
> [c03120e4] (__pm_runtime_suspend+0x34/0xa0) from
> [c021c33c] (blk_post_runtime_resume+0x4c/0x5c) from
> [c03297cc] (scsi_runtime_resume+0x90/0xb4) from
> [c0310940] (__rpm_callback+0x30/0x58) from
> [c0310980] (rpm_callback+0x18/0x28) from
> [c0311ab0] (rpm_resume+0x3dc/0x540) from
> [c03120a4] (pm_runtime_work+0x8c/0x98) from
> [c007767c] (process_one_work+0x238/0x3e4) from
> [c0077b90] (worker_thread+0x1ac/0x2ac) from
> [c007cfdc] (kthread+0x88/0x94) from
> [c000ece0] (kernel_thread_exit+0x0/0x8)
> 
> Fix this by releasing spin_lock_irq() before calling
> pm_runtime_autosuspend() in blk_post_runtime_resume().
> 
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -3159,16 +3159,18 @@ EXPORT_SYMBOL(blk_pre_runtime_resume);
>   */
>  void blk_post_runtime_resume(struct request_queue *q, int err)
>  {
> -	spin_lock_irq(q->queue_lock);
>  	if (!err) {
> +		spin_lock_irq(q->queue_lock);
>  		q->rpm_status = RPM_ACTIVE;
>  		__blk_run_queue(q);
>  		pm_runtime_mark_last_busy(q->dev);
> +		spin_unlock_irq(q->queue_lock);
>  		pm_runtime_autosuspend(q->dev);
>  	} else {
> +		spin_lock_irq(q->queue_lock);
>  		q->rpm_status = RPM_SUSPENDED;
> +		spin_unlock_irq(q->queue_lock);
>  	}
> -	spin_unlock_irq(q->queue_lock);
>  }
>  EXPORT_SYMBOL(blk_post_runtime_resume);
>  #endif

I suppose we can do this cleanly enough:

--- a/block/blk-core.c~block-fix-possible-sleep-in-invalid-context-fix
+++ a/block/blk-core.c
@@ -3159,15 +3159,14 @@ EXPORT_SYMBOL(blk_pre_runtime_resume);
  */
 void blk_post_runtime_resume(struct request_queue *q, int err)
 {
+	spin_lock_irq(q->queue_lock);
 	if (!err) {
-		spin_lock_irq(q->queue_lock);
 		q->rpm_status = RPM_ACTIVE;
 		__blk_run_queue(q);
 		pm_runtime_mark_last_busy(q->dev);
 		spin_unlock_irq(q->queue_lock);
 		pm_request_autosuspend(q->dev);
 	} else {
-		spin_lock_irq(q->queue_lock);
 		q->rpm_status = RPM_SUSPENDED;
 		spin_unlock_irq(q->queue_lock);
 	}
_


I wonder if we actually need locking around that second write to
q->rpm_status.


Re: [PATCH] block: Fix possible sleep in invalid context

2013-07-01 Thread Andrew Morton

On Mon, 01 Jul 2013 15:24:11 -0700 James Bottomley 
<james.bottom...@hansenpartnership.com> wrote:

> > --- a/block/blk-core.c~block-fix-possible-sleep-in-invalid-context-fix
> > +++ a/block/blk-core.c
> > @@ -3159,15 +3159,14 @@ EXPORT_SYMBOL(blk_pre_runtime_resume);
> >   */
> >  void blk_post_runtime_resume(struct request_queue *q, int err)
> >  {
> > +	spin_lock_irq(q->queue_lock);
> >  	if (!err) {
> > -		spin_lock_irq(q->queue_lock);
> >  		q->rpm_status = RPM_ACTIVE;
> >  		__blk_run_queue(q);
> >  		pm_runtime_mark_last_busy(q->dev);
> >  		spin_unlock_irq(q->queue_lock);
> >  		pm_request_autosuspend(q->dev);
> >  	} else {
> > -		spin_lock_irq(q->queue_lock);
> >  		q->rpm_status = RPM_SUSPENDED;
> >  		spin_unlock_irq(q->queue_lock);
> >  	}
> > _
> > 
> > 
> > I wonder if we actually need locking around that second write to
> > q->rpm_status.
> 
> Shouldn't: it's an int, which makes it a 32 bit quantity we believe to
> have atomic write properties on every platform.

Yes, but.  If there's some other code path which does:

	spin_lock(queue_lock);
	x = q->rpm_status;
	...
	y = q->rpm_status;
	...
	assumes x == y
	spin_unlock(queue_lock);

then it blows up if we make the suggested change.  Stranger things have
happened...
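
The hazard can be shown with a single-threaded simulation. Everything here is a sketch under loud assumptions: `struct fake_queue` and its functions are hypothetical, a boolean stands in for queue_lock, and a callback models "another CPU" running in the middle of the reader's critical section. Real code would use real spinlocks and real concurrency.

```c
#include <stdbool.h>

enum { RPM_ACTIVE, RPM_SUSPENDED };

struct fake_queue {
	bool lock_held;		/* stands in for queue_lock */
	int rpm_status;
};

/* Writer that skips the lock, as in the suggested change. */
static void racer_unlocked(struct fake_queue *q)
{
	q->rpm_status = RPM_SUSPENDED;
}

/* Writer that honours the lock: it cannot get in while it is held. */
static void racer_locked(struct fake_queue *q)
{
	if (q->lock_held)
		return;		/* a real spinlock would spin here */
	q->lock_held = true;
	q->rpm_status = RPM_SUSPENDED;
	q->lock_held = false;
}

/*
 * Reader that samples the status twice under the lock and relies on
 * both reads agreeing.  Returns true if they did.
 */
static bool reader_sees_stable_status(struct fake_queue *q,
				      void (*racer)(struct fake_queue *))
{
	int x, y;

	q->lock_held = true;
	x = q->rpm_status;
	racer(q);		/* the other CPU runs here */
	y = q->rpm_status;
	q->lock_held = false;
	return x == y;
}
```

With the locked writer the reader's assumption holds; with the unlocked writer the two reads disagree even though each individual store is atomic.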


Re: [PATCH 1/1] cciss: add cciss_allow_hpsa module parameter

2013-04-22 Thread Andrew Morton

On Thu, 18 Apr 2013 13:49:37 -0500 Mike Miller <mike.mil...@hp.com> wrote:

> Add the cciss_allow_hpsa modules parameter. This allows users to use the hpsa
> driver instead of cciss for older controllers.
> Tested with 3.9.0-rc7 in combination with the bug fix submitted Tuesday. My
> apologies for not testing that patch with the correct kernel.

Could you please resend Tuesday's bug fix, with a much better
explanation than v1 had?  It's totally weird and wonderful that there's
an interaction between kdump and one scsi/block driver so let's try to
get a diagnosis into the record.


Re: [Patch 1/1] cciss: bug fix, prevent cciss from loading in kdump kernel

2013-04-17 Thread Andrew Morton

On Mon, 15 Apr 2013 12:59:06 -0500 Mike Miller <mike.mil...@hp.com> wrote:

> Patch 1/1
> 
> If hpsa is selected as the Smart Array driver cciss may try to load in the
> kdump kernel. When this happens kdump fails and a core file cannot be created.
> This patch prevents cciss from trying to load in this scenario. This affects
> primarily older Smart Array controllers.
> 
> ...
>
> --- a/drivers/block/cciss.c
> +++ b/drivers/block/cciss.c
> @@ -4960,6 +4960,12 @@ static int cciss_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
>  	ctlr_info_t *h;
>  	unsigned long flags;
>  
> +	/*
> +	 * if this is the kdump kernel and the user has set the flags to
> +	 * use hpsa rather than cciss just bail
> +	 */
> +	if ((reset_devices) && (cciss_allow_hpsa == 1))
> +		return -ENODEV;

OK, wazzup.  That's the only occurrence of the symbol
cciss_allow_hpsa in Linux and needless to say, the compiler laughed
at me.


Re: [Patch 1/1] cciss: bug fix, prevent cciss from loading in kdump kernel

2013-04-16 Thread Andrew Morton

On Mon, 15 Apr 2013 12:59:06 -0500 Mike Miller <mike.mil...@hp.com> wrote:

> Patch 1/1
> 
> If hpsa is selected as the Smart Array driver cciss may try to load in the
> kdump kernel. When this happens kdump fails and a core file cannot be created.
> This patch prevents cciss from trying to load in this scenario. This affects
> primarily older Smart Array controllers.
> 

OK, this is weird.  kdump and scsi drivers are pretty darn remote things
and I've never heard of such an interaction.  Can you tell us a bit more
about how and why this happened?  Is there something special about
cciss, or can we expect similar kdump interactions with other device drivers?



Re: [PATCH -mmotm] scsi: fix the wrong position of the comment

2013-03-10 Thread Andrew Morton

On Sun, 10 Mar 2013 08:22:47 +0000 James Bottomley <jbottom...@parallels.com> 
wrote:

> [missing SCSI cc added]
> On Sun, 2013-03-10 at 17:09 +0900, Akinobu Mita wrote:
> > This fixes the wrong position of the comment introduced by
> > scsi-rename-random32-to-prandom_u32.patch in the -mm tree.
> > 
> > Signed-off-by: Akinobu Mita <akinobu.m...@gmail.com>
> > Cc: James E.J. Bottomley <jbottom...@parallels.com>
> > Cc: Andrew Vasquez <andrew.vasq...@qlogic.com>
> > ---
> >  drivers/scsi/qla2xxx/qla_attr.c | 6 +++---
> >  1 file changed, 3 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/scsi/qla2xxx/qla_attr.c b/drivers/scsi/qla2xxx/qla_attr.c
> > index 04bf7b8..e44d47e 100644
> > --- a/drivers/scsi/qla2xxx/qla_attr.c
> > +++ b/drivers/scsi/qla2xxx/qla_attr.c
> > @@ -1939,13 +1939,13 @@ qla24xx_vport_delete(struct fc_vport *fc_vport)
> >  	}
> >  
> >  	/* No pending activities shall be there on the vha now */
> > -	if (ql2xextended_error_logging & ql_dbg_user)
> > -		msleep(prandom_u32() % 10);
> > +	if (ql2xextended_error_logging & ql_dbg_user) {
> >  		/*
> >  		 * Just to see if something falls on the net we have placed
> >  		 * below
> >  		 */
> > -
> > +		msleep(prandom_u32() % 10);
> > +	}
> 
> I don't git a toss if it's random or prandom: Andrew: get rid of it; we
> do not sleep in kernel for random intervals whatever the provocation ...
> if this is supposed to be a warning or error condition then print
> something.

That msleep was added by

commit feafb7b1714cf599a6d0fed45801ab3f66046cbd
Author: Arun Easi <arun.e...@qlogic.com>
AuthorDate: Fri Sep 3 14:57:00 2010 -0700
Commit: James Bottomley <james.bottom...@suse.de>
CommitDate: Sun Sep 5 15:13:12 2010 -0300

    [SCSI] qla2xxx: Fix vport delete issues




Re: [PATCH][SCSI] hptiop: Support HighPoint RR4520/RR4522 HBA

2012-10-24 Thread Andrew Morton

On Wed, 24 Oct 2012 11:28:54 +0800
HighPoint Linux Team <li...@highpoint-tech.com> wrote:

> Support HighPoint RR4520/RR4522 HBAs which are based on Marvell Frey.
> 
> Signed-off-by: HighPoint Linux Team <li...@highpoint-tech.com>
> 
>   Documentation/scsi/hptiop.txt |   69 ++-
>   drivers/scsi/hptiop.c |  413 
> --
>   drivers/scsi/hptiop.h |   72 +++
>   3 files changed, 530 insertions(+), 24 deletions(-)

The patch is terribly wordwrapped and has its tabs replaced with
spaces.  I suggest you resend it as a text/plain email attachment,
please.


Re: next-20120925: BUG at drivers/scsi/scsi_lib.c:640!

2012-09-25 Thread Andrew Morton

(cc's added)

On Tue, 25 Sep 2012 22:06:37 +0400
Dmitry Monakhov <dmonak...@openvz.org> wrote:

 
 Seems like barriers are broken again
 
  kernel BUG at drivers/scsi/scsi_lib.c:1180!
  invalid opcode:  [#1] SMP 
  Modules linked in: coretemp kvm_intel kvm crc32c_intel ghash_clmulni_intel 
 microcode sg xhci_hcd button ext3 jbd mbcache sd_mod crc_t10dif\
 elper cryptd lrw aes_x86_64 xts gf128mul ahci libahci pata_acpi ata_generic 
 dm_mirror dm_region_hash dm_log dm_mod
  CPU 0 
  Pid: 753, comm: fsck.ext3 Not tainted 3.6.0-rc7-next-20120925+ #4
   /DQ67SW
  RIP: 0010:[81470dbc]  [81470dbc] 
 scsi_setup_fs_cmnd+0xec/0x180
  RSP: 0018:880233aff9f8  EFLAGS: 00010002
  RAX: 0003 RBX: 88022a741000 RCX: 0002
  RDX:  RSI: 0001 RDI: 81f32b48
  RBP: 880233affa18 R08: 0001 R09: 
  R10: 88022a26c800 R11:  R12: 880229369968
  R13: 0001 R14: 88022a741000 R15: 
  FS:  7f1348632760() GS:88023e20() knlGS:
  CS:  0010 DS:  ES:  CR0: 80050033
  CR2: 003a3dc0e550 CR3: 0002338cf000 CR4: 000407f0
  DR0:  DR1:  DR2: 
  DR3:  DR6: 0ff0 DR7: 0400
  Process fsck.ext3 (pid: 753, threadinfo 880233afe000, task 
 880233f48240)
  Stack:
   880233affa48 880229369968 0001 880229bdb550
   880233affaa8 a00a8860 880233affab8 0082
    8107d696 8802 817410d8
  Call Trace:
   [a00a8860] sd_prep_fn+0x140/0xfe0 [sd_mod]
   [8107d696] ? lock_timer_base+0x76/0xf0
   [817410d8] ? _raw_spin_unlock_irq+0x48/0x80
   [8130023c] blk_peek_request+0x23c/0x450
   [8146fad0] scsi_request_fn+0x70/0x820
   [812f54e5] __blk_run_queue+0x55/0x70
   [8132a065] cfq_rq_enqueued+0x155/0x1c0
   [8132a386] cfq_insert_request+0x2b6/0x2f0
   [8132a11d] ? cfq_insert_request+0x4d/0x2f0
   [812f002f] ? md5_final+0x9f/0x130
   [810e5463] ? __lock_release+0xc3/0xe0
   [812fe074] ? drive_stat_acct+0x334/0x3b0
   [812f4be6] __elv_add_request+0x2a6/0x350
   [813010fb] blk_queue_bio+0x52b/0x570
   [812fd8f5] generic_make_request+0x125/0x1c0
   [812fdb68] submit_bio+0x1d8/0x240
   [81250c63] ? bio_alloc_bioset+0x103/0x1e0
   [813039e7] blkdev_issue_flush+0x177/0x200
   [81253afa] blkdev_fsync+0x4a/0x70
   [81245af6] vfs_fsync_range+0x36/0x60
   [81245b3c] vfs_fsync+0x1c/0x20
   [81245ea8] do_fsync+0x58/0x90
   [81246100] sys_fsync+0x10/0x20
   [8174e539] system_call_fastpath+0x16/0x1b
  Code: 00 48 c7 c7 48 2b f3 81 41 0f 94 c5 31 d2 44 89 ee e8 d9 e4 cd ff 49 
 63 c5 48 83 c0 02 48 83 04 c5 b0 a5 13 82 01 45 85 ed 74 04 0f\
  48 89 df 31 db e8 a3 f6 ff ff 48 85 c0 48 
  RIP  [81470dbc] scsi_setup_fs_cmnd+0xec/0x180
   RSP 880233aff9f8
 
 
  [ cut here ]
  kernel BUG at drivers/scsi/scsi_lib.c:640!
  invalid opcode:  [#1] SMP 
  Modules linked in: coretemp kvm_intel kvm crc32c_intel ghash_clmulni_intel 
 microcode sg xhci_hcd button ext3 jbd mbcache sd_mod crc_t10dif\
 elper cryptd lrw aes_x86_64 xts gf128mul ahci libahci pata_acpi ata_generic 
 dm_mirror dm_region_hash dm_log dm_mod
  CPU 0 
  Pid: 727, comm: fsck.ext3 Not tainted 3.6.0-rc7-next-20120925+ #5
   /DQ67SW
  RIP: 0010:[81470585]  [81470585] 
 scsi_alloc_sgtable+0x55/0xe0
  RSP: 0018:880228215aa8  EFLAGS: 00010002
  RAX: 0003 RBX: 880228111a18 RCX: 0001
  RDX:  RSI: 0001 RDI: 81f32a08
  RBP: 880228215ac8 R08: 0001 R09: 
  R10: 0002 R11:  R12: 
  R13: 0020 R14: 0001 R15: 
  FS:  7fb605f35760() GS:88023e20() knlGS:
  CS:  0010 DS:  ES:  CR0: 80050033
  CR2: 003a3dc0e550 CR3: 000233e83000 CR4: 000407f0
  DR0:  DR1:  DR2: 
  DR3:  DR6: 0ff0 DR7: 0400
  Process fsck.ext3 (pid: 727, threadinfo 880228214000, task 
 880233af8c80)
  Stack:
   880228111a18 88022a0a0638  88022a679000
   880228215b08 81470641 8802281119c0 88022a679000
   880228215b28 8802281119c0 88022a0a0638 0020
  Call Trace:
   [81470641] scsi_init_sgtable+0x31/0xe0
   [81470a2d] scsi_init_io+0x3d/0x2e0
   [81470e23] scsi_setup_fs_cmnd+0x153/0x180
   [a00a8860] sd_prep_fn+0x140/0xfe0 [sd_mod]
   [8135afec]

Re: [PATCH] fcoe: Remove redundant 'less than zero' check

2012-07-09 Thread Andrew Morton

On Thu, 05 Jul 2012 07:52:25 -0700
Robert Love robert.w.l...@intel.com wrote:

 strtoul returns an 'unsigned long' so there is no
 reason to check if the value is less than zero.
 
 strtoul already checks for the '-' character deep
 in its bowels. It will return an error if the user
 has provided a negative value and fcoe_str_to_dev_loss
 will return that error to its caller.

huh, I never knew that.  So if we feed -1 to kstrtoul() it gets treated
as an error?  That seems a bit surprising.  You're sure about that?
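
For reference, a minimal userspace sketch of that behaviour.  It is not the
kernel's kstrtoul(): parse_ulong_strict() is a hypothetical helper built on
libc strtoul(), which by itself accepts "-1" and hands back ULONG_MAX.  As far
as I understand the kstrto*() helpers, the unsigned variants take an optional
leading '+' and one trailing newline but reject '-', and the sketch assumes
the same rules.

```c
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Hypothetical userspace helper (NOT the kernel's kstrtoul): parse an
 * unsigned long and reject input that strtoul() would silently accept,
 * such as a leading '-', leading whitespace or trailing junk.
 * Returns 0 on success, -EINVAL on malformed input, -ERANGE on overflow.
 */
static int parse_ulong_strict(const char *s, unsigned int base,
			      unsigned long *res)
{
	char *end;
	unsigned long val;

	if (*s == '+')
		s++;
	/* strtoul() would skip spaces and negate "-1" into ULONG_MAX. */
	if (!isalnum((unsigned char)*s))
		return -EINVAL;

	errno = 0;
	val = strtoul(s, &end, (int)base);
	if (errno == ERANGE)
		return -ERANGE;
	if (end == s)
		return -EINVAL;
	if (*end == '\n')	/* tolerate a single trailing newline */
		end++;
	if (*end != '\0')
		return -EINVAL;

	*res = val;
	return 0;
}
```

With this, "-1" fails with -EINVAL before any conversion happens, so a
following "less than zero" test on the unsigned result has nothing left to
catch.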

 This patch fixes the following Coverity reported warning:
 
 CID 703581 -  NO_EFFECT Unsigned compared against 0 - This
 less-than-zero comparison of an unsigned value is never true. *val < 0UL.
 drivers/scsi/fcoe/fcoe_sysfs.c:105
 
 Signed-off-by: Robert Love robert.w.l...@intel.com
 ---
  drivers/scsi/fcoe/fcoe_sysfs.c |2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)
 
 diff --git a/drivers/scsi/fcoe/fcoe_sysfs.c b/drivers/scsi/fcoe/fcoe_sysfs.c
 index 2bc1631..5e75168 100644
 --- a/drivers/scsi/fcoe/fcoe_sysfs.c
 +++ b/drivers/scsi/fcoe/fcoe_sysfs.c
 @@ -102,7 +102,7 @@ static int fcoe_str_to_dev_loss(const char *buf, unsigned 
 long *val)
   int ret;
  
   ret = kstrtoul(buf, 0, val);
 - if (ret || *val < 0)
 + if (ret)
   return -EINVAL;
   /*
* Check for overflow; dev_loss_tmo is u32
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: arcmsr areca-1660 - strange behaviour under heavy load

2008-02-26 Thread Andrew Morton

On Tue, 26 Feb 2008 10:35:31 +0100 (CET) Nikola Ciprich [EMAIL PROTECTED] 
wrote:

 Hi
 
 On Sun, 24 Feb 2008, Andrew Morton wrote:
 
 Hi Andrew,
 thanks a lot for reply, I'm attaching requested information.
 please let me know if You need more information/testing, whatever.
 I'll be glad to help.
 BR
 nik
 
  Areca support doesn't seem to be very interested in the problem :-(
 
  (cc's added)
 
  Please get the machine into this state of memory exhaustion then take
  copies of the output of the following, and send them via reply-to-all to
  this email:
 
  - cat /proc/meminfo
 
  - cat /proc/slabinfo
 
  - dmesg -c > /dev/null ; echo m > /proc/sysrq-trigger ; dmesg -c
 
  Thanks.

Alas, that all looks OK to me.

You never get any out-of-memory messages, and no oom-killing messages?

Possibly what is happening here is that in this low-memory condition, some
of the driver's internal memory-allocation attempts are failing, and the
driver isn't correctly handling this.  This is a rare situation which may
well not have been hit in anyone else's testing.

I expect that the Areca engineers will be able to reproduce this with a
suitably small mem= kernel boot option.  If not, they could perhaps
investigate the kernel's fault-injection framework, which permits
simulation of page allocation failures.
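
To illustrate the kind of bug I mean, here is a small userspace sketch.  All
names in it (demo_alloc, demo_fail_nth, struct demo_adapter) are hypothetical
and have nothing to do with the arcmsr source or with the kernel's
fault-injection interface; it only shows the idea of forcing the Nth
allocation to fail and checking that the init path unwinds cleanly.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical fail-Nth injector: the Nth call to demo_alloc() fails. */
static int demo_fail_nth;	/* 0 = never fail */
static int demo_live_allocs;	/* outstanding allocations, for leak checks */

static void *demo_alloc(size_t size)
{
	void *p;

	if (demo_fail_nth > 0 && --demo_fail_nth == 0)
		return NULL;	/* injected failure */
	p = malloc(size);
	if (p)
		demo_live_allocs++;
	return p;
}

static void demo_free(void *p)
{
	if (p) {
		demo_live_allocs--;
		free(p);
	}
}

/* Stand-in for a driver's per-adapter state. */
struct demo_adapter {
	void *ccb_pool;
	void *msg_queue;
};

/* Every allocation is checked; on failure, earlier ones are released. */
static int demo_adapter_init(struct demo_adapter *a)
{
	a->ccb_pool = NULL;
	a->msg_queue = NULL;

	a->ccb_pool = demo_alloc(4096);
	if (!a->ccb_pool)
		return -ENOMEM;

	a->msg_queue = demo_alloc(1024);
	if (!a->msg_queue) {
		demo_free(a->ccb_pool);
		a->ccb_pool = NULL;
		return -ENOMEM;
	}
	return 0;
}

static void demo_adapter_exit(struct demo_adapter *a)
{
	demo_free(a->msg_queue);
	demo_free(a->ccb_pool);
	a->msg_queue = NULL;
	a->ccb_pool = NULL;
}
```

Setting demo_fail_nth to each value in turn and checking that
demo_live_allocs returns to zero is a cheap way to walk every error path,
which is roughly what the in-kernel framework lets you do against the real
allocators.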

Re: arcmsr areca-1660 - strange behaviour under heavy load

2008-02-24 Thread Andrew Morton

On Sat, 23 Feb 2008 12:20:12 +0100 (CET) Nikola Ciprich [EMAIL PROTECTED] 
wrote:

 Hi,
 
 I've found strange problem either in arcmsr driver, or maybe in 
 areca-1660 card...
 When system on SAS discs RAID connected to areca-1660 card 
 gets under heavy I/O load, it gets unusable after some time. I can 100% 
 reproduce 
 this, although it needs quite specific conditions:
 It can be reproduced on 2x quad core machine, RAM has to be limited to 
 ~192MB to cause heavy paging.
 Only thing needed to cause the problem is to start loop doing kernel 
 compilation using make -j 8 - this loads the system heavily, because of 
 lack of memory. After few correct compile runs the system gets into 
 state when all programs including the basic ones (ls, cp, ..) start 
 crashing... dmesg (when it works) doesn't say anything strange...
 After reboot, the system is OK again.
 I have tested it on different motherboards, with different CPUs, RAMs(all 
 were properly tested with memtest), with two different areca cards and 
 different drives. I can't reproduce the problem on same hardware when 
 using different RAID card (ie adaptec). All testing systems were properly 
 cooled..
 I have tried all available areca firmwares, two different distributions 
 (oracle linux, and centos), and kernels ranging from distribution ones, to 
 last GIT snapshot.
 Could somebody please give me some hints on how to hunt this problem?
 Areca support doesn't seem to be very interested in the problem :-(

(cc's added)

Please get the machine into this state of memory exhaustion then take
copies of the output of the following, and send them via reply-to-all to
this email:

- cat /proc/meminfo

- cat /proc/slabinfo

- dmesg -c > /dev/null ; echo m > /proc/sysrq-trigger ; dmesg -c

Thanks.

Re: [GIT PATCH] scsi fixes for 2.6.25-rc2

2008-02-23 Thread Andrew Morton

On Sat, 23 Feb 2008 12:31:02 -0800 (PST) Linus Torvalds [EMAIL PROTECTED] 
wrote:

 
 
 On Sat, 23 Feb 2008, Jeff Garzik wrote:
  
  I know I am probably shooting myself in the foot here, since I am the 
  original
  author of mvsas, but...
  
  Should we be adding new drivers during -rc?
 
 I'm personally of the opinion that a new driver that doesn't add anything 
 but itself (ie no infrastructure changes etc) is fine. I'd rather have a 
 new, rough driver that might work, than no driver at all, and it's not 
 like it can cause a regression if you don't enable it.
 

Yes, I too think that adding new standalone code in late -rc is OK.

Especially drivers, because a new driver is a bugfix for people who own
that hardware!

Re: LSI Logic MegaRAID SATA 150-4 / LSI Logic New Generation RAID Device Drivers (MEGARAID_NEWGEN) problems (megaraid abort: scsi cmd:14600, do now own)

2008-02-22 Thread Andrew Morton

(cc's added)

On Mon, 18 Feb 2008 21:09:22 -0500 David M. Strang [EMAIL PROTECTED] wrote:

 Greetings -
 
 A couple months back I purchased a LSI Logic MegaRAID ATA 150-4 
 controller, as well as 3 Seagate 500GB SATA-II hard drives to use in my 
 system. Previously, I was using a pair of WD4000YR's in software raid, 
 which seemed to work well. I've just not gotten around to working on 
 migrating my data to these new drivers + controller, and it's giving me 
 some issues. As with most, I'm having some severe performance issues, 
 the performance is simply abysmal. Before getting into the details, here 
 is a quick overview of my configuration:
 
 System:
 Tyan Tiger i7320/R (S5350) System Board
 2x Intel Xeon 3.0 GHz
 4GB RAM
 
 LSI Logic MegaRAID ATA 150-4 controller -  Firmware Revision: 713S
 3x Seagate 7200.10 (Perpendicular Recording) ST3500630AS 500GB SATA-II 
 drives configured as a RAID-1 array with a HotSpare.
 
 Also, connected to the onboard controller is a WD4000YR, where all of my 
 data currently resides.
 
 I'm running Gentoo Hardened AMD64 MultiLib 
 (/usr/portage/profiles/hardened/amd64/multilib)
 
 My current kernel revision is 2.6.23-hardened-r7.
 
 Here are some (possibly) relevant snippets from dmesg during startup:
 
 ...
 megaraid cmm: 2.20.2.7 (Release Date: Sun Jul 16 00:01:03 EST 2006)
 megaraid: 2.20.5.1 (Release Date: Thu Nov 16 15:32:35 EST 2006)
 megaraid: probe new device 0x1000:0x1960:0x1000:0x4523: bus 3:slot 3:func 0
 ACPI: PCI Interrupt :03:03.0[A] - GSI 24 (level, low) - IRQ 24
 megaraid: fw version:[713S] bios version:[G121]
 scsi0 : LSI Logic MegaRAID driver
 scsi[0]: scanning scsi channel 0 [Phy 0] for non-raid devices
 scsi[0]: scanning scsi channel 1 [virtual] for logical drives
 scsi 0:1:0:0: Direct-Access MegaRAID LD 0 RAID1  476G 713S PQ: 0 ANSI: 2
 sd 0:1:0:0: [sda] 976762880 512-byte hardware sectors (500103 MB)
 sd 0:1:0:0: [sda] Write Protect is off
 sd 0:1:0:0: [sda] Mode Sense: 00 00 00 00
 sd 0:1:0:0: [sda] Asking for cache data failed
 sd 0:1:0:0: [sda] Assuming drive cache: write through
 sd 0:1:0:0: [sda] 976762880 512-byte hardware sectors (500103 MB)
 sd 0:1:0:0: [sda] Write Protect is off
 sd 0:1:0:0: [sda] Mode Sense: 00 00 00 00
 sd 0:1:0:0: [sda] Asking for cache data failed
 sd 0:1:0:0: [sda] Assuming drive cache: write through
  sda: sda1 sda2 sda3 sda4
 sd 0:1:0:0: [sda] Attached SCSI disk
 ata_piix :00:1f.2: version 2.12
 ata_piix :00:1f.2: MAP [ P0 -- P1 -- ]
 ACPI: PCI Interrupt :00:1f.2[A] - GSI 18 (level, low) - IRQ 18
 PCI: Setting latency timer of device :00:1f.2 to 64
 scsi1 : ata_piix
 scsi2 : ata_piix
 ata1: SATA max UDMA/133 cmd 0x000114a0 ctl 0x0001149a 
 bmdma 0x00011470 irq 18
 ata2: SATA max UDMA/133 cmd 0x00011490 ctl 0x00011486 
 bmdma 0x00011478 irq 18
 ata1.00: ATA-7: WDC WD4000YR-01PLB0, 01.06A01, max UDMA/133
 ata1.00: 781422768 sectors, multi 16: LBA48 NCQ (depth 0/32)
 ata1.00: configured for UDMA/133
 scsi 1:0:0:0: Direct-Access ATA  WDC WD4000YR-01P 01.0 PQ: 0 ANSI: 5
 sd 1:0:0:0: [sdb] 781422768 512-byte hardware sectors (400088 MB)
 sd 1:0:0:0: [sdb] Write Protect is off
 sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
 sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't 
 support DPO or FUA
 sd 1:0:0:0: [sdb] 781422768 512-byte hardware sectors (400088 MB)
 sd 1:0:0:0: [sdb] Write Protect is off
 sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
 sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't 
 support DPO or FUA
  sdb: sdb1 sdb2 sdb3 sdb4
 sd 1:0:0:0: [sdb] Attached SCSI disk
 ...
 
 My controller is configured for Write Back Caching, Adaptive Read Ahead, 
 and Direct I/O (I've also tried cached I/O but it scared me...)
 
 The first thing I'm noticing is the horrible performance on the raid 
 disk, compared to the single standalone hard disk. Here is the output 
 from hdparm -tT on the single disk:
 
 -([EMAIL PROTECTED])-(~)- # hdparm -tT /dev/sdb1
 
 /dev/sdb1:
  Timing cached reads:   1670 MB in  2.00 seconds = 835.00 MB/sec
  Timing buffered disk reads:  140 MB in  3.01 seconds =  46.45 MB/sec
 
 And then, the output from the raid-1 array:
 
 -([EMAIL PROTECTED])-(~)- # hdparm -tT /dev/sda1
 
 /dev/sda1:
  Timing cached reads:   1718 MB in  2.00 seconds = 859.65 MB/sec
  Timing buffered disk reads:   92 MB in  3.09 seconds =  29.76 MB/sec
 
 I'm not sure what the deal is with the buffered disk reads being so much 
 WORSE than a single disk. So poor performance is a concern, but what's 
 more alarming are the messages showing up in DMESG. When I first tried 
 Cached IO - performance seemed good... except, dmesg was littered with 
 these errors (?):
 
 megaraid: aborting-14610 cmd=2a c=1 t=0 l=0
 megaraid abort: scsi cmd:14610, do now own
 megaraid: aborting-14612 cmd=2a c=1 t=0 l=0
 megaraid abort: scsi cmd:14612, do now own
 megaraid: aborting-14614 cmd=2a c=1 t=0 l=0
 megaraid abort: scsi cmd:14614, do

Re: [PATCH 1/1] cciss: procfs updates to display info about many volumes

2008-02-19 Thread Andrew Morton

On Tue, 19 Feb 2008 11:48:18 +0100 Jens Axboe [EMAIL PROTECTED] wrote:

 On Mon, Feb 11 2008, Mike Miller wrote:
  Patch 1 of 1
  
  This patch allows us to display information about all of the logical volumes
  configured on a particular controller without stepping on memory even when there are
  many volumes (128 or more) configured. This patch replaces the one submitted
  on 20071214. See
  http://groups.google.com/group/linux.kernel/browse_thread/thread/49a50244b19f8855/ba3dc95b23391521?hl=en&lnk=gst&q=cciss#ba3dc95b23391521
  which has not been merged. That patch displayed information about only the
  first logical volume on each controller and had negative side effects for 
  some
  installers.
  Please consider this for inclusion.
 
 It looks ok, but has some flaws. Try to disable cciss scsi and tape
 support:
 
 In file included from drivers/block/cciss.c:231:
 drivers/block/cciss_scsi.c:1498:38: error: macro parameters must be
 comma-separated
 drivers/block/cciss.c: In function 'cciss_seq_show_header':
 drivers/block/cciss.c:272: error: implicit declaration of function
 'cciss_seq_tape_report'
 drivers/block/cciss.c: In function 'cciss_proc_write':
 drivers/block/cciss.c:393: error: implicit declaration of function
 'cciss_engage_scsi'
 
 Your macro definition of cciss_seq_tape_report() is totally busted.
 Either write it as a macro OR as a function.
 
 Fix these up and resubmit, then I'll take it.
 

It also needs to be updated to use the non-racy proc_create(),
please, as per Alexey's comments.
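
On the build errors quoted above: "macro parameters must be comma-separated"
is what the preprocessor says when a function-like macro's parameter list
holds something other than plain identifiers, so my guess (the patch itself
isn't quoted here) is that the disabled-config stub was written with a typed
argument list after #define.  The usual shape is a typed static inline no-op.
A sketch, with hypothetical names (struct demo_seq, DEMO_TAPE_SUPPORT)
standing in for the cciss ones:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct seq_file: a bounded text buffer. */
struct demo_seq {
	char buf[128];
	size_t len;
};

static void demo_seq_puts(struct demo_seq *seq, const char *s)
{
	size_t n = strlen(s);

	if (seq->len + n < sizeof(seq->buf)) {
		memcpy(seq->buf + seq->len, s, n + 1);
		seq->len += n;
	}
}

/* Leave DEMO_TAPE_SUPPORT undefined to build without tape support. */
#ifdef DEMO_TAPE_SUPPORT
static void demo_seq_tape_report(struct demo_seq *seq, int ctlr)
{
	(void)ctlr;
	demo_seq_puts(seq, "tape support: enabled\n");
}
#else
/* Typed no-op stub: callers build unchanged, arguments are type-checked. */
static inline void demo_seq_tape_report(struct demo_seq *seq, int ctlr)
{
	(void)seq;
	(void)ctlr;
}
#endif

/* The caller needs no #ifdef of its own. */
static void demo_show_header(struct demo_seq *seq, int ctlr)
{
	demo_seq_puts(seq, "demo controller\n");
	demo_seq_tape_report(seq, ctlr);
}
```

A function stub rather than an empty macro keeps the call sites free of
#ifdefs and still lets the compiler complain about wrong argument types in
both configurations.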

Re: gdth new set of patches for 2.6.24 stable

2008-02-18 Thread Andrew Morton

On Sun, 17 Feb 2008 18:46:03 +0200 Boaz Harrosh [EMAIL PROTECTED] wrote:

 
 ...

 All my testers have reported back that with these 5 patches applied they can
 now run with a 2.6.24 kernel the same way they ran before. However there is
 that reported issue, with the dma_free_coherent WARN_ON (above). The code was 
 like that from day one and it is a very old issue, however it is a regression 
 because 2.6.24 introduced that new WARN_ON.
 (infamous commit aa24886e379d2b641c5117e178b15ce1d5d366ba)
 From posts on lkml and even recent one in linux-scsi about the arcmsr driver
 it looks that all a driver can do is work around it with different kernel 
 mechanisms
 and driver rewrites. I'm afraid I need your help here. I'm not sure I 
 understand
 why does the gdth driver uses the pci_{alloc,free}_consistent() API's, and 
 what
 is needed to replace it. Could you please have a look in gdth_proc.c and also 
 in
 gdth.c for all the places that call gdth_ioctl_alloc/gdth_ioctl_free, and 
 advise
 what can I do in its place. Please bear in mind that we need it for 2.6.24, 
 as
 a bugfix.
 
 Apart from the above issue, please accept patches 3,4,5 above they have now
 been tested and are reported to bring broken system back to production.
 (Given that you approve of course). And mark them for inclusion to the
 2.6.24 stable releases. (Or is there some thing that I should do)
 
 ---
 Meanwhile on x86 systems I understand the WARN_ON is cosmetic, and does not
 pose any harm. Some people have reported stability with temporarily disabling
 it. For testers that want to try, here it is below. At your own risk.
 
 ---
 From 50d3657bf6a138ee63ad1ce00052380edc75ace7 Mon Sep 17 00:00:00 2001
 From: Boaz Harrosh [EMAIL PROTECTED]
 Date: Sun, 17 Feb 2008 12:49:35 +0200
 Subject: [PATCH] gdth: Hack to remove WARN_ON in arch/x86/kernel/pci-dma_32.c
 
   gdth uses dma_free_coherent() with interrupts disabled. Which
   is not portable, but is safe on the HW that supports gdth.
 
 NOT Signed-off-by: Boaz Harrosh [EMAIL PROTECTED]
 ---
  arch/x86/kernel/pci-dma_32.c |2 +-
  1 files changed, 1 insertions(+), 1 deletions(-)
 
 diff --git a/arch/x86/kernel/pci-dma_32.c b/arch/x86/kernel/pci-dma_32.c
 index 5133032..350dcfd 100644
 --- a/arch/x86/kernel/pci-dma_32.c
 +++ b/arch/x86/kernel/pci-dma_32.c
 @@ -63,7 +63,7 @@ void dma_free_coherent(struct device *dev, size_t size,
   struct dma_coherent_mem *mem = dev ? dev->dma_mem : NULL;
   int order = get_order(size);
  
 - WARN_ON(irqs_disabled());   /* for portability */
 +/*   WARN_ON(irqs_disabled());*/ /* for portability */
   if (mem && vaddr >= mem->virt_base && vaddr < (mem->virt_base + 
 (mem->size << PAGE_SHIFT))) {
   int page = (vaddr - mem->virt_base) >> PAGE_SHIFT;
  

Yes.   Let's reprise aa24886e379d2b641c5117e178b15ce1d5d366ba:

: commit aa24886e379d2b641c5117e178b15ce1d5d366ba
: Author: David Brownell [EMAIL PROTECTED]
: Date:   Fri Aug 10 13:10:27 2007 -0700
: 
: dma_free_coherent() needs irqs enabled (sigh)
: 
: On at least ARM (and I'm told MIPS too) dma_free_coherent() has a newish
: call context requirement: unlike its dma_alloc_coherent() sibling, it may
: not be called with IRQs disabled.  (This was new behavior on ARM as of late
: 2005, caused by ARM SMP updates.) This little surprise can be annoyingly
: driver-visible.
: 
: Since it looks like that restriction won't be removed, this patch changes
: the definition of the API to include that requirement.  Also, to help catch
: nonportable drivers, it updates the x86 and swiotlb versions to include the
: relevant warnings.  (I already observed that it trips on the
: bus_reset_tasklet of the new firewire_ohci driver.)
: 

In general, all Linux memory-freeing functions can be called from all
contexts.  (vfree is an irritating exception).  This is good, and provides
maximum usefulness to callers, as all utility functions should seek to do. 
It would be best to fix arm and mips.

But arm and mips require enabled local irqs because their
dma_free_coherent() needs to do a cross-cpu IPI call.  Presumably because
of certain unusual TLB protocols.

I'm not sure what we should do about this.  Presumably the gdth-on-arm
usage base is, umm, zero, so we could lamely add
CONFIG_DMA_FREE_COHERENT_WITH_LOCAL_IRQS_DISABLED_IS_OK and then use that
to disable gdth (and similar) on arm and mips.  But ugh.

Russell, Ralf: is there something we can do here to relax this requirement?

I'm thinking that perhaps we can do some rcu/refcounting tricks: launch the
IPI from within dma_free_coherent(), but don't wait for it to complete. 
When all CPUs have handled the IPI then (and only then) the virtual address
becomes recyclable, or something like that?
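
Very roughly, and only as a single-threaded userspace model of the
bookkeeping (a real version would need atomics or per-cpu state, and every
name here is made up for illustration): the free path records which CPUs
still have to ack and returns at once, and the address is recycled by
whichever ack happens to be the last.

```c
#include <stdbool.h>

#define DEMO_NR_CPUS 4

/* Hypothetical record for one freed mapping still awaiting CPU acks. */
struct demo_deferred_free {
	unsigned long vaddr;		/* address that must not be reused yet */
	unsigned int pending_cpus;	/* bit n set: CPU n has not acked */
	bool recycled;			/* true once vaddr may be handed out */
};

/* Callable from any context: note the request and return without waiting. */
static void demo_free_nowait(struct demo_deferred_free *d, unsigned long vaddr)
{
	d->vaddr = vaddr;
	d->pending_cpus = (1u << DEMO_NR_CPUS) - 1;
	d->recycled = false;
	/* a real implementation would fire the IPI here, asynchronously */
}

/*
 * Called from each CPU's (simulated) IPI handler.  Returns true only for
 * the ack that completes the set, i.e. the point where vaddr is recycled.
 */
static bool demo_ack(struct demo_deferred_free *d, unsigned int cpu)
{
	d->pending_cpus &= ~(1u << cpu);
	if (d->pending_cpus == 0 && !d->recycled) {
		d->recycled = true;
		return true;
	}
	return false;
}
```

The point is just that nothing in demo_free_nowait() blocks, so it would not
care whether local interrupts are disabled; the cost is that the virtual
address stays unavailable for a while after the free.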

<double-checks>

Actually I think David might have been wrong about mips.  afaict its
dma_free_coherent() is callable under local_irq_disable(), so ARM SMP is
the sole exception?  



Re: Aborted commands with arcmsr and 2xWD1500ADFD in RAID1

2008-02-13 Thread Andrew Morton


(cc's added)

On Mon, 11 Feb 2008 17:44:08 +0100 Aron Stansvik [EMAIL PROTECTED] wrote:

 Hello LKML.
 
 Under semi-high disk I/O (e.g. installing a compiled KDE), I get the
 following (accompanied by seconds of lock-ups on the machine):
 
 [ 7727.345183] arcmsr0: abort device command of scsi id = 0 lun = 0
 [ 7730.348776] arcmsr0: scsi id = 0 lun = 0 ccb =
 '0xdfb461c0' poll command abort successfully
 [ 8053.795943] arcmsr0: abort device command of scsi id = 0 lun = 0
 [ 8056.799528] arcmsr0: scsi id = 0 lun = 0 ccb =
 '0xdfb595e0' poll command abort successfully
 [ 8884.592810] arcmsr0: abort device command of scsi id = 0 lun = 0
 [ 8887.596392] arcmsr0: scsi id = 0 lun = 0 ccb =
 '0xdfb56d80' poll command abort successfully
 [ 8917.760216] arcmsr0: abort device command of scsi id = 0 lun = 0
 [ 8920.763797] arcmsr0: scsi id = 0 lun = 0 ccb =
 '0xdfb472c0' poll command abort successfully
 [ 9074.106547] arcmsr0: abort device command of scsi id = 0 lun = 0
 
 This is my setup:
 
 1 x MSI K8N Master2-FAR
 1 x Opteron 252
 1 x Areca ARC1200 (sitting in a PCIe x4 socket)
 2 x WD1500ADFD in RAID1
 
 [EMAIL PROTECTED]:~$ uname -a
 Linux rubik 2.6.24-7-generic #1 SMP Thu Feb 7 01:29:58 UTC 2008 i686 GNU/Linux
 [EMAIL PROTECTED]:~$ modinfo arcmsr
 filename:
 /lib/modules/2.6.24-7-generic/kernel/drivers/scsi/arcmsr/arcmsr.ko
 version:Driver Version 1.20.00.15 2007/08/30
 license:Dual BSD/GPL
 description:ARECA (ARC11xx/12xx/13xx/16xx) SATA/SAS RAID HOST Adapter
 author: Erich Chen [EMAIL PROTECTED]
 srcversion: 28EAD6AB49D4491CA04D465
 [...]
 
 I've read some previous posts here on LKML that it could be the Areca
 firmware who doesn't like my WD disks. Anyone know if this is an IRQ
 handling problem in the kernel, or if it's a problem with the RAID
 controller firmware?
 
 Erich Chen (of Areca); have you tried the new ARC1200 in RAID1
 configuration with Raptor disks on Linux?
 
 As a side note, I can tell you that I first tried running FreeBSD 6.3
 (RELENG_6) on this machine, but got random reboots during disk I/O
 (even with a kernel with KDB debugging turned on). This leads me to
 believe that it might be a firmware issue, and that Linux just handles
 it more gracefully than FreeBSD.
 
 Any ideas or advice is appreciated. This is my first post to the LKML,
 so please instruct me if you want more information or if you want me
 to take further debugging actions.
 
 Best regards,
 Aron Stansvik


Re: [GIT PATCH] SCSI bug fixes for 2.6.25-rc1

2008-02-13 Thread Andrew Morton

On Wed, 13 Feb 2008 18:02:44 -0600
James Bottomley [EMAIL PROTECTED] wrote:

 This one's not too bad given the number of patches we had in the merge
 window.  We have the advansys fix, a gdth severe problem fix (wouldn't
 scan any devices) a bug fix series for lpfc and a few other odds and
 ends.
 
 The patch is available here:
 
 master.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6.git

I have scsi patches!



mptbase-reset-ioc-initiator-during-pci-resume.patch

  Fixes suspend/resume on all of Darrick's MPT cards.  I first merged it
  in September 2007.

kill-warnings-in-mptbaseh-on-parisc64.patch

  Warning fixes.

dell-cerc-support-for-megaraid_mbox.patch

  Turns non-booting machines into booting ones.  Merged in -mm in
  November 2007.

3w-raid-drivers-memset-not-needed-in-probe.patch

  Small optimisation

scsi-aic94xx-cleanups.patch

  cleanups only.

scsi-qlogicptic-section-fixes.patch

  Fixes a reference from .text into .init.text and hence might fix a
  machine crash when this driver is built into vmlinux.  Merged a week ago.

megaraid-outb_p-extermination.patch

  Cleanup

gdth-convert-to-pci-hotplug-api.patch

  Just merged


So several of these patches address quite serious bugs, and have been
stuck in my tree for far too long.

Re: [GIT PATCH] SCSI bug fixes for 2.6.25-rc1

2008-02-13 Thread Andrew Morton

On Wed, 13 Feb 2008 19:11:53 -0600 James Bottomley [EMAIL PROTECTED] wrote:

  mptbase-reset-ioc-initiator-during-pci-resume.patch
  
Fixes suspend/resume on all of Darrick's MPT cards.  I first merged it
in September 2007.
 
 Patch presented by LSI but has gone back with comments

Five months is too long to fix a bug when someone has already sent us a
patch.  If they insist on being this sluggish I'd suggest that you review
the patch yourself then just merge it.  That will get their attention. 
Maybe.

  dell-cerc-support-for-megaraid_mbox.patch
 
 I need megaraid to sign off (and test) this one.

Two months, same story.

  scsi-qlogicptic-section-fixes.patch
  
Fixes a reference from .text into .init.text and hence might fix a
machine crash when this driver is built into vmlinux.  Merged a week ago.
 
 This was the one we had the alternative fix for, wasn't it ... ?

In current mainline, __devinit qpti_sbus_probe() still is calling __init
qpti_chain_add() (for example).  So in a CONFIG_HOTPLUG kernel, hotplugging
a new device (on sbus, ok, bad example ;)) will crash.

But Adrian has fixed six such bugs in there, maybe one of them can hit.  I
don't think we've fixed these by alternative means, unless we've disabled
__devinit?


Still, we can discuss specific patches all day.  I think there is a
_general_ problem getting bugfixes, warning fixes and cleanups into scsi
drivers within reasonable amounts of time?

Re: [patch 02/13] git-scsi-misc gdth fix

2008-02-12 Thread Andrew Morton

On Tue, 12 Feb 2008 17:27:33 +0200
Boaz Harrosh [EMAIL PROTECTED] wrote:

 On Tue, Feb 05 2008 at 9:53 +0200, [EMAIL PROTECTED] wrote:
  From: James Bottomley [EMAIL PROTECTED]
  
  On Sun, 2007-10-14 at 12:21 -0700, Andrew Morton wrote:
  On Sun, 14 Oct 2007 22:45:47 +0400 Dave Milter [EMAIL PROTECTED] wrote:
 
  I build linux-2.6.23-mm1 and try to boot it using qemu,
  and it crashed with trace like this:
  do_page_fault
  error_code
  lock_acquire
  _spin_lock_irqsave
  gdth_timeout
  run_timer_softirq
  __do_softirq
  do_softirq
 
  I have screenshot, but have no idea, is it legal to include it, if I
  sent copy to lkml.
  config of kernel in attachment,
  I apply all three patches from hot-fixes.
 
  The screenshot is here:  http://userweb.kernel.org/~akpm/crash.png
 
  It would appear that gdth_timeout() is passing a bad pointer into
  spin_lock_irqsave().
  
  There's a bug in the gdth rework in that the instance can be deleted
  from the list before the actual timer is stopped.  This can be worked
  around I think by the following patch; although we really should be
  stopping the timer from firing when the list goes empty.
  
  
  James said:
  
  This is almost certainly the wrong fix for real hardware.  Although it
  kills the timer when the list goes empty, nothing will ever restart it
  when the list fills again.
  
  Boaz, since you touched all of this, you get to fix it.  The correct fix
  will be to control the timer along with the actual list instead of at
  entry/exit time.  If you're not going to add this empty check to the
  timer routine, make sure you use del_timer_sync() before removing the
  last element from the list.
  
  
  Signed-off-by: Andrew Morton [EMAIL PROTECTED]
  ---
  
   drivers/scsi/gdth.c |3 +++
   1 file changed, 3 insertions(+)
  
  diff -puN drivers/scsi/gdth.c~git-scsi-misc-gdth-fix drivers/scsi/gdth.c
  --- a/drivers/scsi/gdth.c~git-scsi-misc-gdth-fix
  +++ a/drivers/scsi/gdth.c
  @@ -3791,6 +3791,9 @@ static void gdth_timeout(ulong data)
   gdth_ha_str *ha;
   ulong flags;
   
  +if (list_empty(&gdth_instances))
  +   return;
  +
   ha = list_first_entry(&gdth_instances, gdth_ha_str, list);
   spin_lock_irqsave(&ha->smp_lock, flags);
   
  _
 Hello dear Andrew
 
 Do you perhaps remember who has reported this problem, and if he can
 test patches?
 

It was Dave Milter, who has been cc'ed on all of this.

 and if he can test patches?

Don't know.  Dave, would it be a possibility?

Thanks.

 
 ---
 gdth: Try to fix the Timer at exit problem
 
 Remove_sync the timer before we delete the cards.
 
 Testing-patches: Boaz Harrosh [EMAIL PROTECTED]
 
 ---
 git-diff --stat -p v2.6.24
  drivers/scsi/gdth.c |   19 ---
  1 files changed, 12 insertions(+), 7 deletions(-)
 
 diff --git a/drivers/scsi/gdth.c b/drivers/scsi/gdth.c
 index b253b8c..57fa756 100644
 --- a/drivers/scsi/gdth.c
 +++ b/drivers/scsi/gdth.c
 @@ -5102,6 +5105,9 @@ static int __init gdth_pci_probe_one(gdth_pci_str 
 *pcistr, int ctr)
   if (error)
   goto out_free_coal_stat;
   list_add_tail(&ha->list, &gdth_instances);
 +
 + scsi_scan_host(shp);
 +
   return 0;
  
   out_free_coal_stat:
 @@ -5137,8 +5143,6 @@ static void gdth_remove_one(gdth_ha_str *ha)
   ha->sdev = NULL;
   }
  
 - gdth_flush(ha);
 -
   if (shp->irq)
   free_irq(shp->irq,ha);
  
 @@ -5236,14 +5240,15 @@ static void __exit gdth_exit(void)
  {
   gdth_ha_str *ha;
  
 - list_for_each_entry(ha, &gdth_instances, list)
 - gdth_remove_one(ha);
 + unregister_chrdev(major,"gdth");
 + unregister_reboot_notifier(&gdth_notifier);
  
  #ifdef GDTH_STATISTICS
 - del_timer(&gdth_timer);
 + del_timer_sync(&gdth_timer);
  #endif
 - unregister_chrdev(major,"gdth");
 - unregister_reboot_notifier(&gdth_notifier);
 +
 + list_for_each_entry(ha, &gdth_instances, list)
 + gdth_remove_one(ha);
  }
  
  module_init(gdth_init);
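
For what "control the timer along with the actual list" could look like,
here is a userspace sketch.  It is an illustration under assumptions, not
gdth code: struct demo_ha, demo_add() and demo_remove() are hypothetical, and
a boolean stands in for arming the timer and for del_timer_sync().

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one controller instance on the list. */
struct demo_ha {
	struct demo_ha *next;
	int id;
};

static struct demo_ha *demo_instances;	/* head of the instance list */
static bool demo_timer_armed;		/* models the statistics timer */

/* Adding the first instance (re)starts the timer. */
static void demo_add(struct demo_ha *ha)
{
	bool was_empty = (demo_instances == NULL);

	ha->next = demo_instances;
	demo_instances = ha;
	if (was_empty)
		demo_timer_armed = true;	/* stands in for arming the timer */
}

/* Removing the last instance stops the timer before the list goes empty. */
static void demo_remove(struct demo_ha *ha)
{
	struct demo_ha **pp;

	if (demo_instances == ha && ha->next == NULL)
		demo_timer_armed = false;	/* stands in for del_timer_sync() */

	for (pp = &demo_instances; *pp; pp = &(*pp)->next) {
		if (*pp == ha) {
			*pp = ha->next;
			ha->next = NULL;
			break;
		}
	}
}

/* Timer callback: returns the first instance's id, or -1 on an empty list. */
static int demo_timeout(void)
{
	if (!demo_instances)
		return -1;
	return demo_instances->id;
}
```

Tying both transitions to the list means the timer is restarted when the
list fills again, which was the objection to only bailing out of the
callback on an empty list.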
 
 
 
 

Re: [GIT PATCH] final SCSI updates for 2.6.24 merge window

2008-02-07 Thread Andrew Morton

On Thu, 07 Feb 2008 18:56:46 -0600 James Bottomley [EMAIL PROTECTED] wrote:

 Quite a bit of this is fixing things broken previously (the advansys fix
 is still pending resolution, but I'll send it as an -rc fix when we have
 it).  There's the final elimination of all drivers that are esp based
 but don't use the scsi_esp core (that's mostly m68k and alpha).  Plus
 the usual bunch of driver updates and the addition of a new enclosure
 services driver and the corresponding ULD.

Sob.  Can we please merge "Convert SG from nopage to fault"?  It has been
sent three times, the first time was Dec 5 last year and it has thus far
received the lead balloon treatment.  Despite my explicit request for
consideration last time I sent it.

If there is no movement here then I have to carry the moderately intrusive
mm-remove-nopage.patch for another N months and we need to watch out for
new ->nopage implementations popping up etc.




From: Nick Piggin [EMAIL PROTECTED]

Convert SG from nopage to fault.

Signed-off-by: Nick Piggin [EMAIL PROTECTED]
Cc: Douglas Gilbert [EMAIL PROTECTED]
Cc: James Bottomley [EMAIL PROTECTED]
Signed-off-by: Andrew Morton [EMAIL PROTECTED]
---

 drivers/scsi/sg.c |   23 +++
 1 file changed, 11 insertions(+), 12 deletions(-)

diff -puN drivers/scsi/sg.c~sg-nopage drivers/scsi/sg.c
--- a/drivers/scsi/sg.c~sg-nopage
+++ a/drivers/scsi/sg.c
@@ -1160,23 +1160,22 @@ sg_fasync(int fd, struct file *filp, int
 	return (retval < 0) ? retval : 0;
 }
 
-static struct page *
-sg_vma_nopage(struct vm_area_struct *vma, unsigned long addr, int *type)
+static int
+sg_vma_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
 {
 	Sg_fd *sfp;
-	struct page *page = NOPAGE_SIGBUS;
 	unsigned long offset, len, sa;
 	Sg_scatter_hold *rsv_schp;
 	struct scatterlist *sg;
 	int k;
 
 	if ((NULL == vma) || (!(sfp = (Sg_fd *) vma->vm_private_data)))
-		return page;
+		return VM_FAULT_SIGBUS;
 	rsv_schp = &sfp->reserve;
-	offset = addr - vma->vm_start;
+	offset = vmf->pgoff << PAGE_SHIFT;
 	if (offset >= rsv_schp->bufflen)
-		return page;
-	SCSI_LOG_TIMEOUT(3, printk("sg_vma_nopage: offset=%lu, scatg=%d\n",
+		return VM_FAULT_SIGBUS;
+	SCSI_LOG_TIMEOUT(3, printk("sg_vma_fault: offset=%lu, scatg=%d\n",
 				   offset, rsv_schp->k_use_sg));
 	sg = rsv_schp->buffer;
 	sa = vma->vm_start;
@@ -1185,21 +1184,21 @@ sg_vma_nopage(struct vm_area_struct *vma
 		len = vma->vm_end - sa;
 		len = (len < sg->length) ? len : sg->length;
 		if (offset < len) {
+			struct page *page;
 			page = virt_to_page(page_address(sg_page(sg)) + offset);
 			get_page(page);	/* increment page count */
-			break;
+			vmf->page = page;
+			return 0; /* success */
 		}
 		sa += len;
 		offset -= len;
 	}
 
-	if (type)
-		*type = VM_FAULT_MINOR;
-	return page;
+	return VM_FAULT_SIGBUS;
 }
 
 static struct vm_operations_struct sg_mmap_vm_ops = {
-	.nopage = sg_vma_nopage,
+	.fault = sg_vma_fault,
 };
 
 static int
_
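
For anyone reviewing: the heart of the handler is the walk that turns a byte
offset into (segment, offset within segment).  A standalone sketch of just
that walk follows; struct demo_seg and demo_find_segment() are hypothetical,
and the sketch leaves out the clamp against the end of the vma that the real
loop also applies.

```c
#include <stddef.h>

/* Hypothetical stand-in for one scatterlist entry: only its length. */
struct demo_seg {
	size_t length;
};

/*
 * Find the segment that contains byte `offset` of the reserve buffer.
 * Returns the segment index and stores the offset inside that segment
 * in *seg_off, or returns -1 if the offset lies past the last segment
 * (the case where the fault handler reports SIGBUS).
 */
static int demo_find_segment(const struct demo_seg *segs, int nsegs,
			     size_t offset, size_t *seg_off)
{
	int k;

	for (k = 0; k < nsegs; k++) {
		if (offset < segs[k].length) {
			*seg_off = offset;
			return k;
		}
		offset -= segs[k].length;	/* skip this whole segment */
	}
	return -1;
}
```

In the converted handler the starting offset comes from the fault's page
offset shifted by PAGE_SHIFT rather than from the faulting address minus
vm_start, but the walk itself is unchanged.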


Re: [Bugme-new] [Bug 9901] New: kernel panic in stex modules (?)

2008-02-06 Thread Andrew Morton

On Wed,  6 Feb 2008 09:40:15 -0800 (PST) [EMAIL PROTECTED] wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=9901
> 
>            Summary: kernel panic in stex modules (?)
>            Product: IO/Storage
>            Version: 2.5
>      KernelVersion: 2.6.24
>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: Serial ATA
>         AssignedTo: [EMAIL PROTECTED]
>         ReportedBy: [EMAIL PROTECTED]
> 
> 
> Latest working kernel version: 2.6.23-r6
> Earliest failing kernel version: 2.6.24
> Distribution: Gentoo
> Hardware Environment: Core2D E6600, Asus p5B Dlx, 2G DDR2 667,
> Promise ST EX4350
> Software Environment: GCC 4.2.3/4.1.2, CFLAGS="-O2"
> 
> Problem Description:
> The problem is frequent kernel panics within the same module. Can't say
> what it is, but looks like it is related to dma and promise driver.
> The first culprit, the memory, is ok, 8 hours of memtest passed without
> errors.
> Before, kernel 2.6.23-gentoo-r6, compiled with GCC 4.1.2 worked just fine,
> then after upgrade to 4.2.2 the bug appeared. Upgrade to 2.6.24 didn't
> solve the problem. Switching back to GCC 4.1.2 made things better for a
> moment, crashes became less frequent and I thought compiler was the cause.
> But today system crashed again with same symptoms.
> Sorry, but I can't save crash log, so I'll provide screen shot:
> http://img238.imageshack.us/my.php?image=p2030030ki1.jpg
> 
> Steps to reproduce:
> Boot, start FTP-server, load RAID with heavy input, in some hours it will
> crash. With pure reads system can run several days, heavy write load kills
> it much too easier.
 

The supertrak driver has regressed in 2.6.24.  And

commit 9cb83c7529d929c00f37d821daed1942a1b20602
Author: FUJITA Tomonori [EMAIL PROTECTED]
Date:   Tue Oct 16 11:24:32 2007 +0200

[SCSI] add use_sg_chaining option to scsi_host_template

looks a likely candidate.

And this:

commit d3f46f39b7092594b498abc12f0c73b0b9913bde
Author: James Bottomley [EMAIL PROTECTED]
Date:   Tue Jan 15 11:11:46 2008 -0600

[SCSI] remove use_sg_chaining

from 2.6.25 looks to be a likely fix for it.  Should it be backported?


Re: Kernel Panic in MPT SAS on 2.6.24 (and 2.6.23.14, 2.6.23.9)

2008-02-06 Thread Andrew Morton

On Wed, 6 Feb 2008 22:04:26 +0100
Maximilian Wilhelm [EMAIL PROTECTED] wrote:

 Hi!
 
 While installing my new firewall I got the following kernel panic in
 the MPT SAS driver which I need for the disks.
 
 The first kernel I bootet was 2.6.23.14 which did panic so I tried a
 2.6.24 which panics, too. Our usual FAI kernel (2.6.23.9) is also
 affected.
 
 If there is any information you may need to track this down, please
 let me know.
 
 I've put the .config to http://files.rfc2324.org/mptsas_panic/2.6.24-config
 to limit the size of this mail.
 
 ...

 ide-floppy driver 0.99.newide
 aic94xx: Adaptec aic94xx SAS/SATA driver version 1.0.3 loaded
 megaraid cmm: 2.20.2.7 (Release Date: Sun Jul 16 00:01:03 EST 2006)
 megaraid: 2.20.5.1 (Release Date: Thu Nov 16 15:32:35 EST 2006)
 megasas: 00.00.03.10-rc5 Thu May 17 10:09:32 PDT 2007
 Driver 'sd' needs updating - please use bus_type methods
 Fusion MPT base driver 3.04.06
 Copyright (c) 1999-2007 LSI Corporation
 Fusion MPT SAS Host driver 3.04.06
 mptbase: ioc0: Initiating bringup
 ioc0: LSISAS1068E B3: Capabilities={Initiator}
 scsi0 : ioc0: LSISAS1068E B3, FwRev=00142e00h, Ports=1, MaxQ=511, IRQ=16
 scsi 0:0:0:0: Direct-Access SEAGATE  ST973402SS   S207 PQ: 0 ANSI: 5
 scsi 0:0:1:0: Direct-Access SEAGATE  ST973402SS   S207 PQ: 0 ANSI: 5
 BUG: unable to handle kernel NULL pointer dereference at virtual address 00000010
 printing eip: c02c0b38 *pde = 00000000
 Oops: 0000 [#1] SMP
 Modules linked in:
 
 Pid: 1, comm: swapper Not tainted (2.6.24 #1)
 EIP: 0060:[c02c0b38] EFLAGS: 00010246 CPU: 1
 EIP is at mptsas_probe_expander_phys+0x51/0x4a2
 EAX: 0010 EBX: f7457ec0 ECX: f7c3fd9c EDX: 0004
 ESI: f7fe7800 EDI: f7fe7800 EBP: f7fe7904 ESP: f7c3fe18
  DS: 007b ES: 007b FS: 00d8 GS:  SS: 0068
 Process swapper (pid: 1, ti=f7c3e000 task=f7c22ab0 task.ti=f7c3e000)
 Stack:   00200200 fffefd74 c02b9cc8 f7fe7800 c04c5280 
 f7c3fecc 
376b1000 0001    00100100 00200200 
  
00200200 fffefd74 c02b9cc8 f7fe7800 c04c5280 f7c3fe8c 376b1000 
 0001 
 Call Trace:
  [c02b9cc8] mpt_timer_expired+0x0/0x5c
  [c02b9cc8] mpt_timer_expired+0x0/0x5c
  [c028] ide_wait_cmd+0x90/0xa0
  [c02c2806] mptsas_probe+0x38a/0x40b
  [c0180522] sysfs_create_link+0xb7/0xf9
  [c021ceb6] pci_device_probe+0x36/0x57
  [c023bcd0] driver_probe_device+0xde/0x15c
  [c036d3e5] klist_next+0x4b/0x6b
  [c023bde0] __driver_attach+0x0/0x79
  [c023be26] __driver_attach+0x46/0x79
  [c023b2a8] bus_for_each_dev+0x33/0x55
  [c023bb37] driver_attach+0x16/0x18
  [c023bde0] __driver_attach+0x0/0x79
  [c023b58e] bus_add_driver+0x6d/0x197
  [c021cff2] __pci_register_driver+0x48/0x74
  [c0480bd3] mptsas_init+0xbf/0xd6
  [c046c74e] kernel_init+0x140/0x2a2
  [c01024ca] ret_from_fork+0x6/0x1c
  [c046c60e] kernel_init+0x0/0x2a2
  [c046c60e] kernel_init+0x0/0x2a2
  [c010319f] kernel_thread_helper+0x7/0x10
  ===
 Code: 85 c0 0f 84 68 04 00 00 8b 54 24 1c 8b 02 89 04 24 31 c9 89 da 89 f8 e8 
 2b f2 ff ff 89 44 24 2c 85 c0 8b 43 0c 0f 85 39 04 00 00 0f b7 00 8b 74 24 
 1c 89 06 8d 87 24 05 00 00 89 44 24 20 e8 5b 
 EIP: [c02c0b38] mptsas_probe_expander_phys+0x51/0x4a2 SS:ESP 0068:f7c3fe18
 ---[ end trace 50b3e7147499e641 ]---
 Kernel panic - not syncing: Attempted to kill init!
 

Thanks.   Cc's added...

Re: [patch 08/13] sg: nopage

2008-02-05 Thread Andrew Morton

On Mon, 04 Feb 2008 23:53:21 -0800 [EMAIL PROTECTED] wrote:

> From: Nick Piggin [EMAIL PROTECTED]
> 
> Convert SG from nopage to fault.
> 

Please give this some additional attention.  We'd like to remove 
vm_operations_struct.nopage() altogether and we can't do that while
it's hanging around in various subsystems.

Thanks.

Re: [PATCH] enclosure: add support for enclosure services

2008-02-05 Thread Andrew Morton

On Sun, 03 Feb 2008 18:16:51 -0600
James Bottomley [EMAIL PROTECTED] wrote:

 
> From: James Bottomley [EMAIL PROTECTED]
> Date: Sun, 3 Feb 2008 15:40:56 -0600
> Subject: [SCSI] enclosure: add support for enclosure services
> 
> The enclosure misc device is really just a library providing sysfs
> support for physical enclosure devices and their components.
> 

Thanks for sending it out for review.

> +struct enclosure_device *enclosure_find(struct device *dev)
> +{
> +	struct enclosure_device *edev = NULL;
> +
> +	mutex_lock(&container_list_lock);
> +	list_for_each_entry(edev, &container_list, node) {
> +		if (edev->cdev.dev == dev) {
> +			mutex_unlock(&container_list_lock);
> +			return edev;
> +		}
> +	}
> +	mutex_unlock(&container_list_lock);
> +
> +	return NULL;
> +}
> +EXPORT_SYMBOL_GPL(enclosure_find);

This looks a little odd.  We don't take a ref on the object after looking
it up, so what prevents some other thread of control from freeing or
otherwise altering the returned object while the caller is playing with it?
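
One common way to close that window is to take the reference while the list
lock is still held.  The following is a userspace sketch only, using a
pthread mutex and a plain counter in place of the kernel's device
refcounting.  struct obj, obj_add(), obj_find_get() and obj_put() are
hypothetical names for illustration, and this is not necessarily how the
enclosure code was eventually fixed.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for an enclosure_device. */
struct obj {
	const void *key;	/* plays the role of edev->cdev.dev */
	int refcount;		/* protected by list_lock in this sketch */
	struct obj *next;
};

static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;
static struct obj *obj_list;

/* Add an object to the list; the list itself holds one reference. */
void obj_add(struct obj *o, const void *key)
{
	pthread_mutex_lock(&list_lock);
	o->key = key;
	o->refcount = 1;
	o->next = obj_list;
	obj_list = o;
	pthread_mutex_unlock(&list_lock);
}

/* Look up by key and take a reference before the lock is dropped. */
struct obj *obj_find_get(const void *key)
{
	struct obj *o;

	pthread_mutex_lock(&list_lock);
	for (o = obj_list; o; o = o->next) {
		if (o->key == key) {
			o->refcount++;
			break;
		}
	}
	pthread_mutex_unlock(&list_lock);
	return o;	/* NULL when nothing matched */
}

/* Drop a reference; returns 1 when the caller released the last one. */
int obj_put(struct obj *o)
{
	int last;

	pthread_mutex_lock(&list_lock);
	last = (--o->refcount == 0);
	pthread_mutex_unlock(&list_lock);
	return last;
}
```

A caller pairs every successful obj_find_get() with an obj_put(), so the
object cannot be freed while the caller is still using it.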

> +/**
> + * enclosure_for_each_device - calls a function for each enclosure
> + * @fn:		the function to call
> + * @data:	the data to pass to each call
> + *
> + * Loops over all the enclosures calling the function.
> + *
> + * Note, this function uses a mutex which will be held across calls to
> + * @fn, so it must have user context, and @fn should not sleep or

Probably "non atomic context" would be more accurate.

fn() actually _can_ sleep.

> + * otherwise cause the mutex to be held for indefinite periods
> + */
> +int enclosure_for_each_device(int (*fn)(struct enclosure_device *, void *),
> +			      void *data)
> +{
> +	int error = 0;
> +	struct enclosure_device *edev;
> +
> +	mutex_lock(&container_list_lock);
> +	list_for_each_entry(edev, &container_list, node) {
> +		error = fn(edev, data);
> +		if (error)
> +			break;
> +	}
> +	mutex_unlock(&container_list_lock);
> +
> +	return error;
> +}
> +EXPORT_SYMBOL_GPL(enclosure_for_each_device);
> +
> +/**
> + * enclosure_register - register device as an enclosure
> + *
> + * @dev:	device containing the enclosure
> + * @components:	number of components in the enclosure
> + *
> + * This sets up the device for being an enclosure.  Note that @dev does
> + * not have to be a dedicated enclosure device.  It may be some other type
> + * of device that additionally responds to enclosure services
> + */
> +struct enclosure_device *
> +enclosure_register(struct device *dev, const char *name, int components,
> +		   struct enclosure_component_callbacks *cb)
> +{
> +	struct enclosure_device *edev =
> +		kzalloc(sizeof(struct enclosure_device) +
> +			sizeof(struct enclosure_component)*components,
> +			GFP_KERNEL);
> +	int err, i;
> +
> +	if (!edev)
> +		return ERR_PTR(-ENOMEM);
> +
> +	if (!cb) {
> +		kfree(edev);
> +		return ERR_PTR(-EINVAL);
> +	}

It would be less fuss if this were to test cb before doing the kzalloc().

Can cb==NULL actually and legitimately happen?
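
To illustrate the ordering being suggested: a userspace sketch that checks
its arguments first, so the failure path has nothing to free.  The type and
function names here (struct device_set, device_set_create()) are
hypothetical, calloc() stands in for kzalloc(), and an int return with an
out parameter stands in for ERR_PTR().

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the callback table and per-component data. */
struct callbacks {
	int unused;
};

struct component {
	int number;
};

struct device_set {
	const struct callbacks *cb;
	int components;
	struct component component[];	/* allocated together with the header */
};

/*
 * Validate the arguments before allocating anything.  Returns 0 and stores
 * the new object in *out, or returns a negative errno and leaves *out alone.
 */
int device_set_create(const struct callbacks *cb, int components,
		      struct device_set **out)
{
	struct device_set *ds;
	int i;

	if (!cb || components < 0)
		return -EINVAL;

	ds = calloc(1, sizeof(*ds) +
		       sizeof(ds->component[0]) * (size_t)components);
	if (!ds)
		return -ENOMEM;

	ds->cb = cb;
	ds->components = components;
	for (i = 0; i < components; i++)
		ds->component[i].number = -1;	/* -1 marks "not registered" */

	*out = ds;
	return 0;
}
```

The -EINVAL path now has no cleanup at all, which is the "less fuss" being
asked for.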

> +	edev->components = components;
> +
> +	edev->cdev.class = &enclosure_class;
> +	edev->cdev.dev = get_device(dev);
> +	edev->cb = cb;
> +	snprintf(edev->cdev.class_id, BUS_ID_SIZE, "%s", name);
> +	err = class_device_register(&edev->cdev);
> +	if (err)
> +		goto err;
> +
> +	for (i = 0; i < components; i++)
> +		edev->component[i].number = -1;
> +
> +	mutex_lock(&container_list_lock);
> +	list_add_tail(&edev->node, &container_list);
> +	mutex_unlock(&container_list_lock);
> +
> +	return edev;
> +
> + err:
> +	put_device(edev->cdev.dev);
> +	kfree(edev);
> +	return ERR_PTR(err);
> +}
> +EXPORT_SYMBOL_GPL(enclosure_register);
> +
> +static struct enclosure_component_callbacks enclosure_null_callbacks;
> +
> +/**
> + * enclosure_unregister - remove an enclosure
> + *
> + * @edev:	the registered enclosure to remove;
> + */
> +void enclosure_unregister(struct enclosure_device *edev)
> +{
> +	int i;
> +
> +	if (!edev)
> +		return;

Is this legal?

> +	mutex_lock(&container_list_lock);
> +	list_del(&edev->node);
> +	mutex_unlock(&container_list_lock);

See, right now, someone who found this enclosure_device via
enclosure_find() could still be playing with it?

> +	for (i = 0; i < edev->components; i++)
> +		if (edev->component[i].number != -1)
> +			class_device_unregister(&edev->component[i].cdev);
> +
> +	/* prevent any callbacks into service user */
> +	edev->cb = &enclosure_null_callbacks;
> +	class_device_unregister(&edev->cdev);
> +}
> +EXPORT_SYMBOL_GPL(enclosure_unregister);
> +
> +/**
> + * enclosure_component_register - add a particular component to an enclosure
> + * @edev:	the enclosure to add the component
> + * @num:	the device number
> + * @type:	the type of component

Re: [patch] pci: pci_enable_device_bars() fix

2008-02-04 Thread Andrew Morton

On Mon, 4 Feb 2008 13:57:36 +0100 Ingo Molnar [EMAIL PROTECTED] wrote:

 
 * Jeff Garzik [EMAIL PROTECTED] wrote:
 
  Ingo Molnar wrote:
  so please tell me Jeff. If Greg, who is the super-maintainer of your 
  code area, and who deals with your code every day and changes it 
  every minute and hour, simply did not Cc: the SCSI list - how am i, a 
  largely outside party in this matter, supposed to notice that 3 
  maintainers and 3 mailing lists in the Cc: were somehow not enough 
  and that i was supposed to grow the already sizable Cc: list even 
  more?
 
  Because, regardless of the situation, it's both common courtesy and 
  wise practice to CC relevant driver maintainers, when you touch a 
  driver.
 
  And it's just common sense: Greg simply does not know the intimate 
  details of every PCI driver.  Nor do I.  Nor you.
 
  In the case of lpfc here, we have an active driver maintainer, and an 
  up-to-date MAINTAINERS entry.  Even if you are too slack to read 
  MAINTAINERS, 'git log' would have given you the same info.
 
  Don't pretend there is some benefit here to ignoring the people that 
  best know the driver.  I don't buy that; it simply makes no 
  engineering sense whatsoever.
 
 what you _STILL_ do not realize is the following: you still attribute 
 the lack of Cc:s to some intention of mine. No, it was not my intention. 
 At first glance the Cc: looked large and complete enough in an 
 _existing_ discussion and that's was the end of my (brief) attention 
 regarding the Cc: line. Yes, it would have been a bit better had i 
 noticed the lack of Cc:s in an existing discussion, but i didnt.

Actually I (and probably others) generally avoid cc'ing mailing lists on
patch traffic.  I spew out enough script-generated traffic as it is.

 ...
   mailing list aliases to get the 'guaranteed attention' of maintainers 


whoa.  You must know better mailing lists than I do ;)


Re: dmesg spam

2008-02-04 Thread Andrew Morton

On Mon, 4 Feb 2008 15:24:55 +0100 Bartlomiej Zolnierkiewicz [EMAIL PROTECTED] 
wrote:

 On Sunday 03 February 2008, Andrew Morton wrote:
  
  With latest -mm, running fc8 I am getting this in the logs,
^^^
 = SCSI/libata
 
 cc:ing Jeff
 
  once per second.
  
  sr0: CDROM not ready.  Make sure there is a disc in the drive.
  sr0: CDROM not ready.  Make sure there is a disc in the drive.
  sr0: CDROM not ready.  Make sure there is a disc in the drive.
  sr0: CDROM not ready.  Make sure there is a disc in the drive.
  sr0: CDROM not ready.  Make sure there is a disc in the drive.
  sr0: CDROM not ready.  Make sure there is a disc in the drive.
  sr0: CDROM not ready.  Make sure there is a disc in the drive.
  sr0: CDROM not ready.  Make sure there is a disc in the drive.
  sr0: CDROM not ready.  Make sure there is a disc in the drive.

Well..  it's coming out of the kernel.  Presumably it's that cdrom polling
thing in KDE.  James recently made changes to sr_ioctl.c but I've been
buried in more terminal regressions than this one.

Re: dmesg spam

2008-02-04 Thread Andrew Morton

On Mon, 04 Feb 2008 15:21:54 -0500
Jeff Garzik [EMAIL PROTECTED] wrote:

 James Bottomley wrote:
  The message comes from sr_ioctl.c:sr_do_ioctl().  Which means some user
  level application is poking the drive with a command that's returning
  NOT_READY.  Apparently it will shut up if quiet is set in the packet
  command structure.
  
  It could be the application is getting the wrong idea of the status from
  sr_do_staus() which leads it to send commands which require a medium?
  But we'll need a bit of debugging to determine this.
 
 
 Userland polling of the cdrom is quite normal (if unfortunately), 
 regardless of medium presence.  Probably HAL or dbus.
 
 In theory, the userland app should (a) set quiet and (b) handle 
 not-ready condition just fine.
 
 I presume that (b) is ok, since not-ready just means to continue polling 
 the cdrom ad infinitum, until media appears.
 
 A useful experiment, if only to confirm the obvious, would be to insert 
 some media.
 
 What controller and device is in use?
 

It's the thinkpad t61p.  Currently five miles away, powered off.  It's all
new Intel stuff iirc.

http://userweb.kernel.org/~akpm/dmesg-t61p.txt has some info but not the
right info afaict.

Bisection time I guess.  That'll be a new experience.

Re: dmesg spam

2008-02-04 Thread Andrew Morton

On Mon, 04 Feb 2008 14:44:18 -0600 James Bottomley [EMAIL PROTECTED] wrote:

 
 On Mon, 2008-02-04 at 15:24 -0500, Jeff Garzik wrote:
  James Bottomley wrote:
   It's here in sr_ioctl.c:
  
  Ah, indeed.  My grep-fu sucks today.
  
  
   I'm not averse to simply nuking the printk ... it's probably valueless
   in a modern kernel, since something dbussy is supposed to tell you to
   put a CD in the drive, not something in the kernel.
  
  The reverse...  dbussy/HAL is implementing autodetection of media 
  insertion, by polling ad infinitum.
 
 Understood ... I meant the day of the user relying on a message from a
 kernel printk to tell them they need a CD in the drive is long over.
 

OK, sorry, I'm hopelessly full of it.  These messages also are produced by
2.6.24, 2.6.23 and 2.6.23.1-49.fc8.

I don't think anyone would miss this message were it to bite the D key.


dmesg spam

2008-02-03 Thread Andrew Morton


With latest -mm, running fc8 I am getting this in the logs,
once per second.

sr0: CDROM not ready.  Make sure there is a disc in the drive.
sr0: CDROM not ready.  Make sure there is a disc in the drive.
sr0: CDROM not ready.  Make sure there is a disc in the drive.
sr0: CDROM not ready.  Make sure there is a disc in the drive.
sr0: CDROM not ready.  Make sure there is a disc in the drive.
sr0: CDROM not ready.  Make sure there is a disc in the drive.
sr0: CDROM not ready.  Make sure there is a disc in the drive.
sr0: CDROM not ready.  Make sure there is a disc in the drive.
sr0: CDROM not ready.  Make sure there is a disc in the drive.

Re: 2.6.24-rc8-mm1 Build Failure on scsi driver

2008-01-17 Thread Andrew Morton

On Thu, 17 Jan 2008 21:45:39 +0530 Kamalesh Babulal [EMAIL PROTECTED] wrote:

 Hi Andrew,
 
 The kernel build fails with following error
 
 drivers/scsi/aha152x.o: In function `aha152x_host_reset_host':
 /home/kamalesh/scrap/linux-2.6.24-rc8/drivers/scsi/aha152x.c:1324: multiple 
 definition of `aha152x_host_reset_host'
 drivers/scsi/pcmcia/built-in.o:/home/kamalesh/scrap/linux-2.6.24-rc8/drivers/scsi/aha152x.c:1324:
  first defined here
 drivers/scsi/aha152x.o: In function `aha152x_release':
 /home/kamalesh/scrap/linux-2.6.24-rc8/drivers/scsi/aha152x.c:908: multiple 
 definition of `aha152x_release'
 drivers/scsi/pcmcia/built-in.o:/home/kamalesh/scrap/linux-2.6.24-rc8/drivers/scsi/aha152x.c:908:
  first defined here
 ld: Warning: size of symbol `aha152x_release' changed from 68 in 
 drivers/scsi/pcmcia/built-in.o to 100 in drivers/scsi/aha152x.o
 drivers/scsi/aha152x.o: In function `aha152x_probe_one':

Neat.  Seems that the scsi build system is linking together two copies of
drivers/scsi/aha152x.o.  One via drivers/scsi/aha152x.o directly and the
other via drivers/scsi/pcmcia/built-in.o.

Please send the .config.

I'm looking suspiciously at this, from git-scsi-misc:

commit 8ae732a91df051aba6820068a47b631a06599d84
Author: Tejun Heo [EMAIL PROTECTED]
Date:   Fri Dec 7 22:36:23 2007 +0900

[SCSI] make pcmcia directory use obj-y|m instead of subdir-y|m

subdir-y|m isn't supposed to contain modules or built-in components.
Change subdir-$(CONFIG_PCMCIA) to obj-$(CONFIG_PCMCIA).

Signed-off-by: Tejun Heo [EMAIL PROTECTED]
Acked-by: Sam Ravnborg [EMAIL PROTECTED]
Signed-off-by: James Bottomley [EMAIL PROTECTED]

diff --git a/drivers/scsi/Makefile b/drivers/scsi/Makefile
index b5441f5..93e1428 100644
--- a/drivers/scsi/Makefile
+++ b/drivers/scsi/Makefile
@@ -17,7 +17,7 @@
 CFLAGS_aha152x.o =   -DAHA152X_STAT -DAUTOCONF
 CFLAGS_gdth.o= # -DDEBUG_GDTH=2 -D__SERIAL__ -D__COM2__ -DGDTH_STATISTICS
 
-subdir-$(CONFIG_PCMCIA)+= pcmcia
+obj-$(CONFIG_PCMCIA)   += pcmcia/
 
 obj-$(CONFIG_SCSI) += scsi_mod.o
 obj-$(CONFIG_SCSI_TGT) += scsi_tgt.o


Re: 2.6.24-rc8-mm1 Build Failure on scsi driver

2008-01-17 Thread Andrew Morton

On Fri, 18 Jan 2008 12:07:27 +0530 Kamalesh Babulal [EMAIL PROTECTED] wrote:

 Hi Andrew,
 
 Patch from Tejun Heo fixes the aha152x.c build failure, and following second 
 part
 of the build failure, is still occurring.
 
 drivers/scsi/fdomain.o:(.data+0x0): multiple definition of 
 `fdomain_driver_template'
 drivers/scsi/pcmcia/built-in.o:(.data+0x5a0): first defined here
 drivers/scsi/fdomain.o: In function `fdomain_16x0_bus_reset':
 /home/kamalesh/scrap/linux-2.6.24-rc8/drivers/scsi/fdomain.c:1568: multiple 
 definition of `fdomain_16x0_bus_reset'
 drivers/scsi/pcmcia/built-in.o:/home/kamalesh/scrap/linux-2.6.24-rc8/drivers/scsi/fdomain.c:1568:
  first defined here
 drivers/scsi/fdomain.o: In function `__fdomain_16x0_detect':
 /home/kamalesh/scrap/linux-2.6.24-rc8/drivers/scsi/fdomain.c:894: multiple 
 definition of `__fdomain_16x0_detect'
 drivers/scsi/pcmcia/built-in.o:/home/kamalesh/scrap/linux-2.6.24-rc8/drivers/scsi/fdomain.c:894:
  first defined here
 ld: Warning: size of symbol `__fdomain_16x0_detect' changed from 1206 in 
 drivers/scsi/pcmcia/built-in.o to 1700 in drivers/scsi/fdomain.o
 drivers/scsi/fdomain.o: In function `fdomain_setup':
 /home/kamalesh/scrap/linux-2.6.24-rc8/drivers/scsi/fdomain.c:554: multiple 
 definition of `fdomain_setup'
 drivers/scsi/pcmcia/built-in.o:/home/kamalesh/scrap/linux-2.6.24-rc8/drivers/scsi/fdomain.c:554:
  first defined here

Tejun has more fixing to do, I suspect ;)

I assume a basic allyesconfig will weed out most remaining problems of this
sort.  Problem is, it needs to be done for all architectures (and even that
might not suffice).  So old-fashioned code inspection is also needed.

Re: [Bugme-new] [Bug 9752] New: getting FAULT code message at startup in mpt fusion scsi driver

2008-01-15 Thread Andrew Morton

On Tue, 15 Jan 2008 06:30:11 -0800 (PST) [EMAIL PROTECTED] wrote:

 http://bugzilla.kernel.org/show_bug.cgi?id=9752
 
Summary: getting FAULT code message at startup in mpt fusion
 scsi driver
Product: SCSI Drivers
Version: 2.5
  KernelVersion: 2.6.14
   Platform: All
 OS/Version: Linux
   Tree: Mainline
 Status: NEW
   Severity: normal
   Priority: P1
  Component: Other
 AssignedTo: [EMAIL PROTECTED]
 ReportedBy: [EMAIL PROTECTED]
 
 
 Hardware Environment: 32-bit x86, dual Xeon
 Software Environment: nothing special
 Problem Description:
 
 On blade startup we're seeing the following message:
 
 Fusion MPT base driver 3.02.55
 Copyright (c) 1999-2005 LSI Logic Corporation
 Fusion MPT SAS Host driver 3.02.55
 mptbase: Initiating ioc0 bringup
 mptbase: ioc0: WARNING - IOC is in FAULT state!!!
 FAULT code = 1804h
 mptbase: ioc0: ERROR - Failed to come READY after reset! IocState=0
 mptbase: ioc0 NOT READY WARNING!
 mptbase: WARNING - ioc0 did not initialize properly! (-1)
 mptsas: probe of 0000:05:01.0 failed with error -1
 
 
 I'm not very knowledgable with SCSI, so can someone tell me whether this is a
 disk fault or a host fault?  Even better, does anyone know what the specific
 fault code means or where I could look it up?
 
 
 -- 
 Configure bugmail: http://bugzilla.kernel.org/userprefs.cgi?tab=email
 --- You are receiving this mail because: ---
 You are on the CC list for the bug, or are watching someone who is.

Re: [PATCH] megaraid: fix section mismatch

2008-01-10 Thread Andrew Morton

On Thu, 10 Jan 2008 14:33:16 -0800
Randy Dunlap [EMAIL PROTECTED] wrote:

 From: Randy Dunlap [EMAIL PROTECTED]
 
 Change megaraid_pci_driver_g variable name so that it matches the modpost
 whitelist that allows pointers to init text/data.
 
 WARNING: vmlinux.o(.data+0x1a8e30): Section mismatch: reference to 
 .init.text:megaraid_probe_one (between 'megaraid_pci_driver_g' and 
 'class_device_attr_megaraid_mbox_app_hndl')
 

All these patches fix references to possibly-discarded sections and hence
fix possibly-serious bugs.  So all of them should go into 2.6.24.

I already had the qla2xxx one.  It was sent to James a month ago with not
atypical results.  The advansys one is stuck in git-scsi-misc.

I'll give it 24 hours and then shall send these:

scsi-qla2xxx-qla_osc-section-fix.patch
megaraid-fix-section-mismatch.patch
cciss-section-mismatch.patch
x86-discover_ebda-section-mismatch.patch
tpm-infineon-section-mismatch.patch
dvb-av7110-fix-section-mismatch.patch
hostap-section-mismatch-warning.patch

in to Linus.

Re: [PATCH] megaraid: fix section mismatch

2008-01-10 Thread Andrew Morton

On Thu, 10 Jan 2008 22:45:35 -0600 James Bottomley [EMAIL PROTECTED] wrote:

 On Thu, 2008-01-10 at 16:10 -0800, Andrew Morton wrote:
  On Thu, 10 Jan 2008 14:33:16 -0800
  Randy Dunlap [EMAIL PROTECTED] wrote:
  
   From: Randy Dunlap [EMAIL PROTECTED]
   
   Change megaraid_pci_driver_g variable name so that it matches the modpost
   whitelist that allows pointers to init text/data.
   
   WARNING: vmlinux.o(.data+0x1a8e30): Section mismatch: reference to 
   .init.text:megaraid_probe_one (between 'megaraid_pci_driver_g' and 
   'class_device_attr_megaraid_mbox_app_hndl')
   
  
  All these patches fix references to possibly-discarded sections and hence
  fix possibly-serious bugs.  So all of them should go into 2.6.24.
 
 Renaming a variable fixes a serious bug?  It quiets a spurious warning
 from modpost, sure, but I hardly think that's -rc7 material.
 

Rather than unerringly zooming in on the vanishingly trivial: will you be
merging the advansys and qla2xxx bugfixes or would you like me to?

Re: [PATCH] MAINTAINERS: remove Adam Fritzler, update his email address in other sources

2007-12-22 Thread Andrew Morton

On Mon, 17 Dec 2007 20:48:03 -0800 Joe Perches [EMAIL PROTECTED] wrote:

 Back to Adam Fritzler...
 
 ...
 
 diff --git a/CREDITS b/CREDITS
 index ee909f2..449ec7f 100644
 --- a/CREDITS
 +++ b/CREDITS
 @@ -1124,6 +1124,9 @@ S: 1150 Ringwood Court
  S: San Jose, California 95131
  S: USA
  
 +N: Adam Fritzler
 +E: [EMAIL PROTECTED]
 +
  N: Fernando Fuganti
  E: [EMAIL PROTECTED]
  E: [EMAIL PROTECTED]
 diff --git a/MAINTAINERS b/MAINTAINERS
 index 9507b42..690f172 100644
 --- a/MAINTAINERS
 +++ b/MAINTAINERS
 @@ -3758,13 +3758,6 @@ W: 
 http://www.kernel.org/pub/linux/kernel/people/bunk/trivial/
  T:   git kernel.org:/pub/scm/linux/kernel/git/bunk/trivial.git
  S:   Maintained
  
 -TMS380 TOKEN-RING NETWORK DRIVER
 -P:   Adam Fritzler
 -M:   [EMAIL PROTECTED]
 -L:   [EMAIL PROTECTED]
 -W:   http://www.auk.cx/tms380tr/
 -S:   Maintained

What was the rationale for removing Adam from MAINTAINERS?

That should have been in the non-existent changelog.  Please always reissue
a complete changelog when resending any patch.

hm, linux-tr.net seems to be defunct.  So I guess that orphaning TMS380 is
appropriate, if Adam has left us.  Has he?


Re: [patch 18/30] scsi/qla2xxx/: possible cleanups

2007-12-21 Thread Andrew Morton

On Fri, 14 Dec 2007 10:20:04 -0800 Andrew Vasquez [EMAIL PROTECTED] wrote:

 On Fri, 14 Dec 2007, Andrew Morton wrote:
 
   Could you drop this patch from your queue.  I'll carry it in my tree
   (along with additional code removals) for 2.6.25 submission.
   
  
  I'll normally carry patches until they turn up in a subsystem tree or
  mainline and will drop them then.  To minimise potential of lossage..
  
  Is your tree publically accessible?
 
 It is, though, not widely publicized:
 
   git://avgit01.qlogic.com/qla2xxx-upstream

 The repo is torndown and rebased on frequent a basis, and is meant to
 provide a snapshot of where qla2xxx is at any given time.  Currently
 it's comprised of linux-2.6.git with scsi-misc-2.6.git merged and a
 dozen or so patches queued for the next merge window (2.6.25).

That should be OK.  Sometimes ugly things can happen if James syncs with
Linus and you don't: when I ask git to generate the james-you diff, it
will generate a patch which reverts the Linus changes which are in James's
tree but which aren't in yours.  I have an alternative pull-git-trees
script which tries to fix that but not very successfully.

But whatever, we'll see.

One slight problem though:

fatal: Unable to look up avgit01.qlogic.com (port 9418) (Name or service not known)

I think I might need to be [EMAIL PROTECTED] to get at that?


Re: [PATCH 1/3] cciss: export more attributes to sysfs (repost)

2007-12-21 Thread Andrew Morton

On Fri, 14 Dec 2007 16:17:44 -0600
Mike Miller [EMAIL PROTECTED] wrote:

> Patch 1 of 3
> 
> Sorry to take so long to repost.
> 
> This patch exports more attributes to /sys so we can work better with
> udev. Some distros use unique_id among other attributes. This patch attempts
> to provide that and other attributes to reveal more information about cciss
> devices in /sys. It's also an effort to be more sysfs friendly.
> Please consider this for inclusion.
> 

I'm getting some deja vu here.  I'm sure I already commented on some of
these things?

 
> diff --git a/drivers/block/cciss.c b/drivers/block/cciss.c
> index 7d70496..54080e6 100644
> --- a/drivers/block/cciss.c
> +++ b/drivers/block/cciss.c
> @@ -229,20 +229,485 @@ static inline CommandList_struct *removeQ(CommandList_struct **Qptr,
>  	return c;
>  }
>  
> +static inline int find_drv_index(int ctlr, drive_info_struct *drv){
> +	int i;
> +	for (i=0; i < CISS_MAX_LUN; i++) {
> +		if (hba[ctlr]->drv[i].LunID == drv->LunID)
> +			return i;
> +	}
> +	return i;
> +}

pleeeze always feed all diffs through scripts/checkpatch.pl.  Twice.

This function has multiple coding-style mistakes.

It is also far too large to be inlined.
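
For reference, a sketch of how the same lookup might read once laid out in
kernel coding style and without the inline.  It is self-contained userspace
C: MAX_LUN and struct drive_info are hypothetical stand-ins for
CISS_MAX_LUN and drive_info_struct.  Note one deliberate difference, which
is an assumption of this sketch and not part of the patch: it returns -1
when nothing matches, whereas the submitted function returns the loop bound.

```c
#define MAX_LUN 16	/* hypothetical; the patch uses CISS_MAX_LUN */

/* Hypothetical stand-in for drive_info_struct: only the LUN id is needed. */
struct drive_info {
	unsigned int lun_id;
};

/* Returns the index of the drive whose LUN id matches, or -1 if none does. */
int find_drv_index(const struct drive_info *drives, unsigned int lun_id)
{
	int i;

	for (i = 0; i < MAX_LUN; i++) {
		if (drives[i].lun_id == lun_id)
			return i;
	}
	return -1;
}
```

The visible changes are the opening brace on its own line, a blank line
after the declarations, and spaces around the assignment operator.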

>  #include "cciss_scsi.c"	/* For SCSI tape support */
>  
> +#define ENG_GIG 1000000000
> +#define ENG_GIG_FACTOR (ENG_GIG/512)
>  #define RAID_UNKNOWN 6
> +static const char *raid_label[] = { "0", "4", "1(1+0)", "5", "5+1", "ADG",
> +	"UNKNOWN"};
> +
> +
> +static spinlock_t sysfs_lock = SPIN_LOCK_UNLOCKED;

checkpatch would have informed you about this mistake as well.

> +static void cciss_sysfs_stat_inquiry(int ctlr, int logvol,
> +			int withirq, drive_info_struct *drv)
> +{
> +	int return_code;
> +	InquiryData_struct *inq_buff;
> +
> +	/* If there are no heads then this is the controller disk and
> +	 * not a valid logical drive so don't query it.
> +	 */
> +	if (!drv->heads)
> +		return;
> +
> +	inq_buff = kzalloc(sizeof(InquiryData_struct), GFP_KERNEL);
> +	if (!inq_buff) {
> +		printk(KERN_ERR "cciss: out of memory\n");

This failure gets dropped on the floor.  Is there really no need to report
it?  Will the driver still correctly function even though this function
didn't do anything?

 + goto err;
 + }
 +
 + if (withirq)
 + return_code = sendcmd_withirq(CISS_INQUIRY, ctlr,
 + inq_buff, sizeof(*inq_buff), 1, logvol ,0, TYPE_CMD);
 + else
 + return_code = sendcmd(CISS_INQUIRY, ctlr, inq_buff,
 + sizeof(*inq_buff), 1, logvol , 0, NULL, TYPE_CMD);
 + if (return_code == IO_OK) {
 + memcpy(drv->vendor, &inq_buff->data_byte[8], 8);
 + drv->vendor[8]='\0';
 + memcpy(drv->model, &inq_buff->data_byte[16], 16);
 + drv->model[16] = '\0';
 + memcpy(drv->rev, &inq_buff->data_byte[32], 4);
 + drv->rev[4] = '\0';
 + } else { /* Get geometry failed */
 + printk(KERN_WARNING "cciss: inquiry for VPD page 0 failed\n");
 + }
 +
 + if (withirq)
 + return_code = sendcmd_withirq(CISS_INQUIRY, ctlr,
 + inq_buff, sizeof(*inq_buff), 1, logvol ,0x83, TYPE_CMD);
 + else
 + return_code = sendcmd(CISS_INQUIRY, ctlr, inq_buff,
 + sizeof(*inq_buff), 1, logvol , 0x83, NULL, TYPE_CMD);
 +
 + if (return_code == IO_OK) {
 + memcpy(drv->uid, &inq_buff->data_byte[8], 16);
 + } else { /* Get geometry failed */
 + printk(KERN_WARNING "cciss: inquiry for VPD page 83 failed\n");
 + }
 +
 + kfree(inq_buff);
 +err:
 + drv->vendor[8] = '\0';
 + drv->model[16] = '\0';
 + drv->rev[4] = '\0';
 +
 +}
 +
 +static ssize_t cciss_show_raid_level(struct device *dev,
 +  struct device_attribute *attr, char *buf)
 +{
 + struct drv_dynamic *d;
 + drive_info_struct *drv;
 + ctlr_info_t *h;
 + unsigned long flags;
 + int raid;
 +
 + d = container_of(dev, struct drv_dynamic, dev);
 + spin_lock(&sysfs_lock);
 + if (!d->disk) {
 + spin_unlock(&sysfs_lock);
 + return -ENOENT;
 + }
 +
 + h = get_host(d->disk);
 +
 + spin_lock_irqsave(CCISS_LOCK(h->ctlr), flags);
 + if (h->busy_configuring) {
 + spin_unlock_irqrestore(CCISS_LOCK(h->ctlr), flags);
 + spin_unlock(&sysfs_lock);
 + return snprintf(buf, 30, "Device busy configuring\n");
 + }

The above code snippet gets repeated again and again and again.  As I
suggested last time: can this be fixed?
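
One possible shape for that, as a rough sketch only: a get/put pair that does
the lookup and takes both locks, so that each show routine shrinks to the part
that actually differs.  The helper names cciss_get_drive() and
cciss_put_drive() are hypothetical, the locks are replaced by counters so the
fragment compiles in userspace, and returning -EBUSY for the
busy_configuring case is an assumption (the patch prints a message instead).

```c
#include <errno.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel objects (assumptions). */
struct mock_lock { int held; };
static void mock_acquire(struct mock_lock *l) { l->held++; }
static void mock_release(struct mock_lock *l) { l->held--; }

struct ctlr { struct mock_lock lock; int busy_configuring; };
struct disk { struct ctlr *host; void *private_data; };
struct drv_dynamic { struct disk *disk; };

static struct mock_lock sysfs_lock;

/*
 * Hypothetical helper: take both locks and hand back the drive data.
 * On success returns 0 with both locks held; on failure returns a
 * negative errno with no locks held.
 */
static int cciss_get_drive(struct drv_dynamic *d, struct ctlr **hp,
			   void **drvp)
{
	struct ctlr *h;

	mock_acquire(&sysfs_lock);
	if (!d->disk) {
		mock_release(&sysfs_lock);
		return -ENOENT;
	}

	h = d->disk->host;
	mock_acquire(&h->lock);
	if (h->busy_configuring) {
		mock_release(&h->lock);
		mock_release(&sysfs_lock);
		return -EBUSY;
	}

	*hp = h;
	*drvp = d->disk->private_data;
	return 0;
}

/* Release the locks in the reverse order of acquisition. */
static void cciss_put_drive(struct ctlr *h)
{
	mock_release(&h->lock);
	mock_release(&sysfs_lock);
}
```

Each attribute's show function would then call the get helper, format its one
value, and call the put helper.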

 + drv = d->disk->private_data;
 + if ((drv->raid_level < 0) || (drv->raid_level) > 5)
 + raid = RAID_UNKNOWN;
 + else
 + raid = drv->raid_level;
 +
 + spin_unlock_irqrestore(CCISS_LOCK(h->ctlr), flags);
 + spin_unlock(&sysfs_lock);
 + return snprintf(buf, 20, "RAID %s\n", raid_label[raid]);
 +}

Re: 2.6.24-rc5: tape drive not responding

2007-12-17 Thread Andrew Morton

On Mon, 17 Dec 2007 11:25:51 +0900 FUJITA Tomonori [EMAIL PROTECTED] wrote:

 On Sun, 16 Dec 2007 20:05:51 -0500
 John Stoffel [EMAIL PROTECTED] wrote:
 
  [  215.007701] sym1: SCSI parity error detected: SCR1=1 DBC=1128 SBCL=ae
  [  215.008145] sym1: SCSI parity error detected: SCR1=1 DBC=1128 SBCL=ae
  [  215.008678] sym1: SCSI parity error detected: SCR1=1 DBC=1128 SBCL=ae
  [  215.009122] sym1: SCSI parity error detected: SCR1=1 DBC=1128 SBCL=ae
  [  215.009598] sym1: SCSI parity error detected: SCR1=1 DBC=1128 SBCL=ae
  [  215.010042] sym1: SCSI parity error detected: SCR1=1 DBC=1128 SBCL=ae
  [  215.010516] sym1: SCSI parity error detected: SCR1=1 DBC=1128 SBCL=ae
  [  215.010959] sym1: SCSI parity error detected: SCR1=1 DBC=1128 SBCL=ae
  [  215.011403] sym1: SCSI parity error detected: SCR1=1 DBC=1128 SBCL=ae
  [  215.011850] sym1: SCSI parity error detected: SCR1=1 DBC=1128 SBCL=ae
  .
  .
  .
  [  232.954629] sym1: SCSI parity error detected: SCR1=1 DBC=1128 SBCL=ae
  [  233.035902] scsi 3:0:3:0: DEVICE RESET operation started
  [  233.099514] sym1: SCSI parity error detected: SCR1=1 DBC=1128 SBCL=ae
  .
  .
  .
  
  These repeat for about 15 seconds or so.  They're really annoying and
  I'd love to see some sort of rate limiting put in here.  The messages
  end with:
  .
  .
  .
  [  238.084175] sym1: SCSI parity error detected: SCR1=1 DBC=1128 SBCL=ae
  [  238.165887] sym1: SCSI parity error detected: SCR1=1 DBC=1128 SBCL=ae
  [  238.247157] scsi 3:0:3:0: DEVICE RESET operation timed-out.
  [  238.313892] sym1: SCSI parity error detected: SCR1=1 DBC=1128 SBCL=ae
  [  238.395192] scsi 3:0:3:0: BUS RESET operation started
  [  238.455690] sym1: SCSI parity error detected: SCR1=1 DBC=1128 SBCL=ae
  [  238.539216] sym1: SCSI BUS reset detected.
  [  238.592552] sym1: SCSI BUS has been reset.
  [  238.641576] scsi 3:0:3:0: BUS RESET operation complete.
  [  248.700373]  target3:0:3: wide asynchronous
  [  248.752026]  target3:0:3: Wide Transfers Fail
  [  248.805220]  target3:0:3: FAST-10 SCSI 10.0 MB/s ST (100 ns, offset 15)
  [  248.886729]  target3:0:3: Domain Validation skipping write tests
  [  248.958666]  target3:0:3: Ending Domain Validation
  [  252.264086] scsi 3:0:0:0: Attached scsi generic sg2 type 8
  [  252.331257] st 3:0:2:0: Attached scsi tape st0
  [  252.384549] st 3:0:2:0: st0: try direct i/o: yes (alignment 512 B)
  [  252.458875] st 3:0:2:0: Attached scsi generic sg3 type 1
  [  252.523963] st 3:0:3:0: Attached scsi tape st1
  [  252.577184] st 3:0:3:0: st1: try direct i/o: yes (alignment 512 B)
  [  252.651484] st 3:0:3:0: Attached scsi generic sg4 type 1
  
  
  I've also got an ATL P1000 SCSI tape library hooked up to this same
  controller and port, and I can manipulate it properly using the 'mtx'
  program pointed to the /dev/changer alias, which points to the correct
  /dev/sg# device.
  
  Here's my /proc/scsi/scsi output, as you can see, I've got a bunch of
  devices on this system:
  
  # cat /proc/scsi/scsi 
  Attached devices:
  Host: scsi0 Channel: 00 Id: 00 Lun: 00
    Vendor: COMPAQ   Model: HC01841729       Rev: 3208
    Type:   Direct-Access                    ANSI  SCSI revision: 02
  Host: scsi0 Channel: 00 Id: 01 Lun: 00
    Vendor: COMPAQ   Model: BD018222CA       Rev: B016
    Type:   Direct-Access                    ANSI  SCSI revision: 02
  Host: scsi3 Channel: 00 Id: 00 Lun: 00
    Vendor: ATL      Model: P10006220051     Rev: 1.20
    Type:   Medium Changer                   ANSI  SCSI revision: 02
  Host: scsi3 Channel: 00 Id: 02 Lun: 00
    Vendor: QUANTUM  Model: DLT7000          Rev: 2565
    Type:   Sequential-Access                ANSI  SCSI revision: 02
  Host: scsi3 Channel: 00 Id: 03 Lun: 00
    Vendor: QUANTUM  Model: DLT7000          Rev: 2565
    Type:   Sequential-Access                ANSI  SCSI revision: 02
  Host: scsi4 Channel: 00 Id: 00 Lun: 00
    Vendor: SAMSUNG  Model: CDRW/DVD SM-352B Rev: T806
    Type:   CD-ROM                           ANSI  SCSI revision: 05
  Host: scsi6 Channel: 00 Id: 00 Lun: 00
    Vendor: ATA      Model: ST3320620AS      Rev: 3.AA
    Type:   Direct-Access                    ANSI  SCSI revision: 05
  Host: scsi7 Channel: 00 Id: 00 Lun: 00
    Vendor: ATA      Model: WDC WD3200AAKS-0 Rev: 12.0
    Type:   Direct-Access                    ANSI  SCSI revision: 05
  Host: scsi10 Channel: 00 Id: 00 Lun: 00
    Vendor: ATA      Model: WDC WD1200JB-00C Rev: 17.0
    Type:   Direct-Access                    ANSI  SCSI revision: 05
  Host: scsi11 Channel: 00 Id: 00 Lun: 00
    Vendor: ATA      Model: WDC WD1200JB-00E Rev: 15.0
    Type:   Direct-Access                    ANSI  SCSI revision: 05
  Host: scsi12 Channel: 00 Id: 00 Lun: 00
    Vendor: Generic  Model: STORAGE DEVICE   Rev: 0001
    Type:   Direct-Access                    ANSI  SCSI revision: 00
  Host: scsi12 Channel: 00 Id: 00 Lun: 01
    Vendor: Generic  Model: STORAGE DEVICE   Rev:

Re: INITIO scsi driver fails to work properly

2007-12-17 Thread Andrew Morton

On Mon, 17 Dec 2007 11:39:47 +0200 Filippos Papadopoulos [EMAIL PROTECTED] 
wrote:

 Hi,
 I have got an INITIO 9100 UW SCSI Controller with an IBM
 IC35L036UWD210-0 scsi hard disk on a 32 bit x86 system.
 Currently i have SUSE 10.1 (Kernel 2.6.16).
 
 I tried to install OpenSUSE 10.3 (kernel 2.6.22.5) and the latest
 OpenSUSE 11.0 Alpha 0  (kernel 2.6.24-rc4) but although the initio
 driver
 gets loaded during the installation process, yast reports that no hard
 disk is found. I believe that this isnt a bug in suse's yast but a
 problem
 in the initio scsi driver because i also tried to install Fedora 8
 (kernel 2.6.23) with the same problem.
 I have seen the relevant thread Conflict when loading initio driver
 and i suppose that the initio driver isnt fixed yet.
 I can help testing the new patches in the initio driver if someone is
 interested.

initio doesn't seem to have a maintainer...

Are you able to identify any earlier kernel which worked OK?

Maybe it's a new device?  If you can get the `lspci -vvxx' output
for that device we can take a look.
-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: 2.6.24-rc5: tape drive not responding

2007-12-17 Thread Andrew Morton

On Mon, 17 Dec 2007 16:02:02 -0500
John Stoffel [EMAIL PROTECTED] wrote:

 
 Just to confirm, the propsed patch to st.c fixes the issue with
 2.6.24-rc5 as well at 2.6.24-rc5-mm1 with access to my DLT tape
 drives.

err, what patch to st.c?

So it seems that 2.6.24 (and presumably 2.6.23?) need

1: Alan's "initio: fix conflict when loading driver" (currently stuck
   in git-scsi-misc)

2: Boaz's "initio: initio_build_scb() fix" (my name for it)

3: The mystery st.c fix.

yes?

Re: [PATCH] MAINTAINERS: remove Adam Fritzler, update his email address in other sources

2007-12-17 Thread Andrew Morton

On Mon, 17 Dec 2007 20:12:06 -0800 Joe Perches [EMAIL PROTECTED] wrote:

 Adam isn't a maintainer anymore.
 His old email address bounces.
 Update to new email address.
 
 On Mon, Dec 17, 2007 at 01:03:48PM -0800, Joe Perches wrote:
   You seem to have an old email address in the
   linux-kernel MAINTAINERS file.
   Should it be deleted or changed?
 On Mon, 2007-12-17 at 19:27 -0800, Adam Fritzler wrote:
  I am no longer actively involved. If you can mark me as a former point
  of contact, that's fine, or you can just delete the entry. My name is
  still in the source, but with the old address. It'd great if the
  address in source was updated.
 
 ...
  
 -TMS380 TOKEN-RING NETWORK DRIVER
 -P:   Adam Fritzler
 -M:   [EMAIL PROTECTED]
 -L:   [EMAIL PROTECTED]
 -W:   http://www.auk.cx/tms380tr/
 -S:   Maintained

 ...

 - * Added MCA support Adam Fritzler [EMAIL PROTECTED]
 + * Added MCA support Adam Fritzler [EMAIL PROTECTED]

This is fairly pointless - it'll just break again when Adam moves again.

Every problem can be solved with another layer of...

Please: just replace all instances with plain old "Adam Fritzler" and then
ensure that the lookup key "Adam Fritzler" has an accurate (and
non-duplicated anywhere else!) entry in MAINTAINERS or CREDITS or whatever.



btw, I cheerfully skipped all your spelling-fixes patches.  Some will have
stuck via subsystem maintainers but I have a secret "no spelling fixes
unless they're end-user-visible" policy.  That means I'll take spelling
fixes only if they're in printks or in Documentation/*.  This is a little
defense mechanism to avoid getting buried in micropatches.

I'd suggest that you find out if Adrian is still running the trivial tree
and if so, patchbomb him.

Re: [patch 18/30] scsi/qla2xxx/: possible cleanups

2007-12-14 Thread Andrew Morton

On Fri, 14 Dec 2007 07:37:24 -0800 Andrew Vasquez [EMAIL PROTECTED] wrote:

 On Thu, 13 Dec 2007, [EMAIL PROTECTED] wrote:
 
  From: Adrian Bunk [EMAIL PROTECTED]
  
  - make the following needlessly global code static:
- qla_attr.c: qla24xx_vport_delete()
- qla_attr.c: qla24xx_vport_disable()
- qla_mid.c: qla24xx_allocate_vp_id()
- qla_mid.c: qla24xx_find_vhost_by_name()
- qla_mid.c: qla2x00_do_dpc_vp()
- qla_os.c: struct qla2x00_driver_template
- qla_os.c: qla2x00_stop_timer()
- qla_os.c: qla2x00_mem_alloc()
- qla_os.c: qla2x00_mem_free()
- qla_sup.c: qla2x00_lock_nvram_access()
- qla_sup.c: qla2x00_unlock_nvram_access()
- qla_sup.c: qla2x00_get_nvram_word()
- qla_sup.c: qla2x00_write_nvram_word()
  - #if 0 the following unused global functions:
- qla_dbg.c: qla2x00_dump_pkt()
- qla_mbx.c: qla2x00_system_error()
- qla_mbx.c: qla2x00_get_serdes_params()
- qla_mbx.c: qla2x00_get_idma_speed()
- qla_mbx.c: qla24xx_get_vp_database()
- qla_mbx.c: qla24xx_get_vp_entry()
  - qla_os.c: remove some unneeded function prototypes
  
  Signed-off-by: Adrian Bunk [EMAIL PROTECTED]
  Cc: Andrew Vasquez [EMAIL PROTECTED]
  Signed-off-by: Andrew Morton [EMAIL PROTECTED]
 
 Andrew,
 
 Could you drop this patch from your queue.  I'll carry it in my tree
 (along with additional code removals) for 2.6.25 submission.
 

I'll normally carry patches until they turn up in a subsystem tree or
mainline and will drop them then.  To minimise potential of lossage..

Is your tree publicly accessible?

Re: QUEUE_FLAG_CLUSTER: not working in 2.6.24 ?

2007-12-14 Thread Andrew Morton

On Sat, 15 Dec 2007 01:09:41 + Mel Gorman [EMAIL PROTECTED] wrote:

 On (13/12/07 14:29), Andrew Morton didst pronounce:
   The simple way seems to be to malloc a large area, touch every page and
   then look at the physical pages assigned ... they now mostly seem to be
   descending in physical address.
   
  
  OIC.  -mm's /proc/pid/pagemap can be used to get the pfn's...
  
 
 I tried using pagemap to verify the patch but it triggered BUG_ON
 checks. Perhaps I am using the interface wrong but I would still not
 expect it to break in this fashion. I tried 2.6.24-rc4-mm1, 2.6.24-rc5-mm1,
 2.6.24-rc5 with just the maps4 patches applied and 2.6.23 with maps4 patches
 applied. Each time I get errors like this;
 
 [   90.108315] BUG: sleeping function called from invalid context at include/asm/uaccess_32.h:457
 [   90.211227] in_atomic():1, irqs_disabled():0
 [   90.262251] no locks held by showcontiguous/2814.
 [   90.318475] Pid: 2814, comm: showcontiguous Not tainted 2.6.24-rc5 #1
 [   90.395344]  [<c010522a>] show_trace_log_lvl+0x1a/0x30
 [   90.456948]  [<c0105bb2>] show_trace+0x12/0x20
 [   90.510173]  [<c0105eee>] dump_stack+0x6e/0x80
 [   90.563409]  [<c01205b3>] __might_sleep+0xc3/0xe0
 [   90.619765]  [<c02264fd>] copy_to_user+0x3d/0x60
 [   90.675153]  [<c01b3e9c>] add_to_pagemap+0x5c/0x80
 [   90.732513]  [<c01b43e8>] pagemap_pte_range+0x68/0xb0
 [   90.793010]  [<c0175ed2>] walk_page_range+0x112/0x210
 [   90.853482]  [<c01b47c6>] pagemap_read+0x176/0x220
 [   90.910863]  [<c0182dc4>] vfs_read+0x94/0x150
 [   90.963058]  [<c01832fd>] sys_read+0x3d/0x70
 [   91.014219]  [<c0104262>] syscall_call+0x7/0xb
 
 ...

 Just using cp to read the file is enough to cause problems but I included
 a very basic program below that produces the BUG_ON checks. Is this a known
 issue or am I using the interface incorrectly?

I'd say you're using it correctly but you've found a hitherto unknown bug. 
On i386 highmem machines with CONFIG_HIGHPTE (at least) pte_offset_map()
takes kmap_atomic(), so pagemap_pte_range() can't do copy_to_user() as it
presently does.

Drat.

Still, that shouldn't really disrupt the testing which you're doing.  You
could disable CONFIG_HIGHPTE to shut it up.
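
For what it's worth, the usual way out of this kind of problem is to gather
the entries into a small kernel-side buffer while the atomic mapping is held
and to do the copy to userspace only after it has been dropped.  The fragment
below is a userspace mock of that pattern, not the fix that was actually
merged: in_atomic_section, mock_copy_to_user(), walk_and_copy() and BATCH are
all made-up names for illustration.

```c
#include <stddef.h>
#include <string.h>

static int in_atomic_section;	/* mock for "kmap_atomic() is held" */

/* Mock copy_to_user(): must never run inside the atomic section. */
static int mock_copy_to_user(unsigned long *dst, const unsigned long *src,
			     size_t n)
{
	if (in_atomic_section)
		return -1;	/* the real thing could sleep here */
	memcpy(dst, src, n * sizeof(*src));
	return 0;
}

#define BATCH 8

/*
 * Copy n entries to the "user" buffer in batches: fill a small on-stack
 * buffer while atomic, then leave the atomic section before copying out.
 */
static int walk_and_copy(const unsigned long *ptes, size_t n,
			 unsigned long *ubuf)
{
	unsigned long tmp[BATCH];
	size_t done = 0;

	while (done < n) {
		size_t i, chunk = n - done < BATCH ? n - done : BATCH;

		in_atomic_section = 1;		/* stands in for pte_offset_map() */
		for (i = 0; i < chunk; i++)
			tmp[i] = ptes[done + i];
		in_atomic_section = 0;		/* stands in for pte_unmap() */

		if (mock_copy_to_user(ubuf + done, tmp, chunk))
			return -1;
		done += chunk;
	}
	return 0;
}
```

The batch size only bounds how much stack the temporary buffer uses; the
point is that nothing which may sleep runs between the map and the unmap.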

Re: QUEUE_FLAG_CLUSTER: not working in 2.6.24 ?

2007-12-13 Thread Andrew Morton

On Thu, 13 Dec 2007 21:09:59 +0100
Jens Axboe [EMAIL PROTECTED] wrote:


 OK, it's a vm issue,

cc linux-mm and probable culprit.

  I have tens of thousand backward pages after a
 boot - IOW, bvec->bv_page is the page before bvprv->bv_page, not
 reverse. So it looks like that bug got reintroduced.

Bill Irwin fixed this a couple of years back: changed the page allocator so
that it mostly hands out pages in ascending physical-address order.

I guess we broke that, quite possibly in Mel's page allocator rework.

It would help if you could provide us with a simple recipe for
demonstrating this problem, please.


Re: QUEUE_FLAG_CLUSTER: not working in 2.6.24 ?

2007-12-13 Thread Andrew Morton

On Thu, 13 Dec 2007 17:15:06 -0500
James Bottomley [EMAIL PROTECTED] wrote:

 
 On Thu, 2007-12-13 at 14:02 -0800, Andrew Morton wrote:
  On Thu, 13 Dec 2007 21:09:59 +0100
  Jens Axboe [EMAIL PROTECTED] wrote:
  
  
   OK, it's a vm issue,
  
  cc linux-mm and probable culprit.
  
I have tens of thousand backward pages after a
   boot - IOW, bvec->bv_page is the page before bvprv->bv_page, not
   reverse. So it looks like that bug got reintroduced.
  
  Bill Irwin fixed this a couple of years back: changed the page allocator so
  that it mostly hands out pages in ascending physical-address order.
  
  I guess we broke that, quite possibly in Mel's page allocator rework.
  
  It would help if you could provide us with a simple recipe for
  demonstrating this problem, please.
 
 The simple way seems to be to malloc a large area, touch every page and
 then look at the physical pages assigned ... they now mostly seem to be
 descending in physical address.
 

OIC.  -mm's /proc/pid/pagemap can be used to get the pfn's...
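
A rough userspace sketch of that recipe: allocate, touch every page, read the
PFNs back and count how many neighbours go backwards.  It assumes the pagemap
entry layout of current mainline kernels (one 64-bit entry per virtual page,
bit 63 = present, bits 0-54 = PFN), which may well differ from what the -mm
tree exports today.  Note also that recent kernels report the PFN as zero
unless the reader has CAP_SYS_ADMIN.  The function names are made up for this
example.

```c
#define _XOPEN_SOURCE 700
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* Assumed entry layout: bit 63 = page present, bits 0-54 = PFN. */
#define PM_PRESENT	(1ULL << 63)
#define PM_PFN_MASK	((1ULL << 55) - 1)

/* Count neighbouring pairs whose PFN goes down instead of up. */
size_t count_backward(const uint64_t *pfn, size_t n)
{
	size_t i, backward = 0;

	for (i = 1; i < n; i++)
		if (pfn[i] < pfn[i - 1])
			backward++;
	return backward;
}

/*
 * Allocate npages, touch each one, and read the PFNs back from
 * /proc/self/pagemap.  Returns the number of PFNs stored in out, or -1.
 */
long sample_pfns(size_t npages, uint64_t *out)
{
	long psz = sysconf(_SC_PAGESIZE);
	unsigned char *buf;
	void *p;
	size_t i, got = 0;
	int fd;

	if (psz <= 0 || posix_memalign(&p, (size_t)psz, npages * (size_t)psz))
		return -1;
	buf = p;
	for (i = 0; i < npages; i++)
		buf[i * (size_t)psz] = 1;	/* fault the page in */

	fd = open("/proc/self/pagemap", O_RDONLY);
	if (fd < 0) {
		free(buf);
		return -1;
	}
	for (i = 0; i < npages; i++) {
		uint64_t ent;
		uintptr_t vpn = ((uintptr_t)buf + i * (size_t)psz) / (size_t)psz;
		off_t off = (off_t)(vpn * sizeof(ent));

		if (pread(fd, &ent, sizeof(ent), off) != (ssize_t)sizeof(ent))
			break;
		if (ent & PM_PRESENT)
			out[got++] = ent & PM_PFN_MASK;
	}
	close(fd);
	free(buf);
	return (long)got;
}
```

Calling sample_pfns() for a few thousand pages and passing the result to
count_backward() gives a single number to compare between kernels; a mostly
ascending allocator should keep it small.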

Re: QUEUE_FLAG_CLUSTER: not working in 2.6.24 ?

2007-12-13 Thread Andrew Morton

On Thu, 13 Dec 2007 19:30:00 -0500
Mark Lord [EMAIL PROTECTED] wrote:

 Here's the commit that causes the regression:
 
 ...

 --- a/mm/page_alloc.c
 +++ b/mm/page_alloc.c
 @@ -760,7 +760,8 @@ static int rmqueue_bulk(struct zone *zone, unsigned int 
 order,
   struct page *page = __rmqueue(zone, order, migratetype);
   if (unlikely(page == NULL))
   break;
 - list_add_tail(&page->lru, list);
 + list_add(&page->lru, list);

well that looks fishy.

Re: [PATCH] fix page_alloc for larger I/O segments

2007-12-13 Thread Andrew Morton

On Thu, 13 Dec 2007 19:40:09 -0500
Mark Lord [EMAIL PROTECTED] wrote:

 And here is a patch that seems to fix it for me here:
 
 * * * *
 
 Fix page allocator to give better change of larger contiguous segments 
 (again).
 
 Signed-off-by: Mark Lord [EMAIL PROTECTED]
 ---
 
 
 --- old/mm/page_alloc.c.orig  2007-12-13 19:25:15.0 -0500
 +++ linux-2.6/mm/page_alloc.c 2007-12-13 19:35:50.0 -0500
 @@ -954,7 +954,7 @@
   goto failed;
   }
   /* Find a page of the appropriate migrate type */
 - list_for_each_entry(page, &pcp->list, lru) {
 + list_for_each_entry_reverse(page, &pcp->list, lru) {
   if (page_private(page) == migratetype) {
   list_del(&page->lru);
   pcp->count--;

- needs help to make it apply to mainline

- needs a comment, methinks...


--- a/mm/page_alloc.c~fix-page-allocator-to-give-better-chance-of-larger-contiguous-segments-again
+++ a/mm/page_alloc.c
@@ -1060,8 +1060,12 @@ again:
goto failed;
}
 
-   /* Find a page of the appropriate migrate type */
-   list_for_each_entry(page, &pcp->list, lru)
+   /*
+* Find a page of the appropriate migrate type.  Doing a
+* reverse-order search here helps us to hand out pages in
+* ascending physical-address order.
+*/
+   list_for_each_entry_reverse(page, &pcp->list, lru)
if (page_private(page) == migratetype)
break;
 
_


Re: [PATCH] fix page_alloc for larger I/O segments (improved)

2007-12-13 Thread Andrew Morton

On Thu, 13 Dec 2007 19:57:29 -0500
James Bottomley [EMAIL PROTECTED] wrote:

 
 On Thu, 2007-12-13 at 19:46 -0500, Mark Lord wrote:
  Improved version, more similar to the 2.6.23 code:
  
  Fix page allocator to give better chance of larger contiguous segments 
  (again).
  
  Signed-off-by: Mark Lord [EMAIL PROTECTED]
  ---
  
  --- old/mm/page_alloc.c 2007-12-13 19:25:15.0 -0500
  +++ linux-2.6/mm/page_alloc.c   2007-12-13 19:43:07.0 -0500
  @@ -760,7 +760,7 @@
  struct page *page = __rmqueue(zone, order, migratetype);
  if (unlikely(page == NULL))
  break;
  -   list_add(&page->lru, list);
  +   list_add_tail(&page->lru, list);
 
 Could we put a big comment above this explaining to the would be vm
 tweakers why this has to be a list_add_tail, so we don't end up back in
 this position after another two years?
 

Already done ;)

--- a/mm/page_alloc.c~fix-page_alloc-for-larger-i-o-segments-fix
+++ a/mm/page_alloc.c
@@ -847,6 +847,10 @@ static int rmqueue_bulk(struct zone *zon
struct page *page = __rmqueue(zone, order, migratetype);
if (unlikely(page == NULL))
break;
+   /*
+* Doing a list_add_tail() here helps us to hand out pages in
+* ascending physical-address order.
+*/
list_add_tail(&page->lru, list);
set_page_private(page, migratetype);
}
_
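
To see why the choice between list_add() and list_add_tail() matters here,
the toy below re-implements just enough of a list.h-style list to run in
userspace.  It is an illustration under one assumption taken from this
thread, namely that the pages arrive in ascending physical order; the struct
page here is a stand-in with a made-up pfn field, and first_pfn()/last_pfn()
are helper names invented for the example.

```c
#include <stddef.h>

/* Minimal circular doubly linked list in the style of the kernel's list.h. */
struct list_head { struct list_head *next, *prev; };

/* Stand-in for struct page: only what the example needs. */
struct page { unsigned long pfn; struct list_head lru; };

static void list_init(struct list_head *h) { h->next = h->prev = h; }

static void insert_between(struct list_head *n, struct list_head *prev,
			   struct list_head *next)
{
	next->prev = n;
	n->next = next;
	n->prev = prev;
	prev->next = n;
}

/* Insert right after the head (stack behaviour). */
static void list_add(struct list_head *n, struct list_head *h)
{
	insert_between(n, h, h->next);
}

/* Insert right before the head, i.e. at the tail (queue behaviour). */
static void list_add_tail(struct list_head *n, struct list_head *h)
{
	insert_between(n, h->prev, h);
}

#define page_of(p) ((struct page *)((char *)(p) - offsetof(struct page, lru)))

/* PFN of the first page a forward walk from the head would find. */
static unsigned long first_pfn(struct list_head *h)
{
	return page_of(h->next)->pfn;
}

/* PFN of the first page a reverse walk from the head would find. */
static unsigned long last_pfn(struct list_head *h)
{
	return page_of(h->prev)->pfn;
}
```

Filling a list with ascending PFNs via list_add() leaves the highest PFN at
the head, so a forward search hands pages out in descending order; either
walking that list in reverse or filling it with list_add_tail() puts the
lowest PFN first again.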


Re: broken dpt_i2o in 2.6.23 (was: ext2 check page: bad entry in directory) (fwd)

2007-12-12 Thread Andrew Morton

On Wed, 12 Dec 2007 11:58:41 +0100 Anders Henke [EMAIL PROTECTED] wrote:

 Hi,
 
 I'd like to let you now that my boxes are running a 32-bit kernel, so
 the 64-bit-uncleanliness shouldn't apply to my boxes; however,
 
 http://www.miquels.cistron.nl/linux/dpt_i2o-64bit-2.6.23.patch
 
 fixed the issue on my testbox.
 
 I took a clean 2.6.23, applied patch, recompiled the kernel, reboot: works.

What a huge patch :(

We already reverted the offending patch so I assume that 2.6.24-rc5 is
working for you?

I guess we need to look at restoring "dpt_i2o: convert to SCSI hotplug
model" and then absorbing what Miquel has done there.


Re: broken dpt_i2o in 2.6.23 (was: ext2 check page: bad entry in directory) (fwd)

2007-12-12 Thread Andrew Morton

On Wed, 12 Dec 2007 14:43:42 +0100 Anders Henke [EMAIL PROTECTED] wrote:

 Am 12.12.2007 schrieb Miquel van Smoorenburg:
  On Wed, 2007-12-12 at 03:38 -0800, Andrew Morton wrote:
   On Wed, 12 Dec 2007 11:58:41 +0100 Anders Henke [EMAIL PROTECTED] wrote:
   
Hi,

I'd like to let you now that my boxes are running a 32-bit kernel, so
the 64-bit-uncleanliness shouldn't apply to my boxes; however,

http://www.miquels.cistron.nl/linux/dpt_i2o-64bit-2.6.23.patch

fixed the issue on my testbox.

I took a clean 2.6.23, applied patch, recompiled the kernel, reboot: 
works.
   
   What a huge patch :(
   
   We already reverted the offening patch so I assume that 2.6.24-rc5 is
   working for you?
   
   I guess we need to look at restoring dpt_i2o: convert to SCSI hotplug
   model and then absorbing what Miquel has done there.
  
  This was just a patch I had lying around, if it worked it would confirm
  my suspicion, which it has.
  
  The minimal patch which is suitable for 2.6.23-stable and 2.6.24 would
  be the attached one-liner. The dpt_i2o: convert to SCSI hotplug model
  patch could be restored then.
  
  (if the list eats the attachment, it's also available here:
  http://www.miquels.cistron.nl/linux/linux-2.6.23+24-dpt_i2o-dma64.patch 
  )
  
  Anders, does this one-liner patch work for you ?
 
 Got it - and it works!
 
 I took a clean 2.6.23, applied the patch, recompiled the kernel and
 rebooted my testbox: came up with the fresh-compiled kernel 
 (verified by uname -a).
 

That looks appropriate for 2.6.23.x:

--- linux-2.6.23.9.orig/drivers/scsi/dpt_i2o.c  2007-11-26 18:51:43.0 +0100
+++ linux-2.6.23.9/drivers/scsi/dpt_i2o.c   2007-12-12 13:21:05.0 +0100
@@ -905,8 +905,7 @@
}
 
pci_set_master(pDev);
-   if (pci_set_dma_mask(pDev, DMA_64BIT_MASK) &&
-   pci_set_dma_mask(pDev, DMA_32BIT_MASK))
+   if (pci_set_dma_mask(pDev, DMA_32BIT_MASK))
return -EINVAL;
 
base_addr0_phys = pci_resource_start(pDev,0);


However it is a bit mystifying that
55d9fcf57ba5ec427544fca7abc335cf3da78160 would cause a dma mask problem
(isn't it?)

The scsi people might want to restore
55d9fcf57ba5ec427544fca7abc335cf3da78160 and then apply Miquel's patch on
top for 2.6.24, or do it for 2.6.25?

Re: [PATCH acs_ame scsi driver 000 of 1] Introduction

2007-12-11 Thread Andrew Morton

On Tue, 11 Dec 2007 17:01:48 -0800
Andrew Morton [EMAIL PROTECTED] wrote:

 More hm.  It was from [EMAIL PROTECTED] which perhaps means that
 some attempt to recall it was made.  Oh well.

argh.  [EMAIL PROTECTED] really is our Jeff's email address.  And I
went and cc'ed [EMAIL PROTECTED] on my reply, only that person is an
innocent civilian.  Bad me.  Please remove [EMAIL PROTECTED] from any
replies.  Thanks.


Re: [BUG] 2.6.23-rc3 can't see sd partitions on Alpha

2007-12-07 Thread Andrew Morton

On Thu, 6 Dec 2007 23:07:08 -0600 (CST) [EMAIL PROTECTED] (Bob Tracy) wrote:

 Andrew Morton wrote:
  commit 6f37ac793d6ba7b35d338f791974166f67fdd9ba
  Merge: 2f1f53b... d90bf5a...
  Author: Linus Torvalds [EMAIL PROTECTED]
  Date:   Wed Nov 14 18:51:48 2007 -0800
  
  Merge branch 'master' of 
  master.kernel.org:/pub/scm/linux/kernel/git/davem/n
  
  * 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6:
[NET]: rt_check_expire() can take a long time, add a cond_resched()
[ISDN] sc: Really, really fix warning
[ISDN] sc: Fix sndpkt to have the correct number of arguments
[TCP] FRTO: Clear frto_highmark only after process_frto that uses it
[NET]: Remove notifier block from chain when 
  register_netdevice_notifier f
[FS_ENET]: Fix module build.
[TCP]: Make sure write_queue_from does not begin with NULL ptr
[TCP]: Fix size calculation in sk_stream_alloc_pskb
[S2IO]: Fixed memory leak when MSI-X vector allocation fails
[BONDING]: Fix resource use after free
[SYSCTL]: Fix warning for token-ring from sysctl checker
[NET] random : secure_tcp_sequence_number should not assume 
  CONFIG_KTIME_S
[IWLWIFI]: Not correctly dealing with hotunplug.
[TCP] FRTO: Plug potential LOST-bit leak
[TCP] FRTO: Limit snd_cwnd if TCP was application limited
[E1000]: Fix schedule while atomic when called from mii-tool.
[NETX]: Fix build failure added by 2.6.24 statistics cleanup.
[EP93xx_ETH]: Build fix after 2.6.24 NAPI changes.
[PKT_SCHED]: Check subqueue status before calling hard_start_xmit
  
  I'm struggling to see how any of those could have broken block device
  mounting on alpha.  Are you sure you bisected right?
 
 Based on what's in that commit, it *does* appear something went wrong
 with bisection.  If the implicated commit is the next one in time
 sequence relative to
 
 # good: [2f1f53bdc6531696934f6ee7bbdfa2ab4f4f62a3] CRISv10 fasttimer: Scrap 
 INLINE and name timeval_cmp better
 
 then the test of whether I bisected correctly is as simple as applying
 the commit and seeing if things break, because I'm running on the
 kernel corresponding to 2f1f53bdc6531696934f6ee7bbdfa2ab4f4f62a3 right
 now.  Let me give that a try and I'll report back.  Worst case, I'll
 have to start over and write off the past four days...

Gad.  I trust the second time will be faster.

git-bisect _is_ very error prone.  I find one of the problems is that each
step is so far apart in time that you forget what you were doing.  Did I
remember to test that iteration?  Did I install the right kernel?  etc.

 Sorry about this...

Not appropriate ;)   Thanks for helping out.

Re: 2.6.24-rc4-mm1 and excessive block IO errors

2007-12-07 Thread Andrew Morton

On Fri, 07 Dec 2007 20:44:45 +
Zan Lynx [EMAIL PROTECTED] wrote:

 I am not sure if this problem has been addressed already.  I read some
 about the fast-fail issues and this may be related?
 
 On nearly all my USB block devices, I have been getting zillions of I/O
 errors.  But they aren't real, they don't appear with 2.6.23 kernels.
 
 I can often read and write data to the device, but these IO errors cause
 error aborts in user space applications in many cases, making it a
 chancy thing to run backup software, for example.
 
 Here is a bit of dmesg from plugging in a perfectly good USB-2 flash
 drive.
 
 hub 3-0:1.0: state 7 ports 6 chg  evt 0004
 ehci_hcd :00:02.2: GetStatus port 2 status 001803 POWER sig=j CSC CONNECT
 hub 3-0:1.0: port 2, status 0501, change 0001, 480 Mb/s
 hub 3-0:1.0: debounce: port 2: total 100ms stable 100ms status 0x501
 ehci_hcd :00:02.2: port 2 high speed
 ehci_hcd :00:02.2: GetStatus port 2 status 001005 POWER sig=se0 PE CONNECT
 usb 3-2: new high speed USB device using ehci_hcd and address 9
 ehci_hcd :00:02.2: port 2 high speed
 ehci_hcd :00:02.2: GetStatus port 2 status 001005 POWER sig=se0 PE CONNECT
 usb 3-2: default language 0x0409
 usb 3-2: uevent
 usb 3-2: usb_probe_device
 usb 3-2: configuration #1 chosen from 1 choice
 usb 3-2: adding 3-2:1.0 (config #1, interface 0)
 usb 3-2:1.0: uevent
 libusual 3-2:1.0: usb_probe_interface
 libusual 3-2:1.0: usb_probe_interface - got id
 usb-storage 3-2:1.0: usb_probe_interface
 usb-storage 3-2:1.0: usb_probe_interface - got id
 scsi4 : SCSI emulation for USB Mass Storage devices
 drivers/usb/core/inode.c: creating file '009'
 usb 3-2: New USB device found, idVendor=05dc, idProduct=a400
 usb 3-2: New USB device strings: Mfr=1, Product=2, SerialNumber=3
 usb 3-2: Product: JUMPDRIVE
 usb 3-2: Manufacturer: LEXAR MEDIA
 usb 3-2: SerialNumber: 0A4EEC05201219080904
 usb-storage: device found at 9
 usb-storage: waiting for device to settle before scanning
 usb-storage: device scan complete
 scsi 4:0:0:0: Direct-Access LEXARJUMPDRIVE1000 PQ: 0 ANSI: 0 
 CCS
 sd 4:0:0:0: [sdg] 2026592 512-byte hardware sectors (1038 MB)
 sd 4:0:0:0: [sdg] Write Protect is off
 sd 4:0:0:0: [sdg] Mode Sense: 43 00 00 00
 sd 4:0:0:0: [sdg] Assuming drive cache: write through
 sd 4:0:0:0: [sdg] 2026592 512-byte hardware sectors (1038 MB)
 sd 4:0:0:0: [sdg] Write Protect is off
 sd 4:0:0:0: [sdg] Mode Sense: 43 00 00 00
 sd 4:0:0:0: [sdg] Assuming drive cache: write through
  sdg: sdg1
 sd 4:0:0:0: [sdg] Attached SCSI removable disk
 sd 4:0:0:0: Attached scsi generic sg7 type 0
 sd 4:0:0:0: [sdg] Result: hostbyte=0x01 driverbyte=0x00
 end_request: I/O error, dev sdg, sector 3984

Yes, this is breakage in the scsi tree.  I believe that the offending patch
has been found and I have a nasty fix somewhere in my inbox - it involves
reverting a patch which doesn't revert properly.  I haven't got onto
looking at it yet, sorry.

Re: BUG 2.6.24-rc4-mm1 -- Boot still hangs w/ async scsi scan

2007-12-06 Thread Andrew Morton

On Thu, 06 Dec 2007 13:14:22 -0500 Lee Schermerhorn [EMAIL PROTECTED] wrote:

 On Wed, 2007-12-05 at 13:20 -0800, Andrew Morton wrote:
  On Wed, 05 Dec 2007 11:36:39 -0500
  Lee Schermerhorn [EMAIL PROTECTED] wrote:
  
   As reported here:
   
 http://marc.info/?l=linux-scsi&m=119645761124683&w=4
   
   against 24-rc3-mm2, I'm still seeing the hang on my HP ia64 NUMA
   platform under 24-rc4-mm1 with async scsi scan enabled.  I'm still
   seeing the message "mptspi: ioc#: mpt_config failed" when it hangs. 
   
   I can boot by disabling async scan.  However, I've also noticed some
   disks attached via one of the mpt adapters [scsi8 in console log in
   message linked above] going off-line during stress tests.  This was
   under 24-rc3-mm2.  Haven't got that far yet with 24-rc4-mm1.
   
  
  Is there any way of tricking you into
  http://www.zip.com.au/~akpm/linux/patches/stuff/bisecting-mm-trees.txt?
  
  Obvious culprits to start with would be git-scsi-misc and maybe
  scsi-early-detection-of-medium-not-present-updated.patch.  But there are
  only 20-odd scsi patches in there.
 
 The reported hang occurs after pushing the git-scsi-misc patch.

OK, thanks.

  I'm
 looking into it now, but it's rather large and I'm a neophyte in this
 area.  If James can point me at a broken-out quilt series for this
 patch, I'd be willing to try to bisect that--

I doubt if such a thing exists.

 assuming that it IS
 bisectable.

Often git trees are not bisectable.  But they should be.

Your best bet is to do a git-bisect on
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6.git

http://www.kernel.org/doc/local/git-quick.html
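
For readers new to the procedure: bisection is a binary search over an ordered series of commits (or patches) for the first one that shows the bug. The following is only a minimal sketch in C, assuming a linear series in which every commit after the first bad one is also bad; `first_bad`, `is_bad_fn` and `bad_from` are hypothetical names for illustration and are not part of git.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical test callback: nonzero if commit `idx` shows the bug. */
typedef int (*is_bad_fn)(size_t idx, void *ctx);

/*
 * Find the first bad commit in the range (good, bad], where `good` is
 * known good and `bad` is known bad.  Assumes a linear history in which
 * every commit after the first bad one is also bad.  The number of
 * commits that had to be tested is stored in *steps if steps != NULL.
 */
size_t first_bad(size_t good, size_t bad, is_bad_fn is_bad,
		 void *ctx, unsigned *steps)
{
	unsigned n = 0;

	assert(good < bad);
	while (bad - good > 1) {
		size_t mid = good + (bad - good) / 2;

		if (is_bad(mid, ctx))
			bad = mid;	/* bug already present at mid */
		else
			good = mid;	/* bug introduced after mid */
		n++;
	}
	if (steps)
		*steps = n;
	return bad;
}

/* Example predicate: ctx points at the index of the offending commit. */
int bad_from(size_t idx, void *ctx)
{
	return idx >= *(size_t *)ctx;
}
```

The number of test boots grows with the logarithm of the series length, so a series of 20-odd patches should need about five builds, provided each intermediate state compiles and boots.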


Re: [BUG] 2.6.23-rc3 can't see sd partitions on Alpha

2007-12-06 Thread Andrew Morton

On Thu, 6 Dec 2007 18:16:12 -0600 (CST)
[EMAIL PROTECTED] (Bob Tracy) wrote:

 OK.  Finally have this thing painted into a corner: git has identified
 6f37ac793d6ba7b35d338f791974166f67fdd9ba as the first bad commit.
 
 From git bisect log, this corresponds to 
 
 # bad: [6f37ac793d6ba7b35d338f791974166f67fdd9ba] Merge branch 'master' of 
 master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
 
 Here's the full log:
 
 git-bisect start
 # good: [9aae299f7fd1888ea3a195cfe0edef17bb647415] Linux 2.6.24-rc2
 git-bisect good 9aae299f7fd1888ea3a195cfe0edef17bb647415
 # bad: [f05092637dc0d9a3f2249c9b283b973e6e96b7d2] Linux 2.6.24-rc3
 git-bisect bad f05092637dc0d9a3f2249c9b283b973e6e96b7d2
 # good: [e6a5c27f3b0fef72e528fc35e343af4b2db790ff] Merge branch 'for-linus' 
 of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm
 git-bisect good e6a5c27f3b0fef72e528fc35e343af4b2db790ff
 # good: [42614fcde7bfdcbe43a7b17035c167dfebc354dd] vmstat: fix section 
 mismatch warning
 git-bisect good 42614fcde7bfdcbe43a7b17035c167dfebc354dd
 # bad: [a052f4473603765eb6b4c19754689977601dc1d1] Merge 
 git://git.kernel.org/pub/scm/linux/kernel/git/sam/x86
 git-bisect bad a052f4473603765eb6b4c19754689977601dc1d1
 # good: [d8e5219f9f5ca7518eb820db9f3d287a1d46fcf5] CRISv10 improve and bugfix 
 fasttimer
 git-bisect good d8e5219f9f5ca7518eb820db9f3d287a1d46fcf5
 # good: [d90bf5a976793edfa88d3bb2393f0231eb8ce1e5] [NET]: rt_check_expire() 
 can take a long time, add a cond_resched()
 git-bisect good d90bf5a976793edfa88d3bb2393f0231eb8ce1e5
 # good: [2a113281f5cd2febbab21a93c8943f8d3eece4d3] kconfig: use $K64BIT to 
 set 64BIT with all*config targets
 git-bisect good 2a113281f5cd2febbab21a93c8943f8d3eece4d3
 # good: [2e2cd8bad6e03ceea73495ee6d557044213d95de] CRISv10 memset library add 
 lineendings to asm
 git-bisect good 2e2cd8bad6e03ceea73495ee6d557044213d95de
 # bad: [6f37ac793d6ba7b35d338f791974166f67fdd9ba] Merge branch 'master' of 
 master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
 git-bisect bad 6f37ac793d6ba7b35d338f791974166f67fdd9ba
 # good: [2f1f53bdc6531696934f6ee7bbdfa2ab4f4f62a3] CRISv10 fasttimer: Scrap 
 INLINE and name timeval_cmp better
 git-bisect good 2f1f53bdc6531696934f6ee7bbdfa2ab4f4f62a3

commit 6f37ac793d6ba7b35d338f791974166f67fdd9ba
Merge: 2f1f53b... d90bf5a...
Author: Linus Torvalds [EMAIL PROTECTED]
Date:   Wed Nov 14 18:51:48 2007 -0800

Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/n

* 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6:
  [NET]: rt_check_expire() can take a long time, add a cond_resched()
  [ISDN] sc: Really, really fix warning
  [ISDN] sc: Fix sndpkt to have the correct number of arguments
  [TCP] FRTO: Clear frto_highmark only after process_frto that uses it
  [NET]: Remove notifier block from chain when register_netdevice_notifier f
  [FS_ENET]: Fix module build.
  [TCP]: Make sure write_queue_from does not begin with NULL ptr
  [TCP]: Fix size calculation in sk_stream_alloc_pskb
  [S2IO]: Fixed memory leak when MSI-X vector allocation fails
  [BONDING]: Fix resource use after free
  [SYSCTL]: Fix warning for token-ring from sysctl checker
  [NET] random : secure_tcp_sequence_number should not assume CONFIG_KTIME_S
  [IWLWIFI]: Not correctly dealing with hotunplug.
  [TCP] FRTO: Plug potential LOST-bit leak
  [TCP] FRTO: Limit snd_cwnd if TCP was application limited
  [E1000]: Fix schedule while atomic when called from mii-tool.
  [NETX]: Fix build failure added by 2.6.24 statistics cleanup.
  [EP93xx_ETH]: Build fix after 2.6.24 NAPI changes.
  [PKT_SCHED]: Check subqueue status before calling hard_start_xmit

I'm struggling to see how any of those could have broken block device
mounting on alpha.  Are you sure you bisected right?


Re: everything in wait_for_completion, what is my system doing?

2007-12-06 Thread Andrew Morton

On Wed, 5 Dec 2007 21:44:54 +0100
Bernd Schubert [EMAIL PROTECTED] wrote:

 after scsi-recovery a system here went into some kind lock-up, everything 
 seems to be in wait_for_completion(). Please see the attached 
 blocked_states.txt and all_states.txt files.
 This is 2.6.22.12, I can easily find out the line numbers if required.
 
 Any help is highly appreciated.
 
 

Please cc linux-scsi on scsi-related reports.

 
 
 [blocked_states.txt  text/plain (20.5KB)]
 [generate break]
 [ 1818.566436] SysRq : Show Blocked State
 [ 1818.570260]
 [ 1818.570261]  free
 sibling
 [ 1818.579253]   task PCstack   pid father child 
 younger older
 [ 1818.586987] events/7  D 0155dd642280 026  2 (L-TLB)
 [ 1818.593747]  81012b529ac0 0046  
 810128280d18
 [ 1818.601321]  8100ba2376f8 81012b689630 81012aff76b0 
 00078023e215
 [ 1818.608870]  00010003ca14  810001065400 
 000780430c13
 [ 1818.616222] Call Trace:
 [ 1818.618925]  [804ececb] io_schedule+0x28/0x36
 [ 1818.624207]  [8036e517] get_request_wait+0x104/0x158
 [ 1818.630112]  [8036e5a1] blk_get_request+0x36/0x6b
 [ 1818.635755]  [8042f5cb] scsi_execute+0x51/0x129
 [ 1818.641240]  [880cc11b] :scsi_transport_spi:spi_execute+0x87/0xf8
 [ 1818.648271]  [880cd5ae] 
 :scsi_transport_spi:spi_dv_device_echo_buffer+0x181/0x27d
 [ 1818.656739]  [880cd801] 
 :scsi_transport_spi:spi_dv_retrain+0x4e/0x240
 [ 1818.664139]  [880ce008] 
 :scsi_transport_spi:spi_dv_device+0x615/0x69c
 [ 1818.671542]  [880f16d1] :mptspi:mptspi_dv_device+0xb3/0x14b
 [ 1818.678042]  [880f27d3] 
 :mptspi:mptspi_dv_renegotiate_work+0xcb/0xef
 [ 1818.685348]  [80245bb8] run_workqueue+0x8e/0x120
 [ 1818.690905]  [80245d50] worker_thread+0x106/0x117
 [ 1818.696540]  [80249672] kthread+0x4b/0x82
 [ 1818.701474]  [8020ab28] child_rip+0xa/0x12
 [ 1818.706495]
 [ 1818.708022] unionfs-fuse- D 01a76ef63463 0  1119  1 (NOTLB)
 [ 1818.714764]  810129765988 0082  
 80337e22
 [ 1818.722329]  8101297658c8 81012b652f20 810129eec810 
 0006
 [ 1818.729895]  00010005204e  81000105c400 
 000680337c3e
 [ 1818.737249] Call Trace:
 [ 1818.739953]  [804ecfba] schedule_timeout+0x8a/0xb6
 [ 1818.745673]  [804ecf01] io_schedule_timeout+0x28/0x36
 [ 1818.751664]  [8026fba7] congestion_wait+0x9d/0xc2
 [ 1818.757300]  [80269b24] 
 balance_dirty_pages_ratelimited_nr+0x196/0x22f
 [ 1818.764781]  [80265a3f] generic_file_buffered_write+0x52a/0x60d
 [ 1818.771641]  [80266210] 
 __generic_file_aio_write_nolock+0x45a/0x491
 [ 1818.778852]  [802662a8] generic_file_aio_write+0x61/0xc1
 [ 1818.785101]  [8032eb94] nfs_file_write+0x138/0x1b7
 [ 1818.790822]  [8028d222] do_sync_write+0xcc/0x112
 [ 1818.796372]  [8028d32b] vfs_write+0xc3/0x165
 [ 1818.801575]  [8028d5df] sys_pwrite64+0x68/0x96
 [ 1818.806959]  [80209d0e] system_call+0x7e/0x83
 [ 1818.812250]  [2b4eeec3ea73]

 [snippage]


Possibly your device driver had conniptions and stopped generating
completion interrupts.

Which driver is in use?

I don't suppose it is repeatable.

Re: BUG 2.6.24-rc4-mm1 -- Boot still hangs w/ async scsi scan

2007-12-05 Thread Andrew Morton

On Wed, 05 Dec 2007 11:36:39 -0500
Lee Schermerhorn [EMAIL PROTECTED] wrote:

 As reported here:
 
   http://marc.info/?l=linux-scsi&m=119645761124683&w=4
 
 against 24-rc3-mm2, I'm still seeing the hang on my HP ia64 NUMA
 platform under 24-rc4-mm1 with async scsi scan enabled.  I'm still
 seeing the message "mptspi: ioc#: mpt_config failed" when it hangs. 
 
 I can boot by disabling async scan.  However, I've also noticed some
 disks attached via one of the mpt adapters [scsi8 in console log in
 message linked above] going off-line during stress tests.  This was
 under 24-rc3-mm2.  Haven't got that far yet with 24-rc4-mm1.
 

Is there any way of tricking you into
http://www.zip.com.au/~akpm/linux/patches/stuff/bisecting-mm-trees.txt?

Obvious culprits to start with would be git-scsi-misc and maybe
scsi-early-detection-of-medium-not-present-updated.patch.  But there are
only 20-odd scsi patches in there.


Re: broken dpt_i2o in 2.6.23 (was: ext2_check_page: bad entry in directory)

2007-12-05 Thread Andrew Morton

On Thu, 06 Dec 2007 14:49:37 +0900 FUJITA Tomonori [EMAIL PROTECTED] wrote:

drivers/scsi/dpt_i2o.c |  132 ++-
drivers/scsi/dpti.h|9 ++
2 files changed, 68 insertions(+), 73 deletions(-)
  
  I've done the following:
  
  -untared a clean 2.6.24-rc4 and compiled it with my 2.6.23.1-settings in 
  order
   to verify that the driver is still broken: checked, the box still won't 
   boot.
  
  -patched the just compiled kernel source with your patch, make dist-clean
   (by means of make-kpkg clean) and recompile: box boots fine.
 
  I've put the captured console logs to
  http://w.sysiphus.de/dpt_i2o/bootlog.2624-rc4-pristine
  http://w.sysiphus.de/dpt_i2o/bootlog.2624-rc4-patched
  ... and the kernelconfig (which shouldn't matter) to
  http://w.sysiphus.de/dpt_i2o/kernelconfig.2624-rc4
 
 Thanks for testing. So reverting Matthew's hotplug patch fixes the
 problem though I have no idea how the patch leads to this. Seems that
 nobody has any clue on that. We need to revert that patch for the
 moment.

OK, thanks.  Let's leave it a couple of days for people to register objections,
have bright ideas, etc.


Re: broken dpt_i2o in 2.6.23 (was: ext2_check_page: bad entry in directory)

2007-12-04 Thread Andrew Morton

On Thu, 29 Nov 2007 13:31:50 +0100
Anders Henke [EMAIL PROTECTED] wrote:

 On November 28 2007, Anders Henke wrote:
  As everything being reported as zero is quite odd and Jan took a
  guess that it might be block-layer or driver-related, I've assumed
  that the driver is responsible for this; just out of curiosity, 
  I've manually replaced the dpt_i2o driver by the 2.6.19 one by copying 
  driver/scsi/dpt_i2o.c driver/scsi/dpti.h and driver/scsi/dpt/ into a 
  vanilla 2.6.23.1. kernel; using this kernel fixed the issue for me.
  
  I haven't yet fine-tested from which kernel release on the dpt_i2o driver 
  behaves like this and spews out zeroed blocks when trying to mount
  the rootfs. Maybe this is just some timing issue.
 
 I've started the fine-tests and can say so far that dpt_i2o from 
 2.6.22 is still fine. Test is simple:
 
 [EMAIL PROTECTED]:/usr/src/linux-2.6.22/drivers/scsi/dpt$ cp -r dpt/ 
 dpt_i2o.c dpti.h /usr/src/linux-2.6.23.1/drivers/scsi/
 
 ... recompile the kernel, reboot: works.
 
 2.6.22 and 2.6.23 differ in terms of the dpt_i2o driver by two different
 patch sets:
 -one 2 Kb small set of patches from 2.6.22 to 2.6.23-rc1
 -one 7 Kb set of patches from 2.6.23-rc2 to 2.6.23-rc3
 -one 162 Kb set of patches from 2.6.23-rc9 to 2.6.23-rc10.
 
 When applying the 2.6.23-rc1-based driver to my 2.6.23.1 kernel,
 the zero-blocks symptom shows up, so it's the lucky situation
 that the smallest patch actually seems to be the broken one.
 
 According to the 2.6.23-rc1 short-form changelog, there is
 one major edit on the dpt_i2o driver:
 
 FUJITA Tomonori 
 
   [SCSI] dpt_i2o: convert to use the data buffer accessors
 
 Stephen Rothwell 
   dpt_i2o depends on virt_to_bus
 
 Fujita, would you please take a look at this?

He won't have seen this.  cc's added.

 I think that something's broken in there, leading to the dpt_i2o 
 sending out blocks of zeroes right after initialization, at least on
 some specific controllers (in this case, Adaptec 2010S on Intel
 SE7501WV2S-based boxes).
 
 I don't have insight kernel driver development knowledge, so I'm
 quite out of help right now. Nevertheless, I'll add the diff
 from 2.6.22 to 2.6.23-rc1 in terms of dpt_i2o:
 

Can you please confirm that this revert (against 2.6.24-rc4) fixes the data
corruption problems?

Thanks.


diff -puN drivers/scsi/dpt_i2o.c~revert-dpt_i2o-convert-to-use-the-data-buffer-accessors drivers/scsi/dpt_i2o.c
--- a/drivers/scsi/dpt_i2o.c~revert-dpt_i2o-convert-to-use-the-data-buffer-accessors
+++ a/drivers/scsi/dpt_i2o.c
@@ -2062,13 +2062,12 @@ static s32 adpt_scsi_to_i2o(adpt_hba* pH
 	u32 *lenptr;
 	int direction;
 	int scsidir;
-	int nseg;
 	u32 len;
 	u32 reqlen;
 	s32 rcode;
 
 	memset(msg, 0 , sizeof(msg));
-	len = scsi_bufflen(cmd);
+	len = cmd->request_bufflen;
 	direction = 0x00000000;	
 
 	scsidir = 0x00000000;			// DATA NO XFER
@@ -2125,21 +2124,21 @@ static s32 adpt_scsi_to_i2o(adpt_hba* pH
 	lenptr=mptr++;		/* Remember me - fill in when we know */
 	reqlen = 14;		// SINGLE SGE
 	/* Now fill in the SGList and command */
+	if(cmd->use_sg) {
+		struct scatterlist *sg = (struct scatterlist *)cmd->request_buffer;
+		int sg_count = pci_map_sg(pHba->pDev, sg, cmd->use_sg,
+				cmd->sc_data_direction);
 
-	nseg = scsi_dma_map(cmd);
-	BUG_ON(nseg < 0);
-	if (nseg) {
-		struct scatterlist *sg;
 
 		len = 0;
-		scsi_for_each_sg(cmd, sg, nseg, i) {
+		for(i = 0 ; i < sg_count; i++) {
 			*mptr++ = direction|0x10000000|sg_dma_len(sg);
 			len+=sg_dma_len(sg);
 			*mptr++ = sg_dma_address(sg);
-			/* Make this an end of list */
-			if (i == nseg - 1)
-				mptr[-2] = direction|0xD0000000|sg_dma_len(sg);
+			sg++;
 		}
+		/* Make this an end of list */
+		mptr[-2] = direction|0xD0000000|sg_dma_len(sg-1);
 		reqlen = mptr - msg;
 		*lenptr = len;
 
@@ -2148,8 +2147,16 @@ static s32 adpt_scsi_to_i2o(adpt_hba* pH
 				len, cmd->underflow);
 		}
 	} else {
-		*lenptr = len = 0;
-		reqlen = 12;
+		*lenptr = len = cmd->request_bufflen;
+		if(len == 0) {
+			reqlen = 12;
+		} else {
+			*mptr++ = 0xD0000000|direction|cmd->request_bufflen;
+			*mptr++ = pci_map_single(pHba->pDev,
+				cmd->request_buffer,
+				cmd->request_bufflen,
+				cmd->sc_data_direction);
+		}
 	}
 
 	/*
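
The part of the patch above that builds the scatter-gather list writes one (flags|length, address) pair per DMA segment and tags the final pair as end-of-list. A standalone sketch of that idea in C follows; it is not the driver's code. `build_sg_list`, `struct seg` and the two flag values are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Flag values are assumptions for illustration only. */
#define SGE_SIMPLE	0x10000000u	/* ordinary list element */
#define SGE_LAST	0xD0000000u	/* final element: end of list */

/* Hypothetical stand-in for one mapped DMA segment. */
struct seg {
	uint32_t addr;
	uint32_t len;
};

/*
 * Write one (flags|len, addr) pair per segment into out[], which must
 * hold at least 2 * nseg words.  Only the final pair carries the
 * end-of-list flag.  Returns the total byte count; *words receives the
 * number of 32-bit words written.  An empty list writes nothing.
 */
uint32_t build_sg_list(const struct seg *segs, size_t nseg,
		       uint32_t direction, uint32_t *out, size_t *words)
{
	uint32_t total = 0;
	size_t i, w = 0;

	assert(segs && out && words);
	for (i = 0; i < nseg; i++) {
		uint32_t flag = (i == nseg - 1) ? SGE_LAST : SGE_SIMPLE;

		out[w++] = direction | flag | segs[i].len;
		out[w++] = segs[i].addr;
		total += segs[i].len;
	}
	*words = w;
	return total;
}
```

Choosing the flag inside the loop means nothing before the start of the buffer is ever written when the segment count is zero, which an unconditional fix-up after the loop would have to guard against.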

Re: broken dpt_i2o in 2.6.23 (was: ext2_check_page: bad entry in directory)

2007-12-04 Thread Andrew Morton

On Wed, 05 Dec 2007 10:04:03 +0900
FUJITA Tomonori [EMAIL PROTECTED] wrote:

 On Tue, 4 Dec 2007 16:57:38 -0800
 Andrew Morton [EMAIL PROTECTED] wrote:
 
  On Thu, 29 Nov 2007 13:31:50 +0100
  Anders Henke [EMAIL PROTECTED] wrote:
  
   On November 28 2007, Anders Henke wrote:
As everything being reported as zero is quite odd and Jan took a
guess that it might be block-layer or driver-related, I've assumed
that the driver is responsible for this; just out of curiosity, 
I've manually replaced the dpt_i2o driver by the 2.6.19 one by copying 
driver/scsi/dpt_i2o.c driver/scsi/dpti.h and driver/scsi/dpt/ into a 
vanilla 2.6.23.1. kernel; using this kernel fixed the issue for me.

I haven't yet fine-tested from which kernel release on the dpt_i2o 
driver 
behaves like this and spews out zeroed blocks when trying to mount
the rootfs. Maybe this is just some timing issue.
   
   I've started the fine-tests and can say so far that dpt_i2o from 
   2.6.22 is still fine. Test is simple:
   
   [EMAIL PROTECTED]:/usr/src/linux-2.6.22/drivers/scsi/dpt$ cp -r dpt/ 
   dpt_i2o.c dpti.h /usr/src/linux-2.6.23.1/drivers/scsi/
   
   ... recompile the kernel, reboot: works.
   
   2.6.22 and 2.6.23 differ in terms of the dpt_i2o driver by two different
   patch sets:
   -one 2 Kb small set of patches from 2.6.22 to 2.6.23-rc1
   -one 7 Kb set of patches from 2.6.23-rc2 to 2.6.23-rc3
   -one 162 Kb set of patches from 2.6.23-rc9 to 2.6.23-rc10.
   
   When applying the 2.6.23-rc1-based driver to my 2.6.23.1 kernel,
   the zero-blocks symptom shows up, so it's the lucky situation
   that the smallest patch actually seems to be the broken one.
   
   According to the 2.6.23-rc1 short-form changelog, there is
   one major edit on the dpt_i2o driver:
   
   FUJITA Tomonori 
   
 [SCSI] dpt_i2o: convert to use the data buffer accessors
   
   Stephen Rothwell 
 dpt_i2o depends on virt_to_bus
   
   Fujita, would you please take a look at this?
  
  He won't have seen this.  cc's added.
  
   I think that something's broken in there, leading to the dpt_i2o 
   sending out blocks of zeroes right after initialization, at least on
   some specific controllers (in this case, Adaptec 2010S on Intel
   SE7501WV2S-based boxes).
   
   I don't have insight kernel driver development knowledge, so I'm
   quite out of help right now. Nevertheless, I'll add the diff
   from 2.6.22 to 2.6.23-rc1 in terms of dpt_i2o:
   
  
  Can you please confirm that this revert (against 2.6.24-rc4) fixes the data
  corruption problems?
 
 Anders said that my patch is fine and seems that Matthew's hotplug
 conversion patch leads to the problem:
 
 http://marc.info/?l=linux-kernel&m=119641892129732&w=2

Oh.  Jan broke message threading :(

So it's been nearly a week and nothing has happened?  Do we revert that
change?


Re: broken dpt_i2o in 2.6.23 (was: ext2_check_page: bad entry in directory)

2007-12-04 Thread Andrew Morton

On Wed, 05 Dec 2007 10:30:54 +0900 FUJITA Tomonori [EMAIL PROTECTED] wrote:

 On Tue, 4 Dec 2007 17:11:55 -0800
 Andrew Morton [EMAIL PROTECTED] wrote:
 
  On Wed, 05 Dec 2007 10:04:03 +0900
  FUJITA Tomonori [EMAIL PROTECTED] wrote:
  
   On Tue, 4 Dec 2007 16:57:38 -0800
   Andrew Morton [EMAIL PROTECTED] wrote:
   
On Thu, 29 Nov 2007 13:31:50 +0100
Anders Henke [EMAIL PROTECTED] wrote:

 On November 28 2007, Anders Henke wrote:
  As everything being reported as zero is quite odd and Jan took a
  guess that it might be block-layer or driver-related, I've assumed
  that the driver is responsible for this; just out of curiosity, 
  I've manually replaced the dpt_i2o driver by the 2.6.19 one by 
  copying 
  driver/scsi/dpt_i2o.c driver/scsi/dpti.h and driver/scsi/dpt/ into 
  a 
  vanilla 2.6.23.1. kernel; using this kernel fixed the issue for me.
  
  I haven't yet fine-tested from which kernel release on the dpt_i2o 
  driver 
  behaves like this and spews out zeroed blocks when trying to mount
  the rootfs. Maybe this is just some timing issue.
 
 I've started the fine-tests and can say so far that dpt_i2o from 
 2.6.22 is still fine. Test is simple:
 
 [EMAIL PROTECTED]:/usr/src/linux-2.6.22/drivers/scsi/dpt$ cp -r dpt/ 
 dpt_i2o.c dpti.h /usr/src/linux-2.6.23.1/drivers/scsi/
 
 ... recompile the kernel, reboot: works.
 
 2.6.22 and 2.6.23 differ in terms of the dpt_i2o driver by two 
 different
 patch sets:
 -one 2 Kb small set of patches from 2.6.22 to 2.6.23-rc1
 -one 7 Kb set of patches from 2.6.23-rc2 to 2.6.23-rc3
 -one 162 Kb set of patches from 2.6.23-rc9 to 2.6.23-rc10.
 
 When applying the 2.6.23-rc1-based driver to my 2.6.23.1 kernel,
 the zero-blocks symptom shows up, so it's the lucky situation
 that the smallest patch actually seems to be the broken one.
 
 According to the 2.6.23-rc1 short-form changelog, there is
 one major edit on the dpt_i2o driver:
 
 FUJITA Tomonori 
 
   [SCSI] dpt_i2o: convert to use the data buffer accessors
 
 Stephen Rothwell 
   dpt_i2o depends on virt_to_bus
 
 Fujita, would you please take a look at this?

He won't have seen this.  cc's added.

 I think that something's broken in there, leading to the dpt_i2o 
 sending out blocks of zeroes right after initialization, at least on
 some specific controllers (in this case, Adaptec 2010S on Intel
 SE7501WV2S-based boxes).
 
 I don't have insight kernel driver development knowledge, so I'm
 quite out of help right now. Nevertheless, I'll add the diff
 from 2.6.22 to 2.6.23-rc1 in terms of dpt_i2o:
 

Can you please confirm that this revert (against 2.6.24-rc4) fixes the 
data
corruption problems?
   
   Anders said that my patch is fine and seems that Matthew's hotplug
   conversion patch leads to the problem:
   
   http://marc.info/?l=linux-kernel&m=119641892129732&w=2
  
  Oh.  Jan broke message threading :(
  
  So it's been nearly a week and nothing has happened?  Do we revert that
  change?
 
 SCSI people really want this conversion...
 
 Matthew, did you have a chance to look at it?

It seems pretty improbable that a change of that nature could cause data
corruption.  Anders, are you able to determine whether the revert (against
current Linus mainline or 2.6.24-rc4) fixes things?  Because it would be
very strange...

This is a grave bug.  It's really quite urgent...

Thanks.

 drivers/scsi/dpt_i2o.c |  132 ++-
 drivers/scsi/dpti.h|9 ++
 2 files changed, 68 insertions(+), 73 deletions(-)

diff -puN drivers/scsi/dpt_i2o.c~revert-dpt_i2o-convert-to-scsi-hotplug-model drivers/scsi/dpt_i2o.c
--- a/drivers/scsi/dpt_i2o.c~revert-dpt_i2o-convert-to-scsi-hotplug-model
+++ a/drivers/scsi/dpt_i2o.c
@@ -173,20 +173,20 @@ static struct pci_device_id dptids[] = {
 };
 MODULE_DEVICE_TABLE(pci,dptids);
 
-static void adpt_exit(void);
-
-static int adpt_detect(void)
+static int adpt_detect(struct scsi_host_template* sht)
 {
 	struct pci_dev *pDev = NULL;
 	adpt_hba* pHba;
 
+	adpt_init();
+
 	PINFO("Detecting Adaptec I2O RAID controllers...\n");
 
         /* search for all Adatpec I2O RAID cards */
 	while ((pDev = pci_get_device( PCI_DPT_VENDOR_ID, PCI_ANY_ID, pDev))) {
 		if(pDev->device == PCI_DPT_DEVICE_ID ||
 		   pDev->device == PCI_DPT_RAPTOR_DEVICE_ID){
-			if(adpt_install_hba(pDev) ){
+			if(adpt_install_hba(sht, pDev) ){
 				PERROR("Could not Init an I2O RAID device\n");
 				PERROR("Will not try to detect others.\n");
 				return hba_count-1;
@@ -248,33 +248,34 @@ rebuild_sys_tab:
 	}
 
 	for (pHba = hba_chain; pHba

Re: [BUG] 2.6.24-rc3-git2 softlockup detected

2007-12-03 Thread Andrew Morton

On Fri, 30 Nov 2007 12:58:06 +0530
Kamalesh Babulal [EMAIL PROTECTED] wrote:

 Andrew Morton wrote:
  On Thu, 29 Nov 2007 23:00:47 -0800 Andrew Morton [EMAIL PROTECTED] wrote:
  
  On Fri, 30 Nov 2007 01:39:29 -0500 Kyle McMartin [EMAIL PROTECTED] wrote:
 
  On Thu, Nov 29, 2007 at 12:35:33AM -0800, Andrew Morton wrote:
  ten million is close enough to infinity for me to assume that we broke 
  the
  driver and that's never going to terminate.
 
  how about this? doesn't break things on my pa8800:
 
  diff --git a/drivers/scsi/sym53c8xx_2/sym_hipd.c b/drivers/scsi/sym53c8xx_2/sym_hipd.c
  index 463f119..ef01cb1 100644
  --- a/drivers/scsi/sym53c8xx_2/sym_hipd.c
  +++ b/drivers/scsi/sym53c8xx_2/sym_hipd.c
  @@ -1037,10 +1037,13 @@ restart_test:
   	/*
   	 *  Wait 'til done (with timeout)
   	 */
  -	for (i=0; i<SYM_SNOOP_TIMEOUT; i++)
  +	do {
   		if (INB(np, nc_istat) & (INTF|SIP|DIP))
   			break;
  -	if (i>=SYM_SNOOP_TIMEOUT) {
  +		msleep(10);
  +	} while (i++ < SYM_SNOOP_TIMEOUT);
  +
  +	if (i >= SYM_SNOOP_TIMEOUT) {
   		printf ("CACHE TEST FAILED: timeout.\n");
   		return (0x20);
   	}
  diff --git a/drivers/scsi/sym53c8xx_2/sym_hipd.h b/drivers/scsi/sym53c8xx_2/sym_hipd.h
  index ad07880..85c483b 100644
  --- a/drivers/scsi/sym53c8xx_2/sym_hipd.h
  +++ b/drivers/scsi/sym53c8xx_2/sym_hipd.h
  @@ -339,7 +339,7 @@
    /*
     *  Misc.
     */
  -#define SYM_SNOOP_TIMEOUT (10000000)
  +#define SYM_SNOOP_TIMEOUT (1000)
   #define BUS_8_BIT	0
   #define BUS_16_BIT	1
   
   
  That might be the fix, but do we know what we're actually fixing?  afaik
  2.6.24-rc3 doesn't get this timeout, 2.6.24-rc3-mm2 does get it and we
  don't know why?
 
  
  looks at Subject:
  
  Checks that Rafael was cc'ed
  
  So 2.6.24-rc3 was OK and 2.6.24-rc3-git2 is not?
 
 Yes, the 2.6.24-rc3 was Ok and this is seen from 2.6.24-rc3-git2/3/4.
 

There are effectively no drivers/scsi/ changes after 2.6.24-rc3 and we
don't (I believe) have a clue what caused this regression.

Can you please do a bisection search on this?

Thanks.
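
For reference, the quoted patch replaces a busy-wait of ten million register reads with a sleeping poll: check the status register, sleep 10 ms, give up after a bounded number of attempts. A minimal userspace sketch of that pattern in C follows, assuming hypothetical callbacks in place of INB() and msleep(); `poll_until_ready`, `ready_after` and `fake_sleep` are illustrative names, not kernel APIs.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical callbacks standing in for INB(np, nc_istat) and msleep(). */
typedef bool (*ready_fn)(void *ctx);
typedef void (*sleep_fn)(unsigned ms);

/*
 * Poll `ready` up to `max_polls` times, sleeping `interval_ms` between
 * attempts.  Returns true if the condition became true, false on
 * timeout.  The outcome is returned directly instead of being inferred
 * from the loop counter after the loop has finished.
 */
bool poll_until_ready(ready_fn ready, void *ctx, sleep_fn do_sleep,
		      unsigned max_polls, unsigned interval_ms)
{
	unsigned i;

	assert(ready && do_sleep);
	for (i = 0; i < max_polls; i++) {
		if (ready(ctx))
			return true;
		do_sleep(interval_ms);
	}
	return false;
}

/* Test doubles: report ready after *ctx polls; count the time slept. */
unsigned slept_ms;

bool ready_after(void *ctx)
{
	unsigned *remaining = ctx;

	if (*remaining == 0)
		return true;
	(*remaining)--;
	return false;
}

void fake_sleep(unsigned ms)
{
	slept_ms += ms;
}
```

With 1000 polls at 10 ms each, as in the quoted patch, the wait is bounded at roughly ten seconds, though a real sleep may overshoot the requested interval.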

Re: [BUG] 2.6.23-rc3 can't see sd partitions on Alpha

2007-11-30 Thread Andrew Morton

On Sat, 01 Dec 2007 11:30:01 +1300
Michael Cree [EMAIL PROTECTED] wrote:

 Bob Tracy wrote:
  Andrew Morton wrote:
  Could be something change in sysfs.  Please double-check the config
  options, make sure that something important didn't get disabled.
 
   Here's
  hoping someone else is seeing this or can replicate it in the meantime.
 
 Snap.
 
 2.6.24-rc2 works fine.   2.6.24-rc3 boots on Alpha but once /dev is 
 populated no partitions of the scsi sub-system are seen.  Looks like ide 
 sub-system similarly affected.

Rafael, I assume you have this regression in the list?

 Managed to get boot log.  Follows below (with output of various /proc info).
 
 Cheerz
 Michael.
 
 
 Linux version 2.6.24-rc3 ([EMAIL PROTECTED]) (gcc version 4.1.3 20071019 
 (prerelease) (Debian 4.1.2-17)) #1 Mon Nov 26 19:28:58 NZDT 2007
 Booting on Tsunami variation Monet using machine vector Monet from SRM
 Major Options: EV67 LEGACY_START VERBOSE_MCHECK
 Command line: ro root=/dev/sda3 console=ttyS0
 memcluster 0, usage 1, start0, end  215
 memcluster 1, usage 0, start  215, end   131062
 memcluster 2, usage 1, start   131062, end   131072
 freeing pages 215:384
 freeing pages 930:131062
 reserving pages 930:932
 4096K Bcache detected; load hit latency 21 cycles, load miss latency 127 
 cycles
 Console graphics on hose 0
 Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 130167
 Kernel command line: ro root=/dev/sda3 console=ttyS0
 PID hash table entries: 4096 (order: 12, 32768 bytes)
 Using epoch = 2000
 Turning on RTC interrupts.
 Console: colour VGA+ 80x25
 console [ttyS0] enabled
 Dentry cache hash table entries: 131072 (order: 7, 1048576 bytes)
 Inode-cache hash table entries: 65536 (order: 6, 524288 bytes)
 Memory: 1030896k/1048496k available (2786k kernel code, 15216k reserved, 
 370k data, 168k init)
 Mount-cache hash table entries: 512
 net_namespace: 120 bytes
 NET: Registered protocol family 16
 PCI: Bridge: 0001:01:08.0
IO window: 8000-8fff
MEM window: 0900-090f
PREFETCH window: disabled.
 SMC37c669 Super I/O Controller found @ 0x3f0
 Linux Plug and Play Support v0.97 (c) Adam Belay
 SCSI subsystem initialized
 NET: Registered protocol family 2
 IP route cache hash table entries: 8192 (order: 3, 65536 bytes)
 TCP established hash table entries: 32768 (order: 6, 524288 bytes)
 TCP bind hash table entries: 32768 (order: 5, 262144 bytes)
 TCP: Hash tables configured (established 32768 bind 32768)
 TCP reno registered
 srm_env: version 0.0.6 loaded successfully
 io scheduler noop registered
 io scheduler cfq registered (default)
 tridentfb: Trident framebuffer 0.7.8-NEWAPI initializing
 isapnp: Scanning for PnP cards...
 isapnp: No Plug & Play device found
 rtc: SRM (post-2000) epoch (2000) detected
 Real Time Clock Driver v1.12ac
 Serial: 8250/16550 driver $Revision: 1.90 $ 4 ports, IRQ sharing enabled
 serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
 serial8250: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
 Floppy drive(s): fd0 is 2.88M
 FDC 0 is a post-1991 82077
 Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
 ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
 CY82C693: IDE controller (0x1080:0xc693 rev 0x00) at  PCI slot :00:07.1
 CY82C693: not 100% native mode: will probe irqs later
 CY82C693U driver v0.34 99-13-12 Andreas S. Krebs ([EMAIL PROTECTED])
  ide0: BM-DMA at 0x8400-0x8407, BIOS settings: hda:pio, hdb:pio
 CY82C693: port 0x01f0 already claimed by ide0
 ALI15X3: IDE controller (0x10b9:0x5228 rev 0xc6) at  PCI slot 0001:02:09.1
 ALI15X3: 100% native mode on irq 28
  ide1: BM-DMA at 0x28410-0x28417, BIOS settings: hdc:DMA, 
 hdd:DMA
  ide2: BM-DMA at 0x28418-0x2841f, BIOS settings: hde:pio, 
 hdf:pio
 hdf: LITE-ON DVDRW SOHW-1653S, ATAPI CD/DVD-ROM drive
 hde: ST3200822A, ATA DISK drive
 ide2 at 0x28438-0x2843f,0x2844e on irq 28
 hde: max request size: 512KiB
 hde: 390721968 sectors (200049 MB) w/8192KiB Cache, CHS=24321/255/63, 
 UDMA(100)
 hde: cache flushes supported
   hde: hde1
 qla1280: QLA1040 found on PCI bus 1, dev 6
 scsi(0:0): Resetting SCSI BUS
 scsi0 : QLogic QLA1040 PCI to SCSI Host Adapter
 Firmware version:  7.65.06, Driver version 3.26
 serio: i8042 KBD port at 0x60,0x64 irq 1
 serio: i8042 AUX port at 0x60,0x64 irq 12
 mice: PS/2 mouse device common for all mice
 scsi 0:0:1:0: Direct-Access SEAGATE  ST336706LW   0109 PQ: 0 ANSI: 3
 scsi(0:0:1:0): Sync: period 10, offset 12, Wide
 input: AT Raw Set 2 keyboard as /devices/platform/i8042/serio0/input/input0
 atkbd.c: keyboard reset failed on isa0060/serio1
 TCP cubic registered
 Initializing XFRM netlink socket
 NET: Registered protocol family 1
 NET: Registered protocol family 17
 NET: Registered protocol family 15
 scsi: waiting for bus probes to complete ...
 sd 0:0:1:0: [sda] 71687370 512-byte hardware sectors (36704 MB)
 sd 0:0:1:0: [sda] Write Protect is off
 sd 0:0:1:0: [sda] Write

[patch] SCSI: early detection of medium not present, updated

2007-11-29 Thread Andrew Morton


Guys, I have this marked as needed-in-2.6.24?





From: Alan Stern [EMAIL PROTECTED]

Taken from http://bugzilla.kernel.org/show_bug.cgi?id=8904

An updated (by Albert, I assume) version of the fourteen-month-old patch here:

http://marc.info/?l=linux-kernel&m=115412002912837&w=2

Apparently fixes the bug described at
http://bugzilla.kernel.org/show_bug.cgi?id=8904

Needs some TLC.  Perhaps urgently.

Cc: Albert Lee [EMAIL PROTECTED]
Cc: Alan Stern [EMAIL PROTECTED]
Cc: James Bottomley [EMAIL PROTECTED]
Cc: Tejun Heo [EMAIL PROTECTED]
Cc: Jens Axboe [EMAIL PROTECTED]
Signed-off-by: Andrew Morton [EMAIL PROTECTED]
---

 drivers/scsi/scsi_ioctl.c  |2 +-
 drivers/scsi/scsi_lib.c|   20 ++--
 drivers/scsi/sd.c  |2 +-
 drivers/scsi/sr.c  |   15 +--
 include/scsi/scsi_device.h |2 +-
 5 files changed, 30 insertions(+), 11 deletions(-)

diff -puN drivers/scsi/scsi_ioctl.c~scsi-early-detection-of-medium-not-present-updated drivers/scsi/scsi_ioctl.c
--- a/drivers/scsi/scsi_ioctl.c~scsi-early-detection-of-medium-not-present-updated
+++ a/drivers/scsi/scsi_ioctl.c
@@ -244,7 +244,7 @@ int scsi_ioctl(struct scsi_device *sdev,
return scsi_set_medium_removal(sdev, SCSI_REMOVAL_ALLOW);
case SCSI_IOCTL_TEST_UNIT_READY:
return scsi_test_unit_ready(sdev, IOCTL_NORMAL_TIMEOUT,
-   NORMAL_RETRIES);
+   NORMAL_RETRIES, NULL);
case SCSI_IOCTL_START_UNIT:
scsi_cmd[0] = START_STOP;
scsi_cmd[1] = 0;
diff -puN drivers/scsi/scsi_lib.c~scsi-early-detection-of-medium-not-present-updated drivers/scsi/scsi_lib.c
--- a/drivers/scsi/scsi_lib.c~scsi-early-detection-of-medium-not-present-updated
+++ a/drivers/scsi/scsi_lib.c
@@ -2010,15 +2010,26 @@ scsi_mode_sense(struct scsi_device *sdev
 }
 EXPORT_SYMBOL(scsi_mode_sense);
 
+/**
+ * scsi_test_unit_ready - test if unit is ready
+ * @sdev:  scsi device to change the state of.
+ * @timeout: command timeout
+ * @retries: number of retries before failing
+ * @media_maybe_present: 1 if media maybe present or not.
+ *   0 if media not present.
+ *
+ * Returns zero if successful or an error if TUR failed.
+ **/
 int
-scsi_test_unit_ready(struct scsi_device *sdev, int timeout, int retries)
+scsi_test_unit_ready(struct scsi_device *sdev, int timeout, int retries, int 
*media_maybe_present)
 {
char cmd[] = {
TEST_UNIT_READY, 0, 0, 0, 0, 0,
};
struct scsi_sense_hdr sshdr;
int result;
-   
+   int maybe_present = 1;
+
result = scsi_execute_req(sdev, cmd, DMA_NONE, NULL, 0, &sshdr,
  timeout, retries);
 
@@ -2027,10 +2038,15 @@ scsi_test_unit_ready(struct scsi_device 
if ((scsi_sense_valid(&sshdr)) &&
((sshdr.sense_key == UNIT_ATTENTION) ||
 (sshdr.sense_key == NOT_READY))) {
+   if (sshdr.asc == 0x3A)
+   maybe_present = 0;
sdev->changed = 1;
result = 0;
}
}
+
+   if (media_maybe_present)
+   *media_maybe_present = maybe_present;
return result;
 }
 EXPORT_SYMBOL(scsi_test_unit_ready);
diff -puN drivers/scsi/sd.c~scsi-early-detection-of-medium-not-present-updated 
drivers/scsi/sd.c
--- a/drivers/scsi/sd.c~scsi-early-detection-of-medium-not-present-updated
+++ a/drivers/scsi/sd.c
@@ -767,7 +767,7 @@ static int sd_media_changed(struct gendi
retval = -ENODEV;
 
if (scsi_block_when_processing_errors(sdp))
-   retval = scsi_test_unit_ready(sdp, SD_TIMEOUT, SD_MAX_RETRIES);
+   retval = scsi_test_unit_ready(sdp, SD_TIMEOUT, SD_MAX_RETRIES, 
NULL);
 
/*
 * Unable to test, unit probably not ready.   This usually
diff -puN drivers/scsi/sr.c~scsi-early-detection-of-medium-not-present-updated 
drivers/scsi/sr.c
--- a/drivers/scsi/sr.c~scsi-early-detection-of-medium-not-present-updated
+++ a/drivers/scsi/sr.c
@@ -179,18 +179,21 @@ static int sr_media_change(struct cdrom_
 {
struct scsi_cd *cd = cdi->handle;
int retval;
+   int media_maybe_present;
 
if (CDSL_CURRENT != slot) {
/* no changer support */
return -EINVAL;
}
 
-   retval = scsi_test_unit_ready(cd->device, SR_TIMEOUT, MAX_RETRIES);
-   if (retval) {
-   /* Unable to test, unit probably not ready.  This usually
-* means there is no disc in the drive.  Mark as changed,
-* and we will figure it out later once the drive is
-* available again.  */
+   retval = scsi_test_unit_ready(cd->device, SR_TIMEOUT, MAX_RETRIES,
+ &media_maybe_present);
+   if (retval
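
(The archived message is cut off above.) The core idea of the patch is an optional out-parameter: callers that care whether a medium might be present pass a pointer, and everyone else passes NULL. A minimal userspace sketch of that pattern follows. The names `demo_test_unit_ready` and `struct demo_sense` are hypothetical stand-ins, not the kernel API, and the sense-key handling of the real function is deliberately left out; only the ASC value 0x3A (medium not present) is taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_ASC_MEDIUM_NOT_PRESENT 0x3A  /* ASC value tested in the patch */

/* Hypothetical, trimmed-down sense data; not struct scsi_sense_hdr. */
struct demo_sense {
    int valid;          /* non-zero if the sense data can be trusted */
    unsigned char asc;  /* additional sense code */
};

/*
 * Passes `cmd_result` through unchanged and, if the caller supplied a
 * pointer, reports whether a medium may be present.
 */
static int demo_test_unit_ready(int cmd_result, const struct demo_sense *sense,
                                int *media_maybe_present)
{
    int maybe_present = 1;

    assert(sense != NULL);
    if (cmd_result && sense->valid &&
        sense->asc == DEMO_ASC_MEDIUM_NOT_PRESENT)
        maybe_present = 0;

    /* Callers that do not care pass NULL, as sd.c and scsi_ioctl.c do. */
    if (media_maybe_present)
        *media_maybe_present = maybe_present;
    return cmd_result;
}
```

The NULL check is what lets the existing call sites change by one argument only, while sr.c can act on the extra information.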

Re: [BUG] 2.6.24-rc3-git2 softlockup detected

2007-11-29 Thread Andrew Morton

On Fri, 30 Nov 2007 01:39:29 -0500 Kyle McMartin [EMAIL PROTECTED] wrote:

 On Thu, Nov 29, 2007 at 12:35:33AM -0800, Andrew Morton wrote:
  ten million is close enough to infinity for me to assume that we broke the
  driver and that's never going to terminate.
  
 
 how about this? doesn't break things on my pa8800:
 
 diff --git a/drivers/scsi/sym53c8xx_2/sym_hipd.c 
 b/drivers/scsi/sym53c8xx_2/sym_hipd.c
 index 463f119..ef01cb1 100644
 --- a/drivers/scsi/sym53c8xx_2/sym_hipd.c
 +++ b/drivers/scsi/sym53c8xx_2/sym_hipd.c
 @@ -1037,10 +1037,13 @@ restart_test:
   /*
*  Wait 'til done (with timeout)
*/
 - for (i=0; i<SYM_SNOOP_TIMEOUT; i++)
 + do {
   if (INB(np, nc_istat) & (INTF|SIP|DIP))
   break;
 - if (i>=SYM_SNOOP_TIMEOUT) {
 + msleep(10);
 + } while (i++ < SYM_SNOOP_TIMEOUT);
 +
 + if (i >= SYM_SNOOP_TIMEOUT) {
   printf ("CACHE TEST FAILED: timeout.\n");
   return (0x20);
   }
 diff --git a/drivers/scsi/sym53c8xx_2/sym_hipd.h 
 b/drivers/scsi/sym53c8xx_2/sym_hipd.h
 index ad07880..85c483b 100644
 --- a/drivers/scsi/sym53c8xx_2/sym_hipd.h
 +++ b/drivers/scsi/sym53c8xx_2/sym_hipd.h
 @@ -339,7 +339,7 @@
  /*
   *  Misc.
   */
 -#define SYM_SNOOP_TIMEOUT (10000000)
 +#define SYM_SNOOP_TIMEOUT (1000)
  #define BUS_8_BIT0
  #define BUS_16_BIT   1
  

That might be the fix, but do we know what we're actually fixing?  afaik
2.6.24-rc3 doesn't get this timeout, 2.6.24-rc3-mm2 does get it and we
don't know why?
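
For readers following the loop change quoted above: the pattern is a bounded poll that sleeps between register reads instead of spinning. Below is a minimal userspace sketch of that pattern, not the driver code. `poll_status`, `read_status_fn` and `fake_read` are hypothetical names, and `nanosleep()` stands in for the kernel's `msleep()`.

```c
#define _POSIX_C_SOURCE 199309L

#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical callback standing in for INB(np, nc_istat). */
typedef unsigned int (*read_status_fn)(void *ctx);

/*
 * Poll until any bit in `mask` is set, sleeping `sleep_ms` between reads.
 * Returns 0 on success, -1 if `max_tries` reads all came back clear.
 */
static int poll_status(read_status_fn read_status, void *ctx,
                       unsigned int mask, int max_tries, long sleep_ms)
{
    struct timespec ts;
    int i;

    assert(read_status != NULL);
    ts.tv_sec = sleep_ms / 1000;
    ts.tv_nsec = (sleep_ms % 1000) * 1000000L;

    for (i = 0; i < max_tries; i++) {
        if (read_status(ctx) & mask)
            return 0;
        nanosleep(&ts, NULL);
    }
    return -1;
}

/* Test double: reports "done" (bit 0) once *ctx has counted down to zero. */
static unsigned int fake_read(void *ctx)
{
    int *remaining = ctx;

    if (*remaining > 0) {
        (*remaining)--;
        return 0;
    }
    return 1u;
}
```

With a sleep in the loop, the total wait is roughly `max_tries * sleep_ms`, so the iteration limit can be small. That does not answer the question of why the timeout is being hit in the first place.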



Re: 2.6.24-rc3-mm2: Result: hostbyte=0x01 driverbyte=0x00\nend_request: I/O error

2007-11-28 Thread Andrew Morton

On Wed, 28 Nov 2007 23:01:31 +0300
Alexey Dobriyan [EMAIL PROTECTED] wrote:

 Reliably spams dmesg with end_request() horrors. This happens when git
 starts checking out linux tree to fresh ext2 partition. Disk is several
 months old and there were no problems with, say, 2.6.24-rc3:
 
 [  225.378426] sd 2:0:1:0: [sdb] Result: hostbyte=0x01 driverbyte=0x00
 [  225.378659] end_request: I/O error, dev sdb, sector 141295703
 [  225.390133] sd 2:0:1:0: [sdb] Result: hostbyte=0x01 driverbyte=0x00
 [  225.391988] end_request: I/O error, dev sdb, sector 141295703
 [  225.392463] sd 2:0:1:0: [sdb] Result: hostbyte=0x01 driverbyte=0x00
 [  225.392625] end_request: I/O error, dev sdb, sector 141295703
 [  225.392999] sd 2:0:1:0: [sdb] Result: hostbyte=0x01 driverbyte=0x00
 [  225.393161] end_request: I/O error, dev sdb, sector 141295703
 [  225.393571] sd 2:0:1:0: [sdb] Result: hostbyte=0x01 driverbyte=0x00
 [  225.393731] end_request: I/O error, dev sdb, sector 141295703
 [  225.394382] sd 2:0:1:0: [sdb] Result: hostbyte=0x01 driverbyte=0x00
 [  225.394544] end_request: I/O error, dev sdb, sector 141295703
 [  225.395247] sd 2:0:1:0: [sdb] Result: hostbyte=0x01 driverbyte=0x00
 [  225.395412] end_request: I/O error, dev sdb, sector 141295703
 
 CONFIG_ATA=y
 # CONFIG_ATA_NONSTANDARD is not set
 CONFIG_ATA_ACPI=y
 CONFIG_SATA_AHCI=y
 CONFIG_ATA_PIIX=y
 CONFIG_PATA_JMICRON=y

and

 [   35.229713] sd 2:0:1:0: [sdb] 976773168 512-byte hardware sectors (500108 
 MB)

So that's an OK sector number.
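
The check being made here is plain arithmetic: the failing sector has to be below the device's sector count. A small sketch with the two numbers from the logs above (the helper names are illustrative, not kernel functions):

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SIZE 512u  /* from the "512-byte hardware sectors" log line */

/* Returns 1 if `sector` is a valid index on a disk of `total_sectors`. */
static int sector_in_range(uint64_t sector, uint64_t total_sectors)
{
    return sector < total_sectors;
}

/* Byte offset of the start of `sector`. */
static uint64_t sector_to_bytes(uint64_t sector)
{
    assert(sector <= UINT64_MAX / SECTOR_SIZE);  /* guard against overflow */
    return sector * SECTOR_SIZE;
}
```

For this disk, 976773168 sectors * 512 bytes = 500107862016 bytes, which matches the reported 500108 MB, and sector 141295703 sits about 72.3 GB into the device, well inside it.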


 [0.00] Linux version 2.6.24-rc3-mm2 ([EMAIL PROTECTED]) (gcc version 
 4.1.2 (Gentoo 4.1.2 p1.0.2)) #3 SMP PREEMPT Wed Nov 28 22:23:45 MSK 2007
 [0.00] Command line: root=/dev/sda2 [EMAIL PROTECTED]/eth0,[EMAIL 
 PROTECTED]/00:80:48:45:EC:73 ignore_loglevel
 [0.00] BIOS-provided physical RAM map:
 [0.00]  BIOS-e820:  - 0009fc00 (usable)
 [0.00]  BIOS-e820: 0009fc00 - 000a (reserved)
 [0.00]  BIOS-e820: 000e4000 - 0010 (reserved)
 [0.00]  BIOS-e820: 0010 - 7ff9 (usable)
 [0.00]  BIOS-e820: 7ff9 - 7ff9e000 (ACPI data)
 [0.00]  BIOS-e820: 7ff9e000 - 7ffe (ACPI NVS)
 [0.00]  BIOS-e820: 7ffe - 8000 (reserved)
 [0.00]  BIOS-e820: fee0 - fee01000 (reserved)
 [0.00]  BIOS-e820: ffb0 - 0001 (reserved)
 [0.00]  BIOS-e820: 0001 - 00018000 (usable)
 [0.00] Entering add_active_range(0, 0, 159) 0 entries of 256 used
 [0.00] Entering add_active_range(0, 256, 524176) 1 entries of 256 used
 [0.00] Entering add_active_range(0, 1048576, 1572864) 2 entries of 
 256 used
 [0.00] end_pfn_map = 1572864
 [0.00] DMI 2.4 present.
 [0.00] ACPI: RSDP 000FA980, 0024 (r2 ACPIAM)
 [0.00] ACPI: XSDT 7FF90100, 0054 (r1 KOZIRO FRONTIER  2000707 MSFT
97)
 [0.00] ACPI: FACP 7FF90290, 00F4 (r3 MSTEST OEMFACP   2000707 MSFT
97)
 [0.00] ACPI: DSDT 7FF905C0, 8FA9 (r1  A0637 A06370000 INTL 
 20060113)
 [0.00] ACPI: FACS 7FF9E000, 0040
 [0.00] ACPI: APIC 7FF90390, 006C (r1 MSTEST OEMAPIC   2000707 MSFT
97)
 [0.00] ACPI: MCFG 7FF90400, 003C (r1 MSTEST OEMMCFG   2000707 MSFT
97)
 [0.00] ACPI: SLIC 7FF90440, 0176 (r1 KOZIRO FRONTIER  2000707 MSFT
97)
 [0.00] ACPI: OEMB 7FF9E040, 007B (r1 MSTEST AMI_OEM   2000707 MSFT
97)
 [0.00] ACPI: HPET 7FF99570, 0038 (r1 MSTEST OEMHPET   2000707 MSFT
97)
 [0.00] Entering add_active_range(0, 0, 159) 0 entries of 256 used
 [0.00] Entering add_active_range(0, 256, 524176) 1 entries of 256 used
 [0.00] Entering add_active_range(0, 1048576, 1572864) 2 entries of 
 256 used
 [0.00]  [e200-e21f] PMD -81000120 on 
 node 0
 [0.00]  [e220-e23f] PMD -81000160 on 
 node 0
 [0.00]  [e240-e25f] PMD -810001A0 on 
 node 0
 [0.00]  [e260-e27f] PMD -810001E0 on 
 node 0
 [0.00]  [e280-e29f] PMD -81000220 on 
 node 0
 [0.00]  [e2a0-e2bf] PMD -81000260 on 
 node 0
 [0.00]  [e2c0-e2df] PMD -810002A0 on 
 node 0
 [0.00]  [e2e0-e2ff] PMD -810002E0 on 
 node 0
 [0.00]  [e2000100-e200011f] PMD -81000320 on 
 node 0
 [0.00]  [e2000120-e200013f] PMD -81000360 on 
 node 0
 [0.00]  [e2000140-e200015f] PMD -810003A0 on 
 node 0
 [0.00]  [e2000160-e200017f] PMD -810003E0 on 
 node 0
 [0.00]

Re: 2.6.24-rc3-mm2: Result: hostbyte=0x01 driverbyte=0x00\nend_request: I/O error

2007-11-28 Thread Andrew Morton

On Wed, 28 Nov 2007 16:14:21 -0700
Matthew Wilcox [EMAIL PROTECTED] wrote:

 On Wed, Nov 28, 2007 at 01:40:36PM -0800, Andrew Morton wrote:
  On Wed, 28 Nov 2007 23:01:31 +0300
  Alexey Dobriyan [EMAIL PROTECTED] wrote:
  
   Reliably spams dmesg with end_request() horrors. This happens when git
   starts checking out linux tree to fresh ext2 partition. Disk is several
   months old and there were no problems with, say, 2.6.24-rc3:
 
 Could you try reverting 6f5391c283d7fdcf24bf40786ea79061919d1e1d and see
 if the problem still exists?
 

That's not completely trivial..

I did a hand-made revert against 2.6.24-rc3-mm2 (below) but some other patch
in there causes:

drivers/scsi/scsi_lib.c: In function 'scsi_blk_pc_done':
drivers/scsi/scsi_lib.c:1251: error: 'struct scsi_cmnd' has no member named 
'request_bufflen'


--- a/drivers/scsi/scsi.c~revert-6f5391c283d7fdcf24bf40786ea79061919d1e1d
+++ a/drivers/scsi/scsi.c
@@ -59,7 +59,6 @@
 #include <scsi/scsi_cmnd.h>
 #include <scsi/scsi_dbg.h>
 #include <scsi/scsi_device.h>
-#include <scsi/scsi_driver.h>
 #include <scsi/scsi_eh.h>
 #include <scsi/scsi_host.h>
 #include <scsi/scsi_tcq.h>
@@ -379,8 +378,9 @@ void scsi_log_send(struct scsi_cmnd *cmd
scsi_print_command(cmd);
if (level > 3) {
printk(KERN_INFO "buffer = 0x%p, bufflen = %d,"
-   " queuecommand 0x%p\n",
+   " done = 0x%p, queuecommand 0x%p\n",
scsi_sglist(cmd), scsi_bufflen(cmd),
+   cmd->done,
cmd->device->host->hostt->queuecommand);
 
}
@@ -667,12 +667,6 @@ void __scsi_done(struct scsi_cmnd *cmd)
blk_complete_request(rq);
 }
 
-/* Move this to a header if it becomes more generally useful */
-static struct scsi_driver *scsi_cmd_to_driver(struct scsi_cmnd *cmd)
-{
-   return *(struct scsi_driver **)cmd->request->rq_disk->private_data;
-}
-
 /**
  * scsi_finish_command - cleanup and pass command back to upper layer
  * @cmd: the command
@@ -685,8 +679,6 @@ void scsi_finish_command(struct scsi_cmn
 {
struct scsi_device *sdev = cmd->device;
struct Scsi_Host *shost = sdev->host;
-   struct scsi_driver *drv;
-   unsigned int good_bytes;
 
scsi_device_unbusy(sdev);
 
@@ -712,13 +704,7 @@ void scsi_finish_command(struct scsi_cmn
"Notifying upper driver of completion "
"(result %x)\n", cmd->result));
 
-   good_bytes = scsi_bufflen(cmd);
-if (cmd->request->cmd_type != REQ_TYPE_BLOCK_PC) {
-   drv = scsi_cmd_to_driver(cmd);
-   if (drv->done)
-   good_bytes = drv->done(cmd);
-   }
-   scsi_io_completion(cmd, good_bytes);
+   cmd->done(cmd);
 }
 EXPORT_SYMBOL(scsi_finish_command);
 
diff -puN 
drivers/scsi/scsi_error.c~revert-6f5391c283d7fdcf24bf40786ea79061919d1e1d 
drivers/scsi/scsi_error.c
--- a/drivers/scsi/scsi_error.c~revert-6f5391c283d7fdcf24bf40786ea79061919d1e1d
+++ a/drivers/scsi/scsi_error.c
@@ -1697,6 +1697,7 @@ scsi_reset_provider(struct scsi_device *
 
scmd->scsi_done = scsi_reset_provider_done_command;
memset(&scmd->sdb, 0, sizeof(scmd->sdb));
+   scmd->done  = NULL;
 
scmd->cmd_len   = 0;
 
diff -puN 
drivers/scsi/scsi_lib.c~revert-6f5391c283d7fdcf24bf40786ea79061919d1e1d 
drivers/scsi/scsi_lib.c
--- a/drivers/scsi/scsi_lib.c~revert-6f5391c283d7fdcf24bf40786ea79061919d1e1d
+++ a/drivers/scsi/scsi_lib.c
@@ -944,6 +944,7 @@ void scsi_end_bidi_request(struct scsi_c
 
scsi_finalize_request(cmd, 1);
 }
+EXPORT_SYMBOL(scsi_io_completion);
 
 /*
  * Function:scsi_io_completion()
@@ -1238,6 +1239,18 @@ static struct scsi_cmnd *scsi_get_cmd_fr
return cmd;
 }
 
+static void scsi_blk_pc_done(struct scsi_cmnd *cmd)
+{
+   BUG_ON(!blk_pc_request(cmd->request));
+   /*
+* This will complete the whole command with uptodate=1 so
+* as far as the block layer is concerned the command completed
+* successfully. Since this is a REQ_BLOCK_PC command the
+* caller should check the request's errors value
+*/
+   scsi_io_completion(cmd, cmd->request_bufflen);
+}
+
 int scsi_setup_blk_pc_cmnd(struct scsi_device *sdev, struct request *req)
 {
struct scsi_cmnd *cmd;
@@ -1285,6 +1298,7 @@ int scsi_setup_blk_pc_cmnd(struct scsi_d
cmd->transfersize = req->data_len;
cmd->allowed = req->retries;
cmd->timeout_per_command = req->timeout;
+   cmd->done = scsi_blk_pc_done;
return BLKPREP_OK;
 }
 EXPORT_SYMBOL(scsi_setup_blk_pc_cmnd);
diff -puN 
drivers/scsi/scsi_priv.h~revert-6f5391c283d7fdcf24bf40786ea79061919d1e1d 
drivers/scsi/scsi_priv.h
--- a/drivers/scsi/scsi_priv.h~revert-6f5391c283d7fdcf24bf40786ea79061919d1e1d
+++ a/drivers/scsi

Re: [PATCH 2/2] ide-scsi: use print_hex_dump from linux/kernel.h

2007-11-27 Thread Andrew Morton

On Mon, 26 Nov 2007 15:16:13 +0800 Denis Cheng [EMAIL PROTECTED] wrote:

 these utilities implemented in lib/hexdump.c are more handy, please use this.
 
 ...

 --- a/drivers/scsi/ide-scsi.c
 +++ b/drivers/scsi/ide-scsi.c
 @@ -242,16 +242,6 @@ static void idescsi_output_buffers (ide_drive_t *drive, 
 idescsi_pc_t *pc, unsign
   }
  }
  
 -static void hexdump(u8 *x, int len)
 -{
 - int i;
 -
 - printk("[ ");
 - for (i = 0; i < len; i++)
 - printk("%x ", x[i]);
 - printk("]\n");
 -}
 -
  static int idescsi_check_condition(ide_drive_t *drive, struct request 
 *failed_command)
  {
   idescsi_scsi_t *scsi = drive_to_idescsi(drive);
 @@ -282,7 +272,7 @@ static int idescsi_check_condition(ide_drive_t *drive, 
 struct request *failed_co
   pc->scsi_cmd = ((idescsi_pc_t *) failed_command->special)->scsi_cmd;
   if (test_bit(IDESCSI_LOG_CMD, &scsi->log)) {
   printk ("ide-scsi: %s: queue cmd = ", drive->name);
 - hexdump(pc->c, 6);
 + print_hex_dump(KERN_DEBUG, "", DUMP_PREFIX_OFFSET, 16, 1, 
 pc->c, 6, 1);
   }
   rq->rq_disk = scsi->disk;
   return ide_do_drive_cmd(drive, rq, ide_preempt);
 @@ -337,7 +327,7 @@ static int idescsi_end_request (ide_drive_t *drive, int 
 uptodate, int nrsecs)
   idescsi_pc_t *opc = (idescsi_pc_t *) rq->buffer;
   if (log) {
   printk ("ide-scsi: %s: wrap up check %lu, rst = ", 
 drive->name, opc->scsi_cmd->serial_number);
 - hexdump(pc->buffer,16);
 + print_hex_dump(KERN_DEBUG, "", DUMP_PREFIX_OFFSET, 16, 
 1, pc->buffer, 16, 1);
   }
   memcpy((void *) opc->scsi_cmd->sense_buffer, pc->buffer, 
 SCSI_SENSE_BUFFERSIZE);
   kfree(pc->buffer);
 @@ -816,10 +806,10 @@ static int idescsi_queue (struct scsi_cmnd *cmd,
  
   if (test_bit(IDESCSI_LOG_CMD, &scsi->log)) {
   printk ("ide-scsi: %s: que %lu, cmd = ", drive->name, 
 cmd->serial_number);
 - hexdump(cmd->cmnd, cmd->cmd_len);
 + print_hex_dump(KERN_DEBUG, "", DUMP_PREFIX_OFFSET, 16, 1, 
 cmd->cmnd, cmd->cmd_len, 1);
   if (memcmp(pc->c, cmd->cmnd, cmd->cmd_len)) {
   printk ("ide-scsi: %s: que %lu, tsl = ", drive->name, 
 cmd->serial_number);
 - hexdump(pc->c, 12);
 + print_hex_dump(KERN_DEBUG, "", DUMP_PREFIX_OFFSET, 16, 
 1, pc->c, 12, 1);
   }
   }
  

Would you believe that this patch (which removes code) actually increases
drivers/scsi/ide-scsi.o .text by 75 bytes?

I didn't look to see why - probably that huge arg count is hurting,
possibly some additional strings being emitted?

Either way, perhaps a simple little front-end to print_hex_dump() is called
for.
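
To illustrate what such a front-end could look like: the wrapper fixes the arguments that never vary, so each call site passes only the buffer and its length. The sketch below is userspace C and makes assumptions throughout. `demo_hex_dump` is a made-up stand-in that formats into a caller-supplied buffer; it is not the kernel's print_hex_dump() and does not reproduce its output format. `demo_cmd_hexdump` is the hypothetical front-end.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Stand-in for a dump routine with a long argument list. Formats `len`
 * bytes of `buf` as hex into `out`, `rowsize` bytes per line, with a space
 * between groups of `groupsize` bytes. Returns the number of characters
 * written, or -1 if `out` is too small.
 */
static int demo_hex_dump(char *out, size_t out_size, const char *prefix,
                         size_t rowsize, size_t groupsize,
                         const unsigned char *buf, size_t len)
{
    size_t used = 0;
    size_t i;

    assert(out != NULL && out_size > 0 && rowsize > 0 && groupsize > 0);
    out[0] = '\0';
    for (i = 0; i < len; i++) {
        size_t col = i % rowsize;
        int n;

        if (col == 0)
            n = snprintf(out + used, out_size - used, "%s%s%02x",
                         i ? "\n" : "", prefix, buf[i]);
        else
            n = snprintf(out + used, out_size - used, "%s%02x",
                         col % groupsize == 0 ? " " : "", buf[i]);
        if (n < 0 || (size_t)n >= out_size - used)
            return -1;  /* output would be truncated */
        used += (size_t)n;
    }
    return (int)used;
}

/* The front-end: the constant arguments live in exactly one place. */
static int demo_cmd_hexdump(char *out, size_t out_size,
                            const unsigned char *buf, size_t len)
{
    return demo_hex_dump(out, out_size, "", 16, 1, buf, len);
}
```

Whether this recovers the 75 bytes would need measuring; the sketch only shows that the long argument list is set up once rather than at every caller.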



Re: [Bugme-new] [Bug 9462] New: Adaptec AHA-7850 with MOD drive: Hard lockup with new Adaptec driver

2007-11-27 Thread Andrew Morton


(switched to email - please respond via emailed reply-to-all, not via the
bugzilla web interface)

On Tue, 27 Nov 2007 07:11:02 -0800 (PST) [EMAIL PROTECTED] wrote:

 http://bugzilla.kernel.org/show_bug.cgi?id=9462
 
Summary: Adaptec AHA-7850 with MOD drive: Hard lockup with new
 Adaptec driver
Product: IO/Storage
Version: 2.5
  KernelVersion: 2.6.22.14
   Platform: All
 OS/Version: Linux
   Tree: Mainline
 Status: NEW
   Severity: high
   Priority: P1
  Component: SCSI
 AssignedTo: [EMAIL PROTECTED]
 ReportedBy: [EMAIL PROTECTED]
 
 
 Most recent kernel where this bug did not occur: Works with the old Adaptec
 driver. Had lockups with 2.6.18.8 as well, with the new driver. 
 
 Distribution: Debian etch
 
 Hardware Environment: Epox 8kta+, Athlon 800 MHz, 768MB RAM
 
 Output from lspci:
 00:00.0 Host bridge: VIA Technologies, Inc. VT8363/8365 [KT133/KM133] (rev 02)
 00:01.0 PCI bridge: VIA Technologies, Inc. VT8363/8365 [KT133/KM133 AGP]
 00:07.0 ISA bridge: VIA Technologies, Inc. VT82C686 [Apollo Super South] (rev
 22)
 00:07.1 IDE interface: VIA Technologies, Inc.
 VT82C586A/B/VT82C686/A/B/VT823x/A/C PIPC Bus Master IDE (rev 10)
 00:07.2 USB Controller: VIA Technologies, Inc. VT82x UHCI USB 1.1
 Controller (rev 10)
 00:07.3 USB Controller: VIA Technologies, Inc. VT82x UHCI USB 1.1
 Controller (rev 10)
 00:07.4 Bridge: VIA Technologies, Inc. VT82C686 [Apollo Super ACPI] (rev 30)
 00:09.0 Ethernet controller: D-Link System Inc DL2000-based Gigabit Ethernet
 (rev 0c)
 00:0a.0 USB Controller: NEC Corporation USB (rev 43)
 00:0a.1 USB Controller: NEC Corporation USB (rev 43)
 00:0a.2 USB Controller: NEC Corporation USB 2.0 (rev 04)
 00:0b.0 Mass storage controller: Promise Technology, Inc. PDC20518/PDC40518
 (SATAII 150 TX4) (rev 02)
 00:0c.0 SCSI storage controller: Adaptec AHA-7850 (rev 03)
 00:0d.0 Ethernet controller: Realtek Semiconductor Co., Ltd.
 RTL-8139/8139C/8139C+ (rev 10)
 01:00.0 VGA compatible controller: nVidia Corporation NV11 [GeForce2 MX/MX 
 400]
 (rev a1)
 
 
 Software Environment: Tried cp, dd_rescue and cat for copy. No difference.
 
 Problem Description: When writing to the filesystem on the MOD, the computer
 locks completely as far as I can tell, no responses to ping, no log-entries
 (this is a headless server) MOD not writing and only a hard reset helps. The
 target filesystem had (after recovery with e2fsck) 22MB on it in one try and
 350MB in another one. I guess there is some race condition or other randomized
 process at work. MOD size is 600MB.
 
 
 
 Steps to reproduce:
 Insert MOD, mount it, write to it, see lockup happen.
 Observed with new driver in 2.6.18.8 and 2.6.22.14. Also happens with TCQ
 disabled.
 
 I should add that MODs are slow. The writing process blocks for 10-20 seconds
 frequently for disk flushes. This is normal and expected. 
 
 Maybe a warning should be added to the new driver and the old one should be kept
 for the time being.
 
 If there are some tests I can run for you, please let me know.
 

I guess this doesn't really count as a regression, as the new driver has
never worked.  I assume that the old driver continues to work OK and that
there is no plan to remove it?


Re: [BUG] 2.6.24-rc3-git2 softlockup detected

2007-11-27 Thread Andrew Morton

On Wed, 28 Nov 2007 11:59:00 +0530 Kamalesh Babulal [EMAIL PROTECTED] wrote:

 Hi,

(cc linux-scsi, for sym53c8xx)

 Soft lockup is detected while bootup with 2.6.24-rc3-git2 on powerbox

I assume this is a post-2.6.23 regression?

 BUG: soft lockup - CPU#1 stuck for 11s! [insmod:375]
 NIP: c002f02c LR: d01414fc CTR: c002f018
 REGS: c0077cbef0b0 TRAP: 0901   Not tainted  (2.6.24-rc3-git2-autotest)
 MSR: 80009032 EE,ME,IR,DR  CR: 24022088  XER: 
 TASK = c0077cbd8000[375] 'insmod' THREAD: c0077cbec000 CPU: 1
 GPR00: d01414fc c0077cbef330 c052b930 d80080002014 
 GPR04: d8008000202c  c0077ca1cb00 d014ce54 
 GPR08: c0077ca1c63c  002a c002f018 
 GPR12: d0143610 c0473d00 
 NIP [c002f02c] .ioread8+0x14/0x60
 LR [d01414fc] .sym_hcb_attach+0x1188/0x1378 [sym53c8xx]
 Call Trace:
 [c0077cbef330] [c0077cbef3c0] 0xc0077cbef3c0 (unreliable)
 [c0077cbef3a0] [d01414fc] .sym_hcb_attach+0x1188/0x1378 
 [sym53c8xx]
 [c0077cbef470] [d01395f8] .sym2_probe+0x700/0x99c [sym53c8xx]
 [c0077cbef710] [c01bc118] .pci_device_probe+0x124/0x1b0
 [c0077cbef7b0] [c0221138] .driver_probe_device+0x144/0x20c
 [c0077cbef850] [c0221450] .__driver_attach+0xcc/0x154
 [c0077cbef8e0] [c021ff94] .bus_for_each_dev+0x7c/0xd4
 [c0077cbef9a0] [c0220e9c] .driver_attach+0x28/0x40
 [c0077cbefa20] [c02204d8] .bus_add_driver+0x90/0x228
 [c0077cbefac0] [c0221858] .driver_register+0x94/0xb0
 [c0077cbefb40] [c01bc430] .__pci_register_driver+0x6c/0xcc
 [c0077cbefbe0] [d0143428] .sym2_init+0x108/0x15b0 [sym53c8xx]
 [c0077cbefc80] [c008ce80] .sys_init_module+0x17c4/0x1958
 [c0077cbefe30] [c000872c] syscall_exit+0x0/0x40
 Instruction dump:
 6000 786b0420 38210070 7d635b78 e8010010 7c0803a6 4e800020 7c0802a6 
 f8010010 f821ff91 7c0004ac 8923 0c09 4c00012c 79290620 2f8900ff 

I see no obvious lockup sites near the end of sym_hcb_attach().  Maybe it's
being called lots of times from a higher level..  Do the traces all look
the same?


Re: [BUG] 2.6.24-rc3-git2 softlockup detected

2007-11-27 Thread Andrew Morton

On Wed, 28 Nov 2007 12:47:19 +0530 Kamalesh Babulal [EMAIL PROTECTED] wrote:

 Andrew Morton wrote:
  On Wed, 28 Nov 2007 11:59:00 +0530 Kamalesh Babulal [EMAIL PROTECTED] 
  wrote:
  
  Hi,
  
  (cc linux-scsi, for sym53c8xx)
  
  Soft lockup is detected while bootup with 2.6.24-rc3-git2 on powerbox
  
  I assume this is a post-2.6.23 regression?
  
  BUG: soft lockup - CPU#1 stuck for 11s! [insmod:375]
  NIP: c002f02c LR: d01414fc CTR: c002f018
  REGS: c0077cbef0b0 TRAP: 0901   Not tainted  (2.6.24-rc3-git2-autotest)
  MSR: 80009032 EE,ME,IR,DR  CR: 24022088  XER: 
  TASK = c0077cbd8000[375] 'insmod' THREAD: c0077cbec000 CPU: 1
  GPR00: d01414fc c0077cbef330 c052b930 d80080002014 
  GPR04: d8008000202c  c0077ca1cb00 d014ce54 
  GPR08: c0077ca1c63c  002a c002f018 
  GPR12: d0143610 c0473d00 
  NIP [c002f02c] .ioread8+0x14/0x60
  LR [d01414fc] .sym_hcb_attach+0x1188/0x1378 [sym53c8xx]
  Call Trace:
  [c0077cbef330] [c0077cbef3c0] 0xc0077cbef3c0 (unreliable)
  [c0077cbef3a0] [d01414fc] .sym_hcb_attach+0x1188/0x1378 
  [sym53c8xx]
  [c0077cbef470] [d01395f8] .sym2_probe+0x700/0x99c [sym53c8xx]
  [c0077cbef710] [c01bc118] .pci_device_probe+0x124/0x1b0
  [c0077cbef7b0] [c0221138] .driver_probe_device+0x144/0x20c
  [c0077cbef850] [c0221450] .__driver_attach+0xcc/0x154
  [c0077cbef8e0] [c021ff94] .bus_for_each_dev+0x7c/0xd4
  [c0077cbef9a0] [c0220e9c] .driver_attach+0x28/0x40
  [c0077cbefa20] [c02204d8] .bus_add_driver+0x90/0x228
  [c0077cbefac0] [c0221858] .driver_register+0x94/0xb0
  [c0077cbefb40] [c01bc430] .__pci_register_driver+0x6c/0xcc
  [c0077cbefbe0] [d0143428] .sym2_init+0x108/0x15b0 [sym53c8xx]
  [c0077cbefc80] [c008ce80] .sys_init_module+0x17c4/0x1958
  [c0077cbefe30] [c000872c] syscall_exit+0x0/0x40
  Instruction dump:
  6000 786b0420 38210070 7d635b78 e8010010 7c0803a6 4e800020 7c0802a6 
  f8010010 f821ff91 7c0004ac 8923 0c09 4c00012c 79290620 2f8900ff 
  
  I see no obvious lockup sites near the end of sym_hcb_attach().  Maybe it's
  being called lots of times from a higher level..  Do the traces all look
  the same?
 
 Hi Andrew,
 
 I see this call trace twice and both looks similar and on another reboot
 the following trace is seen twice in different cpu
 
 BUG: soft lockup detected on CPU#3!
 Call Trace:
 [C0003FEDEDA0] [C0010220] .show_stack+0x68/0x1b0 (unreliable)
 [C0003FEDEE40] [C00A061C] .softlockup_tick+0xf0/0x13c
 [C0003FEDEEF0] [C0072E2C] .run_local_timers+0x1c/0x30
 [C0003FEDEF70] [C0022FA0] .timer_interrupt+0xa8/0x488
 [C0003FEDF050] [C00034EC] decrementer_common+0xec/0x100
 --- Exception: 901 at .ioread8+0x14/0x60
 LR = .sym_hcb_attach+0x1194/0x1384 [sym53c8xx]
 [C0003FEDF340] [D02B3BC0] 0xd02b3bc0 (unreliable)
 [C0003FEDF3B0] [D029A3C0] .sym_hcb_attach+0x1194/0x1384 
 [sym53c8xx]
 [C0003FEDF480] [D0291D30] .sym2_probe+0x75c/0x9f8 [sym53c8xx]
 [C0003FEDF710] [C01B65A4] .pci_device_probe+0x13c/0x1dc
 [C0003FEDF7D0] [C0219A0C] .driver_probe_device+0xa0/0x15c
 [C0003FEDF870] [C0219C64] .__driver_attach+0xb4/0x138
 [C0003FEDF900] [C021913C] .bus_for_each_dev+0x7c/0xd4
 [C0003FEDF9C0] [C02198B0] .driver_attach+0x28/0x40
 [C0003FEDFA40] [C0218BA4] .bus_add_driver+0x98/0x18c
 [C0003FEDFAE0] [C021A064] .driver_register+0xa8/0xc4
 [C0003FEDFB60] [C01B68AC] .__pci_register_driver+0x5c/0xa4
 [C0003FEDFBF0] [D029C204] .sym2_init+0x104/0x1550 [sym53c8xx]
 [C0003FEDFC90] [C008D1F4] .sys_init_module+0x1764/0x1998
 [C0003FEDFE30] [C000869C] syscall_exit+0x0/0x40
 

hm, odd.

Can you look up sym_hcb_attach+0x1194/0x1384 in gdb?  Something like

- Enable CONFIG_DEBUG_INFO

- gdb sym53c8xx.o

(gdb) p sym_hcb_attach
prints 0xsomething
(gdb) p/x 0xsomething + 0x1194
prints 0xsomethingelse
(gdb) l *0xsomethingelse


Re: 2.6.24-rc3-mm1

2007-11-26 Thread Andrew Morton

On Fri, 23 Nov 2007 06:55:41 +0100 Gabriel C [EMAIL PROTECTED] wrote:

 Andrew Morton wrote:
  On Fri, 23 Nov 2007 02:39:08 +0100 Gabriel C [EMAIL PROTECTED] wrote:
  
  I have some warnings on each SCSI disc:
 
 
  ...
 
  [   30.724410] scsi 0:0:0:0: Direct-Access SEAGATE  ST318406LW   
  0109 PQ: 0 ANSI: 3
  [   30.724419] scsi0:A:0:0: Tagged Queuing enabled.  Depth 32
  [   30.724435]  target0:0:0: Beginning Domain Validation
  [   30.724446]  target0:0:0: Domain Validation Initial Inquiry Failed --
  [   30.724572]  target0:0:0: Ending Domain Validation
  [   30.729747] scsi 0:0:1:0: Direct-Access FUJITSU  MAH3182MP
  0114 PQ: 0 ANSI: 4
  [   30.729754] scsi0:A:1:0: Tagged Queuing enabled.  Depth 32
  [   30.729771]  target0:0:1: Beginning Domain Validation
  [   30.729780]  target0:0:1: Domain Validation Initial Inquiry Failed --
  [   30.729908]  target0:0:1: Ending Domain Validation
 
  
  Don't know what would have caused that.  But yes, something is wrong in
  scsi land.
 
 Actually I'm lucky the author didn't fix that FIXME in scsi_transport_spi.c 
 and I still can boot ;)
 
  
  No idea whether this is related, but buffered disk reads are 2.XX MB/sec 
  and the box is somewhat laggy.
 
  hdparm -t on sda and sdb reports :
 
  /dev/sda:
   Timing buffered disk reads:8 MB in  3.26 seconds =   2.46 MB/sec
 
  /dev/sdb:
   Timing buffered disk reads:8 MB in  3.56 seconds =   2.25 MB/sec
 
  My IDE discs are fine.
 
  Please let me know if you need my config or any other information.
 
  
  And you're the second to report very slow scsi throughput in 2.6.24-rc3-mm1.
  
 
 I found the commit which causes these problems; it is in the git-scsi-misc patch,
 and reverting it fixes both problems for me.
 
 http://git.kernel.org/?p=linux/kernel/git/jejb/scsi-misc-2.6.git;a=commitdiff_plain;h=8655a546c83fc43f0a73416bbd126d02de7ad6c0;hp=5bc717b6bdaaf52edf365eb7d9d8c89fec79df5d
 

OK, thanks.  I'll assume that James and Hannes have this in hand (or will
have, by mid-week) and I won't do anything here.


Re: 2.6.24-rc3-mm1

2007-11-22 Thread Andrew Morton

On Fri, 23 Nov 2007 02:39:08 +0100 Gabriel C [EMAIL PROTECTED] wrote:

 I have some warnings on each SCSI disc:
 
 
 ...
 
 [   30.724410] scsi 0:0:0:0: Direct-Access SEAGATE  ST318406LW   0109 
 PQ: 0 ANSI: 3
 [   30.724419] scsi0:A:0:0: Tagged Queuing enabled.  Depth 32
 [   30.724435]  target0:0:0: Beginning Domain Validation
 [   30.724446]  target0:0:0: Domain Validation Initial Inquiry Failed --
 [   30.724572]  target0:0:0: Ending Domain Validation
 [   30.729747] scsi 0:0:1:0: Direct-Access FUJITSU  MAH3182MP0114 
 PQ: 0 ANSI: 4
 [   30.729754] scsi0:A:1:0: Tagged Queuing enabled.  Depth 32
 [   30.729771]  target0:0:1: Beginning Domain Validation
 [   30.729780]  target0:0:1: Domain Validation Initial Inquiry Failed --
 [   30.729908]  target0:0:1: Ending Domain Validation
 

Don't know what would have caused that.  But yes, something is wrong in
scsi land.

 
 No idea whether this is related, but buffered disk reads are 2.XX MB/sec and 
 the box is somewhat laggy.
 
 hdparm -t on sda and sdb reports :
 
 /dev/sda:
  Timing buffered disk reads:8 MB in  3.26 seconds =   2.46 MB/sec
 
 /dev/sdb:
  Timing buffered disk reads:8 MB in  3.56 seconds =   2.25 MB/sec
 
 My IDE discs are fine.
 
 Please let me know if you need my config or any other information.
 

And you're the second to report very slow scsi throughput in 2.6.24-rc3-mm1.

Re: [Bugme-new] [Bug 9405] New: iSCSI does not implement ordering guarantees required by e.g. journaling filesystems

2007-11-19 Thread Andrew Morton

On Mon, 19 Nov 2007 05:44:01 -0800 (PST)
[EMAIL PROTECTED] wrote:

 http://bugzilla.kernel.org/show_bug.cgi?id=9405
 
Summary: iSCSI does not implement ordering guarantees required by
 e.g. journaling filesystems
Product: IO/Storage
Version: 2.5
  KernelVersion: 2.6.23.1
   Platform: All
 OS/Version: Linux
   Tree: Mainline
 Status: NEW
   Severity: high
   Priority: P1
  Component: SCSI
 AssignedTo: [EMAIL PROTECTED]
 ReportedBy: [EMAIL PROTECTED]
 
 
 Most recent kernel where this bug did not occur: (new issue)
 Distribution: any
 Hardware Environment: (does not apply)
 Software Environment: (does not apply) 
 Problem Description: The sd (SCSI disk) driver ignores block device barriers
 (REQ_HARDBARRIER). The iSCSI code in the kernel sends all iSCSI commands with
 flag ISCSI_ATTR_SIMPLE to the iSCSI target. This means that the target may
 reorder these commands. Since a.o. correct operation of journaling filesystems
 depends on being able to enforce the order of certain block write operations,
 not enforcing write ordering is a bug. This can be solved by either adding
 support for REQ_HARDBARRIER in the sd device or by replacing ISCSI_ATTR_SIMPLE
 by ISCSI_ATTR_ORDERED.
 
 Steps to reproduce: Source reading of drivers/scsi/sd.c and
 drivers/scsi/libiscsi.c.
 
 References: SCSI Architecture Model - 3, paragraph 8.6
 (http://www.t10.org/ftp/t10/drafts/sam3/sam3r14.pdf).
 

(does iscsi have a maintainer?)

Re: [PATCH 2/3] cciss: add support for blktrace

2007-11-19 Thread Andrew Morton

On Mon, 19 Nov 2007 16:07:17 -0600 Mike Miller [EMAIL PROTECTED] wrote:

 Patch 2 of 3
 This patch adds support for the blktrace utility. Please consider this for
 inclusion. Seems there was already a call to blk_add_trace. This patch adds
 ifdef's and includes the header file.
 
 Signed-off-by: Mike Miller [EMAIL PROTECTED]
 
 
 diff --git a/drivers/block/cciss.c b/drivers/block/cciss.c
 index 2ba5a89..61bc0f3 100644
 --- a/drivers/block/cciss.c
 +++ b/drivers/block/cciss.c
 @@ -41,6 +41,10 @@
  #include <asm/uaccess.h>
  #include <asm/io.h>
  
 +#ifdef CONFIG_BLK_DEV_IO_TRACE
 +#include <linux/blktrace_api.h>
 +#endif /* CONFIG_BLK_DEV_IO_TRACE */

The ifdefs shouldn't be needed here.  If they are needed, blktrace_api.h needs
fixing.

  #include <linux/dma-mapping.h>
  #include <linux/blkdev.h>
  #include <linux/genhd.h>
 @@ -3013,7 +3017,9 @@ after_error_processing:
   }
   cmd->rq->data_len = 0;
   cmd->rq->completion_data = cmd;
 +#ifdef CONFIG_BLK_DEV_IO_TRACE
   blk_add_trace_rq(cmd->rq->q, cmd->rq, BLK_TA_COMPLETE);
 +#endif /* CONFIG_BLK_DEV_IO_TRACE */
   blk_complete_request(cmd->rq);
  }

And if you remove the first set of ifdefs, these ifdefs can also be
removed.
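
The convention behind this comment is that the header supplies a no-op version of the call when the feature is configured out, so callers never need their own conditionals. A minimal standalone sketch of that convention follows. `CONFIG_DEMO_TRACE`, `demo_add_trace` and `demo_complete_request` are invented for the illustration and are not the blktrace API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical config switch; remove this line to build without tracing. */
#define CONFIG_DEMO_TRACE 1

/* --- what would live in the shared header --- */
#ifdef CONFIG_DEMO_TRACE
static int demo_trace_count;

static inline void demo_add_trace(const char *what)
{
    assert(what != NULL);
    demo_trace_count++;
    fprintf(stderr, "trace: %s\n", what);
}
#else
/* Empty stub: the call compiles away, so callers need no #ifdef. */
static inline void demo_add_trace(const char *what)
{
    (void)what;
}
#endif

/* --- what would live in the driver: no conditionals at the call site --- */
static int demo_complete_request(int tag)
{
    demo_add_trace("complete");
    return tag;
}
```

The driver side builds identically with the switch on or off, which is why the ifdefs around both the include and the call are unnecessary when the header is written this way.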


Re: [PATCH 1/3] cciss: export more sysfs attributes

2007-11-19 Thread Andrew Morton

On Mon, 19 Nov 2007 16:03:07 -0600 Mike Miller [EMAIL PROTECTED] wrote:

 Patch 1 of 3
 This patch creates more sysfs attributes to be exported by cciss. Hopefully
 we can work better with udev. Please consider this patch for inclusion.
 

It would be appropriate if the changelog were to describe what the problem
is with udev, and how this patch attempts to address it.

 
 diff --git a/drivers/block/cciss.c b/drivers/block/cciss.c
 index 7d70496..2ba5a89 100644
 --- a/drivers/block/cciss.c
 +++ b/drivers/block/cciss.c
 @@ -229,20 +229,483 @@ static inline CommandList_struct 
 *removeQ(CommandList_struct **Qptr,
   return c;
  }
  
 +static inline int find_drv_index(int ctlr, drive_info_struct *drv){
 +int i;
 +for (i=0; i < CISS_MAX_LUN; i++) {
 +if (hba[ctlr]->drv[i].LunID == drv->LunID)
 +return i;
 +}
 +return i;
 +}

Please pass all patches through scripts/checkpatch.pl before sending.  It
will detect things like the codingstyle errors in the above code.

Also, that function seems to be too large to be inlined.

  #include "cciss_scsi.c"  /* For SCSI tape support */
  
 +#define ENG_GIG 10
 +#define ENG_GIG_FACTOR (ENG_GIG/512)
  #define RAID_UNKNOWN 6
 +static const char *raid_label[] = { "0", "4", "1(1+0)", "5", "5+1", "ADG",
 + "UNKNOWN"};
 +
 +
 +static spinlock_t sysfs_lock = SPIN_LOCK_UNLOCKED;

And that's a bug which checkpatch would have detected.  Please use
DEFINE_SPINLOCK() to avoid confusing lockdep.

 +static void cciss_sysfs_stat_inquiry(int ctlr, int logvol,
 + int withirq, drive_info_struct *drv)
 +{
 + int return_code;
 + InquiryData_struct *inq_buff;
 +
 + /* If there are no heads then this is the controller disk and
 +  * not a valid logical drive so don't query it.
 +  */
 + if (!drv->heads)
 + return;
 +
 + inq_buff = kzalloc(sizeof(InquiryData_struct), GFP_KERNEL);
 + if (!inq_buff) {
 + printk(KERN_ERR "cciss: out of memory\n");
 + goto err;
 + }
 +
 + if (withirq)
 + return_code = sendcmd_withirq(CISS_INQUIRY, ctlr,
 + inq_buff, sizeof(*inq_buff), 1, logvol ,0, TYPE_CMD);
 + else
 + return_code = sendcmd(CISS_INQUIRY, ctlr, inq_buff,
 + sizeof(*inq_buff), 1, logvol , 0, NULL, TYPE_CMD);
 + if (return_code == IO_OK) {
 + memcpy(drv->vendor, &inq_buff->data_byte[8], 8);
 + drv->vendor[8]='\0';
 + memcpy(drv->model, &inq_buff->data_byte[16], 16);
 + drv->model[16] = '\0';
 + memcpy(drv->rev, &inq_buff->data_byte[32], 4);
 + drv->rev[4] = '\0';
 + } else { /* Get geometry failed */
 + printk(KERN_WARNING "cciss: inquiry for VPD page 0 failed\n");
 + }
 +
 + if (withirq)
 + return_code = sendcmd_withirq(CISS_INQUIRY, ctlr,
 + inq_buff, sizeof(*inq_buff), 1, logvol ,0x83, TYPE_CMD);
 + else
 + return_code = sendcmd(CISS_INQUIRY, ctlr, inq_buff,
 + sizeof(*inq_buff), 1, logvol , 0x83, NULL, TYPE_CMD);
 +
 + if (return_code == IO_OK) {
 + memcpy(drv->uid, &inq_buff->data_byte[8], 16);
 + } else { /* Get geometry failed */
 + printk(KERN_WARNING "cciss: id logical drive failed\n");
 + }
 +
 + kfree(inq_buff);
 +err:
 + drv->vendor[8] = '\0';
 + drv->model[16] = '\0';
 + drv->rev[4] = '\0';
 +
 +}
 +
 +static ssize_t cciss_show_raid_level(struct device *dev,
 +  struct device_attribute *attr, char *buf)
 +{
 + struct drv_dynamic *d;
 + drive_info_struct *drv;
 + ctlr_info_t *h;
 + unsigned long flags;
 + int raid;
 +
 + d = container_of(dev, struct drv_dynamic, dev);
 + spin_lock(&sysfs_lock);
 + if (!d->disk) {
 + spin_unlock(&sysfs_lock);
 + return -ENOENT;
 + }
 +
 + h = get_host(d->disk);
 +
 + spin_lock_irqsave(CCISS_LOCK(h->ctlr), flags);
 + if (h->busy_configuring) {
 + spin_unlock_irqrestore(CCISS_LOCK(h->ctlr), flags);
 + spin_unlock(&sysfs_lock);
 + return snprintf(buf, 30, "Device busy configuring\n");
 + }
 +
 + drv = d->disk->private_data;
 + if ((drv->raid_level < 0) || (drv->raid_level) > 5)
 + raid = RAID_UNKNOWN;
 + else
 + raid = drv->raid_level;
 +
 + spin_unlock_irqrestore(CCISS_LOCK(h->ctlr), flags);
 + spin_unlock(&sysfs_lock);
 + return snprintf(buf, 20, "RAID %s\n", raid_label[raid]);
 +}
 +
 +static ssize_t cciss_show_disk_size(struct device *dev,
 + struct device_attribute *attr, char *buf)
 +{
 + struct drv_dynamic *d;
 + drive_info_struct *drv;
 + ctlr_info_t *h;
 + unsigned long flags;
 + sector_t vol_sz, vol_sz_frac;
 +
 + d = container_of(dev, struct drv_dynamic, dev);
 + spin_lock(&sysfs_lock);
 +

Re: [PATCH 3/4] scsi_data_buffer

2007-11-12 Thread Andrew Morton

On Thu, 08 Nov 2007 18:59:30 +0200 Boaz Harrosh [EMAIL PROTECTED] wrote:

   In preparation for bidi we abstract all IO members of scsi_cmnd,
   that will need to duplicate, into a substructure.
 
   - Group all IO members of scsi_cmnd into a scsi_data_buffer
 structure.

drivers/scsi/qla1280.c: In function 'qla1280_done':
drivers/scsi/qla1280.c:1313: error: 'struct scsi_cmnd' has no member named 
'use_sg'
drivers/scsi/qla1280.c:1314: error: 'struct scsi_cmnd' has no member named 
'request_buffer'
drivers/scsi/qla1280.c:1315: error: 'struct scsi_cmnd' has no member named 
'use_sg'
drivers/scsi/qla1280.c:1316: error: 'struct scsi_cmnd' has no member named 
'request_bufflen'
drivers/scsi/qla1280.c:1318: error: 'struct scsi_cmnd' has no member named 
'request_bufflen'
drivers/scsi/qla1280.c: In function 'qla1280_return_status':
drivers/scsi/qla1280.c:1409: error: 'struct scsi_cmnd' has no member named 
'request_bufflen'
drivers/scsi/qla1280.c:1416: error: 'struct scsi_cmnd' has no member named 
'resid'
drivers/scsi/qla1280.c: In function 'qla1280_64bit_start_scsi':
drivers/scsi/qla1280.c:2791: error: 'struct scsi_cmnd' has no member named 
'use_sg'
drivers/scsi/qla1280.c:2792: error: 'struct scsi_cmnd' has no member named 
'request_buffer'
drivers/scsi/qla1280.c:2793: error: 'struct scsi_cmnd' has no member named 
'use_sg'
drivers/scsi/qla1280.c:2801: error: 'struct scsi_cmnd' has no member named 
'request_bufflen'
drivers/scsi/qla1280.c:2896: error: 'struct scsi_cmnd' has no member named 
'use_sg'
drivers/scsi/qla1280.c:2991: error: 'struct scsi_cmnd' has no member named 
'request_buffer'
drivers/scsi/qla1280.c:2992: error: 'struct scsi_cmnd' has no member named 
'request_bufflen'
drivers/scsi/qla1280.c:3004: error: 'struct scsi_cmnd' has no member named 
'request_bufflen'
make[2]: *** [drivers/scsi/qla1280.o] Error 1

It mystifies me how a patch like this can have been floating about in N
submissions across M months and nobody has done an allmodconfig build or
even a grep to find out what broke.

ho hum.  I shall mark qla1280 BROKEN and shall plod onwards.

Re: [PATCH 3/4] scsi_data_buffer

2007-11-12 Thread Andrew Morton

On Tue, 13 Nov 2007 15:40:42 +0900 FUJITA Tomonori [EMAIL PROTECTED] wrote:

 On Mon, 12 Nov 2007 22:06:52 -0800
 Andrew Morton [EMAIL PROTECTED] wrote:
 
  On Thu, 08 Nov 2007 18:59:30 +0200 Boaz Harrosh [EMAIL PROTECTED] wrote:
  
 In preparation for bidi we abstract all IO members of scsi_cmnd,
 that will need to duplicate, into a substructure.
   
 - Group all IO members of scsi_cmnd into a scsi_data_buffer
   structure.
  
  drivers/scsi/qla1280.c: In function 'qla1280_done':
  drivers/scsi/qla1280.c:1313: error: 'struct scsi_cmnd' has no member named 
  'use_sg'
  drivers/scsi/qla1280.c:1314: error: 'struct scsi_cmnd' has no member named 
  'request_buffer'
  drivers/scsi/qla1280.c:1315: error: 'struct scsi_cmnd' has no member named 
  'use_sg'
  drivers/scsi/qla1280.c:1316: error: 'struct scsi_cmnd' has no member named 
  'request_bufflen'
  drivers/scsi/qla1280.c:1318: error: 'struct scsi_cmnd' has no member named 
  'request_bufflen'
  drivers/scsi/qla1280.c: In function 'qla1280_return_status':
  drivers/scsi/qla1280.c:1409: error: 'struct scsi_cmnd' has no member named 
  'request_bufflen'
  drivers/scsi/qla1280.c:1416: error: 'struct scsi_cmnd' has no member named 
  'resid'
  drivers/scsi/qla1280.c: In function 'qla1280_64bit_start_scsi':
  drivers/scsi/qla1280.c:2791: error: 'struct scsi_cmnd' has no member named 
  'use_sg'
  drivers/scsi/qla1280.c:2792: error: 'struct scsi_cmnd' has no member named 
  'request_buffer'
  drivers/scsi/qla1280.c:2793: error: 'struct scsi_cmnd' has no member named 
  'use_sg'
  drivers/scsi/qla1280.c:2801: error: 'struct scsi_cmnd' has no member named 
  'request_bufflen'
  drivers/scsi/qla1280.c:2896: error: 'struct scsi_cmnd' has no member named 
  'use_sg'
  drivers/scsi/qla1280.c:2991: error: 'struct scsi_cmnd' has no member named 
  'request_buffer'
  drivers/scsi/qla1280.c:2992: error: 'struct scsi_cmnd' has no member named 
  'request_bufflen'
  drivers/scsi/qla1280.c:3004: error: 'struct scsi_cmnd' has no member named 
  'request_bufflen'
  make[2]: *** [drivers/scsi/qla1280.o] Error 1
  
  It mystifies me how a patch like this can have been floating about in N
  submissions across M months and nobody has done an allmodconfig build or
  even a grep to find out what broke.
  
  ho hum.  I shall mark qla1280 BROKEN and shall plod onwards.
 
 A patch to fix this is in James' scsi-pending tree. Jes tested and
 fixed it (thanks !) so it will go to -mm via scsi-misc soon.

oh gawd.  So we have git-scsi-misc, git-scsi-rc-fixes and now
git-scsi-pending?

I hope you fixed imm, ppa and any other broken drivers?

 Boaz, it's better to send major scsi patches to -mm via scsi-misc to
 avoid problems like this.
 
 
 By the way, Andrew, can you add the following patchset to -mm?
 
 http://lkml.org/lkml/2007/10/24/138
 
 It fixes the IOMMUs' problem to merge scatter/gather segments without
 considering LLDs' restrictions.

hmm, OK, I saved them away to look at after next -mm.

Re: Add ACCUSYS RAID driver for Linux i386/x86-64

2007-10-24 Thread Andrew Morton

On Mon, 22 Oct 2007 18:17:49 +0800 Peter Chan [EMAIL PROTECTED] wrote:

 Dear Morton
 
 Thanks for your doing.
 We modified source code as your requested. If you have any comment please
 let me know.
 Do you need RAID HBA to test at this stage? If yes, Which address can i
 ship RAID HBA for you?

Please, you really will need to become a bit more familiar with the way we
work.  As far as I know, nobody in the linux world uses RAR format - that's
a windows thing.  I doubt if anyone except I has actually gone to the
effort to decrypt that attachment.

Start with

Documentation/SubmittingPatches
Documentation/SubmittingDrivers
Documentation/SubmitChecklist
http://www.zip.com.au/~akpm/linux/patches/stuff/tpp.txt
http://linux.yyz.us/patch-format.html

There are a number of remaining stylistic things which we can look at more
closely when we have patches which are in a usable form.

- Linux doesn't use capitalisation in variable names.  Use tail, not Tail

- Linux uses underscores to separate words.  Use reply_frame, not
  replyframe or ReplyFrame.

- We don't like to see code which has any dependency on
  LINUX_VERSION_CODE or KERNEL_VERSION: the code in Linux is supposed to
  work correctly in the version of the kernel in which it is found and
  that's it.

- I don't know what this:

+#if defined(CONFIG_MODVERSIONS) && !defined(MODVERSIONS)
+   #define MODVERSIONS
+#endif

  is doing, but it's probably wrong.

- Use request_node, not RequestNode, etc.

- Don't parenthesise the argument to `return'.

- I see at least one U32 in there.  Please use u32. (Does U32 even work?)

- This:

+static int acs_ame_get_log(
+   struct Acs_Adapter *acs_adt,
+   struct EventLog *event_log)

  isn't preferred style.  Use


static int acs_ame_get_log(struct Acs_Adapter *acs_adt,
struct EventLog *event_log)

  or, if you particularly dislike that, blow the 80-col rule and do

static int acs_ame_get_log(struct Acs_Adapter *acs_adt, struct EventLog *event_log)

- This

+   writel((replyframe), base_addr+AME_REPLY_MSG_PORT);

  is overparenthesised.

- Beware that the scatter/gather APIs just got significantly changed. 
  Your code might need adjustment to work against the latest mainline tree.

- What does CHAR_DEV do?  Probably it should be a Kconfig CONFIG_* option.

- All the code around acs_ame_schedule_command() (which is incorrectly
  identified as arcmsr_schedule_command in its comment block) is indented a
  tab stop.  That's really weird.  Please make it normal.

- acs_ame_schedule_command() has an up-to-sixty-second busywait.  Bad. 
  Can we get a sleep+wakeup in there?

- This:

+   struct
+   {
+   unsigned int   vendor_id;
+   unsigned int   device_id;
+   } const acs_ame_devices[] = {
+   { 0x14D6, DEVICEID_ACS_61000_XX }
+   , { 0x14D6, DEVICEID_ACS_62000_08 }
+   , { 0x1AB6, DEVICEID_ACS_61000_XX }
+   , { 0x1AB6, DEVICEID_ACS_62000_08 }
+   };

 should be

static const struct {
unsigned int   vendor_id;
unsigned int   device_id;
} acs_ame_devices[] = {
{ 0x14D6, DEVICEID_ACS_61000_XX },
{ 0x14D6, DEVICEID_ACS_62000_08 },
{ 0x1AB6, DEVICEID_ACS_61000_XX },
{ 0x1AB6, DEVICEID_ACS_62000_08 },
};

  which has many changes from the original.

  I'd have thought that the kernel already has a data type for this, but
  I can't find it.  Most drivers just rely upon the normal PCI device ID
  tables.  It is suspicious that this one doesn't.


Anyway, that's just from a quick scan.  There are a huge number of similar
issues in there.  Please take some time to study some well-maintained Linux
driver code and the interfaces which scsi and PCI drivers use and try to
make this driver a lot more Linux-like, thanks.




Re: [GIT PATCH] final SCSI pieces for the merge window

2007-10-24 Thread Andrew Morton

On Wed, 24 Oct 2007 09:28:10 -0400 James Bottomley [EMAIL PROTECTED] wrote:

 OK, so it's no secret that I'm the last of the subsystem maintainers
 whose day job isn't working on the linux kernel.

For the record, lots of subsystem maintainers are privateers.

<goes through the git trees>

I am not aware that these guys:

Mauro Chehab, Dmitry Torokhov, Sam Ravnborg, Pierre Ossman, Mark Hoffman,
Thomas Gleixner, David Airlie, Richard Purdie, Peter Anvin, Kyle McMartin,
Francois Romieu, Artem Bityutskiy, Erez Zadok, Josef Sipek, Anton
Altaparmakov, Eric Van Hensbergen, Latchesar Ionkov, Wim Van Sebroeck,
Antonino Daplas.

do it with any compensation.

Re: [GIT PATCH] final SCSI pieces for the merge window

2007-10-24 Thread Andrew Morton

On Wed, 24 Oct 2007 08:35:21 -0700 (PDT) Linus Torvalds [EMAIL PROTECTED] 
wrote:

 On Wed, 24 Oct 2007, James Bottomley wrote:
  
  OK, so it's no secret that I'm the last of the subsystem maintainers
  whose day job isn't working on the linux kernel.  If you want a full
  time person, who did you have in mind?
 
 Quite frankly, at least for me personally, what I would rather have (in 
 general: this is really not at all SCSI-specific in any way, shape, or 
 form, and not directed at James!) is a less rigid maintainership 
 structure.
 
 Let's face it, we are *all* likely to be overworked at different times, 
 and even when not overworked, it's just the fact that people need to take 
 a breather etc. And there is seldom - if ever - a very strong argument for 
 having one person per subsystem.

Am OK with all of that, but with a rider.  It would make my life even more
miserable if there was a (say) git-scsi-tweedledee and a
git-scsi-tweedledum.  We already have too much out-of-scope code turning up
in the git trees and having two trees explicitly modifying the same
subsystem would hurt.  It's also bad from an engineering POV: there's a
decent chance that when combined, they just won't work.

So Tweedledee and Tweedledum should both commit to the same tree, please.

Re: stex driver panic in kernel 2.6.23

2007-10-24 Thread Andrew Morton

On Wed, 24 Oct 2007 11:59:30 -0700 Ed Lin [EMAIL PROTECTED] wrote:

 The shared tag issue was not fixed yet. Kernel panic
 happened while running I/O test in kernel 2.6.23
 (information attached). After applying the patch I posted
 (or the version James modified), panic disappeared.
 Switch back to standard kernel, panic again.

Did either of those patches get merged in 2.6.24-rc1?

Re: [PATCH 1/4] [SCSI] ips: remove ips_ha members that duplicate struct pci_dev members

2007-10-24 Thread Andrew Morton

On Wed, 24 Oct 2007 19:48:26 -0400 (EDT) Jeff Garzik [EMAIL PROTECTED] wrote:

  drivers/scsi/ips.c |  178 
 

this driver seems a bit of a basket case :(


What's going on here?

scb->dcdb.cmd_attribute =
ips_command_direction[scb->scsi_cmd->cmnd[0]];

/* Allow a WRITE BUFFER Command to Have no Data */
/* This is Used by Tape Flash Utilites  */
if ((scb->scsi_cmd->cmnd[0] == WRITE_BUFFER) && (scb->data_len == 0))
scb->dcdb.cmd_attribute = 0;

if (!(scb->dcdb.cmd_attribute & 0x3))
scb->dcdb.transfer_length = 0;

if (scb->data_len >= IPS_MAX_XFER) {

I hope that's just busted indentation and not a missing {} block.
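
To illustrate the worry, a small sketch of what a missing {} block does,
assuming nothing about the ips driver itself.  unbraced() and braced() are
hypothetical helpers which count how many statements ran:

```c
/*
 * Hypothetical illustration only; not code from drivers/scsi/ips.c.
 * Without braces, an if governs exactly one statement, whatever the
 * indentation suggests.
 */
static int unbraced(int cond)
{
	int ran = 0;

	if (cond)
		ran++;
		ran++;		/* not part of the if: runs unconditionally */

	return ran;
}

/* With braces, both statements are governed by the condition. */
static int braced(int cond)
{
	int ran = 0;

	if (cond) {
		ran++;
		ran++;
	}

	return ran;
}
```

With a false condition unbraced() still executes its second statement
while braced() executes neither.  Recent gcc versions warn about the first
form via -Wmisleading-indentation, so a build with warnings enabled is one
way to tell busted indentation apart from a real bug.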

Re: oops in lbmIODone, fails to boot [Re: 2.6.23-mm1]

2007-10-19 Thread Andrew Morton

On Sat, 20 Oct 2007 13:57:54 +0900 Mattia Dongili [EMAIL PROTECTED] wrote:

 On Thu, Oct 11, 2007 at 09:31:26PM -0700, Andrew Morton wrote:
  
  ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.23/2.6.23-mm1/
 
 Hey there!!
 fails to boot here with this friendly oops:
 http://oioio.altervista.org/linux/dsc01702.jpg
 
 .config: http://oioio.altervista.org/linux/config-2.6.23-mm1-1
 
 2.6.23-rc8-mm2 booted ok but had other problems I haven't reported yet
 (no s2ram with mysql running and some net WARNING).
 Let's see if .23-mm1 still has those first.
 
 I'm adding Cc: linux-scsi
 
 PS: I'll hardly be able to bisect in the next days... :P

That looks like a Jens and Dave production to me.

Re: [PATCH] Make advansys depend on CONFIG_VIRT_TO_BUS

2007-10-18 Thread Andrew Morton

On Thu, 18 Oct 2007 22:20:17 -0700 Randy Dunlap [EMAIL PROTECTED] wrote:

 On Fri, 19 Oct 2007 15:04:31 +1000 Stephen Rothwell wrote:
 
  At least for now.
 
 Please explain why in the changelog (what changelog?).
 
 E.g.:
 so that make allmodconfig on powerpc will have a better chance
 of building.

My version of this patch does that.  I'll be sending it into Linus in an
hour or so.

Re: OOM killer gripe (was Re: What still uses the block layer?)

2007-10-16 Thread Andrew Morton

On Mon, 15 Oct 2007 23:37:44 +1000
Nick Piggin [EMAIL PROTECTED] wrote:

 Would an oom-kill-someone-now sysrq be of help, I wonder?

Is already there: sysrq-f.
