Re: simplify procfs code for seq_file instances
On Tue, 24 Apr 2018 16:23:04 +0200 Christoph Hellwig wrote:

> On Thu, Apr 19, 2018 at 09:57:50PM +0300, Alexey Dobriyan wrote:
> > > git://git.infradead.org/users/hch/misc.git proc_create
> >
> > I want to ask if it is time to start using poorman function overloading
> > with _b_c_e(). There are millions of allocation functions for example,
> > all slightly different, and people will add more. Seeing /proc interfaces
> > doubled like this is painful.
>
> Function overloading is totally unacceptable.
>
> And I very much disagree with a tradeoff that keeps 5000 lines of
> code vs a few new helpers.

OK, the curiosity and suspense are killing me. What the heck is
"function overloading with _b_c_e()"?
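For readers with the same question: a minimal sketch, assuming "_b_c_e()" abbreviates the GCC/Clang builtin __builtin_choose_expr(). The helper names describe_int() and describe_str() are hypothetical and exist only for this illustration; nothing here is taken from the proc_create series.

```c
#include <string.h>

/* Two hypothetical helpers that differ only in their argument type. */
static int describe_int(int v)
{
	return v;
}

static int describe_str(const char *s)
{
	return (int)strlen(s);
}

/*
 * "Poor man's overloading": choose the callee at compile time from the
 * static type of the argument. __builtin_choose_expr() and
 * __builtin_types_compatible_p() are compiler extensions for C, so this
 * is not portable ISO C.
 */
#define describe(x)						\
	__builtin_choose_expr(					\
		__builtin_types_compatible_p(__typeof__(x), int),	\
		describe_int,					\
		describe_str)(x)
```

With this macro, describe(42) calls describe_int() and describe("abc") calls describe_str(), with the choice made entirely by the compiler.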
Re: [PATCH 1/2] wd719x: Remove last declaration using DEFINE_PCI_DEVICE_TABLE
On Fri, 02 Sep 2016 06:36:05 -0400 "Martin K. Petersen" wrote:

> > "Joe" == Joe Perches writes:
>
> Joe> Convert it to the preferred const struct pci_device_id instead.
>
> Applied to 4.9/scsi-queue.

That creates an ordering dependency between the scsi tree and -mm's
"treewide: remove references to the now unnecessary DEFINE_PCI_DEVICE_TABLE".
So an ack would be preferred, please.

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Dirty/Writeback fields in /proc/meminfo affected by 20d74bf29c
On Mon, 1 Aug 2016 04:36:28 +0200 Tomas Vondra wrote:

> Hi,
>
> While investigating a strange OOM issue on the 3.18.x branch (which
> turned out to be already fixed by 52c84a95), I've noticed a strange
> difference in Dirty/Writeback fields in /proc/meminfo depending on
> kernel version. I'm wondering whether this is expected ...
>
> I've bisected the change to 20d74bf29c, added in 3.18.22 (upstream
> commit 4f258a46):
>
>     sd: Fix maximum I/O size for BLOCK_PC requests
>
> With /etc/sysctl.conf containing
>
>     vm.dirty_background_bytes = 67108864
>     vm.dirty_bytes = 1073741824
>
> a simple "dd" example writing 10GB file
>
>     dd if=/dev/zero of=ssd.test.file bs=1M count=10240
>
> results in about this on 3.18.21:
>
>     Dirty:            740856 kB
>     Writeback:         12400 kB
>
> but on 3.18.22:
>
>     Dirty:             49244 kB
>     Writeback:        656396 kB
>
> I.e. it seems to revert the relationship. I haven't identified any
> performance impact, and apparently for random writes the behavior did
> not change at all (or at least I haven't managed to reproduce it).
>
> But it's unclear to me why setting a maximum I/O size should affect
> this, and perhaps it has impact that I don't see.

So what appears to be happening here is that background writeback is
cutting in earlier - the amount of pending writeback ("Dirty") is
reduced while the amount of active writeback ("Writeback") is
correspondingly increased.

4f258a46 had the effect of permitting larger requests into the request
queue. It's unclear to me why larger requests would cause background
writeback to cut in earlier - the writeback code doesn't even care
about individual request sizes, it only cares about aggregate pagecache
state.

Less Dirty and more Writeback isn't necessarily a bad thing at all, but
I don't like mysteries. cc linux-mm to see if anyone else can
spot-the-difference.
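For anyone who wants to watch the two counters while reproducing this, here is a minimal sketch of a parser for meminfo-style text. The function name meminfo_kb() is made up for this example, and it works on a string so it can be tried without a live /proc; feeding it the contents of /proc/meminfo is left to the caller.

```c
#include <stdio.h>
#include <string.h>

/*
 * Look up a "<key>: <value> kB" line in meminfo-style text.
 * Returns the value in kB, or -1 if the key is not present.
 * The ':' check keeps "Writeback" from matching "WritebackTmp".
 */
static long meminfo_kb(const char *text, const char *key)
{
	size_t klen = strlen(key);
	const char *line = text;

	while (line && *line) {
		long val;

		if (strncmp(line, key, klen) == 0 && line[klen] == ':' &&
		    sscanf(line + klen + 1, "%ld", &val) == 1)
			return val;

		line = strchr(line, '\n');
		if (line)
			line++;		/* step past the newline */
	}
	return -1;
}
```

Sampling Dirty and Writeback once a second during the dd run on both kernels would show whether the shift is steady or only a transient at the start of the write.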
Re: [PATCH v2] byteswap: try to avoid __builtin_constant_p gcc bug
On Tue, 03 May 2016 01:10:16 +0200 Arnd Bergmann <a...@arndb.de> wrote:

> On Monday 02 May 2016 16:02:18 Andrew Morton wrote:
> > On Mon, 02 May 2016 23:48:19 +0200 Arnd Bergmann <a...@arndb.de> wrote:
> >
> > > This is another attempt to avoid a regression in wwn_to_u64() after
> > > that started using get_unaligned_be64(), which in turn ran into a
> > > bug on gcc-4.9 through 6.1.
> >
> > I'm still getting a couple screenfuls of things like
> >
> > net/tipc/name_distr.c: In function 'tipc_named_process_backlog':
> > net/tipc/name_distr.c:330: warning: format '%u' expects type 'unsigned int', but argument 3 has type 'unsigned int'
> > net/tipc/name_distr.c:330: warning: format '%u' expects type 'unsigned int', but argument 4 has type 'unsigned int'
> > net/tipc/name_distr.c:330: warning: format '%u' expects type 'unsigned int', but argument 5 has type 'unsigned int'
> > net/tipc/name_distr.c:330: warning: format '%u' expects type 'unsigned int', but argument 7 has type 'unsigned int'
>
> I've built a few thousand kernels (arm32 with gcc-6.1) with the patch applied,
> but didn't see this one. What target architecture and compiler version produced
> this? Does it go away if you add a (__u32) cast? I don't even know what the
> warning is trying to tell me.

heh, I didn't actually read it.

Hopefully we can write this off as a gcc-4.4.4 glitch. 4.8.4 is OK.
Re: [PATCH v2] byteswap: try to avoid __builtin_constant_p gcc bug
On Mon, 02 May 2016 23:48:19 +0200 Arnd Bergmann wrote:

> This is another attempt to avoid a regression in wwn_to_u64() after
> that started using get_unaligned_be64(), which in turn ran into a
> bug on gcc-4.9 through 6.1.

I'm still getting a couple screenfuls of things like

net/tipc/name_distr.c: In function 'tipc_named_process_backlog':
net/tipc/name_distr.c:330: warning: format '%u' expects type 'unsigned int', but argument 3 has type 'unsigned int'
net/tipc/name_distr.c:330: warning: format '%u' expects type 'unsigned int', but argument 4 has type 'unsigned int'
net/tipc/name_distr.c:330: warning: format '%u' expects type 'unsigned int', but argument 5 has type 'unsigned int'
net/tipc/name_distr.c:330: warning: format '%u' expects type 'unsigned int', but argument 7 has type 'unsigned int'
Re: mm: VM_BUG_ON_PAGE(PageTail(page)) in mbind
On Tue, 26 Jan 2016 22:28:29 +0200 "Kirill A. Shutemov" wrote:

> The patch below fixes the issue for me, but this bug makes me wonder how
> many bugs like this we have in kernel... :-/
>
> Looks like we are too permissive about which VMA is migratable:
> vma_migratable() filters out VMA by VM_IO and VM_PFNMAP.
> I think VM_DONTEXPAND also correlates with VMAs which cannot be migrated.
>
> $ git grep VM_DONTEXPAND drivers | grep -v '\(VM_IO\|VM_PFNMAN\)' | wc -l
> 33
>
> Hm.. :-|
>
> It's worth looking at them closely... And I wouldn't be surprised if some
> VMAs without all of these flags are not migratable too.
>
> Sigh.. Any thoughts?

Sigh indeed. I think that both VM_DONTEXPAND and VM_DONTDUMP are pretty
good signs that mbind() should not be mucking with this vma. If such a
policy sometimes results in mbind failing to set a policy then that's
not a huge loss - something runs a bit slower maybe.

I mean, we only really expect mbind() to operate against regular old
anon/pagecache memory, yes?
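A minimal userspace sketch of the two policies being discussed, under stated assumptions: the flag values below are illustrative (the authoritative definitions are in include/linux/mm.h), and migratable_current() and migratable_strict() are hypothetical names that model only the flag test, not the rest of what the kernel's vma_migratable() checks.

```c
#include <stdbool.h>

/* Illustrative flag bits; see include/linux/mm.h for the real ones. */
#define VM_PFNMAP	0x00000400UL
#define VM_IO		0x00004000UL
#define VM_DONTEXPAND	0x00040000UL
#define VM_DONTDUMP	0x04000000UL

/* The filter as described in the quoted text: VM_IO and VM_PFNMAP only. */
static bool migratable_current(unsigned long vm_flags)
{
	return !(vm_flags & (VM_IO | VM_PFNMAP));
}

/*
 * Hypothetical stricter policy from the reply: also treat
 * VM_DONTEXPAND and VM_DONTDUMP as "leave this vma alone".
 */
static bool migratable_strict(unsigned long vm_flags)
{
	return !(vm_flags & (VM_IO | VM_PFNMAP |
			     VM_DONTEXPAND | VM_DONTDUMP));
}
```

A mapping flagged only VM_DONTEXPAND | VM_DONTDUMP, which is what sg_mmap() set before the fix, passes the first test and fails the second.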
Re: mm: VM_BUG_ON_PAGE(PageTail(page)) in mbind
On Tue, 26 Jan 2016 22:28:29 +0200 "Kirill A. Shutemov" wrote:

> Let's mark the VMA as VM_IO to indicate to mm core that the VMA is
> not migratable.
>
> ...
>
> --- a/drivers/scsi/sg.c
> +++ b/drivers/scsi/sg.c
> @@ -1261,7 +1261,7 @@ sg_mmap(struct file *filp, struct vm_area_struct *vma)
>  	}
>
>  	sfp->mmap_called = 1;
> -	vma->vm_flags |= VM_DONTEXPAND | VM_DONTDUMP;
> +	vma->vm_flags |= VM_IO | VM_DONTEXPAND | VM_DONTDUMP;
>  	vma->vm_private_data = sfp;
>  	vma->vm_ops = &sg_mmap_vm_ops;
>  	return 0;

I'll put cc:stable on this - I don't think we recently did anything to
make this happen?
Re: [PATCH] scsi: debug: fix type mismatch warning for sg_pcopy_from_buffer
On Tue, 19 May 2015 23:22:39 +0200 Arnd Bergmann <a...@arndb.de> wrote:

> The recent change to mark the input argument of sg_pcopy_from_buffer
> had the unfortunate side-effect to cause a new warning in the
> scsi_debug code:
>
> drivers/scsi/scsi_debug.c: In function 'do_device_access':
> drivers/scsi/scsi_debug.c:2376:8: warning: assignment from incompatible pointer type [-Wincompatible-pointer-types]
>    func = sg_pcopy_from_buffer;
>
> This patch attempts to avoid that warning without adding evil
> type casts, but unfortunately makes the do_device_access function
> a lot uglier in the process.
>
> Signed-off-by: Arnd Bergmann <a...@arndb.de>
> Fixes: 5250326459 (lib/scatterlist: mark input buffer parameters as 'const')
> ---
> I can't decide if this is actually a good idea, or if we should rather
> drop the sg_pcopy_from_buffer() patch. Maybe someone else sees a better
> solution.

Could make do_device_access() call sg_copy_buffer() directly.

But yes, dropping the sg_pcopy_from/to_buffer changes is reasonable.
sg_copy_buffer() is bidirectional and that won't be changing, so
putting constified wrappers around it is kinda fake.
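A minimal sketch of the "call the bidirectional function directly" idea, in plain userspace C. copy_buffer() and device_access() are hypothetical stand-ins written for this note; they are not the signatures of sg_copy_buffer() or do_device_access(), and no scatterlist is involved.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a bidirectional copy helper. */
static size_t copy_buffer(unsigned char *store, size_t store_len,
			  unsigned char *buf, size_t buflen, bool to_buffer)
{
	size_t n = buflen < store_len ? buflen : store_len;

	if (to_buffer)
		memcpy(buf, store, n);	/* read: store -> caller's buffer */
	else
		memcpy(store, buf, n);	/* write: caller's buffer -> store */
	return n;
}

/*
 * One call site, with the direction chosen by a flag. No function
 * pointer is assigned, so there are no two differently const-qualified
 * prototypes to reconcile.
 */
static size_t device_access(unsigned char *store, size_t store_len,
			    unsigned char *buf, size_t buflen, bool do_write)
{
	return copy_buffer(store, store_len, buf, buflen, !do_write);
}
```

The cost is that the buffer parameter stays non-const for both directions, which matches the point that a bidirectional helper cannot honestly promise const.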
Re: [PATCH] scatterlist: enable sg chaining for all architectures
On Sat, 25 Apr 2015 23:56:16 +0900 Akinobu Mita <akinobu.m...@gmail.com> wrote:

> Some architectures enable sg chaining option while others do not.
>
> The requirement to enable sg chaining is that pages must be aligned
> at a 32-bit boundary in order to overload the LSB of the pointer.
> Regardless of whether ARCH_HAS_SG_CHAIN is defined or not, the above
> requirement is always checked by BUG_ON() in sg_assign_page. So all
> architectures can enable sg chaining.
>
> As you can see from the changes in drivers/target/target_core_rd.c,
> enabling SG chaining for all architectures allows us to allocate
> discontiguous scatterlist tables which can be traversed throughout by
> sg_next() without a special handling for some architectures.

Thanks, I'll grab this. If anyone has concerns, speak now or hold both pieces!
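The "overload the LSB of the pointer" requirement can be sketched in userspace as follows. This is an illustration of the general low-bit tagging trick only; SKETCH_CHAIN_BIT and the link_*() helpers are names invented here and are not the scatterlist API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical marker bit stored in the low bit of an aligned pointer. */
#define SKETCH_CHAIN_BIT ((uintptr_t)0x1)

static uintptr_t link_encode(void *ptr, bool is_chain)
{
	uintptr_t v = (uintptr_t)ptr;

	/* Same idea as the BUG_ON() mentioned above: the low bit must be free. */
	assert((v & SKETCH_CHAIN_BIT) == 0);
	return is_chain ? (v | SKETCH_CHAIN_BIT) : v;
}

static bool link_is_chain(uintptr_t link)
{
	return (link & SKETCH_CHAIN_BIT) != 0;
}

static void *link_decode(uintptr_t link)
{
	/* Mask the marker off again to recover the original pointer. */
	return (void *)(link & ~SKETCH_CHAIN_BIT);
}
```

Because the alignment check runs on every encode, the trick is safe wherever the check passes, which is the argument for enabling chaining on all architectures.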
Re: [patch 069/104] lib/string_helpers.c:string_get_size(): remove redundant prefixes
On Fri, 13 Feb 2015 16:05:54 -0800 James Bottomley <james.bottom...@hansenpartnership.com> wrote:

> @@ -42,31 +44,60 @@ void string_get_size(u64 size, const enum string_size_units units,
>  		[STRING_UNITS_2] = 1024,
>  	};
>  	int i, j;
> -	u32 remainder = 0, sf_cap;
> +	u32 remainder = 0, sf_cap, exp;
>  	char tmp[8];
> +	const char *unit;
>
>  	tmp[0] = '\0';
>  	i = 0;
> +	if (!size)
> +		goto out;

whitespace wart.

> +	if (blk_size >= divisor[units]) {
> +		while (blk_size >= divisor[units]) {
> +			remainder = do_div(blk_size, divisor[units]);
> +			i++;
> +		}
> +	}

The `if' doesn't do anything.

> +	exp = divisor[units];
> +	do_div(exp, blk_size);
> +	if (size >= exp) {
> +		remainder = do_div(size, divisor[units]);
> +		remainder *= blk_size;
> +		i++;
> +	} else {
> +		remainder *= size;
> +	}
> +	size *= blk_size;
> +	size += (remainder/divisor[units]);
> +	remainder %= divisor[units];
> +	if (size >= divisor[units]) {
>  		while (size >= divisor[units]) {
>  			remainder = do_div(size, divisor[units]);
>  			i++;
>  		}
> +	}

Here too.
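To spell out why the `if' is redundant: a while loop already tests its condition before the first iteration, so guarding it with the same condition changes nothing. A minimal sketch, with hypothetical function names and plain division standing in for do_div():

```c
#include <stdint.h>

/* Loop wrapped in an `if' that repeats the loop condition. */
static int steps_guarded(uint64_t size, uint32_t divisor)
{
	int i = 0;

	if (size >= divisor) {
		while (size >= divisor) {
			size /= divisor;
			i++;
		}
	}
	return i;
}

/* Same loop without the guard. Assumes divisor >= 2 so it terminates. */
static int steps_plain(uint64_t size, uint32_t divisor)
{
	int i = 0;

	while (size >= divisor) {
		size /= divisor;
		i++;
	}
	return i;
}
```

Both functions return the same count for every input, including sizes below the divisor, where the loop body simply never runs.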
Re: [patch 069/104] lib/string_helpers.c:string_get_size(): remove redundant prefixes
On Thu, 12 Feb 2015 15:25:08 -0800 James Bottomley <james.bottom...@hansenpartnership.com> wrote:

> On Thu, 2015-02-12 at 15:01 -0800, a...@linux-foundation.org wrote:
> > From: Rasmus Villemoes <li...@rasmusvillemoes.dk>
> > Subject: lib/string_helpers.c:string_get_size(): remove redundant prefixes
> >
> > While 3c9f3681d0b4 ("[SCSI] lib: add generic helper to print sizes
> > rounded to the correct SI range") says that Z and Y are included in
> > preparation for 128 bit computers, they just waste .text currently. If
> > and when we get u128, string_get_size needs updating anyway (and ISO
> > needs to come up with four more prefixes).
>
> This is rubbish. It's nothing to do with 128 bits. This is to do with
> disk sizes linux gets attached to. The current largest device clusters
> are Petabytes ... I think we may have some exabyte ones somewhere in the
> Academic community, so it's by no means inconceivable we'll have
> Zettabyte ones within a few years.
>
> The SCSI standard, with 4k blocks supports up to 2^76, which is well
> into Zettabytes. We obviously run off the mmap possibilities a lot
> sooner, because of the byte offsets, but that's fixable. Someone will
> probably start first by passing blocks into that interface not bytes, so
> we'd like it not to be based on assumptions that think 2^64 is the
> largest possible value.

I don't get it. As the man says, this is presently dead code and
string_get_size() will need to be changed to work for disks larger than
2^64 bytes. That change may be to take a u128 or it may be as you
suggest: replace the `u64 size' with `u64 size, u64 units' which is
effectively the same thing.

> > Also there's no need to include and test for the NULL sentinel; once we
> > reach E size is at most 18. [The test is also wrong; it should be
> > units_str[units][i+1]; if we've reached NULL we're already doomed.]
>
> So fix the bug, don't set us up to run off the end of the array. And
> please consult the community which keeps track of this rather than
> trying to get it into Linux without review.

That seems a bit harsh - you've been cc'ed on this every step of the way.
Re: [patch 069/104] lib/string_helpers.c:string_get_size(): remove redundant prefixes
On Thu, 12 Feb 2015 15:45:29 -0800 James Bottomley <james.bottom...@hansenpartnership.com> wrote:

> ...
>
> > I don't get it. As the man says, this is presently dead code and
> > string_get_size() will need to be changed to work for disks larger than
> > 2^64 bytes. That change may be to take a u128 or it may be as you
> > suggest: replace the `u64 size' with `u64 size, u64 units' which is
> > effectively the same thing.
>
> The first thing someone's going to do is pass in blocks, because that's
> the way the rest of block functions. If we're lucky they add ZB too, but
> if not we run off the end in some obscure large cluster somewhere. Don't
> set people up to make mistakes.

Well maybe. A little bit. But it assumes that someone is going to make
a change then not test it.

> > > > Also there's no need to include and test for the NULL sentinel; once we
> > > > reach E size is at most 18. [The test is also wrong; it should be
> > > > units_str[units][i+1]; if we've reached NULL we're already doomed.]
> > >
> > > So fix the bug, don't set us up to run off the end of the array. And
> > > please consult the community which keeps track of this rather than
> > > trying to get it into Linux without review.
> >
> > That seems a bit harsh - you've been cc'ed on this every step of the way.
>
> I think you need to check your scripts. This is the first time I've
> seen this patch, which is why I'm reacting this way.

No, james.bottom...@hansenpartnership.com was cc'ed on the original
email and on the -mm spam.

Perhaps Rasmus should also have cc'ed linux-scsi - practice seems to
vary a lot. But he did cc the scsi maintainer and the author of the
patch he was modifying (yourself).

So I think the patch is reasonable and the way Rasmus and I handled it
is also reasonable. Going nuts at us over it isn't reasonable!
Re: [Lsf-pc] [LSF/MM TOPIC] really large storage sectors - going beyond 4096 bytes
On Wed, 22 Jan 2014 11:30:19 -0800 James Bottomley <james.bottom...@hansenpartnership.com> wrote:

> But this, I think, is the fundamental point for debate. If we can pull
> alignment and other tricks to solve 99% of the problem is there a need
> for radical VM surgery? Is there anything coming down the pipe in the
> future that may move the devices ahead of the tricks?

I expect it would be relatively simple to get large blocksizes working
on powerpc with 64k PAGE_SIZE. So before diving in and doing huge
amounts of work, perhaps someone can do a proof-of-concept on powerpc
(or ia64) with 64k blocksize.

That way we'll at least have an understanding of what the potential
gains will be. If the answer is 1.5% then poof - go off and do
something else.

(And the gains on powerpc would be an upper bound - unlike powerpc, x86
still has to fiddle around with 16x as many pages and perhaps order-4
allocations(?))
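The "16x as many pages" and "order-4" figures follow from simple arithmetic: a 64k block is 16 pages of 4k, and 16 = 2^4. A minimal sketch of that calculation, with hypothetical helper names and sizes in bytes (page_size is assumed to be non-zero):

```c
#include <stddef.h>

/* Number of pages needed to hold one block, rounding up. */
static size_t pages_per_block(size_t block_size, size_t page_size)
{
	return (block_size + page_size - 1) / page_size;
}

/* Smallest order such that (page_size << order) covers one block. */
static unsigned int alloc_order(size_t block_size, size_t page_size)
{
	unsigned int order = 0;

	while ((page_size << order) < block_size)
		order++;
	return order;
}
```

With a 64k page size the same block is a single order-0 page, which is why a powerpc or ia64 experiment would give an upper bound on the gains.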
Re: [PATCH 1/1] remove cpqarray from mainline kernel
On Thu, 17 Oct 2013 12:52:26 -0500 Mike Miller <mike.mil...@hp.com> wrote:

> cpqarray hasn't been used in over 12 years. It's doubtful that anyone
> still uses the board. It's time the driver was removed from the
> mainline kernel. The only updates these days are minor and mostly done
> by people outside of HP.

It's amazing the weird stuff people get up to. Perhaps we should
disable it in config for a cycle or two, see if that flushes anyone out?
Re: [PATCH] block: Fix possible sleep in invalid context
On Mon, 1 Jul 2013 20:58:35 +0530 Sujit Reddy Thumma <sthu...@codeaurora.org> wrote:

> When block runtime PM is enabled following warning is seen
> while resuming the device.
>
> BUG: sleeping function called from invalid context
> at .../drivers/base/power/runtime.c:923
> in_atomic(): 1, irqs_disabled(): 128, pid: 12, name: kworker/0:1
> [c0014448] (unwind_backtrace+0x0/0x120)
> from [c03120e4] (__pm_runtime_suspend+0x34/0xa0)
> from [c021c33c] (blk_post_runtime_resume+0x4c/0x5c)
> from [c03297cc] (scsi_runtime_resume+0x90/0xb4)
> from [c0310940] (__rpm_callback+0x30/0x58)
> from [c0310980] (rpm_callback+0x18/0x28)
> from [c0311ab0] (rpm_resume+0x3dc/0x540)
> from [c03120a4] (pm_runtime_work+0x8c/0x98)
> from [c007767c] (process_one_work+0x238/0x3e4)
> from [c0077b90] (worker_thread+0x1ac/0x2ac)
> from [c007cfdc] (kthread+0x88/0x94)
> from [c000ece0] (kernel_thread_exit+0x0/0x8)
>
> Fix this by releasing spin_lock_irq() before calling
> pm_runtime_autosuspend() in blk_post_runtime_resume().
>
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -3159,16 +3159,18 @@ EXPORT_SYMBOL(blk_pre_runtime_resume);
>   */
>  void blk_post_runtime_resume(struct request_queue *q, int err)
>  {
> -	spin_lock_irq(q->queue_lock);
>  	if (!err) {
> +		spin_lock_irq(q->queue_lock);
>  		q->rpm_status = RPM_ACTIVE;
>  		__blk_run_queue(q);
>  		pm_runtime_mark_last_busy(q->dev);
> +		spin_unlock_irq(q->queue_lock);
>  		pm_runtime_autosuspend(q->dev);
>  	} else {
> +		spin_lock_irq(q->queue_lock);
>  		q->rpm_status = RPM_SUSPENDED;
> +		spin_unlock_irq(q->queue_lock);
>  	}
> -	spin_unlock_irq(q->queue_lock);
>  }
>  EXPORT_SYMBOL(blk_post_runtime_resume);
>  #endif

I suppose we can do this cleanly enough:

--- a/block/blk-core.c~block-fix-possible-sleep-in-invalid-context-fix
+++ a/block/blk-core.c
@@ -3159,15 +3159,14 @@ EXPORT_SYMBOL(blk_pre_runtime_resume);
  */
 void blk_post_runtime_resume(struct request_queue *q, int err)
 {
+	spin_lock_irq(q->queue_lock);
 	if (!err) {
-		spin_lock_irq(q->queue_lock);
 		q->rpm_status = RPM_ACTIVE;
 		__blk_run_queue(q);
 		pm_runtime_mark_last_busy(q->dev);
 		spin_unlock_irq(q->queue_lock);
 		pm_request_autosuspend(q->dev);
 	} else {
-		spin_lock_irq(q->queue_lock);
 		q->rpm_status = RPM_SUSPENDED;
 		spin_unlock_irq(q->queue_lock);
 	}
_

I wonder if we actually need locking around that second write to
q->rpm_status.
Re: [PATCH] block: Fix possible sleep in invalid context
On Mon, 01 Jul 2013 15:24:11 -0700 James Bottomley <james.bottom...@hansenpartnership.com> wrote:

> > --- a/block/blk-core.c~block-fix-possible-sleep-in-invalid-context-fix
> > +++ a/block/blk-core.c
> > @@ -3159,15 +3159,14 @@ EXPORT_SYMBOL(blk_pre_runtime_resume);
> >   */
> >  void blk_post_runtime_resume(struct request_queue *q, int err)
> >  {
> > +	spin_lock_irq(q->queue_lock);
> >  	if (!err) {
> > -		spin_lock_irq(q->queue_lock);
> >  		q->rpm_status = RPM_ACTIVE;
> >  		__blk_run_queue(q);
> >  		pm_runtime_mark_last_busy(q->dev);
> >  		spin_unlock_irq(q->queue_lock);
> >  		pm_request_autosuspend(q->dev);
> >  	} else {
> > -		spin_lock_irq(q->queue_lock);
> >  		q->rpm_status = RPM_SUSPENDED;
> >  		spin_unlock_irq(q->queue_lock);
> >  	}
> > _
> >
> > I wonder if we actually need locking around that second write to
> > q->rpm_status.
>
> Shouldn't: it's an int, which makes it a 32 bit quantity we believe to
> have atomic write properties on every platform.

Yes, but. If there's some other code path which does:

	spin_lock(queue_lock);
	x = q->rpm_status;
	...
	y = q->rpm_status;
	...
	assumes x == y
	spin_unlock(queue_lock);

then it blows up if we make the suggested change. Stranger things have
happened...
Re: [PATCH 1/1] cciss: add cciss_allow_hpsa module parameter
On Thu, 18 Apr 2013 13:49:37 -0500 Mike Miller <mike.mil...@hp.com> wrote:

> Add the cciss_allow_hpsa module parameter. This allows users to use the
> hpsa driver instead of cciss for older controllers.
>
> Tested with 3.9.0-rc7 in combination with the bug fix submitted Tuesday.
> My apologies for not testing that patch with the correct kernel.

Could you please resend Tuesday's bug fix, with a much better
explanation than v1 had? It's totally weird and wonderful that there's
an interaction between kdump and one scsi/block driver so let's try to
get a diagnosis into the record.
Re: [Patch 1/1] cciss: bug fix, prevent cciss from loading in kdump kernel
On Mon, 15 Apr 2013 12:59:06 -0500 Mike Miller <mike.mil...@hp.com> wrote:

> Patch 1/1
>
> If hpsa is selected as the Smart Array driver cciss may try to load in the
> kdump kernel. When this happens kdump fails and a core file cannot be created.
> This patch prevents cciss from trying to load in this scenario. This affects
> primarily older Smart Array controllers.
>
> ...
>
> --- a/drivers/block/cciss.c
> +++ b/drivers/block/cciss.c
> @@ -4960,6 +4960,12 @@ static int cciss_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
>  	ctlr_info_t *h;
>  	unsigned long flags;
>
> +	/*
> +	 * if this is the kdump kernel and the user has set the flags to
> +	 * use hpsa rather than cciss just bail
> +	 */
> +	if ((reset_devices) && (cciss_allow_hpsa == 1))
> +		return -ENODEV;

OK, wazzup. That's the only occurrence of the symbol cciss_allow_hpsa
in Linux and needless to say, the compiler laughed at me.
Re: [Patch 1/1] cciss: bug fix, prevent cciss from loading in kdump kernel
On Mon, 15 Apr 2013 12:59:06 -0500 Mike Miller <mike.mil...@hp.com> wrote:

> Patch 1/1
>
> If hpsa is selected as the Smart Array driver cciss may try to load in the
> kdump kernel. When this happens kdump fails and a core file cannot be created.
> This patch prevents cciss from trying to load in this scenario. This affects
> primarily older Smart Array controllers.

OK, this is weird. kdump and scsi drivers are pretty darn remote
things and I've never heard of such an interaction.

Can you tell us a bit more about how and why this happened? Is there
something special about cciss, or can we expect similar kdump
interactions with other device drivers?
Re: [PATCH -mmotm] scsi: fix the wrong position of the comment
On Sun, 10 Mar 2013 08:22:47 + James Bottomley <jbottom...@parallels.com> wrote:

> [missing SCSI cc added]
>
> On Sun, 2013-03-10 at 17:09 +0900, Akinobu Mita wrote:
> > This fixes the wrong position of the comment introduced by
> > scsi-rename-random32-to-prandom_u32.patch in the -mm tree.
> >
> > Signed-off-by: Akinobu Mita <akinobu.m...@gmail.com>
> > Cc: James E.J. Bottomley <jbottom...@parallels.com>
> > Cc: Andrew Vasquez <andrew.vasq...@qlogic.com>
> > ---
> >  drivers/scsi/qla2xxx/qla_attr.c | 6 +++---
> >  1 file changed, 3 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/scsi/qla2xxx/qla_attr.c b/drivers/scsi/qla2xxx/qla_attr.c
> > index 04bf7b8..e44d47e 100644
> > --- a/drivers/scsi/qla2xxx/qla_attr.c
> > +++ b/drivers/scsi/qla2xxx/qla_attr.c
> > @@ -1939,13 +1939,13 @@ qla24xx_vport_delete(struct fc_vport *fc_vport)
> >  	}
> >
> >  	/* No pending activities shall be there on the vha now */
> > -	if (ql2xextended_error_logging & ql_dbg_user)
> > -		msleep(prandom_u32() % 10);
> > +	if (ql2xextended_error_logging & ql_dbg_user) {
> >  		/*
> >  		 * Just to see if something falls on the net we have placed
> >  		 * below
> >  		 */
> > -
> > +		msleep(prandom_u32() % 10);
> > +	}
>
> I don't give a toss if it's random or prandom: Andrew: get rid of it; we
> do not sleep in kernel for random intervals whatever the provocation ...
> if this is supposed to be a warning or error condition then print
> something.

That msleep was added by

commit feafb7b1714cf599a6d0fed45801ab3f66046cbd
Author:     Arun Easi <arun.e...@qlogic.com>
AuthorDate: Fri Sep 3 14:57:00 2010 -0700
Commit:     James Bottomley <james.bottom...@suse.de>
CommitDate: Sun Sep 5 15:13:12 2010 -0300

    [SCSI] qla2xxx: Fix vport delete issues
Re: [PATCH][SCSI] hptiop: Support HighPoint RR4520/RR4522 HBA
On Wed, 24 Oct 2012 11:28:54 +0800 HighPoint Linux Team <li...@highpoint-tech.com> wrote:

> Support HighPoint RR4520/RR4522 HBAs which are based on Marvell Frey.
>
> Signed-off-by: HighPoint Linux Team <li...@highpoint-tech.com>
>
>  Documentation/scsi/hptiop.txt |   69 ++-
>  drivers/scsi/hptiop.c         |  413 --
>  drivers/scsi/hptiop.h         |   72 +++
>  3 files changed, 530 insertions(+), 24 deletions(-)

The patch is terribly wordwrapped and has its tabs replaced with spaces.

I suggest you resend it as a text/plain email attachment, please.
Re: next-20120925: BUG at drivers/scsi/scsi_lib.c:640!
(cc's added) On Tue, 25 Sep 2012 22:06:37 +0400 Dmitry Monakhov dmonak...@openvz.org wrote: Seems like barriers are broken again kernel BUG at drivers/scsi/scsi_lib.c:1180! invalid opcode: [#1] SMP Modules linked in: coretemp kvm_intel kvm crc32c_intel ghash_clmulni_intel microcode sg xhci_hcd button ext3 jbd mbcache sd_mod crc_t10dif\ elper cryptd lrw aes_x86_64 xts gf128mul ahci libahci pata_acpi ata_generic dm_mirror dm_region_hash dm_log dm_mod CPU 0 Pid: 753, comm: fsck.ext3 Not tainted 3.6.0-rc7-next-20120925+ #4 /DQ67SW RIP: 0010:[81470dbc] [81470dbc] scsi_setup_fs_cmnd+0xec/0x180 RSP: 0018:880233aff9f8 EFLAGS: 00010002 RAX: 0003 RBX: 88022a741000 RCX: 0002 RDX: RSI: 0001 RDI: 81f32b48 RBP: 880233affa18 R08: 0001 R09: R10: 88022a26c800 R11: R12: 880229369968 R13: 0001 R14: 88022a741000 R15: FS: 7f1348632760() GS:88023e20() knlGS: CS: 0010 DS: ES: CR0: 80050033 CR2: 003a3dc0e550 CR3: 0002338cf000 CR4: 000407f0 DR0: DR1: DR2: DR3: DR6: 0ff0 DR7: 0400 Process fsck.ext3 (pid: 753, threadinfo 880233afe000, task 880233f48240) Stack: 880233affa48 880229369968 0001 880229bdb550 880233affaa8 a00a8860 880233affab8 0082 8107d696 8802 817410d8 Call Trace: [a00a8860] sd_prep_fn+0x140/0xfe0 [sd_mod] [8107d696] ? lock_timer_base+0x76/0xf0 [817410d8] ? _raw_spin_unlock_irq+0x48/0x80 [8130023c] blk_peek_request+0x23c/0x450 [8146fad0] scsi_request_fn+0x70/0x820 [812f54e5] __blk_run_queue+0x55/0x70 [8132a065] cfq_rq_enqueued+0x155/0x1c0 [8132a386] cfq_insert_request+0x2b6/0x2f0 [8132a11d] ? cfq_insert_request+0x4d/0x2f0 [812f002f] ? md5_final+0x9f/0x130 [810e5463] ? __lock_release+0xc3/0xe0 [812fe074] ? drive_stat_acct+0x334/0x3b0 [812f4be6] __elv_add_request+0x2a6/0x350 [813010fb] blk_queue_bio+0x52b/0x570 [812fd8f5] generic_make_request+0x125/0x1c0 [812fdb68] submit_bio+0x1d8/0x240 [81250c63] ? 
bio_alloc_bioset+0x103/0x1e0 [813039e7] blkdev_issue_flush+0x177/0x200 [81253afa] blkdev_fsync+0x4a/0x70 [81245af6] vfs_fsync_range+0x36/0x60 [81245b3c] vfs_fsync+0x1c/0x20 [81245ea8] do_fsync+0x58/0x90 [81246100] sys_fsync+0x10/0x20 [8174e539] system_call_fastpath+0x16/0x1b Code: 00 48 c7 c7 48 2b f3 81 41 0f 94 c5 31 d2 44 89 ee e8 d9 e4 cd ff 49 63 c5 48 83 c0 02 48 83 04 c5 b0 a5 13 82 01 45 85 ed 74 04 0f\ 48 89 df 31 db e8 a3 f6 ff ff 48 85 c0 48 RIP [81470dbc] scsi_setup_fs_cmnd+0xec/0x180 RSP 880233aff9f8 [ cut here ] kernel BUG at drivers/scsi/scsi_lib.c:640! invalid opcode: [#1] SMP Modules linked in: coretemp kvm_intel kvm crc32c_intel ghash_clmulni_intel microcode sg xhci_hcd button ext3 jbd mbcache sd_mod crc_t10dif\ elper cryptd lrw aes_x86_64 xts gf128mul ahci libahci pata_acpi ata_generic dm_mirror dm_region_hash dm_log dm_mod CPU 0 Pid: 727, comm: fsck.ext3 Not tainted 3.6.0-rc7-next-20120925+ #5 /DQ67SW RIP: 0010:[81470585] [81470585] scsi_alloc_sgtable+0x55/0xe0 RSP: 0018:880228215aa8 EFLAGS: 00010002 RAX: 0003 RBX: 880228111a18 RCX: 0001 RDX: RSI: 0001 RDI: 81f32a08 RBP: 880228215ac8 R08: 0001 R09: R10: 0002 R11: R12: R13: 0020 R14: 0001 R15: FS: 7fb605f35760() GS:88023e20() knlGS: CS: 0010 DS: ES: CR0: 80050033 CR2: 003a3dc0e550 CR3: 000233e83000 CR4: 000407f0 DR0: DR1: DR2: DR3: DR6: 0ff0 DR7: 0400 Process fsck.ext3 (pid: 727, threadinfo 880228214000, task 880233af8c80) Stack: 880228111a18 88022a0a0638 88022a679000 880228215b08 81470641 8802281119c0 88022a679000 880228215b28 8802281119c0 88022a0a0638 0020 Call Trace: [81470641] scsi_init_sgtable+0x31/0xe0 [81470a2d] scsi_init_io+0x3d/0x2e0 [81470e23] scsi_setup_fs_cmnd+0x153/0x180 [a00a8860] sd_prep_fn+0x140/0xfe0 [sd_mod] [8135afec]
Re: [PATCH] fcoe: Remove redundant 'less than zero' check
On Thu, 05 Jul 2012 07:52:25 -0700 Robert Love <robert.w.l...@intel.com> wrote:

> strtoul returns an 'unsigned long' so there is no reason to
> check if the value is less than zero. strtoul already checks
> for the '-' character deep in its bowels. It will return an
> error if the user has provided a negative value and
> fcoe_str_to_dev_loss will return that error to its caller.

huh, I never knew that. So if we feed -1 to kstrtoul() it gets treated
as an error? That seems a bit surprising. You're sure about that?

> This patch fixes the following Coverity reported warning:
>
> CID 703581 - NO_EFFECT Unsigned compared against 0 - This
> less-than-zero comparison of an unsigned value is never true.
> *val < 0UL.
> drivers/scsi/fcoe/fcoe_sysfs.c:105
>
> Signed-off-by: Robert Love <robert.w.l...@intel.com>
> ---
>  drivers/scsi/fcoe/fcoe_sysfs.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/scsi/fcoe/fcoe_sysfs.c b/drivers/scsi/fcoe/fcoe_sysfs.c
> index 2bc1631..5e75168 100644
> --- a/drivers/scsi/fcoe/fcoe_sysfs.c
> +++ b/drivers/scsi/fcoe/fcoe_sysfs.c
> @@ -102,7 +102,7 @@ static int fcoe_str_to_dev_loss(const char *buf, unsigned long *val)
>  	int ret;
>
>  	ret = kstrtoul(buf, 0, val);
> -	if (ret || *val < 0)
> +	if (ret)
>  		return -EINVAL;
>  	/*
>  	 * Check for overflow; dev_loss_tmo is u32
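On the "-1" question above, a minimal userspace sketch of the behaviour Robert describes. This is not the kernel's kstrtoul() implementation; parse_ulong_strict() is a hypothetical name, and the point is only to show a parser that refuses a leading minus sign, in contrast to C's strtoul(), which accepts "-1" and returns the negated value (ULONG_MAX).

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Parse an unsigned long strictly: an optional '+', then digits, then
 * at most one trailing newline. A leading '-' is an error.
 * Returns 0 on success or a negative errno value.
 */
static int parse_ulong_strict(const char *s, unsigned long *out)
{
	char *end;
	unsigned long v;

	if (*s == '+')
		s++;
	if (!isdigit((unsigned char)*s))
		return -EINVAL;		/* rejects "-1", "" and " 1" */

	errno = 0;
	v = strtoul(s, &end, 0);	/* base 0: accepts 0x.. and 0.. */
	if (errno == ERANGE)
		return -ERANGE;
	if (*end == '\n')
		end++;			/* tolerate one trailing newline */
	if (*end != '\0')
		return -EINVAL;		/* trailing garbage */

	*out = v;
	return 0;
}
```

If the kernel helper behaves this way, then the removed `*val < 0' test was dead code twice over: an unsigned value is never negative, and a negative input never gets as far as *val.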
Re: arcmsr areca-1660 - strange behaviour under heavy load
On Tue, 26 Feb 2008 10:35:31 +0100 (CET) Nikola Ciprich [EMAIL PROTECTED] wrote:

> Hi
>
> On Sun, 24 Feb 2008, Andrew Morton wrote:
>
> Hi Andrew,
> thanks a lot for reply, I'm attaching requested information.
> please let me know if You need more information/testing, whatever.
> I'll be glad to help.
> BR
> nik
>
> > > Areca support doesn't seem to be very interested in the problem :-(
> >
> > (cc's added)
> >
> > Please get the machine into this state of memory exhaustion then take
> > copies of the output of the following, and send them via reply-to-all
> > to this email:
> >
> > - cat /proc/meminfo
> > - cat /proc/slabinfo
> > - dmesg -c > /dev/null ; echo m > /proc/sysrq-trigger ; dmesg -c
> >
> > Thanks.

Alas, that all looks OK to me.

You never get any out-of-memory messages, and no oom-killing messages?

Possibly what is happening here is that in this low-memory condition, some of the driver's internal memory-allocation attempts are failing, and the driver isn't correctly handling this. This is a rare situation which may well not have been hit in anyone else's testing.

I expect that the Areca engineers will be able to reproduce this with a suitably small mem= kernel boot option. If not, they could perhaps investigate the kernel's fault-injection framework, which permits simulation of page allocation failures.
Re: arcmsr areca-1660 - strange behaviour under heavy load
On Sat, 23 Feb 2008 12:20:12 +0100 (CET) Nikola Ciprich [EMAIL PROTECTED] wrote:

> Hi,
>
> I've found strange problem either in arcmsr driver, or maybe in
> areca-1660 card... When system on SAS discs RAID connected to areca-1660
> card gets under heavy I/O load, it gets unusable after some time. I can
> 100% reproduce this, although it needs quite specific conditions:
>
> It can be reproduced on 2x quad core machine, RAM has to be limited to
> ~192MB to cause heavy paging. Only thing needed to cause the problem is
> to start loop doing kernel compilation using make -j 8 - this loads the
> system heavily, because of lack of memory. After few correct compile runs
> the system gets into state when all programs including the basic ones
> (ls, cp, ..) start crashing... dmesg (when it works) doesn't say anything
> strange... After reboot, the system is OK again.
>
> I have tested it on different motherboards, with different CPUs, RAMs
> (all were properly tested with memtest), with two different areca cards
> and different drives. I can't reproduce the problem on same hardware when
> using different RAID card (ie adaptec). All testing systems were properly
> cooled.. I have tried all available areca firmwares, two different
> distributions (oracle linux, and centos), and kernels ranging from
> distribution ones, to last GIT snapshot.
>
> Could somebody please give me some hints on how to hunt this problem?
> Areca support doesn't seem to be very interested in the problem :-(

(cc's added)

Please get the machine into this state of memory exhaustion then take copies of the output of the following, and send them via reply-to-all to this email:

- cat /proc/meminfo
- cat /proc/slabinfo
- dmesg -c > /dev/null ; echo m > /proc/sysrq-trigger ; dmesg -c

Thanks.
Re: [GIT PATCH] scsi fixes for 2.6.25-rc2
On Sat, 23 Feb 2008 12:31:02 -0800 (PST) Linus Torvalds [EMAIL PROTECTED] wrote:

> On Sat, 23 Feb 2008, Jeff Garzik wrote:
> >
> > I know I am probably shooting myself in the foot here, since I am the
> > original author of mvsas, but... Should we be adding new drivers
> > during -rc?
>
> I'm personally of the opinion that a new driver that doesn't add anything
> but itself (ie no infrastructure changes etc) is fine. I'd rather have a
> new, rough driver that might work, than no driver at all, and it's not
> like it can cause a regression if you don't enable it.

Yes, I too think that adding new standalone code in late -rc is OK. Especially drivers, because a new driver is a bugfix for people who own that hardware!
Re: LSI Logic MegaRAID SATA 150-4 / LSI Logic New Generation RAID Device Drivers (MEGARAID_NEWGEN) problems (megaraid abort: scsi cmd:14600, do now own)
(cc's added) On Mon, 18 Feb 2008 21:09:22 -0500 David M. Strang [EMAIL PROTECTED] wrote: Greetings - A couple months back I purchased a LSI Logic MegaRAID ATA 150-4 controller, as well as 3 Seagate 500GB SATA-II hard drives to use in my system. Previously, I was using a pair of WD4000YR's in software raid, which seemed to work well. I've just not gotten around to working on migrating my data to these new drivers + controller, and it's giving me some issues. As with most, I'm having some severe performance issues, the performance is simply abysmal. Before getting into the details, here is a quick overview of my configuration: System: Tyan Tiger i7320/R (S5350) System Board 2x Intel Xeon 3.0 GHz 4GB RAM LSI Logic MegaRAID ATA 150-4 controller - Firmware Revision: 713S 3x Seagate 7200.10 (Perpendicular Recording) ST3500630AS 500GB SATA-II drives configured as a RAID-1 array with a HotSpare. Also, connected to the onboard controller is a WD4000YR, where all of my data currently resides. I'm running Gentoo Hardended AMD64 MultiLib (/usr/portage/profiles/hardened/amd64/multilib) My current kernel revision is 2.6.23-hardened-r7. Here are some (possibly) relevant snippets from dmesg during startup: ... 
megaraid cmm: 2.20.2.7 (Release Date: Sun Jul 16 00:01:03 EST 2006) megaraid: 2.20.5.1 (Release Date: Thu Nov 16 15:32:35 EST 2006) megaraid: probe new device 0x1000:0x1960:0x1000:0x4523: bus 3:slot 3:func 0 ACPI: PCI Interrupt :03:03.0[A] - GSI 24 (level, low) - IRQ 24 megaraid: fw version:[713S] bios version:[G121] scsi0 : LSI Logic MegaRAID driver scsi[0]: scanning scsi channel 0 [Phy 0] for non-raid devices scsi[0]: scanning scsi channel 1 [virtual] for logical drives scsi 0:1:0:0: Direct-Access MegaRAID LD 0 RAID1 476G 713S PQ: 0 ANSI: 2 sd 0:1:0:0: [sda] 976762880 512-byte hardware sectors (500103 MB) sd 0:1:0:0: [sda] Write Protect is off sd 0:1:0:0: [sda] Mode Sense: 00 00 00 00 sd 0:1:0:0: [sda] Asking for cache data failed sd 0:1:0:0: [sda] Assuming drive cache: write through sd 0:1:0:0: [sda] 976762880 512-byte hardware sectors (500103 MB) sd 0:1:0:0: [sda] Write Protect is off sd 0:1:0:0: [sda] Mode Sense: 00 00 00 00 sd 0:1:0:0: [sda] Asking for cache data failed sd 0:1:0:0: [sda] Assuming drive cache: write through sda: sda1 sda2 sda3 sda4 sd 0:1:0:0: [sda] Attached SCSI disk ata_piix :00:1f.2: version 2.12 ata_piix :00:1f.2: MAP [ P0 -- P1 -- ] ACPI: PCI Interrupt :00:1f.2[A] - GSI 18 (level, low) - IRQ 18 PCI: Setting latency timer of device :00:1f.2 to 64 scsi1 : ata_piix scsi2 : ata_piix ata1: SATA max UDMA/133 cmd 0x000114a0 ctl 0x0001149a bmdma 0x00011470 irq 18 ata2: SATA max UDMA/133 cmd 0x00011490 ctl 0x00011486 bmdma 0x00011478 irq 18 ata1.00: ATA-7: WDC WD4000YR-01PLB0, 01.06A01, max UDMA/133 ata1.00: 781422768 sectors, multi 16: LBA48 NCQ (depth 0/32) ata1.00: configured for UDMA/133 scsi 1:0:0:0: Direct-Access ATA WDC WD4000YR-01P 01.0 PQ: 0 ANSI: 5 sd 1:0:0:0: [sdb] 781422768 512-byte hardware sectors (400088 MB) sd 1:0:0:0: [sdb] Write Protect is off sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00 sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA sd 1:0:0:0: [sdb] 781422768 512-byte hardware sectors 
(400088 MB) sd 1:0:0:0: [sdb] Write Protect is off sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00 sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA sdb: sdb1 sdb2 sdb3 sdb4 sd 1:0:0:0: [sdb] Attached SCSI disk ... My controller is configured for Write Back Caching, Adaptive Read Ahead, and Direct I/O (I've also tried cached I/O but it scared me...) The first thing I'm noticing is the horrible performance on the raid disk, compared to the single standalone hard disk. Here is the output from hdparm -tT on the single disk: -([EMAIL PROTECTED])-(~)- # hdparm -tT /dev/sdb1 /dev/sdb1: Timing cached reads: 1670 MB in 2.00 seconds = 835.00 MB/sec Timing buffered disk reads: 140 MB in 3.01 seconds = 46.45 MB/sec And then, the output from the raid-1 array: -([EMAIL PROTECTED])-(~)- # hdparm -tT /dev/sda1 /dev/sda1: Timing cached reads: 1718 MB in 2.00 seconds = 859.65 MB/sec Timing buffered disk reads: 92 MB in 3.09 seconds = 29.76 MB/sec I'm not sure what the deal is with the buffered disk reads being so much WORSE than a single disk. So poor performance is a concern, but what's more alarming are the messages showing up in DMESG. When I first tried Cached IO - performance seemed good... except, dmesg was littered with these errors (?): megaraid: aborting-14610 cmd=2a c=1 t=0 l=0 megaraid abort: scsi cmd:14610, do now own megaraid: aborting-14612 cmd=2a c=1 t=0 l=0 megaraid abort: scsi cmd:14612, do now own megaraid: aborting-14614 cmd=2a c=1 t=0 l=0 megaraid abort: scsi cmd:14614, do
Re: [PATCH 1/1] cciss: procfs updates to display info about many volumes
On Tue, 19 Feb 2008 11:48:18 +0100 Jens Axboe [EMAIL PROTECTED] wrote:

> On Mon, Feb 11 2008, Mike Miller wrote:
> > Patch 1 of 1
> >
> > This patch allows us to display information about all of the logical
> > volumes configured on a particular controller without stepping on
> > memory even when there are many volumes (128 or more) configured.
> > This patch replaces the one submitted on 20071214. See
> > http://groups.google.com/group/linux.kernel/browse_thread/thread/49a50244b19f8855/ba3dc95b23391521?hl=en&lnk=gst&q=cciss#ba3dc95b23391521
> > which has not been merged. That patch displayed information about only
> > the first logical volume on each controller and had negative side
> > effects for some installers.
> > Please consider this for inclusion.
>
> It looks ok, but has some flaws. Try to disable cciss scsi and tape
> support:
>
> In file included from drivers/block/cciss.c:231:
> drivers/block/cciss_scsi.c:1498:38: error: macro parameters must be comma-separated
> drivers/block/cciss.c: In function 'cciss_seq_show_header':
> drivers/block/cciss.c:272: error: implicit declaration of function 'cciss_seq_tape_report'
> drivers/block/cciss.c: In function 'cciss_proc_write':
> drivers/block/cciss.c:393: error: implicit declaration of function 'cciss_engage_scsi'
>
> Your macro definition of cciss_seq_tape_report() is totally busted.
> Either write it as a macro OR as a function.
>
> Fix these up and resubmit, then I'll take it.

It also needs to be updated to use the non-racy proc_create(), please, as per Alexey's comments.
Re: gdth new set of patches for 2.6.24 stable
On Sun, 17 Feb 2008 18:46:03 +0200 Boaz Harrosh [EMAIL PROTECTED] wrote: ... All my testers have reported back that with these 5 patches applied they can now run with a 2.6.24 kernel the same way they ran before. However there is that reported issue, with the dma_free_coherent WARN_ON (above). The code was like that from day one and it is a very old issue, however it is a regression because 2.6.24 introduced that new WARN_ON. (infamous commit aa24886e379d2b641c5117e178b15ce1d5d366ba) From posts on lkml and even recent one in linux-scsi about the arcmsr driver it looks that all a driver can do is work around it with different kernel mechanisms and driver rewrites. I'm afraid I need your help here. I'm not sure I understand why does the gdth driver uses the pci_{alloc,free}_consistent() API's, and what is needed to replace it. Could you please have a look in gdth_proc.c and also in gdth.c for all the places that call gdth_ioctl_alloc/gdth_ioctl_free, and advise what can I do in it's place. Please bear in mind that we need it for 2.6.24, as a bugfix. Apart from the above issue, please accept patches 3,4,5 above they have now been tested and are reported to bring broken system back to production. (Given that you approve off course). And mark them for inclusion to the 2.6.24 stable releases. (Or is there some thing that I should do) --- Meanwhile on x86 systems I understand the WARN_ON is cosmetic, and does not pose any harm. Some people have reported stability with temporarily disabling it. For testers that want to try, here it is below. At your own risk. --- From 50d3657bf6a138ee63ad1ce00052380edc75ace7 Mon Sep 17 00:00:00 2001 From: Boaz Harrosh [EMAIL PROTECTED] Date: Sun, 17 Feb 2008 12:49:35 +0200 Subject: [PATCH] gdth: Hack to remove WARN_ON in arch/x86/kernel/pci-dma_32.c gdth uses dma_free_coherent() with interrupts disabled. Which is not portable, but is safe on the HW that supports gdth. 
NOT Signed-off-by: Boaz Harrosh [EMAIL PROTECTED] --- arch/x86/kernel/pci-dma_32.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/arch/x86/kernel/pci-dma_32.c b/arch/x86/kernel/pci-dma_32.c index 5133032..350dcfd 100644 --- a/arch/x86/kernel/pci-dma_32.c +++ b/arch/x86/kernel/pci-dma_32.c @@ -63,7 +63,7 @@ void dma_free_coherent(struct device *dev, size_t size, struct dma_coherent_mem *mem = dev ? dev-dma_mem : NULL; int order = get_order(size); - WARN_ON(irqs_disabled()); /* for portability */ +/* WARN_ON(irqs_disabled());*/ /* for portability */ if (mem vaddr = mem-virt_base vaddr (mem-virt_base + (mem-size PAGE_SHIFT))) { int page = (vaddr - mem-virt_base) PAGE_SHIFT; Yes. Let's reprise aa24886e379d2b641c5117e178b15ce1d5d366ba: : commit aa24886e379d2b641c5117e178b15ce1d5d366ba : Author: David Brownell [EMAIL PROTECTED] : Date: Fri Aug 10 13:10:27 2007 -0700 : : dma_free_coherent() needs irqs enabled (sigh) : : On at least ARM (and I'm told MIPS too) dma_free_coherent() has a newish : call context requirement: unlike its dma_alloc_coherent() sibling, it may : not be called with IRQs disabled. (This was new behavior on ARM as of late : 2005, caused by ARM SMP updates.) This little surprise can be annoyingly : driver-visible. : : Since it looks like that restriction won't be removed, this patch changes : the definition of the API to include that requirement. Also, to help catch : nonportable drivers, it updates the x86 and swiotlb versions to include the : relevant warnings. (I already observed that it trips on the : bus_reset_tasklet of the new firewire_ohci driver.) : In general, all Linux memory-freeing functions can be called from all contexts. (vfree is an irritating exception). This is good, and provides maximum usefulness to callees, as all utility functions should seek to do. It would be best to fix arm and mips. But arm and mips require enabled local irqs because their dma_free_coherent() needs to do a cross-cpu IPI call. 
Presumably because of certain unusual TLB protocols.

I'm not sure what we should do about this. Presumably the gdth-on-arm usage base is, umm, zero, so we could lamely add CONFIG_DMA_FREE_COHERENT_WITH_LOCAL_IRQS_DISABLED_IS_OK and then use that to disable gdth (and similar) on arm and mips. But ugh.

Russell, Ralf: is there something we can do here to relax this requirement? I'm thinking that perhaps we can do some rcu/refcounting tricks: launch the IPI from within dma_free_coherent(), but don't wait for it to complete. When all CPUs have handled the IPI then (and only then) the virtual address becomes recyclable, or something like that?

<double-checks>

Actually I think David might have been wrong about mips. afaict its dma_free_coherent() is callable under local_irq_disable(), so ARM SMP is the sole exception?
Re: Aborted commands with arcmsr and 2xWD1500ADFD in RAID1
(cc's added) On Mon, 11 Feb 2008 17:44:08 +0100 Aron Stansvik [EMAIL PROTECTED] wrote: Hello LKML. Under semi-high disk I/O (e.g. installing a compiled KDE), I get the following (accompanied by seconds of lock-ups on the machine): [ 7727.345183] arcmsr0: abort device command of scsi id = 0 lun = 0 [ 7730.348776] arcmsr0: scsi id = 0 lun = 0 ccb = '0xdfb461c0' poll command abort successfully [ 8053.795943] arcmsr0: abort device command of scsi id = 0 lun = 0 [ 8056.799528] arcmsr0: scsi id = 0 lun = 0 ccb = '0xdfb595e0' poll command abort successfully [ 8884.592810] arcmsr0: abort device command of scsi id = 0 lun = 0 [ 8887.596392] arcmsr0: scsi id = 0 lun = 0 ccb = '0xdfb56d80' poll command abort successfully [ 8917.760216] arcmsr0: abort device command of scsi id = 0 lun = 0 [ 8920.763797] arcmsr0: scsi id = 0 lun = 0 ccb = '0xdfb472c0' poll command abort successfully [ 9074.106547] arcmsr0: abort device command of scsi id = 0 lun = 0 This is my setup: 1 x MSI K8N Master2-FAR 1 x Opteron 252 1 x Areca ARC1200 (sitting in a PCIe x4 socket) 2 x WD1500ADFD in RAID1 [EMAIL PROTECTED]:~$ uname -a Linux rubik 2.6.24-7-generic #1 SMP Thu Feb 7 01:29:58 UTC 2008 i686 GNU/Linux [EMAIL PROTECTED]:~$ modinfo arcmsr filename: /lib/modules/2.6.24-7-generic/kernel/drivers/scsi/arcmsr/arcmsr.ko version:Driver Version 1.20.00.15 2007/08/30 license:Dual BSD/GPL description:ARECA (ARC11xx/12xx/13xx/16xx) SATA/SAS RAID HOST Adapter author: Erich Chen [EMAIL PROTECTED] srcversion: 28EAD6AB49D4491CA04D465 [...] I've read some previous posts here on LKML that it could be the Areca firmware who doesn't like my WD disks. Anyone know if this is an IRQ handling problem in the kernel, or if it's a problem with the RAID controller firmware? Erich Chen (of Areca); have you tried the new ARC1200 in RAID1 configuration with Raptor disks on Linux? 
As a side note, I can tell you that I first tried running FreeBSD 6.3 (RELENG_6) on this machine, but got random reboots during disk I/O (even with a kernel with KDB debugging turned on). This leads me to believe that it might be a firmware issue, and that Linux just handles it more gracefully than FreeBSD. Any ideas or advice is appreciated. This is my first post to the LKML, so please instruct me if you want more information or if you want me to take further debugging actions. Best regards, Aron Stansvik
Re: [GIT PATCH] SCSI bug fixes for 2.6.25-rc1
On Wed, 13 Feb 2008 18:02:44 -0600 James Bottomley [EMAIL PROTECTED] wrote:

> This one's not too bad given the number of patches we had in the merge
> window. We have the advansys fix, a gdth severe problem fix (wouldn't
> scan any devices) a bug fix series for lpfc and a few other odds and
> ends.
>
> The patch is available here:
>
> master.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6.git

I have scsi patches!

mptbase-reset-ioc-initiator-during-pci-resume.patch

  Fixes suspend/resume on all of Darrick's MPT cards. I first merged it in September 2007.

kill-warnings-in-mptbaseh-on-parisc64.patch

  Warning fixes.

dell-cerc-support-for-megaraid_mbox.patch

  Turns non-booting machines into booting ones. Merged in -mm in November 2007.

3w-raid-drivers-memset-not-needed-in-probe.patch

  Small optimisation

scsi-aic94xx-cleanups.patch

  cleanups only.

scsi-qlogicptic-section-fixes.patch

  Fixes a reference from .text into .init.text and hence might fix a machine crash when this driver is built into vmlinux. Merged a week ago.

megaraid-outb_p-extermination.patch

  Cleanup

gdth-convert-to-pci-hotplug-api.patch

  Just merged

So several of these patches address quite serious bugs, and have been stuck in my tree for far too long.
Re: [GIT PATCH] SCSI bug fixes for 2.6.25-rc1
On Wed, 13 Feb 2008 19:11:53 -0600 James Bottomley [EMAIL PROTECTED] wrote:

> > mptbase-reset-ioc-initiator-during-pci-resume.patch
> >
> >   Fixes suspend/resume on all of Darrick's MPT cards. I first merged it
> >   in September 2007.
>
> Patch presented by LSI but has gone back with comments

Five months is too long to fix a bug when someone has already sent us a patch. If they insist on being this sluggish I'd suggest that you review the patch yourself then just merge it. That will get their attention. Maybe.

> > dell-cerc-support-for-megaraid_mbox.patch
>
> I need megaraid to sign off (and test) this one.

Two months, same story.

> > scsi-qlogicptic-section-fixes.patch
> >
> >   Fixes a reference from .text into .init.text and hence might fix a
> >   machine crash when this driver is built into vmlinux. Merged a week
> >   ago.
>
> This was the one we had the alternative fix for, wasn't it ... ?

In current mainline, __devinit qpti_sbus_probe() still is calling __init qpti_chain_add() (for example). So in a CONFIG_HOTPLUG kernel, hotplugging a new device (on sbus, ok, bad example ;)) will crash. But Adrian has fixed six such bugs in there, maybe one of them can hit.

I don't think we've fixed these by alternative means, unless we've disabled __devinit?

Still, we can discuss specific patches all day. I think there is a _general_ problem getting bugfixes, warning fixes and cleanups into scsi drivers within reasonable amounts of time?
Re: [patch 02/13] git-scsi-misc gdth fix
On Tue, 12 Feb 2008 17:27:33 +0200 Boaz Harrosh [EMAIL PROTECTED] wrote: On Tue, Feb 05 2008 at 9:53 +0200, [EMAIL PROTECTED] wrote: From: James Bottomley [EMAIL PROTECTED] On Sun, 2007-10-14 at 12:21 -0700, Andrew Morton wrote: On Sun, 14 Oct 2007 22:45:47 +0400 Dave Milter [EMAIL PROTECTED] wrote: I build linux-2.6.23-mm1 and try to boot it using qemu, and it crashed with trace like this: do_page_fault error_code lock_acquire _spin_lock_irqsave gdth_timeout run_timer_softirq __do_softirq do_softirq I have screenshot, but have no idea, is it legal to include it, if I sent copy to lkml. config of kernel in attachment, I apply all three patches from hot-fixes. The screenshot is here: http://userweb.kernel.org/~akpm/crash.png It would appear that gdth_timeout() is passing a bad pointer into spin_lock_irqsave(). There's a bug in the gdth rework in that the instance can be deleted from the list before the actual timer is stopped. This can be worked around I think by the following patch; although we really should be stopping the timer from firing when the list goes empty. James said: This is almost certainly the wrong fix for real hardware. Although it kills the timer when the list goes empty, nothing will ever restart it when the list fills again. Boaz, since you touched all of this, you get to fix it. The correct fix will be to control the timer along with the actual list instead of at entry/exit time. If you're not going to add this empty check to the timer routine, make sure you use del_timer_sync() before removing the last element from the list. 
> > Signed-off-by: Andrew Morton [EMAIL PROTECTED]
> > ---
> >
> >  drivers/scsi/gdth.c |    3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff -puN drivers/scsi/gdth.c~git-scsi-misc-gdth-fix drivers/scsi/gdth.c
> > --- a/drivers/scsi/gdth.c~git-scsi-misc-gdth-fix
> > +++ a/drivers/scsi/gdth.c
> > @@ -3791,6 +3791,9 @@ static void gdth_timeout(ulong data)
> >  	gdth_ha_str *ha;
> >  	ulong flags;
> >
> > +	if (list_empty(&gdth_instances))
> > +		return;
> > +
> >  	ha = list_first_entry(&gdth_instances, gdth_ha_str, list);
> >  	spin_lock_irqsave(&ha->smp_lock, flags);
> >
> > _
>
> Hello dear Andrew
>
> Do you perhaps remember who has reported this problem,

It was Dave Milter, who has been cc'ed on all of this.

> and if he can test patches?

Don't know. Dave, would it be a possibility?

Thanks.

> ---
> gdth: Try to fix the Timer at exit problem
>
> Remove_sync the timer before we delete the cards.
>
> Testing-patches: Boaz Harrosh [EMAIL PROTECTED]
> ---
> git-diff --stat -p v2.6.24
>  drivers/scsi/gdth.c |   19 ---
>  1 files changed, 12 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/scsi/gdth.c b/drivers/scsi/gdth.c
> index b253b8c..57fa756 100644
> --- a/drivers/scsi/gdth.c
> +++ b/drivers/scsi/gdth.c
> @@ -5102,6 +5105,9 @@ static int __init gdth_pci_probe_one(gdth_pci_str *pcistr, int ctr)
>  	if (error)
>  		goto out_free_coal_stat;
>  	list_add_tail(&ha->list, &gdth_instances);
> +
> +	scsi_scan_host(shp);
> +
>  	return 0;
>
>   out_free_coal_stat:
> @@ -5137,8 +5143,6 @@ static void gdth_remove_one(gdth_ha_str *ha)
>  		ha->sdev = NULL;
>  	}
>
> -	gdth_flush(ha);
> -
>  	if (shp->irq)
>  		free_irq(shp->irq,ha);
>
> @@ -5236,14 +5240,15 @@ static void __exit gdth_exit(void)
>  {
>  	gdth_ha_str *ha;
>
> -	list_for_each_entry(ha, &gdth_instances, list)
> -		gdth_remove_one(ha);
> +	unregister_chrdev(major,"gdth");
> +	unregister_reboot_notifier(&gdth_notifier);
>
>  #ifdef GDTH_STATISTICS
> -	del_timer(&gdth_timer);
> +	del_timer_sync(&gdth_timer);
>  #endif
> -	unregister_chrdev(major,"gdth");
> -	unregister_reboot_notifier(&gdth_notifier);
> +
> +	list_for_each_entry(ha, &gdth_instances, list)
> +		gdth_remove_one(ha);
>  }
>
>  module_init(gdth_init);
Re: [GIT PATCH] final SCSI updates for 2.6.24 merge window
On Thu, 07 Feb 2008 18:56:46 -0600 James Bottomley [EMAIL PROTECTED] wrote:

> Quite a bit of this is fixing things broken previously (the advansys fix
> is still pending resolution, but I'll send it as an -rc fix when we have
> it). There's the final elimination of all drivers that are esp based but
> don't use the scsi_esp core (that's mostly m68k and alpha). Plus the
> usual bunch of driver updates and the addition of a new enclosure
> services driver and the corresponding ULD.

Sob. Can we please merge "Convert SG from nopage to fault"? It has been sent three times, the first time was Dec 5 last year and it has thus far received the lead balloon treatment. Despite my explicit request for consideration last time I sent it.

If there is no movement here then I have to carry the moderately intrusive mm-remove-nopage.patch for another N months and we need to watch out for new ->nopage implementations popping up etc.

From: Nick Piggin [EMAIL PROTECTED]

Convert SG from nopage to fault.

Signed-off-by: Nick Piggin [EMAIL PROTECTED]
Cc: Douglas Gilbert [EMAIL PROTECTED]
Cc: James Bottomley [EMAIL PROTECTED]
Signed-off-by: Andrew Morton [EMAIL PROTECTED]
---

 drivers/scsi/sg.c |   23 +++
 1 file changed, 11 insertions(+), 12 deletions(-)

diff -puN drivers/scsi/sg.c~sg-nopage drivers/scsi/sg.c
--- a/drivers/scsi/sg.c~sg-nopage
+++ a/drivers/scsi/sg.c
@@ -1160,23 +1160,22 @@ sg_fasync(int fd, struct file *filp, int
 	return (retval < 0) ? retval : 0;
 }
 
-static struct page *
-sg_vma_nopage(struct vm_area_struct *vma, unsigned long addr, int *type)
+static int
+sg_vma_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
 {
 	Sg_fd *sfp;
-	struct page *page = NOPAGE_SIGBUS;
 	unsigned long offset, len, sa;
 	Sg_scatter_hold *rsv_schp;
 	struct scatterlist *sg;
 	int k;
 
 	if ((NULL == vma) || (!(sfp = (Sg_fd *) vma->vm_private_data)))
-		return page;
+		return VM_FAULT_SIGBUS;
 	rsv_schp = &sfp->reserve;
-	offset = addr - vma->vm_start;
+	offset = vmf->pgoff << PAGE_SHIFT;
 	if (offset >= rsv_schp->bufflen)
-		return page;
-	SCSI_LOG_TIMEOUT(3, printk("sg_vma_nopage: offset=%lu, scatg=%d\n",
+		return VM_FAULT_SIGBUS;
+	SCSI_LOG_TIMEOUT(3, printk("sg_vma_fault: offset=%lu, scatg=%d\n",
 				   offset, rsv_schp->k_use_sg));
 	sg = rsv_schp->buffer;
 	sa = vma->vm_start;
@@ -1185,21 +1184,21 @@ sg_vma_nopage(struct vm_area_struct *vma
 		len = vma->vm_end - sa;
 		len = (len < sg->length) ? len : sg->length;
 		if (offset < len) {
+			struct page *page;
 			page = virt_to_page(page_address(sg_page(sg)) + offset);
 			get_page(page);	/* increment page count */
-			break;
+			vmf->page = page;
+			return 0; /* success */
 		}
 		sa += len;
 		offset -= len;
 	}
 
-	if (type)
-		*type = VM_FAULT_MINOR;
-	return page;
+	return VM_FAULT_SIGBUS;
 }
 
 static struct vm_operations_struct sg_mmap_vm_ops = {
-	.nopage = sg_vma_nopage,
+	.fault = sg_vma_fault,
 };
 
 static int
_
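For readers following the patch above: the core of the fault handler is a walk over the reserve buffer's scatterlist segments, subtracting each segment's length from the faulting offset until the offset falls inside a segment. A minimal userspace sketch of just that walk follows. It assumes plain arrays of segment lengths instead of struct scatterlist, it leaves out the clamp against the end of the VMA that the real handler applies, and locate_in_segments() is a hypothetical name invented for this illustration, not anything in sg.c.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: given the byte lengths of consecutive segments,
 * find the segment that contains byte 'offset' of the whole buffer.
 * On success returns 0 and stores the segment index and the offset
 * within that segment. Returns -1 if 'offset' lies past the last
 * segment (the case where the patch returns VM_FAULT_SIGBUS).
 */
static int locate_in_segments(const size_t *seg_len, size_t nseg,
			      size_t offset,
			      size_t *seg_idx, size_t *seg_off)
{
	size_t k;

	for (k = 0; k < nseg; k++) {
		if (offset < seg_len[k]) {
			*seg_idx = k;
			*seg_off = offset;
			return 0;
		}
		/* not in this segment: skip over it */
		offset -= seg_len[k];
	}
	return -1;
}
```

The early "return 0" inside the loop mirrors the patch's change from "break" to setting vmf->page and returning immediately, which is what lets the fall-through path at the bottom report the error unconditionally.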
Re: [Bugme-new] [Bug 9901] New: kernel panic in stex modules (?)
On Wed, 6 Feb 2008 09:40:15 -0800 (PST) [EMAIL PROTECTED] wrote: http://bugzilla.kernel.org/show_bug.cgi?id=9901 Summary: kernel panic in stex modules (?) Product: IO/Storage Version: 2.5 KernelVersion: 2.6.24 Platform: All OS/Version: Linux Tree: Mainline Status: NEW Severity: normal Priority: P1 Component: Serial ATA AssignedTo: [EMAIL PROTECTED] ReportedBy: [EMAIL PROTECTED] Latest working kernel version: 2.6.23-r6 Earliest failing kernel version: 2.6.24 Distribution: Gentoo Hardware Environment: Core2D E6600, Asus p5B Dlx, 2G DDR2 667, Promise ST EX4350 Software Environment: GCC 4.2.3/4.1.2, CFLAGS=-O2 Problem Description: The problem is frequent kernel panics within the same module. Can't say what it is, but looks like it is related to dma and promise driver. The first culprit, the memory, is ok, 8 hours of memtest passed without errors. Before, kernel 2.6.23-gentoo-r6, compiled with GCC 4.1.2 worked just fine, then after upgrade to 4.2.2 th bug appeared. Upgrade to 2.6.24 didn't solve the problem. Switching back to GCC 4.1.2 made things better for a moment, crashes became less frequent and I thought compiler was the cause. But today system crashed again with same symptoms. Sorry, but I can't save crash log, so I'll provide screen shot: http://img238.imageshack.us/my.php?image=p2030030ki1.jpg Steps to reproduce: Boot, start FTP-server, load RAID with heavy input, in some hours it will crash. With pure reads system can run several days, heavy write load kills it much too easier. The supertrak driver has regressed in 2.6.24. And commit 9cb83c7529d929c00f37d821daed1942a1b20602 Author: FUJITA Tomonori [EMAIL PROTECTED] Date: Tue Oct 16 11:24:32 2007 +0200 [SCSI] add use_sg_chaining option to scsi_host_template looks a likely candidate. And this: commit d3f46f39b7092594b498abc12f0c73b0b9913bde Author: James Bottomley [EMAIL PROTECTED] Date: Tue Jan 15 11:11:46 2008 -0600 [SCSI] remove use_sg_chaining from 2.6.25 looks to be a likely fix for it. 
Should it be backported?
Re: Kernel Panic in MPT SAS on 2.6.24 (and 2.6.23.14, 2.6.23.9)
On Wed, 6 Feb 2008 22:04:26 +0100 Maximilian Wilhelm [EMAIL PROTECTED] wrote: Hi! While installing my new firewall I got the following kernel panic in the MPT SAS driver which I need for the disks. The first kernel I bootet was 2.6.23.14 which did panic so I tried a 2.6.24 which panics, too. Our usual FAI kernel (2.6.23.9) is also affected. If there is any information you may need to track this down, please let me know. I've put the .config to http://files.rfc2324.org/mptsas_panic/2.6.24-config to limit the size of this mail. ... ide-floppy driver 0.99.newide aic94xx: Adaptec aic94xx SAS/SATA driver version 1.0.3 loaded megaraid cmm: 2.20.2.7 (Release Date: Sun Jul 16 00:01:03 EST 2006) megaraid: 2.20.5.1 (Release Date: Thu Nov 16 15:32:35 EST 2006) megasas: 00.00.03.10-rc5 Thu May 17 10:09:32 PDT 2007 Driver 'sd' needs updating - please use bus_type methods Fusion MPT base driver 3.04.06 Copyright (c) 1999-2007 LSI Corporation Fusion MPT SAS Host driver 3.04.06 mptbase: ioc0: Initiating bringup ioc0: LSISAS1068E B3: Capabilities={Initiator} scsi0 : ioc0: LSISAS1068E B3, FwRev=00142e00h, Ports=1, MaxQ=511, IRQ=16 scsi 0:0:0:0: Direct-Access SEAGATE ST973402SS S207 PQ: 0 ANSI: 5 scsi 0:0:1:0: Direct-Access SEAGATE ST973402SS S207 PQ: 0 ANSI: 5 BUG: unable to handle kernel NULL pointer dereference at virtual address 0010 printing eip: c02c0b38 *pde = Oops: [#1] SMP Modules linked in: Pid: 1, comm: swapper Not tainted (2.6.24 #1) EIP: 0060:[c02c0b38] EFLAGS: 00010246 CPU: 1 EIP is at mptsas_probe_expander_phys+0x51/0x4a2 EAX: 0010 EBX: f7457ec0 ECX: f7c3fd9c EDX: 0004 ESI: f7fe7800 EDI: f7fe7800 EBP: f7fe7904 ESP: f7c3fe18 DS: 007b ES: 007b FS: 00d8 GS: SS: 0068 Process swapper (pid: 1, ti=f7c3e000 task=f7c22ab0 task.ti=f7c3e000) Stack: 00200200 fffefd74 c02b9cc8 f7fe7800 c04c5280 f7c3fecc 376b1000 0001 00100100 00200200 00200200 fffefd74 c02b9cc8 f7fe7800 c04c5280 f7c3fe8c 376b1000 0001 Call Trace: [c02b9cc8] mpt_timer_expired+0x0/0x5c [c02b9cc8] 
mpt_timer_expired+0x0/0x5c [c028] ide_wait_cmd+0x90/0xa0 [c02c2806] mptsas_probe+0x38a/0x40b [c0180522] sysfs_create_link+0xb7/0xf9 [c021ceb6] pci_device_probe+0x36/0x57 [c023bcd0] driver_probe_device+0xde/0x15c [c036d3e5] klist_next+0x4b/0x6b [c023bde0] __driver_attach+0x0/0x79 [c023be26] __driver_attach+0x46/0x79 [c023b2a8] bus_for_each_dev+0x33/0x55 [c023bb37] driver_attach+0x16/0x18 [c023bde0] __driver_attach+0x0/0x79 [c023b58e] bus_add_driver+0x6d/0x197 [c021cff2] __pci_register_driver+0x48/0x74 [c0480bd3] mptsas_init+0xbf/0xd6 [c046c74e] kernel_init+0x140/0x2a2 [c01024ca] ret_from_fork+0x6/0x1c [c046c60e] kernel_init+0x0/0x2a2 [c046c60e] kernel_init+0x0/0x2a2 [c010319f] kernel_thread_helper+0x7/0x10 === Code: 85 c0 0f 84 68 04 00 00 8b 54 24 1c 8b 02 89 04 24 31 c9 89 da 89 f8 e8 2b f2 ff ff 89 44 24 2c 85 c0 8b 43 0c 0f 85 39 04 00 00 0f b7 00 8b 74 24 1c 89 06 8d 87 24 05 00 00 89 44 24 20 e8 5b EIP: [c02c0b38] mptsas_probe_expander_phys+0x51/0x4a2 SS:ESP 0068:f7c3fe18 ---[ end trace 50b3e7147499e641 ]--- Kernel panic - not syncing: Attempted to kill init! Thanks. Cc's added...
Re: [patch 08/13] sg: nopage
On Mon, 04 Feb 2008 23:53:21 -0800 [EMAIL PROTECTED] wrote: From: Nick Piggin [EMAIL PROTECTED] Convert SG from nopage to fault. Please give this some additional attention. We'd like to remove vm_operations_struct.nopage() altogether and we can't do that while it's hanging around in various subsystems. Thanks.
Re: [PATCH] enclosure: add support for enclosure services
On Sun, 03 Feb 2008 18:16:51 -0600 James Bottomley [EMAIL PROTECTED] wrote: From: James Bottomley [EMAIL PROTECTED] Date: Sun, 3 Feb 2008 15:40:56 -0600 Subject: [SCSI] enclosure: add support for enclosure services The enclosure misc device is really just a library providing sysfs support for physical enclosure devices and their components. Thanks for sending it out for review. +struct enclosure_device *enclosure_find(struct device *dev) +{ + struct enclosure_device *edev = NULL; + + mutex_lock(container_list_lock); + list_for_each_entry(edev, container_list, node) { + if (edev-cdev.dev == dev) { + mutex_unlock(container_list_lock); + return edev; + } + } + mutex_unlock(container_list_lock); + + return NULL; +} +EXPORT_SYMBOL_GPL(enclosure_find); This looks a little odd. We don't take a ref on the object after looking it up, so what prevents some other thread of control from freeing or otherwise altering the returned object while the caller is playing with it? +/** + * enclosure_for_each_device - calls a function for each enclosure + * @fn: the function to call + * @data:the data to pass to each call + * + * Loops over all the enclosures calling the function. + * + * Note, this function uses a mutex which will be held across calls to + * @fn, so it must have user context, and @fn should not sleep or Probably non atomic context would be more accurate. fn() actually _can_ sleep. 
+ * otherwise cause the mutex to be held for indefinite periods + */ +int enclosure_for_each_device(int (*fn)(struct enclosure_device *, void *), + void *data) +{ + int error = 0; + struct enclosure_device *edev; + + mutex_lock(container_list_lock); + list_for_each_entry(edev, container_list, node) { + error = fn(edev, data); + if (error) + break; + } + mutex_unlock(container_list_lock); + + return error; +} +EXPORT_SYMBOL_GPL(enclosure_for_each_device); + +/** + * enclosure_register - register device as an enclosure + * + * @dev: device containing the enclosure + * @components: number of components in the enclosure + * + * This sets up the device for being an enclosure. Note that @dev does + * not have to be a dedicated enclosure device. It may be some other type + * of device that additionally responds to enclosure services + */ +struct enclosure_device * +enclosure_register(struct device *dev, const char *name, int components, +struct enclosure_component_callbacks *cb) +{ + struct enclosure_device *edev = + kzalloc(sizeof(struct enclosure_device) + + sizeof(struct enclosure_component)*components, + GFP_KERNEL); + int err, i; + + if (!edev) + return ERR_PTR(-ENOMEM); + + if (!cb) { + kfree(edev); + return ERR_PTR(-EINVAL); + } It would be less fuss if this were to test cb before doing the kzalloc(). Can cb==NULL actually and legitimately happen? 
+ edev-components = components; + + edev-cdev.class = enclosure_class; + edev-cdev.dev = get_device(dev); + edev-cb = cb; + snprintf(edev-cdev.class_id, BUS_ID_SIZE, %s, name); + err = class_device_register(edev-cdev); + if (err) + goto err; + + for (i = 0; i components; i++) + edev-component[i].number = -1; + + mutex_lock(container_list_lock); + list_add_tail(edev-node, container_list); + mutex_unlock(container_list_lock); + + return edev; + + err: + put_device(edev-cdev.dev); + kfree(edev); + return ERR_PTR(err); +} +EXPORT_SYMBOL_GPL(enclosure_register); + +static struct enclosure_component_callbacks enclosure_null_callbacks; + +/** + * enclosure_unregister - remove an enclosure + * + * @edev:the registered enclosure to remove; + */ +void enclosure_unregister(struct enclosure_device *edev) +{ + int i; + + if (!edev) + return; Is this legal? + mutex_lock(container_list_lock); + list_del(edev-node); + mutex_unlock(container_list_lock); See, right now, someone who found this enclosure_device via enclosure_find() could still be playing with it? + for (i = 0; i edev-components; i++) + if (edev-component[i].number != -1) + class_device_unregister(edev-component[i].cdev); + + /* prevent any callbacks into service user */ + edev-cb = enclosure_null_callbacks; + class_device_unregister(edev-cdev); +} +EXPORT_SYMBOL_GPL(enclosure_unregister); + +/** + * enclosure_component_register - add a particular component to an enclosure + * @edev:the enclosure to add the component + * @num: the device number + * @type:the type of component
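The lifetime question raised in the review above (enclosure_find() returns a pointer without pinning it, so enclosure_unregister() could free it underneath the caller) is usually answered by taking a reference while the list lock is still held. What follows is only a userland sketch of that pattern, assuming pthread mutexes in place of the kernel mutex and a plain counter in place of the device-model refcount. The names struct enc, enc_register(), enc_find(), enc_put(), enc_unregister() and nr_freed are hypothetical and are not taken from the patch.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct enclosure_device. */
struct enc {
	struct enc *next;
	const void *key;	/* plays the role of edev->cdev.dev */
	int refs;		/* protected by list_lock */
};

static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;
static struct enc *list_head;
static int nr_freed;		/* counts frees so the behaviour is observable */

/* Allocate an entry and link it; the list itself owns the first reference. */
static struct enc *enc_register(const void *key)
{
	struct enc *e = calloc(1, sizeof(*e));

	if (!e)
		return NULL;
	e->key = key;
	e->refs = 1;
	pthread_mutex_lock(&list_lock);
	e->next = list_head;
	list_head = e;
	pthread_mutex_unlock(&list_lock);
	return e;
}

/* Drop one reference; the entry is freed when the last one goes away. */
static void enc_put(struct enc *e)
{
	int last;

	pthread_mutex_lock(&list_lock);
	last = (--e->refs == 0);
	if (last)
		nr_freed++;
	pthread_mutex_unlock(&list_lock);
	if (last)
		free(e);
}

/* Look up by key and pin the entry before the lock is released. */
static struct enc *enc_find(const void *key)
{
	struct enc *e;

	pthread_mutex_lock(&list_lock);
	for (e = list_head; e; e = e->next) {
		if (e->key == key) {
			e->refs++;	/* caller must balance with enc_put() */
			break;
		}
	}
	pthread_mutex_unlock(&list_lock);
	return e;		/* NULL if the loop ran off the end */
}

/* Unlink the entry and drop the list's reference; users may still hold theirs. */
static void enc_unregister(struct enc *e)
{
	struct enc **pp;

	pthread_mutex_lock(&list_lock);
	for (pp = &list_head; *pp; pp = &(*pp)->next) {
		if (*pp == e) {
			*pp = e->next;
			break;
		}
	}
	pthread_mutex_unlock(&list_lock);
	enc_put(e);
}
```

With this shape, a caller that obtained the entry through enc_find() can keep using it after enc_unregister() has run, and the memory goes away only on the caller's own enc_put(). In the kernel the equivalent would presumably lean on the existing device reference counting rather than a hand-rolled counter.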
Re: [patch] pci: pci_enable_device_bars() fix
On Mon, 4 Feb 2008 13:57:36 +0100 Ingo Molnar [EMAIL PROTECTED] wrote: * Jeff Garzik [EMAIL PROTECTED] wrote: Ingo Molnar wrote: so please tell me Jeff. If Greg, who is the super-maintainer of your code area, and who deals with your code every day and changes it every minute and hour, simply did not Cc: the SCSI list - how am i, a largely outside party in this matter, supposed to notice that 3 maintainers and 3 mailing lists in the Cc: were somehow not enough and that i was supposed to grow the already sizable Cc: list even more? Because, regardless of the situation, it's both common courtesy and wise practice to CC relevant driver maintainers, when you touch a driver. And it's just common sense: Greg simply does not know the intimate details of every PCI driver. Nor do I. Nor you. In the case of lpfc here, we have an active driver maintainer, and an up-to-date MAINTAINERS entry. Even if you are too slack to read MAINTAINERS, 'git log' would have given you the same info. Don't pretend there is some benefit here to ignoring the people that best know the driver. I don't buy that; it simply makes no engineering sense whatsoever. what you _STILL_ do not realize is the following: you still attribute the lack of Cc:s to some intention of mine. No, it was not my intention. At first glance the Cc: looked large and complete enough in an _existing_ discussion and that's was the end of my (brief) attention regarding the Cc: line. Yes, it would have been a bit better had i noticed the lack of Cc:s in an existing discussion, but i didnt. Actually I (and probably others) generally avoid cc'ing mailing lists on patch traffic. I spew out enough script-generated traffic as it is. ... mailing list aliases to get the 'guaranteed attention' of maintainers whoa. 
You must know better mailing lists than I do ;)
Re: dmesg spam
On Mon, 4 Feb 2008 15:24:55 +0100 Bartlomiej Zolnierkiewicz [EMAIL PROTECTED] wrote: On Sunday 03 February 2008, Andrew Morton wrote: With latest -mm, running fc8 I am getting this in the logs, ^^^ = SCSI/libata cc:ing Jeff once per second. sr0: CDROM not ready. Make sure there is a disc in the drive. sr0: CDROM not ready. Make sure there is a disc in the drive. sr0: CDROM not ready. Make sure there is a disc in the drive. sr0: CDROM not ready. Make sure there is a disc in the drive. sr0: CDROM not ready. Make sure there is a disc in the drive. sr0: CDROM not ready. Make sure there is a disc in the drive. sr0: CDROM not ready. Make sure there is a disc in the drive. sr0: CDROM not ready. Make sure there is a disc in the drive. sr0: CDROM not ready. Make sure there is a disc in the drive. Well.. it's coming out of the kernel. Presumably it's that cdrom polling thing in KDE. James recently made changes to sr_ioctl.c but I've been buried in more terminal regressions than this one.
Re: dmesg spam
On Mon, 04 Feb 2008 15:21:54 -0500 Jeff Garzik [EMAIL PROTECTED] wrote: James Bottomley wrote: The message comes from sr_ioctl.c:sr_do_ioctl(). Which means some user level application is poking the drive with a command that's returning NOT_READY. Apparently it will shut up if quiet is set in the packet command structure. It could be the application is getting the wrong idea of the status from sr_do_staus() which leads it to send commands which require a medium? But we'll need a bit of debugging to determine this. Userland polling of the cdrom is quite normal (if unfortunate), regardless of medium presence. Probably HAL or dbus. In theory, the userland app should (a) set quiet and (b) handle not-ready condition just fine. I presume that (b) is ok, since not-ready just means to continue polling the cdrom ad infinitum, until media appears. A useful experiment, if only to confirm the obvious, would be to insert some media. What controller and device is in use? It's the thinkpad t61p. Currently five miles away, powered off. It's all new Intel stuff iirc. http://userweb.kernel.org/~akpm/dmesg-t61p.txt has some info but not the right info afaict. Bisection time I guess. That'll be a new experience.
Re: dmesg spam
On Mon, 04 Feb 2008 14:44:18 -0600 James Bottomley [EMAIL PROTECTED] wrote: On Mon, 2008-02-04 at 15:24 -0500, Jeff Garzik wrote: James Bottomley wrote: It's here in sr_ioctl.c: Ah, indeed. My grep-fu sucks today. I'm not averse to simply nuking the printk ... it's probably valueless in a modern kernel, since something dbussy is supposed to tell you to put a CD in the drive, not something in the kernel. The reverse... dbussy/HAL is implementing autodetection of media insertion, by polling ad infinitum. Understood ... I meant the day of the user relying on a message from a kernel printk to tell them they need a CD in the drive is long over. OK, sorry, I'm hopelessly full of it. These messages also are produced by 2.6.24, 2.6.23 and 2.6.23.1-49.fc8. I don't think anyone would miss this message were it to bite the D key.
dmesg spam
With latest -mm, running fc8 I am getting this in the logs, once per second. sr0: CDROM not ready. Make sure there is a disc in the drive. sr0: CDROM not ready. Make sure there is a disc in the drive. sr0: CDROM not ready. Make sure there is a disc in the drive. sr0: CDROM not ready. Make sure there is a disc in the drive. sr0: CDROM not ready. Make sure there is a disc in the drive. sr0: CDROM not ready. Make sure there is a disc in the drive. sr0: CDROM not ready. Make sure there is a disc in the drive. sr0: CDROM not ready. Make sure there is a disc in the drive. sr0: CDROM not ready. Make sure there is a disc in the drive.
Re: 2.6.24-rc8-mm1 Build Failure on scsi driver
On Thu, 17 Jan 2008 21:45:39 +0530 Kamalesh Babulal [EMAIL PROTECTED] wrote: Hi Andrew, The kernel build fails with following error drivers/scsi/aha152x.o: In function `aha152x_host_reset_host': /home/kamalesh/scrap/linux-2.6.24-rc8/drivers/scsi/aha152x.c:1324: multiple definition of `aha152x_host_reset_host' drivers/scsi/pcmcia/built-in.o:/home/kamalesh/scrap/linux-2.6.24-rc8/drivers/scsi/aha152x.c:1324: first defined here drivers/scsi/aha152x.o: In function `aha152x_release': /home/kamalesh/scrap/linux-2.6.24-rc8/drivers/scsi/aha152x.c:908: multiple definition of `aha152x_release' drivers/scsi/pcmcia/built-in.o:/home/kamalesh/scrap/linux-2.6.24-rc8/drivers/scsi/aha152x.c:908: first defined here ld: Warning: size of symbol `aha152x_release' changed from 68 in drivers/scsi/pcmcia/built-in.o to 100 in drivers/scsi/aha152x.o drivers/scsi/aha152x.o: In function `aha152x_probe_one': Neat. Seems that the scsi build system is linking together two copies of drivers/scsi/aha152x.o. One via drivers/scsi/aha152x.o directly and the other via drivers/scsi/pcmcia/built-in.o. Please send the .config. I'm looking suspiciously at this, from git-scsi-misc: commit 8ae732a91df051aba6820068a47b631a06599d84 Author: Tejun Heo [EMAIL PROTECTED] Date: Fri Dec 7 22:36:23 2007 +0900 [SCSI] make pcmcia directory use obj-y|m instead of subdir-y|m subdir-y|m isn't supposed to contain modules or built-in components. Change subdir-$(CONFIG_PCMCIA) to obj-$(CONFIG_PCMCIA). 
Signed-off-by: Tejun Heo [EMAIL PROTECTED] Acked-by: Sam Ravnborg [EMAIL PROTECTED] Signed-off-by: James Bottomley [EMAIL PROTECTED] diff --git a/drivers/scsi/Makefile b/drivers/scsi/Makefile index b5441f5..93e1428 100644 --- a/drivers/scsi/Makefile +++ b/drivers/scsi/Makefile @@ -17,7 +17,7 @@ CFLAGS_aha152x.o = -DAHA152X_STAT -DAUTOCONF CFLAGS_gdth.o= # -DDEBUG_GDTH=2 -D__SERIAL__ -D__COM2__ -DGDTH_STATISTICS -subdir-$(CONFIG_PCMCIA)+= pcmcia +obj-$(CONFIG_PCMCIA) += pcmcia/ obj-$(CONFIG_SCSI) += scsi_mod.o obj-$(CONFIG_SCSI_TGT) += scsi_tgt.o
Re: 2.6.24-rc8-mm1 Build Failure on scsi driver
On Fri, 18 Jan 2008 12:07:27 +0530 Kamalesh Babulal [EMAIL PROTECTED] wrote: Hi Andrew, The patch from Tejun Heo fixes the aha152x.c build failure, but the following second part of the build failure is still occurring. drivers/scsi/fdomain.o:(.data+0x0): multiple definition of `fdomain_driver_template' drivers/scsi/pcmcia/built-in.o:(.data+0x5a0): first defined here drivers/scsi/fdomain.o: In function `fdomain_16x0_bus_reset': /home/kamalesh/scrap/linux-2.6.24-rc8/drivers/scsi/fdomain.c:1568: multiple definition of `fdomain_16x0_bus_reset' drivers/scsi/pcmcia/built-in.o:/home/kamalesh/scrap/linux-2.6.24-rc8/drivers/scsi/fdomain.c:1568: first defined here drivers/scsi/fdomain.o: In function `__fdomain_16x0_detect': /home/kamalesh/scrap/linux-2.6.24-rc8/drivers/scsi/fdomain.c:894: multiple definition of `__fdomain_16x0_detect' drivers/scsi/pcmcia/built-in.o:/home/kamalesh/scrap/linux-2.6.24-rc8/drivers/scsi/fdomain.c:894: first defined here ld: Warning: size of symbol `__fdomain_16x0_detect' changed from 1206 in drivers/scsi/pcmcia/built-in.o to 1700 in drivers/scsi/fdomain.o drivers/scsi/fdomain.o: In function `fdomain_setup': /home/kamalesh/scrap/linux-2.6.24-rc8/drivers/scsi/fdomain.c:554: multiple definition of `fdomain_setup' drivers/scsi/pcmcia/built-in.o:/home/kamalesh/scrap/linux-2.6.24-rc8/drivers/scsi/fdomain.c:554: first defined here Tejun has more fixing to do, I suspect ;) I assume a basic allyesconfig will weed out most remaining problems of this sort. Problem is, it needs to be done for all architectures (and even that might not suffice). So old-fashioned code inspection is also needed.
Re: [Bugme-new] [Bug 9752] New: getting FAULT code message at startup in mpt fusion scsi driver
On Tue, 15 Jan 2008 06:30:11 -0800 (PST) [EMAIL PROTECTED] wrote: http://bugzilla.kernel.org/show_bug.cgi?id=9752 Summary: getting FAULT code message at startup in mpt fusion scsi driver Product: SCSI Drivers Version: 2.5 KernelVersion: 2.6.14 Platform: All OS/Version: Linux Tree: Mainline Status: NEW Severity: normal Priority: P1 Component: Other AssignedTo: [EMAIL PROTECTED] ReportedBy: [EMAIL PROTECTED] Hardware Environment: 32-bit x86, dual Xeon Software Environment: nothing special Problem Description: On blade startup we're seeing the following message: Fusion MPT base driver 3.02.55 Copyright (c) 1999-2005 LSI Logic Corporation Fusion MPT SAS Host driver 3.02.55 mptbase: Initiating ioc0 bringup mptbase: ioc0: WARNING - IOC is in FAULT state!!! FAULT code = 1804h mptbase: ioc0: ERROR - Failed to come READY after reset! IocState=0 mptbase: ioc0 NOT READY WARNING! mptbase: WARNING - ioc0 did not initialize properly! (-1) mptsas: probe of :05:01.0 failed with error -1 I'm not very knowledgeable about SCSI, so can someone tell me whether this is a disk fault or a host fault? Even better, does anyone know what the specific fault code means or where I could look it up?
Re: [PATCH] megaraid: fix section mismatch
On Thu, 10 Jan 2008 14:33:16 -0800 Randy Dunlap [EMAIL PROTECTED] wrote: From: Randy Dunlap [EMAIL PROTECTED] Change megaraid_pci_driver_g variable name so that it matches the modpost whitelist that allows pointers to init text/data. WARNING: vmlinux.o(.data+0x1a8e30): Section mismatch: reference to .init.text:megaraid_probe_one (between 'megaraid_pci_driver_g' and 'class_device_attr_megaraid_mbox_app_hndl') All these patches fix references to possibly-discarded sections and hence fix possibly-serious bugs. So all of them should go into 2.6.24. I already had the qla2xxx one. It was sent to James a month ago with not atypical results. The advansys one is stuck in git-scsi-misc. I'll give it 24 hours and then shall send these: scsi-qla2xxx-qla_osc-section-fix.patch megaraid-fix-section-mismatch.patch cciss-section-mismatch.patch x86-discover_ebda-section-mismatch.patch tpm-infineon-section-mismatch.patch dvb-av7110-fix-section-mismatch.patch hostap-section-mismatch-warning.patch in to Linus.
Re: [PATCH] megaraid: fix section mismatch
On Thu, 10 Jan 2008 22:45:35 -0600 James Bottomley [EMAIL PROTECTED] wrote: On Thu, 2008-01-10 at 16:10 -0800, Andrew Morton wrote: On Thu, 10 Jan 2008 14:33:16 -0800 Randy Dunlap [EMAIL PROTECTED] wrote: From: Randy Dunlap [EMAIL PROTECTED] Change megaraid_pci_driver_g variable name so that it matches the modpost whitelist that allows pointers to init text/data. WARNING: vmlinux.o(.data+0x1a8e30): Section mismatch: reference to .init.text:megaraid_probe_one (between 'megaraid_pci_driver_g' and 'class_device_attr_megaraid_mbox_app_hndl') All these patches fix references to possibly-discarded sections and hence fix possibly-serious bugs. So all of them should go into 2.6.24. Renaming a variable fixes a serious bug? It quiets a spurious warning from modpost, sure, but I hardly think that's -rc7 material. Rather than unerringly zooming in on the vanishingly trivial: will you be merging the advansys and qla2xx bugfixes or would you like me to?
Re: [PATCH] MAINTAINERS: remove Adam Fritzler, update his email address in other sources
On Mon, 17 Dec 2007 20:48:03 -0800 Joe Perches [EMAIL PROTECTED] wrote: Back to Adam Fritzler... ... diff --git a/CREDITS b/CREDITS index ee909f2..449ec7f 100644 --- a/CREDITS +++ b/CREDITS @@ -1124,6 +1124,9 @@ S: 1150 Ringwood Court S: San Jose, California 95131 S: USA +N: Adam Fritzler +E: [EMAIL PROTECTED] + N: Fernando Fuganti E: [EMAIL PROTECTED] E: [EMAIL PROTECTED] diff --git a/MAINTAINERS b/MAINTAINERS index 9507b42..690f172 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -3758,13 +3758,6 @@ W: http://www.kernel.org/pub/linux/kernel/people/bunk/trivial/ T: git kernel.org:/pub/scm/linux/kernel/git/bunk/trivial.git S: Maintained -TMS380 TOKEN-RING NETWORK DRIVER -P: Adam Fritzler -M: [EMAIL PROTECTED] -L: [EMAIL PROTECTED] -W: http://www.auk.cx/tms380tr/ -S: Maintained What was the rationale for removing Adam from MAINTAINERS? That should have been in the non-existent changelog. Please always reissue a complete changelog when resending any patch. hm, linux-tr.net seems to be defunct. So I guess that orphaning TMS380 is appropriate, if Adam has left us. Has he?
Re: [patch 18/30] scsi/qla2xxx/: possible cleanups
On Fri, 14 Dec 2007 10:20:04 -0800 Andrew Vasquez [EMAIL PROTECTED] wrote: On Fri, 14 Dec 2007, Andrew Morton wrote: Could you drop this patch from your queue. I'll carry it in my tree (along with additional code removals) for 2.6.25 submission. I'll normally carry patches until they turn up in a subsystem tree or mainline and will drop them then. To minimise potential of lossage.. Is your tree publicly accessible? It is, though, not widely publicized: git://avgit01.qlogic.com/qla2xxx-upstream The repo is torn down and rebased on a frequent basis, and is meant to provide a snapshot of where qla2xxx is at any given time. Currently it's comprised of linux-2.6.git with scsi-misc-2.6.git merged and a dozen or so patches queued for the next merge window (2.6.25). That should be OK. Sometimes ugly things can happen if James syncs with Linus and you don't: when I ask git to generate the james-you diff, it will generate a patch which reverts the Linus changes which are in James's tree but which aren't in yours. I have an alternative pull-git-trees script which tries to fix that but not very successfully. But whatever, we'll see. One slight problem though: fatal: Unable to look up avgit01.qlogic.com (port 9418) (Name or service not known) I think I might need to be [EMAIL PROTECTED] to get at that?
Re: [PATCH 1/3] cciss: export more attributes to sysfs (repost)
On Fri, 14 Dec 2007 16:17:44 -0600 Mike Miller [EMAIL PROTECTED] wrote: Patch 1 of 3 Sorry to take so long to repost. This patch exports more attributes to /sys so we can work work better with udev. Some distros use unique_id among other attributes. This patch attempts to provide that and other attributes to reveal more information about cciss devices in /sys. It's also an effort to be more sysfs friendly. Please consider this for inclusion. I'm getting some deja vu here. I'm sure I already commented on some of these things? diff --git a/drivers/block/cciss.c b/drivers/block/cciss.c index 7d70496..54080e6 100644 --- a/drivers/block/cciss.c +++ b/drivers/block/cciss.c @@ -229,20 +229,485 @@ static inline CommandList_struct *removeQ(CommandList_struct **Qptr, return c; } +static inline int find_drv_index(int ctlr, drive_info_struct *drv){ +int i; +for (i=0; i CISS_MAX_LUN; i++) { +if (hba[ctlr]-drv[i].LunID == drv-LunID) +return i; +} +return i; +} pleeeze always feed all diffs through scripts/checkpatch.pl. Twice. This function has multiple coding-style mistakes. It is also far too large to be inlined. #include cciss_scsi.c /* For SCSI tape support */ +#define ENG_GIG 10 +#define ENG_GIG_FACTOR (ENG_GIG/512) #define RAID_UNKNOWN 6 +static const char *raid_label[] = { 0, 4, 1(1+0), 5, 5+1, ADG, + UNKNOWN}; + + +static spinlock_t sysfs_lock = SPIN_LOCK_UNLOCKED; checkpatch would have informed you about this mistake as well. +static void cciss_sysfs_stat_inquiry(int ctlr, int logvol, + int withirq, drive_info_struct *drv) +{ + int return_code; + InquiryData_struct *inq_buff; + + /* If there are no heads then this is the controller disk and + * not a valid logical drive so don't query it. + */ + if (!drv-heads) + return; + + inq_buff = kzalloc(sizeof(InquiryData_struct), GFP_KERNEL); + if (!inq_buff) { + printk(KERN_ERR cciss: out of memory\n); This failure gets dropped on the floor. Is there really no need to report it? 
Will the driver still correctly function even thoug this function didn't do anything? + goto err; + } + + if (withirq) + return_code = sendcmd_withirq(CISS_INQUIRY, ctlr, + inq_buff, sizeof(*inq_buff), 1, logvol ,0, TYPE_CMD); + else + return_code = sendcmd(CISS_INQUIRY, ctlr, inq_buff, + sizeof(*inq_buff), 1, logvol , 0, NULL, TYPE_CMD); + if (return_code == IO_OK) { + memcpy(drv-vendor, inq_buff-data_byte[8], 8); + drv-vendor[8]='\0'; + memcpy(drv-model, inq_buff-data_byte[16], 16); + drv-model[16] = '\0'; + memcpy(drv-rev, inq_buff-data_byte[32], 4); + drv-rev[4] = '\0'; + } else { /* Get geometry failed */ + printk(KERN_WARNING cciss: inquiry for VPD page 0 failed\n); + } + + if (withirq) + return_code = sendcmd_withirq(CISS_INQUIRY, ctlr, + inq_buff, sizeof(*inq_buff), 1, logvol ,0x83, TYPE_CMD); + else + return_code = sendcmd(CISS_INQUIRY, ctlr, inq_buff, + sizeof(*inq_buff), 1, logvol , 0x83, NULL, TYPE_CMD); + + if (return_code == IO_OK) { + memcpy(drv-uid, inq_buff-data_byte[8], 16); + } else { /* Get geometry failed */ + printk(KERN_WARNING cciss: inquiry for VPD page 83 failed\n); + } + + kfree(inq_buff); +err: + drv-vendor[8] = '\0'; + drv-model[16] = '\0'; + drv-rev[4] = '\0'; + +} + +static ssize_t cciss_show_raid_level(struct device *dev, + struct device_attribute *attr, char *buf) +{ + struct drv_dynamic *d; + drive_info_struct *drv; + ctlr_info_t *h; + unsigned long flags; + int raid; + + d = container_of(dev, struct drv_dynamic, dev); + spin_lock(sysfs_lock); + if (!d-disk) { + spin_unlock(sysfs_lock); + return -ENOENT; + } + + h = get_host(d-disk); + + spin_lock_irqsave(CCISS_LOCK(h-ctlr), flags); + if (h-busy_configuring) { + spin_unlock_irqrestore(CCISS_LOCK(h-ctlr), flags); + spin_unlock(sysfs_lock); + return snprintf(buf, 30, Device busy configuring\n); + } The above code snippet gets repeated again and again and again. As I suggested last time: can this be fixed? 
+ drv = d-disk-private_data; + if ((drv-raid_level 0) || (drv-raid_level) 5) + raid = RAID_UNKNOWN; + else + raid = drv-raid_level; + + spin_unlock_irqrestore(CCISS_LOCK(h-ctlr), flags); + spin_unlock(sysfs_lock); + return snprintf(buf, 20, RAID %s\n, raid_label[raid]); +}
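On the review comment above that the lock-and-check prologue "gets repeated again and again": one way to address it is to move the prologue into a single helper that either returns with both locks held or fails with nothing held. The following is a userland sketch only, assuming pthread mutexes in place of the driver's spinlocks. The helper names drv_get() and drv_put(), the simplified structures, and the quoted strings in raid_label[] (the archive stripped the quotes from the patch) are illustrative assumptions, not the driver's actual code.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Simplified, hypothetical stand-ins for the cciss structures. */
struct ctlr { pthread_mutex_t lock; int busy_configuring; };
struct drive { int raid_level; };
struct disk { struct ctlr *host; struct drive *drv; };
struct drv_dynamic { struct disk *disk; };

static pthread_mutex_t sysfs_lock = PTHREAD_MUTEX_INITIALIZER;

#define RAID_UNKNOWN 6
/* Label strings reconstructed by assumption. */
static const char *const raid_label[] = {
	"0", "4", "1(1+0)", "5", "5+1", "ADG", "UNKNOWN"
};

/*
 * Shared prologue for every show routine.
 * Returns 0 with sysfs_lock and the controller lock held, or a negative
 * errno with no locks held.
 */
static int drv_get(struct drv_dynamic *d, struct ctlr **hp, struct drive **drvp)
{
	struct ctlr *h;

	pthread_mutex_lock(&sysfs_lock);
	if (!d->disk) {
		pthread_mutex_unlock(&sysfs_lock);
		return -ENOENT;
	}
	h = d->disk->host;
	pthread_mutex_lock(&h->lock);
	if (h->busy_configuring) {
		pthread_mutex_unlock(&h->lock);
		pthread_mutex_unlock(&sysfs_lock);
		return -EBUSY;
	}
	*hp = h;
	*drvp = d->disk->drv;
	return 0;
}

/* Shared epilogue: release the locks in reverse order. */
static void drv_put(struct ctlr *h)
{
	pthread_mutex_unlock(&h->lock);
	pthread_mutex_unlock(&sysfs_lock);
}

/* One attribute routine; the others would follow the same three steps. */
static int show_raid_level(struct drv_dynamic *d, char *buf, size_t len)
{
	struct ctlr *h;
	struct drive *drv;
	int raid, err;

	err = drv_get(d, &h, &drv);
	if (err == -EBUSY)	/* mirrors the patch, which prints a message here */
		return snprintf(buf, len, "Device busy configuring\n");
	if (err)
		return err;

	raid = drv->raid_level;
	if (raid < 0 || raid > 5)
		raid = RAID_UNKNOWN;
	drv_put(h);

	return snprintf(buf, len, "RAID %s\n", raid_label[raid]);
}
```

Each further attribute then shrinks to drv_get(), a field read, and drv_put(), which also keeps the lock ordering in one place.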
Re: 2.6.24-rc5: tape drive not responding
On Mon, 17 Dec 2007 11:25:51 +0900 FUJITA Tomonori [EMAIL PROTECTED] wrote: On Sun, 16 Dec 2007 20:05:51 -0500 John Stoffel [EMAIL PROTECTED] wrote: [ 215.007701] sym1: SCSI parity error detected: SCR1=1 DBC=1128 SBCL=ae [ 215.008145] sym1: SCSI parity error detected: SCR1=1 DBC=1128 SBCL=ae [ 215.008678] sym1: SCSI parity error detected: SCR1=1 DBC=1128 SBCL=ae [ 215.009122] sym1: SCSI parity error detected: SCR1=1 DBC=1128 SBCL=ae [ 215.009598] sym1: SCSI parity error detected: SCR1=1 DBC=1128 SBCL=ae [ 215.010042] sym1: SCSI parity error detected: SCR1=1 DBC=1128 SBCL=ae [ 215.010516] sym1: SCSI parity error detected: SCR1=1 DBC=1128 SBCL=ae [ 215.010959] sym1: SCSI parity error detected: SCR1=1 DBC=1128 SBCL=ae [ 215.011403] sym1: SCSI parity error detected: SCR1=1 DBC=1128 SBCL=ae [ 215.011850] sym1: SCSI parity error detected: SCR1=1 DBC=1128 SBCL=ae . . . [ 232.954629] sym1: SCSI parity error detected: SCR1=1 DBC=1128 SBCL=ae [ 233.035902] scsi 3:0:3:0: DEVICE RESET operation started [ 233.099514] sym1: SCSI parity error detected: SCR1=1 DBC=1128 SBCL=ae . . . These repeat for about 15 seconds or so. They're really annoying and I'd love to see some sort of rate limiting put in here. The messages and end with: . . . [ 238.084175] sym1: SCSI parity error detected: SCR1=1 DBC=1128 SBCL=ae [ 238.165887] sym1: SCSI parity error detected: SCR1=1 DBC=1128 SBCL=ae [ 238.247157] scsi 3:0:3:0: DEVICE RESET operation timed-out. [ 238.313892] sym1: SCSI parity error detected: SCR1=1 DBC=1128 SBCL=ae [ 238.395192] scsi 3:0:3:0: BUS RESET operation started [ 238.455690] sym1: SCSI parity error detected: SCR1=1 DBC=1128 SBCL=ae [ 238.539216] sym1: SCSI BUS reset detected. [ 238.592552] sym1: SCSI BUS has been reset. [ 238.641576] scsi 3:0:3:0: BUS RESET operation complete. 
[ 248.700373] target3:0:3: wide asynchronous [ 248.752026] target3:0:3: Wide Transfers Fail [ 248.805220] target3:0:3: FAST-10 SCSI 10.0 MB/s ST (100 ns, offset 15) [ 248.886729] target3:0:3: Domain Validation skipping write tests [ 248.958666] target3:0:3: Ending Domain Validation [ 252.264086] scsi 3:0:0:0: Attached scsi generic sg2 type 8 [ 252.331257] st 3:0:2:0: Attached scsi tape st0 [ 252.384549] st 3:0:2:0: st0: try direct i/o: yes (alignment 512 B) [ 252.458875] st 3:0:2:0: Attached scsi generic sg3 type 1 [ 252.523963] st 3:0:3:0: Attached scsi tape st1 [ 252.577184] st 3:0:3:0: st1: try direct i/o: yes (alignment 512 B) [ 252.651484] st 3:0:3:0: Attached scsi generic sg4 type 1 I've also got an ATL P1000 SCSI tape library hooked up to this same controller and port, and I can manipulate it properly using the 'mtx' program pointed to the /dev/changer alias, which points to the correct /dev/sg# device. Here's my /proc/scsi/scsi output, as you can see, I've got a bunch of devices on this system: # cat /proc/scsi/scsi Attached devices: Host: scsi0 Channel: 00 Id: 00 Lun: 00 Vendor: COMPAQ Model: HC01841729 Rev: 3208 Type: Direct-AccessANSI SCSI revision: 02 Host: scsi0 Channel: 00 Id: 01 Lun: 00 Vendor: COMPAQ Model: BD018222CA Rev: B016 Type: Direct-AccessANSI SCSI revision: 02 Host: scsi3 Channel: 00 Id: 00 Lun: 00 Vendor: ATL Model: P10006220051 Rev: 1.20 Type: Medium Changer ANSI SCSI revision: 02 Host: scsi3 Channel: 00 Id: 02 Lun: 00 Vendor: QUANTUM Model: DLT7000 Rev: 2565 Type: Sequential-AccessANSI SCSI revision: 02 Host: scsi3 Channel: 00 Id: 03 Lun: 00 Vendor: QUANTUM Model: DLT7000 Rev: 2565 Type: Sequential-AccessANSI SCSI revision: 02 Host: scsi4 Channel: 00 Id: 00 Lun: 00 Vendor: SAMSUNG Model: CDRW/DVD SM-352B Rev: T806 Type: CD-ROM ANSI SCSI revision: 05 Host: scsi6 Channel: 00 Id: 00 Lun: 00 Vendor: ATA Model: ST3320620AS Rev: 3.AA Type: Direct-AccessANSI SCSI revision: 05 Host: scsi7 Channel: 00 Id: 00 Lun: 00 Vendor: ATA Model: WDC 
WD3200AAKS-0 Rev: 12.0 Type: Direct-AccessANSI SCSI revision: 05 Host: scsi10 Channel: 00 Id: 00 Lun: 00 Vendor: ATA Model: WDC WD1200JB-00C Rev: 17.0 Type: Direct-AccessANSI SCSI revision: 05 Host: scsi11 Channel: 00 Id: 00 Lun: 00 Vendor: ATA Model: WDC WD1200JB-00E Rev: 15.0 Type: Direct-AccessANSI SCSI revision: 05 Host: scsi12 Channel: 00 Id: 00 Lun: 00 Vendor: Generic Model: STORAGE DEVICE Rev: 0001 Type: Direct-AccessANSI SCSI revision: 00 Host: scsi12 Channel: 00 Id: 00 Lun: 01 Vendor: Generic Model: STORAGE DEVICE Rev:
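The report above asks for "some sort of rate limiting" on the repeated sym1 parity-error message. The kernel has its own facility for this (printk_ratelimit()); the sketch below is not that implementation, just a minimal interval-and-burst limiter in plain C to show the idea. The struct and function names are hypothetical, and time is passed in explicitly (in whatever unit the caller uses) so the behaviour is easy to check.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical limiter state: allow at most `burst` messages per `interval`. */
struct ratelimit {
	long interval;	/* window length, in the caller's time units */
	int burst;	/* messages allowed per window */
	long begin;	/* start of the current window */
	int printed;	/* messages let through in this window */
	int missed;	/* messages suppressed in this window */
};

/* Returns true if the caller may emit its message at time `now`. */
static bool ratelimit_ok(struct ratelimit *rl, long now)
{
	/* Window expired: open a new one and reset the counters. */
	if (now - rl->begin >= rl->interval) {
		rl->begin = now;
		rl->printed = 0;
		rl->missed = 0;
	}
	if (rl->printed < rl->burst) {
		rl->printed++;
		return true;
	}
	rl->missed++;
	return false;
}
```

A driver-side caller would wrap the noisy message in if (ratelimit_ok(&rl, now)), and could report the missed count when a new window opens so that the suppression itself is visible in the log.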
Re: INITIO scsi driver fails to work properly
On Mon, 17 Dec 2007 11:39:47 +0200 Filippos Papadopoulos [EMAIL PROTECTED] wrote:

> Hi, I have got an INITIO 9100 UW SCSI Controller with an IBM IC35L036UWD210-0 scsi hard disk on a 32 bit x86 system. Currently I have SUSE 10.1 (Kernel 2.6.16). I tried to install OpenSUSE 10.3 (kernel 2.6.22.5) and the latest OpenSUSE 11.0 Alpha 0 (kernel 2.6.24-rc4) but although the initio driver gets loaded during the installation process, yast reports that no hard disk is found. I believe that this isn't a bug in suse's yast but a problem in the initio scsi driver because I also tried to install Fedora 8 (kernel 2.6.23) with the same problem. I have seen the relevant thread "Conflict when loading initio driver" and I suppose that the initio driver isn't fixed yet. I can help testing the new patches in the initio driver if someone is interested.

initio doesn't seem to have a maintainer...

Are you able to identify any earlier kernel which worked OK? Maybe it's a new device? If you can get the `lspci -vvxx' output for that device we can take a look.
Re: 2.6.24-rc5: tape drive not responding
On Mon, 17 Dec 2007 16:02:02 -0500 John Stoffel [EMAIL PROTECTED] wrote:

> Just to confirm, the proposed patch to st.c fixes the issue with 2.6.24-rc5 as well as 2.6.24-rc5-mm1 with access to my DLT tape drives.

err, what patch to st.c?

So it seems that 2.6.24 (and presumably 2.6.23?) need

1: Alan's "initio: fix conflict when loading driver" (currently stuck in git-scsi-misc)
2: Boaz's "initio: initio_build_scb() fix" (my name for it)
3: The mystery st.c fix.

yes?
Re: [PATCH] MAINTAINERS: remove Adam Fritzler, update his email address in other sources
On Mon, 17 Dec 2007 20:12:06 -0800 Joe Perches [EMAIL PROTECTED] wrote:

> Adam isn't a maintainer anymore. His old email address bounces. Update to new email address.
>
> On Mon, Dec 17, 2007 at 01:03:48PM -0800, Joe Perches wrote:
> > You seem to have an old email address in the linux-kernel MAINTAINERS file. Should it be deleted or changed?
>
> On Mon, 2007-12-17 at 19:27 -0800, Adam Fritzler wrote:
> > I am no longer actively involved. If you can mark me as a former point of contact, that's fine, or you can just delete the entry. My name is still in the source, but with the old address. It'd be great if the address in source was updated.
>
> ...
> -TMS380 TOKEN-RING NETWORK DRIVER
> -P: Adam Fritzler
> -M: [EMAIL PROTECTED]
> -L: [EMAIL PROTECTED]
> -W: http://www.auk.cx/tms380tr/
> -S: Maintained
> ...
> - * Added MCA support Adam Fritzler [EMAIL PROTECTED]
> + * Added MCA support Adam Fritzler [EMAIL PROTECTED]

This is fairly pointless - it'll just break again when Adam moves again. Every problem can be solved with another layer of...

Please: just replace all instances with plain old "Adam Fritzler" and then ensure that the lookup key "Adam Fritzler" has an accurate (and non-duplicated anywhere else!) entry in MAINTAINERS or CREDITS or whatever.

btw, I cheerfully skipped all your spelling-fixes patches. Some will have stuck via subsystem maintainers but I have a secret "no spelling fixes unless they're end-user-visible" policy. That means I'll take spelling fixes only if they're in printks or in Documentation/*. This is a little defense mechanism to avoid getting buried in micropatches. I'd suggest that you find out if Adrian is still running the trivial tree and if so, patchbomb him.
Re: [patch 18/30] scsi/qla2xxx/: possible cleanups
On Fri, 14 Dec 2007 07:37:24 -0800 Andrew Vasquez [EMAIL PROTECTED] wrote:

> On Thu, 13 Dec 2007, [EMAIL PROTECTED] wrote:
> > From: Adrian Bunk [EMAIL PROTECTED]
> >
> > - make the following needlessly global code static:
> >   - qla_attr.c: qla24xx_vport_delete()
> >   - qla_attr.c: qla24xx_vport_disable()
> >   - qla_mid.c: qla24xx_allocate_vp_id()
> >   - qla_mid.c: qla24xx_find_vhost_by_name()
> >   - qla_mid.c: qla2x00_do_dpc_vp()
> >   - qla_os.c: struct qla2x00_driver_template
> >   - qla_os.c: qla2x00_stop_timer()
> >   - qla_os.c: qla2x00_mem_alloc()
> >   - qla_os.c: qla2x00_mem_free()
> >   - qla_sup.c: qla2x00_lock_nvram_access()
> >   - qla_sup.c: qla2x00_unlock_nvram_access()
> >   - qla_sup.c: qla2x00_get_nvram_word()
> >   - qla_sup.c: qla2x00_write_nvram_word()
> > - #if 0 the following unused global functions:
> >   - qla_dbg.c: qla2x00_dump_pkt()
> >   - qla_mbx.c: qla2x00_system_error()
> >   - qla_mbx.c: qla2x00_get_serdes_params()
> >   - qla_mbx.c: qla2x00_get_idma_speed()
> >   - qla_mbx.c: qla24xx_get_vp_database()
> >   - qla_mbx.c: qla24xx_get_vp_entry()
> > - qla_os.c: remove some unneeded function prototypes
> >
> > Signed-off-by: Adrian Bunk [EMAIL PROTECTED]
> > Cc: Andrew Vasquez [EMAIL PROTECTED]
> > Signed-off-by: Andrew Morton [EMAIL PROTECTED]
>
> Andrew, could you drop this patch from your queue? I'll carry it in my tree (along with additional code removals) for 2.6.25 submission.

I'll normally carry patches until they turn up in a subsystem tree or mainline and will drop them then. To minimise potential of lossage..

Is your tree publicly accessible?
Re: QUEUE_FLAG_CLUSTER: not working in 2.6.24 ?
On Sat, 15 Dec 2007 01:09:41 + Mel Gorman [EMAIL PROTECTED] wrote:

> On (13/12/07 14:29), Andrew Morton didst pronounce:
> > > The simple way seems to be to malloc a large area, touch every page and then look at the physical pages assigned ... they now mostly seem to be descending in physical address.
> >
> > OIC. -mm's /proc/pid/pagemap can be used to get the pfn's...
>
> I tried using pagemap to verify the patch but it triggered BUG_ON checks. Perhaps I am using the interface wrong but I would still not expect it to break in this fashion. I tried 2.6.24-rc4-mm1, 2.6.24-rc5-mm1, 2.6.24-rc5 with just the maps4 patches applied and 2.6.23 with maps4 patches applied. Each time I get errors like this;
>
> [ 90.108315] BUG: sleeping function called from invalid context at include/asm/uaccess_32.h:457
> [ 90.211227] in_atomic():1, irqs_disabled():0
> [ 90.262251] no locks held by showcontiguous/2814.
> [ 90.318475] Pid: 2814, comm: showcontiguous Not tainted 2.6.24-rc5 #1
> [ 90.395344] [c010522a] show_trace_log_lvl+0x1a/0x30
> [ 90.456948] [c0105bb2] show_trace+0x12/0x20
> [ 90.510173] [c0105eee] dump_stack+0x6e/0x80
> [ 90.563409] [c01205b3] __might_sleep+0xc3/0xe0
> [ 90.619765] [c02264fd] copy_to_user+0x3d/0x60
> [ 90.675153] [c01b3e9c] add_to_pagemap+0x5c/0x80
> [ 90.732513] [c01b43e8] pagemap_pte_range+0x68/0xb0
> [ 90.793010] [c0175ed2] walk_page_range+0x112/0x210
> [ 90.853482] [c01b47c6] pagemap_read+0x176/0x220
> [ 90.910863] [c0182dc4] vfs_read+0x94/0x150
> [ 90.963058] [c01832fd] sys_read+0x3d/0x70
> [ 91.014219] [c0104262] syscall_call+0x7/0xb
> ...
>
> Just using cp to read the file is enough to cause problems but I included a very basic program below that produces the BUG_ON checks. Is this a known issue or am I using the interface incorrectly?

I'd say you're using it correctly but you've found a hitherto unknown bug. On i386 highmem machines with CONFIG_HIGHPTE (at least) pte_offset_map() takes kmap_atomic(), so pagemap_pte_range() can't do copy_to_user() as it presently does.

Drat.

Still, that shouldn't really disrupt the testing which you're doing. You could disable CONFIG_HIGHPTE to shut it up.
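For anyone wanting to repeat the "touch every page and look at the pfns" experiment from user space, here is a minimal Python sketch. It is not the showcontiguous program mentioned above (that was not posted here). It assumes the pagemap entry layout documented for current mainline kernels (one 64-bit little-endian entry per virtual page, PFN in bits 0-54, bit 63 set when the page is present), which may well differ from the maps4 patches being tested in this thread. Recent kernels also report a PFN of zero to unprivileged readers, in which case both counters simply stay at zero. All function names are made up for illustration.

```python
import ctypes
import mmap
import struct

PAGE_SIZE = mmap.PAGESIZE
ENTRY_SIZE = 8                 # one 64-bit entry per virtual page (assumed layout)
PFN_MASK = (1 << 55) - 1       # bits 0-54 hold the page frame number (assumed)
PRESENT = 1 << 63              # bit 63: page is present in RAM (assumed)


def decode_pfns(raw):
    """Decode raw pagemap bytes into a list of PFNs (None if not present)."""
    raw = raw[:len(raw) - len(raw) % ENTRY_SIZE]   # ignore a short trailing read
    pfns = []
    for (entry,) in struct.iter_unpack("<Q", raw):
        pfns.append(entry & PFN_MASK if entry & PRESENT else None)
    return pfns


def count_direction(pfns):
    """Count neighbouring virtual pages whose PFNs differ by exactly +1 / -1."""
    ascending = descending = 0
    for prev, cur in zip(pfns, pfns[1:]):
        if prev is None or cur is None:
            continue
        if cur == prev + 1:
            ascending += 1
        elif cur == prev - 1:
            descending += 1
    return ascending, descending


def read_pagemap(vaddr, npages, path="/proc/self/pagemap"):
    """Read the pagemap entries covering npages starting at vaddr."""
    with open(path, "rb") as f:
        f.seek((vaddr // PAGE_SIZE) * ENTRY_SIZE)
        return f.read(npages * ENTRY_SIZE)


def measure_self(npages=256):
    """Allocate and touch npages, then report (ascending, descending) counts."""
    buf = mmap.mmap(-1, npages * PAGE_SIZE)
    for off in range(0, npages * PAGE_SIZE, PAGE_SIZE):
        buf[off] = 1                               # touch every page
    vaddr = ctypes.addressof(ctypes.c_char.from_buffer(buf))
    return count_direction(decode_pfns(read_pagemap(vaddr, npages)))
```

Calling `measure_self()` as root on a kernel with this layout should return two counts. If the allocator is handing out pages in ascending order one would expect the first to dominate, and the second to dominate with the regression discussed in this thread.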
Re: QUEUE_FLAG_CLUSTER: not working in 2.6.24 ?
On Thu, 13 Dec 2007 21:09:59 +0100 Jens Axboe [EMAIL PROTECTED] wrote:

> OK, it's a vm issue, cc linux-mm and probable culprit. I have tens of thousands of backward pages after a boot - IOW, bvec->bv_page is the page before bvprv->bv_page, not reverse. So it looks like that bug got reintroduced.

Bill Irwin fixed this a couple of years back: changed the page allocator so that it mostly hands out pages in ascending physical-address order.

I guess we broke that, quite possibly in Mel's page allocator rework.

It would help if you could provide us with a simple recipe for demonstrating this problem, please.
Re: QUEUE_FLAG_CLUSTER: not working in 2.6.24 ?
On Thu, 13 Dec 2007 17:15:06 -0500 James Bottomley [EMAIL PROTECTED] wrote:

> On Thu, 2007-12-13 at 14:02 -0800, Andrew Morton wrote:
> > On Thu, 13 Dec 2007 21:09:59 +0100 Jens Axboe [EMAIL PROTECTED] wrote:
> > > OK, it's a vm issue, cc linux-mm and probable culprit. I have tens of thousands of backward pages after a boot - IOW, bvec->bv_page is the page before bvprv->bv_page, not reverse. So it looks like that bug got reintroduced.
> >
> > Bill Irwin fixed this a couple of years back: changed the page allocator so that it mostly hands out pages in ascending physical-address order.
> >
> > I guess we broke that, quite possibly in Mel's page allocator rework.
> >
> > It would help if you could provide us with a simple recipe for demonstrating this problem, please.
>
> The simple way seems to be to malloc a large area, touch every page and then look at the physical pages assigned ... they now mostly seem to be descending in physical address.

OIC. -mm's /proc/pid/pagemap can be used to get the pfn's...
Re: QUEUE_FLAG_CLUSTER: not working in 2.6.24 ?
On Thu, 13 Dec 2007 19:30:00 -0500 Mark Lord [EMAIL PROTECTED] wrote:

> Here's the commit that causes the regression:
>
> ...
>
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -760,7 +760,8 @@ static int rmqueue_bulk(struct zone *zone, unsigned int order,
>  		struct page *page = __rmqueue(zone, order, migratetype);
>  		if (unlikely(page == NULL))
>  			break;
> -		list_add_tail(&page->lru, list);
> +		list_add(&page->lru, list);

well that looks fishy.
Re: [PATCH] fix page_alloc for larger I/O segments
On Thu, 13 Dec 2007 19:40:09 -0500 Mark Lord [EMAIL PROTECTED] wrote:

> And here is a patch that seems to fix it for me here:
>
> * * * *
>
> Fix page allocator to give better chance of larger contiguous segments (again).
>
> Signed-off-by: Mark Lord [EMAIL PROTECTED]
> ---
>
> --- old/mm/page_alloc.c.orig 2007-12-13 19:25:15.0 -0500
> +++ linux-2.6/mm/page_alloc.c 2007-12-13 19:35:50.0 -0500
> @@ -954,7 +954,7 @@
>  			goto failed;
>  		}
>  		/* Find a page of the appropriate migrate type */
> -		list_for_each_entry(page, &pcp->list, lru) {
> +		list_for_each_entry_reverse(page, &pcp->list, lru) {
>  			if (page_private(page) == migratetype) {
>  				list_del(&page->lru);
>  				pcp->count--;

- needs help to make it apply to mainline
- needs a comment, methinks...

--- a/mm/page_alloc.c~fix-page-allocator-to-give-better-chance-of-larger-contiguous-segments-again
+++ a/mm/page_alloc.c
@@ -1060,8 +1060,12 @@ again:
 				goto failed;
 		}
-		/* Find a page of the appropriate migrate type */
-		list_for_each_entry(page, &pcp->list, lru)
+		/*
+		 * Find a page of the appropriate migrate type. Doing a
+		 * reverse-order search here helps us to hand out pages in
+		 * ascending physical-address order.
+		 */
+		list_for_each_entry_reverse(page, &pcp->list, lru)
 			if (page_private(page) == migratetype)
 				break;
_
Re: [PATCH] fix page_alloc for larger I/O segments (improved)
On Thu, 13 Dec 2007 19:57:29 -0500 James Bottomley [EMAIL PROTECTED] wrote:

> On Thu, 2007-12-13 at 19:46 -0500, Mark Lord wrote:
> > Improved version, more similar to the 2.6.23 code:
> >
> > Fix page allocator to give better chance of larger contiguous segments (again).
> >
> > Signed-off-by: Mark Lord [EMAIL PROTECTED]
> > ---
> >
> > --- old/mm/page_alloc.c 2007-12-13 19:25:15.0 -0500
> > +++ linux-2.6/mm/page_alloc.c 2007-12-13 19:43:07.0 -0500
> > @@ -760,7 +760,7 @@
> >  		struct page *page = __rmqueue(zone, order, migratetype);
> >  		if (unlikely(page == NULL))
> >  			break;
> > -		list_add(&page->lru, list);
> > +		list_add_tail(&page->lru, list);
>
> Could we put a big comment above this explaining to the would be vm tweakers why this has to be a list_add_tail, so we don't end up back in this position after another two years?

Already done ;)

--- a/mm/page_alloc.c~fix-page_alloc-for-larger-i-o-segments-fix
+++ a/mm/page_alloc.c
@@ -847,6 +847,10 @@ static int rmqueue_bulk(struct zone *zon
 		struct page *page = __rmqueue(zone, order, migratetype);
 		if (unlikely(page == NULL))
 			break;
+		/*
+		 * Doing a list_add_tail() here helps us to hand out pages in
+		 * ascending physical-address order.
+		 */
 		list_add_tail(&page->lru, list);
 		set_page_private(page, migratetype);
 	}
_
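The ordering effect argued about in this thread can be modelled in a few lines of user-space Python. This is only a toy and not kernel code: it assumes the buddy allocator hands pages to rmqueue_bulk() lowest address first, and that the consumer normally takes pages from the head of the per-cpu list. `refill` and `drain` are made-up names, and plain integers stand in for page frame numbers.

```python
from collections import deque


def refill(free_pages, batch, add_tail):
    """Toy model of rmqueue_bulk(): move `batch` pages onto a per-cpu list.

    add_tail=True models list_add_tail(), add_tail=False models list_add().
    """
    pcp = deque()
    for _ in range(batch):
        page = free_pages.popleft()    # assumed: lowest address comes first
        if add_tail:
            pcp.append(page)           # insert at the tail of the list
        else:
            pcp.appendleft(page)       # insert at the head of the list
    return pcp


def drain(pcp, from_tail=False):
    """Toy consumer: hand out every page, from the head or from the tail."""
    out = []
    while pcp:
        out.append(pcp.pop() if from_tail else pcp.popleft())
    return out


if __name__ == "__main__":
    for add_tail in (True, False):
        pages = drain(refill(deque(range(100, 108)), 4, add_tail))
        print("list_add_tail" if add_tail else "list_add     ", pages)
```

In this model tail insertion with a head-first consumer gives ascending order, head insertion gives descending order, and head insertion combined with a tail-first (reverse) search gives ascending order again, which is the shape of the two fixes posted above.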
Re: broken dpt_i2o in 2.6.23 (was: ext2 check page: bad entry in directory) (fwd)
On Wed, 12 Dec 2007 11:58:41 +0100 Anders Henke [EMAIL PROTECTED] wrote:

> Hi, I'd like to let you know that my boxes are running a 32-bit kernel, so the 64-bit-uncleanliness shouldn't apply to my boxes; however, http://www.miquels.cistron.nl/linux/dpt_i2o-64bit-2.6.23.patch fixed the issue on my testbox. I took a clean 2.6.23, applied the patch, recompiled the kernel, reboot: works.

What a huge patch :(

We already reverted the offending patch so I assume that 2.6.24-rc5 is working for you?

I guess we need to look at restoring "dpt_i2o: convert to SCSI hotplug model" and then absorbing what Miquel has done there.
Re: broken dpt_i2o in 2.6.23 (was: ext2 check page: bad entry in directory) (fwd)
On Wed, 12 Dec 2007 14:43:42 +0100 Anders Henke [EMAIL PROTECTED] wrote:

> On 12.12.2007 Miquel van Smoorenburg wrote:
> > On Wed, 2007-12-12 at 03:38 -0800, Andrew Morton wrote:
> > > On Wed, 12 Dec 2007 11:58:41 +0100 Anders Henke [EMAIL PROTECTED] wrote:
> > > > Hi, I'd like to let you know that my boxes are running a 32-bit kernel, so the 64-bit-uncleanliness shouldn't apply to my boxes; however, http://www.miquels.cistron.nl/linux/dpt_i2o-64bit-2.6.23.patch fixed the issue on my testbox. I took a clean 2.6.23, applied the patch, recompiled the kernel, reboot: works.
> > >
> > > What a huge patch :(
> > >
> > > We already reverted the offending patch so I assume that 2.6.24-rc5 is working for you?
> > >
> > > I guess we need to look at restoring "dpt_i2o: convert to SCSI hotplug model" and then absorbing what Miquel has done there.
> >
> > This was just a patch I had lying around, if it worked it would confirm my suspicion, which it has. The minimal patch which is suitable for 2.6.23-stable and 2.6.24 would be the attached one-liner. The "dpt_i2o: convert to SCSI hotplug model" patch could be restored then.
> >
> > (if the list eats the attachment, it's also available here: http://www.miquels.cistron.nl/linux/linux-2.6.23+24-dpt_i2o-dma64.patch )
> >
> > Anders, does this one-liner patch work for you ?
>
> Got it - and it works! I took a clean 2.6.23, applied the patch, recompiled the kernel and rebooted my testbox: came up with the fresh-compiled kernel (verified by uname -a).

That looks appropriate for 2.6.23.x:

--- linux-2.6.23.9.orig/drivers/scsi/dpt_i2o.c 2007-11-26 18:51:43.0 +0100
+++ linux-2.6.23.9/drivers/scsi/dpt_i2o.c 2007-12-12 13:21:05.0 +0100
@@ -905,8 +905,7 @@
 	}
 	pci_set_master(pDev);
-	if (pci_set_dma_mask(pDev, DMA_64BIT_MASK) &&
-	    pci_set_dma_mask(pDev, DMA_32BIT_MASK))
+	if (pci_set_dma_mask(pDev, DMA_32BIT_MASK))
 		return -EINVAL;
 	base_addr0_phys = pci_resource_start(pDev,0);

However it is a bit mystifying that 55d9fcf57ba5ec427544fca7abc335cf3da78160 would cause a dma mask problem (isn't it?)

The scsi people might want to restore 55d9fcf57ba5ec427544fca7abc335cf3da78160 and then apply Miquel's patch on top for 2.6.24, or do it for 2.6.25?
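To make the effect of the one-liner concrete, here is a small Python model of the two probe variants. It is a sketch under assumptions, not driver code: it assumes the two calls in the removed lines were joined by `&&` (the usual "try 64-bit, fall back to 32-bit" idiom), that pci_set_dma_mask() returns 0 on success and non-zero on failure, and it uses a fake `set_dma_mask` callable in place of the real PCI layer. The helper names are hypothetical.

```python
DMA_64BIT_MASK = 0xFFFFFFFFFFFFFFFF
DMA_32BIT_MASK = 0x00000000FFFFFFFF
EINVAL = 22


def make_set_dma_mask(supported):
    """Build a fake pci_set_dma_mask(): 0 on success, non-zero on failure."""
    state = {"mask": None}

    def set_dma_mask(mask):
        if mask in supported:
            state["mask"] = mask       # remember which mask was accepted
            return 0
        return -5                      # any non-zero value means failure

    return set_dma_mask, state


def probe_original(set_dma_mask):
    """Model of the removed lines: 64-bit first, 32-bit only as a fallback."""
    if set_dma_mask(DMA_64BIT_MASK) and set_dma_mask(DMA_32BIT_MASK):
        return -EINVAL
    return 0


def probe_patched(set_dma_mask):
    """Model of the one-liner: only ever ask for a 32-bit mask."""
    if set_dma_mask(DMA_32BIT_MASK):
        return -EINVAL
    return 0
```

On a platform that accepts both masks, the original variant ends up with the 64-bit mask and the patched one with the 32-bit mask. That is the only behavioural difference the model captures; it says nothing about why the hotplug conversion exposed the problem.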
Re: [PATCH acs_ame scsi driver 000 of 1] Introduction
On Tue, 11 Dec 2007 17:01:48 -0800 Andrew Morton [EMAIL PROTECTED] wrote:

> More hm. It was from [EMAIL PROTECTED] which perhaps means that some attempt to recall it was made. Oh well.

argh. [EMAIL PROTECTED] really is our Jeff's email address. And I went and cc'ed [EMAIL PROTECTED] on my reply, only that person is an innocent civilian. Bad me.

Please remove [EMAIL PROTECTED] from any replies. Thanks.
Re: [BUG] 2.6.23-rc3 can't see sd partitions on Alpha
On Thu, 6 Dec 2007 23:07:08 -0600 (CST) [EMAIL PROTECTED] (Bob Tracy) wrote: Andrew Morton wrote: commit 6f37ac793d6ba7b35d338f791974166f67fdd9ba Merge: 2f1f53b... d90bf5a... Author: Linus Torvalds [EMAIL PROTECTED] Date: Wed Nov 14 18:51:48 2007 -0800 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/n * 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: [NET]: rt_check_expire() can take a long time, add a cond_resched() [ISDN] sc: Really, really fix warning [ISDN] sc: Fix sndpkt to have the correct number of arguments [TCP] FRTO: Clear frto_highmark only after process_frto that uses it [NET]: Remove notifier block from chain when register_netdevice_notifier f [FS_ENET]: Fix module build. [TCP]: Make sure write_queue_from does not begin with NULL ptr [TCP]: Fix size calculation in sk_stream_alloc_pskb [S2IO]: Fixed memory leak when MSI-X vector allocation fails [BONDING]: Fix resource use after free [SYSCTL]: Fix warning for token-ring from sysctl checker [NET] random : secure_tcp_sequence_number should not assume CONFIG_KTIME_S [IWLWIFI]: Not correctly dealing with hotunplug. [TCP] FRTO: Plug potential LOST-bit leak [TCP] FRTO: Limit snd_cwnd if TCP was application limited [E1000]: Fix schedule while atomic when called from mii-tool. [NETX]: Fix build failure added by 2.6.24 statistics cleanup. [EP93xx_ETH]: Build fix after 2.6.24 NAPI changes. [PKT_SCHED]: Check subqueue status before calling hard_start_xmit I'm struggling to see how any of those could have broken block device mounting on alpha. Are you sure you bisected right? Based on what's in that commit, it *does* appear something went wrong with bisection. 
If the implicated commit is the next one in time sequence relative to # good: [2f1f53bdc6531696934f6ee7bbdfa2ab4f4f62a3] CRISv10 fasttimer: Scrap INLINE and name timeval_cmp better then the test of whether I bisected correctly is as simple as applying the commit and seeing if things break, because I'm running on the kernel corresponding to 2f1f53bdc6531696934f6ee7bbdfa2ab4f4f62a3 right now. Let me give that a try and I'll report back. Worst case, I'll have to start over and write off the past four days... Gad. I trust the second time will be faster. git-bisect _is_ very error prone. I find one of the problems is that each step is so far apart in time that you forget what you were doing. Did I remember to test that iteration? Did I install the right kernel? etc. Sorry about this... Not appropriate ;) Thanks for helping out. - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: 2.6.24-rc4-mm1 and excessive block IO errors
On Fri, 07 Dec 2007 20:44:45 + Zan Lynx [EMAIL PROTECTED] wrote:

> I am not sure if this problem has been addressed already. I read some about the fast-fail issues and this may be related?
>
> On nearly all my USB block devices, I have been getting zillions of I/O errors. But they aren't real, they don't appear with 2.6.23 kernels.
>
> I can often read and write data to the device, but these IO errors cause error aborts in user space applications in many cases, making it a chancy thing to run backup software, for example.
>
> Here is a bit of dmesg from plugging in a perfectly good USB-2 flash drive.
>
> hub 3-0:1.0: state 7 ports 6 chg evt 0004
> ehci_hcd :00:02.2: GetStatus port 2 status 001803 POWER sig=j CSC CONNECT
> hub 3-0:1.0: port 2, status 0501, change 0001, 480 Mb/s
> hub 3-0:1.0: debounce: port 2: total 100ms stable 100ms status 0x501
> ehci_hcd :00:02.2: port 2 high speed
> ehci_hcd :00:02.2: GetStatus port 2 status 001005 POWER sig=se0 PE CONNECT
> usb 3-2: new high speed USB device using ehci_hcd and address 9
> ehci_hcd :00:02.2: port 2 high speed
> ehci_hcd :00:02.2: GetStatus port 2 status 001005 POWER sig=se0 PE CONNECT
> usb 3-2: default language 0x0409
> usb 3-2: uevent
> usb 3-2: usb_probe_device
> usb 3-2: configuration #1 chosen from 1 choice
> usb 3-2: adding 3-2:1.0 (config #1, interface 0)
> usb 3-2:1.0: uevent
> libusual 3-2:1.0: usb_probe_interface
> libusual 3-2:1.0: usb_probe_interface - got id
> usb-storage 3-2:1.0: usb_probe_interface
> usb-storage 3-2:1.0: usb_probe_interface - got id
> scsi4 : SCSI emulation for USB Mass Storage devices
> drivers/usb/core/inode.c: creating file '009'
> usb 3-2: New USB device found, idVendor=05dc, idProduct=a400
> usb 3-2: New USB device strings: Mfr=1, Product=2, SerialNumber=3
> usb 3-2: Product: JUMPDRIVE
> usb 3-2: Manufacturer: LEXAR MEDIA
> usb 3-2: SerialNumber: 0A4EEC05201219080904
> usb-storage: device found at 9
> usb-storage: waiting for device to settle before scanning
> usb-storage: device scan complete
> scsi 4:0:0:0: Direct-Access LEXARJUMPDRIVE1000 PQ: 0 ANSI: 0 CCS
> sd 4:0:0:0: [sdg] 2026592 512-byte hardware sectors (1038 MB)
> sd 4:0:0:0: [sdg] Write Protect is off
> sd 4:0:0:0: [sdg] Mode Sense: 43 00 00 00
> sd 4:0:0:0: [sdg] Assuming drive cache: write through
> sd 4:0:0:0: [sdg] 2026592 512-byte hardware sectors (1038 MB)
> sd 4:0:0:0: [sdg] Write Protect is off
> sd 4:0:0:0: [sdg] Mode Sense: 43 00 00 00
> sd 4:0:0:0: [sdg] Assuming drive cache: write through
> sdg: sdg1
> sd 4:0:0:0: [sdg] Attached SCSI removable disk
> sd 4:0:0:0: Attached scsi generic sg7 type 0
> sd 4:0:0:0: [sdg] Result: hostbyte=0x01 driverbyte=0x00
> end_request: I/O error, dev sdg, sector 3984

Yes, this is breakage in the scsi tree. I believe that the offending patch has been found and I have a nasty fix somewhere in my inbox - it involves reverting a patch which doesn't revert properly.

I haven't got onto looking at it yet, sorry.
Re: BUG 2.6.24-rc4-mm1 -- Boot still hangs w/ async scsi scan
On Thu, 06 Dec 2007 13:14:22 -0500 Lee Schermerhorn [EMAIL PROTECTED] wrote:

> On Wed, 2007-12-05 at 13:20 -0800, Andrew Morton wrote:
> > On Wed, 05 Dec 2007 11:36:39 -0500 Lee Schermerhorn [EMAIL PROTECTED] wrote:
> > > As reported here: http://marc.info/?l=linux-scsi&m=119645761124683&w=4 against 24-rc3-mm2, I'm still seeing the hang on my HP ia64 NUMA platform under 24-rc4-mm1 with async scsi scan enabled. I'm still seeing the message "mptspi: ioc#: mpt_config failed" when it hangs.
> > >
> > > I can boot by disabling async scan. However, I've also noticed some disks attached via one of the mpt adapters [scsi8 in console log in message linked above] going off-line during stress tests. This was under 24-rc3-mm2. Haven't got that far yet with 24-rc4-mm1.
> >
> > Is there any way of tricking you into http://www.zip.com.au/~akpm/linux/patches/stuff/bisecting-mm-trees.txt?
> >
> > Obvious culprits to start with would be git-scsi-misc and maybe scsi-early-detection-of-medium-not-present-updated.patch. But there are only 20-odd scsi patches in there.
>
> The reported hang occurs after pushing the git-scsi-misc patch.

OK, thanks.

> I'm looking into it now, but it's rather large and I'm a neophyte in this area. If James can point me at a broken-out quilt series for this patch, I'd be willing to try to bisect that--

I doubt if such a thing exists.

> assuming that it IS bisectable.

Often git trees are not bisectable. But they should be.

Your best bet is to do a git-bisect on git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6.git

http://www.kernel.org/doc/local/git-quick.html
Re: [BUG] 2.6.23-rc3 can't see sd partitions on Alpha
On Thu, 6 Dec 2007 18:16:12 -0600 (CST) [EMAIL PROTECTED] (Bob Tracy) wrote:

> OK. Finally have this thing painted into a corner: git has identified 6f37ac793d6ba7b35d338f791974166f67fdd9ba as the first bad commit. From git bisect log, this corresponds to
>
> # bad: [6f37ac793d6ba7b35d338f791974166f67fdd9ba] Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
>
> Here's the full log:
>
> git-bisect start
> # good: [9aae299f7fd1888ea3a195cfe0edef17bb647415] Linux 2.6.24-rc2
> git-bisect good 9aae299f7fd1888ea3a195cfe0edef17bb647415
> # bad: [f05092637dc0d9a3f2249c9b283b973e6e96b7d2] Linux 2.6.24-rc3
> git-bisect bad f05092637dc0d9a3f2249c9b283b973e6e96b7d2
> # good: [e6a5c27f3b0fef72e528fc35e343af4b2db790ff] Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm
> git-bisect good e6a5c27f3b0fef72e528fc35e343af4b2db790ff
> # good: [42614fcde7bfdcbe43a7b17035c167dfebc354dd] vmstat: fix section mismatch warning
> git-bisect good 42614fcde7bfdcbe43a7b17035c167dfebc354dd
> # bad: [a052f4473603765eb6b4c19754689977601dc1d1] Merge git://git.kernel.org/pub/scm/linux/kernel/git/sam/x86
> git-bisect bad a052f4473603765eb6b4c19754689977601dc1d1
> # good: [d8e5219f9f5ca7518eb820db9f3d287a1d46fcf5] CRISv10 improve and bugfix fasttimer
> git-bisect good d8e5219f9f5ca7518eb820db9f3d287a1d46fcf5
> # good: [d90bf5a976793edfa88d3bb2393f0231eb8ce1e5] [NET]: rt_check_expire() can take a long time, add a cond_resched()
> git-bisect good d90bf5a976793edfa88d3bb2393f0231eb8ce1e5
> # good: [2a113281f5cd2febbab21a93c8943f8d3eece4d3] kconfig: use $K64BIT to set 64BIT with all*config targets
> git-bisect good 2a113281f5cd2febbab21a93c8943f8d3eece4d3
> # good: [2e2cd8bad6e03ceea73495ee6d557044213d95de] CRISv10 memset library add lineendings to asm
> git-bisect good 2e2cd8bad6e03ceea73495ee6d557044213d95de
> # bad: [6f37ac793d6ba7b35d338f791974166f67fdd9ba] Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
> git-bisect bad 6f37ac793d6ba7b35d338f791974166f67fdd9ba
> # good: [2f1f53bdc6531696934f6ee7bbdfa2ab4f4f62a3] CRISv10 fasttimer: Scrap INLINE and name timeval_cmp better
> git-bisect good 2f1f53bdc6531696934f6ee7bbdfa2ab4f4f62a3

commit 6f37ac793d6ba7b35d338f791974166f67fdd9ba
Merge: 2f1f53b... d90bf5a...
Author: Linus Torvalds [EMAIL PROTECTED]
Date: Wed Nov 14 18:51:48 2007 -0800

    Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/n

    * 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6:
      [NET]: rt_check_expire() can take a long time, add a cond_resched()
      [ISDN] sc: Really, really fix warning
      [ISDN] sc: Fix sndpkt to have the correct number of arguments
      [TCP] FRTO: Clear frto_highmark only after process_frto that uses it
      [NET]: Remove notifier block from chain when register_netdevice_notifier f
      [FS_ENET]: Fix module build.
      [TCP]: Make sure write_queue_from does not begin with NULL ptr
      [TCP]: Fix size calculation in sk_stream_alloc_pskb
      [S2IO]: Fixed memory leak when MSI-X vector allocation fails
      [BONDING]: Fix resource use after free
      [SYSCTL]: Fix warning for token-ring from sysctl checker
      [NET] random : secure_tcp_sequence_number should not assume CONFIG_KTIME_S
      [IWLWIFI]: Not correctly dealing with hotunplug.
      [TCP] FRTO: Plug potential LOST-bit leak
      [TCP] FRTO: Limit snd_cwnd if TCP was application limited
      [E1000]: Fix schedule while atomic when called from mii-tool.
      [NETX]: Fix build failure added by 2.6.24 statistics cleanup.
      [EP93xx_ETH]: Build fix after 2.6.24 NAPI changes.
      [PKT_SCHED]: Check subqueue status before calling hard_start_xmit

I'm struggling to see how any of those could have broken block device mounting on alpha. Are you sure you bisected right?
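For readers unfamiliar with what the log above is doing, the search can be sketched as a plain binary search. This is a simplification under stated assumptions: real git-bisect walks a commit graph with merges rather than a straight list, and the sketch assumes the last commit in the list is known bad and that every commit after the first bad one is also bad. `first_bad` and `is_bad` are made-up names; `is_bad` stands for "build this kernel, boot it, and see whether the partitions show up".

```python
def first_bad(commits, is_bad):
    """Return (first bad commit, number of tests) for a linear history.

    Assumes commits[-1] is bad and badness is monotonic: once a commit
    is bad, every later commit is bad too.
    """
    lo, hi = 0, len(commits) - 1
    steps = 0
    while lo < hi:
        mid = (lo + hi) // 2
        steps += 1                 # one build-and-boot cycle per step
        if is_bad(commits[mid]):
            hi = mid               # culprit is at mid or earlier
        else:
            lo = mid + 1           # culprit is strictly after mid
    return commits[lo], steps


if __name__ == "__main__":
    history = list(range(1, 1001))             # 1000 hypothetical commits
    culprit, steps = first_bad(history, lambda c: c >= 700)
    print(culprit, steps)
```

Each step halves the remaining range, so around a thousand commits need about ten tests. The flip side is that a single wrong good/bad answer sends every later step into the wrong half, which is one way a bisection can end up pointing at an implausible commit.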
Re: everything in wait_for_completion, what is my system doing?
On Wed, 5 Dec 2007 21:44:54 +0100 Bernd Schubert [EMAIL PROTECTED] wrote: after scsi-recovery a system here went into some kind lock-up, everything seems to be in wait_for_completion(). Please see the attached blocked_states.txt and all_states.txt files. This is 2.6.22.12, I can easily find out the line numbers if required. Any help is highly appreciated. Please cc linux-scsi on scsi-related reports. [blocked_states.txt text/plain (20.5KB)] [generate break] [ 1818.566436] SysRq : Show Blocked State [ 1818.570260] [ 1818.570261] free sibling [ 1818.579253] task PCstack pid father child younger older [ 1818.586987] events/7 D 0155dd642280 026 2 (L-TLB) [ 1818.593747] 81012b529ac0 0046 810128280d18 [ 1818.601321] 8100ba2376f8 81012b689630 81012aff76b0 00078023e215 [ 1818.608870] 00010003ca14 810001065400 000780430c13 [ 1818.616222] Call Trace: [ 1818.618925] [804ececb] io_schedule+0x28/0x36 [ 1818.624207] [8036e517] get_request_wait+0x104/0x158 [ 1818.630112] [8036e5a1] blk_get_request+0x36/0x6b [ 1818.635755] [8042f5cb] scsi_execute+0x51/0x129 [ 1818.641240] [880cc11b] :scsi_transport_spi:spi_execute+0x87/0xf8 [ 1818.648271] [880cd5ae] :scsi_transport_spi:spi_dv_device_echo_buffer+0x181/0x27d [ 1818.656739] [880cd801] :scsi_transport_spi:spi_dv_retrain+0x4e/0x240 [ 1818.664139] [880ce008] :scsi_transport_spi:spi_dv_device+0x615/0x69c [ 1818.671542] [880f16d1] :mptspi:mptspi_dv_device+0xb3/0x14b [ 1818.678042] [880f27d3] :mptspi:mptspi_dv_renegotiate_work+0xcb/0xef [ 1818.685348] [80245bb8] run_workqueue+0x8e/0x120 [ 1818.690905] [80245d50] worker_thread+0x106/0x117 [ 1818.696540] [80249672] kthread+0x4b/0x82 [ 1818.701474] [8020ab28] child_rip+0xa/0x12 [ 1818.706495] [ 1818.708022] unionfs-fuse- D 01a76ef63463 0 1119 1 (NOTLB) [ 1818.714764] 810129765988 0082 80337e22 [ 1818.722329] 8101297658c8 81012b652f20 810129eec810 0006 [ 1818.729895] 00010005204e 81000105c400 000680337c3e [ 1818.737249] Call Trace: [ 1818.739953] [804ecfba] schedule_timeout+0x8a/0xb6 [ 
1818.745673] [804ecf01] io_schedule_timeout+0x28/0x36 [ 1818.751664] [8026fba7] congestion_wait+0x9d/0xc2 [ 1818.757300] [80269b24] balance_dirty_pages_ratelimited_nr+0x196/0x22f [ 1818.764781] [80265a3f] generic_file_buffered_write+0x52a/0x60d [ 1818.771641] [80266210] __generic_file_aio_write_nolock+0x45a/0x491 [ 1818.778852] [802662a8] generic_file_aio_write+0x61/0xc1 [ 1818.785101] [8032eb94] nfs_file_write+0x138/0x1b7 [ 1818.790822] [8028d222] do_sync_write+0xcc/0x112 [ 1818.796372] [8028d32b] vfs_write+0xc3/0x165 [ 1818.801575] [8028d5df] sys_pwrite64+0x68/0x96 [ 1818.806959] [80209d0e] system_call+0x7e/0x83 [ 1818.812250] [2b4eeec3ea73] [snippage] Possibly your device driver had conniptions and stopped generating completion interrupts. Which driver is in use? I don't suppose it is repeatable. - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: BUG 2.6.24-rc4-mm1 -- Boot still hangs w/ async scsi scan
On Wed, 05 Dec 2007 11:36:39 -0500 Lee Schermerhorn [EMAIL PROTECTED] wrote:

> As reported here:
> http://marc.info/?l=linux-scsi&m=119645761124683&w=4
> against 24-rc3-mm2, I'm still seeing the hang on my HP ia64 NUMA platform under 24-rc4-mm1 with async scsi scan enabled. I'm still seeing the message "mptspi: ioc#: mpt_config failed" when it hangs. I can boot by disabling async scan.
>
> However, I've also noticed some disks attached via one of the mpt adapters [scsi8 in console log in message linked above] going off-line during stress tests. This was under 24-rc3-mm2. Haven't got that far yet with 24-rc4-mm1.

Is there any way of tricking you into http://www.zip.com.au/~akpm/linux/patches/stuff/bisecting-mm-trees.txt?

Obvious culprits to start with would be git-scsi-misc and maybe scsi-early-detection-of-medium-not-present-updated.patch. But there are only 20-odd scsi patches in there.
Re: broken dpt_i2o in 2.6.23 (was: ext2_check_page: bad entry in directory)
On Thu, 06 Dec 2007 14:49:37 +0900 FUJITA Tomonori [EMAIL PROTECTED] wrote:

> > > drivers/scsi/dpt_i2o.c | 132 ++-
> > > drivers/scsi/dpti.h|9 ++
> > > 2 files changed, 68 insertions(+), 73 deletions(-)
> >
> > I've done the following:
> > - untarred a clean 2.6.24-rc4 and compiled it with my 2.6.23.1 settings in order to verify that the driver is still broken: checked, the box still won't boot.
> > - patched the just-compiled kernel source with your patch, ran make dist-clean (by means of make-kpkg clean) and recompiled: box boots fine.
> >
> > I've put the captured console logs at
> > http://w.sysiphus.de/dpt_i2o/bootlog.2624-rc4-pristine
> > http://w.sysiphus.de/dpt_i2o/bootlog.2624-rc4-patched
> > ... and the kernel config (which shouldn't matter) at
> > http://w.sysiphus.de/dpt_i2o/kernelconfig.2624-rc4
>
> Thanks for testing.
>
> So reverting Matthew's hotplug patch fixes the problem, though I have no idea how the patch leads to this. Seems that nobody has any clue on that. We need to revert that patch for the moment.

OK, thanks. Let's leave it a couple of days for people to register objections, have bright ideas, etc.
Re: broken dpt_i2o in 2.6.23 (was: ext2_check_page: bad entry in directory)
On Thu, 29 Nov 2007 13:31:50 +0100 Anders Henke [EMAIL PROTECTED] wrote:

> On November 28 2007, Anders Henke wrote:
> > As everything being reported as zero is quite odd and Jan took a guess that it might be block-layer or driver-related, I've assumed that the driver is responsible for this; just out of curiosity, I've manually replaced the dpt_i2o driver by the 2.6.19 one by copying drivers/scsi/dpt_i2o.c, drivers/scsi/dpti.h and drivers/scsi/dpt/ into a vanilla 2.6.23.1 kernel; using this kernel fixed the issue for me. I haven't yet fine-tested from which kernel release on the dpt_i2o driver behaves like this and spews out zeroed blocks when trying to mount the rootfs. Maybe this is just some timing issue.
>
> I've started the fine-tests and can say so far that dpt_i2o from 2.6.22 is still fine. Test is simple:
>
> [EMAIL PROTECTED]:/usr/src/linux-2.6.22/drivers/scsi/dpt$ cp -r dpt/ dpt_i2o.c dpti.h /usr/src/linux-2.6.23.1/drivers/scsi/
>
> ... recompile the kernel, reboot: works.
>
> 2.6.22 and 2.6.23 differ in terms of the dpt_i2o driver by three different patch sets:
> - one 2 Kb small set of patches from 2.6.22 to 2.6.23-rc1
> - one 7 Kb set of patches from 2.6.23-rc2 to 2.6.23-rc3
> - one 162 Kb set of patches from 2.6.23-rc9 to 2.6.23-rc10.
>
> When applying the 2.6.23-rc1-based driver to my 2.6.23.1 kernel, the zero-blocks symptom shows up, so it's the lucky situation that the smallest patch actually seems to be the broken one.
>
> According to the 2.6.23-rc1 short-form changelog, there is one major edit on the dpt_i2o driver:
>
> FUJITA Tomonori
>   [SCSI] dpt_i2o: convert to use the data buffer accessors
> Stephen Rothwell
>   dpt_i2o depends on virt_to_bus
>
> Fujita, would you please take a look at this?

He won't have seen this. cc's added.

> I think that something's broken in there, leading to the dpt_i2o sending out blocks of zeroes right after initialization, at least on some specific controllers (in this case, Adaptec 2010S on Intel SE7501WV2S-based boxes).
I don't have insight kernel driver development knowledge, so I'm quite out of help right now. Nevertheless, I'll add the diff from 2.6.22 to 2.6.23-rc1 in terms of dpt_i2o: Can you please confirm that this revert (against 2.6.24-rc4) fixes the data corruption problems? Thanks. diff -puN drivers/scsi/dpt_i2o.c~revert-dpt_i2o-convert-to-use-the-data-buffer-accessors drivers/scsi/dpt_i2o.c --- a/drivers/scsi/dpt_i2o.c~revert-dpt_i2o-convert-to-use-the-data-buffer-accessors +++ a/drivers/scsi/dpt_i2o.c @@ -2062,13 +2062,12 @@ static s32 adpt_scsi_to_i2o(adpt_hba* pH u32 *lenptr; int direction; int scsidir; - int nseg; u32 len; u32 reqlen; s32 rcode; memset(msg, 0 , sizeof(msg)); - len = scsi_bufflen(cmd); + len = cmd-request_bufflen; direction = 0x; scsidir = 0x; // DATA NO XFER @@ -2125,21 +2124,21 @@ static s32 adpt_scsi_to_i2o(adpt_hba* pH lenptr=mptr++; /* Remember me - fill in when we know */ reqlen = 14;// SINGLE SGE /* Now fill in the SGList and command */ + if(cmd-use_sg) { + struct scatterlist *sg = (struct scatterlist *)cmd-request_buffer; + int sg_count = pci_map_sg(pHba-pDev, sg, cmd-use_sg, + cmd-sc_data_direction); - nseg = scsi_dma_map(cmd); - BUG_ON(nseg 0); - if (nseg) { - struct scatterlist *sg; len = 0; - scsi_for_each_sg(cmd, sg, nseg, i) { + for(i = 0 ; i sg_count; i++) { *mptr++ = direction|0x1000|sg_dma_len(sg); len+=sg_dma_len(sg); *mptr++ = sg_dma_address(sg); - /* Make this an end of list */ - if (i == nseg - 1) - mptr[-2] = direction|0xD000|sg_dma_len(sg); + sg++; } + /* Make this an end of list */ + mptr[-2] = direction|0xD000|sg_dma_len(sg-1); reqlen = mptr - msg; *lenptr = len; @@ -2148,8 +2147,16 @@ static s32 adpt_scsi_to_i2o(adpt_hba* pH len, cmd-underflow); } } else { - *lenptr = len = 0; - reqlen = 12; + *lenptr = len = cmd-request_bufflen; + if(len == 0) { + reqlen = 12; + } else { + *mptr++ = 0xD000|direction|cmd-request_bufflen; + *mptr++ = pci_map_single(pHba-pDev, + cmd-request_buffer, + cmd-request_bufflen, + 
cmd-sc_data_direction); + } } /*
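
Editorial sketch: both sides of the revert above build the same thing, a list of (flags|length, address) word pairs in which only the final element carries the end-of-list flag. The user-space C below illustrates that pattern in isolation. It is not the driver's code: `struct seg`, `build_sg_list`, `SGE_SIMPLE` and `SGE_LAST` are hypothetical names, and the flag values are placeholders chosen for the example, since the constants in the archived diff are truncated.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one DMA-mapped scatterlist segment. */
struct seg {
    uint32_t dma_addr;
    uint32_t dma_len;
};

/* Placeholder flag values for this sketch only (assumption, not from the driver). */
#define SGE_SIMPLE 0x10000000u /* ordinary element */
#define SGE_LAST   0xD0000000u /* final element, marks end of list */

/*
 * Write two 32-bit words per segment into `out` (flags|length, then address),
 * tagging only the last segment as end-of-list. `out` must have room for
 * 2 * nseg words. Returns the number of words written and stores the summed
 * segment length in *total_len.
 */
static size_t build_sg_list(uint32_t *out, const struct seg *segs, size_t nseg,
                            uint32_t direction, uint32_t *total_len)
{
    size_t n = 0;

    *total_len = 0;
    for (size_t i = 0; i < nseg; i++) {
        uint32_t flags = (i == nseg - 1) ? SGE_LAST : SGE_SIMPLE;

        out[n++] = direction | flags | segs[i].dma_len;
        out[n++] = segs[i].dma_addr;
        *total_len += segs[i].dma_len;
    }
    return n;
}
```

Choosing the flag inside the loop corresponds to the accessor-based version; the reverted code instead patches the previous flags word after the loop via `mptr[-2]`. Both should produce the same list when at least one segment is mapped.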
Re: broken dpt_i2o in 2.6.23 (was: ext2_check_page: bad entry in directory)
On Wed, 05 Dec 2007 10:04:03 +0900 FUJITA Tomonori [EMAIL PROTECTED] wrote:

> On Tue, 4 Dec 2007 16:57:38 -0800 Andrew Morton [EMAIL PROTECTED] wrote:
> > On Thu, 29 Nov 2007 13:31:50 +0100 Anders Henke [EMAIL PROTECTED] wrote:
> > > On November 28 2007, Anders Henke wrote:
> > > > As everything being reported as zero is quite odd and Jan took a guess that it might be block-layer or driver-related, I've assumed that the driver is responsible for this; just out of curiosity, I've manually replaced the dpt_i2o driver by the 2.6.19 one by copying drivers/scsi/dpt_i2o.c, drivers/scsi/dpti.h and drivers/scsi/dpt/ into a vanilla 2.6.23.1 kernel; using this kernel fixed the issue for me. I haven't yet fine-tested from which kernel release on the dpt_i2o driver behaves like this and spews out zeroed blocks when trying to mount the rootfs. Maybe this is just some timing issue.
> > >
> > > I've started the fine-tests and can say so far that dpt_i2o from 2.6.22 is still fine. Test is simple:
> > >
> > > [EMAIL PROTECTED]:/usr/src/linux-2.6.22/drivers/scsi/dpt$ cp -r dpt/ dpt_i2o.c dpti.h /usr/src/linux-2.6.23.1/drivers/scsi/
> > >
> > > ... recompile the kernel, reboot: works.
> > >
> > > 2.6.22 and 2.6.23 differ in terms of the dpt_i2o driver by three different patch sets:
> > > - one 2 Kb small set of patches from 2.6.22 to 2.6.23-rc1
> > > - one 7 Kb set of patches from 2.6.23-rc2 to 2.6.23-rc3
> > > - one 162 Kb set of patches from 2.6.23-rc9 to 2.6.23-rc10.
> > >
> > > When applying the 2.6.23-rc1-based driver to my 2.6.23.1 kernel, the zero-blocks symptom shows up, so it's the lucky situation that the smallest patch actually seems to be the broken one.
> > >
> > > According to the 2.6.23-rc1 short-form changelog, there is one major edit on the dpt_i2o driver:
> > >
> > > FUJITA Tomonori
> > >   [SCSI] dpt_i2o: convert to use the data buffer accessors
> > > Stephen Rothwell
> > >   dpt_i2o depends on virt_to_bus
> > >
> > > Fujita, would you please take a look at this?
> >
> > He won't have seen this. cc's added.
I think that something's broken in there, leading to the dpt_i2o sending out blocks of zeroes right after initialization, at least on some specific controllers (in this case, Adaptec 2010S on Intel SE7501WV2S-based boxes). I don't have insight kernel driver development knowledge, so I'm quite out of help right now. Nevertheless, I'll add the diff from 2.6.22 to 2.6.23-rc1 in terms of dpt_i2o: Can you please confirm that this revert (against 2.6.24-rc4) fixes the data corruption problems? Anders said that my patch is fine and seems that Matthew's hotplug conversion patch leads to the problem: http://marc.info/?l=linux-kernelm=119641892129732w=2 Oh. Jan broke message threading :( So it's been nearly a week and nothing has happened? Do we revert that change? - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: broken dpt_i2o in 2.6.23 (was: ext2_check_page: bad entry in directory)
On Wed, 05 Dec 2007 10:30:54 +0900 FUJITA Tomonori [EMAIL PROTECTED] wrote:

> On Tue, 4 Dec 2007 17:11:55 -0800 Andrew Morton [EMAIL PROTECTED] wrote:
> > On Wed, 05 Dec 2007 10:04:03 +0900 FUJITA Tomonori [EMAIL PROTECTED] wrote:
> > > On Tue, 4 Dec 2007 16:57:38 -0800 Andrew Morton [EMAIL PROTECTED] wrote:
> > > > On Thu, 29 Nov 2007 13:31:50 +0100 Anders Henke [EMAIL PROTECTED] wrote:
> > > > > On November 28 2007, Anders Henke wrote:
> > > > > > As everything being reported as zero is quite odd and Jan took a guess that it might be block-layer or driver-related, I've assumed that the driver is responsible for this; just out of curiosity, I've manually replaced the dpt_i2o driver by the 2.6.19 one by copying drivers/scsi/dpt_i2o.c, drivers/scsi/dpti.h and drivers/scsi/dpt/ into a vanilla 2.6.23.1 kernel; using this kernel fixed the issue for me. I haven't yet fine-tested from which kernel release on the dpt_i2o driver behaves like this and spews out zeroed blocks when trying to mount the rootfs. Maybe this is just some timing issue.
> > > > >
> > > > > I've started the fine-tests and can say so far that dpt_i2o from 2.6.22 is still fine. Test is simple:
> > > > >
> > > > > [EMAIL PROTECTED]:/usr/src/linux-2.6.22/drivers/scsi/dpt$ cp -r dpt/ dpt_i2o.c dpti.h /usr/src/linux-2.6.23.1/drivers/scsi/
> > > > >
> > > > > ... recompile the kernel, reboot: works.
> > > > >
> > > > > 2.6.22 and 2.6.23 differ in terms of the dpt_i2o driver by three different patch sets:
> > > > > - one 2 Kb small set of patches from 2.6.22 to 2.6.23-rc1
> > > > > - one 7 Kb set of patches from 2.6.23-rc2 to 2.6.23-rc3
> > > > > - one 162 Kb set of patches from 2.6.23-rc9 to 2.6.23-rc10.
> > > > >
> > > > > When applying the 2.6.23-rc1-based driver to my 2.6.23.1 kernel, the zero-blocks symptom shows up, so it's the lucky situation that the smallest patch actually seems to be the broken one.
> > > > >
> > > > > According to the 2.6.23-rc1 short-form changelog, there is one major edit on the dpt_i2o driver:
> > > > >
> > > > > FUJITA Tomonori
> > > > >   [SCSI] dpt_i2o: convert to use the data buffer accessors
> > > > > Stephen Rothwell
> > > > >   dpt_i2o depends on virt_to_bus
> > > > >
> > > > > Fujita, would you please take a look at this?
He won't have seen this. cc's added. I think that something's broken in there, leading to the dpt_i2o sending out blocks of zeroes right after initialization, at least on some specific controllers (in this case, Adaptec 2010S on Intel SE7501WV2S-based boxes). I don't have insight kernel driver development knowledge, so I'm quite out of help right now. Nevertheless, I'll add the diff from 2.6.22 to 2.6.23-rc1 in terms of dpt_i2o: Can you please confirm that this revert (against 2.6.24-rc4) fixes the data corruption problems? Anders said that my patch is fine and seems that Matthew's hotplug conversion patch leads to the problem: http://marc.info/?l=linux-kernelm=119641892129732w=2 Oh. Jan broke message threading :( So it's been nearly a week and nothing has happened? Do we revert that change? SCSI people really want this conversion... Matthew, did you have a chance to look at it? It seems pretty improbably that a change of that nature could cause data corruption. Anders, are you able to determine whether the revert (against current Linus mainline or 2.6.24-rc4) fixes things? Because it would be very strange... This is a grave bug. It's really quite urgent... Thanks. 
drivers/scsi/dpt_i2o.c | 132 ++- drivers/scsi/dpti.h|9 ++ 2 files changed, 68 insertions(+), 73 deletions(-) diff -puN drivers/scsi/dpt_i2o.c~revert-dpt_i2o-convert-to-scsi-hotplug-model drivers/scsi/dpt_i2o.c --- a/drivers/scsi/dpt_i2o.c~revert-dpt_i2o-convert-to-scsi-hotplug-model +++ a/drivers/scsi/dpt_i2o.c @@ -173,20 +173,20 @@ static struct pci_device_id dptids[] = { }; MODULE_DEVICE_TABLE(pci,dptids); -static void adpt_exit(void); - -static int adpt_detect(void) +static int adpt_detect(struct scsi_host_template* sht) { struct pci_dev *pDev = NULL; adpt_hba* pHba; + adpt_init(); + PINFO(Detecting Adaptec I2O RAID controllers...\n); /* search for all Adatpec I2O RAID cards */ while ((pDev = pci_get_device( PCI_DPT_VENDOR_ID, PCI_ANY_ID, pDev))) { if(pDev-device == PCI_DPT_DEVICE_ID || pDev-device == PCI_DPT_RAPTOR_DEVICE_ID){ - if(adpt_install_hba(pDev) ){ + if(adpt_install_hba(sht, pDev) ){ PERROR(Could not Init an I2O RAID device\n); PERROR(Will not try to detect others.\n); return hba_count-1; @@ -248,33 +248,34 @@ rebuild_sys_tab: } for (pHba = hba_chain; pHba
Re: [BUG] 2.6.24-rc3-git2 softlockup detected
On Fri, 30 Nov 2007 12:58:06 +0530 Kamalesh Babulal [EMAIL PROTECTED] wrote:

> Andrew Morton wrote:
> > On Thu, 29 Nov 2007 23:00:47 -0800 Andrew Morton [EMAIL PROTECTED] wrote:
> > > On Fri, 30 Nov 2007 01:39:29 -0500 Kyle McMartin [EMAIL PROTECTED] wrote:
> > > > On Thu, Nov 29, 2007 at 12:35:33AM -0800, Andrew Morton wrote:
> > > > > ten million is close enough to infinity for me to assume that we broke the driver and that's never going to terminate.
> > > >
> > > > how about this? doesn't break things on my pa8800:
> > > >
> > > > diff --git a/drivers/scsi/sym53c8xx_2/sym_hipd.c b/drivers/scsi/sym53c8xx_2/sym_hipd.c
> > > > index 463f119..ef01cb1 100644
> > > > --- a/drivers/scsi/sym53c8xx_2/sym_hipd.c
> > > > +++ b/drivers/scsi/sym53c8xx_2/sym_hipd.c
> > > > @@ -1037,10 +1037,13 @@ restart_test:
> > > >  	/*
> > > >  	 *  Wait 'til done (with timeout)
> > > >  	 */
> > > > -	for (i=0; i<SYM_SNOOP_TIMEOUT; i++)
> > > > +	do {
> > > >  		if (INB(np, nc_istat) & (INTF|SIP|DIP))
> > > >  			break;
> > > > -	if (i>=SYM_SNOOP_TIMEOUT) {
> > > > +		msleep(10);
> > > > +	} while (i++ < SYM_SNOOP_TIMEOUT);
> > > > +
> > > > +	if (i >= SYM_SNOOP_TIMEOUT) {
> > > >  		printf ("CACHE TEST FAILED: timeout.\n");
> > > >  		return (0x20);
> > > >  	}
> > > > diff --git a/drivers/scsi/sym53c8xx_2/sym_hipd.h b/drivers/scsi/sym53c8xx_2/sym_hipd.h
> > > > index ad07880..85c483b 100644
> > > > --- a/drivers/scsi/sym53c8xx_2/sym_hipd.h
> > > > +++ b/drivers/scsi/sym53c8xx_2/sym_hipd.h
> > > > @@ -339,7 +339,7 @@
> > > >  /*
> > > >   *  Misc.
> > > >   */
> > > > -#define SYM_SNOOP_TIMEOUT (10000000)
> > > > +#define SYM_SNOOP_TIMEOUT (1000)
> > > >  #define BUS_8_BIT	0
> > > >  #define BUS_16_BIT	1
> > >
> > > That might be the fix, but do we know what we're actually fixing? afaik 2.6.24-rc3 doesn't get this timeout, 2.6.24-rc3-mm2 does get it and we don't know why?
> >
> > <looks at Subject:>
> >
> > <Checks that Rafael was cc'ed>
> >
> > So 2.6.24-rc3 was OK and 2.6.24-rc3-git2 is not?
>
> Yes, the 2.6.24-rc3 was OK and this is seen from 2.6.24-rc3-git2/3/4.

There are effectively no drivers/scsi/ changes after 2.6.24-rc3 and we don't (I believe) have a clue what caused this regression.

Can you please do a bisection search on this?

Thanks.
Re: [BUG] 2.6.23-rc3 can't see sd partitions on Alpha
On Sat, 01 Dec 2007 11:30:01 +1300 Michael Cree [EMAIL PROTECTED] wrote: Bob Tracy wrote: Andrew Morton wrote: Could be something change in sysfs. Please double-check the config options, make sure that something important didn't get disabled. Here's hoping someone else is seeing this or can replicate it in the meantime. Snap. 2.6.24-rc2 works fine. 2.6.24-rc3 boots on Alpha but once /dev is populated no partitions of the scsi sub-system are seen. Looks like ide sub-system similarly affected. Rafael, I assume you have this regression in the list? Managed to get boot log. Follows below (with output of various /proc info). Cheerz Michael. Linux version 2.6.24-rc3 ([EMAIL PROTECTED]) (gcc version 4.1.3 20071019 (prerelease) (Debian 4.1.2-17)) #1 Mon Nov 26 19:28:58 NZDT 2007 Booting on Tsunami variation Monet using machine vector Monet from SRM Major Options: EV67 LEGACY_START VERBOSE_MCHECK Command line: ro root=/dev/sda3 console=ttyS0 memcluster 0, usage 1, start0, end 215 memcluster 1, usage 0, start 215, end 131062 memcluster 2, usage 1, start 131062, end 131072 freeing pages 215:384 freeing pages 930:131062 reserving pages 930:932 4096K Bcache detected; load hit latency 21 cycles, load miss latency 127 cycles Console graphics on hose 0 Built 1 zonelists in Zone order, mobility grouping on. Total pages: 130167 Kernel command line: ro root=/dev/sda3 console=ttyS0 PID hash table entries: 4096 (order: 12, 32768 bytes) Using epoch = 2000 Turning on RTC interrupts. Console: colour VGA+ 80x25 console [ttyS0] enabled Dentry cache hash table entries: 131072 (order: 7, 1048576 bytes) Inode-cache hash table entries: 65536 (order: 6, 524288 bytes) Memory: 1030896k/1048496k available (2786k kernel code, 15216k reserved, 370k data, 168k init) Mount-cache hash table entries: 512 net_namespace: 120 bytes NET: Registered protocol family 16 PCI: Bridge: 0001:01:08.0 IO window: 8000-8fff MEM window: 0900-090f PREFETCH window: disabled. 
SMC37c669 Super I/O Controller found @ 0x3f0 Linux Plug and Play Support v0.97 (c) Adam Belay SCSI subsystem initialized NET: Registered protocol family 2 IP route cache hash table entries: 8192 (order: 3, 65536 bytes) TCP established hash table entries: 32768 (order: 6, 524288 bytes) TCP bind hash table entries: 32768 (order: 5, 262144 bytes) TCP: Hash tables configured (established 32768 bind 32768) TCP reno registered srm_env: version 0.0.6 loaded successfully io scheduler noop registered io scheduler cfq registered (default) tridentfb: Trident framebuffer 0.7.8-NEWAPI initializing isapnp: Scanning for PnP cards... isapnp: No Plug Play device found rtc: SRM (post-2000) epoch (2000) detected Real Time Clock Driver v1.12ac Serial: 8250/16550 driver $Revision: 1.90 $ 4 ports, IRQ sharing enabled serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A serial8250: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A Floppy drive(s): fd0 is 2.88M FDC 0 is a post-1991 82077 Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2 ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx CY82C693: IDE controller (0x1080:0xc693 rev 0x00) at PCI slot :00:07.1 CY82C693: not 100% native mode: will probe irqs later CY82C693U driver v0.34 99-13-12 Andreas S. 
Krebs ([EMAIL PROTECTED]) ide0: BM-DMA at 0x8400-0x8407, BIOS settings: hda:pio, hdb:pio CY82C693: port 0x01f0 already claimed by ide0 ALI15X3: IDE controller (0x10b9:0x5228 rev 0xc6) at PCI slot 0001:02:09.1 ALI15X3: 100% native mode on irq 28 ide1: BM-DMA at 0x28410-0x28417, BIOS settings: hdc:DMA, hdd:DMA ide2: BM-DMA at 0x28418-0x2841f, BIOS settings: hde:pio, hdf:pio hdf: LITE-ON DVDRW SOHW-1653S, ATAPI CD/DVD-ROM drive hde: ST3200822A, ATA DISK drive ide2 at 0x28438-0x2843f,0x2844e on irq 28 hde: max request size: 512KiB hde: 390721968 sectors (200049 MB) w/8192KiB Cache, CHS=24321/255/63, UDMA(100) hde: cache flushes supported hde: hde1 qla1280: QLA1040 found on PCI bus 1, dev 6 scsi(0:0): Resetting SCSI BUS scsi0 : QLogic QLA1040 PCI to SCSI Host Adapter Firmware version: 7.65.06, Driver version 3.26 serio: i8042 KBD port at 0x60,0x64 irq 1 serio: i8042 AUX port at 0x60,0x64 irq 12 mice: PS/2 mouse device common for all mice scsi 0:0:1:0: Direct-Access SEAGATE ST336706LW 0109 PQ: 0 ANSI: 3 scsi(0:0:1:0): Sync: period 10, offset 12, Wide input: AT Raw Set 2 keyboard as /devices/platform/i8042/serio0/input/input0 atkbd.c: keyboard reset failed on isa0060/serio1 TCP cubic registered Initializing XFRM netlink socket NET: Registered protocol family 1 NET: Registered protocol family 17 NET: Registered protocol family 15 scsi: waiting for bus probes to complete ... sd 0:0:1:0: [sda] 71687370 512-byte hardware sectors (36704 MB) sd 0:0:1:0: [sda] Write Protect is off sd 0:0:1:0: [sda] Write
[patch] SCSI: early detection of medium not present, updated
Guys, I have this marked as needed-in-2.6.24? From: Alan Stern [EMAIL PROTECTED] Taken from http://bugzilla.kernel.org/show_bug.cgi?id=8904 An updated (by Albert, I assume) version of the fourteen-month-old patch here: http://marc.info/?l=linux-kernelm=115412002912837w=2 Apparently fixes the bug described at http://bugzilla.kernel.org/show_bug.cgi?id=8904 Needs some TLC. Perhaps urgently. Cc: Albert Lee [EMAIL PROTECTED] Cc: Alan Stern [EMAIL PROTECTED] Cc: James Bottomley [EMAIL PROTECTED] Cc: Tejun Heo [EMAIL PROTECTED] Cc: Jens Axboe [EMAIL PROTECTED] Signed-off-by: Andrew Morton [EMAIL PROTECTED] --- drivers/scsi/scsi_ioctl.c |2 +- drivers/scsi/scsi_lib.c| 20 ++-- drivers/scsi/sd.c |2 +- drivers/scsi/sr.c | 15 +-- include/scsi/scsi_device.h |2 +- 5 files changed, 30 insertions(+), 11 deletions(-) diff -puN drivers/scsi/scsi_ioctl.c~scsi-early-detection-of-medium-not-present-updated drivers/scsi/scsi_ioctl.c --- a/drivers/scsi/scsi_ioctl.c~scsi-early-detection-of-medium-not-present-updated +++ a/drivers/scsi/scsi_ioctl.c @@ -244,7 +244,7 @@ int scsi_ioctl(struct scsi_device *sdev, return scsi_set_medium_removal(sdev, SCSI_REMOVAL_ALLOW); case SCSI_IOCTL_TEST_UNIT_READY: return scsi_test_unit_ready(sdev, IOCTL_NORMAL_TIMEOUT, - NORMAL_RETRIES); + NORMAL_RETRIES, NULL); case SCSI_IOCTL_START_UNIT: scsi_cmd[0] = START_STOP; scsi_cmd[1] = 0; diff -puN drivers/scsi/scsi_lib.c~scsi-early-detection-of-medium-not-present-updated drivers/scsi/scsi_lib.c --- a/drivers/scsi/scsi_lib.c~scsi-early-detection-of-medium-not-present-updated +++ a/drivers/scsi/scsi_lib.c @@ -2010,15 +2010,26 @@ scsi_mode_sense(struct scsi_device *sdev } EXPORT_SYMBOL(scsi_mode_sense); +/** + * scsi_test_unit_ready - test if unit is ready + * @sdev: scsi device to change the state of. + * @timeout: command timeout + * @retries: number of retries before failing + * @media_maybe_present: 1 if media maybe present or not. + * 0 if media not present. 
+ * + * Returns zero if unsuccessful or an error if TUR failed. + **/ int -scsi_test_unit_ready(struct scsi_device *sdev, int timeout, int retries) +scsi_test_unit_ready(struct scsi_device *sdev, int timeout, int retries, int *media_maybe_present) { char cmd[] = { TEST_UNIT_READY, 0, 0, 0, 0, 0, }; struct scsi_sense_hdr sshdr; int result; - + int maybe_present = 1; + result = scsi_execute_req(sdev, cmd, DMA_NONE, NULL, 0, sshdr, timeout, retries); @@ -2027,10 +2038,15 @@ scsi_test_unit_ready(struct scsi_device if ((scsi_sense_valid(sshdr)) ((sshdr.sense_key == UNIT_ATTENTION) || (sshdr.sense_key == NOT_READY))) { + if (sshdr.asc == 0x3A) + maybe_present = 0; sdev-changed = 1; result = 0; } } + + if (media_maybe_present) + *media_maybe_present = maybe_present; return result; } EXPORT_SYMBOL(scsi_test_unit_ready); diff -puN drivers/scsi/sd.c~scsi-early-detection-of-medium-not-present-updated drivers/scsi/sd.c --- a/drivers/scsi/sd.c~scsi-early-detection-of-medium-not-present-updated +++ a/drivers/scsi/sd.c @@ -767,7 +767,7 @@ static int sd_media_changed(struct gendi retval = -ENODEV; if (scsi_block_when_processing_errors(sdp)) - retval = scsi_test_unit_ready(sdp, SD_TIMEOUT, SD_MAX_RETRIES); + retval = scsi_test_unit_ready(sdp, SD_TIMEOUT, SD_MAX_RETRIES, NULL); /* * Unable to test, unit probably not ready. This usually diff -puN drivers/scsi/sr.c~scsi-early-detection-of-medium-not-present-updated drivers/scsi/sr.c --- a/drivers/scsi/sr.c~scsi-early-detection-of-medium-not-present-updated +++ a/drivers/scsi/sr.c @@ -179,18 +179,21 @@ static int sr_media_change(struct cdrom_ { struct scsi_cd *cd = cdi-handle; int retval; + int media_maybe_present; if (CDSL_CURRENT != slot) { /* no changer support */ return -EINVAL; } - retval = scsi_test_unit_ready(cd-device, SR_TIMEOUT, MAX_RETRIES); - if (retval) { - /* Unable to test, unit probably not ready. This usually -* means there is no disc in the drive. 
Mark as changed, -* and we will figure it out later once the drive is -* available again. */ + retval = scsi_test_unit_ready(cd-device, SR_TIMEOUT, MAX_RETRIES, + media_maybe_present); + if (retval
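
Editorial sketch: the scsi_lib.c hunk in this patch decides from the sense data whether media might still be present after a failed TEST UNIT READY. The user-space C below mirrors only the logic visible in that hunk. The sense-key values (NOT READY 0x02, UNIT ATTENTION 0x06) and ASC 0x3A ("medium not present") are standard SCSI codes; `struct sense_summary`, `struct tur_outcome` and `classify_tur` are hypothetical names for this example, not kernel API, and any conditions outside the quoted hunk are not modelled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Standard SCSI sense keys and ASC; the enum names are local to this sketch. */
enum { SK_NOT_READY = 0x02, SK_UNIT_ATTENTION = 0x06 };
enum { ASC_MEDIUM_NOT_PRESENT = 0x3A };

/* Hypothetical reduced view of decoded sense data. */
struct sense_summary {
    bool valid;
    uint8_t sense_key;
    uint8_t asc;
};

/* Hypothetical result bundle: what the caller would see after the check. */
struct tur_outcome {
    int result;               /* 0 when the failure is treated as "media changed" */
    bool changed;             /* corresponds to sdev->changed in the patch */
    bool media_maybe_present; /* false only for ASC 0x3A */
};

/* Classify a TEST UNIT READY result the way the quoted hunk does. */
static struct tur_outcome classify_tur(int result, struct sense_summary s)
{
    struct tur_outcome out = { result, false, true };

    if (result != 0 && s.valid &&
        (s.sense_key == SK_UNIT_ATTENTION || s.sense_key == SK_NOT_READY)) {
        if (s.asc == ASC_MEDIUM_NOT_PRESENT)
            out.media_maybe_present = false;
        out.changed = true;
        out.result = 0;
    }
    return out;
}
```

In the patch the flag is reported through an optional `int *media_maybe_present` argument, which is why callers that do not care pass NULL.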
Re: [BUG] 2.6.24-rc3-git2 softlockup detected
On Fri, 30 Nov 2007 01:39:29 -0500 Kyle McMartin [EMAIL PROTECTED] wrote:

> On Thu, Nov 29, 2007 at 12:35:33AM -0800, Andrew Morton wrote:
> > ten million is close enough to infinity for me to assume that we broke the driver and that's never going to terminate.
>
> how about this? doesn't break things on my pa8800:
>
> diff --git a/drivers/scsi/sym53c8xx_2/sym_hipd.c b/drivers/scsi/sym53c8xx_2/sym_hipd.c
> index 463f119..ef01cb1 100644
> --- a/drivers/scsi/sym53c8xx_2/sym_hipd.c
> +++ b/drivers/scsi/sym53c8xx_2/sym_hipd.c
> @@ -1037,10 +1037,13 @@ restart_test:
>  	/*
>  	 *  Wait 'til done (with timeout)
>  	 */
> -	for (i=0; i<SYM_SNOOP_TIMEOUT; i++)
> +	do {
>  		if (INB(np, nc_istat) & (INTF|SIP|DIP))
>  			break;
> -	if (i>=SYM_SNOOP_TIMEOUT) {
> +		msleep(10);
> +	} while (i++ < SYM_SNOOP_TIMEOUT);
> +
> +	if (i >= SYM_SNOOP_TIMEOUT) {
>  		printf ("CACHE TEST FAILED: timeout.\n");
>  		return (0x20);
>  	}
> diff --git a/drivers/scsi/sym53c8xx_2/sym_hipd.h b/drivers/scsi/sym53c8xx_2/sym_hipd.h
> index ad07880..85c483b 100644
> --- a/drivers/scsi/sym53c8xx_2/sym_hipd.h
> +++ b/drivers/scsi/sym53c8xx_2/sym_hipd.h
> @@ -339,7 +339,7 @@
>  /*
>   *  Misc.
>   */
> -#define SYM_SNOOP_TIMEOUT (10000000)
> +#define SYM_SNOOP_TIMEOUT (1000)
>  #define BUS_8_BIT	0
>  #define BUS_16_BIT	1

That might be the fix, but do we know what we're actually fixing? afaik 2.6.24-rc3 doesn't get this timeout, 2.6.24-rc3-mm2 does get it and we don't know why?
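
Editorial sketch: the patch discussed in this message replaces a ten-million-iteration busy-wait with a poll-then-sleep loop that gives up after a bounded number of attempts. The user-space C below shows that general pattern under stated assumptions. `poll_with_timeout`, `ready_after_n` and `no_delay` are hypothetical names; the delay is passed in as a callback so the example has no platform dependency (in the driver the corresponding call is msleep(10)). It is an illustration, not the driver's code.

```c
#include <stdbool.h>

/* Hypothetical callback: returns true once the awaited condition holds. */
typedef bool (*poll_fn)(void *ctx);

/* Hypothetical callback: waits roughly `ms` milliseconds between polls. */
typedef void (*delay_fn)(unsigned int ms);

/*
 * Poll `ready` at most `max_polls` times, calling `delay(sleep_ms)` after
 * each unsuccessful poll. Returns true if the condition was observed,
 * false if every attempt failed (timeout).
 */
static bool poll_with_timeout(poll_fn ready, void *ctx,
                              unsigned int max_polls,
                              delay_fn delay, unsigned int sleep_ms)
{
    for (unsigned int i = 0; i < max_polls; i++) {
        if (ready(ctx))
            return true;
        delay(sleep_ms);
    }
    return false;
}

/* Example condition: becomes true once the counter in ctx has run down. */
static bool ready_after_n(void *ctx)
{
    int *remaining = ctx;

    return (*remaining)-- <= 0;
}

/* Stand-in delay for the example; a real caller would actually sleep. */
static void no_delay(unsigned int ms)
{
    (void)ms;
}
```

Returning a boolean rather than re-checking the loop counter afterwards avoids the off-by-one question of whether the final poll succeeded or the limit was reached.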
Re: 2.6.24-rc3-mm2: Result: hostbyte=0x01 driverbyte=0x00\nend_request: I/O error
On Wed, 28 Nov 2007 23:01:31 +0300 Alexey Dobriyan [EMAIL PROTECTED] wrote:

> Reliably spams dmesg with end_request() horrors. This happens when git starts checking out linux tree to fresh ext2 partition. Disk is several months old and there were no problems with, say, 2.6.24-rc3:
>
> [ 225.378426] sd 2:0:1:0: [sdb] Result: hostbyte=0x01 driverbyte=0x00
> [ 225.378659] end_request: I/O error, dev sdb, sector 141295703
> [ 225.390133] sd 2:0:1:0: [sdb] Result: hostbyte=0x01 driverbyte=0x00
> [ 225.391988] end_request: I/O error, dev sdb, sector 141295703
> [ 225.392463] sd 2:0:1:0: [sdb] Result: hostbyte=0x01 driverbyte=0x00
> [ 225.392625] end_request: I/O error, dev sdb, sector 141295703
> [ 225.392999] sd 2:0:1:0: [sdb] Result: hostbyte=0x01 driverbyte=0x00
> [ 225.393161] end_request: I/O error, dev sdb, sector 141295703
> [ 225.393571] sd 2:0:1:0: [sdb] Result: hostbyte=0x01 driverbyte=0x00
> [ 225.393731] end_request: I/O error, dev sdb, sector 141295703
> [ 225.394382] sd 2:0:1:0: [sdb] Result: hostbyte=0x01 driverbyte=0x00
> [ 225.394544] end_request: I/O error, dev sdb, sector 141295703
> [ 225.395247] sd 2:0:1:0: [sdb] Result: hostbyte=0x01 driverbyte=0x00
> [ 225.395412] end_request: I/O error, dev sdb, sector 141295703
>
> CONFIG_ATA=y
> # CONFIG_ATA_NONSTANDARD is not set
> CONFIG_ATA_ACPI=y
> CONFIG_SATA_AHCI=y
> CONFIG_ATA_PIIX=y
> CONFIG_PATA_JMICRON=y
>
> and
>
> [ 35.229713] sd 2:0:1:0: [sdb] 976773168 512-byte hardware sectors (500108 MB)

So that's an OK sector number.
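
Editorial sketch: the "OK sector number" remark is a bounds check. The failing sector, 141295703, is below the device's 976773168 sectors, so the error is not an access past the end of the disk. The C below spells out that comparison and the size arithmetic behind the "500108 MB" figure in the log. The function names are hypothetical, and rounding to the nearest decimal megabyte is an assumption chosen because it reproduces the logged value.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if `sector` addresses a block inside a device of `nr_sectors` sectors. */
static bool sector_in_range(uint64_t sector, uint64_t nr_sectors)
{
    return sector < nr_sectors;
}

/* Device capacity in decimal megabytes (10^6 bytes), rounded to nearest. */
static uint64_t capacity_mb_rounded(uint64_t nr_sectors, uint32_t sector_size)
{
    uint64_t bytes = nr_sectors * sector_size;

    return (bytes + 500000u) / 1000000u;
}
```

With the numbers from the log, 976773168 sectors of 512 bytes is 500,107,862,016 bytes.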
[0.00] Linux version 2.6.24-rc3-mm2 ([EMAIL PROTECTED]) (gcc version 4.1.2 (Gentoo 4.1.2 p1.0.2)) #3 SMP PREEMPT Wed Nov 28 22:23:45 MSK 2007 [0.00] Command line: root=/dev/sda2 [EMAIL PROTECTED]/eth0,[EMAIL PROTECTED]/00:80:48:45:EC:73 ignore_loglevel [0.00] BIOS-provided physical RAM map: [0.00] BIOS-e820: - 0009fc00 (usable) [0.00] BIOS-e820: 0009fc00 - 000a (reserved) [0.00] BIOS-e820: 000e4000 - 0010 (reserved) [0.00] BIOS-e820: 0010 - 7ff9 (usable) [0.00] BIOS-e820: 7ff9 - 7ff9e000 (ACPI data) [0.00] BIOS-e820: 7ff9e000 - 7ffe (ACPI NVS) [0.00] BIOS-e820: 7ffe - 8000 (reserved) [0.00] BIOS-e820: fee0 - fee01000 (reserved) [0.00] BIOS-e820: ffb0 - 0001 (reserved) [0.00] BIOS-e820: 0001 - 00018000 (usable) [0.00] Entering add_active_range(0, 0, 159) 0 entries of 256 used [0.00] Entering add_active_range(0, 256, 524176) 1 entries of 256 used [0.00] Entering add_active_range(0, 1048576, 1572864) 2 entries of 256 used [0.00] end_pfn_map = 1572864 [0.00] DMI 2.4 present. [0.00] ACPI: RSDP 000FA980, 0024 (r2 ACPIAM) [0.00] ACPI: XSDT 7FF90100, 0054 (r1 KOZIRO FRONTIER 2000707 MSFT 97) [0.00] ACPI: FACP 7FF90290, 00F4 (r3 MSTEST OEMFACP 2000707 MSFT 97) [0.00] ACPI: DSDT 7FF905C0, 8FA9 (r1 A0637 A06370000 INTL 20060113) [0.00] ACPI: FACS 7FF9E000, 0040 [0.00] ACPI: APIC 7FF90390, 006C (r1 MSTEST OEMAPIC 2000707 MSFT 97) [0.00] ACPI: MCFG 7FF90400, 003C (r1 MSTEST OEMMCFG 2000707 MSFT 97) [0.00] ACPI: SLIC 7FF90440, 0176 (r1 KOZIRO FRONTIER 2000707 MSFT 97) [0.00] ACPI: OEMB 7FF9E040, 007B (r1 MSTEST AMI_OEM 2000707 MSFT 97) [0.00] ACPI: HPET 7FF99570, 0038 (r1 MSTEST OEMHPET 2000707 MSFT 97) [0.00] Entering add_active_range(0, 0, 159) 0 entries of 256 used [0.00] Entering add_active_range(0, 256, 524176) 1 entries of 256 used [0.00] Entering add_active_range(0, 1048576, 1572864) 2 entries of 256 used [0.00] [e200-e21f] PMD -81000120 on node 0 [0.00] [e220-e23f] PMD -81000160 on node 0 [0.00] [e240-e25f] PMD -810001A0 on node 0 [0.00] [e260-e27f] PMD -810001E0 on node 
0 [0.00] [e280-e29f] PMD -81000220 on node 0 [0.00] [e2a0-e2bf] PMD -81000260 on node 0 [0.00] [e2c0-e2df] PMD -810002A0 on node 0 [0.00] [e2e0-e2ff] PMD -810002E0 on node 0 [0.00] [e2000100-e200011f] PMD -81000320 on node 0 [0.00] [e2000120-e200013f] PMD -81000360 on node 0 [0.00] [e2000140-e200015f] PMD -810003A0 on node 0 [0.00] [e2000160-e200017f] PMD -810003E0 on node 0 [0.00]
Re: 2.6.24-rc3-mm2: Result: hostbyte=0x01 driverbyte=0x00\nend_request: I/O error
On Wed, 28 Nov 2007 16:14:21 -0700 Matthew Wilcox [EMAIL PROTECTED] wrote: On Wed, Nov 28, 2007 at 01:40:36PM -0800, Andrew Morton wrote: On Wed, 28 Nov 2007 23:01:31 +0300 Alexey Dobriyan [EMAIL PROTECTED] wrote: Reliably spams dmesg with end_request() horrors. This happens when git starts checking out linux tree to fresh ext2 partition. Disk is several month old and there were no prolems with, say, 2.6.24-rc3: Could you try reverting 6f5391c283d7fdcf24bf40786ea79061919d1e1d and see if the problem still exists? That's not completely trivial.. I did a hand-made revert against 2.6.24-rc3-mm2 (below) but some other patch in there causes: drivers/scsi/scsi_lib.c: In function 'scsi_blk_pc_done': drivers/scsi/scsi_lib.c:1251: error: 'struct scsi_cmnd' has no member named 'request_bufflen' --- a/drivers/scsi/scsi.c~revert-6f5391c283d7fdcf24bf40786ea79061919d1e1d +++ a/drivers/scsi/scsi.c @@ -59,7 +59,6 @@ #include scsi/scsi_cmnd.h #include scsi/scsi_dbg.h #include scsi/scsi_device.h -#include scsi/scsi_driver.h #include scsi/scsi_eh.h #include scsi/scsi_host.h #include scsi/scsi_tcq.h @@ -379,8 +378,9 @@ void scsi_log_send(struct scsi_cmnd *cmd scsi_print_command(cmd); if (level 3) { printk(KERN_INFO buffer = 0x%p, bufflen = %d, - queuecommand 0x%p\n, + done = 0x%p, queuecommand 0x%p\n, scsi_sglist(cmd), scsi_bufflen(cmd), + cmd-done, cmd-device-host-hostt-queuecommand); } @@ -667,12 +667,6 @@ void __scsi_done(struct scsi_cmnd *cmd) blk_complete_request(rq); } -/* Move this to a header if it becomes more generally useful */ -static struct scsi_driver *scsi_cmd_to_driver(struct scsi_cmnd *cmd) -{ - return *(struct scsi_driver **)cmd-request-rq_disk-private_data; -} - /** * scsi_finish_command - cleanup and pass command back to upper layer * @cmd: the command @@ -685,8 +679,6 @@ void scsi_finish_command(struct scsi_cmn { struct scsi_device *sdev = cmd-device; struct Scsi_Host *shost = sdev-host; - struct scsi_driver *drv; - unsigned int good_bytes; 
scsi_device_unbusy(sdev); @@ -712,13 +704,7 @@ void scsi_finish_command(struct scsi_cmn Notifying upper driver of completion (result %x)\n, cmd-result)); - good_bytes = scsi_bufflen(cmd); -if (cmd-request-cmd_type != REQ_TYPE_BLOCK_PC) { - drv = scsi_cmd_to_driver(cmd); - if (drv-done) - good_bytes = drv-done(cmd); - } - scsi_io_completion(cmd, good_bytes); + cmd-done(cmd); } EXPORT_SYMBOL(scsi_finish_command); diff -puN drivers/scsi/scsi_error.c~revert-6f5391c283d7fdcf24bf40786ea79061919d1e1d drivers/scsi/scsi_error.c --- a/drivers/scsi/scsi_error.c~revert-6f5391c283d7fdcf24bf40786ea79061919d1e1d +++ a/drivers/scsi/scsi_error.c @@ -1697,6 +1697,7 @@ scsi_reset_provider(struct scsi_device * scmd-scsi_done = scsi_reset_provider_done_command; memset(scmd-sdb, 0, sizeof(scmd-sdb)); + scmd-done = NULL; scmd-cmd_len = 0; diff -puN drivers/scsi/scsi_lib.c~revert-6f5391c283d7fdcf24bf40786ea79061919d1e1d drivers/scsi/scsi_lib.c --- a/drivers/scsi/scsi_lib.c~revert-6f5391c283d7fdcf24bf40786ea79061919d1e1d +++ a/drivers/scsi/scsi_lib.c @@ -944,6 +944,7 @@ void scsi_end_bidi_request(struct scsi_c scsi_finalize_request(cmd, 1); } +EXPORT_SYMBOL(scsi_io_completion); /* * Function:scsi_io_completion() @@ -1238,6 +1239,18 @@ static struct scsi_cmnd *scsi_get_cmd_fr return cmd; } +static void scsi_blk_pc_done(struct scsi_cmnd *cmd) +{ + BUG_ON(!blk_pc_request(cmd-request)); + /* +* This will complete the whole command with uptodate=1 so +* as far as the block layer is concerned the command completed +* successfully. 
Since this is a REQ_BLOCK_PC command the +* caller should check the request's errors value +*/ + scsi_io_completion(cmd, cmd-request_bufflen); +} + int scsi_setup_blk_pc_cmnd(struct scsi_device *sdev, struct request *req) { struct scsi_cmnd *cmd; @@ -1285,6 +1298,7 @@ int scsi_setup_blk_pc_cmnd(struct scsi_d cmd-transfersize = req-data_len; cmd-allowed = req-retries; cmd-timeout_per_command = req-timeout; + cmd-done = scsi_blk_pc_done; return BLKPREP_OK; } EXPORT_SYMBOL(scsi_setup_blk_pc_cmnd); diff -puN drivers/scsi/scsi_priv.h~revert-6f5391c283d7fdcf24bf40786ea79061919d1e1d drivers/scsi/scsi_priv.h --- a/drivers/scsi/scsi_priv.h~revert-6f5391c283d7fdcf24bf40786ea79061919d1e1d +++ a/drivers/scsi
Re: [PATCH 2/2] ide-scsi: use print_hex_dump from linux/kernel.h
On Mon, 26 Nov 2007 15:16:13 +0800 Denis Cheng [EMAIL PROTECTED] wrote: these utilities implemented in lib/hexdump.c are more handy, please use this. ... --- a/drivers/scsi/ide-scsi.c +++ b/drivers/scsi/ide-scsi.c @@ -242,16 +242,6 @@ static void idescsi_output_buffers (ide_drive_t *drive, idescsi_pc_t *pc, unsign } } -static void hexdump(u8 *x, int len) -{ - int i; - - printk([ ); - for (i = 0; i len; i++) - printk(%x , x[i]); - printk(]\n); -} - static int idescsi_check_condition(ide_drive_t *drive, struct request *failed_command) { idescsi_scsi_t *scsi = drive_to_idescsi(drive); @@ -282,7 +272,7 @@ static int idescsi_check_condition(ide_drive_t *drive, struct request *failed_co pc-scsi_cmd = ((idescsi_pc_t *) failed_command-special)-scsi_cmd; if (test_bit(IDESCSI_LOG_CMD, scsi-log)) { printk (ide-scsi: %s: queue cmd = , drive-name); - hexdump(pc-c, 6); + print_hex_dump(KERN_DEBUG, , DUMP_PREFIX_OFFSET, 16, 1, pc-c, 6, 1); } rq-rq_disk = scsi-disk; return ide_do_drive_cmd(drive, rq, ide_preempt); @@ -337,7 +327,7 @@ static int idescsi_end_request (ide_drive_t *drive, int uptodate, int nrsecs) idescsi_pc_t *opc = (idescsi_pc_t *) rq-buffer; if (log) { printk (ide-scsi: %s: wrap up check %lu, rst = , drive-name, opc-scsi_cmd-serial_number); - hexdump(pc-buffer,16); + print_hex_dump(KERN_DEBUG, , DUMP_PREFIX_OFFSET, 16, 1, pc-buffer, 16, 1); } memcpy((void *) opc-scsi_cmd-sense_buffer, pc-buffer, SCSI_SENSE_BUFFERSIZE); kfree(pc-buffer); @@ -816,10 +806,10 @@ static int idescsi_queue (struct scsi_cmnd *cmd, if (test_bit(IDESCSI_LOG_CMD, scsi-log)) { printk (ide-scsi: %s: que %lu, cmd = , drive-name, cmd-serial_number); - hexdump(cmd-cmnd, cmd-cmd_len); + print_hex_dump(KERN_DEBUG, , DUMP_PREFIX_OFFSET, 16, 1, cmd-cmnd, cmd-cmd_len, 1); if (memcmp(pc-c, cmd-cmnd, cmd-cmd_len)) { printk (ide-scsi: %s: que %lu, tsl = , drive-name, cmd-serial_number); - hexdump(pc-c, 12); + print_hex_dump(KERN_DEBUG, , DUMP_PREFIX_OFFSET, 16, 1, pc-c, 12, 1); } } Would you believe that 
this patch (which removes code) actually increases drivers/scsi/ide-scsi.o .text by 75 bytes? I didn't look to see why - probably that huge arg count is hurting, possibly some additional strings being emitted? Either way, perhaps a simple little front-end to print_hex_dump() is called for.
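One way to read the "simple little front-end" suggestion is to keep the many-argument dumper in one place and give call sites a short wrapper, so that each call no longer sets up eight arguments. The following is a userspace sketch of that shape only. `hex_dump_to_buf()` and `cmd_hexdump()` are hypothetical names, and `hex_dump_to_buf()` is a stand-in written for this example, not the kernel's print_hex_dump().

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Stand-in for a many-argument hex dumper (hypothetical; this is NOT the
 * kernel's print_hex_dump()). Writes rows of "prefix xx xx ..." into out
 * and returns the number of characters written, or -1 if out is too small.
 */
static int hex_dump_to_buf(char *out, size_t outsz, const char *prefix,
                           size_t rowsize, const unsigned char *buf,
                           size_t len)
{
    size_t used = 0;
    size_t i;

    assert(out != NULL && prefix != NULL);
    if (outsz == 0 || rowsize == 0)
        return -1;
    out[0] = '\0';

    for (i = 0; i < len; i++) {
        int n;

        if (i % rowsize == 0)   /* row start: newline (except first row) + prefix */
            n = snprintf(out + used, outsz - used, "%s%s%02x",
                         i ? "\n" : "", prefix, buf[i]);
        else
            n = snprintf(out + used, outsz - used, " %02x", buf[i]);
        if (n < 0 || (size_t)n >= outsz - used)
            return -1;          /* output would have been truncated */
        used += (size_t)n;
    }
    return (int)used;
}

/* Thin front-end: call sites pass only what actually varies. */
static int cmd_hexdump(char *out, size_t outsz,
                       const unsigned char *buf, size_t len)
{
    return hex_dump_to_buf(out, outsz, "cmd: ", 16, buf, len);
}
```

With a wrapper like this, each of the four call sites in the patch would shrink to a two-argument call, which is presumably where the text-size saving would come from. Whether it actually recovers the 75 bytes is not something this sketch can show.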
Re: [Bugme-new] [Bug 9462] New: Adaptec AHA-7850 with MOD drive: Hard lockup with new Adaptec driver
(switched to email - please respond via emailed reply-to-all, not via the bugzilla web interface)

On Tue, 27 Nov 2007 07:11:02 -0800 (PST) [EMAIL PROTECTED] wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=9462
>
> Summary: Adaptec AHA-7850 with MOD drive: Hard lockup with new Adaptec driver
> Product: IO/Storage
> Version: 2.5
> KernelVersion: 2.6.22.14
> Platform: All
> OS/Version: Linux
> Tree: Mainline
> Status: NEW
> Severity: high
> Priority: P1
> Component: SCSI
> AssignedTo: [EMAIL PROTECTED]
> ReportedBy: [EMAIL PROTECTED]
>
> Most recent kernel where this bug did not occur: Works with the old Adaptec driver. Had lockups with 2.6.18.8 as well, with the new driver.
> Distribution: Debian etch
> Hardware Environment: Epox 8kta+, Athlon 800 MHz, 768MB RAM
>
> Output from lspci:
> 00:00.0 Host bridge: VIA Technologies, Inc. VT8363/8365 [KT133/KM133] (rev 02)
> 00:01.0 PCI bridge: VIA Technologies, Inc. VT8363/8365 [KT133/KM133 AGP]
> 00:07.0 ISA bridge: VIA Technologies, Inc. VT82C686 [Apollo Super South] (rev 22)
> 00:07.1 IDE interface: VIA Technologies, Inc. VT82C586A/B/VT82C686/A/B/VT823x/A/C PIPC Bus Master IDE (rev 10)
> 00:07.2 USB Controller: VIA Technologies, Inc. VT82x UHCI USB 1.1 Controller (rev 10)
> 00:07.3 USB Controller: VIA Technologies, Inc. VT82x UHCI USB 1.1 Controller (rev 10)
> 00:07.4 Bridge: VIA Technologies, Inc. VT82C686 [Apollo Super ACPI] (rev 30)
> 00:09.0 Ethernet controller: D-Link System Inc DL2000-based Gigabit Ethernet (rev 0c)
> 00:0a.0 USB Controller: NEC Corporation USB (rev 43)
> 00:0a.1 USB Controller: NEC Corporation USB (rev 43)
> 00:0a.2 USB Controller: NEC Corporation USB 2.0 (rev 04)
> 00:0b.0 Mass storage controller: Promise Technology, Inc. PDC20518/PDC40518 (SATAII 150 TX4) (rev 02)
> 00:0c.0 SCSI storage controller: Adaptec AHA-7850 (rev 03)
> 00:0d.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8139/8139C/8139C+ (rev 10)
> 01:00.0 VGA compatible controller: nVidia Corporation NV11 [GeForce2 MX/MX 400] (rev a1)
>
> Software Environment: Tried cp, dd_rescue and cat for copy. No difference.
>
> Problem Description: When writing to the filesystem on the MOD, the computer locks completely as far as I can tell: no responses to ping, no log entries (this is a headless server), MOD not writing, and only a hard reset helps. The target filesystem had (after recovery with e2fsck) 22MB on it in one try and 350MB in another one. I guess there is some race condition or other randomized process at work. MOD size is 600MB.
>
> Steps to reproduce: Insert MOD, mount it, write to it, see lockup happen. Observed with the new driver in 2.6.18.8 and 2.6.22.14. Also happens with TCQ disabled.
>
> I should add that MODs are slow. The writing process frequently blocks for 10-20 seconds for disk flushes. This is normal and expected. Maybe a warning should be added to the new driver and the old one should be kept for the time being. If there are some tests I can run for you, please let me know.

I guess this doesn't really count as a regression, as the new driver has never worked.

I assume that the old driver continues to work OK and that there is no plan to remove it?
Re: [BUG] 2.6.24-rc3-git2 softlockup detected
On Wed, 28 Nov 2007 11:59:00 +0530 Kamalesh Babulal [EMAIL PROTECTED] wrote: Hi, (cc linux-scsi, for sym53c8xx) Soft lockup is detected while bootup with 2.6.24-rc3-git2 on powerbox I assume this is a post-2.6.23 regression? BUG: soft lockup - CPU#1 stuck for 11s! [insmod:375] NIP: c002f02c LR: d01414fc CTR: c002f018 REGS: c0077cbef0b0 TRAP: 0901 Not tainted (2.6.24-rc3-git2-autotest) MSR: 80009032 EE,ME,IR,DR CR: 24022088 XER: TASK = c0077cbd8000[375] 'insmod' THREAD: c0077cbec000 CPU: 1 GPR00: d01414fc c0077cbef330 c052b930 d80080002014 GPR04: d8008000202c c0077ca1cb00 d014ce54 GPR08: c0077ca1c63c 002a c002f018 GPR12: d0143610 c0473d00 NIP [c002f02c] .ioread8+0x14/0x60 LR [d01414fc] .sym_hcb_attach+0x1188/0x1378 [sym53c8xx] Call Trace: [c0077cbef330] [c0077cbef3c0] 0xc0077cbef3c0 (unreliable) [c0077cbef3a0] [d01414fc] .sym_hcb_attach+0x1188/0x1378 [sym53c8xx] [c0077cbef470] [d01395f8] .sym2_probe+0x700/0x99c [sym53c8xx] [c0077cbef710] [c01bc118] .pci_device_probe+0x124/0x1b0 [c0077cbef7b0] [c0221138] .driver_probe_device+0x144/0x20c [c0077cbef850] [c0221450] .__driver_attach+0xcc/0x154 [c0077cbef8e0] [c021ff94] .bus_for_each_dev+0x7c/0xd4 [c0077cbef9a0] [c0220e9c] .driver_attach+0x28/0x40 [c0077cbefa20] [c02204d8] .bus_add_driver+0x90/0x228 [c0077cbefac0] [c0221858] .driver_register+0x94/0xb0 [c0077cbefb40] [c01bc430] .__pci_register_driver+0x6c/0xcc [c0077cbefbe0] [d0143428] .sym2_init+0x108/0x15b0 [sym53c8xx] [c0077cbefc80] [c008ce80] .sys_init_module+0x17c4/0x1958 [c0077cbefe30] [c000872c] syscall_exit+0x0/0x40 Instruction dump: 6000 786b0420 38210070 7d635b78 e8010010 7c0803a6 4e800020 7c0802a6 f8010010 f821ff91 7c0004ac 8923 0c09 4c00012c 79290620 2f8900ff I see no obvious lockup sites near the end of sym_hcb_attach(). Maybe it's being called lots of times from a higher level.. Do the traces all look the same? 
Re: [BUG] 2.6.24-rc3-git2 softlockup detected
On Wed, 28 Nov 2007 12:47:19 +0530 Kamalesh Babulal [EMAIL PROTECTED] wrote: Andrew Morton wrote: On Wed, 28 Nov 2007 11:59:00 +0530 Kamalesh Babulal [EMAIL PROTECTED] wrote: Hi, (cc linux-scsi, for sym53c8xx) Soft lockup is detected while bootup with 2.6.24-rc3-git2 on powerbox I assume this is a post-2.6.23 regression? BUG: soft lockup - CPU#1 stuck for 11s! [insmod:375] NIP: c002f02c LR: d01414fc CTR: c002f018 REGS: c0077cbef0b0 TRAP: 0901 Not tainted (2.6.24-rc3-git2-autotest) MSR: 80009032 EE,ME,IR,DR CR: 24022088 XER: TASK = c0077cbd8000[375] 'insmod' THREAD: c0077cbec000 CPU: 1 GPR00: d01414fc c0077cbef330 c052b930 d80080002014 GPR04: d8008000202c c0077ca1cb00 d014ce54 GPR08: c0077ca1c63c 002a c002f018 GPR12: d0143610 c0473d00 NIP [c002f02c] .ioread8+0x14/0x60 LR [d01414fc] .sym_hcb_attach+0x1188/0x1378 [sym53c8xx] Call Trace: [c0077cbef330] [c0077cbef3c0] 0xc0077cbef3c0 (unreliable) [c0077cbef3a0] [d01414fc] .sym_hcb_attach+0x1188/0x1378 [sym53c8xx] [c0077cbef470] [d01395f8] .sym2_probe+0x700/0x99c [sym53c8xx] [c0077cbef710] [c01bc118] .pci_device_probe+0x124/0x1b0 [c0077cbef7b0] [c0221138] .driver_probe_device+0x144/0x20c [c0077cbef850] [c0221450] .__driver_attach+0xcc/0x154 [c0077cbef8e0] [c021ff94] .bus_for_each_dev+0x7c/0xd4 [c0077cbef9a0] [c0220e9c] .driver_attach+0x28/0x40 [c0077cbefa20] [c02204d8] .bus_add_driver+0x90/0x228 [c0077cbefac0] [c0221858] .driver_register+0x94/0xb0 [c0077cbefb40] [c01bc430] .__pci_register_driver+0x6c/0xcc [c0077cbefbe0] [d0143428] .sym2_init+0x108/0x15b0 [sym53c8xx] [c0077cbefc80] [c008ce80] .sys_init_module+0x17c4/0x1958 [c0077cbefe30] [c000872c] syscall_exit+0x0/0x40 Instruction dump: 6000 786b0420 38210070 7d635b78 e8010010 7c0803a6 4e800020 7c0802a6 f8010010 f821ff91 7c0004ac 8923 0c09 4c00012c 79290620 2f8900ff I see no obvious lockup sites near the end of sym_hcb_attach(). Maybe it's being called lots of times from a higher level.. Do the traces all look the same? 
> Hi Andrew,
>
> I see this call trace twice and both look similar, and on another reboot the following trace is seen twice on different CPUs:
>
> BUG: soft lockup detected on CPU#3!
> Call Trace:
> [C0003FEDEDA0] [C0010220] .show_stack+0x68/0x1b0 (unreliable)
> [C0003FEDEE40] [C00A061C] .softlockup_tick+0xf0/0x13c
> [C0003FEDEEF0] [C0072E2C] .run_local_timers+0x1c/0x30
> [C0003FEDEF70] [C0022FA0] .timer_interrupt+0xa8/0x488
> [C0003FEDF050] [C00034EC] decrementer_common+0xec/0x100
> --- Exception: 901 at .ioread8+0x14/0x60 LR = .sym_hcb_attach+0x1194/0x1384 [sym53c8xx]
> [C0003FEDF340] [D02B3BC0] 0xd02b3bc0 (unreliable)
> [C0003FEDF3B0] [D029A3C0] .sym_hcb_attach+0x1194/0x1384 [sym53c8xx]
> [C0003FEDF480] [D0291D30] .sym2_probe+0x75c/0x9f8 [sym53c8xx]
> [C0003FEDF710] [C01B65A4] .pci_device_probe+0x13c/0x1dc
> [C0003FEDF7D0] [C0219A0C] .driver_probe_device+0xa0/0x15c
> [C0003FEDF870] [C0219C64] .__driver_attach+0xb4/0x138
> [C0003FEDF900] [C021913C] .bus_for_each_dev+0x7c/0xd4
> [C0003FEDF9C0] [C02198B0] .driver_attach+0x28/0x40
> [C0003FEDFA40] [C0218BA4] .bus_add_driver+0x98/0x18c
> [C0003FEDFAE0] [C021A064] .driver_register+0xa8/0xc4
> [C0003FEDFB60] [C01B68AC] .__pci_register_driver+0x5c/0xa4
> [C0003FEDFBF0] [D029C204] .sym2_init+0x104/0x1550 [sym53c8xx]
> [C0003FEDFC90] [C008D1F4] .sys_init_module+0x1764/0x1998
> [C0003FEDFE30] [C000869C] syscall_exit+0x0/0x40

hm, odd. Can you look up sym_hcb_attach+0x1194/0x1384 in gdb? Something like

- Enable CONFIG_DEBUG_INFO
- gdb sym53c8xx.o
  (gdb) p sym_hcb_attach
  prints 0xsomething
  (gdb) p/x 0xsomething + 0x1194
  prints 0xsomethingelse
  (gdb) l *0xsomethingelse
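The gdb recipe above is plain address arithmetic: take the address gdb prints for the function, add the offset from the trace, and list the source at the result. As a rough illustration, the sketch below splits a `symbol+0xoffset/0xsize` token from a trace and does the addition. `struct sym_ref`, `parse_sym_ref()` and `sym_ref_address()` are names made up for this example, and the base address is whatever gdb reports for the symbol.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* One "symbol+0xoffset/0xsize" reference taken from a backtrace line. */
struct sym_ref {
    char name[64];
    unsigned long offset;   /* bytes into the function */
    unsigned long size;     /* total size of the function */
};

/* Parse e.g. ".sym_hcb_attach+0x1194/0x1384"; returns 0 on success, -1 otherwise. */
static int parse_sym_ref(const char *s, struct sym_ref *r)
{
    assert(s != NULL && r != NULL);

    if (*s == '.')          /* the traces above prefix function symbols with a dot */
        s++;
    if (sscanf(s, "%63[^+]+0x%lx/0x%lx", r->name, &r->offset, &r->size) != 3)
        return -1;
    return 0;
}

/* base is the address printed by "p sym_hcb_attach"; the result goes to "l *addr". */
static unsigned long sym_ref_address(unsigned long base, const struct sym_ref *r)
{
    return base + r->offset;
}
```

For example, if gdb printed 0x1000 for the function, the token `.sym_hcb_attach+0x1194/0x1384` would resolve to 0x2194, which is the value to pass to `l *`.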
Re: 2.6.24-rc3-mm1
On Fri, 23 Nov 2007 06:55:41 +0100 Gabriel C [EMAIL PROTECTED] wrote:

> Andrew Morton wrote:
> > On Fri, 23 Nov 2007 02:39:08 +0100 Gabriel C [EMAIL PROTECTED] wrote:
> >
> > > I have some warnings on each SCSI disc:
> > >
> > > ...
> > > [ 30.724410] scsi 0:0:0:0: Direct-Access SEAGATE ST318406LW 0109 PQ: 0 ANSI: 3
> > > [ 30.724419] scsi0:A:0:0: Tagged Queuing enabled. Depth 32
> > > [ 30.724435] target0:0:0: Beginning Domain Validation
> > > [ 30.724446] target0:0:0: Domain Validation Initial Inquiry Failed --
> > > [ 30.724572] target0:0:0: Ending Domain Validation
> > > [ 30.729747] scsi 0:0:1:0: Direct-Access FUJITSU MAH3182MP 0114 PQ: 0 ANSI: 4
> > > [ 30.729754] scsi0:A:1:0: Tagged Queuing enabled. Depth 32
> > > [ 30.729771] target0:0:1: Beginning Domain Validation
> > > [ 30.729780] target0:0:1: Domain Validation Initial Inquiry Failed --
> > > [ 30.729908] target0:0:1: Ending Domain Validation
> >
> > Don't know what would have caused that. But yes, something is wrong in scsi land.
>
> Actually I'm lucky the author didn't fix that FIXME in scsi_transport_spi.c and I still can boot ;)
>
> > > no idea whether this is related but buffered disk reads are 2.XX MB/sec and the box is somewhat laggy.
> > >
> > > hdparm -t on sda and sdb reports:
> > >
> > > /dev/sda: Timing buffered disk reads: 8 MB in 3.26 seconds = 2.46 MB/sec
> > > /dev/sdb: Timing buffered disk reads: 8 MB in 3.56 seconds = 2.25 MB/sec
> > >
> > > My IDE discs are fine. Please let me know if you need my config or any other information.
> >
> > And you're the second to report very slow scsi throughput in 2.6.24-rc3-mm1.
>
> I found the commit which causes these problems. It is in the git-scsi-misc patch, and reverting it fixes both problems for me.
>
> http://git.kernel.org/?p=linux/kernel/git/jejb/scsi-misc-2.6.git;a=commitdiff_plain;h=8655a546c83fc43f0a73416bbd126d02de7ad6c0;hp=5bc717b6bdaaf52edf365eb7d9d8c89fec79df5d

OK, thanks. I'll assume that James and Hannes have this in hand (or will have, by mid-week) and I won't do anything here.
Re: 2.6.24-rc3-mm1
On Fri, 23 Nov 2007 02:39:08 +0100 Gabriel C [EMAIL PROTECTED] wrote:

> I have some warnings on each SCSI disc:
>
> ...
> [ 30.724410] scsi 0:0:0:0: Direct-Access SEAGATE ST318406LW 0109 PQ: 0 ANSI: 3
> [ 30.724419] scsi0:A:0:0: Tagged Queuing enabled. Depth 32
> [ 30.724435] target0:0:0: Beginning Domain Validation
> [ 30.724446] target0:0:0: Domain Validation Initial Inquiry Failed --
> [ 30.724572] target0:0:0: Ending Domain Validation
> [ 30.729747] scsi 0:0:1:0: Direct-Access FUJITSU MAH3182MP 0114 PQ: 0 ANSI: 4
> [ 30.729754] scsi0:A:1:0: Tagged Queuing enabled. Depth 32
> [ 30.729771] target0:0:1: Beginning Domain Validation
> [ 30.729780] target0:0:1: Domain Validation Initial Inquiry Failed --
> [ 30.729908] target0:0:1: Ending Domain Validation

Don't know what would have caused that. But yes, something is wrong in scsi land.

> no idea whether this is related but buffered disk reads are 2.XX MB/sec and the box is somewhat laggy.
>
> hdparm -t on sda and sdb reports:
>
> /dev/sda: Timing buffered disk reads: 8 MB in 3.26 seconds = 2.46 MB/sec
> /dev/sdb: Timing buffered disk reads: 8 MB in 3.56 seconds = 2.25 MB/sec
>
> My IDE discs are fine. Please let me know if you need my config or any other information.

And you're the second to report very slow scsi throughput in 2.6.24-rc3-mm1.
Re: [Bugme-new] [Bug 9405] New: iSCSI does not implement ordering guarantees required by e.g. journaling filesystems
On Mon, 19 Nov 2007 05:44:01 -0800 (PST) [EMAIL PROTECTED] wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=9405
>
> Summary: iSCSI does not implement ordering guarantees required by e.g. journaling filesystems
> Product: IO/Storage
> Version: 2.5
> KernelVersion: 2.6.23.1
> Platform: All
> OS/Version: Linux
> Tree: Mainline
> Status: NEW
> Severity: high
> Priority: P1
> Component: SCSI
> AssignedTo: [EMAIL PROTECTED]
> ReportedBy: [EMAIL PROTECTED]
>
> Most recent kernel where this bug did not occur: (new issue)
> Distribution: any
> Hardware Environment: (does not apply)
> Software Environment: (does not apply)
>
> Problem Description: The sd (SCSI disk) driver ignores block device barriers (REQ_HARDBARRIER). The iSCSI code in the kernel sends all iSCSI commands with flag ISCSI_ATTR_SIMPLE to the iSCSI target. This means that the target may reorder these commands. Since, among other things, correct operation of journaling filesystems depends on being able to enforce the order of certain block write operations, not enforcing write ordering is a bug. This can be solved either by adding support for REQ_HARDBARRIER in the sd device or by replacing ISCSI_ATTR_SIMPLE with ISCSI_ATTR_ORDERED.
>
> Steps to reproduce: Source reading of drivers/scsi/sd.c and drivers/scsi/libiscsi.c.
>
> References: SCSI Architecture Model - 3, paragraph 8.6 (http://www.t10.org/ftp/t10/drafts/sam3/sam3r14.pdf).

(does iscsi have a maintainer?)
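To make the report's two proposed fixes concrete, here is a small userspace sketch of a variant that combines them: commands are tagged SIMPLE by default, and only requests marked as barriers are tagged ORDERED. This is an illustration, not what the report proposes verbatim and not the libiscsi code. `enum task_attr`, `struct io_request` and `pick_task_attr()` are hypothetical names, and the enum values are arbitrary rather than the wire values of the ISCSI_ATTR_* flags.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Task attributes, modelled loosely on the SIMPLE/ORDERED distinction (values arbitrary). */
enum task_attr {
    TASK_ATTR_SIMPLE,   /* target may reorder relative to other SIMPLE tasks */
    TASK_ATTR_ORDERED   /* target must not reorder other tasks across this one */
};

/* Hypothetical request descriptor; is_barrier stands in for a barrier flag on the request. */
struct io_request {
    bool is_barrier;
};

/* Choose the attribute to send with a command for this request. */
static enum task_attr pick_task_attr(const struct io_request *req)
{
    assert(req != NULL);
    return req->is_barrier ? TASK_ATTR_ORDERED : TASK_ATTR_SIMPLE;
}
```

The design question the report raises is where this decision belongs: in sd, by honouring the barrier flag, or in the iSCSI layer, by sending everything ORDERED at some cost in target-side reordering freedom.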
Re: [PATCH 2/3] cciss: add support for blktrace
On Mon, 19 Nov 2007 16:07:17 -0600 Mike Miller [EMAIL PROTECTED] wrote:

> Patch 2 of 3
>
> This patch adds support for the blktrace utility. Please consider this for inclusion. Seems there was already a call to blk_add_trace. This patch adds ifdef's and includes the header file.
>
> Signed-off-by: Mike Miller [EMAIL PROTECTED]
>
> diff --git a/drivers/block/cciss.c b/drivers/block/cciss.c
> index 2ba5a89..61bc0f3 100644
> --- a/drivers/block/cciss.c
> +++ b/drivers/block/cciss.c
> @@ -41,6 +41,10 @@
>  #include <asm/uaccess.h>
>  #include <asm/io.h>
> +#ifdef CONFIG_BLK_DEV_IO_TRACE
> +#include <linux/blktrace_api.h>
> +#endif /* CONFIG_BLK_DEV_IO_TRACE */

The ifdefs shouldn't be needed here. If they are needed, blktrace_api.h needs fixing.

>  #include <linux/dma-mapping.h>
>  #include <linux/blkdev.h>
>  #include <linux/genhd.h>
> @@ -3013,7 +3017,9 @@ after_error_processing:
>  	}
>  	cmd->rq->data_len = 0;
>  	cmd->rq->completion_data = cmd;
> +#ifdef CONFIG_BLK_DEV_IO_TRACE
>  	blk_add_trace_rq(cmd->rq->q, cmd->rq, BLK_TA_COMPLETE);
> +#endif /* CONFIG_BLK_DEV_IO_TRACE */
>  	blk_complete_request(cmd->rq);
>  }

And if you remove the first set of ifdefs, these ifdefs can also be removed.
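The review comment relies on a common header pattern: the header itself supplies a no-op version of the hook when the feature is configured out, so callers never need their own #ifdef. The sketch below shows that pattern in plain userspace C. `CONFIG_DEMO_TRACE`, `demo_add_trace()` and `demo_complete()` are invented for the example and are not the blktrace API.

```c
#include <assert.h>

static int trace_events;   /* trace records emitted so far */
static int completions;    /* requests completed so far */

/* "Header" part: one definition per configuration, same signature in both. */
#ifdef CONFIG_DEMO_TRACE
static inline void demo_add_trace(int tag)
{
    (void)tag;
    trace_events++;         /* real tracing would record the event here */
}
#else
static inline void demo_add_trace(int tag)
{
    (void)tag;              /* compiled out: the call costs nothing */
}
#endif

/* "Driver" part: no #ifdef at the call site. */
static void demo_complete(int tag)
{
    assert(tag >= 0);
    demo_add_trace(tag);
    completions++;
}
```

Built without `-DCONFIG_DEMO_TRACE`, the trace call is an empty inline function and the completion path is unchanged. Built with it, the same source emits one trace record per completion.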
Re: [PATCH 1/3] cciss: export more sysfs attributes
On Mon, 19 Nov 2007 16:03:07 -0600 Mike Miller [EMAIL PROTECTED] wrote: Patch 1 of 3 This patch creates more sysfs attributes to be exported by cciss. Hopefully we can work better with udev. Please consider this patch for inclusion. It would be appropriate if the changelog were to describe what the problem is with udev, and how this patch attemtps to address it. diff --git a/drivers/block/cciss.c b/drivers/block/cciss.c index 7d70496..2ba5a89 100644 --- a/drivers/block/cciss.c +++ b/drivers/block/cciss.c @@ -229,20 +229,483 @@ static inline CommandList_struct *removeQ(CommandList_struct **Qptr, return c; } +static inline int find_drv_index(int ctlr, drive_info_struct *drv){ +int i; +for (i=0; i CISS_MAX_LUN; i++) { +if (hba[ctlr]-drv[i].LunID == drv-LunID) +return i; +} +return i; +} Please pass all patches though scripts/checkpatch.pl before sending. It will detect things like the codingstyle errors in the above code. Also, that function seems to be too large to be inlined. #include cciss_scsi.c /* For SCSI tape support */ +#define ENG_GIG 10 +#define ENG_GIG_FACTOR (ENG_GIG/512) #define RAID_UNKNOWN 6 +static const char *raid_label[] = { 0, 4, 1(1+0), 5, 5+1, ADG, + UNKNOWN}; + + +static spinlock_t sysfs_lock = SPIN_LOCK_UNLOCKED; And that's a bug which checkpatch would have detected. Please use DEFINE_SPINLOCK() to avoid confusing lockdep. +static void cciss_sysfs_stat_inquiry(int ctlr, int logvol, + int withirq, drive_info_struct *drv) +{ + int return_code; + InquiryData_struct *inq_buff; + + /* If there are no heads then this is the controller disk and + * not a valid logical drive so don't query it. 
+ */ + if (!drv-heads) + return; + + inq_buff = kzalloc(sizeof(InquiryData_struct), GFP_KERNEL); + if (!inq_buff) { + printk(KERN_ERR cciss: out of memory\n); + goto err; + } + + if (withirq) + return_code = sendcmd_withirq(CISS_INQUIRY, ctlr, + inq_buff, sizeof(*inq_buff), 1, logvol ,0, TYPE_CMD); + else + return_code = sendcmd(CISS_INQUIRY, ctlr, inq_buff, + sizeof(*inq_buff), 1, logvol , 0, NULL, TYPE_CMD); + if (return_code == IO_OK) { + memcpy(drv-vendor, inq_buff-data_byte[8], 8); + drv-vendor[8]='\0'; + memcpy(drv-model, inq_buff-data_byte[16], 16); + drv-model[16] = '\0'; + memcpy(drv-rev, inq_buff-data_byte[32], 4); + drv-rev[4] = '\0'; + } else { /* Get geometry failed */ + printk(KERN_WARNING cciss: inquiry for VPD page 0 failed\n); + } + + if (withirq) + return_code = sendcmd_withirq(CISS_INQUIRY, ctlr, + inq_buff, sizeof(*inq_buff), 1, logvol ,0x83, TYPE_CMD); + else + return_code = sendcmd(CISS_INQUIRY, ctlr, inq_buff, + sizeof(*inq_buff), 1, logvol , 0x83, NULL, TYPE_CMD); + + if (return_code == IO_OK) { + memcpy(drv-uid, inq_buff-data_byte[8], 16); + } else { /* Get geometry failed */ + printk(KERN_WARNING cciss: id logical drive failed\n); + } + + kfree(inq_buff); +err: + drv-vendor[8] = '\0'; + drv-model[16] = '\0'; + drv-rev[4] = '\0'; + +} + +static ssize_t cciss_show_raid_level(struct device *dev, + struct device_attribute *attr, char *buf) +{ + struct drv_dynamic *d; + drive_info_struct *drv; + ctlr_info_t *h; + unsigned long flags; + int raid; + + d = container_of(dev, struct drv_dynamic, dev); + spin_lock(sysfs_lock); + if (!d-disk) { + spin_unlock(sysfs_lock); + return -ENOENT; + } + + h = get_host(d-disk); + + spin_lock_irqsave(CCISS_LOCK(h-ctlr), flags); + if (h-busy_configuring) { + spin_unlock_irqrestore(CCISS_LOCK(h-ctlr), flags); + spin_unlock(sysfs_lock); + return snprintf(buf, 30, Device busy configuring\n); + } + + drv = d-disk-private_data; + if ((drv-raid_level 0) || (drv-raid_level) 5) + raid = RAID_UNKNOWN; + else + raid = 
drv-raid_level; + + spin_unlock_irqrestore(CCISS_LOCK(h-ctlr), flags); + spin_unlock(sysfs_lock); + return snprintf(buf, 20, RAID %s\n, raid_label[raid]); +} + +static ssize_t cciss_show_disk_size(struct device *dev, + struct device_attribute *attr, char *buf) +{ + struct drv_dynamic *d; + drive_info_struct *drv; + ctlr_info_t *h; + unsigned long flags; + sector_t vol_sz, vol_sz_frac; + + d = container_of(dev, struct drv_dynamic, dev); + spin_lock(sysfs_lock); +
Re: [PATCH 3/4] scsi_data_buffer
On Thu, 08 Nov 2007 18:59:30 +0200 Boaz Harrosh [EMAIL PROTECTED] wrote:

> In preparation for bidi we abstract all IO members of scsi_cmnd, that will need to duplicate, into a substructure.
>
> - Group all IO members of scsi_cmnd into a scsi_data_buffer structure.

drivers/scsi/qla1280.c: In function 'qla1280_done':
drivers/scsi/qla1280.c:1313: error: 'struct scsi_cmnd' has no member named 'use_sg'
drivers/scsi/qla1280.c:1314: error: 'struct scsi_cmnd' has no member named 'request_buffer'
drivers/scsi/qla1280.c:1315: error: 'struct scsi_cmnd' has no member named 'use_sg'
drivers/scsi/qla1280.c:1316: error: 'struct scsi_cmnd' has no member named 'request_bufflen'
drivers/scsi/qla1280.c:1318: error: 'struct scsi_cmnd' has no member named 'request_bufflen'
drivers/scsi/qla1280.c: In function 'qla1280_return_status':
drivers/scsi/qla1280.c:1409: error: 'struct scsi_cmnd' has no member named 'request_bufflen'
drivers/scsi/qla1280.c:1416: error: 'struct scsi_cmnd' has no member named 'resid'
drivers/scsi/qla1280.c: In function 'qla1280_64bit_start_scsi':
drivers/scsi/qla1280.c:2791: error: 'struct scsi_cmnd' has no member named 'use_sg'
drivers/scsi/qla1280.c:2792: error: 'struct scsi_cmnd' has no member named 'request_buffer'
drivers/scsi/qla1280.c:2793: error: 'struct scsi_cmnd' has no member named 'use_sg'
drivers/scsi/qla1280.c:2801: error: 'struct scsi_cmnd' has no member named 'request_bufflen'
drivers/scsi/qla1280.c:2896: error: 'struct scsi_cmnd' has no member named 'use_sg'
drivers/scsi/qla1280.c:2991: error: 'struct scsi_cmnd' has no member named 'request_buffer'
drivers/scsi/qla1280.c:2992: error: 'struct scsi_cmnd' has no member named 'request_bufflen'
drivers/scsi/qla1280.c:3004: error: 'struct scsi_cmnd' has no member named 'request_bufflen'
make[2]: *** [drivers/scsi/qla1280.o] Error 1

It mystifies me how a patch like this can have been floating about in N submissions across M months and nobody has done an allmodconfig build or even a grep to find out what broke.

ho hum. I shall mark qla1280 BROKEN and shall plod onwards.
Re: [PATCH 3/4] scsi_data_buffer
On Tue, 13 Nov 2007 15:40:42 +0900 FUJITA Tomonori [EMAIL PROTECTED] wrote:

> On Mon, 12 Nov 2007 22:06:52 -0800 Andrew Morton [EMAIL PROTECTED] wrote:
>
> > On Thu, 08 Nov 2007 18:59:30 +0200 Boaz Harrosh [EMAIL PROTECTED] wrote:
> >
> > > In preparation for bidi we abstract all IO members of scsi_cmnd, that will need to duplicate, into a substructure.
> > >
> > > - Group all IO members of scsi_cmnd into a scsi_data_buffer structure.
> >
> > drivers/scsi/qla1280.c: In function 'qla1280_done':
> > drivers/scsi/qla1280.c:1313: error: 'struct scsi_cmnd' has no member named 'use_sg'
> > drivers/scsi/qla1280.c:1314: error: 'struct scsi_cmnd' has no member named 'request_buffer'
> > drivers/scsi/qla1280.c:1315: error: 'struct scsi_cmnd' has no member named 'use_sg'
> > drivers/scsi/qla1280.c:1316: error: 'struct scsi_cmnd' has no member named 'request_bufflen'
> > drivers/scsi/qla1280.c:1318: error: 'struct scsi_cmnd' has no member named 'request_bufflen'
> > drivers/scsi/qla1280.c: In function 'qla1280_return_status':
> > drivers/scsi/qla1280.c:1409: error: 'struct scsi_cmnd' has no member named 'request_bufflen'
> > drivers/scsi/qla1280.c:1416: error: 'struct scsi_cmnd' has no member named 'resid'
> > drivers/scsi/qla1280.c: In function 'qla1280_64bit_start_scsi':
> > drivers/scsi/qla1280.c:2791: error: 'struct scsi_cmnd' has no member named 'use_sg'
> > drivers/scsi/qla1280.c:2792: error: 'struct scsi_cmnd' has no member named 'request_buffer'
> > drivers/scsi/qla1280.c:2793: error: 'struct scsi_cmnd' has no member named 'use_sg'
> > drivers/scsi/qla1280.c:2801: error: 'struct scsi_cmnd' has no member named 'request_bufflen'
> > drivers/scsi/qla1280.c:2896: error: 'struct scsi_cmnd' has no member named 'use_sg'
> > drivers/scsi/qla1280.c:2991: error: 'struct scsi_cmnd' has no member named 'request_buffer'
> > drivers/scsi/qla1280.c:2992: error: 'struct scsi_cmnd' has no member named 'request_bufflen'
> > drivers/scsi/qla1280.c:3004: error: 'struct scsi_cmnd' has no member named 'request_bufflen'
> > make[2]: *** [drivers/scsi/qla1280.o] Error 1
> >
> > It mystifies me how a patch like this can have been floating about in N submissions across M months and nobody has done an allmodconfig build or even a grep to find out what broke.
> >
> > ho hum. I shall mark qla1280 BROKEN and shall plod onwards.
>
> A patch to fix this is in James' scsi-pending tree. Jes tested and fixed it (thanks !) so it will go to -mm via scsi-misc soon.

oh gawd. So we have git-scsi-misc, git-scsi-rc-fixes and now git-scsi-pending?

I hope you fixed imm, ppa and any other broken drivers?

> Boaz, it's better to send major scsi patches to -mm via scsi-misc to avoid problems like this.
>
> By the way, Andrew, can you add the following patchset to -mm?
>
> http://lkml.org/lkml/2007/10/24/138
>
> It fixes the problem of IOMMUs merging scatter/gather segments without considering LLDs' restrictions.

hmm, OK, I saved them away to look at after next -mm.
Re: Add ACCUSYS RAID driver for Linux i386/x86-64
On Mon, 22 Oct 2007 18:17:49 +0800 Peter Chan [EMAIL PROTECTED] wrote:

> Dear Morton
>
> Thanks for your doing. We modified source code as your requested.
> If you have any comment please let me know.
>
> Do you need RAID HBA to test at this stage? If yes, Which address
> can i ship RAID HBA for you?

Please, you really will need to become a bit more familiar with the way we
work. As far as I know, nobody in the linux world uses RAR format - that's
a windows thing. I doubt if anyone except me has actually gone to the
effort to decrypt that attachment.

Start with

    Documentation/SubmittingPatches
    Documentation/SubmittingDrivers
    Documentation/SubmitChecklist
    http://www.zip.com.au/~akpm/linux/patches/stuff/tpp.txt
    http://linux.yyz.us/patch-format.html

There are a number of remaining stylistic things which we can look at more
closely when we have patches which are in a usable form.

- Linux doesn't use capitalisation in variable names. Use tail, not Tail

- Linux uses underscores to separate words. Use reply_frame, not
  replyframe or ReplyFrame.

- We don't like to see code which has any dependency on LINUX_VERSION_CODE
  or KERNEL_VERSION: the code in Linux is supposed to work correctly in the
  version of the kernel in which it's found and that's it.

- I don't know what this:

    +#if defined(CONFIG_MODVERSIONS) && !defined(MODVERSIONS)
    +    #define MODVERSIONS
    +#endif

  is doing, but it's probably wrong.

- Use request_node, not RequestNode, etc.

- Don't parenthesise the argument to `return'.

- I see at least one U32 in there. Please use u32. (Does U32 even work?)

- This:

    +static int acs_ame_get_log(
    +    struct Acs_Adapter *acs_adt,
    +    struct EventLog *event_log)

  isn't preferred style. Use

    static int acs_ame_get_log(struct Acs_Adapter *acs_adt,
                               struct EventLog *event_log)

  or, if you particularly dislike that, blow the 80-col rule and do

    static int acs_ame_get_log(struct Acs_Adapter *acs_adt, struct EventLog *event_log)

- This

    +    writel((replyframe), base_addr+AME_REPLY_MSG_PORT);

  is overparenthesised.

- Beware that the scatter/gather APIs just got significantly changed. Your
  code might need adjustment to work against the latest mainline tree.

- What does CHAR_DEV do? Probably it should be a Kconfig CONFIG_* option.

- All the code around acs_ame_schedule_command() (which is incorrectly
  identified as arcmsr_schedule_command in its comment block) is indented
  a tab stop. That's really weird. Please make it normal.

- acs_ame_schedule_command() has an up-to-sixty-second busywait. Bad. Can
  we get a sleep+wakeup in there?

- This:

    +    struct
    +    {
    +        unsigned int vendor_id;
    +        unsigned int device_id;
    +    } const acs_ame_devices[] = {
    +          { 0x14D6, DEVICEID_ACS_61000_XX }
    +        , { 0x14D6, DEVICEID_ACS_62000_08 }
    +        , { 0x1AB6, DEVICEID_ACS_61000_XX }
    +        , { 0x1AB6, DEVICEID_ACS_62000_08 }
    +    };

  should be

    static const struct {
        unsigned int vendor_id;
        unsigned int device_id;
    } acs_ame_devices[] = {
        { 0x14D6, DEVICEID_ACS_61000_XX },
        { 0x14D6, DEVICEID_ACS_62000_08 },
        { 0x1AB6, DEVICEID_ACS_61000_XX },
        { 0x1AB6, DEVICEID_ACS_62000_08 },
    };

  which has many changes from the original.

  I'd have thought that the kernel already has a data type for this, but I
  can't find it. Most drivers just rely upon the normal PCI device ID
  tables. It is suspicious that this one doesn't.

Anyway, that's just from a quick scan. There are a huge number of similar
issues in there. Please take some time to study some well-maintained Linux
driver code and the interfaces which scsi and PCI drivers use and try to
make this driver a lot more Linux-like, thanks.
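For readers following the review, here is a minimal userspace sketch of the corrected table declaration together with a lookup over it. It is an illustration only, not code from the driver: the numeric values given to the DEVICEID_ACS_* macros are placeholders (the thread does not state them), and the acs_ame_is_supported() helper and the ARRAY_SIZE definition are assumptions added so the fragment compiles outside the kernel.

```c
#include <stdbool.h>
#include <stddef.h>

/* Placeholder values: the real DEVICEID_ACS_* constants are not in the thread. */
#define DEVICEID_ACS_61000_XX 0x0001
#define DEVICEID_ACS_62000_08 0x0002

/* Userspace stand-in for the kernel's ARRAY_SIZE() macro. */
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* The table in the form suggested by the review: static const, one entry per line. */
static const struct {
	unsigned int vendor_id;
	unsigned int device_id;
} acs_ame_devices[] = {
	{ 0x14D6, DEVICEID_ACS_61000_XX },
	{ 0x14D6, DEVICEID_ACS_62000_08 },
	{ 0x1AB6, DEVICEID_ACS_61000_XX },
	{ 0x1AB6, DEVICEID_ACS_62000_08 },
};

/* Hypothetical helper: true if the vendor/device pair appears in the table. */
static bool acs_ame_is_supported(unsigned int vendor_id, unsigned int device_id)
{
	size_t i;

	for (i = 0; i < ARRAY_SIZE(acs_ame_devices); i++) {
		if (acs_ame_devices[i].vendor_id == vendor_id &&
		    acs_ame_devices[i].device_id == device_id)
			return true;
	}
	return false;
}
```

With the placeholder values above, acs_ame_is_supported(0x14D6, DEVICEID_ACS_61000_XX) is true and an unknown vendor such as 0x1234 is rejected. As the review notes, most drivers would not hand-roll this at all and would instead rely on the normal PCI device ID tables.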
Re: [GIT PATCH] final SCSI pieces for the merge window
On Wed, 24 Oct 2007 09:28:10 -0400 James Bottomley [EMAIL PROTECTED] wrote:

> OK, so it's no secret that I'm the last of the subsystem maintainers
> whose day job isn't working on the linux kernel.

For the record, lots of subsystem maintainers are privateers.

<goes through the git trees>

I am not aware that these guys:

Mauro Chehab, Dmitry Torokhov, Sam Ravnborg, Pierre Ossman, Mark Hoffman,
Thomas Gleixner, David Airlie, Richard Purdie, Peter Anvin, Kyle McMartin,
Francois Romieu, Artem Bityutskiy, Erez Zadok, Josef Sipek, Anton
Altaparmakov, Eric Van Hensbergen, Latchesar Ionkov, Wim Van Sebroeck,
Antonino Daplas.

do it with any compensation.
Re: [GIT PATCH] final SCSI pieces for the merge window
On Wed, 24 Oct 2007 08:35:21 -0700 (PDT) Linus Torvalds [EMAIL PROTECTED] wrote:

> On Wed, 24 Oct 2007, James Bottomley wrote:
> >
> > OK, so it's no secret that I'm the last of the subsystem maintainers
> > whose day job isn't working on the linux kernel. If you want a full
> > time person, who did you have in mind?
>
> Quite frankly, at least for me personally, what I would rather have (in
> general: this is really not at all SCSI-specific in any way, shape, or
> form, and not directed at James!) is a less rigid maintainership
> structure.
>
> Let's face it, we are *all* likely to be overworked at different times,
> and even when not overworked, it's just the fact that people need to take
> a breather etc. And there is seldom - if ever - a very strong argument for
> having one person per subsystem.

Am OK with all of that, but with a rider.

It would make my life even more miserable if there was a (say)
git-scsi-tweedledee and a git-scsi-tweedledum. We already have too much
out-of-scope code turning up in the git trees and having two trees
explicitly modifying the same subsystem would hurt.

It's also bad from an engineering POV: there's a decent chance that when
combined, they just won't work.

So Tweedledee and Tweedledum should both commit to the same tree, please.
Re: stex driver panic in kernel 2.6.23
On Wed, 24 Oct 2007 11:59:30 -0700 Ed Lin [EMAIL PROTECTED] wrote:

> The shared tag issue was not fixed yet. Kernel panic happened while
> running I/O test in kernel 2.6.23 (information attached). After applying
> the patch I posted (or the version James modified), panic disappeared.
> Switch back to standard kernel, panic again.

Did either of those patches get merged in 2.6.24-rc1?
Re: [PATCH 1/4] [SCSI] ips: remove ips_ha members that duplicate struct pci_dev members
On Wed, 24 Oct 2007 19:48:26 -0400 (EDT) Jeff Garzik [EMAIL PROTECTED] wrote:

> drivers/scsi/ips.c | 178

this driver seems a bit of a basket case :(

What's going on here?

> scb->dcdb.cmd_attribute =
>     ips_command_direction[scb->scsi_cmd->cmnd[0]];
>
> /* Allow a WRITE BUFFER Command to Have no Data */
> /* This is Used by Tape Flash Utilites */
> if ((scb->scsi_cmd->cmnd[0] == WRITE_BUFFER) && (scb->data_len == 0))
> scb->dcdb.cmd_attribute = 0;
>
> if (!(scb->dcdb.cmd_attribute & 0x3))
> scb->dcdb.transfer_length = 0;
>
> if (scb->data_len >= IPS_MAX_XFER) {

I hope that's just busted indentation and not a missing {} block.
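To make the "busted indentation" reading concrete, here is a small userspace sketch of the same three statements, written so that each if visibly governs exactly one assignment. It is an illustration under stated assumptions, not the driver's code: struct dcdb_sketch and dcdb_fixup() are hypothetical names, and the direction argument stands in for the value the driver looks up in ips_command_direction[]. WRITE_BUFFER is defined locally as 0x3b, the SCSI WRITE BUFFER opcode.

```c
#define WRITE_BUFFER 0x3b	/* SCSI WRITE BUFFER opcode */

/* Hypothetical stand-in for the two dcdb fields the quoted hunk touches. */
struct dcdb_sketch {
	unsigned int cmd_attribute;
	unsigned int transfer_length;
};

/*
 * Mirrors the quoted statements, read as "indentation only was wrong":
 * every if controls a single assignment and nothing else.
 * 'direction' is an assumed per-opcode value, not taken from the driver.
 */
static void dcdb_fixup(struct dcdb_sketch *dcdb, unsigned char opcode,
		       unsigned int direction, unsigned int data_len)
{
	dcdb->cmd_attribute = direction;

	/* A WRITE BUFFER command is allowed to carry no data. */
	if (opcode == WRITE_BUFFER && data_len == 0)
		dcdb->cmd_attribute = 0;

	/* No direction bits set: there is nothing to transfer. */
	if (!(dcdb->cmd_attribute & 0x3))
		dcdb->transfer_length = 0;
}
```

Under this reading, a zero-length WRITE BUFFER ends up with both cmd_attribute and transfer_length cleared, while a command with direction bits set keeps its transfer length. If a {} block really were missing, the set of statements controlled by the first if would be different, which is the concern raised above.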
Re: oops in lbmIODone, fails to boot [Re: 2.6.23-mm1]
On Sat, 20 Oct 2007 13:57:54 +0900 Mattia Dongili [EMAIL PROTECTED] wrote:

> On Thu, Oct 11, 2007 at 09:31:26PM -0700, Andrew Morton wrote:
> >
> > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.23/2.6.23-mm1/
>
> Hey there!!
>
> fails to boot here with this friendly oops:
> http://oioio.altervista.org/linux/dsc01702.jpg
>
> .config:
> http://oioio.altervista.org/linux/config-2.6.23-mm1-1
>
> 2.6.23-rc8-mm2 booted ok but had other problems I haven't reported yet
> (no s2ram with mysql running and some net WARNING). Let's see if .23-mm1
> still has those first.
>
> I'm adding Cc: linux-scsi
>
> PS: I'll hardly be able to bisect in the next days... :P

That looks like a Jens and Dave production to me.
Re: [PATCH] Make advansys depend on CONFIG_VIRT_TO_BUS
On Thu, 18 Oct 2007 22:20:17 -0700 Randy Dunlap [EMAIL PROTECTED] wrote:

> On Fri, 19 Oct 2007 15:04:31 +1000 Stephen Rothwell wrote:
>
> > At least for now.
>
> Please explain why in the changelog (what changelog?).
> E.g.: so that make allmodconfig on powerpc will have a better chance
> of building.

My version of this patch does that. I'll be sending it into Linus in an
hour or so.
Re: OOM killer gripe (was Re: What still uses the block layer?)
On Mon, 15 Oct 2007 23:37:44 +1000 Nick Piggin [EMAIL PROTECTED] wrote:

> Would an oom-kill-someone-now sysrq be of help, I wonder?

Is already there: sysrq-f.