Re: [PATCH 2/2] x86_64 irq: Handle irqs pending in IRR during irq migration.
Arjan van de Ven <[EMAIL PROTECTED]> writes:

>> > Once the migration operation is complete we know we will receive
>> > no more interrupts on this vector so the irq pending state for
>> > this irq will no longer be updated. If the irq is not pending and
>> > we are in the intermediate state we immediately free the vector,
>> > otherwise we free the vector in do_IRQ when the pending irq
>> > arrives.
>>
>> So is this a for-2.6.20 thing? The bug was present in 2.6.19, so
>> I assume it doesn't affect many people?
>
> I got a few reports of this; irqbalance may trigger this kernel bug it
> seems... I would suggest to consider this for 2.6.20 since it's a
> hard-hang case

Yes. The bug I fixed will not happen if you don't migrate irqs.

At the very least we want the patch below (already in -mm)
that makes it not a hard hang case.

Subject: [PATCH] x86_64: Survive having no irq mapping for a vector

Occasionally the kernel has bugs that result in no irq being
found for a given cpu vector. If we acknowledge the irq
the system has a good chance of continuing even though we dropped
an irq message. If we continue to simply print a message, drop
the irq, and not acknowledge it, the system is likely to become
non-responsive shortly thereafter.

Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]>
---
 arch/x86_64/kernel/irq.c |   11 ++++++++---
 1 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/arch/x86_64/kernel/irq.c b/arch/x86_64/kernel/irq.c
index 0c06af6..648055a 100644
--- a/arch/x86_64/kernel/irq.c
+++ b/arch/x86_64/kernel/irq.c
@@ -120,9 +120,14 @@ asmlinkage unsigned int do_IRQ(struct pt_regs *regs)
 
 	if (likely(irq < NR_IRQS))
 		generic_handle_irq(irq);
-	else if (printk_ratelimit())
-		printk(KERN_EMERG "%s: %d.%d No irq handler for vector\n",
-			__func__, smp_processor_id(), vector);
+	else {
+		if (!disable_apic)
+			ack_APIC_irq();
+
+		if (printk_ratelimit())
+			printk(KERN_EMERG "%s: %d.%d No irq handler for vector\n",
+				__func__, smp_processor_id(), vector);
+	}
 
 	irq_exit();
 
-- 
1.4.4.1.g278f

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
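Editor's sketch: the effect described in the patch above can be illustrated with a small user-space model. Everything here is hypothetical (the `sim_` names, the `NR_IRQS` value, and the single in-service flag per vector are simplifications, not the kernel's or the APIC's real behaviour). It only shows why a vector that is never acknowledged stops being delivered, while acknowledging an unmapped vector lets later interrupts through.

```c
#include <stdbool.h>

#define NR_VECTORS 256
#define NR_IRQS    224	/* assumption: illustrative limit only */

/* Hypothetical model state, not kernel data structures. */
static int  vector_irq[NR_VECTORS];	/* >= NR_IRQS means "no irq mapped" */
static bool in_service[NR_VECTORS];	/* simplified in-service state */
static unsigned int handled_count;

static void sim_init(void)
{
	for (int v = 0; v < NR_VECTORS; v++) {
		vector_irq[v] = NR_IRQS;	/* start with every vector unmapped */
		in_service[v] = false;
	}
	handled_count = 0;
}

/* Acknowledge: clears the in-service state so the vector can fire again. */
static void sim_ack(int vector)
{
	in_service[vector] = false;
}

/*
 * Deliver one interrupt on 'vector'. Returns false when delivery is
 * blocked because an earlier interrupt on this vector was never acked.
 * 'ack_unmapped' selects the patched behaviour (ack even without an irq).
 */
static bool sim_do_irq(int vector, bool ack_unmapped)
{
	if (in_service[vector])
		return false;
	in_service[vector] = true;

	if (vector_irq[vector] < NR_IRQS) {
		handled_count++;
		sim_ack(vector);	/* in this model the handler path acks */
	} else if (ack_unmapped) {
		sim_ack(vector);	/* patched path: ack, then just complain */
	}
	return true;
}
```

With `ack_unmapped` false, the second call to `sim_do_irq()` on an unmapped vector returns false, which is the model's stand-in for the hang; with it true, delivery keeps working.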
Re: 2.6.20-rc7: known regressions (v2) (part 1)
Auke Kok <[EMAIL PROTECTED]> writes:

> Adrian Bunk wrote:
>> This email lists some known regressions in 2.6.20-rc7 compared to 2.6.19
>> that are not yet fixed in Linus' tree.
>>
>> If you find your name in the Cc header, you are either submitter of one
>> of the bugs, maintainer of an affected subsystem or driver, a patch
>> of yours caused a breakage or I'm considering you in any other way possibly
>> involved with one or more of these issues.
>
>
>> Subject    : e1000: 82571EB/82572EI PCI-E cards: link is always down
>>              (MSI related)
>> References : http://lkml.org/lkml/2007/1/16/27
>>              http://lkml.org/lkml/2007/1/17/182
>> Submitter  : Allen Parker <[EMAIL PROTECTED]>
>>              Adam Kropelin <[EMAIL PROTECTED]>
>> Handled-By : Auke Kok <[EMAIL PROTECTED]>
>> Status     : problem is being debugged
>
> I probably can't fix this bug. Not only do I doubt that the e1000 driver is
> at fault here, I don't have a system with this particular chipset. Most
> likely the regression comes from a combination of MSI layer rewrites and
> possibly platform issues. We've seen many reports that are similar and all
> are on the platform type mentioned here. I really don't want to point
> fingers here either.
>
> None of the MSI code in e1000 has changed significantly either. As far as I
> can see, the msi code in e1000 has not changed since 2.6.18. Nonetheless
> there's no way I can debug any of this without a system.
>
> I will address the fact that we are lacking any of these systems to test on,
> but that is not going to get this issue handled (not to mention soon) in the
> way it needs to be.
>
> I strongly encourage the people on the linux-pci list to help out, I'll
> trace the e1000 driver for suspicious activity (again), but I run countless
> tests on the latest trees and nothing has shown up recently, other than Eric
> Biederman's msi irq reclaim leak fix.
>
> Perhaps Adam can git-bisect this issue? Adam?

Do we have any explanation about the weird /proc/interrupts output?
i.e. Multiple MSI irqs being assigned to the same card?

Does
/sbin/ifconfig ethN down ; /sbin/ifconfig ethN up
have anything to do with the duplication in /proc/interrupts?

I can't see any way for a pci device that doesn't support msi-x to
be assigned multiple interrupts simultaneously.

I just skimmed through the code and there hasn't been any significant
generic MSI work since 2.6.19. Did this device really work with MSI
enabled in 2.6.19?

Eric
Re: [Cbe-oss-dev] [RFC, PATCH 4/4] Add support to OProfile for profiling Cell BE SPUs -- update
On Friday 02 February 2007 17:47, Maynard Johnson wrote:
>
> > We also want to be able to profile the context switch code itself, which
> > means that we also need one event buffer associated with the kernel to
> > collect events for a zero context_id.
> The hardware design precludes tracing both SPU and PPU simultaneously.
>
I mean the SPU-side part of the context switch code, which you can find
in arch/powerpc/platforms/cell/spufs/spu_{save,restore}*.

This code is the one that runs when context_id == 0 is passed to the
callback.

	Arnd <><
Re: [PATCH 2/2] x86_64 irq: Handle irqs pending in IRR during irq migration.
> > Once the migration operation is complete we know we will receive
> > no more interrupts on this vector so the irq pending state for
> > this irq will no longer be updated. If the irq is not pending and
> > we are in the intermediate state we immediately free the vector,
> > otherwise we free the vector in do_IRQ when the pending irq
> > arrives.
>
> So is this a for-2.6.20 thing? The bug was present in 2.6.19, so
> I assume it doesn't affect many people?

I got a few reports of this; irqbalance may trigger this kernel bug it
seems... I would suggest to consider this for 2.6.20 since it's a
hard-hang case
Re: 2.6.20-rc6-mm3
Cedric Le Goater wrote: > Starikovskiy, Alexey Y wrote: >>> so it probably means that drivers/acpi/tables/tbxfroot.c is >>> obsolete ? >> Yes. Could you please try it? >>> sure, I'll cancel the current boot test in which I was using >>> acpi_find_root_pointer() in tbxfroot.c and restart one with your >>> new patch. I should have the result today. >> How long does it take to boot this thing? > > well, not that long, but i don't have access directly to this > machine, only through a test batch manager ... dmesg looks fine. However, there is a : ACPI Warning (tbfadt-0415): Optional field "Gpe1Block" has zero address or length: /4 [20070126] but I don't know how to interpret this ? Any Idea ? thanks, C. Linux version 2.6.20-rc6-mm3-lxc2-autokern1 ([EMAIL PROTECTED]) (gcc version 4.0.3 (Ubuntu 4.0.3-1ubuntu5)) #1 SMP Fri Feb 2 20:38:46 UTC 2007 BIOS-provided physical RAM map: sanitize start sanitize end copy_e820_map() start: size: 0009dc00 end: 0009dc00 type: 1 copy_e820_map() type is E820_RAM copy_e820_map() start: 0009dc00 size: 2400 end: 000a type: 2 copy_e820_map() start: 000e size: 0002 end: 0010 type: 2 copy_e820_map() start: 0010 size: dfea25c0 end: dffa25c0 type: 1 copy_e820_map() type is E820_RAM copy_e820_map() start: dffa25c0 size: 9c80 end: dffac240 type: 3 copy_e820_map() start: dffac240 size: 00053dc0 end: e000 type: 2 copy_e820_map() start: fec0 size: 0140 end: 0001 type: 2 copy_e820_map() start: 0001 size: 00012000 end: 00022000 type: 1 copy_e820_map() type is E820_RAM BIOS-e820: - 0009dc00 (usable) BIOS-e820: 0009dc00 - 000a (reserved) BIOS-e820: 000e - 0010 (reserved) BIOS-e820: 0010 - dffa25c0 (usable) BIOS-e820: dffa25c0 - dffac240 (ACPI data) BIOS-e820: dffac240 - e000 (reserved) BIOS-e820: fec0 - 0001 (reserved) BIOS-e820: 0001 - 00022000 (usable) Node: 0, start_pfn: 0, end_pfn: 157 Node: 0, start_pfn: 256, end_pfn: 917410 Node: 0, start_pfn: 1048576, end_pfn: 2228224 get_memcfg_from_srat: assigning address to rsdp RSD PTR v0 [IBM ] Begin SRAT 
table scan CPU 0x00 in proximity domain 0x00 CPU 0x02 in proximity domain 0x00 CPU 0x10 in proximity domain 0x00 CPU 0x12 in proximity domain 0x00 CPU 0x01 in proximity domain 0x00 CPU 0x03 in proximity domain 0x00 CPU 0x11 in proximity domain 0x00 CPU 0x13 in proximity domain 0x00 Memory range 0x0 to 0xE (type 0x1) in proximity domain 0x00 enabled Memory range 0x10 to 0x22 (type 0x1) in proximity domain 0x00 enabled pxm bitmap: 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 Number of logical nodes in system = 1 Number of memory chunks in system = 2 chunk 0 nid 0 start_pfn end_pfn 000e chunk 1 nid 0 start_pfn 0010 end_pfn 0022 Node: 0, start_pfn: 0, end_pfn: 2228224 Reserving 17920 pages of KVA for lmem_map of node 0 Shrinking node 0 from 2228224 pages to 2210304 pages Reserving total of 17920 pages for numa KVA remap kva_start_pfn ~ 211456 find_max_low_pfn() ~ 229376 max_pfn = 2228224 7808MB HIGHMEM available. 896MB LOWMEM available. min_low_pfn = 1156, max_low_pfn = 229376, highstart_pfn = 229376 Low memory ends at vaddr f800 node 0 will remap to vaddr f3a0 - fc60 High memory starts at vaddr f800 found SMP MP-table at 0009dd40 Zone PFN ranges: DMA 0 -> 4096 Normal 4096 -> 229376 HighMem229376 -> 2228224 early_node_map[2] active PFN ranges 0:0 -> 917504 0: 1048576 -> 2210304 DMI 2.3 present. 
Using APIC driver default IBM eserver xSeries 440 detected: force use of acpi=ht ACPI: RSDP @ 0x000fde20/0x0014 (v000 IBM ) ACPI: RSDT @ 0xdffac1c0/0x0034 (v001 IBMSERVIGIL 0x1000 IBM 0x45444F43) ACPI: FACP @ 0xdffac140/0x0074 (v001 IBMSERVIGIL 0x1000 IBM 0x45444F43) ACPI Warning (tbfadt-0415): Optional field "Gpe1Block" has zero address or length: /4 [20070126] ACPI: DSDT @ 0xdffa25c0/0x4436 (v001 IBMSERVIGIL 0x1000 INTL 0x02002025) ACPI: FACS @ 0xdffabf00/0x0040 ACPI: APIC @ 0xdffac040/0x00D2 (v001 IBMSERVIGIL 0x1000 IBM 0x45444F43) ACPI: SRAT @ 0xdffabf40/0x0100 (v001 IBMSERVIGIL 0x1000 IBM 0x45444F43) ACPI: SSDT @ 0xdffa6a00/0x5467 (v001 IBMVIGSSDT0 0x1000 INTL 0x02002025) ACPI: PM-Timer IO Port: 0x508 Switched to APIC driver `summit'. ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled) Processor #0 15:1 APIC version 20 ACPI: LAPIC (acpi_id[0x01] lapic_id[0x02] enabled) Processor #2 15:1 APIC version 20 ACPI: LAPIC (acpi_id[0x04] lapic_id[0x10] enabled) Processor #16
Re: [PATCH 2 of 4] Introduce i386 fibril scheduling
On Fri, Feb 02, 2007 at 04:56:22PM -0800, Linus Torvalds wrote:
>
> On Sat, 3 Feb 2007, Ingo Molnar wrote:
> >
> > Well, in my picture, 'only if you block' is a pure thread utilization
> > decision: bounce a piece of work to another thread if this thread cannot
> > complete it. (if the kernel is lucky enough that the user context told
> > it "it's fine to do that".)
>
> Sure, you can do it that way too. But at that point, your argument that we
> shouldn't do it with fibrils is wrong: you'd still need basically the
> exact same setup that Zach does in his fibril stuff, and the exact same
> hook in the scheduler, testing the exact same value ("do we have a pending
> queue of work").
>
> So at that point, you really are arguing about a rather small detail in
> the implementation, I think.
>
> Which is fair enough.
>
> But I actually think the *bigger* argument and problems are elsewhere,
> namely in the interface details. Notably, I think the *real* issues end up
> how we handle synchronization, and how we handle signalling. Those are in
> many ways (I think) more important than whether we actually can schedule
> these trivial things on multiple CPU's concurrently or not.
>
> For example, I think serialization is potentially a much more expensive
> issue. Could we, for example, allow users to serialize with these things
> *without* having to go through the expense of doing a system call? Again,
> I'm thinking of the case of no IO happening, in which case there also
> won't be any actual threading taking place, in which case it's a total
> waste of time to do a system call at all.
>
> And trying to do that actually has implications for the interfaces (like
> possibly returning a zero cookie for the async() system call if it was
> doable totally synchronously?)

This would be useful - the application wouldn't have to set up state
to remember for handling completions for operations that complete
synchronously. I know Samba folks would like that.

The laio_syscall implementation (Lazy asynchronous IO) seems to have
experimented with such an interface:
http://www.usenix.org/events/usenix04/tech/general/elmeleegy.html

Regards
Suparna

>
> Signal handling is similar: I actually think that a "async()" system call
> should be interruptible within the context of the caller, since we would
> want to *try* to execute it synchronously. That automatically means that
> we have semantic meaning for fibrils and signal handling.
>
> Finally, can we actually get POSIX aio semantics with this? Can we
> implement the current aio_xyzzy() system calls using this same feature?
> And most importantly - does it perform well enough that we really can do
> that?
>
> THOSE are to me bigger questions than what happens inside the kernel, and
> whether we actually end up using another thread if we end up doing it
> non-synchronously.
>
> 		Linus

-- 
Suparna Bhattacharya ([EMAIL PROTECTED])
Linux Technology Center
IBM Software Lab, India
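Editor's sketch: the "zero cookie" idea discussed above could look roughly like this from the caller's side. This is a toy user-space model under stated assumptions. `sk_async()`, the `sk_try_fn` callback type and the two sample operations are hypothetical names invented for illustration; none of them is a real or proposed kernel interface.

```c
#include <stddef.h>

/* Result of a hypothetical async submission. */
struct sk_async_result {
	unsigned long cookie;	/* 0 means "completed synchronously" */
	long value;		/* only meaningful when cookie == 0 */
};

/* Hypothetical operation: returns 1 if it finished without blocking. */
typedef int (*sk_try_fn)(void *arg, long *value);

static unsigned long sk_next_cookie = 1;	/* cookies are never 0 */

/*
 * Try the operation in the caller's context first. Only when it would
 * block is a non-zero cookie handed back, so the caller needs completion
 * state just for that case.
 */
static struct sk_async_result sk_async(sk_try_fn try_op, void *arg)
{
	struct sk_async_result res = { 0, 0 };

	if (try_op(arg, &res.value))
		return res;		/* done: cookie stays 0 */

	res.cookie = sk_next_cookie++;	/* pending: caller must wait on it */
	return res;
}

/* Sample operation that can always be served immediately. */
static int sk_op_cached(void *arg, long *value)
{
	(void)arg;
	*value = 42;
	return 1;
}

/* Sample operation that always needs real IO. */
static int sk_op_needs_io(void *arg, long *value)
{
	(void)arg;
	(void)value;
	return 0;
}
```

The design point being illustrated is only the one raised in the thread: a caller that sees a zero cookie can use the result at once and skip all completion bookkeeping.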
Re: [Ksummit-2007-discuss] Re: [Ksummit-2006-discuss] 2007 Linux Kernel Summit
On Tuesday 30 January 2007 08:30, Theodore Tso wrote:

> Well, Usenix has offered to provide logistical support for some
> mini-summits if anyone wants to take them up on it. Using some of the
> sponsorship money from last year, we've proposed to make some hotel
> conference rooms right before OLS available if anyone wants to do a
> 10-30 person mini-summit in Ottawa.
>
> Is there any interest?

Yes, suspect that a day attached to OLS may make
a good power-management summit day.

-Len
Re: [patch 0/9] buffered write deadlock fix
On Fri, Feb 02, 2007 at 03:52:32PM -0800, Andrew Morton wrote:
> On Mon, 29 Jan 2007 11:31:37 +0100 (CET)
> Nick Piggin <[EMAIL PROTECTED]> wrote:
>
> > The following set of patches attempt to fix the buffered write
> > locking problems (and there are a couple of peripheral patches
> > and cleanups there too).
> >
> > Patches against 2.6.20-rc6. I was hoping that 2.6.20-rc6-mm2 would
> > be an easier diff with the fsaio patches gone, but the readahead
> > rewrite clashes badly :(
>
> Well fsaio is restored, but there's now considerable doubt over it due to
> the recent febril febrility.

I think Ingo made a point earlier about letting the old co-exist with the
new. Fibrils + kevents have great potential for a next generation
solution but we need to give the whole story some time to play out and
prove it in practice, debate and benchmark the alternative combinations,
optimize it for various workloads etc. It will also take more work on top
before we can get the whole POSIX AIO implementation supported on top of
this. I'll be very happy when that happens ... it is just that it is still
too early to be sure.

Since this is going to be a new interface, not the existing linux AIO
interface, I do not see any conflict between the two. Samba4 already uses
fsaio, and we now have the ability to do POSIX AIO over kernel AIO (which
depends on fsaio). The more we delay real world usage the longer we take
to learn about the application patterns that matter. And it is those
patterns that are key.

>
> How bad is the clash with the readahead patches?
>
> Clashes with git-block are likely, too.
>
> Bugfixes come first, so I will drop readahead and fsaio and git-block to get
> this work completed if needed - please work against mainline.

If you need help with fixing the clashes, please let me know.

Regards
Suparna

-- 
Suparna Bhattacharya ([EMAIL PROTECTED])
Linux Technology Center
IBM Software Lab, India
[RFC] Tracking mlocked pages and moving them off the LRU
This is a new variation on the earlier RFC for tracking mlocked pages.
We now mark a mlocked page with a bit in the page flags and remove
them from the LRU. Pages get moved back when no vma that references
the page has VM_LOCKED set anymore.

This means that vmscan no longer uselessly cycles over large amounts
of mlocked memory should someone attempt to mlock large amounts of
memory (may even result in a livelock on large systems).

Synchronization is built around state changes of the PageMlocked bit.
The NR_MLOCK counter is incremented and decremented based on
state transitions of PageMlocked. So the count is accurate.

There is still some unfinished business:

1. We use the 21st page flag and we only have 20 on 32 bit NUMA platforms.

2. Since mlocked pages are now off the LRU page migration will no longer
   move them.

3. Use NR_MLOCK to tune various VM behaviors so that the VM no longer
   falls over due to too many mlocked pages in certain areas.

Signed-off-by: Christoph Lameter <[EMAIL PROTECTED]>

Index: current/include/linux/mmzone.h
===
--- current.orig/include/linux/mmzone.h	2007-02-02 16:42:51.0 -0800
+++ current/include/linux/mmzone.h	2007-02-02 16:43:28.0 -0800
@@ -58,6 +58,7 @@ enum zone_stat_item {
 	NR_FILE_DIRTY,
 	NR_WRITEBACK,
 	/* Second 128 byte cacheline */
+	NR_MLOCK,		/* Mlocked pages */
 	NR_SLAB_RECLAIMABLE,
 	NR_SLAB_UNRECLAIMABLE,
 	NR_PAGETABLE,	/* used for pagetables */
Index: current/mm/memory.c
===
--- current.orig/mm/memory.c	2007-02-02 16:42:51.0 -0800
+++ current/mm/memory.c	2007-02-02 21:24:20.0 -0800
@@ -682,6 +682,8 @@ static unsigned long zap_pte_range(struc
 					file_rss--;
 			}
 			page_remove_rmap(page, vma);
+			if (PageMlocked(page) && (vma->vm_flags & VM_LOCKED))
+				mlock_remove(page, vma);
 			tlb_remove_page(tlb, page);
 			continue;
 		}
@@ -898,6 +900,21 @@ unsigned long zap_page_range(struct vm_a
 }
 
 /*
+ * Add a new anonymous page
+ */
+void anon_add(struct vm_area_struct *vma, struct page *page,
+				unsigned long address)
+{
+	inc_mm_counter(vma->vm_mm, anon_rss);
+	if (vma->vm_flags & VM_LOCKED) {
+		SetPageMlocked(page);
+		inc_zone_page_state(page, NR_MLOCK);
+	} else
+		lru_cache_add_active(page);
+	page_add_new_anon_rmap(page, vma, address);
+}
+
+/*
  * Do a quick page-table lookup for a single page.
  */
 struct page *follow_page(struct vm_area_struct *vma, unsigned long address,
@@ -949,6 +966,10 @@ struct page *follow_page(struct vm_area_
 	if (unlikely(!page))
 		goto unlock;
 
+	if ((flags & FOLL_MLOCK) &&
+			!PageMlocked(page) &&
+			(vma->vm_flags & VM_LOCKED))
+		mlock_add(page, vma);
 	if (flags & FOLL_GET)
 		get_page(page);
 	if (flags & FOLL_TOUCH) {
@@ -1045,7 +1066,7 @@ int get_user_pages(struct task_struct *t
 			continue;
 		}
 
-		foll_flags = FOLL_TOUCH;
+		foll_flags = FOLL_TOUCH | FOLL_MLOCK;
 		if (pages)
 			foll_flags |= FOLL_GET;
 		if (!write && !(vma->vm_flags & VM_LOCKED) &&
@@ -2101,9 +2122,7 @@ static int do_anonymous_page(struct mm_s
 		page_table = pte_offset_map_lock(mm, pmd, address, );
 		if (!pte_none(*page_table))
 			goto release;
-		inc_mm_counter(mm, anon_rss);
-		lru_cache_add_active(page);
-		page_add_new_anon_rmap(page, vma, address);
+		anon_add(vma, page, address);
 	} else {
 		/* Map the ZERO_PAGE - vm_page_prot is readonly */
 		page = ZERO_PAGE(address);
@@ -2247,12 +2266,13 @@ retry:
 		if (write_access)
 			entry = maybe_mkwrite(pte_mkdirty(entry), vma);
 		set_pte_at(mm, address, page_table, entry);
-		if (anon) {
-			inc_mm_counter(mm, anon_rss);
-			lru_cache_add_active(new_page);
-			page_add_new_anon_rmap(new_page, vma, address);
-		} else {
+		if (anon)
+			anon_add(vma, new_page, address);
+		else {
 			inc_mm_counter(mm, file_rss);
+			if (!PageMlocked(new_page) &&
+					(vma->vm_flags & VM_LOCKED))
+				mlock_add(new_page, vma);
 			page_add_file_rmap(new_page);
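Editor's sketch: the RFC says the NR_MLOCK count stays accurate because it only changes on PageMlocked state transitions. A minimal single-threaded model of that rule follows. The `sketch_` names and the `other_locked_vmas` parameter are hypothetical; the bodies of the real `mlock_add()` and `mlock_remove()` are not shown in the excerpt above, and the kernel would use atomic page-flag operations rather than plain booleans.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct page; only the two bits we care about. */
struct sketch_page {
	bool mlocked;	/* models the PageMlocked flag */
	bool on_lru;	/* models LRU membership */
};

static long nr_mlock;	/* models the NR_MLOCK zone counter */

/*
 * Mark a page mlocked. The counter moves only on an actual 0 -> 1
 * transition, so calling this twice cannot double count.
 */
static bool sketch_mlock_add(struct sketch_page *page)
{
	if (page->mlocked)
		return false;	/* no state change, no accounting */
	page->mlocked = true;
	page->on_lru = false;	/* mlocked pages leave the LRU */
	nr_mlock++;
	return true;
}

/*
 * Drop one VM_LOCKED reference. 'other_locked_vmas' is the number of
 * remaining vmas that map the page with VM_LOCKED set; the page goes
 * back to the LRU only when that number is zero.
 */
static bool sketch_mlock_remove(struct sketch_page *page, int other_locked_vmas)
{
	if (!page->mlocked || other_locked_vmas > 0)
		return false;	/* still locked elsewhere, or never locked */
	page->mlocked = false;
	page->on_lru = true;
	nr_mlock--;
	return true;
}
```

Tying the counter to the flag transition, rather than to each mlock or munlock call, is what keeps the count equal to the number of flagged pages in this model.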
Re: 2.6.20-rc7: known regressions (v2) (part 1)
Adrian Bunk wrote:
> This email lists some known regressions in 2.6.20-rc7 compared to 2.6.19
> that are not yet fixed in Linus' tree.
>
> If you find your name in the Cc header, you are either submitter of one
> of the bugs, maintainer of an affected subsystem or driver, a patch
> of yours caused a breakage or I'm considering you in any other way possibly
> involved with one or more of these issues.

> Subject    : e1000: 82571EB/82572EI PCI-E cards: link is always down
>              (MSI related)
> References : http://lkml.org/lkml/2007/1/16/27
>              http://lkml.org/lkml/2007/1/17/182
> Submitter  : Allen Parker <[EMAIL PROTECTED]>
>              Adam Kropelin <[EMAIL PROTECTED]>
> Handled-By : Auke Kok <[EMAIL PROTECTED]>
> Status     : problem is being debugged

I probably can't fix this bug. Not only do I doubt that the e1000 driver is
at fault here, I don't have a system with this particular chipset. Most
likely the regression comes from a combination of MSI layer rewrites and
possibly platform issues. We've seen many reports that are similar and all
are on the platform type mentioned here. I really don't want to point
fingers here either.

None of the MSI code in e1000 has changed significantly either. As far as I
can see, the msi code in e1000 has not changed since 2.6.18. Nonetheless
there's no way I can debug any of this without a system.

I will address the fact that we are lacking any of these systems to test on,
but that is not going to get this issue handled (not to mention soon) in the
way it needs to be.

I strongly encourage the people on the linux-pci list to help out, I'll
trace the e1000 driver for suspicious activity (again), but I run countless
tests on the latest trees and nothing has shown up recently, other than Eric
Biederman's msi irq reclaim leak fix.

Perhaps Adam can git-bisect this issue? Adam?

Cheers,

Auke
Re: Please revert "fix typo in geode_configre()@cyrix.c"
Hi.

Sorry, I'm late. I'll resend the patch against 2.6.19.

The original code doesn't write the modified value back to the CCR4
register. This patch writes the value back to the register.

diff -Narup linux-2.6.19.orig/arch/i386/kernel/cpu/cyrix.c linux-2.6.19/arch/i386/kernel/cpu/cyrix.c
--- linux-2.6.19.orig/arch/i386/kernel/cpu/cyrix.c	2006-11-30 06:57:37.0 +0900
+++ linux-2.6.19/arch/i386/kernel/cpu/cyrix.c	2007-02-03 14:57:35.0 +0900
@@ -161,19 +161,19 @@ static void __cpuinit set_cx86_inc(void)
 static void __cpuinit geode_configure(void)
 {
 	unsigned long flags;
-	u8 ccr3, ccr4;
+	u8 ccr3;
 	local_irq_save(flags);
 
 	/* Suspend on halt power saving and enable #SUSP pin */
 	setCx86(CX86_CCR2, getCx86(CX86_CCR2) | 0x88);
 
 	ccr3 = getCx86(CX86_CCR3);
-	setCx86(CX86_CCR3, (ccr3 & 0x0f) | 0x10);	/* Enable */
+	setCx86(CX86_CCR3, (ccr3 & 0x0f) | 0x10);	/* enable MAPEN */
 
-	ccr4 = getCx86(CX86_CCR4);
-	ccr4 |= 0x38;		/* FPU fast, DTE cache, Mem bypass */
-
-	setCx86(CX86_CCR3, ccr3);
+
+	/* FPU fast, DTE cache, Mem bypass */
+	setCx86(CX86_CCR4, getCx86(CX86_CCR4) | 0x38);
+	setCx86(CX86_CCR3, ccr3);			/* disable MAPEN */
 
 	set_cx86_memwb();
 	set_cx86_reorder();
@@ -415,15 +415,14 @@ static void __cpuinit cyrix_identify(str
 
 	if (dir0 == 5 || dir0 == 3)
 	{
-		unsigned char ccr3, ccr4;
+		unsigned char ccr3;
 		unsigned long flags;
 		printk(KERN_INFO "Enabling CPUID on Cyrix processor.\n");
 		local_irq_save(flags);
 		ccr3 = getCx86(CX86_CCR3);
-		setCx86(CX86_CCR3, (ccr3 & 0x0f) | 0x10); /* enable MAPEN */
-		ccr4 = getCx86(CX86_CCR4);
-		setCx86(CX86_CCR4, ccr4 | 0x80); /* enable cpuid */
-		setCx86(CX86_CCR3, ccr3); /* disable MAPEN */
+		setCx86(CX86_CCR3, (ccr3 & 0x0f) | 0x10);	/* enable MAPEN */
+		setCx86(CX86_CCR4, getCx86(CX86_CCR4) | 0x80);	/* enable cpuid */
+		setCx86(CX86_CCR3, ccr3);			/* disable MAPEN */
 		local_irq_restore(flags);
 	}
 }

On Fri, 2 Feb 2007 13:18:54 -0800
Andrew Morton <[EMAIL PROTECTED]> wrote:

> On Fri, 2 Feb 2007 10:12:36 -0500
> [EMAIL PROTECTED] (Lennart Sorensen) wrote:
>
> > On Fri, Feb 02, 2007 at 12:05:43AM -0800, Andrew Morton wrote:
> > > On Fri, 2 Feb 2007 07:29:41 +0100 Adrian Bunk <[EMAIL PROTECTED]> wrote:
> > >
> > > > Linus, please revert commit e4f0ae0ea63caceff37a13f281a72652b7ea71ba
> > > >
> > >
> > > Yup.
> > >
> > > That discussion seems to have died. The 2.6.19 code looks rather silly,
> > > but presumably it passed someone's testing at some stage.
> >
> > The discussion ended because the last patch seemed to be correct to
> > everyone involved in the discussion. At least that is my understanding.
> > Of course I am just one of the users affected by the patch.
>
> The discussion ended with me asking for someone to send a patch. That
> hasn't happened yet. I don't want to have to troll through 20-30 messages
> and try to work out what patch we ended up with - that's the way in which
> mistakes occur.
>
> Linus has now reverted e4f0ae0ea63caceff37a13f281a72652b7ea71ba. Now,
> please, could someone send a patch against either current -git or against
> 2.6.19? One which includes a description of what it does, and why.
>
> Thanks.

-- 
TAKADA <[EMAIL PROTECTED]>
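Editor's sketch: the bug the patch above addresses is a read-modify-write that stops after the "modify" step. The toy model below shows the difference. The `sk_` helpers and the register array are hypothetical stand-ins for `getCx86()`/`setCx86()`, and the rule that CCR4 is only writable while the upper nibble of CCR3 is 0x1 (MAPEN) is an assumption of this model, taken from the "enable MAPEN" comments in the patch.

```c
#include <stdint.h>

/* Hypothetical register file standing in for the Cyrix config registers. */
enum { SK_CCR3 = 0, SK_CCR4 = 1, SK_NREGS };
static uint8_t sk_regs[SK_NREGS];

static uint8_t sk_get(int reg)
{
	return sk_regs[reg];
}

static void sk_set(int reg, uint8_t val)
{
	/* Model assumption: CCR4 writes are ignored unless MAPEN is set. */
	if (reg == SK_CCR4 && (sk_regs[SK_CCR3] & 0xf0) != 0x10)
		return;
	sk_regs[reg] = val;
}

/* Shape of the 2.6.19 code: the new CCR4 value never reaches the register. */
static void sk_configure_buggy(void)
{
	uint8_t ccr3 = sk_get(SK_CCR3);
	uint8_t ccr4;

	sk_set(SK_CCR3, (ccr3 & 0x0f) | 0x10);	/* enable MAPEN */
	ccr4 = sk_get(SK_CCR4);
	ccr4 |= 0x38;				/* modified only in the local copy */
	(void)ccr4;
	sk_set(SK_CCR3, ccr3);			/* disable MAPEN */
}

/* Shape of the patched code: read, modify and write back in one step. */
static void sk_configure_fixed(void)
{
	uint8_t ccr3 = sk_get(SK_CCR3);

	sk_set(SK_CCR3, (ccr3 & 0x0f) | 0x10);		/* enable MAPEN */
	sk_set(SK_CCR4, sk_get(SK_CCR4) | 0x38);	/* write the bits back */
	sk_set(SK_CCR3, ccr3);				/* disable MAPEN */
}
```

Both variants restore CCR3 to its original value; only the fixed one leaves the 0x38 bits set in CCR4.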
Re: SATA exceptions with 2.6.20-rc5
Björn Steinbrink wrote:
> On 2007.01.24 01:39:23 +0100, Björn Steinbrink wrote:
>> On 2007.01.23 17:18:43 -0600, Robert Hancock wrote:
>>> Larry Walton wrote:
>>>> The last patch (sata_nv-force-int-dev-in-interrupt.patch)
>>>> seems to have fixed the problem. Much appreciated, thank you.
>>>> I'd consider it a must have in 2.6.20.
>>> Can any of the rest of you that have been seeing this problem also
>>> confirm that this fixes it?
>> Seems to work for me, uptime is about an hour now and no exception yet.
>> Had the stress test running for only about 10 minutes, but I usually got
>> an exception within an hour even during plain irssi usage, so I'm quite
>> confident that the patch fixes it.
>
> Or maybe not :( Just got an exception on 2.6.20-rc6. Took 4 days of
> uptime to trigger, so it's just a lot harder to trigger now.

Same exception details as before? There's a patch in -mm
(sata_nv-use-adma-for-nodata-commands.patch) which should hopefully
avoid this problem for the cache flush commands, at least - can you try
that one out? You'll have to apply the other sata_nv patches in -mm
first, i.e. this order:

http://www2.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.20-rc6/2.6.20-rc6-mm3/broken-out/sata_nv-cleanup-adma-error-handling-v2.patch
http://www2.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.20-rc6/2.6.20-rc6-mm3/broken-out/sata_nv-cleanup-adma-error-handling-v2-cleanup.patch
http://www2.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.20-rc6/2.6.20-rc6-mm3/broken-out/sata_nv-use-adma-for-nodata-commands.patch

-- 
Robert Hancock      Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/
Re: [BUG] Unable to handle kernel NULL pointer dereference...as_move_to_dispatch+0x11/0x135
On Fri, 02 Feb 2007, Randy Dunlap wrote:

> On Fri, 2 Feb 2007 16:25:41 -0800 Andrew Morton wrote:
>
> > On Fri, 2 Feb 2007 12:56:30 -0800
> > Andrew Vasquez <[EMAIL PROTECTED]> wrote:
> >
> > > > > dt of=/dev/raw/raw1 procs=8 oncerr=abort bs=16k disable=stats
> > > > > limit=2m passes=100 pattern=iot dlimit=2048
> >
> > What is this mysterious dt command, btw?
>
> I expect that it's the one here:
>   http://www.scsifaq.org/RMiller_Tools/index.html

yep, that's the one.
Re: [Fastboot] [PATCH] kexec: Fix CONFIG_SMP=n compilation (ia64)
On Fri, Feb 02, 2007 at 08:53:00PM +0900, Magnus Damm wrote:
> On 2/2/07, Magnus Damm <[EMAIL PROTECTED]> wrote:
> > On 2/2/07, Andrew Morton <[EMAIL PROTECTED]> wrote:
> > > Magnus Damm <[EMAIL PROTECTED]> wrote:
> > >
> > > > kexec: Fix CONFIG_SMP=n compilation (ia64)
> > > >
> > > > This patch makes it possible to compile kexec for ia64 without SMP
> > > > support.
> > > > --- 0002/arch/ia64/kernel/machine_kexec.c
> > > > +++ work/arch/ia64/kernel/machine_kexec.c	2007-02-01 12:35:46.0 +0900
> > > > @@ -70,12 +70,14 @@ void machine_kexec_cleanup(struct kimage
> > > >
> > > >  void machine_shutdown(void)
> > > >  {
> > > > +#ifdef CONFIG_SMP
> > > >  	int cpu;
> > > >
> > > >  	for_each_online_cpu(cpu) {
> > > >  		if (cpu != smp_processor_id())
> > > >  			cpu_down(cpu);
> > > >  	}
> > > > +#endif
> > > >  	kexec_disable_iosapic();
> > > >  }
> > >
> > > hm. I suspect this one should have been #ifndef CONFIG_HOTPLUG_CPU?
> >
> > Re-reading this I assume you mean #ifdef CONFIG_HOTPLUG_CPU.
>
> I would be happy to resend a new updated version of the patch, but I
> wonder if it may be better to fail miserably during the build than
> fail silently in the case of CONFIG_SMP=y but CONFIG_HOTPLUG_CPU=n.

There used to be alternate code for the CONFIG_SMP + !CONFIG_HOTPLUG_CPU
case, but this was removed because it was determined to be flaky and not
maintainable (I can dig up the threads if you want). I think that this
means that if we have CONFIG_KEXEC and CONFIG_SMP then CONFIG_HOTPLUG_CPU
is required. I think this is expressible in Kconfig somehow.

-- 
Horms
  H: http://www.vergenet.net/~horms/
  W: http://www.valinux.co.jp/en/
[PATCH] Fix d_path for lazy unmounts
Hello, here is a bugfix to d_path. Please apply (after 2.6.20). First, when d_path() hits a lazily unmounted mount point, it tries to prepend the name of the lazily unmounted dentry to the path name. It gets this wrong, and also overwrites the slash that separates the name from the following pathname component. This is demonstrated by the attached test case, which prints "getcwd returned d_path-bugsubdir" with the bug. The correct result would be "getcwd returned d_path-bug/subdir". It could be argued that the name of the root dentry should not be part of the result of d_path in the first place. On the other hand, what the unconnected namespace was once reachable as may provide some useful hints to users, and so that seems okay. Second, it isn't always possible to tell from the __d_path result whether the specified root and rootmnt (i.e., the chroot) was reached: lazy unmounts of bind mounts will produce a path that does start with a non-slash so we can tell from that, but other lazy unmounts will produce a path that starts with a slash, just like "ordinary" paths. The attached patch cleans up __d_path() to fix the bug with overlapping pathname components. It also adds a @fail_deleted argument, which allows to get rid of some of the mess in sys_getcwd(). Grabbing the dcache_lock can then also be moved into __d_path(). The patch also makes sure that paths will only start with a slash for paths which are connected to the root and rootmnt. The @fail_deleted argument could be added to d_path() as well: this would allow callers to recognize deleted files, without having to resort to the ambiguous check for the " (deleted)" string at the end of the pathnames. This is not currently done, but it might be worthwhile. 
Signed-off-by: Andreas Gruenbacher <[EMAIL PROTECTED]> Index: linux-2.6/fs/dcache.c === --- linux-2.6.orig/fs/dcache.c +++ linux-2.6/fs/dcache.c @@ -1739,45 +1739,43 @@ shouldnt_be_hashed: * @rootmnt: vfsmnt to which the root dentry belongs * @buffer: buffer to return value in * @buflen: buffer length + * @fail_deleted: what to return for deleted files * - * Convert a dentry into an ASCII path name. If the entry has been deleted - * the string " (deleted)" is appended. Note that this is ambiguous. + * Convert a dentry into an ASCII path name. If the entry has been deleted, + * then if @fail_deleted is true, ERR_PTR(-ENOENT) is returned. Otherwise, + * the the string " (deleted)" is appended. Note that this is ambiguous. * - * Returns the buffer or an error code if the path was too long. - * - * "buflen" should be positive. Caller holds the dcache_lock. + * Returns the buffer or an error code. */ -static char * __d_path( struct dentry *dentry, struct vfsmount *vfsmnt, - struct dentry *root, struct vfsmount *rootmnt, - char *buffer, int buflen) +static char *__d_path(struct dentry *dentry, struct vfsmount *vfsmnt, + struct dentry *root, struct vfsmount *rootmnt, + char *buffer, int buflen, int fail_deleted) { - char * end = buffer+buflen; - char * retval; + char *end = buffer + buflen - 1; int namelen; - *--end = '\0'; + buffer = end; + if (buflen < 2) + return ERR_PTR(-ENAMETOOLONG); + *end = '\0'; buflen--; + + spin_lock(_lock); if (!IS_ROOT(dentry) && d_unhashed(dentry)) { - buflen -= 10; - end -= 10; - if (buflen < 0) + if (fail_deleted) { + buffer = ERR_PTR(-ENOENT); + goto out; + } + if (buflen < 10) goto Elong; - memcpy(end, " (deleted)", 10); + buflen -= 10; + buffer -= 10; + memcpy(buffer, " (deleted)", 10); } - - if (buflen < 1) - goto Elong; - /* Get '/' right */ - retval = end-1; - *retval = '/'; - - for (;;) { + while (dentry != root || vfsmnt != rootmnt) { struct dentry * parent; - if (dentry == root && vfsmnt == rootmnt) - break; if (dentry == 
vfsmnt->mnt_root || IS_ROOT(dentry)) { - /* Global root? */ spin_lock(_lock); if (vfsmnt->mnt_parent == vfsmnt) { spin_unlock(_lock); @@ -1791,33 +1789,49 @@ static char * __d_path( struct dentry *d parent = dentry->d_parent; prefetch(parent); namelen = dentry->d_name.len; - buflen -= namelen + 1; - if (buflen < 0) + if (buflen <= namelen) goto Elong; - end -= namelen; - memcpy(end, dentry->d_name.name, namelen); -
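Note: the following is a minimal sketch of the technique the description above relies on, namely building a path from the end of a buffer by prepending one component at a time while keeping the separating slash intact. It is not the kernel's __d_path(); prepend(), build_path() and the rooted flag are hypothetical names chosen for illustration, and a plain array of names stands in for the dentry chain.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy 'name' in front of *pos inside buf and move *pos backwards.
 * Returns 0 on success, -1 if there is not enough room left.
 */
static int prepend(char *buf, char **pos, const char *name, size_t namelen)
{
	assert(buf != NULL && pos != NULL && *pos >= buf);
	if ((size_t)(*pos - buf) < namelen)
		return -1;
	*pos -= namelen;
	memcpy(*pos, name, namelen);
	return 0;
}

/*
 * names[] is ordered from the topmost component to the leaf. The path is
 * assembled leaf first. A slash is prepended before every component
 * except the topmost one, which only gets a slash when 'rooted' is set
 * (that is, when the walk really reached the root).
 */
static char *build_path(char *buf, size_t buflen,
			const char *const *names, int n, int rooted)
{
	char *pos;
	int i;

	if (buflen < 1)
		return NULL;
	pos = buf + buflen - 1;
	*pos = '\0';
	for (i = n - 1; i >= 0; i--) {
		if (prepend(buf, &pos, names[i], strlen(names[i])))
			return NULL;
		if ((i > 0 || rooted) && prepend(buf, &pos, "/", 1))
			return NULL;
	}
	return pos;
}
```

With the two names from the test case in the description, an unrooted walk yields "d_path-bug/subdir", so the leading character alone tells the caller whether the root was reached.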
Re: [PATCH 2.6.19.2] SCSI sd: udev accessing an uninitialized scsi_disk results in a crash
On Fri, 2007-02-02 at 17:56 -0800, Greg KH wrote:
> > Thanks - I'll queue this up for 2.6.20 also.
>
> No objection from me, as long as James says this is ok.
>
> I wonder why we haven't noticed this in the past?

Because the race is so small ...

I'll queue it in the rc-fixes tree .. I have three others for 2.6.20

James
Re: [PATCH 1/1] - Altix: more ACPI PRT support
On Friday 02 February 2007 20:37, Andrew Morton wrote:
> On Fri, 02 Feb 2007 14:54:12 -0600
> John Keller <[EMAIL PROTECTED]> wrote:
>
> > The SN Altix platform does not conform to the
> > IOSAPIC IRQ routing model. Add code in acpi_unregister_gsi()
> > to check if (acpi_irq_model == ACPI_IRQ_MODEL_PLATFORM) and
> > return.
> >
> > Signed-off-by: John Keller <[EMAIL PROTECTED]>
> > ---
> >
> > Due to an oversight, this code was not added previously when
> > similar code was added to acpi_register_gsi().
> >
> > http://marc.theaimsgroup.com/?l=linux-acpi=116680983430121=2
> >
> > arch/ia64/kernel/acpi.c |3 +++
> > 1 file changed, 3 insertions(+)
> >
> >
> > Index: linux-2.6/arch/ia64/kernel/acpi.c
> > ===
> > --- linux-2.6.orig/arch/ia64/kernel/acpi.c 2007-02-02 14:44:31.0 -0600
> > +++ linux-2.6/arch/ia64/kernel/acpi.c 2007-02-02 14:47:44.658143727 -0600
> > @@ -609,6 +609,9 @@ EXPORT_SYMBOL(acpi_register_gsi);
> >
> >  void acpi_unregister_gsi(u32 gsi)
> >  {
> > +	if (acpi_irq_model == ACPI_IRQ_MODEL_PLATFORM)
> > +		return;
> > +
> >  	iosapic_unregister_intr(gsi);
> >  }
>
> Given that the December 22 patch appears to be in mainline, and that this
> patch is simple, I shall cheerily bypass maintainers and send it in for
> 2.6.20.

Yep.

Acked-by: Len Brown <[EMAIL PROTECTED]>

thanks,
-Len
[patch 17/59] PCI: prevent down_read when pci_devices is empty
-stable review patch. If anyone has any objections, please let us know.
--
From: Ard van Breemen <[EMAIL PROTECTED]>

The pci_find_subsys gets called very early by obsolete ide setup
parameters. This is a bogus call since pci is not initialized yet, so the
list is empty. But in the mean time, interrupts get enabled by down_read.
This can result in a kernel panic when the irq controller gets initialized.

This patch checks if the device list is empty before taking the semaphore,
and hence will not enable irq's. Furthermore it will inform that it is
called while pci_devices is empty as a reminder that the ide code needs to
be fixed.

The pci_get_subsys can get called in the same manner, and as such is
patched in the same manner.

[EMAIL PROTECTED]: cleanups]
Signed-off-by: Ard van Breemen <[EMAIL PROTECTED]>
Cc: Greg KH <[EMAIL PROTECTED]>
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
Signed-off-by: Linus Torvalds <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
This is the other half of the fix for bug #7505

 drivers/pci/search.c | 24
 1 file changed, 24 insertions(+)

--- linux-2.6.19.2.orig/drivers/pci/search.c
+++ linux-2.6.19.2/drivers/pci/search.c
@@ -193,6 +193,18 @@ static struct pci_dev * pci_find_subsys(
 	struct pci_dev *dev;

 	WARN_ON(in_interrupt());
+
+	/*
+	 * pci_find_subsys() can be called on the ide_setup() path, super-early
+	 * in boot. But the down_read() will enable local interrupts, which
+	 * can cause some machines to crash. So here we detect and flag that
+	 * situation and bail out early.
+	 */
+	if (unlikely(list_empty(_devices))) {
+		printk(KERN_INFO "pci_find_subsys() called while pci_devices "
+				"is still empty\n");
+		return NULL;
+	}
 	down_read(_bus_sem);
 	n = from ? from->global_list.next : pci_devices.next;

@@ -259,6 +271,18 @@ pci_get_subsys(unsigned int vendor, unsi
 	struct pci_dev *dev;

 	WARN_ON(in_interrupt());
+
+	/*
+	 * pci_get_subsys() can potentially be called by drivers super-early
+	 * in boot. But the down_read() will enable local interrupts, which
+	 * can cause some machines to crash. So here we detect and flag that
+	 * situation and bail out early.
+	 */
+	if (unlikely(list_empty(_devices))) {
+		printk(KERN_NOTICE "pci_get_subsys() called while pci_devices "
+				"is still empty\n");
+		return NULL;
+	}
 	down_read(_bus_sem);
 	n = from ? from->global_list.next : pci_devices.next;
[patch 09/59] NETFILTER: arp_tables: fix userspace compilation
-stable review patch. If anyone has any objections, please let us know.
--
From: Patrick McHardy <[EMAIL PROTECTED]>

The included patch translates arpt_counters to xt_counters, making
userspace arptables compile against recent kernels.

Signed-off-by: Bart De Schuymer <[EMAIL PROTECTED]>
Signed-off-by: Patrick McHardy <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
 include/linux/netfilter_arp/arp_tables.h |1 +
 1 file changed, 1 insertion(+)

--- linux-2.6.19.2.orig/include/linux/netfilter_arp/arp_tables.h
+++ linux-2.6.19.2/include/linux/netfilter_arp/arp_tables.h
@@ -190,6 +190,7 @@ struct arpt_replace

 /* The argument to ARPT_SO_ADD_COUNTERS. */
 #define arpt_counters_info xt_counters_info
+#define arpt_counters xt_counters

 /* The argument to ARPT_SO_GET_ENTRIES. */
 struct arpt_get_entries
[patch 18/59] IPV6 MCAST: Fix joining all-node multicast group on device initialization.
-stable review patch. If anyone has any objections, please let us know.
--
From: YOSHIFUJI Hideaki <[EMAIL PROTECTED]>

Join all-node multicast group after assignment of dev->ip6_ptr
because it must be assigned when ipv6_dev_mc_inc() is called.
This fixes Bug#7817, reported by <[EMAIL PROTECTED]>.

Closes: 7817
Signed-off-by: YOSHIFUJI Hideaki <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
 net/ipv6/addrconf.c |6 ++
 net/ipv6/mcast.c|6 --
 2 files changed, 6 insertions(+), 6 deletions(-)

--- linux-2.6.19.2.orig/net/ipv6/addrconf.c
+++ linux-2.6.19.2/net/ipv6/addrconf.c
@@ -341,6 +341,7 @@ void in6_dev_finish_destroy(struct inet6
 static struct inet6_dev * ipv6_add_dev(struct net_device *dev)
 {
 	struct inet6_dev *ndev;
+	struct in6_addr maddr;

 	ASSERT_RTNL();

@@ -425,6 +426,11 @@ static struct inet6_dev * ipv6_add_dev(s
 #endif
 	/* protected by rtnl_lock */
 	rcu_assign_pointer(dev->ip6_ptr, ndev);
+
+	/* Join all-node multicast group */
+	ipv6_addr_all_nodes();
+	ipv6_dev_mc_inc(dev, );
+
 	return ndev;
 }

--- linux-2.6.19.2.orig/net/ipv6/mcast.c
+++ linux-2.6.19.2/net/ipv6/mcast.c
@@ -2252,8 +2252,6 @@ void ipv6_mc_up(struct inet6_dev *idev)

 void ipv6_mc_init_dev(struct inet6_dev *idev)
 {
-	struct in6_addr maddr;
-
 	write_lock_bh(>lock);
 	rwlock_init(>mc_lock);
 	idev->mc_gq_running = 0;
@@ -2269,10 +2267,6 @@ void ipv6_mc_init_dev(struct inet6_dev *
 	idev->mc_maxdelay = IGMP6_UNSOLICITED_IVAL;
 	idev->mc_v1_seen = 0;
 	write_unlock_bh(>lock);
-
-	/* Add all-nodes address. */
-	ipv6_addr_all_nodes();
-	ipv6_dev_mc_inc(idev->dev, );
 }

 /*
Re: [stable] [patch 00/59] -stable review
* Chris Wright ([EMAIL PROTECTED]) wrote:
> Responses should be made by Mon Feb 3 02:30 UTC 2007

Yes, that's Mon Feb 5 (thanks to those on their toes ;-)

And the roll-up will be available at:

http://www.kernel.org/pub/linux/kernel/people/chrisw/stable/patch-2.6.19.3-rc1.{gz,bz2}

once mirroring finishes.

thanks,
-chris
[patch 03/59] Check for populated zone in __drain_pages
-stable review patch. If anyone has any objections, please let us know.
--
From: Christoph Lameter <[EMAIL PROTECTED]>

Both process_zones() and drain_node_pages() check for populated zones
before touching pagesets. However, __drain_pages does not do so. This may
result in a NULL pointer dereference for pagesets in unpopulated zones if a
NUMA setup is combined with cpu hotplug.

Initially the unpopulated zone has the pcp pointers pointing to the boot
pagesets. Since the zone is not populated the boot pageset pointers will
not be changed during page allocator and slab bootstrap.

If a cpu is later brought down (first call to __drain_pages()) then the pcp
pointers for cpus in unpopulated zones are set to NULL since __drain_pages
does not first check for an unpopulated zone.

If the cpu is then brought up again then we call process_zones() which will
ignore the unpopulated zone. So the pageset pointers will still be NULL.

If the cpu is then again brought down then __drain_pages will attempt to
drain pages by following the NULL pageset pointer for unpopulated zones.

Signed-off-by: Christoph Lameter <[EMAIL PROTECTED]>
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
Signed-off-by: Linus Torvalds <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
http://git.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=f2e12bb272f2544d1504f982270e90ae3dcc4ff2

 mm/page_alloc.c |3 +++
 1 file changed, 3 insertions(+)

--- linux-2.6.19.2.orig/mm/page_alloc.c
+++ linux-2.6.19.2/mm/page_alloc.c
@@ -710,6 +710,9 @@ static void __drain_pages(unsigned int c
 	for_each_zone(zone) {
 		struct per_cpu_pageset *pset;

+		if (!populated_zone(zone))
+			continue;
+
 		pset = zone_pcp(zone, cpu);
 		for (i = 0; i < ARRAY_SIZE(pset->pcp); i++) {
 			struct per_cpu_pages *pcp;
[patch 08/59] NETFILTER: tcp conntrack: fix IP_CT_TCP_FLAG_CLOSE_INIT value
-stable review patch. If anyone has any objections, please let us know.
--
From: Patrick McHardy <[EMAIL PROTECTED]>

IP_CT_TCP_FLAG_CLOSE_INIT is a flag and should have a value of 0x4 instead
of 0x3, which is IP_CT_TCP_FLAG_WINDOW_SCALE | IP_CT_TCP_FLAG_SACK_PERM.

Signed-off-by: Patrick McHardy <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
 include/linux/netfilter/nf_conntrack_tcp.h |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- linux-2.6.19.2.orig/include/linux/netfilter/nf_conntrack_tcp.h
+++ linux-2.6.19.2/include/linux/netfilter/nf_conntrack_tcp.h
@@ -25,7 +25,7 @@ enum tcp_conntrack {
 #define IP_CT_TCP_FLAG_SACK_PERM 0x02

 /* This sender sent FIN first */
-#define IP_CT_TCP_FLAG_CLOSE_INIT 0x03
+#define IP_CT_TCP_FLAG_CLOSE_INIT 0x04

 #ifdef __KERNEL__
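Note: a small sketch of why 0x03 cannot serve as a flag next to 0x01 and 0x02. The macro names below are shortened stand-ins for the IP_CT_TCP_FLAG_* definitions, the 0x01 value for the window-scale flag is inferred from the description above, and is_single_bit() and flag_is_set() are hypothetical helpers, not kernel functions.

```c
#include <assert.h>
#include <stdbool.h>

#define FLAG_WINDOW_SCALE   0x01u
#define FLAG_SACK_PERM      0x02u
#define FLAG_CLOSE_INIT_OLD 0x03u /* old value: overlaps both flags above */
#define FLAG_CLOSE_INIT_NEW 0x04u /* fixed value: a bit of its own */

/* A value works as an independent flag only if exactly one bit is set. */
static bool is_single_bit(unsigned int v)
{
	return v != 0 && (v & (v - 1)) == 0;
}

/* The usual flag test: is any bit of 'flag' present in 'flags'? */
static bool flag_is_set(unsigned int flags, unsigned int flag)
{
	assert(flag != 0);
	return (flags & flag) != 0;
}
```

With the old value, a connection that only negotiated SACK would also test positive for "close init", because the masks share a bit.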
[patch 20/59] NETFILTER: ctnetlink: fix leak in ctnetlink_create_conntrack error path
-stable review patch. If anyone has any objections, please let us know.
--
From: Patrick McHardy <[EMAIL PROTECTED]>

---
 net/ipv4/netfilter/ip_conntrack_netlink.c |2 +-
 net/netfilter/nf_conntrack_netlink.c |2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

--- linux-2.6.19.2.orig/net/ipv4/netfilter/ip_conntrack_netlink.c
+++ linux-2.6.19.2/net/ipv4/netfilter/ip_conntrack_netlink.c
@@ -955,7 +955,7 @@ ctnetlink_create_conntrack(struct nfattr
 	if (cda[CTA_PROTOINFO-1]) {
 		err = ctnetlink_change_protoinfo(ct, cda);
 		if (err < 0)
-			return err;
+			goto err;
 	}

 #if defined(CONFIG_IP_NF_CONNTRACK_MARK)
--- linux-2.6.19.2.orig/net/netfilter/nf_conntrack_netlink.c
+++ linux-2.6.19.2/net/netfilter/nf_conntrack_netlink.c
@@ -972,7 +972,7 @@ ctnetlink_create_conntrack(struct nfattr
 	if (cda[CTA_PROTOINFO-1]) {
 		err = ctnetlink_change_protoinfo(ct, cda);
 		if (err < 0)
-			return err;
+			goto err;
 	}

 #if defined(CONFIG_NF_CONNTRACK_MARK)
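Note: the patch above carries no description, so here is a hedged sketch of the pattern its subject line names, an early "return err" that skips the cleanup label and leaks the object allocated earlier. Everything in it (obj_alloc(), obj_free(), create(), the live_objects counter) is hypothetical and only models the control flow; it is not the ctnetlink code.

```c
#include <assert.h>
#include <stdlib.h>

static int live_objects; /* allocations that have not been freed yet */

static int *obj_alloc(void)
{
	int *p = malloc(sizeof *p);

	if (p)
		live_objects++;
	return p;
}

static void obj_free(int *p)
{
	free(p);
	live_objects--;
}

/*
 * 'fail_step' forces a late setup step to fail. With use_goto == 0 the
 * function returns directly and the object leaks; with use_goto != 0 it
 * jumps to the cleanup label, as the patch does.
 */
static int create(int fail_step, int use_goto, int **out)
{
	int err = 0;
	int *obj = obj_alloc();

	assert(out != NULL);
	if (!obj)
		return -1;
	if (fail_step) {
		err = -1;
		if (!use_goto)
			return err; /* bug: obj is never freed */
		goto err;
	}
	*out = obj;
	return 0;
err:
	obj_free(obj);
	return err;
}
```

Routing every failure after the allocation through one label keeps the cleanup in a single place, which is why the one-word change is enough.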
[patch 22/59] ALSA hda-codec - Fix NULL dereference in generic hda code
-stable review patch. If anyone has any objections, please let us know.
--
From: Takashi Iwai <[EMAIL PROTECTED]>

Fix NULL dereference in hda_generic.c.

Signed-off-by: Takashi Iwai <[EMAIL PROTECTED]>
Signed-off-by: Jaroslav Kysela <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
commit 6afeb11de5b28e47adea1459c35e598bb98424d6
tree 07f4dba0e2fb094b448eb9863de7b6364b768add
parent f9cc8a8b1887e6e2bb430405d0a4f9b5fb39fa5d
author Takashi Iwai <[EMAIL PROTECTED]> Mon, 18 Dec 2006 16:16:04 +0100
committer Jaroslav Kysela <[EMAIL PROTECTED]> Tue, 09 Jan 2007 09:06:17 +0100

 sound/pci/hda/hda_generic.c |5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

--- linux-2.6.19.2.orig/sound/pci/hda/hda_generic.c
+++ linux-2.6.19.2/sound/pci/hda/hda_generic.c
@@ -485,8 +485,9 @@ static const char *get_input_type(struct
 			return "Front Aux";
 		return "Aux";
 	case AC_JACK_MIC_IN:
-		if (node->pin_caps &
-		    (AC_PINCAP_VREF_80 << AC_PINCAP_VREF_SHIFT))
+		if (pinctl &&
+		    (node->pin_caps &
+		     (AC_PINCAP_VREF_80 << AC_PINCAP_VREF_SHIFT)))
 			*pinctl |= AC_PINCTL_VREF_80;
 		if ((location & 0x0f) == AC_JACK_LOC_FRONT)
 			return "Front Mic";
[patch 27/59] x86: Work around gcc 4.2 over aggressive optimizer
-stable review patch. If anyone has any objections, please let us know.
--
From: Andi Kleen <[EMAIL PROTECTED]>

The new PDA code uses a dummy _proxy_pda variable to describe memory
references to the PDA. It is never referenced in inline assembly, but
exists as input/output arguments. gcc 4.2 in some cases can CSE references
to this which causes unresolved symbols. Define it to zero to avoid this.

Signed-off-by: Andi Kleen <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
 arch/i386/kernel/vmlinux.lds.S |1 +
 arch/x86_64/kernel/vmlinux.lds.S |1 +
 2 files changed, 2 insertions(+)

--- linux-2.6.19.2.orig/arch/i386/kernel/vmlinux.lds.S
+++ linux-2.6.19.2/arch/i386/kernel/vmlinux.lds.S
@@ -13,6 +13,7 @@ OUTPUT_FORMAT("elf32-i386", "elf32-i386"
 OUTPUT_ARCH(i386)
 ENTRY(phys_startup_32)
 jiffies = jiffies_64;
+_proxy_pda = 0;

 PHDRS {
 	text PT_LOAD FLAGS(5);	/* R_E */
--- linux-2.6.19.2.orig/arch/x86_64/kernel/vmlinux.lds.S
+++ linux-2.6.19.2/arch/x86_64/kernel/vmlinux.lds.S
@@ -13,6 +13,7 @@ OUTPUT_FORMAT("elf64-x86-64", "elf64-x86
 OUTPUT_ARCH(i386:x86-64)
 ENTRY(phys_startup_64)
 jiffies_64 = jiffies;
+_proxy_pda = 0;
 PHDRS {
 	text PT_LOAD FLAGS(5);	/* R_E */
 	data PT_LOAD FLAGS(7);	/* RWE */
[patch 19/59] NETFILTER: ctnetlink: check for status attribute existence on conntrack creation
-stable review patch. If anyone has any objections, please let us know.
--
From: Pablo Neira Ayuso <[EMAIL PROTECTED]>

Check that status flags are available in the netlink message received
to create a new conntrack.

Fixes a crash in ctnetlink_create_conntrack when the CTA_STATUS attribute
is not present.

Signed-off-by: Pablo Neira Ayuso <[EMAIL PROTECTED]>
Signed-off-by: Patrick McHardy <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
 net/ipv4/netfilter/ip_conntrack_netlink.c |8 +---
 net/netfilter/nf_conntrack_netlink.c |8 +---
 2 files changed, 10 insertions(+), 6 deletions(-)

--- linux-2.6.19.2.orig/net/ipv4/netfilter/ip_conntrack_netlink.c
+++ linux-2.6.19.2/net/ipv4/netfilter/ip_conntrack_netlink.c
@@ -946,9 +946,11 @@ ctnetlink_create_conntrack(struct nfattr
 	ct->timeout.expires = jiffies + ct->timeout.expires * HZ;
 	ct->status |= IPS_CONFIRMED;

-	err = ctnetlink_change_status(ct, cda);
-	if (err < 0)
-		goto err;
+	if (cda[CTA_STATUS-1]) {
+		err = ctnetlink_change_status(ct, cda);
+		if (err < 0)
+			goto err;
+	}

 	if (cda[CTA_PROTOINFO-1]) {
 		err = ctnetlink_change_protoinfo(ct, cda);
--- linux-2.6.19.2.orig/net/netfilter/nf_conntrack_netlink.c
+++ linux-2.6.19.2/net/netfilter/nf_conntrack_netlink.c
@@ -963,9 +963,11 @@ ctnetlink_create_conntrack(struct nfattr
 	ct->timeout.expires = jiffies + ct->timeout.expires * HZ;
 	ct->status |= IPS_CONFIRMED;

-	err = ctnetlink_change_status(ct, cda);
-	if (err < 0)
-		goto err;
+	if (cda[CTA_STATUS-1]) {
+		err = ctnetlink_change_status(ct, cda);
+		if (err < 0)
+			goto err;
+	}

 	if (cda[CTA_PROTOINFO-1]) {
 		err = ctnetlink_change_protoinfo(ct, cda);
[patch 24/59] IB/iser: return error code when PDUs may not be sent
-stable review patch. If anyone has any objections, please let us know.
--
From: Erez Zilber <[EMAIL PROTECTED]>

iSER limits the number of outstanding PDUs to send. When this threshold is
reached, it should return an error code (-ENOBUFS) instead of setting the
suspend_tx bit (which should be used only by libiscsi). Without this fix,
during logout, open-iscsi over iSER tries to logout forever.

Signed-off-by: Erez Zilber <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
 drivers/infiniband/ulp/iser/iscsi_iser.c |4 ++--
 drivers/infiniband/ulp/iser/iser_initiator.c | 26 --
 2 files changed, 14 insertions(+), 16 deletions(-)

--- linux-2.6.19.2.orig/drivers/infiniband/ulp/iser/iscsi_iser.c
+++ linux-2.6.19.2/drivers/infiniband/ulp/iser/iscsi_iser.c
@@ -177,7 +177,7 @@ iscsi_iser_mtask_xmit(struct iscsi_conn
 	 * - if yes, the mtask is recycled at iscsi_complete_pdu
 	 * - if no, the mtask is recycled at iser_snd_completion
 	 */
-	if (error && error != -EAGAIN)
+	if (error && error != -ENOBUFS)
 		iscsi_conn_failure(conn, ISCSI_ERR_CONN_FAILED);

 	return error;
@@ -241,7 +241,7 @@ iscsi_iser_ctask_xmit(struct iscsi_conn
 		error = iscsi_iser_ctask_xmit_unsol_data(conn, ctask);

  iscsi_iser_ctask_xmit_exit:
-	if (error && error != -EAGAIN)
+	if (error && error != -ENOBUFS)
 		iscsi_conn_failure(conn, ISCSI_ERR_CONN_FAILED);
 	return error;
 }
--- linux-2.6.19.2.orig/drivers/infiniband/ulp/iser/iser_initiator.c
+++ linux-2.6.19.2/drivers/infiniband/ulp/iser/iser_initiator.c
@@ -304,18 +304,14 @@ int iser_conn_set_full_featured_mode(str
 static int
 iser_check_xmit(struct iscsi_conn *conn, void *task)
 {
-	int rc = 0;
 	struct iscsi_iser_conn *iser_conn = conn->dd_data;

-	write_lock_bh(conn->recv_lock);
 	if (atomic_read(_conn->ib_conn->post_send_buf_count) ==
 	    ISER_QP_MAX_REQ_DTOS) {
-		iser_dbg("%ld can't xmit task %p, suspending tx\n",jiffies,task);
-		set_bit(ISCSI_SUSPEND_BIT, >suspend_tx);
-		rc = -EAGAIN;
+		iser_dbg("%ld can't xmit task %p\n",jiffies,task);
+		return -ENOBUFS;
 	}
-	write_unlock_bh(conn->recv_lock);
-	return rc;
+	return 0;
 }


@@ -340,7 +336,7 @@ int iser_send_command(struct iscsi_conn
 		return -EPERM;
 	}
 	if (iser_check_xmit(conn, ctask))
-		return -EAGAIN;
+		return -ENOBUFS;

 	edtl = ntohl(hdr->data_length);

@@ -426,7 +422,7 @@ int iser_send_data_out(struct iscsi_conn
 	}

 	if (iser_check_xmit(conn, ctask))
-		return -EAGAIN;
+		return -ENOBUFS;

 	itt = ntohl(hdr->itt);
 	data_seg_len = ntoh24(hdr->dlength);
@@ -500,7 +496,7 @@ int iser_send_control(struct iscsi_conn
 	}

 	if (iser_check_xmit(conn,mtask))
-		return -EAGAIN;
+		return -ENOBUFS;

 	/* build the tx desc regd header and add it to the tx desc dto */
 	mdesc->type = ISCSI_TX_CONTROL;
@@ -609,6 +605,7 @@ void iser_snd_completion(struct iser_des
 	struct iscsi_iser_conn *iser_conn = ib_conn->iser_conn;
 	struct iscsi_conn *conn = iser_conn->iscsi_conn;
 	struct iscsi_mgmt_task *mtask;
+	int resume_tx = 0;

 	iser_dbg("Initiator, Data sent dto=0x%p\n", dto);

@@ -617,15 +614,16 @@ void iser_snd_completion(struct iser_des
 	if (tx_desc->type == ISCSI_TX_DATAOUT)
 		kmem_cache_free(ig.desc_cache, tx_desc);

+	if (atomic_read(_conn->ib_conn->post_send_buf_count) ==
+	    ISER_QP_MAX_REQ_DTOS)
+		resume_tx = 1;
+
 	atomic_dec(_conn->post_send_buf_count);

-	write_lock(conn->recv_lock);
-	if (conn->suspend_tx) {
+	if (resume_tx) {
 		iser_dbg("%ld resuming tx\n",jiffies);
-		clear_bit(ISCSI_SUSPEND_BIT, >suspend_tx);
 		scsi_queue_work(conn->session->host, >xmitwork);
 	}
-	write_unlock(conn->recv_lock);

 	if (tx_desc->type == ISCSI_TX_CONTROL) {
 		/* this arithmetic is legal by libiscsi dd_data allocation */
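Note: a minimal user-space sketch of the flow control the patch above switches to: refuse new work with -ENOBUFS once the limit of outstanding sends is reached, and have the completion path restart transmission only if the queue was full just before that completion. MAX_OUTSTANDING, struct tx_state and the three functions are hypothetical stand-ins; the real driver uses an atomic counter and queues xmit work instead of bumping a "resumes" count.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MAX_OUTSTANDING 4 /* stand-in for ISER_QP_MAX_REQ_DTOS */

struct tx_state {
	int outstanding; /* sends posted but not yet completed */
	int resumes;     /* how often the completion path restarted tx */
};

/* No state is changed on refusal; the caller simply retries later. */
static int check_xmit(const struct tx_state *s)
{
	assert(s != NULL);
	return s->outstanding == MAX_OUTSTANDING ? -ENOBUFS : 0;
}

static int send_pdu(struct tx_state *s)
{
	int rc = check_xmit(s);

	if (rc)
		return rc;
	s->outstanding++;
	return 0;
}

/* Sample "was the queue full?" before decrementing, as the patch does. */
static void send_done(struct tx_state *s)
{
	int resume_tx = (s->outstanding == MAX_OUTSTANDING);

	s->outstanding--;
	if (resume_tx)
		s->resumes++;
}
```

Because the sender no longer sets a suspend bit of its own, there is no shared flag that could stay set and stall a later logout.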
[patch 28/59] NETFILTER: Fix iptables ABI breakage on (at least) CRIS
-stable review patch. If anyone has any objections, please let us know.
--
From: Patrick McHardy <[EMAIL PROTECTED]>

With the introduction of x_tables we accidentally broke compatibility
by defining IPT_TABLE_MAXNAMELEN to XT_FUNCTION_MAXNAMELEN instead of
XT_TABLE_MAXNAMELEN, which is two bytes larger.

On most architectures it doesn't really matter since we don't have
any tables with names that long in the kernel and the structure
layout didn't change because of alignment requirements of following
members. On CRIS however (and other architectures that don't align
data) this changed the structure layout and thus broke compatibility
with old iptables binaries.

Changing it back will break compatibility with binaries compiled
against recent kernels again, but since the breakage has only been
there for three releases this seems like the better choice.

Spotted by Jonas Berlin <[EMAIL PROTECTED]>.

Signed-off-by: Patrick McHardy <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
 include/linux/netfilter_ipv4/ip_tables.h |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- linux-2.6.19.2.orig/include/linux/netfilter_ipv4/ip_tables.h
+++ linux-2.6.19.2/include/linux/netfilter_ipv4/ip_tables.h
@@ -28,7 +28,7 @@
 #include

 #define IPT_FUNCTION_MAXNAMELEN XT_FUNCTION_MAXNAMELEN
-#define IPT_TABLE_MAXNAMELEN XT_FUNCTION_MAXNAMELEN
+#define IPT_TABLE_MAXNAMELEN XT_TABLE_MAXNAMELEN
 #define ipt_match xt_match
 #define ipt_target xt_target
 #define ipt_table xt_table
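Note: a sketch of why the two-byte difference is invisible on most architectures but not on targets that do not align data. The lengths 30 and 32 are assumed values (the description only says the table length is two bytes larger), the struct and field names are illustrative, and the packed attribute is a GCC/Clang extension used here to imitate an unaligned layout.

```c
#include <assert.h>
#include <stddef.h>

#define FUNCTION_MAXNAMELEN 30 /* assumed value */
#define TABLE_MAXNAMELEN    32 /* assumed value, two bytes larger */

/* Natural alignment: the int after the name lands on a 4-byte boundary. */
struct aligned_short { char name[FUNCTION_MAXNAMELEN]; unsigned int valid_hooks; };
struct aligned_long  { char name[TABLE_MAXNAMELEN];    unsigned int valid_hooks; };

/* Packed layout: no padding is inserted, as on a target without alignment. */
struct packed_short { char name[FUNCTION_MAXNAMELEN]; unsigned int valid_hooks; } __attribute__((packed));
struct packed_long  { char name[TABLE_MAXNAMELEN];    unsigned int valid_hooks; } __attribute__((packed));

/* Offset of the member following the name, for each of the four layouts. */
static size_t hooks_offset(int long_name, int packed)
{
	assert(sizeof(unsigned int) == 4); /* the sketch assumes a 4-byte int */
	if (packed)
		return long_name ? offsetof(struct packed_long, valid_hooks)
				 : offsetof(struct packed_short, valid_hooks);
	return long_name ? offsetof(struct aligned_long, valid_hooks)
			 : offsetof(struct aligned_short, valid_hooks);
}
```

In the aligned case padding absorbs the two bytes, so both name lengths put the next member at the same offset; in the packed case the member moves, which is the layout change old binaries tripped over.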
[patch 26/59] ACPI: fix cpufreq regression
-stable review patch. If anyone has any objections, please let us know.
--
From: Ingo Molnar <[EMAIL PROTECTED]>

recently cpufreq support on my laptop (Lenovo T60) broke completely: when
it's plugged into AC it would never go higher than 1 GHz - neither 1.3 GHz
nor 1.83 GHz is possible - no matter which governor (userspace, speed or
ondemand) is used.

after some cpufreq debugging i tracked the regression back to the following
(totally correct) bug-fix commit:

  commit 0916bd3ebb7cefdd0f432e8491abe24f4b5a101e
  Author: Dave Jones <[EMAIL PROTECTED]>
  Date: Wed Nov 22 20:42:01 2006 -0500

    [PATCH] Correct bound checking from the value returned from _PPC method.

this bugfix, which makes other laptops work, made a previously hidden
(BIOS) bug visible on my laptop.

The bug is the following: if the _PPC (Performance Present Capabilities)
optional ACPI object is queried /after/ bootup then the BIOS reports an
incorrect value of '2'.

My laptop (Lenovo T60) has the following performance states supported:

   0: 1833000
   1: 1333000
   2: 100

Per ACPI specification, a _PPC value of '0' means that all 3 performance
states are usable. A _PPC value of '1' means states 1 .. 2 are usable, a
value of '2' means only state '2' (slowest) is usable.

now, the _PPC object is optional, and it also comes with notification.
Furthermore, when a CPU object is initialized, the _PPC object is
initialized as well. So the following evaluation of the _PPC object is
superfluous:

 [] acpi_processor_get_platform_limit+0xa1/0xaf
 [] acpi_processor_register_performance+0x3b9/0x3ef
 [] acpi_cpufreq_cpu_init+0xb7/0x596
 [] cpufreq_add_dev+0x160/0x4a8
 [] sysdev_driver_register+0x5a/0xa0
 [] cpufreq_register_driver+0xb4/0x176
 [] acpi_cpufreq_init+0xe5/0xeb
 [] init+0x14f/0x3dd

and this is the point where my laptop's BIOS returns the incorrect value of
'2'.

Note that it has not sent any notification event, so the value is probably
not really intentional (possibly spurious), and Windows likely doesn't query
it after bootup either. Maybe the value is kept at '2' normally, and is
only set to the real value when a true asynchronous event (such as AC plug
event, battery switch, etc.) occurs.

So i /think/ this is a grey area of the ACPI spec: per the letter of the
spec the _PPC value only changes when notified, so there's no reason to
query it after the system has booted up.

So in my opinion the best (and most compatible) strategy would be to do the
change below, and to not evaluate the _PPC object in the
acpi_processor_get_performance_info() call, but only evaluate it if _PPC is
present during CPU object init, or if it's notified during an asynchronous
event. This change is more permissive than the previous logic, so it
definitely shouldn't break any existing system.

This also happens to fix my laptop, which is merrily chugging along at 1.83
GHz now. Yay!

Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]>
Cc: Dave Jones <[EMAIL PROTECTED]>
Acked-by: Len Brown <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
Thomas Renninger <[EMAIL PROTECTED]> wrote:
Beside the Thinkpad it also seems to fix other systems:
http://bugzilla.kernel.org/show_bug.cgi?id=7859

 drivers/acpi/processor_perflib.c |4
 1 file changed, 4 deletions(-)

--- linux-2.6.19.2.orig/drivers/acpi/processor_perflib.c
+++ linux-2.6.19.2/drivers/acpi/processor_perflib.c
@@ -322,10 +322,6 @@ static int acpi_processor_get_performanc
 	if (result)
 		return result;

-	result = acpi_processor_get_platform_limit(pr);
-	if (result)
-		return result;
-
 	return 0;
 }
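Note: a tiny sketch of how a _PPC value limits the usable performance states as described above: with states ordered fastest first, a value of p leaves states p .. n-1 usable, so the fastest permitted frequency is the one at index p. max_allowed_khz() is a hypothetical helper, not an ACPI or cpufreq function, and the 1000000 kHz figure used for state 2 in the usage note is an assumption taken from the "1 GHz" remark in the message rather than from the state table.

```c
#include <assert.h>
#include <stddef.h>

/*
 * khz[] lists the performance states, fastest first (state 0).
 * Returns the highest frequency a given _PPC value still permits,
 * or 0 if the value does not name an existing state.
 */
static unsigned int max_allowed_khz(const unsigned int *khz, int n, int ppc)
{
	assert(khz != NULL && n > 0);
	if (ppc < 0 || ppc >= n)
		return 0;
	return khz[ppc];
}
```

For a three-state table of 1833000, 1333000 and (assumed) 1000000 kHz, a _PPC of 2 caps the CPU at the slowest state, which matches the "never higher than 1 GHz" symptom.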
[patch 01/59] i2c-mv64xxx: Fix random oops at boot
-stable review patch. If anyone has any objections, please let us know.
--
From: Maxime Bizon <[EMAIL PROTECTED]>

I have a Marvell board which has the same i2c hw block as mv64xxx, so
I'm trying to use the i2c-mv64xxx driver.

But I get the following random oops at boot:

Unable to handle kernel NULL pointer dereference at virtual address 0002
Backtrace:
[] (mv64xxx_i2c_intr+0x0/0x2b8) from [] (__do_irq+0x4c/0x8c)
[] (__do_irq+0x0/0x8c) from [] (do_level_IRQ+0x68/0xc0)
 r8 = C0501E08 r7 = 0005 r6 = C0501E08 r5 = 0005 r4 = C048BB78
[] (do_level_IRQ+0x0/0xc0) from [] (asm_do_IRQ+0x50/0x134)
 r6 = C0449C78 r5 = F102 r4 =
[] (asm_do_IRQ+0x0/0x134) from [] (__irq_svc+0x24/0x100)
 r8 = C1CAC400 r7 = 0005 r6 = 0002 r5 = F102 r4 =
[] (setup_irq+0x0/0x124) from [] (request_irq+0xb0/0xd0)
 r7 = C041B2AC r6 = C0397E4C r5 = r4 = 0005
[] (request_irq+0x0/0xd0) from [] (mv64xxx_i2c_probe+0x148/0x244)
[] (mv64xxx_i2c_probe+0x0/0x244) from [] (platform_drv_probe+0x20/0x24)

The oops is caused by a spurious interrupt that occurs when request_irq
is called. mv64xxx_i2c_fsm() tries to read drv_data->msg, which is NULL.

I noticed that hardware init is done after requesting irq. Thus any
pending irq from previous hardware usage may cause this. The following
patch fixes it:

Signed-off-by: Maxime Bizon <[EMAIL PROTECTED]>
Acked-by: Mark A. Greer <[EMAIL PROTECTED]>
Signed-off-by: Jean Delvare <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
Merged in 2.6.20-rc4:
http://www.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=3269bb63eb076318ce4fb554851d047e1c9aa1a5

 drivers/i2c/busses/i2c-mv64xxx.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- linux-2.6.19.2.orig/drivers/i2c/busses/i2c-mv64xxx.c
+++ linux-2.6.19.2/drivers/i2c/busses/i2c-mv64xxx.c
@@ -529,6 +529,8 @@ mv64xxx_i2c_probe(struct platform_device
 	platform_set_drvdata(pd, drv_data);
 	i2c_set_adapdata(_data->adapter, drv_data);

+	mv64xxx_i2c_hw_init(drv_data);
+
 	if (request_irq(drv_data->irq, mv64xxx_i2c_intr, 0,
 		MV64XXX_I2C_CTLR_NAME, drv_data)) {
 		dev_err(_data->adapter.dev,
@@ -542,8 +544,6 @@ mv64xxx_i2c_probe(struct platform_device
 		goto exit_free_irq;
 	}

-	mv64xxx_i2c_hw_init(drv_data);
-
 	return 0;

  exit_free_irq:
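Note: a user-space model of the ordering problem fixed above. It assumes, as the report suggests, that hardware init clears any interrupt left pending from earlier use, and that registering the handler lets a pending interrupt fire immediately. struct fake_dev and all fake_* functions are hypothetical; none of this is driver code.

```c
#include <assert.h>
#include <stddef.h>

struct fake_dev {
	int irq_pending;         /* left over from earlier hardware use */
	const char *msg;         /* NULL until a transfer has been started */
	int handled_without_msg; /* handler runs that saw msg == NULL */
};

/* Assumption: resetting the block also drops a stale pending interrupt. */
static void fake_hw_init(struct fake_dev *d)
{
	d->irq_pending = 0;
}

static void fake_intr(struct fake_dev *d)
{
	if (d->msg == NULL) /* the real handler would dereference msg here */
		d->handled_without_msg++;
}

/* Stand-in for request_irq(): a pending interrupt is delivered at once. */
static void fake_request_irq(struct fake_dev *d)
{
	if (d->irq_pending) {
		d->irq_pending = 0;
		fake_intr(d);
	}
}

/* Returns how many times the handler ran before any message existed. */
static int probe(struct fake_dev *d, int init_first)
{
	assert(d != NULL);
	if (init_first)
		fake_hw_init(d);
	fake_request_irq(d);
	if (!init_first)
		fake_hw_init(d);
	return d->handled_without_msg;
}
```

Calling probe() with init_first set models the patched order, where the stale interrupt is gone before the handler can be invoked.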
[patch 11/59] [stable] [PATCH] IB/mthca: Fix off-by-one in FMR handling on memfree
-stable review patch. If anyone has any objections, please let us know.
--
From: Michael S. Tsirkin <[EMAIL PROTECTED]>

mthca_table_find() will return the wrong address when the table entry
being searched for is exactly at the beginning of a sglist entry
(other than the first), because it uses >= when it should use >.

Example: assume we have 2 entries in scatterlist, 4K each, offset is
4K. The current code will return first entry + 4K when we really want
the second entry.

In particular this means mapping an FMR on a memfree HCA may end up
writing the page table into the wrong place, leading to memory
corruption and also causing the HCA to use an incorrect address
translation table.

Signed-off-by: Michael S. Tsirkin <[EMAIL PROTECTED]>
Signed-off-by: Roland Dreier <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
This is upstream, and fixes a data corruption/crash bug with storage
over SRP.

 drivers/infiniband/hw/mthca/mthca_memfree.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- linux-2.6.19.2.orig/drivers/infiniband/hw/mthca/mthca_memfree.c
+++ linux-2.6.19.2/drivers/infiniband/hw/mthca/mthca_memfree.c
@@ -232,7 +232,7 @@ void *mthca_table_find(struct mthca_icm_
 
 	list_for_each_entry(chunk, &icm->chunk_list, list) {
 		for (i = 0; i < chunk->npages; ++i) {
-			if (chunk->mem[i].length >= offset) {
+			if (chunk->mem[i].length > offset) {
 				page = chunk->mem[i].page;
 				goto out;
 			}
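The 4K/4K example from the mthca changelog can be checked with a small user-space sketch. The function name find_entry and the strict flag are illustrative, not taken from the driver, and the sketch assumes the loop subtracts each entry's length from the offset before moving on, a step the hunk does not show.

```c
#include <stddef.h>

/*
 * Return the index of the scatterlist entry that contains byte `offset`,
 * or -1 if the offset lies past the last entry.
 * strict != 0 uses the fixed comparison (>), strict == 0 the old one (>=).
 */
static int find_entry(const size_t *len, int n, size_t offset, int strict)
{
	int i;

	for (i = 0; i < n; ++i) {
		if (strict ? len[i] > offset : len[i] >= offset)
			return i;
		/* Assumed step: skip this entry and keep searching. */
		offset -= len[i];
	}
	return -1;
}
```

For two 4096-byte entries and offset 4096, the old comparison selects entry 0 (and would then address one byte past its end), while the fixed comparison selects entry 1.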
[patch 02/59] i2c/m41t00: Do not forget to write year
-stable review patch. If anyone has any objections, please let us know.
--
From: Philippe De Muyter <[EMAIL PROTECTED]>

m41t00.c forgets to set the year field in set_rtc_time; fix that.

Signed-off-by: Philippe De Muyter <[EMAIL PROTECTED]>
Acked-by: Mark A. Greer <[EMAIL PROTECTED]>
Signed-off-by: Jean Delvare <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
Merged in 2.6.20-rc4:
http://www.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=81ffbc04a8ea06c4bea534154f49ed598013ee6b

 drivers/i2c/chips/m41t00.c |    1 +
 1 file changed, 1 insertion(+)

--- linux-2.6.19.2.orig/drivers/i2c/chips/m41t00.c
+++ linux-2.6.19.2/drivers/i2c/chips/m41t00.c
@@ -209,6 +209,7 @@ m41t00_set(void *arg)
 	buf[m41t00_chip->hour] = (buf[m41t00_chip->hour] & ~0x3f) | (hour& 0x3f);
 	buf[m41t00_chip->day] = (buf[m41t00_chip->day] & ~0x3f) | (day & 0x3f);
 	buf[m41t00_chip->mon] = (buf[m41t00_chip->mon] & ~0x1f) | (mon & 0x1f);
+	buf[m41t00_chip->year] = year;
 
 	if (i2c_master_send(save_client, wbuf, 9) < 0)
 		dev_err(&save_client->dev, "m41t00_set: Write error\n");
[patch 10/59] Repair snd-usb-usx2y over OHCI
-stable review patch. If anyone has any objections, please let us know.
--
From: Karsten Wiese <[EMAIL PROTECTED]>

The previous patch "Repair snd-usb-usx2y for usb 2.6.18" assumed that
urb->start_frame rolls over beyond MAX_INT for both UHCI & OHCI.
That has not been the case so far (as of kernel 2.6.20). Fix this by
only looking at the frame number range common to OHCI & UHCI.
This is for mainline and stable kernels >= 2.6.18.

Signed-off-by: Karsten Wiese <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---

 sound/usb/usx2y/usbusx2yaudio.c |    2 +-
 sound/usb/usx2y/usx2yhwdeppcm.c |    2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

--- linux-2.6.19.2.orig/sound/usb/usx2y/usbusx2yaudio.c
+++ linux-2.6.19.2/sound/usb/usx2y/usbusx2yaudio.c
@@ -322,7 +322,7 @@ static void i_usX2Y_urb_complete(struct
 		usX2Y_error_urb_status(usX2Y, subs, urb);
 		return;
 	}
-	if (likely(urb->start_frame == usX2Y->wait_iso_frame))
+	if (likely((urb->start_frame & 0x) == (usX2Y->wait_iso_frame & 0x)))
 		subs->completed_urb = urb;
 	else {
 		usX2Y_error_sequence(usX2Y, subs, urb);
--- linux-2.6.19.2.orig/sound/usb/usx2y/usx2yhwdeppcm.c
+++ linux-2.6.19.2/sound/usb/usx2y/usx2yhwdeppcm.c
@@ -243,7 +243,7 @@ static void i_usX2Y_usbpcm_urb_complete(
 		usX2Y_error_urb_status(usX2Y, subs, urb);
 		return;
 	}
-	if (likely(urb->start_frame == usX2Y->wait_iso_frame))
+	if (likely((urb->start_frame & 0x) == (usX2Y->wait_iso_frame & 0x)))
 		subs->completed_urb = urb;
 	else {
 		usX2Y_error_sequence(usX2Y, subs, urb);
[patch 07/59] NETFILTER: nf_conntrack_ipv6: fix crash when handling fragments
-stable review patch. If anyone has any objections, please let us know.
--
From: Patrick McHardy <[EMAIL PROTECTED]>

When IPv6 connection tracking splits up a defragmented packet into
its original fragments, the packets are taken from a list and are
passed to the network stack with skb->next still set. This causes
dev_hard_start_xmit to treat them as GSO fragments, resulting in
a use after free when connection tracking handles the next fragment.

Signed-off-by: Patrick McHardy <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---

 net/ipv6/netfilter/nf_conntrack_reasm.c |    2 ++
 1 file changed, 2 insertions(+)

--- linux-2.6.19.2.orig/net/ipv6/netfilter/nf_conntrack_reasm.c
+++ linux-2.6.19.2/net/ipv6/netfilter/nf_conntrack_reasm.c
@@ -835,6 +835,8 @@ void nf_ct_frag6_output(unsigned int hoo
 		s->nfct_reasm = skb;
 
 		s2 = s->next;
+		s->next = NULL;
+
 		NF_HOOK_THRESH(PF_INET6, hooknum, s, in, out, okfn,
 			       NF_IP6_PRI_CONNTRACK_DEFRAG + 1);
 		s = s2;
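The effect of leaving ->next set can be modelled in user space. This is a sketch under stated assumptions: struct pkt, consume() and output_all() are hypothetical names, and consume() stands in for a transmit path that treats a non-NULL ->next as "more segments of this packet" and takes ownership of all of them. A packet being handled twice stands in for the use after free.

```c
#include <stddef.h>

struct pkt {
	struct pkt *next;
	int seen;	/* how many times a consumer handled this packet */
};

/* Hypothetical consumer: handles the packet and everything chained to it. */
static void consume(struct pkt *p)
{
	while (p) {
		struct pkt *n = p->next;

		p->next = NULL;
		p->seen++;
		p = n;
	}
}

/* Hand each list element to the consumer, optionally unlinking it first. */
static void output_all(struct pkt *head, int detach)
{
	struct pkt *s = head, *s2;

	while (s) {
		s2 = s->next;
		if (detach)
			s->next = NULL;	/* what the patch adds */
		consume(s);
		s = s2;
	}
}
```

Without the detach step the first hand-off consumes the whole chain and the second element is then handled again; with it, every element is handled exactly once.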
[patch 06/59] NETFILTER: Fix routing of REJECT target generated packets in output chain
-stable review patch. If anyone has any objections, please let us know.
--
From: Patrick McHardy <[EMAIL PROTECTED]>

Packets generated by the REJECT target in the output chain have a local
destination address and a foreign source address. Make sure not to use
the foreign source address for the output route lookup.

Signed-off-by: Patrick McHardy <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---

 net/ipv4/netfilter.c |    7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

--- linux-2.6.19.2.orig/net/ipv4/netfilter.c
+++ linux-2.6.19.2/net/ipv4/netfilter.c
@@ -15,16 +15,19 @@ int ip_route_me_harder(struct sk_buff **
 	struct flowi fl = {};
 	struct dst_entry *odst;
 	unsigned int hh_len;
+	unsigned int type;
 
+	type = inet_addr_type(iph->saddr);
 	if (addr_type == RTN_UNSPEC)
-		addr_type = inet_addr_type(iph->saddr);
+		addr_type = type;
 
 	/* some non-standard hacks like ipt_REJECT.c:send_reset() can cause
 	 * packets with foreign saddr to appear on the NF_IP_LOCAL_OUT hook.
 	 */
 	if (addr_type == RTN_LOCAL) {
 		fl.nl_u.ip4_u.daddr = iph->daddr;
-		fl.nl_u.ip4_u.saddr = iph->saddr;
+		if (type == RTN_LOCAL)
+			fl.nl_u.ip4_u.saddr = iph->saddr;
 		fl.nl_u.ip4_u.tos = RT_TOS(iph->tos);
 		fl.oif = (*pskb)->sk ? (*pskb)->sk->sk_bound_dev_if : 0;
 #ifdef CONFIG_IP_ROUTE_FWMARK
[patch 36/59] knfsd: fix type mismatch with filldir_t used by nfsd.
-stable review patch. If anyone has any objections, please let us know.
--
From: NeilBrown <[EMAIL PROTECTED]>

nfsd defines a type 'encode_dent_fn' which is much like 'filldir_t'
except that the first pointer is 'struct readdir_cd *' rather than
'void *'. It then casts encode_dent_fn pointers to 'filldir_t' as
needed. This hides any other type mismatches between the two such as
the fact that the 'ino' arg recently changed from ino_t to u64.

So: get rid of 'encode_dent_fn', get rid of the cast of the function
type, change the first arg of various functions from 'struct readdir_cd *'
to 'void *', and live with the fact that we have a little less type
checking on the calling of these functions now.
Less internal (to nfsd) checking offset by more external checking, which
is more important.

Thanks to Gabriel Paubert <[EMAIL PROTECTED]> for discovering this and
providing an initial patch.

Signed-off-by: Gabriel Paubert <[EMAIL PROTECTED]>
Signed-off-by: Neil Brown <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---

 fs/nfsd/nfs3xdr.c         |    9 +++++----
 fs/nfsd/nfs4xdr.c         |    5 +++--
 fs/nfsd/nfsxdr.c          |    5 +++--
 fs/nfsd/vfs.c             |    4 ++--
 include/linux/nfsd/nfsd.h |    4 +---
 include/linux/nfsd/xdr.h  |    4 ++--
 include/linux/nfsd/xdr3.h |    8 ++++----
 7 files changed, 20 insertions(+), 19 deletions(-)

--- linux-2.6.19.2.orig/fs/nfsd/nfs3xdr.c
+++ linux-2.6.19.2/fs/nfsd/nfs3xdr.c
@@ -994,15 +994,16 @@ encode_entry(struct readdir_cd *ccd, con
 }
 
 int
-nfs3svc_encode_entry(struct readdir_cd *cd, const char *name,
-		     int namlen, loff_t offset, ino_t ino, unsigned int d_type)
+nfs3svc_encode_entry(void *cd, const char *name,
+		     int namlen, loff_t offset, u64 ino, unsigned int d_type)
 {
 	return encode_entry(cd, name, namlen, offset, ino, d_type, 0);
 }
 
 int
-nfs3svc_encode_entry_plus(struct readdir_cd *cd, const char *name,
-			  int namlen, loff_t offset, ino_t ino, unsigned int d_type)
+nfs3svc_encode_entry_plus(void *cd, const char *name,
+			  int namlen, loff_t offset, u64 ino,
+			  unsigned int d_type)
 {
 	return encode_entry(cd, name, namlen, offset, ino, d_type, 1);
 }
--- linux-2.6.19.2.orig/fs/nfsd/nfs4xdr.c
+++ linux-2.6.19.2/fs/nfsd/nfs4xdr.c
@@ -1884,9 +1884,10 @@ nfsd4_encode_rdattr_error(__be32 *p, int
 }
 
 static int
-nfsd4_encode_dirent(struct readdir_cd *ccd, const char *name, int namlen,
-		    loff_t offset, ino_t ino, unsigned int d_type)
+nfsd4_encode_dirent(void *ccdv, const char *name, int namlen,
+		    loff_t offset, u64 ino, unsigned int d_type)
 {
+	struct readdir_cd *ccd = ccdv;
 	struct nfsd4_readdir *cd = container_of(ccd, struct nfsd4_readdir, common);
 	int buflen;
 	__be32 *p = cd->buffer;
--- linux-2.6.19.2.orig/fs/nfsd/nfsxdr.c
+++ linux-2.6.19.2/fs/nfsd/nfsxdr.c
@@ -467,9 +467,10 @@ nfssvc_encode_statfsres(struct svc_rqst
 }
 
 int
-nfssvc_encode_entry(struct readdir_cd *ccd, const char *name,
-		    int namlen, loff_t offset, ino_t ino, unsigned int d_type)
+nfssvc_encode_entry(void *ccdv, const char *name,
+		    int namlen, loff_t offset, u64 ino, unsigned int d_type)
 {
+	struct readdir_cd *ccd = ccdv;
 	struct nfsd_readdirres *cd = container_of(ccd, struct nfsd_readdirres, common);
 	__be32	*p = cd->buffer;
 	int	buflen, slen;
--- linux-2.6.19.2.orig/fs/nfsd/vfs.c
+++ linux-2.6.19.2/fs/nfsd/vfs.c
@@ -1727,7 +1727,7 @@ out:
  */
 __be32
 nfsd_readdir(struct svc_rqst *rqstp, struct svc_fh *fhp, loff_t *offsetp,
-	     struct readdir_cd *cdp, encode_dent_fn func)
+	     struct readdir_cd *cdp, filldir_t func)
 {
 	__be32		err;
 	int		host_err;
@@ -1752,7 +1752,7 @@ nfsd_readdir(struct svc_rqst *rqstp, str
 
 	do {
 		cdp->err = nfserr_eof; /* will be cleared on successful read */
-		host_err = vfs_readdir(file, (filldir_t) func, cdp);
+		host_err = vfs_readdir(file, func, cdp);
 	} while (host_err >=0 && cdp->err == nfs_ok);
 	if (host_err)
 		err = nfserrno(host_err);
--- linux-2.6.19.2.orig/include/linux/nfsd/nfsd.h
+++ linux-2.6.19.2/include/linux/nfsd/nfsd.h
@@ -52,8 +52,6 @@ struct readdir_cd {
 	__be32			err;	/* 0, nfserr, or nfserr_eof */
 };
 
-typedef int		(*encode_dent_fn)(struct readdir_cd *, const char *,
-						int, loff_t, ino_t, unsigned int);
 typedef int (*nfsd_dirop_t)(struct inode *, struct dentry *, int, int);
 
 extern struct svc_program	nfsd_program;
@@ -117,7 +115,7 @@ __be32		nfsd_unlink(struct svc_rqst *, s
 int		nfsd_truncate(struct svc_rqst *, struct svc_fh *,
 				unsigned long size);
 __be32
[patch 34/59] knfsd: fix setting of ACL server versions.
-stable review patch. If anyone has any objections, please let us know.
--
From: NeilBrown <[EMAIL PROTECTED]>

Due to silly typos, if the nfs versions are explicitly set,
no NFSACL versions get enabled.

Also improve an error message that would have made this bug
a little easier to find.

Signed-off-by: Neil Brown <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---

 fs/nfsd/nfssvc.c |    8 ++++----
 net/sunrpc/svc.c |    3 ++-
 2 files changed, 6 insertions(+), 5 deletions(-)

--- linux-2.6.19.2.orig/fs/nfsd/nfssvc.c
+++ linux-2.6.19.2/fs/nfsd/nfssvc.c
@@ -72,7 +72,7 @@ static struct svc_program	nfsd_acl_progr
 	.pg_prog		= NFS_ACL_PROGRAM,
 	.pg_nvers		= NFSD_ACL_NRVERS,
 	.pg_vers		= nfsd_acl_versions,
-	.pg_name		= "nfsd",
+	.pg_name		= "nfsacl",
 	.pg_class		= "nfsd",
 	.pg_stats		= &nfsd_acl_svcstats,
 	.pg_authenticate	= &svc_set_client,
@@ -118,16 +118,16 @@ int nfsd_vers(int vers, enum vers_op cha
 	switch(change) {
 	case NFSD_SET:
 		nfsd_versions[vers] = nfsd_version[vers];
-		break;
 #if defined(CONFIG_NFSD_V2_ACL) || defined(CONFIG_NFSD_V3_ACL)
 		if (vers < NFSD_ACL_NRVERS)
-			nfsd_acl_version[vers] = nfsd_acl_version[vers];
+			nfsd_acl_versions[vers] = nfsd_acl_version[vers];
 #endif
+		break;
 	case NFSD_CLEAR:
 		nfsd_versions[vers] = NULL;
 #if defined(CONFIG_NFSD_V2_ACL) || defined(CONFIG_NFSD_V3_ACL)
 		if (vers < NFSD_ACL_NRVERS)
-			nfsd_acl_version[vers] = NULL;
+			nfsd_acl_versions[vers] = NULL;
 #endif
 		break;
 	case NFSD_TEST:
--- linux-2.6.19.2.orig/net/sunrpc/svc.c
+++ linux-2.6.19.2/net/sunrpc/svc.c
@@ -910,7 +910,8 @@ err_bad_prog:
 
 err_bad_vers:
 #ifdef RPC_PARANOIA
-	printk("svc: unknown version (%d)\n", vers);
+	printk("svc: unknown version (%d for prog %d, %s)\n",
+	       vers, prog, progp->pg_name);
 #endif
 	serv->sv_stats->rpcbadfmt++;
 	svc_putnl(resv, RPC_PROG_MISMATCH);
[patch 48/59] SPARC32: Fix over-optimization by GCC near ip_fast_csum.
-stable review patch. If anyone has any objections, please let us know.
--
From: Bob Breuer <[EMAIL PROTECTED]>

In some cases such as:

	iph->check = 0;
	iph->check = ip_fast_csum((unsigned char *)iph, iph->ihl);

GCC may optimize out the previous store.

Observed as a failure of NFS over udp (bad checksums on ip fragments)
when compiled with GCC 3.4.2.

Signed-off-by: Bob Breuer <[EMAIL PROTECTED]>
Signed-off-by: David S. Miller <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---

 include/asm-sparc/checksum.h |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- linux-2.6.19.2.orig/include/asm-sparc/checksum.h
+++ linux-2.6.19.2/include/asm-sparc/checksum.h
@@ -159,7 +159,7 @@ static inline unsigned short ip_fast_csu
 			     "xnor\t%%g0, %0, %0"
 			     : "=r" (sum), "=&r" (iph)
 			     : "r" (ihl), "1" (iph)
-			     : "g2", "g3", "g4", "cc");
+			     : "g2", "g3", "g4", "cc", "memory");
 	return sum;
 }
 
[patch 58/59] move_task_off_dead_cpu() should be called with disabled ints
-stable review patch. If anyone has any objections, please let us know.
--
From: Kirill Korotaev <[EMAIL PROTECTED]>

move_task_off_dead_cpu() requires interrupts to be disabled, while
migrate_dead() calls it with enabled interrupts. Added appropriate
comments to functions and added BUG_ON(!irqs_disabled()) into
double_rq_lock() and double_lock_balance() which are the origin sources of
such bugs.

Signed-off-by: Kirill Korotaev <[EMAIL PROTECTED]>
Acked-by: Ingo Molnar <[EMAIL PROTECTED]>
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
Signed-off-by: Linus Torvalds <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---

 kernel/sched.c |   17 ++++++++++++++---
 1 file changed, 14 insertions(+), 3 deletions(-)

--- linux-2.6.19.2.orig/kernel/sched.c
+++ linux-2.6.19.2/kernel/sched.c
@@ -1941,6 +1941,7 @@ static void double_rq_lock(struct rq *rq
 	__acquires(rq1->lock)
 	__acquires(rq2->lock)
 {
+	BUG_ON(!irqs_disabled());
 	if (rq1 == rq2) {
 		spin_lock(&rq1->lock);
 		__acquire(rq2->lock);	/* Fake it out ;) */
@@ -1980,6 +1981,11 @@ static void double_lock_balance(struct r
 	__acquires(busiest->lock)
 	__acquires(this_rq->lock)
 {
+	if (unlikely(!irqs_disabled())) {
+		/* printk() doesn't work good under rq->lock */
+		spin_unlock(&this_rq->lock);
+		BUG_ON(1);
+	}
 	if (unlikely(!spin_trylock(&busiest->lock))) {
 		if (busiest < this_rq) {
 			spin_unlock(&this_rq->lock);
@@ -5050,7 +5056,10 @@ wait_to_die:
 }
 
 #ifdef CONFIG_HOTPLUG_CPU
-/* Figure out where task on dead CPU should go, use force if neccessary. */
+/*
+ * Figure out where task on dead CPU should go, use force if neccessary.
+ * NOTE: interrupts should be disabled by the caller
+ */
 static void move_task_off_dead_cpu(int dead_cpu, struct task_struct *p)
 {
 	unsigned long flags;
@@ -5170,6 +5179,7 @@ void idle_task_exit(void)
 	mmdrop(mm);
 }
 
+/* called under rq->lock with disabled interrupts */
 static void migrate_dead(unsigned int dead_cpu, struct task_struct *p)
 {
 	struct rq *rq = cpu_rq(dead_cpu);
@@ -5186,10 +5196,11 @@ static void migrate_dead(unsigned int de
 	 * Drop lock around migration; if someone else moves it,
 	 * that's OK. No task can be added to this CPU, so iteration is
 	 * fine.
+	 * NOTE: interrupts should be left disabled --dev@
 	 */
-	spin_unlock_irq(&rq->lock);
+	spin_unlock(&rq->lock);
 	move_task_off_dead_cpu(dead_cpu, p);
-	spin_lock_irq(&rq->lock);
+	spin_lock(&rq->lock);
 
 	put_task_struct(p);
 }
[patch 59/59] sched: fix cond_resched_softirq() offset
-stable review patch. If anyone has any objections, please let us know.
--
From: Ingo Molnar <[EMAIL PROTECTED]>

Remove the __resched_legal() check: it is conceptually broken. The biggest
problem it had is that it can mask buggy cond_resched() calls. A
cond_resched() call is only legal if we are not in an atomic context, with
two narrow exceptions:

 - if the system is booting
 - a reacquire_kernel_lock() down() done while PREEMPT_ACTIVE is set

But __resched_legal() hid this and just silently returned whenever
these primitives were called from invalid contexts. (Same goes for
cond_resched_locked() and cond_resched_softirq()).

Furthermore, the __resched_legal(0) call was buggy in that it caused
unnecessarily long softirq latencies via cond_resched_softirq(). (which is
only called from softirq-off sections, hence the code did nothing.)

The fix is to resurrect the efficiency of the might_sleep checks and to
only allow the narrow exceptions.

Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]>
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
Signed-off-by: Linus Torvalds <[EMAIL PROTECTED]>
[chrisw: backport to 2.6.19.2]
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---

 kernel/sched.c |   16 ++++------------
 1 file changed, 4 insertions(+), 12 deletions(-)

--- linux-2.6.19.2.orig/kernel/sched.c
+++ linux-2.6.19.2/kernel/sched.c
@@ -4524,15 +4524,6 @@ asmlinkage long sys_sched_yield(void)
 	return 0;
 }
 
-static inline int __resched_legal(int expected_preempt_count)
-{
-	if (unlikely(preempt_count() != expected_preempt_count))
-		return 0;
-	if (unlikely(system_state != SYSTEM_RUNNING))
-		return 0;
-	return 1;
-}
-
 static void __cond_resched(void)
 {
 #ifdef CONFIG_DEBUG_SPINLOCK_SLEEP
@@ -4552,7 +4543,8 @@ static void __cond_resched(void)
 
 int __sched cond_resched(void)
 {
-	if (need_resched() && __resched_legal(0)) {
+	if (need_resched() && !(preempt_count() & PREEMPT_ACTIVE) &&
+					system_state == SYSTEM_RUNNING) {
 		__cond_resched();
 		return 1;
 	}
@@ -4578,7 +4570,7 @@ int cond_resched_lock(spinlock_t *lock)
 		ret = 1;
 		spin_lock(lock);
 	}
-	if (need_resched() && __resched_legal(1)) {
+	if (need_resched() && system_state == SYSTEM_RUNNING) {
 		spin_release(&lock->dep_map, 1, _THIS_IP_);
 		_raw_spin_unlock(lock);
 		preempt_enable_no_resched();
@@ -4594,7 +4586,7 @@ int __sched cond_resched_softirq(void)
 {
 	BUG_ON(!in_softirq());
 
-	if (need_resched() && __resched_legal(0)) {
+	if (need_resched() && system_state == SYSTEM_RUNNING) {
 		raw_local_irq_disable();
 		_local_bh_enable();
 		raw_local_irq_enable();
[patch 51/59] AF_PACKET: Fix BPF handling.
-stable review patch. If anyone has any objections, please let us know.
--
From: David S. Miller <[EMAIL PROTECTED]>

This fixes a bug introduced by:

commit fda9ef5d679b07c9d9097aaf6ef7f069d794a8f9
Author: Dmitry Mishin <[EMAIL PROTECTED]>
Date:   Thu Aug 31 15:28:39 2006 -0700

    [NET]: Fix sk->sk_filter field access

sk_run_filter() returns either 0 or an unsigned 32-bit
length which says how much of the packet to retain.
If that 32-bit unsigned integer is larger than the packet,
this is fine; we just leave the packet unchanged.

The above commit caused all filter return values which
were negative when interpreted as a signed integer to
indicate a packet drop, which is wrong.

Based upon a report and initial patch by Raivis Bucis.

Signed-off-by: David S. Miller <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---

 net/packet/af_packet.c |   30 +++++++++++++++---------------
 1 file changed, 15 insertions(+), 15 deletions(-)

--- linux-2.6.19.2.orig/net/packet/af_packet.c
+++ linux-2.6.19.2/net/packet/af_packet.c
@@ -427,24 +427,18 @@ out_unlock:
 }
 #endif
 
-static inline int run_filter(struct sk_buff *skb, struct sock *sk,
-							unsigned *snaplen)
+static inline unsigned int run_filter(struct sk_buff *skb, struct sock *sk,
+				      unsigned int res)
 {
 	struct sk_filter *filter;
-	int err = 0;
 
 	rcu_read_lock_bh();
 	filter = rcu_dereference(sk->sk_filter);
-	if (filter != NULL) {
-		err = sk_run_filter(skb, filter->insns, filter->len);
-		if (!err)
-			err = -EPERM;
-		else if (*snaplen > err)
-			*snaplen = err;
-	}
+	if (filter != NULL)
+		res = sk_run_filter(skb, filter->insns, filter->len);
 	rcu_read_unlock_bh();
 
-	return err;
+	return res;
 }
 
 /*
@@ -466,7 +460,7 @@ static int packet_rcv(struct sk_buff *sk
 	struct packet_sock *po;
 	u8 * skb_head = skb->data;
 	int skb_len = skb->len;
-	unsigned snaplen;
+	unsigned int snaplen, res;
 
 	if (skb->pkt_type == PACKET_LOOPBACK)
 		goto drop;
@@ -494,8 +488,11 @@ static int packet_rcv(struct sk_buff *sk
 
 	snaplen = skb->len;
 
-	if (run_filter(skb, sk, &snaplen) < 0)
+	res = run_filter(skb, sk, snaplen);
+	if (!res)
 		goto drop_n_restore;
+	if (snaplen > res)
+		snaplen = res;
 
 	if (atomic_read(&sk->sk_rmem_alloc) + skb->truesize >=
 	    (unsigned)sk->sk_rcvbuf)
@@ -567,7 +564,7 @@ static int tpacket_rcv(struct sk_buff *s
 	struct tpacket_hdr *h;
 	u8 * skb_head = skb->data;
 	int skb_len = skb->len;
-	unsigned snaplen;
+	unsigned int snaplen, res;
 	unsigned long status = TP_STATUS_LOSING|TP_STATUS_USER;
 	unsigned short macoff, netoff;
 	struct sk_buff *copy_skb = NULL;
@@ -591,8 +588,11 @@ static int tpacket_rcv(struct sk_buff *s
 
 	snaplen = skb->len;
 
-	if (run_filter(skb, sk, &snaplen) < 0)
+	res = run_filter(skb, sk, snaplen);
+	if (!res)
 		goto drop_n_restore;
+	if (snaplen > res)
+		snaplen = res;
 
 	if (sk->sk_type == SOCK_DGRAM) {
 		macoff = netoff = TPACKET_ALIGN(TPACKET_HDRLEN) + 16;
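The signed/unsigned confusion in the AF_PACKET fix can be reproduced in user space. The sketch below is an illustration, not the kernel code: old_keeps() and new_keeps() are hypothetical names, -1 stands in for -EPERM, and the conversion of a large unsigned value to int is implementation-defined in C (it yields a negative number on the usual two's-complement compilers, which is what the sketch assumes).

```c
/*
 * Old logic: the filter result is stored in an int, so any result with
 * the top bit set looks negative and the caller drops the packet.
 * Returns 1 if the packet is kept, 0 if it is dropped.
 */
static int old_keeps(unsigned int pkt_len, unsigned int res,
		     unsigned int *snaplen)
{
	int err = (int)res;	/* implementation-defined above INT_MAX */

	*snaplen = pkt_len;
	if (!err)
		err = -1;	/* stands in for -EPERM */
	else if (*snaplen > (unsigned int)err)
		*snaplen = (unsigned int)err;
	return err >= 0;
}

/*
 * Fixed logic: 0 means drop, anything else is an unsigned cap on how
 * much of the packet to retain.
 */
static int new_keeps(unsigned int pkt_len, unsigned int res,
		     unsigned int *snaplen)
{
	*snaplen = pkt_len;
	if (!res)
		return 0;
	if (*snaplen > res)
		*snaplen = res;
	return 1;
}
```

A filter that returns 0xFFFFFFFF ("keep everything") on a 1500-byte packet is dropped by the old logic and kept untruncated by the fixed logic; a filter that returns 96 truncates to 96 bytes in both.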
[patch 55/59] TCP: skb is unexpectedly freed.
-stable review patch. If anyone has any objections, please let us know.
--
From: Masayuki Nakagawa <[EMAIL PROTECTED]>

I encountered a kernel panic with my test program, which is a very
simple IPv6 client-server program.

The server side sets IPV6_RECVPKTINFO on a listening socket, and the
client side just sends a message to the server. Then the kernel panic
occurs on the server. (If you need the test program, please let me
know. I can provide it.)

This problem happens because a skb is forcibly freed in
tcp_rcv_state_process().

When a socket in listening state (TCP_LISTEN) receives a syn packet,
then tcp_v6_conn_request() will be called from
tcp_rcv_state_process(). If tcp_v6_conn_request() successfully
returns, the skb would be discarded by __kfree_skb().

However, in case of a listening socket which was already set
IPV6_RECVPKTINFO, an address of the skb will be stored in
treq->pktopts and a ref count of the skb will be incremented in
tcp_v6_conn_request(). But, even if the skb is still in use, the skb
will be freed. Then someone still using the freed skb will cause the
kernel panic.

I suggest using kfree_skb() instead of __kfree_skb().

Signed-off-by: Masayuki Nakagawa <[EMAIL PROTECTED]>
Signed-off-by: David S. Miller <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---

 net/ipv4/tcp_input.c |    6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

--- linux-2.6.19.2.orig/net/ipv4/tcp_input.c
+++ linux-2.6.19.2/net/ipv4/tcp_input.c
@@ -4411,9 +4411,11 @@ int tcp_rcv_state_process(struct sock *s
 			 * But, this leaves one open to an easy denial of
 			 * service attack, and SYN cookies can't defend
 			 * against this problem. So, we drop the data
-			 * in the interest of security over speed.
+			 * in the interest of security over speed unless
+			 * it's still in use.
 			 */
-			goto discard;
+			kfree_skb(skb);
+			return 0;
 		}
 		goto discard;
 
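The difference between the two free routines discussed above can be sketched with a toy reference count. struct buf and the buf_* helpers are hypothetical simplifications: buf_free_now() models an unconditional free in the spirit of __kfree_skb(), and buf_put() models a reference-dropping free in the spirit of kfree_skb(). Neither reproduces the kernel implementations.

```c
/* Toy buffer with a user count; `freed` records that memory was released. */
struct buf {
	int users;
	int freed;
};

/* Take an extra reference, as a holder that stores the pointer would. */
static void buf_get(struct buf *b)
{
	b->users++;
}

/* Unconditional free: ignores how many users remain. */
static void buf_free_now(struct buf *b)
{
	b->freed = 1;
}

/* Reference-dropping free: releases the memory only for the last user. */
static void buf_put(struct buf *b)
{
	if (--b->users == 0)
		buf_free_now(b);
}
```

If a second holder took a reference before the discard, buf_put() leaves the buffer alive for that holder, whereas buf_free_now() releases it while users is still 2, which is the situation the changelog describes.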
[patch 54/59] TCP: Fix sorting of SACK blocks.
-stable review patch. If anyone has any objections, please let us know.
--
From: Baruch Even <[EMAIL PROTECTED]>

The sorting of SACK blocks actually munges them rather than sort,
causing the TCP stack to ignore some SACK information and breaking the
assumption of ordered SACK blocks after sorting.

The sort takes the data from a second buffer which isn't moved causing
subsequent data moves to occur from the wrong location. The fix is to
use a temporary buffer as a normal sort does.

Signed-off-By: Baruch Even <[EMAIL PROTECTED]>
Signed-off-by: David S. Miller <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---

 net/ipv4/tcp_input.c |    9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

--- linux-2.6.19.2.orig/net/ipv4/tcp_input.c
+++ linux-2.6.19.2/net/ipv4/tcp_input.c
@@ -1011,10 +1011,11 @@ tcp_sacktag_write_queue(struct sock *sk,
 			for (j = 0; j < i; j++){
 				if (after(ntohl(sp[j].start_seq),
 					  ntohl(sp[j+1].start_seq))){
-					sp[j].start_seq = htonl(tp->recv_sack_cache[j+1].start_seq);
-					sp[j].end_seq = htonl(tp->recv_sack_cache[j+1].end_seq);
-					sp[j+1].start_seq = htonl(tp->recv_sack_cache[j].start_seq);
-					sp[j+1].end_seq = htonl(tp->recv_sack_cache[j].end_seq);
+					struct tcp_sack_block_wire tmp;
+
+					tmp = sp[j];
+					sp[j] = sp[j+1];
+					sp[j+1] = tmp;
 				}
 
 			}
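The munging described in the SACK changelog is easy to reproduce outside the kernel. The sketch below makes several assumptions: it ignores byte-order conversion, uses a hypothetical struct sack_block, assumes the outer loop is an ordinary bubble-sort pass (only the inner loop is visible in the hunk), and seq_after() is a local wraparound-aware comparison written for this example.

```c
#include <stdint.h>

struct sack_block {
	uint32_t start_seq;
	uint32_t end_seq;
};

/* True if sequence number a comes after b, allowing for 32-bit wraparound. */
static int seq_after(uint32_t a, uint32_t b)
{
	return (int32_t)(b - a) < 0;
}

/* Fixed variant: swap through a temporary, as an ordinary sort does. */
static void sort_fixed(struct sack_block *sp, int n)
{
	int i, j;

	for (i = n - 1; i > 0; i--)
		for (j = 0; j < i; j++)
			if (seq_after(sp[j].start_seq, sp[j + 1].start_seq)) {
				struct sack_block tmp = sp[j];

				sp[j] = sp[j + 1];
				sp[j + 1] = tmp;
			}
}

/*
 * Broken variant: "swaps" by copying from a second buffer that is never
 * reordered, so later moves read stale positions.
 */
static void sort_stale(struct sack_block *sp, const struct sack_block *cache,
		       int n)
{
	int i, j;

	for (i = n - 1; i > 0; i--)
		for (j = 0; j < i; j++)
			if (seq_after(sp[j].start_seq, sp[j + 1].start_seq)) {
				sp[j] = cache[j + 1];
				sp[j + 1] = cache[j];
			}
}
```

Starting from blocks beginning at 30, 20, 10, sort_fixed() yields 10, 20, 30, while sort_stale() ends with 20, 30, 20: the block at 10 is lost, the block at 20 is duplicated, and the result is not ordered.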
[patch 53/59] TCP: rare bad TCP checksum with 2.6.19
-stable review patch. If anyone has any objections, please let us know.
--
From: Jarek Poplawski <[EMAIL PROTECTED]>

The patch "Replace CHECKSUM_HW by CHECKSUM_PARTIAL/CHECKSUM_COMPLETE"
changed to unconditional copying of ip_summed field from collapsed
skb. This patch reverts this change.

The majority of substantial work including heavy testing
and diagnosing by: Michael Tokarev <[EMAIL PROTECTED]>
Possible reasons pointed out by: Herbert Xu and Patrick McHardy.

Signed-off-by: Jarek Poplawski <[EMAIL PROTECTED]>
Acked-by: Herbert Xu <[EMAIL PROTECTED]>
Signed-off-by: David S. Miller <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---

 net/ipv4/tcp_output.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- linux-2.6.19.2.orig/net/ipv4/tcp_output.c
+++ linux-2.6.19.2/net/ipv4/tcp_output.c
@@ -1590,7 +1590,8 @@ static void tcp_retrans_try_collapse(str
 
 		memcpy(skb_put(skb, next_skb_size), next_skb->data, next_skb_size);
 
-		skb->ip_summed = next_skb->ip_summed;
+		if (next_skb->ip_summed == CHECKSUM_PARTIAL)
+			skb->ip_summed = CHECKSUM_PARTIAL;
 
 		if (skb->ip_summed != CHECKSUM_PARTIAL)
 			skb->csum = csum_block_add(skb->csum, next_skb->csum, skb_size);
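The behavioural difference in the ip_summed change comes down to which state the merged buffer ends up in. A minimal sketch, with a hypothetical enum csum_state whose names and values are illustrative only and do not claim to match the kernel's CHECKSUM_* constants:

```c
/* Illustrative checksum states; values are not the kernel's. */
enum csum_state {
	CSUM_NONE,
	CSUM_PARTIAL,
	CSUM_UNNECESSARY,
	CSUM_COMPLETE
};

/* Behaviour being reverted: the merged buffer always inherits next's state. */
static enum csum_state merged_unconditional(enum csum_state cur,
					    enum csum_state next)
{
	(void)cur;
	return next;
}

/* Behaviour after the patch: only the PARTIAL state propagates. */
static enum csum_state merged_fixed(enum csum_state cur, enum csum_state next)
{
	return next == CSUM_PARTIAL ? CSUM_PARTIAL : cur;
}
```

With cur = CSUM_NONE and next = CSUM_UNNECESSARY, the unconditional copy changes the merged state to CSUM_UNNECESSARY, while the fixed version leaves it at CSUM_NONE so the software checksum in skb->csum keeps being maintained.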
[patch 52/59] AF_PACKET: Check device down state before hard header callbacks.
-stable review patch. If anyone has any objections, please let us know.
--
From: David S. Miller <[EMAIL PROTECTED]>

If the device is down, invoking the device hard header callbacks
is not legal, so check it early.

Based upon a shaper OOPS report from Frederik Deweerdt.

Signed-off-by: David S. Miller <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---

 net/packet/af_packet.c |   16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

--- linux-2.6.19.2.orig/net/packet/af_packet.c
+++ linux-2.6.19.2/net/packet/af_packet.c
@@ -358,6 +358,10 @@ static int packet_sendmsg_spkt(struct ki
 	if (dev == NULL)
 		goto out_unlock;
 
+	err = -ENETDOWN;
+	if (!(dev->flags & IFF_UP))
+		goto out_unlock;
+
 	/*
 	 *	You may not queue a frame bigger than the mtu. This is the lowest level
 	 *	raw protocol and you must do your own fragmentation at this level.
@@ -406,10 +410,6 @@ static int packet_sendmsg_spkt(struct ki
 	if (err)
 		goto out_free;
 
-	err = -ENETDOWN;
-	if (!(dev->flags & IFF_UP))
-		goto out_free;
-
 	/*
 	 *	Now send it
 	 */
@@ -737,6 +737,10 @@ static int packet_sendmsg(struct kiocb *
 	if (sock->type == SOCK_RAW)
 		reserve = dev->hard_header_len;
 
+	err = -ENETDOWN;
+	if (!(dev->flags & IFF_UP))
+		goto out_unlock;
+
 	err = -EMSGSIZE;
 	if (len > dev->mtu+reserve)
 		goto out_unlock;
@@ -769,10 +773,6 @@ static int packet_sendmsg(struct kiocb *
 	skb->dev = dev;
 	skb->priority = sk->sk_priority;
 
-	err = -ENETDOWN;
-	if (!(dev->flags & IFF_UP))
-		goto out_free;
-
 	/*
 	 *	Now send it
 	 */
[patch 15/59] Fix up CIFS for "test_clear_page_dirty()" removal
-stable review patch. If anyone has any objections, please let us know.
--
From: Linus Torvalds <[EMAIL PROTECTED]>

Fix up CIFS for "test_clear_page_dirty()" removal

This also adds the required page "writeback" flag handling, that cifs
hasn't been doing and that the page dirty flag changes made obvious.

Acked-by: Steve French <[EMAIL PROTECTED]>
Acked-by: Dave Kleikamp <[EMAIL PROTECTED]>
Signed-off-by: Linus Torvalds <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
This fixes a long term corruption bug when copying large files to a CIFS mount.
Thanks Linus!
---

 fs/cifs/file.c |   26 +++++++++++++++++++++++---
 1 file changed, 23 insertions(+), 3 deletions(-)

--- linux-2.6.19.2.orig/fs/cifs/file.c
+++ linux-2.6.19.2/fs/cifs/file.c
@@ -1244,14 +1244,21 @@ retry:
 				wait_on_page_writeback(page);
 
 			if (PageWriteback(page) ||
-					!test_clear_page_dirty(page)) {
+					!clear_page_dirty_for_io(page)) {
 				unlock_page(page);
 				break;
 			}
 
+			/*
+			 * This actually clears the dirty bit in the radix tree.
+			 * See cifs_writepage() for more commentary.
+			 */
+			set_page_writeback(page);
+
 			if (page_offset(page) >= mapping->host->i_size) {
 				done = 1;
 				unlock_page(page);
+				end_page_writeback(page);
 				break;
 			}
 
@@ -1315,6 +1322,7 @@ retry:
 					SetPageError(page);
 				kunmap(page);
 				unlock_page(page);
+				end_page_writeback(page);
 				page_cache_release(page);
 			}
 			if ((wbc->nr_to_write -= n_iov) <= 0)
@@ -1351,11 +1359,23 @@ static int cifs_writepage(struct page* p
 	if (!PageUptodate(page)) {
 		cFYI(1, ("ppw - page not up to date"));
 	}
-
+
+	/*
+	 * Set the "writeback" flag, and clear "dirty" in the radix tree.
+	 *
+	 * A writepage() implementation always needs to do either this,
+	 * or re-dirty the page with "redirty_page_for_writepage()" in
+	 * the case of a failure.
+	 *
+	 * Just unlocking the page will cause the radix tree tag-bits
+	 * to fail to update with the state of the page correctly.
+	 */
+	set_page_writeback(page);
 	rc = cifs_partialpagewrite(page, 0, PAGE_CACHE_SIZE);
 	SetPageUptodate(page); /* BB add check for error and Clearuptodate? */
 	unlock_page(page);
-	page_cache_release(page);
+	end_page_writeback(page);
+	page_cache_release(page);
 	FreeXid(xid);
 	return rc;
 }
[patch 47/59] DECNET: Handle a failure in neigh_parms_alloc (take 2)
-stable review patch.  If anyone has any objections, please let us know.
--
From: Eric W. Biederman <[EMAIL PROTECTED]>

While enhancing the neighbour code to handle multiple network
namespaces I noticed that decnet is assuming neigh_parms_alloc
will always succeed, which is clearly wrong.  So handle the
failure.

Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]>
Acked-by: Steven Whitehouse <[EMAIL PROTECTED]>
Signed-off-by: David S. Miller <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
 net/decnet/dn_dev.c |   11 +--
 1 file changed, 9 insertions(+), 2 deletions(-)

--- linux-2.6.19.2.orig/net/decnet/dn_dev.c
+++ linux-2.6.19.2/net/decnet/dn_dev.c
@@ -1116,16 +1116,23 @@ struct dn_dev *dn_dev_create(struct net_
 	init_timer(&dn_db->timer);
 
 	dn_db->uptime = jiffies;
+
+	dn_db->neigh_parms = neigh_parms_alloc(dev, &dn_neigh_table);
+	if (!dn_db->neigh_parms) {
+		dev->dn_ptr = NULL;
+		kfree(dn_db);
+		return NULL;
+	}
+
 	if (dn_db->parms.up) {
 		if (dn_db->parms.up(dev) < 0) {
+			neigh_parms_release(&dn_neigh_table, dn_db->neigh_parms);
 			dev->dn_ptr = NULL;
 			kfree(dn_db);
 			return NULL;
 		}
 	}
 
-	dn_db->neigh_parms = neigh_parms_alloc(dev, &dn_neigh_table);
-
 	dn_dev_sysctl_register(dev, &dn_db->parms);
 
 	dn_dev_set_timer(dev);
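For readers following along outside the kernel tree, here is a minimal
userspace sketch of the ordering the patch moves to: allocate the
dependent object first, check it, and release it again if a later step
fails.  The names (dev_state, dev_state_create, the fail_* switches) are
hypothetical stand-ins for illustration, not the decnet code, and
calloc/free stand in for the kernel allocators.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for the per-device state and its parameter block. */
struct parms { int dummy; };
struct dev_state {
	struct parms *neigh_parms;
	int up_called;
};

/*
 * Returns NULL on any failure and leaves nothing allocated behind.
 * fail_parms / fail_up simulate a failed allocation and a failed "up" hook.
 */
struct dev_state *dev_state_create(int fail_parms, int fail_up)
{
	struct dev_state *st = calloc(1, sizeof(*st));

	if (!st)
		return NULL;

	/* Allocate the dependent object first and check the result. */
	st->neigh_parms = fail_parms ? NULL : calloc(1, sizeof(*st->neigh_parms));
	if (!st->neigh_parms) {
		free(st);
		return NULL;
	}

	/* A later step may still fail; release what was allocated above. */
	if (fail_up) {
		free(st->neigh_parms);
		free(st);
		return NULL;
	}

	st->up_called = 1;
	return st;
}

void dev_state_destroy(struct dev_state *st)
{
	if (!st)
		return;
	free(st->neigh_parms);
	free(st);
}
```

Doing the allocation before the "up" step means every later failure path
has exactly one extra object to release, which is the shape of the
neigh_parms_release() call added in the hunk above.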
[patch 50/59] IPV4: Fix single-entry /proc/net/fib_trie output.
-stable review patch.  If anyone has any objections, please let us know.
--
From: Robert Olsson <[EMAIL PROTECTED]>

When main table is just a single leaf this gets printed as belonging to the
local table in /proc/net/fib_trie. A fix is below.

Signed-off-by: Robert Olsson <[EMAIL PROTECTED]>
Acked-by: Eric W. Biederman <[EMAIL PROTECTED]>
Signed-off-by: David S. Miller <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
 net/ipv4/fib_trie.c |   13 +++--
 1 file changed, 7 insertions(+), 6 deletions(-)

--- linux-2.6.19.2.orig/net/ipv4/fib_trie.c
+++ linux-2.6.19.2/net/ipv4/fib_trie.c
@@ -2290,16 +2290,17 @@ static int fib_trie_seq_show(struct seq_
 	if (v == SEQ_START_TOKEN)
 		return 0;
 
+	if (!NODE_PARENT(n)) {
+		if (iter->trie == trie_local)
+			seq_puts(seq, "<local>:\n");
+		else
+			seq_puts(seq, "<main>:\n");
+	}
+
 	if (IS_TNODE(n)) {
 		struct tnode *tn = (struct tnode *) n;
 		__be32 prf = htonl(MASK_PFX(tn->key, tn->pos));
 
-		if (!NODE_PARENT(n)) {
-			if (iter->trie == trie_local)
-				seq_puts(seq, "<local>:\n");
-			else
-				seq_puts(seq, "<main>:\n");
-		}
 		seq_indent(seq, iter->depth-1);
 		seq_printf(seq, "  +-- %d.%d.%d.%d/%d %d %d %d\n",
 			   NIPQUAD(prf), tn->pos, tn->bits, tn->full_children,
[patch 57/59] SUNRPC: Give cloned RPC clients their own rpc_pipefs directory
-stable review patch.  If anyone has any objections, please let us know.
--
From: Trond Myklebust <[EMAIL PROTECTED]>

This patch fixes a regression in 2.6.19 in which the use of multiple
krb5 mounts against the same NFS server may result in an Oops on unmount.

The Oops is due to the fact that multiple NFS krb5 clients may end up
inadvertently sharing the same rpc_pipefs upcall pipe. The first client
to 'umount' will unlink that shared pipe, causing an Oops.

The solution is to give each client their own upcall pipe.

This fix has been in mainline since 2.6.20-rc1.

Signed-off-by: Trond Myklebust <[EMAIL PROTECTED]>
[chrisw: backport to 2.6.19.2]
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
 include/linux/sunrpc/clnt.h |    1 +
 net/sunrpc/clnt.c           |   26 +++---
 2 files changed, 16 insertions(+), 11 deletions(-)

--- linux-2.6.19.2.orig/include/linux/sunrpc/clnt.h
+++ linux-2.6.19.2/include/linux/sunrpc/clnt.h
@@ -53,6 +53,7 @@ struct rpc_clnt {
 	struct dentry *		cl_dentry;	/* inode */
 	struct rpc_clnt *	cl_parent;	/* Points to parent of clones */
 	struct rpc_rtt		cl_rtt_default;
+	struct rpc_program *	cl_program;
 	char			cl_inline_name[32];
 };
 
--- linux-2.6.19.2.orig/net/sunrpc/clnt.c
+++ linux-2.6.19.2/net/sunrpc/clnt.c
@@ -141,6 +141,7 @@ static struct rpc_clnt * rpc_new_client(
 	clnt->cl_vers     = version->number;
 	clnt->cl_stats    = program->stats;
 	clnt->cl_metrics  = rpc_alloc_iostats(clnt);
+	clnt->cl_program  = program;
 
 	if (!xprt_bound(clnt->cl_xprt))
 		clnt->cl_autobind = 1;
@@ -252,6 +253,7 @@ struct rpc_clnt *
 rpc_clone_client(struct rpc_clnt *clnt)
 {
 	struct rpc_clnt *new;
+	int err = -ENOMEM;
 
 	new = kmalloc(sizeof(*new), GFP_KERNEL);
 	if (!new)
@@ -259,6 +261,10 @@ rpc_clone_client(struct rpc_clnt *clnt)
 	memcpy(new, clnt, sizeof(*new));
 	atomic_set(&new->cl_count, 1);
 	atomic_set(&new->cl_users, 0);
+	new->cl_metrics = rpc_alloc_iostats(clnt);
+	err = rpc_setup_pipedir(new, clnt->cl_program->pipe_dir_name);
+	if (err != 0)
+		goto out_no_path;
 	new->cl_parent = clnt;
 	atomic_inc(&clnt->cl_count);
 	new->cl_xprt = xprt_get(clnt->cl_xprt);
@@ -266,16 +272,16 @@ rpc_clone_client(struct rpc_clnt *clnt)
 	new->cl_autobind = 0;
 	new->cl_oneshot = 0;
 	new->cl_dead = 0;
-	if (!IS_ERR(new->cl_dentry))
-		dget(new->cl_dentry);
 	rpc_init_rtt(&new->cl_rtt_default, clnt->cl_xprt->timeout.to_initval);
 	if (new->cl_auth)
 		atomic_inc(&new->cl_auth->au_count);
-	new->cl_metrics = rpc_alloc_iostats(clnt);
 	return new;
+out_no_path:
+	rpc_free_iostats(new->cl_metrics);
+	kfree(new);
 out_no_clnt:
-	printk(KERN_INFO "RPC: out of memory in %s\n", __FUNCTION__);
-	return ERR_PTR(-ENOMEM);
+	dprintk("RPC: %s returned error %d\n", __FUNCTION__, err);
+	return ERR_PTR(err);
 }
 
 /*
@@ -328,16 +334,14 @@ rpc_destroy_client(struct rpc_clnt *clnt
 		rpcauth_destroy(clnt->cl_auth);
 		clnt->cl_auth = NULL;
 	}
-	if (clnt->cl_parent != clnt) {
-		if (!IS_ERR(clnt->cl_dentry))
-			dput(clnt->cl_dentry);
-		rpc_destroy_client(clnt->cl_parent);
-		goto out_free;
-	}
 	if (!IS_ERR(clnt->cl_dentry)) {
 		rpc_rmdir(clnt->cl_dentry);
 		rpc_put_mount();
 	}
+	if (clnt->cl_parent != clnt) {
+		rpc_destroy_client(clnt->cl_parent);
+		goto out_free;
+	}
 	if (clnt->cl_server != clnt->cl_inline_name)
 		kfree(clnt->cl_server);
 out_free:
[patch 44/59] uml: fix signal frame alignment
-stable review patch.  If anyone has any objections, please let us know.
--
From: Jeff Dike <[EMAIL PROTECTED]>

Use the same signal frame alignment calculations as the underlying
architecture.  x86_64 appeared to do this, but the "- 8" was really
subtracting 8 * sizeof(struct rt_sigframe) rather than 8 bytes.

UML/i386 might have been OK, but I changed the calculation to match
i386 just to be sure.

Signed-off-by: Jeff Dike <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>
Cc: Adrian Bunk <[EMAIL PROTECTED]>
Cc: Paolo 'Blaisorblade' Giarrusso <[EMAIL PROTECTED]>
Acked-by: Antoine Martin <[EMAIL PROTECTED]>
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
 arch/um/sys-i386/signal.c   |    3 ++-
 arch/um/sys-x86_64/signal.c |    5 +++--
 2 files changed, 5 insertions(+), 3 deletions(-)

--- linux-2.6.19.2.orig/arch/um/sys-i386/signal.c
+++ linux-2.6.19.2/arch/um/sys-i386/signal.c
@@ -219,7 +219,8 @@ int setup_signal_stack_sc(unsigned long
 	unsigned long save_sp = PT_REGS_SP(regs);
 	int err = 0;
 
-	stack_top &= -8UL;
+	/* This is the same calculation as i386 - ((sp + 4) & 15) == 0 */
+	stack_top = ((stack_top + 4) & -16UL) - 4;
 	frame = (struct sigframe __user *) stack_top - 1;
 	if (!access_ok(VERIFY_WRITE, frame, sizeof(*frame)))
 		return 1;
--- linux-2.6.19.2.orig/arch/um/sys-x86_64/signal.c
+++ linux-2.6.19.2/arch/um/sys-x86_64/signal.c
@@ -191,8 +191,9 @@ int setup_signal_stack_si(unsigned long
 	struct task_struct *me = current;
 
 	frame = (struct rt_sigframe __user *)
-		round_down(stack_top - sizeof(struct rt_sigframe), 16) - 8;
-        frame = (struct rt_sigframe __user *) ((unsigned long) frame - 128);
+		round_down(stack_top - sizeof(struct rt_sigframe), 16);
+	/* Subtract 128 for a red zone and 8 for proper alignment */
+        frame = (struct rt_sigframe __user *) ((unsigned long) frame - 128 - 8);
 
 	if (!access_ok(VERIFY_WRITE, fp, sizeof(struct _fpstate)))
 		goto out;
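The "- 8" pitfall described in the changelog is plain C pointer
arithmetic, and can be shown in userspace.  The sketch below is only an
illustration under stated assumptions: struct frame is a hypothetical
64-byte stand-in for struct rt_sigframe, and round_down_pow2() is a local
helper, not the kernel's round_down().

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 64-byte stand-in for struct rt_sigframe. */
struct frame { char pad[64]; };

/* Rounds x down to a multiple of align (align must be a power of two). */
static uintptr_t round_down_pow2(uintptr_t x, uintptr_t align)
{
	return x & ~(align - 1);
}

/*
 * Bytes moved when n is subtracted from a "struct frame *".
 * Pointer arithmetic counts elements, so the answer is n * sizeof(struct frame).
 * n must not exceed 31 here, to stay inside the backing array.
 */
size_t bytes_moved_by_pointer_sub(size_t n)
{
	static struct frame frames[32];
	struct frame *hi = &frames[31];
	struct frame *lo = hi - n;

	return (size_t)((char *)hi - (char *)lo);
}

/*
 * The corrected calculation, done on integers: round down to 16,
 * then subtract 128 (red zone) and 8 (alignment) as byte counts.
 */
uintptr_t frame_addr_fixed(uintptr_t stack_top)
{
	uintptr_t addr = round_down_pow2(stack_top - sizeof(struct frame), 16);

	return addr - 128 - 8;
}
```

With a 64-byte frame, subtracting 8 from the pointer moves 512 bytes, not
8, whereas the integer form always leaves the address 8 bytes past a
16-byte boundary.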
[patch 46/59] jmicron: 40/80pin primary detection
-stable review patch.  If anyone has any objections, please let us know.
--
From: [EMAIL PROTECTED] <[EMAIL PROTECTED]>

jmicron module detects all JMB36x as JMB361 and PATA0 has wrong pin status
of XICBLID.

Cc: Jeff Garzik <[EMAIL PROTECTED]>
Cc: Alan Cox <[EMAIL PROTECTED]>
Cc: Bartlomiej Zolnierkiewicz <[EMAIL PROTECTED]>
Cc: Sergei Shtylyov <[EMAIL PROTECTED]>
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
Signed-off-by: Linus Torvalds <[EMAIL PROTECTED]>

[EMAIL PROTECTED]: I folded in the warning fix (a51545ab25) because
otherwise it makes the tester think the patch caused the warning that
was already there.

Cc: Dave Jones <[EMAIL PROTECTED]>
Signed-off-by: Chuck Ebbert <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
 drivers/ide/pci/jmicron.c |   17 +
 1 file changed, 9 insertions(+), 8 deletions(-)

--- linux-2.6.19.2.orig/drivers/ide/pci/jmicron.c
+++ linux-2.6.19.2/drivers/ide/pci/jmicron.c
@@ -86,15 +86,16 @@ static int __devinit ata66_jmicron(ide_h
 	{
 	case PORT_PATA0:
 		if (control & (1 << 3))	/* 40/80 pin primary */
-			return 1;
-		return 0;
+			return 0;
+		return 1;
 	case PORT_PATA1:
 		if (control5 & (1 << 19))	/* 40/80 pin secondary */
 			return 0;
 		return 1;
 	case PORT_SATA:
-		return 1;
+		break;
 	}
+	return 1; /* Avoid bogus "control reaches end of non-void function" */
 }
 
 static void jmicron_tuneproc (ide_drive_t *drive, byte mode_wanted)
@@ -240,11 +241,11 @@ static int __devinit jmicron_init_one(st
 }
 
 static struct pci_device_id jmicron_pci_tbl[] = {
-	{ PCI_DEVICE(PCI_VENDOR_ID_JMICRON, PCI_DEVICE_ID_JMICRON_JMB361), 0},
-	{ PCI_DEVICE(PCI_VENDOR_ID_JMICRON, PCI_DEVICE_ID_JMICRON_JMB363), 1},
-	{ PCI_DEVICE(PCI_VENDOR_ID_JMICRON, PCI_DEVICE_ID_JMICRON_JMB365), 2},
-	{ PCI_DEVICE(PCI_VENDOR_ID_JMICRON, PCI_DEVICE_ID_JMICRON_JMB366), 3},
-	{ PCI_DEVICE(PCI_VENDOR_ID_JMICRON, PCI_DEVICE_ID_JMICRON_JMB368), 4},
+	{ PCI_VENDOR_ID_JMICRON, PCI_DEVICE_ID_JMICRON_JMB361, PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0},
+	{ PCI_VENDOR_ID_JMICRON, PCI_DEVICE_ID_JMICRON_JMB363, PCI_ANY_ID, PCI_ANY_ID, 0, 0, 1},
+	{ PCI_VENDOR_ID_JMICRON, PCI_DEVICE_ID_JMICRON_JMB365, PCI_ANY_ID, PCI_ANY_ID, 0, 0, 2},
+	{ PCI_VENDOR_ID_JMICRON, PCI_DEVICE_ID_JMICRON_JMB366, PCI_ANY_ID, PCI_ANY_ID, 0, 0, 3},
+	{ PCI_VENDOR_ID_JMICRON, PCI_DEVICE_ID_JMICRON_JMB368, PCI_ANY_ID, PCI_ANY_ID, 0, 0, 4},
 	{ 0, },
 };
 
[patch 49/59] IPV4: Fix the fib trie iterator to work with a single entry routing tables
-stable review patch.  If anyone has any objections, please let us know.
--
From: Eric W. Biederman <[EMAIL PROTECTED]>

In a kernel with trie routing enabled I had a simple routing setup
with only a single route to the outside world and no default
route. "ip route table list main" showed me the route just fine but
/proc/net/route was an empty file.  What was going on?

Thinking it was a bug in something I did, I looked deeper.  Eventually
I set up a second route and everything looked correct, huh?  Finally I
realized that it was just the iterator pair fib_trie_get_first,
fib_trie_get_next that could not handle a routing table with a single
entry.

So to save myself and others further confusion, here is a simple fix for
the fib proc iterator so it works even when there is only a single route
in a routing table.

Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]>
Signed-off-by: Robert Olsson <[EMAIL PROTECTED]>
Signed-off-by: David S. Miller <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
 net/ipv4/fib_trie.c |   21 -
 1 file changed, 16 insertions(+), 5 deletions(-)

--- linux-2.6.19.2.orig/net/ipv4/fib_trie.c
+++ linux-2.6.19.2/net/ipv4/fib_trie.c
@@ -1989,6 +1989,10 @@ static struct node *fib_trie_get_next(st
 	unsigned cindex = iter->index;
 	struct tnode *p;
 
+	/* A single entry routing table */
+	if (!tn)
+		return NULL;
+
 	pr_debug("get_next iter={node=%p index=%d depth=%d}\n",
 		 iter->tnode, iter->index, iter->depth);
 rescan:
@@ -2037,11 +2041,18 @@ static struct node *fib_trie_get_first(s
 	if(!iter)
 		return NULL;
 
-	if (n && IS_TNODE(n)) {
-		iter->tnode = (struct tnode *) n;
-		iter->trie = t;
-		iter->index = 0;
-		iter->depth = 1;
+	if (n) {
+		if (IS_TNODE(n)) {
+			iter->tnode = (struct tnode *) n;
+			iter->trie = t;
+			iter->index = 0;
+			iter->depth = 1;
+		} else {
+			iter->tnode = NULL;
+			iter->trie = t;
+			iter->index = 0;
+			iter->depth = 0;
+		}
 		return n;
 	}
 	return NULL;
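To make the single-entry case concrete, here is a deliberately
simplified userspace sketch of a first/next iterator pair.  It is not the
fib_trie code: the node and iter types are hypothetical, internal nodes
have just two children, and only one level is walked.  What it keeps is
the point of the fix, namely that a root which is itself a leaf is
returned by "first" and "next" then ends cleanly.

```c
#include <stddef.h>

struct node {
	int is_tnode;            /* 1: internal node, 0: leaf */
	struct node *child[2];   /* used only by internal nodes */
};

struct iter {
	struct node *tnode;      /* current internal node, NULL for a lone leaf */
	unsigned index;          /* next child slot to visit */
};

/* Returns the root (leaf or internal), or NULL for an empty table. */
struct node *iter_first(struct iter *it, struct node *root)
{
	if (!root)
		return NULL;
	it->tnode = root->is_tnode ? root : NULL;
	it->index = 0;
	return root;
}

/* Returns the next child of the current internal node, or NULL at the end. */
struct node *iter_next(struct iter *it)
{
	if (!it->tnode)          /* single-entry table: nothing follows the leaf */
		return NULL;

	while (it->index < 2) {
		struct node *n = it->tnode->child[it->index++];

		if (n)
			return n;
	}
	return NULL;
}
```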
[patch 56/59] NETFILTER: xt_connbytes: fix division by zero
-stable review patch.  If anyone has any objections, please let us know.
--
From: Patrick McHardy <[EMAIL PROTECTED]>

When the packet counter of a connection is zero a division by zero
occurs in div64_64(). Fix that by using zero as average value, which
is correct as long as the packet counter didn't overflow, at which
point we have lost anyway.

Additionally we're probably going to go back to 64 bit counters
in 2.6.21.

Based on patch from Jonas Berlin <[EMAIL PROTECTED]>,
with suggestions from KOVACS Krisztian <[EMAIL PROTECTED]>.

Signed-off-by: Patrick McHardy <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
 net/netfilter/xt_connbytes.c |   29 -
 1 file changed, 12 insertions(+), 17 deletions(-)

--- linux-2.6.19.2.orig/net/netfilter/xt_connbytes.c
+++ linux-2.6.19.2/net/netfilter/xt_connbytes.c
@@ -52,6 +52,8 @@ match(const struct sk_buff *skb,
 {
 	const struct xt_connbytes_info *sinfo = matchinfo;
 	u_int64_t what = 0;	/* initialize to make gcc happy */
+	u_int64_t bytes = 0;
+	u_int64_t pkts = 0;
 	const struct ip_conntrack_counter *counters;
 
 	if (!(counters = nf_ct_get_counters(skb)))
@@ -89,29 +91,22 @@ match(const struct sk_buff *skb,
 	case XT_CONNBYTES_AVGPKT:
 		switch (sinfo->direction) {
 		case XT_CONNBYTES_DIR_ORIGINAL:
-			what = div64_64(counters[IP_CT_DIR_ORIGINAL].bytes,
-					counters[IP_CT_DIR_ORIGINAL].packets);
+			bytes = counters[IP_CT_DIR_ORIGINAL].bytes;
+			pkts = counters[IP_CT_DIR_ORIGINAL].packets;
 			break;
 		case XT_CONNBYTES_DIR_REPLY:
-			what = div64_64(counters[IP_CT_DIR_REPLY].bytes,
-					counters[IP_CT_DIR_REPLY].packets);
+			bytes = counters[IP_CT_DIR_REPLY].bytes;
+			pkts = counters[IP_CT_DIR_REPLY].packets;
 			break;
 		case XT_CONNBYTES_DIR_BOTH:
-			{
-			u_int64_t bytes;
-			u_int64_t pkts;
-			bytes = counters[IP_CT_DIR_ORIGINAL].bytes +
-				counters[IP_CT_DIR_REPLY].bytes;
-			pkts = counters[IP_CT_DIR_ORIGINAL].packets+
-				counters[IP_CT_DIR_REPLY].packets;
-
-			/* FIXME_THEORETICAL: what to do if sum
-			 * overflows ? */
-
-			what = div64_64(bytes, pkts);
-			}
+			bytes = counters[IP_CT_DIR_ORIGINAL].bytes +
+				counters[IP_CT_DIR_REPLY].bytes;
+			pkts = counters[IP_CT_DIR_ORIGINAL].packets +
+				counters[IP_CT_DIR_REPLY].packets;
 			break;
 		}
+		if (pkts != 0)
+			what = div64_64(bytes, pkts);
 		break;
 	}
 
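The shape of the fix (collect both operands first, then divide in one
guarded place) can be sketched in userspace as below.  This is an
illustration only: avg_packet_size() is a hypothetical helper, and the
plain C division stands in for the kernel's div64_64().

```c
#include <stdint.h>

/*
 * Average bytes per packet, with 0 returned when no packets were counted.
 * Plain 64-bit division is used here in place of the kernel helper.
 */
uint64_t avg_packet_size(uint64_t bytes, uint64_t pkts)
{
	if (pkts == 0)
		return 0;       /* avoid dividing by zero */
	return bytes / pkts;
}

/* Both directions combined: sum each counter, then divide once. */
uint64_t avg_packet_size_both(uint64_t bytes_orig, uint64_t pkts_orig,
			      uint64_t bytes_reply, uint64_t pkts_reply)
{
	return avg_packet_size(bytes_orig + bytes_reply,
			       pkts_orig + pkts_reply);
}
```

Keeping the division in a single spot means the zero check is written
once rather than once per direction.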
[patch 29/59] elevator: move clearing of unplug flag earlier
-stable review patch.  If anyone has any objections, please let us know.
--
From: Linas Vepstas <[EMAIL PROTECTED]>

A flag was recently added to the elevator code to avoid
performing an unplug when requests are being re-queued.
The goal of this flag was to avoid a deep recursion that
can occur when re-queueing requests after a SCSI device/host
reset.  See http://lkml.org/lkml/2006/5/17/254

However, that fix added the flag near the bottom of a case
statement, where an earlier break (in an if statement) could
transport one out of the case, without setting the flag.
This patch sets the flag earlier in the case statement.

I re-discovered the deep recursion recently during testing;
I was told that it was a known problem, and the fix to it was
in the kernel I was testing. Indeed it was ... but it didn't
fix the bug. With the patch below, I no longer see the bug.

Signed-off-by: Linas Vepstas <[EMAIL PROTECTED]>
Signed-off-by: Jens Axboe <[EMAIL PROTECTED]>
Signed-off-by: Linus Torvalds <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
 block/elevator.c |   11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

--- linux-2.6.19.2.orig/block/elevator.c
+++ linux-2.6.19.2/block/elevator.c
@@ -572,6 +572,12 @@ void elv_insert(request_queue_t *q, stru
 		 */
 		rq->cmd_flags |= REQ_SOFTBARRIER;
 
+		/*
+		 * Most requeues happen because of a busy condition,
+		 * don't force unplug of the queue for that case.
+		 */
+		unplug_it = 0;
+
 		if (q->ordseq == 0) {
 			list_add(&rq->queuelist, &q->queue_head);
 			break;
@@ -586,11 +592,6 @@ void elv_insert(request_queue_t *q, stru
 		}
 
 		list_add_tail(&rq->queuelist, pos);
-		/*
-		 * most requeues happen because of a busy condition, don't
-		 * force unplug of the queue for that case.
-		 */
-		unplug_it = 0;
 		break;
 
 	default:
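The control-flow problem is easy to reproduce in a few lines of
userspace C.  The sketch below is hypothetical (the enum values and
function names are made up for illustration, and no request is actually
queued); it only contrasts where the flag is cleared relative to the
early break.

```c
enum { INSERT_REQUEUE, INSERT_OTHER };

/* Buggy ordering: the early break skips the assignment below it. */
int should_unplug_buggy(int where, int ordseq)
{
	int unplug_it = 1;

	switch (where) {
	case INSERT_REQUEUE:
		if (ordseq == 0)
			break;          /* leaves unplug_it == 1 */
		unplug_it = 0;
		break;
	default:
		break;
	}
	return unplug_it;
}

/* Fixed ordering: the flag is cleared before any early exit. */
int should_unplug_fixed(int where, int ordseq)
{
	int unplug_it = 1;

	switch (where) {
	case INSERT_REQUEUE:
		unplug_it = 0;
		if (ordseq == 0)
			break;          /* early exit is now harmless */
		/* ordered insertion would happen here */
		break;
	default:
		break;
	}
	return unplug_it;
}
```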
[patch 39/59] md: make repair actually work for raid1.
-stable review patch.  If anyone has any objections, please let us know.
--
From: NeilBrown <[EMAIL PROTECTED]>

When 'repair' finds a block that is different on the various parts of
the mirror, it is meant to write a chosen good version to the others.
However it currently writes out the original data to each. The memcpy
to make all the data the same is missing.

Signed-off-by: Neil Brown <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
 drivers/md/raid1.c |    5 +
 1 file changed, 5 insertions(+)

--- linux-2.6.19.2.orig/drivers/md/raid1.c
+++ linux-2.6.19.2/drivers/md/raid1.c
@@ -1266,6 +1266,11 @@ static void sync_request_write(mddev_t *
 					sbio->bi_sector = r1_bio->sector +
 						conf->mirrors[i].rdev->data_offset;
 					sbio->bi_bdev = conf->mirrors[i].rdev->bdev;
+					for (j = 0; j < vcnt ; j++)
+						memcpy(page_address(sbio->bi_io_vec[j].bv_page),
+						       page_address(pbio->bi_io_vec[j].bv_page),
+						       PAGE_SIZE);
+
 				}
 			}
 	}
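As a rough userspace picture of what the missing memcpy does, the
sketch below overwrites every mirror buffer that differs from the chosen
primary.  PAGE_BYTES, repair_mirrors() and the flat two-dimensional
buffer are assumptions made for the example; the real code works on bio
vectors and then issues the writes.

```c
#include <stddef.h>
#include <string.h>

#define PAGE_BYTES 4096   /* assumed page size for the sketch */

/*
 * Makes every mirror's copy of one page identical to the primary's copy.
 * Returns the number of mirrors that had to be rewritten.
 */
size_t repair_mirrors(unsigned char pages[][PAGE_BYTES], size_t nmirrors,
		      size_t primary)
{
	size_t repaired = 0;
	size_t i;

	for (i = 0; i < nmirrors; i++) {
		if (i == primary)
			continue;
		if (memcmp(pages[i], pages[primary], PAGE_BYTES) != 0) {
			/* Without this copy the stale data would be written back. */
			memcpy(pages[i], pages[primary], PAGE_BYTES);
			repaired++;
		}
	}
	return repaired;
}
```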
[patch 42/59] libata: use kmap_atomic(KM_IRQ0) in SCSI simulator
-stable review patch.  If anyone has any objections, please let us know.
--
From: Jeff Garzik <[EMAIL PROTECTED]>

We are inside spin_lock_irqsave().  quoth akpm's debug facility:

[  231.948000] SCSI device sda: 195371568 512-byte hdwr sectors (100030 MB)
[  232.232000] ata1.00: configured for UDMA/33
[  232.404000] WARNING (1) at arch/i386/mm/highmem.c:47 kmap_atomic()
[  232.404000]  [] kmap_atomic+0xa9/0x1ab
[  232.404000]  [] ata_scsi_rbuf_get+0x1c/0x30
[  232.404000]  [] ata_scsi_rbuf_fill+0x1a/0x87
[  232.404000]  [] ata_scsiop_mode_sense+0x0/0x309
[  232.404000]  [] end_bio_bh_io_sync+0x0/0x37
[  232.404000]  [] scsi_done+0x0/0x16
[  232.404000]  [] scsi_done+0x0/0x16
[  232.404000]  [] ata_scsi_simulate+0xb0/0x13f
[...]

Signed-off-by: Jeff Garzik <[EMAIL PROTECTED]>
Cc: Andrew Morton <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
 drivers/ata/libata-scsi.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- linux-2.6.19.2.orig/drivers/ata/libata-scsi.c
+++ linux-2.6.19.2/drivers/ata/libata-scsi.c
@@ -1648,7 +1648,7 @@ static unsigned int ata_scsi_rbuf_get(st
 		struct scatterlist *sg;
 
 		sg = (struct scatterlist *) cmd->request_buffer;
-		buf = kmap_atomic(sg->page, KM_USER0) + sg->offset;
+		buf = kmap_atomic(sg->page, KM_IRQ0) + sg->offset;
 		buflen = sg->length;
 	} else {
 		buf = cmd->request_buffer;
@@ -1676,7 +1676,7 @@ static inline void ata_scsi_rbuf_put(str
 		struct scatterlist *sg;
 
 		sg = (struct scatterlist *) cmd->request_buffer;
-		kunmap_atomic(buf - sg->offset, KM_USER0);
+		kunmap_atomic(buf - sg->offset, KM_IRQ0);
 	}
 }
 
[patch 45/59] bonding: ARP monitoring broken on x86_64
-stable review patch.  If anyone has any objections, please let us know.
--
From: Andy Gospodarek <[EMAIL PROTECTED]>

While working with the latest bonding code I noticed a nasty problem that
will prevent arp monitoring from always functioning correctly on x86_64
systems.  Comparing ints to longs and expecting reliable results on x86_64
is a bad idea.  With this patch, arp monitoring works correctly again.

Signed-off-by: Andy Gospodarek <[EMAIL PROTECTED]>
Cc: "David S. Miller" <[EMAIL PROTECTED]>
Cc: Stephen Hemminger <[EMAIL PROTECTED]>
Cc: Jeff Garzik <[EMAIL PROTECTED]>
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
Signed-off-by: Jeff Garzik <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
 drivers/net/bonding/bonding.h |    7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

--- linux-2.6.19.2.orig/drivers/net/bonding/bonding.h
+++ linux-2.6.19.2/drivers/net/bonding/bonding.h
@@ -151,8 +151,8 @@ struct slave {
 	struct slave *next;
 	struct slave *prev;
 	int    delay;
-	u32    jiffies;
-	u32    last_arp_rx;
+	unsigned long jiffies;
+	unsigned long last_arp_rx;
 	s8     link;    /* one of BOND_LINK_ */
 	s8     state;   /* one of BOND_STATE_ */
 	u32    original_flags;
@@ -242,7 +242,8 @@ extern inline int slave_do_arp_validate(
 	return bond->params.arp_validate & (1 << slave->state);
 }
 
-extern inline u32 slave_last_rx(struct bonding *bond, struct slave *slave)
+extern inline unsigned long slave_last_rx(struct bonding *bond,
+					   struct slave *slave)
 {
 	if (slave_do_arp_validate(bond, slave))
 		return slave->last_arp_rx;
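A small userspace sketch of why storing a 64-bit tick count in a 32-bit
field misbehaves: once the counter passes 2^32 the stored value loses its
upper bits, and the elapsed-time calculation becomes enormous.  The
function names are hypothetical and fixed-width types are used so the
example behaves the same on any host; it is not the bonding driver's
actual comparison.

```c
#include <stdint.h>

/* Buggy: the timestamp is kept in 32 bits, then compared with a 64-bit counter. */
int timed_out_truncated(uint64_t now, uint64_t last_rx, uint64_t timeout)
{
	uint32_t stored = (uint32_t)last_rx;    /* upper 32 bits are lost here */

	return now - (uint64_t)stored >= timeout;
}

/* Fixed: the timestamp keeps the full width of the counter. */
int timed_out_full(uint64_t now, uint64_t last_rx, uint64_t timeout)
{
	return now - last_rx >= timeout;
}
```

Below 2^32 ticks the two versions agree, which would be consistent with
the monitor working for a while before it stops "always functioning
correctly".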
[patch 32/59] SPARC64: Set g4/g5 properly in sun4v dtlb-prot handling.
-stable review patch.  If anyone has any objections, please let us know.
--
From: David S. Miller <[EMAIL PROTECTED]>

Mirror the logic in the sun4u handler, we have to update
both registers even when we branch out to window fault
fixup handling.

The way it works is that if we are in etrap processing a
fault already, g4/g5 holds the original fault information.
If we take a window spill fault while doing etrap, then
we put the window spill fault info into g4/g5 and this is
what the top-level fault handler ends up processing first.

Then we retry the originally faulting instruction, and
process the original fault at that time.

This is all necessary because of how constrained the trap
registers are in these code paths.  These cases trigger
very rarely, so even if there is some performance implication
it doesn't happen very often.  In fact the rarity is why
it took so long to trigger and find this particular bug.

Signed-off-by: David S. Miller <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
 arch/sparc64/kernel/sun4v_tlb_miss.S |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- linux-2.6.19.2.orig/arch/sparc64/kernel/sun4v_tlb_miss.S
+++ linux-2.6.19.2/arch/sparc64/kernel/sun4v_tlb_miss.S
@@ -142,9 +142,9 @@ sun4v_dtlb_prot:
 	rdpr	%tl, %g1
 	cmp	%g1, 1
 	bgu,pn	%xcc, winfix_trampoline
-	 nop
-	ba,pt	%xcc, sparc64_realfault_common
 	 mov	FAULT_CODE_DTLB | FAULT_CODE_WRITE, %g4
+	ba,pt	%xcc, sparc64_realfault_common
+	 nop
 
 	/* Called from trap table:
 	 * %g4:	vaddr
[patch 43/59] Dont allow the stack to grow into hugetlb reserved regions
-stable review patch.  If anyone has any objections, please let us know.
--
From: Adam Litke <[EMAIL PROTECTED]>

When expanding the stack, we don't currently check if the VMA will cross
into an area of the address space that is reserved for hugetlb pages.
Subsequent faults on the expanded portion of such a VMA will confuse the
low-level MMU code, resulting in an OOPS.  Check for this.

Signed-off-by: Adam Litke <[EMAIL PROTECTED]>
Cc: David Gibson <[EMAIL PROTECTED]>
Cc: William Lee Irwin III <[EMAIL PROTECTED]>
Cc: Hugh Dickins <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
 mm/mmap.c |    7 +++
 1 file changed, 7 insertions(+)

--- linux-2.6.19.2.orig/mm/mmap.c
+++ linux-2.6.19.2/mm/mmap.c
@@ -1477,6 +1477,7 @@ static int acct_stack_growth(struct vm_a
 {
 	struct mm_struct *mm = vma->vm_mm;
 	struct rlimit *rlim = current->signal->rlim;
+	unsigned long new_start;
 
 	/* address space limit tests */
 	if (!may_expand_vm(mm, grow))
@@ -1496,6 +1497,12 @@ static int acct_stack_growth(struct vm_a
 			return -ENOMEM;
 	}
 
+	/* Check to ensure the stack will not grow into a hugetlb-only region */
+	new_start = (vma->vm_flags & VM_GROWSUP) ? vma->vm_start :
+			vma->vm_end - size;
+	if (is_hugepage_only_range(vma->vm_mm, new_start, size))
+		return -EFAULT;
+
 	/*
 	 * Overcommit..  This must be the final test, as it will
 	 * update security statistics.
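The new_start calculation in the hunk above can be tried out in
userspace with a sketch like the following.  Everything here is an
assumption for illustration: struct range, overlaps() and the single
reserved interval stand in for is_hugepage_only_range(), whose real
behaviour is architecture specific.

```c
#include <stdint.h>

struct range { uintptr_t start, end; };   /* half-open interval [start, end) */

static int overlaps(struct range a, struct range b)
{
	return a.start < b.end && b.start < a.end;
}

/*
 * Returns 1 if a stack of total length size may occupy its new extent,
 * 0 if that extent would touch the reserved range.  A stack that grows up
 * keeps its start; one that grows down keeps its end.
 */
int stack_growth_allowed(uintptr_t vm_start, uintptr_t vm_end, uintptr_t size,
			 int grows_up, struct range reserved)
{
	uintptr_t new_start = grows_up ? vm_start : vm_end - size;
	struct range grown = { new_start, new_start + size };

	return !overlaps(grown, reserved);
}
```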
[patch 40/59] md: fix a few problems with the interface (sysfs and ioctl) to md.
-stable review patch.  If anyone has any objections, please let us know.
--
From: NeilBrown <[EMAIL PROTECTED]>

While developing more functionality in mdadm I found some bugs in md...

- When we remove a device from an inactive array (write 'remove' to
  the 'state' sysfs file - see 'state_store') we should not update the
  superblock information - as we may not have read and processed it
  all properly yet.

- initialise all raid_disk entries to '-1' else the 'slot' sysfs file
  will claim '0' for all devices in an array before the array is
  started.

- allow '\n' not to be present at the end of words written to
  sysfs files

- when we use SET_ARRAY_INFO to set the md metadata version,
  set the flag to say that there is persistent metadata.

- allow GET_BITMAP_FILE to be called on an array that hasn't
  been started yet.

Signed-off-by: Neil Brown <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
 drivers/md/md.c |   13 -
 1 file changed, 8 insertions(+), 5 deletions(-)

--- linux-2.6.19.2.orig/drivers/md/md.c
+++ linux-2.6.19.2/drivers/md/md.c
@@ -1792,7 +1792,8 @@ state_store(mdk_rdev_t *rdev, const char
 		else {
 			mddev_t *mddev = rdev->mddev;
 			kick_rdev_from_array(rdev);
-			md_update_sb(mddev, 1);
+			if (mddev->pers)
+				md_update_sb(mddev, 1);
 			md_new_event(mddev);
 			err = 0;
 		}
@@ -2004,6 +2005,7 @@ static mdk_rdev_t *md_import_device(dev_
 
 	rdev->desc_nr = -1;
 	rdev->saved_raid_disk = -1;
+	rdev->raid_disk = -1;
 	rdev->flags = 0;
 	rdev->data_offset = 0;
 	rdev->sb_events = 0;
@@ -2233,7 +2235,6 @@ static int update_raid_disks(mddev_t *md
 static ssize_t
 raid_disks_store(mddev_t *mddev, const char *buf, size_t len)
 {
-	/* can only set raid_disks if array is not yet active */
 	char *e;
 	int rv = 0;
 	unsigned long n = simple_strtoul(buf, &e, 10);
@@ -2631,7 +2632,7 @@ metadata_store(mddev_t *mddev, const cha
 		return -EINVAL;
 	buf = e+1;
 	minor = simple_strtoul(buf, &e, 10);
-	if (e==buf || *e != '\n')
+	if (e==buf || (*e && *e != '\n') )
 		return -EINVAL;
 	if (major >= sizeof(super_types)/sizeof(super_types[0]) ||
 	    super_types[major].name == NULL)
@@ -3978,6 +3979,7 @@ static int set_array_info(mddev_t * mdde
 		mddev->major_version = info->major_version;
 		mddev->minor_version = info->minor_version;
 		mddev->patch_version = info->patch_version;
+		mddev->persistent = ! info->not_persistent;
 		return 0;
 	}
 	mddev->major_version = MD_MAJOR_VERSION;
@@ -4302,9 +4304,10 @@ static int md_ioctl(struct inode *inode,
 	 * Commands querying/configuring an existing array:
 	 */
 	/* if we are not initialised yet, only ADD_NEW_DISK, STOP_ARRAY,
-	 * RUN_ARRAY, and SET_BITMAP_FILE are allowed */
+	 * RUN_ARRAY, and GET_ and SET_BITMAP_FILE are allowed */
 	if (!mddev->raid_disks && cmd != ADD_NEW_DISK && cmd != STOP_ARRAY
-			&& cmd != RUN_ARRAY && cmd != SET_BITMAP_FILE) {
+			&& cmd != RUN_ARRAY && cmd != SET_BITMAP_FILE
+			&& cmd != GET_BITMAP_FILE) {
 		err = -ENODEV;
 		goto abort_unlock;
 	}
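The relaxed check in metadata_store() (accept either end of string or a
newline after the number) can be exercised in userspace with the sketch
below.  parse_sysfs_ulong() is a hypothetical helper, and the standard
strtoul() stands in for the kernel's simple_strtoul().

```c
#include <stdlib.h>

/*
 * Parses a decimal number that may or may not be followed by '\n'.
 * Returns 0 and stores the value on success, -1 on malformed input.
 */
int parse_sysfs_ulong(const char *buf, unsigned long *out)
{
	char *e;
	unsigned long n = strtoul(buf, &e, 10);

	/* No digits at all, or something other than NUL / newline follows. */
	if (e == buf || (*e && *e != '\n'))
		return -1;

	*out = n;
	return 0;
}
```

With the old condition (*e != '\n' alone), a write of "12" without a
trailing newline would have been rejected; the added *e test lets it
through while still refusing input such as "12x".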
[patch 37/59] knfsd: fix up some bit-rot in exp_export
-stable review patch.  If anyone has any objections, please let us know.
--
From: NeilBrown <[EMAIL PROTECTED]>

The nfsservctl system call isn't used by recent nfs-utils releases for
exporting filesystems, and consequently the code that it uses -
exp_export - has suffered some bitrot.

In particular:
  - some newly added fields in 'struct svc_export' are not being
    initialised properly.
  - the return value is now always -ENOMEM ...

This patch fixes both these problems.

Signed-off-by: Neil Brown <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
 fs/nfsd/export.c |   12 +---
 1 file changed, 9 insertions(+), 3 deletions(-)

--- linux-2.6.19.2.orig/fs/nfsd/export.c
+++ linux-2.6.19.2/fs/nfsd/export.c
@@ -950,6 +950,8 @@ exp_export(struct nfsctl_export *nxp)
 
 	exp = exp_get_by_name(clp, nd.mnt, nd.dentry, NULL);
 
+	memset(&new, 0, sizeof(new));
+
 	/* must make sure there won't be an ex_fsid clash */
 	if ((nxp->ex_flags & NFSEXP_FSID) &&
 	    (fsid_key = exp_get_fsid_key(clp, nxp->ex_dev)) &&
@@ -980,6 +982,9 @@ exp_export(struct nfsctl_export *nxp)
 
 	new.h.expiry_time = NEVER;
 	new.h.flags = 0;
+	new.ex_path = kstrdup(nxp->ex_path, GFP_KERNEL);
+	if (!new.ex_path)
+		goto finish;
 	new.ex_client = clp;
 	new.ex_mnt = nd.mnt;
 	new.ex_dentry = nd.dentry;
@@ -1000,10 +1005,11 @@ exp_export(struct nfsctl_export *nxp)
 		/* failed to create at least one index */
 		exp_do_unexport(exp);
 		cache_flush();
-		err = -ENOMEM;
-	}
-
+	} else
+		err = 0;
 finish:
+	if (new.ex_path)
+		kfree(new.ex_path);
 	if (exp)
 		exp_put(exp);
 	if (fsid_key && !IS_ERR(fsid_key))
[patch 41/59] md: fix potential memalloc deadlock in md
-stable review patch.  If anyone has any objections, please let us know.
------------------
From: NeilBrown <[EMAIL PROTECTED]>

If a GFP_KERNEL allocation is attempted in md while the mddev_lock is
held, it is possible for a deadlock to eventuate.

This happens if the array was marked 'clean', and the memalloc triggers
a write-out to the md device.

For the writeout to succeed, the array must be marked 'dirty', and that
requires getting the mddev_lock.

So, before attempting a GFP_KERNEL allocation while holding the lock,
make sure the array is marked 'dirty' (unless it is currently
read-only).

Signed-off-by: Neil Brown <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
 drivers/md/md.c         |   29 +++++++++++++++++++++++++++++
 drivers/md/raid1.c      |    2 ++
 drivers/md/raid5.c      |    3 +++
 include/linux/raid/md.h |    2 +-
 4 files changed, 35 insertions(+), 1 deletion(-)

--- linux-2.6.19.2.orig/drivers/md/md.c
+++ linux-2.6.19.2/drivers/md/md.c
@@ -3561,6 +3561,8 @@ static int get_bitmap_file(mddev_t * mdd
 	char *ptr, *buf = NULL;
 	int err = -ENOMEM;
 
+	md_allow_write(mddev);
+
 	file = kmalloc(sizeof(*file), GFP_KERNEL);
 	if (!file)
 		goto out;
@@ -5029,6 +5031,33 @@ void md_write_end(mddev_t *mddev)
 	}
 }
 
+/* md_allow_write(mddev)
+ * Calling this ensures that the array is marked 'active' so that writes
+ * may proceed without blocking.  It is important to call this before
+ * attempting a GFP_KERNEL allocation while holding the mddev lock.
+ * Must be called with mddev_lock held.
+ */
+void md_allow_write(mddev_t *mddev)
+{
+	if (!mddev->pers)
+		return;
+	if (mddev->ro)
+		return;
+
+	spin_lock_irq(&mddev->write_lock);
+	if (mddev->in_sync) {
+		mddev->in_sync = 0;
+		set_bit(MD_CHANGE_CLEAN, &mddev->flags);
+		if (mddev->safemode_delay &&
+		    mddev->safemode == 0)
+			mddev->safemode = 1;
+		spin_unlock_irq(&mddev->write_lock);
+		md_update_sb(mddev, 0);
+	} else
+		spin_unlock_irq(&mddev->write_lock);
+}
+EXPORT_SYMBOL_GPL(md_allow_write);
+
 static DECLARE_WAIT_QUEUE_HEAD(resync_wait);
 
 #define SYNC_MARKS	10
--- linux-2.6.19.2.orig/drivers/md/raid1.c
+++ linux-2.6.19.2/drivers/md/raid1.c
@@ -2104,6 +2104,8 @@ static int raid1_reshape(mddev_t *mddev)
 		return -EINVAL;
 	}
 
+	md_allow_write(mddev);
+
 	raid_disks = mddev->raid_disks + mddev->delta_disks;
 
 	if (raid_disks < conf->raid_disks) {
--- linux-2.6.19.2.orig/drivers/md/raid5.c
+++ linux-2.6.19.2/drivers/md/raid5.c
@@ -403,6 +403,8 @@ static int resize_stripes(raid5_conf_t *
 	if (newsize <= conf->pool_size)
 		return 0; /* never bother to shrink */
 
+	md_allow_write(conf->mddev);
+
 	/* Step 1 */
 	sc = kmem_cache_create(conf->cache_name[1-conf->active_name],
 			       sizeof(struct stripe_head)+(newsize-1)*sizeof(struct r5dev),
@@ -3045,6 +3047,7 @@ raid5_store_stripe_cache_size(mddev_t *m
 		else
 			break;
 	}
+	md_allow_write(mddev);
 	while (new > conf->max_nr_stripes) {
 		if (grow_one_stripe(conf))
 			conf->max_nr_stripes++;
--- linux-2.6.19.2.orig/include/linux/raid/md.h
+++ linux-2.6.19.2/include/linux/raid/md.h
@@ -94,7 +94,7 @@ extern int sync_page_io(struct block_dev
 			struct page *page, int rw);
 extern void md_do_sync(mddev_t *mddev);
 extern void md_new_event(mddev_t *mddev);
-
+extern void md_allow_write(mddev_t *mddev);
 
 #endif /* CONFIG_MD */
 #endif
[patch 38/59] md: assorted md and raid1 one-liners
-stable review patch.  If anyone has any objections, please let us know.
------------------
From: NeilBrown <[EMAIL PROTECTED]>

Fix a few bugs that meant that:
  - superblocks weren't always written at exactly the right time (this
    could show up if the array was not written to - writing to the array
    causes lots of superblock updates and so hides these errors).

  - restarting device recovery after a clean shutdown (version-1 metadata
    only) didn't work as intended (or at all).

1/ Ensure superblock is updated when a new device is added.
2/ Remove an inappropriate test on MD_RECOVERY_SYNC in md_do_sync.
   The body of this if takes one of two branches depending on whether
   MD_RECOVERY_SYNC is set, so testing it in the clause of the if
   is wrong.
3/ Flag superblock for updating after a resync/recovery finishes.
4/ If we find the need to restart a recovery in the middle (version-1
   metadata only) make sure a full recovery (not just as guided by
   bitmaps) does get done.

Signed-off-by: Neil Brown <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
 drivers/md/md.c    |    3 ++-
 drivers/md/raid1.c |    1 +
 2 files changed, 3 insertions(+), 1 deletion(-)

--- linux-2.6.19.2.orig/drivers/md/md.c
+++ linux-2.6.19.2/drivers/md/md.c
@@ -3722,6 +3722,7 @@ static int add_new_disk(mddev_t * mddev,
 		if (err)
 			export_rdev(rdev);
 
+		md_update_sb(mddev, 1);
 		set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
 		md_wakeup_thread(mddev->thread);
 		return err;
@@ -5273,7 +5274,6 @@ void md_do_sync(mddev_t *mddev)
 	mddev->pers->sync_request(mddev, max_sectors, &skipped, 1);
 
 	if (!test_bit(MD_RECOVERY_ERR, &mddev->recovery) &&
-	    test_bit(MD_RECOVERY_SYNC, &mddev->recovery) &&
 	    !test_bit(MD_RECOVERY_CHECK, &mddev->recovery) &&
 	    mddev->curr_resync > 2) {
 		if (test_bit(MD_RECOVERY_SYNC, &mddev->recovery)) {
@@ -5297,6 +5297,7 @@ void md_do_sync(mddev_t *mddev)
 					rdev->recovery_offset = mddev->curr_resync;
 		}
 	}
+	set_bit(MD_CHANGE_DEVS, &mddev->flags);
 
  skip:
 	mddev->curr_resync = 0;
--- linux-2.6.19.2.orig/drivers/md/raid1.c
+++ linux-2.6.19.2/drivers/md/raid1.c
@@ -1956,6 +1956,7 @@ static int run(mddev_t *mddev)
 		    !test_bit(In_sync, &disk->rdev->flags)) {
 			disk->head_position = 0;
 			mddev->degraded++;
+			conf->fullsync = 1;
 		}
 	}
 	if (mddev->degraded == conf->raid_disks) {
[patch 30/59] Revert "[PATCH] Fix up mmap_kmem"
-stable review patch.  If anyone has any objections, please let us know.
------------------
From: Linus Torvalds <[EMAIL PROTECTED]>

This reverts commit 99a10a60ba9bedcf5d70ef81414d3e03816afa3f.

As per Hugh Dickins:

  "Nadia Derbey has reported that mmap of /dev/kmem no longer works with
   the kernel virtual address as offset, and Franck has confirmed that
   his patch came from a misunderstanding of what an offset means to
   /dev/kmem - whereas his patch description seems to say that he was
   correcting the offset on a few platforms, there was no such problem
   to correct, and his patch was in fact changing its API on all
   platforms."

Suggested-by: Hugh Dickins <[EMAIL PROTECTED]>
Cc: Franck Bui-Huu <[EMAIL PROTECTED]>
Cc: Nadia Derbey <[EMAIL PROTECTED]>
Cc: Andi Kleen <[EMAIL PROTECTED]>
Cc: Arjan van de Ven <[EMAIL PROTECTED]>
Cc: Andrew Morton <[EMAIL PROTECTED]>
Signed-off-by: Linus Torvalds <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
 drivers/char/mem.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- linux-2.6.19.2.orig/drivers/char/mem.c
+++ linux-2.6.19.2/drivers/char/mem.c
@@ -293,8 +293,8 @@ static int mmap_kmem(struct file * file,
 {
 	unsigned long pfn;
 
-	/* Turn a pfn offset into an absolute pfn */
-	pfn = PFN_DOWN(virt_to_phys((void *)PAGE_OFFSET)) + vma->vm_pgoff;
+	/* Turn a kernel-virtual address into a physical page frame */
+	pfn = __pa((u64)vma->vm_pgoff << PAGE_SHIFT) >> PAGE_SHIFT;
 
 	/*
 	 * RED-PEN: on some architectures there is more mapped memory
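The two interpretations of the /dev/kmem mmap offset discussed above can be compared in a small user-space sketch. Everything in it is an assumption made for illustration: the SKETCH_* constants model a 32-bit 3G/1G layout with 4 KiB pages, and sketch_pa() is a toy stand-in for the kernel's __pa(). It is not the mem.c code.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT  12
#define SKETCH_PAGE_OFFSET UINT64_C(0xC0000000)	/* assumed 3G/1G split */

/* Toy __pa(): linear-mapped kernel virtual address to physical address. */
static uint64_t sketch_pa(uint64_t vaddr)
{
	return vaddr - SKETCH_PAGE_OFFSET;
}

/* Behaviour restored by the revert: the offset is a kernel virtual address. */
static uint64_t pfn_from_virtual_pgoff(uint64_t pgoff)
{
	return sketch_pa(pgoff << SKETCH_PAGE_SHIFT) >> SKETCH_PAGE_SHIFT;
}

/* Behaviour being reverted: the offset is relative to PAGE_OFFSET. */
static uint64_t pfn_from_relative_pgoff(uint64_t pgoff)
{
	return (sketch_pa(SKETCH_PAGE_OFFSET) >> SKETCH_PAGE_SHIFT) + pgoff;
}
```

Under these assumptions a caller that passes a kernel virtual address as the offset gets the intended page frame only from the first function, which is the API change the quoted report complains about.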
[patch 33/59] sis190: failure to set the MAC address from EEPROM
-stable review patch.  If anyone has any objections, please let us know.
------------------
From: Francois Romieu <[EMAIL PROTECTED]>

Fix from http://bugzilla.kernel.org/show_bug.cgi?id=7747

Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>
Signed-off-by: Francois Romieu <[EMAIL PROTECTED]>
Signed-off-by: Jeff Garzik <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
 drivers/net/sis190.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- linux-2.6.19.2.orig/drivers/net/sis190.c
+++ linux-2.6.19.2/drivers/net/sis190.c
@@ -1559,7 +1559,7 @@ static int __devinit sis190_get_mac_addr
 	for (i = 0; i < MAC_ADDR_LEN / 2; i++) {
 		__le16 w = sis190_read_eeprom(ioaddr, EEPROMMACAddr + i);
 
-		((u16 *)dev->dev_addr)[0] = le16_to_cpu(w);
+		((u16 *)dev->dev_addr)[i] = le16_to_cpu(w);
 	}
 
 	sis190_set_rgmii(tp, sis190_read_eeprom(ioaddr, EEPROMInfo));
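The effect of the one-character index fix can be reproduced outside the driver. What follows is a minimal user-space sketch, not the sis190 code: fake_eeprom, read_eeprom_word() and fill_mac() are hypothetical stand-ins, and the little-endian conversion done by le16_to_cpu() is left out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAC_ADDR_LEN 6

/* Hypothetical EEPROM contents: three 16-bit words holding a MAC address. */
static const uint16_t fake_eeprom[MAC_ADDR_LEN / 2] = { 0x1100, 0x3322, 0x5544 };

/* Stand-in for sis190_read_eeprom(); host byte order is assumed. */
static uint16_t read_eeprom_word(int index)
{
	return fake_eeprom[index];
}

/* Copy the MAC words; 'buggy' reproduces the pre-patch [0] index. */
static void fill_mac(uint8_t mac[MAC_ADDR_LEN], int buggy)
{
	int i;

	memset(mac, 0, MAC_ADDR_LEN);
	for (i = 0; i < MAC_ADDR_LEN / 2; i++) {
		uint16_t w = read_eeprom_word(i);
		int slot = buggy ? 0 : i;

		/* memcpy avoids the aliasing cast used in the driver. */
		memcpy(mac + 2 * slot, &w, sizeof(w));
	}
}
```

With the buggy index every word lands in the first two bytes, so only the last EEPROM word survives and the remaining four bytes of the address are never written.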
[patch 35/59] knfsd: fix an NFSD bug with full sized, non-page-aligned reads.
-stable review patch.  If anyone has any objections, please let us know.
------------------
From: NeilBrown <[EMAIL PROTECTED]>

NFSd assumes that the largest number of pages that will be needed for a
request+response is 2+N where N pages is the size of the largest
permitted read/write request.  The '2' are 1 for the non-data part of
the request, and 1 for the non-data part of the reply.

However, when a read request is not page-aligned, and we choose to use
->sendfile to send it directly from the page cache, we may need N+1
pages to hold the whole reply.  This can overflow an array and cause an
Oops.

This patch increases the size of the array for holding pages by one and
makes sure that entry is NULL when it is not in use.

Signed-off-by: Neil Brown <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
 fs/nfsd/vfs.c              |    3 ++-
 include/linux/sunrpc/svc.h |    5 ++++-
 net/sunrpc/svcsock.c       |    2 ++
 3 files changed, 8 insertions(+), 2 deletions(-)

--- linux-2.6.19.2.orig/fs/nfsd/vfs.c
+++ linux-2.6.19.2/fs/nfsd/vfs.c
@@ -822,7 +822,8 @@ nfsd_read_actor(read_descriptor_t *desc,
 		rqstp->rq_res.page_len = size;
 	} else if (page != pp[-1]) {
 		get_page(page);
-		put_page(*pp);
+		if (*pp)
+			put_page(*pp);
 		*pp = page;
 		rqstp->rq_resused++;
 		rqstp->rq_res.page_len += size;
--- linux-2.6.19.2.orig/include/linux/sunrpc/svc.h
+++ linux-2.6.19.2/include/linux/sunrpc/svc.h
@@ -144,8 +144,11 @@ extern u32 svc_max_payload(const struct
  *
  * Each request/reply pair can have at most one "payload", plus two pages,
  * one for the request, and one for the reply.
+ * We using ->sendfile to return read data, we might need one extra page
+ * if the request is not page-aligned.  So add another '1'.
  */
-#define RPCSVC_MAXPAGES		((RPCSVC_MAXPAYLOAD+PAGE_SIZE-1)/PAGE_SIZE + 2)
+#define RPCSVC_MAXPAGES		((RPCSVC_MAXPAYLOAD+PAGE_SIZE-1)/PAGE_SIZE \
+				+ 2 + 1)
 
 static inline u32 svc_getnl(struct kvec *iov)
 {
--- linux-2.6.19.2.orig/net/sunrpc/svcsock.c
+++ linux-2.6.19.2/net/sunrpc/svcsock.c
@@ -1248,6 +1248,8 @@ svc_recv(struct svc_rqst *rqstp, long ti
 				schedule_timeout_uninterruptible(msecs_to_jiffies(500));
 			rqstp->rq_pages[i] = p;
 		}
+	rqstp->rq_pages[i++] = NULL; /* this might be seen in nfs_read_actor */
+	BUG_ON(pages >= RPCSVC_MAXPAGES);
 
 	/* Make arg->head point to first page and arg->pages point to rest */
 	arg = &rqstp->rq_arg;
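The "N+1 pages" claim in the description is plain page arithmetic and can be checked in isolation. The sketch below is user-space C written for this note; pages_spanned() and slots_needed() are hypothetical helpers, not sunrpc functions, and the page size is passed in rather than taken from PAGE_SIZE.

```c
#include <assert.h>
#include <stddef.h>

/* Number of pages touched by 'len' bytes starting at byte 'offset'. */
static size_t pages_spanned(size_t offset, size_t len, size_t page_size)
{
	size_t first, last;

	if (len == 0)
		return 0;
	first = offset / page_size;
	last = (offset + len - 1) / page_size;
	return last - first + 1;
}

/* Array slots needed: request header + reply header + data pages. */
static size_t slots_needed(size_t offset, size_t len, size_t page_size)
{
	return 2 + pages_spanned(offset, len, page_size);
}
```

For a maximum-sized read of N pages, an aligned offset needs N + 2 slots while any unaligned offset needs N + 3, which is the extra '1' the patch adds to RPCSVC_MAXPAGES.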
[patch 31/59] remove __devinit markings from rtc_sysfs_add_device()
-stable review patch.  If anyone has any objections, please let us know.
------------------
From: Mike Frysinger <[EMAIL PROTECTED]>

rtc_sysfs_add_device is needed even after dev initialization, so drop
__devinit.

Signed-off-by: Mike Frysinger <[EMAIL PROTECTED]>
Acked-by: Alessandro Zummo <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
Signed-off-by: Linus Torvalds <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
 drivers/rtc/rtc-sysfs.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- linux-2.6.19.2.orig/drivers/rtc/rtc-sysfs.c
+++ linux-2.6.19.2/drivers/rtc/rtc-sysfs.c
@@ -78,7 +78,7 @@ static struct attribute_group rtc_attr_g
 	.attrs = rtc_attrs,
 };
 
-static int __devinit rtc_sysfs_add_device(struct class_device *class_dev,
+static int rtc_sysfs_add_device(struct class_device *class_dev,
 					struct class_interface *class_intf)
 {
 	int err;
[patch 25/59] Fix UML on non-standard VM split hosts
-stable review patch.  If anyone has any objections, please let us know.
------------------
From: Jeff Dike <[EMAIL PROTECTED]>

This fixes UML on hosts with non-standard VM splits.  We had changed
the config variable that controls UML behavior on such hosts, but not
propagated the change everywhere.  In particular, the values of
STUB_CODE and STUB_DATA relied on the old variable.

I also reformatted the HOST_VMSPLIT_3G help to make it more standard.

Spotted by [EMAIL PROTECTED]

Signed-off-by: Jeff Dike <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
 arch/um/Kconfig.i386 |   38 +++++++++++++++++++-------------------
 1 file changed, 19 insertions(+), 19 deletions(-)

--- linux-2.6.19.2.orig/arch/um/Kconfig.i386
+++ linux-2.6.19.2/arch/um/Kconfig.i386
@@ -19,22 +19,22 @@ config SEMAPHORE_SLEEPERS
 choice
 	prompt "Host memory split"
 	default HOST_VMSPLIT_3G
-	---help---
-	  This is needed when the host kernel on which you run has a non-default
-	  (like 2G/2G) memory split, instead of the customary 3G/1G. If you did
-	  not recompile your own kernel but use the default distro's one, you can
-	  safely accept the "Default split" option.
-
-	  It can be enabled on recent (>=2.6.16-rc2) vanilla kernels via
-	  CONFIG_VM_SPLIT_*, or on previous kernels with special patches (-ck
-	  patchset by Con Kolivas, or other ones) - option names match closely the
-	  host CONFIG_VM_SPLIT_* ones.
-
-	  A lower setting (where 1G/3G is lowest and 3G/1G is higher) will
-	  tolerate even more "normal" host kernels, but an higher setting will be
-	  stricter.
+	help
+	  This is needed when the host kernel on which you run has a non-default
+	  (like 2G/2G) memory split, instead of the customary 3G/1G. If you did
+	  not recompile your own kernel but use the default distro's one, you can
+	  safely accept the "Default split" option.
+
+	  It can be enabled on recent (>=2.6.16-rc2) vanilla kernels via
+	  CONFIG_VM_SPLIT_*, or on previous kernels with special patches (-ck
+	  patchset by Con Kolivas, or other ones) - option names match closely the
+	  host CONFIG_VM_SPLIT_* ones.
+
+	  A lower setting (where 1G/3G is lowest and 3G/1G is higher) will
+	  tolerate even more "normal" host kernels, but an higher setting will be
+	  stricter.
 
-	  So, if you do not know what to do here, say 'Default split'.
+	  So, if you do not know what to do here, say 'Default split'.
 
 config HOST_VMSPLIT_3G
 	bool "Default split (3G/1G user/kernel host split)"
@@ -67,13 +67,13 @@ config 3_LEVEL_PGTABLES
 
 config STUB_CODE
 	hex
-	default 0xbfffe000 if !HOST_2G_2G
-	default 0x7fffe000 if HOST_2G_2G
+	default 0xbfffe000 if !HOST_VMSPLIT_2G
+	default 0x7fffe000 if HOST_VMSPLIT_2G
 
 config STUB_DATA
 	hex
-	default 0xb000 if !HOST_2G_2G
-	default 0x7000 if HOST_2G_2G
+	default 0xb000 if !HOST_VMSPLIT_2G
+	default 0x7000 if HOST_VMSPLIT_2G
 
 config STUB_START
 	hex
[patch 21/59] IPSEC: Policy list disorder
-stable review patch.  If anyone has any objections, please let us know.
------------------
From: Herbert Xu <[EMAIL PROTECTED]>

The recent hashing introduced an off-by-one bug in policy list
insertion.  Instead of adding after the last entry with a lesser or
equal priority, we're adding after the successor of that entry.

This patch fixes this and also adds a warning if we detect a duplicate
entry in the policy list.  This should never happen due to this if
clause.

Signed-off-by: Herbert Xu <[EMAIL PROTECTED]>
Signed-off-by: David S. Miller <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
 net/xfrm/xfrm_policy.c |   16 +++++-----------
 1 file changed, 5 insertions(+), 11 deletions(-)

--- linux-2.6.19.2.orig/net/xfrm/xfrm_policy.c
+++ linux-2.6.19.2/net/xfrm/xfrm_policy.c
@@ -615,19 +615,18 @@ int xfrm_policy_insert(int dir, struct x
 	struct xfrm_policy *pol;
 	struct xfrm_policy *delpol;
 	struct hlist_head *chain;
-	struct hlist_node *entry, *newpos, *last;
+	struct hlist_node *entry, *newpos;
 	struct dst_entry *gc_list;
 
 	write_lock_bh(&xfrm_policy_lock);
 	chain = policy_hash_bysel(&policy->selector, policy->family, dir);
 	delpol = NULL;
 	newpos = NULL;
-	last = NULL;
 	hlist_for_each_entry(pol, entry, chain, bydst) {
-		if (!delpol &&
-		    pol->type == policy->type &&
+		if (pol->type == policy->type &&
 		    !selector_cmp(&pol->selector, &policy->selector) &&
-		    xfrm_sec_ctx_match(pol->security, policy->security)) {
+		    xfrm_sec_ctx_match(pol->security, policy->security) &&
+		    !WARN_ON(delpol)) {
 			if (excl) {
 				write_unlock_bh(&xfrm_policy_lock);
 				return -EEXIST;
@@ -636,17 +635,12 @@ int xfrm_policy_insert(int dir, struct x
 			if (policy->priority > pol->priority)
 				continue;
 		} else if (policy->priority >= pol->priority) {
-			last = &pol->bydst;
+			newpos = &pol->bydst;
 			continue;
 		}
-		if (!newpos)
-			newpos = &pol->bydst;
 		if (delpol)
 			break;
-		last = &pol->bydst;
 	}
-	if (!newpos)
-		newpos = last;
 	if (newpos)
 		hlist_add_after(newpos, &policy->bydst);
 	else
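The insertion rule described in the patch above ("after the last entry with a lesser or equal priority") can be shown on a plain singly linked list. This is a simplified sketch under stated assumptions: struct node and insert_by_priority() are invented for illustration, and the duplicate-policy replacement (delpol), the hash chain and the locking of xfrm_policy_insert() are all left out.

```c
#include <assert.h>
#include <stddef.h>

struct node {
	int priority;
	int id;
	struct node *next;
};

/*
 * Insert 'n' after the last entry whose priority is less than or equal
 * to n->priority, so that equal priorities keep insertion order.
 */
static void insert_by_priority(struct node **head, struct node *n)
{
	struct node *pos, *newpos = NULL;

	for (pos = *head; pos; pos = pos->next) {
		if (n->priority >= pos->priority) {
			newpos = pos;	/* remember the entry itself ... */
			continue;
		}
		break;			/* ... not its successor */
	}

	if (newpos) {
		n->next = newpos->next;
		newpos->next = n;
	} else {
		n->next = *head;
		*head = n;
	}
}
```

Recording the successor instead of the matching entry, as the pre-patch code effectively did, would place the new node one position too late and break the ordering.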
[patch 23/59] SELinux: fix an oops with NetLabel and non-MLS SELinux policy
-stable review patch.  If anyone has any objections, please let us know.
------------------
From: <[EMAIL PROTECTED]>

In the case where a user has configured NetLabel in the kernel but is
not using a SELinux policy with the MLS/MCS feature enabled there is a
bug in mls_export_cat() where a NULL pointer is used.  The initial
problem report and discussion can be found here (this patch has been
ACK'd by Stephen Smalley and James Morris in the discussion thread
below):

 * http://marc2.theaimsgroup.com/?t=11692030254=1=2

This patch is specific to the 2.6.19.y kernel series as the
mls_export_cat() function has been replaced in the 2.6.20 kernel.

Signed-off-by: Paul Moore <[EMAIL PROTECTED]>
Acked-by: Stephen Smalley <[EMAIL PROTECTED]>
Acked-by: James Morris <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
 security/selinux/ss/mls.c |   12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

--- linux-2.6.19.2.orig/security/selinux/ss/mls.c
+++ linux-2.6.19.2/security/selinux/ss/mls.c
@@ -641,10 +641,14 @@ int mls_export_cat(const struct context
 	int rc = -EPERM;
 
 	if (!selinux_mls_enabled) {
-		*low = NULL;
-		*low_len = 0;
-		*high = NULL;
-		*high_len = 0;
+		if (low != NULL) {
+			*low = NULL;
+			*low_len = 0;
+		}
+		if (high != NULL) {
+			*high = NULL;
+			*high_len = 0;
+		}
 		return 0;
 	}
 
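The shape of this fix, treating output pointers as optional and guarding each pair before writing through it, can be sketched in user space. export_cat() below is a hypothetical function modelled on the patched early-return path only; it is not mls_export_cat() and the MLS-enabled path is deliberately not implemented.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical exporter with optional outputs: a caller may pass NULL
 * for 'low' or 'high' when it is not interested in that half.
 */
static int export_cat(int mls_enabled,
		      unsigned char **low, size_t *low_len,
		      unsigned char **high, size_t *high_len)
{
	if (!mls_enabled) {
		if (low != NULL) {
			*low = NULL;
			*low_len = 0;
		}
		if (high != NULL) {
			*high = NULL;
			*high_len = 0;
		}
		return 0;
	}
	return -1;	/* the MLS-enabled path is out of scope for this sketch */
}
```

A caller that only wants the high categories can then pass NULL for the low pair without triggering a NULL dereference.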
[patch 13/59] ieee1394: sbp2: fix probing of some DVD-ROM/RWs
-stable review patch.  If anyone has any objections, please let us know.
------------------
From: Stefan Richter <[EMAIL PROTECTED]>

Since commit 98e238cd42be6c0852da519303cf0182690f8d9f in Linux 2.6.19,
"ieee1394: sbp2: don't prefer MODE SENSE 10", some FireWire DVD-ROMs
and DVD-RWs were mistaken for CD-ROMs because sr_mod now sent MODE
SENSE 6.  The MMC command set includes only MODE SENSE 10.
http://bugzilla.kernel.org/show_bug.cgi?id=7800

This fix lets sbp2 switch scsi_device.use_10_for_ms on for MMC LUs.
This should rather be done in the command set driver sr_mod, not in the
sbp2 transport driver, and a corresponding patch will follow for a
later Linux release.

Signed-off-by: Stefan Richter <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---

same as commit 1a74bc68e4c0534d150e6454b45a70dab831fa32
---
 drivers/ieee1394/sbp2.c |    2 ++
 1 file changed, 2 insertions(+)

--- linux-2.6.19.2.orig/drivers/ieee1394/sbp2.c
+++ linux-2.6.19.2/drivers/ieee1394/sbp2.c
@@ -2530,6 +2530,8 @@ static int sbp2scsi_slave_configure(stru
 
 	blk_queue_dma_alignment(sdev->request_queue, (512 - 1));
 	sdev->use_10_for_rw = 1;
+	if (sdev->type == TYPE_ROM)
+		sdev->use_10_for_ms = 1;
 	if (sdev->type == TYPE_DISK &&
 	    scsi_id->workarounds & SBP2_WORKAROUND_MODE_SENSE_8)
 		sdev->skip_ms_page_8 = 1;
[patch 05/59] md: pass down BIO_RW_SYNC in raid{1,10}
-stable review patch.  If anyone has any objections, please let us know.
------------------
From: Lars Ellenberg <[EMAIL PROTECTED]>

md raidX make_request functions strip off the BIO_RW_SYNC flag, thus
introducing additional latency.

Fixing this in raid1 and raid10 seems to be straightforward enough.

For our particular usage case in DRBD, passing this flag improved some
initialization time from ~5 minutes to ~5 seconds.

Acked-by: NeilBrown <[EMAIL PROTECTED]>
Signed-off-by: Lars Ellenberg <[EMAIL PROTECTED]>
Acked-by: Jens Axboe <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
 drivers/md/raid1.c  |   13 +++++++++----
 drivers/md/raid10.c |   11 ++++++++---
 2 files changed, 17 insertions(+), 7 deletions(-)

--- linux-2.6.19.2.orig/drivers/md/raid1.c
+++ linux-2.6.19.2/drivers/md/raid1.c
@@ -775,6 +775,7 @@ static int make_request(request_queue_t
 	struct bio_list bl;
 	struct page **behind_pages = NULL;
 	const int rw = bio_data_dir(bio);
+	const int do_sync = bio_sync(bio);
 	int do_barriers;
 
 	/*
@@ -835,7 +836,7 @@ static int make_request(request_queue_t
 		read_bio->bi_sector = r1_bio->sector + mirror->rdev->data_offset;
 		read_bio->bi_bdev = mirror->rdev->bdev;
 		read_bio->bi_end_io = raid1_end_read_request;
-		read_bio->bi_rw = READ;
+		read_bio->bi_rw = READ | do_sync;
 		read_bio->bi_private = r1_bio;
 
 		generic_make_request(read_bio);
@@ -906,7 +907,7 @@ static int make_request(request_queue_t
 		mbio->bi_sector = r1_bio->sector + conf->mirrors[i].rdev->data_offset;
 		mbio->bi_bdev = conf->mirrors[i].rdev->bdev;
 		mbio->bi_end_io = raid1_end_write_request;
-		mbio->bi_rw = WRITE | do_barriers;
+		mbio->bi_rw = WRITE | do_barriers | do_sync;
 		mbio->bi_private = r1_bio;
 
 		if (behind_pages) {
@@ -941,6 +942,8 @@ static int make_request(request_queue_t
 	blk_plug_device(mddev->queue);
 	spin_unlock_irqrestore(&conf->device_lock, flags);
 
+	if (do_sync)
+		md_wakeup_thread(mddev->thread);
 #if 0
 	while ((bio = bio_list_pop(&bl)) != NULL)
 		generic_make_request(bio);
@@ -1541,6 +1544,7 @@ static void raid1d(mddev_t *mddev)
 				 * We already have a nr_pending reference on these rdevs.
 				 */
 				int i;
+				const int do_sync = bio_sync(r1_bio->master_bio);
 				clear_bit(R1BIO_BarrierRetry, &r1_bio->state);
 				clear_bit(R1BIO_Barrier, &r1_bio->state);
 				for (i=0; i < conf->raid_disks; i++)
@@ -1561,7 +1565,7 @@ static void raid1d(mddev_t *mddev)
 							conf->mirrors[i].rdev->data_offset;
 						bio->bi_bdev = conf->mirrors[i].rdev->bdev;
 						bio->bi_end_io = raid1_end_write_request;
-						bio->bi_rw = WRITE;
+						bio->bi_rw = WRITE | do_sync;
 						bio->bi_private = r1_bio;
 						r1_bio->bios[i] = bio;
 						generic_make_request(bio);
@@ -1593,6 +1597,7 @@ static void raid1d(mddev_t *mddev)
 				       (unsigned long long)r1_bio->sector);
 				raid_end_bio_io(r1_bio);
 			} else {
+				const int do_sync = bio_sync(r1_bio->master_bio);
 				r1_bio->bios[r1_bio->read_disk] =
 					mddev->ro ? IO_BLOCKED : NULL;
 				r1_bio->read_disk = disk;
@@ -1608,7 +1613,7 @@ static void raid1d(mddev_t *mddev)
 				bio->bi_sector = r1_bio->sector + rdev->data_offset;
 				bio->bi_bdev = rdev->bdev;
 				bio->bi_end_io = raid1_end_read_request;
-				bio->bi_rw = READ;
+				bio->bi_rw = READ | do_sync;
 				bio->bi_private = r1_bio;
 				unplug = 1;
 				generic_make_request(bio);
--- linux-2.6.19.2.orig/drivers/md/raid10.c
+++ linux-2.6.19.2/drivers/md/raid10.c
@@ -782,6 +782,7 @@ static int make_request(request_queue_t
 	int i;
 	int chunk_sects = conf->chunk_mask + 1;
 	const int rw = bio_data_dir(bio);
+	const int do_sync = bio_sync(bio);
 	struct bio_list bl;
 	unsigned long flags;
 
@@ -863,7 +864,7 @@ static int make_request(request_queue_t
[patch 04/59] Fix HWRNG built-in initcalls priority
-stable review patch.  If anyone has any objections, please let us know.
------------------
From: Michael Buesch <[EMAIL PROTECTED]>

This changes all HWRNG driver initcalls to module_init().  We must probe
the RNGs after the major kernel subsystems are already up and running
(like PCI).

This fixes Bug 7730.
http://bugzilla.kernel.org/show_bug.cgi?id=7730

Signed-off-by: Michael Buesch <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
 drivers/char/hw_random/amd-rng.c    |    2 +-
 drivers/char/hw_random/geode-rng.c  |    2 +-
 drivers/char/hw_random/intel-rng.c  |    2 +-
 drivers/char/hw_random/ixp4xx-rng.c |    2 +-
 drivers/char/hw_random/via-rng.c    |    2 +-
 5 files changed, 5 insertions(+), 5 deletions(-)

--- linux-2.6.19.2.orig/drivers/char/hw_random/amd-rng.c
+++ linux-2.6.19.2/drivers/char/hw_random/amd-rng.c
@@ -144,7 +144,7 @@ static void __exit mod_exit(void)
 	hwrng_unregister(&amd_rng);
 }
 
-subsys_initcall(mod_init);
+module_init(mod_init);
 module_exit(mod_exit);
 
 MODULE_AUTHOR("The Linux Kernel team");
--- linux-2.6.19.2.orig/drivers/char/hw_random/geode-rng.c
+++ linux-2.6.19.2/drivers/char/hw_random/geode-rng.c
@@ -125,7 +125,7 @@ static void __exit mod_exit(void)
 	iounmap(mem);
 }
 
-subsys_initcall(mod_init);
+module_init(mod_init);
 module_exit(mod_exit);
 
 MODULE_DESCRIPTION("H/W RNG driver for AMD Geode LX CPUs");
--- linux-2.6.19.2.orig/drivers/char/hw_random/intel-rng.c
+++ linux-2.6.19.2/drivers/char/hw_random/intel-rng.c
@@ -350,7 +350,7 @@ static void __exit mod_exit(void)
 	iounmap(mem);
 }
 
-subsys_initcall(mod_init);
+module_init(mod_init);
 module_exit(mod_exit);
 
 MODULE_DESCRIPTION("H/W RNG driver for Intel chipsets");
--- linux-2.6.19.2.orig/drivers/char/hw_random/ixp4xx-rng.c
+++ linux-2.6.19.2/drivers/char/hw_random/ixp4xx-rng.c
@@ -64,7 +64,7 @@ static void __exit ixp4xx_rng_exit(void)
 	iounmap(rng_base);
 }
 
-subsys_initcall(ixp4xx_rng_init);
+module_init(ixp4xx_rng_init);
 module_exit(ixp4xx_rng_exit);
 
 MODULE_AUTHOR("Deepak Saxena <[EMAIL PROTECTED]>");
--- linux-2.6.19.2.orig/drivers/char/hw_random/via-rng.c
+++ linux-2.6.19.2/drivers/char/hw_random/via-rng.c
@@ -176,7 +176,7 @@ static void __exit mod_exit(void)
 	hwrng_unregister(&via_rng);
 }
 
-subsys_initcall(mod_init);
+module_init(mod_init);
 module_exit(mod_exit);
 
 MODULE_DESCRIPTION("H/W RNG driver for VIA chipsets");
[patch 16/59] start_kernel: test if irqs got enabled early, barf, and disable them again
-stable review patch.  If anyone has any objections, please let us know.
------------------
From: Ard van Breemen <[EMAIL PROTECTED]>

The calls made by parse_parms to other initialization code might enable
interrupts again way too early.

Having interrupts on this early can make systems PANIC when they
initialize the IRQ controllers (which happens later in the code).  This
patch detects that irqs are enabled again, barfs about it and disables
them again as a safety net.

[EMAIL PROTECTED]: cleanups]
Signed-off-by: Ard van Breemen <[EMAIL PROTECTED]>
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
Signed-off-by: Linus Torvalds <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
This is half of the fix for http://bugzilla.kernel.org/show_bug.cgi?id=7505

 init/main.c |    5 +++++
 1 file changed, 5 insertions(+)

--- linux-2.6.19.2.orig/init/main.c
+++ linux-2.6.19.2/init/main.c
@@ -525,6 +525,11 @@ asmlinkage void __init start_kernel(void
 	parse_args("Booting kernel", command_line, __start___param,
 		   __stop___param - __start___param,
 		   &unknown_bootoption);
+	if (!irqs_disabled()) {
+		printk(KERN_WARNING "start_kernel(): bug: interrupts were "
+				"enabled *very* early, fixing it\n");
+		local_irq_disable();
+	}
 	sort_main_extable();
 	trap_init();
 	rcu_init();
[patch 12/59] [PATCH] Fix reparenting to the same thread group. (take 2)
-stable review patch.  If anyone has any objections, please let us know.
------------------
From: Eric W. Biederman <[EMAIL PROTECTED]>

This patch fixes the case when we reparent to a different thread in the
same thread group.  This modifies the code so that we do not send
signals and do not change the signal to send to SIGCHLD unless we have
changed the thread group of our parents.  It also suppresses sending
pdeath_sig in this case as well since the result of getppid doesn't
change.

Thanks to Oleg for spotting my bug of only fixing this for non-ptraced
tasks.

This fixes the issues identified by Albert Cahalan in thread
http://lkml.org/lkml/2006/12/21/22.

Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]>
Acked-by: Mike Galbraith <[EMAIL PROTECTED]>
Cc: Albert Cahalan <[EMAIL PROTECTED]>
Cc: Andrew Morton <[EMAIL PROTECTED]>
Cc: Roland McGrath <[EMAIL PROTECTED]>
Cc: Ingo Molnar <[EMAIL PROTECTED]>
Cc: Coywolf Qi Hunt <[EMAIL PROTECTED]>
Acked-by: Oleg Nesterov <[EMAIL PROTECTED]>
Signed-off-by: Linus Torvalds <[EMAIL PROTECTED]>
[chrisw: fold in 241ceee0b442, Oleg's fix to restore user visible behaviour]
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
 kernel/exit.c |   29 ++++++++++++++++++-----------
 1 file changed, 18 insertions(+), 11 deletions(-)

--- linux-2.6.19.2.orig/kernel/exit.c
+++ linux-2.6.19.2/kernel/exit.c
@@ -603,10 +603,6 @@ choose_new_parent(struct task_struct *p,
 static void
 reparent_thread(struct task_struct *p, struct task_struct *father, int traced)
 {
-	/* We don't want people slaying init.  */
-	if (p->exit_signal != -1)
-		p->exit_signal = SIGCHLD;
-
 	if (p->pdeath_signal)
 		/* We already hold the tasklist_lock here.  */
 		group_send_sig_info(p->pdeath_signal, SEND_SIG_NOINFO, p);
@@ -626,13 +622,7 @@ reparent_thread(struct task_struct *p, s
 		p->parent = p->real_parent;
 		add_parent(p);
 
-		/* If we'd notified the old parent about this child's death,
-		 * also notify the new parent.
-		 */
-		if (p->exit_state == EXIT_ZOMBIE && p->exit_signal != -1 &&
-		    thread_group_empty(p))
-			do_notify_parent(p, p->exit_signal);
-		else if (p->state == TASK_TRACED) {
+		if (p->state == TASK_TRACED) {
 			/*
 			 * If it was at a trace stop, turn it into
 			 * a normal stop since it's no longer being
@@ -642,6 +632,23 @@ reparent_thread(struct task_struct *p, s
 		}
 	}
 
+	/* If this is a threaded reparent there is no need to
+	 * notify anyone anything has happened.
+	 */
+	if (p->real_parent->group_leader == father->group_leader)
+		return;
+
+	/* We don't want people slaying init.  */
+	if (p->exit_signal != -1)
+		p->exit_signal = SIGCHLD;
+
+	/* If we'd notified the old parent about this child's death,
+	 * also notify the new parent.
+	 */
+	if (!traced && p->exit_state == EXIT_ZOMBIE &&
+	    p->exit_signal != -1 && thread_group_empty(p))
+		do_notify_parent(p, p->exit_signal);
+
 	/*
 	 * process group orphan check
 	 * Case ii: Our child is in a different pgrp
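The notification decision introduced by this patch reduces to a small predicate. The sketch below is a user-space model written for this note, not kernel code: struct task, struct child and should_notify_new_parent() are hypothetical, and the thread group is represented by a plain integer id instead of a group_leader pointer.

```c
#include <assert.h>
#include <stdbool.h>

struct task {
	int group_leader_id;	/* identifies the thread group */
};

struct child {
	bool zombie;		/* models exit_state == EXIT_ZOMBIE */
	int exit_signal;	/* -1 means "no signal" */
	bool group_empty;	/* models thread_group_empty() */
};

/* True if the new parent should be told about the child's death. */
static bool should_notify_new_parent(const struct task *new_parent,
				     const struct task *father,
				     const struct child *c, bool traced)
{
	/* Reparenting within one thread group: nothing changed for the group. */
	if (new_parent->group_leader_id == father->group_leader_id)
		return false;

	return !traced && c->zombie && c->exit_signal != -1 && c->group_empty;
}
```

The early return is the point of the fix: when the new parent is just another thread of the dying parent's group, no signal is sent and exit_signal is left alone.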
[patch 14/59] sched: tasks cannot run on cpus onlined after boot
-stable review patch.  If anyone has any objections, please let us know.

--
From: Nathan Lynch <[EMAIL PROTECTED]>

Commit 5c1e176781f43bc902a51e5832f789756bff911b ("sched: force /sbin/init
off isolated cpus") sets init's cpus_allowed to a subset of cpu_online_map
at boot time, which means that tasks won't be scheduled on cpus that are
added to the system later.

Make init's cpus_allowed a subset of cpu_possible_map instead.  This should
still preserve the behavior that Nick's change intended.

Thanks to Giuliano Pochini for reporting this and testing the fix:

http://ozlabs.org/pipermail/linuxppc-dev/2006-December/029397.html

Signed-off-by: Nathan Lynch <[EMAIL PROTECTED]>
Acked-by: Ingo Molnar <[EMAIL PROTECTED]>
Cc: Nick Piggin <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
Signed-off-by: Linus Torvalds <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
 kernel/sched.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- linux-2.6.19.2.orig/kernel/sched.c
+++ linux-2.6.19.2/kernel/sched.c
@@ -6765,7 +6765,7 @@ void __init sched_init_smp(void)
 
 	lock_cpu_hotplug();
 	arch_init_sched_domains(&cpu_online_map);
-	cpus_andnot(non_isolated_cpus, cpu_online_map, cpu_isolated_map);
+	cpus_andnot(non_isolated_cpus, cpu_possible_map, cpu_isolated_map);
 	if (cpus_empty(non_isolated_cpus))
 		cpu_set(smp_processor_id(), non_isolated_cpus);
 	unlock_cpu_hotplug();

--
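The effect of swapping cpu_online_map for cpu_possible_map can be seen with plain bit masks. The sketch below is a user-space toy, not kernel code: toy_cpumask, toy_andnot() and toy_may_run_on() are hypothetical names, and a 64-bit integer stands in for cpumask_t.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in for cpumask_t: bit n set means CPU n is in the mask. */
typedef uint64_t toy_cpumask;

/* Mirrors the shape of cpus_andnot(dst, src1, src2): dst = src1 & ~src2. */
static toy_cpumask toy_andnot(toy_cpumask src1, toy_cpumask src2)
{
	return src1 & ~src2;
}

/* Nonzero if a task restricted to 'allowed' may run on 'cpu'. */
static int toy_may_run_on(toy_cpumask allowed, unsigned int cpu)
{
	assert(cpu < 64);
	return (int)((allowed >> cpu) & 1u);
}
```

With four possible CPUs (mask 0xF), two of them online at boot (0x3) and CPU 1 isolated (0x2), the old computation gives 0x1, so a CPU 2 that is onlined later is never allowed. The new computation gives 0xD, which admits CPUs 2 and 3 once they appear and still excludes the isolated CPU 1.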
[patch 00/59] -stable review
This is the start of the stable review cycle for the 2.6.19.3 release. There are 59 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let us know. If anyone is a maintainer of the proper subsystem, and wants to add a Signed-off-by: line to the patch, please respond with it. These patches are sent out with a number of different people on the Cc: line. If you wish to be a reviewer, please email [EMAIL PROTECTED] to add your name to the list. If you want to be off the reviewer list, also email us. Responses should be made by Mon Feb 3 02:30 UTC 2007 Anything received after that time might be too late. thanks, the -stable release team -- - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch 1/9] fs: libfs buffered write leak fix
On Fri, Feb 02, 2007 at 06:19:55PM -0800, Andrew Morton wrote:
> On Sat, 3 Feb 2007 03:09:26 +0100
> Nick Piggin <[EMAIL PROTECTED]> wrote:
>
> > From: Nick Piggin <[EMAIL PROTECTED]>
> > To: Andrew Morton <[EMAIL PROTECTED]>
>
> argh.  Yesterday all my emails were getting a mysterious
> s/osdl/linux-foundation/ done to them at the server, so I switched everything
> over.  Now it would appear that they are getting an equally mysterious
> s/linux-foundation/osdl/ done to them.  I assume you sent this to
> [EMAIL PROTECTED]

No. Your first reply I got to this patch came as linux-foundation, and
that's what I replied to. Your subsequent reply back to me ("Yes, the
page just isn't uptodate yet..."), came from osdl.org, which is what I
replied to.

> > Cc: Linux Kernel , Linux Filesystems
> > , Linux Memory Management <[EMAIL PROTECTED]>
> > Subject: Re: [patch 1/9] fs: libfs buffered write leak fix
> > Date: Sat, 3 Feb 2007 03:09:26 +0100
> > User-Agent: Mutt/1.5.9i
> >
> > On Fri, Feb 02, 2007 at 05:58:01PM -0800, Andrew Morton wrote:
> > > On Sat, 3 Feb 2007 02:33:16 +0100
> > > Nick Piggin <[EMAIL PROTECTED]> wrote:
> > >
> > > > I think just setting page uptodate in commit_write might do the
> > > > trick? (and getting rid of the set_page_dirty there).
> > >
> > > Yes, the page just isn't uptodate yet in prepare_write() - moving things
> > > to commit_write() sounds sane.
> > >
> > > But please, can we have sufficient changelogs and comments in the next
> > > version?
> >
> > You're right, sorry. Is this any better?
>
> yup, thanks.
>
> > (warning: nobh code is untested)
>
> ow.

I'll get a chance to do that later today. I have to fire up the old
test case and see if I can reproduce the problem with nobh on a real
fs... Will get back to you when I do.
Re: 2.6.20-rc7: known regressions
On Fri, 02 Feb 2007 21:03:48 -0500 Jeff Garzik <[EMAIL PROTECTED]> wrote: > Andrew Morton wrote: > > On Fri, 2 Feb 2007 06:49:16 +0100 > > Adrian Bunk <[EMAIL PROTECTED]> wrote: > > > >> This email lists some known regressions in 2.6.20-rc7 compared to 2.6.19 > >> that are not yet fixed in Linus' tree. > > > > There are still a few things hanging around. > > > > I have these queued: > > > > aio-fix-buggy-put_ioctx-call-in-aio_complete-v2.patch > > kexec-avoid-migration-of-already-disabled-irqs-ia64.patch > > net-smc911x-match-up-spin-lock-unlock.patch > > rtc-pcf8563-detect-polarity-of-century-bit-automatically.patch > > alpha-fix-epoll-syscall-enumerations.patch > > revert-blockdev-direct-io-back-to-2619-version.patch > > scsi-sd-udev-accessing-an-uninitialized-scsi_disk-results-in-a-crash.patch > > altix-more-acpi-prt-support.patch > > Would you forward the x86-64 dma_noncoherent API build fix I posted? > Anything that uses that API won't build on x86-64 without my [simple and > obvious] patch. Yup. That's this: --- a/include/asm-x86_64/dma-mapping.h~x86-64-define-dma-noncoherent-api-functions +++ a/include/asm-x86_64/dma-mapping.h @@ -63,6 +63,9 @@ static inline int dma_mapping_error(dma_ return (dma_addr == bad_dma_address); } +#define dma_alloc_noncoherent(d, s, h, f) dma_alloc_coherent(d, s, h, f) +#define dma_free_noncoherent(d, s, v, h) dma_free_coherent(d, s, v, h) + extern void *dma_alloc_coherent(struct device *dev, size_t size, dma_addr_t *dma_handle, gfp_t gfp); extern void dma_free_coherent(struct device *dev, size_t size, void *vaddr, _ > > > - I have r8169-fix-a-race-between-pci-probe-and-dev_open.patch floating > > about, but I forget its status. > > I posted a preferred patch (which someone then noted need to use > setup_timer), and am waiting for an "it works" response of some sort OK, thanks, I'll drop it. 
Re: [patch 1/9] fs: libfs buffered write leak fix
On Sat, 3 Feb 2007 03:09:26 +0100 Nick Piggin <[EMAIL PROTECTED]> wrote: > From: Nick Piggin <[EMAIL PROTECTED]> > To: Andrew Morton <[EMAIL PROTECTED]> argh. Yesterday all my emails were getting a mysterious s/osdl/linux-foundation/ done to them at the server, so I switched everything over. Now it would appear that they are getting an equally mysterious s/linux-foundation/osdl/ done to them. I assume you sent this to [EMAIL PROTECTED] > Cc: Linux Kernel , Linux Filesystems > , Linux Memory Management <[EMAIL PROTECTED]> > Subject: Re: [patch 1/9] fs: libfs buffered write leak fix > Date: Sat, 3 Feb 2007 03:09:26 +0100 > User-Agent: Mutt/1.5.9i > > On Fri, Feb 02, 2007 at 05:58:01PM -0800, Andrew Morton wrote: > > On Sat, 3 Feb 2007 02:33:16 +0100 > > Nick Piggin <[EMAIL PROTECTED]> wrote: > > > > > I think just setting page uptodate in commit_write might do the > > > trick? (and getting rid of the set_page_dirty there). > > > > Yes, the page just isn't uptodate yet in prepare_write() - moving things > > to commti_write() sounds sane. > > > > But please, can we have sufficient changelogs and comments in the next > > version? > > You're right, sorry. Is this any better? yup, thanks. > (warning: nobh code is untested) ow. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
libata_uli puts second channel to PIO4 on 2.6.18
Hi, I got this SATA PCI card: 00:04.0 Mass storage controller: ALi Corporation ALi M5281 Serial ATA / RAID Host Controller (rev a4) (prog-if 85) Subsystem: ALi Corporation ALi M5281 Serial ATA / RAID Host Controller Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- Status: Cap- 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- SERR- Latency: 128, Cache Line Size: 512 bytes Interrupt: pin A routed to IRQ 185 Region 0: I/O ports at d400 [size=8] Region 1: I/O ports at d000 [size=4] Region 2: I/O ports at b800 [size=8] Region 3: I/O ports at b400 [size=4] Region 4: I/O ports at b000 [size=16] [virtual] Expansion ROM at 8800 [disabled] [size=64K] 00:04.1 Mass storage controller: ALi Corporation M5228 ALi ATA/RAID Controller (rev c6) (prog-if 85) Subsystem: ALi Corporation Unknown device 5281 Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- Status: Cap- 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- SERR- Latency: 128 Interrupt: pin A routed to IRQ 9 Region 0: I/O ports at a800 [size=8] Region 1: I/O ports at a400 [size=4] Region 2: I/O ports at a000 [size=8] Region 3: I/O ports at 9800 [size=4] Region 4: I/O ports at 9400 [size=16] It worked very well for half a year but with one disk (IIRC it was even plugged into second channel but I wont bet on it). Now I have second disk (very similar) and it is always put into PIO4 mode: [ 17.404451] libata version 2.00 loaded. 
[ 17.404916] sata_uli :00:04.0: version 1.0 [ 17.405009] ACPI: PCI Interrupt :00:04.0[A] -> GSI 18 (level, low) -> IRQ 185 [ 17.405223] ata1: SATA max UDMA/133 cmd 0xD400 ctl 0xD002 bmdma 0xB000 irq 185 [ 17.405385] ata2: SATA max UDMA/133 cmd 0xB800 ctl 0xB402 bmdma 0xB008 irq 185 [ 17.405519] scsi2 : sata_uli [ 17.858803] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310) [ 17.880541] ata1.00: ATA-7, max UDMA/133, 488397168 sectors: LBA48 NCQ (depth 0/32) [ 17.880660] ata1.00: ata1: dev 0 multi count 16 [ 17.58] ata1.00: configured for UDMA/133 [ 17.888941] scsi3 : sata_uli [ 18.342469] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 310) [ 18.343573] ata2.00: ATA-7, max UDMA/133, 488397168 sectors: LBA48 NCQ (depth 0/32) [ 18.343691] ata2.00: ata2: dev 0 multi count 16 [ 18.344972] ata2.00: configured for PIO4 [ 18.345466] Vendor: ATA Model: ST3250620NS Rev: 3.AE [ 18.346391] Type: Direct-Access ANSI SCSI revision: 05 [ 18.347464] Vendor: ATA Model: ST3250620NS Rev: 3.AE [ 18.348390] Type: Direct-Access ANSI SCSI revision: 05 [ 18.349457] SCSI device sda: 488397168 512-byte hdwr sectors (250059 MB) [ 18.350234] sda: Write Protect is off [ 18.350307] sda: Mode Sense: 00 3a 00 00 [ 18.351234] SCSI device sda: drive cache: write back [ 18.352233] SCSI device sda: 488397168 512-byte hdwr sectors (250059 MB) [ 18.352444] sda: Write Protect is off [ 18.352517] sda: Mode Sense: 00 3a 00 00 [ 18.353443] SCSI device sda: drive cache: write back [ 18.353522] sda: sda1 sda2 [ 18.371118] sd 2:0:0:0: Attached scsi disk sda [ 18.372221] SCSI device sdb: 488397168 512-byte hdwr sectors (250059 MB) [ 18.372431] sdb: Write Protect is off [ 18.372504] sdb: Mode Sense: 00 3a 00 00 [ 18.373440] SCSI device sdb: drive cache: write back [ 18.374430] SCSI device sdb: 488397168 512-byte hdwr sectors (250059 MB) [ 18.375218] sdb: Write Protect is off [ 18.375291] sdb: Mode Sense: 00 3a 00 00 [ 18.376216] SCSI device sdb: drive cache: write back [ 18.376295] sdb: unknown partition 
table [ 18.381481] sd 3:0:0:0: Attached scsi disk sdb As you probably know this gives very very poor performance. Is there any way to make it fast? I tried changing cables and reconnecting them but it looks like it does not help. I can't do too much with this hardware since it is used as production server. But testing some patches is of course possible. On the other hand full kernel upgrade to 2.6.19 or .20 is not possible because this kernel has openvz patches and I don't have them for .19 or .20 yet. This is what I am getting from various utilities: # dmesg [0.00] Linux version 2.6.18-028test010 ([EMAIL PROTECTED]) (gcc version 3.4.6 (Gentoo 3.4.6-r2, ssp-3.4.6-1.0, pie-8.7.10)) #3 SMP Thu Jan 18 02:02:53 CET 2007 [0.00] BIOS-provided physical RAM map: [0.00] BIOS-e820: - 0009f000 (usable) [0.00] BIOS-e820: 0009f000 - 000a (reserved) [0.00] BIOS-e820: 000f - 0010 (reserved) [0.00] BIOS-e820: 0010 - 7fffb000 (usable) [0.00] BIOS-e820: 7fffb000 -
[PATCH/RFC] alternative approach to: Ban module license tag string termination trick
This patch changes the module license handling code to: - allow modules to have multiple licenses - access GPL symbols if at least one license is GPL-compatible - prevent the "GPL\0 for nothing"-trick - fix an off-by-one buffer overflow (exploitable only if the attacker can load modules) - move the ndiswrapper check into the new license checking routine Signed-Off-By: Bodo Eggert <[EMAIL PROTECTED]> --- The license handling code was kind of strange: - The kernel itself would only consider the first license, while modpost looks at all of them. - If you offer your module under a non-GPL license in addition to GPL, modpost would consider this module to be non-GPL. Therefore you can't say MODULE_LICENSE("GPL");\nMODULE_LICENSE("completely free"); Since I had to rewrite this part, I changed the behaviour to accept all modules having _at_least_ one GPL-compatible license. Prohibiting the \0-trick is done by storing the length of the license behind the license itself, uuencoded, as $=xyz. Currently, only 18 bits (256 KB) of the length are stored, but storing up to 30 bits is possible without changing anything besides the macro. You can still trick this code by including "...\0license=GPL\0$=$\0..." or by manually fabricating this string into .modinfo. Fix: Document this to mean that you actually GPL-license the module. TODO: get_modinfo: make sure the value returned does not exceed the end of the buffer. 
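To make the "GPL\0 for nothing" trick concrete, here is a minimal user-space sketch of the check the patch adds to license_is_gpl_compatible(): the string must match an accepted license and the separately recorded length must equal the visible length. The name toy_license_is_gpl_compatible() and the recorded_len parameter are assumptions for illustration; how the patch encodes and retrieves the length in .modinfo is not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Returns 1 only if 'license' is on the GPL-compatible list AND the length
 * recorded at build time equals the length of the visible string.
 * 'recorded_len' is a hypothetical stand-in for the value stored next to
 * the license tag.
 */
static int toy_license_is_gpl_compatible(const char *license,
					 size_t recorded_len)
{
	static const char *const gpl_compatible[] = {
		"GPL",
		"GPL v2",
		"GPL and additional rights",
		"Dual BSD/GPL",
		"Dual MIT/GPL",
		"Dual MPL/GPL",
	};
	size_t i;

	assert(license != NULL);
	for (i = 0; i < sizeof(gpl_compatible) / sizeof(gpl_compatible[0]); i++) {
		/* strcmp() stops at the first NUL, so the length must agree too. */
		if (strcmp(license, gpl_compatible[i]) == 0 &&
		    recorded_len == strlen(gpl_compatible[i]))
			return 1;
	}
	return 0;
}
```

A literal such as "GPL\0 for nothing" compares equal to "GPL" under strcmp(), but its recorded length is 16 rather than 3, so the check rejects it. A plain "GPL" with recorded length 3 is accepted.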
include/linux/license.h | 27 +--- include/linux/module.h |3 - include/linux/moduleinfo.h | 19 + include/linux/moduleparam.h | 11 + kernel/Makefile |2 kernel/module.c | 92 +--- kernel/moduleinfo.c | 73 ++ scripts/mod/Makefile|2 scripts/mod/modpost.c | 42 +--- scripts/mod/moduleinfo.c|3 + 10 files changed, 195 insertions(+), 79 deletions(-) diff -X dontdiff -pruN 2.6.19/include/linux/license.h 2.6.19.license/include/linux/license.h --- 2.6.19/include/linux/license.h 2006-11-29 22:57:37.0 +0100 +++ 2.6.19.license/include/linux/license.h 2007-02-02 18:30:44.0 +0100 @@ -1,14 +1,27 @@ #ifndef __LICENSE_H #define __LICENSE_H -static inline int license_is_gpl_compatible(const char *license) +static inline int license_is_gpl_compatible(const char *license, +int length) { - return (strcmp(license, "GPL") == 0 - || strcmp(license, "GPL v2") == 0 - || strcmp(license, "GPL and additional rights") == 0 - || strcmp(license, "Dual BSD/GPL") == 0 - || strcmp(license, "Dual MIT/GPL") == 0 - || strcmp(license, "Dual MPL/GPL") == 0); + static char *gpl_compatible[] = { + "GPL", + "GPL v2", + "GPL and additional rights", + "Dual BSD/GPL", + "Dual MIT/GPL", + "Dual MPL/GPL", + NULL + }; + char **p = gpl_compatible; + + while (*p) { + if(!strcmp(license, *p) + && length == strlen(*p)) + return 1; + p++; + } + return 0; } #endif diff -X dontdiff -pruN 2.6.19/include/linux/module.h 2.6.19.license/include/linux/module.h --- 2.6.19/include/linux/module.h 2006-11-29 22:57:37.0 +0100 +++ 2.6.19.license/include/linux/module.h 2007-02-02 23:56:39.0 +0100 @@ -92,6 +92,7 @@ extern struct module __this_module; /* Generic info of form tag = "info" */ #define MODULE_INFO(tag, info) __MODULE_INFO(tag, tag, info) +#define MODULE_INFO_I(tag, info) __MODULE_INFO_I(tag, tag, info) /* For userspace: you can also call me... */ #define MODULE_ALIAS(_alias) MODULE_INFO(alias, _alias) @@ -124,7 +125,7 @@ extern struct module __this_module; * 2. 
So the community can ignore bug reports including proprietary modules * 3. So vendors can do likewise based on their own policies */ -#define MODULE_LICENSE(_license) MODULE_INFO(license, _license) +#define MODULE_LICENSE(_license) MODULE_INFO_I(license, _license) /* Author, ideally of form NAME [, NAME ]*[ and NAME ] */ #define MODULE_AUTHOR(_author) MODULE_INFO(author, _author) diff -X dontdiff -pruN 2.6.19/include/linux/moduleinfo.h 2.6.19.license/include/linux/moduleinfo.h --- 2.6.19/include/linux/moduleinfo.h 1970-01-01 01:00:00.0 +0100 +++ 2.6.19.license/include/linux/moduleinfo.h 2007-02-02 20:33:26.0 +0100 @@ -0,0 +1,19 @@ +#ifndef __MODULEINFO_H +#define __MODULEINFO_H + +struct pstring_len { + char * s; + unsigned long i; +}; + +extern void do_get_next_modinfo_len(struct pstring_len *ret, +char * start, +unsigned long size, +const char
Re: [PATCH 2.6.19.2] SCSI sd: udev accessing an uninitialized scsi_disk results in a crash
On Fri, Feb 02, 2007 at 05:19:24PM -0800, Andrew Morton wrote:
> On Fri, 2 Feb 2007 17:34:56 +0530
> Nagendra Singh Tomar <[EMAIL PROTECTED]> wrote:
>
> > Hi,
> > sd_probe() calls class_device_add() even before initializing the
> > sdkp->device variable. class_device_add() eventually results in the user
> > mode udev program to be called. udev program can read the allow_restart
> > attribute of the newly created scsi device. This is resulting in a crash as
> > the show function for allow_restart (i.e sd_show_allow_restart) returns the
> > attribute value by reading the sdkp->device->allow_restart variable. As the
> > sdkp->device is not initialized before calling the user mode hotplug helper,
> > this results in a crash.
> > The patch below solves it by calling class_device_add() only after the
> > necessary fields in the scsi_disk structure are initialized properly.
> >
> > --- linux-2.6.19.2/drivers/scsi/sd.c.orig 2007-02-02 17:03:03.0 +0530
> > +++ linux-2.6.19.2/drivers/scsi/sd.c 2007-02-02 17:04:04.0 +0530
> > @@ -1646,16 +1646,6 @@ static int sd_probe(struct device *dev)
> >  	if (error)
> >  		goto out_put;
> >  
> > -	class_device_initialize(&sdkp->cdev);
> > -	sdkp->cdev.dev = &sdp->sdev_gendev;
> > -	sdkp->cdev.class = &sd_disk_class;
> > -	strncpy(sdkp->cdev.class_id, sdp->sdev_gendev.bus_id, BUS_ID_SIZE);
> > -
> > -	if (class_device_add(&sdkp->cdev))
> > -		goto out_put;
> > -
> > -	get_device(&sdp->sdev_gendev);
> > -
> >  	sdkp->device = sdp;
> >  	sdkp->driver = &sd_template;
> >  	sdkp->disk = gd;
> > @@ -1669,6 +1659,16 @@ static int sd_probe(struct device *dev)
> >  			sdp->timeout = SD_MOD_TIMEOUT;
> >  	}
> >  
> > +	class_device_initialize(&sdkp->cdev);
> > +	sdkp->cdev.dev = &sdp->sdev_gendev;
> > +	sdkp->cdev.class = &sd_disk_class;
> > +	strncpy(sdkp->cdev.class_id, sdp->sdev_gendev.bus_id, BUS_ID_SIZE);
> > +
> > +	if (class_device_add(&sdkp->cdev))
> > +		goto out_put;
> > +
> > +	get_device(&sdp->sdev_gendev);
> > +
> >  	gd->major = sd_major((index & 0xf0) >> 4);
> >  	gd->first_minor = ((index & 0xf) << 4) | (index & 0xfff00);
> >  	gd->minors = 16;
>
> Thanks - I'll queue this up for 2.6.20 also.

No objection from me, as long as James says this is ok.

I wonder why we haven't noticed this in the past?

thanks,

greg k-h
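The ordering problem described in the sd_probe() report can be shown with a small user-space model. Everything here is hypothetical (toy_device, toy_disk, toy_show_allow_restart(), toy_probe()); the "reader" call stands in for udev reading the attribute as soon as the device is registered, and a -1 return stands in for the NULL dereference the kernel would hit.

```c
#include <assert.h>
#include <stddef.h>

struct toy_device {
	int allow_restart;
};

struct toy_disk {
	struct toy_device *device;	/* plays the role of sdkp->device */
};

/* Stand-in for the attribute show routine; -1 marks the would-be crash. */
static int toy_show_allow_restart(const struct toy_disk *disk)
{
	assert(disk != NULL);
	if (disk->device == NULL)
		return -1;
	return disk->device->allow_restart;
}

/*
 * Returns what a reader sees at the moment the disk becomes visible.
 * init_first == 0 models the old order (register, then initialise),
 * init_first != 0 models the patched order (initialise, then register).
 */
static int toy_probe(struct toy_disk *disk, struct toy_device *dev,
		     int init_first)
{
	int seen;

	disk->device = NULL;
	if (init_first)
		disk->device = dev;
	seen = toy_show_allow_restart(disk);	/* reader runs at registration */
	disk->device = dev;
	return seen;
}
```

With the old order the reader observes the uninitialised pointer; with the patched order it observes the real allow_restart value. The real fix moves a whole group of statements, but the principle is the same: finish initialising the structure before making it reachable from user space.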
Re: [patch 1/9] fs: libfs buffered write leak fix
On Fri, Feb 02, 2007 at 05:58:01PM -0800, Andrew Morton wrote: > On Sat, 3 Feb 2007 02:33:16 +0100 > Nick Piggin <[EMAIL PROTECTED]> wrote: > > > I think just setting page uptodate in commit_write might do the > > trick? (and getting rid of the set_page_dirty there). > > Yes, the page just isn't uptodate yet in prepare_write() - moving things > to commti_write() sounds sane. > > But please, can we have sufficient changelogs and comments in the next > version? You're right, sorry. Is this any better? (warning: nobh code is untested) -- simple_prepare_write and nobh_prepare_write leak uninitialised kernel data. This happens because the prepare_write functions leave an uninitialised "hole" over the part of the page that the write is expected to go to. This is fine, but they then mark the page uptodate, which means a concurrent read can come in and copy the uninitialised memory into userspace before it written to. Fix simple_readpage by simply initialising the whole page in the case of a partial-page write. In the case of a full-page write, we don't SetPageDirty until commit_write time. Signed-off-by: Nick Piggin <[EMAIL PROTECTED]> Index: linux-2.6/fs/libfs.c === --- linux-2.6.orig/fs/libfs.c +++ linux-2.6/fs/libfs.c @@ -327,25 +327,32 @@ int simple_readpage(struct file *file, s int simple_prepare_write(struct file *file, struct page *page, unsigned from, unsigned to) { - if (!PageUptodate(page)) { - if (to - from != PAGE_CACHE_SIZE) { - void *kaddr = kmap_atomic(page, KM_USER0); - memset(kaddr, 0, from); - memset(kaddr + to, 0, PAGE_CACHE_SIZE - to); - flush_dcache_page(page); - kunmap_atomic(kaddr, KM_USER0); - } + if (PageUptodate(page)) + return 0; + + if (to - from != PAGE_CACHE_SIZE) { + /* +* Partial-page write? Initialise the complete page and +* set it uptodate. We could avoid initialising the +* (from, to) hole, and opt to mark it uptodate in +* simple_commit_write, but that's probably only a win +* for filesystems that would need to read blocks off disk. 
+*/ + memclear_highpage_flush(page, 0, PAGE_CACHE_SIZE); SetPageUptodate(page); } + return 0; } int simple_commit_write(struct file *file, struct page *page, - unsigned offset, unsigned to) + unsigned from, unsigned to) { struct inode *inode = page->mapping->host; loff_t pos = ((loff_t)page->index << PAGE_CACHE_SHIFT) + to; + if (to - from == PAGE_CACHE_SIZE) + SetPageUptodate(page); /* * No need to use i_size_read() here, the i_size * cannot change under us because we hold the i_mutex. @@ -353,6 +360,7 @@ int simple_commit_write(struct file *fil if (pos > inode->i_size) i_size_write(inode, pos); set_page_dirty(page); + return 0; } Index: linux-2.6/fs/buffer.c === --- linux-2.6.orig/fs/buffer.c +++ linux-2.6/fs/buffer.c @@ -2344,17 +2344,6 @@ int nobh_prepare_write(struct page *page if (is_mapped_to_disk) SetPageMappedToDisk(page); - SetPageUptodate(page); - - /* -* Setting the page dirty here isn't necessary for the prepare_write -* function - commit_write will do that. But if/when this function is -* used within the pagefault handler to ensure that all mmapped pages -* have backing space in the filesystem, we will need to dirty the page -* if its contents were altered. -*/ - if (dirtied_it) - set_page_dirty(page); return 0; @@ -2384,6 +2373,7 @@ int nobh_commit_write(struct file *file, struct inode *inode = page->mapping->host; loff_t pos = ((loff_t)page->index << PAGE_CACHE_SHIFT) + to; + SetPageUptodate(page); set_page_dirty(page); if (pos > inode->i_size) { i_size_write(inode, pos); Index: linux-2.6/Documentation/filesystems/vfs.txt === --- linux-2.6.orig/Documentation/filesystems/vfs.txt +++ linux-2.6/Documentation/filesystems/vfs.txt @@ -617,6 +617,11 @@ struct address_space_operations { In this case the prepare_write will be retried one the lock is regained. + Note: the page _must not_ be marked uptodate in this function + (or anywhere else) unless it actually is uptodate right now. 
As + soon as a page is marked uptodate, it is possible for a concurrent + read(2) to copy it to userspace. + commit_write: If prepare_write
Re: 2.6.20-rc7: known regressions
Andrew Morton wrote: On Fri, 2 Feb 2007 06:49:16 +0100 Adrian Bunk <[EMAIL PROTECTED]> wrote: This email lists some known regressions in 2.6.20-rc7 compared to 2.6.19 that are not yet fixed in Linus' tree. There are still a few things hanging around. I have these queued: aio-fix-buggy-put_ioctx-call-in-aio_complete-v2.patch kexec-avoid-migration-of-already-disabled-irqs-ia64.patch net-smc911x-match-up-spin-lock-unlock.patch rtc-pcf8563-detect-polarity-of-century-bit-automatically.patch alpha-fix-epoll-syscall-enumerations.patch revert-blockdev-direct-io-back-to-2619-version.patch scsi-sd-udev-accessing-an-uninitialized-scsi_disk-results-in-a-crash.patch altix-more-acpi-prt-support.patch Would you forward the x86-64 dma_noncoherent API build fix I posted? Anything that uses that API won't build on x86-64 without my [simple and obvious] patch. - I have r8169-fix-a-race-between-pci-probe-and-dev_open.patch floating about, but I forget its status. I posted a preferred patch (which someone then noted need to use setup_timer), and am waiting for an "it works" response of some sort Jeff - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/2] x86_64 irq: Handle irqs pending in IRR during irq migration.
On Fri, 02 Feb 2007 18:39:15 -0700 [EMAIL PROTECTED] (Eric W. Biederman) wrote: > Andrew Morton <[EMAIL PROTECTED]> writes: > > > So is this a for-2.6.20 thing? The bug was present in 2.6.19, so > > I assume it doesn't affect many people? > > If it's not to late, and this patch isn't too scary. > > It's a really rare set of circumstances that trigger it, but the > possibility of being hit is pretty widespread, anything with > more than one cpu, and more then one irq could see this. > > The easiest way to trigger this is to have two level triggered irqs on > two different cpus using the same vector. In that case if one acks > it's irq while the other irq is migrating to a different cpu 2.6.19 > get completely confused and stop handling interrupts properly. > > With my previous bug fix (not to drop the ack when we are confused) > the machine will stay up, and that is obviously correct and can't > affect anything else so is probably a candidate for the stable tree. > > With this fix everything just works. > > I don't know how often a legitimate case of the exact same irq > going off twice in a row is, but that is a possibility as well > especially with edge triggered interrupts. > > Setting up the test scenario was a pain, but by extremely limiting > my choice of vectors I was able to confirm I survived several hundred > of these events with in a couple of minutes no problem. > OK, thanks. Let's await Andi's feedback. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: 2.6.20-rc7: known regressions
On Fri, 2 Feb 2007 06:49:16 +0100 Adrian Bunk <[EMAIL PROTECTED]> wrote:

> This email lists some known regressions in 2.6.20-rc7 compared to 2.6.19
> that are not yet fixed in Linus' tree.

There are still a few things hanging around.

I have these queued:

aio-fix-buggy-put_ioctx-call-in-aio_complete-v2.patch
kexec-avoid-migration-of-already-disabled-irqs-ia64.patch
net-smc911x-match-up-spin-lock-unlock.patch
rtc-pcf8563-detect-polarity-of-century-bit-automatically.patch
alpha-fix-epoll-syscall-enumerations.patch
revert-blockdev-direct-io-back-to-2619-version.patch
scsi-sd-udev-accessing-an-uninitialized-scsi_disk-results-in-a-crash.patch
altix-more-acpi-prt-support.patch

which I'll get through to Linus later today.

Plus:

- x86_64-irq-simplfy-__assign_irq_vector.patch and
  x86_64-irq-handle-irqs-pending-in-irr-during-irq-migration.patch which are
  big and scary.  Am awaiting feedback from Andi and Eric on what to do with
  these.

- A fix from Trond for http://bugzilla.kernel.org/show_bug.cgi?id=7923.  Am
  awaiting acks to merge that.

- sky2-flow-control-off.patch from shemminger which I assume Linus will be
  merging anyway.

- v9fs_vfs_mkdir-fix-a-double-free.patch which I guess I'll merge unless
  Eric suddenly nacks it.

- I have r8169-fix-a-race-between-pci-probe-and-dev_open.patch floating
  about, but I forget its status.

- I have efi-x86-pass-firmware-call-parameters-on-the-stack.patch, but I'm
  not sure it's right and unless something really rapid happens, we'll ship
  with that bug unfixed.

- enable-mouse-button-23-emulation-for-x86-macs.patch looks simple enough,
  but I'm waiting for Ben to wake up.

- x86-fix-vdso-mapping-for-aout-executables.patch probably works OK, but
  Andi points out that it'd be better to implement this with attribute-weak.

  So I guess 2.6.20 will ship with non-functional a.out on i386, like 2.6.19.
Re: [patch 1/9] fs: libfs buffered write leak fix
On Sat, 3 Feb 2007 02:33:16 +0100 Nick Piggin <[EMAIL PROTECTED]> wrote:

> > > ===
> > > --- linux-2.6.orig/fs/buffer.c
> > > +++ linux-2.6/fs/buffer.c
> > > @@ -2344,6 +2344,8 @@ int nobh_prepare_write(struct page *page
> > >
> > >  	if (is_mapped_to_disk)
> > >  		SetPageMappedToDisk(page);
> > > +
> > > +	/* XXX: information leak vs read(2) */
> > >  	SetPageUptodate(page);
> > >
> > >  	/*
> >
> > That comment is too terse to be useful.
>
> OK, similar problem here - we have brought all the buffers uptodate
> that we are *not* going to write over, or partially write over, but
> we can have an uninitialised hole over the region we want to write.
>
> I think just setting page uptodate in commit_write might do the
> trick? (and getting rid of the set_page_dirty there).

Yes, the page just isn't uptodate yet in prepare_write() - moving things
to commit_write() sounds sane.

But please, can we have sufficient changelogs and comments in the next
version?
Re: SATA exceptions with 2.6.20-rc5
On 2007.01.24 01:39:23 +0100, Björn Steinbrink wrote: > On 2007.01.23 17:18:43 -0600, Robert Hancock wrote: > > Larry Walton wrote: > > >The last patch (sata_nv-force-int-dev-in-interrupt.patch) > > >seems to have fix the problem. Much appreciated, > > >thank you. I'd consider it a must have in 2.6.20. > > > > Can any of the rest of you that have been seeing this problem also > > confirm that this fixes it? > > Seems to work for me, uptime is about an hour now and no exception yet. > Had the stress test running for only about 10 minutes, but I usually got > an exception within an hour even during plain irssi usage, so I'm quite > confident that the patch fixes it. Or maybe not :( Just got an exception on 2.6.20-rc6. Took 4 days of uptime to trigger, so it's just a lot harder to trigger now. Björn - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/2] x86_64 irq: Handle irqs pending in IRR during irq migration.
Andrew Morton <[EMAIL PROTECTED]> writes:

> So is this a for-2.6.20 thing? The bug was present in 2.6.19, so
> I assume it doesn't affect many people?

If it's not too late, and this patch isn't too scary.

It's a really rare set of circumstances that trigger it, but the
possibility of being hit is pretty widespread: anything with
more than one cpu and more than one irq could see this.

The easiest way to trigger this is to have two level triggered irqs on
two different cpus using the same vector. In that case, if one acks
its irq while the other irq is migrating to a different cpu, 2.6.19
gets completely confused and stops handling interrupts properly.

With my previous bug fix (not to drop the ack when we are confused)
the machine will stay up, and that is obviously correct and can't
affect anything else, so it is probably a candidate for the stable tree.

With this fix everything just works.

I don't know how often a legitimate case of the exact same irq
going off twice in a row is, but that is a possibility as well,
especially with edge triggered interrupts.

Setting up the test scenario was a pain, but by extremely limiting
my choice of vectors I was able to confirm I survived several hundred
of these events within a couple of minutes with no problem.

Eric
Re: How many people are using 2.6.16?
On Thu, Feb 01, 2007 at 03:13:03PM +0300, Vladimir V. Saveliev wrote: > Hello Hi Vladimir, > On Wednesday 31 January 2007 10:02, Adrian Bunk wrote: >... > > reiserfs: > > commit de14569f94513279e3d44d9571a421e9da1759ae > > [PATCH] resierfs: avoid tail packing if an inode was ever mmapped > > backport to 2.6.16 required > > Here it goes: >... thanks a lot, applied to 2.6.16. cu Adrian -- "Is there not promise of rain?" Ling Tan asked suddenly out of the darkness. There had been need of rain for many days. "Only a promise," Lao Er said. Pearl S. Buck - Dragon Seed
Re: [PATCH 1/1] - Altix: more ACPI PRT support
On Fri, 02 Feb 2007 14:54:12 -0600 John Keller <[EMAIL PROTECTED]> wrote: > The SN Altix platform does not conform to the > IOSAPIC IRQ routing model. Add code in acpi_unregister_gsi() > to check if (acpi_irq_model == ACPI_IRQ_MODEL_PLATFORM) and > return. > > Signed-off-by: John Keller <[EMAIL PROTECTED]> > --- > > Due to an oversight, this code was not added previously when > similar code was added to acpi_register_gsi(). > > http://marc.theaimsgroup.com/?l=linux-acpi&m=116680983430121&w=2 > > arch/ia64/kernel/acpi.c | 3 +++ > 1 file changed, 3 insertions(+) > > > Index: linux-2.6/arch/ia64/kernel/acpi.c > === > --- linux-2.6.orig/arch/ia64/kernel/acpi.c 2007-02-02 14:44:31.0 > -0600 > +++ linux-2.6/arch/ia64/kernel/acpi.c 2007-02-02 14:47:44.658143727 -0600 > @@ -609,6 +609,9 @@ EXPORT_SYMBOL(acpi_register_gsi); > > void acpi_unregister_gsi(u32 gsi) > { > + if (acpi_irq_model == ACPI_IRQ_MODEL_PLATFORM) > + return; > + > iosapic_unregister_intr(gsi); > } Given that the December 22 patch appears to be in mainline, and that this patch is simple, I shall cheerily bypass maintainers and send it in for 2.6.20.
Re: [patch 9/9] mm: fix pagecache write deadlocks
On Fri, Feb 02, 2007 at 03:53:11PM -0800, Andrew Morton wrote: > On Mon, 29 Jan 2007 11:33:03 +0100 (CET) > Nick Piggin <[EMAIL PROTECTED]> wrote: > > > Modify the core write() code so that it won't take a pagefault while > > holding a > > lock on the pagecache page. There are a number of different deadlocks > > possible > > if we try to do such a thing: > > > > 1. generic_buffered_write > > 2. lock_page > > 3.prepare_write > > 4. unlock_page+vmtruncate > > 5. copy_from_user > > 6. mmap_sem(r) > > 7. handle_mm_fault > > 8.lock_page (filemap_nopage) > > 9.commit_write > > 10. unlock_page > > > > a. sys_munmap / sys_mlock / others > > b. mmap_sem(w) > > c. make_pages_present > > d.get_user_pages > > e. handle_mm_fault > > f. lock_page (filemap_nopage) > > > > 2,8 - recursive deadlock if page is same > > 2,8;2,8 - ABBA deadlock is page is different > > 2,6;b,f - ABBA deadlock if page is same > > > > The solution is as follows: > > 1. If we find the destination page is uptodate, continue as normal, but use > > atomic usercopies which do not take pagefaults and do not zero the > > uncopied > > tail of the destination. The destination is already uptodate, so we can > > commit_write the full length even if there was a partial copy: it does > > not > > matter that the tail was not modified, because if it is dirtied and > > written > > back to disk it will not cause any problems (uptodate *means* that the > > destination page is as new or newer than the copy on disk). > > > > 1a. The above requires that fault_in_pages_readable correctly returns access > > information, because atomic usercopies cannot distinguish between > > non-present pages in a readable mapping, from lack of a readable > > mapping. > > > > 2. If we find the destination page is non uptodate, unlock it (this could > > be > > made slightly more optimal), then find and pin the source page with > > get_user_pages. Relock the destination page and continue with the copy. 
> > However, instead of a usercopy (which might take a fault), copy the data > > via the kernel address space. > > > > Oh what a mess we're making :( > > Unfortunately, write() into a non-uptodate page is very much the common > case. We've always tried to avoid doing a pte-walk in the write() path to > fix this bug. Careful performance testing is needed here so we can assess > the impact. For threaded applications, simply the taking of mmap_sem might > be the biggest problem. > > And I can't think of any tricks we can play to avoid doing the pte-walk in > most cases. For example, we don't yet have a page to run page_mapped() > against. After this patch series, I am working on another that will allow filesystems to specifically code around the problem (eg. by handling short usercopies properly). I tried to take this approach generically the first time, but it turns out lots of filesystems had subtle problems, so if we do it this way instead, then filesystem developers who actually care enough can improve their code, and those that don't won't hold them back (or prevent this bug from being fixed). > > break; > > } > > > > + /* > > +* non-uptodate pages cannot cope with short copies, and we > > +* cannot take a pagefault with the destination page locked. > > +* So pin the source page to copy it. > > +*/ > > + if (!PageUptodate(page)) { > > + unlock_page(page); > > + > > + bytes = min(bytes, PAGE_CACHE_SIZE - > > +((unsigned long)buf & ~PAGE_CACHE_MASK)); > > + > > + /* > > +* Cannot get_user_pages with a page locked for the > > +* same reason as we can't take a page fault with a > > +* page locked (as explained below). 
> > +*/ > > + down_read(&current->mm->mmap_sem); > > + status = get_user_pages(current, current->mm, > > + (unsigned long)buf & PAGE_CACHE_MASK, 1, > > + 0, 0, &src_page, NULL); > > + up_read(&current->mm->mmap_sem); > > + if (status != 1) { > > + page_cache_release(page); > > + break; > > + } > > + > > + lock_page(page); > > + if (!page->mapping) { > > Hopefully this can't happen? If it can, who went and took our page off the > mapping? Reclaim? The elevated page_count will prevent that? Truncate/invalidate? > > + unlock_page(page); > > + page_cache_release(page); > > + page_cache_release(src_page); > > + continue; > > + } > > +
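The address arithmetic in the quoted hunk (clamping `bytes` so the copy stays inside one source page, and masking `buf` down to the page that gets pinned) can be tried out in user space. This is only a sketch: the 4096-byte page size and the helper names `clamp_to_source_page` and `source_page_base` are assumptions made for illustration and do not appear in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed page size for this sketch; the kernel value is arch-dependent. */
#define PAGE_CACHE_SIZE ((uintptr_t)4096)
#define PAGE_CACHE_MASK (~(PAGE_CACHE_SIZE - 1))

/*
 * Hypothetical helper: limit a copy of `bytes` starting at user address
 * `buf` so that it does not run past the end of the page containing buf.
 * Same arithmetic as:
 *   min(bytes, PAGE_CACHE_SIZE - ((unsigned long)buf & ~PAGE_CACHE_MASK))
 */
size_t clamp_to_source_page(uintptr_t buf, size_t bytes)
{
	size_t offset = (size_t)(buf & ~PAGE_CACHE_MASK); /* offset inside the page */
	size_t room = (size_t)PAGE_CACHE_SIZE - offset;   /* bytes left in that page */

	assert(room > 0 && room <= (size_t)PAGE_CACHE_SIZE);
	return bytes < room ? bytes : room;
}

/* Hypothetical helper: page-aligned address of the page that would be pinned. */
uintptr_t source_page_base(uintptr_t buf)
{
	return buf & PAGE_CACHE_MASK;
}
```

With these assumptions, a 500-byte copy that starts 4000 bytes into a page is cut down to the 96 bytes that remain in that page, so a single pinned source page is enough to cover the copy.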
Re: [PATCH] Ban module license tag string termination trick
On Feb 2 2007 17:12, Randy Dunlap wrote: >> >> >if (MODULE_LICENSE_contains_null(license)) >> >> > printk(KERN_WARNING "this module's license is suspicious\n"); >> >> Whatever, I just want to see how you are going to implement >> MODULE_LICENSE_contains_null. > >I was busy on other things this morning (my time). >Now I have looked and I see what you mean. ;) > >I think it's possible, but it requires digging/learning about >Elf headers. That's what I did... Jan -- ft: http://freshmeat.net/p/chaostables/
Re: [patch 1/9] fs: libfs buffered write leak fix
On Fri, Feb 02, 2007 at 03:52:36PM -0800, Andrew Morton wrote: > On Mon, 29 Jan 2007 11:31:46 +0100 (CET) > Nick Piggin <[EMAIL PROTECTED]> wrote: > > > simple_prepare_write and nobh_prepare_write leak uninitialised kernel data. > > They do? Under what situation? Yes, I have at least reproduced the libfs leak. The situation is when you write into a !uptodate page, the prepare_write function runs SetPageUptodate *before* we have copied data in. Thus you can read uninitialised data out of there. SetPageUptodate must not be used (or at least used carefully) in prepare_write. commit_write is the correct place to do this. > > Fix the former, > > How? If doing a partial-write, simply clear the whole page and set it uptodate (don't need to get too tricky). If doing a full-write, only set it uptodate in the commit_write. > > make a note of the latter. Several other filesystems seem > > to be iffy here, too. > > Please, tell us what the bug is so that others have a chance of reviewing > and, if needed, fixing those other filesystems. > > > --- linux-2.6.orig/fs/libfs.c > > +++ linux-2.6/fs/libfs.c > > @@ -327,32 +327,35 @@ int simple_readpage(struct file *file, s > > int simple_prepare_write(struct file *file, struct page *page, > > unsigned from, unsigned to) > > { > > - if (!PageUptodate(page)) { > > - if (to - from != PAGE_CACHE_SIZE) { > > - void *kaddr = kmap_atomic(page, KM_USER0); > > - memset(kaddr, 0, from); > > - memset(kaddr + to, 0, PAGE_CACHE_SIZE - to); > > - flush_dcache_page(page); > > - kunmap_atomic(kaddr, KM_USER0); > > - } > > + if (PageUptodate(page)) > > + return 0; > > + > > + if (to - from != PAGE_CACHE_SIZE) { > > + clear_highpage(page); > > + flush_dcache_page(page); > > SetPageUptodate(page); > > } > > memclear_highpage_flush() is fashionable. Good one. 
> > === > > --- linux-2.6.orig/fs/buffer.c > > +++ linux-2.6/fs/buffer.c > > @@ -2344,6 +2344,8 @@ int nobh_prepare_write(struct page *page > > > > if (is_mapped_to_disk) > > SetPageMappedToDisk(page); > > + > > + /* XXX: information leak vs read(2) */ > > SetPageUptodate(page); > > > > /* > > That comment is too terse to be useful. OK, similar problem here - we have brought all the buffers uptodate that we are *not* going to write over, or partially write over, but we can have an uninitialised hole over the region we want to write. I think just setting page uptodate in commit_write might do the trick? (and getting rid of the set_page_dirty there).
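The libfs rule described in this message (a partial write clears the whole page and marks it uptodate in prepare_write; a full-page write is only marked uptodate in commit_write) can be modelled in user space. The following is a toy sketch, not kernel code: `struct model_page`, the `model_*` functions and the 4096-byte page size are all invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define PAGE_SIZE_MODEL 4096u /* assumed page size for the model */

/* Toy stand-in for a page cache page; not the kernel's struct page. */
struct model_page {
	unsigned char data[PAGE_SIZE_MODEL];
	bool uptodate;
};

/* Follows the shape of the patched simple_prepare_write quoted above. */
void model_prepare_write(struct model_page *page, unsigned from, unsigned to)
{
	assert(from <= to && to <= PAGE_SIZE_MODEL);

	if (page->uptodate)
		return;

	if (to - from != PAGE_SIZE_MODEL) {
		/* Partial write: zero the whole page so no stale bytes are exposed. */
		memset(page->data, 0, PAGE_SIZE_MODEL);
		page->uptodate = true;
	}
	/* Full-page write: stay !uptodate until the data has been copied in. */
}

/* Called after the data copy; only now may a full-page write become visible. */
void model_commit_write(struct model_page *page)
{
	page->uptodate = true;
}

/* A reader in this model may only see page contents once the page is uptodate. */
bool model_read_byte(const struct model_page *page, unsigned off,
		     unsigned char *out)
{
	if (!page->uptodate || off >= PAGE_SIZE_MODEL)
		return false;
	*out = page->data[off];
	return true;
}
```

In this model a reader can never observe the stale contents of a fresh page: either the page was zeroed before being marked uptodate, or it is not readable until the commit step has run.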
Re: [BUG] Unable to handle kernel NULL pointer dereference...as_move_to_dispatch+0x11/0x135
On Fri, 2 Feb 2007 16:25:41 -0800 Andrew Morton wrote: > On Fri, 2 Feb 2007 12:56:30 -0800 > Andrew Vasquez <[EMAIL PROTECTED]> wrote: > > > > > dt of=/dev/raw/raw1 procs=8 oncerr=abort bs=16k disable=stats > > > > limit=2m passes=100 pattern=iot dlimit=2048 > > What is this mysterious dt command, btw? I expect that it's the one here: http://www.scsifaq.org/RMiller_Tools/index.html --- ~Randy
Re: [patch 0/9] buffered write deadlock fix
On Fri, Feb 02, 2007 at 03:52:32PM -0800, Andrew Morton wrote: > On Mon, 29 Jan 2007 11:31:37 +0100 (CET) > Nick Piggin <[EMAIL PROTECTED]> wrote: > > > The following set of patches attempt to fix the buffered write > > locking problems (and there are a couple of peripheral patches > > and cleanups there too). > > > > Patches against 2.6.20-rc6. I was hoping that 2.6.20-rc6-mm2 would > > be an easier diff with the fsaio patches gone, but the readahead > > rewrite clashes badly :( > > Well fsaio is restored, but there's now considerable doubt over it due to > the recent febril febrility. > > How bad is the clash with the readahead patches? I don't think it would be so bad that one couldn't merge readahead back on top quite easily... The fsaio ones are a little harder because they change generic_file_buffered_write. > Clashes with git-block are likely, too. > > Bugfixes come first, so I will drop readahead and fsaio and git-block to get > this work completed if needed - please work against mainline. OK.
Re: [PATCH 2.6.19.2] SCSI sd: udev accessing an uninitialized scsi_disk results in a crash
On Fri, 2 Feb 2007 17:34:56 +0530 Nagendra Singh Tomar <[EMAIL PROTECTED]> wrote: > Hi, > sd_probe() calls class_device_add() even before initializing the > sdkp->device variable. class_device_add() eventually results in the user mode > udev program to be called. udev program can read the allow_restart > attribute of the newly created scsi device. This is resulting in a crash as > the show function for allow_restart (i.e sd_show_allow_restart) returns the > attribute value by reading the sdkp->device->allow_restart variable. As the > sdkp->device is not initialized before calling the user mode hotplug helper, > this results in a crash. > The patch below solves it by calling class_device_add() only after the > necessary fields in the scsi_disk structure are initialized properly. > > > > --- linux-2.6.19.2/drivers/scsi/sd.c.orig 2007-02-02 17:03:03.0 > +0530 > +++ linux-2.6.19.2/drivers/scsi/sd.c 2007-02-02 17:04:04.0 +0530 > @@ -1646,16 +1646,6 @@ static int sd_probe(struct device *dev) > if (error) > goto out_put; > > - class_device_initialize(&sdkp->cdev); > - sdkp->cdev.dev = &sdp->sdev_gendev; > - sdkp->cdev.class = &sd_disk_class; > - strncpy(sdkp->cdev.class_id, sdp->sdev_gendev.bus_id, BUS_ID_SIZE); > - > - if (class_device_add(&sdkp->cdev)) > - goto out_put; > - > - get_device(&sdp->sdev_gendev); > - > sdkp->device = sdp; > sdkp->driver = &sd_template; > sdkp->disk = gd; > @@ -1669,6 +1659,16 @@ static int sd_probe(struct device *dev) > sdp->timeout = SD_MOD_TIMEOUT; > } > > + class_device_initialize(&sdkp->cdev); > + sdkp->cdev.dev = &sdp->sdev_gendev; > + sdkp->cdev.class = &sd_disk_class; > + strncpy(sdkp->cdev.class_id, sdp->sdev_gendev.bus_id, BUS_ID_SIZE); > + > + if (class_device_add(&sdkp->cdev)) > + goto out_put; > + > + get_device(&sdp->sdev_gendev); > + > gd->major = sd_major((index & 0xf0) >> 4); > gd->first_minor = ((index & 0xf) << 4) | (index & 0xfff00); > gd->minors = 16; Thanks - I'll queue this up for 2.6.20 also.
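The ordering problem in this report is an instance of "initialize before you publish". A toy user-space model follows; every `toy_*` name is invented for illustration. It also assumes the hotplug reader runs synchronously at publish time, whereas the real udev helper runs asynchronously, and it returns -1 where the real show function would dereference a NULL pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for scsi_device and scsi_disk; not the kernel structures. */
struct toy_device {
	int allow_restart;
};

struct toy_disk {
	struct toy_device *device; /* plays the role of sdkp->device */
	int published;
};

/* Attribute reader, in the spirit of sd_show_allow_restart. */
int toy_show_allow_restart(const struct toy_disk *disk)
{
	if (disk->device == NULL)
		return -1; /* the real code would crash here instead */
	return disk->device->allow_restart;
}

typedef int (*toy_hotplug_fn)(const struct toy_disk *);

/* Publishing makes the disk visible and lets the "hotplug helper" read it. */
int toy_publish(struct toy_disk *disk, toy_hotplug_fn hotplug)
{
	assert(disk != NULL && hotplug != NULL);
	disk->published = 1;
	return hotplug(disk);
}

/* Old ordering: the disk is published before its fields are set. */
int toy_probe_buggy(struct toy_disk *disk, struct toy_device *dev)
{
	int seen = toy_publish(disk, toy_show_allow_restart);

	disk->device = dev; /* too late, the reader has already run */
	return seen;
}

/* Patched ordering: fill in the fields first, then publish. */
int toy_probe_fixed(struct toy_disk *disk, struct toy_device *dev)
{
	disk->device = dev;
	return toy_publish(disk, toy_show_allow_restart);
}
```

Under these assumptions the buggy ordering lets the reader observe a disk with no device attached, while the fixed ordering always hands it a fully initialized structure.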
Re: [patch 4/9] Remove the TSC synchronization on SMP machines
[EMAIL PROTECTED] wrote: TSC is either synchronized by design or not reliable to be used for anything, let alone timekeeping. This refers to eliminating the offset between multiple synchronized TSCs. -hpa