date:20070202

Re: [PATCH 2/2] x86_64 irq: Handle irqs pending in IRR during irq migration.

2007-02-02 Thread Eric W. Biederman

Arjan van de Ven <[EMAIL PROTECTED]> writes:

>> > Once the migration operation is complete we know we will receive
>> > no more interrupts on this vector so the irq pending state for
>> > this irq will no longer be updated.  If the irq is not pending and
>> > we are in the intermediate state we immediately free the vector,
>> > otherwise we free the vector in do_IRQ when the pending irq
>> > arrives.
>> 
>> So is this a for-2.6.20 thing?  The bug was present in 2.6.19, so
>> I assume it doesn't affect many people?
>
> I got a few reports of this; irqbalance may trigger this kernel bug it
> seems... I would suggest to consider this for 2.6.20 since it's a
> hard-hang case


Yes.  The bug I fixed will not happen if you don't migrate irqs.

At the very least we want the patch below (already in -mm)
that makes it not a hard hang case.

Subject: [PATCH] x86_64:  Survive having no irq mapping for a vector

Occasionally the kernel has bugs that result in no irq being
found for a given cpu vector.  If we acknowledge the irq
the system has a good chance of continuing even though we dropped
and missed an irq message.  If we continue to simply print a
message, drop the irq, and not acknowledge it, the system is
likely to become non-responsive shortly thereafter.

Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]>
---
 arch/x86_64/kernel/irq.c |   11 ++++++++---
 1 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/arch/x86_64/kernel/irq.c b/arch/x86_64/kernel/irq.c
index 0c06af6..648055a 100644
--- a/arch/x86_64/kernel/irq.c
+++ b/arch/x86_64/kernel/irq.c
@@ -120,9 +120,14 @@ asmlinkage unsigned int do_IRQ(struct pt_regs *regs)
 
 	if (likely(irq < NR_IRQS))
 		generic_handle_irq(irq);
-	else if (printk_ratelimit())
-		printk(KERN_EMERG "%s: %d.%d No irq handler for vector\n",
-			__func__, smp_processor_id(), vector);
+	else {
+		if (!disable_apic)
+			ack_APIC_irq();
+
+		if (printk_ratelimit())
+			printk(KERN_EMERG "%s: %d.%d No irq handler for vector\n",
+				__func__, smp_processor_id(), vector);
+	}
 
 	irq_exit();
 
-- 
1.4.4.1.g278f
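
For anyone who wants to follow the control flow without reading the diff, the patched fallback path can be modeled in user space roughly as below. This is only a sketch under stated assumptions: `vector_to_irq`, `model_init`, `model_do_irq`, the two counters and the table sizes are hypothetical stand-ins, not kernel interfaces, and the real ack_APIC_irq() signals end-of-interrupt to the local APIC rather than bumping a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define NR_VECTORS 256
#define NR_IRQS    224          /* hypothetical limit for this model */

/* Hypothetical per-cpu table: vector -> irq, or -1 if unmapped. */
static int vector_to_irq[NR_VECTORS];

static unsigned int handled;    /* irqs passed on to a handler */
static unsigned int acked;      /* stray vectors that were acknowledged */
static bool disable_apic;       /* mirrors the flag tested in the patch */

static void model_init(void)
{
	for (int v = 0; v < NR_VECTORS; v++)
		vector_to_irq[v] = -1;
	handled = 0;
	acked = 0;
	disable_apic = false;
}

/* Returns true if a handler ran, false if the vector had no irq mapping. */
static bool model_do_irq(unsigned int vector)
{
	int irq;

	assert(vector < NR_VECTORS);
	irq = vector_to_irq[vector];

	if (irq >= 0 && irq < NR_IRQS) {
		handled++;      /* stands in for generic_handle_irq(irq) */
		return true;
	}

	/* The point of the patch: acknowledge the stray vector so the
	 * interrupt controller is not left waiting for an end-of-interrupt. */
	if (!disable_apic)
		acked++;        /* stands in for ack_APIC_irq() */

	fprintf(stderr, "model: no irq handler for vector %u\n", vector);
	return false;
}
```

In this model a mapped vector only increments `handled`, while an unmapped one increments `acked` unless `disable_apic` is set, which matches the two branches of the patch.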

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.20-rc7: known regressions (v2) (part 1)

2007-02-02 Thread Eric W. Biederman

Auke Kok <[EMAIL PROTECTED]> writes:

> Adrian Bunk wrote:
>> This email lists some known regressions in 2.6.20-rc7 compared to 2.6.19
>> that are not yet fixed in Linus' tree.
>>
>> If you find your name in the Cc header, you are either submitter of one
>> of the bugs, maintainer of an affected subsystem or driver, a patch
>> of you caused a breakage or I'm considering you in any other way possibly
>> involved with one or more of these issues.
>
>
>> Subject: e1000: 82571EB/82572EI PCI-E cards: link is always down
>>  (MSI related)
>> References : http://lkml.org/lkml/2007/1/16/27
>>  http://lkml.org/lkml/2007/1/17/182
>> Submitter  : Allen Parker <[EMAIL PROTECTED]>
>>  Adam Kropelin <[EMAIL PROTECTED]>
>> Handled-By : Auke Kok <[EMAIL PROTECTED]>
>> Status : problem is being debugged
>
> I probably can't fix this bug. Not only do I doubt that the e1000 driver is at
> fault here, I don't have a system with this particular chipset. Most likely the
> regression comes from a combination of MSI layer rewrites and possibly platform
> issues. We've seen many reports that are similar and all are on the platform
> type mentioned here. I really don't want to point fingers here either.
>
> None of the MSI code in e1000 has changed significantly either. As far as I can
> see, the MSI code in e1000 has not changed since 2.6.18. Nonetheless there's no
> way I can debug any of this without a system.
>
> I will address the fact that we are lacking any of these systems to test on, but
> that is not going to get this issue handled (not to mention soon) in the way it
> needs to be.
>
> I strongly encourage the people on the linux-pci list to help out, I'll trace
> the e1000 driver for suspicious activity (again), but I run countless tests on
> the latest trees and nothing has shown up recently, other than Eric Biederman's
> msi irq reclaim leak fix.
>
> Perhaps Adam can git-bisect this issue? Adam?

Do we have any explanation about the weird /proc/interrupts output?
i.e. Multiple MSI irqs being assigned to the same card?

Does /sbin/ifconfig ethN down ; /sbin/ifconfig ethN up have anything to do
with the duplication in /proc/interrupts?

I can't see any way for a pci device that doesn't support msi-x to be assigned
multiple interrupts simultaneously.

I just skimmed through the code and there hasn't been any significant
generic MSI work since 2.6.19.

Did this device really work with MSI enabled in 2.6.19?

Eric

Re: [Cbe-oss-dev] [RFC, PATCH 4/4] Add support to OProfile for profiling Cell BE SPUs -- update

2007-02-02 Thread Arnd Bergmann

On Friday 02 February 2007 17:47, Maynard Johnson wrote:
> 
> > We also want to be able to profile the context switch code itself, which
> > means that we also need one event buffer associated with the kernel to
> > collect events for a zero context_id.
> The hardware design precludes tracing both SPU and PPU simultaneously.
> 
I mean the SPU-side part of the context switch code, which you can find
in arch/powerpc/platforms/cell/spufs/spu_{save,restore}*.

This code is the one that runs when context_id == 0 is passed to the
callback.

Arnd <><

Re: [PATCH 2/2] x86_64 irq: Handle irqs pending in IRR during irq migration.

2007-02-02 Thread Arjan van de Ven


> > Once the migration operation is complete we know we will receive
> > no more interrupts on this vector so the irq pending state for
> > this irq will no longer be updated.  If the irq is not pending and
> > we are in the intermediate state we immediately free the vector,
> > otherwise we free the vector in do_IRQ when the pending irq
> > arrives.
> 
> So is this a for-2.6.20 thing?  The bug was present in 2.6.19, so
> I assume it doesn't affect many people?

I got a few reports of this; irqbalance may trigger this kernel bug it
seems... I would suggest to consider this for 2.6.20 since it's a
hard-hang case


Re: 2.6.20-rc6-mm3

2007-02-02 Thread Cedric Le Goater

Cedric Le Goater wrote:
> Starikovskiy, Alexey Y wrote:
>>> so it probably means that drivers/acpi/tables/tbxfroot.c is
>>> obsolete ?
>> Yes.
>> Could you please try it?
>>> sure, I'll cancel the current boot test in which I was using
>>> acpi_find_root_pointer() in tbxfroot.c and restart one with your
>>> new patch. I should have the result today.
>> How long does it take to boot this thing?
> 
> well, not that long, but i don't have access directly to this 
> machine, only through a test batch manager ... 

dmesg looks fine. However, there is a :

ACPI Warning (tbfadt-0415): Optional field "Gpe1Block" has zero address or length: /4 [20070126]

but I don't know how to interpret this. Any ideas?

thanks,


C.


Linux version 2.6.20-rc6-mm3-lxc2-autokern1 ([EMAIL PROTECTED]) (gcc version 4.0.3 (Ubuntu 4.0.3-1ubuntu5)) #1 SMP Fri Feb 2 20:38:46 UTC 2007
BIOS-provided physical RAM map:
sanitize start
sanitize end
copy_e820_map() start:  size: 0009dc00 end: 0009dc00 type: 1
copy_e820_map() type is E820_RAM
copy_e820_map() start: 0009dc00 size: 2400 end: 000a type: 2
copy_e820_map() start: 000e size: 0002 end: 0010 type: 2
copy_e820_map() start: 0010 size: dfea25c0 end: dffa25c0 type: 1
copy_e820_map() type is E820_RAM
copy_e820_map() start: dffa25c0 size: 9c80 end: dffac240 type: 3
copy_e820_map() start: dffac240 size: 00053dc0 end: e000 type: 2
copy_e820_map() start: fec0 size: 0140 end: 0001 type: 2
copy_e820_map() start: 0001 size: 00012000 end: 00022000 type: 1
copy_e820_map() type is E820_RAM
 BIOS-e820:  - 0009dc00 (usable)
 BIOS-e820: 0009dc00 - 000a (reserved)
 BIOS-e820: 000e - 0010 (reserved)
 BIOS-e820: 0010 - dffa25c0 (usable)
 BIOS-e820: dffa25c0 - dffac240 (ACPI data)
 BIOS-e820: dffac240 - e000 (reserved)
 BIOS-e820: fec0 - 0001 (reserved)
 BIOS-e820: 0001 - 00022000 (usable)
Node: 0, start_pfn: 0, end_pfn: 157
Node: 0, start_pfn: 256, end_pfn: 917410
Node: 0, start_pfn: 1048576, end_pfn: 2228224
get_memcfg_from_srat: assigning address to rsdp
RSD PTR  v0 [IBM   ]
Begin SRAT table scan
CPU 0x00 in proximity domain 0x00
CPU 0x02 in proximity domain 0x00
CPU 0x10 in proximity domain 0x00
CPU 0x12 in proximity domain 0x00
CPU 0x01 in proximity domain 0x00
CPU 0x03 in proximity domain 0x00
CPU 0x11 in proximity domain 0x00
CPU 0x13 in proximity domain 0x00
Memory range 0x0 to 0xE (type 0x1) in proximity domain 0x00 enabled
Memory range 0x10 to 0x22 (type 0x1) in proximity domain 0x00 enabled
pxm bitmap: 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Number of logical nodes in system = 1
Number of memory chunks in system = 2
chunk 0 nid 0 start_pfn  end_pfn 000e
chunk 1 nid 0 start_pfn 0010 end_pfn 0022
Node: 0, start_pfn: 0, end_pfn: 2228224
Reserving 17920 pages of KVA for lmem_map of node 0
Shrinking node 0 from 2228224 pages to 2210304 pages
Reserving total of 17920 pages for numa KVA remap
kva_start_pfn ~ 211456 find_max_low_pfn() ~ 229376
max_pfn = 2228224
7808MB HIGHMEM available.
896MB LOWMEM available.
min_low_pfn = 1156, max_low_pfn = 229376, highstart_pfn = 229376
Low memory ends at vaddr f800
node 0 will remap to vaddr f3a0 - fc60
High memory starts at vaddr f800
found SMP MP-table at 0009dd40
Zone PFN ranges:
  DMA 0 -> 4096
  Normal   4096 ->   229376
  HighMem229376 ->  2228224
early_node_map[2] active PFN ranges
0:0 ->   917504
0:  1048576 ->  2210304
DMI 2.3 present.
Using APIC driver default
IBM eserver xSeries 440 detected: force use of acpi=ht
ACPI: RSDP @ 0x000fde20/0x0014 (v000 IBM   )
ACPI: RSDT @ 0xdffac1c0/0x0034 (v001 IBMSERVIGIL 0x1000 IBM  0x45444F43)
ACPI: FACP @ 0xdffac140/0x0074 (v001 IBMSERVIGIL 0x1000 IBM  0x45444F43)
ACPI Warning (tbfadt-0415): Optional field "Gpe1Block" has zero address or length: /4 [20070126]
ACPI: DSDT @ 0xdffa25c0/0x4436 (v001 IBMSERVIGIL 0x1000 INTL 0x02002025)
ACPI: FACS @ 0xdffabf00/0x0040
ACPI: APIC @ 0xdffac040/0x00D2 (v001 IBMSERVIGIL 0x1000 IBM  0x45444F43)
ACPI: SRAT @ 0xdffabf40/0x0100 (v001 IBMSERVIGIL 0x1000 IBM  0x45444F43)
ACPI: SSDT @ 0xdffa6a00/0x5467 (v001 IBMVIGSSDT0 0x1000 INTL 0x02002025)
ACPI: PM-Timer IO Port: 0x508
Switched to APIC driver `summit'.
ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
Processor #0 15:1 APIC version 20
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x02] enabled)
Processor #2 15:1 APIC version 20
ACPI: LAPIC (acpi_id[0x04] lapic_id[0x10] enabled)
Processor #16

Re: [PATCH 2 of 4] Introduce i386 fibril scheduling

2007-02-02 Thread Suparna Bhattacharya

On Fri, Feb 02, 2007 at 04:56:22PM -0800, Linus Torvalds wrote:
> 
> On Sat, 3 Feb 2007, Ingo Molnar wrote:
> > 
> > Well, in my picture, 'only if you block' is a pure thread utilization 
> > decision: bounce a piece of work to another thread if this thread cannot 
> > complete it. (if the kernel is lucky enough that the user context told 
> > it "it's fine to do that".)
> 
> Sure, you can do it that way too. But at that point, your argument that we 
> shouldn't do it with fibrils is wrong: you'd still need basically the 
> exact same setup that Zach does in his fibril stuff, and the exact same 
> hook in the scheduler, testing the exact same value ("do we have a pending 
> queue of work").
> 
> So at that point, you really are arguing about a rather small detail in 
> the implementation, I think.
> 
> Which is fair enough. 
> 
> But I actually think the *bigger* argument and problems are elsewhere, 
> namely in the interface details. Notably, I think the *real* issues end up 
> how we handle synchronization, and how we handle signalling. Those are in 
> many ways (I think) more important than whether we actually can schedule 
> these trivial things on multiple CPU's concurrently or not.
> 
> For example, I think serialization is potentially a much more expensive 
> issue. Could we, for example, allow users to serialize with these things 
> *without* having to go through the expense of doing a system call? Again, 
> I'm thinking of the case of no IO happening, in which case there also 
> won't be any actual threading taking place, in which case it's a total 
> waste of time to do a system call at all.
> 
> And trying to do that actually has implications for the interfaces (like 
> possibly returning a zero cookie for the async() system call if it was 
> doable totally synchronously?)

This would be useful - the application wouldn't have to set up state
to remember for handling completions of operations that complete synchronously.
I know the Samba folks would like that.

The laio_syscall implementation (Lazy asynchronous IO) seems to have
experimented with such an interface
http://www.usenix.org/events/usenix04/tech/general/elmeleegy.html
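
To make the "zero cookie" idea concrete, here is a minimal user-space sketch of how an application might consume such an interface. Everything in it is an assumption made for illustration: `async_submit`, `async_wait`, `struct async_op` and the `would_block` flag are hypothetical names, not taken from the fibril patches or from laio, and the decision whether an operation blocks is simply faked by the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef unsigned long cookie_t;

struct async_op {
	bool would_block;   /* model input: does the operation have to wait? */
	long result;        /* value the operation eventually produces */
};

#define MAX_PENDING 8

/* Completion state exists only for operations that really went async. */
static struct {
	bool used;
	cookie_t cookie;
	long result;
} pending[MAX_PENDING];

static cookie_t next_cookie = 1;    /* 0 is reserved for "done already" */

/* Returns 0 on success, -1 if no completion slot is free.
 * On success, *cookie == 0 means the operation completed synchronously
 * and *result is already valid; no completion state was set up. */
static int async_submit(const struct async_op *op, cookie_t *cookie,
			long *result)
{
	assert(op && cookie && result);

	if (!op->would_block) {
		*result = op->result;
		*cookie = 0;
		return 0;
	}
	for (size_t i = 0; i < MAX_PENDING; i++) {
		if (!pending[i].used) {
			pending[i].used = true;
			pending[i].cookie = next_cookie++;
			pending[i].result = op->result;
			*cookie = pending[i].cookie;
			return 0;
		}
	}
	return -1;
}

/* Collects the result for a nonzero cookie; -1 if the cookie is unknown. */
static int async_wait(cookie_t cookie, long *result)
{
	for (size_t i = 0; i < MAX_PENDING; i++) {
		if (pending[i].used && pending[i].cookie == cookie) {
			*result = pending[i].result;
			pending[i].used = false;
			return 0;
		}
	}
	return -1;
}
```

The caller only needs to remember anything when the returned cookie is nonzero, which is the saving described above for operations that complete synchronously.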

Regards
Suparna

> 
> Signal handling is similar: I actually think that a "async()" system call 
> should be interruptible within the context of the caller, since we would 
> want to *try* to execute it synchronously. That automatically means that 
> we have semantic meaning for fibrils and signal handling.
> 
> Finally, can we actually get POSIX aio semantics with this? Can we 
> implement the current aio_xyzzy() system calls using this same feature? 
> And most importantly - does it perform well enough that we really can do 
> that?
> 
> THOSE are to me bigger questions than what happens inside the kernel, and 
> whether we actually end up using another thread if we end up doing it 
> non-synchronously.
> 
>   Linus
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-aio' in
> the body to [EMAIL PROTECTED]  For more info on Linux AIO,
> see: http://www.kvack.org/aio/
> Don't email: [EMAIL PROTECTED]

-- 
Suparna Bhattacharya ([EMAIL PROTECTED])
Linux Technology Center
IBM Software Lab, India


Re: [Ksummit-2007-discuss] Re: [Ksummit-2006-discuss] 2007 Linux Kernel Summit

2007-02-02 Thread Len Brown

On Tuesday 30 January 2007 08:30, Theodore Tso wrote:

> Well, Usenix has offered to provide logistical support for some
> mini-summits if anyone wants to take them up on it.  Using some of the
> sponsorship money from last year, we've proposed to make some hotel
> conference rooms right before OLS available if anyone wants to do a
> 10-30 person mini-summit in Ottawa.
> 
> Is there any interest?

Yes, I suspect that a day attached to OLS may make a good power-management
summit day.

-Len

Re: [patch 0/9] buffered write deadlock fix

2007-02-02 Thread Suparna Bhattacharya

On Fri, Feb 02, 2007 at 03:52:32PM -0800, Andrew Morton wrote:
> On Mon, 29 Jan 2007 11:31:37 +0100 (CET)
> Nick Piggin <[EMAIL PROTECTED]> wrote:
> 
> > The following set of patches attempt to fix the buffered write
> > locking problems (and there are a couple of peripheral patches
> > and cleanups there too).
> > 
> > Patches against 2.6.20-rc6. I was hoping that 2.6.20-rc6-mm2 would
> > be an easier diff with the fsaio patches gone, but the readahead
> > rewrite clashes badly :(
> 
> Well fsaio is restored, but there's now considerable doubt over it due to
> the recent febril febrility.

I think Ingo made a point earlier about letting the old co-exist with the
new. Fibrils + kevents have great potential for a next generation
solution but we need to give the whole story some time to play out and prove
it in practice, debate and benchmark the alternative combinations, optimize it
for various workloads etc.  It will also take more work on top before we
can get the whole POSIX AIO implementation supported on top of this. I'll be
very happy when that happens ... it is just that it is still too early to
be sure.

Since this is going to be a new interface, not the existing linux AIO
interface, I do not see any conflict between the two. Samba4 already uses
fsaio, and we now have the ability to do POSIX AIO over kernel AIO (which
depends on fsaio). The more we delay real world usage the longer we take
to learn about the application patterns that matter. And it is those
patterns that are key.

> 
> How bad is the clash with the readahead patches?
> 
> Clashes with git-block are likely, too.
> 
> Bugfixes come first, so I will drop readahead and fsaio and git-block to get
> this work completed if needed - please work against mainline.

If you need help with fixing the clashes, please let me know.

Regards
Suparna

-- 
Suparna Bhattacharya ([EMAIL PROTECTED])
Linux Technology Center
IBM Software Lab, India


[RFC] Tracking mlocked pages and moving them off the LRU

2007-02-02 Thread Christoph Lameter

This is a new variation on the earlier RFC for tracking mlocked pages.
We now mark a mlocked page with a bit in the page flags and remove
them from the LRU. Pages get moved back when no vma that references
the page has VM_LOCKED set anymore.

This means that vmscan no longer uselessly cycles over large amounts
of mlocked memory should someone attempt to mlock large amounts of
memory (which may even result in a livelock on large systems).

Synchronization is built around state changes of the PageMlocked bit.
The NR_MLOCK counter is incremented and decremented based on
state transitions of PageMlocked. So the count is accurate.
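
The accounting rule described above (count only when the flag actually changes state) can be illustrated in user space as follows. This is a sketch, not the patch's implementation: `struct page_model`, `model_mlock_add` and `model_mlock_remove` are hypothetical names, a C11 atomic stands in for the page flag bit, and a global atomic stands in for the per-zone NR_MLOCK counter.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct page_model {
	atomic_bool mlocked;        /* stands in for the PageMlocked flag bit */
};

static atomic_long nr_mlock;    /* stands in for the NR_MLOCK counter */

/* Count only on the 0 -> 1 transition, so two racing callers that both
 * try to mark the same page cannot increment the counter twice. */
static bool model_mlock_add(struct page_model *page)
{
	if (atomic_exchange(&page->mlocked, true))
		return false;           /* flag was already set: no state change */
	atomic_fetch_add(&nr_mlock, 1);
	return true;
}

/* Count only on the 1 -> 0 transition, for the same reason. */
static bool model_mlock_remove(struct page_model *page)
{
	if (!atomic_exchange(&page->mlocked, false))
		return false;           /* flag was already clear */
	atomic_fetch_sub(&nr_mlock, 1);
	/* Every decrement pairs with one earlier increment on this page. */
	assert(atomic_load(&nr_mlock) >= 0);
	return true;
}
```

Because each increment and decrement is tied to one flag transition, repeated add or remove calls on the same page leave the counter unchanged, which is what keeps the count accurate.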

There is still some unfinished business:

1. We use the 21st page flag and we only have 20 on 32 bit NUMA platforms.

2. Since mlocked pages are now off the LRU page migration will no longer
   move them.

3. Use NR_MLOCK to tune various VM behaviors so that the VM no
   longer falls over due to too many mlocked pages in certain areas.

Signed-off-by: Christoph Lameter <[EMAIL PROTECTED]>

Index: current/include/linux/mmzone.h
===================================================================
--- current.orig/include/linux/mmzone.h	2007-02-02 16:42:51.0 -0800
+++ current/include/linux/mmzone.h	2007-02-02 16:43:28.0 -0800
@@ -58,6 +58,7 @@ enum zone_stat_item {
 	NR_FILE_DIRTY,
 	NR_WRITEBACK,
 	/* Second 128 byte cacheline */
+	NR_MLOCK,		/* Mlocked pages */
 	NR_SLAB_RECLAIMABLE,
 	NR_SLAB_UNRECLAIMABLE,
 	NR_PAGETABLE,	/* used for pagetables */
Index: current/mm/memory.c
===================================================================
--- current.orig/mm/memory.c	2007-02-02 16:42:51.0 -0800
+++ current/mm/memory.c	2007-02-02 21:24:20.0 -0800
@@ -682,6 +682,8 @@ static unsigned long zap_pte_range(struc
 				file_rss--;
 			}
 			page_remove_rmap(page, vma);
+			if (PageMlocked(page) && (vma->vm_flags & VM_LOCKED))
+				mlock_remove(page, vma);
 			tlb_remove_page(tlb, page);
 			continue;
 		}
@@ -898,6 +900,21 @@ unsigned long zap_page_range(struct vm_a
 }
 
 /*
+ * Add a new anonymous page
+ */
+void anon_add(struct vm_area_struct *vma, struct page *page,
+				unsigned long address)
+{
+	inc_mm_counter(vma->vm_mm, anon_rss);
+	if (vma->vm_flags & VM_LOCKED) {
+		SetPageMlocked(page);
+		inc_zone_page_state(page, NR_MLOCK);
+	} else
+		lru_cache_add_active(page);
+	page_add_new_anon_rmap(page, vma, address);
+}
+
+/*
  * Do a quick page-table lookup for a single page.
  */
 struct page *follow_page(struct vm_area_struct *vma, unsigned long address,
@@ -949,6 +966,10 @@ struct page *follow_page(struct vm_area_
 	if (unlikely(!page))
 		goto unlock;
 
+	if ((flags & FOLL_MLOCK) &&
+		!PageMlocked(page) &&
+		(vma->vm_flags & VM_LOCKED))
+			mlock_add(page, vma);
 	if (flags & FOLL_GET)
 		get_page(page);
 	if (flags & FOLL_TOUCH) {
@@ -1045,7 +1066,7 @@ int get_user_pages(struct task_struct *t
 				continue;
 			}
 
-			foll_flags = FOLL_TOUCH;
+			foll_flags = FOLL_TOUCH | FOLL_MLOCK;
 			if (pages)
 				foll_flags |= FOLL_GET;
 			if (!write && !(vma->vm_flags & VM_LOCKED) &&
@@ -2101,9 +2122,7 @@ static int do_anonymous_page(struct mm_s
 		page_table = pte_offset_map_lock(mm, pmd, address, &ptl);
 		if (!pte_none(*page_table))
 			goto release;
-		inc_mm_counter(mm, anon_rss);
-		lru_cache_add_active(page);
-		page_add_new_anon_rmap(page, vma, address);
+		anon_add(vma, page, address);
 	} else {
 		/* Map the ZERO_PAGE - vm_page_prot is readonly */
 		page = ZERO_PAGE(address);
@@ -2247,12 +2266,13 @@ retry:
 		if (write_access)
 			entry = maybe_mkwrite(pte_mkdirty(entry), vma);
 		set_pte_at(mm, address, page_table, entry);
-		if (anon) {
-			inc_mm_counter(mm, anon_rss);
-			lru_cache_add_active(new_page);
-			page_add_new_anon_rmap(new_page, vma, address);
-		} else {
+		if (anon)
+			anon_add(vma, new_page, address);
+		else {
 			inc_mm_counter(mm, file_rss);
+			if (!PageMlocked(new_page) &&
+					(vma->vm_flags & VM_LOCKED))
+				mlock_add(new_page, vma);
 			page_add_file_rmap(new_page);

Re: 2.6.20-rc7: known regressions (v2) (part 1)

2007-02-02 Thread Auke Kok


Adrian Bunk wrote:
> This email lists some known regressions in 2.6.20-rc7 compared to 2.6.19
> that are not yet fixed in Linus' tree.
>
> If you find your name in the Cc header, you are either submitter of one
> of the bugs, maintainer of an affected subsystem or driver, a patch
> of you caused a breakage or I'm considering you in any other way possibly
> involved with one or more of these issues.

> Subject: e1000: 82571EB/82572EI PCI-E cards: link is always down
>  (MSI related)
> References : http://lkml.org/lkml/2007/1/16/27
>  http://lkml.org/lkml/2007/1/17/182
> Submitter  : Allen Parker <[EMAIL PROTECTED]>
>  Adam Kropelin <[EMAIL PROTECTED]>
> Handled-By : Auke Kok <[EMAIL PROTECTED]>
> Status : problem is being debugged


I probably can't fix this bug. Not only do I doubt that the e1000 driver is at 
fault here, I don't have a system with this particular chipset. Most likely the 
regression comes from a combination of MSI layer rewrites and possibly platform 
issues. We've seen many reports that are similar and all are on the platform 
type mentioned here. I really don't want to point fingers here either.


None of the MSI code in e1000 has changed significantly either. As far as I can
see, the MSI code in e1000 has not changed since 2.6.18. Nonetheless there's no
way I can debug any of this without a system.


I will address the fact that we are lacking any of these systems to test on, but 
that is not going to get this issue handled (not to mention soon) in the way it 
needs to be.


I strongly encourage the people on the linux-pci list to help out, I'll trace 
the e1000 driver for suspicious activity (again), but I run countless tests on 
the latest trees and nothing has shown up recently, other than Eric Biederman's 
msi irq reclaim leak fix.


Perhaps Adam can git-bisect this issue? Adam?

Cheers,

Auke

Re: Please revert "fix typo in geode_configre()@cyrix.c"

2007-02-02 Thread TAKADA Yoshihito

Hi. I'm late.

I'll resend the patch against 2.6.19.

The original code doesn't write the new value back to the CCR4 register. This
patch writes it back.
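
The difference between the old and the new sequence can be shown with a small user-space model. This is a sketch under assumptions: the register file is a plain array, `model_get`/`model_set` stand in for getCx86()/setCx86(), and the rule that CCR4 is only writable while MAPEN is enabled is a simplification made for this model. The register indices are the ones I believe the kernel header uses, but treat them as illustrative.

```c
#include <assert.h>
#include <stdint.h>

enum { CX86_CCR3 = 0xc3, CX86_CCR4 = 0xe8 };   /* illustrative indices */

static uint8_t regs[256];       /* simulated configuration registers */

static uint8_t model_get(uint8_t reg)
{
	return regs[reg];
}

static void model_set(uint8_t reg, uint8_t val)
{
	/* Assumption of this model: CCR4 accepts writes only while MAPEN
	 * (CCR3 bits 7:4 == 0001) is enabled. */
	if (reg == CX86_CCR4 && (regs[CX86_CCR3] & 0xf0) != 0x10)
		return;
	regs[reg] = val;
}

/* Old sequence: the new CCR4 value is computed but never stored. */
static void configure_old(void)
{
	uint8_t ccr3 = model_get(CX86_CCR3);
	uint8_t ccr4;

	model_set(CX86_CCR3, (ccr3 & 0x0f) | 0x10);    /* enable MAPEN */
	ccr4 = model_get(CX86_CCR4);
	ccr4 |= 0x38;                                  /* value is dropped */
	(void)ccr4;
	model_set(CX86_CCR3, ccr3);                    /* disable MAPEN */
}

/* Fixed sequence: read-modify-write CCR4 while MAPEN is enabled. */
static void configure_fixed(void)
{
	uint8_t ccr3 = model_get(CX86_CCR3);

	model_set(CX86_CCR3, (ccr3 & 0x0f) | 0x10);    /* enable MAPEN */
	model_set(CX86_CCR4, model_get(CX86_CCR4) | 0x38);
	model_set(CX86_CCR3, ccr3);                    /* disable MAPEN */
	assert(model_get(CX86_CCR3) == ccr3);          /* CCR3 is restored */
}
```

In the model, `configure_old()` leaves CCR4 untouched, while `configure_fixed()` sets the 0x38 bits and restores CCR3 to its original value.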

diff -Narup linux-2.6.19.orig/arch/i386/kernel/cpu/cyrix.c linux-2.6.19/arch/i386/kernel/cpu/cyrix.c
--- linux-2.6.19.orig/arch/i386/kernel/cpu/cyrix.c	2006-11-30 06:57:37.0 +0900
+++ linux-2.6.19/arch/i386/kernel/cpu/cyrix.c	2007-02-03 14:57:35.0 +0900
@@ -161,19 +161,19 @@ static void __cpuinit set_cx86_inc(void)
 static void __cpuinit geode_configure(void)
 {
 	unsigned long flags;
-	u8 ccr3, ccr4;
+	u8 ccr3;
 	local_irq_save(flags);
 
 	/* Suspend on halt power saving and enable #SUSP pin */
 	setCx86(CX86_CCR2, getCx86(CX86_CCR2) | 0x88);
 
 	ccr3 = getCx86(CX86_CCR3);
-	setCx86(CX86_CCR3, (ccr3 & 0x0f) | 0x10);	/* Enable */
+	setCx86(CX86_CCR3, (ccr3 & 0x0f) | 0x10);	/* enable MAPEN */
 
-	ccr4 = getCx86(CX86_CCR4);
-	ccr4 |= 0x38;		/* FPU fast, DTE cache, Mem bypass */
-	
-	setCx86(CX86_CCR3, ccr3);
+
+	/* FPU fast, DTE cache, Mem bypass */
+	setCx86(CX86_CCR4, getCx86(CX86_CCR4) | 0x38);
+	setCx86(CX86_CCR3, ccr3);			/* disable MAPEN */
 
 	set_cx86_memwb();
 	set_cx86_reorder();
@@ -415,15 +415,14 @@ static void __cpuinit cyrix_identify(str
 
 		if (dir0 == 5 || dir0 == 3)
 		{
-			unsigned char ccr3, ccr4;
+			unsigned char ccr3;
 			unsigned long flags;
 			printk(KERN_INFO "Enabling CPUID on Cyrix processor.\n");
 			local_irq_save(flags);
 			ccr3 = getCx86(CX86_CCR3);
-			setCx86(CX86_CCR3, (ccr3 & 0x0f) | 0x10); /* enable MAPEN  */
-			ccr4 = getCx86(CX86_CCR4);
-			setCx86(CX86_CCR4, ccr4 | 0x80);          /* enable cpuid  */
-			setCx86(CX86_CCR3, ccr3);                 /* disable MAPEN */
+			setCx86(CX86_CCR3, (ccr3 & 0x0f) | 0x10);	/* enable MAPEN  */
+			setCx86(CX86_CCR4, getCx86(CX86_CCR4) | 0x80);	/* enable cpuid  */
+			setCx86(CX86_CCR3, ccr3);			/* disable MAPEN */
 			local_irq_restore(flags);
 		}
 	}


On Fri, 2 Feb 2007 13:18:54 -0800
Andrew Morton <[EMAIL PROTECTED]> wrote:

> On Fri, 2 Feb 2007 10:12:36 -0500
> [EMAIL PROTECTED] (Lennart Sorensen) wrote:
> 
> > On Fri, Feb 02, 2007 at 12:05:43AM -0800, Andrew Morton wrote:
> > > On Fri, 2 Feb 2007 07:29:41 +0100 Adrian Bunk <[EMAIL PROTECTED]> wrote:
> > > 
> > > > Linus, please revert commit e4f0ae0ea63caceff37a13f281a72652b7ea71ba
> > > > 
> > > 
> > > Yup.
> > > 
> > > That discussion seems to have died.  The 2.6.19 code looks rather silly,
> > > but presumably it passed someone's testing at some stage.
> > 
> > The discussion ended because the last patch seemed to be correct to
> > everyone involved in the discussion.  At least that is my understanding.
> > Of course I am just one of the users affected by the patch.
> 
> The discussion ended with me asking for someone to send a patch.  That
> hasn't happened yet.  I don't want to have to troll through 20-30 messages
> and try to work out what patch we ended up with - that's the way in which
> mistakes occur.
> 
> Linus has now reverted e4f0ae0ea63caceff37a13f281a72652b7ea71ba.  Now,
> please, could someone send a patch against either current -git or against
> 2.6.19?  One which includes a description of what it does, and why.
> 
> Thanks.

-- 
TAKADA <[EMAIL PROTECTED]>



Re: SATA exceptions with 2.6.20-rc5

2007-02-02 Thread Robert Hancock


Björn Steinbrink wrote:
> On 2007.01.24 01:39:23 +0100, Björn Steinbrink wrote:
>> On 2007.01.23 17:18:43 -0600, Robert Hancock wrote:
>>> Larry Walton wrote:
>>>> The last patch (sata_nv-force-int-dev-in-interrupt.patch)
>>>> seems to have fixed the problem.  Much appreciated,
>>>> thank you. I'd consider it a must have in 2.6.20.
>>>
>>> Can any of the rest of you that have been seeing this problem also
>>> confirm that this fixes it?
>>
>> Seems to work for me, uptime is about an hour now and no exception yet.
>> Had the stress test running for only about 10 minutes, but I usually got
>> an exception within an hour even during plain irssi usage, so I'm quite
>> confident that the patch fixes it.
>
> Or maybe not :( Just got an exception on 2.6.20-rc6. Took 4 days of
> uptime to trigger, so it's just a lot harder to trigger now.

Same exception details as before?

There's a patch in -mm (sata_nv-use-adma-for-nodata-commands.patch) 
which should hopefully avoid this problem for the cache flush commands, 
at least - can you try that one out? You'll have to apply the other 
sata_nv patches in -mm first, i.e. this order:


http://www2.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.20-rc6/2.6.20-rc6-mm3/broken-out/sata_nv-cleanup-adma-error-handling-v2.patch
http://www2.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.20-rc6/2.6.20-rc6-mm3/broken-out/sata_nv-cleanup-adma-error-handling-v2-cleanup.patch
http://www2.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.20-rc6/2.6.20-rc6-mm3/broken-out/sata_nv-use-adma-for-nodata-commands.patch

--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/


Re: [BUG] Unable to handle kernel NULL pointer dereference...as_move_to_dispatch+0x11/0x135

2007-02-02 Thread Andrew Vasquez

On Fri, 02 Feb 2007, Randy Dunlap wrote:

> On Fri, 2 Feb 2007 16:25:41 -0800 Andrew Morton wrote:
> 
> > On Fri, 2 Feb 2007 12:56:30 -0800
> > Andrew Vasquez <[EMAIL PROTECTED]> wrote:
> > 
> > > > > dt of=/dev/raw/raw1 procs=8 oncerr=abort bs=16k disable=stats 
> > > > > limit=2m passes=100 pattern=iot dlimit=2048
> > 
> > What is this mysterious dt command, btw?
> 
> I expect that it's the one here:
> http://www.scsifaq.org/RMiller_Tools/index.html

yep, that's the one.

Re: [Fastboot] [PATCH] kexec: Fix CONFIG_SMP=n compilation (ia64)

2007-02-02 Thread Horms

On Fri, Feb 02, 2007 at 08:53:00PM +0900, Magnus Damm wrote:
> On 2/2/07, Magnus Damm <[EMAIL PROTECTED]> wrote:
> > On 2/2/07, Andrew Morton <[EMAIL PROTECTED]> wrote:
> > > Magnus Damm <[EMAIL PROTECTED]> wrote:
> > >
> > > > kexec: Fix CONFIG_SMP=n compilation (ia64)
> > > >
> > > > > This patch makes it possible to compile kexec for ia64 without SMP support.
> > > > > --- 0002/arch/ia64/kernel/machine_kexec.c
> > > > > +++ work/arch/ia64/kernel/machine_kexec.c 2007-02-01 12:35:46.0 +0900
> > > > @@ -70,12 +70,14 @@ void machine_kexec_cleanup(struct kimage
> > > >
> > > >  void machine_shutdown(void)
> > > >  {
> > > > +#ifdef CONFIG_SMP
> > > >   int cpu;
> > > >
> > > >   for_each_online_cpu(cpu) {
> > > >   if (cpu != smp_processor_id())
> > > >   cpu_down(cpu);
> > > >   }
> > > > +#endif
> > > >   kexec_disable_iosapic();
> > > >  }
> > >
> > > hm.  I suspect this one should have been #ifndef CONFIG_HOTPLUG_CPU?
> 
> Re-reading this I assume you mean #ifdef CONFIG_HOTPLUG_CPU.
> 
> I would be happy to resend a new updated version of the patch, but I
> wonder if it may be better to fail miserably during the build than
> fail silently in the case of CONFIG_SMP=y but CONFIG_HOTPLUG_CPU=n.

There used to be alternate code for the CONFIG_SMP +
!CONFIG_HOTPLUG_CPU case, but it was removed because it was determined
to be flaky and not maintainable (I can dig up the threads if you want).
I think that this means that if we have CONFIG_KEXEC and CONFIG_SMP then
CONFIG_HOTPLUG_CPU is required. I think this is expressible in Kconfig
somehow.

-- 
Horms
  H: http://www.vergenet.net/~horms/
  W: http://www.valinux.co.jp/en/


[PATCH] Fix d_path for lazy unmounts

2007-02-02 Thread Andreas Gruenbacher

Hello,

here is a bugfix to d_path. Please apply (after 2.6.20).

First, when d_path() hits a lazily unmounted mount point, it tries to
prepend the name of the lazily unmounted dentry to the path name. It
gets this wrong, and also overwrites the slash that separates the name
from the following pathname component. This is demonstrated by the
attached test case, which prints "getcwd returned d_path-bugsubdir"
with the bug. The correct result would be "getcwd returned
d_path-bug/subdir".

It could be argued that the name of the root dentry should not be part
of the result of d_path in the first place. On the other hand, the name
under which the unconnected namespace was once reachable may provide
some useful hints to users, so including it seems okay.

Second, it isn't always possible to tell from the __d_path result whether
the specified root and rootmnt (i.e., the chroot) was reached: lazy
unmounts of bind mounts will produce a path that does start with a
non-slash so we can tell from that, but other lazy unmounts will produce
a path that starts with a slash, just like "ordinary" paths.

The attached patch cleans up __d_path() to fix the bug with overlapping
pathname components. It also adds a @fail_deleted argument, which makes
it possible to get rid of some of the mess in sys_getcwd(). Grabbing the
dcache_lock can then also be moved into __d_path(). The patch also makes
sure that paths will only start with a slash if they are connected to the
root and rootmnt.

The @fail_deleted argument could be added to d_path() as well: this would
allow callers to recognize deleted files, without having to resort to the
ambiguous check for the " (deleted)" string at the end of the pathnames.
This is not currently done, but it might be worthwhile.
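
To make the first problem concrete, here is a minimal user-space sketch of
building a path backwards from the end of a buffer, which is the general
approach __d_path() takes. It is not the kernel code: prepend() and
build_path() are names made up for this illustration, and the components are
passed in as a plain array (leaf first) instead of being found by walking
d_parent. The point is only that every component gets its own '/' separator,
so the name of an unconnected root can no longer overwrite one.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not the kernel's): copy @len bytes of @str in front
 * of *@buffer and move *@buffer back. Returns 0 on success, -1 if the
 * remaining space is too small.
 */
static int prepend(char **buffer, int *buflen, const char *str, int len)
{
	if (*buflen < len)
		return -1;
	*buflen -= len;
	*buffer -= len;
	memcpy(*buffer, str, len);
	return 0;
}

/*
 * Build a path at the end of @buf from @n components given leaf-first,
 * followed by the name of an unconnected root (no leading slash).
 * Returns a pointer into @buf, or NULL if the buffer is too small.
 */
static char *build_path(char *buf, int buflen, const char *const *names,
			int n, const char *root_name)
{
	char *p;
	int i;

	if (buflen < 1)
		return NULL;
	p = buf + buflen - 1;
	*p = '\0';
	buflen--;

	for (i = 0; i < n; i++) {
		/* component first, then the slash that separates it */
		if (prepend(&p, &buflen, names[i], (int)strlen(names[i])) ||
		    prepend(&p, &buflen, "/", 1))
			return NULL;
	}
	if (prepend(&p, &buflen, root_name, (int)strlen(root_name)))
		return NULL;
	return p;
}
```

Calling build_path() with the single component "subdir" and the root name
"d_path-bug" gives "d_path-bug/subdir", the result the test case expects.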

Signed-off-by: Andreas Gruenbacher <[EMAIL PROTECTED]>

Index: linux-2.6/fs/dcache.c
===
--- linux-2.6.orig/fs/dcache.c
+++ linux-2.6/fs/dcache.c
@@ -1739,45 +1739,43 @@ shouldnt_be_hashed:
  * @rootmnt: vfsmnt to which the root dentry belongs
  * @buffer: buffer to return value in
  * @buflen: buffer length
+ * @fail_deleted: what to return for deleted files
  *
- * Convert a dentry into an ASCII path name. If the entry has been deleted
- * the string " (deleted)" is appended. Note that this is ambiguous.
+ * Convert a dentry into an ASCII path name. If the entry has been deleted,
+ * then if @fail_deleted is true, ERR_PTR(-ENOENT) is returned. Otherwise,
+ * the string " (deleted)" is appended. Note that this is ambiguous.
  *
- * Returns the buffer or an error code if the path was too long.
- *
- * "buflen" should be positive. Caller holds the dcache_lock.
+ * Returns the buffer or an error code.
  */
-static char * __d_path( struct dentry *dentry, struct vfsmount *vfsmnt,
-   struct dentry *root, struct vfsmount *rootmnt,
-   char *buffer, int buflen)
+static char *__d_path(struct dentry *dentry, struct vfsmount *vfsmnt,
+ struct dentry *root, struct vfsmount *rootmnt,
+ char *buffer, int buflen, int fail_deleted)
 {
-   char * end = buffer+buflen;
-   char * retval;
+   char *end = buffer + buflen - 1;
int namelen;
 
-   *--end = '\0';
+   buffer = end;
+   if (buflen < 2)
+   return ERR_PTR(-ENAMETOOLONG);
+   *end = '\0';
buflen--;
+
+   spin_lock(&dcache_lock);
if (!IS_ROOT(dentry) && d_unhashed(dentry)) {
-   buflen -= 10;
-   end -= 10;
-   if (buflen < 0)
+   if (fail_deleted) {
+   buffer = ERR_PTR(-ENOENT);
+   goto out;
+   }
+   if (buflen < 10)
goto Elong;
-   memcpy(end, " (deleted)", 10);
+   buflen -= 10;
+   buffer -= 10;
+   memcpy(buffer, " (deleted)", 10);
}
-
-   if (buflen < 1)
-   goto Elong;
-   /* Get '/' right */
-   retval = end-1;
-   *retval = '/';
-
-   for (;;) {
+   while (dentry != root || vfsmnt != rootmnt) {
struct dentry * parent;
 
-   if (dentry == root && vfsmnt == rootmnt)
-   break;
if (dentry == vfsmnt->mnt_root || IS_ROOT(dentry)) {
-   /* Global root? */
spin_lock(&vfsmount_lock);
if (vfsmnt->mnt_parent == vfsmnt) {
spin_unlock(&vfsmount_lock);
@@ -1791,33 +1789,49 @@ static char * __d_path( struct dentry *d
parent = dentry->d_parent;
prefetch(parent);
namelen = dentry->d_name.len;
-   buflen -= namelen + 1;
-   if (buflen < 0)
+   if (buflen <= namelen)
goto Elong;
-   end -= namelen;
-   memcpy(end, dentry->d_name.name, namelen);
-

Re: [PATCH 2.6.19.2] SCSI sd: udev accessing an uninitialized scsi_disk results in a crash

2007-02-02 Thread James Bottomley

On Fri, 2007-02-02 at 17:56 -0800, Greg KH wrote:
> > Thanks - I'll queue this up for 2.6.20 also.
> 
> No objection from me, as long as James says this is ok.
> 
> I wonder why we haven't noticed this in the past?

Because the race is so small ...

I'll queue it in the rc-fixes tree .. I have three others for 2.6.20

James



Re: [PATCH 1/1] - Altix: more ACPI PRT support

2007-02-02 Thread Len Brown

On Friday 02 February 2007 20:37, Andrew Morton wrote:
> On Fri, 02 Feb 2007 14:54:12 -0600
> John Keller <[EMAIL PROTECTED]> wrote:
> 
> > The SN Altix platform does not conform to the 
> > IOSAPIC IRQ routing model. Add code in acpi_unregister_gsi()
> > to check if (acpi_irq_model == ACPI_IRQ_MODEL_PLATFORM) and
> > return.
> > 
> > Signed-off-by: John Keller <[EMAIL PROTECTED]>
> > ---
> > 
> > Due to an oversight, this code was not added previously when
> > similar code was added to acpi_register_gsi().
> > 
> > > http://marc.theaimsgroup.com/?l=linux-acpi&m=116680983430121&w=2
> > 
> >  arch/ia64/kernel/acpi.c |3 +++
> >  1 file changed, 3 insertions(+)
> > 
> > 
> > Index: linux-2.6/arch/ia64/kernel/acpi.c
> > ===
> > --- linux-2.6.orig/arch/ia64/kernel/acpi.c  2007-02-02 14:44:31.0 
> > -0600
> > +++ linux-2.6/arch/ia64/kernel/acpi.c   2007-02-02 14:47:44.658143727 
> > -0600
> > @@ -609,6 +609,9 @@ EXPORT_SYMBOL(acpi_register_gsi);
> >  
> >  void acpi_unregister_gsi(u32 gsi)
> >  {
> > +   if (acpi_irq_model == ACPI_IRQ_MODEL_PLATFORM)
> > +   return;
> > +
> > iosapic_unregister_intr(gsi);
> >  }
> 
> Given that the December 22 patch appears to be in mainline, and that this
> patch is simple, I shall cheerily bypass maintainers and send it in for
> 2.6.20.

Yep.

Acked-by: Len Brown <[EMAIL PROTECTED]>

thanks,
-Len


[patch 17/59] PCI: prevent down_read when pci_devices is empty

2007-02-02 Thread Chris Wright

-stable review patch.  If anyone has any objections, please let us know.
--

From: Ard van Breemen <[EMAIL PROTECTED]>

pci_find_subsys() gets called very early by obsolete ide setup parameters.
This is a bogus call since pci is not initialized yet, so the list is empty.
But in the meantime, interrupts get enabled by down_read().  This can result in
a kernel panic when the irq controller gets initialized.

This patch checks whether the device list is empty before taking the semaphore,
and hence will not enable irqs.  Furthermore, it prints a message saying that it
was called while pci_devices is empty, as a reminder that the ide code needs to
be fixed.

The pci_get_subsys can get called in the same manner, and as such is patched
in the same manner.
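
As an illustration of the pattern only, a user-space sketch of "bail out on
an empty list before the lock is ever touched" might look like this. Nothing
here is the PCI code: struct list_node, fake_down_read() and find_first() are
stand-ins invented for the example, and the fake lock merely counts how often
it is taken.

```c
#include <stddef.h>

/* Minimal circular list head, modelled on the kernel's struct list_head. */
struct list_node {
	struct list_node *next, *prev;
};

static int list_is_empty(const struct list_node *head)
{
	return head->next == head;
}

/* Stand-ins for down_read()/up_read(); they only count acquisitions. */
static int lock_taken;

static void fake_down_read(void)
{
	lock_taken++;
}

static void fake_up_read(void)
{
}

/*
 * Return the first entry on the list, or NULL if the list is empty.
 * In the empty case the lock is never taken, which is the whole point.
 */
static struct list_node *find_first(struct list_node *head)
{
	struct list_node *n;

	if (list_is_empty(head))
		return NULL;	/* bail out before touching the lock */

	fake_down_read();
	n = head->next;
	fake_up_read();
	return n;
}
```

With an empty list find_first() returns NULL and lock_taken stays at zero;
once an entry is linked in, the lock is taken as before.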

[EMAIL PROTECTED]: cleanups]
Signed-off-by: Ard van Breemen <[EMAIL PROTECTED]>
Cc: Greg KH <[EMAIL PROTECTED]>
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
Signed-off-by: Linus Torvalds <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
This is the other half of the fix for bug #7505

 drivers/pci/search.c |   24 
 1 file changed, 24 insertions(+)

--- linux-2.6.19.2.orig/drivers/pci/search.c
+++ linux-2.6.19.2/drivers/pci/search.c
@@ -193,6 +193,18 @@ static struct pci_dev * pci_find_subsys(
struct pci_dev *dev;
 
WARN_ON(in_interrupt());
+
+   /*
+* pci_find_subsys() can be called on the ide_setup() path, super-early
+* in boot.  But the down_read() will enable local interrupts, which
+* can cause some machines to crash.  So here we detect and flag that
+* situation and bail out early.
+*/
+   if (unlikely(list_empty(&pci_devices))) {
+   printk(KERN_INFO "pci_find_subsys() called while pci_devices "
+   "is still empty\n");
+   return NULL;
+   }
down_read(&pci_bus_sem);
n = from ? from->global_list.next : pci_devices.next;
 
@@ -259,6 +271,18 @@ pci_get_subsys(unsigned int vendor, unsi
struct pci_dev *dev;
 
WARN_ON(in_interrupt());
+
+   /*
+* pci_get_subsys() can potentially be called by drivers super-early
+* in boot.  But the down_read() will enable local interrupts, which
+* can cause some machines to crash.  So here we detect and flag that
+* situation and bail out early.
+*/
+   if (unlikely(list_empty(&pci_devices))) {
+   printk(KERN_NOTICE "pci_get_subsys() called while pci_devices "
+   "is still empty\n");
+   return NULL;
+   }
down_read(&pci_bus_sem);
n = from ? from->global_list.next : pci_devices.next;
 


[patch 09/59] NETFILTER: arp_tables: fix userspace compilation

2007-02-02 Thread Chris Wright

-stable review patch.  If anyone has any objections, please let us know.
--

From: Patrick McHardy <[EMAIL PROTECTED]>

The included patch translates arpt_counters to xt_counters, making
userspace arptables compile against recent kernels.

Signed-off-by: Bart De Schuymer <[EMAIL PROTECTED]>
Signed-off-by: Patrick McHardy <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---

 include/linux/netfilter_arp/arp_tables.h |1 +
 1 file changed, 1 insertion(+)

--- linux-2.6.19.2.orig/include/linux/netfilter_arp/arp_tables.h
+++ linux-2.6.19.2/include/linux/netfilter_arp/arp_tables.h
@@ -190,6 +190,7 @@ struct arpt_replace
 
 /* The argument to ARPT_SO_ADD_COUNTERS. */
 #define arpt_counters_info xt_counters_info
+#define arpt_counters xt_counters
 
 /* The argument to ARPT_SO_GET_ENTRIES. */
 struct arpt_get_entries


[patch 18/59] IPV6 MCAST: Fix joining all-node multicast group on device initialization.

2007-02-02 Thread Chris Wright

-stable review patch.  If anyone has any objections, please let us know.
--

From: YOSHIFUJI Hideaki <[EMAIL PROTECTED]>

Join all-node multicast group after assignment of dev->ip6_ptr
because it must be assigned when ipv6_dev_mc_inc() is called.
This fixes Bug#7817, reported by <[EMAIL PROTECTED]>.

Closes: 7817
Signed-off-by: YOSHIFUJI Hideaki <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
 net/ipv6/addrconf.c |6 ++
 net/ipv6/mcast.c|6 --
 2 files changed, 6 insertions(+), 6 deletions(-)

--- linux-2.6.19.2.orig/net/ipv6/addrconf.c
+++ linux-2.6.19.2/net/ipv6/addrconf.c
@@ -341,6 +341,7 @@ void in6_dev_finish_destroy(struct inet6
 static struct inet6_dev * ipv6_add_dev(struct net_device *dev)
 {
struct inet6_dev *ndev;
+   struct in6_addr maddr;
 
ASSERT_RTNL();
 
@@ -425,6 +426,11 @@ static struct inet6_dev * ipv6_add_dev(s
 #endif
/* protected by rtnl_lock */
rcu_assign_pointer(dev->ip6_ptr, ndev);
+
+   /* Join all-node multicast group */
+   ipv6_addr_all_nodes(&maddr);
+   ipv6_dev_mc_inc(dev, &maddr);
+
return ndev;
 }
 
--- linux-2.6.19.2.orig/net/ipv6/mcast.c
+++ linux-2.6.19.2/net/ipv6/mcast.c
@@ -2252,8 +2252,6 @@ void ipv6_mc_up(struct inet6_dev *idev)
 
 void ipv6_mc_init_dev(struct inet6_dev *idev)
 {
-   struct in6_addr maddr;
-
write_lock_bh(&idev->lock);
rwlock_init(&idev->mc_lock);
idev->mc_gq_running = 0;
@@ -2269,10 +2267,6 @@ void ipv6_mc_init_dev(struct inet6_dev *
idev->mc_maxdelay = IGMP6_UNSOLICITED_IVAL;
idev->mc_v1_seen = 0;
write_unlock_bh(&idev->lock);
-
-   /* Add all-nodes address. */
-   ipv6_addr_all_nodes(&maddr);
-   ipv6_dev_mc_inc(idev->dev, &maddr);
 }
 
 /*


Re: [stable] [patch 00/59] -stable review

2007-02-02 Thread Chris Wright

* Chris Wright ([EMAIL PROTECTED]) wrote:
> Responses should be made by Mon Feb  3 02:30 UTC 2007

Yes, that's Mon Feb 5 (thanks to those on their toes ;-)

And the roll-up will be available at:


http://www.kernel.org/pub/linux/kernel/people/chrisw/stable/patch-2.6.19.3-rc1.{gz,bz2}

once mirroring finishes.

thanks,
-chris

[patch 03/59] Check for populated zone in __drain_pages

2007-02-02 Thread Chris Wright

-stable review patch.  If anyone has any objections, please let us know.
--

From: Christoph Lameter <[EMAIL PROTECTED]>

Both process_zones() and drain_node_pages() check for populated zones
before touching pagesets.  However, __drain_pages does not do so.

This may result in a NULL pointer dereference for pagesets in unpopulated
zones if a NUMA setup is combined with cpu hotplug.

Initially the unpopulated zone has the pcp pointers pointing to the boot
pagesets.  Since the zone is not populated the boot pageset pointers will
not be changed during page allocator and slab bootstrap.

If a cpu is later brought down (first call to __drain_pages()) then the pcp
pointers for cpus in unpopulated zones are set to NULL since __drain_pages
does not first check for an unpopulated zone.

If the cpu is then brought up again then we call process_zones() which will
ignore the unpopulated zone.  So the pageset pointers will still be NULL.

If the cpu is then again brought down then __drain_pages will attempt to
drain pages by following the NULL pageset pointer for unpopulated zones.
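
The sequence above can be modelled in a few lines of user-space C. This is a
sketch under simplifying assumptions, not the page allocator: struct zone,
struct pageset and drain_zones() are reduced, made-up versions with one
pageset pointer per zone, and an unpopulated zone is represented by
present_pages == 0 with a NULL pointer (the state it ends up in after the
first cpu-down). The populated check is what keeps the loop from following
that NULL pointer.

```c
#include <stddef.h>

/* Reduced stand-ins for the real structures; field names are illustrative. */
struct pageset {
	int count;			/* pages held in the per-cpu list */
};

struct zone {
	unsigned long present_pages;	/* 0 means the zone is unpopulated */
	struct pageset *pcp;		/* per-cpu pageset for one cpu */
};

static int zone_is_populated(const struct zone *z)
{
	return z->present_pages != 0;
}

/*
 * Drain the pageset of every populated zone and return how many zones
 * were drained. Unpopulated zones are skipped, so a NULL pcp pointer in
 * such a zone is never dereferenced.
 */
static int drain_zones(struct zone *zones, int nr)
{
	int i, drained = 0;

	for (i = 0; i < nr; i++) {
		if (!zone_is_populated(&zones[i]))
			continue;
		zones[i].pcp->count = 0;
		drained++;
	}
	return drained;
}
```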

Signed-off-by: Christoph Lameter <[EMAIL PROTECTED]>
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
Signed-off-by: Linus Torvalds <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
http://git.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=f2e12bb272f2544d1504f982270e90ae3dcc4ff2

 mm/page_alloc.c |3 +++
 1 file changed, 3 insertions(+)

--- linux-2.6.19.2.orig/mm/page_alloc.c
+++ linux-2.6.19.2/mm/page_alloc.c
@@ -710,6 +710,9 @@ static void __drain_pages(unsigned int c
for_each_zone(zone) {
struct per_cpu_pageset *pset;
 
+   if (!populated_zone(zone))
+   continue;
+
pset = zone_pcp(zone, cpu);
for (i = 0; i < ARRAY_SIZE(pset->pcp); i++) {
struct per_cpu_pages *pcp;


[patch 08/59] NETFILTER: tcp conntrack: fix IP_CT_TCP_FLAG_CLOSE_INIT value

2007-02-02 Thread Chris Wright

-stable review patch.  If anyone has any objections, please let us know.
--

From: Patrick McHardy <[EMAIL PROTECTED]>

IP_CT_TCP_FLAG_CLOSE_INIT is a flag and should have a value of 0x4 instead
of 0x3, which is IP_CT_TCP_FLAG_WINDOW_SCALE | IP_CT_TCP_FLAG_SACK_PERM.
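
A small sketch of why 0x3 misbehaves as a flag. The macro names below are
shortened, illustrative versions of the real ones; 0x01 for the window scale
flag follows from the description above (0x3 is the OR of the two existing
flags, and SACK_PERM is 0x02). has_flag() uses the usual "any bit set" test.

```c
/* Shortened names for illustration; values as described in the patch. */
#define FLAG_WINDOW_SCALE	0x01
#define FLAG_SACK_PERM		0x02
#define FLAG_CLOSE_INIT_OLD	0x03	/* overlaps both flags above */
#define FLAG_CLOSE_INIT_NEW	0x04	/* a bit of its own */

/* The common "is this flag set" test: non-zero if any bit of @flag is set. */
static int has_flag(unsigned int flags, unsigned int flag)
{
	return (flags & flag) != 0;
}

/* True if exactly one bit is set, i.e. the value is usable as a flag. */
static int is_single_bit(unsigned int v)
{
	return v != 0 && (v & (v - 1)) == 0;
}
```

With the old value, a connection that only negotiated SACK would also look
as if its sender had sent FIN first; with 0x04 the two can be told apart.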

Signed-off-by: Patrick McHardy <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---

 include/linux/netfilter/nf_conntrack_tcp.h |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- linux-2.6.19.2.orig/include/linux/netfilter/nf_conntrack_tcp.h
+++ linux-2.6.19.2/include/linux/netfilter/nf_conntrack_tcp.h
@@ -25,7 +25,7 @@ enum tcp_conntrack {
 #define IP_CT_TCP_FLAG_SACK_PERM   0x02
 
 /* This sender sent FIN first */
-#define IP_CT_TCP_FLAG_CLOSE_INIT  0x03
+#define IP_CT_TCP_FLAG_CLOSE_INIT  0x04
 
 #ifdef __KERNEL__
 


[patch 20/59] NETFILTER: ctnetlink: fix leak in ctnetlink_create_conntrack error path

2007-02-02 Thread Chris Wright

-stable review patch.  If anyone has any objections, please let us know.
--

From: Patrick McHardy <[EMAIL PROTECTED]>

---

 net/ipv4/netfilter/ip_conntrack_netlink.c |2 +-
 net/netfilter/nf_conntrack_netlink.c  |2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

--- linux-2.6.19.2.orig/net/ipv4/netfilter/ip_conntrack_netlink.c
+++ linux-2.6.19.2/net/ipv4/netfilter/ip_conntrack_netlink.c
@@ -955,7 +955,7 @@ ctnetlink_create_conntrack(struct nfattr
if (cda[CTA_PROTOINFO-1]) {
err = ctnetlink_change_protoinfo(ct, cda);
if (err < 0)
-   return err;
+   goto err;
}
 
 #if defined(CONFIG_IP_NF_CONNTRACK_MARK)
--- linux-2.6.19.2.orig/net/netfilter/nf_conntrack_netlink.c
+++ linux-2.6.19.2/net/netfilter/nf_conntrack_netlink.c
@@ -972,7 +972,7 @@ ctnetlink_create_conntrack(struct nfattr
if (cda[CTA_PROTOINFO-1]) {
err = ctnetlink_change_protoinfo(ct, cda);
if (err < 0)
-   return err;
+   goto err;
}
 
 #if defined(CONFIG_NF_CONNTRACK_MARK)


[patch 22/59] ALSA hda-codec - Fix NULL dereference in generic hda code

2007-02-02 Thread Chris Wright

-stable review patch.  If anyone has any objections, please let us know.
--

From: Takashi Iwai <[EMAIL PROTECTED]>

Fix NULL dereference in hda_generic.c.

Signed-off-by: Takashi Iwai <[EMAIL PROTECTED]>
Signed-off-by: Jaroslav Kysela <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
commit 6afeb11de5b28e47adea1459c35e598bb98424d6
tree 07f4dba0e2fb094b448eb9863de7b6364b768add
parent f9cc8a8b1887e6e2bb430405d0a4f9b5fb39fa5d
author Takashi Iwai <[EMAIL PROTECTED]> Mon, 18 Dec 2006 16:16:04 +0100
committer Jaroslav Kysela <[EMAIL PROTECTED]> Tue, 09 Jan 2007 09:06:17 +0100

 sound/pci/hda/hda_generic.c |5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

--- linux-2.6.19.2.orig/sound/pci/hda/hda_generic.c
+++ linux-2.6.19.2/sound/pci/hda/hda_generic.c
@@ -485,8 +485,9 @@ static const char *get_input_type(struct
return "Front Aux";
return "Aux";
case AC_JACK_MIC_IN:
-   if (node->pin_caps &
-   (AC_PINCAP_VREF_80 << AC_PINCAP_VREF_SHIFT))
+   if (pinctl &&
+   (node->pin_caps &
+(AC_PINCAP_VREF_80 << AC_PINCAP_VREF_SHIFT)))
*pinctl |= AC_PINCTL_VREF_80;
if ((location & 0x0f) == AC_JACK_LOC_FRONT)
return "Front Mic";


[patch 27/59] x86: Work around gcc 4.2 over aggressive optimizer

2007-02-02 Thread Chris Wright

-stable review patch.  If anyone has any objections, please let us know.
--

From: Andi Kleen <[EMAIL PROTECTED]>

The new PDA code uses a dummy _proxy_pda variable to describe
memory references to the PDA. It is never referenced
in inline assembly, but exists as input/output arguments.
gcc 4.2 in some cases can CSE references to this which causes
unresolved symbols.  Define it to zero to avoid this.

Signed-off-by: Andi Kleen <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
 arch/i386/kernel/vmlinux.lds.S   |1 +
 arch/x86_64/kernel/vmlinux.lds.S |1 +
 2 files changed, 2 insertions(+)

--- linux-2.6.19.2.orig/arch/i386/kernel/vmlinux.lds.S
+++ linux-2.6.19.2/arch/i386/kernel/vmlinux.lds.S
@@ -13,6 +13,7 @@ OUTPUT_FORMAT("elf32-i386", "elf32-i386"
 OUTPUT_ARCH(i386)
 ENTRY(phys_startup_32)
 jiffies = jiffies_64;
+_proxy_pda = 0;
 
 PHDRS {
text PT_LOAD FLAGS(5);  /* R_E */
--- linux-2.6.19.2.orig/arch/x86_64/kernel/vmlinux.lds.S
+++ linux-2.6.19.2/arch/x86_64/kernel/vmlinux.lds.S
@@ -13,6 +13,7 @@ OUTPUT_FORMAT("elf64-x86-64", "elf64-x86
 OUTPUT_ARCH(i386:x86-64)
 ENTRY(phys_startup_64)
 jiffies_64 = jiffies;
+_proxy_pda = 0;
 PHDRS {
text PT_LOAD FLAGS(5);  /* R_E */
data PT_LOAD FLAGS(7);  /* RWE */


[patch 19/59] NETFILTER: ctnetlink: check for status attribute existence on conntrack creation

2007-02-02 Thread Chris Wright

-stable review patch.  If anyone has any objections, please let us know.
--

From: Pablo Neira Ayuso <[EMAIL PROTECTED]>

Check that status flags are available in the netlink message received
to create a new conntrack.

Fixes a crash in ctnetlink_create_conntrack when the CTA_STATUS attribute
is not present.
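
The shape of the fix, as a user-space sketch: an optional attribute is looked
up in a table of pointers and is only handed to the helper that dereferences
it if it is actually there. The enum, struct conn, change_status() and
create_conn() are simplified inventions for this example and do not mirror
the netlink attribute API.

```c
#include <stddef.h>

/* Illustrative attribute slots; the real table is indexed by CTA_* - 1. */
enum { ATTR_STATUS, ATTR_PROTOINFO, ATTR_MAX };

struct conn {
	unsigned int status;
};

/* Dereferences the status attribute unconditionally, like the helper does. */
static int change_status(struct conn *ct, const unsigned int *const attrs[])
{
	ct->status |= *attrs[ATTR_STATUS];
	return 0;
}

/* Only call change_status() when the optional attribute was supplied. */
static int create_conn(struct conn *ct, const unsigned int *const attrs[])
{
	int err;

	ct->status = 0;
	if (attrs[ATTR_STATUS]) {
		err = change_status(ct, attrs);
		if (err < 0)
			return err;
	}
	return 0;
}
```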

Signed-off-by: Pablo Neira Ayuso <[EMAIL PROTECTED]>
Signed-off-by: Patrick McHardy <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---

 net/ipv4/netfilter/ip_conntrack_netlink.c |8 +---
 net/netfilter/nf_conntrack_netlink.c  |8 +---
 2 files changed, 10 insertions(+), 6 deletions(-)

--- linux-2.6.19.2.orig/net/ipv4/netfilter/ip_conntrack_netlink.c
+++ linux-2.6.19.2/net/ipv4/netfilter/ip_conntrack_netlink.c
@@ -946,9 +946,11 @@ ctnetlink_create_conntrack(struct nfattr
ct->timeout.expires = jiffies + ct->timeout.expires * HZ;
ct->status |= IPS_CONFIRMED;
 
-   err = ctnetlink_change_status(ct, cda);
-   if (err < 0)
-   goto err;
+   if (cda[CTA_STATUS-1]) {
+   err = ctnetlink_change_status(ct, cda);
+   if (err < 0)
+   goto err;
+   }
 
if (cda[CTA_PROTOINFO-1]) {
err = ctnetlink_change_protoinfo(ct, cda);
--- linux-2.6.19.2.orig/net/netfilter/nf_conntrack_netlink.c
+++ linux-2.6.19.2/net/netfilter/nf_conntrack_netlink.c
@@ -963,9 +963,11 @@ ctnetlink_create_conntrack(struct nfattr
ct->timeout.expires = jiffies + ct->timeout.expires * HZ;
ct->status |= IPS_CONFIRMED;
 
-   err = ctnetlink_change_status(ct, cda);
-   if (err < 0)
-   goto err;
+   if (cda[CTA_STATUS-1]) {
+   err = ctnetlink_change_status(ct, cda);
+   if (err < 0)
+   goto err;
+   }
 
if (cda[CTA_PROTOINFO-1]) {
err = ctnetlink_change_protoinfo(ct, cda);


[patch 24/59] IB/iser: return error code when PDUs may not be sent

2007-02-02 Thread Chris Wright

-stable review patch.  If anyone has any objections, please let us know.
--

From: Erez Zilber <[EMAIL PROTECTED]>

iSER limits the number of outstanding PDUs to send. When this threshold is
reached, it should return an error code (-ENOBUFS) instead of setting the
suspend_tx bit (which should be used only by libiscsi). Without this fix, 
during logout, open-iscsi over iSER tries to logout forever.
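
Roughly, the new behaviour can be sketched as follows. This is a user-space
model, not the driver: struct tx_queue, MAX_OUTSTANDING (standing in for
ISER_QP_MAX_REQ_DTOS, with an arbitrary value) and the three functions are
made up for the illustration, and a plain int replaces the atomic counter.

```c
#include <errno.h>

#define MAX_OUTSTANDING 4	/* arbitrary; stands in for ISER_QP_MAX_REQ_DTOS */

struct tx_queue {
	int outstanding;	/* sends posted but not yet completed */
};

/* Report a full queue with an error code instead of flagging the connection. */
static int check_xmit(const struct tx_queue *q)
{
	if (q->outstanding == MAX_OUTSTANDING)
		return -ENOBUFS;
	return 0;
}

static int send_pdu(struct tx_queue *q)
{
	int rc = check_xmit(q);

	if (rc)
		return rc;
	q->outstanding++;
	return 0;
}

/*
 * Completion handler. Returns 1 if the queue was full before this
 * completion, which is when the transmit work needs to be requeued.
 */
static int complete_pdu(struct tx_queue *q)
{
	int resume_tx = (q->outstanding == MAX_OUTSTANDING);

	q->outstanding--;
	return resume_tx;
}
```

The caller treats -ENOBUFS as "try again later" and not as a connection
failure, and no shared suspend bit is involved any more.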

Signed-off-by: Erez Zilber <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
 drivers/infiniband/ulp/iser/iscsi_iser.c |4 ++--
 drivers/infiniband/ulp/iser/iser_initiator.c |   26 --
 2 files changed, 14 insertions(+), 16 deletions(-)

--- linux-2.6.19.2.orig/drivers/infiniband/ulp/iser/iscsi_iser.c
+++ linux-2.6.19.2/drivers/infiniband/ulp/iser/iscsi_iser.c
@@ -177,7 +177,7 @@ iscsi_iser_mtask_xmit(struct iscsi_conn 
 * - if yes, the mtask is recycled at iscsi_complete_pdu
 * - if no,  the mtask is recycled at iser_snd_completion
 */
-   if (error && error != -EAGAIN)
+   if (error && error != -ENOBUFS)
iscsi_conn_failure(conn, ISCSI_ERR_CONN_FAILED);
 
return error;
@@ -241,7 +241,7 @@ iscsi_iser_ctask_xmit(struct iscsi_conn 
error = iscsi_iser_ctask_xmit_unsol_data(conn, ctask);
 
  iscsi_iser_ctask_xmit_exit:
-   if (error && error != -EAGAIN)
+   if (error && error != -ENOBUFS)
iscsi_conn_failure(conn, ISCSI_ERR_CONN_FAILED);
return error;
 }
--- linux-2.6.19.2.orig/drivers/infiniband/ulp/iser/iser_initiator.c
+++ linux-2.6.19.2/drivers/infiniband/ulp/iser/iser_initiator.c
@@ -304,18 +304,14 @@ int iser_conn_set_full_featured_mode(str
 static int
 iser_check_xmit(struct iscsi_conn *conn, void *task)
 {
-   int rc = 0;
struct iscsi_iser_conn *iser_conn = conn->dd_data;
 
-   write_lock_bh(conn->recv_lock);
if (atomic_read(&iser_conn->ib_conn->post_send_buf_count) ==
ISER_QP_MAX_REQ_DTOS) {
-   iser_dbg("%ld can't xmit task %p, suspending tx\n",jiffies,task);
-   set_bit(ISCSI_SUSPEND_BIT, &conn->suspend_tx);
-   rc = -EAGAIN;
+   iser_dbg("%ld can't xmit task %p\n",jiffies,task);
+   return -ENOBUFS;
}
-   write_unlock_bh(conn->recv_lock);
-   return rc;
+   return 0;
 }
 
 
@@ -340,7 +336,7 @@ int iser_send_command(struct iscsi_conn 
return -EPERM;
}
if (iser_check_xmit(conn, ctask))
-   return -EAGAIN;
+   return -ENOBUFS;
 
edtl = ntohl(hdr->data_length);
 
@@ -426,7 +422,7 @@ int iser_send_data_out(struct iscsi_conn
}
 
if (iser_check_xmit(conn, ctask))
-   return -EAGAIN;
+   return -ENOBUFS;
 
itt = ntohl(hdr->itt);
data_seg_len = ntoh24(hdr->dlength);
@@ -500,7 +496,7 @@ int iser_send_control(struct iscsi_conn 
}
 
if (iser_check_xmit(conn,mtask))
-   return -EAGAIN;
+   return -ENOBUFS;
 
/* build the tx desc regd header and add it to the tx desc dto */
mdesc->type = ISCSI_TX_CONTROL;
@@ -609,6 +605,7 @@ void iser_snd_completion(struct iser_des
struct iscsi_iser_conn *iser_conn = ib_conn->iser_conn;
struct iscsi_conn  *conn = iser_conn->iscsi_conn;
struct iscsi_mgmt_task *mtask;
+   int resume_tx = 0;
 
iser_dbg("Initiator, Data sent dto=0x%p\n", dto);
 
@@ -617,15 +614,16 @@ void iser_snd_completion(struct iser_des
if (tx_desc->type == ISCSI_TX_DATAOUT)
kmem_cache_free(ig.desc_cache, tx_desc);
 
+   if (atomic_read(&iser_conn->ib_conn->post_send_buf_count) ==
+   ISER_QP_MAX_REQ_DTOS)
+   resume_tx = 1;
+
atomic_dec(&ib_conn->post_send_buf_count);
 
-   write_lock(conn->recv_lock);
-   if (conn->suspend_tx) {
+   if (resume_tx) {
iser_dbg("%ld resuming tx\n",jiffies);
-   clear_bit(ISCSI_SUSPEND_BIT, &conn->suspend_tx);
scsi_queue_work(conn->session->host, &conn->xmitwork);
}
-   write_unlock(conn->recv_lock);
 
if (tx_desc->type == ISCSI_TX_CONTROL) {
/* this arithmetic is legal by libiscsi dd_data allocation */


[patch 28/59] NETFILTER: Fix iptables ABI breakage on (at least) CRIS

2007-02-02 Thread Chris Wright

-stable review patch.  If anyone has any objections, please let us know.
--

From: Patrick McHardy <[EMAIL PROTECTED]>

With the introduction of x_tables we accidentally broke compatibility
by defining IPT_TABLE_MAXNAMELEN to XT_FUNCTION_MAXNAMELEN instead of
XT_TABLE_MAXNAMELEN, which is two bytes larger.

On most architectures it doesn't really matter since we don't have
any tables with names that long in the kernel and the structure
layout didn't change because of alignment requirements of following
members. On CRIS however (and other architectures that don't align
data) this changed the structure layout and thus broke compatibility
with old iptables binaries.
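
The layout effect can be seen with a small sketch. The structs below are
loosely modelled on a table name followed by an unsigned int and are not the
real iptables structures; the lengths 30 and 32 are assumed here as the two
name sizes ("two bytes larger"), and __attribute__((packed)) is a GCC/Clang
extension used to imitate an architecture that does not align data.

```c
#include <stddef.h>

/* Name lengths assumed for illustration: the shorter and the longer limit. */
#define SHORT_NAMELEN	30
#define LONG_NAMELEN	32

/* Natural alignment: padding after name[30] hides the two-byte difference. */
struct aligned_short { char name[SHORT_NAMELEN]; unsigned int valid_hooks; };
struct aligned_long  { char name[LONG_NAMELEN];  unsigned int valid_hooks; };

/* No alignment (as on CRIS): the next member moves by two bytes. */
struct packed_short { char name[SHORT_NAMELEN]; unsigned int valid_hooks; }
	__attribute__((packed));
struct packed_long  { char name[LONG_NAMELEN];  unsigned int valid_hooks; }
	__attribute__((packed));

/* Offset of the member following the name, with natural alignment. */
static size_t hooks_offset_aligned(int long_name)
{
	return long_name ? offsetof(struct aligned_long, valid_hooks)
			 : offsetof(struct aligned_short, valid_hooks);
}

/* Same, for the packed variants. */
static size_t hooks_offset_packed(int long_name)
{
	return long_name ? offsetof(struct packed_long, valid_hooks)
			 : offsetof(struct packed_short, valid_hooks);
}
```

On a machine where unsigned int is 4-byte aligned, both aligned variants put
valid_hooks at the same offset, while the packed variants differ by two.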

Changing it back will break compatibility with binaries compiled
against recent kernels again, but since the breakage has only been
there for three releases this seems like the better choice.

Spotted by Jonas Berlin <[EMAIL PROTECTED]>.

Signed-off-by: Patrick McHardy <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---

 include/linux/netfilter_ipv4/ip_tables.h |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- linux-2.6.19.2.orig/include/linux/netfilter_ipv4/ip_tables.h
+++ linux-2.6.19.2/include/linux/netfilter_ipv4/ip_tables.h
@@ -28,7 +28,7 @@
 #include 
 
 #define IPT_FUNCTION_MAXNAMELEN XT_FUNCTION_MAXNAMELEN
-#define IPT_TABLE_MAXNAMELEN XT_FUNCTION_MAXNAMELEN
+#define IPT_TABLE_MAXNAMELEN XT_TABLE_MAXNAMELEN
 #define ipt_match xt_match
 #define ipt_target xt_target
 #define ipt_table xt_table


[patch 26/59] ACPI: fix cpufreq regression

2007-02-02 Thread Chris Wright

-stable review patch.  If anyone has any objections, please let us know.
--

From: Ingo Molnar <[EMAIL PROTECTED]>

recently cpufreq support on my laptop (Lenovo T60) broke completely: 
when it's plugged into AC it would never go higher than 1 GHz - neither 
1.3 GHz nor 1.83 GHz is possible - no matter which governor (userspace, 
speed or ondemand) is used.

after some cpufreq debugging i tracked the regression back to the 
following (totally correct) bug-fix commit:

   commit 0916bd3ebb7cefdd0f432e8491abe24f4b5a101e
   Author: Dave Jones <[EMAIL PROTECTED]>
   Date:   Wed Nov 22 20:42:01 2006 -0500

[PATCH] Correct bound checking from the value returned from _PPC method.

this bugfix, which makes other laptops work, made a previously hidden 
(BIOS) bug visible on my laptop.

The bug is the following: if the _PPC (Performance Present Capabilities) 
optional ACPI object is queried /after/ bootup then the BIOS reports an 
incorrect value of '2'.

My laptop (Lenovo T60) has the following performance states supported:

   0: 1833000
   1: 1333000
   2: 1000000

Per ACPI specification, a _PPC value of '0' means that all 3 performance 
states are usable. A _PPC value of '1' means states 1 .. 2 are usable, a 
value of '2' means only state '2' (slowest) is usable.
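
Expressed as code, the _PPC rule described above is just an index into the
state table. This is a sketch, not the ACPI driver: max_freq_for_ppc() and
the table name are made up for the illustration, frequencies are in kHz with
the fastest state first, and the 1000000 entry for state 2 is assumed from
the "1 GHz" mentioned earlier in this mail.

```c
#include <stddef.h>

/* Performance states in kHz, fastest first; state 2 assumed to be 1 GHz. */
static const unsigned int t60_states_khz[] = { 1833000, 1333000, 1000000 };

/*
 * A _PPC value of n means states n .. last are usable, so the highest
 * permitted frequency is simply states[n]. Returns 0 if @ppc is out of
 * range for the table.
 */
static unsigned int max_freq_for_ppc(const unsigned int *states, size_t nr,
				     unsigned int ppc)
{
	if (ppc >= nr)
		return 0;
	return states[ppc];
}
```

With this table a _PPC of 2 caps the CPU at the slowest state, which matches
the symptom described at the top of this mail.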

now, the _PPC object is optional, and it also comes with notification. 
Furthermore, when a CPU object is initialized, the _PPC object is 
initialized as well. So the following evaluation of the _PPC object is 
superfluous:

 [] acpi_processor_get_platform_limit+0xa1/0xaf
 [] acpi_processor_register_performance+0x3b9/0x3ef
 [] acpi_cpufreq_cpu_init+0xb7/0x596
 [] cpufreq_add_dev+0x160/0x4a8
 [] sysdev_driver_register+0x5a/0xa0
 [] cpufreq_register_driver+0xb4/0x176
 [] acpi_cpufreq_init+0xe5/0xeb
 [] init+0x14f/0x3dd

and this is the point where my laptop's BIOS returns the incorrect value 
of '2'. Note that it has not sent any notification event, so the value 
is probably not really intentional (possibly spurious), and Windows 
likely doesn't query it after bootup either. Maybe the value is kept at 
'2' normally, and is only set to the real value when a true asynchronous 
event (such as AC plug event, battery switch, etc.) occurs.

So i /think/ this is a grey area of the ACPI spec: per the letter of the 
spec the _PPC value only changes when notified, so there's no reason to 
query it after the system has booted up. So in my opinion the best (and 
most compatible) strategy would be to do the change below, and to not 
evaluate the _PPC object in the acpi_processor_get_performance_info() 
call, but only evaluate it if _PPC is present during CPU object init, or 
if it's notified during an asynchronous event. This change is more 
permissive than the previous logic, so it definitely shouldn't break any 
existing system.

This also happens to fix my laptop, which is merrily chugging along at 
1.83 GHz now. Yay!

Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]>
Cc: Dave Jones <[EMAIL PROTECTED]>
Acked-by: Len Brown <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
Thomas Renninger <[EMAIL PROTECTED]> wrote:
 Besides the ThinkPad, it also seems to fix other systems:
 http://bugzilla.kernel.org/show_bug.cgi?id=7859

 drivers/acpi/processor_perflib.c |4 
 1 file changed, 4 deletions(-)

--- linux-2.6.19.2.orig/drivers/acpi/processor_perflib.c
+++ linux-2.6.19.2/drivers/acpi/processor_perflib.c
@@ -322,10 +322,6 @@ static int acpi_processor_get_performanc
if (result)
return result;
 
-   result = acpi_processor_get_platform_limit(pr);
-   if (result)
-   return result;
-
return 0;
 }
 

--
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[patch 01/59] i2c-mv64xxx: Fix random oops at boot

2007-02-02 Thread Chris Wright

-stable review patch.  If anyone has any objections, please let us know.
--

From: Maxime Bizon <[EMAIL PROTECTED]>

I have a Marvell board which has the same i2c hw block as mv64xxx, so
I'm trying to use the i2c-mv64xxx driver.

But I get the following random oops at boot:

Unable to handle kernel NULL pointer dereference at virtual address 0002
Backtrace: 
[] (mv64xxx_i2c_intr+0x0/0x2b8) from [] (__do_irq+0x4c/0x8c)
[] (__do_irq+0x0/0x8c) from [] (do_level_IRQ+0x68/0xc0)
 r8 = C0501E08  r7 = 0005  r6 = C0501E08  r5 = 0005
 r4 = C048BB78 
[] (do_level_IRQ+0x0/0xc0) from [] (asm_do_IRQ+0x50/0x134)
 r6 = C0449C78  r5 = F102  r4 =  
[] (asm_do_IRQ+0x0/0x134) from [] (__irq_svc+0x24/0x100)
 r8 = C1CAC400  r7 = 0005  r6 = 0002  r5 = F102
 r4 =  
[] (setup_irq+0x0/0x124) from [] (request_irq+0xb0/0xd0)
 r7 = C041B2AC  r6 = C0397E4C  r5 =   r4 = 0005
[] (request_irq+0x0/0xd0) from [] 
(mv64xxx_i2c_probe+0x148/0x244)
[] (mv64xxx_i2c_probe+0x0/0x244) from [] 
(platform_drv_probe+0x20/0x24)


The oops is caused by a spurious interrupt that occurs when request_irq
is called. mv64xxx_i2c_fsm() tries to read drv_data->msg, which is NULL.

I noticed that hardware init is done after requesting irq. Thus any
pending irq from previous hardware usage may cause this.

The following patch fixes it:

Signed-off-by: Maxime Bizon <[EMAIL PROTECTED]>
Acked-by: Mark A. Greer <[EMAIL PROTECTED]>
Signed-off-by: Jean Delvare <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
Merged in 2.6.20-rc4:
http://www.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=3269bb63eb076318ce4fb554851d047e1c9aa1a5

 drivers/i2c/busses/i2c-mv64xxx.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- linux-2.6.19.2.orig/drivers/i2c/busses/i2c-mv64xxx.c
+++ linux-2.6.19.2/drivers/i2c/busses/i2c-mv64xxx.c
@@ -529,6 +529,8 @@ mv64xxx_i2c_probe(struct platform_device
platform_set_drvdata(pd, drv_data);
i2c_set_adapdata(&drv_data->adapter, drv_data);
 
+   mv64xxx_i2c_hw_init(drv_data);
+
if (request_irq(drv_data->irq, mv64xxx_i2c_intr, 0,
MV64XXX_I2C_CTLR_NAME, drv_data)) {
dev_err(&drv_data->adapter.dev,
@@ -542,8 +544,6 @@ mv64xxx_i2c_probe(struct platform_device
goto exit_free_irq;
}
 
-   mv64xxx_i2c_hw_init(drv_data);
-
return 0;
 
exit_free_irq:


[patch 11/59] [stable] [PATCH] IB/mthca: Fix off-by-one in FMR handling on memfree

2007-02-02 Thread Chris Wright

-stable review patch.  If anyone has any objections, please let us know.
--

From: Michael S. Tsirkin <[EMAIL PROTECTED]>

mthca_table_find() will return the wrong address when the table entry
being searched for is exactly at the beginning of a sglist entry
(other than the first), because it uses >= when it should use >.

Example: assume we have 2 entries in scatterlist, 4K each, offset is
4K.  The current code will return first entry + 4K when we really want
the second entry.

In particular this means mapping an FMR on a memfree HCA may end up
writing the page table into the wrong place, leading to memory
corruption and also causing the HCA to use an incorrect address
translation table.
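
The two-entry example above can be replayed with a small stand-alone sketch. This is not the mthca code: find_entry() and its parameters are hypothetical, and the loop only mimics the idea of walking entries and subtracting each length from the offset:

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for the lookup loop: walk entries of the given
 * lengths and return the index of the entry that should contain byte
 * 'offset'. On success *rem holds the offset within that entry.
 * Returns -1 if the offset lies beyond the last entry.
 * 'strict' selects the fixed comparison (>) over the old one (>=).
 */
static int find_entry(const size_t *len, size_t n, size_t offset,
                      int strict, size_t *rem)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (strict ? len[i] > offset : len[i] >= offset) {
            *rem = offset;
            return (int)i;
        }
        offset -= len[i];
    }
    return -1;
}
```

For two 4K entries and an offset of 4K, the >= variant stops at entry 0 with a remainder of 4K (one byte past its end), while the > variant moves on to entry 1 with a remainder of 0.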

Signed-off-by: Michael S. Tsirkin <[EMAIL PROTECTED]>
Signed-off-by: Roland Dreier <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
This is upstream, and fixes a data corruption/crash bug with storage
over SRP.

 drivers/infiniband/hw/mthca/mthca_memfree.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- linux-2.6.19.2.orig/drivers/infiniband/hw/mthca/mthca_memfree.c
+++ linux-2.6.19.2/drivers/infiniband/hw/mthca/mthca_memfree.c
@@ -232,7 +232,7 @@ void *mthca_table_find(struct mthca_icm_
 
list_for_each_entry(chunk, >chunk_list, list) {
for (i = 0; i < chunk->npages; ++i) {
-   if (chunk->mem[i].length >= offset) {
+   if (chunk->mem[i].length > offset) {
page = chunk->mem[i].page;
goto out;
}


[patch 02/59] i2c/m41t00: Do not forget to write year

2007-02-02 Thread Chris Wright

-stable review patch.  If anyone has any objections, please let us know.
--

From: Philippe De Muyter <[EMAIL PROTECTED]>

m41t00.c forgets to set the year field in set_rtc_time; fix that.

Signed-off-by: Philippe De Muyter <[EMAIL PROTECTED]>
Acked-by: Mark A. Greer <[EMAIL PROTECTED]>
Signed-off-by: Jean Delvare <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
Merged in 2.6.20-rc4:
http://www.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=81ffbc04a8ea06c4bea534154f49ed598013ee6b

 drivers/i2c/chips/m41t00.c |1 +
 1 file changed, 1 insertion(+)

--- linux-2.6.19.2.orig/drivers/i2c/chips/m41t00.c
+++ linux-2.6.19.2/drivers/i2c/chips/m41t00.c
@@ -209,6 +209,7 @@ m41t00_set(void *arg)
buf[m41t00_chip->hour] = (buf[m41t00_chip->hour] & ~0x3f) | (hour& 0x3f);
buf[m41t00_chip->day] = (buf[m41t00_chip->day] & ~0x3f) | (day & 0x3f);
buf[m41t00_chip->mon] = (buf[m41t00_chip->mon] & ~0x1f) | (mon & 0x1f);
+   buf[m41t00_chip->year] = year;
 
if (i2c_master_send(save_client, wbuf, 9) < 0)
dev_err(&save_client->dev, "m41t00_set: Write error\n");


[patch 10/59] Repair snd-usb-usx2y over OHCI

2007-02-02 Thread Chris Wright

-stable review patch.  If anyone has any objections, please let us know.
--

From: Karsten Wiese <[EMAIL PROTECTED]>

The previous patch "Repair snd-usb-usx2y for usb 2.6.18" assumed that
urb->start_frame rolls over beyond MAX_INT for both UHCI and OHCI.
This has not been true so far (kernel 2.6.20).
Fix this by only looking at the frame number range common to OHCI and
UHCI.
This is for mainline and stable kernels >= 2.6.18.

Signed-off-by: Karsten Wiese <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
 sound/usb/usx2y/usbusx2yaudio.c |2 +-
 sound/usb/usx2y/usx2yhwdeppcm.c |2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

--- linux-2.6.19.2.orig/sound/usb/usx2y/usbusx2yaudio.c
+++ linux-2.6.19.2/sound/usb/usx2y/usbusx2yaudio.c
@@ -322,7 +322,7 @@ static void i_usX2Y_urb_complete(struct 
usX2Y_error_urb_status(usX2Y, subs, urb);
return;
}
-   if (likely(urb->start_frame == usX2Y->wait_iso_frame))
+   if (likely((urb->start_frame & 0x) == (usX2Y->wait_iso_frame & 
0x)))
subs->completed_urb = urb;
else {
usX2Y_error_sequence(usX2Y, subs, urb);
--- linux-2.6.19.2.orig/sound/usb/usx2y/usx2yhwdeppcm.c
+++ linux-2.6.19.2/sound/usb/usx2y/usx2yhwdeppcm.c
@@ -243,7 +243,7 @@ static void i_usX2Y_usbpcm_urb_complete(
usX2Y_error_urb_status(usX2Y, subs, urb);
return;
}
-   if (likely(urb->start_frame == usX2Y->wait_iso_frame))
+   if (likely((urb->start_frame & 0x) == (usX2Y->wait_iso_frame & 
0x)))
subs->completed_urb = urb;
else {
usX2Y_error_sequence(usX2Y, subs, urb);


[patch 07/59] NETFILTER: nf_conntrack_ipv6: fix crash when handling fragments

2007-02-02 Thread Chris Wright

-stable review patch.  If anyone has any objections, please let us know.
--

From: Patrick McHardy <[EMAIL PROTECTED]>

When IPv6 connection tracking splits up a defragmented packet into
its original fragments, the packets are taken from a list and are
passed to the network stack with skb->next still set. This causes
dev_hard_start_xmit to treat them as GSO fragments, resulting in
a use after free when connection tracking handles the next fragment.
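
The pattern behind the fix (save the successor, detach the element, then hand it off) can be shown with a plain singly linked list. This is only a sketch: struct pkt, count_chain() and deliver_all() are made-up names, and count_chain() stands in for any consumer that treats a non-NULL ->next as "more packets follow":

```c
#include <stddef.h>

struct pkt {
    struct pkt *next;
    int id;
};

/* Stand-in for a consumer that treats a non-NULL ->next as a chain. */
static int count_chain(const struct pkt *p)
{
    int n = 0;

    for (; p != NULL; p = p->next)
        n++;
    return n;
}

/*
 * Hand each packet to the consumer on its own. Returns the longest
 * chain the consumer ever saw; 1 means every packet was properly
 * detached before being passed on.
 */
static int deliver_all(struct pkt *head)
{
    struct pkt *s, *s2;
    int worst = 0;

    for (s = head; s != NULL; s = s2) {
        int seen;

        s2 = s->next;   /* remember the rest of the list */
        s->next = NULL; /* detach before handing off */
        seen = count_chain(s);
        if (seen > worst)
            worst = seen;
    }
    return worst;
}
```

Without the s->next = NULL line the consumer would see the first packet as a chain of all remaining fragments, which is the misinterpretation described above.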

Signed-off-by: Patrick McHardy <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---

 net/ipv6/netfilter/nf_conntrack_reasm.c |2 ++
 1 file changed, 2 insertions(+)

--- linux-2.6.19.2.orig/net/ipv6/netfilter/nf_conntrack_reasm.c
+++ linux-2.6.19.2/net/ipv6/netfilter/nf_conntrack_reasm.c
@@ -835,6 +835,8 @@ void nf_ct_frag6_output(unsigned int hoo
s->nfct_reasm = skb;
 
s2 = s->next;
+   s->next = NULL;
+
NF_HOOK_THRESH(PF_INET6, hooknum, s, in, out, okfn,
   NF_IP6_PRI_CONNTRACK_DEFRAG + 1);
s = s2;


[patch 06/59] NETFILTER: Fix routing of REJECT target generated packets in output chain

2007-02-02 Thread Chris Wright

-stable review patch.  If anyone has any objections, please let us know.
--

From: Patrick McHardy <[EMAIL PROTECTED]>

Packets generated by the REJECT target in the output chain have a local
destination address and a foreign source address. Make sure not to use
the foreign source address for the output route lookup.

Signed-off-by: Patrick McHardy <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
 net/ipv4/netfilter.c |7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

--- linux-2.6.19.2.orig/net/ipv4/netfilter.c
+++ linux-2.6.19.2/net/ipv4/netfilter.c
@@ -15,16 +15,19 @@ int ip_route_me_harder(struct sk_buff **
struct flowi fl = {};
struct dst_entry *odst;
unsigned int hh_len;
+   unsigned int type;
 
+   type = inet_addr_type(iph->saddr);
if (addr_type == RTN_UNSPEC)
-   addr_type = inet_addr_type(iph->saddr);
+   addr_type = type;
 
/* some non-standard hacks like ipt_REJECT.c:send_reset() can cause
 * packets with foreign saddr to appear on the NF_IP_LOCAL_OUT hook.
 */
if (addr_type == RTN_LOCAL) {
fl.nl_u.ip4_u.daddr = iph->daddr;
-   fl.nl_u.ip4_u.saddr = iph->saddr;
+   if (type == RTN_LOCAL)
+   fl.nl_u.ip4_u.saddr = iph->saddr;
fl.nl_u.ip4_u.tos = RT_TOS(iph->tos);
fl.oif = (*pskb)->sk ? (*pskb)->sk->sk_bound_dev_if : 0;
 #ifdef CONFIG_IP_ROUTE_FWMARK


[patch 36/59] knfsd: fix type mismatch with filldir_t used by nfsd.

2007-02-02 Thread Chris Wright

-stable review patch.  If anyone has any objections, please let us know.
--

From: NeilBrown <[EMAIL PROTECTED]>

nfsd defines a type 'encode_dent_fn' which is much like 'filldir_t'
except that the first pointer is 'struct readdir_cd *' rather than
'void *'.  It then casts encode_dent_fn pointers to 'filldir_t' as
needed.  This hides any other type mismatches between the two such as
the fact that the 'ino' arg recently changed from ino_t to u64.

So: get rid of 'encode_dent_fn', get rid of the cast of the function
type, change the first arg of various functions from 'struct readdir_cd *'
to 'void *', and live with the fact that we have a little less type
checking on the calling of these functions now.  
Less internal (to nfsd) checking offset by more external checking, which
is more important.
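
A reduced illustration of the approach, outside the kernel: the callback takes 'void *' and recovers its own type inside the body, so no function-pointer cast is needed and the compiler checks the remaining arguments against the callback type. All names here (fill_fn, dir_ctx, record_entry, walk_dir) are invented for the sketch and are not the nfsd or VFS interfaces:

```c
#include <stddef.h>

/* Generic callback type: the first argument is an opaque pointer. */
typedef int (*fill_fn)(void *ctx, const char *name, unsigned long long ino);

struct dir_ctx {
    int entries;
    unsigned long long last_ino;
};

/*
 * The callback matches fill_fn exactly, so it can be passed without a
 * cast. A mismatch in any argument type would be a compile-time
 * diagnostic instead of being hidden by a cast.
 */
static int record_entry(void *ctxv, const char *name, unsigned long long ino)
{
    struct dir_ctx *ctx = ctxv;

    (void)name;
    ctx->entries++;
    ctx->last_ino = ino;
    return 0;
}

/* Stand-in for the directory walker that invokes the callback. */
static int walk_dir(fill_fn fn, void *ctx)
{
    int err = fn(ctx, "a", 1ULL);

    if (err)
        return err;
    return fn(ctx, "b", 0x100000000ULL); /* does not fit in 32 bits */
}
```

The second call uses an inode number wider than 32 bits to hint at why a silently mismatched 'ino' type is dangerous.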

Thanks to Gabriel Paubert <[EMAIL PROTECTED]> for discovering this and
providing an initial patch.

Signed-off-by: Gabriel Paubert <[EMAIL PROTECTED]>
Signed-off-by: Neil Brown <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
 fs/nfsd/nfs3xdr.c |9 +
 fs/nfsd/nfs4xdr.c |5 +++--
 fs/nfsd/nfsxdr.c  |5 +++--
 fs/nfsd/vfs.c |4 ++--
 include/linux/nfsd/nfsd.h |4 +---
 include/linux/nfsd/xdr.h  |4 ++--
 include/linux/nfsd/xdr3.h |8 
 7 files changed, 20 insertions(+), 19 deletions(-)

--- linux-2.6.19.2.orig/fs/nfsd/nfs3xdr.c
+++ linux-2.6.19.2/fs/nfsd/nfs3xdr.c
@@ -994,15 +994,16 @@ encode_entry(struct readdir_cd *ccd, con
 }
 
 int
-nfs3svc_encode_entry(struct readdir_cd *cd, const char *name,
-int namlen, loff_t offset, ino_t ino, unsigned int d_type)
+nfs3svc_encode_entry(void *cd, const char *name,
+int namlen, loff_t offset, u64 ino, unsigned int d_type)
 {
return encode_entry(cd, name, namlen, offset, ino, d_type, 0);
 }
 
 int
-nfs3svc_encode_entry_plus(struct readdir_cd *cd, const char *name,
- int namlen, loff_t offset, ino_t ino, unsigned int d_type)
+nfs3svc_encode_entry_plus(void *cd, const char *name,
+ int namlen, loff_t offset, u64 ino,
+ unsigned int d_type)
 {
return encode_entry(cd, name, namlen, offset, ino, d_type, 1);
 }
--- linux-2.6.19.2.orig/fs/nfsd/nfs4xdr.c
+++ linux-2.6.19.2/fs/nfsd/nfs4xdr.c
@@ -1884,9 +1884,10 @@ nfsd4_encode_rdattr_error(__be32 *p, int
 }
 
 static int
-nfsd4_encode_dirent(struct readdir_cd *ccd, const char *name, int namlen,
-   loff_t offset, ino_t ino, unsigned int d_type)
+nfsd4_encode_dirent(void *ccdv, const char *name, int namlen,
+   loff_t offset, u64 ino, unsigned int d_type)
 {
+   struct readdir_cd *ccd = ccdv;
struct nfsd4_readdir *cd = container_of(ccd, struct nfsd4_readdir, common);
int buflen;
__be32 *p = cd->buffer;
--- linux-2.6.19.2.orig/fs/nfsd/nfsxdr.c
+++ linux-2.6.19.2/fs/nfsd/nfsxdr.c
@@ -467,9 +467,10 @@ nfssvc_encode_statfsres(struct svc_rqst 
 }
 
 int
-nfssvc_encode_entry(struct readdir_cd *ccd, const char *name,
-   int namlen, loff_t offset, ino_t ino, unsigned int d_type)
+nfssvc_encode_entry(void *ccdv, const char *name,
+   int namlen, loff_t offset, u64 ino, unsigned int d_type)
 {
+   struct readdir_cd *ccd = ccdv;
struct nfsd_readdirres *cd = container_of(ccd, struct nfsd_readdirres, common);
__be32  *p = cd->buffer;
int buflen, slen;
--- linux-2.6.19.2.orig/fs/nfsd/vfs.c
+++ linux-2.6.19.2/fs/nfsd/vfs.c
@@ -1727,7 +1727,7 @@ out:
  */
 __be32
 nfsd_readdir(struct svc_rqst *rqstp, struct svc_fh *fhp, loff_t *offsetp, 
-struct readdir_cd *cdp, encode_dent_fn func)
+struct readdir_cd *cdp, filldir_t func)
 {
__be32  err;
int host_err;
@@ -1752,7 +1752,7 @@ nfsd_readdir(struct svc_rqst *rqstp, str
 
do {
cdp->err = nfserr_eof; /* will be cleared on successful read */
-   host_err = vfs_readdir(file, (filldir_t) func, cdp);
+   host_err = vfs_readdir(file, func, cdp);
} while (host_err >=0 && cdp->err == nfs_ok);
if (host_err)
err = nfserrno(host_err);
--- linux-2.6.19.2.orig/include/linux/nfsd/nfsd.h
+++ linux-2.6.19.2/include/linux/nfsd/nfsd.h
@@ -52,8 +52,6 @@
 struct readdir_cd {
__be32  err;/* 0, nfserr, or nfserr_eof */
 };
-typedef int(*encode_dent_fn)(struct readdir_cd *, const char *,
-   int, loff_t, ino_t, unsigned int);
 typedef int (*nfsd_dirop_t)(struct inode *, struct dentry *, int, int);
 
 extern struct svc_program  nfsd_program;
@@ -117,7 +115,7 @@ __be32  nfsd_unlink(struct svc_rqst *, s
 intnfsd_truncate(struct svc_rqst *, struct svc_fh *,
unsigned long size);
 __be32

[patch 34/59] knfsd: fix setting of ACL server versions.

2007-02-02 Thread Chris Wright

-stable review patch.  If anyone has any objections, please let us know.
--

From: NeilBrown <[EMAIL PROTECTED]>

Due to silly typos, if the nfs versions are explicitly set,
no NFSACL versions get enabled.

Also improve an error message that would have made this bug
a little easier to find.

Signed-off-by: Neil Brown <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
 fs/nfsd/nfssvc.c |8 
 net/sunrpc/svc.c |3 ++-
 2 files changed, 6 insertions(+), 5 deletions(-)

--- linux-2.6.19.2.orig/fs/nfsd/nfssvc.c
+++ linux-2.6.19.2/fs/nfsd/nfssvc.c
@@ -72,7 +72,7 @@ static struct svc_program nfsd_acl_progr
.pg_prog= NFS_ACL_PROGRAM,
.pg_nvers   = NFSD_ACL_NRVERS,
.pg_vers= nfsd_acl_versions,
-   .pg_name= "nfsd",
+   .pg_name= "nfsacl",
.pg_class   = "nfsd",
.pg_stats   = _acl_svcstats,
.pg_authenticate= _set_client,
@@ -118,16 +118,16 @@ int nfsd_vers(int vers, enum vers_op cha
switch(change) {
case NFSD_SET:
nfsd_versions[vers] = nfsd_version[vers];
-   break;
 #if defined(CONFIG_NFSD_V2_ACL) || defined(CONFIG_NFSD_V3_ACL)
if (vers < NFSD_ACL_NRVERS)
-   nfsd_acl_version[vers] = nfsd_acl_version[vers];
+   nfsd_acl_versions[vers] = nfsd_acl_version[vers];
 #endif
+   break;
case NFSD_CLEAR:
nfsd_versions[vers] = NULL;
 #if defined(CONFIG_NFSD_V2_ACL) || defined(CONFIG_NFSD_V3_ACL)
if (vers < NFSD_ACL_NRVERS)
-   nfsd_acl_version[vers] = NULL;
+   nfsd_acl_versions[vers] = NULL;
 #endif
break;
case NFSD_TEST:
--- linux-2.6.19.2.orig/net/sunrpc/svc.c
+++ linux-2.6.19.2/net/sunrpc/svc.c
@@ -910,7 +910,8 @@ err_bad_prog:
 
 err_bad_vers:
 #ifdef RPC_PARANOIA
-   printk("svc: unknown version (%d)\n", vers);
+   printk("svc: unknown version (%d for prog %d, %s)\n",
+  vers, prog, progp->pg_name);
 #endif
serv->sv_stats->rpcbadfmt++;
svc_putnl(resv, RPC_PROG_MISMATCH);


[patch 48/59] SPARC32: Fix over-optimization by GCC near ip_fast_csum.

2007-02-02 Thread Chris Wright

-stable review patch.  If anyone has any objections, please let us know.
--

From: Bob Breuer <[EMAIL PROTECTED]>

In some cases such as:
iph->check = 0;
iph->check = ip_fast_csum((unsigned char *)iph, iph->ihl);
GCC may optimize out the previous store.

Observed as a failure of NFS over UDP (bad checksums on IP fragments)
when compiled with GCC 3.4.2.

Signed-off-by: Bob Breuer <[EMAIL PROTECTED]>
Signed-off-by: David S. Miller <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
 include/asm-sparc/checksum.h |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- linux-2.6.19.2.orig/include/asm-sparc/checksum.h
+++ linux-2.6.19.2/include/asm-sparc/checksum.h
@@ -159,7 +159,7 @@ static inline unsigned short ip_fast_csu
 "xnor\t%%g0, %0, %0"
 : "=r" (sum), "=" (iph)
 : "r" (ihl), "1" (iph)
-: "g2", "g3", "g4", "cc");
+: "g2", "g3", "g4", "cc", "memory");
return sum;
 }
 


[patch 58/59] move_task_off_dead_cpu() should be called with disabled ints

2007-02-02 Thread Chris Wright

-stable review patch.  If anyone has any objections, please let us know.
--

From: Kirill Korotaev <[EMAIL PROTECTED]>

move_task_off_dead_cpu() requires interrupts to be disabled, while
migrate_dead() calls it with enabled interrupts.  Added appropriate
comments to functions and added BUG_ON(!irqs_disabled()) into
double_rq_lock() and double_lock_balance(), which are the original sources
of such bugs.

Signed-off-by: Kirill Korotaev <[EMAIL PROTECTED]>
Acked-by: Ingo Molnar <[EMAIL PROTECTED]>
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
Signed-off-by: Linus Torvalds <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
 kernel/sched.c |   17 ++---
 1 file changed, 14 insertions(+), 3 deletions(-)

--- linux-2.6.19.2.orig/kernel/sched.c
+++ linux-2.6.19.2/kernel/sched.c
@@ -1941,6 +1941,7 @@ static void double_rq_lock(struct rq *rq
__acquires(rq1->lock)
__acquires(rq2->lock)
 {
+   BUG_ON(!irqs_disabled());
if (rq1 == rq2) {
spin_lock(&rq1->lock);
__acquire(rq2->lock);   /* Fake it out ;) */
@@ -1980,6 +1981,11 @@ static void double_lock_balance(struct r
__acquires(busiest->lock)
__acquires(this_rq->lock)
 {
+   if (unlikely(!irqs_disabled())) {
+   /* printk() doesn't work good under rq->lock */
+   spin_unlock(&this_rq->lock);
+   BUG_ON(1);
+   }
if (unlikely(!spin_trylock(&busiest->lock))) {
if (busiest < this_rq) {
spin_unlock(&this_rq->lock);
@@ -5050,7 +5056,10 @@ wait_to_die:
 }
 
 #ifdef CONFIG_HOTPLUG_CPU
-/* Figure out where task on dead CPU should go, use force if neccessary. */
+/*
+ * Figure out where task on dead CPU should go, use force if neccessary.
+ * NOTE: interrupts should be disabled by the caller
+ */
 static void move_task_off_dead_cpu(int dead_cpu, struct task_struct *p)
 {
unsigned long flags;
@@ -5170,6 +5179,7 @@ void idle_task_exit(void)
mmdrop(mm);
 }
 
+/* called under rq->lock with disabled interrupts */
 static void migrate_dead(unsigned int dead_cpu, struct task_struct *p)
 {
struct rq *rq = cpu_rq(dead_cpu);
@@ -5186,10 +5196,11 @@ static void migrate_dead(unsigned int de
 * Drop lock around migration; if someone else moves it,
 * that's OK.  No task can be added to this CPU, so iteration is
 * fine.
+* NOTE: interrupts should be left disabled  --dev@
 */
-   spin_unlock_irq(&rq->lock);
+   spin_unlock(&rq->lock);
move_task_off_dead_cpu(dead_cpu, p);
-   spin_lock_irq(&rq->lock);
+   spin_lock(&rq->lock);
 
put_task_struct(p);
 }


[patch 59/59] sched: fix cond_resched_softirq() offset

2007-02-02 Thread Chris Wright

-stable review patch.  If anyone has any objections, please let us know.
--

From: Ingo Molnar <[EMAIL PROTECTED]>

Remove the __resched_legal() check: it is conceptually broken.  The biggest
problem it had is that it can mask buggy cond_resched() calls.  A
cond_resched() call is only legal if we are not in an atomic context, with
two narrow exceptions:

 - if the system is booting
 - a reacquire_kernel_lock() down() done while PREEMPT_ACTIVE is set

But __resched_legal() hid this and just silently returned whenever
these primitives were called from invalid contexts. (Same goes for
cond_resched_locked() and cond_resched_softirq()).

Furthermore, the __resched_legal(0) call was buggy in that it caused
unnecessarily long softirq latencies via cond_resched_softirq().  (That
function is only called from softirq-off sections, hence the code did
nothing.)

The fix is to resurrect the efficiency of the might_sleep checks and to
only allow the narrow exceptions.

Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]>
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
Signed-off-by: Linus Torvalds <[EMAIL PROTECTED]>
[chrisw: backport to 2.6.19.2]
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
 kernel/sched.c |   16 
 1 file changed, 4 insertions(+), 12 deletions(-)

--- linux-2.6.19.2.orig/kernel/sched.c
+++ linux-2.6.19.2/kernel/sched.c
@@ -4524,15 +4524,6 @@ asmlinkage long sys_sched_yield(void)
return 0;
 }
 
-static inline int __resched_legal(int expected_preempt_count)
-{
-   if (unlikely(preempt_count() != expected_preempt_count))
-   return 0;
-   if (unlikely(system_state != SYSTEM_RUNNING))
-   return 0;
-   return 1;
-}
-
 static void __cond_resched(void)
 {
 #ifdef CONFIG_DEBUG_SPINLOCK_SLEEP
@@ -4552,7 +4543,8 @@ static void __cond_resched(void)
 
 int __sched cond_resched(void)
 {
-   if (need_resched() && __resched_legal(0)) {
+   if (need_resched() && !(preempt_count() & PREEMPT_ACTIVE) &&
+   system_state == SYSTEM_RUNNING) {
__cond_resched();
return 1;
}
@@ -4578,7 +4570,7 @@ int cond_resched_lock(spinlock_t *lock)
ret = 1;
spin_lock(lock);
}
-   if (need_resched() && __resched_legal(1)) {
+   if (need_resched() && system_state == SYSTEM_RUNNING) {
spin_release(&lock->dep_map, 1, _THIS_IP_);
_raw_spin_unlock(lock);
preempt_enable_no_resched();
@@ -4594,7 +4586,7 @@ int __sched cond_resched_softirq(void)
 {
BUG_ON(!in_softirq());
 
-   if (need_resched() && __resched_legal(0)) {
+   if (need_resched() && system_state == SYSTEM_RUNNING) {
raw_local_irq_disable();
_local_bh_enable();
raw_local_irq_enable();


[patch 51/59] AF_PACKET: Fix BPF handling.

2007-02-02 Thread Chris Wright

-stable review patch.  If anyone has any objections, please let us know.
--

From: David S. Miller <[EMAIL PROTECTED]>

This fixes a bug introduced by:

commit fda9ef5d679b07c9d9097aaf6ef7f069d794a8f9
Author: Dmitry Mishin <[EMAIL PROTECTED]>
Date:   Thu Aug 31 15:28:39 2006 -0700

[NET]: Fix sk->sk_filter field access

sk_run_filter() returns either 0 or an unsigned 32-bit
length which says how much of the packet to retain.
If that 32-bit unsigned integer is larger than the packet,
this is fine; we just leave the packet unchanged.

The above commit caused all filter return values which
were negative when interpreted as a signed integer to
indicate a packet drop, which is wrong.
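
The difference between the two interpretations can be reduced to a few lines of plain C. This is a sketch, not the af_packet code: keep_len_signed() and keep_len_unsigned() are hypothetical helpers that return how many bytes of the packet are kept, with 0 meaning "drop":

```c
#include <stdint.h>

/*
 * Behaviour as described for the broken commit: the filter result is
 * looked at as a signed value, so large unsigned results appear
 * negative and the packet is dropped.
 */
static uint32_t keep_len_signed(uint32_t filter_res, uint32_t pkt_len)
{
    int32_t err = (int32_t)filter_res; /* wraps on common compilers */

    if (err <= 0)
        return 0;
    return (uint32_t)err < pkt_len ? (uint32_t)err : pkt_len;
}

/*
 * Intended behaviour: only 0 means drop; any other value is an
 * unsigned length that is clamped to the packet length.
 */
static uint32_t keep_len_unsigned(uint32_t filter_res, uint32_t pkt_len)
{
    if (filter_res == 0)
        return 0;
    return filter_res < pkt_len ? filter_res : pkt_len;
}
```

A filter that returns 0xFFFFFFFF to say "keep everything" loses the packet under the signed reading and keeps the whole packet under the unsigned one; small return values behave the same in both.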

Based upon a report and initial patch by Raivis Bucis.

Signed-off-by: David S. Miller <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
 net/packet/af_packet.c |   30 +++---
 1 file changed, 15 insertions(+), 15 deletions(-)

--- linux-2.6.19.2.orig/net/packet/af_packet.c
+++ linux-2.6.19.2/net/packet/af_packet.c
@@ -427,24 +427,18 @@ out_unlock:
 }
 #endif
 
-static inline int run_filter(struct sk_buff *skb, struct sock *sk,
-   unsigned *snaplen)
+static inline unsigned int run_filter(struct sk_buff *skb, struct sock *sk,
+ unsigned int res)
 {
struct sk_filter *filter;
-   int err = 0;
 
rcu_read_lock_bh();
filter = rcu_dereference(sk->sk_filter);
-   if (filter != NULL) {
-   err = sk_run_filter(skb, filter->insns, filter->len);
-   if (!err)
-   err = -EPERM;
-   else if (*snaplen > err)
-   *snaplen = err;
-   }
+   if (filter != NULL)
+   res = sk_run_filter(skb, filter->insns, filter->len);
rcu_read_unlock_bh();
 
-   return err;
+   return res;
 }
 
 /*
@@ -466,7 +460,7 @@ static int packet_rcv(struct sk_buff *sk
struct packet_sock *po;
u8 * skb_head = skb->data;
int skb_len = skb->len;
-   unsigned snaplen;
+   unsigned int snaplen, res;
 
if (skb->pkt_type == PACKET_LOOPBACK)
goto drop;
@@ -494,8 +488,11 @@ static int packet_rcv(struct sk_buff *sk
 
snaplen = skb->len;
 
-   if (run_filter(skb, sk, &snaplen) < 0)
+   res = run_filter(skb, sk, snaplen);
+   if (!res)
goto drop_n_restore;
+   if (snaplen > res)
+   snaplen = res;
 
if (atomic_read(&sk->sk_rmem_alloc) + skb->truesize >=
(unsigned)sk->sk_rcvbuf)
@@ -567,7 +564,7 @@ static int tpacket_rcv(struct sk_buff *s
struct tpacket_hdr *h;
u8 * skb_head = skb->data;
int skb_len = skb->len;
-   unsigned snaplen;
+   unsigned int snaplen, res;
unsigned long status = TP_STATUS_LOSING|TP_STATUS_USER;
unsigned short macoff, netoff;
struct sk_buff *copy_skb = NULL;
@@ -591,8 +588,11 @@ static int tpacket_rcv(struct sk_buff *s
 
snaplen = skb->len;
 
-   if (run_filter(skb, sk, &snaplen) < 0)
+   res = run_filter(skb, sk, snaplen);
+   if (!res)
goto drop_n_restore;
+   if (snaplen > res)
+   snaplen = res;
 
if (sk->sk_type == SOCK_DGRAM) {
macoff = netoff = TPACKET_ALIGN(TPACKET_HDRLEN) + 16;


[patch 55/59] TCP: skb is unexpectedly freed.

2007-02-02 Thread Chris Wright

-stable review patch.  If anyone has any objections, please let us know.
--

From: Masayuki Nakagawa <[EMAIL PROTECTED]>

I encountered a kernel panic with my test program, which is a very
simple IPv6 client-server program.

The server side sets IPV6_RECVPKTINFO on a listening socket, and the
client side just sends a message to the server.  Then the kernel panic
occurs on the server.  (If you need the test program, please let me
know. I can provide it.)

This problem happens because a skb is forcibly freed in
tcp_rcv_state_process().

When a socket in the listening state (TCP_LISTEN) receives a SYN packet,
tcp_v6_conn_request() is called from tcp_rcv_state_process().  If
tcp_v6_conn_request() returns successfully, the skb is discarded by
__kfree_skb().

However, for a listening socket on which IPV6_RECVPKTINFO has already
been set, the address of the skb is stored in treq->pktopts and the ref
count of the skb is incremented in tcp_v6_conn_request().  Even though
the skb is still in use, it is freed anyway.  Whoever still uses the
freed skb then causes the kernel panic.

I suggest using kfree_skb() instead of __kfree_skb().
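
The distinction between an unconditional free and a reference-counted release can be sketched in a few lines. This is an analogy only, not the skb implementation: struct buf, buf_free_now() and buf_put() are invented names, and a 'freed' flag replaces a real free so the effect can be observed:

```c
struct buf {
    int users;  /* reference count */
    int freed;  /* set instead of really freeing, for illustration */
};

/* Analogue of an unconditional free: ignores the reference count. */
static void buf_free_now(struct buf *b)
{
    b->freed = 1;
}

/* Analogue of a counted release: frees only when the last user is gone. */
static void buf_put(struct buf *b)
{
    if (--b->users == 0)
        buf_free_now(b);
}
```

With two users, a single buf_put() leaves the buffer alive for the remaining holder, whereas buf_free_now() marks it freed while someone still references it.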

Signed-off-by: Masayuki Nakagawa <[EMAIL PROTECTED]>
Signed-off-by: David S. Miller <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>

---
 net/ipv4/tcp_input.c |6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

--- linux-2.6.19.2.orig/net/ipv4/tcp_input.c
+++ linux-2.6.19.2/net/ipv4/tcp_input.c
@@ -4411,9 +4411,11 @@ int tcp_rcv_state_process(struct sock *s
 * But, this leaves one open to an easy denial of
 * service attack, and SYN cookies can't defend
 * against this problem. So, we drop the data
-* in the interest of security over speed.
+* in the interest of security over speed unless
+* it's still in use.
 */
-   goto discard;
+   kfree_skb(skb);
+   return 0;
}
goto discard;
 


[patch 54/59] TCP: Fix sorting of SACK blocks.

2007-02-02 Thread Chris Wright

-stable review patch.  If anyone has any objections, please let us know.
--

From: Baruch Even <[EMAIL PROTECTED]>

The sorting of SACK blocks actually munges them rather than sorting
them, causing the TCP stack to ignore some SACK information and breaking
the assumption of ordered SACK blocks after sorting.

The sort takes the data from a second buffer which isn't moved causing
subsequent data moves to occur from the wrong location. The fix is to
use a temporary buffer as a normal sort does.
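
For reference, a self-contained version of an in-place swap sort over such blocks might look like the following. It is a sketch under simplifying assumptions: the struct and function names are made up, and the sequence numbers are kept in host byte order rather than the wire format used by the real code:

```c
#include <stdint.h>

struct sack_block {
    uint32_t start_seq;
    uint32_t end_seq;
};

/* Wrap-safe "a comes after b" for 32-bit sequence numbers. */
static int seq_after(uint32_t a, uint32_t b)
{
    return (int32_t)(b - a) < 0;
}

/* Bubble sort by start_seq, swapping whole blocks through a temporary. */
static void sort_sack_blocks(struct sack_block *sp, int n)
{
    int i, j;

    for (i = n - 1; i > 0; i--) {
        for (j = 0; j < i; j++) {
            if (seq_after(sp[j].start_seq, sp[j + 1].start_seq)) {
                struct sack_block tmp = sp[j];

                sp[j] = sp[j + 1];
                sp[j + 1] = tmp;
            }
        }
    }
}
```

Because both halves of the swap read from the array being sorted, every block keeps its own start/end pair no matter how many passes move it.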

Signed-off-By: Baruch Even <[EMAIL PROTECTED]>
Signed-off-by: David S. Miller <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
 net/ipv4/tcp_input.c |9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

--- linux-2.6.19.2.orig/net/ipv4/tcp_input.c
+++ linux-2.6.19.2/net/ipv4/tcp_input.c
@@ -1011,10 +1011,11 @@ tcp_sacktag_write_queue(struct sock *sk,
for (j = 0; j < i; j++){
if (after(ntohl(sp[j].start_seq),
  ntohl(sp[j+1].start_seq))){
-   sp[j].start_seq = htonl(tp->recv_sack_cache[j+1].start_seq);
-   sp[j].end_seq = htonl(tp->recv_sack_cache[j+1].end_seq);
-   sp[j+1].start_seq = htonl(tp->recv_sack_cache[j].start_seq);
-   sp[j+1].end_seq = htonl(tp->recv_sack_cache[j].end_seq);
+   struct tcp_sack_block_wire tmp;
+
+   tmp = sp[j];
+   sp[j] = sp[j+1];
+   sp[j+1] = tmp;
}
 
}
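
The in-place swap can be sketched outside the kernel roughly as follows.
This is an illustration under assumptions: the struct and function names
are made up, and the values are kept in host byte order, whereas the real
code converts from wire order with ntohl().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical host-order SACK block. */
struct toy_sack_block {
    uint32_t start_seq;
    uint32_t end_seq;
};

/* Sequence-space comparison: nonzero if a is after b, modulo 2^32. */
static int toy_seq_after(uint32_t a, uint32_t b)
{
    return (int32_t)(b - a) < 0;
}

/* Bubble sort by start_seq, swapping whole blocks through a temporary. */
static void toy_sort_sack_blocks(struct toy_sack_block *sp, int n)
{
    assert(n >= 0);
    for (int i = n - 1; i > 0; i--) {
        for (int j = 0; j < i; j++) {
            if (toy_seq_after(sp[j].start_seq, sp[j + 1].start_seq)) {
                struct toy_sack_block tmp = sp[j];

                sp[j] = sp[j + 1];
                sp[j + 1] = tmp;
            }
        }
    }
}
```

Because both halves of the swap read from the array being sorted, a block
that has already moved is never read from a stale copy.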


[patch 53/59] TCP: rare bad TCP checksum with 2.6.19

2007-02-02 Thread Chris Wright

-stable review patch.  If anyone has any objections, please let us know.
--

From: Jarek Poplawski <[EMAIL PROTECTED]>

The patch "Replace CHECKSUM_HW by CHECKSUM_PARTIAL/CHECKSUM_COMPLETE"
made the copying of the ip_summed field from the collapsed skb
unconditional. This patch reverts that change.

The majority of the substantial work, including heavy testing
and diagnosis, was done by: Michael Tokarev <[EMAIL PROTECTED]>
Possible reasons pointed out by: Herbert Xu and Patrick McHardy.

Signed-off-by: Jarek Poplawski <[EMAIL PROTECTED]>
Acked-by: Herbert Xu <[EMAIL PROTECTED]>
Signed-off-by: David S. Miller <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
 net/ipv4/tcp_output.c |3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- linux-2.6.19.2.orig/net/ipv4/tcp_output.c
+++ linux-2.6.19.2/net/ipv4/tcp_output.c
@@ -1590,7 +1590,8 @@ static void tcp_retrans_try_collapse(str
 
memcpy(skb_put(skb, next_skb_size), next_skb->data, next_skb_size);
 
-   skb->ip_summed = next_skb->ip_summed;
+   if (next_skb->ip_summed == CHECKSUM_PARTIAL)
+   skb->ip_summed = CHECKSUM_PARTIAL;
 
if (skb->ip_summed != CHECKSUM_PARTIAL)
skb->csum = csum_block_add(skb->csum, next_skb->csum, skb_size);


[patch 52/59] AF_PACKET: Check device down state before hard header callbacks.

2007-02-02 Thread Chris Wright

-stable review patch.  If anyone has any objections, please let us know.
--

From: David S. Miller <[EMAIL PROTECTED]>

If the device is down, invoking the device hard header callbacks
is not legal, so check it early.

Based upon a shaper OOPS report from Frederik Deweerdt.

Signed-off-by: David S. Miller <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
 net/packet/af_packet.c |   16 
 1 file changed, 8 insertions(+), 8 deletions(-)

--- linux-2.6.19.2.orig/net/packet/af_packet.c
+++ linux-2.6.19.2/net/packet/af_packet.c
@@ -358,6 +358,10 @@ static int packet_sendmsg_spkt(struct ki
if (dev == NULL)
goto out_unlock;

+   err = -ENETDOWN;
+   if (!(dev->flags & IFF_UP))
+   goto out_unlock;
+
/*
 *  You may not queue a frame bigger than the mtu. This is the lowest level
 *  raw protocol and you must do your own fragmentation at this level.
@@ -406,10 +410,6 @@ static int packet_sendmsg_spkt(struct ki
if (err)
goto out_free;
 
-   err = -ENETDOWN;
-   if (!(dev->flags & IFF_UP))
-   goto out_free;
-
/*
 *  Now send it
 */
@@ -737,6 +737,10 @@ static int packet_sendmsg(struct kiocb *
if (sock->type == SOCK_RAW)
reserve = dev->hard_header_len;
 
+   err = -ENETDOWN;
+   if (!(dev->flags & IFF_UP))
+   goto out_unlock;
+
err = -EMSGSIZE;
if (len > dev->mtu+reserve)
goto out_unlock;
@@ -769,10 +773,6 @@ static int packet_sendmsg(struct kiocb *
skb->dev = dev;
skb->priority = sk->sk_priority;
 
-   err = -ENETDOWN;
-   if (!(dev->flags & IFF_UP))
-   goto out_free;
-
/*
 *  Now send it
 */
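
The ordering this patch establishes can be shown with a small stand-alone
sketch. All toy_* names and the flag value are assumptions made for the
illustration; only the idea (reject a down device before any header
callback runs) is taken from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define TOY_IFF_UP 0x1u  /* hypothetical "interface is up" flag */

struct toy_dev {
    unsigned int flags;
    size_t mtu;
    int header_calls;  /* how often the header callback ran */
};

/* Stand-in for a hard header callback; only legal on an up device. */
static int toy_build_header(struct toy_dev *dev)
{
    assert(dev->flags & TOY_IFF_UP);
    dev->header_calls++;
    return 0;
}

/* Validate device state first, then size, then touch the callback. */
static int toy_send(struct toy_dev *dev, size_t len)
{
    if (dev == NULL)
        return -ENODEV;
    if (!(dev->flags & TOY_IFF_UP))
        return -ENETDOWN;
    if (len > dev->mtu)
        return -EMSGSIZE;
    return toy_build_header(dev);
}
```

A down device returns -ENETDOWN and header_calls stays at zero.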


[patch 15/59] Fix up CIFS for "test_clear_page_dirty()" removal

2007-02-02 Thread Chris Wright

-stable review patch.  If anyone has any objections, please let us know.
--

From: Linus Torvalds <[EMAIL PROTECTED]>

Fix up CIFS for "test_clear_page_dirty()" removal

This also adds the required page "writeback" flag handling, which cifs
hasn't been doing and which the page dirty flag changes made obvious.

Acked-by: Steve French <[EMAIL PROTECTED]>
Acked-by: Dave Kleikamp <[EMAIL PROTECTED]>
Signed-off-by: Linus Torvalds <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
This fixes a long term corruption bug when copying large files to a CIFS 
mount. Thanks Linus!

---
 fs/cifs/file.c |   26 +++---
 1 file changed, 23 insertions(+), 3 deletions(-)

--- linux-2.6.19.2.orig/fs/cifs/file.c
+++ linux-2.6.19.2/fs/cifs/file.c
@@ -1244,14 +1244,21 @@ retry:
wait_on_page_writeback(page);
 
if (PageWriteback(page) ||
-   !test_clear_page_dirty(page)) {
+   !clear_page_dirty_for_io(page)) {
unlock_page(page);
break;
}
 
+   /*
+* This actually clears the dirty bit in the radix tree.
+* See cifs_writepage() for more commentary.
+*/
+   set_page_writeback(page);
+
if (page_offset(page) >= mapping->host->i_size) {
done = 1;
unlock_page(page);
+   end_page_writeback(page);
break;
}
 
@@ -1315,6 +1322,7 @@ retry:
SetPageError(page);
kunmap(page);
unlock_page(page);
+   end_page_writeback(page);
page_cache_release(page);
}
if ((wbc->nr_to_write -= n_iov) <= 0)
@@ -1351,11 +1359,23 @@ static int cifs_writepage(struct page* p
 if (!PageUptodate(page)) {
cFYI(1, ("ppw - page not up to date"));
}
-   
+
+   /*
+* Set the "writeback" flag, and clear "dirty" in the radix tree.
+*
+* A writepage() implementation always needs to do either this,
+* or re-dirty the page with "redirty_page_for_writepage()" in
+* the case of a failure.
+*
+* Just unlocking the page will cause the radix tree tag-bits
+* to fail to update with the state of the page correctly.
+*/
+   set_page_writeback(page);   
rc = cifs_partialpagewrite(page, 0, PAGE_CACHE_SIZE);
SetPageUptodate(page); /* BB add check for error and Clearuptodate? */
unlock_page(page);
-   page_cache_release(page);   
+   end_page_writeback(page);
+   page_cache_release(page);
FreeXid(xid);
return rc;
 }


[patch 47/59] DECNET: Handle a failure in neigh_parms_alloc (take 2)

2007-02-02 Thread Chris Wright

-stable review patch.  If anyone has any objections, please let us know.
--

From: Eric W. Biederman <[EMAIL PROTECTED]>

While enhancing the neighbour code to handle multiple network
namespaces I noticed that decnet is assuming neigh_parms_alloc
will always succeed, which is clearly wrong.  So handle the
failure.

Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]>
Acked-by: Steven Whitehouse <[EMAIL PROTECTED]>
Signed-off-by: David S. Miller <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
 net/decnet/dn_dev.c |   11 +--
 1 file changed, 9 insertions(+), 2 deletions(-)

--- linux-2.6.19.2.orig/net/decnet/dn_dev.c
+++ linux-2.6.19.2/net/decnet/dn_dev.c
@@ -1116,16 +1116,23 @@ struct dn_dev *dn_dev_create(struct net_
init_timer(&dn_db->timer);
 
dn_db->uptime = jiffies;
+
+   dn_db->neigh_parms = neigh_parms_alloc(dev, &dn_neigh_table);
+   if (!dn_db->neigh_parms) {
+   dev->dn_ptr = NULL;
+   kfree(dn_db);
+   return NULL;
+   }
+
if (dn_db->parms.up) {
if (dn_db->parms.up(dev) < 0) {
+   neigh_parms_release(&dn_neigh_table, dn_db->neigh_parms);
dev->dn_ptr = NULL;
kfree(dn_db);
return NULL;
}
}
 
-   dn_db->neigh_parms = neigh_parms_alloc(dev, &dn_neigh_table);
-
dn_dev_sysctl_register(dev, &dn_db->parms);
 
dn_dev_set_timer(dev);
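
The unwind order in this patch follows a common pattern: check each
allocation, and on a later failure release everything acquired so far.
A user-space sketch, with hypothetical toy_* names and a test hook
(toy_fail_parms) that is not part of any kernel API:

```c
#include <assert.h>
#include <stdlib.h>

struct toy_parms { int unused; };
struct toy_dev { struct toy_parms *parms; };

static int toy_fail_parms;   /* test hook: force the parms allocation to fail */
static int toy_live_allocs;  /* outstanding allocations, for leak checking */

static void *toy_alloc(size_t n)
{
    void *p = malloc(n);

    if (p)
        toy_live_allocs++;
    return p;
}

static void toy_free(void *p)
{
    assert(p != NULL && toy_live_allocs > 0);
    toy_live_allocs--;
    free(p);
}

static struct toy_parms *toy_parms_alloc(void)
{
    return toy_fail_parms ? NULL : toy_alloc(sizeof(struct toy_parms));
}

static int toy_up_fails(void) { return -1; }

/* Create a device; on any failure, release what was already acquired. */
static struct toy_dev *toy_dev_create(int (*up)(void))
{
    struct toy_dev *d = toy_alloc(sizeof(*d));

    if (!d)
        return NULL;
    d->parms = toy_parms_alloc();
    if (!d->parms) {
        toy_free(d);
        return NULL;
    }
    if (up && up() < 0) {
        toy_free(d->parms);
        toy_free(d);
        return NULL;
    }
    return d;
}

static void toy_dev_destroy(struct toy_dev *d)
{
    toy_free(d->parms);
    toy_free(d);
}
```

Both failure paths leave toy_live_allocs at zero, which is the property
the patch restores for the real allocation.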


[patch 50/59] IPV4: Fix single-entry /proc/net/fib_trie output.

2007-02-02 Thread Chris Wright

-stable review patch.  If anyone has any objections, please let us know.
--

From: Robert Olsson <[EMAIL PROTECTED]>

When the main table is just a single leaf, it gets printed as belonging to
the local table in /proc/net/fib_trie. A fix is below.

Signed-off-by: Robert Olsson <[EMAIL PROTECTED]>
Acked-by: Eric W. Biederman <[EMAIL PROTECTED]>
Signed-off-by: David S. Miller <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
 net/ipv4/fib_trie.c |   13 +++--
 1 file changed, 7 insertions(+), 6 deletions(-)

--- linux-2.6.19.2.orig/net/ipv4/fib_trie.c
+++ linux-2.6.19.2/net/ipv4/fib_trie.c
@@ -2290,16 +2290,17 @@ static int fib_trie_seq_show(struct seq_
if (v == SEQ_START_TOKEN)
return 0;
 
+   if (!NODE_PARENT(n)) {
+   if (iter->trie == trie_local)
+   seq_puts(seq, "<local>:\n");
+   else
+   seq_puts(seq, "<main>:\n");
+   }
+
if (IS_TNODE(n)) {
struct tnode *tn = (struct tnode *) n;
__be32 prf = htonl(MASK_PFX(tn->key, tn->pos));
 
-   if (!NODE_PARENT(n)) {
-   if (iter->trie == trie_local)
-   seq_puts(seq, "<local>:\n");
-   else
-   seq_puts(seq, "<main>:\n");
-   }
seq_indent(seq, iter->depth-1);
seq_printf(seq, "  +-- %d.%d.%d.%d/%d %d %d %d\n",
   NIPQUAD(prf), tn->pos, tn->bits, tn->full_children, 


[patch 57/59] SUNRPC: Give cloned RPC clients their own rpc_pipefs directory

2007-02-02 Thread Chris Wright

-stable review patch.  If anyone has any objections, please let us know.
--

From: Trond Myklebust <[EMAIL PROTECTED]>

This patch fixes a regression in 2.6.19 in which the use of multiple
krb5 mounts against the same NFS server may result in an Oops on
unmount. The Oops is due to the fact that multiple NFS krb5 clients may
end up inadvertently sharing the same rpc_pipefs upcall pipe. The first
client to 'umount' will unlink that shared pipe, causing an Oops.

The solution is to give each client their own upcall pipe. This fix has
been in mainline since 2.6.20-rc1.

Signed-off-by: Trond Myklebust <[EMAIL PROTECTED]>
[chrisw: backport to 2.6.19.2]
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
 include/linux/sunrpc/clnt.h |1 +
 net/sunrpc/clnt.c   |   26 +++---
 2 files changed, 16 insertions(+), 11 deletions(-)

--- linux-2.6.19.2.orig/include/linux/sunrpc/clnt.h
+++ linux-2.6.19.2/include/linux/sunrpc/clnt.h
@@ -53,6 +53,7 @@ struct rpc_clnt {
struct dentry * cl_dentry;  /* inode */
struct rpc_clnt *   cl_parent;  /* Points to parent of clones */
struct rpc_rtt  cl_rtt_default;
+   struct rpc_program *cl_program;
char cl_inline_name[32];
 };
 
--- linux-2.6.19.2.orig/net/sunrpc/clnt.c
+++ linux-2.6.19.2/net/sunrpc/clnt.c
@@ -141,6 +141,7 @@ static struct rpc_clnt * rpc_new_client(
clnt->cl_vers = version->number;
clnt->cl_stats= program->stats;
clnt->cl_metrics  = rpc_alloc_iostats(clnt);
+   clnt->cl_program  = program;
 
if (!xprt_bound(clnt->cl_xprt))
clnt->cl_autobind = 1;
@@ -252,6 +253,7 @@ struct rpc_clnt *
 rpc_clone_client(struct rpc_clnt *clnt)
 {
struct rpc_clnt *new;
+   int err = -ENOMEM;
 
new = kmalloc(sizeof(*new), GFP_KERNEL);
if (!new)
@@ -259,6 +261,10 @@ rpc_clone_client(struct rpc_clnt *clnt)
memcpy(new, clnt, sizeof(*new));
atomic_set(&new->cl_count, 1);
atomic_set(&new->cl_users, 0);
+   new->cl_metrics = rpc_alloc_iostats(clnt);
+   err = rpc_setup_pipedir(new, clnt->cl_program->pipe_dir_name);
+   if (err != 0)
+   goto out_no_path;
new->cl_parent = clnt;
atomic_inc(&clnt->cl_count);
new->cl_xprt = xprt_get(clnt->cl_xprt);
@@ -266,16 +272,16 @@ rpc_clone_client(struct rpc_clnt *clnt)
new->cl_autobind = 0;
new->cl_oneshot = 0;
new->cl_dead = 0;
-   if (!IS_ERR(new->cl_dentry))
-   dget(new->cl_dentry);
rpc_init_rtt(&new->cl_rtt_default, clnt->cl_xprt->timeout.to_initval);
if (new->cl_auth)
atomic_inc(&new->cl_auth->au_count);
-   new->cl_metrics = rpc_alloc_iostats(clnt);
return new;
+out_no_path:
+   rpc_free_iostats(new->cl_metrics);
+   kfree(new);
 out_no_clnt:
-   printk(KERN_INFO "RPC: out of memory in %s\n", __FUNCTION__);
-   return ERR_PTR(-ENOMEM);
+   dprintk("RPC: %s returned error %d\n", __FUNCTION__, err);
+   return ERR_PTR(err);
 }
 
 /*
@@ -328,16 +334,14 @@ rpc_destroy_client(struct rpc_clnt *clnt
rpcauth_destroy(clnt->cl_auth);
clnt->cl_auth = NULL;
}
-   if (clnt->cl_parent != clnt) {
-   if (!IS_ERR(clnt->cl_dentry))
-   dput(clnt->cl_dentry);
-   rpc_destroy_client(clnt->cl_parent);
-   goto out_free;
-   }
if (!IS_ERR(clnt->cl_dentry)) {
rpc_rmdir(clnt->cl_dentry);
rpc_put_mount();
}
+   if (clnt->cl_parent != clnt) {
+   rpc_destroy_client(clnt->cl_parent);
+   goto out_free;
+   }
if (clnt->cl_server != clnt->cl_inline_name)
kfree(clnt->cl_server);
 out_free:


[patch 44/59] uml: fix signal frame alignment

2007-02-02 Thread Chris Wright

-stable review patch.  If anyone has any objections, please let us know.
--

From: Jeff Dike <[EMAIL PROTECTED]>

Use the same signal frame alignment calculations as the underlying
architecture.  x86_64 appeared to do this, but the "- 8" was really
subtracting 8 * sizeof(struct rt_sigframe) rather than 8 bytes.

UML/i386 might have been OK, but I changed the calculation to match
i386 just to be sure.

Signed-off-by: Jeff Dike <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>
Cc: Adrian Bunk <[EMAIL PROTECTED]>
Cc: Paolo 'Blaisorblade' Giarrusso <[EMAIL PROTECTED]>
Acked-by: Antoine Martin <[EMAIL PROTECTED]>
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
 arch/um/sys-i386/signal.c   |3 ++-
 arch/um/sys-x86_64/signal.c |5 +++--
 2 files changed, 5 insertions(+), 3 deletions(-)

--- linux-2.6.19.2.orig/arch/um/sys-i386/signal.c
+++ linux-2.6.19.2/arch/um/sys-i386/signal.c
@@ -219,7 +219,8 @@ int setup_signal_stack_sc(unsigned long 
unsigned long save_sp = PT_REGS_SP(regs);
int err = 0;
 
-   stack_top &= -8UL;
+   /* This is the same calculation as i386 - ((sp + 4) & 15) == 0 */
+   stack_top = ((stack_top + 4) & -16UL) - 4;
frame = (struct sigframe __user *) stack_top - 1;
if (!access_ok(VERIFY_WRITE, frame, sizeof(*frame)))
return 1;
--- linux-2.6.19.2.orig/arch/um/sys-x86_64/signal.c
+++ linux-2.6.19.2/arch/um/sys-x86_64/signal.c
@@ -191,8 +191,9 @@ int setup_signal_stack_si(unsigned long 
struct task_struct *me = current;
 
frame = (struct rt_sigframe __user *)
-   round_down(stack_top - sizeof(struct rt_sigframe), 16) - 8;
-frame = (struct rt_sigframe __user *) ((unsigned long) frame - 128);
+   round_down(stack_top - sizeof(struct rt_sigframe), 16);
+   /* Subtract 128 for a red zone and 8 for proper alignment */
+frame = (struct rt_sigframe __user *) ((unsigned long) frame - 128 - 8);
 
if (!access_ok(VERIFY_WRITE, fp, sizeof(struct _fpstate)))
goto out;
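
The arithmetic in this patch is easy to check in user space. The sketch
below uses a hypothetical 64-byte toy_frame (the real struct rt_sigframe
is a different size) to show how "- 8" on a struct pointer is scaled by
the struct size, and then evaluates the two alignment rules as plain
integer arithmetic.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical 64-byte frame, used only to show pointer scaling. */
struct toy_frame { char pad[64]; };

/* Bytes moved when "- 8" is applied to a frame pointer, not an integer. */
static ptrdiff_t toy_bytes_moved_by_pointer_minus_8(void)
{
    static struct toy_frame frames[9];
    struct toy_frame *top = &frames[8];
    struct toy_frame *moved = top - 8;  /* scaled by sizeof(struct toy_frame) */

    return (char *)top - (char *)moved;
}

/* i386 rule from the patch: afterwards ((sp + 4) & 15) == 0. */
static unsigned long toy_align_i386(unsigned long stack_top)
{
    assert(stack_top >= 16);
    return ((stack_top + 4) & -16UL) - 4;
}

/* x86_64 rule from the patch: round down, then subtract 128 + 8 bytes. */
static unsigned long toy_frame_x86_64(unsigned long stack_top,
                                      unsigned long frame_size)
{
    assert(stack_top >= frame_size + 128 + 8 + 16);
    return ((stack_top - frame_size) & ~15UL) - 128 - 8;
}
```

With the toy struct, the pointer form moves 8 * 64 = 512 bytes instead of
8, which is the class of mistake the commit message describes.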


[patch 46/59] jmicron: 40/80pin primary detection

2007-02-02 Thread Chris Wright

-stable review patch.  If anyone has any objections, please let us know.
--

From: [EMAIL PROTECTED] <[EMAIL PROTECTED]>

The jmicron module detects all JMB36x chips as JMB361, and PATA0 has the
wrong pin status for XICBLID.

Cc: Jeff Garzik <[EMAIL PROTECTED]>
Cc: Alan Cox <[EMAIL PROTECTED]>
Cc: Bartlomiej Zolnierkiewicz <[EMAIL PROTECTED]>
Cc: Sergei Shtylyov <[EMAIL PROTECTED]>
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
Signed-off-by: Linus Torvalds <[EMAIL PROTECTED]>

[EMAIL PROTECTED]: I folded in the warning fix (a51545ab25) because
otherwise it makes the tester think the patch caused the warning
that was already there.

Cc: Dave Jones <[EMAIL PROTECTED]>
Signed-off-by: Chuck Ebbert <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
 drivers/ide/pci/jmicron.c |   17 +
 1 file changed, 9 insertions(+), 8 deletions(-)

--- linux-2.6.19.2.orig/drivers/ide/pci/jmicron.c
+++ linux-2.6.19.2/drivers/ide/pci/jmicron.c
@@ -86,15 +86,16 @@ static int __devinit ata66_jmicron(ide_h
{
case PORT_PATA0:
if (control & (1 << 3)) /* 40/80 pin primary */
-   return 1;
-   return 0;
+   return 0;
+   return 1;
case PORT_PATA1:
if (control5 & (1 << 19))   /* 40/80 pin secondary */
return 0;
return 1;
case PORT_SATA:
-   return 1;
+   break;
}
+   return 1; /* Avoid bogus "control reaches end of non-void function" */
 }
 
 static void jmicron_tuneproc (ide_drive_t *drive, byte mode_wanted)
@@ -240,11 +241,11 @@ static int __devinit jmicron_init_one(st
 }
 
 static struct pci_device_id jmicron_pci_tbl[] = {
-   { PCI_DEVICE(PCI_VENDOR_ID_JMICRON, PCI_DEVICE_ID_JMICRON_JMB361), 0},
-   { PCI_DEVICE(PCI_VENDOR_ID_JMICRON, PCI_DEVICE_ID_JMICRON_JMB363), 1},
-   { PCI_DEVICE(PCI_VENDOR_ID_JMICRON, PCI_DEVICE_ID_JMICRON_JMB365), 2},
-   { PCI_DEVICE(PCI_VENDOR_ID_JMICRON, PCI_DEVICE_ID_JMICRON_JMB366), 3},
-   { PCI_DEVICE(PCI_VENDOR_ID_JMICRON, PCI_DEVICE_ID_JMICRON_JMB368), 4},
+   { PCI_VENDOR_ID_JMICRON, PCI_DEVICE_ID_JMICRON_JMB361, PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0},
+   { PCI_VENDOR_ID_JMICRON, PCI_DEVICE_ID_JMICRON_JMB363, PCI_ANY_ID, PCI_ANY_ID, 0, 0, 1},
+   { PCI_VENDOR_ID_JMICRON, PCI_DEVICE_ID_JMICRON_JMB365, PCI_ANY_ID, PCI_ANY_ID, 0, 0, 2},
+   { PCI_VENDOR_ID_JMICRON, PCI_DEVICE_ID_JMICRON_JMB366, PCI_ANY_ID, PCI_ANY_ID, 0, 0, 3},
+   { PCI_VENDOR_ID_JMICRON, PCI_DEVICE_ID_JMICRON_JMB368, PCI_ANY_ID, PCI_ANY_ID, 0, 0, 4},
{ 0, },
 };
 


[patch 49/59] IPV4: Fix the fib trie iterator to work with a single entry routing tables

2007-02-02 Thread Chris Wright

-stable review patch.  If anyone has any objections, please let us know.
--

From: Eric W. Biederman <[EMAIL PROTECTED]>

In a kernel with trie routing enabled I had a simple routing setup
with only a single route to the outside world and no default
route. "ip route table list main" showed me the route just fine but
/proc/net/route was an empty file.  What was going on?

Thinking it was a bug in something I did, I looked deeper.  Eventually
I set up a second route and everything looked correct, huh?  Finally I
realized that the iterator pair fib_trie_get_first, fib_trie_get_next
just could not handle a routing table with a single entry.

So to save myself and others further confusion, here is a simple fix for
the fib proc iterator so it works even when there is only a single route
in a routing table.

Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]>
Signed-off-by: Robert Olsson <[EMAIL PROTECTED]>
Signed-off-by: David S. Miller <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
 net/ipv4/fib_trie.c |   21 -
 1 file changed, 16 insertions(+), 5 deletions(-)

--- linux-2.6.19.2.orig/net/ipv4/fib_trie.c
+++ linux-2.6.19.2/net/ipv4/fib_trie.c
@@ -1989,6 +1989,10 @@ static struct node *fib_trie_get_next(st
unsigned cindex = iter->index;
struct tnode *p;
 
+   /* A single entry routing table */
+   if (!tn)
+   return NULL;
+
pr_debug("get_next iter={node=%p index=%d depth=%d}\n",
 iter->tnode, iter->index, iter->depth);
 rescan:
@@ -2037,11 +2041,18 @@ static struct node *fib_trie_get_first(s
if(!iter)
return NULL;
 
-   if (n && IS_TNODE(n)) {
-   iter->tnode = (struct tnode *) n;
-   iter->trie = t;
-   iter->index = 0;
-   iter->depth = 1;
+   if (n) {
+   if (IS_TNODE(n)) {
+   iter->tnode = (struct tnode *) n;
+   iter->trie = t;
+   iter->index = 0;
+   iter->depth = 1;
+   } else {
+   iter->tnode = NULL;
+   iter->trie  = t;
+   iter->index = 0;
+   iter->depth = 0;
+   }
return n;
}
return NULL;
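
The single-entry case can be modelled with a much smaller iterator. This
sketch is only one level deep and uses made-up toy_* types, so it is an
illustration of the root-is-a-leaf handling, not of the real trie walk.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node: either an internal node with children, or a leaf. */
struct toy_node {
    int is_tnode;
    int nchildren;
    struct toy_node **children;
};

struct toy_iter {
    struct toy_node *tnode;  /* NULL when the root itself is a leaf */
    int index;
};

static struct toy_node *toy_get_first(struct toy_iter *it, struct toy_node *root)
{
    assert(it != NULL);
    if (!root)
        return NULL;
    it->tnode = root->is_tnode ? root : NULL;
    it->index = 0;
    return root;  /* a lone leaf is still a valid first element */
}

static struct toy_node *toy_get_next(struct toy_iter *it)
{
    /* A single entry table: nothing follows the root leaf. */
    if (!it->tnode)
        return NULL;
    while (it->index < it->tnode->nchildren) {
        struct toy_node *n = it->tnode->children[it->index++];

        if (n)
            return n;
    }
    return NULL;
}

/* Count every node the iterator pair visits. */
static int toy_count_nodes(struct toy_node *root)
{
    struct toy_iter it;
    int count = 0;

    for (struct toy_node *n = toy_get_first(&it, root); n; n = toy_get_next(&it))
        count++;
    return count;
}
```

A table whose root is a leaf yields exactly one node, rather than none as
with the old get_first behaviour.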


[patch 56/59] NETFILTER: xt_connbytes: fix division by zero

2007-02-02 Thread Chris Wright

-stable review patch.  If anyone has any objections, please let us know.
--

From: Patrick McHardy <[EMAIL PROTECTED]>

When the packet counter of a connection is zero a division by zero
occurs in div64_64(). Fix that by using zero as average value, which
is correct as long as the packet counter didn't overflow, at which
point we have lost anyway.

Additionally we're probably going to go back to 64 bit counters
in 2.6.21.

Based on patch from Jonas Berlin <[EMAIL PROTECTED]>,
with suggestions from KOVACS Krisztian <[EMAIL PROTECTED]>.

Signed-off-by: Patrick McHardy <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
 net/netfilter/xt_connbytes.c |   29 -
 1 file changed, 12 insertions(+), 17 deletions(-)

--- linux-2.6.19.2.orig/net/netfilter/xt_connbytes.c
+++ linux-2.6.19.2/net/netfilter/xt_connbytes.c
@@ -52,6 +52,8 @@ match(const struct sk_buff *skb,
 {
const struct xt_connbytes_info *sinfo = matchinfo;
u_int64_t what = 0; /* initialize to make gcc happy */
+   u_int64_t bytes = 0;
+   u_int64_t pkts = 0;
const struct ip_conntrack_counter *counters;
 
if (!(counters = nf_ct_get_counters(skb)))
@@ -89,29 +91,22 @@ match(const struct sk_buff *skb,
case XT_CONNBYTES_AVGPKT:
switch (sinfo->direction) {
case XT_CONNBYTES_DIR_ORIGINAL:
-   what = div64_64(counters[IP_CT_DIR_ORIGINAL].bytes,
-   counters[IP_CT_DIR_ORIGINAL].packets);
+   bytes = counters[IP_CT_DIR_ORIGINAL].bytes;
+   pkts  = counters[IP_CT_DIR_ORIGINAL].packets;
break;
case XT_CONNBYTES_DIR_REPLY:
-   what = div64_64(counters[IP_CT_DIR_REPLY].bytes,
-   counters[IP_CT_DIR_REPLY].packets);
+   bytes = counters[IP_CT_DIR_REPLY].bytes;
+   pkts  = counters[IP_CT_DIR_REPLY].packets;
break;
case XT_CONNBYTES_DIR_BOTH:
-   {
-   u_int64_t bytes;
-   u_int64_t pkts;
-   bytes = counters[IP_CT_DIR_ORIGINAL].bytes +
-   counters[IP_CT_DIR_REPLY].bytes;
-   pkts = counters[IP_CT_DIR_ORIGINAL].packets+
-   counters[IP_CT_DIR_REPLY].packets;
-
-   /* FIXME_THEORETICAL: what to do if sum
-* overflows ? */
-
-   what = div64_64(bytes, pkts);
-   }
+   bytes = counters[IP_CT_DIR_ORIGINAL].bytes +
+   counters[IP_CT_DIR_REPLY].bytes;
+   pkts  = counters[IP_CT_DIR_ORIGINAL].packets +
+   counters[IP_CT_DIR_REPLY].packets;
break;
}
+   if (pkts != 0)
+   what = div64_64(bytes, pkts);
break;
}
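
The restructured logic amounts to collecting bytes and packets first and
dividing once, guarded by a zero check. A stand-alone sketch with
hypothetical toy_* names, using plain 64-bit division in place of
div64_64():

```c
#include <assert.h>
#include <stdint.h>

struct toy_counter {
    uint64_t bytes;
    uint64_t packets;
};

enum toy_dir { TOY_DIR_ORIGINAL, TOY_DIR_REPLY, TOY_DIR_BOTH };

/* Average packet size for a direction; zero when no packets were seen. */
static uint64_t toy_avg_pkt(const struct toy_counter c[2], enum toy_dir dir)
{
    uint64_t bytes = 0, pkts = 0, avg = 0;

    switch (dir) {
    case TOY_DIR_ORIGINAL:
        bytes = c[0].bytes;
        pkts = c[0].packets;
        break;
    case TOY_DIR_REPLY:
        bytes = c[1].bytes;
        pkts = c[1].packets;
        break;
    case TOY_DIR_BOTH:
        bytes = c[0].bytes + c[1].bytes;
        pkts = c[0].packets + c[1].packets;
        break;
    }
    if (pkts != 0)
        avg = bytes / pkts;
    assert(avg <= bytes);
    return avg;
}
```

A direction that has seen no packets reports an average of zero instead of
dividing by zero.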
 


[patch 29/59] elevator: move clearing of unplug flag earlier

2007-02-02 Thread Chris Wright

-stable review patch.  If anyone has any objections, please let us know.
--

From: Linas Vepstas <[EMAIL PROTECTED]>

A flag was recently added to the elevator code to avoid
performing an unplug when requests are being re-queued.
The goal of this flag was to avoid a deep recursion that
can occur when re-queueing requests after a SCSI device/host
reset.  See http://lkml.org/lkml/2006/5/17/254

However, that fix added the flag near the bottom of a case
statement, where an earlier break (in an if statement) could
transport one out of the case, without setting the flag.
This patch sets the flag earlier in the case statement.

I re-discovered the deep recursion recently during testing;
I was told that it was a known problem, and the fix to it was
in the kernel I was testing. Indeed it was ... but it didn't
fix the bug. With the patch below, I no longer see the bug.

Signed-off-by: Linas Vepstas <[EMAIL PROTECTED]>
Signed-off-by: Jens Axboe <[EMAIL PROTECTED]>
Signed-off-by: Linus Torvalds <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
 block/elevator.c |   11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

--- linux-2.6.19.2.orig/block/elevator.c
+++ linux-2.6.19.2/block/elevator.c
@@ -572,6 +572,12 @@ void elv_insert(request_queue_t *q, stru
 */
rq->cmd_flags |= REQ_SOFTBARRIER;
 
+   /*
+* Most requeues happen because of a busy condition,
+* don't force unplug of the queue for that case.
+*/
+   unplug_it = 0;
+
if (q->ordseq == 0) {
list_add(&rq->queuelist, &q->queue_head);
break;
@@ -586,11 +592,6 @@ void elv_insert(request_queue_t *q, stru
}
 
list_add_tail(&rq->queuelist, pos);
-   /*
-* most requeues happen because of a busy condition, don't
-* force unplug of the queue for that case.
-*/
-   unplug_it = 0;
break;
 
default:
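
The control-flow problem is easy to reproduce in isolation. In this
sketch (hypothetical toy_* names, with the queue handling reduced to a
single ordseq test) the late variant clears the flag after an early
break, and the early variant clears it first:

```c
#include <assert.h>

enum toy_where { TOY_INSERT_BACK, TOY_INSERT_REQUEUE };

/* Flag cleared near the bottom of the case: the early break skips it. */
static int toy_unplug_late_clear(enum toy_where where, int ordseq)
{
    int unplug_it = 1;

    assert(ordseq >= 0);
    switch (where) {
    case TOY_INSERT_REQUEUE:
        if (ordseq == 0)
            break;      /* leaves unplug_it set */
        unplug_it = 0;
        break;
    default:
        break;
    }
    return unplug_it;
}

/* Flag cleared at the top of the case: every exit path sees it cleared. */
static int toy_unplug_early_clear(enum toy_where where, int ordseq)
{
    int unplug_it = 1;

    assert(ordseq >= 0);
    switch (where) {
    case TOY_INSERT_REQUEUE:
        unplug_it = 0;
        if (ordseq == 0)
            break;
        break;
    default:
        break;
    }
    return unplug_it;
}
```

Only the requeue path with ordseq == 0 differs between the two variants,
which matches the case the commit message calls out.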


[patch 39/59] md: make repair actually work for raid1.

2007-02-02 Thread Chris Wright

-stable review patch.  If anyone has any objections, please let us know.
--

From: NeilBrown <[EMAIL PROTECTED]>

When 'repair' finds a block that differs between the various
parts of the mirror, it is meant to write a chosen good version
to the others.  However, it currently writes out the original data
to each. The memcpy to make all the data the same is missing.

Signed-off-by: Neil Brown <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
 drivers/md/raid1.c |5 +
 1 file changed, 5 insertions(+)

--- linux-2.6.19.2.orig/drivers/md/raid1.c
+++ linux-2.6.19.2/drivers/md/raid1.c
@@ -1266,6 +1266,11 @@ static void sync_request_write(mddev_t *
sbio->bi_sector = r1_bio->sector +
conf->mirrors[i].rdev->data_offset;
sbio->bi_bdev = conf->mirrors[i].rdev->bdev;
+   for (j = 0; j < vcnt ; j++)
+   memcpy(page_address(sbio->bi_io_vec[j].bv_page),
+  page_address(pbio->bi_io_vec[j].bv_page),
+  PAGE_SIZE);
+
}
}
}
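
The missing step is the copy from the chosen good buffer into each buffer
that is about to be written. A user-space model, with a made-up page size
and mirror count and hypothetical toy_* names:

```c
#include <assert.h>
#include <string.h>

#define TOY_PAGE_SIZE 16  /* hypothetical; far smaller than a real page */
#define TOY_MIRRORS   3

/*
 * Copy the primary's data over every mirror that differs from it.
 * Returns the number of mirrors that had to be rewritten.
 */
static int toy_repair(unsigned char mirrors[TOY_MIRRORS][TOY_PAGE_SIZE],
                      int primary)
{
    int rewritten = 0;

    assert(primary >= 0 && primary < TOY_MIRRORS);
    for (int i = 0; i < TOY_MIRRORS; i++) {
        if (i == primary)
            continue;
        if (memcmp(mirrors[primary], mirrors[i], TOY_PAGE_SIZE) != 0) {
            /* Without this copy the mirror would keep its own stale data. */
            memcpy(mirrors[i], mirrors[primary], TOY_PAGE_SIZE);
            rewritten++;
        }
    }
    return rewritten;
}
```

After one pass all mirrors match the primary, and a second pass finds
nothing left to rewrite.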


[patch 42/59] libata: use kmap_atomic(KM_IRQ0) in SCSI simulator

2007-02-02 Thread Chris Wright

-stable review patch.  If anyone has any objections, please let us know.
--

From: Jeff Garzik <[EMAIL PROTECTED]>

We are inside spin_lock_irqsave().  quoth akpm's debug facility:

 [  231.948000] SCSI device sda: 195371568 512-byte hdwr sectors (100030 MB)
 [  232.232000] ata1.00: configured for UDMA/33
 [  232.404000] WARNING (1) at arch/i386/mm/highmem.c:47 kmap_atomic()
 [  232.404000]  [] kmap_atomic+0xa9/0x1ab
 [  232.404000]  [] ata_scsi_rbuf_get+0x1c/0x30
 [  232.404000]  [] ata_scsi_rbuf_fill+0x1a/0x87
 [  232.404000]  [] ata_scsiop_mode_sense+0x0/0x309
 [  232.404000]  [] end_bio_bh_io_sync+0x0/0x37
 [  232.404000]  [] scsi_done+0x0/0x16
 [  232.404000]  [] scsi_done+0x0/0x16
 [  232.404000]  [] ata_scsi_simulate+0xb0/0x13f
[...]

Signed-off-by: Jeff Garzik <[EMAIL PROTECTED]>
Cc: Andrew Morton <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
 drivers/ata/libata-scsi.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- linux-2.6.19.2.orig/drivers/ata/libata-scsi.c
+++ linux-2.6.19.2/drivers/ata/libata-scsi.c
@@ -1648,7 +1648,7 @@ static unsigned int ata_scsi_rbuf_get(st
struct scatterlist *sg;
 
sg = (struct scatterlist *) cmd->request_buffer;
-   buf = kmap_atomic(sg->page, KM_USER0) + sg->offset;
+   buf = kmap_atomic(sg->page, KM_IRQ0) + sg->offset;
buflen = sg->length;
} else {
buf = cmd->request_buffer;
@@ -1676,7 +1676,7 @@ static inline void ata_scsi_rbuf_put(str
struct scatterlist *sg;
 
sg = (struct scatterlist *) cmd->request_buffer;
-   kunmap_atomic(buf - sg->offset, KM_USER0);
+   kunmap_atomic(buf - sg->offset, KM_IRQ0);
}
 }
 


[patch 45/59] bonding: ARP monitoring broken on x86_64

2007-02-02 Thread Chris Wright

-stable review patch.  If anyone has any objections, please let us know.
--

From: Andy Gospodarek <[EMAIL PROTECTED]>

While working with the latest bonding code I noticed a nasty problem that
will prevent arp monitoring from always functioning correctly on x86_64
systems.  Comparing ints to longs and expecting reliable results on x86_64
is a bad idea.  With this patch, arp monitoring works correctly again.

Signed-off-by: Andy Gospodarek <[EMAIL PROTECTED]>
Cc: "David S. Miller" <[EMAIL PROTECTED]>
Cc: Stephen Hemminger <[EMAIL PROTECTED]>
Cc: Jeff Garzik <[EMAIL PROTECTED]>
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
Signed-off-by: Jeff Garzik <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
 drivers/net/bonding/bonding.h |7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

--- linux-2.6.19.2.orig/drivers/net/bonding/bonding.h
+++ linux-2.6.19.2/drivers/net/bonding/bonding.h
@@ -151,8 +151,8 @@ struct slave {
struct slave *next;
struct slave *prev;
int delay;
-   u32 jiffies;
-   u32 last_arp_rx;
+   unsigned long jiffies;
+   unsigned long last_arp_rx;
s8 link;    /* one of BOND_LINK_ */
s8 state;   /* one of BOND_STATE_ */
u32 original_flags;
@@ -242,7 +242,8 @@ extern inline int slave_do_arp_validate(
return bond->params.arp_validate & (1 << slave->state);
 }
 
-extern inline u32 slave_last_rx(struct bonding *bond, struct slave *slave)
+extern inline unsigned long slave_last_rx(struct bonding *bond,
+   struct slave *slave)
 {
if (slave_do_arp_validate(bond, slave))
return slave->last_arp_rx;
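
The effect of storing a 64-bit tick count in a 32-bit field can be shown
with fixed-width types. This is a sketch under assumptions: uint64_t
stands in for unsigned long on x86_64, and the toy_* helpers are
illustrative rather than taken from the bonding driver.

```c
#include <assert.h>
#include <stdint.h>

/* Elapsed ticks when the timestamp was kept in a 32-bit field. */
static uint64_t toy_elapsed_truncated(uint64_t now, uint64_t stamp)
{
    uint32_t stored = (uint32_t)stamp;  /* high 32 bits are lost here */

    return now - stored;
}

/* Elapsed ticks when the timestamp keeps its full width. */
static uint64_t toy_elapsed_full(uint64_t now, uint64_t stamp)
{
    assert(now >= stamp);
    return now - stamp;
}
```

While the counter still fits in 32 bits both helpers agree; once it has
grown past that, the truncated version reports an enormous interval for
what is really a handful of ticks.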


[patch 32/59] SPARC64: Set g4/g5 properly in sun4v dtlb-prot handling.

2007-02-02 Thread Chris Wright

-stable review patch.  If anyone has any objections, please let us know.
--

From: David S. Miller <[EMAIL PROTECTED]>

Mirror the logic in the sun4u handler: we have to update
both registers even when we branch out to window fault
fixup handling.

The way it works is that if we are in etrap processing a
fault already, g4/g5 holds the original fault information.
If we take a window spill fault while doing etrap, then
we put the window spill fault info into g4/g5 and this is
what the top-level fault handler ends up processing first.

Then we retry the originally faulting instruction, and
process the original fault at that time.

This is all necessary because of how constrained the trap
registers are in these code paths.  These cases trigger
very rarely, so even if there is some performance implication
it doesn't happen very often.  In fact the rarity is why
it took so long to trigger and find this particular bug.

Signed-off-by: David S. Miller <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---

---
 arch/sparc64/kernel/sun4v_tlb_miss.S |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- linux-2.6.19.2.orig/arch/sparc64/kernel/sun4v_tlb_miss.S
+++ linux-2.6.19.2/arch/sparc64/kernel/sun4v_tlb_miss.S
@@ -142,9 +142,9 @@ sun4v_dtlb_prot:
rdpr %tl, %g1
cmp %g1, 1
bgu,pn  %xcc, winfix_trampoline
-nop
-   ba,pt   %xcc, sparc64_realfault_common
 movFAULT_CODE_DTLB | FAULT_CODE_WRITE, %g4
+   ba,pt   %xcc, sparc64_realfault_common
+nop
 
/* Called from trap table:
 * %g4: vaddr

[patch 43/59] Dont allow the stack to grow into hugetlb reserved regions

2007-02-02 Thread Chris Wright

-stable review patch.  If anyone has any objections, please let us know.
--

From: Adam Litke <[EMAIL PROTECTED]>

When expanding the stack, we don't currently check if the VMA will cross
into an area of the address space that is reserved for hugetlb pages. 
Subsequent faults on the expanded portion of such a VMA will confuse the
low-level MMU code, resulting in an OOPS.  Check for this.
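
The check described above can be sketched in userspace C. This is only a model under stated assumptions: the reserved window (HUGE_START, HUGE_END) and the helpers region_is_reserved() and stack_may_grow() are hypothetical names, and the real test, is_hugepage_only_range(), is architecture-specific.

```c
#include <stdbool.h>

/* Hypothetical reserved window standing in for a hugetlb-only region. */
#define HUGE_START 0x10000000UL
#define HUGE_END   0x20000000UL

/* True if [start, start + len) overlaps the reserved window. */
static bool region_is_reserved(unsigned long start, unsigned long len)
{
	return start < HUGE_END && start + len > HUGE_START;
}

/*
 * Model of the new test: a stack that grows down will start at
 * vm_end - size after expansion, one that grows up keeps vm_start.
 * Returns 0 if growth is allowed, -1 otherwise (the patch uses -EFAULT).
 */
static int stack_may_grow(unsigned long vm_start, unsigned long vm_end,
			  unsigned long size, bool grows_up)
{
	unsigned long new_start = grows_up ? vm_start : vm_end - size;

	return region_is_reserved(new_start, size) ? -1 : 0;
}
```

A downward-growing stack sitting just above the window is fine while it stays above HUGE_END, and is refused as soon as the requested size would push its start below that boundary.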

Signed-off-by: Adam Litke <[EMAIL PROTECTED]>
Cc: David Gibson <[EMAIL PROTECTED]>
Cc: William Lee Irwin III <[EMAIL PROTECTED]>
Cc: Hugh Dickins <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---

 mm/mmap.c |7 +++
 1 file changed, 7 insertions(+)

--- linux-2.6.19.2.orig/mm/mmap.c
+++ linux-2.6.19.2/mm/mmap.c
@@ -1477,6 +1477,7 @@ static int acct_stack_growth(struct vm_a
 {
struct mm_struct *mm = vma->vm_mm;
struct rlimit *rlim = current->signal->rlim;
+   unsigned long new_start;
 
/* address space limit tests */
if (!may_expand_vm(mm, grow))
@@ -1496,6 +1497,12 @@ static int acct_stack_growth(struct vm_a
return -ENOMEM;
}
 
+   /* Check to ensure the stack will not grow into a hugetlb-only region */
+   new_start = (vma->vm_flags & VM_GROWSUP) ? vma->vm_start :
+   vma->vm_end - size;
+   if (is_hugepage_only_range(vma->vm_mm, new_start, size))
+   return -EFAULT;
+
/*
 * Overcommit..  This must be the final test, as it will
 * update security statistics.

[patch 40/59] md: fix a few problems with the interface (sysfs and ioctl) to md.

2007-02-02 Thread Chris Wright

-stable review patch.  If anyone has any objections, please let us know.
--

From: NeilBrown <[EMAIL PROTECTED]>

While developing more functionality in mdadm I found some bugs in md...

- When we remove a device from an inactive array (write 'remove' to
  the 'state' sysfs file - see 'state_store') we should not
  update the superblock information - as we may not have
  read and processed it all properly yet.

- initialise all raid_disk entries to '-1' else the 'slot' sysfs file
  will claim '0' for all devices in an array before the array is
  started.

- allow '\n' not to be present at the end of words written to
  sysfs files
- when we use SET_ARRAY_INFO to set the md metadata version,
  set the flag to say that there is persistent metadata.
- allow GET_BITMAP_FILE to be called on an array that hasn't
  been started yet.
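
The item about the optional trailing '\n' can be sketched in userspace C. This is an illustration only: it uses the standard strtoul() in place of the kernel's simple_strtoul(), and parse_sysfs_word() is a hypothetical helper, not a function in md.c.

```c
#include <stdlib.h>

/*
 * Parse an unsigned decimal word that may or may not end in a newline.
 * Returns the value, or -1 if the input is malformed.
 */
static long parse_sysfs_word(const char *buf)
{
	char *e;
	unsigned long n = strtoul(buf, &e, 10);

	/* Reject "no digits" and anything other than end-of-string or '\n'. */
	if (e == buf || (*e && *e != '\n'))
		return -1;
	return (long)n;
}
```

With the older test (`*e != '\n'` alone) the string "90" would have been rejected because it ends in a NUL rather than a newline; the relaxed test accepts both "90" and "90\n".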

Signed-off-by: Neil Brown <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
 drivers/md/md.c |   13 -
 1 file changed, 8 insertions(+), 5 deletions(-)

--- linux-2.6.19.2.orig/drivers/md/md.c
+++ linux-2.6.19.2/drivers/md/md.c
@@ -1792,7 +1792,8 @@ state_store(mdk_rdev_t *rdev, const char
else {
mddev_t *mddev = rdev->mddev;
kick_rdev_from_array(rdev);
-   md_update_sb(mddev, 1);
+   if (mddev->pers)
+   md_update_sb(mddev, 1);
md_new_event(mddev);
err = 0;
}
@@ -2004,6 +2005,7 @@ static mdk_rdev_t *md_import_device(dev_
 
rdev->desc_nr = -1;
rdev->saved_raid_disk = -1;
+   rdev->raid_disk = -1;
rdev->flags = 0;
rdev->data_offset = 0;
rdev->sb_events = 0;
@@ -2233,7 +2235,6 @@ static int update_raid_disks(mddev_t *md
 static ssize_t
 raid_disks_store(mddev_t *mddev, const char *buf, size_t len)
 {
-   /* can only set raid_disks if array is not yet active */
char *e;
int rv = 0;
unsigned long n = simple_strtoul(buf, &e, 10);
@@ -2631,7 +2632,7 @@ metadata_store(mddev_t *mddev, const cha
return -EINVAL;
buf = e+1;
minor = simple_strtoul(buf, &e, 10);
-   if (e==buf || *e != '\n')
+   if (e==buf || (*e && *e != '\n') )
return -EINVAL;
if (major >= sizeof(super_types)/sizeof(super_types[0]) ||
super_types[major].name == NULL)
@@ -3978,6 +3979,7 @@ static int set_array_info(mddev_t * mdde
mddev->major_version = info->major_version;
mddev->minor_version = info->minor_version;
mddev->patch_version = info->patch_version;
+   mddev->persistent = ! info->not_persistent;
return 0;
}
mddev->major_version = MD_MAJOR_VERSION;
@@ -4302,9 +4304,10 @@ static int md_ioctl(struct inode *inode,
 * Commands querying/configuring an existing array:
 */
/* if we are not initialised yet, only ADD_NEW_DISK, STOP_ARRAY,
-* RUN_ARRAY, and SET_BITMAP_FILE are allowed */
+* RUN_ARRAY, and GET_ and SET_BITMAP_FILE are allowed */
if (!mddev->raid_disks && cmd != ADD_NEW_DISK && cmd != STOP_ARRAY
-   && cmd != RUN_ARRAY && cmd != SET_BITMAP_FILE) {
+   && cmd != RUN_ARRAY && cmd != SET_BITMAP_FILE
+   && cmd != GET_BITMAP_FILE) {
err = -ENODEV;
goto abort_unlock;
}

[patch 37/59] knfsd: fix up some bit-rot in exp_export

2007-02-02 Thread Chris Wright

-stable review patch.  If anyone has any objections, please let us know.
--

From: NeilBrown <[EMAIL PROTECTED]>

The nfsservctl system call isn't used by recent nfs-utils releases for
exporting filesystems, and consequently the code that it uses -
exp_export - has suffered some bitrot.

In particular:
  - some newly added fields in 'struct svc_export' are not being initialised
properly.
  - the return value is now always -ENOMEM ...

This patch fixes both these problems.
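
The two problems listed above follow a common error-path pattern, sketched here in userspace C. Everything in the sketch is an assumption made for illustration: struct export_sketch and export_sketch_add() are hypothetical, malloc()/free() stand in for kstrdup()/kfree(), and fail_index simulates a failure to create an index.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the fields of struct svc_export used here. */
struct export_sketch {
	char *path;
	int flags;
};

/* Returns 0 on success and -ENOMEM on any failure. */
static int export_sketch_add(const char *path, int fail_index)
{
	struct export_sketch new;
	size_t len = strlen(path) + 1;
	int err = -ENOMEM;

	/* Zero everything first so the cleanup below never sees garbage. */
	memset(&new, 0, sizeof(new));

	new.path = malloc(len);
	if (!new.path)
		goto finish;
	memcpy(new.path, path, len);

	if (!fail_index)
		err = 0;	/* report success only when nothing failed */
finish:
	free(new.path);		/* free(NULL) is a no-op */
	return err;
}
```

The error code is set pessimistically up front and cleared only on the success branch, so a successful export no longer returns -ENOMEM.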

Signed-off-by: Neil Brown <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
 fs/nfsd/export.c |   12 +---
 1 file changed, 9 insertions(+), 3 deletions(-)

--- linux-2.6.19.2.orig/fs/nfsd/export.c
+++ linux-2.6.19.2/fs/nfsd/export.c
@@ -950,6 +950,8 @@ exp_export(struct nfsctl_export *nxp)
 
exp = exp_get_by_name(clp, nd.mnt, nd.dentry, NULL);
 
+   memset(&new, 0, sizeof(new));
+
/* must make sure there won't be an ex_fsid clash */
if ((nxp->ex_flags & NFSEXP_FSID) &&
(fsid_key = exp_get_fsid_key(clp, nxp->ex_dev)) &&
@@ -980,6 +982,9 @@ exp_export(struct nfsctl_export *nxp)
 
new.h.expiry_time = NEVER;
new.h.flags = 0;
+   new.ex_path = kstrdup(nxp->ex_path, GFP_KERNEL);
+   if (!new.ex_path)
+   goto finish;
new.ex_client = clp;
new.ex_mnt = nd.mnt;
new.ex_dentry = nd.dentry;
@@ -1000,10 +1005,11 @@ exp_export(struct nfsctl_export *nxp)
/* failed to create at least one index */
exp_do_unexport(exp);
cache_flush();
-   err = -ENOMEM;
-   }
-
+   } else
+   err = 0;
 finish:
+   if (new.ex_path)
+   kfree(new.ex_path);
if (exp)
exp_put(exp);
if (fsid_key && !IS_ERR(fsid_key))

[patch 41/59] md: fix potential memalloc deadlock in md

2007-02-02 Thread Chris Wright

-stable review patch.  If anyone has any objections, please let us know.
--

From: NeilBrown <[EMAIL PROTECTED]>

If a GFP_KERNEL allocation is attempted in md while the mddev_lock is
held, it is possible for a deadlock to eventuate.
This happens if the array was marked 'clean', and the memalloc triggers 
a write-out to the md device.
For the writeout to succeed, the array must be marked 'dirty', and that 
requires getting the mddev_lock.

So, before attempting a GFP_KERNEL allocation while holding the lock,
make sure the array is marked 'dirty' (unless it is currently read-only).

Signed-off-by: Neil Brown <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
 drivers/md/md.c |   29 +
 drivers/md/raid1.c  |2 ++
 drivers/md/raid5.c  |3 +++
 include/linux/raid/md.h |2 +-
 4 files changed, 35 insertions(+), 1 deletion(-)

--- linux-2.6.19.2.orig/drivers/md/md.c
+++ linux-2.6.19.2/drivers/md/md.c
@@ -3561,6 +3561,8 @@ static int get_bitmap_file(mddev_t * mdd
char *ptr, *buf = NULL;
int err = -ENOMEM;
 
+   md_allow_write(mddev);
+
file = kmalloc(sizeof(*file), GFP_KERNEL);
if (!file)
goto out;
@@ -5029,6 +5031,33 @@ void md_write_end(mddev_t *mddev)
}
 }
 
+/* md_allow_write(mddev)
+ * Calling this ensures that the array is marked 'active' so that writes
+ * may proceed without blocking.  It is important to call this before
+ * attempting a GFP_KERNEL allocation while holding the mddev lock.
+ * Must be called with mddev_lock held.
+ */
+void md_allow_write(mddev_t *mddev)
+{
+   if (!mddev->pers)
+   return;
+   if (mddev->ro)
+   return;
+
+   spin_lock_irq(&mddev->write_lock);
+   if (mddev->in_sync) {
+   mddev->in_sync = 0;
+   set_bit(MD_CHANGE_CLEAN, &mddev->flags);
+   if (mddev->safemode_delay &&
+   mddev->safemode == 0)
+   mddev->safemode = 1;
+   spin_unlock_irq(&mddev->write_lock);
+   md_update_sb(mddev, 0);
+   } else
+   spin_unlock_irq(&mddev->write_lock);
+}
+EXPORT_SYMBOL_GPL(md_allow_write);
+
 static DECLARE_WAIT_QUEUE_HEAD(resync_wait);
 
 #define SYNC_MARKS 10
--- linux-2.6.19.2.orig/drivers/md/raid1.c
+++ linux-2.6.19.2/drivers/md/raid1.c
@@ -2104,6 +2104,8 @@ static int raid1_reshape(mddev_t *mddev)
return -EINVAL;
}
 
+   md_allow_write(mddev);
+
raid_disks = mddev->raid_disks + mddev->delta_disks;
 
if (raid_disks < conf->raid_disks) {
--- linux-2.6.19.2.orig/drivers/md/raid5.c
+++ linux-2.6.19.2/drivers/md/raid5.c
@@ -403,6 +403,8 @@ static int resize_stripes(raid5_conf_t *
if (newsize <= conf->pool_size)
return 0; /* never bother to shrink */
 
+   md_allow_write(conf->mddev);
+
/* Step 1 */
sc = kmem_cache_create(conf->cache_name[1-conf->active_name],
   sizeof(struct 
stripe_head)+(newsize-1)*sizeof(struct r5dev),
@@ -3045,6 +3047,7 @@ raid5_store_stripe_cache_size(mddev_t *m
else
break;
}
+   md_allow_write(mddev);
while (new > conf->max_nr_stripes) {
if (grow_one_stripe(conf))
conf->max_nr_stripes++;
--- linux-2.6.19.2.orig/include/linux/raid/md.h
+++ linux-2.6.19.2/include/linux/raid/md.h
@@ -94,7 +94,7 @@ extern int sync_page_io(struct block_dev
struct page *page, int rw);
 extern void md_do_sync(mddev_t *mddev);
 extern void md_new_event(mddev_t *mddev);
-
+extern void md_allow_write(mddev_t *mddev);
 
 #endif /* CONFIG_MD */
 #endif 

[patch 38/59] md: assorted md and raid1 one-liners

2007-02-02 Thread Chris Wright

-stable review patch.  If anyone has any objections, please let us know.
--

From: NeilBrown <[EMAIL PROTECTED]>

Fix a few bugs that meant that:
  - superblocks weren't always written at exactly the right time (this
could show up if the array was not written to - writing to the array
causes lots of superblock updates and so hides these errors).

  - restarting device recovery after a clean shutdown (version-1 metadata
only) didn't work as intended (or at all).

1/ Ensure superblock is updated when a new device is added.
2/ Remove an inappropriate test on MD_RECOVERY_SYNC in md_do_sync.
   The body of this if takes one of two branches depending on whether
   MD_RECOVERY_SYNC is set, so testing it in the clause of the if
   is wrong.
3/ Flag superblock for updating after a resync/recovery finishes.
4/ If we find the need to restart a recovery in the middle (version-1
   metadata only) make sure a full recovery (not just as guided by
   bitmaps) does get done.

Signed-off-by: Neil Brown <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
 drivers/md/md.c|3 ++-
 drivers/md/raid1.c |1 +
 2 files changed, 3 insertions(+), 1 deletion(-)

--- linux-2.6.19.2.orig/drivers/md/md.c
+++ linux-2.6.19.2/drivers/md/md.c
@@ -3722,6 +3722,7 @@ static int add_new_disk(mddev_t * mddev,
if (err)
export_rdev(rdev);
 
+   md_update_sb(mddev, 1);
set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
md_wakeup_thread(mddev->thread);
return err;
@@ -5273,7 +5274,6 @@ void md_do_sync(mddev_t *mddev)
mddev->pers->sync_request(mddev, max_sectors, &skipped, 1);
 
if (!test_bit(MD_RECOVERY_ERR, &mddev->recovery) &&
-   test_bit(MD_RECOVERY_SYNC, &mddev->recovery) &&
!test_bit(MD_RECOVERY_CHECK, &mddev->recovery) &&
mddev->curr_resync > 2) {
if (test_bit(MD_RECOVERY_SYNC, &mddev->recovery)) {
@@ -5297,6 +5297,7 @@ void md_do_sync(mddev_t *mddev)
rdev->recovery_offset = 
mddev->curr_resync;
}
}
+   set_bit(MD_CHANGE_DEVS, &mddev->flags);
 
  skip:
mddev->curr_resync = 0;
--- linux-2.6.19.2.orig/drivers/md/raid1.c
+++ linux-2.6.19.2/drivers/md/raid1.c
@@ -1956,6 +1956,7 @@ static int run(mddev_t *mddev)
!test_bit(In_sync, &disk->rdev->flags)) {
disk->head_position = 0;
mddev->degraded++;
+   conf->fullsync = 1;
}
}
if (mddev->degraded == conf->raid_disks) {

[patch 30/59] Revert "[PATCH] Fix up mmap_kmem"

2007-02-02 Thread Chris Wright

-stable review patch.  If anyone has any objections, please let us know.
--

From: Linus Torvalds <[EMAIL PROTECTED]>

This reverts commit 99a10a60ba9bedcf5d70ef81414d3e03816afa3f.

As per Hugh Dickins:

  "Nadia Derbey has reported that mmap of /dev/kmem no longer works with
   the kernel virtual address as offset, and Franck has confirmed that
   his patch came from a misunderstanding of what an offset means to
   /dev/kmem - whereas his patch description seems to say that he was
   correcting the offset on a few platforms, there was no such problem to
   correct, and his patch was in fact changing its API on all platforms."
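
The difference between the two interpretations of the offset can be worked through numerically. The sketch below is a simplified model, not kernel code: it assumes a plain linear mapping with PAGE_OFFSET at 0xC0000000 and 4 KiB pages, and model_pa() is a hypothetical stand-in for __pa() that is only valid under that assumption.

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT  12
#define SKETCH_PAGE_OFFSET UINT64_C(0xC0000000)	/* assumed linear-map base */

/* Simplified __pa(): valid only for a plain linear mapping. */
static uint64_t model_pa(uint64_t vaddr)
{
	return vaddr - SKETCH_PAGE_OFFSET;
}

/* Restored behaviour: vm_pgoff encodes a kernel virtual address. */
static uint64_t pfn_from_virt_pgoff(uint64_t pgoff)
{
	return model_pa(pgoff << SKETCH_PAGE_SHIFT) >> SKETCH_PAGE_SHIFT;
}

/* Reverted behaviour: vm_pgoff treated as an offset from PAGE_OFFSET. */
static uint64_t pfn_from_relative_pgoff(uint64_t pgoff)
{
	return (model_pa(SKETCH_PAGE_OFFSET) >> SKETCH_PAGE_SHIFT) + pgoff;
}
```

For a caller that passes the kernel virtual address 0xC0100000 as the mmap offset (pgoff 0xC0100), the first function yields page frame 0x100, while the second yields 0xC0100, which is the API change the revert undoes.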

Suggested-by: Hugh Dickins <[EMAIL PROTECTED]>
Cc: Franck Bui-Huu <[EMAIL PROTECTED]>
Cc: Nadia Derbey <[EMAIL PROTECTED]>
Cc: Andi Kleen <[EMAIL PROTECTED]>
Cc: Arjan van de Ven <[EMAIL PROTECTED]>
Cc: Andrew Morton <[EMAIL PROTECTED]>
Signed-off-by: Linus Torvalds <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
 drivers/char/mem.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- linux-2.6.19.2.orig/drivers/char/mem.c
+++ linux-2.6.19.2/drivers/char/mem.c
@@ -293,8 +293,8 @@ static int mmap_kmem(struct file * file,
 {
unsigned long pfn;
 
-   /* Turn a pfn offset into an absolute pfn */
-   pfn = PFN_DOWN(virt_to_phys((void *)PAGE_OFFSET)) + vma->vm_pgoff;
+   /* Turn a kernel-virtual address into a physical page frame */
+   pfn = __pa((u64)vma->vm_pgoff << PAGE_SHIFT) >> PAGE_SHIFT;
 
/*
 * RED-PEN: on some architectures there is more mapped memory

[patch 33/59] sis190: failure to set the MAC address from EEPROM

2007-02-02 Thread Chris Wright

-stable review patch.  If anyone has any objections, please let us know.
--

From: Francois Romieu <[EMAIL PROTECTED]>

Fix from http://bugzilla.kernel.org/show_bug.cgi?id=7747

Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>
Signed-off-by: Francois Romieu <[EMAIL PROTECTED]>
Signed-off-by: Jeff Garzik <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
 drivers/net/sis190.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- linux-2.6.19.2.orig/drivers/net/sis190.c
+++ linux-2.6.19.2/drivers/net/sis190.c
@@ -1559,7 +1559,7 @@ static int __devinit sis190_get_mac_addr
for (i = 0; i < MAC_ADDR_LEN / 2; i++) {
__le16 w = sis190_read_eeprom(ioaddr, EEPROMMACAddr + i);
 
-   ((u16 *)dev->dev_addr)[0] = le16_to_cpu(w);
+   ((u16 *)dev->dev_addr)[i] = le16_to_cpu(w);
}
 
sis190_set_rgmii(tp, sis190_read_eeprom(ioaddr, EEPROMInfo));

[patch 35/59] knfsd: fix an NFSD bug with full sized, non-page-aligned reads.

2007-02-02 Thread Chris Wright

-stable review patch.  If anyone has any objections, please let us know.
--

From: NeilBrown <[EMAIL PROTECTED]>

NFSd assumes that the largest number of pages that will be needed
for a request+response is 2+N where N pages is the size of the largest
permitted read/write request.  The '2' are 1 for the non-data part of
the request, and 1 for the non-data part of the reply.

However, when a read request is not page-aligned, and we choose to use
->sendfile to send it directly from the page cache, we may need N+1
pages to hold the whole reply.  This can overflow an array and cause
an Oops.

This patch increases the size of the array for holding pages by one and
makes sure that entry is NULL when it is not in use.
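
The N+1 case can be checked with a little arithmetic. The sketch below assumes 4 KiB pages; SKETCH_PAGE_SIZE and pages_spanned() are illustrative names, not part of sunrpc.

```c
#define SKETCH_PAGE_SIZE 4096UL	/* assumed page size */

/* Number of pages touched by the byte range [offset, offset + len). */
static unsigned long pages_spanned(unsigned long offset, unsigned long len)
{
	unsigned long first, last;

	if (len == 0)
		return 0;
	first = offset / SKETCH_PAGE_SIZE;
	last = (offset + len - 1) / SKETCH_PAGE_SIZE;
	return last - first + 1;
}
```

A 32768-byte read starting at offset 0 touches 8 pages, but the same read starting at offset 100 touches 9, which is the extra page the enlarged array has to hold.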

Signed-off-by: Neil Brown <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
 fs/nfsd/vfs.c  |3 ++-
 include/linux/sunrpc/svc.h |5 -
 net/sunrpc/svcsock.c   |2 ++
 3 files changed, 8 insertions(+), 2 deletions(-)

--- linux-2.6.19.2.orig/fs/nfsd/vfs.c
+++ linux-2.6.19.2/fs/nfsd/vfs.c
@@ -822,7 +822,8 @@ nfsd_read_actor(read_descriptor_t *desc,
rqstp->rq_res.page_len = size;
} else if (page != pp[-1]) {
get_page(page);
-   put_page(*pp);
+   if (*pp)
+   put_page(*pp);
*pp = page;
rqstp->rq_resused++;
rqstp->rq_res.page_len += size;
--- linux-2.6.19.2.orig/include/linux/sunrpc/svc.h
+++ linux-2.6.19.2/include/linux/sunrpc/svc.h
@@ -144,8 +144,11 @@ extern u32 svc_max_payload(const struct 
  *
  * Each request/reply pair can have at most one "payload", plus two pages,
  * one for the request, and one for the reply.
+ * When using ->sendfile to return read data, we might need one extra page
+ * if the request is not page-aligned.  So add another '1'.
  */
-#define RPCSVC_MAXPAGES ((RPCSVC_MAXPAYLOAD+PAGE_SIZE-1)/PAGE_SIZE + 2)
+#define RPCSVC_MAXPAGES ((RPCSVC_MAXPAYLOAD+PAGE_SIZE-1)/PAGE_SIZE \
+   + 2 + 1)
 
 static inline u32 svc_getnl(struct kvec *iov)
 {
--- linux-2.6.19.2.orig/net/sunrpc/svcsock.c
+++ linux-2.6.19.2/net/sunrpc/svcsock.c
@@ -1248,6 +1248,8 @@ svc_recv(struct svc_rqst *rqstp, long ti

schedule_timeout_uninterruptible(msecs_to_jiffies(500));
rqstp->rq_pages[i] = p;
}
+   rqstp->rq_pages[i++] = NULL; /* this might be seen in nfs_read_actor */
+   BUG_ON(pages >= RPCSVC_MAXPAGES);
 
/* Make arg->head point to first page and arg->pages point to rest */
arg = &rqstp->rq_arg;

[patch 31/59] remove __devinit markings from rtc_sysfs_add_device()

2007-02-02 Thread Chris Wright

-stable review patch.  If anyone has any objections, please let us know.
--

From: Mike Frysinger <[EMAIL PROTECTED]>

rtc_sysfs_add_device is needed even after dev initialization, so drop __devinit.

Signed-off-by: Mike Frysinger <[EMAIL PROTECTED]>
Acked-by: Alessandro Zummo <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
Signed-off-by: Linus Torvalds <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
 drivers/rtc/rtc-sysfs.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- linux-2.6.19.2.orig/drivers/rtc/rtc-sysfs.c
+++ linux-2.6.19.2/drivers/rtc/rtc-sysfs.c
@@ -78,7 +78,7 @@ static struct attribute_group rtc_attr_g
.attrs = rtc_attrs,
 };
 
-static int __devinit rtc_sysfs_add_device(struct class_device *class_dev,
+static int rtc_sysfs_add_device(struct class_device *class_dev,
struct class_interface *class_intf)
 {
int err;

[patch 25/59] Fix UML on non-standard VM split hosts

2007-02-02 Thread Chris Wright

-stable review patch.  If anyone has any objections, please let us know.
--

From: Jeff Dike <[EMAIL PROTECTED]>

This fixes UML on hosts with non-standard VM splits.  We had changed
the config variable that controls UML behavior on such hosts, but not
propagated the change everywhere.  In particular, the values of
STUB_CODE and STUB_DATA relied on the old variable.

I also reformatted the HOST_VMSPLIT_3G help to make it more standard.

Spotted by [EMAIL PROTECTED]

Signed-off-by: Jeff Dike <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
--
 arch/um/Kconfig.i386 |   38 +++---
 1 file changed, 19 insertions(+), 19 deletions(-)

--- linux-2.6.19.2.orig/arch/um/Kconfig.i386
+++ linux-2.6.19.2/arch/um/Kconfig.i386
@@ -19,22 +19,22 @@ config SEMAPHORE_SLEEPERS
 choice
prompt "Host memory split"
default HOST_VMSPLIT_3G
-   ---help---
-  This is needed when the host kernel on which you run has a 
non-default
-  (like 2G/2G) memory split, instead of the customary 3G/1G. If you did
-  not recompile your own kernel but use the default distro's one, you 
can
-  safely accept the "Default split" option.
-
-  It can be enabled on recent (>=2.6.16-rc2) vanilla kernels via
-  CONFIG_VM_SPLIT_*, or on previous kernels with special patches (-ck
-  patchset by Con Kolivas, or other ones) - option names match closely 
the
-  host CONFIG_VM_SPLIT_* ones.
-
-  A lower setting (where 1G/3G is lowest and 3G/1G is higher) will
-  tolerate even more "normal" host kernels, but an higher setting will 
be
-  stricter.
+   help
+This is needed when the host kernel on which you run has a non-default
+   (like 2G/2G) memory split, instead of the customary 3G/1G. If you did
+   not recompile your own kernel but use the default distro's one, you can
+   safely accept the "Default split" option.
+
+   It can be enabled on recent (>=2.6.16-rc2) vanilla kernels via
+   CONFIG_VM_SPLIT_*, or on previous kernels with special patches (-ck
+   patchset by Con Kolivas, or other ones) - option names match closely the
+   host CONFIG_VM_SPLIT_* ones.
+
+   A lower setting (where 1G/3G is lowest and 3G/1G is higher) will
+   tolerate even more "normal" host kernels, but an higher setting will be
+   stricter.
 
-  So, if you do not know what to do here, say 'Default split'.
+   So, if you do not know what to do here, say 'Default split'.
 
config HOST_VMSPLIT_3G
bool "Default split (3G/1G user/kernel host split)"
@@ -67,13 +67,13 @@ config 3_LEVEL_PGTABLES
 
 config STUB_CODE
hex
-   default 0xbfffe000 if !HOST_2G_2G
-   default 0x7fffe000 if HOST_2G_2G
+   default 0xbfffe000 if !HOST_VMSPLIT_2G
+   default 0x7fffe000 if HOST_VMSPLIT_2G
 
 config STUB_DATA
hex
-   default 0xb000 if !HOST_2G_2G
-   default 0x7000 if HOST_2G_2G
+   default 0xb000 if !HOST_VMSPLIT_2G
+   default 0x7000 if HOST_VMSPLIT_2G
 
 config STUB_START
hex

[patch 21/59] IPSEC: Policy list disorder

2007-02-02 Thread Chris Wright

-stable review patch.  If anyone has any objections, please let us know.
--

From: Herbert Xu <[EMAIL PROTECTED]>

The recent hashing introduced an off-by-one bug in policy list insertion.
Instead of adding after the last entry with a lesser or equal priority,
we're adding after the successor of that entry.

This patch fixes this and also adds a warning if we detect a duplicate
entry in the policy list.  This should never happen due to this if clause.
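
The intended insertion rule can be sketched on a plain array of priorities. This is a simplified model under stated assumptions: the real code walks an hlist under xfrm_policy_lock, and last_not_greater() is a hypothetical helper that only shows where the new entry should land.

```c
/*
 * Given priorities sorted in ascending order, return the index of the
 * last entry whose priority is less than or equal to prio, or -1 if
 * there is none. The new policy belongs immediately after that index.
 */
static int last_not_greater(const int *prios, int n, int prio)
{
	int newpos = -1;
	int i;

	for (i = 0; i < n; i++) {
		if (prio >= prios[i]) {
			newpos = i;	/* still a candidate, keep walking */
			continue;
		}
		break;			/* first strictly greater entry */
	}
	return newpos;
}
```

For the list {10, 20, 20, 30} and a new priority of 20, the helper returns index 2, so the new entry goes after the second 20 and before 30. The off-by-one described above would have placed it after 30 instead.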

Signed-off-by: Herbert Xu <[EMAIL PROTECTED]>
Signed-off-by: David S. Miller <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
 net/xfrm/xfrm_policy.c |   16 +---
 1 file changed, 5 insertions(+), 11 deletions(-)

--- linux-2.6.19.2.orig/net/xfrm/xfrm_policy.c
+++ linux-2.6.19.2/net/xfrm/xfrm_policy.c
@@ -615,19 +615,18 @@ int xfrm_policy_insert(int dir, struct x
struct xfrm_policy *pol;
struct xfrm_policy *delpol;
struct hlist_head *chain;
-   struct hlist_node *entry, *newpos, *last;
+   struct hlist_node *entry, *newpos;
struct dst_entry *gc_list;
 
write_lock_bh(&xfrm_policy_lock);
chain = policy_hash_bysel(&policy->selector, policy->family, dir);
delpol = NULL;
newpos = NULL;
-   last = NULL;
hlist_for_each_entry(pol, entry, chain, bydst) {
-   if (!delpol &&
-   pol->type == policy->type &&
+   if (pol->type == policy->type &&
!selector_cmp(&pol->selector, &policy->selector) &&
-   xfrm_sec_ctx_match(pol->security, policy->security)) {
+   xfrm_sec_ctx_match(pol->security, policy->security) &&
+   !WARN_ON(delpol)) {
if (excl) {
write_unlock_bh(&xfrm_policy_lock);
return -EEXIST;
@@ -636,17 +635,12 @@ int xfrm_policy_insert(int dir, struct x
if (policy->priority > pol->priority)
continue;
} else if (policy->priority >= pol->priority) {
-   last = &pol->bydst;
+   newpos = &pol->bydst;
continue;
}
-   if (!newpos)
-   newpos = &pol->bydst;
if (delpol)
break;
-   last = &pol->bydst;
}
-   if (!newpos)
-   newpos = last;
if (newpos)
hlist_add_after(newpos, &policy->bydst);
else

[patch 23/59] SELinux: fix an oops with NetLabel and non-MLS SELinux policy

2007-02-02 Thread Chris Wright

-stable review patch.  If anyone has any objections, please let us know.
--

From:  <[EMAIL PROTECTED]>

In the case where a user has configured NetLabel in the kernel but is not
using a SELinux policy with the MLS/MCS feature enabled there is a bug in
mls_export_cat() where a NULL pointer is used.  The initial problem report and
discussion can be found here (this patch has been ACK'd by Stephen Smalley and
 James Morris in the discussion thread below):

 * http://marc2.theaimsgroup.com/?t=11692030254=1=2

This patch is specific to the 2.6.19.y kernel series as the mls_export_cat()
function has been replaced in the 2.6.20 kernel.
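
The shape of the fix is the usual guard for optional output parameters. The sketch below is a reduced illustration, not the SELinux code: export_cat_disabled() is a hypothetical function that models only the branch taken when MLS is not enabled.

```c
#include <stddef.h>

/*
 * Model of the "MLS disabled" branch: each pair of output pointers is
 * optional, so it is written only when the caller supplied it.
 */
static int export_cat_disabled(unsigned char **low, size_t *low_len,
			       unsigned char **high, size_t *high_len)
{
	if (low != NULL) {
		*low = NULL;
		*low_len = 0;
	}
	if (high != NULL) {
		*high = NULL;
		*high_len = 0;
	}
	return 0;
}
```

A caller interested in only one of the two levels can now pass NULL for the other pair without triggering a NULL dereference.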

Signed-off-by: Paul Moore <[EMAIL PROTECTED]>
Acked-by:  Stephen Smalley <[EMAIL PROTECTED]>
Acked-by: James Morris <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
 security/selinux/ss/mls.c |   12 
 1 file changed, 8 insertions(+), 4 deletions(-)

--- linux-2.6.19.2.orig/security/selinux/ss/mls.c
+++ linux-2.6.19.2/security/selinux/ss/mls.c
@@ -641,10 +641,14 @@ int mls_export_cat(const struct context 
int rc = -EPERM;
 
if (!selinux_mls_enabled) {
-   *low = NULL;
-   *low_len = 0;
-   *high = NULL;
-   *high_len = 0;
+   if (low != NULL) {
+   *low = NULL;
+   *low_len = 0;
+   }
+   if (high != NULL) {
+   *high = NULL;
+   *high_len = 0;
+   }
return 0;
}
 

[patch 13/59] ieee1394: sbp2: fix probing of some DVD-ROM/RWs

2007-02-02 Thread Chris Wright

-stable review patch.  If anyone has any objections, please let us know.
--

From: Stefan Richter <[EMAIL PROTECTED]>

Since commit 98e238cd42be6c0852da519303cf0182690f8d9f in Linux 2.6.19,
"ieee1394: sbp2: don't prefer MODE SENSE 10", some FireWire DVD-ROMs and
DVD-RWs were mistaken as CD-ROM because sr_mod now sent MODE SENSE 6.
The MMC command set includes only MODE SENSE 10.
http://bugzilla.kernel.org/show_bug.cgi?id=7800

This fix lets sbp2 switch scsi_device.use_10_for_ms on for MMC LUs.
This should rather be done in the command set driver sr_mod, not in the
sbp2 transport driver, and a corresponding patch will follow for a next
Linux release.

Signed-off-by: Stefan Richter <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
same as commit 1a74bc68e4c0534d150e6454b45a70dab831fa32

---
 drivers/ieee1394/sbp2.c |2 ++
 1 file changed, 2 insertions(+)

--- linux-2.6.19.2.orig/drivers/ieee1394/sbp2.c
+++ linux-2.6.19.2/drivers/ieee1394/sbp2.c
@@ -2530,6 +2530,8 @@ static int sbp2scsi_slave_configure(stru
blk_queue_dma_alignment(sdev->request_queue, (512 - 1));
sdev->use_10_for_rw = 1;
 
+   if (sdev->type == TYPE_ROM)
+   sdev->use_10_for_ms = 1;
if (sdev->type == TYPE_DISK &&
scsi_id->workarounds & SBP2_WORKAROUND_MODE_SENSE_8)
sdev->skip_ms_page_8 = 1;

[patch 05/59] md: pass down BIO_RW_SYNC in raid{1,10}

2007-02-02 Thread Chris Wright

-stable review patch.  If anyone has any objections, please let us know.
--

From: Lars Ellenberg <[EMAIL PROTECTED]>

md raidX make_request functions strip off the BIO_RW_SYNC flag, thus
introducing additional latency.

Fixing this in raid1 and raid10 seems to be straightforward enough.

For our particular usage case in DRBD, passing this flag improved some
initialization time from ~5 minutes to ~5 seconds.
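
Passing the flag down amounts to copying one bit from the incoming request into each request issued to a member device. The sketch below models this with plain integers. The bit position SKETCH_RW_SYNC_BIT and the helpers sync_flag() and child_rw() are hypothetical and chosen for illustration; they are not the kernel's definitions.

```c
#define SKETCH_READ        0u
#define SKETCH_WRITE       1u
#define SKETCH_RW_SYNC_BIT 4	/* hypothetical bit position */

/* Extract the sync bit from a request's rw word (models bio_sync()). */
static unsigned int sync_flag(unsigned int bi_rw)
{
	return bi_rw & (1u << SKETCH_RW_SYNC_BIT);
}

/* Build the rw word for a per-device request, keeping the sync bit. */
static unsigned int child_rw(unsigned int parent_rw, unsigned int dir)
{
	return dir | sync_flag(parent_rw);
}
```

Before the change the per-device request was built from the direction alone, which is equivalent to calling child_rw() with a parent word that never has the sync bit set.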

Acked-by: NeilBrown <[EMAIL PROTECTED]>
Signed-off-by: Lars Ellenberg <[EMAIL PROTECTED]>
Acked-by: Jens Axboe <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
 drivers/md/raid1.c  |   13 +
 drivers/md/raid10.c |   11 ---
 2 files changed, 17 insertions(+), 7 deletions(-)

--- linux-2.6.19.2.orig/drivers/md/raid1.c
+++ linux-2.6.19.2/drivers/md/raid1.c
@@ -775,6 +775,7 @@ static int make_request(request_queue_t 
struct bio_list bl;
struct page **behind_pages = NULL;
const int rw = bio_data_dir(bio);
+   const int do_sync = bio_sync(bio);
int do_barriers;
 
/*
@@ -835,7 +836,7 @@ static int make_request(request_queue_t 
read_bio->bi_sector = r1_bio->sector + 
mirror->rdev->data_offset;
read_bio->bi_bdev = mirror->rdev->bdev;
read_bio->bi_end_io = raid1_end_read_request;
-   read_bio->bi_rw = READ;
+   read_bio->bi_rw = READ | do_sync;
read_bio->bi_private = r1_bio;
 
generic_make_request(read_bio);
@@ -906,7 +907,7 @@ static int make_request(request_queue_t 
mbio->bi_sector = r1_bio->sector + 
conf->mirrors[i].rdev->data_offset;
mbio->bi_bdev = conf->mirrors[i].rdev->bdev;
mbio->bi_end_io = raid1_end_write_request;
-   mbio->bi_rw = WRITE | do_barriers;
+   mbio->bi_rw = WRITE | do_barriers | do_sync;
mbio->bi_private = r1_bio;
 
if (behind_pages) {
@@ -941,6 +942,8 @@ static int make_request(request_queue_t 
blk_plug_device(mddev->queue);
spin_unlock_irqrestore(&conf->device_lock, flags);
 
+   if (do_sync)
+   md_wakeup_thread(mddev->thread);
 #if 0
while ((bio = bio_list_pop(&bl)) != NULL)
generic_make_request(bio);
@@ -1541,6 +1544,7 @@ static void raid1d(mddev_t *mddev)
 * We already have a nr_pending reference on these 
rdevs.
 */
int i;
+   const int do_sync = bio_sync(r1_bio->master_bio);
clear_bit(R1BIO_BarrierRetry, &r1_bio->state);
clear_bit(R1BIO_Barrier, &r1_bio->state);
for (i=0; i < conf->raid_disks; i++)
@@ -1561,7 +1565,7 @@ static void raid1d(mddev_t *mddev)

conf->mirrors[i].rdev->data_offset;
bio->bi_bdev = 
conf->mirrors[i].rdev->bdev;
bio->bi_end_io = 
raid1_end_write_request;
-   bio->bi_rw = WRITE;
+   bio->bi_rw = WRITE | do_sync;
bio->bi_private = r1_bio;
r1_bio->bios[i] = bio;
generic_make_request(bio);
@@ -1593,6 +1597,7 @@ static void raid1d(mddev_t *mddev)
   (unsigned long long)r1_bio->sector);
raid_end_bio_io(r1_bio);
} else {
+   const int do_sync = 
bio_sync(r1_bio->master_bio);
r1_bio->bios[r1_bio->read_disk] =
mddev->ro ? IO_BLOCKED : NULL;
r1_bio->read_disk = disk;
@@ -1608,7 +1613,7 @@ static void raid1d(mddev_t *mddev)
bio->bi_sector = r1_bio->sector + 
rdev->data_offset;
bio->bi_bdev = rdev->bdev;
bio->bi_end_io = raid1_end_read_request;
-   bio->bi_rw = READ;
+   bio->bi_rw = READ | do_sync;
bio->bi_private = r1_bio;
unplug = 1;
generic_make_request(bio);
--- linux-2.6.19.2.orig/drivers/md/raid10.c
+++ linux-2.6.19.2/drivers/md/raid10.c
@@ -782,6 +782,7 @@ static int make_request(request_queue_t 
int i;
int chunk_sects = conf->chunk_mask + 1;
const int rw = bio_data_dir(bio);
+   const int do_sync = bio_sync(bio);
struct bio_list bl;
unsigned long flags;
 
@@ -863,7 +864,7 @@ static int make_request(request_queue_t

[patch 04/59] Fix HWRNG built-in initcalls priority

2007-02-02 Thread Chris Wright

-stable review patch.  If anyone has any objections, please let us know.
--

From: Michael Buesch <[EMAIL PROTECTED]>

This changes all HWRNG driver initcalls to module_init().
We must probe the RNGs after the major kernel subsystems
are already up and running (like PCI).
This fixes Bug 7730.
http://bugzilla.kernel.org/show_bug.cgi?id=7730
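
To illustrate why the initcall level matters, here is a rough userspace
sketch (not kernel code). Built-in initcalls run level by level, and for
built-in code module_init() lands on the device level, which runs after the
subsys level. The level numbers follow include/linux/init.h; the struct, the
function name, and the "pci" entry used in the usage note are made up for the
example and only stand in for whatever subsystem the RNG driver depends on.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Initcall levels as numbered in include/linux/init.h. */
enum initcall_level {
	LEVEL_SUBSYS = 4,	/* subsys_initcall() */
	LEVEL_DEVICE = 6,	/* device_initcall(), also built-in module_init() */
};

/* Hypothetical record describing one registered init function. */
struct initcall {
	const char *name;
	enum initcall_level level;
};

/*
 * Return the 0-based position at which 'name' would run if the calls are
 * executed level by level, keeping registration order inside a level.
 * Returns -1 if 'name' is not registered.
 */
int run_position(const struct initcall *calls, size_t n, const char *name)
{
	int pos = 0;
	int level;
	size_t i;

	assert(calls != NULL || n == 0);

	for (level = 1; level <= 7; level++) {
		for (i = 0; i < n; i++) {
			if ((int)calls[i].level != level)
				continue;
			if (strcmp(calls[i].name, name) == 0)
				return pos;
			pos++;
		}
	}
	return -1;
}
```

With an "rng" entry registered ahead of a "pci" entry, both at LEVEL_SUBSYS,
the RNG probe runs first. Moving "rng" to LEVEL_DEVICE makes it run after
"pci" regardless of registration order, which is the effect this patch is
after.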

Signed-off-by: Michael Buesch <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
 drivers/char/hw_random/amd-rng.c|2 +-
 drivers/char/hw_random/geode-rng.c  |2 +-
 drivers/char/hw_random/intel-rng.c  |2 +-
 drivers/char/hw_random/ixp4xx-rng.c |2 +-
 drivers/char/hw_random/via-rng.c|2 +-
 5 files changed, 5 insertions(+), 5 deletions(-)

--- linux-2.6.19.2.orig/drivers/char/hw_random/amd-rng.c
+++ linux-2.6.19.2/drivers/char/hw_random/amd-rng.c
@@ -144,7 +144,7 @@ static void __exit mod_exit(void)
hwrng_unregister(_rng);
 }
 
-subsys_initcall(mod_init);
+module_init(mod_init);
 module_exit(mod_exit);
 
 MODULE_AUTHOR("The Linux Kernel team");
--- linux-2.6.19.2.orig/drivers/char/hw_random/geode-rng.c
+++ linux-2.6.19.2/drivers/char/hw_random/geode-rng.c
@@ -125,7 +125,7 @@ static void __exit mod_exit(void)
iounmap(mem);
 }
 
-subsys_initcall(mod_init);
+module_init(mod_init);
 module_exit(mod_exit);
 
 MODULE_DESCRIPTION("H/W RNG driver for AMD Geode LX CPUs");
--- linux-2.6.19.2.orig/drivers/char/hw_random/intel-rng.c
+++ linux-2.6.19.2/drivers/char/hw_random/intel-rng.c
@@ -350,7 +350,7 @@ static void __exit mod_exit(void)
iounmap(mem);
 }
 
-subsys_initcall(mod_init);
+module_init(mod_init);
 module_exit(mod_exit);
 
 MODULE_DESCRIPTION("H/W RNG driver for Intel chipsets");
--- linux-2.6.19.2.orig/drivers/char/hw_random/ixp4xx-rng.c
+++ linux-2.6.19.2/drivers/char/hw_random/ixp4xx-rng.c
@@ -64,7 +64,7 @@ static void __exit ixp4xx_rng_exit(void)
iounmap(rng_base);
 }
 
-subsys_initcall(ixp4xx_rng_init);
+module_init(ixp4xx_rng_init);
 module_exit(ixp4xx_rng_exit);
 
 MODULE_AUTHOR("Deepak Saxena <[EMAIL PROTECTED]>");
--- linux-2.6.19.2.orig/drivers/char/hw_random/via-rng.c
+++ linux-2.6.19.2/drivers/char/hw_random/via-rng.c
@@ -176,7 +176,7 @@ static void __exit mod_exit(void)
hwrng_unregister(_rng);
 }
 
-subsys_initcall(mod_init);
+module_init(mod_init);
 module_exit(mod_exit);
 
 MODULE_DESCRIPTION("H/W RNG driver for VIA chipsets");

--
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[patch 16/59] start_kernel: test if irqs got enabled early, barf, and disable them again

2007-02-02 Thread Chris Wright

-stable review patch.  If anyone has any objections, please let us know.
--

From: Ard van Breemen <[EMAIL PROTECTED]>

The calls made by parse_parms to other initialization code might enable
interrupts again way too early.

Having interrupts on this early can make systems PANIC when they initialize
the IRQ controllers (which happens later in the code).  This patch detects
that irq's are enabled again, barfs about it and disables them again as a
safety net.

[EMAIL PROTECTED]: cleanups]
Signed-off-by: Ard van Breemen <[EMAIL PROTECTED]>
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
Signed-off-by: Linus Torvalds <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
This is half of the fix for http://bugzilla.kernel.org/show_bug.cgi?id=7505

 init/main.c |5 +
 1 file changed, 5 insertions(+)

--- linux-2.6.19.2.orig/init/main.c
+++ linux-2.6.19.2/init/main.c
@@ -525,6 +525,11 @@ asmlinkage void __init start_kernel(void
parse_args("Booting kernel", command_line, __start___param,
   __stop___param - __start___param,
   _bootoption);
+   if (!irqs_disabled()) {
+   printk(KERN_WARNING "start_kernel(): bug: interrupts were "
+   "enabled *very* early, fixing it\n");
+   local_irq_disable();
+   }
sort_main_extable();
trap_init();
rcu_init();

--

[patch 12/59] [PATCH] Fix reparenting to the same thread group. (take 2)

2007-02-02 Thread Chris Wright

-stable review patch.  If anyone has any objections, please let us know.
--

From: Eric W. Biederman <[EMAIL PROTECTED]>

This patch fixes the case when we reparent to a different thread in the
same thread group.  This modifies the code so that we do not send
signals and do not change the signal to send to SIGCHLD unless we have
changed the thread group of our parents.  It also suppresses sending
pdeath_sig in this case as well since the result of getppid doesn't
change.
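
The test the patch relies on can be sketched in userspace as follows. This is
only an illustration: struct task is a made-up stand-in for task_struct that
keeps nothing but the group_leader pointer, and the function name is
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal stand-in for the kernel's task_struct (illustration only). */
struct task {
	struct task *group_leader;	/* leader of this task's thread group */
};

/*
 * True when the new parent is in the same thread group as the old one.
 * In that case the child's parent process has not changed, so no signal
 * needs to be sent and exit_signal can be left alone.
 */
bool same_group_reparent(const struct task *new_parent,
			 const struct task *old_parent)
{
	assert(new_parent != NULL && old_parent != NULL);
	return new_parent->group_leader == old_parent->group_leader;
}
```

A sibling thread of the dying parent shares its group_leader, so the check
returns true and the notification path is skipped; reparenting to init
returns false and the usual SIGCHLD handling applies.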

Thanks to Oleg for spotting my bug of only fixing this for non-ptraced
tasks.

This fixes the issues identified by Albert Cahalan in thread
http://lkml.org/lkml/2006/12/21/22.

Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]>
Acked-by: Mike Galbraith <[EMAIL PROTECTED]>
Cc: Albert Cahalan <[EMAIL PROTECTED]>
Cc: Andrew Morton <[EMAIL PROTECTED]>
Cc: Roland McGrath <[EMAIL PROTECTED]>
Cc: Ingo Molnar <[EMAIL PROTECTED]>
Cc: Coywolf Qi Hunt <[EMAIL PROTECTED]>
Acked-by: Oleg Nesterov <[EMAIL PROTECTED]>
Signed-off-by: Linus Torvalds <[EMAIL PROTECTED]>
[chrisw: fold in 241ceee0b442, Oleg's fix to restore user visible behaviour]
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
 kernel/exit.c |   29 ++---
 1 file changed, 18 insertions(+), 11 deletions(-)

--- linux-2.6.19.2.orig/kernel/exit.c
+++ linux-2.6.19.2/kernel/exit.c
@@ -603,10 +603,6 @@ choose_new_parent(struct task_struct *p,
 static void
 reparent_thread(struct task_struct *p, struct task_struct *father, int traced)
 {
-   /* We don't want people slaying init.  */
-   if (p->exit_signal != -1)
-   p->exit_signal = SIGCHLD;
-
if (p->pdeath_signal)
/* We already hold the tasklist_lock here.  */
group_send_sig_info(p->pdeath_signal, SEND_SIG_NOINFO, p);
@@ -626,13 +622,7 @@ reparent_thread(struct task_struct *p, s
p->parent = p->real_parent;
add_parent(p);
 
-   /* If we'd notified the old parent about this child's death,
-* also notify the new parent.
-*/
-   if (p->exit_state == EXIT_ZOMBIE && p->exit_signal != -1 &&
-   thread_group_empty(p))
-   do_notify_parent(p, p->exit_signal);
-   else if (p->state == TASK_TRACED) {
+   if (p->state == TASK_TRACED) {
/*
 * If it was at a trace stop, turn it into
 * a normal stop since it's no longer being
@@ -642,6 +632,23 @@ reparent_thread(struct task_struct *p, s
}
}
 
+   /* If this is a threaded reparent there is no need to
+* notify anyone anything has happened.
+*/
+   if (p->real_parent->group_leader == father->group_leader)
+   return;
+
+   /* We don't want people slaying init.  */
+   if (p->exit_signal != -1)
+   p->exit_signal = SIGCHLD;
+
+   /* If we'd notified the old parent about this child's death,
+* also notify the new parent.
+*/
+   if (!traced && p->exit_state == EXIT_ZOMBIE &&
+   p->exit_signal != -1 && thread_group_empty(p))
+   do_notify_parent(p, p->exit_signal);
+
/*
 * process group orphan check
 * Case ii: Our child is in a different pgrp

--

[patch 14/59] sched: tasks cannot run on cpus onlined after boot

2007-02-02 Thread Chris Wright

-stable review patch.  If anyone has any objections, please let us know.
--

From: Nathan Lynch <[EMAIL PROTECTED]>

Commit 5c1e176781f43bc902a51e5832f789756bff911b ("sched: force /sbin/init
off isolated cpus") sets init's cpus_allowed to a subset of cpu_online_map
at boot time, which means that tasks won't be scheduled on cpus that are
added to the system later.

Make init's cpus_allowed a subset of cpu_possible_map instead.  This should
still preserve the behavior that Nick's change intended.
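
A small userspace sketch of the mask arithmetic, assuming a made-up 32-bit
cpumask type with one bit per CPU (the real cpumask_t and cpus_andnot() are
kernel macros; the names below are hypothetical):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 32-CPU mask: bit N set means CPU N is in the set. */
typedef uint32_t toy_cpumask;

/* Same shape as cpus_andnot(dst, a, b): CPUs in 'a' that are not in 'b'. */
toy_cpumask toy_cpus_andnot(toy_cpumask a, toy_cpumask b)
{
	return a & ~b;
}

/* May a task restricted to 'allowed' run on 'cpu'? */
int toy_cpu_allowed(toy_cpumask allowed, unsigned int cpu)
{
	assert(cpu < 32);
	return (int)((allowed >> cpu) & 1u);
}
```

With four possible CPUs (0xf), two online at boot (0x3) and none isolated,
the old computation yields 0x3, so a CPU brought online later (say CPU 2)
is never allowed. Starting from the possible map yields 0xf, while isolated
CPUs are still masked out.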

Thanks to Giuliano Pochini for reporting this and testing the fix:

http://ozlabs.org/pipermail/linuxppc-dev/2006-December/029397.html

Signed-off-by: Nathan Lynch <[EMAIL PROTECTED]>
Acked-by: Ingo Molnar <[EMAIL PROTECTED]>
Cc: Nick Piggin <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
Signed-off-by: Linus Torvalds <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
 kernel/sched.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- linux-2.6.19.2.orig/kernel/sched.c
+++ linux-2.6.19.2/kernel/sched.c
@@ -6765,7 +6765,7 @@ void __init sched_init_smp(void)
 
lock_cpu_hotplug();
arch_init_sched_domains(_online_map);
-   cpus_andnot(non_isolated_cpus, cpu_online_map, cpu_isolated_map);
+   cpus_andnot(non_isolated_cpus, cpu_possible_map, cpu_isolated_map);
if (cpus_empty(non_isolated_cpus))
cpu_set(smp_processor_id(), non_isolated_cpus);
unlock_cpu_hotplug();

--

[patch 00/59] -stable review

2007-02-02 Thread Chris Wright

This is the start of the stable review cycle for the 2.6.19.3 release.
There are 59 patches in this series, all will be posted as a response
to this one.  If anyone has any issues with these being applied, please
let us know.  If anyone is a maintainer of the proper subsystem, and
wants to add a Signed-off-by: line to the patch, please respond with it.

These patches are sent out with a number of different people on the
Cc: line.  If you wish to be a reviewer, please email [EMAIL PROTECTED]
to add your name to the list.  If you want to be off the reviewer list,
also email us.

Responses should be made by Mon Feb  3 02:30 UTC 2007
Anything received after that time might be too late.

thanks,

the -stable release team
--

Re: [patch 1/9] fs: libfs buffered write leak fix

2007-02-02 Thread Nick Piggin

On Fri, Feb 02, 2007 at 06:19:55PM -0800, Andrew Morton wrote:
> On Sat, 3 Feb 2007 03:09:26 +0100
> Nick Piggin <[EMAIL PROTECTED]> wrote:
> 
> > From: Nick Piggin <[EMAIL PROTECTED]>
> > To: Andrew Morton <[EMAIL PROTECTED]>
> 
> argh.  Yesterday all my emails were getting a mysterious
> s/osdl/linux-foundation/ done to them at the server, so I switched everything
> over.  Now it would appear that they are getting an equally mysterious
> s/linux-foundation/osdl/ done to them.  I assume you sent this to
> [EMAIL PROTECTED]

No. Your first reply I got to this patch came as linux-foundation, and
that's what I replied to. Your subsequent reply back to me ("Yes, the page
just isn't uptodate yet..."), came from osdl.org, which is what I replied
to.

> > Cc: Linux Kernel , Linux Filesystems 
> > , Linux Memory Management <[EMAIL PROTECTED]>
> > Subject: Re: [patch 1/9] fs: libfs buffered write leak fix
> > Date: Sat, 3 Feb 2007 03:09:26 +0100
> > User-Agent: Mutt/1.5.9i
> > 
> > On Fri, Feb 02, 2007 at 05:58:01PM -0800, Andrew Morton wrote:
> > > On Sat, 3 Feb 2007 02:33:16 +0100
> > > Nick Piggin <[EMAIL PROTECTED]> wrote:
> > > 
> > > > I think just setting page uptodate in commit_write might do the
> > > > trick? (and getting rid of the set_page_dirty there).
> > > 
> > > Yes, the page just isn't uptodate yet in prepare_write() - moving things
> > > to commit_write() sounds sane.
> > > 
> > > But please, can we have sufficient changelogs and comments in the next 
> > > version?
> > 
> > You're right, sorry. Is this any better?
> 
> yup, thanks.
> 
> > (warning: nobh code is untested)
> 
> ow.

I'll get a chance to do that later today. I have to fire up the old test
case and see if I can reproduce the problem with nobh on a real fs...
Will get back to you when I do.


Re: 2.6.20-rc7: known regressions

2007-02-02 Thread Andrew Morton

On Fri, 02 Feb 2007 21:03:48 -0500
Jeff Garzik <[EMAIL PROTECTED]> wrote:

> Andrew Morton wrote:
> > On Fri, 2 Feb 2007 06:49:16 +0100
> > Adrian Bunk <[EMAIL PROTECTED]> wrote:
> > 
> >> This email lists some known regressions in 2.6.20-rc7 compared to 2.6.19
> >> that are not yet fixed in Linus' tree.
> > 
> > There are still a few things hanging around.
> > 
> > I have these queued:
> > 
> > aio-fix-buggy-put_ioctx-call-in-aio_complete-v2.patch
> > kexec-avoid-migration-of-already-disabled-irqs-ia64.patch
> > net-smc911x-match-up-spin-lock-unlock.patch
> > rtc-pcf8563-detect-polarity-of-century-bit-automatically.patch
> > alpha-fix-epoll-syscall-enumerations.patch
> > revert-blockdev-direct-io-back-to-2619-version.patch
> > scsi-sd-udev-accessing-an-uninitialized-scsi_disk-results-in-a-crash.patch
> > altix-more-acpi-prt-support.patch
> 
> Would you forward the x86-64 dma_noncoherent API build fix I posted? 
> Anything that uses that API won't build on x86-64 without my [simple and 
> obvious] patch.

Yup.  That's this:

--- a/include/asm-x86_64/dma-mapping.h~x86-64-define-dma-noncoherent-api-functions
+++ a/include/asm-x86_64/dma-mapping.h
@@ -63,6 +63,9 @@ static inline int dma_mapping_error(dma_
return (dma_addr == bad_dma_address);
 }
 
+#define dma_alloc_noncoherent(d, s, h, f) dma_alloc_coherent(d, s, h, f)
+#define dma_free_noncoherent(d, s, v, h) dma_free_coherent(d, s, v, h)
+
 extern void *dma_alloc_coherent(struct device *dev, size_t size,
dma_addr_t *dma_handle, gfp_t gfp);
 extern void dma_free_coherent(struct device *dev, size_t size, void *vaddr,
_

> 
> > - I have r8169-fix-a-race-between-pci-probe-and-dev_open.patch floating
> >   about, but I forget its status.  
> 
> I posted a preferred patch (which someone then noted need to use 
> setup_timer), and am waiting for an "it works" response of some sort

OK, thanks, I'll drop it.

Re: [patch 1/9] fs: libfs buffered write leak fix

2007-02-02 Thread Andrew Morton

On Sat, 3 Feb 2007 03:09:26 +0100
Nick Piggin <[EMAIL PROTECTED]> wrote:

> From: Nick Piggin <[EMAIL PROTECTED]>
> To: Andrew Morton <[EMAIL PROTECTED]>

argh.  Yesterday all my emails were getting a mysterious
s/osdl/linux-foundation/ done to them at the server, so I switched everything
over.  Now it would appear that they are getting an equally mysterious
s/linux-foundation/osdl/ done to them.  I assume you sent this to
[EMAIL PROTECTED]

> Cc: Linux Kernel , Linux Filesystems 
> , Linux Memory Management <[EMAIL PROTECTED]>
> Subject: Re: [patch 1/9] fs: libfs buffered write leak fix
> Date: Sat, 3 Feb 2007 03:09:26 +0100
> User-Agent: Mutt/1.5.9i
> 
> On Fri, Feb 02, 2007 at 05:58:01PM -0800, Andrew Morton wrote:
> > On Sat, 3 Feb 2007 02:33:16 +0100
> > Nick Piggin <[EMAIL PROTECTED]> wrote:
> > 
> > > I think just setting page uptodate in commit_write might do the
> > > trick? (and getting rid of the set_page_dirty there).
> > 
> > Yes, the page just isn't uptodate yet in prepare_write() - moving things
> > to commit_write() sounds sane.
> > 
> > But please, can we have sufficient changelogs and comments in the next 
> > version?
> 
> You're right, sorry. Is this any better?

yup, thanks.

> (warning: nobh code is untested)

ow.

libata_uli puts second channel to PIO4 on 2.6.18

2007-02-02 Thread Grzegorz Kulewski


Hi,

I got this SATA PCI card:

00:04.0 Mass storage controller: ALi Corporation ALi M5281 Serial ATA / RAID Host Controller (rev a4) (prog-if 85)
Subsystem: ALi Corporation ALi M5281 Serial ATA / RAID Host Controller
Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
Status: Cap- 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- SERR- 
Latency: 128, Cache Line Size: 512 bytes
Interrupt: pin A routed to IRQ 185
Region 0: I/O ports at d400 [size=8]
Region 1: I/O ports at d000 [size=4]
Region 2: I/O ports at b800 [size=8]
Region 3: I/O ports at b400 [size=4]
Region 4: I/O ports at b000 [size=16]
[virtual] Expansion ROM at 8800 [disabled] [size=64K]

00:04.1 Mass storage controller: ALi Corporation M5228 ALi ATA/RAID Controller (rev c6) (prog-if 85)
Subsystem: ALi Corporation Unknown device 5281
Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
Status: Cap- 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- SERR- 
Latency: 128
Interrupt: pin A routed to IRQ 9
Region 0: I/O ports at a800 [size=8]
Region 1: I/O ports at a400 [size=4]
Region 2: I/O ports at a000 [size=8]
Region 3: I/O ports at 9800 [size=4]
Region 4: I/O ports at 9400 [size=16]


It worked very well for half a year, but with only one disk (IIRC it was even 
plugged into the second channel, but I won't bet on it). Now I have a second 
disk (very similar) and it is always put into PIO4 mode:


[   17.404451] libata version 2.00 loaded.
[   17.404916] sata_uli :00:04.0: version 1.0
[   17.405009] ACPI: PCI Interrupt :00:04.0[A] -> GSI 18 (level, low) -> IRQ 185
[   17.405223] ata1: SATA max UDMA/133 cmd 0xD400 ctl 0xD002 bmdma 0xB000 irq 185
[   17.405385] ata2: SATA max UDMA/133 cmd 0xB800 ctl 0xB402 bmdma 0xB008 irq 185
[   17.405519] scsi2 : sata_uli
[   17.858803] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
[   17.880541] ata1.00: ATA-7, max UDMA/133, 488397168 sectors: LBA48 NCQ (depth 0/32)
[   17.880660] ata1.00: ata1: dev 0 multi count 16
[   17.58] ata1.00: configured for UDMA/133
[   17.888941] scsi3 : sata_uli
[   18.342469] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
[   18.343573] ata2.00: ATA-7, max UDMA/133, 488397168 sectors: LBA48 NCQ (depth 0/32)
[   18.343691] ata2.00: ata2: dev 0 multi count 16
[   18.344972] ata2.00: configured for PIO4
[   18.345466]   Vendor: ATA   Model: ST3250620NS   Rev: 3.AE
[   18.346391]   Type:   Direct-Access  ANSI SCSI revision: 05
[   18.347464]   Vendor: ATA   Model: ST3250620NS   Rev: 3.AE
[   18.348390]   Type:   Direct-Access  ANSI SCSI revision: 05
[   18.349457] SCSI device sda: 488397168 512-byte hdwr sectors (250059 MB)
[   18.350234] sda: Write Protect is off
[   18.350307] sda: Mode Sense: 00 3a 00 00
[   18.351234] SCSI device sda: drive cache: write back
[   18.352233] SCSI device sda: 488397168 512-byte hdwr sectors (250059 MB)
[   18.352444] sda: Write Protect is off
[   18.352517] sda: Mode Sense: 00 3a 00 00
[   18.353443] SCSI device sda: drive cache: write back
[   18.353522]  sda: sda1 sda2
[   18.371118] sd 2:0:0:0: Attached scsi disk sda
[   18.372221] SCSI device sdb: 488397168 512-byte hdwr sectors (250059 MB)
[   18.372431] sdb: Write Protect is off
[   18.372504] sdb: Mode Sense: 00 3a 00 00
[   18.373440] SCSI device sdb: drive cache: write back
[   18.374430] SCSI device sdb: 488397168 512-byte hdwr sectors (250059 MB)
[   18.375218] sdb: Write Protect is off
[   18.375291] sdb: Mode Sense: 00 3a 00 00
[   18.376216] SCSI device sdb: drive cache: write back
[   18.376295]  sdb: unknown partition table
[   18.381481] sd 3:0:0:0: Attached scsi disk sdb


As you probably know, this gives very poor performance. Is there any way 
to make it fast? I tried changing the cables and reconnecting them, but 
that does not seem to help. I can't do much with this hardware since it is 
used as a production server, but testing some patches is of course 
possible. On the other hand, a full kernel upgrade to 2.6.19 or .20 is not 
possible because this kernel has openvz patches and I don't have them for 
.19 or .20 yet.



This is what I am getting from various utilities:

# dmesg
[0.00] Linux version 2.6.18-028test010 ([EMAIL PROTECTED]) (gcc version 3.4.6 (Gentoo 3.4.6-r2, ssp-3.4.6-1.0, pie-8.7.10)) #3 SMP Thu Jan 18 02:02:53 CET 2007
[0.00] BIOS-provided physical RAM map:
[0.00]  BIOS-e820:  - 0009f000 (usable)
[0.00]  BIOS-e820: 0009f000 - 000a (reserved)
[0.00]  BIOS-e820: 000f - 0010 (reserved)
[0.00]  BIOS-e820: 0010 - 7fffb000 (usable)
[0.00]  BIOS-e820: 7fffb000 -

[PATCH/RFC] alternative approach to: Ban module license tag string termination trick

2007-02-02 Thread Bodo Eggert

This patch changes the module license handling code to:
- allow modules to have multiple licenses
- access GPL symbols if at least one license is GPL-compatible
- prevent the "GPL\0 for nothing"-trick
- fix an off-by-one buffer overflow
  (exploitable only if the attacker can load modules)
- move the ndiswrapper check into the new license checking routine

Signed-Off-By: Bodo Eggert <[EMAIL PROTECTED]>
---

The license handling code was kind of strange:
 - The kernel itself would only consider the first license, while modpost
   looks at all of them.
 - If you offer your module under a non-GPL license in addition to GPL,
   modpost would consider this module to be non-GPL. Therefore you can't
   say MODULE_LICENSE("GPL");\nMODULE_LICENSE("completely free");

Since I had to rewrite this part, I changed the behaviour to accept all 
modules having _at_least_ one GPL-compatible license.


Prohibiting the \0-trick is done by storing the length of the license
behind the license itself, uuencoded, as $=xyz.
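
The idea can be sketched in userspace like this. It is not the patch's code:
the function name is made up, the length is passed in directly instead of
being decoded from the "$=xyz" suffix, and only the list of accepted strings
is taken from license.h. strcmp() stops at the first '\0', so the recorded
length is what exposes bytes hidden behind it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Accept a license only if it matches a known GPL-compatible string AND
 * the recorded length equals the visible string length. A tag such as
 * "GPL\0 for nothing" matches "GPL" for strcmp(), but its recorded
 * length (16) does not match strlen("GPL") (3), so it is rejected.
 */
int license_ok(const char *license, size_t recorded_len)
{
	static const char *const gpl_compatible[] = {
		"GPL",
		"GPL v2",
		"GPL and additional rights",
		"Dual BSD/GPL",
		"Dual MIT/GPL",
		"Dual MPL/GPL",
	};
	size_t i;

	assert(license != NULL);

	for (i = 0; i < sizeof(gpl_compatible) / sizeof(gpl_compatible[0]); i++) {
		if (strcmp(license, gpl_compatible[i]) == 0 &&
		    recorded_len == strlen(gpl_compatible[i]))
			return 1;
	}
	return 0;
}
```

Here the recorded length would come from sizeof on the string literal at
build time, which counts the bytes after the embedded terminator as well.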

Currently, only 18 bits (256 KB) of the length are stored, but storing up
to 30 bits is possible without changing anything besides the macro.

You can still trick this code by including "...\0license=GPL\0$=$\0..." or
by manually fabricating this string into .modinfo. Fix: Document this to
mean that you actually GPL-license the module. 


TODO: get_modinfo: make sure the value returned does not exceed the end of 
  the buffer.


 include/linux/license.h |   27 +---
 include/linux/module.h  |3 -
 include/linux/moduleinfo.h  |   19 +
 include/linux/moduleparam.h |   11 +
 kernel/Makefile |2 
 kernel/module.c |   92 +---
 kernel/moduleinfo.c |   73 ++
 scripts/mod/Makefile|2 
 scripts/mod/modpost.c   |   42 +---
 scripts/mod/moduleinfo.c|3 +
 10 files changed, 195 insertions(+), 79 deletions(-)

diff -X dontdiff -pruN 2.6.19/include/linux/license.h 2.6.19.license/include/linux/license.h
--- 2.6.19/include/linux/license.h  2006-11-29 22:57:37.0 +0100
+++ 2.6.19.license/include/linux/license.h  2007-02-02 18:30:44.0 +0100
@@ -1,14 +1,27 @@
 #ifndef __LICENSE_H
 #define __LICENSE_H
 
-static inline int license_is_gpl_compatible(const char *license)
+static inline int license_is_gpl_compatible(const char *license,
+int length)
 {
-   return (strcmp(license, "GPL") == 0
-   || strcmp(license, "GPL v2") == 0
-   || strcmp(license, "GPL and additional rights") == 0
-   || strcmp(license, "Dual BSD/GPL") == 0
-   || strcmp(license, "Dual MIT/GPL") == 0
-   || strcmp(license, "Dual MPL/GPL") == 0);
+   static char *gpl_compatible[] = {
+   "GPL",
+   "GPL v2",
+   "GPL and additional rights",
+   "Dual BSD/GPL",
+   "Dual MIT/GPL",
+   "Dual MPL/GPL",
+   NULL
+   };
+   char **p = gpl_compatible;
+
+   while (*p) {
+   if(!strcmp(license, *p)
+   && length == strlen(*p))
+   return 1;
+   p++;
+   }
+   return 0;
 }
 
 #endif
diff -X dontdiff -pruN 2.6.19/include/linux/module.h 2.6.19.license/include/linux/module.h
--- 2.6.19/include/linux/module.h   2006-11-29 22:57:37.0 +0100
+++ 2.6.19.license/include/linux/module.h   2007-02-02 23:56:39.0 +0100
@@ -92,6 +92,7 @@ extern struct module __this_module;
 
 /* Generic info of form tag = "info" */
 #define MODULE_INFO(tag, info) __MODULE_INFO(tag, tag, info)
+#define MODULE_INFO_I(tag, info) __MODULE_INFO_I(tag, tag, info)
 
 /* For userspace: you can also call me... */
 #define MODULE_ALIAS(_alias) MODULE_INFO(alias, _alias)
@@ -124,7 +125,7 @@ extern struct module __this_module;
  * 2.  So the community can ignore bug reports including proprietary modules
  * 3.  So vendors can do likewise based on their own policies
  */
-#define MODULE_LICENSE(_license) MODULE_INFO(license, _license)
+#define MODULE_LICENSE(_license) MODULE_INFO_I(license, _license)
 
 /* Author, ideally of form NAME [, NAME ]*[ and NAME ] */
 #define MODULE_AUTHOR(_author) MODULE_INFO(author, _author)
diff -X dontdiff -pruN 2.6.19/include/linux/moduleinfo.h 2.6.19.license/include/linux/moduleinfo.h
--- 2.6.19/include/linux/moduleinfo.h   1970-01-01 01:00:00.0 +0100
+++ 2.6.19.license/include/linux/moduleinfo.h   2007-02-02 20:33:26.0 +0100
@@ -0,0 +1,19 @@
+#ifndef __MODULEINFO_H
+#define __MODULEINFO_H
+
+struct pstring_len {
+   char * s;
+   unsigned long i;
+};
+
+extern void do_get_next_modinfo_len(struct pstring_len *ret,
+char * start,
+unsigned long size,
+const char

Re: [PATCH 2.6.19.2] SCSI sd: udev accessing an uninitialized scsi_disk results in a crash

2007-02-02 Thread Greg KH

On Fri, Feb 02, 2007 at 05:19:24PM -0800, Andrew Morton wrote:
> On Fri, 2 Feb 2007 17:34:56 +0530
> Nagendra Singh Tomar <[EMAIL PROTECTED]> wrote:
> 
> > Hi,
> > sd_probe() calls class_device_add() even before initializing the
> > sdkp->device variable. class_device_add() eventually results in the
> > user-mode udev program being called. The udev program can read the
> > allow_restart attribute of the newly created scsi device. This results
> > in a crash, because the show function for allow_restart (i.e.
> > sd_show_allow_restart) returns the attribute value by reading the
> > sdkp->device->allow_restart variable. As sdkp->device is not
> > initialized before the user-mode hotplug helper is called, this results
> > in a crash.
> > The patch below solves it by calling class_device_add() only after the
> > necessary fields in the scsi_disk structure are initialized properly.
> > 
> > 
> > 
> > --- linux-2.6.19.2/drivers/scsi/sd.c.orig   2007-02-02 17:03:03.0 +0530
> > +++ linux-2.6.19.2/drivers/scsi/sd.c        2007-02-02 17:04:04.0 +0530
> > @@ -1646,16 +1646,6 @@ static int sd_probe(struct device *dev)
> > if (error)
> > goto out_put;
> >  
> > -   class_device_initialize(>cdev);
> > -   sdkp->cdev.dev = >sdev_gendev;
> > -   sdkp->cdev.class = _disk_class;
> > -   strncpy(sdkp->cdev.class_id, sdp->sdev_gendev.bus_id, BUS_ID_SIZE);
> > -
> > -   if (class_device_add(>cdev))
> > -   goto out_put;
> > -
> > -   get_device(>sdev_gendev);
> > -
> > sdkp->device = sdp;
> > sdkp->driver = _template;
> > sdkp->disk = gd;
> > @@ -1669,6 +1659,16 @@ static int sd_probe(struct device *dev)
> > sdp->timeout = SD_MOD_TIMEOUT;
> > }
> >  
> > +   class_device_initialize(>cdev);
> > +   sdkp->cdev.dev = >sdev_gendev;
> > +   sdkp->cdev.class = _disk_class;
> > +   strncpy(sdkp->cdev.class_id, sdp->sdev_gendev.bus_id, BUS_ID_SIZE);
> > +
> > +   if (class_device_add(>cdev))
> > +   goto out_put;
> > +
> > +   get_device(>sdev_gendev);
> > +
> > gd->major = sd_major((index & 0xf0) >> 4);
> > gd->first_minor = ((index & 0xf) << 4) | (index & 0xfff00);
> > gd->minors = 16;
> 
> Thanks - I'll queue this up for 2.6.20 also.

No objection from me, as long as James says this is ok.

I wonder why we haven't noticed this in the past?

thanks,

greg k-h

Re: [patch 1/9] fs: libfs buffered write leak fix

2007-02-02 Thread Nick Piggin

On Fri, Feb 02, 2007 at 05:58:01PM -0800, Andrew Morton wrote:
> On Sat, 3 Feb 2007 02:33:16 +0100
> Nick Piggin <[EMAIL PROTECTED]> wrote:
> 
> > I think just setting page uptodate in commit_write might do the
> > trick? (and getting rid of the set_page_dirty there).
> 
> Yes, the page just isn't uptodate yet in prepare_write() - moving things
> to commit_write() sounds sane.
> 
> But please, can we have sufficient changelogs and comments in the next 
> version?

You're right, sorry. Is this any better? (warning: nobh code is untested)

--
simple_prepare_write and nobh_prepare_write leak uninitialised kernel data.
This happens because the prepare_write functions leave an uninitialised
"hole" over the part of the page that the write is expected to go to. This
is fine, but they then mark the page uptodate, which means a concurrent read
can come in and copy the uninitialised memory into userspace before it is
written to.

Fix simple_prepare_write by simply initialising the whole page in the case
of a partial-page write. In the case of a full-page write, we don't
SetPageUptodate until commit_write time.
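
The ordering rule can be shown with a toy userspace model. This is a sketch
under stated assumptions, not the libfs code: struct toy_page, PAGE_SZ and
the toy_* functions are made up, and the uptodate flag stands for "a
concurrent reader may now copy data[] to userspace".

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define PAGE_SZ 4096u	/* stand-in for PAGE_CACHE_SIZE */

/* Toy model of a page cache page (not the kernel's struct page). */
struct toy_page {
	unsigned char data[PAGE_SZ];
	bool uptodate;	/* readers may copy data[] once this is set */
};

/* Before the copy: a partial-page write gets a fully zeroed page. */
void toy_prepare_write(struct toy_page *pg, unsigned from, unsigned to)
{
	assert(from <= to && to <= PAGE_SZ);

	if (pg->uptodate)
		return;
	if (to - from != PAGE_SZ) {
		/* No stale bytes remain, so marking it uptodate is safe. */
		memset(pg->data, 0, PAGE_SZ);
		pg->uptodate = true;
	}
	/* Full-page write: leave it !uptodate until the data is in. */
}

/* After the copy: a full-page write has now filled every byte. */
void toy_commit_write(struct toy_page *pg, unsigned from, unsigned to)
{
	assert(from <= to && to <= PAGE_SZ);

	if (to - from == PAGE_SZ)
		pg->uptodate = true;
}
```

In both paths the flag is only set once every byte of the page holds either
zeroes or data supplied by the writer, so a reader never sees leftover
memory.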

Signed-off-by: Nick Piggin <[EMAIL PROTECTED]>

Index: linux-2.6/fs/libfs.c
===
--- linux-2.6.orig/fs/libfs.c
+++ linux-2.6/fs/libfs.c
@@ -327,25 +327,32 @@ int simple_readpage(struct file *file, s
 int simple_prepare_write(struct file *file, struct page *page,
unsigned from, unsigned to)
 {
-   if (!PageUptodate(page)) {
-   if (to - from != PAGE_CACHE_SIZE) {
-   void *kaddr = kmap_atomic(page, KM_USER0);
-   memset(kaddr, 0, from);
-   memset(kaddr + to, 0, PAGE_CACHE_SIZE - to);
-   flush_dcache_page(page);
-   kunmap_atomic(kaddr, KM_USER0);
-   }
+   if (PageUptodate(page))
+   return 0;
+
+   if (to - from != PAGE_CACHE_SIZE) {
+   /*
+* Partial-page write? Initialise the complete page and
+* set it uptodate. We could avoid initialising the
+* (from, to) hole, and opt to mark it uptodate in
+* simple_commit_write, but that's probably only a win
+* for filesystems that would need to read blocks off disk.
+*/
+   memclear_highpage_flush(page, 0, PAGE_CACHE_SIZE);
SetPageUptodate(page);
}
+
return 0;
 }
 
 int simple_commit_write(struct file *file, struct page *page,
-   unsigned offset, unsigned to)
+   unsigned from, unsigned to)
 {
struct inode *inode = page->mapping->host;
loff_t pos = ((loff_t)page->index << PAGE_CACHE_SHIFT) + to;
 
+   if (to - from == PAGE_CACHE_SIZE)
+   SetPageUptodate(page);
/*
 * No need to use i_size_read() here, the i_size
 * cannot change under us because we hold the i_mutex.
@@ -353,6 +360,7 @@ int simple_commit_write(struct file *fil
if (pos > inode->i_size)
i_size_write(inode, pos);
set_page_dirty(page);
+
return 0;
 }
 
Index: linux-2.6/fs/buffer.c
===
--- linux-2.6.orig/fs/buffer.c
+++ linux-2.6/fs/buffer.c
@@ -2344,17 +2344,6 @@ int nobh_prepare_write(struct page *page
 
 	if (is_mapped_to_disk)
 		SetPageMappedToDisk(page);
-	SetPageUptodate(page);
-
-	/*
-	 * Setting the page dirty here isn't necessary for the prepare_write
-	 * function - commit_write will do that.  But if/when this function is
-	 * used within the pagefault handler to ensure that all mmapped pages
-	 * have backing space in the filesystem, we will need to dirty the page
-	 * if its contents were altered.
-	 */
-	if (dirtied_it)
-		set_page_dirty(page);
 
 	return 0;
 
@@ -2384,6 +2373,7 @@ int nobh_commit_write(struct file *file,
 	struct inode *inode = page->mapping->host;
 	loff_t pos = ((loff_t)page->index << PAGE_CACHE_SHIFT) + to;
 
+	SetPageUptodate(page);
 	set_page_dirty(page);
 	if (pos > inode->i_size) {
 		i_size_write(inode, pos);
Index: linux-2.6/Documentation/filesystems/vfs.txt
===
--- linux-2.6.orig/Documentation/filesystems/vfs.txt
+++ linux-2.6/Documentation/filesystems/vfs.txt
@@ -617,6 +617,11 @@ struct address_space_operations {
In this case the prepare_write will be retried one the lock is
regained.
 
+   Note: the page _must not_ be marked uptodate in this function
+   (or anywhere else) unless it actually is uptodate right now. As
+   soon as a page is marked uptodate, it is possible for a concurrent
+   read(2) to copy it to userspace.
+
   commit_write: If prepare_write

Re: 2.6.20-rc7: known regressions

2007-02-02 Thread Jeff Garzik


Andrew Morton wrote:
> On Fri, 2 Feb 2007 06:49:16 +0100
> Adrian Bunk <[EMAIL PROTECTED]> wrote:
>
>> This email lists some known regressions in 2.6.20-rc7 compared to 2.6.19
>> that are not yet fixed in Linus' tree.
>
> There are still a few things hanging around.
>
> I have these queued:
>
> aio-fix-buggy-put_ioctx-call-in-aio_complete-v2.patch
> kexec-avoid-migration-of-already-disabled-irqs-ia64.patch
> net-smc911x-match-up-spin-lock-unlock.patch
> rtc-pcf8563-detect-polarity-of-century-bit-automatically.patch
> alpha-fix-epoll-syscall-enumerations.patch
> revert-blockdev-direct-io-back-to-2619-version.patch
> scsi-sd-udev-accessing-an-uninitialized-scsi_disk-results-in-a-crash.patch
> altix-more-acpi-prt-support.patch

Would you forward the x86-64 dma_noncoherent API build fix I posted?
Anything that uses that API won't build on x86-64 without my [simple and
obvious] patch.

> - I have r8169-fix-a-race-between-pci-probe-and-dev_open.patch floating
>   about, but I forget its status.

I posted a preferred patch (which someone then noted needs to use
setup_timer), and am waiting for an "it works" response of some sort.

Jeff


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 2/2] x86_64 irq: Handle irqs pending in IRR during irq migration.

2007-02-02 Thread Andrew Morton

On Fri, 02 Feb 2007 18:39:15 -0700
[EMAIL PROTECTED] (Eric W. Biederman) wrote:

> Andrew Morton <[EMAIL PROTECTED]> writes:
> 
> > So is this a for-2.6.20 thing?  The bug was present in 2.6.19, so
> > I assume it doesn't affect many people?
> 
> If it's not too late, and this patch isn't too scary.
> 
> It's a really rare set of circumstances that triggers it, but the
> possibility of being hit is pretty widespread: anything with
> more than one cpu and more than one irq could see this.
> 
> The easiest way to trigger this is to have two level-triggered irqs on
> two different cpus using the same vector.  In that case, if one acks
> its irq while the other irq is migrating to a different cpu, 2.6.19
> gets completely confused and stops handling interrupts properly.
> 
> With my previous bug fix (not to drop the ack when we are confused)
> the machine will stay up, and that is obviously correct and can't
> affect anything else so is probably a candidate for the stable tree.
> 
> With this fix everything just works.
> 
> I don't know how often a legitimate case of the exact same irq
> going off twice in a row is, but that is a possibility as well
> especially with edge triggered interrupts.
> 
> Setting up the test scenario was a pain, but by extremely limiting
> my choice of vectors I was able to confirm I survived several hundred
> of these events within a couple of minutes with no problem.
> 

OK, thanks.  Let's await Andi's feedback.

Re: 2.6.20-rc7: known regressions

2007-02-02 Thread Andrew Morton

On Fri, 2 Feb 2007 06:49:16 +0100
Adrian Bunk <[EMAIL PROTECTED]> wrote:

> This email lists some known regressions in 2.6.20-rc7 compared to 2.6.19
> that are not yet fixed in Linus' tree.

There are still a few things hanging around.

I have these queued:

aio-fix-buggy-put_ioctx-call-in-aio_complete-v2.patch
kexec-avoid-migration-of-already-disabled-irqs-ia64.patch
net-smc911x-match-up-spin-lock-unlock.patch
rtc-pcf8563-detect-polarity-of-century-bit-automatically.patch
alpha-fix-epoll-syscall-enumerations.patch
revert-blockdev-direct-io-back-to-2619-version.patch
scsi-sd-udev-accessing-an-uninitialized-scsi_disk-results-in-a-crash.patch
altix-more-acpi-prt-support.patch

which I'll get through to Linus later today.  Plus:


- x86_64-irq-simplfy-__assign_irq_vector.patch and
  x86_64-irq-handle-irqs-pending-in-irr-during-irq-migration.patch which
  are big and scary.  Am awaiting feedback from Andi and Eric on what to do
  with these.

- A fix from Trond for http://bugzilla.kernel.org/show_bug.cgi?id=7923. 
  Am awaiting acks to merge that.

- sky2-flow-control-off.patch from shemminger which I assume Linus will
  be merging anyway.

- v9fs_vfs_mkdir-fix-a-double-free.patch which I guess I'll merge unless
  Eric suddenly nacks it.

- I have r8169-fix-a-race-between-pci-probe-and-dev_open.patch floating
  about, but I forget its status.  

- I have efi-x86-pass-firmware-call-parameters-on-the-stack.patch, but
  I'm not sure it's right and unless something really rapid happens, we'll
  ship with that bug unfixed.

- enable-mouse-button-23-emulation-for-x86-macs.patch looks simple
  enough, but I'm waiting for Ben to wake up.

- x86-fix-vdso-mapping-for-aout-executables.patch probably works OK, but
  Andi points out that it'd be better to implement this with
  attribute-weak.  So I guess 2.6.20 will ship with non-functional a.out on
  i386, like 2.6.19.


Re: [patch 1/9] fs: libfs buffered write leak fix

2007-02-02 Thread Andrew Morton

On Sat, 3 Feb 2007 02:33:16 +0100
Nick Piggin <[EMAIL PROTECTED]> wrote:

> > > ===
> > > --- linux-2.6.orig/fs/buffer.c
> > > +++ linux-2.6/fs/buffer.c
> > > @@ -2344,6 +2344,8 @@ int nobh_prepare_write(struct page *page
> > >  
> > >   if (is_mapped_to_disk)
> > >   SetPageMappedToDisk(page);
> > > +
> > > + /* XXX: information leak vs read(2) */
> > >   SetPageUptodate(page);
> > >  
> > >   /*
> > 
> > That comment is too terse to be useful.
> 
> OK, similar problem here - we have brought all the buffers uptodate
> that we are *not* going to write over, or partially write over, but
> we can have an uninitialised hole over the region we want to write.
> 
> I think just setting page uptodate in commit_write might do the
> trick? (and getting rid of the set_page_dirty there).

Yes, the page just isn't uptodate yet in prepare_write() - moving things
to commit_write() sounds sane.

But please, can we have sufficient changelogs and comments in the next version?

Re: SATA exceptions with 2.6.20-rc5

2007-02-02 Thread Björn Steinbrink

On 2007.01.24 01:39:23 +0100, Björn Steinbrink wrote:
> On 2007.01.23 17:18:43 -0600, Robert Hancock wrote:
> > Larry Walton wrote:
> > > >The last patch (sata_nv-force-int-dev-in-interrupt.patch)
> > > >seems to have fixed the problem.  Much appreciated,
> > > >thank you. I'd consider it a must have in 2.6.20.
> > 
> > Can any of the rest of you that have been seeing this problem also 
> > confirm that this fixes it?
> 
> Seems to work for me, uptime is about an hour now and no exception yet.
> Had the stress test running for only about 10 minutes, but I usually got
> an exception within an hour even during plain irssi usage, so I'm quite
> confident that the patch fixes it.

Or maybe not :( Just got an exception on 2.6.20-rc6. Took 4 days of
uptime to trigger, so it's just a lot harder to trigger now.

Björn

Re: [PATCH 2/2] x86_64 irq: Handle irqs pending in IRR during irq migration.

2007-02-02 Thread Eric W. Biederman

Andrew Morton <[EMAIL PROTECTED]> writes:

> So is this a for-2.6.20 thing?  The bug was present in 2.6.19, so
> I assume it doesn't affect many people?

If it's not too late, and this patch isn't too scary.

It's a really rare set of circumstances that triggers it, but the
possibility of being hit is pretty widespread: anything with
more than one cpu and more than one irq could see this.

The easiest way to trigger this is to have two level-triggered irqs on
two different cpus using the same vector.  In that case, if one acks
its irq while the other irq is migrating to a different cpu, 2.6.19
gets completely confused and stops handling interrupts properly.

With my previous bug fix (not to drop the ack when we are confused)
the machine will stay up, and that is obviously correct and can't
affect anything else so is probably a candidate for the stable tree.
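
To illustrate why acking matters even when the vector lookup fails, here is
a toy userspace model. It is only a sketch, not the kernel's do_IRQ: every
name in it (model_cpu, model_do_irq, the in_service flag) is made up for the
example, and it simplifies the local APIC by letting one unacked interrupt
block all further delivery, where real hardware only blocks vectors of equal
or lower priority.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NR_VECTORS 8
#define MODEL_NO_IRQ     (-1)

/* Hypothetical per-cpu state; not a kernel structure. */
struct model_cpu {
	int  vector_irq[MODEL_NR_VECTORS]; /* per-cpu vector -> irq table */
	bool in_service;                   /* an unacked interrupt is outstanding */
	int  handled;                      /* interrupts dispatched to a handler */
	int  dropped;                      /* interrupts with no irq mapping */
};

static void model_cpu_init(struct model_cpu *c)
{
	for (int v = 0; v < MODEL_NR_VECTORS; v++)
		c->vector_irq[v] = MODEL_NO_IRQ;
	c->in_service = false;
	c->handled = 0;
	c->dropped = 0;
}

/*
 * Deliver one interrupt on `vector`. Returns false if the cpu could not
 * accept it because an earlier interrupt was never acked.
 */
static bool model_do_irq(struct model_cpu *c, int vector, bool ack_when_confused)
{
	assert(vector >= 0 && vector < MODEL_NR_VECTORS);

	if (c->in_service)
		return false;           /* wedged: nothing gets through */
	c->in_service = true;

	if (c->vector_irq[vector] != MODEL_NO_IRQ) {
		c->handled++;
		c->in_service = false;  /* the normal handler path acks */
	} else {
		c->dropped++;           /* no mapping: the irq message is lost */
		if (ack_when_confused)
			c->in_service = false;
	}
	return true;
}
```

With ack_when_confused set to false, a single interrupt that arrives after
its mapping has been removed leaves the model cpu unable to accept anything
else; with it set to true the interrupt is still lost but delivery continues.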

With this fix everything just works.

I don't know how often a legitimate case of the exact same irq
going off twice in a row is, but that is a possibility as well
especially with edge triggered interrupts.

Setting up the test scenario was a pain, but by extremely limiting
my choice of vectors I was able to confirm I survived several hundred
of these events within a couple of minutes with no problem.

Eric

Re: How many people are using 2.6.16?

2007-02-02 Thread Adrian Bunk

On Thu, Feb 01, 2007 at 03:13:03PM +0300, Vladimir V. Saveliev wrote:
> Hello

Hi Vladimir,

> On Wednesday 31 January 2007 10:02, Adrian Bunk wrote:
>...
> > reiserfs:
> > commit de14569f94513279e3d44d9571a421e9da1759ae
> >   [PATCH] resierfs: avoid tail packing if an inode was ever mmapped
> > backport to 2.6.16 required
> 
> Here it goes:
>...

thanks a lot, applied to 2.6.16.

cu
Adrian

-- 

   "Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   "Only a promise," Lao Er said.
   Pearl S. Buck - Dragon Seed


Re: [PATCH 1/1] - Altix: more ACPI PRT support

2007-02-02 Thread Andrew Morton

On Fri, 02 Feb 2007 14:54:12 -0600
John Keller <[EMAIL PROTECTED]> wrote:

> The SN Altix platform does not conform to the 
> IOSAPIC IRQ routing model. Add code in acpi_unregister_gsi()
> to check if (acpi_irq_model == ACPI_IRQ_MODEL_PLATFORM) and
> return.
> 
> Signed-off-by: John Keller <[EMAIL PROTECTED]>
> ---
> 
> Due to an oversight, this code was not added previously when
> similar code was added to acpi_register_gsi().
> 
> http://marc.theaimsgroup.com/?l=linux-acpi&m=116680983430121&w=2
> 
>  arch/ia64/kernel/acpi.c |3 +++
>  1 file changed, 3 insertions(+)
> 
> 
> Index: linux-2.6/arch/ia64/kernel/acpi.c
> ===
> --- linux-2.6.orig/arch/ia64/kernel/acpi.c 2007-02-02 14:44:31.0 -0600
> +++ linux-2.6/arch/ia64/kernel/acpi.c 2007-02-02 14:47:44.658143727 -0600
> @@ -609,6 +609,9 @@ EXPORT_SYMBOL(acpi_register_gsi);
>  
>  void acpi_unregister_gsi(u32 gsi)
>  {
> + if (acpi_irq_model == ACPI_IRQ_MODEL_PLATFORM)
> + return;
> +
>   iosapic_unregister_intr(gsi);
>  }

Given that the December 22 patch appears to be in mainline, and that this
patch is simple, I shall cheerily bypass maintainers and send it in for
2.6.20.


Re: [patch 9/9] mm: fix pagecache write deadlocks

2007-02-02 Thread Nick Piggin

On Fri, Feb 02, 2007 at 03:53:11PM -0800, Andrew Morton wrote:
> On Mon, 29 Jan 2007 11:33:03 +0100 (CET)
> Nick Piggin <[EMAIL PROTECTED]> wrote:
> 
> > Modify the core write() code so that it won't take a pagefault while holding a
> > lock on the pagecache page. There are a number of different deadlocks possible
> > if we try to do such a thing:
> > 
> > 1.  generic_buffered_write
> > 2.   lock_page
> > 3.    prepare_write
> > 4.     unlock_page+vmtruncate
> > 5.     copy_from_user
> > 6.      mmap_sem(r)
> > 7.       handle_mm_fault
> > 8.        lock_page (filemap_nopage)
> > 9.    commit_write
> > 10.  unlock_page
> > 
> > a. sys_munmap / sys_mlock / others
> > b.  mmap_sem(w)
> > c.   make_pages_present
> > d.    get_user_pages
> > e.     handle_mm_fault
> > f.      lock_page (filemap_nopage)
> > 
> > 2,8 - recursive deadlock if page is same
> > 2,8;2,8 - ABBA deadlock if page is different
> > 2,6;b,f - ABBA deadlock if page is same
> > 
> > The solution is as follows:
> > 1.  If we find the destination page is uptodate, continue as normal, but use
> >     atomic usercopies which do not take pagefaults and do not zero the uncopied
> >     tail of the destination. The destination is already uptodate, so we can
> >     commit_write the full length even if there was a partial copy: it does not
> >     matter that the tail was not modified, because if it is dirtied and written
> >     back to disk it will not cause any problems (uptodate *means* that the
> >     destination page is as new or newer than the copy on disk).
> > 
> > 1a. The above requires that fault_in_pages_readable correctly returns access
> >     information, because atomic usercopies cannot distinguish between
> >     non-present pages in a readable mapping, from lack of a readable mapping.
> > 
> > 2.  If we find the destination page is non uptodate, unlock it (this could be
> >     made slightly more optimal), then find and pin the source page with
> >     get_user_pages. Relock the destination page and continue with the copy.
> >     However, instead of a usercopy (which might take a fault), copy the data
> >     via the kernel address space.
> > 
> > 
> 
> Oh what a mess we're making :(
> 
> Unfortunately, write() into a non-uptodate page is very much the common
> case.  We've always tried to avoid doing a pte-walk in the write() path to
> fix this bug.  Careful performance testing is needed here so we can assess
> the impact.  For threaded applications, simply the taking of mmap_sem might
> be the biggest problem.
> 
> And I can't think of any tricks we can play to avoid doing the pte-walk in
> most cases.  For example, we don't yet have a page to run page_mapped()
> against.

After this patch series, I am working on another that will allow filesystems
to specifically code around the problem (eg. by handling short usercopies
properly).

I tried to take this approach generically the first time, but it turns out
lots of filesystems had subtle problems, so if we do it this way instead,
then filesystem developers who actually care enough can improve their
code, and those that don't won't hold them back (or prevent this bug from
being fixed).
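
The short-usercopy reasoning in the quoted changelog can be modelled in
userspace. This is only a sketch under stated assumptions: model_page,
model_atomic_copy and model_can_commit_full are hypothetical names, and the
"fault" is simulated by a byte limit rather than a real page fault.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MODEL_PAGE_SIZE 8

/* Hypothetical stand-in for a pagecache page. */
struct model_page {
	char data[MODEL_PAGE_SIZE];
	bool uptodate;
};

/*
 * Stand-in for an atomic usercopy: copies at most `ok` bytes, as if a fault
 * hit after that point, and does not zero the uncopied tail.
 * Returns the number of bytes actually copied.
 */
static size_t model_atomic_copy(char *dst, const char *src, size_t len, size_t ok)
{
	assert(dst != NULL && src != NULL);
	size_t n = len < ok ? len : ok;
	memcpy(dst, src, n);
	return n;
}

/*
 * Is it safe to commit the full `len` bytes when only `copied` arrived?
 * On an uptodate page the untouched tail is still valid old data, so yes.
 * On a non-uptodate page the tail is garbage, so only a complete copy will do.
 */
static bool model_can_commit_full(const struct model_page *p, size_t len, size_t copied)
{
	return copied == len || p->uptodate;
}
```

In the model, a short copy into an uptodate page leaves the old bytes in
place past the copied prefix, which is why committing the full length is
harmless there; the same short copy into a non-uptodate page must not be
committed as-is.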

> > break;
> > }
> >  
> > +   /*
> > +* non-uptodate pages cannot cope with short copies, and we
> > +* cannot take a pagefault with the destination page locked.
> > +* So pin the source page to copy it.
> > +*/
> > +   if (!PageUptodate(page)) {
> > +   unlock_page(page);
> > +
> > +   bytes = min(bytes, PAGE_CACHE_SIZE -
> > +((unsigned long)buf & ~PAGE_CACHE_MASK));
> > +
> > +   /*
> > +* Cannot get_user_pages with a page locked for the
> > +* same reason as we can't take a page fault with a
> > +* page locked (as explained below).
> > +*/
> > +   down_read(&current->mm->mmap_sem);
> > +   status = get_user_pages(current, current->mm,
> > +   (unsigned long)buf & PAGE_CACHE_MASK, 1,
> > +   0, 0, &src_page, NULL);
> > +   up_read(&current->mm->mmap_sem);
> > +   if (status != 1) {
> > +   page_cache_release(page);
> > +   break;
> > +   }
> > +
> > +   lock_page(page);
> > +   if (!page->mapping) {
> 
> Hopefully this can't happen?  If it can, who went and took our page off the
> mapping?  Reclaim?  The elevated page_count will prevent that?

Truncate/invalidate?

> > +   unlock_page(page);
> > +   page_cache_release(page);
> > +   page_cache_release(src_page);
> > +   continue;
> > +   }
> > +

Re: [PATCH] Ban module license tag string termination trick

2007-02-02 Thread Jan Engelhardt


On Feb 2 2007 17:12, Randy Dunlap wrote:
>> >> >if (MODULE_LICENSE_contains_null(license))
>> >> > printk(KERN_WARNING "this module's license is suspicious\n");
>> 
>> Whatever, I just want to see how you are going to implement
>> MODULE_LICENSE_contains_null.
>
>I was busy on other things this morning (my time).
>Now I have looked and I see what you mean.  ;)
>
>I think it's possible, but it requires digging/learning about
>Elf headers.

That's what I did...
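
For readers wondering what such a check could look like, here is a minimal
sketch. It is not the implementation from the patch under discussion (which
is not shown in this thread excerpt); both the macro and the function name
are hypothetical. The macro relies on the fact that sizeof on a string
literal counts every byte, including anything after an embedded NUL, while
strlen stops at the first NUL. The function does the same comparison for a
raw buffer of known length, which is closer to what a check on bytes read
out of an ELF section would need.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Only meaningful for string literals or char arrays, never for pointers. */
#define LICENSE_HAS_EMBEDDED_NUL(lit) (sizeof(lit) - 1 != strlen(lit))

/*
 * Runtime variant for `len` bytes that are expected to hold exactly one
 * NUL-terminated string. Returns true if a NUL appears before the last byte.
 */
static bool model_license_contains_null(const char *buf, size_t len)
{
	assert(buf != NULL);
	const char *nul = memchr(buf, '\0', len);
	return nul != NULL && (size_t)(nul - buf) != len - 1;
}
```

A real .modinfo section holds several NUL-terminated key=value strings back
to back, so a check there would first have to find the bounds of the license
entry; the sketch skips that step.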



Jan
-- 
ft: http://freshmeat.net/p/chaostables/

Re: [patch 1/9] fs: libfs buffered write leak fix

2007-02-02 Thread Nick Piggin

On Fri, Feb 02, 2007 at 03:52:36PM -0800, Andrew Morton wrote:
> On Mon, 29 Jan 2007 11:31:46 +0100 (CET)
> Nick Piggin <[EMAIL PROTECTED]> wrote:
> 
> > simple_prepare_write and nobh_prepare_write leak uninitialised kernel data.
> 
> They do?  Under what situation?

Yes, I have at least reproduced the libfs leak.

The situation is when you write into a !uptodate page, the prepare_write
function runs SetPageUptodate *before* we have copied data in. Thus you
can read uninitialised data out of there.

SetPageUptodate must not be used (or at least used carefully) in
prepare_write. commit_write is the correct place to do this.

> > Fix the former,
> 
> How?

If doing a partial-write, simply clear the whole page and set it uptodate
(don't need to get too tricky). If doing a full-write, only set it uptodate
in the commit_write.
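
The ordering problem and the fix can be modelled in userspace. The sketch
below is not kernel code: model_page and the model_* functions are
hypothetical names, the "page" is 16 bytes, and 0xAA stands in for whatever
stale data a freshly allocated page might contain.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MODEL_PAGE_SIZE 16   /* tiny "page" so the example stays readable */
#define MODEL_STALE     0xAA /* stands in for leftover kernel memory */

struct model_page {
	unsigned char data[MODEL_PAGE_SIZE];
	bool uptodate;
};

/* A freshly allocated page: full of stale bytes and not uptodate. */
static void model_page_alloc(struct model_page *p)
{
	memset(p->data, MODEL_STALE, MODEL_PAGE_SIZE);
	p->uptodate = false;
}

/* Old ordering: zero only outside [from, to), then mark uptodate at once. */
static void model_old_prepare_write(struct model_page *p, unsigned from, unsigned to)
{
	assert(from <= to && to <= MODEL_PAGE_SIZE);
	if (p->uptodate)
		return;
	if (to - from != MODEL_PAGE_SIZE) {
		memset(p->data, 0, from);
		memset(p->data + to, 0, MODEL_PAGE_SIZE - to);
	}
	p->uptodate = true; /* [from, to) still holds stale bytes here */
}

/* New ordering: a partial write clears the whole page before marking it. */
static void model_new_prepare_write(struct model_page *p, unsigned from, unsigned to)
{
	assert(from <= to && to <= MODEL_PAGE_SIZE);
	if (p->uptodate)
		return;
	if (to - from != MODEL_PAGE_SIZE) {
		memset(p->data, 0, MODEL_PAGE_SIZE);
		p->uptodate = true;
	}
}

/* A full-page write is marked uptodate only after the data was copied in. */
static void model_new_commit_write(struct model_page *p, unsigned from, unsigned to)
{
	if (to - from == MODEL_PAGE_SIZE)
		p->uptodate = true;
}

/* What a concurrent read(2) could observe: only uptodate pages are copied. */
static bool model_reader_sees_stale(const struct model_page *p)
{
	if (!p->uptodate)
		return false;
	for (unsigned i = 0; i < MODEL_PAGE_SIZE; i++)
		if (p->data[i] == MODEL_STALE)
			return true;
	return false;
}
```

In the model, a reader that runs between prepare and commit can see stale
bytes only with the old ordering; with the new one the page is either fully
zeroed or not yet visible.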

> > make a note of the latter. Several other filesystems seem
> > to be iffy here, too.
> 
> Please, tell us what the bug is so that others have a chance of reviewing
> and, if needed, fixing those other filesystems.
> 
> > --- linux-2.6.orig/fs/libfs.c
> > +++ linux-2.6/fs/libfs.c
> > @@ -327,32 +327,35 @@ int simple_readpage(struct file *file, s
> >  int simple_prepare_write(struct file *file, struct page *page,
> > unsigned from, unsigned to)
> >  {
> > -   if (!PageUptodate(page)) {
> > -   if (to - from != PAGE_CACHE_SIZE) {
> > -   void *kaddr = kmap_atomic(page, KM_USER0);
> > -   memset(kaddr, 0, from);
> > -   memset(kaddr + to, 0, PAGE_CACHE_SIZE - to);
> > -   flush_dcache_page(page);
> > -   kunmap_atomic(kaddr, KM_USER0);
> > -   }
> > +   if (PageUptodate(page))
> > +   return 0;
> > +
> > +   if (to - from != PAGE_CACHE_SIZE) {
> > +   clear_highpage(page);
> > +   flush_dcache_page(page);
> > SetPageUptodate(page);
> > }
> 
> memclear_highpage_flush() is fashionable.

Good one.

> > ===
> > --- linux-2.6.orig/fs/buffer.c
> > +++ linux-2.6/fs/buffer.c
> > @@ -2344,6 +2344,8 @@ int nobh_prepare_write(struct page *page
> >  
> > if (is_mapped_to_disk)
> > SetPageMappedToDisk(page);
> > +
> > +   /* XXX: information leak vs read(2) */
> > SetPageUptodate(page);
> >  
> > /*
> 
> That comment is too terse to be useful.

OK, similar problem here - we have brought all the buffers uptodate
that we are *not* going to write over, or partially write over, but
we can have an uninitialised hole over the region we want to write.

I think just setting page uptodate in commit_write might do the
trick? (and getting rid of the set_page_dirty there).

Re: [BUG] Unable to handle kernel NULL pointer dereference...as_move_to_dispatch+0x11/0x135

2007-02-02 Thread Randy Dunlap

On Fri, 2 Feb 2007 16:25:41 -0800 Andrew Morton wrote:

> On Fri, 2 Feb 2007 12:56:30 -0800
> Andrew Vasquez <[EMAIL PROTECTED]> wrote:
> 
> > > > dt of=/dev/raw/raw1 procs=8 oncerr=abort bs=16k disable=stats 
> > > > limit=2m passes=100 pattern=iot dlimit=2048
> 
> What is this mysterious dt command, btw?

I expect that it's the one here:
http://www.scsifaq.org/RMiller_Tools/index.html

---
~Randy

Re: [patch 0/9] buffered write deadlock fix

2007-02-02 Thread Nick Piggin

On Fri, Feb 02, 2007 at 03:52:32PM -0800, Andrew Morton wrote:
> On Mon, 29 Jan 2007 11:31:37 +0100 (CET)
> Nick Piggin <[EMAIL PROTECTED]> wrote:
> 
> > The following set of patches attempt to fix the buffered write
> > locking problems (and there are a couple of peripheral patches
> > and cleanups there too).
> > 
> > Patches against 2.6.20-rc6. I was hoping that 2.6.20-rc6-mm2 would
> > be an easier diff with the fsaio patches gone, but the readahead
> > rewrite clashes badly :(
> 
> Well fsaio is restored, but there's now considerable doubt over it due to
> the recent febril febrility.
> 
> How bad is the clash with the readahead patches?

I don't think it would be so bad that one couldn't merge readahead
back on top quite easily... The fsaio ones are a little harder because
they change generic_file_buffered_write.

> Clashes with git-block are likely, too.
> 
> Bugfixes come first, so I will drop readahead and fsaio and git-block to get
> this work completed if needed - please work against mainline.

OK.

Re: [PATCH 2.6.19.2] SCSI sd: udev accessing an uninitialized scsi_disk results in a crash

2007-02-02 Thread Andrew Morton

On Fri, 2 Feb 2007 17:34:56 +0530
Nagendra Singh Tomar <[EMAIL PROTECTED]> wrote:

> Hi,
>   sd_probe() calls class_device_add() even before initializing the
> sdkp->device variable. class_device_add() eventually results in the user mode
> udev program being called. The udev program can read the allow_restart
> attribute of the newly created scsi device. This results in a crash, because
> the show function for allow_restart (i.e. sd_show_allow_restart) returns the
> attribute value by reading the sdkp->device->allow_restart variable, and
> sdkp->device is not initialized before the user mode hotplug helper is called.
>   The patch below solves it by calling class_device_add() only after the
> necessary fields in the scsi_disk structure are initialized properly.
> 
> 
> 
> --- linux-2.6.19.2/drivers/scsi/sd.c.orig 2007-02-02 17:03:03.0 +0530
> +++ linux-2.6.19.2/drivers/scsi/sd.c  2007-02-02 17:04:04.0 +0530
> @@ -1646,16 +1646,6 @@ static int sd_probe(struct device *dev)
>  	if (error)
>  		goto out_put;
>  
> -	class_device_initialize(&sdkp->cdev);
> -	sdkp->cdev.dev = &sdp->sdev_gendev;
> -	sdkp->cdev.class = &sd_disk_class;
> -	strncpy(sdkp->cdev.class_id, sdp->sdev_gendev.bus_id, BUS_ID_SIZE);
> -
> -	if (class_device_add(&sdkp->cdev))
> -		goto out_put;
> -
> -	get_device(&sdp->sdev_gendev);
> -
>  	sdkp->device = sdp;
>  	sdkp->driver = &sd_template;
>  	sdkp->disk = gd;
> @@ -1669,6 +1659,16 @@ static int sd_probe(struct device *dev)
>  		sdp->timeout = SD_MOD_TIMEOUT;
>  	}
>  
> +	class_device_initialize(&sdkp->cdev);
> +	sdkp->cdev.dev = &sdp->sdev_gendev;
> +	sdkp->cdev.class = &sd_disk_class;
> +	strncpy(sdkp->cdev.class_id, sdp->sdev_gendev.bus_id, BUS_ID_SIZE);
> +
> +	if (class_device_add(&sdkp->cdev))
> +		goto out_put;
> +
> +	get_device(&sdp->sdev_gendev);
> +
> +
>   gd->major = sd_major((index & 0xf0) >> 4);
>   gd->first_minor = ((index & 0xf) << 4) | (index & 0xfff00);
>   gd->minors = 16;

Thanks - I'll queue this up for 2.6.20 also.
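
The ordering issue behind the quoted patch can be shown with a small
userspace model. This is a sketch only: none of the model_* names exist in
the kernel, registration is reduced to an immediate callback standing in for
udev reading the attribute, and the NULL case returns -1 where the real show
function would dereference the pointer and crash.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for scsi_device and scsi_disk. */
struct model_sdev { int allow_restart; };
struct model_disk { struct model_sdev *device; };

/* Attribute "show": reads through disk->device, like the quoted description. */
static int model_show_allow_restart(const struct model_disk *d)
{
	if (d->device == NULL)
		return -1;      /* the kernel would oops here instead */
	return d->device->allow_restart;
}

typedef int (*model_hotplug_fn)(const struct model_disk *);

/* Registration makes the object visible to "userspace" straight away. */
static int model_register(const struct model_disk *d, model_hotplug_fn on_add)
{
	return on_add(d);
}

/* Buggy order: register first, fill in the fields afterwards. */
static int model_probe_buggy(struct model_disk *d, struct model_sdev *s)
{
	assert(d != NULL && s != NULL);
	int seen = model_register(d, model_show_allow_restart);
	d->device = s;
	return seen;
}

/* Fixed order: initialise everything the attribute needs, then register. */
static int model_probe_fixed(struct model_disk *d, struct model_sdev *s)
{
	assert(d != NULL && s != NULL);
	d->device = s;
	return model_register(d, model_show_allow_restart);
}
```

The general rule the model illustrates: once an object is registered, any of
its attributes may be read immediately, so registration should come last.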

Re: [patch 4/9] Remove the TSC synchronization on SMP machines

2007-02-02 Thread H. Peter Anvin


[EMAIL PROTECTED] wrote:
> TSC is either synchronized by design or not reliable
> to be used for anything, let alone timekeeping.

This refers to eliminating the offset between multiple synchronized TSCs.

-hpa
