Help for kernel module programming

2006-11-28 Thread prajakta choudhari

Hi:
I am writing a kernel module for assigning an IP address to an interface.
I have included linux/igmp.h, but whenever I use a function
declared in igmp.h, I get an unresolved symbol error for that function.
I am new to this kind of programming.
I use the following command to compile it:
gcc -c -D__KERNEL__   -DMODULE
-I/home/newkernelsource/linux-2.4.22/include  hello.c
--
Regards,
Prajakta Choudhari,
Project Engineer,
Networking and Internet Software Group,
CDAC,Pune
Email:[EMAIL PROTECTED]
Mobile:9890302701
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 2.6.19-rc6-mm1: drivers/net/chelsio/: unused code

2006-11-28 Thread Andrew Morton
On Wed, 29 Nov 2006 08:36:09 +0100
Adrian Bunk <[EMAIL PROTECTED]> wrote:

> On Mon, Nov 27, 2006 at 10:24:55AM -0800, Stephen Hemminger wrote:
> > On Fri, 24 Nov 2006 01:17:31 +0100
> > Adrian Bunk <[EMAIL PROTECTED]> wrote:
> > 
> > > On Thu, Nov 23, 2006 at 02:17:03AM -0800, Andrew Morton wrote:
> > > >...
> > > > Changes since 2.6.19-rc5-mm2:
> > > >...
> > > > +chelsio-22-driver.patch
> > > >...
> > > >  netdev updates
> > > 
> > > It is suspicious that the following newly added code is completely unused:
> > >   drivers/net/chelsio/ixf1010.o
> > > t1_ixf1010_ops
> > >   drivers/net/chelsio/mac.o
> > > t1_chelsio_mac_ops
> > >   drivers/net/chelsio/vsc8244.o
> > > t1_vsc8244_ops
> > > 
> > > cu
> > > Adrian
> > > 
> > 
> > All that is gone in later version. I reposted new patches
> > after -mm2 was done.
> 
> It seems these patches didn't make it into 2.6.19-rc6-mm2 ?
> 

I dropped that patch and picked up Francois's tree instead.


Re: [PATCH] lockdep: fix sk->sk_callback_lock locking

2006-11-28 Thread Herbert Xu
Peter Zijlstra <[EMAIL PROTECTED]> wrote:
> 
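> =========================================================
> [ INFO: possible irq lock inversion dependency detected ]
> 2.6.19-rc6 #4
> ---------------------------------------------------------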
> nc/1854 just changed the state of lock:
> (af_callback_keys + sk->sk_family#2){-.-?}, at: [] 
> sock_def_error_report+0x1f/0x90
> but this lock was taken by another, soft-irq-safe lock in the past:
> (slock-AF_INET){-+..}
> 
> and interrupts could create inverse lock ordering between them.

I think this is bogus.  The slock is not a standard lock.  When we
hold it in process context we don't actually hold the spin lock part
of it.  However, it does prevent the softirq path from running in
critical sections which also prevents any attempt to grab the
callback lock from softirq context.

If you still think there is a problem, please show an actual scenario
where it deadlocks.

Thanks,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: 2.6.19-rc6-mm1: drivers/net/chelsio/: unused code

2006-11-28 Thread Adrian Bunk
On Mon, Nov 27, 2006 at 10:24:55AM -0800, Stephen Hemminger wrote:
> On Fri, 24 Nov 2006 01:17:31 +0100
> Adrian Bunk <[EMAIL PROTECTED]> wrote:
> 
> > On Thu, Nov 23, 2006 at 02:17:03AM -0800, Andrew Morton wrote:
> > >...
> > > Changes since 2.6.19-rc5-mm2:
> > >...
> > > +chelsio-22-driver.patch
> > >...
> > >  netdev updates
> > 
> > It is suspicious that the following newly added code is completely unused:
> >   drivers/net/chelsio/ixf1010.o
> > t1_ixf1010_ops
> >   drivers/net/chelsio/mac.o
> > t1_chelsio_mac_ops
> >   drivers/net/chelsio/vsc8244.o
> > t1_vsc8244_ops
> > 
> > cu
> > Adrian
> > 
> 
> All that is gone in later version. I reposted new patches
> after -mm2 was done.

It seems these patches didn't make it into 2.6.19-rc6-mm2 ?

cu
Adrian

-- 

   "Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   "Only a promise," Lao Er said.
   Pearl S. Buck - Dragon Seed



Re: [patch] Mark rdtsc as sync only for netburst, not for core2

2006-11-28 Thread Arjan van de Ven

Zhang, Yanmin wrote:
> If it's a single processor, the go backwards issue doesn't exist. Below is
> my patch based on Arjan's. It's against 2.6.19-rc5-mm2.

Hi,

this patch is incorrect

> --- linux-2.6.19-rc5-mm2_arjan/arch/x86_64/kernel/setup.c   2006-11-29 10:41:21.0 +0800
> +++ linux-2.6.19-rc5-mm2_arjan_fix/arch/x86_64/kernel/setup.c   2006-11-29 10:42:28.0 +0800
> @@ -861,7 +861,7 @@ static void __cpuinit init_intel(struct 
>  		set_bit(X86_FEATURE_CONSTANT_TSC, &c->x86_capability);
>  
>  	if (c->x86 == 6)
>  		set_bit(X86_FEATURE_REP_GOOD, &c->x86_capability);
> -	if (c->x86 == 15)
> +	if (c->x86 == 15 && num_possible_cpus() != 1)
>  		set_bit(X86_FEATURE_SYNC_RDTSC, &c->x86_capability);

first of all, you probably meant "|| num_possible_cpus() == 1"

but second of all, the core2 cpus are dual core so... what does it
bring you at all?



Re: Broken commit: [NETFILTER]: ipt_REJECT: remove largely duplicate route_reverse function

2006-11-28 Thread Jarek Poplawski
On 29-11-2006 05:25, David Miller wrote:
...
> commit 93e3a20d6c67a09b867431e7d5b3e7bc97154fab
> Author: David S. Miller <[EMAIL PROTECTED]>
> Date:   Tue Nov 28 20:24:10 2006 -0800
> 
> [NET]: Fix MAX_HEADER setting.
> 
> MAX_HEADER is either set to LL_MAX_HEADER or LL_MAX_HEADER + 48, and
> this is controlled by a set of CONFIG_* ifdef tests.
...
> Noticed by Patrick McHardy.

And if we talk about names:

+ Spotted by Krzysztof Halasa.

probably wouldn't be too exaggerated...

> Signed-off-by: David S. Miller <[EMAIL PROTECTED]>

Jarek P.


Re: 2.6.19-rc6-rt8: alsa xruns

2006-11-28 Thread Ingo Molnar

* Fernando Lopez-Lezcano <[EMAIL PROTECTED]> wrote:

> > I'll turn off the machine and cold boot it...)
> 
> No difference, actually it looks like the regression re-regresses if I 
> enable the trace... Arghhh.

yeah, that happens sometimes if some race is particularly narrow :-/

> Toggling /proc/sys/kernel/trace_enabled makes the long xruns reported 
> by jack come and go.

i'll try to reproduce it. Can you see it with my yum kernel too? (that 
would simplify checking this on many testboxes)

Ingo


2.6.19-rc6-rt9 - fails to compile

2006-11-28 Thread Marcus Hartig

Hi!

2.6.19-rc6-rt9 fails to compile on my Dual Core Notebook with FC6.

  CHK include/linux/version.h
  CHK include/linux/utsrelease.h
  CC  arch/i386/kernel/asm-offsets.s
In file included from include/linux/time.h:7,
 from include/linux/timex.h:57,
 from include/linux/sched.h:50,
 from include/linux/module.h:9,
 from include/linux/crypto.h:21,
 from arch/i386/kernel/asm-offsets.c:7:
include/linux/seqlock.h: In function '__read_seqretry':
include/linux/seqlock.h:139: error: expected expression before 'do'
In file included from include/linux/module.h:9,
 from include/linux/crypto.h:21,
 from arch/i386/kernel/asm-offsets.c:7:
include/linux/sched.h: In function 'dequeue_signal_lock':
include/linux/sched.h:1478: error: expected expression before 'do'
make[1]: *** [arch/i386/kernel/asm-offsets.s] Error 1
make: *** [prepare0] Error 2

www.marcush.de/kernel/.config

Regards,
Marcus


Re: 2.6.19-rc6-rt8: alsa xruns

2006-11-28 Thread Ingo Molnar

* Daniel Walker <[EMAIL PROTECTED]> wrote:

> > i fixed this in -rt8: the latency tracer now uses the time of day 
> > clocksource - pmtimer in this case. (that means function tracing is 
> > slower than with the TSC, but latency figures are more reliable.)
> 
> I have a patch set to make the using the clocksources a little nicer.. 
> Is there anything I should add to that interface to help enable 
> latency tracing, or are you satisfied with using the timekeeping 
> clocksource? It might get constrictive after a while.

please talk to John and Thomas about GTOD interfaces. Right now the 
solution used by the latency tracer is working out pretty OK - but if 
something better comes along i can use that too. It's not a burning 
issue though, unless you know of some bug. (i'm not sure what you mean 
by it becoming constrictive)

Ingo


Re: [patch] x86: unify/rewrite SMP TSC sync code

2006-11-28 Thread Ingo Molnar

* Ingo Molnar <[EMAIL PROTECTED]> wrote:

> Andrew,
> 
> could we try this one in -mm? It unifies (and simplifies) the TSC sync 
> code between i386 and x86_64, and also offers a stronger guarantee 
> that we'll only activate the TSC clock on CPU where the TSC is synced 
> correctly by the hardware.

updated patch below. (Mike Galbraith reported that suspend broke on -rt 
kernels, it was due to an __init/__cpuinit mismatch)

Ingo

->
Subject: x86: rewrite SMP TSC sync code
From: Ingo Molnar <[EMAIL PROTECTED]>

make the TSC synchronization code more robust, and unify
it between x86_64 and i386.

The biggest change is the removal of the 'fix up TSCs' code
on x86_64 and i386, in some rare cases it was /causing/
time-warps on SMP systems.

The new code only checks for TSC asynchronity - and if it can
prove a time-warp (if it can observe the TSC going backwards
when going from one CPU to another within a critical section),
then the TSC clock-source is turned off.

The TSC synchronization-checking code also got moved into a
separate file.

Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]>
---
 arch/i386/kernel/Makefile     |    2
 arch/i386/kernel/smpboot.c    |  178 ++--
 arch/i386/kernel/tsc.c        |    4
 arch/i386/kernel/tsc_sync.c   |    1
 arch/x86_64/kernel/Makefile   |    2
 arch/x86_64/kernel/smpboot.c  |  230 ++
 arch/x86_64/kernel/time.c     |   11 ++
 arch/x86_64/kernel/tsc_sync.c |  187 ++
 include/asm-i386/tsc.h        |   49
 include/asm-x86_64/proto.h    |    2
 include/asm-x86_64/timex.h    |   26
 include/asm-x86_64/tsc.h      |   66
 12 files changed, 295 insertions(+), 463 deletions(-)

Index: linux/arch/i386/kernel/Makefile
===
--- linux.orig/arch/i386/kernel/Makefile
+++ linux/arch/i386/kernel/Makefile
@@ -18,7 +18,7 @@ obj-$(CONFIG_X86_MSR) += msr.o
 obj-$(CONFIG_X86_CPUID)+= cpuid.o
 obj-$(CONFIG_MICROCODE)+= microcode.o
 obj-$(CONFIG_APM)  += apm.o
-obj-$(CONFIG_X86_SMP)  += smp.o smpboot.o
+obj-$(CONFIG_X86_SMP)  += smp.o smpboot.o tsc_sync.o
 obj-$(CONFIG_X86_TRAMPOLINE)   += trampoline.o
 obj-$(CONFIG_X86_MPPARSE)  += mpparse.o
 obj-$(CONFIG_X86_LOCAL_APIC)   += apic.o nmi.o
Index: linux/arch/i386/kernel/smpboot.c
===
--- linux.orig/arch/i386/kernel/smpboot.c
+++ linux/arch/i386/kernel/smpboot.c
@@ -88,12 +88,6 @@ cpumask_t cpu_possible_map;
 EXPORT_SYMBOL(cpu_possible_map);
 static cpumask_t smp_commenced_mask;
 
-/* TSC's upper 32 bits can't be written in eariler CPU (before prescott), there
- * is no way to resync one AP against BP. TBD: for prescott and above, we
- * should use IA64's algorithm
- */
-static int __devinitdata tsc_sync_disabled;
-
 /* Per CPU bogomips and other parameters */
 struct cpuinfo_x86 cpu_data[NR_CPUS] __cacheline_aligned;
 EXPORT_SYMBOL(cpu_data);
@@ -210,151 +204,6 @@ valid_k7:
;
 }
 
-/*
- * TSC synchronization.
- *
- * We first check whether all CPUs have their TSC's synchronized,
- * then we print a warning if not, and always resync.
- */
-
-static struct {
-   atomic_t start_flag;
-   atomic_t count_start;
-   atomic_t count_stop;
-   unsigned long long values[NR_CPUS];
-} tsc __initdata = {
-   .start_flag = ATOMIC_INIT(0),
-   .count_start = ATOMIC_INIT(0),
-   .count_stop = ATOMIC_INIT(0),
-};
-
-#define NR_LOOPS 5
-
-static void __init synchronize_tsc_bp(void)
-{
-   int i;
-   unsigned long long t0;
-   unsigned long long sum, avg;
-   long long delta;
-   unsigned int one_usec;
-   int buggy = 0;
-
-   printk(KERN_INFO "checking TSC synchronization across %u CPUs: ", num_booting_cpus());
-
-   /* convert from kcyc/sec to cyc/usec */
-   one_usec = cpu_khz / 1000;
-
-   atomic_set(&tsc.start_flag, 1);
-   wmb();
-
-   /*
-* We loop a few times to get a primed instruction cache,
-* then the last pass is more or less synchronized and
-* the BP and APs set their cycle counters to zero all at
-* once. This reduces the chance of having random offsets
-* between the processors, and guarantees that the maximum
-* delay between the cycle counters is never bigger than
-* the latency of information-passing (cachelines) between
-* two CPUs.
-*/
-   for (i = 0; i < NR_LOOPS; i++) {
-   /*
-* all APs synchronize but they loop on '== num_cpus'
-*/
-   while (atomic_read(&tsc.count_start) != num_booting_cpus()-1)
-   cpu_relax();
-   atomic_set(&tsc.count_stop, 0);
-   wmb();
-   /*
-* this lets the APs save their current TSC:
-*/
-

Re: The VFS cache is not freed when there is not enough free memory to allocate

2006-11-28 Thread Sonic Zhang

Forward to the mailing list.

Sonic Zhang wrote:
> On 11/27/06, Nick Piggin <[EMAIL PROTECTED]> wrote:
>> I haven't actually written any nommu userspace code, but it is obvious
>> that you must try to keep malloc to <= PAGE_SIZE (although order 2 and
>> even 3 allocations seem to be reasonable, from process context)... Then
>> you would use something a bit more advanced than a linear array to store
>> data (a pagetable-like radix tree would be a nice, easy idea).
>
> But even if we split the 8M memory into 2048 x 4k blocks, we still face
> this failure. The key problem is that available memory is smaller than
> 2048 x 4k, while there is still a lot of VFS cache. The VFS cache can
> be freed, but the kernel allocation function ignores it. See the new test
> application.

Which kernel allocation function? If you can provide more details I'd
like to get to the bottom of this.

Because the anonymous memory allocation in mm/nommu.c is all allocated
with GFP_KERNEL from process context, and in that case, the allocator
should not fail but call into page reclaim which in turn will free VFS
caches.

> What's a better way to free the VFS cache in memory allocator?

It should be freeing it for you, so I'm not quite sure what is going
on. Can you send over the kernel messages you see when the allocation
fails?

Also, do you happen to know of a reasonable toolchain + emulator setup
that I could test the nommu kernel with?

Thanks,
Nick


Re: [patch] genapic: default to physical mode on hotplug CPU kernels

2006-11-28 Thread Ingo Molnar

* Siddha, Suresh B <[EMAIL PROTECTED]> wrote:

> On Tue, Nov 28, 2006 at 09:23:22PM +0100, Ingo Molnar wrote:
> > 
> > * Siddha, Suresh B <[EMAIL PROTECTED]> wrote:
> > 
> > > On Tue, Nov 28, 2006 at 07:33:46AM +0100, Ingo Molnar wrote:
> > > > -   if (clusters <= 1 && max_cluster <= 8 && cluster_cnt[0] == max_cluster)
> > > > +   if (max_apic < 8)
> > > 
> > > Patch mostly looks good.  Instead of checking for max_apic, can we use
> > >   cpus_weight(cpu_possible_map) <= 8
> > 
> > ok - but i think it's still possible the BIOS tells us APIC IDs that are 
> > larger than 7, even if there are fewer CPUs. So i think the patch below 
> > should cover it. Agreed?
> > 
> 
> I think it is ok to use flat mode even when APIC IDs are larger than 
> 7, as we rely on LDR's which are programmed using smp_processor_id().
> 
> IMO, cpus_weight check should be fine.

hm - indeed. Then we can indeed do the patch below. Nice simplification!

Ingo

>
From: Ingo Molnar <[EMAIL PROTECTED]>
Subject: [patch] genapic: default to physical mode on hotplug CPU kernels

default to physical mode on hotplug CPU kernels. Further simplify and
clean up the APIC initialization code.

Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]>
---
 arch/x86_64/kernel/genapic.c |   20 +++-
 arch/x86_64/kernel/mpparse.c |    2 +-
 include/asm-x86_64/apic.h    |    2 +-
 3 files changed, 5 insertions(+), 19 deletions(-)

Index: linux/arch/x86_64/kernel/genapic.c
===
--- linux.orig/arch/x86_64/kernel/genapic.c
+++ linux/arch/x86_64/kernel/genapic.c
@@ -33,25 +33,11 @@ u8 x86_cpu_to_log_apicid[NR_CPUS]   = { [0
 struct genapic __read_mostly *genapic = &apic_flat;
 
 /*
- * Check the APIC IDs in bios_cpu_apicid and choose the APIC mode.
+ * Choose the APIC routing mode:
  */
-void __init clustered_apic_check(void)
+void __init setup_apic_routing(void)
 {
-   unsigned int i, max_apic = 0;
-   u8 id;
-
-   /*
-* Determine the maximum APIC ID in use:
-*/
-   for (i = 0; i < NR_CPUS; i++) {
-   id = bios_cpu_apicid[i];
-   if (id == BAD_APICID)
-   continue;
-   if (id > max_apic)
-   max_apic = id;
-   }
-
-   if (max_apic < 8)
+   if (cpus_weight(cpu_possible_map) <= 8)
genapic = &apic_flat;
else
genapic = &apic_physflat;
Index: linux/arch/x86_64/kernel/mpparse.c
===
--- linux.orig/arch/x86_64/kernel/mpparse.c
+++ linux/arch/x86_64/kernel/mpparse.c
@@ -302,7 +302,7 @@ static int __init smp_read_mpc(struct mp
}
}
}
-   clustered_apic_check();
+   setup_apic_routing();
if (!num_processors)
printk(KERN_ERR "MPTABLE: no processors registered!\n");
return num_processors;
Index: linux/include/asm-x86_64/apic.h
===
--- linux.orig/include/asm-x86_64/apic.h
+++ linux/include/asm-x86_64/apic.h
@@ -82,7 +82,7 @@ extern void setup_secondary_APIC_clock (
 extern int APIC_init_uniprocessor (void);
 extern void disable_APIC_timer(void);
 extern void enable_APIC_timer(void);
-extern void clustered_apic_check(void);
+extern void setup_apic_routing(void);
 static inline void lapic_timer_idle_broadcast(int broadcast) { }
 
 extern void setup_APIC_extened_lvt(unsigned char lvt_off, unsigned char vector,


Re: 2.6.19-rc6-rt8

2006-11-28 Thread Ingo Molnar

* Karsten Wiese <[EMAIL PROTECTED]> wrote:

> After estimated 15 minutes more it bugged again.
> Related dmesg translates to linux error
>   -EXDEV
> probably caused by the following lines:
> 
> 
> static int uhci_result_isochronous(struct uhci_hcd *uhci, struct urb *urb)

hm. Below are all the USB changes done by -rt. Maybe one of them has 
some side-effect?

Ingo

Index: linux/drivers/usb/core/devio.c
===
--- linux.orig/drivers/usb/core/devio.c
+++ linux/drivers/usb/core/devio.c
@@ -309,10 +309,11 @@ static void async_completed(struct urb *
 struct async *as = urb->context;
 struct dev_state *ps = as->ps;
struct siginfo sinfo;
+   unsigned long flags;
 
-spin_lock(&ps->lock);
-list_move_tail(&as->asynclist, &ps->async_completed);
-spin_unlock(&ps->lock);
+   spin_lock_irqsave(&ps->lock, flags);
+   list_move_tail(&as->asynclist, &ps->async_completed);
+   spin_unlock_irqrestore(&ps->lock, flags);
if (as->signr) {
sinfo.si_signo = as->signr;
sinfo.si_errno = as->urb->status;
Index: linux/drivers/usb/core/hcd.c
===
--- linux.orig/drivers/usb/core/hcd.c
+++ linux/drivers/usb/core/hcd.c
@@ -517,13 +517,11 @@ error:
}
 
/* any errors get returned through the urb completion */
-   local_irq_save (flags);
-   spin_lock (&urb->lock);
+   spin_lock_irqsave(&urb->lock, flags);
if (urb->status == -EINPROGRESS)
urb->status = status;
-   spin_unlock (&urb->lock);
+   spin_unlock_irqrestore(&urb->lock, flags);
usb_hcd_giveback_urb (hcd, urb);
-   local_irq_restore (flags);
return 0;
 }
 
@@ -551,8 +549,7 @@ void usb_hcd_poll_rh_status(struct usb_h
if (length > 0) {
 
/* try to complete the status urb */
-   local_irq_save (flags);
-   spin_lock(&hcd_root_hub_lock);
+   spin_lock_irqsave(&hcd_root_hub_lock, flags);
urb = hcd->status_urb;
if (urb) {
spin_lock(&urb->lock);
@@ -568,14 +565,13 @@ void usb_hcd_poll_rh_status(struct usb_h
spin_unlock(&urb->lock);
} else
length = 0;
-   spin_unlock(&hcd_root_hub_lock);
+   spin_unlock_irqrestore(&hcd_root_hub_lock, flags);
 
/* local irqs are always blocked in completions */
if (length > 0)
usb_hcd_giveback_urb (hcd, urb);
else
hcd->poll_pending = 1;
-   local_irq_restore (flags);
}
 
/* The USB 2.0 spec says 256 ms.  This is close enough and won't
@@ -647,17 +643,15 @@ static int usb_rh_urb_dequeue (struct us
} else {/* Status URB */
if (!hcd->uses_new_polling)
del_timer (&hcd->rh_timer);
-   local_irq_save (flags);
-   spin_lock (&hcd_root_hub_lock);
+   spin_lock_irqsave(&hcd_root_hub_lock, flags);
if (urb == hcd->status_urb) {
hcd->status_urb = NULL;
urb->hcpriv = NULL;
} else
urb = NULL; /* wasn't fully queued */
-   spin_unlock (&hcd_root_hub_lock);
+   spin_unlock_irqrestore(&hcd_root_hub_lock, flags);
if (urb)
usb_hcd_giveback_urb (hcd, urb);
-   local_irq_restore (flags);
}
 
return 0;
@@ -1311,11 +1305,9 @@ void usb_hcd_endpoint_disable (struct us
WARN_ON (!HC_IS_RUNNING (hcd->state) && hcd->state != HC_STATE_HALT &&
udev->state != USB_STATE_NOTATTACHED);
 
-   local_irq_disable ();
-
/* ep is already gone from udev->ep_{in,out}[]; no more submits */
 rescan:
-   spin_lock (&hcd_data_lock);
+   spin_lock_irq(&hcd_data_lock);
list_for_each_entry (urb, &ep->urb_list, urb_list) {
int tmp;
 
@@ -1323,13 +1315,13 @@ rescan:
if (urb->status != -EINPROGRESS)
continue;
usb_get_urb (urb);
-   spin_unlock (&hcd_data_lock);
+   spin_unlock_irq(&hcd_data_lock);
 
-   spin_lock (&urb->lock);
+   spin_lock_irq(&urb->lock);
tmp = urb->status;
if (tmp == -EINPROGRESS)
urb->status = -ESHUTDOWN;
-   spin_unlock (&urb->lock);
+   spin_unlock_irq(&urb->lock);
 
/* kick hcd unless it's already returning this */
if (tmp == -EINPROGRESS) {
@@ -1352,8 +1344,7 @@ rescan:
/* list contents may have changed */
goto rescan;
}
-   spin_unlock (&hcd_data_lock);
-   local_irq_enable ();
+   spin_unlock_irq(&hcd_data_lock);
 
/* 

Re: 2.6.19-rc6-rt8

2006-11-28 Thread Ingo Molnar

* Hu Gang <[EMAIL PROTECTED]> wrote:

> > thanks, applied. I'll let the PPC -rt folks sort out the hack effects. 
> > Do you have CONFIG_HIGH_RES_TIMERS enabled?
> no.
> 
> 
> [hugang@:~]$ uname -a
> Linux hugang.soulinfo.com 2.6.19-rc6-rt8 #2 PREEMPT Wed Nov 29 09:29:43 UTC 
> 2006 ppc GNU/Linux 
> [hugang@:~]$ zgrep CONFIG_HIGH_RES_TIMERS /proc/config.gz 
> [hugang@:~]$

could you send me your config? (i'm just curious what else is 
enabled/disabled)

Ingo


Re: 2.6.19-rc6-rt8

2006-11-28 Thread Ingo Molnar

* Karsten Wiese <[EMAIL PROTECTED]> wrote:

> Am Montag, 27. November 2006 10:49 schrieb Ingo Molnar:
> > i have released the 2.6.19-rc6-rt8 tree, which can be downloaded from 
> 
> I saw usb transport errors here before rebooting with
>   nmi_watchdog=0
> contained in kernel command line.

so nmi_watchdog=1 (or was it nmi_watchdog=2 ?) caused these problems - 
and then nmi_watchdog=0 fixed them? i686? Extremely weird. Does the 
patch below fix the issue perhaps?

Ingo

Index: linux/arch/i386/kernel/nmi.c
===
--- linux.orig/arch/i386/kernel/nmi.c
+++ linux/arch/i386/kernel/nmi.c
@@ -932,12 +932,14 @@ notrace __kprobes int nmi_watchdog_tick(
 
__profile_tick(CPU_PROFILING, regs);
 
+#if 0
/* check for other users first */
if (notify_die(DIE_NMI, "nmi", regs, reason, 2, SIGINT)
== NOTIFY_STOP) {
rc = 1;
touched = 1;
}
+#endif
 
/*
 * Take the local apic timer and PIT/HPET into account. We don't
Index: linux/arch/x86_64/kernel/nmi.c
===
--- linux.orig/arch/x86_64/kernel/nmi.c
+++ linux/arch/x86_64/kernel/nmi.c
@@ -814,12 +814,14 @@ int __kprobes nmi_watchdog_tick(struct p
 
__profile_tick(CPU_PROFILING, regs);
 
+#if 0
/* check for other users first */
if (notify_die(DIE_NMI, "nmi", regs, reason, 2, SIGINT)
== NOTIFY_STOP) {
rc = 1;
touched = 1;
}
+#endif
 
sum = read_pda(apic_timer_irqs);
if (nmi_show_regs[cpu]) {


Re: 2.6.19-rc6-rt5

2006-11-28 Thread Ingo Molnar

* Mark Knecht <[EMAIL PROTECTED]> wrote:

> Forwarding it off list.
> 
> Thanks Ingo. I'm very interested if it works for you to do this.

i've integrated it into -rt (see the patch below), but i marked it 
obsolete and i might not be able to carry it for long - we'll see. The 
preferred solution is to use newer PAM and its rt-limits features. But 
to ease migration i'll keep the realtime-lsm for a while.

Ingo
---
 security/Kconfig   |    9 +++
 security/Makefile  |    1
 security/realcap.c |  147 +
 3 files changed, 157 insertions(+)

Index: linux/security/Kconfig
===
--- linux.orig/security/Kconfig
+++ linux/security/Kconfig
@@ -80,6 +80,15 @@ config SECURITY_CAPABILITIES
  This enables the "default" Linux capabilities functionality.
  If you are unsure how to answer this question, answer Y.
 
+config REALTIME_CAPABILITIES
+   tristate "Real-Time LSM (Obsolete)"
+   depends on SECURITY && EXPERIMENTAL
+   help
+ This is an obsolete LSM - use newer PAM and rt-limites
+ to manage your real-time apps.
+
+ If you are unsure how to answer this question, answer N.
+
 config SECURITY_ROOTPLUG
tristate "Root Plug Support"
depends on USB && SECURITY
Index: linux/security/Makefile
===
--- linux.orig/security/Makefile
+++ linux/security/Makefile
@@ -15,4 +15,5 @@ obj-$(CONFIG_SECURITY)+= security.o d
 # Must precede capability.o in order to stack properly.
 obj-$(CONFIG_SECURITY_SELINUX) += selinux/built-in.o
 obj-$(CONFIG_SECURITY_CAPABILITIES)+= commoncap.o capability.o
+obj-$(CONFIG_REALTIME_CAPABILITIES)+= commoncap.o realcap.o
 obj-$(CONFIG_SECURITY_ROOTPLUG)+= commoncap.o root_plug.o
Index: linux/security/realcap.c
===
--- /dev/null
+++ linux/security/realcap.c
@@ -0,0 +1,147 @@
+/*
+ * Realtime Capabilities Linux Security Module
+ *
+ *  Copyright (C) 2003 Torben Hohn
+ *  Copyright (C) 2003, 2004 Jack O'Quin
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ */
+
+#include 
+#include 
+
+#define RT_LSM "Realtime LSM " /* syslog module name prefix */
+#define RT_ERR "Realtime: "/* syslog error message prefix */
+
+#include 
+MODULE_INFO(vermagic,VERMAGIC_STRING);
+
+/* module parameters
+ *
+ *  These values could change at any time due to some process writing
+ *  a new value in /sys/module/realtime/parameters.  This is OK,
+ *  because each is referenced only once in each function call.
+ *  Nothing depends on parameters having the same value every time.
+ */
+
+/* if TRUE, any process is realtime */
+static int rt_any;
+module_param_named(any, rt_any, int, 0644);
+MODULE_PARM_DESC(any, " grant realtime privileges to any process.");
+
+/* realtime group id, or NO_GROUP */
+static int rt_gid = -1;
+module_param_named(gid, rt_gid, int, 0644);
+MODULE_PARM_DESC(gid, " the group ID with access to realtime privileges.");
+
+/* enable mlock() privileges */
+static int rt_mlock = 1;
+module_param_named(mlock, rt_mlock, int, 0644);
+MODULE_PARM_DESC(mlock, " enable memory locking privileges.");
+
+/* helper function for testing group membership */
+static inline int gid_ok(int gid)
+{
+   if (gid == -1)
+   return 0;
+
+   if (gid == current->gid)
+   return 1;
+
+   return in_egroup_p(gid);
+}
+
+static void realtime_bprm_apply_creds(struct linux_binprm *bprm, int unsafe)
+{
+   cap_bprm_apply_creds(bprm, unsafe);
+
+   /*  If a non-zero `any' parameter was specified, we grant
+*  realtime privileges to every process.  If the `gid'
+*  parameter was specified and it matches the group id of the
+*  executable, of the current process or any supplementary
+*  groups, we grant realtime capabilites.
+*/
+
+   if (rt_any || gid_ok(rt_gid)) {
+   cap_raise(current->cap_effective, CAP_SYS_NICE);
+   if (rt_mlock) {
+   cap_raise(current->cap_effective, CAP_IPC_LOCK);
+   cap_raise(current->cap_effective, CAP_SYS_RESOURCE);
+   }
+   }
+}
+
+static struct security_operations capability_ops = {
+   .ptrace =   cap_ptrace,
+   .capget =   cap_capget,
+   .capset_check = cap_capset_check,
+   .capset_set =   cap_capset_set,
+   .capable =  cap_capable,
+   .netlink_send = cap_netlink_send,
+   .netlink_recv 

[PATCH] x86_64: check vector in setup_ioapic_dest to verify if need setup_IO_APIC_irq

2006-11-28 Thread Yinghai Lu

please check the patch
[PATCH] x86_64: check vector in setup_ioapic_dest to verify if need setup_IO_APIC_irq

setup_IO_APIC_irqs could fail to get vector for some device 
when you have too many devices, because at that time only boot
cpu is online. So check vector for irq in setup_ioapic_dest and 
call setup_IO_APIC_irq to make sure IO-APIC irq-routing table is
initialized.

Also separate setup_IO_APIC_irq from setup_IO_APIC_irqs.

Signed-off-by: Yinghai Lu <[EMAIL PROTECTED]>


diff --git a/arch/x86_64/kernel/io_apic.c b/arch/x86_64/kernel/io_apic.c
index 14654e6..496ba4e 100644
--- a/arch/x86_64/kernel/io_apic.c
+++ b/arch/x86_64/kernel/io_apic.c
@@ -796,27 +845,65 @@ static void ioapic_register_intr(int irq
 	  handle_edge_irq, "edge");
 	}
 }
-
-static void __init setup_IO_APIC_irqs(void)
+static void __init setup_IO_APIC_irq(int apic, int pin, int idx, int irq)
 {
 	struct IO_APIC_route_entry entry;
-	int apic, pin, idx, irq, first_notcon = 1, vector;
+	int vector;
 	unsigned long flags;
 
-	apic_printk(APIC_VERBOSE, KERN_DEBUG "init IO_APIC IRQs\n");
 
-	for (apic = 0; apic < nr_ioapics; apic++) {
-	for (pin = 0; pin < nr_ioapic_registers[apic]; pin++) {
+	/*
+	 * add it to the IO-APIC irq-routing table:
+	 */
+	memset(&entry,0,sizeof(entry));
 
-		/*
-		 * add it to the IO-APIC irq-routing table:
-		 */
-		memset(&entry,0,sizeof(entry));
+	entry.delivery_mode = INT_DELIVERY_MODE;
+	entry.dest_mode = INT_DEST_MODE;
+	entry.mask = 0;/* enable IRQ */
+	entry.dest.logical.logical_dest = cpu_mask_to_apicid(TARGET_CPUS);
+
+	entry.trigger = irq_trigger(idx);
+	entry.polarity = irq_polarity(idx);
 
-		entry.delivery_mode = INT_DELIVERY_MODE;
-		entry.dest_mode = INT_DEST_MODE;
-		entry.mask = 0;/* enable IRQ */
+	if (irq_trigger(idx)) {
+		entry.trigger = 1;
+		entry.mask = 1;
 		entry.dest.logical.logical_dest = cpu_mask_to_apicid(TARGET_CPUS);
+	}
+
+	if (!apic && !IO_APIC_IRQ(irq))
+		return;
+
+	if (IO_APIC_IRQ(irq)) {
+		cpumask_t mask;
+		vector = assign_irq_vector(irq, TARGET_CPUS, &mask);
+		if (vector < 0)
+			return;
+
+		entry.dest.logical.logical_dest = cpu_mask_to_apicid(mask);
+		entry.vector = vector;
+
+		ioapic_register_intr(irq, vector, IOAPIC_AUTO);
+		if (!apic && (irq < 16))
+			disable_8259A_irq(irq);
+	}
+
+	ioapic_write_entry(apic, pin, entry);
+
+	spin_lock_irqsave(&ioapic_lock, flags);
+	set_native_irq_info(irq, TARGET_CPUS);
+	spin_unlock_irqrestore(&ioapic_lock, flags);
+
+}
+
+static void __init setup_IO_APIC_irqs(void)
+{
+	int apic, pin, idx, irq, first_notcon = 1;
+
+	apic_printk(APIC_VERBOSE, KERN_DEBUG "init IO_APIC IRQs\n");
+
+	for (apic = 0; apic < nr_ioapics; apic++) {
+	for (pin = 0; pin < nr_ioapic_registers[apic]; pin++) {
 
 		idx = find_irq_entry(apic,pin,mp_INT);
 		if (idx == -1) {
@@ -828,39 +915,11 @@ static void __init setup_IO_APIC_irqs(vo
 			continue;
 		}
 
-		entry.trigger = irq_trigger(idx);
-		entry.polarity = irq_polarity(idx);
-
-		if (irq_trigger(idx)) {
-			entry.trigger = 1;
-			entry.mask = 1;
-			entry.dest.logical.logical_dest = cpu_mask_to_apicid(TARGET_CPUS);
-		}
-
 		irq = pin_2_irq(idx, apic, pin);
 		add_pin_to_irq(irq, apic, pin);
 
-		if (!apic && !IO_APIC_IRQ(irq))
-			continue;
-
-		if (IO_APIC_IRQ(irq)) {
-			cpumask_t mask;
-			vector = assign_irq_vector(irq, TARGET_CPUS, &mask);
-			if (vector < 0)
-				continue;
+		setup_IO_APIC_irq(apic, pin, idx, irq);
 
-			entry.dest.logical.logical_dest = cpu_mask_to_apicid(mask);
-			entry.vector = vector;
-
-			ioapic_register_intr(irq, vector, IOAPIC_AUTO);
-			if (!apic && (irq < 16))
-				disable_8259A_irq(irq);
-		}
-		ioapic_write_entry(apic, pin, entry);
-
-		spin_lock_irqsave(&ioapic_lock, flags);
-		set_native_irq_info(irq, TARGET_CPUS);
-		spin_unlock_irqrestore(&ioapic_lock, flags);
 	}
 	}
 
@@ -2141,7 +2200,15 @@ void __init setup_ioapic_dest(void)
 			if (irq_entry == -1)
 				continue;
 			irq = pin_2_irq(irq_entry, ioapic, pin);
-			set_ioapic_affinity_irq(irq, TARGET_CPUS);
+
+			/* setup_IO_APIC_irqs could fail to get vector for some device 
+			 * when you have too many devices, because at that time only boot
+			 * cpu is online.
+			 */
+			if(!irq_vector[irq])
+				setup_IO_APIC_irq(ioapic, pin, irq_entry, irq);
+			else
+				set_ioapic_affinity_irq(irq, TARGET_CPUS);
 		}
 
 	}


Re: Broken commit: [NETFILTER]: ipt_REJECT: remove largely duplicate route_reverse function

2006-11-28 Thread Herbert Xu
On Tue, Nov 28, 2006 at 09:04:16PM -0800, David Miller wrote:
>
> > Definitely.  I'm not sure whether 48 is enough even for recursive
> > tunnels.  This should really just be a hint.  It's OK to spend a
> > bit of time reallocating skb's if it's too small, but it's not OK
> > to die.
> 
> The recursive tunnel case is handled by the PMTU reductions
> in the route, isn't it?

Oh I wasn't suggesting that the current code is broken.

I'm just emphasising that LL_MAX_HEADER is by no means the *maximum*
header size in a Linux system.  Anybody should be able to load a
new NIC module with a hard header size exceeding what LL_MAX_HEADER
is and the system should still function (albeit slower since every
packet sent down that device has to be reallocated).

In particular, a nested tunnel is one such device, and anybody can
construct one without writing a kernel module.

As to getting rid of those ifdefs, here is one idea.  We keep a
read-mostly global variable that represents the actual current
maximum LL header size.  Every time a new device appears (or if
its hard header size changes) we update this variable if needed.

Hmm, we don't actually update the hard header size should the
underlying device change for tunnels.  Good thing the tunnels
only use that as a hint and reallocate if necessary :)

This is not optimal in that it never decreases, but it's certainly
better than a compile-time constant (e.g., people using distribution
kernels don't necessarily use tunnels).
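
As a rough illustration of the idea above, here is a minimal userspace
sketch in C. The names max_ll_header_hint, note_hard_header_len() and
ll_headroom_hint() are hypothetical, as is the starting value of 32; none
of them are existing kernel symbols, and the sketch ignores locking.

```c
#include <stddef.h>

/* Hypothetical stand-in for a read-mostly global; not a real kernel symbol. */
static size_t max_ll_header_hint = 32;	/* assumed starting value */

/*
 * Call when a device appears or its hard header size changes.
 * The hint only ever grows, matching the "never decreases" caveat.
 */
static void note_hard_header_len(size_t hard_header_len)
{
	if (hard_header_len > max_ll_header_hint)
		max_ll_header_hint = hard_header_len;
}

/*
 * Headroom to reserve when allocating a packet buffer. A device that
 * needs more than this must still reallocate, so the value is a hint
 * and not a guarantee.
 */
static size_t ll_headroom_hint(void)
{
	return max_ll_header_hint;
}
```

Registering a device with a 16-byte header would leave the hint at 32;
registering one with an 80-byte header would raise it to 80, where it
would stay even after that device goes away.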

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: 2.6.19-rc6-rt8

2006-11-28 Thread Hu Gang
On Wed, 29 Nov 2006 07:41:09 +0100
Ingo Molnar <[EMAIL PROTECTED]> wrote:

> 
> * Hu Gang <[EMAIL PROTECTED]> wrote:
> 
> > On Mon, 27 Nov 2006 10:49:27 +0100
> > Ingo Molnar <[EMAIL PROTECTED]> wrote:
> > 
> > > i have released the 2.6.19-rc6-rt8 tree, which can be downloaded from 
> > > the usual place:
> > > 
> > > http://redhat.com/~mingo/realtime-preempt/
> > 
> > attached patch to making it compile and works in my PowerBook G4. 
> 
> thanks, applied. I'll let the PPC -rt folks sort out the hack effects. 
> Do you have CONFIG_HIGH_RES_TIMERS enabled?
no.


[hugang@:~]$ uname -a
Linux hugang.soulinfo.com 2.6.19-rc6-rt8 #2 PREEMPT Wed Nov 29 09:29:43 UTC 2006 ppc GNU/Linux
[hugang@:~]$ zgrep CONFIG_HIGH_RES_TIMERS /proc/config.gz 
[hugang@:~]$


Re: 2.6.19-rc6-rt8

2006-11-28 Thread Ingo Molnar

* Hu Gang <[EMAIL PROTECTED]> wrote:

> On Mon, 27 Nov 2006 10:49:27 +0100
> Ingo Molnar <[EMAIL PROTECTED]> wrote:
> 
> > i have released the 2.6.19-rc6-rt8 tree, which can be downloaded from 
> > the usual place:
> > 
> > http://redhat.com/~mingo/realtime-preempt/
> 
> attached patch to making it compile and works in my PowerBook G4. 

thanks, applied. I'll let the PPC -rt folks sort out the hack effects. 
Do you have CONFIG_HIGH_RES_TIMERS enabled?

Ingo


Re: Slab: Remove kmem_cache_t

2006-11-28 Thread Nick Piggin

Linus Torvalds wrote:
> So typedefs are good for
>
>  - "u8"/"u16"/"u32"/"u64" kind of things, where the underlying types
>    really are potentially different on different architectures.
>
>  - "sector_t"-like things which may be 32-bit or 64-bit depending on some
>    CONFIG_LBD option or other.
>
>  - as a special case, "sparse" actually makes bitwise typedefs have real
>    meaning as types, so if you are using sparse to distinguish between a
>    little-endian 16-bit entity or a big-endian 16-bit entity, the typedef
>    there is actually important and has real meaning to sparse (without the
>    typedef, each bitwise type declaration would be strictly a _different_
>    type from another bitwise type declaration that otherwise looks the
>    same).
>
> But typedefs are NOT good for:
>
>  - trying to avoid typing a few characters:
>
> 	"kmem_cache_t" is strictly _worse_ than "struct kmem_cache", not
> 	just because it causes declaration issues. It also hides the fact
> 	that the thing really is a structure (and hiding the fact that
> 	it's a pointer is a shooting offense: things like "voidptr_t"
> 	should not be allowed at all)
>
>  - incorrect "portability".
>
> 	the POSIX "socklen_t" was not only a really bad way to write
> 	"int", it actually caused a lot of NON-portability, and made some
> 	people think it should be "size_t" or something equally broken.
>
> The one excuse for typedefs in the "typing" sense can be complicated
> function pointer types. Some function pointers are just too easy to screw
> up, and using a
>
> 	typedef (*myfn_t)(int, ...);
>
> can be preferable over forcing people to write that really complex kind of
> type out every time. But that shouldn't be overused either (but we use it
> for things like "readdir_t", for example, for exactly this reason).



You are saying that they should only be used to create new "primitive"
types (ie. that you can use in arithmetic / logical ops) that can
change depending on the config.

That's fair enough. I'm sure you've also said in the past that they can
be used (IIRC you even encouraged it) when the type is opaque in the
context it is being used. I won't bother trying to dig out the post,
because I could be wrong and you are entitled to change your mind. I
just want to get this straight.
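
To make the distinction concrete, here is a small sketch in C of the two
cases discussed above: a function-pointer typedef, and a structure that is
spelled out as a structure. All names (cmp_fn_t, struct item_cache,
cmp_int, make_int_cache) are made up for illustration and do not come from
the kernel.

```c
#include <stddef.h>

/* A function-pointer typedef: the "typing" case the quoted text allows. */
typedef int (*cmp_fn_t)(const void *a, const void *b);

/* A structure spelled out as a structure, so users can see what it is. */
struct item_cache {
	size_t object_size;
	cmp_fn_t compare;
};

/* Three-way comparison of two ints, usable wherever a cmp_fn_t is wanted. */
static int cmp_int(const void *a, const void *b)
{
	int x = *(const int *)a, y = *(const int *)b;

	return (x > y) - (x < y);
}

static struct item_cache make_int_cache(void)
{
	struct item_cache c = { sizeof(int), cmp_int };

	return c;
}
```

A caller writes "struct item_cache c = make_int_cache();" and can see at
the declaration that it holds a structure, while the comparison callback
stays readable through the typedef.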

Thanks,
Nick

--
SUSE Labs, Novell Inc.




[PATCH -mm] ALSA: add struct forward declaration

2006-11-28 Thread Randy Dunlap
From: Randy Dunlap <[EMAIL PROTECTED]>

I see about 10 sets of these in a random config.

  CC  drivers/media/video/saa7134/saa7134-cards.o
In file included from drivers/media/video/saa7134/saa7134.h:43,
 from drivers/media/video/saa7134/saa7134-cards.c:27:
include/sound/pcm.h:59: warning: 'struct snd_pcm_substream' declared inside parameter list
include/sound/pcm.h:59: warning: its scope is only this definition or declaration, which is probably not what you want
include/sound/pcm.h:60: warning: 'struct snd_pcm_substream' declared inside parameter list
include/sound/pcm.h:62: warning: 'struct snd_pcm_substream' declared inside parameter list
include/sound/pcm.h:64: warning: 'struct snd_pcm_substream' declared inside parameter list
include/sound/pcm.h:65: warning: 'struct snd_pcm_substream' declared inside parameter list
include/sound/pcm.h:66: warning: 'struct snd_pcm_substream' declared inside parameter list
include/sound/pcm.h:67: warning: 'struct snd_pcm_substream' declared inside parameter list
include/sound/pcm.h:68: warning: 'struct snd_pcm_substream' declared inside parameter list
include/sound/pcm.h:71: warning: 'struct snd_pcm_substream' declared inside parameter list
include/sound/pcm.h:73: warning: 'struct snd_pcm_substream' declared inside parameter list
include/sound/pcm.h:75: warning: 'struct snd_pcm_substream' declared inside parameter list
include/sound/pcm.h:76: warning: 'struct snd_pcm_substream' declared inside parameter list
include/sound/pcm.h:77: warning: 'struct snd_pcm_substream' declared inside parameter list

Signed-off-by: Randy Dunlap <[EMAIL PROTECTED]>
---
 include/sound/pcm.h |2 ++
 1 file changed, 2 insertions(+)

--- linux-2.6.19-rc6-mm2.orig/include/sound/pcm.h
+++ linux-2.6.19-rc6-mm2/include/sound/pcm.h
@@ -55,6 +55,8 @@ struct snd_pcm_hardware {
 	size_t fifo_size;	/* fifo size in bytes */
 };
 
+struct snd_pcm_substream;
+
 struct snd_pcm_ops {
 	int (*open)(struct snd_pcm_substream *substream);
 	int (*close)(struct snd_pcm_substream *substream);
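
For readers unfamiliar with this warning, the following standalone sketch
shows the same pattern outside the kernel. The names struct substream,
struct stream_ops, demo_open and demo_close are invented for the example;
only the forward-declaration technique matches the patch.

```c
/*
 * Forward declaration: makes the tag visible at file scope, so every
 * prototype below refers to the same incomplete type. Without it, a
 * struct tag first seen inside a parameter list is scoped to that one
 * declaration, which is what the warning complains about.
 */
struct substream;

struct stream_ops {
	int (*open)(struct substream *s);
	int (*close)(struct substream *s);
};

/* The full definition can follow later (or live in another header). */
struct substream {
	int open_count;
};

static int demo_open(struct substream *s)
{
	return ++s->open_count;
}

static int demo_close(struct substream *s)
{
	return --s->open_count;
}

static const struct stream_ops demo_ops = { demo_open, demo_close };
```

Because the tag is declared before struct stream_ops, the function
pointers and the later definition all name one type, and demo_open and
demo_close can be stored in demo_ops without casts.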


---


The return of the dreaded "nobody cared" message with a Promise Card

2006-11-28 Thread Rogério Brito
Hi, Andrew, hi Alan, hi others.

First of all, I would kindly ask that you keep me Cc'ed on any replies.

I'm currently finishing the grades of loads and loads of students, and I'm
having a hard time keeping up with my health problems and real-life
work, let alone the traffic of lkml.

Well, let me get straight to the problem. I have an Asus A7V (Classic)
motherboard with a VIA KT133 chipset and it has the two usual VIA IDE
controllers and two extra Promise PDC20265 controllers. Right now, my
setup is the following (based on advice that Alan once gave me, though he may
not recall it):

* hda: DVD+-RW burner;
* hdc: Plain CD-ROM reader;
* hde: Seagate ST3160021A (7200.??) drive;
* hdg: QUANTUM FIREBALLlct15 30 drive.

The problem is that whenever I plug in the Quantum drive, I get stack
traces like this one (with a bit of context, so that you can get a sense of
what I am talking about):

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
ide1 at 0x170-0x177,0x376 on irq 15
PDC20265: IDE controller at PCI slot :00:11.0
ACPI: PCI Interrupt Link [LNKB] enabled at IRQ 10
PCI: setting IRQ 10 as level-triggered
ACPI: PCI Interrupt :00:11.0[A] -> Link [LNKB] -> GSI 10 (level, low) -> IRQ 10
PDC20265: chipset revision 2
PDC20265: ROM enabled at 0x3000
PDC20265: 100% native mode on irq 10
PDC20265: (U)DMA Burst Bit ENABLED Primary PCI Mode Secondary PCI Mode.
ide2: BM-DMA at 0x7400-0x7407, BIOS settings: hde:pio, hdf:pio
ide3: BM-DMA at 0x7408-0x740f, BIOS settings: hdg:DMA, hdh:pio
Probing IDE interface ide2...
hde: ST3160021A, ATA DISK drive
ide2 at 0x8800-0x8807,0x8402 on irq 10
Probing IDE interface ide3...
hdg: QUANTUM FIREBALLlct15 30, ATA DISK drive
irq 10: nobody cared (try booting with the "irqpoll" option)
 [] show_trace_log_lvl+0x58/0x16a
 [] show_trace+0xd/0x10
 [] dump_stack+0x19/0x1b
 [] __report_bad_irq+0x2e/0x6f
 [] note_interrupt+0x19f/0x1d5
 [] __do_IRQ+0xb5/0xeb
 [] do_IRQ+0x67/0x86
 [] common_interrupt+0x1a/0x20
DWARF2 unwinder stuck at common_interrupt+0x1a/0x20
Leftover inexact backtrace:
 ===
handlers:
[] (ide_intr+0x0/0x19b)
Disabling IRQ #10
Warning: Secondary channel requires an 80-pin cable for operation.
hdg reduced to Ultra33 mode.
ide3 at 0x8000-0x8007,0x7802 on irq 10
hde: max request size: 128KiB
hde: 312581808 sectors (160041 MB) w/2048KiB Cache, CHS=19457/255/63, UDMA(100)
hde: cache flushes supported
 hde: hde1 hde2 hde3 hde4
hdg: max request size: 128KiB
hdg: 58633344 sectors (30020 MB) w/418KiB Cache, CHS=58168/16/63, UDMA(33)
hdg: cache flushes not supported
 hdg: hdg1 hdg2 hdg3 hdg4
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

This is what I get when I boot with a 2.6.18.3 (custom) kernel *AND*
with the irqpoll option already enabled. The 2.6.19-rc6 kernel that I
have here doesn't work at all if I pass the irqpoll option: it just
freezes right at "Uncompressing Linux" and nothing is displayed for at
least a minute or so, so I think that it hung.

Since Linus Torvalds once said something to the effect that "users that
are willing to help with patches are worth their weight in gold", I
would like to contribute here. :-)

I am willing to do a git bisect to see which may be a problematic patch
or not, but the "irq 10: nobody cared (try booting with the "irqpoll"
option)" is one that I reported to Andrew quite some time ago (I thought
that it had gone away), and it didn't manifest itself until I had to
reuse this extra drive, since I am doing work that is producing a lot
of data.

Please, if you want any further information, don't hesitate to ask. I
can test patches that are even moderately invasive, since I'm taking
backups of the vital data of my system regularly.


Regards and thank you very much for any help, Rogério Brito.

-- 
Rogério Brito : [EMAIL PROTECTED] : http://www.ime.usp.br/~rbrito
Homepage of the algorithms package : http://algorithms.berlios.de
Homepage on freshmeat:  http://freshmeat.net/projects/algorithms/


Re: SCSI init discussion/SAN problem (not interesting)

2006-11-28 Thread Evan Rempel

Bernd Eckenfels wrote:
> In article <[EMAIL PROTECTED]> you wrote:
> > Was this post just not interesting enough, or is it the lack of access to
> > hardware to test this on that prevented it from being picked up by someone?
>
> see google, for example: http://christophe.varoqui.free.fr/multipath.html

While that information is accurate, it is not new to me.

I must have been unclear in my description of how the scsi device registration
with the kernel causes multipath devices to function inefficiently.

When a device has multiple paths, the kernel will see multiple scsi devices,
even though there is only one physical device. For each of the scsi devices
that the kernel can see, the partition table (or some other IO that I am
unaware of) is read from the device, meaning IO is generated on ALL paths to
the device. This isn't a problem for some devices, but on others it can
initiate a failover process which can take many seconds, only to have the
process repeated as IO is generated on a third path to the device.

Is it unreasonable for the scsi initialization routines to be aware that some
kernel scsi devices are really the same physical devices and register them
with the kernel WITHOUT generating any IO on the physical device?

Doing this, there would be a maximum of one failover per physical device
during the boot sequence. This one failover could be eliminated if the scsi
initialization code were aware of "active" paths and only generated IO on
active paths, rather than the first path.

All of this is before device mapper or multipath get their hands on the scsi
devices. It is completely within the scope of the scsi initialization code in
the kernel.

Is this more clear? If not, could someone ask for clarification of the fuzzy
parts?
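
In case it helps, the selection policy I have in mind can be sketched in
plain C as below. This is only a model: struct path, its wwid/active
fields and choose_probe_paths() are hypothetical names, and it assumes the
device identity and the "active" state of each path are already known
without issuing IO that triggers a failover, which the real scsi code may
not be able to guarantee.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical model of one kernel-visible scsi path. */
struct path {
	const char *wwid;	/* identity of the physical device behind it */
	int active;		/* non-zero if IO here would not cause failover */
	int probed;		/* set when this path is chosen for initial IO */
};

/*
 * Mark at most one path per physical device for probing, preferring an
 * active path and falling back to the first path seen. Returns the
 * number of paths that would receive IO.
 */
static size_t choose_probe_paths(struct path *p, size_t n)
{
	size_t i, j, probes = 0;

	for (i = 0; i < n; i++)
		p[i].probed = 0;

	for (i = 0; i < n; i++) {
		size_t pick = i;
		int seen = 0;

		/* Skip devices already handled through an earlier path. */
		for (j = 0; j < i; j++)
			if (strcmp(p[j].wwid, p[i].wwid) == 0)
				seen = 1;
		if (seen)
			continue;

		/* Prefer the first active path to the same device. */
		for (j = i; j < n; j++) {
			if (strcmp(p[j].wwid, p[i].wwid) == 0 && p[j].active) {
				pick = j;
				break;
			}
		}
		p[pick].probed = 1;
		probes++;
	}
	return probes;
}
```

With three paths to one LUN (only the second active) and one path to
another LUN, this picks two paths in total, so IO is generated once per
physical device and never on a passive path when an active one exists.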

Evan.


Re: [PATCH] prune_icache_sb

2006-11-28 Thread Wendy Cheng

Andrew Morton wrote:
> We shouldn't export this particular implementation to modules because it
> has bad failure modes.  There might be a case for exposing an
> i_sb_list-based API or, perhaps better, a max-unused-inodes mount option.

Ok, thanks for looking into this - it is appreciated. I'll try to figure 
out something else.


-- Wendy



Re: [PATCH 1/12] ext3 balloc: reset windowsz when full

2006-11-28 Thread Hugh Dickins
On Tue, 28 Nov 2006, Mingming Cao wrote:

> Port a series of ext2 balloc patches from Hugh to ext3/4. The first 6
> patches are against ext3, and the rest are against ext4.

Thanks for all that, Mingming:
whichever is appropriate, all twelve
Acked-by: Hugh Dickins <[EMAIL PROTECTED]>
or
Signed-off-by: Hugh Dickins <[EMAIL PROTECTED]>

I'll think about your other mails, those that need further thought,
later on: I need to pin down more accurately the repetitious sequence of
reservations in the mistaken case - maybe it indicated further issues,
maybe not; and I need to consider our different views of the my_rsv
find_next_usable_block.

Hugh


Re: 2.6.19-rc6-mm2

2006-11-28 Thread Avi Kivity

Andrew Morton wrote:

On Tue, 28 Nov 2006 19:24:45 -0500
Thomas Tuttle <[EMAIL PROTECTED]> wrote:

  

I've found a couple of bugs so far...

1. I did `modprobe kvm' and then tried running a version of the KVM Qemu
compiled for a different kernel.  My mistake.  But I got an oops:

BUG: unable to handle kernel NULL pointer dereference at virtual address 
0008
Code: 14 0f 87 77 02 00 00 8b 0c b5 00 15 20 f9 85 c9 0f 84 68 02 00 00 89 ea 89 f8 ff d1 85 c0 0f 84 4c 02 00 00 89 f8 e8 31 e9 ff ff <65> a1 08 00 00 00 8b 40 04 8b 40 08 a8 04 0f 85 ae 02 00 00 e8 
EIP: [] kvm_vmx_return+0xef/0x4d0 [kvm] SS:ESP 0068:e5a4fd54





65 a1 08 00 00 00   mov    %gs:0x8,%eax

kvm isn't restoring gs properly.

I'll look into it.



Oh, and I get a ton of these messages with kvm:

rtc: lost some interrupts at 1024Hz.



  


I'll look into these too, though I'm not sure where.


--
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.



[PATCH -mm] char: drivers use/need PCI

2006-11-28 Thread Randy Dunlap
From: Randy Dunlap <[EMAIL PROTECTED]>

With CONFIG_PCI=n:
drivers/char/mxser_new.c: In function 'mxser_release_res':
drivers/char/mxser_new.c:2383: warning: implicit declaration of function 'pci_release_region'
drivers/char/mxser_new.c: In function 'mxser_probe':
drivers/char/mxser_new.c:2578: warning: implicit declaration of function 'pci_request_region'
drivers/built-in.o: In function `sx_remove_card':
sx.c:(.text.sx_remove_card+0x65): undefined reference to `pci_release_region'
drivers/char/isicom.c: In function 'isicom_probe':
drivers/char/isicom.c:1793: warning: implicit declaration of function 'pci_request_region'
drivers/char/isicom.c:1827: warning: implicit declaration of function 'pci_release_region'

Signed-off-by: Randy Dunlap <[EMAIL PROTECTED]>
---
 drivers/char/Kconfig |6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

--- linux-2.6.19-rc6-mm2.orig/drivers/char/Kconfig
+++ linux-2.6.19-rc6-mm2/drivers/char/Kconfig
@@ -203,7 +203,7 @@ config MOXA_SMARTIO
 
 config MOXA_SMARTIO_NEW
 	tristate "Moxa SmartIO support v. 2.0 (EXPERIMENTAL)"
-	depends on SERIAL_NONSTANDARD
+	depends on SERIAL_NONSTANDARD && PCI
 	help
 	  Say Y here if you have a Moxa SmartIO multiport serial card and/or
 	  want to help develop a new version of this driver.
@@ -218,7 +218,7 @@ config MOXA_SMARTIO_NEW
 
 config ISI
 	tristate "Multi-Tech multiport card support (EXPERIMENTAL)"
-	depends on SERIAL_NONSTANDARD
+	depends on SERIAL_NONSTANDARD && PCI
 	select FW_LOADER
 	help
 	  This is a driver for the Multi-Tech cards which provide several
@@ -312,7 +312,7 @@ config SPECIALIX_RTSCTS
 
 config SX
 	tristate "Specialix SX (and SI) card support"
-	depends on SERIAL_NONSTANDARD
+	depends on SERIAL_NONSTANDARD && PCI
 	help
 	  This is a driver for the SX and SI multiport serial cards.
 	  Please read the file  for details.


---


[PATCH -mm] MTD: ESB2ROM uses PCI

2006-11-28 Thread Randy Dunlap
From: Randy Dunlap <[EMAIL PROTECTED]>

ESB2ROM uses PCI interface functions.

With CONFIG_PCI=n:
drivers/mtd/maps/esb2rom.c: In function 'esb2rom_init_one':
drivers/mtd/maps/esb2rom.c:167: warning: implicit declaration of function 'pci_dev_get'

Signed-off-by: Randy Dunlap <[EMAIL PROTECTED]>
---
 drivers/mtd/maps/Kconfig |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- linux-2.6.19-rc6-mm2.orig/drivers/mtd/maps/Kconfig
+++ linux-2.6.19-rc6-mm2/drivers/mtd/maps/Kconfig
@@ -186,7 +186,7 @@ config MTD_ICHXROM
 
 config MTD_ESB2ROM
 	tristate "BIOS flash chip on Intel ESB Controller Hub 2"
-	depends on X86 && MTD_JEDECPROBE
+	depends on X86 && MTD_JEDECPROBE && PCI
 	help
 	  Support for treating the BIOS flash chip on ESB2 motherboards
 	  as an MTD device - with this you can reprogram your BIOS.


---


Re: [v4l-dvb-maintainer] [2.6 patch] remove DVB_AV7110_FIRMWARE

2006-11-28 Thread Adrian Bunk
On Tue, Nov 28, 2006 at 08:45:56PM -0800, Trent Piepho wrote:
> On Wed, 29 Nov 2006, Adrian Bunk wrote:
> > On Tue, Nov 28, 2006 at 01:06:02PM -0800, Trent Piepho wrote:
> > > On Sun, 26 Nov 2006, Adrian Bunk wrote:
> > > > DVB_AV7110_FIRMWARE was (except for some OSS drivers) the only option
> > > > that was still compiling a binary-only user-supplied firmware file at
> > > > build-time into the kernel.
> > > >
> > > > This patch changes the driver to always use the standard
> > > > request_firmware() way for firmware by removing DVB_AV7110_FIRMWARE.
> > >
> > > Doesn't this also prevent the AV7110 module from getting compiled
> > > into the kernel?  Shouldn't the Kconfig file be adjusted so
> > > that 'y' can't be selected anymore and it depends on MODULES?
> >
> > No.
> > No.
> >
> > request_firmware() works fine for built-in drivers.
> 
> Wouldn't that require loading the firmware file before the filesystems are
> mounted?

Sure.

cu
Adrian

-- 

   "Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   "Only a promise," Lao Er said.
   Pearl S. Buck - Dragon Seed



Re: Broken commit: [NETFILTER]: ipt_REJECT: remove largely duplicate route_reverse function

2006-11-28 Thread David Miller
From: Herbert Xu <[EMAIL PROTECTED]>
Date: Wed, 29 Nov 2006 15:56:57 +1100

> David Miller <[EMAIL PROTECTED]> wrote:
> > 
> > Longer term this is really messy, we should handle this some
> > other way.
> 
> Definitely.  I'm not sure whether 48 is enough even for recursive
> tunnels.  This should really just be a hint.  It's OK to spend a
> bit of time reallocating skb's if it's too small, but it's not OK
> to die.

The recursive tunnel case is handled by the PMTU reductions
in the route, isn't it?


Re: [patch 2.6.19-rc6] Stop gcc 4.1.0 optimizing wait_hpet_tick away

2006-11-28 Thread Nicholas Miell
On Wed, 2006-11-29 at 15:30 +1100, Keith Owens wrote:
> David Miller (on Tue, 28 Nov 2006 20:04:53 -0800 (PST)) wrote:
> >From: Keith Owens 
> >Date: Wed, 29 Nov 2006 14:56:20 +1100
> >
> >> Secondly, I believe that this is a separate problem from bug 22278.
> >> hpet_readl() is correctly using volatile internally, but its result is
> >> being assigned to a pair of normal integers (not declared as volatile).
> >> In the context of wait_hpet_tick, all the variables are unqualified so
> >> gcc is allowed to optimize the comparison away.
> >> 
> >> The same problem may exist in other parts of arch/i386/kernel/time_hpet.c,
> >> where the return value from hpet_readl() is assigned to a normal
> >> variable.  Nothing in the C standard says that those unqualified
> >> variables should be magically treated as volatile, just because the
> >> original code that extracted the value used volatile.  IOW, time_hpet.c
> >> needs to declare any variables that hold the result of hpet_readl() as
> >> being volatile variables.
> >
> >I disagree with this.
> >
> >readl() returns values from an opaque source, and it is declared
> >as such to show this to GCC.  It's like a function that GCC
> >cannot see the implementation of, which it cannot determine
> >anything about wrt. return values.
> >
> >The volatile'ness does not simply disappear the moment you
> >assign the result to some local variable which is not volatile.
> >
> >Half of our drivers would break if this were true.
> 
> This is definitely a gcc bug, 4.1.0 is doing something weird.  Compile
> with CONFIG_CC_OPTIMIZE_FOR_SIZE=n and the bug appears,
> CONFIG_CC_OPTIMIZE_FOR_SIZE=y has no problem.
> 
> Compile with CONFIG_CC_OPTIMIZE_FOR_SIZE=n and _either_ of the patches
> below and the problem disappears.
> 

My theory: gcc is inlining readl into hpet_readl (readl is an inline
function, so it should be doing this no matter what), and inlining
hpet_readl into wait_hpet_tick (otherwise, it can't possibly make any
assumptions about the return values of hpet_readl -- this looks to be a
SUSE-specific over-aggressive optimization), and somewhere along the way
the volatile qualifier is getting lost.
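
For reference, the polling pattern under discussion looks roughly like the
userspace analogue below. This is not the kernel's wait_hpet_tick() or
hpet_readl(); reg_read32() and wait_for_tick() are names invented for the
sketch, and the bounded loop is an addition so the example always
terminates.

```c
#include <stdint.h>

/*
 * Read a 32-bit register through a volatile-qualified pointer, so the
 * compiler must perform a real load on every call.
 */
static inline uint32_t reg_read32(const volatile uint32_t *reg)
{
	return *reg;
}

/*
 * Poll until the counter differs from its first observed value, or
 * until max_polls reads have been made. Returns 1 if a change was seen,
 * 0 if the poll budget ran out.
 */
static int wait_for_tick(const volatile uint32_t *counter, unsigned max_polls)
{
	uint32_t start = reg_read32(counter);

	while (max_polls--) {
		if (reg_read32(counter) != start)
			return 1;
	}
	return 0;
}
```

The locals holding the results are ordinary, unqualified integers; the
loads themselves are what carry the volatile access, so a conforming
compiler has to repeat them on every iteration even after inlining.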

-- 
Nicholas Miell <[EMAIL PROTECTED]>



Re: Broken commit: [NETFILTER]: ipt_REJECT: remove largely duplicate route_reverse function

2006-11-28 Thread Herbert Xu
David Miller <[EMAIL PROTECTED]> wrote:
> 
> Longer term this is really messy, we should handle this some
> other way.

Definitely.  I'm not sure whether 48 is enough even for recursive
tunnels.  This should really just be a hint.  It's OK to spend a
bit of time reallocating skb's if it's too small, but it's not OK
to die.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [PATCH 3/4] sysctl: Simplify ipc ns specific sysctls

2006-11-28 Thread Serge E. Hallyn
Quoting Eric W. Biederman ([EMAIL PROTECTED]):
> This patch refactors the ipc sysctl support so that it is
> simpler, more readable, and prepares for fixing the bug
> with the wrong values being returned in the sys_sysctl interface.
> 
> The function proc_do_ipc_string was misnamed as it never handled
> strings.  Its magic of when to work with strings and when to work
> with longs belonged in the sysctl table.  I couldn't tell if the
> code would work if you disabled the ipc namespace but it certainly
> looked like it would have problems.
> 
> Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]>

Hi,

A little belated (sorry), but the only comment I have right now on the
patchset is that the get_ipc() seems like it shouldn't take the write
arg.  Perhaps if consistency is the concern, get_uts() should simply
be called get_uts_locked(table, need_write) ?  This also avoids the
mysterious '1' argument in the next patch at get_ipc(table, 1);

Oh, I lied, one more comment.  It seems worth a comment at the top of
get_uts() and get_ipc() explaining that table->data points to
init_uts->data and that's why the 'which = which - init_uts + uts'
works.
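
The kind of comment I mean would explain something like the sketch below,
which shows the same pointer rebasing in standalone C. struct ns_data,
init_ns and rebase_field() are made-up names for illustration and are not
the actual uts/ipc code.

```c
#include <stddef.h>

/* Hypothetical namespace-like structure; field names are illustrative. */
struct ns_data {
	int msgmax;
	int msgmnb;
};

static struct ns_data init_ns = { 8192, 16384 };

/*
 * `which` points at a field inside init_ns, as a sysctl table entry's
 * data pointer would. Keep its byte offset within init_ns and apply
 * that offset to `ns` to reach the same field in another instance.
 */
static void *rebase_field(void *which, struct ns_data *ns)
{
	char *p = which;
	ptrdiff_t off = p - (char *)&init_ns;

	return (char *)ns + off;
}
```

Passing &init_ns.msgmnb together with a second struct ns_data yields a
pointer to that second structure's msgmnb, which only works because the
table's data pointer is known to point into the initial instance.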

thanks,
-serge


Re: [v4l-dvb-maintainer] [2.6 patch] remove DVB_AV7110_FIRMWARE

2006-11-28 Thread Trent Piepho
On Wed, 29 Nov 2006, Adrian Bunk wrote:
> On Tue, Nov 28, 2006 at 01:06:02PM -0800, Trent Piepho wrote:
> > On Sun, 26 Nov 2006, Adrian Bunk wrote:
> > > DVB_AV7110_FIRMWARE was (except for some OSS drivers) the only option
> > > that was still compiling a binary-only user-supplied firmware file at
> > > build-time into the kernel.
> > >
> > > This patch changes the driver to always use the standard
> > > request_firmware() way for firmware by removing DVB_AV7110_FIRMWARE.
> >
> > Doesn't this also prevent the AV7110 module from getting compiled
> > into the kernel?  Shouldn't the Kconfig file be adjusted so
> > that 'y' can't be selected anymore and it depends on MODULES?
>
> No.
> No.
>
> request_firmware() works fine for built-in drivers.

Wouldn't that require loading the firmware file before the filesystems are
mounted?


Re: Broken commit: [NETFILTER]: ipt_REJECT: remove largely duplicate route_reverse function

2006-11-28 Thread David Miller
From: Herbert Xu <[EMAIL PROTECTED]>
Date: Wed, 29 Nov 2006 15:38:29 +1100

> David Miller <[EMAIL PROTECTED]> wrote:
> > 
> > diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> > index 9264139..95e86ac 100644
> > --- a/include/linux/netdevice.h
> > +++ b/include/linux/netdevice.h
> > @@ -94,7 +94,9 @@ #endif
> > #endif
> > 
> > #if !defined(CONFIG_NET_IPIP) && \
> > -!defined(CONFIG_IPV6) && !defined(CONFIG_IPV6_MODULE)
> > +!defined(CONFIG_NET_IPGRE) && \
> > +!defined(CONFIG_IPV6_SIT) && \
> > +!defined(CONFIG_IPV6_TUNNEL)
> > #define MAX_HEADER LL_MAX_HEADER
> > #else
> > #define MAX_HEADER (LL_MAX_HEADER + 48)
> 
> What if ipip/gre are modules?

Good catch, I'll fix that up by adding the missing CONFIG_*_MODULE
cases.

Longer term this is really messy, we should handle this some
other way.
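
A minimal sketch of what the test might look like with the module cases added. It assumes that a tristate option built as a module defines CONFIG_FOO_MODULE rather than CONFIG_FOO; the LL_MAX_HEADER value and the max_header_sketch() helper are placeholders for illustration only.

```c
#include <assert.h>

#define LL_MAX_HEADER 32		/* placeholder value for this sketch */

/* Simulate a configuration where only GRE is enabled, as a module. */
#define CONFIG_NET_IPGRE_MODULE 1

#if !defined(CONFIG_NET_IPIP) && !defined(CONFIG_NET_IPIP_MODULE) && \
    !defined(CONFIG_NET_IPGRE) && !defined(CONFIG_NET_IPGRE_MODULE) && \
    !defined(CONFIG_IPV6_SIT) && !defined(CONFIG_IPV6_SIT_MODULE) && \
    !defined(CONFIG_IPV6_TUNNEL) && !defined(CONFIG_IPV6_TUNNEL_MODULE)
#define MAX_HEADER LL_MAX_HEADER
#else
#define MAX_HEADER (LL_MAX_HEADER + 48)
#endif

/* Expose the preprocessor result so it can be checked at run time. */
static int max_header_sketch(void)
{
	assert(MAX_HEADER >= LL_MAX_HEADER);
	return MAX_HEADER;
}
```

Without the _MODULE tests, the configuration above would fall into the first branch and leave no room for the tunnel header.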


Re: Broken commit: [NETFILTER]: ipt_REJECT: remove largely duplicate route_reverse function

2006-11-28 Thread Herbert Xu
David Miller <[EMAIL PROTECTED]> wrote:
> 
> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> index 9264139..95e86ac 100644
> --- a/include/linux/netdevice.h
> +++ b/include/linux/netdevice.h
> @@ -94,7 +94,9 @@ #endif
> #endif
> 
> #if !defined(CONFIG_NET_IPIP) && \
> -!defined(CONFIG_IPV6) && !defined(CONFIG_IPV6_MODULE)
> +!defined(CONFIG_NET_IPGRE) && \
> +!defined(CONFIG_IPV6_SIT) && \
> +!defined(CONFIG_IPV6_TUNNEL)
> #define MAX_HEADER LL_MAX_HEADER
> #else
> #define MAX_HEADER (LL_MAX_HEADER + 48)

What if ipip/gre are modules?

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: 2.6.19-rc6-mm2 is ok (2.6.19-rc1-mm1+ memory problem)

2006-11-28 Thread Michael Raskin

Michael Raskin wrote:
I have a strange problem with 2.6.19-rc-mm kernels. After I load X, I 
notice that memory is marked used at rate of tens of KB/s. Then it 


Tried 2.6.19-rc6-mm2. Now the problem is gone. Sometimes memory is 
getting marked used as before, but when the loss reaches a few MB it is 
all freed. After 3 hours of X + all those scripts that caused the leak + 
Thunderbird I can still shut down everything except a few processes and 
have only 50MB used. The script that demonstrated the leak is now working 
without problems and without eating memory.


Thanks.


Re: [patch 2.6.19-rc6] Stop gcc 4.1.0 optimizing wait_hpet_tick away

2006-11-28 Thread Keith Owens
David Miller (on Tue, 28 Nov 2006 20:04:53 -0800 (PST)) wrote:
>From: Keith Owens 
>Date: Wed, 29 Nov 2006 14:56:20 +1100
>
>> Secondly, I believe that this is a separate problem from bug 22278.
>> hpet_readl() is correctly using volatile internally, but its result is
>> being assigned to a pair of normal integers (not declared as volatile).
>> In the context of wait_hpet_tick, all the variables are unqualified so
>> gcc is allowed to optimize the comparison away.
>> 
>> The same problem may exist in other parts of arch/i386/kernel/time_hpet.c,
>> where the return value from hpet_readl() is assigned to a normal
>> variable.  Nothing in the C standard says that those unqualified
>> variables should be magically treated as volatile, just because the
>> original code that extracted the value used volatile.  IOW, time_hpet.c
>> needs to declare any variables that hold the result of hpet_readl() as
>> being volatile variables.
>
>I disagree with this.
>
>readl() returns values from an opaque source, and it is declared
>as such to show this to GCC.  It's like a function that GCC
>cannot see the implementation of, which it cannot determine
>anything about wrt. return values.
>
>The volatile'ness does not simply disappear the moment you
>assign the result to some local variable which is not volatile.
>
>Half of our drivers would break if this were true.

This is definitely a gcc bug, 4.1.0 is doing something weird.  Compile
with CONFIG_CC_OPTIMIZE_FOR_SIZE=n and the bug appears,
CONFIG_CC_OPTIMIZE_FOR_SIZE=y has no problem.

Compile with CONFIG_CC_OPTIMIZE_FOR_SIZE=n and _either_ of the patches
below and the problem disappears.

Index: linux/arch/i386/kernel/time_hpet.c
===
--- linux.orig/arch/i386/kernel/time_hpet.c 2006-11-29 13:51:33.900462088 +1100
+++ linux/arch/i386/kernel/time_hpet.c  2006-11-29 15:25:47.853245938 +1100
@@ -35,7 +35,8 @@ static void __iomem * hpet_virt_address;
 
 int hpet_readl(unsigned long a)
 {
-   return readl(hpet_virt_address + a);
+   volatile int v = readl(hpet_virt_address + a);
+   return v;
 }
 
 static void hpet_writel(unsigned long d, unsigned long a)


Index: linux-2.6/arch/i386/kernel/time_hpet.c
===
--- linux-2.6.orig/arch/i386/kernel/time_hpet.c
+++ linux-2.6/arch/i386/kernel/time_hpet.c
@@ -51,7 +51,7 @@ static void hpet_writel(unsigned long d,
  */
 static void __devinit wait_hpet_tick(void)
 {
-   unsigned int start_cmp_val, end_cmp_val;
+   unsigned volatile int start_cmp_val, end_cmp_val;
 
    start_cmp_val = hpet_readl(HPET_T0_CMP);
    do {



Re: Broken commit: [NETFILTER]: ipt_REJECT: remove largely duplicate route_reverse function

2006-11-28 Thread David Miller
From: Patrick McHardy <[EMAIL PROTECTED]>
Date: Wed, 29 Nov 2006 03:28:25 +0100

> [NETFILTER]: ipt_REJECT: fix memory corruption
> 
> On devices with hard_header_len > LL_MAX_HEADER ip_route_me_harder()
> reallocates the skb, leading to memory corruption when using the stale
> tcph pointer to update the checksum.
> 
> Signed-off-by: Patrick McHardy <[EMAIL PROTECTED]>

Applied, thanks Patrick.

And based upon your discovery wrt. MAX_HEADER I'm also
applying the following.

commit 93e3a20d6c67a09b867431e7d5b3e7bc97154fab
Author: David S. Miller <[EMAIL PROTECTED]>
Date:   Tue Nov 28 20:24:10 2006 -0800

[NET]: Fix MAX_HEADER setting.

MAX_HEADER is either set to LL_MAX_HEADER or LL_MAX_HEADER + 48, and
this is controlled by a set of CONFIG_* ifdef tests.

It is trying to use LL_MAX_HEADER + 48 when any of the tunnels are
enabled which set hard_header_len like this:

dev->hard_header_len = LL_MAX_HEADER + sizeof(struct xxx);

The correct set of tunnel drivers which do this are:

ipip
ip_gre
ip6_tunnel
sit

so make the ifdef test match.

Noticed by Patrick McHardy.

Signed-off-by: David S. Miller <[EMAIL PROTECTED]>

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 9264139..95e86ac 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -94,7 +94,9 @@ #endif
 #endif
 
 #if !defined(CONFIG_NET_IPIP) && \
-!defined(CONFIG_IPV6) && !defined(CONFIG_IPV6_MODULE)
+!defined(CONFIG_NET_IPGRE) && \
+!defined(CONFIG_IPV6_SIT) && \
+!defined(CONFIG_IPV6_TUNNEL)
 #define MAX_HEADER LL_MAX_HEADER
 #else
 #define MAX_HEADER (LL_MAX_HEADER + 48)


[PATCH 10/12] ext4 balloc: say rb_entry not list_entry

2006-11-28 Thread Mingming Cao

--
Subject: ext2 balloc: say rb_entry not list_entry
From: Hugh Dickins <[EMAIL PROTECTED]>

The reservations tree is an rb_tree not a list, so it's less confusing to use
rb_entry() than list_entry() - though they're both just container_of().
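
A minimal sketch of why the two macros are interchangeable, assuming simplified stand-in types (container_of_sketch, struct node_sketch and struct window_sketch are hypothetical; the kernel's own macro also does type checking): both recover the enclosing structure from a pointer to an embedded member.

```c
#include <assert.h>
#include <stddef.h>

/* Portable approximation of container_of(): subtract the member offset. */
#define container_of_sketch(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical tree node embedded in a larger structure. */
struct node_sketch {
	struct node_sketch *left, *right;
};

struct window_sketch {
	unsigned long start, end;
	struct node_sketch rsv_node;
};

/* What rb_entry() and list_entry() both boil down to. */
static struct window_sketch *window_of_sketch(struct node_sketch *n)
{
	assert(n != NULL);
	return container_of_sketch(n, struct window_sketch, rsv_node);
}
```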

--

Sync up this fix in ext4

Signed-off-by: Mingming Cao <[EMAIL PROTECTED]>
---


---

 linux-2.6.19-rc5-cmm/fs/ext4/balloc.c |6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff -puN fs/ext4/balloc.c~ext4-balloc-say-rb_entry-not-list_entry fs/ext4/balloc.c
--- linux-2.6.19-rc5/fs/ext4/balloc.c~ext4-balloc-say-rb_entry-not-list_entry   2006-11-28 19:37:08.0 -0800
+++ linux-2.6.19-rc5-cmm/fs/ext4/balloc.c   2006-11-28 19:37:08.0 -0800
@@ -165,7 +165,7 @@ restart:
 
printk("Block Allocation Reservation Windows Map (%s):\n", fn);
while (n) {
-   rsv = list_entry(n, struct ext4_reserve_window_node, rsv_node);
+   rsv = rb_entry(n, struct ext4_reserve_window_node, rsv_node);
if (verbose)
printk("reservation window 0x%p "
   "start:  %llu, end:  %llu\n",
@@ -966,7 +966,7 @@ static int find_next_reservable_window(
 
prev = rsv;
next = rb_next(&rsv->rsv_node);
-   rsv = list_entry(next,struct ext4_reserve_window_node,rsv_node);
+   rsv = rb_entry(next,struct ext4_reserve_window_node,rsv_node);
 
/*
 * Reached the last reservation, we can just append to the
@@ -1210,7 +1210,7 @@ static void try_to_extend_reservation(st
if (!next)
my_rsv->rsv_end += size;
else {
-   next_rsv = list_entry(next, struct ext4_reserve_window_node, rsv_node);
+   next_rsv = rb_entry(next, struct ext4_reserve_window_node, rsv_node);
 
if ((next_rsv->rsv_start - my_rsv->rsv_end - 1) >= size)
my_rsv->rsv_end += size;

_




[PATCH 9/12] ext4 balloc: fix off-by-one against rsv_end

2006-11-28 Thread Mingming Cao

--
Subject: ext2 balloc: fix off-by-one against rsv_end
From: Hugh Dickins <[EMAIL PROTECTED]>

rsv_end is the last block within the reservation, so alloc_new_reservation
should accept start_block == rsv_end as success.
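
A minimal sketch of the inclusive range check, assuming a simplified window structure (struct rsv_window_sketch and block_in_window_sketch() are hypothetical names): because rsv_end is itself part of the reservation, the upper comparison has to be <= rather than <.

```c
#include <assert.h>

/* Hypothetical reservation window; both bounds are inclusive. */
struct rsv_window_sketch {
	unsigned long rsv_start;
	unsigned long rsv_end;	/* last block inside the window */
};

static int block_in_window_sketch(const struct rsv_window_sketch *w,
				  unsigned long block)
{
	assert(w->rsv_start <= w->rsv_end);
	return block >= w->rsv_start && block <= w->rsv_end;
}
```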
--
Sync up an ext2 reservation fix in ext4
Signed-off-by: Mingming Cao <[EMAIL PROTECTED]>


---

 linux-2.6.19-rc5-cmm/fs/ext4/balloc.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff -puN fs/ext4/balloc.c~ext4-balloc-fix-off-by-one-against-rsv_end fs/ext4/balloc.c
--- linux-2.6.19-rc5/fs/ext4/balloc.c~ext4-balloc-fix-off-by-one-against-rsv_end   2006-11-28 19:37:15.0 -0800
+++ linux-2.6.19-rc5-cmm/fs/ext4/balloc.c   2006-11-28 19:37:15.0 -0800
@@ -1165,7 +1165,7 @@ retry:
 * check if the first free block is within the
 * free space we just reserved
 */
-   if (start_block >= my_rsv->rsv_start && start_block < my_rsv->rsv_end)
+   if (start_block >= my_rsv->rsv_start && start_block <= my_rsv->rsv_end)
return 0;   /* success */
/*
 * if the first free bit we found is out of the reservable space

_




[PATCH 2/12] ext3 balloc: fix off-by-one against grp_goal

2006-11-28 Thread Mingming Cao

--
Subject: ext2 balloc: fix off-by-one against grp_goal
From: Hugh Dickins <[EMAIL PROTECTED]>

grp_goal 0 is a genuine goal (unlike -1), so ext2_try_to_allocate_with_rsv
should treat it as such.
--

Sync up with the ext2 reservation fix in ext3

Signed-off-by: Mingming Cao <[EMAIL PROTECTED]>
---


---

 linux-2.6.19-rc5-cmm/fs/ext3/balloc.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff -puN fs/ext3/balloc.c~ext3-balloc-fix-off-by-one-against-grp_goal fs/ext3/balloc.c
--- linux-2.6.19-rc5/fs/ext3/balloc.c~ext3-balloc-fix-off-by-one-against-grp_goal   2006-11-28 19:36:48.0 -0800
+++ linux-2.6.19-rc5-cmm/fs/ext3/balloc.c   2006-11-28 19:36:48.0 -0800
@@ -1271,7 +1271,7 @@ ext3_try_to_allocate_with_rsv(struct sup
}
/*
 * grp_goal is a group relative block number (if there is a goal)
-* 0 < grp_goal < EXT3_BLOCKS_PER_GROUP(sb)
+* 0 <= grp_goal < EXT3_BLOCKS_PER_GROUP(sb)
 * first block is a filesystem wide block number
 * first block is the block number of the first block in this group
 */
@@ -1307,7 +1307,7 @@ ext3_try_to_allocate_with_rsv(struct sup
if (!goal_in_my_reservation(&my_rsv->rsv_window,
grp_goal, group, sb))
grp_goal = -1;
-   } else if (grp_goal > 0) {
+   } else if (grp_goal >= 0) {
int curr = my_rsv->rsv_end -
(grp_goal + group_first_block) + 1;
 

_




[PATCH 12/12] ext4 balloc: fix _with_rsv freeze

2006-11-28 Thread Mingming Cao

--
Subject: ext2 balloc: fix _with_rsv freeze
From: Hugh Dickins <[EMAIL PROTECTED]>

After several days of testing ext2 with reservations, it got caught inside
ext2_try_to_allocate_with_rsv: alloc_new_reservation repeatedly succeeding on
the window [12cff,12d0e], ext2_try_to_allocate repeatedly failing to find the
free block guaranteed to be included (unless there's contention).

Fix the range to find_next_usable_block's memscan: the scan from "here"
(0xcfe) up to (but excluding) "maxblocks" (0xd0e) needs to scan 3 bytes not 2
(the relevant bytes of bitmap in this case being f7 df ff - none 00, but the
premature cutoff implying that the last was found 00).
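
A worked version of the byte-count arithmetic above, as a minimal sketch (scan_len_old_sketch() and scan_len_new_sketch() are hypothetical helper names): the old expression rounds the bit distance up to bytes, which undercounts when "here" is not byte aligned, while the new one subtracts the first byte index from the index one past the last byte.

```c
#include <assert.h>

/* Old length: bit distance rounded up to whole bytes. */
static unsigned int scan_len_old_sketch(unsigned int here,
					unsigned int maxblocks)
{
	assert(here <= maxblocks);
	return (maxblocks - here + 7) >> 3;
}

/* New length: (one past the last byte) minus (first byte). */
static unsigned int scan_len_new_sketch(unsigned int here,
					unsigned int maxblocks)
{
	assert(here <= maxblocks);
	return ((maxblocks + 7) >> 3) - (here >> 3);
}
```

For here = 0xcfe and maxblocks = 0xd0e the bits span bytes 415, 416 and 417 of the bitmap, so the old expression gives 2 and the new one gives 3.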

Is this a problem for mainline ext2?  No, because the "size" in its memscan is
always EXT2_BLOCKS_PER_GROUP(sb), which mkfs.ext2 requires to be a multiple of
8.  Is this a problem for ext3 or ext4?  No, because they have an additional
extN_test_allocatable test which rescues them from the error.

--

Sync up a reservation fix from ext2 in ext4
Signed-off-by: Mingming Cao <[EMAIL PROTECTED]>


---

 linux-2.6.19-rc5-cmm/fs/ext4/balloc.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff -puN fs/ext4/balloc.c~ext4-balloc-fix-_with_rsv-freeze fs/ext4/balloc.c
--- linux-2.6.19-rc5/fs/ext4/balloc.c~ext4-balloc-fix-_with_rsv-freeze   2006-11-28 19:37:12.0 -0800
+++ linux-2.6.19-rc5-cmm/fs/ext4/balloc.c   2006-11-28 19:37:12.0 -0800
@@ -747,7 +747,7 @@ find_next_usable_block(ext4_grpblk_t sta
here = 0;
 
p = ((char *)bh->b_data) + (here >> 3);
-   r = memscan(p, 0, (maxblocks - here + 7) >> 3);
+   r = memscan(p, 0, ((maxblocks + 7) >> 3) - (here >> 3));
next = (r - ((char *)bh->b_data)) << 3;
 
if (next < maxblocks && next >= start && ext4_test_allocatable(next, bh))

_




[PATCH 5/12] ext3 balloc: use io_error label

2006-11-28 Thread Mingming Cao

--
Subject: ext2 balloc: use io_error label
From: Hugh Dickins <[EMAIL PROTECTED]>

ext2_new_blocks has a nice io_error label for setting -EIO, so goto that in
the one place that doesn't already use it.
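
A minimal sketch of the shared error-label pattern, assuming made-up helpers (lookup_sketch() and sum_two_sketch() are hypothetical and unrelated to the real allocator): every failing lookup jumps to one label that sets the error code, instead of repeating the assignment at each site.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical lookup that can fail by returning NULL. */
static const int *lookup_sketch(const int *table, size_t n, size_t idx)
{
	return idx < n ? &table[idx] : NULL;
}

static int sum_two_sketch(const int *table, size_t n,
			  size_t a, size_t b, int *errp)
{
	const int *pa, *pb;

	assert(errp != NULL);
	*errp = 0;

	pa = lookup_sketch(table, n, a);
	if (!pa)
		goto io_error;
	pb = lookup_sketch(table, n, b);
	if (!pb)
		goto io_error;
	return *pa + *pb;

io_error:
	/* Single place where the error code is set. */
	*errp = -EIO;
	return 0;
}
```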

--
Fix it in ext3_new_blocks.

Signed-off-by: Mingming Cao <[EMAIL PROTECTED]>


---

 linux-2.6.19-rc5-cmm/fs/ext3/balloc.c |6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff -puN fs/ext3/balloc.c~ext3-balloc-use-io_error-label fs/ext3/balloc.c
--- linux-2.6.19-rc5/fs/ext3/balloc.c~ext3-balloc-use-io_error-label   2006-11-28 19:45:51.0 -0800
+++ linux-2.6.19-rc5-cmm/fs/ext3/balloc.c   2006-11-28 19:45:51.0 -0800
@@ -1515,10 +1515,8 @@ retry_alloc:
if (group_no >= ngroups)
group_no = 0;
gdp = ext3_get_group_desc(sb, group_no, &gdp_bh);
-   if (!gdp) {
-   *errp = -EIO;
-   goto out;
-   }
+   if (!gdp)
+   goto io_error;
free_blocks = le16_to_cpu(gdp->bg_free_blocks_count);
/*
 * skip this group if the number of

_




[PATCH 6/12] ext3 balloc: fix _with_rsv freeze

2006-11-28 Thread Mingming Cao

Sync up a reservation fix from ext2 in ext3
--
Subject: ext2 balloc: fix _with_rsv freeze
From: Hugh Dickins <[EMAIL PROTECTED]>

After several days of testing ext2 with reservations, it got caught inside
ext2_try_to_allocate_with_rsv: alloc_new_reservation repeatedly succeeding on
the window [12cff,12d0e], ext2_try_to_allocate repeatedly failing to find the
free block guaranteed to be included (unless there's contention).

Fix the range to find_next_usable_block's memscan: the scan from "here"
(0xcfe) up to (but excluding) "maxblocks" (0xd0e) needs to scan 3 bytes not 2
(the relevant bytes of bitmap in this case being f7 df ff - none 00, but the
premature cutoff implying that the last was found 00).

Is this a problem for mainline ext2?  No, because the "size" in its memscan is
always EXT2_BLOCKS_PER_GROUP(sb), which mkfs.ext2 requires to be a multiple of
8.  Is this a problem for ext3 or ext4?  No, because they have an additional
extN_test_allocatable test which rescues them from the error.

--

Signed-off-by: Mingming Cao <[EMAIL PROTECTED]>


---

 linux-2.6.19-rc5-cmm/fs/ext3/balloc.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff -puN fs/ext3/balloc.c~ext3-balloc-fix-_with_rsv-freeze fs/ext3/balloc.c
--- linux-2.6.19-rc5/fs/ext3/balloc.c~ext3-balloc-fix-_with_rsv-freeze   2006-11-28 19:36:55.0 -0800
+++ linux-2.6.19-rc5-cmm/fs/ext3/balloc.c   2006-11-28 19:36:55.0 -0800
@@ -730,7 +730,7 @@ find_next_usable_block(ext3_grpblk_t sta
here = 0;
 
p = ((char *)bh->b_data) + (here >> 3);
-   r = memscan(p, 0, (maxblocks - here + 7) >> 3);
+   r = memscan(p, 0, ((maxblocks + 7) >> 3) - (here >> 3));
next = (r - ((char *)bh->b_data)) << 3;
 
if (next < maxblocks && next >= start && ext3_test_allocatable(next, bh))

_




[PATCH 8/12] ext4 balloc: fix off-by-one against grp_goal

2006-11-28 Thread Mingming Cao

Subject: ext2 balloc: fix off-by-one against grp_goal
From: Hugh Dickins <[EMAIL PROTECTED]>

grp_goal 0 is a genuine goal (unlike -1), so ext2_try_to_allocate_with_rsv
should treat it as such.
--
Sync up with the ext2 reservation fix in ext4
Signed-off-by: Mingming Cao <[EMAIL PROTECTED]>
---


---

 linux-2.6.19-rc5-cmm/fs/ext4/balloc.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff -puN fs/ext4/balloc.c~ext4-balloc-fix-off-by-one-against-grp_goal fs/ext4/balloc.c
--- linux-2.6.19-rc5/fs/ext4/balloc.c~ext4-balloc-fix-off-by-one-against-grp_goal   2006-11-28 19:37:05.0 -0800
+++ linux-2.6.19-rc5-cmm/fs/ext4/balloc.c   2006-11-28 19:37:05.0 -0800
@@ -1288,7 +1288,7 @@ ext4_try_to_allocate_with_rsv(struct sup
}
/*
 * grp_goal is a group relative block number (if there is a goal)
-* 0 < grp_goal < EXT4_BLOCKS_PER_GROUP(sb)
+* 0 <= grp_goal < EXT4_BLOCKS_PER_GROUP(sb)
 * first block is a filesystem wide block number
 * first block is the block number of the first block in this group
 */
@@ -1324,7 +1324,7 @@ ext4_try_to_allocate_with_rsv(struct sup
if (!goal_in_my_reservation(&my_rsv->rsv_window,
grp_goal, group, sb))
grp_goal = -1;
-   } else if (grp_goal > 0) {
+   } else if (grp_goal >= 0) {
int curr = my_rsv->rsv_end -
(grp_goal + group_first_block) + 1;
 

_




[PATCH 7/12] ext4 balloc: reset windowsz when full

2006-11-28 Thread Mingming Cao

--
Subject: ext2 balloc: reset windowsz when full
From: Hugh Dickins <[EMAIL PROTECTED]>

ext2_new_blocks should reset the reservation window size to 0 when squeezing
the last blocks out of an almost full filesystem, so the retry doesn't skip
any groups with less than half that free, reporting ENOSPC too soon.
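
A minimal sketch of the effect described above, assuming the group loop skips a group when its free block count is at most half the window size (the exact comparison and the usable_groups_sketch() helper are assumptions for illustration): once the window size is reset to 0, nearly empty groups are no longer passed over on the retry.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Count the groups that the allocator would still consider, given the
 * free block count of each group and the current window size.
 */
static size_t usable_groups_sketch(const unsigned int *free_blocks,
				   size_t ngroups, unsigned int windowsz)
{
	size_t i, usable = 0;

	assert(free_blocks != NULL);
	for (i = 0; i < ngroups; i++) {
		/* Assumed skip rule: not enough room for half a window. */
		if (free_blocks[i] <= windowsz / 2)
			continue;
		usable++;
	}
	return usable;
}
```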

--
Sync up reservation fix from ext2 in ext4

Signed-off-by: Mingming Cao <[EMAIL PROTECTED]>
---


---

 linux-2.6.19-rc5-cmm/fs/ext4/balloc.c |1 +
 1 file changed, 1 insertion(+)

diff -puN fs/ext4/balloc.c~ext4_reset_windowsz_in_full_fs fs/ext4/balloc.c
--- linux-2.6.19-rc5/fs/ext4/balloc.c~ext4_reset_windowsz_in_full_fs   2006-11-28 19:37:01.0 -0800
+++ linux-2.6.19-rc5-cmm/fs/ext4/balloc.c   2006-11-28 19:37:01.0 -0800
@@ -1566,6 +1566,7 @@ retry_alloc:
 */
if (my_rsv) {
my_rsv = NULL;
+   windowsz = 0;
group_no = goal_group;
goto retry_alloc;
}

_




[PATCH 11/12] ext4 balloc: use io_error label

2006-11-28 Thread Mingming Cao

--
Subject: ext2 balloc: use io_error label
From: Hugh Dickins <[EMAIL PROTECTED]>

ext2_new_blocks has a nice io_error label for setting -EIO, so goto that in
the one place that doesn't already use it.

--
Fix it in ext4_new_blocks.

Signed-off-by: Mingming Cao <[EMAIL PROTECTED]>


---

 linux-2.6.19-rc5-cmm/fs/ext4/balloc.c |6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff -puN fs/ext4/balloc.c~ext4-balloc-use-io_error-label fs/ext4/balloc.c
--- linux-2.6.19-rc5/fs/ext4/balloc.c~ext4-balloc-use-io_error-label   2006-11-28 19:42:45.0 -0800
+++ linux-2.6.19-rc5-cmm/fs/ext4/balloc.c   2006-11-28 19:43:21.0 -0800
@@ -1529,10 +1529,8 @@ retry_alloc:
if (group_no >= ngroups)
group_no = 0;
gdp = ext4_get_group_desc(sb, group_no, &gdp_bh);
-   if (!gdp) {
-   *errp = -EIO;
-   goto out;
-   }
+   if (!gdp)
+   goto io_error;
free_blocks = le16_to_cpu(gdp->bg_free_blocks_count);
/*
 * skip this group if the number of

_




[PATCH 4/12] ext3 balloc: say rb_entry not list_entry

2006-11-28 Thread Mingming Cao

--
Subject: ext2 balloc: say rb_entry not list_entry
From: Hugh Dickins <[EMAIL PROTECTED]>

The reservations tree is an rb_tree not a list, so it's less confusing to use
rb_entry() than list_entry() - though they're both just container_of().

--

Sync up this fix in ext3

Signed-off-by: Mingming Cao <[EMAIL PROTECTED]>
---


---

 linux-2.6.19-rc5-cmm/fs/ext3/balloc.c |6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff -puN fs/ext3/balloc.c~ext3-balloc-say-rb_entry-not-list_entry fs/ext3/balloc.c
--- linux-2.6.19-rc5/fs/ext3/balloc.c~ext3-balloc-say-rb_entry-not-list_entry   2006-11-28 19:36:52.0 -0800
+++ linux-2.6.19-rc5-cmm/fs/ext3/balloc.c   2006-11-28 19:36:52.0 -0800
@@ -144,7 +144,7 @@ restart:
 
printk("Block Allocation Reservation Windows Map (%s):\n", fn);
while (n) {
-   rsv = list_entry(n, struct ext3_reserve_window_node, rsv_node);
+   rsv = rb_entry(n, struct ext3_reserve_window_node, rsv_node);
if (verbose)
printk("reservation window 0x%p "
   "start:  %lu, end:  %lu\n",
@@ -949,7 +949,7 @@ static int find_next_reservable_window(
 
prev = rsv;
next = rb_next(&rsv->rsv_node);
-   rsv = list_entry(next,struct ext3_reserve_window_node,rsv_node);
+   rsv = rb_entry(next,struct ext3_reserve_window_node,rsv_node);
 
/*
 * Reached the last reservation, we can just append to the
@@ -1193,7 +1193,7 @@ static void try_to_extend_reservation(st
if (!next)
my_rsv->rsv_end += size;
else {
-   next_rsv = list_entry(next, struct ext3_reserve_window_node, rsv_node);
+   next_rsv = rb_entry(next, struct ext3_reserve_window_node, rsv_node);
 
if ((next_rsv->rsv_start - my_rsv->rsv_end - 1) >= size)
my_rsv->rsv_end += size;

_




[PATCH 3/12] ext3 balloc: fix off-by-one against rsv_end

2006-11-28 Thread Mingming Cao

--
Subject: ext2 balloc: fix off-by-one against rsv_end
From: Hugh Dickins <[EMAIL PROTECTED]>

rsv_end is the last block within the reservation, so alloc_new_reservation
should accept start_block == rsv_end as success.
--
Sync up an ext2 reservation fix in ext3

Signed-off-by: Mingming Cao <[EMAIL PROTECTED]>


---

 linux-2.6.19-rc5-cmm/fs/ext3/balloc.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff -puN fs/ext3/balloc.c~ext3-balloc-fix-off-by-one-against-rsv_end fs/ext3/balloc.c
--- linux-2.6.19-rc5/fs/ext3/balloc.c~ext3-balloc-fix-off-by-one-against-rsv_end   2006-11-28 19:36:58.0 -0800
+++ linux-2.6.19-rc5-cmm/fs/ext3/balloc.c   2006-11-28 19:36:58.0 -0800
@@ -1148,7 +1148,7 @@ retry:
 * check if the first free block is within the
 * free space we just reserved
 */
-   if (start_block >= my_rsv->rsv_start && start_block < my_rsv->rsv_end)
+   if (start_block >= my_rsv->rsv_start && start_block <= my_rsv->rsv_end)
return 0;   /* success */
/*
 * if the first free bit we found is out of the reservable space

_




[PATCH 1/12] ext3 balloc: reset windowsz when full

2006-11-28 Thread Mingming Cao
Port a series of ext2 balloc patches from Hugh to ext3/4. The first 6
patches are against ext3, and the rest are against ext4.


--
Subject: ext2 balloc: reset windowsz when full
From: Hugh Dickins <[EMAIL PROTECTED]>

ext2_new_blocks should reset the reservation window size to 0 when squeezing
the last blocks out of an almost full filesystem, so the retry doesn't skip
any groups with less than half that free, reporting ENOSPC too soon.

--
Sync up reservation fix from ext2

Signed-off-by: Mingming Cao <[EMAIL PROTECTED]>
---


---

 linux-2.6.19-rc5-cmm/fs/ext3/balloc.c |1 +
 1 file changed, 1 insertion(+)

diff -puN fs/ext3/balloc.c~ext3_reset_windowsz_in_full_fs fs/ext3/balloc.c
--- linux-2.6.19-rc5/fs/ext3/balloc.c~ext3_reset_windowsz_in_full_fs   2006-11-28 19:36:41.0 -0800
+++ linux-2.6.19-rc5-cmm/fs/ext3/balloc.c   2006-11-28 19:36:41.0 -0800
@@ -1552,6 +1552,7 @@ retry_alloc:
 */
if (my_rsv) {
my_rsv = NULL;
+   windowsz = 0;
group_no = goal_group;
goto retry_alloc;
}

_




Re: [patch 2.6.19-rc6] Stop gcc 4.1.0 optimizing wait_hpet_tick away

2006-11-28 Thread David Miller
From: Keith Owens 
Date: Wed, 29 Nov 2006 14:56:20 +1100

> Secondly, I believe that this is a separate problem from bug 22278.
> hpet_readl() is correctly using volatile internally, but its result is
> being assigned to a pair of normal integers (not declared as volatile).
> In the context of wait_hpet_tick, all the variables are unqualified so
> gcc is allowed to optimize the comparison away.
> 
> The same problem may exist in other parts of arch/i386/kernel/time_hpet.c,
> where the return value from hpet_readl() is assigned to a normal
> variable.  Nothing in the C standard says that those unqualified
> variables should be magically treated as volatile, just because the
> original code that extracted the value used volatile.  IOW, time_hpet.c
> needs to declare any variables that hold the result of hpet_readl() as
> being volatile variables.

I disagree with this.

readl() returns values from an opaque source, and it is declared
as such to show this to GCC.  It's like a function that GCC
cannot see the implementation of, which it cannot determine
anything about wrt. return values.

The volatile'ness does not simply disappear the moment you
assign the result to some local variable which is not volatile.

Half of our drivers would break if this were true.
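
A minimal userspace sketch of the pattern being defended, assuming an ordinary variable standing in for the device register (mmio_read32_sketch() and wait_for_change_sketch() are hypothetical names, not the kernel's readl() or wait_hpet_tick()): the load goes through a volatile-qualified lvalue on every call, while the locals that hold the results are plain.

```c
#include <assert.h>
#include <stdint.h>

/* Every call performs a real load because the access itself is volatile. */
static uint32_t mmio_read32_sketch(const volatile uint32_t *reg)
{
	return *reg;
}

/*
 * Poll until the register differs from its starting value, or give up
 * after max_spins extra reads.  Returns 1 if a change was seen.
 */
static int wait_for_change_sketch(const volatile uint32_t *reg,
				  unsigned int max_spins)
{
	uint32_t start, now;	/* plain locals, not volatile */

	assert(reg != 0);
	start = mmio_read32_sketch(reg);
	do {
		now = mmio_read32_sketch(reg);
	} while (now == start && max_spins-- > 0);
	return now != start;
}
```

A compiler that folds the two reads into one here would be discarding a volatile access, which is the behaviour under dispute in this thread.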


Re: Linux 2.6.16.33

2006-11-28 Thread Adrian Bunk
On Mon, Nov 27, 2006 at 11:45:30AM +, Ian Campbell wrote:
> Hi Adrian,
> 
> On Thu, 2006-11-23 at 01:05 +0100, Adrian Bunk wrote:
> > ftp://ftp.kernel.org/pub/linux/kernel/v2.6/
> 
> I can see the changelog and the patch but not the whole tarball. Does
> that take longer to appear?

PEBKAC ;-)

I forgot to copy it.

Thanks for your reminder, it's now there.

> Cheers,
> Ian.

cu
Adrian

-- 

   "Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   "Only a promise," Lao Er said.
   Pearl S. Buck - Dragon Seed



Re: [patch 2.6.19-rc6] Stop gcc 4.1.0 optimizing wait_hpet_tick away

2006-11-28 Thread Keith Owens
Nicholas Miell (on Tue, 28 Nov 2006 19:08:25 -0800) wrote:
>On Wed, 2006-11-29 at 13:22 +1100, Keith Owens wrote:
>> Compiling 2.6.19-rc6 with gcc version 4.1.0 (SUSE Linux),
>> wait_hpet_tick is optimized away to a never ending loop and the kernel
>> hangs on boot in timer setup.
>> 
>> 001a :
>>   1a:   55  push   %ebp
>>   1b:   89 e5   mov    %esp,%ebp
>>   1d:   eb fe   jmp    1d
>> 
>> This is not a problem with gcc 3.3.5.  Adding barrier() calls to
>> wait_hpet_tick does not help, making the variables volatile does.
>> 
>> Signed-off-by: Keith Owens 
>> 
>> ---
>>  arch/i386/kernel/time_hpet.c |2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>> 
>> Index: linux-2.6/arch/i386/kernel/time_hpet.c
>> ===
>> --- linux-2.6.orig/arch/i386/kernel/time_hpet.c
>> +++ linux-2.6/arch/i386/kernel/time_hpet.c
>> @@ -51,7 +51,7 @@ static void hpet_writel(unsigned long d,
>>   */
>>  static void __devinit wait_hpet_tick(void)
>>  {
>> -unsigned int start_cmp_val, end_cmp_val;
>> +unsigned volatile int start_cmp_val, end_cmp_val;
>>  
>>  start_cmp_val = hpet_readl(HPET_T0_CMP);
>>  do {
>
>When you examine the inlined functions involved, this looks an awful lot
>like http://gcc.gnu.org/bugzilla/show_bug.cgi?id=22278
>
>Perhaps SUSE should fix their gcc instead of working around compiler
>problems in the kernel?

Firstly, the fix for 22278 is included in gcc 4.1.0.

Secondly, I believe that this is a separate problem from bug 22278.
hpet_readl() is correctly using volatile internally, but its result is
being assigned to a pair of normal integers (not declared as volatile).
In the context of wait_hpet_tick, all the variables are unqualified so
gcc is allowed to optimize the comparison away.

The same problem may exist in other parts of arch/i386/kernel/time_hpet.c,
where the return value from hpet_readl() is assigned to a normal
variable.  Nothing in the C standard says that those unqualified
variables should be magically treated as volatile, just because the
original code that extracted the value used volatile.  IOW, time_hpet.c
needs to declare any variables that hold the result of hpet_readl() as
being volatile variables.



[PATCH] lib functions: always build hweight for loadable modules

2006-11-28 Thread Randy Dunlap
From: Randy Dunlap <[EMAIL PROTECTED]>

Always build hweight8/16/32/64() functions into the kernel so that
loadable modules may use them.

I didn't remove GENERIC_HWEIGHT since ALPHA_EV67, ia64, and some
variants of UltraSparc(64) provide their own hweight functions.

Fixes config/build problems with NTFS=m and JOYSTICK_ANALOG=m.

Kernel: arch/x86_64/boot/bzImage is ready  (#19)
  Building modules, stage 2.
  MODPOST 94 modules
WARNING: "hweight32" [fs/ntfs/ntfs.ko] undefined!
WARNING: "hweight16" [drivers/input/joystick/analog.ko] undefined!
WARNING: "hweight8" [drivers/input/joystick/analog.ko] undefined!
make[1]: *** [__modpost] Error 1
make: *** [modules] Error 2

Signed-off-by: Randy Dunlap <[EMAIL PROTECTED]>
---
 lib/Makefile |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- linux-2.6.19-rc6-git10.orig/lib/Makefile
+++ linux-2.6.19-rc6-git10/lib/Makefile
@@ -25,7 +25,7 @@ lib-$(CONFIG_RWSEM_GENERIC_SPINLOCK) += 
 lib-$(CONFIG_RWSEM_XCHGADD_ALGORITHM) += rwsem.o
 lib-$(CONFIG_SEMAPHORE_SLEEPERS) += semaphore-sleepers.o
 lib-$(CONFIG_GENERIC_FIND_NEXT_BIT) += find_next_bit.o
-lib-$(CONFIG_GENERIC_HWEIGHT) += hweight.o
+obj-$(CONFIG_GENERIC_HWEIGHT) += hweight.o
 obj-$(CONFIG_LOCK_KERNEL) += kernel_lock.o
 obj-$(CONFIG_PLIST) += plist.o
 obj-$(CONFIG_DEBUG_PREEMPT) += smp_processor_id.o
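
For readers unfamiliar with the helpers: hweightN() returns the number of set bits in an N-bit value. A minimal user-space sketch follows; the _sketch names are hypothetical and this is not claimed to be the code in lib/hweight.c.

```c
#include <stdint.h>

/* Population count of a 32-bit word using parallel partial sums. */
static unsigned int hweight32_sketch(uint32_t w)
{
	w = w - ((w >> 1) & 0x55555555u);                 /* 2-bit sums */
	w = (w & 0x33333333u) + ((w >> 2) & 0x33333333u); /* 4-bit sums */
	w = (w + (w >> 4)) & 0x0f0f0f0fu;                 /* 8-bit sums */
	return (uint32_t)(w * 0x01010101u) >> 24;         /* add the four bytes */
}

/* The narrower variants can reuse the 32-bit routine on a widened value. */
static unsigned int hweight16_sketch(uint16_t w)
{
	return hweight32_sketch(w);
}

static unsigned int hweight8_sketch(uint8_t w)
{
	return hweight32_sketch(w);
}
```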


---


Re: [patch 2.6.19-rc6] Stop gcc 4.1.0 optimizing wait_hpet_tick away

2006-11-28 Thread Nicholas Miell
On Wed, 2006-11-29 at 13:22 +1100, Keith Owens wrote:
> Compiling 2.6.19-rc6 with gcc version 4.1.0 (SUSE Linux),
> wait_hpet_tick is optimized away to a never ending loop and the kernel
> hangs on boot in timer setup.
> 
> 001a :
>   1a:   55                      push   %ebp
>   1b:   89 e5                   mov    %esp,%ebp
>   1d:   eb fe                   jmp    1d
> 
> This is not a problem with gcc 3.3.5.  Adding barrier() calls to
> wait_hpet_tick does not help, making the variables volatile does.
> 
> Signed-off-by: Keith Owens 
> 
> ---
>  arch/i386/kernel/time_hpet.c |2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> Index: linux-2.6/arch/i386/kernel/time_hpet.c
> ===
> --- linux-2.6.orig/arch/i386/kernel/time_hpet.c
> +++ linux-2.6/arch/i386/kernel/time_hpet.c
> @@ -51,7 +51,7 @@ static void hpet_writel(unsigned long d,
>   */
>  static void __devinit wait_hpet_tick(void)
>  {
> - unsigned int start_cmp_val, end_cmp_val;
> + unsigned volatile int start_cmp_val, end_cmp_val;
>  
>   start_cmp_val = hpet_readl(HPET_T0_CMP);
>   do {

When you examine the inlined functions involved, this looks an awful lot
like http://gcc.gnu.org/bugzilla/show_bug.cgi?id=22278

Perhaps SUSE should fix their gcc instead of working around compiler
problems in the kernel?

-- 
Nicholas Miell <[EMAIL PROTECTED]>



Re: [v4l-dvb-maintainer] [2.6 patch] remove DVB_AV7110_FIRMWARE

2006-11-28 Thread Adrian Bunk
On Tue, Nov 28, 2006 at 01:06:02PM -0800, Trent Piepho wrote:
> On Sun, 26 Nov 2006, Adrian Bunk wrote:
> > DVB_AV7110_FIRMWARE was (except for some OSS drivers) the only option
> > that was still compiling a binary-only user-supplied firmware file at
> > build-time into the kernel.
> >
> > This patch changes the driver to always use the standard
> > request_firmware() way for firmware by removing DVB_AV7110_FIRMWARE.
> 
> Doesn't this also prevent the AV7110 module from getting compiled
> into the kernel?  Shouldn't the Kconfig file be adjusted so
> that 'y' can't be selected anymore and it depends on MODULES?

No.
No.

request_firmware() works fine for built-in drivers.

cu
Adrian

-- 

   "Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   "Only a promise," Lao Er said.
   Pearl S. Buck - Dragon Seed



[PATCH 2/5][time][x86_64] hpet_address cleanup

2006-11-28 Thread john stultz
In preparation for supporting generic timekeeping, this patch cleans up 
x86-64's use of vxtime.hpet_address, changing it to just hpet_address 
as is also used in i386. This is necessary since the vxtime structure 
will be going away.
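
As a rough user-space sketch of what the unified assignment does: the low half of the table address is always used, and the high half is OR-ed in only on 64-bit builds. The struct and function names below are hypothetical, and is_64bit stands in for the CONFIG_X86_64 test.

```c
#include <stdint.h>

/* Hypothetical mirror of the two 32-bit address halves in the HPET table. */
struct hpet_addr_sketch {
	uint32_t addrl;	/* low 32 bits */
	uint32_t addrh;	/* high 32 bits */
};

/* Build the base address; is_64bit plays the role of CONFIG_X86_64. */
static uint64_t hpet_address_from_table(const struct hpet_addr_sketch *a,
					int is_64bit)
{
	uint64_t addr = a->addrl;

	if (is_64bit)
		addr |= (uint64_t)a->addrh << 32;
	return addr;
}
```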

Signed-off-by: John Stultz <[EMAIL PROTECTED]>


 arch/i386/kernel/acpi/boot.c |   23 ++-
 arch/x86_64/kernel/apic.c|3 ++-
 arch/x86_64/kernel/time.c|   36 +++-
 include/asm-x86_64/hpet.h|1 +
 4 files changed, 28 insertions(+), 35 deletions(-)

linux-2.6.19-rc6git11_timeofday-arch-x86-64-hpet-address-cleanup_C7.patch

diff --git a/arch/i386/kernel/acpi/boot.c b/arch/i386/kernel/acpi/boot.c
index d12fb97..b9e9f17 100644
--- a/arch/i386/kernel/acpi/boot.c
+++ b/arch/i386/kernel/acpi/boot.c
@@ -638,6 +638,7 @@ static int __init acpi_parse_sbf(unsigne
 }
 
 #ifdef CONFIG_HPET_TIMER
+#include 
 
 static int __init acpi_parse_hpet(unsigned long phys, unsigned long size)
 {
@@ -671,32 +672,20 @@ #define HPET_RESOURCE_NAME_SIZE 9
hpet_res->end = (1 * 1024) - 1;
}
 
+   hpet_address = hpet_tbl->addr.addrl;
 #ifdef CONFIG_X86_64
-   vxtime.hpet_address = hpet_tbl->addr.addrl |
-   ((long)hpet_tbl->addr.addrh << 32);
-
+   hpet_address |= ((long)hpet_tbl->addr.addrh << 32);
+#endif
printk(KERN_INFO PREFIX "HPET id: %#x base: %#lx\n",
-  hpet_tbl->id, vxtime.hpet_address);
-
-   res_start = vxtime.hpet_address;
-#else  /* X86 */
-   {
-   extern unsigned long hpet_address;
+  hpet_tbl->id, hpet_address);
 
-   hpet_address = hpet_tbl->addr.addrl;
-   printk(KERN_INFO PREFIX "HPET id: %#x base: %#lx\n",
-  hpet_tbl->id, hpet_address);
-
-   res_start = hpet_address;
-   }
-#endif /* X86 */
+   res_start = hpet_address;
 
if (hpet_res) {
hpet_res->start = res_start;
hpet_res->end += res_start;
insert_resource(_resource, hpet_res);
}
-
return 0;
 }
 #else
diff --git a/arch/x86_64/kernel/apic.c b/arch/x86_64/kernel/apic.c
index 4d9d5ed..02f5961 100644
--- a/arch/x86_64/kernel/apic.c
+++ b/arch/x86_64/kernel/apic.c
@@ -36,6 +36,7 @@ #include 
 #include 
 #include 
 #include 
+#include 
 #include 
 
 int apic_mapped;
@@ -673,7 +674,7 @@ static void setup_APIC_timer(unsigned in
local_irq_save(flags);
 
/* wait for irq slice */
-   if (vxtime.hpet_address && hpet_use_timer) {
+   if (hpet_address && hpet_use_timer) {
int trigger = hpet_readl(HPET_T0_CMP);
while (hpet_readl(HPET_COUNTER) >= trigger)
/* do nothing */ ;
diff --git a/arch/x86_64/kernel/time.c b/arch/x86_64/kernel/time.c
index e3ef544..a6820e0 100644
--- a/arch/x86_64/kernel/time.c
+++ b/arch/x86_64/kernel/time.c
@@ -67,6 +67,7 @@ #define US_SCALE  32 /* 2^32, arbitralril
 
 unsigned int cpu_khz;  /* TSC clocks / usec, 
not used here */
 EXPORT_SYMBOL(cpu_khz);
+unsigned long hpet_address;
 static unsigned long hpet_period;  /* fsecs / HPET clock */
 unsigned long hpet_tick;   /* HPET clocks / 
interrupt */
 int hpet_use_timer;/* Use counter of hpet for time 
keeping, otherwise PIT */
@@ -316,7 +317,7 @@ static noinline void handle_lost_ticks(i
   KERN_WARNING "Your time source seems to be instable or "
"some driver is hogging interupts\n");
print_symbol("rip %s\n", get_irq_regs()->rip);
-   if (vxtime.mode == VXTIME_TSC && vxtime.hpet_address) {
+   if (vxtime.mode == VXTIME_TSC && hpet_address) {
printk(KERN_WARNING "Falling back to HPET\n");
if (hpet_use_timer)
vxtime.last = hpet_readl(HPET_T0_CMP) - 
@@ -324,6 +325,7 @@ static noinline void handle_lost_ticks(i
else
vxtime.last = hpet_readl(HPET_COUNTER);
vxtime.mode = VXTIME_HPET;
+   vxtime.hpet_address = hpet_address;
do_gettimeoffset = do_gettimeoffset_hpet;
}
/* else should fall back to PIT, but code missing. */
@@ -354,7 +356,7 @@ void main_timer_handler(void)
 
write_seqlock(_lock);
 
-   if (vxtime.hpet_address)
+   if (hpet_address)
offset = hpet_readl(HPET_COUNTER);
 
if (hpet_use_timer) {
@@ -717,7 +719,7 @@ static __init int late_hpet_init(void)
struct hpet_datahd;
unsigned intntimer;
 
-   if (!vxtime.hpet_address)
+   if (!hpet_address)
return 0;
 
memset(, 0, 

[PATCH 5/5][time][x86_64] Re-enable vsyscall support for x86_64

2006-11-28 Thread john stultz
Cleanup and re-enable vsyscall gettimeofday using the generic 
clocksource infrastructure.

Signed-off-by: John Stultz <[EMAIL PROTECTED]>

 arch/x86_64/Kconfig  |4 +
 arch/x86_64/kernel/hpet.c|6 +
 arch/x86_64/kernel/time.c|6 -
 arch/x86_64/kernel/tsc.c |7 ++
 arch/x86_64/kernel/vmlinux.lds.S |   28 +++--
 arch/x86_64/kernel/vsyscall.c|  121 +++
 include/asm-x86_64/proto.h   |3 
 include/asm-x86_64/timex.h   |1 
 include/asm-x86_64/vsyscall.h|   32 +-
 9 files changed, 105 insertions(+), 103 deletions(-)

linux-2.6.19-rc6git11_timeofday-arch-x86-64-vsyscall-reenablement_C7.patch

diff --git a/arch/x86_64/Kconfig b/arch/x86_64/Kconfig
index 20bcd6d..c8026f8 100644
--- a/arch/x86_64/Kconfig
+++ b/arch/x86_64/Kconfig
@@ -28,6 +28,10 @@ config GENERIC_TIME
bool
default y
 
+config GENERIC_TIME_VSYSCALL
+   bool
+   default y
+
 config ZONE_DMA32
bool
default y
diff --git a/arch/x86_64/kernel/hpet.c b/arch/x86_64/kernel/hpet.c
index c00b01a..2d3aed1 100644
--- a/arch/x86_64/kernel/hpet.c
+++ b/arch/x86_64/kernel/hpet.c
@@ -440,6 +440,11 @@ static cycle_t read_hpet(void)
return (cycle_t)readl(hpet_ptr);
 }
 
+static cycle_t __vsyscall_fn vread_hpet(void)
+{
+   return (cycle_t)readl((void *)fix_to_virt(VSYSCALL_HPET) + 0xf0);
+}
+
 struct clocksource clocksource_hpet = {
.name   = "hpet",
.rating = 250,
@@ -448,6 +453,7 @@ struct clocksource clocksource_hpet = {
.mult   = 0, /* set below */
.shift  = HPET_SHIFT,
.is_continuous  = 1,
+   .vread  = vread_hpet,
 };
 
 static int __init init_hpet_clocksource(void)
diff --git a/arch/x86_64/kernel/time.c b/arch/x86_64/kernel/time.c
index 4bc737c..17bb7de 100644
--- a/arch/x86_64/kernel/time.c
+++ b/arch/x86_64/kernel/time.c
@@ -53,13 +53,7 @@ DEFINE_SPINLOCK(rtc_lock);
 EXPORT_SYMBOL(rtc_lock);
 DEFINE_SPINLOCK(i8253_lock);
 
-unsigned long vxtime_hz = PIT_TICK_RATE;
-
-struct vxtime_data __vxtime __section_vxtime;  /* for vsyscalls */
-
 volatile unsigned long __jiffies __section_jiffies = INITIAL_JIFFIES;
-struct timespec __xtime __section_xtime;
-struct timezone __sys_tz __section_sys_tz;
 
 unsigned long profile_pc(struct pt_regs *regs)
 {
diff --git a/arch/x86_64/kernel/tsc.c b/arch/x86_64/kernel/tsc.c
index 682e122..5c768cf 100644
--- a/arch/x86_64/kernel/tsc.c
+++ b/arch/x86_64/kernel/tsc.c
@@ -185,6 +185,12 @@ static cycle_t read_tsc(void)
return ret;
 }
 
+static cycle_t __vsyscall_fn vread_tsc(void)
+{
+   cycle_t ret = (cycle_t)get_cycles_sync();
+   return ret;
+}
+
 static struct clocksource clocksource_tsc = {
.name   = "tsc",
.rating = 300,
@@ -194,6 +200,7 @@ static struct clocksource clocksource_ts
.shift  = 22,
.update_callback= tsc_update_callback,
.is_continuous  = 1,
+   .vread  = vread_tsc,
 };
 
 static int tsc_update_callback(void)
diff --git a/arch/x86_64/kernel/vmlinux.lds.S b/arch/x86_64/kernel/vmlinux.lds.S
index d9534e7..5b10798 100644
--- a/arch/x86_64/kernel/vmlinux.lds.S
+++ b/arch/x86_64/kernel/vmlinux.lds.S
@@ -94,31 +94,25 @@ #define VVIRT(x) (ADDR(x) - VVIRT_OFFSET
   __vsyscall_0 = VSYSCALL_VIRT_ADDR;
 
   . = ALIGN(CONFIG_X86_L1_CACHE_BYTES);
-  .xtime_lock : AT(VLOAD(.xtime_lock)) { *(.xtime_lock) }
-  xtime_lock = VVIRT(.xtime_lock);
-
-  .vxtime : AT(VLOAD(.vxtime)) { *(.vxtime) }
-  vxtime = VVIRT(.vxtime);
+  .vsyscall_fn : AT(VLOAD(.vsyscall_fn)) { *(.vsyscall_fn) }
+  . = ALIGN(CONFIG_X86_L1_CACHE_BYTES);
+  .vsyscall_gtod_data : AT(VLOAD(.vsyscall_gtod_data))
+   { *(.vsyscall_gtod_data) }
+  vsyscall_gtod_data = VVIRT(.vsyscall_gtod_data);
 
   .vgetcpu_mode : AT(VLOAD(.vgetcpu_mode)) { *(.vgetcpu_mode) }
   vgetcpu_mode = VVIRT(.vgetcpu_mode);
 
-  .sys_tz : AT(VLOAD(.sys_tz)) { *(.sys_tz) }
-  sys_tz = VVIRT(.sys_tz);
-
-  .sysctl_vsyscall : AT(VLOAD(.sysctl_vsyscall)) { *(.sysctl_vsyscall) }
-  sysctl_vsyscall = VVIRT(.sysctl_vsyscall);
-
-  .xtime : AT(VLOAD(.xtime)) { *(.xtime) }
-  xtime = VVIRT(.xtime);
-
   . = ALIGN(CONFIG_X86_L1_CACHE_BYTES);
   .jiffies : AT(VLOAD(.jiffies)) { *(.jiffies) }
   jiffies = VVIRT(.jiffies);
 
-  .vsyscall_1 ADDR(.vsyscall_0) + 1024: AT(VLOAD(.vsyscall_1)) { 
*(.vsyscall_1) }
-  .vsyscall_2 ADDR(.vsyscall_0) + 2048: AT(VLOAD(.vsyscall_2)) { 
*(.vsyscall_2) }
-  .vsyscall_3 ADDR(.vsyscall_0) + 3072: AT(VLOAD(.vsyscall_3)) { 
*(.vsyscall_3) }
+  .vsyscall_1 ADDR(.vsyscall_0) + 1024: AT(VLOAD(.vsyscall_1))
+   { *(.vsyscall_1) }
+  .vsyscall_2 ADDR(.vsyscall_0) + 2048: AT(VLOAD(.vsyscall_2))
+   { *(.vsyscall_2) }
+  .vsyscall_3 ADDR(.vsyscall_0) + 3072: AT(VLOAD(.vsyscall_3))
+   { *(.vsyscall_3) }
 

[PATCH 1/5][time][Generic] vsyscall-gtod support for GENERIC_TIME

2006-11-28 Thread john stultz
Provides generic infrastructure for vsyscall-gtod.

Signed-off-by: John Stultz <[EMAIL PROTECTED]>

 include/linux/clocksource.h |8 
 kernel/timer.c  |1 +
 2 files changed, 9 insertions(+)

linux-2.6.19-rc6git11_timeofday-vsyscall-support_C7.patch

diff --git a/include/linux/clocksource.h b/include/linux/clocksource.h
index d852024..62a600d 100644
--- a/include/linux/clocksource.h
+++ b/include/linux/clocksource.h
@@ -46,6 +46,7 @@ typedef u64 cycle_t;
  * @shift: cycle to nanosecond divisor (power of two)
  * @update_callback:   called when safe to alter clocksource values
  * @is_continuous: defines if clocksource is free-running.
+ * @vread: vsyscall based read
  * @cycle_interval:Used internally by timekeeping core, please ignore.
  * @xtime_interval:Used internally by timekeeping core, please ignore.
  */
@@ -59,6 +60,7 @@ struct clocksource {
u32 shift;
int (*update_callback)(void);
int is_continuous;
+   cycle_t (*vread)(void);
 
/* timekeeping specific data, ignore */
cycle_t cycle_last, cycle_interval;
@@ -182,4 +184,10 @@ int clocksource_register(struct clocksou
 void clocksource_reselect(void);
 struct clocksource* clocksource_get_next(void);
 
+#ifdef CONFIG_GENERIC_TIME_VSYSCALL
+extern void update_vsyscall(struct timespec *ts, struct clocksource *c);
+#else
+#define update_vsyscall(now, c) do { } while(0)
+#endif
+
 #endif /* _LINUX_CLOCKSOURCE_H */
diff --git a/kernel/timer.c b/kernel/timer.c
index c1c7fbc..38fd4a7 100644
--- a/kernel/timer.c
+++ b/kernel/timer.c
@@ -956,6 +956,7 @@ #endif
clock->xtime_nsec = 0;
clocksource_calculate_interval(clock, tick_nsec);
}
+   update_vsyscall(, clock);
 }
 
 /*


[PATCH 4/5][time][x86_64] Convert x86_64 to use GENERIC_TIME

2006-11-28 Thread john stultz
This patch converts x86_64 to use the GENERIC_TIME infrastructure and 
adds clocksource structures for both TSC and HPET (ACPI PM is shared w/ 
i386).

Signed-off-by: John Stultz <[EMAIL PROTECTED]>

 arch/x86_64/Kconfig  |4 
 arch/x86_64/kernel/apic.c|2 
 arch/x86_64/kernel/hpet.c|   65 -
 arch/x86_64/kernel/pmtimer.c |   58 
 arch/x86_64/kernel/smpboot.c |1 
 arch/x86_64/kernel/time.c|  301 ---
 arch/x86_64/kernel/tsc.c |  105 +--
 include/asm-x86_64/proto.h   |1 
 include/asm-x86_64/timex.h   |5 
 9 files changed, 133 insertions(+), 409 deletions(-)

linux-2.6.19-rc6git11_timeofday-arch-x86-64-generic-time-conversion_C7.patch

diff --git a/arch/x86_64/Kconfig b/arch/x86_64/Kconfig
index 010d226..20bcd6d 100644
--- a/arch/x86_64/Kconfig
+++ b/arch/x86_64/Kconfig
@@ -24,6 +24,10 @@ config X86
bool
default y
 
+config GENERIC_TIME
+   bool
+   default y
+
 config ZONE_DMA32
bool
default y
diff --git a/arch/x86_64/kernel/apic.c b/arch/x86_64/kernel/apic.c
index 02f5961..588ef3d 100644
--- a/arch/x86_64/kernel/apic.c
+++ b/arch/x86_64/kernel/apic.c
@@ -696,7 +696,7 @@ static void setup_APIC_timer(unsigned in
/* Turn off PIT interrupt if we use APIC timer as main timer.
   Only works with the PM timer right now
   TBD fix it for HPET too. */
-   if (vxtime.mode == VXTIME_PMTMR &&
+   if ((pmtmr_ioport != 0) &&
smp_processor_id() == boot_cpu_id &&
apic_runs_main_timer == 1 &&
!cpu_isset(boot_cpu_id, timer_interrupt_broadcast_ipi_mask)) {
diff --git a/arch/x86_64/kernel/hpet.c b/arch/x86_64/kernel/hpet.c
index a219786..c00b01a 100644
--- a/arch/x86_64/kernel/hpet.c
+++ b/arch/x86_64/kernel/hpet.c
@@ -19,12 +19,6 @@ unsigned long hpet_tick; /* HPET clocks 
 int hpet_use_timer;/* Use counter of hpet for time keeping,
 * otherwise PIT
 */
-unsigned int do_gettimeoffset_hpet(void)
-{
-   /* cap counter read to one tick to avoid inconsistencies */
-   unsigned long counter = hpet_readl(HPET_COUNTER) - vxtime.last;
-   return (min(counter,hpet_tick) * vxtime.quot) >> US_SCALE;
-}
 
 #ifdef CONFIG_HPET
 static __init int late_hpet_init(void)
@@ -433,3 +427,62 @@ static int __init nohpet_setup(char *s)
 
 __setup("nohpet", nohpet_setup);
 
+#define HPET_MASK  0xFFFFFFFF
+#define HPET_SHIFT 22
+
+/* FSEC = 10^-15 NSEC = 10^-9 */
+#define FSEC_PER_NSEC  1000000
+
+static void *hpet_ptr;
+
+static cycle_t read_hpet(void)
+{
+   return (cycle_t)readl(hpet_ptr);
+}
+
+struct clocksource clocksource_hpet = {
+   .name   = "hpet",
+   .rating = 250,
+   .read   = read_hpet,
+   .mask   = (cycle_t)HPET_MASK,
+   .mult   = 0, /* set below */
+   .shift  = HPET_SHIFT,
+   .is_continuous  = 1,
+};
+
+static int __init init_hpet_clocksource(void)
+{
+   unsigned long hpet_period;
+   void __iomem *hpet_base;
+   u64 tmp;
+
+   if (!hpet_address)
+   return -ENODEV;
+
+   /* calculate the hpet address: */
+   hpet_base =
+   (void __iomem*)ioremap_nocache(hpet_address, HPET_MMAP_SIZE);
+   hpet_ptr = hpet_base + HPET_COUNTER;
+
+   /* calculate the frequency: */
+   hpet_period = readl(hpet_base + HPET_PERIOD);
+
+   /*
+* hpet period is in femto seconds per cycle
+* so we need to convert this to ns/cyc units
+* aproximated by mult/2^shift
+*
+*  fsec/cyc * 1nsec/1000000fsec = nsec/cyc = mult/2^shift
+*  fsec/cyc * 1ns/1000000fsec * 2^shift = mult
+*  fsec/cyc * 2^shift * 1nsec/1000000fsec = mult
+*  (fsec/cyc << shift)/1000000 = mult
+*  (hpet_period << shift)/FSEC_PER_NSEC = mult
+*/
+   tmp = (u64)hpet_period << HPET_SHIFT;
+   do_div(tmp, FSEC_PER_NSEC);
+   clocksource_hpet.mult = (u32)tmp;
+
+   return clocksource_register(_hpet);
+}
+
+module_init(init_hpet_clocksource);
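
The mult derivation in the comment of init_hpet_clocksource() can be checked in user space. This is a minimal sketch under the assumption that 1 ns = 10^6 fs and that the clocksource core converts with ns = (cycles * mult) >> shift; the _sketch names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define FSEC_PER_NSEC_SKETCH 1000000ULL	/* 10^-9 s / 10^-15 s */

/* mult such that ns = (cycles * mult) >> shift, for a period in fs/cycle. */
static uint32_t hpet_mult_sketch(uint32_t period_fs, unsigned int shift)
{
	uint64_t tmp;

	assert(shift < 32);	/* keeps the 64-bit intermediate from overflowing */
	tmp = (uint64_t)period_fs << shift;
	return (uint32_t)(tmp / FSEC_PER_NSEC_SKETCH);
}

/* Convert a cycle count to nanoseconds with the precomputed mult/shift. */
static uint64_t cyc2ns_sketch(uint64_t cycles, uint32_t mult,
			      unsigned int shift)
{
	return (cycles * mult) >> shift;
}
```

With a period of 1000000 fs (one nanosecond per cycle) and shift 22, mult comes out as 2^22, so cycles convert to nanoseconds one to one.
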
diff --git a/arch/x86_64/kernel/pmtimer.c b/arch/x86_64/kernel/pmtimer.c
index 7554458..ae8f912 100644
--- a/arch/x86_64/kernel/pmtimer.c
+++ b/arch/x86_64/kernel/pmtimer.c
@@ -24,15 +24,6 @@ #include 
 #include 
 #include 
 
-/* The I/O port the PMTMR resides at.
- * The location is detected during setup_arch(),
- * in arch/i386/kernel/acpi/boot.c */
-u32 pmtmr_ioport __read_mostly;
-
-/* value of the Power timer at last timer interrupt */
-static u32 offset_delay;
-static u32 last_pmtmr_tick;
-
 #define ACPI_PM_MASK 0xFFFFFF /* limit it to 24 bits */
 
 static inline u32 cyc2us(u32 cycles)
@@ -48,38 +39,6 @@ static inline u32 cyc2us(u32 cycles)
return (cycles >> 10);
 }
 
-int pmtimer_mark_offset(void)
-{
-   static int first_run = 1;
-

[PATCH 3/5][time][x86_64] Split x86_64/kernel/time.c up

2006-11-28 Thread john stultz
In preparation for the x86_64 generic time conversion, this patch 
splits out TSC and HPET related code from arch/x86_64/kernel/time.c 
into respective hpet.c and tsc.c files.

Signed-off-by: John Stultz <[EMAIL PROTECTED]>

 arch/x86_64/kernel/Makefile |2 
 arch/x86_64/kernel/hpet.c   |  435 ++
 arch/x86_64/kernel/time.c   |  628 
 arch/x86_64/kernel/tsc.c|  201 ++
 include/asm-x86_64/hpet.h   |6 
 include/asm-x86_64/timex.h  |   11 
 6 files changed, 658 insertions(+), 625 deletions(-)

linux-2.6.19-rc6git11_timeofday-arch-x86-64-split-hpet-tsc-time_C7.patch

diff --git a/arch/x86_64/kernel/Makefile b/arch/x86_64/kernel/Makefile
index 3c7cbff..e68a87e 100644
--- a/arch/x86_64/kernel/Makefile
+++ b/arch/x86_64/kernel/Makefile
@@ -8,7 +8,7 @@ obj-y   := process.o signal.o entry.o trap
ptrace.o time.o ioport.o ldt.o setup.o i8259.o sys_x86_64.o \
x8664_ksyms.o i387.o syscall.o vsyscall.o \
setup64.o bootflag.o e820.o reboot.o quirks.o i8237.o \
-   pci-dma.o pci-nommu.o alternative.o
+   pci-dma.o pci-nommu.o alternative.o hpet.o tsc.o
 
 obj-$(CONFIG_STACKTRACE)   += stacktrace.o
 obj-$(CONFIG_X86_MCE)  += mce.o therm_throt.o
diff --git a/arch/x86_64/kernel/hpet.c b/arch/x86_64/kernel/hpet.c
new file mode 100644
index 000..a219786
--- /dev/null
+++ b/arch/x86_64/kernel/hpet.c
@@ -0,0 +1,435 @@
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+int nohpet __initdata = 0;
+
+unsigned long hpet_address;
+unsigned long hpet_period; /* fsecs / HPET clock */
+unsigned long hpet_tick;   /* HPET clocks / interrupt */
+
+int hpet_use_timer;/* Use counter of hpet for time keeping,
+* otherwise PIT
+*/
+unsigned int do_gettimeoffset_hpet(void)
+{
+   /* cap counter read to one tick to avoid inconsistencies */
+   unsigned long counter = hpet_readl(HPET_COUNTER) - vxtime.last;
+   return (min(counter,hpet_tick) * vxtime.quot) >> US_SCALE;
+}
+
+#ifdef CONFIG_HPET
+static __init int late_hpet_init(void)
+{
+   struct hpet_datahd;
+   unsigned intntimer;
+
+   if (!hpet_address)
+   return 0;
+
+   memset(, 0, sizeof (hd));
+
+   ntimer = hpet_readl(HPET_ID);
+   ntimer = (ntimer & HPET_ID_NUMBER) >> HPET_ID_NUMBER_SHIFT;
+   ntimer++;
+
+   /*
+* Register with driver.
+* Timer0 and Timer1 is used by platform.
+*/
+   hd.hd_phys_address = hpet_address;
+   hd.hd_address = (void __iomem *)fix_to_virt(FIX_HPET_BASE);
+   hd.hd_nirqs = ntimer;
+   hd.hd_flags = HPET_DATA_PLATFORM;
+   hpet_reserve_timer(, 0);
+#ifdef CONFIG_HPET_EMULATE_RTC
+   hpet_reserve_timer(, 1);
+#endif
+   hd.hd_irq[0] = HPET_LEGACY_8254;
+   hd.hd_irq[1] = HPET_LEGACY_RTC;
+   if (ntimer > 2) {
+   struct hpet *hpet;
+   struct hpet_timer   *timer;
+   int i;
+
+   hpet = (struct hpet *) fix_to_virt(FIX_HPET_BASE);
+   timer = >hpet_timers[2];
+   for (i = 2; i < ntimer; timer++, i++)
+   hd.hd_irq[i] = (timer->hpet_config &
+   Tn_INT_ROUTE_CNF_MASK) >>
+   Tn_INT_ROUTE_CNF_SHIFT;
+
+   }
+
+   hpet_alloc();
+   return 0;
+}
+fs_initcall(late_hpet_init);
+#endif
+
+int hpet_timer_stop_set_go(unsigned long tick)
+{
+   unsigned int cfg;
+
+/*
+ * Stop the timers and reset the main counter.
+ */
+
+   cfg = hpet_readl(HPET_CFG);
+   cfg &= ~(HPET_CFG_ENABLE | HPET_CFG_LEGACY);
+   hpet_writel(cfg, HPET_CFG);
+   hpet_writel(0, HPET_COUNTER);
+   hpet_writel(0, HPET_COUNTER + 4);
+
+/*
+ * Set up timer 0, as periodic with first interrupt to happen at hpet_tick,
+ * and period also hpet_tick.
+ */
+   if (hpet_use_timer) {
+   hpet_writel(HPET_TN_ENABLE | HPET_TN_PERIODIC | HPET_TN_SETVAL |
+   HPET_TN_32BIT, HPET_T0_CFG);
+   hpet_writel(hpet_tick, HPET_T0_CMP); /* next interrupt */
+   hpet_writel(hpet_tick, HPET_T0_CMP); /* period */
+   cfg |= HPET_CFG_LEGACY;
+   }
+/*
+ * Go!
+ */
+
+   cfg |= HPET_CFG_ENABLE;
+   hpet_writel(cfg, HPET_CFG);
+
+   return 0;
+}
+
+int hpet_arch_init(void)
+{
+   unsigned int id;
+
+   if (!hpet_address)
+   return -1;
+   set_fixmap_nocache(FIX_HPET_BASE, hpet_address);
+   __set_fixmap(VSYSCALL_HPET, hpet_address, PAGE_KERNEL_VSYSCALL_NOCACHE);
+
+/*
+ * Read the period, compute tick and quotient.
+ */
+
+   id = hpet_readl(HPET_ID);
+
+   if (!(id & 

[PATCH 0/5][time][x86_64] GENERIC_TIME patchset for x86_64

2006-11-28 Thread john stultz
Hey Andi,
First let me apologize, I've been busy with other things and
it's been far too long since I last posted this. Anyway, I found some
time to resync my trees and wanted to send this along.

You had asked earlier about performance impact:

Vanilla TSC:
149 nsecs per gtod call
367 nsecs per CLOCK_MONOTONIC call
288 nsecs per CLOCK_REALTIME call
Vanilla ACPI PM:
1272 nsecs per gtod call
1335 nsecs per CLOCK_MONOTONIC call
1273 nsecs per CLOCK_REALTIME call

GENERIC_TIME TSC:
149 nsecs per gtod call
304 nsecs per CLOCK_MONOTONIC call
275 nsecs per CLOCK_REALTIME call
GENERIC_TIME ACPI PM:
1273 nsecs per gtod call
1275 nsecs per CLOCK_MONOTONIC call
1273 nsecs per CLOCK_REALTIME call

So almost no performance change.

Ingo has a few cleanups I need to merge, but otherwise I think this is 
getting close to ready for inclusion into -mm for testing. Please let 
me know if you have any major objections and if not I'll re-diff it 
against -mm and send it to Andrew. 

New in the current C7 release:
o Synched up w/ 2.6.19-rc6-git11
o Reworked the patch order to be a bit more logical
o Dropped the apic_runs_main_timer removal on Andi's request

Let me know if you have any thoughts or comments!

thanks again!
-john


Re: [PATCH 2/2 -mm] fault-injection: lightweight code-coverage maximizer

2006-11-28 Thread Akinobu Mita
On Tue, Nov 28, 2006 at 12:14:36PM -0800, Don Mullis wrote:
> First, waiting a few seconds for the standard FC-6 daemons to wake up.
> Then, Xemacs and Firefox.  Not tested on SMP.

Is it failslab or fail_page_alloc ?

> > This doesn't maximize code coverage. It makes fault-injector reject
> > any failures which have same stacktrace before.
> 
> Since the volume of (repeated) dumps is greatly reduced, 
> interval/probability can be set more aggressively without crippling
> interaction.  This increases the number of error recovery paths covered
> per unit of wall clock time.
> 

It seems artificial. Injecting failures into the slab or page allocator
causes a vastly greater range of errors, as it should. I feel what you
really want is a new fault capability.

Fault injection is designed to be extensible. It's not only for failslab,
fail_page_alloc, and fail_make_request.

If we want to inject errors into try_something() and have our own tuning
or settings, we just need to extend the fault attribute and define our
own judging function:

struct fail_try_something_attr {
    struct gorgeous_tuning tuning;
    struct fail_attr attr;
} fail_try_something = {
    .attr = FAULT_ATTR_INITIALIZER,
};

static int should_fail_try_something(void *data)
{
    /* our own tuning may veto the injection for this call */
    if (tuning_did_clever_decision(&fail_try_something.tuning, data))
        return 0;

    return should_fail(&fail_try_something.attr);
}

Then insert it into try_something():

int try_something(void *data)
{
    if (should_fail_try_something(data))
        return 0;   /* injected failure */
    ...
    return 1;
}

The common debugfs entries for fault capabilities will soon become
complicated if we keep pushing new entries for every fault case or pattern.
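
The extension pattern described above can also be tried outside the kernel. The following is only a sketch: fault_attr_sketch is a hypothetical, much-reduced attribute that fails every Nth call, and min_len is an invented tuning knob; neither is part of the fault-injection patches.

```c
#include <stddef.h>

/* Hypothetical, much-reduced fault attribute: fail every Nth call. */
struct fault_attr_sketch {
	unsigned long interval;	/* 0 disables injection */
	unsigned long calls;	/* eligible calls seen so far */
};

static int should_fail_sketch(struct fault_attr_sketch *attr)
{
	if (!attr->interval)
		return 0;
	return ++attr->calls % attr->interval == 0;
}

/* Extended attribute: caller-specific tuning plus the common part. */
static struct {
	size_t min_len;	/* tuning: never fail requests shorter than this */
	struct fault_attr_sketch attr;
} fail_try_something = {
	.min_len = 4,
	.attr = { .interval = 2, .calls = 0 },
};

/* Own judging function: the tuning gets a veto before the common check. */
static int should_fail_try_something(size_t len)
{
	if (len < fail_try_something.min_len)
		return 0;
	return should_fail_sketch(&fail_try_something.attr);
}

/* Returns 1 on success, 0 on an injected failure. */
static int try_something(size_t len)
{
	if (should_fail_try_something(len))
		return 0;
	return 1;
}
```

Short requests bypass the counter entirely, so only eligible calls advance the interval.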



Re: [PATCH 0/4] atl1: Revised Attansic L1 ethernet driver

2006-11-28 Thread Jonathan deBoer

Jay Cliburn wrote:
> I've been working on this with Jay since his initial submission.
> Thanks to everyone who has provided feedback on the resubmit. We're
> currently quite short on actual testers, since the chip only seems to
> be on Asus M2V motherboards at present. Please let me and Jay know if
> you have one of these boards and would like to test and/or have
> encountered bugs.

I purchased an Asus P5B-E today, which also has this network card, and
would be interested in testing driver changes.


Please email me directly, as I am not subscribed to the LKML yet.

Thanks.

--
Jonathan deBoer
email: jonathanseltecabca



Re: Broken commit: [NETFILTER]: ipt_REJECT: remove largely duplicate route_reverse function

2006-11-28 Thread Patrick McHardy
Krzysztof Halasa wrote:
> Patrick McHardy <[EMAIL PROTECTED]> writes:
> 
> 
>>It might be the case that your network device has a
>>hard_header_len > LL_MAX_HEADER, which could trigger
>>a corruption.
> 
> 
> Hmm... GRE tunnels add 24 bytes... I just noticed the following code in
> include/linux/netdevice.h:
> 
> /*
>  *  Compute the worst case header length according to the protocols
>  *  used.
>  */
> #if !defined(CONFIG_NET_IPIP) && \
> !defined(CONFIG_IPV6) && !defined(CONFIG_IPV6_MODULE)
> #define MAX_HEADER LL_MAX_HEADER
> #else
> #define MAX_HEADER (LL_MAX_HEADER + 48)
> #endif
> 
> I don't use AX25, Token Ring, the old IPIP tunnels nor IPv6 here, but
> I wonder if GRE tunnel (which is basically another, more compatible
> form of IPIP) need the same treatment as IPIP.

Both ipip and gre do this:

dev->hard_header_len= LL_MAX_HEADER + sizeof(struct iphdr);



which explains it. It is a bug in the REJECT target, but I was
wondering whether you were really seeing this. It looks like
it makes sense to add GRE to the MAX_HEADER case above though.
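
To make the arithmetic concrete, here is a small sketch assuming LL_MAX_HEADER is 32 (no AX25 or Token Ring configured) and a 20-byte IPv4 header without options. The _SKETCH macros and function names are hypothetical.

```c
#include <stddef.h>

#define LL_MAX_HEADER_SKETCH 32	/* assumed: no AX25, no Token Ring */
#define IPHDR_LEN_SKETCH     20	/* IPv4 header without options */

/* Headroom a tunnel device asks for, following the assignment above. */
static size_t tunnel_hard_header_len(void)
{
	return LL_MAX_HEADER_SKETCH + IPHDR_LEN_SKETCH;
}

/* Nonzero when a buffer sized for LL_MAX_HEADER is too small. */
static int needs_more_headroom(size_t hard_header_len)
{
	return hard_header_len > LL_MAX_HEADER_SKETCH;
}
```

Under these assumptions the tunnel device always asks for more than LL_MAX_HEADER, so a buffer sized for LL_MAX_HEADER has to be reallocated.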

>>Please try this patch on top of the REJECT patch (ideally after
>>verifying that the REJECT patch is really introducing the
>>corruption).
> 
> 
> That was certain. The patch fixed the problem, confirmed with current
> git tree as well. Thanks for looking at it.

Thanks. Dave, please apply this patch.

[NETFILTER]: ipt_REJECT: fix memory corruption

On devices with hard_header_len > LL_MAX_HEADER ip_route_me_harder()
reallocates the skb, leading to memory corruption when using the stale
tcph pointer to update the checksum.
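
The hazard can be reproduced in miniature in user space. In this sketch struct pkt is a hypothetical stand-in for an skb and pkt_ensure_headroom() for a rerouting call that may reallocate; it mirrors the ordering used by the fix (finish the header writes first), not the kernel code itself.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical miniature of an skb: a buffer with headroom before the data. */
struct pkt {
	uint8_t *head;
	size_t headroom;
	size_t len;
};

static uint8_t *pkt_data(const struct pkt *p)
{
	return p->head + p->headroom;
}

/* Like a reroute that needs more headroom: may move the whole buffer. */
static int pkt_ensure_headroom(struct pkt *p, size_t need)
{
	uint8_t *n;

	if (p->headroom >= need)
		return 0;
	n = malloc(need + p->len);
	if (!n)
		return -1;
	memcpy(n + need, pkt_data(p), p->len);
	free(p->head);		/* pointers into the old buffer are now stale */
	p->head = n;
	p->headroom = need;
	return 0;
}

/* Finish all writes through hdr before the call that may reallocate. */
static int demo_checksum_before_reroute(void)
{
	struct pkt p = { malloc(2 + 4), 2, 4 };
	uint8_t *hdr;
	int ok;

	if (!p.head)
		return 0;
	hdr = pkt_data(&p);
	memset(hdr, 0, p.len);
	hdr[3] = 0x5a;		/* "checksum" written while hdr is valid */
	if (pkt_ensure_headroom(&p, 16)) {
		free(p.head);
		return 0;
	}
	/* hdr must not be used from here on; re-derive the pointer instead */
	ok = pkt_data(&p)[3] == 0x5a && p.headroom == 16;
	free(p.head);
	return ok;
}
```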

Signed-off-by: Patrick McHardy <[EMAIL PROTECTED]>

diff --git a/net/ipv4/netfilter/ipt_REJECT.c b/net/ipv4/netfilter/ipt_REJECT.c
index ad0312d..264763a 100644
--- a/net/ipv4/netfilter/ipt_REJECT.c
+++ b/net/ipv4/netfilter/ipt_REJECT.c
@@ -114,6 +114,14 @@ static void send_reset(struct sk_buff *o
tcph->window = 0;
tcph->urg_ptr = 0;
 
+   /* Adjust TCP checksum */
+   tcph->check = 0;
+   tcph->check = tcp_v4_check(tcph, sizeof(struct tcphdr),
+  nskb->nh.iph->saddr,
+  nskb->nh.iph->daddr,
+  csum_partial((char *)tcph,
+   sizeof(struct tcphdr), 0));
+
/* Set DF, id = 0 */
nskb->nh.iph->frag_off = htons(IP_DF);
nskb->nh.iph->id = 0;
@@ -129,14 +137,8 @@ #endif
if (ip_route_me_harder(, addr_type))
goto free_nskb;
 
-   /* Adjust TCP checksum */
nskb->ip_summed = CHECKSUM_NONE;
-   tcph->check = 0;
-   tcph->check = tcp_v4_check(tcph, sizeof(struct tcphdr),
-  nskb->nh.iph->saddr,
-  nskb->nh.iph->daddr,
-  csum_partial((char *)tcph,
-   sizeof(struct tcphdr), 0));
+
/* Adjust IP TTL */
nskb->nh.iph->ttl = dst_metric(nskb->dst, RTAX_HOPLIMIT);
 


[patch 2.6.19-rc6] Stop gcc 4.1.0 optimizing wait_hpet_tick away

2006-11-28 Thread Keith Owens
Compiling 2.6.19-rc6 with gcc version 4.1.0 (SUSE Linux),
wait_hpet_tick is optimized away to a never ending loop and the kernel
hangs on boot in timer setup.

001a :
  1a:   55                      push   %ebp
  1b:   89 e5                   mov    %esp,%ebp
  1d:   eb fe                   jmp    1d

This is not a problem with gcc 3.3.5.  Adding barrier() calls to
wait_hpet_tick does not help, making the variables volatile does.

Signed-off-by: Keith Owens 

---
 arch/i386/kernel/time_hpet.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux-2.6/arch/i386/kernel/time_hpet.c
===
--- linux-2.6.orig/arch/i386/kernel/time_hpet.c
+++ linux-2.6/arch/i386/kernel/time_hpet.c
@@ -51,7 +51,7 @@ static void hpet_writel(unsigned long d,
  */
 static void __devinit wait_hpet_tick(void)
 {
-   unsigned int start_cmp_val, end_cmp_val;
+   unsigned volatile int start_cmp_val, end_cmp_val;
 
start_cmp_val = hpet_readl(HPET_T0_CMP);
do {



Re: Broken commit: [NETFILTER]: ipt_REJECT: remove largely duplicate route_reverse function

2006-11-28 Thread Krzysztof Halasa
Patrick McHardy <[EMAIL PROTECTED]> writes:

> It might be the case that your network device has a
> hard_header_len > LL_MAX_HEADER, which could trigger
> a corruption.

Hmm... GRE tunnels add 24 bytes... I just noticed the following code in
include/linux/netdevice.h:

/*
 *  Compute the worst case header length according to the protocols
 *  used.
 */
 
#if !defined(CONFIG_AX25) && !defined(CONFIG_AX25_MODULE) && !defined(CONFIG_TR)
#define LL_MAX_HEADER   32
#else
#if defined(CONFIG_AX25) || defined(CONFIG_AX25_MODULE)
#define LL_MAX_HEADER   96
#else
#define LL_MAX_HEADER   48
#endif
#endif

#if !defined(CONFIG_NET_IPIP) && \
!defined(CONFIG_IPV6) && !defined(CONFIG_IPV6_MODULE)
#define MAX_HEADER LL_MAX_HEADER
#else
#define MAX_HEADER (LL_MAX_HEADER + 48)
#endif

I don't use AX25, Token Ring, the old IPIP tunnels or IPv6 here, but
I wonder if GRE tunnels (which are basically another, more compatible
form of IPIP) need the same treatment as IPIP.

I've confirmed that REJECTs over a GRE tunnel caused that corruption.
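
For illustration only: a small sketch of the size comparison being discussed, assuming the LL_MAX_HEADER value of 32 from the header quoted above (no AX25, no Token Ring). headroom_shortfall() is a hypothetical helper, not a kernel function; it reports how many bytes a device's hard_header_len exceeds the reserved headroom by.

```c
/* Value from the quoted netdevice.h for a config without AX25 or TR. */
#define LL_MAX_HEADER 32

/* Hypothetical helper: bytes missing when a device needs more link
 * layer header space than was reserved in front of the packet. */
unsigned int headroom_shortfall(unsigned int reserved,
                                unsigned int hard_header_len)
{
    return hard_header_len > reserved ? hard_header_len - reserved : 0;
}
```

A 14-byte Ethernet header fits with room to spare, so the shortfall is 0. A device that needs 24 bytes more than LL_MAX_HEADER gives a shortfall of 24, which is the case where the buffer has to be reallocated.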

> Please try this patch on top of the REJECT patch (ideally after
> verifying that the REJECT patch is really introducing the
> corruption).

That was certain. The patch fixed the problem, confirmed with current
git tree as well. Thanks for looking at it.


I'm not sure about LL_MAX_HEADER (and/or MAX_HEADER) though. Should it
be changed as well?

There are many devices adding data to header space; perhaps stacking
devices don't count, as the skb is being linearized in
dev->hard_start_xmit() or an equivalent path?
-- 
Krzysztof Halasa


Re: [PATCH 2.6.15.4 rel.2 1/1] libata: add hotswap to sata_svw

2006-11-28 Thread Martin Devera

Benjamin Herrenschmidt wrote:

On Tue, 2006-11-28 at 23:22 +, David Woodhouse wrote:

On Thu, 2006-02-16 at 16:09 +0100, Martin Devera wrote:

From: Martin Devera <[EMAIL PROTECTED]>

Add hotswap capability to Serverworks/BroadCom SATA controllers. The
controller has a SIM register, which selects which bits in the SATA_ERROR
register fire an interrupt.
The solution hooks on COMWAKE (plug), PHYRDY change and 10B8B decode
error (unplug) and calls into Lukasz's hotswap framework.
The code got one day of testing on a dual-core Athlon64 H8SSL Supermicro
mobo with HT-1000 SATA, an SMP kernel and two CaviarRE SATA HDDs in
hotswap bays.

Signed-off-by: Martin Devera <[EMAIL PROTECTED]>

What became of this?


I might be to blame for not testing it... The Xserve I had on my desk
was too noisy for most of my co-workers so I kept delaying and forgot
about it 


Also the Xserve I have only has one disk, which makes hotplug testing a
bit harder :-)


Unfortunately my box with the ht1000 is already deployed. Another similar one
should arrive soon, so I'll retest it then.
Just now I have a VIA-based mobo here, and hotswap is NOT working with it.

Martin


Re: 2.6.19-rc6 : Spontaneous reboots, stack overflows - seems to implicate xfs, scsi, networking, SMP

2006-11-28 Thread David Chinner
On Thu, Nov 23, 2006 at 12:18:09PM +1100, David Chinner wrote:
> On Wed, Nov 22, 2006 at 01:58:11PM +0100, Jesper Juhl wrote:
> > 
> > Attached are two files. The one named stack_overflows.txt.gz contains
> > one instance of each unique stack overflow + trace that I've got.  The
> > other file named kernel_BUG.txt.gz contains a few BUG() messages that
> > were also in the logs.

> I've just checked on a 2.6.17 build on i386 how much stack we
> are using (from checkstack.pl with min size reported set to 32 bytes)
> here in XFS:

> So, assuming the stacks less than 32 bytes are 32 bytes, we've got
> 1380 bytes in the XFS stack there, and very few functions where it
> can be reduced further. Still, 1380 bytes is way, way short of 4KB,
> so unless there is extra stack usage that checkstack doesn't tell us
> about I'm not sure why this amount of usage is causing repeated
> stack overflows with very little stack usage on either side of it.
> 
> Can someone enlighten me as to where all the rest of the stack
> is being used up here?

FYI.

With some help from Keith Owens, we've determined that gcc 3.3.5
resulted in XFS stack usage of about 1.9KB through the writeback and
allocation path with another ~800 bytes of stack usage in generic
code in this path.

The big difference between the numbers I was getting from checkstack
and reality was CONFIG_CC_OPTIMIZE_FOR_SIZE=y being set on the
kernels I was stack checking. IOW, CONFIG_CC_OPTIMIZE_FOR_SIZE=y
appears to reduce XFS stack usage by at least 20% and so probably
should be used with XFS on 4k stacks.

Keith also confirmed that gcc-4.1's aggressive inlining of static
functions substantially increases stack usage (by ~15%) through this
call chain.  Given that many of the inlined static functions are not
required by the critical path (i.e. they'd previously been factored
out to reduce stack usage), gcc is effectively undoing past mods
that had substantially reduced XFS's stack usage.
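
For illustration only: a minimal sketch of keeping a stack-hungry, rarely used helper out of its caller's frame, assuming GCC or a compatible compiler (the noinline attribute is GCC-specific). slow_path_checksum() and critical_path() are hypothetical names and have nothing to do with the actual XFS functions.

```c
#include <stddef.h>
#include <string.h>

/* GCC-specific; assumption: the compiler honours this attribute. */
#define NOINLINE __attribute__((noinline))

/* Rarely taken helper with a large local buffer. Marked noinline so
 * its 512-byte buffer is only on the stack while it actually runs. */
static NOINLINE int slow_path_checksum(const char *msg)
{
    char scratch[512];
    size_t n = strlen(msg);
    size_t i;
    int sum = 0;

    if (n >= sizeof(scratch))
        n = sizeof(scratch) - 1;
    memcpy(scratch, msg, n);
    scratch[n] = '\0';

    for (i = 0; i < n; i++)
        sum += (unsigned char)scratch[i];
    return sum;
}

/* Hot path: small frame as long as the helper is not inlined here. */
int critical_path(int value, const char *err)
{
    if (err)
        return slow_path_checksum(err);
    return value * 2;
}
```

If the compiler were free to inline the helper, the scratch buffer could become part of critical_path()'s own frame and be paid for on every call, which is the effect described above.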

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group


Re: [PATCH][UPDATE] i2c: Add support for virtual I2C adapters

2006-11-28 Thread Sujoy Gupta

Is there a reason why the files and config options have been renamed
from i2c-virtual to i2c-virt?

On 4/7/06, Kumar Gala <[EMAIL PROTECTED]> wrote:

Any comments or acceptance of this patch?

- k

On Mar 30, 2006, at 5:05 PM, Kumar Gala wrote:

> Virtual adapters are useful to handle multiplexed I2C bus topologies, by
> presenting each multiplexed segment as an I2C adapter.  Typically, either
> a mux (or switch) exists which is an I2C device on the parent bus.  One
> selects a given child bus via programming the mux and then all the devices
> on that bus become present on the parent bus.  The intent is to allow
> multiple devices of the same type to exist in a system which would normally
> have address conflicts.
>
> Since virtual adapters will get registered in an I2C client's detect
> function we have to expose versions of i2c_{add,del}_adapter for
> i2c_{add,del}_virt_adapter to call that don't lock.
>
> Additionally, i2c_virt_master_xfer (and i2c_virt_smbus_xfer) acquire
> the parent->bus_lock and call the parent's master_xfer directly.  This
> is because on an i2c_virt_master_xfer we have to issue an i2c write on
> the parent bus to select the given virtual adapter, then do the i2c
> operation on the parent bus, followed by another i2c write on the
> parent to deselect the virtual adapter.
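
For illustration only: a minimal userspace sketch of the locked/nolock split described above, assuming a pthread mutex in place of the kernel mutex. core_lock, add_adapter(), add_adapter_nolock() and add_parent_with_child() are hypothetical names for this sketch, not the functions in the patch.

```c
#include <pthread.h>

#define MAX_ADAPTERS 8

/* Hypothetical registry protected by one lock. */
static pthread_mutex_t core_lock = PTHREAD_MUTEX_INITIALIZER;
static int adapters[MAX_ADAPTERS];
static int nr_adapters;

/* Core operation. The caller must already hold core_lock. */
int add_adapter_nolock(int id)
{
    if (nr_adapters == MAX_ADAPTERS)
        return -1;
    adapters[nr_adapters++] = id;
    return 0;
}

/* Public entry point: take the lock, then call the core operation. */
int add_adapter(int id)
{
    int res;

    pthread_mutex_lock(&core_lock);
    res = add_adapter_nolock(id);
    pthread_mutex_unlock(&core_lock);
    return res;
}

/* A path that already holds the lock (comparable to registering a
 * child from inside a detect callback) must use the nolock variant;
 * calling add_adapter() here would try to take core_lock twice. */
int add_parent_with_child(int parent_id, int child_id)
{
    int res;

    pthread_mutex_lock(&core_lock);
    res = add_adapter_nolock(parent_id);
    if (res == 0)
        res = add_adapter_nolock(child_id);
    pthread_mutex_unlock(&core_lock);
    return res;
}

/* Read the current count under the lock. */
int adapter_count(void)
{
    int n;

    pthread_mutex_lock(&core_lock);
    n = nr_adapters;
    pthread_mutex_unlock(&core_lock);
    return n;
}
```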
>
> Signed-off-by: Kumar Gala <[EMAIL PROTECTED]>
>
> ---
> commit 862cbc263e3d3e44028d7465a912847cf5366163
> tree 2c91bad8eb66cab9727f3071831a916ada41edf8
> parent 5d4fe2c1ce83c3e967ccc1ba3d580c1a5603a866
> author Kumar Gala <[EMAIL PROTECTED]> Thu, 30 Mar 2006
> 17:03:42 -0600
> committer Kumar Gala <[EMAIL PROTECTED]> Thu, 30 Mar 2006
> 17:03:42 -0600
>
>  drivers/i2c/Kconfig|9 ++
>  drivers/i2c/Makefile   |1
>  drivers/i2c/i2c-core.c |   42 
>  drivers/i2c/i2c-virt.c |  173 +
> +++
>  include/linux/i2c-id.h |2 +
>  include/linux/i2c.h|   20 ++
>  6 files changed, 234 insertions(+), 13 deletions(-)
>
> diff --git a/drivers/i2c/Kconfig b/drivers/i2c/Kconfig
> index 24383af..b8a8fc1 100644
> --- a/drivers/i2c/Kconfig
> +++ b/drivers/i2c/Kconfig
> @@ -34,6 +34,15 @@ config I2C_CHARDEV
> This support is also available as a module.  If so, the module
> will be called i2c-dev.
>
> +config I2C_VIRT
> + tristate "I2C virtual adapter support"
> + depends on I2C
> + help
> +   Say Y here if you want the I2C core to support the ability to have
> +   virtual adapters. Virtual adapters are useful to handle multiplexed
> +   I2C bus topologies, by presenting each multiplexed segment as a
> +   I2C adapter.
> +
>  source drivers/i2c/algos/Kconfig
>  source drivers/i2c/busses/Kconfig
>  source drivers/i2c/chips/Kconfig
> diff --git a/drivers/i2c/Makefile b/drivers/i2c/Makefile
> index 71c5a85..4467db2 100644
> --- a/drivers/i2c/Makefile
> +++ b/drivers/i2c/Makefile
> @@ -3,6 +3,7 @@
>  #
>
>  obj-$(CONFIG_I2C)+= i2c-core.o
> +obj-$(CONFIG_I2C_VIRT)   += i2c-virt.o
>  obj-$(CONFIG_I2C_CHARDEV)+= i2c-dev.o
>  obj-y+= busses/ chips/ algos/
>
> diff --git a/drivers/i2c/i2c-core.c b/drivers/i2c/i2c-core.c
> index 45e2cdf..64c1c9e 100644
> --- a/drivers/i2c/i2c-core.c
> +++ b/drivers/i2c/i2c-core.c
> @@ -150,22 +150,31 @@ static struct device_attribute dev_attr_
>   */
>  int i2c_add_adapter(struct i2c_adapter *adap)
>  {
> + int res;
> +
> + mutex_lock(_lists);
> + res = i2c_add_adapter_nolock(adap);
> + mutex_unlock(_lists);
> +
> + return res;
> +}
> +
> +int i2c_add_adapter_nolock(struct i2c_adapter *adap)
> +{
>   int id, res = 0;
>   struct list_head   *item;
>   struct i2c_driver  *driver;
>
> - mutex_lock(_lists);
> -
>   if (idr_pre_get(_adapter_idr, GFP_KERNEL) == 0) {
>   res = -ENOMEM;
> - goto out_unlock;
> + goto out;
>   }
>
>   res = idr_get_new(_adapter_idr, adap, );
>   if (res < 0) {
>   if (res == -EAGAIN)
>   res = -ENOMEM;
> - goto out_unlock;
> + goto out;
>   }
>
>   adap->nr =  id & MAX_ID_MASK;
> @@ -203,21 +212,29 @@ int i2c_add_adapter(struct i2c_adapter *
>   driver->attach_adapter(adap);
>   }
>
> -out_unlock:
> - mutex_unlock(_lists);
> +out:
>   return res;
>  }
>
> -
>  int i2c_del_adapter(struct i2c_adapter *adap)
>  {
> + int res;
> +
> + mutex_lock(_lists);
> + res = i2c_del_adapter_nolock(adap);
> + mutex_unlock(_lists);
> +
> + return res;
> +}
> +
> +int i2c_del_adapter_nolock(struct i2c_adapter *adap)
> +{
>   struct list_head  *item, *_n;
>   struct i2c_adapter *adap_from_list;
>   struct i2c_driver *driver;
>   struct i2c_client *client;
>   int res = 0;
>
> - mutex_lock(_lists);
>
>   /* First make sure that this adapter was ever added */
>   

Re: failed 'ljmp' in linear addressing mode

2006-11-28 Thread Jun Sun
On Tue, Nov 28, 2006 at 06:49:17PM -0500, linux-os (Dick Johnson) wrote:
> 
> On Tue, 28 Nov 2006, Jun Sun wrote:
> 
> > On Tue, Nov 28, 2006 at 08:46:44AM -0500, linux-os (Dick Johnson) wrote:
> >>
> >> On Mon, 27 Nov 2006, Jun Sun wrote:
> >>
> >>>
> >>> On Mon, Nov 27, 2006 at 08:58:57AM -0500, linux-os (Dick Johnson) wrote:
> 
>  I think it probably resets the instant that you turn off paging. To
>  turn off paging, you need to copy some code (properly linked) to an
>  area where there is a 1:1 mapping between virtual and physical addresses.
>  A safe place is somewhere below 1 megabyte. Then you need to set up a
>  call descriptor so you can call that code (you can ljump if you never
>  plan to get back). You then need to clear interrupts on all CPUs (use a
>  spin-lock). Once you are executing from the new area, you reset your
>  segments to the new area. The call descriptor would have already set
>  CS, as would have the long-jump. At this time you can turn off paging
>  and flush the TLB. You are now in linear-address protected mode.
> 
> >>>
> >>> Thanks for the reply.  But I am pretty sure I did the above correctly.
> >>> I used a single-instruction infinite loop in the call path to verify
> >>> that control does reach the last 'ljmp' but not the jump destination.
> >>>
> >>> Below is the hack I made to the machine_kexec.c file.  As you can see, I
> >>> managed to create an identity mapping between virtual and physical
> >>> addresses.
> >>>
> >>> Note I did not copy the code into the first 1M.  In fact the code
> >>> is located at 0xc0477000 (0x00477000 physical).  I thought that should
> >>> be OK as I did not really go all the way back to real-address mode.
> >>>
> >>> The last suspect I have now is a wrong value in the CS descriptor.  Does
> >>> the kernel have a suitable CS descriptor for the last ljmp to 0x1000 in
> >>> linear addressing mode?  The CS descriptor seems to be pretty dark magic
> >>> to me ...
> >>>
> >>> Cheers.
> >>>
> >>> Jun
> >>>
> >>> -
> >>> diff -Nru linux-2.6.17.14-1st/arch/i386/kernel/machine_kexec.c.orig 
> >>> linux-2.6.17.14-1st/arch/i386/kernel/machine_kexec.c
> >>> --- linux-2.6.17.14-1st/arch/i386/kernel/machine_kexec.c.orig   
> >>> 2006-10-13 11:55:04.0 -0700
> >>> +++ linux-2.6.17.14-1st/arch/i386/kernel/machine_kexec.c
> >>> 2006-11-22 15:01:45.0 -0800
> >>> @@ -212,3 +212,19 @@
> >>>rnk = (relocate_new_kernel_t) reboot_code_buffer;
> >>>(*rnk)(page_list, reboot_code_buffer, image->start, cpu_has_pae);
> >>> }
> >>> +
> >>> +extern void do_os_switching(void);
> >>> +void os_switch(void)
> >>> +{
> >>> +   void (*foo)(void);
> >>> +
> >>> +   /* absolutely no irq */
> >>> +   local_irq_disable();
> >>> +
> >>> +   /* create identity mapping */
> >>> +   foo=virt_to_phys(do_os_switching);
> >>> +   identity_map_page((unsigned long)foo);
> >>> +
> >>> +   /* jump to the real address */
> >>> +   foo();
> >>> +}
> >>>
> >> Get a copy of the Intel 486 Microprocessor Reference Manual or read it
> >> on-line. There is no way that you can make a call like that.
> >
> > By "a call like that", you mean "foo()"?  Are you sure about that?
> >
> > The machine_kexec() function in the same file is doing basically the
> > same thing (i.e., it uses "call *$eax" instead of "ljmp").  That is where
> > I got my idea from.
> >
> > In addition, if I put "1: jmp 1b" instruction anywhere *inside*
> > do_os_switching() I would get infinite hanging instead of reboot,
> > which seems to suggest I *did* jump into do_os_switching() successfully.
> >
> > According to Intel Architecture Software Developer's Manual (1997), Vol 3,
> > page 8-14:
> >
> > "2.  If paging is enabled perform the following operations:
> >
> >  - Transfer program control to linear addresses that are identity mapped to
> >physical addresses (that is, linear addresses equal physical addresses)
> >  ...
> > "
> >
> > it does not indicate one has to use "ljmp" to do this control transfer.
> 
> Assume you are accessing memory at 0xc000-. This address, when
> page translation is occurring (page 5-17), consists of three parts.
> 
> (1) A 12-bit offset 0:11
> (2) A 10-bit index  12:21
> (3) A 10-bit index  22:31
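
For illustration only: a minimal sketch of that three-way split, assuming classic 32-bit paging with 4 KB pages and no PAE (offset in bits 0..11, page table index in bits 12..21, page directory index in bits 22..31). struct lin_addr_parts and split_linear() are hypothetical names.

```c
#include <stdint.h>

/* The three fields of a 32-bit linear address under 4 KB paging. */
struct lin_addr_parts {
    uint32_t dir_index;    /* bits 22..31: page directory entry */
    uint32_t table_index;  /* bits 12..21: page table entry */
    uint32_t offset;       /* bits 0..11: byte within the page */
};

struct lin_addr_parts split_linear(uint32_t addr)
{
    struct lin_addr_parts p;

    p.offset      = addr & 0xfffu;
    p.table_index = (addr >> 12) & 0x3ffu;
    p.dir_index   = addr >> 22;
    return p;
}
```

Under this layout, 0xc0477000 (the address mentioned earlier in the thread) splits into directory index 0x301, table index 0x77 and offset 0, while its physical counterpart 0x00477000 gives directory index 1 with the same table index and offset.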
> 
> So 0xc00 is an index into the page directory. If you wish to turn off
> translation, you can't just turn off those bits. The next instruction
> will be fetched from memory with the page-cache upper bits reset, i.e,
> using offset 0 of the page directory. You somehow need to turn off those
> bits at the same time the next instruction is fetched. Normally you
> use a call gate. However, you can do a long jump which reloads the
> segment register. When the instruction book says "transfer control"
> it doesn't mean just jump to some offset. When the instruction address is
> 0xC000-, it is not the same as 0x-.  These two addresses are 
> different (to 

Re: XFS internal error xfs_trans_cancel at line 1138 of file fs/xfs/xfs_trans.c (kernel 2.6.18.1)

2006-11-28 Thread David Chinner
On Tue, Nov 28, 2006 at 04:49:00PM +0100, Jesper Juhl wrote:
> Hi,
> 
> One of my NFS servers just gave me a nasty surprise that I think it is
> relevant to tell you about:

Thanks, Jesper.

> Filesystem "dm-1": XFS internal error xfs_trans_cancel at line 1138 of
> file fs/xfs/xfs_trans.c.  Caller 0x8034b47e
> 
> Call Trace:
> [] show_trace+0xb2/0x380
> [] dump_stack+0x15/0x20
> [] xfs_error_report+0x3c/0x50
> [] xfs_trans_cancel+0x6e/0x130
> [] xfs_create+0x5ee/0x6a0
> [] xfs_vn_mknod+0x156/0x2e0
> [] xfs_vn_create+0xb/0x10
> [] vfs_create+0x8c/0xd0
> [] nfsd_create_v3+0x31a/0x560
> [] nfsd3_proc_create+0x148/0x170
> [] nfsd_dispatch+0xf9/0x1e0
> [] svc_process+0x437/0x6e0
> [] nfsd+0x1cd/0x360
> [] child_rip+0xa/0x12
> xfs_force_shutdown(dm-1,0x8) called from line 1139 of file
> fs/xfs/xfs_trans.c.  Return address = 0x80359daa

We shut down the filesystem because we cancelled a dirty transaction.
Once we start to dirty the incore objects, we can't roll back to
an unchanged state if a subsequent fatal error occurs during the
transaction and we have to abort it.
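
For illustration only: a toy model of that rule, assuming a transaction that simply tracks whether it has dirtied anything. struct toy_fs, struct toy_trans and the toy_trans_*() functions are hypothetical and are not XFS code; the sketch only captures that cancelling a clean transaction is harmless, while cancelling a dirty one leaves no consistent state to return to.

```c
/* Hypothetical filesystem state: only a shutdown flag. */
struct toy_fs {
    int shut_down;
};

/* Hypothetical transaction: remembers whether it modified anything. */
struct toy_trans {
    struct toy_fs *fs;
    int dirty;
};

/* Any in-core modification marks the transaction dirty. */
void toy_trans_modify(struct toy_trans *t)
{
    t->dirty = 1;
}

/* Cancel: fine while clean; once dirty there is nothing to roll back
 * to, so the only safe response is to shut the filesystem down. */
int toy_trans_cancel(struct toy_trans *t)
{
    if (t->dirty) {
        t->fs->shut_down = 1;
        return -1;
    }
    return 0;
}
```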

If I understand historic occurrences of this correctly, there is
a possibility that it can be triggered in ENOMEM situations. Was your
machine running out of memory when this occurred?

> Filesystem "dm-1": Corruption of in-memory data detected.  Shutting
> down filesystem: dm-1
> Please umount the filesystem, and rectify the problem(s)
> nfsd: non-standard errno: 5

EIO gets returned in certain locations once the filesystem has
been shut down.

> I unmounted the filesystem, ran xfs_repair which told me to try and
> mount it first to replay the log, so I did, unmounted it again, ran
> xfs_repair (which didn't find any problems) and finally mounted it and
> everything is good - the filesystem seems intact.

Yeah, the above error report typically is due to an in-memory
problem, not an on disk issue.

> The server in question is running kernel 2.6.18.1

This can happen to XFS on any kernel version; I got a report of this from
someone running a 2.4 kernel a couple of weeks ago.

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group


Re: 2.6.18 tsc clocksource + ntp = excessive drift; acpi_pm does fine.

2006-11-28 Thread Alexandre Pereira Nunes

john stultz wrote:


On Tue, 2006-11-28 at 21:46 -0200, Alexandre Pereira Nunes wrote:
 


Hi,

With the default boot I got the tsc clocksource selected on Debian's
2.6.18-3-k7 SMP build (but a UP machine). ntp keeps bothering me with this
message:
frequency error 512 PPM exceeds tolerance 500 PPM
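
For illustration only: a small sketch of what those numbers mean in wall-clock terms. One part per million is one microsecond of error per second, so the drift per day is the PPM figure times 86400. drift_us_per_day() and exceeds_tolerance() are hypothetical helpers, not ntp code.

```c
/* 1 PPM is 1 microsecond of error per second; a day has 86400 s. */
long long drift_us_per_day(long ppm)
{
    return (long long)ppm * 86400LL;
}

/* True when the magnitude of the frequency error is over the limit. */
int exceeds_tolerance(long error_ppm, long tolerance_ppm)
{
    long magnitude = error_ppm < 0 ? -error_ppm : error_ppm;

    return magnitude > tolerance_ppm;
}
```

At 512 PPM the clock would be off by 44,236,800 microseconds per day, a little over 44 seconds, if left uncorrected; the 500 PPM limit corresponds to 43.2 seconds per day.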
   



Hmmm. Could you send me your dmesg? Also what frequency is your cpu?

 


Sure, attached!
You'll notice an "acpi_pm installed" or something at the end, that was 
at the time I typed the echo acpi_pm >/sys/whatever.


My cpu is an athlon xp 2600+, I attached a copy of /proc/cpuinfo for 
convenience.



Also does booting w/ "noapic" change the behavior?
 

I'll test it and let you know. I also read (but didn't try) about some 
"notsc" option, I assume that's not a good one to try, right?



[cut]


If I remove ntp's drift file, then do a: echo acpi_pm
>/sys/devices/system/clocksource/clocksource0/available_clocksource ;
   



I think you mean "current_clocksource" there...

 


Ooops. Let's just pretend no one else saw that! :-)


[cut]
Yeah, it's likely the generic timekeeping changes for i386. Previously
(pre-2.6.18) it probably defaulted to the acpi pm timer and was fine.
The new code is a bit more aggressive in trying to use the TSC.
 

Just out of curiosity: what about this acpi_pm stuff ... Reading from 
tsc is probably cheaper than any other "accurate" clock source, but how 
bad (or good) is acpi_pm?



As a short term workaround, you can put "clocksource=acpi_pm" on your
grub line and that will force the clocksource at boot.
 



Yeah, I googled around and had put that on grub's config, but didn't 
reboot. I'll swap that with noapic and reboot, by tomorrow I should have 
some news.


- Alexandre

Linux version 2.6.18-3-k7 (Debian 2.6.18-6) ([EMAIL PROTECTED]) (gcc version 
4.1.2 20061115 (prerelease) (Debian 4.1.1-20)) #1 SMP Thu Nov 23 21:37:22 UTC 
2006
BIOS-provided physical RAM map:
 BIOS-e820:  - 0009fc00 (usable)
 BIOS-e820: 0009fc00 - 000a (reserved)
 BIOS-e820: 000d - 000d6000 (reserved)
 BIOS-e820: 000f - 0010 (reserved)
 BIOS-e820: 0010 - 1fff (usable)
 BIOS-e820: 1fff - 1fff8000 (ACPI data)
 BIOS-e820: 1fff8000 - 2000 (ACPI NVS)
 BIOS-e820: fec0 - fec01000 (reserved)
 BIOS-e820: fee0 - fee01000 (reserved)
 BIOS-e820: fff8 - 0001 (reserved)
0MB HIGHMEM available.
511MB LOWMEM available.
On node 0 totalpages: 131056
  DMA zone: 4096 pages, LIFO batch:0
  Normal zone: 126960 pages, LIFO batch:31
DMI 2.3 present.
ACPI: RSDP (v000 AMI   ) @ 0x000fa8a0
ACPI: RSDT (v001 AMIINT VIA_K7   0x0010 MSFT 0x0097) @ 0x1fff
ACPI: FADT (v001 AMIINT VIA_K7   0x0011 MSFT 0x0097) @ 0x1fff0030
ACPI: MADT (v001 AMIINT VIA_K7   0x0009 MSFT 0x0097) @ 0x1fff00c0
ACPI: DSDT (v001VIA   KT266A 0x1000 MSFT 0x010d) @ 0x
ACPI: PM-Timer IO Port: 0x808
ACPI: Local APIC address 0xfee0
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
Processor #0 6:8 APIC version 16
ACPI: IOAPIC (id[0x02] address[0xfec0] gsi_base[0])
IOAPIC[0]: apic_id 2, version 3, address 0xfec0, GSI 0-23
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 low level)
ACPI: IRQ0 used by override.
ACPI: IRQ2 used by override.
ACPI: IRQ9 used by override.
Enabling APIC mode:  Flat.  Using 1 I/O APICs
Using ACPI (MADT) for SMP configuration information
Allocating PCI resources starting at 3000 (gap: 2000:dec0)
Detected 2133.046 MHz processor.
Built 1 zonelists.  Total pages: 131056
Kernel command line: root=/dev/hda2 ro 
mapped APIC to d000 (fee0)
mapped IOAPIC to c000 (fec0)
Enabling fast FPU save and restore... done.
Enabling unmasked SIMD FPU exception support... done.
Initializing CPU#0
PID hash table entries: 2048 (order: 11, 8192 bytes)
Console: colour VGA+ 80x25
Dentry cache hash table entries: 65536 (order: 6, 262144 bytes)
Inode-cache hash table entries: 32768 (order: 5, 131072 bytes)
Memory: 515356k/524224k available (1556k kernel code, 8332k reserved, 582k 
data, 196k init, 0k highmem)
Checking if this processor honours the WP bit even in supervisor mode... Ok.
Calibrating delay using timer specific routine.. 4270.91 BogoMIPS (lpj=8541825)
Security Framework v1.0.0 initialized
SELinux:  Disabled at boot.
Capability LSM initialized
Mount-cache hash table entries: 512
CPU: After generic identify, caps: 0383fbff c1c3fbff    
 
CPU: After vendor identify, caps: 0383fbff c1c3fbff    
 
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 256K (64 bytes/line)
CPU: After all inits, caps: 0383fbff c1c3fbff  0420  
 
Intel machine 

Re: [rfc PATCH] ieee1394: ohci1394: delete bogus spinlock, flush MMIO writes

2006-11-28 Thread Alan
On Wed, 29 Nov 2006 00:50:43 +0100
Stefan Richter <[EMAIL PROTECTED]> wrote:

> Alan wrote:
> > On Tue, 28 Nov 2006 22:24:11 +0100 (CET)
> > Stefan Richter <[EMAIL PROTECTED]> wrote:
> >> All MMIO writes which were surrounded by the spinlock as well as the
> >> very last MMIO write of the IRQ handler are now explicitly flushed by
> >> MMIO reads of the respective register.
> > 
> > MMIO is ordered anyway on the bus, you just need mmiowb() to force
> > ordering to the bus controller in case you are on a big numa box.
> 
> The mmiowb is a checkpoint to ensure ordering between different threads
> of MMIO writes; i.e. it doesn't halt the thread until the write actually
> reached the device like a read would do, right?

It guarantees that no other mmio will sneak past it from another thread
but doesn't guarantee the previous I/O has hit the hardware. It's a much
weaker (and thus far faster) guarantee which is usually sufficient as it
can be combined with spin_unlock to enforce I/O ordering matching the
lock ordering.
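
For illustration only: a userspace sketch of the ordering described above (write the register, issue mmiowb(), then drop the lock). Everything prefixed with fake_ is a stand-in that only records the order of the steps so the sketch can build outside the kernel; in a driver of that era these would be the kernel's spin_lock(), writel(), mmiowb() and spin_unlock(). struct fake_dev, program_reg() and step_at() are hypothetical names.

```c
#include <stddef.h>

/* Steps recorded by the stand-ins below. */
enum step { STEP_LOCK, STEP_WRITE, STEP_MMIOWB, STEP_UNLOCK };

#define MAX_STEPS 16
static enum step steps[MAX_STEPS];
static size_t nsteps;

static void record(enum step s)
{
    if (nsteps < MAX_STEPS)
        steps[nsteps++] = s;
}

struct fake_dev {
    int lock;                   /* toy lock flag, not a real spinlock */
    volatile unsigned int reg;  /* pretend device register */
};

/* Stand-ins: they model ordering only, not real barriers or locking. */
static void fake_spin_lock(struct fake_dev *d)
{
    d->lock = 1;
    record(STEP_LOCK);
}

static void fake_writel(unsigned int v, volatile unsigned int *reg)
{
    *reg = v;
    record(STEP_WRITE);
}

static void fake_mmiowb(void)
{
    record(STEP_MMIOWB);
}

static void fake_spin_unlock(struct fake_dev *d)
{
    d->lock = 0;
    record(STEP_UNLOCK);
}

/* The pattern: the MMIO write is ordered before the lock is released,
 * so writes from the next lock holder cannot pass it. */
void program_reg(struct fake_dev *d, unsigned int value)
{
    fake_spin_lock(d);
    fake_writel(value, &d->reg);
    fake_mmiowb();
    fake_spin_unlock(d);
}

/* Return the i-th recorded step, or -1 if nothing was recorded there. */
int step_at(size_t i)
{
    return i < nsteps ? (int)steps[i] : -1;
}
```

Note that nothing in this pattern waits for the write to reach the device; that would still take a read back from the register, as discussed earlier in the thread.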

Alan


Re: 2.6.19-rc6-rt8

2006-11-28 Thread Hu Gang
On Mon, 27 Nov 2006 10:49:27 +0100
Ingo Molnar <[EMAIL PROTECTED]> wrote:

> i have released the 2.6.19-rc6-rt8 tree, which can be downloaded from 
> the usual place:
> 
> http://redhat.com/~mingo/realtime-preempt/

Attached is a patch to make it compile and work on my PowerBook G4.


Index: linux-2.6.19-rc6-rt5/arch/powerpc/kernel/time.c
===
--- linux-2.6.19-rc6-rt5.orig/arch/powerpc/kernel/time.c2006-11-28 
22:13:54.0 +
+++ linux-2.6.19-rc6-rt5/arch/powerpc/kernel/time.c 2006-11-28 
22:15:48.0 +
@@ -507,7 +507,7 @@
if (per_cpu(last_jiffy, cpu) >= tb_next_jiffy) {
tb_last_jiffy = tb_next_jiffy;
do_timer(1);
-   timer_recalc_offset(tb_last_jiffy);
+   /*timer_recalc_offset(tb_last_jiffy);*/
timer_check_rtc();
}
write_sequnlock(_lock);
Index: linux-2.6.19-rc6-rt5/include/asm-powerpc/semaphore.h
===
--- linux-2.6.19-rc6-rt5.orig/include/asm-powerpc/semaphore.h   2006-11-28 
22:13:54.0 +
+++ linux-2.6.19-rc6-rt5/include/asm-powerpc/semaphore.h2006-11-28 
22:15:48.0 +
@@ -10,7 +10,7 @@
 
 #ifdef __KERNEL__
 
-#include 
+/*#include */
 #include 
 #include 
 #include 
Index: linux-2.6.19-rc6-rt5/mm/page_alloc.c
===
--- linux-2.6.19-rc6-rt5.orig/mm/page_alloc.c   2006-11-28 22:13:54.0 
+
+++ linux-2.6.19-rc6-rt5/mm/page_alloc.c2006-11-28 22:15:48.0 
+
@@ -2800,7 +2800,9 @@
 
 void __init page_alloc_init(void)
 {
+#ifdef CONFIG_HOTPLUG_CPU
hotcpu_notifier(page_alloc_cpu_notify, 0);
+#endif
 }
 
 /*


Re: 2.6.19-rc6-mm2

2006-11-28 Thread Jean Tourrilhes
On Tue, Nov 28, 2006 at 04:58:28PM -0800, Andrew Morton wrote:
> On Tue, 28 Nov 2006 19:24:45 -0500
> Thomas Tuttle <[EMAIL PROTECTED]> wrote:
> 
> > 2. I'm not sure if this bug is in the kernel, wireless tools, or the
> > ipw3945 driver, but I haven't changed the version of anything but the
> > kernel.  When I do `iwconfig eth1 essid foobar' something drops the
> > last character of the essid, and a subsequent `iwconfig eth1' shows
> > "fooba" as the essid.  And it's actually set as "fooba", since I had
> > to do `iwconfig eth1 essid MyUsualEssid_' (note underscore) to get on
> > to my usual network.
> 
> This could be version skew between the wireless APIs in the kernel.org kernel,
> the wireless userspace, the out-of-tree ipw3945 driver and conceivably one
> of the git trees in -mm (although I suspect not the latter).
> 
> I don't know, but I know who to cc ;)   Probably they will want to know which
> version of wireless-tools userspace you are running.

Yes, it's a problem because the driver is out-of-tree. I sent
a patch to the maintainer to make the driver compatible with kernels
before and after the change, and it's actually integrated in version
1.1.2 of the driver (Nov 1st).
So, please upgrade your driver and tell us how it works...

Jean



Re: [PATCH] Don't compare unsigned variable for <0 in sys_prctl()

2006-11-28 Thread Jesper Juhl

On 29/11/06, Linus Torvalds <[EMAIL PROTECTED]> wrote:



On Wed, 29 Nov 2006, Jesper Juhl wrote:
>
> I would venture that "-Wshadow" is another one of those.

I'd agree, except for the fact that gcc does a horribly _bad_ job of
-Wshadow, making it (again) totally unusable.

For example, it's often entirely interesting to hear about local variables
that shadow each other. No question about it.

HOWEVER. It's _not_ really interesting to hear about a local variable that
happens to have a common name that is also shared by an extern function.

There just isn't any room for confusion, and it's actually not even that
unusual - I tried using -Wshadow on real programs, and it was just
horribly irritating.

In the kernel, we had obvious things like local use of "jiffies" that just
make _total_ sense in a small inline function, and the fact that there
happens to be an extern declaration for "jiffies" just isn't very
interesting.

Similarly, with nested macro expansion, even the "local variable shadows
another local variable" case - that looks like it should have an obvious
warning on the face of it - really isn't always necessarily that
interesting after all. Maybe it is a bug, maybe it isn't, but it's no
longer _obviously_ bogus any more.

So I'm not convinced about the usefulness of "-Wshadow". ESPECIALLY the
way that gcc implements it, it's almost totally useless in real life.

For example, I tried it on "git" one time, and this is a perfect example
of why "-Wshadow" is totally broken:

diff-delta.c: In function 'create_delta_index':
diff-delta.c:142: warning: declaration of 'index' shadows a global 
declaration

(and there's a _lot_ of those). If I'm not allowed to use "index" as a
local variable and include  at the same time, something is
simply SERIOUSLY WRONG with the warning.

So the fact is, the C language has scoping rules for a reason. Can you
screw yourself by using them badly? Sure. But that does NOT mean that the
same name in different scopes is a bad thing that should be warned about.
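
For illustration only: a minimal sketch of the "index" case, assuming a system where <strings.h> declares the legacy index() function (many do, but this is an assumption about the headers in use). find_byte() is a hypothetical function. Inside it, the local variable named index hides the global function of the same name, which is ordinary, well-defined C scoping.

```c
#include <stddef.h>
#include <strings.h>  /* assumption: declares index() on this system */

/* The local 'index' shadows any global index() for the duration of
 * this function; the function simply is not needed here. */
int find_byte(const unsigned char *buf, size_t len, unsigned char wanted)
{
    size_t index;

    for (index = 0; index < len; index++)
        if (buf[index] == wanted)
            return (int)index;
    return -1;
}
```

Whether a given compiler version warns about this under -Wshadow varies; the code itself is valid either way.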

If I wanted a language that didn't allow me to do anything wrong, I'd be
using Pascal. As it is, it turns out that things that "look" wrong on a
local level are often not wrong after all.



I can't really say anything else at this point but, point conceded...

--
Jesper Juhl <[EMAIL PROTECTED]>
Don't top-post  http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please  http://www.expita.com/nomime.html


Re: [PATCH 3/16] LTTng 0.6.36 for 2.6.18 : Linux Kernel Markers

2006-11-28 Thread Frank Ch. Eigler
Hi -

On Tue, Nov 28, 2006 at 05:40:36AM +, Christoph Hellwig wrote:
> [...]
> > > Are you sure the license_gplok check is necessary here?  We should
> > > consider encouraging non-gpl module writers to instrument their code,
> > > to give users a slightly better chance of debugging problems.

> > [... the authors of clearcase] have the funny habit of
> > distributing their kernel modules as ".ko" files instead of
> > sending a proper ".o" and later link it against a wrapper.  The
> > result is, I must say, quite bad [...]  the structure is
> > corrupted.

> Please don't add hacks like that for non-GPL modules.  

Indeed, offline Mathieu elaborated on his problem, and it turns out
that good old modversions would have solved it.

> But neither should we export any tracing functionality for them.
> They're not the kind of people we want to help at all,

Making that sort of political decision is beyond my pay grade.  
I merely suggested its consideration.

> and Frank just shows once again that he should rather stay away from
> kernel stuff and keep on writing C++.

Now now, if you don't like my C++, wait till you see my Smalltalk-80.
Or are you just jealous that my initials subsume yours?

- FChE (this space for rent)


Re: 2.6.19-rc6-mm2

2006-11-28 Thread Jiri Kosina
On Tue, 28 Nov 2006, Andrew Morton wrote:

> 
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.19-rc6/2.6.19-rc6-mm2/

md-change-lifetime-rules-for-md-devices.patch gives me the following early 
during boot (first WARNING() inside __mutex_lock_slowpath(), then BUG at 
__mutex_lock_slowpath(), just after that slab corruption).

When I revert md-change-lifetime-rules-for-md-devices.patch, everything 
seems to go fine (this machine uses neither LVM nor RAID, but the 
kernel has DM compiled in).

Config is at http://www.jikos.cz/jikos/junk/.config_md

 WARNING at kernel/mutex.c:132 __mutex_lock_common()
  [] dump_trace+0x68/0x1b5
  [] show_trace_log_lvl+0x18/0x2c
  [] show_trace+0xf/0x11
  [] dump_stack+0x12/0x14
  [] __mutex_lock_slowpath+0xa1/0x213
  [] create_dir+0x24/0x1ba
  [] sysfs_create_dir+0x45/0x5f
  [] kobject_add+0xce/0x185
  [] kobject_register+0x19/0x30
  [] md_probe+0x11a/0x124
  [] kobj_lookup+0xe6/0x122
  [] get_gendisk+0xe/0x1b
  [] do_open+0x2e/0x298
  [] blkdev_open+0x25/0x4d
  [] __dentry_open+0xc3/0x17e
  [] nameidata_to_filp+0x24/0x33
  [] do_filp_open+0x32/0x39
  [] do_sys_open+0x3a/0x66
  [] sys_open+0x1c/0x1e
  [] syscall_call+0x7/0xb
 DWARF2 unwinder stuck at syscall_call+0x7/0xb
 Leftover inexact backtrace:
  ===
 BUG: unable to handle kernel paging request at virtual address 6b6b6b6b
  printing eip:
 c01fc5ab
 *pde = 
 Oops:  [#1]
 SMP
 last sysfs file: /class/input/input5/event5/dev
 Modules linked in: video sony_acpi button battery backlight ac ipv6 floppy 
i2c_viapro i2c_core snd_via82xx gameport snd_ac97_codec snd_ac97_bus 
snd_seq_dummy via_rhine snd_seq_oss snd_seq_midi_event snd_seq mii snd_pcm_oss 
snd_mixer_oss snd_pcm pcspkr snd_timer snd_page_alloc snd_mpu401_uart 
snd_rawmidi snd_seq_device snd soundcore serio_raw ehci_hcd ohci_hcd uhci_hcd
 CPU:0
 EIP:0060:[]Not tainted VLI
 EFLAGS: 00010046   (2.6.19-rc6-mm2 #1)
 EIP is at __list_add+0x2a/0x5c
 eax: 6b6b6b6b   ebx: edee9de0   ecx: eb8c34d8   edx: 6b6b6b6b
 esi: eb8c34b8   edi: 0246   ebp: ef60a050   esp: edee9db4
 ds: 007b   es: 007b   ss: 0068
 Process nash (pid: 1321, ti=edee8000 task=ef60a050 task.ti=edee8000)
 Stack: 0001 c0197c7d edee9de0 edee9de0 edee9de0 eb8c34b8 c036e703 
0002 c0197c7d c03752fd edee9de0 edee9de0  eb8c34b8 edee9de0
eb882cac ffea eb882cac edee9e30 c0197c7d ef60a5a0  ee8d3404
 Call Trace:
  [] __mutex_lock_slowpath+0xea/0x213
  [] create_dir+0x24/0x1ba
  [] sysfs_create_dir+0x45/0x5f
  [] kobject_add+0xce/0x185
  [] kobject_register+0x19/0x30
  [] md_probe+0x11a/0x124
  [] kobj_lookup+0xe6/0x122
  [] get_gendisk+0xe/0x1b
  [] do_open+0x2e/0x298
  [] blkdev_open+0x25/0x4d
  [] __dentry_open+0xc3/0x17e
  [] nameidata_to_filp+0x24/0x33
  [] do_filp_open+0x32/0x39
  [] do_sys_open+0x3a/0x66
  [] sys_open+0x1c/0x1e
  [] syscall_call+0x7/0xb
 DWARF2 unwinder stuck at syscall_call+0x7/0xb
 Leftover inexact backtrace:
  ===
 no locks held by nash/1321.
 Code: c3 56 53 89 c3 83 ec 10 8b 41 04 39 d0 74 1c 89 4c 24 0c 89 54 24 04 89 
44 24 08 c7 04 24 80 94 3a c0 e8 be f9 f1 ff 0f 0b eb fe <8b> 32 39 ce 74 1c 89 
54 24 0c 89 74 24 08 89 4c 24 04 c7 04 24
 EIP: [] __list_add+0x2a/0x5c SS:ESP 0068:edee9db4
  <3>BUG: sleeping function called from invalid context at kernel/rwsem.c:20
 in_atomic():0, irqs_disabled():1
 no locks held by nash/1321.
  [] dump_trace+0x68/0x1b5
  [] show_trace_log_lvl+0x18/0x2c
  [] show_trace+0xf/0x11
  [] dump_stack+0x12/0x14
  [] down_read+0x15/0x4e
  [] __blocking_notifier_call_chain+0x11/0x3d
  [] blocking_notifier_call_chain+0x17/0x1a
  [] do_exit+0x19/0x782
  [] die+0x20c/0x231
  [] do_page_fault+0x450/0x51e
  [] error_code+0x7c/0x84
 DWARF2 unwinder stuck at error_code+0x7c/0x84
 Leftover inexact backtrace:
  [] __list_add+0x2a/0x5c
  [] create_dir+0x24/0x1ba
  [] __mutex_lock_slowpath+0xea/0x213
  [] create_dir+0x24/0x1ba
  [] create_dir+0x24/0x1ba
  [] sysfs_create_dir+0x45/0x5f
  [] kobject_add+0xce/0x185
  [] init_waitqueue_head+0x12/0x20
  [] kobject_init+0x5b/0x7d
  [] kobject_register+0x19/0x30
  [] md_probe+0x11a/0x124
  [] kobj_lookup+0xe6/0x122
  [] md_probe+0x0/0x124
  [] blkdev_open+0x0/0x4d
  [] get_gendisk+0xe/0x1b
  [] do_open+0x2e/0x298
  [] blkdev_open+0x0/0x4d
  [] blkdev_open+0x0/0x4d
  [] blkdev_open+0x25/0x4d
  [] __dentry_open+0xc3/0x17e
  [] nameidata_to_filp+0x24/0x33
  [] do_filp_open+0x32/0x39
  [] get_unused_fd+0xaa/0xb4
  [] _spin_unlock+0x14/0x1c
  [] get_unused_fd+0xaa/0xb4
  [] do_sys_open+0x3a/0x66
  [] sys_open+0x1c/0x1e
  [] syscall_call+0x7/0xb
  ===
 Slab corruption: start=eb8c3428, len=488
 Redzone: 0x5a2cf071/0x5a2cf071.
 Last user: [](iput+0x60/0x62)
 090: 6b 6b 6b 6b 6a 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
 Single bit error detected. Probably bad RAM.
 Run memtest86+ or a similar memory test tool.
 Prev obj: start=eb8c3234, len=488
 Redzone: 0x5a2cf071/0x5a2cf071.
 Last user: 

Re: 2.6.18 tsc clocksource + ntp = excessive drift; acpi_pm does fine.

2006-11-28 Thread john stultz
On Tue, 2006-11-28 at 21:46 -0200, Alexandre Pereira Nunes wrote:
> Hi,
> 
> with default boot I got tsc clocksource selected on a Debian
> 2.6.18-3-k7 SMP build (but UP machine). ntp keeps bothering me with this
> message:
> frequency error 512 PPM exceeds tolerance 500 PPM

Hmmm. Could you send me your dmesg? Also what frequency is your cpu?

Also does booting w/ "noapic" change the behavior?

> If I remove ntp's drift file and restart, it goes fine for a while and
> then it goes with that behaviour again.
> If I remove ntp's drift file, then do a: echo acpi_pm
>  >/sys/devices/system/clocksource/clocksource0/available_clocksource ;

I think you mean "current_clocksource" there...

> and then restart ntp, it goes fine "forever".
> 
> Any thoughts, something I should look at?
> 
> I'll be glad to give more feedback.
> 
> I don't know if that happened with 2.6.17, but I'm pretty sure that with
> 2.6.16 it was fine.

Yeah, it's likely the generic timekeeping changes for i386. Previously
(pre-2.6.18) it probably defaulted to the acpi pm timer and was fine.
The new code is a bit more aggressive in trying to use the TSC.

As a short term workaround, you can put "clocksource=acpi_pm" on your
grub line and that will force the clocksource at boot.

thanks
-john



Re: 2.6.19-rc6-mm2

2006-11-28 Thread Andrew Morton
On Tue, 28 Nov 2006 19:24:45 -0500
Thomas Tuttle <[EMAIL PROTECTED]> wrote:

> 2. I'm not sure if this bug is in the kernel, wireless tools, or the
> ipw3945 driver, but I haven't changed the version of anything but the
> kernel.  When I do `iwconfig eth1 essid foobar' something drops the
> last character of the essid, and a subsequent `iwconfig eth1' shows
> "fooba" as the essid.  And it's actually set as "fooba", since I had
> to do `iwconfig eth1 essid MyUsualEssid_' (note underscore) to get on
> to my usual network.

This could be version skew between the wireless APIs in the kernel.org kernel,
the wireless userspace, the out-of-tree ipw3945 driver and conceivably one
of the git trees in -mm (although I suspect not the latter).

I don't know, but I know who to cc ;)   Probably they will want to know which
version of wireless-tools userspace you are running.


isochronous receives?

2006-11-28 Thread Robert Crocombe

Keith, et al.,

I am having problems with isochronous receives, and remembered just as
I was getting ready to dig into the source that there was a message
about this stuff.  Lo and behold your message to linux1394-user from
September 7:


I'm trying to receive isochronous streams (using libraw1394 1.2.0), and
I've noticed that if data is transmitted on channel 63, then my app tends
to work fine. If the stream is on a different channel, then I don't see
any isochronous packets at all.  I'm using 2.4.29, I've also tried 2.6.15
with similar results, can't seem to receive channels < 63.


Did you ultimately have any success getting this going?  Funnily
enough, when I tested isochronous stuff in July, I just did iso
transmit since I figured receives *must* be working since everyone has
camcorders and whatnot.  Currently my iso xmit stuff does appear to
be working, but iso receives are not.

I have a Firespy and no reason not to trust it, so I can see the junk
I'm spewing out.  I've tried transmitting on channels 4 and 63 (per
your advice), but neither works for me.  I suppose it could be my
stuff... nah.

--
Robert Crocombe
[EMAIL PROTECTED]


Re: 2.6.19-rc6-mm2

2006-11-28 Thread Andrew Morton
On Tue, 28 Nov 2006 19:24:45 -0500
Thomas Tuttle <[EMAIL PROTECTED]> wrote:

> I've found a couple of bugs so far...
> 
> 1. I did `modprobe kvm' and then tried running a version of the KVM Qemu
> compiled for a different kernel.  My mistake.  But I got an oops:
> 
> BUG: unable to handle kernel NULL pointer dereference at virtual address 
> 0008
>  printing eip:
> f91f9c3f
> *pde = 
> Oops:  [#1]
> SMP 
> last sysfs file: /devices/system/cpu/cpu0/cpufreq/scaling_max_freq
> Modules linked in: kvm iTCO_wdt i8k rfcomm l2cap rtc sdhci mmc_block mmc_core 
> hci_usb bluetooth b44 mii ohci1394 ieee1394 uhci_hcd ehci_hcd usbcore psmouse 
> evdev i915 drm cpuid msr speedstep_centrino video thermal processor fan 
> container button battery ac
> CPU:0
> EIP:0060:[]Not tainted VLI
> EFLAGS: 00010202   (2.6.19-rc6-mm1 #1)
> EIP is at kvm_vmx_return+0xef/0x4d0 [kvm]
> eax: e5490068   ebx:    ecx:    edx: e5491ca4
> esi:    edi: e5490060   ebp: e5a4fde0   esp: e5a4fd54
> ds: 007b   es: 007b   ss: 0068
> Process qemu (pid: 24193, ti=e5a4e000 task=c2286a90 task.ti=e5a4e000)
> Stack: 0002 0001 f7fe1278 0002 b7f92000 e549  
>  
>e5a4fdac  00d8 f783a580 e5a4fdac c043b98a bfb93f7c 
> f91fa020 
>e5a4fde0 bfb93f7c bfb93f7c f91fa0cb 04f3 c03fb974 e549 
>  
> Call Trace:
>  [] kvm_dev_ioctl+0x0/0x1040 [kvm]
>  [] kvm_dev_ioctl+0xab/0x1040 [kvm]
>  [] error_code+0x7c/0x84
>  [] kmap_atomic+0xc9/0xe0
>  [] permission+0x2b/0xd0
>  [] sys_swapon+0x978/0xaf0
>  [] kunmap_atomic+0x63/0x70
>  [] kmap_atomic+0xc9/0xe0
>  [] kunmap_atomic+0x63/0x70
>  [] get_page_from_freelist+0x27d/0x340
>  [] kmap_atomic+0xc9/0xe0
>  [] kunmap_atomic+0x63/0x70
>  [] get_page_from_freelist+0x27d/0x340
>  [] find_get_page+0x20/0x60
>  [] filemap_nopage+0x2dc/0x490
>  [] do_sync_read+0xc7/0x110
>  [] kmap_atomic+0xc9/0xe0
>  [] kunmap_atomic+0x63/0x70
>  [] __handle_mm_fault+0x246/0x9c0
>  [] kvm_dev_ioctl+0x0/0x1040 [kvm]
>  [] scsi_host_alloc+0x202/0x2a0
>   [] do_ioctl+0x2b/0x90
>  [] vfs_ioctl+0x5c/0x2b0
>  [] sys_ioctl+0x3d/0x70
>  [] syscall_call+0x7/0xb
>  [] scsi_host_alloc+0x202/0x2a0
>  ===
> Code: 14 0f 87 77 02 00 00 8b 0c b5 00 15 20 f9 85 c9 0f 84 68 02 00 00 89 ea 
> 89 f8 ff d1 85 c0 0f 84 4c 02 00 00 89 f8 e8 31 e9 ff ff <65> a1 08 00 00 00 
> 8b 40 04 8b 40 08 a8 04 0f 85 ae 02 00 00 e8 
> EIP: [] kvm_vmx_return+0xef/0x4d0 [kvm] SS:ESP 0068:e5a4fd54
>  msrs: 2
> 
> Oh, and I get a ton of these messages with kvm:
> 
> rtc: lost some interrupts at 1024Hz.

KVM culprits cc'ed.  The KVM patches as I got them didn't even compile on
i386, so runtime breakage isn't very surprising.  Looks like you need an
x86_64 machine ;)



Re: [PATCH 1/2] lib + ntfs: let modules force HWEIGHT

2006-11-28 Thread Randy Dunlap

Andrew Morton wrote:
> On Tue, 28 Nov 2006 14:08:40 -0800
> Randy Dunlap <[EMAIL PROTECTED]> wrote:
>
>> From: Randy Dunlap <[EMAIL PROTECTED]>
>>
>> NTFS (=m) uses hweight32(), but that function is only linked
>> into the kernel image if it is used inside the kernel image,
>> not in loadable modules.  Let modules force HWEIGHT to be
>> built into the kernel image.  Otherwise build fails:
>>
>>   Building modules, stage 2.
>>   MODPOST 94 modules
>> WARNING: "hweight32" [fs/ntfs/ntfs.ko] undefined!
>>
>> Yes, I'd certainly prefer for this to be more automated rather than
>> forced by each module that needs it.
>
> Perhaps we should just put it in lib-y and remove CONFIG_GENERIC_HWEIGHT.
> It's either part of the API or it ain't.

Yes, that matches how I feel about it, but I expected some disagreement
(from elsewhere, not from you).

I'll send another patch later.  Replacement patch OK?  (vs. update)

--
~Randy


Re: [PATCH 1/2] lib + ntfs: let modules force HWEIGHT

2006-11-28 Thread Andrew Morton
On Tue, 28 Nov 2006 14:08:40 -0800
Randy Dunlap <[EMAIL PROTECTED]> wrote:

> From: Randy Dunlap <[EMAIL PROTECTED]>
> 
> NTFS (=m) uses hweight32(), but that function is only linked
> into the kernel image if it is used inside the kernel image,
> not in loadable modules.  Let modules force HWEIGHT to be
> built into the kernel image.  Otherwise build fails:
> 
>   Building modules, stage 2.
>   MODPOST 94 modules
> WARNING: "hweight32" [fs/ntfs/ntfs.ko] undefined!
> 
> Yes, I'd certainly prefer for this to be more automated rather than
> forced by each module that needs it.

Perhaps we should just put it in lib-y and remove CONFIG_GENERIC_HWEIGHT.
It's either part of the API or it ain't.
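
For readers unfamiliar with the symbol at issue: hweight32() returns the
Hamming weight (the number of set bits) of a 32-bit word.  A minimal
userspace sketch of the usual parallel bit count follows.  The name
sketch_hweight32 is made up for illustration, and this is not claimed to
be the exact code that lives under lib/.

```c
#include <stdint.h>

/* Illustrative only: count the set bits in a 32-bit word by summing
 * neighbouring bit fields of increasing width. */
uint32_t sketch_hweight32(uint32_t w)
{
    uint32_t res = w - ((w >> 1) & 0x55555555u);            /* 2-bit sums  */

    res = (res & 0x33333333u) + ((res >> 2) & 0x33333333u); /* 4-bit sums  */
    res = (res + (res >> 4)) & 0x0F0F0F0Fu;                 /* 8-bit sums  */
    res = res + (res >> 8);                                 /* 16-bit sums */
    return (res + (res >> 16)) & 0xFFu;                     /* final count */
}
```

A helper this small is cheap to build unconditionally, which is roughly
the point of the lib-y suggestion above.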


Re: [PATCH 1/6] ext2 balloc: fix _with_rsv freeze

2006-11-28 Thread Mingming Cao
On Tue, 2006-11-28 at 20:07 +, Hugh Dickins wrote:
> On Tue, 28 Nov 2006, Mingming Cao wrote:
> > On Tue, 2006-11-28 at 17:40 +, Hugh Dickins wrote:
> > > After several days of testing ext2 with reservations, it got caught inside
> > > ext2_try_to_allocate_with_rsv: alloc_new_reservation repeatedly succeeding
> > > on the window [12cff,12d0e], ext2_try_to_allocate repeatedly failing to
> > > find the free block guaranteed to be included (unless there's contention).
> > >
> >
> > Hmm, I suspect there is another issue: alloc_new_reservation should not
> > be repeatedly allocating the same window, if ext2_try_to_allocate
> > repeatedly fails to find a free block in that window.
> > find_next_reservable_window() takes my_rsv (the old window that he
> > thinks there is no free block) as a guide to find a window "after" the
> > end block of my_rsv, so how could this happen?
> 
> Hmmm.  I haven't studied that part of the code, but what you say sounds
> sensible: that would leave more to be explained, yes.  I guess it would
> happen if all the rest of the bitmap were either allocated or reserved,

But bitmap_search_next_usable_block() will fail in the case the rest of
bitmap were allocated, and find_next_reservable_space() will fail in the
case the rest of group were all reserved. alloc_new_reservation() should
not create a new window in this case. 

> but I don't believe that was the case here: I have noted that the map
> was all 00s from offset 0x1ae onwards, plenty unallocated; I've not
> recorded the following reservations, but it seems unlikely they covered
> the remaining free area (and still covered it even when the remaining
> tasks got to the point of just waiting for this one).
> 
> >
> > > Fix the range to find_next_usable_block's memscan: the scan from "here"
> > > (0xcfe) up to (but excluding) "maxblocks" (0xd0e) needs to scan 3 bytes
> > > not 2 (the relevant bytes of bitmap in this case being f7 df ff - none
> > > 00, but the premature cutoff implying that the last was found 00).
> > >
> >
> > alloc_new_reservation() reserved a window with a free block; when it comes
> > time to claim it, it scans the window again. So it seems that the
> > range of the scan is too small:
> 
> The range of the scan is 1 byte too small in this case, yes.
> 
> >
> > p = ((char *)bh->b_data) + (here >> 3);
> > r = memscan(p, 0, (maxblocks - here + 7) >> 3);
> > next = (r - ((char *)bh->b_data)) << 3;
> >
> > ->   next is -1
> 
> I don't understand you: next was not -1, it was 0xd08.
> 
> > if (next < maxblocks && next >= here)
> > return next;
> >
> > --> falls to false branch
> 
> No, it passed the "next < maxblocks && next >= here" test
> (maxblocks being 0xd0e and here being 0xcfe), so returned
> pointing to an allocated block - then the caller finds it
> cannot set the bit.
> 

Apologies for the confusion. I thought ext2_try_to_allocate() failed
because we could not find a free block in the reserved window (i.e.,
find_next_usable_block() failed)

It seems in this case, find_next_usable_block() incorrectly returns a
bit it *thinks* free, but ext2_try_to_allocate() fails to claim it as
it's being marked as used.


So yes, Acked this fix. Thanks.
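
The arithmetic discussed above can be replayed in userspace.  This is a
sketch under stated assumptions, not the patch itself: sketch_memscan is
a stand-in for the kernel's memscan(), the helper names are invented for
illustration, and scan_len_covering() is just one way to compute a
length that covers every byte holding a bit in [here, maxblocks).

```c
#include <stddef.h>

/* Stand-in for memscan(): returns a pointer to the first byte equal to
 * c, or to the byte just past the area if no byte matches. */
static const unsigned char *sketch_memscan(const unsigned char *p, int c,
                                           size_t size)
{
    while (size--) {
        if (*p == (unsigned char)c)
            return p;
        p++;
    }
    return p;
}

/* Byte count as computed in the code quoted in this thread. */
size_t scan_len_quoted(unsigned long here, unsigned long maxblocks)
{
    return (maxblocks - here + 7) >> 3;
}

/* Hypothetical alternative: last byte index (exclusive) minus the
 * first byte index, so unaligned "here" values lose nothing. */
size_t scan_len_covering(unsigned long here, unsigned long maxblocks)
{
    return ((maxblocks + 7) >> 3) - (here >> 3);
}

/* Bit number that a byte-wise search for an all-zero byte reports
 * when it scans len bytes starting at the byte containing "here". */
unsigned long next_from_memscan(const unsigned char *bitmap,
                                unsigned long here, size_t len)
{
    const unsigned char *p = bitmap + (here >> 3);
    const unsigned char *r = sketch_memscan(p, 0, len);

    return (unsigned long)(r - bitmap) << 3;
}
```

With here = 0xcfe, maxblocks = 0xd0e and the bytes f7 df ff at offset
0x19f, the quoted length is 2 and the search reports 0xd08, which passes
the "next < maxblocks && next >= here" test although that byte is ff.
With a length of 3 the search reports 0xd10, the test fails, and the
bit-by-bit search gets its chance.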

> >
> > here = bitmap_search_next_usable_block(here, bh, maxblocks);
> > return here;
> >
> > So we failed to find a free byte in the range.  That seems fine to me.
> > It's only a nice thing to have -- try to allocate a block in a place
> > where its neighbors are all free also. If it fails, it will search the
> > window bit by bit. So I don't understand why it is not being recovered
> > by bitmap_search_next_usable_block(), which tests the bitmap bit by bit?
> 
> It already returned, it doesn't reach that line.
> 
Yep.

> >
> > > Is this a problem for mainline ext2?  No, because the "size" in its 
> > > memscan
> > > is always EXT2_BLOCKS_PER_GROUP(sb), which mkfs.ext2 requires to be a
> > > multiple of 8.  Is this a problem for ext3 or ext4?  No, because they have
> > > an additional extN_test_allocatable test which rescues them from the 
> > > error.
> > >
> > Hmm, if the error is that it prematurely thinks there is no free block in the
> > range (bitmap on disk), then even in ext3/4, it will not bother checking
> > the jbd copy of the bitmap. I am not sure this is the reason that ext3/4
> > may not have the problem.
> 
> In the ext3/4 case, it indeed won't bother to check the jbd copy
> (having found this bitmap bit set), it'll fall through to the
> bitmap_search_next_usable_block you indicated above,
> and that should do the right thing, finding the first
> free bit in the area originally reserved.
> 
Makes sense.
> >
> > > But the bigger question is, why does the my_rsv case come here to
> > > find_next_usable_block at all?
> >
> > Because grp_goal is -1?
> 
> Well, yes, but my point is that we've got a reservation, and we're
> hoping to 

Re: 2.6.19-rc6-mm2

2006-11-28 Thread Thomas Tuttle
I've found a couple of bugs so far...

1. I did `modprobe kvm' and then tried running a version of the KVM Qemu
compiled for a different kernel.  My mistake.  But I got an oops:

BUG: unable to handle kernel NULL pointer dereference at virtual address 
0008
 printing eip:
f91f9c3f
*pde = 
Oops:  [#1]
SMP 
last sysfs file: /devices/system/cpu/cpu0/cpufreq/scaling_max_freq
Modules linked in: kvm iTCO_wdt i8k rfcomm l2cap rtc sdhci mmc_block mmc_core 
hci_usb bluetooth b44 mii ohci1394 ieee1394 uhci_hcd ehci_hcd usbcore psmouse 
evdev i915 drm cpuid msr speedstep_centrino video thermal processor fan 
container button battery ac
CPU:0
EIP:0060:[]Not tainted VLI
EFLAGS: 00010202   (2.6.19-rc6-mm1 #1)
EIP is at kvm_vmx_return+0xef/0x4d0 [kvm]
eax: e5490068   ebx:    ecx:    edx: e5491ca4
esi:    edi: e5490060   ebp: e5a4fde0   esp: e5a4fd54
ds: 007b   es: 007b   ss: 0068
Process qemu (pid: 24193, ti=e5a4e000 task=c2286a90 task.ti=e5a4e000)
Stack: 0002 0001 f7fe1278 0002 b7f92000 e549   
   e5a4fdac  00d8 f783a580 e5a4fdac c043b98a bfb93f7c f91fa020 
   e5a4fde0 bfb93f7c bfb93f7c f91fa0cb 04f3 c03fb974 e549  
Call Trace:
 [] kvm_dev_ioctl+0x0/0x1040 [kvm]
 [] kvm_dev_ioctl+0xab/0x1040 [kvm]
 [] error_code+0x7c/0x84
 [] kmap_atomic+0xc9/0xe0
 [] permission+0x2b/0xd0
 [] sys_swapon+0x978/0xaf0
 [] kunmap_atomic+0x63/0x70
 [] kmap_atomic+0xc9/0xe0
 [] kunmap_atomic+0x63/0x70
 [] get_page_from_freelist+0x27d/0x340
 [] kmap_atomic+0xc9/0xe0
 [] kunmap_atomic+0x63/0x70
 [] get_page_from_freelist+0x27d/0x340
 [] find_get_page+0x20/0x60
 [] filemap_nopage+0x2dc/0x490
 [] do_sync_read+0xc7/0x110
 [] kmap_atomic+0xc9/0xe0
 [] kunmap_atomic+0x63/0x70
 [] __handle_mm_fault+0x246/0x9c0
 [] kvm_dev_ioctl+0x0/0x1040 [kvm]
 [] scsi_host_alloc+0x202/0x2a0
  [] do_ioctl+0x2b/0x90
 [] vfs_ioctl+0x5c/0x2b0
 [] sys_ioctl+0x3d/0x70
 [] syscall_call+0x7/0xb
 [] scsi_host_alloc+0x202/0x2a0
 ===
Code: 14 0f 87 77 02 00 00 8b 0c b5 00 15 20 f9 85 c9 0f 84 68 02 00 00 89 ea 
89 f8 ff d1 85 c0 0f 84 4c 02 00 00 89 f8 e8 31 e9 ff ff <65> a1 08 00 00 00 8b 
40 04 8b 40 08 a8 04 0f 85 ae 02 00 00 e8 
EIP: [] kvm_vmx_return+0xef/0x4d0 [kvm] SS:ESP 0068:e5a4fd54
 msrs: 2

Oh, and I get a ton of these messages with kvm:

rtc: lost some interrupts at 1024Hz.

2. I'm not sure if this bug is in the kernel, wireless tools, or the
ipw3945 driver, but I haven't changed the version of anything but the
kernel.  When I do `iwconfig eth1 essid foobar' something drops the
last character of the essid, and a subsequent `iwconfig eth1' shows
"fooba" as the essid.  And it's actually set as "fooba", since I had
to do `iwconfig eth1 essid MyUsualEssid_' (note underscore) to get on
to my usual network.

--Thomas Tuttle




Re: [PATCH] prune_icache_sb

2006-11-28 Thread Andrew Morton
On Tue, 28 Nov 2006 16:41:07 -0500
Wendy Cheng <[EMAIL PROTECTED]> wrote:

> Andrew Morton wrote:
> > On Mon, 27 Nov 2006 18:52:58 -0500
> > Wendy Cheng <[EMAIL PROTECTED]> wrote:
> >
> >   
> >> Not sure about walking thru sb->s_inodes for several reasons
> >>
> >> 1. First, the changes made are mostly for file server setup with large 
> >> fs size - the entry count in sb->s_inodes may not be shorter than 
> >> inode_unused list.
> >> 
> >
> > umm, that's the best-case.  We also care about worst-case.  Think:
> > 1,000,000 inodes on inode_unused, of which a randomly-sprinkled 10,000 are
> > from the being-unmounted filesystem.  The code as-proposed will do 100x more
> > work than it needs to do.  All under a global spinlock.
> >   
> By walking thru sb->s_inodes, we also need to take inode_lock and 
> iprune_mutex (?), since we're purging the inodes from the system - or 
> specifically, removing them from inode_unused list. There is really not 
> much difference from the current prune_icache() logic.

There's quite a bit of difference.  The change you're proposing will
perform poorly if it is used in the scenario which I describe above.  It
will waste CPU cycles and will destroy the inode_unused LRU ordering (for
what that's worth, which isn't much).

Trust me, every single time we've had an inefficient search in core kernel,
someone has gone and done something which hits it and causes general
meltdown in their workload.  So we've had to make significant changes to
remove the O(n) or higher search complexity.

And in this case we *already have* the data structures in place to make it
O(1).
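
To make the cost argument concrete, here is a toy userspace model (not
kernel code).  The struct and function names are invented for
illustration; they only mimic the idea of one global unused list versus
a per-superblock list such as sb->s_inodes.

```c
#include <stddef.h>

/* Toy inode: linked on a global "unused" list and on a list that only
 * holds inodes of its own superblock. */
struct toy_inode {
    int sb_id;
    struct toy_inode *next_unused;  /* global list, all filesystems */
    struct toy_inode *next_in_sb;   /* per-superblock list */
};

/* Walk the global list and pick out one superblock's inodes.
 * Returns how many list entries had to be visited. */
size_t visits_global_walk(const struct toy_inode *unused_head, int sb_id,
                          size_t *found)
{
    size_t visited = 0;

    *found = 0;
    for (const struct toy_inode *i = unused_head; i; i = i->next_unused) {
        visited++;
        if (i->sb_id == sb_id)
            (*found)++;
    }
    return visited;
}

/* Walk only the per-superblock list: every visit is a hit. */
size_t visits_sb_walk(const struct toy_inode *sb_head, size_t *found)
{
    size_t visited = 0;

    for (const struct toy_inode *i = sb_head; i; i = i->next_in_sb)
        visited++;
    *found = visited;
    return visited;
}
```

With 1,000 inodes of which every hundredth belongs to the target
superblock, the global walk visits all 1,000 entries to find 10, while
the per-superblock walk visits 10.  That is the same 100x ratio as in
the 1,000,000 / 10,000 scenario described above.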

> What's been 
> proposed here is simply *exporting* the prune_icache() kernel code to 
> allow filesystems to trim (purge a small percentage of ) its 
> (potentially will be) unused per-mount inodes for *latency* considerations.

It just happens to work in your setup.  If you have a large machine with
two filesystems and you run rsync on both filesystems and run FTP against
one of them, it might not work so well.  Because the proposed
prune_icache_sb() might need to chew through 500,000 inodes from the wrong
superblock before reclaiming any of the inodes which you want to reclaim. 
Or something like that.

> I made a mistake by using the "page dirty ratio" to explain the problem 
> (sorry! I was not thinking well in previous write-up) that could mislead 
> you to think this is a VM issue. This is not so much about 
> low-on-free-pages (and/or memory fragmentation) issue (though 
> fragmentation is normally part of the symptoms). What the (external) 
> kernel module does is to tie its cluster-wide file lock with in-memory 
> inode that is obtained during file look-up time. The lock is removed 
> from the machine when
> 
> 1. the lock is granted to other (cluster) machine; or
> 2. the in-memory inode is purged from the system.

It seems peculiar to be tying the lifetime of a DLM lock to the system's
memory size and current memory pressure?

> One of the clusters that has this latency issue is an IP/TV application 
> where it "rsync" with main station server (with long geographical 
> distance) every 15 minutes. It subsequently (and constantly) generates 
> large amount of inode (and locks) hanging around. When other nodes, 
> served as FTP servers, within the same cluster are serving the files, 
> DLM has to wade through huge amount of locks entries to know whether the 
> lock requests can be granted. That's where this latency issue gets 
> popped out. Our profiling data shows when the cluster performance is 
> dropped into un-acceptable ranges, DLM could hog 40% of CPU cycles in 
> lock searching logic. From VM point of view, the system does not have 
> memory shortage so it doesn't have a need to kick off prune_icache() call.

OK..

> This issue could also be fixed in several different ways - maybe by a 
> better DLM hash function,

It does sound like the lock lookup is broken.

I assume there's some reason for keeping these things floating about in
memory, so there must be a downside to artificially pruning them in
this manner?  If so, a (much) faster lookup would seem to be the best fix.

> maybe by asking IT people to umount the 
> filesystem where *all* per-mount inodes are unconditionally purged (but 
> it defeats the purpose of caching inodes and, in our case, the locks) 
> after each rsync, , etc. But I do think the proposed patch is the 
> most sensible way to fix this issue and believe it will be one of these 
> functions that if you export it, people will find a good use of it. It 
> helps with memory fragmentation and/or shortage *before* it becomes a 
> problem as well. I certainly understand and respect a maintainer's 
> daunting job on how to take/reject a patch - let me know how you think 
> so I can start to work on other solutions if required.

We shouldn't export this particular implementation to modules because it
has bad failure modes.  There might be a case for exposing an

Re: [PATCH] Don't compare unsigned variable for <0 in sys_prctl()

2006-11-28 Thread Linus Torvalds


On Wed, 29 Nov 2006, Jesper Juhl wrote:
> 
> I would venture that "-Wshadow" is another one of those. 

I'd agree, except for the fact that gcc does a horribly _bad_ job of 
-Wshadow, making it (again) totally unusable.

For example, it's often entirely interesting to hear about local variables 
that shadow each other. No question about it.

HOWEVER. It's _not_ really interesting to hear about a local variable that 
happens to have a common name that is also shared by an extern function. 

There just isn't any room for confusion, and it's actually not even that 
unusual - I tried using -Wshadow on real programs, and it was just 
horribly irritating.

In the kernel, we had obvious things like local use of "jiffies" that just 
make _total_ sense in a small inline function, and the fact that there 
happens to be an extern declaration for "jiffies" just isn't very 
interesting.

Similarly, with nested macro expansion, even the "local variable shadows 
another local variable" case - that looks like it should have an obvious 
warning on the face of it - really isn't always necessarily that 
interesting after all. Maybe it is a bug, maybe it isn't, but it's no 
longer _obviously_ bogus any more.

So I'm not convinced about the usefulness of "-Wshadow". ESPECIALLY the 
way that gcc implements it, it's almost totally useless in real life.

For example, I tried it on "git" one time, and this is a perfect example 
of why "-Wshadow" is totally broken:

diff-delta.c: In function 'create_delta_index':
diff-delta.c:142: warning: declaration of 'index' shadows a global 
declaration

(and there's a _lot_ of those). If I'm not allowed to use "index" as a 
local variable and include  at the same time, something is 
simply SERIOUSLY WRONG with the warning.
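
A small illustration of the pattern being complained about, as a sketch
rather than the actual diff-delta.c code: the function find_first and
its arguments are hypothetical, and <strings.h> is assumed here only
because POSIX systems declare the legacy index() function there.

```c
#include <strings.h>   /* assumed: declares the legacy index() on POSIX */

/* The local variable below has the same name as the extern index()
 * function.  C scoping makes the local one win inside this function,
 * so the code is well defined; it is the kind of declaration that the
 * quoted "shadows a global declaration" warning objects to. */
int find_first(const int *values, int count, int wanted)
{
    int index;

    for (index = 0; index < count; index++) {
        if (values[index] == wanted)
            return index;
    }
    return -1;
}
```

Nothing in this function can be confused with a call to index(), which
is the argument being made about scoping.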

So the fact is, the C language has scoping rules for a reason. Can you 
screw yourself by using them badly? Sure. But that does NOT mean that the 
same name in different scopes is a bad thing that should be warned about.

If I wanted a language that didn't allow me to do anything wrong, I'd be 
using Pascal. As it is, it turns out that things that "look" wrong on a 
local level are often not wrong after all.

Linus


Re: [PATCH] [NET] dont insert socket dentries into dentry_hashtable.

2006-11-28 Thread Andrew Morton
On Tue, 28 Nov 2006 15:35:31 -0800 (PST)
David Miller <[EMAIL PROTECTED]> wrote:

> 
> Andrew, I'm fine with these three patches, specifically:
> 
> [PATCH] dont insert pipe dentries into dentry_hashtable.
> [PATCH] [DCACHE] : avoid RCU for never hashed dentries
> [PATCH] [NET] dont insert socket dentries into dentry_hashtable.
> 
> Could you toss them into -mm if you haven't already?

They were in rc6-mm2.

>  This
> makes better sense than me putting it into net-2.6.20 since
> it touches FS stuff.
> 

No probs, they're all lined up and ready to go, thanks.


Re: [-mm patch] drivers/mtd/nand/rtc_from4.c: use lib/bitrev.c

2006-11-28 Thread Andrew Morton
On Tue, 28 Nov 2006 22:52:16 +
David Woodhouse <[EMAIL PROTECTED]> wrote:

> > I'll take that as an ack and shall merge this once
> > crc32-replace-bitreverse-by-bitrev32.patch is merged ;)
> 
> I assume the bitrev thing will be going in as soon as 2.6.19 is actually
> released,

It will take over a week after 2.6.19 - I prefer to wait until the git tree
laggards^Wowners have merged before merging -mm stuff, so things land in
appropriate order.

> so there's no point in me reverting it from the mtd tree?

Your call.

I do have a fixlet against this patch:

--- 
a/drivers/mtd/nand/rtc_from4.c~drivers-mtd-nand-rtc_from4c-use-lib-bitrevc-tidy
+++ a/drivers/mtd/nand/rtc_from4.c
@@ -357,7 +357,7 @@ static int rtc_from4_correct_data(struct
/* Read the syndrom pattern from the FPGA and correct the bitorder */
rs_ecc = (volatile unsigned short *)(rtc_from4_fio_base + 
RTC_FROM4_RS_ECC);
for (i = 0; i < 8; i++) {
-   ecc[i] = byte_rev_table[(*rs_ecc) & 0xFF];
+   ecc[i] = bitrev8(*rs_ecc);
rs_ecc++;
}
 
_
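
For reference, what a byte-wide bit reversal computes can be sketched in
plain C.  This is an illustration only: sketch_bitrev8 is a made-up
name, and the loop below is not claimed to be how lib/bitrev.c does it
(the removed line in the fixlet suggests a lookup table there).

```c
#include <stdint.h>

/* Illustrative bit-by-bit reversal of one byte: bit 0 swaps with
 * bit 7, bit 1 with bit 6, and so on. */
uint8_t sketch_bitrev8(uint8_t byte)
{
    uint8_t out = 0;

    for (int i = 0; i < 8; i++) {
        out = (uint8_t)((out << 1) | (byte & 1u)); /* append lowest bit */
        byte >>= 1;                                /* expose next bit   */
    }
    return out;
}
```

Because the parameter in this sketch is 8 bits wide, a wider argument is
truncated on the way in, so the caller does not need a separate
"& 0xFF" mask.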




Re: [PATCH 1/2 -mm] fault-injection: safer defaults, trivial optimization, cleanup

2006-11-28 Thread Andrew Morton
On Tue, 28 Nov 2006 14:50:45 -0800
Don Mullis <[EMAIL PROTECTED]> wrote:

> On Tue, 2006-11-28 at 13:37 -0800, Andrew Morton wrote:
> 
> > We'd prefer one-patch-per-concept, please. This all sounds like about
> > six patches.
> 
> Understood.
> 
> > We _could_ merge this patch as-is, but it means that when this stuff
> > finally hits mainline it would go in as a nice sequence of logical patches,
> > followed by a random thing which is splattered all over all the preceding
> > patches.
> 
> Does this argue for a respin of the original patches, folding in
> content from this one, rather than splitting it into an additional six to
> be appended to the series?

If the fixes are one-patch-per-concept, and if the original patch series is
one-patch-per-concept (it is) then I can usually insert the fixups in the
right place, later fold each into its appropriate base patch and everything
lands in git squeaky-clean.



Re: 2.6.19-rc6-rt8

2006-11-28 Thread Karsten Wiese
On Tuesday, 28 November 2006 23:40, Karsten Wiese wrote:
> On Monday, 27 November 2006 10:49, Ingo Molnar wrote:
> > i have released the 2.6.19-rc6-rt8 tree, which can be downloaded from 
> 
> I saw usb transport errors here before rebooting with
>   nmi_watchdog=0
> contained in kernel command line.
> 
> Testcase stalled within 2 minutes before change,
> ticks happily after change for 15 minutes now.
> .config is a "release" type, no debugging options.

After roughly 15 more minutes it failed again.
The related dmesg output translates to the Linux error
-EXDEV
probably caused by the following lines:


static int uhci_result_isochronous(struct uhci_hcd *uhci, struct urb *urb)
{
	struct uhci_td *td, *tmp;
	struct urb_priv *urbp = urb->hcpriv;
	struct uhci_qh *qh = urbp->qh;

	list_for_each_entry_safe(td, tmp, &urbp->td_list, list) {
		unsigned int ctrlstat;
		int status;
		int actlength;

		if (uhci_frame_before_eq(uhci->cur_iso_frame, qh->iso_frame))
			return -EINPROGRESS;

		uhci_remove_tds_from_frame(uhci, qh->iso_frame);

		ctrlstat = td_status(td);
		if (ctrlstat & TD_CTRL_ACTIVE) {
			status = -EXDEV;	/* TD was added too late? */


  Karsten


Re: [PATCH 2.6.15.4 rel.2 1/1] libata: add hotswap to sata_svw

2006-11-28 Thread Benjamin Herrenschmidt
On Tue, 2006-11-28 at 23:22 +, David Woodhouse wrote:
> On Thu, 2006-02-16 at 16:09 +0100, Martin Devera wrote:
> > From: Martin Devera <[EMAIL PROTECTED]>
> > 
> > Add hotswap capability to Serverworks/BroadCom SATA controllers. The
> > controller has a SIM register that selects which bits in the SATA_ERROR
> > register fire an interrupt.
> > The solution hooks on COMWAKE (plug), PHYRDY change and 10B8B decode 
> > error (unplug) and calls into Lukasz's hotswap framework.
> > The code got one day testing on dual core Athlon64 H8SSL Supermicro 
> > MoBo with HT-1000 SATA, SMP kernel and two CaviarRE SATA HDDs in
> > hotswap bays.
> > 
> > Signed-off-by: Martin Devera <[EMAIL PROTECTED]>
> 
> What became of this?

I might be to blame for not testing it... The Xserve I had on my desk
was too noisy for most of my co-workers so I kept delaying and forgot
about it 

Also the Xserve I have only has one disk, which makes hotplug testing a
bit harder :-)

Ben.




Re: [rfc PATCH] ieee1394: ohci1394: delete bogus spinlock, flush MMIO writes

2006-11-28 Thread Stefan Richter
Alan wrote:
> On Tue, 28 Nov 2006 22:24:11 +0100 (CET)
> Stefan Richter <[EMAIL PROTECTED]> wrote:
>> All MMIO writes which were surrounded by the spinlock as well as the
>> very last MMIO write of the IRQ handler are now explicitly flushed by
>> MMIO reads of the respective register.
> 
> MMIO is ordered anyway on the bus, you just need mmiowb() to force
> ordering to the bus controller in case you are on a big numa box.

The mmiowb is a checkpoint to ensure ordering between different threads
of MMIO writes; i.e. it doesn't halt the thread until the write actually
reached the device like a read would do, right?
-- 
Stefan Richter
-=-=-==- =-== ===-=
http://arcgraph.de/sr/


Re: failed 'ljmp' in linear addressing mode

2006-11-28 Thread linux-os \(Dick Johnson\)

On Tue, 28 Nov 2006, Jun Sun wrote:

> On Tue, Nov 28, 2006 at 08:46:44AM -0500, linux-os (Dick Johnson) wrote:
>>
>> On Mon, 27 Nov 2006, Jun Sun wrote:
>>
>>>
>>> On Mon, Nov 27, 2006 at 08:58:57AM -0500, linux-os (Dick Johnson) wrote:

>>>> I think it probably resets the instant that you turn off paging. To
>>>> turn off paging, you need to copy some code (properly linked) to an
>>>> area where there is a 1:1 mapping between virtual and physical addresses.
>>>> A safe place is somewhere below 1 megabyte. Then you need to set up a
>>>> call descriptor so you can call that code (you can ljump if you never
>>>> plan to get back). You then need to clear interrupts on all CPUs (use a
>>>> spin-lock). Once you are executing from the new area, you reset your
>>>> segments to the new area. The call descriptor would have already set
>>>> CS, as would have the long-jump. At this time you can turn off paging
>>>> and flush the TLB. You are now in linear-address protected mode.

>>>
>>> Thanks for the reply.  But I am pretty much sure I did above correctly.
>>> I use single-instruction infinite loop in the call path to verify
>>> that control does reach last 'ljmp' but not the jump destination.
>>>
>>> Below is the hack I made to machine_kexec.c file.  As you can see, I
>>> managed to make the identical mapping between virtual and physical 
>>> addresses.
>>>
>>> Note I did not copy the code into the first 1M.  In fact the code
>>> is located at 0xc0477000 (0x00477000 in physical).  I thought that should be
>>> OK as I did not really go all the way back to real-address mode.
>>>
>>> The last suspect I have now is a wrong value in the CS descriptor.  Does
>>> the kernel have a suitable CS descriptor for the last ljmp to 0x1000 in
>>> linear addressing mode?  The CS descriptor seems to be pretty dark magic
>>> to me ...
>>>
>>> Cheers.
>>>
>>> Jun
>>>
>>> -
>>> diff -Nru linux-2.6.17.14-1st/arch/i386/kernel/machine_kexec.c.orig linux-2.6.17.14-1st/arch/i386/kernel/machine_kexec.c
>>> --- linux-2.6.17.14-1st/arch/i386/kernel/machine_kexec.c.orig   2006-10-13 11:55:04.0 -0700
>>> +++ linux-2.6.17.14-1st/arch/i386/kernel/machine_kexec.c 2006-11-22 15:01:45.0 -0800
>>> @@ -212,3 +212,19 @@
>>>rnk = (relocate_new_kernel_t) reboot_code_buffer;
>>>(*rnk)(page_list, reboot_code_buffer, image->start, cpu_has_pae);
>>> }
>>> +
>>> +extern void do_os_switching(void);
>>> +void os_switch(void)
>>> +{
>>> +   void (*foo)(void);
>>> +
>>> +   /* absolutely no irq */
>>> +   local_irq_disable();
>>> +
>>> +   /* create identity mapping */
>>> +   foo=virt_to_phys(do_os_switching);
>>> +   identity_map_page((unsigned long)foo);
>>> +
>>> +   /* jump to the real address */
>>> +   foo();
>>> +}
>>>
>> Get a copy of the Intel 486 Microprocessor Reference Manual or read it
>> on-line. There is no way that you can make a call like that.
>
> By "a call like that", you mean "foo()"?  Are you sure about that?
>
> The machine_kexec() function in the same file is basically doing the
> same way (i.e., use "call *$eax" instead of "ljmp").  That is where I got
> my idea from.
>
> In addition, if I put "1: jmp 1b" instruction anywhere *inside*
> do_os_switching() I would get infinite hanging instead of reboot,
> which seems to suggest I *did* jump into do_os_switching() successfully.
>
> According to Intel Architecture Software Developer's Manual (1997), Vol 3,
> page 8-14:
>
> "2.  If paging is enabled perform the following operations:
>
>  - Transfer program control to linear addresses that are identity mapped to
>physical addresses (that is, linear addresses equal physical addresses)
>  ...
> "
>
> it does not indicate one has to use "ljmp" to do this control transfer.

Assume you are accessing memory at 0xc000-. This address, when
page translation is occurring (page 5-17), consists of three parts.

(1) A 12-bit offset 0:11
(2) A 10-bit index  12:21
(3) A 10-bit index  22:31

So 0xc00 is an index into the page directory. If you wish to turn off
translation, you can't just turn off those bits. The next instruction
will be fetched from memory with the page-cache upper bits reset, i.e,
using offset 0 of the page directory. You somehow need to turn off those
bits at the same time the next instruction is fetched. Normally you
use a call gate. However, you can do a long jump which reloads the
segment register. When the instruction book says "transfer control"
it doesn't mean just jump to some offset. When the instruction address is
0xC000-, it is not the same as 0x-. These two addresses are 
different (to the CPU) until after those page translation bits are reset, not 
before.
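
The three-part split described above can be sketched in plain C. This assumes classic 32-bit paging with 4 KiB pages and no PAE (an assumption; the kexec code quoted earlier passes cpu_has_pae, so the real machine may differ). The function names are made up for illustration.

```c
#include <stdint.h>

/* Bits 22..31: index into the page directory. */
uint32_t pde_index(uint32_t linear)
{
	return linear >> 22;
}

/* Bits 12..21: index into the page table selected by the directory entry. */
uint32_t pte_index(uint32_t linear)
{
	return (linear >> 12) & 0x3FFu;
}

/* Bits 0..11: byte offset inside the 4 KiB page. */
uint32_t page_offset(uint32_t linear)
{
	return linear & 0xFFFu;
}
```

For the addresses mentioned earlier in the thread, 0xc0477000 selects directory entry 0x301 while 0x00477000 selects entry 0x001; both share table index 0x77 and offset 0. So the same code is reached through two different directory entries, and an identity mapping has to populate the low one before paging is switched off.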

>
>> You would need to
>> call through a task-gate or otherwise set the code-segment and the
>> instruction pointer at the same instant. First, look at the startup code for
>> a GDT entry that maps the linear address-space you are

2.6.18 tsc clocksource + ntp = excessive drift; acpi_pm does fine.

2006-11-28 Thread Alexandre Pereira Nunes

Hi,

With a default boot I get the tsc clocksource selected on Debian's
2.6.18-3-k7 SMP build (on a UP machine). ntp keeps bothering me with this
message:

frequency error 512 PPM exceeds tolerance 500 PPM

If I remove ntp's drift file and restart, it works for a while and then
the message comes back.
If I remove ntp's drift file, then run
echo acpi_pm >/sys/devices/system/clocksource/clocksource0/current_clocksource
and then restart ntp, it runs fine "forever".
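
To put the numbers in the message above in perspective, here is a small sketch that converts a frequency error in PPM into accumulated drift per day. It relies only on the definition that 1 PPM is one microsecond of error per second; the function name is made up for illustration.

```c
/* 1 PPM = 1 microsecond of clock error per second of elapsed time,
 * so the drift over a day is ppm * 86400 microseconds. */
long long drift_us_per_day(long long ppm)
{
	return ppm * 86400LL;
}
```

By this arithmetic, the reported 512 PPM is about 44.2 seconds per day, and the 500 PPM tolerance is 43.2 seconds per day.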


Any thoughts, something I should look at?

I'll be glad to give more feedback.

I don't know if that happened with 2.6.17, but I'm pretty sure that with 
2.6.16 it was fine.


- Alexandre


Re: [PATCH] Don't compare unsigned variable for <0 in sys_prctl()

2006-11-28 Thread Jesper Juhl

On 29/11/06, Linus Torvalds <[EMAIL PROTECTED]> wrote:



> On Tue, 28 Nov 2006, Jesper Juhl wrote:
> >
> > > Friends don't let friends use "-W".
> >
> > Hehe, ok, I'll stop cleaning this stuff up then.
> > Nice little hobby out the window there ;)
>
> You might want to look at some of the other warnings gcc spits out, but
> this class isn't one of them.
>
> Other warnings we have added over the years (and that really _are_ good
> warnings) have included the "-Wstrict-prototypes", and some other ones.
>
> If you can pinpoint _which_ gcc warning flag it is that causes gcc to emit
> the bogus ones, you _could_ try "-W -Wno-xyz-warning", which should cause
> gcc to enable all the "other" warnings, but then not the "xyz-warning"
> that causes problems.
>
> Of course, there is often a reason why a warning is in "-W" but not in
> "-Wall". Most of the time it's sign that the warning is bogus. Not always,
> though - we do tend to want to be fairly strict, and Wstrict-prototypes is
> an example of a _good_ warning that is not in -Wall.



I would venture that "-Wshadow" is another one of those.  I've, in the
past, submitted quite a few patches to clean up shadow warnings (some
accepted, some not) and I'll probably try going down that path again
in the near future.  It's a class of warnings that have the potential
to uncover real bugs (even if we don't currently have any) and it
would be a nice one to be able to enable by default in the Makefile.
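
As an illustration of the kind of bug a shadow warning can expose, here is a minimal made-up example (not taken from any kernel patch). The inner declaration was meant to be an assignment to the outer variable, and gcc's -Wshadow would point at it.

```c
/* Buggy: the inner "result" shadows the outer one, so the clamp is lost. */
int clamp_shadowed(int value, int limit)
{
	int result = value;

	if (value > limit) {
		int result = limit;	/* shadows the outer result */
		(void)result;		/* silence the unused-variable warning */
	}
	return result;		/* still the unclamped value */
}

/* Fixed: assign to the outer variable instead of declaring a new one. */
int clamp_fixed(int value, int limit)
{
	int result = value;

	if (value > limit)
		result = limit;
	return result;
}
```

Here clamp_shadowed(10, 5) returns 10, while clamp_fixed(10, 5) returns 5.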

I agree with you though that the "expression always false|true due to
unsigned" type of warnings are usually bogus - although there have
actually been real bugs hiding behind some of those warnings in the
past.  But, I'll make sure to only submit patches for that type of
warnings in the future if I can prove that the warning actually
uncovered a real bug.
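
For reference, here is a small made-up sketch of the pattern where such a warning does point at a real bug: an error return stored in an unsigned variable, so the "< 0" check can never fire. The helper fake_read and both function names are hypothetical and exist only for this illustration.

```c
#include <stddef.h>

/* Hypothetical helper standing in for any call that reports failure as -1. */
static int fake_read(int fail)
{
	return fail ? -1 : 16;
}

/* Buggy: the result is stored in an unsigned type, so "len < 0" is always
 * false and the failure goes unnoticed. This is the comparison gcc flags. */
int error_seen_unsigned(int fail)
{
	size_t len = (size_t)fake_read(fail);

	return len < 0;
}

/* Fixed: keep the result in a signed type so the check works. */
int error_seen_signed(int fail)
{
	long len = fake_read(fail);

	return len < 0;
}
```

In this sketch error_seen_unsigned(1) returns 0, meaning the failure is missed, while error_seen_signed(1) returns 1.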

--
Jesper Juhl <[EMAIL PROTECTED]>
Don't top-post  http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please  http://www.expita.com/nomime.html

